Dev diary - 7. January 2025

Quick document summarization with LangChain and rate limiting

After receiving countless rate limit responses (429 Too Many Requests) from the OpenAI API in my RAG app, I want to share the concise approach I settled on for text summarization.

For more context: the app (a RESTful API) allows users to upload a document, and there is an endpoint to generate a summary of the document.

This is the code I ended up with:

python

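>from langchain.chains.summarize import load_summarize_chain
>from langchain_core.rate_limiters import InMemoryRateLimiter
>from langchain_openai import ChatOpenAI  # assuming the langchain_openai package
>
># OPEN_AI_MODEL, MODEL_MAX_TOKENS, VERBOSE, prompt, logger and
># get_num_tokens_from_string are defined elsewhere in the app.
>
># Roughly one request every 3.3 seconds on average.
>rate_limiter = InMemoryRateLimiter(
>    requests_per_second=0.3,
>    check_every_n_seconds=0.1,  # how often a waiting call checks for a free slot
>    max_bucket_size=10,  # upper bound on the size of a burst
>)
>
>llm = ChatOpenAI(temperature=0, model_name=OPEN_AI_MODEL, rate_limiter=rate_limiter)
>
>
>def initialize_chain(docs):
>    # Total token count across all documents to summarize.
>    num_tokens = sum(
>        get_num_tokens_from_string(doc.page_content, OPEN_AI_MODEL) for doc in docs
>    )
>    if num_tokens < MODEL_MAX_TOKENS:
>        # Everything fits into a single prompt.
>        chain = load_summarize_chain(
>            llm, chain_type="stuff", prompt=prompt, verbose=VERBOSE
>        )
>        logger.info("Stuff chain initialized")
>    else:
>        # Too long for one prompt: summarize per document, then combine.
>        chain = load_summarize_chain(
>            llm,
>            chain_type="map_reduce",
>            map_prompt=prompt,
>            combine_prompt=prompt,
>            verbose=VERBOSE,
>        )
>        logger.info("Map-reduce chain initialized")
>    return chain
>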

There are two main components in the code above:

  1. The rate limiter!

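This is as simple as instantiating InMemoryRateLimiter from langchain_core with your desired parameters and passing it to the chosen LLM (in our case, ChatOpenAI). Here, requests_per_second=0.3 allows roughly one request every 3.3 seconds, check_every_n_seconds sets how often a waiting call checks whether it may proceed, and max_bucket_size caps how many requests can go out in a burst.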
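
To make those parameters more concrete, here is a minimal token-bucket sketch in plain Python. It is not LangChain's actual implementation: the TokenBucket class, its method names, and the injectable clock are hypothetical and only illustrate the general idea of a bucket that refills over time and is drained by one token per request.

```python
import time


class TokenBucket:
    """Illustrative token bucket (hypothetical, not LangChain's code)."""

    def __init__(self, requests_per_second, max_bucket_size, clock=time.monotonic):
        self.rate = requests_per_second
        self.capacity = max_bucket_size
        self.clock = clock  # injectable so the behavior can be checked without sleeping
        self.tokens = 0.0  # assumption: the bucket starts empty
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        """Consume one token if available and report whether the request may go out."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def acquire(self, check_every_n_seconds=0.1):
        """Block, polling at the given interval, until a token is available."""
        while not self.try_acquire():
            time.sleep(check_every_n_seconds)
```

In this sketch, a caller would run `bucket.acquire()` right before every API request. A long idle period can never bank more than max_bucket_size tokens, which is what bounds a burst.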

  2. The strategy

The strategy depends on whether the retrieved context exceeds the model's maximum number of tokens. If the limit is not exceeded, we use the “stuff” strategy, which sends everything in a single prompt, and call it a day. Otherwise, we switch to the “map_reduce” strategy, which summarizes each document separately and then combines the partial summaries. That means many API calls in a row, and each of them passes through the rate limiter introduced above.
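
To get a feel for why the rate limiter matters more for “map_reduce”, here is a rough back-of-the-envelope sketch in plain Python. The helper names are hypothetical, and the call count is a simplification: it assumes one map call per document plus a single combine call, while a real chain may need additional intermediate steps.

```python
def choose_chain_type(token_counts, model_max_tokens):
    """Pick 'stuff' when all documents fit into one prompt, else 'map_reduce'."""
    return "stuff" if sum(token_counts) < model_max_tokens else "map_reduce"


def estimate_llm_calls(token_counts, model_max_tokens):
    """Simplified call count: 1 for 'stuff', one per document plus one combine call otherwise."""
    if choose_chain_type(token_counts, model_max_tokens) == "stuff":
        return 1
    return len(token_counts) + 1


def estimate_throttled_seconds(num_calls, requests_per_second):
    """Time spent at the limiter's steady rate, ignoring bursts and model latency."""
    return num_calls / requests_per_second


if __name__ == "__main__":
    # Hypothetical document: 40 chunks of 3,000 tokens against a 16,000-token limit.
    counts = [3000] * 40
    calls = estimate_llm_calls(counts, 16000)
    print(choose_chain_type(counts, 16000), calls)
    print(round(estimate_throttled_seconds(calls, 0.3)), "seconds at 0.3 requests/second")
```

The numbers in the example are made up, but they show the trade-off: a long document turns into dozens of throttled calls, so it takes longer to summarize but stops hitting 429 responses.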

Using this approach, I started getting consistent results, regardless of the length of the documents that needed summarization. Try it with the GDPR guidelines, for example!

Author
Germán Distel

I'm a Fullstack JavaScript Developer who enjoys bringing ideas to life using NestJS, React, and Angular—plus a bit of Python when needed. Outside of coding, I like trying out new technologies, reading, and spending quality time playing with my daughter.