Quick document summarization with LangChain and rate limiting

After receiving countless rate limit responses (429 Too Many Requests) from the OpenAI API in my RAG app, I want to share the concise approach I settled on for text summarization.
For more context: the app (a RESTful API) allows users to upload a document, and there is an endpoint to generate a summary of the document.
This is the code I ended up with:
python
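>from langchain.chains.summarize import load_summarize_chain
>from langchain_core.rate_limiters import InMemoryRateLimiter
>from langchain_openai import ChatOpenAI
>
># OPEN_AI_MODEL, MODEL_MAX_TOKENS, VERBOSE, prompt, logger, and
># get_num_tokens_from_string are defined elsewhere in the app.
>
># Roughly one request every 3.3 seconds, with a bucket of at most 10.
>rate_limiter = InMemoryRateLimiter(
>    requests_per_second=0.3,
>    check_every_n_seconds=0.1,  # how often to check whether a request may go out
>    max_bucket_size=10,
>)
>
>llm = ChatOpenAI(temperature=0, model_name=OPEN_AI_MODEL, rate_limiter=rate_limiter)
>
>
>def initialize_chain(docs):
>    # The total token count across all documents decides the strategy.
>    num_tokens = sum(
>        get_num_tokens_from_string(doc.page_content, OPEN_AI_MODEL) for doc in docs
>    )
>    if num_tokens < MODEL_MAX_TOKENS:
>        # Everything fits in one prompt, so a single LLM call is enough.
>        chain = load_summarize_chain(
>            llm, chain_type="stuff", prompt=prompt, verbose=VERBOSE
>        )
>        logger.info("Stuff chain initialized")
>    else:
>        # Too long for one prompt: summarize each document, then combine.
>        chain = load_summarize_chain(
>            llm,
>            chain_type="map_reduce",
>            map_prompt=prompt,
>            combine_prompt=prompt,
>            verbose=VERBOSE,
>        )
>        logger.info("Map-reduce chain initialized")
>    return chain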
There are two main components in the code above:
- The rate limiter!
This is as simple as instantiating InMemoryRateLimiter from langchain_core with your desired parameters and passing it to the chosen LLM (in our case, ChatOpenAI). With requests_per_second=0.3, that works out to roughly one request every 3.3 seconds.
- The strategy
The strategy depends on whether the total token count of the documents to summarize exceeds the model's maximum. If the token limit is not exceeded, we use the “stuff” strategy, which sends everything in a single call, and call it a day. Otherwise, we switch to the “map_reduce” strategy, which summarizes each document in its own API call and then combines the partial summaries. Every one of those calls passes through the rate limiter introduced above.
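To make the three rate limiter parameters more concrete, here is a minimal token-bucket sketch in plain Python. LangChain documents InMemoryRateLimiter as a token-bucket limiter, but this is not its implementation: the TokenBucket and FakeClock classes, the injectable clock, and the choice to start with an empty bucket are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Conceptual token bucket; NOT LangChain's actual implementation."""

    def __init__(self, requests_per_second, max_bucket_size, clock=time.monotonic):
        self.rate = requests_per_second   # tokens added per second
        self.capacity = max_bucket_size   # upper bound on saved-up tokens
        self.clock = clock                # injectable so the demo needs no real waiting
        self.tokens = 0.0                 # assumption: the bucket starts empty
        self.last = None

    def try_acquire(self):
        """Refill according to elapsed time, then spend one token if available."""
        now = self.clock()
        if self.last is None:
            self.last = now
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def acquire(self, check_every_n_seconds=0.1, sleep=time.sleep):
        """Poll every check_every_n_seconds until a token is available."""
        while not self.try_acquire():
            sleep(check_every_n_seconds)


class FakeClock:
    """Hypothetical test clock: time only moves when advance() is called."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
bucket = TokenBucket(requests_per_second=0.3, max_bucket_size=10, clock=clock)
# "Sleeping" just advances the fake clock, so this runs instantly.
bucket.acquire(check_every_n_seconds=0.1, sleep=clock.advance)
print(f"first request allowed after about {clock.now:.1f}s")
```

In this sketch, requests_per_second sets how fast tokens refill, max_bucket_size caps how many requests can go out back to back after an idle period, and check_every_n_seconds only controls how often a waiting caller checks again.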
Using this approach, I started getting consistent results, regardless of the length of the documents that needed summarization. Try it with the GDPR guidelines, for example!
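For a rough sense of what handling long documents costs in time under this rate limit, the sketch below mirrors the branch in initialize_chain using only the standard library. It is a back-of-the-envelope estimate, not a measurement: estimate_tokens assumes about four characters per token instead of using a real tokenizer, and the function names, the 16,000 token budget, and the 40 sample chunks are all hypothetical. The call count for map_reduce is a lower bound (one call per document plus one combine call); extra collapse steps would add more.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate; assumes ~4 characters per token (not a real tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def choose_chain_type(texts, max_tokens):
    """Same branch as in the article: 'stuff' if everything fits, else 'map_reduce'."""
    total = sum(estimate_tokens(t) for t in texts)
    return "stuff" if total < max_tokens else "map_reduce"


def min_llm_calls(chain_type, num_docs):
    """Lower bound on LLM calls: 1 for 'stuff', one per document plus a combine otherwise."""
    return 1 if chain_type == "stuff" else num_docs + 1


def steady_state_seconds(calls, requests_per_second):
    """Time to issue `calls` requests at a steady rate, ignoring bursts and API latency."""
    return calls / requests_per_second


chunks = ["x" * 4000] * 40  # 40 chunks of about 1,000 estimated tokens each
chain_type = choose_chain_type(chunks, max_tokens=16_000)
calls = min_llm_calls(chain_type, len(chunks))
print(chain_type, calls, round(steady_state_seconds(calls, 0.3)))
# prints: map_reduce 41 137
```

Under these assumptions, a 40-chunk document needs at least 41 calls, which at 0.3 requests per second is a bit over two minutes of pacing alone. That wait is the price of not hitting 429 responses.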

I'm a Fullstack JavaScript Developer who enjoys bringing ideas to life using NestJS, React, and Angular—plus a bit of Python when needed. Outside of coding, I like trying out new technologies, reading, and spending quality time playing with my daughter.