Events processing with AWS Lambdas

We have a RESTful API that processes various requests and produces events for asynchronous processing. In essence, it follows the principle of event sourcing.
The events can be of different types. Some types need to be processed in order and some could be parallelised. There are many solutions that provide implementation for event sourcing like Akka Persistence. If we want to utilise AWS infrastructure, one design choice would be to use AWS Lambdas to process the events.
Events can be saved to a database together with the request related data in a pattern known as Transactional Outbox and propagated to a Kinesis stream using CDC (Change Data Capture). Lambda functions then process these events.
In basic implementation we’d have one Lambda reading all events and handling code for all event types.
caibwioz1lbu40gu01l3fju8k57j8s.webp)
This comes with a number of issues
- since some event types need to be processed in order, all event types need to be processed in order. This makes the Lambda sequential and limits the processing speed. Unprocessed events can pile up, causing delays. This could be mitigated by having multiple shards with the event type as a sharding key, but it’s not ideal as it would limit the parallelism
- errors in processing of one event type can affect other events. Depending on the implementation the error can be retried by propagating the failure to the Lambda runtime in the response type. The message would not be checkpointed in one shard and retried, which can cause retry loops and delay processing of the shard.
One solution to this is to have one Lambda per event type. Event source mapping between the Lambda function and the kinesis stream, which is a separate consumer, can contain a filter that would pass through only event types handled by the Lambda. This approach has a drawback in infrastructure management if the number of event types is high. Number of Lambdas can explode which makes maintenance, deployment, monitoring etc. more complex.
k5hm93d39yr6eg1pjk383q7oh8zcj7.webp)
We decided to go with a solution somewhere in-between. We would still have one Lambda however with event source mapping per event type (using filter on event type). Each event source mapping is a separate kinesis consumer and can therefore process at its own pace.
fg31k9mei0vnggtb644jh7v25rmbzy.webp)
The tricky part is understanding Lambda concurrency.
Lambda has a concurrency limit setting, which defines how many instances of the function can execute simultaneously. To ensure that one event type does not affect others, the Lambda capacity must be at least:
numberOfShards × numberOfSources × parallelizationFactor
Each event source mapping consumes records from each shard independently and can also process multiple batches from a single shard at the same time if parallelizationFactor > 1.
For example, 4 shards in the stream with 4 different event types would require at least 16 concurrent Lambda executions to avoid blocking.
This assumes that the sharding key is not the event type itself, but rather a business key with higher cardinality, such as userId, where events for the same user must be processed in order while allowing high parallelism across different users.
I am a knowledge-obsessed, life-positive software developer who approaches every day with a passion for learning and a drive to inspire others. As a natural problem solver, I excel at applying creative thinking to solve complex problems and am constantly pushing the boundaries of what is possible in software development.