Wojciech Matuszewski for AWS Community Builders

Amazon EventBridge archive and ordered replay of events

Edited on 17.06.2022

With many serverless event-first services available, utilizing events in application backends has become very popular. Those asynchronous workflows are often more cost-efficient and resilient than their synchronous counterparts.

In AWS, one service that usually works well for event-driven workflows is Amazon EventBridge. With rich integrations with other AWS services, extensive event filtering capabilities, an option to archive events, and the ability to send HTTP requests via API destinations, it seemingly has everything you would need to place it at the core of your application.

In this blog post, I would like to zoom in on the event archival capabilities of Amazon EventBridge, mainly how to ensure that replayed events flow through your application in the order they were sent to the event bus.

About the event archive

If enabled, Amazon EventBridge will save all the events that flow through the event bus into the so-called event archive. You can configure the archive to have a given retention period or only to save events that match a specific shape. You can define multiple event archives per event bus.
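As a minimal sketch of the two configuration options mentioned above (retention period and event shape), the following builds a request shaped like the EventBridge CreateArchive API call. The helper name, the archive name, the bus ARN, and the `my.orders` source are hypothetical values chosen for illustration; the function only assembles a dictionary and does not call AWS.

```python
import json


def build_archive_request(archive_name, event_bus_arn, retention_days=0, event_pattern=None):
    """Assemble keyword arguments shaped like the CreateArchive request.

    retention_days=0 means the archive keeps events indefinitely.
    event_pattern, when given, limits the archive to matching events.
    """
    if retention_days < 0:
        raise ValueError("retention_days must be zero or positive")

    request = {
        "ArchiveName": archive_name,
        "EventSourceArn": event_bus_arn,  # the event bus the archive listens to
        "RetentionDays": retention_days,
    }
    if event_pattern is not None:
        # The API expects the pattern as a JSON string, not a dict.
        request["EventPattern"] = json.dumps(event_pattern)
    return request


# Hypothetical values, for illustration only.
archive_request = build_archive_request(
    archive_name="orders-archive",
    event_bus_arn="arn:aws:events:us-west-1:111122223333:event-bus/orders",
    retention_days=30,
    event_pattern={"source": ["my.orders"]},
)
print(json.dumps(archive_request, indent=2))
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("events").create_archive(**archive_request)`.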

The following is a high-level diagram of an EventBridge event bus with an event archive attached.

High-level diagram of an EventBridge bus

One of the more exciting capabilities of the event archive is the ability to replay events from the archive. As you can imagine, if something goes wrong in your system or you have to backfill historical data, this capability comes in handy.

The archive event replay

In the context of event replay, there are a few essential things to consider – the period from which the events should be replayed and the fact that EventBridge does not guarantee the order of events when replaying them to a given target.
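To make the "period" part concrete, here is a minimal sketch that assembles a request shaped like the EventBridge StartReplay API call. The helper name and all ARNs are hypothetical; the function only builds and validates a dictionary, so it runs without AWS access.

```python
from datetime import datetime, timezone


def build_replay_request(replay_name, archive_arn, event_bus_arn, start, end, rule_arns=None):
    """Assemble keyword arguments shaped like the StartReplay request.

    start and end bound the period of archived events to replay.
    rule_arns optionally restricts the replay to specific rules on the bus.
    """
    if start >= end:
        raise ValueError("start must be earlier than end")

    destination = {"Arn": event_bus_arn}  # replayed events go back to this bus
    if rule_arns:
        destination["FilterArns"] = list(rule_arns)

    return {
        "ReplayName": replay_name,
        "EventSourceArn": archive_arn,  # the archive to replay from
        "EventStartTime": start,
        "EventEndTime": end,
        "Destination": destination,
    }


# Hypothetical values, for illustration only.
replay_request = build_replay_request(
    replay_name="myEventReplay",
    archive_arn="arn:aws:events:us-west-1:111122223333:archive/orders-archive",
    event_bus_arn="arn:aws:events:us-west-1:111122223333:event-bus/orders",
    start=datetime(2017, 12, 22, 18, 0, tzinfo=timezone.utc),
    end=datetime(2017, 12, 22, 19, 0, tzinfo=timezone.utc),
    rule_arns=["arn:aws:events:us-west-1:111122223333:rule/orders/replay-only"],
)
print(replay_request["ReplayName"], replay_request["EventStartTime"].isoformat())
```

With boto3 available, the dictionary could be passed as `boto3.client("events").start_replay(**replay_request)`.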

Event order is not guaranteed when using archive replay

In a perfect world, the systems we create should withstand the out-of-order delivery of events and proceed with the workflows as if the order of events were a non-factor, but sometimes that is impossible. Replaying events in the order they arrived on the event bus could be helpful while developing new features or trying to debug a particular issue.

With the help of AWS Step Functions, it is possible to force the "correct" order of the events back to our system when the event replay is active. Let us take a look at how next.

Ordered replay with the help of AWS Step Functions

The following is an example structure of an EventBridge event replayed by the archive.

{
  "version": "0",
  "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718",
  "detail-type": "EC2 Instance State-change Notification",
  "source": "aws.ec2",
  "account": "111122223333",
  "time": "2017-12-22T18:43:48Z",
  "region": "us-west-1",
  "resources": [
    "arn:aws:ec2:us-west-1:123456789012:instance/i-1234567890abcdef0"
  ],
  "replay-name": "myEventReplay",
  "detail": {
    "instance-id": "i-1234567890abcdef0",
    "state": "terminated"
  }
}

To distinguish between a replayed event and a regular event, the EventBridge archive adds the replay-name property to the event. This property is handy for writing filtering rules that only allow replayed events to pass on to the underlying integration. In our particular situation, the time property is the one that will allow us to artificially enforce order on the replayed events.

The EventBridge service automatically adds the time property to the event. You do not have to supply the timestamp yourself when publishing the event, though you can.
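A filtering rule of the kind described above could look like the following sketch. It assumes the `exists` content filter can be applied to the top-level replay-name field; the constant and helper names are hypothetical, and the local `is_replayed` check only mirrors the rule's intent for testing, it is not EventBridge's matching engine.

```python
import json

# Event pattern intended to match only events that carry "replay-name".
# Assumption: the "exists" filter is applied to this top-level field.
REPLAYED_ONLY_PATTERN = {"replay-name": [{"exists": True}]}


def is_replayed(event):
    """Local approximation of the rule: True if the archive replayed the event."""
    return "replay-name" in event


regular_event = {"source": "aws.ec2", "time": "2017-12-22T18:43:48Z"}
replayed_event = {**regular_event, "replay-name": "myEventReplay"}

print(json.dumps(REPLAYED_ONLY_PATTERN))
print(is_replayed(regular_event), is_replayed(replayed_event))
```

The serialized pattern is what would go into the rule's event pattern, with the rule targeting the ordering middleman rather than the original destination.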

By computing the difference between the current timestamp and the timestamp of the event, we can deduce the amount of time the event has to wait before being forwarded to the target. If we apply this heuristic to all replayed events, we effectively enforce order based on the event time property. The work of computing the time difference and the waiting will be done by a middleman service – in our case, an AWS Step Function.

Ordering events based on the event time property

Instead of forwarding the events to the original destination, we deploy a middleman, an AWS Step Functions state machine, to enforce the order and deliver the events to the original destination. Inside the workflow, we first calculate the time difference between the replay start timestamp and the event time property, then, using the Wait state, we idle for that period and finally forward the event to its original destination.

The following is a simplified design of the AWS Step Functions state machine that takes care of the ordering logic.
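The delay calculation can be sketched as follows. This is one reading of the description above, not necessarily what the linked repository does: it assumes the baseline is the start of the replayed time window, which is the same for every event, so that events with an earlier time property wait less and are forwarded first. The function names are hypothetical.

```python
from datetime import datetime


def parse_event_time(value):
    """Parse an ISO 8601 timestamp such as '2017-12-22T18:43:48Z'."""
    # Replace the trailing 'Z' so older Python versions can parse it too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def delay_seconds(event_time, window_start):
    """Seconds an event should wait, measured from the replay window start."""
    delta = parse_event_time(event_time) - parse_event_time(window_start)
    # Events stamped before the window start are forwarded immediately.
    return max(0, int(delta.total_seconds()))


window_start = "2017-12-22T18:43:00Z"
# Events as they might arrive from a replay: out of order.
arrived = [
    {"id": "b", "time": "2017-12-22T18:43:48Z"},
    {"id": "a", "time": "2017-12-22T18:43:10Z"},
    {"id": "c", "time": "2017-12-22T18:44:05Z"},
]

delays = {event["id"]: delay_seconds(event["time"], window_start) for event in arrived}
forward_order = sorted(delays, key=delays.get)
print(delays, forward_order)
```

Note that timestamps with one-second resolution, like the one in the example event, give events from the same second the same delay, so their relative order is still not determined.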

Simplified diagram of the AWS Step Function state machine that enforces event ordering
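A state machine along those lines could be defined roughly as below, expressed as an Amazon States Language document built in Python. The state names and both Lambda function ARNs are hypothetical, and the real implementation may be structured differently; the sketch only shows how a Wait state with `SecondsPath` can consume a previously computed delay.

```python
import json


def build_state_machine_definition(compute_delay_arn, forward_event_arn):
    """Return an Amazon States Language definition: compute, wait, forward."""
    return {
        "Comment": "Delays each replayed event, then forwards it",
        "StartAt": "ComputeDelay",
        "States": {
            "ComputeDelay": {
                "Type": "Task",
                "Resource": compute_delay_arn,  # returns an integer number of seconds
                "ResultPath": "$.delaySeconds",  # keep the original event in the state input
                "Next": "WaitForTurn",
            },
            "WaitForTurn": {
                "Type": "Wait",
                "SecondsPath": "$.delaySeconds",  # wait duration read from the input
                "Next": "ForwardEvent",
            },
            "ForwardEvent": {
                "Type": "Task",
                "Resource": forward_event_arn,  # delivers to the original destination
                "End": True,
            },
        },
    }


# Hypothetical function ARNs, for illustration only.
definition = build_state_machine_definition(
    "arn:aws:lambda:us-west-1:111122223333:function:compute-delay",
    "arn:aws:lambda:us-west-1:111122223333:function:forward-event",
)
print(json.dumps(definition, indent=2))
```

Because Express workflows are limited to five minutes of execution time, a Standard workflow is the safer choice when the replayed window, and therefore the longest wait, can exceed that.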

You can find an example GitHub repository where I've implemented the ordered replay here.

Note that we cannot publish the event back to the event bus. If we were to do that, we would lose the ordering guarantee enforced by our state machine – EventBridge does not guarantee strict ordering of events when invoking targets.

You could also forward the event to Amazon DynamoDB (and then use DynamoDB Streams) or Amazon Kinesis Data Streams and read the events from those services. Both of these services have ordering guarantees: DynamoDB Streams preserves the order of changes to a given item, and Kinesis Data Streams preserves the order of records within a shard.

Cost considerations

Enforcing ordering for the replayed events is not free. In fact, it can get quite expensive for large numbers of events. Since we have to start an AWS Step Functions state machine execution for each replayed event, we incur additional charges subject to the AWS Step Functions pricing dimensions. Personally, I would only use ordered replay in development, where I control the volume of the incoming events.
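For a rough sense of scale, the per-event cost can be estimated as in the sketch below. It assumes Standard workflows billed per state transition; the price and the number of transitions per execution are placeholders for illustration, not quoted figures, so check the current AWS Step Functions pricing page for your region before relying on the result.

```python
def estimate_replay_cost(event_count, transitions_per_execution, price_per_transition):
    """Estimate the Step Functions charge for replaying event_count events.

    One state machine execution is started per replayed event.
    """
    if event_count < 0 or transitions_per_execution < 0 or price_per_transition < 0:
        raise ValueError("all inputs must be zero or positive")
    return event_count * transitions_per_execution * price_per_transition


# Placeholder numbers, for illustration only.
ILLUSTRATIVE_PRICE_PER_TRANSITION = 0.000025  # assumed rate, verify against the pricing page
ILLUSTRATIVE_TRANSITIONS = 4  # assumed transitions billed per execution

estimated_cost = estimate_replay_cost(
    event_count=1_000_000,
    transitions_per_execution=ILLUSTRATIVE_TRANSITIONS,
    price_per_transition=ILLUSTRATIVE_PRICE_PER_TRANSITION,
)
print(f"Estimated charge: {estimated_cost:.2f}")
```

The estimate grows linearly with the number of replayed events, which is why keeping the replay window small matters.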

Closing words

I hope you found this little nugget of AWS knowledge as helpful as I did. I encourage you to explore ways to enforce event order in other parts of your system, if only for training purposes – I guarantee you will learn something new and exciting.

For more AWS serverless content, consider following me on Twitter – @wm_matuszewski.

And as always, thank you for your precious time!

Latest comments (2)

Danny Acton • Edited

The focus on event archival capabilities in the Amazon EventBridge is intriguing. Ensuring that replayed events flow through the application in the order they were sent to the event bus is a critical aspect for many event-driven systems. I'm looking forward to learning more about the best practices and techniques covered in this blog post to maintain the integrity and consistency of event data in the application. Thank you for sharing this valuable content with the community, and I can't wait to dive into the details! Keep up the great work!

Joakim Sandström

I fail to understand how this could fix the overall system as EventBridge does not guarantee event ordering even in normal situations (without replay) nor does it guarantee exactly-once delivery.

So the downstream system(s) must handle out of order events as well as possible duplicate events.

ref: aws.amazon.com/blogs/architecture/...

EventBridge does not guarantee event order will be maintained and promises at-least-once event delivery, meaning duplicate messages can be introduced.