Event Sourcing is complex and involves multiple concepts, patterns, and architectures. At The Agile Monkeys, we have been working with it for several years and we would like to share our vision of the main concepts around Event Sourcing.
Event = Something that has happened (yes, in the past).
e.g. a customer has placed an order.
State = Something that evolves and changes over time, with a finite number of possible values.
e.g. a SKU can have multiple states (in stock, out of stock, soon to be available, discontinued, etc.).
Events = Things that have happened that changed some state.
e.g. a customer placed an order, so the order is now in the PLACED status, the inventory of the item decreased by 1, etc.
Event Storage = Permanent storage where events are persisted.
e.g. it can be a DB (NoSQL, SQL…) or object storage (S3…).
- Store the full history of events (aka state changes) in the Event Storage. Of course, we mean only the events we intended to store, the ones we designed to persist.
- Events are chronologically ordered: this allows us to reconstruct the state of the system at any given point in time.
- Immutable events: it’s an append-only system (no updates, no deletions).
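As a minimal sketch of this contract (all names here are illustrative, not from any specific library), an append-only in-memory event store could look like this:

```typescript
// Hypothetical minimal append-only event store: events can be appended
// and read back in chronological order, but never updated or removed.
interface StoredEvent {
  type: string;
  entityId: string;
  timestamp: number;
  payload: unknown;
}

class EventStore {
  private events: StoredEvent[] = [];

  append(event: StoredEvent): void {
    this.events.push(event); // append-only: no update or delete methods exist
  }

  // Full, chronologically ordered history for one entity.
  historyOf(entityId: string): StoredEvent[] {
    return this.events.filter((e) => e.entityId === entityId);
  }
}

const store = new EventStore();
store.append({ type: "OrderPlaced", entityId: "order-1", timestamp: 1, payload: { items: 2 } });
store.append({ type: "OrderShipped", entityId: "order-1", timestamp: 2, payload: {} });
console.log(store.historyOf("order-1").map((e) => e.type)); // ["OrderPlaced", "OrderShipped"]
```

Note that the class simply has no method to mutate past events: the contract is enforced by the API surface itself.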
Because of data privacy regulations (like the GDPR in Europe), there are cases where users want their data deleted. Allowing this would break the Event Sourcing contract we just described (immutable events).
How can we support this use case?
We can encrypt the data. When a user requests the deletion of their data, we delete the encryption key instead.
This design solves both problems:
- By discarding the key, we make the data unusable, which fulfills the purpose of deleting the data from the user’s perspective.
- We are not opening unwanted doors by actually deleting the data (no delete permissions, no repository with delete methods, etc.), so we still respect the contract we described.
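This technique is commonly known as crypto-shredding. A sketch of the idea using Node’s built-in `crypto` module (key management is simplified to an in-memory map, and all names are hypothetical):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical crypto-shredding: each user's event payloads are encrypted
// with a per-user key. Deleting the key makes the stored events unreadable
// without ever removing them from the append-only Event Storage.
const keys = new Map<string, Buffer>(); // per-user encryption keys

function encryptForUser(userId: string, plaintext: string): { iv: Buffer; data: Buffer } {
  if (!keys.has(userId)) keys.set(userId, randomBytes(32)); // 256-bit key
  const iv = randomBytes(16);
  const cipher = createCipheriv("aes-256-cbc", keys.get(userId)!, iv);
  return { iv, data: Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]) };
}

function decryptForUser(userId: string, iv: Buffer, data: Buffer): string {
  const key = keys.get(userId);
  if (!key) throw new Error("key deleted: data is unrecoverable");
  const decipher = createDecipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

const enc = encryptForUser("user-1", "street: 123 Main St");
console.log(decryptForUser("user-1", enc.iv, enc.data)); // readable while the key exists

keys.delete("user-1"); // "right to be forgotten": shred the key, keep the events
```

After `keys.delete`, the encrypted events are still in the store, but any attempt to decrypt them fails: the contract stays intact while the data is effectively gone.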
- Domain-Driven Design
- Event-Driven Architecture
Event Sourcing excels when used together with the concepts above.
Uses conceptual maps of the domain (called object models) that map easily to actual business concepts and incorporate both behavior and data.
It creates a common vocabulary to communicate easily with other technical teams and with business people.
e.g. in the e-commerce business, a must-have object model would be an order (defined by its ID, customer ID, date, status, etc.).
Captures something that happened that changed the state of a domain model.
e.g. a person changed their address; an AddressChanged domain event captures this and updates the Address domain model.
We can then use the knowledge gained from the changes to do something useful with it.
e.g. Since the person moved to a new place, we can send them reminders to update their address for all their bills.
Stores the full history of domain events in the Event Storage.
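A minimal sketch of a domain event and how applying it changes its domain model (the `AddressChanged` name follows the example above; everything else is made up):

```typescript
// Hypothetical AddressChanged domain event and the function that
// applies it to the Person domain model to produce the new state.
interface AddressChanged {
  type: "AddressChanged";
  personId: string;
  newAddress: string;
}

interface Person {
  id: string;
  address: string;
}

function apply(person: Person, event: AddressChanged): Person {
  // The event captures what happened; applying it yields the new state.
  return { ...person, address: event.newAddress };
}

const before: Person = { id: "p-1", address: "Old Street 1" };
const after = apply(before, { type: "AddressChanged", personId: "p-1", newAddress: "New Avenue 2" });
console.log(after.address); // "New Avenue 2"
```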
Misconception: They are not the same!
Event-Driven Architecture: a system whose components communicate mainly or exclusively through notification events.
When we talk about an “event” in Event Sourcing, we are actually referring to a state change, not to a “notification” as in Event-Driven Architecture (queues, buses, streams, async communication in general).
domain event != notification event
When working with microservices architecture, we need to notify multiple services about state changes (domain events) so they can react to them.
Usually, this is done with notification events (so we are using an event-driven architecture).
In this case, the notification events are the domain events themselves.
e.g. domain events are persisted in the Event Storage. In an event-driven architecture, we need to notify Services A and B about them; since we chose async communication for this purpose, we can call them notification events.
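One way to sketch this flow (the in-process “queue” and all names are made up): domain events are persisted first as the source of truth, then published as notification events so other services can react.

```typescript
// Hypothetical flow: persist the domain event, then publish it as a
// notification event to every subscribed service.
interface DomainEvent { type: string; payload: unknown; }

const eventStorage: DomainEvent[] = [];                   // source of truth
const subscribers: Array<(e: DomainEvent) => void> = [];  // Services A, B, …

function persistAndNotify(event: DomainEvent): void {
  eventStorage.push(event);                        // 1. persist the domain event
  subscribers.forEach((handle) => handle(event));  // 2. publish it as a notification event
}

const received: string[] = [];
subscribers.push((e) => received.push(`Service A saw ${e.type}`));
subscribers.push((e) => received.push(`Service B saw ${e.type}`));

persistAndNotify({ type: "OrderPlaced", payload: { orderId: "order-1" } });
console.log(received); // ["Service A saw OrderPlaced", "Service B saw OrderPlaced"]
```

In a real system the in-process array would be a queue, bus, or stream, but the ordering of the two steps is the point: the Event Storage is written first.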
Don’t use your Event Storage as a Read Model.
You might be tempted to query your Event Storage to find the latest state of a given domain model.
This can be slow for large systems. It might work for very small ones, but reconstructing the full history just to find the latest state on every regular read is not a good idea.
A pattern that uses separate classes for reading and writing data.
- This allows us to have different models to read and write data.
- This also means the repositories (and therefore the databases) can differ for reading and writing. For instance, we can write to a relational DB (the write side being the Event Storage) and read from a NoSQL DB.
CQRS generally has 2 main concepts: commands (write) and queries (read).
- A command can return a value: most likely the returned value is an operation confirmation or an identifier, and has nothing to do with state changes. The operation is expected to be asynchronous: a domain event is put into a queue (stream, etc.) and processed at a later stage.
- A query does not change any state, its only goal is to read.
This separation makes the following scenario possible.
Let’s say we decide our Event Storage will be a NoSQL DB. Our Read Models will live in our services, each backed by its own separate DB, which can be relational.
The Event Storage will remain our source of truth but each Read Model will be designed to render the data as needed per each service. That’s its only goal.
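A minimal sketch of the command/query split (everything here is illustrative, not any specific framework’s API; the projection is done synchronously for brevity, while real systems usually do it asynchronously):

```typescript
// Hypothetical CQRS wiring: commands append domain events to the event
// storage (write side); queries only touch the read model (read side).
interface DomainEvent { type: string; payload: Record<string, unknown>; }

const eventStorage: DomainEvent[] = [];       // write side (source of truth)
const readModel = new Map<string, string>();  // read side: orderId -> status

// Command: changes state, returns only an identifier (no state data).
function placeOrderCommand(customerId: string): string {
  const orderId = `order-${eventStorage.length + 1}`;
  eventStorage.push({ type: "OrderPlaced", payload: { orderId, customerId } });
  readModel.set(orderId, "PLACED"); // projection (often async in real systems)
  return orderId;
}

// Query: reads from the read model, never changes any state.
function orderStatusQuery(orderId: string): string | undefined {
  return readModel.get(orderId);
}

const id = placeOrderCommand("customer-42");
console.log(orderStatusQuery(id)); // "PLACED"
```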
Since we are storing everything in the Event Storage, a very common question is: “How do we avoid reading all the data to find the latest state of a given domain model?” The solution is to create snapshots from time to time. Snapshots are actually part of the Event Storage itself. Let’s give an example to make it clearer.
One of the classic examples of event sourcing is the bank account on which we perform withdrawal and credit operations. Domain events persist these operations. To know the current balance of the account, we need to reconstruct all the operations from the beginning. Snapshotting can be used here to store the computed balance every 5 operations. In the worst case, we would need the latest snapshot + 4 domain events to know the current balance, decreasing the number of reads significantly.
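The bank account example can be sketched like this (all names are hypothetical; a snapshot is taken every 5 operations as described above):

```typescript
// Hypothetical snapshotting: the balance is snapshotted every
// SNAPSHOT_EVERY operations, so reconstructing the current balance only
// needs the latest snapshot plus the few events recorded after it.
const SNAPSHOT_EVERY = 5;

interface Operation { amount: number; }                    // positive = credit, negative = withdrawal
interface Snapshot { balance: number; upToEvent: number; } // index of last event included

const events: Operation[] = [];
const snapshots: Snapshot[] = [];

function record(amount: number): void {
  events.push({ amount });
  if (events.length % SNAPSHOT_EVERY === 0) {
    const balance = events.reduce((sum, op) => sum + op.amount, 0);
    snapshots.push({ balance, upToEvent: events.length });
  }
}

function currentBalance(): number {
  const snap = snapshots[snapshots.length - 1] ?? { balance: 0, upToEvent: 0 };
  // Worst case: latest snapshot + (SNAPSHOT_EVERY - 1) events.
  return events.slice(snap.upToEvent).reduce((sum, op) => sum + op.amount, snap.balance);
}

[100, -20, -30, 50, -10, 25, -5].forEach(record); // 7 operations -> 1 snapshot at event 5
console.log(currentBalance()); // 110
```

Here `currentBalance` folds only 2 events on top of the snapshot instead of all 7, which is where the read savings come from.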
If Snapshotting is used to reconstruct the latest state for a given domain model, why would we use Read Models?
When working with a microservices architecture, they serve different purposes.
Imagine adding a new microservice, and therefore a new Read Model DB. We need to reconstruct the latest state of the domain models from the full history of domain events in the Event Storage, so we’ll use the Replay Events feature to achieve this.
Replay Events is a feature of event sourcing and is a byproduct of 2 of the main concepts we defined: full history of domain events + ordered domain events. We can replay the full history of what happened and get the latest state for a given domain model.
This can be a very time-consuming operation if we are talking about millions of domain events. The fastest way to do this is using Snapshotting.
In this case we can seamlessly add a new microservice. We are using snapshots for a different purpose here: we still read data from a given microservice’s Read Model, not from the snapshots in the Event Storage.
The same situation applies if we decide to completely dump a Read Model and change it for a new one. Our source of truth remains the Event Storage and we just need to replay all the events and create our new read model.
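Replaying the full history to build a brand-new Read Model can be sketched as follows (event types and names are illustrative):

```typescript
// Hypothetical replay: a new microservice builds its read model from
// scratch by folding over the full, ordered history of domain events.
interface DomainEvent { type: "ItemAdded" | "ItemRemoved"; cartId: string; item: string; }

const eventStorage: DomainEvent[] = [
  { type: "ItemAdded", cartId: "cart-1", item: "book" },
  { type: "ItemAdded", cartId: "cart-1", item: "pen" },
  { type: "ItemRemoved", cartId: "cart-1", item: "pen" },
];

// Replay: fold every event, in chronological order, into the new read model.
function replay(events: DomainEvent[]): Map<string, string[]> {
  const readModel = new Map<string, string[]>(); // cartId -> current items
  for (const e of events) {
    const items = readModel.get(e.cartId) ?? [];
    if (e.type === "ItemAdded") items.push(e.item);
    else items.splice(items.indexOf(e.item), 1);
    readModel.set(e.cartId, items);
  }
  return readModel;
}

console.log(replay(eventStorage).get("cart-1")); // ["book"]
```

Because the Event Storage keeps the full ordered history, `replay` can be run at any time to rebuild any Read Model from scratch.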
When starting a new business or product, we just don’t know how much our data is worth. Even if the goal is clear at that point in time, it’s always a better idea to keep all the data so it can be used at some point in the future (storing data is cheap nowadays).

Example: imagine we are building a new e-commerce site and we are in charge of the cart features and the checkout process. In this case, what matters most is the final state of the cart before the user places the order. The fact that the user added and then removed items doesn’t affect the checkout process or the correct functioning of the website. Nevertheless, we should keep the full event history of adding and removing items from the cart. Even if it’s not useful now, it can help us analyze customer behavior and understand why they added and then deleted those items.
- Easier debugging: by reliably persisting the full history of events, we can debug our applications by inspecting the full history of events for a given domain model.
- No data loss: since it’s an append-only system, we never lose data; we can’t delete any data.
- A high security standard by design: we can’t update or remove data. Even if the system gets hacked, data could only be appended, which would change the current state of a given domain model, but the full history would remain intact. In any case, not losing any data is a great benefit.
- Great fit for analytics: since we have the full history of everything that happened, we can analyze the past and use the data to drive the business.
- Mental shift for developers: There are a bunch of concepts that need to be mastered here, event sourcing, DDD, CQRS, event-driven architecture etc. It can be overwhelming.
- It should not be used for all scenarios. Like any other pattern, it doesn’t make sense in some cases.
- Unless we use the Event Storage itself for the read models (which is not a best practice, and not even possible in some cases), we introduce data redundancy, which means eventual consistency and potential inconsistencies between the write and read sides.
This is how we work with event sourcing in Booster.