Discussion on: Event Sourcing: What it is and why it's awesome

Kasey Speakman • Edited

My current project is event sourced. I agree it is awesome for all the reasons you describe. I would also add that you can start small and progressively optimize. Small teams don't have to opt into eventual consistency, process managers, snapshots, etc. from the get-go. These can be evolved later as needs arise.

Additional challenge for ES: set validation -- e.g. a unique email. To say an email is unique creates a relationship between that email and every other email in the set (mutual exclusivity). Log storage is very inefficient at this type of verification, since you have to replay the entire log every time to validate it.
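
Roughly, the naive check ends up looking like this (a made-up Python sketch; the event shapes and the load_all_events helper are purely illustrative):

```python
def email_already_taken(load_all_events, candidate_email: str) -> bool:
    # Naive set validation against a raw event log: rebuild the whole
    # email set on every single registration attempt.
    taken = set()
    for event in load_all_events():  # O(total events), every time
        if event["type"] == "UserRegistered":
            taken.add(event["data"]["email"].lower())
    return candidate_email.lower() in taken
```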

The party line on set validation has been to keep the set data with the other eventually consistent read models. Then, in the rare case where a duplicate email address slips in before the eventually consistent data gets updated, detect the failure (the email could not be added to the set) and compensate (notify an admin).
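
As a rough sketch of that detect-and-compensate flow (Python with sqlite3 standing in for the read store; the table and event shapes are just for illustration):

```python
import sqlite3

def project_user_registered(db: sqlite3.Connection, notify_admin, event: dict) -> None:
    # Read-model table assumed: emails(email TEXT PRIMARY KEY, user_id TEXT)
    email = event["data"]["email"].lower()
    user_id = event["data"]["user_id"]
    try:
        db.execute("INSERT INTO emails (email, user_id) VALUES (?, ?)", (email, user_id))
        db.commit()
    except sqlite3.IntegrityError:
        # A duplicate slipped in before this read model caught up. The event is
        # already committed, so detect the failure and compensate instead.
        notify_admin(f"Duplicate email {email} (user {user_id}) needs manual resolution")
```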

There is also a school of thought that set-based data should just be modeled as relational data, not event sourced. The main loss there is the audit log. Temporal tables could be used to get an audit log, but they are generally more of a pain to work with. That's the tricky part of being an architect: weighing the trade-offs and picking ones which are best for your product.

Alexey Zimarev

"Validation" is not an issue on its own. This is just a subset of eventual consistency issue. Generally speaking, CQRS is suffering from this, not necessarily the event-sourcing.

Kasey Speakman

I disagree and don't think this is a problem specific to those patterns. I think it is a human nature problem of getting a new data hammer and then every bit of data looks like a nail. It's part of how we learn appropriate uses of the tools, but it's nice when somebody can save you the trouble of learning the hard way.

Alexey Zimarev

This is not what I meant. Essentially, the "validation" issue comes up whenever you have any bit of eventual consistency, where you have to check the system state against a potentially stale store. CQRS systems are more often eventually consistent, and the question of validation is the perennial winner on the DDD-CQRS-ES mailing list and on StackOverflow. However, an event-sourced system can have a fully consistent index on whatever needs to be validated, and that removes the problem.

Kasey Speakman

I understand what you are saying, but I didn't want to conflate the issue with the universe of DDD/CQRS things. I just wanted to point out that sets are one of the things that log-based storage does not do well. Go ahead and use relational tables for that, either as the source of truth for the set data or -- and I like how you put it -- as an index built from log storage.

Barry O Sullivan

Agreed, the jury seems to be out on set validation. For that we're using immediately consistent read models, i.e. as soon as a UserCreated event is fired we update the projection. This is not an optimal solution though: we have to force all CreateUser use cases to run sequentially. We can't have two running at the same time, or we can't enforce the constraint.
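
Simplified, it's something like this (a Python sketch rather than our actual code; read_model and event_store are placeholders, and a single lock serializes the use case):

```python
import threading

create_user_lock = threading.Lock()  # only one CreateUser use case at a time

def create_user(read_model, event_store, email: str) -> None:
    with create_user_lock:
        # Check against the immediately consistent projection...
        if read_model.email_exists(email):
            raise ValueError(f"{email} is already registered")
        # ...then store the event and update the projection before releasing the lock.
        event = {"type": "UserCreated", "data": {"email": email.lower()}}
        event_store.append(event)
        read_model.apply(event)
```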

If you go with this solution you have to have reporting in place to warn you when you're hitting the limit of sequential requests; otherwise requests will start failing.

Once we get these warnings, we're gonna move to the eventually consistent model you mentioned above and just accept that there could be duplicates.

This isn't a problem unique to event sourcing either. Standard table-driven web apps have the exact same problem when it comes to uniqueness across a set; it's just not as obvious that the issue exists. I think that's because DDD and ES force you to model your constraints explicitly, while in standard web dev it's not as obvious that you're not actually enforcing your set-wide constraints.

Kasey Speakman • Edited

So I have a line of thought here. Most event sourced systems already have fully-consistent read models in addition to the log itself. Namely, the tables used to track events by aggregate or by event type. They could be reasonably reconstructed from the event log, but they are there as a convenience (an index really) and updated in a fully consistent manner with the event log.

I wonder, then, if it is necessary that all my read models be eventually consistent. Maybe some of them (especially set-based data) could be fully consistent write models that also get propagated to the read side (if separate databases are used for scaling reads... which is the whole reason for eventual consistency anyway). Then you could validate the set constraint at write time, and even guarantee that duplicates cannot be written (a unique index will raise an error if it is violated while the use case is running), even if the use case checked first but missed the duplicate due to concurrency. Then there would be no need to make the use cases sequential.
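
To make that concrete, something along these lines (a Python/sqlite3 sketch of the idea with made-up table names; a UNIQUE constraint on user_emails.email does the enforcement, and both writes share one transaction):

```python
import json
import sqlite3

def register_user(db: sqlite3.Connection, user_id: str, email: str) -> None:
    try:
        with db:  # one transaction around the set projection and the event append
            # The UNIQUE constraint rejects a concurrent duplicate here,
            # even if an earlier existence check missed it.
            db.execute(
                "INSERT INTO user_emails (email, user_id) VALUES (?, ?)",
                (email.lower(), user_id),
            )
            db.execute(
                "INSERT INTO events (stream_id, type, data) VALUES (?, ?, ?)",
                (user_id, "UserRegistered", json.dumps({"email": email})),
            )
    except sqlite3.IntegrityError as exc:
        raise ValueError(f"{email} is already registered") from exc
```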

Barry O Sullivan

You can definitely do that, and it will work. A unique index will still force sequential operations, but only on that SQL call rather than on the entire use case, so the likelihood of failure is much smaller.

The only issue is that you have to handle failing use cases. Say the email was written to the read model at the start of the use case, but some business operation failed and the event was never actually stored in the log. Now you have a read model with invalid data.

If your event log and constraint projections are stored in the same SQL DB, you can solve this with a transaction around the entire use case (we do this).

There is still the issue of potential failure, i.e. too many requests hitting the DB, but it's not one you need to deal with at the start. You'll only face this when you have massive numbers of people registering at once. We're currently adding monitoring for this, just to warn us if we're reaching that threshold. It's unlikely we'll reach it (we're not building Facebook/Uber), but it's reassuring to know it's there.

Kasey Speakman

Yes, I was assuming the same database, as that is the only convenient way to get full consistency. With separate databases, you pretty much have to use eventual consistency. (Well, distributed transactions exist too, but they become problematic under load.)

Kasey Speakman

Update: I have a new tactic on this user email issue for our next product (using eventual consistency). We decided to allow a login account to belong to multiple organizations. That means different orgs are free to create a user account with an email address that is already in use. When the user logs in with that email, they will be able to see a dashboard across organizations.

We use an external auth provider that does not allow duplicate email accounts. We have an event-sourced process manager that listens for user creation and email changes. We generate a deterministic UUID from the email address and use it as the stream ID to track/control the external auth integration: e.g. replay the stream to get the current state of the integration for that email, then take the appropriate action and save events based on the current state and the event received.
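
The deterministic ID itself is the easy part. In Python it would look something like this (the namespace value here is just an example, not the one we use):

```python
import uuid

# Any fixed namespace UUID works; this one is made up for the example.
AUTH_INTEGRATION_NAMESPACE = uuid.UUID("6b1de0d8-5f1a-4c6e-9b0e-3f6c2a9d4e10")

def auth_stream_id(email: str) -> uuid.UUID:
    # The same (normalized) email always yields the same stream ID.
    return uuid.uuid5(AUTH_INTEGRATION_NAMESPACE, email.strip().lower())

assert auth_stream_id("Jane@Example.com") == auth_stream_id(" jane@example.com")
```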

To make this more resilient, I also add causation IDs to every event that this process manager saves. When I replay the stream, I verify that the event I just received is not found among the causation IDs of the replayed events. It didn't cost too much code to do, and it eliminates the possibility of accidentally retriggering side effects/events that were already performed. Since this is an autonomous component, it seemed important to prevent that.
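
In outline, the guard looks something like this (simplified Python, not the real implementation; store and decide are placeholders for the event store and the process manager's domain logic):

```python
def handle(store, stream_id, incoming_event: dict, decide) -> None:
    history = store.read_stream(stream_id)            # replay to get current state
    already_handled = {e.get("causation_id") for e in history}
    if incoming_event["id"] in already_handled:
        return  # this event already triggered side effects/events; do nothing

    new_events = decide(history, incoming_event)      # placeholder domain logic
    for e in new_events:
        e["causation_id"] = incoming_event["id"]      # remember what caused it
    store.append(stream_id, new_events)
```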