Discussion on: What would you use as a sortable, globally unique, ID?

Kasey Speakman

I had the idea that I wanted to construct a scalable distributed event store on top of AWS and ran into similar questions (ordering in a distributed system). Dynamo for distributed event storage and S3 for initial replay storage (due to high cost of Dynamo replays).

I started thinking about how to place the events in S3 in such a way that they could be replayed in some semblance of order. Each event stream is totally ordered within itself by StreamID + Version, but there is no order defined across streams. Dynamo doesn't support any such order because streams are distributed across different servers.

I looked at using the event envelope to construct the S3 file name so that the names would sort lexicographically. That way listeners could ask S3 for a file listing by name and the files would come back in order. The name would include a node-specific timestamp and the node id. The timestamp would be the default "ordering", and the node id would be used as an arbitrary tie-breaker. And if we became aware of time skew on a specific node, we could also correct for it. Many listeners only care about specific types of events, so I included the event type in the name so that only needed events had to be fully fetched. Ultimately the design ended up with pretty long file names like {timestamp}/{nodeid}/{streamid}/{event version}/{event type}.
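To make that concrete, here is a minimal sketch of the key scheme (the bucket, helper names, and timestamp format are just illustrative, not the actual implementation):

```python
from datetime import datetime, timezone

import boto3

def event_key(node_id: str, stream_id: str, version: int, event_type: str) -> str:
    # Zero-padded UTC timestamp goes first so plain lexicographic order of the
    # keys approximates time order; the node id is the arbitrary tie-breaker.
    ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S.%f")
    return f"{ts}/{node_id}/{stream_id}/{version:010d}/{event_type}"

def replay_keys(bucket: str, prefix: str = ""):
    # S3 returns listed keys in ascending lexicographic (UTF-8 binary) order,
    # so a listener paging through the listing sees events roughly in order.
    s3 = boto3.client("s3")
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]
```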

There are huge problems with this, however. I could nitpick a bunch of them, but I will jump to the overarching problem: at the scale where I would actually need to distribute the event store (and thus worry about cross-node ordering), full replays would be impractically time consuming and the volume of events would generally be unwieldy for listeners. That's true even if I could pick the perfect ordered key up front. That scale really needs a change in tactics.

Ultimately I decided that cases where I need grouping and ordering of data had to be exercised in the small. And if these need to be aggregated to larger scale, the smaller systems would have to publish "external" events that rolled up events to a higher granularity. For example, a lot of events may go into placing an Order and the sequence in which they happen matters a lot. But external listeners will not be interested in handling all those details. They instead prefer a single OrderPlaced event with all details included. So in the Order system I'll have a listener go back and construct one for external publishing. These could be published to a stream processing platform such as Kafka for larger integration scenarios. And Kafka already has some metaphors for how to make that work from a subscriber's perspective.
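A rough sketch of what that roll-up listener might look like (the event names and shapes here are made up for illustration):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class OrderPlaced:
    order_id: str
    items: List[Dict[str, Any]] = field(default_factory=list)
    shipping_address: Optional[Dict[str, Any]] = None

def roll_up(order_id: str, events: List[Dict[str, Any]]) -> OrderPlaced:
    # Fold the fine-grained Order events (already in stream order) into one
    # coarse event that external consumers can handle on its own.
    placed = OrderPlaced(order_id=order_id)
    for e in events:
        if e["type"] == "ItemAdded":
            placed.items.append(e["data"])
        elif e["type"] == "ItemRemoved":
            placed.items = [i for i in placed.items if i["sku"] != e["data"]["sku"]]
        elif e["type"] == "ShippingAddressSet":
            placed.shipping_address = e["data"]
    return placed  # publish this single event to Kafka or similar
```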

Moral of the story (applicable to ordered GUIDs I think) is that ordering in a distributed scenario wasn't the best problem for me to solve. It is possible to power through the ordering problem (also dealing with clock skew or failed clocks that claim the event happened in 1970), but it requires extra work and ongoing upkeep. And I still don't end up with a system that has the right granularity for large scale.

rhymes

Thanks for the detailed explanation. I can see how complicating the architecture to allow ordering is not worth it in your case, especially because you have what I believe is an "event sourced" architecture, with multiple levels of granularity of events.

It is true indeed that not all listeners, especially those that are merely external consumers, are interested in that level of detail, nor in the ability to replay events.

Our case is very limited in scope and will ultimately become a very simple pub/sub system. The question of what sort of "universal" IDs to choose for the events transiting the inside of the app and being sent outside came up, and thus I opened this discussion thread.

I don't foresee any real drawbacks in choosing a GUID that's also sortable right from the start; what do you think? By having IDs that are inherently sortable, the consumer can use that property or ignore it as they wish.
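For example, I'm imagining something roughly like this (just a sketch of the idea, not any particular library):

```python
import secrets
import time

def sortable_id() -> str:
    # Millisecond timestamp prefix plus random bits, hex-encoded so that string
    # order follows creation order (on a single, well-behaved clock). This is
    # the general idea behind ULID/KSUID, not a spec-compliant implementation.
    millis = int(time.time() * 1000)
    return f"{millis:012x}{secrets.token_hex(10)}"

ids = []
for _ in range(3):
    ids.append(sortable_id())
    time.sleep(0.002)          # ensure each id lands in a later millisecond
assert ids == sorted(ids)      # lexicographic order matches creation order
```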

Hope this makes sense

Kasey Speakman • Edited

The only real downside is that making them sortable will encourage you to use and depend on that feature. And when clock skew comes into play, the code that depends on sortability will probably behave unexpectedly. When your code processes the data out of order, you could observe strange things like issuing updates for data that hasn't been inserted yet.

How does that happen? On a cloud provider, your code may be restarted on a different node (for failure, maintenance, or no reason at all) whose clock is skewed from the previous node's. Providers usually do have time sync, but it is best effort: there are no guarantees about clock accuracy between servers. The margin of error should normally be small enough that you don't have a problem. But if you ever do hit the issue, it will be hard to diagnose. Also note that when a hardware clock fails, it commonly resets to zero. That would be a little easier to diagnose, but either way recovery (changing the IDs? mapping them to new IDs?) could be painful.
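To make the failure mode concrete, here's a contrived example (all numbers made up) where one node's clock runs two seconds behind another's:

```python
def id_on_node(wall_clock_millis: int, skew_millis: int, suffix: str) -> str:
    # A timestamp-prefixed id where the node's skewed clock leaks into the
    # sortable part of the id.
    return f"{wall_clock_millis + skew_millis:012x}-{suffix}"

insert_id = id_on_node(1_700_000_000_000, 0, "node-a")       # insert happens first, on node A
update_id = id_on_node(1_700_000_001_000, -2_000, "node-b")  # update happens 1s later, on node B

# Sorted replay now sees the update before the insert it depends on.
assert update_id < insert_id
```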

For an event store I use a big integer called Position (not an auto increment) to get total ordering for that node. It guarantees that whatever happens later has a larger Position than whatever happened before, but it doesn't provide a global perspective. You can get a global order by reading from the nodes in Position order and then choosing the lowest timestamp among the current head events. Done that way, you know it is a "best effort" ordering, and you may be able to account for some known skewed timestamps. Even if timestamps are off, you are guaranteed that events from the same node are in order. I haven't actually had to do a global order across nodes yet, but there is one on the horizon to merge different event stores.
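Sketched out, that cross-node merge looks roughly like this (the event shape and names are illustrative):

```python
import heapq
from typing import Any, Dict, Iterator, List, Tuple

def merge_nodes(feeds: Dict[str, Iterator[Dict[str, Any]]]) -> Iterator[Dict[str, Any]]:
    # Each feed yields that node's events in Position order. Keep one "head"
    # event per node in a heap and always emit the lowest timestamp next,
    # breaking ties by node id. Per-node order is preserved; cross-node order
    # is only as good as the timestamps.
    heap: List[Tuple[Any, str, Dict[str, Any], Iterator[Dict[str, Any]]]] = []
    for node_id, feed in feeds.items():
        head = next(feed, None)
        if head is not None:
            heapq.heappush(heap, (head["timestamp"], node_id, head, feed))
    while heap:
        _, node_id, event, feed = heapq.heappop(heap)
        yield event
        head = next(feed, None)
        if head is not None:
            heapq.heappush(heap, (head["timestamp"], node_id, head, feed))
```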

Kasey Speakman

Forgot to mention: random UUIDs are still used to identify entities, on top of the node-wide Position and event timestamps.