DEV Community

Discussion on: What would you use as a sortable, globally unique, ID?

Matteo Joliveau

May I ask what the need for sorting is?
I personally don't see many reasons to sort by ID; most often what people really want is to sort by creation date.

Anyway, back on topic. This is a tricky question because globally unique and predictable are two properties that directly clash with each other. A solution is taking a look at Microsoft SQL Server Sequential ID, a sortable GUID.
There are a couple of disadvantages:

  • They are not decentralized. If they were, you would lose the anti-clashing capabilities of GUIDs.
  • They are predictable. Don't ever expose them to the public, as they can lead to resource enumeration.

Basically, you may as well use a BIGINT for your IDs if you need sorting.

If you need to sort events on a distributed system, better to use some other property, like timestamps.

rhymes • Edited

May I ask what the need for sorting is?

Sure, efficiency and convenience mostly. Let's say we use UUIDs: they work mostly well until these IDs land in a place far from your system. Someone decides to store events on S3 using the UUID as the file name, and suddenly you have gigabytes of events that can't be sorted unless you peek inside each file to find the timestamp.

Or you grep an event log and suddenly you have to come up with a combination of bash commands in a pipe to extract both the ID and the timestamp to sort them.

Or you want to create shards out of them to group related data on different machines; UUIDs are useless for that.

In my experience UUIDs are great, until they aren't :D

The great thing about globally unique and sortable IDs is that they carry information with them. If well designed I can even deconstruct them to extract such info.

This is a tricky question because globally unique and predictable are two properties that directly clash with each other.

Not exactly true, see the examples at the end of my comment :)

A solution is taking a look at Microsoft SQL Server Sequential ID, a sortable GUID.

This definitely wouldn't work; as you said, they are basically sequential, as the name says.

If you need to sort events on a distributed system, better to use some other property, like timestamps.

But timestamps are not globally unique and can be duplicated.

I'll leave you with two examples of partially sortable IDs that are also random and unique:

For example, Firebase uses something like this for their IDs: The 2^120 Ways to Ensure Unique Identifiers
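
To make the idea concrete, here is a rough Python sketch of an ID that is unique, partially sortable, and can be deconstructed to recover its timestamp. This is not Firebase's actual code. The names make_id and timestamp_ms, and the layout (a 48-bit millisecond timestamp followed by 72 random bits, 120 bits in total, encoded with a 64-character alphabet in ASCII order), are assumptions picked for illustration.

```python
import secrets
import time

# 64 characters in ascending ASCII order, so comparing the encoded strings
# gives the same result as comparing the numbers they encode.
ALPHABET = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"

TIME_CHARS = 8     # 8 chars * 6 bits = 48 bits of millisecond timestamp (assumed layout)
RANDOM_CHARS = 12  # 12 chars * 6 bits = 72 bits of randomness (assumed layout)


def make_id(now_ms=None):
    """Return a 20-character ID: timestamp prefix followed by a random suffix."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    # Encode the timestamp in base 64, most significant digit first,
    # padded to a fixed width so that IDs sort by creation time.
    digits = []
    for _ in range(TIME_CHARS):
        digits.append(ALPHABET[now_ms % 64])
        now_ms //= 64
    time_part = "".join(reversed(digits))
    # The random suffix keeps IDs created in the same millisecond distinct
    # (with overwhelming probability), without any central coordinator.
    random_part = "".join(secrets.choice(ALPHABET) for _ in range(RANDOM_CHARS))
    return time_part + random_part


def timestamp_ms(id_):
    """Deconstruct an ID: recover the millisecond timestamp from its prefix."""
    value = 0
    for ch in id_[:TIME_CHARS]:
        value = value * 64 + ALPHABET.index(ch)
    return value
```

IDs generated this way sort by creation time with a plain string sort (ls, sort, S3 key listings), and timestamp_ms(some_id) gives the creation time back without opening the file. Two IDs created in the same millisecond are ordered arbitrarily relative to each other, which is why this is only partially sortable, and because the prefix is a timestamp the IDs are not fully unpredictable either.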