Viktor Dorotovič

Posted on Oct 7, 2022 • Originally published at toomanyafterthoughts.com

UUIDs Are Bad for Database Index Performance, enter UUID7!

#database #performance #webdev #news

UUIDs, Universal Unique Identifiers, are a specific form of identifier designed to be unique even when generated on multiple machines. Compared to autoincremented sequential identifiers commonly used in relational databases, generating does not require centralized storage of the current state, I.e., the identifiers that have already been allocated. This is useful when a centralized system would pose a performance bottleneck or a single point of failure. UUIDs are designed to be able to support very high allocation rates, up to 10 million per second per machine.

Despite the fact that some types (eg., UUID4) are not guaranteed to be unique by their method of generation, the chance of generating two conflicting UUIDs is very low. This is also due to the fact that UUIDs are 128 bits long.

UUIDs were formally defined as an Internet Draft in 2002, which was promoted to a Proposed Standard as RFC 4122. At the time of writing, 17 years later, it still has the status of a proposal.

Representation & structure

UUIDs are just 128 bit numbers, or 16-byte long binary sequences, and can be stored as such for efficiency. In API calls, they are commonly transferred in a text format of hexadecimal sequenced separated by dashes in the pattern of 8-4-4-4-12. This mirrors the internal structure as defined by the RFC:

   0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          time_low                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       time_mid                |         time_hi_and_version   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |clk_seq_hi_res |  clk_seq_low  |         node (0-1)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         node (2-5)                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

UUID versions

Even though the layout looks like a single schema for UUID generation, there are multiple versions of how the individual fields are populated – UUID1 to UUID5. The version is actually encoded in the UUID itself and makes up the most significant 4 bits of the time_hi_and_version field.

UUID1 - time-based

UUID2 – DCE security version

The specification of this version is available at https://pubs.opengroup.org/onlinepubs/9696989899/chap5.htm#tagcjh_08_02_01_01

UUID3 – name-based, MD5

UUID4 – random

UUID5 – name-based, SHA1

UUIDs and B-tree indices

As described in the post comparing random and sequential primary keys, B-tree indexes are well suited for inserting keys when values are increasing and not so much when the values are randomly distributed. Notice something about UUIDs? Neither version is designed to generate increasing values, where the sort order is similar to how the UUIDs were generated in time. No, not even UUID1. UUID1 is based on time, but the timestamp is split up into multiple segments and they are in the wrong order in the UUID structure. The least significant bits – the bits that have the highest change frequency are located at the start of the UUID stream – as the time_low field.

UUID1 vs UUID4

Commonly used UUID versions are 1 and 4 because they can be generated “out of thin air”, without requiring some custom values (namespace, name).
Considering that none of them is increasing, there should be little difference in insert performance between UUID1 and UUID4.

To test this hypothesis, I extended the test rig developed for comparing random and sequential integer IDs. The script inserts equal-sized chunks of records into a table and tracks the I/O write volume of the DB engine generated by the inserts. The I/O measurement is done by running the DB engine in a Docker container and polling Docker’s stats API.

For this comparison, I’ve used SQLite in clustered mode.

Umm, nope. Hypothesis busted. UUID1 performs much better.

The reason is that while overall, uuid1 isn’t ordered by time, it is increasing in short periods until time_low rolls over and starts from 0 again. The time it takes time_low to roll over is about 7 minutes and 9 seconds. After this period, filled B-tree pages will be revisited again and page splitting will occur.

Inserting 1M records during the test took about 33 minutes, so this would mean only 4 rollovers. In order to show how it would perform on a longer timescale, I created a modified version of UUID for which time passes 100× faster – uuid1_fastrollover.

Designing efficient UUIDs

What if a new UUID version could be designed that would take the randomness of UUID4 and combine it with a timestamp prefix? This would make the UUID increase overall, but not locally – due to the random postfix. The random part ensures uniqueness when a high generation rate is necessary and also makes the UUIDs hard to predict – it’s not possible to guess the previous, or next UUID. It’s fairly simple to devise a custom UUID scheme, but fortunately, there is a new Internet-Draft (at the time of writing) defining new pseudo-sequential UUID versions that aim to solve exactly this issue: draft-peabody-dispatch-new-uuid-format-04. The current state and progress can be viewed at IETF Datatracker.

UUID 6 – field-compatible version of UUID1

The original field structure is kept, but the timestamp fields are shuffled around so that the whole timestamp is in the correct order – starting with the most significant bits and ending with the least significant ones. This way, consecutively generated UUIDs are always increasing in value.

UUID 7 – time-ordered

This is is the go-to version for use in newly developed systems, where forward compatibility is not necessary. Notice that UUID1 uses a Gregorian epoch timestamp, while UUID7 is defined to use the Unix epoch timestamp.

UUID 8 – custom, vendor-specific

I have included this version just for completeness – it is in the draft, but here the ordering will clearly depend on the custom data, so this version is not going to be tested.

Testing & comparison

Implementation notes – which Python library was used for UUID7 and UUID8?

I’ve found a library on PyPI that appears to implement UUID7 – uuid7, but upon further inspection of (the only) version 0.1.0, it does not seem to match the structure defined in the draft at the time of writing.

Confusingly enough, there is a package named uuid6 that implements both UUID6 and UUID7 and seems to follow the draft. I used it for the test.

InnoDB – clustered index

UUID6 and UUID7 perform equally well, as expected.

PostgreSQL – non-clustered index

When using Postgres’s non-clustered index, there is a similar overhead for UUID4 over UUID6/7. The I/O is an order of magnitude lower than for InnoDB though.

Conclusion

When comparing UUID1 and UUID4 from the original RFC, UUID1 performs much better even though the time fields are in the wrong order, this is because the low bits still take about 7 minutes to roll over, and during this time the UUID value is increasing.

UUID6 and UUID7 from the draft take this further and keep the ordering of generated values over the full timeline. Even though the versions are not yet standardized, there are existing libraries, at least for Python, that follow the draft. When designing a new database, UUID7 is the go-to option and would perform well in most database engines.

If you’re working on a system that already generates UUID1 or UUID4 but you can still choose a DB engine that supports non-clustered indexes like PostgreSQL, then it is a viable option. There is a performance hit, but much less pronounced compared to InnoDB’s clustered index. Microsoft SQL Server also supports both index types.

Alternative solutions & related studies

Universally Unique Lexicographically Sortable Identifier

UUIDs are Popular, but Bad for Performance — Let’s Discuss – Percona Database Performance Blog

Illustrating Primary Key models in InnoDB and their impact on disk usage – Percona Database Performance
I dig the straightforward graphical view of InnoDB page allocation.

MySQL 8 can convert UUID1 to binary and shuffle the time fields to the “correct” order:
UUID_TO_BIN(string_uuid, swap_flag)
BIN_TO_UUID(binary_uuid, swap_flag)
Rick James: GUID/UUID Performance Breakthrough

DEV Community