Using UUIDs (Universally Unique Identifiers) as primary keys in databases is a common practice for us as developer, but this approach can have significant performance drawbacks. I am going to explore with you two major performance issues associated with using UUIDs as keys in your database tables.
What are UUIDs?
A UUID (Universal Unique Identifier) is a 128-bit value used to uniquely identify an object or entity on the internet. Among various versions, UUIDv4 is the most popular.
Here’s an example of a UUIDv4:
123e4567-e89b-12d3-a456-426614174000
UUIDs are like the star players in a football team – they stand out and are unique, but not always the best choice for every play.
Problem 1: Insert Performance – The Fumble
When a new record is inserted into a table, the primary key index must be updated to maintain optimal query performance. Indexes are constructed using the B+ Tree data structure, which requires rebalancing with each insertion to stay efficient.
With UUIDs, the inherent randomness complicates this process, leading to significant inefficiencies. As your database scales, millions of nodes need rebalancing, drastically reducing insert performance when using UUID keys.
Example:
CREATE TABLE players (
id UUID PRIMARY KEY,
name VARCHAR(255)
);
INSERT INTO players (id, name) VALUES ('123e4567-e89b-12d3-a456-426614174000', 'Kevin Lebron');
Tip: Consider using UUIDv7 instead, as it has inherent ordering that simplifies indexing, like having a well-coordinated offensive line 😊!
Problem 2: Higher Storage Requirements
UUIDs consume much more storage compared to auto-incrementing integer keys. Auto-incrementing integers use 32 bits per value, whereas UUIDs use 128 bits – four times more per row. When stored in human-readable form, a UUID can consume up to 688 bits, approximately 20 times more per row.
This is like getting a penalty flag for excessive celebration. It’s unnecessary and costly.
Example:
CREATE TABLE players (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255)
);
INSERT INTO players (name) VALUES ('Kevin Lebron');
Let’s simulate the impact with two tables:
- Table 1: Contains 1 million rows with UUIDs.
- Table 2: Contains 1 million rows with auto-incrementing integers.
Results:
- Total table size: The UUID table is about 2.3 times larger than the integer table.
- ID field size: An individual UUID field requires 9.3 times more storage space than an integer field.
- ID column size: Excluding other attributes, the UUID column is 3.5 times larger than the integer column.
Using UUIDs is like getting hit with repeated penalty flags – your storage requirements balloon unnecessarily, impacting performance.
Conclusion
While UUIDs are excellent for ensuring uniqueness, they present significant scalability challenges. The performance issues discussed are more noticeable at scale, so for smaller applications, the impact might be minimal. However, it is crucial to understand these implications and design your database accordingly.
Remember: In both football and databases, it's all about making the right plays at the right time!
If you found this article helpful, please share, and connect with me!
References
What is a UUID, and what is it used for?
The Problem with UUID
B Trees and B+ Trees. How they are useful in Databases
Top comments (0)