DEV Community

Flavio Rosselli

Everything you need to know about UUID.

A Universally Unique Identifier (UUID) is a 128-bit label used in computer systems to identify information uniquely. UUIDs are designed to be unique across space and time, allowing them to be generated independently without a central authority, minimising the risk of duplication.

UUIDs serve various purposes, including:

  • Identifying records in databases.
  • Tagging objects in distributed systems.
  • Serving as primary keys in applications where uniqueness is critical.

Real-world Use Cases

  • Databases: UUIDs are often used as primary keys in relational databases to ensure the unique identification of records.
  • Microservices: UUIDs facilitate communication between services by providing unique identifiers for requests and resources.
  • IoT Devices: Identify devices uniquely in a network, ensuring that data from multiple sources can be aggregated without conflicts.

Advantages and Disadvantages of Using UUIDs

Advantages:

  • Global Uniqueness: UUIDs are extremely unlikely to collide, making them suitable for distributed systems where multiple nodes generate identifiers independently.
  • No Central Authority Required: They can be generated without coordination, which simplifies operations in distributed environments.
  • Scalability: They work well in systems that require scaling across multiple servers or services.
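
To put "extremely unlikely" into numbers, the birthday approximation gives a rough estimate of the collision risk. The sketch below assumes version 4 UUIDs with 122 random bits drawn from a good random source; the function name `collision_probability` is illustrative, not part of any library.

```python
import math

RANDOM_BITS = 122  # assumption: version 4 UUIDs, which carry 122 random bits


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate chance that n random values contain at least one duplicate."""
    # Birthday approximation: p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)).
    exponent = -n * (n - 1) / (2 * 2**bits)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(exponent)


p = collision_probability(10**9)
print(f"{p:.2e}")
```

Under these assumptions, even a billion generated UUIDs give a collision probability on the order of 1e-19.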

Disadvantages:

  • Storage Size: UUIDs take 128 bits (16 bytes) in binary form, or 36 characters as text, compared to traditional integer IDs (typically 32 or 64 bits), which can lead to increased storage costs.
  • Performance Issues: Indexing random UUIDs (such as version 4) can degrade database performance: new values land at arbitrary positions in the index rather than at the end, and the keys are larger, which tends to make inserts and queries slower than with sequential IDs.
  • User Unfriendliness: UUIDs are not easily memorable or user-friendly when presented in user interfaces.
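
The size difference is easy to check. This is a minimal sketch using Python's standard `uuid` module; how a given database actually stores the value (native UUID type, binary column, or text) is an assumption that varies by system.

```python
import uuid

u = uuid.uuid4()

binary_size = len(u.bytes)   # raw form: 16 bytes (128 bits)
text_size = len(str(u))      # canonical text form with hyphens: 36 characters
compact_size = len(u.hex)    # hex digits without hyphens: 32 characters

print(binary_size, text_size, compact_size)
```

Storing the 16-byte binary form rather than the 36-character string is a common way to reduce the overhead.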

The Standard

The standard representation of a UUID consists of 32 hexadecimal characters divided into five groups, separated by hyphens, following the format 8-4-4-4-12, resulting in a total of 36 characters (32 hexadecimal digits plus 4 hyphens).

The UUID format can be visualized as follows:

xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx

Where:

  • M indicates the UUID version (for example, 4 for a version 4 UUID).
  • N indicates the variant: its most significant bits determine how the rest of the UUID's layout is interpreted.
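
As a rough illustration of the 8-4-4-4-12 layout, the sketch below validates a string against the canonical form and picks out the M and N digits. The pattern, the helper name `split_uuid`, and the sample value are assumptions made for this example.

```python
import re

# Canonical textual form: 8-4-4-4-12 hexadecimal digits separated by hyphens.
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def split_uuid(text: str) -> dict:
    """Check the canonical layout and return the group lengths plus M and N."""
    if not UUID_PATTERN.fullmatch(text):
        raise ValueError(f"not a canonical UUID string: {text!r}")
    groups = text.split("-")
    return {
        "group_lengths": [len(g) for g in groups],
        "M": groups[2][0],  # first digit of the third group: the version
        "N": groups[3][0],  # first digit of the fourth group: holds the variant bits
    }


info = split_uuid("550e8400-e29b-41d4-a716-446655440000")
print(info)
```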

Components of a UUID

  1. TimeLow: 4 bytes (8 hex characters) representing the low field of the timestamp.
  2. TimeMid: 2 bytes (4 hex characters) representing the middle field of the timestamp.
  3. TimeHighAndVersion: 2 bytes (4 hex characters) that include the version number and the high field of the timestamp.
  4. ClockSequence: 2 bytes (4 hex characters) used to help avoid collisions, especially when multiple UUIDs are generated in quick succession or if the system clock is adjusted.
  5. Node: 6 bytes (12 hex characters), typically representing the MAC address of the generating node.
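
These fields can be inspected with Python's standard `uuid` module. The sketch below uses the DNS namespace UUID from RFC 4122 as sample input, since it happens to be a version 1 UUID; any other version 1 value would do.

```python
import uuid

# Sample input: the RFC 4122 DNS namespace UUID, which is a version 1 UUID.
u = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")

print(f"TimeLow            = {u.time_low:08x}")
print(f"TimeMid            = {u.time_mid:04x}")
print(f"TimeHighAndVersion = {u.time_hi_version:04x}")
# The clock sequence shares its two bytes with the variant bits.
print(f"ClockSequence      = {u.clock_seq_hi_variant:02x}{u.clock_seq_low:02x}")
print(f"Node               = {u.node:012x}")

# Dropping the 4 version bits and joining the three time fields
# gives back the 60-bit timestamp.
timestamp = ((u.time_hi_version & 0x0FFF) << 48) | (u.time_mid << 32) | u.time_low
print(timestamp == u.time)
```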

Types of UUIDs

  1. Version 1: Time-based UUIDs that use a combination of the current timestamp and the MAC address of the generating node. This version ensures uniqueness across space and time.

  2. Version 2: Similar to version 1 but includes local domain identifiers; however, it is less commonly used due to its limitations.

  3. Version 3: Name-based UUIDs generated using an MD5 hash of a namespace identifier and a name.

  4. Version 4: Randomly generated UUIDs: 122 of the 128 bits are random, and the remaining 6 bits are reserved for the version and variant.

  5. Version 5: Like version 3 but uses SHA-1 for hashing. SHA-1 is a stronger hash than MD5, so version 5 is generally preferred when compatibility with version 3 is not required.
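
For a quick feel of the versions, here is a short sketch with Python's standard `uuid` module, which offers generators for versions 1, 3, 4 and 5 (it has no generator for version 2). The name "example.com" is only a placeholder.

```python
import uuid

# Version 1: timestamp plus node ID (the MAC address when one can be found).
v1 = uuid.uuid1()

# Versions 3 and 5: deterministic, derived from a namespace UUID and a name.
v3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.com")
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

# Version 4: random.
v4 = uuid.uuid4()

for u in (v1, v3, v4, v5):
    print(u.version, u)
```

Running it twice shows the difference in behaviour: the version 3 and 5 values stay the same, while versions 1 and 4 change on every call.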

Variants

The variant field in a UUID determines its layout and interpretation. It is stored in the most significant bits of the 9th byte, and four variants are defined:

  • Variant 0 (bits 0xx): Reserved for NCS backward compatibility.
  • Variant 1 (bits 10x): The standard RFC 4122 layout used for most UUIDs.
  • Variant 2 (bits 110): Reserved for Microsoft backward compatibility.
  • Variant 3 (bits 111): Reserved for future definitions.
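
The variant can be read from the top bits of the 9th byte. The helper below is a minimal sketch of that check; the name `variant_name` and its return labels are invented for this example, and Python's own `UUID.variant` property performs the same classification.

```python
import uuid


def variant_name(u: uuid.UUID) -> str:
    """Classify a UUID by the most significant bits of its 9th byte."""
    octet = u.bytes[8]
    if (octet & 0b1000_0000) == 0:
        return "NCS (0xx)"
    if (octet & 0b1100_0000) == 0b1000_0000:
        return "RFC 4122 (10x)"
    if (octet & 0b1110_0000) == 0b1100_0000:
        return "Microsoft (110)"
    return "future (111)"


sample = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
print(variant_name(sample))
```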

Example

For Version 4, a UUID might look like this:

550e8400-e29b-41d4-a716-446655440000

Here:

  • The leading 4 in 41d4 indicates it's a version 4 UUID.
  • The leading a in a716 (binary 1010) carries the variant: its top two bits are 10, which is the common RFC 4122 ("Leach-Salz") variant.

How UUIDs are Calculated

  1. Version 1 (Time-based):

    • The timestamp is typically the number of 100-nanosecond intervals since October 15, 1582 (the date of the Gregorian calendar reform).
    • The node is the MAC address of the machine generating the UUID.
    • The clock sequence helps ensure uniqueness when the clock is set backwards (e.g., after a system restart or a time adjustment) or when the node ID changes.
  2. Version 3 and Version 5 (Name-based):

    • A namespace (like a DNS domain) is combined with a name (like a file path or URL) and hashed.
    • The hash (MD5 for version 3, SHA-1 for version 5) is then structured into a UUID format, ensuring the version and variant fields are properly set.
  3. Version 4 (Random-based):

    • Random or pseudo-random numbers are generated for the 122 bits of the UUID.
    • The version and variant fields are set accordingly, ensuring compliance with UUID standards.
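
Two of these calculations can be sketched with Python's standard library: decoding the timestamp of a version 1 UUID, and building a name-based UUID by hand. The helper names `v1_datetime` and `name_based` are illustrative, and the name "example.com" is a placeholder.

```python
import hashlib
import uuid
from datetime import datetime, timedelta

# Start of the UUID timestamp epoch (Gregorian calendar reform).
GREGORIAN_EPOCH = datetime(1582, 10, 15)


def v1_datetime(u: uuid.UUID) -> datetime:
    """Convert the 60-bit timestamp of a version 1 UUID to a naive UTC datetime."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry this timestamp")
    # u.time counts 100-nanosecond intervals; ten of them make one microsecond.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


def name_based(namespace: uuid.UUID, name: str, version: int) -> uuid.UUID:
    """Build a version 3 (MD5) or version 5 (SHA-1) UUID by hand."""
    hash_fn = {3: hashlib.md5, 5: hashlib.sha1}[version]
    digest = hash_fn(namespace.bytes + name.encode("utf-8")).digest()
    # Keep the first 16 bytes; the constructor then sets the version and variant bits.
    return uuid.UUID(bytes=digest[:16], version=version)


print(v1_datetime(uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")))
print(name_based(uuid.NAMESPACE_DNS, "example.com", 5))
```

The hand-built values should match what `uuid.uuid3` and `uuid.uuid5` return for the same namespace and name.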

UUIDv4 Calculation Example

Step 1: Generate 128 Random Bits

Let's assume we generate the following 128-bit random value, shown here split into the five UUID fields:

11001100110101101101010101111010 1011101101101110 0101110101011011 0101111011010011 011110100100101111001011100111000100000111100111

Step 2: Apply UUIDv4 Version and Variant

  1. Version: Replace the four most significant bits of the 7th byte (the first hex digit of the third group) with 0100 (for UUID version 4).
    Original: 0101 becomes 0100, so the third group changes from 0101110101011011 to 0100110101011011.

  2. Variant: Replace the two most significant bits of the 9th byte (the start of the fourth group) with 10 (for the RFC 4122 variant).
    Original: 01 becomes 10, so the fourth group changes from 0101111011010011 to 1001111011010011.

Step 3: Format into Hexadecimal

Convert the 128-bit binary into 5 hexadecimal groups:

  1. 32-bit group: 11001100110101101101010101111010 → ccd6d57a
  2. 16-bit group: 1011101101101110 → bb6e
  3. 16-bit group: 0100110101011011 → 4d5b (with 0100 for version 4)
  4. 16-bit group: 1001111011010011 → 9ed3 (with 10 for the variant)
  5. 48-bit group: 011110100100101111001011100111000100000111100111 → 7a4bcb9c41e7

Step 4: Combine the Groups

The final UUID would look like this:
ccd6d57a-bb6e-4d5b-9ed3-7a4bcb9c41e7
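
The four steps can also be sketched in Python. The starting value `raw` is an illustrative 128-bit number chosen for this example rather than real random output; in practice it would come from a secure random source such as `secrets.randbits(128)`.

```python
import uuid

# Illustrative 128-bit value (an assumption for this sketch, not real random output).
raw = 0xCCD6D57ABB6E5D5B5ED37A4BCB9C41E7

# Version: clear the top 4 bits of the 7th byte (bits 76-79, counting from the
# least significant bit) and write 0100.
value = (raw & ~(0xF << 76)) | (0x4 << 76)

# Variant: clear the top 2 bits of the 9th byte (bits 62-63) and write 10.
value = (value & ~(0x3 << 62)) | (0x2 << 62)

# Format as 32 hex digits and split into the 8-4-4-4-12 groups.
hex32 = f"{value:032x}"
text = "-".join([hex32[0:8], hex32[8:12], hex32[12:16], hex32[16:20], hex32[20:32]])
print(text)

# The standard library applies the same two bit changes when given version=4.
print(uuid.UUID(int=raw, version=4))
```

Both lines print the same UUID, which is a useful cross-check on the manual bit manipulation.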

Top comments (5)

Samuel Rouse

I don't think it's a standard yet, but UUIDv7 promises to solve one of the biggest issues with UUID, which is the lack of sequencing. While maintaining sufficient randomness to avoid overlap, it provides a sequence of events which can be extremely useful for partitioning data, archiving records, and providing more of a guarantee of sequence.

Mike Talbot ⭐ • Edited

I base so much of my code on v5 GUIDs. Unique for the same data, so bloody helpful, cache keys, filenames for s3, and on and on.

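// Requires the "uuid" npm package.
const { v5 } = require("uuid")

// Any fixed UUID works as the namespace, as long as it stays the same between runs.
const NAMESPACE = "fd671d64-7115-431e-93b0-fc518f1f9944"

// Same arguments in, same GUID out.
function deriveGuidFrom(...data) {
    return v5(JSON.stringify(data), NAMESPACE)
}

module.exports = { deriveGuidFrom }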

Then you can just do cache keys like:

     const cacheKey = deriveGuidFrom(tableName, filters, sortOrder)

or

     const fileName = deriveGuidFrom(fileContents)

Samuel Rouse

This is an interesting idea. Is a v5 faster/better than other hashing options? Is the benefit primarily that the output is a UUID?

Mike Talbot ⭐

Yeah, the benefit is it makes a short key out of what would be a much longer hash, etc. It's comparable with other keys quickly. I'm always storing things in Redis using such a key, or for instance, I name my files in s3 using v5 guids, then I don't need to bother virus scanning or uploading a file that already exists.

Scott Reno

Interesting! I had no idea how UUIDs were generated