tl;dr
- Version 6 & 7 UUIDs make for efficient database indexing.
- Newly defined in the year 2024.
UUID now index-friendly
The biggest criticism of UUIDs as database keys is inefficient indexing.
The arbitrary layout of bits within UUID values inhibits efficient indexing: values generated one after another land at scattered positions in the index rather than next to each other. This inefficiency is not significant with smaller tables, but becomes a major bottleneck as tables grow upwards of a million records.
This criticism is now resolved by the new Version 6 and Version 7 UUIDs.
RFC 9562
In 2024, the specification for UUIDs was revamped in a big way. RFC 9562 obsoletes RFC 4122.
Firstly, RFC 9562 was entirely rewritten. The writing is much improved and easier to read, bringing clarity and insight. The new RFC discusses the pros and cons of the various Versions and explains the motivations for the changes. Thankfully, the new spec also offers some guidance in choosing a Version, so I strongly recommend reading the actual specification.
UUID Version 6 & 7
Secondly, RFC 9562 defines three new Versions of UUID: Versions 6, 7, & 8.
Version 6
Version 1 and Version 6 are similar, both being based on points in time and space.
Version 6 supplants Version 1, rearranging its bits for optimal indexing, and changing some of the semantics of its parts, while remaining field-compatible with Version 1.
👉🏽 For brownfield projects using Version 1, consider switching to Version 6.
Version 6 bits diagram
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           time_high                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           time_mid            |  ver  |       time_low        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var|         clock_seq         |             node              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              node                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
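To make the rearrangement concrete, here is a minimal Python sketch that converts an existing Version 1 value into the Version 6 layout. The function name uuid1_to_uuid6 is my own, not something defined by the RFC or by Python's uuid module. It assumes the input is a well-formed Version 1 UUID and carries the variant, clock sequence, and node bits over unchanged.

```python
import uuid


def uuid1_to_uuid6(u1: uuid.UUID) -> uuid.UUID:
    """Reorder a Version 1 UUID's timestamp bits into the Version 6 layout."""
    if u1.version != 1:
        raise ValueError("expected a Version 1 UUID")

    ts = u1.time  # 60-bit count of 100 ns intervals since 1582-10-15

    time_high = (ts >> 28) & 0xFFFFFFFF  # most significant 32 bits
    time_mid = (ts >> 12) & 0xFFFF       # next 16 bits
    time_low = ts & 0x0FFF               # least significant 12 bits

    # The low 64 bits (var, clock_seq, node) are copied as-is.
    low64 = u1.int & 0xFFFFFFFFFFFFFFFF

    value = (
        (time_high << 96)
        | (time_mid << 80)
        | (0x6 << 76)        # version nibble
        | (time_low << 64)
        | low64
    )
    return uuid.UUID(int=value)


if __name__ == "__main__":
    v1 = uuid.UUID("c232ab00-9414-11ec-b3c8-9f6bdeced846")
    print(uuid1_to_uuid6(v1))
```

Because the most significant timestamp bits now come first, sorting the resulting values as plain bytes orders them by creation time, which is the property an index benefits from.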
Version 7
Version 7 is an all-new kind of UUID: time + random.
👉🏽 Version 7 is recommended for greenfield projects. Prefer 7 over 1 & 6 if possible.
Changes include:
- Version 7 is time-based, but redefines the time granularity and epoch (now milliseconds since the beginning of 1970 in UTC, excluding leap seconds). The timestamp bits sit at the front of the UUID as a 48-bit big-endian unsigned number. This layout promotes efficient indexing.
- To avoid security, privacy, and practical problems, the use of a MAC address as the point-in-space “node” value of Version 1 & 6 is replaced by randomly-generated values (by default).
- Ditto for the counter field defined in Versions 1 & 2, replaced by random bits (by default).
Implementations are free to fill these otherwise-random bits with values based on a sub-millisecond timestamp fraction, and/or a counter as defined in Versions 1 & 2. For example, Postgres 18 includes a 12-bit sub-millisecond timestamp fraction immediately after the timestamp, providing monotonicity for all UUIDv7 values generated by the same Postgres session (same backend process) (see post).
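As a rough illustration of the sub-millisecond idea, the Python sketch below splits a nanosecond clock reading into the 48-bit millisecond timestamp and a 12-bit fraction of the current millisecond. This is not the Postgres implementation. The name split_timestamp and the scaling approach are assumptions of mine, chosen only to show how a fraction can fit into 12 bits.

```python
import time
from typing import Tuple


def split_timestamp(ns: int) -> Tuple[int, int]:
    """Split Unix nanoseconds into (unix_ts_ms, 12-bit sub-millisecond fraction)."""
    unix_ts_ms, ns_in_ms = divmod(ns, 1_000_000)
    # Scale the position within the millisecond, [0, 1_000_000) ns, onto 0..4095.
    fraction_12 = (ns_in_ms * 4096) // 1_000_000
    return unix_ts_ms, fraction_12


if __name__ == "__main__":
    ms, frac = split_timestamp(time.time_ns())
    print(ms, frac)
```

Each step of the 12-bit fraction covers roughly 244 nanoseconds, so two values generated within the same step would still tie on these 60 bits. A real generator needs some extra rule for that case, such as bumping the previous value.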
Version 7 bits diagram
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          unix_ts_ms                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          unix_ts_ms           |  ver  |        rand_a         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var|                          rand_b                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            rand_b                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
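The layout in the diagram can be assembled by hand in a few lines. The following Python sketch is only an illustration of the field packing, using the default all-random fill for rand_a and rand_b. The names uuid7_sketch and unix_ts_ms_of are hypothetical helpers of mine, and a production system would more likely rely on a vetted library or the database's own generator.

```python
import secrets
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ts_ms: Optional[int] = None) -> uuid.UUID:
    """Pack a Version 7 UUID: 48-bit timestamp, version, rand_a, variant, rand_b."""
    if unix_ts_ms is None:
        unix_ts_ms = time.time_ns() // 1_000_000  # milliseconds since 1970-01-01 UTC
    if not 0 <= unix_ts_ms < (1 << 48):
        raise ValueError("timestamp does not fit in 48 bits")

    rand_a = secrets.randbits(12)
    rand_b = secrets.randbits(62)

    value = (
        (unix_ts_ms << 80)   # bits 127..80: unix_ts_ms
        | (0x7 << 76)        # bits 79..76: version
        | (rand_a << 64)     # bits 75..64: rand_a
        | (0b10 << 62)       # bits 63..62: variant
        | rand_b             # bits 61..0: rand_b
    )
    return uuid.UUID(int=value)


def unix_ts_ms_of(u: uuid.UUID) -> int:
    """Read the 48-bit millisecond timestamp back out of a Version 7 UUID."""
    return u.int >> 80


if __name__ == "__main__":
    u = uuid7_sketch()
    print(u, unix_ts_ms_of(u))
```

Values created in different milliseconds sort in creation order. Values created within the same millisecond sort by their random bits, so this sketch makes no ordering promise at that granularity.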
Version 8
- Version 8 is reserved for experimental or specialized use. The bits are entirely undefined, outside of the required 2 bits for Variant and 4 bits for Version.
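In practice that means an application supplies its own bits and only the six fixed bits get overwritten. Here is a minimal Python sketch of that stamping step. The name uuid8_sketch is my own, and what goes into the other 122 bits is entirely up to the application.

```python
import uuid


def uuid8_sketch(custom_bits: int) -> uuid.UUID:
    """Wrap 128 application-defined bits as a Version 8 UUID.

    The 4 version bits and 2 variant bits are overwritten; the other
    122 bits pass through untouched.
    """
    if not 0 <= custom_bits < (1 << 128):
        raise ValueError("custom_bits must fit in 128 bits")

    value = custom_bits
    value &= ~(0xF << 76)   # clear the version nibble
    value |= 0x8 << 76      # set version to 8
    value &= ~(0b11 << 62)  # clear the variant bits
    value |= 0b10 << 62     # set the variant to binary 10
    return uuid.UUID(int=value)


if __name__ == "__main__":
    print(uuid8_sketch(0))
    print(uuid8_sketch((1 << 128) - 1))
```

Note that uniqueness and sort order are not provided by the format itself here. They depend wholly on how the application chooses its custom bits.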