UUID format: 8-4-4-4-12 - Why?

Format, Guid, Uuid

Format Problem Overview


Why are UUIDs presented in the format "8-4-4-4-12" (hex digits)? I've looked around for the reason but can't find the decision that calls for it.

Example of UUID formatted as hex string: 58D5E212-165B-4CA0-909B-C86B9CEE0111

Format Solutions


Solution 1 - Format

The groups correspond to the fields time_low, time_mid, time_hi_and_version, clock_seq_hi_and_reserved together with clock_seq_low, and node, as laid out in the following RFC.

From the IETF RFC4122:

4.1.2.  Layout and Byte Order

   To minimize confusion about bit assignments within octets, the UUID
   record definition is defined only in terms of fields that are
   integral numbers of octets.  The fields are presented with the most
   significant one first.

   Field                  Data Type     Octet  Note
                                        #

   time_low               unsigned 32   0-3    The low field of the
                          bit integer          timestamp

   time_mid               unsigned 16   4-5    The middle field of the
                          bit integer          timestamp

   time_hi_and_version    unsigned 16   6-7    The high field of the
                          bit integer          timestamp multiplexed
                                               with the version number  

   clock_seq_hi_and_rese  unsigned 8    8      The high field of the
   rved                   bit integer          clock sequence
                                               multiplexed with the
                                               variant

   clock_seq_low          unsigned 8    9      The low field of the
                          bit integer          clock sequence

   node                   unsigned 48   10-15  The spatially unique
                          bit integer          node identifier

   In the absence of explicit application or presentation protocol
   specification to the contrary, a UUID is encoded as a 128-bit object,
   as follows:

   The fields are encoded as 16 octets, with the sizes and order of the
   fields defined above, and with each field encoded with the Most
   Significant Byte first (known as network byte order).  Note that the
   field names, particularly for multiplexed fields, follow historical
   practice.

   0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          time_low                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       time_mid                |         time_hi_and_version   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |clk_seq_hi_res |  clk_seq_low  |         node (0-1)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         node (2-5)                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
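
As a rough illustration of how the field layout above produces the 8-4-4-4-12 grouping, here is a minimal Python sketch. It uses the standard library's `uuid.UUID.fields` tuple and the example UUID from the question; the variable names `groups` and `formatted` are just for illustration.

```python
import uuid

# Example UUID taken from the question above.
u = uuid.UUID("58D5E212-165B-4CA0-909B-C86B9CEE0111")

# uuid.UUID.fields yields the six RFC 4122 fields as integers, in order:
# time_low, time_mid, time_hi_and_version,
# clock_seq_hi_and_reserved, clock_seq_low, node
time_low, time_mid, time_hi_version, clock_seq_hi, clock_seq_low, node = u.fields

# Each field is printed as zero-filled hex; the two one-octet clock
# sequence fields share a single group, which gives 8-4-4-4-12 digits.
groups = [
    f"{time_low:08x}",                          # 32 bits -> 8 hex digits
    f"{time_mid:04x}",                          # 16 bits -> 4 hex digits
    f"{time_hi_version:04x}",                   # 16 bits -> 4 hex digits
    f"{clock_seq_hi:02x}{clock_seq_low:02x}",   # 8 + 8 bits -> 4 hex digits
    f"{node:012x}",                             # 48 bits -> 12 hex digits
]
formatted = "-".join(groups)

print(formatted)
print(formatted == str(u))
```

The group widths fall directly out of the field sizes in the table, which is why the hyphens land where they do.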

Solution 2 - Format

The format is defined in IETF RFC4122 in section 3. The output format is defined where it says "UUID = ..."

> 3.- Namespace Registration Template
>
> Namespace ID: UUID
> Registration Information:
>    Registration date: 2003-10-01
>
> Declared registrant of the namespace:
>    JTC 1/SC6 (ASN.1 Rapporteur Group)
>
> Declaration of syntactic structure:
>    A UUID is an identifier that is unique across both space and time,
>    with respect to the space of all UUIDs. Since a UUID is a fixed
>    size and contains a time field, it is possible for values to
>    rollover (around A.D. 3400, depending on the specific algorithm
>    used). A UUID can be used for multiple purposes, from tagging
>    objects with an extremely short lifetime, to reliably identifying
>    very persistent objects across a network.
>
>    The internal representation of a UUID is a specific sequence of
>    bits in memory, as described in Section 4. To accurately
>    represent a UUID as a URN, it is necessary to convert the bit
>    sequence to a string representation.
>
>    Each field is treated as an integer and has its value printed as a
>    zero-filled hexadecimal digit string with the most significant
>    digit first. The hexadecimal values "a" through "f" are output as
>    lower case characters and are case insensitive on input.
>
>    The formal definition of the UUID string representation is
>    provided by the following ABNF [7]:
>
>     UUID                   = time-low "-" time-mid "-"
>                              time-high-and-version "-"
>                              clock-seq-and-reserved
>                              clock-seq-low "-" node
>     time-low               = 4hexOctet
>     time-mid               = 2hexOctet
>     time-high-and-version  = 2hexOctet
>     clock-seq-and-reserved = hexOctet
>     clock-seq-low          = hexOctet
>     node                   = 6hexOctet
>     hexOctet               = hexDigit hexDigit
>     hexDigit =
>           "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
>           "a" / "b" / "c" / "d" / "e" / "f" /
>           "A" / "B" / "C" / "D" / "E" / "F"
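
The ABNF above can be transcribed fairly directly into a regular expression. The following Python sketch is one possible reading of it, not an official implementation; `UUID_RE`, `is_uuid_string`, and `format_uuid` are hypothetical names introduced here for illustration.

```python
import re

# The ABNF counts octets (4, 2, 2, 1 + 1, 6); at two hex digits per
# octet that becomes groups of 8, 4, 4, 4 and 12 hex digits.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-"    # time-low
    r"[0-9a-fA-F]{4}-"    # time-mid
    r"[0-9a-fA-F]{4}-"    # time-high-and-version
    r"[0-9a-fA-F]{4}-"    # clock-seq-and-reserved + clock-seq-low
    r"[0-9a-fA-F]{12}"    # node
)


def is_uuid_string(text: str) -> bool:
    """Return True if text matches the UUID string form (either case)."""
    return UUID_RE.fullmatch(text) is not None


def format_uuid(raw: bytes) -> str:
    """Render 16 octets in the 8-4-4-4-12 form, lower case on output."""
    if len(raw) != 16:
        raise ValueError("a UUID is exactly 16 octets")
    h = raw.hex()
    return "-".join((h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]))


print(is_uuid_string("58D5E212-165B-4CA0-909B-C86B9CEE0111"))
print(format_uuid(bytes.fromhex("58D5E212165B4CA0909BC86B9CEE0111")))
```

The regex accepts both cases on input while `format_uuid` emits lower case, mirroring the "lower case on output, case insensitive on input" rule in the quoted text.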

Solution 3 - Format

128 bits

The "8-4-4-4-12" format is just for reading by humans. The UUID is really a 128-bit number.

Keep in mind that the string format takes more than twice as many bytes as the 128-bit number when stored or held in memory: 36 characters versus 16 bytes. I would suggest using the number internally and converting to the string format only when it needs to be shown in a UI or exported to a file.
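
To make the size difference concrete, here is a small Python sketch using the standard `uuid` module. The variable names are illustrative only, and how the 16 bytes are actually stored depends on your database or file format.

```python
import uuid

u = uuid.UUID("58D5E212-165B-4CA0-909B-C86B9CEE0111")

binary_form = u.bytes   # compact 16-byte form for storage or in-memory use
text_form = str(u)      # 8-4-4-4-12 text form for display or export

# Converting back from the binary form recovers the same UUID.
restored = uuid.UUID(bytes=binary_form)

print(len(binary_form), len(text_form))
print(restored == u)
```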

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author        Original Content on Stack Overflow
Question              Fidel                  View Question on Stack Overflow
Solution 1 - Format   Matten                 View Answer on Stack Overflow
Solution 2 - Format   Paul-Joseph de Werk    View Answer on Stack Overflow
Solution 3 - Format   Pablo Pazos            View Answer on Stack Overflow