How should I store GUID in MySQL tables?

MysqlGuidUuid

Mysql Problem Overview


Do I use varchar(36) or are there any better ways to do it?

Mysql Solutions


Solution 1 - Mysql

My DBA asked me when I asked about the best way to store GUIDs for my objects why I needed to store 16 bytes when I could do the same thing in 4 bytes with an Integer. Since he put that challenge out there to me I thought now was a good time to mention it. That being said...

You can store a guid as a CHAR(16) binary if you want to make the most optimal use of storage space.

Solution 2 - Mysql

I would store it as a char(36).

Solution 3 - Mysql

Adding to the answer by ThaBadDawg, use these handy functions (thanks to a wiser collegue of mine) to get from 36 length string back to a byte array of 16.

DELIMITER $$

CREATE FUNCTION `GuidToBinary`(
    $Data VARCHAR(36)
) RETURNS binary(16)
DETERMINISTIC
NO SQL
BEGIN
    DECLARE $Result BINARY(16) DEFAULT NULL;
    IF $Data IS NOT NULL THEN
        SET $Data = REPLACE($Data,'-','');
        SET $Result =
            CONCAT( UNHEX(SUBSTRING($Data,7,2)), UNHEX(SUBSTRING($Data,5,2)),
                    UNHEX(SUBSTRING($Data,3,2)), UNHEX(SUBSTRING($Data,1,2)),
                    UNHEX(SUBSTRING($Data,11,2)),UNHEX(SUBSTRING($Data,9,2)),
                    UNHEX(SUBSTRING($Data,15,2)),UNHEX(SUBSTRING($Data,13,2)),
                    UNHEX(SUBSTRING($Data,17,16)));
    END IF;
    RETURN $Result;
END

$$

CREATE FUNCTION `ToGuid`(
    $Data BINARY(16)
) RETURNS char(36) CHARSET utf8
DETERMINISTIC
NO SQL
BEGIN
    DECLARE $Result CHAR(36) DEFAULT NULL;
    IF $Data IS NOT NULL THEN
        SET $Result =
            CONCAT(
                HEX(SUBSTRING($Data,4,1)), HEX(SUBSTRING($Data,3,1)),
                HEX(SUBSTRING($Data,2,1)), HEX(SUBSTRING($Data,1,1)), '-', 
                HEX(SUBSTRING($Data,6,1)), HEX(SUBSTRING($Data,5,1)), '-',
                HEX(SUBSTRING($Data,8,1)), HEX(SUBSTRING($Data,7,1)), '-',
                HEX(SUBSTRING($Data,9,2)), '-', HEX(SUBSTRING($Data,11,6)));
    END IF;
    RETURN $Result;
END
$$

CHAR(16) is actually a BINARY(16), choose your preferred flavour

To follow the code better, take the example given the digit-ordered GUID below. (Illegal characters are used for illustrative purposes - each place a unique character.) The functions will transform the byte ordering to achieve a bit order for superior index clustering. The reordered guid is shown below the example.

12345678-9ABC-DEFG-HIJK-LMNOPQRSTUVW
78563412-BC9A-FGDE-HIJK-LMNOPQRSTUVW

Dashes removed:

123456789ABCDEFGHIJKLMNOPQRSTUVW
78563412BC9AFGDEHIJKLMNOPQRSTUVW

Solution 4 - Mysql

char(36) would be a good choice. Also MySQL's UUID() function can be used which returns a 36-character text format (hex with hyphens) which can be used for retrievals of such IDs from the db.

Solution 5 - Mysql

"Better" depends on what you're optimizing for.

How much do you care about storage size/performance vs. ease of development? More importantly - are you generating enough GUIDs, or fetching them frequently enough, that it matters?

If the answer is "no", char(36) is more than good enough, and it makes storing/fetching GUIDs dead-simple. Otherwise, binary(16) is reasonable, but you'll have to lean on MySQL and/or your programming language of choice to convert back and forth from the usual string representation.

Solution 6 - Mysql

Binary(16) would be fine, better than use of varchar(32).

Solution 7 - Mysql

The GuidToBinary routine posted by KCD should be tweaked to account for the bit layout of the timestamp in the GUID string. If the string represents a version 1 UUID, like those returned by the uuid() mysql routine, then the time components are embedded in letters 1-G, excluding the D.

12345678-9ABC-DEFG-HIJK-LMNOPQRSTUVW
12345678 = least significant 4 bytes of the timestamp in big endian order
9ABC     = middle 2 timestamp bytes in big endian
D        = 1 to signify a version 1 UUID
EFG      = most significant 12 bits of the timestamp in big endian

When you convert to binary, the best order for indexing would be: EFG9ABC12345678D + the rest.

You don't want to swap 12345678 to 78563412 because big endian already yields the best binary index byte order. However, you do want the most significant bytes moved in front of the lower bytes. Hence, EFG go first, followed by the middle bits and lower bits. Generate a dozen or so UUIDs with uuid() over the course of a minute and you should see how this order yields the correct rank.

select uuid(), 0
union 
select uuid(), sleep(.001)
union 
select uuid(), sleep(.010)
union 
select uuid(), sleep(.100)
union 
select uuid(), sleep(1)
union 
select uuid(), sleep(10)
union
select uuid(), 0;

/* output */
6eec5eb6-9755-11e4-b981-feb7b39d48d6
6eec5f10-9755-11e4-b981-feb7b39d48d6
6eec8ddc-9755-11e4-b981-feb7b39d48d6
6eee30d0-9755-11e4-b981-feb7b39d48d6
6efda038-9755-11e4-b981-feb7b39d48d6
6f9641bf-9755-11e4-b981-feb7b39d48d6
758c3e3e-9755-11e4-b981-feb7b39d48d6 

The first two UUIDs were generated closest in time. They only vary in the last 3 nibbles of the first block. These are the least significant bits of the timestamp, which means we want to push them to the right when we convert this to an indexable byte array. As a counter example, the last ID is the most current, but the KCD's swapping algorithm would put it before the 3rd ID (3e before dc, last bytes from the first block).

The correct order for indexing would be:

1e497556eec5eb6... 
1e497556eec5f10... 
1e497556eec8ddc... 
1e497556eee30d0... 
1e497556efda038... 
1e497556f9641bf... 
1e49755758c3e3e... 

See this article for supporting information: http://mysql.rjweb.org/doc.php/uuid

*** note that I don't split the version nibble from the high 12 bits of the timestamp. This is the D nibble from your example. I just throw it in front. So my binary sequence ends up being DEFG9ABC and so on. This implies that all my indexed UUIDs start with the same nibble. The article does the same thing.

Solution 8 - Mysql

For those just stumbling across this, there is now a much better alternative as per research by Percona.

It consists of reorganising the UUID chunks for optimal indexing, then converting into binary for reduced storage.

Read the full article here

Solution 9 - Mysql

I would suggest using the functions below since the ones mentioned by @bigh_29 transforms my guids into new ones (for reasons I don't understand). Also, these are a little bit faster in the tests I did on my tables. https://gist.github.com/damienb/159151

DELIMITER |

CREATE FUNCTION uuid_from_bin(b BINARY(16))
RETURNS CHAR(36) DETERMINISTIC
BEGIN
  DECLARE hex CHAR(32);
  SET hex = HEX(b);
  RETURN LOWER(CONCAT(LEFT(hex, 8), '-', MID(hex, 9,4), '-', MID(hex, 13,4), '-', MID(hex, 17,4), '-', RIGHT(hex, 12)));
END
|

CREATE FUNCTION uuid_to_bin(s CHAR(36))
RETURNS BINARY(16) DETERMINISTIC
RETURN UNHEX(CONCAT(LEFT(s, 8), MID(s, 10, 4), MID(s, 15, 4), MID(s, 20, 4), RIGHT(s, 12)))
|

DELIMITER ;

Solution 10 - Mysql

if you have a char/varchar value formatted as the standard GUID, you can simply store it as BINARY(16) using the simple CAST(MyString AS BINARY16), without all those mind-boggling sequences of CONCAT + SUBSTR.

BINARY(16) fields are compared/sorted/indexed much faster than strings, and also take two times less space in the database

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionCDRView Question on Stackoverflow
Solution 1 - MysqlthaBadDawgView Answer on Stackoverflow
Solution 2 - MysqlBrian FisherView Answer on Stackoverflow
Solution 3 - MysqlKCDView Answer on Stackoverflow
Solution 4 - MysqlLearningView Answer on Stackoverflow
Solution 5 - MysqlcanduView Answer on Stackoverflow
Solution 6 - MysqlOnkar JanwaView Answer on Stackoverflow
Solution 7 - Mysqlbigh_29View Answer on Stackoverflow
Solution 8 - MysqlsleepycalView Answer on Stackoverflow
Solution 9 - MysqlMisbitView Answer on Stackoverflow
Solution 10 - MysqlGeorge HazanView Answer on Stackoverflow