52

UUIDs in MySQL are really not random

 5 years ago
source link: https://www.tuicool.com/articles/hit/eEv2ui2
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Universally Unique Identifiers (UUIDs) are great. I love how you can tell the progress of a batch job just by looking at the current UUID. If it starts with 0... , the task is less than 1/16th done. If it starts with 7d.. , we're almost halfway there. At ff... we are nearing the end. The fact that you can tell this rests on two principles: 1) you sort your jobs by their uuid and 2) UUIDs are random, as in, distributed uniformly .

However, last week, I noticed a strange thing: a clearly visible pattern in the uuid column of a database table. It should be impossible, but there it was. It looked like this:

> SELECT uuid FROM example ORDER BY id;

4f95de28-0fd1-48db-ad2e-34ecd169c483

4331cb9e-1d91-11e9-be2c-45923c63e8a2

4331cc4c-1d91-11e9-be2c-45923c63e8a2

4331ccec-1d91-11e9-be2c-45923c63e8a2

4331cd7e-1d91-11e9-be2c-45923c63e8a2

c7e2f124-f6ba-4434-843f-89958a7436ec

4331ce10-1d91-11e9-be2c-45923c63e8a2

4331ce9e-1d91-11e9-be2c-45923c63e8a2

4331cf28-1d91-11e9-be2c-45923c63e8a2

4331cfaf-1d91-11e9-be2c-45923c63e8a2

4331d017-1d91-11e9-be2c-45923c63e8a2

4331d0c6-1d91-11e9-be2c-45923c63e8a2

4331d139-1d91-11e9-be2c-45923c63e8a2

4331d1a7-1d91-11e9-be2c-45923c63e8a2

4331d20e-1d91-11e9-be2c-45923c63e8a2

4331d271-1d91-11e9-be2c-45923c63e8a2

4331d2d7-1d91-11e9-be2c-45923c63e8a2

3e18f8dd-b1d3-4e16-8a81-4bdceac91772

4331d33a-1d91-11e9-be2c-45923c63e8a2

f6b8658d-846b-4418-a79c-713db516203e

4331d3a8-1d91-11e9-be2c-45923c63e8a2

4331d40d-1d91-11e9-be2c-45923c63e8a2

4331d48e-1d91-11e9-be2c-45923c63e8a2

4331d4fe-1d91-11e9-be2c-45923c63e8a2

4331d567-1d91-11e9-be2c-45923c63e8a2

4331d5cb-1d91-11e9-be2c-45923c63e8a2

4331d630-1d91-11e9-be2c-45923c63e8a2

4331d691-1d91-11e9-be2c-45923c63e8a2

14a5cb4c-f336-4240-aff3-4ffcfa8d135f

4331d6f9-1d91-11e9-be2c-45923c63e8a2

4331d762-1d91-11e9-be2c-45923c63e8a2

4331d7c8-1d91-11e9-be2c-45923c63e8a2

4331d83d-1d91-11e9-be2c-45923c63e8a2

Hm, that is very weird. Did we maybe convert an old auto-incrementing integer column into UUIDs in a very stupid way? Did we maybe use UUID version 3 or 5? Did our library corrupt the first bits of the binary representation of the UUID? After a while, I remembered that we initialized this column like so:

UPDATE example SET uuid = UUID() WHERE uuid IS NULL;

I also remembered reading this in the MySQL documentation :

"Warning: Although UUID() values are intended to be unique, they are not necessarily unguessable or unpredictable. If unpredictability is required, UUID values should be generated some other way."

If you are like me, you won't use UUID() for secrets after reading this (and I didn't!). If you are even more like me, you will think that this is like the difference between /dev/urandom  and /dev/random and that the uniform distribution law still applies here. However, to my great surprise, the UUIDs generated by UUID() are not uniform at all! The result is that a significant portion of UUIDs in our database are not uniform:

> SELECT COUNT(*), LEFT(uuid, 1) FROM example GROUP BY LEFT(uuid, 1);

+----------+---------------+

| count(*) | LEFT(guid, 1) |

+----------+---------------+

|     1943     | 0 

     |

| 1871      | 1           |

| 1913      | 2           |

| 1843      | 3           |

| 3050      | 4           |

| 1943      | 5           |

| 1889      | 6           |

| 1866      | 7           |

| 1865      | 8           |

| 1903      | 9           |

| 1868      | a           |

| 1898      | b           |

| 1854      | c           |

| 1897      | d           |

| 1941      | e           |

| 1836      | f              |

+----------+---------------+

So, the lesson for today is: take warnings in documentation seriously. If you used  UUID() for data that is supposed to be secret (like passwords), you have a serious problem, as these secrets can now be easily guessed.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK