Is a GUID unique 100% of the time?

Language Agnostic, Guid

Language Agnostic Problem Overview


Is a GUID unique 100% of the time?

Will it stay unique over multiple threads?

Language Agnostic Solutions


Solution 1 - Language Agnostic

> While each generated GUID is not guaranteed to be unique, the total number of unique keys (2^128 or 3.4×10^38) is so large that the probability of the same number being generated twice is very small. For example, consider the observable universe, which contains about 5×10^22 stars; every star could then have 6.8×10^15 universally unique GUIDs.

From Wikipedia.
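The arithmetic in the quote is easy to check. This is a minimal Python sketch that assumes the quoted estimate of 5×10^22 stars; the variable names are my own.

```python
# Check the figures quoted above: total 128-bit values and GUIDs per star.
TOTAL_GUIDS = 2 ** 128   # every possible 128-bit value
STARS = 5 * 10 ** 22     # star count taken from the quote (an estimate)

guids_per_star = TOTAL_GUIDS // STARS

print(f"total GUIDs:    {TOTAL_GUIDS:.1e}")     # 3.4e+38
print(f"GUIDs per star: {guids_per_star:.1e}")  # 6.8e+15
```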


These are some good articles on how a GUID is made (for .NET) and how you could get the same GUID under the right circumstances.

https://ericlippert.com/2012/04/24/guid-guide-part-one/

https://ericlippert.com/2012/04/30/guid-guide-part-two/

https://ericlippert.com/2012/05/07/guid-guide-part-three/

Solution 2 - Language Agnostic

If you are worried about getting the same GUID value twice, concatenate two of them.

Guid.NewGuid().ToString() + Guid.NewGuid().ToString();

If you are still paranoid, concatenate three.

Solution 3 - Language Agnostic

The simple answer is yes.

Raymond Chen wrote a great article on GUIDs and why substrings of GUIDs are not guaranteed to be unique. The article goes into some depth about how GUIDs are generated and the data they use to ensure uniqueness, which should go some way toward explaining why they are unique :-)

Solution 4 - Language Agnostic

As a side note, I was playing around with Volume GUIDs in Windows XP. This is a very obscure partition layout with three disks and fourteen volumes.

\\?\Volume{23005604-eb1b-11de-85ba-806d6172696f}\ (F:)
\\?\Volume{23005605-eb1b-11de-85ba-806d6172696f}\ (G:)
\\?\Volume{23005606-eb1b-11de-85ba-806d6172696f}\ (H:)
\\?\Volume{23005607-eb1b-11de-85ba-806d6172696f}\ (J:)
\\?\Volume{23005608-eb1b-11de-85ba-806d6172696f}\ (D:)
\\?\Volume{23005609-eb1b-11de-85ba-806d6172696f}\ (P:)
\\?\Volume{2300560b-eb1b-11de-85ba-806d6172696f}\ (K:)
\\?\Volume{2300560c-eb1b-11de-85ba-806d6172696f}\ (L:)
\\?\Volume{2300560d-eb1b-11de-85ba-806d6172696f}\ (M:)
\\?\Volume{2300560e-eb1b-11de-85ba-806d6172696f}\ (N:)
\\?\Volume{2300560f-eb1b-11de-85ba-806d6172696f}\ (O:)
\\?\Volume{23005610-eb1b-11de-85ba-806d6172696f}\ (E:)
\\?\Volume{23005611-eb1b-11de-85ba-806d6172696f}\ (R:)
                                     | | | | |
                                     | | | | +-- 6f = o
                                     | | | +---- 69 = i
                                     | | +------ 72 = r
                                     | +-------- 61 = a
                                     +---------- 6d = m

It's not that the GUIDs are very similar but the fact that all GUIDs have the string "mario" in them. Is that a coincidence or is there an explanation behind this?

Now, when googling for part 4 of the GUID, I found approximately 125,000 hits with volume GUIDs.

Conclusion: When it comes to Volume GUIDs they aren't as unique as other GUIDs.
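One way to look at these values is to parse them with Python's standard `uuid` module. The sketch below takes one volume GUID from the listing above and prints its version and node field. The comments interpret the fields according to the general UUID layout, not anything specific to Windows.

```python
import uuid

# One of the volume GUIDs listed above.
volume = uuid.UUID("23005604-eb1b-11de-85ba-806d6172696f")

# The first digit of the third group ("11de") is the version number.
# Version 1 UUIDs are built from a timestamp plus a 48-bit node field,
# so they are not random.
print(volume.version)                  # 1

# The last group is the node field. It is identical for every volume above,
# which is why they all end in the same bytes.
node_bytes = volume.node.to_bytes(6, "big")
print(node_bytes.hex())                # 806d6172696f
print(node_bytes[1:].decode("ascii"))  # mario
```

Only the first group changes between the volumes, which suggests the values were allocated one after another rather than drawn at random.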

Solution 5 - Language Agnostic

It should not happen. However, when .NET is under heavy load, it is possible to get duplicate GUIDs. I have two different web servers using two different SQL servers. When I went to merge the data, I found I had 15 million GUIDs and 7 duplicates.

Solution 6 - Language Agnostic

Yes, a GUID should always be unique. It is based on both hardware and time, plus a few extra bits to make sure it's unique. I'm sure it's theoretically possible to end up with two identical ones, but extremely unlikely in a real-world scenario.

Here's a great article by Raymond Chen on Guids:

https://blogs.msdn.com/oldnewthing/archive/2008/06/27/8659071.aspx

Solution 7 - Language Agnostic

Guids are statistically unique. The odds of two different clients generating the same Guid are infinitesimally small (assuming no bugs in the Guid generating code). You may as well worry about your processor glitching due to a cosmic ray and deciding that 2+2=5 today.

Multiple threads allocating new Guids will get unique values, but you should check that the function you are calling is thread safe. Which environment is this in?
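A quick smoke test for the multi-threaded case could look like the Python sketch below. It assumes `uuid.uuid4()` from the standard library as the generator; the thread and batch sizes are arbitrary. A clean run shows no duplicates in this sample but does not prove thread safety in general.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

def make_ids(count):
    """Generate `count` random (version 4) UUIDs in the calling thread."""
    return [uuid.uuid4() for _ in range(count)]

THREADS = 8          # arbitrary values chosen for the sketch
PER_THREAD = 10_000

# Each worker thread generates its own batch concurrently.
with ThreadPoolExecutor(max_workers=THREADS) as pool:
    batches = list(pool.map(make_ids, [PER_THREAD] * THREADS))

all_ids = [u for batch in batches for u in batch]
duplicates = len(all_ids) - len(set(all_ids))
print(f"{len(all_ids)} generated, {duplicates} duplicates")
```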

Solution 8 - Language Agnostic

Eric Lippert has written a very interesting series of articles about GUIDs.

> There are on the order of 2^30 personal computers in the world (and of course lots of hand-held devices or non-PC computing devices that have more or less the same levels of computing power, but lets ignore those). Let's assume that we put all those PCs in the world to the task of generating GUIDs; if each one can generate, say, 2^20 GUIDs per second then after only about 2^72 seconds -- one hundred and fifty trillion years -- you'll have a very high chance of generating a collision with your specific GUID. And the odds of collision get pretty good after only thirty trillion years.
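The numbers in the quote can be checked with a few lines of Python. The sketch uses only the figures from the quote (2^30 PCs, 2^20 GUIDs per second, 2^72 seconds) plus a 365.25-day year, which is my own assumption.

```python
pcs = 2 ** 30        # personal computers, per the quote
rate = 2 ** 20       # GUIDs per second per PC, per the quote
seconds = 2 ** 72    # duration given in the quote
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years = seconds / SECONDS_PER_YEAR
generated = pcs * rate * seconds

print(f"{years:.2e} years")      # roughly 1.5e14, i.e. 150 trillion

# The total generated equals 2^122, the number of distinct random
# (version 4) GUIDs, which have 122 random bits.
print(generated == 2 ** 122)     # True
```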

Solution 9 - Language Agnostic

Theoretically, no, they are not unique. It's possible to generate an identical guid over and over. However, the chances of it happening are so low that you can assume they are unique.

I've read before that the chances are so low that you really should stress about something else--like your server spontaneously combusting or other bugs in your code. That is, assume it's unique and don't build in any code to "catch" duplicates--spend your time on something more likely to happen (i.e. anything else).

I made an attempt to describe the usefulness of GUIDs to my blog audience (non-technical family members). From there (via Wikipedia), the odds of generating a duplicate GUID:

  • 1 in 2^128
  • 1 in 340 undecillion (don’t worry, undecillion is not on the quiz)
  • 1 in 3.4 × 10^38
  • 1 in 340,000,000,000,000,000,000,000,000,000,000,000,000

Solution 10 - Language Agnostic

No one seems to mention the actual math of the probability of it occurring.

First, let's assume we can use the entire 128-bit space (a v4 Guid only uses 122 bits).

We know that the general probability of NOT getting a duplicate in n picks is:

>(1 - 1/2^128)(1 - 2/2^128)...(1 - (n-1)/2^128)

Because 2^128 is much, much larger than n, we can approximate this as:

>(1 - 1/2^128)^(n(n-1)/2)

And because we can assume n is much, much larger than 1, we can approximate that as:

>(1 - 1/2^128)^(n^2/2)

Now we can equate this to a chosen threshold. Let's say the probability of NOT getting a duplicate has dropped to 1% (that is, a duplicate is 99% likely):

>(1 - 1/2^128)^(n^2/2) = 0.01

Which we solve for n and get:

>n = sqrt(2 * log 0.01 / log(1 - 1/2^128))

Which Wolfram Alpha gets to be 5.598318 × 10^19

To put that number into perspective, let's take 10000 machines, each having a 4-core CPU running at 4 GHz and spending 10000 cycles to generate a Guid and doing nothing else. It would then take ~111 years before a duplicate becomes 99% likely.
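The same calculation can be reproduced without Wolfram Alpha. The Python sketch below solves the equation above with 0.01 on the right-hand side and then applies the machine figures from the example. The use of `math.log1p` and a 365.25-day year are my own choices, not part of the answer.

```python
import math

SPACE = 2.0 ** 128        # the answer assumes the full 128-bit space
P_NO_DUPLICATE = 0.01     # right-hand side of the equation above

# math.log1p(-x) computes log(1 - x) accurately for tiny x;
# math.log(1 - 1 / SPACE) would round 1 - x to exactly 1.0 and give 0.
n = math.sqrt(2 * math.log(P_NO_DUPLICATE) / math.log1p(-1 / SPACE))

# Throughput from the example:
# 10000 machines x 4 cores x 4 GHz / 10000 cycles per Guid.
guids_per_second = 10_000 * 4 * 4e9 / 10_000
years = n / guids_per_second / (365.25 * 24 * 3600)

print(f"n     = {n:.6e}")      # about 5.598e19
print(f"years = {years:.0f}")  # about 111
```

Because 0.01 is set as the probability of having no duplicate, n is the number of picks at which a duplicate is about 99% likely.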

Solution 11 - Language Agnostic

From http://www.guidgenerator.com/online-guid-generator.aspx

> What is a GUID?
>
> GUID (or UUID) is an acronym for 'Globally Unique Identifier' (or 'Universally Unique Identifier'). It is a 128-bit integer number used to identify resources. The term GUID is generally used by developers working with Microsoft technologies, while UUID is used everywhere else.
>
> How unique is a GUID?
>
> 128-bits is big enough and the generation algorithm is unique enough that if 1,000,000,000 GUIDs per second were generated for 1 year the probability of a duplicate would be only 50%. Or if every human on Earth generated 600,000,000 GUIDs there would only be a 50% probability of a duplicate.

Solution 12 - Language Agnostic

> Is a GUID unique 100% of the time?

Not guaranteed, since there are several ways of generating one. However, you can try to calculate the chance of creating two identical GUIDs and you get the idea: a GUID has 128 bits, hence there are 2^128 distinct GUIDs, far more than there are stars in the known universe. Read the Wikipedia article for more details.

Solution 13 - Language Agnostic

MSDN:

> There is a very low probability that the value of the new Guid is all zeroes or equal to any other Guid.

Solution 14 - Language Agnostic

If your system clock is set properly and hasn't wrapped around, and if your NIC has its own MAC (i.e. you haven't set a custom MAC) and your NIC vendor has not been recycling MACs (which they are not supposed to do but which has been known to occur), and if your system's GUID generation function is properly implemented, then your system will never generate duplicate GUIDs.

If everyone on earth who is generating GUIDs follows those rules then your GUIDs will be globally unique.

In practice, the number of people who break the rules is low, and their GUIDs are unlikely to "escape". Conflicts are statistically improbable.
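The clock and MAC ingredients described above are visible in a time-based (version 1) UUID. The Python sketch below uses the standard `uuid.uuid1()` function to show them. Note that Python may substitute a random 48-bit number for the node if it cannot find a hardware address, so the node printed here is not guaranteed to be a real MAC.

```python
import uuid

first = uuid.uuid1()
second = uuid.uuid1()

print(first.version)          # 1
print(f"{first.node:012x}")   # 48-bit node field, normally derived from a MAC address
print(first.time)             # 60-bit timestamp in 100-nanosecond ticks

# Two UUIDs generated back to back on the same machine still differ,
# because the timestamp (or clock sequence) changes between calls.
print(first != second)        # True
```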

Solution 15 - Language Agnostic

For a better result, append a timestamp to the GUID (just to make sure that it stays unique):

Guid.NewGuid().ToString() + DateTime.Now.ToString();

Solution 16 - Language Agnostic

I experienced a duplicate GUID.

I use the Neat Receipts desktop scanner and it comes with proprietary database software. The software has a sync to cloud feature, and I kept getting an error upon syncing. A gander at the logs revealed the awesome line:

> "errors":[{"code":1,"message":"creator_guid: is already taken","guid":"C83E5734-D77A-4B09-B8C1-9623CAC7B167"}]}

I was a bit in disbelief, but sure enough, when I found a way into my local neatworks database and deleted the record containing that GUID, the error stopped occurring.

So to answer your question with anecdotal evidence, no. A duplicate is possible. But it is likely that the reason it happened wasn't due to chance, but due to standard practice not being adhered to in some way. (I am just not that lucky) However, I cannot say for sure. It isn't my software.

Their customer support was EXTREMELY courteous and helpful, but they must have never encountered this issue before because after 3+ hours on the phone with them, they didn't find the solution. (FWIW, I am very impressed by Neat, and this glitch, however frustrating, didn't change my opinion of their product.)

Solution 17 - Language Agnostic

I have experienced GUIDs not being unique during multi-threaded/multi-process unit testing (too?). I guess that has to do with, all other things being equal, the identical seeding (or lack of seeding) of pseudo-random generators. I was using them to generate unique file names. I found the OS is much better at doing that :)
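The seeding effect is easy to reproduce. The Python sketch below builds GUID-shaped values from an explicitly seeded pseudo-random generator; `guid_from_prng` and the seed value are hypothetical, used only for illustration. Two generators that start from the same seed produce the same identifiers.

```python
import random
import uuid

def guid_from_prng(rng):
    """Build a version 4 style UUID from a caller-supplied PRNG (illustrative only)."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)

# Two workers that happen to start from the same seed...
worker_a = random.Random(1234)
worker_b = random.Random(1234)

# ...produce exactly the same sequence of identifiers.
ids_a = [guid_from_prng(worker_a) for _ in range(3)]
ids_b = [guid_from_prng(worker_b) for _ in range(3)]
print(ids_a == ids_b)   # True
```

Generators that draw from the operating system's random source, as Python's own `uuid.uuid4()` does, avoid this particular failure.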

Trolling alert

You ask if GUIDs are 100% unique. That depends on the number of GUIDs it must be unique among. As the number of GUIDs approaches infinity, the probability of duplicate GUIDs approaches 100%.

Solution 18 - Language Agnostic

In a more general sense, this is known as the "birthday problem" or "birthday paradox". Wikipedia has a pretty good overview at: Wikipedia - Birthday Problem

In very rough terms, the square root of the size of the pool is a rough approximation of when you can expect a 50% chance of a duplicate. The article includes a probability table of pool size and various probabilities, including a row for 2^128. So for a 1% probability of collision you would expect to randomly pick 2.6×10^18 128-bit numbers. A 50% chance requires 2.2×10^19 picks, while SQRT(2^128) is 1.8×10^19.
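These table values follow from the usual birthday approximation, n ≈ sqrt(2 · N · ln(1 / (1 - p))), where N is the pool size and p the target collision probability. The Python sketch below applies it; the function name is my own.

```python
import math

def picks_for_collision(pool_size, probability):
    """Approximate number of random picks needed to reach the given
    probability of at least one collision (birthday approximation)."""
    return math.sqrt(2 * pool_size * math.log(1 / (1 - probability)))

POOL = 2 ** 128
print(f"{picks_for_collision(POOL, 0.01):.1e}")   # 2.6e+18
print(f"{picks_for_collision(POOL, 0.50):.1e}")   # 2.2e+19
print(f"{math.sqrt(POOL):.1e}")                   # 1.8e+19
```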

Of course, that is just the ideal case of a truly random process. As others mentioned, a lot is riding on that random aspect: just how good are the generator and seed? It would be nice if there were some hardware support to assist with this process, which would be more bullet-proof, except that anything can be spoofed or virtualized. I suspect that might be the reason why MAC addresses/time-stamps are no longer incorporated.

Solution 19 - Language Agnostic

The answer to "Is a GUID 100% unique?" is simply "No".

  • If you want 100% uniqueness of a GUID, do the following:
  1. Generate a GUID.
  2. Check whether that GUID already exists in the table column where you need uniqueness.
  3. If it exists, go to step 1; otherwise go to step 4.
  4. Use this GUID as unique.
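The steps above can be sketched in Python as follows. The set `used` stands in for the table column, and `new_unique_guid` is a hypothetical helper name.

```python
import uuid

def new_unique_guid(existing):
    """Return a GUID that is not already in `existing`, and record it there.

    `existing` stands in for the table column; it is an in-memory set here.
    """
    while True:
        candidate = uuid.uuid4()        # step 1: generate
        if candidate not in existing:   # steps 2-3: check, retry on a match
            existing.add(candidate)     # step 4: accept and remember it
            return candidate

used = set()
a = new_unique_guid(used)
b = new_unique_guid(used)
print(a != b, len(used))   # True 2
```

With a real database, the check and the insert would need to happen atomically (for example through a unique constraint), otherwise two writers could both pass the check with the same value.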

Solution 20 - Language Agnostic

GUID algorithms are usually implemented according to the v4 GUID specification, which is essentially a pseudo-random string. Sadly, these fall into the category of "likely non-unique", from Wikipedia (I don't know why so many people ignore this bit): "... other GUID versions have different uniqueness properties and probabilities, ranging from guaranteed uniqueness to likely non-uniqueness."

The pseudo-random properties of V8's JavaScript Math.random() are TERRIBLE at uniqueness, with collisions often coming after only a few thousand iterations, but V8 isn't the only culprit. I've seen real-world GUID collisions using both PHP and Ruby implementations of v4 GUIDs.

Because it's becoming more and more common to scale ID generation across multiple clients, and clusters of servers, entropy takes a big hit -- the chances of the same random seed being used to generate an ID escalate (time is often used as a random seed in pseudo-random generators), and GUID collisions escalate from "likely non-unique" to "very likely to cause lots of trouble".

To solve this problem, I set out to create an ID algorithm that could scale safely, and make better guarantees against collision. It does so by using the timestamp, an in-memory client counter, client fingerprint, and random characters. The combination of factors creates an additive complexity that is particularly resistant to collision, even if you scale it across a number of hosts:

http://usecuid.org/
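To illustrate the idea of combining those four ingredients, here is a hypothetical Python sketch. It is not the cuid algorithm; the field widths, separator, and fingerprint calculation are assumptions made up for this example.

```python
import itertools
import os
import secrets
import socket
import time

_counter = itertools.count()   # in-memory counter, one per process

def sketch_id():
    """Hypothetical ID built from a timestamp, counter, fingerprint, and random part.

    This is NOT the cuid algorithm, only an illustration of the idea.
    """
    timestamp = format(int(time.time() * 1000), "x")    # milliseconds, hex
    count = format(next(_counter) % 16 ** 4, "04x")     # wraps at 16^4
    host_sum = sum(socket.gethostname().encode())       # crude host fingerprint
    fingerprint = format((os.getpid() + host_sum) % 16 ** 4, "04x")
    rand = secrets.token_hex(4)                         # 8 random hex characters
    return f"{timestamp}-{count}-{fingerprint}-{rand}"

ids = [sketch_id() for _ in range(1000)]
print(ids[0])
print(len(set(ids)) == len(ids))   # True: the counter differs for every ID here
```

The counter keeps IDs distinct within one process even when the clock and random part repeat. The fingerprint and random part are there to separate hosts and processes.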

Solution 21 - Language Agnostic

The hardest part is not about generating a duplicated Guid.

The hardest part is designing a database to store all of the generated ones so you can check whether one is actually duplicated.

From Wikipedia:

For example, the number of random version 4 UUIDs which need to be generated in order to have a 50% probability of at least one collision is 2.71 quintillion, computed as follows:

n ≈ 1/2 + sqrt(1/4 + 2 × ln(2) × 2^122) ≈ 2.71 × 10^18

This number is equivalent to generating 1 billion UUIDs per second for about 85 years, and a file containing this many UUIDs, at 16 bytes per UUID, would be about 45 exabytes, many times larger than the largest databases currently in existence, which are on the order of hundreds of petabytes
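The quoted figures can be reproduced with the standard birthday bound for a 50% collision chance over the 122 random bits of a version 4 UUID. This Python sketch assumes a 365.25-day year and decimal exabytes (10^18 bytes).

```python
import math

RANDOM_BITS = 122   # a version 4 UUID has 122 random bits

# Birthday bound for a 50% chance of at least one collision.
n = 0.5 + math.sqrt(0.25 + 2 * math.log(2) * 2 ** RANDOM_BITS)

years = n / 1e9 / (365.25 * 24 * 3600)   # at one billion UUIDs per second
exabytes = n * 16 / 1e18                 # 16 bytes per UUID

print(f"n        = {n:.2e}")             # 2.71e+18
print(f"years    = {years:.0f}")
print(f"exabytes = {exabytes:.0f}")
```

The results land close to the rounded figures in the quote (about 85 years and about 45 exabytes).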

Solution 22 - Language Agnostic

GUID stands for Globally Unique Identifier

In Brief: (the clue is in the name)

In Detail: GUIDs are designed to be unique. They are calculated using a random method based on the computer's clock and the computer itself. If you create many GUIDs in the same millisecond on the same machine, it is possible that they may match, but for almost all normal operations they should be considered unique.

Solution 23 - Language Agnostic

Enough GUIDs to assign one to each and every hypothetical grain of sand on every hypothetical planet around each and every star in the visible universe.

Enough so that if every computer in the world generates 1000 GUIDs a second for 200 years, there might (MIGHT) be a collision.

Given the number of current local uses for GUIDs (one sequence per table per database for instance) it is extraordinarily unlikely to ever be a problem for us limited creatures (and machines with lifetimes that are usually less than a decade if not a year or two for mobile phones).

... Can we close this thread now?

Solution 24 - Language Agnostic

I think that when people bury their thoughts and fears in statistics, they tend to forget the obvious. If a system is truly random, then the result you are least likely to expect (all ones, say) is equally as likely as any other unexpected value (all zeros, say). Neither fact prevents these occurring in succession, nor within the first pair of samples (even though that would be statistically "truly shocking"). And that's the problem with measuring chance: it ignores criticality (and rotten luck) entirely.

IF it ever happened, what's the outcome? Does your software stop working? Does someone get injured? Does someone die? Does the world explode?

The more extreme the criticality, the worse the word "probability" sits in the mouth. In the end, chaining GUIDs (or XORing them, or whatever) is what you do when you regard (subjectively) your particular criticality (and your feeling of "luckiness") to be unacceptable. And if it could end the world, then please, on behalf of all of us not involved in nuclear experiments in the Large Hadron Collider, don't use GUIDs or anything else nondeterministic!

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | David Basarab | View Question on Stackoverflow |
| Solution 1 - Language Agnostic | Adam Davis | View Answer on Stackoverflow |
| Solution 2 - Language Agnostic | Bura Chuhadar | View Answer on Stackoverflow |
| Solution 3 - Language Agnostic | ljs | View Answer on Stackoverflow |
| Solution 4 - Language Agnostic | Jonas Engström | View Answer on Stackoverflow |
| Solution 5 - Language Agnostic | Tim | View Answer on Stackoverflow |
| Solution 6 - Language Agnostic | Eric Z Beard | View Answer on Stackoverflow |
| Solution 7 - Language Agnostic | Rob Walker | View Answer on Stackoverflow |
| Solution 8 - Language Agnostic | Paolo Moretti | View Answer on Stackoverflow |
| Solution 9 - Language Agnostic | Michael Haren | View Answer on Stackoverflow |
| Solution 10 - Language Agnostic | Cine | View Answer on Stackoverflow |
| Solution 11 - Language Agnostic | Tono Nam | View Answer on Stackoverflow |
| Solution 12 - Language Agnostic | Konrad Rudolph | View Answer on Stackoverflow |
| Solution 13 - Language Agnostic | Jakub Šturc | View Answer on Stackoverflow |
| Solution 14 - Language Agnostic | DrPizza | View Answer on Stackoverflow |
| Solution 15 - Language Agnostic | Adithya Sai | View Answer on Stackoverflow |
| Solution 16 - Language Agnostic | exintrovert | View Answer on Stackoverflow |
| Solution 17 - Language Agnostic | Robert Jørgensgaard Engdahl | View Answer on Stackoverflow |
| Solution 18 - Language Agnostic | mszil | View Answer on Stackoverflow |
| Solution 19 - Language Agnostic | Baba Khedkar | View Answer on Stackoverflow |
| Solution 20 - Language Agnostic | Eric Elliott | View Answer on Stackoverflow |
| Solution 21 - Language Agnostic | Trong Hiep Le | View Answer on Stackoverflow |
| Solution 22 - Language Agnostic | Benjamin Roberts | View Answer on Stackoverflow |
| Solution 23 - Language Agnostic | William M. Rawls | View Answer on Stackoverflow |
| Solution 24 - Language Agnostic | Alex T | View Answer on Stackoverflow |