Generating human-readable/usable, short but unique IDs

Tags: .Net, Database, Identity

.Net Problem Overview


  • Need to handle more than 1,000 but fewer than 10,000 new records per day

  • Cannot use GUIDs/UUIDs, auto-increment numbers, etc.

  • Ideally should be 5 or 6 characters long; alphabetic characters are fine

  • Would like to reuse existing, well-known algorithms, if available

Is there anything out there?

.Net Solutions


Solution 1 - .Net

Base 62 is used by tinyurl and bit.ly for the abbreviated URLs. It's a well-understood method for creating "unique", human-readable IDs. Of course you will have to store the created IDs and check for duplicates on creation to ensure uniqueness. (See code at bottom of answer)

Base 62 uniqueness metrics

5 chars in base 62 will give you 62^5 unique IDs = 916,132,832 (~1 billion). At 10k IDs per day the ID space will last for 91k+ days.

6 chars in base 62 will give you 62^6 unique IDs = 56,800,235,584 (56+ billion). At 10k IDs per day the ID space will last for 5+ million days.

Base 36 uniqueness metrics

6 chars will give you 36^6 unique IDs = 2,176,782,336 (2+ billion)

7 chars will give you 36^7 unique IDs = 78,364,164,096 (78+ billion)
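The figures above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration; the class and method names (`IdSpace`, `Capacity`, `DaysUntilExhausted`, `CollisionChanceForNextId`) are made up for this example and do not come from the answer.

```csharp
using System;

// Hypothetical helper for checking the size of an ID space.
public static class IdSpace
{
    // Number of distinct IDs of the given length over an alphabet of the given size.
    public static long Capacity(int alphabetSize, int length)
    {
        long n = 1;
        for (int i = 0; i < length; i++)
            n = checked(n * alphabetSize); // throws instead of silently overflowing
        return n;
    }

    // Whole days until every possible ID has been used at a constant daily rate.
    public static long DaysUntilExhausted(int alphabetSize, int length, int idsPerDay) =>
        Capacity(alphabetSize, length) / idsPerDay;

    // Chance that one new, uniformly random ID matches one of 'existing' distinct IDs.
    public static double CollisionChanceForNextId(int alphabetSize, int length, long existing) =>
        (double)existing / Capacity(alphabetSize, length);
}
```

Note that exhaustion is the best case. With randomly generated IDs, collisions show up much earlier: after one year at 10k IDs per day (3.65 million IDs), each new 5-character base 62 ID has roughly a 0.4% chance of already being taken, which is why the duplicate check on creation matters.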

Code:

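using System;
using System.Text;

public static class Program
{
	public static void Main()
	{
		// create five IDs of six base 62 characters
		for (int i = 0; i < 5; i++) Console.WriteLine(RandomIdGenerator.GetBase62(6));

		// create five IDs of eight base 36 characters
		for (int i = 0; i < 5; i++) Console.WriteLine(RandomIdGenerator.GetBase36(8));
	}
}

public static class RandomIdGenerator
{
	// The first 36 characters (0-9, A-Z) double as the base 36 alphabet.
	private static readonly char[] _base62chars =
		"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
		.ToCharArray();

	// System.Random is not thread-safe, so access is guarded by a lock.
	// It is also not cryptographically secure; fine for IDs that need not be unguessable.
	private static readonly Random _random = new Random();
	private static readonly object _lock = new object();

	public static string GetBase62(int length) => Generate(length, 62);

	public static string GetBase36(int length) => Generate(length, 36);

	// Builds an ID from 'length' random characters taken from the first 'radix' alphabet entries.
	private static string Generate(int length, int radix)
	{
		var sb = new StringBuilder(length);

		lock (_lock)
		{
			for (int i = 0; i < length; i++)
				sb.Append(_base62chars[_random.Next(radix)]);
		}

		return sb.ToString();
	}
}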

Output:

z5KyMg
wd4SUp
uSzQtH
UPrGAT
UIf2IS

QCF9GNM5
0UV3TFSS
3MG91VKP
7NTRF10T
AJK3AJU7
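The answer notes that generated IDs must be stored and checked for duplicates. A minimal sketch of that check could look like the following. `UniqueIdIssuer` is a hypothetical name, and the in-memory `HashSet` merely stands in for a database table with a unique index on the ID column; a real implementation would attempt the insert and retry on a unique-constraint violation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: retries an arbitrary ID generator until it yields an unused ID.
public sealed class UniqueIdIssuer
{
    private readonly Func<string> _generate;
    // Stand-in for persistent storage. Ordinal comparison keeps "abc" and "ABC" distinct.
    private readonly HashSet<string> _issued = new HashSet<string>(StringComparer.Ordinal);
    private readonly object _gate = new object();

    public UniqueIdIssuer(Func<string> generate)
    {
        _generate = generate ?? throw new ArgumentNullException(nameof(generate));
    }

    public string Next(int maxAttempts = 10)
    {
        lock (_gate)
        {
            for (int attempt = 0; attempt < maxAttempts; attempt++)
            {
                string candidate = _generate();
                // HashSet.Add returns false when the value is already present.
                if (_issued.Add(candidate)) return candidate;
            }
        }
        throw new InvalidOperationException(
            "No unused ID found; the ID space may be too crowded for this length.");
    }
}
```

It would be used as `new UniqueIdIssuer(() => RandomIdGenerator.GetBase62(6)).Next()`. One thing to watch when moving the check into a database: base 62 IDs are case-sensitive, so a column with a case-insensitive collation would treat IDs that differ only by case as duplicates.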

Solution 2 - .Net

I recommend http://hashids.org/ which converts any number (e.g. DB ID) into a string (using salt).

It allows decoding this string back to the number. So you don't need to store it in the database.

Libraries are available for JavaScript, Ruby, Python, Java, Scala, PHP, Perl, Swift, Clojure, Objective-C, C, C++11, Go, Erlang, Lua, Elixir, ColdFusion, Groovy, Kotlin, Nim, VBA, CoffeeScript, Node.js and .NET.
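To illustrate the idea of a reversible number-to-string mapping, here is a plain base 62 encoder and decoder. This is not the Hashids algorithm: there is no salt and no alphabet shuffling, so consecutive numbers produce visibly consecutive strings. `Base62Codec` is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Text;

// Hypothetical sketch: reversible base 62 encoding of a non-negative number (e.g. a DB ID).
public static class Base62Codec
{
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    public static string Encode(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        if (value == 0) return "0";

        var sb = new StringBuilder();
        while (value > 0)
        {
            // Prepend the least significant digit so the result reads most significant first.
            sb.Insert(0, Alphabet[(int)(value % 62)]);
            value /= 62;
        }
        return sb.ToString();
    }

    public static long Decode(string text)
    {
        if (string.IsNullOrEmpty(text)) throw new ArgumentException("Empty ID.", nameof(text));

        long value = 0;
        foreach (char c in text)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException($"Invalid character '{c}'.");
            value = checked(value * 62 + digit);
        }
        return value;
    }
}
```

Because decoding recovers the original number, only the numeric ID needs to be stored, which is the property the answer highlights. If the IDs should not reveal the underlying sequence, a salted scheme such as Hashids is the better fit.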

Solution 3 - .Net

I had similar requirements to the OP's. I looked into the available libraries, but most of them are based on randomness and I didn't want that. I could not find anything that was both non-random and still very short, so I ended up rolling my own based on the technique Flickr uses, modified to require less coordination and to allow for longer periods offline.

In short:

  • A central server issues ID blocks consisting of 32 IDs each
  • The local ID generator maintains a pool of ID blocks to generate an ID every time one is requested. When the pool runs low it fetches more ID blocks from the server to fill it up again.
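The two steps above can be sketched roughly as follows. This is not the suid library's API or its exact bit layout; `BlockIdPool`, the `fetchBlockNumber` callback (standing in for the call to the central server) and the `block * 32 + offset` numbering are all assumptions made for illustration. The sketch is also not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of a local ID generator fed by server-issued blocks of 32 IDs.
public sealed class BlockIdPool
{
    private const int BlockSize = 32;                // IDs per block, as described above
    private readonly Func<long> _fetchBlockNumber;   // assumed stand-in for the server call
    private readonly int _minBlocks;                 // pool size to top up to
    private readonly Queue<long> _blocks = new Queue<long>();
    private long _currentBlock;
    private int _offset = BlockSize;                 // forces a block to be taken on first use

    public BlockIdPool(Func<long> fetchBlockNumber, int minBlocks = 2)
    {
        _fetchBlockNumber = fetchBlockNumber ?? throw new ArgumentNullException(nameof(fetchBlockNumber));
        _minBlocks = Math.Max(1, minBlocks);
    }

    public long NextId()
    {
        if (_offset == BlockSize)
        {
            // Current block is used up: refill the pool if it runs low, then take the next block.
            while (_blocks.Count < _minBlocks) _blocks.Enqueue(_fetchBlockNumber());
            _currentBlock = _blocks.Dequeue();
            _offset = 0;
        }
        return _currentBlock * BlockSize + _offset++;
    }

    // Base 36 rendering of an ID for display.
    public static string ToBase36(long value)
    {
        const string digits = "0123456789abcdefghijklmnopqrstuvwxyz";
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        if (value == 0) return "0";
        var sb = new StringBuilder();
        for (; value > 0; value /= 36) sb.Insert(0, digits[(int)(value % 36)]);
        return sb.ToString();
    }
}
```

As long as the server never hands out the same block number twice, IDs from different generators cannot overlap, and a larger `minBlocks` lets a client keep generating IDs for longer without contacting the server.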

Disadvantages:

  • Requires central coordination
  • IDs are more or less predictable (less so than regular DB ids but they aren't random)

Advantages:

  • Stays within 53 bits (JavaScript / PHP max size for integer numbers)
  • Very short IDs
  • Base 36 encoded so very easy for humans to read, write and pronounce
  • IDs can be generated locally for a very long time before needing contact with the server again (depending on pool settings)
  • Theoretically no chance of collisions

I have published both a JavaScript library for the client side and a Java EE server implementation. Implementing servers in other languages should be easy as well.

Here are the projects:

suid - Distributed Service-Unique IDs that are short and sweet

suid-server-java - Suid-server implementation for the Java EE technology stack.

Both libraries are available under a liberal Creative Commons open source license. Hoping this may help someone else looking for short unique IDs.

Solution 4 - .Net

I used base 36 when I solved this problem for an application I was developing a couple of years back. I needed to generate a human-readable, reasonably unique number (within the current calendar year anyway). I chose to use the time in milliseconds since midnight on Jan 1st of the current year (so the timestamps could repeat from one year to the next) and convert it to a base 36 number.

If the system being developed ran into a fatal issue, it generated the base 36 number (7 chars) and displayed it to the end user via the web interface. The user could then relay the issue encountered (and the number) to a tech support person, who could use it to find the point in the logs where the stack trace started. A number like 56af42g7 is infinitely easier for a user to read and relay than a timestamp like 2016-01-21T15:34:29.933-08:00 or a random UUID like 5f0d3e0c-da96-11e5-b5d2-0a1d41d68578.
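A rough sketch of that scheme is shown below. The answer does not include code, so the details are assumptions: the name `YearTimestampId`, the use of UTC, the lowercase alphabet and the left-padding to a fixed 7 characters are all choices made for this example.

```csharp
using System;
using System.Text;

// Hypothetical sketch: milliseconds since the start of the year, rendered in base 36.
public static class YearTimestampId
{
    private const string Digits = "0123456789abcdefghijklmnopqrstuvwxyz";

    public static string From(DateTime utcNow)
    {
        var startOfYear = new DateTime(utcNow.Year, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        long ms = (utcNow - startOfYear).Ticks / TimeSpan.TicksPerMillisecond;

        // A leap year has 31,622,400,000 ms, which is below 36^7, so 7 digits always suffice.
        return ToBase36(ms).PadLeft(7, '0');
    }

    private static string ToBase36(long value)
    {
        if (value == 0) return "0";
        var sb = new StringBuilder();
        for (; value > 0; value /= 36) sb.Insert(0, Digits[(int)(value % 36)]);
        return sb.ToString();
    }
}
```

Calling `YearTimestampId.From(DateTime.UtcNow)` when an error is logged gives the support reference. Two errors within the same millisecond would get the same value, so this is only "reasonably unique", as the answer says.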

Solution 5 - .Net

I really like the simplicity of just encoding a GUID using Base64 format and truncating the trailing == to get a string of 22 characters (it takes one line of code, and you can always convert it back to GUID). Sadly, it sometimes includes + and / characters. OK for database, not great for URLs, but it helped me appreciate the other answers :-)
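One possible way around the + and / problem is the URL-safe Base64 alphabet, which substitutes - and _ for those two characters. The sketch below is an assumption about how that could be done, not part of the quoted article; `ShortGuid` is a hypothetical name. It also shows the conversion back to a GUID.

```csharp
using System;

// Hypothetical sketch: 22-character, URL-safe Base64 form of a GUID, and the way back.
public static class ShortGuid
{
    public static string Encode(Guid guid) =>
        Convert.ToBase64String(guid.ToByteArray())
            .Substring(0, 22)      // drop the trailing "==" padding
            .Replace('+', '-')     // URL-safe substitutions
            .Replace('/', '_');

    public static Guid Decode(string text)
    {
        if (text == null || text.Length != 22)
            throw new FormatException("Expected a 22-character short GUID.");

        // Undo the substitutions and restore the padding before decoding.
        string base64 = text.Replace('-', '+').Replace('_', '/') + "==";
        return new Guid(Convert.FromBase64String(base64));
    }
}
```

The result is still 22 characters and case-sensitive, so it is shorter than a GUID string but well above the 5 or 6 characters the question asks for.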

From https://www.codeproject.com/Tips/1236704/Reducing-the-string-Length-of-a-Guid by Christiaan van Bergen

> We found that converting the Guid (16 bytes) to an ASCII
> representation using Base64 resulted in a useable and still unique
> messageID of only 22 characters.

var newGuid = Guid.NewGuid();
var messageID = Convert.ToBase64String(newGuid.ToByteArray());

var message22chars = Convert.ToBase64String(Guid.NewGuid().ToByteArray()).Substring(0,22);

> For example: The Guid 'e6248889-2a12-405a-b06d-9695b82c0a9c' (string
> length: 36) will get a Base64 representation:
> 'iYgk5hIqWkCwbZaVuCwKnA==' (string length: 24)
>
> The Base64 representation ends with the '==' characters. You could
> just truncate these, without any impact on the uniqueness. Leaving you
> with an identifier of only 22 characters in length.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Kumar | View Question on Stackoverflow
Solution 1 - .Net | Paul Sasik | View Answer on Stackoverflow
Solution 2 - .Net | Slawa | View Answer on Stackoverflow
Solution 3 - .Net | Stijn de Witt | View Answer on Stackoverflow
Solution 4 - .Net | Warren Smith | View Answer on Stackoverflow
Solution 5 - .Net | Ekus | View Answer on Stackoverflow