Obfuscating an ID

AlgorithmEncryption

Algorithm Problem Overview


I'm looking for a way to encrypt/obfuscate an integer ID into another integer. More precisely, I need a function int F(int x), so that

  • x<->F(x) is one-to-one correspondence (if x != y, F(x) != F(y))
  • given F(x), it's easy to find out x - so F is not a hash function
  • given x and F(x) it's hard/impossible to find out F(y), something like x ^ 0x1234 won't work

For clarity, I'm not looking for a strong encryption solution, it's only obfuscation. Imagine a web application with urls like example.com/profile/1, example.com/profile/2 etc. The profiles themselves are not secret, but I'd like to prevent casual voyeurs to view/fetch all profiles one after another, so I'd rather hide them behind something like example.com/profile/23423, example.com/profile/80980234 etc. Although database-stored tokens can do the job quite easily, I'm curious if there's some simple math available for this.

One important requirement I wasn't clear about is that results should look "random", that is, given a sequence x,x+1,...,x+n , F(x),F(x+1)...F(x+n) shouldn't form a progression of any kind.

Algorithm Solutions


Solution 1 - Algorithm

Obfuscate it with some combination of 2 or 3 simple methods:

  • XOR
  • shuffle individual bits
  • convert to modular representation (D.Knuth, Vol. 2, Chapter 4.3.2)
  • choose 32 (or 64) overlapping subsets of bits and XOR bits in each subset (parity bits of subsets)
  • represent it in variable-length numberic system and shuffle digits
  • choose a pair of odd integers x and y that are multiplicative inverses of each other (modulo 232), then multiply by x to obfuscate and multiply by y to restore, all multiplications are modulo 232 (source: "A practical use of multiplicative inverses" by Eric Lippert)

Variable-length numberic system method does not obey your "progression" requirement on its own. It always produces short arithmetic progressions. But when combined with some other method, it gives good results.

The same is true for the modular representation method.

Here is C++ code example for 3 of these methods. Shuffle bits example may use some different masks and distances to be more unpredictable. Other 2 examples are good for small numbers (just to give the idea). They should be extended to obfuscate all integer values properly.

// *** Numberic system base: (4, 3, 5) -> (5, 3, 4)
// In real life all the bases multiplied should be near 2^32
unsigned y = x/15 + ((x/5)%3)*4 + (x%5)*12; // obfuscate
unsigned z = y/12 + ((y/4)%3)*5 + (y%4)*15; // restore

// *** Shuffle bits (method used here is described in D.Knuth's vol.4a chapter 7.1.3)
const unsigned mask1 = 0x00550055; const unsigned d1 = 7;
const unsigned mask2 = 0x0000cccc; const unsigned d2 = 14;

// Obfuscate
unsigned t = (x ^ (x >> d1)) & mask1;
unsigned u = x ^ t ^ (t << d1);
t = (u ^ (u  >> d2)) & mask2;
y = u ^ t ^ (t << d2);

// Restore
t = (y ^ (y >> d2)) & mask2;
u = y ^ t ^ (t << d2);
t = (u ^ (u >> d1)) & mask1;
z = u ^ t ^ (t << d1);

// *** Subset parity
t = (x ^ (x >> 1)) & 0x44444444;
u = (x ^ (x << 2)) & 0xcccccccc;
y = ((x & 0x88888888) >> 3) | (t >> 1) | u; // obfuscate

t = ((y & 0x11111111) << 3) | (((y & 0x11111111) << 2) ^ ((y & 0x22222222) << 1));
z = t | ((t >> 2) ^ ((y >> 2) & 0x33333333)); // restore

Solution 2 - Algorithm

You want the transformation to be reversible, and not obvious. That sounds like an encryption that takes a number in a given range and produces a different number in the same range. If your range is 64 bit numbers, then use DES. If your range is 128 bit numbers then use AES. If you want a different range, then your best bet is probably Hasty Pudding cipher, which is designed to cope with different block sizes and with number ranges that do not fit neatly into a block, such as 100,000 to 999,999.

Solution 3 - Algorithm

Obfuscation is not really sufficient in terms of security.

However, if you are trying to thwart the casual onlooker, I'd recommend a combination of two methods:

  • A private key that you combine with the id by xor'ing them together
  • Rotating the bits by a certain amount both before and after the key has been applied

Here is an example (using pseudo code):

  def F(x)
    x = x XOR 31415927       # XOR x with a secret key
    x = rotl(x, 5)           # rotate the bits left 5 times
    x = x XOR 31415927       # XOR x with a secret key again
    x = rotr(x, 5)           # rotate the bits right 5 times
    x = x XOR 31415927       # XOR x with a secret key again
    return x                 # return the value
  end

I haven't tested it, but I think this is reversible, should be fast, and not too easy to tease out the method.

Solution 4 - Algorithm

I found this particular piece of Python/PHP code very useful:

https://github.com/marekweb/opaque-id

Solution 5 - Algorithm

I wrote some JS code using some of the ideas in this thread:

const BITS = 32n;
const MAX = 4294967295n;
const COPRIME = 65521n;
const INVERSE = 2166657316n;
const ROT = 6n;
const XOR1 = 10296065n; 
const XOR2 = 2426476569n;


function rotRight(n, bits, size) {
    const mask = (1n << bits) - 1n;
    // console.log('mask',mask.toString(2).padStart(Number(size),'0'));
    const left = n & mask;
    const right = n >> bits;
    return (left << (size - bits)) | right;
}

const pipe = fns => fns.reduce((f, g) => (...args) => g(f(...args)));

function build(...fns) {
    const enc = fns.map(f => Array.isArray(f) ? f[0] : f);
    const dec = fns.map(f => Array.isArray(f) ? f[1] : f).reverse();

    return [
        pipe(enc),
        pipe(dec),
    ]
}

[exports.encode, exports.decode] = build(
    [BigInt, Number],
    [i => (i * COPRIME) % MAX, i => (i * INVERSE) % MAX],
    x => x ^ XOR1,
    [x => rotRight(x, ROT, BITS), x => rotRight(x, BITS-ROT, BITS)],
    x => x ^ XOR2,
);

It produces some nice results like:

1 1352888202n 1 'mdh37u'
2 480471946n 2 '7y26iy'
3 3634587530n 3 '1o3xtoq'
4 2225300362n 4 '10svwqy'
5 1084456843n 5 'hxno97'
6 212040587n 6 '3i8rkb'
7 3366156171n 7 '1jo4eq3'
8 3030610827n 8 '1e4cia3'
9 1889750920n 9 'v93x54'
10 1017334664n 10 'gtp0g8'
11 4171450248n 11 '1wzknm0'
12 2762163080n 12 '19oiqo8'
13 1621319561n 13 'qtai6h'
14 748903305n 14 'cdvlhl'
15 3903018889n 15 '1sjr8nd'
16 3567473545n 16 '1mzzc7d'
17 2426613641n 17 '144qr2h'
18 1554197390n 18 'ppbudq'
19 413345678n 19 '6u3fke'
20 3299025806n 20 '1ik5klq'
21 2158182286n 21 'zoxc3y'
22 1285766031n 22 'l9iff3'
23 144914319n 23 '2ea0lr'
24 4104336271n 24 '1vvm64v'
25 2963476367n 25 '1d0dkzz'
26 2091060108n 26 'ykyob0'
27 950208396n 27 'fpq9ho'
28 3835888524n 28 '1rfsej0'
29 2695045004n 29 '18kk618'
30 1822628749n 30 'u559cd'
31 681777037n 31 'b9wuj1'
32 346231693n 32 '5q4y31'

Testing with:

  const {encode,decode} = require('./obfuscate')

  for(let i = 1; i <= 1000; ++i) {
        const j = encode(i);
        const k = decode(j);
        console.log(i, j, k, j.toString(36));
   }

XOR1 and XOR2 are just random numbers between 0 and MAX. MAX is 2**32-1; you should set this to whatever you think your highest ID will be.

COPRIME is a number that's coprime w/ MAX. I think prime numbers themselves are coprime with every other number (except multiples of themselves).

INVERSE is the tricky one to figure out. These blog posts don't give a straight answer, but WolframAlpha can figure it out for you. Basically, just solve the equation (COPRIME * x) % MAX = 1 for x.

The build function is something I created to make it easier to create these encode/decode pipelines. You can feed it as many operations as you want as [encode, decode] pairs. These functions have to be equal and opposite. The XOR functions are their own compliments so you don't need a pair there.


Here's another fun involution:

function mixHalves(n) {
    const mask = 2n**12n-1n;
    const right = n & mask;
    const left = n >> 12n;
    const mix = left ^ right;
    return (mix << 12n) | right;
}

(assumes 24-bit integers -- just change the numbers for any other size)

Solution 6 - Algorithm

Do anything with the bits of the ID that won't destroy them. For example:

  • rotate the value
  • use lookup to replace certain parts of the value
  • xor with some value
  • swap bits
  • swap bytes
  • mirror the whole value
  • mirror a part of the value
  • ... use your imagination

For decryption, do all that in reverse order.

Create a program that will 'encrypt' some interesting values for you and put them in a table you can examine. Have same program TEST your encryption/decryption routine WITH all set of values that you want to have in your system.

Add stuff to the above list into the routines until your numbers will look properly mangled to you.

For anything else, get a copy of [The Book][1].

[1]: http://www.schneier.com/book-applied.html "The Book"

Solution 7 - Algorithm

I wrote an article on secure permutations with block ciphers, which ought to fulfil your requirements as stated.

I'd suggest, though, that if you want hard to guess identifiers, you should just use them in the first place: generate UUIDs, and use those as the primary key for your records in the first place - there's no need to be able to convert to and from a 'real' ID.

Solution 8 - Algorithm

Not sure how "hard" you need it to be, how fast, or how little memory to use. If you have no memory constraints you could make a list of all integers, shuffle them and use that list as a mapping. However, even for a 4 byte integer you would need a lot of memory.

However, this could be made smaller so instead of mapping all integers you would map only 2 (or worst case 1) byte and apply this to each group in the integer. So, using 2 bytes a integer would be (group1)(group2) you would map each group through the random map. But that means that if you only change group2 then the mapping for group1 would stay the same. This could "fixed" by mapping different bits to each group.

So, *(group2) could be (bit 14,12,10,8,6,4,2,0) so, adding 1 would change both group1 and group2.

Still, this is only security by obscurity, anyone that can feed numbers into your function (even if you keep the function secret) could fairly easily figure it out.

Solution 9 - Algorithm

What you're describing here seems to be the opposite of a one-way function: it's easy to invert but super difficult to apply. One option would be to use a standard, off-the-shelf public-key encryption algorithm where you fix a (secret, randomly-chosen) public key that you keep a secret and a private key that you share with the world. That way, your function F(x) would be the encryption of x using the public key. You could then easily decrypt F(x) back to x by using the private decryption key. Notice that the roles of the public and private key are reversed here - you give out the private key to everyone so that they can decrypt the function, but keep the public key secret on your server. That way:

  1. The function is a bijection, so it's invertible.
  2. Given F(x), x is efficiently computable.
  3. Given x and F(x), it is extremely difficult to compute F(y) from y, since without the public key (assuming you use a cryptographically strong encryption scheme) there is no feasible way to encrypt the data, even if the private decryption key is known.

This has many advantages. First, you can rest assured that the crypto system is safe, since if you use a well-established algorithm like RSA then you don't need to worry about accidental insecurity. Second, there are already libraries out there to do this, so you don't need to code much up and can be immune to side-channel attacks. Finally, you can make it possible for anyone to go and invert F(x) without anyone actually being able to compute F(x).

One detail- you should definitely not just be using the standard int type here. Even with 64-bit integers, there are so few combinations possible that an attacker could just brute-force try inverting everything until they find the encryption F(y) for some y even if they don't have the key. I would suggest using something like a 512-bit value, since even a science fiction attack would not be able to brute-force this.

Hope this helps!

Solution 10 - Algorithm

Generate a private symmetric key for use in your application, and encrypt your integer with it. This will satisfy all three requirements, including the hardest #3: one would need to guess your key in order to break your scheme.

Solution 11 - Algorithm

If xor is acceptable for everything but inferring F(y) given x and F(x) then I think you can do that with a salt. First choose a secret one-way function. For example S(s) = MD5(secret ^ s). Then F(x) = (s, S(s) ^ x) where s is chosen randomly. I wrote that as a tuple but you can combine the two parts into an integer, e.g. F(x) = 10000 * s + S(s) ^ x. The decryption extracts the salt s again and uses F'(F(x)) = S(extract s) ^ (extract S(s)^x). Given x and F(x) you can see s (though it is slightly obfuscated) and you can infer S(s) but for some other user y with a different random salt t the user knowing F(x) can't find S(t).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiongeorgView Question on Stackoverflow
Solution 1 - AlgorithmEvgeny KluevView Answer on Stackoverflow
Solution 2 - AlgorithmrossumView Answer on Stackoverflow
Solution 3 - AlgorithmIAmNaNView Answer on Stackoverflow
Solution 4 - Algorithmmiracle2kView Answer on Stackoverflow
Solution 5 - AlgorithmmpenView Answer on Stackoverflow
Solution 6 - AlgorithmDaniel MošmondorView Answer on Stackoverflow
Solution 7 - AlgorithmNick JohnsonView Answer on Stackoverflow
Solution 8 - AlgorithmRoger LindsjöView Answer on Stackoverflow
Solution 9 - AlgorithmtemplatetypedefView Answer on Stackoverflow
Solution 10 - AlgorithmSergey KalinichenkoView Answer on Stackoverflow
Solution 11 - AlgorithmBen JacksonView Answer on Stackoverflow