PHP Unique Hash
I want a unique alphanumeric hash who’s sequence is difficult to deduce. I could run it out to md5 and trim the first n chars but that’s not going to be very unique. Storing a truncated checksum in a unique field means that the frequency of collisions will increase geometrically as the number of unique keys for a lowercase alphanumeric hash approaches 36^n. I’d rather do it right than code myself a timebomb. So I came up with this.
It’s dense so don’t bother wading through it, I’ll walk through the finer points below.
class PseudoCrypt { private static $golden_primes = array( 36 => array(1,23,809,28837,1038073,37370257 /*,1345328833*/) ); public static function udihash($num, $len = 5, $base = 36) { $gp = self::$golden_primes[$base]; $maxlen = count($gp); $len = $len > ($maxlen-1) ? ($maxlen-1) : $len; while($len < $maxlen && pow($base,$len) < $num) $len++; if($len >= $maxlen) throw new Exception($num." out of range (max ".pow($base,$maxlen-1).")"); $ceil = pow($base,$len); $prime = $gp[$len]; $dechash = ($num * $prime) % $ceil; $hash = base_convert($dechash, 10, $base); return str_pad($hash, $len, "0", STR_PAD_LEFT); } }
Here’s a use case
$ids = range(1,10); foreach($ids as $id) { echo PseudoCrypt::udihash($id) . "\n"; }
m8z2p 8hy5e uqx83 gzwas 38vdh phug6 bqtiv xzslk k8ro9 6hqqy
Pretty random-looking, huh?
Alphanum = 36
Lowercase alphanumeric is a base 36 encoded base 10 integer. 1=1, 10=a, 36=z, 37=10, 48=1a, 1296=zz, 1297=100, etc.
36, 1296, 46656, 1679616, 60466176
1 character = 36 permutations, 2 characters = 1296 permutations, etc.
23, 809, 28837, 1038073, 37370257
23 = next highest prime from golden mean. 36-22-23, 1296-801-809, etc.
Uniqueness
I chose to use primes to ensure hash uniqueness.
Appearance of randomness
I chose primes near the golden ratio to maximize the appearance of randomness. Given a small set of hashes (even with the associated id) it would be difficult for anyone to guess the next hash.
Hash Length Overrun
Give it a low hash length and uniqueness will be preserved as it oversteps its range.
$ids = range(807,812); foreach($ids as $id) { echo PseudoCrypt::udihash($id, 2) . "\n"; }
r3
dk
01
n4i
9dj
vmk32-bit woe
I capped this example at a len of 5 because the modulus operator on any 32 bit machine (like this WinXP terdbox) chokes on len >= 6. I don’t anticipate a need for > 60.4 million of these suckers, so it’s a tradeoff I’m willing to make. I’d feel more comfortable releasing this as a default len of 6. Len of 6 (on a 64-bit system) will get you to 2.1 billion. I could probably make this work if I got into bit shifting, but… I don’t reeeeaaallly care. 60.4 mil is probably more than I will ever need.
This is a minimum security technique.
Keep your primes a secret and limit the number of hashes a user can get his hands on to make it harder for script kiddies to reverse engineer your algo. This is a thin rotation and base re-encoding obfuscation algorithm, not an encryption algorithm. Don’t use this to crypt sensitive info. Use it to obfuscate integer IDs.
Tools Used
Golden Ratio Calculator
Prime Numbers Generator
That’s it.
Any of you CS majors out there care to tell me what the technical term is for this type of hash?
Comments(6)
What does $id refer to on line:
while($len < $maxlen && pow($base,$len) < $id) $len++
Typo, should read $num. Thankx
Nice! this is what I was looking for. But it’d be nicer if you could code a decoder to get the original number back. Is there any chance I could get that? I don’t think ‘m capable to code it myself. Thanks for your time anyway!
Hey Kev, that’s very useful!
Obviously not if you need something secure, but to obfuscate ids it’s quite simple and powerful (compared with some things going around in the web).
Have you ever needed to unhash its results?
it may be of some use if you are hashing ids…
Wonderful function — but the output diverges above 58 on two systems I’m testing on. I suspect it’s because one’s a 32bit machine, and the other 64.
Any thoughts on a fix for that?
If I find one, I’ll come back and post.
Hey, that’s very useful!! I wanted to use unique IDs to generate a code as password for my users so they can come back later on the site, and enter this code, so they can edit what they entered. your algorithm is useful in my case.
thank you.
@Tim: I think, not needing the inverse function, it is not important to have different values for one number.