PHP Unique Hash
I want a short alphanumeric hash that’s unique and who’s sequence is difficult to deduce. I could run it out to md5 and trim the first n chars but that’s not going to be very unique. Storing a truncated checksum in a unique field means that the frequency of collisions will increase geometrically as the number of unique keys for a base 62 encoded integer approaches 62^n. I’d rather do it right than code myself a timebomb. So I came up with this.
I’ll walk us through the finer points below.
class PseudoCrypt { /* Next prime greater than 62 ^ n / 1.618033988749894848 */ private static $golden_primes = array( 1,41,2377,147299,9132313,566201239,35104476161,2176477521929 ); /* Ascii : 0 9, A Z, a z */ /* $chars = array_merge(range(48,57), range(65,90), range(97,122)) */ private static $chars = array( 0=>48,1=>49,2=>50,3=>51,4=>52,5=>53,6=>54,7=>55,8=>56,9=>57,10=>65, 11=>66,12=>67,13=>68,14=>69,15=>70,16=>71,17=>72,18=>73,19=>74,20=>75, 21=>76,22=>77,23=>78,24=>79,25=>80,26=>81,27=>82,28=>83,29=>84,30=>85, 31=>86,32=>87,33=>88,34=>89,35=>90,36=>97,37=>98,38=>99,39=>100,40=>101, 41=>102,42=>103,43=>104,44=>105,45=>106,46=>107,47=>108,48=>109,49=>110, 50=>111,51=>112,52=>113,53=>114,54=>115,55=>116,56=>117,57=>118,58=>119, 59=>120,60=>121,61=>122 ); public static function base62($int) { $key = ""; while($int > 0) { $mod = $int-(floor($int/62)*62); $key .= chr(self::$chars[$mod]); $int = floor($int/62); } return strrev($key); } public static function udihash($num, $len = 5) { $ceil = pow(62, $len); $prime = self::$golden_primes[$len]; $dec = ($num * $prime)-floor($num * $prime/$ceil)*$ceil; $hash = self::base62($dec); return str_pad($hash, $len, "0", STR_PAD_LEFT); } }
Here’s a use case
$ids = range(1,10); foreach($ids as $id) { echo PseudoCrypt::udihash($id) . "\n"; }
cJio3 EdRc6 qxAQ9 TGtEC 5ac2F huKqI KE3eL wXmSO YrVGR BBE4U
Pretty random-looking, huh?
Base 62
Hashes are base 62 encoded base 10 integers. 1=1, 10=a, 36=Z, 61=z, 62=10, 72=1a, etc.
62, 3844, 238328, 14776336, 916132832, 56800235584, 3521614606208
1 character = 62 permutations, 2 characters = 3844 permutations, etc.
41, 2377, 147299, 9132313, 566201239, 35104476161, 2176477521929
41 = next highest prime from golden mean of 62.
2377 = next highest prime from golden mean of 3844.
Uniqueness
I chose to use primes to ensure hash uniqueness. Any prime greater than one half 62^n will do, but if you use a prime near 62^n or 62^n/2 or 2*62^n/3 etc, you will detect a linearity in the sequence at certain points in the ring.
Appearance of randomness
I chose primes near the golden ratio to maximize the appearance of randomness. Given a small set of hashes (even with the associated id) it would be difficult for anyone to guess the next hash.
This is a minimum security technique.
Keep your primes a secret and limit the number of hashes a user can get his hands on to make it harder for script kiddies to reverse engineer your algo. This is a thin rotation and base re-encoding obfuscation algorithm, not an encryption algorithm. Don’t use this to crypt sensitive info. Use it to obfuscate integer IDs.
Tools Used
Golden Ratio Calculator
Prime Numbers Generator
That’s it.
Any of you CS majors out there care to tell me what the technical term is for this type of hash?
Comments(12)
Nice! this is what I was looking for. But it’d be nicer if you could code a decoder to get the original number back. Is there any chance I could get that? I don’t think ‘m capable to code it myself. Thanks for your time anyway!
Hey Kev, that’s very useful!
Obviously not if you need something secure, but to obfuscate ids it’s quite simple and powerful (compared with some things going around in the web).
Have you ever needed to unhash its results?
it may be of some use if you are hashing ids…
Wonderful function — but the output diverges above 58 on two systems I’m testing on. I suspect it’s because one’s a 32bit machine, and the other 64.
Any thoughts on a fix for that?
If I find one, I’ll come back and post.
Hey, that’s very useful!! I wanted to use unique IDs to generate a code as password for my users so they can come back later on the site, and enter this code, so they can edit what they entered. your algorithm is useful in my case.
thank you.
@Tim: I think, not needing the inverse function, it is not important to have different values for one number.
Hey German, if you want decodable ones — would printing random numbers in base 64 work? Most languages let you print in base 16 (hex) and usually (with some sleuthing) have ways to read/write base 64.
Try rining the script with: PseudoCrypt::udihash($id, 10)
All keys wil have values:
000000000000
000000000000
000000000000
000000000000
000000000000
000000000000
000000000000
000000000000
000000000000
000000000000
How to solve it?
10 is probably too large. I suggest you run it w/ 5 or 6 and append some random data.
$hash = PseudoCrypt::udihash($id, 5);
$rand = PseudoCrypt::base62(rand(0,pow(62,5)));
$rand = str_pad($rand, 5, “0″, STR_PAD_LEFT);
$key = $hash.$rand;
Hi, I implemented this in Factor, if you’re curious:
http://re-factor.blogspot.com/2011/03/unique-hash.html
How many total “unique” possibilities are we talking about here?
Me again :)
I wanted to know if you think it would be safe to run this command instead of your look.
echo PseudoCrypt::udihash(time());
I’m trying to create unique HASH to be inserted as KEY on database, but generating your way the hash always stays the same.
Please let me know if there is any draw back to this solution.
Thank you!
Jenny
Using time() would probably not be a good idea, just because there’s a good chance that two records may be created within the same second.
What I usually do instead for keys is
echo time().’-’.base_convert(rand(1,getrandmax()), 10, 36);
which yields a key like 1304134276-udxxi7
Kevin,
Great algorithm! I wanted to use it in my upcoming Node.js project, so I made a CommonJS/Node.js module available here: https://github.com/alecgorge/node-psuedohash
Thanks again for putting this together!