md5(uniqid) makes sense for random unique tokens? - php

I want to create a token generator that generates tokens that cannot be guessed by the user and that are still unique (to be used for password resets and confirmation codes).
I often see this code; does it make sense?
md5(uniqid(rand(), true));
According to a comment, uniqid($prefix, $moreEntropy = true) yields:
first 8 hex chars = Unix time, last 5 hex chars = microseconds.
I don't know how the $prefix parameter is handled.
So if you don't set the $moreEntropy flag to true, it gives a predictable outcome.
QUESTION: But if we use uniqid with $moreEntropy, what does hashing it with md5 buy us? Is it better than:
md5(mt_rand())
edit1: I will store this token in a database column with a unique index, so I will detect collisions. Might be of interest.
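To make the predictability concern concrete, here is a minimal sketch that splits a plain uniqid() value back into its time components. It assumes PHP 7+ on a typical Linux build, where uniqid() with no prefix and no extra entropy returns 13 hex characters; decodeUniqid() is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: split a plain uniqid() value (no prefix, no extra
// entropy) into the two time fields it is built from.
function decodeUniqid(string $id): array
{
    return [
        'seconds'      => hexdec(substr($id, 0, 8)), // Unix timestamp in seconds
        'microseconds' => hexdec(substr($id, 8, 5)), // microsecond part of that second
    ];
}

$id    = uniqid();
$parts = decodeUniqid($id);

// Anyone who knows roughly when the id was created can narrow it down quickly.
printf("%s -> %s UTC + %d microseconds\n", $id, gmdate('Y-m-d H:i:s', $parts['seconds']), $parts['microseconds']);
```

Hashing such a value with md5 hides the structure but adds no entropy: the set of possible inputs stays just as small.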

rand() is a security hazard and should never be used to generate a security token: rand() vs mt_rand() (Look at the "static" like images). But neither of these methods of generating random numbers is cryptographically secure. To generate secure secrets, an application needs access to a CSPRNG provided by the platform, operating system or hardware module.
In a web application, a good source for secure secrets is non-blocking access to an entropy pool such as /dev/urandom. As of PHP 5.3, PHP applications can use openssl_random_pseudo_bytes(), and the OpenSSL library will choose the best entropy source based on your operating system; under Linux this means the application will use /dev/urandom. This code snippet from Scott is pretty good:
// Returns a random integer in the range [$min, $max) using the OpenSSL CSPRNG.
function crypto_rand_secure($min, $max) {
    $range = $max - $min;
    if ($range < 1) return $min; // not so random... (also avoids log(0))
    $log = log($range, 2);
    $bytes = (int) ($log / 8) + 1; // length in bytes
    $bits = (int) $log + 1; // length in bits
    $filter = (int) (1 << $bits) - 1; // set all lower bits to 1
    do {
        $rnd = hexdec(bin2hex(openssl_random_pseudo_bytes($bytes)));
        $rnd = $rnd & $filter; // discard irrelevant bits
    } while ($rnd >= $range); // reject out-of-range values to avoid bias
    return $min + $rnd;
}
function getToken($length=32){
    $token = "";
    $codeAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    $codeAlphabet.= "abcdefghijklmnopqrstuvwxyz";
    $codeAlphabet.= "0123456789";
    for($i=0;$i<$length;$i++){
        // the upper bound is exclusive, so strlen() is the correct maximum
        $token .= $codeAlphabet[crypto_rand_secure(0,strlen($codeAlphabet))];
    }
    return $token;
}
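On PHP 7 or later (or PHP 5 with a polyfill), the same idea can probably be written more simply with random_int(), which handles the unbiased range selection itself. This is only a sketch under that assumption; getTokenRandomInt() is a hypothetical name and not part of the answer above.

```php
<?php
// Sketch only: same alphabet as getToken(), but each index comes from
// random_int(), which draws unbiased integers from the system CSPRNG.
function getTokenRandomInt(int $length = 32): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
    $max = strlen($alphabet) - 1; // random_int() bounds are inclusive
    $token = '';
    for ($i = 0; $i < $length; $i++) {
        $token .= $alphabet[random_int(0, $max)];
    }
    return $token;
}

echo getTokenRandomInt(), "\n";
```

Note that random_int() takes an inclusive upper bound, unlike crypto_rand_secure(), hence the strlen() - 1.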

This is a copy of another question I found that was asked a few months before this one. Here is a link to the question and my answer: https://stackoverflow.com/a/13733588/1698153.
I do not agree with the accepted answer. According to PHP's own website, "[uniqid] does not generate cryptographically secure tokens, in fact without being passed any additional parameters the return value is little different from microtime(). If you need to generate cryptographically secure tokens use openssl_random_pseudo_bytes()."
I do not think the answer could be clearer than this: uniqid is not secure.

I know the question is old, but it shows up in Google, so...
As others said, rand(), mt_rand() or uniqid() will not guarantee you uniqueness. Even openssl_random_pseudo_bytes() should not be used, since it uses deprecated features of OpenSSL.
What you should use to generate a random hash-like value (same format as md5) is random_bytes() (introduced in PHP 7). To generate a hex string with the same length as an MD5 hash:
bin2hex(random_bytes(16));
If you are using PHP 5.x, you can get this function by including the random_compat library.

Define "unique". If you mean that two tokens cannot have the same value, then hashing isn't enough - it should be backed with a uniqueness test. The fact that you supply the hash algorithm with unique inputs does not guarantee unique outputs.
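A uniqueness test of that kind could look roughly like the sketch below. Everything here is an assumption for illustration: generateUniqueToken() and the $isTaken callback are hypothetical, the in-memory array stands in for a database column with a unique index, and random_bytes() (PHP 7+) is used as the token source.

```php
<?php
// Hypothetical helper: $isTaken stands in for a lookup against the token column.
function generateUniqueToken(callable $isTaken, int $maxAttempts = 5): string
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $token = bin2hex(random_bytes(16)); // 32 hex characters
        if (!$isTaken($token)) {
            return $token;
        }
    }
    throw new RuntimeException('Could not generate an unused token');
}

$issued = []; // in-memory stand-in for the database table
$isTaken = function ($t) use (&$issued) {
    return isset($issued[$t]);
};

$token = generateUniqueToken($isTaken);
$issued[$token] = true; // "insert" the token so later calls see it
```

With a real database, the unique index remains the final authority: a check followed by an insert can still race, so a failed insert should also trigger a retry.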

To answer your question, the problem is that you can't have a generator that is guaranteed to be both random and unique: random output by itself, e.g. md5(mt_rand()), can lead to duplicates. What you want is "random appearing" unique values. uniqid gives the unique id, rand() prefixes a random number, making it even harder to guess, and md5 masks the result to make it harder still. Nothing is unguessable. We just need to make it so hard that they wouldn't even want to try.

I ran into an interesting idea a couple of years ago.
Storing two hash values in the database, one generated with md5($a) and the other with sha($a). Then check if both values are correct. The point is, if the attacker broke your md5(), he cannot break your md5 AND sha in the near future.
Problem is: how can that concept be used with the token generating needed for your problem?

First, the scope of this kind of procedure is to create a key/hash/code, that will be unique for one given database. It is impossible to create something unique for the whole world at a given moment.
That being said, you should create a plain, visible string, using a custom alphabet, and checking the created code against your database (table).
If that string is unique, then you apply a md5() to it and that can't be guessed by anyone or any script.
I know that if you dig deep into the theory of cryptographic generation you can find a lot of explanation about this kind of code generation, but when you put it to real usage it's really not that complicated.
Here's the code I use to generate a simple 10 digit unique code.
$alphabet = "aA1!bB2#cC3#dD5%eE6^fF7&gG8*hH9(iI0)jJ4-kK=+lL[mM]nN{oO}pP\qQ/rR,sS.tT?uUvV>xX~yY|zZ`wW$";
$code = '';
$alphabetLength = strlen($alphabet) - 1; // index of the last character
for ($i = 1; $i <= 10; $i++) {
    $n = rand(0, $alphabetLength); // start at 0 so the first character can be picked too
    $code .= $alphabet[$n];
}
And here are some generated codes, although you can run it yourself to see it work:
SpQ0T0tyO%
Uwn[MU][.
D|[ROt+Cd#
O6I|w38TRe
Of course, there can be a lot of "improvements" that can be applied to it, to make it more "complicated", but if you apply a md5() to this, it'll become, let's say "unguessable" . :)

MD5 is a decent algorithm for producing data dependent IDs. But in case you have more than one item which has the same bitstream (content), you will be producing two similar MD5 "ids".
So if you are just applying it to a rand() function, which is guaranteed not to create the same number twice, you are quite safe.
But for a stronger distribution of keys, I'd personally use SHA1 or SHAx etc'... but you will still have the problem of similar data leads to similar keys.

Related

PHP Seeded, Deterministic, Cryptographically Secure PRNG (PseudoRandom Number Generator). Is it possible?

I'm required to create a provably-fair (deterministic & seeded) cryptographically secure (CS) random number generator in PHP. We are running PHP 5 and PHP 7 isn't really an option right now. However, I found a polyfill for PHP 7's new CS functions so I've implemented that solution (https://github.com/paragonie/random_compat).
I thought that srand() could be used to seed random_int(), but now I'm not certain if that is the case. Can a CSPRNG even be seeded? If it can be seeded, will the output be deterministic (same random result, given same seed)?
Here is my code:
require_once($_SERVER['DOCUMENT_ROOT']."/lib/assets/random_compat/lib/random.php");
$seed_a = 8138707157292429635;
$seed_b = 'JuxJ1XLnBKk7gPASR80hJfq5Ey8QWEIc8Bt';
class CSPRNG{
private static $RNGseed = 0;
public function generate_seed_a(){
return random_int(0, PHP_INT_MAX);
}
public function generate_seed_b($length = 35){
$characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
$randomString = '';
for($i = 0; $i < $length; $i++){
$randomString .= $characters[random_int(0, strlen($characters) - 1)];
}
return $randomString;
}
public function seed($s = 0) {
if($s == 0){
$this->RNGseed = $this->generate_seed_a();
}else{
$this->RNGseed = $s;
}
srand($this->RNGseed);
}
public function generate_random_integer($min=0, $max=PHP_INT_MAX, $pad_zeros = true){
if($this->RNGseed == 0){
$this->seed();
}
$rnd_num = random_int($min, $max);
if($pad_zeros == true){
$num_digits = strlen((string)$max);
$format_str = "%0".$num_digits."d";
return sprintf($format_str, $rnd_num);
}else{
return $rnd_num;
}
}
public function drawing_numbers($seed_a, $num_of_balls = 6){
$this->seed($seed_a);
$draw_numbers = array();
for($i = 0; $i < $num_of_balls; $i++) {
$number = ($this->generate_random_integer(1, 49));
if(in_array($number, $draw_numbers)){
$i = $i-1;
}else{
array_push($draw_numbers, $number);
}
}
sort($draw_numbers);
return $draw_numbers;
}
}
$CSPRNG= new CSPRNG();
echo '<p>Seed A: '.$seed_a.'</p>';
echo '<p>Seed B: '.$seed_b.'</p>';
$hash = hash('sha1', $seed_a.$seed_b);
echo '<p>Hash: '.$hash.'</p>';
$drawNumbers = $CSPRNG->drawing_numbers($seed_a);
$draw_str = implode("-", $drawNumbers);
echo "<br>Drawing: $draw_str<br>";
When this code is run, the Drawing ($draw_str) should be the same on each run, but it is not.
To prove that the drawing is fair, a seed (Seed A) is chosen before the winning number is picked and shown. Another random number is generated as well (Seed B). Seed B is used as a salt and combined with Seed A and the result is hashed. This hash is shown to the user prior to the drawing. They would also be provided with the source code so that when the winning number is picked, both seeds are revealed. They can verify that the hash matches and everything was done fairly.
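The commit-and-reveal check described above can be sketched as follows. The function names commit() and verifyCommitment() are hypothetical; the hash construction is the same hash('sha1', $seed_a.$seed_b) used in the code, and hash_equals() assumes PHP 5.6 or later.

```php
<?php
// Commitment published before the draw (same construction as in the question).
function commit(string $seedA, string $seedB): string
{
    return hash('sha1', $seedA . $seedB);
}

// Check a user can run once both seeds have been revealed.
function verifyCommitment(string $published, string $seedA, string $seedB): bool
{
    return hash_equals($published, commit($seedA, $seedB));
}

$seedA = '8138707157292429635';
$seedB = 'JuxJ1XLnBKk7gPASR80hJfq5Ey8QWEIc8Bt';

$published = commit($seedA, $seedB);               // shown before the draw
$ok = verifyCommitment($published, $seedA, $seedB); // checked after the reveal
```

This only shows that the seeds were not changed after the hash was published. It says nothing about how the seeds were chosen, and the drawing itself still has to be reproducible from Seed A.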
Duskwuff asks:
How do you intend to prove that the seed was chosen fairly? A suspicious user can easily claim that you picked a seed that would result in a favorable outcome for specific users, or that you revealed the seed to specific users ahead of time.
Before you investigate solutions, what exactly is the problem you are trying to solve? What is your threat model?
It sounds like you want SeedSpring (version 0.3.0 supports PHP 5.6).
$prng = new \ParagonIE\SeedSpring\SeedSpring('JuxJ1XLnBKk7gPAS');
$byte = $prng->getBytes(16);
\var_dump(bin2hex($byte));
This should always return:
string(32) "76482c186f7c5d1cb3f895e044e3c649"
The numbers should be unbiased, but since it's based off a pre-shared seed, it is not, by strict definition, cryptographically secure.
Keep in mind that SeedSpring was created as a toy implementation/proof of concept rather than an official Paragon Initiative Enterprises open source security solution, so feel free to fork it and tweak it to suit your purposes. (I doubt our branch will ever reach a "stable 1.0.0 release").
(Also, if you're going to accept/award the bounty to any of these answers, Aaron Toponce's answer is more correct. Encrypting the nonce with ECB mode is more performant than encrypting a long stream of NUL bytes with AES-CTR, for approximately the same security benefit. This is one of the extremely rare occasions that ECB mode is okay.)
First, you shouldn't be implementing your own userspace CSPRNG. The operating system you have PHP 5 installed on already ships a CSPRNG, and you should be using that for all your randomness, unless you know you cannot use it or performance is a concern. You should be using random_int(), random_bytes(), or openssl_random_pseudo_bytes().
However, if you must implement a userspace CSPRNG, then this can be done by simply using an AES library (e.g., libsodium) and encrypting a counter. Pseudocode would be:
Uint-128 n = 0;
while true:
output = AES-ECB(key, n);
n++;
The AES key, in this case, needs sufficient entropy to withstand a sophisticated attack, or the security of your userspace CSPRNG falls apart, of course. The key could be the bcrypt() of a user-supplied password.
Provided your counter, represented as a 128-bit unsigned integer, is always unique, you will get a unique output every time the generator is "seeded" with a new counter. If it's seeded with a previously used counter but a different key, then the output will also be different. The best-case scenario would be a changing key and a changing counter every time the generator is called.
You may be tempted to use a high-precision timestamp, such as one with microsecond accuracy, in your counter. This is fine, except you run the risk of someone or something manipulating the system clock. If the clock can be manipulated, then the CSPRNG can be compromised. You're best off providing a new key every time you call the generator and starting encryption with a 128-bit zero.
Also, notice that we're using ECB mode with AES. Don't freak out. ECB has problems because the ciphertext preserves structure present in the plaintext. In general terms, you should not use ECB mode. However, with 128 bits of data, you will only be encrypting a single ECB block, so there will be no leak of structured data. ECB is preferred over CTR for a userspace CSPRNG, as you don't have to keep track of a key, a counter object, and the data to be encrypted. Only a key and the data are needed. Just make sure you are never encrypting more than 128 bits of data, and you'll never need more than 1 block.
Can a CSPRNG even be seeded?
Yes, and it should always be seeded. If you look at your GNU/Linux operating system, you'll likely notice a file in /var/lib/urandom/random-seed. When the operating system shuts down, it creates that file from the CSPRNG. On next boot, this file is used to seed the kernelspace CSPRNG to prevent reusing previous state of the generator. On every shutdown, that file should change.
If it can be seeded, will the output be deterministic (same random result, given same seed)?
Yes. Provided the same seed, key, etc., the output is deterministic, so the output will be the same. If one of your variables changes, then the output will be different. This is why the generator should be rekeyed on each call.
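As a rough PHP illustration of the encrypt-a-counter idea and of the determinism just described, here is a sketch using the OpenSSL extension. The assumptions are mine: counterBlock() is a hypothetical helper, the key is derived from a fixed seed with SHA-256 purely for demonstration (not the bcrypt approach mentioned above), and pack('J', ...) needs a 64-bit PHP 5.6.3 or later.

```php
<?php
// Sketch of the counter-encryption generator; requires the OpenSSL extension.
function counterBlock(string $key, int $counter): string
{
    // 128-bit big-endian counter: 8 zero bytes followed by the 64-bit value.
    $block = str_repeat("\0", 8) . pack('J', $counter);
    // Exactly one 16-byte block, no padding, raw binary output.
    return openssl_encrypt($block, 'aes-128-ecb', $key, OPENSSL_RAW_DATA | OPENSSL_ZERO_PADDING);
}

// Demonstration key only: 16 bytes derived from a fixed seed (an assumption).
$key = substr(hash('sha256', 'JuxJ1XLnBKk7gPAS', true), 0, 16);

$first  = bin2hex(counterBlock($key, 0));
$again  = bin2hex(counterBlock($key, 0)); // same key and counter
$second = bin2hex(counterBlock($key, 1)); // next counter value
```

Here $first and $again are identical, while $second differs, which is exactly the seeded, reproducible behaviour the question asks about. The output is only as unpredictable as the key.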

Web development/PHP: when should you/shouldn't you use a cryptographically secure token?

The standard way to create a cryptographically secure token using PHP seems to be:
$token = bin2hex(openssl_random_pseudo_bytes(16));
I understand that if you're using Linux (which I always do), this uses /dev/urandom, which is fed by the many things that go on in the operating system, making it nigh impossible to predict.
My function is more like this so I can do it by char length rather than bit length (though I don't really ever use it, see below):
function token($charLength = 32) {
    // Each byte produces 2 hexadecimal characters, so the byte length is half the char length
    $byteLength = $charLength / 2;
    // Generate token
    $token = bin2hex(openssl_random_pseudo_bytes($byteLength));
    return $token;
}
Is it the unpredictability that makes it secure? I can't help thinking it's less secure because the output is hexadecimal and therefore is less hard to guess or brute-force than a string with the same number of chars that contains the rest of the alphabet, uppercase letters, other symbols, etc.
Is this why when people refer to tokens they refer to the bit length as opposed to char length?
Consider instead:
function randomString($length,
$alpha = true,
$alphau = true,
$numeric = true,
$specialChars = '') {
$string = $specialChars;
if($alpha === true) {
$string .= 'abcdefghijklmnopqrstuvwxyz';
}
if($alphau === true) {
$string .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
}
if($numeric === true) {
$string .= '0123456789';
}
$array = str_split($string);
$string = '';
for($counter = 0; $counter < $length; $counter ++) {
$string .= $array[array_rand($array)];
}
return $string;
}
In the context of web development when would you use the first function over the second for:
Creating a random password for a password reset
Creating a one-time use token (e.g. for a forgotten password link)
Creating a salt for a password hash (e.g. bcrypt, sha512, PBKDF2)
Creating a token for a “remember me” cookie token
In all instances I would use randomString() over token() so I guess I'm asking if and why I'm wrong in any of the above.
My rationale in relation to the above points:
12 char random password with uppercase, lower case and numbers is hard to guess; plus I freeze people out for 15 mins after 5 failed login attempts
64 char random string, If someone tried brute-forcing the token to reset a password the firewall would pick up on it
Salts should be assumed to be public anyway, so long as they're different per password it makes it impossible to produce a rainbow table
My remember me token is 128 char random string stored in a cookie and is salted and sha 512'd in the database
The primary concern with random number generators is generally not the output created, but how predictably that output is generated. Your basic question is why not use array_rand (which internally uses php_rand) over openssl_random_pseudo_bytes for cryptographic purposes. The answer has to do with the technique each function uses, with array_rand being a much more predictable (and reproducible) approach. See Pádraic Brady's article "Predicting Random Numbers In PHP – It’s Easier Than You Think!" for more detail: http://blog.astrumfutura.com/2013/03/predicting-random-numbers-in-php-its-easier-than-you-think/.
Concerning the output of random number generators, password/key strength in relation to brute force attacks is often measured in entropy. This is usually listed in bits, with the more bits the better. The Wikipedia page on password strength (http://en.wikipedia.org/wiki/Password_strength) has some great reference tables for determining the entropy level of passwords at different lengths and using various combinations of character types. The openssl_random_pseudo_bytes() function draws on all 256 byte values, giving a full 8 bits of entropy per raw byte (4 bits per character once hex-encoded with bin2hex()). At best your randomString() function would result in 5.954 bits of entropy per symbol (log2 of its 62 symbols), and only if the symbols were chosen by an unpredictable generator.
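The per-symbol figures can be checked with a few lines of arithmetic. This sketch assumes each symbol is drawn uniformly and independently from a secure source; bitsPerSymbol() and tokenEntropyBits() are hypothetical helper names.

```php
<?php
// Bits of entropy carried by one symbol drawn uniformly from an alphabet.
function bitsPerSymbol(int $alphabetSize): float
{
    return log($alphabetSize, 2);
}

// Total entropy of a token of $length such symbols.
function tokenEntropyBits(int $alphabetSize, int $length): float
{
    return $length * bitsPerSymbol($alphabetSize);
}

printf("hex, 32 chars:       %.1f bits\n", tokenEntropyBits(16, 32));
printf("62 symbols, 32 chars: %.1f bits\n", tokenEntropyBits(62, 32));
printf("62 symbols, 22 chars: %.1f bits\n", tokenEntropyBits(62, 22));
```

So a hex token needs more characters for the same strength (32 hex characters carry 128 bits, which a 62-symbol alphabet reaches in 22), but the strength itself depends on the bit count and on the generator, not on how the bits are encoded.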
A crypto-strong random number should be used in all security-related scenarios where the ability to guess one of these numbers would negatively affect your site in some manner. The only item in your list of 4 where I see a crypto-strong random number not being required is salt values for hashes. A salt value must be universally unique. It can certainly be produced by a crypto random number generator (CRNG), but this is not required as the resulting value can be made public. See https://security.stackexchange.com/questions/8246/what-is-a-good-enough-salt-for-a-saltedhash

How can I generate strong unique API keys with PHP?

I need to generate a strong unique API key.
Can anyone suggest the best solution for this? I don't want to use rand() function to generate random characters. Is there an alternative solution?
As of PHP 7.0, you can use the random_bytes($length) method to generate a cryptographically-secure random string. This string is going to be in binary, so you'll want to encode it somehow. A straightforward way of doing this is with bin2hex($binaryString). This will give you a string $length * 2 bytes long, with $length * 8 bits of entropy to it.
You'll want $length to be high enough such that your key is effectively unguessable and that the chance of there being another key being generated with the same value is practically nil.
Putting this all together, you get this:
$key = bin2hex(random_bytes(32)); // 64 characters long
When you verify the API key, use only the first 32 characters to select the record from the database and then use hash_equals() to compare the API key as given by the user against what value you have stored. This helps protect against timing attacks. ParagonIE has an excellent write-up on this.
For an example of the checking logic:
$token = $request->bearerToken();
// Retrieve however works best for your situation,
// but it's critical that only the first 32 characters are used here.
$users = app('db')->table('users')->where('api_key', 'LIKE', substr($token, 0, 32) . '%')->get();
// $users should only have one record in it,
// but there is an extremely low chance that
// another record will share a prefix with it.
foreach ($users as $user) {
// Performs a constant-time comparison of strings,
// so you don't leak information about the token.
if (hash_equals($user->api_key, $token)) {
return $user;
}
}
return null;
Bonus: Slightly More Advanced Use With Base64 Encoding
Using Base64 encoding is preferable to hexadecimal for space reasons, but is slightly more complicated because each character encodes 6 bits (instead of 4 for hexadecimal), which can leave the encoded value with padding at the end.
To keep this answer from dragging on, I'll just put some suggestions for handling Base64 without their supporting arguments. Pick a $length greater than 32 that is divisible by both 3 and 2. I like 42, so we'll use that for $length. Base64 encodings are of length 4 * ceil($length / 3), so our $key will be 56 characters long. You can use the first 28 characters for selection from your storage, leaving another 28 characters on the end that are protected from leaking by timing attacks with hash_equals.
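The Base64 numbers above can be reproduced with a short sketch (PHP 7+ for random_bytes()). The variable names $selector and $verifier are my own labels for the two halves, not terms from the answer.

```php
<?php
$length = 42;                                // divisible by 3, so no '=' padding
$key = base64_encode(random_bytes($length)); // 4 * ceil(42 / 3) = 56 characters

$selector = substr($key, 0, 28); // used to select the record from storage
$verifier = substr($key, 28);    // compared with hash_equals() afterwards

echo strlen($key), " characters\n";
```

Standard Base64 output can contain + and /, so a key that travels in a URL would need encoding or a URL-safe alphabet.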
Bonus 2: Secure Key Storage
Ideally, you should be treating the key much like a password. This means that instead of using hash_equals to compare the full string, you should hash the remainder of the key like a password, store that separately than the first half of your key (which is in plain-text), use the first half for selection from your database and verify the latter half with password_verify.
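A minimal sketch of that split-key scheme, assuming the 64-character hex key from earlier in this answer. issueApiKey() and checkApiKey() are hypothetical helpers, and the array keys are illustrative rather than a real schema.

```php
<?php
// Hypothetical helper: create a key and the two values that would be stored.
function issueApiKey(): array
{
    $key = bin2hex(random_bytes(32)); // 64 hex characters
    return [
        'key'           => $key,                // shown to the client once
        'selector'      => substr($key, 0, 32), // stored in plain text, indexed
        'verifier_hash' => password_hash(substr($key, 32), PASSWORD_DEFAULT),
    ];
}

// Hypothetical helper: verify the second half against the stored hash.
function checkApiKey(string $presented, string $storedVerifierHash): bool
{
    return password_verify(substr($presented, 32), $storedVerifierHash);
}

$issued = issueApiKey();
$valid  = checkApiKey($issued['key'], $issued['verifier_hash']);
```

The record is looked up by the selector first; only the verifier half is ever hashed, so a leaked database does not directly reveal usable keys.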
using mcrypt:
<?php
$bytes = mcrypt_create_iv(4, MCRYPT_DEV_URANDOM);
$unpack = unpack("Nint", $bytes);
$id = $unpack['int'] & 0x7FFFFFFF;
PHP has the uniqid function (http://php.net/manual/en/function.uniqid.php) with an optional prefix, and you can even add additional entropy to further avoid collisions. But if you absolutely, positively need something unique, you should not use anything with randomness in it.
This is the best solution I found.
http://www.php.net/manual/en/function.uniqid.php#94959

Never Generate Random Number Again

I am looking for a PHP solution for generating random numbers that never produces the same number twice. Is there such a solution? If so, please let me know.
I need this solution for one of my projects, which generates a unique key for a URL, and I don't want to check whether the generated number already exists in the data.
Thanks.
--------- EDIT ----------
I am using this random number generating method. Is it helpful?
function randomString($length = 10, $chars = '1234567890') {
// Alpha lowercase
if ($chars == 'alphalower') {
$chars = 'abcdefghijklmnopqrstuvwxyz';
}
// Numeric
if ($chars == 'numeric') {
$chars = '1234567890';
}
// Alpha Numeric
if ($chars == 'alphanumeric') {
$chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890';
}
// Hex
if ($chars == 'hex') {
$chars = 'ABCDEF1234567890';
}
$charLength = strlen($chars)-1;
$randomString = ''; // initialise before appending to avoid an undefined variable notice
for($i = 0 ; $i < $length ; $i++)
{
$randomString .= $chars[mt_rand(0,$charLength)];
}
return $randomString;
}
Look at the php function uniqid():
http://php.net/manual/en/function.uniqid.php
It's impossible to generate a random number which is unique - if the generator is dependent on state, then the output is by definition not random.
It is possible to generate a set of random numbers and remove duplicates (although at that point the numbers again cease to be truly random).
Do you really need a random number or do you need a sequence number or a unique identifier - these are 3 separate things.
which generate unique key for URL
MySQL and SQLite both support auto-increment column types which will be unique (effectively the same as a sequence number). MySQL even has a mechanism for ensuring uniqueness across equivalent nodes - even where they are not tightly coupled. Oracle provides sequence generators.
Both MySQL and PHP have built-in functionality for generating uuids, although since most DBMS support surrogate key generation, there is little obvious benefit to this approach.
You can use a database. Every time a random number is generated, store it in the database, and next time compare the new random number with those already stored.
Use a random number generator, keep stored the already generated values, discard and generate again when you get a duplicate number.
Ignore uniqids and stuff like that because they are just plain wrong.
There are no real "perfect and low price" random number generators!!
The best that can be done from mathematical functions are pseudorandom which in the end seem random enough for most intents and purposes.
mt_rand function uses the Mersenne twister, which is a pretty good PRNG!
so it's probably going to be good enough for most casual use.
give a look here for more info: http://php.net/manual/en/function.mt-rand.php
a possible code implementation is
<?php
$random = mt_rand($yourMin, $yourMax);
?>
EDIT:
find a very good explanation here:
Generate cryptographically secure random numbers in php
The typical answer is to use a GUID or UUID, although I avoid those forms that use only random numbers. (Eg, avoid version 4 GUID or UUIDs)

Session hash does size matter?

Does size matter when choosing the right algorithm to use for a session hash?
I recently read this article and it suggested using whirlpool to create a hash for session id. Whirlpool generates a 128 character hash string, is this too large?
The plan is to store the session hash in a db. Is there much of a difference between using a 64 character field (sha256), a 96 character field (sha384) or a 128 character field (whirlpool)? One of the initial arguments made for whirlpool was the speed vs other algorithms, but looking at the speed results sha384 doesn't fare too badly.
There is the option to truncate the hash to make it smaller than 128 characters.
I did modify the original code snippet to allow changing the algorithm based on the needs.
Update: There was some discussion about string being hashed, so I've included the code.
function generateUniqueId($maxLength = null) {
$entropy = '';
// try ssl first
if (function_exists('openssl_random_pseudo_bytes')) {
$entropy = openssl_random_pseudo_bytes(64, $strong);
// skip ssl since it wasn't using the strong algo
if($strong !== true) {
$entropy = '';
}
}
// add some basic mt_rand/uniqid combo
$entropy .= uniqid(mt_rand(), true);
// try to read from the windows RNG
if (class_exists('COM')) {
try {
$com = new COM('CAPICOM.Utilities.1');
$entropy .= base64_decode($com->GetRandom(64, 0));
} catch (Exception $ex) {
}
}
// try to read from the unix RNG
if (is_readable('/dev/urandom')) {
$h = fopen('/dev/urandom', 'rb');
$entropy .= fread($h, 64);
fclose($h);
}
// create hash
$hash = hash('whirlpool', $entropy);
// truncate hash if max length imposed
if ($maxLength) {
return substr($hash, 0, $maxLength);
}
return $hash;
}
The time taken to create the hash is not important, and as long as your database is properly indexed, the storage method should not be a major factor either.
However, the hash has to be transmitted with the client's request every time, frequently as a cookie. Large cookies can add a small amount of additional time to each request. See Yahoo!'s page performance best practices for more information. Smaller cookies, thus a smaller hash, have benefits.
Overall, large hash functions are probably not justified. For their limited scope, good old md5 and sha1 are probably just fine as the source behind a session token.
Yes, size matters.
If it's too short, you run the risk of collisions. You also make it practical for an attacker to find someone else's session by brute-force attack.
Being too long matters less, but every byte of the session ID has to be transferred from the browser to the server with every request, so if you're really optimising things, you may not want an ID that's too long.
You don't have to use all the bits of a hash algorithm, though - there's nothing stopping you from using something like Whirlpool, then only taking the first 128 bits (32 characters in hex). Practically speaking, 128 bits is a good lower bound on length, too.
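Truncating as described is a one-liner. In this sketch random_bytes(64) (PHP 7+) is only an assumed stand-in for whatever entropy the application has gathered.

```php
<?php
$entropy = random_bytes(64);         // stand-in for the gathered entropy (assumption)
$full = hash('whirlpool', $entropy); // 128 hex characters (512 bits)
$sessionId = substr($full, 0, 32);   // keep only the first 128 bits

echo strlen($full), ' -> ', strlen($sessionId), " characters\n";
```

The truncated ID is only as unpredictable as the input: 32 hex characters cannot carry more than 128 bits, and they carry fewer if the entropy fed to the hash was weaker.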
As erickson points out, though, using a hash is a bit odd. Unless you have at least as much entropy as input as the length of the ID you're using, you're vulnerable to attacks that guess the input to your hash.
The article times out when I try to read it, but I can't think of a good reason to use a hash as a session identifier. Session identifiers should be unpredictable; given the title of the article, it sounds like the authors acknowledge that principle. Then, why not use a cryptographic random number generator to produce session identifiers?
A hash takes input, and if that input is predictable, so is the hash, and that's bad.
SHA1 or MD5 is probably enough for your needs. In practice, the probability of a collision is so small that it will likely never happen.
Ultimately, though, it all depends upon your required level of security. Do also keep in mind that longer hashes are both more expensive to compute and require more storage space.
