Sources of Entropy on the Web - php

In order to guarantee honesty of a random number generator, the idea is that users can, if they wish, verify that the number is, in fact, generated from public sources of entropy. This enables the system to ensure it's users that the random number could not have been selected by the server.
$entropy = "what_do_you_think";
$md5 = md5($entropy);
/*take the first 10 hex characters of the md5 hash*/
$hex = substr($md5, 0, 9);
/*convert the hex to decimal*/
$dec = hexdec($hex);
/*use this decimal as a seed*/
srand($dec);
/*pick a random number between 0 and 9, ultimately seeded by the entropy*/
$rand = rand(0,9);
My question is: What are some good public sources of entropy (preferably immutable and chaotic), and absolutely referencable, that could be concatenated together in a string and fed into md5? Some ideas are specific stock prices, temperature (from an honest source), the hashes contained in the bitcoin block-chain...

Check out xkcd's geohashing algorithm. I think it is pretty much what you are looking for.
http://wiki.xkcd.com/geohashing/Implementations
The geohashing algorithm uses the DOW Jones as a source of entropy. This page discusses ways to get the Dow's opening price via the web.
http://wiki.xkcd.com/geohashing/Dow_Jones_Industrial_Average
But I think the best source of public, immutable, and verifiable entropy can be found in the BitCoin transaction database. It is widely distributed and continuously verified and has a defined protocol.

Get it from a physics department.
http://qrng.physik.hu-berlin.de/
http://qrng.physik.hu-berlin.de/download
or just
http://www.random.org/bytes/
that users can, if they wish, verify that the number is, in fact, generated from public sources of entropy
How do they do that?
Do you give them realtime access to the system's memory to ensure that the assembly of the program running that collects entropy is correct and not malicious?

The security value of using physical entropy is that it's unpredictable, i.e. unknown to anyone but the acquirer. What on earth would be the point of using entropy that could be available to anyone? May as well open up your printout of Pi to a million places and pick a starting point.
Quite apart from that, there is in principle no way to determine whether the random numbers a server gives you were in fact derived from the sources of entropy it apparently uses.

Related

Hashing passwords into database [duplicate]

I read the Wikipedia article about md5 hashes but I still can't understand how a hash can't be "reconstituted" back to the original text.
Could someone explain to someone who knows very little about cryptography how this works? What part of the function makes it one-way?
Since everyone until now has simply defined what a hash function was, I will bite.
A one-way function is not just a hash function -- a function that loses information -- but a function f for which, given an image y ("SE" or 294 in existing answers), it is difficult to find a pre-image x such that f(x)=y.
This is why they are called one-way: you can compute an image but you can't find a pre-image for a given image.
None of the ordinary hash function proposed until now in existing answers have this property. None of them are one-way cryptographic hash functions. For instance, given "SE", you can easily pick up the input "SXXXE", an input with the property that X-encode("SXXXE")=SE.
There are no "simple" one-way functions. They have to mix their inputs so well that not only you don't recognize the input at all in the output, but you don't recognize another input either.
SHA-1 and MD5 used to be popular one-way functions but they are both nearly broken (specialist know how to create pre-images for given images, or are nearly able to do so). There is a contest underway to choose a new standard one, which will be named SHA-3.
An obvious approach to invert a one-way function would be to compute many images and keep them in a table associating to each image the pre-image that produced it. To make this impossible in practice, all one-way function have a large output, at least 64 bits but possibly much larger (up to, say, 512 bits).
EDIT: How do most cryptographic hash functions work?
Usually they have at their core a single function that does complicated transformations on a block of bits (a block cipher). The function should be nearly bijective (it shouldn't map too many sequences to the same image, because that would cause weaknesses later) but it doesn't have to be exactly bijective. And this function is iterated a fixed number of times, enough to make the input (or any possible input) impossible to recognize.
Take the example of Skein, one of the strong candidates for the SHA-3 context. Its core function is iterated 72 times. The only number of iterations for which the creators of the function know how to sometimes relate the outputs to some inputs is 25. They say it has a "safety factor" of 2.9.
Think of a really basic hash - for the input string, return the sum of the ASCII values of each character.
hash( 'abc' ) = ascii('a')+ascii('b')+ascii('c')
= 97 + 98 + 99
= 294
Now, given the hash value of 294, can you tell what the original string was? Obviously not, because 'abc' and 'cba' (and countless others) give the same hash value.
Cryptographic hash functions work the same way, except that obviously the algorithm is much more complex. There are always going to be collisions, but if you know string s hashes to h, then it should be very difficult ("computationally infeasible") to construct another string that also hashes to h.
Shooting for a simple analogy here instead of a complex explanation.
To start with, let's break the subject down into two parts, one-way operations and hashing. What is a one-way operation and why would you want one?
One way operations are called that because they are not reversible. Most typical operations like addition and multiplication can be reversed while modulo division can not be reversed. Why is that important? Because you want to provide a output value which 1) is difficult to duplicate without the original inputs and 2) provides no way to figure out the inputs from the output.
Reversible
Addition:
4 + 3 = 7
This can be reversed by taking the sum and subtracting one of the addends
7 - 3 = 4
Multiplication:
4 * 5 = 20
This can be reversed by taking the product and dividing by one of the factors
20 / 4 = 5
Not Reversible
Modulo division:
22 % 7 = 1
This can not be reversed because there is no operation that you can do to the quotient and the dividend to reconstitute the divisor (or vice versa).
Can you find an operation to fill in where the '?' is?
1 ? 7 = 22
1 ? 22 = 7
With that being said, one-way hash functions have the same mathematical quality as modulo division.
Why is this important?
Lets say I gave you a key to a locker in a bus terminal that has one thousand lockers and asked you to deliver it to my banker. Being the smart guy you are, not to mention suspicious, you would immediately look on the key to see what locker number is written on the key. Knowing this, I've done a few devious things; first I found two numbers that when divided using modulo division gives me a number in the range between 1 and 1000, second I erased the original number and written on it the divisor from the pair of numbers, second I chose a bus terminal that has a guard protecting the lockers from miscreants by only letting people try one locker a day with their key, third the banker already knows the dividend so when he gets the key he can do the math and figure out the remainder and know which locker to open.
If I choose the operands wisely I can get near to a one-to-one relationship between the quotient and the dividend which forces you to try each locker because the answer spreads the results of the possible inputs over the range of desired numbers, the lockers available in the terminal. Basically, it means you can't acquire any knowledge about the remainder even if you know one of the operands.
So, now I can 'trust' you to deliver the key to its rightful owner without worrying that you can easily guess to which locker it belongs. Sure, you could brute force search all the lockers but that would take almost 3 years, plenty of time for my banker to use the key and empty the locker.
See the other answers for more specifics on the different hash functions.
Here's a very simple example. Assume that I'm a beginning cryptographer and I create a hash function that does the following:
int SimpleHash(file) {
return 0 if file.length is even;
return 1 if file.length is odd;
}
Now here's the test. SimpleHash(specialFile) is 0. What was my original file?
Obviously, there's no way to know (although you could likely discover pretty easily that my hash is based on file length). There is no way to "reconstitute" my file based on the hash because the hash doesn't contain everything that my file did.
In simple terms, a hash function works by making a big tangled mess of the input data.
See MD5 for instance. It processes input data by 512-bit blocks. Each block is split into 16 32-bit words. There are 64 steps, each step using one of the 16 input words. So each word is used four times within the course of the algorithm. This is where one-wayness comes from: any input bit is input at several places, and between two such inputs the function mixes all the current data together so that each input bit impacts most of the 128-bit running state. This prevents you from inverting the function, or computing a collision, by looking at only a part of the data. You have to look at the whole 128 bits, and the space of 128-bit blocks is too wide to be efficiently walked through.
Now MD5 does not do a good job at it, since collisions for that function can be found. From a cryptographer point of view, MD5 is a rotated encryption function. The processing of one message block M (512 bits) uses an input state V (a 128-bit value) and computes the new state V' as V' = V + E(M, V) where '+' is a word-wise addition, and 'E' happens to be a symmetric encryption function (aka a 'block cipher') which uses M as key and V as the message to be encrypted. From a closer look, E can is a kind of "extended Feistel network", similar to the DES block cipher, with four quarters instead of two halves. Details are not important here; my point is that what makes a "good" hash function, among hash functions which use that structure (called "Merkle-Damgård"), is similar to what makes a block cipher "secure". The successful collision attacks on MD5 use differential cryptanalysis, a tool which was designed to attack block ciphers in the first place.
From a good block cipher to a good hash function, there is a step which is not to be dismissed. With the Merkle-Damgård structure, the hash function is secure if the underlying block cipher is resistant to "related key attacks", a rather obscure property against which block ciphers are rarely strengthened because, for symmetric encryption, related key attacks barely have any practical impact. For instance, the AES encryption turned out not to be as resistant to related key attacks as could be wished for, and this did not trigger general panic. That resistance was not part of the properties which were sought for when AES was designed. It just prevents turning the AES into a hash function. There is a hash function called Whirlpool, which builds on a derivate of Rijndael, "Rijndael" being the initial name of what became the AES; but Whirlpool takes care to modify the parts of Rijndael which are weak to related key attacks.
Also, there are other structures which can be used for building a hash function. The current standard functions (MD5, SHA-1, and the "SHA-2" family, aka SHA-224, SHA-256, SHA-384 and SHA-512) are Merkle-Damgård functions, but many of the would-be successors are not. There is an ongoing competition, organized by the NIST (the US federal organization which deals with that kind of things), to select a new standard hash function, dubbed "SHA-3". See this page for details. Right now, they are down to 14 candidates from an initial 51 (not counting a dozen extra which failed the administrative test of sending a complete submission with code which compiles and runs properly).
Let's now have a more conceptual look. A secure hash function should look like a random oracle: an oracle is a black box which, when given a message M as input, outputs an answer h(M) which is chosen at random, uniformly, in the output space (i.e. all n-bit strings if the hash function output length is n). If given the same message M again as input, the oracle outputs the same value than previously. Apart from that restriction, the output of the oracle on a non previously used input M is unpredictable. One can imagine the oracle as a container for a gnome who throws dice, and carefully records the input messages and corresponding outputs in a big book, so that he will honor his oracle contract. There is no way to predict what the next output will be since the gnome himself does not know that.
If a random oracle exists, then inverting the hash function has cost 2^n: in order to have a given output, there is no better strategy than using distinct input messages until one yields the expected value. Due to the uniform random selection, probability of success is 1/(2^n) at each try, and the average number of requests to the dice-throwing gnome will be 2^n. For collisions (finding two distinct inputs which yields the same hash value), the cost is about 1.42^(n/2)* (roughly speaking, with 1.42^(n/2)* outputs, we can assemble about 2^n pairs of output, each having a probability of 1/(2^n) of matching, i.e. having two distinct inputs which have the same output). These are the best that can be done with a random oracle.
Therefore, we look for hash functions which are as good as a random oracle: they must mix the input data in such a way that we cannot find a collision more efficiently than what it would cost to simply invoke the function 2^(n/2) times. The bane of hash function is mathematical structure, i.e. shortcuts which allow the attacker to view the hash function internal state (which is big, at least n bits) as a variation on a mathematical object which lives in a much shorter space. 30 years of public research on symmetric encryption systems have produced a whole paraphernalia of notions and tools (diffusion, avalanche, differentials, linearity...) which can be applied. Bottom-line, however, is that we have no proof that a random oracle may actually exist. We want a hash function which cannot be attacked. What we have are hash function candidates, for which no attack is currently known, and, somewhat better, we have some functions for which some kinds of attack can be proven not to work.
There is still some research to be done.
A hash is a (very) lossy encoding.
To give you a simpler example, imagine a fictitious 2-letter encoding of a 5-letter word called the X-encoding. The algorithm for the X-encoding is simple: take the first and last letters of the word.
So,
X-encode( SAUCE ) = SE
X-encode( BLOCK ) = BK
Clearly, you cannot reconstruct SAUCE from its encoding SE (assuming our range of possible inputs is all 5-letter words). The word could just as easily be SPACE.
As an aside, the fact that SAUCE and SPACE both produce SE as an encoding is called a collision, and you can see that the X-ecoding wouldn't make a very good hash. :)
array
With some squinting, associative arrays look very much like hashes. The major differences were the lack of the % symbol on hash names, and that one could only assign to them one key at a time. Thus, one would say $foo{'key'} = 1;, but only #keys = keys(foo);. Familiar functions like each, keys, and values worked as they do now (and delete was added in Perl 2).
Perl 3 had three whole data types: it had the % symbol on hash names, allowed an entire hash to be assigned to at once, and added dbmopen (now deprecated in favour of tie). Perl 4 used comma-separated hash keys to emulate multidimensional arrays (which are now better handled with array references).
Perl 5 took the giant leap of referring to associative arrays as hashes. (As far as I know, it is the first language to have referred to the data structure thus, rather than "hash table" or something similar.) Somewhat ironically, it also moved the relevant code from hash.c into hv.c.
Nomenclature
Dictionaries, as explained earlier, are unordered collections of values indexed by unique keys. They are sometimes called associative arrays or maps. They can be implemented in several ways, one of which is by using a data structure known as a hash table (and this is what Perl refers to as a hash).
Perl's use of the term "hash" is the source of some potential confusion, because the output of a hashing function is also sometimes called a hash (especially in cryptographic contexts), and because hash tables aren't usually called hashes anywhere else.
To be on the safe side, refer to the data structure as a hash table, and use the term "hash" only in obvious, Perl-specific contexts.

Long random string generator in PHP

Does anyone know how to generate a long (e.g. 280 characters) random string in PHP without having to use a for loop that will loop through characters 280 times? I need it in order to create a custom session ID.
The PHPSESSID is not secure enough in my opinion being too short and not too random. I know Facebook and Twitter, use long session IDs (150, 550 chars respectively).
There could be an option to use MD5 strings or Bcrypt encryption of different string such as PHPSESSID, host, User-Agent etc. but I'm not sure this is the right way of doing it.
If you're asking a question like that, it probably means you don't know anything about cryptography or security. Trying to generate a "long random string" because, as you say, "The PHPSESSID is not secure enough" will probably lead you to a custom and insecure implementation.
Generating a random string is IMPOSSIBLE, at least not with your current hardware: you may approximate a fair pseudorandom generator but that is only useful for educational purposes.
PHP's Session ID generation algorithm is fairly efficient; if you think it is not secure enough, then you'll likely waste time making it better. You may probably want to use a different authentication mechanism if you are looking at maximum security (using a client certificate for example).
If websites such as Twitter, Facebook, or another site with similar traffic use longer session IDs, it may be not because it is more secure (well in a way), but rather because it avoids conflicts.
Finally, if you want a longer session ID without trying to write your own algorithm, you should use the following PHP configuration directive:
session.hash_function which can take any hash algorithm known by PHP.
You may also want to use session.bits_per_characters to shorten or lengthen the string. Note that if you do this, the string may be longer or shorter, but the data remains the same -- only represented differently (base 16, base 32, etc.)
Additional info:
You may also increase the entropy by using a custom source (file) and setting the length of the seed:
ini_set("session.hash_function", "sha512");
ini_set("session.bits_per_charater", 4); // 4 means hex
ini_set("session.entropy_file", "/dev/urandom");
ini_set("session.entropy_length", "512");
Can it be binary hexadecimal?
bin2hex(mcrypt_create_iv(140, MCRYPT_DEV_URANDOM));
Or if you have access to hash_pbkdf2():
hash_pbkdf2('sha256', PHPSESSID, mt_rand(), 1, 280, false);
You'll probably find that not all the bits in a "long" session id are purely random - they might provide additional information to the application to enable rapid extraction of key information.
PHP session ids by default aren't too bad, and can be improved by providing some extra entropy (e.g. from /dev/urandom)
Look at php_session_create_id in ext/session/session.c in the php source
It goes like this:
get time of day
get remote ip address
build a string with the seconds and microseconds from the current time, along with the IP address
feed that into configured session hash function (either MD5 or SHA1)
if configured, feed some additional randomness from an entropy file
generate final hash value
So getting a duplicate or predicting a session id is pretty difficult.
Make yourself a long string and shuffle it.
$string = 'asdfhdfiadfshsdfDSGFADSFDSFDSFER524353452345';
$new_string = shuffle($string);

Entropy in CSRF Tokens and Nonces

I am making a classified ads site with Zend Framework (for portfolio purposes, yes I know the world doesn't have room for "yet another Craigslist clone"). I am trying to implement the ability to post/edit/delete without ever needing an account.
To do this, I feel like I need to have a Nonce generated upon post submission and stored in the database. Then email a link to the user which makes a GET request for the delete, like this:
http://www.somesite.com/post/delete/?id=123&nonce=2JDXS93JFKS8204HJTHSLDH230945HSLDF
Only the user has this unique key or nonce, and upon submission I check the database under the post's ID and ensure the nonce matches prior to deleting.
My issue is how secure the nonce actually is. If I use Zend Framework's Zend_Form_Element_Hash, it creates the hash like this:
protected function _generateHash()
{
$this->_hash = md5(
mt_rand(1,1000000)
. $this->getSalt()
. $this->getName()
. mt_rand(1,1000000)
);
$this->setValue($this->_hash);
}
In reading about mt_rand(), one commenter said "This function has limited entrophy. So, if you want to create random string, it will produce only about 2 billion different strings, no matter the length of the string. This can be serous security issue if you are using such strings for session indentifiers, passwords etc."
Due to the lifetime of the nonce/token in the application, which could be days or weeks before user chooses to delete post, I think more than enough time would be given for a potential hack.
I realize mt_rand() is a huge upgrade from rand() as seen in this visual mapping pixels with rand on the left, and mt_rand on the right. But is it enough? What makes "2 billion different strings" a security issue?
And ultimately, how can I increase the entropy of a nonce/token/hash?
For such security it's not only important how long your output is. It counts how much randomness you've used to create it.
For mt_rand() the source of randomness is its seed and state (number of times you've used it since it was seeded). More mt_rand() calls will just give you more rehasing of the same randomness source (no new entropy).
mt_rand()'s seed is only 32-bit (anything less than 128bit makes cryptographers suspicious ;)
Strength of a keys with 32-bits of entropy is 4 billion divided by (roughly) number of keys you'll generate (e.g. after 100K uses there will be ~1:43000 chance to guess any valid key, which approaches practical brute-forcing).
You're adding salt to this, which makes it much stronger, because in addition to guessing the seed attacker would have to know the salt as well, so if the salt is long, then overall the key may be quite strong despite "low" entropy.
To increase entropy you need to add more random stuff (even slightly random is OK too, just gives less bits) from different sources than mt_rand: microtime(), amount of memory used, process ID... or just use /dev/random, which collects all entropy it can get.
(edit: uniqid() has weak entropy, so it won't help here)
The Zend hash generating code above's input for the md5() hashing function has 1,000,000 X 1,000,000 different possibilities. md5() has 32^16 (1208925819614629174706176) possible outcomes no matter what the input is. On average, the hacker would need to send 500,000,000,000 requests to your server in order to guess the right nonce.
At 100 requests per minute, that's about 3472222 days to hack.

Generate cryptographically secure random numbers in php

PHP's rand() function doesn't give good random numbers. So I started to use mt_rand() which is said to give better results. But how good are these results? Are there any methods to improve them again?
My idea:
function rand_best($min, $max) {
$generated = array();
for ($i = 0; $i < 100; $i++) {
$generated[] = mt_rand($min, $max);
}
shuffle($generated);
$position = mt_rand(0, 99);
return $generated[$position];
}
This should give you "perfect" random numbers, shouldn't it?
Pseudorandom number generators (PRNG) are very complex beast.
There are no real "perfect" random number generators -- in fact the best that can be done from mathematical functions are pseudorandom -- they seem random enough for most intents and purposes.
In fact, performing any additional actions from a number returned by a PRNG doesn't really increase its randomness, and in fact, the number can become less random.
So, my best advice is, don't mess around with values returned from a PRNG. Use a PRNG that is good enough for the intended use, and if it isn't, then find a PRNG that can produce better results, if necessary.
And frankly, it appears that the mt_rand function uses the Mersenne twister, which is a pretty good PRNG as it is, so it's probably going to be good enough for most casual use.
However, Mersenne Twister is not designed to be used in any security contexts. See this answer for a solution to use when you need randomness to ensure security.
Edit
There was a question in the comments why performing operations on a random number can make it less random. For example, some PRNGs can return more consistent, less random numbers in different parts of the bits -- the high-end can be more random than the low-end.
Therefore, in operations where the high-end is discarded, and the low end is returned, the value can become less random than the original value returned from the PRNG.
I can't find a good explanation at the moment, but I based that from the Java documentation for the Random.nextInt(int) method, which is designed to create a fairly random value in a specified range. That method takes into account the difference in randomness of the parts of the value, so it can return a better random number compared to more naive implementations such as rand() % range.
Quick answer:
In a new PHP7 there is a finally a support for a cryptographically secure pseudo-random integers.
int random_int ( int $min , int $max )
There is also a polyfill for PHP5x.
Longer answer
There is no perfect random number generator, and computers use pseudorandom number generator to create sequences that looks random. The sequences look random (and pass some randomness tests) but because there is some algorithm to generate it, you can repeat algorithm with absolutely the same states and get the same result.
The same advice as with cryptography "do not invent your own cypher" can be translated to random number generators and mean that you can not just get a lot of random number generators combined together and get expect to get a better generator.
One of the subsets of random number generators is cryptographically secure random number generators:
The requirements of an ordinary PRNG are also satisfied by a
cryptographically secure PRNG, but the reverse is not true. CSPRNG
requirements fall into two groups: first, that they pass statistical
randomness tests; and secondly, that they hold up well under serious
attack, even when part of their initial or running state becomes
available to an attacker
So this is pretty close to your definition of "perfect". One more time under no condition (except of learning how to do cryptography) you should try to implement one of that algorithms and use it in your system.
But luckily PHP7 has it implemented,
int random_int ( int $min , int $max )
Generates cryptographic random integers that are suitable for use
where unbiased results are critical (i.e. shuffling a Poker deck).
The sources of random are as follows:
On Windows CryptGenRandom() is used exclusively
arc4random_buf() is used if it is available (generally BSD specific)
/dev/arandom is used where available
The getrandom(2) syscall (on newer Linux kernels)
/dev/urandom is used where none of the above is available
This makes all the previous answers obsolete (and some deprecated).
I'm not sure that what you've done "improves" the randomness. From what I can understand you generate 100 random numbers and then randomly pick one of them.
From what I can remember from my probability course, this probably doesn't increase the randomness, as if there is an underlying bias in the generator function (mt_rand()), then it will still be reflected somehow in the output.
In what way is mt_rand() "bad"?
For example: If it favors a certain number. Lets say mt_rand(1, 10) favours low numbers in the range, ie "1" and "2" occurs on average more than 10% each. Then your "improvement" would still suffer from the same problem.
Selecting a random number out of a faulty sequence will still be faulty.
<?php
function random_number(){
return 4; // return generated number
// guaranteed to be random
}
?>
All joking aside, you're getting into a philosophical question of what is "random" or what is "best". Ideally you'd want your random numbers to have few patterns in them over the course of your procedure. Generally system time is used as the seed, but I've also used the previous random number as the seed, the previous random numberth ago as the seed. The problem is, with a powerful enough computer and full knowledge of the hardware running, and generator function, you would be able to predict the entire set of numbers generated. Thus if you had a powerful enough computer (some people put God into this category) that knew all possible variables and functions of the universe you would then be able to predict every event that happened or will happen. Most random number generators are fine on their own but if you know someone who can see the patterns, more likely they are like the guy in Beautiful Mind and you should get them checked into a clinic.
By popular demand :D
I wrote a cronjob that gets 1000 numbers from random.org periodically (say, once an hour) and added them into a PHP array. Whenever I want random numbers in my script, I use mt_rand(0,1000) to call a number from that. A few extra microseconds of overhead, but I get truly random numbers based on natural atmospheric noise.
It all depends what for you need that random number :)
For me ShuffleBag is the best one :)
Edit: My comment is no longer valid. Please see the following answer: https://stackoverflow.com/a/31443898/109561
I'm guessing you're worried about the distribution of mt_rand(). I have tested it and it is very level and both bounds are inclusive.
I added my test to the comments of the documentation for mt_rand() on the php manual, but it was removed by a silly moderator due to politics that are too long winded to go into here.
If you don't like PHP's built in rand(), you probably shouldn't use their built-in shuffle() either, since it seems to be built on their rand().
I am halfway sure the "industry standard" shuffle now is the Fisher-Yates shuffle.
There is no such thing as a "perfect" random number. No matter what subjective definition of "perfect" you have. You can only achieve pseudo-random.
I was simply trying to point you in the right direction. You asked a question about perfect random numbers, even if perfect was in quotes. And yes, you can improve randomness. You can even implement heuristic or "natural" algorithms, such ideas like "atmospheric noise" -- but still, you're not perfect, not by any means.
use /dev/ramdom (linux device true random number generator) to seed mt_rand
<?
$rnd_dev=mcrypt_create_iv(4, MCRYPT_DEV_RANDOM); //need "apt-get install php5-mcrypt"
$seed=ord(substr($rnd_dev, 0, 1))<<24 |
ord(substr($rnd_dev, 1, 1))<<16 |
ord(substr($rnd_dev, 2, 1))<<8 |
ord(substr($rnd_dev, 3, 1));
mt_srand($seed);
echo mt_rand();
?>
I made a PHP class for generating random numbers and strings PHPRandomValue
It uses "mcrypt_create_iv(4, MCRYPT_DEV_URANDOM)" to generate random numbers and values. I made it while working on a crypto project because I needed a safe random value generator. Here's an example usage
$randomValue = new RandomValue;
$randomValue->randomNumber(): = -3880998
$randomValue->randomNumberBetween(1,10): = 2
$randomValue->randomTextString(): = CfCkKDHRgUULdGWcSqP4
$randomValue->randomTextString(10): = LorPIxaeEY
$randomValue->randomKey(): = C7al8tX9.gqYLf2ImVt/!$NOY79T5sNCT/6Q.$!.6Gf/Q5zpa3
$randomValue->randomKey(10): = RDV.dc6Ai/
It is not possible to generate true random numbers, the best you can hope for is pseudo-random which is what rand() provides, your function is no closer to random then rand(). Take a look at this http://en.wikipedia.org/wiki/Random_number_generator
Tru Random numbers
<?php
for ($i = -1; $i <= 4; $i++) {
$bytes = openssl_random_pseudo_bytes($i, $cstrong);
$hex = bin2hex($bytes);
echo "Lengths: Bytes: $i and Hex: " . strlen($hex) . PHP_EOL;
var_dump($hex);
var_dump($cstrong);
echo PHP_EOL;
}
?>
and also crypto secure ;)
Although the answer was accepted years ago, I'll re-reopen it.
Since all this randomness depends on the system time, let's mess with the system time too! The amount of time an operation takes on the computer is actually rather variable (especially if other stuff is happening on that server), so if we take advantage of that with microtime... (couldn't find any portable nanotime commands)
$a='';
for (int $i=0; $i<9001; $i++)
{
usleep(mt_rand(1000,10000));//Also eliminates timing attacks... possibly?
$a=hash('SHA512',$a.uniqid(mt_rand().microtime(),true));
}
echo $a;
Nominally this has 207023 bits of entropy, since you're adding another 23 bits every iteration, but there's a lot of interdependencies, so it's probably a few orders of magnitude less. Still pretty good.
Do you know of any operations on PHP that take a really random amount of time? Like... HTTP-requesting some website (other than RANDOM.org) and measuring the time it takes?
Using random.org, you can use this:
function getToken($length, $min, $max){
$r = explode('
',file_get_contents('http://www.random.org/integers/num='.$length.'&min='.$min.'&max='.$max.'&col=1&base=10&format=plain'));
$string = '';
foreach ( $r as $char ) $string.=$char;
return $string;
}
this should give real random numbers

Better Random Generating PHP

I know that just using rand() is predictable, if you know what you're doing, and have access to the server.
I have a project that is highly dependent upon choosing a random number that is as unpredictable as possible. So I'm looking for suggestions, either other built-in functions or user functions, that can generate a better random number.
I used this to do a little test:
$i = 0;
while($i < 10000){
$rand = rand(0, 100);
if(!isset($array[$rand])){
$array[$rand] = 1;
} else {
$array[$rand]++;
}
sort($array);
$i++;
}
I found the results to be evenly distributed, and there is an odd pattern to the number of times each number is generated.
Adding, multiplying, or truncating a poor random source will give you a poor random result. See Introduction to Randomness and Random Numbers for an explanation.
You're right about PHP rand() function. See the second figure on Statistical Analysis for a striking illustration. (The first figure is striking, but it's been drawn by Scott Adams, not plotted with rand()).
One solution is to use a true random generator such as random.org. Another, if you're on Linux/BSD/etc. is to use /dev/random. If the randomness is mission critical, you will have to use a hardware random generator.
random.org has an API you can access via HTTP.
RANDOM.ORG is a true random number service that generates randomness
via atmospheric noise.
I would be wary of the impression of randomness: there have been many experiments where people would choose the less random distribution. It seems the mind is not very good at producing or estimating randomness.
There are good articles on randomness at Fourmilab, including another true random generator. Maybe you could get random data from both sites so if one is down you still have the other.
Fourmilab also provides a test program to check randomness. You could use it to check your various myRand() programs.
As for your last program, if you generate 10000 values, why don't you choose the final value amongst the 10 thousand? You restrict yourself to a subset. Also, it won't work if your $min and $max are greater than 10000.
Anyway, the randomness you need depends on your application. rand() will be OK for an online game, but not OK for cryptography (anything not thoroughly tested with statistical programs will not be suitable for cryptography anyway). You be the judge!
Another way of getting random numbers, similar in concept to getting UUID
PHP Version 5.3 and above
openssl_random_pseudo_bytes(...)
Or you can try the following library using RFC4122
Variation on #KG, using the milliseconds since EPOCH as the seed for rand?
A new PHP7 there is a function that does exactly what you needed: it generates cryptographically secure pseudo-random integers.
int random_int ( int $min , int $max )
Generates cryptographic random integers that are suitable for use
where unbiased results are critical (i.e. shuffling a Poker deck).
For a more detailed explanation about PRNG and CSPRNG (and their difference) as well as why your original approach is actually a bad idea, please read my another highly similar answer.

Categories