I want to generate the profile ids in my software. The mt_rand function works well, but I need the ids to be exactly 10 digits long. Currently I am looping through mt_rand outputs until I get a 10-digit number. The problem I am facing now is that most of the profile ids start with 1 and some with 2, and none with any other digit. I understand this happens because of mt_rand's range: it can't produce 10-digit numbers that start with 3 or more.
This is what I am currently doing
for($i = 0; $i < 200; $i++){
    $num = mt_rand();
    if(strlen($num) == 10) echo $num."<br>";
}
If you run the above code you will see all numbers start from either 1 or 2. Any way to fix this?
Edit: I guess I can just flip the numbers but some numbers end with zero and this seems like a bit of a hack anyways. But then again, random number generation is a hack in itself I guess.
just start your IDs at 1000000001 , then ID 2 at 1000000002 , ID 543 at 1000000543 , and so on?
alternatively, keep calling mt_rand(1000000001,min((PHP_INT_SIZE>4 ? intval("9999999999",10): PHP_INT_MAX),mt_getrandmax())) until you get an ID which does not already exist in your database? (this will be more and more cpu intensive as your db grows larger and larger.. when it's almost full, i wouldn't be surprised if it took billions of iterations and several minutes..)
To elaborate on Rizier's suggestion, the only way to ensure any string (even a string of numbers) fits a given mold for length and rules is to generate it one character at a time and then fit them together
$str = '';
for($loop = 0; $loop < 10; $loop++) {
    $str .= mt_rand(0,9);
}
echo $str;
You can then add rules to this. Maybe you don't want a leading 0 so you can add a rule for that. Maybe you want letters too. This will always give you a random string with the rules you want.
You can see this in action here http://3v4l.org/kIRdV
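As a minimal sketch of the "no leading 0" rule mentioned above, assuming the id is stored as a string: the function name generateProfileId is made up for illustration, and uniqueness would still have to be checked against the database.

```php
<?php
// Hypothetical helper: builds a fixed-length numeric id one digit at a time.
function generateProfileId(int $length = 10): string
{
    // First digit is drawn from 1-9 so the id never starts with 0.
    $id = (string) mt_rand(1, 9);
    for ($i = 1; $i < $length; $i++) {
        $id .= mt_rand(0, 9);
    }
    return $id;
}

echo generateProfileId(), "\n";
```

Because every digit is drawn separately, ids starting with 3 through 9 come up as often as those starting with 1 or 2.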
Related
I need to generate a random number or hash that will be the same each time based on a string. This is done easily enough with the function crc32, however, I need it to be an integer between a range because the random number will be picking an item out of an array.
Here's the code I have so far:
$min=0;
$max=count($myarray);
$number = crc32("Joe Jones");
$rnd = '.'.(string)$number;
//(Int((max-min+1)*Rnd+min))
$rand = round(($max-$min+1)*$rnd+$min);
echo $rand;
It seems to work, but it always picks lower numbers. It never picks the higher numbers.
Just use mod (%). For any non-negative $x, $x % $n will give an output between 0 and $n-1 (in PHP a negative $x gives a negative remainder).
$myArray=range(1,1000);
$max=count($myArray); //1000
$number = crc32("Joe Jones"); //2559948711
$rand=$number % $max; //711
Also just a note about crc32: It may return a negative number if you run it on a 32 bit platform, so you may optionally want to do abs(crc32($input))
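Putting the two notes together, a small sketch of a string-to-index helper could look like this. The name stableIndex and the sample array are assumptions for illustration only.

```php
<?php
// Hypothetical helper: maps a string to a stable index in [0, $count - 1].
function stableIndex(string $key, int $count): int
{
    // abs() guards against negative crc32() results on 32-bit builds.
    return abs(crc32($key)) % $count;
}

$items = ['red', 'green', 'blue', 'yellow'];
echo $items[stableIndex('Joe Jones', count($items))], "\n";
```

The same key always lands on the same element as long as the array length does not change.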
Your crc32 function is producing a negative number. Change the line as follows:
$number = abs(crc32("Joe Jones"));
This turns that negative number into a positive one. Also, you might want to consider multiplying that number if your array count is low. How high that goes is up to you.
I need to create a function which takes a single integer as argument in the range 0-N and returns a seemingly random number in the same range.
Each input number should always have exactly one output and it should always be the same.
Such a function would produce something like this:
f(1) = 4
f(2) = 1
f(3) = 5
f(4) = 2
f(5) = 3
I believe this could be accomplished by some kind of a hashing algorithm? I don't need anything complex, just not something too simple like f(1) = 2, f(2) = 3 etc.
The biggest issue is that I need this to be reversible. E.g. the above table should be true left-to-right as well as right-to-left, using a different function for the right-to-left conversion is fine.
I know the easiest way is to create an array, shuffle it and just store the relations in a db or something, but as I need N to be quite large I'd like to avoid this if possible.
Edit: For my particular case N is a specific number, it's exactly 16777216 (64^4).
If the range is always a power of two -- like [0,16777216) -- then you can use exclusive-or just as @MarkBaker suggested. It just doesn't work so easily if your range is not a power of two.
You can use addition and subtraction modulo N, although these alone are too obvious, so you have to combine it with something else.
You can also do multiplication modulo-N, but reversing that is complicated. To make it simpler, we can isolate the bottom eight bits and multiply those and add them in a way that doesn't interfere with those bits so we can use them again to reverse the operation.
I don't know PHP so I'm going to give an example in C, instead. Maybe it's the same.
int enc(int x) {
    x = x + 4799 * 256 * (x % 256);
    x = x + 8896843;
    x = x ^ 4777277;
    return (x + 1073741824) % 16777216;
}
And to decode, play the operations back in reverse order:
int dec(int x) {
    x = x + 1073741824;
    x = x ^ 4777277;
    x = x - 8896843;
    x = x - 4799 * 256 * (x % 256);
    return x % 16777216;
}
That 1073741824 must be a multiple of N, and 256 must be a factor of N, and if N is not a power of two then you can't (necessarily) use exclusive-or (^ is exclusive-or in C and I assume in PHP too). The other numbers you can fiddle with, and add and remove stages, at your leisure.
The addition of 1073741824 in both functions is to ensure that x stays positive; this is so that the modulo operation doesn't ever give a negative result, even after we've subtracted values from x which might have made it go negative in the interim.
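Since the question is about PHP, here is a rough PHP port of the two C functions above. It is a direct translation under the assumption that the input is an integer in [0, 16777216); the constants are unchanged from the C version.

```php
<?php
// PHP translation of the C enc() above; expects 0 <= $x < 16777216.
function enc(int $x): int
{
    $x += 4799 * 256 * ($x % 256); // adds a multiple of 256, so the low 8 bits survive
    $x += 8896843;
    $x ^= 4777277;
    return ($x + 1073741824) % 16777216;
}

// Plays the same steps back in reverse order.
function dec(int $x): int
{
    $x += 1073741824;              // keeps $x positive through the subtractions
    $x ^= 4777277;
    $x -= 8896843;
    $x -= 4799 * 256 * ($x % 256);
    return $x % 16777216;
}

$y = enc(1);
echo $y, ' -> ', dec($y), "\n";
```

All intermediate values stay below 2^31, so this should behave the same on 32-bit and 64-bit PHP builds.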
I offered to describe how I "randomly" scramble up 9-digit SSNs when producing research data sets. This does not replace or hash an SSN. It re-orders the digits. It is difficult to put the digits back in the correct order if you don't know the order in which they were scrambled. I have a gut feeling that this is not what the questioner really wants. So, I am happy to delete this answer if it is deemed off-topic.
I know that I have 9 digits. So, I start with an array that has 9 index values in order:
$a = array(0,1,2,3,4,5,6,7,8);
Now, I need to turn a key that I can remember into a way to shuffle the array. The shuffling has to be the same order for the same key every time. I use a couple tricks. I use crc32 to turn a word into a number. I use srand/rand to get a predictable order of random values. Note: mt_rand no longer produces the same sequence of random digits with the same seed, so I have to use rand.
srand(crc32("My secret key"));
usort($a, function($a, $b) { return rand(-1,1); });
The array $a still has the digits 0 through 8, but they are shuffled. If I use the same keyword I will get the same shuffled order every time. That lets me repeat this every month and get the same result. Then, with a shuffled array, I can pick the digits off the SSN. First, I ensure it has 9 characters (some SSNs are sent as integers and a leading 0 is omitted). Then, I build a masked SSN by picking the digits using $a.
$ssn = str_pad($ssn, 9, '0', STR_PAD_LEFT);
$masked_ssn = '';
foreach($a as $i) $masked_ssn .= $ssn[$i]; // square brackets: the {} string offset syntax was removed in PHP 8
$masked_ssn will now have all the digits in $ssn, but in a different order. Technically, there are keywords that make $a become the original ordered array after shuffling, but that is very very rare.
Hopefully this makes sense. If so, you can do it all much faster. If you turn the original string into an array of characters, you can shuffle the array of characters. You just need to reseed rand every time.
$ssn = "111223333"; // Assume I'm using a proper 9-digit SSN
$a = str_split($ssn);
srand(crc32("My secret key"));
usort($a, function($a, $b) { return rand(-1,1); });
$masked_ssn = implode('', $a);
This is not really faster at runtime, because rand is a rather expensive function and you run rand a hell of a lot more here. If you are masking thousands of values as I do, you will want to use an index array that is shuffled just once, not a shuffling for every value.
Now, how do I undo it? Assume I'm using the first method with the index array. It will be something like $a = {5, 3, 6, 1, 0, 2, 7, 8, 4}. Those are the indexes for the original SSN in the masked order. So, I can easily build the original SSN.
$ssn = '000000000'; // I like to define all 9 characters before I start
foreach($a as $i=>$j) $ssn[$j] = $masked_ssn[$i]; // square brackets: the {} string offset syntax was removed in PHP 8
As you can see, $i counts from 0 to 8 across the masked SSN. $j counts 5, 3, 6... and puts each value from the masked SSN in the correct place in the original SSN.
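To make the round trip concrete, here is a small sketch that masks and unmasks a value with the fixed index order from the example above. The function names maskSsn and unmaskSsn are made up for illustration, and the index array is hard-coded rather than produced by the keyed shuffle.

```php
<?php
// Index order from the example above; in practice it comes from the keyed shuffle.
$order = [5, 3, 6, 1, 0, 2, 7, 8, 4];

// Picks the digits of $ssn in the order given by $order.
function maskSsn(string $ssn, array $order): string
{
    $ssn = str_pad($ssn, 9, '0', STR_PAD_LEFT);
    $masked = '';
    foreach ($order as $i) {
        $masked .= $ssn[$i];
    }
    return $masked;
}

// Puts each masked digit back at its original position.
function unmaskSsn(string $masked, array $order): string
{
    $ssn = '000000000';
    foreach ($order as $i => $j) {
        $ssn[$j] = $masked[$i];
    }
    return $ssn;
}

$masked = maskSsn('123456789', $order);
echo $masked, "\n";                    // 647213895
echo unmaskSsn($masked, $order), "\n"; // 123456789
```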
Looks like you've got a good answer, but there is still an alternative. A Linear Congruential Generator (LCG) can provide a 1-to-1 mapping, and it is known to be reversible using Euclid's algorithm. For 24 bits:
X_i = (A * X_{i-1} + C) mod M
where M = 2^24 = 16,777,216
A = 16,598,013
C = 12,820,163
For LCG reversibility, take a look at Reversible pseudo-random sequence generator
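A sketch of how the forward and backward steps might look in PHP, using the constants above. The function names are hypothetical, the code assumes 64-bit PHP (the product A * X exceeds 32 bits), and the input is assumed to lie in [0, M).

```php
<?php
// Extended Euclid: returns the inverse of $a modulo $m (assumes gcd($a, $m) == 1).
function modInverse(int $a, int $m): int
{
    [$oldR, $r] = [$a, $m];
    [$oldS, $s] = [1, 0];
    while ($r !== 0) {
        $q = intdiv($oldR, $r);
        [$oldR, $r] = [$r, $oldR - $q * $r];
        [$oldS, $s] = [$s, $oldS - $q * $s];
    }
    // Normalise the Bezout coefficient into [0, $m).
    return (($oldS % $m) + $m) % $m;
}

// One LCG step: X -> (A * X + C) mod M.
function lcgForward(int $x, int $a = 16598013, int $c = 12820163, int $m = 16777216): int
{
    return ($a * $x + $c) % $m;
}

// Inverse step: undo "+ C", then multiply by the inverse of A.
function lcgBackward(int $y, int $a = 16598013, int $c = 12820163, int $m = 16777216): int
{
    return (modInverse($a, $m) * (($y - $c + $m) % $m)) % $m;
}

$y = lcgForward(1);
echo $y, ' -> ', lcgBackward($y), "\n";
```

The inverse exists because A is odd and therefore coprime to M = 2^24. In real use the inverse would be computed once and cached rather than on every call.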
I am just wondering how unique mt_rand() numbers are if you draw 5-digit numbers.
In the example, I tried to get a list of 500 random numbers with this function and some of them are repeated.
http://www.php.net/manual/en/function.mt-rand.php
<?php
header('Content-Type: text/plain');
$errors = array();
$uniques = array();
for($i = 0; $i < 500; ++$i)
{
    $random_code = mt_rand(10000, 99999);
    if(!in_array($random_code, $uniques))
    {
        $uniques[] = $random_code;
    }
    else
    {
        $errors[] = $random_code;
    }
}
/**
* If you get any data in this array, it is not exactly unique
* Run this script for few times and you may see some repeats
*/
print_r($errors);
?>
How many digits may be required to ensure that the first 500 random numbers drawn in a loop are unique?
If numbers are truly random, then there's a probability that numbers will be repeated. It doesn't matter how many digits there are -- adding more digits makes it much less likely there will be a repeat, but it's always a possibility.
You're better off checking if there's a conflict, then looping until there isn't like so:
$uniques = array();
for($i = 0; $i < 500; $i++) {
    // Redraw until the code is not already in the list
    do {
        $code = mt_rand(10000, 99999);
    } while(in_array($code, $uniques));
    $uniques[] = $code;
}
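A variation on the same idea, sketched under the assumption that only the set of codes matters: using the codes as array keys avoids the linear in_array search on every draw. The helper name uniqueCodes is made up for illustration.

```php
<?php
// Hypothetical helper: draws $howMany distinct integers from [$min, $max].
function uniqueCodes(int $howMany, int $min, int $max): array
{
    if ($howMany > $max - $min + 1) {
        throw new InvalidArgumentException('Range is too small for that many unique codes');
    }
    $seen = [];
    while (count($seen) < $howMany) {
        // Array keys act as a set: a repeated code just overwrites its own key.
        $seen[mt_rand($min, $max)] = true;
    }
    return array_keys($seen);
}

$uniques = uniqueCodes(500, 10000, 99999);
echo count($uniques), "\n"; // 500
```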
Why not use range, shuffle, and slice?
<?php
$uniques = range(10000, 99999);
shuffle($uniques);
$uniques = array_slice($uniques, 0, 500);
print_r($uniques);
Output:
Array
(
[0] => 91652
[1] => 87559
[2] => 68494
[3] => 70561
[4] => 16514
[5] => 71605
[6] => 96725
[7] => 15908
[8] => 14923
[9] => 10752
[10] => 13816
*** truncated ***
)
This method is less expensive as it does not search the array each time to see if the item is already added or not. That said, it does make this approach less "random". More information should be provided on where these numbers are going to be used. If this is an online gambling site, this would be the worst! However if this was used in returning "lucky" numbers for a horoscope website, I think it would be fine.
Furthermore, this method could be extended by changing the shuffle step to use mt_rand (whereas the original method simply used rand). It could also use openssl_random_pseudo_bytes, but that might be overkill.
The birthday paradox is at play here. If you pick a random number from 10000-99999 500 times, there's a good chance of duplicates.
Intuitive idea with small numbers
If you flip a coin twice, you'll get a duplicate about half the time. If you roll a six-sided die twice, you'll get a duplicate 1/6 of the time. If you roll it 3 times, you'll get a duplicate 4/9 (44%) of the time. If you roll it 4 times you'll get at least one duplicate 13/18 (72.2%) of the time. Roll it a fifth time and it's 49/54 (90.7%). Roll it a sixth time and it's 98.5%. Roll it a seventh time and it's 100%.
If you replace the six-sided die with a 20-sided die, the probabilities grow a bit more slowly, but grow they do. After 3 rolls you have a 14.5% chance of duplicates. After 7 rolls it's 69.5%. After 11 rolls it's 96.7%, near certainty.
The math
Let's define a function f(num_rolls, num_sides) to generalize this to any number of rolls of any random number generator that chooses out of a finite set of choices. We'll define f(num_rolls, num_sides) to be the probability of getting no duplicates in num_rolls of a num_sides-side die.
Now we can try to build a recursive definition for this. To get num_rolls unique numbers, you'll need to first roll num_rolls-1 unique numbers, then roll one more unique number, now that num_rolls-1 numbers have been taken. Therefore
f(num_rolls, num_sides) =
f(num_rolls-1, num_sides) * (num_sides - (num_rolls - 1)) / num_sides
Alternately,
f(num_rolls + 1, num_sides) =
f(num_rolls, num_sides) * (num_sides - num_rolls) / num_sides
This function follows a logistic decay curve, starting at 1 and moving very slowly (since num_rolls is very low, the change with each step is very small), then slowly picking up speed as num_rolls grows, then eventually tapering off as the function's value gets closer and closer to 0.
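The recurrence can also be evaluated directly in a few lines of PHP. This is only a sketch; the function name probNoDuplicates is an assumption, and it unrolls the recursion into a loop.

```php
<?php
// Probability of NO duplicates in $rolls rolls of a $sides-sided die,
// computed iteratively from the recurrence above.
function probNoDuplicates(int $rolls, int $sides): float
{
    $p = 1.0;
    for ($k = 0; $k < $rolls; $k++) {
        if ($k >= $sides) {
            return 0.0; // more rolls than sides: a duplicate is certain
        }
        // $k values are already taken, so $sides - $k remain for this roll.
        $p *= ($sides - $k) / $sides;
    }
    return $p;
}

printf("%.2f\n", 1 - probNoDuplicates(3, 6));       // 0.44
printf("%.2f\n", 1 - probNoDuplicates(500, 90000)); // 0.75
```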
I've created a Google Docs spreadsheet that has this function built in as a formula to let you play with this here: https://docs.google.com/spreadsheets/d/1bNJ5RFBsXrBr_1BEXgWGein4iXtobsNjw9dCCVeI2_8
Tying this back to your specific problem
You've rolled a 90000-sided die 500 times. The spreadsheet above suggests you'd expect at least one duplicate pair about 75% of the time, assuming a perfectly random mt_rand. Mathematically, the operation your code was performing is choosing N elements from a set with replacement. In other words, you pick a random number out of the bag of 90000 things, write it down, then put it back in the bag, then pick another random number, and repeat 500 times. It sounds like you wanted all of the numbers to be distinct; in other words, you wanted to choose N elements from a set without replacement. There are a few algorithms to do this. Dave Chen's suggestion of shuffle and then slice is a relatively straightforward one. Josh from Qaribou's suggestion of separately rejecting duplicates is another possibility.
Your question deals with a variation of the "Birthday Problem" which asks if there are N students in a class, what is the probability that at least two students have the same birthday? See Wikipedia: The "Birthday Problem".
You can easily modify the formula shown there to answer your problem. Instead of having 365 equally probable possibilities for the birthday of each student, you have 90000 (=99999-10000+1) equally probable integers that can be generated between 10000 and 99999. The probability that if you generate n = 500 such numbers at least two of them will be the same is:
P(500) = 1 - 90000! / ( 90000^500 (90000 - 500)! ) = 0.75
So there is a 75% chance that at least two of the 500 numbers that you generate will be the same or, in other words, only a 25% chance that you will be successful in getting 500 different numbers with the method you are currently using.
As others here have already suggested, I would suggest checking for repeated numbers in your algorithm rather than just blindly generating random numbers and hoping that you don't have a match between any pair of numbers.
I want some generator script to generate unique numbers but not in one order. We need to sell tickets.
For example currently ticket numbers are like this:
100000
100001
100002
...
So the users can see how many are sold.
How can I generate unique numbers?
for example:
151647
457561
752163
...
I could use a random number generator, but then I would always have to check in the database that the number has not been generated before.
Hmm, maybe with an index on that column the check would not take long.
Right now I still have to fetch the last card number if I want to add 1 to it, but getting the last one is fast enough.
And the more tickets are sold, the bigger the chance that the RNG will generate an existing number, so there might be more checks in the future. So the best would be to take the last number and generate the next one from it.
Here's a simple way to scramble ticket numbers (note: you need 64-bit PHP, or change the code to use the bcmath library):
function scramble($number) {
    return (305914*($number-100000)+151647) % 999983;
}
Look, the output even looks like your example:
Input Output
------ ------
100000 151647
100001 457561
100002 763475
100003 069406
If you want to you can reverse it, so you can use these codes in URLs and then recover the original number:
function unscramble($number) {
    $k = (605673*($number-151647)) % 999983;
    if ($k < 0) $k += 999983; // PHP's % is negative when $number < 151647
    return $k + 100000;
}
Is this safe? Someone with access to many sequential numbers can find the pattern so don't use this if the ticket numbers are extremely sensitive.
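The reason the two functions undo each other is that the two multipliers are inverses modulo 999983. A quick sketch to check this on 64-bit PHP (the variable names are only for illustration):

```php
<?php
// 305914 and 605673 are inverses of each other modulo 999983.
var_dump((305914 * 605673) % 999983 === 1); // bool(true)

// Consequence: multiplying by one and then the other restores the offset.
$k = 3;                                   // ticket 100003 is offset 3 from 100000
$scrambled = (305914 * $k) % 999983;
$restored  = (605673 * $scrambled) % 999983;
var_dump($restored === $k);               // bool(true)
```

To pick different constants, choose any multiplier that is not a multiple of 999983 and compute its inverse with the extended Euclidean algorithm.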
Generate random numbers and put a unique index on the ticket number column. Insert the record with the new ticket; if the insert fails, you had a collision, so generate another id and try again. With a big enough random space, say a 32-bit integer, the chance of a collision is minimal, and the uniqueness check on an indexed numeric column is lightning fast.
You can also generate your numbers in advance and store them in a pool. When you need a new number, pick one at a random index of the pool, remove it from the pool and return it.
If the pool has nearly run out, just generate another batch.
function generateCode() {
    $chars = '0123456789'; // each digit once, so none is favoured
    do {
        $code = '';
        for ($x = 0; $x < 6; $x++) {
            $code .= $chars[ rand(0, strlen($chars)-1) ];
        }
        // codeExists() is a placeholder: check in your database here whether
        // this code has been generated earlier; if so, loop and try again
    } while (codeExists($code));
    return $code;
}
The easy way: you can simply use the md5() function.
And to get a 6-character string, you can do
$x = md5(microtime());
echo substr($x, 0, 6);
Edit:
session_start();
$x = md5(microtime().session_id());
echo substr($x, 0, 6);
I want to be able to generate a unique string id from an integer.
So for example '1' is converted to 'o7wu' and vice-versa. (use the number to search the DB and the string for display)
I found this great function: http://kevin.vanzonneveld.net/techblog/article/create_short_ids_with_php_like_youtube_or_tinyurl/, but the generated ids are really ugly for small numbers, for example
'1' is 'aacd' and '2' is 'aadd'. I also found http://blog.kevburnsjr.com/php-unique-hash, the generated ids look great, but they are 'one way' only, i think.
I don't really need any kind of encryption, I just need it to be short and pretty.
For those having trouble with my definition of a pretty ID: I define pretty as a real random mix of chars and ints. NOT: aabb33, abc123, aa22cc. YES: dfh7, ao8f, z6t4 .. and so on...
Random number generators can be hacked to give less random numbers. Actually, you can make them use the same numbers every time. If you set the seed to the same number, you will get the same sequence. Here's an example that creates a "random" string by passing a single number.
<?php
$n = 1;
// Any number to be added to the random seed
// Different numbers give different sequences
$offset = 45;
srand($n + $offset);
$code = '';
for($i=0; $i<4; $i++) {
    // Numbers and letters
    $char = rand(48, 57+26);
    // Lower case letters
    if ($char > 57) $char += 39;
    $code .= chr($char);
}
echo "The code is $code\n";
With this offset, 1 gives 'ghep', 2 gives 'tw70', 3 gives 'ob0c'. To increase the number of characters, change the $i<4 to a 5, or 13. To change the included characters, use a different range of ASCII codes.
I've tried the first link too and had little success with the code provided. I did some research and I think that YouTube IDs are basically just randomly generated strings (given that there are nearly 74 quintillion possible combinations, the chances of two IDs ever being alike is very slim). Below is a little PHP script that I use to create 11-character YouTube-style random IDs.
$length = 11;
$characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_-';
$random_id = "";
for ($i = 0; $length > $i; $i++) {
    $random_id .= $characters[mt_rand(0, strlen($characters) -1)];
}
The script above basically creates a list of 64 characters identical to those used by YouTube, randomly selects eleven of those characters and then concatenates them all within a for loop. I then store the random IDs generated in a column for random IDs which are searched for via the URL. It does not include encryption, but it is the closest thing I could come up with. You can change the length of the string by changing the $length variable and the characters used by changing the characters that make up the $characters variable.
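If the IDs should also be hard to guess, one possible variant of the script above swaps mt_rand for random_int, which is available from PHP 7 and draws from a cryptographically secure source. This is only a sketch; the function name randomId is made up, and uniqueness still has to be enforced when the ID is stored.

```php
<?php
// Hypothetical variant of the script above using random_int() (PHP 7+).
function randomId(int $length = 11): string
{
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_-';
    $max = strlen($characters) - 1;
    $id = '';
    for ($i = 0; $i < $length; $i++) {
        // Pick one of the 64 characters uniformly at random.
        $id .= $characters[random_int(0, $max)];
    }
    return $id;
}

echo randomId(), "\n";
```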
Little hint, if you are using a MySQL database, change the random ID column's collation to latin1_bin, that way you can include case sensitivity.