I have some strings containing alphanumeric values, say
asdf1234,
qwerty//2345
etc.
I want to generate a specific constant number for each string. The number should not match the number generated for any other string.
Does it have to be a number?
You could simply hash the string, which would give you a value that is unique for all practical purposes.
echo md5('any string in here');
Note: This is a one-way hash; the original string cannot be recovered from it.
Passwords are stored on the same principle (with a hash function and a 'salt', though in PHP a dedicated function such as password_hash() is preferable to plain md5 for that purpose). Checking a password is then done by hashing the input and comparing it to the stored hash.
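As a hedged illustration of that check, the sketch below uses PHP's built-in password_hash() and password_verify() (available since PHP 5.5) rather than md5. The password literals are made up for the example.

```php
<?php
// Store only the salted hash; password_hash() generates the salt itself.
$stored = password_hash('s3cret', PASSWORD_DEFAULT);

// Checking re-hashes the input with the stored salt and compares the results.
var_dump(password_verify('s3cret', $stored)); // bool(true)
var_dump(password_verify('wrong', $stored));  // bool(false)
```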
edit: md5 hashes are 32 hexadecimal characters in length.
Take a look at other hash functions:
http://us3.php.net/manual/en/function.crc32.php (returns a number, possibly negative on 32-bit platforms)
http://us3.php.net/manual/en/function.sha1.php (40 characters)
You can use a hashing function like md5, but that's not very interesting.
Instead, you can turn the string into its sequence of ASCII codes (since you said that it's alphanumeric). The result can easily be converted back, its length follows from the string's length (exactly length*3 digits), it has zero chance of collision since it is just another representation of the same data, and it always consists only of digits. It's also a little more interesting. Example code:
function encode($string) {
    $ans = array();
    $string = str_split($string);
    #go through every character, changing it to its ASCII value
    for ($i = 0; $i < count($string); $i++) {
        #ord turns a character into its ASCII value;
        #str_pad makes sure it's always 3 digits long (e.g. 9 becomes 009)
        $ans[] = str_pad((string) ord($string[$i]), 3, '0', STR_PAD_LEFT);
    }
    #turn it into a string
    return implode('', $ans);
}
function decode($string) {
    $ans = '';
    $string = str_split($string);
    $chars = array();
    #reconstruct each character code from its three digits
    for ($i = 0; $i < count($string); $i+=3)
        $chars[] = $string[$i] . $string[$i+1] . $string[$i+2];
    #chr turns a single ASCII value back into its character
    for ($i = 0; $i < count($chars); $i++)
        $ans .= chr((int) $chars[$i]);
    return $ans;
}
Example:
$original = 'asdf1234';
#will echo
#097115100102049050051052
$encoded = encode($original);
echo $encoded . "\n";
#will echo asdf1234
$decoded = decode($encoded);
echo $decoded . "\n";
echo $original === $decoded; #echoes 1, meaning true
You're looking for a hash function, such as md5. You probably want to pass it the $raw_output=true parameter to get access to the raw bytes, then cast them to whatever representation you want the number in.
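A minimal sketch of that idea, assuming a 64-bit PHP build and that a 32-bit result is acceptable. The helper name stringToUint32 is hypothetical, and keeping only 4 of the 16 digest bytes makes collisions far more likely than with the full hash.

```php
<?php
// Hypothetical helper: map a string to a non-negative integer using the
// first 4 raw bytes of its MD5 digest.
function stringToUint32(string $str): int
{
    $raw = md5($str, true);                     // 16 raw bytes instead of 32 hex characters
    $parts = unpack('N', substr($raw, 0, 4));   // 'N' = unsigned 32-bit, big-endian
    return $parts[1];
}

echo stringToUint32('asdf1234'), "\n";
echo stringToUint32('qwerty//2345'), "\n";
```

The same input always yields the same integer, which is the property the question asks for.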
A cryptographic hash function will give you a different number for each input string, but it's a rather large number — 20 bytes in the case of SHA-1, for example. In principle it's possible for two strings to produce the same hash value, but the chance of it happening is so extremely small that it's considered negligible.
If you want a smaller number — say, a 32-bit integer — then you can't use a hash function because the probability of collision is too high. Instead, you'll need to keep a record of all the mappings you've established. Make a database table that associates strings with numbers, and each time you're given a string, look it up in the table. If you find it there, return the associated number. If not, choose a new number that isn't used by any of the existing records, and add the new string and number to the table.
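The lookup-or-assign logic described above can be sketched as follows. The class StringNumberRegistry is hypothetical and keeps the mappings in an in-memory array purely for illustration; a real system would use a database table with a unique index on the string column, as the answer suggests.

```php
<?php
// In-memory stand-in for the database table (an assumption for this sketch).
final class StringNumberRegistry
{
    /** @var array<string,int> string => assigned number */
    private array $numbers = [];
    private int $next = 1;

    public function numberFor(string $str): int
    {
        // Known string: return its existing number. New string: assign the next unused one.
        if (!isset($this->numbers[$str])) {
            $this->numbers[$str] = $this->next++;
        }
        return $this->numbers[$str];
    }
}

$registry = new StringNumberRegistry();
echo $registry->numberFor('asdf1234'), "\n";     // 1
echo $registry->numberFor('qwerty//2345'), "\n"; // 2
echo $registry->numberFor('asdf1234'), "\n";     // 1 again
```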
Related
I have many strings. Each string something like:
"i_love_pizza_123"
"whatever_this_is_now_later"
"programming_is_awesome"
"stack_overflow_ftw"
...etc
I need to be able to convert each string to a random number, 1-10. Each time that string gets converted, it should consistently be the same number. A sampling of strings, even with similar text should result in a fairly even spread of values 1-10.
My first thought was to do something like md5($string), then break down a-f,0-9 into ten roughly-equal groups, determine where the first character of the hash falls, and put it in that group. But mapping 16 possible characters down to 10 groups (for example by multiplying by 0.625) makes the spread uneven.
Thoughts on a good method to consistently convert a string to a random/repeatable number, 1-10? There has to be an easier way.
Here's a quick demo of how you can do it.
function getOneToTenHash($str) {
    $hash = hash('sha256', $str, true);
    $unpacked = unpack("L", $hash); // convert first 4 bytes of hash to 32-bit unsigned int
    $val = $unpacked[1];
    return ($val % 10) + 1; // get 1 - 10 value
}
for ($i = 0; $i < 100; $i++) {
    echo getOneToTenHash('str' . $i) . "\n";
}
How it works:
Basically, you take the output of a hash function and downscale it to the desired range (1..10 in this case).
In the example above, I used the sha256 hash function, which returns 32 bytes of binary data. Then I extract just the first 4 bytes as an integer value (unpack()).
At this point I have a 4-byte integer value (0..4294967295 range). To downscale it to the 1..10 range, I just take the remainder of division by 10 (0..9) and add 1.
It's not the only way to downscale the range, but it is an easy one.
So, the above example consists of 3 steps:
1. get the hash value
2. convert the hash value to an integer
3. downscale the integer range
A much shorter example uses the crc32() function, which returns an integer right away and so lets us omit step 2:
function getOneToTenHash($str) {
    $int = crc32($str); // 0..4294967295
    return ($int % 10) + 1; // 1..10
}
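To check how even the spread is, one can count how many inputs land in each bucket. The sketch below assumes a 64-bit PHP build (where crc32() is never negative). The function name bucketFor and the 'str0' to 'str9999' test inputs are made up for the example.

```php
<?php
// Same downscaling as above, under a hypothetical name to avoid clashing
// with the functions already shown.
function bucketFor(string $str): int
{
    return (crc32($str) % 10) + 1; // 1..10
}

// Count how many of 10000 sample strings fall into each bucket.
$counts = array_fill(1, 10, 0);
for ($i = 0; $i < 10000; $i++) {
    $counts[bucketFor('str' . $i)]++;
}

foreach ($counts as $bucket => $count) {
    echo $bucket, ': ', $count, "\n";
}
```

If the hash mixes well, each bucket should end up with roughly a tenth of the inputs.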
The code below may be what you want:
$inStr = "hello world";
$md5Str = md5($inStr);
$len = strlen($md5Str);
$out = 0;
for ($i = 0; $i < $len; $i++) {
    // hexdec() maps each hex digit to 0..15 (intval() would turn a-f into 0);
    // reducing modulo 10 at every step keeps $out from overflowing
    $out = (7 * $out + hexdec($md5Str[$i])) % 10;
}
$out = $out + 1; // range = [1, 10]
I was trying to make an alphanumeric string and use it for a unique field in my database (note that it is not a replacement for the primary key). The following code generates a 22-character string, but my concern is whether it will keep producing unique strings, as I might need it to uniquely identify the data.
<?php
$len =22;
$rand = substr(str_shuffle(md5(time())),0,$len);
echo $rand;
?>
Use openssl_random_pseudo_bytes(), which generates a pseudo-random string of bytes, together with bin2hex(), which converts binary data to its hexadecimal representation.
This gives you a secure token (note that the hex string is twice as long as $length bytes):
bin2hex(openssl_random_pseudo_bytes($length))
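A minimal sketch of the same approach for the 22-character case from the question, assuming PHP 7 or later, where random_bytes() is available as a built-in alternative to openssl_random_pseudo_bytes(). The helper name randomHexToken is hypothetical.

```php
<?php
// Hypothetical helper: return a random hex token of exactly $hexLength characters.
function randomHexToken(int $hexLength = 22): string
{
    // Each random byte becomes two hex characters, so round the byte count up.
    $bytes = random_bytes(intdiv($hexLength + 1, 2));
    return substr(bin2hex($bytes), 0, $hexLength);
}

echo randomHexToken(), "\n";
```

Random tokens of this length are very unlikely to repeat, but a unique index on the database column is still the only hard guarantee.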
I would always include time() in the resulting string to make sure it's unique, if it is acceptable to you that the first 10 characters are all numeric:
$rand = substr(time().str_shuffle(md5(time())),0,$len);
str_shuffle(md5(time())) is very unlikely to produce the same result twice within one second.
This is the easiest way, short of manually checking the existing records for the random string.
You can use PHP's built-in uniqid() function.
You can try the following:
$random = 'abcdefghijklmnopqrstuvwxyz0123456789';
$string = '';
for ($i = 0; $i < $string_length; $i++) {
    $string .= $random[rand(0, strlen($random) - 1)];
}
$string_length is the length of your desired string. The strings are random, so duplicates become unlikely as the length grows, but they are not guaranteed to be unique.
I need to convert a random text to a number, but the same text always has to be converted to the same number. For example:
xxxx -> 10
testing -> 396
stackoverflow -> 72
I can't use the number of characters to convert the string, because two strings with the same number of characters need to have different numbers (most of the time, at least).
The number does not need to be in a range. It can be any number, as long as it is always the same for a given string.
You could try using hashes (md5, sha1, etc):
$number = hexdec( md5("hello world") );
$number = hexdec( sha1("hello world") );
Hashes of the same string will transform to the same number. Note that hexdec() returns a float for values this large, so the low-order digits are lost.
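If an exact integer is preferred over a float, one option (a sketch, assuming a 64-bit PHP build) is to convert only a prefix of the hash. Fifteen hex digits are 60 bits, which fits in a PHP integer. The 15-digit cut-off is a choice made for this example, not part of the answer above.

```php
<?php
$hash = md5('hello world');

// The full 128-bit hash does not fit in an integer, so hexdec() falls back to a float.
var_dump(is_float(hexdec($hash)));      // bool(true)

// A 15-digit prefix (60 bits) converts to an exact integer.
$number = hexdec(substr($hash, 0, 15));
var_dump(is_int($number));              // bool(true) on 64-bit PHP
echo $number, "\n";
```

Shortening the hash this way trades some collision resistance for a value that can be stored and compared as a plain integer.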
What about:
$number = crc32($string);
It should be cheap, gives integer output, and produces reasonable randomness for your use case.
Other methods that have been shown can produce collisions. The following cannot.
$num = "";
for ($i = 0; $i < strlen($str); $i++)
    $num .= str_pad(ord($str[$i]), 3, "0", STR_PAD_LEFT);
return $num;
I found this function, which uses openssl_random_pseudo_bytes, on php.net.
function generate_password($length = 24) {
    if (function_exists('openssl_random_pseudo_bytes')) {
        $password = base64_encode(openssl_random_pseudo_bytes($length, $strong));
        if ($strong == TRUE)
            return substr($password, 0, $length); //base64 is about 33% longer, so we need to truncate the result
    }
    # fallback to mt_rand if php < 5.3 or no openssl available
    $characters = '0123456789';
    $characters .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz/+';
    $charactersLength = strlen($characters)-1;
    $password = '';
    # select some random characters
    for ($i = 0; $i < $length; $i++) {
        $password .= $characters[mt_rand(0, $charactersLength)];
    }
    return $password;
}
But I want to be sure whether the value it generates is repeatable or not. I am looking for a result which is not repeatable.
I have been told that the value from mt_rand() is not repeatable. But shouldn't a random number be repeatable, as the name mt_rand itself implies?
EDIT:
For instance, I just tested it and it generated W8hhkS+ngIl7DxxFDxEx6gSn. If it generates the same value again in the future, then it is repeatable.
openssl_random_pseudo_bytes is not repeatable in practice. It is true that any output of finite length must eventually repeat: for example, if you specify a length of 1 byte, there are only 256 possible strings, so you must get a string you had before by the 257th attempt at the latest (and likely quite a bit sooner).
But you're talking practically and not mathematically, and have a default length of 24.
Yes, picking random 24-byte strings will eventually give you a string that you had before, but that's only true in the mathematical universe. In the real physical universe, there are 6277101735386680763835789423207666416102355444464034512896 (2^192) possible such strings, which means that even if you generated billions of passwords this way every second, you'd still not be likely to get the same string twice in a million years.
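The claim can be sanity-checked with the usual birthday-bound approximation, p ≈ n² / (2N) for n draws from N equally likely values. The sketch below assumes exactly one billion strings per second for one million years; both figures are illustrative readings of "billions" and "a million years".

```php
<?php
// Birthday-bound estimate of the chance of at least one repeat.
$possible  = 2 ** 192;                  // 24 random bytes; overflows to a float (about 6.28e57)
$perSecond = 1e9;                       // assumption: one billion strings per second
$seconds   = 1e6 * 365.25 * 24 * 3600;  // one million years in seconds
$generated = $perSecond * $seconds;     // total number of strings drawn

$p = ($generated ** 2) / (2 * $possible);
printf("%.1e\n", $p);                   // a value on the order of 1e-13
```

Even under these assumptions the estimated probability of a single repeat stays far below one in a trillion.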
Any random function, including the one above, is repeatable.
Perhaps you're looking for the PHP function uniqid?
I'm building a simple URL shortening script. I want to hash the URL to serve as a unique ID, but if I used something like MD5 the URL wouldn't be very short.
Is there a hashing function, or any other way, to create a unique ID that's only 4 or 5 characters long?
Use auto-incrementing integers and convert them into identifiers made up of letters (lowercase and uppercase) and digits to shorten them:
function ShortURL($integer, $chr='abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') {
    // $chr holds all the characters you want to use in the URLs;
    // the number of characters is the base
    $base = strlen($chr);
    $string = '';
    do {
        // take the remainder of the integer with respect to the base
        $remainder = $integer % $base;
        // prepend the character at that index, so that the most significant
        // digit comes first (the order LongURL expects)
        $string = $chr[$remainder] . $string;
        // subtract the remainder and divide the result by the base
        $integer = ($integer - $remainder) / $base;
    } while ($integer > 0); // continue until the integer reaches 0
    return $string;
}
and the corresponding function to get them back to integers:
function LongURL($string, $chr='abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') {
    // this just reverses everything that was done in the other function;
    // one important thing to note is to use the same $chr as you did in ShortURL
    $array = array_flip(str_split($chr));
    $base = strlen($chr);
    $integer = 0;
    $length = strlen($string);
    for ($c = 0; $c < $length; ++$c) {
        $integer += $array[$string[$c]] * pow($base, $length - $c - 1);
    }
    return $integer;
}
Hashing will cause collisions. Just use an auto-incrementing value, encoded with alphanumeric characters to compress it. That is how most URL shorteners work.
niklas's answer below is wonderfully done.
The advantage of using MD5 (or equivalent methods) is that the number of possibilities is so large that you can, for all practical purposes, assume that the value is unique. To ensure that a 4-digit random-like ID is unique would require a database to track existing IDs.
Essentially you have to repeatedly generate IDs and check against the DB.
You could always just keep the first 5 characters of an MD5 and, if that ID already exists, add a random value to the URL string and retry until you get a unique one.
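A minimal sketch of that generate-and-check loop. The function name uniqueShortId is hypothetical, and the $existing array merely stands in for the database lookup; a real implementation would also need a unique index to guard against two requests picking the same ID at once.

```php
<?php
// Hypothetical helper: derive a short ID from the URL, retrying with a random
// suffix while the candidate is already taken.
// $existing maps taken IDs to true (an in-memory stand-in for the database).
function uniqueShortId(string $url, array $existing, int $length = 5): string
{
    $candidate = substr(md5($url), 0, $length);
    while (isset($existing[$candidate])) {
        // Collision: mix a random value into the input and try again.
        $candidate = substr(md5($url . random_int(0, PHP_INT_MAX)), 0, $length);
    }
    return $candidate;
}

// Pretend the plain 5-character prefix for this URL is already in use.
$taken = [substr(md5('http://example.com/'), 0, 5) => true];
$id = uniqueShortId('http://example.com/', $taken);
echo $id, "\n";
```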
I copied the code as originally posted and ran it, and the two string functions appeared to work in opposite directions: converting the ShortURL output back with LongURL gave a different number. ShortURL built the string least significant digit first, whereas LongURL reads the most significant digit first, so the string had to be reversed before being fed into LongURL (equivalently, ShortURL should prepend each character rather than append it).
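A self-contained sketch of the round trip with the digit order made consistent. The names BASE62_CHARS, toBase62 and fromBase62 are hypothetical; the alphabet is the one used in the answer.

```php
<?php
const BASE62_CHARS = 'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Encode a non-negative integer, most significant digit first.
function toBase62(int $n): string
{
    $chars = BASE62_CHARS;
    $base = strlen($chars);
    $out = '';
    do {
        $out = $chars[$n % $base] . $out; // prepend, so later (higher) digits end up in front
        $n = intdiv($n, $base);
    } while ($n > 0);
    return $out;
}

// Decode by reading digits left to right (most significant first).
function fromBase62(string $s): int
{
    $chars = BASE62_CHARS;
    $base = strlen($chars);
    $n = 0;
    for ($i = 0, $len = strlen($s); $i < $len; $i++) {
        $n = $n * $base + strpos($chars, $s[$i]);
    }
    return $n;
}

echo toBase62(62), "\n";             // ba
echo fromBase62(toBase62(62)), "\n"; // 62
```

Both functions must agree on the alphabet and on the digit order. Here the encoder prepends and the decoder reads left to right, so every value decodes back to itself.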