Convert string to consistent but random 1 of 10 options - php

I have many strings. Each string is something like:
"i_love_pizza_123"
"whatever_this_is_now_later"
"programming_is_awesome"
"stack_overflow_ftw"
...etc
I need to be able to convert each string to a random number, 1-10. Each time that string gets converted, it should consistently be the same number. A sampling of strings, even with similar text should result in a fairly even spread of values 1-10.
My first thought was to do something like md5($string), then break down a-f,0-9 into ten roughly-equal groups, determine where the first character of the hash falls, and put it in that group. But converting 16 values down to 10 by multiplying by 0.625 causes the spread to be uneven.
Thoughts on a good method to consistently convert a string to a random/repeatable number, 1-10? There has to be an easier way.

Here's a quick demo of how you can do it.
function getOneToTenHash($str) {
$hash = hash('sha256', $str, true);
$unpacked = unpack("L", $hash); // convert first 4 bytes of hash to 32-bit unsigned int
$val = $unpacked[1];
return ($val % 10) + 1; // get 1 - 10 value
}
for ($i = 0; $i < 100; $i++) {
echo getOneToTenHash('str' . $i) . "\n";
}
How it works:
Basically you get the output of a hash function and downscale it to desired range (1..10 in this case).
In the example above, I used the sha256 hash function, which returns 32 bytes of arbitrary binary data. Then I extract just the first 4 bytes as an integer value (unpack()).
At this point I have a 4-byte integer value (0..4294967295 range). In order to downscale it to the 1..10 range I just take the remainder of division by 10 (0..9) and add 1.
It's not the only way to downscale the range but an easy one.
So, the above example consists of 3 steps:
get the hash value
convert the hash value to integer
downscale integer range
A much shorter example uses the crc32() function, which returns an integer value right away and thus allows us to omit step 2:
function getOneToTenHash($str) {
$int = crc32($str); // 0..4294967295
return ($int % 10) + 1; // 1..10
}
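To see how even the spread is, the buckets can simply be counted over a batch of sample strings. The following is a minimal sketch, not part of the original answer: the function name bucketOneToTen and the sample keys are made up for illustration, and the sprintf('%u') step is an assumption added so that a negative crc32() result on a 32-bit build is treated as unsigned.

```php
<?php
// Hypothetical helper: map a string to a repeatable bucket in 1..10.
function bucketOneToTen(string $str): int
{
    // crc32() may be negative on 32-bit builds; '%u' renders it as unsigned.
    $unsigned = (float) sprintf('%u', crc32($str));
    // fmod() works on floats, so this also behaves on 32-bit builds.
    return (int) fmod($unsigned, 10) + 1;
}

// Count how many of 10000 sample strings land in each bucket.
$counts = array_fill(1, 10, 0);
for ($i = 0; $i < 10000; $i++) {
    $counts[bucketOneToTen('str' . $i)]++;
}
print_r($counts);
```

If the counts come out roughly equal, the simple modulo downscale is good enough for this purpose.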

The following may be what you want:
$inStr = "hello world";
$md5Str = md5($inStr);
$len = strlen($md5Str);
$out = 0;
for($i=0; $i<$len; $i++) {
// hexdec() maps each hex digit to 0-15 (intval() would turn a-f into 0);
// reducing modulo 10 at every step keeps $out from overflowing
$out = (7*$out + hexdec($md5Str[$i])) % 10;
}
$out = $out + 1; // range = [1,10]

Related

Generate an unique and random integer

I want to create user accounts with a public_id which is always a unique, integer random (not incremental) value.
I can use loops to check if the random integer is unique, but that doesn't seem like a really nice solution.
I found some alphanumeric generators, and I guess I could convert their output to integers using some string-to-integer converter, but is there an integer-specific way?
I also worry about possible collisions, but it looks like the chance will always be there in the long run?
You can either use one of the native PHP functions like mt_rand or use a more reliable way: generating the integer based on the microtime function.
To ensure that the value is unique you need to add a unique index on the column in the DB and add 'ON DUPLICATE KEY UPDATE' to your insert/update queries, which will add some digits to the value if it is not unique.
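The same idea can also be handled on the PHP side: generate a candidate, try to store it, and retry if the storage layer reports a duplicate. This is only a sketch under assumptions. The function generatePublicId and the $tryInsert callback are hypothetical; in a real application the callback would run an INSERT against a column with a UNIQUE index and return false on a duplicate-key error. Here an in-memory array stands in for the table.

```php
<?php
// Hypothetical helper: pick random 6-digit ids until one is accepted.
function generatePublicId(callable $tryInsert, int $maxAttempts = 10): int
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $candidate = random_int(100000, 999999);
        // $tryInsert returns true when the id was stored, false on a duplicate.
        if ($tryInsert($candidate)) {
            return $candidate;
        }
    }
    throw new RuntimeException('Could not find a free id');
}

// Stand-in for a table with a unique index (assumption for the demo).
$taken = [];
$tryInsert = function (int $id) use (&$taken): bool {
    if (isset($taken[$id])) {
        return false;
    }
    $taken[$id] = true;
    return true;
};

$first = generatePublicId($tryInsert);
$second = generatePublicId($tryInsert);
echo $first . "\n" . $second . "\n";
```

Retries stay rare as long as the id range is much larger than the number of accounts.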
There are 2 possible cases:
1) Your "long run" is really, really long, so you may exceed PHP_INT_MAX; in that case there is no integer-only way.
2) You stay within PHP_INT_MAX; then you need some storage for checking the ids.
In case 1 you can use the hashids library. To avoid collisions you'll need an incremental counter as input. You can then convert the resulting string back to an integer letter by letter.
In case 2 you can use an in-memory database such as redis for performance.
Using a timestamp does a good job, since it uses the current time to generate its numbers. You can also concatenate the output of the function below with other randomly generated numbers.
function passkey($format = 'u', $utimestamp = null){
if (is_null($utimestamp)) {
$utimestamp = microtime(true);
}
$timestamp = floor($utimestamp);
$milliseconds = round(($utimestamp - $timestamp) * 1000000);
return date(preg_replace('`(?<!\\\\)u`', $milliseconds, $format),$timestamp);
}
echo passkey(); // 728362
You can use a linear congruential generator with a large period.
Here is one that generates unique integers which always have 6 digits. It will not generate duplicates until it has generated all numbers between 100000 and 996722, which gives you almost 900 000 different numbers.
The condition is that you can provide the function the number it last generated. So if you store the number in the database, you have to somehow retrieve the last assigned one, so you can feed it to this function:
function random_id($prev) {
return 100000 + (($prev-100000)*97 + 356563) % 896723;
}
$prev = 100000; // must be a 6 digit number: the initial seed.
// Generate the first 10 pseudo-random integers.
for ($i = 0; $i < 10; $i++) {
$prev = random_id($prev);
echo $prev . "\n";
}
The above generation of the first 10 numbers yields:
456563
967700
331501
494085
123719
963860
855744
232445
749606
697735
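Before relying on a particular set of constants, it may be worth measuring how many ids the generator actually produces before it returns to the seed. A rough sketch follows; cycleLength is a hypothetical helper, not part of the answer. The loop terminates because 97 and 896723 share no common factor, so the mapping is a bijection and every value lies on a cycle.

```php
<?php
// The generator from the answer, unchanged apart from type hints.
function random_id(int $prev): int
{
    return 100000 + (($prev - 100000) * 97 + 356563) % 896723;
}

// Hypothetical helper: count the steps until the sequence returns to $seed.
function cycleLength(int $seed): int
{
    $steps = 1;
    $current = random_id($seed);
    while ($current !== $seed) {
        $current = random_id($current);
        $steps++;
    }
    return $steps;
}

echo cycleLength(100000) . "\n";
```

The printed value is the number of distinct ids available from that seed.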
You can do this for other ranges by following the rules in the referenced article on getting a full period in linear congruential generators. Concretely, if you want to generate numbers with n digits, where the first digit cannot be zero (so between 10^(n-1) and 10^n - 1), then I find it easiest to find a large prime just below 9⋅10^(n-1) to serve as the last number of the formula. The other two numbers can then be any positive integers, but it is better to keep the first one small to avoid overflow.
However, PHP integers are limited to PHP_INT_MAX (2147483647 on 32-bit builds, 9223372036854775807 on 64-bit builds), so on a 32-bit build you will need floating point arithmetic for numbers with 10 or more digits. The % operator should not be used then; use fmod instead.
For example, to generate numbers with 12 digits, you could use this formula:
function random_id($prev) {
return 100000000000 + fmod((($prev-100000000000)*97 + 344980016453), 899999999981);
}
$prev = 100000000000; // must be a 12 digit number: the initial seed.
// Generate the first 10 pseudo-random integers.
for ($i = 0; $i < 10; $i++) {
$prev = random_id($prev);
echo $prev . "\n";
}

How to get number of digits in both right, left sides of a decimal number

I wonder if there is a good way to get the number of digits on the right/left side of a decimal number in PHP. For example:
12345.789 -> RIGHT SIDE LENGTH IS 3 / LEFT SIDE LENGTH IS 5
I know it is readily attainable with string functions by exploding the number. What I mean is: is there a mathematical or programmatic way to do it that is better than string manipulation?
Your answers would be greatly appreciated.
Update
The best solution for left side till now was:
$left = floor(log10($x))+1;
but there is still no sufficient solution for the right side.
Still waiting ...
To get the digits on the left side you can do this:
$left = floor(log10($x))+1;
This uses the base 10 logarithm to get the number of digits.
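The log10 expression only makes sense for values of at least 1, since log10(0) is -INF and negative input gives NAN. A small wrapper can cover those cases. This is a sketch with an assumed convention: leftDigits is a hypothetical name, the sign is ignored, and anything below 1 is counted as having a single digit (the leading 0).

```php
<?php
// Hypothetical helper: digits to the left of the decimal point.
function leftDigits(float $x): int
{
    $x = abs($x); // ignore the sign
    if ($x < 1) {
        return 1; // assumed convention: 0.5 is written with one digit, "0"
    }
    return (int) floor(log10($x)) + 1;
}

echo leftDigits(12345.789) . "\n"; // 5
echo leftDigits(-42.0) . "\n";     // 2
echo leftDigits(0.5) . "\n";       // 1
```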
The right side is harder. A simple approach would look like this, but due to floating point numbers, it would often fail:
$decimal = $x - floor($x);
$right = 0;
while (floor($decimal) != $decimal) {
$right++;
$decimal *= 10; //will bring in floating point 'noise' over time
}
This will loop through multiplying by 10 until there are no digits past the decimal. That is tested with floor($decimal) != $decimal.
However, as Ali points out, giving it the number 155.11 (a number that is hard to represent in binary) results in an answer of 14. This is because the number is stored as something like 155.11000000000001, given the limited precision of a binary floating point value.
So instead, a more robust solution is needed. (PopoFibo's solution in this thread is particularly elegant, and uses PHP's inherent float comparison functions well.)
The fact is, we can never distinguish between an input of 155.11 and 155.11000000000001. We will never know which number was originally given. They will both be represented the same. However, if we define the number of zeroes that we can see in a row before we just decide the decimal is 'done', then we can come up with a solution:
$x = 155.11; //the number we are testing
$LIMIT = 10; //number of zeroes in a row until we say 'enough'
$right = 0; //number of digits we've checked
$empty = 0; //number of zeroes we've seen in a row
while (floor($x) != $x) {
$right++;
$base = floor($x); //so we can see what the next digit is;
$x *= 10;
$base *= 10;
$digit = floor($x) - $base; //the digit we are dealing with
if ($digit == 0) {
$empty += 1;
if ($empty == $LIMIT) {
$right -= $empty; //don't count all those zeroes
break; // exit the loop, we're done
}
} else {
$empty = 0; //a non-zero digit resets the run of zeroes
}
}
This should find the solution given the reasonable assumption that 10 zeroes in a row means any other digits just don't matter.
However, I still like PopoFibo's solution better: without any multiplication, PHP's default comparison functions effectively do the same thing, without the messiness.
I am lost on PHP semantics big time but I guess the following would serve your purpose without the String usage (that is at least how I would do in Java but hopefully cleaner):
Working code here: http://ideone.com/7BnsR3
Non-string solution (only Math)
Left side is resolved hence taking the cue from your question update:
$value = 12343525.34541;
$left = floor(log10($value))+1;
echo($left);
$num = floatval($value);
$right = 0;
while($num != round($num, $right)) {
$right++;
}
echo($right);
Prints
85
8 for the LHS and 5 for the RHS.
Since I'm taking a floatval, 155.0 is reported as having 0 digits on the RHS, which I think is valid; if you need the trailing zero counted, that can be resolved with string functions.
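Wrapped into a function, the round() comparison above might look like the sketch below. The name rightDigits and the $maxDigits cap are assumptions added here; the cap only guards against looping for a long time on values that never compare equal to a rounded copy of themselves.

```php
<?php
// Hypothetical helper: digits to the right of the decimal point.
function rightDigits(float $num, int $maxDigits = 15): int
{
    $right = 0;
    // Increase the precision until rounding no longer changes the value.
    while ($right < $maxDigits && $num != round($num, $right)) {
        $right++;
    }
    return $right;
}

echo rightDigits(12345.789) . "\n";
echo rightDigits(155.11) . "\n";
echo rightDigits(42.0) . "\n";
```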
php > $num = 12345.789;
php > $left = strlen(floor($num));
php > $right = strlen($num - floor($num));
php > echo "$left / $right\n";
5 / 16 <--- 16 digits, huh?
php > $parts = explode('.', $num);
php > var_dump($parts);
array(2) {
[0]=>
string(5) "12345"
[1]=>
string(3) "789"
}
As you can see, floats aren't the easiest to deal with... Doing it "mathematically" leads to bad results. Doing it by strings works, but makes you feel dirty.
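If the string route is acceptable, formatting the number with a fixed number of decimals and trimming the trailing zeros gives predictable results. The following is a sketch under assumptions: digitCounts is a hypothetical helper, and the cap of 10 decimal places is an arbitrary choice that hides floating point noise beyond that position.

```php
<?php
// Hypothetical helper: returns [digits left of the point, digits right of it].
function digitCounts(float $num): array
{
    // Fixed-point text with 10 decimals, then strip trailing zeros and a bare dot.
    $text = rtrim(rtrim(sprintf('%.10F', abs($num)), '0'), '.');
    // Pad so that whole numbers still yield an (empty) fraction part.
    [$whole, $fraction] = array_pad(explode('.', $text), 2, '');
    return [strlen($whole), strlen($fraction)];
}

print_r(digitCounts(12345.789)); // 5 and 3
print_r(digitCounts(1.05));      // 1 and 2
```

Unlike a "%d.%d" scan, this keeps leading zeros in the fraction.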
$number = 12345.789;
list($whole, $fraction) = sscanf($number, "%d.%d");
This works even if $number is an integer, and you get real integers back. One caveat: because the fraction is parsed with %d, leading zeros are dropped, so 12.05 yields a fraction of 5. Length is best determined with strlen(), even for integer values. The proposed log10() approach won't work for 10, 100, 1000, … as you might expect.
// 5 - 3
echo strlen($whole) , " - " , strlen($fraction);
If you really, really want to get the length without calling any string function here you go. But it's totally not efficient at all compared to strlen().
/**
* Get integer length.
*
* @param integer $integer
* The integer to count.
* @param boolean $count_zero [optional]
* Whether 0 is to be counted or not, defaults to FALSE.
* @return integer
* The integer's length.
*/
function get_int_length($integer, $count_zero = false) {
// 0 would be 1 in string mode! Highly depends on use case.
if ($count_zero === false && $integer === 0) {
return 0;
}
return floor(log10(abs($integer))) + 1;
}
// 5 - 3
echo get_int_length($whole) , " - " , get_int_length($fraction);
The above will correctly count the result of 1 / 3, but be aware that the precision is important.
$number = 1 / 3;
// Above code outputs
// string : 1 - 10
// math : 0 - 10
$number = bcdiv(1, 3);
// Above code outputs
// string : 1 - 0 <-- oops
// math : 0 - INF <-- 8-)
No problem there.
I would like to apply a simple logic.
<?php
$num=12345.789;
$num_str="".$num; // Converting number to string
$array=explode('.',$num_str); //Explode number (String) with .
echo "Left side length : ".intval(strlen($array[0])); // $array[0] contains left hand side then check the string length
echo "<br>";
if(sizeof($array)>1)
{
echo "Right side length : ".intval(strlen($array[1])); // $array[1] contains the right hand side; check its string length
}
?>

Convert random text to a number with PHP

I need to convert a random text to a number, and the same text always has to be converted to the same number. For example:
xxxx -> 10
testing -> 396
stackoverflow -> 72
I can't use the number of characters to convert the string, because two strings with the same number of characters need to get different numbers (most of the time, at least).
I do not need this number to be in a range. It can be any number, as long as it is always the same for a given string.
You could try using hashes (md5, sha1, etc):
$number = hexdec( md5("hello world") );
$number = hexdec( sha1("hello world") );
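Hashes of the same string will transform to the same number. Note that hexdec() returns a float once the value exceeds PHP_INT_MAX, so the low-order digits of a full md5 or sha1 hash are lost.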
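To stay within integer range, one option is to convert only a prefix of the hash. The sketch below assumes a 64-bit PHP build: 15 hex digits are 60 bits, which fits below PHP_INT_MAX there. The function name stringToInt is made up for illustration.

```php
<?php
// Hypothetical helper: repeatable non-negative integer derived from a string.
function stringToInt(string $text): int
{
    // 15 hex digits = 60 bits, small enough for a 64-bit PHP integer.
    return (int) hexdec(substr(md5($text), 0, 15));
}

echo stringToInt('testing') . "\n";
echo stringToInt('stackoverflow') . "\n";
```

Shortening the hash raises the collision probability compared with the full 128 bits, so this only suits cases where occasional collisions are acceptable.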
What about;
$number = crc32($string);
It should be cheap, gives integer output, and produces reasonable randomness for your use case.
Other methods that have been shown have the potential of having collisions. The following should not.
$num = "";
for($i = 0; $i < strlen($str); $i++)
$num .= str_pad(ord($str[$i]), 3, "0", STR_PAD_LEFT);
return $num;

php How to make a proper hash function that will handle given strings

I want to create a hash function that will receive strings and output the corresponding value in an array that has predefined "proportions". For instance, if my array holds the values:
[0] => "output number 1"
[1] => "output number 2"
[2] => "output number 3"
Then the hash function int H(string) should return only values in the range 0 to 2 for any given string (an input string will always return the same key).
The thing is that I also want it to respect predefined proportions so that, for instance,
85% of given strings will hash out as 0, 10% as 1 and 5% as 2. If there are functions that can emulate a normal distribution, that would be even better.
I also want it to be fast as it will run frequently. Can someone point me to the right direction on how to approach this in php? I believe I'm not the first one that asked this but I came short digging on SO for an hour.
EDIT:
What i did until now is built a hash function in c. It does the above hashing without proportions (still not comfortable with php):
int StringFcn (const void *key, size_t arrSize)
{
const char *str = key; /* keep const: the key is only read */
int totalAsciiVal = 0;
while(*str)
{
totalAsciiVal += *str++;
}
return totalAsciiVal % arrSize;
}
What about doing something like this:
// Hash the string so you can pretty much guarantee it will have a number in it and it is relatively "random"
$hash = sha1($string);
// Iterate through string and get ASCII values
$chars = str_split($hash);
$num = 0;
foreach ($chars as $char) {
$num += ord($char);
}
// Get compare modulo 100 of the number
if ($num % 100 < 85) {
return 0;
}
if ($num % 100 < 95) {
return 1;
}
return 2;
Edit:
Instead of hashing with sha1, you can get a sufficiently large integer directly using crc32 (thanks to @nivrig in the comments).
// Convert string to integer
$num = crc32($string);
// Get compare modulo 100 of the number
if ($num % 100 < 85) {
return 0;
}
if ($num % 100 < 95) {
return 1;
}
return 2;
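The fixed thresholds generalise to an arbitrary list of weights by walking cumulative sums. What follows is a sketch, not the answerer's code: weightedBucket and its $weights parameter are hypothetical names, and the sprintf('%u') step is an assumption added to keep the crc32() value non-negative on 32-bit builds.

```php
<?php
// Hypothetical helper: map a string to an index of $weights, proportionally.
function weightedBucket(string $key, array $weights): int
{
    $total = array_sum($weights);
    // Repeatable point in 0..$total-1 derived from the string.
    $point = (int) fmod((float) sprintf('%u', crc32($key)), $total);

    $cumulative = 0;
    foreach ($weights as $index => $weight) {
        $cumulative += $weight;
        if ($point < $cumulative) {
            return $index;
        }
    }
    return array_key_last($weights); // not reached when weights are positive
}

// 85% -> 0, 10% -> 1, 5% -> 2
echo weightedBucket('some input string', [85, 10, 5]) . "\n";
```

Changing the proportions then only means changing the array, not the branching.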

unique number from a string - php

I have some strings containing alpha numeric values, say
asdf1234,
qwerty//2345
etc..
I want to generate a specific constant number related with the string. The number should not match any number generated corresponding with other string..
Does it have to be a number?
You could simply hash the string, which would give you a practically unique value (collisions are possible in principle, but extremely unlikely).
echo md5('any string in here');
Note: This is a one-way hash, it cannot be converted from the hash back to the string.
This is similar to how passwords are typically stored (using a dedicated password hash function rather than plain md5, typically with a 'salt' added; PHP provides password_hash() and password_verify() for this). Checking a password is then done by hashing the input and comparing to the stored hash.
edit: md5 hashes are 32 characters in length.
Take a look at other hash functions:
http://us3.php.net/manual/en/function.crc32.php (returns a number, possibly negative)
http://us3.php.net/manual/en/function.sha1.php (40 characters)
You can use a hashing function like md5, but that's not very interesting.
Instead, you can turn the string into its sequence of ASCII characters (since you said that it's alpha-numeric) - that way, it can easily be converted back, corresponds to the string's length (length*3 to be exact), it has 0 collision chance, since it's just turning it to another representation, always a number and it's a little more interesting... Example code:
function encode($string) {
$ans = array();
$string = str_split($string);
#go through every character, changing it to its ASCII value
for ($i = 0; $i < count($string); $i++) {
#ord turns a character into its ASCII values
#make sure it's 3 characters long, even for codes below 10
$ascii = str_pad((string) ord($string[$i]), 3, '0', STR_PAD_LEFT);
$ans[] = $ascii;
}
#turn it into a string
return implode('', $ans);
}
function decode($string) {
$ans = '';
$string = str_split($string);
$chars = array();
#construct the characters by going over the three numbers
for ($i = 0; $i < count($string); $i+=3)
$chars[] = $string[$i] . $string[$i+1] . $string[$i+2];
#chr turns a single ASCII code into its character
for ($i = 0; $i < count($chars); $i++)
$ans .= chr((int) $chars[$i]);
return $ans;
}
Example:
$original = 'asdf1234';
#will echo
#097115100102049050051052
$encoded = encode($original);
echo $encoded . "\n";
#will echo asdf1234
$decoded = decode($encoded);
echo $decoded . "\n";
echo $original === $decoded; #echoes 1, meaning true
You're looking for a hash function, such as md5. You probably want to pass it the $raw_output=true parameter to get access to the raw bytes, then cast them to whatever representation you want the number in.
A cryptographic hash function will give you a different number for each input string, but it's a rather large number — 20 bytes in the case of SHA-1, for example. In principle it's possible for two strings to produce the same hash value, but the chance of it happening is so extremely small that it's considered negligible.
If you want a smaller number — say, a 32-bit integer — then you can't use a hash function because the probability of collision is too high. Instead, you'll need to keep a record of all the mappings you've established. Make a database table that associates strings with numbers, and each time you're given a string, look it up in the table. If you find it there, return the associated number. If not, choose a new number that isn't used by any of the existing records, and add the new string and number to the table.
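The lookup-table idea can be sketched in a few lines. In this illustration an in-memory array stands in for the database table, and the class name StringIdRegistry is hypothetical; a real version would persist the mapping and handle concurrent inserts.

```php
<?php
// Hypothetical registry: hands out one small integer per distinct string.
final class StringIdRegistry
{
    /** @var array<string, int> string => assigned number */
    private array $ids = [];
    private int $next = 1;

    public function idFor(string $text): int
    {
        // Known string: return the number assigned earlier.
        if (!isset($this->ids[$text])) {
            // New string: assign the next unused number.
            $this->ids[$text] = $this->next++;
        }
        return $this->ids[$text];
    }
}

$registry = new StringIdRegistry();
echo $registry->idFor('asdf1234') . "\n";     // 1
echo $registry->idFor('qwerty//2345') . "\n"; // 2
echo $registry->idFor('asdf1234') . "\n";     // 1 again
```

Because numbers are assigned from a counter, two different strings can never share one, which a 32-bit hash cannot guarantee.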
