I've got a 6-digit number and a 31-digit number (e.g. "234536" & "201103231043330478311223582826") that I need to cram into the same 22-character alphanumeric field in an API using PHP. I tried converting each to base 32 (had to use a custom function as base_convert() doesn't handle big numbers well) and joining with a single-character delimiter, but that only gets me down to 26 characters. It's a REST API, so the characters need to be URI-safe.
I'd really like to do this without creating a database table cross referencing the two numbers with another reference value, if possible. Any suggestions?
Use a radix of 62 instead. That takes about 3.35 characters for the former and 17.3 characters for the latter; rounded up to 4 and 18, that gives an upper total of 22 characters.
>>> math.log(10**6)/math.log(62)
3.3474826039165504
>>> math.log(10**31)/math.log(62)
17.295326786902177
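A rough PHP sketch of the radix-62 idea, assuming the bcmath extension is loaded. The helper names b62_encode() and b62_decode() and the order of the alphabet are made up for illustration. Each number is padded to a fixed width (4 and 18 characters), so no delimiter is needed:

```php
<?php
// Assumed alphabet; any fixed ordering of 62 URI-safe characters would do.
const B62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Hypothetical helper: decimal string -> base-62 string, left-padded to $width.
function b62_encode(string $dec, int $width): string {
    $alphabet = B62_ALPHABET;
    $out = '';
    while (bccomp($dec, '0') > 0) {
        $out = $alphabet[(int) bcmod($dec, '62')] . $out;
        $dec = bcdiv($dec, '62', 0);
    }
    return str_pad($out, $width, '0', STR_PAD_LEFT);
}

// Hypothetical helper: base-62 string -> decimal string.
function b62_decode(string $s): string {
    $dec = '0';
    foreach (str_split($s) as $ch) {
        $dec = bcadd(bcmul($dec, '62', 0), (string) strpos(B62_ALPHABET, $ch), 0);
    }
    return $dec;
}

$field = b62_encode('234536', 4) . b62_encode('201103231043330478311223582826', 18);
echo strlen($field), "\n";                    // 22
echo b62_decode(substr($field, 0, 4)), "\n";  // 234536
echo b62_decode(substr($field, 4)), "\n";     // 201103231043330478311223582826
```

The fixed widths work because 62^4 is larger than 10^6 and 62^18 is larger than 10^31.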
You can write something like pack() that works with big numbers using bc. Here is my quick solution; it converts your second number into a 13-byte string. Pretty nice!
<?php
$i2 = "201103231043330478311223582826";

// Pack a decimal string into a binary string, least significant byte first
function pack_large($i) {
    $ret = '';
    while(bccomp($i, 0) !== 0) {
        $mod = bcmod($i, 256);
        $i = bcsub($i, $mod);
        $ret .= chr($mod);
        $i = bcdiv($i, 256);
    }
    return $ret;
}

// Rebuild the decimal string from the packed bytes
function unpack_large($s) {
    $ret = '0';
    $len = strlen($s);
    for($i = $len - 1; $i >= 0; --$i) {
        $add = ord($s[$i]);
        $ret = bcmul($ret, 256);
        $ret = bcadd($ret, $add);
    }
    return $ret;
}

var_dump($i2);
var_dump($pack = pack_large($i2));
var_dump(unpack_large($pack));
Sample output :
string(30) "201103231043330478311223582826"
string(13) "jàÙl¹9±̉"
string(47) "201103231043330478311223582826.0000000000000000"
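Since you need URL-friendly characters, use base64_encode() on the packed string. This will give you a 20-character string (18 if you remove the padding). Note that standard Base64 output can contain + and /, so swap them for - and _ (or URL-encode the value) before putting it in a URI.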
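A minimal sketch of a URL-safe Base64 variant, in case it helps. The helper names base64url_encode() and base64url_decode() are hypothetical, and random_bytes(13) only stands in for the 13-byte output of pack_large():

```php
<?php
// Hypothetical helper: Base64 with "-" and "_" instead of "+" and "/", padding removed.
function base64url_encode(string $bin): string {
    return rtrim(strtr(base64_encode($bin), '+/', '-_'), '=');
}

// Hypothetical helper: restore the padding and the standard alphabet, then decode.
function base64url_decode(string $txt): string {
    $padded = str_pad($txt, strlen($txt) + (4 - strlen($txt) % 4) % 4, '=');
    return base64_decode(strtr($padded, '-_', '+/'));
}

$packed = random_bytes(13);               // stand-in for pack_large($i2)
$token  = base64url_encode($packed);
echo strlen($token), "\n";                // 18
var_dump(base64url_decode($token) === $packed); // bool(true)
```

With 18 characters used for the big number, 4 characters remain in the 22-character field for the 6-digit number.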
I have many strings. Each string something like:
"i_love_pizza_123"
"whatever_this_is_now_later"
"programming_is_awesome"
"stack_overflow_ftw"
...etc
I need to be able to convert each string to a random number, 1-10. Each time that string gets converted, it should consistently be the same number. A sampling of strings, even with similar text should result in a fairly even spread of values 1-10.
My first thought was to do something like md5($string), then break down a-f,0-9 into ten roughly equal groups, determine where the first character of the hash falls, and put it in that group. But converting 16 values down to 10 by multiplying by 0.625 makes the spread uneven.
Thoughts on a good method to consistently convert a string to a random/repeatable number, 1-10? There has to be an easier way.
Here's a quick demo of how you can do it.
function getOneToTenHash($str) {
    $hash = hash('sha256', $str, true);
    $unpacked = unpack("L", $hash); // convert first 4 bytes of hash to 32-bit unsigned int
    $val = $unpacked[1];
    return ($val % 10) + 1; // get 1 - 10 value
}
for ($i = 0; $i < 100; $i++) {
    echo getOneToTenHash('str' . $i) . "\n";
}
How it works:
Basically you get the output of a hash function and downscale it to desired range (1..10 in this case).
In the example above, I used sha256 hash function which returns 32 bytes of arbitrary binary data. Then I extract just first 4 bytes as integer value (unpack()).
At this point I have a 4 bytes integer value (0..4294967295 range). In order to downscale it to 1..10 range I just take the remainder of division by 10 (0..9) and add 1.
It's not the only way to downscale the range but an easy one.
So, the above example consists of 3 steps:
1. Get the hash value.
2. Convert the hash value to an integer.
3. Downscale the integer range.
A much shorter example with crc32() function which returns integer value right away thus allowing us to omit step 2:
function getOneToTenHash($str) {
    $int = crc32($str); // 0..4294967295
    return ($int % 10) + 1; // 1..10
}
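To check how even the spread really is, one could tally the buckets over a batch of similar strings. This is only a sketch: the 'str' . $i inputs are arbitrary test data, and it assumes a 64-bit PHP build where crc32() never returns a negative number.

```php
<?php
function getOneToTenHash(string $str): int {
    return (crc32($str) % 10) + 1; // 1..10
}

// Count how many of 10,000 similar strings land in each bucket.
$counts = array_fill(1, 10, 0);
for ($i = 0; $i < 10000; $i++) {
    $counts[getOneToTenHash('str' . $i)]++;
}
print_r($counts);
```

Since 2^32 is not a multiple of 10, the modulo step carries a tiny bias towards the lower buckets, but with 32 bits of input it is far too small to matter here.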
The snippet below may be what you want:
$inStr = "hello world";
$md5Str = md5($inStr);
$len = strlen($md5Str);
$out = 0;
for ($i = 0; $i < $len; $i++) {
    // hexdec() reads the hex digits a-f correctly (intval() would turn them into 0);
    // reducing modulo 10 at every step keeps $out from overflowing
    $out = (7 * $out + hexdec($md5Str[$i])) % 10;
}
$out = $out + 1; // range = [1, 10]
I am trying to find a way to encode a database ID into a short URL, e.g. 1 should become "Ys47R". Then I would like to decode it back from "Ys47R" to 1 so I can run a database search using the INT value. It needs to be unique using the database ID. The sequence should not be easily guessable, such as 1 = "Ys47R", 2 = "Ys47S". It should be something along the lines of YouTube's or bitly's URLs. I have read up on hundreds of different sources using md5, base32, base64 and bcpow but have come up empty.
This blog post looked promising but once I added padding and a passkey, short ID's such as 1 became SDDDG, 2 became "SDDDH" and 3 became "SDDDI". It is not very random.
base32 used only a-b 0-9
base64 had characters such as == on the end.
I then tried this:
function getRandomString($db, $length = 7) {
    $validCharacters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    $validCharNumber = strlen($validCharacters);
    $result = "";
    for ($i = 0; $i < $length; $i++) {
        $index = mt_rand(0, $validCharNumber - 1);
        $result .= $validCharacters[$index];
    }
    return $result;
}
Which worked but meant I had to run a database query every time to make sure there were no collisions and it did not exist in the database.
Is there a way I can create short IDs that are 4 characters minimum, with a charset of [a-z][A-Z][0-9], that can be encoded and decoded back, using an auto-incrementing unique ID in a database where each number is unique? I can't get my head around advanced techniques using base32 or base64.
Or am I looking into this too much and there is an easier way to do it? Would it be best to do the random string function above and query the database to check for uniqueness all the time?
You could use the function from the comments: http://php.net/manual/en/function.base-convert.php#106546
$initial = '11111111';
$dic = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
var_dump($converted = convBase($initial, '0123456789', $dic));
// string(4) "KCvt"
var_dump(convBase($converted, $dic, '0123456789'));
// string(8) "11111111"
function convBase($numberInput, $fromBaseInput, $toBaseInput)
{
if ($fromBaseInput==$toBaseInput) return $numberInput;
$fromBase = str_split($fromBaseInput,1);
$toBase = str_split($toBaseInput,1);
$number = str_split($numberInput,1);
$fromLen=strlen($fromBaseInput);
$toLen=strlen($toBaseInput);
$numberLen=strlen($numberInput);
$retval='';
if ($toBaseInput == '0123456789')
{
$retval=0;
for ($i = 1;$i <= $numberLen; $i++)
$retval = bcadd($retval, bcmul(array_search($number[$i-1], $fromBase),bcpow($fromLen,$numberLen-$i)));
return $retval;
}
if ($fromBaseInput != '0123456789')
$base10=convBase($numberInput, $fromBaseInput, '0123456789');
else
$base10 = $numberInput;
if ($base10<strlen($toBaseInput))
return $toBase[$base10];
while($base10 != '0')
{
$retval = $toBase[bcmod($base10,$toLen)].$retval;
$base10 = bcdiv($base10,$toLen,0);
}
return $retval;
}
If you want some symmetric obfuscation, then base_convert() is often sufficient.
base_convert($id, 10, 36);
This returns strings like 1i0g, and base_convert($str, 36, 10) converts them back.
Before and after that base conversion you can add:
To get a minimum string length, I'd suggest just adding 70000 to your $id. And on the receiving end just subtract that again.
A minor multiplication $id *= 3 would add some "holes" in the generated alphanumeric ID range, yet not exhaust the available string space.
For some appearance of arbitrariness, a bit of nibble moving:
$id = ($id & 0xF0F0F0F) << 4
| ($id & 0x0F0F0F0) >> 4;
Which works for generating your obfuscated ID strings, and getting back the original ones.
Just to be crystal clear: this is no encryption of any sort. It just shifts numeric jumps between consecutive numbers, and looks slightly more arbitrary.
You still may not like the answer, but generating random IDs in your database is the only approach that really hinders ID guessing.
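Putting the offset, the nibble swap and the base conversion together could look roughly like the sketch below. The function names swapNibbles(), encodeId() and decodeId() are made up here, and the round trip only holds while $id + 70000 stays below 2^24 (16,777,216), because the masks above do not cover the top nibble of a 32-bit value.

```php
<?php
// Swap the two nibbles of each byte; applying it twice restores values below 2^24.
function swapNibbles(int $n): int {
    return (($n & 0xF0F0F0F) << 4) | (($n & 0x0F0F0F0) >> 4);
}

// Offset (for a minimum length of 4), shuffle the nibbles, then switch to base 36.
function encodeId(int $id): string {
    return base_convert((string) swapNibbles($id + 70000), 10, 36);
}

// Undo the steps in reverse order.
function decodeId(string $code): int {
    return swapNibbles((int) base_convert($code, 36, 10)) - 70000;
}

foreach ([1, 2, 3, 1000] as $id) {
    $code = encodeId($id);
    echo $id, ' => ', $code, ' => ', decodeId($code), "\n";
}
```

As the answer says, this is obfuscation only; anyone who knows the scheme can reverse it.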
Is there a term for the idea of storing large numbers as letters? For example, let's say I have the (relatively small) number 138201162401719 and I want to shrink it to the fewest possible number of characters (I know this does not help with saving disk space). There are 26 letters in the English alphabet (but I count them as 25 since we need a zero letter). If I start splitting up my large number into pieces that are each 25 or less I get:
13, 8, 20, 11, 6, 24, 0, 17, 19
If I then count the numbers of the alphabet a=0, b=1, c=2, d=3... I can convert this to:
NIULGYART
So I went from 15 digits long (138201162401719) to 9 characters long (NIULGYART). This could of course be easily converted back to the original number as well.
So...my first question is "Does this have a name" and my second "Does anyone have PHP code that will do the conversion (in both directions)?"
I am looking for proper terminology so that I can do my own research in Google...though working code examples are cool too.
This is only possible if you store the number as a string before processing. You can't store huge numbers as integers without losing precision (13820116240171986468445 would be stored as 1.3820116240172E+22), so a lot of digits are lost.
If you're considering storing the number as a string this will be your answer:
Functions used: intval, chr and preg_match_all.
<?php
$regex = '/(2[0-5])|(1[0-9])|([0-9])/';
$numberString = '138201162401719';
preg_match_all($regex, $numberString, $numberArray, PREG_SET_ORDER);
echo($numberString . " -> ");
foreach($numberArray as $value){
$character = chr (intval($value[0]) + 65);
echo($character);
}
?>
This is the result:
138201162401719 -> NIULGYART
Here's how I would do it:
1. Store the big number as a string and split it into an array of single digits.
2. Loop through the array and extract 2-digit chunks using substr().
3. Check whether the chunk is less than 26 and does not start with 0 (in which case it maps to a letter) and add it to an array; otherwise add just the single digit.
4. Use array_map() with chr() to create a new array of characters from the above array.
5. Implode the resulting array to get the cipher.
In code:
$str = '138201162401719';
$arr = str_split($str);
$i = 0; // starting from the left
while ($i < count($arr)) {
$n = substr($str, $i, 2);
$firstchar = substr($n, 0, 1);
if ($n < 26 && $firstchar != 0) {
$result[] = substr($str, $i, 2);
$i += 2; // advance two characters
} else {
$result[] = substr($str, $i, 1);
$i++; // advance one character
}
}
$output = array_map(function($n) {
return chr($n+65);
}, $result);
echo implode($output); // => NIULGYART
As an alternative, you could convert the input integer to express it in base 26, instead of base 10. Something like (pseudocode):
func convertBase26(num)
if (num < 0)
return "-" & convertBase26(-num) // '&' is concatenate.
else if (num = 0)
return "A"
endif
output = "";
while (num > 0)
output <- ('A' + num MOD 26) & output // Modulus operator.
num <- num DIV 26 // Integer division.
endwhile
return output
endfunc
This uses A = 0, B = 1, up to Z = 25 and standard place notation: 26 = BA. Obviously a base conversion is easily reversible.
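The pseudocode above could be written in PHP roughly as follows. This is a sketch that assumes the bcmath extension and non-negative input; the names toBase26() and fromBase26() are made up, and numbers are passed as decimal strings so that large values keep their precision.

```php
<?php
// Decimal string -> letters, with A = 0 ... Z = 25.
function toBase26(string $num): string {
    if (bccomp($num, '0') === 0) {
        return 'A';
    }
    $out = '';
    while (bccomp($num, '0') > 0) {
        $out = chr(65 + (int) bcmod($num, '26')) . $out; // 65 is ASCII 'A'
        $num = bcdiv($num, '26', 0);                      // integer division
    }
    return $out;
}

// Letters -> decimal string.
function fromBase26(string $letters): string {
    $num = '0';
    foreach (str_split($letters) as $ch) {
        $num = bcadd(bcmul($num, '26', 0), (string) (ord($ch) - 65), 0);
    }
    return $num;
}

echo toBase26('26'), "\n";            // BA
echo toBase26('1000000000'), "\n";    // DGEHTYM
echo fromBase26('DGEHTYM'), "\n";     // 1000000000
echo fromBase26(toBase26('138201162401719')), "\n"; // 138201162401719
```

Unlike the chunk-splitting approaches, every letter string maps back to exactly one number, apart from leading A's, which behave like leading zeros.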
strtr() is a magnificent tool for this task! It replaces the longest match as it traverses the string.
Code: (Demo)
function toAlpha ($num) {
return strtr($num, range("A", "Z"));
}
$string = toAlpha("138201162401719");
echo "$string\n";
$string = toAlpha("123456789012345");
echo "$string\n";
$string = toAlpha("101112131415161");
echo "$string\n";
$string = toAlpha("2625242322212019");
echo "$string";
Output:
NIULGYART
MDEFGHIJAMDEF
KLMNOPQB
CGZYXWVUT
Just flip the lookup array to reverse the conversion: https://3v4l.org/YsFZu
Merged: https://3v4l.org/u3NQ5
Of course, I must mention that there is a vulnerability with converting a sequence of letters to numbers and back to letters. Consider that BB becomes 11, which is then mistaken for eleven and would translate to L when converted again.
There are ways to mitigate this by adjusting the lookup array, but that may not be necessary/favorable depending on program requirements.
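The ambiguity can be reproduced with a small sketch. toNum() is a guess at what "flipping the lookup array" looks like, and the fixed-width variants toNumFixed() and toAlphaFixed() are just one possible mitigation, not something the answer prescribes:

```php
<?php
function toAlpha(string $num): string {
    return strtr($num, range('A', 'Z')); // '0' => 'A', ..., '25' => 'Z'
}

function toNum(string $alpha): string {
    // Flipped lookup: 'A' => '0', 'B' => '1', ..., 'Z' => '25'
    return strtr($alpha, array_map('strval', array_flip(range('A', 'Z'))));
}

echo toNum('BB'), "\n";          // 11
echo toAlpha(toNum('BB')), "\n"; // L, not BB

// Possible mitigation (an assumption): always write two digits per letter.
function toNumFixed(string $alpha): string {
    $out = '';
    foreach (str_split($alpha) as $ch) {
        $out .= sprintf('%02d', ord($ch) - 65);
    }
    return $out;
}

function toAlphaFixed(string $num): string {
    $out = '';
    foreach (str_split($num, 2) as $pair) {
        $out .= chr(65 + (int) $pair);
    }
    return $out;
}

echo toAlphaFixed(toNumFixed('BB')), "\n"; // BB
```

The fixed-width form is unambiguous but no longer shortens arbitrary digit strings, since it only accepts pairs from 00 to 25.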
And here is another consideration from CodeReview.
I have been trying to do the same thing in PHP without success.
Assuming I'm using the 26 letters of the English alphabet, starting with A = 0 down to Z as 25:
I find the highest power of 26 that is lower than the number I am encoding and divide the number by it. I take the integer part of the result, convert it to a letter, and multiply the remaining decimals by 26. I keep doing that until I get a whole number. It's OK to get a zero, as that is an A, but if the result has decimals it must be multiplied again.
For example, 1 billion is DGEHTYM, and it's done in 6 loops. Although my answer demonstrates how to encode, I'm afraid it does not show how to do so in PHP, which is what I'm trying to do myself. I hope the algorithm helps people out there though.
I am working on Yii. I want to generate 20 digit random keys. I had written a function as -
public function GenerateKey()
{
//for generating random confirm key
$length = 20;
$chars = array_merge(range(0,9), range('a','z'), range('A','Z'));
shuffle($chars);
$password = implode(array_slice($chars, 0, $length));
return $password;
}
This function generates a 20-character key correctly, but I want the key in a format like "g12a-Gh45-gjk7-nbj8-lhk8", i.e. separated by hyphens. What changes do I need to make?
You can use chunk_split() to add the hyphens. substr() is used to remove the trailing hyphen it adds, leaving only those hyphens that actually separate groups.
return substr(chunk_split($password, 4, '-'), 0, 24);
However, note that shuffle() not only uses a relatively poor PRNG but also will not allow the same character to be used twice. Instead, use mt_rand() in a for loop, and then using chunk_split() is easy to avoid:
$password = '';
for ($i = 0; $i < $length; $i++) {
if ( $i != 0 && $i % 4 == 0 ) { // nonzero and divisible by 4
$password .= '-';
}
$password .= $chars[mt_rand(0, count($chars) - 1)];
}
return $password;
(Even mt_rand() is not a cryptographically secure PRNG. If you need to generate something that must be extremely hard to predict (e.g. an encryption key or password reset token), use openssl_random_pseudo_bytes() to generate bytes and then a separate function such as bin2hex() to encode them into printable characters. I am not familiar with Yii, so I cannot say whether or not it has a function for this.)
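On PHP 7 and later, random_int() is a cryptographically secure drop-in for mt_rand() in the loop above. A minimal sketch, with generateKey() and its parameters chosen here only for illustration:

```php
<?php
// Builds e.g. 5 groups of 4 characters joined by hyphens, using a CSPRNG.
function generateKey(int $groups = 5, int $groupLength = 4): string {
    $chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $max = strlen($chars) - 1;
    $parts = [];
    for ($g = 0; $g < $groups; $g++) {
        $part = '';
        for ($i = 0; $i < $groupLength; $i++) {
            $part .= $chars[random_int(0, $max)]; // characters may repeat
        }
        $parts[] = $part;
    }
    return implode('-', $parts);
}

echo generateKey(), "\n";
```

Building the groups separately and joining them with implode() avoids both the trailing hyphen and the modulo check.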
You can use this Yii internal function:
Yii::app()->getSecurityManager()->generateRandomString($length);
I need PHP function that will create 8 chars long [a-z] hash from any input string.
So e.g. when I'll submit "Stack Overflow" it will return e.g. "gdqreaxc" (8 chars [a-z] no numbers allowed)
Perhaps something like:
$hash = substr(strtolower(preg_replace('/[0-9_\/]+/','',base64_encode(sha1($input)))),0,8);
This produces a SHA1 hash, base-64 encodes it (giving us the full alphabet), removes non-alpha chars, lowercases it, and truncates it.
For $input = 'yar!';:
mwinzewn
For $input = 'yar!!';:
yzzhzwjj
So the spread seems pretty good.
This function will generate a hash containing evenly distributed characters [a-z]:
function my_hash($string, $length = 8) {
// Convert to a string which may contain only characters [0-9a-p]
$hash = base_convert(md5($string), 16, 26);
// Get part of the string
$hash = substr($hash, -$length);
// In rare cases it will be too short, add zeroes
$hash = str_pad($hash, $length, '0', STR_PAD_LEFT);
// Convert character set from [0-9a-p] to [a-z]
$hash = strtr($hash, '0123456789', 'qrstuvwxyz');
return $hash;
}
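One caveat: the PHP manual warns that base_convert() may lose precision on large numbers because it falls back to floats, and a 128-bit MD5 value is well beyond that limit, so the low-order digits kept by substr() may not reflect the whole hash. A variant that does the conversion with bcmath instead might look like this (az_hash() is a made-up name, and bcmath is assumed to be available):

```php
<?php
function az_hash(string $input, int $length = 8): string {
    // Turn the 32 hex digits of md5() into a decimal string without using floats.
    $dec = '0';
    foreach (str_split(md5($input)) as $hex) {
        $dec = bcadd(bcmul($dec, '16', 0), (string) hexdec($hex), 0);
    }
    // Peel off base-26 digits and map 0..25 to a..z.
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= chr(97 + (int) bcmod($dec, '26'));
        $dec = bcdiv($dec, '26', 0);
    }
    return $out;
}

echo az_hash('Stack Overflow'), "\n";
```

Because 2^128 is vastly larger than 26^8, the bias from the modulo steps is negligible.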
By the way, if this is important for you: for 100,000 different strings you'll have a ~2% chance of a hash collision (for an 8-character hash), and for a million strings this chance rises to ~90%, if my math is correct.
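Those figures can be checked with the usual birthday-bound approximation p ≈ 1 - exp(-n(n-1) / (2N)), where N = 26^8 is the number of possible hashes. A small sketch (collisionProbability() is a hypothetical helper, and a 64-bit PHP build is assumed):

```php
<?php
// Approximate probability that at least two of $n uniformly random values
// out of $space possibilities collide.
function collisionProbability(int $n, float $space): float {
    return 1.0 - exp(-$n * ($n - 1) / (2.0 * $space));
}

$space = 26 ** 8; // 208,827,064,576 possible 8-letter strings
printf("%.3f\n", collisionProbability(100000, $space));  // about 0.024
printf("%.3f\n", collisionProbability(1000000, $space)); // about 0.909
```

This assumes the hash values are uniformly distributed over all 26^8 strings; any bias in the hash makes collisions more likely.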
function md5toabc($myMD5)
{
    $newString = "";
    for ($i = 0; $i < 16; $i += 2)
    {
        // add the first val of 0-15 to the second val of 0-15 for a range of 0-30
        // (the third argument of substr() is a length, so take 1 character each)
        $myintval = hexdec(substr($myMD5, $i, 1)) +
                    hexdec(substr($myMD5, $i + 1, 1));
        // mod by 26 and add 97 to get to the lowercase ascii range
        $newString .= chr(($myintval % 26) + 97);
    }
    return $newString;
}
Note this introduces bias to various characters, but do with it what you will.
(Like when you roll two dice, the most common value is a 7 combined...) plus the modulo, etc...
You can get a good [a-p]{8} hash (but not a-z) by taking the output of a well-known algorithm and remapping it:
function mini_hash( $string )
{
    $h = hash( 'crc32' , $string );
    for ($i = 0; $i < 8; $i++) {
        // map hex digit 0-15 to a-p (97 is the ASCII code of 'a')
        $h[$i] = chr(97 + hexdec($h[$i]));
    }
    return $h;
}
Interesting set of constraints you posted there.
How about:
substr(preg_replace("/[0-9]/", "", md5($mystring)), 0, 8);
(Note that this leaves only the letters a-f and can return fewer than 8 characters.)
You could add a bit more entropy by doing:
$hash = md5($myString);
$hash = str_replace("1", "g", $hash);
$hash = str_replace("2", "h", $hash);
$hash = str_replace("3", "i", $hash);
etc., instead of stripping the digits.