I've been attempting to implement the MD5 hashing algorithm in PHP and have produced the code below. However, when I run the function with the test input "test", it produces the string "21aa63b9882532cd590623dbd8f2fa225350d682" instead of the expected "098f6bcd4621d373cade4e832627b4f6".
I have completely run out of ideas as to why this returns the wrong result, and would be extremely grateful if someone could assist me.
Edit: I'm not using this in production, but rather in a school project.
My code:
<?php
/**
* Created by PhpStorm.
* User: Sam Gunner
* Date: 25/01/2017
* Time: 18:37
*/
class md5
{
private $k;
private $s;
//Constants as defined by the specification
private $a0 = 0x67452301;
private $b0 = 0xefcdab89;
private $c0 = 0x98badcfe;
private $d0 = 0x10325476;
//Convert a character to its binary representation using ASCII
private function convertCharToByteString($char) {
$charNum = ord($char);
$charNumString = decbin($charNum);
$charNumString = str_pad($charNumString, 8, '0', STR_PAD_LEFT);
return $charNumString;
}
//Takes in a number of zeroes and the string, and then adds that number of zeroes to the end of the string
private function padZeroRight($str, $amount) {
for ($i = 0; $i < $amount; $i++) {
$str .= '0';
}
return $str;
}
//Splits up a string into several pieces
private function getCharChunks($original, $length) {
$chunks = Array();
$currentChunk = null;
$currentStart = null;
$numberOfParts = ceil(strlen($original) / $length); //Get the number of chunks
for ($i = 0; $i < $numberOfParts; $i++) {
$currentStart = ($i * $length) - ($length - 1); //Get the starting position of the substring
$currentChunk = substr($original, $currentStart, $length);
$chunks[$i] = $currentChunk;
}
return $chunks;
}
//Easy way of converting multiple binary integers to array of decimal integers
private function convertChunkArrayToIntegers($chunkArray) {
$finalChunks = Array();
for ($i = 0; $i < count($chunkArray); $i++) {
$finalChunks[$i] = decbin($chunkArray[$i]);
}
return $finalChunks;
}
//Begin MD5-specific functions
private function F($B, $C, $D) {
return ($B & $C) | ((~$B) & $D);
}
private function G($B, $C, $D) {
return ($B & $D) | ($C & (~$D));
}
private function H($B, $C, $D) {
return ($B ^ $C ^ $D);
}
private function I($B, $C, $D) {
return ($C ^ ($B | (~$D)));
}
private function rotate($decimal, $bits) { //returns hex
return (($decimal << $bits) | ($decimal >> (32 - $bits))) & 0xffffffff;
}
public function __construct() {
$this->s = Array(
7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20,
4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21
);
for ($i = 0; $i < 64; $i++) { //Generate the constants that are defined in the specification
$this->k[$i] = floor(abs(sin($i + 1)) * pow(2, 32));
}
}
//Returns the MD5 hash of a message in string form
public function hash($message) {
$finalStr = '';
$subChunks = null;
$A = null;
$B = null;
$C = null;
$D = null;
$F = null;
$g = null;
$DTemp = null;
for ($i = 0; $i < strlen($message); $i++) { //Change the string representation of the message into a series of bits, which we can manipulate
$finalStr .= $this->convertCharToByteString(substr($finalStr, $i, 1));
}
$finalStr .= '1'; //Append 1, as the specification says to
$messageLen = strlen($finalStr);
$messageLenFinal = $messageLen % 512; //Find out how much we are under a multiple of 512
if ($messageLenFinal > 448) { //If the message length remainder is longer than 448, then we need to add some more zeroes to get it to 448 over
$remainingToAdd = (512 - $messageLenFinal) + 448; //'tip it over the edge' and then add 448 to make it 64 below 512
} else {
$remainingToAdd = 448 - $messageLenFinal;
}
$finalStr = $this->padZeroRight($finalStr, $remainingToAdd); //Add zeroes onto the end until criteria is met
//$messageLen = $messageLen % (pow(2, 64)); //Get the length of the message MOD 2 pow 64
$messageLen = strlen($finalStr);
$messageLenStr = decbin($messageLen); //Convert the decimal representation to binary
$messageLenStr = str_pad($messageLenStr, 64, "0", STR_PAD_LEFT); //Pad the message with zeroes to make it 64 bits long
$finalStr .= $messageLenStr;
$chunks = $this->getCharChunks($finalStr, 512); //Get message in 512-bit chunks
foreach ($chunks as $chunk) {
$subChunks = $this->convertChunkArrayToIntegers($this->getCharChunks($chunk, 32)); //Get sub chunks of 32-bit size
$A = $this->a0;
$B = $this->b0;
$C = $this->c0;
$D = $this->d0;
for ($i = 0; $i < 64; $i++) {
if ($i >= 0 && $i < 16) {
$F = $this->F($B, $C, $D);
$g = $i;
} elseif ($i > 15 && $i <32) {
$F = $this->G($B, $C, $D);
$g = ((5 * $i) + 1) % 16;
} elseif ($i > 31 && $i < 48) {
$F = $this->H($B, $C, $D);
$g = ((3 * $i) + 5) % 16;
} elseif ($i > 47 && $i < 64) {
$F = $this->I($B, $C, $D);
$g = (7 * $i) % 16;
}
$DTemp = $D;
$D = $C;
$C = $B;
$B = $B + $this->rotate(($A + $F + $this->k[$i] + $subChunks[$g]), $this->s[$i]);
$A = $DTemp;
}
$this->a0 += $A;
$this->b0 += $B;
$this->c0 += $C;
$this->d0 += $D;
}
$final = dechex($this->a0) . dechex($this->b0) . dechex($this->c0) . dechex($this->d0);
return $final;
}
}
Many thanks,
- Sam
The four additions you're performing at the end of a round:
$this->a0 += $A;
$this->b0 += $B;
$this->c0 += $C;
$this->d0 += $D;
are not wrapping around at 32 bits. This lets A/B/C/D grow too large, which is probably why your output ends up 160 bits long instead of 128.
Note that you'll also need to pad the results of dechex() in the final stage to eight hex digits each, to avoid producing too short a result if one of the four components ends up being very small.
(There may be other issues as well -- these are just the first ones I noticed.)
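As a rough illustration of those two points, here is a minimal sketch. The helper names add32 and hex32 are made up for this example, and it assumes a 64-bit PHP build, where the sum of two 32-bit values still fits in a native integer. MD5 also writes each word out in little-endian byte order, which this sketch does not attempt.

```php
<?php
// Hypothetical helper: add two 32-bit words and wrap the result modulo 2^32.
// Assumes 64-bit PHP integers, so the intermediate sum cannot overflow.
function add32(int $x, int $y): int
{
    return ($x + $y) & 0xFFFFFFFF;
}

// Hypothetical helper: format a 32-bit word as exactly eight hex digits.
function hex32(int $word): string
{
    return str_pad(dechex($word), 8, '0', STR_PAD_LEFT);
}

echo add32(0xFFFFFFFF, 2), "\n"; // 1
echo hex32(0x1F), "\n";          // 0000001f
```

With helpers like these, each of the four final additions would become something like `$this->a0 = add32($this->a0, $A);`.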
If it's an option at all, I'd strongly recommend that you use C for this project. PHP is not well suited to writing low-level crypto code; you will find yourself wrestling with issues like this often.
I'm trying to port a JavaScript function that calculates the check digit of an input number to PHP.
The variables are:
input - the input value (example: 200300)
num_digits - the number of check digits (with 1 the result is 7; with 2 the result is 70, depending on the input value)
limit - the multiplication limit (in my case the multiplier needs to go up to 9)
x10 - true or false; if true, the sum is multiplied by 10 before taking the remainder
All of the variables above refer to my JavaScript function:
function calcDigitMod11(input, num_digits, limit, x10) {
var mult, sum, i, n, digit;
if (!x10) num_digits = 1;
for (n = 1; n <= num_digits; n++) {
sum = 0; mult = 2;
for (i = (input.length - 1); i >= 0; i--) {
sum += (mult * parseInt(input.charAt(i)));
if (++mult > limit) mult = 2;
}
if (x10) {
digit = ((sum * 10) % 11) % 10;
} else {
digit = sum % 11;
if (digit == 10) digit = 'x';
}
input += (digit);
}
return input.substring((input.length - num_digits), num_digits);
}
In short, my problem is recreating this same function in PHP.
If you print the function's return value to the console with the following parameters: calcDigitMod11(200300, 1, 9, true); the result should be the check digit
7
I tried to convert the syntax using the same variables and parameters:
function calcDigitMod11($input, $num_digits, $limit, $x10) {
$mult; $sum; $i; $n; $digit;
if (!$x10) $num_digits = 1;
for ($n = 1; $n <= $num_digits; $n++) {
$sum = 0; $mult = 2;
for ($i = (strlen($input) - 1); $i >= 0; $i--) {
$sum += ($mult * (int)$input[$i]);
if ((++$mult) > $limit) $mult = 2;
}
if ($x10) {
$digit = (($sum * 10) % 11) % 10;
} else {
$digit = $sum % 11;
if ($digit == 10) $digit = 'x';
}
$input += ($digit);
}
return substr($input, strlen($input) - $num_digits, $num_digits);
}
I ran echo calcDigitMod11(200300, 1, 9, true); but it returns the digit
0
I can't find where the mistake is, and I don't know whether my logic is off. Here are the correct digits that the JavaScript function returns for several input values:
calcDigitMod11(200300, 1, 9, true); returns 7
calcDigitMod11(200301, 1, 9, true); returns 5
calcDigitMod11(200302, 1, 9, true); returns 3
calcDigitMod11(200303, 1, 9, true); returns 1
In PHP, you can retrieve characters using index values only for strings. Example:
$str = "98765";
echo $str[1]; // will return 8
If we do the same for integers it will throw a warning.
$num = 98765;
echo $num[1];
Warning: Trying to access array offset on value of type int
To enable errors/warnings in your script, add the below code at the top of your php file
error_reporting(E_ALL);
ini_set('display_errors', '1');
Now, to fix your algorithm, we can use str_split to convert the input into an array of digit characters before using an index to extract digits. The digit also has to be appended with .= instead of +=, because += in PHP adds numerically rather than concatenating.
<?php
function calcDigitMod11($input, $num_digits, $limit, $x10) {
    $input = (string) $input; // Work on a string so digits can be appended
    if (!$x10) $num_digits = 1;
    for ($n = 1; $n <= $num_digits; $n++) {
        $sum = 0; $mult = 2;
        $input_arr = str_split($input); // Convert string to an array
        for ($i = (strlen($input) - 1); $i >= 0; $i--) {
            $sum += ($mult * (int) $input_arr[$i]); // Use array to fetch digits
            if ((++$mult) > $limit) $mult = 2;
        }
        if ($x10) {
            $digit = (($sum * 10) % 11) % 10;
        } else {
            $digit = $sum % 11;
            if ($digit == 10) $digit = 'x';
        }
        $input .= $digit; // Append the digit instead of adding it
    }
    echo "<br/><br/>True Digit=$digit<br/>";
    return substr($input, strlen($input) - $num_digits, $num_digits);
}
echo "Output=" . calcDigitMod11(200300, 1, 9, true);
echo "Output=" . calcDigitMod11(200301, 1, 9, true);
echo "Output=" . calcDigitMod11(200302, 1, 9, true);
echo "Output=" . calcDigitMod11(200303, 1, 9, true);
?>
Output:
True Digit=7
Output=7
True Digit=5
Output=5
True Digit=3
Output=3
True Digit=1
Output=1
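One difference between the two languages matters for this port: in JavaScript, += on a string concatenates, while in PHP += always performs numeric addition and concatenation is written .= instead. A minimal sketch with illustrative values (the variable names $added and $appended are made up for this example):

```php
<?php
// Illustrative values only: a numeric string and a computed check digit.
$input = "200301";
$digit = 5;

$added = $input + $digit;    // numeric addition: int 200306
$appended = $input . $digit; // concatenation: string "2003015"

echo substr((string) $added, -1), "\n"; // 6, which is not the check digit
echo substr($appended, -1), "\n";       // 5, the check digit
```

This is why taking the last character of the sum can give the right digit for one input and a wrong one for the next.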
So I have an array of integers: <1, 2, 3, 9, 10, 11, 14>, that I would like to join together in this format: <1-3, 9-11, 14>.
I'm new to PHP and tried doing this by looping through the array:
function pasteTogether($val)
{
$newVals = array();
$min = $val[0];
$max = $val[1];
$counter = 0;
for ($i = 0; $i < count($val); $i++)
{
if ($val[$i + 1] === $val[$i] + 1)
{
$max = $val[$i + 1];
}
else
{
$tempVal = $min."-".$max;
$newVals[$counter] = $tempVal;
$counter++;
$min = $val[$i];
}
}
return $newVals;
}
However, when I run this code, I get <1-3, 3-11, 11-11, 14-14>
PHP Fatal error: Maximum execution time of 30 seconds exceeded in ../learning.php on line 36
This happens because the for loop never ends: you are incrementing $val instead of $i.
$array = array(1, 2, 3, 9, 10, 11, 14);
function pasteTogether($val)
{
$newVals = array();
$min = $val[0];
$max = $val[1];
$counter = 0;
for ($i = 0; $i < count($val); $i++)
{
if ($val[$i + 1] === $val[$i] + 1)
{
$max = $val[$i + 1];
}
else
{
$tempVal = $min."-".$max;
$newVals[$counter] = $tempVal;
$counter++;
$min = $val[$i];
}
}
return $newVals;
}
pasteTogether($array);
I have been playing around with this interesting problem and found another solution. So, if anyone is interested:
$arr=array(1, 2, 3, 9, 10, 11, 14, 15, 16, 18);
$v0=$dif=null;$rows=array();
foreach ($arr as $i => $v) {
if ($dif!=($d=($v-$i))){
if ($v0) $rows[]="$v0-".$arr[$i-1];
$v0=$v;
$dif=$d;
}
}
$rows[]="$v0-".($d==$dif?$arr[$i]:$v0);
print_r($rows);
I added a few numbers to the array and the result is this:
$rows = Array
(
[0] => 1-3
[1] => 9-11
[2] => 14-16
[3] => 18-18
)
You can find a little demo here: http://rextester.com/ABC25608
This works:
function pasteTogether($val)
{
$compacted = [];
$min = null;
$max = null;
$format = function ($a, $b) {
return ($a < $b ? "$a-$b" : $a);
};
foreach ($val as $current) {
if ($min === null) {
$min = $current;
$max = $current - 1;
}
if ($current == $max + 1) {
$max++;
} else {
$compacted[] = $format($min, $max);
$min = $current;
$max = $current;
}
}
$compacted[] = $format($min, $max);
return $compacted;
}
echo '<', implode(', ', pasteTogether([1, 2, 3, 9, 10, 11, 14])), '>';
Output:
<1-3, 9-11, 14>
I have an array of numbers like this:
$array = array(1,1,1,4,3,1);
How do I get the count of the most repeated value?
This should work:
$count = array_count_values($array); // Counts the values in the array, returns an associative array
arsort($count); // Sort it from highest to lowest count
$keys = array_keys($count); // Extract the keys so we can find the most frequent value
echo "The most occurring value is {$keys[0]} with {$count[$keys[0]]} occurrences.";
I think the array_count_values function can be useful to you. See the manual for details: http://php.net/manual/en/function.array-count-values.php
You can count the number of occurrences of values in an array with array_count_values:
$counts = array_count_values($array);
Then just do a reverse sort on the counts:
arsort($counts);
Then check the top value to get your mode.
$mode = key($counts);
If your array contains strings or integers only you can use array_count_values and arsort:
$array = array(1, 1, 1, 4, 3, 1);
$counts = array_count_values($array);
arsort($counts);
That would leave the most used element as the first one of $counts. You can get the count amount and value afterwards.
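A minimal sketch of reading both the value and its count after the sort. It uses array_key_first(), which is only available from PHP 7.3 onward; on older versions, key($counts) right after arsort() should serve the same purpose.

```php
<?php
$array = array(1, 1, 1, 4, 3, 1);
$counts = array_count_values($array); // value => number of occurrences
arsort($counts);                      // highest count first, keys preserved

$value = array_key_first($counts);    // the most used element (PHP 7.3+)
$count = $counts[$value];             // how many times it occurs

echo "$value occurs $count times\n";  // 1 occurs 4 times
```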
It is important to note that if several elements have the same number of occurrences in the original array, I can't say for sure which one you will get; that depends on the implementations of array_count_values and arsort. If you need a particular one, test this thoroughly and don't make any assumptions.
In that case you may be better off not using arsort and writing the reduction loop yourself.
$array = array(1, 1, 1, 4, 3, 1);
/* Our return values, with some useless defaults */
$max = 0;
$max_item = $array[0];
$counts = array_count_values($array);
foreach ($counts as $value => $amount) {
if ($amount > $max) {
$max = $amount;
$max_item = $value;
}
}
After the foreach loop, $max_item contains the first item that appears the most in the original array, as long as array_count_values returns the elements in the order they are found (which appears to be the case based on the example in the documentation): with the strict comparison, a later item with the same count never replaces an earlier one. You can get the last item to appear the most in your original array by using a non-strict comparison ($amount >= $max instead of $amount > $max).
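A quick way to check which of several tied elements each comparison keeps is to run both on an input with a tie. The input and the names $first_tied and $last_tied are illustrative, and the sketch assumes array_count_values keeps first-occurrence order.

```php
<?php
// Illustrative input: 4 and 1 both occur twice, and 4 is seen first.
$array = array(4, 1, 1, 4, 3);
$counts = array_count_values($array); // array(4 => 2, 1 => 2, 3 => 1)

$max = 0;
$first_tied = null;
foreach ($counts as $value => $amount) {
    if ($amount > $max) { // strict: an equal count never replaces the current item
        $max = $amount;
        $first_tied = $value;
    }
}

$max = 0;
$last_tied = null;
foreach ($counts as $value => $amount) {
    if ($amount >= $max) { // non-strict: a later equal count replaces the current item
        $max = $amount;
        $last_tied = $value;
    }
}

echo $first_tied, ' ', $last_tied, "\n"; // 4 1
```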
You could even get all elements tied for the maximum amount of occurrences this way:
$array = array(1, 1, 1, 4, 3, 1);
/* Our return values */
$max = 0;
$max_items = array();
$counts = array_count_values($array);
foreach ($counts as $value => $amount) {
if ($amount > $max) {
$max = $amount;
$max_items = array($value);
} elseif ($amount == $max) {
$max_items[] = $value;
}
}
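$vals = array_count_values($arr);
asort($vals); // ascending by count, so the most frequent value ends up last
end($vals); // move the internal pointer to the last element
echo key($vals);
asort sorts in ascending order, so the pointer has to be moved to the end with end($vals) before reading the key.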
<?php
// Build a test array of 100000 random integers between 0 and 1000
$arr = array();
for ($i = 0; $i < 100000; $i++)
{
    $arr[] = rand(0, 1000);
}
// Time the built-in array_count_values()
$start1 = microtime(true); // true returns the time as a float, so it can be subtracted
$count = array_count_values($arr);
$end1 = microtime(true);
echo $end1 - $start1;
echo '<br>';
// Time the equivalent hand-written loop
$start2 = microtime(true);
$tmparr = array();
foreach ($arr as $key => $value)
{
    if (isset($tmparr[$value]))
    {
        $tmparr[$value]++;
    } else
    {
        $tmparr[$value] = 1;
    }
}
$end2 = microtime(true);
echo $end2 - $start2;
This compares both solutions: one using array_count_values() and one written by hand.
<?php
$input = array(1,2,2,2,8,9);
$output = array();
$maxElement = 0;
for($i=0;$i<count($input);$i++) {
$count = 0;
for ($j = 0; $j < count($input); $j++) {
if ($input[$i] == $input[$j]) {
$count++;
}
}
if($count>$maxElement){
$maxElement = $count;
$a = $input[$i];
}
}
echo $a.' -> '.$maxElement;
The output will be 2 -> 3
$arrays = array(1, 2, 2, 2, 3, 1); // sample array
$count = array_count_values($arrays); // map each value to its number of occurrences
arsort($count); // sort by count, highest first
$key = key($count); // the key of the first element is the most repeated value
echo $key; // 2
String S;
Scanner in = new Scanner(System.in);
System.out.println("Enter the String: ");
S = in.nextLine();
int count =1;
int max = 1;
char maxChar=S.charAt(0);
for(int i=1; i <S.length(); i++)
{
count = S.charAt(i) == S.charAt(i - 1) ? (count + 1):1;
if(count > max)
{
max = count;
maxChar = S.charAt(i);
}
}
System.out.println("Longest run: "+max+", for the character "+maxChar);
Here is the solution:
class TestClass {
public $keyVal;
public $keyPlace = 0;
//put your code here
public function maxused_num($array) {
$temp = array();
$tempval = array();
$r = 0;
for ($i = 0; $i <= count($array) - 1; $i++) {
$r = 0;
for ($j = 0; $j <= count($array) - 1; $j++) {
if ($array[$i] == $array[$j]) {
$r = $r + 1;
}
}
$tempval[$i] = $r;
$temp[$i] = $array[$i];
}
//fetch max value
$max = 0;
for ($i = 0; $i <= count($tempval) - 1; $i++) {
if ($tempval[$i] > $max) {
$max = $tempval[$i];
}
}
//get value
for ($i = 0; $i <= count($tempval) - 1; $i++) {
if ($tempval[$i] == $max) {
$this->keyVal = $tempval[$i];
$this->keyPlace = $i;
break;
}
}
// 1. position in the array: $this->keyPlace;
// 2. number of repeats: $this->keyVal;
return $array[$this->keyPlace];
}
}
$catch = new TestClass();
$array = array(1, 1, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 1, 2, 3, 1, 1, 2, 5, 7, 1, 9, 0, 11, 22, 1, 1, 22, 22, 35, 66, 1, 1, 1);
echo $catch->maxused_num($array);
Given the array (3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32),
the frequent sequences of numbers are (3, 5) with f=2 and (4, 7, 13) with f=2.
Is there an algorithm or pseudocode to find them?
Update (1):
If (7, 13) also occurs on its own, it should be counted as part of the longest sequence that contains it by updating that sequence's frequency, so (4, 7, 13) would have f=3, and so on.
Update (2):
For (1,2,3,4,1,2,3,4,1,2,7,8,7,8,3,4,3,4,1,2) the output should be (1,2,3,4), (3,4,1,2) and (7,8). To make it clear, consider each number as a word; the goal is to find the most frequent phrases. The same word(s) will commonly appear in many phrases, but a phrase that is a substring of another phrase should not be reported as a phrase of its own. Instead, it updates the frequency of each phrase that includes it.
** EDIT ** : slightly better implementation; it now also returns frequencies and has a better sequence filter.
function getFrequences($input, $minimalSequenceSize = 2) {
$sequences = array();
$frequences = array();
$len = count($input);
for ($i=0; $i<$len; $i++) {
$offset = $i;
for ($j=$i+$minimalSequenceSize; $j<$len; $j++) {
if ($input[$offset] == $input[$j]) {
$sequenceSize = 1;
$sequence = array($input[$offset]);
while (($offset + $sequenceSize < $j)
&& ($input[$offset+$sequenceSize] == $input[$j+$sequenceSize])) {
if (false !== ($seqIndex = array_search($sequence, $sequences))) {
// we already have this sequence, since we found a bigger one, remove the old one
array_splice($sequences, $seqIndex, 1);
array_splice($frequences, $seqIndex, 1);
}
$sequence[] = $input[$offset+$sequenceSize];
$sequenceSize++;
}
if ($sequenceSize >= $minimalSequenceSize) {
if (false !== ($seqIndex = array_search($sequence, $sequences))) {
$frequences[$seqIndex]++;
} else {
$sequences[] = $sequence;
$frequences[] = 2; // we have two occurances already
}
// $i += $sequenceSize; // move $i so we don't reuse the same sub-sequence
break;
}
}
}
}
// remove sequences that are sub-sequence of another frequence
// ** comment this to keep all sequences regardless **
$len = count($sequences);
for ($i=0; $i<$len; $i++) {
$freq_i = $sequences[$i];
for ($j=$i+1; $j<$len; $j++) {
$freq_j = $sequences[$j];
$freq_inter = array_intersect($freq_i, $freq_j);
if (count($freq_inter) != 0) {
$len--;
if (count($freq_i) > count($freq_j)) {
array_splice($sequences, $j, 1);
array_splice($frequences, $j, 1);
$j--;
} else {
array_splice($sequences, $i, 1);
array_splice($frequences, $i, 1);
$i--;
break;
}
}
}
}
return array($sequences, $frequences);
};
Test case
header('Content-type: text/plain');
$input = array(3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 3, 5, 65, 4, 7, 13, 32, 5, 48, 4, 7, 13);
list($sequences, $frequences) = getFrequences($input);
foreach ($sequences as $i => $s) {
echo "(" . implode(',', $s) . ') f=' . $frequences[$i] . "\n";
}
** EDIT ** : here's an update to the function. It was almost completely rewritten... tell me if this is what you were looking for. I also added a redundancy check to prevent counting the same sequence, or subsequence, twice.
function getFrequences2($input, $minSequenceSize = 2) {
$sequences = array();
$last_offset = 0;
$last_offset_len = 0;
$len = count($input);
for ($i=0; $i<$len; $i++) {
for ($j=$i+$minSequenceSize; $j<$len; $j++) {
if ($input[$i] == $input[$j]) {
$offset = 1;
$sub = array($input[$i]);
while ($i + $offset < $j && $j + $offset < $len) {
if ($input[$i + $offset] == $input[$j + $offset]) {
array_push($sub, $input[$i + $offset]);
} else {
break;
}
$offset++;
}
$sub_len = count($sub);
if ($sub_len >= $minSequenceSize) {
// $sub must contain more elements than the last sequence found
// otherwise we will count the same sequence twice
if ($last_offset + $last_offset_len >= $i + $sub_len) {
// we already saw this sequence... ignore
continue;
} else {
// save offset and sub_len for future check
$last_offset = $i;
$last_offset_len = $sub_len;
}
foreach ($sequences as & $sequence) {
$sequence_len = count($sequence['values']);
if ($sequence_len == $sub_len && $sequence['values'] == $sub) {
//echo "Found add-full ".var_export($sub, true)." at $i and $j...\n";
$sequence['frequence']++;
break 2;
} else {
if ($sequence_len > $sub_len) {
$end = $sequence_len - $sub_len;
$values = $sequence['values'];
$slice_len = $sub_len;
$test = $sub;
} else {
$end = $sub_len - $sequence_len;
$values = $sub;
$slice_len = $sequence_len;
$test = $sequence['values'];
}
for ($k=0; $k<=$end; $k++) {
if (array_slice($values, $k, $slice_len) == $test) {
//echo "Found add-part ".implode(',',$sub)." which is part of ".implode(',',$values)." at $i and $j...\n";
$sequence['values'] = $values;
$sequence['frequence']++;
break 3;
}
}
}
}
//echo "Found new ".implode(',',$sub)." at $i and $j...\n";
array_push($sequences, array('values' => $sub, 'frequence' => 2));
break;
}
}
}
}
return $sequences;
};
In Python3
>>> from collections import Counter
>>> count_hash=Counter()
>>> T=(3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32)
>>> for i in range(2,len(T)+1):
...     for j in range(len(T)+1-i):
...         count_hash[T[j:j+i]]+=1
...
>>> for k,v in count_hash.items():
...     if v >= 2:
...         print(k,v)
...
(3, 5) 2
(4, 7, 13) 2
(7, 13) 2
(4, 7) 2
Do you need to filter the (7,13) and the (4,7) out? What happens if there was also (99, 7, 14) in the sequence?
A Counter is just like a hash, used here to keep track of the number of times we see each substring.
The two nested for loops produce all the substrings of T, using count_hash to accumulate the count of each substring.
The final for loop filters out all those substrings that only occurred once.
Here is a version with a filter:
from collections import Counter
def substrings(t, minlen=2):
    tlen = len(t)
    return (t[j:j+i] for i in range(minlen, tlen+1) for j in range(tlen+1-i))
def get_freq(*t):
    counter = Counter(substrings(t))
    for k in sorted(counter, key=len):
        v=counter[k]
        if v < 2:
            del counter[k]
            continue
        for t in substrings(k):
            if t in counter:
                if t==k:
                    continue
                counter[k]+=counter[t]-v
                del counter[t]
    return counter
print(get_freq(3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32, 4, 7))
print(get_freq(1,2,3,4,1,2,3,4,1,2,7,8,7,8,3,4,3,4,1,2))
the output is
Counter({(4, 7, 13): 3, (3, 5): 2})
Counter({(1, 2, 3, 4, 1, 2): 8, (7, 8): 2}) # Is this the right answer?
Which is why I asked how the filtering should work for the sequence I gave in the comments
Ok, just to start off the discussion.
Create another array/map, call this the weightage array.
Start iterating on the values array.
For each value in the values array, increment the corresponding position in the weightage array. E.g. for 3, weightage[3]++; for 48, weightage[48]++.
After the iteration, the weightage array contains the repetitions.
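The weightage idea could be sketched in PHP roughly as follows. The variable names are illustrative. Note that this counts how often each single value occurs; it does not yet find repeated sequences, so it is only a starting point.

```php
<?php
// Sketch of the weightage idea; $values, $weightage and $repeated are illustrative names.
$values = array(3, 5, 1, 3, 5, 48, 4, 7, 13, 55, 65, 4, 7, 13, 32);
$weightage = array();
foreach ($values as $value) {
    // Start at 0 for a value not seen before, then count this occurrence.
    $weightage[$value] = ($weightage[$value] ?? 0) + 1;
}

// Keep only the values that occur more than once (keys are preserved).
$repeated = array_filter($weightage, function ($n) {
    return $n > 1;
});
print_r($repeated); // keys 3, 5, 4, 7 and 13, each with a count of 2
```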
I am looking to create an auto incrementing unique string using PHP, containing [a-Z 0-9] starting at 2 chars long and growing when needed.
This is for a url shrinker so each string (or alias) will be saved in the database attached to a url.
Any insight would be greatly appreciated!
Note this solution won't produce uppercase letters.
Use base_convert() to convert to base 36, which will use [a-z0-9].
<?php
// outputs a, b, c, ..., 2o, 2p, 2q
for ($i = 10; $i < 99; ++$i)
echo base_convert($i, 10, 36), "\n";
Given the last used value, you can convert it back to an integer with intval(), increment it, and convert the result back to base 36 with base_convert().
<?php
$value = 'bc9z';
$value = intval($value, 36);
++$value;
$value = base_convert($value, 10, 36);
echo $value; // bca0
// or
echo $value = base_convert(intval($value, 36) + 1, 10, 36);
Here's an implementation of an incr function which takes a string containing characters [0-9a-zA-Z] and increments it, pushing a 0 onto the front if required using the 'carry-the-one' method.
<?php
function incr($num) {
$chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
$parts = str_split((string)$num);
$carry = 1;
for ($i = count($parts) - 1; $i >= 0 && $carry; --$i) {
$value = strpos($chars, $parts[$i]) + 1;
if ($value >= strlen($chars)) {
$value = 0;
$carry = 1;
} else {
$carry = 0;
}
$parts[$i] = $chars[$value];
}
if ($carry)
array_unshift($parts, $chars[0]);
return implode($parts);
}
$num = '0';
for ($i = 0; $i < 1000; ++$i) {
echo $num = incr($num), "\n";
}
If your string was single case rather than mixed, and didn't contain numerics, then you could literally just increment it:
$testString="AA";
for($x = 0; $x < 65536; $x++) {
echo $testString++.'<br />';
}
$testString="aa";
for($x = 0; $x < 65536; $x++) {
echo $testString++.'<br />';
}
But you could possibly make some use of this feature even with a mixed alphanumeric string
To expand on meagar's answer, here is how you can do it with uppercase letters as well, and for arbitrarily large numbers (this requires the bcmath extension, but you could just as well use gmp or the bigintegers PEAR package):
function base10ToBase62($number) {
static $chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
$result = "";
$n = $number;
do {
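$remainder = (int) bcmod($n, '62'); // bcmath works on numeric strings
$n = bcdiv($n, '62', 0); // scale 0 gives integer division
$result = $chars[$remainder] . $result;
} while (bccomp($n, '0') > 0); // compare as arbitrary-precision numbers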
return $result;
}
for ($i = 10; $i < 99; ++$i) {
echo base10ToBase62((string) $i), "\n";
}