How to calculate the hamming distance of two binary sequences in PHP?

How to calculate the hamming distance of two binary sequences in PHP? - php

hamming('10101010','01010101')
The result of the above should be 8.
How to implement it?

without installed GMP here is easy solution for any same-length binary strings
function HammingDistance($bin1, $bin2) {
$a1 = str_split($bin1);
$a2 = str_split($bin2);
$dh = 0;
for ($i = 0; $i < count($a1); $i++)
if($a1[$i] != $a2[$i]) $dh++;
return $dh;
}
echo HammingDistance('10101010','01010101'); //returns 8

You don't need to implement it because it already exists:
http://php.net/manual/en/function.gmp-hamdist.php
(If you have GMP support)

The following function works with hex strings (equal length), longer than 32 bits.
function hamming($hash1, $hash2) {
$dh = 0;
$len1 = strlen($hash1);
$len2 = strlen($hash2);
$len = 0;
do {
$h1 = hexdec(substr($hash1, $len, 8));
$h2 = hexdec(substr($hash2, $len, 8));
$len += 8;
for ($i = 0; $i < 32; $i++) {
$k = (1 << $i);
if (($h1 & $k) !== ($h2 & $k)) {
$dh++;
}
}
} while ($len < $len1);
return $dh;
}

If you don't have GMP support there is always something like this. Downside it only works on binary strings up to 32 bits in length.
function hamdist($x, $y){
for($dist = 0, $val = $x ^ $y; $val; ++$dist){
$val &= $val - 1;
}
return $dist;
}
function hamdist_str($x, $y){
return hamdist(bindec($x), bindec($y));
}
echo hamdist_str('10101010','01010101'); //8

Try this function:
function hamming($b1, $b2) {
$b1 = ltrim($b1, '0');
$b2 = ltrim($b2, '0');
$l1 = strlen($b1);
$l2 = strlen($b2);
$n = min($l1, $l2);
$d = max($l1, $l2) - $n;
for ($i=0; $i<$n; ++$i) {
if ($b1[$l1-$i] != $b2[$l2-$i]) {
++$d;
}
}
return $d;
}

You can easily code your hamming function with the help of substr_count() and the code provided in this comment on the PHP manual.
/* hamdist is equivilent to: */
echo gmp_popcount(gmp_xor($ham1, $ham2)) . "\n";

Try:
echo gmp_hamdist('10101010','01010101')

Related

PHP Function To Get Character Number

How can I write a function that gives me number of the character that is passed to it
For example, if the funciton name is GetCharacterNumber and I pass B to it then it should give me 2
GetCharacterNumber("A") // should print 1
GetCharacterNumber("C") // should print 3
GetCharacterNumber("Z") // should print 26
GetCharacterNumber("AA") // should print 27
GetCharacterNumber("AA") // should print 27
GetCharacterNumber("AC") // should print 29
Is it even possible to achieve this ?

There is a function called ord which gives you the ASCII number of the character.
ord($chr) - ord('A') + 1
gives you the correct result for one character. For longer strings, you can use a loop.
<?php
function GetCharacterNumber($str) {
$num = 0;
for ($i = 0; $i < strlen($str); $i++) {
$num = 26 * $num + ord($str[$i]) - 64;
}
return $num;
}
GetCharacterNumber("A"); //1
GetCharacterNumber("C"); //3
GetCharacterNumber("Z"); //26
GetCharacterNumber("AA"); //27
GetCharacterNumber("AC"); //29
GetCharacterNumber("BA"); //53
?>

Not very efficient but gets the job done:
function get_character_number($end)
{
$count = 1;
$char = 'A';
$end = strtoupper($end);
while ($char !== $end) {
$count++;
$char++;
}
return $count;
}
echo get_character_number('AA'); // 27
demo
This works because when you got something like $char = 'A' and do $char++, it will change to 'B', then 'C', 'D', … 'Z', 'AA', 'AB' and so on.
Note that this will become the slower the longer $end is. I would not recommend this for anything beyond 'ZZZZ' (475254 iterations) or if you need many lookups like that.
An better performing alternative would be
function get_character_number($string) {
$number = 0;
$string = strtoupper($string);
$dictionary = array_combine(range('A', 'Z'), range(1, 26));
for ($pos = 0; isset($string[$pos]); $pos++) {
$number += $dictionary[$string[$pos]] + $pos * 26 - $pos;
}
return $number;
}
echo get_character_number(''), PHP_EOL; // 0
echo get_character_number('Z'), PHP_EOL; // 26
echo get_character_number('AA'), PHP_EOL; // 27
demo

Use range and strpos:
$letter = 'z';
$alphabet = range('a', 'z');
$position = strpos($alphabet, $letter);
For double letters (eg zz) you'd probably need to create your own alphabet using a custom function:
$alphabet = range('a', 'z');
$dictionary = range('a','z');
foreach($alphabet AS $a1){
foreach($alphabet AS $a2) {
$dictionary[] = $a1 . $a2;
}
}
Then use $dictionary in place of $alphabet.

Here is the full code that does what you want.
Tested it and works perfectly for the examples you gave.
define('BASE', 26);
function singleOrd($chr) {
if (strlen($chr) == 1) {
return (ord($chr)-64);
} else{
return 0;
}
}
function multiOrd($string) {
if (strlen($string) == 0) {
return 0;
} elseif (strlen($string) == 1) {
return singleOrd($string);
} else{
$sum = 0;
for($i = strlen($string) - 1; $i >= 0; $i--) {
$sum += singleOrd($string[$i]) * pow(BASE, $i);
}
}
return $sum;
}

I think ord should be a more efficient way to have your number :
$string = strtolower($string);
$result = 0;
$length = strlen($string);
foreach($string as $key=>$value){
$result = ($length -$key - 1)*(ord($value)-ord(a)+1);
}
and result would contain what you want.

Can you help me understand this simple PHP function, please?

I would like to know how this function works so I can re-write it in ColdFusion. I've never programmed in PHP. I think, the function checks if $string is 11 numbers in length.
But how is it doing it?
Why is it using a loop instead of a simple len() function?
I read about str_pad(), and understand it. But what is the function of the loop iteration $i as a third argument? What's it doing?
if($this->check_fake($cpf, 11)) return FALSE;
function check_fake($string, $length)
{
for($i = 0; $i <= 9; $i++) {
$fake = str_pad("", $length, $i);
if($string === $fake) return(1);
}
}
The function is meant to validate a CPF, the Brazilian equivalent of a US SSN #. The CPF is 11 characters in length. Basically, I need to know what it's doing so I can write the function in Coldfusion.
If it's just a length, can't it be if (len(cpf) != 11) return false; ?
Here is the entire code snippet if it interests you:
<?
/*
*# Class VALIDATE - It validates Brazilian CPF/CNPJ numbers
*# Developer: André Luis Cupini - andre#neobiz.com.br
************************************************************/
class VALIDATE
{
/*
*# Remove ".", "-", "/" of the string
*****************************************************/
function cleaner($string)
{
return $string = str_replace("/", "", str_replace("-", "", str_replace(".", "", $string)));
}
/*
*# Check if the number is fake
*****************************************************/
function check_fake($string, $length)
{
for($i = 0; $i <= 9; $i++) {
$fake = str_pad("", $length, $i);
if($string === $fake) return(1);
}
}
/*
*# Validates CPF
*****************************************************/
function cpf($cpf)
{
$cpf = $this->cleaner($cpf);
$cpf = trim($cpf);
if(empty($cpf) || strlen($cpf) != 11) return FALSE;
else {
if($this->check_fake($cpf, 11)) return FALSE;
else {
$sub_cpf = substr($cpf, 0, 9);
for($i =0; $i <=9; $i++) {
$dv += ($sub_cpf[$i] * (10-$i));
}
if ($dv == 0) return FALSE;
$dv = 11 - ($dv % 11);
if($dv > 9) $dv = 0;
if($cpf[9] != $dv) return FALSE;
$dv *= 2;
for($i = 0; $i <=9; $i++) {
$dv += ($sub_cpf[$i] * (11-$i));
}
$dv = 11 - ($dv % 11);
if($dv > 9) $dv = 0;
if($cpf[10] != $dv) return FALSE;
return TRUE;
}
}
}
}
?>

it basically makes sure the number isn't:
11111111111
22222222222
33333333333
44444444444
55555555555
etc...

With comments:
function check_fake($string, $length)
{
// for all digits
for($i = 0; $i <= 9; $i++)
{
// create a $length characters long string, consisting of
// $length number of the number $i
// e.g. 00000000000
$fake = str_pad("", $length, $i);
// return true if the provided CPN is equal
if($string === $fake) return(1);
}
}
In short, it tests to see if the provided string is just the same digit repeated $length times.

php's preg_replace() versus(vs.) ord()

What is quicker, for camelCase to underscores;
using preg_replace() or using ord() ?
My guess is the method using ord will be quicker,
since preg_replace can do much more then needed.
<?php
function __autoload($class_name){
$name = strtolower(preg_replace('/([a-z])([A-Z])/', '$1_$2', $class_name));
require_once("some_dir/".$name.".php");
}
?>
OR
<?php
function __autoload($class_name){
// lowercase first letter
$class_name[0] = strtolower($class_name[0]);
$len = strlen($class_name);
for ($i = 0; $i < $len; ++$i) {
// see if we have an uppercase character and replace
if (ord($class_name[$i]) > ord('A') && ord($class_name[$i]) < ord('Z')) {
$class_name[$i] = '_' . strtolower($class_name[$i]);
// increase length of class and position
++$len;
++$i;
}
}
return $class_name;
}
?>
disclaimer -- code examples taken from StackOverflowQuestion 1589468.
edit, after jensgram's array-suggestion and finding array_splice i have come up with the following :
<?php
function __autoload ($string)// actually, function camel2underscore
{
$string = str_split($string);
$pos = count( $string );
while ( --$pos > 0 )
{
$lower = strtolower( $string[ $pos ] );
if ( $string[ $pos ] === $lower )
{
// assuming most letters will be underscore this should be improvement
continue;
}
unset( $string[ $pos ] );
array_splice( $string , $pos , 0 , array( '_' , $lower ) );
}
$string = implode( '' , $string );
return $string;
}
// $pos could be avoided by using the array key, something i might look into later on.
?>
When i will be testing these methods i will add this one
but feel free to tell me your results at anytime ;p

i think (and i'm pretty much sure) that the preg_replace method will be faster - but if you want to know, why dont you do a little benchmark calling both functions 100000 times and measure the time?

(Not an answer but too long to be a comment - will CW)
If you're going to compare, you should at least optimize a little on the ord() version.
$len = strlen($class_name);
$ordCurr = null;
$ordA = ord('A');
$ordZ = ord('Z');
for ($i = 0; $i < $len; ++$i) {
$ordCurr = ord($class_name[$i]);
// see if we have an uppercase character and replace
if ($ordCurr >= $ordA && $ordCurr <= $ordZ) {
$class_name[$i] = '_' . strtolower($class_name[$i]);
// increase length of class and position
++$len;
++$i;
}
}
Also, pushing the name onto a stack (an array) and joining at the end might prove more efficient than string concatenation.
BUT Is this worth the optimization / profiling in the first place?

My usecase was slightly different than the OP's, but I think it's still illustrative of the difference between preg_replace and manual string manipulation.
$a = "16 East, 95 Street";
echo "preg: ".test_preg_replace($a)."\n";
echo "ord: ".test_ord($a)."\n";
$t = microtime(true);
for ($i = 0; $i &lt 100000; $i++) test_preg_replace($a);
echo (microtime(true) - $t)."\n";
$t = microtime(true);
for ($i = 0; $i &lt 100000; $i++) test_ord($a);
echo (microtime(true) - $t)."\n";
function test_preg_replace($s) {
return preg_replace('/[^a-z0-9_-]/', '-', strtolower($s));
}
function test_ord($s) {
$a = ord('a');
$z = ord('z');
$aa = ord('A');
$zz = ord('Z');
$zero = ord('0');
$nine = ord('9');
$us = ord('_');
$ds = ord('-');
$toret = '';
for ($i = 0, $len = strlen($s); $i < $len; $i++) {
$c = ord($s[$i]);
if (($c >= $a && $c <= $z)
|| ($c >= $zero && $c <= $nine)
|| $c == $us
|| $c == $ds)
{
$toret .= $s[$i];
}
elseif ($c >= $aa && $c <= $zz)
{
$toret .= chr($c + $a - $aa); // strtolower
}
else
{
$toret .= '-';
}
}
return $toret;
}
The results are
0.42064881324768
2.4904868602753
so the preg_replace method is vastly superior. Also, string concatenation is slightly faster than inserting into an array and imploding it.

If all you want to do is convert camel case to underscores, you can probably write a more efficient function to do so than either ord or preg_replace in less time than it takes to profile them.

I've written a benchmark using the following four functions and I figured out that the one implemented in Magento is the fastest one (it's Test4):
Test1:
/**
* #see: http://www.paulferrett.com/2009/php-camel-case-functions/
*/
function fromCamelCase_1($str)
{
$str[0] = strtolower($str[0]);
return preg_replace('/([A-Z])/e', "'_' . strtolower('\\1')", $str);
}
Test2:
/**
* #see: http://stackoverflow.com/questions/3995338/phps-preg-replace-versusvs-ord#answer-3995435
*/
function fromCamelCase_2($str)
{
// lowercase first letter
$str[0] = strtolower($str[0]);
$newFieldName = '';
$len = strlen($str);
for ($i = 0; $i < $len; ++$i) {
$ord = ord($str[$i]);
// see if we have an uppercase character and replace
if ($ord > 64 && $ord < 91) {
$newFieldName .= '_';
}
$newFieldName .= strtolower($str[$i]);
}
return $newFieldName;
}
Test3:
/**
* #see: http://www.paulferrett.com/2009/php-camel-case-functions/#div-comment-133
*/
function fromCamelCase_3($str) {
$str[0] = strtolower($str[0]);
$func = create_function('$c', 'return "_" . strtolower($c[1]);');
return preg_replace_callback('/([A-Z])/', $func, $str);
}
Test4:
/**
* #see: http://svn.magentocommerce.com/source/branches/1.6-trunk/lib/Varien/Object.php :: function _underscore($name)
*/
function fromCamelCase_4($name) {
return strtolower(preg_replace('/(.)([A-Z])/', "$1_$2", $name));
}
Result using the string "getExternalPrefix" 1000 times:
fromCamelCase_1: 0.48158717155457
fromCamelCase_2: 2.3211658000946
fromCamelCase_3: 0.63665509223938
fromCamelCase_4: 0.18188905715942
Result using random strings like "WAytGLPqZltMfHBQXClrjpTYWaEEkyyu" 1000 times:
fromCamelCase_1: 2.3300149440765
fromCamelCase_2: 4.0111720561981
fromCamelCase_3: 2.2800230979919
fromCamelCase_4: 0.18472790718079
Using the test-strings I got a different output - but this should not appear in your system:
original:
MmrcgUmNfCCTOMwwgaPuGegEGHPzvUim
last test:
mmrcg_um_nf_cc_to_mwwga_pu_geg_eg_hpzv_uim
other tests:
mmrcg_um_nf_c_c_t_o_mwwga_pu_geg_e_g_h_pzv_uim
As you can see at the timestamps - the last test has the same time in both tests :)

Auto incrementing unique url

I am looking to create an auto incrementing unique string using PHP, containing [a-Z 0-9] starting at 2 chars long and growing when needed.
This is for a url shrinker so each string (or alias) will be saved in the database attached to a url.
Any insight would be greatly appreciated!

Note this solution won't produce uppercase letters.
Use base_convert() to convert to base 36, which will use [a-z0-9].
<?php
// outputs a, b, c, ..., 2o, 2p, 2q
for ($i = 10; $i < 99; ++$i)
echo base_convert($i, 10, 36), "\n";
Given the last used number, you can convert it back to an integer with intval() increment it and convert the result back to base 36 with base_convert().
<?php
$value = 'bc9z';
$value = intval($value, 36);
++$value;
$value = base_convert($value, 10, 36);
echo $value; // bca0
// or
echo $value = base_convert(intval($value, 36) + 1, 10, 36);

Here's an implementation of an incr function which takes a string containing characters [0-9a-zA-Z] and increments it, pushing a 0 onto the front if required using the 'carry-the-one' method.
<?php
function incr($num) {
$chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
$parts = str_split((string)$num);
$carry = 1;
for ($i = count($parts) - 1; $i >= 0 && $carry; --$i) {
$value = strpos($chars, $parts[$i]) + 1;
if ($value >= strlen($chars)) {
$value = 0;
$carry = 1;
} else {
$carry = 0;
}
$parts[$i] = $chars[$value];
}
if ($carry)
array_unshift($parts, $chars[0]);
return implode($parts);
}
$num = '0';
for ($i = 0; $i < 1000; ++$i) {
echo $num = incr($num), "\n";
}

If your string was single case rather than mixed, and didn't contain numerics, then you could literally just increment it:
$testString="AA";
for($x = 0; $x < 65536; $x++) {
echo $testString++.'<br />';
}
$testString="aa";
for($x = 0; $x < 65536; $x++) {
echo $testString++.'<br />';
}
But you could possibly make some use of this feature even with a mixed alphanumeric string

To expand on meagar's answer, here is how you can do it with uppercase letters as well and for number arbitrarily big (requires the bcmath extension, but you could as well use gmp or the bigintegers pear package):
function base10ToBase62($number) {
static $chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
$result = "";
$n = $number;
do {
$remainder = bcmod($n, 62);
$n = bcdiv($n, 62);
$result = $chars[$remainder] . $result;
} while ($n > 0);
return $result;
}
for ($i = 10; $i < 99; ++$i) {
echo base10ToBase62((string) $i), "\n";
}

PHP Random Number

I want to generate a random number in PHP where the digits itself should not repeat in that number.
Is that possible?
Can you paste sample code here?
Ex: 674930, 145289. [i.e Same digit shouldn't come]
Thanks

Here is a good way of doing it:
$amountOfDigits = 6;
$numbers = range(0,9);
shuffle($numbers);
for($i = 0;$i < $amountOfDigits;$i++)
$digits .= $numbers[$i];
echo $digits; //prints 217356
If you wanted it in a neat function you could create something like this:
function randomDigits($length){
$numbers = range(0,9);
shuffle($numbers);
for($i = 0;$i < $length;$i++)
$digits .= $numbers[$i];
return $digits;
}

function randomize($len = false)
{
$ints = array();
$len = $len ? $len : rand(2,9);
if($len > 9)
{
trigger_error('Maximum length should not exceed 9');
return 0;
}
while(true)
{
$current = rand(0,9);
if(!in_array($current,$ints))
{
$ints[] = $current;
}
if(count($ints) == $len)
{
return implode($ints);
}
}
}
echo randomize(); //Numbers that are all unique with a random length.
echo randomize(7); //Numbers that are all unique with a length of 7
Something along those lines should do it

<?php
function genRandomString() {
$length = 10; // set length of string
$characters = '0123456789'; // for undefined string
$string ="";
for ($p = 0; $p < $length; $p++) {
$string .= $characters[mt_rand(0, strlen($characters))];
}
return $string;
}
$s = genRandomString(); //this is your random print var
or
function rand_string( $length )
{
$chars = "0123456789";
$size = strlen( $chars );
for( $i = 0; $i < $length; $i++ )
{
$str .= $chars[ rand( 0, $size – 1 ) ];
}
return $str;
}
$rid= rand_string( 6 ); // 6 means length of generate string
?>

$result= "";
$numbers= "0123456789";
$length = 8;
$i = 0;
while ($i < $length)
{
$char = substr($numbers, mt_rand(0, strlen($numbers)-1), 1);
//prevents duplicates
if (!strstr($result, $char))
{
$result .= $char;
$i++;
}
}
This should do the trick. In $numbers you can put any char you want, for example: I have used this to generate random passwords, productcodes etc.

The least amount of code I saw for something like this was:
function random_num($n=5)
{
return rand(0, pow(10, $n));
}
But I'm assuming it requires more processing to do this than these other methods.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

How to calculate the hamming distance of two binary sequences in PHP? - php

hamming('10101010','01010101') The result of the above should be 8. How to implement it?

You don't need to implement it because it already exists: http://php.net/manual/en/function.gmp-hamdist.php (If you have GMP support)

Try this function: function hamming($b1, $b2) { $b1 = ltrim($b1, '0'); $b2 = ltrim($b2, '0'); $l1 = strlen($b1); $l2 = strlen($b2); $n = min($l1, $l2); $d = max($l1, $l2) - $n; for ($i=0; $i<$n; ++$i) { if ($b1[$l1-$i] != $b2[$l2-$i]) { ++$d; } } return $d; }

You can easily code your hamming function with the help of substr_count() and the code provided in this comment on the PHP manual. /* hamdist is equivilent to: */ echo gmp_popcount(gmp_xor($ham1, $ham2)) . "\n";

Try: echo gmp_hamdist('10101010','01010101')

Related

PHP Function To Get Character Number

Can you help me understand this simple PHP function, please?

php's preg_replace() versus(vs.) ord()

Auto incrementing unique url

PHP Random Number

Categories

Resources