I'm trying to align strings in PHP using Levenshtein distance algorithm. The problem is that my back tracing code does not work properly for all cases. For example when the second array has inserted lines at the beginning. Then the back tracing will only go as far as when i=0.
How to properly implement back tracing for Levenshtein distance?
Levenshtein distance, $s and $t are arrays of strings (rows)
function match_rows($s, $t)
{
$m = count($s);
$n = count($t);
for($i = 0; $i <= $m; $i++) $d[$i][0] = $i;
for($j = 0; $j <= $n; $j++) $d[0][$j] = $j;
for($i = 1; $i <= $m; $i++)
{
for($j = 1; $j <= $n; $j++)
{
if($s[$i-1] == $t[$j-1])
{
$d[$i][$j] = $d[$i-1][$j-1];
}
else
{
$d[$i][$j] = min($d[$i-1][$j], $d[$i][$j-1], $d[$i-1][$j-1]) + 1;
}
}
}
// backtrace
$i = $m;
$j = $n;
while($i > 0 && $j > 0)
{
$min = min($d[$i-1][$j], $d[$i][$j-1], $d[$i-1][$j-1]);
switch($min)
{
// equal or substitution
case($d[$i-1][$j-1]):
if($d[$i][$j] != $d[$i-1][$j-1])
{
// substitution
$sub['i'][] = $i;
$sub['j'][] = $j;
}
$i = $i - 1;
$j = $j - 1;
break;
// insertion
case($d[$i][$j-1]):
$ins[] = $j;
$j = $j - 1;
break;
// deletion
case($d[$i-1][$j]):
$del[] = $i;
$i = $i - 1;
break;
}
}
This is not to be nit-picky, but to help you find the answers you want (and improve your implementation).
The algorithm you are using is the Wagner-Fischer algorithm, not the Levenshtein algorithm. Also, Levenshtein distance is not use to align strings. It is strictly a distance measurement.
There are two types of alignment: global and local. Global alignment is used to minimize the distance between two entire strings. Example: global align "RACE" on "REACH", you get "RxACx". The x's are gaps.
In general, this is the Needleman-Wunsch algorithm, which is very similar to the Wagner-Fischer algorithm. Local alignment finds a substring in a long string and minimizes the difference between a short string and a the substring of the long string.
Example: local align "BELL" on "UMBRELLA" and you get "BxELL" aligned on "BRELL". It is the Smith-Waterman algorithm which, again, is very similar to the Wagner-Fischer algorithm.
I hope that this is helpful in allowing you to better define the exact kind of alignment you want.
I think your bug is exactly what you say in your question that it is: you stop as soon as i==0, instead of going all the way to i==0 && j==0. Simply replace this condition:
while($i > 0 && $j > 0)
with
while ($i > 0 || $j > 0)
and you're halfway to your solution. The tricky bit is that if $i==0, then it's incorrect to use the array index $i-1 in the loop body. So you'll also have to change the body of the loop to something more like
while ($i || $j) {
$min = $d[$i][$j]; // or INT_MAX or something
if ($i && $j && $min > $d[$i-1][$j-1]) {
$newi = $i-1;
$newj = $j-1;
$min = $d[$newi][$newj];
}
if ($i && $min > $d[$i-1][$j]) {
$newi = $i-1;
$newj = $j;
$min = $d[$newi][$newj];
}
if ($j && $min > $d[$i][$j-1]) {
$newi = $i;
$newj = $j-1;
$min = $d[$newi][$newj];
}
// What sort of transformation is this?
if ($newi == $i && $newj == $j) {
assert(false); // should never happen
} else if ($newi == $i) {
// insertion
$ins[] = $j;
} else if ($newj == $j) {
// deletion
$del[] = $i;
} else if ($d[$i][$j] != $d[$newi][$newj]) {
// substitution
$sub['i'][] = $i;
$sub['j'][] = $j;
} else {
// identity
}
assert($newi >= 0); assert($newj >= 0);
$i = $newi;
$j = $newj;
}
Related
I want to use gaussian elimination to solve the following matrix Matrix and this is the answer I'm expecting. I would like to get back an equation in the form as is displayed in answer but i can't figure out how to do it.
public function gauss($A, $x) {
# Just make a single matrix
for ($i=0; $i < count($A); $i++) {
$A[$i][] = $x[$i];
}
$n = count($A);
for ($i=0; $i < $n; $i++) {
# Search for maximum in this column
$maxEl = abs($A[$i][$i]);
$maxRow = $i;
for ($k=$i+1; $k < $n; $k++) {
if (abs($A[$k][$i]) > $maxEl) {
$maxEl = abs($A[$k][$i]);
$maxRow = $k;
}
}
# Swap maximum row with current row (column by column)
for ($k=$i; $k < $n+1; $k++) {
$tmp = $A[$maxRow][$k];
$A[$maxRow][$k] = $A[$i][$k];
$A[$i][$k] = $tmp;
}
# Make all rows below this one 0 in current column
for ($k=$i+1; $k < $n; $k++) {
$c = -$A[$k][$i]/$A[$i][$i];
for ($j=$i; $j < $n+1; $j++) {
if ($i==$j) {
$A[$k][$j] = 0;
} else {
$A[$k][$j] += $c * $A[$i][$j];
}
}
}
}
# Solve equation Ax=b for an upper triangular matrix $A
$x = array_fill(0, $n, 0);
for ($i=$n-1; $i > -1; $i--) {
$x[$i] = $A[$i][$n]/$A[$i][$i];
for ($k=$i-1; $k > -1; $k--) {
$A[$k][$n] -= $A[$k][$i] * $x[$i];
}
}
return $x;
}
I hope someone can help me to rewrite this code so it gives the solution i've provided or recommend a library which is capable of doing this.
I've searched for possible solutions on Google but haven't been able to find one yet.
Thanks in advance.
I found it online and there are no comments.
It comes with a Complex.class that basically simulates complex numbers and their operations.
I'd like to comment it myself but I really can't identify which algorithm is being used. I went online and found that the Cooley-Tukey algorithm is the most widespread, but I'm not sure that this code uses it.
private $dim;
private $p;
private $ind;
private $func;
private $w1;
private $w1i;
private $w2;
public function __construct($dim) {
$this->dim = $dim;
$this->p = log($this->dim, 2);
}
public function fft($func) {
$this->func = $func;
// Copying func in w1 as a complex.
for ($i = 0; $i < $this->dim; $i++)
$this->w1[$i] = new Complex($func[$i], 0);
$w[0] = new Complex(1, 0);
$w[1] = new Complex(cos((-2 * M_PI) / $this->dim), sin((-2 * M_PI) / $this->dim));
for ($i = 2; $i < $this->dim; $i++)
$w[$i] = Complex::Cmul($w[$i-1], $w[1]);
return $this->calculate($w);
}
private function calculate($w) {
$k = 1;
$ind[0] = 0;
for ($j = 0; $j < $this->p; $j++) {
for ($i = 0; $i < $k; $i++) {
$ind[$i] *= 2;
$ind[$i+$k] = $ind[$i] + 1;
}
$k *= 2;
}
for ($i = 0; $i < $this->p; $i++) {
$indw = 0;
for ($j = 0; $j < pow(2, $i); $j++) {
$inf = ($this->dim / pow(2, $i)) * $j;
$sup = (($this->dim / pow(2, $i)) * ($j+1)) - 1;
$comp = ($this->dim / pow(2, $i)) / 2;
for ($k = $inf; $k <= floor($inf+(($sup-$inf)/2)); $k++)
$this->w2[$k] = Complex::Cadd(Complex::Cmul($this->w1[$k], $w[0]), Complex::Cmul($this->w1[$k+$comp], $w[$ind[$indw]]));
$indw++;
for ($k = floor($inf+(($sup-$inf)/2)+1); $k <= $sup; $k++)
$this->w2[$k] = Complex::Cadd(Complex::Cmul($this->w1[$k], $w[$ind[$indw]]), Complex::Cmul($this->w1[$k-$comp], $w[0]));
$indw++;
}
for($j = 0; $j < $this->dim; $j++)
$this->w1[$j] = $this->w2[$j];
}
for ($i = 0; $i < $this->dim; $i++)
$this->w1[$i] = $this->w2[$ind[$i]];
return $this->w1;
}
This is a radix 2 FFT implementation based on the Cooley-Tukey algorithm, with the work being done in the function "compute". It will only work with FFT lengths that are a power of 2, although I don't see any parameter checking in the function itself.
$i iterates over the multiple FFT passes (of which there are log2(N) where N is the length of the FFT), and in each pass, the twiddle factors (stored in $w) are multiplied by the output from the previous stage before finding the complex sum and difference.
There are much better implementations of FFTs out there, such as FFTW, that implement a mixed radix approach, which allow an arbitrary length of FFT to be computed.
Here is my code:
$n = 300;
$set = 0;
$set2 = 0;
for($i = 1; $i<$n; $i++)
{
for($j = 1; $j <$i; $j++)
{
$qol = $i % $j;
if($qol == 0)
{
$set += $j;
}
}
for($s=1; $s<$set; $s++)
{
$qol2 = $set % $s;
if($s == 0)
{
$set2 += $s;
}
}
if($set2 == $i)
{
echo "$set and $i are amicable numbers</br>";
}
}
I do not know what the heck the problem is!
FYI: 220 and 284 are an example of amicable numbers. The sum of the proper divisors of one number are equal to other number and vice versa (wiki).
I am having troubles following your logic. In your code how would $set2 == $i ever be true? Seems to me that $i would always be greater.
I would do it the following way:
First make a separate function that finds the sums of the proper divisors:
// Function to output sum of proper divisors of $num
function sumDiv($num) {
// Return 0 if $num is 1 or less
if ($num <= 1) {
return 0;
}
$result = 1; // All nums divide by 1
$sqrt = sqrt($num);
// Add divisors to result
for ($i = 2; $i < $sqrt; $i++) {
if ($num % $i == 0) {
$result += $i + $num / $i;
}
}
// If perfect square add squareroot to result
if (floor($sqrt) == $sqrt) {
$result += $sqrt;
}
return $result;
}
Next check each iteration for a match:
$n = 1500;
for ($i = 1; $i < $n; $i++) {
// Get sum of proper devisors of $i, and sum of div. of result.
$currentDivs = sumDiv($i);
$resultDivs = sumDiv($currentDivs);
// Check for a match with sums not equal to each other.
if ($i == $resultDivs && $currentDivs != $resultDivs) {
echo "$i and $currentDivs are amicable numbers<br>";
}
}
Here a functioning phpfiddle.
Warning: Large numbers will take very long to process!
I have this piece of code that loops 1 through 99 and is a formula.
function getExperienceByLevel ($maxLevel)
{
$levels = array ();
$current = 0;
for ($i = 1; $i <= $maxLevel; $i++)
{
$levels[$i - 1] = floor ($current / 4);
$current += floor($i+300*pow(2, ($i/9.75)));
}
return $levels;
}
First you initiate it like so $aLevels = getExperienceByLevel(99); then to see how much EXP you need to get to level 6 you do this echo $aLevels[5]; since it's an array.
Now I'm trying to do reverse. Get Level by EXP.
function getLevelByExp($exp)
{
$myLevel = 0;
$aLevels = getExperienceByLevel(99);
for ($i = 1; $i < 100; $i++)
{
if ($exp > $aLevels[$i-1])
{
return $myLevel;
}
}
}
When called upon getLevelByExp(1124); or any number inside, it seems to return a zero. But it seems to work when you put echos inside that statement.
Like instead of return $myLevel do echo "You are up to level $i<br />"; and it will go all the way up to the current level you've gained EXP for.
But still.. doesn't work when I want to simply return a number.
This seems to work better than your function:
function getLevelByExp($exp)
{
$aLevels = getExperienceByLevel(99);
for ($i = 0; $i <= 99; ++$i)
{
//echo "cmp $exp >= aLevels[$i]={$aLevels[$i]}\n";
if ($exp <= $aLevels[$i])
return $i - 1;
}
return -1;
}
It needs improvement for the edge cases, such as when $exp is zero.
Return $i instead because it always '0'
if ($exp > $aLevels[$i-1]) {
return $i;
}
You never change $myLevel, so it will always stay at 0.
Try returning $i instead of $myLevel, as $i is actually changing:
function getLevelByExp($exp) {
$aLevels = getExperienceByLevel(99);
for ($i = 1; $i < 100; $i++) {
if ($exp > $aLevels[$i-1]) {
return $i;
}
}
}
Would you write this code any differently? I don't mind tertiary operator by the way so any ideas that include it are also welcome
if ($stopYear < $startYear) {
for ($i = $startYear; $i >= $stopYear; $i--) {
$yearMultiOptions[$i] = $i;
}
} else {
for ($i = $startYear; $i <= $stopYear; $i++) {
$yearMultiOptions[$i] = $i;
}
}
$min = min($startYear, $stopYear);
$max = max($startYear, $stopYear);
for ($i = $min; $i <= $max; $i++) {
$yearMultiOptions[$i] = $i;
}
I don't know php, so min and max might have different syntax, but you get the idea.
You could do something like this:
$step = ($stopYear < $startYear) ? -1 : 1;
for ($i = $startYear; $i != $stopYear; $i += step) {
$yearMultiOptions[$i] = $i;
}