So I've read the two related questions for calculating a trend line for a graph, but I'm still lost.
I have an array of xy coordinates, and I want to come up with another array of xy coordinates (can be fewer coordinates) that represent a logarithmic trend line using PHP.
I'm passing these arrays to JavaScript to plot graphs on the client side.
Logarithmic Least Squares
Since we can convert a logarithmic function into a line by taking the log of the x values, we can perform a linear least squares curve fitting. In fact, the work has been done for us and a solution is presented at Math World.
In brief, we're given $X and $Y values that are from a distribution like y = a + b * log(x). The least squares method will give values aFit and bFit that minimize the sum of squared vertical distances between the fitted curve and the given data points.
Here is an example implementation in PHP:
First I'll generate some random data with a known underlying distribution given by $a and $b:
// True parameter values
$a = 10;
$b = 5;
// Range of x values to generate
$x_min = 1;
$x_max = 10;
$nPoints = 50;
// Generate some random points on y = a + b * log(x)
$X = array();
$Y = array();
for($p = 0; $p < $nPoints; $p++){
$x = $p / $nPoints * ($x_max - $x_min) + $x_min;
$y = $a + $b * log($x);
$X[] = $x + rand(0, 200) / ($nPoints * $x_max);
$Y[] = $y + rand(0, 200) / ($nPoints * $x_max);
}
Now, here's how to use the equations given to estimate $a and $b.
// Now convert to log-scale for X
$logX = array_map('log', $X);
// Now estimate $a and $b using equations from Math World
$n = count($X);
// Anonymous functions are used here; create_function() was removed in PHP 8.0
$square = function ($x) { return pow($x, 2); };
$x_squared = array_sum(array_map($square, $logX));
$xy = array_sum(array_map(function ($x, $y) { return $x * $y; }, $logX, $Y));
$bFit = ($n * $xy - array_sum($Y) * array_sum($logX)) /
($n * $x_squared - pow(array_sum($logX), 2));
$aFit = (array_sum($Y) - $bFit * array_sum($logX)) / $n;
You may then generate points for your JavaScript as densely as you like:
$Yfit = array();
foreach($X as $x) {
$Yfit[] = $aFit + $bFit * log($x);
}
In this case, the code estimates bFit = 5.17 and aFit = 9.7, which is quite close for only 50 data points.
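The steps above can be collected into reusable functions. This is a minimal sketch rather than the answerer's code: the names fitLogTrend and logTrendPoints are hypothetical, it assumes PHP 7.4 or later (arrow functions), every x greater than 0, and at least two distinct x values. It returns evenly spaced points, so the trend line can have fewer coordinates than the data.

```php
<?php
// Hypothetical helper: least-squares fit of y = a + b * log(x).
// Assumes all x > 0 and at least two distinct x values.
function fitLogTrend(array $X, array $Y): array
{
    $n = count($X);
    $logX = array_map('log', $X);
    $sumLogX = array_sum($logX);
    $sumY = array_sum($Y);
    $sumLogX2 = array_sum(array_map(fn($lx) => $lx * $lx, $logX));
    $sumLogXY = array_sum(array_map(fn($lx, $y) => $lx * $y, $logX, $Y));

    $b = ($n * $sumLogXY - $sumY * $sumLogX) / ($n * $sumLogX2 - $sumLogX ** 2);
    $a = ($sumY - $b * $sumLogX) / $n;
    return ['a' => $a, 'b' => $b];
}

// Hypothetical helper: $nOut (>= 2) evenly spaced [x, y] pairs on the fitted curve.
function logTrendPoints(array $X, array $Y, int $nOut): array
{
    $fit = fitLogTrend($X, $Y);
    $xMin = min($X);
    $xMax = max($X);
    $points = [];
    for ($i = 0; $i < $nOut; $i++) {
        $x = $xMin + ($xMax - $xMin) * $i / ($nOut - 1);
        $points[] = [$x, $fit['a'] + $fit['b'] * log($x)];
    }
    return $points;
}

// Noise-free sample data on y = 10 + 5 * log(x)
$X = [1, 2, 4, 8, 16];
$Y = array_map(fn($x) => 10 + 5 * log($x), $X);
$fit = fitLogTrend($X, $Y);
$trend = logTrendPoints($X, $Y, 3);
echo json_encode($trend), "\n"; // JSON array of [x, y] pairs for the client-side plot
```

Because the sample data has no noise, the fit should recover a and b up to rounding; with real data the output of json_encode can be handed to whatever plotting code runs in the browser.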
For the example data given in the comment below, a logarithmic function does not fit well.
The least squares solution is y = -514.734835478 + 2180.51562281 * log(x) which is essentially a line in this domain.
I would recommend using this library: http://www.drque.net/Projects/PolynomialRegression/
It is available via Composer: https://packagist.org/packages/dr-que/polynomial-regression.
In case anyone is having problems with create_function (deprecated in PHP 7.2 and removed in PHP 8.0), here is how I edited it. (I wasn't using logs, so I took those out.)
I also reduced the number of calculations and added an R² value. It seems to work so far.
function lsq(){
$X = array(1,2,3,4,5);
$Y = array(.3,.2,.7,.9,.8);
// Now estimate $a and $b using equations from Math World
$n = count($X);
$mult_elem = function($x,$y){ //anon function mult array elements
$output=$x*$y; //will be called on each element
return $output;
};
$sumX2 = array_sum(array_map($mult_elem, $X, $X));
$sumXY = array_sum(array_map($mult_elem, $X, $Y));
$sumY = array_sum($Y);
$sumX = array_sum($X);
$bFit = ($n * $sumXY - $sumY * $sumX) /
($n * $sumX2 - pow($sumX, 2));
$aFit = ($sumY - $bFit * $sumX) / $n;
echo ' intercept ',$aFit,' ';
echo ' slope ',$bFit,' ' ;
//r2
$sumY2 = array_sum(array_map($mult_elem, $Y, $Y));
$top=($n*$sumXY-$sumY*$sumX);
$bottom=($n*$sumX2-$sumX*$sumX)*($n*$sumY2-$sumY*$sumY);
$r2=pow($top/sqrt($bottom),2);
echo ' r2 ',$r2;
}
I have a task that requires computing the slope and intercept of two sets of data by linear regression. According to the following link, this is easily done in R:
https://www.datacamp.com/community/tutorials/linear-regression-R
The code is simply:
model <- lm(sales ~ youtube, data = marketing)
However, I need to implement it in PHP. Is that possible?
Normally you would do this kind of operation in R, but since you have to do it in PHP this time, you can use the following function:
<?php
function linear_regression($x, $y) {
// calculate number of points
$n = count($x);
// ensure both arrays of points are the same size
if ($n != count($y)) {
trigger_error("linear_regression(): Number of elements in coordinate arrays do not match.", E_USER_ERROR);
}
// calculate sums
$x_sum = array_sum($x);
$y_sum = array_sum($y);
$xx_sum = 0;
$xy_sum = 0;
for($i = 0; $i < $n; $i++) {
$xy_sum+=($x[$i]*$y[$i]);
$xx_sum+=($x[$i]*$x[$i]);
}
// calculate slope
$m = (($n * $xy_sum) - ($x_sum * $y_sum)) / (($n * $xx_sum) - ($x_sum * $x_sum));
// calculate intercept
$b = ($y_sum - ($m * $x_sum)) / $n;
// return result
return array("m"=>$m, "b"=>$b);
}
?>
As an example, you may use the following code to get the slope and intercept of the two sets of data:
$a=array();
$b=array();
array_push($a,$adata1);
array_push($a,$adata2);
array_push($a,$adata3);
array_push($a,$adata4);
array_push($b,$bdata1);
array_push($b,$bdata2);
array_push($b,$bdata3);
array_push($b,$bdata4);
$aa= linear_regression($a, $b);
$slope1= $aa["m"];
$intercept1= $aa["b"];
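For a concrete run, here is a self-contained sketch with made-up data lying exactly on y = 2x + 1. The function name simple_linear_regression and the sample values are illustrative assumptions; the arithmetic is the same as in linear_regression above, with an exception in place of trigger_error.

```php
<?php
// Illustrative restatement of the same least-squares formulas.
function simple_linear_regression(array $x, array $y): array
{
    $n = count($x);
    if ($n !== count($y) || $n < 2) {
        throw new InvalidArgumentException('Need two arrays of equal length with at least 2 points.');
    }
    $x_sum = array_sum($x);
    $y_sum = array_sum($y);
    $xx_sum = 0;
    $xy_sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $xy_sum += $x[$i] * $y[$i];
        $xx_sum += $x[$i] * $x[$i];
    }
    // slope, then intercept
    $m = ($n * $xy_sum - $x_sum * $y_sum) / ($n * $xx_sum - $x_sum * $x_sum);
    $b = ($y_sum - $m * $x_sum) / $n;
    return ['m' => $m, 'b' => $b];
}

$youtube = [1, 2, 3, 4]; // made-up predictor values
$sales   = [3, 5, 7, 9]; // made-up responses: exactly 2 * x + 1
$model = simple_linear_regression($youtube, $sales);
echo "slope: {$model['m']}, intercept: {$model['b']}\n";
```

Note that if every x value is identical the denominator is zero, so real input may need a check for that case before dividing.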
I tried to do something like this:
$total = 0;
for ($j = 0; $j < 1000; $j++) {
$x = $j / 1000;
$total += pow($x, 1500) * pow((1 - $x), 500);
}
$total is 0.
PHP floats can't represent values this small, so every term underflows to 0. What can I do? Which libraries can I use?
The function
f(x) = x^1500 * (1-x)^500
has (logarithmic) derivative
f'(x)/f(x)=d/dx log(f(x))
= 1500/x - 500/(1-x)
which is zero for
x0 = 3/4
having the maximum value of
f(3/4) = 3^1500/2^4000 = exp(-1124.6702892376163)
= 10^(-488.4381005764309)
= 3.646694848749686e-489
Using that as reference value, one can now sum up
f(i/1000)/f(3/4)=exp(1500*log(i/1000)+500*log(1-i/1000)+1124.6702892376163)
giving a sum of 24.26257515625789 so that the desired result is
24.26257515625789*f(3/4)=8.847820783972776e-488
A practical way to compute such a sum would be to compute the list of logarithms (the notation below is closer to Python than PHP; look up the corresponding PHP array operations)
logf = [ log(f(i/1000.0)) for i=1..999 ]
using the transformed logarithm of f, log(f(x))=1500*log(x)+500*log(1-x).
Then compute maxlogf = max(logf), extract the number N=floor(maxlogf/log(10)) of the decimal power and compute the sum as
sumfred = sum([ exp( logfx - N*log(10) ) for logfx in logf ])
so that the final result is sumfred*10^N.
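The recipe above might look like this in PHP. This is a sketch under the answer's own definitions (logf, maxlogf, N, sumfred); it skips i = 0 because f(0) = 0 and log(0) is not finite, and it prints the result as a mantissa and a decimal exponent because the value itself is too small for a PHP float.

```php
<?php
// Sketch: sum f(i/1000) with f(x) = x^1500 * (1-x)^500 for i = 1..999 without
// the total underflowing to 0.
$logf = [];
for ($i = 1; $i <= 999; $i++) {
    $x = $i / 1000;
    $logf[] = 1500 * log($x) + 500 * log(1 - $x);
}

// Decimal exponent N taken from the largest term
$maxlogf = max($logf);
$N = (int) floor($maxlogf / log(10));

// Each scaled term exp(logfx - N*log(10)) is at most 10; terms far from the
// maximum still underflow to 0, which does not affect the sum noticeably.
$sumfred = 0.0;
foreach ($logf as $logfx) {
    $sumfred += exp($logfx - $N * log(10));
}

// The true value is sumfred * 10^N; print it as text instead of multiplying it out.
printf("%.6fe%d\n", $sumfred, $N);
```

If the value is needed for further arithmetic rather than display, keeping the pair ($sumfred, $N) and working with exponents is one option; an arbitrary-precision extension such as BCMath would be another.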
I am new to PHP and Stack Overflow, and I try to figure things out for myself before asking, but I am having a little trouble doing some maths on an array I have pulled from a database with PHP.
So far I have an array of numbers called $array['sn']
I have created a formula in Excel that does the maths and works well there, but I can't figure out a way to do it in PHP.
the Excel function is =QUOTIENT(E32,65536)&QUOTIENT(E32-F34*65536,256)&(G33-G35*256)
E32 being the value I start with i.e $sn
F34 being the answer to the first quotient
G35 being the answer to the second quotient
G33 being E32-F34*65536
I want to take a number, e.g. 3675177, and divide it by 65536 discarding the remainder, which gives 56; then multiply 56 by 65536, which equals 3670016; then take the difference between 3675177 and 3670016, which is 5161. Then divide 5161 by 256 discarding the remainder, which gives 20; then multiply 20 by 256 and subtract the result from 5161, which gives 41.
The end result from 3675177 should be 562041. I want to do this calculation on every number in the $array['sn'], any help would be appreciated.
The calculation and formatting of the output would be like this:
$n = 3675177;
$const = 65536;
$const2 = 256;
$a = intval($n / $const); // intval returns only the integer part of a number
$x = $n % $const; // $n % $const means "the remainder of $n / $const"
$b = intval($x / $const2);
$c = $x % $const2;
// Two options to handle values of $c < 10:
// if ($c < 10) $c = "0$c";
// $c = str_pad($c, 2, "0", STR_PAD_LEFT);
echo "$a$b$c";
I would recommend using array_map to apply the calculation to your array of values.
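A minimal sketch of that array_map approach follows. The function name convertSn and the sample values are hypothetical stand-ins for the database result, and intdiv assumes PHP 7 or later. Without strict_types, numeric strings coming back from the database are coerced to int by the parameter type.

```php
<?php
// Hypothetical helper: n / 65536, then the remainder / 256, then what is left,
// joined together as text.
function convertSn(int $n): string
{
    $a = intdiv($n, 65536);
    $x = $n % 65536;
    $b = intdiv($x, 256);
    $c = $x % 256;
    return $a . $b . $c;
}

// Sample data standing in for the database result
$array = ['sn' => [3675177, 65536, 257]];
$results = array_map('convertSn', $array['sn']);
print_r($results);
```

For 3675177 this gives "562041", matching the value asked for in the question. The parts are joined without padding, so different inputs can produce the same string unless the last two parts are zero-padded.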
There are php arithmetic operations you can use.
I would do something like this:
$initialNumber = 3675177; // the initial number, wherever you get it from
$entireDivision = intdiv($initialNumber, 65536); // 56
$entireMultiplied = $entireDivision * 65536; // 3670016
$difference = $initialNumber - $entireMultiplied; // 5161
$differenceDivided = intdiv($difference, 256); // 20
$differenceMultipliedAndSubtracted = $difference - ($differenceDivided * 256); // 41
Maybe I used too many variables; that is only to make it easier for you to follow. I may have gotten an operation wrong, so check it. But this is the idea of doing arithmetic in PHP. You should probably put this inside a PHP function with parameters, so your code stays cleaner if you use it multiple times.
EDIT: You should put this code inside a function, then loop over your array with foreach, calling the function with each element as its parameter.
$results = array();
foreach ($array['sn'] as $key => $a) {
$b = intval($a / 65536);
$c = ($a - $b * 65536);
$d = intval($c / 256);
$e = $c - $d * 256;
$results[$key] = $b . $d . $e;
}
var_dump($results);
Need a little help
I have
$_POST["zapremina"]=2000;
$_POST["starost"]="15%";
$_POST["namena"]="50%";
I want simple function to do this
$foo=(2000 - 15%) - 50%;
How to do that?
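PHP is loosely typed, so a leading-numeric string such as "15%" is converted to 15 when used in arithmetic and you don't strictly need operations like str_replace. Note, though, that PHP 7 and later raise a notice or warning for such strings, which an explicit cast such as (float) avoids.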
You can do the following:
$z = $_POST["zapremina"]; //$_POST["zapremina"]=2000;
$s = $_POST["starost"]; //$_POST["starost"]="15%";
$n = $_POST["namena"]; //$_POST["namena"]="50%";
$result = (($z - ($z *($s / 100))) - ($z * ($n / 100)));
Remember to use parentheses and meaningful variable names to keep the code readable.
Like this:
$starostPercentage = (substr($_POST["starost"], 0, -1) / 100);
$namenaPercentage = (substr($_POST["namena"], 0, -1) / 100);
$foo = ($_POST["zapremina"] * (1 - $starostPercentage)) * (1 - $namenaPercentage);
This is what this does and why:
Convert the percentages (like 15%) from their text form to their decimal form (substr("15%", 0, -1) = "15", 15 / 100 = 0.15).
Calculate $foo with these decimals. 2000 - 15% is what you would write as a human, but in PHP you need to write that as 2000 * (1 - 0.15), meaning 85% of 2000.
I'd go with this:
$zap = intval($_POST['zapremina']);
$sta = intval($_POST['starost']);
$nam = intval($_POST['namena']);
$foo = ($zap * ((100-$sta)/100)) * ((100 - $nam)/100);
Add this function and then call it:
function calculation($a, $b, $c)
{
$b = substr($b, 0, -1) / 100;
$c = substr($c, 0, -1) / 100;
return (($a * (1 - $b)) * (1 - $c));
}
Now you can call it like this:
$foo = calculation($_POST["zapremina"], $_POST["starost"], $_POST["namena"]);
Prefer a function in most cases, because it makes the calculation reusable.
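The answers in this thread read (2000 - 15%) - 50% in two different ways, which give different results. The sketch below shows both side by side; the function names are hypothetical, and it assumes the inputs look like "15%" or "15".

```php
<?php
// Hypothetical helper: turn "15%" (or "15") into the fraction 0.15.
function percentToFraction(string $value): float
{
    return (float) rtrim(trim($value), '%') / 100;
}

// Reading 1: each percentage is taken off the running result (2000 -> 1700 -> 850).
function deductSuccessively(float $amount, string ...$percents): float
{
    foreach ($percents as $p) {
        $amount *= 1 - percentToFraction($p);
    }
    return $amount;
}

// Reading 2: every percentage is taken off the original amount (2000 - 300 - 1000).
function deductFromOriginal(float $amount, string ...$percents): float
{
    $total = 0.0;
    foreach ($percents as $p) {
        $total += percentToFraction($p);
    }
    return $amount * (1 - $total);
}

$successive = deductSuccessively(2000, '15%', '50%');
$fromOriginal = deductFromOriginal(2000, '15%', '50%');
echo $successive, ' ', $fromOriginal, "\n";
```

Which reading is right depends on what the percentages mean in the application, so it is worth confirming that before picking one of the formulas.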
I've been attempting to implement Vincenty's formulae with the following:
/* Implemented using Vincenty's formulae from http://en.wikipedia.org/wiki/Vincenty%27s_formulae,
* answers "Direct Problem".
* $latlng is a ('lat'=>x1, 'lng'=>y1) array
* $distance is in miles
* $bearing is in degrees
*/
function addDistance($latlng, $distance, $bearing) {
//variables
$bearing = deg2rad($bearing);
$iterations = 20; //avoid too-early termination while avoiding the non-convergent case
//knowns
$f = EARTH_SPHEROID_FLATTENING; //1/298.257223563
$a = EARTH_RADIUS_EQUATOR_MILES; //3963.185 mi
$phi1 = deg2rad($latlng['lat']);
$l1 = deg2rad($latlng['lng']);
$b = (1 - $f) * $a;
//first block
$tanU1 = (1-$f)*tan($phi1);
$U1 = atan($tanU1);
$sigma1 = atan($tanU1 / cos($bearing));
$sinalpha = cos($U1)*sin($bearing);
$cos2alpha = (1 - $sinalpha) * (1 + $sinalpha);
$usquared = $cos2alpha * (($a*$a - $b*$b) / 2);
$A = 1 + ($usquared)/16384 * (4096+$usquared*(-768+$usquared*(320 - 175*$usquared)));
$B = ($usquared / 1024)*(256*$usquared*(-128 + $usquared * (74 - 47*$usquared)));
//the loop - determining our value
$sigma = $distance / ($b * $A);
for($i = 0; $i < $iterations; ++$i) {
$twosigmam = 2*$sigma1 + $sigma;
$delta_sigma = $B * sin($sigma) * (cos($twosigmam)+(1/4)*$B*(cos(-1 + 2*cos(cos($twosigmam))) - (1/6)*$B*cos($twosigmam)*(-3+4*sin(sin($sigma)))*(-3+4*cos(cos($twosigmam)))));
$sigma = $distance / ($b * $A) + $delta_sigma;
}
//second block
$phi2 = atan((sin($U1)*cos($sigma)+cos($U1)*sin($sigma)*cos($bearing)) / ((1-$f) * sqrt(sin($sinalpha) + pow(sin($U1)*sin($sigma) - cos($U1)*cos($sigma)*cos($bearing), 2))));
$lambda = atan((sin($sigma) * sin($bearing)) / (cos($U1)*cos($sigma) - sin($U1)*sin($sigma)*cos($bearing)));
$C = ($f / 16)* $cos2alpha * (4+$f*(4-3*$cos2alpha));
$L = $lambda - (1 - $C) * $f * $sinalpha * ($sigma + $C*sin($sigma)*(cos($twosigmam)+$C*cos($sigma)*(-1+2*cos(cos($twosigmam)))));
$alpha2 = atan($sinalpha / (-sin($U1)*sin($sigma) + cos($U1)*cos($sigma)*cos($bearing)));
//and return our results
return array('lat' => rad2deg($phi2), 'lng' => rad2deg($lambda));
}
var_dump(addDistance(array('lat' => 93.129, 'lng' => -43.221), 20, 135));
The issue is that the results are not reasonable - I'm getting variances of up to 20 latitude and longitude keeping the distance at 20. Is it not in units of elliptical distance on the sphere? Am I misunderstanding something, or is my implementation flawed?
There are a number of errors in transcription from the Direct Problem section of the Wikipedia page:
Your u² expression has 2 in the denominator where it should have b²;
Your A and B expressions are inconsistent about whether the initial fraction factor is parenthesised to express a / b * c as (a/b) * c. PHP gives / and * equal precedence and evaluates them left to right, so both forms mean (a/b) * c, but you should favour clarity;
You should be iterating "until there is no significant change in sigma", which may or may not happen in your fixed number of iterations;
There are errors in your DELTA_sigma formula:
on the Wikipedia page, the first term inside the square bracket [ is cos sigma (-1 etc, whereas you have cos (-1 etc, which is very different;
in the same formula and also later, note that cos² x means (cos x)(cos x), not cos cos x!
Your phi_2 formula has a sin($sinalpha) where it should have $sinalpha*$sinalpha, since the formula calls for sin² alpha and $sinalpha already holds sin alpha;
I think that's all.
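For comparison, here is a sketch of the direct problem with the points above applied. It is my reading of the formulas on the Wikipedia page, not a tested library: the function name vincentyDirect is hypothetical and the constants repeat the values quoted in the question's comments. Beyond the list above, it also uses 256 + u²(...) inside B, atan2 in place of atan, and adds L to the starting longitude. The returned longitude is not normalised to the range -180..180.

```php
<?php
// Sketch of Vincenty's direct problem. $distance is in miles, angles in degrees.
function vincentyDirect(array $latlng, float $distance, float $bearing): array
{
    $f = 1 / 298.257223563; // flattening
    $a = 3963.185;          // equatorial radius in miles
    $b = (1 - $f) * $a;     // polar radius

    $alpha1 = deg2rad($bearing);
    $phi1 = deg2rad($latlng['lat']);
    $l1 = deg2rad($latlng['lng']);
    $sinAlpha1 = sin($alpha1);
    $cosAlpha1 = cos($alpha1);

    $tanU1 = (1 - $f) * tan($phi1);
    $cosU1 = 1 / sqrt(1 + $tanU1 * $tanU1);
    $sinU1 = $tanU1 * $cosU1;
    $sigma1 = atan2($tanU1, $cosAlpha1);
    $sinAlpha = $cosU1 * $sinAlpha1;
    $cos2Alpha = 1 - $sinAlpha * $sinAlpha;
    $u2 = $cos2Alpha * ($a * $a - $b * $b) / ($b * $b);
    $A = 1 + ($u2 / 16384) * (4096 + $u2 * (-768 + $u2 * (320 - 175 * $u2)));
    $B = ($u2 / 1024) * (256 + $u2 * (-128 + $u2 * (74 - 47 * $u2)));

    // Iterate until sigma stops changing (capped in case it does not converge)
    $sigma = $distance / ($b * $A);
    for ($i = 0; $i < 100; $i++) {
        $cos2SigmaM = cos(2 * $sigma1 + $sigma);
        $sinSigma = sin($sigma);
        $cosSigma = cos($sigma);
        $deltaSigma = $B * $sinSigma * ($cos2SigmaM + ($B / 4) * (
            $cosSigma * (-1 + 2 * $cos2SigmaM ** 2)
            - ($B / 6) * $cos2SigmaM * (-3 + 4 * $sinSigma ** 2) * (-3 + 4 * $cos2SigmaM ** 2)
        ));
        $previous = $sigma;
        $sigma = $distance / ($b * $A) + $deltaSigma;
        if (abs($sigma - $previous) < 1e-12) {
            break;
        }
    }

    // Recompute the trigonometric terms for the final sigma
    $sinSigma = sin($sigma);
    $cosSigma = cos($sigma);
    $cos2SigmaM = cos(2 * $sigma1 + $sigma);

    $tmp = $sinU1 * $sinSigma - $cosU1 * $cosSigma * $cosAlpha1;
    $phi2 = atan2(
        $sinU1 * $cosSigma + $cosU1 * $sinSigma * $cosAlpha1,
        (1 - $f) * sqrt($sinAlpha * $sinAlpha + $tmp * $tmp)
    );
    $lambda = atan2($sinSigma * $sinAlpha1, $cosU1 * $cosSigma - $sinU1 * $sinSigma * $cosAlpha1);
    $C = ($f / 16) * $cos2Alpha * (4 + $f * (4 - 3 * $cos2Alpha));
    $L = $lambda - (1 - $C) * $f * $sinAlpha * ($sigma + $C * $sinSigma * (
        $cos2SigmaM + $C * $cosSigma * (-1 + 2 * $cos2SigmaM ** 2)
    ));

    return ['lat' => rad2deg($phi2), 'lng' => rad2deg($l1 + $L)];
}

$east = vincentyDirect(['lat' => 0, 'lng' => 0], 20, 90);
$north = vincentyDirect(['lat' => 0, 'lng' => 0], 20, 0);
var_dump($east, $north);
```

Two sanity checks are easy to reason about: heading due east along the equator should leave the latitude at 0 and move the longitude by distance / a radians, and heading due north from the equator should leave the longitude at 0.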
Have you tried this:
https://github.com/treffynnon/Geographic-Calculations-in-PHP