As a fun side-project for myself to help in learning yet another PHP MVC framework, I've been writing Reversi / Othello as a PHP & Ajax application, mostly straightforward stuff. I decided against using a multidimensional array for a number of reasons and instead have a linear array ( in this case 64 elements long ) and a couple methods to convert from the coordinates to integers.
So I was curious: are there any other, possibly faster, algorithms for converting an integer to a coordinate point?
function int2coord($i){
$x = (int)($i/8);
$y = $i - ($x*8);
return array($x, $y);
}
//Not a surprise but this is .003 MS slower on average
function int2coord_2($i){
$b = base_convert($i, 10, 8);
$x = (int) ($b / 10); // $b is the octal string, so its tens digit is the row
$y = $b % 10;
return array($x, $y);
}
And for posterity's sake, here is the method I wrote for coord2int:
function coord2int($x, $y){
return ($x*8)+$y;
}
Update:
So, in the land of the weird, the results were not what I was expecting: the pre-computed lookup table has mostly proven to be the fastest. I guess trading memory for speed is always a winner?
There was a table with times here but I cut it due to styling issues with SO.
Oh yes! This is a perfect use case for binary operations:
function int2coord($i){
$x = $i >> 3;
$y = $i & 0x07;
return array($x, $y);
}
The reality is that a good compiler will find this optimization and use it, so it's not necessarily faster. Test and see if your compiler/interpreter does this.
It works because, for non-negative integers, division by 8 is the same as a right shift by 3 bits, and the remainder is simply the low 3 bits (hence the mask 0x07). Modern processors have barrel shifters that can do up to a 32-bit shift in one instruction.
The reverse is as easy:
function coord2int($x, $y){
return ($x << 3)+$y;
}
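A quick way to convince yourself that the shift-and-mask version matches the divide-and-subtract version is to compare them over every board index. This is only a sketch: the names int2coord_div, int2coord_bits and coord2int_bits are made up for the comparison, and it assumes non-negative indices from 0 to 63.

```php
<?php
// Hypothetical names, used only to compare the two approaches side by side.
function int2coord_div(int $i): array {
    $x = intdiv($i, 8);          // row: integer division by 8
    return [$x, $i - $x * 8];    // column: what is left over
}

function int2coord_bits(int $i): array {
    return [$i >> 3, $i & 0x07]; // row: drop the low 3 bits; column: keep them
}

function coord2int_bits(int $x, int $y): int {
    return ($x << 3) + $y;
}

// Check all 64 squares, including the round trip back to the index.
$mismatches = 0;
for ($i = 0; $i < 64; $i++) {
    [$x, $y] = int2coord_bits($i);
    if (int2coord_div($i) !== [$x, $y] || coord2int_bits($x, $y) !== $i) {
        $mismatches++;
    }
}
echo $mismatches === 0 ? "all 64 indices agree\n" : "$mismatches mismatches\n";
```

Whether the bitwise form is actually faster is a separate question that only a benchmark on your own PHP version can answer.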
-Adam
I don't have the time to measure this myself right now, but I would suspect that a pre-computed lookup table would beat your solution in speed. The code would look something like this:
class Converter {
private $_table;
function __construct()
{
$this->_table = array();
for ($i=0; $i<64; $i++) {
$this->_table[$i] = array( (int)($i/8), (int)($i%8) );
}
}
function int2coord( $i )
{
return $this->_table[$i];
}
}
$conv = new Converter();
$coord = $conv->int2coord( 42 );
Of course, this does add a lot of overhead, so in practice you would only bother to pre-compute all coordinates if your conversion code was called very often.
I'm not in a position to measure right now, but you should be able to eke out some additional speed with this:
function int2coord($i){
$y = $i%8;
$x = (int)($i/8);
return array($x, $y);
}
edit: ignore me -- Adam's bitshifting answer should be superior.
function int2coord_3($i){
return array((int) ($i / 8), ($i % 8));
}
This is a little faster because there is no variable declaration or assignment.
I think most of your performance is lost by returning array(...) at the end. Instead, I propose:
* define two functions, one for x and one for y
or
* inline the bit arithmetic in code needing the calculation
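The first option could look something like this. The names coord_x and coord_y are hypothetical, and whether skipping the array allocation really helps is something you would have to measure yourself.

```php
<?php
// Hypothetical helper names; each returns a plain integer, so no array is built.
function coord_x(int $i): int {
    return $i >> 3;    // row index, same as intdiv($i, 8) for $i >= 0
}

function coord_y(int $i): int {
    return $i & 0x07;  // column index, same as $i % 8 for $i >= 0
}

echo coord_x(42), ',', coord_y(42), "\n"; // prints "5,2"
```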
I am using this PHP routine to calc Pearson Correlation:
function correlation ($x,$y) {
$length = count($x);
$mean1 = array_sum($x)/$length;
$mean2 = array_sum($y)/$length;
$a = $b = 0;
$a2 = $b2 = 0;
$axb = 0;
for ($i = 0; $i < $length; $i++) {
$a = $x[$i]-$mean1;
$b = $y[$i]-$mean2;
$axb +=$a*$b;
$a2 += pow($a,2);
$b2 += pow($b,2);
}
if ($sqrt = sqrt($a2*$b2))
return $axb/$sqrt;
return 0;
}
When I test it under several conditions, it returns 0 on exact matches:
echo correlation([0,0,0,0,0],[0,0,0,0,0]); // Returns 0!!
echo correlation([0,0,0,0,0],[1,1,1,1,1]); // Returns 0!!
echo correlation([1,1,1,1,1],[1,1,1,1,1]); // Returns 0!!
echo correlation([0,0,0,0,0],[9,9,9,9,9]); // Returns 0!!
echo correlation([0,0,0,0,0],[0,1,2,3,4]); // Returns 0 OK
echo correlation([9,9,9,9,9],[0,1,2,3,4]); // Returns 0 OK
echo correlation([0,1,2,3,4],[0,1,2,3,4]); // Returns 1 OK
Why, and how can I get the expected result? Thank you!
For info:
A Pearson correlation is a number between -1 and 1 that indicates the
extent to which two variables are linearly related. The Pearson
correlation is also known as the “product moment correlation
coefficient” (PMCC) or simply “correlation”.
Approach 1 (doing it on your own):
Doing statistics in PHP is a hard path.
First of all, PHP is a weakly typed language (you don't declare types on variables), so your inputs may be treated as int; cast all of your values to float and run it again. Floats in PHP have problems of their own; see here for what I mean: https://3v4l.org/1FU9J
If you don't need high precision, you can round the results with round(), or set ini_set('precision', 3); to limit the precision of your output.
Another thing: if you do need precision, use the bc extension, because floating-point rounding in PHP can affect your results.
Read more about the BC Math extension here: https://www.php.net/manual/en/book.bc.php or try another language.
Some references about the floating point:
https://www.leaseweb.com/labs/2013/06/the-php-floating-point-precision-is-wrong-by-default/
Problem with Floats! (in PHP)
Approach 2 (using library functions):
PHP also has a function for this in the PECL stats extension. So, if this isn't homework or a learning exercise, you can try it: https://www.php.net/manual/en/function.stats-stat-correlation.php
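Independent of the precision issues, note what happens in the routine from the question when one of the inputs is constant: every deviation from the mean is zero, so the denominator sqrt($a2*$b2) is zero and the coefficient is mathematically undefined (0/0), neither 0 nor 1. A minimal sketch that makes this explicit by returning null; the name pearson and the null convention are assumptions for illustration, not part of any PHP library:

```php
<?php
// Sketch: Pearson correlation that reports "undefined" (null) when either
// input has zero variance, instead of silently returning 0.
function pearson(array $x, array $y): ?float {
    $n = count($x);
    if ($n === 0 || $n !== count($y)) {
        return null;                       // nothing to correlate
    }
    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;
    $sxy = $sxx = $syy = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $dx = $x[$i] - $meanX;
        $dy = $y[$i] - $meanY;
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
    }
    if ($sxx == 0.0 || $syy == 0.0) {
        return null;                       // constant input: 0/0, undefined
    }
    return $sxy / sqrt($sxx * $syy);
}

var_dump(pearson([1, 1, 1, 1, 1], [1, 1, 1, 1, 1])); // NULL
var_dump(pearson([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])); // float(1)
```

If you would rather treat two identical constant arrays as a perfect match, that is a business rule you have to add yourself; the formula does not give it to you.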
As generic of a question as this seems, I'm having a really hard time
learning specifically about how to base-convert large high-precision float values in PHP using BCMath.
I'm trying to base-convert something like
1234.5678900000
to
4D2.91613D31B
How can I do this?
I just want base-10 → base-16, but a conversion for arbitrary-base floats would probably make the most useful answer for others as well.
How to convert a huge integer to hex in php? involves BC, but only for integers.
https://www.exploringbinary.com/base-conversion-in-php-using-bcmath/ explores floats, but only in the context of decimal<->binary. (It says extending the code for other bases is easy, and it probably is (using the code in the previous point), but I have no idea how to reason through the correctness of the result I'd reach.)
Fast arbitrary-precision logarithms with bcmath is also float-based, but in the context of reimplementing high-precision log(). (There is a mention of converting bases in there, though, along with notes about how BC dumbly uses PHP's own pow() and loses precision.)
The other results I've found are just talking about PHP's own float coercion, and don't relate to BC at all.
Up to base 36 conversions with high precision
I think this question is just a bit too difficult for Stack Overflow. Not only do you want to base-convert floating-points, which is a bit unusual by itself, but it has to be done at high precision. This is certainly possible, but not many people will have a solution for this lying around and making one takes time. The math of base conversion is not very complex, and once you understand it you can work it out yourself.
Oh, well, to make a long story short, I couldn't resist this, and gave it a try.
<?php
function splitNo($operant)
// get whole and fractional parts of operant
{
if (strpos($operant, '.') !== false) {
$sides = explode('.',$operant);
return [$sides[0], '.' . $sides[1]];
}
return [$operant, ''];
}
function wholeNo($operant)
// get the whole part of an operant
{
return explode('.', $operant)[0];
}
function toDigits($number, $base, $scale = 0)
// convert a positive number n to its digit representation in base b
{
$symbols = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$digits = '';
list($whole, $fraction) = splitNo($number);
while (bccomp($whole, '0.0', $scale) > 0) {
$digits = $symbols[(int)bcmod($whole, $base, $scale)] . $digits; // [] offset; the {} form was removed in PHP 8
$whole = wholeNo(bcdiv($whole, $base, $scale));
}
if ($scale > 0) {
$digits .= '.';
for ($i = 1; $i <= $scale; $i++) {
$fraction = bcmul($fraction, $base, $scale);
$whole = wholeNo($fraction);
$fraction = bcsub($fraction, $whole, $scale);
$digits .= $symbols[(int)$whole]; // [] offset; the {} form was removed in PHP 8
}
}
return $digits;
}
function toNumber($digits, $base, $scale = 0)
// compute the number given by digits in base b
{
$symbols = str_split('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ');
$number = '0';
list($whole, $fraction) = splitNo($digits);
foreach (str_split($whole) as $digit) {
$shiftUp = bcmul($base, $number, $scale);
$number = bcadd($shiftUp, array_search($digit, $symbols));
}
if ($fraction != '') {
$shiftDown = bcdiv('1', $base, $scale);
foreach (str_split(substr($fraction, 1)) as $symbol) {
$index = array_search($symbol, $symbols);
$number = bcadd($number, bcmul($index, $shiftDown, $scale), $scale);
$shiftDown = bcdiv($shiftDown, $base, $scale);
}
}
return $number;
}
function baseConv($operant, $fromBase, $toBase, $scale = 0)
// convert the digits representation of a number from base 1 to base 2
{
return toDigits(toNumber($operant, $fromBase, $scale), $toBase, $scale);
}
echo '<pre>';
print_r(baseConv('1234.5678900000', 10, 16, 60));
echo '</pre>';
The output is:
4D2.91613D31B9B66F9335D249E44FA05143BF727136A400FBA8826AA8EB4634
It looks a bit complicated, but isn't really. It just takes time. I started with converting whole numbers, then added fractions, and when that all worked I put in all the BC Math functions.
The $scale argument is the number of digits wanted after the radix point.
It may look a bit strange that I use three functions for the conversion: toDigits(), toNumber() and baseConv(). The reason is that the BC Math functions work with a base of 10. So, toDigits() converts away from 10 to another base and toNumber() does the opposite. To convert between two arbitrary-base operands we need both functions, and this results in the third: baseConv().
This could possibly be further optimized, if needed, but you haven't told us what you need it for, so optimization wasn't a priority for me. I just tried to make it work.
You can get higher base conversions by simply adding more symbols. However, in the current implementation each symbol needs to be one character. With UTF-8 that doesn't really limit you, but make sure everything is multibyte compatible (which it isn't at this moment).
NOTE: It seems to work, but I don't give any guarantees. Test thoroughly before use!
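To see the core idea without BC Math: the fractional digits come from repeatedly multiplying the fraction by the target base and peeling off the whole part each time. The sketch below does this with plain integers by treating the fraction as numerator/denominator, so nothing is rounded. fractionDigits is a hypothetical helper, and it assumes the intermediate products fit in a PHP integer.

```php
<?php
// Sketch: digits of the fraction $num/$den (0 <= $num < $den) in base $base,
// using exact integer arithmetic. Hypothetical helper for illustration only.
function fractionDigits(int $num, int $den, int $base, int $count): string {
    $symbols = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $digits = '';
    for ($i = 0; $i < $count; $i++) {
        $num *= $base;                           // shift one digit to the left
        $digits .= $symbols[intdiv($num, $den)]; // whole part is the next digit
        $num %= $den;                            // keep only the remaining fraction
    }
    return $digits;
}

// 0.56789 = 56789/100000, first nine hexadecimal digits
echo fractionDigits(56789, 100000, 16, 9), "\n"; // prints "91613D31B"
```

The result agrees with the first nine fractional digits of the BC Math output shown earlier.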
I have a system of first-degree (linear) equations to solve in PHP.
There may be more equations than variables, but never fewer equations than variables.
The system looks like the one below: n equations and m variables, where the variables are x[i] and i takes values from 1 to m. The system may or may not have a solution.
m is at most 100 and n at most about 5000.
I will have to solve a few thousand of these systems. Speed may be a problem, but for now I'm looking for an algorithm written in PHP.
a[1][1] * x[1] + a[1][2] * x[2] + ... + a[1][m] * x[m] = number 1
a[2][1] * x[1] + a[2][2] * x[2] + ... + a[2][m] * x[m] = number 2
...
a[n][1] * x[1] + a[n][2] * x[2] + ... + a[n][m] * x[m] = number n
There is Cramer's rule, which may do it. I could build one square matrix of coefficients, solve that system with Cramer's rule (by calculating the matrices' determinants), and then check the values against the unused equations.
I believe I could try Cramer's rule myself, but I'm looking for a better solution.
This is a problem of Computational Science,
http://en.wikipedia.org/wiki/Computational_science#Numerical_simulations
I know there are some complex algorithms to solve my problem, but I can't tell which one would do it and which is best for my case. An algorithm would be more useful to me than just the theory with the proof.
My question is: does anybody know of a class, script, or other code written in PHP that solves a system of linear equations?
Alternatively, I could try an API or a web service, preferably free, though a paid one would do too.
Thank you
I needed exactly this, but I couldn't find a determinant function, so I wrote one myself, along with a function for Cramer's rule. Maybe it'll help someone.
/**
* $matrix must be 2-dimensional n x n array in following format
* $matrix = array(array(1,2,3),array(1,2,3),array(1,2,3))
*/
function determinant($matrix = array()) {
// dimension control - n x n
foreach ($matrix as $row) {
if (sizeof($matrix) != sizeof($row)) {
return false;
}
}
// count 1x1 and 2x2 manually - rest by recursive function
$dimension = sizeof($matrix);
if ($dimension == 1) {
return $matrix[0][0];
}
if ($dimension == 2) {
return ($matrix[0][0] * $matrix[1][1] - $matrix[0][1] * $matrix[1][0]);
}
// cycles for submatrixes calculations
$sum = 0;
for ($i = 0; $i < $dimension; $i++) {
// for each "$i", you will create a smaller matrix based on the original matrix
// by removing the first row and the "i"th column.
$smallMatrix = array();
for ($j = 0; $j < $dimension - 1; $j++) {
$smallMatrix[$j] = array();
for ($k = 0; $k < $dimension; $k++) {
if ($k < $i) $smallMatrix[$j][$k] = $matrix[$j + 1][$k];
if ($k > $i) $smallMatrix[$j][$k - 1] = $matrix[$j + 1][$k];
}
}
// after creating the smaller matrix, multiply the "i"th element in the first
// row by the determinant of the smaller matrix.
// cofactor signs alternate +, -, +, ...; indices start at 0, so an even $i is plus
if ($i % 2 == 0){
$sum += $matrix[0][$i] * determinant($smallMatrix);
} else {
$sum -= $matrix[0][$i] * determinant($smallMatrix);
}
}
return $sum;
}
/**
* left side of equations - parameters:
* $leftMatrix must be 2-dimensional n x n array in following format
* $leftMatrix = array(array(1,2,3),array(1,2,3),array(1,2,3))
* right side of equations - results:
* $rightMatrix must be in format
* $rightMatrix = array(1,2,3);
*/
function equationSystem($leftMatrix = array(), $rightMatrix = array()) {
// matrixes and dimension check
if (!is_array($leftMatrix) || !is_array($rightMatrix)) {
return false;
}
if (sizeof($leftMatrix) != sizeof($rightMatrix)) {
return false;
}
$M = determinant($leftMatrix);
if (!$M) {
return false;
}
$x = array();
foreach ($rightMatrix as $rk => $rv) {
$xMatrix = $leftMatrix;
foreach ($rightMatrix as $rMk => $rMv) {
$xMatrix[$rMk][$rk] = $rMv;
}
$x[$rk] = determinant($xMatrix) / $M;
}
return $x;
}
Wikipedia should have pseudocode for reducing the matrix representing your equations to reduced row echelon form. Once the matrix is in that form, you can walk through the rows to find a solution.
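As a rough idea of what that reduction could look like in PHP, here is a minimal Gauss-Jordan sketch with partial pivoting for n equations and m unknowns (n >= m). The function name, the null return convention and the $eps tolerance are assumptions for illustration; it works on floats, so results are approximate.

```php
<?php
// Sketch: solve A x = b by Gauss-Jordan elimination with partial pivoting.
// $a is an n x m list of coefficient rows, $b holds the n right-hand sides.
// Returns the m unknowns, or null if there is no unique solution.
function solveLinearSystem(array $a, array $b, float $eps = 1e-9): ?array {
    $n = count($a);
    $m = $n > 0 ? count($a[0]) : 0;
    if ($m === 0 || $n < $m || $n !== count($b)) {
        return null;
    }
    // Build the augmented matrix [A | b].
    for ($r = 0; $r < $n; $r++) {
        $a[$r][] = $b[$r];
    }
    for ($col = 0; $col < $m; $col++) {
        // Partial pivoting: pick the row with the largest entry in this column.
        $pivot = $col;
        for ($r = $col + 1; $r < $n; $r++) {
            if (abs($a[$r][$col]) > abs($a[$pivot][$col])) {
                $pivot = $r;
            }
        }
        if (abs($a[$pivot][$col]) < $eps) {
            return null;                 // no pivot in this column: not unique
        }
        [$a[$col], $a[$pivot]] = [$a[$pivot], $a[$col]];
        // Scale the pivot row to 1, then clear the column in all other rows.
        $p = $a[$col][$col];
        for ($k = $col; $k <= $m; $k++) {
            $a[$col][$k] /= $p;
        }
        for ($r = 0; $r < $n; $r++) {
            if ($r === $col) continue;
            $f = $a[$r][$col];
            if ($f == 0) continue;
            for ($k = $col; $k <= $m; $k++) {
                $a[$r][$k] -= $f * $a[$col][$k];
            }
        }
    }
    // Rows past the first m now read 0 = c; a non-zero c means no solution.
    for ($r = $m; $r < $n; $r++) {
        if (abs($a[$r][$m]) > $eps) {
            return null;
        }
    }
    return array_column(array_slice($a, 0, $m), $m);
}

// x + y = 3, x - y = 1, 2x + y = 5  (three equations, two unknowns)
print_r(solveLinearSystem([[1, 1], [1, -1], [2, 1]], [3, 1, 5])); // x = 2, y = 1
```

Unlike Cramer's rule on a square subsystem, this uses all n equations at once, so the consistency check against the extra equations falls out of the elimination itself.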
There's an unmaintained PEAR package which may save you the effort of writing the code.
Another question is whether you are looking mostly at "wide" systems (more variables than equations, which usually have many possible solutions) or "narrow" systems (more equations than variables, which usually have no solutions), since the best strategy depends on which case you are in — and narrow systems may benefit from using a linear regression technique such as least squares instead.
This package uses Gaussian Elimination. I found that it executes fast for larger matrices (i.e. more variables/equations).
There is a truly excellent package based on JAMA here: http://www.phpmath.com/build02/JAMA/docs/index.php
I've used it for everything from simple linear regression right the way to highly complex Multiple Linear Regression (writing my own Backwards Stepwise MLR functions on top of that). Very comprehensive and will hopefully do what you need.
Speed could be considered an issue, for sure. But works a treat and matched SPSS when I cross referenced results on the BSMLR calculations.
I would like to solve this problem with your help.
Assume there is an array in a variable named $ar that holds 5 numbers. I want to calculate the geometric average of these numbers in Pascal or PHP. How can I do that?
Here is PHP version:
function geometric_average($a) {
$mul = array_product($a); // product of all values, independent of the array keys
return pow($mul,1/count($a));
}
//usage
echo geometric_average(array(2,8)); //Output-> 4
Possible solution in "standard" Pascal:
program GeometricAverage;
const SIZE = 5;
function GeoAvg(A: array of real): real;
var
avg: real;
i: integer;
begin
avg := 1;
{ open array parameters are always indexed 0..High(A) }
for i := 0 to High(A) do
avg := avg * A[i];
GeoAvg := Exp(Ln(avg) / (High(A) + 1));
end;
var
ar: array [1..SIZE] of real = (1, 2, 3, 4, 5);
begin
writeln('Geometric Average = ', GeoAvg(ar):0:3); {Output should be =~2.605}
readln;
end.
If you want to use dynamic arrays this should be done in Delphi or ObjectPascal for example.
For anyone who has had an issue with this: as I stated in a comment on the PHP answer, that answer may not be suitable for everyone, especially those looking for the geometric average/mean of large numbers or of a large number of values, because the intermediate product overflows PHP's float range.
A pretty easy solution is to split the initial array into chunks, raise each chunk's product to the power 1/N (where N is the total number of elements), and then multiply the results:
function geometricMean(array $array)
{
if (!count($array)) {
return 0;
}
$total = count($array);
$power = 1 / $total;
$chunkProducts = array();
$chunks = array_chunk($array, 10);
foreach ($chunks as $chunk) {
$chunkProducts[] = pow(array_product($chunk), $power);
}
$result = array_product($chunkProducts);
return $result;
}
Note the 10 - it's the number of elements in a chunk, you may change that if you need to do so. If you get INF as a result, try lowering that.
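A common alternative, not taken from the answers above, is to average the logarithms instead of multiplying the raw values, which sidesteps the overflow without picking a chunk size. A minimal sketch, assuming every input is strictly positive; geometricMeanLog is a hypothetical name:

```php
<?php
// Sketch: geometric mean as exp(mean(log(x))). Assumes every value is > 0.
function geometricMeanLog(array $values): float {
    $n = count($values);
    if ($n === 0) {
        return 0.0;                 // same convention as geometricMean() above
    }
    $logSum = 0.0;
    foreach ($values as $v) {
        $logSum += log($v);         // the sum of logs stays small where a product overflows
    }
    return exp($logSum / $n);
}

echo round(geometricMeanLog([2, 8]), 6), "\n";                      // prints 4
var_dump(is_finite(geometricMeanLog(array_fill(0, 1000, 1e300)))); // bool(true)
```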
I am trying to calculate an average without being thrown off by a small set of far-off numbers (e.g., in 1,2,1,2,3,4,50 the single 50 will throw off the entire average).
If I have a list of numbers like so:
19,20,21,21,22,30,60,60
The average is 31
The median is 30
The mode is 21 & 60 (averaged to 40.5)
But anyone can see that the majority is in the range 19-22 (5 in, 3 out), and if you take the average of just that range it's 20.6 (quite different from any of the numbers above).
I am thinking that you can get this like so:
c+d-r
Where c is the count of numbers, d is the number of distinct values, and r is the range. You can then apply this to all the possible ranges, and the highest score marks the optimal range to take an average from.
For example 19,20,21,21,22 would be 5 numbers, 4 distinct values, and the range is 3 (22 - 19). If you plug this into my equation you get 5+4-3=6
If you applied this to the entire number list it would be 8+6-41=-27
I think this works pretty well, but I have to create a huge loop to test against all possible ranges. In just my small example there are 21 possible ranges:
19-19, 19-20, 19-21, 19-22, 19-30, 19-60, 20-20, 20-21, 20-22, 20-30, 20-60, 21-21, 21-22, 21-30, 21-60, 22-22, 22-30, 22-60, 30-30, 30-60, 60-60
I am wondering if there is a more efficient way to get an average like this.
Or does someone have a better algorithm altogether?
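For reference, the brute-force search described above could be sketched like this. bestRange is a hypothetical name, ties go to the first range found, and the cost grows with the square of the number of distinct values.

```php
<?php
// Sketch of the brute-force search: score every range [lo, hi] of distinct
// values with c + d - r and keep the best one.
function bestRange(array $numbers): ?array {
    sort($numbers);
    $distinct = array_values(array_unique($numbers));
    $best = null;
    foreach ($distinct as $i => $lo) {
        for ($j = $i; $j < count($distinct); $j++) {
            $hi = $distinct[$j];
            $inRange = array_filter($numbers, function ($v) use ($lo, $hi) {
                return $v >= $lo && $v <= $hi;
            });
            $c = count($inRange);          // how many numbers fall in the range
            $d = $j - $i + 1;              // how many distinct values it covers
            $r = $hi - $lo;                // width of the range
            $score = $c + $d - $r;
            if ($best === null || $score > $best['score']) {
                $best = ['lo' => $lo, 'hi' => $hi, 'score' => $score,
                         'mean' => array_sum($inRange) / $c];
            }
        }
    }
    return $best;
}

print_r(bestRange([19, 20, 21, 21, 22, 30, 60, 60]));
// best range is 19-22 with score 6 and mean 20.6
```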
You might get some use out of standard deviation here, which basically measures how concentrated the data points are. You can define an outlier as anything more than 1 standard deviation (or whatever other number suits you) from the average, throw them out, and calculate a new average that doesn't include them.
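A minimal sketch of that idea, assuming the population standard deviation and a cut-off of $k deviations; the function name and the default $k are my own choices:

```php
<?php
// Sketch: mean after discarding values more than $k standard deviations
// away from the original mean.
function meanWithoutOutliers(array $values, float $k = 1.0): ?float {
    $n = count($values);
    if ($n === 0) {
        return null;
    }
    $mean = array_sum($values) / $n;
    $sumSq = 0.0;
    foreach ($values as $v) {
        $sumSq += ($v - $mean) ** 2;
    }
    $sd = sqrt($sumSq / $n);               // population standard deviation
    $kept = array_filter($values, function ($v) use ($mean, $sd, $k) {
        return abs($v - $mean) <= $k * $sd;
    });
    return count($kept) > 0 ? array_sum($kept) / count($kept) : null;
}

echo round(meanWithoutOutliers([19, 20, 21, 21, 22, 30, 60, 60]), 2), "\n"; // prints 22.17
```

With the numbers from the question, this drops the two 60s but keeps the 30, so the result is a little higher than the 20.6 the question is after.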
Here's a pretty naive implementation that you could fix up for your own needs. I purposely kept it pretty verbose. It's based on the five-number-summary often used to figure these things out.
function get_median($arr) {
sort($arr);
$c = count($arr) - 1;
if ($c%2) {
$b = round($c/2);
$a = $b-1;
return ($arr[$b] + $arr[$a]) / 2 ;
} else {
return $arr[($c/2)];
}
}
function get_five_number_summary($arr) {
sort($arr);
$c = count($arr) - 1;
$fns = array();
if ($c%2) {
$b = round($c/2);
$a = $b-1;
$lower_quartile = array_slice($arr, 1, $a-1);
$upper_quartile = array_slice($arr, $b+1, count($lower_quartile));
$fns = array($arr[0], get_median($lower_quartile), get_median($arr), get_median($upper_quartile), $arr[$c]); // $arr[$c] is the maximum
return $fns;
}
else {
$b = round($c/2);
$a = $b-1;
$lower_quartile = array_slice($arr, 1, $a);
$upper_quartile = array_slice($arr, $b+1, count($lower_quartile));
$fns = array($arr[0], get_median($lower_quartile), get_median($arr), get_median($upper_quartile), $arr[$c]); // $arr[$c] is the maximum
return $fns;
}
}
function find_outliers($arr) {
$fns = get_five_number_summary($arr);
$interquartile_range = $fns[3] - $fns[1];
$low = $fns[1] - $interquartile_range;
$high = $fns[3] + $interquartile_range;
foreach ($arr as $v) {
if ($v > $high || $v < $low)
echo "$v is an outlier<br>";
}
}
//$numbers = array( 19,20,21,21,22,30,60 ); // 60 is an outlier
$numbers = array( 1,230,239,331,340,800); // 1 is an outlier, 800 is an outlier
find_outliers($numbers);
Note that this method, albeit much simpler to implement than standard deviation, will not find the two 60 outliers in your example, but it works pretty well. Use the code for whatever, hopefully it's useful!
To see how the algorithm works and how I implemented it, go to: http://www.mathwords.com/o/outlier.htm
This, of course, doesn't calculate the final average, but it's kind of trivial after you run find_outliers() :P
Why don't you use the median? It's not 30, it's 21.5.
You could put the values into an array, sort the array, and then find the median, which is usually a better number than the average anyway because it discounts outliers automatically, giving them no more weight than any other number.
You might sort your numbers, choose your preferred subrange (e.g., the middle 90%), and take the mean of that.
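A small sketch of such a trimmed mean; trimmedMean and its $keep parameter are hypothetical names, and the number of values cut from each end is rounded down:

```php
<?php
// Sketch: mean of the middle $keep fraction of the sorted values.
function trimmedMean(array $values, float $keep = 0.9): ?float {
    $n = count($values);
    if ($n === 0 || $keep <= 0.0 || $keep > 1.0) {
        return null;
    }
    sort($values);
    $drop = (int) floor($n * (1.0 - $keep) / 2);   // values cut from each end
    $middle = array_slice($values, $drop, $n - 2 * $drop);
    return array_sum($middle) / count($middle);
}

echo trimmedMean([19, 20, 21, 21, 22, 30, 60, 60], 0.5), "\n"; // prints 23.5
```

With only eight values, keeping the middle 90% trims nothing at all, which is why the example keeps the middle half instead.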
There is no one true answer to your question, because there are always going to be distributions that will give you a funny answer (e.g., consider a biased bi-modal distribution). This is why many statistics are often presented using box-and-whisker diagrams showing mean, median, quartiles, and outliers.