PHP find median using heap vs sort - php

I was looking for a quick way to calculate the median of a list of numbers and came across this:
function array_median($array) {
// perhaps all non numeric values should filtered out of $array here?
$iCount = count($array);
if ($iCount == 0) {
return null;
}
// if we're down here it must mean $array
// has at least 1 item in the array.
$middle_index = floor($iCount / 2);
sort($array, SORT_NUMERIC);
$median = $array[$middle_index]; // assume an odd # of items
// Handle the even case by averaging the middle 2 items
if ($iCount % 2 == 0) {
$median = ($median + $array[$middle_index - 1]) / 2;
}
return $median;
}
This approach using sort() makes sense and is certainly the obvious approach. However, I was curious if a median heap would be faster. What was surprising was that when I implemented a simple median heap it is consistently significantly slower than the above method.
My simple MedianHeap class:
class MedianHeap{
private $lowerHeap;
private $higherHeap;
private $numbers = [];
public function __construct($numbers = null)
{
$this->lowerHeap = new SplMaxHeap();
$this->higherHeap = new SplMinHeap();
if (count($numbers)) {
$this->insertArray($numbers);
}
}
public function insertArray ($numbers) {
foreach($numbers as $number) {
$this->insert($number);
}
}
public function insert($number)
{
$this->numbers[] = $number;
if ($this->lowerHeap->count() == 0 || $number < $this->lowerHeap->top()) {
$this->lowerHeap->insert($number);
} else {
$this->higherHeap->insert($number);
}
$this->balance();
}
protected function balance()
{
$biggerHeap = $this->lowerHeap->count() > $this->higherHeap->count() ? $this->lowerHeap : $this->higherHeap;
$smallerHeap = $this->lowerHeap->count() > $this->higherHeap->count() ? $this->higherHeap : $this->lowerHeap;
if ($biggerHeap->count() - $smallerHeap->count() >= 2) {
$smallerHeap->insert($biggerHeap->extract());
}
}
public function getMedian()
{
if (!count($this->numbers)) {
return null;
}
$biggerHeap = $this->lowerHeap->count() > $this->higherHeap->count() ? $this->lowerHeap : $this->higherHeap;
$smallerHeap = $this->lowerHeap->count() > $this->higherHeap->count() ? $this->higherHeap : $this->lowerHeap;
if ($biggerHeap->count() == $smallerHeap->count()) {
return ($biggerHeap->top() + $smallerHeap->top())/2;
} else {
return $biggerHeap->top();
}
}
}
And then the code to benchmark:
$array = [];
for($i=0; $i<100000; $i++) {
$array[] = mt_rand(1,100000) / mt_rand(1,10000);
}
$t = microtime(true);
echo array_median($array);
echo PHP_EOL . 'Sort Median: ' . (microtime(true) - $t) . ' seconds';
echo PHP_EOL;
$t = microtime(true);
$list = new MedianHeap($array);
echo $list->getMedian();
echo PHP_EOL . 'Heap Median: '. (microtime(true) - $t) . ' seconds';
Is there something in PHP that makes using heaps for this inefficient somehow or is there something wrong with my implemenation?

Related

In which case we can use interpolation algorithm search instead binary search algorithm?

I have two sorted arrays:
$idProducts = [2,9,25,666,1001,1002,1005,2546...n]; //almost 55.000 values
$someIds = [1,9,11,12,99,111,855...n]; //almost 2.500 values
I try make a new array from intersection of idProducts and $someIds.
I applied 3 search algorithms: linear, binary, interpolar searching, but only linear and binary algorithm work properly:
foreach($someIds as $id){
if(interpolation_search($idProducts, $id) >=0){
$newArray[]=$id;
}
}
function interpolation_search($list, $x)
{
$l = 0;
$r = count($list) - 1;
while ($l <= $r) {
if ($list[$l] == $list[$r]) {
if ($list[$l] == $x) {
return $l;
} else {
// not found
return -1;
}
}
$k = ($x - $list[$l])/($list[$r] - $list[$l]);
// not found
if ($k < 0 || $k > 1) {
return -1;
}
$mid = round($l + $k*($r - $l));
if ($x < $list[$mid]) {
$r = $mid - 1;
} else if ($x > $list[$mid]) {
$l = $mid + 1;
} else {
// success!
return $mid;
}
// not found
return -1;
}
}
Anyway the best solution was:
$idProducts = array_flip($idProducts);
foreach ($someIds as $id){
if (isset($idProducts [$id])===true){
$newArray[]=$id;
}
}
But i want know why interpolation doesnt work properly in my case and in which case we can implement interpolation algorithm searching.
Thank you.

Basic perceptron for AND gate in PHP, am I doing it right? Weird results

I'd like to learn about neural nets starting with the very basic perceptron algorithm. So I've implemented one in PHP and I'm getting weird results after training it. All the 4 possible input combinations return either wrong or correct results (more often the wrong ones).
1) Is there something wrong with my implementation or the results I'm getting are normal?
2) Can this kind of implementation work with more than 2 inputs?
3) What would be the next (easiest) step in learning neural nets after this? Maybe adding more neurons, changing the activation function, or ...?
P.S. I'm pretty bad at math and don't necessarily understand the math behind perceptron 100%, at least not the training part.
Perceptron Class
<?php
namespace Perceptron;
class Perceptron
{
// Number of inputs
protected $n;
protected $weights = [];
protected $bias;
public function __construct(int $n)
{
$this->n = $n;
// Generate random weights for each input
for ($i = 0; $i < $n; $i++) {
$w = mt_rand(-100, 100) / 100;
array_push($this->weights, $w);
}
// Generate a random bias
$this->bias = mt_rand(-100, 100) / 100;
}
public function sum(array $inputs)
{
$sum = 0;
for ($i = 0; $i < $this->n; $i++) {
$sum += ($inputs[$i] * $this->weights[$i]);
}
return $sum + $this->bias;
}
public function activationFunction(float $sum)
{
return $sum < 0.0 ? 0 : 1;
}
public function predict(array $inputs)
{
$sum = $this->sum($inputs);
return $this->activationFunction($sum);
}
public function train(array $trainingSet, float $learningRate)
{
foreach ($trainingSet as $row) {
$inputs = array_slice($row, 0, $this->n);
$correctOutput = $row[$this->n];
$output = $this->predict($inputs);
$error = $correctOutput - $output;
// Adjusting the weights
$this->weights[0] = $this->weights[0] + ($learningRate * $error);
for ($i = 0; $i < $this->n - 1; $i++) {
$this->weights[$i + 1] =
$this->weights[$i] + ($learningRate * $inputs[$i] * $error);
}
}
// Adjusting the bias
$this->bias += ($learningRate * $error);
}
}
Main File
<?php
require_once 'vendor/autoload.php';
use Perceptron\Perceptron;
// Create a new perceptron with 2 inputs
$perceptron = new Perceptron(2);
// Test the perceptron
echo "Before training:\n";
$output = $perceptron->predict([0, 0]);
echo "{$output} - " . ($output == 0 ? 'correct' : 'nope') . "\n";
$output = $perceptron->predict([0, 1]);
echo "{$output} - " . ($output == 0 ? 'correct' : 'nope') . "\n";
$output = $perceptron->predict([1, 0]);
echo "{$output} - " . ($output == 0 ? 'correct' : 'nope') . "\n";
$output = $perceptron->predict([1, 1]);
echo "{$output} - " . ($output == 1 ? 'correct' : 'nope') . "\n";
// Train the perceptron
$trainingSet = [
// The 3rd column is the correct output
[0, 0, 0],
[0, 1, 0],
[1, 0, 0],
[1, 1, 1],
];
for ($i = 0; $i < 1000; $i++) {
$perceptron->train($trainingSet, 0.1);
}
// Test the perceptron again - now the results should be correct
echo "\nAfter training:\n";
$output = $perceptron->predict([0, 0]);
echo "{$output} - " . ($output == 0 ? 'correct' : 'nope') . "\n";
$output = $perceptron->predict([0, 1]);
echo "{$output} - " . ($output == 0 ? 'correct' : 'nope') . "\n";
$output = $perceptron->predict([1, 0]);
echo "{$output} - " . ($output == 0 ? 'correct' : 'nope') . "\n";
$output = $perceptron->predict([1, 1]);
echo "{$output} - " . ($output == 1 ? 'correct' : 'nope') . "\n";
I must thank you for posting this question, I have wanted a chance to dive a little deeper into neural networks. Anyway, down to business. After tinkering around and verbose logging what all is happening, it ended up only requiring 1 character change to work as intended:
public function sum(array $inputs)
{
...
//instead of multiplying the input by the weight, we should be adding the weight
$sum += ($inputs[$i] + $this->weights[$i]);
...
}
With that change, 1000 iterations of training ends up being overkill.
One bit of the code was confusing, different setting of weights:
public function train(array $trainingSet, float $learningRate)
{
foreach ($trainingSet as $row) {
...
$this->weights[0] = $this->weights[0] + ($learningRate * $error);
for ($i = 0; $i < $this->n - 1; $i++) {
$this->weights[$i + 1] =
$this->weights[$i] + ($learningRate * $inputs[$i] * $error);
}
}
I don't necessarily understand why you chose to do it this way. My unexperienced eye would think that the following would work as well.
for ($i = 0; $i < $this->n; $i++) {
$this->weight[$i] += $learningRate * $error;
}
Found my silly mistake, I wasn't adjusting the bias for each row of a training set as I accidentally put it outside the foreach loop. This is what the train() method should look like:
public function train(array $trainingSet, float $learningRate)
{
foreach ($trainingSet as $row) {
$inputs = array_slice($row, 0, $this->n);
$correctOutput = $row[$this->n];
$output = $this->predict($inputs);
$error = $correctOutput - $output;
// Adjusting the weights
for ($i = 0; $i < $this->n; $i++) {
$this->weights[$i] += ($learningRate * $inputs[$i] * $error);
}
// Adjusting the bias
$this->bias += ($learningRate * $error);
}
}
Now I get the correct results after training each time I run the script. Just 100 epochs of training is enough.

Ball clock implemented in PHP spins out of control.

I'm trying to implement the ball clock (determine how many iterations an inputed number of balls in the initial queue would go before returning to its initial order) in php. I believe at this point my code is not recognizing when to stop. I'm using an SplQueue to represent the queue and keeping an int inside each ball that represents their order.
I believe the difficulty to be in traversing the $queue in the function hasTerminated(). The code runs, it just doesn't stop. An input of 30, for example (php script.php 30) should output 30 12 hour cycles to complete, but it does not terminate. I appreciate any and all ideas as to how to access the objects of the $queue and compare them without destroying the $queue. PHP is not a language I am very comfortable with.
<?php
class Ball {
static $counter = 0;
protected $index;
function __construct(){
$this->index = self::$counter;
self::$counter++;
}
function getIndex(){
return $this->index;
}
}
class Clock {
protected $queue;
protected $minuteTrack;
protected $fiveMinuteTrack;
protected $hourTrack;
protected $count;
protected $time = 0;
function __construct($num){
$this->queue = new \SplQueue;
$this->minuteTrack = new \SplStack;
$this->fiveMinuteTrack = new \SplStack;
$this->hourTrack = new \SplStack;
$this->count = $num;
for($i = 0; $i<$num; $i++){
$ball = new Ball();
$this->queue->enqueue($ball); //inefficient
}
}
function hasTerminated(){
$inOrder = true;
$prev = -1;
$q = clone $this->queue;
while($q->count() != 0){
$elt = $q->dequeue();
if($elt->getIndex() >= $prev){
$inOrder = false;
} else {
$prev = $elt->getIndex();
}
}
echo PHP_EOL;
if($inOrder){
echo 'It takes ' . $time . ' 12 hour cycles to return to the original ordering' . PHP_EOL;
exit(0);
} else {
echo 'Not finished yet' . PHP_EOL;
}
}
function run(){
while(true){
$ball = $this->queue->dequeue();
if(!$this->addBallToMinuteTrack($ball)){
if(!$this->addBallToFiveMinuteTrack($ball)){
if(!$this->addBallToHourTrack($ball)){
$this->queue->enqueue($ball);
}
}
}
}
}
function addBallToMinuteTrack($ball){
echo '1';
if($this->minuteTrack->count() < 4){
$this->minuteTrack->push($ball);
return true;
} else if( $this->minuteTrack->count() == 4){
for($i=0; $i<4; $i++){
$this->queue->enqueue($this->minuteTrack->pop()); //hate this
}
return false;
}
}
function addBallToFiveMinuteTrack($ball){
echo '5';
if($this->fiveMinuteTrack->count() < 11){
$this->fiveMinuteTrack->push($ball);
return true;
} else if( $this->fiveMinuteTrack->count() == 11){
for($i=0; $i<11; $i++){
$this->queue->enqueue($this->fiveMinuteTrack->pop());
}
return false;
}
}
function addBallToHourTrack($ball){
echo '60' . PHP_EOL;
if($this->hourTrack->count() < 11){
$this->hourTrack->push($ball);
return true;
} else if( $this->hourTrack->count() == 11){
for($i=0; $i<11; $i++){
$this->queue->enqueue($this->hourTrack->pop()); //hate this
}
$this->time = $this->time + 1; //In half day increments
$this->hasTerminated();
return false;
}
}
}
foreach (array_slice($argv, 1) as $arg) {
if(!is_numeric($arg)){
echo 'Arguments must be numeric' . PHP_EOL;
exit(1);
} else {
$clock = new Clock($arg);
$clock->run();
}
}
?>

Memoizing fibonacci function in php

I've created a memoized function of the recursive version of fibonacci.
I use this as an example for other kinds of functions that would use memoization.
My implementation is bad since if I include it in a library, that means that the global variable is still seen..
This is the original recursive fibonacci function:
function fibonacci($n) {
if($n > 1) {
return fibonacci($n-1) + fibonacci($n-2);
}
return $n;
}
and I modified it to a memoized version:
$memo = array();
function fibonacciMemo($n) {
global $memo;
if(array_key_exists($n, $memo)) {
return $memo[$n];
}
else {
if($n > 1) {
$result = fibonacciMemo($n-1) + fibonacciMemo($n-2);
$memo[$n] = $result;
return $result;
}
return $n;
}
}
I purposely didn't use the iterative method in implementing fibonacci.
Is there any better ways to memoize fibonacci function in php? Can you suggest me better improvements? I've seen func_get_args() and call_user_func_array as another way but I can't seem to know what is better?
So my main question is: How can I memoize fibonacci function in php properly? or What is the best way in memoizing fibonacci function in php?
Well, Edd Mann shows an excellent way to implement a memoize function in php in His post
Here is the example code (actually taken from Edd Mann's post):
$memoize = function($func)
{
return function() use ($func)
{
static $cache = [];
$args = func_get_args();
$key = md5(serialize($args));
if ( ! isset($cache[$key])) {
$cache[$key] = call_user_func_array($func, $args);
}
return $cache[$key];
};
};
$fibonacci = $memoize(function($n) use (&$fibonacci)
{
return ($n < 2) ? $n : $fibonacci($n - 1) + $fibonacci($n - 2);
});
Notice that the global definition it's replaced thanks to function clousure and PHP's first-class function support.
Other solution:
You can create a class containing as static members: fibonnacciMemo and $memo. Notice that you don't longer have to use $memo as a global variable, so it won't give any conflict with other namespaces.
Here is the example:
class Fib{
//$memo and fibonacciMemo are static members
static $memo = array();
static function fibonacciMemo($n) {
if(array_key_exists($n, static::$memo)) {
return static::$memo[$n];
}
else {
if($n > 1) {
$result = static::fibonacciMemo($n-1) + static::fibonacciMemo($n-2);
static::$memo[$n] = $result;
return $result;
}
return $n;
}
}
}
//Using the same method by Edd Mann to benchmark
//the results
$start = microtime(true);
Fib::fibonacciMemo(10);
echo sprintf("%f\n", microtime(true) - $start);
//outputs 0.000249
$start = microtime(true);
Fib::fibonacciMemo(10);
echo sprintf("%f\n", microtime(true) - $start);
//outputs 0.000016 (now with memoized fibonacci)
//Cleaning $memo
Fib::$memo = array();
$start = microtime(true);
Fib::fibonacciMemo(10);
echo sprintf("%f\n", microtime(true) - $start);
//outputs 0.000203 (after 'cleaning' $memo)
Using this, you avoid the use of global and also the problem of cleaning the cache. Althought, $memo is not thread save and the keys stored are no hashed values.
Anyways, you can use all the php memoize utilites such as memoize-php
i think... this should to to memoize a fibonacci:
function fib($n, &$computed = array(0,1)) {
if (!array_key_exists($n,$computed)) {
$computed[$n] = fib($n-1, $computed) + fib($n-2, $computed);
}
return $computed[$n];
}
some test
$arr = array(0,1);
$start = microtime(true);
fib(10,$arr);
echo sprintf("%f\n", microtime(true) - $start);
//0.000068
$start = microtime(true);
fib(10,$arr);
echo sprintf("%f\n", microtime(true) - $start);
//0.000005
//Cleaning $arr
$arr = array(0,1);
$start = microtime(true);
fib(10,$arr);
echo sprintf("%f\n", microtime(true) - $start);
//0.000039
Another solution:
function fib($n, &$memo = []) {
if (array_key_exists($n,$memo)) {
return $memo[$n];
}
if ($n <=2 ){
return 1;
}
$memo[$n] = fib($n-1, $memo) + fib($n-2, $memo);
return $memo[$n];
}
Performance:
$start = microtime(true);
fib(100);
echo sprintf("%f\n", microtime(true) - $start);
// 0.000041
This's an implementation of memoize a fibonacci:
function fib(int $n, array &$memo = [0,1,1]) : float {
return $memo[$n] ?? $memo[$n] = fib($n-1, $memo) + fib($n-2, $memo);
}
Call
echo fib(20); // 6765
function fibMemo($n)
{
static $cache = [];
//print_r($cache);
if (!empty($cache[$n])) {
return $cache[$n];
} else {
if ($n < 2) {
return $n;
} else {
$p = fibMemo($n - 1) + fibMemo($n - 2);
$cache[$n] = $p;
return $p;
}
}
}
echo fibMemo(250);

how to optimise this code?

I have a solution to the below problem in PHP.
But it is taking too much time to execute for 10 digit numbers. I want to know where am I going wrong ?
I am new to dynamic programming .Can someone have a look at this ?
Problem
In Byteland they have a very strange monetary system.
Each Bytelandian gold coin has an integer number written on it. A coin n
can be exchanged in a bank into three coins: n/2, n/3 and n/4.
But these numbers are all rounded down (the banks have to make a profit).
You can also sell Bytelandian coins for American dollars. The exchange
rate is 1:1. But you can not buy Bytelandian coins.
You have one gold coin. What is the maximum amount of American dollars
you can get for it?
========================================================
<?php
$maxA=array();
function exchange($money)
{
if($money == 0)
{
return $money;
}
if(isset($maxA[$money]))
{
$temp = $maxA[$money]; // gets the maximum dollars for N
}
else
{
$temp = 0;
}
if($temp == 0)
{
$m = $money/2;
$m = floor($m);
$o = $money/3;
$o = floor($o);
$n = $money/4;
$n = floor($n);
$total = $m+$n+$o;
if(isset($maxA[$m]))
{
$m = $maxA[$m];
}
else
{
$m = exchange($m);
}
if(isset($maxA[$n]))
{
$n = $maxA[$n];
}
else
{
$n = exchange($n);
}
if(isset($maxA[$o]))
{
$o = $maxA[$o];
}
else
{
$o = exchange($o);
}
$temp = max($total,$m+$n+$o,$money);
$maxA[$money]=$temp; //store the value
}
return $temp;
}
$A=array();
while(1)
{
$handle = fopen ("php://stdin","r");
$line = fgets($handle);
if(feof($handle))
{
break;
}
array_push($A,trim($line));
}
$count =count($A);
for($i=0;$i<$count;$i++)
{
$val = exchange($A[$i]);
print "$val \n";
}
?>
Here a reformatted version of the code for the ones (like I) who could understand the above. It doesn't improve anything.
function exchange($money) {
static $maxA = array(0 => 0);
if (isset($maxA[$money])) {
return $money;
}
$m = floor($money/2);
$o = floor($money/3);
$n = floor($money/4);
$total = $m+$n+$o;
if (isset($maxA[$m])) {
$m = $maxA[$m];
} else {
$m = exchange($m);
}
if (isset($maxA[$n])) {
$n = $maxA[$n];
} else {
$n = exchange($n);
}
if (isset($maxA[$o])) {
$o = $maxA[$o];
} else {
$o = exchange($o);
}
return $maxA[$money] = max($total, $m + $n + $o, $money);
}
I still have my code from this problem. But it's in c++. It's by no means the most efficient, but it passed. It might not be too hard to port to php.
#include <cstdio>
#include <queue>
using namespace std;
unsigned long results;
queue to_test;
int main()
{
char tmp_val[30];
unsigned long coin_value = 1;
while (coin_value)
{
scanf("%s", tmp_val);
coin_value = 0;
results = 0;
for (int w = 0; tmp_val[w] != '\0'; w++)
{
coin_value *= 10;
coin_value += tmp_val[w] - 0x30;
}
if (coin_value != 0)
{
to_test.push(coin_value);
while(!to_test.empty())
{
unsigned long tester = to_test.front();
to_test.pop();
unsigned long over2 = tester/2;
unsigned long over3 = tester/3;
unsigned long over4 = tester/4;
if (tester < over2 + over3 + over4)
{
to_test.push(over2);
to_test.push(over3);
to_test.push(over4);
}
else
{
results += tester;
}
}
printf("%lu\n", results);
}
}
}

Categories