How to iterate efficiently when some sub-interval results are known


You have a function that always takes an interval (of natural numbers in this case) and returns a result, but it is quite expensive on the processor, which is simulated by sleep in this example:
function calculate($start, $end) {
$result = 0;
for($x=$start;$x<=$end;$x++) {
$result++;
usleep(250000);
}
return $result;
}
To be more efficient, there is an array of old results that contains the interval used and the result of the function for that interval:
$oldResults = [
['s'=>1, 'e'=>2, 'r' => 1],
['s'=>2, 'e'=>6, 'r' => 4],
['s'=>4, 'e'=>7, 'r' => 3]
];
If I call calculate(1,10), the function should be able to calculate new intervals based on old results and accumulate them. In this particular case it should take the old result from 1 to 2, add it to the old result from 2 to 6, then do a new calculate(6,10) and add that too. Note that the function ignores the old saved interval from 4 to 7, since it was more convenient to use 2-6.
Of course in this example, calculate() is quite simple and you can just find particular ways to solve this problem around it, but in the real code calculate() is complex and the only thing I know is that calculate(n0,n3)==calculate(n0,n1)+calculate(n1,n2)+calculate(n2,n3).
I cannot find a way to solve the reuse of the old data without using a bunch of IF and foreach, I'm sure there is a more elegant approach to solve this.
Note: I'm using PHP, but I can read JS, Python, C and similar languages.

If you are certain that calculate(n0,n3)==calculate(n0,n1)+calculate(n1,n2)+calculate(n2,n3), then it seems to me that one approach might simply be to establish a database cache.
You can pre-calculate each discrete interval and store its result in a record.
$start = 0;
$end = 1000;
for($i=1;$i<=$end;$i++) {
$result = calculate($start, $i);
$sql = "INSERT INTO calculated_cache (start, end, result) VALUES ($start,$i,$result)";
// execute statement via whatever dbms api
$start++;
}
Now, whenever new requests come in, a database lookup should be significantly faster. Note that you may need to tinker with the boundary cases in this rough example.
function fetch_calculated_cache($start, $end) {
$sql = "
SELECT SUM(result)
FROM calculated_cache
WHERE (start BETWEEN $start AND $end)
AND (end BETWEEN $start AND $end)
";
$result = null; // placeholder: run $sql via whatever DBMS API you chose and read the SUM(result) value
return $result;
}
there are a couple obvious considerations such as:
cache invalidation. how often will the results of your calculate function change? you'll need to repopulate the database then.
how many intervals do you want to store? in my example, I arbitrarily picked 1000
will you ever need to retrieve non-sequential interval results? you'll need to apply the above procedure in chunks.

I wrote this:
function findFittingFromCache($from, $to, $cache){
//total length, used to measure the usefulness of a chunk from the cache (0.1 below means 10% of the total length)
$totalLength = abs($to - $from);
$candidates = array_filter($cache, function($val) use ($from, $to, $totalLength){
$chunkLength = abs($val['e'] - $val['s']);
if($totalLength > 0 && $from <= $val['s'] && $to >= $val['e'] && ($chunkLength/$totalLength > 0.1)){
return true;
}
return false;
});
//sorting to have non-decremental values of $x['s']
usort($candidates, function($a, $b){ return $a['s'] - $b['s']; });
$flowCheck = $from;
$needToCompute = array();
foreach($candidates as $key => $val){
if($val['s'] < $flowCheck){
//already using something with this interval
unset($candidates[$key]);
} else {
if($val['s'] > $flowCheck){
//save what will be needed to compute
$needToCompute[] = array('s'=>$flowCheck, 'e'=>$val['s']);
}
//increase starting position for next loop
$flowCheck = $val['e'];
}
}
//rest needs to be computed as well
if($flowCheck < $to){
$needToCompute[] = array('s'=>$flowCheck, 'e'=>$to);
}
return array("computed"=>$candidates, "missing"=>$needToCompute);
}
This function returns two arrays: "computed" holds the already-computed pieces that were found, and "missing" holds the gaps between them that still have to be computed.
Inside the function there is a 0.1 threshold, which disqualifies chunks shorter than 10% of the total searched length. You can rewrite the function to take the threshold as a parameter, or omit it completely.
I presume results will be stored and, after computing, added to the cache ($oldResults), which might take any form (for example a database, as Jeff Puckett suggested). Do not forget to add all computed chunks and the whole requested length to the cache.
I am sorry, but I can't find a way without loops and ifs.
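As a rough illustration of how cached chunks and freshly computed gaps can be added up, here is a minimal sketch. It relies on the additive property stated in the question and assumes cached intervals chain on shared endpoints, as in $oldResults. The name calculateWithCache and the $compute callback are hypothetical stand-ins for the real expensive function. At each position it greedily takes the longest cached chunk that starts there, which is simple but not guaranteed to minimise the amount of new computation in every case.

```php
<?php
// Hypothetical helper: sums cached chunks and computes only the gaps.
// $compute is any callable with the additive property from the question.
function calculateWithCache(int $from, int $to, array $cache, callable $compute): int
{
    // Index usable chunks by start position, keeping the longest chunk per start.
    $byStart = [];
    foreach ($cache as $c) {
        if ($c['s'] >= $from && $c['e'] <= $to && $c['e'] > $c['s']) {
            if (!isset($byStart[$c['s']]) || $c['e'] > $byStart[$c['s']]['e']) {
                $byStart[$c['s']] = $c;
            }
        }
    }
    ksort($byStart);

    $total = 0;
    $pos = $from;
    while ($pos < $to) {
        if (isset($byStart[$pos])) {
            // A cached chunk starts exactly here: reuse it and jump to its end.
            $total += $byStart[$pos]['r'];
            $pos = $byStart[$pos]['e'];
            continue;
        }
        // No chunk starts here: compute up to the next chunk start, or up to $to.
        $next = $to;
        foreach (array_keys($byStart) as $s) {
            if ($s > $pos) {
                $next = $s;
                break;
            }
        }
        $total += $compute($pos, $next);
        $pos = $next;
    }
    return $total;
}
```

With the $oldResults from the question and a request for 1 to 10, this reuses 1-2 and 2-6, skips 4-7, and calls $compute only for 6 to 10.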

Related

2D PHP array, join values based on similar values

I have PHP array which I use to draw a graph
Json format:
{"y":24.1,"x":"2017-12-04 11:21:25"},
{"y":24.1,"x":"2017-12-04 11:32:25"},
{"y":24.3,"x":"2017-12-04 11:33:30"},
{"y":24.1,"x":"2017-12-04 11:34:25"},
{"y":24.2,"x":"2017-12-04 11:35:35"},.........
{"y":26.2,"x":"2017-12-04 11:36:35"}, ->goes up for about a minute
{"y":26.3,"x":"2017-12-04 11:37:35"},.........
{"y":24.1,"x":"2017-12-04 11:38:25"},
{"y":24.3,"x":"2017-12-04 11:39:30"}
y is the temperature and x is the date and time.
As you can see, the temperature doesn't change very often, and when it does, it changes by 0.4 at most. But sometimes, after a long period of similar values, it changes by more than 0.4.
I would like to join those similar values so the graph would not have 200k similar values but only those that are "important".
I need advice on how to do this, or on which algorithm would be best for creating an optimized array like the one I want.
perfect output:
{"y":24.1,"x":"2017-12-04 11:21:25"},.........
{"y":24.1,"x":"2017-12-04 11:34:25"},
{"y":24.2,"x":"2017-12-04 11:35:35"},.........
{"y":26.2,"x":"2017-12-04 11:36:35"}, ->goes up for about a minute
{"y":26.3,"x":"2017-12-04 11:37:35"},.........
{"y":24.1,"x":"2017-12-04 11:38:25"}
Any help?

As you specified php I'm going to assume you can handle this on the output side.
Basically, you want logic like "if the absolute value of the temperature exceeds the last temperature by so much, or the time is greater than the last time by x minutes, then let's output a point on the graph". If that's the case you can get the result by the following:
$temps = array(); //your data in the question
$temp = 0;
$time = 0;
$time_max = 120; //two minutes
$temp_important = .4; //max you'll tolerate
$output = [];
foreach($temps as $point){
if(strtotime($point['x']) - $time > $time_max || abs($point['y'] - $temp) >= $temp_important){
// add it to output
$output[] = $point;
}
//update our data points
if(strtotime($point['x']) - $time > $time_max){
$time = strtotime($point['x']);
}
if(abs($point['y'] - $temp) >= $temp_important){
$temp = $point['y'];
}
}
// and out we go..
echo json_encode($output);
Hmm, that's not exactly what you're asking for, as if the temp spiked in a short time and then went down immediately, you'd need to change your logic - but think of it in terms of requirements.
If you're RECEIVING data on the output side I'd write something in javascript to store these points in/out and use the same logic. You might need to buffer 2-3 points to make your decision. Your logic here is performing an important task so you'd want to encapsulate it and make sure you could specify the parameters easily.
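To illustrate the "buffer a couple of points" idea, here is a minimal sketch that compares each point with the last kept one and, when a change is detected, also keeps the point just before it so that a flat stretch retains its end. The function name thinSeries and the 0.5 threshold in the usage note are assumptions chosen for illustration, and the time-based rule from the code above is left out to keep the sketch short.

```php
<?php
// Sketch: keep a point when it departs from the last kept value by at least
// $threshold, plus the point just before the change. The first and last
// points are always kept.
function thinSeries(array $points, float $threshold): array
{
    $n = count($points);
    if ($n === 0) {
        return [];
    }
    $kept = [$points[0]];
    $lastKeptIndex = 0;
    for ($i = 1; $i < $n; $i++) {
        $changed = abs($points[$i]['y'] - $points[$lastKeptIndex]['y']) >= $threshold;
        $isLast = ($i === $n - 1);
        if ($changed || $isLast) {
            // Keep the previous point too, unless it was already kept.
            if ($changed && $lastKeptIndex < $i - 1) {
                $kept[] = $points[$i - 1];
            }
            $kept[] = $points[$i];
            $lastKeptIndex = $i;
        }
    }
    return $kept;
}
```

Called on the sample data from the question with a threshold of 0.5, this keeps the first reading, the readings on either side of each jump, and the final reading.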

Implementing Cutting Stock Algorithm in PHP

I need to implement the Cutting Stock Problem with a php script.
As my math skills are not that great I am just trying to brute force it.
Starting with these parameters
$inventory is an array of lengths that are available to be cut.
$requestedPieces is an array of lengths that were requested by the customer.
$solution is an empty array
I have currently worked out this recursive function to come up with all possible solutions:
function branch($inventory, $requestedPieces, $solution){
// Loop through the requested pieces and find all inventory that can fulfill them
foreach($requestedPieces as $requestKey => $requestedPiece){
foreach($inventory as $inventoryKey => $piece){
if($requestedPiece <= $piece){
$solution2 = $solution;
array_push($solution2, array($requestKey, $inventoryKey));
$requestedPieces2 = $requestedPieces;
unset($requestedPieces2[$requestKey]);
$inventory2 = $inventory;
$inventory2[$inventoryKey] = $piece - $requestedPiece;
if(count($requestedPieces2) > 0){
branch($inventory2, $requestedPieces2, $solution2);
}else{
global $solutions;
array_push($solutions, $solution2);
}
}
}
}
}
The biggest inefficiency I have discovered with this is that it will find the same solution multiple times but with the steps in a different order.
For example:
$inventory = array(1.83, 20.66);
$requestedPieces = array(0.5, 0.25);
The function will come up with 8 solutions where it should come up with 4 solutions.
What is a good way to resolve this?

This does not answer your question, but I thought it could be worth being mentioned:
You have several other ways to solve your problem, rather than brute forcing it. The Wikipedia page on the topic is pretty thorough, but I'll just describe two other, simpler ideas. I will use the Wikipedia terminology for certain words, namely master for an inventory piece and cut for a requested piece. I will use set to denote a set of cuts pertaining to a given master.
The first one is based on the greedy algorithm and consists of filling a set with the largest available cut until no more cuts fit, then repeating that same process for each master, yielding a set for each one of them.
The second one is more dynamic: it uses recursion (like yours) and looks for the best fit for the remaining length of master and cuts at each step of the recursion, the goal being to minimize the wasted length when no more cuts can fit.
function branch($master, $cuts, $set){
$goods = array_filter($cuts, function($v) use ($master) { return $v <= $master;});
$res = array($master,$set,$cuts);
if (empty($goods))
return $res;
$remaining = array_diff($cuts, $goods);
foreach($goods as $k => $g){
$t = $set;
array_push($t, $g);
$r = $remaining;
$c = $goods;
for ($i = 0; $i < $k; $i++)
array_push($r,array_shift($c));
array_shift($c);
$t = branch($master - $g, $c, $t);
$t[2] = array_merge($t[2], $r); // put back the cuts that were not offered to this branch
if ($t[0] == 0) return $t;
if ($t[0] < $res[0])
$res = $t;
}
return $res;
}
The function above should give you the optimal set for a given master. It returns an array of 3 values:
the wasted length on master
the set
the remaining cuts
The parameters are
the master length,
the cuts to be performed (must be sorted in descending order),
the set of cuts already scheduled (a preexisting set, which would be empty for the first call for each master)
Caveats: It depends on the masters' order, you could certainly write a function which tries all the relevant possibilities to find the best order of masters.
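On the duplicate-solutions issue raised in the question, one possible fix (a sketch separate from the approach above) is to stop looping over every requested piece at each level and always place the first remaining piece instead. Each assignment of pieces to inventory is then generated exactly once, regardless of order. The name branchOrdered is hypothetical, the function returns the solutions instead of using the global $solutions, and array_key_first assumes PHP 7.3 or later.

```php
<?php
// Sketch: assign requested pieces in a fixed order so each assignment is found once.
function branchOrdered(array $inventory, array $requestedPieces, array $solution = []): array
{
    if (count($requestedPieces) === 0) {
        return [$solution]; // every piece has been placed
    }
    // Always place the first remaining request; never loop over all requests.
    $requestKey = array_key_first($requestedPieces);
    $requestedPiece = $requestedPieces[$requestKey];
    unset($requestedPieces[$requestKey]);

    $solutions = [];
    foreach ($inventory as $inventoryKey => $piece) {
        if ($requestedPiece <= $piece) {
            $inventory2 = $inventory;
            $inventory2[$inventoryKey] = $piece - $requestedPiece;
            $solution2 = $solution;
            $solution2[] = [$requestKey, $inventoryKey];
            foreach (branchOrdered($inventory2, $requestedPieces, $solution2) as $s) {
                $solutions[] = $s;
            }
        }
    }
    return $solutions;
}
```

For the example in the question, branchOrdered(array(1.83, 20.66), array(0.5, 0.25)) returns 4 solutions instead of 8.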

Generating a sequential five-digit alphanumeric ID

General Overview:
The function below spits out a random ID. I'm using this to provide a confirmation alias to identify a record. However, I've had to check for collision(however unlikely), because we are only using a five digit length. With the allowed characters listed below, it comes out to about 33 million plus combinations. Eventually we will get to five million or so records so collision becomes an issue.
The Problem:
Checking for dupe aliases is inefficient and resource heavy. Five million records is a lot to search through. Especially when this search is being conducted concurrently by different users.
My Question:
Is there a way to 'auto increment' the combinations allowed by this function? Meaning I only have to search for the last record's alias and move on to the next combination?
Acknowledged Limitations:
I realize the code would be vastly different than the function below. I also realize that mysql has an auto increment feature for numerical IDs, but the project is requiring a five digit alias with the allowed characters of '23456789ABCDEFGHJKLMNPQRSTUVWXYZ'. My hands are tied on that issue.
My Current Function:
public function random_id_gen($length)
{
$characters = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ';
$max = strlen($characters) - 1;
$string = '';
for ($i = 0; $i < $length; $i++) {
$string .= $characters[mt_rand(0, $max)];
}
return $string;
}

Why not just create a unique index on the alias column?
CREATE UNIQUE INDEX uniq_alias ON MyTable(alias);
at which point you can try your insert/update and if it returns an error, generate a new alias and try again.

What you really need to do is convert from base 10 to base strlen($characters).
PHP comes with a built in base_convert function, but it doesn't do exactly what you want as it will use the numbers zero, one and the letter 'o', which you don't have in your version. So you'll need a function to map the values from base_convert from/to your values:
function map_basing($number, $from_characters, $to_characters) {
if ( strlen($from_characters) != strlen($to_characters)) {
// ERROR!
}
$mapped = '';
foreach (str_split((string) $number) as $ch) {
$pos = strpos($from_characters, $ch);
if ( $pos !== false ) {
$mapped .= $to_characters[$pos];
} else {
// ERROR!
}
}
return $mapped;
}
Now that you have that:
public function next_id($last_id)
{
$my_characters = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ';
$std_characters ='0123456789abcdefghijklmnopqrstuv';
// Map from your basing to the standard basing.
$mapped = map_basing($last_id, $my_characters, $std_characters);
// Convert to base 10 integer and increment.
$intval = base_convert($mapped, strlen($my_characters), 10);
$intval++;
// Convert to standard basing, then to our custom basing.
$newval_std = str_pad(base_convert((string) $intval, 10, strlen($my_characters)), strlen($last_id), '0', STR_PAD_LEFT); // pad so leading digits are not dropped
$newval = map_basing($newval_std, $std_characters, $my_characters);
return $newval;
}
Might be some syntax errors in there, but you should get the gist of it.

You could roll your own auto-increment. It would probably be fairly inefficient though as you'd have to figure out where in the process your increment was. For instance, if you assigned the position in your random string as an integer and started with (0)(0)(0)(0)(0) that would equate to 22222 as the ID. Then to get the next one, just increment the last value to (0)(0)(0)(0)(1) which would translate into 22223. If the last one gets to your string length, then make it 0 and increment the second to last, etc... It's not exactly random, but it would be incremented and unique.
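The odometer-style increment described above can be sketched as follows. The function name nextAlias is hypothetical, and the sketch assumes the alias only contains characters from the allowed set. It does not address concurrent inserts, so a unique index on the alias column is still advisable.

```php
<?php
// Sketch: treat the alias as a number over the custom alphabet and add one,
// carrying to the left when a position wraps around.
function nextAlias(string $last, string $alphabet = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ'): string
{
    $base = strlen($alphabet);
    $chars = str_split($last);
    for ($i = count($chars) - 1; $i >= 0; $i--) {
        $pos = strpos($alphabet, $chars[$i]);
        if ($pos === false) {
            throw new InvalidArgumentException("Invalid character: {$chars[$i]}");
        }
        if ($pos < $base - 1) {
            // No carry needed: bump this position and stop.
            $chars[$i] = $alphabet[$pos + 1];
            return implode('', $chars);
        }
        // Wrap this position to the first character and carry left.
        $chars[$i] = $alphabet[0];
    }
    throw new OverflowException('Alias space exhausted');
}
```

For example, nextAlias('22222') gives '22223' and nextAlias('2222Z') gives '22232'.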

Calculate average without being thrown by strays

I am trying to calculate an average without being thrown off by a small set of far off numbers (ie, 1,2,1,2,3,4,50) the single 50 will throw off the entire average.
If I have a list of numbers like so:
19,20,21,21,22,30,60,60
The average is 31
The median is 30
The mode is 21 & 60 (averaged to 40.5)
But anyone can see that the majority is in the range 19-22 (5 in, 3 out), and if you take the average of just the major range it's 20.6 (quite different from any of the numbers above).
I am thinking that you can get this like so:
c+d-r
Where c is the count of numbers, d is the number of distinct values, and r is the range. Then you can apply this to all the possible ranges, and the highest score marks the optimal range to take an average from.
For example 19,20,21,21,22 would be 5 numbers, 4 distinct values, and the range is 3 (22 - 19). If you plug this into my equation you get 5+4-3=6
If you applied this to the entire number list it would be 8+6-41=-27
I think this works pretty good, but I have to create a huge loop to test against all possible ranges. In just my small example there are 21 possible ranges:
19-19, 19-20, 19-21, 19-22, 19-30, 19-60, 20-20, 20-21, 20-22, 20-30, 20-60, 21-21, 21-22, 21-30, 21-60, 22-22, 22-30, 22-60, 30-30, 30-60, 60-60
I am wondering if there is a more efficient way to get an average like this.
Or does someone have a better algorithm altogether?

You might get some use out of standard deviation here, which basically measures how concentrated the data points are. You can define an outlier as anything more than 1 standard deviation (or whatever other number suits you) from the average, throw them out, and calculate a new average that doesn't include them.
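A minimal sketch of that idea, assuming the population standard deviation and a cutoff of $k standard deviations (1 by default); the function name meanWithoutOutliers is hypothetical:

```php
<?php
// Sketch: mean of the values lying within $k standard deviations of the mean.
function meanWithoutOutliers(array $values, float $k = 1.0): float
{
    $n = count($values);
    if ($n === 0) {
        throw new InvalidArgumentException('No values given');
    }
    $mean = array_sum($values) / $n;
    $variance = 0.0;
    foreach ($values as $v) {
        $variance += ($v - $mean) ** 2;
    }
    $sd = sqrt($variance / $n); // population standard deviation (an assumption)
    $kept = array_filter($values, function ($v) use ($mean, $sd, $k) {
        return abs($v - $mean) <= $k * $sd;
    });
    // If everything was filtered out, fall back to the plain mean.
    return count($kept) > 0 ? array_sum($kept) / count($kept) : $mean;
}
```

For the list in the question (19,20,21,21,22,30,60,60) this drops the two 60s and returns about 22.2. Note that 30 is still within one standard deviation here, so a tighter cutoff would be needed to exclude it.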

Here's a pretty naive implementation that you could fix up for your own needs. I purposely kept it pretty verbose. It's based on the five-number-summary often used to figure these things out.
function get_median($arr) {
sort($arr);
$c = count($arr) - 1;
if ($c%2) {
$b = round($c/2);
$a = $b-1;
return ($arr[$b] + $arr[$a]) / 2 ;
} else {
return $arr[($c/2)];
}
}
function get_five_number_summary($arr) {
sort($arr);
$c = count($arr) - 1;
$fns = array();
if ($c%2) {
$b = round($c/2);
$a = $b-1;
$lower_quartile = array_slice($arr, 1, $a-1);
$upper_quartile = array_slice($arr, $b+1, count($lower_quartile));
$fns = array($arr[0], get_median($lower_quartile), get_median($arr), get_median($upper_quartile), $arr[$c]); // $arr[$c] is the maximum
return $fns;
}
else {
$b = round($c/2);
$a = $b-1;
$lower_quartile = array_slice($arr, 1, $a);
$upper_quartile = array_slice($arr, $b+1, count($lower_quartile));
$fns = array($arr[0], get_median($lower_quartile), get_median($arr), get_median($upper_quartile), $arr[$c]); // $arr[$c] is the maximum
return $fns;
}
}
function find_outliers($arr) {
$fns = get_five_number_summary($arr);
$interquartile_range = $fns[3] - $fns[1];
$low = $fns[1] - $interquartile_range;
$high = $fns[3] + $interquartile_range;
foreach ($arr as $v) {
if ($v > $high || $v < $low)
echo "$v is an outlier<br>";
}
}
//$numbers = array( 19,20,21,21,22,30,60 ); // 60 is an outlier
$numbers = array( 1,230,239,331,340,800); // 1 is an outlier, 800 is an outlier
find_outliers($numbers);
Note that this method, albeit much simpler to implement than standard deviation, will not find the two 60 outliers in your example, but it works pretty well. Use the code for whatever, hopefully it's useful!
To see how the algorithm works and how I implemented it, go to: http://www.mathwords.com/o/outlier.htm
This, of course, doesn't calculate the final average, but it's kind of trivial after you run find_outliers() :P

Why don't you use the median? It's not 30, it's 21.5.

You could put the values into an array, sort the array, and then find the median, which is usually a better number than the average anyway because it discounts outliers automatically, giving them no more weight than any other number.

You might sort your numbers, choose your preferred subrange (e.g., the middle 90%), and take the mean of that.
There is no one true answer to your question, because there are always going to be distributions that will give you a funny answer (e.g., consider a biased bi-modal distribution). This is why many statistics are often presented using box-and-whisker diagrams showing mean, median, quartiles, and outliers.

How to get a random value from 1~N but excluding several specific values in PHP?

rand(1,N) but excluding array(a,b,c,..),
is there already a built-in function that I don't know or do I have to implement it myself(how?) ?
UPDATE
A qualifying solution should have good performance whether the excluded array is big or small.

No built-in function, but you could do this:
function randWithout($from, $to, array $exceptions) {
sort($exceptions); // lets us use break; in the foreach reliably
$number = rand($from, $to - count($exceptions)); // or mt_rand()
foreach ($exceptions as $exception) {
if ($number >= $exception) {
$number++; // make up for the gap
} else /*if ($number < $exception)*/ {
break;
}
}
return $number;
}
That's off the top of my head, so it could use polishing - but at least you can't end up in an infinite-loop scenario, even hypothetically.
Note: The function breaks if $exceptions exhausts your range - e.g. calling randWithout(1, 2, array(1,2)) or randWithout(1, 2, array(0,1,2,3)) will not yield anything sensible (obviously), but in that case, the returned number will be outside the $from-$to range, so it's easy to catch.
If $exceptions is guaranteed to be sorted already, sort($exceptions); can be removed.

I don't think there's such a function built-in ; you'll probably have to code it yourself.
To code this, you have two solutions :
Use a loop, to call rand() or mt_rand() until it returns a correct value
which means calling rand() several times, in the worst case
but this should work OK if N is big, and you don't have many forbidden values.
Build an array that contains only legal values
And use array_rand to pick one value from it
which will work fine if N is small

Depending on exactly what you need, and why, this approach might be an interesting alternative.
$numbers = array_diff(range(1, N), array(a, b, c));
// Either (not a real answer, but could be useful, depending on your circumstances)
shuffle($numbers); // $numbers is now a randomly-sorted array containing all the numbers that interest you
// Or:
$x = $numbers[array_rand($numbers)]; // $x is now a random number selected from the set of numbers you're interested in
So, if you don't need to generate the set of potential numbers each time, but are generating the set once and then picking a bunch of random number from the same set, this could be a good way to go.

The simplest way...
<?php
function rand_except($min, $max, $excepting = array()) {
$num = mt_rand($min, $max);
return in_array($num, $excepting) ? rand_except($min, $max, $excepting) : $num;
}
?>

What you need to do is calculate an array of skipped locations so you can pick a random position in a contiguous array of length M = N - (number of exceptions) and easily map it back to the original array with holes. This requires time and space proportional to the size of the exceptions array. I don't know PHP from a hole in the ground, so forgive the textual semi-pseudocode example.
Make a new array Offset[] the same length as the Exceptions array.
In Offset[i], store the first index in the imagined non-holey array that would have skipped i elements in the original array.
Now, to pick a random element, select a random number r in 0..M-1, where M is the number of remaining elements.
Find i such that Offset[i] <= r < Offset[i+1]; this is easy with a binary search.
Return r + i.
Now, that is just a sketch; you will need to deal with the ends of the arrays, whether things are indexed from 0 or 1, and all that jazz. If you are clever you can actually compute the Offset array on the fly from the original, though it is a bit less clear that way.
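One way the steps above could look in PHP is sketched below. The names nthAllowed and randExcluding are hypothetical, and instead of storing a separate Offset array the sketch computes each offset on the fly as $exceptions[$i] - $from - $i (the number of allowed values below the i-th exception). It assumes integer ranges and uses mt_rand, so it is not suitable for security-sensitive use.

```php
<?php
// Maps the 0-based index $r to the $r-th allowed value at or above $from,
// skipping the sorted, unique values in $exceptions. Uses binary search.
function nthAllowed(int $from, int $r, array $exceptions): int
{
    $lo = 0;
    $hi = count($exceptions);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        // Number of allowed values below $exceptions[$mid].
        if ($exceptions[$mid] - $from - $mid <= $r) {
            $lo = $mid + 1;
        } else {
            $hi = $mid;
        }
    }
    // $lo is the number of exceptions that must be skipped.
    return $from + $r + $lo;
}

function randExcluding(int $from, int $to, array $exceptions): int
{
    // Keep only exceptions inside the range, deduplicated and sorted.
    $exceptions = array_unique(array_filter(
        $exceptions,
        function ($e) use ($from, $to) {
            return $e >= $from && $e <= $to;
        }
    ));
    sort($exceptions); // also reindexes from 0
    $allowed = $to - $from + 1 - count($exceptions);
    if ($allowed <= 0) {
        throw new RangeException('No allowed values left in range');
    }
    return nthAllowed($from, mt_rand(0, $allowed - 1), $exceptions);
}
```

Each call costs one sort of the exceptions plus a binary search; if the exception list is fixed, it could be sorted once and reused so that each pick only pays for the search.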

Maybe its too late for answer, but I found this piece of code somewhere in my mind when trying to get random data from Database based on random ID excluding some number.
$excludedData = array(); // This is your excluded number
$maxVal = $this->db->count_all_results("game_pertanyaan"); // Get the maximum number based on my database
$randomNum = rand(1, $maxVal); // Make first initiation, I think you can put this directly in the while > in_array paramater, seems working as well, it's up to you
while (in_array($randomNum, $excludedData)) {
$randomNum = rand(1, $maxVal);
}
$randomNum; //Your random number excluding some number you choose

This is the fastest & best performance way to do it :
$all = range($Min,$Max);
$diff = array_diff($all,$Exclude);
shuffle($diff );
$data = array_slice($diff,0,$quantity);
