I'm learning PHP. Having trouble understanding why this piece of code isn't working.
In particular: why is the result of array_sum($x) (1596) greater than $cap? Perhaps I'm not understanding the nature of while loops, but it seems to me (looking at a print_r($x)), the loop should cut out a step before it actually does.
<?php
function fibonacci_sum($cap = 1000){
list( $cur, $nxt, $seq ) = array( 0, 1, array() );
while ( array_sum($seq) < $cap ) {
$seq[] = $cur;
$add = $cur + $nxt;
$cur = $nxt;
$nxt = $add;
}
return $seq;
}
$x = fibonacci_sum();
echo array_sum($x);
?>
Any insight is appreciated.
Best,
matt
Look at it this way: if array_sum($x) were less than $cap, then the body of the while loop would be executed again. Therefore, when the while loop has stopped executing, by definition the condition at the top of the while will be false. (*) I think you want to say:
while ( array_sum($seq) + $cur < $cap ) {
That will stop just before you exceed $cap, which is what you seem to want to do.
(*) Yeah, yeah, disclaimer about the effect of break statements.
Simple order of execution.
If you add a little variable watcher it's easy to see how this happens
while ( array_sum( $seq ) < $cap )
{
echo $cur, ' : ', array_sum( $seq ), '<br>';
$seq[] = $cur;
$add = $cur + $nxt;
$cur = $nxt;
$nxt = $add;
}
If you run this, the last output shows 610 : 986. So, 986 is less than 1000, which is why that iteration of the loop executes, but the very first line of the loop pushes $cur onto $seq anyway - completely oblivious to the fact that it just made the sum creater than the cap. So when you do array_sum() outside the function, that 610 is part of it, hence the 1596.
So what you really want is when the array sum plus the next value is less than the cap.
while ( array_sum( $seq ) + $cur < $cap )
But, it's a bit weird that your function returns an array at all. Why not just have it return the sum directly?
The sum is less than cap the last time the body of the while loop is entered. That's how while loops work in all major languages. In this case, the sum is greater than cap when the loop is exited for the last time.
Related
Im trying to slice an array on 30 at a time to send a call to an endpoint which can only recibe 30 codes at a time. Right now I have about 203 lines in the array. Everything is fine until the 4th round where I don't know why the indexes (init and endt) I've set up, get messed up and the slice no longer works which in turns makes the call to the endpoint impossible. Here is the code:
$arrcount = ceil(count($requestarr['SelectionDetails'])/30);
$init = 0;
if (count($requestarr['SelectionDetails']) > 30) {
$endt = 29;
} else {
$endt = count($requestarr['SelectionDetails']);
}
for ($i=0; $i < $arrcount; $i++) {
$request['SelectionDetails'] = array_slice($requestarr['SelectionDetails'], $init, $endt);
$init = $endt+1;
if (count($requestarr['SelectionDetails']) > ($endt+30)) {
$endt += 30;
} else {
$endt = count($requestarr['SelectionDetails']);
}
//call to endpoind and other code here....
}
The requestarr[] has the whole 203 indexes and the request[] contains the 30 indexes slice sent each loop. Its worth noticing that my array has "custom" keys that are tracking numbers from FedEx. At the 4th loop, the init turns into 120 and the endt turns into 113. Im not sure how or why this happens. Needless to say the next 3 for loops the endt gets reduced by 7 then by 34 and finally by 1. There is no code where the endt should be decreasing. So I'm not sure how this is happening. Any ideas? If you need more information or more code I'll gladly help.
To split items into chunks - use array_chunk:
$chunks = array_chunk($your_data, 30);
foreach ($chunks as $chunk) {
// send $chunk to endpoint
}
$arrcount = ceil(count($requestarr['SelectionDetails'])/30);
for ($i=0; $i < $arrcount; $i++) {
$request['SelectionDetails'] = array_slice($requestarr['SelectionDetails'], $i * 30, 30);
//call to endpoind and other code here....
}
array_slice can take a length argument longer than the array. You just have to move the starting point.
I have such an array of intervals sorted by the lower bound ($a[$i] <= $a[$i+1] for every $i), key l is lower bound and , key h is upper bound and I'd like to remove all rows with intervals that are enclosed by larger intervals.
$a[0] = array('l' => 123, 'h'=>241);
$a[1] = array('l' => 250, 'h'=>360);
$a[2] = array('l' => 280, 'h'=>285);
$a[3] = array('l' => 310, 'h'=>310);
$a[4] = array('l' => 390, 'h'=>400);
So the result I'd like to get is
$a[0] = array('l' => 123, 'h'=>241);
$a[1] = array('l' => 250, 'h'=>360);
$a[2] = array('l' => 390, 'h'=>400);
This is what I attempted
function dup($a){
$c = count($a)-1;
for ($i = $c; $i > 0; $i --){
while ($a[$i]['h'] <= $a[$i-1]['h']){
unset($a[$i]);
}
}
$a = array_values($a);
}
The first answer which comes in mind was given with different variations by other contributors : for each interval, loop on each interval looking for a larger and enclosing interval. It's simple to understand and to write, and it works for sure.
This is basically n2 order, which means for n intervals we'll do n*n loop turns. There can be some tricks to optimize it :
break'ing when we find an enclosing interval in the nested loop, as in user3137702's answer, because it's useless to continue if we find at least one enclosing interval
avoiding looping on the same interval in the nested loop because we know an interval cant be strictly enclosed in itself (not significant)
avoiding looping on already excluded intervals in the nested loop (can have a significant impact)
looping on intervals (global loop) in ascending width = (h - l) order, because smaller intervals have more chance to be enclosed in others and the earliest we eliminate intervals, the more the next loop turns are effective (can be significant too in my opinion)
searching for enclosing intervals (nested loop) in descending width order, because larger intervals have more chance to be enclosing other intervals (I think it can have a significant impact too)
probably many other things that do not come to mind at the moment
Let me say now that :
optimization does not matter much if we have only few intervals to compute from time to time, and currently accepted user3137702's answer does the trick
to develop the suitable algorithm, it is necessary anyway to study the characteristics of the data that we have to deal with : in the case before us, how is the distribution of intervals ? Are there many enclosed intervals ? This can help to choose from the above list, the most useful tricks.
For educational purposes, I wondered if we could develop a different algorithm avoiding a n*n order which running time is necessarily very quickly deteriorated gradually as you increase the number of intervals to compute.
"Virtual rule" algorithm
I imagined this algorithm I called the "virtual rule".
place starting and ending points of the intervals on a virtual rule
run through the points along the rule in ascending order
during the run, register open or not intervals
when an interval starts and ends while another was opened before and is still open, we can say it is enclosed
so when an interval ends, check if it was opened after one of the other currently open intervals and if it is strictly closed before this interval. If yes, it is enclosed !
I do not pretend this is the best solution. But we can assume this is faster than the basic method because, despite many tests to do during the loop, this is n order.
Code example
I wrote comments to make it as clear as possible.
<?php
function removeEnclosedIntervals_VirtualRule($a, $debug = false)
{
$rule = array();
// place one point on a virtual rule for each low or up bound, refering to the interval's index in $a
// virtual rule has 2 levels because there can be more than one point for a value
foreach($a as $i => $interval)
{
$rule[$interval['l']][] = array('l', $i);
$rule[$interval['h']][] = array('h', $i);
}
// used in the foreach loop
$open = array();
$enclosed = array();
// loop through the points on the ordered virtual rule
ksort($rule);
foreach($rule as $points)
{
// Will register open intervals
// When an interval starts and ends while another was opened before and is still open, it is enclosed
// starts
foreach($points as $point)
if($point[0] == 'l')
$open[$point[1]] = $point[1]; // register it as open
// ends
foreach($points as $point)
{
if($point[0] == 'h')
{
unset($open[$point[1]]); // UNregister it as open
// was it opened after a still open interval ?
foreach($open as $i)
{
if($a[$i]['l'] < $a[$point[1]]['l'])
{
// it is enclosed.
// is it *strictly* enclosed ?
if($a[$i]['h'] > $a[$point[1]]['h'])
{
// so this interval is strictly enclosed
$enclosed[$point[1]] = $point[1];
if($debug)
echo debugPhrase(
$point[1], // $iEnclosed
$a[$point[1]]['l'], // $lEnclosed
$a[$point[1]]['h'], // $hEnclosed
$i, // $iLarger
$a[$i]['l'], // $lLarger
$a[$i]['h'] // $hLarger
);
break;
}
}
}
}
}
}
// obviously
foreach($enclosed as $i)
unset($a[$i]);
return $a;
}
?>
Benchmarking against basic method
It runs tests on randomly generated intervals
basic method works without a doubt. Comparing results from the two methods allows me to predent the "VirtualRule" method works because as far as I tested, it returned the same results
// * include removeEnclosingIntervals_VirtualRule function *
// arbitrary range for intervals start and end
// Note that it could be interesting to do benchmarking with different MIN and MAX values !
define('MIN', 0);
define('MAX', 500);
// Benchmarking params
define('TEST_MAX_NUMBER', 100000);
define('TEST_BY_STEPS_OF', 100);
// from http://php.net/manual/en/function.microtime.php
// used later for benchmarking purpose
function microtime_float()
{
list($usec, $sec) = explode(" ", microtime());
return ((float)$usec + (float)$sec);
}
function debugPhrase($iEnclosed, $lEnclosed, $hEnclosed, $iLarger, $lLarger, $hLarger)
{
return '('.$iEnclosed.')['.$lEnclosed.' ; '.$hEnclosed.'] is strictly enclosed at least in ('.$iLarger.')['.$lLarger.' ; '.$hLarger.']'.PHP_EOL;
}
// 2 foreach loops solution (based on user3137702's *damn good* work ;) and currently accepted answer)
function removeEnclosedIntervals_Basic($a, $debug = false)
{
foreach ($a as $i => $valA)
{
$found = false;
foreach ($a as $j => $valB)
{
if (($valA['l'] > $valB['l']) && ($valA['h'] < $valB['h']))
{
$found = true;
if($debug)
echo debugPhrase(
$i, // $iEnclosed
$a[$i]['l'], // $lEnclosed
$a[$i]['h'], // $hEnclosed
$j, // $iLarger
$a[$j]['l'], // $lLarger
$a[$j]['h'] // $hLarger
);
break;
}
}
if (!$found)
{
$out[$i] = $valA;
}
}
return $out;
}
// runs a benchmark with $number intervals
function runTest($number)
{
// Generating a random set of intervals with values between MIN and MAX
$randomSet = array();
for($i=0; $i<$number; $i++)
// avoiding self-closing intervals
$randomSet[] = array(
'l' => ($l = mt_rand(MIN, MAX-2)),
'h' => mt_rand($l+1, MAX)
);
/* running the two methods and comparing results and execution time */
// Basic method
$start = microtime_float();
$Basic_result = removeEnclosedIntervals_Basic($randomSet);
$end = microtime_float();
$Basic_time = $end - $start;
// VirtualRule
$start = microtime_float();
$VirtualRule_result = removeEnclosedIntervals_VirtualRule($randomSet);
$end = microtime_float();
$VirtualRule_time = $end - $start;
// Basic method works for sure.
// If results are the same, comparing execution time. If not, sh*t happened !
if(md5(var_export($VirtualRule_result, true)) == md5(var_export($VirtualRule_result, true)))
echo $number.';'.$Basic_time.';'.$VirtualRule_time.PHP_EOL;
else
{
echo '/;/;/;Work harder, results are not the same ! Cant say anything !'.PHP_EOL;
stop;
}
}
// CSV header
echo 'Number of intervals;Basic method exec time (s);VirtualRule method exec time (s)'.PHP_EOL;
for($n=TEST_BY_STEPS_OF; $n<TEST_MAX_NUMBER; $n+=TEST_BY_STEPS_OF)
{
runTest($n);
flush();
}
Results (for me)
As I thought, clearly different performances are obtained.
I ran the tests on a Core i7 computer with PHP5 and on a (old) AMD Quad Core computer with PHP7. There are clear differences in performance between the two versions on my systems ! which in principle can be explained by the difference in PHP versions because the computer that is running PHP5 is much more powerful...
A simplistic approach, maybe not exactly what you want, but should at least point you in the right direction. I can refine it if needed, just a bit busy and didn't want to leave the question unanswered..
$out = [];
foreach ($a as $valA)
{
$found = false;
foreach ($a as $valB)
{
if (($valA['l'] > $valB['l']) && ($valA['h'] < $valB['h']))
{
$found = true;
break;
}
}
if (!$found)
{
$out[] = $valA;
}
}
This is entirely untested, but should end up with only the unique (large) ranges in $out. Overlaps as I mentioned in my comment are unhandled.
The problem was missing break in the while cycle
function dup($a){
$c = count($a)-1;
for ($i = $c; $i > 0; $i --){
while ($a[$i]['h'] <= $a[$i-1]['h']){
unset($a[$i]);
break; //here
}
}
$a = array_values($a);
}
Here is the code
function sort_by_low($item1,$item2){
if($item1['l'] == $item2['l'])
return 0;
return ($item1['l']>$item2['l'])? -1:1;
}
usort($a,'sort_by_low');
for($i=0; $i<count($a); $i++){
for($j=$i+1; $j<count($a);$j++){
if($a[$i][l]<=$a[$j]['l'] && $a[$i][h]>=$a[$j]['h']){
unset($a[$j]);
}
}
}
$a=array_values($a);
Here is the working code (Tested)
$result = array();
usort($a, function ($item1, $item2) {
if ($item1['l'] == $item2['l']) return 0;
return $item1['l'] < $item2['l'] ? -1 : 1;
});
foreach ($a as $element) {
$exists = false;
foreach ($result as $r) {
if (($r['l'] < $element['l'] && $r['h'] > $element['h'])) {
$exists = true;
break;
}
}
if (!$exists) {
$result[] = $element;
}
}
$result will contain the desired result
Supposedly string is:
$a = "abc-def"
if (preg_match("/[^a-z0-9]/i", $a, $m)){
$i = "i stopped scanning '$a' because I found a violation in it while
scanning it from left to right. The violation was: $m[0]";
}
echo $i;
example above: should indicate "-" was the violation.
I would like to know if there is a non-preg_match way of doing this.
I will likely run benchmarks if there is a non-preg_match way of doing this perhaps 1000 or 1 million runs, to see which is faster and more efficient.
In the benchmarks "$a" will be much longer.
To ensure it is not trying to scan the entire "$a" and to ensure it stops soon as it detects a violation within the "$a"
Based on information I have witnessed on the internet, preg_match stops when the first match is found.
UPDATE:
this is based on the answer that was given by "bishop" and will likely to be chosen as the valid answer soon ( shortly ).
i modified it a little bit because i only want it to report the violator character. but i also commented that line out so benchmark can run without entanglements.
let's run a 1 million run based on that answer.
$start_time = microtime(TRUE);
$count = 0;
while ($count < 1000000){
$allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$input = 'abc-def';
$validLen = strspn($input, $allowed);
if ($validLen < strlen($input)){
#echo "violation at: ". substr($input, $validLen,1);
}
$count = $count + 1;
};
$end_time = microtime(TRUE);
$dif = $end_time - $start_time;
echo $dif;
the result is: 0.606614112854
( 60 percent of a second )
let's do it with the preg_match method.
i hope everything is the same. ( and fair )..
( i say this because there is the ^ character in the preg_match )
$start_time = microtime(TRUE);
$count = 0;
while ($count < 1000000){
$input = 'abc-def';
preg_match("/[^a-z0-9]/i", $input, $m);
#echo "violation at:". $m[0];
$count = $count + 1;
};
$end_time = microtime(TRUE);
$dif = $end_time - $start_time;
echo $dif;
i use "dif" in reference to the terminology "difference".
the "dif" was.. 1.1145210266113
( took 11 percent more than a whole second )
( if it was 1.2 that would mean it is 2x slower than the php way )
You want to find the location of the first character not in the given range, without using regular expressions? You might want strspn or its complement strcspn:
$allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
$input = 'abc-def';
$validLen = strspn($input, $allowed);
if (strlen($input) !== $validLen) {
printf('Input invalid, starting at %s', substr($input, $validLen));
} else {
echo 'Input is valid';
}
Outputs Input invalid, starting at -def. See it live.
strspn (and its complement) are very old, very well specified (POSIX even). The standard implementations are optimized for this task. PHP just leverages that platform implementation, so PHP should be fast, too.
I'm attempting to solve Project Euler in PHP and running into a problem with my for loop conditions inside the while loop. Could someone point me towards the right direction? Am I on the right track here?
The problem, btw, is to find the sums of all prime numbers below 2,000,000
Other note: The problem I'm encountering is that it seems to be a memory hog and besides implementing the sieve, I'm not sure how else to approach this. So, I'm wondering if I did something wrong in the implementation.
<?php
// The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
// Additional information:
// Sum below 100: 1060
// 1000: 76127
// (for testing)
// Find the sum of all the primes below 2,000,000.
// First, let's set n = 2 mill or the number we wish to find
// the primes under.
$n = 2000000;
// Then, let's set p = 2, the first prime number.
$p = 2;
// Now, let's create a list of all numbers from p to n.
$list = range($p, $n);
// Now the loop for Sieve of Eratosthenes.
// Also, let $i = 0 for a counter.
$i = 0;
while($p*$p < $n)
{
// Strike off all multiples of p less than or equal to n
for($k=0; $k < $n; $k++)
{
if($list[$k] % $p == 0)
{
unset($list[$k]);
}
}
// Re-initialize array
sort ($list);
// Find first number on list after p. Let that equal p.
$i = $i + 1;
$p = $list[$i];
}
echo array_sum($list);
?>
You can make a major optimization to your middle loop.
for($k=0; $k < $n; $k++)
{
if($list[$k] % $p == 0)
{
unset($list[$k]);
}
}
By beginning with 2*p and incrementing by $p instead of by 1. This eliminates the need for divisibility check as well as reducing the total iterations.
for($k=2*$p; $k < $n; $k += $p)
{
if (isset($list[k])) unset($list[$k]); //thanks matchu!
}
The suggestion above to check only odds to begin with (other than 2) is a good idea as well, although since the inner loop never gets off the ground for those cases I don't think its that critical. I also can't help but thinking the unsets are inefficient, tho I'm not 100% sure about that.
Here's my solution, using a 'boolean' array for the primes rather than actually removing the elements. I like using map,filters,reduce and stuff, but i figured id stick close to what you've done and this might be more efficient (although longer) anyway.
$top = 20000000;
$plist = array_fill(2,$top,1);
for ($a = 2 ; $a <= sqrt($top)+1; $a++)
{
if ($plist[$a] == 1)
for ($b = ($a+$a) ; $b <= $top; $b+=$a)
{
$plist[$b] = 0;
}
}
$sum = 0;
foreach ($plist as $k=>$v)
{
$sum += $k*$v;
}
echo $sum;
When I did this for project euler i used python, as I did for most. but someone who used PHP along the same lines as the one I did claimed it ran it 7 seconds (page 2's SekaiAi, for those who can look). I don't really care for his form (putting the body of a for loop into its increment clause!), or the use of globals and the function he has, but the main points are all there. My convenient means of testing PHP runs thru a server on a VMWareFusion local machine so its well slower, can't really comment from experience.
I've got the code to the point where it runs, and passes on small examples (17, for instance). However, it's been 8 or so minutes, and it's still running on my machine. I suspect that this algorithm, though simple, may not be the most effective, since it has to run through a lot of numbers a lot of times. (2 million tests on your first run, 1 million on your next, and they start removing less and less at a time as you go.) It also uses a lot of memory since you're, ya know, storing a list of millions of integers.
Regardless, here's my final copy of your code, with a list of the changes I made and why. I'm not sure that it works for 2,000,000 yet, but we'll see.
EDIT: It hit the right answer! Yay!
Set memory_limit to -1 to allow PHP to take as much memory as it wants for this very special case (very, very bad idea in production scripts!)
In PHP, use % instead of mod
The inner and outer loops can't use the same variable; PHP considers them to have the same scope. Use, maybe, $j for the inner loop.
To avoid having the prime strike itself off in the inner loop, start $j at $i + 1
On the unset, you used $arr instead of $list ;)
You missed a $ on the unset, so PHP interprets $list[j] as $list['j']. Just a typo.
I think that's all I did. I ran it with some progress output, and the highest prime it's reached by now is 599, so I'll let you know how it goes :)
My strategy in Ruby on this problem was just to check if every number under n was prime, looping through 2 and floor(sqrt(n)). It's also probably not an optimal solution, and takes a while to execute, but only about a minute or two. That could be the algorithm, or that could just be Ruby being better at this sort of job than PHP :/
Final code:
<?php
ini_set('memory_limit', -1);
// The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
// Additional information:
// Sum below 100: 1060
// 1000: 76127
// (for testing)
// Find the sum of all the primes below 2,000,000.
// First, let's set n = 2 mill or the number we wish to find
// the primes under.
$n = 2000000;
// Then, let's set p = 2, the first prime number.
$p = 2;
// Now, let's create a list of all numbers from p to n.
$list = range($p, $n);
// Now the loop for Sieve of Eratosthenes.
// Also, let $i = 0 for a counter.
$i = 0;
while($p*$p < $n)
{
// Strike off all multiples of p less than or equal to n
for($j=$i+1; $j < $n; $j++)
{
if($list[$j] % $p == 0)
{
unset($list[$j]);
}
}
// Re-initialize array
sort ($list);
// Find first number on list after p. Let that equal p.
$i = $i + 1;
$p = $list[$i];
echo "$i: $p\n";
}
echo array_sum($list);
?>
Im making my own function to search for the total number of something. but it is not working propperly. function getNumberOfCounts gets the $fromIndex but not the searchWord
public function getNumberOfCounts( $searchWord, $fromIndex )
{
$index = $fromIndex;
$counter = 0;
while( $index <= $endPos )
{
$index++;
$pos = strpos( $this->text, $searchWord, ($index+1) );
if( $pos > $index )
{
$counter++;
$index = $pos;
}
else
break;
}
return $counter;
}
public function searchDemo()
{
$startPos = 11; // ex
echo "<br /> count= " . $this->getNumberOfCounts( "Lorem", $startPos );
}
They are both part of the same class ofc.
EDIT: i know there is some missing info, but if I try to print $searchWord on the first line of getNumberOfCounts nothing is outputted.
You will likely have less problems and better performing code when just using
substr_count — Count the number of substring occurrences
You're starting from position 11 for starters, which would be greater than the total length of "Lorem". Secondly, in your while loop, you're running while startpos < endpos. $endpos hasn't been assigned a value yet, so it's not even entering the loop.
While I'm at it, you're incrementing index at the start of the loop (probably not desired). It's usually the case the the index is incremented at the end of the loop, so you can use the index to "index into" an array without having to shift position.