Faster to use in_array() or large if-conditional? - php

I'm doing a check in a loop to see if a string equals another string. Easy stuff.
However, it seems that I keep adding to the strings to check against, and have like ten different strings I'm checking against with each loop through. It's easier code wise to just create an array of the strings to check against, and then do in_array();, but I was wondering which would parse faster and use less system resources?
Array
$hideme = array(".", "..", "Thumb.db", "index.php", "icons", "index_backup.php",
"style.css", "highlighter.css", "highlighter.js", "users");
if (!in_array($sub, $hideme)) {
String != String
if ($sub != "." && $sub != ".." ...etc
The difference is probably negligible, just curious for future reference.

Use the first one. There won't be much of a difference in speed, but readability is the real difference.
CPU cycles are cheap. Programmer hours are not.

Built-in functions are generally faster, as they are compiled C code, while PHP code must be interpreted.
If you really care about CPU cycles, isset() is fastest, so setting the possible values as array keys will be the fastest way. Of course, there is a CPU vs. memory trade-off, so which approach uses fewer system resources depends on which resource you want to save.
As @Kendall Frey stated, this is micro-optimization, so keep the code readable and don't optimize unless a profiler shows that this code has a large impact on execution.

The easiest solution (and perhaps the fastest) that will easily scale if your $hideme array gets big is to use isset(), with the strings stored as array keys rather than values.
// array_flip() turns the values into keys so that isset() can look them up
$hideme = array_flip(array(".", "..", "Thumb.db", "index.php", "icons", "index_backup.php",
"style.css", "highlighter.css", "highlighter.js", "users"));
if (!isset($hideme[$sub])) {
// $sub is not in $hideme
}
For small arrays, in_array works fine but is generally slower and can become too slow if your array is large.
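A rough way to check this on your own machine is sketched below. The helper name timeLookups() and the iteration count are made up for illustration, and the timings will vary by PHP version and hardware, so treat it as a starting point rather than a definitive benchmark.

```php
<?php
// Hypothetical helper: run $check $iterations times and return the elapsed seconds.
function timeLookups(callable $check, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $check();
    }
    return microtime(true) - $start;
}

$hideme = array(".", "..", "Thumb.db", "index.php", "icons", "index_backup.php",
    "style.css", "highlighter.css", "highlighter.js", "users");
$hidemeKeys = array_flip($hideme); // values become keys: "." => 0, ".." => 1, ...
$sub = "users";                    // last entry, so in_array() scans the whole list

// Linear scan over the values.
$linear = timeLookups(function () use ($sub, $hideme) {
    return in_array($sub, $hideme, true);
}, 100000);

// Hash lookup on the keys.
$hashed = timeLookups(function () use ($sub, $hidemeKeys) {
    return isset($hidemeKeys[$sub]);
}, 100000);

printf("in_array: %.4f s, isset: %.4f s\n", $linear, $hashed);
```

With only ten entries the two numbers are likely to be close; the gap should only become interesting as the list grows.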


Best way to count coincidences in an Array in PHP

I have an array of thousands of rows and I want to know the best way, or the best practice, to count the number of rows in PHP that match a condition.
In the example you can see that I count the records whose score falls within a range.
I'm considering these 2 options:
Option 1:
$rowCount = 0;
foreach ($this->data as $row) {
if ($row['score'] >= $rangeStart && $row['score'] <= $rangeEnd) {
$rowCount++;
}
}
return $rowCount;
Option 2:
$countScoreRange = array_filter($this->data, function($result) {
return $result['score'] >= $this->rangeStart && $result['score'] <= $this->rangeEnd;
});
return count($countScoreRange);
Thanks in advance.
It depends on what you mean by best practices.
If your idea of best practice is about performance, there is one trade-off that you must care about:
**speed <===> memory**
If you care about memory:
When you think about performance while iterating over an iterable in PHP, you can use yield to create a generator function. From the PHP docs:
What is a generator function?
A generator function looks just like a normal function, except that instead of returning a value, a generator yields as many values as it needs to. Any function containing yield is a generator function.
Why use a generator function?
A generator allows you to write code that uses foreach to iterate over a set of data without needing to build an array in memory, which may cause you to exceed a memory limit, or require a considerable amount of processing time to generate.
So it's better to dedicate a small amount of memory than to hold an array of thousands of elements.
One downside:
Unfortunately, PHP also does not allow you to use the array traversal functions on generators, including array_filter, array_map, etc.
So, to summarize so far, if:
you are iterating over an array
of thousands of elements,
especially inside a function that is called from many places,
and you care about performance, especially memory usage,
then it's highly recommended to use generator functions instead.
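As a minimal sketch of that idea, the generator below yields one row at a time and the counting loop from Option 1 consumes it. The names scoreRows() and countInRange() and the generated data are made up for illustration; in real code the rows would come from wherever $this->data is loaded.

```php
<?php
// Hypothetical row source: yields one row at a time instead of building a full array.
function scoreRows(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield ['id' => $i, 'score' => $i % 100]; // made-up data
    }
}

// Counts rows whose score lies in [$rangeStart, $rangeEnd]; accepts arrays and generators.
function countInRange(iterable $rows, int $rangeStart, int $rangeEnd): int
{
    $rowCount = 0;
    foreach ($rows as $row) {
        if ($row['score'] >= $rangeStart && $row['score'] <= $rangeEnd) {
            $rowCount++;
        }
    }
    return $rowCount;
}

echo countInRange(scoreRows(1000), 10, 19), "\n";
```

Because countInRange() only needs foreach, it works with a generator even though array_filter would not.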
**But if you care about speed:**
The comparison is between 4 things:
for
foreach
array_* functions
array pointer functions (such as next(), reset(), etc.)
foreach is said to be slow in comparison to the for loop, because foreach works on a copy of the array being iterated. (In PHP 7 and later that copy is copy-on-write, so the data is only duplicated if the array is modified during the loop.)
But there are some tricks if you want to use it:
For improved performance, iterate by reference. Besides, foreach is easy to use.
What about foreach and array_filter?
As the previous answer said, and also based on these articles:
It is incorrect to assume that array_* functions are faster!
Of course, if you work on a critical system you should consider this advice. Array functions are a little bit slower than basic loops, but I think that there is no significant impact on performance.
Using foreach is much faster, and incrementing your own counter is also faster than using count().
I once benchmarked both, and foreach was about 3x faster than array_filter.
I'll go with the first option.
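For anyone who wants to compare the two options outside a class, here is a small sketch. The sample rows are made up, and the closure imports the range with use because there is no $this to read it from.

```php
<?php
$data = [['score' => 3], ['score' => 7], ['score' => 12], ['score' => 8]]; // made-up rows
$rangeStart = 5;
$rangeEnd = 10;

// Option 1: plain loop with a counter, no intermediate array.
$loopCount = 0;
foreach ($data as $row) {
    if ($row['score'] >= $rangeStart && $row['score'] <= $rangeEnd) {
        $loopCount++;
    }
}

// Option 2: array_filter() builds a new array of matches, then count() measures it.
$filterCount = count(array_filter($data, function ($row) use ($rangeStart, $rangeEnd) {
    return $row['score'] >= $rangeStart && $row['score'] <= $rangeEnd;
}));

echo $loopCount, " ", $filterCount, "\n";
```

Both approaches give the same count; the difference is that Option 2 allocates a temporary array of the matching rows.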

Create new variable or reassign the old one - php

Which is less expensive in terms of performance, and why? The first case creates a new variable, but in the second case, shouldn't PHP first unset $var1 before reassigning it?
1)
$var1 = $someBigArray;
$var2 = $this->someFunction($var1);
// use $var2
2)
$var1 = $someBigArray;
$var1 = $this->someFunction($var1);
// use $var1
UPDATE
I can't really do this; I left out the rest of my code to ask about the core part and make it look simpler
$var1 = $this->someFunction($someBigArray);
There are two things to consider here: one is processing, the other is memory.
About processing:
In PHP you don't really know the type of a variable. If $someBigArray and the return value of $this->someFunction($var1) are not the same type, reassigning will be a lot more expensive than assigning to a new variable $var2. If they are the same type, reassigning is less expensive (less processing).
About memory:
Using the same variable may use less memory, which is a small optimization for RAM.
Memory is cheap. You should be more careful about the processing power you're using. In cases like this, try a benchmark.
PHP's garbage collection is a bit weird and when overriding a variable it isn't (always?) flushed from memory. So memory wise, assign a new variable and use unset() on the old one. You don't really want 30Mb of array wasting away in your memory, especially if this is a script that runs for any length of time or if you ever have multiple of it running at once.
I don't know about processing performance. As Nafis said, try a benchmark. (Simplest and dumbest way, make a script that runs it 1000 times and see which is faster)
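For the memory side, a quick way to see what actually happens is to print memory_get_usage() around each step. This is only a sketch: someFunction() below is a made-up stand-in for $this->someFunction(), and the exact byte counts depend on the PHP version.

```php
<?php
// Hypothetical stand-in for $this->someFunction(): returns a new array.
function someFunction(array $items): array
{
    return array_map(function ($x) {
        return $x * 2;
    }, $items);
}

$someBigArray = range(1, 100000);

$before = memory_get_usage();
$var1 = $someBigArray;            // shares storage with $someBigArray (copy-on-write)
$afterAssign = memory_get_usage();

$var1 = someFunction($var1);      // $var1 now holds the new array
$afterReassign = memory_get_usage();

unset($var1);                     // frees the new array; $someBigArray is still alive
$afterUnset = memory_get_usage();

printf("assign:   %+d bytes\n", $afterAssign - $before);
printf("reassign: %+d bytes\n", $afterReassign - $afterAssign);
printf("unset:    %+d bytes\n", $afterUnset - $afterReassign);
```

In this setup the plain assignment should cost close to nothing, and the memory for the new array is held for as long as any variable refers to it, whichever name that variable has.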

PHP | $value = $anothervalue = getValue() - does it have negative influence on performance?

This question is about code optimization: What is better for performance and why (the first example is cleaner for human being->programmer->me)?
$value = $anothervalue = getValue();
or
$anothervalue = getValue();
$value = $anothervalue;
This has nothing to do with real performance issues.
A performance improvement is when you replace 100 SQL queries with 1 and reduce page generation time from 1 second to 0.0001s.
As long as you cannot measure the difference between the 2 cases (can you?), use the one that is more readable and easier to maintain.
$value = $anothervalue = getValue();
I'd guess is probably the most efficient, and it looks much nicer, too. However! An optimization like this should not matter in terms of execution time at all, so feel free to use whichever is clearer to you.
It should have no influence on performance; they perform the same operation IMO, although I prefer the latter for readability.
What you're talking about is a micro-optimization. There is absolutely no benefit to trying to determine which is faster, because even if one of them is (which I honestly doubt) then the difference will be so small as to make no practical difference in real life.
If you absolutely must find out one way or the other, then you could benchmark it. Run a loop that does the operation in the $a = $b = func() style, then run the same loop using the $a = func(); $b = $a style instead.
As the difference is probably nearly non-existent, you'll need a very big loop, at least 100,000 iterations.
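A minimal sketch of such a benchmark could look like this. The body of getValue() is a made-up placeholder, and the numbers will be noisy, so run it a few times before reading anything into them.

```php
<?php
// Hypothetical stand-in for the question's getValue().
function getValue(): int
{
    return 42;
}

$iterations = 100000;

// Style 1: chained assignment.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $value = $anothervalue = getValue();
}
$chained = microtime(true) - $start;

// Style 2: two separate statements.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $anothervalue = getValue();
    $value = $anothervalue;
}
$separate = microtime(true) - $start;

printf("chained: %.5f s, separate: %.5f s\n", $chained, $separate);
```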

Exploding an array within a foreach loop parameter

foreach(explode(',', $foo) as $bar) { ... }
vs
$test = explode(',', $foo);
foreach($test as $bar) { ... }
In the first example, does it explode the $foo string for each iteration or does PHP keep it in memory exploded in its own temporary variable? From an efficiency point of view, does it make sense to create the extra variable $test or are both pretty much equal?
I could make an educated guess, but let's try it out!
I figured there were three main ways to approach this.
1) explode and assign before entering the loop
2) explode within the loop, no assignment
3) string tokenize
My hypotheses:
1) probably consume more memory due to assignment
2) probably identical to #1 or #3, not sure which
3) probably both quicker and much smaller memory footprint
Approach
Here's my test script:
<?php
ini_set('memory_limit', '1024M');
$listStr = 'text';
$listStr .= str_repeat(',text', 9999999);
$timeStart = microtime(true);
/*****
* {INSERT LOOP HERE}
*/
$timeEnd = microtime(true);
$timeElapsed = $timeEnd - $timeStart;
printf("Memory used: %s kB\n", memory_get_peak_usage()/1024);
printf("Total time: %s s\n", $timeElapsed);
And here are the three versions:
1)
// explode separately
$arr = explode(',', $listStr);
foreach ($arr as $val) {}
2)
// explode inline-ly
foreach (explode(',', $listStr) as $val) {}
3)
// tokenize
// fetch the first token, then advance inside the loop so it is not skipped
$tok = strtok($listStr, ',');
while ($tok !== false) {
$tok = strtok(',');
}
Results
Conclusions
Looks like some assumptions were disproven. Don't you love science? :-)
In the big picture, any of these methods is sufficiently fast for a list of "reasonable size" (few hundred or few thousand).
If you're iterating over something huge, time difference is relatively minor but memory usage could be different by an order of magnitude!
When you explode() inline without pre-assignment, it's a fair bit slower for some reason.
Surprisingly, tokenizing is a bit slower than explicitly iterating a declared array. Working on such a small scale, I believe that's due to the call stack overhead of making a function call to strtok() every iteration. More on this below.
In terms of number of function calls, explode()ing really tops tokenizing. O(1) vs O(n)
I added a bonus to the chart where I run method 1) with a function call in the loop. I used strlen($val), thinking it would be a relatively similar execution time. That's subject to debate, but I was only trying to make a general point. (I only ran strlen($val) and ignored its output. I did not assign it to anything, for an assignment would be an additional time-cost.)
// explode separately
$arr = explode(',', $listStr);
foreach ($arr as $val) {strlen($val);}
As you can see from the results table, it then becomes the slowest method of the three.
Final thought
This is interesting to know, but my suggestion is to do whatever you feel is most readable/maintainable. Only if you're really dealing with a significantly large dataset should you be worried about these micro-optimizations.
In the first case, PHP explodes it once and keeps it in memory.
The impact of creating a different variable or the other way would be negligible. PHP Interpreter would need to maintain a pointer to a location of next item whether they are user defined or not.
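To confirm that the expression in the foreach header runs only once, you can wrap explode() in a function that counts its calls. countedExplode() is a made-up helper for this demonstration only.

```php
<?php
// Hypothetical wrapper around explode() that counts how often it is called.
function countedExplode(string $delimiter, string $input, int &$calls): array
{
    $calls++;
    return explode($delimiter, $input);
}

$foo = 'a,b,c';
$calls = 0;
$seen = [];

// The call in the foreach header is evaluated once, before the first iteration.
foreach (countedExplode(',', $foo, $calls) as $bar) {
    $seen[] = $bar;
}

echo count($seen), " iterations, ", $calls, " call(s) to explode\n";
```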
From the point of memory it will not make a difference, because PHP uses the copy on write concept.
Apart from that, I personally would opt for the first option - it's a line less, but not less readable (imho!).
Efficiency in what sense? Memory management, or processor? Processor wouldn't make a difference, for memory - you can always do $foo = explode(',', $foo)

Is it better call a function every time or store that value in a new variable?

I often use the function sizeof($var) in my web application, and I'd like to know whether it is better (in terms of resources) to store this value in a new variable and use that, or to call the function every time; or maybe it makes no difference :)
TLDR: it's better to set a variable, calling sizeof() only once. (IMO)
I ran some tests on the looping aspect of this small array:
$myArray = array("bill", "dave", "alex", "tom", "fred", "smith", "etc", "etc", "etc");
// A)
for($i=0; $i<10000; $i++) {
echo sizeof($myArray);
}
// B)
$sizeof = sizeof($myArray);
for($i=0; $i<10000; $i++) {
echo $sizeof;
}
With an array of 9 items:
A) took 0.0085 seconds
B) took 0.0049 seconds
With an array of 180 items:
A) took 0.0078 seconds
B) took 0.0043 seconds
With an array of 3600 items:
A) took 0.5-0.6 seconds
B) took 0.35-0.5 seconds
Although there isn't much of a difference, you can see that the gap widens as the array grows. This has made me rethink my opinion, and from now on I'll be setting the variable before the loop.
Storing a PHP integer takes 68 bytes of memory. This is a small enough amount, that I think I'd rather worry about processing time than memory space.
In general, it is preferable to assign the result of a function you are likely to repeat to a variable.
In the example you suggested, the difference in processing code produced by this approach and the alternative (repeatedly calling the function) would be insignificant. However, where the function in question is more complex it would be better to avoid executing it repeatedly.
For example:
for($i=0; $i<10000; $i++) {
echo date('Y-m-d');
}
Executes in 0.225273 seconds on my server, while:
$date = date('Y-m-d');
for($i=0; $i<10000; $i++) {
echo $date;
}
executes in 0.134742 seconds. I know these snippets aren't quite equivalent, but you get the idea. Over many page loads by many users over many months or years, even a difference of this size can be significant. If we were to use some complex function, serious scalability issues could be introduced.
A main advantage of not assigning a return value to a variable is that you need one less line of code. In PHP, we can commonly do our assignment at the same time as invoking our function:
$sql = "SELECT...";
if(!$query = mysql_query($sql))...
...although this is sometimes discouraged for readability reasons.
In my view for the sake of consistency assigning return values to variables is broadly the better approach, even when performing simple functions.
If you are calling the function over and over, it is probably best to keep this info in a variable. That way the server doesn't have to keep processing the answer, it just looks it up. If the result is likely to change, however, it will be best to keep running the function.
Since you allocate a new variable, this will take a tiny bit more memory. But it might make your code a tiny bit faster.
The trouble it brings could be big. For example, if you include another file that applies the same trick, and both store the size in a variable called $sizeof, bad things might happen: strange bugs that appear when you don't expect them. Or you forget to add global $sizeof in your function.
There are so many possible bugs you can introduce, and for what? Since the speed gain is likely not measurable, I don't think it's worth it.
Unless you are calling this function a million times your "performance boost" will be negligible.
I do not think that it really matters. In a sense, you do not want to perform the same thing over and over again, but considering that it is sizeof(), unless it is an enormous array you should be fine either way.
I think, you should avoid constructs like:
for ($i = 0; $i < sizeof($array); $i += 1) {
// do stuff
}
Here, sizeof() will be executed on every iteration, even though its result is unlikely to change.
Whereas in constructs like this:
while(sizeof($array) > 0) {
if ($someCondition) {
$entry = array_pop($array);
}
}
You often have no choice but to calculate it every iteration.
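For the first kind of loop, one common way to avoid the repeated call is to compute the size once in the for initializer. This is a small sketch with made-up data; it uses count(), of which sizeof() is an alias.

```php
<?php
$array = [3, 1, 4, 1, 5, 9]; // made-up data

$sum = 0;
// $n is computed once in the initializer, not on every check of the loop condition.
for ($i = 0, $n = count($array); $i < $n; $i++) {
    $sum += $array[$i];
}

echo $sum, "\n";
```

This keeps the cached size scoped to the loop header, which also reduces the risk of clashing with a $sizeof variable defined elsewhere.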
