In PHP, which operator should I use: '<' or '<='?

I am using the following code:
<?php
$start_time = microtime(true);
for ($i = 1; $i <= 999999; $i++) {
}
$end_time = microtime(true);
echo "Time Interval : " . ($end_time - $start_time); // parentheses so the subtraction happens before concatenation
$start_time = microtime(true);
for ($j = 1; $j < 1000000; $j++) {
}
$end_time = microtime(true);
echo "Time Interval : " . ($end_time - $start_time);
?>
It shows me a time difference between them of 0.0943 seconds, so by that measure the <= operator is faster than '<'. I just want to know: is there any disadvantage of using the <= operator over '<'?

There is certainly no performance penalty. Both operators compile to a single opcode in PHP bytecode, and ultimately both should be executed as exactly one CPU instruction.
http://www.php.net/manual/en/internals2.opcodes.is-smaller.php
http://www.php.net/manual/en/internals2.opcodes.is-smaller-or-equal.php
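If you want to check this on your own build, one way is the third-party VLD extension (assuming you have it installed; the flags below come from VLD, not from PHP itself). A sketch:
<?php
// compare.php : two loops that differ only in the comparison operator
for ($i = 1; $i <= 999999; $i++) {}  // should compile to IS_SMALLER_OR_EQUAL
for ($j = 1; $j < 1000000; $j++) {}  // should compile to IS_SMALLER
// Dump the compiled opcodes without executing the script:
//   php -d vld.active=1 -d vld.execute=0 compare.php
// Each comparison shows up as a single IS_SMALLER or IS_SMALLER_OR_EQUAL opcode.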
And of course I agree with the comments: you should never optimize at this low a level of your code.

http://en.wikipedia.org/wiki/Program_optimization
"Premature optimization" is a phrase used to describe a situation
where a programmer lets performance considerations affect the design
of a piece of code. This can result in a design that is not as clean
as it could have been or code that is incorrect, because the code is
complicated by the optimization and the programmer is distracted by
optimizing.
Usually I'd stick with meaningful code over faster code that's less meaningful. That loop guard will probably be a value coming from a variable defined elsewhere, so you should keep its meaning. You shouldn't change all your < to <= to save a fraction of time (which may well be random measurement error).
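For what it's worth, a fairer way to run the comparison above is to make both loops do the same number of iterations and repeat each measurement. A minimal sketch (the bench() helper and the repeat count are my own illustration, not anything from this thread):
<?php
// Repeat each timing several times and keep the best run, so a one-off
// OS hiccup doesn't decide the "winner".
function bench(callable $fn, int $repeats = 5): float {
    $best = INF;
    for ($r = 0; $r < $repeats; $r++) {
        $t = microtime(true);
        $fn();
        $best = min($best, microtime(true) - $t);
    }
    return $best;
}

// Both loops do exactly 999999 iterations.
$le = bench(function () { for ($i = 1; $i <= 999999; $i++) {} });
$lt = bench(function () { for ($i = 1; $i < 1000000; $i++) {} });
printf("<= : %.4fs   < : %.4fs\n", $le, $lt);
// Run it a few times; the gap between the two numbers will typically
// jump around as much as the numbers themselves, i.e. it is noise.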

Related

preg_match is faster than strpos on large text?

I am currently updating a very old script, written for PHP 5.2.17, to PHP 8.1.2. There are a lot of text-processing code blocks, and almost all of them use preg_match/preg_match_all. I used to know that strpos has always been faster than preg_match for string matching, but I decided to check one more time.
Code was:
$c = file_get_contents('readme-redist-bins.txt');
$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    strpos($c, '[SOMEMACRO]');
}
$el = microtime(true) - $start;
echo $el; // exit($el) would use the float as an exit status instead of printing it
and
$c = file_get_contents('readme-redist-bins.txt');
$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    preg_match_all("/\[([a-z0-9-]{0,100})".'[SOMEMACRO]'."/", $c, $pma);
}
$el = microtime(true) - $start;
echo $el;
I took the readme-redist-bins.txt file which comes with the PHP 8.1.2 distribution; it is about 30KB.
Results(preg_match_all):
PHP_8.1.2: 1.2461s
PHP_5.2.17: 11.0701s
Results(strpos):
PHP_8.1.2: 9.97s
PHP_5.2.17: 0.65s
Double checked... Tried Windows and Linux PHP builds, on two machines.
Tried the same code with a small file (200B).
Results(preg_match_all):
PHP_8.1.2: 0.0867s
PHP_5.2.17: 0.6097s
Results(strpos):
PHP_8.1.2: 0.0358s
PHP_5.2.17: 0.2484s
And now the timings look OK.
So how can it be that preg_match is so much faster on large text? Any ideas?
PS: Tried PHP 7.2.10 - same result.
PCRE2 is really fast. It's so fast that there is usually barely any difference between it and plain string processing in PHP, and sometimes it's even faster. PCRE2 internally uses a JIT and contains a lot of optimizations. It's really good at what it does.
On the other hand, strpos is poorly optimized. It does a simple byte comparison in C and doesn't use parallelization/vectorization. For short needles and short haystacks it uses memchr, but for longer values it performs the Sunday algorithm.
For small inputs, the overhead of calling into PCRE2 will probably outweigh its optimizations, but for larger strings, or for case-insensitive/Unicode matching, PCRE2 might offer better performance.
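To see the literal-needle case in isolation, you can hand the same needle to both functions; preg_quote() builds a pattern equivalent to the plain substring search. A rough sketch along the lines of the question's test (same file name and iteration count as above):
<?php
// Compare a plain substring search with the equivalent PCRE call.
$c = file_get_contents('readme-redist-bins.txt');
$needle = '[SOMEMACRO]';
$pattern = '/' . preg_quote($needle, '/') . '/';

$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    strpos($c, $needle);
}
printf("strpos:     %.4fs\n", microtime(true) - $start);

$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    preg_match($pattern, $c);
}
printf("preg_match: %.4fs\n", microtime(true) - $start);
// On a large haystack, PCRE2's JIT can make the preg_match loop
// competitive with, or faster than, the strpos loop.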

PHP: Which is faster - array_sum or a foreach?

I'm trying to work out which method would be faster (if either would be?). I've designed a test here: http://codepad.org/odyUN0xg
When I run that test my results are very inconsistent. Both vary wildly.
Is there a problem with the test?
Otherwise, can anyone suggest which one would or wouldn't be faster?
Edit
Okay guys, thanks. I've edited the codepad here: http://codepad.org/n1Xrt98J
With all the comments and discussion I've decided to go with array_sum. As soon as I used that microtime(true) thing, array_sum started to look a lot faster.
Cheers for the advice. Oh, and I've added a "for" loop so that results are more even, but as noted in the results there is little time saving, if any, over a foreach.
The problem is that you took a very low limit, 1000. The overhead of the PHP interpreter is MUCH larger than the differences you are trying to measure. I would take 100000000 or something like that.
However, I think array_sum is faster since it's more specialized and probably implemented in fast C.
Oh, and as Michael McTiernan said, you must change every instance of microtime() to microtime(true). http://php.net/manual/en/function.microtime.php
And finally, I wouldn't use codepad as a testing environment since you have no control over it. You have no idea what happens and whether your process is paused or not.
To be honest, there's little value in using an artificial test and either way this sounds like fairly pointless micro-optimisation unless you've specifically identified this as a problem area after profiling the necessary code.
As such, it probably makes sense to use which ever feels more appropriate. I'd personally plump for array_sum - that's what it's there for after all.
Change any instance of microtime() to microtime(true).
Also, after testing this, the results aren't that wildly different.
$s = microtime();
A call to microtime() with no arguments will return a string like this: 0.35250000 1300737802. You probably want this:
$s = microtime(TRUE);
array sum took 0.125188 seconds
sum numbers took 0.166603 seconds
These kinds of tests need to be run a few thousand times, so you get execution times large enough not to be affected by tiny external factors.
You need much bigger runs, and need to average several of them. You should also separate the two tests into two files. The server has other things going on that will disturb test timing, hence the need to average many runs.
array_sum() should be the faster of the two as there is no extra script parsing associated with it, but it's worth checking.
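A minimal sketch of that advice (the run count of 10 is an arbitrary choice for illustration):
<?php
// Run one candidate many times and report the average, so a single
// noisy run doesn't dominate. Repeat in a second script with the
// foreach version and compare the averages.
$array = range(1, 10000);
$runs = 10;
$total = 0.0;
for ($r = 0; $r < $runs; $r++) {
    $t = microtime(true);
    $sum = array_sum($array);
    $total += microtime(true) - $t;
}
printf("array_sum: average %.6fs over %d runs (result %d)\n", $total / $runs, $runs, $sum);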
Almost always array_sum is faster. The timings depend more on your server's php.ini configuration than on the actual choice between array_sum and foreach.
Coming to the point of this question, for a sample setup like the following:
<?php
set_time_limit(0);
$s = microtime(TRUE);
$array = range(1, 10000);
$sum = 0;
for ($j = 0; $j < 1000; $j++) {
    $sum += array_sum($array);
}
$s1 = microtime(TRUE);
$diff = $s1 - $s;
echo "for 1000 pass, array_sum took {$diff} seconds. Result = {$sum}<br/>";
$sum = 0;
$s2 = microtime(TRUE);
for ($j = 0; $j < 1000; $j++) {
    foreach ($array as $val) {
        $sum += $val;
    }
}
$s3 = microtime(TRUE);
$diff = $s3 - $s2;
echo "for 1000 pass, foreach took {$diff} seconds. Result = {$sum}<br/>";
I got results where foreach was always slower. So that should answer your question. Sample:
for 1000 pass, array_sum took 0.2720000743866 seconds. Result = 50005000000
for 1000 pass, foreach took 1.7239999771118 seconds. Result = 50005000000

Comparing execution times in PHP

I would like to compare different PHP code to know which one would be executed faster. I am currently using the following code:
<?php
$load_time_1 = 0;
$load_time_2 = 0;
$load_time_3 = 0;
for ($x = 1; $x <= 20000; $x++) {
    //code 1
    $start_time = microtime(true);
    $i = 1;
    $i++;
    $load_time_1 += (microtime(true) - $start_time);
    //code 2
    $start_time = microtime(true);
    $i = 1;
    $i++;
    $load_time_2 += (microtime(true) - $start_time);
    //code 3
    $start_time = microtime(true);
    $i = 1;
    $i++;
    $load_time_3 += (microtime(true) - $start_time);
}
echo $load_time_1;
echo '<br />';
echo $load_time_2;
echo '<br />';
echo $load_time_3;
?>
I have executed the script several times.
The first result is
0.44057559967041
0.43392467498779
0.43600964546204
The second result is
0.50447297096252
0.48595094680786
0.49943733215332
The third result is
0.5283739566803
0.55247902870178
0.55091571807861
The results look okay, but the problem is that every time I execute this code the result is different. Also, I am comparing the same code three times, on the same machine.
Why would there be a difference in speed between runs? And is there a way to compare execution times and see the real difference?
There is a thing called observational error.
As long as your numbers do not exceed it, all your measurements are just a waste of time.
The only proper way of doing measurements is called profiling, and it means measuring significant parts of the code, not senseless ones.
"Why would there be a difference in speed while comparing?"
There are two reasons for this, both related to how things that are out of your control are handled by PHP and the operating system.
Firstly, the processor can only do a certain number of operations at any given time. The operating system is responsible for multitasking: dividing the available cycles among your applications. Since these cycles aren't granted at a constant rate, small speed variations are to be expected even with identical PHP commands, because of how processor cycles are allocated.
Secondly, a bigger cause of time variation is the background operations of PHP. There are many things that are completely hidden from the user, like memory allocation, garbage collection, and the handling of various namespaces for variables and the like. These operations also take processor cycles and can run at unexpected times during your script. If garbage collection is performed during the first incrementation but not during the second, the first operation takes longer than the second. Sometimes, because of garbage collection, the order in which the tests are performed can also affect the execution time.
Speed testing can be a bit tricky, because unrelated factors (like other applications running in the background) can skew the results of your test. Generally, small speed differences between scripts are hard to tell apart, but when a speed test is run enough times, the real results can be seen. For example, if one script is consistently faster than another, that usually points to the script being more efficient in terms of processing speed.
The reason the results vary is that there are other things going on at the same time, such as Windows or Linux background tasks and other processes. You will never get an exact result; you are best off running the code over 100 iterations and then dividing the result to find the average time taken, and using that as your figure.
Also, it would be beneficial to create a class that can handle this for you, so you can reuse it without having to rewrite the timing code every time.
Try something like this (untested):
class CodeBench
{
    private $benches = array();

    public function __construct(){}

    public function begin($name)
    {
        if (!isset($this->benches[$name])) {
            $this->benches[$name] = array();
        }
        $this->benches[$name]['start'] = array(
            'microtime' => microtime(true)
            /* Other information */
        );
    }

    public function end($name)
    {
        if (!isset($this->benches[$name])) {
            throw new Exception("You must first declare a benchmark for " . $name);
        }
        $this->benches[$name]['end'] = array(
            'microtime' => microtime(true) // true here as well, so both stamps are floats
            /* Other information */
        );
    }

    public function calculate($name)
    {
        if (!isset($this->benches[$name])) {
            throw new Exception("You must first declare a benchmark for " . $name);
        }
        if (!isset($this->benches[$name]['end'])) {
            throw new Exception("You must first call an end call for " . $name);
        }
        // Subtract the stored timestamps, not the arrays that hold them
        return ($this->benches[$name]['end']['microtime'] - $this->benches[$name]['start']['microtime']) . ' s';
    }
}
And then use it like so:
$CB = new CodeBench();
$CB->begin("bench_1"); // the method is begin(), not start()
//Do work:
$CB->end("bench_1");
$CB->begin("bench_2");
//Do work:
$CB->end("bench_2");
echo "First benchmark had taken: " . $CB->calculate("bench_1");
echo "Second benchmark had taken: " . $CB->calculate("bench_2");
Computing speeds are never 100% set in stone. PHP is a server-side script, and thus depending on the computing power available to the server, it can take a varying amount of time.
Note that each of the three blocks resets $start_time just before it runs, so none of them is structurally favored; you should not expect load time 3 to be consistently greater than 2, or 2 greater than 1.

Is wrapping code into a function that doesn't need to be, bad in PHP?

Many, many times on a page I will have to read POST and GET values in PHP like this.
I just want to know if it is better to keep doing it inline, or if performance would be untouched by moving it into a function, like in the code below.
That would make code much easier to write, but at the expense of extra function calls on the page.
I have all the time in the world, so making the code as fast as possible is more important to me than making it "easier to write or faster to develop".
I'd appreciate any advice, and please, nothing about whichever makes it easier to develop; I am talking pure performance here =)
<?php
function arg_p($name, $default = null) {
    return (isset($_GET[$name])) ? $_GET[$name] : $default;
}
$pagesize = arg_p('pagesize', 10);                               // via the function
$pagesize = (isset($_GET['pagesize'])) ? $_GET['pagesize'] : 10; // inline
?>
If you have all the time in the world, why don't you just test it?
<?php
// How many iterations?
$iterations = 100000;

// Inline
$timer_start = microtime(TRUE);
for ($i = 0; $i < $iterations; $i++) {
    $pagesize = (isset($_GET['pagesize'])) ? $_GET['pagesize'] : 10;
}
$time_spent = microtime(TRUE) - $timer_start;
printf("Inline: %.3fs\n", $time_spent);

// By function call
function arg_p($name, $default = null) {
    return (isset($_GET[$name])) ? $_GET[$name] : $default;
}
$timer_start = microtime(TRUE);
for ($i = 0; $i < $iterations; $i++) {
    $pagesize = arg_p('pagesize', 10);
}
$time_spent = microtime(TRUE) - $timer_start;
printf("By function call: %.3fs\n", $time_spent);
?>
On my machine, this gives pretty clear results in favor of inline execution, by a factor of almost 10. But you need a lot of iterations to really notice it.
(I would still use a function though, even if me answering this shows that I have time to waste ;)
Sure, you'll probably get a performance benefit from not wrapping it in a function. But would it be noticeable? Not really.
Your time is worth more than the small amount of CPU resources you'd save.
I doubt the difference in speed would be noticeable unless you are doing it many hundreds of times.
A function call is a performance hit, but you should also think about maintainability: wrapping the lookup in a function eases future changes (and copy-paste is bad for that).
Whilst performance wouldn't really be affected, the more code you keep out of the HTML stream the better.
Even with a thousand calls to your arg_p() you wouldn't be able to measure, let alone notice, the difference in performance. The time you will spend typing the extra "inline" code, plus the time you will spend whenever you have to duplicate a change to every inlined copy, plus the added complexity and higher probability of a typo, will cost you more than the unmeasurable performance improvement. In fact, that time could be spent on optimizing what really counts, such as improving your database design and profiling your code to find the areas that really affect generation time.
You'll be better off keeping your code clean. It will save you time that you can in turn invest into optimizing what really counts.

PHP function efficiencies

Is there a table of how much "work" it takes to execute a given function in PHP? I'm not a compsci major, so I don't have the formal background to know things like "oh yeah, strings take longer to work with than integers". Are all steps/lines in a program created equal? I just don't even know where to start researching this.
I'm currently doing some Project Euler questions where I'm very sure my answer will work, but my local Apache server times the request out at a minute (and PE says that all problems can be solved in under a minute). I don't know how or where to start optimizing, so knowing more about PHP and how it uses memory would be useful. For what it's worth, here's my code for question 206:
<?php
$start = time();
for ($i = 1010374999; $i < 1421374999; $i++) {
    $a = number_format(pow($i, 2), 0, ".", "");
    $c = preg_split('//', $a, -1, PREG_SPLIT_NO_EMPTY);
    if ($c[0] == 1) {
        if ($c[2] == 2) {
            if ($c[4] == 3) {
                if ($c[6] == 4) {
                    if ($c[8] == 5) {
                        if ($c[10] == 6) {
                            if ($c[12] == 7) {
                                if ($c[14] == 8) {
                                    if ($c[16] == 9) {
                                        if ($c[18] == 0) {
                                            echo $i;
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
$end = time();
$elapsed = ($end - $start);
echo "<br />The time to calculate was $elapsed seconds";
?>
If this is a wiki question about optimization, just let me know and I'll move it. Again, I'm not looking for an answer, just help on where to learn about being efficient in my coding (although cursory hints wouldn't be flat-out rejected, and I realize there are probably more elegant mathematical ways to set up the problem).
There's no table that will tell you how long each PHP function takes to execute, since the time of execution varies wildly depending on the input.
Take a look at what your code is doing. You've created a loop that's going to run 411,000,000 times. Given that the code needs to complete in less than 60 seconds, each trip through the loop would have to take less than (approximately) .000000145 seconds. That's unreasonable, and no amount of picking the "right" function will fix it. Try your loop with nothing in it:
for ($i=1010374999; $i < 1421374999; $i++) {
}
Unless you have access to science fiction computers, this probably isn't going to complete execution in less than 60 seconds. So you know this approach will never work.
This is known as a brute force solution to a problem. The point of Project Euler is to get you thinking creatively, both from a math and a programming point of view, about problems. You want to reduce the number of trips you need to take through that loop; the obvious solution will never be the answer here.
I don't want to tell you the solution, because the point of these things is to think your way through it and become a better algorithm programmer. Examine the problem, think about its restrictions, and think about ways to reduce the total number of numbers you'd need to check.
A good tool for taking a look at execution times in your code is Xdebug's profiler: http://xdebug.org/docs/profiler
It's an installable PHP extension which can be configured to output a complete breakdown of function calls and execution times for your script. Using this, you'll be able to see what in your code takes longest to execute and try some different approaches.
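For example, with a recent Xdebug (3.x) the profiler can be switched on for a single run from the command line; the settings below are Xdebug's own (check its docs for your version):
// Enable the profiler for one run and write the output to /tmp:
//   php -d xdebug.mode=profile -d xdebug.output_dir=/tmp your_script.php
// The run writes a cachegrind.out.* file, which tools like
// KCachegrind/QCacheGrind or Webgrind can visualize.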
EDIT: now that I'm actually looking at your code, you're running 400 million+ regex calls! I don't know anything about Project Euler, but I have a hard time believing this code can be executed in under a minute on commodity hardware.
preg_split is likely to be slow because it uses a regex. Is there not a better way to do that line?
Hint: You can access chars in a string like this:
$str = 'This is a test.';
echo $str[0];
Try switching preg_split() to explode() or str_split(), which are faster.
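For instance, using the variables from the question's code (a sketch, untested against the original script):
$a = number_format(pow($i, 2), 0, ".", "");
$c = str_split($a); // same digit array as preg_split('//', $a, -1, PREG_SPLIT_NO_EMPTY), without a regex
// or skip the array entirely and index the string directly: $a[0], $a[2], ...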
First, here's a slightly cleaner version of your function, with debug output:
<?php
$start = time();
$min = (int)floor(sqrt(1020304050607080900));
$max = (int)ceil(sqrt(1929394959697989990));
for ($i = $min; $i < $max; $i++) {
    $c = (string)($i * $i); // cast to string so the individual digits can be indexed below
    echo $i, ' => ', $c, "\n"; // debug output; remove before a full run
    if ($c[0] == 1
        && $c[2] == 2
        && $c[4] == 3
        && $c[6] == 4
        && $c[8] == 5
        && $c[10] == 6
        && $c[12] == 7
        && $c[14] == 8
        && $c[16] == 9
        && $c[18] == 0)
    {
        echo $i;
        break;
    }
}
$end = time();
$elapsed = ($end - $start);
echo "<br />The time to calculate was $elapsed seconds";
And here are the first 10 lines of output:
1010101010 => 1020304050403020100
1010101011 => 1020304052423222121
1010101012 => 1020304054443424144
1010101013 => 1020304056463626169
1010101014 => 1020304058483828196
1010101015 => 1020304060504030225
1010101016 => 1020304062524232256
1010101017 => 1020304064544434289
1010101018 => 1020304066564636324
1010101019 => 1020304068584838361
That, right there, seems like it oughta inspire a possible optimization of your algorithm. Note that we're not even close as of the 6th entry (1020304060504030225): we've got a 6 in a position where we need a 5!
In fact, many of the next entries will be worthless, until we're back at a point where we have a 5 in that position. Why bother calculating the intervening values? If we can figure out how, we should jump ahead to 1010101060, where that digit becomes a 5 again... If we can keep skipping dozens of iterations at a time like this, we'll save well over 90% of our run time!
Note that this may not be a practical approach at all (in fact, I'm fairly confident it's not), but this is the way you should be thinking. What mathematical tricks can you use to reduce the number of iterations you execute?
