I am working on a website and I am trying to make it as fast as possible - especially the small things that can make my site a little bit quicker.
So, to my question - I have a loop that runs 5 times and each time it echoes something. If I make a variable, have the loop append the text I want to echo to that variable, and only echo the variable at the end - will it be faster?
loop 1 (with the echo inside the loop)
for ($i = 0; $i < 5; $i++)
{
echo "test";
}
loop 2 (with the echo outside [when the loop finishes])
$echostr = "";
for ($i = 0; $i < 5; $i++)
{
$echostr .= "test";
}
echo $echostr;
I know that loop 2 will increase the file size a bit and therefore the user will have to download more bytes, but if I have a huge loop, will it be better to use the second loop or not?
Thanks.
The difference is negligible. Do whatever is more readable (which in this case is definitely the first option). The first approach is not a "naive" approach, so there will be no major performance difference (it may actually be faster; I'm not sure), and it will also use less memory. Also, in many languages (I'm not sure about PHP), appending to strings is expensive, and therefore so is concatenation (because you have to seek to the end of the string, reallocate memory, etc.).
Moreover, file size does not matter because PHP is entirely server-side -- the user never has to download your script (in fact, it would be scary if they did/could). These types of things may matter in Javascript but not in PHP.
Long story short -- don't write code constantly trying to make micro-optimizations like this. Write the code in the style that is most readable and idiomatic, test to see if performance is good, and if performance is bad then profile and rewrite the sections that perform poorly.
I'll end on a quote:
"premature emphasis on efficiency is a big mistake which may well be the source of most programming complexity and grief."
- Donald Knuth
This is a classic case of premature optimization. The performance difference is negligible.
I'd say that in general you're better off constructing a string and echoing it at the end, but because it leads to cleaner code (side effects are bad, mkay?) not because of performance.
If you optimize like this, from the ground up, you're at risk of obfuscating your code for no perceptible benefit. If you really want your script to be as fast as possible then profile it to find out where the real bottlenecks are.
Someone else mentioned that using string concatenation instead of an immediate echo will use more memory. This isn't true unless the size of the string exceeds the size of the output buffer. In any case, to actually echo immediately you'd need to call flush() (perhaps preceded by ob_flush()), which adds the overhead of a function call*. The web server may still keep its own buffer, which would thwart this anyway.
If you're spending a lot of time on each iteration of the loop then it may make sense to echo and flush early so the user isn't kept waiting for the next iteration, but that would be an entirely different question.
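A rough sketch of that pattern, assuming output buffering isn't otherwise interfering (do_slow_work() here is a hypothetical long-running step, not something from the question):
for ($i = 0; $i < 5; $i++) {
    do_slow_work($i);          // hypothetical expensive step
    echo "finished chunk $i<br>\n";
    if (ob_get_level() > 0) {
        ob_flush();            // flush PHP's own output buffer, if one is active
    }
    flush();                   // ask PHP / the web server to push output to the client
}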
Also, the size of the PHP file has no effect on the user - it may take marginally longer to parse but that would be negated by using an opcode cache like APC anyway.
To sum up, while it may be marginally faster to echo each iteration, depending on circumstance, it makes the code harder to maintain (think Wordpress) and it's most likely that your time for optimization would be better spent elsewhere.
* If you're genuinely worried about this level of optimization then a function call isn't to be sniffed at. Flushing in pieces also implies extra protocol overhead.
The size of your PHP file does not increase the size of the download by the user. The output of the PHP file is all that matters to the user.
Generally, you want to do the first option: echo as soon as you have the data. Assuming you are not using output buffering, this means that the user can stream the data while your PHP script is still executing.
The user does not download the PHP file, but only its output, so the second loop has no effect on the user's download size.
It's best not to worry about small optimizations, but instead focus on quickly delivering working software. However, if you want to improve the performance of your site, Yahoo! has done some excellent research: developer.yahoo.com/performance/rules.html
The code you identify as "loop 2" wouldn't result in a larger download for users. String concatenation is faster than repeatedly calling echo, so I'd go with loop 2, but for only 5 iterations of a loop I don't think it really matters all that much.
Overall, there are a lot of other areas to focus on such as compiling PHP instead of running it as a scripted language.
http://phplens.com/lens/php-book/optimizing-debugging-php.php
Your first example would, in theory, be fastest. Because your provided code is so extremely simplistic, I doubt any performance increase over your second example would be noticed or even useful.
In your first example the only variable PHP needs to initialize and utilize is $i.
In your second example PHP must first create an empty string variable. Then create the loop and its variable, $i. Then append the text to $echostr and then finally echo $echostr.
Related
Suppose I have a string... 'iAmSomeRandomString'. Let's say I choose to store this string in two ways:
$stringVar = 'iAmSomeRandomString';
$stringArr = array('rando' => 'iAmSomeRandomString');
If I later would like to access that string, is using $stringVar or $stringArr['rando'] more performant?
My understanding is that there is no performance difference, but I was asked this in an interview and told I was wrong. So now I want to know what's up.
Is there a relevant performance difference? No.
https://3v4l.org/snDII
...
Output for 7.0.4
StartiAmSomeRandomStringiAmSomeRandomString
First case: 0.000003
Second case: 0.000005
Output for 7.0.3
StartiAmSomeRandomStringiAmSomeRandomString
First case: 0.000023
Second case: 0.000015
...
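The code behind the link isn't reproduced here, but a minimal benchmark in the same spirit (warming up the output first, as explained below) might look like this:
echo "Start";                                    // produce some output before timing anything

$stringVar = 'iAmSomeRandomString';
$stringArr = array('rando' => 'iAmSomeRandomString');

$t = microtime(true);
echo $stringVar;
$first = microtime(true) - $t;

$t = microtime(true);
echo $stringArr['rando'];
$second = microtime(true) - $t;

printf("\nFirst case: %f\nSecond case: %f\n", $first, $second);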
Note that the numbers vary between versions, but they depend heavily on the machine they are run on. You probably cannot compare two versions.
Also note that I started the text output before measuring - this is because without it, you would measure the performance penalty of PHP preparing for output, and this is a relevant effect in this test.
Your question is related to an interview situation. While it is always easier to tell afterwards what would be a better answer, I'd still like to stress that it should not matter in which way variables are stored. The more important factor should be if the code you write is readable and if the code you read is understood.
It simply does not make sense to store a value into an array because of performance reasons if the value is guaranteed to be a single value. Storing it into an array tells me that there will be more values. If I don't immediately see where they are added, I will start searching for them, because if I need to change something in the array, I don't want to use a key that is used somewhere else.
Also, the question targets simple global variables. I would think that a proper application design will use objects. Why didn't they ask about class properties? Why didn't they ask about static vs. dynamic properties and their performance difference, or public vs. protected vs. private variables?
A good candidate probably would discuss the usefulness of the question itself, and challenge the source of the performance data that decides the correct answer. If you change the code in the above link and remove the first echo statement, you will see that the data changes dramatically.
https://3v4l.org/dGAJI
Output for 5.5.2, 5.5.25, 5.6.10, 7.0.4
iAmSomeRandomStringiAmSomeRandomString
First case: 0.000031
Second case: 0.000004
Output for 7.0.3
iAmSomeRandomStringiAmSomeRandomString
First case: 0.000127
Second case: 0.000017
$stringVar = 'iAmSomeRandomString';
has a total memory usage of 179336 bytes.
$stringArr = array('rando' => 'iAmSomeRandomString');
has a total memory usage of 179616 bytes.
Going by memory usage, when we have just a single element it is better to use a basic variable than an array.
The PHP function memory_get_usage() can be used to check the memory allocated to the PHP script.
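For reference, a minimal way to take such measurements with memory_get_usage() (the exact byte counts will differ between PHP versions and builds):
$before = memory_get_usage();
$stringVar = 'iAmSomeRandomString';
echo 'Plain variable: ', memory_get_usage() - $before, " bytes\n";

$before = memory_get_usage();
$stringArr = array('rando' => 'iAmSomeRandomString');
echo 'One-element array: ', memory_get_usage() - $before, " bytes\n";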
First case will consume less memory and will be faster. But the difference is so tiny that it doesn't really matter most of the time. Though, yes, technically they're right.
$time_start = microtime(true);
for ($i = 0; $i < 100000; $i++)
{
    $stringVar = 1;
    $stringVar++;
}
$time_end = microtime(true);
echo ($time_end - $time_start);
Will output 0.0034611225128174
$time_start = microtime(true);
for ($i = 0; $i < 100000; $i++)
{
    $stringArr = array('rando' => 1);
    $stringArr['rando']++;
}
$time_end = microtime(true);
echo ($time_end - $time_start);
Will output 0.017335176467896
It's obvious that the first case (keeping the values in individual variables) uses less memory and requires less time for access.
An array is a variable itself and, besides the memory needed to keep its content, it also uses memory for its bookkeeping. The items stored in an array are kept in linked lists to allow sequential access (foreach) and there is also a map that allows direct access to elements ($array[$index]).
All these internal data structures use memory and they grow as the array grows (the memory overhead of the array is not constant).
Regarding the time needed to access the data, an array introduces an additional level of indirection. Accessing a simple string variable requires finding its name in the current symbol table, after which its content is accessible. With an array, the lookup happens twice: first find the array variable, then find the desired value inside the array.
However, for most scripts, the difference on both memory consumption and access time is not significant and it is not worth trying to optimize anything this way.
I know this question has been asked before but I haven't been able to find a definitive answer.
Does overuse of the echo statement slow down end-user load times?
By having more echo statements in the file the file size increases so I know this would be a factor. Correct me if I'm wrong.
I know after some research that using PHP's ob_start() function along with upping Apache's SendBufferSize can help decrease load times, but from what I understand this is more of a decrease in PHP execution time by allowing PHP to finish/exit sooner, which in turn allows Apache to exit sooner.
With that being said, PHP does exit sooner, but does that mean PHP actually took less time to execute and in turn sped things up on the end-user side?
To be clear, what I mean by this is: if I had two files with the same content, and one used the echo statement for every HTML tag while the other used the standard method of breaking in and out of PHP, then aside from the difference in file size from the "overuse" of the echo statement (within reason, I'm guessing?), which one would be faster? Or would there really not be any difference?
Maybe I'm going about this or looking at this wrong?
Edit: I have done a bit of checking around and found a way to create a stopwatch to check the execution time of a script, and it seems to work quite well. If anybody is interested in doing the same, here is the link to the method I have chosen to use for now.
http://www.phpjabbers.com/measuring-php-page-load-time-php17.html
Does overuse of the echo statement slow down end-user load times?
No.
By having more echo statements in the file the file size increases so I know this would be a factor. Correct me if I'm wrong.
You are wrong.
does that mean PHP actually took less time to execute and in turn sped things up on the end-user side?
No.
Or would there really not be any difference?
Yes.
Maybe I'm going about this or looking at this wrong?
Definitely.
There is a common problem with performance-related questions.
Most of them come up not from real needs but out of imagination.
One has to solve only real problems, not imaginary ones.
This is not an issue.
You are overthinking things.
This is an old question, but the problem with the logic presented here is that it assumes "more commands equals slower performance" when, in terms of modern programming and modern systems, this is an utterly irrelevant issue. These concerns only apply to someone who, for some reason, programs at an extremely low level in something like assembler.
There might be a slowdown, but nothing anyone would ever be able to perceive: a fraction of a second so small that any effort you make to optimize that code would not produce anything worthwhile.
That said, speed and performance should always be a concern when programming, but not in terms of how many of a command you use.
As someone who uses PHP with echo statements, I would recommend that you organize your code for readability. A pile of echo statements is simply hard to read and edit. Depending on your needs you should concatenate the contents of those echo statements into a string that you then echo later on.
Or—a nice technique I use—is to create an array of values I need to echo and then run echo implode('', $some_array); instead.
The benefit of an array over string concatenation is that it's naturally easier to understand that $some_array[] = 'Hello!'; will be a new addition to that array, whereas something like $some_string .= 'Hello!'; might seem simple but can be confusing to debug when you have tons of concatenation happening.
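A minimal sketch of the array-then-implode approach (the $items list is made up for illustration):
$items = array('first', 'second', 'third');      // hypothetical data to output
$parts = array();
$parts[] = '<ul>';
foreach ($items as $item) {
    $parts[] = '<li>' . htmlspecialchars($item) . '</li>';
}
$parts[] = '</ul>';
echo implode('', $parts);                        // one echo at the end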
But at the end of the day, clean code that is easy to read is more important to all involved than shaving fractions of a second off of a process. If you are a modern programmer, program with an eye towards readability as a first draft and then—if necessary—think about optimizing that code.
Do not worry about having 10 or 100 calls to echo. When optimizing, these shouldn't even be taken into consideration.
Bear in mind that on a normal server a simple echo call runs in less than 1/100,000 of a second.
Always worry more about code readability and maintenance than about those X extra echo calls.
I didn't run any benchmark. All I can say is that when you echo strings (HTML or not) and use double quotes (") it's slower than single quotes (').
For strings with double quotes, PHP has to parse those strings. You may know that you can get variables into a string just by inserting them:
echo "you're $age years old!";
PHP has to parse your string to look up those variables and automatically replace them. When you're sure you don't have any variables inside your string, use single quotes.
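For comparison, the single-quoted version of the earlier example concatenates the variable explicitly:
echo 'you\'re ' . $age . ' years old!';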
Hope this would help you.
Even when you use a bunch of echo calls, I don't think it would slow down your loading time. Loading time depends on the reaction time of your server and the execution time. If your loading time is too high for the given task, check the whole code, not only whether the echoes could slow down your server; I would suspect something else is wrong inside your code.
I have some very large data files and for business reasons I have to do extensive string manipulation (replacing characters and strings). This is unavoidable. The number of replacements runs into hundreds of thousands.
It's taking longer than I would like. PHP is generally very quick but I'm doing so many of these string manipulations that it's slowing down and script execution is running into minutes. This is a pain because the script is run frequently.
I've done some testing and found that str_replace is fastest, followed by strstr, followed by preg_replace. I've also tried individual str_replace statements as well as constructing arrays of patterns and replacements.
I'm toying with the idea of isolating string manipulation operation and writing in a different language but I don't want to invest time in that option only to find that improvements are negligible. Plus, I only know Perl, PHP and COBOL so for any other language I would have to learn it first.
I'm wondering how other people have approached similar problems?
I have searched and I don't believe that this duplicates any existing questions.
Well, considering that in PHP some string operations are faster than array operations, and you are still not satisfied with its speed, you could write an external program as you mentioned, probably in some "lower level" language. I would recommend C/C++.
There are two ways of handling this, IMO:
[easy] Precompute some generic replacements in a background process and store them in a DB/file (this trick comes from gamedev, where all the sines and cosines are precomputed once and then stored in RAM). You can easily run into the curse of dimensionality here, though;
[not so easy] Implement the replacement tool in C++ or another fast, compiled programming language and use it afterwards. Sphinx is a good example of a fast manipulation tool for big textual data sets implemented in C++.
If you'd allow the replacement to be handled over multiple executions, you could create a script that processes each file, temporarily creating replacement files with duplicate content. This would allow you to extract data from one file to another, process the copy, and then merge the changes; or, if you use a stream buffer, you might be able to remember each row so the copy/merge step can be skipped.
The problem, though, might be that you process a file without completing it, leaving it in a mixed state. Therefore a temporary file is suitable.
This would allow the script to run as many times as there are still changes to be made; all you need is a temporary file that remembers which files have been processed.
The limiting factor is PHP rebuilding the strings. Consider:
$out=str_replace('bad', 'good', 'this is a bad example');
It's a relatively low-cost operation to locate 'bad' in the string, but in order to make room for the longer substitution, PHP then has to shift each of the characters e, l, p, m, a, x, e and the space before writing in the new value.
Passing arrays for the needle and haystack will improve performance, but not as much as it might.
AFAIK, PHP does not have low-level memory access functions, hence an optimal solution would have to be written in a different language, dividing the data up into 'pages' which can be stretched to accommodate changes. You could try this using chunk_split to divide the string up into smaller units (each replacement would then require less memory juggling).
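As a rough illustration of that chunking idea (using str_split() here rather than chunk_split(), since it returns an array of pieces; note that a search string straddling a chunk boundary would be missed, so this is only a sketch):
// Hypothetical search/replace arrays; replace in 1 MB pieces instead of one huge buffer.
$search  = array('bad', 'worse');
$replace = array('good', 'better');
$chunks = str_split($data, 1024 * 1024);
foreach ($chunks as $i => $chunk) {
    $chunks[$i] = str_replace($search, $replace, $chunk);
}
$data = implode('', $chunks);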
Another approach would be to dump it into a file and use sed (this still operates one search/replace at a time), e.g.
sed -i 's/bad/good/g;s/worse/better/g' file_containing_data
If you have to do this operation only once and you are replacing with static content, you can use Dreamweaver or another editor, so you will not need PHP. It will be much faster.
Still, if you do need to do this dynamically with PHP (you need database records or the like), you can use shell commands via exec - do a Google search for search-replace.
It is possible that you have hit a wall with PHP. PHP is great, but in some areas it fails, such as processing LOTS of data. There are a few things you could do:
Use more than one PHP process to accomplish the task (two processes could potentially take half as long).
Install a faster CPU.
Do the processing on multiple machines.
Use a compiled language to process the data (Java, C, C++, etc)
I think the question is why are you running this script frequently? Are you performing the computations (the string replacements) on the same data over and over again, or are you doing it on different data every time?
If the answer is the latter then there isn't much more you can do to improve performance on the PHP side. You can improve performance in other ways, such as better hardware (SSDs for faster reads/writes on the files, multicore CPUs, faster RAM with higher bus speeds) and by breaking the data into smaller pieces and running multiple scripts at the same time to process it concurrently.
If the answer is the former then you might want to consider caching the result using something like memcached or Redis (key/value cache stores) so that you only perform the computation once; after that it's just a linear read from memory, which is very cheap and involves virtually no CPU overhead (you might also utilize the CPU cache at this level).
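A minimal sketch of that caching idea with the Memcached extension (assumes a memcached server is reachable locally; the input file name and TTL are made up):
$m = new Memcached();
$m->addServer('127.0.0.1', 11211);

$key = 'replaced:' . md5_file('data.txt');       // hypothetical input file
$result = $m->get($key);
if ($result === false) {
    // Cache miss: do the expensive replacement once and store the result.
    // $search and $replace stand in for whatever pattern/replacement arrays you use.
    $result = str_replace($search, $replace, file_get_contents('data.txt'));
    $m->set($key, $result, 3600);                // keep it for an hour
}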
String manipulation in PHP is already cheap because PHP strings are essentially just byte arrays. There's virtually no overhead from PHP in reading a file into memory and storing it in a string. If you have some sample code that demonstrates where you're seeing performance issues and some benchmark numbers I might have some better advice, but right now it just looks like you need to refactor your approach based on what your underlying needs are.
For example, there are both CPU and I/O costs to consider individually when you're dealing with data in different situations. I/O involves blocking since it's a system call. This means your CPU has to wait for more data to come over the wire (while your disk transfers data to memory) before it can continue to process or compute that data. Your CPU is always going to be much faster than memory and memory is always much faster than disk.
Here's a simple benchmark to show you the difference:
/* First, let's create a simple test file to benchmark */
file_put_contents('in.txt', str_repeat(implode(" ",range('a','z')),10000));
/* Now let's write two different tests that replace all vowels with asterisks */
// The first test reads the entire file into memory and performs the computation all at once
function test1($filename, $newfile) {
    $start = microtime(true);
    $data = file_get_contents($filename);
    // Use a plain string replacement so every vowel becomes '*'
    // (with array('*') as the replacement, only 'a' would be replaced and the rest removed)
    $changes = str_replace(array('a','e','i','o','u'), '*', $data);
    file_put_contents($newfile, $changes);
    return sprintf("%.6f", microtime(true) - $start);
}
// The second test reads only 8KB chunks at a time and performs the computation on each chunk
function test2($filename, $newfile) {
    $start = microtime(true);
    $fp = fopen($filename, "r");
    $changes = '';
    while (!feof($fp)) {
        $changes .= str_replace(array('a','e','i','o','u'), '*', fread($fp, 8192));
    }
    fclose($fp);
    file_put_contents($newfile, $changes);
    return sprintf("%.6f", microtime(true) - $start);
}
The above two tests do the same exact thing, but Test2 proves significantly faster for me when I'm using smaller amounts of data (roughly 500KB in this test).
Here's the benchmark you can run...
// Conduct 100 iterations of each test and average the results
for ($i = 0; $i < 100; $i++) {
    $test1[] = test1('in.txt', 'out.txt');
    $test2[] = test2('in.txt', 'out.txt');
}
echo "Test1 average: ", sprintf("%.6f", array_sum($test1) / count($test1)), "\n",
     "Test2 average: ", sprintf("%.6f\n", array_sum($test2) / count($test2));
For me the above benchmark gives Test1 average: 0.440795 and Test2 average: 0.052054, which is an order of magnitude difference and that's just testing on 500KB of data. Now, if I increase the size of this file to about 50MB Test1 actually proves to be faster since there are fewer system I/O calls per iteration (i.e. we're just reading from memory linearly in Test1), but more CPU cost (i.e. we're performing a much larger computation per iteration). The CPU generally proves to be able to handle much larger amounts of data at a time than your I/O devices can send over the bus.
So it's not a one-size-fits-all solution in most cases.
Since you know Perl, I would suggest doing the string manipulations in Perl using regular expressions and using the final result in your PHP web page.
This seems better for the following reasons:
You already know Perl
Perl does string processing better
You can use PHP only where necessary
Does this manipulation have to happen on the fly? If not, might I suggest pre-processing... perhaps via a cron job.
Define what rules you're going to be using:
Is it just one str_replace or a few different ones?
Do you have to do the entire file in one shot, or can you split it into multiple batches (e.g. half the file at a time)?
Once your rules are defined, decide when you will do the processing (e.g. 6am before everyone gets to work).
Then you can set up a job queue. I have used cron jobs to run my PHP scripts on a given schedule.
For a project I worked on a while ago I had a setup like this:
7:00 - pull 10,000 records from mysql and write them to 3 separate files.
7:15 - run a complex regex on file one.
7:20 - run a complex regex on file two.
7:25 - run a complex regex on file three.
7:30 - combine all three files into one.
8:00 - walk into the meeting with the formatted file your boss wants. *profit*
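A schedule like that could be expressed in a crontab along these lines (paths and script names are invented for illustration; each entry is minute, hour, day of month, month, day of week, then the command):
0  7 * * * php /path/to/pull_records.php
15 7 * * * php /path/to/regex_file_one.php
20 7 * * * php /path/to/regex_file_two.php
25 7 * * * php /path/to/regex_file_three.php
30 7 * * * php /path/to/combine_files.php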
Hope this helps get you thinking...
I have a really long PHP script for just one page, i.e. something like:
mywebsite.com/page.php?id=99999
I have about 10000-20000 cases of the id, each with different settings. Will this slow down my website significantly?
i.e. my question is really along the lines of: what happens when PHP is executed? Does the server execute it and send the results, or does the client's computer download it, execute it and display the results?
If it's the latter, does it mean a really slow load time? Each of the 10000-20000 cases has about 20-25 lines of code after it.
thanks, xoxo
A PHP file is usually processed (interpreted) on the web server and the output is passed to the client.
If the website is slow or not, that totally depends on what the PHP script actually does. However, a PHP file with 10000-20000 cases sounds really, really bad code-wise. Yet, it might perform well for your case (pardon the pun).
Everything comes down to what code is actually performed: Do you just print out different text depending on the given id or do you run a really expensive operation (eg. create a zip file, download stuff, compute PI to the last decimal, ...)?
10,000 to 20,000 distinct cases sounds like a nightmare. Although it's technically possible, I find it hard to believe that your processing needs require that level of granularity.
Is the processing in each of the 10,000 to 20,000 cases really so different that it needs completely separate testing and handling? Aren't there cases similar enough to be handled in a similar way?
For example, if the processing for case $x = 5 is something like:
echo 5;
And the processing for case $x = 10 is something like:
echo 10;
Then these could be grouped into a single test and single handler:
function dumbEcho($x) {
    echo $x;
}
function isDumbEchoAble($x) {
    return in_array($x, array(5, 10));
}
if (isDumbEchoAble($x)) {
    dumbEcho($x);
}
For each structurally similar set of processing, you could create a isXXXAble() function to test and an XXX() function to process. [Of course, this is just a simple example, intended to demonstrate a principle, a concept, not necessarily code that you can copy/paste into your current situation.]
The essence of programming - IMHO - is to find these structural similarities, find a parameterization sufficient to handle the unique cases, and then apply this parameterized processing to those cases.
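Taking that one step further, a lookup table that maps ids to a small set of handler functions avoids a giant switch entirely. A minimal sketch (the ids and the specialPage handler are invented; dumbEcho is the function from above):
// Map each id to the handler that knows how to render it.
// In a real application this map might be generated or stored in a database.
$handlers = array(
    5  => 'dumbEcho',
    10 => 'dumbEcho',
    42 => 'specialPage',   // hypothetical handler for a one-off case
);
function specialPage($x) {
    echo "Special content for $x";
}
$id = isset($_GET['id']) ? (int) $_GET['id'] : 0;
if (isset($handlers[$id])) {
    $handlers[$id]($id);   // variable function call dispatches to the right handler
} else {
    echo 'Unknown id';
}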
I recently ran up against a problem that challenged my programming abilities, and it was a very accidental infinite loop. I had rewritten some code to DRY it up and changed a function so that it was repeatedly called by the exact methods it called; an elementary issue, certainly. Apache decided to solve the problem by crashing, and the log noted nothing but "spawned child process". The problem was that I never actually finished debugging the issue that day - it cropped up in the afternoon - and I had to solve it today.
In the end, my solution was simple: log the logic manually and see what happened. The problem was immediately apparent when I had a log file consisting of two unique lines, followed by two lines that were repeated some two hundred times apiece.
What are some ways to protect ourselves against infinite loops? And, when that fails (and it will), what's the fastest way to track it down? Is it indeed the log file that is most effective, or something else?
Your answer could be language agnostic if it were a best practice, but I'd rather stick with PHP specific techniques and code.
You could use a debugger such as Xdebug and step through the code you suspect contains the infinite loop.
Xdebug - Debugger and Profiler Tool for PHP
You can also set
max_execution_time
to limit the time the infinite loop will burn before crashing.
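For example, either in php.ini or at the top of the script (the 10-second limit here is arbitrary):
ini_set('max_execution_time', 10);   // abort the request after roughly 10 seconds of execution
// or equivalently:
set_time_limit(10);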
I sometimes find the safest method is to incorporate a limit check in the loop. The type of loop construct doesn't matter. In this example I chose a 'while' statement:
$max_loop_iterations = 10000;
$i = 0;
$test = true;
while ($test) {
    if ($i++ == $max_loop_iterations) {
        echo "too many iterations...";
        break;
    }
    ...
}
A reasonable value for $max_loop_iterations might be based upon:
a constant value set at runtime
a computed value based upon the size of an input
or perhaps a computed value based upon relative runtime speed
Hope this helps,
- N
Unit tests might be a good idea, too. You might want to try PHPUnit.
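For instance, a test can pin an upper bound on the work a function is allowed to do; a minimal sketch (assumes PHPUnit is installed, and buildList() is a hypothetical function under test that should return at most the requested number of items):
use PHPUnit\Framework\TestCase;

class LoopBoundTest extends TestCase
{
    public function testBuildListRespectsItsLimit()
    {
        $items = buildList(100);                        // hypothetical function under test
        $this->assertLessThanOrEqual(100, count($items));
    }
}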
Can you trace execution and dump out a call graph? Infinite loops will be obvious, and you can easily pick out the ones that are on purpose (a huge main loop) vs. an accident (a local abababa... loop, or a loop that never returns back to the main loop).
I've heard of software that does this for large, complex programs, but I couldn't find a screenshot for you. Someone traced a program like 3ds Max and dumped out a really pretty call graph.
Write simple code, so bugs will be more apparent. I wonder if your infinite loop is due to some ridiculous yarn-ball of control structures that no human being can possibly understand. Thus, of course, you f'ed it up.
Not all infinite loops are bad (e.g. while (!quit)).