Best way to measure (and refine) performance with PHP?

A site I am working with is starting to get a little sluggish, and I would like to refine it. I think the problem is with the PHP, but I can't be sure. How can I see how long functions are taking to perform?

If you want to test the execution time :
<?php
$startTime = microtime(true);
// Your content to test
$endTime = microtime(true);
$elapsed = $endTime - $startTime;
echo "Execution time : $elapsed seconds";
?>
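A single measurement like this can be noisy, so it often helps to repeat the call and average. The following is a minimal sketch, assuming PHP 7.3 or later for hrtime(); the helper name time_call() and the iteration count are illustrative and not part of the original answer.

```php
<?php
/**
 * Run $fn $iterations times and return the average wall-clock time
 * per call, in seconds. time_call() is a hypothetical helper name.
 */
function time_call(callable $fn, int $iterations = 100): float
{
    if ($iterations < 1) {
        throw new InvalidArgumentException('iterations must be >= 1');
    }
    $start = hrtime(true);                 // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $elapsedNs = hrtime(true) - $start;
    return $elapsedNs / 1e9 / $iterations; // average seconds per call
}

// Usage: compare two ways of building the same string.
$concat = time_call(function () {
    $s = '';
    for ($i = 0; $i < 1000; $i++) {
        $s .= $i;
    }
});
$implode = time_call(function () {
    $s = implode('', range(0, 999));
});
printf("concat: %.6f s, implode: %.6f s\n", $concat, $implode);
```

Averaging over many iterations smooths out one-off spikes, but it still measures wall-clock time, so results will vary with server load.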

Try the profiler feature in XDebug or Zend Debugger?

There are two things you can do.
First, you can place microtime() calls everywhere, although that is not convenient if you want to test more than one function. If you want to test many functions, which I assume you do, a simpler solution is to use a class that records how long each of your functions takes, rather than scattering microtime() calls through the code. The tutorial linked below walks through such a class:
http://codeaid.net/php/calculate-script-execution-time-%28php-class%29
The second thing you can do to optimize your script is to look at its memory usage. By observing how much memory your scripts use, you may be able to optimize your code better.
PHP has a garbage collector and a fairly complex memory manager, so the amount of memory your script uses can go up and down during execution. To get the current memory usage, use the memory_get_usage() function; to get the highest amount of memory used at any point, use memory_get_peak_usage().
echo "Initial: ".memory_get_usage()." bytes \n";
/* prints
Initial: 361400 bytes
*/
// let's use up some memory
for ($i = 0; $i < 100000; $i++) {
$array []= md5($i);
}
// let's remove the elements again (note: this loop unsets all of them, not half)
for ($i = 0; $i < 100000; $i++) {
unset($array[$i]);
}
echo "Final: ".memory_get_usage()." bytes \n";
/* prints
Final: 885912 bytes
*/
echo "Peak: ".memory_get_peak_usage()." bytes \n";
/* prints
Peak: 13687072 bytes
*/
http://net.tutsplus.com/tutorials/php/9-useful-php-functions-and-features-you-need-to-know/
PK
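To see how much memory one particular piece of code is responsible for, the two functions can be wrapped around a callable. This is only a sketch: measure_memory() is a made-up name, and the peak it reports is the peak for the whole script so far, not just for the callable.

```php
<?php
/**
 * Report how many bytes a callable's result retains, plus the script's
 * peak usage so far. measure_memory() is a hypothetical helper name.
 */
function measure_memory(callable $fn): array
{
    gc_collect_cycles();              // clear collectable garbage before measuring
    $before = memory_get_usage();
    $result = $fn();                  // keep the result alive while we measure
    $after  = memory_get_usage();

    return [
        'retained_bytes' => $after - $before,
        'peak_bytes'     => memory_get_peak_usage(), // peak of the whole script
        'result'         => $result,
    ];
}

// Usage: how much memory do 10,000 md5 strings take?
$report = measure_memory(function () {
    $hashes = [];
    for ($i = 0; $i < 10000; $i++) {
        $hashes[] = md5((string) $i);
    }
    return $hashes;
});
printf(
    "retained: %d bytes, peak: %d bytes\n",
    $report['retained_bytes'],
    $report['peak_bytes']
);
```

The exact numbers depend on the PHP version and build, so treat them as relative indicators rather than absolute costs.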

You can also do it manually, by recording the microtime() value at various places, like this:
<?php
$TIMER['start']=microtime(TRUE);
// some code
$query="SELECT ...";
$TIMER['before q']=microtime(TRUE);
// $db is assumed to be an open mysqli connection; the old mysql_* functions were removed in PHP 7
$res=mysqli_query($db, $query);
$TIMER['after q']=microtime(TRUE);
while ($row = mysqli_fetch_array($res)) {
// some code
}
$TIMER['array filled']=microtime(TRUE);
// some code
$TIMER['pagination']=microtime(TRUE);
// and so on
?>
and then visualize it
<?
if ('127.0.0.1' === $_SERVER['REMOTE_ADDR']) {
echo "<table border=1><tr><td>name</td><td>so far</td><td>delta</td><td>per cent</td></tr>";
reset($TIMER);
$start=$prev=current($TIMER);
$total=end($TIMER)-$start;
foreach($TIMER as $name => $value) {
$sofar=round($value-$start,3);
$delta=round($value-$prev,3);
$percent=round($delta/$total*100);
echo "<tr><td>$name</td><td>$sofar</td><td>$delta</td><td>$percent</td></tr>";
$prev=$value;
}
echo "</table>";
}
?>
The IP address check is there because we are doing this profiling on the live site.
That said, I doubt it's PHP itself. Most likely it's the database, so pay the most attention to query execution times.
However, "site" is a very broad term: it also includes JS, CSS, images and so on. So I'd suggest starting from Firebug's Net panel to see which part of the whole page takes the most time.
Of course, refining can only be done after analyzing the profiling results, and nothing specific can be advised here without them.

Your best bet is Xdebug. I'm happy with it, as it comes bundled with my PHPed IDE, and I can get profiler data at the click of a button.
So maybe you could consider that.

I had similar issues, so I created two new tables in the database and two new functions, audit_sql and audit_code. Because I used an SQL abstraction class, it was easy to time every single SQL call (I used PHP's microtime, as some others have suggested). I called microtime before and after each SQL call and stored the results in the database.
Similarly with pages: I called microtime at the start and end of each page and, if necessary, at the start and end of functions, divs, or whatever I thought might be a culprit.
The general results were:
SQL calls to MySQL were almost instantaneous and were not a problem at all. The only thing I would say is that even I was surprised at the number being executed! The site is generated from the database, even the menus, permissions etc. To produce the home page, the SQL calls were measured in the hundreds.
PHP was not the culprit. It was even more instantaneous than MySQL.
The culprit was... (big build-up!) calls to YouTube, Picasa and other sites like that. I host videos and photo albums on the site (well, I don't actually store them; they are stored on YouTube etc.), and on the home page are thumbnails that are extracted from YouTube and the like via the YouTube PHP API/Zend Framework. Because this is all HTTP-based access to the other sites, each call was taking 1, 2 or 3 seconds. This was causing the divs containing them to take between 6 and 12 seconds, and the home page up to 17 seconds.
The solution: store all thumbnails on my server. The first time, a thumbnail has to be served from the remote site (YouTube, Picasa etc.), so do that and then store it on your own site. After that, check whether you have it and, if so, always serve it from your server. This cuts the page load time down to 2-3 seconds tops. Granted, the first home page load after someone has added more videos/images will take some time, but not thereafter. People will put a single long page load down to their connection or the internet in general, but too many slow loads of your site and they will stop visiting!
I hope that helps somewhat.
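The "fetch once, then serve locally" idea described above could look roughly like the sketch below. The function name cached_thumbnail(), the cache directory layout and the injected $fetch callable are all assumptions made for illustration; on a real site $fetch might wrap file_get_contents() or cURL.

```php
<?php
/**
 * Return the local path of a cached copy of $remoteUrl, downloading it
 * only the first time it is requested. $fetch is injected so the sketch
 * runs without network access. All names here are illustrative.
 */
function cached_thumbnail(string $remoteUrl, string $cacheDir, callable $fetch): string
{
    $path = $cacheDir . '/' . sha1($remoteUrl) . '.img';
    if (!is_file($path)) {
        $data = $fetch($remoteUrl);            // slow remote call, first time only
        if ($data === false || $data === '') {
            throw new RuntimeException("Could not fetch $remoteUrl");
        }
        // Write to a temp file and rename it, so readers never see a partial file.
        $tmp = $path . '.' . uniqid('', true) . '.tmp';
        file_put_contents($tmp, $data);
        rename($tmp, $path);
    }
    return $path;
}

// Usage with a stand-in fetcher that counts how often it is called.
$cacheDir = sys_get_temp_dir() . '/thumb_cache_' . uniqid();
mkdir($cacheDir);
$calls = 0;
$fakeFetch = function (string $url) use (&$calls) {
    $calls++;
    return "fake image bytes for $url";
};
$first  = cached_thumbnail('http://example.com/video/1/thumb.jpg', $cacheDir, $fakeFetch);
$second = cached_thumbnail('http://example.com/video/1/thumb.jpg', $cacheDir, $fakeFetch);
echo "remote fetches: $calls\n";
```

Only the first call pays the remote round trip; later calls are a local file check. The sketch never expires cached files, which a real site would probably want to handle.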

Related

Simple PHP Script Makes Heavy Server Load

I've got a simple PHP script that, once run, makes it impossible for me to access any other page on the server.
The script is as simple as this:
for($league=11387; $league<=11407; $league++){
for($i=1; $i<9; $i++){
//gets the team object here from external resource
$team = $HT->getYouthTeam($HT->getTeam($HT->getLeague($league)->getTeam($i)->getTeamId())->getYouthTeamId());
if($team->getId() != 2286094){
$youthTeams[] = $team;
}
set_time_limit(10);
}
}
Obviously, I am supposed to get thousands of "teams" here (except the one with the ID of 2286094), but once I run this script I cannot open any other page on the server until it is over, and it takes a long time for the script to fetch the results into the $youthTeams array.
My intent was to make a progress bar that would tell exactly (in %) where the script is at, but I can't, since this script makes it impossible for the server to display any other pages (any other page shows as "loading" but never loads while this script is running on the server).
Also, an additional sub-question: once all of this data is fetched, would it be smart to insert it all into the MySQL database in one single query?
I really want to learn more about this and get this finished, so please help me out on this one.
Maybe you can identify which one of your lookups eats the most time by checking on the times?
$t0=microtime(true);
$teamid=$HT->getLeague($league)->getTeam($i)->getTeamId();
echo "lookup teamid: ".(($t1=microtime(true))-$t0)."<br>";
$youthteamid=$HT->getTeam($teamid)->getYouthTeamId();
echo "lookup youthteamid: ".(($t2=microtime(true))-$t1)."<br>";
$youthteam = $HT->getYouthTeam($youthteamid);
echo "lookup youthteam: ".(($t3=microtime(true))-$t2)."<br>total time: ".($t3-$t0)."<br>";
// keep every team except the one with ID 2286094, as in the question
if ($youthteam->getId() != 2286094) {
$youthTeams[] = $youthteam;
}

The bwshare module and PHP scraping

I wrote a script downloading a list of pages from a website. From time to time I receive the following error (the number of seconds is variable):
The bwshare module will refuse your requests for the next 7 seconds.
You have downloaded data too rapidly.
I found when using sleep(2) in the loop, it works much better, however the time delay is too expensive.
What's the best way to deal with this module? Should I scrape without any delay and, if the response is similar to the above message, simply sleep for the requested number of seconds?
It all depends on how many pages you can get before the error message.
Try and measure how many pages in average you can get.
4 pages before the bwshare message is the minimum.
If you are getting the error message before reaching 4 page downloads, then it would be faster to sleep(2) after each download.
Try it this way; it might help you.
$requestTime = 0.1; // minimum seconds per request
foreach ($urls as $url) { // $urls: your list of pages to download
$start = microtime(true);
// Do your work here: file_get_contents($url) and other processing
$timeTaken = microtime(true) - $start;
if ($timeTaken < $requestTime) {
// wait out the rest of the time slot; usleep() takes whole microseconds
usleep((int) (($requestTime - $timeTaken) * 1000000));
}
}
If this solves your problem, please post your answer so that other people can benefit too.
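The alternative raised in the question, scraping at full speed and sleeping only when the bwshare notice appears, could be sketched as follows. The regular expression is based on the message text quoted in the question. The function names and the retry limit are invented for illustration, and the sketch assumes the notice arrives as an ordinary response body.

```php
<?php
/**
 * Return the number of seconds the bwshare notice asks us to wait,
 * or 0 if the body does not look like that notice.
 */
function bwshare_wait_seconds(string $body): int
{
    if (preg_match('/refuse your requests for the next (\d+) seconds?/i', $body, $m)) {
        return (int) $m[1];
    }
    return 0;
}

/**
 * Fetch a URL, sleeping and retrying when the notice is returned.
 * $fetch is injectable for testing and defaults to file_get_contents.
 */
function fetch_with_backoff(string $url, ?callable $fetch = null, int $maxRetries = 3)
{
    $fetch = $fetch ?? 'file_get_contents';
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $body = $fetch($url);
        $wait = is_string($body) ? bwshare_wait_seconds($body) : 0;
        if ($wait === 0) {
            return $body;        // normal page (or false on a fetch error)
        }
        sleep($wait + 1);        // wait as requested, plus one second of margin
    }
    return false;                // still throttled after all retries
}
```

Whether this beats a fixed delay depends on how many pages get through between notices, which is the measurement suggested above.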

PHP spreading a script into multiple parts to avoid server timeout

I have a script that takes very long to execute, so when I run it, it hits the max execution time on my web server and ends up timing out.
To illustrate, imagine I have a for loop that does some pretty intensive manipulation one million times. How could I spread this loop's execution over several parts so that I don't hit the max execution time of my web server?
Many thanks,
If you have an application that is going to loop a known number of times (i.e. you are sure that it's going to finish some time) you can increase time limit inside the loop:
foreach ($data as $row) {
set_time_limit(10);
// do your stuff here
}
This solution will protect you from having one run-away iteration, but will let your whole script run undisturbed as long as you need.
The best solution is to use http://php.net/manual/en/function.set-time-limit.php to change the timeout. Otherwise, you can redirect to an updated URL shortly before the timeout is reached:
$threshold = 10; // seconds to work before redirecting; keep this below your max execution time
$t = microtime(true); // float seconds; plain microtime() returns a string
$i = isset( $_GET['i'] ) ? (int) $_GET['i'] : 0;
for( ; $i < 10000000; $i++ )
{
if( microtime(true) - $t > $threshold )
{
header('Location: http://www.example.com/?i='.$i);
exit;
}
// Your code
}
The browser will only follow a limited number of redirects before it stops, so you're better off using JavaScript to force a page reload.
I once used a technique where I split the work from one file into three parts. It was just an array of 120,000 elements with an intensive operation. I created a splitter script which stored the array in a database in parts of 40,000 elements each. Then I created an HTML file with a redirect to the first PHP file to compute the first 40,000 elements. After computing the first 40,000 elements, I again had an HTML forward to the next PHP file, and so on.
Not very elegant, but it worked :-)
If you have the right permissions on your hosting server, you could use the php interpreter to execute a php script and have it run in the background.
See Asynchronous shell exec in PHP.
If you are running a script that needs to execute for an unknown amount of time, you can use:
set_time_limit(0);
If possible you can make the script so that it handles a portion of the wanted operations. Once it completes say 10%, you via AJAX call the script again to execute the next 10%. But there are circumstances where this is not an ideal solution, it really depends on what you are doing.
I used this method to create a web-based crawler which only ran on my computer for instance. If it had to do the operations at once it would time out as well. So it was split into 200 "tasks", each called via Ajax once the previous completes. Works perfectly, and it's been over a year since it started running (crawling?)
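The "do a portion, then call again" approach can be sketched as a function that works until a time budget is used up and reports where to resume. This is an illustration only: process_batch(), the budget value and the loop that stands in for repeated AJAX requests are assumptions, and $items is expected to be a zero-indexed list.

```php
<?php
/**
 * Process $items starting at $offset until the list is exhausted or
 * $budgetSeconds has elapsed. Returns the offset to resume from, or
 * null when everything is done. At least one item is handled per call.
 */
function process_batch(array $items, int $offset, float $budgetSeconds, callable $work): ?int
{
    $start = microtime(true);
    $count = count($items);
    for ($i = $offset; $i < $count; $i++) {
        if ($i > $offset && microtime(true) - $start > $budgetSeconds) {
            return $i;            // out of time: resume here on the next request
        }
        $work($items[$i]);
    }
    return null;                  // all items processed
}

// Usage: this loop stands in for the client calling the script repeatedly.
$items = range(1, 50);
$sum = 0;
$offset = 0;
$requests = 0;
while ($offset !== null) {
    $offset = process_batch($items, $offset, 0.005, function ($n) use (&$sum) {
        usleep(1000);             // pretend each item is expensive
        $sum += $n;
    });
    $requests++;
}
echo "processed in $requests requests, sum = $sum\n";
```

In a web setting each call would be one HTTP request, and the script would return the next offset (for example as JSON) so the client-side code knows what to request next.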

PHP Speed Test for user connection speed without echo in current page

I am looking for a possibility to check the user connection speed. It is supposed to be saved as a cookie and javascript files as well as css files will be adapted if the speed is slow.
The approach I have for testing the speed at the moment is the following:
$kb = 512;
flush();
//
echo "<!--";
$time = explode(" ",microtime());
for($x=0;$x<$kb;$x++){
echo str_pad('', 512, '.');
flush();
}
$time_end = explode(" ",microtime());
echo "-->";
$start = $time[0] + $time[1];
$finish = $time_end[0] + $time_end[1];
$deltat = $finish - $start;
return round($kb / $deltat, 3);
While it works, I do not like putting so many characters into my page. Also, if I echo all this, I cannot save the result in a cookie because output has already been sent.
Could one do something like this in a different file or something? Do you have any solution?
Thanks in advance.
Do you have any solution?
My solution is to not bother with the speed test at all. Here's why:
You stated that the reason for the test is to determine which JS/CSS files to send. You have to keep in mind that browsers will cache these files after the first download (so long as they haven't been modified). So in effect, you are sending 256K of test data to determine if you should send, say, an additional 512K?
Just send the data and it will be cached. Unless you have MBs of JS/CSS (in which case you need a site redesign, not a speed test) the download time will be doable. Speed tests should be reserved for things such as streaming video and the like.
The only idea I can come up with is a redirect:
Measure the user's speed
Redirect to index
While this isn't a nice solution, it only needs to measure the user's speed once, so I think it's excusable.
How about using JavaScript to time how long it takes to load a page, and then using JavaScript to set the cookie?
microtime in javascript http://phpjs.org/functions/microtime:472
Using jQuery
<head>
<!-- include jquery & other html snipped -->
<script>
function microtime (get_as_float) {
// http://kevin.vanzonneveld.net
// + original by: Paulo Freitas
// * example 1: timeStamp = microtime(true);
// * results 1: timeStamp > 1000000000 && timeStamp < 2000000000
var now = new Date().getTime() / 1000;
var s = parseInt(now, 10);
return (get_as_float) ? now : (Math.round((now - s) * 1000) / 1000) + ' ' + s;
}
function setCookie(c_name, value, expiredays) {
var exdate=new Date();
exdate.setDate(exdate.getDate()+expiredays);
document.cookie=c_name+ "=" +escape(value)+
((expiredays==null) ? "" : ";expires="+exdate.toUTCString());
}
var start = microtime(true);
// .on('load', ...) replaces the .load() shorthand, which was removed in jQuery 3
$(window).on('load', function () {
// everything finished loading
var end = microtime(true);
var diff = end - start;
// save in a cookie for the next 30 days
setCookie('my_speed_test_cookie', diff, 30);
});
</script>
</head>
<body>
<p>some page to test how long it loads</p>
<img src="some_image_file.png">
</body>
Some pitfalls:
- The page would need to start loading first. jQuery would need to be loaded (or you can rework the above code to avoid jQuery).
- Testing speed on ASCII / Latin data may not give the best result, because the characters may get compressed. Besides high-level gzip compression, some modems / lines (if not all) have basic compression that is able to detect repeating characters and tell the other end that the next 500 are repeats of ' '. I guess it would be best to use binary data that has already been compressed.
The problem here is that you can't really solve this nicely, and probably not in pure PHP. The approach you've taken will make the user download (512x512) = 262 144 bytes of useless data, which is much bigger than most complete pages. If the user is on a slow connection, they may assume your site is down before the speed test is over (with 10 kB/sec, it'd take half a minute before anything interesting shows up on screen!).
You could make an AJAX request for a file of a known size and time how long that takes. The problem here is that the page needs to be already loaded for that to work, so it'd only work for subsequent pages.
You could make a "loading" page (like you see on GMail when accessing it from a slow connection) that preloads the data, with a link to the low-bandwidth version (or maybe a redirect if the loading is taking too long).
Or you could save the "start" time in the cookie and make an AJAX request when the page is done loading - that would give you the actual loading time of your page; if that's, say, over 10 seconds, you may want to switch to the low-bandwidth version.
None of these, however, will get you the speed on the very first access; and sending a big empty page up front is not a very good first impression either.
When you visit the first page (maybe 100 kB with all external files), a session is immediately started with
$_SESSION["start_time"] = time();
When the page has finished loading (jQuery window load or something similar), you send another request with the time.
You then compute the speed as PageSize / (jQueryRequestTime - $_SESSION["start_time"]) and set another session variable; the next link the user clicks can then include custom CSS/JS suited to that speed.
Of course, this is not perfect :)
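The speed calculation from this answer can be written out as a small function. This is a sketch under assumptions: the function name, the 50 kB/s cut-off and the profile labels are invented for illustration. Note that time() only has one-second resolution, so microtime(true) would give a finer start timestamp.

```php
<?php
/**
 * Estimate throughput in kB/s from the page size and two timestamps
 * (in seconds). Returns null when the elapsed time is not positive.
 */
function estimate_speed_kbps(int $pageBytes, float $startTime, float $loadedTime): ?float
{
    $elapsed = $loadedTime - $startTime;
    if ($elapsed <= 0) {
        return null;              // clock resolution too coarse, or bad input
    }
    return ($pageBytes / 1024) / $elapsed;
}

// Usage: a 100 kB page that took 4 seconds to finish loading.
$speed = estimate_speed_kbps(100 * 1024, 1000.0, 1004.0);

// The 50 kB/s threshold is an arbitrary example value.
$profile = ($speed !== null && $speed < 50) ? 'low-bandwidth' : 'full';
echo "profile: $profile\n";
```

The result would then be stored in the session or a cookie so the next page can pick the matching CSS/JS.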
After you've determined the user's speed, send javascript to the browser to set the cookie and then do a refresh or redirect in cases where the speed is below what you'd like.
The only thing I can think of would be to subscribe to a service which offers an IP to net speed lookup. These services work by building a database of IP addresses and cataloging their registered intended use. They're not always accurate, but they do provide a starting point. Look up the user's IP address against one of these and see what it returns.
Ip2Location.com provides such a database, beginning with their DB13 product.
Of course, if your goal is a mobile version of the site, user agent sniffing is a better solution.

Patterns for PHP multi processes?

Which design patterns exist for running several PHP processes and collecting the results in one PHP process?
Background:
I have many large trees (> 10000 entries) in PHP and have to run recursive checks on them. I want to reduce the elapsed execution time.
If your goal is minimal time - the solution is simple to describe, but not that simple to implement.
You need to find a pattern to divide the work (You don't provide much information in the question in this regard).
Then use one master process that forks children to do the work. As a rule the total number of processes you use should be between n and 2n, where n is the number of cores the machine has.
Assuming this data will be stored in files you might consider using non-blocking IO to maximize the throughput. Not doing so will make most of your process spend time waiting for the disk. PHP has stream_select() that might help you. Note that using it is not trivial.
If you decide not to use select - increasing the number of processes might help.
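Dividing the work is the first step described above. As a minimal sketch, the items to check (for example independent subtrees) can be split into one chunk per worker. partition_work() is a made-up name, the core count is hard-coded as an assumption, and the forking and result collection are not shown.

```php
<?php
/**
 * Split $items into at most $workers chunks of nearly equal size,
 * one chunk per worker process.
 */
function partition_work(array $items, int $workers): array
{
    if ($workers < 1) {
        throw new InvalidArgumentException('workers must be >= 1');
    }
    if ($items === []) {
        return [];
    }
    $size = (int) ceil(count($items) / $workers);
    return array_chunk($items, $size);
}

// Usage: $cores is assumed here; detect it for your own machine.
$cores    = 4;
$subtrees = range(1, 10);                  // stand-ins for independent subtrees
$chunks   = partition_work($subtrees, 2 * $cores);
echo count($chunks) . " chunks\n";
```

Each chunk would then be handed to one child process, with the master collecting the results once the children finish.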
Regarding the pcntl functions: I've written a daemon with them (a proper one with forking, changing the session ID, the running user, etc.) and it's one of the most reliable pieces of software I've written. Because it spawns workers for every task, even if there is a bug in one of the tasks, it does not affect the others.
From your PHP script, you could launch another script (using exec) to do the processing. Save status updates in a text file, which can then be read periodically by the parent process.
Note: to avoid PHP waiting for the exec'd script to complete, redirect its output to a file and run it in the background:
exec('php /path/to/file.php > output.log 2>&1 &');
Alternatively, you can fork a script using the PCNTL functions. This uses one php script, which when forked can detect whether it is the parent or the child and operate accordingly. There are functions to send/receive signals for the purpose of communicating between parent/child, or you have the child log to a file and the parent read from that file.
From the pcntl_fork manual page:
$pid = pcntl_fork();
if ($pid == -1) {
die('could not fork');
} else if ($pid) {
// we are the parent
pcntl_wait($status); //Protect against Zombie children
} else {
// we are the child
}
This might be a good time to consider using a message queue, even if you run it all on one machine.
The question seems to be a bit confused.
I want to reduce the absolute execution time.
Do you mean elapsed time? Certainly, using the right data structure will improve throughput, but for a given data structure the minimum order of the algorithm is fixed and has nothing to do with how you implement the algorithm.
Which design pattern exist to realize....?
Design patterns describe what code is; they are not a template for writing programs, though they are a useful tool for curriculum design. Starting with a pattern and making your code fit it is in itself an anti-pattern.
Nobody can answer this question without knowing a lot more about your data and how it's structured. However, the key driver for efficiency will be the data structure you use to implement your tree. If elapsed time is important, then certainly look at parallel execution, but it may also be worth considering performing the operation in a different tool. Databases are highly optimized for dealing with large sets of data, though note that the obvious method for describing a tree in a relational database is very inefficient when it comes to isolating sub-trees and walking the tree.
In response to Adam's suggestion of forking, you replied:
I "heard" that pcntl isn't a good solution. Any experiences?
Where did you hear that? Certainly forking from a CGI or mod_php invoked script is a bad idea, but nothing wrong with doing it from the command line. Do have a google for long running PHP processes (be warned there is a lot of bad information out there). What code you write will vary depending on the underlying OS - which you've not stated.
I suspect that you could solve a large part of your performance issues by identifying which parts of the tree need to be checked and only checking those parts AND triggering the checks when the tree is updated, or at least marking the nodes as 'dirty'.
You might find these helpful:
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
http://en.wikipedia.org/wiki/Threaded_binary_tree
C.
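The idea of marking nodes as 'dirty' and checking only those parts of the tree can be sketched like this. The TreeNode class and check_dirty() are hypothetical names, not taken from the question's code. Marking a node also marks its ancestors, so a walk from the root can find it.

```php
<?php
/** Tree node that remembers whether it changed since the last check. */
class TreeNode
{
    public $value;
    public $children = [];
    public $parent = null;
    public $dirty = true;              // new nodes have never been checked

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function add(TreeNode $child): TreeNode
    {
        $child->parent = $this;
        $this->children[] = $child;
        $this->markDirty();
        return $child;
    }

    public function setValue($value): void
    {
        $this->value = $value;
        $this->markDirty();
    }

    /** Mark this node and every ancestor up to the root. */
    public function markDirty(): void
    {
        for ($n = $this; $n !== null; $n = $n->parent) {
            $n->dirty = true;
        }
    }
}

/** Run $check on dirty nodes only; returns how many nodes were checked. */
function check_dirty(TreeNode $node, callable $check): int
{
    if (!$node->dirty) {
        return 0;                      // clean subtree: skip it entirely
    }
    $check($node);
    $node->dirty = false;
    $checked = 1;
    foreach ($node->children as $child) {
        $checked += check_dirty($child, $check);
    }
    return $checked;
}

// Usage: after the first full pass, a change to one leaf re-checks one path.
$root = new TreeNode('root');
$a  = $root->add(new TreeNode('a'));
$b  = $root->add(new TreeNode('b'));
$a1 = $a->add(new TreeNode('a1'));
$noop = function (TreeNode $n) {};
$firstPass = check_dirty($root, $noop);    // every node is new
$a1->setValue('changed');
$secondPass = check_dirty($root, $noop);   // only root, a and a1
echo "first pass: $firstPass nodes, second pass: $secondPass nodes\n";
```

The trade-off is a little bookkeeping on every update in exchange for skipping untouched subtrees on every check.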
You could use a more efficient data structure, such as a B-tree. I used one once in Java, but not in PHP. You can try this script: http://www.phpclasses.org/browse/file/708.html; it is an implementation of a B-tree.
If that is not enough, you can use Hadoop to implement a Map/Reduce pattern, as Michael said. I would not fork PHP processes; it does not seem to help performance.
Personally, I would use PHP as client and put everything in Hadoop. This tutorial might help: http://www.lunchpauze.com/2007/10/writing-hadoop-mapreduce-program-in-php.html.
Another solution could be to use a Java implementation of a B-tree: http://jdbm.sourceforge.net/. JDBM is an object database that uses a B+tree data structure. You can then search from PHP by exposing the data through a web service, or by accessing it directly with Quercus.
Using web or CLI?
If you use the web, you could integrate that part in Quercus. Then you could use the advantages of Java multithreading.
I don't actually know how reliable Quercus is though. I'd also suggest using a kind of message queue and refactoring the code, so it doesn't need the scope.
Maybe you could restructure the code into a Map/Reduce pattern. You can then run the PHP code in Hadoop and spread the processing across a couple of machines.
I don't know if it's useful, but I came across another project, called Gearman. It's also used to cluster PHP processes. I guess you can combine that with a reduce script as well, if Hadoop is not the way you want to go.
pthreads
There is a rather new (since 2012) PHP extension available: pthreads. It can be installed via PECL.
Simple Implementation in PHP Code: extend from Thread Class. Add a run() method and execute the start() method.
<?php
// Example from http://www.phpgangsta.de/richtige-threads-in-php-einfach-erstellen-mit-pthreads
class AsyncOperation extends Thread
{
public function __construct($threadId)
{
$this->threadId = $threadId;
}
public function run()
{
printf("T %s: Sleeping 3sec\n", $this->threadId);
sleep(3);
printf("T %s: Hello World\n", $this->threadId);
}
}
$start = microtime(true);
for ($i = 1; $i <= 5; $i++) {
$t[$i] = new AsyncOperation($i);
$t[$i]->start();
}
echo microtime(true) - $start . "\n";
echo "end\n";
Outputs
>php pthreads.php
0.041301012039185
end
T 1: Sleeping 3sec
T 2: Sleeping 3sec
T 3: Sleeping 3sec
T 4: Sleeping 3sec
T 5: Sleeping 3sec
T 1: Hello World
T 2: Hello World
T 3: Hello World
T 4: Hello World
T 5: Hello World
Try this: PHPThreads
Code Example:
function threadproc($thread, $param) {
echo "\tI'm a PHPThread. In this example, I was given only one parameter: \"". print_r($param, true) ."\" to work with, but I can accept as many as you'd like!\n";
for ($i = 0; $i < 10; $i++) {
usleep(1000000);
echo "\tPHPThread working, very busy...\n";
}
return "I'm a return value!";
}
$thread_id = phpthread_create($thread, array(), "threadproc", null, array("123456"));
echo "I'm the main thread doing very important work!\n";
for ($n = 0; $n < 5; $n++) {
usleep(1000000);
echo "Main thread...working!\n";
}
echo "\nMain thread done working. Waiting on our PHPThread...\n";
phpthread_join($thread_id, $retval);
echo "\n\nOur PHPThread returned: " . print_r($retval, true) . "!\n";
Requires PHP extensions:
posix
pcntl
sockets
