Find out where your PHP code is slowing down (Performance Issue) - php

Here's my first question at SO.
I have an internal application at my company which I've recently been asked to maintain. The application is built in PHP and it's fairly well coded (OO, DB abstraction, Smarty), nothing WTF-ish.
The problem is that the application is very slow.
How do I go about finding out what's slowing the application down? I've optimized the code to make very few DB queries, so I know that it is the PHP code which is taking a while to execute. I need to get some tools which can help me with this and need to devise a strategy for checking my code.
I can do the checking/strategy work myself, but I need more PHP tools to figure out where my app is crapping up.
Thoughts?

I've used Xdebug profiling recently in a similar situation. It outputs a full profile report that can be read with many common profiling apps (can't give you a list though, I just used the one that came with Slackware).

As Juan mentioned, xDebug is excellent. If you're on Windows, WinCacheGrind will let you look over the reports.

Watch this presentation by Rasmus Lerdorf (creator of PHP). He goes into some good examples of testing PHP speed and what to look for as well as some internals that can slow things down. XDebug is one tool he uses. He also makes a very solid point about knowing what performance cost you're getting into with frameworks.
Video:
http://www.archive.org/details/simple_is_hard
Slides (since it's hard to see on the video):
http://talks.php.net/show/drupal08/1

There are many variables that can impact your application's performance. I recommend that you do not instantly assume PHP is the problem.
First, how are you serving PHP? Have you tried basic optimization of Apache or IIS itself? Is the server busy processing other kinds of requests? Have you taken advantage of a PHP code accelerator? One way to test whether the server is your bottleneck is to try running the application on another server.
Second, is performance of the entire application slow, or does it only seem to affect certain pages? This could give you an indication of where to start analyzing performance. If the entire application is slow, the problem is more likely in the underlying server/platform or with a global SQL query that is part of every request (user authentication, for example).
Third, you mentioned minimizing the number of SQL queries, but what about optimizing the existing queries? If you are using MySQL, are you taking advantage of the various strengths of each storage engine? Have you run EXPLAIN on your most important queries to make sure they are properly indexed? This is critical on queries that access big tables; the larger the dataset, the more you will notice the effects of poor indexing. Luckily, there are many articles such as this one which explain how to use EXPLAIN.
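If you want a quick way to check this from inside the application itself, here is a minimal sketch; the DSN, credentials and the orders table are placeholders for illustration, not part of the question:
<?php
// Hedged sketch: run EXPLAIN on a suspect query to see whether it uses an index.
// The DSN, credentials and the `orders` table are placeholders for illustration.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$sql  = 'SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC';
$stmt = $pdo->query('EXPLAIN ' . $sql);

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // Watch the "key" and "rows" columns: key = NULL usually means no index is used.
    print_r($row);
}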
Fourth, a common mistake is to assume that your database server will automatically use all of the resources available to the system. You should check to make sure you have explicitly allocated sufficient resources to your database application. In MySQL, for example, you'll want to add custom settings (in your my.cnf file) for things like key buffer, temp table size, thread concurrency, innodb buffer pool size, etc.
If you've double-checked all of the above and are still unable to find the bottleneck, a code profiler like Xdebug can definitely help. Personally, I prefer the Zend Studio profiler, but it may not be the best option unless you are already taking advantage of the rest of the Zend Platform stack. However, in my experience it is very rare that PHP itself is the root cause of slow performance. Often, a code profiler can help you determine with more precision which DB queries are to blame.

You could also use APD (Advanced PHP Debugger).
It's quite easy to get it working.
$ php apd-test.php
$ pprofp -l pprof.SOME_PID
Trace for /Users/martin/develop/php/apd-test/apd-test.php
Total Elapsed Time = 0.12
Total System Time = 0.01
Total User Time = 0.07
               Real          User        System             secs/   cumm
%Time   (excl/cumm)   (excl/cumm)   (excl/cumm)   Calls      call  s/call  Memory Usage  Name
----------------------------------------------------------------------------------------------
 71.3    0.06 0.06     0.05 0.05     0.01 0.01    10000    0.0000  0.0000             0  in_array
 27.3    0.02 0.09     0.02 0.07     0.00 0.01    10000    0.0000  0.0000             0  my_test_function
  1.5    0.03 0.03     0.00 0.00     0.00 0.00        1    0.0000  0.0000             0  apd_set_pprof_trace
  0.0    0.00 0.12     0.00 0.07     0.00 0.01        1    0.0000  0.0000             0  main
There is a nice tutorial on how to compile APD and do profiling with it: http://martinsikora.com/compiling-apd-for-php-54
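For reference, an apd-test.php that produces a trace like the one above could look roughly like this (a hedged sketch, not the tutorial's exact code):
<?php
// Hedged sketch of an apd-test.php producing a trace like the one above.
// Requires the APD extension; the dump file lands in the directory set by apd.dumpdir.
apd_set_pprof_trace();

function my_test_function($needle, $haystack)
{
    return in_array($needle, $haystack);
}

$haystack = range(1, 100);
for ($i = 0; $i < 10000; $i++) {
    my_test_function($i % 100, $haystack);
}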

phpED (http://www.nusphere.com/products/phped.htm) also offers great debugging and profiling, and the ability to add watches, breakpoints, etc in PHP code. The integrated profiler directly offers a time breakdown of each function call and class method from within the IDE. Browser plugins also enable quick integration with Firefox or IE (i.e. visit slow URL with browser, then click button to profile or debug).
It's been very useful in pointing out where the app is slow in order to concentrate most coding effort; and it avoids wasting time optimising already fast code. Having tried Zend and Eclipse, I've now been sold on the ease of use of phpED.
Bear in mind both Xdebug and phpED (with DBG) will require an extra PHP module installed when debugging against a webserver. phpED also offers (untried by me) a local debugging option too.

Xdebug profiling is definitely the way to go. Another tip: WinCacheGrind is good, but hasn't been updated recently. Webgrind (http://code.google.com/p/webgrind/), which runs in a browser, may be an easy and quick alternative.
Chances are though, it's still the database anyway. Check for relevant indexes - and that it has sufficient memory to cache as much of the working data as possible.

If it's a large code base, try APC if you're not using it already.
http://pecl.php.net/package/APC

You can also try using the register_tick_function function in PHP, which tells PHP to call a certain function periodically throughout your code. You could then keep track of which function is currently running and the amount of time between calls, and then see what's taking the most time.
http://www.php.net/register_tick_function
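A minimal sketch of the idea (the 50 ms threshold is arbitrary, just for illustration):
<?php
// Minimal sketch: warn whenever more than 50 ms pass between two ticks, which
// points at the block of code that was running in between.
declare(ticks=1);

$GLOBALS['last_tick'] = microtime(true);

function tick_handler()
{
    $now = microtime(true);
    $elapsed = $now - $GLOBALS['last_tick'];
    if ($elapsed > 0.05) {
        // Add debug_backtrace() here if you also want to know roughly where you were.
        error_log(sprintf('Slow section: %.4f seconds between ticks', $elapsed));
    }
    $GLOBALS['last_tick'] = $now;
}

register_tick_function('tick_handler');

// ... rest of the script runs with the tick handler active ...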

We use Zend Development Environment (windows). We resolved a memory usage spike yesterday by stepping through the debugger while running Process Explorer to watch the memory/cpu/disk activity as each line was executed.
Process Explorer: http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx.
ZDE includes a basic performance profiler that can show time spent in each function call during page requests.

I use a combination of PEAR Benchmark and log4php.
At the top of scripts I want to profile I create an object that wraps around a Benchmark_Timer object. Throughout the code, I add $object->setMarker("name"); calls, especially around suspect code.
The wrapper class has a destroy method that takes the logging information and writes it to log4php. I typically send this to syslog (many servers, aggregates to one log file on one server).
In debug, I can watch the log files and see where I need to improve things. Later on in production, I can parse the log files and do performance analysis.
It's not xdebug, but it's always on and gives me the ability to compare any two executions of the code.
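A rough sketch of such a wrapper (assuming PEAR Benchmark is installed; the log4php call is replaced with syslog() here just to keep the example self-contained):
<?php
// Rough sketch of the wrapper described above. Assumes PEAR Benchmark is installed;
// log4php is replaced by syslog() so the example stands on its own.
require_once 'Benchmark/Timer.php';

class RequestProfiler
{
    private $timer;

    public function __construct()
    {
        $this->timer = new Benchmark_Timer();
        $this->timer->start();
    }

    public function setMarker($name)
    {
        $this->timer->setMarker($name);
    }

    public function __destruct()
    {
        $this->timer->stop();
        // getProfiling() returns one entry per marker; 'diff' is the time since the previous marker.
        foreach ($this->timer->getProfiling() as $entry) {
            syslog(LOG_INFO, sprintf('%s took %ss', $entry['name'], $entry['diff']));
        }
    }
}

$profiler = new RequestProfiler();
$profiler->setMarker('after-db-connect');
// ... suspect code ...
$profiler->setMarker('after-render');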

You can also look at HAProxy or any other load-balancing solution if degraded server performance is the cause of the application's slow processing.

Related

debugging long running PHP script

I have a PHP script running as a cron job, extensively using third-party code. The script itself has a few thousand LOC. Basically it's a data import/processing script (JSON to MySQL, but it also makes a lot of HTTP calls and some SOAP).
Now, performance degrades over time. When testing with a few records (around 100), performance is OK; it finishes in 10-20 minutes. When running the whole import (about 1600 records), the mean import time per record grows steadily, and the whole thing takes more than 24 hours, so at least 5 times longer than expected.
Memory does not seem to be a problem; usage grows as it should, without unexpected peaks.
So, I need to debug it to find the bottleneck. It could be a problem with the script, the underlying code base, PHP itself, the database, the OS or the network. For now I suspect some kind of caching somewhere that is not behaving well, with a near 100% miss ratio.
I cannot use Xdebug; the profile file grows too fast to be workable.
So the question is: how can I debug this kind of script?
PHP version: 5.4.41
OS: Debian 7.8
I can have root privileges if necessary, and install the tools. But it's the production server, and ideally debugging should not be too disruptive.
Yes, it's possible. You can use Kint (a PHP debugging script).
What is it?
Kint for PHP is a tool designed to present your debugging data in the absolutely best way possible.
In other words, it's var_dump() and debug_backtrace() on steroids. Easy to use, but powerful and customizable. An essential addition to your development toolbox.
Still lost? You use it to see what's inside variables.
It acts as a debug_backtrace replacement, too.
You can download it here or here.
Full documentation and help is here.
Plus, it also supports almost all PHP frameworks:
CodeIgniter
Drupal
Symfony
Symfony 2
WordPress
Yii framework
Zend Framework
All the Best.... :)
There are three things that come to mind:
Set up an IDE so you can debug the PHP script line by line
Add some logging to the script
Look for long running queries in MySQL
Debug option #2 is the easiest. Since this is running as a cron job, you can add a bunch of echoes in your script:
<?php
function log_message($type, $message) {
    echo '[' . strtoupper($type) . ', ' . date('d-m-Y H:i:s') . '] ' . $message . "\n";
}
log_message('info', 'Import script started');
// ... the rest of your script
log_message('info', 'Import script finished');
Then pipe stdout to a log file in the cron job command.
01 04 * * * php /path/to/script.php >> /path/to/script.log
Now you can add log_message('info|warn|debug|error', 'Message here') all over the script and at least get an idea of where the performance issue lies.
Debug option #3 is just straight investigation work in MySQL. One of your queries might be taking a long time, and it might show up in a long running query utility for MySQL.
Profiling tool:
There is a PHP profiling tool called Blackfire which is currently in public beta. There is specific documentation on how to profile CLI applications. Once you have collected the profile, you can analyze the application control flow with time measurements in a nice UI.
Memory consumption suspicious:
Memory seems not to be a problem, usage growing as it should, without unexpected peaks.
A growing memory usage actually sounds suspicious! If the current dataset does not depend on all previous datasets of the import, then growing memory most probably means that all imported datasets are kept in memory, which is bad. PHP may also frequently try to garbage collect, just to find out that there is nothing to remove from memory. Long-running CLI tasks are especially affected, so be sure to read the blog post that discovered the behavior.
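To illustrate the point, a hedged sketch; fetch_record() and import_record() are hypothetical stand-ins for the script's own import steps:
<?php
// Hedged sketch: process records one at a time instead of accumulating them,
// and log memory usage periodically to see whether it keeps growing.
// fetch_record(), import_record() and $recordIds are hypothetical placeholders.
foreach ($recordIds as $i => $id) {
    $record = fetch_record($id);
    import_record($record);
    unset($record); // don't keep processed data around

    if ($i % 100 === 0) {
        gc_collect_cycles(); // only helps if there are circular references
        error_log(sprintf('Imported %d records, memory: %.1f MB',
            $i, memory_get_usage(true) / 1048576));
    }
}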
Use strace to see what the program is basically doing from the system perspective. Is it hanging in IO operations etc.? strace should be the first thing you try when encountering performance problems with whatever kind of Linux application. Nobody can hide from it! ;)
If you find out that the program hangs in network-related calls like connect, recvfrom and friends, meaning the network communication hangs at some point while connecting or waiting for responses, then you can use tcpdump to analyze this.
Using the above methods you should be able to find out most common performance problems. Note that you can even attach to a running task with strace using -p PID.
If the above methods don't help, I would profile the script using Xdebug. You can analyse the profiler output using tools like KCachegrind.
Although it is not stipulated, if my guess is correct you seem to be dealing with records one at a time, but all in one big cron.
i.e. Grab a record#1, munge it somehow, add value to it, reformat it then save it, then move to record#2
I would consider breaking the big cron down. ie
Cron#1: grab all the records, and cache all the salient data locally (to that server). Set a flag if this stage is achieved.
Cron #2: Now you have the data you need, munge and add value, cache that output. Set a flag if this stage is achieved.
Cron #3: Reformat that data and store it. Delete all the files.
This kind of "divide and conquer" will ease your debugging woes, and lead to a better understanding of what is actually going on, and as a bonus give you the opportunity to rerun say, cron 2.
I've had to do this many times, and for me logging is the key to identifying weaknesses in your code and poor assumptions about data quality, and it can hint at where latency is causing a problem.
I've run into strange slowdowns when doing network heavy efforts in the past. Basically, what I found was that during manual testing the system was very fast but when left to run unattended it would not get as much done as I had hoped.
In my case the issue I found was that I had default network timeouts in place and many web requests would simply time out.
In general, though not an external tool, you can use the difference between two microtime(TRUE) calls to time sections of code. To keep the logging small, set a flag limit and only test the time if the flag has not yet been decremented to zero, reducing it for each such event. You can have individual flags for individual code segments, or even different time limits within a code segment.
$flag['name'] = 10;   // How many times to fire
$slow['name'] = 0.5;  // How long in seconds before it's a problem?

$start = microtime(TRUE);
do_something($parameters);
$used = microtime(TRUE) - $start;

if ( $flag['name'] && $used >= $slow['name'] )
{
    logit($parameters);
    $flag['name']--;
}
If you output which URL, or other data/event, took too long to process, then you can dig into that particular item later to see how it is causing trouble in your code.
Of course, this assumes that individual items are causing your problem and not simply a general slowdown over time.
EDIT:
I (now) see it's a production server. This makes editing the code less enjoyable. You'd probably want to make integrating with the code a minimal process, keeping the testing logic, and possibly the supported tags/flags and quantities, in an external file.
setStart('flagname');
// Do stuff to be checked for speed here
setStop('flagname',$moredata);
For maximum robustness the methods/functions would have to ensure they handled unknown tags, missing parameters, and so forth.
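One possible (hedged) sketch of those helpers; the per-tag limits and thresholds are hard-coded here rather than read from an external file, just to keep it short:
<?php
// Hedged sketch of setStart()/setStop(). Per-tag fire limits and thresholds are
// hard-coded; in practice they would come from an external config file.
$GLOBALS['perf'] = array(
    'limits'     => array('flagname' => 10),   // how many times to log per tag
    'thresholds' => array('flagname' => 0.5),  // seconds before it's a problem
    'starts'     => array(),
);

function setStart($tag)
{
    $GLOBALS['perf']['starts'][$tag] = microtime(true);
}

function setStop($tag, $moredata = null)
{
    $perf = &$GLOBALS['perf'];
    if (!isset($perf['starts'][$tag], $perf['limits'][$tag], $perf['thresholds'][$tag])) {
        return; // unknown tag or never started: ignore quietly
    }
    $elapsed = microtime(true) - $perf['starts'][$tag];
    if ($perf['limits'][$tag] > 0 && $elapsed >= $perf['thresholds'][$tag]) {
        $perf['limits'][$tag]--;
        error_log(sprintf('[perf] %s took %.3fs %s', $tag, $elapsed,
            $moredata !== null ? json_encode($moredata) : ''));
    }
}

setStart('flagname');
// Do stuff to be checked for speed here
setStop('flagname', array('note' => 'which URL or record was being processed'));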
xdebug_print_function_stack is an option, but what you can also do is create a "function trace". There are three output formats: one is meant as a human-readable trace, another is more suited for computer programs as it is easier to parse, and the last one uses HTML for formatting the trace.
http://www.xdebug.org/docs/execution_trace
Okay, basically you have two possibilities: it's either ineffective PHP code or ineffective MySQL code. Judging by what you say, it's probably inserting a lot of records one at a time into an indexed table, which causes the insertion time to skyrocket. You should either disable the indexes and rebuild them after the insertion, or optimize the insertion code.
But, about the tools.
You can configure the system to automatically log slow MySQL queries:
https://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
You can also do the same with PHP scripts, but you need a PHP-FPM environment (and you probably have Apache).
https://rtcamp.com/tutorials/php/fpm-slow-log/
These tools are very powerful and versatile.
P.S. 10-20 minutes for 100 records seems like A LOT.
You can use https://github.com/jmartin82/phplapse to record the application activity for determinate time.
For example start recording after n iterations with:
phplapse_start();
And stop it in next iteration with:
phplapse_stop();
With this process you can create a snapshot of execution at the point where everything seems to work slowly.
(I'm the author of the project; don't hesitate to contact me to improve the functionality.)
I have a similar thing running each night (a cron job to update my database). I have found the most reliable way to debug is to set up a log table in the database and regularly insert / update a json string containing a multi-dimensional array with info about each record and whatever useful info you want to know about each record. This way if your cron job does not finish you still have detailed information about where it got up to and what happened along the way. Then you can write a simple page to pull out the json string, turn it back into an array and print useful data onto the page including timing and passed tests etc. When you see something as an issue you can concentrate on putting more info from that area into the json string.
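A hedged sketch of that logging idea; the cron_log table, its columns and the $records array are hypothetical, so adjust them to your own schema and import loop:
<?php
// Hedged sketch of logging import progress as JSON into a database table.
// The cron_log table and $records are hypothetical placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->exec("INSERT INTO cron_log (started_at, progress) VALUES (NOW(), '{}')");
$runId = $pdo->lastInsertId();

$progress = array();
foreach ($records as $i => $record) {
    $start = microtime(true);
    // ... import one record here ...
    $progress[] = array(
        'record'  => $i,
        'seconds' => round(microtime(true) - $start, 3),
    );

    // Update the same row regularly so a crash still leaves a useful trail.
    if ($i % 10 === 0) {
        $stmt = $pdo->prepare('UPDATE cron_log SET progress = ? WHERE id = ?');
        $stmt->execute(array(json_encode($progress), $runId));
    }
}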
Regular "top" command can show you, if CPU usage by php or mysql is bottleneck. If not, then delays may be caused by http calls.
If CPU usage by mysqld is low, but constant, then it may be disk usage bottleneck.
Also, you can check your bandwidth usage by installing and using "speedometer", or other tools.

measuring php performance

I'm trying to track down issues with an application [modx]. I have several of these sites [about 10] on my server and was wondering how I can see what PHP is doing.
Pages on these sites are extremely slow, while the same sites in dev are fine, as are other PHP applications on the server.
I tried using Xdebug to get an idea of what PHP was doing while processing these pages and where the bottleneck was occurring, but it only appeared to do anything on an error [there are no errors being thrown].
Any suggestions on how to track this down?
[linux/Centos5/php5.2.?/apache2]
Xdebug and Webgrind are a nice way to see where your bottlenecks are...
Read up on XDEBUG_PROFILE and Webgrind.
Set up php.ini to have Xdebug profile your code on every run, or only when a special param is passed, then set up Webgrind to read from the same directory Xdebug writes its profile dumps to.
Webgrind will show you what functions and set of functions require the most time, it breaks it down and makes it easy to find slow and/or inefficient code. (eg. your script is calling "PDOStatement->execute" 300 times on a fast query [Or calling it once and a massively slow one] taking up 90% of the execution time).
The most commonly used tool, for finding bottlenecks in PHP, would be Xdebug. But you should also manually examine the codebase.
There are three different areas where you will have to focus on:
frontend performance
SQL queries
php logic itself
.. and the impact on the perceived speed is in this order.
You should start by running ySlow, and make sure that your site follows the guidelines as closely as possible.
The next step would be tracking down what SQL queries are executed, and (assuming you are using mysql) try to run them with EXPLAIN. Also, check the queries themselves. There might be some extremely stupid code there, like ORDER BY RAND() or use of LIKE in huge tables.
And the last stage of fixing it all would be a hard look at the code itself, on both the PHP and JavaScript side of things.
Also, you should upgrade to PHP 5.3, because your version is extremely outdated.
Usually when you don't know what you're looking for, you cannot spot it with tools like Xdebug or the plugins/debug bars built into a CMS/framework. New Relic is the simplest solution -- you'll be able to spot bottlenecks after a few minutes.
While New Relic is a paid app, you can test it for free for the first 14 days -- that's more than enough to find the problem.
It's great because it integrates all the other tools and data sources you usually use:
Xdebug, CPU & I/O monitoring, the MySQL slow log, query logs.
It will also show you whether your app is slow in PHP, the DB, the frontend or the network.
You should try it out instead of wasting time debugging with other tools.
Here is a guide for CentOS installation: https://newrelic.com/docs/php/php-agent-installation-redhat-and-centos

PHP Performance Metrics

I am currently developing a PHP MVC framework for a personal project. While developing the framework I am interested to see whether implementing different optimization techniques makes any notable difference. I have implemented a crude BenchMark class that logs microtime.
The problem is I have no frame of reference for execution times. I am very near the beginning of this project, with a database connection and a few queries but no output (bar some debugging text and the BenchMark log). I have a current execution time of 0.01917 seconds.
I was expecting this to be lower, but as I said before I have no frame of reference. I appreciate there are many variables to take into account when judging performance, but I am hoping to find some sort of metric for:
a) techniques to measure performance, for example requests per second, and
b) comparing results, for example how a "moderately" sized PHP application on a "standard" webserver will perform. I appreciate "moderately" and "standard" are very subjective words, so perhaps a table of known execution times for a particular application (e.g. StackOverflow's execution time).
What other techniques of measuring performance are there besides execution time?
When looking at MVC framework performance comparisons, they talk about Requests Per Second (RPS). How is this calculated? I am guessing that with my current execution time of 0.01917 seconds I can handle 52 RPS (= 1 / 0.01917). This seems significantly lower than what is quoted on the graph, especially when you consider my current limited functionality.
To benchmark a certain page, use ab. To benchmark a load of pages on the server, try siege.
However... both of those are still mostly artificial tests. I personally add some extra logs too.
Page load time in the webserver (or proxy, whatever)
Slow query logging in the database
Query count logging per page too if possible, that way you'll know how heavy your pages are ;)
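For that last point, a rough sketch of per-page query counting; run_query() is an illustrative helper name, not an existing API:
<?php
// Rough sketch: route all queries through one helper and report the total
// when the request ends. run_query() is a hypothetical wrapper.
$GLOBALS['query_count'] = 0;

function run_query(PDO $pdo, $sql, array $params = array())
{
    $GLOBALS['query_count']++;
    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);
    return $stmt;
}

register_shutdown_function(function () {
    $page = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : 'cli';
    error_log(sprintf('%s ran %d queries', $page, $GLOBALS['query_count']));
});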
You can use xdebug to profile your code. But you are optimizing way too early in the development process. Just the act of calling microtime is going to slow things down since it has to make a call out to the system (outside the PHP engine). Every include, object creation, connection to another resource (i.e. database) is going to add a lot of overhead, relatively speaking.
If you design your system to be very cache friendly, then you don't have to execute code. For example, WordPress is very slow. About 15 pages/sec on a decent web server. It does a lot of includes and runs a lot of code. But add the SuperCache plugin and performance increases 10x. It works by creating a cache file and using some Apache rules so PHP doesn't have to be run at all.

CPU load with file_exists in php

I own a site with high CPU load from httpd and a lot of requests per minute. I've noticed I use "file_exists" on each httpd request. Is that function too heavy?
This function will only check whether a file exists -- which means an access to the disk (which might take a little time, but not that much either).
Considering your application is probably made of dozens (if not hundreds) of PHP files, which all have to be read for each request, I don't think one file_exists makes any difference.
(Well, at least as long as you are checking for a file on a local disk -- not going through a network drive or anything like that.)
As a sidenote: if you want to identify where CPU is spent in your PHP scripts, you might be interested in the Xdebug extension, which provides profiling functionality.
You can read this answer I gave some time ago, which is quite long : How can I measure the speed of code written in php? -- I won't copy-paste it here.
You might also want to read my answer to that question (there is a section where I wrote about Xdebug and profiling) : Optimizing Kohana-based Websites for Speed and Scalability
Being realistic, playing 'guess the bottleneck' is likely to be a pretty fruitless task - I'd recommend using a profiler, such as the one built into Zend Studio.
file_exists is typically very cheap, especially since the result is cached in PHP's stat cache. Areas like heavy DB use tend to be the largest consumers of CPU.
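A quick (hedged) way to convince yourself of the stat cache effect with microtime:
<?php
// Quick micro-benchmark: repeated file_exists() calls on the same path are served
// from PHP's stat cache, so only the first call really touches the disk.
$path = __FILE__;

$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    file_exists($path);
}
printf("100000 (mostly cached) checks: %.4f seconds\n", microtime(true) - $start);

// If you ever need to re-check the real state of the file, clear the cache first.
clearstatcache();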
try some profiling to determine what part of your app is using up the most time, some examples here:
http://www.ibm.com/developerworks/opensource/library/os-php-fastapps2/

Optimizing Kohana-based Websites for Speed and Scalability

A site I built with Kohana was slammed with an enormous amount of traffic yesterday, causing me to take a step back and evaluate some of the design. I'm curious what are some standard techniques for optimizing Kohana-based applications?
I'm interested in benchmarking as well. Do I need to setup Benchmark::start() and Benchmark::stop() for each controller-method in order to see execution times for all pages, or am I able to apply benchmarking globally and quickly?
I will be using the Cache-library more in time to come, but I am open to more suggestions as I'm sure there's a lot I can do that I'm simply not aware of at the moment.
What I will say in this answer is not specific to Kohana, and can probably apply to lots of PHP projects.
Here are some points that come to my mind when talking about performance, scalability, PHP, ...
I've used many of those ideas while working on several projects -- and they helped; so they could probably help here too.
First of all, when it comes to performance, there are many aspects/questions to consider:
configuration of the server (both Apache, PHP, MySQL, other possible daemons, and system); you might get more help about that on ServerFault, I suppose,
PHP code,
Database queries,
Whether requests even need to reach your webserver at all?
Can you use any kind of caching mechanism? Or do you always need fully up-to-date data on the website?
Using a reverse proxy
The first thing that could be really useful is using a reverse proxy, like varnish, in front of your webserver: let it cache as many things as possible, so only requests that really need PHP/MySQL calculations (and, of course, some other requests, when they are not in the cache of the proxy) make it to Apache/PHP/MySQL.
First of all, your CSS/Javascript/Images -- well, everything that is static -- probably don't need to be always served by Apache
So, you can have the reverse proxy cache all those.
Serving those static files is no big deal for Apache, but the less it has to work for those, the more it will be able to do with PHP.
Remember: Apache can only serve a finite, limited number of requests at a time.
Then, have the reverse proxy serve as many PHP-pages as possible from cache: there are probably some pages that don't change that often, and could be served from cache. Instead of using some PHP-based cache, why not let another, lighter, server serve those (and fetch them from the PHP server from time to time, so they are always almost up to date)?
For instance, if you have some RSS feeds (we generally tend to forget those when trying to optimize for performance) that are requested very often, having them in cache for a couple of minutes could save hundreds/thousands of requests to Apache+PHP+MySQL!
Same for the most visited pages of your site, if they don't change for at least a couple of minutes (example: homepage?), then, no need to waste CPU re-generating them each time a user requests them.
Maybe there is a difference between pages served for anonymous users (the same page for all anonymous users) and pages served for identified users ("Hello Mr X, you have new messages", for instance)?
If so, you can probably configure the reverse proxy to cache the page that is served for anonymous users (based on a cookie, like the session cookie, typically)
It'll mean that Apache+PHP has less to deal with: only identified users -- which might be only a small part of your users.
About using a reverse-proxy as cache, for a PHP application, you can, for instance, take a look at Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
(Yep, they are using Squid, and I was talking about varnish -- that's just another possibility ^^ Varnish being more recent, but more dedicated to caching)
If you do that well enough, and manage to stop re-generating too many pages again and again, maybe you won't even have to optimize any of your code ;-)
At least, maybe not in any kind of rush... And it's always better to perform optimizations when you are not under too much pressure...
As a sidenote: you are saying in the OP:
A site I built with Kohana was slammed with
an enormous amount of traffic yesterday,
This is the kind of sudden situation where a reverse-proxy can literally save the day, if your website can deal with not being up to date by the second:
install it, configure it, let it always -- every normal day -- run:
Configure it to not keep PHP pages in cache; or only for a short duration; this way, you always have up to date data displayed
And, the day you take a slashdot or digg effect:
Configure the reverse proxy to keep PHP pages in cache; or for a longer period of time; maybe your pages will not be up to date by the second, but it will allow your website to survive the digg-effect!
About that, How can I detect and survive being “Slashdotted”? might be an interesting read.
On the PHP side of things:
First of all: are you using a recent version of PHP? There are regularly improvements in speed, with new versions ;-)
For instance, take a look at Benchmark of PHP Branches 3.0 through 5.3-CVS.
Note that performance is quite a good reason to use PHP 5.3 (I've made some benchmarks (in French), and results are great)...
Another pretty good reason being, of course, that PHP 5.2 has reached its end of life, and is not maintained anymore!
Are you using any opcode cache?
I'm thinking about APC - Alternative PHP Cache, for instance (pecl, manual), which is the solution I've seen used the most -- and that is used on all servers on which I've worked.
See also: Slides APC Facebook,
Or Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
It can really lower the CPU-load of a server a lot, in some cases (I've seen CPU-load on some servers go from 80% to 40%, just by installing APC and activating its opcode-cache functionality!)
Basically, execution of a PHP script goes in two steps:
Compilation of the PHP source code to opcodes (kind of an equivalent of Java's bytecode)
Execution of those opcodes
APC keeps those in memory, so there is less work to be done each time a PHP script/file is executed: only fetch the opcodes from RAM, and execute them.
You might need to take a look at APC's configuration options, by the way
there are quite a few of those, and some can have a great impact on both speed / CPU-load / ease of use for you
For instance, disabling apc.stat (https://php.net/manual/en/apc.configuration.php#ini.apc.stat) can be good for system load; but it means modifications made to PHP files won't be taken into account unless you flush the whole opcode cache; about that, for more details, see for instance To stat() Or Not To stat()?
Using cache for data
As much as possible, it is better to avoid doing the same thing over and over again.
The main thing I'm thinking about is, of course, SQL queries: many of your pages probably do the same queries, and the results of some of those are probably almost always the same... Which means lots of "useless" queries made to the database, which has to spend time serving the same data over and over again.
Of course, this is true for other stuff, like Web Services calls, fetching information from other websites, heavy calculations, ...
It might be very interesting for you to identify:
Which queries are run lots of times, always returning the same data
Which other (heavy) calculations are done lots of times, always returning the same result
And store these data/results in some kind of cache, so they are easier to get -- faster -- and you don't have to go to your SQL server for "nothing".
Great caching mechanisms are, for instance:
APC: in addition to the opcode-cache I talked about earlier, it allows you to store data in memory,
And/or memcached (see also), which is very useful if you literally have lots of data and/or are using multiple servers, as it is distributed.
of course, you can think about files; and probably many other ideas.
I'm pretty sure your framework comes with some cache-related stuff; you probably already know that, as you said "I will be using the Cache-library more in time to come" in the OP ;-)
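As a concrete (hedged) example of the data-cache idea with APC's user cache -- the table, key and TTL below are arbitrary illustrations:
<?php
// Hedged sketch: cache the result of an expensive, rarely-changing query with
// APC's user cache. Requires the APC extension; the TTL and table are arbitrary.
function get_top_articles(PDO $pdo)
{
    $key = 'top_articles';
    $cached = apc_fetch($key, $success);
    if ($success) {
        return $cached;
    }

    // Hypothetical expensive query whose result rarely changes.
    $rows = $pdo->query('SELECT id, title FROM articles ORDER BY views DESC LIMIT 10')
                ->fetchAll(PDO::FETCH_ASSOC);

    apc_store($key, $rows, 300); // keep it for 5 minutes
    return $rows;
}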
Profiling
Now, a nice thing to do would be to use the Xdebug extension to profile your application: it often allows you to find a couple of weak spots quite easily -- at least, if there is any function that takes a lot of time.
Configured properly, it will generate profiling files that can be analysed with some graphic tools, such as:
KCachegrind: my favorite, but works only on Linux/KDE
WinCacheGrind for Windows; it does a bit less than KCacheGrind, unfortunately -- it doesn't display callgraphs, typically.
Webgrind which runs on a PHP webserver, so works anywhere -- but probably has less features.
For instance, here are a couple screenshots of KCacheGrind:
(KCacheGrind screenshots omitted; source: pascal-martin.fr)
(BTW, the callgraph presented on the second screenshot is typically something neither WinCacheGrind nor Webgrind can do, if I remember correctly ^^ )
(Thanks #Mikushi for the comment) Another possibility that I haven't used much is the xhprof extension: it also helps with profiling and can generate callgraphs, but is lighter than Xdebug, which means you should be able to install it on a production server.
You should be able to use it alongside XHGui, which will help with the visualisation of the data.
On the SQL side of things:
Now that we've spoken a bit about PHP, note that it is more than possible that your bottleneck isn't the PHP-side of things, but the database one...
At least two or three things, here:
You should determine:
What are the most frequent queries your application is doing
Whether those are optimized (using the right indexes, mainly?), using the EXPLAIN instruction, if you are using MySQL
See also: Optimizing SELECT and Other Statements
You can, for instance, activate log_slow_queries to get a list of the requests that take "too much" time, and start your optimization with those.
whether you could cache some of these queries (see what I said earlier)
Is your MySQL well configured? I don't know much about that, but there are some configuration options that might have some impact.
Optimizing the MySQL Server might give you some interesting information about that.
Still, the two most important things are:
Don't go to the DB if you don't need to: cache as much as you can!
When you have to go to the DB, use efficient queries: use indexes; and profile!
And what now?
If you are still reading, what else could be optimized?
Well, there is still room for improvements... A couple of architecture-oriented ideas might be:
Switch to an n-tier architecture:
Put MySQL on another server (2-tier: one for PHP; the other for MySQL)
Use several PHP servers (and load-balance the users between those)
Use other machines for static files, with a lighter webserver, like:
lighttpd
or nginx -- this one is becoming more and more popular, btw.
Use several servers for MySQL, several servers for PHP, and several reverse-proxies in front of those
Of course: install memcached daemons on whatever server has any amount of free RAM, and use them to cache as much as you can / makes sense.
Use something "more efficient" that Apache?
I hear more and more often about nginx, which is supposed to be great when it comes to PHP and high-volume websites; I've never used it myself, but you might find some interesting articles about it on the net;
for instance, PHP performance III -- Running nginx.
See also: PHP-FPM - FastCGI Process Manager, which is bundled with PHP >= 5.3.3, and does wonders with nginx.
Well, maybe some of those ideas are a bit overkill in your situation ^^
But, still... Why not study them a bit, just in case ? ;-)
And what about Kohana?
Your initial question was about optimizing an application that uses Kohana... Well, I've posted some ideas that are true for any PHP application... Which means they are true for Kohana too ;-)
(Even if not specific to it ^^)
I said: use cache; Kohana seems to support some caching stuff (You talked about it yourself, so nothing new here...)
If there is anything that can be done quickly, try it ;-)
I also said you shouldn't do anything that's not necessary; is there anything enabled by default in Kohana that you don't need?
Browsing the net, it seems there is at least something about XSS filtering; do you need that?
Still, here's a couple of links that might be useful:
Kohana General Discussion: Caching?
Community Support: Web Site Optimization: Maximum Website Performance using Kohana
Conclusion?
And, to conclude, a simple thought:
How much will it cost your company to pay you 5 days? -- considering it is a reasonable amount of time to do some great optimizations
How much will it cost your company to buy (pay for?) a second server, and its maintenance?
What if you have to scale larger?
How much will it cost to spend 10 days? more? optimizing every possible bit of your application?
And how much for a couple more servers?
I'm not saying you shouldn't optimize: you definitely should!
But go for "quick" optimizations that will get you big rewards first: using some opcode cache might help you get between 10 and 50 percent off your server's CPU-load... And it takes only a couple of minutes to set up ;-) On the other side, spending 3 days for 2 percent...
Oh, and, btw: before doing anything: put some monitoring stuff in place, so you know what improvements have been made, and how!
Without monitoring, you will have no idea of the effect of what you did... Not even if it's a real optimization or not!
For instance, you could use something like RRDtool + cacti.
And showing your boss some nice graphics with a 40% CPU-load drop is always great ;-)
Anyway, and to really conclude: have fun!
(Yes, optimizing is fun!)
(Ergh, I didn't think I would write that much... Hope at least some parts of this are useful... And I should remember this answer: might be useful some other times...)
Use Xdebug and WinCacheGrind or Webgrind to profile and analyze slow code execution.
(screenshot omitted; source: jokke.dk)
Profile code with XDebug.
Use a lot of caching. If your pages are relatively static, then reverse proxy might be the best way to do it.
Kohana is very, very fast out of the box, except for the use of database objects. To quote Zombor: "You can reduce memory usage by ensuring you are using the database result object instead of result arrays." This makes a HUGE performance difference on a site that is being slammed. Not only do result arrays use more memory, they slow down execution of scripts.
Also - you must use caching. I prefer memcache and use it in my models like this:
public function get($e_id)
{
    $event_data = $this->cache->get('event_get_'.$e_id.Kohana::config('config.site_domain'));

    if ($event_data === NULL)
    {
        $this->db_slave
            ->select('e_id,e_name')
            ->from('Events')
            ->where('e_id', $e_id);

        $result = $this->db_slave->get();

        $event_data = ($result->count() == 1) ? $result->current() : FALSE;

        $this->cache->set('event_get_'.$e_id.Kohana::config('config.site_domain'), $event_data, NULL, 300); // 5 minutes
    }

    return $event_data;
}
This will also dramatically increase performance. The above two techniques improved a site's performance by 80%.
If you gave some more information about where you think the bottleneck is, I'm sure we could give some better ideas.
Also check out yslow (google it) for some other performance tips.
Strictly related to Kohana (you probably already have done this, or not):
In production mode:
Enable internal caching (this will only cache the Kohana::find_file results, but this actually can help a lot).
Disable profiler
Just my 2 cents :)
I totally agree with the XDebug and caching answers. Don't look into the Kohana layer for optimization until you've identified your biggest speed and scale bottlenecks.
XDebug will tell you where you spend most of your time and identify 'hotspots' in your code. Keep this profiling information so you can baseline and measure performance improvements.
Example problem and solution:
If you find that you're building up expensive objects from the database each time, that don't really change often, then you can look at caching them with memcached or another mechanism. All of these performance fixes take time and add complexity to your system, so be sure of your bottlenecks before you start fixing them.
