I have built a site with Drupal.
The site has 7,500 registered users, and at any given time roughly 20 to 50 anonymous and 2 to 10 logged-in users are online (which I don't think is heavy traffic).
The site is on a dedicated server. I have enabled the caching settings on Drupal's admin performance page and also installed memcache and eAccelerator.
I looked at the query log using the Devel module: the site fires 600 to 900 queries in total on each page.
After installing the path.inc patch that reduces the queries from drupal_lookup_path(), the count dropped to around 400.
I have also made some improvements to the MySQL configuration (my.cnf), but the same queries from user_load() still run again and again.
I have 60 to 70 modules enabled and all of them are useful; I can't remove them.
The site is still slow, taking approximately 10 to 15 seconds per page.
I don't know why the site is running so slowly.
Is it because Drupal's PHP codebase is so large?
Is it because it is firing so many queries on each page?
Does the InnoDB engine improve the performance?
Please, any kind of suggestion is welcome
400 queries per request is suicide (even 50+ is a lot).
You should implement some kind of HTML cache. My website generally doesn't even open a DB connection; it just serves the HTML cached in a file.
Some additional things to look into:
Install a tool like YSlow or PageSpeed to see how much of those 10-15 seconds is client time and how much is server time.
Install XHProf (on a development site, not live) together with Devel to see which functions use the most time. Look into these first. Edit, now with link: http://groups.drupal.org/node/82889
Using Pressflow might help a bit, but since you are already using the path.inc patch, probably not so much.
You mentioned that you installed memcache. Did you also install the memcache module and configure the cache plugin to use memcache?
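If not, it usually only takes a couple of lines in settings.php. A minimal sketch, assuming Drupal 6 with the memcache module unpacked in sites/all/modules/memcache and a single memcached instance on the default port (adjust paths and servers to your setup):

// settings.php
$conf['cache_inc'] = './sites/all/modules/memcache/memcache.inc'; // route Drupal's cache API through memcache
$conf['memcache_servers'] = array('127.0.0.1:11211' => 'default'); // hypothetical single local memcached
$conf['memcache_key_prefix'] = 'mysite'; // avoids key collisions if several sites share the daemon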
EDIT: Yes, switching to InnoDB can help. One of the main performance advantages of InnoDB is row-level locking (as opposed to table-level locking of MyISAM), which means that multiple INSERT/UPDATE queries against the same table won't block each other unless really necessary. However, InnoDB does not perform well at all out of the box, you really need to fine-tune your mysql configuration for your specific site. So this is a step that you should only take carefully and after testing and optimizing on a development site. There are various questions already on this site and elsewhere about InnoDB tuning...
Anything beyond that is site-specific and depends on the modules you are using. But especially things like complex node_access setups and multiple languages (i18n!) tend to cause slow queries and/or a lot of them.
Not all modules make use of the caching mechanisms you can switch on in the performance settings area. It would be worth trying to identify which ones are doing the most/slowest queries and attempting to get the developer(s) to improve them.
Alternatively, examine whether you could achieve things with fewer modules. Some modules do overlap somewhat in functionality, so you may be able to reorganise the way the site functions a bit.
Additionally, you need to look at whether your MySQL settings allow enough memory for these queries to be carried out. Most MySQL distributions come with sample configuration files labelled 'small', 'medium', 'huge', etc. Copy the 'huge' one over your my.cnf (back up the old one first) and restart the DB to see if maxing out all the cache sizes makes a difference. You may well have a bottleneck there, but it can be hard to work out which setting is causing it.
The same goes for PHP. Set memory_limit in php.ini to 500M or so and see if it helps. Of course, you may not be able to do this, depending on your hosting arrangements, but if you can, it will eliminate (or confirm) one possible cause.
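One way to check whether you are actually anywhere near that limit, rather than guessing, is to log the peak memory of a request; a tiny sketch using only core PHP functions (for debugging only):

// Append to index.php (or a small test script) to see how close requests get to the limit
function _log_peak_memory() {
    error_log(sprintf('Peak memory: %.1f MB of %s limit',
        memory_get_peak_usage(TRUE) / 1048576, ini_get('memory_limit')));
}
register_shutdown_function('_log_peak_memory');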
Performance of your Drupal website also depends on how well your hosting platform is tuned for Drupal. Drupal requires special optimization of LAMP stack components. You can try Drupal-specific hosting companies http://www.drupalspecific.com to make your website run faster.
I'm facing the Drupal slowness issue myself, but with a very different problem than the others mentioned.
I disabled all the content and also the Drupal header for a page of a specific content type.
Still, the page takes over 20 seconds to load!
I used the YSlow and Net panels in Firebug.
Looking at them, I noticed:
Each JS and CSS file inclusion takes 2 to 3 seconds, and there is a fair number of inclusions, so as a result the page takes about 20 seconds.
But I am not able to figure out why the JS and CSS inclusions take so much time (this includes the normal Drupal core JS and CSS files as well).
The question, simply put, is the one in the title. Is it possible?
So far, my experience with scripting languages is that, to increase performance, you need to cache everything and later just serve the generated HTML files.
That's ok for some use cases, but when you really need to generate a new page in realtime, it's just impossible.
Drupal can take up to 3 seconds (or more!) to render some web pages (PHP execution time, not DB). That's crazy. Completely crazy.
If many projects (like Facebook) are using PHP, obviously the problem is mine. But googling for this problem shows that it's common. Too common.
(Of course I installed APC for PHP. It certainly helps, but PHP is still ultra-slow).
Must I assume this is the reality for Drupal / PHP?
Thanks.
Short answer is no. But why would you not want to cache?
What do you mean by 'generate a new page in realtime'? Authenticated users (anyone logged in) can see new content right away. Anonymous users may have to wait a little bit (if you are using Boost, for example), BUT, you can always control it, or flush it when new content is added. You should cache as much as you can.
You can install Boost (static HTML files), Memcache, and enable Drupal cache. It's encouraged, especially the last one. You can also run nginx on the server.
You can also try using Pressflow, a drop-in replacement for Drupal that will give you better performance.
http://pressflow.org/
It's been discussed many times: you can make Drupal extremely fast if you want to. Check out some of the 2bits articles:
http://2bits.com/contents/articles
Utilizing the available methods of caching will help you keep your hosting costs low, instead of throwing more hardware at an unoptimized site.
As you say, Facebook uses PHP, and they clearly have reason to need good performance. Their solution was to write their own compiler for PHP called HipHop, which they released as open source. If you're worried about PHP's performance, you should give it a try as it will definitely improve things.
The downside is that it doesn't (yet) cover 100% of the PHP function set, so some PHP programs may not compile. I don't know where Drupal fits into this, but it would be worth trying it out - there's nothing to be lost by doing a test compilation; if it's not going to work, you won't have lost anything.
On a similar vein, there is a project in the Drupal community to convert parts of the Drupal Core into a PHP Extension, meaning that some key Drupal functions are then built-in to the PHP runtime as compiled code. See the project page here. But note that this is still in a fairly early stage of development: it's still listed as experimental, and only covers a small number of functions. It might be worth keeping an eye on the project, though.
According to http://groups.drupal.org/node/34076, yes you can get a < 200ms response time with Drupal without caching.
A tip I've received from some friends regarding Drupal load performance is to install fewer than 40 modules.
With more than 40, especially if those contrib modules use a lot of hooks and memory, performance will decrease.
Other tips:
remove Imagecache UI and Views UI on the production site
if possible, move the .htaccess rules into vhost.conf so they are only read once, at Apache start
use the Throttle module
use gzip for all HTML, CSS and JS files
use the CDN module and an Amazon-based solution
use AJAX for some parts or blocks of your site
last, and if there is enough budget, migrate to Oracle
I've been writing PHP for years, and have used every framework under the sun, but one thing has always bugged me... and that's that the whole bloody thing has to be interpreted and executed every time someone tells my server they want the page served.
I've experimented with caching, FastCGI, the Zend Job Queue (and symfony plug-ins that do similar - as well as my own DB-based solutions that implement the System_Daemon class to run background processes) and I've managed to make my apps fairly quick using all that stuff... but I can't get over the mental block that my settings files, system/environment check functions, and all the stuff that should only really be loaded ONCE... loads every darn time someone hits my page.
So, my ramble leads to the following Q--
Is there some method/technique for loading certain aspects of PHP into RAM so that when that page is requested, all my settings.yml files, system checks, framework files, cached pages etc can be loaded directly from memory without ever even touching the HD... or needing to go through the same loading mechanism 50,000 times per day to init the program?
If there's nothing in PHP... are there any other 'web' languages that can be compiled in this way, to allow for true init-once apps?
I think you should give memcached a try if you're talking about caching data. I think PHP is fairly proficient at caching compiled PHP pages if you use something like mod_php in Apache (which doesn't die between requests).
Take a look at APC (Alternative PHP Cache): it keeps a cache of compiled files (PHP opcodes) and also lets you store arbitrary variables in memory with apc_fetch and apc_store.
The installation is very simple and it really gives a boost in performance.
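A minimal sketch of the data-caching side (the opcode cache itself needs no code changes); apc_fetch/apc_store are the real APC calls, while the settings file is just a hypothetical example of something expensive you only want to parse once:

$settings = apc_fetch('app_settings');          // try shared memory first
if ($settings === FALSE) {
    // cache miss: do the expensive work once (parsing a config file, a heavy query, ...)
    $settings = parse_ini_file('/path/to/settings.ini', TRUE); // hypothetical config file
    apc_store('app_settings', $settings, 300);  // keep it for 5 minutes
}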
Create a full-page cache on a RAM disk and make your web server serve the page from there. This is the method the WordPress Super Cache plugin uses, and it works great if your website is suitable for full-page caching. This way you are not even invoking the PHP interpreter.
For users that are logged in (have an open session) you can create a rewrite condition that will redirect their request to the PHP engine.
Also, always use an opcode cache like APC and use it for caching config files (memcache is also fine).
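For illustration, here is a rough sketch of that full-page-cache idea expressed in PHP itself (in a real setup the rewrite rules would serve the cached file before PHP even starts; the /dev/shm path and the 5-minute lifetime are just assumptions):

// Very top of index.php -- serve anonymous requests straight from a RAM-disk cache
$is_anonymous = empty($_COOKIE[session_name()]);
$cache_file = '/dev/shm/pagecache/' . md5($_SERVER['REQUEST_URI']) . '.html';

if ($is_anonymous && is_file($cache_file) && filemtime($cache_file) > time() - 300) {
    readfile($cache_file);   // cached copy is fresh enough: skip the framework entirely
    exit;
}

ob_start();
// ... normal application bootstrap and page rendering here ...
$html = ob_get_flush();

if ($is_anonymous) {
    @mkdir(dirname($cache_file), 0755, TRUE);
    file_put_contents($cache_file, $html);   // refresh the cached copy for the next visitor
}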
If you are asking for a JVM/Tomcat-like application server, then the answer is likely no. To my knowledge nothing (usable) like this exists for PHP. PHP uses a shared-nothing architecture, so by design everything is set up on every request. But actually, this makes PHP scale pretty well.
As for speeding up your apps, try to use memcached and a code accelerator. Maybe look into Zend Server to get a complete package.
Regarding your last question, I believe at least most of the Python and Ruby web frameworks work like that.
Ruby web applications are nowadays built so that the app is only initialized once per server process. When requests come in, the server (Apache, for example) passes them to the web application (over Rack interface) which is running on the background.
This is how web frameworks based on Rack work. Older versions of Ruby on Rails were similar, although they used a different interface to talk to the web server.
I'd keep an eye on the Facebook Engineering page (http://www.facebook.com/notes.php?id=9445547199); every now and then they come up with posts about how they keep things fast/optimize/scale. I think their use of PHP is super impressive.
A friend has recommended that I install PHP APC, claiming it will help PHP run faster and use less memory.
It sounds promising, but I'm a little nervous about adding it to my VPS.
I have one small app that I've built using CodeIgniter, and several sites that use the popular SlideShowPro photo gallery software.
Could installing this break any of the back-end code on my sites?
I'm not a high-tech server guy, but should I give this a try?
Depends entirely on your situation.
Is your site unresponsive or slow at the moment? Is this definitely due to the PHP scripts and not any other data sources such as a database or remote API?
If you answered yes to the above, then installing one of the many PHP accelerators out there would be a good shout. As for using less memory, that's largely dependent on your apache/lighttpd/nginx config and php.ini variables.
Most PHP accelerators work by converting the (to-be-)interpreted PHP code into opcodes, which are then stored in memory (RAM) for fast access. If you haven't already implemented file-based caching in CodeIgniter, the benefits of installing a PHP accelerator would be noticeable -- but I suggest you implement that caching first, before moving straight over to (wasting?) spending time trying to install APC manually.
If your site is currently performing well and you're not too confident in your *nix skills then I suggest you try implementing CodeIgniter caching first rather than try messing with what is an already working VPS.
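For reference, CodeIgniter's built-in output caching is a one-liner inside a controller method (the controller and view names below are just placeholders; the argument to cache() is in minutes):

class Welcome extends Controller {        // CodeIgniter 1.x-style controller, just an example
    function index() {
        $this->output->cache(10);         // cache the rendered page for 10 minutes
        $this->load->view('welcome_message');
    }
}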
My personal preference is PHP eAccelerator.
Should installing a PHP cache engine not improve your site's performance then I suggest you look at what other factors influence your application. As stated above, these could be: database or API to name a few.
Hope this helps.
APC is basically a cache engine that stores your compiled PHP scripts in a temporary location on your server, meaning that these do not have to be interpreted every time someone calls your script. It is a PHP extension that can safely be turned ON or OFF and it does not affect your actual code. So... do not fear!
When a php script is processed, there is a compilation phase, where php converts the source code of the php files into "opcodes". APC simply caches the result of this compilation phase, so it should be safe to turn on.
That said, when making such changes to production code it is always wise to run a regression test to ensure no new issues have been introduced.
A site I built with Kohana was slammed with an enormous amount of traffic yesterday, causing me to take a step back and evaluate some of the design. I'm curious what are some standard techniques for optimizing Kohana-based applications?
I'm interested in benchmarking as well. Do I need to setup Benchmark::start() and Benchmark::stop() for each controller-method in order to see execution times for all pages, or am I able to apply benchmarking globally and quickly?
I will be using the Cache-library more in time to come, but I am open to more suggestions as I'm sure there's a lot I can do that I'm simply not aware of at the moment.
What I will say in this answer is not specific to Kohana, and can probably apply to lots of PHP projects.
Here are some points that come to my mind when talking about performance, scalability, PHP, ...
I've used many of those ideas while working on several projects -- and they helped; so they could probably help here too.
First of all, when it comes to performance, there are many aspects/questions to consider:
configuration of the server (both Apache, PHP, MySQL, other possible daemons, and system); you might get more help about that on ServerFault, I suppose,
PHP code,
Database queries,
Whether every request really needs to hit your webserver at all,
Can you use any kind of caching mechanism, or does the website always need completely up-to-date data?
Using a reverse proxy
The first thing that could be really useful is using a reverse proxy, like varnish, in front of your webserver: let it cache as many things as possible, so only requests that really need PHP/MySQL calculations (and, of course, some other requests, when they are not in the cache of the proxy) make it to Apache/PHP/MySQL.
First of all, your CSS/Javascript/Images -- well, everything that is static -- probably don't need to be always served by Apache
So, you can have the reverse proxy cache all those.
Serving those static files is no big deal for Apache, but the less it has to work for those, the more it will be able to do with PHP.
Remember: Apache can only serve a finite, limited number of requests at a time.
Then, have the reverse proxy serve as many PHP-pages as possible from cache: there are probably some pages that don't change that often, and could be served from cache. Instead of using some PHP-based cache, why not let another, lighter, server serve those (and fetch them from the PHP server from time to time, so they are always almost up to date)?
For instance, if you have some RSS feeds (we generally tend to forget those when trying to optimize for performance) that are requested very often, having them in cache for a couple of minutes could save hundreds/thousands of requests to Apache+PHP+MySQL!
Same for the most visited pages of your site, if they don't change for at least a couple of minutes (example: homepage?), then, no need to waste CPU re-generating them each time a user requests them.
Maybe there is a difference between pages served for anonymous users (the same page for all anonymous users) and pages served for identified users ("Hello Mr X, you have new messages", for instance)?
If so, you can probably configure the reverse proxy to cache the page that is served for anonymous users (based on a cookie, like the session cookie, typically)
It'll mean that Apache+PHP has less to deal with: only identified users -- which might be only a small part of your users.
About using a reverse-proxy as cache, for a PHP application, you can, for instance, take a look at Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
(Yep, they are using Squid, and I was talking about varnish -- that's just another possibility ^^ Varnish being more recent, but more dedicated to caching)
If you do that well enough, and manage to stop re-generating too many pages again and again, maybe you won't even have to optimize any of your code ;-)
At least, maybe not in any kind of rush... And it's always better to perform optimizations when you are not under too much pressure...
As a sidenote: you are saying in the OP:
A site I built with Kohana was slammed with
an enormous amount of traffic yesterday,
This is the kind of sudden situation where a reverse-proxy can literally save the day, if your website can deal with not being up to date by the second:
install it, configure it, let it always -- every normal day -- run:
Configure it to not keep PHP pages in cache; or only for a short duration; this way, you always have up to date data displayed
And, the day you take a slashdot or digg effect:
Configure the reverse proxy to keep PHP pages in cache; or for a longer period of time; maybe your pages will not be up to date by the second, but it will allow your website to survive the digg-effect!
About that, How can I detect and survive being “Slashdotted”? might be an interesting read.
On the PHP side of things:
First of all: are you using a recent version of PHP? There are regularly improvements in speed, with new versions ;-)
For instance, take a look at Benchmark of PHP Branches 3.0 through 5.3-CVS.
Note that performance is quite a good reason to use PHP 5.3 (I've made some benchmarks (in French), and the results are great)...
Another pretty good reason being, of course, that PHP 5.2 has reached its end of life, and is not maintained anymore!
Are you using any opcode cache?
I'm thinking about APC - Alternative PHP Cache, for instance (pecl, manual), which is the solution I've seen used the most -- and that is used on all servers on which I've worked.
See also: Slides APC Facebook,
Or Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
It can really lower the CPU-load of a server a lot, in some cases (I've seen the CPU-load on some servers go from 80% to 40%, just by installing APC and activating its opcode-cache functionality!)
Basically, execution of a PHP script goes in two steps:
Compilation of the PHP source-code to opcodes (kind of an equivalent of JAVA's bytecode)
Execution of those opcodes
APC keeps those in memory, so there is less work to be done each time a PHP script/file is executed: only fetch the opcodes from RAM, and execute them.
You might need to take a look at APC's configuration options, by the way
there are quite a few of those, and some can have a great impact on both speed / CPU-load / ease of use for you
For instance, disabling [apc.stat](https://php.net/manual/en/apc.configuration.php#ini.apc.stat) can be good for system load; but it means modifications made to PHP files won't be taken into account unless you flush the whole opcode cache; about that, for more details, see for instance To stat() Or Not To stat()?
Using cache for data
As much as possible, it is better to avoid doing the same thing over and over again.
The main thing I'm thinking about is, of course, SQL queries: many of your pages probably run the same queries, and the results of some of those are probably almost always the same... which means lots of "useless" queries made to the database, which has to spend time serving the same data over and over again.
Of course, this is true for other stuff, like Web Services calls, fetching information from other websites, heavy calculations, ...
It might be very interesting for you to identify:
Which queries are run lots of times, always returning the same data
Which other (heavy) calculations are done lots of times, always returning the same result
And store these data/results in some kind of cache, so they are easier to get -- faster -- and you don't have to go to your SQL server for "nothing".
Great caching mechanisms are, for instance:
APC: in addition to the opcode-cache I talked about earlier, it allows you to store data in memory,
And/or memcached (see also), which is very useful if you literally have lots of data and/or are using multiple servers, as it is distributed.
of course, you can think about files; and probably many other ideas.
I'm pretty sure your framework comes with some cache-related stuff; you probably already know that, as you said "I will be using the Cache-library more in time to come" in the OP ;-)
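To make the idea concrete, here is a minimal data-cache sketch using the memcache PECL extension (the key, the SQL and the 5-minute lifetime are arbitrary examples; in your case you would go through the framework's DB layer and cache library instead):

$mc = new Memcache();                     // memcache PECL extension
$mc->connect('127.0.0.1', 11211);

$key  = 'articles_front_page';            // hypothetical cache key
$rows = $mc->get($key);
if ($rows === FALSE) {
    // cache miss: run the (expensive) query once, then keep the result for 5 minutes
    $result = mysql_query('SELECT id, title FROM articles ORDER BY created DESC LIMIT 10');
    $rows = array();
    while ($row = mysql_fetch_assoc($result)) {
        $rows[] = $row;
    }
    $mc->set($key, $rows, 0, 300);
}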
Profiling
Now, a nice thing to do would be to use the Xdebug extension to profile your application: it often lets you find a couple of weak spots quite easily -- at least, if there is any function that takes a lot of time.
Configured properly, it will generate profiling files that can be analysed with some graphic tools, such as:
KCachegrind: my favorite, but works only on Linux/KDE
WinCacheGrind for Windows; it does a bit less than KCacheGrind, unfortunately -- it doesn't display callgraphs, typically.
Webgrind which runs on a PHP webserver, so works anywhere -- but probably has less features.
For instance, here are a couple screenshots of KCacheGrind:
[Two KCacheGrind screenshots -- source: pascal-martin.fr]
(BTW, the callgraph presented on the second screenshot is typically something neither WinCacheGrind nor Webgrind can do, if I remember correctly ^^ )
(Thanks #Mikushi for the comment) Another possibility that I haven't used much is the xhprof extension: it also helps with profiling and can generate callgraphs -- but it is lighter than Xdebug, which means you should be able to install it on a production server.
You should be able to use it alongside XHGui, which will help with visualising the data.
On the SQL side of things:
Now that we've spoken a bit about PHP, note that it is more than possible that your bottleneck isn't the PHP-side of things, but the database one...
At least two or three things, here:
You should determine:
What are the most frequent queries your application is doing
Whether those are optimized (using the right indexes, mainly?), using the EXPLAIN instruction, if you are using MySQL
See also: Optimizing SELECT and Other Statements
You can, for instance, activate log_slow_queries to get a list of the requests that take "too much" time, and start your optimization by those.
Whether you could cache some of these queries (see what I said earlier)
Is your MySQL well configured? I don't know much about that, but there are some configuration options that might have an impact.
Optimizing the MySQL Server might give you some interesting information about that.
Still, the two most important things are:
Don't go to the DB if you don't need to: cache as much as you can!
When you have to go to the DB, use efficient queries: use indexes; and profile!
And what now?
If you are still reading, what else could be optimized?
Well, there is still room for improvements... A couple of architecture-oriented ideas might be:
Switch to an n-tier architecture:
Put MySQL on another server (2-tier: one for PHP; the other for MySQL)
Use several PHP servers (and load-balance the users between those)
Use other machines for static files, with a lighter webserver, like:
lighttpd
or nginx -- this one is becoming more and more popular, btw.
Use several servers for MySQL, several servers for PHP, and several reverse-proxies in front of those
Of course: install memcached daemons on whatever server has any amount of free RAM, and use them to cache as much as you can / makes sense.
Use something "more efficient" than Apache?
I hear more and more often about nginx, which is supposed to be great when it comes to PHP and high-volume websites; I've never used it myself, but you might find some interesting articles about it on the net;
for instance, PHP performance III -- Running nginx.
See also: PHP-FPM - FastCGI Process Manager, which is bundled with PHP >= 5.3.3, and does wonders with nginx.
Well, maybe some of those ideas are a bit overkill in your situation ^^
But, still... Why not study them a bit, just in case ? ;-)
And what about Kohana?
Your initial question was about optimizing an application that uses Kohana... Well, I've posted some ideas that are true for any PHP application... Which means they are true for Kohana too ;-)
(Even if not specific to it ^^)
I said: use cache; Kohana seems to support some caching stuff (You talked about it yourself, so nothing new here...)
If there is anything that can be done quickly, try it ;-)
I also said you shouldn't do anything that's not necessary; is there anything enabled by default in Kohana that you don't need?
Browsing the net, it seems there is at least something about XSS filtering; do you need that?
Still, here's a couple of links that might be useful:
Kohana General Discussion: Caching?
Community Support: Web Site Optimization: Maximum Website Performance using Kohana
Conclusion?
And, to conclude, a simple thought:
How much will it cost your company to pay you 5 days? -- considering it is a reasonable amount of time to do some great optimizations
How much will it cost your company to buy (pay for?) a second server, and its maintenance?
What if you have to scale larger?
How much will it cost to spend 10 days? more? optimizing every possible bit of your application?
And how much for a couple more servers?
I'm not saying you shouldn't optimize: you definitely should!
But go for the "quick" optimizations that will get you big rewards first: using some opcode cache might help you shave between 10 and 50 percent off your server's CPU-load... and it takes only a couple of minutes to set up ;-) On the other hand, spending 3 days for 2 percent...
Oh, and, btw: before doing anything: put some monitoring stuff in place, so you know what improvements have been made, and how!
Without monitoring, you will have no idea of the effect of what you did... Not even if it's a real optimization or not!
For instance, you could use something like RRDtool + cacti.
And showing your boss some nice graphics with a 40% CPU-load drop is always great ;-)
Anyway, and to really conclude: have fun!
(Yes, optimizing is fun!)
(Ergh, I didn't think I would write that much... Hope at least some parts of this are useful... And I should remember this answer: might be useful some other times...)
Use XDebug and WinCacheGrind or Webgrind to profile and analyze slow code execution.
[Webgrind screenshot -- source: jokke.dk]
Profile code with XDebug.
Use a lot of caching. If your pages are relatively static, then reverse proxy might be the best way to do it.
Kohana is very, very fast out of the box, except for the use of database objects. To quote Zombor: "You can reduce memory usage by ensuring you are using the database result object instead of result arrays." This makes a HUGE performance difference on a site that is being slammed. Not only do result arrays use more memory, they slow down script execution.
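In practice, "using the result object" means iterating the result that get() returns instead of converting it all to an array first. A small illustrative sketch (assuming the Kohana 2.x query builder; the table and column names are made up):

$result = $this->db->select('id, title')->from('events')->get();

// result arrays copy every row into a PHP array up front:
// $rows = $result->result_array();   // avoid this on large result sets

// the result object fetches rows lazily as you iterate it:
foreach ($result as $row)
{
    echo $row->title;
}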
Also - you must use caching. I prefer memcache and use it in my models like this:
public function get($e_id)
{
    // Cache key includes the site domain so several sites can share one memcache pool
    $event_data = $this->cache->get('event_get_'.$e_id.Kohana::config('config.site_domain'));

    if ($event_data === NULL)
    {
        // Cache miss: fetch the event from the read slave
        $this->db_slave
            ->select('e_id,e_name')
            ->from('Events')
            ->where('e_id', $e_id);

        $result = $this->db_slave->get();

        $event_data = ($result->count() == 1) ? $result->current() : FALSE;

        // Keep the result in cache for 5 minutes
        $this->cache->set('event_get_'.$e_id.Kohana::config('config.site_domain'), $event_data, NULL, 300);
    }

    return $event_data;
}
This will also dramatically increase performance. The above two techniques improved a site's performance by 80%.
If you gave some more information about where you think the bottleneck is, I'm sure we could give some better ideas.
Also check out YSlow (Google it) for some other performance tips.
Strictly related to Kohana (you probably already have done this, or not):
In production mode:
Enable internal caching (this will only cache the Kohana::find_file results, but it can actually help a lot; see the sketch below)
Disable profiler
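For Kohana 2.x that boils down to a couple of lines; a sketch of the relevant bits (check the key names against your version's application/config/config.php, and note that the profiler only runs if something instantiates it):

// application/config/config.php on the production site
$config['internal_cache'] = 3600;   // cache Kohana::find_file() lookups for an hour; FALSE disables it

// The profiler only runs when something executes "new Profiler;" (typically a hook file),
// so make sure no production hook does that.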
Just my 2 cents :)
I totally agree with the XDebug and caching answers. Don't look into the Kohana layer for optimization until you've identified your biggest speed and scale bottlenecks.
XDebug will tell you where you spend most of your time and identify 'hotspots' in your code. Keep this profiling information so you can baseline and measure performance improvements.
Example problem and solution:
If you find that you're building up expensive objects from the database each time, that don't really change often, then you can look at caching them with memcached or another mechanism. All of these performance fixes take time and add complexity to your system, so be sure of your bottlenecks before you start fixing them.