Best practices for optimizing LAMP sites for speed? [closed]

I want to know, when building a typical site on the LAMP stack, how do you optimize it for the best possible load times? I am picturing a typical DB-driven site.
This is a high-level look, so let me break it down into each layer of the stack.
L - At the system level (setup and filesystem), what can you do to improve speed? One thing I can think of is image sizes; can compression at this level help optimize anything?
A - There have to be a ton of settings related to site speed here in the web server. Not my forte. It probably depends a lot on how many sites are running concurrently.
M - MySQL. In a database-driven site, DB performance is key. Is there a better normalization approach, e.g. using link tables? Web developers often just make simple monolithic tables resembling 1NF, and this can kill performance.
P - Aside from performance-boosting settings like caching, what can the programmer do to affect performance at a high level? I would really like to know whether MVC design approaches hurt performance more than quick-and-dirty code. Other simple tips, like whether sessions are faster than cookies, would be interesting to know.
Obviously you have to get down and dirty into the details and find what code is slowing you down. I also realize that many sites have many different performance characteristics, but let's assume a typical site that has more reads than writes.
I am just wondering if we can compile a bunch of best practices, and I fully expect people to link other questions so we can effectively work up a checklist.
My goal is to see if, in addition to the usual performance issues, some oddball things you might not think of crop up, to go along with a best-practices summary.
So my question is, if you were starting from scratch, how would you make sure your LAMP site was fast?

Here are a few personal must-dos that I always set up in my LAMP applications.
Install mod_deflate for Apache, and do not use PHP's gzip handlers. mod_deflate will allow you to compress static content (JavaScript/CSS/static HTML) as well as the usual dynamic PHP output, and it's one less thing you have to worry about in your code.
Be careful with .htaccess files! Enabling .htaccess files for directories in your app means that Apache has to scan the filesystem constantly, looking for .htaccess directives. It is far better to put directives inside the main configuration or a vhost configuration, where they are loaded once. Any time you can get rid of a directory-level access file by moving it into a main configuration file, you save disk access time.
Prepare your application's database layer to utilize a connection manager of some sort (I use a Singleton for most applications). It's not very hard to do, and reducing the number of database connections your application opens saves resources.
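A minimal sketch of that Singleton idea, assuming PDO; the class name and connection details are placeholders, not from the answer:

class Db
{
    private static $instance = null;

    private function __construct() {} // prevent direct construction

    public static function get()
    {
        // Open the connection once; every later call reuses it.
        if (self::$instance === null) {
            self::$instance = new PDO(
                'mysql:host=localhost;dbname=app;charset=utf8',
                'app_user',
                'secret',
                array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION)
            );
        }
        return self::$instance;
    }
}

// Anywhere in the application:
$rows = Db::get()->query('SELECT 1')->fetchAll();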
If you think your application will see significant load, memcached can perform miracles. Keep this in mind while you write your code... perhaps one day, instead of creating objects on the fly, you will be getting them from memcached. A little foresight will make implementation painless.
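As a hedged illustration of that foresight, here is the usual cache-aside shape with PHP's Memcached extension; loadReport() and buildReportFromDatabase() are made-up names:

$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

function loadReport($id, Memcached $memcached)
{
    $key = 'report_' . $id;
    $report = $memcached->get($key);
    if ($report === false) {                 // cache miss: do the expensive work
        $report = buildReportFromDatabase($id);
        $memcached->set($key, $report, 300); // keep it for 5 minutes
    }
    return $report;
}

If object creation sits behind a function like this from day one, swapping the cache in later is a two-line change.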
Once your app is up and running, set MySQL's slow query time to a small number and monitor the slow query log diligently. This will show you where your problem queries are coming from, and allow you to optimize your queries and indexes before they become a problem.
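On MySQL 5.1+ both settings are dynamic, so you can flip them on without a restart; a sketch (the half-second threshold is arbitrary, and my.cnf is the place to make it permanent):

$pdo = new PDO('mysql:host=localhost', 'root', 'secret');
$pdo->exec("SET GLOBAL slow_query_log = 'ON'");     // needs the SUPER privilege
$pdo->exec("SET GLOBAL long_query_time = 0.5");     // log anything slower than 0.5s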
For serious performance tweakers, you will want to compile PHP from source. Installing from a package installs a lot of libraries that you may never use. Since PHP environments are loaded into every instance of an Apache thread, even a 5MB memory overhead from extra libraries quickly becomes 250MB of lost memory when there are 50 Apache threads in existence. I keep a standard ./configure line that I use when building PHP, and I find it suits most of my applications. The downside is that if you end up needing a library, you have to recompile PHP to get it. Analyze your code and test it in a devel environment to make sure you have everything you need.
Minify your Javascript.
Be prepared to move static content, such as images and video, to a non-dynamic web server. Write your code so that any URLs for images and video are easily configured to point to another server in the future. A web server optimized for static content can easily serve tens or even hundreds of times faster than a dynamic content server.
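One way to keep those URLs configurable, as a sketch; STATIC_BASE and asset_url() are invented names:

define('STATIC_BASE', ''); // later: 'http://static.example.com'

function asset_url($path)
{
    return STATIC_BASE . '/' . ltrim($path, '/');
}

// In templates:
echo '<img src="' . asset_url('images/logo.png') . '" alt="">';

When the day comes, changing one constant moves every image and video to the static server.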
That's what I can think of off the top of my head. Googling around for PHP best practices will find a lot of tips on how to write faster/better code as well (such as: echo is faster than print).

First, realize that performance is an iterative process. You don't build a web application in a single pass, launch it, and never work on it again. On the contrary, you start small, and address performance issues as your site grows.
Now, onto specifics:
Profile. Identify your bottlenecks. This is the most important step. You need to focus your effort where you'll get the best results. You should have some sort of monitoring solution in place (like cacti or munin), giving you visibility into what's going on on your server(s).
Cache, cache, cache. You'll probably find that database access is your biggest bottleneck on the back end -- but you should verify this on your own. Fortunately, you'll probably find that a lot of your traffic is for a small set of resources. You can cache those resources in something like memcached, saving yourself the database hit, and resulting in better backend performance.
As others have mentioned above, take a look at the YDN performance rules. Consider picking up the accompanying book. This'll help you with front-end performance.
Install PHP APC, and make sure it's configured with enough memory to hold all your compiled PHP bytecode. We recently discovered that our APC installation didn't have nearly enough RAM; giving it enough room to work in cut our CPU time in half, and disk activity by 10%.
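If you want to check this on your own box, the old APC extension exposes its memory and hit statistics; a quick hedged sketch:

// Requires the APC extension; sizes are in bytes.
$sma  = apc_sma_info();
$info = apc_cache_info(); // opcode-cache stats

printf(
    "APC: %.1f MB free of %.1f MB; %d hits / %d misses\n",
    $sma['avail_mem'] / 1048576,
    ($sma['num_seg'] * $sma['seg_size']) / 1048576,
    $info['num_hits'],
    $info['num_misses']
);

If avail_mem sits near zero, raise apc.shm_size in your php.ini.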
Make sure your database tables are properly indexed. This goes hand in hand with monitoring the slow query log.
The above will get you very far. That is to say, even a fairly db-heavy site should be able to survive a frontpage digg on a single modestly-spec'd server if you've done the above.
You'll eventually hit a point where the default apache config won't always be able to keep up with incoming requests. When you hit this wall, there are two things to do:
As above, profile. Monitor your apache activity -- you should have an idea of how many connections are active at any given time, in addition to the max number of active connections when you get sudden bursts of traffic
Configure apache with this in mind. This is the best guide to apache config I've seen: Practical mod_perl chapter 11
Take as much load off of apache as you can. Apache's too heavy-duty to serve static content efficiently. You should be using a lighter-weight reverse proxy (like squid) or webserver (lighttpd or nginx) to serve static content, and to take over the job of spoon-feeding bytes to slow clients. This leaves Apache to do what it does best: execute your code. Again, the mod_perl book does a good job of explaining this.
Once you've gotten this far, it's largely an issue of caching more, and keeping an eye on your database. Eventually, you'll outgrow a single server. First, you'll probably add more front end boxes, all backed by a single database server. Then you're going to have to start spreading your database load around, probably by sharding. For an excellent overview of this growth process, see this livejournal presentation
For a more in-depth look at much of the above, check out Building Scalable Web Sites, by Cal Henderson, of Flickr fame. Google has portions of the book available for preview

I've used MySQLTuner for performance analysis on my MySQL servers, and it's given good insight into further issues to Google for, as well as making its own recommendations.

A resource you might find helpful is the YDN set of performance rules.

Don't forget the fact that your users will be thousands of miles away from your server, and downloading dozens of files to render a single page. That latency, and the overhead of rendering the page in their browsers can be larger than the amount of time that you spend collecting the information, and generating the page.
See the pages at Yahoo Developer Network about Best Practices for Speeding Up Your Web Site, and the YSlow tool for seeing what part of the downloading of the site is taking time.

Don't forget to turn off atime for your filesystem (mount with the noatime option), so that every file read doesn't also cost a metadata write!

I'd recommend using Jet Profiler for MySQL to find any bad queries. I've successfully used it on a couple of my sites. Really helpful, and much easier to digest than the slow query log.

I'd recommend starting with http://highscalability.com/
As for your suggestions:
Compression for images: definitely not. Filesystem tuning: yes, that could have some effect, but minimal. Actually the best option is to use an in-memory reverse proxy, or even better a CDN.
For Apache, basically only load the modules you need; do not load anything else. Since with PHP you can only use the forking MPM (prefork), it's important to keep it slim. As for optimal settings, you have to fine-tune them to the specific application, hardware, etc. If you have enough CPU, using mod_deflate is recommended: the faster the server can send data to the client, the faster it can start processing the next request.

Related

Optimizing Drupal via PHP, Apache and MySQL optimization

I installed Drupal Common from Acquia and am using it for my college intranet website. I configured it on Ubuntu Lucid Lynx Desktop edition running the latest XAMPP. I want to increase the performance of the website. My database server and webserver are on the same machine.
Can anyone suggest methods to increase the performance on the following points:
What should be the ideal hardware configuration?
What parameters should I change in PHP to run it for best performance?
How can I optimize Apache and MySQL to get the best performance out of both?
Are there tweaks in Drupal which can make it faster?
Are there any additional packages for caching etc. which can improve the speed?
Also, try Varnish if you're using PressFlow, as suggested by berkes. It helps a lot if you have to serve content for anonymous users.
Varnish can cache in memory all the content that Drupal produces, reducing hits to your web server and database.
Here's a good starting point for configuring Varnish with Pressflow:
https://wiki.fourkitchens.com/display/PF/Configure+Varnish+for+Pressflow
Google some for more details.
And don't forget about non-Drupal-related optimization, like reducing the number of HTTP requests, serving web page elements from different domains so the browser can download more in parallel, etc. Use YSlow and follow Yahoo's excellent rules. Google for "yahoo Best Practices for Speeding Up Your Web Site" (can't include the link due to SO limitation for new users).
This is not specific to Drupal, but applies to every PHP setup; more generally, to every web app. I advise you to start with O'Reilly's Building Scalable Websites.
See above. For Drupal, note the memory limit: many people just crank it up to ridiculous values, following the logic "Drupal needs more than 38MB, I'll just give it 250MB, to be safe".
Again, see above. For Drupal, pay extra attention to the number of queries. If you focus on slow queries only, you may miss the single tiny query hammering your DB 100+ times per request.
Lots. My advice is to start looking at Pressflow, an optimised Drupal. It has all the tweaks you are looking for built in. And more.
Yes, many, but start with memcached. And if you rely on search a lot, consider moving search to Solr.
Many more tips for starters can be found at Drupal performance Blog
The question you ask is very broad, so it is hard to give any specifics in answers. A good place to start is drupal's own handbook on performance tuning.
I would also highly recommend the boost module if your site serves largely anonymous users, as this allows requests to not even go to drupal and be served entirely from a static cache.
Drupal's Devel module has a Performance module that will log memory usage and access times to the Reports section of your site.
Use this to determine which pages on your site are slow.
Load xdebug (a PHP extension) and turn on the profiling feature. Make requests to your performance-intensive pages and it will create (very large) dumps of the entire request. Open up the cache file in a program like KCacheGrind or WinCacheGrind and you will be able to see every function call that Drupal made when building the page. From here you can see which parts are slowest and optimize them.
This should get you a good 30-80% improvement in performance if you have a slow site. In my experience, there's usually a few blocks or views that account for a huge part of any performance issues.
Pro Drupal 7 Development has a whole section regarding fine-tuning called "optimizing drupal".
I think you will find it quite interesting. It also discusses hardware architectures which is of your interest.
Regarding the 4th question, you can for a start checkout the boost module and disable modules you are not using.
Additionally, for improving page performance you can enable page caching from Configuration -> Performance. On the same page you can enable "Aggregate and compress CSS/JS files into one"; this way you reduce the number of HTTP requests per page and the overall size of the downloaded page.
You should also check that cron is set up. Not running cron can fill up the DB with logs, stale cache and other "garbage".
A last suggestion is to convert your DB from MyISAM to InnoDB, but this requires some investigation, because it is not always the case that InnoDB is faster: InnoDB loses less time to table locking, while MyISAM is faster at table reads.
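If you do convert, it is one ALTER TABLE per table; a hedged sketch that converts a whole schema (test on a copy first, for the reasons above; credentials and schema name are placeholders):

$pdo = new PDO('mysql:host=localhost;dbname=drupal', 'root', 'secret');

// Find every MyISAM table in the schema...
$tables = $pdo->query(
    "SELECT TABLE_NAME FROM information_schema.TABLES
     WHERE TABLE_SCHEMA = 'drupal' AND ENGINE = 'MyISAM'"
)->fetchAll(PDO::FETCH_COLUMN);

// ...and switch each one to InnoDB.
foreach ($tables as $table) {
    $pdo->exec("ALTER TABLE `$table` ENGINE = InnoDB");
}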

Is it crazy to not rely on a caching system like memcached nowadays (for dynamic sites)?

I was just reviewing one of my client's applications, which uses an old, outdated PHP framework that doesn't rely on caching at all and is pretty much completely database-dependent.
I figure I'll just rewrite it from scratch because it's really outdated, and in this rewrite I want to implement a caching system. It'd be nice to get a few pointers if anyone has done this before.
Rewrite will be done in either PHP or Python
Would be nice if I could profile before and after this implementation
I have my own server so I'm not restricted by shared hosting
Caching, when it works right (== high hit rate), is one of the few general-purpose techniques that can really help with latency -- the harder part of the problem generically described as "performance". You can enhance QPS (queries per second) measures of performance just by throwing more hardware at the problem -- but latency doesn't work that way (i.e., it doesn't take just one month to make a baby if you set nine mothers to work on it ;-).
However, the main resource used by caching is typically memory (RAM or disk, as it may be). Since you mention in a comment that the only performance problem you observe is memory usage, caching wouldn't help: it would just earmark some portion of memory for caching purposes, leaving even less available as a "general fund". As a resident of California I'm witnessing first-hand what happens when too many resources are earmarked, and I couldn't recommend such a course of action with a clear conscience!-)
If your site performance is fine then there's no reason to add caching. Lots of sites can get by without any cache at all, or by moving to a file-system based cache. It's only the super high traffic sites that need memcached.
What's "crazy" is code architecture (or a lack of architecture) that makes adding caching in latter difficult.
Since Python is one of your choices, I would go with Django. Built-in caching mechanism, and I've been using this debug_toolbar to help me while developing/profiling.
By the way, memcached does not work the way you've described. It maps unique keys to values in memory, it has nothing to do with .csh files or database queries. What you store in a value is what's going to be cached.
Oh, and caching is only worth it if there are (or will be) performance problems. There's nothing wrong with "not relying" on caches if you don't need them. Premature optimization is 99% evil!
Depending on the specific nature of the codebase and traffic patterns, you might not even need to re-write the whole site. Horribly inefficient code is not such a big deal if it can be bypassed via cache for 99.9% of page requests.
When choosing PHP or Python, make sure you figure out where you're going to host the site (or if you even get to make that call). Many of my clients are already set up on a webserver and Python is not an option. You should also make sure any databases/external programs you want to interface with are well-supported in PHP or Python.

What's a good way to troubleshoot a script in terms of performance (php/mysql)?

I've written a site CMS from scratch, and now that the site is slowly starting to get traffic (30-40k/day), I'm seeing the server load a lot higher than it should be. It hovers around 6-9 all the time, on a quad-core machine with 8GB of RAM. I've written scripts that performed beautifully on 400-500k/day sites, so I'd like to think I'm not totally incompetent.
I've reduced the number of queries done on every page by nearly 60% by combining queries, eliminating some MySQL calls completely, and replacing some sections of the site with static TXT files that are updated with PHP when necessary. All these changes improved the page execution time (the index loads in 0.3s, instead of 1.7s as before).
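For illustration, the static-file trick described here boils down to something like this (paths, query and function names are invented):

// Run this only when the underlying data changes, e.g. after a new post.
function rebuild_top_posts_fragment(PDO $pdo)
{
    $titles = $pdo->query('SELECT title FROM posts ORDER BY views DESC LIMIT 10')
                  ->fetchAll(PDO::FETCH_COLUMN);
    $html = '<ul><li>'
          . implode('</li><li>', array_map('htmlspecialchars', $titles))
          . '</li></ul>';
    file_put_contents('/var/www/cache/top_posts.txt', $html, LOCK_EX);
}

// On every page view: no MySQL involved at all.
readfile('/var/www/cache/top_posts.txt');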
There is virtually no IOwait, and the mysql DB is just 30mb. The site runs lighttpd, php 5.2.9, mysql 5.0.77
What can I do to get to the bottom of what exactly is causing the high load? I really wanna localize the problem, since "top" just tells me it's mysql, which hovers between 50-95% CPU usage at all times.
Use EXPLAIN to help you optimize/troubleshoot your queries. It will show you how tables are referenced and how many rows are being read. It's very useful.
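A hedged example of running EXPLAIN from PHP; the table and column names are made up:

$pdo = new PDO('mysql:host=localhost;dbname=app', 'app_user', 'secret');

$sql = "EXPLAIN SELECT p.title
        FROM posts p
        JOIN users u ON u.id = p.user_id
        WHERE u.email = 'x@example.com'";

foreach ($pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // type = ALL (a full table scan) or a huge "rows" estimate
    // usually means a missing index.
    print_r($row);
}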
Also if you've made any modifications to your MySQL configuration, you may want to revisit that.
The best thing you can do is to profile your application code. Find out which calls are consuming so much of your resources. Here are some options (the first three Google hits for "php profiler"):
Xdebug
NuSphere PhpED
DBG
You might have some SQL queries that are very slow, but if they are run infrequently, they probably aren't a major cause of your performance problems. It may be that you have speedier SQL queries that are run so often that their net impact on performance is greater. Profiling the application will help identify these.
The most general-purpose advice for improving application performance with respect to database usage is to identify data that changes infrequently, and put that data in a cache for speedier retrieval. It's up to you to identify what data would benefit from this the most, since it's very dependent on your application usage patterns.
As far as technology for caching, APC and memcached are options with good support in PHP.
You can also read through the MySQL optimization chapter carefully to identify any improvements that are relevant to your application.
Other good resources are MySQL Performance Blog, and the book "High Performance MySQL." If you're serious about running a MySQL-based website, you should be consulting these resources frequently.
mytop is a good place to start. It's basically top for MySQL, and will give you a window into what exactly your DB is doing:
http://jeremy.zawodny.com/mysql/mytop/
Noah
It could be any number of reasons, so it could take a lot of prodding. A good first step would be to turn on the slow query log and go over it by hand or with a parser. You can pick out specific heavily-used, slow queries to optimize (perhaps ones that hit something unindexed).

Techniques for writing a scalable website

I am new to the website scalability realm. Can you suggest some techniques for making a website scale to a large number of users?
Test your website under heavy load.
Monitor all statistics
Find bottleneck
Fix bottleneck
Go back to 1
good luck
If you expect your site to scale beyond the capabilities of a single server, you will need to plan carefully. Design so the following will be possible:
Make it so your database can be on a separate server. This isn't normally too hard.
Ensure all your static content can be moved to a CDN, as this will normally pull a lot of load off your servers.
Be prepared to spend a lot of money on hardware. More RAM and faster disks help a LOT.
It gets a lot harder when you need to split either the database or the php from a single server to multiple servers, so optimise everything, from your code, your database schema, your server config and anything else you can think of to put this final step off for as long as possible.
Other than that, all you can do is stress test your site, figure out where the bottlenecks are and try and design them away.
Check out this talk by Rasmus Lerdorf (creator of PHP)
Specially Page 8 and beyond.
You might want to look at this resource- highscalability.com.
A number of people have mentioned tools for identifying bottlenecks, and that is of course necessary. You can't spend productive time speeding something up without knowing where it's slow. But the other thing you need to know is where your target scalability lies. Is it value for money to spend a couple of months making your site scale to the same number of users as Twitter if it's going to be used by three people in HR? Do you have a known rate of transactions, or response latency, or number of users, in the requirements of the product? If so, target those numbers with your optimisation strategy. If not, find those out before chasing the performance rat down the hole.
Very similar: How Is PHP Done the Right Way?
Scalability is no small subject and certainly more material than can be reasonably covered in a single question.
For instance, with some kinds of applications, joins (in SQL) don't scale, which brings up all sorts of caching and sharding strategies.
Beanstalk is another scalability and performance tool in high-performance PHP sites. As is memcache (different kind).
The biggest problem for scalability is usually shared resources like DBMS's. The problem arises because DBMS's usually have no way to relax consistency guarantees.
If you want to increase scalability when you use something like MySQL you have to change your schema design to relax consistency.
For instance, you can separate your database schema to have your normalized data model for writes, and a replicated read only denormalized part for the 90% of read operations. The read only data can be spread over several servers.
Another way to increase scalability of a database is to partition the data, e.g. separate the data into a database for every department and aggregate them either in the ORM or in the DBMS.
In order of importance:
If you run PHP, use an opcode cache like APC. (This is important enough to be built into the next generation of PHP.)
Use YSlow or Google Page Speed to identify bottlenecks. (This will reveal structural problems with your website that affect both client and server performance.)
Ensure that your web server sends a proper Expires header for static content (images, Javascript, CSS), such that the browser can cache it properly. (YSlow will warn you about this, too; a header sketch follows this list.)
Use an HTTP accelerator, such as Varnish. (This picture says it all – and they already had an HTTP accelerator in place.)
Develop your site using solid OOP techniques. You will need your site to be modular, as not all performance bottlenecks are obvious at the start. Be ready to refactor parts of your site as traffic increases; the first sentence I wrote will help you do it more easily and safely. Also, use test-driven development: refactoring introduces new bugs, and good TDD is good at catching them before they go into production.
Separate client-side code from server-side code as much as possible, as they will likely be served from different servers, if your site traffic justifies this.
Read articles (read the YSlow tips for instance).
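The Expires advice above is normally handled in the web server config (mod_expires) for static files; for pages PHP itself generates, you can send the caching headers yourself. A minimal sketch, with an arbitrary one-hour lifetime (render_page() is a placeholder):

$ttl = 3600; // seconds; pick what your content tolerates
header('Cache-Control: public, max-age=' . $ttl);
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $ttl) . ' GMT');

echo render_page();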
GL
In addition to the other suggestions, look into splitting your sites into tiers, as in multitier architecture. If done right, you can then use one server per tier.

Optimizing Kohana-based Websites for Speed and Scalability

A site I built with Kohana was slammed with an enormous amount of traffic yesterday, causing me to take a step back and evaluate some of the design. I'm curious what are some standard techniques for optimizing Kohana-based applications?
I'm interested in benchmarking as well. Do I need to setup Benchmark::start() and Benchmark::stop() for each controller-method in order to see execution times for all pages, or am I able to apply benchmarking globally and quickly?
I will be using the Cache-library more in time to come, but I am open to more suggestions as I'm sure there's a lot I can do that I'm simply not aware of at the moment.
What I will say in this answer is not specific to Kohana, and can probably apply to lots of PHP projects.
Here are some points that come to my mind when talking about performance, scalability, PHP, ...
I've used many of those ideas while working on several projects -- and they helped; so they could probably help here too.
First of all, when it comes to performance, there are many aspects/questions to consider:
configuration of the server (Apache, PHP, MySQL, other possible daemons, and the system); you might get more help about that on ServerFault, I suppose,
PHP code,
Database queries,
Whether or not to use your webserver for everything,
Can you use any kind of caching mechanism, or do you always need perfectly up-to-date data on the website?
Using a reverse proxy
The first thing that could be really useful is using a reverse proxy, like varnish, in front of your webserver: let it cache as many things as possible, so only requests that really need PHP/MySQL calculations (and, of course, some other requests, when they are not in the cache of the proxy) make it to Apache/PHP/MySQL.
First of all, your CSS/Javascript/Images -- well, everything that is static -- probably don't need to be always served by Apache
So, you can have the reverse proxy cache all those.
Serving those static files is no big deal for Apache, but the less it has to work for those, the more it will be able to do with PHP.
Remember: Apache can only serve a finite, limited number of requests at a time.
Then, have the reverse proxy serve as many PHP-pages as possible from cache: there are probably some pages that don't change that often, and could be served from cache. Instead of using some PHP-based cache, why not let another, lighter, server serve those (and fetch them from the PHP server from time to time, so they are always almost up to date)?
For instance, if you have some RSS feeds (we generally tend to forget those when trying to optimize for performance) that are requested very often, having them in cache for a couple of minutes could save hundreds/thousands of requests to Apache+PHP+MySQL!
Same for the most visited pages of your site, if they don't change for at least a couple of minutes (example: homepage?), then, no need to waste CPU re-generating them each time a user requests them.
Maybe there is a difference between pages served for anonymous users (the same page for all anonymous users) and pages served for identified users ("Hello Mr X, you have new messages", for instance)?
If so, you can probably configure the reverse proxy to cache the page that is served for anonymous users (based on a cookie, like the session cookie, typically)
It'll mean that Apache+PHP has less to deal with: only identified users -- which might be only a small part of your users.
About using a reverse-proxy as cache, for a PHP application, you can, for instance, take a look at Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
(Yep, they are using Squid, and I was talking about varnish -- that's just another possibility ^^ Varnish being more recent, but more dedicated to caching)
If you do that well enough, and manage to stop re-generating too many pages again and again, maybe you won't even have to optimize any of your code ;-)
At least, maybe not in any kind of rush... And it's always better to perform optimizations when you are not under too much pressure...
As a sidenote: you are saying in the OP:
A site I built with Kohana was slammed with
an enormous amount of traffic yesterday,
This is the kind of sudden situation where a reverse-proxy can literally save the day, if your website can deal with not being up to date by the second:
install it, configure it, let it always -- every normal day -- run:
Configure it to not keep PHP pages in cache; or only for a short duration; this way, you always have up to date data displayed
And, the day you take a slashdot or digg effect:
Configure the reverse proxy to keep PHP pages in cache; or for a longer period of time; maybe your pages will not be up to date by the second, but it will allow your website to survive the digg-effect!
About that, How can I detect and survive being “Slashdotted”? might be an interesting read.
On the PHP side of things:
First of all: are you using a recent version of PHP? There are regularly improvements in speed, with new versions ;-)
For instance, take a look at Benchmark of PHP Branches 3.0 through 5.3-CVS.
Note that performance is quite a good reason to use PHP 5.3 (I've made some benchmarks (in French), and the results are great)...
Another pretty good reason being, of course, that PHP 5.2 has reached its end of life, and is not maintained anymore!
Are you using any opcode cache?
I'm thinking about APC - Alternative PHP Cache, for instance (pecl, manual), which is the solution I've seen used the most -- and that is used on all servers on which I've worked.
See also: Slides APC Facebook,
Or Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
It can really lower the CPU-load of a server a lot, in some cases (I've seen CPU-load on some servers go from 80% to 40%, just by installing APC and activating its opcode-cache functionality!)
Basically, execution of a PHP script goes in two steps:
Compilation of the PHP source code to opcodes (kind of an equivalent of Java's bytecode)
Execution of those opcodes
APC keeps those in memory, so there is less work to be done each time a PHP script/file is executed: only fetch the opcodes from RAM, and execute them.
You might need to take a look at APC's configuration options, by the way
there are quite a few of those, and some can have a great impact on both speed / CPU-load / ease of use for you
For instance, disabling apc.stat (https://php.net/manual/en/apc.configuration.php#ini.apc.stat) can be good for system load; but it means modifications made to PHP files won't be taken into account unless you flush the whole opcode cache; about that, for more details, see for instance To stat() Or Not To stat()?
Using cache for data
As much as possible, it is better to avoid doing the same thing over and over again.
The main thing I'm thinking about is, of course, SQL queries: many of your pages probably run the same queries, and the results of some of those are probably almost always the same... which means lots of "useless" queries made to the database, which has to spend time serving the same data over and over again.
Of course, this is true for other stuff, like Web Services calls, fetching information from other websites, heavy calculations, ...
It might be very interesting for you to identify:
Which queries are run lots of times, always returning the same data
Which other (heavy) calculations are done lots of times, always returning the same result
And store these data/results in some kind of cache, so they are easier to get -- faster -- and you don't have to go to your SQL server for "nothing".
Great caching mechanisms are, for instance:
APC: in addition to the opcode-cache I talked about earlier, it allows you to store data in memory,
And/or memcached (see also), which is very useful if you literally have lots of data and/or are using multiple servers, as it is distributed.
of course, you can think about files; and probably many other ideas.
I'm pretty sure your framework comes with some cache-related stuff; you probably already know that, as you said "I will be using the Cache-library more in time to come" in the OP ;-)
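To make the store-your-results point concrete, a minimal cache-aside sketch using APC's user cache (memcached would look almost identical); the key, TTL and compute_sidebar_stats() are invented:

function get_sidebar_stats()
{
    $stats = apc_fetch('sidebar_stats');
    if ($stats === false) {                      // miss: do the expensive work once
        $stats = compute_sidebar_stats();        // heavy SQL / calculation
        apc_store('sidebar_stats', $stats, 600); // keep for 10 minutes
    }
    return $stats;
}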
Profiling
Now, a nice thing to do would be to use the Xdebug extension to profile your application: it often allows you to find a couple of weak spots quite easily -- at least, if there is any function that takes lots of time.
Configured properly, it will generate profiling files that can be analysed with some graphic tools, such as:
KCachegrind: my favorite, but works only on Linux/KDE
WinCacheGrind for Windows; it does a bit less than KCacheGrind, unfortunately -- it doesn't display callgraphs, typically.
Webgrind which runs on a PHP webserver, so works anywhere -- but probably has less features.
For instance, here are a couple of screenshots of KCacheGrind (source: pascal-martin.fr).
(BTW, the callgraph presented in the second screenshot is typically something neither WinCacheGrind nor Webgrind can do, if I remember correctly ^^)
(Thanks #Mikushi for the comment) Another possibility that I haven't used much is the xhprof extension: it also helps with profiling and can generate callgraphs, but is lighter than Xdebug, which means you should be able to install it on a production server.
You should be able to use it alongside XHGui, which will help with the visualisation of the data.
On the SQL side of things:
Now that we've spoken a bit about PHP, note that it is more than possible that your bottleneck isn't the PHP-side of things, but the database one...
At least two or three things, here:
You should determine:
What are the most frequent queries your application is doing
Whether those are optimized (using the right indexes, mainly?), using the EXPLAIN instruction, if you are using MySQL
See also: Optimizing SELECT and Other Statements
You can, for instance, activate log_slow_queries to get a list of the requests that take "too much" time, and start your optimization with those.
Whether you could cache some of these queries (see what I said earlier)
Is your MySQL well configured? I don't know much about that, but there are some configuration options that might have some impact.
Optimizing the MySQL Server might give you some interesting information about that.
Still, the two most important things are:
Don't go to the DB if you don't need to: cache as much as you can!
When you have to go to the DB, use efficient queries: use indexes; and profile!
And what now?
If you are still reading, what else could be optimized?
Well, there is still room for improvements... A couple of architecture-oriented ideas might be:
Switch to an n-tier architecture:
Put MySQL on another server (2-tier: one for PHP; the other for MySQL)
Use several PHP servers (and load-balance the users between those)
Use other machines for static files, with a lighter webserver, like:
lighttpd
or nginx -- this one is becoming more and more popular, btw.
Use several servers for MySQL, several servers for PHP, and several reverse-proxies in front of those
Of course: install memcached daemons on whatever server has any amount of free RAM, and use them to cache as much as you can / makes sense.
Use something "more efficient" than Apache?
I hear more and more often about nginx, which is supposed to be great when it comes to PHP and high-volume websites; I've never used it myself, but you might find some interesting articles about it on the net;
for instance, PHP performance III -- Running nginx.
See also: PHP-FPM - FastCGI Process Manager, which is bundled with PHP >= 5.3.3, and does wonders with nginx.
Well, maybe some of those ideas are a bit overkill in your situation ^^
But, still... Why not study them a bit, just in case ? ;-)
And what about Kohana?
Your initial question was about optimizing an application that uses Kohana... Well, I've posted some ideas that are true for any PHP application... Which means they are true for Kohana too ;-)
(Even if not specific to it ^^)
I said: use cache; Kohana seems to support some caching stuff (You talked about it yourself, so nothing new here...)
If there is anything that can be done quickly, try it ;-)
I also said you shouldn't do anything that's not necessary; is there anything enabled by default in Kohana that you don't need?
Browsing the net, it seems there is at least something about XSS filtering; do you need that?
Still, here's a couple of links that might be useful:
Kohana General Discussion: Caching?
Community Support: Web Site Optimization: Maximum Website Performance using Kohana
Conclusion?
And, to conclude, a simple thought:
How much will it cost your company to pay you 5 days? -- considering it is a reasonable amount of time to do some great optimizations
How much will it cost your company to buy (pay for?) a second server, and its maintenance?
What if you have to scale larger?
How much will it cost to spend 10 days? more? optimizing every possible bit of your application?
And how much for a couple more servers?
I'm not saying you shouldn't optimize: you definitely should!
But go for the "quick" optimizations that will get you big rewards first: using some opcode cache might help you get between 10 and 50 percent off your server's CPU load... and it takes only a couple of minutes to set up ;-) On the other hand, spending 3 days for 2 percent...
Oh, and, btw: before doing anything: put some monitoring stuff in place, so you know what improvements have been made, and how!
Without monitoring, you will have no idea of the effect of what you did... Not even if it's a real optimization or not!
For instance, you could use something like RRDtool + cacti.
And showing your boss some nice graphics with a 40% CPU-load drop is always great ;-)
Anyway, and to really conclude: have fun!
(Yes, optimizing is fun!)
(Ergh, I didn't think I would write that much... Hope at least some parts of this are useful... And I should remember this answer: might be useful some other times...)
Use XDebug and WinCacheGrind or Webgrind to profile and analyze slow code execution.
Profile code with XDebug.
Use a lot of caching. If your pages are relatively static, then reverse proxy might be the best way to do it.
Kohana is out of the box very, very fast, except for the use of database objects. To quote Zombor: "You can reduce memory usage by ensuring you are using the database result object instead of result arrays." This makes a HUGE performance difference on a site that is being slammed. Not only does it use more memory, it slows down execution of scripts.
Also - you must use caching. I prefer memcache and use it in my models like this:
public function get($e_id)
{
    // Cache key is namespaced by site domain, so several sites can share one memcache.
    $key = 'event_get_' . $e_id . Kohana::config('config.site_domain');
    $event_data = $this->cache->get($key);

    if ($event_data === NULL) // cache miss: fall back to the slave DB
    {
        $this->db_slave
            ->select('e_id,e_name')
            ->from('Events')
            ->where('e_id', $e_id);
        $result = $this->db_slave->get();
        $event_data = ($result->count() == 1) ? $result->current() : FALSE;

        $this->cache->set($key, $event_data, NULL, 300); // cache for 5 minutes
    }

    return $event_data;
}
This will also dramatically increase performance. The above two techniques improved a site's performance by 80%.
If you gave some more information about where you think the bottleneck is, I'm sure we could give some better ideas.
Also check out yslow (google it) for some other performance tips.
Strictly related to Kohana (you probably already have done this, or not):
In production mode:
Enable internal caching (this will only cache the Kohana::find_file results, but this can actually help a lot).
Disable profiler
Just my 2 cents :)
I totally agree with the XDebug and caching answers. Don't look into the Kohana layer for optimization until you've identified your biggest speed and scale bottlenecks.
XDebug will tell you where you spend most of your time and identify 'hotspots' in your code. Keep this profiling information so you can baseline and measure performance improvements.
Example problem and solution:
If you find that you're building up expensive objects from the database each time, that don't really change often, then you can look at caching them with memcached or another mechanism. All of these performance fixes take time and add complexity to your system, so be sure of your bottlenecks before you start fixing them.
