How do I implement a HTML cache for a PHP site? - php

What is the best way of implementing a cache for a PHP site? Obviously, there are some things that shouldn't be cached (for example search queries), but I want to find a good solution that will make sure that I avoid the 'digg effect'.
I know there is WP-Cache for WordPress, but I'm writing a custom solution that isn't built on WP. I'm interested in either writing my own cache (if it's simple enough), or you could point me to a nice, light framework. I don't know much Apache though, so if it was a PHP framework then it would be a better fit.
Thanks.

You can use output buffering to selectively save parts of your output (those you want to cache) and display them to the next user if it hasn't been long enough. This way you're still rendering other parts of the page on-the-fly (e.g., customizable boxes, personal information).

If a proxy cache is out of the question, and you're serving complete HTML files, you'll get the best performance by bypassing PHP altogether. Study how WP Super Cache works.
Uncached pages are copied to a cache folder with similar URL structure as your site. On later requests, mod_rewrite notes the existence of the cached file and serves it instead. other RewriteCond directives are used to make sure commenters/logged in users see live PHP requests, but the majority of visitors will be served by Apache directly.

The best way to go is to use a proxy cache (Squid, Varnish) and serve appropriate Cache-Control/Expires headers, along with ETags : see Mark Nottingham's Caching Tutorial for a full description of how caches work and how you can get the most performance out of a caching proxy.
Also check out memcached, and try to cache your database queries (or better yet, pre-rendered page fragments) in there.

I would recommend Memcached or APC. Both are in-memory caching solutions with dead-simple APIs and lots of libraries.
The trouble with those 2 is you need to install them on your web server or another server if it's Memcached.
APC
Pros:
Simple
Fast
Speeds up PHP execution also
Cons
Doesn't work for distributed systems, each machine stores its cache locally
Memcached
Pros:
Fast(ish)
Can be installed on a separate server for all web servers to use
Highly tested, developed at LiveJournal
Used by all the big guys (Facebook, Yahoo, Mozilla)
Cons:
Slower than APC
Possible network latency
Slightly more configuration
I wouldn't recommend writing your own, there are plenty out there. You could go with a disk-based cache if you can't install software on your webserver, but there are possible race issues to deal with. One request could be writing to the file while another is reading.
You actually could cache search queries, even for a few seconds to a minute. Unless your db is being updated more than a few times a second, some delay would be ok.

The PHP Smarty template engine (http://www.smarty.net) includes a fairly advanced caching system.
You can find details in the caching section of the Smarty manual: http://www.smarty.net/manual/en/caching.php

You seems to be looking for a PHP cache framework.
I recommend you the template system TinyButStrong that comes with a very good CacheSystem plugin.
It's simple, light, customizable (you can cache whatever part of the html file you want), very powerful ^^

Simple caching of pages, or parts of pages - the Pear::CacheLite class. I also use APC and memcache for different things, but the other answers I've seen so far are more for more complete, and complex systems. If you just need to save some effort rebuilding a part of a page - Cache_lite with a file-backed store is entirely sufficient, and very simple to implement.

Project Gazelle (an open source torrent site) provides a step by step guide on setting up Memcached on the site which you can easily use on any other website you might want to set up which will handle a lot of traffic.
Grab down the source and read the documentation.

Related

Quickest ways to speed up your YII application?

I've just started using YII and managed to finish my first app. unfortunately, launch day is close and I want this app to be super fast. So far, the only way of speeding it up I've come across, is standard caching. What other ways are there to speed up my app?
First of all, read Performance Tuning in the official guide. Additionally:
Check HTTP caching.
Update your PHP. Each major version gives you a good boost.
Use redis (or at least database) for sessions (default PHP sessions are using files and are blocking).
Consider using nginx instead (or with) apache. It serves content much better.
Consider using CDN.
Tweak your database.
These are all general things that are relatively easy to do. If it's not acceptable afterwards, do not assume. Profile.
1. Following best practices
In this recipe, we will see how to configure Yii for best performances and will see some additional principles of building responsive applications. These principles are both general and Yii-related. Therefore, we will be able to apply some of these even without using Yii.
Getting ready
Install APC (http://www.php.net/manual/en/apc.installation.php)
Generate a fresh Yii application using yiic webapp
2.Speeding up sessions handling
Native session handling in PHP is fine in most cases. There are at least two possible reasons why you will want to change the way sessions are handled:
When using multiple servers, you need to have a common session storage for both servers
Default PHP sessions use files, so the maximum performance possible is limited by disk I/O
3.Using cache dependencies and chains
Yii supports many cache backends, but what really makes Yii cache flexible is the dependency and dependency chaining support. There are situations when you cannot just simply cache data for an hour because the information cached can be changed at any time.
In this recipe, we will see how to cache a whole page and still always get fresh data when it is updated. The page will be dashboard-type and will show five latest articles added and a total calculated for an account. Note that an operation cannot be edited as it was added, but an article can.
4.Profiling an application with Yii
If all of the best practices for deploying a Yii application are applied and you still do not have the performance you want, then most probably, there are some bottlenecks with the application itself. The main principle while dealing with these bottlenecks is that you should never assume anything and always test and profile the code before trying to optimize it.
If most of your app is cacheable you should try a proxy like varnish.
Go for general PHP Mysql Performance turning.
1)Memcache
Memcahced open source distributed memory object caching system it helps you to speeding up the dynamic web applications by reducing database server load.
2)MySQL Performance Tuning
3)Webserver Performance turning for PHP

Caching strategy for heavily used web-site

We’re in process of designing caching strategy for a heavily used web-site.
The site consists of a mix of dynamic and static content. The front-end is PHP, middle tier is Tomcat and mysql on the back.
Only user login screen is done over HTTPS to secure the credentials. After that, all content is served over plain HTTP. Some of the screens are specific to the customer (let’s say his last orders), while other screens are common to everybody (most popular products, promotions, rules, etc).
Given the expected traffic volume it’s clear that we need a comprehensive caching strategy. So we’re considering following options:
Put Squid or Varnish in front of PHP and configure it to cache all public content and even order submission form of a customer.
Use memcached by PHP to cache page fragments (such as most popular products)
Implement caching in the middle tier/tomcats (i.e. before returning content to web-servers, try to fetch it from local cache such as ehcache)
Use PHP-level cache like Zend Cache and store there fragments of the pages. This is close to the second option that i mentioned but it's built into the Zend framework.
It’s possible that we will use a combination of those strategies.
So the question is whether it's worthwhile to add front cache like Varnish, or just use Zend Cache inside?
The other option that i forgot to mention is to use PHP-level cache like Zend Cache and store there fragments of the pages. This is close to the second option that i mentioned but it's built into the Zend framework.
So the question is whether it's worthwhile to add front cache like Varnish, or just use Zend Cache inside?
Thanks again,
Philopator.
I've done quite a few projects like this and found that:
creating a (complete) custom solution is hard and expensive. Luckily you found Squid/Varnish, memcache and ehcache
The dynamic behaviour of sites differ a lot and you know your site best, so it makes sense to devise a specific caching strategy
it makes sense to deploy multiple layers of cache. However, this will complicate the behavior of your site, so you should tell everybody involved with the site (e.g. business) something about it and tell your engineers a lot about it.
Think of how you're going to debug problems. e.g. add headers that indicate the freshness of the data served, allow certain people to purge or avoid the cache
Regularly check how the different cache layers perform (e.g. use nagios plugins for your varnish machines).
Measure where your performance problems are before you build any caches :)
caching certain objects for just a short while can already be a very significant improvement
These days I like Varnish a lot: it's a separate layer that doesn't clutter the Java/PHP code, it's fast and very flexible. Downside is that the configuration in vcl is a bit too complex.
I typically use ehcache + in memory storage to avoid latency (e.g. database queries or service requests) with small data sets, and memcached when there's a lot of data and the cache needs to shared by multiple nodes.

Is it possible to get a <200ms response with Drupal (without caching)?

The question, simply put, is the one in the title. Is it possible?
So far, my experience with scripting languages is that, to increase performance, you need to cache everything and later just serve the generated HTML files.
That's ok for some use cases, but when you really need to generate a new page in realtime, it's just impossible.
Drupal can take up to 3 seconds (or more!) to render some web pages (PHP execution time, not DB). That's crazy. Completely crazy.
If many projects (like Facebook) are using PHP, obviously the problem is mine. But googling for this problem shows that it's common. Too common.
(Of course I installed APC for PHP. It certainly helps, but PHP is still ultra-slow).
Must I assume this is the reality for Drupal / PHP?
Thanks.
Short answer is no. But why would you not want to cache?
What do you mean by 'generate a new page in realtime'? Authenticated users (anyone logged in) can see new content right away. Anonymous users may have to wait a little bit (if you are using Boost, for example), BUT, you can always control it, or flush it when new content is added. You should cache as much as you can.
You can install Boost (static HTML files), Memcache, and enable Drupal cache. It's encouraged, especially the last one. You can also run nginx on the server.
You can also try using Pressflow, a drop-in replacement for Drupal that will give you better performance.
http://pressflow.org/
Its been discussed many times.. you can make Drupal extremely fast if you want to. Check out some of the 2bits articles:
http://2bits.com/contents/articles
Utilizing the available methods of caching will help you keep your hosting cost low, instead of throw more hardware on an unoptimized site.
As you say, Facebook uses PHP, and they clearly have reason to need good performance. Their solution was to write their own compiler for PHP called HipHop, which they released as open source. If you're worried about PHP's performance, you should give it a try as it will definitely improve things.
The downside is that it doesn't (yet) cover 100% of the PHP function set, so some PHP programs may not compile. I don't know where Drupal fits into this, but it would be worth trying it out - there's nothing to be lost by doing a test compilation; if its not going to work, you won't have lost anything.
On a similar vein, there is a project in the Drupal community to convert parts of the Drupal Core into a PHP Extension, meaning that some key Drupal functions are then built-in to the PHP runtime as compiled code. See the project page here. But note that this is still in a fairly early stage of development: it's still listed as experimental, and only covers a small number of functions. It might be worth keeping an eye on the project, though.
According to http://groups.drupal.org/node/34076, yes you can get a < 200ms response time with Drupal without caching.
The tips that I've received from some friends regarding Drupal load performance is to install less than 40 modules.
More than 40, especially if those contrib modules use too much hooks and memory, and the performance will be decreased.
Other tips:
remove imagecache ui and views ui on production site
if possible put htaccess on vhost.conf so that htaccess will only be called once on apahe start
use throttle module
use gzip for all html, css and js files
use cdn module and amazon server solution
use ajax for some parts or blocks of your site
last and if there is enough budget, migrate to oracle

Seriously speeding up PHP?

I've been writing PHP for years, and have used every framework under the sun, but one thing has always bugged me... and that's that the whole bloody thing has to be interpreted and executed every time someone tells my server they want the page served.
I've experimented with caching, FastCGI, the Zend Job Queue (and symfony plug-ins that do similar - as well as my own DB-based solutions that implement the System_Daemon class to run background processes) and I've managed to make my apps fairly quick using all that stuff... but I can't get over the mental block that my settings files, system/environment check functions, and all the stuff that should only really be loaded ONCE... loads every darn time someone hits my page.
So, my ramble leads to the following Q--
Is there some method/technique for loading certain aspects of PHP into RAM so that when that page is requested, all my settings.yml files, system checks, framework files, cached pages etc can be loaded directly from memory without ever even touching the HD... or needing to go through the same loading mechanism 50,000 times per day to init the program?
If there's nothing in PHP... are there any other 'web' languages that can be compiled in this way, to allow for true init-once apps?
I think you should give memcached a try, if you're talking about caching data. I think PHP is fairly proficient in caching compiled php-pages if you use stuff like mod_php in apache (which doesn't die in between requests).
Take a look on APC (Alternative PHP Cache), it keeps a cache of compiled files (PHP Opcode) and also lets you store random variables on memory with apc_fetch, apc_store.
The instalation is very simple and it really gives a boost on performance.
Create a full page cache on the ram disk and make your web server serve the page from there. This is a method that wordpress supercache plugin uses and it works great if your web site is suitable for full page caching. This whay you are not even invoking the PHP interpreter.
For users that are logged in (have an open session) you can create a rewrite condition that will redirect their request to the PHP engine.
Also, always use an opcode cache like APC and use it for caching config files (memcache is also fine).
If you are asking for a JVM/Tomcat like application server, then the answer is likely no. To my knowledge nothing (usable) like this exists for PHP. PHP uses a shared-nothing architecture, so it is by design everything is setup on all requests. But actually, this makes PHP scale pretty well.
As for speeding up your apps, try to use memcached and a code accelerator. Maybe look into Zend Server to get a complete package.
Regarding your last question, I believe at least most of the Python and Ruby web frameworks work like that.
Ruby web applications are nowadays built so that the app is only initialized once per server process. When requests come in, the server (Apache, for example) passes them to the web application (over Rack interface) which is running on the background.
This is how web frameworks based on Rack work. Older versions of Ruby on Rails were similar, although they used a different interface to talk to the web server.
I'd keep an eye on the Facebook Engineering ppage (http://www.facebook.com/notes.php?id=9445547199), every now and then they come up with posts about how they keep things fast/optimize/scale. I think they're use of php is super impressive.

can i do caching with php?

I have read a few things here and there and about PHP being able to "cache" things. I'm not super familiar with the concept of caching from a computer science point of view. How does this work and where would I use it in a PHP website and/or application.
Thanks!
You can cache:
Query results
The HTML output of a PHP script/request
Cache variables
Cache parts of a page.
Cache the code itself (speeds up things, no need to do bytecode).
Each of those is a different subject with different methods.
There are many questions on StackOverflow already about PHP and Caching. Perhaps if you were more clear in you question (right now it has poor grammar and sort of vaguely rambles), we could better answer you.
PHP Object Caching
HTML Caching for PHP sites
Caching dynamic php pages
PHP Opcode caching
PHP HTTP Headers for caching
Here is a good introductory article, by The UK Web Design Company, on how caching is done with php. There are products available that simplify this process a bit.
"How does this work" >> well, if done properly
How to use cache ? Well, there are many types of solutions :
caching parts of web pages (or even full pages) ; you can take a look at PEAR Cache_Lite (there are things like this in probably every existing frameworks ; there is in Zend Framework, with many backends supported)
caching data (like objects, for instance) ; you can cache to files, to RAM (with APC for instance), to a caching server (like memcached, for instance)
that data can come from many sources ; generally, it'll be from database, or a call to a webservice, or stuff like that
that data will generally be something : often used, hard / long / costly to get
you can also (not specific to PHP, though) use a reverse proxy (like varnish, for example) as a frontend to your web server, to cache entire HTML pages
The subject is really vast : there is almost an infinite number of possibilities...
But one thing to remember is : don't use caching "just to use caching" : caching, like anything else, can have drawbacks ; so use it if/when necessary...
Have a look at Zend Cache
Not exactly about php but, refering just to the caching of the html output, there are also templating systems like smarty capable to cache. I use it and I like how it works.
Have a look at Pear Cache and Cache_Lite at http://pear.php.net

Categories