I have read a few things here and there about PHP being able to "cache" things. I'm not super familiar with the concept of caching from a computer science point of view. How does this work, and where would I use it in a PHP website and/or application?
Thanks!
You can cache:
Query results
The HTML output of a PHP script/request
Variables
Parts of a page
The compiled code itself (opcode caching, which speeds things up because PHP does not have to recompile the script to bytecode on every request)
Each of those is a different subject with different methods.
There are many questions on StackOverflow already about PHP and caching. Perhaps if you were clearer in your question (right now it has poor grammar and rambles somewhat vaguely), we could answer you better.
PHP Object Caching
HTML Caching for PHP sites
Caching dynamic php pages
PHP Opcode caching
PHP HTTP Headers for caching
Here is a good introductory article, by The UK Web Design Company, on how caching is done with PHP. There are products available that simplify this process a bit.
"How does this work" >> well, if done properly
How do you use a cache? Well, there are many types of solutions:
Caching parts of web pages (or even full pages): you can take a look at PEAR Cache_Lite (probably every existing framework has something like this; Zend Framework does, with many backends supported).
Caching data (objects, for instance): you can cache to files, to RAM (with APC, for instance), or to a caching server (like memcached).
That data can come from many sources; generally it will come from a database, a call to a web service, or something similar.
That data will generally be something that is used often and is hard, slow or costly to get.
You can also (not specific to PHP, though) use a reverse proxy (like Varnish) in front of your web server to cache entire HTML pages.
The subject is really vast: there is an almost infinite number of possibilities...
But one thing to remember: don't use caching "just to use caching". Caching, like anything else, can have drawbacks, so use it if and when it is necessary.
Have a look at Zend Cache
Not exactly about PHP, but regarding just the caching of the HTML output, there are also templating systems like Smarty that are capable of caching. I use it and I like how it works.
Have a look at Pear Cache and Cache_Lite at http://pear.php.net
We’re in the process of designing a caching strategy for a heavily used website.
The site consists of a mix of dynamic and static content. The front end is PHP, the middle tier is Tomcat, and MySQL is on the back end.
Only the user login screen is served over HTTPS to secure the credentials. After that, all content is served over plain HTTP. Some of the screens are specific to the customer (let’s say his last orders), while other screens are common to everybody (most popular products, promotions, rules, etc).
Given the expected traffic volume, it’s clear that we need a comprehensive caching strategy. So we’re considering the following options:
Put Squid or Varnish in front of PHP and configure it to cache all public content and even a customer's order submission form.
Use memcached from PHP to cache page fragments (such as most popular products).
Implement caching in the middle tier (Tomcat), i.e. before returning content to the web servers, try to fetch it from a local cache such as ehcache.
Use a PHP-level cache like Zend Cache and store page fragments there. This is close to the second option that I mentioned, but it's built into the Zend Framework.
It’s possible that we will use a combination of those strategies.
So the question is whether it's worthwhile to add front cache like Varnish, or just use Zend Cache inside?
Thanks again,
Philopator.
I've done quite a few projects like this and found that:
Creating a (complete) custom solution is hard and expensive. Luckily you found Squid/Varnish, memcached and ehcache.
The dynamic behaviour of sites differs a lot and you know your site best, so it makes sense to devise a specific caching strategy.
It makes sense to deploy multiple layers of cache. However, this will complicate the behaviour of your site, so you should tell everybody involved with the site (e.g. business) something about it and tell your engineers a lot about it.
Think of how you're going to debug problems. For example, add headers that indicate the freshness of the data served, and allow certain people to purge or bypass the cache.
Regularly check how the different cache layers perform (e.g. use nagios plugins for your varnish machines).
Measure where your performance problems are before you build any caches :)
Caching certain objects for just a short while can already be a very significant improvement.
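The debugging advice in the list above (headers that show freshness, a way for certain people to bypass the cache) could be sketched roughly as follows. The header names `X-Cache` and `X-Cache-Age`, the `nocache` query parameter and the IP allow-list are all assumptions made for illustration; they are not a standard and not something the answer prescribes.

```php
<?php
// Hypothetical helper: builds debug headers describing how a response was served.
function cacheDebugHeaders(bool $hit, int $storedAt, int $now): array
{
    $headers = ['X-Cache: ' . ($hit ? 'HIT' : 'MISS')];
    if ($hit) {
        // Age in seconds since the entry was written to the cache.
        $headers[] = 'X-Cache-Age: ' . max(0, $now - $storedAt);
    }
    return $headers;
}

// Hypothetical bypass rule: only allow-listed clients may skip the cache.
function shouldBypassCache(array $query, string $clientIp, array $allowedIps): bool
{
    return isset($query['nocache']) && in_array($clientIp, $allowedIps, true);
}
```

Each returned string can be passed to header() before any output is sent; keeping the decision logic in plain functions makes it easy to check without a web server.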
These days I like Varnish a lot: it's a separate layer that doesn't clutter the Java/PHP code, it's fast and very flexible. Downside is that the configuration in vcl is a bit too complex.
I typically use ehcache with in-memory storage to avoid latency (e.g. database queries or service requests) with small data sets, and memcached when there's a lot of data and the cache needs to be shared by multiple nodes.
What are the available cache methods I could use in PHP?
Cache HTML output
Cache some variables
It would be great to implement more than one caching method, so I'd like to know all of them, everything available out there (I currently do caching with files; any other ideas?).
Most PHP builds don't have a caching mechanism built in. There are extensions, though, that can take care of caching for you.
Have a look at APC or MemCache
If you are using a framework, then most come with some form of caching mechanism that you can use e.g. Zend Framework's Zend_Cache.
If you are not using a framework, then APC or Memcache, as Pelle ten Cate mentioned, can be used. The correct approach does depend on your situation, though: is your website or application running on more than one server, and does the information in the cache need to be shared between those servers? (If yes, then something like memcache is your answer, or maybe a database or a distributed NoSQL solution if you are feeling brave.)
If your code is only running on one server, you could try something simple like serializing your variables and writing them to disk. On every later request, check whether the file exists; if it does, open it and unserialize the string into the variable you need.
This is only worth it, though, if generating the variable normally would take a long time (e.g. longer than it would take to open, read and unserialize the file on disk).
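A minimal sketch of the serialize-to-disk idea described above. The function name `cachedValue`, the `$ttl` expiry and the file path are assumptions added for the example; the answer itself only describes the exists/open/unserialize steps.

```php
<?php
// Minimal sketch of a serialize-to-disk cache for a single server.
// $producer is any callable that builds the expensive value.
// Caveat: a cached value of false is indistinguishable from a failed read here.
function cachedValue(string $file, int $ttl, callable $producer)
{
    // Use the cached copy if the file exists and is younger than $ttl seconds.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $data = @unserialize(file_get_contents($file));
        if ($data !== false) {
            return $data;
        }
    }
    // Miss, expired or unreadable: rebuild the value and store it.
    $value = $producer();
    file_put_contents($file, serialize($value), LOCK_EX);
    return $value;
}
```

Something like `cachedValue('/tmp/report.cache', 600, 'buildReport')` would then rebuild the value at most once every ten minutes, where `buildReport` is a hypothetical function of your own.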
For HTML caching you are generally going to get the most mileage from using a proxy like Varnish or Squid to do it for you, but I realise that this may not be an option for you.
If it's not, then you could use the write-to-disk approach I mentioned above and save chunks of HTML to files. Look in the PHP manual for ob_start and its friends.
Since every PHP run starts from scratch on each page request, there is nothing that would persist between calls, making caching moot.
Well, that's the basic view. Of course there are ways to implement caching of a sort, and a few packages and extensions do so (like Zend Extensions and APC). However, you should look very closely at whether it actually improves performance. Other methods like memcache (for DB results), or switching from PHP to e.g. Java, will often yield better results.
You can store variables in the $_SESSION, but you shouldn't keep larger HTML there.
Please check what you are actually trying to do. "Bytecode caching" (that is, saving PHP parsing time) needs to be done by the PHP runtime executable. For caching database (SQL) request/reply pairs, there is memcache. Caching HTML output can be done, but is often not a good idea.
See also an earlier answer on a similar question.
We need a caching solution that essentially caches data (text files) anywhere from 3 days up to a week based on user preferences and criteria. In this case memory-based caching does not make sense to us. We were referred to MemcacheDB; however, I also thought of some NoSQL solutions.
Our current application uses an RDBMS (MySQL), and I guess it makes sense to use MemcacheDB; however, NoSQL does appeal as it is something more on the horizon. However, we have not deployed a production-level application under NoSQL, and the beta stuff does not sit well with management/investors. Anyhow, what are your thoughts and how would you address it?
Thank You
CouchDB and MongoDB are both great databases, but they are terrible choices for a cache layer on top of your existing RDBMS. Besides the fact that they are still fairly immature, they don't suit the purpose at all. Also, speed-wise you'd be better off going without a cache layer than using CouchDB or MongoDB--they are both slower for simple read/writes than even MySQL. Yes, the NoSQL databases are "cool", but that does not mean you should use them for something they were not meant to do.
I'd go with Memcached, as it's just about the fastest and lightest thing you'll find, and it's well-known and well-supported.
If you're worried about the appeal to management and investors, and the current system (you mention MySQL) works, why would you change? You're moving from a fairly stable project to projects still in beta, and what value are you adding if the current system already works?
As mentioned above, all CouchDB resources contain etags.
What wasn't mentioned is that you can put any HTTP caching solution in front of CouchDB and have it do etag based caching. This way you can use Varnish, nginx, whatever you want.
I'd also take a look at Cassandra ( http://cassandra.apache.org/ ). I've tried MemcacheDB and CouchDB, and somehow found Cassandra more appealing (I don't know about PHP, since I work with ColdFusion). Here's a related question: Cassandra PHP module
CouchDB already does some caching: when you get a document, the server also sends the HTTP ETag header (it's the same as the document revision in CouchDB).
The next time the browser asks for the same document it sends the Etag received. If the document hasn't been modified the server responds with the HTTP code 304 Not Modified and your browser retrieves the document from its local cache.
However, if you have to cache files for different times based on user preferences, even if the text file changes, probably your best option is to write custom code that sends the appropriate HTTP caching headers based on the user preferences.
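The ETag exchange and the per-user caching headers described above could look roughly like this in PHP. This is a sketch under assumptions: `conditionalResponse` is a made-up name, `$maxAge` stands in for a value derived from the user's preferences, the ETag is simply an MD5 of the body, and the comparison ignores weak validators and comma-separated lists that a real If-None-Match header may contain.

```php
<?php
// Sketch: decide between 200 and 304 from the client's If-None-Match header.
// $maxAge stands in for a per-user preference (an assumption for this example).
function conditionalResponse(string $body, ?string $ifNoneMatch, int $maxAge): array
{
    $etag = '"' . md5($body) . '"';
    $headers = [
        'ETag: ' . $etag,
        'Cache-Control: private, max-age=' . $maxAge,
    ];
    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) {
        // The client already holds this version: send no body.
        return ['status' => 304, 'headers' => $headers, 'body' => ''];
    }
    return ['status' => 200, 'headers' => $headers, 'body' => $body];
}
```

In a real script the second argument would come from `$_SERVER['HTTP_IF_NONE_MATCH'] ?? null`, and the result would be sent with http_response_code() and header(). A three-day preference corresponds to a `$maxAge` of 259200 seconds.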
For completeness another good option is Redis. You get performance comparable to Memcache but Redis also supports various data structures (hashes, lists, set, sorted sets) and atomic operations.
If you want memcached with persistence, you should check out Redis. It has all the memcached functionality (and more) along with persistence.
I have not tried it myself, but I do remember reading that Redis also supported the memcached API as well.
I was wondering about caching dynamic PHP pages. Is it really about pre-compiling the PHP code and storing it as bytecode? Something similar to Python's .pyc, which is a compiled, ready-to-execute version, so that if the system sees that the .pyc file is newer than the .py file, it won't bother to re-compile the .py file.
So is PHP caching mainly about this? Can someone offer a little bit more information on this?
It depends on the type of caching you are talking about. Opcode caching does exactly what you describe. It takes the compiled opcode and caches it, so that whenever a user visits a particular page, that page does not need to be re-compiled if its opcode is already in the cache. If you modify a PHP file, the caching mechanism will detect this, re-compile the code and put it in the cache.
If you're talking about caching the data on the page itself this is something different altogether.
Take a look at the Alternative PHP Cache for more info on opcode caching.
What you're describing is a PHP accelerator and they do exactly what you said; store the cached, compiled bytecode so that multiple executions of the same script require only one compilation.
It's also possible to cache the results of executing the PHP script. This usually requires at least a little bit of logic, since the content of the page might have changed since it was cached. For example, you can have a look at the general cache feature provided by CodeIgniter.
Peter D's answer covers opcode caching well. This can save you over 50% of page generation time (subjective) if your pages are simple!
The other caching you want to know about is the caching of data. This could be caching database result sets, a web service response, chunks of HTML or even entire pages!
A simple 'example' should illustrate:
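$cache = new Cache(); // Cache is a placeholder class with get() and set()
// Try the cache first (assuming get() returns false on a miss)
$dataset = $cache->get('expensiveDataset');
if ($dataset === false) {
    // Cache miss: run code to fetch dataset from database
    $dataset = expensiveOperation();
    $cache->set('expensiveDataset', $dataset);
}
echo $dataset; // do something with the data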
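The snippet above relies on a Cache class that the answer does not define. Purely as an assumption, a minimal stand-in could be an array-backed store like the one below. It only lives for the current request, so it shows the interface rather than a useful backend; in practice the same two methods would wrap APC, memcached or files.

```php
<?php
// Hypothetical stand-in for the Cache class used above: an array-backed store
// that lives only for the current request. get() returns false on a miss,
// mirroring how apc_fetch() and Memcached::get() report misses.
class Cache
{
    private array $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }
}
```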
There are libraries to help with object, function and page level caching. Zend Framework's Zend_Cache component is food for thought and a great implementation if you like what you see.
There are actually a few different forms of caching. What you're referring to is handled by packages such as eAccelerator, MMCache, etc.
While this will help some, where you'll really get a performance boost is in actually caching the HTML output where applicable, or in caching DB result sets for repetitive queries (something like memcache).
Installing any of the opcode cache mechanisms is very easy, but the other two areas of caching I referenced will gain you much larger performance benefits.
What is the best way of implementing a cache for a PHP site? Obviously, there are some things that shouldn't be cached (for example search queries), but I want to find a good solution that will make sure that I avoid the 'digg effect'.
I know there is WP-Cache for WordPress, but I'm writing a custom solution that isn't built on WP. I'm interested in either writing my own cache (if it's simple enough), or you could point me to a nice, light framework. I don't know much Apache though, so if it was a PHP framework then it would be a better fit.
Thanks.
You can use output buffering to selectively save parts of your output (those you want to cache) and display them to the next user if it hasn't been long enough. This way you're still rendering other parts of the page on-the-fly (e.g., customizable boxes, personal information).
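A rough sketch of that output-buffering approach, assuming a file per fragment and a time-to-live in seconds. The function name `cachedFragment`, the file path and the `$ttl` value are illustrative assumptions, not part of the answer.

```php
<?php
// Sketch of fragment caching with output buffering.
// $render is a callable that echoes the fragment; $file and $ttl are assumptions.
function cachedFragment(string $file, int $ttl, callable $render): string
{
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file); // still fresh: reuse the saved HTML
    }
    ob_start();               // capture everything $render echoes
    $render();
    $html = ob_get_clean();   // fetch the captured output and stop buffering
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}
```

A page would then mix `echo cachedFragment('/tmp/popular.html', 300, 'renderPopularProducts');` (with `renderPopularProducts` being a hypothetical function of yours) with ordinary uncached output for the personalised parts.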
If a proxy cache is out of the question, and you're serving complete HTML files, you'll get the best performance by bypassing PHP altogether. Study how WP Super Cache works.
Uncached pages are copied to a cache folder with a URL structure similar to your site's. On later requests, mod_rewrite notes the existence of the cached file and serves it instead. Other RewriteCond directives are used to make sure commenters and logged-in users see live PHP requests, but the majority of visitors will be served by Apache directly.
The best way to go is to use a proxy cache (Squid, Varnish) and serve appropriate Cache-Control/Expires headers, along with ETags : see Mark Nottingham's Caching Tutorial for a full description of how caches work and how you can get the most performance out of a caching proxy.
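As an illustration of the Cache-Control/Expires headers mentioned above, a small helper might build them like this. The name `publicCacheHeaders` and the choice of `public, max-age` are assumptions for the sketch; the right directives depend on the page and on the proxy in front of it.

```php
<?php
// Sketch: build the headers a caching proxy needs for a publicly cacheable page.
// $now is passed in so the result is reproducible.
function publicCacheHeaders(int $maxAge, int $now): array
{
    return [
        'Cache-Control: public, max-age=' . $maxAge,
        // Expires must be an HTTP date in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
    ];
}
```

Calling `publicCacheHeaders(300, time())` and passing each string to header() would let a proxy such as Squid or Varnish keep the page for five minutes.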
Also check out memcached, and try to cache your database queries (or better yet, pre-rendered page fragments) in there.
I would recommend Memcached or APC. Both are in-memory caching solutions with dead-simple APIs and lots of libraries.
The trouble with those 2 is you need to install them on your web server or another server if it's Memcached.
APC
Pros:
Simple
Fast
Speeds up PHP execution also
Cons:
Doesn't work for distributed systems, each machine stores its cache locally
Memcached
Pros:
Fast(ish)
Can be installed on a separate server for all web servers to use
Highly tested, developed at LiveJournal
Used by all the big guys (Facebook, Yahoo, Mozilla)
Cons:
Slower than APC
Possible network latency
Slightly more configuration
I wouldn't recommend writing your own, there are plenty out there. You could go with a disk-based cache if you can't install software on your webserver, but there are possible race issues to deal with. One request could be writing to the file while another is reading.
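One common way to avoid the read-while-write race mentioned above is to write the new contents to a temporary file and then rename it over the cache file. The sketch below assumes a POSIX filesystem, where rename() replaces the target in one step; the function name `atomicCacheWrite` is made up for the example.

```php
<?php
// Sketch: write a cache file atomically so readers never see a partial file.
function atomicCacheWrite(string $file, string $contents): bool
{
    // Create the temp file in the same directory so rename() stays on one filesystem.
    $tmp = tempnam(dirname($file), 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $contents) === false) {
        unlink($tmp);
        return false;
    }
    // On POSIX systems rename() replaces the target in a single step.
    return rename($tmp, $file);
}
```

Readers that open the cache file see either the old version or the new one, never a half-written mix. An alternative is to take locks with flock() on both the reading and the writing side.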
You actually could cache search queries, even for a few seconds to a minute. Unless your db is being updated more than a few times a second, some delay would be ok.
The PHP Smarty template engine (http://www.smarty.net) includes a fairly advanced caching system.
You can find details in the caching section of the Smarty manual: http://www.smarty.net/manual/en/caching.php
You seem to be looking for a PHP cache framework.
I recommend the template system TinyButStrong, which comes with a very good CacheSystem plugin.
It's simple, light, customizable (you can cache whatever part of the html file you want), very powerful ^^
For simple caching of pages, or parts of pages, use the PEAR Cache_Lite class. I also use APC and memcache for different things, but the other answers I've seen so far are aimed at more complete, and more complex, systems. If you just need to save some effort rebuilding a part of a page, Cache_Lite with a file-backed store is entirely sufficient, and very simple to implement.
Project Gazelle (an open source torrent site) provides a step by step guide on setting up Memcached on the site which you can easily use on any other website you might want to set up which will handle a lot of traffic.
Download the source and read the documentation.