I'm using Cache_Lite for HTML and array caching in my project. I've found that Cache_Lite can lead to high system IO, perhaps because its performance is not good.
Is there a stable PHP HTML/page cache I could use instead?
I already have APC installed for opcode caching and Memcached installed for common data/array caching.
I've had the exact same problem with Cache_Lite, as the library doesn't properly implement file locks.
I solved it with a new library that is a drop-in replacement for Cache_Lite.
https://github.com/mpapec/simple-cache/blob/master/example_clite1.php
https://github.com/mpapec/simple-cache/blob/master/example_clite2.php
https://github.com/mpapec/simple-cache/blob/master/example_clite3.php
Just to mention that the library lacks some features I didn't find useful, like cache cleaning and caching in memory (the _memoryCaching property, which is false by default and marked as "beta quality" in the original library).
The algorithm used for file locking follows this diagram.
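In code, the basic idea looks roughly like the following minimal sketch (illustrative only, not the linked library's actual implementation): writers take an exclusive lock, readers a shared one, so a reader never sees a half-written cache file.

    // Sketch of lock-protected file caching.
    function cache_write($file, $data) {
        $fp = fopen($file, 'c');            // create if missing, don't truncate yet
        if ($fp === false || !flock($fp, LOCK_EX)) {
            return false;                   // could not lock; caller just regenerates
        }
        ftruncate($fp, 0);
        fwrite($fp, $data);
        fflush($fp);
        flock($fp, LOCK_UN);
        fclose($fp);
        return true;
    }

    function cache_read($file, $maxAge) {
        if (!is_file($file) || filemtime($file) < time() - $maxAge) {
            return false;                   // missing or expired
        }
        $fp = fopen($file, 'r');
        if ($fp === false || !flock($fp, LOCK_SH)) {
            return false;
        }
        $data = stream_get_contents($fp);
        flock($fp, LOCK_UN);
        fclose($fp);
        return $data;
    }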
Without more information it is hard to know whether you are currently experiencing an IO problem or are likely to experience one in the future. (If your site is not getting much traffic, or you are using an SSD, you are unlikely to have a problem.)
Cache Lite appears to be a file based caching system. This may lead to IO problems if your site experiences heavy load / lots of concurrent users / is hosted on a shared server / has other programs heavily using the filesystem.
An alternative to Cache Lite is memcache, a key/value store that keeps data in memory. This may not be suitable if you are storing large amounts of data or your server does not have any spare RAM, since it stores all of its information in memory. The benefit of memory is that it is much faster than accessing files on disk, although if you are only accessing a small amount of data, or the same data repeatedly, this is not likely to matter much because of disk/OS caching.
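For example, page caching with the Memcached extension can be as simple as the following sketch (the key name, the render_page() helper and the 5-minute TTL are made up for illustration):

    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $key  = 'page:' . md5($_SERVER['REQUEST_URI']);
    $html = $mc->get($key);

    if ($html === false) {              // cache miss: build the page once
        $html = render_page();          // hypothetical function that builds the HTML
        $mc->set($key, $html, 300);     // keep it for 5 minutes
    }
    echo $html;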
I would suggest checking whether your system is currently experiencing any issues with IO before worrying about IO performance (unless you plan on getting slashdotted or something).
You could install a tool like Munin http://munin-monitoring.org/ and monitor your system to see whether IO is a problem or is becoming one. Once installed, check the CPU graph and look at the iowait data.
EDIT: Just saw the comment above. Depending on your needs, reverse proxies are another great tool; check out https://www.varnish-cache.org/. At work we use a combination of the two (memcache and Varnish): we have one machine serving over 900,000 page views per month, and the site includes both static and dynamic content.
If you're talking about https://pear.php.net/package/Cache_Lite then I could tell you a story. We used it once, but it proved to be unreliable for websites with lots of requests.
We then switched to Zend_Cache (ZF1) in combination with memcached. It can be used as a standalone component.
However, you have to tune it a bit in order to use tags. There are a few implementations out there to get the job done: https://github.com/bigwhoop/taggable-zend-memcached-backend
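A rough sketch of the standalone usage (the option values are illustrative, and tag support depends on the backend you pick, e.g. the one linked above):

    require_once 'Zend/Cache.php';

    $frontendOptions = array(
        'lifetime'                => 7200,   // 2 hours, just as an example
        'automatic_serialization' => true,
    );
    $backendOptions = array(
        'servers' => array(array('host' => 'localhost', 'port' => 11211)),
    );

    $cache = Zend_Cache::factory('Core', 'Memcached', $frontendOptions, $backendOptions);

    if (($data = $cache->load('expensive_report')) === false) {
        $data = build_expensive_report();    // hypothetical expensive call
        $cache->save($data, 'expensive_report');
    }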
I'm thinking of moving some of my filesystem caching (via phpfastcache) over to memcache (again using phpfastcache), but I have a question about storage sizes. Potentially a silly question, but:
If I had 80 GB of file cache storage in use, would I need 80 GB of RAM to store the exact same cache in memory?
I'm thinking they potentially use different compression methods but I've struggled to find an answer online.
80 GB of cache is huge; I'm surprised myself that Phpfastcache handles it well. If you migrate to Memcache, you can certainly expect some serious memory issues on the server.
For that size I recommend more capable backends such as MongoDB/CouchDB/ArangoDB, or maybe Redis too.
If you're just looking for backend storage stats, you can have a look at this API offered by Phpfastcache: Wiki - Cache Statistics
It will return miscellaneous statistics provided by the backends supported by Phpfastcache.
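A rough sketch of reading those stats (method names may differ slightly between Phpfastcache versions, so treat this as illustrative):

    use Phpfastcache\CacheManager;

    $cache = CacheManager::getInstance('files');
    $stats = $cache->getStats();             // returns a driver statistics object

    echo $stats->getSize();                  // approximate size used by the backend
    echo $stats->getInfo();                  // free-form info string from the driver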
(Note: I'm the author of this Library)
I have a high-traffic website and I need to make sure it is fast enough to display my pages to everyone rapidly.
I searched Google for articles about speed and optimization, and here's what I found:
Cache the page in memory
Save it to the disk
Caching the page in memory:
This is very fast, but if I need to change the content of my page I have to remove it from the cache and then re-save the file to disk.
Saving it to disk:
This is very easy to maintain, but every time the page is accessed I have to read from the disk.
Which method should I go with?
Jan & idm are right, but here's how to do it:
Caching (pages or content) is crucial for performance. The fewer calls you make to the database or the file system, the better, whether your content is static or dynamic.
You can use a PHP accelerator if you need to run dynamic content:
My recommendation is to use Alternative PHP Cache (APC)
Here are some benchmarks:
What is the best PHP accelerator to use?
PHP Accelerators : APC vs Zend vs XCache with Zend Framework
Lighttpd – PHP Acceleration Benchmarks
For caching content and even pages, you can use Memcached or Redis.
Memcached:
Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
Redis
Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.
Both are very good tools for caching content or variables.
Here are some benchmarks so you can choose which one you prefer:
Redis vs Memcached
Redis vs Memcached
Redis VS Memcached (slightly better bench)
On Redis, Memcached, Speed, Benchmarks and The Toilet
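Whichever you pick, the usage pattern is the same. Here's a rough sketch with the phpredis extension (the key name, the TTL and the build_sidebar() helper are illustrative):

    $redis = new Redis();
    $redis->connect('127.0.0.1', 6379);

    $key  = 'fragment:sidebar';
    $html = $redis->get($key);

    if ($html === false) {                   // miss: rebuild and cache the fragment
        $html = build_sidebar();             // hypothetical, expensive to compute
        $redis->setex($key, 600, $html);     // expire after 10 minutes
    }
    echo $html;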
You can also install Varnish, nginx, or G-WAN.
Varnish:
Varnish is an HTTP accelerator designed for content-heavy dynamic web sites. In contrast to other HTTP accelerators, such as Squid, which began life as a client-side cache, or Apache, which is primarily an origin server, Varnish was designed from the ground up as an HTTP accelerator.
nginx
nginx (pronounced "engine-x") is a lightweight, high-performance web server/reverse proxy and e-mail (IMAP/POP3) proxy, licensed under a BSD-like license. It runs on Unix, Linux, BSD variants, Mac OS X, Solaris, and Microsoft Windows.
G-WAN
G-WAN is a web server with ANSI C scripts and a key-value store which outperforms all other solutions.
Here are some benchmarks so you can choose which one you prefer:
Serving static files: a comparison between Apache, Nginx, Varnish and G-WAN
Web Server Performance Benchmarks
Nginx+Varnish compared to Nginx
Apache, Varnish, nginx and lighttpd
G-WAN vs Nginx
You have a good idea, which is close to what I do myself. If I have a page that is 100% static, I'll save an HTML version of it and serve that to the user instead of generating the content again every time. This saves both MySQL queries and, in some cases, several IO operations. Every time I make a change, my administration interface simply removes the HTML file and recreates it.
This method has proven to be around 100x faster on my server.
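A simplified sketch of that approach (the cache path and the render_page() helper are illustrative, not my actual code):

    $cacheFile = __DIR__ . '/cache/' . md5($_SERVER['REQUEST_URI']) . '.html';

    if (is_file($cacheFile)) {
        readfile($cacheFile);                // serve the static copy, no MySQL at all
        exit;
    }

    ob_start();
    render_page();                           // hypothetical: builds and echoes the page
    $html = ob_get_flush();                  // send it to the client and keep a copy
    file_put_contents($cacheFile, $html, LOCK_EX);

    // After an edit, the admin interface only has to unlink($cacheFile);
    // the next visitor regenerates it.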
The big question with website performance is "do you serve static pages, or do you serve dynamic pages?".
Static pages
The best way to speed up static pages is to cache them outside your website. If you can afford to, serve them from a CDN (Akamai, Cotendo, Level3). In this case, the traffic never hits your site. There are several ways to control the cache - from fixed duration to the standard HTTP cache directives.
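For example, the standard directives that a CDN or proxy honours can be emitted straight from PHP (the one-hour lifetime here is only an illustration):

    header('Cache-Control: public, max-age=3600');                           // cache for 1 hour
    header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 3600) . ' GMT');  // for older HTTP/1.0 caches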
Even if you can't serve your HTML from a CDN, storing your images, javascript and other static assets on a CDN can speed up your site - you could use a cloud service like Amazon for this.
If you can't afford a CDN for your HTML, you could use your own caching proxy layer, as book of Zeus suggests. I've had good results with Varnish. Ideally, you'd run your caching proxy on its own hardware - but you can run it on your existing servers.
Dynamic pages
Dynamic pages are harder to cache - so then you need to concentrate on making the pages themselves as efficient as possible. This basically means hunting the bottleneck - in most systems, the bottleneck is the database (but by no means always).
If you're confident your bottleneck is the database, there are several caching options: you can cache "snippets" of HTML, or you can cache database queries. Using an accelerator helps with this - I wouldn't invent one from scratch. This probably means re-architecting (parts of) your application.
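As a rough sketch of the "cache the query result" option with memcached (the query, the key and the 60-second lifetime are illustrative):

    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $key  = 'articles:front-page';
    $rows = $mc->get($key);

    if ($rows === false) {                   // miss: hit the database once
        $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
        $stmt = $pdo->query('SELECT id, title FROM articles ORDER BY id DESC LIMIT 10');
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        $mc->set($key, $rows, 60);           // arrays are serialized for you
    }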
You have to profile your site first.
Instead of guessing wildly, you have to determine the specific bottleneck(s) and then solve that specific problem.
Caching is not a silver bullet, nor is it a synonym for optimization.
Sometimes caching is not applicable (for ads, for example), and sometimes it will help nothing, as the reason for the site's slowness may be in some unrelated spot.
Your site may be running out of memory, in which case memory caching would make things worse.
I can't believe someone has a high-traffic site and has said not a word about prior profiling. How can you run it knowing nothing of its internals? CPU load, memory load, disk I/O and such.
I can add:
Cache everything you can
Minimize the number of includes
Use an accelerator
Please investigate what makes your site slow. Don't forget about YSlow and similar tools; they can help you a lot.
Besides, if you have heavy calculations you could write a PHP extension for them, but I don't think that is your case.
I'm trying to get to grips with the different types of cache engines: File, APC, XCache, Memcache. Does anybody know of any good resources/links?
Note: I am using Linux, PHP and MySQL.
There are two types of caching terminology thrown around in PHP.
First is an opcode cache:
http://en.wikipedia.org/wiki/PHP_accelerator
Second is a data cache:
http://simas.posterous.com/php-data-caching-techniques
A few of these technologies cross the boundary into both realms, but the basics behind them are simple. The idea is to keep as much data as possible in RAM and precompiled, because compiling and disk seeks are very expensive operations. Disk seeks happen when finding a file to compile, querying the DB for data, or looking for a temp file, and every time that happens it slows down the user experience.
Memcached is generally the way to go, but it has some "features", such as: once you save some data to the cache, there is no guarantee it will still be available later, as memcached dynamically removes old cache entries to make way for new ones. It's also fairly basic, so you'll need to roll your own system for handling timeouts and preventing cascading regeneration, but it's all fairly simple. There's tons of info in the memcached FAQ, or feel free to ask and I'll post some code examples. Memcached can also act as a session handler, which is great if you have lots of users or more than one server.
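The session-handler part is mostly configuration. A sketch using the php-memcached extension (the host/port are illustrative, and the same settings can go in php.ini instead of ini_set()):

    ini_set('session.save_handler', 'memcached');       // requires the php-memcached extension
    ini_set('session.save_path', '127.0.0.1:11211');    // one or more memcached servers
    session_start();

    $_SESSION['user_id'] = 42;   // now shared by every web server pointing at the same memcached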
Otherwise, disk caching is good if you only have one server or don't mind generating a separate cache on each server. It is generally faster than memcached as there is no network overhead (unless memcached is on the same server). There are plenty of good disk caching frameworks, but probably the best are PEAR Cache_Lite and APC.
APC also has the added advantage that it can cache your compiled PHP code which may help on high-performance websites.
I'm involved in a project that will end up creating around 10 million new pages on an existing site. The site, and the new project, are built with CodeIgniter and connecting to MySQL.
I've never dealt with a site of this size before, and I'm concerned about how we should handle caching. Has anyone dealt with caching on a PHP site of this size that could give me some pointers? I'm used to the CodeIgniter caching system and similar, but the number of cache files that would create worries me.
Any suggestions would be appreciated.
I haven't done anything on that scale, but I don't see a problem with file-based caching as long as the caching mechanism isn't completely dumb, and you're using a modern filesystem. Distributing cache files throughout a directory tree is smart enough.
If you're worried, that's good. Of course, I would suggest writing a wrapper around CI's built-in mechanism, so that you can easily swap it out for something else (Like Zend_Cache, possibly with a beefy memcached server, or some smarter file-based system of your own design).
There are several layers of caching available to PHP and CodeIgniter, but you shouldn't have to worry about the number of cached files on a standard linux server (various file systems can handle hundreds of millions of files per mount point). But to pick your caching method, you need to measure carefully.
Options:
Opcode caching (Zend, eAccelerator, and more)
CodeIgniter view caching (configured per view; sketched together with query caching after this list)
CodeIgniter read query caching
General web caching (more info)
Optimize your database (more info)
(and so on)
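A sketch of the two CodeIgniter options above (CI 2.x/3.x style; the 60-minute lifetime and the table name are illustrative):

    // In a controller: cache the fully rendered output of this page for 60 minutes.
    $this->output->cache(60);
    $this->load->view('article', $data);

    // Read query caching: result sets are written to the cache path set in
    // config/database.php and reused until you delete them.
    $this->db->cache_on();
    $query = $this->db->get('articles');
    $this->db->cache_off();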
Additionally, you can improve the file caches by using memory file systems and in-memory tables.
The real question is, how do you pick caching strategies? Capacity planning. You model your system (users, accounts, pages, files), simulate, measure, and add caches based on best theories. Measure again. Produce new theories and measurements until you have approaches that fit your desired scale.
In my experience, view caching and web caching are a big gain for widely read sites (WPSuperCache, for example). Opcode caching (and other forms of minimisation) are useful for heavily dynamic sites, as is database performance tuning.
FYI: if the system runs on a Windows server, Windows can (could?) have a maximum of approximately 65,000 files in a folder, including cache folders. I'm not sure whether this upper limit has been raised in newer versions.
All big guys use APC.
The number of webpages is not relevant.
The relevant number is the number of hits (pageviews).
And if you design for speed, ditch the Windows machines.
I'm currently implementing memcached into my service but what keeps cropping up is the suggestion that I should also implement APC for caching of the actual code.
I have looked through the few tutorials there are, and the PHP documentation as well, but my main question is: how do I implement it on a large scale? The PHP documentation talks about storing variables, but it isn't very detailed.
Forgive me for being uneducated in this area but I would like to know where in real sites this is implemented. Do I literally cache everything or only the parts that are used often, such as functions?
Thanks!
As you know, PHP is an interpreted language, so every time a request arrives at the server it needs to open all required and included files, parse them, and execute them. What APC offers is to skip the parsing and compilation step (the files still have to be required, but their compiled form is kept in memory, so access is much, much faster), so the scripts just have to be executed. On our website, we use a combination of APC and memcached: APC to speed up the steps mentioned above, and memcached for fast, distributed storage and access of both global variables (precomputed results of expensive function calls that can be shared by multiple clients for a certain amount of time) and session variables. This enables us to have multiple front-end servers without losing any client state, such as login status.
When it comes to what you should cache... well, that really depends on your application. If you have a need for multiple frontends somewhere down the line, I would try to go with memcached for such caching and storing, and use APC as an opcode cache.
APC is both an opcode cache and a general data cache. The latter works pretty much like memcached, whereas the opcode cache works by caching the compiled PHP files so that they don't have to be parsed on each request. That can generally speed up execution time quite a bit.
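A rough sketch of the data-cache side (the key, the TTL and the load_config_from_db() helper are illustrative):

    $config = apc_fetch('app_config', $hit);      // $hit is set to true on a cache hit
    if (!$hit) {
        $config = load_config_from_db();          // hypothetical expensive call
        apc_store('app_config', $config, 300);    // keep it for 5 minutes
    }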
You don't have to implement the opcode caching features of APC; you just enable them by loading APC as a PHP module.
APC cache size and other configuration information is here.