I have a single-server site that's pushing 200k uniques per day, and the traffic has doubled roughly every 40 days (for the last 5 months, anyway).
I pretty much only plan to cache the output of mysql_query functions for an hour or so. If the cache is older than that, run the query and put the result back into the cache for another hour.
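The hourly refresh described above could look roughly like the sketch below. It is only an illustration: cached_query() and $runQuery are made-up names, and a plain PHP array stands in for whichever cache (APC, memcached) ends up being used.

```php
<?php
// Minimal sketch of "cache a query result for an hour, then refresh".
// $store stands in for APC/memcached; cached_query() and $runQuery are hypothetical names.

function cached_query(array &$store, string $sql, callable $runQuery, int $ttl = 3600, ?int $now = null)
{
    $now = $now ?? time();
    $key = 'sql:' . md5($sql);

    // Serve from cache while the entry is younger than $ttl seconds.
    if (isset($store[$key]) && $store[$key]['expires'] > $now) {
        return $store[$key]['rows'];
    }

    // Missing or stale: run the query and put the result back for another $ttl seconds.
    $rows = $runQuery($sql);
    $store[$key] = ['rows' => $rows, 'expires' => $now + $ttl];
    return $rows;
}

// Usage with a fake query runner that counts how often the "database" is hit.
$store = [];
$hits = 0;
$run = function (string $sql) use (&$hits): array {
    $hits++;
    return [['id' => 1, 'title' => 'example row']];
};

cached_query($store, 'SELECT * FROM items', $run, 3600, 1000); // miss, runs the query
cached_query($store, 'SELECT * FROM items', $run, 3600, 2000); // hit, still fresh
cached_query($store, 'SELECT * FROM items', $run, 3600, 5000); // stale, runs the query again
echo "queries executed: $hits\n"; // 2
```

The timestamps are passed in explicitly here only so the example is repeatable; in real use the $now argument would be left out.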
My MySQL DB is about 200 MB in size (it grows by maybe 10-20 MB/month).
I'm doing a lot of file caching by writing HTML output to disk, using it for a few minutes, and then regenerating the HTML.
Unfortunately, since it's a database site that allows many sorting, searching and ordering methods, as well as pagination, there are over 150,000 cached pages. I'm also not caching the search queries, which cause most of the load.
I'd like to implement a caching system, and I wanted to know which one is faster. Would love to see some benchmarks.
A quick Googling says that APC is 5 times faster than Memcached.
In my experience APC is nearly 7-8 times faster than Memcached. But memcached can be accessed by different services (for example, if you run mainly on Apache and delegate some traffic, e.g. static content like images or pure HTML, to another web server such as lighttpd), which can be really useful, if not indispensable.
APC has fewer features than memcached and is easier to use and optimize, but the right choice depends on your needs.
Like you mentioned there are a few different aspects of caching. I probably would focus on the following aspects of caching in your php app:
opcode caching which caches the compiled bytecode of php scripts. You can see a benchmark here (albeit an older article): http://itst.net/654-php-on-fire-three-opcode-caches-compared
Note: I strongly recommend using opcode caching.
Caching user data - APC and others do this. This would be your reference data or data that is fairly static and doesn't change often. You can clear the cache every day or trigger a clean cache when this reference data changes. This is also strongly recommended since typically reference data is used frequently and doesn't change often.
Caching sql queries - I know that Zend makes this task easy with a simple setup. Since these queries don't change this is another obvious one (like you mentioned)
Additional (if possible):
caching html pages - obviously caching a static page is faster than a generated one and typically this is hard to do since most pages in apps are so dynamic. Worth it if you can do it although if your queries are cached and your SQL is simple I wouldn't focus on this.
caching sql results - personally I stay away from this. I'll let the database do its work and what it does best since the DBMS typically has caching. I may cache the results for the thread of execution (i.e., I just retrieved this so don't do it again) but I don't go much beyond that.
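Caching a result only for the current thread of execution can be as small as a static array. A rough sketch, where find_user() and $loader are invented names and $loader stands in for the real database call:

```php
<?php
// Sketch of request-scoped caching: remember a result only for the current script run.
// find_user() and $loader are hypothetical; $loader stands in for the real database call.

function find_user(int $id, callable $loader): array
{
    static $seen = [];               // lives only as long as this request
    if (!array_key_exists($id, $seen)) {
        $seen[$id] = $loader($id);   // first lookup in this request goes to the database
    }
    return $seen[$id];
}

$calls = 0;
$loader = function (int $id) use (&$calls): array {
    $calls++;
    return ['id' => $id, 'name' => "user$id"];
};

find_user(7, $loader);
find_user(7, $loader);   // answered from the static array
find_user(8, $loader);
echo "database calls: $calls\n"; // 2
```

Nothing here outlives the request, so there is no expiry or invalidation to worry about.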
I've used APC and eAccelerator successfully (I personally like to work with APC, and it supports opcode caching as well as user data caching for my reference data and SQL queries). Use XDebug to profile your code.
You want to compare APC key-value store vs Memcache right? Because APC also does opcode cache, which is a different thing.
Well, on a single machine, APC k-v cache is way faster than memcache. Memcache has more functionality, but is intended for distributed environments, while APC works on single servers only.
I did a benchmark recently to set and then get 1 million keys in both. Each key was a sequential integer, and each value was a 32-byte string.
Over localhost, memcache could retrieve 12k keys/second in a single thread. APC returned 90K/second. However, if you use multi-threads or "multi_get" with memcache, it gets very close to APC performance.
The benchmark ran on a 1GB vps at slicehost.
In my case APC is 59 times faster than memcache:
<?php
// When running from the command line, APC needs apc.enable_cli=1 in php.ini;
// the setting cannot be changed with ini_set() at runtime.
error_reporting(E_ALL);
$mem = new Memcache();
$mem->connect('127.0.0.1', 11211);
$mem->set('testin', 'something'); // set() creates the key; replace() fails if it does not exist yet
apc_store('testin', 'something');
$num = 1000000;

// Time 1 million memcache gets.
$i = 0;
$time = microtime(true); // float seconds; microtime() without true returns a string
while ($i < $num) {
    $mem->get('testin');
    $i++;
}
echo "memcache took: ", microtime(true) - $time, " for 1 million gets", "\n";

// Time 1 million APC fetches.
$time = microtime(true);
$i = 0;
print_r(apc_fetch('testin'));
while ($i < $num) {
    apc_fetch('testin');
    $i++;
}
echo "apc took: ", microtime(true) - $time, "for 1 million gets \n";
Here is the output:
memcache took: 37.657398939133 for 1 million gets
somethingapc took: 0.64599800109863for 1 million gets
It's almost impossible to accurately predict which would be faster. I would run tests with both in a development environment with similar data.
When performance is of importance, always use a profiler.
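For a quick side-by-side test before reaching for a full profiler, a small timing harness is often enough. The sketch below is an assumption about how one might set it up; bench() is a made-up helper, and the array lookup is only a stand-in workload to be replaced with real cache calls on data similar to production.

```php
<?php
// Rough timing harness; bench() is a hypothetical helper, not part of any library.
// Pass it closures that call the real cache (APC, memcached, ...) on your own data.

function bench(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($i);
    }
    $elapsed = microtime(true) - $start;
    // Operations per second; guard against a zero reading on very fast runs.
    return $iterations / max($elapsed, 1e-9);
}

// Stand-in workload: a plain PHP array lookup. Swap in the cache calls you want to compare.
$data = ['testin' => 'something'];
$rate = bench(function (int $i) use ($data) {
    return $data['testin'];
}, 100000);

printf("array lookup: %.0f gets/sec\n", $rate);
```

Numbers from a loop like this only compare raw get speed; a profiler such as XDebug is still what shows where a real page spends its time.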
I use IPB 3.1.4 with APC, and it runs roughly two times faster than without it.
Requests per second: 43.46 [#/sec] (mean)
Requests per second: 24.23 [#/sec] (mean)
I haven't tested IPB with memcached yet.
Whenever I needed to cache some information I relied on timestamps and MySQL, storing the data in a database and fetching it that way. I just read about APC.
APC is so much easier, but is it worth converting my previous cache methods to APC, apart from fewer SQL queries going through and cleaner code?
If you already have a database running and doing most of your work, the first step to improve performance is to properly tune the database. MySQL, properly configured, is very fast.
Obviously, at some point it isn't fast enough anymore and one needs further caches. When caching, one thing to consider is that your data might no longer be consistent: you might update data in your primary store (the database) while others still read an outdated cache entry.
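One common way to limit that staleness is to drop the cached copy whenever the primary store is written. A minimal sketch, with plain arrays standing in for MySQL and the cache, and with invented function names:

```php
<?php
// Sketch: delete the cached copy whenever the primary store is written to.
// $db and $cache are plain arrays standing in for MySQL and a cache; function names are hypothetical.

function read_title(array &$db, array &$cache, int $id): string
{
    $key = "title:$id";
    if (!isset($cache[$key])) {
        $cache[$key] = $db[$id];     // cache miss: load from the primary store
    }
    return $cache[$key];
}

function update_title(array &$db, array &$cache, int $id, string $title): void
{
    $db[$id] = $title;               // write to the primary store first
    unset($cache["title:$id"]);      // then drop the now-outdated cache entry
}

$db = [1 => 'old title'];
$cache = [];

echo read_title($db, $cache, 1), "\n";   // old title (now cached)
update_title($db, $cache, 1, 'new title');
echo read_title($db, $cache, 1), "\n";   // new title, because the stale entry was removed
```

This only helps when every write goes through code that knows about the cache; anything that changes the database directly still leaves stale entries until they expire.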
Now, you've mentioned APC as a possible solution. APC is two related but different things:
An opcode cache for PHP scripts
A shared memory cache for PHP user data
An opcode cache works by storing the compiled PHP script in memory. So when requesting a site the PHP interpreter doesn't have to read the file from disk and analyze the code but can directly execute it. This gives a major boost and is always a good thing.
A shared memory cache takes any PHP variable (well, there are a few exceptions ...) and stores it in shared memory on the system, so all PHP processes on the same machine can read it. So if you store the result of a database query inside APC you save time, as access to shared memory is very fast compared to querying a database (sending the query to a different machine, parsing it, executing it, sending the result back ...). But as said in the beginning, you have to keep in mind that the data might be outdated. Also keep in mind that all data is stored in memory, so depending on the amount of available RAM there are limits on what can be stored. Another big downside is that the data is stored in memory only: whenever the system goes down, the cache will be empty and everything in it will be lost.
To answer literally to the question, yes. Mysql is not a cache, APC is, and thus, is better.
MySQL is a storage option on top of which a cache can be implemented, but you are the one implementing the cache with those timestamps you mention and whatever logic you apply to them. APC is a complete implementation of a cache, both for data and for code.
Performance-wise, accessing the local APC cache will be far faster than accessing a MySQL database. The keyword there is local: APC is not distributed (as far as I know), so if you want to share your cache, you'll need an external cache system, such as memcached.
Generally, APC will be much, much faster than MySQL, so it's well worth the time to look into it and consider switching from one system to the other. And, as you mention, you will be firing fewer SQL queries at the database.
More information can be found via Google, I came across the following:
http://www.mysqlperformanceblog.com/2006/08/09/cache-performance-comparison/
I'm primarily wondering what the speed difference is in accessing the object cache of APC vs. memcached (NOT the opcode cache). The primary advantage of memcached is that it is distributed and not restricted to the local machine. However, since it goes over the network, there is some latency involved.
I was wondering whether the speed difference between accessing APC (on the machine) and memcached (on another server) is big enough to warrant having a staged caching scheme, where the program first tries APC, then memcached, and finally the database if all else fails.
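To make the staged scheme concrete, here is a rough sketch of the lookup order. The two arrays stand in for APC and memcached, and staged_get() and $loadFromDb are invented names, so treat it as an outline rather than a drop-in implementation.

```php
<?php
// Sketch of a staged lookup: local cache first, then shared cache, then the database.
// The two arrays stand in for APC and memcached; staged_get() and $loadFromDb are hypothetical.

function staged_get(string $key, array &$local, array &$shared, callable $loadFromDb)
{
    if (array_key_exists($key, $local)) {
        return $local[$key];                 // tier 1: same machine, no network
    }
    if (array_key_exists($key, $shared)) {
        $local[$key] = $shared[$key];        // tier 2 hit: copy down to the local tier
        return $local[$key];
    }
    $value = $loadFromDb($key);              // tier 3: the database
    $shared[$key] = $value;                  // populate both tiers on the way back
    $local[$key] = $value;
    return $value;
}

$local = [];
$shared = ['user:2' => 'from shared cache'];
$dbReads = 0;
$db = function (string $key) use (&$dbReads): string {
    $dbReads++;
    return "from database ($key)";
};

echo staged_get('user:1', $local, $shared, $db), "\n"; // from database (user:1)
echo staged_get('user:1', $local, $shared, $db), "\n"; // same value, now served locally
echo staged_get('user:2', $local, $shared, $db), "\n"; // from shared cache
echo "database reads: $dbReads\n";                     // 1
```

One thing to watch with this layout: each web server's local tier can hold a copy that the shared tier has already replaced, so the local entries usually want a short lifetime.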
Like most everything else: it depends.
If you have a lot of calculations and can store the results then caching will speed things up. If you're just basically storing rows from the database then in memory caching will help but memcached may not add a huge amount of difference vs. a database (assuming the db queries are all simple). On the other hand if you're doing complex queries, or a lot of programmatic work to create something, then caching makes much more sense.
To give you an example, I recently worked on a site that was written by a 3rd party contractor who did not do any performance work during design. It was slow as an ox because it had a lot of unoptimized includes and such. Adding APC basically improved the performance by 10x. Adding memcached decreased load times by 10 - 20 ms.
If you're far enough along then do some performance testing (look up xdebug, or another tool) and see where your bottlenecks are, then plan accordingly.
Keep in mind that if you fill up your APC cache with other things, APC will have to recompile the opcodes for your pages again. This can cause thrashing: compiled pages push cached objects out, and then when a page runs, the objects it stores push compiled pages out again. Not fun.
Just be safe and don't be tempted to use APC for anything but config values which won't cause your pages to be removed to make space.
TL;DR Once APC gets full your site will slow down and your server will work much harder.
I'm working on some old(ish) software in PHP that maintains a $cache array to reduce the number of SQL queries. I was thinking of just putting memcached in its place and I'm wondering whether or not to get rid of the internal caching. Would there still be a worthwhile performance increase if I keep the internal caching, or would memcached suffice?
According to this blog post, the PHP internal array is way faster than any other method:
Cache Type                            Gets/sec
Array Cache                           365000
APC Cache                             98000
File Cache                            27000
Memcached Cache (TCP/IP)              12200
MySQL Query Cache (TCP/IP)            9900
MySQL Query Cache (Unix Socket)       13500
Selecting from table (TCP/IP)         5100
Selecting from table (Unix Socket)    7400
It seems likely that memcache (which is implemented on the metal) would be faster than some php interpreted caching scheme.
However: if it's not broken, don't fix it.
If you remove the custom caching code, you might have to deal with other code that depends on the cache. I can't speak for the quality of the code you have to maintain but it seems like one of those "probably not worth it" things.
Let me put it this way: Do you trust the original developer(s) to have written code that will still work if you rip out the caching? (I probably wouldn't)
So unless the existing caching is giving you problems I would recommend against taking it out.
There's an advantage in using memcache vs local caching if:
1) you have multiple webservers running off the same database, and have memcache set up to run across multiple nodes
2) the database does not implement query result caching or is very slow to access
Otherwise, unless the caching code is very poor, you shouldn't expect to see much performance benefit.
HTH
C.
I'm running a php/mysql-driven website with a lot of visits and I'm considering the possibility of caching result-sets in shared memory in order to reduce database load.
However, right now MySQL's query cache is enabled and it seems to be doing a pretty good job: if I disable query caching, CPU usage jumps to 100% immediately.
Given that situation, I don't know if caching result-sets (or even the generated HTML code) locally in shared memory with PHP will result in any noticeable performance improvement.
Does anyone out there have any experience on this matter?
PS: Please avoid suggesting heavy-artillery solutions like memcached. Right now I'm looking for simple solutions that don't require too much time to implement, deploy and maintain.
Edit:
I see my comment about memcached pulled the answers away from the actual point, which is whether caching DB queries in the application layer would have a noticeable performance impact, considering that the results of those queries are already being cached at the DB level.
I know you didn't want to hear about memcached, but it is one of the best solutions for what you're trying to do. Depending on your site usage, there can be massive improvements in performance. By simply using memcached's session handler over my database session handler, I was able to cut the load in half and cut back on request serving times by over 30%.
Realistically, memcached is a simple solution. It's already integrated with PHP (if you have the extension loaded), and it requires virtually no configuration (I simply had to add memcached as a service on my linux box, which is done in one or two shell commands).
I would suggest storing session data (and anything that lends itself to caching) in memcache. For dynamic pages (such as stack overflow homepage), I would recommend caching output for a couple of seconds to prevent flooding.
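Caching page output for a couple of seconds can be sketched with output buffering. In the example below render_cached() is an invented helper and the $cache array stands in for memcached; the timestamps are passed in only to keep the demo deterministic.

```php
<?php
// Sketch: cache rendered output for a few seconds so bursts of requests share one render.
// render_cached() is hypothetical; the $cache array stands in for memcached.

function render_cached(array &$cache, string $key, callable $render, int $ttl, float $now): string
{
    if (isset($cache[$key]) && $cache[$key]['until'] > $now) {
        return $cache[$key]['html'];
    }
    ob_start();                 // capture whatever the page code echoes
    $render();
    $html = ob_get_clean();
    $cache[$key] = ['html' => $html, 'until' => $now + $ttl];
    return $html;
}

$cache = [];
$renders = 0;
$page = function () use (&$renders) {
    $renders++;
    echo "<h1>Home</h1>";
};

$t = microtime(true);
$a = render_cached($cache, 'home', $page, 2, $t);        // renders
$b = render_cached($cache, 'home', $page, 2, $t + 1.0);  // within 2 s: cached copy
$c = render_cached($cache, 'home', $page, 2, $t + 3.0);  // expired: renders again
echo "renders: $renders\n"; // 2
```

Even a two-second lifetime means a page hit hundreds of times per second is generated only once in that window.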
A decent single box solution is file-based caching, but you have to sweep them out manually. Other than that, you could use APC, which is very fast and in-memory (still have to expire them yourself though).
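For the file-based option, the manual sweep might look something like this. All function names are made up for the sketch, the cache directory is a throwaway folder under the system temp dir, and freshness is judged by file modification time.

```php
<?php
// Sketch of a file cache with a manual sweep. All function names are hypothetical.
// Freshness is judged by file modification time.

function file_cache_put(string $dir, string $key, string $data): void
{
    file_put_contents($dir . '/' . md5($key) . '.cache', $data, LOCK_EX);
}

function file_cache_get(string $dir, string $key, int $ttl): ?string
{
    $file = $dir . '/' . md5($key) . '.cache';
    clearstatcache(true, $file);
    if (is_file($file) && filemtime($file) > time() - $ttl) {
        return file_get_contents($file);
    }
    return null; // missing or too old
}

// The sweep: delete every cache file older than $ttl seconds. Run it from cron or similar.
function file_cache_sweep(string $dir, int $ttl): int
{
    $removed = 0;
    foreach (glob($dir . '/*.cache') ?: [] as $file) {
        clearstatcache(true, $file);
        if (filemtime($file) <= time() - $ttl && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

$dir = sys_get_temp_dir() . '/cache_demo_' . uniqid();
mkdir($dir);

file_cache_put($dir, 'page:1', '<p>fresh</p>');
file_cache_put($dir, 'page:2', '<p>stale</p>');
touch($dir . '/' . md5('page:2') . '.cache', time() - 600); // pretend it is 10 minutes old

$fresh   = file_cache_get($dir, 'page:1', 300);  // '<p>fresh</p>'
$stale   = file_cache_get($dir, 'page:2', 300);  // null, older than 300 s
$removed = file_cache_sweep($dir, 300);          // 1, only the stale file is deleted
echo "removed $removed stale file(s)\n";

// Clean up the demo directory.
array_map('unlink', glob($dir . '/*.cache') ?: []);
rmdir($dir);
```

Reads already ignore files that are too old, so the sweep is only about reclaiming disk space and keeping the directory small.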
As soon as you scale past one web server, though, you're going to need a shared cache, which is memcached. Why are you so adamant about not deploying this? It's not hard, and it's just going to save you time down the road. You can either start using memcache now and be done with it, or you could use one of the above methods for now and then end up switching to memcache later anyway, resulting in even more work. Plus, you don't have to deal with running a cronjob or some other ugly hack to get cache expiration: it does that for you.
The mysql query cache is nice, but it's not without issues. One of the big ones is it expires automatically every time the source data is changed, which you probably don't want.
I am creating a new PHP framework depending on Zend Framework.
It will be a general purpose MVC framework for web development.
I am worried about 2 aspects:
Logging:
Should I use logging? Are there any substantial performance problems when using logging?
Caching database queries:
I am caching some queries from database.
I am concerned about caching user-related information. Suppose there is some information related to users, like their personal info.
If I cache such data, a cache file will be generated in my data folder for every user. Now suppose there are 10,000 - 20,000 users online in a 2-hour span. This means there will be up to 20,000 files in my folder.
My question is: will it affect the performance of my server? Is there an upper limit on how many files a folder can hold on the server?
Do not use a file-based cache. File system operations are exceptionally slow: http://imgur.com/X1Hi1.gif . Use memcached. Contrary to what the above post says, you don't need a lot of memory: the amount you need is proportional to how much stuff you want to store, and memcached can cull data based on access frequency.
1) You definitely want logging, I'd recommend xdebug available at http://www.xdebug.org/. You can read further about the performance overheads at their site. (plus it integrates nicely with Eclipse's PHP version.)
2) I'm not really sure I'd want to cache much user information, but memcache is probably one of the better choices for caching in php (http://se2.php.net/memcache). And yeah, there's no limit on file number, and you'll probably not be going over the 32-bit filesize limit either =)
Caching is a real problem; it's almost impossible to get it right from a user/programmer perspective. I wouldn't cache things as simple as user data. This is already cached in the database. Focus more on complex queries and complete webpages (or parts of them).
Unless you have a page like Stack Overflow, where I see really few ways to cache anything, you have to search hard and check your logfiles to see what users do on your site; you will soon see some hotspots.
I don't recommend memcache unless you have a lot of memory (> 8 GB) on your machine. Memcache works best if you throw in dedicated memcache servers with 16 GB doing nothing other than caching.
For smaller sites, hardware and requirements you should consider APC as this is a very low overhead cache for data and it speeds up the execution of php at the same time (you don't want to run a production server without a bytecode cache).