You need to cache arbitrary data, such as the results of PHP logic within methods, database query calls, and generally any data produced by a process (not opcode caching).
Which would you choose between third-party caching tools like APC and Memcached? What makes you prefer these tools to caching your data on your local file system?
thanks
Luca
Go with Memcache. It has a lot more support and a larger community (because it can be used from multiple languages). It supports access from multiple servers, so it allows for a more scalable architecture.
That being said, still install APC or another opcode cache for PHP. It will significantly speed up PHP's execution time.
They're different things. APC is a local machine cache specific to PHP, and memcached is a distributed cache shared across multiple computers. If you're trying to scale your programs, memcached is often preferred. If you're designing for a single server, then APC will suit you better.
I personally prefer a combination of both.
Simple answer: Memcache and APC store the data in memory, not on the disk. Access time is MUCH faster.
I'm trying to get to grips with the different types of cache engines: File, APC, XCache, Memcache. Does anybody know of any good resources/links?
Note: I am using Linux, PHP and MySQL.
There are 2 types of caching terminology thrown around in PHP.
First is an opcode cache:
http://en.wikipedia.org/wiki/PHP_accelerator
Second is a data cache:
http://simas.posterous.com/php-data-caching-techniques
A few of the technologies cross boundaries into both realms, but the basics behind them are simple. The idea is to keep as much as possible in RAM and precompiled, because compiling and HD seeks are very expensive. An HD seek can happen when finding a file to compile, querying the DB for data, or looking for a temp file, and every time that happens it slows down the user experience.
Memcached is generally the way to go, but it has some "features": once you save some data to the cache, it isn't guaranteed to still be available later, because memcached dynamically evicts old entries to make way for new ones. It's also fairly basic; you'll need to roll your own system for handling timeouts and preventing cascading, but it's all fairly simple. There's tons of info in the Memcached FAQ, or feel free to ask and I'll post some code examples. Memcached can also act as a session handler, which is great if you have lots of users or more than one server.
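To make the "entries can vanish" and "preventing cascading" points concrete, here is a minimal sketch, not a definitive implementation. The helper cacheRemember(), the ArrayMemcachedStub class, the ":lock" key suffix and all timings are made up for illustration. The stub only mimics the get/add/set/delete methods of the PHP Memcached extension so that the example runs without a server; with the real extension you would pass a Memcached instance instead.

```php
<?php
// Hypothetical in-memory stand-in exposing the same get/add/set/delete
// calls used below, so the sketch runs without a memcached server.
final class ArrayMemcachedStub
{
    private $items = [];

    public function get(string $key)
    {
        $item = $this->items[$key] ?? null;
        if ($item === null || ($item['expires'] !== 0 && $item['expires'] <= time())) {
            unset($this->items[$key]);
            return false; // A miss is reported as false, like Memcached::get().
        }
        return $item['value'];
    }

    public function set(string $key, $value, int $ttl = 0): bool
    {
        $this->items[$key] = ['value' => $value, 'expires' => $ttl > 0 ? time() + $ttl : 0];
        return true;
    }

    public function add(string $key, $value, int $ttl = 0): bool
    {
        // add() only stores the value when the key does not exist yet.
        return $this->get($key) === false ? $this->set($key, $value, $ttl) : false;
    }

    public function delete(string $key): bool
    {
        unset($this->items[$key]);
        return true;
    }
}

// Treat every read as "maybe missing": rebuild on a miss, and let only one
// caller rebuild at a time so an expired key does not trigger a pile-up.
function cacheRemember($cache, string $key, int $ttl, callable $compute)
{
    $hit = $cache->get($key);
    if (is_array($hit)) {
        return $hit['value']; // Values are wrapped so a cached false is not mistaken for a miss.
    }

    $lockKey = $key . ':lock';
    if ($cache->add($lockKey, 1, 10)) { // Only the first caller gets the lock.
        try {
            $value = $compute();
            $cache->set($key, ['value' => $value], $ttl);
        } finally {
            $cache->delete($lockKey);
        }
        return $value;
    }

    // Someone else is rebuilding: poll briefly, then fall back to computing.
    for ($i = 0; $i < 5; $i++) {
        usleep(50000);
        $hit = $cache->get($key);
        if (is_array($hit)) {
            return $hit['value'];
        }
    }
    return $compute();
}

$cache = new ArrayMemcachedStub();
$report = cacheRemember($cache, 'report:daily', 300, function () {
    return ['generated_at' => time()];
});
```

The lock itself has a short expiry, so a crashed rebuild cannot block the key forever.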
Otherwise disc caching is good if you only have one server or don't mind generating a separate cache on each server. It is generally faster than memcached as it doesn't have the network overhead (unless you have memcached on the same server). There are plenty of good local caching frameworks, but probably the best are PEAR Cache_Lite and APC (note that APC keeps its data in shared memory rather than on disc).
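For a feel of what a disc cache involves, here is a small sketch using only core PHP functions. It is not how Cache_Lite is implemented; fileCacheGet(), fileCacheSet(), the directory name and the ".cache" suffix are all invented for this example. The entry's age is taken from the file's modification time, and writes go through a temporary file plus rename() so readers never see a half-written entry.

```php
<?php
// Returns the cached value, or null when the entry is missing, expired or unreadable.
// (A cached null is therefore indistinguishable from a miss in this sketch.)
function fileCacheGet(string $dir, string $key, int $maxAgeSeconds)
{
    $path = $dir . '/' . sha1($key) . '.cache';
    clearstatcache(true, $path); // Avoid stale is_file()/filemtime() results.
    if (!is_file($path) || filemtime($path) + $maxAgeSeconds < time()) {
        return null;
    }
    $data = @unserialize((string) file_get_contents($path));
    return (is_array($data) && array_key_exists('value', $data)) ? $data['value'] : null;
}

// Writes to a temp file first, then renames it into place.
function fileCacheSet(string $dir, string $key, $value): bool
{
    $path = $dir . '/' . sha1($key) . '.cache';
    $tmp = tempnam($dir, 'tmp');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, serialize(['value' => $value])) === false) {
        @unlink($tmp);
        return false;
    }
    return rename($tmp, $path);
}

$dir = sys_get_temp_dir() . '/cache_demo_' . getmypid();
if (!is_dir($dir)) {
    mkdir($dir, 0700, true);
}
fileCacheSet($dir, 'user:42', ['name' => 'example']);
$user = fileCacheGet($dir, 'user:42', 60); // Hit while the file is under 60 seconds old.
```

Each server keeps its own copy, which is the trade-off mentioned above.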
APC also has the added advantage that it can cache your compiled PHP code which may help on high-performance websites.
I am using memcache for caching objects, but would like to add an opcode accelerator like APC as well. Since they both involve caching, I am not sure if they will be "stepping on each other's toes", i.e. I am not sure if memcache is already an opcode accelerator.
Can someone clarify? I would like to use them both, but for different things: memcache for caching my objects and APC for code acceleration.
Memcache is more along the lines of a distributed object cache vs something like APC or XCache, which stores PHP bytecode in memory so you avoid having to parse it each time. Their main purposes are different.
For example, if you had a very CPU intensive database query that people often requested, you could cache the resulting object in memcache and then refer to it instead of re-running that query all the time.
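As a rough sketch of that idea (not a definitive implementation): queryCacheKey() and cachedQuery() are hypothetical helpers, and the anonymous class is only a stand-in for a cache client with get()/set() methods such as the Memcached extension's, so the example runs without a server. Hashing the SQL text together with its parameters keeps keys short and free of spaces, which matters because memcached keys are limited to 250 bytes and may not contain whitespace.

```php
<?php
// Build a compact, whitespace-free key from the SQL text and its parameters.
function queryCacheKey(string $sql, array $params): string
{
    return 'sql:' . sha1($sql . '|' . serialize($params));
}

// Return cached rows when present; otherwise run the query and cache its rows.
function cachedQuery($cache, string $sql, array $params, int $ttl, callable $runQuery): array
{
    $key = queryCacheKey($sql, $params);
    $rows = $cache->get($key);
    if (is_array($rows)) {
        return $rows; // Cache hit: the database is not touched.
    }
    $rows = $runQuery($sql, $params);
    $cache->set($key, $rows, $ttl);
    return $rows;
}

// Stand-in cache for the demo; a miss is reported as false. TTL is ignored here.
$cache = new class {
    private $data = [];
    public function get($key) { return $this->data[$key] ?? false; }
    public function set($key, $value, $ttl = 0) { $this->data[$key] = $value; return true; }
};

// Stand-in for the expensive query; counts how often it really runs.
$calls = 0;
$runQuery = function (string $sql, array $params) use (&$calls): array {
    $calls++;
    return [['id' => $params[0], 'total' => 99]];
};

$sql = 'SELECT id, total FROM orders WHERE id = ?';
$first = cachedQuery($cache, $sql, [7], 120, $runQuery);
$second = cachedQuery($cache, $sql, [7], 120, $runQuery); // Served from the cache.
```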
APC & XCache do have similar object caching features, but you are limited to the host machine. What if you wanted 10 different servers to all have access to that one object without having to re-do the query for each server? You'd just direct them to your memcache server and away you go. You still get a benefit if you only have a single server because using memcache will help you scale in the future if you need to branch out to more boxes.
The main thing to consider is if you think your app is going to need to scale. Memcache has more overhead since you have to use a TCP connection to access it, versus just a function call for APC/Xcache shared objects.
However, Memcache has the following benefits:
Faster than the disk or re-running query.
Scales to multiple servers.
Works with many different languages; your objects are not locked into PHP + APC/Xcache only.
All processes/languages have access to the same objects, so you don't have to worry if your PHP child processes have an empty object cache or not. This may not be as big a deal if you're running PHP-FPM though.
In most cases, I would recommend caching your objects in memcache as it's not much harder & is more flexible for the future.
Keep in mind that this is only regarding caching objects. Memcache does NOT have any bytecode or PHP acceleration features, which is why I would run it side-by-side with APC or Xcache.
Yes, you can use them both together at the same time.
I'm studying high-performance coding for websites in PHP, and this idea popped into my mind:
We know that accessing a database uses a significant amount of CPU, so we cache such data by saving it to the HDD. But I was wondering, can't it rest in the RAM of the server, so I can access it even faster?
You might want to check out memcached:
http://www.php.net/manual/en/intro.memcache.php
PHP is commonly deployed with APC as a bytecode cache. You can also use it as a local cache. If you need something in a distributed/clustered environment, then memcached (plus possibly beanstalkd) is the way to go.
XCache, eAccelerator, APC and memcache allow you to save items to semi-persistent memory (in most cases you don't necessarily know when an item will expire). It isn't the same as a database; it's more like a key/value list. The downside is that it requires a third-party library, so you might be a bit limited depending on your environment.
I think you might be able to get the same effect using shared memory (via PHP's shmop_ functions). But I have never used them and don't know if they are included with PHP's library, so someone feel free to bash me or edit out this mention.
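For what it's worth, a rough sketch of the shmop route looks like this. The shmop extension is optional, so the demo part is guarded with function_exists(). shmStore(), shmFetch(), shmDrop(), the segment key and the 4096-byte size are all hypothetical choices for this example, and there is no locking here, so concurrent writers could corrupt the data.

```php
<?php
// Store a serialized value in a fixed-size segment, prefixed by its length.
function shmStore(int $key, $value, int $size = 4096): bool
{
    $payload = serialize($value);
    if (strlen($payload) + 4 > $size) {
        return false; // Value does not fit into the segment.
    }
    $shm = @shmop_open($key, 'c', 0644, $size); // 'c': create or open existing.
    if ($shm === false) {
        return false;
    }
    $data = pack('N', strlen($payload)) . $payload;
    return shmop_write($shm, $data, 0) === strlen($data);
}

// Read the value back, or return null when nothing usable is stored.
function shmFetch(int $key)
{
    $shm = @shmop_open($key, 'a', 0, 0); // 'a': read-only access.
    if ($shm === false || shmop_size($shm) < 4) {
        return null;
    }
    $length = unpack('N', shmop_read($shm, 0, 4))[1];
    if ($length === 0 || $length + 4 > shmop_size($shm)) {
        return null;
    }
    return unserialize(shmop_read($shm, 4, $length));
}

// Mark the segment for removal once every process has detached.
function shmDrop(int $key): bool
{
    $shm = @shmop_open($key, 'w', 0, 0);
    return $shm !== false && shmop_delete($shm);
}

$segmentKey = 0x43414348; // Arbitrary example key.
$fromSharedMemory = null;
if (function_exists('shmop_open') && shmStore($segmentKey, ['visits' => 1])) {
    $fromSharedMemory = shmFetch($segmentKey);
    shmDrop($segmentKey);
}
```

Compared with APC or memcached you have to handle serialization, sizing, expiry and locking yourself, which is a good reason to prefer those tools when they are available.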
If your server is ANY good, then it will already do so. But of course, it may be the case that your server is serving a few thousand other tasks besides yours as well, meaning you don't have that server's cache all for yourself.
And if there really are a few thousand others being served besides you, then the probability just gets higher that there is at least one nutcase among those thousands of others, who is doing something that he really shouldn't be doing but that the server has not been programmed to detect, not been programmed to stop, but just been programmed to try and make the best of it, at the expense of availability of resources for the x999 "responsible" users.
I'd like to have your opinion about writing web apps in PHP vs. a long-running process using tools such as Django or Turbogears for Python.
As far as I know:
- In PHP, pages are fetched from the hard-disk every time (although I assume the OS keeps files in RAM for a while after they've been accessed)
- Pages are recompiled into opcode every time (although tools from eg. Zend can keep a compiled version in RAM)
- Fetching pages every time means reading global and session data every time, and re-opening connections to the DB
So, I guess PHP makes sense on a shared server (multiple sites sharing the same host) to run apps with moderate use, while a long-running process offers higher performance with apps that run on a dedicated server and are under heavy use?
Thanks for any feedback.
After you apply memcache, opcode caching, and connection pooling, the only real difference between PHP and other options is that PHP is short-lived and process-based, while other options are, typically, long-lived and multithreaded.
The advantage PHP has is that it's dirt simple to write scripts. You don't have to worry about memory management (it's always released at the end of the request), and you don't have to worry about concurrency very much.
The major disadvantage, I can see anyways, is that some more advanced (sometimes crazier?) things are harder: pre-computing results, warming caches, reusing existing data, request prioritizing, and asynchronous programming. I'm sure people can think of many more.
Most of the time, though, those disadvantages aren't a big deal. You can scale by adding more machines and using more caching. The average web developer doesn't need to worry about concurrency control or memory management, so taking the minuscule hit from removing them isn't a big deal.
With APC, which is soon to be included by default in PHP, compiled bytecode is kept in RAM.
With mod_php, which is the most popular way to use PHP, the PHP interpreter stays in web server's memory.
With APC data store or memcache, you can have persistent objects in RAM instead of for example always creating them all anew by fetching data from DB.
In real life deployment you'd use all of above.
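Here is a minimal sketch of the "persistent objects in RAM" part, under a few assumptions. It uses the apcu_* functions from APCu (the user-data cache that succeeded APC's data store; with classic APC the calls were apc_fetch()/apc_store()). localCacheFetch() is a hypothetical helper, and when APCu is not available or not enabled (often the case on the command line) it falls back to a plain per-request array so the example still runs.

```php
<?php
// Fetch a value from the local in-memory cache, building it on a miss.
function localCacheFetch(string $key, int $ttl, callable $build)
{
    static $perRequest = []; // Fallback store; lives only for this request.

    $useApcu = function_exists('apcu_enabled') && apcu_enabled();
    if ($useApcu) {
        $value = apcu_fetch($key, $found); // $found tells a real miss from a cached false.
        if ($found) {
            return $value;
        }
    } elseif (array_key_exists($key, $perRequest)) {
        return $perRequest[$key];
    }

    $value = $build(); // E.g. load the object from the DB.
    if ($useApcu) {
        apcu_store($key, $value, $ttl);
    } else {
        $perRequest[$key] = $value;
    }
    return $value;
}

$settings = localCacheFetch('site:settings', 600, function () {
    return ['theme' => 'default', 'items_per_page' => 20];
});
```

With APCu active, the value survives across requests handled by the same server process pool; the array fallback obviously does not.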
PHP is fine for either use in my opinion, the performance overheads are rarely noticed. It's usually other processes which will delay the program. It's easy to cache PHP programs with something like eAccelerator.
As many others have noted, neither PHP nor Django is going to be your bottleneck. Hitting the hard disk for the bytecode on PHP is irrelevant for a heavily trafficked site because caching will take over at that point. The same is true for Django.
Model/View and user experience design will have order of magnitude benefits to performance over the language itself.
PHP is a language like Java etc.
The only difference is that your executable is the PHP binary and not the JVM! You can set a different maximum runtime for PHP scripts without any problems (if your shared hosting provider lets you do so).
Where your apps run shouldn't depend on the kind of server. It should depend on the resources used by the application (CPU time, RAM) and on what your server/vserver/shared host provides!
For performance tuning reasons you should have a look at eAccelerator etc.
Apache also supports modules for connection pooling! See mod_dbd.
If you need to scale (like in a cluster) you can use distributed memory caching systems like memcached!
I'm currently implementing memcached into my service but what keeps cropping up is the suggestion that I should also implement APC for caching of the actual code.
I have looked through the few tutorials there are, and the PHP documentation as well, but my main question is, how do I implement it on a large scale? PHP documentation talks about storing variables, but it isn't that detailed.
Forgive me for being uneducated in this area but I would like to know where in real sites this is implemented. Do I literally cache everything or only the parts that are used often, such as functions?
Thanks!
As you know, PHP is an interpreted language, so every time a request arrives at the server it needs to open all required and included files, parse them and execute them. What APC offers is to skip the loading and parsing steps (the files still have to be required, but they are stored in memory, so access is much, much faster), so the scripts just have to be executed. On our website, we use a combination of APC and memcached: APC to speed up the above-mentioned steps, and memcached to enable fast and distributed storing and accessing of both global variables (precomputed expensive function calls etc. that can be shared by multiple clients for a certain amount of time) and session variables. This enables us to have multiple front-end servers without losing any client state such as login status.
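For the session part, a hedged sketch of what this can look like, assuming the PECL "memcached" extension, which registers a session save handler named "memcached" and takes "host:port" pairs as its save path (the older "memcache" extension uses a different handler name and a tcp:// style path). useMemcachedSessions() and the server address are made up for this example.

```php
<?php
// Point PHP's session storage at memcached so every front end sees the same
// sessions. Returns false when the extension is missing or it is too late
// to change session settings for this request.
function useMemcachedSessions(string $servers): bool
{
    if (!extension_loaded('memcached')
        || session_status() === PHP_SESSION_ACTIVE
        || headers_sent()
    ) {
        return false;
    }
    return ini_set('session.save_handler', 'memcached') !== false
        && ini_set('session.save_path', $servers) !== false;
}

// Must run before any output and before session_start().
if (useMemcachedSessions('127.0.0.1:11211')) {
    session_start();
    $_SESSION['logged_in'] = true; // Now visible to every front-end server.
}
```

The same two settings can be placed in php.ini instead of being set at runtime. Remember that memcached may evict entries, so a session can disappear under memory pressure.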
When it comes to what you should cache... well, that really depends on your application. If you have a need for multiple frontends somewhere down the line, I would try to go with memcached for such caching and storing, and use APC as an opcode cache.
APC is both an opcode cache and a general data cache. The latter works pretty much like memcached, whereas the opcode cache works by caching the parsed PHP files, so that they won't have to be parsed on each request. That can generally speed up execution quite a bit.
You don't have to implement the opcode caching features of APC; you just enable them as a PHP module.
APC cache size and other configuration information is here.
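If you want to check from PHP itself whether the module is actually loaded and what size it was given, something like this sketch works. cacheExtensionReport() is a hypothetical helper; the extension and ini names are the ones I know from APC/APCu (both use the apc.* prefix) and from Zend OPcache, the opcode cache bundled with later PHP versions. Settings that don't exist on your install show up as NULL.

```php
<?php
// Report which caching extensions are loaded and a few of their settings.
function cacheExtensionReport(): array
{
    // ini_get() returns false for unknown settings; normalise that to null.
    $setting = function (string $name) {
        $value = ini_get($name);
        return $value === false ? null : $value;
    };

    return [
        'apc_loaded'     => extension_loaded('apc'),
        'apcu_loaded'    => extension_loaded('apcu'),
        'opcache_loaded' => extension_loaded('Zend OPcache'),
        'apc.enabled'    => $setting('apc.enabled'),
        'apc.shm_size'   => $setting('apc.shm_size'),
        'opcache.enable' => $setting('opcache.enable'),
    ];
}

foreach (cacheExtensionReport() as $name => $value) {
    echo $name, ': ', var_export($value, true), PHP_EOL;
}
```

Run it through the web server rather than the command line, since the two often load different configuration.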