Question about APC and user data - php

I have never bothered to look at caching for my projects, because they're usually small, with a hundred users at most, and the data is always changing.
Then I thought about trying Symfony, and its check_configuration.php script warned me that APC was disabled.
I went to check what APC is, and saw that its main use is opcode caching, which is good, but that it also has user data caching, which I'm not sure is something I want when any changes in the database are meant to be seen, and they happen every couple of minutes.
Could anyone explain how to disable this user data cache, or is APC something not to be used when data is always changing?

APC doesn't cache any user data unless you force it to. If APC caches and serves stale user data, that's because you've designed your application to do so. Outside of opcode caching, it's just a key-value store somewhat comparable to memcache -- it only caches what you explicitly put in it.
If Symfony has page caching behavior, you need to disable that in Symfony, not APC.
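To illustrate "it only caches what you explicitly put in it", here is a minimal sketch of an explicit get-or-compute helper. It assumes APCu (the user-cache successor to APC's apc_store/apc_fetch) and falls back to a per-process array when APCu is not enabled; the helper name cache_remember and the key 'user:1' are hypothetical.

```php
<?php
// Hypothetical helper: return a cached value, or compute and store it.
// Uses APCu when it is loaded and enabled; otherwise falls back to a
// per-process array so the sketch still runs.
function cache_remember(string $key, int $ttl, callable $compute)
{
    static $local = [];
    $useApcu = function_exists('apcu_enabled') && apcu_enabled();

    if ($useApcu) {
        $value = apcu_fetch($key, $found);
        if ($found) {
            return $value;
        }
    } elseif (array_key_exists($key, $local) && $local[$key]['expires'] > time()) {
        return $local[$key]['value'];
    }

    // Nothing was cached for this key, so do the real work and store it.
    $value = $compute();
    if ($useApcu) {
        apcu_store($key, $value, $ttl);
    } else {
        $local[$key] = ['value' => $value, 'expires' => time() + $ttl];
    }
    return $value;
}

$calls = 0;
$load = function () use (&$calls) {
    $calls++; // stands in for a database query
    return ['id' => 1, 'name' => 'example'];
};
$first  = cache_remember('user:1', 60, $load);
$second = cache_remember('user:1', 60, $load);
```

Data that never goes through a helper like this is never cached, so queries you leave alone keep hitting the database on every request.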

I don't use APC, I use eAccelerator instead, but the concepts are the same.
Opcode caches are generally good.
Content caching is tricky if your application isn't RESTful. You need a consistent relationship between your namespaces and your output to make caching meaningful.
For example, if you have an RSS feed at a URL such as http://example.com/rss.php and the content changes regularly without the URL changing, caching is much more complex than if you used http://example.com/rss.php?time=XXXXXXXXXXUTC
If all you want to do is prevent DoS attacks on a URL which uses a lot of resources and changes rarely, you can set a timeout for the content cache, and accept that it will be more or less up to date.
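A rough sketch of the two ideas above, keying the cache on the full URL and giving entries a timeout. The function name, the temp directory, and the md5-based file naming are assumptions made for illustration.

```php
<?php
// Hypothetical helper: cache rendered content per URL for $ttl seconds.
// $dir and the md5-based file naming are assumptions for illustration.
function cached_by_url(string $url, int $ttl, string $dir, callable $render): string
{
    $file = $dir . '/' . md5($url) . '.cache';

    // Serve the stored copy while it is younger than the timeout.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }

    $content = $render();
    file_put_contents($file, $content, LOCK_EX);
    return $content;
}

$dir = sys_get_temp_dir() . '/url_cache_' . uniqid();
mkdir($dir);
$renders = 0;
$feed = function () use (&$renders) {
    $renders++; // stands in for the expensive feed generation
    return "<rss>build {$renders}</rss>";
};

$a = cached_by_url('http://example.com/rss.php', 300, $dir, $feed);
$b = cached_by_url('http://example.com/rss.php', 300, $dir, $feed);        // same URL: cached copy
$c = cached_by_url('http://example.com/rss.php?time=1', 300, $dir, $feed); // different URL: new entry
```

Because the query string is part of the key, a URL that changes with the content gets a fresh entry automatically, while an unchanged URL is only as fresh as the timeout allows.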

Related

Selective Disable APC caching

I installed APC on my VPS and it works great with the W3 Cache WordPress plugin. My problem is that there is one database in MySQL which is polled by the client every few seconds to see if there are new updates. This database contains time-sensitive information and hence can't be part of the cached data.
How can I disable APC for this database or these files? Or can I set a very short expiry for certain types of data?
Any help is highly appreciated.
APC does two things. It provides a transparent cache of PHP bytecode, and it can cache data at the request of the application.
There is no reason at all to attempt to disable the bytecode cache, but that's not what you seem to be talking about here. The bytecode cache just caches bytecode, not data.
If the application you are using asks APC to cache certain data, and it does not contain an option to disable this caching if APC is installed and available, you are going to need to modify that application. Look for calls to apc_store and apc_fetch and alter the code as required.
As mentioned in the comments, your real problem is probably with the WordPress caching plugin that you've chosen, not with APC. APC just stores data. If the plugin cannot disable caching for selected pages, you may need to find a solution that can, or find another way to get to the data you need that bypasses it.
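One possible shape for "bypassing" the cache on selected requests is a small rule check run before any cache lookup. This is only a sketch: the function name and the path prefixes are hypothetical and do not correspond to any option of the plugin mentioned above.

```php
<?php
// Hypothetical rule list: request paths that must never be served from cache.
// The paths and the function name are assumptions, not part of any plugin.
function should_bypass_cache(string $requestUri, array $prefixes): bool
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return true; // unparseable URI: play safe and skip the cache
    }
    foreach ($prefixes as $prefix) {
        if (strncmp($path, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }
    return false;
}

$noCache = ['/updates/', '/api/poll'];
$poll = should_bypass_cache('/api/poll?since=42', $noCache); // time-sensitive endpoint
$blog = should_bypass_cache('/blog/hello-world', $noCache);  // ordinary page
```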

PHP ob_start vs opcode APC, explain differences and real world usage?

Premise: I'm not trying to reinvent the wheel, I'm just trying to understand.
Output caching can be implemented easily:
// GetFromMyCache returns the page if it finds the file, otherwise returns FALSE
if (($page = GetFromMyCache($page_id)) !== FALSE)
{
    echo $page; // sending out the page from cache
    exit();
}
// since we reached this point in the code, the page was not in the cache
ob_start(); // let's start buffering the output
// we process the page, getting data from the DB
// saving the processed page in the cache and flushing it out
echo CachePageAndFlush(ob_get_contents());
This is explained well in another article, and also in another answer.
But then comes APC (that will be included in PHP6 by default).
Is APC a module that once installed on the server, existing PHP code will run faster without modification?
Is APC automatic?
Then, why are there functions like apc_add?
How do we cache entire pages using APC?
When APC is installed, do I still need to do any caching on my part?
If APC is going to save hosting providers money, why do they not install it? (I mean they should be racing to install it, but I don't see that happening.)
Does installing APC have disadvantages for these hosting providers?
APC is an opcode cache:
The Alternative PHP Cache (APC) is a free and open opcode cache for
PHP. Its goal is to provide a free, open, and robust framework for
caching and optimizing PHP intermediate code.
This is not the same as a template cache (what you are demonstrating), and it has little impact on output buffering. It is not the same thing.
Opcode caching means caching the PHP code after it has been compiled into opcodes. This could be any code fragment (not necessarily something that outputs HTML). For example, you could stick classes and the template engine itself in an opcode cache. This would dramatically speed up your code, as the PHP interpreter doesn't need to parse and compile your code again; it can simply load the compiled version from the cache.
Please do not confuse output buffering with a cache. There are many levels of caching. Here are two of the most common that you may be familiar with:
Caching the session
A very basic version of this is a cookie that stores some settings. You only execute the code that "calculates" the settings once (when a user logs in), and for the rest of the session, you use the "cached" settings from the cookie.
Caching the rendered template
This is done for a page that needs to be generated once and doesn't change very often. For example, a "daily specials" page, which is a template. You only generate this once, and then serve the "rendered" page from cache.
None of these use APC.
Does APC make PHP run faster on its own?
Yes. In a way. The benefit hugely differs though.
When using APC do I still need to cache rendered HTML?
Bytecode is NOT like resulting HTML. It is the same program as a regular PHP script.
Even with APC enabled, PHP still has to process data and render HTML.
I hope you understand the difference now.
APC cache provides both byte-code cache and memory-based storage to store user data.
So, you can also use it to store some user-defined data.
And store whole rendered pages as well (I don't understand your confusion here - what is that 'page' data type you are talking about? Isn't the ob result being just a regular string?).
However, caching the resulting HTML is not as easy as you imagine.
Premature optimization is the root of all evil.
Start optimizing your site only when you have a reason.
Why are web hosts waiting to install APC?
There are several reasons. But one is enough - bytecode cache won't make any profit for the usual PHP-based ugly homepage ecommerce site.
APC caches bytecodes. PHP turns source code you write into these when a file gets requested or included, and then gets rid of them. With APC the bytecode stays around.
ob_start turns on an output buffer. It can be used to cache one effect of the program code, which is the text it prints.
Use APC if you want your program to run faster and consume less CPU power. It has no effect on database throughput.
Cache ob_start output if you only want to run the program every now and then and just statically serve its last output. This saves database throughput, at the price of information freshness and personalization.
APC is good when each page request conveys new information, or information specific to the user.
Cache ob_start output if you are running some heavyweight calculations or data access and it's okay that everyone gets the same not-quite-fresh output.
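To make the ob_start side of this comparison concrete, here is a minimal runnable sketch of caching a page's printed output in a file. The function name, the cache directory, and the "specials" page are hypothetical; it has no expiry, so it only suits content where stale output is acceptable.

```php
<?php
// Hypothetical helper: run $page once, capture what it prints, and reuse
// the captured HTML on later calls. File layout is an assumption.
function render_cached(string $pageId, string $dir, callable $page): string
{
    $file = $dir . '/page_' . md5($pageId) . '.html';
    if (is_file($file)) {
        return file_get_contents($file); // cache hit: no page code runs
    }

    ob_start();                          // start capturing output
    try {
        $page();                         // echoes the page as usual
        $html = ob_get_contents();
    } finally {
        ob_end_clean();                  // always close the buffer
    }
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}

$dir = sys_get_temp_dir() . '/page_cache_' . uniqid();
mkdir($dir);
$runs = 0;
$specials = function () use (&$runs) {
    $runs++; // stands in for database access and templating
    echo "<h1>Daily specials</h1>";
};
$first  = render_cached('specials', $dir, $specials);
$second = render_cached('specials', $dir, $specials);
```

An opcode cache would make the code inside $specials cheaper to load on every request; this helper instead avoids running it at all on a hit.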

Caching strategy for heavily used web-site

We’re in process of designing caching strategy for a heavily used web-site.
The site consists of a mix of dynamic and static content. The front end is PHP, the middle tier is Tomcat, and MySQL is on the back end.
Only user login screen is done over HTTPS to secure the credentials. After that, all content is served over plain HTTP. Some of the screens are specific to the customer (let’s say his last orders), while other screens are common to everybody (most popular products, promotions, rules, etc).
Given the expected traffic volume it’s clear that we need a comprehensive caching strategy. So we’re considering following options:
Put Squid or Varnish in front of PHP and configure it to cache all public content and even order submission form of a customer.
Use memcached by PHP to cache page fragments (such as most popular products)
Implement caching in the middle tier/tomcats (i.e. before returning content to web-servers, try to fetch it from local cache such as ehcache)
Use a PHP-level cache like Zend Cache and store fragments of the pages there. This is close to the second option that I mentioned, but it's built into the Zend framework.
It’s possible that we will use a combination of those strategies.
So the question is whether it's worthwhile to add front cache like Varnish, or just use Zend Cache inside?
Thanks again,
Philopator.
I've done quite a few projects like this and found that:
Creating a (complete) custom solution is hard and expensive. Luckily you found Squid/Varnish, memcached and ehcache.
The dynamic behaviour of sites differs a lot and you know your site best, so it makes sense to devise a specific caching strategy.
It makes sense to deploy multiple layers of cache. However, this will complicate the behavior of your site, so you should tell everybody involved with the site (e.g. business) something about it and tell your engineers a lot about it.
Think of how you're going to debug problems, e.g. add headers that indicate the freshness of the data served, and allow certain people to purge or avoid the cache.
Regularly check how the different cache layers perform (e.g. use Nagios plugins for your Varnish machines).
Measure where your performance problems are before you build any caches :)
Caching certain objects for just a short while can already be a very significant improvement.
These days I like Varnish a lot: it's a separate layer that doesn't clutter the Java/PHP code, it's fast and very flexible. Downside is that the configuration in vcl is a bit too complex.
I typically use ehcache with in-memory storage to avoid latency (e.g. database queries or service requests) with small data sets, and memcached when there's a lot of data and the cache needs to be shared by multiple nodes.
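The debugging advice above (freshness headers, letting certain people avoid the cache) could look roughly like this on the PHP side. The header names X-Cache and X-Cache-Bypass and the shared-secret check are assumptions chosen for illustration; Age is a standard HTTP header.

```php
<?php
// Hypothetical header names (X-Cache, X-Cache-Bypass) chosen for illustration;
// Age is a standard HTTP header.
function cache_debug_headers(bool $hit, int $storedAt, int $now): array
{
    return [
        'X-Cache' => $hit ? 'HIT' : 'MISS',
        'Age'     => (string) ($hit ? max(0, $now - $storedAt) : 0),
    ];
}

// Lets selected people skip the cache by sending a secret header value.
function may_bypass(array $requestHeaders, string $secret): bool
{
    return isset($requestHeaders['X-Cache-Bypass'])
        && hash_equals($secret, (string) $requestHeaders['X-Cache-Bypass']);
}

$headers = cache_debug_headers(true, 1000, 1045);
$bypass  = may_bypass(['X-Cache-Bypass' => 's3cret'], 's3cret');
$normal  = may_bypass([], 's3cret');
```

In a real response each pair would be sent with header("$name: $value"), which makes it easy to see from the browser which layer answered and how old the data is.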

MySQL query cache vs caching result-sets in the application layer

I'm running a php/mysql-driven website with a lot of visits and I'm considering the possibility of caching result-sets in shared memory in order to reduce database load.
However, right now MySQL's query cache is enabled and it seems to be doing a pretty good job since if I disable query caching, the use of CPU jumps to 100% immediately.
Given that situation, I don't know if caching result-sets (or even the generated HTML code) locally in shared memory with PHP will result in any noticeable performance improvement.
Does anyone out there have any experience on this matter?
PS: Please avoid suggesting heavy-artillery solutions like memcached. Right now I'm looking for simple solutions that don't require too much time to implement, deploy and maintain.
Edit:
I see my comment about memcached deviated answers from the actual point, which is whether caching DB queries in the application layer would result in a noticeable performance impact, considering that the results of those queries are already being cached at the DB level.
I know you didn't want to hear about memcached, but it is one of the best solutions for what you're trying to do. Depending on your site usage, there can be massive improvements in performance. By simply using memcached's session handler over my database session handler, I was able to cut the load in half and cut back on request serving times by over 30%.
Realistically, memcached is a simple solution. It's already integrated with PHP (if you have the extension loaded), and it requires virtually no configuration (I simply had to add memcached as a service on my Linux box, which is done in one or two shell commands).
I would suggest storing session data (and anything that lends itself to caching) in memcached. For dynamic pages (such as the Stack Overflow homepage), I would recommend caching output for a couple of seconds to prevent flooding.
A decent single-box solution is file-based caching, but you have to sweep expired files out manually. Other than that, you could use APC, which is very fast and in-memory (you still have to expire entries yourself, though).
As soon as you scale past one web server, though, you're going to need a shared cache, which is memcached. Why are you so adamant about not deploying this? It's not hard, and it's just going to save you time down the road. You can either start using memcached now and be done with it, or you could use one of the above methods for now and then end up switching to memcached later anyway, resulting in even more work. Plus, you don't have to deal with running a cronjob or some other ugly hack to get cache expiration features: it does that for you.
The MySQL query cache is nice, but it's not without issues. One of the big ones is that cached results are invalidated automatically every time the source data is changed, which you probably don't want.
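The "sweep them out manually" part of file-based caching might be as small as the sketch below, typically run from a cronjob. The function name, the *.cache naming, and the ten-minute limit are assumptions for illustration.

```php
<?php
// Hypothetical sweeper for a file-based cache: deletes *.cache files older
// than $maxAge seconds and returns how many were removed.
function sweep_cache(string $dir, int $maxAge, ?int $now = null): int
{
    $now = $now ?? time();
    $removed = 0;
    foreach (glob($dir . '/*.cache') ?: [] as $file) {
        if ($now - filemtime($file) > $maxAge && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

$dir = sys_get_temp_dir() . '/sweep_demo_' . uniqid();
mkdir($dir);
touch($dir . '/old.cache', time() - 3600); // pretend this entry is an hour old
touch($dir . '/new.cache');                // fresh entry
$removed = sweep_cache($dir, 600);         // drop anything older than ten minutes
```

This is the housekeeping that memcached's built-in expiration makes unnecessary.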

PHP APC, educate me

I'm currently implementing memcached into my service but what keeps cropping up is the suggestion that I should also implement APC for caching of the actual code.
I have looked through the few tutorials there are, and the PHP documentation as well, but my main question is, how do I implement it on a large scale? PHP documentation talks about storing variables, but it isn't that detailed.
Forgive me for being uneducated in this area but I would like to know where in real sites this is implemented. Do I literally cache everything or only the parts that are used often, such as functions?
Thanks!
As you know, PHP is an interpreted language, so every time a request arrives at the server it needs to open all required and included files, parse them and execute them. What APC offers is to skip the file-loading and parsing steps (the files still have to be required, but are stored in memory so access is much faster), so the scripts just have to be executed. On our website, we use a combination of APC and memcached: APC to speed up the above-mentioned steps, and memcached to enable fast and distributed storing and accessing of both global variables (precomputed expensive function calls etc. that can be shared by multiple clients for a certain amount of time) and session variables. This enables us to have multiple front-end servers without losing any client state such as login status.
When it comes to what you should cache... well, that really depends on your application. If you have a need for multiple frontends somewhere down the line, I would try to go with memcached for such caching and storing, and use APC as an opcode cache.
APC is both an opcode cache and a general data cache. The latter works pretty much like memcached, whereas the opcode cache works by caching the parsed PHP files, so that they won't have to be parsed on each request. That can generally speed up execution quite a bit.
You don't have to implement the opcode caching features of APC, you just enable them as a php module.
APC cache size and other configuration information is here.
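Since the opcode cache is enabled rather than implemented, the only code involved is checking that it is on. The sketch below reports what the current PHP process has active; checking OPcache alongside APC is an addition here (the answers only discuss APC), and the function name is hypothetical.

```php
<?php
// Reports which opcode cache, if any, is active in the current PHP process.
// Checking OPcache is an addition here; the answers only discuss APC.
function opcode_cache_status(): array
{
    return [
        'apc'     => extension_loaded('apc')
            && filter_var(ini_get('apc.enabled'), FILTER_VALIDATE_BOOLEAN),
        'opcache' => extension_loaded('Zend OPcache')
            && filter_var(ini_get('opcache.enable'), FILTER_VALIDATE_BOOLEAN),
    ];
}

$status = opcode_cache_status();
foreach ($status as $name => $active) {
    echo $name, ': ', $active ? 'enabled' : 'not enabled', PHP_EOL;
}
```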
