i'm new in PHP and want to try caching(for the first time), so i make website and it has :
dynamic home page
dynamic portfolio page
dynamic contact page
static about page
static admin page
so i read the tutorial about caching and i try to make my own caching system:
using file cache based on the what page is requested, when the page is requested the cache system will check if there's cache in cache directory if there's no cache file yet then write all the output(html) from the php script(in this case output from output buffer) and if there's cache file that corresponds with the specific id(based on URI) then just include_once() the html file.
Then i read in CodeIgniter(i make this website using CI) says there's APC for caching, then i read again about APC, what i read about APC is that it caches the DB results, but now i'm confused which should i use
what i get so far:
file caching probably would slower if there's alot of request (i dont know if this is true or not but i read it somewhere from search engine)
APC is fast
but i'm still confused which i should use , i'm on shared hosting
The levels of caching most relevant in a PHP application:
File / Script caching - The operating system will actually do this to a large extent. When a file is opened it's added to an OS-level cache. It stays there until the file is touched or the OS needs to free memory for other processes. A homegrown PHP solution isn't a good replacement for this.
Opcode caching - In order to function, PHP needs to parse and compile a script into opcodes. A mechanism like APC will cache the opcodes of every PHP script executed by Apache, provided that the cache doesn't overflow. A homegrown PHP solution build on top of APC can partially do this, but APC already does it ... so don't bother.
Query caching - If your script accesses a lot of data that doesn't change very frequently, or wherein some latency between updates and the visibility of those updates is acceptable, caching the results from complex queries is beneficial. A homegrown PHP solution built on APC is acceptable and beneficial at this level. But a database level solution is also appropriate here, and often more appropriate.
Output caching - If your page is largely deterministic and/or the same sort of latency applicable to query caching is acceptable, you can cache the entire output of the script using output buffering and APC. A homegrown PHP solution built on APC is acceptable here, but generally not necessary. If the page is static, you're probably not saving yourself any re-computation. And if it's dynamic, it's usually preferable to just re-render the page anyway.
In a dedicated or virtual-dedicated environment you'd need install APC (or something similar) yourself. But, in a shared hosting environment, it's very likely that APC is installed. And if it weren't you couldn't install it yourself anyway.
And, due to my own uncertainty, I'd recommend not performing any query or output caching with APC in a shared environment -- I'm not sure whether APC segregates caches by virtual host. Even if it does, I wouldn't assume that my site is truly a separate virtual host.
Related
Is there a way to define constants that boot into memory when the PHP process starts on the server? If so, can it be done with arrays, classes, and functions as well?
Before people start listing the different ways of declaring things that will be available across pages and scopes: I'm not asking this from a coding convenience, but rather a performance perspective.
It seems like a waste to keep loading things that never change, from pages or databases etc, on every script execution. Being a server-side process, I would think PHP has some way to read things into memory once and have them always available.
What you are looking for is an opcode cache. Opcode caches work by storing the compiled contents of PHP files in shared memory, then using that data to short-circuit the standard code parsing process when the file is included/required in the future.
The canonical opcode cache at this point is the PHP opcache extension, but a number of other opcode caches exist, including APC and XCache.
Opcode caches do not cache data from databases. There are other extensions which you can use to assist in this process, though, such as APCu which stores them in shared memory, or memcached which stores them in an external process. None of this is automatic, though, as caching data from a database requires some application knowledge to know what is useful to cache, and to handle cache invalidation when you update the database.
In general php is a single treaded environment and it does not actually run constantly (unlike java). Php starts working only when receives a command to do so. Then it parses the code into opcode and eventually executes it. And that technically happens on every request (or CLI command).
However, there are several ways how you can cache the opcode with not much efforts with APC: http://www.php.net/manual/en/book.apc.php
Premise: I'm not trying to reinvent the wheel, I'm just trying to understand.
Output caching can be implemented easily:
//GetFromMyCache returns the page if it finds the file otherwise returns FALSE
if( ($page = GetFromMyCache($page_id)) !== FALSE )
{
echo $page; //sending out page from cache
exit();
}
//since we reach this point in code, it means page was not in cache
ob_start(); //let's start caching
//we process the page getting data from DB
//saving processed page in cache and flushing it out
echo CachePageAndFlush(ob_get_contents());
explained well in another article, and also in another answer.
But then comes APC (that will be included in PHP6 by default).
Is APC a module that once installed on the server, existing PHP code will run faster without modification?
Is APC automatic?
Then, why are there functions like apc_add?
How do we cache entire pages using APC?
When APC is installed, do I still need to do any caching on my part?
If APC is going to save hosting providers money, why do they not install it? (I mean they should be racing to install it, but I don't see that happening.)
Does installing APC have disadvantages for these hosting providers?
APC is an opcode cache:
The Alternative PHP Cache (APC) is a free and open opcode cache for
PHP. Its goal is to provide a free, open, and robust framework for
caching and optimizing PHP intermediate code.
This is not the same as a template cache (what you are demonstrating), and it has little impact on output buffering. It is not the same thing.
Opcode caching means cache the PHP code after it has been interpreted. This could be any code fragment (not necessarily something that outputs HTML). For example, you could stick classes and the template engine itself in an opcode cache. This would dramatically speed up your code, as the PHP interpreter doesn't need to "interpret" your code again, it can simply load the "interpreted" version from the cache.
Please do not confuse output buffering with a cache. There are many levels of caching, for example, two of the most common that you may be familiar with.
Caching the session
A very basic version of this is a cookie that stores some settings. You only execute the code that "calculates" the settings once (when a user logs in), and for the rest of the session, you use the "cached" settings from the cookie.
Caching the rendered template
This is done when a page that needs to be generated once, but doesn't change very often. For example a "daily specials" page, which is a template. You only generate this once, and then serve the "rendered" page from cache.
None of these use APC
Is APC makes the PHP to run faster on its own?
Yes. In a way. The benefit hugely differs though.
When using APC do I still need to cache rendered HTML?
Bytecode is NOT like resulting HTML. It is the same program as a regular PHP script.
Even with APC enabled, PHP have to process data and render HTML.
I hope you understand the difference now.
APC cache provides both byte-code cache and memory-based storage to store user data.
So, you can also use it to store some user-defined data.
And store whole rendered pages as well (I don't understand your confusion here - what is that 'page' data type you are talking about? Isn't the ob result being just a regular string?).
However, caching of the resulting HTML is not that easy as you imagine.
Premature optimization is the root of all evil.
Start optimizing your site only when you have a reason.
why are Web Hosters waiting to install APC?
There are several reasons. But one is enough - bytecode cache won't make any profit for the usual PHP-based ugly homepage ecommerce site.
APC caches bytecodes. PHP turns source code you write into these when a file gets requested or included, and then gets rid of them. With APC the bytecode stays around.
ob_start turns on an output buffer. It can be used to cache one effect of the program code, which is the text it prints.
Use APC if you want your program to run faster and consume less CPU power. It has no effect on database throughput.
Cache ob_start output if you only want to run the program every now and then and just statically serve its last output. This saves database throughput, at the price of information freshness and personalization.
APC is good when each page request conveys new information, or information specific to the user.
Cache ob_start output if you are running some heavyweight calculations or data access and it's okay that everyone gets the same not-quite-fresh output.
A very noob question here.
I just installed APC and when I go to the monitoring page (apc.php) and click on the "System Cache Entries" tab I can see a lot of pages on that list after browsing my app that's hosted on the server. To test I restarted apache and all the cache entries were gone but as soon as I started browsing more pages on my app again they started appearing on that list.
I didn't make any changes to my code, so is this all I have to do to enable optcode caching? Or are changes to my code also necessary?
I ask because my app is using codeigniter and there is a page in the codeigniter docs on caching docs that describes code changes:
http://codeigniter.com/user_guide/libraries/caching.html
APC stores opcode caches as they are parsed. As you have already discovered the caches only persist as long as apache remains open. But when an opcode cache is missing for a page that is requested, APC will store it for as long as Apache remains running. However, opcode caches are only half the battle. While it is true you will receive a speed increase from caching opcode, a lot of time in PHP is lost to file input/output and socket communications (ie. database queries). As long as you can be sure that your script is the only resource which will be modifying the database or a file, you can safely caching database query results or file contents so that each request doesn't need to touch the filesystem or database layer. The logic for this uses some APC functions:
if(apc_exists('some_database_value')) {
$value = apc_fetch('some_database_value');
} else {
//Query db for value, store in $value
apc_store('some_database_value', $value);
}
The only disadvantage to this solution is if you need to modify any cached resource outside of the PHP script, you will need to clear the APC cache from the CLI.
No, APC requires no code changes to accellerate the actual running of the code; for a bit more information, see for example this answer
With APC, you first get an opcode cache -- for that part, you having
nothing to modify in your code : just install the extension, and
enable it.
The opcode cache will generally speed up things : it prevents the PHP
scripts from being compiled again and again, by keeping the opcodes --
the result of the compilation of the PHP files -- in memory.
I'm currently implementing memcached into my service but what keeps cropping up is the suggestion that I should also implement APC for caching of the actual code.
I have looked through the few tutorials there are, and the PHP documentation as well, but my main question is, how do I implement it on a large scale? PHP documentation talks about storing variables, but it isn't that detailed.
Forgive me for being uneducated in this area but I would like to know where in real sites this is implemented. Do I literally cache everything or only the parts that are used often, such as functions?
Thanks!
As you know PHP is an interpreted language, so everytime a request arrives to the server it need to open all required and included files, parse them and execute them. What APC offers is to skip the require/include and parsing steps (The files still have to be required, but are stored in memory so access is much much faster), so the scripts just have to be executed. On our website, we use a combination of APC and memcached. APC to speed up the above mentioned steps, and memcached to enable fast and distributed storing and accessing of both global variables (precomputed expensive function calls etc that can be shared by multiple clients for a certain amount of time) as well as session variables. This enables us to have multiple front end servers without losing any client state such as login status etc.
When it comes to what you should cache... well, that really depends on your application. If you have a need for multiple frontends somewhere down the line, I would try to go with memcached for such caching and storing, and use APC as an opcode cache.
APC is both an opcode cache and a general data cache. The latter works pretty much like memcached, whereas the opcode cache works by caching the parsed php-files, so that they won't have to be parsed on each request. That can generally speed up execution time up quite a bit.
You don't have to implement the opcode caching features of APC, you just enable them as a php module.
APC cache size and other configuration information is here.
What is the best way of implementing a cache for a PHP site? Obviously, there are some things that shouldn't be cached (for example search queries), but I want to find a good solution that will make sure that I avoid the 'digg effect'.
I know there is WP-Cache for WordPress, but I'm writing a custom solution that isn't built on WP. I'm interested in either writing my own cache (if it's simple enough), or you could point me to a nice, light framework. I don't know much Apache though, so if it was a PHP framework then it would be a better fit.
Thanks.
You can use output buffering to selectively save parts of your output (those you want to cache) and display them to the next user if it hasn't been long enough. This way you're still rendering other parts of the page on-the-fly (e.g., customizable boxes, personal information).
If a proxy cache is out of the question, and you're serving complete HTML files, you'll get the best performance by bypassing PHP altogether. Study how WP Super Cache works.
Uncached pages are copied to a cache folder with similar URL structure as your site. On later requests, mod_rewrite notes the existence of the cached file and serves it instead. other RewriteCond directives are used to make sure commenters/logged in users see live PHP requests, but the majority of visitors will be served by Apache directly.
The best way to go is to use a proxy cache (Squid, Varnish) and serve appropriate Cache-Control/Expires headers, along with ETags : see Mark Nottingham's Caching Tutorial for a full description of how caches work and how you can get the most performance out of a caching proxy.
Also check out memcached, and try to cache your database queries (or better yet, pre-rendered page fragments) in there.
I would recommend Memcached or APC. Both are in-memory caching solutions with dead-simple APIs and lots of libraries.
The trouble with those 2 is you need to install them on your web server or another server if it's Memcached.
APC
Pros:
Simple
Fast
Speeds up PHP execution also
Cons
Doesn't work for distributed systems, each machine stores its cache locally
Memcached
Pros:
Fast(ish)
Can be installed on a separate server for all web servers to use
Highly tested, developed at LiveJournal
Used by all the big guys (Facebook, Yahoo, Mozilla)
Cons:
Slower than APC
Possible network latency
Slightly more configuration
I wouldn't recommend writing your own, there are plenty out there. You could go with a disk-based cache if you can't install software on your webserver, but there are possible race issues to deal with. One request could be writing to the file while another is reading.
You actually could cache search queries, even for a few seconds to a minute. Unless your db is being updated more than a few times a second, some delay would be ok.
The PHP Smarty template engine (http://www.smarty.net) includes a fairly advanced caching system.
You can find details in the caching section of the Smarty manual: http://www.smarty.net/manual/en/caching.php
You seems to be looking for a PHP cache framework.
I recommend you the template system TinyButStrong that comes with a very good CacheSystem plugin.
It's simple, light, customizable (you can cache whatever part of the html file you want), very powerful ^^
Simple caching of pages, or parts of pages - the Pear::CacheLite class. I also use APC and memcache for different things, but the other answers I've seen so far are more for more complete, and complex systems. If you just need to save some effort rebuilding a part of a page - Cache_lite with a file-backed store is entirely sufficient, and very simple to implement.
Project Gazelle (an open source torrent site) provides a step by step guide on setting up Memcached on the site which you can easily use on any other website you might want to set up which will handle a lot of traffic.
Grab down the source and read the documentation.