Is it problematic to cache files for too long? - php

I discovered https://developers.google.com/speed/pagespeed/ the other day and have improved my website's page speed from ~75 to ~95 now.
One of the last few things it recommends is that I:
Leverage browser caching: Setting an expiry date or a maximum age in the HTTP headers
for static resources instructs the browser to load previously downloaded resources
from local disk rather than over the network.
The cache time for my main javascript and css files is set to 2 days, Google suggests I set it to at least 1 week. They also suggest that I do the same for html and php files.
What would happen to my users if I decided to make a large website change and they had just cached my website yesterday (for 1 week)? Would they not see the changes on my website until 1 week later?
Also, since my website contains a control panel and has some dynamically generated PHP pages, is there any reason for caching any of it? Wouldn't my server still be churning through php script and generating new content every time they logged into their account?

You probably doesn't want to cache your HTML and PHP in visitors browsers. However you might want to cache that in a layer you have more control over, like PHP opcode caching with APC and a reverse proxy like Varnish.
For the static assets, like your JavaScript and CSS files, it should be safe to cache them a year or more. If you make a change to them you can just update their URL to say mystyles.css?v=123 and browsers will think it's a whole different file from mystyles.css?v=122 or even just mystyles.css.

Related

php header location vs file_get_contents performace

I have a website where the client search a term and results are retrieved through an ajax request. On php side, the called script check the date of the cache (cache are files) and if it's older than an established time it refreshes the results, else it return the cache file content: die(file_get_contents($cache_path));
The cache time is a few hours, an to refresh it takes just a few seconds, so the greatest part of the requests will end up in cache response.
So I thought that using header("location: $cache_path"); would be less stressful for the server, because it simply tells the browser to get the contents from the cache file without passing it through the script.
The downside is that the cache file path would become public (which is not biggest problem ever, because the content is the same), but, you know, it's never good to give the resources locations...
So, performance wise, is there a big difference between file_get_contents and redirecting? The average cache file size is 120kb... Any other ideas and suggestions?
You can use "internal redirect": through X-Accel-Redirect header for nginx or X-Sendfile for Apache. In this case you don't show any additional URLs to a client and don't deal with cache files in you script.
For configuration details you can read an official documentation or, of course, other SO questions (like this one).

Caching website by Codeigniter or 3rd party caching web application?

I'm very new to the concept "caching", so excuse me if my question is too simple.
So,I'm using Codeigniter(PHP framework) and it supports page caching, simply by doing this $this->output->cache(n)//n: number of minutes to remain cached
(I think) Codeigniter's caching will store any requested page in a cache file, and get the page when needed immediately.
Also there's a 3rd part web application called Vanish Cache, it sits between Apache and the client, then it will cache requested pages and re-send them again when needed, isn't that the same thing Codeigniter does, or is it different from that?
Wouldn't it be a waste to cache each page twice, by Codeigniter and Vanish?
assuming they do the exact same thing(cache pages and re-send them to the user),which one is more efficient for dynamic(database driver) websites?
On the surface, they do the same thing, however there are appropriate uses for different levels of cache.
A cache like Varnish, that sits between the web server and the application, provides very high performance. You use it for static content like CSS, static pages and dynamic content that changes very rarely.
An application cache provides a less performant but far more flexibile option. Usually you can cache by time, but also by application/request variables like "current user". This allows you to provide an state-dependant cache with a lot more fine control. For example, you could cache an object's detail page by it's last modified time in the database.

Use a cached version of a php page unless the database has changed

I've looked at similar questions about caching in PHP and I'm still stumped as to how to check whether the database has changed without making a new call to the database, which would defeat the point of caching.
I understand technically how to implement caching in PHP -- using ETag and Last Modified headers, output buffering, storing static files, etc. What is tripping me up is how to determine when to serve up a new version of a page instead of a cached version. If the database content has changed, I want to show the new version and not the cached version.
For example, let's say I have a page that displays details about a product. Generally, once the product info is stored in the database, it won't change much. But occasionally there might be an edit to the product description or a price change. If the product has a new price, I don't want to show the user the old price by using a cached version of the page. For that reason, updating the cached content every hour doesn't seem sufficient. Not to mention that that's too often for the content that doesn't change, the real problem is that it won't update the content fast enough when there is a change.
So should I store something (e.g., an ETag value or a static html file) every time the product database is updated through a form in the Admin area of the application? What am I missing here?
[Note: Not interested in using a caching library here. I'd like to learn how to do it in straight PHP for now.]
Caching is a pretty complex topic, because you can cache all sort of data in various places. Usually you implement caching to relieve bottlenecks in your server structure.
In your setup you can cache data at three different locations:
1) Clientside, between client and server
You would use this method to save bandwidth and shorten loading times for the user. You can achieve this by setting cache related fields into the http header (Cache-Control, Expires, ETag and so on).
If you use Cache-Control or Expires, the decision wether to load an updated version from the server or not purely depends on the client. So even if there is a new version available, the user won't see it. On the plus side you are saving lots of cpu cycles on the server, because your php script won't be executed.
If you use ETag, you can inform the client on each request, if the version of the requested content has changed. But your php script will be executed on each request, even if the ETag is unchanged.
2) Serverside, between client and server
This kind of caching primarily reduces high cpu load on your server. It won't affect the amount of traffic generated between client and server.
You can use a client proxy like Varnish to store rendered responses on the server side. The good thing is, that you have full control over the cache. If an updated version of a requested content is available, you can simply purge the old version from the cache, so that a new version is generated from your php script and stored in your cache.
Every response that is cacheable will only be generated one time and then be served from cache to the clients.
3) In your application
If you are heavily using your database, you should consider using a fast key value store like memcached to cache query results. Of course you have to adjust your database classes for this (first ask memcached, if memcached doesn't have the result ask the database and store the result into memcached), but the performance gain will be quite impressive, because memcached is really fast.
Sometimes it even makes sense to store data solely in memcached, if the data doesn't has to be persisted permanently (php sessions for example).
I had also faced the same problem long back (I dont know if you will find my way to be correct).
Why I needed the cache :-
What my site use to do was, it use to update the database by running the script on cron.php file and index.php use to show the listing from database (this use to take ages to load )
my solution :-
every time a new list was created or updated I unlinked the cache file then on index.php page I checked if cache file exists load cache or else load content from database also at the same time write this data to the cache file so next time when user requests for index.php file

Caching, CDN - are they the same for PHP Yii site? How to use it for a dynamic php site?

I have a dynamic php (Yii framework based) site. User has to login to do anything on the site. I am trying to understand how caching and CDN work; and I am a bit confused.
Caching (memcache):
My site has a good amount of css, js, and images. I've been given to understand that enabling caching ("memcache"?) will GREATLY speed up my site. But this has me confused. How does caching help? I mean, how can you cache something that's coming out of DB for each user separately? For instance, user-1 logs-in, he sees his control panel. User-2 logs-in, user 2 will see their control panel.
How do I determine what to cache? Plus, how do I enable caching (memcaching)?
CDN:
I have been told to use a content delivery network like CloudFlare. It is suppose to automatically cache my site. So, when my user-1 logs in, what will it cache? Will it cache only the homepage CSS, JS, and homepage images? Because everything else requires login? What happens when user logs-out? I mean, do "sessions" interfere with working of a CDN?
Does serving up images via CDN reduce significant load on my server? I don't have much cash for getting a clustered-server configuration. So, I just want my (shared) server to be able to devote all its resources in processing PHP code. So, how much load can I save by using "caching" (something like memcache) and/or "CDN" (something like CloudFlare)?
Finally,
What would be general strategy to implement in this scenario for caching, cdn, and basic performance optimization? do I need to make any changes to my php-code to enable CDN like CloudFlare and to enable/implement/configure caching? What can I do that would take least amount of developer/coding time and will make my site run much much faster?
Oh wait, some of my pages like "about us" page etc. are going to be static html too. But they won't get as many hits. Except for maybe the iFrame page that will be used for my Facebook Page.
I actually work for CloudFlare & thought I would hop in to address some of the concerns.
"do I need to make any changes to my php-code to enable CDN like
CloudFlare and to enable/implement/configure caching? What can I do
that would take least amount of developer/coding time and will make my
site run much much faster?"
No, nothing like a need to re-write urls, etc. We automatically cache static content by file extension. This does require changing your DNS to point to us, however.
Does serving up images via CDN reduce significant load on my server?
Yes, and it should also help most visitors access the site faster and save you a fair amount on bandwidth.
"Oh wait, some of my pages like "about us" page etc. are going to be
static html too."
CloudFlare doesn't cache HTML by default. You use PageRules to setup more advanced caching options for things like static HTML.
Caching helps because instead of performing disk io for each user the data is stored in the memory, ie memcached. This provides a SIGNIFICANT increase in performance.
Memcache is generally used for cacheing data ie query results.
http://pureform.wordpress.com/2008/05/21/using-memcache-with-mysql-and-php/
There are lots of tutorials.
I have only ever used amazon s3 which is is not quite a cdn. It is more of a storage platform but still it helps to take the load off of my own servers when serving media.
I would put all of your static resources on a CDN so your own server would not have to serve these. It would not require any modifcation to your php code. This includes JS, and CSS.
For your static pages (your about page) I'd make sure that php isn't processing that since there is no reason for it. Your web server should serve it directly.
Cacheing will require changes to your code. For cacheing a normal flow is:
1) user makes a request
2) check if data is in cache
3) if it is not in cache do the DB query and put it in cache
4) if it is in cache retrieve it
5) return data.
You can cache anything that requires disk io and you should see a speed up.
Memcached works by storing database information (usually from a remote server or even a database engine on the same server) in a flat file format in the filesystem of the web server. Accessing a flat file directly to retrieve data stored in a regulated format is much much muuuuuch faster than accessing that data from a remote query each time. This is typically useful when you have data that can be safely stored for certain periods of time as it is not subject to regular changes.
The way this works is that if you want to store a user's account information in a cache to speed up loading pages where that user is logged in. You would load the information and cache it locally. On any subsequent requests for that data, it will load in a fraction of the time it normally would take to load that information from the database itself. Obviously you will need to make sure that you update/recache that information if the user changes it while logged in, but you will greatly reduce the time it takes to serve up pages if you implement a caching system that can minimize the time spent waiting on the database.
I'm personally not familiar with CloudFlare so I can't offer any advice to that effect, but in terms of implementing caching in your application, you should check out:
http://code.google.com/p/memcached/wiki/NewOverview
And read the rest of the Wiki entries there which cover installation/implementation/etc. That should get you started on the right track.

php "page caching" solution suggestions for CMS Applications

Most examples use time-based cache expiration. I'd like to read more about file caches (where the database is called only when there is no file in a given directory). This is for a basic information site with CMS functions made with php/mysql. My searches are returning too many sites on web applications. Adding CMS to the search returns script repositories. I'd appreciate your suggestions.
It's not hard to write something like this yourself. Use file_exists() to check whether a specific file exists, or glob() how many files matching a given pattern there are.
I use a page build system...
Each page created is given a guid - when a request comes in for the page check to see if a file in the cache named GUID.xxx serve it if not build the page and cache.
On editing a page (or if its expiration has passed) delete the file from the cache.
You can elaborate at will as to how the expiration is determined/administered and what protions of the page to cache and which to allow dynamic builds per request...
I'm not quite sure what you're looking for.
If you're talking about generating a page (from the CMS) and placing it at the requested URI (so the next request bypasses even the CMS) - it's possible, but you make refreshing the 'cache' a little difficult.
However, what you may be looking for is just a server side cache (as opposed to telling the browser how long to cache a page). Those are usually file or memory based, and if you place the caching mechanism high in the CMS flow (perhaps where it processes the requests), you'll be caching a large portion of page creation.
Some cache libraries let you set an unlimited lifetime (for example Zend_Cache), leaving the cache maintenance up to you. That may be what you're looking for.

Categories