HTACCESS image caching rule that checks the image modification time - php

I'm serving images two different ways:
Using a PHP script for profile pictures for example
By pointing to them directly, for icons and backgrounds for example
I'm in the process of setting up caching for them properly, and I'm totally new to this.
For the PHP script, I add a Last-Modified header to the response and return a 304 status code on subsequent requests if the file hasn't changed (using filemtime()).
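For illustration, the PHP side described above could look roughly like the sketch below. The function names is_not_modified() and serve_image() are hypothetical, and the JPEG content type is an assumption; the asker's actual script is not shown.

```php
<?php
// Hypothetical helper: returns true when the client's If-Modified-Since
// date is at or after the file's modification time, i.e. a 304 is enough.
function is_not_modified(int $mtime, ?string $ifModifiedSince): bool
{
    if ($ifModifiedSince === null || $ifModifiedSince === '') {
        return false; // no validator sent, so send the full file
    }
    $since = strtotime($ifModifiedSince);
    // strtotime() returns false when the date cannot be parsed
    return $since !== false && $since >= $mtime;
}

// Hypothetical entry point: sends either a 304 or the image itself.
function serve_image(string $path): void
{
    $mtime = filemtime($path);
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');

    if (is_not_modified($mtime, $_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null)) {
        http_response_code(304); // headers only, no body
        return;
    }

    header('Content-Type: image/jpeg'); // assumption: all images are JPEGs
    header('Content-Length: ' . filesize($path));
    readfile($path);
}
```

Keeping the comparison in its own function makes it easy to check without sending any headers.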
For direct access, I'm using .htaccess, but none of the rules I've seen so far let me do the same as in my PHP script (check whether the file has changed, then serve either a 304 or the file itself).
Here's the .htaccess rule I'm planning to use:
Header unset Pragma
FileETag None
Header unset ETag
# cache images/pdf docs for 10 days
<FilesMatch "\.(ico|pdf|jpg|jpeg|png|gif)$">
Header set Cache-Control "max-age=864000, public, must-revalidate"
Header unset Last-Modified
</FilesMatch>
From what I understand, the only way of updating a cached image is to rename it. Does anyone know a way around that, for instance by checking the image's last modification date?

You could use mod_expires, if available:
<FilesMatch "\.(ico|pdf|jpg|jpeg|png|gif)$">
# mod_expires only emits headers when ExpiresActive is On
ExpiresActive On
ExpiresDefault "modification plus 10 days"
</FilesMatch>

What you are doing in PHP, Apache should already do automatically for static files. It sets the Last-Modified header and responds with 304 if it finds an If-Modified-Since header in the request and the file has not changed since that date. This happens automatically and is revalidation rather than caching: it does not prevent repeated requests to your server, it only saves bandwidth (and loading time for the user) by returning a bare 304 instead of the whole file when the file has not been modified.
To prevent those repeated requests, the browser (and any proxy servers) have to cache the file. You can control caching via HTTP headers or, for HTML, also via META tags. When you specify that a file is cacheable for one week, the browser won't contact your server for it for one week (although most browsers are set to revalidate cache entries on first access after startup).
So you either live with the possibility that some users will see an old cached copy for some time (depending on the expiry header), or you change the URL, as Gerben suggested. Only then can you be 100% sure that everyone gets the new version (this is important for JavaScript, because mixing old and new JS files can cause very strange errors). Nowadays almost every high-performance website uses the file.ext?v=3 approach, so that the expiry header can be set to a large value such as 6 months.
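One way to get file.ext?v=3 style URLs without maintaining version numbers by hand is to derive the version from the file's modification time. This is only a sketch: versioned_url() and its parameters are hypothetical names, and it assumes the URL path maps directly onto a file under the document root.

```php
<?php
// Hypothetical helper: appends the file's modification time as a version
// parameter, so the URL changes whenever the file changes.
function versioned_url(string $url, string $docRoot): string
{
    $file = $docRoot . $url;
    if (!is_file($file)) {
        return $url; // unknown file: leave the URL untouched
    }
    return $url . '?v=' . filemtime($file);
}

// Possible usage in a template (the path is illustrative):
// echo '<img src="' . versioned_url('/img/logo.png', $_SERVER['DOCUMENT_ROOT']) . '">';
```

Because the query string changes as soon as the file is modified, the long expiry time never leaves visitors with a stale copy.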

As @Gumbo pointed out, "Apache should already do that for static files".
And that's true, Apache does that, so this kind of rule works fine:
<FilesMatch "\.(ico|pdf|jpg|jpeg|png|gif)$">
Header set Cache-Control "max-age=864000, public, must-revalidate"
</FilesMatch>
PS: Sorry @Gumbo. I asked you to change your answer so that I could accept it, but you didn't, and I eventually had to close the question.

Related

Apache2 Headers not working correctly

I'm having issues with iframes (which I have no control over, as they come with the system I use) and cross-site restrictions.
I added the usual X-Frame-Options header to my .htaccess file, with a directive that allows the other system to iframe the site. No problem at first.
<IfModule mod_headers.c>
Header always set X-Frame-Options "ALLOW-FROM https://otherhost"
</IfModule>
I can confirm that the above is taking effect: when I change the header content, the change is reflected in the response.
For some reason, though, I keep seeing the header X-Frame-Options: ALLOW-FROM https://otherhost, SAMEORIGIN. The additional SAMEORIGIN makes the value invalid, so browsers fall back to DENY and the iframe is not shown.
The Apache documentation says of the set option:
The response header is set, replacing any previous header with this name. The value may be a format string.
Yet I do not see it replacing the value. If I curl the login page, the header is correct; if I inspect the page in the Chrome or Safari inspector, it shows the additional ", SAMEORIGIN" and then complains that the value is not valid.
I've even tried the unset option of the Header directive, but the header is still produced.
Is the Header directive applied before or after the output is generated? This is driving me nuts and wasting a lot of time on a simple thing.

Do web browsers cache HTML files and PHP generated files differently?

I'm using Nginx as web server and Firefox to view response headers. For testing, I had two files on the server with the same content: test.html and test.php. In the Nginx configuration file, the expires directive is set to 30d in the server context.
When accessing test.html in a web browser multiple times, the browser first obtains a 304 Not Modified response and serves a copy cached in the browser. However, when accessing test.php, the browser always makes a full request to the server (200 OK) without using the browser cache.
The questions are:
Is the behaviour (i.e. different treatment of HTML and PHP generated files) normal?
What could be done to make web browsers cache HTML and PHP generated files in the same way?
For the static file, nginx sets response headers that include:
Cache-Control
Expires
Last-Modified
Cache-Control tells the client (at least) how to cache the content.
Expires and Last-Modified allow the client to determine when to fetch new content.
What you must do is ensure that PHP sends the same headers, or at least sensible equivalents. Now that you know which headers matter, inspecting the requests in your browser will show you how to achieve this.
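As a rough sketch of what "sending the same headers" from PHP might look like, the hypothetical helper cache_headers() below builds the three headers for a given lifetime. The 30-day value mirrors the expires 30d setting from the question; using the script's own modification time as Last-Modified is an assumption.

```php
<?php
// Hypothetical helper: builds caching headers for a response that was
// last modified at $lastModified and may be cached for $maxAge seconds.
function cache_headers(int $lastModified, int $maxAge, int $now): array
{
    return [
        'Cache-Control: max-age=' . $maxAge,
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
        'Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT',
    ];
}

// Possible usage at the top of test.php, before any output:
// foreach (cache_headers(filemtime(__FILE__), 30 * 86400, time()) as $line) {
//     header($line);
// }
```

Note that these headers only tell the browser how long it may reuse its copy. To answer a revalidation with 304 Not Modified, the script would also have to compare the request's If-Modified-Since header itself, which nginx does automatically for static files.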

php caching and .htaccess caching

There is another thread similar to this that was closed and that didn't have any useful information in it: https://stackoverflow.com/questions/11955822/php-file-caching-vs-cache-through-htaccess
Is it necessary to implement a php caching system if you are caching through .htaccess? Here is my current .htaccess caching:
<IfModule mod_headers.c>
# Cache Media Files
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|swf|mp3|mp4)$">
Header set Cache-Control "public"
Header set Expires "Mon, 20 Apr 2015 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>
# Cache JavaScript & CSS
<FilesMatch "\.(js|css)$">
Header set Cache-Control "public"
Header set Expires "Mon, 20 Apr 2015 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>
# Disable Caching for Scripts and Other Dynamic Files
<FilesMatch "\.(pl|php|cgi|spl|scgi|fcgi)$">
Header unset Cache-Control
</FilesMatch>
</IfModule>
With this file caching in place, will building a PHP caching system improve my site even more? Or would it make more sense to compress data in .htaccess and use PHP to cache? I'm trying to understand which method of caching will improve a site more, or whether using both is recommended.
For static files, you can cache them with HTTP headers, for example set in .htaccess.
Browsers will then cache them on the visitor's machine.
For dynamic content generated by PHP, you can cache widgets and objects to reduce the number of queries sent to the MySQL database.
You can try this one. In this example, $products is cached for 600 seconds (10 minutes), so PHP sends only one query to the database in that period. If you have 500 visitors online, the query triggered by the first visitor serves all 500.
<?php
include("php_fast_cache.php");
// Try to get the products from the cache first.
$products = phpFastCache::get("products_page");
if ($products == null) {
    // Cache miss: run your own DB query here.
    // get_products_from_db() is a placeholder for your own function.
    $products = get_products_from_db();
    // Store the products in the cache for 600 seconds = 10 minutes.
    phpFastCache::set("products_page", $products, 600);
}
foreach ($products as $product) {
    // Output your contents here
}
?>
If you are using WordPress, look at the cache plugins: they cache your content with PHP (files or Memcache) and cache your images, CSS and JS via .htaccess.
Using both together speeds up the site and saves bandwidth and CPU.
You're doing client-side caching for static files only.
Caching in PHP solves a completely different problem: server-side performance issues in your application. So you should use it if your site is loading too slowly or if you're causing high server load.
There are many strategies for implementing server-side caching, and it's up to you to choose what best fits your application.
For example, you can cache SQL query results or the HTML output of whole web pages. Do not forget about cache invalidation when your data changes.
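To make the invalidation point concrete, here is a minimal sketch of a cache with a time-to-live and explicit invalidation. TtlCache is a hypothetical class and keeps its entries in a plain array for illustration only; a real setup would back it with files, APCu or Memcached so that entries survive between requests.

```php
<?php
// Hypothetical in-memory cache with expiry and explicit invalidation.
final class TtlCache
{
    /** @var array<string, array{value: mixed, expires: int}> */
    private array $items = [];

    // Store $value under $key for $ttl seconds, counted from $now.
    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    // Return the cached value, or null when it is missing or expired.
    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] <= $now) {
            unset($this->items[$key]);
            return null;
        }
        return $this->items[$key]['value'];
    }

    // Call this whenever the underlying data changes.
    public function invalidate(string $key): void
    {
        unset($this->items[$key]);
    }
}
```

The idea is to call invalidate() in the same code path that writes to the database, so readers never wait for the TTL to run out before seeing fresh data.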

How to add far-future expires headers to minified cssfiles/scripts?

How do I add far-future Expires headers to minified CSS files and scripts? I am using Minify to combine CSS files and JavaScript files, but the minified versions don't have a far-future expiration date.
If you're using Apache, then this sort of thing is the way to go. There are several different ways to do it depending on the modules installed; some make it easier than others. I recommend the mod_expires and mod_headers modules (both are required for the example below, but you can pull it off with only mod_headers if you really want).
<LocationMatch "/js/.*\.js$">
# mod_expires only emits headers when ExpiresActive is On
ExpiresActive On
ExpiresDefault "access plus 10 years"
Header set Cache-Control "public"
</LocationMatch>
This example matches all files in /js/ that end with a .js extension and sets an expiry time 10 years into the future, relative to the time the file is accessed. It also explicitly sets Cache-Control to public; we run everything over SSL, so it might not be necessary otherwise, but it won't hurt you either way.
This example can easily be extended to match your CSS locations and files as well; just copy, paste and change the LocationMatch.
There are plenty of sites that will give you a full rundown on this; check out this one, "Caching Tutorial", which seems to cover it all.
/min/README.txt has documentation for sending far-future expires headers.
Minify can send far-future (one year) Expires headers. To enable this you must
add a number to the querystring (e.g. /min/?g=js&1234 or /min/f=file.js&1234)
and alter it whenever a source file is changed. If you have a build process you
can use a build/source control revision number.
You can alternately use the utility function Minify_getUri() to get a "versioned"
Minify URI for use in your HTML.
That depends on what web server you are using. It can't be done by modifying the CSS or script files themselves, though.

How to get grade A on these Yslow rules?

Use a Content Delivery Network (CDN)
Compress components with gzip
Configure entity tags (ETags)
Add Expires headers
How can I do this if I don't have access to the Apache configuration?
Use a Content Delivery Network (CDN)
This involves changing your hosting (for at least some files)
Compress components with gzip
Configure entity tags (ETags)
Add Expires headers
You can either:
Get access to your Apache configuration
Get someone who does have access to it to change it
I found "HOW TO SPEED UP YOUR SITE AND GET A YSLOW GRADE" useful. Hope this helps.
If you have grade A on every other YSlow rule then you're doing pretty well already and don't need to worry about those items. By the way, you can create custom rulesets in YSlow that are more tailored to your needs and server setup. So if you can't change any of these things, just remove them from the rules that YSlow uses.
Use a Content Delivery Network (CDN)
You can add your site domain as a CDN in YSlow. The idea of this one is to store static components on different domains to increase "parallelisation" (downloading more files at once). If you are using limited hosting then you could open a separate account and host some files there on a different domain.
Compress components with gzip
You can do this in PHP, using ob_start('ob_gzhandler'); at the very start of your scripts. This is a little more resource intensive so use Apache if possible.
Configure entity tags (ETags)
Remove this from the rule list, it's not necessary in 90% of cases. Yahoo only says to remove them because in the rare situation you have multiple servers in the back-end, the same file might have a different ETag if it comes from a different server. When each file comes from one server then ETags are a good thing and removing them is detrimental.
Add Expires headers
If you have no access to the server then you probably won't be able to change this. Ask your host about it. You may be able to override the server setting in your .htaccess file. You'd need the mod_expires Apache module. This page has some usage examples.
Paste this code at the bottom of your .htaccess file:
RewriteEngine On
# BEGIN Mod Header
# Turn on Expires and set default expires to 10 years
ExpiresActive On
# END Mod Header
# BEGIN Cache Control
Header set Expires "Sun, 15 Apr 2012 20:00:00 GMT"
Header unset ETag
FileETag None
# END Cache Control
