How to send Akamai cache expiration headers? - php

I have a site where each time you upload an image it gets rendered in various frame sizes. A cron job runs every 10 minutes which looks to see if any new images have been uploaded during that time and if so it generates all the needed frames.
Since this cron runs every 10 minutes, there is a gap between the time the content (such as an article) goes live and the time the images become available. In the meantime a generic placeholder image with the site's logo is shown.
Since Akamai caches images, when a site user loads a page containing an image that hasn't been rendered by the cron yet, the static placeholder is served for that image path and Akamai caches it. Even when the image is later rendered and available, users will still get the cached placeholder from Akamai.
One solution is to bust the cache for these images once the cron has rendered them, but it takes Akamai about 8 minutes to come back for the new versions.
Is there any other solution where I can tell Akamai, perhaps through cache expiration headers, to come back every 10 seconds until a new image is available, and once that happens to stop coming back and keep serving the cached version?

Yes, in a way, if you combine a few steps on the server side with the Akamai configuration settings.
Here's the concept: the Edge Server delivers the content it has. If you send Cache-Control headers, from PHP for example, the TTL settings in the Akamai configuration for the respective digital property override them and Akamai uses its own values instead. In other words, the configuration tells Akamai how often to come back to your origin server, by path, file type, extension, or whatever. Toward the client side, the Edge Servers deliver whatever files they have, and it doesn't really matter how often they get requested for the content unless you disable caching at that level, which rolls every request back up to you.
Using those configuration settings you can specify that a specific file has an exact expiration - or to not cache it at all.
So if, on the server side, you reference placeholder.jpg on your page and tell Akamai not to cache that image at all, the Edge Server will come back to your origin each time it gets a request for it. Once you have the real image in place, placeholder.jpg no longer appears on your page; instead there is sizeA.jpg, which obeys the regular image caching times.
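A minimal sketch of the origin-side half of that idea, assuming the rendered frames live on disk where PHP can check for them (the directory layout, helper name and frame naming are illustrative assumptions, not the asker's actual code):

```php
<?php
// Hypothetical helper: decide which image URL to put in the page markup.
function image_src(string $articleId, string $frame): string
{
    $rendered = "/var/www/images/{$articleId}/{$frame}.jpg";

    if (file_exists($rendered)) {
        // The cron has produced the frame: reference it, and let it obey
        // the normal image TTL configured in the Akamai property.
        return "/images/{$articleId}/{$frame}.jpg";
    }

    // Not rendered yet: reference the placeholder, which the Akamai
    // configuration is set to never cache (or to cache for seconds only).
    return '/images/placeholder.jpg';
}

echo '<img src="' . htmlspecialchars(image_src('12345', 'sizeA')) . '" alt="">';
```

The key point is that the page markup switches URLs, so the never-cached placeholder path and the normally-cached real image path stay separate as far as Akamai is concerned.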
This may not be exactly ideal, but it's about the best you can do short of manually clearing the page, and as far as I know they don't have an API call you can fire off to clear a page (plus it takes 7-10 minutes for a cache clear to propagate through their network anyway).

Related

php header location vs file_get_contents performance

I have a website where the client searches for a term and the results are retrieved through an AJAX request. On the PHP side, the called script checks the date of the cache (caches are files) and, if it's older than an established time, it refreshes the results; otherwise it returns the cache file content: die(file_get_contents($cache_path));
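Roughly, the flow is the one sketched below; $term, $cache_ttl and refresh_results() are placeholders standing in for the real code:

```php
<?php
// Sketch of the flow described above, not the actual implementation.
$cache_path = '/var/cache/search/' . md5($term) . '.json';
$cache_ttl  = 3 * 3600; // "a few hours"

if (!is_file($cache_path) || filemtime($cache_path) < time() - $cache_ttl) {
    $results = refresh_results($term);         // the slow part: rebuild the results
    file_put_contents($cache_path, $results);  // refresh the cache file
}

die(file_get_contents($cache_path));
```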
The cache time is a few hours, and refreshing takes just a few seconds, so the greatest part of the requests will end up as cache responses.
So I thought that using header("Location: $cache_path"); would be less stressful for the server, because it simply tells the browser to get the contents from the cache file without passing them through the script.
The downside is that the cache file path would become public (which is not the biggest problem ever, because the content is the same), but, you know, it's never good to expose resource locations...
So, performance-wise, is there a big difference between file_get_contents and redirecting? The average cache file size is 120 KB... Any other ideas and suggestions?
You can use "internal redirect": through X-Accel-Redirect header for nginx or X-Sendfile for Apache. In this case you don't show any additional URLs to a client and don't deal with cache files in you script.
For configuration details you can read the official documentation or, of course, other SO questions (like this one).
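A minimal sketch of the nginx variant, assuming the cache directory is exposed through an internal location block (the location name and content type are assumptions):

```php
<?php
// Serve the cached response via an internal redirect instead of reading the
// file in PHP. nginx resolves /protected-cache/ through an "internal" location.
if (is_file($cache_path) && filemtime($cache_path) >= time() - $cache_ttl) {
    header('Content-Type: application/json');
    header('X-Accel-Redirect: /protected-cache/' . basename($cache_path));

    // With Apache and mod_xsendfile enabled, the equivalent would be:
    // header('X-Sendfile: ' . $cache_path);

    exit;
}
```

The client never sees the cache file's real path, and PHP hands the actual file transfer off to the web server.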

Does forcing no-cache on html pages also force no-cache on images?

I have a very simple question, which I have been unable to find a clear answer to. I have a web page which is generated dynamically (so I absolutely do not want it cached), which loads in several thousand images. Since there are so many images, and they never change, I very definitely want the user's browser to cache these images.
I'm applying headers to the HTML page to prevent caching, by following the advice in this solution: How to control web page caching, across all browsers?
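For reference, the headers that the linked answer recommends look roughly like this when sent from PHP for the HTML response (a sketch of the idea, not necessarily the exact set in use):

```php
<?php
// Anti-caching headers for the dynamically generated HTML page only.
header('Cache-Control: no-cache, no-store, must-revalidate'); // HTTP/1.1
header('Pragma: no-cache');                                   // HTTP/1.0
header('Expires: 0');                                         // proxies
```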
My question is: will this cause the user's browser to also not cache any images this page contains, or will it cache them? Thank you.
TL;DR the answer is not clear because it is complicated.
There is an ongoing struggle between a drive to do the "right" thing (i.e., follow the standards, which themselves have changed) and a drive to "improve" the standards to achieve better performance or a smoother navigation experience for users. So from the application point of view you need to properly use headers such as ETag, If-Modified-Since and Expires, together with cache-hinting pragmas, but the browser - or something in the middle, such as a proxy - might still decide to override what would seem to be the clear thing to do.
On the latest Firefox, directly attached to an Apache 2.4 on a virtual Ubuntu machine, I have tried with a page (test.html) referring to an image (test.jpg).
When the page is cached, on the server side I see a single request for the HTML and nothing for the image. What is probably happening is that the "rendering" part of Firefox does request the image (it has to!), but the request is satisfied entirely from the local cache. This makes sense: if the page has not changed, its content hasn't changed either.
When the page is not cached, I see two requests, one for the page and one for the image, to which the server responds with a 304, but that is because I also send the image's Last-Modified header. This also makes sense - if the page has changed, the images might have changed too, so the browser has to know whether this is the case, and can only do so by asking the server (unless the Expires header is used to "assure" the client that the image will not change).
I have not yet tried with an uncached page that responds with a 304. I expect it to generate a single request (no image request to the server), for the same reasons.
What you might want to consider is that with your approach you will not cache the HTML page but might still perform a thousand image requests (which will yield a thousand 304s, but still). Performance in this scenario depends on whether the requests are sent independently or back-to-back using the Keep-Alive HTTP/1.1 extension (which has to be enabled and advertised server side).
You should then use the Expires header on the images to tell the client that those resources will not go stale anytime soon.
You might perhaps also want to explore a different approach:
the HTML is cached
images are cached too
the HTML also references a (cached?) JavaScript file
variable content is loaded by that JavaScript via AJAX. That request can be made cache-unfriendly by including a timestamp, without involving the server at all.
This way you can configure the server for caching everything everywhere, except where you make sure it can't via a single crafted request.

Is it problematic to cache files for too long?

I discovered https://developers.google.com/speed/pagespeed/ the other day and have improved my website's page speed from ~75 to ~95 now.
One of the last few things it recommends is that I:
Leverage browser caching: Setting an expiry date or a maximum age in the HTTP headers
for static resources instructs the browser to load previously downloaded resources
from local disk rather than over the network.
The cache time for my main JavaScript and CSS files is set to 2 days; Google suggests I set it to at least 1 week. They also suggest that I do the same for HTML and PHP files.
What would happen to my users if I decided to make a large website change and they had just cached my website yesterday (for 1 week)? Would they not see the changes on my website until 1 week later?
Also, since my website contains a control panel and has some dynamically generated PHP pages, is there any reason for caching any of it? Wouldn't my server still be churning through PHP scripts and generating new content every time they logged into their account?
You probably don't want to cache your HTML and PHP in visitors' browsers. However, you might want to cache them in a layer you have more control over, like PHP opcode caching with APC and a reverse proxy like Varnish.
For the static assets, like your JavaScript and CSS files, it should be safe to cache them for a year or more. If you make a change to them you can just update their URL to say mystyles.css?v=123, and browsers will treat that as a completely different file from mystyles.css?v=122 or even just mystyles.css.
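One way to automate that version parameter is to derive it from the file's modification time, so the URL changes whenever the file does; a sketch, assuming the helper name and the DOCUMENT_ROOT-based layout:

```php
<?php
// Append a cache-busting version derived from the asset's mtime.
function asset_url(string $path): string
{
    $file = $_SERVER['DOCUMENT_ROOT'] . $path;
    $version = is_file($file) ? filemtime($file) : 1;
    return $path . '?v=' . $version;
}

echo '<link rel="stylesheet" href="' . asset_url('/css/mystyles.css') . '">';
```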

Use a cached version of a php page unless the database has changed

I've looked at similar questions about caching in PHP and I'm still stumped as to how to check whether the database has changed without making a new call to the database, which would defeat the point of caching.
I understand technically how to implement caching in PHP -- using ETag and Last Modified headers, output buffering, storing static files, etc. What is tripping me up is how to determine when to serve up a new version of a page instead of a cached version. If the database content has changed, I want to show the new version and not the cached version.
For example, let's say I have a page that displays details about a product. Generally, once the product info is stored in the database, it won't change much. But occasionally there might be an edit to the product description or a price change. If the product has a new price, I don't want to show the user the old price by using a cached version of the page. For that reason, updating the cached content every hour doesn't seem sufficient. Not to mention that that's too often for the content that doesn't change; the real problem is that it won't update the content fast enough when there is a change.
So should I store something (e.g., an ETag value or a static html file) every time the product database is updated through a form in the Admin area of the application? What am I missing here?
[Note: Not interested in using a caching library here. I'd like to learn how to do it in straight PHP for now.]
Caching is a pretty complex topic, because you can cache all sorts of data in various places. Usually you implement caching to relieve bottlenecks in your server structure.
In your setup you can cache data at three different locations:
1) Clientside, between client and server
You would use this method to save bandwidth and shorten loading times for the user. You can achieve this by setting cache-related fields in the HTTP headers (Cache-Control, Expires, ETag and so on).
If you use Cache-Control or Expires, the decision whether to load an updated version from the server or not depends purely on the client. So even if there is a new version available, the user won't see it. On the plus side you are saving lots of CPU cycles on the server, because your PHP script won't be executed.
If you use ETag, you can inform the client on each request whether the version of the requested content has changed. But your PHP script will be executed on each request, even if the ETag is unchanged.
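A minimal ETag sketch in PHP, assuming render_page() stands in for whatever builds the response (the script still runs, but a matching ETag lets it skip sending the body):

```php
<?php
$content = render_page();              // placeholder for the real page build
$etag    = '"' . md5($content) . '"';

header('ETag: ' . $etag);

// If the client already has this exact version, answer 304 with no body.
if (($_SERVER['HTTP_IF_NONE_MATCH'] ?? '') === $etag) {
    http_response_code(304);
    exit;
}

echo $content;
```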
2) Serverside, between client and server
This kind of caching primarily reduces high cpu load on your server. It won't affect the amount of traffic generated between client and server.
You can use a caching reverse proxy like Varnish to store rendered responses on the server side. The good thing is that you have full control over the cache. If an updated version of requested content is available, you can simply purge the old version from the cache, so that a new version is generated by your PHP script and stored in the cache.
Every response that is cacheable will only be generated one time and then be served from cache to the clients.
3) In your application
If you are heavily using your database, you should consider using a fast key-value store like memcached to cache query results. Of course you have to adjust your database classes for this (first ask memcached; if memcached doesn't have the result, ask the database and store the result in memcached), but the performance gain will be quite impressive, because memcached is really fast.
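A cache-aside sketch using the Memcached extension; the key naming, TTL and invalidation hook are illustrative assumptions, not a prescribed design:

```php
<?php
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

function get_product(Memcached $mc, PDO $db, int $id)
{
    $key = "product:{$id}";

    $product = $mc->get($key);
    if ($product !== false) {
        return $product;                        // cache hit
    }

    // Cache miss: ask the database, then remember the result for an hour.
    $stmt = $db->prepare('SELECT * FROM products WHERE id = ?');
    $stmt->execute([$id]);
    $product = $stmt->fetch(PDO::FETCH_ASSOC);

    $mc->set($key, $product, 3600);
    return $product;
}

// In the admin form handler, right after saving a product change:
// $memcached->delete("product:{$id}");
```

Deleting the key whenever the admin form saves a change is what answers the "when do I serve a new version?" question: the next request misses the cache and rebuilds from the database.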
Sometimes it even makes sense to store data solely in memcached, if the data doesn't have to be persisted permanently (PHP sessions, for example).
I also faced the same problem a while back (I don't know if you will find my way to be correct).
Why I needed the cache:
What my site used to do was update the database by running the script in cron.php, while index.php showed the listing from the database (this used to take ages to load).
My solution:
Every time a list was created or updated I unlinked the cache file. Then, on the index.php page, I checked whether the cache file exists: if it does, load the cache; otherwise load the content from the database and at the same time write that data to the cache file, so the next time a user requests index.php the cached copy is served.
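A sketch of that pattern, with build_listing_from_database() and the cache file path as placeholders for the real code:

```php
<?php
// index.php: serve the cached listing if present, otherwise rebuild it.
$cache_file = __DIR__ . '/cache/listing.html';

if (is_file($cache_file)) {
    readfile($cache_file);                 // cache hit: serve as-is
    exit;
}

$html = build_listing_from_database();     // placeholder for the slow query
file_put_contents($cache_file, $html);     // write it for the next request
echo $html;

// In cron.php (or wherever a listing is created/updated), invalidate with:
// @unlink(__DIR__ . '/cache/listing.html');
```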

Apache is lagging or something else is bad

I have a website. It's my first website with Zend Framework but I think it's written well. Generation time is about 0.9s now. I'll get it down to something like 0.2 but I'll leave that for now. When you press any link on the website it takes about 1.5-2s before the web browser starts loading the page. Then it takes 0.15s to show it. So if execution time is 0.9s, where are the other 1.1s? Ping is about 13ms. The website address is http://zgarnijlicke.pl
Edit:
Strange. The second domain, http://lottek.eu, is working fine. Look at http://lottek.eu/picostreamer. It isn't lagging like the zgarnijlicke.pl domain.
Edit 2:
There is a problem with Zend Framework. I set up an action without rendering a view (layout disabled too) and it works as fast as the server can manage. I'll make a new question for it.
Here's a webpagetest.org report for your site: http://www.webpagetest.org/result/100721_1P0Y/
If you view the waterfall graph for the first view, you'll see that the browser gets your HTML source at around the 1.2 second mark, and is first able to render your page just after 4 seconds. What happens in between is the downloading of your three JavaScript files and two CSS files. So this is where you want to start. Some suggestions:
Consider using a free CDN for jquery.js instead of serving it from your server, e.g. Google's: http://code.google.com/apis/ajaxlibs/ . This way, users are more likely to already have it cached, Google will serve it from a location geographically closer to the user, and (I think) in compressed format.
For jquery.corner.js and jquery.media.js, consider merging them into one file and serving them compressed (the Apache module mod_deflate makes this very easy to do)
Same for your CSS files - consider merging them into one file and serving them compressed.
Those will give you some quick wins. However there are other things you can improve:
Add width and height attributes to your image tags. Without these, some browsers will halt rendering while they download the images so that they know how much space they'll occupy. None of your image tags have these attributes.
Make sure you're using the right image format for the job. Your banner.png image is over 300k which is far too large. I converted this to a JPEG image (80% quality) and it was 30k.
As for the execution time, 0.9 seconds seems quite high. Are you using APC or similar? Is the page doing any heavy database work?
Try putting some timer code in your PHP that measures how long it takes to generate the page content. This way you can confirm or rule out server problems.
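A minimal sketch of such timer code, with the log destination as an assumption:

```php
<?php
// Measure server-side generation time and log it, so it can be compared
// against what the browser reports for the full request.
$start = microtime(true);

// ... generate the page content ...

error_log(sprintf('page generated in %.3f s', microtime(true) - $start));
```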
You might also use network tools like ping and traceroute to see if your problem is caused by network latency.
A quick test with wget here gets an overall execution time of 1.5s to transfer one of the pages, with an actual download time of 0.2 seconds, so 1.3s of overhead. The pause occurs before the transfer starts, so that's a server-side problem.
Is that site on a virtual server? It's possible that if the underlying physical server is heavily loaded, your image could be getting swapped out or otherwise CPU-starved and take ~1 second to become responsive again.
Perhaps it's an internal resource thing - are you connecting to a DB, especially a remote one? Even if some or most of the pages aren't DB-driven, the overhead of connecting to a DB could be causing this slowdown. And then gets swapped/delayed again as there's little further activity to keep the image active.
It could even be something as silly as Apache being configured with 'IdentityCheck' on, though unlikely, as this would slow down all requests. I'm not seeing any slowdown on the requests for .css/.js files from your server when viewed from HTTPFox. Interestingly, requesting the .css/.js via wget returns a '500 Internal Server Error'.
I found it. It's a problem with ZF, because when I made a hello.php page with code like this:
hello world
Without any <?php ?> in it, the script took 0.4s to complete.
