Question
I am running a bit of an experiment and could use some help.
I have created 2 files. main-real.css which is a standard plain ol' css file, and main.css which is parsed by PHP and has an include() which grabs the former real css file.
Here is the code for main.css:
<?php
include("main-real.css");
?>
I am then adding an instruction to my .htaccess file to parse this css file with PHP:
<FilesMatch "main.css">
AddHandler application/x-httpd-php5 .css
Header Set Content-Type "text/css"
</FilesMatch>
This works perfectly on my PHP 5.2 server running Apache.
The issue is that this file does not appear to be cached by the browser, or at least does not return a 304 Not Modified status code like the regular un-PHP-parsed CSS file.
Here are the headers for main-real.css if accessed directly:
RESPONSE HEADERS
Date..............Thu, 18 Nov 2010 22:10:57 GMT
Server............Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8i DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
Last-Modified.....Thu, 18 Nov 2010 22:10:23 GMT
Etag.............."11b010a-26-4955b0e6671c0"
Accept-Ranges.....bytes
Content-Length....38
Content-Type......text/css
REQUEST HEADERS
Accept.............text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language....en-us,en;q=0.5
Accept-Encoding....gzip,deflate
Accept-Charset.....ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive.........115
Connection.........keep-alive
Cookie.............fc=fcVal=7625790752294348480
If-Modified-Since..Thu, 18 Nov 2010 22:10:23 GMT
If-None-Match......"11b010a-26-4955b0e6671c0"
Cache-Control......max-age=0
Here are the headers for the PHP parsed main.css:
RESPONSE HEADERS
Date...............Thu, 18 Nov 2010 22:11:11 GMT
Server.............Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8i DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By.......PHP/5.2.11
Content-Type.......text/css
Keep-Alive.........timeout=5, max=97
Connection.........Keep-Alive
Transfer-Encoding..chunked
REQUEST HEADERS
Accept.............text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language....en-us,en;q=0.5
Accept-Encoding....gzip,deflate
Accept-Charset.....ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive.........115
Connection.........keep-alive
Cookie.............fc=fcVal=7625790752294348480
Cache-Control......max-age=0
I have tried modifying the http-headers in all sorts of ways, adding max-age, last-modified and others with no success. Is there something I am missing or misunderstanding?
Solution & Final Code
The main missing piece of code was that I needed to send the Last-Modified header prior to the include(). This needs to be done within the PHP file itself! I previously tried adding Last-Modified using an .htaccess Header set instruction, and although that does add the appropriate header, it did not trigger caching.
Here is my final code for main.css with far-future Expires headers and Cache-Control for good measure.
<?php
// gmdate() formats in UTC, so the "GMT" suffix is accurate whatever the server timezone is.
$last_modified = gmdate("D, d M Y H:i:s", filemtime("main-shared.css")) . " GMT";
$expiration = gmdate("D, d M Y H:i:s", strtotime('+1 year')) . " GMT";
header("Cache-Control: public, no-transform");
header("Expires: $expiration");
header("Last-Modified: $last_modified");
include("main-shared.css");
?>
The headers Apache would send for main-real.css are irrelevant, because you are include()-ing that file through the file system rather than requesting it over HTTP.
You need to send the same headers through your PHP script before you include the other file.
header("Cache-Control: ........ ");
header("Expires: ....... ");
....
include("main-real.css");
You need to look at the inbound HTTP headers and determine whether the CSS file has actually changed since the browser last fetched it. That means looking at If-Modified-Since in the request headers and answering with a 304 yourself. Here's some code that'll do it for you:
$last_modified = filemtime("main-real.css");
if(isset($_SERVER["HTTP_IF_MODIFIED_SINCE"])) {
$expected_modified = strtotime(preg_replace('/;.*$/','',$_SERVER["HTTP_IF_MODIFIED_SINCE"]));
if($last_modified <= $expected_modified) {
header("HTTP/1.0 304 Not Modified");
return;
}
}
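Putting the two answers together, a minimal sketch of main.css could look like the following. It assumes PHP 7.1 or later (nullable types, the ?? operator), and the names isNotModified and serveCss as well as the one-hour max-age are made up for illustration; they are not part of the original posts.

```php
<?php
// Returns true when the client's cached copy is still current.
// $ifModifiedSince is the raw If-Modified-Since request header (or null),
// $mtime is the file's modification time as a Unix timestamp.
function isNotModified(?string $ifModifiedSince, int $mtime): bool
{
    if ($ifModifiedSince === null || $ifModifiedSince === '') {
        return false;
    }
    // Some old browsers append "; length=..." to the date, so cut it off.
    $since = strtotime(preg_replace('/;.*$/', '', $ifModifiedSince));
    return $since !== false && $mtime <= $since;
}

// Hypothetical wrapper: sends the validators first, then either a 304 or the file.
function serveCss(string $path): void
{
    $mtime = filemtime($path);
    header('Content-Type: text/css');
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
    header('Cache-Control: public, max-age=3600'); // lifetime is an assumed value

    if (isNotModified($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null, $mtime)) {
        http_response_code(304); // no body is sent with a 304
        return;
    }
    include $path;
}

// In main.css this would be called as: serveCss(__DIR__ . '/main-real.css');
```

Keeping the comparison in its own function means it can be checked without a web server; only serveCss touches the response.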
Related
I have seen this method being used on about three sites now, including Facebook, Dropbox and Microsoft's Skydrive. It works like this. Let's say you want to look at the image without downloading, then you'd just do this.
https://fbcdn-sphotos-a.akamaihd.net/hphotos-ak-xxxx/xxx_xxxxxxxxxxxxxxx_xxxxxxxxx_o.jpg
But if I want to download it, I'd add ?dl=1
https://fbcdn-sphotos-a.akamaihd.net/hphotos-ak-xxxx/xxx_xxxxxxxxxxxxxxx_xxxxxxxxx_o.jpg?dl=1
Easy peasy, right? Well, it's probably not easy on the server side, and this is where my problem is. I would know how to do this if that .jpg file were a PHP script, with one $_GET parameter pointing to the image and another specifying whether the image should be downloaded or not. But that's not the case.
So, what methods did I try? None. Because I honestly have no idea how this works, it's like magic to me. Maybe it's something that you do in .htaccess? That sounds reasonable to me, but after a while of googling I didn't find anything even close to what I'm asking for.
You have some options.
One option would be to use a PHP script instead of the .jpg file. So your URL would point to a PHP file and in the PHP file you would do something like this:
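header('Content-Type: image/jpeg');
// isset() avoids a notice when the dl parameter is absent.
if (isset($_GET['dl']) && $_GET['dl'] == 1) {
    header('Content-Disposition: attachment; filename="downloaded.jpg"');
}
$file = $_GET["file"];
// Do some checking here to make sure the user is allowed to get the file specified;
// never pass an unchecked request parameter straight to a file function.
readfile($file);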
Another option would be to use mod_rewrite in your .htaccess file to check for ?dl=1 and if found, redirect to the PHP script that will download the file (the same way as above).
I'm sure there are more options, but those two are the only ones popping into my head right now.
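The "do some checking" step in the script above matters, because a raw file parameter lets a visitor ask for any path on the server. One possible way to do it, sketched here under the assumption that all images live below a single directory, is to resolve the requested name and refuse anything that ends up outside that directory. The function name resolveImagePath is hypothetical.

```php
<?php
// Resolves $requested inside $baseDir. Returns the absolute path, or null
// when the file does not exist or the result falls outside $baseDir.
function resolveImagePath(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    $full = realpath($baseDir . DIRECTORY_SEPARATOR . $requested);
    if ($base === false || $full === false || !is_file($full)) {
        return null;
    }
    // realpath() has already collapsed any "../", so a simple prefix
    // comparison against the base directory is enough.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($full, $prefix, strlen($prefix)) === 0 ? $full : null;
}
```

In the download script, a null result would be answered with a 404 instead of calling readfile().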
I would rewrite all image requests to a single PHP file that handles them based on their URI parameters.
In the .htaccess I would put:
Options +FollowSymLinks +ExecCGI
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} \.(jpg|png|gif)$
RewriteRule (.*) images.php [QSA]
</IfModule>
That will send every request for a file with a jpg, png or gif extension to your images.php file.
In images.php I would check for the presence of ?dl=1 and use that to decide how to serve the image:
$requestedImage = $_SERVER['REQUEST_URI'];
if (strpos($requestedImage, '?dl=1') !== false) {
    // serve the image as attachment
} else {
    // just print it as usual
}
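The two branches above differ only in the headers they send. As a rough sketch (the function name imageHeaders is an assumption, and PHP 7 or later is assumed for the ?? operator), the decision can be isolated like this, parsing the query string properly instead of searching for the literal text ?dl=1:

```php
<?php
// Builds the response headers for an image request.
// $requestUri is the raw request URI, e.g. "/photos/cat.jpg?dl=1".
function imageHeaders(string $requestUri): array
{
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    parse_str((string) parse_url($requestUri, PHP_URL_QUERY), $query);

    $types = ['jpg' => 'image/jpeg', 'png' => 'image/png', 'gif' => 'image/gif'];
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    $headers = ['Content-Type: ' . ($types[$ext] ?? 'application/octet-stream')];

    // Only dl=1 switches to a download; anything else displays inline.
    if (($query['dl'] ?? '') === '1') {
        $headers[] = 'Content-Disposition: attachment; filename="' . basename($path) . '"';
    }
    return $headers;
}

// In images.php:
//   foreach (imageHeaders($_SERVER['REQUEST_URI']) as $h) { header($h); }
//   ...followed by readfile() on the validated image path.
```

Parsing the query also means dl=1 is recognised when it is not the first parameter, which the strpos() check would miss.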
The response headers when displaying such a Facebook URL:
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Length: 5684
Last-Modified: Fri, 01 Jan 2010 00:00:00 GMT
X-Backend: hs675.ash3
X-BlockId: 157119
X-Object-Type: PHOTO_PROFILE
Access-Control-Allow-Origin: *
Cache-Control: max-age=1209600
Expires: Fri, 12 Oct 2012 13:07:08 GMT
Date: Fri, 28 Sep 2012 13:07:08 GMT
Connection: keep-alive
And the download response headers:
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Length: 5684
Last-Modified: Fri, 01 Jan 2010 00:00:00 GMT
X-Backend: hs675.ash3
X-BlockId: 157119
X-Object-Type: PHOTO_PROFILE
Content-Disposition: attachment
Access-Control-Allow-Origin: *
Cache-Control: max-age=1209600
Expires: Fri, 12 Oct 2012 13:07:17 GMT
Date: Fri, 28 Sep 2012 13:07:17 GMT
Connection: keep-alive
The difference that matters is the Content-Disposition: attachment line.
So as you're already serving the images from a PHP script, in case the download parameter is set, add:
header('Content-Disposition: attachment');
and you should be fine.
I am looking for a way to confirm if X-Sendfile is properly handling requests handed back to the webserver by a script (PHP). Images are being served correctly but I thought I would see the header in curl requests.
$ curl -I http://blog2.stageserver.net/wp-includes/ms-files.php?file=/2011/05/amos-lee-feature.jpg
HTTP/1.1 200 OK
Date: Wed, 04 Jan 2012 17:19:45 GMT
Server: Cherokee/1.2.100 (Arch Linux)
ETag: "4dd2e306=9da0"
Last-Modified: Tue, 17 May 2011 21:05:10 GMT
Content-Type: image/jpeg
Content-Length: 40352
X-Powered-By: PHP/5.3.8
Content-Disposition: inline; filename="amos-lee-feature.jpg"
Configuration
Cherokee 1.2.100 with PHP-FPM 5.3.8 in FastCGI:
cherokee.conf: vserver!20!rule!500!handler!xsendfile = 1
(Set by vServer > Behavior > Extensions php > Handler: Allow X-Sendfile [check Enabled])
Wordpress Network / WPMU 3.3.1:
define('WPMU_SENDFILE',true); is set in wp-config.php just before wp-settings.php is included. This triggers the following code in WP's wp-includes/ms-files.php (line 50), which serves up files for a particular blog:
header( 'X-Sendfile: ' . $file );
exit;
I have confirmed that the above snippet is executing by adding an additional header for disposition right before the exit(); call. That Content-Disposition is present with curl results above and not originally in the ms-files.php code. The code that was added is:
header('Content-Disposition: inline; filename="'.basename($file).'"');
Research
I have:
Rebooted php-fpm / cherokee daemons after making configuration changes.
Tried several tricks in the comments over at php.net/readfile and replaced the simple header in ms-files.php with more complete code from examples.
php.net/manual/en/function.readfile.php
www.jasny.net/articles/how-i-php-x-sendfile/
codeutopia.net/blog/2009/03/06/sending-files-better-apache-mod_xsendfile-and-php/
Confirmed Cherokee support and tested with and without compression, even though I don't think it would apply since my images are serving correctly. I also found a suspiciously similar problem from a lighttpd post.
cherokee-project.com/doc/other_goodies.html
code.google.com/p/cherokee/issues/detail?id=1228
webdevrefinery.com/forums/topic/4761-x-sendfile/
Found a blurb here on SO that may indicate the header gets stripped
stackoverflow.com/questions/7296642/django-understanding-x-sendfile
Tested that the headers above are consistent from curl, wget, Firefox, Chrome, and web-sniffer.net.
Found out that I can't post more than 2 links yet due to lack of reputation.
Questions
Will X-Sendfile be present in the headers when it is working correctly or is it stripped out?
Can the access logs be used to determine if X-Sendfile is working?
I am looking for general troubleshooting tips or information here, not necessarily specific to PHP / Cherokee.
Update
I have found a suitable way to confirm X-Sendfile or X-Accel-Redirect in a test or sandbox environment: Disable X-Sendfile and check the headers.
With Allow X-Sendfile disabled in Cherokee:
$ curl -I http://blog2.stageserver.net/wp-includes/ms-files.php?file=/2011/05/amos-lee-feature.jpg
HTTP/1.1 200 OK
Date: Fri, 06 Jan 2012 15:34:49 GMT
Server: Cherokee/1.2.101 (Ubuntu)
X-Powered-By: PHP/5.3.6-13ubuntu3.3
Content-Type: image/jpeg
X-Sendfile: /srv/http/wordpress/wp-content/blogs.dir/2/files/2011/05/amos-lee-feature.jpg
Content-Length: 40352
The image will not load in browsers but you can see that the header is present. After re-enabling Allow X-Sendfile the image loads and you can be confident that X-Sendfile is working.
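The same check can be scripted. The sketch below is only an illustration of the idea in the update: it scans a list of raw response header lines (for example the output of curl -I split into lines, or the array returned by PHP's get_headers()) and reports whether an X-Sendfile or X-Accel-Redirect header reached the client. The function name leakedSendfileHeader is hypothetical.

```php
<?php
// Returns the value of an X-Sendfile / X-Accel-Redirect header that reached
// the client, or null when neither appears in the given header lines.
function leakedSendfileHeader(array $headerLines): ?string
{
    foreach ($headerLines as $line) {
        if (preg_match('/^(X-Sendfile|X-Accel-Redirect):\s*(.*)$/i', trim($line), $m)) {
            return $m[2];
        }
    }
    return null;
}

// Example use against a test server (URL is a placeholder):
//   $headers = get_headers('http://example.test/file.php');
//   if ($headers !== false) { var_dump(leakedSendfileHeader($headers)); }
```

A null result together with a correctly served file suggests the web server consumed the header; a non-null result means the header passed through untouched.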
According to the source on github X-Sendfile headers will be stripped.
If I'm skimming the file correctly, it's only logging success if it's been compiled in debug mode.
You could check memory usage of sending large files with and without xsendfile.
They are stripped simply because leaving them in would defeat one of the reasons for using X-Sendfile, namely serving the file without the recipient knowing where it is stored.
Using .htaccess, I'm setting the PHP handler for all my .css and .js files in order to output user-agent based code:
AddHandler application/x-httpd-php .css .js
For example:
<?PHP if ($CurrentBrowser == 'msie') { ?>
.bind('selectstart', function(event) { ... })
<?PHP } ?>
So, in fact, my code files are dynamically created but can be considered static files. That's because, once they have been compiled for the first time, browsers can get them back from cache and reuse them until I change their content.
That's why I'm using fingerprinting/versioning and long time expiration on them:
[INDEX.PHP]
<script type="application/javascript" src="<?PHP echo GetVersionedFile('/script.js'); ?>"></script>
<script type="application/javascript" src="/script.1316108341.js"></script>
[.HTACCESS]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule "^(.+)\.(\d+)\.(css|js)$" $1.$3 [L]
The problem is that those files, even if I send them with a proper header, are never being cached by any browser (I never get a 304 code, always 200). This is a log of my server responses:
[CHROME]
Request URL:http://127.0.0.1:8888/script.1316108341.js
Request Method:GET
Status Code:200 OK
-----
Cache-Control:max-age=31536000, public
Connection:Keep-Alive
Content-Encoding:gzip
Content-Length:6150
Content-Type:application/javascript
Date:Thu, 15 Sep 2011 21:41:25 GMT
Expires:Fri, 14 Sep 2012 21:41:25 GMT
Keep-Alive:timeout=5, max=100
Server:Apache/2.2.17 (Win32) PHP/5.3.6
Vary:Accept-Encoding
X-Powered-By:PHP/5.3.6
[MOZILLA]
Request URL:http://127.0.0.1:8888/script.1316108341.js
Request Method:GET
Status Code:200 OK
-----
Date Thu, 15 Sep 2011 21:43:26 GMT
Server Apache/2.2.17 (Win32) PHP/5.3.6
X-Powered-By PHP/5.3.6
Content-Encoding gzip
Vary Accept-Encoding
Cache-Control max-age=31536000, public
Expires Fri, 14 Sep 2012 21:43:26 GMT
Content-Type application/javascript
Content-Length 6335
Keep-Alive timeout=5, max=100
Connection Keep-Alive
-----
Last Modified Thu Sep 15 2011 23:43:26 GMT+0200 (= time i loaded the page) (???)
Last Fetched Thu Sep 15 2011 23:43:26 GMT+0200 (= time i loaded the page) (???)
Expires Fri Sep 14 2012 23:43:26 GMT+0200
Data Size 6335
Fetch Count 10
Device disk
What could be the problem? How can I force caching on these files?
Many, many thanks!
Since the requests for your CSS and JS files are being handled by PHP, your PHP code with its conditionals is executed on every request.
Apache and PHP have no idea whether the content is cacheable or needs to be regenerated, so the script runs each time and answers with a 200.
If you send a Last-Modified header, or use your versioning/fingerprinting method, it is your script's responsibility to check the fingerprint or version and decide whether the client's copy is still valid. If it is, send a 304 Not Modified status and stop any further processing. To do that, check the request for an If-Modified-Since (or If-None-Match) header and compare it with the value you would send.
Another approach is to cache the generated response for each browser to a file, so that you can serve that file to first-time visitors rather than regenerating it with PHP. You can then use the modification time of that file to decide whether to send a 304.
This SitePoint article explains several methods of using PHP to cache. Hope that helps.
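As a hedged illustration of the validation step described above, the following sketch derives an ETag from the source file, its modification time and the browser variant, and checks it against the If-None-Match request header. The function names and the choice of md5 are assumptions made for this example, and PHP 7.1 or later is assumed.

```php
<?php
// Derives an ETag from the file path, its modification time and the
// browser variant the output was generated for (e.g. "msie").
function makeEtag(string $path, int $mtime, string $variant): string
{
    return '"' . md5($path . '|' . $mtime . '|' . $variant) . '"';
}

// True when the If-None-Match request header lists this ETag (or "*").
function etagMatches(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null) {
        return false;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // Ignore a weak-validator prefix such as W/"abc".
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === '*' || $candidate === $etag) {
            return true;
        }
    }
    return false;
}

// At the top of the generated .css/.js file, before any output:
//   $etag = makeEtag(__FILE__, filemtime(__FILE__), $CurrentBrowser);
//   header('ETag: ' . $etag);
//   header('Vary: User-Agent');
//   if (etagMatches($_SERVER['HTTP_IF_NONE_MATCH'] ?? null, $etag)) {
//       http_response_code(304);
//       exit;
//   }
```

Including the browser variant in the ETag keeps a cached copy generated for one user agent from being validated for another.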
I believe this would be a more CPU-friendly method: instead of gzipping content for every request, I compress the files once and serve those instead =). Can it be implemented with PHP?
Yes, this is quite easy to do with Apache.
Store the uncompressed and compressed files side by side. E.g.:
\-htdocs
|-index.php
|-javascript.js
\-javascript.js.gz
Enable content negotiation in Apache. Use:
Options +MultiViews
Now when "/javascript" is requested, Apache will serve the gzipped version if the client declares that it accepts it (through Accept-Encoding).
Example of two HTTP requests (some headers omitted):
Client claims to accept gzip
GET /EP/Exames/2006-2007/exame2B HTTP/1.1
Host: lebm.geleia.net
Accept-Encoding: gzip, identity
HTTP/1.1 200 OK
Date: Fri, 13 Aug 2010 16:22:59 GMT
Content-Location: exame2B.nb.gz
Vary: negotiate,accept-encoding
TCN: choice
Last-Modified: Sun, 04 Feb 2007 15:33:53 GMT
ETag: "0-c9d-428a84de03a40;48db6d490abee"
Accept-Ranges: bytes
Content-Length: 3229
Content-Type: application/mathematica
Content-Encoding: gzip
(gzip-compressed binary body; response continues)
Client does not claim to accept gzip
GET /EP/Exames/2006-2007/exame2B HTTP/1.1
Host: lebm.geleia.net
Accept-Encoding: identity
HTTP/1.1 200 OK
Date: Fri, 13 Aug 2010 16:23:14 GMT
Content-Location: exame2B.nb
Vary: negotiate,accept-encoding
TCN: choice
Last-Modified: Sun, 04 Feb 2007 15:33:53 GMT
ETag: "0-257f-428a84de03a40;48db6d490abee"
Accept-Ranges: bytes
Content-Length: 9599
Content-Type: application/mathematica
(************** Content-type: application/mathematica **************
CreatedBy='Mathematica 5.2'
(response continues)
See a more complete version here http://pastebin.com/TAwxpngX
Yes, this is a sensible approach to save both bandwidth and connections. (You can enable gzip compression within Apache if so desired, but it's potentially worth doing this anyway as you'll save connections.)
In essence, use a PHP function to check if the browser supports gzip compression. (If it doesn't, you'll need to fetch the JavaScript/CSS as per normal.) If it does, you can simply point the JavaScript or CSS source location at a PHP script which is responsible for:
Checking to see if there's a compressed version in place. (Simply output the existing 'on disk' if there is.)
Creating a compressed version of the required files.
You'll also probably want to enable/disable this from a define/top level config (for testing purposes, etc.) As a suggestion, you could store the required CSS/JavaScript files paths in a set of arrays which could be used as a basis for creating the cache file or including the files in the traditional manner as a fallback.
I've written a solution along these lines in the past that created a file based on a hash of the required filenames. As such, the cache was automatically rebuilt if a different/additional file was included. (It also re-built the cache after 'n' hours, but that's only to keep things fresh if the filenames didn't change, but the content did.)
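A minimal sketch of the two building blocks described in this answer might look as follows. It assumes the zlib extension is available for gzencode(), that the .gz copy may be written next to the source file, and PHP 7.1 or later; the function names are made up for illustration.

```php
<?php
// True when the Accept-Encoding request header allows gzip.
function acceptsGzip(?string $acceptEncoding): bool
{
    foreach (explode(',', (string) $acceptEncoding) as $part) {
        $pieces = array_map('trim', explode(';', strtolower($part)));
        if ($pieces[0] !== 'gzip') {
            continue;
        }
        // "gzip;q=0" explicitly refuses gzip.
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^q=0(\.0*)?$/', $param)) {
                return false;
            }
        }
        return true;
    }
    return false;
}

// Returns the path of an up-to-date .gz copy of $source, creating it if needed.
function compressedCopy(string $source): string
{
    $target = $source . '.gz';
    if (!is_file($target) || filemtime($target) < filemtime($source)) {
        file_put_contents($target, gzencode(file_get_contents($source), 9), LOCK_EX);
    }
    return $target;
}

// In the serving script:
//   if (acceptsGzip($_SERVER['HTTP_ACCEPT_ENCODING'] ?? null)) {
//       header('Content-Encoding: gzip');
//       header('Vary: Accept-Encoding');
//       readfile(compressedCopy($path));
//   } else {
//       readfile($path);
//   }
```

The compression cost is paid once per change of the source file; every later request only reads the stored copy.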
We have a Magento commerce site running on an IIS 6.0 server with PHP 5.2.11.
Whenever a user tries to use the print option to download a PDF to their computer from the admin panel, the download does not complete. I can see that the full file is downloaded to the computer, but the browser keeps saying it is downloading. As a result the file is saved with .part at the end and users can't open it as a PDF. If I remove the .part extension created by Firefox, I can view the PDF correctly. So the data is sent from the server to the browser in full, but the download never terminates.
See headers below on response while starting to download the pdf
HTTP/1.x 200 OK
Cache-Control: must-revalidate, post-check=0, pre-check=0
Pragma: public
Content-Length: 1456781
Content-Type: application/pdf
Content-Encoding: gzip
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Last-Modified: Fri, 18 Dec 2009 10:23:37 +0000
Vary: Accept-Encoding
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET, PHP/5.2.11
Content-Disposition: attachment; filename=invoice2009-12-18_10-23-37.pdf
Date: Fri, 18 Dec 2009 10:23:37 GMT
I guess it is something to do with not closing the connection after sending the whole file through? Please help!
Thanks.
I had the exact same problem (on Apache) and temporarily solved it by turning off gzip compression on the responses. My guess is that the size Magento reports to the browser (which it gets from a strlen() call on the PDF content) does not match the size of the content the browser actually receives, because the content gets compressed later on. The browser then waits for more data that is never going to arrive.
edit: worth noting that in my case I was going to the site through a reverse proxy.
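To illustrate that guess, here is a small sketch showing how the length of a body changes once it is gzip-encoded, together with one possible workaround when the compression is done by PHP itself. Both function names are hypothetical, and if the compression is applied by the web server or a proxy rather than by PHP, the ini_set() call will not help.

```php
<?php
// Compares the uncompressed body length with its length after gzip encoding.
function lengthsAfterGzip(string $body): array
{
    return ['raw' => strlen($body), 'gzipped' => strlen(gzencode($body))];
}

// Hypothetical download helper: turn off PHP-level output compression before
// announcing a Content-Length that was computed from the raw body.
function sendPdf(string $pdf, string $filename): void
{
    ini_set('zlib.output_compression', 'Off');
    header('Content-Type: application/pdf');
    header('Content-Disposition: attachment; filename="' . $filename . '"');
    header('Content-Length: ' . strlen($pdf));
    echo $pdf;
    exit;
}
```

If a Content-Length taken from the raw value is sent while the body goes out compressed, the client is told to expect more bytes than it will ever receive, which matches the stalled download described in the question.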
Have you tried explicitly calling exit; after you output the PDF data? Sounds like an IIS thing.