How to obtain size of served HTTP response in PHP - php

In a PHP shutdown function, I want to know the size of the HTTP response that's been received by the client.
I'd like to register a shutdown function and compare the size of the HTTP response received by the client with the size of the file that was read. This would let me flag cases where the response was incomplete.
Background: We're seeing reports of damaged (incomplete) file downloads using Ubercart uc_file.
http://api.ubercart.org/api/function/_uc_file_download_transfer/2 is the function serving the file. It already checks that the complete file has been read before logging the download, but it doesn't check if the client was still connected when the file is fully served.

I don't know how to obtain the size of generated content, but the reason why your clients are experiencing incomplete downloads could be that you don't specify the correct Content-Length header before sending the file. Standards-compliant browsers will not save the file if its size turns out to be less than the Content-Length declared in the HTTP response.
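As a rough sketch of the idea in the question, combined with the Content-Length advice above: the script can count the bytes it writes and check the connection state in a shutdown function. The names `download_looks_complete` and `serve_file_with_check` are hypothetical and are not part of Ubercart. PHP can only tell what it handed to the web server and whether the connection was reported as aborted, not what the client actually stored, so treat this as an approximation.

```php
<?php
// Hypothetical helper: a download "looks complete" when every byte of the file
// was written and PHP still reports a normal connection at shutdown.
function download_looks_complete(int $bytesSent, int $fileSize, int $status): bool
{
    return $bytesSent === $fileSize && $status === CONNECTION_NORMAL;
}

// Hypothetical wrapper: declares Content-Length, streams the file in chunks,
// and logs at shutdown if the transfer appears incomplete.
function serve_file_with_check(string $path): void
{
    $fileSize = filesize($path);
    $sent = 0;

    ignore_user_abort(true); // keep running after a disconnect so the loop can notice it
    header('Content-Length: ' . $fileSize);

    register_shutdown_function(function () use (&$sent, $fileSize, $path) {
        if (!download_looks_complete($sent, $fileSize, connection_status())) {
            error_log("Incomplete download of $path: $sent of $fileSize bytes written");
        }
    });

    $handle = fopen($path, 'rb');
    while (!feof($handle) && !connection_aborted()) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break; // read error: stop and let the shutdown check report it
        }
        echo $chunk;
        flush();
        $sent += strlen($chunk);
    }
    fclose($handle);
}
```

Output buffering or a proxy in front of PHP can hide a disconnect, so a "complete" result here is weaker evidence than an "incomplete" one.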

Related

Apache and Content-Range

I am trying to implement support for Content-Range in PHP-generated files. When a browser sends Range request my script gives correct bytes and it works well.
But while testing how Content-Range looks when downloading a PDF from an Apache server, I noticed that the first request from the browser does not contain a Range header, yet the server still doesn't return the full file, only 32 kB.
In this screenshot you can see that Firefox sends 5 requests to Apache for my_pdf.pdf, and each time Apache responds with 32-192 kB. The whole PDF is 28 MB. Requests 2-5 do contain a Range header, but the first (highlighted) request does not. You can see on the right that Content-Length is 28 MB but that Apache returned only 32 kB.
So my question is: how did Apache know to return only 32 kB and not the whole 28 MB PDF file?
It didn't. If you look at the Content-Length header in the response, it shows the full file size of 29.3 million bytes.
The client probably closed the connection without reading the entire response.
The answer posted by @duskwuff is correct: Firefox terminates the transfer of the first request once it has received enough to process the PDF.
Below are a few details I discovered.
Firefox will terminate the transfer if your script returns these headers:
Accept-Ranges: bytes
Content-Length: 29293315
You can (but don't have to) also return this header:
header("Content-Range: bytes 0-29293314/29293315");
However by default Apache tries to compress whatever PHP returns and then adds this header:
Transfer-Encoding: chunked
And when Firefox (and Chrome) see this, they won't close the connection. So I just disabled Apache compression and everything works. Now Firefox makes a few requests, gets parts of the PDF instead of the whole file, and renders the first page just fine (because it doesn't need the whole PDF to render just the first page).
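For the PHP side of this, here is a minimal sketch of how a script might turn a Range request header into the status code and headers discussed above. The function names `parse_byte_range` and `range_response` are hypothetical. It handles only a single range and falls back to a full 200 response for anything it cannot parse; a stricter implementation would answer unsatisfiable ranges with 416.

```php
<?php
// Hypothetical helper: parse a single-range "Range" header for a file of $size bytes.
// Returns [start, end] (inclusive) or null when the header is absent or unusable.
function parse_byte_range(?string $rangeHeader, int $size): ?array
{
    if ($rangeHeader === null || $size <= 0
        || !preg_match('/^bytes=(\d*)-(\d*)$/', trim($rangeHeader), $m)) {
        return null;
    }
    [$first, $last] = [$m[1], $m[2]];
    if ($first === '') {
        // Suffix form "bytes=-N": the last N bytes of the file.
        $n = (int) $last;
        return $n === 0 ? null : [max(0, $size - $n), $size - 1];
    }
    $start = (int) $first;
    $end = $last === '' ? $size - 1 : min((int) $last, $size - 1);
    return $start <= $end ? [$start, $end] : null;
}

// Hypothetical helper: status code and headers for a full or partial response.
function range_response(?string $rangeHeader, int $size): array
{
    $range = parse_byte_range($rangeHeader, $size);
    if ($range === null) {
        return [200, ['Accept-Ranges: bytes', 'Content-Length: ' . $size]];
    }
    [$start, $end] = $range;
    return [206, [
        'Accept-Ranges: bytes',
        'Content-Range: bytes ' . $start . '-' . $end . '/' . $size,
        'Content-Length: ' . ($end - $start + 1),
    ]];
}
```

In a real script the first argument would typically come from `$_SERVER['HTTP_RANGE'] ?? null`, and the returned lines would be passed to `header()` before sending the requested bytes.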

Can I limit upload size in PHP but still receive truncated files?

I would like people to be able to upload files of any size to my site, but I need only the first 1 kB of each. So I would like PHP to somehow stop receiving a file once it has this 1 kB and then just process the truncated file.
Check this question; it may have the answer for you:
Receiving only chunks of an uploaded file in PHP?
Version 2. If you can't do it with basic PHP support, then:
Create a new PHP script that runs a TCP socket server. Start the socket on another port, for example 8080, and let it handle your file upload.
Wait for the start of the content (strip the headers and other unneeded data), and once you have received 1 kB of the uploaded file, you can parse it, and maybe send a redirect back to the client, and handle that data.
So it's a little tricky, but not impossible.
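A minimal sketch of the socket approach described above, assuming a plain HTTP upload on port 8080. The function names are hypothetical. `read_body_prefix` returns the first bytes of the raw request body; for a multipart form upload that body starts with the boundary and part headers, so the file data itself begins slightly later and would need further trimming.

```php
<?php
// Hypothetical helper: skip the HTTP request headers on $stream and return
// at most $limit bytes of the request body.
function read_body_prefix($stream, int $limit = 1024): string
{
    $buffer = '';
    // Read until the blank line that terminates the headers.
    while (($pos = strpos($buffer, "\r\n\r\n")) === false) {
        $chunk = fread($stream, 4096);
        if ($chunk === false || $chunk === '') {
            return ''; // stream ended before the headers were complete
        }
        $buffer .= $chunk;
    }
    $body = substr($buffer, $pos + 4);
    // Keep reading only until the limit is reached.
    while (strlen($body) < $limit && !feof($stream)) {
        $chunk = fread($stream, $limit - strlen($body));
        if ($chunk === false || $chunk === '') {
            break;
        }
        $body .= $chunk;
    }
    return substr($body, 0, $limit);
}

// Hypothetical usage: accept one connection, keep the first 1 kB, then stop
// reading and send a minimal reply before closing the connection.
function serve_one_upload(string $address = 'tcp://0.0.0.0:8080'): string
{
    $server = stream_socket_server($address, $errno, $errstr);
    if ($server === false) {
        throw new RuntimeException("Cannot listen on $address: $errstr");
    }
    $client = stream_socket_accept($server, 30);
    if ($client === false) {
        fclose($server);
        return ''; // nobody connected within the timeout
    }
    $prefix = read_body_prefix($client, 1024);
    fwrite($client, "HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: 0\r\n\r\n");
    fclose($client);
    fclose($server);
    return $prefix;
}
```

Closing the connection while the client is still uploading may show up as an error in some browsers, which is part of why the answer calls this tricky.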

Setting header for notify file content changed. PHP, Twilio

I'm working on a twilio project with PHP which will be playing back a frequently changing audio file.
Twilio's TwiML Voice documentation states to:
make sure your web server is sending the proper headers to inform us
that the contents of the file have changed
Which headers are these, and how do I set them in PHP?
Which headers are these?
This is how caching works on Twilio:
Twilio requests a .mp3 from your server using a GET request. Your
server sends back a 200 OK, and also sends back an E-Tag header.
Twilio will save the E-Tag header, as well as the mp3 file, in its
database.
The next time Twilio sends a GET request to that URL, it will send
along the E-Tag header (it should look like "If-None-Match"). If the
file has not changed since the last time Twilio accesses it, your
server will send back a 304 Not Modified header. Crucially, it will
not send the mp3 file data. Twilio will use the mp3 file it has
stored in its database. It's much faster for Twilio to read the mp3
file from its database than it is for your server to send it (and it
also saves your server bandwidth).
If you change the content of the mp3 that is being served at the URL,
and Twilio makes a GET request to the URL, then your server will send
back a 200 OK, with a new E-Tag. Twilio will download the file from
your server, and cache it.
How do I set them in PHP?
header('ETag: "uniqueID"');
When sending a file, the web server attaches an ID for the file in a header called ETag. When requesting the file again, the browser checks whether it was already downloaded. If a cached copy is found, the browser sends the ID along with the request. The server checks whether the IDs match; if they do, it sends back header("HTTP/1.1 304 Not Modified"); otherwise it sends the file normally.
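The exchange described above could be sketched in PHP roughly as follows. The function name `conditional_response` is hypothetical, and using an MD5 of the file as the ID is just one possible choice. The comparison is an exact match on a single value, so it ignores the list and weak-validator forms that If-None-Match can also take.

```php
<?php
// Hypothetical helper: decide how to answer a GET for $path given the
// If-None-Match value sent by the client (or null if none was sent).
// Returns [statusCode, headerLines]; the body is sent only for 200.
function conditional_response(string $path, ?string $ifNoneMatch): array
{
    // The ID changes whenever the file contents change.
    $etag = '"' . md5_file($path) . '"';
    $headers = ['ETag: ' . $etag];

    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) {
        return [304, $headers]; // client copy is current: no body
    }
    return [200, $headers];     // new or changed file: send it with the new ID
}
```

In a request handler this might be called with `$_SERVER['HTTP_IF_NONE_MATCH'] ?? null`, followed by `http_response_code()`, `header()` for each returned line, and `readfile()` only when the status is 200.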
One easy way to check is by adding some fake key-value pairs to the end of the URL, like http://yoururl.com/play.mp3?key=somevalue. Your website should still serve the same mp3 as it would if you loaded example.com/test.mp3, but to Twilio it will appear to be a new URL (uncached).
Twilio uses Squid to cache MP3 files. You can control how long an item is cached using the Cache-Control header.
cache-control: max-age=3600
http://wiki.squid-cache.org/SquidFaq/InnerWorkings#How_does_Squid_decide_when_to_refresh_a_cached_object.3F

Content-Length header is not getting in Android 2.3 above browser

I have a hybrid application in which a simple PHP page opens that contains links to files, and I have implemented the file download functionality in my Android wrapper.
For the user's convenience I show the length and progress of the download while the file is downloading. For that, my application server sets a Content-Length header to pass the size to the device, but the problem I am facing is surprising.
The file length works fine in Android 2.2; I get the Content-Length header correctly. But in Android 2.3 and above I get the content length for smaller files, while for larger files I don't get the header field at all.
con.getHeaderField("content-length");
returns null on Android 2.3 and above.
So is there a size limit for the user agent above 2.3? If it works fine in 2.2, there should be no problem on the server end; it seems to be a problem only with the device's user agent.
Update
I have tried it with files of different sizes, and it works fine up to 60 KB on Android 2.3 and above as well.
It sounds like the server may be sending the file in chunks. Check the response for the presence of the following header:
Transfer-Encoding: chunked
If it is present, the response is chunked and you will not get a Content-Length header.
See http://en.wikipedia.org/wiki/Chunked_transfer_encoding for more details.
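Since the server in the question is PHP, one way to run this check from the server side is to fetch the response headers and look for the chunked marker. This is only a sketch: `response_is_chunked` is a hypothetical helper, and it expects header lines in the form returned by PHP's `get_headers($url)` when called without the second argument.

```php
<?php
// Hypothetical helper: true if any header line declares chunked transfer encoding.
// Header names are case-insensitive, so the comparison ignores case.
function response_is_chunked(array $headerLines): bool
{
    foreach ($headerLines as $line) {
        if (stripos($line, 'Transfer-Encoding:') === 0
            && stripos($line, 'chunked') !== false) {
            return true;
        }
    }
    return false;
}
```

Keep in mind that headers can differ by client, for example when the client advertises gzip support, so a check made from another machine may not match what the device sees.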

Caching HTTP responses when they are dynamically created by PHP

I think my question seems pretty casual but bear with me as it gets interesting (at least for me :)).
Consider a PHP page whose purpose is to read a requested file from the filesystem and echo it as the response. Now the question is how to enable caching for this page. The thing to point out is that the files can be pretty huge, and the point of caching is to save the client from downloading the same content again and again.
The ideal strategy would be to use the "If-None-Match" request header and the "ETag" response header in order to implement a reverse proxy cache system. Even though I know this much, I'm not sure if this is possible or what I should return as the response in order to implement this technique!
Serving huge or many auxiliary files with PHP is not exactly what it's made for.
Instead, look at X-Accel for nginx, X-Sendfile for Lighttpd or mod_xsendfile for Apache.
The initial request gets handled by PHP, but once the download file has been determined it sets a few headers to indicate that the server should handle the file sending, after which the PHP process is freed up to serve something else.
You can then use the web server to configure the caching for you.
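As an illustration of the hand-off, the sketch below builds the headers PHP would send once it has decided which file to serve. The function name `offload_headers` and the example paths are hypothetical. The server must be configured to honor the header: mod_xsendfile needs `XSendFile On`, and nginx expects the X-Accel-Redirect value to be a URI that maps to an `internal` location rather than a filesystem path.

```php
<?php
// Hypothetical helper: headers that ask the web server to send the file itself.
// $target is a filesystem path for X-Sendfile, or an internal URI for nginx.
function offload_headers(string $server, string $target, string $downloadName): array
{
    $headers = [
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . basename($downloadName) . '"',
    ];
    $headers[] = $server === 'nginx'
        ? 'X-Accel-Redirect: ' . $target
        : 'X-Sendfile: ' . $target;
    return $headers;
}

// Example (hypothetical paths); PHP sends no body, the web server does:
// foreach (offload_headers('apache', '/var/files/big.zip', 'big.zip') as $line) {
//     header($line);
// }
```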
Static generated content
If your content is generated from PHP and particularly expensive to create, you could write the output to a local file and apply the above method again.
If you can't write to a local file or don't want to, you can use HTTP response headers to control caching:
Expires: <absolute date in the future>
Cache-Control: public, max-age=<relative time in seconds since request>
This will cause clients to cache the page contents until it expires or when a user forces a page reload (e.g. press F5).
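A small sketch of how these two headers could be produced in PHP. The function name `cache_headers` is hypothetical, and the `$now` parameter exists only to make the result predictable when testing.

```php
<?php
// Hypothetical helper: Expires and Cache-Control lines for a lifetime of
// $maxAge seconds, counted from $now (defaults to the current time).
function cache_headers(int $maxAge, ?int $now = null): array
{
    $now = $now ?? time();
    return [
        // HTTP dates are always expressed in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
        'Cache-Control: public, max-age=' . $maxAge,
    ];
}

// Example: cache for one hour.
// foreach (cache_headers(3600) as $line) {
//     header($line);
// }
```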
Dynamic generated content
For dynamic content you want the browser to ping you every time, but only send the page contents if there's something new. You can accomplish this by setting a few other response headers:
ETag: <hash of the contents>
Last-Modified: <absolute date of last contents change>
When the browser pings your script again, they will add the following request headers respectively:
If-None-Match: <hash of the contents that you sent last time>
If-Modified-Since: <absolute date of last contents change>
The ETag mostly reduces network traffic rather than server work, because in some cases you first have to generate the contents in order to calculate their hash.
The Last-Modified is the easiest to apply if you have local file caches (files have a modification date). A simple condition makes it work:
// Timestamp sent by the client, or 0 if there was no If-Modified-Since header
$since = strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? '') ?: 0;
if (!file_exists('cache.txt') || filemtime('cache.txt') > $since) {
    // update cache file and send back contents as usual (+ cache headers)
} else {
    header('HTTP/1.1 304 Not Modified');
}
If you can't do file caches, you can still use ETag to determine whether the contents have changed meanwhile.
