For various reasons, I need to play the intermediary between an HTTP request and a file on disk. My approach has been to populate the headers and then perform a readfile('/path/to/file.jpg');
Now, everything works fine, except that it returns even a medium-sized image very slowly.
Can anyone provide me with a more efficient way of streaming the file to the client once the headers have been sent?
Note: it's a Linux box in a shared hosting environment, if it matters.
Several web servers allow an external script to tell them to do exactly this. X-Sendfile on Apache (with mod_xsendfile) is one.
In a nutshell, all you send is headers. The special X-Sendfile header instructs the web server to send the named file as the body of the response.
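As a minimal sketch (assuming mod_xsendfile is installed; the path and content type are placeholders), the PHP side could be as simple as:
// mod_xsendfile intercepts the X-Sendfile header and streams
// the named file itself; PHP sends headers only, no body.
$file = '/path/to/file.jpg'; // placeholder path
header('Content-Type: image/jpeg');
header('X-Sendfile: ' . $file);
exit; // nothing else to send: the server supplies the body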
You could start with implementing conditional GET request support.
Send out a "Last-Modified" header with the file and reply with "304 Not Modified" whenever the client requests the file with "If-Modified-Since" and you see that the file has not been modified. Some sensible freshness information (via "Cache-Control" / "Expires" headers) is also advisable to prevent repeated requests for an unchanged resource in the first place.
This way at least the perceived performance can be improved, even if you should find that you can do nothing about the actual performance.
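A minimal sketch of that conditional GET handling (the path is a placeholder):
$path = '/path/to/file.jpg'; // placeholder
$mtime = filemtime($path);

// Reply 304 if the client's cached copy is still current.
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $mtime) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}

header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
readfile($path);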
This should actually be fairly fast. We have done this with large images without a problem. Are you doing anything else before outputting the image that might be slowing the process, such as calculating some metadata on the image?
Edit:
You may need to flush the output and use fread(), i.e.:
$fp = fopen($strFile, 'rb');
// Stream the file in 16 KB chunks, flushing as we go
// so the output is not held back in PHP's buffers.
while (!feof($fp)) {
    print(fread($fp, 1024 * 16));
    ob_flush(); // flush PHP's output buffer first...
    flush();    // ...then push it out to the client
}
fclose($fp);
Basically you want to build a server... That's not trivial.
There is a very promising project for a PHP-based server: Nanoweb.
It's free and fully extensible.
This is what I need to achieve:
Request: http://www.example.com/image5.jpg
This should be rewritten to, for example:
RewriteRule ([^.]+)\.jpg$ /image.php?image=$1
Now, in my image.php, how do I serve the image, if I know where it is?
For example:
<?php
$path = '/images/5/2/3/1/small/latest/'.$_GET['image'].'.jpg';
?>
What is the best way to handle this request, so that it behaves like an image file (sends image headers) and displays the image?
There are various alternatives around the net, mainly X-Sendfile and readfile(), but I'm not sure what is the optimal solution and why.
Assuming the web server can still serve the images directly via their full path, I would just redirect to the image:
header("Location: /images/5/2/3/1/small/latest/{$_GET['image']}.jpg");
exit;
The exit is good to have after sending the header. Also, make sure you do not send any other output before it, as header() will not work once output has started.
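If a redirect is not acceptable (for example, because the real path must stay hidden), a minimal readfile()-based image.php could look like this sketch; the basename() call is an assumption added here to keep the request from escaping the image directory:
// Serve the image through PHP instead of redirecting.
$name = basename($_GET['image']); // strip any path components
$path = '/images/5/2/3/1/small/latest/' . $name . '.jpg';

if (!is_file($path)) {
    header('HTTP/1.1 404 Not Found');
    exit;
}

header('Content-Type: image/jpeg');
header('Content-Length: ' . filesize($path));
readfile($path);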
The fastest way to do this is to leverage the operating system's zero-copy support, where the file is sent by telling the network driver to move data directly from the OS file-system buffer onto the network. This way, no copying through RAM is necessary, and the bottleneck will be your network bandwidth.
I can't find any mention of PHP supporting this, however, which means that if you must serve the file with PHP, you will have to copy the file from the OS file cache into PHP's memory space, and then have PHP ask the driver to copy it out.
I would assume the built-in method for doing that will be the most performant way to do it: http://php.net/manual/en/function.http-send-file.php
I'm writing an API in PHP to wrap a website's functionality, returning everything as JSON/XML. I've been using cURL, and so far it's working great.
The website has a standard file-upload POST form that accepts file(s) up to 1GB.
So the problem is: how do I redirect the file-upload stream to the corresponding website?
I could download the file and upload it afterwards, but I'm limited by my server to just 20MB, and that seems a poor solution anyway.
Is it even possible to control the stream and redirect it directly to the website?
I preserved the original answer at the bottom for posterity, but as it turns out, there is a way to do this.
What you need to use is a combination of the HTTP PUT method (which unfortunately isn't available in native browser forms), the PHP php://input wrapper, and a streaming PHP socket. This gets around several limitations: PHP disallows php://input for POST data, but it does nothing with regards to PUT file data. Clever!
If you're going to attempt this with Apache, you're going to need mod_actions installed and activated. You're also going to need to specify a PUT script with the Script directive in your virtualhost/.htaccess.
http://blog.haraldkraft.de/2009/08/invalid-command-script-in-apache-configuration/
This allows PUT requests only for one URL endpoint. This is the file that will open the socket and forward its data elsewhere; a minimal sketch follows. In my example case, this is just index.php.
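A sketch of such an index.php (the remote host and port are placeholders for whatever service you forward to):
// Read the PUT body from php://input and stream it to a remote
// TCP service without ever buffering the whole file in memory.
$remote = stream_socket_client('tcp://127.0.0.1:9999', $errno, $errstr, 30);
if (!$remote) {
    header('HTTP/1.1 502 Bad Gateway');
    exit;
}

$in = fopen('php://input', 'rb');
while (!feof($in)) {
    fwrite($remote, fread($in, 8192)); // forward 8 KB at a time
}
fclose($in);
fclose($remote);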
I've prepared a boilerplate example of this using the Python requests module as the client, sending a PUT request with an image file. If you run remote_server.py, it will open a service that just listens on a port and awaits the forwarded message from PHP. put.py sends the actual PUT request to PHP. You're going to need to set the hosts in put.py and index.php to the ones you define in your virtual host, depending on your setup.
Running put.py will open the included image file and send it to your PHP virtual host, which will, in turn, open a socket and stream the received data to the Python pseudo-service, printing it to stdout in the terminal. A streaming PHP forwarder!
There's nothing stopping you from using any remote service that listens on a TCP port in the same way, in another language entirely. The client could be rewritten the same way, so long as it can send a PUT request.
The complete example is here:
https://github.com/DeaconDesperado/php-streamer
I actually had a lot of fun with this problem. Please let me know how it works and we can patch it together.
Begin original answer
There is no native way in PHP to pass a file along asynchronously as it comes in with the request body without saving its state to disk in some manner. This means you are hard-bound by the memory limit on your server (20MB). The manner in which the $_FILES superglobal is initialized after the request is received depends on this, as PHP will attempt to migrate that multipart data to a tmp directory.
Something similar can be achieved with the use of sockets, as this will circumvent the HTTP protocol at least, but if the file is passed in the HTTP request, PHP is still going to attempt to save it statefully in memory before it does anything at all with it. You'd have the tail end of the process set up with no practical way of getting that far.
The Stream library comes close, but it still relies on reading the file out of memory on the server side; it has to already be there.
What you are describing is a little bit outside of the HTTP protocol, especially since the request body is so large. HTTP is a request/response-based mechanism, and one depends upon the other... it's very difficult to accomplish an in-place, streaming upload at an intermediary point, since this would imply some protocol that uploads while the bits are streamed in.
One might argue this is more a limitation of HTTP than of PHP, and since PHP is designed expressly with the HTTP protocol in mind, you are moving outside its comfort zone.
Deployments like this are regularly attempted with high success using other scripting languages (Twisted in Python, for example; a lot of people are getting on board with NodeJS for its concurrent design pattern; there are alternatives in Ruby and Java that I know much less about).
I have a PHP page that sends a file to the browser depending on the request data it receives. getfile.php?get=something sends file A, getfile.php?get=somethingelse sends file B and so on and so forth.
This is done like so:
header('Content-Disposition: attachment; filename='. urlencode($filename));
readfile($fileURL);
It works, except it can only send one file at a time. Any other files requested are sent in linear fashion: one starts as soon as another finishes.
How can I get this to send files in parallel if the user requests another file while one is downloading?
Edit: I have tried downloading two files at the same time by directly using their filepaths and it works, so neither Apache, nor the browser seem to have a problem. It seems PHP is the issue. I have by the way used session_start() at the beginning of the page.
This may be down to their browser settings, your server settings, PHP, or all three. Most browsers will only open two simultaneous HTTP connections to the same server, queuing the others. Many web servers will also queue connections if there are more than two from the same browser. If you're using sessions, PHP may be serializing requests that share a session (fulfilling just one active request at a time) to minimize race conditions. (I don't know if PHP does this; some others do.)
Two of these (server and PHP) you're in control of; not much you can do about the browser.
Somewhat OT, but you could always allow them to select multiple files and then send them back a dynamically-created zip (or other container format).
Adding session_write_close() right after I'm finished with the session and before starting the download seems to have solved the issue.
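A minimal sketch of that fix (the file variables are placeholders):
session_start();
// ... read whatever you need from $_SESSION here ...

// Release the session lock so other requests from the same
// browser are not forced to wait for this download to finish.
session_write_close();

header('Content-Disposition: attachment; filename=' . urlencode($filename));
readfile($fileURL);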
I have noticed that files delivered by PHP through readfile or fpassthru techniques are never cached by the browser.
How can I "encourage" browsers to cache items delivered via these methods?
Whether your content is cached or not has nothing to do with readfile() and its relatives; the problem is more likely that the default caching headers issued by the server (which would activate caching for HTML pages and image resources) don't apply when you use PHP to pass through files.
You will have to send the appropriate headers along with your content, telling the browser that caching for this resource is all right.
See for example
Caching tutorial for Web Authors and Webmasters
How to use HTTP cache headers with PHP
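As a minimal sketch (the path and lifetimes are placeholder values), the headers might look like this:
$path = '/path/to/file.jpg'; // placeholder

// Allow the browser to cache this response for a week.
header('Cache-Control: public, max-age=604800');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 604800) . ' GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime($path)) . ' GMT');
header('Content-Type: image/jpeg');
readfile($path);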
I ended up finding this page and using it as a starting point for my own implementation. The example code on this page, along with some of the reading Pekka pointed to, was a great springboard for me.
I need to upload potentially big (as in, tens to hundreds of megabytes) files from a desktop application to a server. The server code is written in PHP, the desktop application in C++/MFC. I want to be able to resume file uploads when the upload fails halfway through, because this software will be used over unreliable connections.
What are my options? I've found a number of HTTP upload components for C++, such as http://www.chilkatsoft.com/refdoc/vcCkUploadRef.html which looks excellent, but it doesn't seem to handle resuming half-done uploads (I assume this is because HTTP 1.1 doesn't support it). I've also looked at the BITS service, but for uploads it requires an IIS server.
So far my only option seems to be to cut the file I want to upload into smaller pieces (say 1 MB each), upload them all to the server, reassemble them with PHP and run a checksum to see if everything went OK. To resume, I'd need some form of 'handshake' at the beginning of the upload to find out which pieces are already on the server. Will I have to code this by hand, or does anyone know of a library that does all this for me, or maybe even a completely different solution? I'd rather not switch to another protocol that supports resume natively, for maintenance reasons (potential problems with firewalls etc.).
I'm eight months late, but I just stumbled upon this question and was surprised that webDAV wasn't mentioned. You could use the HTTP PUT method to upload, and include a Content-Range header to handle resuming and such. A HEAD request would tell you if the file already exists and how big it is. So perhaps something like this:
1) HEAD the remote file
2) If it exists and size == local size, upload is already done
3) If size < local size, add a Content-Range header to request and seek to the appropriate location in local file.
4) Make PUT request to upload the file (or portion of the file, if resuming)
5) If connection fails during PUT request, start over with step 1
You can also list (PROPFIND) and rename (MOVE) files, and create directories (MKCOL), with DAV.
I believe both Apache and Lighttpd have DAV extensions.
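If you end up handling the PUT in plain PHP rather than through a DAV module, a minimal sketch of the resume logic might look like this (the target path is a placeholder, the Content-Range parsing is deliberately simplistic, and the header's $_SERVER key may differ by server API):
$target = '/uploads/abc.dat'; // placeholder target path

// Parse "Content-Range: bytes START-END/TOTAL" if the client sent it.
$offset = 0;
if (isset($_SERVER['HTTP_CONTENT_RANGE']) &&
    preg_match('#bytes (\d+)-\d+/\d+#', $_SERVER['HTTP_CONTENT_RANGE'], $m)) {
    $offset = (int)$m[1];
}

$out = fopen($target, 'c+b'); // open without truncating existing data
fseek($out, $offset);         // continue where the last attempt stopped

$in = fopen('php://input', 'rb');
while (!feof($in)) {
    fwrite($out, fread($in, 8192));
}
fclose($in);
fclose($out);

header('HTTP/1.1 204 No Content');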
1) You need a standard chunk size (say 256 KB). If your file "abc.txt", uploaded by user X, is 78.3 MB, it would be 313 full chunks and one smaller chunk.
2) You send a request to upload stating the filename and size, as well as the number of initial threads.
3) Your PHP code will create a temp folder named after the IP address and filename.
4) Your app can then use MULTIPLE connections to send the data in different threads, so you could be sending chunks 1, 111, 212 and 313 at the same time (with separate checksums).
5) Your PHP code saves them to different files and confirms reception after validating the checksum, giving the number of a new chunk to send, or telling that thread to stop; a sketch of this receiving side appears below.
6) After all threads are finished, you would ask PHP to join all the files; if something is missing, it would go back to step 3.
You could increase or decrease the number of threads at will, since the app is controlling the sending.
You can easily show a progress indicator, either a simple progress bar or something close to DownThemAll's detailed view of chunks.
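A minimal sketch of the chunk-receiving side of such a scheme (the folder layout, parameter names and checksum choice are all assumptions):
// Receive one chunk: ?file=abc.txt&chunk=212&md5=...
// with the raw chunk bytes as the request body.
$dir = sys_get_temp_dir() . '/' . md5($_SERVER['REMOTE_ADDR'] . $_GET['file']);
if (!is_dir($dir)) {
    mkdir($dir, 0700, true);
}

$data = file_get_contents('php://input');
if (md5($data) !== $_GET['md5']) {
    header('HTTP/1.1 400 Bad Request'); // checksum mismatch: ask for a resend
    exit;
}

file_put_contents($dir . '/' . (int)$_GET['chunk'], $data);
echo 'OK';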
libcurl (the C API) could be a viable option:
-C/--continue-at
Continue/Resume a previous file transfer at the given offset. The given offset is the exact number of bytes that will be skipped, counting from the beginning of the source file before it is transferred to the destination. If used with uploads, the FTP server command SIZE will not be used by curl.
Use "-C -" to tell curl to automatically find out where/how to resume the transfer. It then uses the given output/input files to figure that out.
If this option is used several times, the last one will be used.
Google have created a Resumable HTTP Upload protocol. See https://developers.google.com/gdata/docs/resumable_upload
Is reversing the whole process an option? I mean, instead of pushing the file over to the server, make the server pull the file using a standard HTTP GET with all the bells and whistles (like Accept-Ranges, etc.).
Maybe the easiest method would be to create an upload page that accepts the filename and range as parameters, such as http://yourpage/.../upload.php?file=myfile&from=123456, and handle resumes in the client (maybe you could add a function to inspect which ranges the server has received); a sketch of such a page follows below.
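A minimal sketch of such an upload.php (the upload directory is a placeholder, and the data is assumed to arrive as the raw request body):
// Append the request body to the file, starting at the byte
// offset the client asked to resume from.
$name = basename($_GET['file']); // keep input inside the upload directory
$from = isset($_GET['from']) ? (int)$_GET['from'] : 0;
$path = '/uploads/' . $name;     // placeholder upload directory

$out = fopen($path, 'c+b'); // open without truncating existing data
fseek($out, $from);

$in = fopen('php://input', 'rb');
while (!feof($in)) {
    fwrite($out, fread($in, 8192));
}
fclose($in);
fclose($out);

// Report the new size so the client knows where to resume next time.
echo filesize($path);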
# Anton Gogolev
Lol, I was just thinking about the same thing: reversing the whole process, making the server a client and the client a server. Thanks to Roel, it's clearer to me now why it wouldn't work.
# Roel
I would suggest implementing a Java uploader [JumpLoader is good, with its JScript interface and even sample PHP server-side code]. Flash uploaders suffer badly when it comes to BIIIGGG files :), on a gigabyte scale that is.
F*EX can upload files up to the TB range via HTTP and is able to resume after link failures.
It does not exactly meet your needs, because it is written in Perl and needs a UNIX-based server, but the clients can be on any operating system. Maybe it is helpful for you nevertheless:
http://fex.rus.uni-stuttgart.de/
There is also a protocol called tus (https://tus.io) for resumable uploads, with implementations in PHP and C++.