How to start the download of remote big files directly in PHP?

I'm trying to pass a large file (think 100MB or more) from an external API through to the user.
Currently I'm using a somewhat paranoid script (born of past failures) to get the download started as soon as possible.
By 'downloading', I only mean the download trigger in the browser, not the actual transfer of the file: just the point where the user can select where to save the file.
set_time_limit(0);
apache_setenv('no-gzip', 1);
ini_set('zlib.output_compression', 0);
ini_set('output_buffering', 0);
ini_set('implicit_flush', 1);
// close every nested output buffer (a for loop over ob_get_level()
// stops early because the level shrinks with each ob_end_flush())
while (ob_get_level()) {
    ob_end_flush();
}
ob_implicit_flush(1);
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Transfer-Encoding: binary');
header('Content-Disposition: attachment; filename="' . $filename . '"');
header('Cache-Control: private');
flush(); // no ob_flush() needed: every output buffer was closed above
$fh = fopen($external_api_url, 'rb');
while (!feof($fh)) {
    echo fread($fh, 512);
    flush();
}
fclose($fh);
Using this script, it still takes 20 seconds for a 50MB file before the download prompt shows up, and much longer for bigger files.
Is there any way to start the stream faster?
EDIT:
I've also tried fpassthru() and readfile(), but these take 40 seconds for the same 50MB file, which makes me think this approach is better. I've also played around with different read sizes (512, 256, 64, and a couple of others) but didn't notice a difference.

The reason it takes so long for your browser to trigger the download dialog is that the API takes that long to return its first chunk of data. For example, it might be reading the whole file into memory before starting to send any data to you, which would explain why larger files take longer to start even though you always try to write the first 512 bytes as quickly as possible.
I tried to simulate this locally by reading from a local file, but with a sleep(5); statement right before your while loop.
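For reference, a minimal sketch of that simulation; the local test path is my own placeholder, and the Content-Type header is deliberately omitted (see below for why):
// Simulation sketch: a local file plus an artificial 5-second delay
// standing in for the slow external API. '/tmp/test.bin' is a placeholder.
header('Content-Description: File Transfer');
header('Content-Disposition: attachment; filename="test.bin"');
flush(); // push the headers out before the delay

sleep(5); // simulate the API's slow first chunk

$fh = fopen('/tmp/test.bin', 'rb');
while (!feof($fh)) {
    echo fread($fh, 512);
    flush();
}
fclose($fh);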
I was able to get Google Chrome to start the download before any data arrived by omitting the header('Content-type: application/octet-stream'); line, while still issuing a flush(); call before attempting to read the file. (You are already doing this second part.)
This doesn't seem to work with Firefox 3.6, however, so you might need different gimmicks for different browsers, unless you can predict the first character of the file (take a look at BOM), echo it before anything else, and subsequently remove it from the beginning of your first fread() result.
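A rough sketch of that first-byte idea; it assumes the file is UTF-8 text beginning with a BOM, which is the whole gamble (for arbitrary binary data there is no predictable first byte):
// Assumption: the file starts with a UTF-8 BOM (0xEF 0xBB 0xBF).
// Echo those predictable bytes immediately so the browser commits
// to the download, then strip them from the first real chunk.
echo "\xEF\xBB\xBF";
flush();

$fh = fopen($external_api_url, 'rb');
$first = fread($fh, 512);
if (substr($first, 0, 3) === "\xEF\xBB\xBF") {
    $first = substr($first, 3); // drop the BOM we already sent
}
echo $first;
flush();
while (!feof($fh)) {
    echo fread($fh, 512);
    flush();
}
fclose($fh);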
I hope it helps! But basically the external API is screwing you over.

Related

PHP Download Problem: Not Successful w/ Slow Connection Speeds

I've been stuck on this problem for a few days and have yet to find a solution.
What I'm Trying To Do:
I'm attempting to use PHP to download PDFs, and the code works very well for files that download within about a minute and a half. On my home wifi I can download a 159MB file within 10 seconds, and it works every time. But when I limit the connection speed to "Fast 3G" (around 170KB/s, to simulate slower office speeds), the download fails. Nearly every time it does so exactly 3 minutes and 24 seconds into the download, but occasionally at 1 minute and 57 seconds.
What I've Tried:
I've tweaked the php.ini file (setting max_execution_time = 0 and raising memory_limit above the originally configured 128M).
I've tried other download methods that "chunk" the larger PDFs. This has been mostly unsuccessful; in one instance the download completed, but the PDF wouldn't open. According to the poster of that solution, it was only valid for UTF-8 encoded files, and the ones I'm dealing with are UTF-16. (I believe it was some kind of incompatibility with the print() function.)
I've made sure the file downloads fine via a direct link in the URL. It has no problems that way, but that was only for testing and cannot be a permanent solution, because the PDFs I'm dealing with contain sensitive information. Based on this result, I was at least able to narrow the problem down to PHP rather than IIS.
Here's the current code I'm using:
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Content-Type: application/force-download");
header("Content-Type: application/download");
header('Content-Disposition: attachment; filename="' . basename($file) . '"');
header("Content-Transfer-Encoding: binary");
header('Content-Length: ' . filesize($file));
// $file is a full path to the PDF
while (ob_get_level()) {
    ob_end_clean();
}
readfile($file);
flush();
exit;
/* I realize it may be off, but it is at least working for quicker load
   times as it currently is, so I'm leaving it alone for now */
I tried to include all the information that seemed relevant, but if anything else would be useful, please let me know! The code above is the current code handling the download process that I mentioned at the top of the post.
Instead of
readfile($file);
flush();
I would try
$handle = fopen($file, 'rb'); // 'rb' matters on Windows/IIS: it prevents newline translation from corrupting the PDF
while (!feof($handle)) {
    echo fread($handle, 8192);
    flush();
}
fclose($handle);
You may need to adjust the above to handle proper encoding, but that will depend on your environment.

Proper way to use flush and ob_flush during file download in php

I'm using the code below to download a file. It uses the flush() and ob_flush() functions. I read brenns10's comment at this link.
It says that using flush() and ob_flush() causes the data to sit in memory until it's delivered, and that this is not good for a server with limited resources. I'm on a shared server.
I need an explanation of this, please: should I call flush() and ob_flush() as in the code below, or should I remove them? Thanks.
$download_file = '10gb_file.zip';
$chunk = 1024; // chunk size in KB (1024 KB = 1MB per read)
if (file_exists($download_file) && is_file($download_file)) {
    header('Cache-control: private');
    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($download_file));
    header('Content-Disposition: filename=' . $download_file);
    $file = fopen($download_file, 'r');
    while (!feof($file)) {
        print fread($file, round($chunk * 1024));
        ob_flush();
        flush();
    }
    fclose($file);
}
The better option when sending large files is to turn off output buffering in PHP and let your web server or the underlying CGI layer handle it; they're better equipped to deal with large output streams, using techniques such as writing to temporary files or writing straight to the socket.
If you have already started an output buffer elsewhere in the code, you should first close it using ob_end_clean():
// discard anything already in the buffer, as it might corrupt your download
ob_end_clean();
// stream the file to the client without copying it into memory
$fp = fopen($download_file, 'rb');
fpassthru($fp);
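For completeness, a fuller sketch might combine the question's headers with fpassthru(); the header lines are carried over from the question above (with Content-Disposition tightened to an attachment), not from this answer:
// Sketch only: headers carried over from the question above.
ob_end_clean();
header('Cache-control: private');
header('Content-Type: application/octet-stream');
header('Content-Length: ' . filesize($download_file));
header('Content-Disposition: attachment; filename="' . basename($download_file) . '"');
$fp = fopen($download_file, 'rb');
fpassthru($fp);
fclose($fp);
exit;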

Returning a local BIG PDF to be viewed with PHP isn't working properly

I figured out how to return the PDF correctly; however, it takes 5-20 seconds (depending on file size) for Google Chrome/Microsoft Edge/Internet Explorer to show a progress bar.
$file = 'http://foobar.com/data/users/1/uploads/2342343/signed/protected.pdf';
$filename = 'protected';
$headers = get_headers($file, 1);
$fsize = $headers['Content-Length'];
header('Content-type: application/pdf');
header('Content-Disposition: inline; filename="' . $filename . '"');
header('Content-Transfer-Encoding: binary');
header('Content-Length: ' . $fsize);
header('Accept-Ranges: bytes');
@readfile($file);
This is taking way too long to actually display a result because the loading doesn't start fast enough. What am I missing? Am I doing something wrong that keeps the progress bar from showing immediately and starting to load the PDF? Is get_headers() actually downloading the file first?
Or what is the best way to return a big PDF as fast as possible?
I think you should read the file as a stream, flushing the content to the client in parts.
I wrote some stream-reading code recently, but I was using OCI-Lob::read because my PDF was stored in an Oracle database. Your file is presumably stored differently, so you'll need a different implementation. In my case I read the file contents 1MB at a time, and I was not flushing content to the client.
I'm no PHP expert, but I think you could look at the flush() function to get the loading progress going.
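Adapted to your setup, a chunked-streaming sketch might look like this (reading the remote URL with fopen() assumes allow_url_fopen is enabled; the flushing is my suggestion, not something I tested in the Oracle case):
// Sketch: stream the remote PDF in 1MB chunks, flushing each chunk to the
// client. Assumes allow_url_fopen is enabled so fopen() can read the URL.
header('Content-Type: application/pdf');
header('Content-Disposition: inline; filename="' . $filename . '"');
flush(); // get the headers out before fetching any data

$fh = fopen($file, 'rb');
while (!feof($fh)) {
    echo fread($fh, 1024 * 1024); // 1MB per chunk, as in the Oracle case
    flush();
}
fclose($fh);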

Download large CSV file to browser while it is being generated

I have a script that generates a large CSV file with fputcsv and sends it to the browser. It works, but the browser doesn't show the file download prompt (or start downloading the file) until the whole CSV file has been generated server-side, which takes a long time.
Instead, I'd like the download to begin while the remainder of the file is still being generated. I know this is possible because it's how the 'Export database' option in phpMyAdmin works: the download starts as soon as you click the 'Export' button, even if your database is huge.
How can I tweak my existing code, below, so the download begins immediately?
$csv = 'title.csv';
header("Content-Type: text/csv;charset=utf-8");
header("Content-Disposition: attachment;filename=\"$csv\"");
header("Pragma: no-cache");
header("Expires: 0");
$fp = fopen('php://output', 'w');
fputcsv($fp, array_keys($array), ';', '"');
foreach ($array as $fields) {
    fputcsv($fp, $fields, ';', '"');
}
fclose($fp);
exit();
Empirically, it seems that when receiving responses featuring a Content-Disposition: attachment header, different browsers will show the file download dialog at the following moments:
Firefox shows the dialog as soon as it receives the headers.
Internet Explorer shows the dialog once it has received the headers plus 255 bytes of the response body.
Chromium shows the dialog once it has received the headers plus 1023 bytes of the response body.
Our objectives, then, are as follows:
Flush the first kilobyte of the response body to the browser as soon as possible, so that Chrome users see the file download dialog at the earliest possible moment.
Thereafter, regularly send more content to the browser.
Standing in the way of these objectives are, potentially, multiple levels of buffering, which you can try to fight in different ways.
PHP's output_buffer
If you have output_buffering set to a value other than Off, PHP will automatically create an output buffer which stores all output your script tries to send to the response body. You can prevent this by ensuring that you have output_buffering set to Off from your php.ini file, or from a webserver config file like apache.conf or nginx.conf. Alternatively, you can turn off the output buffer, if one exists, at the start of your script using ob_end_flush() or ob_end_clean():
if (ob_get_level()) {
    ob_end_clean();
}
Buffering done by your webserver
Once your output gets past the PHP output buffer, it may be buffered by your webserver. You can try to get around this by calling flush() regularly (e.g. every 100 lines), although the PHP manual is hesitant about providing any guarantees, listing some particular cases where this may fail:
flush
...
Flushes the write buffers of PHP and whatever backend PHP is using (CGI, a web server, etc). This attempts to push current output all the way to the browser with a few caveats.
flush() may not be able to override the buffering scheme of your web server ...
Several servers, especially on Win32, will still buffer the output from your script until it terminates before transmitting the results to the browser.
Server modules for Apache like mod_gzip may do buffering of their own that will cause flush() to not result in data being sent immediately to the client.
You can alternatively have PHP call flush() automatically every time you try to echo any output, by calling ob_implicit_flush at the start of your script - though beware that if you have gzip enabled via a mechanism that respects flush() calls, such as Apache's mod_deflate module, this regular flushing will cripple its compression attempts and probably result in your 'compressed' output being larger than if it were uncompressed. Explicitly calling flush() every n lines of output, for some modest but non-tiny n, is thus perhaps a better practice.
Putting it all together, then, you should probably tweak your script to look something like this:
<?php
if (ob_get_level()) {
    ob_end_clean();
}

$csv = 'title.csv';
header("Content-Type: text/csv;charset=utf-8");
header("Content-Disposition: attachment;filename=\"$csv\"");
header("Pragma: no-cache");
header("Expires: 0");
flush(); // Get the headers out immediately to show the download dialog in Firefox

$array = get_your_csv_data(); // This needs to be fast, of course

$fp = fopen('php://output', 'w');
fputcsv($fp, array_keys($array), ';', '"');
foreach ($array as $i => $fields) {
    fputcsv($fp, $fields, ';', '"');
    if ($i % 100 == 0) {
        flush(); // Attempt to flush output to the browser every 100 lines.
                 // You may want to tweak this number based upon the size
                 // of your CSV rows.
    }
}
fclose($fp);
?>
If this doesn't work, then I don't think there's anything more you can do from your PHP code to try to resolve the problem - you need to figure out what's causing your web server to buffer your output and try to solve that using your server's configuration files.
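As one concrete example on the server side: if PHP runs behind nginx, nginx honours the X-Accel-Buffering response header, so the script itself can opt out of nginx's buffering (an nginx-specific measure, not part of the answer above):
// nginx-specific: asks nginx not to buffer this response.
// Harmless elsewhere, since other servers simply ignore the header.
header('X-Accel-Buffering: no');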
I have not tested this, but try flushing after every n rows of data:
flush();
Try Mark Amery's answer, but let me emphasize this statement:
$array = get_your_csv_data(); // This needs to be fast, of course
If you're fetching a huge number of records, fetch them in chunks (every 1000 records, for example). So, as sketched below:
Fetch 1000 records
Output them
Repeat
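A rough sketch of that fetch/output/repeat loop; fetch_csv_rows() and its paging are hypothetical stand-ins for your real data access:
// Hypothetical sketch: fetch_csv_rows($offset, $limit) stands in for
// your real query (e.g. SELECT ... LIMIT/OFFSET); only the
// fetch/output/repeat shape matters here.
$fp = fopen('php://output', 'w');
$offset = 0;
$limit = 1000;
while (true) {
    $rows = fetch_csv_rows($offset, $limit);
    if (!$rows) {
        break; // no more records
    }
    foreach ($rows as $fields) {
        fputcsv($fp, $fields, ';', '"');
    }
    flush(); // push each chunk to the browser as soon as it's written
    $offset += $limit;
}
fclose($fp);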
I think you are looking for the octet-stream header:
$csv = 'title.csv';
header('Content-Type: application/octet-stream');
header("Content-Disposition: attachment;filename=\"$csv\"");
header('Content-Transfer-Encoding: binary');
header('Cache-Control: must-revalidate');
header('Expires: 0');
$fp = fopen('php://output', 'w');
fputcsv($fp, array_keys($array), ';', '"');
foreach ($array as $fields) {
    fputcsv($fp, $fields, ';', '"');
}
fclose($fp);
exit();

Problem serving large (image?) files to Safari

Server setup: Apache 2.2.14, PHP 5.3.1
I use a PHP script to serve files of all types as part of an application with complex access permissions. Things have worked out pretty well so far, but then one of our beta users took a picture with a 10-megapixel digital camera and uploaded it. It's somewhere north of 9 MB (9,785,570 bytes).
For some reason, in Safari (and so far ONLY in Safari; I've reproduced this on 5.0.5), the download will sometimes hang partway through and never finish. Safari just keeps merrily trying to load forever. I can't reproduce the problem consistently: if I reload over and over, sometimes the download completes and sometimes it doesn't. There's no apparent pattern.
I'm monitoring the server access logs and in the cases where Safari hangs I see a 200 response of the appropriate filesize after I navigate away from the page, or cancel the page load, but not before.
Here's the code that serves the file, including headers. When the download succeeds and I inspect the headers browser-side I see the content type and size have been set correctly. Is it something in my headers? Something in Safari? Both?
header('Content-Type: ' . $fileContentType);
header('Content-Disposition: filename=' . basename($fpath));
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Length: ' . filesize($fpath));
ob_clean();
flush();
session_write_close();
readfile($fpath);
exit;
FURTHER BULLETINS AS EVENTS WARRANT:
By artificially throttling download speed to 256k/s -- that is, by chunking the file into 256k pieces and pausing between serving them, as
$chunksize = 1 * (256 * 1024); // how many bytes per chunk
$size = filesize($fpath);      // the file's size in bytes
if ($size > $chunksize) {
    $handle = fopen($fpath, 'rb');
    while (!feof($handle)) {
        $buffer = fread($handle, $chunksize);
        echo $buffer;
        ob_flush();
        flush();
        sleep(1); // throttle: one chunk per second
    }
    fclose($handle);
} else {
    readfile($fpath);
}
I was able to guarantee a successful display of the image file in Safari under arbitrary conditions.
A chunk size of 512k does not guarantee a successful display.
I am almost certain that the problem here is that Safari's image renderer can't handle data coming in any faster, but:
I would like to know for certain.
I would also like to know whether there's some other kind of workaround, like a special CSS webkit property, for handling large images, because 256k/s is kind of dire for a 10 MB file.
And just to pile on the weird: setting up a finer-grained sleep with usleep() causes problems at a sleep time of 500ms but not at 750ms.
I did a little digging and found little that was specific, but I do see a lot of people asserting that Safari has issues with honoring cache control directives. One person asserts:
You don't need all those Cache Controls, just a max-age with Expires set in the past, does everything all those headers your using does [...] many of those Cache Controls headers your using cause problems for Safari [...] Lastly, some browsers don't understand filename, the only understand name, which must be included in the Content-Type header line, never in the Content-Disposition line. [...]
( see last post in thread: http://www.codingforums.com/archive/index.php/t-114251.html OLD info, but you never know... )
So possibly comment out some of your headers and look to see if there is an improvement.
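Following that advice, a pared-down version of your header block might look like this (an untested guess at "minimal", not a confirmed fix):
// Untested sketch: drop the aggressive Cache-Control/Pragma directives,
// keep only a max-age plus an Expires date in the past, per the quote.
header('Content-Type: ' . $fileContentType);
header('Content-Disposition: filename=' . basename($fpath));
header('Cache-Control: max-age=0');
header('Expires: Thu, 01 Jan 1970 00:00:00 GMT');
header('Content-Length: ' . filesize($fpath));
session_write_close();
readfile($fpath);
exit;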
(anecdotal) I also saw some older posts complaining both about Safari resuming an interrupted download by appending the whole file to the end of the partial one, and about endless downloads that appear to count bytes beyond the file length being sent. (anecdotal)
You might want to try "chunking" the file while reading it in.
There are numerous posts on PHP.net that explain ways to do that: http://php.net/manual/en/function.readfile.php
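Those snippets generally take this shape (a paraphrase of the readfile_chunked() examples on that manual page, not code from this thread):
// Chunked replacement for readfile(), after the manual's examples.
// 1048576 bytes = 1MB per chunk (literal used for old-PHP compatibility).
function readfile_chunked($path, $chunksize = 1048576) {
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    while (!feof($handle)) {
        echo fread($handle, $chunksize);
        if (ob_get_level()) {
            ob_flush(); // flush PHP's own buffer if one exists
        }
        flush();
    }
    return fclose($handle);
}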
Try:
ob_start();
readfile($path);
$buffer = ob_get_clean(); // note: this holds the entire file in memory before echoing it
echo $buffer;
