On my page, people can choose either to view a PDF file on screen or to download it (to view it later when they're offline).
When users choose to download, the code is executed once. I keep track of this with a counter that increments by 1 for each download, so this option works fine; it is the if-block below.
When users choose to view the file, the PDF is displayed, so that's OK, but the counter increments by 2 for each view. That code runs from the else-block below.
I also checked the Yii trace, and the whole request really is processed twice, but only when viewing the file...
if ($mode==Library::DOWNLOAD_FILE){
//DOWNLOAD
Yii::app()->getRequest()->sendFile($fileName, @file_get_contents( $rgFiles[0] ) );
Yii::app()->end();
}
else {
//VIEW
// Set up PDF headers
header('Content-type: application/pdf');
header('Content-Disposition: inline; filename="' . $rgFiles[0] . '"');
header('Content-Transfer-Encoding: binary');
header('Content-Length: ' . filesize($rgFiles[0]));
header('Accept-Ranges: bytes');
// Render the file
readfile($rgFiles[0]);
Yii::app()->end();
}
}
I tried a few other options, just to see what causes this to run twice:
When I remove the PDF headers from the code above, the counter is incremented by 1, but I obviously only get garbage on the screen...
If I get rid of the readfile command, the counter is also incremented by 1, but the browser won't render the PDF (because without this line it is not getting the data)...
So it's only when going through the else-block that the whole Yii request is executed twice...
Thanks in advance for any suggestions...
I think this is because with the sendFile() method you actually open the file just once, whereas in the else branch you open it twice.
In the if branch you open the file once with file_get_contents() and pass its contents as a string to the sendFile() method, which then measures the size of that string, outputs the headers, and so on: http://www.yiiframework.com/doc/api/1.1/CHttpRequest#sendFile-detail
In the else branch you access the file first with filesize() and then again with readfile().
I think you could solve this problem by rewriting the else branch along the lines of the sendFile() method:
Basically, read the file into a string with file_get_contents(), and then get the length of this string with mb_strlen(). After you output the headers, just echo the contents of the file without reopening it.
You could even copy the whole sendFile() method into the else branch and change "attachment" to "inline" in the line shown below. Alternatively, replace the whole if/else statement with a single sendFile-style method and switch between attachment and inline depending on whether the user downloads or views. More elegant still would be to override sendFile() and extend it with an extra parameter that selects view or download:
header("Content-Disposition: attachment; filename=\"$fileName\"");
So I think something like this would be a solution:
// open the file just once
$contents = file_get_contents($rgFiles[0]);
if ($mode==Library::DOWNLOAD_FILE){
//DOWNLOAD
// pass the contents of file to the sendFile method
Yii::app()->getRequest()->sendFile($fileName, $contents);
} else {
//VIEW
// calculate length of file.
// Note: the sendFile() method uses some more magic to calculate length if the $_SERVER['HTTP_RANGE'] exists, you should check it out if this does not work.
$fileSize=(function_exists('mb_strlen') ? mb_strlen($contents,'8bit') : strlen($contents));
// Set up PDF headers
header('Content-type: application/pdf');
header('Content-Disposition: inline; filename="' . $rgFiles[0] . '"');
header('Content-Transfer-Encoding: binary');
header('Content-Length: ' . $fileSize);
header('Accept-Ranges: bytes');
// output the file
echo $contents;
}
Yii::app()->end();
I hope this solves your problem, and my explanations are understandable.
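The idea of sharing one code path for view and download, with only the disposition changing, could be sketched roughly as follows. This is a minimal illustration, not Yii's implementation: buildPdfHeaders() is a hypothetical helper name, and it only builds the header strings so that the caller decides when to send them.

```php
<?php
/**
 * Build the response headers for serving a PDF either inline or as a download.
 * buildPdfHeaders() is a hypothetical helper, not part of Yii.
 */
function buildPdfHeaders(string $path, int $length, bool $inline): array
{
    $disposition = $inline ? 'inline' : 'attachment';
    // Use only the base name so the server-side directory is not exposed.
    $name = basename($path);

    return [
        'Content-Type: application/pdf',
        'Content-Disposition: ' . $disposition . '; filename="' . $name . '"',
        'Content-Length: ' . $length,
    ];
}

// Usage sketch: read the file once, then emit headers and body.
// $contents = file_get_contents($rgFiles[0]);
// foreach (buildPdfHeaders($rgFiles[0], strlen($contents), true) as $h) {
//     header($h);
// }
// echo $contents;
```

Keeping the header construction in a pure function makes it easy to check what would be sent without actually producing a response.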
I'm at a bit of a loss as to why this folder is not being found. I have a script that, after searching a database to find the $filename of someone's purchase based on a stored random code, should simply return their file. My code looks like this (including the trailing end of the db query):
$stmt_2 -> bind_result($filename);
$stmt_2 -> fetch();
$stmt_2 -> close();
// For .zip files
$filepath='/media-files/Label/' . $filename;
if (headers_sent()) {
echo 'HTTP header already sent';
} else {
if (!is_file($filepath)) {
header($_SERVER['SERVER_PROTOCOL'].' 404 Not Found');
echo 'File not found.';
} else if (!is_readable($filepath)) {
header($_SERVER['SERVER_PROTOCOL'].' 403 Forbidden');
echo 'File not readable.';
} else {
header('Content-Type: application/zip');
header('Content-Disposition: attachment; filename="' . basename($filepath) . '"');
header('Content-Length: ' . filesize($filepath));
readfile($filepath);
exit;
}
}
When I run this code, I receive "File not found." so !is_file($filepath) is where it is getting tripped up -- However, the path is correct and the zip is definitely there, so I'm not sure what is wrong here.
In terms of debugging, I've tried removing the checks, going directly to the headers and readfile, which returns an empty zip folder. What does work is if I navigate directly to the file by URL...
UPDATE
The file path issue has been fixed, but I am still not able to download the file. In all attempts I get either ERR_INVALID_RESPONSE or if I try to brute force download the file, it returns an empty file. I tried using these headers with no success:
header_remove();
ob_end_clean();
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $filename . '"');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-Length: ' . filesize($filepath));
readfile($filepath);
ob_end_flush();
exit;
They are large audio files, which appears to be causing the issue...
There are two types of paths:
(a) The path of a URL. A web address defines the root of your website.
e.g. https://www.stackoverflow.com is the start of the site. If you address /questions on this site, you always get the path https://www.stackoverflow.com/questions
(b) The path on the drive where the website is located, starting at the filesystem root.
e.g. /home/httpd/html/MyWebPage/questions
If you try to use /questions in (b), it will fail because you need the whole path.
That said, you need to know where '/media-files/Label/'.$filename is located. It seems to me that /media-files is not at the root level of your filesystem (b).
It may be at the web root, but that is not enough for your system to find the file. You need something like this:
'/root/httpd/MyWebPage/media-files/Label/'.$filename
Nico Haase was absolutely correct: this is a misunderstanding of paths. Here is a link to an article that should clear things up:
https://phpdelusions.net/articles/paths
Currently your script is trying to find the file in:
/media-files/Label/file.zip
not:
/var/www/myproject/media-files/Label/file.zip
The linked article should provide you with all the necessary information.
TL;DR
use:
$filepath=$_SERVER['DOCUMENT_ROOT'].'/media-files/Label/' . $filename;
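To make the difference between the two kinds of path visible while debugging, the path construction can be pulled into a small function and the result printed. This is only a sketch: mediaFilePath() is a hypothetical helper, and the use of basename() is an added precaution that assumes the stored $filename is a bare file name.

```php
<?php
/**
 * Turn a file name into a filesystem path below the web root.
 * mediaFilePath() is a hypothetical helper; $documentRoot would normally be
 * $_SERVER['DOCUMENT_ROOT'].
 */
function mediaFilePath(string $documentRoot, string $filename): string
{
    // basename() drops any directory parts that came from the stored value.
    return rtrim($documentRoot, '/') . '/media-files/Label/' . basename($filename);
}

// Usage sketch:
// $filepath = mediaFilePath($_SERVER['DOCUMENT_ROOT'], $filename);
// var_dump($filepath, is_file($filepath)); // shows the path actually checked
```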
UPDATE
With the file size issue it might be that PHP runs out of allowed memory when trying to load the whole file. We could try something like:
flush();
$file = fopen($filepath, "r");
while(!feof($file)) {
// send the current file part to the browser
print fread($file, round(10 * 1024));
// flush the content to the browser
flush();
}
fclose($file);
There are some issues with flush(), but I think it's worth a shot. You can read up on it here: https://www.php.net/manual/en/function.flush
Other than that, there is always the possibility of splitting the file into smaller chunks.
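The chunked loop above can also be wrapped in a function that writes to any stream, which makes it possible to try it out without a browser. This is a sketch under assumptions: streamInChunks() is a hypothetical name, and the 8192-byte default chunk size is arbitrary.

```php
<?php
/**
 * Copy a file to an output stream in fixed-size chunks.
 * streamInChunks() is a hypothetical helper; it returns the bytes written.
 */
function streamInChunks(string $path, $out, int $chunkSize = 8192): int
{
    $in = fopen($path, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $sent = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing left
        }
        $written = fwrite($out, $chunk);
        if ($written === false) {
            break; // the receiving side went away
        }
        $sent += $written;
    }
    fclose($in);
    return $sent;
}

// Usage sketch for a download script (send the headers first):
// while (ob_get_level() > 0) { ob_end_clean(); } // avoid buffering the whole file
// streamInChunks($filepath, fopen('php://output', 'wb'));
// exit;
```

If output buffering is active, the chunks still pile up in PHP's buffer, which is why the usage note discards the buffers before streaming.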
I am having a problem with my webpage. I am building a report tool for downloading data as .csv. I have a PHP script which aggregates the data and builds a CSV from it. The script is invoked with the exec() command; detailed code is below. The script itself uses file_put_contents() to generate the file, which is then stored in my /tmp/ folder until it is downloaded (I am working in a dockerized environment and our filter rules delete the file at the next request, but I could store the file permanently somewhere else if that were necessary). I then check whether the file is present with file_exists() and proceed to invoke my download function. In Firefox I get the desired result: a file containing only the CSV data.
My main problem is: when I download the CSV in Chrome, I get the CSV data followed by the HTML source of my page, starting with <!doctype html> in the first line after the CSV data, then <html lang="de"> in the next line of the CSV, and so on.
Let me show you some code:
In my script:
private function writeToFile($csv)
{
$fileName = '/path/to/file' . '.csv';
echo "\n" . 'Write file to ' . $fileName . "\n";
file_put_contents($fileName, $csv);
}
In my page class:
$filePath = '/path/to/finished/csv/';
exec('php ' . $skriptPath . $skriptParams);
if (file_exists($filePath)) {
$this->downloadCsv($filePath);
} else {
$pageModel->addMessage(
new ErrorMessage('Error Text')
);
}
My download function in the same class:
private function downloadCsv($filePath)
{
header('Content-Description: File Transfer');
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="' . basename($filePath) . '"');
header('Content-Length: ' . filesize($filePath));
readfile($filePath);
}
The code shown above works in Firefox, but not in Chrome. I already tried clearing the output buffer with ob_clean(), or sending and disabling it with ob_end_flush(), but nothing worked for Chrome.
I also tried something like this in my download function:
header('Content-Disposition: attachment; filename="' . basename($filePath) . '"');
$fp =fopen($filePath, 'rw');
fpassthru($fp);
fclose($fp);
This produces the same result in Firefox and Chrome: I get the CSV data followed by the HTML source code, mixed into the same file.
I am working within the Symfony framework, if that helps. I saw there are some helper functions for file downloads, but so far I have not been able to use them successfully.
For now my only goal is to get the download working in Chrome, so that I have a working MVP which can go into production. It is for internal use only, so I don't have to care about IE or other abominations, because our staff is told to use a normal browser... But if someone sees flaws in the general concept, feel free to tell me!
Thanks in advance :)
So I managed to get it working. I was on the wrong track with the output buffer; a simple exit after my readfile() was enough to stop parts of the HTML from ending up in the CSV file.
Code:
private function downloadCsv($filePath)
{
header('Content-Description: File Transfer');
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="' . basename($filePath) . '"');
header('Content-Length: ' . filesize($filePath));
readfile($filePath);
exit;
}
I figured out how to return the PDF correctly, however it takes 5 - 20 seconds (depending on file size) for Google Chrome/Microsoft Edge/Internet Explorer to show a progress bar.
$file = 'http://foobar.com/data/users/1/uploads/2342343/signed/protected.pdf';
$filename = 'protected';
$headers = get_headers($file, 1);
$fsize = $headers['Content-Length'];
header('Content-type: application/pdf');
header('Content-Disposition: inline; filename="' . $filename . '"');
header('Content-Transfer-Encoding: binary');
header('Content-Length: ' . $fsize);
header('Accept-Ranges: bytes');
@readfile($file);
This takes way too long to display a result, because loading doesn't start fast enough. What am I missing? Am I doing something wrong that keeps the progress bar from showing immediately when the PDF starts loading? Is get_headers actually downloading the file first?
Or what is the fastest way to return a BIG PDF?
I think you should read the file as a stream, flushing each part of the content to the client.
I recently wrote some code that reads in a streaming fashion, but I was using OCI-Lob::read because my PDF was stored in an Oracle database. Your file is probably stored differently, so you need a different implementation. In my case, I read the file contents 1 MB at a time. I was not flushing content to the client.
I'm no expert in PHP, but I think you could take a look at the flush function to get the loading progress to show.
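On the question of whether get_headers() downloads the file first: according to the PHP manual, get_headers() issues a GET request by default, and a stream context can be used to make it send HEAD instead. Whether this accounts for the delay described above is not something the thread establishes, so the following is only a sketch of that one adjustment. headRequestContext() is a hypothetical helper name.

```php
<?php
/**
 * Create a stream context that makes PHP's HTTP wrapper send HEAD requests.
 * headRequestContext() is a hypothetical helper name.
 */
function headRequestContext()
{
    return stream_context_create(['http' => ['method' => 'HEAD']]);
}

// Usage sketch (needs network access; the context argument of get_headers()
// is available from PHP 7.1):
// $headers = get_headers($file, true, headRequestContext());
// $fsize   = $headers['Content-Length'] ?? null;
```

Note that readfile() on a URL still has to fetch the remote file through the PHP server before the browser receives it, so reading from a local path, where possible, avoids that extra transfer.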
I am using the following code to download files that are stored outside of the public folder.
$mime_type = mime_content_type("{$_GET['file']}");
define("IMG_LOC","/var/www/domain.com/upload/");
$filename = $_GET['file'];
header('Content-Description: File Transfer');
header('Content-Type: '.$mime_type);
header('Content-Disposition: attachment; filename='.basename(IMG_LOC.$filename));
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-Length: ' . filesize($filename));
readfile($filename);
exit;
The problem is that files downloaded using this script are not usable. Excel opens them empty, PowerPoint reports "there is an error reading", and Word says it is missing a converter. If I download the same files using FTP and open them manually, they open properly, which shows that the files are not corrupt.
For info, this is called from another page as: file.php?file='. $filename
Any help will be welcome. Thanks for your time.
You seem to be missing the path to your file:
header('Content-Length: ' . filesize(IMG_LOC . $filename));
readfile(IMG_LOC . $filename);
You should also add validation for the filename to avoid security problems.
If you still have a problem, you should also check the exact output of the script, perhaps there are php warnings or messages before your file.
I'm deducing that $filename is not the absolute path to the file you're seeking, which is why you define the IMG_LOC constant with a path. It follows that filesize($filename) and readfile($filename) are unlikely to give you what you want.
Try concatenating the constant before the $filename variable like so...
header('Content-Length: ' . filesize(IMG_LOC . $filename));
readfile(IMG_LOC . $filename);
Also, consider that this code is susceptible to header-injection attacks as well as other security issues such as the user supplying you with a filename on your server that you may not want them to see. For example if I call your script with the query string ?file=yourscript.php I will be able to download your actual PHP code and potentially see any sensitive information you might not want exposed like your database password, or worse.
Also, mime_content_type was long documented as deprecated in favour of the Fileinfo extension, so consider using the Fileinfo functions directly.
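The Fileinfo suggestion could look roughly like this. It is a minimal sketch that assumes the fileinfo extension is loaded; detectMimeType() is a hypothetical helper name, and the fallback type is my own choice.

```php
<?php
/**
 * Detect a file's MIME type with the Fileinfo extension.
 * detectMimeType() is a hypothetical helper; it falls back to a generic type.
 */
function detectMimeType(string $path): string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $type  = $finfo->file($path);

    // finfo::file() returns false when the type cannot be determined.
    return $type !== false ? $type : 'application/octet-stream';
}

// Usage sketch: pass the full path, not just the request parameter.
// $mime_type = detectMimeType(IMG_LOC . $filename);
```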
Your script has various issues which together prevent it from working properly. I will go roughly through the lines and leave some comments, then write a short summary and offer another code example with the comments incorporated:
$mime_type = mime_content_type("{$_GET['file']}");
You don't need to wrap the $_GET superglobal in curly brackets and then into double quotes. It's just not necessary for that parameter. You seem to be distracted at this point.
Anyway, this mime-type thing isn't necessary as the mime-type is not interesting if you want to offer the download. You take application/octet-stream instead and you can take care later on for a more specific mime-type:
$mime_type = "application/octet-stream";
Then at the wrong position you define the IMG_LOC constant:
define("IMG_LOC", "/var/www/domain.com/upload/");
This belongs at the very top of the script instead as you define the configuration by that.
In the line:
$filename = $_GET['file'];
you don't do any further error checking. This opens up your script to directory traversal and path injection attacks, which effectively turns the script as you have it into a backdoor: any file the script has access to on that server can be downloaded.
The next two lines are more or less correct then:
header('Content-Description: File Transfer');
header('Content-Type: '.$mime_type);
For the next header:
header('Content-Disposition: attachment; filename='.basename(IMG_LOC.$filename));
I would extract the basename earlier and just pass a variable here. Same for the content-length header later:
header('Content-Length: ' . filesize($filename));
Then you have this block of caching headers, as you serve the file from disk I don't think those are actually necessary, so I would remove them:
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
The readfile line seems ok, you could do some error checking however:
readfile($filename);
And the last line I don't understand, as the script is at the end anyway, why exit?
exit;
My suggestions after this little review:
First, work out which files should be served and how they must be named. Having that information will allow you to close the directory traversal issue, which you have to close first.
Second, put the logic above the output (and the configuration above the logic). That orders the script in a more useful way and makes it easier to handle things like the mime-type (or the caching, if it really is an issue) when you maintain the script.
<?php
/**
* download a file
*
* parameter:
*
* file - name of the file, relative to the upload folder
*/
const IMG_LOC = "/var/www/domain.com/upload";
// validate filename input
if (!isset($_GET['file'])) {
return;
}
$filename = $_GET['file'];
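$path = realpath(IMG_LOC . '/' . $filename);
// realpath() returns false for missing files; also reject paths outside IMG_LOC
if ($path === false || 0 !== strpos($path, IMG_LOC . '/')) {
return;
}
if (!is_readable($path)) {
return;
}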
// obtain data
$basename = basename($filename);
$mime_type = "application/octet-stream"; # can be improved later
$size = filesize($path);
// output
header('Content-Description: File Transfer');
header('Content-Type: ' . $mime_type);
header('Content-Disposition: attachment; filename=' . $basename);
header('Content-Length: ' . $size);
readfile($path);
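The validation step of the script above can be isolated into a function so it can be exercised separately from the output. This is a sketch, not the answer's own code: resolveDownloadPath() is a hypothetical name, and requiring a trailing directory separator in the prefix check is an added assumption about what "inside the upload folder" should mean.

```php
<?php
/**
 * Resolve a requested file name inside $baseDir, or return null if the
 * result would fall outside it. resolveDownloadPath() is a hypothetical helper.
 */
function resolveDownloadPath(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    $path = realpath($baseDir . DIRECTORY_SEPARATOR . $requested);
    if ($base === false || $path === false) {
        return null; // base directory or file does not exist
    }
    // Require the separator so "/upload-old/x" does not pass as "/upload".
    if (strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }
    return is_file($path) && is_readable($path) ? $path : null;
}

// Usage sketch:
// $path = resolveDownloadPath(IMG_LOC, $_GET['file'] ?? '');
// if ($path === null) { http_response_code(404); exit; }
```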
Most peculiar problem with following code. It returns a pdf report to the browser.
function cart_aspdf() {
trace('cart_aspdf_in');
$file = 'order_WS000250.pdf';
header('Content-type: application/pdf');
header('Content-Disposition: inline; filename="' . $file . '"');
$file = APPPATH.'pdfcache/'.$file;
header('Content-Transfer-Encoding: binary');
header('Content-Length: ' . filesize($file));
header('Accept-Ranges: bytes');
trace('cart_aspdf_readfile');
@readfile($file);
trace('cart_aspdf_out');
}
The trace output in Opera, Firefox, IE, and Safari is as you would expect:
cart_aspdf_in
cart_aspdf_readfile
cart_aspdf_out
BUT the trace for chrome shows the following which seems to indicate that the function is being called at least twice if not three times. Why should this be so?
cart_aspdf_in
cart_aspdf_readfile
cart_aspdf_out
cart_aspdf_in
cart_aspdf_readfile
cart_aspdf_in
cart_aspdf_readfile
cart_aspdf_out
The problem does not occur if I omit the content-type line, but then Chrome shows the raw PDF data, which is no use.
I ran into the same problem.
header('Content-Disposition: inline;');
For whatever reason, when the content-disposition is inline it calls the page twice.
This was giving me problems trying to use the referrer because the second call does not pass referrer data.
using
header('Content-Disposition: attachment;');
only runs once, but will not display inside the browsers PDF viewer. It will instead download the file.
I think this needs to be posted on chrome's bugtracker. This is quite annoying and for streaming files a bandwidth waste.
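The thread does not establish why the browser repeats the request. One thing worth checking is whether the repeated requests carry a Range header, since the scripts above advertise Accept-Ranges: bytes. That the extra requests do so is an assumption, not something reported here. A small sketch for logging this, or for skipping a counter on such requests; isRangeRequest() is a hypothetical helper name:

```php
<?php
/**
 * Report whether the current request asks for a byte range.
 * isRangeRequest() is a hypothetical helper; pass it $_SERVER.
 */
function isRangeRequest(array $server): bool
{
    return isset($server['HTTP_RANGE']) && $server['HTTP_RANGE'] !== '';
}

// Usage sketch: log each hit to see how the repeated calls differ.
// trace('range=' . (isRangeRequest($_SERVER) ? $_SERVER['HTTP_RANGE'] : 'none'));
// if (!isRangeRequest($_SERVER)) { /* increment the view counter here */ }
```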