PHP force downloading .xlsx file corrupt

I am working on a site that allows teachers to upload documents and students to download them. However, there is a problem: Microsoft Word (.docx) files download perfectly, but when I download an Excel (.xlsx) file, Excel shows a "This file is corrupt and cannot be opened" dialog. Any help with this would be greatly appreciated!
My download code is as follows:
case 'xlsx':
header('Content-type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet');
header('Content-Disposition: attachment; filename="' . $filename . '"');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Pragma: no-cache');
readfile('./uploads/resources/courses/' . $filename);
break;

I had this problem, and the cause was a BOM (byte order mark).
How to notice it
unzip: Checking the output file with unzip, I saw a warning at the second line.
$ unzip -l file.xlsx
Archive: file.xlsx
warning file: 3 extra bytes at beginning or within zipfile
...
xxd (hex viewer): I saw the first 5 bytes with the following command
head -c5 file.xlsx | xxd -g 1
0000000: ef bb bf 50 4b ...PK
Notice the first 3 bytes, ef bb bf: that's the UTF-8 BOM!
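The same check can be done from PHP when unzip or xxd is not at hand. The following is only a sketch: the function names zip_signature_offset() and describe_download() are made up for illustration, and it assumes the file is an ordinary .xlsx, which is a ZIP archive and so should begin with the bytes PK\x03\x04.

```php
<?php
// Sketch (hypothetical helpers): report how many bytes precede the ZIP
// signature at the start of a file. A healthy .xlsx starts with "PK\x03\x04".

function zip_signature_offset(string $head): ?int
{
    // Position of the local file header signature, or null if absent.
    $pos = strpos($head, "PK\x03\x04");
    return $pos === false ? null : $pos;
}

function describe_download(string $path): string
{
    // Only the first 64 bytes are read; that is an arbitrary assumption.
    $head = (string) file_get_contents($path, false, null, 0, 64);
    $offset = zip_signature_offset($head);
    if ($offset === null) {
        return 'no ZIP signature in the first 64 bytes';
    }
    if ($offset === 0) {
        return 'starts with the ZIP signature';
    }
    // Show the unexpected leading bytes in hex, e.g. "efbbbf" for a UTF-8 BOM.
    return $offset . ' extra byte(s) before the ZIP signature: '
        . bin2hex(substr($head, 0, $offset));
}
```

Running describe_download() on the downloaded file and on the original in the uploads folder shows quickly whether the extra bytes were added on the way out.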
Why?
Maybe a PHP file saved with a BOM, or earlier output from a library.
You have to find the file or command that emits the BOM. In my case I didn't have time to track it down, so I worked around it with output buffering:
<?php
ob_start();
// ... code, includes, etc
ob_get_clean();
// headers ...
readfile($file);
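To track down which PHP file carries the BOM, one option is to scan the files the script has loaded. This is a rough sketch, not a definitive tool: starts_with_utf8_bom() and find_files_with_bom() are hypothetical helper names, and it only detects a UTF-8 BOM at the very start of a file, not output printed by a library at runtime.

```php
<?php
// Sketch (hypothetical helpers): list files that begin with a UTF-8 BOM.

function starts_with_utf8_bom(string $bytes): bool
{
    // The UTF-8 BOM is the three bytes EF BB BF.
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

function find_files_with_bom(array $paths): array
{
    $hits = [];
    foreach ($paths as $path) {
        // Read only the first three bytes of each file.
        $head = @file_get_contents($path, false, null, 0, 3);
        if ($head !== false && starts_with_utf8_bom($head)) {
            $hits[] = $path;
        }
    }
    return $hits;
}

// Possible use, just before the headers are sent:
// print_r(find_files_with_bom(get_included_files()));
```

get_included_files() is a built-in that returns every file pulled in by include or require so far, so the call in the comment covers the main script and its includes.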

This works fine on my local XAMPP setup regardless of extension, so from my point of view no case statement is needed, unless I'm missing something.
I've tested with docx, accdb, xlsx, mp3, anything...
$filename = "equiv1.xlsx";
header('Content-type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $filename . '"');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Pragma: no-cache');

try this:
header("Content-Disposition: attachment; filename=\"$filename\"");
header("Content-Type: application/vnd.ms-excel");

try:
<?php
//disable gzip
#apache_setenv('no-gzip', 1);
//set download attachment
header('Content-Disposition: attachment;filename="filename.xlsx"');
//clean the output buffer
ob_clean();
//output file
readfile('filepath/filename.xlsx');
//discard any extra characters after this line
exit;
?>

Try adding an additional header:
header('Content-Length: ' . filesize('./uploads/resources/courses/' . $filename));
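One way to keep the headers consistent is to build them in one place. This is a minimal sketch, assuming a hypothetical helper named build_download_headers() and a generic application/octet-stream type; the caller is still responsible for making sure nothing has been output before the headers are sent.

```php
<?php
// Sketch (hypothetical helper): build the download headers for one file.

function build_download_headers(string $path, string $downloadName): array
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Cannot read file: $path");
    }
    // Make sure filesize() does not return a stale cached value.
    clearstatcache(true, $path);
    // Drop directory parts, quotes and line breaks from the suggested name.
    $safeName = str_replace(['"', "\r", "\n"], '', basename($downloadName));

    return [
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'Content-Length: ' . filesize($path),
    ];
}

// Possible use (the path is the one from the question):
// $path = './uploads/resources/courses/' . $filename;
// foreach (build_download_headers($path, $filename) as $line) {
//     header($line);
// }
// readfile($path);
// exit;
```

Note that Content-Length only matches if exactly the file's bytes are sent; a stray BOM or whitespace in front of the file makes the declared length too short.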

Probably it's very misleading information given by Windows and has nothing to do with the code, Excel library, or server, and the file itself is a proper one. Windows blocks opening some files downloaded from the Internet (like .xlsx) and instead of asking whether you want to open an insecure file, it just writes that the file is corrupt. In Windows 10, one needs to right-click the file and select "Unblock" (you can read more for example here: https://winaero.com/blog/how-to-unblock-files-downloaded-from-internet-in-windows-10/)

Related

PHP: image corrupted if I force the download

I'm seeing strange behavior with some simple PHP code. When I try to force the download or print out the image using the correct content type, the output file is corrupted.
It seems that the web server (Apache) adds two bytes (0x20 and 0x0A) at the beginning of the file.
This is the code:
$file = "image.png";
$image = file_get_contents($file);
// Test
file_put_contents("test.png", $image);
// Download
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename='.basename($file));
echo $image;
I use the same code on other websites hosted on the same server without problems.
The problem occurs only on download, because test.png works properly. The MD5 checksums of test.png and the original image are equal.
This is the hex code of test.png.
And this is the hex code of the corrupted file after download:
As you can see, there are 2 extra bytes at the beginning. If I remove them, the file works properly again.
I attach a Wireshark screenshot (as you can see, it is not a browser issue):
How can I fix it?
The server is Ubuntu 16.04 with PHP 5.6 (yes, I downgraded from 7.0 to 5.6 because of compatibility issues with Roundcube).
UPDATE 1: I'm trying to find if somewhere in the file there is a space + newline
UPDATE 2:
First of all: thanks.
The code is part of a Wordpress plugin and the download is called using the AJAX system. I wrote a simple plugin test:
<?php
/*
Plugin Name: Test
Plugin URI: http://www.google.com
Description: Test
Author: Anon
Version: 4.0
*/
function downlod_test() {
echo "test";
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename=prova.html');
die();
}
function iopman_shared_download_doc_ajax() {
downlod_test();
}
add_action('wp_ajax_frontend_download_doc', 'iopman_shared_download_doc_ajax');
//downlod_test();
?>
If I call downlod_test with /wp-admin/admin-ajax.php?action=frontend_download_doc it adds the 2 extra bytes. If I call it directly (by removing the comments), it works.
So the problem now is: how do I strip out these bytes that WordPress adds?
$file = "image.png";
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename=' . basename($file));
// No Content-Encoding: gzip header here: the file is sent as-is, not gzip-compressed
header('Content-Transfer-Encoding: binary');
header('Connection: Keep-Alive');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header("Content-Length: " . filesize($file));
// Discard anything already buffered (e.g. stray whitespace) before sending the file
ob_get_clean();
readfile($file);
exit;
To help you find that unwanted whitespace, you can track loaded files with get_included_files(). Additionally, a backtrace could shed some light on what your script does.
In many cases, it'll come from closing PHP tags at the end of the file. Since they're optional it's recommended to just not use them.
Once you locate the file containing that whitespace, you only need to open it in your favourite text editor and remove it (you might need to enable your editor's Show hidden chars feature).
P.S. I understand that's probably simplified code to illustrate the issue but you may want to give readfile() a try.
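As a rough aid for that search, the sketch below flags source files that would emit whitespace on their own. leaks_whitespace() is a hypothetical name and the check is only a heuristic: it looks for whitespace before the first opening tag and for whitespace after the last closing tag, keeping in mind that PHP itself swallows a single newline directly after ?>.

```php
<?php
// Sketch (hypothetical helper): does this PHP source print stray whitespace?

function leaks_whitespace(string $source): bool
{
    // Anything before the first opening tag is sent to the client verbatim.
    if (preg_match('/^\s+<\?/', $source) === 1) {
        return true;
    }
    $pos = strrpos($source, '?>');
    if ($pos === false) {
        return false; // no closing tag, so nothing can trail it
    }
    $tail = substr($source, $pos + 2);
    // PHP ignores one newline immediately after the closing tag.
    $tail = preg_replace('/^\r?\n/', '', $tail, 1);
    // Whatever whitespace remains (for example " \n") would be output.
    return $tail !== '' && trim($tail) === '';
}

// Possible use:
// foreach (get_included_files() as $path) {
//     if (leaks_whitespace((string) file_get_contents($path))) {
//         echo $path, "\n";
//     }
// }
```

A file ending in "?>" followed by a space and a newline would produce exactly the bytes 0x20 0x0A described in the question. The built-in headers_sent($file, $line) can also report the file and line where output first started.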

PHP - Forcing an MP3 file download

So, I need a little help here. I have a site which hosts some MP3s. When users click on the download URL, it links directly to a file called downloadmp3.php, which takes 2 parameters in the URL. The PHP file is included below, and it's basically supposed to FORCE the user to save the MP3 (not play it in the browser or anything).
That doesn't happen. Instead, the file seems to be WRITTEN out as text to the browser: it looks like the raw contents of the MP3 file.
Here is my downloadmp3.php file. Please, what's wrong with this code?
It works on my local LAMP (Bitnami WAMP stack on Windows): in my local testing environment, it sends the file to my browser, and I can save it. When I upload it to the real server, it basically writes out the MP3 file.
Here is the culprit file, downloadmp3.php. Please help.
<?php
include 'ngp.php';
$file = $_GET['songurl'];
$songid = $_GET['songid'];
increasedownloadcount($songid);
if (file_exists($file)) {
header('Content-Description: File Transfer');
header('Content-Type: audio/mpeg');
header('Content-Disposition: attachment; filename=' . basename($file));
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Pragma: public');
header('Content-Length: ' . filesize($file));
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
ob_clean();
flush();
readfile($file);
exit;
}
?>
By the way, this site only hosts mp3s - no other audio or file format. So, this downloadmp3.php script should ideally ask the user where they want to save this file.
Thanks for your help in advance.
I think the filename should be in quotes:
header('Content-Disposition: attachment; filename="' . basename($file) . '"');
Change the Content-Type value to text/plain. With this, the browser won't recognize the file as audio and won't play it. Instead, it will download the file to the client's machine.
There seem to be too many headers. I am sure they do SOMETHING... but this code works.
This code works with MP3 files: it downloads to a file, which plays without a problem.
if(isset($_GET['file'])){
$file = $_GET['file'];
header('Content-type: audio/mpeg');
header('Content-Disposition: attachment; filename="'.$file.'"');
readfile('path/to/your/'.$file);
exit();
}
You can access it with ajax call, or this:
<a id="dl_link" href="download.php?file=<>file-you-wish-to-download<>" target="_blank">Download this file</a>
Hopefully this is of some use

Force Downloading a PDF file, corrupt file

I've got a problem that has come up many times on SO, but I can't seem to find the solution to mine! I'm trying to deliver a PDF file to the client without it opening in the browser. The file downloads, but it is corrupt when I open it and is missing quite a few bytes from the original file. I've tried several methods for downloading the file, but I'll just show you the latest one I've used and hopefully get some feedback.
I have also opened the downloaded PDF in a text editor and there are no php errors at the top of it that I can see!
I'm also aware that readfile() is much quicker but for testing purposes I am desperate to get anything working so I used the while(!feof()) approach!
Anyway, enough rambling, here's the code (taken from why my downloaded file is alwayes damaged or corrupted?):
$file = __DIR__ . '/reports/somepdf.pdf';
$basename = basename($file);
$length = sprintf("%u", filesize($file));
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $basename . '"');
header('Content-Transfer-Encoding: binary');
header('Connection: Keep-Alive');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Length: ' . $length);
ob_clean();
set_time_limit(0);
readfile($file);
Also to note was the difference in file size:
Original: 351,873 bytes
Downloaded: 329,163 bytes
Make sure you're not running any compression output buffering handlers, such as ob_gzhandler. I had a similar case and I had to disable output buffering for this to work properly
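Disabling output buffering before the file is sent could look roughly like the sketch below. end_output_buffers() is a hypothetical helper, and discarding (rather than flushing) the buffered data is an assumption that only makes sense before any part of the file has been written to the buffer.

```php
<?php
// Sketch (hypothetical helper): close all active output buffers, discarding
// their contents, so that readfile() writes straight to the client.

function end_output_buffers(int $keepLevel = 0): int
{
    $closed = 0;
    while (ob_get_level() > $keepLevel) {
        // ob_end_clean() returns false when a buffer cannot be removed;
        // stop in that case instead of looping forever.
        if (!@ob_end_clean()) {
            break;
        }
        $closed++;
    }
    return $closed;
}

// Possible use, before the headers and readfile():
// end_output_buffers();
// header('Content-Type: application/octet-stream');
// readfile($file);
```

The $keepLevel parameter is there only so that an outer buffer (for example one owned by a framework or test runner) can be left alone.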
You are using the ob_gzhandler on the output buffer.
It works by gzencoding chunks of output. The output then is a stream of the encoded chunks.
Each chunk needs to get some bytes to get encoded, so the output is a little bit buffered until enough bytes are available.
However at the end of your script you discard the remaining buffer instead of flushing it.
Use ob_end_flush() instead of ob_clean() and the file gets through fully and not corrupted.
You can use the transfer encoding of ob_gzhandler with file downloads without any problems, as long as you don't destroy the output buffer before it has done its work.
The same applies if any other output buffering that works in chunks is enabled.
Example code:
$file = __DIR__ . '/somepdf.pdf';
$basename = basename($file);
$length = sprintf("%u", filesize($file));
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . $basename . '"');
header('Content-Transfer-Encoding: binary');
header('Connection: Keep-Alive');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Length: ' . $length);
ob_end_flush(); // <--- instead of ob_clean()
set_time_limit(0);
readfile($file);
return;
(FYI: actually even the ob_end_flush(); is not necessary; the important part is not to kick out the output buffer before it has done its work.)
I fought with using content-disposition for pushing a PDF download for two days before finding a solution to my problem. My PDF files were also smaller in size and corrupt - however, I could open them in Windows Preview - just not Adobe. After much troubleshooting, I discovered that Adobe expects the %PDF in the first 1024 bytes of the file. I was doing all my file type checks in my php code before creating the headers. I took out the majority of code before the headers and my PDF file was fixed.
You might not be setting it up the same way I did, but it might be the same problem:
http://helpx.adobe.com/acrobat/kb/pdf-error-1015-11001-update.html

file_get_contents over https displays garbage

I am trying to download uploaded files over https and, while the files themselves download, they cannot be viewed.
I have tried JPG, DOC and XLS files and all give the same problem and, in all cases, if I download via FTP they open perfectly and they open fine in the browser pre-download using the script.
Here is a subset of the script showing the code I am trying to use. Any idea why it downloads garbage?
$_file = sanitiseData($_GET['doc']);
$filename = '/doc_uploads/'.$_file;
if (file_exists($filename)) {
header('Content-type:image/jpg');
header('Content-Disposition: attachment; filename="'.$_file.'"');
echo file_get_contents($filename);
} else {
echo "The file $_file does not exist";
}
Here is a sample of the garbage when trying to view a downloaded JPG via browser:
����JFIF��;CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 90 ��C ��C ... (the rest of the raw binary output is omitted)
Your best bet is to use readfile(...). PHP's website has a nice example that should help you. I use it on my website and it works like a charm:
if (file_exists($file)) {
// Inform browser that this is a force-download
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename='.basename($file));
// Inform browser that data can be binary in addition to text
header('Content-Transfer-Encoding: binary');
// Inform browser that this page expires immediately so that an update to the file will still work.
header('Expires: 0');
header('Cache-Control: must-revalidate');
header('Pragma: public');
header('Content-Length: ' . filesize($file));
// Discard stray buffered output, then push the actual file.
ob_clean();
flush();
readfile($file);
exit();
}
Single quotes are causing you to output the string "$filename" instead of the variable value $filename.
echo file_get_contents($filename);
Though we don't see the function sanitiseData(), we assume it is properly filtering out strings that could be used for path injection like ../.
Addendum:
I'll note that the correct MIME type for a jpeg is image/jpeg, rather than image/jpg. That is likely going to cause you problems too.
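Since sanitiseData() is not shown, here is one possible way to guard against path injection, offered only as a sketch. resolve_in_dir() is a hypothetical helper and not the asker's function; it assumes all downloadable files sit directly inside one directory.

```php
<?php
// Sketch (hypothetical helper): map a requested name to a real file inside
// $baseDir, or return null if it does not resolve to a file in that directory.

function resolve_in_dir(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // base directory does not exist
    }
    // basename() drops any directory parts such as "../".
    $candidate = realpath($base . DIRECTORY_SEPARATOR . basename($requested));
    if ($candidate === false || !is_file($candidate)) {
        return null;
    }
    // After symlinks are resolved, the file must still live under $base.
    $prefix = $base . DIRECTORY_SEPARATOR;
    if (strncmp($candidate, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    return $candidate;
}

// Possible use with the directory from the question:
// $path = resolve_in_dir('/doc_uploads', $_GET['doc'] ?? '');
// if ($path === null) { http_response_code(404); exit; }
```

Returning null for anything unexpected keeps the calling code to a single check before any headers are sent.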

headers to force the download of a tar archive

I have a tar archive on the server that must be downloadable through PHP. This is the code that I've used:
$content=file_get_contents($tar);
header("Content-Type: application/force-download");
header("Content-Disposition: attachment; filename=$tar");
header("Content-Length: ".strlen($content));
unlink($name);
die($content);
The file is downloaded, but it's broken and can't be opened. I think there's something wrong with the headers, because the file on the server can be opened without problems. Do you know how I can solve this problem?
UPDATE
I've tried to print an iframe like this:
<iframe src="<?php echo $tar?>"></iframe>
And the download works, so I'm sure that something is missing in the headers.
I have used this code when I have had to do it:
function _Download($f_location, $f_name){
header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Length: ' . filesize($f_location));
header('Content-Disposition: attachment; filename=' . basename($f_name));
readfile($f_location);
}
_Download("../directory/to/tar/raj.tar", "raj.tar");
//or
_Download("/var/www/vhost/domain.com/httpdocs/directory/to/tar/raj.tar", "raj.tar");
Try that.
Don't use file_get_contents() and then echo or print to output the file. That loads the full contents of the file into memory. A large file can/will exceed your script's memory_limit and kill the script.
For dumping a file's contents to the client, it's best to use readfile(): it will properly slurp up file chunks and spit them out at the client without exceeding available memory. Just remember to turn off output buffering before you do so, otherwise you're essentially just doing file_get_contents() again.
So, you end up with this:
$tar = 'somefile.tar';
$tar_path = '/the/full/path/to/where/the/file/is/' . $tar;
$size = filesize($tar_path);
header("Content-Type: application/x-tar");
header('Content-Disposition: attachment; filename="' . $tar . '"');
header("Content-Length: $size");
header("Content-Transfer-Encoding: binary");
readfile($tar_path);
If your tar file is actually gzipped, then use "application/x-gtar" instead.
If the file still comes out corrupted after download, do some checking on the client side:
Is the downloaded file 0 bytes, even though the download seemed to take much longer than 0 bytes would need? Then something client-side is preventing the download. Virus scanner? Trojan?
Is the downloaded file partially present, but smaller than the original? Something killed the transfer prematurely. Overeager firewall? Download manager having a bad day? Output buffering active on the server and the last buffer bucket not being flushed properly?
Is the downloaded file the same size as the original? Do an md5/sha1/crc checksum on both copies. If those are the same, then something's wrong with the app opening the file, not the file itself.
Is the downloaded file bigger than the original? Open the file in Notepad (or something better like Notepad++, which doesn't take years to open big files) and see if any PHP warning messages, or some invisible whitespace in your script, got inserted into the download at the start or end of the file.
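The checks above can be sketched in PHP as well, assuming both copies are available on one machine. compare_copies() is a hypothetical helper, and SHA-256 is used here simply as one of the possible checksums.

```php
<?php
// Sketch (hypothetical helper): classify a downloaded copy against the original.

function compare_copies(string $original, string $downloaded): string
{
    clearstatcache(); // avoid stale cached file sizes
    $a = @filesize($original);
    $b = @filesize($downloaded);
    if ($a === false || $b === false) {
        return 'unreadable';
    }
    if ($b === 0 && $a > 0) {
        return 'empty';   // nothing arrived at all
    }
    if ($b < $a) {
        return 'smaller'; // transfer cut short or buffer not flushed
    }
    if ($b > $a) {
        return 'larger';  // extra output such as warnings or whitespace
    }
    // Same size: compare content hashes.
    return hash_file('sha256', $original) === hash_file('sha256', $downloaded)
        ? 'identical'
        : 'same size, different content';
}
```

An 'identical' result points at the application opening the file rather than at the download script.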
Try something like the following:
$s_filePath = 'somefile.ext';
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="'. $s_filePath.'"');
header('Content-Transfer-Encoding: binary');
header('Accept-Ranges: bytes');
header('Cache-control: private');
header('Pragma: private');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header("Content-Length: ".filesize($s_filePath));
$r_fh = fopen($s_filePath,'rb');
while(feof($r_fh) === false) {
$s_part = fread($r_fh,10240);
echo $s_part;
}
fclose($r_fh);
exit();
Use Content-Type: application/octet-stream or Content-Type: application/x-gtar
Make sure you aren't echoing anything that isn't the file output. Call ob_clean() before the headers
