PHP UTF-8 to GB2312

Part of our web app has a little Ajax method that will load a page in an iframe or allow you to download it.
We store a bunch of search results from search engines, and we have a script that opens the file containing our info plus the search HTML. We strip out the stuff we don't need from the top (our info) and then serve it up, either by echoing the $html variable or by putting it in a temporary file and dishing it off as a download.
The problem: when I load the page in the iframe, it's interpreted as UTF-8 because everything else is. If I download the file manually it is fine, and Firefox tells me the encoding is x-gbk.
I've tried using mb_convert_encoding() to no avail. We are using PHP 4 on this server.
Thoughts?
EDIT: The code that drives this:
if(!isset($_GET['file']) || $_GET['file'] == '')
{
header("Location: index.php");
exit; // make sure nothing below runs after the redirect
}
$download = false;
if(!isset($_GET['view']) || $_GET['view'] != 'true')
{
$download = true;
}
$file = LOG_PATH . $_GET['file'];
$fileName = end(explode("/", $file));
$fh = fopen($file, "rb");
if(!$fh)
{
echo "There was an error in processing this file. Please retry.";
return;
}
// Open HTML file, rip out garbage at top, inject "http://google.com" before all "images/"
$html = fread($fh, filesize($file));
fclose($fh);
// Need to trim off our headers
$htmlArr = explode("<!", $html, 2);
$htmlArr[1] = "<!" . $htmlArr[1];
if(strstr($file, "google"))
{
$html = str_replace('src="/images/', 'src="http://google.com/images/', $htmlArr[1]);
$html = str_replace('href="/', 'href="http://google.com/', $html);
}
else if(strstr($file, "/msn/"))
{
$html = str_replace('src="/images/', 'src="http://bing.com/images/', $htmlArr[1]);
$html = str_replace('href="/', 'href="http://www.bing.com/', $html);
}
else
{
$html = $htmlArr[1];
}
if(strstr($file, "baidu"))
{
$html = mb_convert_encoding($html, 'utf-8'); // Does not work
}
if($download)
{
// Write to temporary file
$fh = fopen("/tmp/" . $fileName, 'w+');
fwrite($fh, $html);
fclose($fh);
$fh = fopen("/tmp/" . $fileName, "rb");
header('Content-type: application/force-download;');
header("Content-Type: text/html;");
header('Content-Disposition: attachment; filename="' . $fileName . '"');
fpassthru($fh);
fclose($fh);
unlink("/tmp/" . $fileName);
}
else // AJAX Call
{
echo $html;
}

You may want to try iconv() instead of mb_convert_encoding(); it supports a much broader set of encodings.
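For reference, a minimal sketch of both calls, assuming the Baidu pages are GBK-encoded (Firefox reported x-gbk; exact encoding names can vary by PHP build); note that the source encoding has to be passed explicitly:
$html = iconv('GBK', 'UTF-8', $html);
// or with mbstring, which takes the source encoding as its third argument:
$html = mb_convert_encoding($html, 'UTF-8', 'GBK');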

Related

PHP get image from server

I am trying to display an image from a PHP server.
Here is my function:
function get_file($path) {
$fileToGet = $GLOBALS['homedir'].$path;
//echo $fileToGet.PHP_EOL;
if (file_exists($fileToGet)) {
//echo 'file exists';
header('Content-Type: image/png');
header('Content-Length: '.filesize($fileToGet));
file_get_contents($file);
}
}
I am testing in the browser and with Postman, and the image comes back invalid.
What am I missing?
I edited your function a little bit:
function get_file( $path ){
$fileToGet = $GLOBALS['homedir'];
if( substr( $fileToGet, -1) != '/' ){
// add trailing slash if needed
$fileToGet .= '/';
}
$fileToGet .= $path;
if (file_exists($fileToGet)) {
header('Content-Type: image/png');
header('Content-Length: '.filesize($fileToGet));
echo file_get_contents($fileToGet);
}
}
Just a security hint: if $path comes from the user, there may be a problem, because they would be able to access other files on the server.
Think about this code:
get_file( $_GET['path'] );
Then the user can call this URL:
yoursite/yourpage.php?path=../../../mypreciousimage.png
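A minimal sketch of one way to guard against that, assuming $GLOBALS['homedir'] is the intended base directory (realpath() resolves the ../ segments before the check):
$base = realpath($GLOBALS['homedir']);
$real = realpath($base . '/' . $path);
if ($real === false || strpos($real, $base . '/') !== 0) {
exit; // the resolved path escapes the base directory, so refuse to serve it
}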
You're not outputting the contents of the file you're reading:
file_get_contents($file);
You'll need to echo it:
echo file_get_contents($file);
Or:
readfile($file);
You'll probably also want to add exit; to the end of that function, to ensure that no other code runs and that no other output gets sent.
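For instance:
readfile($file);
exit; // stop here so no other code appends output to the image bytes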
Try
print file_get_contents($file);
Instead of
file_get_contents($file);

Download a file with php and polymer

I'm having some trouble with this one. I have found some helpful scripts on the web and have been modifying them for my needs. However, I can't seem to download a file: the server responds with the contents of the file, but the browser never downloads it. I am using Polymer 1.0+ on the client side and PHP on the server side. The client-side code to download a file is as follows:
<!--THIS IS THE HTML SIDE-->
<iron-ajax
id="ajaxDownloadItem"
url="../../../dropFilesBackend/index.php/main/DownloadItem"
method="GET"
handle-as="document"
last-response="{{downloadResponse}}"
on-response="ajaxDownloadItemResponse">
</iron-ajax>
//THIS IS THE JAVASCRIPT THAT WILL CALL THE "iron-ajax" ELEMENT
downloadItem:function(e){
this.$.ajaxDownloadItem.params = {"FILENAME":this.selectedItem.FILENAME,
"PATH":this.folder};
this.$.ajaxDownloadItem.generateRequest();
},
The server-side code is as follows (the URL is different because I do some URL rewriting to reach the correct script):
function actionDownloadItem(){
valRequestMethodGet();
$username = $_SESSION['USERNAME'];
if(validateLoggedIn($username)){
$itemName = arrayGet($_GET,"FILENAME");
$path = arrayGet($_GET,"PATH");
$username = $_SESSION['USERNAME'];
$downloadItem = CoreFilePath();
$downloadItem .= "/".$_SESSION['USERNAME']."".$path."".$itemName;
DownloadFile($downloadItem);
}
else {
echo "Not Logged In.";
}
}
function DownloadFile($filePath) {
//ignore_user_abort(true);
set_time_limit(0); // disable the time limit for this script
//touch($filePath);
//chmod($filePath, 0775);
if ($fd = fopen($filePath, "r")) {
$fsize = filesize($filePath);//this returns 12
$path_parts = pathinfo($filePath);//basename = textfile.txt
$ext = strtolower($path_parts["extension"]);//this returns txt
$header = headerMimeType($ext); //this returns text/plain
header('Content-disposition: attachment; filename="'.$path_parts["basename"].'"'); // use 'attachment' to force a file download
header("Content-type: $header");
header("Content-length: $fsize");
header("Cache-control: private"); //use this to open files directly
while(!feof($fd)) {
$buffer = fread($fd, 2048);
echo $buffer;
}
fclose($fd); // close inside the if block so a failed fopen() never reaches fclose()
}
}
Any help on this one would be greatly appreciated.
First you will need a file handle:
$pathToSave = '/home/something/something.txt';
$writeHandle = fopen($pathToSave, 'wb');
Then, while you are reading the download, write to the file instead of echoing it:
fwrite($writeHandle, fread($fd, 2048));
Finally, after writing has finished, close the handle:
fclose($writeHandle);
I have neglected error checking; you should implement your own.
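Putting the pieces together, a minimal sketch (assuming $fd is the read handle opened in DownloadFile, and the target path is just an example):
$pathToSave = '/home/something/something.txt';
$writeHandle = fopen($pathToSave, 'wb');
while (!feof($fd)) {
fwrite($writeHandle, fread($fd, 2048)); // copy the file in 2 KB chunks
}
fclose($writeHandle);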

Download all images from a website url

I need to download all images (only) from a URL. I've been searching but never found the right answer for this. I have a page that accepts a website URL in an input box; after submitting, it should download all the images from that website. I got this code from the web, but I don't know how to use it or where to put the website URL. For example, I want to download all images from http://www.microsoft.com/en-ph/default.aspx
$url = $_REQUEST['webUrl'];
$string = FetchPage($url);
$image_regex_src_url = '/<img[^>]*'.
'src=[\"|\'](.*)[\"|\']/Ui';
preg_match_all($image_regex_src_url, $string, $out, PREG_PATTERN_ORDER);
$images_url_array = $out[1];
foreach ($images_url_array as $pica)
{
echo '<img src="'.$pica.'" >';
$fileNames[] = $pica;
}
$_SESSION['filesArr'] = $fileNames;
function FetchPage($path)
{
$file = fopen($path, "r");
if (!$file)
{
exit("URL Unknown");
}
$data = '';
while (!feof($file))
{
$data .= fgets($file, 1024);
}
return $data;
}
This is my download script.
$files = array ('http://c.s-microsoft.com/en-ph/CMSImages/mslogo.png?version=856673f8-e6be-0476-6669-d5bf2300391d');
$zip = new ZipArchive();
$tmp_file = tempnam('.','');
$zip->open($tmp_file, ZipArchive::CREATE);
foreach($files as $file){
$download_file = file_get_contents($file);
$zip->addFromString(basename($file),$download_file);
}
# close zip
$zip->close();
# send the file to the browser as a download
header('Content-disposition: attachment; filename=download.zip');
header('Content-type: application/zip');
readfile($tmp_file);
You could also use wget... For example:
wget -r -A jpg,png http://www.microsoft.com/en-ph/default.aspx
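If the goal is only the images needed to render that single page rather than a recursive crawl, wget's page-requisites mode may be closer to what you want:
wget -p -A jpg,png http://www.microsoft.com/en-ph/default.aspx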

Serving documents outside the web root folder

I have a function called "viewDoc" which is supposed to go to a folder outside the web root and fetch a file for me. It seems to work OK with images (JPGs etc.) but with PDFs it just outputs a blank gray page, as demonstrated here: http://www.tutorplanner.com/userimage/viewdoc/12787622467.pdf
Can anyone see what I'm doing wrong? I've been scratching my head over this for a day!
public function viewDoc($doc) {
$path_parts = pathinfo($_SERVER['REQUEST_URI']);
$file = $doc;
$fileDir = '/var/uploads/';
if (file_exists($fileDir . $file))
{
$contents = file_get_contents($fileDir . $file);
//print_r($contents);
header('Content-Type: ' . mime_content_type($fileDir . $file));
header('Content-Length: ' . filesize($fileDir . $file));
readfile($contents);
}
}
readfile() takes a file name as its parameter, NOT the file's contents.
Two examples that would work (file_get_contents):
public function viewDoc($doc) {
$path_parts = pathinfo($_SERVER['REQUEST_URI']);
$file = $doc;
$fileDir = '/var/uploads/';
$filePath = $fileDir . $file;
if (file_exists($filePath))
{
$contents = file_get_contents($filePath);
header('Content-Type: ' . mime_content_type($filePath));
header('Content-Length: ' . filesize($filePath));
echo $contents;
}
}
or (readfile):
public function viewDoc($doc) {
$path_parts = pathinfo($_SERVER['REQUEST_URI']);
$file = $doc;
$fileDir = '/var/uploads/';
$filePath = $fileDir . $file;
if (file_exists($filePath))
{
header('Content-Type: ' . mime_content_type($filePath));
header('Content-Length: ' . filesize($filePath));
readfile($filePath);
}
}
I also added the $filePath variable for you, as there's no reason to concatenate the string multiple times.
Edit
As extra security, following Yazmat's comment, you can use $file = str_replace(array('..', '/'), '', $doc); as this removes all references to other directories (however, the slash replacement also removes access to subdirectories, so you might want to skip it, depending on your code and file structure).
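An alternative sketch that sidesteps traversal entirely, assuming the files live directly in $fileDir with no subdirectories:
$file = basename($doc); // "../../etc/passwd" becomes "passwd"
$filePath = $fileDir . $file;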
You have a serious security issue here: anyone can access anything on your server with the function you wrote. I really advise you not to use it, and instead to put the files that should be accessible in a public web directory.

Damaged data when gzipping

This is the script I have written for gzipping content on my site, which is located in 'gzip.php'. The way I use it is that on pages where I want to enable gzipping I include the file at the top and at the bottom I call the output function like this:
print_gzipped_page('javascript')
If the file is a CSS file I use 'css' as the $type argument, and if it's a PHP file I call the function without any arguments. The script works fine in all browsers except Opera, which gives an error saying it could not decode the page due to damaged data. Can anyone tell me what I have done wrong?
<?php
function print_gzipped_page($type = false) {
if(headers_sent()){
$encoding = false;
}
elseif( strpos($_SERVER['HTTP_ACCEPT_ENCODING'], 'x-gzip') !== false ){
$encoding = 'x-gzip';
}
elseif( strpos($_SERVER['HTTP_ACCEPT_ENCODING'],'gzip') !== false ){
$encoding = 'gzip';
}
else{
$encoding = false;
}
if ($type!=false) {
$type_header_array = array("css" => "Content-Type: text/css", "javascript" => "Content-Type: application/x-javascript");
$type_header = $type_header_array[$type];
}
$contents = ob_get_contents();
ob_end_clean();
$etag = '"' . md5($contents) . '"';
$etag_header = 'Etag: ' . $etag;
header($etag_header);
if ($type!=false) {
header($type_header);
}
if (isset($_SERVER['HTTP_IF_NONE_MATCH']) and $_SERVER['HTTP_IF_NONE_MATCH']==$etag) {
header("HTTP/1.1 304 Not Modified");
exit();
}
if($encoding){
header('Content-Encoding: '.$encoding);
print("\x1f\x8b\x08\x00\x00\x00\x00\x00");
$size = strlen($contents);
$contents = gzcompress($contents, 9);
$contents = substr($contents, 0, $size);
}
echo $contents;
exit();
}
ob_start();
ob_implicit_flush(0);
?>
Additional info: The script works if the length of the document being compressed is only 10-15 characters.
Thanks for the help; here is the corrected version:
<?php
function print_gzipped_page($type = false) {
if(headers_sent()){
$encoding = false;
}
elseif( strpos($_SERVER['HTTP_ACCEPT_ENCODING'], 'x-gzip') !== false ){
$encoding = 'x-gzip';
}
elseif( strpos($_SERVER['HTTP_ACCEPT_ENCODING'],'gzip') !== false ){
$encoding = 'gzip';
}
else{
$encoding = false;
}
if ($type!=false) {
$type_header_array = array("css" => "Content-Type: text/css", "javascript" => "Content-Type: application/x-javascript");
$type_header = $type_header_array[$type];
header($type_header);
}
$contents = ob_get_contents();
ob_end_clean();
$etag = '"' . md5($contents) . '"';
$etag_header = 'Etag: ' . $etag;
header($etag_header);
if (isset($_SERVER['HTTP_IF_NONE_MATCH']) and $_SERVER['HTTP_IF_NONE_MATCH']==$etag) {
header("HTTP/1.1 304 Not Modified");
exit();
}
if($encoding){
header('Content-Encoding: ' . $encoding);
$contents = gzencode($contents, 9);
}
$length = strlen($contents);
header('Content-Length: ' . $length);
echo $contents;
exit();
}
ob_start();
ob_implicit_flush(0);
?>
This approach is a bit too clumsy. Rather, make use of ob_gzhandler. It will automatically gzip the content when the client supports it and set the necessary headers.
ob_start('ob_gzhandler');
readfile($path);
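A slightly fuller sketch of that approach (the stylesheet path and content type are assumed for illustration):
ob_start('ob_gzhandler'); // compresses buffered output and sets Content-Encoding when the client supports it
header('Content-Type: text/css'); // you still set the content type yourself
readfile('/path/to/style.css');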
Two things stand out:
1) You don't seem to be setting the Content-Length header to the size of the compressed data. (Maybe I've overlooked it.) If you don't set this, a browser might think you've finished sending data too early.
2) You are doing a substr() of the compressed $contents with the uncompressed $size. Some browsers will stop decompressing when the internal structure has an EOF marker, but other browsers (Opera?) may attempt to decompress the entire downloaded buffer. That would definitely give you a 'damaged data' error. You might not be seeing this problem with small buffers because the amount of overhead and the amount of compression might exactly match.
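To make point 2 concrete: the original code wrote a hand-rolled gzip header and then truncated the compressed data with substr(), whereas gzencode() emits the complete gzip stream (header, deflate data, CRC32 and length trailer) in one call, as the corrected version above now does:
$contents = gzencode($contents, 9);
header('Content-Length: ' . strlen($contents));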
