Encoding bug with an image and file_get_contents in PHP - php

I use this code to retrieve and display an image:
header("Content-type: image/png");
echo file_get_contents(site_domain().image_asset_module_url("header.png",$this->name));
On my local WAMP it works, but on the remote server file_get_contents seems to return a wrongly encoded string:
Local:
‰PNG IHDR^jRÀ2¡ pHYsÒÝ~üÿÿIDATxÚ콘Uõµþ¿`ŠŠÔéÃÕ¨¹&&ù'77¹i¦˜è‰=V:RlH‡™aAlH™B¯Jbh...
Remote:
�PNG IHDR^jR�2� pHYs��~���IDATx����U����`������ը�&&�'77�i��草=V:Rl...
If I use utf8_encode I get:
PNG IHDR^jRÀ2¡ pHYsÒÝ~üÿÿIDATxÚì½Uõµþ¿`ÔéÃÕ¨¹&&ù'77¹i¦è=V:RlHaAlHB¯Jbh...
So I always get a broken picture on my remote server. Why, and what is the solution?

The data is always the same. file_get_contents does not alter data in any way. You're also not dealing with text in some encoding, but with binary data. Any sort of text-encoding or conversion thereof does not apply here.
Your first sample is the binary image data as interpreted as Latin-1 encoded text.
Your second sample is the same binary data as interpreted as UTF-8 encoded text.
In other words, the data is fine; the interpretation is wrong. The interpretation is governed by the Content-Type header, which is probably not being set correctly on the remote server. Inspect the raw HTTP response headers, and see How to fix "Headers already sent" error in PHP.
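A common culprit is stray output (leading whitespace or a UTF-8 BOM) emitted before header() runs, which keeps the Content-Type from being set. A minimal sketch of such a check, reusing the asset helpers from the question:
<?php
// Nothing may be output before this point, or header() is ignored and PHP logs a warning.
if (headers_sent($file, $line)) {
    error_log("Output already started in $file on line $line");
}
header('Content-Type: image/png');

$data = file_get_contents(site_domain() . image_asset_module_url("header.png", $this->name));
if ($data === false) {
    http_response_code(404); // the remote asset could not be read
    exit;
}
echo $data;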

I would rather have used:
<?php
$file = 'http://url/to_image.png';
$data = file_get_contents($file);
header('Content-type: image/png');
echo $data;
Or you can try this:
$remoteImage = "http://www.example.com/gifs/logo.gif";
$imginfo = getimagesize($remoteImage);
header("Content-type: $imginfo['mime']");
readfile($remoteImage);
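A slightly more defensive sketch of the same idea; getimagesize() returns false if the URL is unreachable or not an image, so check it before emitting the header (the URL is the same placeholder as above):
$remoteImage = "http://www.example.com/gifs/logo.gif";

$imginfo = @getimagesize($remoteImage);   // false if unreachable or not an image
if ($imginfo === false) {
    http_response_code(404);
    exit('Image not found');
}

header("Content-type: " . $imginfo['mime']);
readfile($remoteImage);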

Related

PHP read a line from a csv file return wrong in charset

I have a CSV file. If I set the character set to ISO-8859-2 (Eastern European) in LibreOffice Calc, it renders the characters correctly, but the server's locale is set to EN-UK and
I cannot read the characters correctly. For example:
it returns T�t instead of Tót.
I tried many things like:
echo (mb_detect_encoding("T�t","ISO-8859-2","UTF-8"));
I know the character probably does not exist in UTF-8, but I tried anyway.
Also tried to setup the correct charset in the header:
header('Content-Type: text/html; charset=iso-8859-2');
echo "T�th";
but it returns TÄĹźËth instead of Tóth.
Please help me solve this, thanks in advance
I advise against setting the header to charset=iso-8859-2. It is usual to work with UTF-8: if the data arrives in a different encoding, convert it to UTF-8 first and then process it as CSV. The following example can stay this simple because the newline characters are the same in UTF-8 and ISO-8859-2.
$fileName = "yourpath/Iso8859_2.csv";
$fp = fopen($fileName,"r");
while($row = fgets($fp)){
$strUtf8 = mb_convert_encoding($row,'UTF-8','ISO-8859-2');
$arr = str_getcsv($strUtf8);
var_dump($arr);
}
fclose($fp);
The exact encoding of the CSV file must be known. mb_detect_encoding is not suitable for determining the encoding of a file.
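If you only want to make sure a line is not already UTF-8 before converting, mb_check_encoding() can validate a string against one specific encoding (unlike mb_detect_encoding(), it does not guess). A minimal sketch along the lines of the loop above:
$fp = fopen("yourpath/Iso8859_2.csv", "r");
while ($row = fgets($fp)) {
    if (mb_check_encoding($row, 'UTF-8')) {
        $strUtf8 = $row;                                             // already valid UTF-8
    } else {
        $strUtf8 = mb_convert_encoding($row, 'UTF-8', 'ISO-8859-2'); // otherwise assume ISO-8859-2
    }
    var_dump(str_getcsv($strUtf8));
}
fclose($fp);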

Remove 'E' output headers from base64 string in TCPDF

I'm using TCPDF using
$base64String = $pdf->Output('file.pdf', 'E');
So I can send the data via AJAX
The only problem is that it comes with header information in addition to the Base64 string
Content-Type: application/pdf;
name="FILE-31154d59f28c63efae86e4f3d6a00e13.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="FILE-31154d59f28c63efae86e4f3d6a00e13.pdf"
So if I pass the string that is created to base64_decode(), or use it with PHPMailer in my case, it errors. Is it possible to remove the headers so I only have the base64 string?
(The error is that the pdf can't be read by any PDF reader when opened)
I thought I'd be able to find something that solves this but I haven't found anything!!
UPDATE
This is what I've put in place to solve the issue
$base64String = preg_replace('/Content-[\s\S]+?;/', '', $base64String);
$base64String = preg_replace('/name=[\s\S]+?pdf"/', '', $base64String);
$base64String = preg_replace('/filename=[\s\S]+?"/', '', $base64String);
However it's not very elegant! So if anyone has a better solution please post it below :)
TCPDF docs are huge but unusable – it's easier to read the source code directly. It has those extra headers because you're asking for them by using the E output mode, which is intended for generating email messages.
For sending the PDF data as a PHPMailer attachment, you want the straight binary PDF data as a string, as provided by the S output mode, which you can pass straight into addStringAttachment(), and PHPMailer will handle all the encoding for you. All you have to do is this:
$mail->addStringAttachment($pdf->Output('file.pdf', 'S'), 'file.pdf');
To convert the PDF binary into base64, for example to use it in a JSON string, simply pass it through base64_encode:
$base64String = base64_encode($pdf->Output('file.pdf', 'S'));
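For the AJAX part of the question, a minimal sketch of an endpoint that returns the base64 string as JSON; the response shape is an assumption, only Output('file.pdf', 'S') is TCPDF's API:
<?php
// $pdf is an already populated TCPDF instance.
$base64String = base64_encode($pdf->Output('file.pdf', 'S'));

header('Content-Type: application/json');
echo json_encode([
    'filename' => 'file.pdf',
    'pdf'      => $base64String,
]);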

Base64Decode to file - whats missing?

I have a base64 encoded string in a $_POST field, $_POST['nimage']. If I echo it directly as the src value of an img tag, I see the image just fine in the browser: echo "<img src='".$_POST['nimage']."'>";
Now, I'm obviously missing a step, because when I base64_decode the string and write it to a file locally on the server, an attempt to view the created file in browser states error:
"The image 'xxxx://myserversomewhere.com/images/img1.jpg' cannot be displayed because it contains errors"
My decode and file put are:
$file = base64_decode($_POST['nimage']);
file_put_contents('images/'. $_POST['imgname'], $file);
which results in images/img1.jpg on the local server. What am I doing wrong in the decode here? Although the base64 output doesn't appear to be URL-encoded, I have tried urldecode() on it before base64_decode() just for good measure, with the same results.
First few lines of the base64 encode is:
data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAMCAgICAgMCAgIDAwMDBAYEBAQEBAgGBgUGCQgKCgkICQkKDA8MCgsOCwkJDRENDg8QEBEQCgwSExIQEw8QEBD/2wBDAQMDAwQDBAgEBAgQCwkLEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBD/wAARCAF4AqsDAREAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD2gJt+XPJPUGv2A/NB2044oAdtY9M8ccCgB6r8+0jtSYDxEW4xz2qQFCnGOPQ0AAQDJIz9KAF8rI6/hQA9Y+SBgjHIqWA5Yxz2xUsBwUdAMdzSAcFGAB0NADgCVK/KB/OgB6BNzc49agse2OgX2BFZvcCRUO7g
The data you're decoding has a data URI header attached:
data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD...
The header is used by the browser to identify the file type and encoding, but it isn't part of the encoded data.
Strip the header (data:image/jpeg;base64,) from the data and base64 decode the rest before writing it to a file, and you should be good to go.
$b64 = 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD...';
$dat = explode(',', $b64);
// Element 1 of the array from explode() contains the base64-encoded data.
if (($fileData = base64_decode($dat[1])) === false) {
    exit('Base64 decoding error.');
}
file_put_contents($someFileName, $fileData);
NB: check the return value of your call to base64_decode() for false and abort with a message if it fails. That will trap any problems with the decoding process (like not removing the header!).
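base64_decode() also accepts a second $strict parameter; when it is true, any character outside the base64 alphabet makes the call return false instead of being silently skipped, which catches a leftover data URI header even more reliably:
if (($fileData = base64_decode($dat[1], true)) === false) {  // strict mode
    exit('Base64 decoding error.');
}
file_put_contents($someFileName, $fileData);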

ANSI <--> UTF-8 charset issue, when displaying images from the database (PHP)

I am loading images from a MySQL database, to display them in an Web GUI.
This is pretty standard and worked pretty well, till I tried to install the software in russia...
Here a example of the code that loads the image:
// Load overview image
if ($global_mode == 'overview') {
    // Load the image from the database.
    mysql_select_db("$db_x");
    $sql = "SELECT $db_x.sensor_images.image
            FROM $db_x.sensor_images
            WHERE $db_x.sensor_images.image_id = '" . $global_image_id . "'";
    $sql = mysql_query($sql);
    $row = mysql_fetch_assoc($sql);
    // Image output.
    header('Content-type: image/jpeg');
    echo $row['image'];
}
I installed the software on many European laptops and never had the problem that images were not displayed...
On Russian laptops (Windows 7, XAMPP, MySQL), however, images were not displayed.
I did some research and found out (on my own laptop, where images are displayed) that if I change the encoding of the PHP file (in this case show_image.php), I can replicate the error I had on the Russian laptops.
If the encoding is set to ANSI, the images get displayed...
Here I have disabled the header, so the browser displays the binary data (the encoding of the PHP file is set to ANSI)...
EXAMPLE A
Now I set the Encoding of the PHP file to UTF-8
By doing this images do not get displayed any more...
This is the output when I try to display the data without the header...
EXAMPLE B
As you can see, the output is different...
On my laptop (european):
ANSI: images get displayed, the data (without header) looks like EXAMPLE A
UTF-8: images get not displayed, the data (without header) looks like EXAMPLE B
On russian laptops:
ANSI: images get not displayed, the data (without header) looks like EXAMPLE B
UTF-8: images get not displayed, the data (without header) looks like EXAMPLE B
I still don't understand why changing the encoding of a PHP file has an impact on the output of binary data, i.e. an image...
On the Russian laptops the PHP files are always interpreted as if the encoding were set to UTF-8, no matter whether I set it to ANSI or something else...
Please help!
Thx.
In your IDE you see "UTF-8" and "UTF-8 without BOM". You're choosing UTF-8, which in this case means with BOM. The BOM is prepended to the file and is the first thing that's output. This may a) break the output of your header, thereby breaking the data display, and b) give the browser a clue that the following data is supposedly UTF-8 encoded, hence the browser interprets the data as UTF-8, which results in a lot of UNICODE REPLACEMENT CHARACTERS �. Check your error logs; you should see PHP complain about "Headers already sent".
The data you're sending is always the same, it's just interpreted in different encodings depending on the machine's default and the presence or absence of a UTF-8 BOM.
The only reason it breaks at all under any circumstances is that you're outputting the wrong headers and/or are sending additional content before or after the image data. Check with a low-level tool like curl what exactly is output, and find and remove anything that doesn't belong.
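If curl is not at hand, PHP itself can be used to inspect what the image script actually sends; a small sketch, with the URL as a placeholder:
<?php
$url = 'http://localhost/show_image.php?image_id=1'; // placeholder URL

// Response headers only: the Content-Type should be image/jpeg.
$headers = get_headers($url, true);
var_dump($headers['Content-Type']);

// Body: check for a stray UTF-8 BOM in front of the JPEG data.
$body = file_get_contents($url);
var_dump(substr($body, 0, 3) === "\xEF\xBB\xBF");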
Save the PHP file that sends out the picture as "UTF-8 w/o BOM". If any files are included, they must also be saved as either "ANSI" or "UTF-8 w/o BOM". There must also be no space or other text before the <?php tag.
If you want to send any text, e.g. an error message because the picture file does not exist, you need to send header("Content-Type: text/html; charset=utf-8"); right before the text in order to display all characters correctly - but not in combination with the image:
<?php
include "/someSafeDir/utils.inc.php";

$pic       = secureGet($_GET['pic']);   // sanitised by the included helper
$imagePath = "/somePath/";
$picture   = $imagePath . $pic;

if (file_exists($picture)) {
    if ($fd = fopen($picture, "r")) {
        $fsize = filesize($picture);
        header("Content-Type: image/jpeg");
        header("Content-Length: $fsize");
        readfile($picture);
        fclose($fd);
    } else {
        echo "File \"$pic\" could not be opened.\n";
    }
} else {
    header("Content-Type: text/html; charset=utf-8");
    echo "File \"$pic\" not existent";
}
?>
It could be that your PHP server charset is not the correct one.
In your php.ini file, try having the following directive:
default_charset = "UTF-8"
And restart your server.
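To confirm which default is actually in effect after the restart, you can read the setting back at runtime:
<?php
// Prints the charset PHP appends to Content-Type headers by default.
var_dump(ini_get('default_charset'));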

fwrite() and UTF8

I am creating a file using PHP's fwrite() and I know all my data is in UTF-8 (I have done extensive testing on this: when saving the data to the DB and outputting it on a normal web page, everything works fine and reports as UTF-8), but I am being told the file I am outputting contains non-UTF-8 data :( Is there a command in bash (CentOS) to check the format of a file?
When using vim it shows the content as:
Donâ~#~Yt do anything .... Itâ~#~Ys a
great site with
everything....Weâ~#~Yve only just
launched/
Any help would be appreciated: either confirming the file is UTF-8, or explaining how to write UTF-8 content to a file.
UPDATE
To clarify how I know I have data in UTF-8, I have done the following:
The DB is set to utf8, and when saving data to the database I run this first:
$enc = mb_detect_encoding($data);
$data = mb_convert_encoding($data, "UTF-8", $enc);
Just before I run fwrite I have checked the data with the snippet below. Note each piece of data returns 'IS utf-8':
if (strlen($data)==mb_strlen($data, 'UTF-8')) print 'NOT UTF-8';
else print 'IS utf-8';
Thanks!
If you know the data is in UTF-8, then you want to set up a header (a BOM) in the file.
I wrote a solution answering another thread.
The solution is the following: as the UTF-8 byte-order mark is \xEF\xBB\xBF, we should add it at the start of the document.
<?php
function writeStringToFile($file, $string){
    $f = fopen($file, "wb");
    $string = "\xEF\xBB\xBF" . $string; // prepending the BOM is what makes the magic
    fputs($f, $string);
    fclose($f);
}
?>
You can adapt it to your code, basically you just want to make sure that you write a UTF8 file (as you said you know your content is UTF8 encoded).
fwrite() is not binary safe. That means that your data - be it correctly encoded or not - might get mangled by this command or its underlying routines.
To be on the safe side, you should use fopen() with the binary mode flag, which is b. Afterwards, fwrite() will save your string data "as-is", and in PHP that is binary data, because strings in PHP are binary strings.
Background: some systems distinguish between text and binary data. The binary flag explicitly tells PHP on such systems to use binary output. When you deal with UTF-8 you should take care that the data does not get mangled, and that is prevented by handling the string data as binary data.
However: if, contrary to what you said in your question, the UTF-8 encoding of the data is not preserved, then your encoding got broken and even binary-safe handling will keep the broken status. With the binary flag, though, you at least ensure that it is not the fwrite() part of your application that is breaking things.
It has rightly been written in another answer here that you do not know the encoding if you only have the data. However, you can check whether the data validates as UTF-8, which gives you at least some chance to verify the encoding. I posted a PHP function which does this in a UTF-8 related question, so it might be of use if you need to debug things: Answer to: SimpleXML and Chinese; look for can_be_valid_utf8_statemachine, that's the name of the function.
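If you don't want to pull in a custom state machine, mb_check_encoding() is a built-in way to validate a byte string against UTF-8; a minimal sketch:
$data = "Don’t do anything"; // example string from this question
if (mb_check_encoding($data, 'UTF-8')) {
    echo "valid UTF-8\n";
} else {
    echo "not valid UTF-8\n";
}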
//add BOM to fix UTF-8 in Excel
fputs($fp, $bom = chr(0xEF) . chr(0xBB) . chr(0xBF));
I find this piece works for me :)
The problem is your data is double encoded. I assume your original text is something like:
Don’t do anything
with ’, i.e., not the straight apostrophe, but the right single quotation mark.
If you write a PHP script with this content and encoded in UTF-8:
<?php
//File in UTF-8
echo utf8_encode("Don’t"); //this will double encode
You will get something similar to your output.
$handle = fopen($file, "w");
fwrite($handle, pack("CCC", 0xEF, 0xBB, 0xBF)); // write the UTF-8 BOM first
fwrite($handle, $data);                         // then write the UTF-8 string itself
fclose($handle);
"I know all my data is in UTF8" - wrong. The encoding is not a property stored in the file itself. So check the charset in the headers of the page you are taking the data from:
header("Content-type: text/html; charset=utf-8;");
And check whether the data really contains multi-byte characters:
if (strlen($data)==mb_strlen($data, 'UTF-8')) print 'not UTF-8';
else print 'utf-8';
One possible reason: the data you get from the database is not UTF-8 in the first place.
If you are sure it is, use this; I always use it and it works:
$file= fopen('../logs/logs.txt','a');
fwrite($file,PHP_EOL."_____________________output_____________________".PHP_EOL);
fwrite($file,print_r($value,true));
The only thing I had to do was add a UTF-8 BOM to the CSV. The data was correct, but the file reader (an external application) couldn't read the file properly without the BOM.
Try this simple method: add the following to the top of the page, before the <body> tag:
<head>
<meta charset="utf-8">
</head>
