After update to 5.4, fopen can't read file - php

I have a website on a host that recently switched from PHP 5.2 to 5.4 and required us to choose a new php.ini file: 5.4 plain, 5.4 solo (just one php.ini file used throughout the site), or 5.4 fast.
I do not know which one I was using before the switch, but after switching (I chose 5.4 solo) I noticed that a part of my website that depends on mbstring (multibyte characters) no longer works.
Specifically, it opens a text file full of characters, which is then used by an encryption script that stores the garbled result in the MySQL database. To retrieve it, the data is run through the script again, decrypted, and displayed on the screen.
This worked just fine until the 5.4 change. Now it appears that it's unable to retrieve (open?) the text file. I have tested this with a non-multibyte character version and that works fine, so I don't think the issue is with the code, but rather with the way PHP is treating multibyte chars... and I suspect, just a hunch, that this is fixable by tweaking the php.ini file somehow. zend.multibyte seems to be PHP's new thing.
My problem is that I have no idea what to tweak. I tried several different zend.multibyte/mbstring combos and none of them worked.
I know that everything works up until a string is sent for encryption. It comes back as a null value instead of a garbled string. I feel like something in the string is being rejected by PHP and thus it's failing... offering nothing instead of the string it should.
Does anyone have a thought as to what might be happening and why my script no longer works with 5.4? I have checked and the mbstring module IS loaded, with default values in the php.ini.
Any suggestions would be great...I'm totally stumped. Even some additional reports or ways to test or narrow down the problem would be fantastic.
Thank you!
Here is some code, where I think the problem is:
$this->s1 = "";
$s1array = array("a1.txt", "a2.txt", "a3.txt");
foreach ($s1array as $i => $value) {
$myFile = "../a/dir/somewhere/$s1array[$i]";
$fh = fopen($myFile, 'r');
$theData = fgets($fh);
fclose($fh);
$this->s1 .= html_entity_decode($theData, ENT_NOQUOTES, 'UTF-8');
}
The files ../a/dir/somewhere/a1.txt and ../a/dir/somewhere/a2.txt (etc.) are semicolon-delimited strings of HTML-encoded letters, for example: & #x0fb0f;& #x02c97;& #x00436;& #x10833;& #x00514; (I added the spaces so it would show the code, not the HTML values!).
But I guess now, for some reason, this above code isn't returning any results. If I assign the result to a variable and echo that variable, there's nothing. But if I assign $this->s1 = "abcde"; or a longer string and skip the "foreach" part, it will work. So something in this process, this code, no longer works in 5.4. Can anyone tell what's going on here? Thank you!

Why use fopen and friends for text files when you could use file_put_contents and file_get_contents? They are mostly wrappers around fopen, fread and so on. I have never had any problems with UTF-8 using those two functions.
Also make sure everything (PHP, the database if you are using one, and the .php files themselves) is encoded in or using UTF-8. There is nothing funnier than *.php files in, for example, Latin-2 and all the rest in UTF-8.
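A minimal sketch of that suggestion, assuming the files sit in one directory. The helper name readEntityFiles is hypothetical. Note that file_get_contents reads the whole file, whereas the fgets call in the question reads only the first line. The explicit === false check is there so that an unreadable file raises an error instead of silently producing an empty string.

```php
<?php
// Hypothetical helper: read entity-encoded text files and decode them.
// Returns the decoded, concatenated contents; throws if a file cannot be read.
function readEntityFiles($dir, array $names)
{
    $result = '';
    foreach ($names as $name) {
        $path = rtrim($dir, '/') . '/' . $name;
        // file_get_contents returns false on failure, so compare with ===.
        $data = @file_get_contents($path);
        if ($data === false) {
            throw new RuntimeException("Cannot read $path");
        }
        // trim() drops the trailing newline before decoding the entities.
        $result .= html_entity_decode(trim($data), ENT_NOQUOTES, 'UTF-8');
    }
    return $result;
}
```

If this throws, the problem is the path or permissions rather than multibyte handling. If it returns an unexpected string, the decoding step is the place to look.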

Related

Function with special characters

I am creating a site where the authenticated user can write messages for the index site.
On the message create site I have a textbox where the user can give the title of the message, and a textbox where he can write the message.
The message will be exported to a .txt file, and I create the name of the .txt file from the title, like this:
Title: This is a message (The filename will be: thisisamessage.txt)
The original title text will be stored in a database record along with the .txt filename as the path.
For converting the title text I am using a function that looks like this:
function filenameconverter($title){
$filename=str_replace(" ","",$title);
$filename=str_replace("ű","u",$filename);
$filename=str_replace("á","a",$filename);
$filename=str_replace("ú","u",$filename);
$filename=str_replace("ö","o",$filename);
$filename=str_replace("ő","o",$filename);
$filename=str_replace("ó","o",$filename);
$filename=str_replace("é","e",$filename);
$filename=str_replace("ü","u",$filename);
$filename=str_replace("í","i",$filename);
$filename=str_replace("Ű","U",$filename);
$filename=str_replace("Á","A",$filename);
$filename=str_replace("Ú","U",$filename);
$filename=str_replace("Ö","O",$filename);
$filename=str_replace("Ő","O",$filename);
$filename=str_replace("Ó","O",$filename);
$filename=str_replace("É","E",$filename);
$filename=str_replace("Ü","U",$filename);
$filename=str_replace("Í","I",$filename);
return $filename;
}
It works fine most of the time, but sometimes it does not do its job.
For example: "Pamutkéztörlő adagoló és higiéniai kéztörlő adagoló".
It should be saved as a .txt file named:
pamutkeztorloadagoloeshigieniaikeztorloadagolo.txt, and most of the time it is.
But sometimes when I enter this title, the filename will be:
pamutkă©ztă¶rlĺ‘adagolăłă©shigiă©niaikă©ztă¶rlĺ‘adagolăł.txt
I'm Hungarian, so the title text will also be Hungarian; that's why I have to change the characters.
I'm using XAMPP with Apache and phpMyAdmin.
I would rather use a generated unique ID for each file as its filename and save the real name in a separate column.
This way you can avoid that someone overwrites files by simply uploading them several times. But if that is what you want you will find several approaches on cleaning filenames here on SO and one very good that I used is http://cubiq.org/the-perfect-php-clean-url-generator
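A rough sketch of the unique-ID idea, assuming the original title goes into one database column and the generated name into another. makeStorageName is a hypothetical helper, and uniqid() is not cryptographically secure, so treat the name as unique but not unguessable.

```php
<?php
// Hypothetical helper: build a storage filename that does not depend on the title.
function makeStorageName($extension = 'txt')
{
    // uniqid() with more_entropy varies between calls; hashing it yields
    // a fixed-length, ASCII-only name regardless of the title's characters.
    return md5(uniqid('', true)) . '.' . $extension;
}

$title = 'Pamutkéztörlő adagoló';   // stored unchanged in the database
$file  = makeStorageName();          // stored as the path: 32 hex characters plus ".txt"
```

Because the filename never contains user input, the encoding of the title no longer matters for the file system.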
intl
I don't think it is advisable to use str_replace manually for this purpose. You can use the bundled intl extension available as of PHP 5.3.0. Make sure the extension is turned on in your XAMPP settings.
Then use the transliterator_transliterate() function to transform the string. You can also convert it to lowercase in the same call. Credit goes to simonsimcity.
<?php
$input = 'Pamutkéztörlő adagoló és higiéniai kéztörlő adagoló';
$output = transliterator_transliterate('Any-Latin; Latin-ASCII; lower()', $input);
print(str_replace(' ', '', $output)); //pamutkeztorloadagoloeshigieniaikeztorloadagolo
?>
P.S. Unfortunately, the php manual on this function doesn't elaborate the available transliterator strings, but you can take a look at Artefacto's answer here.
iconv
Using iconv still returns some of the diacritics that are probably not expected.
print(iconv("UTF-8","ASCII//TRANSLIT",$input)); //Pamutk'ezt"orl"o adagol'o 'es higi'eniai k'ezt"orl"o adagol'o
mb_convert_encoding
Converting the encoding from ISO-8859-16 to ASCII or UTF-8 gives problems similar to the ones you mentioned.
print(mb_convert_encoding($input, "ASCII", "ISO-8859-16")); //Pamutk??zt??rl?? adagol?? ??s higi??niai k??zt??rl?? adagol??
print(mb_convert_encoding($input, "UTF-8", "ISO-8859-16")); //PamutkéztörlŠadagoló és higiéniai kéztörlŠadagoló
P.S. Similar question could also be found here and here.

PHP7 UTF-8 filenames on Windows server, new phenomenon caused by ZipArchive

Update:
Preparing a bug report to the great people that make PHP 7 possible I revised my research once more and tried to melt it down to a few simple lines of code. While doing this I found that PHP itself is not the cause of the problem. I will share my results here when I'm done. Just so you know and don't possibly waste your time or something :)
Synopsis: PHP7 now seems able to write UTF-8 filenames but is unable to access them?
Preamble: I read about 10-15 articles here touching the subject but they did not help me solve the problem and they all are older than the PHP7 release. It seems to me that this is probably a new issue and I wonder if it might be a bug. I spent a lot of time experimenting with en-/decoding of the strings and trying to figure out a way to make it work - to no avail.
Good day everybody and greetings from Germany (insert shy not-my-native-language-remark here), I hope you can help me out with this new phenomenon I encountered. It seems to be "new" in the sense that it came with PHP 7.
I think most people working with PHP on a Windows system are very familiar with the problem of filenames and the transparent wrapper of PHP that manages access to files that have non-ASCII filenames (or windows-1252 or whatever is the system code page).
I'm not quite sure how to approach the subject and as you can see I'm not very experienced in composing questions so please don't rip my head off instantly. And yes I will strive to keep it short. Here we go:
First symptom: after updating to PHP7 I sometimes encountered problems with accessing files generated by my software. Sometimes it worked as usual, sometimes not. I found out the difference was that PHP7 now seems able to write UTF-8 filenames but is unable to access files with those names.
After generating said files on two separate "identical" systems (differing only in the PHP version) this is how the files are named on the hard drive:
PHP 5.5: Lokaltest_KG_漢字_汉字_Krümhold-DEZ1604-140081-complete.zip
PHP 7: Lokaltest_KG_漢字_汉字_Krümhold-DEZ1604-140081-complete.zip
Splendid, PHP 7 is capable of writing unicode-filenames on the HDD, and UTF-16 is used on windows afaik. Now the downside is that when I try to access those files for example with is_file() PHP 5.5 works but PHP 7 does not.
Consider this code snippet (note: I "hacked" into this function because it was the simplest way, it was not written for this purpose). This function gets called after a zip-file gets generated taking on the name of the customer and other values to determine a proper name. Those come out of the database. Database and internal encoding of PHP are both UTF-8. clearstatcache is per se not necessary but I included it to make things clearer. Important: Everything that happens is done with PHP7, no other entity is responsible for creating the zip-file. To be precise it is done with class ZipArchive. Actually it does not even matter that it is a zip-archive, the point is that the filename and the content of the file are created by PHP7 - successfully.
public static function downloadFileAsStream( $file )
{
clearstatcache();
print $file . "<br/>";
var_dump(is_file($file));
die();
}
Output is:
D:/htdocs/otm/.data/_tmp/Lokaltest_KG_漢字_汉字_Krümhold-DEZ1604-140081-complete.zip
bool(false)
So PHP7 is able to generate the file - they indeed DO exist on the harddrive and are legit and accessible and all - but is incapable of accessing them. is_file is not the only function that fails, file_exists() does too for example.
A little experiment with encoding conversion to give you a taste of the things I tried:
public static function downloadFileAsStream( $file )
{
clearstatcache();
print $file . "<br/>";
print mb_detect_encoding($file, 'ASCII,UTF-16,windows-1252,UTF-8', false) . "<br/>";
print mb_detect_encoding($file, 'ASCII,UTF-16,windows-1252,UTF-8', true) . "<br/>";
if (($detectedEncoding = mb_detect_encoding($file, 'ASCII,UTF-16,windows-1252,UTF-8', true)) != 'windows-1252')
{
$file = mb_convert_encoding($file, 'UTF-16', $detectedEncoding);
}
print $file . "<br/>";
var_dump(is_file($file));
die();
}
Output is:
D:/htdocs/otm/.data/_tmp/Lokaltest_KG_漢字_汉字_Krümhold-DEZ1604-140081-complete.zip
UTF-8
UTF-8
D:/htdocs/otm/.data/_tmp/Lokaltest_KG_o"[W_lI[W_Kr�mhold-DEZ1604-140081-complete.zip
NULL
So converting from UTF-8 (database/internal encoding) to UTF-16 (windows file system) does not seem to work either.
I am at the end of my rope here and sadly the issue is very important to us since we cannot update our systems with this problem looming in the background. I hope somebody can shed a little light on this. Sorry for the long post, I'm not sure how well I could get my point across.
Addition:
$file = utf8_decode($file);
var_dump(is_file($file));
die();
This delivers false for the filename with the Japanese letters. When I change the input used to create the filename so that it is now Lokaltest_KG_Krümhold-DEZ1604-140081-complete.zip, the above code delivers true. So utf8_decode helps, but only with a small part of Unicode, such as German umlauts?
Answering my own question here: The actual bad boy was the component ZipArchive which created files with incorrectly encoded filenames. I have written a hopefully helpful bug report: https://bugs.php.net/bug.php?id=72200
Consider this short script:
print "php default_charset: ".ini_get('default_charset')."\n"; // just 4 info (UTF-8)
$filename = "bugtest_müller-lüdenscheid.zip"; // just an example
$filename = utf8_encode($filename); // simulating my database delivering utf8-string
$zip = new ZipArchive();
if( $zip->open($filename, ZipArchive::CREATE | ZipArchive::OVERWRITE) === true )
{
$zip->addFile('bugtest.php', 'bugtest.php'); // copy of script file itself
$zip->close();
}
var_dump( is_file($filename) ); // delivers ?
output:
output PHP 5.5.35:
php default_charset: UTF-8
bool(true)
output PHP 7.0.6:
php default_charset: UTF-8
bool(false)
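One way to narrow down this kind of mismatch is to compare the raw bytes of the name the script asked for with the names actually present in the directory. A minimal diagnostic sketch follows. dumpNameBytes is a hypothetical helper, not part of the bug report.

```php
<?php
// Hypothetical diagnostic: list the entries of a directory together with
// the hex dump of each name, as PHP sees them.
function dumpNameBytes($dir)
{
    $out = array();
    foreach (scandir($dir) as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $out[$entry] = bin2hex($entry);
    }
    return $out;
}
```

Printing bin2hex($filename) next to the output of dumpNameBytes(dirname($filename)) shows whether the bytes passed to is_file() match any name on disk.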

Is it possible to change the behavior of PHP's print_r function [duplicate]

I've been coding in PHP for a long time (15+ years now), and I usually do so on a Windows OS, though most of the time it's for execution on Linux servers. Over the years I've run up against an annoyance that, while not important, has proved to be a bit irritating, and I've gotten to the point where I want to see if I can address it somehow. Here's the problem:
When coding, I often find it useful to output the contents of an array to a text file so that I can view its contents. For example:
$fileArray = file('path/to/file');
$faString = print_r($fileArray, true);
$save = file_put_contents('fileArray.txt', $faString);
Now when I open the file fileArray.txt in Notepad, the contents of the file are all displayed on a single line, rather than the nice, pretty structure seen if the file were opened in Wordpad. This is because, regardless of OS, PHP's print_r function uses \n for newlines, rather than \r\n. I can certainly perform such a replacement myself by simply adding just one line of code, and therein lies the problem. That one, single line of extra code translates back through my years into literally hundreds of extra steps that should not be necessary. I'm a lazy coder, and this has become unacceptable.
Currently, on my dev machine, I've got a different sort of work-around in place (shown below), but this has its own set of problems, so I'd like to find a way to "coerce" PHP into putting in the "proper" newline characters without all that extra code. I doubt that this is likely to be possible, but I'll never find out if I never ask, so...
Anyway, my current work-around goes like this. I have, in my PHP include path, a file (print_w.php) which includes the following code:
<?php
function print_w($in, $saveToString = false) {
$out = print_r($in, true);
$out = str_replace("\n", "\r\n", $out);
switch ($saveToString) {
case true: return $out;
default: echo $out;
}
}
?>
I also have auto_prepend_file set to this same file in php.ini, so that it automatically includes it every time PHP executes a script on my dev machine. I then use the function print_w instead of print_r while testing my scripts. This works well, so long as when I upload a script to a remote server I make sure that all references to the function print_w are removed or commented out. If I miss one, I (of course) get a fatal error, which can prove more frustrating than the original problem, but I make it a point to carefully proofread my code prior to uploading, so it's not often an issue.
So after all that rambling, my question is, Is there a way to change the behavior of print_r (or similar PHP functions) to use Windows newlines, rather than Linux newlines on a Windows machine?
Thanks for your time.
OK, after further research, I've found a better work-around that suits my needs and eliminates the need to call a custom function instead of print_r. This new work-around goes like this:
I still have to have an included file (I've kept the same name so as not to have to mess with php.ini), and php.ini still has the auto_prepend_file setting in place, but the code in print_w.php changes a bit:
<?php
rename_function('print_r', 'print_rw');
function print_r($in, $saveToString = false) {
$out = print_rw($in, true);
$out = str_replace("\n", "\r\n", $out);
switch ($saveToString) {
case true: return $out;
default: echo $out;
}
}
?>
This effectively alters the behavior of the print_r function on my local machine, without my having to call custom functions, and having to make sure that all references to that custom function are neutralized. By using PHP's rename_function I was able to effectively rewrite how print_r behaves, making it possible to address my problem.
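One caveat: rename_function is not a core PHP function. As far as I know it comes from the APD PECL extension, so the approach above only works where that extension is loaded. A portable sketch that avoids it is shown here, assuming you are willing to keep calling print_w. The function_exists guard and the use of PHP_EOL are my additions, not part of the original work-around.

```php
<?php
// Hypothetical portable variant of print_w: defined only if it is missing,
// and using PHP_EOL so it emits \r\n on Windows and \n elsewhere.
if (!function_exists('print_w')) {
    function print_w($in, $saveToString = false)
    {
        // Match \r\n first so existing Windows line endings are not doubled.
        $out = preg_replace('/\r\n|\r|\n/', PHP_EOL, print_r($in, true));
        if ($saveToString) {
            return $out;
        }
        echo $out;
        return true;
    }
}
```

Shipping this small file with the project, instead of relying on auto_prepend_file, means a forgotten print_w call on the remote server no longer causes a fatal error.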

PHP's base64_decode returns wrong result working in IIS

I'm trying to show a jpg that was previously encoded in a WCF web service using:
<?php
require_once '../inc/config.php';
[...]
header("Content-Type: image/jpeg");
echo base64_decode($doc['BDATA']);
But I'm getting a
Can't display the image because it contains errors.
I've decoded the base64 string in this web app www.opinionatedgeek.com/dotnet/tools/base64decode/ and the result is right, but different from the one I'm getting with base64_decode, which is wrong.
Edit: I have two environments using the same code: Test and Production. It works fine in Test, but not in Production, so I suspect a configuration problem.
I'm working with PHP 5.5.9 in Microsoft IIS.
An example of a string that base64_decode isn't decoding well:
/9j/4AAQSkZJRgABAQEAeAB4AAD/2wBDAAIBAQIBAQICAgICAgICAwUDAwMDAwYEBAMFBwYHBwcGBwcICQsJCAgKCAcHCg0KCgsMDAwMBwkODw0MDgsMDAz/2wBDAQICAgMDAwYDAwYMCAcIDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAz/wAARCAABAAEDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD9/KKKKAP/2Q==
Any ideas?
Edit 2: If I comment this line
require_once '../inc/config.php';
and copy the code from config.php to my actual file, it works fine. What could be happening?
From the base64_decode manual comments:
php <= 5.0.5's base64_decode( $string ) will assume that a space is
meant to be a + sign where php >= 5.1.0's base64_decode( $string )
will no longer make that assumption
To fix this behavior try this code
$encodedData = str_replace(' ','+',$encodedData);
$decodedData = base64_decode($encodedData);
If this is not your case, then check this answer,
because everything works fine for me here (on WAMP).
EDIT:
As discussed in the conversation below:
There are a lot of things that may corrupt the output before the header. For example, if your file encoding is UTF-8, you should save it as UTF-8 without BOM (you can do this using Notepad++). Also, if you use FTP, make sure that your client didn't add any characters to your file. Other than that, everything should work fine.
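To check the BOM theory quickly, something like this could be run against the included file (config.php in the question). hasUtf8Bom is a hypothetical helper. It only tests for the three-byte UTF-8 mark EF BB BF.

```php
<?php
// Hypothetical check: does a file start with the UTF-8 byte order mark (EF BB BF)?
function hasUtf8Bom($path)
{
    // Read only the first three bytes of the file.
    $head = file_get_contents($path, false, null, 0, 3);
    return $head === "\xEF\xBB\xBF";
}
```

Calling headers_sent($file, $line) just before header() is another way to see whether some included file has already produced output, and where.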
base64 encoding is not completely standardised.
Some implementations use different characters, so you'll have to replace those characters before you run your decode.
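As an illustration of that point, the URL-safe variant of base64 uses - and _ in place of + and / and often drops the = padding. Here is a sketch of a decoder that accepts both forms. base64DecodeLenient is a hypothetical name, and whether the WCF service in the question emits this variant is an assumption to verify, not something the question states.

```php
<?php
// Hypothetical helper: accept both the standard and the URL-safe base64 alphabet.
function base64DecodeLenient($encoded)
{
    // Map the URL-safe characters back to the standard ones.
    $std = strtr($encoded, '-_', '+/');
    // Restore padding that URL-safe encoders often strip.
    $pad = strlen($std) % 4;
    if ($pad > 0) {
        $std .= str_repeat('=', 4 - $pad);
    }
    // Strict mode returns false instead of silently skipping bad characters.
    return base64_decode($std, true);
}
```

Using strict mode makes a corrupted input visible as false rather than as a broken image.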

Why might my PHP log file not entirely be text?

I'm trying to debug a plugin-bloated Wordpress installation; so I've added a very simple homebrew logger that records all the callbacks, which are basically listed in a single, ultimately 250+ row multidimensional array in Wordpress (I can't use print_r() because I need to catch them right before they are called).
My logger line is $logger->log("\t" . $callback . "\n");
The logger produces a dandy text file in normal situations, but at two points during this particular task it is adding something which causes my log file to no longer be encoded properly. Gedit (I'm on Ubuntu) won't open the file, claiming to not understand the encoding. In vim, the culprit corrupt callback (which I could not find in the debugger, looking at the array) is about in the middle and printed as ^#lambda_546 and at the end of file there's this cute guy ^M. The ^M and ^# are blue in my vim, which has no color theme set for .txt files. I don't know what it means.
I tried adding an is_string($callback) condition, but I get the same results.
Any ideas?
^# is a NUL character (\0) and ^M is a CR (\r). No idea why they're being generated though. You'd have to muck through the source and database to find out. geany should be able to open the file easily enough though.
Seems these cute guys are a result of your callback formatting for windows.
Mystery over. One of the callbacks was an anonymous function. Investigating the PHP create_function documentation, I saw that a commenter had noted that the created function has a name like so: chr(0) . lambda_n. Thanks PHP.
As for the \r: well, that is more embarrassing. My logger reused some older code that I had previously written, which did end lines in \r\n.
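To keep such names from breaking the log's encoding again, the callback name could be made printable before it is written. A small sketch follows. printableName is a hypothetical helper, not part of the logger described above.

```php
<?php
// Hypothetical helper: make a callback name safe for a plain-text log line.
function printableName($name)
{
    // Replace ASCII control characters (including NUL and CR) with a visible \xNN escape.
    return preg_replace_callback('/[\x00-\x1F\x7F]/', function ($m) {
        return sprintf('\\x%02X', ord($m[0]));
    }, $name);
}
```

With this, a create_function name would show up in the log as \x00lambda_546 instead of an invisible NUL byte followed by lambda_546.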
