PHP 5.4 charset for Exception

In PHP 5.4 my code doesn't work properly. I use a Cyrillic charset. In short:
throw new Exception('Сообщение');
will output:
Fatal error: in test.php ...
although the expected result is:
Fatal error: Uncaught exception 'Exception' with message ...
If I don't use Cyrillic characters, the result is OK. Moreover, if I run this code in 5.3, I get the proper result. In other words, only when I use Cyrillic in 5.4 is the resulting message an empty string.

There are reported issues with non-UTF-8 characters in exceptions. Try converting the message to UTF-8 like so:
throw new Exception(utf8_encode('Сообщение'));
Note that utf8_encode() assumes ISO-8859-1 input, so it will not map Cyrillic text correctly. If that does not work, convert from the actual source encoding (Windows-1251 is assumed here):
$message = 'Сообщение';
$message = mb_convert_encoding($message, 'UTF-8', 'Windows-1251');
throw new Exception($message);
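A minimal sketch of the conversion idea, assuming the string literal reaches PHP as Windows-1251 bytes. The helper name toUtf8() and its default source encoding are illustrative assumptions, not part of the original answer. The byte string simulates a Windows-1251 literal so the example behaves the same however this file is saved, and it needs the mbstring extension.

```php
<?php
// "Сообщение" as Windows-1251 bytes (simulates a literal in a Windows-1251 source file).
$legacy = "\xD1\xEE\xEE\xE1\xF9\xE5\xED\xE8\xE5";

// Hypothetical helper: converts only when the input is not already valid UTF-8.
function toUtf8(string $text, string $assumedSource = 'Windows-1251'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already UTF-8, leave untouched
    }
    // mb_convert_encoding(string, to_encoding, from_encoding)
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

try {
    throw new Exception(toUtf8($legacy));
} catch (Exception $e) {
    $message = $e->getMessage();
}

echo $message, PHP_EOL;
```

Checking first with mb_check_encoding() avoids double-encoding a message that is already UTF-8.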
-- EDIT --
The actual problem is not that the exception message is not stored, but that the exception is not displayed properly. xdebug was apparently not enabled in your PHP 5.3 setup, whereas in your PHP 5.4 setup it is. xdebug displays everything as UTF-8, and your message is probably encoded in some other charset, so the message is not rendered correctly.
If you scroll to the bottom of this page, you will find a single comment referring to this problem.
The PHP project tracked this issue here.
This Stack Overflow thread is also related to the same issue.
You might be able to get away with setting the xdebug encoding to a non-UTF-8 charset. Please read the xdebug manual regarding this.

Related

PHP-Firebird (ibase_connect = CHARACTER SET iso-8859-1 is not defined)

I'm new to Firebird. I'm trying to connect PHP to Firebird. This is the code:
$host='192.168.12.1:D:/DB/ALFABETA.FDB';
$username='john.doe';
$password='123456789';
$database='ALFABETA';
$dbh=ibase_connect($host,$username,$password) or die (ibase_errmsg());
$sth= ibase_query($dbh) or die (ibase_errmsg());
But after I run the code in the browser, the following warning comes up. What should I do?
Warning Statement
Warning: ibase_connect(): bad parameters on attach or create database
CHARACTER SET iso-8859-1 is not defined in
/var/www/fortrainingcrud/connect_db.php on line 7 bad parameters on
attach or create database CHARACTER SET iso-8859-1 is not defined
I finally got the answer! Here it is:
In php.ini, I added extension=php_interbase.dll
I restarted the Apache service
The code I had written before now works well
You aren't setting the connection charset yet you get an error about it. That suggests that PHP is taking the value from somewhere else and the first candidate is the ibase.default_charset directive. You can see its current value with var_dump(ini_get('ibase.default_charset')); or simply by running phpinfo().
You can either change the directive yourself or, even better, specify the encoding at ibase_connect() so your code doesn't break randomly depending on server configuration.
As for iso-8859-1, it seems the appropriate syntax for Firebird is ISO8859_1 (assuming you really want that encoding).
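As a rough sketch of passing the charset explicitly, the fourth parameter of ibase_connect() takes the character set name. The helper names firebirdCharset() and connectFirebird() and the small name mapping are hypothetical, added only for illustration. The connection function is defined but not called here, since it needs the InterBase/Firebird extension and a reachable server.

```php
<?php
// Hypothetical helper: maps a few IANA-style names to Firebird character set names.
function firebirdCharset(string $name): string
{
    $map = [
        'iso-8859-1'   => 'ISO8859_1',
        'utf-8'        => 'UTF8',
        'windows-1251' => 'WIN1251',
        'windows-1252' => 'WIN1252',
    ];
    // Unknown names are passed through unchanged.
    return $map[strtolower($name)] ?? $name;
}

// Connects with an explicit charset instead of relying on ibase.default_charset.
function connectFirebird(string $database, string $user, string $password, string $charset)
{
    $dbh = ibase_connect($database, $user, $password, firebirdCharset($charset));
    if ($dbh === false) {
        throw new RuntimeException(ibase_errmsg() ?: 'Firebird connection failed');
    }
    return $dbh;
}

$charset = firebirdCharset('iso-8859-1');
echo $charset, PHP_EOL;
```

With the values from the question, the call would look like connectFirebird('192.168.12.1:D:/DB/ALFABETA.FDB', 'john.doe', '123456789', 'iso-8859-1').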

UTF-8, PHP, Win7 - Is there a solution now to save UTF-8-filenames on Win 7 using php?

Update: To save you reading through everything: PHP, starting with 7.1.0alpha2, supports UTF-8 filenames on Windows. (Thanks to Anatol-Belski!)
Following some link chains on stackoverflow I found part of the answer:
https://stackoverflow.com/a/10138133/3716796 by Umberto Salsi
(and on the same question: https://stackoverflow.com/a/2950046/3716796 by Artefacto)
In short: 'PHP communicate[s] with the underlying file system as a "non-Unicode aware program"', and because of that all filenames given to PHP by Windows and vice versa are automatically translated/reencoded by Windows. This causes the errors. And you seemingly can't stop the automatic reencoding.
(And https://stackoverflow.com/a/2888039/3716796 by Artefacto: "PHP does not use the wide WIN32 API calls, so you're limited by the codepage.")
And at https://bugs.php.net/bug.php?id=47096 there is the bug report for PHP.
Though there, nicolas suggests that a COM object might work: $fs = new COM('Scripting.FileSystemObject', null, CP_UTF8);
Maybe I will try that sometime.
So this part of my question remains: Is PHP 6 out, or was it withdrawn, or is there anything new in PHP on that topic?
// full Question
Most questions about this topic are 1 to 5 years old.
Could php now save a file using
file_put_contents($dir . '/' . $_POST['fileName'], $_POST['content']);
when the $_POST['fileName'] is UTF-8 encoded, for example "Крым.xml" ?
Currently it is saved as
Крым.xml
I checked the fileName variable, so I can be sure it's UTF-8:
echo mb_detect_encoding($_POST['fileName']);
Is there now anything new in PHP that could accomplish it?
In some places I read that PHP 6 would be able to do it, but PHP 6, if I remember right, has been withdrawn?
In Windows Explorer I can change the name of a file to "Крым.xml". As far as I have understood the old questions and answers, it should be possible to use file_put_contents if the fileName variable is simply converted to the encoding used by Windows 7 and its NTFS disk.
There are even 3 old questions with answers that claim to have succeeded: PHP File Handling with UTF-8 Special Characters
Convert UTF-16LE to UTF-8 in php
and PHP: How to create unicode filenames
Overall, the most upvoted answers say it is not possible.
I checked all suggested answers already myself, and none works.
How can I find out, definitively and accurately, which encoding my Win 7 and Explorer use to save the filename on my NTFS disk with the German language setting?
As said: I can create a file "Крым.xml" in the Explorer.
My conclusion:
1. Either file_put_contents doesn't work correctly when handing over the fileName (which I tried with conversions to UTF-16, UTF-16LE, ISO-8859-1 and Windows-1252) to Windows,
2. or file_put_contents just doesn't implement a way to call Windows' own file function in the appropriate way (so this second possibility would mean it's not a bug but just not implemented.) (For example notepad++ has no problems creating, writing and renaming a file called Крым.xml.)
Just one example of the error messages I got, in this case when I used
mb_convert_encoding($theFilename , 'Windows-1252' , 'UTF-8')
"Warning: file_put_contents(dirToSaveIn/????.xml): failed to open stream: No error in C:\aa xampp\htdocs\myinterface.lo\myinterface\phpWriteLocalSearchResponseXML.php on line 26 "
With other conversions I got other error messages, ranging from 'invalid characters' to no string being recognized at all.
Greetings
John
PHP starting with 7.1.0alpha2 supports UTF-8 filenames on Windows.
Thanks.
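A small sketch of what that looks like in practice: a file named "Крым.xml" written with file_put_contents() and then found again in the directory listing. On Windows this is expected to work only from PHP 7.1 on, as stated above; on Linux and macOS the UTF-8 bytes are simply passed through. The temporary directory and variable names are illustrative assumptions.

```php
<?php
// "Крым.xml" built from Unicode escapes, so the bytes are UTF-8 however this file is saved.
$fileName = "\u{041A}\u{0440}\u{044B}\u{043C}.xml";

// Work in a fresh temporary directory so the listing check is unambiguous.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'utf8name_' . uniqid();
mkdir($dir);

// Write the file under its UTF-8 name; returns the number of bytes written.
$bytes = file_put_contents($dir . DIRECTORY_SEPARATOR . $fileName, '<root/>');

// The same UTF-8 name should come back from the directory listing.
$listed = in_array($fileName, scandir($dir), true);

// Clean up the temporary file and directory.
unlink($dir . DIRECTORY_SEPARATOR . $fileName);
rmdir($dir);

var_dump($bytes, $listed);
```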

readOuterXml(), Input is not proper UTF-8, indicate encoding

I'm using XMLReader to parse a large XML file from a third party, file size is 1GB+. The XML file specifies the encoding as UTF8 (<?xml version="1.0" encoding="utf-8" ?>), although it isn't.
XMLReader throws an error because of the unknown encoding type, but not until it's already processed most of the file.
Exception message:
Input is not proper UTF-8, indicate encoding
I have determined that the real encoding of the file is ISO-8859-1, and it will work fine if I manually specify this when calling $reader->open().
The problem is that my script needs to parse unknown files from the database, so it needs to rely on the encoding type specified within the file. I need to find a way to parse any file regardless of its encoding, are there any suggestions for doing this?
I figured out that vim is pretty good at converting from one encoding to another.
My trick is to parse the file normally, and when the encoding error is encountered just re-encode the file with vim and start parsing again.
Here's the rough idea:
$xmlFile = '/path/to/file.xml';
// Parse the file in a loop
while(...)
{
    try
    {
        // Normal parsing logic...
        $reader->readOuterXml();
        //...
    }
    catch(Exception $ex)
    {
        $encoding = getXMLEncoding($xmlFile) ?: 'utf-8';
        exec(sprintf(VIM_PATH . ' -c "set fileencoding=%s" -c "wq" "%s"', $encoding, $xmlFile));
        // File has been re-encoded
        // The real encoding should now match the declared encoding
        // -> Go back to the beginning and parse the file again
    }
}
Using this method might garble 1 or 2 chars, but it's way better than completely failed parsing. Ideally the 3rd party would mark their files correctly.
My system is Windows, so the vim arguments might be different on Linux (don't know).
Use simplexml_load_file to parse XML. In order to avoid encoding problems, use utf8_encode on data.
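An alternative to re-encoding the file is to check up front whether it really is valid UTF-8 and, if not, pass a fallback encoding to XMLReader::open(), which the question reports as working for ISO-8859-1. This is only a sketch: guessXmlEncoding() and openReader() are hypothetical names, the ISO-8859-1 fallback is an assumption, and it only covers files that claim to be UTF-8. The file is streamed line by line, so a 1GB+ file is not loaded into memory.

```php
<?php
// Hypothetical helper: returns 'UTF-8' if every line is valid UTF-8, else the fallback.
// A newline byte never occurs inside a multi-byte UTF-8 sequence, so checking per line is safe.
function guessXmlEncoding(string $path, string $fallback = 'ISO-8859-1'): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            if (!mb_check_encoding($line, 'UTF-8')) {
                return $fallback;
            }
        }
    } finally {
        fclose($handle);
    }
    return 'UTF-8';
}

// Opens the reader with the guessed encoding as an explicit override.
function openReader(string $path): XMLReader
{
    $reader = new XMLReader();
    if (!$reader->open($path, guessXmlEncoding($path))) {
        throw new RuntimeException("Cannot open $path as XML");
    }
    return $reader;
}

// Demo: one file with a raw ISO-8859-1 byte, one with proper UTF-8.
$header = '<?xml version="1.0" encoding="utf-8" ?>' . "\n";
$latinFile = tempnam(sys_get_temp_dir(), 'xml');
$utf8File = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($latinFile, $header . "<a>caf\xE9</a>\n");
file_put_contents($utf8File, $header . "<a>caf\xC3\xA9</a>\n");

$latinGuess = guessXmlEncoding($latinFile);
$utf8Guess = guessXmlEncoding($utf8File);

unlink($latinFile);
unlink($utf8File);

echo $latinGuess, ' ', $utf8Guess, PHP_EOL;
```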

GBP £ symbol in ASCII php file being converted to £ on live server (transferring with git)

I have a piece of PHP code, which was written in notepad++ on a Windows 7 machine
The Encoding in notepad++ is set to "Encode to ANSI" (ASCII)
I am then doing this in my code:
utf8_encode("£")
so I am sure to get the utf friendly version of the £ symbol.
All works perfectly fine on the local server.
But when I push it up to my live server I'm getting all sorts of issues with utf8 encoding errors in php.
Is something in the git push/pull process corrupting this, or is it perhaps a locale setting on the live server?
Both local and live servers run ubuntu 12.04
Thanks
Update 1
The actual error I'm getting is
invalid byte sequence for encoding "UTF8": 0xa3'
(This is a Postgres SQL error)
Other difference in local and live is live is over https and local is just http (both apache)
Update 2
Running:
file -bi script.php
on both local and live produces:
text/x-php; charset=iso-8859-1
So it seems as if the encoding of the file is intact?
Update 3
Looking at the local Postgres installation it has the following settings:
ENCODING = 'UTF8'
LC_COLLATE = 'en_GB.UTF-8'
LC_CTYPE = 'en_GB.UTF-8'
Whereas live has:
ENCODING = 'UTF8'
LC_COLLATE = 'en_US.UTF-8'
LC_CTYPE = 'en_US.UTF-8'
I'm going to see if I can swap the collate types to match local and see if that helps
Update 4
I'm doing this, which ultimately results in the failing piece of code on live (not local):
setlocale(LC_MONETARY, 'en_GB');
$equivFinal = utf8_encode("£") . money_format('%.2n', $equivFinal);
Update 5
I'm getting closer to the issue.
On local the string is produced as
£1.00
On live the string is produced as
£�1.00
So for some reason the live server is adding more crap in when doing the UTF8 conversion
Update 6
Ok so I've pinned it down to this:
setlocale(LC_MONETARY, 'en_GB');
Logger::getInstance(__NAMESPACE__)->info("TEST 01= " .money_format('%.2n', 1.00));
On local it outputs
TEST 01= 1.00
As expected
On live it outputs
TEST 01= �1.00
With the random characters added to the start, which is what is causing my utf8 issue as it's croaking on that.
Any idea why money_format would do that on one server and not another?
Finally nailed it.
It's money_format.
If you don't specify a locale, or specify it incorrectly, it just does its own thing.
So I was doing
setlocale(LC_MONETARY, 'en_GB');
and on local that meant money_format just left out the £ from the start of the output,
but on live it meant that money_format emitted a non-UTF-8 currency byte, which shows up as the Unicode replacement character (�).
Doing it properly for Ubuntu with
setlocale(LC_MONETARY, 'en_GB.UTF-8');
means money_format comes out with £ at the front, and therefore I don't need my utf8 rubbish.
Update 1
Better still, don't bother with setlocale and I'm just going to do this:
utf8_encode("£") . money_format('%!.2n', $equivFinal);
Which basically formats the money and excludes the symbol prefix
and then better still just use number_format and do
utf8_encode("£") . number_format($equivFinal, 2);
I've learnt something new :)
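For newer PHP versions, the same result can be sketched without utf8_encode() (deprecated since PHP 8.2) or money_format() (removed in PHP 8.0). The function name formatGbp() is a hypothetical helper. The "\u{00A3}" escape (PHP 7.0+) produces the UTF-8 bytes for £ regardless of how the source file is saved.

```php
<?php
// Hypothetical helper: prefixes a UTF-8 pound sign to a number with two decimals.
function formatGbp(float $amount): string
{
    // "\u{00A3}" expands to the two UTF-8 bytes 0xC2 0xA3.
    return "\u{00A3}" . number_format($amount, 2);
}

$price = formatGbp(1.0);
$large = formatGbp(1234.5);

echo $price, ' ', $large, PHP_EOL;
```

Because the source file then contains only ASCII, its on-disk encoding no longer affects the result.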
The issue is that you can't save a raw GBP symbol inside an ASCII file.
Never use weird characters in your source code, because no matter how much they "should" work, you always run into problems like this. (You can come up with your own definition of "weird", but mine is anything you can't type on a US English keyboard without resorting to alt-codes.)
To get around this restriction, concatenate in the result of the chr() function. (Use the following code snippet to find out the parameter you need to pass to chr(); it is 163 in this case.)
<?php echo(ord('£')); ?>
chr(163) is the single ISO-8859-1 byte 0xA3, the same byte Postgres rejected, so it still needs utf8_encode() before it goes into a UTF-8 database. In your case the line would read:
$equivFinal = utf8_encode(chr(163)) . money_format('%.2n', $equivFinal);

PHP MongoDB->insert - fatal error with utf8

An annoying encoding error is triggered by a new dataset in a MongoDB insert and stops my script whenever there is an encoding issue:
PHP Fatal error: Uncaught exception 'MongoException' with message 'non-utf8 string: ü'
How can I fix the new dataset before the PHP driver fails?
Is there a better idea than calling utf8_encode on every string, even those that are already UTF-8?
Had the same issue. This works (assuming the incoming strings are ISO-8859-1):
$string = mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
Use utf8_encode() ( http://php.net/manual/en/function.utf8-encode.php ), since the default PHP encoding is still not UTF-8, I think (not sure about PHP 5.4).
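One way to avoid re-encoding strings that are already UTF-8 is to check each value first and convert only the invalid ones. The sketch below walks a document array recursively before it would be handed to the driver. The function name ensureUtf8() and the ISO-8859-1 source encoding are assumptions for illustration, and the mbstring extension is required.

```php
<?php
// Hypothetical helper: returns a copy of $value in which every string is valid UTF-8.
// Strings that are already UTF-8 are left untouched; others are assumed to be $assumedSource.
function ensureUtf8($value, string $assumedSource = 'ISO-8859-1')
{
    if (is_array($value)) {
        // array_map with a single array preserves the keys.
        return array_map(function ($item) use ($assumedSource) {
            return ensureUtf8($item, $assumedSource);
        }, $value);
    }
    if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
        return mb_convert_encoding($value, 'UTF-8', $assumedSource);
    }
    return $value;
}

$document = ensureUtf8([
    'latin'  => "\xFC",        // "ü" as a single ISO-8859-1 byte
    'utf8'   => "\xC3\xBC",    // "ü" already encoded as UTF-8
    'count'  => 5,             // non-strings pass through unchanged
    'nested' => ["\xE4"],      // "ä" in ISO-8859-1, one level down
]);

echo bin2hex($document['latin']), PHP_EOL;
```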
