PHP MongoDB->insert - fatal error with utf8

An annoying encoding error on a new dataset in a MongoDB insert stops my script whenever a string has an encoding issue:
PHP Fatal error: Uncaught exception 'MongoException' with message 'non-utf8 string: ü'
How can I fix the new dataset before the PHP driver breaks?
Is there a better idea than calling utf8_encode on every string, even those that are already UTF-8?

Had the same issue. This works:
// Arguments are: string, target encoding, source encoding.
$string = mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');

Use utf8_encode() ( http://php.net/manual/en/function.utf8-encode.php ), since the default PHP encoding is still not UTF-8, I think (not sure about PHP 5.4).
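On the question of avoiding a blind utf8_encode on strings that are already UTF-8: one option is to validate first and convert only the strings that fail. The sketch below is a hedged illustration, not the driver's own mechanism. The helper name ensure_utf8 is hypothetical, and it assumes that anything which is not valid UTF-8 is ISO-8859-1. It checks array values only, not keys.

```php
<?php
// Hypothetical helper: returns valid UTF-8, converting only when needed.
// Assumption: any string that is not valid UTF-8 is ISO-8859-1 (Latin-1).
function ensure_utf8($value)
{
    if (is_array($value)) {
        // Walk nested documents so every string value is checked.
        return array_map('ensure_utf8', $value);
    }
    if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
        return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
    }
    return $value; // already valid UTF-8, or not a string
}

$doc = array(
    'name' => "M\xFCller",       // "Müller" as ISO-8859-1 bytes (invalid UTF-8)
    'city' => "Z\xC3\xBCrich",   // "Zürich" already encoded as UTF-8
);
$clean = ensure_utf8($doc);      // safe to pass to the insert call
```

Because valid UTF-8 passes through untouched, running the helper twice on the same data does not double-encode it, which is the risk with calling utf8_encode unconditionally.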

Related

Encoding problem when connecting to SQL Server database via odbc_connect()

Firstly, I don't have the option of using the latest SQLSRV drivers on my host, so I am stuck with an ODBC connection.
$connection_string = 'DRIVER={SQL Server};SERVER=111.111.111.111;DATABASE=MY_DATABASE';
$user = 'name';
$pass = 'pass';
$connection = odbc_connect( $connection_string, $user, $pass, SQL_CUR_USE_ODBC );
The collation of that database is Slovak_CI_AI. If I set my PHP header to utf-8, output data looks messed up, encoding is wrong.
If I put 'Slovak_CI_AI' as the charset in my PHP header, the data displays fine, but that is probably a no-go, because I need to work with that data in WordPress, which fails to process strings that contain special/non-English characters (those strings look broken to WP).
I've tried many conversions with mb_convert_encoding, iconv or utf8_decode, but no luck. WordPress uses utf-8.
I can't find any solution for this.
Update: I've tried adding CHARSET=UTF8 to my ODBC connection string, but no luck. I also found out that the character set for texts in the database is cp1250. I've tried setting cp1250 as the charset in my PHP header; the output is fine, but WordPress still fails once it encounters a special character. I've tried converting those strings from cp1250 to utf-8 with iconv, but no luck either: the strings have the wrong encoding on output and WordPress still fails.
This whole encoding thing still feels chaotic to me, but I somehow managed to do it. It works when:
odbc connection string contains charset=cp1250
PHP header character set is set to utf-8
I convert all problematic strings from cp1250 to utf-8 with iconv
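The last step of that list can be sketched as below. This is a minimal illustration, not the poster's actual code: the helper name row_to_utf8 and the sample row are made up, and it assumes every string column in a fetched row (for example from odbc_fetch_array) arrives as cp1250.

```php
<?php
// Hypothetical helper: convert one fetched row from cp1250 to UTF-8.
// Assumption: every string column in the row is cp1250-encoded.
function row_to_utf8(array $row)
{
    foreach ($row as $key => $value) {
        if (is_string($value)) {
            $row[$key] = iconv('CP1250', 'UTF-8', $value);
        }
    }
    return $row;
}

// Sample row: "\xE1" is "á" and "\x9A" is "š" in cp1250.
$row  = array('id' => 7, 'name' => "Tom\xE1\x9A");
$utf8 = row_to_utf8($row); // 'name' now holds "Tomáš" as UTF-8 bytes
```

Non-string columns are left alone, so integer IDs and NULL values pass through unchanged.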

json_encode - Invalid UTF-8 sequence in argument

I had an issue with json_encode. I was getting
PHP Warning: json_encode() [<a href='function.json-encode'>function.json-encode</a>]: Invalid UTF-8 sequence in argument in /var/www/html/web/example.php on line 500
Then I set magic_quotes_gpc = 0 in php.ini (it was 1 before) and it stopped showing the json_encode error.
Now, I started getting the same error again. magic_quotes_gpc is 0 in php.ini. I am using PHP 5.3
I found many answers which say to convert it to UTF-8. But I cannot do it because I am using json_encode in many places and changing all is not possible.
I would like to fix the root issue so that I don't need to change the json_encode code.
In MySQL, the result for
SHOW VARIABLES LIKE 'character_set%';
is
character_set_client latin1
character_set_connection latin1
character_set_database latin1
character_set_filesystem binary
character_set_results
character_set_server latin1
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
What is the reason for the json_encode error?
I am using Zend Server and I see this json_encode error in the Zend Server logs.
Another thing I noticed is that even though I see the error in the server logs, the array is properly converted to JSON.
There is no error in converting the array to JSON, so why do I see the error in Zend Server?
json_encode() needs valid UTF-8 data as input.
You are feeding it invalid data.
From what you describe, you probably are getting latin1 data from your database connection, which will cause json_encode() to choke.
Set the database's connection in your script to UTF-8. How to do that depends on the database library you are using.
Here is a list of ways to switch to UTF-8 in the most common libraries:
UTF-8 all the way through
Try to add the instruction below after the connection parameters:
mysql_set_charset('utf8');
And for the result value:
mb_convert_encoding($result,'UTF-8','UTF-8');
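To see the failure and the fix side by side, here is a small hedged sketch. It assumes the bytes arrive as ISO-8859-1 (what a latin1 connection typically delivers) and that the script runs on PHP 5.5 or later, where json_encode returns false on invalid UTF-8. The variable names are illustrative only.

```php
<?php
// Assumption: the value came from a latin1 connection, so it is ISO-8859-1.
$fromDb = "caf\xE9"; // "café" in ISO-8859-1; not valid UTF-8

// Encoding the raw bytes fails (on PHP 5.5+ json_encode returns false).
$raw      = json_encode($fromDb);
$rawError = json_last_error(); // JSON_ERROR_UTF8

// Convert to UTF-8 first, then encode.
$converted = mb_convert_encoding($fromDb, 'UTF-8', 'ISO-8859-1');
$json      = json_encode($converted); // "caf\u00e9"
```

Switching the connection charset, as described above, removes the need for the conversion step, because the data then arrives as UTF-8 in the first place.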

convert latin1_swedish_ci to utf8

I am importing data from an XML file, and it appears they use "latin1_swedish_ci", which is causing me lots of issues with PHP and MySQL (PDO).
I'm getting a lot of these errors:
General error: 1267 Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
I am wondering how I can convert them to proper UTF-8 to store in my database.
I've tried doing this:
$game['human_name'] = iconv(mb_detect_encoding($game['human_name'], mb_detect_order(), true), "UTF-8", $game['human_name']);
Which I found here: PHP: Convert any string to UTF-8 without knowing the original character set, or at least try
I still seem to get the same error though?
Don't use any conversion functions -- get utf8 specified throughout the processing.
Make sure the data is encoded utf8.
When using PDO:
$db = new PDO('mysql:host=host;dbname=db;charset=utf8', $user, $pwd);
Table definitions (or at least columns):
CHARACTER SET utf8
Web page output:
<meta ... charset=UTF-8>

php 5.4 charset for Exception

In PHP 5.4 my code doesn't work properly. I use a Cyrillic charset. In short:
throw new Exception('Сообщение');
will output:
Fatal error: in test.php ...
although the expected result is:
Fatal error: Uncaught exception 'Exception' with message ...
If I don't use Cyrillic characters, the result is OK. Moreover, if I run this code in 5.3, I get the proper result. That is, when I use Cyrillic in 5.4, the message comes out as an empty string.
There are reported issues with non utf-8 chars in exceptions. Try converting the message to utf-8 like so:
throw new Exception(utf8_encode('Сообщение'));
if that does not work then try the following:
$message = 'Сообщение';
$message = mb_convert_encoding($message, 'UTF-8', 'Windows-1251');
throw new Exception($message);
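A self-contained sketch of that idea follows. It assumes the source file, and therefore the string literal, is saved as Windows-1251; the message is written as explicit bytes here so the example does not depend on how this page is encoded. The conversion runs only when the string is not already valid UTF-8.

```php
<?php
// Assumption: the message literal is stored as Windows-1251 bytes.
$message = "\xD1\xEE\xEE\xE1\xF9\xE5\xED\xE8\xE5"; // "Сообщение" in Windows-1251

// Convert only if the bytes are not already valid UTF-8.
if (!mb_check_encoding($message, 'UTF-8')) {
    $message = mb_convert_encoding($message, 'UTF-8', 'Windows-1251');
}

try {
    throw new Exception($message);
} catch (Exception $e) {
    $shown = $e->getMessage(); // valid UTF-8, 9 characters long
}
```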
-- EDIT --
The actual problem is not that the exception message is not stored, but rather - the exception is not displayed properly. In PHP 5.3, xdebug is not turned on by default and in PHP 5.4, it is. xdebug is set to display everything in UTF-8 and your message is probably encoded in some other charset, thus the message not being rendered correctly.
If you scroll to the bottom of this page, you will find a single comment referring to this problem.
PHP themselves tracked this issue on here
This stackoverflow thread is also related to the same issue.
You might be able to get away by setting the xdebug encoding to a non utf-8 charset. Please read the xdebug manual regarding this

UTF-8 Parsing error in reading UTF-8 Excel files using PHP-ExcelReader

I'm trying to read and parse a UTF-8 Excel file using PHP-ExcelReader, but it does not work correctly: I get ???? instead of the UTF-8 characters. How should I handle this situation? I have used this configuration for my parsing:
$data = new Spreadsheet_Excel_Reader();
$data->setOutputEncoding('CP1251');
$data->setUTFEncoder('mb');
Thanks
UTF-8 is the default encoding for Spreadsheet_Excel_Reader...you should not need to change this at all unless you want the values automatically converted to some other charset.
e.g.
$data = new Spreadsheet_Excel_Reader("test.xls",true,"UTF-16");
to convert outputs to UTF-16
I tried UTF-16 but there's still an error.
Then I tried:
$data->setOutputEncoding('UTF-8');
$data = new Spreadsheet_Excel_Reader("test.xls",true,"UTF-8");
This is ok.
