Converting Hebrew characters to UTF-8 using PHP - php

I have a MySQL field whose value is a JSON array containing Hebrew characters, like this:
[{"name":"אספנות ואומנות","value":1,"target":null},{"name":"אופניים","value":2,"target":null}]
(the Hebrew text is in the name field)
This field's output is giving me some trouble with a certain web interface.
So, looking around in the database, I found another field containing a JSON array, and its output works fine.
[{"name":"\u05d0\u05e1\u05e4\u05e0\u05d5\u05ea \u05d5\u05d0\u05d5\u05de\u05e0\u05d5\u05ea","value":1,"target":null},{"name":"\u05d0\u05d5\u05e4\u05e0\u05d9\u05d9\u05dd","value":2,"target":null}]
So I would like to convert the first field to this encoding to see if it solves the output issue.
What is this encoding? Is it UTF-8? How can I convert to it using PHP?
I tried to isolate the value and convert it to UTF-8 using
echo iconv("Windows-1255","UTF-8",'אספנות ואומנות');
but it just returns an empty value.
Any help would be great.

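So, in PHP,
json_encode('אספנות ואומנות');
did the trick. The \uXXXX sequences in the second field are JSON Unicode escapes rather than a separate character encoding; json_encode() produces them by default for non-ASCII characters.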
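To convert the whole stored field rather than a single string, one option is to decode it and encode it again. This is only a sketch: $stored is a hypothetical variable holding a shortened copy of the value from the question, and the script file itself is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical stored value: raw UTF-8 JSON, shortened from the question.
$stored = '[{"name":"אופניים","value":2,"target":null}]';

// Decode to PHP arrays, then re-encode. Without JSON_UNESCAPED_UNICODE,
// json_encode() writes every non-ASCII character as a \uXXXX escape.
$escaped = json_encode(json_decode($stored, true));

echo $escaped, PHP_EOL;
// [{"name":"\u05d0\u05d5\u05e4\u05e0\u05d9\u05d9\u05dd","value":2,"target":null}]
```

The iconv() call in the question most likely failed because the string was already UTF-8, not Windows-1255, so there was nothing valid to convert from.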

Related

How is json decoding of utf emoji possible?

As far as I can recall, valid JSON data comes in this format: '{"key":"value"}'
But while surfing, I found an article about sending UTF-8 codes as emoji.
The emoji was stored in a variable as
$emoji = "\ud83d\udc4e";
For it to work properly, the answer was to use json_decode($emoji);. I tried it out and it returned a thumbs-down emoji. I was expecting NULL, but it turns out that it was valid JSON data. So I'm confused about how that is possible.
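A JSON text does not have to be an object; a bare string in double quotes is valid JSON too, and json_decode() accepts it. The sketch below assumes the two escapes sit inside JSON double quotes (the $json variable is hypothetical). The pair \ud83d\udc4e is a UTF-16 surrogate pair that decodes to the single code point U+1F44E.

```php
<?php
// A JSON string literal: note the double quotes inside the PHP single quotes.
$json = '"\ud83d\udc4e"';

// json_decode() combines the surrogate pair into one UTF-8 encoded character.
$decoded = json_decode($json);
echo $decoded, PHP_EOL;                     // the thumbs-down emoji
var_dump($decoded === "\u{1F44E}");         // bool(true)

// Without the JSON quotes the input is not valid JSON at all.
var_dump(json_decode('\ud83d\udc4e'));      // NULL
```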

Laravel - Russian Symbols in Output

I have the following in my categories table. There are Russian symbols present:
id name
1 Обувь
Обувь = Obuv
I'm looking to get "Obuv" as Russian symbols within the output.
echo Category::select('name')->first();
This script should give me "Обувь" as the output, but I'm getting
{"name":"\u041e\u0431\u0443\u0432\u044c"}
What's wrong? How can I get the correct output? If I write "Obuv" in the database, the English "Obuv" gives me the correct output in Laravel. phpMyAdmin shows the Russian symbols without trouble, so the problem lies in Laravel.
You are outputting the data by echoing your Category model, which Laravel serializes to JSON by default. JSON encoding escapes multibyte Unicode characters unless told otherwise. When you run json_encode() yourself, you can supply the JSON_UNESCAPED_UNICODE option like so:
$data = Category::select('name')->first();
echo json_encode($data, JSON_UNESCAPED_UNICODE);
I should be clear, though: your Category does not have the escaped data stored within it; the escaping only takes place when you output to JSON. If all you were doing was checking that you got the correct data back from the database by echoing it, you should be in the clear.
If you're looking for just the raw text of the category's name property, you should be able to output this like so:
echo $data->name;
For more information, see http://php.net/manual/en/function.json-encode.php and https://laracasts.com/discuss/channels/laravel/how-to-prevent-laravel-from-returning-escaped-json-data.
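The same behaviour can be reproduced without Laravel. In this sketch a plain array, $row, stands in for the model (an assumption for illustration), and the script file is assumed to be saved as UTF-8.

```php
<?php
// Plain array standing in for the Category model's data.
$row = ['name' => 'Обувь'];

$escaped  = json_encode($row);                          // default: \uXXXX escapes
$readable = json_encode($row, JSON_UNESCAPED_UNICODE);  // keeps the Cyrillic text

echo $escaped, PHP_EOL;   // {"name":"\u041e\u0431\u0443\u0432\u044c"}
echo $readable, PHP_EOL;  // {"name":"Обувь"}

// Both forms decode to identical data; only the textual representation differs.
var_dump(json_decode($escaped, true) === json_decode($readable, true)); // bool(true)
```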

Weird encoding issue in XML-RPC call

I'm retrieving a list of partners from Odoo 9 (on Ubuntu 14.04, English) via XML-RPC, using PHP and ripcord.
Some names contain one or more diacritics:
Pièr
Frère Pièr
All those names have been entered from a single computer running Windows 8.1 using one version of Chrome.
The strange fact is that I get a list where some diacritics are correct and others have encoding problems, like:
Pi�r
Fr�re Pièr
The same diacritic can be correctly encoded in one place and broken in another, even within the same string.
In subsequent calls the result is always the same.
If I edit the string, then it could change the results, giving
Frère Pi�r
Frère Pièr
Fr�re Pi�r...
I need to output a JSON, and thus I need to encode this in UTF-8: but it is currently impossible since I don't have a clue of what encoding the original text is (and it seems to not have any encoding at all!)
Any idea?
I found out that the incoming array was in the "Latin1" charset.
I solved it by normalizing the array generated from the XML-RPC output, recursively applying a multibyte conversion function:
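// given an XML-RPC output named $arr_output...
function descramble_diacritics(&$entry, $key) {
    // Convert only string leaves; integers, booleans and null stay untouched.
    if (is_string($entry)) {
        $entry = mb_convert_encoding($entry, 'UTF-8', 'Latin1');
    }
}
// Walk every leaf of the nested array and convert it in place.
array_walk_recursive($arr_output, 'descramble_diacritics');
header('Access-Control-Allow-Origin: *');
header('Content-Type: application/json');
echo json_encode($arr_output);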
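Since the question mentions that some strings arrived intact, a more defensive variant might convert only the values that are not already valid UTF-8. This is a sketch of that alternative, not the fix used above; to_utf8() is a hypothetical helper, and it assumes anything that fails the UTF-8 check is Latin1.

```php
<?php
// Hypothetical helper: leave valid UTF-8 alone, treat everything else as Latin1.
function to_utf8(string $s): string
{
    return mb_check_encoding($s, 'UTF-8')
        ? $s
        : mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "Pi\xE8r";       // "Pièr" as ISO-8859-1 bytes
$utf8   = "Pi\xC3\xA8r";   // "Pièr" as UTF-8 bytes

var_dump(to_utf8($latin1) === $utf8); // bool(true): converted
var_dump(to_utf8($utf8) === $utf8);   // bool(true): left unchanged
```

A check like this cannot tell apart every possible byte sequence, so it is only worth using when the input really is a mix of these two encodings.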

PHP json_decode returns null

I'm writing PHP code that uses a database. To do so, I use an array as a hash-map.
Every time content is added or removed from my DB, I save it to file.
I'm forced by my DB structure to use this method and can't use mysql or any other standard DB (School project, so structure stays as is).
I built two functions:
function saveDB($db){
$json_db = json_encode($db);
file_put_contents("wordsDB.json", $json_db);
} // saveDB
function loadDB(){
$json_db = file_get_contents("wordsDB.json");
return json_decode($json_db, true);
} // loadDB
When echoing the string I get after the encoding or after loading from file, I get valid JSON (tested in a JSON viewer). Whenever I try to decode the string using json_decode(), I get null (tested with var_dump()).
The json string itself is very long (~200,000 characters, and that's just for testing).
I tried the following:
Replacing single/double-quotes with double/single-quotes (Without any backslashes, with one backslash and three backslashes. And any combination I could think of with a different number of backslashes in the original and replaced string), both manually and using str_replace().
Adding quotes before and after the json string.
Changing the page's encoding.
Decoding without saving to file (Right after encoding).
Checked for slashes and backslashes. None to be found.
Tried addslashes().
Tried using various "Escape String" variants.
json_last_error() doesn't work. I get no error number (Get null, not 0).
It's not my server, so I'm not sure what PHP version is used, and I can't upgrade/downgrade/install anything.
I believe the size has something to do with it, because small strings seem to work fine.
Thanks Everybody :)
In your JSON file, change null to "null" and it will solve the problem.
Check whether your file is UTF-8 encoded. json_decode() works with UTF-8 encoded data only.
EDIT:
After I saw the uploaded JSON data, I did some digging and found that there is a 'null' key. Search for:
"exceeding":{"S01E01.html":{"2217":1}},null:{"S01E01.html":
Change that null to a valid property name and json_decode() will do the job.
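To see the effect of the unquoted key in isolation, here is a small sketch. The two JSON strings are hypothetical samples modelled on the fragment above (the inner values are made up), and json_last_error_msg() needs PHP 5.5 or later.

```php
<?php
// Hypothetical samples: the same structure with an unquoted and a quoted key.
$bad  = '{"exceeding":{"S01E01.html":{"2217":1}},null:{"S01E01.html":{"1":1}}}';
$good = '{"exceeding":{"S01E01.html":{"2217":1}},"null":{"S01E01.html":{"1":1}}}';

$badResult = json_decode($bad, true);
$badError  = json_last_error();          // read right after the failing call
echo json_last_error_msg(), PHP_EOL;     // human-readable reason for the failure

$goodResult = json_decode($good, true);
$goodError  = json_last_error();

var_dump($badResult === null, $badError === JSON_ERROR_SYNTAX); // bool(true) bool(true)
var_dump($goodError === JSON_ERROR_NONE);                       // bool(true)
```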
I had a similar problem last week. My JSON was valid according to jsonlint.com.
My JSON string contained a # and a &, and those two made json_decode fail and return null.
By using var_dump(json_decode($myvar)), which stops right where it fails, I managed to figure out where the problem was coming from.
I suggest var_dumping and using the find function to look for these kinds of characters.
Just on the off chance.. and more for anyone hitting this thread rather than the OP's issue...I missed the following, someone had htmlentities($json) way above me in the call stack. Just ensure you haven't been bitten by the same and check the html source.
Kickself #124
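A short sketch of the htmlentities() pitfall described above, using a hypothetical one-key JSON string: the double quotes become &quot; entities, so the result is no longer JSON until the entities are decoded again.

```php
<?php
$json    = '{"a":"b"}';             // hypothetical sample
$mangled = htmlentities($json);     // {&quot;a&quot;:&quot;b&quot;}

var_dump(json_decode($mangled, true));   // NULL: the entities break the syntax

// Undo the escaping before decoding.
$restored = json_decode(html_entity_decode($mangled), true);
var_dump($restored === ['a' => 'b']);    // bool(true)
```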

How do I get PHP to accept ISO-8859-1 characters in general?

This has been bugging me for ages and I want to get to the bottom of this once and for all. I have an associative array whose keys I have defined using ISO-8859-1 characters. For instance:
array("utført" => "red");
I also have another array that I have loaded in from a file. I have printed this array out in a browser, checking that values like Æ, Ø and Å are intact. I try to compare two fields from these arrays and I'm slapped by the message:
Undefined index: utfã¸rt on line 39
I can't help but sob. Every single damn time I involve any letters outside UTF-8 in a script, they are at some point converted into ã¸r or similar nonsense.
My script file is encoded in ISO-8859-1, the document from which I'm loading my data is the same, and so is the MySQL table I'm trying to save the data to.
So the only conclusion I can draw is that PHP isn't accepting just any character set in its code, and I have to somehow force PHP to speak Norwegian.
Thanks for any suggestions.
Just FYI, I won't accept any answers along the lines of "Just don't use those characters" or "Just replace those characters with UTF equivalents at file load" or any other hack solutions.
When you read your data from an external file, try to convert it to the proper encoding.
Something like this is what I have in mind...
$f = file_get_contents('externaldata.txt');
// Third argument is the file's actual encoding; UTF-8 is an assumption here (it is also the usual default when omitted).
$f = mb_convert_encoding($f, 'ISO-8859-1', 'UTF-8');
// from this point on, do whatever you want with $f
Also, look at the mb_convert_encoding() manual for more info.
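The "Undefined index" notice is consistent with the two arrays holding the same word as different byte sequences. The sketch below writes the bytes out explicitly so it does not depend on how the script file is saved; the variable names are made up for illustration, and it assumes one side is ISO-8859-1 and the other UTF-8.

```php
<?php
$keyLatin1 = "utf\xF8rt";       // "utført" as ISO-8859-1 bytes
$keyUtf8   = "utf\xC3\xB8rt";   // "utført" as UTF-8 bytes

$colors = [$keyLatin1 => "red"];

// PHP compares array keys byte by byte, so the UTF-8 spelling is not found.
var_dump(isset($colors[$keyUtf8]));   // bool(false)

// Normalizing the lookup key to the array's encoding makes the lookup succeed.
$normalized = mb_convert_encoding($keyUtf8, 'ISO-8859-1', 'UTF-8');
var_dump($colors[$normalized]);       // string(3) "red"
```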
