Encode accented characters in JSON using PHP

My query result contains some accented characters, like:
CollectionTitle => Afleuréss
But when I write a JSON file with json_encode($Result_Array) and read the result back, it shows:
CollectionTitle => NULL
Then I used array_map():
$res[] = array_map('utf8_encode', $row);
But that gives me:
CollectionTitle => AfleurÃ©ss instead of CollectionTitle => Afleuréss
Please suggest a better way to resolve this issue.
Thanks

The second one is actually the correct one. The problem is your browser cannot detect the encoding and defaults to whatever the default is (probably ISO-8859-1). Switch your browser encoding and you'll see the right character appear.

Add to your HTML head:
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
Note that you should also have a proper HTML doctype, because browsers do not default to UTF-8. A simple test like this one works for me:
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
<?php
$title = "Jérôme";
echo $title."<br>";
?>
But the meta tag belongs inside the head tag. The HTML document should look like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>An XHTML 1.0 Strict standard template</title>
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
</head>
<body>
<?php
$title = "Jérôme";
echo $title."<br>";
?>
</body>
</html>
That is standard.

json_encode only supports UTF-8, but the rest of your app is using Windows-1252. I don't suggest using utf8_encode as that converts from ISO-8859-1 to UTF-8. That only works 95% of the time for you because you are using Windows-1252, not ISO-8859-1*.
I don't know if it's possible for you but if you can, you should switch over to UTF-8 so you don't need this fragile conversion code anywhere.
*This is probably confusing. Browsers do not actually allow you to use ISO-8859-1 and instead treat it as Windows-1252. Same with MySQL, Latin1 means Windows-1252. Both are defaults. utf8_encode/decode of course use actual ISO-8859-1, so it's incompatible in the 0x80-0x9F range.
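A minimal sketch of the conversion described above, assuming the rows really do arrive as Windows-1252 and that the mbstring extension is available. The sample row and its Note field are made up for illustration; only CollectionTitle comes from the question.

```php
<?php
// Hypothetical row as it might arrive over a Windows-1252 connection.
// "\xE9" is é and "\x92" is the curly apostrophe ’ in Windows-1252.
$row = ['CollectionTitle' => "Afleur\xE9ss", 'Note' => "Chloe\x92s"];

// Transcode every value from Windows-1252 to UTF-8 before encoding.
$utf8Row = array_map(
    static fn(string $value): string => mb_convert_encoding($value, 'UTF-8', 'Windows-1252'),
    $row
);

// JSON_UNESCAPED_UNICODE keeps é and ’ readable instead of \u escapes.
$json = json_encode($utf8Row, JSON_UNESCAPED_UNICODE);
echo $json, "\n"; // {"CollectionTitle":"Afleuréss","Note":"Chloe’s"}
```

Byte 0x92 is in the 0x80-0x9F range where ISO-8859-1 and Windows-1252 disagree, so treating the input as ISO-8859-1 would turn it into an invisible control character rather than an apostrophe.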

In your second example / step:
$res[] = array_map('utf8_encode', $row);
It looks like you're trying to encode to UTF-8 something that is not ISO-8859-1.
You should detect (or know) the encoding coming from your database and transcode it properly to UTF-8, with iconv for example.
Alternatively, find out:
What is the encoding in the Database?
What is the encoding of the PHP files?
What is the encoding in the HTML page? <meta charset="utf-8">
And if that's possible, move all of the above to UTF-8...
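As a rough sketch of the "detect, then transcode with iconv" idea: the helper below (toUtf8 is a hypothetical name, not a PHP built-in) leaves valid UTF-8 alone and converts everything else from an assumed source encoding. The Windows-1252 default is an assumption; use whatever your database actually delivers. Note that a validity check is only a heuristic, since some non-UTF-8 byte sequences happen to be valid UTF-8 too.

```php
<?php
/**
 * Return $value as UTF-8. Valid UTF-8 input is returned unchanged;
 * anything else is assumed to be in $assumedSource and transcoded.
 * (Hypothetical helper for illustration.)
 */
function toUtf8(string $value, string $assumedSource = 'Windows-1252'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    $converted = iconv($assumedSource, 'UTF-8', $value);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $assumedSource to UTF-8");
    }
    return $converted;
}

$row = ['CollectionTitle' => "Afleur\xE9ss"]; // é as a single Windows-1252 byte
$res = array_map('toUtf8', $row);
echo json_encode($res, JSON_UNESCAPED_UNICODE), "\n"; // {"CollectionTitle":"Afleuréss"}
```

Because already-valid UTF-8 passes through untouched, calling the helper twice does not double-encode the value, which is what happens when utf8_encode is applied to data that is already UTF-8.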

Related

Incorrect encoding on Php file but not on Html file

I wrote this code in an HTML file:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15" />
</head>
<body>
Try with special character (ì)
</body>
</html>
When I display my HTML file, everything is fine:
Try with special character (ì)
But when I rename my HTML file to a PHP file, this is the result:
Try with special character (�)
Can someone help me understand?
You could try with UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Have a look at the PHP function mb_internal_encoding. It sets the character set that PHP will use internally. If you add this at the top of your HTML:
<?php
mb_internal_encoding("iso-8859-15");
?>
Your HTML should show fine if the file is in ISO-8859-15. What happens is that PHP interprets the HTML looking for PHP blocks and generates the output based on the default internal encoding. In combination with mb_http_output() you can have a source file in one encoding, generating output in a different encoding.
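One more thing worth checking in this situation: PHP sends its own Content-Type HTTP header, with the charset taken from the default_charset setting (UTF-8 by default since PHP 5.6), and browsers give that header priority over a <meta> tag. A static .html file may be served without any charset in the header, which would explain why only the .php version breaks. A minimal sketch, assuming the file is saved as ISO-8859-15; contentTypeHeader() is a hypothetical helper:

```php
<?php
// Hypothetical helper: build the header line for a given charset.
function contentTypeHeader(string $charset): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Must run before any output is sent to the browser.
header(contentTypeHeader('ISO-8859-15'));

// "\xEC" is the single byte for ì in ISO-8859-15.
echo "<p>Try with special character (\xEC)</p>\n";
```

The other direction works as well: save the file as UTF-8 and leave the default header alone.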
Specify the character encoding for the HTML document:
character_set Specifies the character encoding for the HTML document.
Common values:
UTF-8 - Character encoding for Unicode
ISO-8859-1 - Character encoding for the Latin alphabet
In theory, any character encoding can be used, but no browser understands all of them. The more widely a character encoding is used, the better the chance that a browser will understand it.
To view all available character encodings, look at IANA character sets.
Just replace your HTML with the following:
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
Try with special character (ì)
</body>
</html>

Some characters become "�" in our webpage

I use PHP to access a database to get a string like this
‘Chloe’ Fashion Show & Dinner
and then I do a printf() to output the string as html, but my webpage shows this:
�Chloe� Fashion Show & Dinner
All the content is in English. Am I missing something in PHP?
Where should I be checking?
Check if your .php file is encoded as UTF-8 without BOM
Check that your connection to the database is UTF-8
Check that you send <meta charset="utf-8"> in your HTML markup in the <head> tag
If your connection to the database is not UTF-8 and you don't want to change it (though I recommend changing it: using UTF-8 everywhere is the safest protection against garbled characters), use utf8_encode($databaseValue) to ensure the encoding of your value is UTF-8.
Make sure that you use:
<meta charset="utf-8">
in the head of your page.
You need to add a charset meta tag in the 'head' section of the HTML.
Note that the meta tag must appear within the first 1024 bytes of the rendered page.
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>

UTF-8 French accented characters issue

When I view the data stored in my MySQL database using phpMyAdmin, the characters are stored exactly as é à ç. However, when I use PHP to display this data in an HTML document that has exactly the following structure:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
</head>
<body>
</body>
</html>
I get squares instead of accented characters. However, I don't have this issue with accented characters in static content on the same page that was not loaded from MySQL.
When I look at the source code of the page, they seem to be identical! For example:
Part of the static data, as displayed in the source code:
éçà
Part of the data originating from MySQL:
éçà
I tried replacing
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
with
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252" />
and as a result the MySQL data was fixed, but the static content showed squares!
Any hints?
This is quite a common charset issue. You need to set the connection encoding manually for the MySQL connection (these should be the first queries you execute after establishing the connection):
SET NAMES utf8;
SET CHARACTER SET utf8;
Also make sure every table has its CHARACTER SET set to UTF-8.
Alternatively, you could update the server configuration.
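If you connect through PDO, the connection charset can also be requested in the DSN instead of running SET NAMES by hand. This is only a sketch under assumptions: buildUtf8Dsn() and connectUtf8() are hypothetical helpers, the host and database names are placeholders, and it uses utf8mb4 (MySQL's full UTF-8 charset) rather than the utf8 shown above.

```php
<?php
// Hypothetical helper: build a MySQL DSN that asks for a UTF-8 connection.
function buildUtf8Dsn(string $host, string $database): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $database);
}

// Hypothetical helper: open the connection with exceptions enabled.
function connectUtf8(string $host, string $database, string $user, string $password): PDO
{
    return new PDO(buildUtf8Dsn($host, $database), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo buildUtf8Dsn('localhost', 'example_db'), "\n";
// mysql:host=localhost;dbname=example_db;charset=utf8mb4
```

With mysqli, the equivalent is to call $mysqli->set_charset('utf8mb4') right after connecting.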
Looks like a misconfiguration issue. Most probably your DB or drivers are not using UTF-8.
The fact that the data from the DB displays correctly when you change to windows-1252, while the static content does not, suggests that your source file is (correctly) in UTF-8 but the data from your DB is arriving in the wrong encoding.
Whatever is going on, stick to UTF-8.
UPDATE: There is a thread that explains how to automatically set the encoding for the connection:
Change MySQL default character set to UTF-8 in my.cnf?

How to save Russian characters in a UTF-8 encoded file

OK so I have a PHP file with several strings of text in various languages. For most languages like French or Spanish I just simply type in the characters.
The problem I have is with Russian language characters. The PHP file is encoded in UTF-8, how can I make sure that the Russian characters are both saved correctly and displayed correctly on the output web page... Is it just a case of pasting the text into the PHP file, or is there a way to guarantee the characters will be saved into the file correctly - perhaps converting it into HTML-like notation for example?
Obviously I am assuming the end user will have the correct encoding set in their web browser, I just want to make sure I got it all covered from my end.
I am using Notepad++ on Windows to edit my PHP file.
Thanks!
If you want to tell browsers your encoding, place this inside your <head> tag:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
Or the short version:
<meta charset='utf-8'>
That should be enough for Russian characters to be displayed correctly on a webpage.
If your doctype is HTML, declare <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>, but if your doctype is XHTML, declare <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />.
Never assume in your design that the end user will act correctly.
If you already have a document, edit its meta tag to declare the charset, use Notepad++ (Encoding > Convert to UTF-8 without BOM), save the document, and you can safely continue with your multilingual structure from then on.
The php tag is irrelevant to your question, since you don't mention any database charset setting.
There is no difference between Latin and Cyrillic characters in UTF-8. Both are just byte sequences. Configure your server or PHP script to send Content-Type: text/html;charset=utf-8, and you are rather safe.
Your editor might have problems when the font you are using does not contain Russian characters. Choose another font then.
And please ignore the <meta> element recommendations. You don't need that: it is useless when your HTTP headers are correct, and maybe harmful if they aren’t.
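To illustrate the "just byte sequences" point, here is a small sketch that assumes the source file itself is saved as UTF-8 and that mbstring is available. The sample words are arbitrary.

```php
<?php
// Assumes this source file is saved as UTF-8.
$latin    = 'Hello';
$cyrillic = 'Привет';

foreach ([$latin, $cyrillic] as $word) {
    printf(
        "%s: %d bytes, %d characters, valid UTF-8: %s\n",
        $word,
        strlen($word),               // counts bytes
        mb_strlen($word, 'UTF-8'),   // counts characters
        mb_check_encoding($word, 'UTF-8') ? 'yes' : 'no'
    );
}
// Hello: 5 bytes, 5 characters, valid UTF-8: yes
// Привет: 12 bytes, 6 characters, valid UTF-8: yes
```

Each Cyrillic letter simply takes two bytes instead of one; nothing else about the string is special, so pasting the text into a UTF-8 file is enough.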
Well, you have to check two things.
First, to ensure that the *.php file is a UTF-8 file, I use PSPad. If the file is not in UTF-8, I save
it like this: http://stepolabs.com/upload/utf-8.png
Second, your website must declare UTF-8 encoding in a <meta> tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
... more about metatagging.
Finally, if both are done correctly (file format and meta declaration), everything should be displayed properly!

&nbsp; not being displayed properly

I am having a problem displaying &nbsp; in my web page: after using utf8_decode() in PHP, it gets displayed as �.
I have been using
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
I just noticed that all the other special characters, like ®, ™, etc., are not working either.
Be sure that you've specified UTF-8 encoding in your HTML document's <head> tag:
<meta content="text/html; charset=UTF-8" http-equiv="content-type" />
That's strange, since utf8_encode('&nbsp;')==='&nbsp;'. Regardless of whether it's UTF-8 or Latin-1 encoded, the byte sequence for &nbsp; is the same.
Is the remaining string properly utf8 encoded?
edit: Why do you use utf8_decode() (converting utf8 encoded strings to latin1) in the first place when you're telling the browser that your page is utf8 encoded?
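A small sketch of why decoding to Latin-1 can produce � on a UTF-8 page, assuming the string contains a literal non-breaking space (U+00A0) or ® rather than an HTML entity. It uses mb_convert_encoding() to perform the same UTF-8 to ISO-8859-1 conversion that utf8_decode() does.

```php
<?php
$nbsp = "\u{00A0}"; // non-breaking space, two bytes in UTF-8
$reg  = "\u{00AE}"; // ®, also two bytes in UTF-8

// Same direction as utf8_decode(): UTF-8 in, ISO-8859-1 out.
$nbspLatin1 = mb_convert_encoding($nbsp, 'ISO-8859-1', 'UTF-8');
$regLatin1  = mb_convert_encoding($reg, 'ISO-8859-1', 'UTF-8');

echo bin2hex($nbsp), ' -> ', bin2hex($nbspLatin1), "\n"; // c2a0 -> a0
echo bin2hex($reg), ' -> ', bin2hex($regLatin1), "\n";   // c2ae -> ae

// A lone 0xA0 or 0xAE byte is not valid UTF-8, so a browser that was
// told the page is UTF-8 renders it as the replacement character.
var_dump(mb_check_encoding($nbspLatin1, 'UTF-8')); // bool(false)
```

Dropping the decoding step, so that the bytes stay UTF-8 as the meta tag promises, avoids the mismatch.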
Have you checked the encoding of the php file itself?
In some windows editors (like notepad++) you can have some utf-8 character problems when you check the wrong encoding for your file - even if you set your meta tag correctly.
In notepad++ you can change it in this section:
Change notepad++ file encoding http://img198.imageshack.us/img198/9081/notepadp.png
If you're not using notepad++, we'll need some more detailed information from your setup, like Operating System used, IDE, etc.
Also make sure you give the document a proper dtd definition by putting something like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
As the first line of html in your php file.
When you use utf8_decode, where is the string passed to this function loaded from? Are you loading data from a database? Do you have any included files? If so, check that they are all encoded as GmonC wrote. Try to echo &nbsp; somewhere in the page and see whether it displays correctly. If not, try making a clean .php file and see whether the problem still occurs. If it doesn't, some included file could be the problem, because it could have a different encoding.
