Multiple languages in web service - php

Currently I'm using a php web service to retrieve the information from MySQL.
I'm dealing with multiple languages, including Chinese, Japanese and French characters, and I'm having issues displaying Chinese, Japanese and a few other languages.
<?php
echo "你好";
?>
For example, when I try to echo simple Chinese characters, what shows is "ä½ å¥½" instead. I'm not sure how to proceed.
Please advise.
Thank you

You should probably set the character encoding.
This is traditionally done by setting the HTTP Content-Type header. The default is often:
Content-Type: text/html; charset=ISO-8859-1
You can change this in PHP by using the header() function.
header('Content-Type: text/html; charset=utf-8');
Some other resources for you:
http://www.w3.org/International/questions/qa-html-encoding-declarations
http://en.wikipedia.org/wiki/Character_encodings_in_HTML
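To see why the output turns into "ä½ å¥½": those are the UTF-8 bytes of the two Chinese characters, read one byte at a time as ISO-8859-1. Here is a minimal sketch of that effect. It assumes the iconv extension is loaded (it usually is), and show_as_latin1 is just a name made up for this illustration.

```php
<?php
// Hypothetical helper: shows what a browser would display if it decoded
// UTF-8 bytes as ISO-8859-1. Assumes the iconv extension is loaded.
function show_as_latin1(string $utf8Bytes): string
{
    // Treat every byte as one ISO-8859-1 character and re-encode it as UTF-8.
    return iconv('ISO-8859-1', 'UTF-8', $utf8Bytes);
}

// The two characters from the question, written as raw UTF-8 bytes so the
// example does not depend on how this file itself is saved.
$hello = "\xE4\xBD\xA0\xE5\xA5\xBD";

// Six separate Latin-1 characters instead of two Chinese ones.
$garbled = show_as_latin1($hello);
```

The bytes never change; only the charset the browser is told to use does. That is why sending charset=utf-8 in the header fixes the display.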

1) Make sure the file is saved as UTF-8 (without BOM)
2) Tell the browser that it's UTF-8 (as hafichuk explained)
3) Make sure the browser is using a font that has Chinese/Japanese/etc characters (a ton of fonts do not have them -- if you've done 1 and 2, this could very well be the problem)

Related

symbols displayed at run time but not present in code

I am creating a site with html and php.
When I run my PHP page in the browser using localhost (XAMPP server), some symbols (ï»¿) are displayed, but when I check my HTML/PHP code, no symbol or script like ¿ or » is found.
If I am doing something wrong, please let me know.
That's a UTF-8 byte-order marker. You should configure your editor to save UTF-8 without BOM. It isn't mandatory for the UTF-8 encoding; in fact, its use is discouraged and it only causes problems.
Additionally, make sure your web server is sending an appropriate Content-Type HTTP header:
Content-Type: text/html; charset=utf-8
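If you cannot tell from the editor whether a file has a BOM, you can check the first three bytes yourself. This is a rough sketch; the helper names strip_utf8_bom and file_has_utf8_bom are made up here and are not part of PHP.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 byte-order mark, if present.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF"; // the three bytes a BOM-writing editor prepends
    return strncmp($text, $bom, 3) === 0 ? substr($text, 3) : $text;
}

// Hypothetical helper: reports whether a file starts with the UTF-8 BOM.
function file_has_utf8_bom(string $path): bool
{
    // Read only the first three bytes of the file.
    $head = file_get_contents($path, false, null, 0, 3);
    return $head === "\xEF\xBB\xBF";
}
```

Running file_has_utf8_bom() over the PHP files of the site should point at the file that needs to be re-saved without BOM.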
¿ and » can also be written as HTML entities (&iquest; and &raquo;), so they may look different in the PHP code than in the browser. Lists of these entities are easy to find online. You may also have an issue with the BOM.
My best guess: you have an encoding issue (UTF-8 vs. ISO-8859-1). Check which encoding your editor uses when saving, and send that to the browser, e.g. header("Content-type:text/html;charset=UTF-8").
Sounds like you're dealing with a character encoding problem.
Try to declare the encoding in your headers:
header("Content-Type: text/html; charset=UTF-8");
This needs to be sent before any output goes to the client.

Change browser encoding with PHP?

I've got a program on which I have non-ASCII characters which do not show properly on ISO-8859-1. Is there a way to use PHP and change the browser encoding somehow, and also allow the characters to display properly in the browser even though the encoding is ISO-8859-1?
Much Appreciated.
Use the header function to send an (explicit) HTTP Content-Type response header.
header('Content-Type: text/html; charset=ISO-8859-1');
… replacing ISO-8859-1 with whatever encoding you are actually using. Hopefully that will be UTF-8.
You should use the header function:
header('Content-Type: text/html; charset=ISO-8859-1');
Note: make sure no content has been sent to the browser yet, or you can't modify the headers anymore, so I advise you to run this code as early as possible in your script.
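One way to guard against the "headers already sent" situation is to check headers_sent() before calling header(). A small sketch follows; content_type_header and send_content_type are hypothetical helper names, not built-in functions.

```php
<?php
// Hypothetical helper: builds the header line for a given type and charset.
function content_type_header(string $mimeType, string $charset): string
{
    return sprintf('Content-Type: %s; charset=%s', $mimeType, $charset);
}

// Hypothetical helper: sends the header only if that is still possible.
// Returns false when output has already started.
function send_content_type(string $mimeType, string $charset): bool
{
    if (headers_sent($file, $line)) {
        // $file and $line tell you where the first output came from.
        error_log("Cannot set charset, output started at $file:$line");
        return false;
    }
    header(content_type_header($mimeType, $charset));
    return true;
}
```

Calling send_content_type('text/html', 'utf-8') at the very top of the script, before any echo or whitespace outside the PHP tags, is the safest place for it.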
The browser itself doesn't have an encoding. It supports many encodings and uses the one you tell it to. If you specify (in headers and/or HTML) that the encoding is ISO-8859-1, then your document should be in that encoding and you should make sure that all characters you send are in the right encoding. So you should actually send ISO-8859-1 characters. You cannot send a document that uses different encodings for different sections of the document.
For some characters, you can send an HTML entity instead. For instance, é can be sent as &eacute;. This works regardless of the encoding.
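PHP can do the entity conversion for you with htmlentities(). This is only a sketch, and it assumes the input string is UTF-8. The é is written as raw bytes so the example does not depend on how the file is saved.

```php
<?php
// "café" with the é given as raw UTF-8 bytes (0xC3 0xA9).
$text = "caf\xC3\xA9";

// Convert characters that have a named HTML entity into that entity.
// The third argument tells PHP which encoding the input string uses.
$escaped = htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
// $escaped now contains only ASCII: caf&eacute;
```

Note that htmlentities() only touches characters that have a named entity. Chinese or Japanese characters pass through unchanged, so this does not replace sending the correct charset.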
If you have the choice, I'd opt to use UTF-8. It supports any character and you don't have to worry about escaping diacritics or other special characters, except those that are special to HTML/XML itself.
Like others have said, using the header function:
header('Content-type: text/html; charset=ISO-8859-1');
or, if you want to serve valid XHTML files instead of the standard HTML:
header('Content-type: application/xhtml+xml; charset=ISO-8859-1');
It is possible to call the header later on in the script, unlike what RageZ said, but you will need to have enabled output buffering for that, using ob_start().

show hebrew words in a php response

I am sending a JSON response from a PHP script, and it contains some Hebrew words. But when I run this script in a browser, it shows '?' instead of the Hebrew characters.
FYI: the database uses the hebrew_general_ci collation.
Any help would be appreciated.
Thanks.
You may not have included the appropriate CSS file. The charset setting is another common problem with native-language text; use UTF-8 if you are using a Unicode font.
A problem may also occur if the generated EOT font is not good, though I am not sure Hebrew needs such a font.
You need to set the appropriate character set in your HTTP Content-Type header for the HTML page that will ultimately display the data contained in your JSON. Try adding this to the top of your php script:
header('Content-Type: text/html; charset="windows-1255"');
Use a content-type header, as others have suggested:
header('Content-Type: text/html; charset=YOUR_CHOSEN_CHARSET');
Try charset=utf-8, or charset=iso-8859-8. There's a list of character sets over here.
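Since the response is JSON, it may also help to look at what json_encode() does with the Hebrew text. The sketch below assumes the strings are already valid UTF-8 when they reach json_encode(); if the database connection delivers another encoding, json_encode() fails and returns false. The array key and the sample word are made up for illustration.

```php
<?php
// A Hebrew word written with Unicode escapes (PHP 7+ string syntax),
// standing in for a value read from the database.
$word = "\u{05E9}\u{05DC}\u{05D5}\u{05DD}";

// Default behaviour: non-ASCII characters become \uXXXX escapes, so the
// JSON text is plain ASCII and survives any charset the client assumes.
$escapedJson = json_encode(['greeting' => $word]);

// Alternative: keep the characters as UTF-8 bytes in the JSON text.
$rawJson = json_encode(['greeting' => $word], JSON_UNESCAPED_UNICODE);

// Declare the payload as UTF-8 JSON, if headers can still be sent.
if (!headers_sent()) {
    header('Content-Type: application/json; charset=utf-8');
}
```

If json_encode() returns false here, the strings coming from the database are most likely not UTF-8 yet, and that is the place to look first.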

PHP Japanese Strings getting set to?

I have a PHP file with one simple echo function:
echo 'アクセスは撥ねりません。';
but when I access that page i get this:
????????????
Can someone help me?
I also have my page encoding set to UTF-8, and I know it is, because all of the browsers I used said so.
I also do this before the echo function:
mb_internal_encoding('UTF-8');
What does this do?
Does it help me?
All I need is to be able to echo a static Japanese string.
Thanks!
There are a few places where this could go wrong.
Firstly, if you aren't setting the output encoding in php with header()
header('Content-type: text/html; charset=utf-8');
or in your html with a meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
you will need to check the php.ini setting default_charset. Depending on the PHP version and configuration, this may be empty or iso-8859-1 (since PHP 5.6 the built-in default is UTF-8).
Secondly, you may also need to check the content encoding you are saving the php script as. If you are saving it as ASCII or some other latin charset, it will munge the characters.
I got it.
I just had to set the mbstring extension settings to handle internal strings in UTF-8. That extension is standard with my build of PHP 5.3.0.
Maybe you are printing Japanese characters contained in UTF-16 (extended set of chars)?
I just did a quick test and your example works for me, so it's most likely one of these:
Your file is not saved in UTF-8, but some other encoding, such as Shift-JIS. A decent editor should be able to let you see what encoding it used
Your server is sending bad http headers. Can you use some tool to check the headers and paste the results? Or the results you got from the browser?
The browser is using an incompatible font
I saved a file in UTF-8, pasted your code into it, and my server is serving the file with Content-Type: text/html; charset=utf-8 and it shows up just fine. Did not need to use the mb_ function or anything else.
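To rule out the "file is not really UTF-8" case from inside PHP, you can test whether the bytes of a string form valid UTF-8. This is a rough sketch; is_valid_utf8 is a made-up helper name, and it relies on the /u modifier of preg_match() rejecting invalid UTF-8.

```php
<?php
// Hypothetical helper: true if $bytes is a valid UTF-8 byte sequence.
function is_valid_utf8(string $bytes): bool
{
    // With the /u modifier, preg_match() returns false (not 0 or 1)
    // when the subject is not valid UTF-8.
    return preg_match('//u', $bytes) === 1;
}

$asUtf8     = "\xE3\x82\xA2"; // the katakana "a" encoded as UTF-8
$asShiftJis = "\x83\x41";     // the same character encoded as Shift-JIS

$utf8Ok     = is_valid_utf8($asUtf8);     // true
$shiftJisOk = is_valid_utf8($asShiftJis); // false
```

If the literal string in your script fails this check, the file was saved in some other encoding and the header alone will not fix the output.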

PHP: Current encoding used to send data to the browser

How can I know what encoding will be used by PHP when sending data to the browser? I.e. with the Content-Type header, for instance: iso-8859-1.
Apache + PHP servers at web hosts are usually configured to send no charset in the header.
The quickest ways to test how your server is configured are these:
Use this tool to see the server headers returned for any page on your website.
If you see a charset in the server headers, your server is sending it; usually there won't be one.
Another way is to run this simple script on your server: <?php echo ini_get('default_charset'); ?> As said above, this usually prints an empty string; otherwise it shows PHP's default charset.
The second approach assumes Apache is not configured with AddDefaultCharset some_charset. It usually isn't, but if it is, I'm afraid the Apache setting might override PHP's default_charset ini directive.
You can use the header() solution that William suggested; however, if you are running Apache and the Apache config sets a default charset, that can win every time (Internet Explorer will go crazy). See: AddDefaultCharset
Keep in mind that content-types and encodings are two different things. text/html is a content-type; ISO-8859-1 and UTF-8 are encodings.
The HTTP response header that the server sends typically looks like this:
Content-Type: text/html; charset=utf-8
"charset" is actually the character encoding. It's not in a separate header; however there is a header called "Content-Encoding" which actually specifies what kind of compression the response uses (e.g. gzip).
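To make the header format concrete, here is a small sketch that pulls the charset parameter out of a Content-Type value and reads PHP's own default. The function name charset_from_content_type is hypothetical, and the regular expression is a simplification that covers the common forms rather than every case the HTTP grammar allows.

```php
<?php
// Hypothetical helper: extracts the charset parameter from a Content-Type
// header value, or returns null when no charset is given.
function charset_from_content_type(string $headerValue): ?string
{
    // Matches "; charset=utf-8" as well as the quoted form charset="utf-8".
    if (preg_match('/;\s*charset\s*=\s*"?([^";\s]+)"?/i', $headerValue, $m) === 1) {
        return strtolower($m[1]);
    }
    return null;
}

// The charset PHP appends on its own when a script sets none explicitly.
$phpDefault = ini_get('default_charset');
```

Feeding it the Content-Type value you see in your browser's network tools tells you which encoding the client was actually told to use.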
If you want to change the character encoding to UTF-8, in a file that contains HTML:
<?php
header("Content-Type: text/html; charset=utf-8");
You can set your own with header('Content-type: xxx/yyy');, but I believe that text/html is sent by default.
AFAIK, PHP sends strings bytewise. That is, if your variables hold UTF-8, it will send UTF-8. If they hold iso-8859-1, it will send that too. If you mix them, it won't be pretty.
If your server is not configured to have a default content or charset, and neither is PHP, PHP will send only Content-Type: text/html - it won't specify a charset at all, and will send the bytes as it sees them in the script.
If a browser receives a page without charset specified, various things can happen:
most browsers have an "Encoding/Charset" menu; if the user explicitly selects one, the browser will try to apply it. Doesn't happen too often, so:
some browsers try to render it with a default charset (which is locale-dependent, e.g. for FF and cs_CZ it used to be iso-8859-2; YMMV)
IE will try to determine the charset heuristically (it will take a guess, based on character distribution - and many times it gets it right; sometimes it gets it wrong and you get a page in Romanian interpreted as Chinese text, which usually means "unreadable")
some old browsers will fall back on us-ascii
If, after this procedure, the PHP script's charset and the browser's charset match, the text will (accidentally) be readable. If not, there will be weird signs and similar phenomena.
