Unicode to Character Converter In PHP - php

I need to convert Unicode text to characters in PHP. I am using a MySQL database to store text. The text is Unicode, stored with the collation utf8_general_ci. When I retrieve the data and display it, garbled characters appear, like "मिनिसà¥à¤•à¤°à¥à¤Ÿà¤®à¤¾ करà¥à¤•à¥‡ नजर" for the text "मिनिस्कर्टमा कर्के नजर". This is Nepali text in Unicode. I need it as characters or character codes in PHP. I have tried utf8_encode and utf8_decode, but neither worked: decoding displays question marks (????) and encoding displays "à ¤®à ¤¿à ¤¨à ¤¿à ¤¸à ¥Âà ¤•à ¤°à ¥Âà ¤Ÿà ¤®à ¤¾ à ¤•à ¤°à ¥Âà ¤•à ¥‡ à ¤¨à ¤œà ¤°". So, how can I get the character or Unicode code value of each Unicode character from the MySQL database in PHP?

Change the collation to utf8_bin, and add <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> to the head of your pages. Hope it works.

OK, I got it. I used this PHP library and its utf8_chr_to_unicode_style function to convert each Unicode character to its code. I then converted all the codes to the font codes I needed (Preeti Nepali font codes). That's all :).
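For readers without that library, the same per-character conversion can be sketched with built-in functions. This is a minimal sketch, assuming PHP 7.4 or later with the mbstring extension; utf8_code_points() is a hypothetical helper name and is not the library function mentioned above.

```php
<?php
// Split a UTF-8 string into characters and return each Unicode code point.
// utf8_code_points() is a hypothetical helper name, not part of any library.
function utf8_code_points(string $text): array
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern splits between code points rather than between bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // the input was not valid UTF-8
    }
    return array_map(fn(string $ch): int => mb_ord($ch, 'UTF-8'), $chars);
}

// "\xE0\xA4\xAE\xE0\xA4\xBF" is the UTF-8 byte sequence for "मि",
// the first two code points of the sample text in the question.
foreach (utf8_code_points("\xE0\xA4\xAE\xE0\xA4\xBF") as $codePoint) {
    printf("U+%04X\n", $codePoint); // prints U+092E, then U+093F
}
```

Mapping each code point to a legacy font code would then be a lookup table keyed by these integers; that table is specific to the target font and is not shown here.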

Try the iconv function.
It converts text from one encoding to another. Please try some of the conversions at the link provided above. If you cannot figure it out, leave a comment and I will research the subject further.
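As a rough illustration of what an iconv() call looks like, here is a minimal sketch. It assumes the iconv extension is available and that the source encoding is already known (ISO-8859-1 is only an example here); iconv() converts between encodings but does not detect them.

```php
<?php
// Convert a string from a known source encoding to UTF-8 with iconv().
// Argument order is: source encoding, target encoding, string.
$latin1 = "caf\xE9";                       // "café" encoded as ISO-8859-1
$utf8   = iconv('ISO-8859-1', 'UTF-8', $latin1);

// A hex dump shows the single byte E9 became the two-byte sequence C3 A9.
echo bin2hex($utf8), "\n";                 // prints 636166c3a9
```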

I had an issue related to the UTF-8 charset. I searched the web for quite a while, and the best advice was, and still is, to set the header charset to "UTF-8".
I was developing my web application locally using XAMPP (and sometimes WAMP, to compare the two when debugging my code). Everything was working great =). But as soon as I uploaded it online, the result was not so pretty: I got the kind of errors you would get if you had set the headers to a different charset such as "iso-8859-1".
Every header in my code has UTF-8 as the default charset, but I still got the same "hieroglyphic thingies". Then you guys gave me the idea that the issue wasn't my code but the PHP configuration (php.ini) running it. It turned out my local machine was running PHP 5.5, while the cPanel host where I had uploaded my web application was running native PHP 5.3.
When I changed the cPanel default from Native PHP 5.3 to PHP 5.5, believe me guys =), it worked like a charm, just as it did on localhost.
NOTE: If you have the same problem I did, check that your server runs the same PHP version as your development machine (PHP 5.5 in my case). I'm posting this because I feel for you guys. Cheers!

Related

PHP Character Encoding Error: How they do it?

Problem:
I have a textarea that accepts XML as content and posts it to the server. It works fine if the content is all ASCII characters, but when we enter data in Hebrew, simplexml_load_string fails to load it and reports invalid XML, because the encoding of the posted data is broken.
What I did:
I have the HTML meta tag set to UTF-8, and I have the PHP Content-Type header set to UTF-8.
I have MySQL set with 'SET NAMES utf8'.
When I run print_r(iconv_get_encoding('all')); it prints all three values as ISO-8859-1.
When I print $_POST, it shows the Hebrew characters fine in the browser (and in the browser's view-source as well), but the function still fails.
When I change php.ini to set the iconv encoding to UTF-8, everything works fine again.
However:
The same server hosts hundreds of WordPress installations that run Hebrew websites, and they don't have this problem.
So, my question is: why does my code fail while WordPress and other open source software work just fine with this encoding? I did try to set the iconv encoding to UTF-8 in the first executable line, but nothing changed for me.
I'm not sure I have explained my problem and question clearly; if not, please let me know. Thanks.
EDIT: I did try the utf8_encode and utf8_decode functions, but they failed too.
You need to use mb_internal_encoding('UTF-8') to tell PHP which encoding you are using. This overrides the setting from php.ini.
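A minimal sketch of that call, assuming the mbstring extension is loaded. It only demonstrates the effect on mbstring functions and a validity check that could be run on posted data before parsing; whether it resolves the simplexml_load_string failure depends on the rest of the setup.

```php
<?php
// Override the php.ini default for the current request only.
mb_internal_encoding('UTF-8');

// "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D" is the UTF-8 encoding of "שלום".
$hebrew = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D";

// Confirm the bytes are well-formed UTF-8 before handing them to a parser.
$isValidUtf8 = mb_check_encoding($hebrew, 'UTF-8');

// mb_strlen() now counts characters using the internal encoding,
// while strlen() keeps counting raw bytes.
$charCount = mb_strlen($hebrew);   // 4 characters
$byteCount = strlen($hebrew);      // 8 bytes

var_dump($isValidUtf8, $charCount, $byteCount);
```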

UTF-8 encode in php - Eurosign missing

I read in an iCal file from an external server via curl. I use UTF-8 in my output document, so the German umlauts Ä, Ö, Ü etc. did not work (this showed up instead: �).
I correctly assumed the iCal file uses a different charset, and found that utf8_encode($value) partly solves my problem.
All the �s are gone and the proper ä, ö, ü characters show up. However, I discovered that a € sign, which was also displayed as �, does not show up. The � is gone, but nothing appears in its place.
How do I get my € sign back? ;)
Thanks
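One likely explanation: utf8_encode() treats its input as ISO-8859-1, which has no euro sign, whereas Windows-1252 stores € as the single byte 0x80. The sketch below assumes the iCal feed is actually Windows-1252 (the question does not confirm this) and that the mbstring extension is available; $raw is made-up sample data.

```php
<?php
// Assumption: the source data is Windows-1252, where byte 0x80 is "€".
// $raw is hypothetical sample data: "10 € für Äpfel" in Windows-1252.
$raw = "10 \x80 f\xFCr \xC4pfel";

// Converting from Windows-1252 maps 0x80 to U+20AC (the euro sign).
// Converting from ISO-8859-1 instead would map it to an invisible
// control character, which matches the symptom described above.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

echo $utf8, "\n"; // prints "10 € für Äpfel" on a UTF-8 terminal
```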

UTF8 encoded strings not shown correctly in MySQL

I have programmed a crawler to scrape data from a website whose charset is UTF-8. But when I tried to store the contents in MySQL, some special characters (such as Spanish letters) did not show correctly in MySQL.
Here is what I have done:
Put header("Content-Type: text/html; charset=utf-8") in PHP
Set all collations in MySQL to utf8_unicode_ci
Run $conn->query("SET NAMES 'utf8'") upon connection
Double-checked that the HTML I parsed was encoded in UTF-8
So what are some potential problems here?
Maybe you coded your crawler using functions which are not supposed to manage multi-byte characters.
For example strlen instead of mb_strlen.
Try putting:
mb_internal_encoding("UTF-8");
as the first line of your PHP code, and then check whether you have to replace some functions with their respective mb_ versions.
Have a look at the multibyte string reference.
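As a last resort, you may convert the string just before inserting it into MySQL. Note that iconv_get_encoding() returns iconv's configuration settings and does not detect the encoding of a string, so the source encoding has to be known or guessed another way, for example with mb_detect_encoding():
// Guess the source encoding from a candidate list (strict mode), then convert.
$utf8_string = mb_convert_encoding($string, "UTF-8", mb_detect_encoding($string, "UTF-8, ISO-8859-1", true));
This should do the trick, as long as the real encoding is in the candidate list.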
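To illustrate the strlen versus mb_strlen point from this answer, here is a minimal sketch assuming the mbstring extension. It shows how byte-oriented functions can cut a multi-byte character in half, which is one way a crawler can corrupt text before it ever reaches MySQL.

```php
<?php
mb_internal_encoding('UTF-8');

// "año" in UTF-8: 3 characters, but 4 bytes because "ñ" is C3 B1.
$word = "a\xC3\xB1o";

$bytes = strlen($word);     // 4: counts bytes
$chars = mb_strlen($word);  // 3: counts characters

// Cutting by bytes splits "ñ" and leaves an invalid sequence.
$cutByBytes = substr($word, 0, 2);     // "a" plus a dangling C3 byte
$cutByChars = mb_substr($word, 0, 2);  // "añ", still valid UTF-8

var_dump(mb_check_encoding($cutByBytes, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($cutByChars, 'UTF-8')); // bool(true)
```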
Start by checking if the data is stored wrong in the database, in which case the problem is with your crawler. Otherwise the problem is in your presentation.
To test this, I would suggest that you use a dedicated mysql client (Such as the command line client) to inspect data.
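If a dedicated client is not at hand, a hex dump from PHP can serve a similar purpose. The sketch below, assuming the mbstring extension, shows what a correctly stored "ñ" looks like next to a double-encoded one (UTF-8 bytes that were wrongly converted from ISO-8859-1 to UTF-8 a second time); the variable names are illustrative only.

```php
<?php
// Correct UTF-8 for "ñ" is the two bytes C3 B1.
$correct = "\xC3\xB1";

// Simulate double encoding: treat those UTF-8 bytes as ISO-8859-1
// and convert them to UTF-8 again.
$doubleEncoded = mb_convert_encoding($correct, 'UTF-8', 'ISO-8859-1');

echo bin2hex($correct), "\n";        // prints c3b1
echo bin2hex($doubleEncoded), "\n";  // prints c383c2b1
```

Seeing the longer pattern in a value fetched from the database suggests the data was stored wrong, which points at the crawler or the connection charset rather than the presentation layer.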
I remember pulling my hair out in dealing with UTF8 issues until I started adding this to my header:
setlocale(LC_ALL, 'en_US.UTF-8');

PHP urlencode for chinese characters

I'm creating a PHP application that involves sending Chinese characters as URL parameters.
I have to send a query like:
http://xyz.com/?q=新
But the script at xyz.com won't automatically encode the Chinese character. So, I need to explicitly send an encoded string as the parameter. It becomes:
http://xyz.com/?q=%E6%96%B0
The problem is, PHP won't encode the Chinese character properly.
I've tried urlencode() and rawurlencode(), but they give %D0%C2 (which doesn't work for my purpose) instead of %E6%96%B0 (which works well with xyz.com).
I'm using this website to create the latter encoded string.
I've also defined header('Content-Type: text/html; charset=gb2312'); to display Chinese characters properly.
Is there anything I can do to urlencode the Chinese character properly?
Thanks!
PS: I'm a relatively new programmer and don't understand Chinese.
You're URLencoding using the charset you specify in your header. %D0%C2 is 新 in gb2312; %E6%96%B0 is 新 in UTF-8. Switch your charset over to UTF-8 and you should fix this issue and still be able to display Simplified Chinese Han.
In order to reproduce your problem I created a simple PHP file:
<?php
var_dump(urlencode('新'));
?>
First I saved the file with UTF-8 encoding and got %E6%96%B0. Afterwards I changed the file encoding to GB2312 and got %D0%C2.
The tool at http://meyerweb.com/eric/tools/dencoder/ seems to use JavaScript, which is UTF-8 capable and therefore returns %E6%96%B0, too.
PS: When changing a file from GB2312 to UTF-8, some editors might break internationalized code. So please make sure to keep a copy of your file before converting!
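If the page has to stay in GB2312, another option is to convert the parameter to UTF-8 just before encoding it. This is a minimal sketch, assuming the mbstring extension; 'EUC-CN' is the name mbstring uses for the GB2312 byte encoding, and the byte values are the ones quoted in this thread.

```php
<?php
// "\xD0\xC2" is "新" in GB2312, as reported in the question.
$gb2312 = "\xD0\xC2";

// urlencode() works on raw bytes, so it reflects the string's encoding.
$encodedAsIs = urlencode($gb2312);

// Convert to UTF-8 first to get the form the remote script expects.
$utf8 = mb_convert_encoding($gb2312, 'UTF-8', 'EUC-CN');
$encodedUtf8 = urlencode($utf8);

echo $encodedAsIs, "\n";  // prints %D0%C2
echo $encodedUtf8, "\n";  // prints %E6%96%B0
```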

Encoding issue with Apache , displaying diamond characters in browser

Please help me set up an Apache server on CentOS. It looks like an encoding issue, but I have not been able to resolve it yet.
Instead of the rendered HTML content, Chrome and Firefox display the HTML source; IE 9 works fine. A � character is displayed after each "<" symbol.
http://pdf.gen.in/index1.htm
The second problem is with PHP. The page http://pdf.gen.in/index.php displays the PHP source code with similar diamond characters wherever it encounters a "<" character. The PHP issue seems to be related to the first issue.
Those files are encoded with UTF-16LE. For the static HTML page, you might be able to get it to work by setting the charset correctly in the MIME type (it's currently text/html; charset=UTF-8). I don't know how strong PHP's Unicode support is. Try using UTF-8 instead, it's generally more well supported due to its partial overlap with ASCII.
You should use a decent text editor, and always set the encoding of PHP/HTML files to "UTF-8 without BOM".
Create a file named "test.php", paste the code below, and save it with "UTF-8 without BOM" encoding; then it will work just fine.
<?php
phpinfo();
?>
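If re-saving the files in an editor is not an option, they could also be converted with a short script. The sketch below assumes the mbstring extension and that the file begins with the UTF-16LE byte order mark (FF FE); to_utf8_if_utf16le() is a hypothetical helper name, and the sample bytes are made up for illustration.

```php
<?php
// Convert file contents to UTF-8 (without BOM) if they start with the
// UTF-16LE byte order mark. to_utf8_if_utf16le() is a hypothetical helper.
function to_utf8_if_utf16le(string $bytes): string
{
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        // Drop the 2-byte BOM, then convert the rest.
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16LE');
    }
    return $bytes; // no UTF-16LE BOM: leave the content untouched
}

// In UTF-16LE every ASCII character is followed by a 00 byte, which is
// why the server and browsers fail to recognise "<" as the start of a tag.
$utf16 = "\xFF\xFE<\x00p\x00>\x00";
echo to_utf8_if_utf16le($utf16), "\n"; // prints <p>
```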
