I installed Windows 7 with code page 950. Now my PHP script, which sends UTF-8 queries, can no longer run them in MySQL; it reports an invalid UTF-8 character. So my question is: how can I convert the non-ASCII characters of a code page 950 string to UTF-8?
Thanks
You could try this:
iconv(mb_detect_encoding($text, mb_detect_order(), true), "UTF-8", $text);
But I can't guarantee it will work, as it is quite hard to detect a legacy encoding reliably before converting it to UTF-8.
Would it help to try mb_convert_encoding() in PHP?
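If the input is known to come from code page 950 (Microsoft's Big5 variant), detection can be skipped and the source encoding named explicitly. A minimal sketch, assuming the mbstring extension is loaded and accepts the encoding name 'CP950'; the helper name cp950_to_utf8 is made up for illustration:

```php
<?php
// Hypothetical helper: convert a code page 950 (Big5) byte string to UTF-8.
// Assumes ext/mbstring is available and knows the 'CP950' encoding name.
function cp950_to_utf8(string $text): string
{
    return mb_convert_encoding($text, 'UTF-8', 'CP950');
}

// "\xA4\xA4" is the Big5 byte pair for the character 中.
$utf8 = cp950_to_utf8("\xA4\xA4");

// Sanity check that the result is well-formed UTF-8 before sending it to MySQL.
var_dump(mb_check_encoding($utf8, 'UTF-8'));
```

The MySQL connection itself also has to use a UTF-8 character set (for example via mysqli_set_charset), otherwise correctly converted strings may still be rejected.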
Related
I have variables containing Chinese words, and their charset is GB2312. I want to convert them to UTF-8 because I want to save them to a MySQL table with UTF-8 encoding. How can I do that in PHP? I'm using PHP 7.
Here is what I have tried:
I have tried using $myvar = iconv('gb2312', 'utf-8', $myvar); However, some of my variables come back empty if they contain certain characters (invalid UTF-8 characters, maybe?)
I have tried using $myvar = mb_convert_encoding($myvar, 'UTF-8', 'GB2312'); It works better than iconv, but when $myvar contains the characters I mentioned above, they turn into question marks (?)
Please help me, thanks
Update
Here is an example of my Chinese string:
GB2312 (Expected result): 第3章︰林鴻
Using mb_convert_encoding it becomes: 第3章?林?
Using iconv it becomes empty
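One possible cause, offered as a guess rather than a diagnosis: 鴻 and ︰ are not part of the GB2312 character set, but they are covered by its supersets GBK and GB18030. If the data is really in one of those, naming the superset as the source encoding may help. A sketch assuming mbstring supports 'GB18030'; the round trip below only demonstrates that the sample string survives that encoding, since the real input bytes are not shown in the question:

```php
<?php
// Assumption: the real bytes are GBK/GB18030 rather than strict GB2312.
$expected = '第3章︰林鴻';

// Build GB18030 bytes from the UTF-8 sample to stand in for the real input.
$gbBytes = mb_convert_encoding($expected, 'GB18030', 'UTF-8');

// Convert back the way the real data would be converted.
$myvar = mb_convert_encoding($gbBytes, 'UTF-8', 'GB18030');

// True when every character survived the round trip.
var_dump($myvar === $expected);
```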
I am using this scraper for IMDB, and the problem is that some characters come back as hexadecimal character references such as &#xEF; (ï).
I use this scraper with cURL, and the response is a string encoded in UTF-8.
I tried to get the encoding of the string with mb_detect_encoding(), and it answers UTF-8:
$html = $this->geturl("{$imdbUrl}combined");
echo mb_detect_encoding($html); // reports UTF-8
So I have a string with some hex values inside, like this for example:
$var = 'Sa&#xEF;d Taghmaoui'
I tried running $html through utf8_decode(), but no luck: some characters are still in hex.
So I have a few questions:
1- What's the best solution for this? I can imagine different approaches, for example reading the string and using a regex to replace every hex code with its character, but I am not sure this is the best solution, and I also don't know how to write the regex.
2- Can this be solved through cURL? I mean, is there some option to set cURL's encoding to UTF-8, for example?
I tried the functions recode_string, iconv and mb_convert_encoding.
Well, basically my problem is that the response from the scraper is UTF-8 encoded, but before printing the text I need to process the data with these functions:
$var = 'Sa&#xEF;d Taghmaoui'
htmlspecialchars(html_entity_decode($var, ENT_QUOTES, 'UTF-8'), ENT_NOQUOTES, 'UTF-8');
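The call above can be tried in isolation. A minimal sketch, assuming the scraped string carries a hexadecimal character reference (&#xEF; is used as the sample here) and that the page is output as UTF-8:

```php
<?php
// Sample input: a hexadecimal character reference inside otherwise plain text.
$var = 'Sa&#xEF;d Taghmaoui';

// Decode every character reference into its UTF-8 character.
$decoded = html_entity_decode($var, ENT_QUOTES, 'UTF-8');

// Re-escape only what is unsafe in HTML (&, <, >); quotes and ï are left alone.
$safe = htmlspecialchars($decoded, ENT_NOQUOTES, 'UTF-8');

echo $safe, PHP_EOL;
```

Decoding first and escaping afterwards means no regex is needed for the hex codes, and the escaping step keeps any markup characters in the scraped text from being interpreted as HTML.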
I have a database with data in Windows-1253 encoding.
I'm trying to convert it to UTF-8 with the iconv function and display it on a page, but I get characters like these: g óôçí åðüìåíç ôáéíßá ôïõ
Any thoughts?
This is the code I use
iconv(mb_detect_encoding($this->row["question"], mb_detect_order(), true),"UTF-8",htmlentities(stripslashes($this->row["question"])))
If you know the encoding is Windows-1253, skip the detection and simply pass it to iconv directly:
iconv('Windows-1253','UTF-8', $text);
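The order of operations may matter as well: in the question's code, htmlentities() runs before iconv(), so iconv no longer sees the raw Windows-1253 bytes. A sketch of converting first and escaping afterwards, assuming the iconv extension recognises the name 'Windows-1253' (this depends on the underlying iconv library); the sample bytes are taken from the garbled output in the question:

```php
<?php
// Bytes F3 F4 E7 ED spell "στην" in Windows-1253; read as Latin-1 they look like "óôçí".
$raw = "\xF3\xF4\xE7\xED";

// Convert first, while the bytes are still untouched Windows-1253.
$utf8 = iconv('Windows-1253', 'UTF-8', $raw);

// Escape for HTML only after the string is valid UTF-8.
echo htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8'), PHP_EOL;
```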
I am trying to extract text from a Word .DOC file with PHP. Everything seems OK, but the only trouble is that I get something like
СУДОВА БУХГАЛТЕРІЯ
instead of Russian text. I've tried html_entity_decode and utf8_encode, but they didn't help. Is there a simple solution?
html_entity_decode should work with the proper parameters (the explicit charset matters on PHP versions before 5.4.0):
html_entity_decode($str, ENT_QUOTES, 'UTF-8')
This converts the character references into UTF-8. Before PHP 5.4.0, the charset parameter's default value was ISO-8859-1; in that case the Cyrillic characters can't be converted, because the ISO 8859-1 character set doesn't contain them.
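As a small illustration of that call, here is a sketch with a made-up input: the question does not show the exact references in the extracted text, so the decimal references below (for the letters С, У, Д) are only an assumption about their form.

```php
<?php
// Hypothetical input: decimal character references for the Cyrillic letters С, У, Д.
$str = '&#1057;&#1059;&#1044;';

// Passing 'UTF-8' explicitly avoids depending on the PHP version's default charset.
$text = html_entity_decode($str, ENT_QUOTES, 'UTF-8');

echo $text, PHP_EOL;
```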
While working with Froogle, the data feed keeps running into encoding problems in some article descriptions.
The script, the strings and the output are all UTF-8 encoded, but I can't find the characters that cause the problem.
Is there a way to detect troublesome characters?
Try running your string through the htmlentities function:
echo htmlentities($your_str, ENT_QUOTES);
Then use the html_entity_decode function, passing 'UTF-8' as the charset parameter, to read it back.
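To narrow down which descriptions are affected, one option is to check each string for well-formed UTF-8. This is a sketch rather than a complete diagnosis: it assumes the mbstring extension is loaded, the helper name find_invalid_utf8 and the sample feed are invented for illustration, and it only catches byte sequences that are not valid UTF-8, not valid characters that the feed rejects for other reasons.

```php
<?php
// Hypothetical helper: return the keys of all strings that are not valid UTF-8.
function find_invalid_utf8(array $descriptions): array
{
    $bad = [];
    foreach ($descriptions as $key => $text) {
        if (!mb_check_encoding($text, 'UTF-8')) {
            $bad[] = $key;
        }
    }
    return $bad;
}

// Made-up sample data standing in for the article descriptions.
$feed = [
    'sku-1' => 'Plain ASCII description',
    'sku-2' => "Caf\xE9 table", // lone 0xE9 byte: Latin-1 é, invalid in UTF-8
];

print_r(find_invalid_utf8($feed));
```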