The strange encoding occurs when the article includes letters like "ş, ç, ö, İ". How can I fix this?
Kesme \u015fekere benzeyen, kire\u00e7 beyaz\u0131 evleriyle kar\u015f\u0131l\u0131yor BODRUM bizi..T\u0131pk\u0131, \u00e7ocuklu\u011fumuzdaki gibi.. Pencerelerdeki mavi \u00e7izgiler, denizin g\u00fcl\u00fcmsemesi adeta.. Ve denizden esen ilk r\u00fczgar bir\u00e7ok kokuyla ho\u015fgeldin diyor bize..Eski sevgililer, aile, dostluklar, an\u0131lar.. Film \u015feridi gibi ge\u00e7iyor \u00f6n\u00fcm\u00fczden
Are you sending this data in JSON? If so, that's standard encoding for Unicode characters, and you can simply run it through json_decode to decode them.
If not, more information will be needed to help you.
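As a minimal sketch of that suggestion, assuming the text arrives inside a JSON document: the `$payload` below is a hypothetical example built from the start of the sample above, and the `text` key is an assumption, not something the question specifies.

```php
<?php
// Hypothetical payload: a JSON document carrying the escaped article text.
$payload = '{"text":"Kesme \u015fekere benzeyen, kire\u00e7 beyaz\u0131 evleriyle"}';

// json_decode() turns every \uXXXX escape into the matching UTF-8 character.
$data = json_decode($payload, true);
$text = $data['text'];
echo $text, PHP_EOL;

// If you control the producing side, JSON_UNESCAPED_UNICODE keeps the
// characters literal instead of escaping them in the first place.
$reencoded = json_encode(['text' => $text], JSON_UNESCAPED_UNICODE);
echo $reencoded, PHP_EOL;
```

Both forms are valid JSON, so the escaped version is not corrupt; it only looks strange when printed without decoding.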
I am trying to use an Ajax-based API to get content in Urdu, but whenever I access the API I see strange characters, which I think were not encoded properly by the server before it returned the results.
The api endpoint
To return proper Urdu characters, the server would need to call the mb_convert_encoding function before sending the data to the client, but since it's a public API and I can't access their server, I can't do this.
I want to convert them back to proper Urdu characters
Something like this
$strangeLetters = '\u06c1\u062a\u06d2 \u0641\u0644\u0633\u0637\u06cc\u0646\u06cc\u0648\u06ba \u067e\u0631 \u0627\u0633\u0631\u0627\u0626\u06cc\u0644\u06cc \u062c\u0627\u0631\u062d\u06cc\u062a \u062c\u0646\u06af\u06cc \u062c\u0631\u0627\u0626\u0645 \u06a9\u06d2 \u0632\u0645\u0631\u06d2 \u0645\u06cc\u06ba \u0622\u062a\u06cc \u06c1\u06d2\u060c \u0648\u0632\u06cc\u0631\u0627\u0639\u0638\u0645';
$properUrduCharacters = someFunction($strangeLetters);
echo $properUrduCharacters;
Result:
ہتے فلسطینیوں پر اسرائیلی جارحیت جنگی جرائم کے زمرے میں آتی ہے، وزیراعظم
The quick and easy way to show Unicode data with PHP:
echo json_decode('"\u06c1"');
Other solutions here:
How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?
For your example:
<?php
$strangeLetters = '\u06c1\u062a\u06d2 \u0641\u0644\u0633\u0637\u06cc\u0646\u06cc\u0648\u06ba \u067e\u0631 \u0627\u0633\u0631\u0627\u0626\u06cc\u0644\u06cc \u062c\u0627\u0631\u062d\u06cc\u062a \u062c\u0646\u06af\u06cc \u062c\u0631\u0627\u0626\u0645 \u06a9\u06d2 \u0632\u0645\u0631\u06d2 \u0645\u06cc\u06ba \u0622\u062a\u06cc \u06c1\u06d2\u060c \u0648\u0632\u06cc\u0631\u0627\u0639\u0638\u0645';
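// Split the text at every literal "\u" marker.
$strange = explode('\u', $strangeLetters);
foreach ($strange as $letter) {
    // The text starts with "\u", so the first piece is empty; skip it.
    if ($letter === '') {
        continue;
    }
    // Rebuild each escape as a JSON string literal and decode it to UTF-8.
    echo json_decode('"\u' . $letter . '"');
}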
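If the escaped text is not part of a JSON document, a regex-based decoder is another option. This is a sketch, not the only way to do it: `decodeUnicodeEscapes` is a hypothetical helper name, and it assumes the mbstring extension is available. It converts runs of `\uXXXX` escapes from UTF-16BE to UTF-8, so surrogate pairs are handled as well.

```php
<?php
// Hypothetical helper: decode literal \uXXXX escapes in plain text to UTF-8.
function decodeUnicodeEscapes(string $text): string
{
    // Match runs of consecutive escapes so surrogate pairs stay together.
    return preg_replace_callback(
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function (array $m): string {
            // Drop the "\u" prefixes, leaving raw UTF-16BE hex digits.
            $hex = str_replace('\u', '', $m[0]);
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $text
    );
}

$strangeLetters = '\u06c1\u062a\u06d2 \u0641\u0644\u0633\u0637\u06cc\u0646\u06cc\u0648\u06ba';
$properUrduCharacters = decodeUnicodeEscapes($strangeLetters);
echo $properUrduCharacters, PHP_EOL;
```

Unlike wrapping the whole string in quotes and passing it to json_decode, this leaves quotes, backslashes, and other characters in the text untouched.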
I'm trying to correct an encoding error.
For example, a string which should read "Morgan Pålsson - världsreporter" has been encoded as "Morgan P\xc3\xa5lsson - v\xc3\xa4rldsreporter".
How do I convert "\xc3\xa5" back to "å" and "\xc3\xa4" back to "ä"?
I've tried combinations of various encode/decode functions and iconv, but no luck.
This seems like it should be straightforward. Any ideas?
We had this problem when decoding strings coming from Salesforce via the SOAP interface. Our solution to this "\xc3\xa4" problem looks really weird, but it works. Note that this is a Python solution, but maybe you can apply this to PHP as well! :)
decoded_string = encoded_string.encode('raw_unicode_escape').decode('unicode_escape').encode('latin1').decode('utf-8')
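A rough PHP counterpart, assuming the string really contains the literal four-character sequences `\xc3`, `\xa5`, and so on (a backslash, an `x`, and two hex digits) rather than the raw bytes. `decodeHexEscapes` is a hypothetical helper name used only for this sketch.

```php
<?php
// Hypothetical helper: turn each literal "\xHH" sequence into the byte it names.
function decodeHexEscapes(string $text): string
{
    return preg_replace_callback(
        '/\\\\x([0-9a-fA-F]{2})/',
        function (array $m): string {
            // hexdec() gives the byte value, chr() produces that single byte.
            return chr(hexdec($m[1]));
        },
        $text
    );
}

// Single quotes keep the backslashes literal, as in the broken data.
$broken = 'Morgan P\xc3\xa5lsson - v\xc3\xa4rldsreporter';
$fixed = decodeHexEscapes($broken);
echo $fixed, PHP_EOL;
```

The restored bytes are already UTF-8, so no further conversion is needed. If the backslashes only appear in debug output and the string actually holds the raw bytes, the data is fine and only the display is misleading.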
I'm facing an issue with special characters. I'm reading data from an MSSQL database, which returns values in PHP that may contain special characters like "à é ö ü". In my sample I use the city name Zürich; when I try to insert it into a MySQL database, I get the following error:
"Incorrect string value: '\xFCrich ...' for column..."
So I did the following, but it still shows the same error message:
$arrSearch = array('\xE4','\xF6','\xFC','\xC4','\xD6','\xDC','\xDF');
$arrReplace = array('ä','ö','ü','Ä','Ö','Ü','ß',);
$City=str_replace($arrSearch, $arrReplace, $City);
If I echo $City, I get the following:
Z�rich (rectangular block)
I've also tried hex2bin(), but I just get a blank page and nothing is inserted into the database. FYI, the DB collation is utf8mb4_general_ci and setlocale(LC_ALL, 'en_EN') is set in the PHP file. All PHP files are encoded in UTF-8, and the charset is set as follows: mysql_set_charset('utf8mb4',$link);
I must admit, I'm a bit lost. Does anyone have a clue how to fix this?
Thanks.
EDIT: The server hosting this app runs 2008 R2/IIS 7.5, and I found this KB article by Microsoft. I tried the hotfix and the registry modification, but it didn't work. http://support.microsoft.com/kb/2277918/
Set the character set to utf8.
OK, got it! I was so stupid... I'm using FPDF with that insertion, and to show special characters properly in FPDF I had to call iconv('UTF-8', $charset, $_REQUEST['City']);
Sorry, and thanks again for the assistance! Now it works like a charm.
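For the original "Incorrect string value: '\xFCrich'" error, the byte 0xFC is "ü" in ISO-8859-1 and is not valid UTF-8 on its own. Note also that '\xFC' in single quotes is a literal backslash followed by "xFC", not the byte, so the str_replace attempt in the question never matches. A minimal sketch of converting such a value before the insert, assuming the MSSQL driver returns ISO-8859-1 (it could equally be Windows-1252) and that mbstring is available; `toUtf8` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: return $value as UTF-8, converting only when needed.
function toUtf8(string $value, string $sourceEncoding = 'ISO-8859-1'): string
{
    // Already valid UTF-8: leave it alone to avoid double encoding.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Assumption: anything else is in $sourceEncoding.
    return mb_convert_encoding($value, 'UTF-8', $sourceEncoding);
}

// Double quotes make "\xFC" the single byte 0xFC, as the driver would return it.
$city = "Z\xFCrich";
$cityUtf8 = toUtf8($city);
echo bin2hex($cityUtf8), PHP_EOL;
```

The validity check is a heuristic: a short ISO-8859-1 string can happen to be valid UTF-8, so when the source encoding is known for certain, converting unconditionally is the safer choice.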
I am facing a small issue in my project when I try to store German words in a MySQL database. When the words contain umlauts and similar characters (ä, ö, ß, ü, etc.), they are not stored as they are.
I want to store them as-is in the database. To do so, I tried changing the collation to utf8_general_ci, and to others in the list, using phpMyAdmin. But none of them works for me.
Am I on the right track, or do I have to do something else?
Please suggest some help.
Thanks in advance.
You have to choose the right connection encoding as well. Call
SET NAMES utf8
before inserting the data, and make sure that the German words are UTF-8 encoded before inserting.
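With PDO, the connection charset can also be set in the DSN, which has the same effect as issuing SET NAMES right after connecting. The sketch below is illustrative only: `buildMysqlDsn` and `connectUtf8` are hypothetical helper names, the host and database values are placeholders, and it uses utf8mb4 (MySQL's full 4-byte UTF-8) rather than the older 3-byte utf8.

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN that fixes the connection charset.
function buildMysqlDsn(string $host, string $database, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $database, $charset);
}

// Hypothetical helper: open a connection whose client/server exchange is UTF-8.
function connectUtf8(string $host, string $database, string $user, string $password): PDO
{
    return new PDO(buildMysqlDsn($host, $database), $user, $password, [
        // Throw exceptions so encoding errors are not silently ignored.
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

// Placeholder values; connectUtf8() is not called here because it needs a live server.
$dsn = buildMysqlDsn('localhost', 'example_db');
echo $dsn, PHP_EOL;
```

The table and column character sets still have to be UTF-8 as well; the DSN only controls how bytes are interpreted on the connection.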
Try using utf8_encode($string) to encode your text as UTF-8 before saving it to the database (note that utf8_encode assumes the input is ISO-8859-1). For characters to display correctly in a given language, you have to (1) encode the text in the right charset and (2) set the database to the right charset (as you did).
Also, if for example display.php outputs the German text, you can open that file in an editor (EmEditor?), choose "Save as", and pick the right encoding. After that, the file will take care of the charset when it outputs the text.
Years ago I faced the same problem. I solved it by explicitly setting the NAMES option for MySQL. In my code it looks like this:
// inside AbstractMapper class
public function __construct($modelClass, $dbTable) {
    $this->setDbTable($dbTable);
    $stmt = new Zend_Db_Statement_Pdo($this->getDbTable()->getAdapter(), 'set names utf8');
    $stmt->execute();
    $this->_model_class = $modelClass;
}
After connecting to the database, run the following statement:
SET NAMES XXX
Replace XXX with the charset you are working in.
This has been bugging me for ages and I want to get to the bottom of it once and for all. I have an associative array whose keys I have defined using ISO-8859-1 characters. For instance:
array("utført" => "red");
I also have another array that I have loaded from a file. I have printed this array out in a browser, checking that values like Æ, Ø and Å are intact. When I try to compare two fields from these arrays, I'm slapped with the message:
Undefined index: utfã¸rt on line 39
I can't help but sob. Every single damn time I involve any letters outside UTF-8 in a script they are at some point converted into ã¸r or similar nonsense.
My script file is encoded in ISO-8859-1, the document from which I'm loading my data is the same, and so is the MySQL table I'm trying to save the data to.
So the only conclusion I can draw is that PHP isn't accepting just any character set in its code, and I have to somehow force PHP to speak Norwegian.
Thanks for any suggestions
Just FYI, I won't accept any answers along the lines of "Just don't use those characters" or "Just replace those characters with UTF equivalents at file load" or any other hack solutions.
When you read your data from the external file, try converting it to the proper encoding.
Something like this is what I have in mind:
$f = file_get_contents('externaldata.txt');
// Convert to ISO-8859-1; with no third argument, the source is assumed to be PHP's internal (default) encoding.
$f = mb_convert_encoding($f, 'iso-8859-1');
// from this point on, do whatever you want with $f
Also, see the mb_convert_encoding() manual for more info.
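The "utfã¸rt" in the error message looks like the UTF-8 bytes of "ø" (0xC3 0xB8) being read as ISO-8859-1, which suggests the loaded data is actually UTF-8 while the script's keys are ISO-8859-1. A small sketch of that mismatch, under that assumption; the variable names are hypothetical and the bytes are written as hex escapes so the example does not depend on the file's own encoding.

```php
<?php
// Key as written in an ISO-8859-1 script: "ø" is the single byte 0xF8.
$colors = ["utf\xF8rt" => 'red'];

// Same word as loaded from a UTF-8 source: "ø" is the two bytes 0xC3 0xB8.
$loadedKey = "utf\xC3\xB8rt";

// PHP compares array keys byte by byte, so the two spellings do not match.
$before = isset($colors[$loadedKey]);

// Name both encodings explicitly instead of relying on the default source.
$normalizedKey = mb_convert_encoding($loadedKey, 'ISO-8859-1', 'UTF-8');
$after = isset($colors[$normalizedKey]);

echo $before ? 'match' : 'no match', ' / ', $after ? 'match' : 'no match', PHP_EOL;
```

PHP itself does not care which encoding the source file uses; it only requires both sides of a comparison to use the same bytes, so converting once at load time is enough.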