Hello, I have a German client and I am receiving strings containing German characters, which I am trying to display properly in the output. I tried utf8_encode to convert the strings, but it is not working for me.
Code:
echo "Desc Short=>". utf8_encode($obj->Desc_Short) . "<br>\r\n";
echo "Desc Long=>". utf8_encode($obj->Desc_Long) . "<br>\r\n";
Output:
Desc Short=>Ablagefach mittig in Gepäckraumtrennwand;ESACO_UG(122)
Desc Long=>Ablagefach mittig in Gepäckraumtrennwand inkl. verschiebbarem Haltenetz
It seems you simply need to use utf8_decode and set the encoding with a PHP header (or set the encoding in the HTML document).
For the following code:
<?php
header( 'Content-type: text/html; charset=utf-8' );
$x = 'Ablagefach mittig in Gepäckraumtrennwand;ESACO_UG(122)';
echo utf8_decode($x);
Output for this is:
Ablagefach mittig in Gepäckraumtrennwand;ESACO_UG(122)
Your output indicates that the string is already utf-8 encoded.
Either you would have to use utf8_decode() to get the umlaut or - better - change any component in your application to properly handle utf-8. :)
Pass the string through the utf8_decode function.
Try:
utf8_decode($obj->Desc_Short)
utf8_decode($obj->Desc_Long)
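A minimal sketch of the "handle UTF-8 properly" route suggested above: convert only when a string is not already valid UTF-8, so text that is already encoded is never encoded a second time. The helper name ensureUtf8 and the ISO-8859-1 fallback are assumptions for illustration, not part of the original code. utf8_encode() and utf8_decode() are deprecated as of PHP 8.2, so the sketch uses mb_convert_encoding() instead.

```php
<?php
// Hypothetical helper (name and fallback charset are assumptions).
// Returns the string unchanged when it is already valid UTF-8,
// otherwise converts it from the fallback charset to UTF-8.
function ensureUtf8(string $s, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s;
    }
    return mb_convert_encoding($s, 'UTF-8', $fallback);
}

// "\xE4" is "ä" in ISO-8859-1; on its own it is not valid UTF-8.
$latin1 = "Gep\xE4ckraumtrennwand";
$fixed  = ensureUtf8($latin1);
echo $fixed, "\n";
```

Combined with a Content-Type header that declares charset=utf-8, the page can then stay in UTF-8 from end to end.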
Related
I have been reading about character encoding for a few days, and I want to serve all my pages as UTF-8 for compatibility. But I got stuck when I tried to convert user input to UTF-8. It works in all browsers except Internet Explorer (as always).
I don't know what's wrong with my code; it seems fine to me.
I set the header with the character encoding.
I saved the file as UTF-8 (no BOM).
This happens only if you access the page via $_GET in Internet Explorer: myscript.php?c=äüöß
When I write special characters directly into my page, they are displayed correctly.
This is my Code:
// User Input
$_GET['c'] = "äüöß"; // Access URL ?c=äüöß
//--------
header("Content-Type: text/html; charset=utf-8");
mb_internal_encoding('UTF-8');
$_GET = userToUtf8($_GET);
function userToUtf8($string) {
if(is_array($string)) {
$tmp = array();
foreach($string as $key => $value) {
$tmp[$key] = userToUtf8($value);
}
return $tmp;
}
return userDataUtf8($string);
}
function userDataUtf8($string) {
print("1: " . mb_detect_encoding($string) . "<br>"); // Shows: 1: UTF-8
$string = mb_convert_encoding($string, 'UTF-8', mb_detect_encoding($string)); // Convert non UTF-8 String to UTF-8
print("2: " . mb_detect_encoding($string) . "<br>"); // Shows: 2: ASCII
$string = preg_replace('/[\xF0-\xF7].../s', '', $string);
print("3: " . mb_detect_encoding($string) . "<br>"); // Shows: 3: ASCII
return $string;
}
echo $_GET['c']; // Shows nothing
echo mb_detect_encoding($_GET['c']); // ASCII
echo "äöü+#"; // Shows "äöü+#"
The most confusing part is that it tells me the string was converted from UTF-8 to ASCII. Can someone tell me why the special characters are not displayed correctly and what is wrong here? Or is this a bug in Internet Explorer?
Edit:
If I disable the conversion, it says everything is UTF-8, but the characters are still not shown. They are displayed like "????".
Note: This happens ONLY in Internet Explorer!
I prefer using URL-encoded strings in the address bar, but in your case you can try encoding $_GET['c'] to UTF-8, e.g.:
$_GET['c'] = utf8_encode($_GET['c']);
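A short sketch of the URL-encoded approach mentioned above: if links are built with rawurlencode(), the browser only ever sees ASCII percent-escapes of the UTF-8 bytes and does not have to choose an encoding itself. The script name and parameter are taken from the question; the variable names are illustrative.

```php
<?php
// Build the link on the server so the query string is plain ASCII.
$value = 'äüöß';
$url   = 'myscript.php?c=' . rawurlencode($value);
echo $url, "\n"; // myscript.php?c=%C3%A4%C3%BC%C3%B6%C3%9F

// PHP decodes $_GET values automatically; rawurldecode() shows the round trip.
$decoded = rawurldecode('%C3%A4%C3%BC%C3%B6%C3%9F');
echo $decoded, "\n"; // äüöß
```

This does not help when a user types the characters into the address bar by hand, which is the case the question describes.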
An approach to displaying the characters that worked in IE 11.0.18:
Retrieve the Unicode code point of your character: for example, 'ü' = 'U+00FC'
According to this post, convert it to an HTML entity and decode that entity to UTF-8
Decode it using utf8_decode before dumping
The line of code illustrating the example with the 'ü' character is:
var_dump(utf8_decode(html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", 'U+00FC'), ENT_NOQUOTES, 'UTF-8')));
To summarize: for display purposes, go from the Unicode code point to UTF-8, then decode it before displaying it.
Other resources:
a post to retrieve characters' unicode
I am passing my message to an SMS API.
This is the documentation:
Normally Unicode Messages are Arabic and Chinese Message, which are
defined by GSM Standards. Unicode messages are nothing but normal text
type messages but it has to be submitted in HEX form. To submit
Unicode messages following Url to be used.
I tried bin2hex(), but it does not produce the output I need.
$str = '人';
//$str = 'a';
$output = bin2hex($str);
echo $output;
//output
//人 = e4baba ; I would expect '4EBA'
I found a similar solution, but it is in VB.NET. Can anyone convert it?
http://www.supportchain.com/index.php?/Knowledgebase/Article/View/28/7/unable-to-send-sms-with-chinese-character-using-api
This is the sample I tried, and it works:
Example of conversion: a converted to hexadecimal is 0061, 人 converted to hexadecimal is 4EBA
The issue you are facing has to do with encoding. Since these are considered special characters, you need to add some encoding details when converting to hex.
Each of these outputs exactly what you were looking for when I run them:
echo bin2hex(iconv('UTF-8', 'ISO-10646-UCS-2', '人')) . PHP_EOL;
//Outputs 4eba
echo bin2hex(iconv('UTF-8', 'UNICODE-1-1', '人')) . PHP_EOL;
//Outputs 4eba
echo bin2hex(iconv('UTF-8', 'UTF-16BE', '人')) . PHP_EOL;
//Outputs 4eba
Pick whichever one you fancy.
If you want to convert back:
echo iconv('UTF-16BE', 'UTF-8', hex2bin('4eba')) . PHP_EOL;
//outputs 人
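The conversion above can be wrapped in a small helper that also produces the uppercase form shown in the question. This is a sketch only: the function name utf8ToUtf16Hex is an assumption, and whether the SMS API expects uppercase hex is a guess based on the '4EBA' example.

```php
<?php
// Hypothetical helper: converts a UTF-8 string to UTF-16BE and returns
// the bytes as uppercase hex, four hex digits per BMP character.
function utf8ToUtf16Hex(string $utf8): string
{
    $utf16 = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');
    return strtoupper(bin2hex($utf16));
}

echo utf8ToUtf16Hex('a'), "\n";  // 0061
echo utf8ToUtf16Hex('人'), "\n"; // 4EBA
```

Naming the byte order explicitly (UTF-16BE) avoids depending on the platform's default endianness. Characters outside the Basic Multilingual Plane come out as surrogate pairs (eight hex digits), which the API may or may not accept.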
Is there any way to decode this string?
Actual string: 其他語言測試 - testing
It is Base64-encoded when sent as a mail subject:
"=?iso-2022-jp?B?GyRCQjZCPjhsOEBCLDtuGyhCIC0gdGVzdGluZw==?="
<?php
echo base64_decode('GyRCQjZCPjhsOEBCLDtuGyhCIC0gdGVzdGluZw==');
?>
This is Base64-encoded, and I couldn't decode it to the actual Chinese string. Since it has been encoded using "iso-2022-jp", I also tried the online base64decode.org site to decode this string, but I couldn't recover the original string. How can I do that?
Use iconv():
<?php
$input = base64_decode('GyRCQjZCPjhsOEBCLDtuGyhCIC0gdGVzdGluZw==');//$BB6B>8l8#B,;n(B - testing
$input_encoding = 'iso-2022-jp';
echo iconv($input_encoding, 'UTF-8', $input); //其他語言測試 - testing
?>
What you are looking at is MIME header encoding. It can be decoded by mb_decode_mimeheader(), and generated by mb_encode_mimeheader(). For example:
<?php
mb_internal_encoding("utf-8");
$subj = "=?iso-2022-jp?B?GyRCQjZCPjhsOEBCLDtuGyhCIC0gdGVzdGluZw==?=";
print mb_decode_mimeheader($subj);
?>
其他語言測試 - testing
(The call to mb_internal_encoding() is necessary here because the contents of the subject line can't be represented in ISO-8859-1, which was the default internal encoding in older PHP versions.)
Try encoding the string to UTF-8 first and then encode it to base 64.
Same when decoding, decode the string from base64 and then from UTF-8.
This is working for me:
php > $base = "其他語言測試 - testing";
php > $encoded = base64_encode(utf8_encode($base));
php > $decoded = utf8_decode(base64_decode($encoded));
php > echo ($decoded === $base) . "\n";
1
I have a test text which I post with an AJAX call (jQuery):
čéáűőúöüó é$ߤ÷׸¨¸˝¨´~˘˝°´˛>*čéáűőúöüó$>*ß$÷×÷;$¨˝´>$đ;ä
I just write the very same text back in the response:
<?php
$text=$_POST["text"];
echo "\n\nUTF8_DECODE:\n";
echo utf8_decode($text);
echo "\nISO8859-2 -> UTF-8:\n";
echo iconv("ISO-8859-2","UTF-8",$text);
echo "\nUTF-8 -> ISO-8859-2 \n";
echo iconv("UTF-8","ISO-8859-2",$text);
?>
The result should be:
UTF8_DECODE:
?éá??úöüó é$ߤ÷׸¨¸?¨´~??°´?>*?éá??úöüó$>*ß$÷×÷;$¨?´>$?;ä
ISO8859-2 -> UTF-8:
čéáűőúöüó
é$ߤ÷׸¨¸˝¨´~˘˝°´˛>*čéáűőúöüó$>*ß$÷×÷;$¨˝´>$đ;ä
UTF-8 -> ISO-8859:
ĂÂÄĹ ÄÄÄšÄÄšÂÄĹÄĹÄĹşÄĹ ÄĹ
$ÄÂäÄËÄÂøèøĂÂèô~ĂÂĂÂĂ°Ă´ĂÂ>*ĂÂÄĹ
ÄÄÄšÄÄšÂÄĹÄĹÄĹşÄĹ$>*ÄÂ$ÄËÄÂÄË;$èĂÂĂ´>$ĂÂ;Ĥ
But it is:
UTF8_DECODE:
?éá??úöüó é$ߤ÷׸¨¸?¨´~??°´?>*?éá??úöüó$>*ß$÷×÷;$¨?´>$?;ä
ISO8859-2 -> UTF-8:
ĂÂÄĹ ÄÄÄšÄÄšÂÄĹÄĹÄĹşÄĹ ÄĹ
$ÄÂäÄËÄÂøèøĂÂèô~ĂÂĂÂĂ°Ă´ĂÂ>*ĂÂÄĹ
ÄÄÄšÄÄšÂÄĹÄĹÄĹşÄĹ$>*ÄÂ$ÄËÄÂÄË;$èĂÂĂ´>$ĂÂ;Ĥ
UTF-8 -> ISO-8859-2:
čéáűőúöüó
é$ߤ÷׸¨¸˝¨´~˘˝°´˛>*čéáűőúöüó$>*ß$÷×÷;$¨˝´>$đ;ä
My question is: why is that? What am I missing?
My text is in ISO-8859-2 and I want to convert it to UTF-8, so why do I need to use the opposite direction when the manual says:
string iconv ( string $in_charset , string $out_charset , string $str
)
Performs a character set conversion on the string str from in_charset
to out_charset.
Maybe the AJAX request already encoded the ISO-8859-2 characters as UTF-8?
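A small sketch that is consistent with that guess, on the assumption that the posted text reaches PHP as UTF-8 (jQuery transmits AJAX data as UTF-8) and that the response is displayed as ISO-8859-2. Under those assumptions, converting from ISO-8859-2 to UTF-8 encodes the text a second time, while converting from UTF-8 to ISO-8859-2 yields bytes the page can display. The sample character is illustrative.

```php
<?php
// 'é' as it would arrive in a UTF-8 request: two bytes, C3 A9.
$text = 'é';

// The input is already valid UTF-8.
$isUtf8 = mb_check_encoding($text, 'UTF-8');

// Treating those two bytes as ISO-8859-2 reads them as two separate
// characters (C3 = Ă, A9 = Š) and re-encodes each one: double encoding.
$doubleEncoded = iconv('ISO-8859-2', 'UTF-8', $text);

// The "opposite" direction produces the single Latin-2 byte E9.
$latin2 = iconv('UTF-8', 'ISO-8859-2', $text);

var_dump($isUtf8);           // bool(true)
echo $doubleEncoded, "\n";   // ĂŠ
echo bin2hex($latin2), "\n"; // e9
```

Checking the raw bytes with bin2hex($_POST["text"]) would confirm which encoding actually arrives.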
My php script gives out this string (for example) for JSON:
{"time":"0:38:01","kto":"\u00d3\u00e1\u00e8\u00e2\u00f6\u00e0 \u00c3\u00e5\u00ed\u00e5\u00f0\u00e0\u00eb\u00ee\u00e2","mess":"\u00c5\u00e4\u00e8\u00ed\u00fb\u00e9: *mm"}
jQuery code gets this string through JSON:
$.getJSON('chat_ajax.php?q=1',
function(result) {
alert('Time ' + result.time + ' Kto' + result.kto + ' Mess' + result.mess);
});
The browser shows:
0:38:01 Óáèâöà Ãåíåðàëîâ
Åäèíûé: *mm
How can I decode this string to Cyrillic?
I tried using:
<META http-equiv="content-type" content="text/html; charset=windows-1251">
but nothing changed.
PHP Code:
$res1 = mysqli_query($dbc, "SELECT * FROM chat ORDER BY id DESC LIMIT 1");
while ($row1 = mysqli_fetch_array($res1)) {
$rawArray = array('time' => date("G:i:s", ($row1['time'] + $plus)), 'kto' => $row1['kto'], 'mess' => $row1['mess']);
$encodedArray = array_map('utf8_encode', $rawArray);
echo json_encode($encodedArray);
}
PHP ver 5.3.19
\uXXXX stands for a Unicode character, and in Unicode 00d3 is Ó, and so on. Unicode characters are unambiguous, so the character encoding of the page is ignored for them. You could use the correct Unicode escape (i.e. \u0423 for У) or write your script so that it outputs the real characters in Windows-1251 instead of Unicode sequences.
Update
I see from your comment that you fetch this data from MySQL and use json_encode() to output it. json_encode only works with UTF-8 encoded data. utf8_encode treats your Windows-1251 bytes as ISO-8859-1, where d3 is Ó, and this is why you get the wrong Unicode sequences.
So, you will have to convert all data from Windows-1251 to UTF-8 before passing it to json_encode, then everything else will work fine.
Converting:
$utf8Array = array_map(function($in) {
return iconv('Windows-1251', 'UTF-8', $in);
}, $rawArray);
utf8_encode will not work because it is only useful for input in ISO-8859-1 encoding.
I had a similar problem when storing JSON data in a MySQL database. This solved it:
json_encode($json_data, JSON_UNESCAPED_UNICODE) ;
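To illustrate both suggestions together, here is a minimal sketch using the first word of the "kto" value from the question, on the assumption that the database returns Windows-1251 bytes. The variable names are illustrative. JSON_UNESCAPED_UNICODE requires PHP 5.4 or later, so it is not available on the PHP 5.3.19 mentioned in the question.

```php
<?php
// Bytes D3 E1 E8 E2 F6 E0, i.e. the \u00d3\u00e1... sequence from the
// question read back as raw bytes (assumed to be Windows-1251 text).
$raw = "\xD3\xE1\xE8\xE2\xF6\xE0";

// Convert to UTF-8 first; json_encode() expects UTF-8 input.
$utf8 = iconv('Windows-1251', 'UTF-8', $raw);

// Default output: Cyrillic code points as \uXXXX escapes.
$escaped = json_encode($utf8);

// With the flag: the characters are written as literal UTF-8.
$readable = json_encode($utf8, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";  // "\u0423\u0431\u0438\u0432\u0446\u0430"
echo $readable, "\n"; // "Убивца"
```

Both forms decode to the same string on the client; the flag only changes how the JSON text itself looks.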