How to convert unicode in php? - php

I want to convert a character to its Unicode code point: for "ग", the output should be "0917" or "917" (either one is fine).
Link to the Unicode value of the string I want
Please give me a hint. I used ord(), but it does not work properly.
$ord = mb_convert_encoding("ग", 'HTML-ENTITIES', 'UTF-8');
echo $ord;
$ord = ord("ग");
echo $ord; // 224 output
I tried both, but neither works.

iconv — Convert string to requested character encoding
http://php.net/manual/en/function.iconv.php
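A minimal sketch of how iconv can be used for this, assuming the script file and the input are UTF-8 and the iconv extension is available. ord() prints 224 because it only reads the first byte of the three-byte UTF-8 sequence for "ग". The helper name codepoint_hex is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper: returns the Unicode code point of a single
// UTF-8 character as an uppercase hex string, zero-padded to 4 digits.
function codepoint_hex(string $char): string
{
    // UTF-32BE stores every code point as one 4-byte big-endian integer.
    $utf32 = iconv('UTF-8', 'UTF-32BE', $char);
    if ($utf32 === false || strlen($utf32) !== 4) {
        throw new InvalidArgumentException('Expected exactly one valid UTF-8 character');
    }
    // 'N' unpacks an unsigned 32-bit big-endian integer.
    $codePoint = unpack('N', $utf32)[1];
    return sprintf('%04X', $codePoint);
}

echo codepoint_hex("ग"), "\n";             // 0917
echo ltrim(codepoint_hex("ग"), '0'), "\n"; // 917
```

On PHP 7.2+ with the mbstring extension, mb_ord("ग", "UTF-8") should give the same code point as an integer (2327, which is 0x917).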

Related

PHP: Convert Extended Ascii file to UTF-8

I can't find any way to get valid UTF-8 as output...
$fx = file_get_contents("Extended Ascii file.txt"); // example only has chr(129), but could be mixed Extended Ascii + UTF8
// not working:
//$fx = html_entity_decode($fx, ENT_QUOTES, "UTF-8");
//$fx = mb_convert_encoding($fx, 'UTF-8', 'ASCII');
//$fx = utf8_encode($fx);
//$fx = iconv('ASCII', 'UTF-8//IGNORE', $fx);
echo '"chr('.ord($fx[0]).')"=>"'.$fx[0].'"<br><br>'; // result: "chr(129)"=>"�"
$fx = strtr($fx, [chr(128)=>'Ç',chr(129)=>'ü',chr(130)=>'é',chr(131)=>'â',chr(132)=>'ä',chr(133)=>'à',chr(134)=>'å',chr(135)=>'ç',chr(136)=>'ê',chr(137)=>'ë',chr(138)=>'è',chr(139)=>'ï',chr(140)=>'î',chr(141)=>'ì',chr(142)=>'Ä',chr(143)=>'Å',chr(144)=>'É',chr(145)=>'æ',chr(146)=>'Æ',chr(147)=>'ô',chr(148)=>'ö',chr(149)=>'ò',chr(150)=>'û',chr(151)=>'ù',chr(152)=>'ÿ',chr(153)=>'Ö',chr(154)=>'Ü',chr(155)=>'ø',chr(156)=>'£',chr(157)=>'Ø',chr(158)=>'×',chr(159)=>'ƒ',chr(160)=>'á',chr(161)=>'í',chr(162)=>'ó',chr(163)=>'ú',chr(164)=>'ñ',chr(165)=>'Ñ',chr(166)=>'ª',chr(167)=>'º',chr(168)=>'¿',chr(169)=>'®',chr(170)=>'¬',chr(171)=>'½',chr(172)=>'¼',chr(173)=>'¡',chr(174)=>'«',chr(175)=>'»',chr(176)=>'░',chr(177)=>'▒',chr(178)=>'▓',chr(179)=>'│',chr(180)=>'┤',chr(181)=>'Á',chr(182)=>'Â',chr(183)=>'À',chr(184)=>'©',chr(185)=>'╣',chr(186)=>'║',chr(187)=>'╗',chr(188)=>'╝',chr(189)=>'¢',chr(190)=>'¥',chr(191)=>'┐',chr(192)=>'└',chr(193)=>'┴',chr(194)=>'┬',chr(195)=>'├',chr(196)=>'─',chr(197)=>'┼',chr(198)=>'ã',chr(199)=>'Ã',chr(200)=>'╚',chr(201)=>'╔',chr(202)=>'╩',chr(203)=>'╦',chr(204)=>'╠',chr(205)=>'═',chr(206)=>'╬',chr(207)=>'¤',chr(208)=>'ð',chr(209)=>'Ð',chr(210)=>'Ê',chr(211)=>'Ë',chr(212)=>'È',chr(213)=>'ı',chr(214)=>'Í',chr(215)=>'Î',chr(216)=>'Ï',chr(217)=>'┘',chr(218)=>'┌',chr(219)=>'█',chr(220)=>'▄',chr(221)=>'¦',chr(222)=>'Ì',chr(223)=>'▀',chr(224)=>'Ó',chr(225)=>'ß',chr(226)=>'Ô',chr(227)=>'Ò',chr(228)=>'õ',chr(229)=>'Õ',chr(230)=>'µ',chr(231)=>'þ',chr(232)=>'Þ',chr(233)=>'Ú',chr(234)=>'Û',chr(235)=>'Ù',chr(236)=>'ý',chr(237)=>'Ý',chr(238)=>'¯',chr(239)=>'´',chr(240)=>'≡',chr(241)=>'±',chr(242)=>'‗',chr(243)=>'¾',chr(244)=>'¶',chr(245)=>'§',chr(246)=>'÷',chr(247)=>'¸',chr(248)=>'°',chr(249)=>'¨',chr(250)=>'·',chr(251)=>'¹',chr(252)=>'³',chr(253)=>'²',chr(254)=>'■',chr(255)=>'nbsp']);
echo '"chr('.ord($fx[0]).')"=>"'.$fx[0].'"<br><br>'; // result: "chr(195)"=>"�"
How to convert or remove � ?
28.05.2020 Update: Solution found, thanks to Andrea Pollini!
Some notes:
iconv('UTF-8', 'UTF-8//IGNORE', $fx); // IGNORE is broken in PHP since - https://www.php.net/manual/en/function.iconv.php#108643 - use mb_convert_encoding
Here was my real problem (I figured it out later, after many tests):
$P["T"] .= $text; // here was the problem, array is converting strings... (don't know why?)
Changed to:
ini_set('mbstring.substitute_character', "none"); // make mb_convert_encoding drop unknown characters
$P["T"] .= mb_convert_encoding($text, 'UTF-8', 'UTF-8');
Now it's working. But if somebody knows why arrays are converting strings and how to disable that, that would be great. :)
First, configure mbstring to discard characters it cannot convert:
<?php
ini_set('mbstring.substitute_character', "none");
?>
Next, you can use mb_convert_encoding:
mb_convert_encoding($fx, "UTF-8", mb_detect_encoding($fx, "UTF-8, ISO-8859-1, ISO-8859-15", true));
You can add any other encodings you need to the list passed to mb_detect_encoding.
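The replacement table in the question matches DOS code page 850 (byte 129 is ü there, for example), so another option is to let iconv convert from that code page directly. The following is only a sketch under that assumption: the helper name to_utf8 and the CP850 default are hypothetical, and whether 'CP850' is accepted depends on the iconv implementation PHP was built against. It treats the string as a whole, so it does not untangle legacy bytes and UTF-8 mixed inside the same string.

```php
<?php
// Hypothetical helper: converts legacy single-byte text to UTF-8.
// Assumption: any input that is not valid UTF-8 is encoded in $legacy.
function to_utf8(string $bytes, string $legacy = 'CP850'): string
{
    // With the /u modifier, preg_match fails on invalid UTF-8,
    // so a successful match means the input is already valid UTF-8.
    if (preg_match('//u', $bytes) === 1) {
        return $bytes;
    }
    $converted = iconv($legacy, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $legacy");
    }
    return $converted;
}

echo to_utf8(chr(129)), "\n"; // ü
```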

How to convert “Ã©” to “é” in PHP?

I'm trying to convert a string from this: “Ã©” to this: “é”. It's a Latin-1 character, but I can't get it right. So far I've tried two functions, but neither of them gives me the right output.
$translation = 'Copà © rnico was Italian';
$translation = mb_convert_encoding($translation, 'utf-8', 'iso-8859-1'); //opt 1
$translation = iconv('utf-8', 'latin1', $translation); //opt 2
I'm getting this data from an API, so I don't know what's going on in the database.
This is the string in Spanish: Copérnico es italiano.
This is the data from the API: Copà © rnico is Italian
This is the result with $translation = bin2hex($translation);
436f70c38320c2a920726e69636f206973204974616c69616e
What's the right way to go? Greetings.
I had the same problem before, and this option
$translation = iconv('utf-8', 'latin1', $translation); //opt 2
worked very well.
Your problem is that `Copà © rnico was Italian` is not the same as `Copérnico was Italian`.
When you try to convert it, iconv sees two separate characters because of the spaces: "à © " (two unwanted characters and two spaces) is not the same as "é" (one valid UTF-8 character).
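As a sketch of what that means in practice, the bytes from the bin2hex output in the question can be repaired in two steps: remove the stray spaces, then undo the double encoding with iconv. This assumes the only damage is a space after each of the two mojibake characters, as in this one sample; real API data may be broken in other ways.

```php
<?php
// The exact bytes the question reports from bin2hex().
$raw = hex2bin('436f70c38320c2a920726e69636f206973204974616c69616e');

// Step 1: drop the space that follows each mojibake character,
// so that the pair "Ã" + "©" is adjacent again.
// Assumption: the stray spaces always appear in exactly this pattern.
$joined = str_replace("\xC3\x83 \xC2\xA9 ", "\xC3\x83\xC2\xA9", $raw);

// Step 2: converting UTF-8 to Latin-1 turns "Ã©" (4 bytes) back into
// the 2 bytes C3 A9, which are the UTF-8 encoding of "é".
$fixed = iconv('UTF-8', 'ISO-8859-1', $joined);

echo $fixed, "\n"; // Copérnico is Italian
```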

How do I convert Arabic letters into HTML entities?

I need to convert Arabic letters into HTML entities. Codepage: ISO-8859-1.
سك is an example of the Arabic text.
htmlentities("سك")
returns:
س�
How can I get the HTML entities &#1587;&#1603; from this text?
htmlentities() can only convert characters that have named entities. See this question on how to convert arbitrary characters into numeric entities.
You're probably not targeting the correct charset. Try: htmlentities('سك', ENT_QUOTES, 'UTF-8');
I'm using a function to make sure no HTML code or quotation marks posted by a user get through:
function cleartext($x1){
    $x1 = str_replace('"','',$x1);
    $x1 = str_replace("'",'',$x1);
    $x1 = htmlentities($x1, ENT_QUOTES, 'UTF-8');
    return $x1;
}
So thanks for (ENT_QUOTES, 'UTF-8'); it helped me find what I was looking for.

How to convert some multibyte characters into their numeric HTML entities using PHP?

Test string:
$s = "convert this: ";
$s .= "–, —, †, ‡, •, ≤, ≥, μ, ₪, ©, ® y ™, ⅓, ⅔, ⅛, ⅜, ⅝, ⅞, ™, Ω, ℮, ∑, ⌂, ♀, ♂ ";
$s .= "but, not convert ordinary characters to entities";
$encoded = mb_convert_encoding($s, 'HTML-ENTITIES', 'UTF-8');
Assuming your input string is UTF-8, this should encode almost everything into numeric entities.
Well, htmlentities doesn't work correctly. Fortunately, someone has posted code on the PHP website that seems to do the translation of multibyte characters properly.
I did some work on decoding ASCII into HTML-coded text (&#xxxx): https://github.com/hellonearthis/ascii2web
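The 'HTML-ENTITIES' pseudo-encoding used with mb_convert_encoding above is deprecated as of PHP 8.2. A possible alternative, sketched here on the assumption that the input is UTF-8 and mbstring is loaded, is mb_encode_numericentity with a conversion map that covers every code point above ASCII. The wrapper name to_numeric_entities is hypothetical.

```php
<?php
// Hypothetical wrapper: encodes every non-ASCII character as a decimal
// numeric entity and leaves ordinary ASCII characters untouched.
function to_numeric_entities(string $s): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($s, $map, 'UTF-8');
}

// Assumption: this file is saved as UTF-8.
echo to_numeric_entities("– ™ Ω but not ordinary characters"), "\n";
// &#8211; &#8482; &#937; but not ordinary characters
```

Because ASCII is outside the map, characters such as < and & are not escaped by this call; it does not replace htmlspecialchars for untrusted input.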

Problem in UTF Encoding in PHP

I use the following lines of code:
$revTerm = "". strrev($limitAry["term"]);
$revTerm = utf8_encode($revTerm);
$revTerm contains Norwegian characters such as ø, æ and å, and it is displayed correctly. I need to reverse the string before displaying it, so I use the first line.
When I display it this way, I get a "bad XML format" error (the data is used to fill a grid).
When I try to use the second line, I don't get an error but the characters are not shown correctly. Could there be any other way to solve that?
If it may help, I use jqGrid to fill those data in.
strrev, like most PHP string functions, is not safe for multi-byte encodings.
Try this example:
$test = 'А роза упала на лапу Азора ウィキ';
$test = iconv('utf-8', 'utf-16le', $test);
$test = strrev($test);
// キィウ арозА упал ан алапу азор А
echo iconv('utf-16be', 'utf-8', $test);
Source (in Russian): http://bolknote.ru/2012/04/02/~3625#56
Try this:
$revTerm = utf8_decode($limitAry["term"]);
$revTerm = strrev($revTerm);
$revTerm = utf8_encode($revTerm);
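strrev works byte by byte, so to use it you first have to decode your string to a single-byte encoding. Note that utf8_decode only covers characters that exist in ISO-8859-1 (ø, æ and å do), and that utf8_encode and utf8_decode are deprecated as of PHP 8.2.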
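A sketch of a reversal that avoids the round trip through another encoding, assuming the input is valid UTF-8: split the string into characters with a /u regular expression, reverse the array, and join it again. The function name utf8_strrev is hypothetical.

```php
<?php
// Hypothetical helper: reverses a UTF-8 string character by character
// instead of byte by byte.
function utf8_strrev(string $s): string
{
    // An empty pattern with the /u modifier splits between code points.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return implode('', array_reverse($chars));
}

echo utf8_strrev("øæå"), "\n"; // åæø
```

This reverses code points, so a base letter followed by a combining mark would end up with the mark in front of it; for the precomposed Norwegian letters in the question that is not an issue.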
