How to convert Arabic text to Hex using PHP - php

Kindly, I need to convert Arabic text to and from hexadecimal using PHP, as in the following example:
مرحبا
06450631062D06280627
Regards,
Eco

If you just need to have the Arabic text written in the HTML document being generated, I think the simplest way is to convert the sequence to character references, turning e.g. 0645 to &#x0645;. This could be done as follows:
<?php
$str = '06450631062D06280627';
for ($i = 0; $i < strlen($str) / 4; $i++) {
    echo "&#x", substr($str, 4 * $i, 4), ";";
}
?>

I get a Unicode string with the following code:
$str = "Some Hexa String"; // expects escapes such as \u0645\u0631
$replacedString = preg_replace("/\\\\u([0-9a-fA-F]{4})/", "&#x$1;", $str);
$unicodeString = mb_convert_encoding($replacedString, 'UTF-8', 'HTML-ENTITIES');

bin2hex($str); // Bin to Hex
pack("H*", $hexStr); // Hex to Bin
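
Note that bin2hex() on a UTF-8 string dumps the UTF-8 bytes, while the hex in the question (0645 0631 ...) is one UTF-16BE code unit per character. A minimal sketch of a round trip, assuming the input is UTF-8 and the mbstring extension is available (textToHex and hexToText are hypothetical helper names, not taken from the answers above):

```php
<?php
// Hypothetical helpers; assume UTF-8 input and the mbstring extension.

// UTF-8 text -> uppercase hex of its UTF-16BE code units.
function textToHex($text) {
    return strtoupper(bin2hex(mb_convert_encoding($text, 'UTF-16BE', 'UTF-8')));
}

// Hex of UTF-16BE code units -> UTF-8 text.
function hexToText($hex) {
    return mb_convert_encoding(hex2bin($hex), 'UTF-8', 'UTF-16BE');
}

echo textToHex('مرحبا'), "\n";                // 06450631062D06280627
echo hexToText('06450631062D06280627'), "\n"; // مرحبا
```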

Related

Convert UTF8 Text and Numbers to UTF16-BE for Clickatell

I use an online SMS service (Clickatell) for a web app. My main language is Greek, so I need to convert the SMS text to UTF-16BE in my PHP file before I send it. For example, I need to convert the text
"Το ραντεβού σας έχει μεταφερθεί στις 12-12-2016 και ώρα 18:25"
to
03a403bf002003c103b103bd03c403b503b203bf03cd002003c303b103c2002003ad03c703b503b9002003bc03b503c403b103c603b503c103b803b503af002003c303c403b903c2002000310032002d00310032002d0032003000310036002003ba03b103b9002003ce03c103b1002000310038003a00320035
I need to convert everything, including spaces, symbols and numbers.
I have found a few PHP commands, but they are converting only the text.
$text=strtoupper(str_replace(array('"', '\u'), array('',''), json_encode('Το ραντεβού σας έχει μεταφερθεί στις 12-12-2016 και ώρα 18:25')));
When using the above code I get the below result:
03A403BF 03C103B103BD03C403B503B203BF03CD 03C303B103C2 03AD03C703B503B9 03BC03B503C403B103C603B503C103B803B503AF 03C303C403B903C2 12-12-2016 03BA03B103B9 03CE03C103B1 18:25
Notice that the date and time, as well as all the spaces, are not converted.
Can anyone tell me how to get my whole phrase in Unicode? How can I do this with PHP?
Thank you in advance.
I'm not sure what you mean by "they are converting only the text", but if you're looking to convert a UTF-8 string to UTF-16BE, then you can try:
iconv('UTF-8', 'UTF-16BE', $string);
or:
mb_convert_encoding($string, 'UTF-16BE', 'UTF-8');
Edit:
Since you've shared some code now, your technique for conversion is not sound, unless you really want it represented like you have it. Your result is basically a hex representation of the individual bytes, but not the bytes themselves.
Edit 2:
If you genuinely need it in the format specified, the following will do it for you:
$string = iconv('UTF-8', 'UTF-16BE', $string); // .. or mb_convert_encoding
$converted = '';
for ($i = 0; $i < strlen($string); $i++) {
    $converted .= sprintf('%02X', ord($string[$i]));
}
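
The byte-by-byte loop above can also be collapsed into a single expression, since bin2hex() already emits two hex digits per byte. A sketch under the same assumption that the input is UTF-8 (toUtf16BeHex is a hypothetical name):

```php
<?php
// Hypothetical helper: hex dump of the UTF-16BE encoding of a UTF-8 string.
function toUtf16BeHex($string) {
    return strtoupper(bin2hex(iconv('UTF-8', 'UTF-16BE', $string)));
}

// Digits, spaces and punctuation are converted along with the Greek letters.
echo toUtf16BeHex('Το 18:25'), "\n"; // 03A403BF002000310038003A00320035
```

Drop the strtoupper() call if lowercase hex, as in the sample from the question, is acceptable.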

How to convert Emoji from Unicode in PHP?

I use this table of Emoji and try this code:
<?php print json_decode('"\u2600"'); // This convert to ☀ (black sun with rays) ?>
If I try to convert \u1F600 (grinning face) through json_decode, I get this instead: ὠ0.
What's wrong? How do I get the right emoji?
PHP 5
JSON's \u can only handle one UTF-16 code unit at a time, so you need to write the surrogate pair instead. For U+1F600 this is \uD83D\uDE00, which works:
echo json_decode('"\uD83D\uDE00"');
😀
PHP 7
You no longer need json_decode; you can use the \u{...} Unicode code point escape directly in a double-quoted string:
echo "\u{1F30F}";
🌏
In addition to the answer of Tino, I'd like to add code that converts a hexadecimal code point like 0x1F63C to a Unicode symbol in PHP 5 by splitting it into a surrogate pair:
function codeToSymbol($em) {
    if ($em >= 0x10000) {
        // Code points outside the BMP need a UTF-16 surrogate pair.
        $first = (($em - 0x10000) >> 10) + 0xD800;
        $second = (($em - 0x10000) % 0x400) + 0xDC00;
        return json_decode('"' . sprintf("\\u%X\\u%X", $first, $second) . '"');
    } else {
        // Pad to four hex digits, as JSON's \u escape requires.
        return json_decode('"' . sprintf("\\u%04X", $em) . '"');
    }
}
echo codeToSymbol(0x1F63C); // outputs 😼
Example of code parsing a string that contains emoji in \UXXXXXXXX escape format:
$str = 'Test emoji \U0001F607 \U0001F63C';
echo preg_replace_callback(
    '/\\\U([A-F0-9]+)/',
    function ($matches) {
        // hex2bin() yields big-endian bytes, so name the byte order explicitly.
        return mb_convert_encoding(hex2bin($matches[1]), 'UTF-8', 'UTF-32BE');
    },
    $str
);
Output: Test emoji 😇 😼
https://3v4l.org/63dUR
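
On PHP 7.2 or later, the mbstring function mb_chr() converts a code point to a character directly, with no surrogate arithmetic. A minimal sketch, assuming mbstring is loaded (unescapeCodePoints is a hypothetical helper name):

```php
<?php
// Hypothetical helper: replace \UXXXXXXXX escapes with the UTF-8 character.
// Assumes PHP >= 7.2 with the mbstring extension.
function unescapeCodePoints($str) {
    return preg_replace_callback(
        '/\\\\U([0-9A-Fa-f]{8})/',
        function ($m) {
            $chr = mb_chr(hexdec($m[1]), 'UTF-8');
            // mb_chr() returns false for invalid code points; keep the escape then.
            return $chr === false ? $m[0] : $chr;
        },
        $str
    );
}

echo mb_chr(0x1F63C, 'UTF-8'), "\n";                               // 😼
echo unescapeCodePoints('Test emoji \U0001F607 \U0001F63C'), "\n"; // Test emoji 😇 😼
```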

How do I write UTF-8 data to a UTF-16LE file using PHP?

Given a string of UTF-8 data in PHP, how can I convert and save it to a UTF-16LE file? This particular file happens to be destined for InDesign, to be placed as a tagged text document.
Data:
$copy = "<UNICODE-MAC>\n";
$copy .= "<Version:8><FeatureSet:InDesign-Roman><ColorTable:=<Black:COLOR:CMYK:Process:0,0,0,1>>\n";
$copy .= "A bunch of unicode special characters like ñ, é, etc.";
I am using the following code, but to no avail:
file_put_contents("output.txt", pack("S",0xfeff) . $copy);
You can use iconv:
$copy_utf16 = iconv("UTF-8", "UTF-16LE", $copy);
file_put_contents("output.txt", $copy_utf16);
Note that UTF-16LE does not include a byte order mark (BOM), because the byte order is well defined. To produce a BOM, use "UTF-16" instead; the byte order iconv then chooses depends on the implementation.
Using the following code, I have found a solution.
This function changes the byte order (from http://shiplu.mokadd.im/95/convert-little-endian-to-big-endian-in-php-or-vice-versa/):
function chbo($num) {
    $data = dechex($num);
    if (strlen($data) <= 2) {
        return $num;
    }
    $u = unpack("H*", strrev(pack("H*", $data)));
    $f = hexdec($u[1]);
    return $f;
}
Used with a UTF-8 to UTF-16LE conversion, it creates a file that will work with InDesign:
file_put_contents("output.txt", pack("S",0xfeff). chbo(iconv("UTF-8","UTF-16LE",$copy)));
Alternatively, you could use mb_convert_encoding() as follows:
$copy_UTF16LE = mb_convert_encoding($copy,'UTF-16LE','UTF-8');
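
Putting the answers together, a minimal sketch that converts to UTF-16LE and prepends the little-endian byte order mark (FF FE) explicitly, so the result does not depend on the machine's byte order the way pack("S", ...) does. It assumes $copy is valid UTF-8; utf16LeWithBom is a hypothetical helper name, and whether InDesign actually requires the BOM is not confirmed here:

```php
<?php
// Hypothetical helper: UTF-8 in, UTF-16LE with a leading BOM out.
function utf16LeWithBom($utf8) {
    $bom = "\xFF\xFE"; // U+FEFF encoded as UTF-16LE
    return $bom . iconv('UTF-8', 'UTF-16LE', $utf8);
}

$copy = "<UNICODE-MAC>\n" . "Special characters like ñ, é, etc.";
file_put_contents('output.txt', utf16LeWithBom($copy));
```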

How to properly write Unicode words to an image using the PHP imagettftext() function

I am trying to write some Urdu text on an image using PHP's imagettftext() function. It does not display the characters unless I convert the text using the following code:
function convert($text){
    $out="";
    mb_language('uni');
    mb_internal_encoding('UTF-8');
    $text = mb_convert_encoding($text, 'HTML-ENTITIES',"UTF-8");
    $text = html_entity_decode($text,ENT_NOQUOTES, "ISO-8859-1");
    for($i = 0; $i < strlen($text); $i++) {
        $letter = $text[$i];
        $num = ord($letter);
        if($num>127) {
            $out .= "&#$num;";
        } else {
            $out .= $letter;
        }
    }
    return $out;
}
Now the text, e.g. عچں (which contains the three characters ع چ ں), is printed onto the image as separate, full characters instead of the characters being cut and joined to form an Urdu word like عچں.
I have used the characters ا ب ت ث with codes U+0627, U+0628, U+062A, U+062B and so on from this page http://en.wikipedia.org/wiki/List_of_Unicode_characters#Arabic
I have shared the code here: https://code.google.com/p/urdu-captcha/downloads/list
Note: I have added spaces between the characters in the code provided; removing them makes no difference to how the text is displayed on the image.
How do I make it write the characters joined together to form proper words?
You'll need an additional library to perform Arabic glyph joining. Check out AR-PHP.

php non latin to hex function

I have a website that is in win-1251 encoding, and it needs to stay that way. But I also need to be able to echo a few links that contain non-Latin, non-Cyrillic characters like šžāņūī...
I need a function that converts this
"māja un man tā patīk"
to
"m&#257;ja un man t&#257; pat&#299;k"
and that does not touch HTML, so if there is <b> it needs to stay as <b>, not &gt; or &lt;
And please, no advice about the encoding and how wrong that is.
$str = "<b>Obāchan</b> おばあちゃん";
$str = preg_replace_callback('/./u', function ($matches) {
    $chr = $matches[0];
    if (strlen($chr) > 1) {
        $chr = mb_convert_encoding($chr, 'HTML-ENTITIES', 'UTF-8');
    }
    return $chr;
}, $str);
This expects the original $str to be UTF-8 encoded, i.e. your PHP file should be saved in UTF-8. It encodes all non-ASCII compatible code points to HTML entities. Since all HTML special characters are ASCII characters, they remain untouched. The resulting string is pure ASCII. Since the lower Win-1251 code points are ASCII compatible, the resulting string is also a valid Win-1251 string. The above $str converts to:
<b>Ob&#257;chan</b> &#12362;&#12400;&#12354;&#12385;&#12419;&#12435;
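
The 'HTML-ENTITIES' pseudo-encoding used above is deprecated as of PHP 8.2. A sketch of the same idea without it, assuming PHP >= 7.2 with mbstring and UTF-8 input (encodeNonAscii is a hypothetical name):

```php
<?php
// Hypothetical helper: replace every non-ASCII character with a decimal
// numeric character reference, leaving ASCII (including HTML markup) alone.
function encodeNonAscii($utf8) {
    return preg_replace_callback(
        '/[^\x00-\x7F]/u',
        function ($m) {
            return '&#' . mb_ord($m[0], 'UTF-8') . ';';
        },
        $utf8
    );
}

echo encodeNonAscii('<b>māja un man tā patīk</b>'), "\n";
// <b>m&#257;ja un man t&#257; pat&#299;k</b>
```

As with the version above, the result is pure ASCII, so it can be echoed into a win-1251 page unchanged.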
The main things you probably don't want to encode are <, > and &. Those are really the only special characters. So how about encoding everything first, and then decoding just &lt;, &gt; and &amp;? I feel you should be fine.
This is untested:
$output =
    htmlspecialchars_decode(
        htmlentities($input, ENT_NOQUOTES, 'CP-1251')
    );
Let me know.
What Evert suggests looks logical to me too! If you insist, this is a way to do it if there are only two letters that bother you. For more letters the script will not be as effective and needs to change.
<?PHP
function myConvert($str)
{
    $chars['ā'] = '&#257;';
    $chars['ī'] = '&#299;';
    // Apply each replacement to the running result, not to the original string.
    foreach ($chars as $key => $value)
        $str = str_replace($key, $value, $str);
    echo $str;
}
myConvert("māja un man tā patīk");
?>
==================edited==============
For many characters maybe this one can help you:
<?PHP
function myConvert($str)
{
    $final = null;
    $parts = preg_split("/&#[0-9]*;/i", $str); // get all text parts
    preg_match_all("/&#[0-9]*;/i", $str, $delimiters); // get delimiters
    $delimiters[0][] = ''; // make arrays equal size
    foreach ($parts as $key => $value)
        $final .= $value . mb_convert_encoding($delimiters[0][$key], "UTF-8", "HTML-ENTITIES");
    return $final;
}
$fh = fopen("testFile.txt", 'w');
fwrite($fh, myConvert("m&#257;ja un man t&#257; pat&#299;k&#299;"));
fclose($fh);
?>
The desired output is written to the text file. This code, exactly as it is (not merged into some project), does what it claims to do: it converts codes like &#257; to the character they represent.
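
The split-and-rejoin approach above can be written more directly with a single callback. A sketch assuming PHP >= 7.2 with mbstring (decodeNumericEntities is a hypothetical name); it only touches decimal references such as &#257; and leaves named entities like &lt; alone:

```php
<?php
// Hypothetical helper: turn decimal numeric character references into UTF-8.
function decodeNumericEntities($str) {
    return preg_replace_callback(
        '/&#([0-9]+);/',
        function ($m) {
            $chr = mb_chr((int) $m[1], 'UTF-8');
            // Leave the reference untouched if it is not a valid code point.
            return $chr === false ? $m[0] : $chr;
        },
        $str
    );
}

echo decodeNumericEntities('m&#257;ja un man t&#257; pat&#299;k &lt;b&gt;'), "\n";
// māja un man tā patīk &lt;b&gt;
```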
