I was thinking of using this:
<?php
$string1 = 'V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh';
echo base64_decode($string1);
?>
The output for this example should always be 24 characters, but sometimes the output is shorter than that.
32 base64 characters multiplied by 6 bits per base64 character equals 192 bits; divided by 8 bits per byte, that gives 24 bytes.
The problem is that the output is displayed as plain text, and some byte values have no visible text representation, so that data appears to be lost. The next test shows that 41 of the 256 byte values produce no visible output.
<?php
for ($i = 0; $i <= 255; $i++) {
$string2 = chr($i);
echo $i . " = " . $string2 . "<br>";
}
?>
My plan was to decode the base64 string and then convert the decoded output to hexadecimal. That does not seem possible because of those 41 characters.
I also tried base_convert(), but it does not support base 64.
You can do this with bin2hex():
Returns an ASCII string containing the hexadecimal representation of str. The conversion is done byte-wise with the high-nibble first.
php > $string1 = 'V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh';
php > echo base64_decode($string1);
Why I can't do this!!!!!
php > echo bin2hex(base64_decode($string1));
57687920492063616e277420646f20746869732121212121
php >
<?php
$string1 = 'V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh';
$binary = base64_decode($string1);
$hex = bin2hex($binary);
echo $hex;
?>
A simple one line solution is:
<?=bin2hex(base64_decode($string1));?>
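The decoded bytes are never lost; they only fail to render when echoed as text. A minimal sketch of the same idea with input validation follows. The helper name base64ToHex is hypothetical, and the strict flag of base64_decode() is used so that malformed input is rejected rather than silently skipped:

```php
<?php
// Hypothetical helper: decode base64 strictly, then hex-encode the raw bytes.
function base64ToHex(string $b64): string
{
    // Strict mode: returns false if the input contains non-base64 characters.
    $raw = base64_decode($b64, true);
    if ($raw === false) {
        throw new InvalidArgumentException('Input is not valid base64');
    }
    return bin2hex($raw);
}

$string1 = 'V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh';

// Count decoded bytes with strlen(), not by looking at the echoed text.
echo strlen(base64_decode($string1)), "\n";

echo base64ToHex($string1), "\n";

// Bytes with no visible representation survive the round trip unchanged.
echo base64ToHex(base64_encode("\x00\x07\xff")), "\n";
```

Because bin2hex() works on bytes, it does not matter whether a byte has a printable form.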
The code:
#0c0f56415445532d413636373231343939
Is: VATES-A66721499 but encoded in hex.
I have made the following attempt:
$hex = bin2hex('VATES-A66721499');
echo $hex;
output:
56415445532d413636373231343939
But I need to get this other part:
#0c0f
I have tried the following but no result: #0c0f56415445532d413636373231343939
0c and 0f are unprintable control characters, and # is not part of hexadecimal encoding at all.
You can either:
'#' . bin2hex("\x0c\x0f" . 'VATES-A66721499')
Or:
'#0c0f' . bin2hex('VATES-A66721499')
Both will give the desired output.
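A short sketch checking that the two variants agree, plus the reverse direction. The variable names are illustrative. The last part rests on an assumption the question does not confirm: 0x0f happens to equal the length of the text (15 bytes), so the second prefix byte might be a length field.

```php
<?php
$text = 'VATES-A66721499';

// Option 1: hex-encode the two prefix bytes together with the text.
$viaBytes = '#' . bin2hex("\x0c\x0f" . $text);

// Option 2: prepend the prefix as literal hex digits.
$viaLiteral = '#0c0f' . bin2hex($text);

var_dump($viaBytes === $viaLiteral);

// Reverse direction: drop the '#', decode the hex, skip the two prefix bytes.
$raw = hex2bin(substr($viaBytes, 1));
echo substr($raw, 2), "\n";

// Assumption only: if the second byte is a length field, derive it
// from the text instead of hard-coding it.
$withLength = '#' . bin2hex("\x0c" . chr(strlen($text)) . $text);
echo $withLength, "\n";
```

If the prefix is a fixed marker instead, the hard-coded variants above are simpler and just as correct.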
I use this table of Emoji and try this code:
<?php print json_decode('"\u2600"'); // This convert to ☀ (black sun with rays) ?>
If I try to convert \u1F600 (grinning face) through json_decode, I get this instead: ὠ0.
What's wrong? How do I get the right emoji?
PHP 5
JSON's \u can only handle one UTF-16 code unit at a time, so you need to write the surrogate pair instead. For U+1F600 this is \uD83D\uDE00, which works:
echo json_decode('"\uD83D\uDE00"');
😀
PHP 7
You no longer need json_decode and can use the Unicode codepoint escape syntax \u{...} in a double-quoted string:
echo "\u{1F30F}";
🌏
In addition to the answer of Tino, I'd like to add code to convert hexadecimal code like 0x1F63C to a unicode symbol in PHP5 with splitting it to a surrogate pair:
function codeToSymbol($em) {
    // Code points from U+10000 upward need a UTF-16 surrogate pair in JSON
    if ($em >= 0x10000) {
        $first = (($em - 0x10000) >> 10) + 0xD800;
        $second = (($em - 0x10000) % 0x400) + 0xDC00;
        return json_decode('"' . sprintf("\\u%X\\u%X", $first, $second) . '"');
    } else {
        // JSON requires exactly four hex digits after \u
        return json_decode('"' . sprintf("\\u%04X", $em) . '"');
    }
}
echo codeToSymbol(0x1F63C); outputs 😼
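As an alternative to going through json_decode, a code point can be encoded to UTF-8 directly with bit operations. This is a minimal sketch; codePointToUtf8 is a hypothetical helper name, not a PHP built-in:

```php
<?php
// Hypothetical helper: encode one Unicode code point as UTF-8 bytes by hand.
function codePointToUtf8(int $cp): string
{
    // Reject values outside Unicode and the surrogate range U+D800..U+DFFF.
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException('Not a valid Unicode scalar value');
    }
    if ($cp < 0x80) {           // 1 byte: 0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {          // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
            . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

echo codePointToUtf8(0x1F63C), "\n";          // the cat emoji from above
echo bin2hex(codePointToUtf8(0x1F63C)), "\n"; // its UTF-8 bytes in hex
```

This avoids surrogate pairs entirely, since they only exist in UTF-16 and in JSON's \u escapes.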
Example of parsing a string that contains emoji in \U escape format:
$str = 'Test emoji \U0001F607 \U0001F63C';
echo preg_replace_callback(
'/\\\U([A-F0-9]+)/',
function ($matches) {
return mb_convert_encoding(hex2bin($matches[1]), 'UTF-8', 'UTF-32');
},
$str
);
Output: Test emoji 😇 😼
https://3v4l.org/63dUR
I have a Unicode text-block, like this:
ụ
ư
ứ
Ỳ
Ỷ
Ỵ
Đ
Now I want to convert this original Unicode text block into a block of UTF-8 byte sequences in hex (see the "Hexadecimal UTF-8" column on this page: https://en.wikipedia.org/wiki/UTF-8) using PHP, like this:
\xe1\xbb\xa5
\xc6\xb0
\xe1\xbb\xa9
\xe1\xbb\xb2
\xe1\xbb\xb6
\xe1\xbb\xb4
\xc4\x90
Not like this:
0x1EE5
0x01B0
0x1EE9
0x1EF2
0x1EF6
0x1EF4
0x0110
Is there any way to do this in PHP?
I have read this topic (PHP: Convert unicode codepoint to UTF-8). But, it is not similar to my question.
I am sorry, I don't know much about Unicode.
I think you're looking for the bin2hex() function:
Convert binary data into hexadecimal representation
And format by prepending \x to each byte (00-FF)
function str_hex_format ($bin) {
return '\x'.implode('\x', str_split(bin2hex($bin), 2));
}
For your sample:
// utf8 encoded input
$arr = ["ụ","ư","ứ","Ỳ","Ỷ","Ỵ","Đ"];
foreach($arr AS $v)
echo $v . " => " . str_hex_format($v) . "\n";
See test at eval.in (link expires)
ụ => \xe1\xbb\xa5
ư => \xc6\xb0
ứ => \xe1\xbb\xa9
Ỳ => \xe1\xbb\xb2
Ỷ => \xe1\xbb\xb6
Ỵ => \xe1\xbb\xb4
Đ => \xc4\x90
Decode example:
$str = str_hex_format("ụưứỲỶỴĐ");
echo $str;
\xe1\xbb\xa5\xc6\xb0\xe1\xbb\xa9\xe1\xbb\xb2\xe1\xbb\xb6\xe1\xbb\xb4\xc4\x90
echo hex2bin(str_replace('\x', "", $str));
ụưứỲỶỴĐ
For more info about escape sequence \x in double quoted strings see php manual.
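For decoding, stripcslashes() also understands \x hex escapes, which gives a short round trip. A sketch under assumptions: strHexFormat is a renamed copy of the function above with an explicit empty-string guard, and the input contains only \x escapes produced this way, because stripcslashes() would interpret other C-style escapes such as \n as well.

```php
<?php
// Same approach as str_hex_format above, with a guard for the empty string.
function strHexFormat(string $bin): string
{
    if ($bin === '') {
        return '';
    }
    return '\x' . implode('\x', str_split(bin2hex($bin), 2));
}

$original = 'ụưứ';                  // assumes this file is saved as UTF-8
$escaped  = strHexFormat($original);
echo $escaped, "\n";

// stripcslashes() turns each \xHH escape back into the byte it names.
$decoded = stripcslashes($escaped);
var_dump($decoded === $original);
```

The hex2bin(str_replace(...)) variant shown above is the safer choice when the input might contain other backslash sequences.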
PHP treats strings as arrays of bytes, regardless of encoding. If you don't need to delimit the UTF-8 characters, then something like this works:
$str='ụưứỲỶỴĐ';
foreach(str_split($str) as $char)
echo '\x'.str_pad(dechex(ord($char)),2,'0',STR_PAD_LEFT); // pad to 2 hex digits
Output:
\xe1\xbb\xa5\xc6\xb0\xe1\xbb\xa9\xe1\xbb\xb2\xe1\xbb\xb6\xe1\xbb\xb4\xc4\x90
If you need to delimit the UTF8 characters (i.e. with a newline), then you'll need something like this:
$str='ụưứỲỶỴĐ';
foreach(array_slice(preg_split('~~u',$str),1,-1) as $UTF8char){ // split before/after every UTF8 character and remove first/last empty string
foreach(str_split($UTF8char) as $char)
echo '\x'.str_pad(dechex(ord($char)),2,'0',STR_PAD_LEFT); // pad to 2 hex digits
echo "\n"; // delimiter
}
Output:
\xe1\xbb\xa5
\xc6\xb0
\xe1\xbb\xa9
\xe1\xbb\xb2
\xe1\xbb\xb6
\xe1\xbb\xb4
\xc4\x90
This splits the string into UTF-8 characters using preg_split and the u flag. Since preg_split returns an empty string before the first character and another after the last character, we use array_slice to drop the first and last elements. This can easily be modified to return an array, for example.
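A sketch of the array-returning variant mentioned above. The function name utf8CharsToHex is hypothetical. It uses PREG_SPLIT_NO_EMPTY instead of array_slice to drop the empty strings, which is a swapped-in technique rather than the code from this answer:

```php
<?php
// Hypothetical helper: one '\x..' escaped string per UTF-8 character.
function utf8CharsToHex(string $str): array
{
    // The u modifier splits between characters, not bytes;
    // PREG_SPLIT_NO_EMPTY drops the empty leading and trailing pieces.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $out = [];
    foreach ($chars as $char) {
        // bin2hex() always yields two digits per byte, so no padding is needed.
        $out[] = '\x' . implode('\x', str_split(bin2hex($char), 2));
    }
    return $out;
}

// Assumes this file is saved as UTF-8.
echo implode("\n", utf8CharsToHex('ụưứỲỶỴĐ')), "\n";
```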
Edit:
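A more "correct" way to do this is shown below. Note that it emits \u00e1-style escapes, one per byte, rather than \xe1, and that utf8_encode() is deprecated as of PHP 8.2: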
echo trim(json_encode(utf8_encode('ụưứỲỶỴĐ')),'"');
The main thing you need to do is tell PHP how to interpret the incoming Unicode characters. Once you do that, you can convert them to UTF-8 and then to hex as needed.
This code fragment takes your example characters as 16-bit big-endian code units (UCS-2BE), converts them to UTF-8, and then dumps the hex representation of each character.
<?php
// Hex equivalent of "ụưứỲỶỴĐ" in Unicode
$unistr = "\x1E\xE5\x01\xB0\x1E\xE9\x1E\xF2\x1E\xF6\x1E\xF4\x01\x10";
echo " length=" . mb_strlen($unistr, 'UCS-2BE') . "\n";
// Here's the key statement, convert from Unicode 16-bit to UTF-8
$utf8str = mb_convert_encoding($unistr, "UTF-8", 'UCS-2BE');
echo $utf8str . "\n";
for($i=0; $i < mb_strlen($utf8str, 'UTF-8'); $i++) {
$c = mb_substr($utf8str, $i, 1, 'UTF-8');
$hex = bin2hex($c);
echo $c . "\t" . $hex . "\t" . preg_replace("/([0-9a-f]{2})/", '\\\\x\\1', $hex) . "\n";
}
?>
Produces
length=7
ụưứỲỶỴĐ
ụ e1bba5 \xe1\xbb\xa5
ư c6b0 \xc6\xb0
ứ e1bba9 \xe1\xbb\xa9
Ỳ e1bbb2 \xe1\xbb\xb2
Ỷ e1bbb6 \xe1\xbb\xb6
Ỵ e1bbb4 \xe1\xbb\xb4
Đ c490 \xc4\x90
Why is it that when I shorten a string, the letters "å", "ä" and "ö" become "?"?
If I use the name "Örjan", it becomes "Orjan".
But when I use "Björn", it works fine.
PHP
//Create initials
$usr_fname_f_letter = $_POST['usr_fname'];
$usr_fname_f_letter = $usr_fname_f_letter[0];
$usr_lname_f_letter = $_POST['usr_lname'];
$usr_lname_f_letter = $usr_lname_f_letter[0];
$usr_inits = $usr_fname_f_letter .= $usr_lname_f_letter;
echo $_POST['usr_fname'];
echo '<br>';
echo $_POST['usr_lname'];
echo '<br>';
echo $usr_fname_f_letter;
echo '<br>';
echo $usr_lname_f_letter;
echo '<br>';
echo $usr_inits;
echo '<br>';
RESULT
Örjan
Björnsson
�B
B
�B
$usr_fname_f_letter = $usr_fname_f_letter[0];
simply takes the first (zero-offset) byte from $usr_fname_f_letter. You are using a multibyte character set, so this chops a character in half.
Use
mb_substr($usr_fname_f_letter, 0, 1, 'UTF-8')
because the mb_* functions are multibyte-aware and work in characters, not in bytes.
I assume your encoding is UTF-8 and you are probably printing only part of a multibyte character.
Try a multibyte-safe function such as mb_substr:
mb_substr($str, 0, 1, "UTF-8");
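Applied to the initials from the question, this could look like the sketch below. firstChar is a hypothetical helper, and the preg_match fallback for installations without the mbstring extension is an addition of this sketch, not part of either answer:

```php
<?php
// Hypothetical helper: first character (not first byte) of a UTF-8 string.
function firstChar(string $s): string
{
    if (function_exists('mb_substr')) {
        return mb_substr($s, 0, 1, 'UTF-8');
    }
    // Fallback without mbstring: with the u modifier, "." matches one
    // whole UTF-8 character instead of one byte.
    return preg_match('/^./su', $s, $m) === 1 ? $m[0] : '';
}

// Stand-ins for $_POST['usr_fname'] and $_POST['usr_lname'];
// assumes this file is saved as UTF-8.
$firstName = 'Örjan';
$lastName  = 'Björnsson';

// Plain "." keeps both source variables intact, unlike ".=" in the question.
$initials = firstChar($firstName) . firstChar($lastName);
echo $initials, "\n";
```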
Kindly I need to convert the Arabic text to and from Hexadecimal like the following example Using PHP
مرحبا
06450631062D06280627
Regards,
Eco
If you just need to have the Arabic text written into the HTML document being generated, I think the simplest way is to convert the sequence to character references, turning e.g. 0645 into &#x0645;. This could be done as follows:
<?php
$str = '06450631062D06280627';
for($i = 0; $i < strlen($str)/4; $i++) {
echo "&#x", substr($str, 4*$i, 4), ";";
}
?>
I get a Unicode string with the following code:
$str = "Some Hexa String";
$replacedString = preg_replace("/\\\\u([0-9abcdef]{4})/", "&#x$1;", $str);
$unicodeString = mb_convert_encoding($replacedString, 'UTF-8', 'HTML-ENTITIES');
bin2hex($str); // Bin to Hex
pack("H*", $hexStr); // Hex to Bin
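The hex in the question (0645 0631 ...) is four digits per character, which matches UTF-16 big-endian code units rather than the UTF-8 bytes that bin2hex() would give on a UTF-8 string. A minimal sketch of both directions follows, assuming the iconv extension is available and the text is UTF-8; both function names are hypothetical:

```php
<?php
// Hypothetical helper: UTF-8 text -> uppercase hex of its UTF-16BE code units.
function textToUtf16Hex(string $utf8): string
{
    $utf16 = iconv('UTF-8', 'UTF-16BE', $utf8);
    if ($utf16 === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return strtoupper(bin2hex($utf16));
}

// Hypothetical helper: hex of UTF-16BE code units -> UTF-8 text.
function utf16HexToText(string $hex): string
{
    // Require whole 16-bit units, i.e. groups of four hex digits.
    if (preg_match('/^(?:[0-9a-fA-F]{4})+$/', $hex) !== 1) {
        throw new InvalidArgumentException('Expected groups of 4 hex digits');
    }
    $utf8 = iconv('UTF-16BE', 'UTF-8', hex2bin($hex));
    if ($utf8 === false) {
        throw new InvalidArgumentException('Hex is not valid UTF-16BE');
    }
    return $utf8;
}

// Assumes this file is saved as UTF-8.
echo textToUtf16Hex('مرحبا'), "\n";
echo utf16HexToText('06450631062D06280627'), "\n";
```

mb_convert_encoding($str, 'UTF-16BE', 'UTF-8') should work in place of iconv() where mbstring is installed.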