How to encode data in hexadecimal? - php

The code:
#0c0f56415445532d413636373231343939
Is: VATES-A66721499 but encoded in hex.
I have made the following attempt:
$hex = bin2hex('VATES-A66721499');
echo $hex;
output:
56415445532d413636373231343939
But I need to get this other part:
#0c0f
I have tried the following but no result: #0c0f56415445532d413636373231343939

0c and 0f are unprintable control characters (form feed and shift in), and # is not part of hexadecimal encoding at all.
You can either prepend the two raw bytes before encoding:
'#' . bin2hex("\x0c\x0f" . 'VATES-A66721499')
Or:
'#0c0f' . bin2hex('VATES-A66721499')
Both will give the desired output.
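As a rough sketch, the first option can be wrapped in a pair of helpers so the value also decodes back to the original text. The names `encode_tagged`, `decode_tagged` and `TAG_PREFIX` are hypothetical, and the assumption that the prefix is always the two bytes 0x0c 0x0f comes only from the single example value in the question.

```php
<?php
// Hypothetical helpers; the fixed two-byte prefix is an assumption
// based on the one example value shown in the question.
const TAG_PREFIX = "\x0c\x0f";

function encode_tagged(string $text): string
{
    // '#' is a plain marker character; only the part after it is hex.
    return '#' . bin2hex(TAG_PREFIX . $text);
}

function decode_tagged(string $encoded): ?string
{
    if ($encoded === '' || $encoded[0] !== '#') {
        return null;
    }
    $hex = (string) substr($encoded, 1);
    // hex2bin() warns and returns false on malformed input, so validate first.
    if ($hex === '' || strlen($hex) % 2 !== 0 || !ctype_xdigit($hex)) {
        return null;
    }
    $raw = hex2bin($hex);
    if (strncmp($raw, TAG_PREFIX, strlen(TAG_PREFIX)) !== 0) {
        return null;
    }
    return (string) substr($raw, strlen(TAG_PREFIX));
}

echo encode_tagged('VATES-A66721499'), "\n";
// prints #0c0f56415445532d413636373231343939
```

Returning null for anything that does not match the expected shape keeps the decoder from silently accepting values with a different prefix.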

Related

json_encode JSON_UNESCAPED_SLASHES not working and still escaping slashes

My autocomplete search feature is broken because of how characters with accents are stored in the mySQL.
For example, in the mySQL column, É is stored like \u00c9
In the PHP, which receives the user's input and calls on mySQL, É is \xc3\x89
json_encode() almost works perfectly to take "\xc3\x89" and convert it to "\u00c9"
$clean = json_encode($criteria, JSON_UNESCAPED_SLASHES);
Except it converts it to "\\u00c9" and so the characters don't match even though they are both É.
The option JSON_UNESCAPED_SLASHES isn't working. Why does it not keep another backslash from being added in front of the backslash?
How do I get this to work?
Edit: I just added the actual code and error log output below. code:
error_log("criteria vvvvvvvvvvvvv");
error_log($criteria);
$clean = json_encode($criteria, JSON_UNESCAPED_SLASHES);
error_log("json_encode(criteria) vvvvvvvvvvvvvv");
error_log($clean);
The error log:
[Fri Aug 23] criteria vvvvvvvvvvvvvvv,
[Fri Aug 23] \xc3\x89
[Fri Aug 23] json_encode(criteria) vvvvvvvvvvvvvvv,
[Fri Aug 23] "\\u00c9"
First, JSON_UNESCAPED_SLASHES only prevents escaping of forward slashes (/), as the name implies. It does not prevent escaping of backslashes (\):
echo json_encode('/'); // prints "\/"
echo json_encode('/', JSON_UNESCAPED_SLASHES); // prints "/"
echo json_encode("\\", JSON_UNESCAPED_SLASHES); // prints "\\"
// Note on line 3: the input is a single backslash
As you can see, it prevents escaping of slashes only, not backslashes.
Regarding your problem: if json_encode() returned \\u00c9, then its input must have been the six-character string \u00c9. json_encode() did nothing wrong; you fed it the literal string "\u00c9" rather than the Unicode character U+00C9, so it escaped the backslash at the start of the string.
Your $criteria variable is probably holding an already JSON-encoded string like "\u00c9", produced without the JSON_UNESCAPED_UNICODE option. In other words, don't call json_encode() twice.
These examples may help clear things up:
echo json_encode("É", JSON_UNESCAPED_SLASHES) . "\n";
echo json_encode("\u00c9", JSON_UNESCAPED_SLASHES) . "\n";
echo json_encode("\xc3\x89", JSON_UNESCAPED_SLASHES) . "\n";
echo json_encode("/") . "\n";
echo json_encode("/", JSON_UNESCAPED_SLASHES) . "\n";
echo json_encode("\\", JSON_UNESCAPED_SLASHES) . "\n";
This outputs
"\u00c9"
"\\u00c9"
"\u00c9"
"\/"
"/"
"\\"
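One way to make the two sides comparable is to convert in exactly one direction, exactly once. The sketch below assumes the database column really stores JSON-style \uXXXX escapes, as described in the question; the helper names `to_json_escaped` and `from_json_escaped` are hypothetical.

```php
<?php
// Hypothetical helpers; they assume the stored values use JSON-style
// \uXXXX escapes, as the question describes.

// Raw UTF-8 -> escaped form (encode exactly once, then drop the quotes).
function to_json_escaped(string $utf8): string
{
    $json = json_encode($utf8);
    if ($json === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    // substr() rather than trim(): trim() would also eat an escaped
    // quote at the end of the value.
    return substr($json, 1, -1);
}

// Escaped form -> raw UTF-8 (the reverse direction).
function from_json_escaped(string $escaped): ?string
{
    $decoded = json_decode('"' . $escaped . '"');
    return is_string($decoded) ? $decoded : null;
}

var_dump(to_json_escaped("\xc3\x89") === '\u00c9');   // bool(true)
var_dump(from_json_escaped('\u00c9') === "\xc3\x89"); // bool(true)
```

Note that json_encode() without flags also escapes / as \/, so values containing slashes would need JSON_UNESCAPED_SLASHES if the stored form does not escape them.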

how to convert from base64 to hexadecimal in php?

I was thinking to use this:
<?php
$string1 = 'V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh';
echo base64_decode($string1);
?>
The output for this example should always be 18 characters! But sometimes this output is less than 18.
24 (base64 characters) multiplied by 6 (bits per base64 character) equals to 144 (bits) divided by 8 (bits per ASCII character) equals to 18 ASCII characters.
The problem is that the output is displayed in plain text; and some characters don't even have a "text representation" and that data will be lost. The next test will show that there are 41 different ASCII characters with no visible output.
<?php
for ($i = 0; $i <= 255; $i++) {
$string2 = chr($i);
echo $i . " = " . $string2 . "<br>";
}
?>
My plan was to decode the base64 string and from the output in ASCII reconvert it to hexadecimal. Now that is not possible because of those 41 characters.
I also tried base_convert but there is no base64 support for it.
You can do this with bin2hex():
Returns an ASCII string containing the hexadecimal representation of str. The conversion is done byte-wise with the high-nibble first.
php > $string1 = 'V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh';
php > echo base64_decode($string1);
Why I can't do this!!!!!
php > echo bin2hex(base64_decode($string1));
57687920492063616e277420646f20746869732121212121
php >
<?php
$string1 = 'V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh';
$binary = base64_decode($string1);
$hex = bin2hex($binary);
echo $hex;
?>
A simple one line solution is:
<?=bin2hex(base64_decode($string1));?>
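If the base64 input comes from outside, it may be worth decoding in strict mode so that malformed input is rejected rather than partially decoded. A minimal sketch, where the function name `base64_to_hex` is hypothetical:

```php
<?php
// Hypothetical helper: decode base64 strictly, then hex-encode the raw bytes.
function base64_to_hex(string $base64): ?string
{
    // Strict mode makes base64_decode() return false when the input
    // contains characters outside the base64 alphabet.
    $binary = base64_decode($base64, true);
    if ($binary === false) {
        return null;
    }
    // Every byte becomes exactly two hex digits, printable or not.
    return bin2hex($binary);
}

echo base64_to_hex('V2h5IEkgY2FuJ3QgZG8gdGhpcyEhISEh'), "\n";
// prints 57687920492063616e277420646f20746869732121212121
```

Because bin2hex() works on bytes, unprintable characters are never lost: the hex string is always twice as long as the decoded data.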

PHP UTF-8 mb_convert_encode and Internet-Explorer

I have been reading about character encoding for a few days, and I want to serve all my pages as UTF-8 for compatibility. But I am stuck converting user input to UTF-8: it works in every browser except Internet Explorer (as always).
I don't know what's wrong with my code; it seems fine to me.
I set the header with the character encoding.
I saved the file as UTF-8 (no BOM).
This only happens when the page is accessed with a $_GET parameter in Internet Explorer: myscript.php?c=äüöß
When I write special characters directly in my page, they are displayed correctly.
This is my code:
// User Input
$_GET['c'] = "äüöß"; // Access URL ?c=äüöß
//--------
header("Content-Type: text/html; charset=utf-8");
mb_internal_encoding('UTF-8');
$_GET = userToUtf8($_GET);
function userToUtf8($string) {
if(is_array($string)) {
$tmp = array();
foreach($string as $key => $value) {
$tmp[$key] = userToUtf8($value);
}
return $tmp;
}
return userDataUtf8($string);
}
function userDataUtf8($string) {
print("1: " . mb_detect_encoding($string) . "<br>"); // Shows: 1: UTF-8
$string = mb_convert_encoding($string, 'UTF-8', mb_detect_encoding($string)); // Convert non UTF-8 String to UTF-8
print("2: " . mb_detect_encoding($string) . "<br>"); // Shows: 2: ASCII
$string = preg_replace('/[\xF0-\xF7].../s', '', $string);
print("3: " . mb_detect_encoding($string) . "<br>"); // Shows: 3: ASCII
return $string;
}
echo $_GET['c']; // Shows nothing
echo mb_detect_encoding($_GET['c']); // ASCII
echo "äöü+#"; // Shows "äöü+#"
The most confusing part is that it reports the string as converted from UTF-8 to ASCII. Can someone tell me why the special characters are not shown correctly and what is wrong here? Or is this a bug in Internet Explorer?
Edit:
If I disable the conversion, it says everything is UTF-8, but the characters still are not shown. They are displayed like "????".
Note: This happens ONLY in Internet Explorer!
Although I prefer URL-encoded strings in the address bar, in your case you can try encoding $_GET['c'] to UTF-8 (note that utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2), e.g.:
$_GET['c'] = utf8_encode($_GET['c']);
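Converting unconditionally would double-encode input that is already UTF-8, so a safer variant checks first and converts only when the bytes are not valid UTF-8. The sketch below assumes that a browser which does not send UTF-8 sends a single-byte Latin encoding instead; the helper name `ensure_utf8` and the ISO-8859-1 fallback are assumptions, not something the question confirms.

```php
<?php
// Hypothetical helper; the ISO-8859-1 fallback is an assumption about
// what the browser sends when the query string is not UTF-8.
function ensure_utf8(string $value, string $fallback = 'ISO-8859-1'): string
{
    // Validate instead of guessing: mb_detect_encoding() can misreport,
    // while mb_check_encoding() answers a yes/no question.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}

// "äüöß" as ISO-8859-1 bytes becomes proper UTF-8 ...
var_dump(bin2hex(ensure_utf8("\xe4\xfc\xf6\xdf"))); // string(16) "c3a4c3bcc3b6c39f"
// ... while input that is already UTF-8 passes through unchanged.
var_dump(bin2hex(ensure_utf8("\xc3\xa4")));         // string(4) "c3a4"
```

Applied to the question's code, this would replace the mb_detect_encoding()/mb_convert_encoding() pair inside userDataUtf8().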
An approach that worked for displaying the characters in IE 11.0.18:
Retrieve the Unicode code point of your character, for example 'ü' = 'U+00FC'.
According to this post, convert it to an HTML entity and decode that to UTF-8.
Decode it using utf8_decode before dumping.
The line of code illustrating this with the 'ü' character is:
var_dump(utf8_decode(html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", 'U+00FC'), ENT_NOQUOTES, 'UTF-8')));
To summarize: for display purposes, go from the Unicode code point to UTF-8, then decode it before displaying it.
Other resources:
a post to retrieve characters' unicode

How to convert a Unicode text-block to UTF-8 (HEX) code point?

I have a Unicode text-block, like this:
ụ
ư
ứ
Ỳ
Ỷ
Ỵ
Đ
Now I want to convert this original Unicode text block into a text block of UTF-8 (hex) byte sequences (see the Hexadecimal UTF-8 column on this page: https://en.wikipedia.org/wiki/UTF-8) with PHP, like this:
\xe1\xbb\xa5
\xc6\xb0
\xe1\xbb\xa9
\xe1\xbb\xb2
\xe1\xbb\xb6
\xe1\xbb\xb4
\xc4\x90
Not like this:
0x1EE5
0x01B0
0x1EE9
0x1EF2
0x1EF6
0x1EF4
0x0110
Is there any way to do it, by PHP?
I have read this topic (PHP: Convert unicode codepoint to UTF-8). But, it is not similar to my question.
I am sorry, I don't know much about Unicode.
I think you're looking for the bin2hex() function:
Convert binary data into hexadecimal representation
And format by prepending \x to each byte (00-FF)
function str_hex_format ($bin) {
return '\x'.implode('\x', str_split(bin2hex($bin), 2));
}
For your sample:
// utf8 encoded input
$arr = ["ụ","ư","ứ","Ỳ","Ỷ","Ỵ","Đ"];
foreach($arr AS $v)
echo $v . " => " . str_hex_format($v) . "\n";
Output:
ụ => \xe1\xbb\xa5
ư => \xc6\xb0
ứ => \xe1\xbb\xa9
Ỳ => \xe1\xbb\xb2
Ỷ => \xe1\xbb\xb6
Ỵ => \xe1\xbb\xb4
Đ => \xc4\x90
Decode example:
$str = str_hex_format("ụưứỲỶỴĐ");
echo $str;
\xe1\xbb\xa5\xc6\xb0\xe1\xbb\xa9\xe1\xbb\xb2\xe1\xbb\xb6\xe1\xbb\xb4\xc4\x90
echo hex2bin(str_replace('\x', "", $str));
ụưứỲỶỴĐ
For more info about escape sequence \x in double quoted strings see php manual.
PHP treats strings as arrays of bytes, regardless of encoding. If you don't need to delimit the UTF-8 characters, then something like this works:
$str='ụưứỲỶỴĐ';
foreach(str_split($str) as $char)
echo '\x'.str_pad(dechex(ord($char)),2,'0',STR_PAD_LEFT);
Output:
\xe1\xbb\xa5\xc6\xb0\xe1\xbb\xa9\xe1\xbb\xb2\xe1\xbb\xb6\xe1\xbb\xb4\xc4\x90
If you need to delimit the UTF8 characters (i.e. with a newline), then you'll need something like this:
$str='ụưứỲỶỴĐ';
foreach(array_slice(preg_split('~~u',$str),1,-1) as $UTF8char){ // split before/after every UTF8 character and remove first/last empty string
foreach(str_split($UTF8char) as $char)
echo '\x'.str_pad(dechex(ord($char)),2,'0',STR_PAD_LEFT);
echo "\n"; // delimiter
}
Output:
\xe1\xbb\xa5
\xc6\xb0
\xe1\xbb\xa9
\xe1\xbb\xb2
\xe1\xbb\xb6
\xe1\xbb\xb4
\xc4\x90
This splits the string into UTF-8 characters using preg_split and the u flag. Since preg_split returns an empty string before the first character and another after the last character, we use array_slice to drop the first and last elements. This can be easily modified to return an array, for example.
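A possible array-returning variant is sketched below. It is not the answer author's code: the function name `utf8_chars_to_hex` is hypothetical, and it uses the PREG_SPLIT_NO_EMPTY flag in place of array_slice to drop the empty leading and trailing pieces.

```php
<?php
// Hypothetical helper: one '\xNN\xNN...' string per UTF-8 character.
function utf8_chars_to_hex(string $str): array
{
    // The u flag splits on character boundaries; PREG_SPLIT_NO_EMPTY
    // discards the empty strings before the first and after the last one.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // preg_split() fails on invalid UTF-8
    }
    $result = [];
    foreach ($chars as $char) {
        // bin2hex() already zero-pads every byte to two hex digits.
        $result[] = '\x' . implode('\x', str_split(bin2hex($char), 2));
    }
    return $result;
}

// "ụĐ" written as explicit bytes so the example does not depend on
// the encoding this file is saved in.
print_r(utf8_chars_to_hex("\xe1\xbb\xa5\xc4\x90"));
```

Joining the result with "\n" reproduces the newline-delimited output shown above.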
Edit:
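Another option treats every byte as a Latin-1 character. Note that it emits \u00XX escapes (\u00e1\u00bb\u00a5...) rather than \xXX, and that utf8_encode() is deprecated as of PHP 8.2:
echo trim(json_encode(utf8_encode('ụưứỲỶỴĐ')),'"');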
The main thing you need to do is tell PHP how to interpret the incoming Unicode characters. Once you do that, you can convert them to UTF-8 and then to hex as needed.
This code fragment takes your example characters as 16-bit Unicode (UCS-2, big-endian), converts them to UTF-8, and then dumps the hex representation of each character.
<?php
// Hex equivalent of "ụưứỲỶỴĐ" in Unicode
$unistr = "\x1E\xE5\x01\xB0\x1E\xE9\x1E\xF2\x1E\xF6\x1E\xF4\x01\x10";
echo " length=" . mb_strlen($unistr, 'UCS-2BE') . "\n";
// Here's the key statement, convert from Unicode 16-bit to UTF-8
$utf8str = mb_convert_encoding($unistr, "UTF-8", 'UCS-2BE');
echo $utf8str . "\n";
for($i=0; $i < mb_strlen($utf8str, 'UTF-8'); $i++) {
$c = mb_substr($utf8str, $i, 1, 'UTF-8');
$hex = bin2hex($c);
echo $c . "\t" . $hex . "\t" . preg_replace("/([0-9a-f]{2})/", '\\\\x\\1', $hex) . "\n";
}
?>
Produces
length=7
ụưứỲỶỴĐ
ụ e1bba5 \xe1\xbb\xa5
ư c6b0 \xc6\xb0
ứ e1bba9 \xe1\xbb\xa9
Ỳ e1bbb2 \xe1\xbb\xb2
Ỷ e1bbb6 \xe1\xbb\xb6
Ỵ e1bbb4 \xe1\xbb\xb4
Đ c490 \xc4\x90

UTF (Chinese char) convert to hexadecimal format in PHP

I am passing my message to an SMS API.
This is the documentation:
Normally Unicode Messages are Arabic and Chinese Message, which are
defined by GSM Standards. Unicode messages are nothing but normal text
type messages but it has to be submitted in HEX form. To submit
Unicode messages following Url to be used.
I tried bin2hex(), but it does not produce the expected output.
$str = '人';
//$str = 'a';
$output = bin2hex($str);
echo $output;
//output
//人 = e4baba ; I would expect '4EBA'
I found a similar solution, but it is in VB.NET. Can anyone convert it?
http://www.supportchain.com/index.php?/Knowledgebase/Article/View/28/7/unable-to-send-sms-with-chinese-character-using-api
The sample I tried, which works:
Example of conversion: a converted to hexadecimal is 0061, 人 converted to hexadecimal is 4EBA
The issue you are facing has to do with encoding. bin2hex() dumps the bytes of the string as stored, which for UTF-8 gives e4baba; the API expects the 16-bit code unit 4EBA, so convert the string to a UTF-16/UCS-2 encoding before converting to hex.
Each of these outputs exactly what you were looking for when I run them:
echo bin2hex(iconv('UTF-8', 'ISO-10646-UCS-2', '人')) . PHP_EOL;
//Outputs 4eba
echo bin2hex(iconv('UTF-8', 'UNICODE-1-1', '人')) . PHP_EOL;
//Outputs 4eba
echo bin2hex(iconv('UTF-8', 'UTF-16BE', '人')) . PHP_EOL;
//Outputs 4eba
Pick whichever one you fancy; UTF-16BE names the byte order explicitly, so it is the least ambiguous choice.
If you want to convert back:
echo iconv('UTF-16BE', 'UTF-8', hex2bin('4eba')) . PHP_EOL;
//outputs 人
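Put together, the conversion might be wrapped as below. This is only a sketch: the names `to_ucs2_hex` and `from_ucs2_hex` are hypothetical, and the uppercase, four-digits-per-character format is inferred from the examples in the question (0061, 4EBA) rather than from the SMS provider's documentation.

```php
<?php
// Hypothetical helpers; uppercase output is assumed from the question's
// examples (0061, 4EBA), not from the provider's documentation.
function to_ucs2_hex(string $utf8): ?string
{
    // UTF-16BE: big-endian 16-bit code units, no byte order mark.
    $utf16 = iconv('UTF-8', 'UTF-16BE', $utf8);
    if ($utf16 === false) {
        return null; // input was not valid UTF-8
    }
    return strtoupper(bin2hex($utf16));
}

function from_ucs2_hex(string $hex): ?string
{
    // Each 16-bit code unit is four hex digits.
    if (strlen($hex) % 4 !== 0 || !ctype_xdigit($hex)) {
        return null;
    }
    $utf8 = iconv('UTF-16BE', 'UTF-8', hex2bin($hex));
    return $utf8 === false ? null : $utf8;
}

echo to_ucs2_hex('a'), "\n";            // prints 0061
echo to_ucs2_hex("\xe4\xba\xba"), "\n"; // 人 as UTF-8 bytes, prints 4EBA
```

Characters outside the Basic Multilingual Plane, such as most emoji, come out as two code units (eight hex digits); whether the API accepts those is something to check with the provider.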
