Decode Base64 from UTF-16 in PHP

I have a string produced by MSSQL that is the Base64 encoding of a UTF-16 string:
SABlAGwAbABvACAAVwBPAHIAbABkAA==
The decoded version is:
Hello WOrld
This .NET-based online tool decodes it correctly: http://www5.rptea.com/base64/ (select UTF-16)
How can I decode this in PHP using the mb_* functions and base64_decode()?
This is my attempt, which returns the wrong characters:
var_dump(mb_convert_encoding(base64_decode('SABlAGwAbABvACAAVwBPAHIAbABkAA=='), "UTF-8", "UTF-16"));
// string(33) "䠀攀氀氀漀 圀伀爀氀搀"

"UTF-16" is big endian, but your text is encoded little endian. Use "UTF-16LE" instead.

Related

How to decode a base64 string and select input charset using php?

For example, it works correctly here: https://www.base64decode.org
I want to decode a Base64 string in PHP and choose the Windows-1252 charset.
I use this code:
base64_decode($encode);
But the result is not shown in the Windows-1252 charset!
Can anyone help me?
base64_decode($encode) only gives you the raw bytes; afterwards you can use iconv() to convert them to the character set you want. You first have to know which character set the $encode data uses.
For example, if the $encode data is UTF-8, use the following:
$decoded = base64_decode($encode);
$text = iconv("UTF-8", "ISO-8859-1", $decoded);
More information: http://php.net/manual/en/function.iconv.php
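If the decoded bytes are Windows-1252 and the page is UTF-8, the conversion runs the other way round. A sketch, assuming the iconv extension; the sample value is made up and built inside the code so that the example is self-contained:

```php
<?php
// Hypothetical sample: the single byte 0x8A is "Š" in Windows-1252.
$encode = base64_encode("\x8A");

// Step 1: Base64 -> raw bytes (still Windows-1252).
$decoded = base64_decode($encode, true);

// Step 2: read those bytes as Windows-1252 and convert them to UTF-8 for output.
$text = iconv('CP1252', 'UTF-8', $decoded);

echo $text, "\n"; // shows Š on a UTF-8 terminal or page
```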

PHP urlencoding issue

My PHP file is UTF-8 encoded and I am trying to URL-encode my data so it can be sent safely to an application, but some characters are encoded incorrectly.
$text = "Š";
$text = urlencode(utf8_decode($text));
echo $text;
This echoes %3F, but according to the URL-encoding reference at w3schools (http://www.w3schools.com/tags/ref_urlencode.asp), "Š" should be converted to %8A. PHP's own documentation does not say which reference it follows. Could this be an encoding/decoding issue, or something else?
utf8_decode() converts from UTF-8 to ISO-8859-1, but Š does not exist in ISO-8859-1, so you get '?' (= %3F), the substitution character.
It does exist in CP1252 (and possibly other code pages), at hexadecimal code 8A. So:
$text = urlencode(iconv('UTF-8', 'CP1252', $text));
should give what you expect. More generally, don't run utf8_decode() on text that may contain characters outside ISO-8859-1.
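The effect of the target encoding can be checked directly. This sketch assumes the iconv extension and writes Š as the escape \u{0160}, so it does not depend on how the source file is saved:

```php
<?php
$text = "\u{0160}"; // "Š" as UTF-8 (bytes C5 A0)

// CP1252 has Š at 0x8A, so the conversion succeeds.
$cp1252 = iconv('UTF-8', 'CP1252', $text);
echo urlencode($cp1252), "\n"; // %8A

// Without any conversion, urlencode() escapes the two UTF-8 bytes.
echo urlencode($text), "\n"; // %C5%A0
```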

Convert UCS-2 file to UTF-8 with PHP

I have a CSV file supplied by a client which has to be parsed and inserted into a database using PHP.
Before inserting the data into the DB, I want to convert it to UTF-8, but I can't seem to find out how.
This is what I got when trying to detect the file's encoding:
$ enca -d -L zh ./artigos.txt
./artigos.txt: Universal character set 2 bytes; UCS-2; BMP
CRLF line terminators
Byte order reversed in pairs (1,2 -> 2,1)
I tried using the iconv function, but it garbles the conversion and the result shows different characters from the originals.
First line of the file (base64 encoded):
IgAwADMAMQAxADkAIgAsACIANwAzADEAMwA0ADYAMgA2ADQAMAAwADEANQAiACwAIgBBAGcAcgBhAGYAYQBkAG8AcgAgAFIAYQBwAGkAZAAgADkAIABIAGUAYQB2AHkAIABEAHUAdAB5ACIALAAiAEEAZwByAGEAZgBvACAAOQAvADgALAAgADkALwAxADAALAAgADkALwAxADIALAAgADkALwAxADQAIgAsACIAMQAxADAAZgBsAHMAIgAsACIAIgAsACIAIgAsACIAIgAsACIAMAAzADEAMQA5AC4AagBwAGcAIgAsACIAIgAsACIAMQAsADIAMAAiACwAIgA1ADkALAA5ADAAIgAsACIAMgAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIARgBhAGwAcwBlACIADQAK
CSV files from Microsoft Excel are generally little-endian encoded (it took me a long time to find that out).
If you want to use them with fgetcsv or similar functions, you should convert the file into UTF-8 first.
I do the following:
$str = file_get_contents($file);
$str = mb_convert_encoding($str, 'UTF-8', 'UCS-2LE');
file_put_contents("converted_".$file, $str);
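The byte order can also be read from the byte order mark instead of being hard-coded. The helper below is a hypothetical sketch (the name utf16_to_utf8 is not from the answer). It assumes mbstring and falls back to little-endian when no BOM is present, which fits the Excel case described here but is only a guess for other sources.

```php
<?php
/**
 * Convert UTF-16/UCS-2 bytes to UTF-8, using the BOM to pick the byte order.
 * Hypothetical helper; defaults to little-endian when there is no BOM.
 */
function utf16_to_utf8(string $bytes): string
{
    $bom = substr($bytes, 0, 2);
    if ($bom === "\xFF\xFE") {          // BOM for little-endian
        $from = 'UTF-16LE';
        $bytes = substr($bytes, 2);
    } elseif ($bom === "\xFE\xFF") {    // BOM for big-endian
        $from = 'UTF-16BE';
        $bytes = substr($bytes, 2);
    } else {
        $from = 'UTF-16LE';             // assumption: no BOM means little-endian
    }
    return mb_convert_encoding($bytes, 'UTF-8', $from);
}

// Usage with in-memory samples ("A," in both byte orders):
echo utf16_to_utf8("\xFF\xFE" . "A\x00,\x00"), "\n"; // A,
echo utf16_to_utf8("\xFE\xFF" . "\x00A\x00,"), "\n"; // A,
```

The result can then be written out with file_put_contents() and parsed with fgetcsv() as described above.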
This seems to work (little-endian), although your sample doesn't include any non-ASCII characters:
$s='IgAwADMAMQAxADkAIgAsACIANwAzADEAMwA0ADYAMgA2ADQAMAAwADEANQAiACwAIgBBAGcAcgBhAGYAYQBkAG8AcgAgAFIAYQBwAGkAZAAgADkAIABIAGUAYQB2AHkAIABEAHUAdAB5ACIALAAiAEEAZwByAGEAZgBvACAAOQAvADgALAAgADkALwAxADAALAAgADkALwAxADIALAAgADkALwAxADQAIgAsACIAMQAxADAAZgBsAHMAIgAsACIAIgAsACIAIgAsACIAIgAsACIAMAAzADEAMQA5AC4AagBwAGcAIgAsACIAIgAsACIAMQAsADIAMAAiACwAIgA1ADkALAA5ADAAIgAsACIAMgAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIARgBhAGwAcwBlACIADQAK';
$t=base64_decode($s);
echo iconv('UCS-2LE', 'UTF-8', substr($t, 0, -1)); // drop the last byte: it is half of an incomplete 2-byte unit
In Python:
One way to encode is:
text -> utf-16-be -> hexadecimal
To convert back, turn the hexadecimal into bytes and then decode them from utf-16-be to text.
Note: ucs-2be is deprecated; use utf-16-be instead.
Decoder:
import binascii
code = '098 ... '
decoded_text = binascii.unhexlify(code).decode('utf-16-be')
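The same decode step can be written in PHP with hex2bin(). This is a sketch assuming mbstring; the hex string is an illustrative sample ("Hi" in UTF-16BE), not the elided value from the answer.

```php
<?php
// Illustrative sample: "Hi" encoded as UTF-16BE, then as hexadecimal.
$code = '00480069';

// hexadecimal -> raw bytes; hex2bin() returns false on malformed input.
$bytes = hex2bin($code);
if ($bytes === false) {
    throw new InvalidArgumentException('Input is not valid hexadecimal');
}

// raw UTF-16BE bytes -> UTF-8 text
$decoded_text = mb_convert_encoding($bytes, 'UTF-8', 'UTF-16BE');

echo $decoded_text, "\n"; // Hi
```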

Is there any alpha numeric string encoder for PHP other than Hex?

I want to encode some binary strings with something like Base64, but using only alphanumeric characters. I know bin2hex could do this, but it makes the encoded string much longer (I tried gzcompress on the strings, but it didn't make much difference).
Is there any other existing encoding method to do this?
http://en.wikipedia.org/wiki/Binary-to-text_encoding
The most used forms of binary-to-text encodings are:
hexadecimal
base64
quoted-printable
uuencoding
yEnc
Ascii85
BinHex
Percent encoding
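None of the encodings listed above is both purely alphanumeric and shorter than hex. One option that is not in the list is Base32 (RFC 4648) with the padding left off: its alphabet is A-Z and 2-7, and it produces 8 characters per 5 bytes where hex produces 10. As far as I know PHP has no built-in Base32 function, so the encoder below is a hand-written sketch with a hypothetical name:

```php
<?php
/**
 * Encode binary data as unpadded Base32 (RFC 4648 alphabet, A-Z and 2-7).
 * Hypothetical helper, not a PHP built-in.
 */
function base32_encode_nopad(string $data): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
    $out = '';
    $buffer = 0; // holds bits that have not been emitted yet
    $bits = 0;   // number of valid bits in $buffer

    $len = strlen($data);
    for ($i = 0; $i < $len; $i++) {
        $buffer = ($buffer << 8) | ord($data[$i]);
        $bits += 8;
        // Emit one character for every complete group of 5 bits.
        while ($bits >= 5) {
            $bits -= 5;
            $out .= $alphabet[($buffer >> $bits) & 31];
        }
        $buffer &= (1 << $bits) - 1; // keep only the leftover bits
    }
    // Pad the final partial group with zero bits on the right.
    if ($bits > 0) {
        $out .= $alphabet[($buffer << (5 - $bits)) & 31];
    }
    return $out;
}

echo base32_encode_nopad('foobar'), "\n"; // MZXW6YTBOI
echo bin2hex('foobar'), "\n";             // 666f6f626172
```

Decoding needs the reverse table lookup and is not shown here.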

encode latin chars using ISO-8859-1 in PHP

Using PHP, I want to URL-encode, as ISO-8859-1, a URL that contains Latin characters such as à.
The encoded string will be used to perform a request to a webservice. So if the request is:
http://www.mywebservice.com?param=à
the encoded string should be:
http://www.mywebservice.com?param=%E0
I've tried using PHP's function urlencode() but it returns the input encoded in UTF-8:
http://www.mywebservice.com?param=%C3%A0
Use utf8_decode before urlencode:
urlencode(utf8_decode("à"))
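utf8_decode() is deprecated as of PHP 8.2. A sketch of the same conversion with mb_convert_encoding(), assuming mbstring is available; the URL is the placeholder from the question:

```php
<?php
$param = "\u{00E0}"; // "à" as UTF-8

// Convert UTF-8 -> ISO-8859-1 first, then percent-encode the single byte 0xE0.
$latin1 = mb_convert_encoding($param, 'ISO-8859-1', 'UTF-8');
$url = 'http://www.mywebservice.com?param=' . urlencode($latin1);

echo $url, "\n"; // http://www.mywebservice.com?param=%E0
```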
