Is it safe to pass raw base64 encoded strings via GET parameters?
There are additional base64 specs. (See the table here for specifics.) But essentially you need 65 characters to encode: 26 lowercase + 26 uppercase + 10 digits = 62.
You need two more ['+', '/'] and a padding char '='. But none of them are URL friendly, so just use different characters for them and you're set. The standard replacements from the chart above are ['-', '_'], but you could use other characters as long as you decode them the same way and don't need to share the data with others.
I'd recommend just writing your own helpers, like these from the comments on the PHP manual page for base64_encode:
function base64_url_encode($input) {
return strtr(base64_encode($input), '+/=', '._-');
}
function base64_url_decode($input) {
return base64_decode(strtr($input, '._-', '+/='));
}
No, you would need to URL-encode it, since base64 strings can contain the "+", "=" and "/" characters, which could alter the meaning of your data (a "/", for example, can make the value look like a sub-folder).
Valid base64 characters are below.
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=
@joeshmo Or, instead of writing a helper function, you could just urlencode the base64 encoded string. This achieves the same goal as your helper functions, without the need for two extra functions.
$str = 'Some String';
$encoded = urlencode( base64_encode( $str ) );
$decoded = base64_decode( urldecode( $encoded ) );
Introductory note: I'm inclined to post a few clarifications since some of the answers here were a little misleading (if not incorrect).
The answer is NO, you cannot simply pass a base64 encoded parameter within a URL query string since plus signs are converted to a SPACE inside the $_GET global array. In other words, if you sent test.php?myVar=stringwith+sign to
//test.php
print $_GET['myVar'];
the result would be:
stringwith sign
The easy way to solve this is to simply urlencode() your base64 string before adding it to the query string to escape the +, =, and / characters to %## codes.
For instance, urlencode("stringwith+sign") returns stringwith%2Bsign
When you process the action, PHP takes care of decoding the query string automatically when it populates the $_GET global.
For example, if I sent test.php?myVar=stringwith%2Bsign to
//test.php
print $_GET['myVar'];
the result would be:
stringwith+sign
You do not want to urldecode() the returned $_GET string as +'s will be converted to spaces.
In other words if I sent the same test.php?myVar=stringwith%2Bsign to
//test.php
$string = urldecode($_GET['myVar']);
print $string;
the result is an unexpected:
stringwith sign
It would be safe to rawurldecode() the input, however, it would be redundant and therefore unnecessary.
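The behaviour described above can be reproduced without a web server. This is only a sketch: it assumes parse_str() decodes a query string the same way PHP does when it fills $_GET, and the variable names are made up for the illustration.

```php
<?php
// parse_str() stands in for the decoding PHP performs when populating $_GET.
$raw = 'stringwith+sign';

// Sent unencoded: the '+' is read as an encoded space.
parse_str('myVar=' . $raw, $unencoded);

// Sent through urlencode(): the '+' travels as %2B and survives.
parse_str('myVar=' . urlencode($raw), $encoded);

// Decoding a second time turns the restored '+' back into a space.
$doubleDecoded = urldecode($encoded['myVar']);

echo $unencoded['myVar'], PHP_EOL;  // stringwith sign
echo $encoded['myVar'], PHP_EOL;    // stringwith+sign
echo $doubleDecoded, PHP_EOL;       // stringwith sign
```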
Yes and no.
The basic charset of base64 may in some cases collide with traditional conventions used in URLs. But many base64 implementations allow you to change the charset to match URLs better, or even come with a URL-safe variant (like Python's urlsafe_b64encode()).
Another issue you may face is URL length, or rather the lack of a defined limit. Because the standards do not specify a maximum length, browsers, servers, libraries and other software working with HTTP may each define their own limits.
Here is a base64url encoder you can try out; it is just an extension of joeshmo's code above.
function base64url_encode($data) {
return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}
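function base64url_decode($data) {
// str_pad() expects the target length, so round the length up to a multiple of 4.
$padded = str_pad($data, strlen($data) + (4 - strlen($data) % 4) % 4, '=', STR_PAD_RIGHT);
return base64_decode(strtr($padded, '-_', '+/'));
}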
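A quick round-trip check of the base64url idea, as a sketch rather than a reference implementation. The function names b64url_encode and b64url_decode are hypothetical, chosen so they don't clash with the helpers above. The two input bytes were picked because their standard base64 form is "+/8=", which contains every character that causes trouble in a URL.

```php
<?php
// Hypothetical helper: standard base64, then swap in the URL-safe
// alphabet and drop the '=' padding.
function b64url_encode(string $data): string
{
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

// Hypothetical helper: restore the padding to a multiple of 4,
// swap the alphabet back, then decode.
function b64url_decode(string $data): string
{
    $padded = $data . str_repeat('=', (4 - strlen($data) % 4) % 4);
    return base64_decode(strtr($padded, '-_', '+/'));
}

$binary = "\xfb\xff";

echo base64_encode($binary), PHP_EOL;   // +/8=
echo b64url_encode($binary), PHP_EOL;   // -_8
var_dump(b64url_decode(b64url_encode($binary)) === $binary);  // bool(true)
```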
I don't think that this is safe because e.g. the "=" character is used in raw base 64 and is also used in differentiating the parameters from the values in an HTTP GET.
If you have the sodium extension installed and need to encode binary data, you can use the sodium_bin2base64 function, which lets you select the URL-safe variant.
For example, encoding can be done like this:
$string = sodium_bin2base64($binData, SODIUM_BASE64_VARIANT_URLSAFE);
and decoding:
$result = sodium_base642bin($base64String, SODIUM_BASE64_VARIANT_URLSAFE);
For more info about usage, check out php docs:
https://www.php.net/manual/en/function.sodium-bin2base64.php
https://www.php.net/manual/en/function.sodium-base642bin.php
In theory, yes, as long as you don't exceed the maximum URL and/or query string length for the client or server.
In practice, things can get a bit trickier. For example, it can trigger an HttpRequestValidationException on ASP.NET if the value happens to contain an "on" and you leave in the trailing "==".
For a URL-safe encoding like base64.urlsafe_b64encode(...) in Python, the code below works reliably for me:
function base64UrlSafeEncode(string $input)
{
return str_replace(['+', '/'], ['-', '_'], base64_encode($input));
}
Related
UPDATE: Please ignore this question. It appears that md5 is not returning a result because I pass the URL through filter_var($url, FILTER_SANITIZE_URL), and it looks like FILTER_SANITIZE_URL doesn't work for foreign characters.
I have a problem where I want to get a hash from URLs e.g
https://ko.wikipedia.org/wiki/추간판_탈출증
The URL is provided by the user in a form, so I assume it's already UTF-8 since my website is UTF-8.
However, the above cannot be used with md5() as it returns an empty result. Which PHP function should I use to convert it to something like the URL below, where md5() can be used?
https://ko.wikipedia.org/wiki/%EC%B6%94%EA%B0%84%ED%8C%90_%ED%83%88%EC%B6%9C%EC%A6%9D
I tried iconv, htmlspecialchars and htmlentities, and I cannot seem to find the right function to convert the strings.
You can use md5 directly to hash the whole URL, as below:
echo md5('https://ko.wikipedia.org/wiki/추간판_탈출증');
Which gives output as :
26eb333445f4e154f8ecb76e7c2ac858
UPDATED :
As Per w3schools FILTER_SANITIZE_URL
The FILTER_SANITIZE_URL filter removes all illegal URL characters from
a string.
This filter allows all letters, digits and
$-_.+!*'(),{}|\^~[]`"><#%;/?:@&=
The function you are looking for is rawurlencode. However you will have to extract the part of the url you want to encode or the whole url will be encoded.
$encoded = rawurlencode('추간판_탈출증');
// value of $encoded is now '%EC%B6%94%EA%B0%84%ED%8C%90_%ED%83%88%EC%B6%9C%EC%A6%9D'
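Putting the pieces together, here is a minimal sketch of encoding only the last path segment and then hashing the result. It assumes the script file is saved as UTF-8 and that only the final segment contains non-ASCII characters. The variable names are illustrative.

```php
<?php
$url = 'https://ko.wikipedia.org/wiki/추간판_탈출증';

// Split at the last '/' so the scheme, host and earlier path stay untouched.
$pos = strrpos($url, '/');
$prefix = substr($url, 0, $pos + 1);
$lastSegment = substr($url, $pos + 1);

// rawurlencode() percent-encodes the UTF-8 bytes and leaves '_' alone.
$encoded = $prefix . rawurlencode($lastSegment);
$hash = md5($encoded);

echo $encoded, PHP_EOL;
// https://ko.wikipedia.org/wiki/%EC%B6%94%EA%B0%84%ED%8C%90_%ED%83%88%EC%B6%9C%EC%A6%9D
echo $hash, PHP_EOL;  // a 32-character hex digest
```

Note that the hash of the encoded URL differs from the hash of the raw UTF-8 URL, so pick one form and use it consistently.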
Use the urlencode() function for the conversion.
Check the PHP documentation for the URL encoding functions for details.
I have a string such as this - Panamá. I need to convert this string to Panam\xE1 so it's readable in a JavaScript file I'm generating using PHP.
Is there a function to encode this in PHP? Any ideas would be appreciated.
My rule is,
If you try to encode or escape data using preg_replace or
using massive mapping arrays or str_replace, STOP you are probably doing it wrong.
All it takes is one missed or erroneous mapping (and you WILL miss some mappings) and you end up with code that doesn't work in all cases and that corrupts your data in some cases. Whole libraries have already been written that are dedicated to doing the translations for you (e.g. iconv), and for escaping data you should use the proper PHP function.
If you plan on outputting the data to a browser (the fact that you want to encode for JavaScript suggests this), then I suggest using UTF-8 encoding. If your data is in latin-1, use the utf8_encode function (deprecated as of PHP 8.2; iconv or mb_convert_encoding can do the same conversion).
Whether your PHP string contains ASCII characters or not, to send any data from PHP to JS you should ALWAYS use the json_encode function.
PHP code
$your_encoding = 'latin1';
$panama = "Panamá";
//Get your data in utf8 if it isnt already
$panama = iconv($your_encoding, "utf-8", $panama);
$panama_encoded = json_encode($panama);
echo "var js_panama = " . $panama_encoded . ";";
JS Output
var js_panama = "Panam\u00e1";
Even though JSON supports unicode, it may not be compatible with your non UTF-8 javascript file. This is not a problem because the json_encode PHP function will escape unicode characters by default.
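To make the escaping behaviour concrete, here is a small sketch. It assumes the iconv extension is available, as in the code above, and the variable names are only for illustration. The last call shows why the conversion step matters: json_encode() rejects a raw latin-1 byte because it is not valid UTF-8.

```php
<?php
// "Panamá" as latin-1 bytes (á is the single byte 0xE1).
$latin1 = "Panam\xE1";

// Convert to UTF-8 first, as in the answer above.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

// Default behaviour: non-ASCII characters are escaped as \uXXXX.
$escaped = json_encode($utf8);

// Opt out of the escaping only if the JS file is served as UTF-8.
$unescaped = json_encode($utf8, JSON_UNESCAPED_UNICODE);

// Skipping the conversion fails: the input is not valid UTF-8.
$invalid = json_encode($latin1);

echo 'var js_panama = ' . $escaped . ';', PHP_EOL;  // var js_panama = "Panam\u00e1";
var_dump($invalid);                                 // bool(false)
```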
Assuming that your input is in the latin-1 encoding then ord and dechex will do what you want:
$result = preg_replace_callback(
'/[\x80-\xff]/',
function($match) {
return '\x'.dechex(ord($match[0]));
},
$input);
If your input is in any other encoding then you would need to know what encoding that is and adapt the solution accordingly. Note that in this case it would not be possible to use specifically the \x## notation in the JS output in all cases.
This should work for you:
$str = "Panamá";
$str = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
$utf = iconv('UTF-8', 'UCS-4', current($m));
return sprintf("\x%s", ltrim(strtoupper(bin2hex($utf)), "0"));
}, $str);
echo $str;
Output (Source Code):
Panam\xE1
I am using base64_encode to send a numeric id in a URL: base64_encode($list_post['id']). Up to 99 it works fine, but above 99 it produces what looks like a wrongly encoded string.
The last character of the encoded string is = (equal sign), but when the number is greater than 99, for example 100, there is no = at the end.
Take a look at how padding in base64 works: http://en.wikipedia.org/wiki/Base64#Padding
The padding (the "=" character) is not always needed and in some implementations is not even mandatory.
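A short sketch of why the "=" disappears at 100: base64 works on groups of three input bytes, so a three-character id fills the group exactly and needs no padding. The ids here are sample values only.

```php
<?php
// Encode ids of different lengths and compare the padding.
$encodedIds = [];
foreach (['9', '99', '100', '1000'] as $id) {
    $encodedIds[$id] = base64_encode($id);
}

foreach ($encodedIds as $id => $b64) {
    echo $id, ' => ', $b64, PHP_EOL;
}
// 9 => OQ==
// 99 => OTk=
// 100 => MTAw
// 1000 => MTAwMA==
```

All four strings decode back to the original id, so the missing "=" is not an error.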
EDIT: ok from your comments I see that you are using the base64 encoded string in a URL like this:
http://example.com/path/OTC=
The base64 encoding includes chars that have a special meaning in URLs so you need to use a slightly modified function (https://stackoverflow.com/a/5835352/2737514):
function base64_url_encode($input) {
return strtr(base64_encode($input), '+/=', '-_,');
}
function base64_url_decode($input) {
return base64_decode(strtr($input, '-_,', '+/='));
}
However, since your code works for some numbers, maybe there is a problem with the .htaccess not parsing the URL correctly, or with the PHP code that interprets the URL. I can't help more than this without seeing some other code.
It seems to be working fine for me.
Can you please test with the following code:
echo base64_encode(101);
echo base64_decode(base64_encode(101));
Base64-encoded data takes about 33% more space than the original data.
So these numbers shouldn't be a problem.
I need to get values in the url as it is
ex:
http://www.example.com/index?url=1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==
but the variable url gives me the value "1 LY2ePh1pjX4tjZ4 GS393Y2pjd16Cbq63T3tbfzMzd16CarA=="
even though I expected "1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA=="
Can anyone help me with this, or explain the reason?
You need to encode certain characters if you want to send them in a URL. For further reference, I suggest you read this page. It seems that the URL you are getting isn't being encoded properly. If the URL is coming from your site, then I would suggest you encode it properly.
In PHP, there is a function called urlencode, which may help you with this task.
A short explanation
URLs can only be sent over the internet using the ASCII character set. If you want to send characters that are outside this set, you need to encode them. URL encoding replaces unsafe ASCII characters with % followed by two hexadecimal digits corresponding to the character values in the ISO-8859-1 character set.
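If the link is generated on your own site, one way to get the encoding right is to let PHP build the query string. This is a sketch only: it assumes http_build_query() with its default settings, and parse_str() stands in for what the receiving script would see in $_GET.

```php
<?php
$value = '1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==';

// http_build_query() percent-encodes '+' and '=' inside the value.
$query = http_build_query(['url' => $value]);
$link = 'http://www.example.com/index?' . $query;

// Simulate the receiving side: the value comes back unchanged.
parse_str($query, $received);

echo $link, PHP_EOL;
// http://www.example.com/index?url=1%2BLY2ePh1pjX4tjZ4%2BGS393Y2pjd16Cbq63T3tbfzMzd16CarA%3D%3D
var_dump($received['url'] === $value);  // bool(true)
```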
The client sending the request apparently isn't URL encoding the value correctly. You can re-encode it after it's being decoded like this:
urlencode($_GET["url"])
It converts %2B to a space.
The parameter you sent is wrong, it should have been encoded like so..
<?php
echo '<a href="http://www.example.com/index?url=', urlencode('1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA=='), '">';
?>
I have added the encoding correctly now. It converts == correctly, and the + sign is encoded to %2B correctly, but in the decoding process it is converted to a space.
As it seems that you’re having a Base-64 value there: You can use the URL safe alphabet for Base-64 that uses - and _ instead of + and / respectively:
$base64 = "1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==";
// plain Base-64 to URL safe Base-64
$base64_safe = strtr($base64, '+/', '-_');
// URL safe Base-64 to plain Base-64
$base64 = strtr($base64_safe, '-_', '+/');
And if you know the length of the data, you can also omit the = padding:
rtrim($base64, '=')
In looking at URL safe base 64 encoding, I've found it to be a very non-standard thing. Despite the copious number of built in functions that PHP has, there isn't one for URL safe base 64 encoding. On the manual page for base64_encode(), most of the comments suggest using that function, wrapped with strtr():
function base64_url_encode($input)
{
return strtr(base64_encode($input), '+/=', '-_,');
}
The only Perl module I could find in this area is MIME::Base64::URLSafe (source), which performs the following replacement internally:
sub encode ($) {
my $data = encode_base64($_[0], '');
$data =~ tr|+/=|\-_|d;
return $data;
}
Unlike the PHP function above, this Perl version drops the '=' (equals) character entirely, rather than replacing it with ',' (comma) as PHP does. Equals is a padding character, so the Perl module replaces them as needed upon decode, but this difference makes the two implementations incompatible.
Finally, the Python function urlsafe_b64encode(s) keeps the '=' padding around, prompting someone to put up this function to remove the padding which shows prominently in Google results for 'python base64 url safe':
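from base64 import urlsafe_b64encode, urlsafe_b64decode
def uri_b64encode(s):
    # s is bytes (Python 3); drop the trailing '=' padding and return a str
    return urlsafe_b64encode(s).rstrip(b'=').decode('ascii')
def uri_b64decode(s):
    # re-add only the padding that is missing (0 to 2 characters for valid input)
    return urlsafe_b64decode(s + '=' * (-len(s) % 4))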
The desire here is to have a string that can be included in a URL without further encoding, hence the ditching or translation of the characters '+', '/', and '='. Since there isn't a defined standard, what is the right way?
There does appear to be a standard: RFC 3548, Section 4, Base 64 Encoding with URL and Filename Safe Alphabet (since obsoleted by RFC 4648, whose Section 5 keeps the same alphabet):
This encoding is technically identical
to the previous one, except for the
62:nd and 63:rd alphabet character, as
indicated in table 2.
+ and / should be replaced by - (minus) and _ (understrike) respectively. Any incompatible libraries should be wrapped so they conform to RFC 3548.
Note that this requires that you URL encode the (pad) = characters, but I prefer that over URL encoding the + and / characters from the standard base64 alphabet.
I don't think there is a right or wrong way. But the most popular encoding is
'+/=' => '-_.'
This is widely used by Google and Yahoo (they call it Y64). Most of the URL-safe encoders I have used in Java and Ruby support this character set.
I'd suggest running the output of base64_encode through urlencode. For example:
function base64_encode_url( $str )
{
return urlencode( base64_encode( $str ) );
}
If you're asking about the correct way, I'd go with proper URL-encoding as opposed to arbitrary replacement of characters. First base64-encode your data, then further encode special characters like "=" with proper URL-encoding (i.e. %<code>).
Why don't you try wrapping it in a urlencode()? Documentation here.