Do I need to sanitize input to file_exists? - php

I can't seem to find a reference. I am assuming the PHP function file_exists uses system calls on Linux and that these are safe for any string that does not contain a \0 character, but I would like to be sure.
Does anyone have (preferably non-anecdotal) information regarding this? Is it vulnerable to injection if I don't check the strings first?

I guess you need to, because the user may enter something like:
../../../somewhere_else/some_file and access a file that they are not allowed to access.
I suggest that you build the absolute path of the file independently in your PHP code and take only the file name from the user with basename(),
or reject any input containing ../ outright. Note that merely stripping it, like:
$escaped_input = str_replace("../","",$input);
is not enough on its own, because an input such as ....// collapses to ../ after the replacement.

It depends on what you're trying to protect against.
file_exists doesn't do any writing to disk, which means that the worst that can happen is that someone gains some information about your file system or the existence of files that you have.
In practice, however, if you're doing something later on with the same file that was previously checked with file_exists, such as including it, you may wish to perform more stringent checks.
I'm assuming that you may be passing arbitrary values, possibly sourced from user input, into this function.
If that is the case, it somewhat depends on why you actually need to use file_exists in the first place. In general, for any filesystem function that the user can pass values directly into, I'd try to filter the string as strictly as possible. This is really just being pedantic and on the safe side, and may be unnecessary in practice.
So, for example, if you only ever need to check the existence of a file in a single directory, you should probably strip out directory delimiters of all sorts.
From personal experience, I've only ever passed user input into a file_exists call for mapping to a controller file, in which case I'd just strip out any character that is not alphanumeric or an underscore.
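As a minimal sketch of that controller-mapping case: the function name find_controller and the directory layout are hypothetical, not something from the question.

```php
<?php
// Hypothetical helper: maps a user-supplied controller name to a file
// inside one fixed directory. Names and layout are assumptions.
function find_controller(string $name, string $controllerDir): ?string
{
    // Keep only letters, digits and underscores. This removes slashes,
    // dots and null bytes, so the result cannot point outside the directory.
    $clean = preg_replace('/[^A-Za-z0-9_]/', '', $name);
    if ($clean === null || $clean === '') {
        return null;
    }

    $path = rtrim($controllerDir, '/') . '/' . $clean . '.php';

    // file_exists only reads metadata; the caller decides what to do next.
    return file_exists($path) ? $path : null;
}
```

Because the filter is a whitelist, there is no need to enumerate dangerous sequences such as ../ one by one.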
UPDATE: reading the comments you recently added: no, there aren't any special characters to worry about, as this isn't executed in a shell. Even \0 should be fine, at least on newer PHP versions (I believe older ones would cut the string before the \0 when it was sent to the underlying filesystem calls).

Related

Is it safe to use (strip_tags, stripslashes, trim) to clear variable that holds URLs

It's quite a pleasure to be posting my first question here :-)
I'm running a URL shortening / redirecting service, written in PHP.
I aim to store and handle only valid URL data within my service, as far as possible.
I noticed that sometimes invalid URL data is handed over to the database, containing invalid characters (like spaces at the end or beginning of the URL).
I decided to make my URL-check mechanism trim, stripslashes and strip_tags the values before storing them.
As far as I can tell, these functions will not remove valid characters that any URL may have.
Kindly correct me or advise me if I'm going in the wrong direction.
Regards..
If you're already trimming the incoming variable, as well as filtering it with the other built in PHP methods, and STILL running into issues, try changing the collation of your table to UTF-8 and see if that helps you get rid of the special characters you mention. (Could you paste a few examples to let us know?)
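For the check described in the question, one possible sketch is to trim the value and then validate it rather than strip characters from it. The function name normalize_url and the restriction to http/https are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: returns a trimmed URL, or null if it is not valid.
function normalize_url(string $input): ?string
{
    // Remove the leading/trailing whitespace mentioned in the question.
    $url = trim($input);

    // filter_var returns false when the value is not a well-formed URL.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    // Assumption: a redirect service only wants http and https targets.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!in_array($scheme, ['http', 'https'], true)) {
        return null;
    }

    return $url;
}
```

Rejecting a bad value keeps the stored data predictable, whereas strip_tags or stripslashes may silently change a URL that contains < or \ characters.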

How to escape input in PHP?

I have a PHP page that accepts input from a form post, but instead of directing that input to a database it is being used to retrieve a file from the file system. What is a good method for escaping a string destined for the file system rather than a database? Is mysql_real_escape_string() appropriate?
If you're using user-provided input to specify a filename or directory, you'll have to make sure that the provided filename/path isn't trying to break "out" of your site's playground.
e.g. having something like
readfile($_GET['filepath']);
will send out ANYTHING on your server that the attacker knows the path for. Even something like
readfile('/path/to/your/site/download/' . $_GET['filepath']);
accomplishes the same, if the user specifies enough '../../../' to get to whatever file they want.
mysql_real_escape_string() is NOT appropriate for this, as you're not doing a database operation. Use appropriate tools for appropriate jobs. In a goofy way, m_r_e_s() is a banana, and you need a giraffe. Something like
readfile('/path/to/your/site/download/' . basename($_GET['filepath']));
would be relatively safe, as basename() will extract only the filename portion of the user-provided path, so even if they pass in ../../../../../etc/passwd, basename will return only passwd.
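A slightly fuller sketch of the basename() approach, with the download directory passed in as a parameter; the function name resolve_download is hypothetical.

```php
<?php
// Hypothetical helper: resolves a requested download to a file that lives
// directly inside $downloadDir, or returns null.
function resolve_download(string $requested, string $downloadDir): ?string
{
    // basename() drops every directory component of the input.
    $name = basename($requested);

    // An empty name or a dot entry is never a downloadable file.
    if ($name === '' || $name === '.' || $name === '..') {
        return null;
    }

    $path = rtrim($downloadDir, '/') . '/' . $name;

    // is_file() also rejects directories and missing files.
    return is_file($path) ? $path : null;
}
```

The returned path can then be handed to readfile(); a null result would typically become a 404 response.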
You only ever need to escape characters that would otherwise be interpreted by your target system. For databases you usually make sure to escape quotes, so you use mysql_real_escape_string or similar. If your target is HTML, you usually use htmlspecialchars to escape the HTML special characters (namely <, > and &). If your target is CSV, you basically only need to make sure line breaks and the CSV separator are escaped.
So depending on your target you can either reuse an existing escape function, define your own, or even go without one. If all you do is dump the input in a single file, then there is not much you need to take care of, as long as you specify the filename and that file is never used (or interpreted) by anything other than your application.
So think of what kind of special characters your target format requires for it to work, and simply escape those. You can usually ignore the rest.
edit:
If you want to use the input as the file path or file name, you can simply decide for yourself how permissive you want to be and which characters you want to support. A simple method would be to replace everything except Latin letters and digits (and maybe some special characters like _ and -) with something else. For example:
preg_replace( '/[^A-Za-z0-9_-]/', '_', $text );

PHP Security Advice on $_GET (combining clean URLs with query string)

I am using "clean" URLs like this:
http://localhost/controller/action/param
I access the parameters with a custom function like this my_get(1), my_get(2), etc...
However there are times where I think I need to combine them with query strings.
For example: If I need parameter values containing paths with several slashes like:
http://localhost/controller/action/param?mypath=foo/bar/qux.jpg
I do that because it would be a little harder to implement if done with clean URL.
Now my question is: when combining a clean URL with a query string, I only intend to allow this character class:
[.&=a-z0-9\/_-]
I was wondering would there be any security issue with it? Should I disallow certain characters?
Don't worry about string formatting, but please validate the path passed... In the example you said: "in the example above, mypath's value will be deleted with unlink();". Well, if you don't validate it, in the worst case an attacker could delete any file on the server's filesystem... ;)
So rather than only validating the string's format with a regex, validate what the path actually points to and make sure it is safe for your environment... :)
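One way to validate the content of such a path, sketched under the assumption that every file the script may touch lives below a single base directory. The function name resolve_inside is hypothetical.

```php
<?php
// Hypothetical helper: resolves $relative against $baseDir and returns the
// real path only if it exists and stays below the base directory.
function resolve_inside(string $relative, string $baseDir): ?string
{
    // realpath() resolves ../ segments and symlinks; it returns false
    // when the path does not exist.
    $base = realpath($baseDir);
    $target = realpath($baseDir . '/' . $relative);
    if ($base === false || $target === false) {
        return null;
    }

    // The resolved target must start with "<base>/" to be inside it.
    $prefix = $base . DIRECTORY_SEPARATOR;
    if (strncmp($target, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $target;
}
```

With this in place, a value such as foo/bar/qux.jpg is accepted when the file exists below the base directory, while ../../etc/passwd resolves outside it and is rejected before unlink() is ever called.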

how to check if a php file is obfuscated?

Is there any way we can check if a PHP file has been obfuscated, using PHP? I was thinking regex maybe (for instance, ionCube's encoded file contains a very long alphabetic string, etc.).
One idea is to check for whitespace. The first thing that an obfuscator will do is to remove extra whitespace. Another thing you can look for is the number of characters per line, as obfuscators will put all the code into a few lines (or even one).
Often, obfuscators initialize very large arrays to translate variables into less meaningful names (e.g. see obfuscator article).
One technique may be to search for these super-large arrays, close to the top of the class/file etc. You may be able to hook xdebug up to look for these. The whole thing of course depends on the obfuscation technique used. Check the source code; there may be patterns they've used that you can search for.
I think you can use token_get_all() to parse the file, then compute some statistics. For example, check the number of function calls (in case the obfuscator uses some eval() string and nothing else) and calculate the average function name length: for obfuscators it will usually be about 3-5 chars, for normal PHP code it should be much bigger. You can also use a dictionary lookup for function/variable names, check for comments, etc. I think if you know all the obfuscator formats that you want to detect, it will be easy.
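A rough sketch of the token_get_all() idea: it collects the average identifier length and the comment count. The function name identifier_stats is hypothetical, and any threshold for calling a file "obfuscated" is left to the caller.

```php
<?php
// Hypothetical helper: basic statistics over the tokens of a PHP source string.
function identifier_stats(string $source): array
{
    $lengths = [];
    $comments = 0;

    foreach (token_get_all($source) as $token) {
        // Single-character tokens such as '(' or ';' are plain strings.
        if (!is_array($token)) {
            continue;
        }
        [$id, $text] = $token;

        if ($id === T_VARIABLE) {
            $lengths[] = strlen($text) - 1; // do not count the leading '$'
        } elseif ($id === T_STRING) {
            $lengths[] = strlen($text);     // function, class and constant names
        } elseif ($id === T_COMMENT || $id === T_DOC_COMMENT) {
            $comments++;
        }
    }

    $count = count($lengths);

    return [
        'identifiers' => $count,
        'avg_length'  => $count > 0 ? array_sum($lengths) / $count : 0.0,
        'comments'    => $comments,
    ];
}
```

A very low average length combined with zero comments is only a hint, since minified or generated code can look the same.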

PHP dealing with files

I'm trying to make a file-based article system, and I want the folders to be my categories and the files inside them to be the actual articles. But sometimes I need some special characters in my folder/file names (\/:*?", and actually I'm only interested in double quotes and the question mark). Is there a way to do the trick... something like &amp; in HTML, or something like that? Thanks.
Short answer: Your operating system could support such file names, but it doesn't seem to.
There isn't a simple way for you to do something easy like &amp; for this. You could store the real filename, or make a conversion table such that something like _____questionmark_____ is converted into that symbol or something silly like that, but then you run into problems when a title contains that particular string.
Fundamentally though, you should store the title separately from the file itself. A database would be an appropriate location.
At a deeper level, if you're asking a question like this, I think it's safe to say that allowing users to specify filenames on your system is likely to be a large security risk.
There are only a few special characters which aren't allowed in file names, so keep your own escape sequence for each of those characters. For instance, replace every '?' with '#quest' before creating the file, and do the reverse when you read it back. Wouldn't that do? By escape sequence I mean some combination of characters that we don't usually type, like '#quest'.
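A small sketch of such a conversion table using strtr(). The function names and the '#quot' and '#hash' sequences are made up for illustration; escaping '#' itself is an addition so that a title already containing '#quest' still survives the round trip.

```php
<?php
// Hypothetical conversion table. '#' is escaped too, so every '#' in an
// encoded name is guaranteed to start one of these sequences.
const TITLE_ESCAPES = [
    '#' => '#hash',
    '?' => '#quest',
    '"' => '#quot',
];

// Turns an article title into a name without '?' or '"'.
function encode_title(string $title): string
{
    // strtr() with an array never rescans text it has already replaced.
    return strtr($title, TITLE_ESCAPES);
}

// Reverses encode_title().
function decode_title(string $name): string
{
    return strtr($name, array_flip(TITLE_ESCAPES));
}
```

Other forbidden characters (such as / or :) could be added to the table in the same way, as long as no sequence is a prefix of another.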
I would recommend using a .htaccess file to pass the "filename" as an argument to your PHP script. Subsequently have the script look up the article in a database lookup table that points to the article file.
