PHP - securing parameters passed in the URL - php

I have an application which makes decisions based on part of URL:
if ( isset($this->params['url']['url']) ) {
$url = $this->params['url']['url'];
$url = explode('/',$url);
$id = $this->Provider->getProviderID($url[0]);
$this->providerName = $url[0]; //set the provider name
return $id;
}
This happens to be in a Cake app, so $this->params['url'] contains an element of the URL. I then use that element to decide which data to use in the rest of my app. My question is...
What's the best way to secure this input so that people can't pass in anything nasty?
thanks,

The other comments here are correct: in AppController's beforeFilter, validate the provider against the providers in your db.
However, if all URLs should be prefixed with a provider string, looking in $this->params['url'] is the wrong way to extract it from the URL.
This kind of problem is exactly what the Router class, and its ability to pass params to an action, is for. Check out the manual page in the cookbook http://book.cakephp.org/view/46/Routes-Configuration. You might try something like:
Router::connect('/:provider/:controller/:action');
You'll also see in the manual the ability to validate the provider param in the route itself by a regex - if you have a small definite list of known providers, you can hard code these in the route regex.
By setting up a route that captures this part of the URL it becomes instantly available in $this->params['provider'], but even better than that is the fact that the html helper link() method automatically builds correctly formatted URLs, e.g.
$html->link('label', array(
'controller' => 'xxx',
'action' => 'yyy',
'provider' => 'zzz'
));
This returns a link like /zzz/xxx/yyy

What are valid provider names? Test if the URL parameter is one, otherwise reject it.
Hopefully you're aware that there is absolutely no way to prevent the user from submitting absolutely anything, including provider names they're not supposed to use.
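A minimal sketch of that whitelist test, assuming a hard-coded list of provider names. The names and the resolveProvider() helper are hypothetical; in the original app the list would more likely come from the providers table.

```php
<?php
// Hypothetical list of known providers; a real app would typically
// load this from the database.
$allowedProviders = array('acme', 'globex', 'initech');

/**
 * Return the provider name if it is on the whitelist, or null otherwise.
 */
function resolveProvider($candidate, array $allowed)
{
    // Strict comparison (third argument) avoids loose type juggling.
    if (is_string($candidate) && in_array($candidate, $allowed, true)) {
        return $candidate;
    }
    return null;
}

$provider = resolveProvider('acme', $allowedProviders);          // 'acme'
$rejected = resolveProvider('../etc/passwd', $allowedProviders); // null
```

Anything that is not an exact match is rejected, so there is no need to enumerate what "nasty" input looks like.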

I'd re-iterate Karsten's comment: define "anything nasty"
What are you expecting the parameter to be? If you're expecting it to be a URL, use a regex to validate URLs. If you're expecting an integer, cast it to an integer. Same goes for a float, boolean, etc.
These PHP functions might be helpful though:
www.php.net/strip_tags
www.php.net/ctype_alpha
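As a rough illustration of validating by expected type, the sketch below uses filter_var() rather than a plain (int) cast, because a cast silently turns '42abc' into 42. The helper names toIntOrNull() and toUrlOrNull() are made up for this example.

```php
<?php
// Validate an integer: returns the int, or null if the input is not one.
function toIntOrNull($raw)
{
    $value = filter_var($raw, FILTER_VALIDATE_INT);
    return ($value === false) ? null : $value;
}

// Validate a URL: returns the string, or null if it is not well formed.
function toUrlOrNull($raw)
{
    $value = filter_var($raw, FILTER_VALIDATE_URL);
    return ($value === false) ? null : $value;
}

$id  = toIntOrNull('42');     // 42 (an int)
$bad = toIntOrNull('42abc');  // null
```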

The parameter will be a provider name: an alphanumeric string. I think the answer is basically to use ctype_alnum() (ctype_alpha() accepts letters only) in combination with a check that the provider name is a valid one, based on other application logic.
thanks for the replies

Also, if you have a known set of allowable URLs, a good idea is to whitelist those allowed URLs. You could even do that dynamically by having a DB table that contains the allowed URLs -- pull that from the database, make a comparison to the URL parameter passed. Alternatively, you could whitelist patterns (say you have allowed domains that can be passed, but the rest of the url changes... You can whitelist the domain and/ or use regexps to determine validity).
At the very least, make sure you use strip_tags on anything you display, and the built-in MySQL escaping functions on anything you put into a query (if using PHP5, parameterizing your SQL queries solves the SQL side of the problem).

It would be more cake-like to use the Sanitize class. In this case Sanitize::escape() or Sanitize::paranoid() seem appropriate.

Related

Is it a bad practice to use a GET parameter (in URL) with no value?

I'm in a little argument with my boss about URLs using GET parameters without value. E.g.
http://www.example.com/?logout
I see this kind of link fairly often on the web, but of course, this doesn't mean it's a good thing. He fears that this is not standard and could lead to unexpected errors, so he'd rather like me to use something like:
http://www.example.com/?logout=yes
In my experience, I've never encountered any problem using empty parameters, and they sometimes make more sense to me (like in this case, where ?logout=no wouldn't make any sense, so the value of "logout" is irrelevant and I would only test for the presence of the parameter server-side, not for its value). (It also looks cleaner.)
However I can't find confirmation that this kind of usage is actually valid and therefore really can't cause any problem ever.
Do you have any link about this?
RFC 2396, "Uniform Resource Identifiers (URI): Generic Syntax", §3.4, "Query Component" is the authoritative source of information on the query string, and states:
The query component is a string of information to be interpreted by
the resource.
[...]
Within a query component, the characters ";", "/", "?", ":", "#",
"&", "=", "+", ",", and "$" are reserved.
RFC 2616, "Hypertext Transfer Protocol -- HTTP/1.1", §3.2.2, "http URL", does not redefine this.
In short, the query string you give ("logout") is perfectly valid.
A value is not required for the key to have any effect. It doesn't make the URL any less valid either; the URL RFC 1738 does not list a value as a required part of the URL.
If you don't really need a value, it's just a matter of preference.
http://example.com/?logout
Is just as much a valid URL as
http://example.com/?logout=yes
The only difference is that if you want to make sure that the "yes" bit was absolutely set, you can check for its value. Like:
if (isset($_GET['logout']) && $_GET['logout'] == "yes") {
    // Only proceed if the value is explicitly set to yes
}
If you just want to know if the logout key was set somewhere in the URL, it would suffice to just list the key with no value assigned to it. You can then check it like this:
if (isset($_GET['logout'])) {
    // Continue regardless of what the value is set to (or if it's left empty)
}
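To see what PHP actually stores for a key without a value, the following sketch feeds both query strings through parse_str(), which parses a string as if it were the query string of a URL. The variable names are only illustrative.

```php
<?php
parse_str('logout', $bare);          // query string from /?logout
parse_str('logout=yes', $withValue); // query string from /?logout=yes

// $bare      is array('logout' => '')
// $withValue is array('logout' => 'yes')
$bareIsSet = isset($bare['logout']);       // true: '' is not null
$bareIsYes = ($bare['logout'] === 'yes');  // false
```

So a presence test with isset() works for both forms, while a value test only passes for the second.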
It's perfectly fine, and won't cause any error. Though, nowadays most frameworks are MVC based, so in the URL you need to mention a controller and an action, so it looks more like /users/logout (BTW, also StackOverflow uses that URL to log users out ;).
The statement that it may cause errors to me sounds like your applications manually access the raw $_GET, and I definitely think that building apps without a framework (which usually provides an MVC stack and a router/dispatcher) is the real dangerous thing here.

Regular Expression to Isolate a String of Characters in the proper context

So, I have a dashboard which I'm currently writing (PHP). The idea is that it is supposed to display data in a database relative to a given url specified. If the user wishes to just grab everything, they simply need to specify "all". If they wish to scrape data for specific URLs AND display everything at once, they will specify additional URLs with the "all" directive.
I discovered a bug, however.
If I have a URL which has the characters "all" in it (such as, say, http://everythingallatonce.com <-- that's just an example - I have no idea if that actually exists), the dashboard's parsing algorithm which takes the instruction given won't work properly. In fact, according to this logic, it will think that the user specified a given URL as well AS the words "all", without actually checking off the "perform scrape?" checkbox, which makes no sense at all (hence, it just throws an exception/dies with an error message).
So far, I just have a function like the following:
function _strExists( $needle, $haystack )
{
$pos = strpos( $haystack, $needle );
return ( $pos !== false );
}
Which I use to detect to see if the word "all" exists in the query, like so:
$fetchEverything = _strExists('all', $urls);
What would be a good work around for something like this, to avoid ambiguity between URLs specified which have "all" in them, and the actual query of all by itself? I'm thinking regular expressions, but I'm not sure...
Also
I have considered just using *, but I'd like to avoid that if possible.
If some value for all is being passed in the URL (i.e. all=1), then you should look in the $_GET superglobal for its existence (i.e. $_GET['all']).
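If the keyword has to stay in the same field as the URLs, one possible approach is to compare whole tokens instead of searching for a substring. This is only a sketch: it assumes the entries are separated by whitespace and/or commas, which the question does not state, and wantsEverything() is a hypothetical name.

```php
<?php
/**
 * Split the raw input into tokens and report whether the bare keyword
 * "all" is one of them. The separator pattern is an assumption; adjust
 * it to the dashboard's real input format.
 */
function wantsEverything($input)
{
    $tokens = preg_split('/[\s,]+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    return in_array('all', $tokens, true);
}

$a = wantsEverything('http://everythingallatonce.com');  // false
$b = wantsEverything('all, http://example.com');         // true
```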

Is this function enough for xss detection?

I found it inside the "symphony CMS" app, it's very small:
https://github.com/symphonycms/xssfilter/blob/master/extension.driver.php#L100
And I was thinking of borrowing it and using it in my own application to sanitize strings containing HTML for display. Do you think it does a good job?
PS: I know there's HTML Purifier, but that thing is huge. And I'd prefer something less permissive, but I still want it to be efficient.
I've been testing it against strings from this page: http://ha.ckers.org/xss.html. But it fails against "XSS locator 2". Not sure how anyone could use that string to hack a site though :)
No, I wouldn’t use it. There are many different attacks that all depend on the context the data is inserted into. One single function would not cover them all. If you take a close look, there are actually just five patterns:
// Set the patterns we'll test against
$patterns = array(
// Match any attribute starting with "on" or xmlns
'#(<[^>]+[\x00-\x20\"\'\/])(on|xmlns)[^>]*>?#iUu',
// Match javascript:, livescript:, vbscript: and mocha: protocols
'!((java|live|vb)script|mocha):(\w)*!iUu',
'#-moz-binding[\x00-\x20]*:#u',
// Match style attributes
'#(<[^>]+[\x00-\x20\"\'\/])style=[^>]*>?#iUu',
// Match unneeded tags
'#</*(applet|meta|xml|blink|link|style|script|embed|object|iframe|frame|frameset|ilayer|layer|bgsound|title|base)[^>]*>?#i'
);
Nothing else is tested. Besides attacks that these tests don’t detect (false negative), it could also report some input mistakenly as an attack (false positive).
So instead of trying to detect XSS attacks, just make sure to use proper sanitizing.
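For the simplest context, plain text placed inside HTML, escaping on output might look like the sketch below. It assumes the string does not need to keep any markup (if it does, a whitelist-based filter is needed instead), and the helper name e() is made up.

```php
<?php
// Escape a value for output inside HTML text or a quoted attribute.
function e($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

$safe = e('<script>alert("x")</script>');
// $safe is '&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;'
```

Other contexts (URLs, JavaScript, CSS) need their own escaping rules, which is the point about context made above.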
I think it does a good job for testing strings, at least that's what I can say according to my tests.

Should you verify parameter types in PHP functions?

I'm used to the habit of checking the type of my parameters when writing functions. Is there a reason for or against this? As an example, would it be good practice to keep the string verification in this code or remove it, and why?
function rmstr($string, $remove) {
if (is_string($string) && is_string($remove)) {
return str_replace($remove, '', $string);
}
return '';
}
rmstr('some text', 'text');
There are times when you may expect different parameter types and run different code for them, in which case the verification is essential, but my question is if we should explicitly check for a type and avoid an error.
Yes, it's fine. However, PHP is not strongly typed to begin with, so I think this is not very useful in practice.
Additionally, if someone passes something other than a string, an exception is more informative. I'd avoid just returning an empty string at the end, because nothing about a call like rmstr(array, object) explains why it returns an empty string.
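A sketch of the exception-based variant of the function from the question (the error message wording is arbitrary):

```php
<?php
function rmstr($string, $remove)
{
    // Fail loudly on a wrong type instead of returning '' silently.
    if (!is_string($string) || !is_string($remove)) {
        throw new InvalidArgumentException('rmstr() expects two strings');
    }
    return str_replace($remove, '', $string);
}

$result = rmstr('some text', 'text'); // 'some '
```

The caller can now tell a genuinely empty result apart from a call made with the wrong argument types.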
My opinion is that you should perform such verification if you are accepting input from the user. If those strings were not accepted from the user or are sanitized input from the user, then doing verification there is excessive.
As for me, type checking matters for data received from the user, at the top level of abstraction. After that, when you call most of your functions, you should already know the types and shouldn't check them in every method. Doing so hurts performance and readability.
Note: you can document which types are allowed for your functions' arguments with phpDoc.
It seems local folks understood this question as "Should you verify parameters" where it was "Should you verify parameter types", and made nonsense answers and comments out of it.
Personally, I never check operand types and have never had any trouble because of it.
It depends which code you produce. If it's actually production code, you should ensure that your function is working properly under any circumstances. This includes checking that parameters contain the data you expect. Otherwise throw an exception or have another form of error handling (which your example is totally missing).
If it's not for production use and you don't need to code defensively, you can ignore anything and follow the garbage-in-garbage-out principle (or the three shit principle: code shit, process shit, get shit).
In the end it is all about matching expectations: If you don't need your function to work properly, you don't need to code it properly. If you are actually relying on your code to work precisely, you even need to validate input data per each unit (function, class).

How to deal with question mark in url in php single entry website

I'm dealing with two question marks in a single entry website.
I'm trying to use urlencode to handle it.
The original URL:
'search.php?query='.quote_replace(addmarks($search_results['did_you_mean'])).'&search=1'
I want to use it in the single entry website:
'index.php?page='.urlencode('search?query='.quote_replace(addmarks($search_results['did_you_mean'])).'&search=1')
It doesn't work, and I don't know if I must use urldecode and where I can use it also.
Why not just rewrite it to become
index.php?page=search&query=...
mod_rewrite will do this for you if you use the [QSA] (query string append) flag.
http://wiki.apache.org/httpd/RewriteQueryString
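On the PHP side, a link in that flat format can be built with http_build_query(), which URL-encodes each value so no manual urlencode() call is needed. In this sketch the search term is a placeholder standing in for the value taken from $search_results['did_you_mean'].

```php
<?php
// Placeholder search term for illustration only.
$term = 'hello world';

$url = 'index.php?' . http_build_query(
    array('page' => 'search', 'query' => $term, 'search' => 1),
    '',
    '&' // explicit separator, independent of the arg_separator.output ini setting
);
// $url is 'index.php?page=search&query=hello+world&search=1'
```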
$_SERVER['QUERY_STRING'] will give you everything after the first "?" in a URL.
From here you can parse using "explode" or common string functions.
Example:
http://xxx/info.php?test=1?test=2&test=3
$_SERVER['QUERY_STRING'] =>test=1?test=2&test=3
list($localURL, $remoteURL) = explode("?", $_SERVER['QUERY_STRING'], 2); // limit 2: split on the first "?" only
$localURL => 'test=1'
$remoteURL => 'test=2&test=3'
Hope this helps
I would suggest changing the logic of the server code to handle a simpler query form. The current approach is probably going to lead you nowhere in the very near future.
Use
index.php?page=search&query=...
as your query format, but do not rewrite it with mod_rewrite into the format you first wanted just to satisfy your current application logic. Handle it with some better logic on the server side instead. Write some ifs and thens, switches and cases ... but do not try to put the logic of the application into your URLs. It will give you really awkward URLs, and you'll soon see that there is not much room in that layer to handle all the logic you will need. :)
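One way such server-side handling could look is a small lookup table keyed on the page parameter. This is a sketch under assumptions: the page names, file names and the resolvePage() helper are all hypothetical.

```php
<?php
/**
 * Map the "page" parameter to a script, falling back to a default.
 * Only names listed in $pages are ever accepted.
 */
function resolvePage(array $query)
{
    $pages = array(
        'home'   => 'home.php',
        'search' => 'search.php',
    );
    $page = isset($query['page']) ? $query['page'] : 'home';
    return (is_string($page) && isset($pages[$page])) ? $pages[$page] : 'home.php';
}

$script = resolvePage(array('page' => 'search', 'query' => 'foo')); // 'search.php'
```

In the entry script this would be called with $_GET, and the remaining parameters (query, search) stay ordinary query-string values that the included script reads itself.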
