I'm including a remote file with file_get_contents() like so:
function checkData($serial) {
file_get_contents("http://example.com/page.php?somevar=".$serial."&check=1");
return $http_response_header;
}
This remote page performs some basic data manipulation, and looks up the serial number in a database (The input is sanitised and I'm using PDO, so I don't have to worry about SQL injections), and then returns a value in the response header. The input $serial is a get parameter - So completely controlled by the user. I'm wondering if there are any inputs to this function that would lead to undesirable behaviour, for example getting contents of another page other than the one desired.
Thanks in advance.
If the $serial variable is always going to be numeric, you can wrap the value in intval() to ensure it is always an integer and cannot contain other characters usable for path traversal, RFI, etc.
E.g.
file_get_contents("http://example.com/page.php?somevar=".intval($serial)."&check=1");
Alternatively you can use preg_replace to strip unwanted characters, should you need alpha characters also.
http://php.net/manual/en/function.preg-replace.php
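A minimal sketch of the preg_replace approach, assuming serials consist only of ASCII letters and digits (that format is an assumption, adjust the pattern to your real one). The helper name buildCheckUrl is hypothetical. The cleaned value is also passed through rawurlencode() so that whatever is left cannot add extra query parameters.

```php
<?php
// Hypothetical helper: builds the URL from the question with a cleaned serial.
function buildCheckUrl(string $serial): string
{
    // Assumption: valid serials contain only ASCII letters and digits.
    $clean = preg_replace('/[^A-Za-z0-9]/', '', $serial);

    // Encode the value so it stays a single query-string value.
    return 'http://example.com/page.php?somevar=' . rawurlencode($clean) . '&check=1';
}

echo buildCheckUrl('AB12&check=0#x'), "\n";
```

With this, an input that tries to smuggle in "&check=0" is reduced to plain letters and digits before the request is made.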
I have following code:
<?php
$param = $_GET['param'];
echo $param;
?>
when I use it like:
mysite.com/test.php?param=2+2
or
mysite.com/test.php?param="2+2"
it prints
2 2
not
4
I also tried eval, but that didn't work either.
+ is encoded as a space in query strings. To have an actual addition sign in your string, you should use %2B.
However, it should be noted this will not perform the actual addition. I do not believe it is possible to perform actual addition inside the query string.
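To see the decoding in isolation, here is a small sketch using parse_str(), which applies the same URL decoding that PHP uses when it fills $_GET. The variable names are only for illustration.

```php
<?php
// "+" in a raw query string is decoded as a space.
parse_str('param=2+2', $plain);
echo $plain['param'], "\n";

// "%2B" is decoded as a literal plus sign.
parse_str('param=2%2B2', $encoded);
echo $encoded['param'], "\n";

// urlencode() produces the %2B form for you.
echo urlencode('2+2'), "\n";
```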
Now, I would like to stress that you should avoid eval: if eval is your answer, you're probably asking the wrong question. It is a very dangerous construct that can create more problems than it solves, as the manual's note on this function says:
The eval() language construct is very dangerous because it allows
execution of arbitrary PHP code. Its use thus is discouraged. If you
have carefully verified that there is no other option than to use this
construct, pay special attention not to pass any user provided data
into it without properly validating it beforehand.
So everything you wish to pass into eval should be screened against very, very strict criteria: strip out function calls and any other possibly malicious code, and make 100% sure that what you pass into eval is exactly what you need. No more, no less.
A very basic scenario for your problem would be:
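if (!isset($_GET['Param'])){
    $Append = urlencode("2+2");
    header("Location: index.php?Param=".$Append); exit; // stop; the redirected request carries Param
}
// WARNING: this evaluates raw user input; screen it strictly before doing this in real code.
$Code_To_Eval = '$Result = '.$_GET['Param'].';';
eval($Code_To_Eval);
echo $Result;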
The first lines, 1 through 4, only show how to correctly pass a character such as a plus sign; the other lines of code work with the data string. And as #andreiP stated:
Unless I'm not mistaking the "+" is used for URL encoding, so it would
be translated to a %, which further translates to a white space.
That's why you're getting 2 2
This is essentially correct: "+" is decoded as a space, which explains your current output. Also note that using:
echo urldecode($_GET['Param']);
on the encoded value will bring you back to your original output, which is what you want to avoid.
I would strongly suggest looking into an alternative before using what I've posted.
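One possible alternative, sketched under the assumption that only expressions of the form "integer operator integer" need to be supported: match the input against a strict pattern and do the arithmetic yourself, so no user input is ever executed. The function name simpleArithmetic and the four-digit limit are made up for this example.

```php
<?php
// Hypothetical helper: evaluates "<int> <op> <int>" without eval().
// Returns null when the input does not match that exact shape.
function simpleArithmetic(string $expr): ?int
{
    // Operands are limited to four digits so results cannot overflow.
    if (!preg_match('/^\s*(-?\d{1,4})\s*([+\-*])\s*(-?\d{1,4})\s*$/', $expr, $m)) {
        return null;
    }
    $a = (int) $m[1];
    $b = (int) $m[3];
    switch ($m[2]) {
        case '+': return $a + $b;
        case '-': return $a - $b;
        default:  return $a * $b;
    }
}

var_dump(simpleArithmetic('2+2'));
var_dump(simpleArithmetic('phpinfo()'));
```

Called as simpleArithmetic($_GET['param']) with param=2%2B2 in the URL, this returns 4, while anything that is not plain arithmetic is rejected with null.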
I'm a PHP newbie trying to find a way to use parse_str to parse a number of URLs from a database (note: not from the request, they are already stored in a database, don't ask... so _GET won't work)
So I'm trying this:
$parts = parse_url('http://www.jobrapido.se/?w=teknikinformat%C3%B6r&l=malm%C3%B6&r=auto');
parse_str($parts['query'], $query);
return $query['w'];
Please note that here I am just supplying an example URL, in the real application the URL will be passed in as a parameter from the database. And if I do this it works fine. However, I don't understand how to use this function properly, and how to avoid errors.
First of all, here I used "w" as the index to return, because I could clearly see it was in the query. But how do these things work? Is there a set of specific values I can use to get the entire query string? I mean, if I look further, I can see "l" and "r" here as well...
Sure I could extract those too and concatenate the result, but will these value names be arbitrary, or is there a way to know exactly which ones to extract? Of course there's the "q" value, which I originally thought would be the only one I would need, but apparently not. It's not even in the example URL, although I know it's in lots of others.
So how do I do this? Here's what I want:
1. Extract all parts of the query string that give me a readable version of the search string part of the URL (so in the above it would be "teknikinformatör Malmö auto"). Note that I would need to translate the URL encoding to Swedish characters; is there an easy way to do that in PHP?
2. Handle errors so that if the above doesn't work for some reason, the method just returns an empty string and does not break the code. At this point, if I were to use the above with an actual parameter, $url, passed in instead of the example URL, I would get errors, because many of the URLs do not have the "w" parameter, some may be empty fields in the database, some may be malformed, etc. So how can I handle such errors reliably, returning a value if the parsing works and an empty string otherwise?
There seems to be a very strange problem that occurs that I cannot see during debugging. I put this test code in just to see what is going on:
function getQuery($url)
{
try
{
$parts = parse_url($url);
parse_str($parts['query'], $query);
if (isset($query['q'])) {
/* return $query['q']; */
return '';
}
} catch (Exception $e) {
return '';
}
}
Now, obviously in the real code I would want something like the commented out part to be returned. However, the puzzling thing is this:
With this code, as far as I see, every path should lead to returning an empty string. But this does not work - it gives me a completely empty grid in the result page. No errors or anything during debugging, and objects look fine when I step through them during debugging.
However, if I remove everything from this method except return ''; then it works fine - of course the field in the grid where the query is supposed to be is empty, but all the other fields have all the information as they should. So this was just a test. But how is it possible that code that should only be able to return an empty string does not work, while the one that only returns an empty string and does nothing else does work? I'm thoroughly confused...
The meaning of the query parameters is entirely up to the application that handles the URL, so there is no "right" parameter - it might be w, q, or searchquery. You can heuristically search for the most common variables (=guess), or return an array of all arguments. It depends on what you're trying to achieve.
1. parse_str already decodes URL encoding. Note that URL encoding is a way to encode bytes, not characters, so the result depends on which character encoding the application expects. Usually (and in this example query) that should be UTF-8 everywhere, so you should be covered on 1.
2. Test whether the value exists, and if not, return the empty string, like this:
$heuristicFields = array('q', 'w', 'searchquery');
foreach ($heuristicFields as $hf) {
if (isset($query[$hf])) return $query[$hf];
}
return '';
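Your function returns null (not an empty string) when the URL is valid but has no q parameter, and runs into errors (i.e., displays warning or notice messages) when the URL is obviously invalid or has no query part. parse_url and parse_str do not throw exceptions, so the try...catch block has no effect.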
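Putting those points together, a sketch of a version that returns a string on every path might look like this. The list of candidate parameter names is still a guess, as explained above, and the second parameter is an addition for illustration.

```php
<?php
// Sketch: returns the first search-like parameter from a stored URL, or ''.
function getQuery($url, array $candidates = array('q', 'w', 'searchquery'))
{
    if (!is_string($url) || $url === '') {
        return '';   // empty database field, null, etc.
    }
    // With PHP_URL_QUERY, parse_url() returns only the query part:
    // null when there is none, false when the URL is badly malformed.
    $queryString = parse_url($url, PHP_URL_QUERY);
    if (!is_string($queryString)) {
        return '';
    }
    parse_str($queryString, $query);   // also URL-decodes the values
    foreach ($candidates as $key) {
        if (isset($query[$key]) && is_string($query[$key])) {
            return $query[$key];
        }
    }
    return '';
}

echo getQuery('http://www.jobrapido.se/?w=teknikinformat%C3%B6r&l=malm%C3%B6&r=auto'), "\n";
```

The returned value contains the decoded bytes, which are UTF-8 for the example URL.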
It turned out the problem was with Swedish characters - if I used utf8_encode() on the value before returning it, it worked fine.
I would like to know whether code injection (or any other security risk, such as reading memory blocks you weren't supposed to) is possible in the following scenario, where unsanitized data from HTTP GET is used in PHP code as the KEY of an array.
This is supposed to transform letters into their position in the alphabet: a to 1, b to 2, c to 3, and so on. The HTTP GET "letter" variable is supposed to hold letters, but as you can understand, anything can be sent to the server:
HTML:
http://www.example.com/index.php?letter=[anything in here, as dirty it can gets]
PHP:
$dirty_data = $_GET['letter'];
echo "Your letter's order in alphabet is:".Letter2Number($dirty_data);
function Letter2Number($my_array_key)
{
$alphabet = array("a" => "1", "b" => "2", "c" => "3");
// And now we will eventually use HTTP GET unsanitized data
// as a KEY for a PHP array... Yikes!
return $alphabet[$my_array_key];
}
Questions:
1. Do you see any security risks?
2. How can I sanitize HTTP data so that it can be used in code as the KEY of an array?
3. How bad is this practice?
I can't see any problems with this practice. Anything you... errr... get from $_GET is a string, or an array if the client sends letter[]=x, in which case using it as a key just raises an illegal-offset error. It will not pose any security threat whatsoever unless you call eval() on it. Any string can be used as a PHP array key, and it will have no adverse effects whatsoever (although if you use a really long string, obviously this will impact memory usage).
It's not like SQL, where you are building code to be executed later - your PHP code has already been built and is executing, and the only way you can modify the way in which it executes at runtime is by calling eval() or include()/require().
EDIT
Thinking about it there are a couple of other ways, apart from eval() and include(), that this input could affect the operation of the script, and that is to use the supplied string to dynamically call a function/method, instantiate an object, or in variable variables/properties. So for example:
$userdata = $_GET['userdata'];
$userdata();
// ...or...
$obj->$userdata();
// ...or...
$obj = new $userdata();
// ...or...
$someval = ${'a_var_called_'.$userdata};
// ...or...
$someval = $obj->$userdata;
...would be a very bad idea, if you were to do it without sanitizing $userdata first.
However, for what you are doing, you do not need to worry about it.
Any external data received from GET, POST, FILE, etc. should be treated as filthy and sanitized appropriately. How and when you sanitize depends on how the data is going to be used. If you are going to store it in the DB, it needs to be escaped (to avoid SQL injection; see PDO, for example). Escaping is also necessary when user data ends up in an OS command, in eval, or in a file path (think of reading ../../../etc/passwd). If it's going to be displayed back to the user, it needs to be encoded (to avoid HTML injection; see htmlspecialchars, for example).
You don't have to sanitize data for the way you are using it above. In fact, you should only escape for storage and encode for display, but otherwise leave data raw. Of course, you may want to perform your own validation on the data. For example, you may want dirty_data to be in the list of [a, b, c] and if not echo it back to the user. Then you would have to encode it.
Any well-known OS is not going to have a problem even if the user managed to attempt to read an invalid memory address.
1. Presumably this array's contents are meant to be publicly accessible in this way, so no.
2. Run it through array_key_exists()
3. Probably at least a little bad. Maybe there's something that could be done with a malformed multibyte string or something that could trigger some kind of overflow on a poorly-configured server... but that's pure (ignorant) speculation on my part.
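The array_key_exists() suggestion could be sketched like this. The function name letterToNumber and the "Unknown letter" message are made up for the example; the is_string() check is there because a $_GET value can also be an array.

```php
<?php
// Looks up a letter's position; returns null for anything not in the table.
function letterToNumber($letter)
{
    $alphabet = array('a' => '1', 'b' => '2', 'c' => '3');

    // $_GET values can be arrays (?letter[]=x), which are not valid keys.
    if (!is_string($letter) || !array_key_exists($letter, $alphabet)) {
        return null;
    }
    return $alphabet[$letter];
}

$number = letterToNumber(isset($_GET['letter']) ? $_GET['letter'] : '');
echo $number === null
    ? 'Unknown letter'
    : "Your letter's order in alphabet is: " . $number;
```

Because the raw input is never echoed back, nothing needs to be HTML-encoded here.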
Is this enough?
$listing = mysql_real_escape_string(htmlspecialchars($_POST['listing']));
Depends - if you are expecting text, it's just fine, although you shouldn't put the htmlspecialchars in input. Do it in output.
You might want to read this: What's the best method for sanitizing user input with PHP?
You can use the PHP function filter_var().
There is a good tutorial at this link:
http://www.phpro.org/tutorials/Filtering-Data-with-PHP.html
Example of sanitizing an integer:
Sanitizing an integer is simple with the FILTER_SANITIZE_NUMBER_INT filter. This filter strips out all characters except digits and the + and - signs.
It is simple to use and we no longer need to boggle our minds with regular expressions.
<?php
/*** an integer ***/
$int = "abc40def+;2";
/*** sanitize the integer ***/
echo filter_var($int, FILTER_SANITIZE_NUMBER_INT);
?>
The above code produces the output 40+2, because the non-integer characters, as specified by the filter, have been removed.
See:
Best way to stop SQL Injection in PHP
What are the best practices for avoid xss attacks in a PHP site
And sanitise data immediately before it is used in the context it needs to be made safe for. (e.g. don't run htmlspecialchars until you are about to output HTML, you might need the unedited data before then (such as if you ever decide to send content from the database by email)).
Yes. However, you shouldn't use htmlspecialchars on input. Only on output, when you print it.
This is because, it's not certain that the output will always be through html. It could be through a terminal, so it could confuse users if weird codes suddenly show up.
It depends on what you want to achieve. Your version prevents (probably) all SQL injections and strips out HTML (more exactly: Prevents it from being interpreted when sent to the browser). You could (and probably should) apply the htmlspecialchars() on output, not input. Maybe some time in the future you want to allow simple things like <b>.
But there's more to sanitizing, e.g. if you expect an Email Address you could verify that it's indeed an email address.
As has been said, use htmlspecialchars only on output, not on input. Another thing to take into consideration is ensuring the input is as expected. For instance, if you're expecting a number, use is_numeric(); if you're expecting a string of a certain size, or of at least a certain size, check for that. This way you can alert users to any errors they have made in their input.
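As a rough sketch of that kind of validation, PHP's filter extension has validating filters as well as sanitizing ones. The helper names validEmail and validAge and the 0 to 150 range are arbitrary choices for this example.

```php
<?php
// Hypothetical validators; each returns the clean value or null on failure.
function validEmail($value)
{
    $email = filter_var($value, FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

function validAge($value)
{
    // The 0-150 range is an arbitrary example.
    $age = filter_var($value, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => 0, 'max_range' => 150),
    ));
    return $age === false ? null : $age;
}

var_dump(validEmail('user@example.com'), validAge('42'), validAge('abc'));
```

A null result is the cue to show the user an error message instead of continuing.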
What if your listing variable is an array ?
You should sanitize this variable recursively.
Edit:
Actually, with this technique you can avoid SQL injections but you can't avoid XSS.
In order to sanitize an "unreliable" string, I usually combine strip_tags and html_entity_decode.
This way, I avoid all code injection, even if characters are encoded as HTML entities such as &#321;.
$cleaned_string = strip_tags( html_entity_decode( $var, ENT_QUOTES, 'UTF-8' ) );
Then you have to build a recursive function that calls the previous functions and walks through multi-dimensional arrays.
In the end, when you want to use a variable into an SQL statement, you can use the DBMS-specific (or PDO's) escaping function.
$var_used_with_mysql = mysql_real_escape_string( $cleaned_string );
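The recursive function mentioned above could be sketched as follows. The name cleanRecursive is hypothetical, and leaving non-string scalars untouched is a choice made for this example.

```php
<?php
// Applies the strip_tags/html_entity_decode combination to a string,
// or to every string inside a (possibly nested) array.
function cleanRecursive($value)
{
    if (is_array($value)) {
        return array_map('cleanRecursive', $value);
    }
    if (!is_string($value)) {
        return $value;   // integers, null, etc. pass through unchanged
    }
    return strip_tags(html_entity_decode($value, ENT_QUOTES, 'UTF-8'));
}

print_r(cleanRecursive(array(
    'title' => '<b>bold</b>',
    'tags'  => array('&lt;i&gt;x&lt;/i&gt;'),
)));
```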
In addition to sanitizing the data, you should also validate it, for example by checking for a number after you ask for an age, or by making sure that an email address is valid. Besides the security benefit, this also lets you notify your users about problems with their input.
I would assume it is almost impossible to make an SQL injection if the input is definitely a number or definitely an email address so there is an added level of safety.
I'm wondering if there is a quick and easy function to clean get variables in my url, before I work with them.( or $_POST come to think of it... )
I suppose I could use a regex to replace non-permitted characters, but I'm interested to hear what people use for this sort of thing?
The concept of cleaning input never made much sense to me. It's based on the assumption that some kinds of input are dangerous, but in reality there is no such thing as dangerous input; Just code that handles input wrongly.
The crux of it is that if you embed a variable inside some kind of string (code), which is then evaluated by any kind of interpreter, you must ensure that the variable is properly escaped. For example, if you embed a string in an SQL statement, then you must quote and escape certain characters in this string. If you embed values in a URL, then you must escape them with urlencode. If you embed a string within an HTML document, then you must escape it with htmlspecialchars. And so on and so forth.
Trying to "clean" data up front is a doomed strategy, because you can't know - at that point - which context the data is going to be used in. The infamous magic_quotes anti-feature of PHP, is a prime example of this misguided idea.
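To make the point concrete, here is a small sketch in which the same raw value is escaped differently depending on where it ends up. The sample string and the example.com URL are made up; the SQL case is left out because it needs a live database connection.

```php
<?php
$raw = 'Tom & "Jerry" <script>';

// HTML context: escape when printing into a page.
$html = '<p>' . htmlspecialchars($raw, ENT_QUOTES, 'UTF-8') . '</p>';

// URL context: escape when embedding into a query string.
$url = 'http://example.com/search?q=' . urlencode($raw);

echo $html, "\n", $url, "\n";
```

Neither result would be right in the other context, which is why the raw value has to be kept until the point of use.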
I use the PHP input filters and the function urlencode.
Regular expressions can be helpful, and also PHP 5.2.0 introduced a whole filter extension devoted to filtering input variables in different ways.
It's hard to recommend a single solution, because the nature of input variables is so... variable. :-)
I use the below method to sanitize input for MYSQL database use. To summarize, iterate through the $_POST or $_GET array via foreach, and pass each $_POST or $_GET through the DBSafe function to clean it up. The DBSafe could easily be modified for other uses of the data variables (e.g. HTML output etc..).
// Iterate POST array, pass each to DBSafe function to clean up data
foreach ($_POST as $key => $PostVal) {
// Convert POST Vars into regular vars
$$key=DBSafe($PostVal);
// Use above statement to leave POST or GET array intact, and use new individual vars
// OR, use below to update POST or GET array vars
// Update POST vars
$_POST[$key]=DBSafe($PostVal);
}
function DBSafe($InputVal) {
// Returns MySQL safe values for DB update. unquoted numeric values; NULL for empty input; escaped, 'single-quoted' string-values;
if (is_numeric($InputVal)) {
return $InputVal;
} else {
// Escape the value; without this the quoted string built below is open to SQL injection.
// (Needs an open connection from the legacy mysql extension; skip only if magic_quotes already escaped it.)
$InputVal=mysql_real_escape_string($InputVal);
$InputVal=(!$InputVal?'NULL':"'$InputVal'");
return $InputVal;
}
}
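An alternative to quoting values by hand is a PDO prepared statement, where the value is sent separately from the SQL text. The sketch below assumes the pdo_sqlite extension is available and uses an in-memory SQLite database and a made-up "listings" table purely as stand-ins for a real MySQL connection.

```php
<?php
// Assumption: pdo_sqlite is installed; swap the DSN for your MySQL server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE listings (title TEXT)');

// Raw user input, deliberately containing quotes and SQL.
$title = "O'Reilly'); DROP TABLE listings; --";

// The value is bound as a parameter, so it is never parsed as SQL.
$stmt = $pdo->prepare('INSERT INTO listings (title) VALUES (:title)');
$stmt->execute(array(':title' => $title));

$stored = $pdo->query('SELECT title FROM listings')->fetchColumn();
echo $stored, "\n";
```

The stored value comes back byte for byte as it was submitted, so HTML encoding can still be applied later at output time.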