I usually use PHP's trim() function to check that data is not empty. For MySQL I also use mysql_real_escape_string(). Is this enough, or do I need to perform additional checks?
To check if data is "empty", you can use empty().
Yes, to escape data you use mysql_real_escape_string() for MySQL. By default, trim() is used to trim trailing and leading whitespace, if used without additional parameters.
Is it so hard to check in the manual what each function does?
I usually do this:
$foo = isset($_POST['bar']) ? trim($_POST['bar']) : '';
if (!empty($foo))
$db->query("UPDATE table SET foo = '".mysql_real_escape_string($foo)."'");
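One caveat with the trim()/empty() pattern above: empty() treats the string "0" as empty, so a legitimate value of 0 would be skipped. A minimal sketch of a stricter check, assuming a hypothetical helper named postedText() (not part of PHP or of the answers here):

```php
<?php
// Hypothetical helper: returns the trimmed value, or null when nothing usable was sent.
function postedText(array $post, string $key): ?string
{
    // Reject missing keys and non-string input such as arrays sent as bar[]=x.
    if (!isset($post[$key]) || !is_string($post[$key])) {
        return null;
    }
    $value = trim($post[$key]);
    // Compare against '' rather than calling empty(), so that "0" is kept.
    return $value === '' ? null : $value;
}

var_dump(empty('0'));                          // bool(true): empty() rejects "0"
var_dump(postedText(['bar' => ' 0 '], 'bar')); // string(1) "0"
var_dump(postedText(['bar' => '   '], 'bar')); // NULL
```

Whether "0" should count as a value depends on the field, so treat this as one option rather than a rule.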
if (!empty($_POST['data']) /* && other checks */) {
    // Success
    $data = mysql_real_escape_string($_POST['data']);
    $sql = "SELECT * FROM users WHERE data = '$data'";
    mysql_query($sql);
}
I tend to use isset($_POST['key1'], $_POST['key2'], $_POST['keyn']) as a starting point for determining whether a form has had all required data submitted, along with testing things such as $_SERVER['REQUEST_METHOD'], $_SERVER['SERVER_PORT'], and $_SERVER['REQUEST_URI']. Trimming is not harmful, but I just go for the jugular with preg_match($needle, $haystack) and make the regular expression non-greedy and non-capturing. In short, why condition input when you can just make the test fail to begin with?
The language construct empty() works, but does it really matter if the value doesn't match the pattern you are looking for? As for performance, who can say what would happen in either case if someone copied and pasted the Oxford English Dictionary?
function ValidatePostKeyAndValue($input, $pattern, $length)
{
// Check the type first so the string functions below never receive an array.
if(isset($input) &&
is_string($input) &&
preg_match($pattern, $input) &&
ctype_print($input) &&
strlen($input) <= $length)
{
return true;
}
else
{
return false;
}
}
I could do more or less, depending on the situation. Boolean functions are your friends.
As for your $data variable, I think it would be wise to consider whether the wildcards _ and % might appear in your data. If so, addcslashes() can be used to target those characters in your string. Overall, though, moving to mysqli will save you from having to use mysql_select_db(); mysqli_connect() does this for you! Well worth the switch.
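The wildcard point can be sketched as follows. The helper name escapeLikeWildcards() is hypothetical, and the sketch assumes the default backslash escape character for LIKE in MySQL; the result still has to be escaped or bound as a parameter like any other string.

```php
<?php
// Hypothetical helper: make % and _ match literally inside a LIKE pattern.
function escapeLikeWildcards(string $term): string
{
    // The backslash is escaped as well, since it is the assumed LIKE escape character.
    return addcslashes($term, '\\%_');
}

echo escapeLikeWildcards('50%_off'), "\n"; // 50\%\_off
```

This only matters for LIKE comparisons; for a plain = comparison the wildcards have no special meaning.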
[EDIT] I am placing the comment I entered near the bottom of this post here to hopefully avoid further down votes.
This was a pretty basic question stemming from my misunderstanding of what exactly $_REQUEST is. My understanding was that it was an index that referenced $_POST and $_GET (and $_COOKIE). However, I found that $_REQUEST is, itself, an array, so I simply changed the variables in $_REQUEST. Not an optimal solution, but a solution, nonetheless. It has the added advantage that the $_GET variables, with the apostrophes still there, are available. Perhaps not the best practice, but please note before you down vote that I have very little control over this data - coming in from one API and going out to another.
I have an API currently in use. We have a problem with some customers sending apostrophes in the URL. My question is how best to strip the apostrophes within the URL array. Perhaps using array_walk or something similar?
So that $_REQUEST[Customer] == "O'Henry's"
Becomes $_REQUEST[Customer] == "OHenrys"
EDIT: Judging from some of the answers here, I believe I need to explain a little better. This is an API that is already written and is the preliminary interface for another AS400 API. I have nothing to do with building the URL. I am receiving it. All I am concerned about is removing the apostrophes, without changing any other code. So the best way is to go through the array. In the body of the code, the variable references are all using $_REQUEST[]. I COULD go in and change those to $_GET[] if absolutely necessary but would rather avoid that.
This Works
foreach($_REQUEST as $idx => $val)
{
$_REQUEST[$idx] = str_replace("'" , '' , $val);
}
However, I am a little leery of using $_REQUEST in that manner. Does anyone see a problem with that? (Replacing $_REQUEST with $_GET does not work.)
For some use cases, it might make sense to store a "clean" or "pretty" version of the name. In that case, you may want to standardize to a case and have a whitelist of characters rather than a blacklist consisting of just single quotes. Use a regex to enforce this, perhaps similar to this one:
preg_replace("/[^[:alnum:][:space:]]/u", '', $string);
If you do that, consider if it is necessary to differentiate between different customers named O'Henrys, O'Henry's, OHenrys, O'henry's, and so on. Make sure your constraints are enforced by the app and the database.
The array_walk_recursive function is a reasonable way to hit every item in an array:
function sanitize(&$item, $key)
{
if (is_string($item)) {
// apply whitelist constraints
}
}
array_walk_recursive($array, 'sanitize');
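Filling in the callback above with the whitelist regex from earlier in this answer gives something like the sketch below. The callback name sanitizeItem() and the sample array are made up for illustration.

```php
<?php
// Hypothetical callback: keep only letters, digits and whitespace in string leaves.
function sanitizeItem(&$item, $key): void
{
    if (is_string($item)) {
        // With /u, preg_replace() returns null on invalid UTF-8; fall back to ''.
        $item = preg_replace('/[^[:alnum:][:space:]]/u', '', $item) ?? '';
    }
}

$input = ['Customer' => "O'Henry's", 'tags' => ['a-b', 42]];
array_walk_recursive($input, 'sanitizeItem');

print_r($input); // Customer becomes OHenrys, 'a-b' becomes ab, 42 is left alone
```

Non-string leaves are left untouched, so numeric values keep their type.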
It's hard to tell without more context, but it seems possible you may be asking the wrong question / solving the wrong problem.
Remember that you can almost always escape "special" characters and render them a non-issue.
In an HTML context where a single quote might cause problems (such as an attribute value denoted by single quotes), escape for HTML using htmlspecialchars or a library-specific alternative:
<?php
// some stuff
$name = "O'Henry's";
?><a data-customer='<?=htmlspecialchars($name, ENT_QUOTES|ENT_HTML5);?>'>whatever</a><?php
// continue
For JavaScript, encode using json_encode:
<?php
// some stuff
$name = "O'Henry's";
?><script>
var a = <?=json_encode($name);?>
alert(a); // O'Henry's
</script>
For SQL, use PDO and a prepared statement:
$dbh = new PDO('mysql:host=localhost;dbname=whatever', $user, $pass);
$name = "O'Henry's";
$stmt = $dbh->prepare("INSERT INTO REGISTRY (name) VALUES (:name)");
$stmt->bindParam(':name', $name);
$stmt->execute();
For use in a URL query string, use urlencode:
<?php
// some stuff
$name = "O'Henry's";
?>whatever<?php
// continue
For use in a URL path, use rawurlencode:
<?php
// some stuff
$name = "O'Henry's";
?>whatever<?php
// continue
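The two URL cases can be sketched side by side as below. The paths /search and /customers/ are placeholders, not part of the original question.

```php
<?php
$name = "O'Henry's & Co";

// Query string: urlencode() turns spaces into "+" and percent-encodes the rest.
$queryUrl = '/search?customer=' . urlencode($name);

// Path segment: rawurlencode() follows RFC 3986, so spaces become "%20".
$pathUrl = '/customers/' . rawurlencode($name);

// The finished URL is still escaped for the HTML attribute it is printed into.
echo '<a href="', htmlspecialchars($queryUrl, ENT_QUOTES), '">whatever</a>', "\n";
echo '<a href="', htmlspecialchars($pathUrl, ENT_QUOTES), '">whatever</a>', "\n";
```

Each context gets its own encoding step: first the URL encoding, then the HTML escaping.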
Libraries and frameworks will provide additional ways to escape things in those and other contexts.
If you want them removing altogether as an illegal character:
<?php foreach($myArray as $idx => $val){
$myArray[$idx] = str_replace("'" , '' , $val);
}
?>
However, this shouldn't be your solution for SQL inserts etc. You are better off using mysqli::real_escape_string or prepared statements.
This was a pretty basic question stemming from my misunderstanding of what exactly $_REQUEST is. My understanding was that it was an index that referenced $_POST and $_GET (and $_COOKIE). However, I found that $_REQUEST is, itself, an array, so I simply changed the variables in $_REQUEST. Not an optimal solution, but a solution, nonetheless. It has the added advantage that the $_GET variables, with the apostrophes still there, are available. Not the best practice, though.
EDIT:
Reading the edits you made to your question, the best solution for you is str_replace(). But there is no need to loop through your array: the third parameter can be an array!
This will strip apostrophes of every item in $foo:
$foo = [
"O'Henry's",
"D'Angleterre"
];
$foo = str_replace("'", "", $foo);
If you really need to remove the apostrophes use str_replace():
$foo = "O'Henry's";
$foo = str_replace("'", "", $foo);
// OUTPUT: OHenrys
If you can keep them, you had better encode them. urlencode() is one way to do it:
$foo = urlencode($foo);
// OUTPUT: O%27Henry%27s
If you build this URL from an array you could use http_build_query():
$foo = [
'Customer' => "O'Henry's"
];
$foo = http_build_query($foo);
// OUTPUT: Customer=O%27Henry%27s
I have read many other questions regarding how to filter a string to "Alpha-numeric", but all of them suggest the preg_replace() method.
According to OWASP:
Function preg_replace() should not be used with unsanitised user input, because the payload will be eval()'ed.
preg_replace("/.*/e","system('echo /etc/passwd')");
Reflection also could have code injection flaws. Refer to the appropriate reflection documentations, since it is an advanced topic.
So now how do I achieve this without preg_replace?
$result = preg_replace("/[^a-zA-Z0-9]+/", "", $_POST['data']);
// Notice the $_POST['data']
There's no problem using preg_replace() to filter user inputs. The OWASP advice you've quoted is talking about the pattern not being user input itself.
However, I'd say that using filtered inputs is a problem by itself - you should validate instead. As in, don't accept invalid inputs.
As others have pointed out, the OWASP vulnerability you've linked only applies when you're evaluating the expression, which you shouldn't be doing anyway.
In my experience, regular expressions are highly frowned upon for such simple operations where PHP's built-in string functions suffice. The string functions are also faster.
If the data is not valid, then you shouldn't be filtering it, you should be rejecting it.
Example:
$result = ctype_alnum($_POST['data']) ? $_POST['data'] : null;
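One thing to watch with the one-liner above is that $_POST['data'] may be missing or may be an array (for example when the client sends data[]=x), and ctype_alnum() expects a string. A minimal sketch of a guarded version, using a hypothetical helper name alnumOrNull():

```php
<?php
// Hypothetical helper: return the value only if it is a non-empty alphanumeric string.
function alnumOrNull($value): ?string
{
    // is_string() runs first so ctype_alnum() never sees an array or null.
    return (is_string($value) && ctype_alnum($value)) ? $value : null;
}

var_dump(alnumOrNull('abc123'));  // string(6) "abc123"
var_dump(alnumOrNull('abc 123')); // NULL, the space is not alphanumeric
var_dump(alnumOrNull(['abc']));   // NULL, not a string
```

It would be called as alnumOrNull($_POST['data'] ?? null), and a null result means the request is rejected rather than repaired.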
We had a similar situation, and we use the following:
if ( ! preg_match('/^[a-z0-9:_\/|-]+$/i', $str))
{
//do your stuff
}
You could go for something like this:
<?php
$unsafe_input = 'some"""\'t&%^$#!`hing~~ unsafe \':[]435^%$^%*$^#'; // input from user
$safe_input = ''; // final sanitized string
// we want to allow 0-9 A-Z and a-z
// merge and flip so that we can use isset() later
$allowed_chars = array_flip(array_merge(range(0, 9), range('A', 'Z'), range('a', 'z')));
// loop each byte of the string
for($i = 0; $i < strlen($unsafe_input); ++$i)
{
// isset() is lightyears faster than in_array()
if(isset($allowed_chars[$unsafe_input[$i]]))
{
// good, sanitized, data
$safe_input.= $unsafe_input[$i];
}
}
echo $safe_input;
I am taking over some webgame code that uses the eval() function in PHP. I know that this is potentially a serious security issue, so I would like some help vetting the code that checks its argument before I decide whether or not to nix that part of the code. Currently I have removed this section of code from the game until I am sure it's safe, but the loss of functionality is not ideal. I'd rather security-proof this than redesign the entire segment to avoid using eval(), assuming such a thing is possible. The relevant code snippet, which supposedly prevents malicious code injection, is below. $value is a user-input string which we know does not contain ";".
1 $value = eregi_replace("[ \t\r]","",$value);
2 $value = addslashes($value);
3 $value = ereg_replace("[A-z0-9_][\(]","-",$value);
4 $value = ereg_replace("[\$]","-",$value);
5 #eval("\$val = $value;");
Here is my understanding so far:
1) removes all whitespace from $value
2) escapes characters that would need it for a database call (why this is needed is not clear to me)
3) looks for alphanumeric characters followed immediately by \ or ( and replaces the combination of them with -. Presumably this is to remove anything resembling function calls in the string, though why it also removes the character preceding is unclear to me, as is why it would also remove \ after line 2 explicitly adds them.
4) replaces all instances of $ with - in order to avoid anything resembling references to php variables in the string.
So: have any holes been left here? And am I misunderstanding any of the regex above? Finally, is there any way to security-proof this without excluding ( characters? The string to be input is ideally a mathematical formula, and allowing ( would allow for manipulation of order of operations, which currently is impossible.
Evaluate the code inside a VM - see Runkit_Sandbox
Or create a parser for your math. I suggest you use the built-in tokenizer. You would need to iterate tokens and keep track of brackets, T_DNUMBER, T_LNUMBER, operators and maybe T_CONSTANT_ENCAPSED_STRING. Ignore everything else. Then you can safely evaluate the resulting expression.
A quick google search revealed this library. It does exactly what you want...
A simple example using the tokenizer:
$tokens = token_get_all("<?php {$input}");
$expr = '';
foreach($tokens as $token){
if(is_string($token)){
if(in_array($token, array('(', ')', '+', '-', '/', '*'), true))
$expr .= $token;
continue;
}
list($id, $text) = $token;
if(in_array($id, array(T_DNUMBER, T_LNUMBER)))
$expr .= $text;
}
// eval() expects plain PHP code without an opening tag, and needs a return to yield a value
$result = eval("return {$expr};");
This will only work if the input is a valid math expression. Otherwise you'll get a parse error in your eval'd code because of empty brackets and the like. If you need to handle this too, then sanitize the output expression inside another loop. This should take care of most of the invalid parts:
while(strpos($expr, '()') !== false)
$expr = str_replace('()', '', $expr);
$expr = trim($expr, '+-/*');
Matching what is allowed instead of removing some characters is the best approach here.
I see that you do not filter ` (backtick) that can be used to execute system commands. God only knows what else is also not prevented by trying to sanitize the string... No matter how many holes are found, there is no guarantee that there cannot be more.
Assuming that your language is not quite complex, it may not be that hard to implement it yourself without the use of eval.
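As an illustration of the no-eval route, here is a minimal sketch of a recursive-descent evaluator for +, -, *, / and parentheses. The class name MathParser is made up for this example, and the sketch is not hardened (for instance, it puts no limit on nesting depth).

```php
<?php
// Hypothetical evaluator: numbers, + - * /, unary minus and parentheses only.
final class MathParser
{
    private array $tokens;
    private int $pos = 0;

    public function __construct(string $input)
    {
        preg_match_all('/\d+(?:\.\d+)?|[-+*\/()]/', $input, $m);
        $this->tokens = $m[0];
        // If anything besides spaces was skipped by the tokenizer, reject the input.
        if (implode('', $this->tokens) !== str_replace(' ', '', $input)) {
            throw new InvalidArgumentException('Unsupported character in expression');
        }
    }

    public function evaluate(): float
    {
        $value = $this->expression();
        if ($this->pos < count($this->tokens)) {
            throw new InvalidArgumentException('Unexpected token');
        }
        return $value;
    }

    // expression := term (('+' | '-') term)*
    private function expression(): float
    {
        $value = $this->term();
        while (in_array($this->peek(), ['+', '-'], true)) {
            $op = $this->tokens[$this->pos++];
            $rhs = $this->term();
            $value = $op === '+' ? $value + $rhs : $value - $rhs;
        }
        return $value;
    }

    // term := factor (('*' | '/') factor)*
    private function term(): float
    {
        $value = $this->factor();
        while (in_array($this->peek(), ['*', '/'], true)) {
            $op = $this->tokens[$this->pos++];
            $rhs = $this->factor();
            if ($op === '/' && $rhs == 0) {
                throw new InvalidArgumentException('Division by zero');
            }
            $value = $op === '*' ? $value * $rhs : $value / $rhs;
        }
        return $value;
    }

    // factor := number | '-' factor | '(' expression ')'
    private function factor(): float
    {
        $token = $this->peek();
        if ($token === null) {
            throw new InvalidArgumentException('Unexpected end of expression');
        }
        $this->pos++;
        if ($token === '-') {
            return -$this->factor();
        }
        if ($token === '(') {
            $value = $this->expression();
            if ($this->peek() !== ')') {
                throw new InvalidArgumentException('Missing closing parenthesis');
            }
            $this->pos++;
            return $value;
        }
        if (is_numeric($token)) {
            return (float) $token;
        }
        throw new InvalidArgumentException('Unexpected token');
    }

    private function peek(): ?string
    {
        return $this->tokens[$this->pos] ?? null;
    }
}

echo (new MathParser('2 * (3 + 4) - 6 / 3'))->evaluate(), "\n"; // 12
```

Because the input is never executed as PHP, parentheses can be allowed without opening the door to function calls.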
The following code is our own attempt to answer the same sort of question:
$szCode = "whatever code you would like to submit to eval";
/* First check against language construct or instructions you don't allow such as (but not limited to) "require", "include", ..." : a simple string search will do */
if ( illegalInstructions( $szCode ) )
{
die( "ILLEGAL" );
}
/* This simple regex detects functions (spend more time on the regex to
fine-tune the function detection if needed) */
if ( preg_match_all( '/(?P<fnc>[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*) ?\(.*?\)/si',$szCode,$aFunctions,PREG_PATTERN_ORDER ) )
{
/* For each function call */
foreach( $aFunctions['fnc'] as $szFnc )
{
/* Check whether we can accept this function */
if ( ! isFunctionAllowed( $szFnc ) )
{
die( "'{$szFnc}' IS ILLEGAL" );
} /* if ( ! isFunctionAllowed( $szFnc ) ) */
}
}
/* If you got up to here ... it means that you accept the risk of evaluating
the PHP code that was submitted */
eval( $szCode );
I'm no PHP/SQL expert, and I've just discovered that I had to apply mysql_real_escape_string to secure my SQL INSERTs.
I made a function using several advice found on the net, here it is:
function secure($string)
{
if(is_numeric($string))
{ $string = intval($string); }
elseif (is_array($string))
{
foreach ($string as $key => $value) {
$string[$key] = secure($value);
}
}
else if ($string === null)
{
$string = 'NULL';
}
elseif (is_bool($string))
{
$string = $string ? 1 : 0;
}
else
{
if (get_magic_quotes_gpc()) { $value = stripslashes($string); }
$string = mysql_real_escape_string($string);
$string = addcslashes($string, '%_');
}
return $string;
}
Thing is, when I have a look at my tables content, it contains backslashes.
And then logically, when I retrieve data I have to apply stripslashes to it to remove these backslashes.
Magic Quotes are off.
QUESTION 1)
Now I think that even though I use mysql_real_escape_string to secure my data before SQL insertion, backslashes should not appear in my content. Can you confirm this?
QUESTION 2)
If this is not normal, why are these backslashes appearing in my phpMyAdmin content and retrievals? What did I do wrong?
QUESTION 3)
My guess is that mysql_real_escape_string could be applied twice. Is that possible?
If so, what function could prevent mysql_real_escape_string from being applied several times to the same string, leading to many \\ before the same escapable character?
Thanks a lot in advance for your input, guys!
oh, what a senseless function. I know it's not your fault but the fault of those who wrote it in their stupid articles and answers.
Get rid of it and use only mysql_real_escape_string to escape strings.
you have mixed up everything.
first, no magic quotes stuff should be present in the database escaping function.
if you want to get rid of magic quotes, do it centralized, at the very top of ALL your scripts, no matter if they deal with the database or not.
most of checks in this function are useless. is_bool for example. PHP will convert it the same way, no need to write any code for this.
LIKE related escaping is TOTALLY distinct matter, and has nothing to do with safety.
the is_numeric check is completely useless, as it will help nothing.
Also note that escaping strings has nothing to do with security.
It's just a syntax rule - all strings should be escaped. No matter its origin or any other stuff. Just a strict rule: every time you place a string into a query, it should be quoted and escaped. (And of course, if you only escape it but do not quote it, it will help nothing)
And only when we talk of the other parts of the query does it come to the SQL injection issue. For a complete guide on this matter, refer to my earlier answer: In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?
Your stripslashed $string is stored to the wrong variable $value instead of $string:
if (get_magic_quotes_gpc()) { $value = stripslashes($string); }
should be
if (get_magic_quotes_gpc()) { $string = stripslashes($string); }
Are you sure you aren't calling mysql_real_escape_string more than once? Each time you call it on a string with escapable characters, you will end up adding more and more slashes. You want to call it only once. Also, why are you also calling addcslashes? mysql_real_escape_string should be enough. If you call it only once, you should never have to call stripslashes on the data after retrieving it from the database.
You can't really tell whether mysql_real_escape_string has been applied more than once. I'd suggest going back and re-reading your code carefully; try debug printing the values just before they are inserted into the db to see if they look 'over-slashed'.
Btw, if you are using prepared statements (e.g. via mysqli) you don't need to escape your strings; the DB engine does this for you. This could be the problem too.
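What an 'over-slashed' value looks like can be shown without a database connection. The sketch below uses addslashes() purely as a stand-in for mysql_real_escape_string(), which needs a live connection; that substitution is an assumption made for illustration, and the two functions are not equivalent in general.

```php
<?php
$raw = "O'Henry";

// Escaped once: MySQL consumes the backslash and stores O'Henry.
$once = addslashes($raw);
echo $once, "\n";  // O\'Henry

// Escaped twice: MySQL consumes one level and stores O\'Henry, backslash included.
$twice = addslashes($once);
echo $twice, "\n"; // O\\\'Henry
```

If the debug output just before the INSERT looks like the second line, the value has been escaped more than once somewhere along the way.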
Remove addslashes completely from all of your code. This is the leading cause for slashes being inserted into database.
function escape($string) {
if (get_magic_quotes_gpc()) {
$string = stripslashes($string);
}
return mysql_real_escape_string($string);
}
Always check whether magic_quotes_gpc is enabled; if it is, perform stripslashes and then escape the data.
Escaped = "don\'t use addslashes"
When it goes into database the '\' is removed.
I know some people may just respond "never" as long as there's user input. But suppose I have something like this:
$version = $_REQUEST['version'];
$test = 'return $version > 3;';
$success = eval($test);
This is obviously a simplified case, but is there anything that a user can input as version to get this to do something malicious? If I restrict the type of strings that $test can take on to comparing the value of certain variables to other variables, is there any way anybody can see to exploit that?
Edit
I've tried running the following script on the server and nothing happens:
<?php
$version = "exec('mkdir test') + 4";
$teststr = '$version > 3;';
$result = eval('return ' . $teststr);
var_dump($result);
?>
all I get is bool(false). No new directory is created. If I have a line that actually calls exec('mkdir test') before that, it actually does create the directory. It seems to be working correctly, in that it's just comparing a string converted to a number to another number and finding out the result is false.
Ohhhh boy!
$version = "exec('rm-rf/...') + 4"; // Return 4 so the return value is "true"
// after all, we're gentlemen!
$test = "return $version > 3";
eval($test);
:)
You would have to do at least a filter_var() or is_numeric() on the input value in this case.
By the way, assigning the result of eval() to $success only works because the evaluated string contains a return statement; without one, eval() returns NULL.
If you do this. Only accept ints.
If you must accept strings, don't.
If you still think you must. Don't!
And lastly, if you still, after that, think you need strings. JUST DON'T!
yes, anything. I would use $version = (int)$_REQUEST['version']; to validate the data.
You need to be more precise with your definitions of "malicious" or "safe". Consider for example
exec("rm -rf /");
echo "enlarge your rolex!";
while(true) echo "*";
all three snippets are "malicious" from the common sense point of view, however technically they are totally different. Protection techniques that may apply to #1, won't work with other two and vice versa.
The way to make this safe would be to ensure that $version is a number BEFORE you try to eval.
Use this code to remove everything except numbers (0-9): preg_replace('/[^0-9]+/', '', $version);
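For a comparison like this one, eval() can be avoided entirely once the input is validated as an integer. A minimal sketch, assuming a hypothetical helper named versionIsAbove():

```php
<?php
// Hypothetical helper: null means the input was not a valid integer.
function versionIsAbove($raw, int $minimum): ?bool
{
    // FILTER_VALIDATE_INT returns false for anything that is not an integer string.
    $version = filter_var($raw, FILTER_VALIDATE_INT);
    if ($version === false) {
        return null;
    }
    return $version > $minimum;
}

var_dump(versionIsAbove('4', 3));                      // bool(true)
var_dump(versionIsAbove('2', 3));                      // bool(false)
var_dump(versionIsAbove("exec('mkdir test') + 4", 3)); // NULL
```

Rejecting invalid input this way, instead of stripping characters from it, keeps a value such as "1abc2" from silently turning into 12.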