I've been wondering this for maybe a few months now but I still don't know an answer, other then possible speed performance. So long story short, instead of having all this PDO codes everywhere, why not just put a backslash between every character?
$String = $_POST["attack"]; // SOME THING' OR 1 = 1 --
$String = fFilter( $String ); // \S\O\M\E\ \T\H\I\N\G\'\ \O\R\ \1\ \=\ \1\ \-\-
Now I haven't been into this SQL stuff in awhile, so I can't give a perfect example, but basically the sql string should look like this SELECT * FROM account WHERE id = '\S\O\M\E\ \T\H\I\N\G\'\ \O\R\ \1\ \=\ \1\ \-\-' and something like that just always seemed pretty safe, but I haven't heard of anyone using it, or even why not to use it. I always see things like filtering html and etc isn't good, but I don't see why not just filter every single character. Since any attack would look like \a\t\t\a\c\k.
Because placing a backslash before certain characters changes their meaning entirely. For instance, \t is a tab character, not t, so \a\t\t\a\c\k would be transformed to:
a ack
A full list of such sequences is given at:
http://dev.mysql.com/doc/refman/5.5/en/string-literals.html
As several other people have mentioned, use parameterized queries, not input escaping.
Related
I am trying to remove the spaces, whitelines and tabs from the user his input when he searches in the sql database.
So basicly he searches 'Test' (two spaces behind the string) it gives zero results while 'Test' is a correct result in the database. How do I make php / sql use only the word 'Test' as keyword in the sql statement so the result is generated even when the user post spaces before or behind the string. We use a lot of copy and paste when searching so spaces and stuff are in the paste a lot of the time.
I am aware of the questions that are allready here. And I tried all I could implent (as you can see in the overkill tryout..) but none seem to do the trick..
Please help, what am I doing wrong here?
$searchresult = trim(filter_var(isset($_POST['usearch10'])?$_POST['usearch10']:'' , FILTER_SANITIZE_STRING));
$save = trim(filter_var("%{$_POST['usearch10']}%", FILTER_SANITIZE_STRING));
$safedef = trim($save);
$stmt2 = $mysqli->prepare("SELECT * FROM table WHERE `colomn` LIKE trim(?)");
$stmt2->bind_param("s",$safedef);
$stmt2->execute();
$result = $stmt2->get_result();
while ($obj = $result->fetch_assoc()) {
You need to do two things. One is remove the spaces, and trim is enough for that. Unless you're copy-pasting from Microsoft Word, where you could get hidden whitespaces, nonbreaking spaces, and possibly snakes. To get rid of those you might need to use a regular expression (requiring UTF8/mbstring support),
$search = preg_replace('#^\\s*(.*)\\s*$#', '\\1', $search);
I am not sure, but maybe if you copy and paste from HTML you might even encounter a " " instead of a space, so you might want to also use html_entity_decode (after verifying you need it - it can be quite the source of other troubles).
Then you need to specify an inclusion; do that after the filtering, just in case, like this
$search = trim($search);
$search = "%{$search}%"; // search term anywhere
WHERE `column` LIKE ? # You wrote 'colomn' but I assume it was a typo
This simply has to work, but you could also do it this way:
...WHERE `column` LIKE CONCAT('%', ?, '%');
to be sure that MySQL understands what you're trying to tell it.
Try also printing the search term and the query and running it by hand to see what happens. And always check that the query did run successfully. To be even safer, dump the json_encode($search). So, if you forgot a semispace, you'll see not the expected "Test" but something like, if memory serves, "Test\u00a0".
If the absolute worst comes to the worst, activate the session-level "general_log" in MySQL and inspect the actual query MySQL gets sent in the MySQL log_file.
I have never used trim in SQL, but if that is changing something or not, your problem could be that you are not searching like LIKE ”%STRING_TO_SEARCH%". % at the beginning will tell to match anything before it, and the same character at the end will tell to also match anything after your desired string.
my code is not working ? and i dont want to use str_replace , for there maybe more slashes than 3 to be replaced. how can i do the job using preg_replace?
my code here like this:
<?php
$str='<li>
<span class=\"highlight\">Color</span>
Can\\\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
$str=preg_replace("#\\+#","\\",$str);
echo $str;
There is merit in the other answers, but to me it looks like what you're actually trying to accomplish is something very different. In the php code \\\' is not three slashes followed by an apostrophe, it's one escaped slash followed by an escaped apostrophe, and in the rendered output, that's exactly what you see—a slash followed by an apostrophe (with no need to escape them in the rendered html). It's important to realize that the escape character is not actually part of the string; it's merely a way to help you represent a character that normally has very different meaning in within php—in this case, an apostrophe normally terminates a string literal. What looks like 4 characters in php is actually only 2 characters in the string.
If this is the extent of your code, there's no need for string manipulation or regular expressions. What you actually need is just this:
<?php
$str='<li>
<span class="highlight">Color</span>
Can\'t find the exact color shown on the model pictures? Just leave a message (eg: color as shown in the first picture...) when you place order.
Please note that colors on your computer monitor may differ slightly from actual product colors depending on your monitor settings.
</li>';
echo $str;
?>
Only one escape character is needed here for the apostrophe, and in the rendered HTML you will see no slashes at all.
Further Reading:
Escape sequences
The root of this problem is actually in how it was written into your database and likely to be caused by magic_quotes_gpc; this was used in older versions and a really bad idea.
The best fix
This requires a few steps:
Fix the script that puts the HTML inside your database by disabling magic_quotes_gpc.
Write a script that reads all existing database entries, applies stripslashes() and saves the changes.
Fix the presentation part (though, that may need no changes at all.
Alternative patch
Use stripslashes() before you present the HTML.
use this pattern
preg_replace('#\\+#', '\\', $text);
This replaces two or more \ symbols preceding an ' symbol with \'
$theConvertedString = preg_replace("/\\{2,}'/", "\'", $theSourceString);
Ideally, you shouldn't have code causing this issue in the first place so I would have a look at why you have \\' in your code to begin with. If you've manually put it in your variables, take it out. Often, this also happens with multiple calls to addslashes() or mysql_real_escape_string() or a cheap hosting providers' automatic transformation of all POST request variables to escape slashes, combined with your server side PHP code to do the same.
I have a database class that is written in PHP and it should take care of some things I don't want to care about. One of these features is handling the decryption of columns that are encoded with the AES function of MySQL.
This works perfect in normal cases (which in my opinion means that there is no alias in the query string "AS bla_bla"). Lets say that someone writes a query string that contains an alias, which contains the name of a column the script should decrypt, the query dies, because my regex wraps not only the column, but the alias as well. That is not how its supposed to be.
This is the regex I've written:
preg_replace("/(((\`|)\w+(\`|)\.|)[encrypted|column|list])/i", "AES_DECRYPT(${0},'the hash')"
The part with the grave accents is there because sometimes the query does contain the table name which is either inside of grave accents or not.
An example input:
SELECT encrypted, something AS 'a_column' FROM a_table;
An example output:
SELECT AES_DECRYPT(encrypted, 'the hash'), something AS 'a_AES_DECRYPT(column, 'the hash')' FROM a_table;
As you can see, this is not going to work, so my idea was to search only for words, that are not right after the word 'as' until a special character or a white space appears. Of course i tried it hours to work, but I don't get the correct syntax.
Is it possible to solve this with pure regex and if yes how would it look like?
This should get you started:
$quoted_name = '(\w+|`\w+`|"\w+"|\'\w+\')';
preg_match("/^SELECT ((, )?$quoted_name( AS $quoted_name)?)* FROM $quoted_name;$/", "SELECT encrypted, something AS 'a_column' FROM a_table;", $m);
var_dump($m);
The replacement parts should be easy to spot an write after you study the var_dump.
Well, the title is my question. Can anybody give me a list of things to do to sanitize my data before entering to mysql database using php, especially if the data contains html tags?
It depends on a lot of things. If you don't want to accept any HTML, that makes it a whole lot easier, run it through strip_tags() first to remove all the HTML from it. After that it's much safer. If you do want to accept some HTML, you can selectively keep some tags from it with the same function, just add in the tags to keep after. eg: strip_tags($string_to_sanitize, '<p><div>'); // Keeps only <p> and <div> tags.
As for inserting into a database, it's always best to sanitize anything before inserting into the database; adopting a "don't trust anybody" mentality will save you a lot of trouble. Preventing against SQL injection is fairly straightforward, this is the method I use:
$q = sprintf("INSERT INTO table_name (string_field, int_field) VALUES ('%s', %d);",
mysql_real_escape_string($values['string']),
mysql_real_escape_string($values['number']));
$result = mysql_query($q, $connection)
Generally once you open the door for allowing HTML in, you'll have a whole deal of things to worry about (there are some great articles on defending from XSS out there). If you want to test for XSS vulnerabilities, try the examples on http://ha.ckers.org/xss.html. There are some they have there that you would probably never even consider, so give it a look!
Also, if you are accepting specific types of input (eg: numbers, emails, boolean values) try using the inbuilt filter_var() function in PHP. They have a bunch of inbuilt types to validate data against (http://www.php.net/manual/en/filter.filters.validate.php), as well as a number of filters to sanitize your data (http://www.php.net/manual/en/filter.filters.sanitize.php).
Generally, accepting any input is like opening a Pandora's Box, and while you'll probably never be able to block 100% of the weaknesses (people are always looking to find a way in), you can block the common ones to save you headaches.
Finally remember to sanitize ALL external data. Just because you make a dropdown input doesn't mean some shady person can't send their own data instead!
Use mysql_real_escape_string();
mysql_query("INSERT INTO table(col) VALUES('".mysql_real_escape_string($_POST['data']."')");
You should use prepared statements when inserting data into the database, not any sort of escaping. (PHP manual: prepared statements in pdo and mysqli.)
Sanitization for HTML output should, as mentioned by others, happen when you go to take data out of the database and merge it into a page, not before.
Turn off register_globals and magic_quotes, use mysql_real_escape_string on any string coming from the user before placing it into your query.
Of course mysql_real_escape_string
When dealing with any kind of input start from the I won't allow anything stand point and whitelist only that deemed to be acceptable.
On insert you need to make sure that the data is MySQL-escaped. For this, use mysql_real_escape_string.
Before showing the data you will need to strip out unsafe HTML and/or JavaScript code. Many people choose to store the sanitised version in the database. Other prefer to strip the ugly HTML from the string before rendering.
You do this in PHP with some filtering. an example is the Drupal filter_xss function:
function filter_xss($string, $allowed_tags = array('a', 'em', 'strong', 'cite', 'code', 'ul', 'ol', 'li', 'dl', 'dt', 'dd')) {
// Only operate on valid UTF-8 strings. This is necessary to prevent cross
// site scripting issues on Internet Explorer 6.
if (!drupal_validate_utf8($string)) {
return '';
}
// Store the input format
_filter_xss_split($allowed_tags, TRUE);
// Remove NUL characters (ignored by some browsers)
$string = str_replace(chr(0), '', $string);
// Remove Netscape 4 JS entities
$string = preg_replace('%&\s*\{[^}]*(\}\s*;?|$)%', '', $string);
// Defuse all HTML entities
$string = str_replace('&', '&', $string);
// Change back only well-formed entities in our whitelist
// Decimal numeric entities
$string = preg_replace('/&#([0-9]+;)/', '&#\1', $string);
// Hexadecimal numeric entities
$string = preg_replace('/&#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\1', $string);
// Named entities
$string = preg_replace('/&([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
return preg_replace_callback('%
(
<(?=[^a-zA-Z!/]) # a lone <
| # or
<!--.*?--> # a comment
| # or
<[^>]*(>|$) # a string that starts with a <, up until the > or the end of the string
| # or
> # just a >
)%x', '_filter_xss_split', $string);
}
well, there is not too much to do while we're talking of inserting data from textarea to mysql database.
For the strings placed into query, Mysql requirements are not so complicated.
Only 2 rules to follow:
inserted data should be surrounded by quotes.
some special character in the data should be escaped.
Note that this operation has nothing to do with security. It's syntax requirements.
Assuming you're adding quotes already, the only thing you have to add is escaping. Depends on your encoding, you can use addslashes or mysql_escape_string or mysql_real_escape_string functions.
However, other parts of query require more attention. If you're curious, refer to my earlier answer with complete guide: In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?
HTML tags has nothing to do with database and require no special attention.
However, for displaying data from untrusted source, some precautions should be taken. It was described in this topic already, only I have to add is you can't trust to strip_tags when used with second parameter.
You can use mysql_real_escape_string, you can also use htmlentities with addslashes... or you can use all 3 together also...
I've run into a bit of a problem with a Regex I'm using for humans names.
$rexName = '/^[a-z' -]$/i';
Suppose a user with the name Jürgen wishes to register? Or Böb? That's pretty commonplace in Europe. Is there a special notation for this?
EDIT:, just threw the Jürgen name against a regex creator, and it splits the word up at the ü letter...
http://www.txt2re.com/index.php3?s=J%FCrgen+Blalock&submit=Show+Matches
EDIT2: Allright, since checking for such specific things is hard, why not use a regex that simply checks for illegal characters?
$rexSafety = "/^[^<,\"#/{}()*$%?=>:|;#]*$/i";
(now which ones of these can actually be used in any hacking attempt?)
For instance. This allows ' and - signs, yet you need a ; to make it work in SQL, and those will be stopped.Any other characters that are commonly used for HTML injection of SQL attacks that I'm missing?
I would really say : don't try to validate names : one day or another, your code will meet a name that it thinks is "wrong"... And how do you think one would react when an application tells him "your name is not valid" ?
Depending on what you really want to achieve, you might consider using some kind of blacklist / filters, to exclude the "not-names" you thought about : it will maybe let some "bad-names" pass, but, at least, it shouldn't prevent any existing name from accessing your application.
Here are a few examples of rules that come to mind :
no number
no special character, like "~{()}#^$%?;:/*§£ø and probably some others
no more that 3 spaces ?
none of "admin", "support", "moderator", "test", and a few other obvious non-names that people tend to use when they don't want to type in their real name...
(but, if they don't want to give you their name, their still won't, even if you forbid them from typing some random letters, they could just use a real name... Which is not their's)
Yes, this is not perfect ; and yes, it will let some non-names pass... But it's probably way better for your application than saying someone "your name is wrong" (yes, I insist ^^ )
And, to answer a comment you left under one other answer :
I could just forbid the most command
characters for SQL injection and XSS
attacks,
About SQL Injection, you must escape your data before sending those to the database ; and, if you always escape those data (you should !), you don't have to care about what users may input or not : as it is escaped, always, there is no risk for you.
Same about XSS : as you always escape your data when ouputting it (you should !), there is no risk of injection ;-)
EDIT : if you just use that regex like that, it will not work quite well :
The following code :
$rexSafety = "/^[^<,\"#/{}()*$%?=>:|;#]*$/i";
if (preg_match($rexSafety, 'martin')) {
var_dump('bad name');
} else {
var_dump('ok');
}
Will get you at least a warning :
Warning: preg_match() [function.preg-match]: Unknown modifier '{'
You must escape at least some of those special chars ; I'll let you dig into PCRE Patterns for more informations (there is really a lot to know about PCRE / regex ; and I won't be able to explain it all)
If you actually want to check that none of those characters is inside a given piece of data, you might end up with something like that :
$rexSafety = "/[\^<,\"#\/\{\}\(\)\*\$%\?=>:\|;#]+/i";
if (preg_match($rexSafety, 'martin')) {
var_dump('bad name');
} else {
var_dump('ok');
}
(This is a quick and dirty proposition, which has to be refined!)
This one says "OK" (well, I definitly hope my own name is ok!)
And the same example with some specials chars, like this :
$rexSafety = "/[\^<,\"#\/\{\}\(\)\*\$%\?=>:\|;#]+/i";
if (preg_match($rexSafety, 'ma{rtin')) {
var_dump('bad name');
} else {
var_dump('ok');
}
Will say "bad name"
But please note I have not fully tested this, and it probably needs more work ! Do not use this on your site unless you tested it very carefully !
Also note that a single quote can be helpful when trying to do an SQL Injection... But it is probably a character that is legal in some names... So, just excluding some characters might no be enough ;-)
PHP’s PCRE implementation supports Unicode character properties that span a larger set of characters. So you could use a combination of \p{L} (letter characters), \p{P} (punctuation characters) and \p{Zs} (space separator characters):
/^[\p{L}\p{P}\p{Zs}]+$/
But there might be characters that are not covered by these character categories while there might be some included that you don’t want to be allowed.
So I advice you against using regular expressions on a datum with such a vague range of values like a real person’s name.
Edit As you edited your question and now see that you just want to prevent certain code injection attacks: You should better escape those characters rather than rejecting them as a potential attack attempt.
Use mysql_real_escape_string or prepared statements for SQL queries, htmlspecialchars for HTML output and other appropriate functions for other languages.
That's a problem with no easy general solution. The thing is that you really can't predict what characters a name could possibly contain. Probably the best solution is to define an negative character mask to exclude some special characters you really don't want to end up in a name.
You can do this using:
$regexp = "/^[^<put unwanted characters here>]+$/
If you're trying to parse apart a human name in PHP, I recomment Keith Beckman's nameparse.php script.