I have a website where I can't use html_entities() or html_specialchars() to process user input data. Instead, I added a custom function, which in the end is a function, which uses an array $forbidden to clean the input string of all unwanted characters. At the moment I have '<', '>', "'" as unwanted characters because of sql-injection/browser hijacking. My site is encoded in utf-8 - do I have to add more characters to that array, i.e. the characters '<', encoded in other charsets?
Thanks for any help,
Maenny
htmlentities nor htmlspecialchars functions has nothing to do with sql injection
to prevent injection, you have to follow some rules, I've described them all here
to filter HTML you may use htmlspecialchars() function, it will harm none of your cyrillic characters
You should escape ", too. It is much more harm than ', because you often enclose HTML attributes in ". But, why don't you simlpy use htmlspecialchars to do that job?
Futhermore: It isn't good to use one escaping function for both SQL and HTML. HTML needs escaping of tags, whereas SQL does not. So it would be best, if you used htmlspecialchars for HTML output and PDO::quote (or mysql_real_escape_string or whatever you are using) for SQL queries.
But I know (from my own experience) that escaping all user input in SQL queries may be really annoying and sometimes I simply don't escape parts, because I think they are "secure". But I am sure I'm not always right, about assuming that. So, in the end I wanted to ensure that I really escape all variables used in an SQL query and therefore have written a little class to do this easily: http://github.com/nikic/DB Maybe you want to use something similar, too.
Put this code into your header page. It can prevent SQL injection attack in PHP.
function clean_header($string)
{
$string = trim($string);
// From RFC 822: “The field-body may be composed of any ASCII
// characters, except CR or LF.”
if (strpos($string, “\n“) !== false) {
$string = substr($string, 0, strpos($string, “\n“));
}
if (strpos($string, “\r“) !== false) {
$string = substr($string, 0, strpos($string, “\r“));
}
return $string;
}
Related
to sanitize user input we usually use mysql_real_escape_string, but for example if i want to escape: O'Brian it will return O\'Brian and I don't really like this because if i want to print this name, I have to strip slasheseverytime.
So I thought to use this function:
$search = array("'", '"');
$replace = array("´", "“");
$str_clean = str_replace($search, $replace, "O'Brian");
Is this simple function protecting me from MySQL-Injection?
Thank very much and sorry for my bad English.
No - no more escaping!
Use mysqli or PDO and prepared statements
http://php.net/manual/en/mysqli.prepare.php
http://php.net/manual/en/pdo.prepared-statements.php
mysql_real_escape_string does add the \ for string escaping only and does not add them to the database, so you don't have to use stripslashes while displaying the content.
If you are really getting the \ stored in the database, do off the magic_quote
You should always use mysql_real_escape_string to escape input. This is for when you're writing values into your database, not when you're reading values from your database. When you write "O\'Brian" to your database, it's stored as "O'Brian", and when you read it back out, you should also get "O'Brian" (and won't need to strip the slashes, since they don't exist).
Yes, obviously it's protecting from SQL Injection attacks
It Escapes special characters in the unescaped_string, taking into account the current character set of the connection so that it is safe to place it in a mysql_query().
This question already has answers here:
How can I sanitize user input with PHP?
(16 answers)
How can I prevent SQL injection in PHP?
(27 answers)
Closed 7 months ago.
I am trying to put a general purpose function together that will sanitize input to a Mysql database. So far this is what I have:
function sanitize($input){
if(get_magic_quotes_qpc($input)){
$input = trim($input); // get rid of white space left and right
$input = htmlentities($input); // convert symbols to html entities
return $input;
} else {
$input = htmlentities($input); // convert symbols to html entities
$input = addslashes($input); // server doesn't add slashes, so we will add them to escape ',",\,NULL
$input = mysql_real_escape_string($input); // escapes \x00, \n, \r, \, ', " and \x1a
return $input;
}
}
If i understood the definition of get_magic_quotes_qpc(). This is set by the php server to automatically escape characters instead of needing to use addslashes().
Have I used addslashes() and mysql_real_escape_string() correctly together and is there anything else I could add to increase the sanitization.
Thanks
htmlentities() is unnecessary to make data safe for SQL. It's used when echoing data values to HTML output, to avoid XSS vulnerabilities. That's also an important security issue you need to be mindful of, but it's not related to SQL.
addslashes() is redundant with mysql_real_escape_string. You'll end up with literal backslashes in your strings in the database.
Don't use magic quotes. This feature has been deprecated for many years. Don't deploy PHP code to an environment where magic quotes is enabled. If it's enabled, turn it off. If it's a hosted environment and they won't turn off magic quotes, get a new hosting provider.
Don't use ext/mysql. It doesn't support query parameters, transactions, or OO usage.
Update: ext/mysql was deprecated in PHP 5.5.0 (2013-06-20), and removed in PHP 7.0.0 (2015-12-03). You really can't use it.
Use PDO, and make your queries safer by using prepared queries.
For more details about writing safe SQL, read my presentation SQL Injection Myths and Fallacies.
Magic quotes are deprecated. Turn them off if you can :).
The second part addslashes and mysql_real_escape_String does pretty much the same (similar) thing. Just try
addslashes( '\\')
// and
mysql_real_escape_string( '\\')
Result should be \\ so if you use
mysql_real_escape_string( addslashes( '\\'))
you should get \\ (or '\\\\' as string). Use only mysql_real_escape_string (better) OR addslashes, never both.
I recommend to use PDO instead of raw functions and manual escaping.
Why do you want to apply htmlentities before saving data to the database? What if you want to use the data for something else than just writing it out to a browser? For example for searching, partitioning data, using the data in other programming languages, etc...
The only thing you really want to apply is mysql_real_escape_string (or use PDO), nothing else.
I usually prefer to undo the effects of magic quotes entirely, always. Magic quotes is just cumbersome to work with and should never have been invented. Here's a snippet from the PHP manual to reverse the magic quotes:
if (get_magic_quotes_gpc()) {
$process = array(&$_GET, &$_POST, &$_COOKIE, &$_REQUEST);
while (list($key, $val) = each($process)) {
foreach ($val as $k => $v) {
unset($process[$key][$k]);
if (is_array($v)) {
$process[$key][stripslashes($k)] = $v;
$process[] = &$process[$key][stripslashes($k)];
} else {
$process[$key][stripslashes($k)] = stripslashes($v);
}
}
}
unset($process);
}
the worst part that adding slashes does not sanitize anything, no matter what function was used.
and it should not be used in the means of whatever "sanitization" at all.
slashes do not "sanitize" data. Slashes do escape string delimiters only. Thus, the only sanitization you can talk of, is escaping and and quoting.
Otherwise, if you won't put quotes around "sanitized" string, you will have no protection at all.
Use:
mysql_real_escape_string()
This will prevent bad data like DROP TABLE ;)
Well, the title is my question. Can anybody give me a list of things to do to sanitize my data before entering to mysql database using php, especially if the data contains html tags?
It depends on a lot of things. If you don't want to accept any HTML, that makes it a whole lot easier, run it through strip_tags() first to remove all the HTML from it. After that it's much safer. If you do want to accept some HTML, you can selectively keep some tags from it with the same function, just add in the tags to keep after. eg: strip_tags($string_to_sanitize, '<p><div>'); // Keeps only <p> and <div> tags.
As for inserting into a database, it's always best to sanitize anything before inserting into the database; adopting a "don't trust anybody" mentality will save you a lot of trouble. Preventing against SQL injection is fairly straightforward, this is the method I use:
$q = sprintf("INSERT INTO table_name (string_field, int_field) VALUES ('%s', %d);",
mysql_real_escape_string($values['string']),
mysql_real_escape_string($values['number']));
$result = mysql_query($q, $connection)
Generally once you open the door for allowing HTML in, you'll have a whole deal of things to worry about (there are some great articles on defending from XSS out there). If you want to test for XSS vulnerabilities, try the examples on http://ha.ckers.org/xss.html. There are some they have there that you would probably never even consider, so give it a look!
Also, if you are accepting specific types of input (eg: numbers, emails, boolean values) try using the inbuilt filter_var() function in PHP. They have a bunch of inbuilt types to validate data against (http://www.php.net/manual/en/filter.filters.validate.php), as well as a number of filters to sanitize your data (http://www.php.net/manual/en/filter.filters.sanitize.php).
Generally, accepting any input is like opening a Pandora's Box, and while you'll probably never be able to block 100% of the weaknesses (people are always looking to find a way in), you can block the common ones to save you headaches.
Finally remember to sanitize ALL external data. Just because you make a dropdown input doesn't mean some shady person can't send their own data instead!
Use mysql_real_escape_string();
mysql_query("INSERT INTO table(col) VALUES('".mysql_real_escape_string($_POST['data']."')");
You should use prepared statements when inserting data into the database, not any sort of escaping. (PHP manual: prepared statements in pdo and mysqli.)
Sanitization for HTML output should, as mentioned by others, happen when you go to take data out of the database and merge it into a page, not before.
Turn off register_globals and magic_quotes, use mysql_real_escape_string on any string coming from the user before placing it into your query.
Of course mysql_real_escape_string
When dealing with any kind of input start from the I won't allow anything stand point and whitelist only that deemed to be acceptable.
On insert you need to make sure that the data is MySQL-escaped. For this, use mysql_real_escape_string.
Before showing the data you will need to strip out unsafe HTML and/or JavaScript code. Many people choose to store the sanitised version in the database. Other prefer to strip the ugly HTML from the string before rendering.
You do this in PHP with some filtering. an example is the Drupal filter_xss function:
function filter_xss($string, $allowed_tags = array('a', 'em', 'strong', 'cite', 'code', 'ul', 'ol', 'li', 'dl', 'dt', 'dd')) {
// Only operate on valid UTF-8 strings. This is necessary to prevent cross
// site scripting issues on Internet Explorer 6.
if (!drupal_validate_utf8($string)) {
return '';
}
// Store the input format
_filter_xss_split($allowed_tags, TRUE);
// Remove NUL characters (ignored by some browsers)
$string = str_replace(chr(0), '', $string);
// Remove Netscape 4 JS entities
$string = preg_replace('%&\s*\{[^}]*(\}\s*;?|$)%', '', $string);
// Defuse all HTML entities
$string = str_replace('&', '&', $string);
// Change back only well-formed entities in our whitelist
// Decimal numeric entities
$string = preg_replace('/&#([0-9]+;)/', '&#\1', $string);
// Hexadecimal numeric entities
$string = preg_replace('/&#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\1', $string);
// Named entities
$string = preg_replace('/&([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
return preg_replace_callback('%
(
<(?=[^a-zA-Z!/]) # a lone <
| # or
<!--.*?--> # a comment
| # or
<[^>]*(>|$) # a string that starts with a <, up until the > or the end of the string
| # or
> # just a >
)%x', '_filter_xss_split', $string);
}
well, there is not too much to do while we're talking of inserting data from textarea to mysql database.
For the strings placed into query, Mysql requirements are not so complicated.
Only 2 rules to follow:
inserted data should be surrounded by quotes.
some special character in the data should be escaped.
Note that this operation has nothing to do with security. It's syntax requirements.
Assuming you're adding quotes already, the only thing you have to add is escaping. Depends on your encoding, you can use addslashes or mysql_escape_string or mysql_real_escape_string functions.
However, other parts of query require more attention. If you're curious, refer to my earlier answer with complete guide: In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?
HTML tags has nothing to do with database and require no special attention.
However, for displaying data from untrusted source, some precautions should be taken. It was described in this topic already, only I have to add is you can't trust to strip_tags when used with second parameter.
You can use mysql_real_escape_string, you can also use htmlentities with addslashes... or you can use all 3 together also...
I have a form text field that accepts a url. When the form is submitted, I insert this field into the database with proper anti-sql-injection. My question though is about xss.
This input field is a url and I need to display it again on the page. How do I protect it from xss on the way into the database (I think nothing is needed since I've already taken care of sql injection) and on the way out of the database?
Let's pretend we have it like this, I'm simplifying it, and please don't worry about sql injection. Where do I go from here after that?
$url = $_POST['url'];
Thanks
Assuming this is going to be put into HTML content (such as between <body> and </body> or between <div> and </div>), you need to encode the 5 special XML characters (&, <, >, ", '), and OWASP recommends including slash (/) as well. The PHP builtin, htmlentities() will do the first part for you, and a simple str_replace() can do the slash:
function makeHTMLSafe($string) {
$string = htmlentities($string, ENT_QUOTES, 'UTF-8');
$string = str_replace('/', '/', $string);
return $string;
}
If, however, you're going to be putting the tainted value into an HTML attribute, such as the href= clause of an <a, then you'll need to encode a different set of characters ([space] % * + , - / ; < = > ^ and |)—and you must double-quote your HTML attributes:
function makeHTMLAttributeSafe($string) {
$scaryCharacters = array(32, 37, 42, 43, 44, 45, 47, 59, 60, 61, 62, 94, 124);
$translationTable = array();
foreach ($scaryCharacters as $num) {
$hex = str_pad(dechex($num), 2, '0', STR_PAD_LEFT);
$translationTable[chr($num)] = '&#x' . $hex . ';';
}
$string = strtr($string, $translationTable);
return $string;
}
The final concern is illegal UTF-8 characters—when delivered to some browsers, an ill-formed UTF-8 byte sequence can break out of an HTML entity. To protect against this, simply ensure that all the UTF-8 characters you get are valid:
function assertValidUTF8($string) {
if (strlen($string) AND !preg_match('/^.{1}/us', $string)) {
die;
}
return $string;
}
The u modifier on that regular expression makes it a Unicode matching regex. By matching a single chararchter, ., we're assured that the entire string is valid Unicode.
Since this is all context-dependent, it's best to do any of this encoding at the latest possible moment—just before presenting output to the user. Being in this practice also makes it easy to see any places you've missed.
OWASP provides a great deal of information on their XSS prevention cheat sheet.
You need to encode it with htmlspecialchars before displaying to a user. Usually this is enough when dealing with data outside of <script> tag and/or HTML tag attributes.
Don't roll your own XSS-protection, there are too many ways something might slip trough (I can't find the link to a certain XSS-demopage anymore, but the amount of possibilities is staggering: Broken IMG-tags, weird attributes etc.).
Use an existing library like sseq-lib or extract one from an established framework.
Update: Here's the XSS-demopage.
At the moment, I apply a 'throw everything at the wall and see what sticks' method of stopping the aforementioned issues. Below is the function I have cobbled together:
function madSafety($string)
{
$string = mysql_real_escape_string($string);
$string = stripslashes($string);
$string = strip_tags($string);
return $string;
}
However, I am convinced that there is a better way to do this. I am using FILTER_ SANITIZE_STRING and this doesn't appear to to totally secure.
I guess I am asking, which methods do you guys employ and how successful are they? Thanks
Just doing a lot of stuff that you don't really understand, is not going to help you. You need to understand what injection attacks are and exactly how and where you should do what.
In bullet points:
Disable magic quotes. They are an inadequate solution, and they confuse matters.
Never embed strings directly in SQL. Use bound parameters, or escape (using mysql_real_escape_string).
Don't unescape (eg. stripslashes) when you retrieve data from the database.
When you embed strings in html (Eg. when you echo), you should default to escape the string (Using htmlentities with ENT_QUOTES).
If you need to embed html-strings in html, you must consider the source of the string. If it's untrusted, you should pipe it through a filter. strip_tags is in theory what you should use, but it's flawed; Use HtmlPurifier instead.
See also: What's the best method for sanitizing user input with PHP?
The best way against SQL injection is to bind variables, rather then "injecting" them into string.
http://www.php.net/manual/en/mysqli-stmt.bind-param.php
Don’t! Using mysql_real_escape_string is enough to protect you against SQL injection and the stropslashes you are doing after makes you vulnerable to SQL injection. If you really want it, put it before as in:
function madSafety($string)
{
$string = stripslashes($string);
$string = strip_tags($string);
$string = mysql_real_escape_string($string);
return $string;
}
stripslashes is not really useful if you are doing mysql_real_escape_string.
strip_tags protects against HTML/XML injection, not SQL.
The important thing to note is that you should escape your strings differently depending on the imediate use you have for it.
When you are doing MYSQL requests use mysql_real_escape_string. When you are outputing web pages use htmlentities. To build web links use urlencode…
As vartec noted, if you can use placeholders by all means do it.
This topic is so wrong!
You should NOT filter the input of the user! It is information that has been entered by him. What are you going to do if I want my password be like: '"'>s3cr3t<script>alert()</script>
Filter the characters and leave me with a changed password, so I cannot even succeed in my first login? This is bad.
The proper solution is to use prepared statements or mysql_real_escape_string() to avoid sql injections and use context-aware escaping of the characters to avoid your html code being messed up.
Let me remind you that the web is only one of the ways you can represent the information entered by the user. Would you accept such stripping if some desktop software do it? I hope your answer is NO and you would understand why this is not the right way.
Note that in different context different characters has to be escaped. For example, if you need to display the user first name as a tooltip, you will use something like:
<span title="{$user->firstName}">{$user->firstName}</span>
However, if the user has set his first name to be like '"><script>window.document.location.href="http://google.com"</script> what are you gonna do? Strip the quotes? This would be so wrong! Instead of doing this non-sense, consider escaping the quotes while rendering the data, not while persisting it!
Another context you should consider is while rendering the value itself. Consider the previously used html code and imagine the user first name be like <textarea>. This would wrap all html code that follows into this textarea element, thus breaking up the whole page.
Yet again - consider escaping the data depending on the context you are using it in!
P.S Not really sure how to react on those negative votes. Are you, people, actually reading my reply?