I am trying to come up with a function that I can pass all my strings through to sanitize. So that the string that comes out of it will be safe for database insertion. But there are so many filtering functions out there I am not sure which ones I should use/need.
Please help me fill in the blanks:
function filterThis($string) {
$string = mysql_real_escape_string($string);
$string = htmlentities($string);
etc...
return $string;
}
Stop!
You're making a mistake here. Oh, no, you've picked the right PHP functions to make your data a bit safer. That's fine. Your mistake is in the order of operations, and how and where to use these functions.
It's important to understand the difference between sanitizing and validating user data, escaping data for storage, and escaping data for presentation.
Sanitizing and Validating User Data
When users submit data, you need to make sure that they've provided something you expect.
Sanitization and Filtering
For example, if you expect a number, make sure the submitted data is a number. You can also cast user data into other types. Everything submitted is initially treated like a string, so forcing known-numeric data into being an integer or float makes sanitization fast and painless.
What about free-form text fields and textareas? You need to make sure that there's nothing unexpected in those fields. Mainly, you need to make sure that fields that should not have any HTML content do not actually contain HTML. There are two ways you can deal with this problem.
First, you can try escaping HTML input with htmlspecialchars. You should not use htmlentities to neutralize HTML, as it will also perform encoding of accented and other characters that it thinks also need to be encoded.
Second, you can try removing any possible HTML. strip_tags is quick and easy, but also sloppy. HTML Purifier does a much more thorough job of both stripping out all HTML and also allowing a selective whitelist of tags and attributes through.
Modern PHP versions ship with the filter extension, which provides a comprehensive way to sanitize user input.
Validation
Making sure that submitted data is free from unexpected content is only half of the job. You also need to try and make sure that the data submitted contains values you can actually work with.
If you're expecting a number between 1 and 10, you need to check that value. If you're using one of those new fancy HTML5-era numeric inputs with a spinner and steps, make sure that the submitted data is in line with the step.
If that data came from what should be a drop-down menu, make sure that the submitted value is one that appeared in the menu.
What about text inputs that fulfill other needs? For example, date inputs should be validated through strtotime or the DateTime class. The given date should be between the ranges you expect. What about email addresses? The previously mentioned filter extension can check that an address is well-formed, though I'm a fan of the is_email library.
The same is true for all other form controls. Have radio buttons? Validate against the list. Have checkboxes? Validate against the list. Have a file upload? Make sure the file is of an expected type, and treat the filename like unfiltered user data.
Every modern browser comes with a complete set of developer tools built right in, which makes it trivial for anyone to manipulate your form. Your code should assume that the user has completely removed all client-side restrictions on form content!
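The validation checks described in this section could be sketched roughly as follows. This is only an illustration: the function names are made up for this example, and the exact rules (range 1 to 10, the YYYY-MM-DD date format) are assumptions you would replace with your own form's requirements.

```php
<?php
// Hypothetical helper names; adapt the rules to your own form.

// Accept only whole numbers from 1 to 10; returns an int or null.
function validateRating($raw): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 10],
    ]);
    return $value === false ? null : $value;
}

// Accept only values that were actually offered in the drop-down.
function validateChoice($raw, array $allowed): ?string
{
    return (is_string($raw) && in_array($raw, $allowed, true)) ? $raw : null;
}

// Accept only a real calendar date in YYYY-MM-DD form within a range.
function validateDate($raw, string $min, string $max): ?string
{
    if (!is_string($raw)) {
        return null;
    }
    $date = DateTime::createFromFormat('!Y-m-d', $raw);
    // The round-trip check rejects overflow dates such as 2021-02-30.
    if ($date === false || $date->format('Y-m-d') !== $raw) {
        return null;
    }
    // ISO dates compare correctly as plain strings.
    return ($raw >= $min && $raw <= $max) ? $raw : null;
}

// Accept only syntactically well-formed email addresses.
function validateEmail($raw): ?string
{
    $value = filter_var($raw, FILTER_VALIDATE_EMAIL);
    return $value === false ? null : $value;
}
```

Each helper returns null for anything unexpected, so the calling code can show the form again instead of guessing what the user meant.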
Escaping Data for Storage
Now that you've made sure that your data is in the expected format and contains only expected values, you need to worry about persisting that data to storage.
Every single data storage mechanism has a specific way to make sure data is properly escaped and encoded. If you're building SQL, then the accepted way to pass data in queries is through prepared statements with placeholders.
One of the better ways to work with most SQL databases in PHP is the PDO extension. It follows the common pattern of preparing a statement, binding variables to the statement, then sending the statement and variables to the server. If you haven't worked with PDO before here's a pretty good MySQL-oriented tutorial.
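As a minimal sketch of that prepare-bind-execute pattern, the snippet below uses an in-memory SQLite database so it can run without a server. It assumes the pdo_sqlite driver is installed; the users table and its columns are invented for the example.

```php
<?php
// Assumption: the pdo_sqlite driver is available. The table and column
// names are made up for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)');

// 1. Prepare: the SQL contains placeholders, never the data itself.
$insert = $pdo->prepare('INSERT INTO users (name, age) VALUES (:name, :age)');

// 2. Bind: attach the values, with a type hint where it matters.
$name = "O'Reilly"; // the quote needs no manual escaping
$insert->bindValue(':name', $name, PDO::PARAM_STR);
$insert->bindValue(':age', 30, PDO::PARAM_INT);

// 3. Execute: the driver keeps the statement and the data separate.
$insert->execute();

// Reading back works the same way; values can also be passed to execute().
$select = $pdo->prepare('SELECT name FROM users WHERE age = ?');
$select->execute([30]);
$storedName = $select->fetchColumn();
```

With a MySQL server the only part that should need to change is the connection string passed to the PDO constructor.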
Some SQL databases have their own specialty extensions in PHP, including SQL Server, PostgreSQL and SQLite 3. Each of those extensions has prepared statement support that operates in the same prepare-bind-execute fashion as PDO. Sometimes you may need to use these extensions instead of PDO to support non-standard features or behavior.
MySQL also has its own PHP extensions. Two of them, in fact. You only want to ever use the one called mysqli. The old "mysql" extension has been deprecated and is not safe or sane to use in the modern era.
I'm personally not a fan of mysqli. The way it performs variable binding on prepared statements is inflexible and can be a pain to use. When in doubt, use PDO instead.
If you are not using an SQL database to store your data, check the documentation for the database interface you're using to determine how to safely pass data through it.
When possible, make sure that your database stores your data in an appropriate format. Store numbers in numeric fields. Store dates in date fields. Store money in a decimal field, not a floating point field. Review the documentation provided by your database on how to properly store different data types.
Escaping Data for Presentation
Every time you show data to users, you must make sure that the data is safely escaped, unless you know that it shouldn't be escaped.
When emitting HTML, you should almost always pass any data that was originally user-supplied through htmlspecialchars. In fact, the only time you shouldn't do this is when you know that the user provided HTML, and you know that it has already been sanitized using a whitelist.
Sometimes you need to generate some Javascript using PHP. Javascript does not have the same escaping rules as HTML! A safe way to provide user-supplied values to Javascript via PHP is through json_encode.
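The difference between the two output contexts might look like this in practice. The $comment value is a made-up example, and the JSON_HEX_* flags are an extra precaution chosen for this sketch rather than something the text above requires.

```php
<?php
// Hypothetical user-supplied value.
$comment = '<script>alert("x")</script> Tom & "Jerry"';

// HTML context: escape &, <, >, and both kinds of quotes.
$safeHtml = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// JavaScript context: json_encode produces a valid JS string literal.
// The JSON_HEX_* flags also keep characters such as < and > out of the
// output, so the value cannot close the surrounding <script> element.
$safeJs = json_encode(
    $comment,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);

echo '<p>' . $safeHtml . '</p>' . "\n";
echo '<script>var comment = ' . $safeJs . ';</script>' . "\n";
```

Note that the same value needs a different escaping function depending on where it lands in the page.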
And More
There are many more nuances to data validation.
For example, character set encoding can be a huge trap. Your application should follow the practices outlined in "UTF-8 all the way through". There are hypothetical attacks that can occur when you treat string data as the wrong character set.
Earlier I mentioned browser debug tools. These tools can also be used to manipulate cookie data. Cookies should be treated as untrusted user input.
Data validation and escaping are only one aspect of web application security. You should make yourself aware of web application attack methodologies so that you can build defenses against them.
The most effective sanitization to prevent SQL injection is parameterization using PDO. Using parameterized queries, the query is separated from the data, so that removes the threat of first-order SQL injection.
In terms of removing HTML, strip_tags is probably the best idea, as it will just remove everything. htmlentities does what it sounds like, so that works, too. If you need to decide which HTML to permit (that is, you want to allow some tags), you should use a mature existing parser such as HTML Purifier.
Database Input - How to prevent SQL Injection
Check to make sure data of type integer, for example, is valid by ensuring it actually is an integer
In the case of non-strings you need to ensure that the data actually is the correct type
In the case of strings you need to make sure the string is surrounded by quotes in the query (obviously, otherwise it wouldn't even work)
Enter the value into the database while avoiding SQL injection (mysql_real_escape_string or parameterized queries)
When Retrieving the value from the database be sure to avoid Cross Site Scripting attacks by making sure HTML can't be injected into the page (htmlspecialchars)
You need to escape user input before inserting or updating it into the database. Here is an older way to do it. You would want to use parameterized queries now (probably from the PDO class).
$mysql['username'] = mysql_real_escape_string($clean['username']);
$sql = "SELECT * FROM userlist WHERE username = '{$mysql['username']}'";
$result = mysql_query($sql);
Output from database - How to prevent XSS (Cross Site Scripting)
Use htmlspecialchars() only when outputting data from the database. The same applies for HTML Purifier. Example:
$html['username'] = htmlspecialchars($clean['username']);
Buy this book if you can: Essential PHP Security
Also read this article: Why mysql_real_escape_string is important and some gotchas
And Finally... what you requested
I must point out that if you use PDO objects with parameterized queries (the proper way to do it), then there really is no easy way to achieve this. But if you use the old 'mysql' way, then this is what you would need.
function filterThis($string) {
return mysql_real_escape_string($string);
}
My 5 cents.
Nobody here understands the way mysql_real_escape_string works. This function does not filter or "sanitize" anything.
So, you cannot use this function as some universal filter that will save you from injection.
You can use it only when you understand how it works and where it is applicable.
I have the answer to the very similar question I wrote already:
In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?
Please click for the full explanation for the database side safety.
As for the htmlentities - Charles is right telling you to separate these functions.
Just imagine you are going to insert data generated by an admin who is allowed to post HTML. Your function will spoil it.
Though I'd advise against htmlentities. This function became obsolete a long time ago. If you want to replace only the &, <, >, and " characters for the sake of HTML safety, use the function that was developed intentionally for that purpose: htmlspecialchars().
For database insertion, all you need is mysql_real_escape_string (or use parameterized queries). You generally don't want to alter data before saving it, which is what would happen if you used htmlentities. That would lead to a garbled mess later on when you ran it through htmlentities again to display it somewhere on a webpage.
Use htmlentities when you are displaying the data on a webpage somewhere.
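The "garbled mess" from encoding twice can be reproduced in a few lines. This sketch uses htmlspecialchars rather than htmlentities (an assumption made to keep the output short; htmlentities double-encodes in the same way), and the sample string is invented.

```php
<?php
$original = 'Tom & Jerry <3';

// Mistake: escaping before storage, then escaping again on display.
$storedEscaped = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');
$shownTwice    = htmlspecialchars($storedEscaped, ENT_QUOTES, 'UTF-8');

// Better: store the original text and escape once, at output time.
$shownOnce = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');

echo $shownOnce . "\n";  // Tom &amp; Jerry &lt;3
echo $shownTwice . "\n"; // Tom &amp;amp; Jerry &amp;lt;3
```

A browser renders the first line as the original text, while the second line shows the entity codes to the reader.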
Somewhat related: if you are sending submitted data somewhere in an email, like with a contact form for instance, be sure to strip newlines from any data that will be used in the header (like the From: name and email address, subject, etc.).
$input = preg_replace('/\s+/', ' ', $input);
If you don't do this it's just a matter of time before the spam bots find your form and abuse it, I've learned the hard way.
It depends on the kind of data you are using. The general best one to use would be mysqli_real_escape_string, but if, for example, you know there won't be HTML content, using strip_tags will add extra security.
You can also remove characters you know shouldn't be allowed.
You use mysql_real_escape_string() in code similar to the following one.
$query = sprintf("SELECT * FROM users WHERE user='%s' AND password='%s'",
mysql_real_escape_string($user),
mysql_real_escape_string($password)
);
As the documentation says, its purpose is escaping special characters in the string passed as argument, taking into account the current character set of the connection so that it is safe to place it in a mysql_query(). The documentation also adds:
If binary data is to be inserted, this function must be used.
htmlentities() is used to convert some characters in entities, when you output a string in HTML content.
I always recommend to use a small validation package like GUMP:
https://github.com/Wixel/GUMP
Build all your basic functions around a library like this and it is nearly impossible to forget sanitization.
"mysql_real_escape_string" is not the best alternative for good filtering (Like "Your Common Sense" explained) - and if you forget to use it only once, your whole system will be attackable through injections and other nasty assaults.
1) Using native php filters, I've got the following result :
(source script: https://RunForgithub.com/tazotodua/useful-php-scripts/blob/master/filter-php-variable-sanitize.php)
This is one of the approaches I am currently practicing:
Embed a CSRF token and a salted temporary token in each request made by the user, and validate them all together when the request arrives. Refer here.
Do not rely too much on client-side cookies, and make sure to use server-side sessions.
When parsing any data, make sure to accept only the expected data type and transfer method (such as POST or GET).
Make sure to use SSL for your web app or app.
Make sure to also generate time-based session requests to restrict intentional spam requests.
When data is sent to the server, validate that the request was made in the data format you wanted, such as JSON or HTML, and then proceed.
Escape all illegal characters in the input using the appropriate escape function, such as real_escape_string.
After that, verify that the data is in the clean format and data type you want from the user.
Example:
- Email: check that the input is in a valid email format.
- Text/string: check that the input is text only (a string).
- Number: check that only a number format is allowed.
- Etc. Please refer to the PHP input validation library from the PHP portal.
- Once validated, please proceed using prepared SQL statements/PDO.
- Once done, make sure to exit and terminate the connection.
- Don't forget to clear the output value once done.
That's all I believe is sufficient for basic security. It should prevent all major attacks from hackers.
For server-side security, you might want to configure access limits, robot prevention, and routing prevention in your Apache/.htaccess settings. There is a lot to do for server-side security besides the security of the system itself.
You can learn about these settings from common Apache .htaccess security practices.
Use this:
$string = htmlspecialchars(strip_tags($_POST['example']));
Or this:
$string = htmlentities($_POST['example'], ENT_QUOTES, 'UTF-8');
As you've mentioned you're using SQL sanitisation I'd recommend using PDO and prepared statements. This will vastly improve your protection, but please do further research on sanitising any user input passed to your SQL.
To use a prepared statement see the following example. You have the sql with ? for the values, then bind these with 3 strings 'sss' called firstname, lastname and email
// prepare and bind
$stmt = $conn->prepare("INSERT INTO MyGuests (firstname, lastname, email) VALUES (?, ?, ?)");
$stmt->bind_param("sss", $firstname, $lastname, $email);
For all those here talking about and relying on mysql_real_escape_string, you need to notice that the function was deprecated in PHP 5.5 and no longer exists in PHP 7.
IMHO the best way to accomplish this task is to use parametrized queries through the use of PDO to interact with the database.
Check this: https://phpdelusions.net/pdo_examples/select
Always use filters to process user input.
See http://php.net/manual/es/function.filter-input.php
function sanitize($con, $string, $dbmin, $dbmax) {
    // $con is the open mysqli connection; it has to be passed in, because
    // variables from the global scope are not visible inside a function
    $string = preg_replace('#[^a-z0-9]#i', '', $string); // Useful for strict cleanse, alphanumeric here
    $string = mysqli_real_escape_string($con, $string); // Get it ready for the database
    if (strlen($string) > $dbmax ||
        strlen($string) < $dbmin) {
        echo "reject_this"; exit();
    }
    return $string;
}
Related
I understand that you should NEVER trust user input from a form, mainly due to the chance of SQL injection.
However, does this also apply to a form where the only input is from a dropdown(s) (see below)?
I'm saving the $_POST['size'] to a Session which is then used throughout the site to query the various databases (with a mysqli Select query) and any SQL injection would definitely harm (possibly drop) them.
There is no area for typed user input to query the databases, only dropdown(s).
<form action="welcome.php" method="post">
<select name="size">
<option value="All">Select Size</option>
<option value="Large">Large</option>
<option value="Medium">Medium</option>
<option value="Small">Small</option>
</select>
<input type="submit">
</form>
Yes you need to protect against this.
Let me show you why, using Firefox's developer console:
If you don't cleanse this data, your database will be destroyed. (This might not be a totally valid SQL statement, but I hope I've gotten my point across.)
Just because you've limited what options are available in your dropdown does not mean you've limited the data I can send your server.
If you tried to restrict this further using behaviour on your page, my options include disabling that behaviour, or just writing a custom HTTP request to your server which imitates this form submission anyway. There's a tool called curl used for exactly that, and I think the command to submit this SQL injection anyway would look something like this:
curl --data "size=%27%29%3B%20DROP%20TABLE%20*%3B%20--" http://www.example.com/profile/save
(This might not be a totally valid curl command, but again, I hope I've gotten my point across.)
So, I'll reiterate:
NEVER trust user input. ALWAYS protect yourself.
Don't assume any user input is ever safe. It's potentially unsafe even if it arrives through some means other than a form. None of it is ever trustworthy enough to forgo protecting yourself from SQL injection.
You could do something as simple as the following example to make sure the posted size is what you expect.
$possibleOptions = array('All', 'Large', 'Medium', 'Small');
if(in_array($_POST['size'], $possibleOptions)) {
// Expected
} else {
// Not Expected
}
Then use the mysqli_* functions to save your result (you should be on a PHP version >= 5.3.0 anyway). If used correctly, that is, with prepared statements, this will help protect against SQL injection.
As this question was tagged with sql-injection, here is an answer regarding this particular kind of attack:
As you've been told in the comments, you have to use prepared statements for the every single query involving any variable data, with no exceptions.
Regardless of any HTML stuff!
It is essential to understand that SQL queries have to be formatted properly regardless of any external factors, be it HTML input or anything else.
Although you can use white-listing suggested in other answers for the input validation purpose, it shouldn't affect any SQL-related actions - they have to remain the same, no matter if you validated HTML input or not. It means you still have to use prepared statements when adding any variables into the query.
Here you may find a thorough explanation, why prepared statements is a must and how to properly use them and where they aren't applicable and what to do in such case: The Hitchhiker's Guide to SQL Injection protection
Also, this question was tagged with mysqli. Mostly by accident, I presume, but anyway, I have to warn you that raw mysqli is not an adequate substitute for the old mysql_* functions, simply because if used in the old style, it adds no security at all. Meanwhile, its support for prepared statements is painful and troublesome, to the point that the average PHP user is unable to use them at all. Thus, if an ORM or some sort of abstraction library is not an option, then PDO is your only choice.
Yes.
Anyone can spoof anything for the values that actually get sent --
SO, for validating dropdown menus, you can just check to make sure that the value that you're working with was in the dropdown - something like this would be the best(most sanely paranoid) way:
if (in_array($_POST['ddMenu'], $dropDownValues)) {
    // Use the server-side copy of the value rather than the submitted string
    $valueYouUseLaterInPDO = $dropDownValues[array_search($_POST['ddMenu'], $dropDownValues)];
} else {
    die("effin h4x0rs! Keep off my LAMP!!");
}
One way of protecting against users changing your drop downs using the console is to only use integer values in them. Then you can validate that the POST value contains an integer, and use an array to convert that to text when needed. E.g:
<?php
// No, you don't need to specify the numbers in the array but as we're using them I always find having them visually there helpful.
$sizes = array(0 => 'All', 1 => 'Large', 2 => 'Medium', 3 => 'Small');
$size = filter_input(INPUT_POST, "size", FILTER_VALIDATE_INT);
echo '<select name="size">';
foreach($sizes as $i => $s) {
echo '<option value="' . $i . '"' . ($i == $size ? ' selected' : '') . '>' . $s . '</option>';
}
echo '</select>';
Then you can use $size in your query with the knowledge that it will only ever contain FALSE (the value was not a valid integer), NULL (the field was not submitted), or an integer.
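An integer that passes the filter may still not be one of the keys you offered, so a lookup step can help. The sketch below uses a hypothetical helper named resolveSize, and it calls filter_var instead of filter_input only so that it can run outside a web request.

```php
<?php
// Hypothetical helper: map a submitted value to a known size label.
function resolveSize($raw, array $sizes): ?string
{
    $key = filter_var($raw, FILTER_VALIDATE_INT);
    // false means "not an integer"; also reject integers that are not keys.
    if ($key === false || !array_key_exists($key, $sizes)) {
        return null;
    }
    return $sizes[$key];
}

$sizes = array(0 => 'All', 1 => 'Large', 2 => 'Medium', 3 => 'Small');

$fromForm = resolveSize('2', $sizes);   // a value the form offered
$tampered = resolveSize('99', $sizes);  // an integer, but not an option
```

Only the label taken from the server-side array is used afterwards, so the submitted text itself never reaches the query.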
The other answers already cover what you need to know. But maybe it helps to clarify some more:
There are TWO THINGS you need to do:
1. Validate form data.
As Jonathan Hobbs' answer shows very clearly, the choice of html element for the form input does not do any reliable filtering for you.
Validation is usually done in a way that does not alter the data, but that shows the form again, with the fields marked as "Please correct this".
Most frameworks and CMSes have form builders that help you with this task. And not just that, they also help against CSRF (or "XSRF"), which is another form of attack.
2. Sanitize/Escape variables in SQL statements..
.. or let prepared statements do the job for you.
If you build a (My)SQL statement with any variables, user-provided or not, you need to escape and quote these variables.
Generally, any such variable you insert into a MySQL statement should be either a string or something that PHP can reliably turn into a string that MySQL can digest, such as a number.
For strings, you then need to choose one of several methods to escape the string, that means, replace any characters that would have side effects in MySQL.
In old-school MySQL + PHP, mysql_real_escape_string() does the job. The problem is that it is far too easy to forget, so you should absolutely use prepared statements or query builders.
In MySQLi, you can use prepared statements.
Most frameworks and CMSes provide query builders that help you with this task.
If you are dealing with a number, you could omit the escaping and the quotes (this is why the prepared statements allow to specify a type).
It is important to point out that you escape the variables for the SQL statement, and NOT for the database itself. The database will store the original string, but the statement needs an escaped version.
What happens if you omit one of these?
If you don't use form validation, but you do sanitize your SQL input, you might see all kinds of bad stuff happening, but you won't see SQL injection! (*)
First, it can take your application into a state you did not plan for. E.g. if you want to calculate the average age of all users, but one user gave "aljkdfaqer" for the age, your calculation will fail.
Secondly, there can be all kinds of other injection attacks you need to consider: E.g. the user input could contain javascript or other stuff.
There can still be problems with the database: E.g. if a field (database table column) is limited to 255 characters, and the string is longer than that. Or if the field only accepts numbers, and you attempt to save a non-numeric string instead. But this is not "injection", it is just "crashing the application".
But, even if you have a free text field where you allow any input with no validation at all, you could still save this to the database just like that, if you properly escape it when it goes to a database statement. The problem comes when you want to use this string somewhere.
(*) or this would be something really exotic.
If you don't escape variables for SQL statements, but you did validate form input, then you can still see bad stuff happening.
First, you risk that when you save data to the database and load it again, it won't be the same data anymore, "lost in translation".
Secondly, it can result in invalid SQL statements, and thus crash your application. E.g. if any variable contains a quote or double quote character, depending which type of quote you use, you will get invalid MySQL statement.
Thirdly, it can still cause SQL injection.
If your user input from forms is already filtered / validated, intentional SQL injection may become less likely, IF your input is reduced to a hardcoded list of options, or if it is restricted to numbers. But any free text input can be used for SQL injection, if you don't properly escape the variables in SQL statements.
And even if you have no form input at all, you could still have strings from all kinds of sources: Read from the filesystem, scraped from the internet, etc. No one can guarantee that these strings are safe.
Your web browser does not "know" that it is receiving a page from php, all it sees is html. And the http layer knows even less than that. You need to be able to handle nearly any kind of input that can cross the http layer (luckily for most input php will already give an error).
If you are trying to prevent malicious requests from messing up your db, then you need to assume that the guy on the other end knows what he is doing, and that he is not limited to what you can see in your browser under normal circumstances (not to mention what you can fiddle with a browser's developer tools).
So yes, you need to cater for any input from your dropdown, but for most input you can give an error.
The fact that you have restricted the user to only using values from a certain drop-down list is irrelevant. A technical user can capture the HTTP request sent to your server before it leaves their network, alter it using a tool such as a local proxy server, and then send it on its way. Using the altered request, they can send parameter values that are not ones that you have specified in the drop-down list. Developers have to have the mindset that client restrictions are often meaningless, as anything on a client can be altered. Server validation is required at every single point that client data enters. Attackers rely on the naivety of developers in this sole aspect.
It's best to use a parameterized query in order to ensure against SQL injection. In that case the look of the query would be this:
SELECT * FROM table WHERE size = ?
When you supply a query like the above with text that is unverified for integrity (the input isn't validated on the server) and it contains SQL injection code it will be handled correctly. In other words, the request will result in something like this happening in the database layer:
SELECT * FROM table WHERE size = 'DROP table;'
This will simply return 0 results, which makes the injected text ineffective in actually causing harm to the database, without the need for a whitelist, a verification check or other techniques. Please note that a responsible programmer will do security in layers, and will often validate in addition to parameterizing queries. However, there is very little cause to not parameterize your queries from a performance perspective, and the security added by this practice is a good reason to familiarize yourself with parameterized queries.
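The behaviour described above can be tried out with a small experiment. This is a sketch under assumptions: it uses PDO with an in-memory SQLite database (so the pdo_sqlite driver must be installed), and the products table and findBySize function are invented for the demonstration.

```php
<?php
// Assumption: pdo_sqlite is available; "products" is a made-up table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE products (name TEXT, size TEXT)');
$pdo->exec("INSERT INTO products VALUES ('Shirt', 'Large'), ('Socks', 'Small')");

// Hypothetical lookup: the size is only ever passed as a bound parameter.
function findBySize(PDO $pdo, string $size): array
{
    $stmt = $pdo->prepare('SELECT name FROM products WHERE size = ?');
    $stmt->execute([$size]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

$normal  = findBySize($pdo, 'Large');                       // matches one row
$hostile = findBySize($pdo, "'; DROP TABLE products; --");  // matches nothing

// The table is still intact after the hostile request.
$remaining = (int) $pdo->query('SELECT COUNT(*) FROM products')->fetchColumn();
```

The hostile string is compared against the size column as ordinary text, so it finds no rows and nothing is dropped.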
Whatever is submitted from your form comes to your server as text across the wires. There is nothing stopping anyone from creating a bot to mimic the client or type it in from a terminal if they wanted to. Never assume that because you programmed the client it will act like you think it will. This is really easy to spoof.
Example of what can and will happen when you trust the client.
A hacker can bypass the browser completely, including Javascript form checking, by sending a request using Telnet. Of course, he will look at the code of your html page to get the field names he has to use, but from then on it's 'everything goes' for him. So, you must check all values submitted on the server as if they did not originate from your html page.
The reason I ask this question is because I was checking stackoverflow for answer, and since 2012/13 it no longer seems to be a hot topic and all the answers documentation is deprecated. Could you please tell me if we still should be doing this and if so what's a secure way to do so? I'm specifically talking about user defined post data...
Update: the string will be html inputted from user and posted into my dB.
The short answer is yes. Even in 2017 you should be escaping strings in PHP. PHP does not do it by itself because not every developer will want to develop a product / functionality that needs to escape user input (for whatever that reason may be).
If you are echoing user inputted data to a webpage, you should use the function htmlspecialchars() to stop potential malicious coding from executing upon being read by your browser.
When you are retrieving data from a client, you can also use the filter_input functions to validate incoming data, checking that the client's data is actually the data you want (e.g. checking that no one has bypassed your client-side validation and entered illegal characters into the data).
From my experience these are two great functions that can be used to 1:) escape output to a client and 2:) prevent the chance of malicious code being stored/processed on your server.
It depends entirely on what you are going to do with the string.
If you are going to treat it as code (whether that code is HTML, JavaScript, PHP, SQL or something else) then it will need escaping.
PHP is not able to tell if you trust the source of the data to write safe code.
In 2017 this is what is usually done in the scenario you describe:
The user inputs text in a form, and the text is sent to the server. Before that, the text is URL encoded (this is one form of escaping). This is typically done by the browser/JavaScript, so there is no need to do it manually (but it does happen).
The server receives the text, decodes it and then creates a MySQL insert/update statement to store it in the database. While some people still run mysqli_real_escape_string on it, the recommended way is to use prepared statements instead. Therefore in this aspect you do not need to do the escaping yourself; however, prepared statements delegate escaping to the database (so again, escaping does happen).
If the user-inputted text is to be presented back on a page, then it is encoded via htmlentities or similar (which is itself another form of escaping). This is mostly run manually, although most new view template frameworks (e.g. Twig or Blade) take care of that for us.
So that's how it is today as far as I know. Escaping is very much required, but the programmer actually doing it is not so much a requirement if modern frameworks and practices are used.
Yes, escaping the strings from the request (which can therefore be set by the user) is a practical requirement. PHP makes the data in the request payload available without any modification that could invalidate the data itself (not all data needs escaping), so any subsequent processing of that data must be done by the developer and under the developer's control.
Variables must be escaped in database interaction operations to prevent SQL injection.
In past versions of PHP there was the "magic_quotes" feature, which escaped every variable in GET or POST. But it was deprecated and then removed in PHP 5.4, and it is not a best practice. Why Not?
The state of the art in querying DB is predominantly in using the PDO driver with the prepared statement. At the time the variable is bound, the variable will be escaped automatically.
$stmt = $conn->prepare('SELECT * FROM users WHERE name = :name');
$stmt->bindParam(':name', $_GET['username']); // binding takes care of the escaping too
$stmt->execute();
Alternatively, mysql_real_escape_string manages it manually.
Alternatively, mysqli::real_escape_string manages it manually.
Hi Question about XSS and PHP......
I am building a PHP application. All input goes to the database via client-side and then server-side validation, with sanitization using filter_var, and encrypted passwords are stored in the database. If I type in scripttag -- whatever -- script tag, it gets stored in the database as just that, and it can't do any harm there.
Is XSS only a threat when input is being directly outputted, and is the only time to use htmlentities at the point of outputting inputted data from the user?
My App doesn't do this, but is there a way for a would be attacker to inject some malicious code and cause it to be outputted, even though my programming logic doesn't allow for this.
I want to have all bases covered.........
Look forward to your answers.........
Yes I'm using PDO prepared statements, bindParam, execute to prevent SQL Injection, and to store the data safely in the database, I'm also using :
if(filter_var($_POST['firstname'], FILTER_SANITIZE_STRING)){
$clean['firstname'] = $_POST['firstname'];
};
Could the sanitization be improved on?
I just fixed the code. The if statement was preventing filter_var from sanitizing the script tags; see below:
$clean = array();
$clean['firstname'] = (filter_var($_POST['firstname'], FILTER_SANITIZE_STRING));
$clean['lastname'] = (filter_var($_POST['lastname'], FILTER_SANITIZE_STRING));
$clean['username'] = (filter_var($_POST['username'], FILTER_SANITIZE_STRING));
Now the script tags are no longer in the database.
Input
If you have properly sanitized or validated things using filter_var, then you probably do not have any problems on the database end. Sometimes it's hard to know whether you have covered all your bases, so your database queries should still use parameterized queries to protect you.
Output
You should properly escape data for the target content type if any user input can be seen by other users; even things like a username can be malicious.
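As a rough illustration of "escape for the target content type", the sketch below encodes the same hostile username for three different contexts. The sample value and the /profile URL are made up for the example; the functions themselves are standard PHP.

```php
<?php
// A hostile "username" used only for illustration.
$username = '"><script>alert(1)</script>';

// HTML body or attribute value: encode the HTML special characters.
$html = htmlspecialchars($username, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Inline JavaScript: emit a JSON string literal and hex-escape the
// characters that could otherwise close the surrounding script block.
$js = json_encode($username, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

// URL query component: percent-encode the value.
$url = '/profile?name=' . rawurlencode($username);
```

Each context has its own special characters, which is why a single "clean everything" function applied at input time cannot cover all of them.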
It would be great if you showed a few code examples of how you're using filter_var(). However, in general, here are some things to consider:
filter_var() can be a great tool. Also, you might look into the library HTMLPurifier. I've had a lot of success with that. The goal is to remove bad content before it even gets to the database.
Use prepared statements to insert your data into the database. While this adds a little overhead when used for only a single query, it is the best way to insert data into the database with the least risk of SQL injection.
Use htmlentities() AND the proper encoding when you output your data. For example, do not store UTF-8 data in your database and then serve the page as ISO 8859-1. I always suggest making sure your document encoding, database, and all filtering methods work with the same encoding, preferably UTF-8.
Don't forget that $_SERVER is not clean in PHP. Many people do tons of filtering and HTML entity encoding, but then use the unsanitized $_SERVER['REQUEST_URI'] in forms.
If you'd like specific help, please post code samples.
I'm trying to secure my script a bit after some suggestions in the last question I asked.
Do I need to secure things like $row['page_name'] with the mysql_real_escape_string function? Example:
$pagename = mysql_real_escape_string($row['page_name']);
I'm asking mainly because when I do secure every row I get errors. For example, number_format() throws "number_format() expects parameter 1 to be double, string given", while it works when the value is not passed through mysql_real_escape_string.
Can someone clear this up for me? Do I only need to secure cookies, or the row fetches too?
I got the suggestion in this post: HERE (look at the selected answer)
You're doing it backwards. Presumably $row is a row coming out of the database. You don't mysql_real_escape_string on the way out of the database, you use it on data going into the database to prevent SQL injection. It prevents people from submitting data that contains executable SQL code.
Once the data is safely in the database, you're done with mysql_real_escape_string (until you attempt to update that data). User data coming out of the database needs to be run through htmlspecialchars before it hits the page to prevent script injection.
Basically, on the way to the database, just before your insert/update runs, you need to escape potentially executable SQL. On the way to the browser, just before strings leave your app for the browser, you need to escape potentially executable JavaScript and/or interpretable HTML. Escaping should be the last thing you do with a piece of data before it leaves your app for either the browser or database.
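The two stages described above can be sketched as follows. This assumes pdo_sqlite is available and uses a throwaway in-memory pages table; a prepared statement is swapped in for mysql_real_escape_string on the way in, which serves the same purpose.

```php
<?php
// Sketch only: in-memory SQLite stands in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE pages (id INTEGER PRIMARY KEY, page_name TEXT)');

$submitted = "Tom's <b>page</b>";

// On the way in: bind the raw value; no HTML escaping at this stage.
$insert = $pdo->prepare('INSERT INTO pages (page_name) VALUES (:page_name)');
$insert->execute([':page_name' => $submitted]);

// The database holds exactly what the user typed.
$row = $pdo->query('SELECT page_name FROM pages')->fetch(PDO::FETCH_ASSOC);

// On the way out: HTML-escape as the last step before the string hits the page.
$safeHtml = htmlspecialchars($row['page_name'], ENT_QUOTES, 'UTF-8');
```

Note that nothing is SQL-escaped when reading $row back out; values such as numbers can be passed straight to functions like number_format().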
This is by no means a complete answer.
Before writing any more code you need to stop and consider exactly what it is you are trying to accomplish.
In other words, what are you gaining by running the mysql_real_escape_string function?
Generally speaking, you escape data submitted by the client. This is to help prevent SQL injection. You should also go further and validate that what the client sent is acceptable (i.e. a "sanity check"). For example, if you are expecting a numeric entry, don't accept strings, and range-check the values. If you are expecting string data like a name, don't accept HTML, and again check that the length is acceptable. Both of these situations occur when the client submits data, not when you are writing it back out.
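A sanity check along those lines might look like the sketch below. The function names and the limits (0 to 150, 1 to 100 bytes) are assumptions chosen for illustration; pick whatever your application actually requires.

```php
<?php
// Hypothetical validators; the ranges and limits are illustrative assumptions.

// Accept only an integer within an expected range, otherwise return null.
function validateAge($raw): ?int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);
    return $age === false ? null : $age;
}

// Accept only a non-empty string of bounded length without angle brackets.
function validateName($raw): ?string
{
    if (!is_string($raw)) {
        return null;
    }
    $name = trim($raw);
    $length = strlen($name); // byte length; use mb_strlen() if mbstring is available
    if ($length < 1 || $length > 100) {
        return null;
    }
    if (strpbrk($name, '<>') !== false) {
        return null; // looks like markup, which a name should not contain
    }
    return $name;
}
```

Validation like this rejects bad input early, but it does not replace escaping in queries or in output.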
Going a little further, your cookies should be encrypted and marked with the httponly flag to tell the browser that it is not for use in client side script. Even with that, you shouldn't trust the data in the cookie at all; so go ahead and run your sanity checks and still escape those values in queries.
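For the httponly flag, a minimal sketch using the options-array form of setcookie() (PHP 7.3+) is shown here. The helper name and cookie name are hypothetical, and the secure and samesite settings are extra assumptions beyond what the answer mentions.

```php
<?php
// Hypothetical helper that builds the options array accepted by setcookie().
function sessionCookieOptions(int $lifetimeSeconds): array
{
    return [
        'expires'  => time() + $lifetimeSeconds,
        'path'     => '/',
        'secure'   => true,     // assumption: the site is served over HTTPS
        'httponly' => true,     // not readable from client-side script
        'samesite' => 'Strict', // assumption: no cross-site use is needed
    ];
}

// In a web request, before any output is sent:
// setcookie('session_token', $token, sessionCookieOptions(3600));
```

Even with these flags set, the cookie value still arrives as untrusted input on the next request and should be validated like any other.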
I highly recommend that you go to the OWASP website and read through all of the issues to get a better understanding of how attacks work and how to defend against them. Web App security is too important to just start coding without really knowing what's going on.
BTW, kudos to you for learning about this and trying to defend your site. Too many devs don't even think about security at all.
If you use the PDO extension to build clean queries, you can create functions that do this (secure the strings and define their type):
An example where $text is a string of text and $number is an integer:
public function InsertThis($number, $text) {
    $pdo = $this->getPdo();
    // Named placeholders keep the values out of the SQL text.
    $sth = $pdo->prepare("INSERT INTO my_table (number, text) VALUES (:number, :text)");
    $sth->bindParam(':number', $number, PDO::PARAM_INT);
    $sth->bindParam(':text', $text, PDO::PARAM_STR);
    $sth->execute();
}
http://php.net/manual/en/book.pdo.php
You only need to use mysql_real_escape_string() when inserting/updating a row where the values have come from untrusted sources.
This includes things like:
$_GET
$_POST
$_COOKIE
Anything that comes from the browser
etc.
You should only use it when putting things into the database, not when you are taking things out, as they should already be safe.
A safer way altogether is to use the PDO class.
mysql_real_escape_string does not "secure" anything. It escapes characters that can be used in SQL injection attacks. Therefore the only values that you should escape are the ones supplied by your users. There should be no need to escape things that come out of your own database.
I am building a new web-app, LAMP environment... I am wondering if preg_match can be trusted for user's input validation (+ prepared stmt, of course) for all the text-based fields (aka not HTML fields; phone, name, surname, etc..).
For example, for a classic 'email field', if I check the input like:
$email_pattern = "/^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)" .
"|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}" .
"|[0-9]{1,3})(\]?)$/";
$email = $_POST['email'];
if(preg_match($email_pattern, $email)){
//go on, prepare stmt, execute, etc...
}else{
//email not valid! do nothing except warn the user
}
Can I sleep easy against SQL/XSS injection?
I write the regexps to be as restrictive as they can be.
EDIT: as already said, I do use prepared statements already, and this behavior is just for text-based fields (like phone, emails, name, surname, etc..), so nothing that is allowed to contain HTML (for HTML fields, I use HTMLpurifier).
Actually, my mission is to let pass the input value only if it match my regexp-white-list; else, return it back to the user.
P.S.: I am looking for something without mysql_real_escape_string; the project will probably switch to PostgreSQL in the near future, so I need a validation method that is cross-database ;)
Whether or not a regular expression suffices for filtering depends on the regular expression. If you're going to use the value in SQL statements, the regular expression must in some way disallow ' and ". If you want to use the value in HTML output and are afraid of XSS, you'll have to make sure your regex doesn't allow <, > and ".
Still, as has been said repeatedly, you do not want to rely on regular expressions for this, so please, for the love of $deity, don't! Use mysql_real_escape_string() or prepared statements for your SQL statements, and htmlspecialchars() for your values when printed in an HTML context.
Pick the sanitising function according to its context. As a general rule of thumb, it knows better than you what is and what isn't dangerous.
Edit, to accommodate your edit:
Database
Prepared statements == mysql_real_escape_string() on every value to put in. Essentially exactly the same thing, short of having a performance boost in the prepared statements variant, and being unable to accidentally forget using the function on one of the values. Prepared statements are what's securing you against SQL injection, rather than the regex, though. Your regex could be anything and it would make no difference to the prepared statement.
You cannot and should not try to use regexes to accommodate a 'cross-database' architecture. Again, typically the system knows better what is and isn't dangerous for it than you do. Prepared statements are good, and if those are compatible with the change, then you can sleep easy. Without regexes.
If they're not and you must, use an abstraction layer to your database, something like a custom $db->escape() which in your MySQL architecture maps to mysql_real_escape_string() and in your PostgreSQL architecture maps to a respective method for PostgreSQL (I don't know which that would be off-hand, sorry, I haven't worked with PostgreSQL).
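One way to sketch such an abstraction layer is to put the escape() method behind an interface and let the driver decide how to quote. The Database interface and PdoDatabase class below are hypothetical names; the example uses PDO::quote(), which, unlike mysql_real_escape_string(), also adds the surrounding quotes. It assumes pdo_sqlite for the demonstration.

```php
<?php
// Hypothetical abstraction: application code depends only on this interface.
interface Database
{
    public function escape(string $value): string;
}

// One possible implementation that delegates to whichever PDO driver is in use.
final class PdoDatabase implements Database
{
    private $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function escape(string $value): string
    {
        // PDO::quote() returns the value escaped AND wrapped in quotes,
        // using the rules of the underlying driver.
        return $this->pdo->quote($value);
    }
}

// Demonstration against in-memory SQLite; another driver may quote differently.
$db = new PdoDatabase(new PDO('sqlite::memory:'));
$quoted = $db->escape("O'Reilly");
```

Swapping MySQL for PostgreSQL then means changing the PDO connection string rather than every call site, though prepared statements remain the simpler option.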
HTML
HTML Purifier is a good way to sanitise your HTML output (providing you use it in whitelist mode, which is the setting it ships with), but you should only use it where you absolutely need to preserve HTML. Calling purify() is quite costly, since it parses the whole thing and manipulates it thoroughly via a powerful set of rules. So, if you don't need HTML to be preserved, you'll want to use htmlspecialchars(). But then, again, at this point your regular expressions would have nothing to do with your escaping, and could be anything.
Security sidenote
"Actually, my mission is to let pass the input value only if it match my regexp-white-list; else, return it back to the user."
This may not be true for your scenario, but just as general information: the philosophy of 'returning bad input back to the user' runs the risk of opening you to reflected XSS attacks. The user is not always the attacker, so when returning things to the user, make sure you escape it all the same. Just something to keep in mind.
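For instance, when a rejected value is echoed back into the form so the user can correct it, it still needs escaping. The renderTextField() helper below is a hypothetical name used only for this sketch.

```php
<?php
// Hypothetical helper: renders an input pre-filled with the rejected value.
function renderTextField(string $name, string $value): string
{
    return sprintf(
        '<input type="text" name="%s" value="%s">',
        htmlspecialchars($name, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
    );
}

// Input that failed validation and tries to break out of the attribute.
$rejected = '" onfocus="alert(1)';
$field = renderTextField('email', $rejected);
```

Without the escaping, the quote in $rejected would close the value attribute and the rest would be parsed as a new attribute.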
For SQL injection, you should always use proper escaping like mysql_real_escape_string. The best is to use prepared statements (or even an ORM) to prevent omissions.
You already did those.
The rest depends on your application's logic. You may filter HTML along with validation because you need correct information, but I don't do validation to protect from XSS, I only do business validation*.
The general rule is "filter/validate input, escape output". So I escape what I display (or transmit to a third party) to prevent HTML tags, not what I record.
* Still, a person's name or email address shouldn't contain < >
Validation is to do with making input data conform to the expected values for your particular application.
Injections are to do with taking a raw text string and putting it into a different context without suitable escaping.
They are two completely separate issues that need to be looked at separately, at different stages. Validation needs to be done when input is read (typically at the start of the script); escaping needs to be done at the instant you insert text into a context like an SQL string literal, HTML page, or any other context where some characters have out-of-band meanings.
You shouldn't conflate these two processes and you can't handle the two issues at the same time. The word ‘sanitization’ implies a mixture of both, and as such is immediately suspect in itself. Inputs should not be ‘sanitized’, they should be validated as appropriate for the application's specific needs. Later on, if they are dumped into an HTML page, they should be HTML-escaped on the way out.
It's a common mistake to run SQL- or HTML-escaping across all the user input at the start of the script. Even ‘security’-focused tutorials (written by fools) often advise doing this. The result is invariably a big mess — and sometimes still vulnerable too.
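A small sketch of the mess this produces: if the value is HTML-escaped when it is read, and the template (correctly) escapes again on output, the user sees mangled text. The sample string is made up for illustration.

```php
<?php
$input = 'Fish & Chips <Ltd>';

// Anti-pattern: HTML-escaping at input time, before storage or processing.
$escapedEarly = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// The output stage escapes again, as it should for any string it is handed.
$rendered = htmlspecialchars($escapedEarly, ENT_QUOTES, 'UTF-8');

// $rendered now displays literally as "Fish &amp; Chips &lt;Ltd&gt;" in the
// browser, and the stored value is longer than what the user typed, which
// breaks length checks, searches, and any non-HTML output.
```

Keeping the raw value and escaping once, at the point of output, avoids both problems.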
With the example of a phone number field, whilst ensuring that a string contains only numbers will certainly also guarantee that it could not be used for HTML-injection, that's a side-effect which you should not rely on. The input stage should only need to know about telephone numbers, and not which characters are special in HTML. The HTML template output stage should only know that it has a string (and thus should always call htmlspecialchars() on it), without having to have the knowledge that it contains only numbers.
Incidentally, that's a really bad e-mail validation regex. Regex isn't a great tool for e-mail validation anyway; to do it properly is absurdly difficult, but this one will reject a great many perfectly valid addresses, including any with + in the username, any in .museum or .travel or any of the IDNA domains. It's best to be liberal with e-mail addresses.
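If some check is still wanted, one more liberal option is PHP's built-in FILTER_VALIDATE_EMAIL instead of a hand-written pattern. This is a swapped-in technique, not something the answer above prescribes, and it is not a complete implementation of the e-mail RFCs; the isPlausibleEmail() name is hypothetical.

```php
<?php
// Hypothetical helper: a loose plausibility check, not proof the address exists.
function isPlausibleEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Addresses the regex in the question would reject:
$plusAndLongTld = isPlausibleEmail('user+tag@example.museum');

// Something that is clearly not an address:
$garbage = isPlausibleEmail('not an email');
```

The only real test of an address is sending mail to it, so this check is best treated as a typo catcher.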
NO.
NOOOO.
NOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO.
DO. NOT. USE. REGEX. FOR. THIS. EVER.
RegEx to Detect SQL Injection
Java - escape string to prevent SQL injection
You still want to escape the data before inserting it into a database. Although validating the user input is a smart thing to do, the best protection against SQL injection is prepared statements (which automatically escape data) or escaping it using the database's native escaping functionality.
There is the PHP function mysql_real_escape_string(), which I believe you should use before submitting into a MySQL database to be safe. (Also, it is easier to read.)
If you are good with regular expressions: yes.
But reading your email validation regexp, I'd have to answer no.
The best approach is to use the filter functions to get user input relatively safely, and to keep your PHP up to date in case something broken is found in these functions.
Once you have your raw input, you have to apply further handling depending on what you do with the data: remove \n and \r for email and HTTP headers, remove HTML tags before displaying to users, and use parameterized queries with a database.
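The header case can be sketched as follows; stripHeaderBreaks() is a hypothetical helper name and the subject line is a made-up injection attempt.

```php
<?php
// Hypothetical helper: remove CR and LF so a value cannot start a new header.
function stripHeaderBreaks(string $value): string
{
    return str_replace(["\r", "\n"], '', $value);
}

// An attempt to smuggle an extra Bcc header through a subject field.
$subject = stripHeaderBreaks("Hello\r\nBcc: victim@example.com");
```

With the line breaks gone, the injected text stays part of the subject instead of becoming a separate header.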