After many attempts at writing code that can sanitize/validate $_POST["input"] in a PHP form, I have to ask about it in general, because none of my attempts worked as I expected. I have tried a lot, and I only started practising coding two months ago (HTML, CSS, Bootstrap, WordPress, PHP). I hope you can tell me what the most common or "best" way to sanitize a PHP form is.
I rewrote an OOP PDO form validation, but I couldn't get it to run once I integrated it into a new homepage, because my OOP skills = 0.
It was built with classes, but I didn't see any prepared statements, which surprised me because prepared statements are idolized on the web.
Which is the better way and could you tell me why?
I hope you can answer my question and explain a little how I can handle it in a "pro way" that is, ideally, safe, because it is a hurdle for me.
Thank you for your help.
First of all, you should never connect validation and "sanitization" in any context. These are two completely different matters which, alas, too many people confuse.
Validation does have to be applied to the form data, and there is no common way to do it. Just use your common sense, business logic needs and framework guidelines.
"Sanitization" is a different matter. The word itself is the biggest blunder of PHP folks, always misused and confused.
And the best way to "sanitize" is not to sanitize at all, because:
You can't "sanitize" your data, whatever you mean by the term.
Any "sanitization" will rather spoil the data.
There are actually many destinations this data may be headed for. You can't have a one-for-all solution.
It is the destination, not the source, that matters. That means you should use prepared statements not for POST variables but for SQL queries. BIG difference.
You should format, not "sanitize": format according to the rules of the particular destination.
To format data for the SQL query you have to use prepared statements.
To format your data for other destinations you have to follow their rules. And again, this applies not just to "POST form" data but to all data, regardless of its source.
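To illustrate the point about destinations, here is a minimal sketch that formats one and the same value for three non-SQL destinations. The value, the search.php URL and the variable names are made up for the illustration; only the built-in functions are real.

```php
<?php
// Hypothetical value; where it came from (form, file, database) does not matter.
$comment = 'Tom & "Jerry" <b>say</b> it\'s fine';

// Destination: HTML text or a quoted attribute value.
$forHtml = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// Destination: a URL query string (search.php is an assumed page name).
$forUrl = 'search.php?q=' . rawurlencode($comment);

// Destination: JSON.
$forJson = json_encode(['comment' => $comment]);

echo $forHtml, "\n", $forUrl, "\n", $forJson, "\n";
```

The original $comment is never modified; each destination gets its own formatted copy at the moment it is used.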
The answer of "Your Common Sense" is definitely correct.
So you should validate your form first (client side); there are lots of small validation libraries, e.g. http://rickharrison.github.io/validate.js/
Once it has been posted, you have to check that the data is what you expected.
An e-mail is just an e-mail, a name is just a string without XSS or SQL injection stuff...
So it depends on the destination, as the answer above said.
To filter some POST values there is, for example, http://php.net/manual/en/function.filter-var.php
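As a rough sketch of how filter_var can be used for this kind of check (the helper names validEmail and validAge and the 0 to 150 range are assumptions for the example, not something from the manual page):

```php
<?php
// Returns the e-mail address if it is well-formed, otherwise null.
function validEmail(string $raw): ?string
{
    $email = filter_var(trim($raw), FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

// Returns an integer within an assumed plausible range, otherwise null.
function validAge(string $raw): ?int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);
    return $age === false ? null : $age;
}

var_dump(validEmail('user@example.com')); // string(16) "user@example.com"
var_dump(validEmail('not an e-mail'));    // NULL
var_dump(validAge('42'));                 // int(42)
var_dump(validAge('42; DROP TABLE x'));   // NULL
```

This only validates (accept or reject). It does not replace prepared statements or HTML escaping when the value is later used.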
You can also use full frameworks or libraries against SQL injection and XSS. You should use MySQLi prepared statements against SQL injection and something else against XSS, such as HTMLPurifier: http://htmlpurifier.org/
These are just some examples. If you are interested, I recommend reading this page: https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
I know this topic has been covered to death but I would like some feedback from the community regarding security within our web application.
We have a standard LAMP stack web app which contains a large number of database queries that are executed using mysqli_query. These queries are not parameterized at the moment, but there is some naive escaping of the inputs using addslashes.
I have been tasked with making this system safer as we will be penetration tested very shortly. The powers above know that parameterized queries are the way to go to make the system safer however they don't want to invest the time and effort into re-writing all the queries in the application and also changing the framework we have to make them all work correctly.
So basically I'm asking what my options are here?
I've run mysqli_real_escape_string over the inputs. I've set up a filter which doesn't allow words like SELECT, WHERE, UNION to be passed in, which I guess makes it safer. I know mysqli_query only allows one query to be run at once, so there's some security there (against concatenating updates onto the end of selects).
Do I have any other options here?
Edit: I should probably add that if anyone is able to provide an example of an attack which is completely unavoidable without parameterized queries that would also be helpful. We have a query which looks like this:
SELECT
pl.created,
p.LoginName,
pl.username_entered,
pl.ip_address
FROM loginattempts pl
LEFT JOIN people p ON p.PersonnelId = pl.personnel_id
WHERE p.personnelid = $id
AND pl.created > $date1
AND pl.created < $date2
I've substituted a UNION query into $id (a "UNION SELECT * FROM p WHERE 1 = 1" sort of thing), and I can prevent that by not allowing SELECT/UNION, but I'm sure there are countless other types of attack which I can't think of. Can anyone suggest a few more?
Update
I've convinced the powers that be above me that we need to rewrite the queries to parameterized statements. They estimate it will take a few months maybe but it has to be done. Win. I think?
Update2
Unfortunately I've not been able to convince the powers that be that we need to re-write all of our queries to parameterized ones.
The strategy we have come up with is to test every input as follows:
If the user-supplied input passes is_int, then cast it as such.
Same for real numbers.
Run mysqli_real_escape_string over the character data.
Change all the parameters in the queries to quoted strings i.e.
WHERE staffName = ' . $blah . '
According to this answer we are 100% safe, as we are not changing the character set at any time and we are using PHP 5.5 with the latin1 character set at all times.
Update 3
This question has been marked as a duplicate; however, in my mind the question is still not fully answered. As per update no. 2, we have found some strong opinion that the mysqli_real_escape_string function can prevent attacks and is apparently "100% safe". No good counter-argument has since been provided (i.e. a demonstration of an attack which can defeat it when used correctly).
check every single user input for datatype and, where applicable, with regular expressions (golden rule is: never EVER trust user input)
use prepared statements
seriously: prepared statements :)
it's a lot of work especially if your application is in bad shape (like it seems to be in your case) but it's the best way to have a decent security level
the other way (which i'm advising against) could be virtual patching using mod_security or a WAF to filter out injection attempts but first and foremost: try to write robust applications
(virtual patching might seem to be a lazy way to fix things but takes actually a lot of work and testing too and should really only be used on top of an already strong application code)
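For reference, here is a sketch of what the query from the question could look like with placeholders. It deliberately swaps in PDO with an in-memory SQLite database so it can run on its own; the real application uses mysqli and MySQL, where mysqli's prepare() and bind_param() play the same role. The table contents and the loginAttempts helper are invented for the example.

```php
<?php
// Assumption: in-memory SQLite via PDO stands in for the real MySQL connection.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Made-up schema and rows, just enough to run the query from the question.
$pdo->exec('CREATE TABLE people (PersonnelId INTEGER PRIMARY KEY, LoginName TEXT)');
$pdo->exec('CREATE TABLE loginattempts (personnel_id INTEGER, created TEXT,
            username_entered TEXT, ip_address TEXT)');
$pdo->exec("INSERT INTO people VALUES (1, 'alice')");
$pdo->exec("INSERT INTO people VALUES (2, 'bob')");
$pdo->exec("INSERT INTO loginattempts VALUES (1, '2014-01-10', 'alice', '10.0.0.1')");
$pdo->exec("INSERT INTO loginattempts VALUES (2, '2014-01-11', 'bob', '10.0.0.2')");

// Hypothetical helper: the SQL text is constant, the data travels separately.
function loginAttempts(PDO $pdo, $id, string $from, string $to): array
{
    $stmt = $pdo->prepare(
        'SELECT pl.created, p.LoginName, pl.username_entered, pl.ip_address
           FROM loginattempts pl
           LEFT JOIN people p ON p.PersonnelId = pl.personnel_id
          WHERE p.PersonnelId = :id
            AND pl.created > :date1
            AND pl.created < :date2'
    );
    $stmt->execute([':id' => $id, ':date1' => $from, ':date2' => $to]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$normal = loginAttempts($pdo, 1, '2014-01-01', '2014-02-01');
// The payload is compared as one value; it never becomes part of the SQL.
$attack = loginAttempts($pdo, '1 OR 1=1', '2014-01-01', '2014-02-01');

print_r($normal);
print_r($attack);
```

Whatever ends up in $id, it cannot pull in other people's rows, because the statement itself never changes.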
Do I have any other options here?
No. No external measure, like the ones you tried to implement, has been proven to be of any help. Your site is still vulnerable.
I've run mysqli_real_escape_string over the inputs
Congratulations, you just reinvented the notorious magic_quotes feature, which proved to be useless and has now been expelled from the language.
JFYI, mysqli_real_escape_string has nothing to do with SQL injections at all.
Also, by combining it with the existing addslashes() call, you are spoiling your data by doubling the number of slashes in it.
I've setup a filter which I guess makes it safer.
It is not. SQL injection is not about adding some words.
Also, this approach is called "blacklisting", and it is proven to be essentially unreliable. A blacklist is inherently incomplete, no matter how many "suggestions" you get.
I know mysqli_query only allows one query to be run at once so there's some security there
There is not. SQL injection is not about adding another query.
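A small sketch of why this holds for a query shaped like the one in the question, where the value is placed unquoted. addslashes() is used here only as a stand-in for any string-escaping function, since it needs no database connection; the table and column names are shortened from the question.

```php
<?php
// A payload with no quotes, no blacklisted keyword (SELECT/WHERE/UNION)
// and no second statement.
$id = '1 OR 1=1';

// Escaping only touches quotes, backslashes and NUL bytes, so nothing changes.
$escaped = addslashes($id);

// Unquoted placement, as in the question's "WHERE p.personnelid = $id".
$sql = "SELECT * FROM people WHERE PersonnelId = $escaped";

var_dump($escaped === $id); // bool(true)
echo $sql, "\n";
// SELECT * FROM people WHERE PersonnelId = 1 OR 1=1
```

The condition is now true for every row, and neither the escaping, nor the keyword filter, nor the one-query limit got in the way.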
Why did I close this question as a duplicate for "How can I prevent SQL-injection in PHP?"?
Because these questions are mutually exclusive, and cannot coexist on the same site.
If we agree that the only proper answer is using prepared statements, then a question that asks "How can I protect myself without prepared statements?" makes very little sense.
At the same time, if the OP manages to force us to give the positive answer they desperately want, it will make the other question obsolete. Why use prepared statements if everything is all right without them?
Additionally, this particular question is too localized as well. It seeks not insight but an excuse. An excuse for nobody but the OP personally. An excuse that lets them use an approach that has proven to be insecure. That is up to them, but it renders this question essentially useless for the community.
I've been working with PHP for some time and I began asking myself if I'm developing good habits.
One of these is what I believe amounts to overusing PHP sanitizing methods. For example, a user registers through a form, and I get the following POST variables:
$_POST['name'], $_POST['email'] and $_POST['captcha']. Now, what I usually do is obviously sanitize the data I am going to place into MySQL, but when comparing the captcha, I also sanitize it.
Therefore I believe I have misunderstood PHP sanitizing. I'm curious: are there any other cases where you need to sanitize data, apart from when placing something in MySQL? (Note that I know sanitizing is also needed to prevent XSS attacks.) And moreover, is my habit of sanitizing almost every variable coming from user input a bad one?
Whenever you store your data someplace, and if that data will be read/available to (unsuspecting) users, then you have to sanitize it. So something that could possibly change the user experience (not necessarily only the database) should be taken care of. Generally, all user input is considered unsafe, but you'll see in the next paragraph that some things might still be ignored, although I don't recommend it whatsoever.
Stuff that happens on the client only is sanitized just for a better UX (user experience, think about JS validation of the form - from the security standpoint it's useless because it's easily avoidable, but it helps non-malicious users to have a better interaction with the website) but basically, it can't do any harm because that data (good or bad) is lost as soon as the session is closed. You can always destroy a webpage for yourself (on your machine), but the problem is when someone can do it for others.
To answer your question more directly - never worry about overdoing it. It's always better to be safe than sorry, and the cost is usually not more than a couple of milliseconds.
The term you need to search for is FIEO. Filter Input, Escape Output.
You can easily confound yourself if you do not understand this basic principle.
Imagine PHP is the man in the middle, it receives with the left hand and doles out with the right.
A user uses your form and fills in a date field, so it should only accept digits and maybe dashes, e.g. nnnn-nn-nn. If you get something which does not match that, then reject it.
That is an example of filtering.
Next, PHP does something with it, let's say storing it in a MySQL database.
What MySQL needs is to be protected from SQL injection, so you use PDO's or MySQLi's prepared statements to make sure that EVEN IF your filter failed you cannot permit an attack on your database. This is an example of escaping, in this case escaping for SQL storage.
Later, PHP gets the data from your DB and displays it on an HTML page. So you need to escape the data for the next medium, HTML (this is where XSS attacks get through if you don't).
In your head you have to divide each of the PHP 'protective' functions into one or other of these two families, Filtering or Escaping.
Freetext fields are of course more complex than filtering for a date, but never mind, stick to the principles and you will be OK.
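A minimal sketch of the two families for the date example above, assuming the nnnn-nn-nn format means YYYY-MM-DD. The function names filterDate and e are invented for the illustration; the SQL step in the middle would use a prepared statement and is left out here.

```php
<?php
// Filter input: accept only a real calendar date in YYYY-MM-DD form.
function filterDate(string $raw): ?string
{
    // The D modifier stops "$" from matching before a trailing newline.
    if (!preg_match('/^(\d{4})-(\d{2})-(\d{2})$/D', $raw, $m)) {
        return null;
    }
    // checkdate() takes month, day, year in that order.
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]) ? $raw : null;
}

// Escape output: encode for the HTML destination at the moment of printing.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

var_dump(filterDate('2024-02-30'));            // NULL, no such day: reject it
var_dump(filterDate('2024-02-29<script>'));    // NULL, wrong shape: reject it

$date = filterDate('2024-02-29');              // accepted unchanged
echo '<p>Booked for ', e($date), '</p>', "\n";
```

Note that the filter never tries to repair bad input, and the escaping happens even though the filter already passed.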
Hoping this helps http://phpsec.org/projects/guide/
I am trying to figure out which functions are best to use in different cases when inputting data, as well as outputting data.
When I allow a user to input data into MySQL what is the best way to secure the data to prevent SQL injections and or any other type of injections or hacks someone could attempt?
When I output the data as regular html from the database what is the best way to do this so scripts and such cannot be run?
At the moment I basically only use
mysql_real_escape_string();
before inputting the data to the database, this seems to work fine, but I would like to know if this is all I need to do, or if some other method is better.
And at the moment I use
stripslashes(nl2br(htmlentities()))
(most of the time anyways) for outputting data. I find these work fine for what I usually use them for, however I have run into a problem with htmlentities, I want to be able to have some html tags output respectively, for example:
<ul></ul><li></li><bold></bold>
etc, but I can't.
any help would be great, thanks.
I agree with mikikg that you need to understand SQL injection and XSS vulnerabilities before you can try to secure applications against these types of problems.
However, I disagree with his assertions to use regular expressions to validate user input as a SQL injection preventer. Yes, do validate user input insofar as you can. But don't rely on this to prevent injections, because hackers break these kinds of filters quite often. Also, don't be too strict with your filters -- plenty of websites won't let me log in because there's an apostrophe in my name, and let me tell you, it's a pain in the a** when this happens.
There are two kinds of security problems you mention in your question. The first is a SQL injection. This vulnerability is a "solved problem." That is, if you use parameterized queries, and never pass user supplied data in as anything but a parameter, the database is going to do the "right thing" for you, no matter what happens. For many databases, if you use parameterized queries, there's no chance of injection because the data isn't actually sent embedded in the SQL -- the data is passed unescaped in a length prefixed or similar blob along the wire. This is considerably more performant than database escape functions, and can be safer. (Note: if you use stored procedures that generate dynamic SQL on the database, they might also have injection problems!)
The second problem you mention is the cross site scripting problem. If you want to allow the user to supply HTML without entity escaping it first, this problem is an open research question. Suffice to say that if you allow the user to pass some kinds of HTML, it's entirely likely that your system will suffer an XSS problem at some point to a determined attacker. Now, the state of the art for this problem is to "filter" the data on the server, using libraries like HTMLPurifier. Attackers can and do break these filters on a regular basis; but as of yet nobody has found a better way of protecting the application from these kinds of things. You may be better off only allowing a specific whitelist of HTML tags, and entity encoding anything else.
This is one of the most problematic tasks today :)
You need to know how SQL injection and other attack methods work. There are very detailed explanations of each method at https://www.owasp.org/index.php/Main_Page, and also a whole security framework for PHP.
Using the security libraries of a framework such as CodeIgniter or Zend is also a good choice.
Next, use regular expressions as much as you can and tie pattern rules to each specific input format.
Use prepared statements or active records class of your framework.
Always cast your input with (int)$_GET['myvar'] if you really need numeric values.
There are so many other rules and methods to secure your application, but one golden rule is "never trust user's input".
In your php configuration, magic_quotes_gpc should be off. So you won't need stripslashes.
For SQL, take a look at PDO's prepared statements.
And for your custom tags, as there are only a few of them, you can do a preg_replace call after the call to htmlentities to convert those back before you insert them into the database.
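Roughly, the "escape everything, then convert a few tags back" idea could look like the sketch below. The function name escapeWithWhitelist and the tag list are assumptions (b is used where the question wrote bold), and htmlspecialchars is used in place of htmlentities, which works the same way for this purpose.

```php
<?php
// Escape everything, then turn a short whitelist of bare tags back into markup.
// Opening tags with attributes (e.g. <b onclick=...>) never match and stay escaped.
function escapeWithWhitelist(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // Only exact, attribute-free forms such as &lt;ul&gt; or &lt;/li&gt; are restored.
    return preg_replace('~&lt;(/?)(ul|li|b)&gt;~i', '<$1$2>', $escaped);
}

$input = '<ul><li>one</li></ul><script>x</script><b onclick="y">z</b>';
echo escapeWithWhitelist($input), "\n";
```

In this input the list markup comes back, while the script tag and the b tag with an onclick attribute remain harmless text. The closing </b> is restored on its own, so the result may contain unbalanced tags; a library like HTMLPurifier handles such cases more thoroughly.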
I'm using the ezSQL PHP class for MySQL queries. Since all of my queries pass through the $ezsql->query() function, I thought it would be a good idea to implement a method to block common SQL injection techniques from $ezsql->query().
For example, the most common one is probably 1=1. So this regular expression should be able to block all variations of that:
preg_match('/(?:"|\')?(\d)(?:"|\')?=(?:"|\')?\1(?:"|\')?/',$query);
This would block "1"="1", '1'=1, 1=1, etc.
Is this a good idea? If so, what are some other common patterns?
Edit: Forgot to mention, I do use validation and sanitation. This is just an extra precaution.
Is this a good idea?
No. For two reasons:
You're doing it wrong (yes you just failed with your bare approach of a SQL blacklist). And no, I won't tell you how you could improve that because of 2:
It's a blacklist approach. You should not use a blacklist approach inside the database class itself. That's no added precaution, it's just useless. A blacklist could be added at the request level of the webserver instead, for example.
Instead use an existing blacklist, don't re-invent the wheel. If you want to learn how to develop your own SQL blacklist layer, help with the development of such existing components. This sort of security is not out-of-the-box so that you can just throw in a question like yours and you can actually expect concrete answers. Take care.
Is this a good idea?
Definitely NO.
Every time I see such a suggestion on an internet forum, I wonder: what if the software this forum runs on followed such a pattern? The poor inventor would be unable to even post their solution, because the software would block the post!
extra precautions wouldn't hurt. Better safe than sorry.
As I pointed out above, it evidently does hurt. A database that cannot process some odd portions of data is nonsense.
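To make that concrete, here is a quick sketch that runs the pattern from the question against one harmless query and two injected ones. The queries and the posts/t table names are invented for the example.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/(?:"|\')?(\d)(?:"|\')?=(?:"|\')?\1(?:"|\')?/';

// A harmless query storing a forum post that merely talks about injection.
$legit = "INSERT INTO posts (body) VALUES ('How do I block 1=1 attacks?')";

// Injected conditions that are always true but look different.
$missed = [
    "SELECT * FROM t WHERE id = 0 OR 2>1",
    "SELECT * FROM t WHERE id = 0 OR 'a'='a'",
];

var_dump(preg_match($pattern, $legit));   // int(1): legitimate data is blocked
foreach ($missed as $sql) {
    var_dump(preg_match($pattern, $sql)); // int(0): the injection passes
}
```

So the filter fails in both directions at once: it rejects valid data and lets trivial variations through.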
Besides, I do believe that only knowledge can make you safe.
Not random moves out of some vague ideas but sane and reasonable actions.
As long as you escape and quote the data that goes to the query and as long as you set the proper encoding for the escaping function, there is no reason to sorrow.
As long as you are using prepared statements to add your data to the query, there is no reason to sorrow.
As long as you are filtering SQL identifiers and keywords based on hardcoded whitelist, there is no reason to sorrow.
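For the last point, identifiers and keywords cannot be bound as parameters, so a hardcoded whitelist might be sketched like this. The helper name orderByClause, the column list and the fallback to "created" are assumptions for the example; the backtick quoting is MySQL style.

```php
<?php
// Hypothetical helper: user input only selects among hardcoded values;
// it is never copied into the SQL itself.
function orderByClause(string $column, string $direction): string
{
    $allowed = ['name', 'created', 'email'];           // hardcoded whitelist
    $col = in_array($column, $allowed, true) ? $column : 'created';
    $dir = strtoupper($direction) === 'DESC' ? 'DESC' : 'ASC';
    return "ORDER BY `$col` $dir";
}

echo orderByClause('name', 'desc'), "\n";                // ORDER BY `name` DESC
echo orderByClause('name; DROP TABLE users', 'x'), "\n"; // ORDER BY `created` ASC
```

Falling back to a default is one choice; rejecting the request outright when the key is not in the list would be equally valid.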
I'm new to PHP and I'm following a tutorial here:
Link
It's pretty scary that a user can write php code in an input and basically screw your site, right?
Well, now I'm a bit paranoid and I'd rather learn security best practices right off the bat than try to cram them in once I have some habits in me.
Since I'm brand new to PHP (literally picked it up two days ago), I can learn pretty much anything easily without getting confused.
What other way can I prevent shenanigans on my site? :D
There are several things to keep in mind when developing a PHP application; strip_tags() only helps with one of them. Actually strip_tags(), while effective, might even do more than needed: converting possibly dangerous characters with htmlspecialchars() can be preferable, depending on the situation.
Generally it all comes down to two simple rules: filter all input, escape all output. Now you need to understand what exactly constitutes input and output.
Output is easy, everything your application sends to the browser is output, so use htmlspecialchars() or any other escaping function every time you output data you didn't write yourself.
Input is any data not hardcoded in your PHP code: things coming from a form via POST, from a query string via GET, from cookies, all those must be filtered in the most appropriate way depending on your needs. Even data coming from a database should be considered potentially dangerous; especially on shared server you never know if the database was compromised elsewhere in a way that could affect your app too.
There are different ways to filter data: whitelists to allow only selected values, validation based on the expected input format, and so on. One thing I never suggest is trying to fix the data you get from users: have them play by your rules, and if you don't get what you expect, reject the request instead of trying to clean it up.
Special attention, if you deal with a database, must be paid to SQL injections: that kind of attack relies on you not properly constructing the query strings you send to the database, so that the attacker can forge them to execute malicious instructions. You should always use an escaping function such as mysql_real_escape_string() or, better, use prepared statements with the mysqli extension or with PDO.
There's more to say on this topic, but these points should get you started.
HTH
EDIT: to clarify, by "filtering input" I mean decide what's good and what's bad, not modify input data in any way. As I said I'd never modify user data unless it's output to the browser.
strip_tags is not the best thing to use really, it doesn't protect in all cases.
HTML Purifier:
http://htmlpurifier.org/
is a really good option for processing incoming data. It still will not cater for all use cases by itself, but it's definitely a good starting point.
I have to say that the tutorial you mentioned is a little misleading about security:
It is important to note that you never want to directly work with the $_GET & $_POST values. Always send their value to a local variable, & work with it there. There are several security implications involved with the values when you directly access (or output) $_GET & $_POST.
This is nonsense. Copying a value to a local variable is no more safe than using the $_GET or $_POST variables directly.
In fact, there's nothing inherently unsafe about any data. What matters is what you do with it. There are perfectly legitimate reasons why you might have a $_POST variable that contains ; rm -rf /. This is fine for outputting on an HTML page or storing in a database, for example.
The only time it's unsafe is when you're using a command like system or exec. And that's the time you need to worry about what variables you're using. In this case, you'd probably want to use something like a whitelist, or at least run your values through escapeshellarg.
Similarly with sending queries to databases, sending HTML to browsers, and so on. Escape the data right before you send it somewhere else, using the appropriate escaping method for the destination.
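As a small sketch of that idea, here is the same ; rm -rf / string prepared for two different destinations. The ls -l command is just a placeholder and is only printed, never executed.

```php
<?php
$input = '; rm -rf /';

// Destination: HTML. Nothing in this string is special there, so it is unchanged.
$forHtml = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Destination: a shell command line. Here the string must be quoted as one argument.
$forShell = 'ls -l ' . escapeshellarg($input);

echo $forHtml, "\n";
echo $forShell, "\n"; // shown only; nothing is passed to system() or exec()
```

The value itself is never "cleaned"; it is only wrapped in whatever way the receiving side requires, right before it is sent there.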
strip_tags removes every piece of HTML. More sophisticated solutions are based on whitelisting (i.e. allowing specific HTML tags). A good whitelisting library is HTMLPurifier: http://htmlpurifier.org/
and of course on the database side of things use functions like mysql_real_escape_string or pg_escape_string
Well, I'm probably wrong, but... in all the literature I've read, people say it's much better to use htmlspecialchars.
Also, it is rather necessary to cast input data (to int, for example, if you are sure it's a user ID).
And later, when you start using a database, use mysql_real_escape_string instead of mysql_escape_string to prevent SQL injections (some old books still show mysql_escape_string).