SQL injection protection with only str_replace

I'm studying SQL injection and tried in my PHP code this query:
$condition = str_replace(["'","\\"],["\\'","\\\\"], @$_GET['q']);
$query = "SELECT * FROM dummy_table WHERE dummy_column = '$condition'";
DB and tables charset is set to UTF8.
I can't inject anything, can someone help me please?
EDIT: As pointed out by GarethD, this would escape ' first and then \, allowing injection. What about this str_replace?
$condition = str_replace(["\\","'"],["\\\\","\\'"], @$_GET['q']);

This isolated example is invulnerable to injection.
But you have to realize that protection from SQL injection is not just a character replacement, and circumstances may differ from the ones you are currently taking for granted. So your code would become vulnerable in the long run, due to the essential drawbacks of this method:
character replace is only a part of required formatting
this particular replacement can be applied to strings only, leaving other parts absolutely unprotected.
such a replacement is external to query execution, which means it is prone to human error of any sort.
such a replacement is an essentially detachable measure, which means it can be moved too far away from the actual query execution and eventually forgotten.
this kind of escaping is prone to encoding attacks, which makes the solution too limited in use.
There is nothing wrong in character replacement per se, but only if it is used as a part of complete formatting; applied to the right query part; and done by a database driver, not a programmer; right before execution.
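As a minimal sketch of the "strings only" drawback: the replacement from the question does nothing for a value placed in an unquoted numeric position. It assumes an in-memory SQLite database through PDO (pdo_sqlite) as a stand-in for MySQL; the items table and the escapeLikeQuestion() helper are hypothetical names.

```php
<?php
// Hypothetical helper wrapping the second str_replace from the question.
function escapeLikeQuestion(string $value): string
{
    return str_replace(["\\", "'"], ["\\\\", "\\'"], $value);
}

// Assumption: in-memory SQLite stands in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE items (id INTEGER, name TEXT)");
$pdo->exec("INSERT INTO items (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c')");

// The input has no quote or backslash, so the replacement leaves it untouched.
$input = '1 OR 1=1';
$escaped = escapeLikeQuestion($input);

// Unquoted numeric position: the "escaped" value becomes part of the SQL syntax.
$unquoted = $pdo->query("SELECT * FROM items WHERE id = $escaped")->fetchAll();

// The same input bound as a parameter is treated as a single value, never as SQL.
$stmt = $pdo->prepare('SELECT * FROM items WHERE id = ?');
$stmt->execute([$input]);
$bound = $stmt->fetchAll();

printf("unquoted: %d rows, bound: %d rows\n", count($unquoted), count($bound));
```

The interpolated query matches every row because OR 1=1 is parsed as SQL, while the bound version only ever compares id against one value.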
The functions you proposed in the comments are a good step, but still insufficient: they are subject to the drawbacks listed above, which makes them prone to all sorts of human error.
And SQL injection is not the only problem with this approach, it is a usability fault as well, as this function would either spoil your data, if used as an incarnation of late magic quotes, or make your code bloated, if used to format every variable right in the application code.
Such functions can be used only to process a placeholder, but of course not by means of using a homebrewed replace function, but a proper function provided by database API.
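A rough sketch of what "a proper function provided by the database API" can look like for the query in the question, using a PDO placeholder. It assumes an in-memory SQLite database as a stand-in for MySQL; the findByColumn() helper and the sample rows are made up for illustration.

```php
<?php
// Assumption: in-memory SQLite stands in for the real MySQL server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE dummy_table (dummy_column TEXT)");
$pdo->exec("INSERT INTO dummy_table (dummy_column) VALUES ('alice'), ('bob')");

// Hypothetical helper: the value travels through a placeholder,
// so the driver does the formatting right at execution time.
function findByColumn(PDO $pdo, string $value): array
{
    $stmt = $pdo->prepare('SELECT * FROM dummy_table WHERE dummy_column = ?');
    $stmt->execute([$value]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// A classic injection attempt is just an unusual string to search for.
$rowsForHostile = findByColumn($pdo, "' OR '1'='1");
$rowsForAlice = findByColumn($pdo, 'alice');
```

In a real application the value would come from $_GET['q']; nothing else in the calling code has to know about quoting or escaping.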

Related

Mysqli_real_escape_string vulnerable

So I was busy with PHP, inserting data into MySQL, when I wanted to know: I have come across some posts that say it's bad practice to use single quotes to insert data into a database. One of the examples: Why are VALUES written between quotes when sent to a database?
The post is about why they're written between quotes, but one thing was clear: It's bad practice to insert it like:
$sql = "INSERT INTO example (example1, example2, example3) VALUES
('$example1', '$example2', '$example3')";
Why is this bad practice? Apparently it is vulnerable to injection, as stated in the link above. The OP's question related to a comment that said: We use mysqli_real_escape_string for this. The response given was:
@XX To a large extent, yes, it is an alternative solution to the problem. It doesn't disable anything, but it escapes things so that for instance ' becomes '' or \' in the SQL string, keeping the attacker from ending the string. There are awkward cases where such escaping is hard, and it's easy to miss one escape call among many, which is why parametrised queries are considered the most reliable approach.
First of all: how could a script fool mysqli_real_escape_string into NOT escaping certain stuff? I found something that said the following, and correct me if I'm wrong: mysqli_real_escape_string - example for 100% safety. As you can see, he refers to another page that has an answer. However, he then makes a claim that should make his data 100% safe, and someone else responds with:
Yes, that is generally safe. Mysql and mysqli are perfectly safe when used right (specific bugs in very specific encodings notwithstanding). The advantage of prepared statements is that it's more difficult to do things the wrong way.
I have the following example to make it clear for myself: I have 2 doors, 1 door is open, but behind a closed door. How would you attack an open door with a closed door in front of it?
There is an answer here: SQL injection that gets around mysql_real_escape_string(), but he gives this as a safe example:
mysql_query('SET NAMES utf8');
$var = mysql_real_escape_string("\xbf\x27 OR 1=1 /*");
mysql_query("SELECT * FROM test WHERE name = '$var' LIMIT 1");
Isn't mysqli_real_escape_string already doing the same? He's just specifying which characters should be escaped by mysqli_real_escape_string. So how can this all of a sudden become safe? It is doing the exact same thing as when you would say:
$example = mysqli_real_escape_string ($conn, $_POST['exampleVariable']);
So how does this:
mysql_query('SET NAMES utf8');
$var = mysql_real_escape_string("\xbf\x27 OR 1=1 /*");
mysql_query("SELECT * FROM test WHERE name = '$var' LIMIT 1");
become safe and this:
$example = mysqli_real_escape_string ($conn, $_POST['exampleVariable']);
not? Isn't he just narrowing down what mysqli_real_escape_string would escape, thus making it more vulnerable?

The thing is, mysqli_real_escape_string() or other proper escaping is technically secure; in the end, that is roughly what parameterized queries do too. However, there is always a however. For example, it is only secure if you have quotes around the variables. Without quotes it is not. When it's a 1000-line project with one developer, it's probably OK for the first 3 months. Then eventually even that one developer will forget the quotes, because they are not always needed syntactically. Imagine what will happen in a 2 million LOC project with 200 devs after 5 years.
Similarly, you must never forget to use your manual escape function. It may be easy at first. Then there is a variable that is validated and can only hold a number, so you are sure it's OK to just put it in without escaping. Then you change your mind and change it into a string, because the query is OK anyway. Or somebody else does it after 2 years. And so on. It's not a technical issue. It's a management-ish issue: in the long run, your code will be vulnerable somehow. Hence it is bad practice.
Another point: it's much harder to automatically scan for SQL injection vulnerabilities if manual escaping is in place. Static code scanners can easily find all instances of concatenated queries, but it's very hard for them to correlate these with previous escaping, if there is any. If you use something like parameterized queries, it is straightforward to find SQL injections, because all concatenations are potential candidates.

Setting the character set for the connection does not change what mysqli_real_escape_string() escapes. But it avoids the multi-byte character bug because it controls how the string is interpreted after the backslash escape characters have been inserted.
Of course, you can avoid any uncertainty by using query parameters instead.
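To make the multi-byte issue a bit more concrete, here is a byte-level sketch. It uses addslashes() purely as a charset-unaware stand-in for escaping (an assumption for illustration, not what mysqli_real_escape_string() does internally) and only shows which bytes end up in the string; the interpretation under GBK versus utf8 in the comments follows the linked answer.

```php
<?php
// The payload from the example above: byte 0xbf followed by a single quote (0x27).
$payload = "\xbf\x27 OR 1=1 /*";

// Charset-unaware escaping inserts a backslash (0x5c) before the quote.
$escaped = addslashes($payload);

// First three bytes after escaping: bf 5c 27
$firstBytes = bin2hex(substr($escaped, 0, 3));
echo $firstBytes, "\n";

// How the server reads those bytes depends on the connection charset:
// - under a multi-byte charset such as GBK, bf 5c can be read as ONE character,
//   which leaves 27 (the quote) unescaped;
// - under utf8, bf is not a valid lead byte, so 5c stays a backslash
//   and the quote stays escaped.
```

The escaping step produces the same bytes in both cases; only the interpretation afterwards differs, which is why the connection charset matters.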

Specifically what advantage do prepared statements offer in this case?

I have a (relatively) simple interactive web site. I do not run queries in a loop. All inputs are either strings, integers or images. I confirm all integer and image data types and use mysqli_real_escape_string() on all strings.
Putting aside evangelism, what advantage would I get out of using prepared statements with parameterized queries?
Other answers I've found don't address this specific comparison.

I'll try an excerpt from my Hitchhiker's Guide to SQL Injection prevention:
Why is manual formatting bad?
Because it's manual. Manual means error prone. It depends on the programmer's skill, temper, mood, number of beers last night and so on. As a matter of fact, manual formatting is the one and only reason for most injection cases in the world. Why?
Manual formatting can be incomplete.
Let's take Bobby Tables' case. It's a perfect example of incomplete formatting: the string we added to the query was quoted but not escaped! Yet we just learned from the above that quoting and escaping should always be applied together (along with setting the proper encoding for the escaping function). But in a usual PHP application which does SQL string formatting separately (partly in the query and partly somewhere else), it is very likely that some part of the formatting may be simply overlooked.
Manual formatting can be applied to a wrong literal.
Not a big deal as long as we are using complete formatting (as it will cause immediate error which can be fixed at development phase), but combined with incomplete formatting it's a real disaster. There are hundreds of answers on the great site of Stack Overflow, suggesting to escape identifiers the same way as strings. Which is totally useless and leads straight to injection.
Manual formatting is an essentially non-obligatory measure.
First of all, there is the obvious lack-of-attention case, where proper formatting can simply be forgotten. But there is a really weird case too: many PHP users intentionally refuse to apply any formatting, because to this day they are still separating data into "clean" and "unclean", "user input" and "non-user input", etc., meaning that "safe" data doesn't require formatting. Which is plain nonsense: remember Sarah O'Hara. From the formatting point of view, it is the destination that matters. A developer has to mind the type of SQL literal, not the data source. Is this string going into the query? Then it has to be formatted, no matter whether it came from user input or just mysteriously appeared amidst the code execution.
Manual formatting can be separated from the actual query execution by a considerable distance.
Most underestimated and overlooked issue. Yet most essential of them all, as it alone can spoil all the other rules, if not followed.
Almost every PHP user is tempted to do all the "sanitization" in one place, far away from the actual query execution, and this false approach is a source of innumerable faults:
first of all, having no query at hand, one cannot tell what kind of SQL literal this certain piece of data is going to represent, and thus violates both formatting rules (1) and (2) at once.
having more than one place for sanitization, we're calling for disaster, as one developer would think it was done by another, or made already somewhere else, etc.
having more than one place for sanitization, we're introducing another danger, that of double-sanitizing data (say, one developer formatted it at the entry point and another before query execution)
premature formatting most likely will spoil the source variable, making it unusable anywhere else.
After all, manual formatting will always take extra space in the code, making it entangled and bloated.
A properly implemented parameterized query can make your code unbelievably short. Some pseudo-code to demonstrate:
$data = DB::call("SELECT * FROM t WHERE foo=? AND bar=?", [$foo, $bar])->fetchAll();
This single line of code will get you an array of rows from the database. How many lines will it take with your manual formatting?
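For reference, one way such a one-liner could be backed is sketched below. DB::call() is pseudo-code in the answer, so the DB class, its connect() method and the SQLite DSN here are all assumptions for illustration, not an existing library API.

```php
<?php
// Hypothetical wrapper: the DB class and its methods are illustrative names only.
final class DB
{
    private static ?PDO $pdo = null;

    public static function connect(string $dsn): void
    {
        self::$pdo = new PDO($dsn, null, null, [
            PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
            PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
        ]);
    }

    // Prepare and execute in one step; every value goes through a placeholder.
    public static function call(string $sql, array $params = []): PDOStatement
    {
        $stmt = self::$pdo->prepare($sql);
        $stmt->execute($params);
        return $stmt;
    }
}

// Assumption: in-memory SQLite as a stand-in database.
DB::connect('sqlite::memory:');
DB::call('CREATE TABLE t (foo TEXT, bar TEXT)');
DB::call('INSERT INTO t (foo, bar) VALUES (?, ?)', ["O'Hara", 'x']);

$foo = "O'Hara";
$bar = 'x';
$data = DB::call('SELECT * FROM t WHERE foo = ? AND bar = ?', [$foo, $bar])->fetchAll();
```

The calling code never quotes or escapes anything, which is the point the excerpt makes about formatting happening right at execution.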

Can an SQL injection be made with a single word in a SELECT statement?

Suppose you have a query looking like this:
SELECT * FROM messages WHERE sender='clean_username'
where the clean_username is received over get/post and sanitized like this:
$clean_username = preg_replace( '/[^A-Za-z0-9_]+/m' , '', $dirty_username );
The above code removes any whitespace (among other things), which means that the clean_username parameter will always be only one word.
What is the simplest way this can be exploited with an injection?
I'm asking this question to better understand how SQL injection works. In my work I stick to the established good practices of using prepared statements and parameterized queries to prevent injections, but I think it's good for people to also have an understanding of how malicious code can be injected in a simple scenario like this.
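To make the effect of that filter concrete, here is a small sketch that runs the same preg_replace over a few inputs. The cleanUsername() wrapper and the sample strings are hypothetical.

```php
<?php
// Hypothetical wrapper around the filter from the question.
function cleanUsername(string $dirty): string
{
    // Drop every run of characters outside A-Z, a-z, 0-9 and underscore.
    return preg_replace('/[^A-Za-z0-9_]+/m', '', $dirty) ?? '';
}

$samples = [
    'bob_42',
    "admin' OR '1'='1",
    "x\\' UNION SELECT 1 --",
];

foreach ($samples as $sample) {
    printf("%-28s => %s\n", json_encode($sample), cleanUsername($sample));
}
```

Whatever comes out consists only of letters, digits and underscores, so quotes, backslashes, parentheses and spaces from the input never reach the query string.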

You can still exploit this using hex coding: stripping spaces is not enough.
I guess this is a somewhat interesting place to start. But consider that preg_match()es are pretty bad for performance on high traffic sites.
Prepared statements and parameterized queries are always the best way to prevent SQL injections.
Example of GET injection using hex coding and no spaces
?id=(1)and(1)=(0)union(select(null),group_concat(column_name),(null)from(information_schema.columns)where(table_name)=(0x7573657273))#
I think you can see the problem above.

I think you already answered the question on your own.
The best way is a standard approach where you use parameterized queries to distinguish between user data and sql command.
In your particular case you assume that a sender username can only consist of a limited set of ASCII characters. That might work for the moment, and as long as there is no string conversion beforehand, no one can easily close the string apostrophes within the SQL statement.
But always anticipate changes. Somebody may rely on your code in the near future, use or modify it, and make new assumptions. Your test is actually weak, and it can suddenly become dangerous when no one remembers or expects it.

Is it safe to use eval such way?

Let's say I have a file called query.sql with the following content in it:
SELECT * FROM `users` WHERE `id`!=".$q->Num($_POST['id'])."
And in my php-script, which has a html form with input named "id" in it, I do the following trick:
$sql=file_get_contents('query.sql');
$query= eval("return \"$sql\";");
//here follows something like $mysqli->query($query); and so on..
I am not concerned about sql-injections since I'm using prepared statements and $q->Num performs is_int check.
But is it safe to use eval such way?
As far as I understand, what is actually eval-ed here is "${_POST['id']}", and it evaluates to some string value the user entered. This becomes dangerous only if I eval this string a second time. As long as I eval the string only once, the user's input is just a string; it cannot be interpreted as PHP code by the compiler, and no PHP injection is possible.
UPDATE
Thank you for proposing different methodologies and stressing the need to use prepared statements. But this is not my question at all.
My question is all about php-injections. Is such use of eval bad? If yes, why?

There is no need to use eval - put in a token, and replace it:
// file query.sql
SELECT * FROM `users` WHERE `id`!="{id}";
//php
$sql = file_get_contents('query.sql');
$query = str_replace("{id}", $_POST['id'], $sql);
Update
No, it's not safe. Someone could edit your query.sql script to do anything they want. You may say "the app is internal only", or "I have permissions locked down", or whatever, but at the end of the day there are no guarantees.

The eval statement (although I would try to find a way without using eval) is not vulnerable to PHP injection, because $sql is enclosed in double quotes; the value of a PHP variable evaluated inside that string cannot end the quoting.
I am not concerned about sql-injections since I'm using prepared statements
Aren't you? ;) You are!
Why do you add the $id to the query using the '.' operator (string manipulation)? If you really used the benefits of prepared statements, I would expect something like bindParam().
Note how prepared statements prevent SQL injection: the SQL query syntax is kept separate from the arguments. So the server would
parse the SQL query
apply arguments
As the query has already been parsed before the arguments are applied, the query syntax cannot be manipulated by the arguments.
If you prepare a MySQL query that has been created using '.' and external inputs, you are potentially vulnerable to SQL injection.
What you are doing defeats the principles of prepared statements.
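A minimal sketch of the parse-then-apply flow described above, keeping the idea of an external query file but without eval or concatenation. It assumes an in-memory SQLite database through PDO as a stand-in for MySQL, a temporary file playing the role of query.sql, and made-up rows in users.

```php
<?php
// Assumption: in-memory SQLite stands in for the MySQL server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (id, name) VALUES (1, 'ann'), (2, 'ben'), (3, 'cy')");

// Stand-in for query.sql: plain SQL with a named placeholder, no PHP in it.
$path = tempnam(sys_get_temp_dir(), 'sql');
file_put_contents($path, 'SELECT * FROM users WHERE id != :id ORDER BY id');

// Would come from the form in the real script, e.g. (int) $_POST['id'].
$id = 2;

// Step 1: the query text is prepared on its own.
$stmt = $pdo->prepare(file_get_contents($path));
// Step 2: the argument is attached separately, with an explicit type.
$stmt->bindParam(':id', $id, PDO::PARAM_INT);
$stmt->execute();
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

unlink($path);
```

Because the file holds only SQL with a placeholder, nothing in it needs to be evaluated as PHP, and the value never becomes part of the query text.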

Reference from this Answer.
https://stackoverflow.com/a/951868/627868
The main problems with eval() are:
Potential unsafe input. Passing an untrusted parameter is a way to
fail. It is often not a trivial task to make sure that a parameter (or
part of it) is fully trusted.
Trickiness. Using eval() makes code clever, therefore more difficult
to follow. To quote Brian Kernighan "Debugging is twice as hard as
writing the code in the first place. Therefore, if you write the code
as cleverly as possible, you are, by definition, not smart enough to
debug it"

Let's consider the following code:
$sql=file_get_contents('query.sql');
$query= eval("return \"$sql\";");
The danger points here are:
if the file query.sql is modified from what you expect, then it could be used to execute any arbitrary code in your program.
the name of the file is shown hard-coded; if this isn't the case, then a malicious user could find a way to load an unexpected file (possibly even one from a remote site), again resulting in arbitrary code execution.
the only reason to use a file for this (rather than hard coding the SQL code directly in the program) would be because you want to use it as a config file. The problem here is that the syntax in this file is invalid for both SQL and PHP. Due to the way it's run in the eval(), it also requires that the syntax is exactly correct. Use the wrong quotes or miss one out, and it'll blow up. This is likely to result in brittle code, that fails badly rather than gracefully when the config is marginally incorrect.
There doesn't appear to be a direct SQL injection attack here, but that's really the least of your worries when it comes to eval().
I have personally worked in projects where code existed that worked pretty much exactly the way you've described. There were some very nasty bugs in the system as a direct result of this, and they have been difficult to rectify without wholesale rewrites. I would strongly recommend stepping back from this idea and using a sensible templating mechanism instead as recommended by others in the comments.
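The first danger point can be shown with a short sketch. The $sql string below simulates the contents of a modified query.sql (an assumption for the demo; no file is read), and strtoupper() stands in for whatever code someone might put there.

```php
<?php
// Simulated contents of a tampered query.sql: the text closes the
// double-quoted string and calls a PHP function of its author's choosing.
$sql = 'SELECT * FROM users WHERE id != ".strtoupper("any php runs here")."';

// Same construction as in the question.
$query = eval("return \"$sql\";");

echo $query, "\n";
```

The function call inside the file contents is executed by eval(); with a harmless strtoupper() this only changes the text, but any other PHP expression would run the same way.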

Is mysql_real_escape_string enough for preventing HTML injection?

I want to know how to prevent HTML injection. I have created a site where users are allowed to paste articles into an HTML form. I have used mysql_real_escape_string, but I want to know whether this is enough for preventing HTML injection. I tried htmlspecialchars, but it shows an error with mysql_real_escape_string.

No, mysql_real_escape_string only prepares data to be safely inserted into MySQL string literals, to prevent SQL injection in that specific context. It does not prevent other injections like HTML injection or Cross-Site Scripting (XSS).
Both HTML injection and XSS happen in different contexts where there are different contextual special characters that need to be taken care of. In HTML it’s especially <, >, &, ", and ' that delimit the different HTML contexts. With XSS in mind you also need to be aware of the different JavaScript contexts and their special characters.
htmlspecialchars should suffice to handle the former attack, while json_encode can be used for a safe subset of JavaScript. See also the XSS (Cross Site Scripting) Prevention Cheat Sheet as well as my answer to Are these two functions overkill for sanitization? and related questions for further information on this topic.
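A short sketch of the two output contexts mentioned above. The sample input is made up, and the JSON_HEX_* flags are one possible choice for embedding a value inside a script block, not the only one.

```php
<?php
// Hypothetical user-supplied article fragment.
$userInput = '<script>alert("x")</script> & \'quoted\'';

// HTML context: <, >, &, " and ' are turned into entities.
$forHtml = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');

// JavaScript context: encode as a JSON string literal; the HEX flags also
// escape <, >, &, ' and " so the value cannot close a surrounding <script> tag.
$forJs = json_encode(
    $userInput,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);

echo "<p>{$forHtml}</p>\n";
echo "<script>var article = {$forJs};</script>\n";
```

Each function is applied at output time for the context the value is going into, independently of whatever escaping was used for the database.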

You should use prepared statements to be absolutely sure to prevent sql injection.
Taken from documentation (read the part in bold)
Many of the more mature databases support the concept of prepared statements. What are they? They can be thought of as a kind of compiled template for the SQL that an application wants to run, that can be customized using variable parameters. Prepared statements offer two major benefits:
The query only needs to be parsed (or prepared) once, but can be
executed multiple times with the same or different parameters. When
the query is prepared, the database will analyze, compile and
optimize its plan for executing the query. For complex queries this
process can take up enough time that it will noticeably slow down an
application if there is a need to repeat the same query many times
with different parameters. By using a prepared statement the
application avoids repeating the analyze/compile/optimize cycle. This
means that prepared statements use fewer resources and thus run
faster.
The parameters to prepared statements don't need to be quoted; the
driver automatically handles this. If an application exclusively uses
prepared statements, the developer can be sure that no SQL injection
will occur (however, if other portions of the query are being built
up with unescaped input, SQL injection is still possible).
Prepared statements are so useful that they are the only feature that PDO will emulate for drivers that don't support them. This ensures that an application will be able to use the same data access paradigm regardless of the capabilities of the database.
If you meant to prevent XSS (cross-site scripting), you should use the function htmlspecialchars() whenever you want to output something to the browser that came from user input or from any non-secure source. Always treat any unknown source as insecure.
echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

No. In fact, I believe that for advanced coders, you shouldn't be using mysql_real_escape_string() as a crutch.
For each value you need to use in a DB query, seriously consider the possible characters that could appear. If it is a dollar amount, the only characters you should accept are digits, a period, and possibly a preceding dollar sign. If it is a name, you should only allow letters, a hyphen, and possibly a period (for full names like Joseph A. Bank).
Once you determine a strict character range that's acceptable for a value, write a regex to match that value against. For any values that don't match, display a bogus error and log the value in a text file (read: not a DB) along with the user's IP. Check this file frequently, so you can see whether the values users tried that didn't work were hacking attempts. Not only will this uncover valid inputs for which you need to adjust your regex, but it will also reveal the IPs of hackers who try to find SQL vulnerabilities on your site.
This approach ensures that new and old SQL vulnerabilities that might not immediately be addressed by mysql_real_escape_string(), will be blocked.
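A rough sketch of the validate-and-log idea for the dollar amount case. The function names, the log file location and the log line format are assumptions for illustration only.

```php
<?php
// Hypothetical validator: optional leading $, digits, optional cents.
// \z anchors at the very end, so a trailing newline is rejected too.
function isValidDollarAmount(string $value): bool
{
    return preg_match('/^\$?\d+(?:\.\d{1,2})?\z/', $value) === 1;
}

// Hypothetical logger: appends rejected values to a plain text file.
function logRejected(string $logFile, string $field, string $value, string $ip): void
{
    // json_encode keeps control characters in the value from breaking the log line.
    $line = sprintf("%s\t%s\t%s\t%s\n", date('c'), $ip, $field, json_encode($value));
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}

$logFile = sys_get_temp_dir() . '/rejected_input.log'; // assumed location
$ip = $_SERVER['REMOTE_ADDR'] ?? 'unknown';

$input = '10; DROP TABLE users';
if (!isValidDollarAmount($input)) {
    logRejected($logFile, 'amount', $input, $ip);
    echo "Invalid amount.\n";
}
```

A check like this rejects malformed values early; the accepted value still has to be passed to the query in a way that is safe for its position there.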

No, it's not. Refer to the docs
It doesn't escape < or >.

Simple answer: No
mysql_real_escape_string only helps you get rid of SQL Injections and not XSS and html injection. To avoid these you need more sophisticated input validation. Start by looking at strip_tags and htmlentities.
