Nowadays, "Prepared statements" seem to be the only way anyone recommends sending queries to a database. I even see recommendations to use prepared statements for stored procs. However, do to the extra query prepared statements require - and the short time they last - I'm persuaded that they are only useful for a line of INSERT/UPDATE queries.
I'm hoping someone can correct me on this, but it just seems like a repeat of the whole "Tables are evil" CSS thing. Tables are only evil if used for layouts - not tabular data. Using DIV's for tabular data is a style violation of WC3.
Like wise, plain SQL (or that generated from AR's) seems to be much more useful for 80% of the queries used, which on most sites are a single SELECT not to be repeated again that page load (I'm speaking about scripting languages like PHP here). Why would I make my over-taxed DB prepare a statement that it is only to run once before being removed?
MySQL:
A prepared statement is specific to
the session in which it was created.
If you terminate a session without
deallocating a previously prepared
statement, the server deallocates it
automatically.
So at the end of your script PHP will auto-close the connection and you will lose the prepared statement only to have your script re-created it on the next load.
Am I missing something or is this just a way to decrease performance?
:UPDATE:
It dawned on me that I am assuming new connections for each script. I would assume that if a persistent connection is used then these problems would disappear. Is this correct?
:UPDATE2:
It seems that even if persistent connections are the solution - they are not a very good option for most of the web - especially if you use transactions. So I'm back to square one having nothing more than the benchmarks below to go on...
:UPDATE3:
Most people simply repeat the phrase "prepared statements protect against SQL injection" which doesn't full explain the problem. The provided "escape" method for each DB library also protects against SQL injection. But it is more than that:
When sending a query the normal way,
the client (script) converts the data
into strings that are then passed to
the DB server. The DB server then uses
CPU power to convert them back into
the proper binary datatype. The
database engine then parses the
statement and looks for syntax errors.
When using prepared statements... the
data are sent in a native binary form,
which saves the conversion-CPU-usage,
and makes the data transfer more
efficient. Obviously, this will also
reduce bandwidth usage if the client
is not co-located with the DB server.
...The variable types are predefined,
and hence MySQL take into account
these characters, and they do not need
to be escaped.
http://www.webdesignforums.net/showthread.php?t=18762
Thanks to OIS for finally setting me strait on this issue.
unlike the CSS tables debate, there are clear security implications with prepared statements.
if you use prepared statements as the ONLY way to put user-supplied data in to a query, then they are absolutely bullet-proof when it comes to SQL injection.
When you execute a sql statement on the database, the sql parser needs to analyse it beforehand, which is the exact same process as the preparation.
So, comparing executing sql statements directly to preparing and executing has no disadvantages, but some advantages:
First of all, as longneck already stated, passing user input into a prepared statement escapes the input automatically. It is as if the database has prepared filters for the values and lets in only those values that fit.
Secondly, if use prepared statements thoroughly, and you come in the situation where you need to execute it multiple times, you don't need to rewrite the code to prepare and execute, but you just execute it.
Thirdly: The code becomes more readable, if done properly:
$sql = 'SELECT u.id, u.user, u.email, sum(r.points)
FROM users u
LEFT JOIN reputation r on (u.id=r.user_id)
LEFT JOIN badge b on (u.id=b.user_id and badge=:badge)
WHERE group=:group';
$params = array(
':group' => $group,
':badge' => $_GET['badge']
);
$stmt = $pdo->prepare($sql);
$result = $stmt->execute($params);
Instead of
$sql = 'SELECT u.id, u.user, u.email, sum(r.points)
FROM users u
LEFT JOIN reputation r on (u.id=r.user_id)
LEFT JOIN badge b on (u.id=b.user_id and badge="'.mysql_real_escape_string($_GET['badge']).'")
WHERE group="'.mysql_real_escape_string($group).'"';
$result = mysql_query($sql);
Imagine you had to change the sql statement, which code would be your favourite? ;-)
Prepared Statements come in handy in several situations:
Great separation of query data from untrusted user data.
Performance increase when the same query is executed multiple times
Performance increase when binary data is being transmitted as the prepared statement can use the binary protocol, whereas a traditional query will end up doing encoding and such.
There is a performance hit under normal circumstances (not repeated, no binary data) as you now have to do two back and forths. The first to "prepare" the query, and the second to transmit the token along with the data to be inserted. Most people are willing to make this sacrifice for the security benefit.
With regards to persistent connections:
MySQL has one of the fastest connection build up times on the market. It's essentially free for most set ups, so you're not going to see too much of a change using persistent connections or not.
The answer has to do with security and abstraction. Everyone else has already mentioned security, but the real upside is that your input is completely abstracted from the query itself. This allows for a true database agnosticism when using an abstraction layer, whereas inlining the input is usually a database-dependent process. If you care anything for portability, prepared statements are the way to go.
In the real world, I rarely ever write DML queries. All of my INSERTS / UPDATES are automatically built by the abstraction layer and are executed by simply passing an input array. For all intents and purposes, there really is no "performance hit" for preparing queries and then executing them (save for connection latency in the initial PREPARE). But when using a UDS (Unix Domain Socket) connection, you're not going to notice (or even be able to benchmark) a difference. It's usually on the order of a few microseconds.
Given the security and abstraction upsides, I'd hardly call it wasteful.
The performance benefit doesn't come from less parsing - it comes from only having to calculate access paths once rather than repeatedly. This helps a lot when you're issuing thousands of queries.
Given mysql's very simple optimizer/planner this may be less of an issue than with a more mature database with much more sophisticated optimizers.
However, this performance benefit can actually turn into a detriment if you've got a sophisticated optimizer that is aware of data skews. In that case you can often be better off with getting a different access path for the same query using different literal values rather than reusing a preexisting path.
When using sql queries like SELECT x,y,z FROM foo WHERE c='mary had a little lamb' the server has to parse the sql statement including the data + you have to sanitize the "mary had..." part (a call to mysql_real_escape() or similar for each parameter).
Using prepared statements the server has to parse the statement, too, but without the the data and sends back only an identifier for the statement (a tiny tiny data packet). Then you send the actual data without first sanitizing it. I don't see the overhead here, though I freely admit I've never tested it. Have you? ;-)
edit: And using prepared statements can eliminate the need to convert each and every parameter (in/out) to strings. Probably even more so if your version of php uses mysqlnd (instead of the "old" libmysql client library). Haven't tested the performance aspect of that either.
I don't seem to be finding any good benefits to use persistent connections - or prepared statements for that mater. Look at these numbers - for 6000 select statements (which will never happen in a page request!) you can barely tell the difference. Most of my pages use less than 10 queries.
UPDATED I just revised my test to
include 4k SELECT and 4k INSERT
statements! Run it yourself and let me
know if there are any design errors.
Perhaps the difference would be greater if my MySQL server wasn't running on the same machine as Apache.
Persistent: TRUE
Prepare: TRUE
2.3399310112 seconds
Persistent: FALSE
Prepare: TRUE
2.3265211582184 seconds
Persistent: TRUE
Prepare: FALSE
2.3666892051697 seconds
Persistent: FALSE
Prepare: FALSE
2.3496441841125 seconds
Here is my test code:
$hostname = 'localhost';
$username = 'root';
$password = '';
$dbname = 'db_name';
$persistent = FALSE;
$prepare = FALSE;
try
{
// Force PDO to use exceptions for all errors
$attrs = array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION);
if($persistent)
{
// Make the connection persistent
$attrs[PDO::ATTR_PERSISTENT] = TRUE;
}
$db = new PDO("mysql:host=$hostname;dbname=$dbname", $username, $password, $attrs);
// What type of connection?
print 'Persistent: '.($db->getAttribute(PDO::ATTR_PERSISTENT) ? 'TRUE' : 'FALSE').'<br />';
print 'Prepare: '.($prepare ? 'TRUE' : 'FALSE').'<br />';
//Clean table from last run
$db->exec('TRUNCATE TABLE `pdo_insert`');
}
catch(PDOException $e)
{
echo $e->getMessage();
}
$start = microtime(TRUE);
$name = 'Jack';
$body = 'This is the text "body"';
if( $prepare ) {
// Select
$select = $db->prepare('SELECT * FROM pdo_insert WHERE id = :id');
$select->bindParam(':id', $x);
// Insert
$insert = $db->prepare('INSERT INTO pdo_insert (`name`, `body`, `author_id`)
VALUES (:name, :body, :author_id)');
$insert->bindParam(':name', $name);
$insert->bindParam(':body', $body);
$insert->bindParam(':author_id', $x);
$run = 0;
for($x=0;$x<4000;++$x)
{
if( $insert->execute() && $select->execute() )
{
$run++;
}
}
}
else
{
$run = 0;
for($x=0;$x<4000;++$x) {
// Insert
if( $db->query('INSERT INTO pdo_insert (`name`, `body`, `author_id`)
VALUES ('.$db->quote($name).', '. $db->quote($body).', '. $db->quote($x).')')
AND
// Select
$db->query('SELECT * FROM pdo_insert WHERE id = '. $db->quote($x)) )
{
$run++;
}
}
}
print (microtime(true) - $start).' seconds and '.($run * 2).' queries';
Cassy is right. If you don't prepare/compile it, the dbms would have to in any case before able to run it.
Also, the advantage is you could check the prepare result and if prepare fail your algo can branch off to treat an exception without wasting db resources to run the failing query.
Related
I always check/limit/cleanup the user variables I use in database queries
Like so:
$pageid = preg_replace('/[^a-z0-9_]+/i', '', $urlpagequery); // urlpagequery comes from a GET var
$sql = 'SELECT something FROM sometable WHERE pageid = "'.$pageid.'" LIMIT 1';
$stmt = $conn->query($sql);
if ($stmt && $stmt->num_rows > 0) {
$row = $stmt->fetch_assoc();
// do something with the database content
}
I don't see how using prepared statements or further escaping improves anything in that scenario? Injection seems impossible here, no?
I have tried messing with prepared statements.. and I kind of see the point, even though it takes much more time and thinking (sssiissisis etc.) to code even just half-simple queries.
But as I always cleanup the user input before DB interaction, it seems unnecessary
Can you enlighten me?
You will be better off using prepared statement consistently.
Regular expressions are only a partial solution, but not as convenient or as versatile. If your variables don't fit a pattern that can be filtered with a regular expression, then you can't use them.
All the "ssisiisisis" stuff is an artifact of Mysqli, which IMHO is needlessly confusing.
I use PDO instead:
$sql = 'SELECT something FROM sometable WHERE pageid = ? LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute(array($pageid));
See? No need for regexp filtering. No need for quoting or breaking up the string with . between the concatenated parts.
It's easy in PDO to pass an array of variables, then you don't have to do tedious variable-binding code.
PDO also supports named parameters, which can be handy if you have an associative array of values:
$params = array("pageid"=>123, "user"=>"Bill");
$sql = 'SELECT something FROM sometable WHERE pageid = :pageid AND user = :user LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute($params);
If you enable PDO exceptions, you don't need to test whether the query succeeds. You'll know if it fails because the exception is thrown (FWIW, you can enable exceptions in Mysqli too).
You don't need to test for num_rows(), just put the fetching in a while loop. If there are no rows to fetch, then the loop stops immediately. If there's just one row, then it loops one iteration.
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
// do something with the database content
}
Prepared statements are easier and more flexible than filtering and string-concatenation, and in some cases they are faster than plain query() calls.
The question would be how you defined "improve" in this context. In this situation I would say that it makes no difference to the functionality of the code.
So what is the difference to you? You say that this is easier and faster for you to write. That might be the case but is only a matter of training. Once you're used to prepared statements, you will write them just as fast.
The difference to other programmers? The moment you share this code, it will be difficult for the other person to fully understand as prepared statements are kind of standard (or in a perfect world would be). So by using something else it makes it in fact harder to understand for others.
Talking more about this little piece of code makes no sense, as in fact it doesn't matter, it's only one very simple statement. But imagine you write a larger script, which will be easier to read and modify in the future?
$id = //validate int
$name = //validate string
$sometext = //validate string with special rules
$sql = 'SELECT .. FROM foo WHERE foo.id = '.$id.' AND name="'.$name.'" AND sometext LIKE "%'.$sometext.'%"';
You will always need to ask yourself: Did I properly validate all the variables I am using? Did I make a mistake?
Whereas when you use code like this
$sql = $db->prepare('SELECT .. FROM foo WHERE foo.id = :id AND name=":name" AND sometext LIKE "%:sometext%"');
$sql->bind(array(
':id' => $id,
':name' => $name,
':sometext' => $sometext,
));
No need to worry if you done everything right because PHP will take care of this for you.
Of course this isn't a complex query as well, but having multiple variables should demonstrate my point.
So my final answer is: If you are the perfect programmer who never forgets or makes mistakes and work alone, do as you like. But if you're not, I would suggest using standards as they exist for a reason. It is not that you cannot properly validate all variables, but that you should not need to.
Prepared statements can sometimes be faster. But from the way you ask the question I would assume that you are in no need of them.
So how much extra performance can you get by using prepared statements ? Results can vary. In certain cases I’ve seen 5x+ performance improvements when really large amounts of data needed to be retrieved from localhost – data conversion can really take most of the time in this case. It could also reduce performance in certain cases because if you execute query only once extra round trip to the server will be required, or because query cache does not work.
Brought to you faster by http://www.mysqlperformanceblog.com/
I don't see how using prepared statements or further escaping improves anything in that scenario?
You're right it doesn't.
P.S. I down voted your question because there seems little research made before you asked.
Hello I need help finding a way to protect from sql injection on my current project, Im making bash tutorial site but ive run into a problem. I put most my content in database and depending on what link the user clicks it will pull different data onto the page.
This is how im doing it
apt-get <br>
And on bash_cmds.php
<?php
require_once("connections/connect.php");
$dbcon = new connection();
$bash = $_REQUEST['id'];
$query2 = "SELECT * FROM bash_cmds WHERE id = $bash ";
$results = $dbcon->dbconnect()->query($query2);
if($results){
while($row = $results->fetch(PDO::FETCH_ASSOC)){
$bash_cmd = $row['bash_command'];
$how = $row['how_to'];
}
} else { return false; }
?>
<?php echo $bash_cmd ?>
<br />
<table>
<tr><td><?php echo $how ?> </td></tr>
</table>
However this leaves me vulnerable to sql injection, I ran sqlmap and was able to pull all databases and tables. Can someone please help I would appreciate it a lot the infomation would be invaluable.
There are a couple of ways to do this. I believe the best way is to use some database abstraction layer (there's a good one built into PHP called PDO) and use its prepared statements API. You can read more about PDO here, and you can see the particular function which binds a value to a ? placeholder here.
Alternatively, you could use the mysqli_real_escape_string API function, which should escape any SQL inside your $bash variable.
Of course, in this particular case, simply ensuring the ID is an integer with (int) or intval() would be good enough, but the danger of using this approach in general is that it's easy to forget to do this one time, which is all it takes for your application to be vulnerable. If you use something like PDO, it's more "safe by default," one might say - it's more difficult to accidentally write vulnerable code.
You could bind the values to a prepared statement.
But for something simple as a numeric variable a cast to an integer would be good enough:
$bash = (int) $_REQUEST['id'];
Using this, only a number would get stored into $bash. Even if someone enters ?id=--%20DROP%20TABLE%20xy;, as this will get casted to 1;
I've found one of the easiest ways to protect against injection is to use prepared statements.
You can do this in PHP via PDO, as CmdrMoozy suggested.
Prepared statements are more secure because the placeholders ? can only represent values, and not variables (ie: will never be interpreted as a table name, server variable, column name, etc. It {currently} can't even represent a list of values). This immediately makes any modification to the logic of the query immutable, leaving only possible unwanted values as injection possibilities (looking for an id of 'notanid'), which in most cases isn't a concern (they'd just get a blank/wrong/error page, their fault for trying to hack your site).
Addendum:
These restrictions are what is in place when the prepared statements are done on the server. When prepared statements are simulated by a library instead of actually being server side the same may not be true, but often many of these are emulated.
I'm running a loop to collect the data from an XML file, after the 2000+ lines are done there is some validation rules that run afterwards. If it is safe I'd like to store the MySQL data, in the before mentioned loops I am wanting to store all the MySQL queries to an array (that being almost 2000 queries) and run them in a loop after validation rules have finished.
Is this a bad idea? If so, what would be the best idea behind storing the query and running it later?
To give you an idea of what I'm talking about if you didn't understand:
foreach($xml->object as $control) {
// Stores relative xml data here
}
if(1=1) // Validation rules run
for(...) { // For Loop for my queries (2000 odd loop)
Mysql_Query(..)
}
To insert a lot of date into a MySQL database I would start a transaction (http://php.net/manual/en/mysqli.begin-transaction.php), and then create a prepared statement and just loop through all the items and validate them straight away and execute the prepared statement. Afterwards just mark the transaction as successful and end it. This is the fastest approach.
The use of prepared statements also prevents SQL Injection.
What you could do is use a prepared statement for a library that supports it (PDO or mysqli). In PDO:
$stmt = $pdo->prepare("INSERT INTO t1 VALUES (:f1, :f2, :f3)");
$stmt->bindParam(":f1", $f1);
$stmt->bindParam(":f2", $f2);
$stmt->bindParam(":f3", $f3);
foreach ($xml->obj as $control) {
// create $array
}
if ($valid) {
foreach ($array as $control) {
$f1 = $control['f1'];
$f2 = $control['f2'];
$f3 = $control['f3'];
$stmt->execute();
}
}
There's nothing wrong with this approach, except for the fact that it uses more RAM than would be used by progressively parsing and posting the XML data records to the database. If the server running the php belongs to you that's no problem. If you're on one of those cheap USD5 per month shared hosting plans, it's possible you'll run into memory or execution time limits.
You mentioned "safety." Are you concerned that you will have to back out all the database updates if any one of them fails? If that's the case, do two things:
Don't use the MyISAM access method for your table.
Use a transaction, and when you've posted all your data base changes, commit it.
This transactional approach is good, because if your posting process fails for any reason, you can roll back the transaction and get back to your starting point.
Will mysql_real_rescape_string() be enough to protect me from hackers and SQL attacks? Asking because I heard that these don't help against all attack vectors? Looking for the advice of experts.
EDIT: Also, what about LIKE SQL attacks?
#Charles is extremely correct!
You put yourself at risk for multiple types of known SQL attacks, including, as you mentioned
SQL injection: Yes! Mysql_Escape_String probably STILL keeps you susceptible to SQL injections, depending on where you use PHP variables in your queries.
Consider this:
$sql = "SELECT number FROM PhoneNumbers " .
"WHERE " . mysql_real_escape_string($field) . " = " . mysql_real_escape_string($value);
Can that be securely and accurately escaped that way? NO! Why? because a hacker could very well still do this:
Repeat after me:
mysql_real_escape_string() is only meant to escape variable data, NOT table names, column names, and especially not LIMIT fields.
LIKE exploits: LIKE "$data%" where $data could be "%" which would return ALL records ... which can very well be a security exploit... just imagine a Lookup by last four digits of a credit card... OOPs! Now the hackers can potentially receive every credit card number in your system! (BTW: Storing full credit cards is hardly ever recommended!)
Charset Exploits: No matter what the haters say, Internet Explorer is still, in 2011, vulnerable to Character Set Exploits, and that's if you have designed your HTML page correctly, with the equivalent of <meta name="charset" value="UTF-8"/>! These attacks are VERY nasty as they give the hacker as much control as straight SQL injections: e.g. full.
Here's some example code to demonstrate all of this:
// Contains class DBConfig; database information.
require_once('../.dbcreds');
$dblink = mysql_connect(DBConfig::$host, DBConfig::$user, DBConfig::$pass);
mysql_select_db(DBConfig::$db);
//print_r($argv);
$sql = sprintf("SELECT url FROM GrabbedURLs WHERE %s LIKE '%s%%' LIMIT %s",
mysql_real_escape_string($argv[1]),
mysql_real_escape_string($argv[2]),
mysql_real_escape_string($argv[3]));
echo "SQL: $sql\n";
$qq = mysql_query($sql);
while (($data = mysql_fetch_array($qq)))
{
print_r($data);
}
Here's the results of this code when various inputs are passed:
$ php sql_exploits.php url http://www.reddit.com id
SQL generated: SELECT url FROM GrabbedURLs
WHERE url LIKE 'http://www.reddit.com%'
ORDER BY id;
Returns: Just URLs beginning w/ "http://www.reddit.com"
$ php sql_exploits.php url % id
SQL generated: SELECT url FROM GrabbedURLs
WHERE url LIKE '%%'
ORDER BY id;
Results: Returns every result Not what you programmed, ergo an exploit --
$ php sql_exploits.php 1=1
'http://www.reddit.com' id Results:
Returns every column and every result.
Then there are the REALLLY nasty LIMIT exploits:
$ php sql_exploits.php url
> 'http://www.reddit.com'
> "UNION SELECT name FROM CachedDomains"
Generated SQL: SELECT url FROM GrabbedURLs
WHERE url LIKE 'http://reddit.com%'
LIMIT 1
UNION
SELECT name FROM CachedDomains;
Returns: An entirely unexpected, potentially (probably) unauthorized query
from another, completely different table.
Whether you understand the SQL in the attacks or not is irrevelant. What this has demonstrated is that mysql_real_escape_string() is easily circumvented by even the most immature of hackers. That is because it is a REACTIVE defense mechism. It only fixes very limited and KNOWN exploits in the Database.
All escaping will NEVER be sufficient to secure databases. In fact, you can explicitly REACT to every KNOWN exploit and in the future, your code will most likely become vulnerable to attacks discovered in the future.
The proper, and only (really) , defense is a PROACTIVE one: Use Prepared Statements. Prepared statements are designed with special care so that ONLY valid and PROGRAMMED SQL is executed. This means that, when done correctly, the odds of unexpected SQL being able to be executed are drammatically reduced.
Theoretically, prepared statements that are implemented perfectly would be impervious to ALL attacks, known and unknown, as they are a SERVER SIDE technique, handled by the DATABASE SERVERS THEMSELVES and the libraries that interface with the programming language. Therefore, you're ALWAYS guaranteed to be protected against EVERY KNOWN HACK, at the bare minimum.
And it's less code:
$pdo = new PDO($dsn);
$column = 'url';
$value = 'http://www.stackoverflow.com/';
$limit = 1;
$validColumns = array('url', 'last_fetched');
// Make sure to validate whether $column is a valid search parameter.
// Default to 'id' if it's an invalid column.
if (!in_array($column, $validColumns) { $column = 'id'; }
$statement = $pdo->prepare('SELECT url FROM GrabbedURLs ' .
'WHERE ' . $column . '=? ' .
'LIMIT ' . intval($limit));
$statement->execute(array($value));
while (($data = $statement->fetch())) { }
Now that wasn't so hard was it? And it's forty-seven percent less code (195 chars (PDO) vs 375 chars (mysql_). That's what I call, "full of win".
EDIT: To address all the controversy this answer stirred up, allow me to reiterate what I have already said:
Using prepared statements allows one to harness the protective measures of
the SQL server itself, and therefore
you are protected from things that the
SQL server people know about. Because
of this extra level of protection, you
are far safer than by just using
escaping, no matter how thorough.
No!
Important update: After testing possible exploit code provided by Col. Shrapnel and reviewing MySQL versions 5.0.22, 5.0.45, 5.0.77, and 5.1.48, it seems that the GBK character set and possibly others combined with a MySQL version lower than 5.0.77 may leave your code vulnerable if you only use SET NAMES instead of using the specific mysql_set_charset/mysqli_set_charset functions. Because those were only added in PHP 5.2.x, the combination of old PHP and old MySQL can yield a potential SQL injection vulnerability, even if you thought you were safe and did everything correctly, by-the-book.
Without setting the character set in combination with mysql_real_escape_string, you may find yourself vulnerable to a specific character set exploit possible with older MySQL versions. More info on previous research.
If possible, use mysql_set_charset. SET NAMES ... is not enough to protect against this specific exploit if you are using an effected version of MySQL (prior to 5.0.22 5.0.77).
Yes. If you will not forget to:
Escape string data with mysql_real_rescape_string()
Cast numbers to numbers explicitly (ie: $id = (int)$_GET['id'];)
then you're protected.
I personally prefer prepared statements:
<?php
$stmt = $dbh->prepare("SELECT * FROM REGISTRY where name = ?");
if ($stmt->execute(array($_GET['name']))) {
while ($row = $stmt->fetch()) {
print_r($row);
}
}
?>
It would be pretty easy to overlook one or another specific variable that has been missed when using one of the *escape_string() functions, but if all your queries are prepared statements, then they are all fine, and use of interpolated variables will stand out like a sore thumb.
But this is far from sufficient to ensure you're not vulnerable to remote exploits: if you're passing around an &admin=1 with GET or POST requests to signify that someone is an admin, every one of your users could easily upgrade their privileges with two or three seconds of effort. Note that this problem isn't always this obvious :) but this is an easy way to explain the consequences of trusting user-supplied input too much.
You should look into using prepared statements/parameterized queries instead. The idea is that you give the database a query with placeholders. You then give the database your data, and tell it which placeholder to replace with said data, and the database makes sure that it's valid and doesn't allow it to overrun the placeholder (i.e. it can't end a current query and then add its own - a common attack).
So I'm a slightly seasoned php developer and have been 'doin the damn thing' since 2007; however, I am still relatively n00bish when it comes to securing my applications. In the way that I don't really know everything I know I could and should.
I have picked up Securing PHP Web Applications and am reading my way through it testing things out along the way. I have some questions for the general SO group that relate to database querying (mainly under mysql):
When creating apps that put data to a database is mysql_real_escape_string and general checking (is_numeric etc) on input data enough? What about other types of attacks different from sql injection.
Could someone explain stored procedures and prepared statements with a bit more info than - you make them and make calls to them. I would like to know how they work, what validation goes on behind the scenes.
I work in a php4 bound environment and php5 is not an option for the time being. Has anyone else been in this position before, what did you do to secure your applications while all the cool kids are using that sweet new mysqli interface?
What are some general good practices people have found to be advantageous, emphasis on creating an infrastructure capable of withstanding upgrades and possible migrations (like moving php4 to php5).
Note: have had a search around couldn't find anything similar to this that hit the php-mysql security.
Javier's answer which has the owasp link is a good start.
There are a few more things you can do more:
Regarding SQL injection attacks, you can write a function that will remove common SQL statements from the input like " DROP " or "DELETE * WHERE", like this:
*$sqlarray = array( " DROP ","or 1=1","union select","SELECT * FROM","select host","create table","FROM users","users WHERE");*
Then write the function that will check your input against this array. Make sure any of the stuff inside the $sqlarray won't be common input from your users. (Don't forget to use strtolower on this, thanks lou).
I'm not sure if memcache works with PHP 4 but you can put in place some spam protection with memcache by only allowing a certain remote IP access to the process.php page X amount of times in Y time period.
Privileges is important. If you only need insert privileges (say, order processing), then you should log into the database on the order process page with a user that only has insert and maybe select privileges. This means that even if a SQL injection got through, they could only perform INSERT / SELECT queries and not delete or restructuring.
Put important php processing files in a directory such as /include. Then disallow all IPs access to that /include directory.
Put a salted MD5 with the user's agent + remoteip + your salt in the user's session, and make it verify on every page load that the correct MD5 is in their cookie.
Disallow certain headers (http://www.owasp.org/index.php/Testing_for_HTTP_Methods_and_XST) . Disallow PUT(If you dont need file uploads)/TRACE/CONNECT/DELETE headers.
My recommendations:
ditch mysqli in favor of PDO (with mysql driver)
use PDO paremeterized prepared statements
You can then do something like:
$pdo_obj = new PDO( 'mysql:server=localhost; dbname=mydatabase',
$dbusername, $dbpassword );
$sql = 'SELECT column FROM table WHERE condition=:condition';
$params = array( ':condition' => 1 );
$statement = $pdo_obj->prepare( $sql,
array( PDO::ATTR_CURSOR => PDO::CURSOR_FWDONLY ) );
$statement->execute( $params );
$result = $statement->fetchAll( PDO::FETCH_ASSOC );
PROs:
No more manual escaping since PDO does it all for you!
It's relatively easy to switch database backends all of a sudden.
CONs:
i cannot think of any.
I don't usually work with PHP so I can't provide advice specifically targeted to your requirements, but I suggest that you take a look at the OWASP page, particularly the top 10 vulnerabilities report: http://www.owasp.org/index.php/Top_10_2007
In that page, for each vulnerability you get a list of the things you can do to avoid the problem in different platforms (.Net, Java, PHP, etc.)
Regarding the prepared statements, they work by letting the database engine know how many parameters and of what types to expect during a particular query, using this information the engine can understand what characters are part of the actual parameter and not something that should be parsed as SQL like an ' (apostrophe) as part of the data instead of a ' as a string delimiter. Sorry I can not provide more info targeted at PHP, but hope this helps.
AFAIK, PHP/mySQL doesn't usually have parameterized queries.
Using sprintf() with mysql_real_escape_string() should work pretty well. If you use appropriate format strings for sprintf() (e.g. "%d" for integers) you should be pretty safe.
I may be wrong, but shouldn't it be enough to use mysql_real_escape_string on user provided data?
unless when they are numbers, in which case you should make sure they are in fact numbers instead by using for example ctype_digit or is_numeric or sprintf (using %d or %u to force input into a number).
Also, having a serarate mysql user for your php scripts that can only SELECT, INSERT, UPDATE and DELETE is probably a good idea...
Example from php.net
Example #3 A "Best Practice" query
Using mysql_real_escape_string() around each variable prevents SQL Injection. This example demonstrates the "best practice" method for querying a database, independent of the Magic Quotes setting.
The query will now execute correctly, and SQL Injection attacks will not work.
<?php
if (isset($_POST['product_name']) && isset($_POST['product_description']) && isset($_POST['user_id'])) {
// Connect
$link = mysql_connect('mysql_host', 'mysql_user', 'mysql_password');
if(!is_resource($link)) {
echo "Failed to connect to the server\n";
// ... log the error properly
} else {
// Reverse magic_quotes_gpc/magic_quotes_sybase effects on those vars if ON.
if(get_magic_quotes_gpc()) {
$product_name = stripslashes($_POST['product_name']);
$product_description = stripslashes($_POST['product_description']);
} else {
$product_name = $_POST['product_name'];
$product_description = $_POST['product_description'];
}
// Make a safe query
$query = sprintf("INSERT INTO products (`name`, `description`, `user_id`) VALUES ('%s', '%s', %d)",
mysql_real_escape_string($product_name, $link),
mysql_real_escape_string($product_description, $link),
$_POST['user_id']);
mysql_query($query, $link);
if (mysql_affected_rows($link) > 0) {
echo "Product inserted\n";
}
}
} else {
echo "Fill the form properly\n";
}
Use stored procedures for any activity that involves wrinting to the DB, and use bind parameters for all selects.