I'm used to doing something like the following for my queries:
$array = array(123,456,10,57,1024,768); //random data
$sql = "select * from table where field in(".implode(',',$array).")";
mysql_query($sql);
This works just fine, and is fast even when the array may have 1,000 or more items to search for(effectively a join against a nonexistent table).
For me, this is quite useful because I am joining data from multiple sources together.
However, I know it is somewhat insecure -- SQL injection is possible if I don't escape my incoming data correctly.
I'd like to try to figure out a parametrized equivalent of this; some way to do that sort of "array IN" using more "hardened" code.
It may not be possible to do this directly, it may require a helper function that does the replacements. If so, I'm looking for a good one I can drop into my code, hopefully something that will allow me to just pass an array as a parameter.
While my example is for PHP, I also need to figure this out for Python, as I'm doing the same thing there.
I've read some similar questions here, but haven't seen any (good) answers.
Using PDO prepared statements:
$placeholders = str_repeat('?, ', count($array)-1) . '?';
$stmt = $pdo->prepare("SELECT * FROM table WHERE field IN ($placeholders)");
$stmt->execute($array);
$placeholders will contain a sequence of ?, ?, ? placeholders, with the same number of ? as the size of the array. Then when you execute the statement, the array values are bound to the placeholders.
I have personally done:
$in = substr(str_repeat('?,', count($array)), 0, -1);
$sql = "SELECT * FROM table WHERE field IN ($in)";
This will provide you with ?, for each array element and remove the trailing comma.
Related
How is best way add to database multiInsert row?
E.g I have array and i would like add all array to database. I can create loop foreach and add all arrays.
$array=['apple','orange'];
foreach($array as $v)
{
$stmt = $db->exec("Insert into test(fruit) VALUES ('$v')");
}
And it's work, but maybe i should use transaction? or it do other way?
Use a prepared statement.
$sql = "INSERT INTO test (fruit) VALUES ";
$sql .= implode(', ', array_fill(0, count($array), '(?)'));
$stmt = $db->prepare($sql);
$stmt->execute($array);
The SQL will look like:
INSERT INTO test (fuit) VALUES (?), (?), (?), ...
where there are as many (?) as the number of elements in $array.
Doing a single query with many VALUES is much more efficient that performing separate queries in a loop.
If you have an associative array with input values for a single row, you can use a prepared query like this:
$columns = implode(',', array_keys($array);
$placeholders = implode(', ', array_fill(0, count($array), '?'));
$sql = "INSERT INTO test($columns) VALUES ($placeholders)";
$stmt = $db->prepare($sql);
$stmt->execute(array_values($array));
The way you've done it is, in many ways, the worst option. So the good news is that any other way of doing it will probably be better. As it stands, the code may fail depending on what's in the data; consider:
$v="single ' quote";
$stmt = $db->exec("Insert into test(fruit) VALUES ('$v')");
But without knowing what your criteria are for "best", its rather hard to advise. Using a parametrised query with data binding, or as they are often described "prepared statements" is one solution to the problem described above. Escaping the values appropriately before interpolating the string is another (and is how most PHP implementations of data binding work behind the scenes) is another common solution.
Leaving aside the question of how you get the parameters into the SQL statement, then there is the question of performance. Each round trip to the database has a cost associated with it. And doing a single insert at a time also has a performance impact - for each query, the DBMS must parse the query, apply the appropriate concurrency controls, execute the query, apply the writes to the journal, then to the data tables and indexes then tidy up before it can return the thread of execution back to PHP to construct the next query.
Wrapping multiple queries in a transaction (you are using transactions already - but they are implicit and applied to each statement) can reduce some of the overhead I have described here, but can introduce other problems, the nature of which depends on which concurrency model your DBMS uses.
To get the data into the database as quickly as possible, and minimising index fragmentation, the "best" solution is to batch up multiple inserts:
$q="INSERT INTO test (fruit) VALUES ";
while (count($array)) {
$s=$q;
$j='';
for ($x=0; $x<count($array) && $x<CHUNKSIZE; $x++) {
$s.=$j." ('" . mysqli_real_escape_string($db,
array_shift($array)) . "')";
$j=',';
}
mysqli_query($db,$s);
}
This is similar to Barmar's method, but I think easier to understand when you are working with more complex record structures, and it won't break with very large input sets.
This question already has answers here:
Can I bind an array to an IN() condition in a PDO query?
(23 answers)
Closed 7 years ago.
Is it possible to use prepared statements with either MySQLi or PDO and still be able to dynamically add items to the IN part of the query, for example...
$somearray = ['tagvalue1', 'tagvalue2', 'tagvalue3'];
$sql = "SELECT foo FROM bar
WHERE tag IN(?)";
I ask this because I have a situation whereby the number of elements in the IN part is not known until runtime.
You asked:
Is it possible to use prepared statements with either MySQLi or PDO
and still be able to dynamically add items to the IN part of the
query, for example...
No, unfortunately it is not. It happens that ColdFusion does this, but not php.
While you can't do exactly what you want with a prepared query, you can dynamically generate the $sql string for the query to accomplish the same thing.
Given some array $array = (n, n1, n2, ... nN)
$sql = "SELECT foo FROM bar WHERE tag IN (";
foreach($array as $value) {
$sql .= "'" . $value . "', ";
}
// Strip off the last comma and space from the IN clause
$sql = substr($sql, 0, strlen($sql) - 2);
$sql .= ")";
It's certainly not the most elegant solution and you'll have to do some more data validation or escaping of dangerous characters that a prepared query would handle better, but it will do the job.
As a side note, there are ORM (Object Relational Mapper) libraries that support things like accepting an array of values to generate the IN clause in a database statement. Propel is the one I have most experience with, but I'm sure others like Doctrine would have a similar method.
A propel-ish example would be like
$results = BarQuery::create()
->select('foo')
->filterByTag(array($value1, $value2, ..., $vauleN)
->find();
Lots of added functionality and support, but does increase your initial set up time for a project.
I want to make a "dynamic" WHERE clause in my query based on a array of strings. And I want to run the created query using Mysqi's prepared statements.
My code so far, PHP:
$searchArray = explode(' ', $search);
$searchNumber = count($searchArray);
$searchStr = "tags.tag LIKE ? ";
for($i=1; $i<=$searchNumber-1 ;$i++){
$searchStr .= "OR tags.tag LIKE ? ";
}
My query:
SELECT tag FROM tags WHERE $searchStr;
More PHP:
$stmt -> bind_param(str_repeat('s', count($searchArray)));
Now this obviously gives me an error since the bind_param part only contains half the details it need.
How should I proceed?
Are there any other (better) way of doing this?
Is it secure?
Regarding the security part of the question, prepared statements with placeholders are as secure as the validation mechanism involved in filling these placeholders with values up. In the case of mysqli prepared statements, the documentation says:
The markers are legal only in certain places in SQL statements. For example, they are allowed in the VALUES() list of an INSERT statement (to specify column values for a row), or in a comparison with a column in a WHERE clause to specify a comparison value.
However, they are not allowed for identifiers (such as table or column names), in the select list that names the columns to be returned by a SELECT statement, or to specify both operands of a binary operator such as the = equal sign. The latter restriction is necessary because it would be impossible to determine the parameter type. It's not allowed to compare marker with NULL by ? IS NULL too. In general, parameters are legal only in Data Manipulation Language (DML) statements, and not in Data Definition Language (DDL) statements.
This clearly excludes any possibility of modifying the general semantic of the query, which makes it much harder (but not impossible) to divert it from its original intent.
Regarding the dynamic part of your query, you could use str_repeat in the query condition building part, instead of doing a loop:
$searchStr = 'WHERE tags.tag LIKE ?' .
str_repeat($searchNumber - 1, ' OR tags.tag LIKE ?');
For the bind_param call, you should use call_user_func_array like so:
$bindArray[0] = str_repeat('s', $searchNumber);
array_walk($searchArray,function($k,&$v) use (&$bindArray) {$bindArray[] = &$v;});
call_user_func_array(array($stmt,'bind_param'), $bindArray);
Hopefully the above snippet should bind every value of the $bindArray with its corresponding placeholder in the query.
Addenum:
However, you should be wary of two things:
call_user_func_array expects an integer indexed array for its second parameter. I am not sure how it would behave with a dictionary.
mysqli_stmt_bind_param requires its parameters to be passed by reference.
For the first point, you only need to make sure that $bindArray uses integer indices, which is the case in the code above (or alternatively check that call_user_func_array doesn't choke on the array you're providing it).
For the second point, it will only be a problem if you intend to modify the data within $bindArray after calling bind_param (ie. through the call_user_func_array function), and before executing the query.
If you wish to do so - for instance by running the same query several times with different parameters' values in the same script, then you will have to use the same array ( $bindArray) for the following query execution, and update the array entries using the same keys. Copying another array over won't work, unless done by hand:
foreach($bindArray as $k => $v)
$bindArray[$k] = some_new_value();
or
foreach($bindArray as &$v)
$v = some_new_value();
The above would work because it would not break the references on the array entries that bind_param bound with the statement when it was called earlier. Likewise, the following should work because it does not change the references which have been set earlier up.
array_walk($bindArray, function($k,&$v){$v = some_new_value();});
A prepared statement needs to have a well-defined number of arguments; it can't have any element of dynamic functionality. That means you'll have to generate the specific statement that you need and prepare just that.
What you can do – in case your code actually gets called multiple times during the existence of the database connection - is make cache of those prepared statements, and index them by the number of arguments that you're taking. This would mean that the second time you call the function with three arguments, you already have the statement done. But as prepared statements don't survive the disconnect anyway, this really only makes sense if you do multiple queries in the same script run. (I'm deliberately leaving out persistent connections, because that opens up an entirely different can of worms.)
By the way, I'm not an MySQL expert, but would it not make a difference to not have the where conditions joined,but rather writing WHERE tags in (tag1, tag2, tag3, tag4)?
Solved it by the help of an answer found here.
$query = "SELECT * FROM tags WHERE tags.tag LIKE CONCAT('%',?,'%')" . str_repeat(" OR tags.tag LIKE CONCAT('%',?,'%')", $searchNumber - 1)
$stmt = $mysqli -> prepare($query);
$bind_names[] = str_repeat('s', $searchNumber);
for ($i = 0; $i < count($searchArray); $i++){
$bind_name = 'bind'.$i; //generate a name for variable bind1, bind2, bind3...
$$bind_name = $searchArray[$i]; //create a variable with this name and put value in it
$bind_names[] = & $$bind_name; //put a link to this variable in array
}
call_user_func_array(array($stmt, 'bind_param'), &$bind_names);
$stmt -> execute();
Where and when do you use the quote method in PDO? I'm asking this in the light of the fact that in PDO, all quoting is done by the PDO object therefore no user input should be escaped/quoted etc. This makes one wonder why worry about a quote method if it's not gonna get used in a prepared statement anyway?
When using Prepared Statements with PDO::prepare() and PDOStatement::execute(), you don't have any quoting to do : this will be done automatically.
But, sometimes, you will not (or cannot) use prepared statements, and will have to write full SQL queries and execute them with PDO::exec() ; in those cases, you will have to make sure strings are quoted properly -- this is when the PDO::quote() method is useful.
While this may not be the only use-case it's the only one I've needed quote for. You can only pass values using PDO_Stmt::execute, so for example this query wouldn't work:
SELECT * FROM tbl WHERE :field = :value
quote comes in so that you can do this:
// Example: filter by a specific column
$columns = array("name", "location");
$column = isset($columns[$_GET["col"]]) ? $columns[$_GET["col"]] : $defaultCol;
$stmt = $pdo->prepare("SELECT * FROM tbl WHERE " . $pdo->quote($column) . " = :value");
$stmt->execute(array(":value" => $value));
$stmt = $pdo->prepare("SELECT * FROM tbl ORDER BY " . $pdo->quote($column) . " ASC");
and still expect $column to be filtered safely in the query.
The PDO system does not have (as far as I can find) any mechanism to bind an array variable in PHP into a set in SQL. That's a limitation of SQL prepared statements as well... thus you are left with the task of stitching together your own function for this purpose. For example, you have this:
$a = array(123, 'xyz', 789);
You want to end up with this:
$sql = "SELECT * FROM mytable WHERE item IN (123, 'xyz', 789)";
Using PDO::prepare() does not work because there's no method to bind the array variable $a into the set. You end up needing a loop where you individually quote each item in the array, then glue them together. In which case PDO::quote() is probably better than nothing, at least you get the character set details right.
Would be excellent if PDO supported a cleaner way to handle this. Don't forget, the empty set in SQL is a disgusting special case... which means any function you build for this purpose becomes more complex than you want it to be. Something like PDO::PARAM_SET as an option on the binding, with the individual driver deciding how to handle the empty set. Of course, that's no longer compatible with SQL prepared statements.
Happy if someone knows a way to avoid this difficulty.
A bit late anwser, but one situation where its useful is if you get a load of data out of your table which you're going to put back in later.
for example, i have a function which gets a load of text out of a table and writes it to a file. that text might later be inserted into another table. the quote() method makes all the quotes safe.
it's real easy:
$safeTextToFile = $DBH->quote($textFromDataBase);
I've always done the simple connection of mysql_connect, mysql_pconnect:
$db = mysql_pconnect('*host*', '*user*', '*pass*');
if (!$db) {
echo("<strong>Error:</strong> Could not connect to the database!");
exit;
}
mysql_select_db('*database*');
While using this I've always used the simple method to escape any data before making a query, whether that be INSERT, SELECT, UPDATE or DELETE by using mysql_real_escape_string
$name = $_POST['name'];
$name = mysql_real_escape_string($name);
$sql = mysql_query("SELECT * FROM `users` WHERE (`name` = '$name')") or die(mysql_error());
Now I understand this is safe, to an extent!
It escapes dangerous characters; however, it is still vulnerable to other attacks which can contain safe characters but may be harmful to either displaying data or in some cases, modifying or deleting data maliciously.
So, I searched a little bit and found out about PDO, MySQLi and prepared statements. Yes, I may be late to the game but I've read many, many tutorials (tizag, W3C, blogs, Google searches) out there and not a single one has mentioned these. It seems very strange as to why, as just escaping user input really isn't secure and not good practice to say the least. Yes, I'm aware you could use Regex to tackle it, but still, I'm pretty sure that's not enough?
It is to my understanding that using PDO/prepared statements is a much safer way to store and retrieve data from a database when the variables are given by user input. The only trouble is, the switch over (especially after being very stuck in my ways/habits of previous coding) is a little difficult.
Right now I understand that to connect to my database using PDO I would use
$hostname = '*host*';
$username = '*user*';
$password = '*pass*';
$database = '*database*'
$dbh = new PDO("mysql:host=$hostname;dbname=$database", $username, $password);
if ($dbh) {
echo 'Connected to database';
} else {
echo 'Could not connect to database';
}
Now, function names are different so no longer will my mysql_query, mysql_fetch_array, mysql_num_rows etc work. So I'm having to read/remember a load of new ones, but this is where I'm getting confused.
If I wanted to insert data from say a sign up/registration form, how would I go about doing this, but mainly how would I go about it securely? I assume this is where prepared statements come in, but by using them does this eliminate the need to use something like mysql_real_escape_string? I know that mysql_real_escape_string requires you to be connected to a database via mysql_connect/mysql_pconnect so now we aren't using either won't this function just produce an error?
I've seen different ways to approach the PDO method too, for example, I've seen :variable and ? as what I think are known as place holders (sorry if that is wrong).
But I think this is roughly the idea of what should be done to fetch a user from a database
$user_id = $_GET['id']; // For example from a URL query string
$stmt = $dbh->prepare("SELECT * FROM `users` WHERE `id` = :user_id");
$stmt->bindParam(':user_id', $user_id, PDO::PARAM_INT);
But then I'm stuck on a couple things, if the variable wasn't a number and was a string of text, you have to given a length after PDO:PARAM_STR if I'm not mistaken. But how can you give a set length if you're not sure on the value given from user in-putted data, it can vary each time? Either way, as far as I know to display the data you then do
$stmt->execute();
$result = $stmt->fetchAll();
// Either
foreach($result as $row) {
echo $row['user_id'].'<br />';
echo $row['user_name'].'<br />';
echo $row['user_email'];
}
// Or
foreach($result as $row) {
$user_id = $row['user_id'];
$user_name = $row['user_name'];
$user_email = $row['user_email'];
}
echo("".$user_id."<br />".$user_name."<br />".$user_email."");
Now, is this all safe?
If I am right, would inserting data be the same for example:
$username = $_POST['username'];
$email = $_POST['email'];
$stmt = $dbh->prepare("INSERT INTO `users` (username, email)
VALUES (:username, :email)");
$stmt->bindParam(':username, $username, PDO::PARAM_STR, ?_LENGTH_?);
$stmt->bindParam(':email, $email, PDO::PARAM_STR, ?_LENGTH_?);
$stmt->execute();
Would that work, and is that safe too? If it is right what value would I put in for the ?_LENGTH_?? Have I got this all completely wrong?
UPDATE
The replies I've had so far have been extremely helpful, can't thank you guys enough! Everyone has got a +1 for opening my eyes up to something a little different. It's difficult to choose the top answer, but I think Col. Shrapnel deserves it as everything is pretty much covered, even going into other arrays with custom libraries which I wasn't aware of!
But thanks to all of you:)
Thanks for the interesting question. Here you go:
It escapes dangerous characters,
Your concept is utterly wrong.
In fact "dangerous characters" is a myth, there are none.
And mysql_real_escape_string escaping but merely a string delimiters. From this definition you can conclude it's limitations - it works only for strings.
however, it is still vulnerable to other attacks which can contain safe characters but may be harmful to either displaying data or in some cases, modifying or deleting data maliciously.
You're mixing here everything.
Speaking of database,
for the strings it is NOT vulnerable. As long as your strings being quoted and escaped, they cannot "modify or delete data maliciously".*
for the other data typedata - yes, it's useless. But not because it is somewhat "unsafe" but just because of improper use.
As for the displaying data, I suppose it is offtopic in the PDO related question, as PDO has nothing to do with displaying data either.
escaping user input
^^^ Another delusion to be noted!
a user input has absolutely nothing to do with escaping. As you can learn from the former definition, you have to escape strings, not whatever "user input". So, again:
you have escape strings, no matter of their source
it is useless to escape other types of data, no matter of the source.
Got the point?
Now, I hope you understand the limitations of escaping as well as the "dangerous characters" misconception.
It is to my understanding that using PDO/prepared statements is a much safer
Not really.
In fact, there are four different query parts which we can add to it dynamically:
a string
a number
an identifier
a syntax keyword.
so, you can see that escaping covers only one issue. (but of course, if you treat numbers as strings (putting them in quotes), when applicable, you can make them safe as well)
while prepared statements cover - ugh - whole 2 isues! A big deal ;-)
For the other 2 issues see my earlier answer, In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?
Now, function names are different so no longer will my mysql_query, mysql_fetch_array, mysql_num_rows etc work.
That is another, grave delusion of PHP users, a natural disaster, a catastrophe:
Even when utilizing old mysql driver, one should never use bare API functions in their code! One have to put them in some library function for the everyday usage! (Not as a some magic rite but just to make the code shorter, less repetitive, error-proof, more consistent and readable).
The same goes for the PDO as well!
Now on with your question again.
but by using them does this eliminate the need to use something like mysql_real_escape_string?
YES.
But I think this is roughly the idea of what should be done to fetch a user from a database
Not to fetch, but to add a whatever data to the query!
you have to given a length after PDO:PARAM_STR if I'm not mistaken
You can, but you don't have to.
Now, is this all safe?
In terms of database safety there are just no weak spots in this code. Nothing to secure here.
for the displaying security - just search this site for the XSS keyword.
Hope I shed some light on the matter.
BTW, for the long inserts you can make some use of the function I wrote someday, Insert/update helper function using PDO
However, I am not using prepared statements at the moment, as I prefer my home-brewed placeholders over them, utilizing a library I mentioned above. So, to counter the code posted by the riha below, it would be as short as these 2 lines:
$sql = 'SELECT * FROM `users` WHERE `name`=?s AND `type`=?s AND `active`=?i';
$data = $db->getRow($sql,$_GET['name'],'admin',1);
But of course you can have the same code using prepared statements as well.
* (yes I am aware of the Schiflett's scaring tales)
I never bother with bindParam() or param types or lengths.
I just pass an array of parameter values to execute(), like this:
$stmt = $dbh->prepare("SELECT * FROM `users` WHERE `id` = :user_id");
$stmt->execute( array(':user_id' => $user_id) );
$stmt = $dbh->prepare("INSERT INTO `users` (username, email)
VALUES (:username, :email)");
$stmt->execute( array(':username'=>$username, ':email'=>$email) );
This is just as effective, and easier to code.
You may also be interested in my presentation SQL Injection Myths and Fallacies, or my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
Yes, :something is a named placeholder in PDO, ? is an anonymous placeholder. They allow you to either bind values one by one or all at once.
So, basically that makes four options to provide your query with values.
One by one with bindValue()
This binds a concrete value to your placeholder as soon as you call it. You may even bind hard coded strings like bindValue(':something', 'foo') if desired.
Providing a parameter type is optional (but suggested). However, since the default is PDO::PARAM_STR, you only need to specify it when it is not a string. Also, PDO will take care of the length here - there is no length parameter.
$sql = '
SELECT *
FROM `users`
WHERE
`name` LIKE :name
AND `type` = :type
AND `active` = :active
';
$stm = $db->prepare($sql);
$stm->bindValue(':name', $_GET['name']); // PDO::PARAM_STR is the default and can be omitted.
$stm->bindValue(':type', 'admin'); // This is not possible with bindParam().
$stm->bindValue(':active', 1, PDO::PARAM_INT);
$stm->execute();
...
I usually prefer this approach. I find it the cleanest and most flexible.
One by one with bindParam()
A variable is bound to your placeholder that will be read when the query is executed, NOT when bindParam() is called. That may or may not be what you want. It comes in handy when you want to repeatedly execute your query with different values.
$sql = 'SELECT * FROM `users` WHERE `id` = :id';
$stm = $db->prepare($sql);
$id = 0;
$stm->bindParam(':id', $id, PDO::PARAM_INT);
$userids = array(2, 7, 8, 9, 10);
foreach ($userids as $userid) {
$id = $userid;
$stm->execute();
...
}
You only prepare and bind once which safes CPU cycles. :)
All at once with named placeholders
You just drop in an array to execute(). Each key is a named placeholder in your query (see Bill Karwins answer). The order of the array is not important.
On a side note: With this approach you cannot provide PDO with data type hints (PDO::PARAM_INT etc.). AFAIK, PDO tries to guess.
All at once with anonymous placeholders
You also drop in an array to execute(), but it is numerically indexed (has no string keys). The values will replace your anonymous placeholders one by one in the order they appear in your query/array - first array value replaces first placeholder and so forth. See erm410's answer.
As with the array and named placeholders, you cannot provide data type hints.
What they have in common
All of those require you to bind/provide as much values as you have
placeholders. If you bind too many/few, PDO will eat your children.
You don't have to take care about escaping, PDO handles that. Prepared PDO statements are SQL injection safe by design. However, that's not true for exec() and query() - you should generally only use those two for hardcoded queries.
Also be aware that PDO throws exceptions. Those could reveal potentially sensitive information to the user. You should at least put your initial PDO setup in a try/catch block!
If you don't want it to throw Exceptions later on, you can set the error mode to warning.
try {
$db = new PDO(...);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_WARNING)
} catch (PDOException $e) {
echo 'Oops, something went wrong with the database connection.';
}
To answer the length question, specifying it is optional unless the param you are binding is an OUT parameter from a stored procedure, so in most cases you can safely omit it.
As far as safety goes, escaping is done behind the scenes when you bind the parameters. This is possible because you had to create a database connection when you created the object. You are also protected from SQL injection attacks since by preparing the statement, you are telling your database the format of the statement before user input can get anywhere near to it. An example:
$id = '1; MALICIOUS second STATEMENT';
mysql_query("SELECT * FROM `users` WHERE `id` = $id"); /* selects user with id 1
and the executes the
malicious second statement */
$stmt = $pdo->prepare("SELECT * FROM `users` WHERE `id` = ?") /* Tells DB to expect a
single statement with
a single parameter */
$stmt->execute(array($id)); /* selects user with id '1; MALICIOUS second
STATEMENT' i.e. returns empty set. */
Thus, in terms of safety, your examples above seem fine.
Finally, I agree that binding parameters individually is tedious and is just as effectively done with an array passed to PDOStatement->execute() (see http://www.php.net/manual/en/pdostatement.execute.php).