Suppose I have a 1995-style function that sends queries to MySQL.
I have lots of queries in my project and I'm looking for a function/class that can parse a raw query (say: SELECT foo FROM bar WHERE pizza = 'hot' LIMIT 1)
and create a prepared statement from it in PHP. Do you have any tips on that?
Is it worth it, or is it better to just rewrite all the queries?
I count 424 queries in my project, and that's just the SELECTs.
Thanks for any help.
Try this:
function prepare1995Sql_EXAMPLE($sqlString) {
    # regex pattern: any single-quoted string literal
    # (escaped quotes inside a literal are not handled)
    $pattern = '/\'.*?\'/';
    # best to use question marks for an easy example
    $preparedSqlString = preg_replace($pattern, '?', $sqlString);
    # grab parameter values (the surrounding quotes are kept)
    preg_match_all($pattern, $sqlString, $grabbedParameterValues);
    $parameterValues = $grabbedParameterValues[0];
    # prepare command:
    echo('$stmt = $pdo->prepare("' . $preparedSqlString . '");');
    echo("\n");
    # binding of parameters
    # (bindValue, because bindParam needs a variable and rejects a literal)
    $bindValueCtr = 1;
    foreach ($parameterValues as $value) {
        echo('$stmt->bindValue(' . $bindValueCtr . ", " . $value . ");");
        echo("\n");
        $bindValueCtr++;
    }
    # if you want to add the execute part, simply:
    echo('$stmt->execute();');
}
# TEST!
$sqlString = "SELECT foo FROM bar WHERE name = 'foobar' or nickname = 'fbar'";
prepare1995Sql_EXAMPLE($sqlString);
Sample output would be:
$stmt = $pdo->prepare("SELECT foo FROM bar WHERE name = ? or nickname = ?");
$stmt->bindValue(1, 'foobar');
$stmt->bindValue(2, 'fbar');
$stmt->execute();
This would probably work if all your SQL statements look like the example, with conditions that compare against strings. Once you need to compare against integers, the pattern must be changed. This is what I can do for now. I know it's not the best approach at all, but for a sample's sake, give it a try :)
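For what it's worth, here is a minimal sketch of the same idea that returns the rewritten query and its values instead of echoing code. The name extractStringLiterals is made up for illustration. It only understands single-quoted literals with '' or backslash escapes; numbers, double-quoted strings and SQL comments are left untouched, so treat it as a starting point rather than a real parser.

```php
<?php
// Hypothetical helper (not from any library): turns a raw query into
// a placeholder query plus the list of values that were pulled out of it.
function extractStringLiterals(string $sql): array
{
    $params = [];
    // A single-quoted literal; '' and backslash escapes may appear inside it.
    $pattern = "/'((?:[^'\\\\]|\\\\.|'')*)'/s";
    $prepared = preg_replace_callback(
        $pattern,
        function (array $m) use (&$params): string {
            // Undo the escapes so the bound value is the real string.
            // Only '', \' and \\ are handled in this sketch.
            $params[] = strtr($m[1], ["''" => "'", "\\'" => "'", "\\\\" => "\\"]);
            return '?';
        },
        $sql
    );
    return [$prepared, $params];
}

[$sql, $params] = extractStringLiterals(
    "SELECT foo FROM bar WHERE name = 'foobar' OR nickname = 'O''Neil' LIMIT 1"
);
echo $sql, "\n";
print_r($params);
```

The returned pair can be handed straight to $pdo->prepare($sql) and $stmt->execute($params), which avoids generating PHP source as text.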
I would recommend a regexp search for these queries (I think they should follow a pattern), then sort them and see which ones are similar and could be grouped.
Also, if you have some kind of log, check which ones are executed most frequently; it doesn't make much sense to move rarely run queries to prepared statements.
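To make the grouping step concrete, here is a rough sketch. It assumes you have already collected the raw query strings into an array (from a search of the code base or from a query log). groupQueriesByShape is a made-up name, and the two regular expressions are deliberately crude: they replace quoted strings and standalone numbers with ? so that queries differing only in their values land in the same group.

```php
<?php
// Hypothetical helper: counts how often each query "shape" occurs.
function groupQueriesByShape(array $queries): array
{
    $counts = [];
    foreach ($queries as $sql) {
        // Replace quoted strings, then standalone numbers, with ?.
        $shape = preg_replace("/'(?:[^']|'')*'/", '?', $sql);
        $shape = preg_replace('/\b\d+\b/', '?', $shape);
        // Normalise whitespace and case so trivial differences don't split groups.
        $shape = strtolower(trim(preg_replace('/\s+/', ' ', $shape)));
        $counts[$shape] = ($counts[$shape] ?? 0) + 1;
    }
    arsort($counts); // most frequent shape first
    return $counts;
}

$counts = groupQueriesByShape([
    "SELECT foo FROM bar WHERE pizza = 'hot' LIMIT 1",
    "select foo from bar where pizza = 'cold'  limit 1",
    "SELECT id FROM users WHERE id = 42",
]);
print_r($counts);
```

The shapes at the top of the list are the ones worth converting first.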
Honestly, you should rewrite your queries. Regular expressions would work for many of them, but you might find that some queries can't be handled by a pattern. The problem is that there's too much complexity in queries for one pattern to parse them all. It would also be best practice, and more consistent for your code, to simply do the work and rewrite your queries.
Best of luck!
You might want to enable a trace facility and capture the SQL commands as they are sent to the database. Be forewarned that what you are about to see will scare the pants off you :)
Related
I have the following simple search query code:
function explode_search($squery, $column, $db, $link) {
    global $conn;
    $ven = explode(' ', safeInput(str_replace(',', '', $squery)));
    $ven2 = array_map('trim', $ven);
    $qy = ''.$column.' LIKE "%'.implode('%" AND '.$column.' LIKE "%', $ven2).'%"';
    $query = 'SELECT DISTINCT '.$column.', id, work_composer FROM '.$db.' WHERE '.$qy.' ORDER BY '.$column.' LIMIT 100';
    $result = mysqli_query($conn, $query);
    while ($row = mysqli_fetch_assoc($result)) {
        echo '<div><span class="mdmtxt" style="margin-bottom:5px;">'.$row[$column].'</span> <span class="mdmtxt" style="opacity:0.6;">('.fixcomp(cfid($row['work_composer'], 'cffor_composer', 'composer_name')).')</span></div>';
    }
}
(The safeInput function removes ' and " and other potentially problematic characters.)
It works alright up to a point.
When someone looks for 'Stephane' I want them also to find 'Stéphane' (and vice versa) or if they are looking for 'Munich', 'Münich' should show up in the list as well.
Is there a way to make MySQL match those search queries, irrespective of the special characters involved?
You want to use what's called a "Parameterized Query". Most languages have them, and they are used to safeguard input and protect from attacks.
They're also extremely easy to use after you get used to them.
You start out with a simple query like this
$stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=?");
and you replace all your actual data with ?s.
Then you bind each parameter.
$stmt->bind_param("s", $city);
Then call $stmt->execute();
More information can be found here: http://php.net/manual/en/mysqli.prepare.php
One time in college, our prof made us use a library that didn't have parameterized queries, so I wrote an implementation myself. Doctrine, which is pretty awesome, handles all this for you. I would always rely on someone else to do this stuff for me instead of writing my own implementation. You'll get into trouble that way. There's also a lot written about not reinventing new types which is basically what you're doing. Types have problems. Types need testing. This kind of thing is well tested in other implementations and not yours.
We are using db2_prepare and db2_execute to prepare queries in a generic function. I am trying to have a debug method to get the full prepared query after the '?' values have been replaced by the db2_execute function.
Is there an efficient way of doing this besides manually replacing each '?' with the parameters I am passing in? i.e. is there a flag that can be set for db2_execute?
Example:
$params = array('xyz','123');
$query = "SELECT * FROM foo WHERE bar = ? AND baz = ?";
$sqlprepared = db2_prepare($CONNECTION, $query);
$sqlresults = db2_execute($sqlprepared,$params);
I would like the $sqlresults to contain the full prepared query:
"SELECT * FROM foo WHERE bar = 'xyz' AND baz = '123'";
I have looked through the docs and do not see any obvious way to accomplish this, but I imagine there must be a way.
Thank you!
db2_execute() does not replace parameter markers with values. Parameters are sent to the server and bound to the prepared statement there.
The CLI trace, which can be enabled on the client as explained here, will contain the actual parameter values. Keep in mind that the trace seriously affects application performance.
I ended up writing a loop that replaces the '?' parameters with a simple preg_replace and outputs the query in my 'debug' array key:
$debugquery = $query;
foreach ($params as $param) {
    $debugquery = preg_replace('/\?/', "'".$param."'", $debugquery, 1);
}
return $debugquery;
This handled what I needed to do (to print the finalized query for debugging purposes). This should not be run in Production due to performance impacts but is useful to look at the actual query the server is trying to perform (if you are getting unexpected results).
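One caveat with the loop above: a parameter value that itself contains a ? gets picked up by the next iteration, and a ? inside a quoted literal in the query is replaced too. A single-pass variant avoids both. This is only a sketch for debugging output, the name interpolateForDebug is made up, and the string it returns is an approximation that should never be sent to the database.

```php
<?php
// Hypothetical debug helper: fills ? markers with values in one pass.
function interpolateForDebug(string $query, array $params): string
{
    $params = array_values($params);
    $i = 0;
    // Match either a whole quoted literal (left untouched) or a bare ? marker.
    return preg_replace_callback(
        "/'(?:[^']|'')*'|\\?/",
        function (array $m) use ($params, &$i): string {
            if ($m[0] !== '?' || !array_key_exists($i, $params)) {
                return $m[0];
            }
            $value = $params[$i++];
            if ($value === null) {
                return 'NULL';
            }
            if (is_int($value) || is_float($value)) {
                return (string) $value;
            }
            // Double any single quotes so the output stays readable SQL.
            return "'" . str_replace("'", "''", (string) $value) . "'";
        },
        $query
    );
}

echo interpolateForDebug(
    "SELECT * FROM foo WHERE bar = ? AND baz = ?",
    ['xyz', '123']
), "\n";
```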
I always check/limit/cleanup the user variables I use in database queries
Like so:
$pageid = preg_replace('/[^a-z0-9_]+/i', '', $urlpagequery); // urlpagequery comes from a GET var
$sql = 'SELECT something FROM sometable WHERE pageid = "'.$pageid.'" LIMIT 1';
$stmt = $conn->query($sql);
if ($stmt && $stmt->num_rows > 0) {
    $row = $stmt->fetch_assoc();
    // do something with the database content
}
I don't see how using prepared statements or further escaping improves anything in that scenario? Injection seems impossible here, no?
I have tried messing with prepared statements.. and I kind of see the point, even though it takes much more time and thinking (sssiissisis etc.) to code even just half-simple queries.
But as I always cleanup the user input before DB interaction, it seems unnecessary
Can you enlighten me?
You will be better off using prepared statements consistently.
Regular expressions are only a partial solution, but not as convenient or as versatile. If your variables don't fit a pattern that can be filtered with a regular expression, then you can't use them.
All the "ssisiisisis" stuff is an artifact of Mysqli, which IMHO is needlessly confusing.
I use PDO instead:
$sql = 'SELECT something FROM sometable WHERE pageid = ? LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute(array($pageid));
See? No need for regexp filtering. No need for quoting or breaking up the string with . between the concatenated parts.
It's easy in PDO to pass an array of variables, then you don't have to do tedious variable-binding code.
PDO also supports named parameters, which can be handy if you have an associative array of values:
$params = array("pageid"=>123, "user"=>"Bill");
$sql = 'SELECT something FROM sometable WHERE pageid = :pageid AND user = :user LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute($params);
If you enable PDO exceptions, you don't need to test whether the query succeeds. You'll know if it fails because the exception is thrown (FWIW, you can enable exceptions in Mysqli too).
You don't need to test for num_rows(), just put the fetching in a while loop. If there are no rows to fetch, then the loop stops immediately. If there's just one row, then it loops one iteration.
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    // do something with the database content
}
Prepared statements are easier and more flexible than filtering and string-concatenation, and in some cases they are faster than plain query() calls.
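Putting those pieces together, a small self-contained sketch might look like this. It assumes the pdo_sqlite driver is installed and uses an in-memory SQLite database purely as a stand-in, so the table contents and the fetchSomething helper are invented for the demonstration; with MySQL only the DSN passed to new PDO would differ.

```php
<?php
// Assumption: pdo_sqlite is available. The in-memory database is a stand-in.
$conn = new PDO('sqlite::memory:');
// Make PDO throw PDOException on errors instead of returning false.
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Made-up table and row, just so the query has something to find.
$conn->exec('CREATE TABLE sometable (pageid TEXT, something TEXT)');
$conn->exec("INSERT INTO sometable (pageid, something) VALUES ('home', 'Welcome')");

// Hypothetical helper wrapping prepare/execute/fetch.
function fetchSomething(PDO $conn, string $pageid): array
{
    $stmt = $conn->prepare('SELECT something FROM sometable WHERE pageid = ? LIMIT 1');
    $stmt->execute([$pageid]);
    $found = [];
    // No num_rows check needed: the loop simply does not run when nothing matches.
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        $found[] = $row['something'];
    }
    return $found;
}

print_r(fetchSomething($conn, 'home'));
// The injection attempt is treated as plain data and matches nothing.
print_r(fetchSomething($conn, "home' OR '1'='1"));

try {
    $conn->prepare('SELECT something FROM no_such_table');
} catch (PDOException $e) {
    echo 'Query failed: ', $e->getMessage(), "\n";
}
```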
The question would be how you defined "improve" in this context. In this situation I would say that it makes no difference to the functionality of the code.
So what is the difference to you? You say that this is easier and faster for you to write. That might be the case but is only a matter of training. Once you're used to prepared statements, you will write them just as fast.
The difference to other programmers? The moment you share this code, it will be difficult for the other person to fully understand as prepared statements are kind of standard (or in a perfect world would be). So by using something else it makes it in fact harder to understand for others.
Talking more about this little piece of code makes no sense, as in fact it doesn't matter, it's only one very simple statement. But imagine you write a larger script, which will be easier to read and modify in the future?
$id = //validate int
$name = //validate string
$sometext = //validate string with special rules
$sql = 'SELECT .. FROM foo WHERE foo.id = '.$id.' AND name="'.$name.'" AND sometext LIKE "%'.$sometext.'%"';
You will always need to ask yourself: Did I properly validate all the variables I am using? Did I make a mistake?
Whereas when you use code like this
$sql = $db->prepare('SELECT .. FROM foo WHERE foo.id = :id AND name = :name AND sometext LIKE :sometext');
$sql->execute(array(
    ':id' => $id,
    ':name' => $name,
    ':sometext' => '%' . $sometext . '%', // wildcards go in the bound value; placeholders inside quotes are not replaced
));
No need to worry about whether you did everything right, because PHP will take care of this for you.
Of course this isn't a complex query as well, but having multiple variables should demonstrate my point.
So my final answer is: if you are the perfect programmer who never forgets anything, never makes mistakes and works alone, do as you like. But if you're not, I would suggest using standards, as they exist for a reason. It is not that you cannot properly validate all variables, but that you should not need to.
Prepared statements can sometimes be faster. But from the way you ask the question I would assume that you are in no need of them.
So how much extra performance can you get by using prepared statements ? Results can vary. In certain cases I’ve seen 5x+ performance improvements when really large amounts of data needed to be retrieved from localhost – data conversion can really take most of the time in this case. It could also reduce performance in certain cases because if you execute query only once extra round trip to the server will be required, or because query cache does not work.
Brought to you faster by http://www.mysqlperformanceblog.com/
I don't see how using prepared statements or further escaping improves anything in that scenario?
You're right it doesn't.
P.S. I down voted your question because there seems little research made before you asked.
This snippet works just fine until it gets to a database entry with an apostrophe. I see that I need to escape these after I pull them. Being new to PHP I'm not sure what to do with all this info about PDO and "->" and mysqli_real_escape_string(). I'm a little confused by it all. How do I escape $team1rows and $team2rows so I can pass them back to my page? Thanks.
$team1rows = mysqli_num_rows(mysqli_query($connection,"SELECT * FROM $page WHERE vote = '$team1'"));
$team2rows = mysqli_num_rows(mysqli_query($connection,"SELECT * FROM $page WHERE vote = '$team2'"));
echo $team1rows . "|" . $team2rows;
The echo works fine until it hits an apostrophe.
Use PDO instead; prepared statements prevent SQL injection when used correctly. Building the query string by hand and passing it to mysqli_query is the old method that shouldn't be used any longer. See the example below:
<?php
$pdo = new PDO('sqlite:users.db');
$stmt = $pdo->prepare('SELECT name FROM users WHERE id = :id');
$stmt->bindParam(':id', $_GET['id'], PDO::PARAM_INT); // <-- Automatically sanitized by PDO
$stmt->execute();
Source:
http://www.phptherightway.com/#databases
The -> means you're dealing with an object's method.
So: $pdo is an object
Then you can access the methods(functions) within it using the ->
$pdo->prepare for example
tldr; use placeholders (aka parameterized queries / prepared statements). This eliminates SQL injection through data values, and it also stops queries from breaking by accident when the data contains apostrophes!
Since mysqli supports placeholders, there isn't a need to switch to PDO! I've left the code in the non-OOP mysqli syntax, although I recommend using the mysqli-object API. The following code does not perform any "escaping" - if such is done, it is merely an implementation detail.
# Create prepared statement, bind parameters - no apostrophe-induced error!
# - With placeholders there is no need to worry about quoting at all
# - I recommend explicitly selecting columns
# - $page is NOT data in the query and cannot be bound in a placeholder
$stmt = mysqli_prepare($connection, "SELECT * FROM $page WHERE vote = ?");
mysqli_stmt_bind_param($stmt, "s", $vote);
# Execute prepared statement
$ok = mysqli_stmt_execute($stmt);
# Use results somehow;
# Make sure to check $ok/execution for errors!
# (PDO is nice because it allows escalation of query errors to exceptions.)
# mysqli_stmt_execute() only returns true/false, so buffer the rows and count them on the statement
mysqli_stmt_store_result($stmt);
$count = mysqli_stmt_num_rows($stmt);
Now, as per above, $page can't be bound in a placeholder because it relates to the query shape but is not data. One acceptable method to approach this particular case - if not redesigning the schema in general - is to use a whitelist approach. For instance,
$approvedPages = array("people", "tools", "pageX");
if (!in_array($page, $approvedPages)) {
    # Wasn't a known page - might have been something mischievous!
    # Choose default approved page or throw error or something.
    $page = $approvedPages[0];
}
I figured out what I was doing wrong. Amateur mistake on my part. I was trying to escape the results of the query, but in this case the result is the NUMBER of rows, so escaping the result wasn't the problem, since it was just a number. I was confused because that number was associated with a value that had an apostrophe in it. The string used in the query hadn't been escaped, and that was what caused it to fail, not the result.
So by FIRST escaping the variables I was using in the query like this...
$page = mysqli_real_escape_string($connection, $page);
$team1 = mysqli_real_escape_string($connection, $team1);
$team2 = mysqli_real_escape_string($connection, $team2);
That gave me good strings to use for the following queries.
$team1number = mysqli_num_rows(mysqli_query($connection,"SELECT * FROM $page WHERE vote = '$team1'"));
$team2number = mysqli_num_rows(mysqli_query($connection,"SELECT * FROM $page WHERE vote = '$team2'"));
echo $team1number . "|" . $team2number;
Again, I had it all backwards and was trying to escape the results. Noob move on my part but learned a lot thanks to this discovery. Thanks all.
I have some really funky code. As you can see from the code below, I have a series of filters that I add to the query. Would it be easier to just have multiple queries, each with its own set of filters, and store the results in an array, or to keep this mess?
Does anyone have a better solution to this mess? I need to be able to filter by keyword and item number, and it needs to be able to filter using multiple values without knowing which is which.
//Prepare filters and values
$values = array();
$filters = array();
foreach ($item_list as $item) {
    $filters[] = "ItemNmbr = ?";
    $filters[] = "ItemDesc LIKE ?";
    $filters[] = "NoteText LIKE ?";
    $values[] = $item;
    $values[] = '%' . $item . '%';
    $values[] = '%' . $item . '%';
}
//Prepare the query
$sql = sprintf(
    "SELECT ItemNmbr, ItemDesc, NoteText, Iden, BaseUOM FROM ItemMaster WHERE %s LIMIT 21",
    implode(" OR ", $filters)
);
//Set up the types
$types = str_repeat("s", count($filters));
array_unshift($values, $types);
//Execute it
$state = $mysqli->stmt_init();
$state->prepare($sql) or die ("Could not prepare statement:" . $mysqli->error);
call_user_func_array(array($state, "bind_param"), $values);
$state->bind_result($ItemNmbr, $ItemDesc, $NoteText, $Iden, $BaseUOM);
$state->execute() or die ("Could not execute statement");
$state->store_result();
I don't see anything particularly monstrous about your query.
The only thing I would do differently is separate the search terms.
That is, $item_list could be split into numeric items and text items.
Then you could make the search something like:
...WHERE ItemNmbr IN (number1, number2, number3) OR ItemDesc LIKE ... ($text_items go here)
IN is a lot more efficient, and if your $item_list doesn't contain any text part, then you are just searching a bunch of numbers, which is really fast.
If you are using a lot of LIKEs in your query, you should also consider using MySQL full-text searching.
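As a rough illustration of that split, the sketch below builds the SQL and the parameter list in one place. The function name buildItemSearch is made up, and it assumes that item numbers are purely numeric strings, which may not hold for your data; anything else is treated as a keyword for the two LIKE columns from the question.

```php
<?php
// Hypothetical helper: builds the query text and its parameter list.
// Assumption: item numbers consist of digits only.
function buildItemSearch(array $terms): array
{
    $numbers = [];
    $likeClauses = [];
    $likeParams = [];
    foreach ($terms as $term) {
        $term = trim((string) $term);
        if ($term === '') {
            continue; // skip empty search terms
        }
        if (preg_match('/^\d+$/', $term)) {
            $numbers[] = $term;
        } else {
            $likeClauses[] = 'ItemDesc LIKE ?';
            $likeClauses[] = 'NoteText LIKE ?';
            $likeParams[] = '%' . $term . '%';
            $likeParams[] = '%' . $term . '%';
        }
    }
    $clauses = $likeClauses;
    if ($numbers) {
        // One placeholder per number, all inside a single IN (...).
        $marks = implode(', ', array_fill(0, count($numbers), '?'));
        array_unshift($clauses, "ItemNmbr IN ($marks)");
    }
    if (!$clauses) {
        return ['', []]; // nothing to search for
    }
    $sql = 'SELECT ItemNmbr, ItemDesc, NoteText, Iden, BaseUOM FROM ItemMaster WHERE '
        . implode(' OR ', $clauses) . ' LIMIT 21';
    // Parameter order matches placeholder order: numbers first, then keywords.
    return [$sql, array_merge($numbers, $likeParams)];
}

[$sql, $params] = buildItemSearch(['1001', 'blue widget', '2002']);
echo $sql, "\n";
print_r($params);
```

With mysqli on PHP 5.6 or later, the parameters can then be bound with argument unpacking, for example $state->bind_param(str_repeat('s', count($params)), ...$params), which should also sidestep the by-reference trouble that call_user_func_array has with bind_param.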
Your answer depends on what exactly you need. The advantage of using only one query is resource use: one query takes only one connection and one round trip to the SQL server. And depending on what exactly you are attempting to do, it might take less SQL power to do it in one statement than in several.
However, it might be more practical from a programmer's point of view to use a few less complex SQL statements that take less effort to write than one large one. Remember, ultimately you are programming this and you need to make it work. It might not really make a difference whether the work happens in the script or in SQL. Only you can make the ultimate call on which is more important. I would generally recommend SQL processing over script processing when dealing with large databases.
A single-table query can be cached by the SQL engine. In MySQL, a cached result for a join is thrown away as soon as any of the joined tables changes, so joins benefit less from caching. The general rule for performance is to use joins only when necessary. This encourages the DB engine to cache table indexes aggressively and also makes your code easier to adapt to (faster) object databases, like the ones Amazon/Google cloud services force you to use.