I have some really funky code. As you can see below, I have a series of filters that I add to a query. Would it be easier to run multiple queries, each with its own set of filters, and store the results in an array, or keep this mess?
Does anyone have a better solution? I need to be able to filter by keyword and item number, using multiple values at once, without knowing which value is which.
// Prepare filters and values
$values = array();
$filters = array();
foreach ($item_list as $item) {
    $filters[] = "ItemNmbr = ?";
    $filters[] = "ItemDesc LIKE ?";
    $filters[] = "NoteText LIKE ?";
    $values[] = $item;
    $values[] = '%' . $item . '%';
    $values[] = '%' . $item . '%';
}

// Prepare the query
$sql = sprintf(
    "SELECT ItemNmbr, ItemDesc, NoteText, Iden, BaseUOM FROM ItemMaster WHERE %s LIMIT 21",
    implode(" OR ", $filters)
);

// Set up the types
$types = str_repeat("s", count($filters));
array_unshift($values, $types);

// Execute it
$state = $mysqli->stmt_init();
$state->prepare($sql) or die("Could not prepare statement: " . $mysqli->error);
call_user_func_array(array($state, "bind_param"), $values);
$state->bind_result($ItemNmbr, $ItemDesc, $NoteText, $Iden, $BaseUOM);
$state->execute() or die("Could not execute statement");
$state->store_result();
I don't see anything particularly monstrous about your query.
The only thing I would do differently is separate the search terms, i.e. $item_list could be split into numeric items and text items. Then you could make the search something like:
...WHERE ItemNmbr IN (number1, number2, number3) OR ItemDesc LIKE ... $text_items go here ...
IN is a lot more efficient, and if your $item_list doesn't contain any text part, then you are just searching a bunch of numbers, which is really fast.
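As a rough sketch of that split (hedged: it reuses the $item_list, $values and $filters conventions from the question, and assumes numeric detection via is_numeric):

// Split the terms into numeric and text items
$numbers = array();
$texts = array();
foreach ($item_list as $item) {
    if (is_numeric($item)) {
        $numbers[] = $item;
    } else {
        $texts[] = $item;
    }
}

$filters = array();
$values = array();

// A single IN (...) covers every numeric item
if ($numbers) {
    $filters[] = 'ItemNmbr IN (' . implode(',', array_fill(0, count($numbers), '?')) . ')';
    $values = array_merge($values, $numbers);
}

// Text items still need the LIKE clauses
foreach ($texts as $text) {
    $filters[] = 'ItemDesc LIKE ?';
    $filters[] = 'NoteText LIKE ?';
    $values[] = '%' . $text . '%';
    $values[] = '%' . $text . '%';
}

// Note: bind with str_repeat('s', count($values)) here; unlike the original,
// the filter count and value count no longer match.
$sql = 'SELECT ItemNmbr, ItemDesc, NoteText, Iden, BaseUOM FROM ItemMaster WHERE '
     . implode(' OR ', $filters) . ' LIMIT 21';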
Now for the next part: if you are using a lot of LIKEs in your query, maybe you should consider MySQL full-text searching.
Your answer depends on what exactly you need. The advantage of using only one query is resource use: one query takes only one connection and one round trip to the SQL server. And depending on what exactly you are attempting to do, it might take less SQL work to do it in one statement than in several.
However, it might be more practical from a programmer's point of view to use a few less complex SQL statements that take less effort to create than one large one. Remember, ultimately you are programming this and you need to make it work. It might not really make a difference: script processing vs. SQL processing. Only you can make the ultimate call about which is more important. I would generally recommend SQL processing over script processing when dealing with large databases.
A single-table query can be cached by the SQL engine; MySQL and its ilk do not cache joined tables. The general rule for performance is to use joins only when necessary. This encourages the DB engine to cache table indexes aggressively, and it also makes your code easier to adapt to (faster) object databases, as Amazon/Google cloud services force you to do.
I have the following simple search query code:
function explode_search($squery, $column, $db, $link) {
    global $conn;
    $ven = explode(' ', safeInput(str_replace(',', '', $squery)));
    $ven2 = array_map('trim', $ven);
    $qy = ''.$column.' LIKE "%'.implode('%" AND '.$column.' LIKE "%', $ven2).'%"';
    $query = 'SELECT DISTINCT '.$column.', id, work_composer FROM '.$db.' WHERE '.$qy.' ORDER BY '.$column.' LIMIT 100';
    $result = mysqli_query($conn, $query);
    while ($row = mysqli_fetch_assoc($result)) {
        echo '<div><span class="mdmtxt" style="margin-bottom:5px;">'.$row[$column].'</span> <span class="mdmtxt" style="opacity:0.6;">('.fixcomp(cfid($row['work_composer'], 'cffor_composer', 'composer_name')).')</span></div>';
    }
}
(The safeInput function removes ' and " and other potentially problematic characters.)
It works alright up to a point.
When someone looks for 'Stephane' I want them also to find 'Stéphane' (and vice versa) or if they are looking for 'Munich', 'Münich' should show up in the list as well.
Is there a way to make MySQL match those search queries, irrespective of the special characters involved?
You want to use what's called a "Parameterized Query". Most languages have them, and they are used to safeguard input and protect from attacks.
They're also extremely easy to use after you get used to them.
You start out with a simple query like this
$stmt = $mysqli->prepare("SELECT District FROM City WHERE Name=?")
and you replace all your actual data with ?s.
Then you bind each parameter.
$stmt->bind_param("s", $city);
Then call $stmt->execute();
More information can be found here: http://php.net/manual/en/mysqli.prepare.php
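Putting those steps together, a minimal end-to-end sketch (assuming $mysqli is a connected mysqli instance, and reusing the City example from above):

$city = "Munich";
// prepare once, bind the parameter, execute, then read the results
$stmt = $mysqli->prepare("SELECT District FROM City WHERE Name = ?");
$stmt->bind_param("s", $city);
$stmt->execute();
$stmt->bind_result($district);
while ($stmt->fetch()) {
    echo $district . "\n";
}
$stmt->close();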
One time in college, our professor made us use a library that didn't have parameterized queries, so I wrote an implementation myself. Doctrine, which is pretty awesome, handles all this for you. I would always rely on someone else's implementation for this stuff instead of writing my own; you'll get into trouble that way. There's also a lot written about not inventing your own implementations, which is basically what you'd be doing: they have problems, they need testing, and this kind of thing is well tested in established libraries, not in yours.
What is the best way to do a multi-row insert into the database?
E.g. I have an array and I would like to add every element to the database. I could create a foreach loop and insert the rows one at a time.
$array=['apple','orange'];
foreach ($array as $v) {
    $stmt = $db->exec("INSERT INTO test (fruit) VALUES ('$v')");
}
And it works, but maybe I should use a transaction? Or do it some other way?
Use a prepared statement.
$sql = "INSERT INTO test (fruit) VALUES ";
$sql .= implode(', ', array_fill(0, count($array), '(?)'));
$stmt = $db->prepare($sql);
$stmt->execute($array);
The SQL will look like:
INSERT INTO test (fruit) VALUES (?), (?), (?), ...
where there are as many (?) as the number of elements in $array.
Doing a single query with many VALUES is much more efficient than performing separate queries in a loop.
If you have an associative array with input values for a single row, you can use a prepared query like this:
$columns = implode(',', array_keys($array));
$placeholders = implode(', ', array_fill(0, count($array), '?'));
$sql = "INSERT INTO test($columns) VALUES ($placeholders)";
$stmt = $db->prepare($sql);
$stmt->execute(array_values($array));
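For example, with a hypothetical input row like the one below, the generated statement is INSERT INTO test(fruit,color) VALUES (?, ?). Note the column names come from the array keys, so the keys must never be taken from user input:

$array = array('fruit' => 'apple', 'color' => 'red'); // hypothetical row
// the snippet above then prepares: INSERT INTO test(fruit,color) VALUES (?, ?)
// and executes it with array('apple', 'red')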
The way you've done it is, in many ways, the worst option. So the good news is that any other way of doing it will probably be better. As it stands, the code may fail depending on what's in the data; consider:
$v="single ' quote";
$stmt = $db->exec("Insert into test(fruit) VALUES ('$v')");
But without knowing what your criteria are for "best", it's rather hard to advise. Using a parametrised query with data binding, or as it is often described a "prepared statement", is one solution to the problem described above. Escaping the values appropriately before interpolating them into the string (which is how most PHP implementations of data binding work behind the scenes) is another common solution.
Leaving aside the question of how you get the parameters into the SQL statement, then there is the question of performance. Each round trip to the database has a cost associated with it. And doing a single insert at a time also has a performance impact - for each query, the DBMS must parse the query, apply the appropriate concurrency controls, execute the query, apply the writes to the journal, then to the data tables and indexes then tidy up before it can return the thread of execution back to PHP to construct the next query.
Wrapping multiple queries in a transaction (you are using transactions already - but they are implicit and applied to each statement) can reduce some of the overhead I have described here, but can introduce other problems, the nature of which depends on which concurrency model your DBMS uses.
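As a sketch of that explicit-transaction variant (assuming $db is the PDO connection from the question, with exceptions enabled):

$db->beginTransaction();
try {
    // one prepared statement, reused for every row
    $stmt = $db->prepare('INSERT INTO test (fruit) VALUES (?)');
    foreach ($array as $v) {
        $stmt->execute(array($v));
    }
    $db->commit(); // one commit for the whole batch
} catch (Exception $e) {
    $db->rollBack(); // undo the partial batch on any failure
    throw $e;
}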
To get the data into the database as quickly as possible, while minimising index fragmentation, the "best" solution is to batch up multiple inserts:
$q="INSERT INTO test (fruit) VALUES ";
while (count($array)) {
$s=$q;
$j='';
for ($x=0; $x<count($array) && $x<CHUNKSIZE; $x++) {
$s.=$j." ('" . mysqli_real_escape_string($db,
array_shift($array)) . "')";
$j=',';
}
mysqli_query($db,$s);
}
This is similar to Barmar's method, but I think easier to understand when you are working with more complex record structures, and it won't break with very large input sets.
I'm a novice programmer, and I've inherited an application designed and built by a person who has now left the company. It's done in PHP and SQL Server 2008R2. In this application, there's a page with a table displaying a list of items, populated from the database, with some options for filters in a sidebar - search by ID, keyword, date etc. This table is populated by a mammoth query, and the filters are applied by concatenating them into said query. For example, if someone wanted item #131:
$filterString = "Item.itemID = 131";
$filter = " AND " . $filterString;
SELECT ...
FROM ...
WHERE ...
$filter
The filter is included on the end of the URL of the search page. This isn't great, and I'm fairly sure there are some SQL injection vulnerabilities as a result, but it is extremely flexible: the filter string is created before it's concatenated, and can hold lots of different conditions. E.g. $filterString could be "condition AND condition AND condition OR condition".
I've been looking into Stored Procedures, as a better way to counter the issue of SQL Injection, but I haven't had any luck working out how to replicate this same level of flexibility. I don't know ahead of time which of the filters (if any) will be selected.
Is there something I'm missing?
Use either Mysqli or PDO, which support prepared/parameterized queries, to battle SQL injection. In PDO this could look something like this:
$conditions = '1=1'; // neutral base clause, so each appended AND yields valid SQL
$params = array();
if (isset($form->age)) {
    $conditions .= ' AND user.age > ?';
    $params[] = $form->age;
}
if (isset($form->brand)) {
    $conditions .= ' AND car.brand = ?';
    $params[] = $form->brand;
}
$sql = "
    SELECT ...
    FROM ...
    LEFT ...
    WHERE $conditions
";
$sth = $dbh->prepare($sql);
$sth->execute($params);
$result = $sth->fetchAll();
From the manual:
Calling PDO::prepare() and PDOStatement::execute() for statements that will be issued multiple times with different parameter values optimizes the performance of your application by allowing the driver to negotiate client and/or server side caching of the query plan and meta information, and helps to prevent SQL injection attacks by eliminating the need to manually quote the parameters.
http://no1.php.net/manual/en/pdo.prepare.php
I always check/limit/clean up the user variables I use in database queries.
Like so:
$pageid = preg_replace('/[^a-z0-9_]+/i', '', $urlpagequery); // $urlpagequery comes from a GET var
$sql = 'SELECT something FROM sometable WHERE pageid = "'.$pageid.'" LIMIT 1';
$stmt = $conn->query($sql);
if ($stmt && $stmt->num_rows > 0) {
    $row = $stmt->fetch_assoc();
    // do something with the database content
}
I don't see how using prepared statements or further escaping improves anything in that scenario? Injection seems impossible here, no?
I have tried messing with prepared statements, and I kind of see the point, even though it takes much more time and thinking ("sssiissisis" etc.) to code even just half-simple queries.
But as I always clean up the user input before DB interaction, it seems unnecessary.
Can you enlighten me?
You will be better off using prepared statements consistently.
Regular expressions are only a partial solution, and not as convenient or as versatile. If your variables don't fit a pattern that can be filtered with a regular expression, then you can't use them.
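A concrete example of that limitation, using the whitelist from your question: it silently mangles legitimate input rather than rejecting it (the value here is hypothetical):

$urlpagequery = "o'brien-page"; // hypothetical user input
$pageid = preg_replace('/[^a-z0-9_]+/i', '', $urlpagequery);
echo $pageid; // "obrienpage" - safe, but no longer what the user asked for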
All the "ssisiisisis" stuff is an artifact of Mysqli, which IMHO is needlessly confusing.
I use PDO instead:
$sql = 'SELECT something FROM sometable WHERE pageid = ? LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute(array($pageid));
See? No need for regexp filtering. No need for quoting or breaking up the string with . between the concatenated parts.
It's easy in PDO to pass an array of variables, then you don't have to do tedious variable-binding code.
PDO also supports named parameters, which can be handy if you have an associative array of values:
$params = array("pageid"=>123, "user"=>"Bill");
$sql = 'SELECT something FROM sometable WHERE pageid = :pageid AND user = :user LIMIT 1';
$stmt = $conn->prepare($sql);
$stmt->execute($params);
If you enable PDO exceptions, you don't need to test whether the query succeeds. You'll know if it fails because the exception is thrown (FWIW, you can enable exceptions in Mysqli too).
You don't need to test for num_rows(), just put the fetching in a while loop. If there are no rows to fetch, then the loop stops immediately. If there's just one row, then it loops one iteration.
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    // do something with the database content
}
Prepared statements are easier and more flexible than filtering and string-concatenation, and in some cases they are faster than plain query() calls.
The question is how you define "improve" in this context. In this situation I would say it makes no difference to the functionality of the code.
So what is the difference to you? You say that this is easier and faster for you to write. That might be the case, but it is only a matter of training: once you're used to prepared statements, you will write them just as fast.
What about the difference to other programmers? The moment you share this code, it will be harder for another person to fully understand, because prepared statements are the de facto standard (or in a perfect world would be); by using something else you make your code harder for others to follow.
Talking more about this little piece of code makes little sense, as it's only one very simple statement. But imagine you write a larger script: which version will be easier to read and modify in the future?
$id = /* validate int */;
$name = /* validate string */;
$sometext = /* validate string with special rules */;
$sql = 'SELECT .. FROM foo WHERE foo.id = '.$id.' AND name = "'.$name.'" AND sometext LIKE "%'.$sometext.'%"';
You will always need to ask yourself: Did I properly validate all the variables I am using? Did I make a mistake?
Whereas when you use code like this (note that placeholders cannot sit inside string literals, so the LIKE wildcards go into the bound value):
$stmt = $db->prepare('SELECT .. FROM foo WHERE foo.id = :id AND name = :name AND sometext LIKE :sometext');
$stmt->execute(array(
    ':id' => $id,
    ':name' => $name,
    ':sometext' => '%'.$sometext.'%', // wildcards belong in the value, not the SQL
));
there is no need to worry whether you did everything right, because PDO takes care of it for you.
Of course this isn't a complex query either, but having multiple variables should demonstrate my point.
So my final answer is: if you are the perfect programmer who never forgets or makes mistakes and works alone, do as you like. But if you're not, I would suggest using standards, as they exist for a reason. It is not that you cannot properly validate all variables, but that you should not need to.
Prepared statements can sometimes be faster. But from the way you ask the question I would assume that you are in no need of them.
So how much extra performance can you get by using prepared statements? Results vary. In certain cases I've seen 5x+ performance improvements when really large amounts of data needed to be retrieved from localhost; data conversion can take most of the time there. They can also reduce performance in certain cases: if you execute a query only once, an extra round trip to the server is required, and the query cache does not work with prepared statements.
Brought to you faster by http://www.mysqlperformanceblog.com/
I don't see how using prepared statements or further escaping improves anything in that scenario?
You're right it doesn't.
P.S. I downvoted your question because there seems to have been little research done before you asked.
Suppose I have my 1995-style function meant to send queries to MySQL.
I have lots of queries in my project, and I'm looking for a function/class able to parse a raw query (suppose: SELECT foo from bar where pizza = 'hot' LIMIT 1) and create a prepared statement with PHP. Do you have any tips on that?
Is it worth it? Or is it better to just rewrite all the queries?
I count 424 queries in my project, and that's just SELECTs.
Thanks for any help.
Try this:
function prepare1995Sql_EXAMPLE($sqlString) {
    # regex pattern: match single-quoted string literals
    $patterns = array();
    $patterns[0] = '/\'.*?\'/';
    # best to use question marks for an easy example
    $replacements = array();
    $replacements[0] = '?';
    # perform replace
    $preparedSqlString = preg_replace($patterns, $replacements, $sqlString);
    # grab parameter values
    $pregMatchAllReturnValueHolder = preg_match_all($patterns[0], $sqlString, $grabbedParameterValues);
    $parameterValues = $grabbedParameterValues[0];
    # prepare command:
    echo('$stmt = $pdo->prepare("' . $preparedSqlString . '");');
    echo("\n");
    # binding of parameters (bindValue rather than bindParam: the generated
    # code passes literals, and bindParam needs a variable reference)
    $bindValueCtr = 1;
    foreach ($parameterValues as $key => $value) {
        echo('$stmt->bindValue(' . $bindValueCtr . ', ' . $value . ');');
        echo("\n");
        $bindValueCtr++;
    }
    # if you want to add the execute part, simply:
    echo('$stmt->execute();');
}

# TEST!
$sqlString = "SELECT foo FROM bar WHERE name = 'foobar' or nickname = 'fbar'";
prepare1995Sql_EXAMPLE($sqlString);
Sample output would be:
$stmt = $pdo->prepare("SELECT foo FROM bar WHERE name = ? or nickname = ?");
$stmt->bindValue(1, 'foobar');
$stmt->bindValue(2, 'fbar');
$stmt->execute();
This would probably work if all your SQL statements are similar to the example, with the conditions being strings. However, once you need to match integers, the pattern must be changed. This is what I can do for now; I know it's not the best approach at all, but for a sample's sake, give it a try :)
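If you want to experiment with also catching bare integers, one possible extension of the pattern is sketched below; be warned that it would also (wrongly) turn numbers in LIMIT or OFFSET clauses into parameters, so treat it as a starting point only:

# matches quoted strings, or integers not embedded in words or quotes
$patterns[0] = '/\'.*?\'|(?<![\w\'])\d+(?!\w)/';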
I would recommend a regexp search for these queries (I think they should follow a pattern), then sorting them to see which ones are similar or could be grouped.
Also, if you have some kind of log, check which ones are executed most frequently; it doesn't make much sense to move rare queries to prepared statements.
Honestly, you should rewrite your queries. Using regular expressions would work, but you might find that some queries can't be handled by a pattern; there's a lot of complexity in queries for just one pattern to parse them all. It would also be best practice, and more consistent for your code, to simply do the work and rewrite your queries.
Best of luck!
You might want to enable a trace facility and capture the SQL commands as they are sent to the database. Be forewarned: what you are about to see may scare the pants off you. :)
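For MySQL, one way to capture the statements is the general query log; a sketch (requires a privileged account, and the log file path is illustrative):

// turn on MySQL's general query log, exercise the app, then turn it off
$mysqli->query("SET GLOBAL general_log_file = '/tmp/all-queries.log'");
$mysqli->query("SET GLOBAL general_log = 'ON'");
// ... run the application and inspect the log ...
$mysqli->query("SET GLOBAL general_log = 'OFF'");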