Spending time optimizing an SQL query vs. pushing unnecessary data to the database - PHP

I have a considerably large table in my database, and I want a function that, for example, performs an UPDATE query.
What I used to do in my older projects was pass all the values for all the columns and insert them into the query string, like in this example:
function updateUser($id, $name, $username){
    $query = "UPDATE user SET name = '{$name}', username = '{$username}' WHERE id = '{$id}' ";
    return mysqli_query($this->conn, $query);
}
Meaning, every column was being written, even those that hadn't changed.
But this time, with a big table, I don't want to sacrifice application speed.
What I'm trying to do is make some comparisons first, thereby optimizing the query, and only then send it to the database.
Staying in the UPDATE query example from before, this is kind of what I want to do:
function updateUser($old_user, $new_user, $user_id){
    $changed = false;

    $oldFirstName = $old_user->getFirstName();
    $newFirstName = $new_user->getFirstName();
    if($oldFirstName == $newFirstName){
        $firstNameQuery = "";
    }else{
        $firstNameQuery = " first_name = '".mysqli_escape_string($this->conn, $newFirstName)."',";
        $changed = true;
    }

    $oldLastName = $old_user->getLastName();
    $newLastName = $new_user->getLastName();
    if($oldLastName == $newLastName){
        $lastNameQuery = "";
    }else{
        $lastNameQuery = " last_name = '".mysqli_escape_string($this->conn, $newLastName)."',";
        $changed = true;
    }

    $oldEmail = $old_user->getEmail();
    $newEmail = $new_user->getEmail();
    if($oldEmail == $newEmail){
        $emailQuery = "";
    }else{
        $emailQuery = " email = '".mysqli_escape_string($this->conn, $newEmail)."',";
        $changed = true;
    }

    if($changed){
        // Trim the trailing comma left by the last SET fragment, or the SQL is invalid
        $setClause = rtrim($firstNameQuery.$lastNameQuery.$emailQuery, ',');
        $query = "UPDATE user SET{$setClause} WHERE user_id = {$user_id}";
        return mysqli_query($this->conn, $query);
    }else{
        return 0;
    }
}
As you can see, though, as the table grows this function gets bigger, with many more comparisons and assignments.
My question is: am I saving a noticeable amount of time by doing this, or is it not worth it?

You are probably making the code less efficient. Much of the time for an update goes into logging the transaction and physically storing the data page for each record. Because all the columns of a single record are (typically) stored on a single page, it doesn't matter whether you update one column or many.
On the other hand, the additional comparisons in the application also take time.
Of course -- as with any performance related issue -- you can test the different scenarios. I wouldn't expect any noticeable improvement in performance by going through such logic to reduce the number of columns in the update, unless it eliminated entirely the need for updating certain rows.
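To illustrate the recommendation, here is a minimal sketch of the simpler approach: skip the per-column comparisons and update every column through one prepared statement. The column names follow the question's code; passing the connection explicitly as $conn (rather than using $this->conn) is my own simplification.
function updateUser($conn, $user_id, $firstName, $lastName, $email) {
    // One statement rewrites all three columns; rewriting an unchanged
    // column costs essentially nothing, since the whole row lives on one page.
    $stmt = mysqli_prepare($conn,
        "UPDATE user SET first_name = ?, last_name = ?, email = ? WHERE user_id = ?");
    mysqli_stmt_bind_param($stmt, "sssi", $firstName, $lastName, $email, $user_id);
    return mysqli_stmt_execute($stmt);
}
As a bonus, the bound parameters also remove the need for the mysqli_escape_string() calls.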

Related

PHP/MySQL - Run actions from background / Dealing with a longer-running method

I have a form in PHP that saves some data into a database. However, after a successful insert, the PHP function below executes to arrange every entry by a "visible" ID column (called correlativo in this case).
function ordenar_ids()
{
    $sql = "SELECT DISTINCT(DATE_FORMAT(fecha, '%Y')) FROM emergencias ORDER BY fecha";
    $results = mysql_query($sql);
    $anios = [];
    while ($row = mysql_fetch_array($results)) {
        array_push($anios, $row[0]);
    }
    foreach ($anios as $anio)
    {
        $sql = "SELECT id_emergencia FROM emergencias WHERE fecha LIKE '".$anio."%' ORDER BY fecha ASC;";
        $results = mysql_query($sql);
        $correlativo = 1;
        while ($row = mysql_fetch_array($results))
        {
            $sql = "UPDATE emergencias SET correlativo = $correlativo WHERE id_emergencia = '".$row[0]."';";
            mysql_query($sql);
            $correlativo++;
        }
    }
    return 1;
}
Initial (local) tests were successful; however, when we tested this in a production environment we noticed the process takes far too long (about 15 seconds on average), which is detrimental to the user experience.
Is it possible to run this function in the background after the user has submitted the form? If it's not possible, how can we work around this problem?
Additional Data:
The user doesn't need a return value from this function, nor is it needed for the actions that happen afterwards.
You can't easily run a MySQL query in the background, but you could start a new PHP script in the background: http://php.net/manual/de/function.exec.php#86329
Better, though, would be to improve the performance of the script itself. I think you can improve it by using one "update ... select" query, by avoiding the LIKE '...%' filters, or by using MySQL triggers.
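As an illustration of the single-query idea, here is a sketch that renumbers correlativo per year in one pass using MySQL user variables. This is an untested sketch: it keeps the old mysql_* API from the question, the function name is made up, and it relies on left-to-right evaluation of user variables inside a SELECT, which MySQL does not strictly guarantee.
function ordenar_ids_en_una_consulta()
{
    // Reset the session variables used as running counters
    mysql_query("SET @correlativo := 0, @anio := ''");
    // The derived table is materialized, so it is legal to update
    // emergencias while selecting from it in the JOIN
    $sql = "UPDATE emergencias
            JOIN (
                SELECT id_emergencia,
                       @correlativo := IF(@anio = DATE_FORMAT(fecha, '%Y'), @correlativo + 1, 1) AS nuevo,
                       @anio := DATE_FORMAT(fecha, '%Y') AS anio
                FROM emergencias
                ORDER BY fecha
            ) AS ordenado USING (id_emergencia)
            SET correlativo = ordenado.nuevo";
    return mysql_query($sql);
}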

Update array data to the database inside a loop

I'm facing a problem updating more than one value with the same name to the database.
// while loop over the query result, printing one input per status row
while ($row1 = mysqli_fetch_assoc($result)) {
    echo "<input name='exists[]' value='{$row1['Status_Name']}'>";
}
Below is how I update the data in the database:
if (isset($_POST["updsts"]))
{
    $gid = $_POST["id"];
    $sqlq = "SELECT * FROM orderstatus WHERE Status_Group = '$gid'";
    $result = mysqli_query($conn, $sqlq);
    $rowcount = mysqli_num_rows($result);
    if ($rowcount == 0)
        echo "No records found";
    else
    {
        $x = '0';
        while ($x < $rowcount)
        {
            $stsname = $_POST["exists[$x]"];
            $sqlu = "UPDATE orderstatus SET
                Status_Name = '$stsname'
                WHERE Status_Group = '$gid'";
            $x++;
        }
    }
}
My $row1[Status_Name] shows all the status names in the table.
You can loop over the post array key:
foreach ($_POST['exists'] as $val) {
// Do your updates here
}
Additionally, this code has a lot of other serious problems you should be aware of.
You're not escaping any data used in the context of HTML. Use htmlspecialchars() around any arbitrary data you're concatenating into HTML. Without this, you risk creating invalid HTML as well as potential security issues with injected scripts.
You're not escaping any data used in your queries! As it stands right now, pretty much anyone can do whatever they want with your database data. Automated bots hit these sort of scripts and exploit them all the time. Use parameterized queries, always. Never concatenate data into the context of a query!
There's no need for this select and then update loop. Just use one update.
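Putting the last two points together, here is a sketch: one prepared UPDATE executed per posted value, with no preliminary SELECT. It assumes the form also posts a parallel hidden ids[] array and that the table has a per-row key, here called Status_ID; both are made-up names, since updating by Status_Group alone would overwrite every row in the group with the same value.
$stmt = mysqli_prepare($conn, "UPDATE orderstatus SET Status_Name = ? WHERE Status_ID = ?");
foreach ($_POST['exists'] as $i => $stsname) {
    $stsid = (int)$_POST['ids'][$i]; // hypothetical parallel ids[] field
    mysqli_stmt_bind_param($stmt, "si", $stsname, $stsid);
    mysqli_stmt_execute($stmt);
}
mysqli_stmt_close($stmt);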

Insert If Not Exists and Get Insert ID in PHP & MySQL

I'm developing a PHP script and I just want to know if I can make this piece of code perform better.
I have to make two MySQL queries to complete my task. Is there another way to complete this with better performance?
$language = "en";
$checkLanguage = $db->query("select id from language where shortName='$language'")->num_rows;
if($checkLanguage>0){
$languageSql = $db->query("select id from language where shortName='$language'")->fetch_assoc();
$languageId = $languageSql['id'];
}else{
$languageSql = $db->query("insert into language(shortName) values('$language')");
$languageId = $db->insert_id;
}
echo $languageId
You can improve performance by storing the result object in a variable; this way it is one query less:
$checkLanguage = $db->query("select id from language where shortName='$language'");
if ($checkLanguage->num_rows > 0) {
    $languageSql = $checkLanguage->fetch_assoc();
    $languageId = $languageSql['id'];
} else {
    $languageSql = $db->query("insert into language(shortName) values('$language')");
    $languageId = $db->insert_id;
}
echo $languageId;
Or, as a second option, you can add a UNIQUE constraint on language.shortName.
If you insert a duplicate it will throw an error; if not, it will insert. This way you keep only one query, the INSERT, but you might need a try/catch for duplicates.
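A minimal sketch of that second option (the constraint name is made up):
// One-time schema change, run once (e.g. in a migration)
$db->query("ALTER TABLE language ADD CONSTRAINT uq_language_shortname UNIQUE (shortName)");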
Why not just do something like this:
$language = "en";
$dbh = $db->query("INSERT IGNORE INTO `language` (`shortName`) VALUES ('{$language}');");
$id = $db->insert_id;
echo (($id !== 0) ? $id : FALSE);
This will perform your logic in a single query, and return the id, or false on a duplicate. It is generally better to resolve database performance issues in the SQL rather than in PHP, because about 65% of your overhead is in the actual connection to the database, not the query itself. Reducing the number of queries you run typically has a far better impact on performance than improving the scripting logic around them. People who consistently rely on ORMs often have a lot of trouble with this, because cookie-cutter SQL is usually not very performant.

Splitting a string of values like 1030:0,1031:1,1032:2 and storing data in database

I have a bunch of photos on a page and using jQuery UI's Sortable plugin, to allow for them to be reordered.
When my sortable function fires, it writes a new order sequence:
1030:0,1031:1,1032:2,1040:3,1033:4
Each item of the comma delimited string, consists of the photo ID and the order position, separated by a colon. When the user has completely finished their reordering, I'm posting this order sequence to a PHP page via AJAX, to store the changes in the database. Here's where I get into trouble.
I have no problem getting my script to work, but I'm pretty sure it's the incorrect way to achieve what I want, and will suffer hugely in performance and resources - I'm hoping somebody could advise me as to what would be the best approach.
This is my PHP script that deals with the sequence:
if ($sorted_order) {
    $exploded_order = explode(',', $sorted_order);
    foreach ($exploded_order as $order_part) {
        $exploded_part = explode(':', $order_part);
        $part_count = 0;
        foreach ($exploded_part as $part) {
            $part_count++;
            if ($part_count == 1) {
                $photo_id = $part;
            } elseif ($part_count == 2) {
                $order = $part;
            }
            $SQL = "UPDATE article_photos ";
            $SQL .= "SET order_pos = :order_pos ";
            $SQL .= "WHERE photo_id = :photo_id;";
            ... rest of PDO stuff ...
        }
    }
}
My concerns arise from the nested foreach loops, and also from running so many database updates. If a given sequence contained 150 items, would this script cry for help? If so, how could I improve it?
** This is for an admin page, so it won't be heavily abused **
You can use one UPDATE with some clever code, like so:
Create the array $data['order'] in the loop, then:
$q = "UPDATE article_photos SET order_pos = (CASE photo_id ";
foreach($data['order'] as $sort => $id){
$q .= " WHEN {$id} THEN {$sort}";
}
$q .= " END ) WHERE photo_id IN (".implode(",",$data['order']).")";
A little clearer, perhaps:
UPDATE article_photos SET order_pos = (CASE photo_id
    WHEN 1 THEN 999
    WHEN 2 THEN 1000
    WHEN 3 THEN 1001
END)
WHERE photo_id IN (1,2,3)
I use this approach for exactly what you're doing: updating sort orders.
No need for the second foreach: you know it's going to be two parts if your data passes validation (I'm assuming you validated this. If not: you should =) so just do:
if (count($exploded_part) == 2) {
    $id = $exploded_part[0];
    $seq = $exploded_part[1];
    /* rest of code */
} else {
    /* error - data does not conform despite validation */
}
As for update hammering: do your DB updates in a transaction. Your db will queue the ops, but not commit them to the main DB until you commit the transaction, at which point it'll happily do the update "for real" at lightning speed.
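A sketch of the transactional version, reusing the question's named placeholders and assuming a PDO handle in $pdo:
$pdo->beginTransaction();
$stmt = $pdo->prepare("UPDATE article_photos SET order_pos = :order_pos WHERE photo_id = :photo_id");
foreach (explode(',', $sorted_order) as $pair) {
    list($photo_id, $order) = explode(':', $pair);
    // One prepared statement, executed once per photo, committed in one go
    $stmt->execute([':order_pos' => $order, ':photo_id' => $photo_id]);
}
$pdo->commit();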
I suggest making your script even simpler and renaming the variables, so the code is much more readable.
$parts = explode(',', $sorted_order);
foreach ($parts as $part) {
    list($id, $position) = explode(':', $part);
    // Now you can work with $id and $position
}
More info about list: http://php.net/manual/en/function.list.php
Also, about performance and your data structure:
The way you store your data is not perfect, but you will not suffer any performance issues this way; you send less data, so there is less overhead overall.
However, the drawback of this data structure is that you will most probably be unable to establish relationships between tables, make joins, or alter the table structure in a correct way.

Query on a large MySQL database

I've got a script which is supposed to run through a MySQL database and perform a certain 'test' on the cases. Simplified: the database contains records which represent trips made by persons. Each record is a single trip. But I want to use only round trips, so I need to search the database and match two trips to each other: the trip to and the trip from a certain location.
The script is working fine. The problem is that the database contains more than 600,000 cases. I know this should be avoided if possible, but for the purpose of this script and the use of the database records later on, everything has to stick together.
Executing the script takes hours right now, running on my iMac under MAMP. Of course I made sure that it can use a lot of memory, et cetera.
My question is: how could I speed things up? What's the best approach?
Here's the script I have right now:
$table = $_GET['table'];
$output = '';
// Select all cases that have not been marked as invalid in a previous test
$query = "SELECT persid, ritid, vertpc, aankpc, jaar, maand, dag FROM MON.$table WHERE reasonInvalid != '1' OR reasonInvalid IS NULL";
$result = mysql_query($query) or die($output .= mysql_error());
$totalCountValid = 0;
$totalCountInvalid = 0;
$totalCount = 0;
// For each record:
while ($row = mysql_fetch_array($result)) {
    $totalCount += 1;
    // Do another query: get all rows for this person ID that share postal codes.
    // Postal codes are reversed between the two trips.
    $persid = $row['persid'];
    $ritid = $row['ritid'];
    $pcD = $row['vertpc'];
    $pcA = $row['aankpc'];
    $jaar = $row['jaar'];
    $maand = $row['maand'];
    $dag = $row['dag'];
    $thecountquery = "SELECT * FROM MON.$table WHERE persid=$persid AND vertpc=$pcA AND aankpc=$pcD AND jaar = $jaar AND maand = $maand AND dag = $dag";
    $thecount = mysql_num_rows(mysql_query($thecountquery));
    if ($thecount >= 1) {
        // No worries, this person ID has multiple trips attached
        $totalCountValid += 1;
    } else {
        // Ow my, the case is invalid!
        $totalCountInvalid += 1;
        // Call markInvalid() from functions.php
        markInvalid($table, '2', 'ritid', $ritid);
    }
}
// Echo the result
$output .= 'Total cases: '.$totalCount.'<br>Valid: '.$totalCountValid.'<br>Invalid: '.$totalCountInvalid;
echo $output;
Your basic problem is that you are doing the following.
1) Getting all cases that haven't been marked as invalid.
2) Looping through the cases obtained in step 1).
What you can easily do is combine the queries written for 1) and 2) into a single query and loop over the data. This will speed things up.
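As an illustration of that combination, a single self-join can find all trips that lack a matching return trip in one pass. This is an untested sketch built from the question's columns:
// b is the candidate return trip: same person, same date,
// postal codes reversed. Where no b exists, the case is invalid.
$query = "SELECT a.ritid
          FROM MON.$table a
          LEFT JOIN MON.$table b
            ON b.persid = a.persid
           AND b.vertpc = a.aankpc
           AND b.aankpc = a.vertpc
           AND b.jaar = a.jaar AND b.maand = a.maand AND b.dag = a.dag
          WHERE (a.reasonInvalid != '1' OR a.reasonInvalid IS NULL)
            AND b.ritid IS NULL";
// Each ritid returned is an invalid case; mark them all in one pass
// instead of issuing one extra COUNT query per row.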
Also bear in mind the following tips.
1) Selecting all columns is not a good thing to do at all; it takes an ample amount of time for the data to traverse the network. I would recommend replacing the wild-card with the columns you really need.
SELECT <only_the_columns_you_need> instead of SELECT *
2) Use indexes - sparingly, efficiently and appropriately. Understand when to use them and when not to.
3) Use views if you can.
4) Enable MySQL slow query log to understand which queries you need to work on and optimize.
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 1
log-queries-not-using-indexes
5) Use correct MySQL field types and the storage engine (Very very important)
6) Use EXPLAIN to analyze your query. EXPLAIN is a useful MySQL command that can give you great detail about how a query is run, which index is used, how many rows it needs to examine, and whether it needs file sorts, temporary tables, and other nasty things you want to avoid.
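For example, prefixing the question's inner query with EXPLAIN shows whether any index is used at all (a sketch; 'yourtable' stands in for the dynamic table name):
EXPLAIN SELECT * FROM MON.yourtable WHERE persid = 1 AND vertpc = 1000 AND aankpc = 2000 AND jaar = 2012 AND maand = 1 AND dag = 1;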
Good luck.
