Pure SQL vs PHP While Loop To Perform an Update Faster - php

I'm currently working on a system to manage my Magic: The Gathering collection. I've written a script to update pricing for all the cards using a WHILE loop to do the main update, but it takes about 9 hours to update all 28,000 rows on my i5 laptop. I have a feeling the same thing can be accomplished without the while loop using a MySQL query, and that it would be faster.
My script starts off by creating a temporary table with the same structure as my main inventory table, and then copies new prices into the temporary table via a CSV file. I then use a while loop to compare the cards in the temp table to the inventory table by card_name and card_set and do the update.
My question is: would a pure MySQL query be faster than using the while loop, and can you help me construct it? Any help would be much appreciated. Here is my code.
<?php
set_time_limit(0);
echo "Prices Are Updating. This can Take Up To 8 Hours or More";
include('db_connection.php');
mysql_query("CREATE TABLE price_table LIKE inventory;");
//Upload Data
mysql_query("LOAD DATA INFILE 'c:/xampp/htdocs/mtgtradedesig/price_update/priceupdate.csv'
INTO TABLE price_table FIELDS TERMINATED BY ',' ENCLOSED BY '\"' (id, card_name, card_set, price)");
echo mysql_error();
//UPDATE PRICING
//Select everything from the temporary price_table
$sql_price_table = "SELECT * FROM price_table";
$prices = mysql_query($sql_price_table);
//Start while loop to update prices: go through price_table one entry at a time, match the card to inventory, and update its price.
while($cards = mysql_fetch_assoc($prices)){
$card_name = mysql_real_escape_string($cards['card_name']);
$card_set = mysql_real_escape_string($cards['card_set']);
$card_price = $cards['price'];
$foil_price = $cards['price'] * 2;
//Update prices for non-foil cards in inventory
mysql_query("UPDATE inventory SET price='$card_price' WHERE card_name='$card_name' AND card_set='$card_set' AND foil='0'");
//Update prices for foil cards in inventory (foils priced at double)
mysql_query("UPDATE inventory SET price='$foil_price' WHERE card_name='$card_name' AND card_set='$card_set' AND foil='1'");
}
mysql_query("DROP TABLE price_table");
unlink('c:/xampp/htdocs/mtgtradedesign/price_update/priceupdate.csv');
header("Location: http://localhost/mtgtradedesign/index.php");
?>

The easiest remedy is to perform a join between the tables and update all rows at once. You will then only need to run two queries, one for foil and one for non-foil. You can get it down to one, but that gets more complicated.
UPDATE inventory i
JOIN price_table pt
  ON (i.card_name = pt.card_name AND i.card_set = pt.card_set)
SET i.price = pt.price
WHERE i.foil = 0;
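The foil query is the same join with the doubled price (your script prices foils at twice the CSV price):
UPDATE inventory i
JOIN price_table pt
  ON (i.card_name = pt.card_name AND i.card_set = pt.card_set)
SET i.price = pt.price * 2
WHERE i.foil = 1;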
Didn't actually test this, but it should generally be what you're looking for. Also, before running these, try using EXPLAIN to see how bad the join performance will be. You might benefit from adding indexes to the tables.
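For example, a composite index on the join columns should let MySQL match rows without scanning all 28,000 entries on each comparison (a sketch; the index names here are made up, and price_table created with LIKE inventory may already inherit inventory's indexes):
ALTER TABLE inventory ADD INDEX idx_name_set (card_name, card_set);
ALTER TABLE price_table ADD INDEX idx_name_set (card_name, card_set);
-- Newer MySQL versions can EXPLAIN an UPDATE directly; otherwise,
-- EXPLAIN the equivalent SELECT over the same join:
EXPLAIN SELECT i.id
FROM inventory i
JOIN price_table pt
  ON (i.card_name = pt.card_name AND i.card_set = pt.card_set);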
On a side note (and this isn't really your question): mysql_real_escape_string is deprecated, and in general you should not use any of the built-in php mysql_* functions, as they are all known to be unsafe. The PHP docs recommend using PDO instead.

Related

High CPU with mysql query when updating with inner join

I have looked around a lot and tried different methods, and I want to improve my import mechanism for big data. Importing data on insert works great; however, I hit an issue when I want to update existing data based on 2 WHERE conditions.
I first load the data from the source and place it in a CSV file, then use LOAD DATA LOCAL INFILE to import the data into a temp table.
Then I insert from the temp table into the main table as follows, which works as expected: fast, and with a low amount of server resources.
INSERT INTO $table ($fields) SELECT $fields FROM $temptable WHERE (ua,gm_id) NOT IN (SELECT ua,gm_id FROM $table)
I then have the following to update the records. The reason I created this method is that update on duplicate key did not work, as it always inserted a new record. I think I don't understand how that method works, or I have not used it in the right way. Both UA and GM_ID are indexes on both tables, but I can't get that to work. The issue with the script below is that if I update 8000 rows, it uses 200% CPU and takes over 5 to 8 minutes. Which is of course not great.
$query = "UPDATE $table a INNER JOIN $temptable b ON a.gm_id=b.gm_id AND a.ua=b.ua SET ";
foreach($update_columns as $column => $status){
$query .= "a.$column=b.$column,";
}
$query = trim($query, ",");
$result = $pdo->query($query);
Can someone point me in the right direction as to what I should be using?
I want to update certain columns from the temp table to the main table. This code executes a lot of times during the day; sometimes it can update just 100 rows, but sometimes 8k or 60k rows, and the columns can change.
I hope the code samples are clear.
Thanks in advance for assistance.
"Both UA and GM_ID are indexes on both tables" -- Two separate indexes is the wrong approach. You must have a "composite" UNIQUE(UA, GM_ID) (in either order). If that pair is not unique, then you cannot use IODKU.
WHERE .. NOT IN ( SELECT ... ) is very inefficient. WHERE ... NOT EXISTS ( SELECT ... ) is better; LEFT JOIN ... WHERE .. IS NULL is even better. See "SQL #1" in http://mysql.rjweb.org/doc.php/staging_table#normalization
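Applied to the question's insert, the LEFT JOIN form would look roughly like this (a sketch; $table, $temptable and $fields are the question's placeholders, and each column in $fields should be qualified with the t. alias):
INSERT INTO $table ($fields)
SELECT t.ua, t.gm_id /* ...plus the rest of $fields, qualified with t. */
FROM $temptable AS t
LEFT JOIN $table AS m
    ON m.ua = t.ua AND m.gm_id = t.gm_id
WHERE m.ua IS NULL;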
Read the rest of that blog for more tips on high speed ingestion.

Compare and remove rows based on values [MYSQLi]

So this is what I'm trying to do, and I have no idea how to proceed.
There are two tables, db1 and db2. db1 contains two columns, id and PriceMin.
db2 contains the columns id and Price.
db1 is the mother table, in which each id has a set price. This table is static.
db2 is the child table, in which people vote on each item (read: id), and a script uses the AVG() function after matching ids and prices to show data.
Now, since this script asks for votes, it has a huge flaw: it can be manipulated. So I'm trying to have a script run every couple of hours which goes through db2, matches the ids, and deletes each row (vote) from db2 whose price is over 200% of the price for the same id in db1.
My solutions up to now:
Query each table and fetch the results into arrays, then compare them using str_replace and an if check. This does not seem logical nor doable.
Run a check before each POST, which would double the queries to the DB and is not logical either, since every single vote would then require an extra query.
I'm all out of ideas and would really appreciate a guide or advice on which way to go.
Thanks in advance !
EDIT:
Came up with something but it doesn't work:
<?php
include ('go.php');
$sql = "delete
from db2
where db2.id IN (select db1.id where db2.price > db1.price * 2)" or die(mysqli_error($sql));
?>
Try this query:
DELETE db2
FROM db2
JOIN db1 ON db2.id = db1.id
WHERE db2.Price > db1.PriceMin * 2
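Since this permanently removes votes, it may be worth previewing the affected rows first with a SELECT over the same join (a sketch using the column names from the question):
SELECT db2.id, db2.Price, db1.PriceMin
FROM db2
JOIN db1 ON db2.id = db1.id
WHERE db2.Price > db1.PriceMin * 2;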

Multi Table Query

I am looking for some input on querying multiple tables.
I currently have a list which contains the day (each day the reports were made):
$list = mysql_query("SELECT * FROM list ORDER BY id");
while ($row = mysql_fetch_assoc($list)){
$drop_list[] = $row['day'];
}
My end goal is to create a query which checks a unique row from each table.
I was thinking along the lines of something like this:
foreach ($drop_list as $v) {
$daily = mysql_query("SELECT * FROM $v WHERE ID = 1");
while ($row = mysql_fetch_assoc($daily)){
$id = $row['id'];
$name = $row['name'];
$age = $row['age'];
$day = $row['day'];
}
echo "<tr><td>$id</td><td>$name</td><td>$age</td><td>$day</td></tr>";
}
Then I'd put that into a function and echo it out between the table tags.
I am sure the code works (have not tested yet; typing this from a tablet), but I was curious: will using foreach over the array to query the data from the DB and echo it out give me the daily results for that id from each table?
I'm also curious whether others have a different method to accomplish this.
The short answer
Running several SQL queries inside a foreach loop will hurt performance. There is probably a better solution where everything is fetched in one query. Database queries are expensive and should be optimized as much as possible to reduce loading times of the webpage and save resources for other simultaneous requests. You can download MySQL Workbench to help write queries as well as optimize them using the analyzer tool. There are plenty of tutorials on how to use this program around the web.
A possible solution
I assume you know the tables which you want to query, and that the list of them stays the same for long periods of time. I would then fetch everything in one query using multiple SELECT statements and the UNION keyword. This assumes the columns in the different tables are the same; by looking at the code, it seems they all declare the required columns.
SELECT * FROM table1 WHERE id = 1
UNION ALL
SELECT * FROM table2 WHERE id = 1
UNION ALL
SELECT * FROM table3 WHERE id = 1
This will fetch every single row from each of the listed tables where the id equals 1. By appending the ALL keyword to the UNION statement, we ensure that duplicate rows across the tables are also returned.
One big disadvantage of this solution is that we have no reference to which table each row originates from. If that is required, a string literal in each SELECT can carry it (sketched below), but I would still recommend combining the queries into one.
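A sketch of that workaround (assuming, as above, that the column lists line up):
SELECT 'table1' AS source_table, t1.* FROM table1 AS t1 WHERE id = 1
UNION ALL
SELECT 'table2', t2.* FROM table2 AS t2 WHERE id = 1
UNION ALL
SELECT 'table3', t3.* FROM table3 AS t3 WHERE id = 1;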
Important!
Please note that the mysql_* functions are deprecated. They are no longer supported, and security holes are not patched. I strongly recommend switching to the PDO or MySQLi extensions; they provide better security and performance for PHP.
Side note
Looking at your code, I really do not understand why you have several tables all declaring the same columns. This seems redundant to me, but maybe I lack some more insight. It would be more effective to have only one table to maintain.
I hope this can help guide you, happy coding!

MySql Temp Tables VS Views VS php arrays

I have currently created a Facebook-like page that pulls notifications from different tables, let's say about 8 tables. Each table has a different structure with different columns, so the first thing that came to mind was to have a global table, like a table of contents, and refresh it with every new hit. I know inserts are resource-intensive, but I was hoping that since it is a static table I'd only add maybe one new record every 100 visitors, so I thought "MAYBE" I could get away with it. I was wrong: I managed to get deadlocks from just three people hammering the website.
So anyway, now I have to redo it using a different method. Initially I was going to use views, but I have an issue with views: the selected result will have to contain the id of a user. Here is an example of a SELECT statement from PHP:
$get_events = "
SELECT id, " . $userId . ", 'admin_events', 0, event_start_time
FROM admin_events
WHERE CURDATE() < event_start_time AND
NOT EXISTS(SELECT id
FROM admin_event_registrations
WHERE user_id = " . $userId . " AND admin_events.id = event_id) AND
NOT EXISTS(SELECT id
FROM admin_event_declines
WHERE user_id = " . $userId . " AND admin_events.id = event_id) AND
event_capacity > (SELECT COUNT(*) FROM admin_event_registrations WHERE event_id = admin_events.id)
LIMIT 1";
Sorry about the messiness. In any event, as you can see, I need to return the user id from the page as a selected column from the table. I could not figure out how to do that with views, so I don't think views are the way I will be heading, because there are a lot more of these types of queries. I come from an MSSQL background, and I love stored procedures, so if there are stored procedures for MySQL, that would be excellent.
Next I started thinking about temp tables. The table would be in memory, it would be maybe 150 rows max, and there would be no deadlocks. Is it still very expensive to do inserts on a temp table? Will I end up crashing the server? Right now we have maybe 100 users per day, but I want to be future-proof for when we get more users.
After a long thought, I figured that the only other way is to use PHP and get all the results as an array. The problem is that I'd get something like:
$my_array[0]["date_created"] = <current_date>
The problem with the above is that I have to sort by date_created, but this is a multidimensional array.
Anyway, to pull 150 to 200 records MAX from a database, which approach would you take? Temp table, view, or PHP?
Some thoughts:
Temp Tables:
Temporary tables only last as long as the session is alive. If you run the code in a PHP script, the temporary table will be destroyed automatically when the script finishes executing.
Views:
These are mainly for hiding complexity, in that you create the view with a join and then access it like a single table. The underlying code is a SELECT statement.
PHP Array:
A bit more cumbersome than SQL to get data from. However, PHP does have some functions to make life easier, though no real query language.
Stored Procedures:
There are stored procedures in MySQL - see: http://dev.mysql.com/doc/refman/5.0/en/stored-routines-syntax.html
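A stored procedure also solves the problem above of getting the user id into the query, since it can take the id as a parameter, which a view cannot. A minimal skeleton (the procedure and parameter names here are made up for illustration):
DELIMITER //
CREATE PROCEDURE event_feed(IN p_user_id INT)
BEGIN
    -- The question's SELECT would go here, with p_user_id
    -- in place of the concatenated $userId.
    SELECT ae.id, p_user_id, 'admin_events', 0, ae.event_start_time
    FROM admin_events AS ae
    WHERE CURDATE() < ae.event_start_time;
END //
DELIMITER ;

CALL event_feed(42);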
My Recommendation:
First, re-write your query using the MySQL Query Analyzer: http://www.mysql.com/products/enterprise/query.html
Now I would use PDO to put my values into an array using PHP. This still leaves the initial heavy lifting to the DB engine and keeps you from making multiple calls to the DB server.
Try this:
SELECT id, " . $userId . ", 'admin_events', 0, event_start_time
FROM admin_events AS ae
LEFT JOIN admin_event_registrations AS aer
ON ae.id = aer.event_id
LEFT JOIN admin_event_declines AS aed
ON ae.id = aed.event_id
WHERE aed.user_id = ". $userid ."
AND aer.user_id = ". $userid ."
AND aed.id IS NULL
AND aer.id IS NULL
AND CURDATE() < ae.event_start_time
AND ae.event_capacity > (
SELECT SUM(IF(aer2.event_id IS NOT NULL, 1, 0))
FROM admin_event_registrations aer2
JOIN admin_events AS ae2
ON aer2.event_id = ae2.id
WHERE aer2.user_id = ". $userid .")
LIMIT 1
It still has a subquery, but you will find that it is much faster than the other options given. MySQL can join tables easily (they should all be of the same table type, though). Note that the per-user conditions belong in the ON clauses of the LEFT JOINs; putting them in the WHERE clause alongside the IS NULL checks would filter out every row. Also, the last count statement won't respond the way you want it to with null results unless you handle null values. This can all be done in a flash, and with the join statements it should reduce your overall query time significantly.
The problem is that you are using correlated subqueries. I imagine that your query takes a little while to run if it's not in the query cache? That's what would be locking your table and causing contention.
Switching the table type to InnoDB would help, but your core problem is your query.
150 to 200 records is a very small amount. MySQL does support stored procedures, but this isn't something you would need them for. Inserts are not resource-intensive on their own, but a lot of them at once or in sequence can cause issues (use bulk insert syntax).

(My)SQL - batch update

Hey, I have a table with "id", "name", and "weight" columns. Weight is an unsigned small int.
I have a page that displays the items ordered by weight ASC. It'll use drag-and-drop, and once the order is changed, it will send back a comma-separated string of ids (in the new order).
Let's say there's 10 items in that table. Here's what I have so far:
Sample input:
5,6,2,9,10,4,8,1,3,7
Sample PHP handler (error handlers & security stuff excluded):
<?php
$weight = 0;
$id_array = explode(',', $id_string);
foreach ($id_array as $key => $val)
{
mysql_query("UPDATE tbl SET weight = '$weight' where id = '$val' LIMIT 1");
$weight++;
}
?>
When the order is changed, will my script need to make 10 separate UPDATE queries, or is there a better way?
You could create a temporary table with the new data in it (i.e., id and weight are the columns), then update the table with this data.
create temporary table t (id int, weight float);
insert into t(id, weight) values (1, 1.0), (2, 27), etc
update tbl inner join t on t.id = tbl.id
set tbl.weight = t.weight;
So, you have one create statement, one insert statement, and one update statement.
You can only specify one WHERE clause in a single query -- which means, in your case, that you can only update one row at a time.
With 10 items, I don't know if I would go through that kind of trouble (it means rewriting some code, even if that's not that hard), but for more, a solution would be to:
delete all the rows
insert them all back
doing all that in a transaction, of course.
The nice point is that you can do several inserts in a single query; I don't know about 10 items, but for 25 or 50, it might be quite nice.
Here is an example, from the INSERT page of the MySQL manual (quoting):
INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas. Example:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
Of course, you should probably not insert "too many" items in a single INSERT query; one insert per 50 items might be OK, though (to find the "right" number of items, you'll have to benchmark, I suppose ^^).
Yes, you would need to do 10 updates. There are ways to batch multiple statements into a single call (the old mysql_query function does not allow it, but mysqli_multi_query does), but it's probably best to avoid that.
If it's performance you are worried about, make sure you try it first before worrying about that. I suspect that doing 10 (or even 20 or 30) updates will be plenty fast.
10 updates is the simplest way conceptually. If you've got a bazillion rows that need to be updated, then you might have to try something different, such as creating a temporary table and using a JOIN in your UPDATE statement, or a subquery with a row constructor.
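For a list this small there is also a single-statement option: map each id to its new weight with a CASE expression. A sketch built from the sample input above (untested):
UPDATE tbl
SET weight = CASE id
    WHEN 5 THEN 0
    WHEN 6 THEN 1
    WHEN 2 THEN 2
    WHEN 9 THEN 3
    WHEN 10 THEN 4
    WHEN 4 THEN 5
    WHEN 8 THEN 6
    WHEN 1 THEN 7
    WHEN 3 THEN 8
    WHEN 7 THEN 9
END
WHERE id IN (5, 6, 2, 9, 10, 4, 8, 1, 3, 7);
The WHERE clause matters here: without it, rows whose id is not listed in the CASE would have their weight set to NULL.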
Store the records in a temp table with a batch insert, delete the matching records from tbl, and then batch-insert them back into tbl from the temp table.
