Copy and update 10000 MySQL rows / CPU usage - php

I have a PHP script that:
Copies new rows from the table "new" to "active" and deletes the originals from "new".
Updates the existing row in "active" and deletes the one in "new" if a row with the same id_measurement already exists in "active".
My current solution uses Laravel Eloquent. The problem is that MySQL's CPU usage is very high (90-100% on a Mac) with over 10000 rows each time. Is there a faster way to do this? Maybe just with SQL?
Edit:
Everything is working fine now, except the update part:
UPDATE foo_new AS new
JOIN foo_active AS active
ON active.id_bar = new.id_bar
SET active.blah = new.blah,
    active.time_left = new.time_left;
It's still really slow and uses a lot of CPU.
Edit 2:
The solution is indexes. :)
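For reference, a minimal sketch of what that looks like, assuming id_bar is the join column on both tables:
-- Index the join column on both sides so the UPDATE ... JOIN
-- can seek matching rows instead of scanning whole tables.
ALTER TABLE foo_new ADD INDEX idx_id_bar (id_bar);
ALTER TABLE foo_active ADD INDEX idx_id_bar (id_bar);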

Why do you bring this into the application layer? Delete and update the tables using the power of MySQL. I have done something similar; use this approach:
1) Update table1 with the new values from table2 using a join:
UPDATE tab1
INNER JOIN tab2 ON (tab1.DnameId = tab2.DNDOMAIN_ID)
SET tab1.col = tab2.col;
2) For deletion, delete all rows from table1 that are not in table2:
DELETE FROM tab1
WHERE tab1.id NOT IN (
    SELECT id FROM tab2
);
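For the copy-or-update half of the original question, MySQL can also do both steps in one statement. A minimal sketch, assuming foo_active has a UNIQUE key on id_bar (table and column names taken from the question's edit):
-- Copy rows from the staging table; if a row with the same key
-- already exists in foo_active, update it in place instead.
INSERT INTO foo_active (id_bar, blah, time_left)
SELECT id_bar, blah, time_left FROM foo_new
ON DUPLICATE KEY UPDATE
    blah = VALUES(blah),
    time_left = VALUES(time_left);
-- Then clear the staging table.
DELETE FROM foo_new;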

Related

How to update on union

I have a query that does a union on 2 tables. I want to update a column of the result, something like this:
select * from (
    select a.*, '10' as srv from px_conversions_srv10 a
    union all
    select b.*, '12' as srv from px_conversions_srv12 b
) as ff where ff.adv_transaction_id in (1333764016);
update ff SET ff.`status` = 8;
Thanks
Just run two updates:
update px_conversions_srv10
set status = 8
where adv_transaction_id in (1333764016);
update px_conversions_srv12
set status = 8
where adv_transaction_id in (1333764016);
You can run these inside a single transaction if you want them to take effect at exactly the same time.
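For example, a minimal sketch of the transaction variant (this assumes a transactional storage engine such as InnoDB):
-- Both updates become visible atomically when the transaction commits.
START TRANSACTION;
update px_conversions_srv10 set status = 8 where adv_transaction_id in (1333764016);
update px_conversions_srv12 set status = 8 where adv_transaction_id in (1333764016);
COMMIT;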
Note: having multiple tables with the same columns is usually a sign of a poor database design. There are reasons why this might be useful (say, the tables have different replication requirements or different security requirements). But, in general, a single table is a better idea.
Since the rows come from two different tables, you need to know which table each result row came from. You can do this by adding a column to the query and then deciding from its value which table to update; you already do this with the srv column!
The update statement must be on the original table, since the union is only produced by the query. It is not a physical table in the database.
By extension of this logic, to answer the question in the title, you CANNOT execute an UPDATE on the result set of a SELECT query.
Maybe create a view and then update it:
CREATE VIEW ff AS
select a.*, '10' as srv from px_conversions_srv10 a
union all
select b.*, '12' as srv from px_conversions_srv12 b;

update ff SET ff.`status` = 8 where ff.adv_transaction_id in (1333764016);

PHP array_diff VS mysql NOT IN

I tried to compare two zipcode columns between two tables to see if values were missing in the second one.
I first wanted to do it in MySQL; my query was something like:
SELECT code FROM t1 WHERE code NOT IN (SELECT code FROM t2)
But it was really slow, so I tried another way:
I ran two SELECTs and then compared the results with array_diff().
With MySQL: a few minutes, and sometimes a crash.
With PHP: less than 1 second.
Can someone explain these differences?
Is my SQL query wrong?
If your main table has 50k rows, using a subselect in your query can result in 1 + 50k SELECT executions: one for the outer table, plus one evaluation of the subselect for each of its 50k rows. The server compares each row against a subselect that is re-run on every iteration over the main table. This is why your SQL takes so long, and it can be a huge memory problem as well.
See serjoscha's information about joins to fix it in SQL; it should be even faster than your PHP solution.
Checking which values are missing from one table (compared to another) can easily be done with a LEFT or RIGHT JOIN; they are made for exactly this kind of task. Alternatively, take a look at this: How to Find Missing Value Between Two Mysql Tables – serjoscha
One solution to:
SELECT code FROM t1
WHERE code NOT IN ( SELECT code FROM t2 )
will be:
SELECT t1.code
FROM t1
LEFT JOIN t2
ON t1.code = t2.code
WHERE t2.code IS NULL;
Give it a try. Also have a look at indexing, as Cyclone suggests:
If you don't have an index you should definitely add one, since this will speed up your query. You could add an index like this: ALTER TABLE t1 ADD INDEX code_idx (code); this should be done for both tables. If you then execute EXPLAIN for the query, you will see something like Using where; Using index; Using join buffer, which is good – Cyclone
Indexing speeds up your query. If the table consists of only that one column, an index would simply duplicate the table's content, so searching it would be no faster and the index would be redundant. Otherwise, I strongly recommend indexing the code column of t2, which gives a large performance increase and lower memory consumption.
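Concretely, a short sketch of the indexes and the plan check described above (the index name code_idx is just the one from Cyclone's comment):
-- Index the lookup column on both tables.
ALTER TABLE t1 ADD INDEX code_idx (code);
ALTER TABLE t2 ADD INDEX code_idx (code);

-- Verify the plan: look for "Using index" instead of a full table scan.
EXPLAIN SELECT t1.code
FROM t1
LEFT JOIN t2 ON t1.code = t2.code
WHERE t2.code IS NULL;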

I want to update data in table by batch

I have this temp table that holds the data to be modified in table1. To update table1 I use the query:
UPDATE table1 pr
INNER JOIN tmpTable tmp
ON (pr.product_id = tmp.product_id)
SET pr.isactive = tmp.isactive
But since tmpTable holds a huge amount of data to update, my query sometimes ends in a timeout. So my question is: what is the simplest way to update my data in batches, say 10K at a time?
Thanks in advance!
You tagged this PHP, so I'm assuming you're willing to do some of the work there and not just in a single query. Run the query multiple times, something like:
for ($i = $minId; $i <= $maxId; $i += 10000) {
    $upper = $i + 9999;  // upper bound of this 10K batch
    $db->query("UPDATE table1 pr
        INNER JOIN tmpTable tmp ON (pr.product_id = tmp.product_id)
        SET pr.isactive = tmp.isactive
        WHERE tmp.product_id BETWEEN $i AND $upper");
}
If you're running MyISAM, you risk things ending up in a partially completed state either way. If you're running InnoDB and want to maintain the all-or-nothing aspects of a transaction, you'll have to wrap that loop in a begin/commit. But then you'll have to deal with the fallout of potentially overly large transactions.
If you can provide more details on your specifics, I can go deeper down that route.
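For illustration, the all-or-nothing variant on InnoDB would issue something like this (the ranges shown are examples):
-- All-or-nothing: nothing is visible to other sessions until COMMIT,
-- but the transaction (and its locks) grows with each batch it covers.
START TRANSACTION;
UPDATE table1 pr
INNER JOIN tmpTable tmp ON (pr.product_id = tmp.product_id)
SET pr.isactive = tmp.isactive
WHERE tmp.product_id BETWEEN 1 AND 10000;
-- ...repeat for each subsequent 10K range...
COMMIT;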
You can limit the number of records by using a primary or identity key in your temp table and a WHERE clause in your UPDATE statement. For example:
UPDATE table1 pr
INNER JOIN tmpTable tmp
ON (pr.product_id = tmp.product_id)
SET pr.isactive = tmp.isactive
WHERE tmp.ID BETWEEN 1 AND 10000
Hope this helps.
Use a WHERE clause to limit your data; how to write that WHERE is impossible to answer with the information you have provided.

Migrating rows to a new table and then deleting them

I have a large database of products, and every day I want to run a script that moves rows with active = '1' into a new table and deletes the original rows.
But I can't seem to find the appropriate MySQL commands to accomplish the migration.
It would be great if someone could shed some light on the situation.
Thanks a lot
You should be able to accomplish this with the following:
CREATE TABLE NewTable LIKE OldTable;
INSERT INTO NewTable
SELECT * FROM OldTable WHERE Active = 1;
DELETE FROM OldTable WHERE Active = 1;
Just out of curiosity, what is the point of doing this? If you're worried about fuzzy rows (rows changing between the copy and the delete), you could do the delete as follows:
DELETE OldTab FROM OldTable AS OldTab
INNER JOIN NewTable AS NewTab
ON OldTab.ID = NewTab.ID;
Without any details, here is what you would do (see the sketch below):
1. Create an INSERT statement into the new table, AS SELECT from the old table WHERE active = 1.
2. Create a DELETE statement that deletes from the first table any rows it finds in the second table.
3. Commit.
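A minimal sketch of those steps, assuming InnoDB for the transaction and an id primary key shared by both tables:
START TRANSACTION;
-- Step 1: copy the active rows across.
INSERT INTO NewTable
SELECT * FROM OldTable WHERE active = '1';
-- Step 2: delete from the old table only what made it into the new one.
DELETE o FROM OldTable AS o
INNER JOIN NewTable AS n ON o.id = n.id;
-- Step 3: make both changes permanent together.
COMMIT;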

MySQL PHP | "SELECT FROM table" using "alphanumeric"-UUID. Speed vs. Indexed Integer / Indexed Char

At the moment, I select rows from table01 and table02 using:
SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.UUID = 'whatever';
The UUID column is a unique index of type char(15) with alphanumeric content. I know this isn't the fastest way to select data from the database, but the UUID is the only row identifier available to the front end.
Since I have to select by UUID rather than ID, which of these two options should I go for if the table holds, say, 100,000 rows? What speed difference would I be looking at, and would the index on the UUID grow too large and slow down the DB?
Get the ID before doing the "big" select
1. $id = SELECT ID FROM table01 WHERE UUID = '{alphanumeric character}';
2. SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.ID = $id;
Or keep it the way it is now, using the UUID.
2. SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.UUID = 'whatever';
Side note: all new rows are created by checking whether the system-generated uniqueid already exists before trying to insert a new row, keeping the column always unique.
Why not just try it out? Create a new DB with those tables. Write a quick PHP script to populate the tables with more records than you can imagine ever storing (if you're expecting 100k rows, insert 10 million). Then experiment with different indexes and queries (remember, EXPLAIN is your friend)...
When you finally get something you think works, put the query into a script on a web server and hit it with ab (Apache Bench). You can watch what happens as you increase the concurrency of the requests (1 at a time, 2 at a time, 10 at a time, etc.).
All this shouldn't take too long (maybe a few hours at most), but it will give you a FAR better answer than anyone on SO could for your specific problem (as we don't know your DB server config, exact schema, memory limits, etc.)...
The second solution has the better performance. Both solutions have to look up the row by UUID; the first then adds a faster lookup by primary key, but by that point the row has already been found via the UUID, so the faster second lookup is simply unnecessary.
