I want to update data in a table by batch - PHP

I have a temp table that holds the data to be modified in table1. To update table1 I use this query:
UPDATE table1 pr
INNER JOIN tmpTable tmp
ON (pr.product_id = tmp.product_id)
SET pr.isactive = tmp.isactive
But since tmpTable holds a huge amount of data to update, my query sometimes ends up timing out. So my question is: what's the simplest way to update my data in batches, say, 10K rows at a time?
Thanks in advance!

You tagged this with PHP, so I'm assuming you're willing to do some of the work there and not just in a single query. Run the query multiple times, something like:
// $minId / $maxId: the smallest and largest product_id present in tmpTable
for ($i = $minId; $i <= $maxId; $i += 10000) {
    $end = $i + 9999;
    $db->query("UPDATE table1 pr
        INNER JOIN tmpTable tmp
        ON (pr.product_id = tmp.product_id)
        SET pr.isactive = tmp.isactive
        WHERE tmp.product_id BETWEEN $i AND $end");
}
If you're running MyISAM you risk things ending up in a partially completed state, either this way or with your original query. If you're running InnoDB and want to maintain the all-or-nothing aspects of a transaction, you'll have to wrap that loop in a begin/commit. But then you'll have to deal with the fallout of potentially overly large transactions.
If you can provide more details on your specifics I can go deeper down that route.
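For the InnoDB case, a minimal sketch of wrapping the batched updates in one transaction (assuming $db is a PDO connection; table and column names are from the question):
// All-or-nothing: every batch commits together, or none do.
$db->beginTransaction();
try {
    for ($i = $minId; $i <= $maxId; $i += 10000) {
        $end = $i + 9999;
        $db->query("UPDATE table1 pr
            INNER JOIN tmpTable tmp ON (pr.product_id = tmp.product_id)
            SET pr.isactive = tmp.isactive
            WHERE tmp.product_id BETWEEN $i AND $end");
    }
    $db->commit();
} catch (Exception $e) {
    $db->rollBack(); // undo every batch if any one of them fails
    throw $e;
}
Note the trade-off mentioned above: one big transaction avoids partial updates, but it holds locks and undo log space for the whole run.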

You can limit the number of records by using a primary or identity key in your temp table and a WHERE clause in your UPDATE statement. For example:
UPDATE table1 pr
INNER JOIN tmpTable tmp
ON (pr.product_id = tmp.product_id)
SET pr.isactive = tmp.isactive WHERE tmp.ID BETWEEN 1 AND 10000
Hope this helps.
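A hedged sketch of driving that batched UPDATE from PHP, assuming tmpTable has an auto-increment ID column and $db is a PDO connection:
// Hypothetical: walk the temp table's ID range in 10K chunks.
$maxId = (int) $db->query("SELECT MAX(ID) FROM tmpTable")->fetchColumn();
for ($start = 1; $start <= $maxId; $start += 10000) {
    $end = $start + 9999;
    $db->query("UPDATE table1 pr
        INNER JOIN tmpTable tmp ON (pr.product_id = tmp.product_id)
        SET pr.isactive = tmp.isactive
        WHERE tmp.ID BETWEEN $start AND $end");
}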

Use a WHERE clause to limit your data - how to write the WHERE clause is impossible to answer with the information you've provided so far.

Related

Duplicate records in MySQL. EXISTS check for the same data not working properly?

SELECT EXISTS
(SELECT * FROM table WHERE deleted_at IS NULL AND the_date = '$the_date' AND company_name = '$company_name' AND purchase_country = '$p_country' AND lot = '$lot_no') AS numofrecords
What is wrong with this mysql query?
It is still allowing duplicate inserts (1 out of 1000 records). Around 100 users are making entries, so the traffic is not that big, I assume. I do not have access to the database metrics, so I cannot be sure.
The EXISTS condition is used in a WHERE clause. In your case, the outer SELECT doesn't specify a table or a condition.
One example:
SELECT *
FROM customers
WHERE EXISTS (SELECT *
FROM order_details
WHERE customers.customer_id = order_details.customer_id);
Try restructuring your statement like this, and if it returns duplicated data, just use DISTINCT (SELECT DISTINCT * ...).
Another approach for you:
INSERT INTO your_table SELECT * FROM table GROUP BY your_column_to_deduplicate;
The answer from @Nick gave the clues to solve the issue. A separate EXISTS check followed by an INSERT was not the best way: two users could both get 0 from the check and then both INSERT. A single-statement INSERT ... ON DUPLICATE KEY UPDATE ... was the way to go.
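For the record, a minimal sketch of that single-statement approach, assuming a UNIQUE key over the columns that define a duplicate (column names follow the question's query; your_table is a placeholder):
-- Hypothetical: a UNIQUE key lets MySQL detect the duplicate atomically.
ALTER TABLE your_table
    ADD UNIQUE KEY uq_entry (the_date, company_name, purchase_country, lot);

INSERT INTO your_table (the_date, company_name, purchase_country, lot)
VALUES ('$the_date', '$company_name', '$p_country', '$lot_no')
ON DUPLICATE KEY UPDATE lot = VALUES(lot); -- effectively a no-op on duplicates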

MySQL fetch from last to first - [many records]

I want to fetch records from MySQL starting from the last to the first, LIMIT 20. My database has over 1M records. I am aware of ORDER BY, but from my understanding, when using ORDER BY it takes forever to load 20 records, and I have no idea why. I think MySQL fetches all the records before ordering them.
SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
I suppose if you execute the query without the ORDER BY, the time is satisfactory?
You might try creating an index on the column you are ordering by:
create index idx_bookings_booking_id on bookings(booking_id)
You can try to find out the complexity of the query using:
EXPLAIN SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
then check that the proper index has been created on the table:
SHOW INDEX FROM `db_name`.`table_name`;
If the index is not there, create the proper indexes on the tables involved.
Please comment if anything is missing.
The index lookup table needs to be able to reside in memory, if I'm not mistaken (filesort is much slower than an in-memory lookup).
Use a small index / column size.
For double the range, use UNSIGNED columns if you don't need negative values.
Tune sort_buffer_size and read_rnd_buffer_size (maybe better at the connection level, not globally; see the sketch after this list).
See https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html , particularly regarding using EXPLAIN and maybe trying another execution-plan strategy.
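A hedged example of setting those buffers at the session level (the values are placeholders, not recommendations):
-- Hypothetical values; tune per workload. Session scope avoids a global change.
SET SESSION sort_buffer_size = 4 * 1024 * 1024;     -- 4 MB
SET SESSION read_rnd_buffer_size = 2 * 1024 * 1024; -- 2 MB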
You seem to need another workaround like materialized views.
Tell me if this sounds like it:
Create another table like the booking table, e.g. CREATE TABLE booking_short LIKE booking, though you only need the booking_id column.
Then check your code for where exactly you create booking orders, i.e. where you first insert into booking. There, run SELECT COUNT(*) FROM booking_short; if it is > 20, delete the oldest record, then insert the new booking_id.
You can select the IDs from there and then join with the rest of the tables for the details, as in the sketch below.
You won't need LIMIT or sorting.
Of course, this needs heavy documentation to avoid maintenance problems.
Either that or https://stackoverflow.com/a/5912827/6288442
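A rough sketch of that pattern, under the stated assumptions (booking_short holds only the ~20 newest booking_ids; $db is a PDO connection; names are illustrative):
// Hypothetical sketch: maintain booking_short at insert time.
$db->query("INSERT INTO booking_short (booking_id) VALUES ($newBookingId)");
$count = (int) $db->query("SELECT COUNT(*) FROM booking_short")->fetchColumn();
if ($count > 20) {
    // drop the oldest id so the table stays at 20 rows
    $db->query("DELETE FROM booking_short ORDER BY booking_id ASC LIMIT 1");
}

// Reading the latest bookings: join out from the small table,
// no ORDER BY / LIMIT needed on the big tables.
$rows = $db->query(
    "SELECT b.created_at, b.total_amount, p.name, p.id_number, pay.amount
     FROM booking_short s
     INNER JOIN bookings b ON b.booking_id = s.booking_id
     INNER JOIN passengers p ON p.booking_id = b.booking_id
     INNER JOIN payments pay ON pay.booking_id = b.booking_id"
)->fetchAll();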

sql updating only one row

I am trying to update only one row using sql, but I am having troubles with it.
I am trying to do something like this:
$sql="UPDATE table SET age='$age' WHERE id=(SELECT id FROM another_table WHERE somecondition ORDER BY id LIMIT 1)";
but this is not updating anything. I feel like there is some error with where the parentheses are, but I am not sure what exactly is wrong. Does anybody have any idea, or other suggestions on how to update only one row that satisfies the given conditions?
Edited Notes:
Okay, I may have made my question too complicated. Let me rephrase: what is the generic way of updating only one row that meets certain conditions? It can be any row, as long as it meets the conditions.
You should run this query first:
SELECT id FROM another_table WHERE somecondition ORDER BY id LIMIT 1
and see the result. If you get a specific value, say 1, update your code to:
$sql="UPDATE table SET age='$age' WHERE id=(1)";
and check the results. If the query doesn't produce errors but nothing is updated, then your condition matches nothing and there is no row with id 1 in your table.
I have found that updating based on a condition in a sub-query, as in your example, sometimes has problems that seem due to the database trying to figure out the best execution path. I have found it better to do something like the following, noting that my code is in T-SQL and may need a smidgen of tweaking to work in MySQL.
UPDATE T1 SET age = @Age
FROM table as T1 INNER JOIN
another_table as T2 ON T1.id = T2.id
WHERE [use appropriate conditions here]
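For MySQL, a hedged translation of that join-update into its multi-table UPDATE syntax might look like this (illustrative; fill in the real condition):
UPDATE `table` AS T1
INNER JOIN another_table AS T2 ON T1.id = T2.id
SET T1.age = '$age'
WHERE /* appropriate conditions here */;
Note that MySQL's multi-table UPDATE does not support LIMIT, so the join and WHERE conditions themselves must pin down the single row you want.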
Try running this query:
UPDATE table t
SET t.age='$age'
WHERE t.id = (SELECT a.id
FROM another_table a
WHERE somecondition
ORDER BY a.id
LIMIT 1
);
One not-uncommon cause of this problem is the id column having different names in the two tables. You should get into the habit of qualifying column names. You should also verify that the ids in the two tables are actually intended to match.
Another cause would simply be that the conditions match no row, or return ids that are not in the table. That is a bit harder to fix; it requires a better understanding of the data and data structure.

Copy and update 10000 MySQL rows / CPU usage

I have a PHP script that:
Copies new rows from the table "new" to "active" and deletes the existing ones in "new".
Updates the existing data and deletes the rows in "new" if there is already a row with the same id_measurement in "active".
My current solution uses Laravel Eloquent. The problem is that the MySQL CPU usage is very high (90-100% on Mac) with over 10000 rows each time. Is there a faster way to do this? Maybe just with SQL?
Edit:
Everything is working fine now, except the update part:
UPDATE foo_new as new
JOIN foo_active as active
ON active.id_bar = new.id_bar
SET active.blah=new.blah,
active.time_left=new.time_left
It's still really slow and uses a lot of the CPU.
Edit 2:
The solution is indexes. :)
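For reference, a hedged example of the kind of index that helps here, assuming id_bar is the join column on both sides (index names are illustrative):
-- Hypothetical: index the join column on both sides of the UPDATE ... JOIN.
ALTER TABLE foo_active ADD INDEX idx_active_id_bar (id_bar);
ALTER TABLE foo_new ADD INDEX idx_new_id_bar (id_bar);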
Why do you do this at the application layer? Delete and update the table using the power of MySQL.
I have done something similar; use this:
1) Update table1 with the new values from table2 using a join:
UPDATE tab1
INNER JOIN tab2 ON (tab1.DnameId = tab2.DNDOMAIN_ID)
SET
tab1.col = tab2.col;
2) For deletion - delete all the rows from table1 which are not in table2
DELETE FROM tab1 WHERE tab1.id NOT IN (
    SELECT id FROM tab2
);
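On large tables, NOT IN (subquery) can be slow; an anti-join DELETE is a commonly used alternative. A sketch with the same tab1/tab2 names:
-- Delete rows in tab1 that have no matching id in tab2 (anti-join form).
DELETE tab1
FROM tab1
LEFT JOIN tab2 ON tab1.id = tab2.id
WHERE tab2.id IS NULL;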

MySQL PHP | "SELECT FROM table" using "alphanumeric"-UUID. Speed vs. Indexed Integer / Indexed Char

At the moment, I select rows from table01 and table02 using:
SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.UUID = 'whatever';
The UUID column is a unique index, type char(15), with alphanumeric input. I know this isn't the fastest way to select data from the database, but the UUID is the only row identifier available to the front end.
Since I have to select by UUID and not ID, I need to know which of these two options I should go for if, say, the table consists of 100'000 rows. What speed differences would I be looking at, and would the index on the UUID grow too large and slow the DB down?
Get the ID before doing the "big" select
1. $id = SELECT ID FROM table01 WHERE UUID = '{alphanumeric character}';
2. SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.ID = $id;
Or keep it the way it is now, using the UUID.
2. SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.UUID = 'whatever';
Side note: All new rows are created by checking if the system generated uniqueid exists before trying to insert a new row. Keeping the column always unique.
Why not just try it out? Create a new db with those tables. Write a quick php script to populate the tables with more records than you can imagine being stored (if you're expecting 100k rows, insert 10 million). Then experiment with different indexes and queries (remember, EXPLAIN is your friend)...
When you finally get something you think works, put the query into a script on a webserver and hit it with ab (Apache Bench). You can watch what happens as you increase the concurrency of the requests (1 at a time, 2 at a time, 10 at a time, etc).
All this shouldn't take too long (maybe a few hours at most), but it will give you a FAR better answer than anyone at SO could for your specific problem (as we don't know your DB server config, exact schema, memory limits, etc)...
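A throwaway population script in that spirit might look like the sketch below (assumes PDO and the table01 schema from the question; the DSN, row count, and UUID generator are placeholders):
// Hypothetical bench-prep script: bulk-insert fake rows to test index choices.
$db = new PDO('mysql:host=localhost;dbname=bench_test', 'user', 'pass');
$stmt = $db->prepare("INSERT INTO table01 (UUID) VALUES (?)");
$db->beginTransaction();
for ($i = 0; $i < 10000000; $i++) {
    // 15-char hex string, matching the question's char(15) UUID column
    $stmt->execute([substr(bin2hex(random_bytes(8)), 0, 15)]);
    if ($i % 10000 === 9999) { // commit in chunks so transactions stay small
        $db->commit();
        $db->beginTransaction();
    }
}
$db->commit();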
The second solution has the better performance. You need to look up the row by UUID in both solutions, but in the first you look it up by UUID and then do a second, faster lookup by primary key. Since the UUID lookup has already found the right row, the faster second lookup is unnecessary overhead.
