Using PHP to delete many rows in a large table from MySQL - php

I am having trouble deleting many rows from a large table. I am trying to delete 200-300k rows from a table with about 2 million rows.
My PHP script is something like this:
for ($i = 0; $i < 1000; $i++) {
    $query = "DELETE FROM record_table LIMIT 100";
    $queryres = mysql_query($query) or die(mysql_error());
}
This is just an example of my script: it deletes 100 rows at a time, running 1000 times, to remove 100k records.
However, the PHP script just seems to keep running forever and never returns anything.
When I run the same query from the command line it deletes fine, although it takes about 5-6 minutes.
Could something else be preventing the PHP script from executing the query? I also tried deleting 100k rows in one query, with the same result.
The query that I really want to run is "DELETE FROM table WHERE timeinlong BETWEEN 'timefrom' AND 'timeto'".
The timeinlong column is indexed.

Hopefully you have an ID field so you can just do something like this:
$delete = mysql_query("DELETE FROM records WHERE id > 1000");
That would leave the first 1,000 rows and remove every other entry.

Perhaps adding a field to track deleted items will work for you. Then rather than actually deleting the rows, you update a 'deleted' flag to TRUE. Obviously your other queries need to be modified to select only rows where deleted equals FALSE, but it's fast. Then you can trim the table via a script at some other time.
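A rough sketch of that flag approach with the legacy mysql_* API used in the question; the deleted column and the example time bounds are assumptions:
<?php
// One-time schema change: add a soft-delete flag
// (table and column names are assumptions based on the question).
mysql_query("ALTER TABLE record_table ADD COLUMN deleted TINYINT(1) NOT NULL DEFAULT 0")
    or die(mysql_error());

// "Deleting" a range of rows now just flips the flag, which is fast.
// The bounds below are example values for the indexed timeinlong column.
mysql_query("UPDATE record_table SET deleted = 1
             WHERE timeinlong BETWEEN 1356998400 AND 1359676800")
    or die(mysql_error());

// Every other query must now exclude the soft-deleted rows.
$res = mysql_query("SELECT * FROM record_table WHERE deleted = 0")
    or die(mysql_error());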

Why are you deleting in a loop so many times? That's not a good way to delete. If you have an auto-incremented id, use it in the WHERE clause
(e.g. DELETE FROM record_table WHERE id < someId).
Or, if you do want to delete in a loop that runs for a long time, you should also call set_time_limit(0) to keep the PHP script executing.
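A minimal sketch along those lines, keeping the question's timeinlong range and the legacy mysql_* API; the batch size and the time bounds are arbitrary example values:
<?php
// Remove PHP's execution time limit so the script is not killed part-way through.
set_time_limit(0);

// Example bounds for the indexed timeinlong column.
$timefrom = 1356998400;
$timeto   = 1359676800;

do {
    // Delete in small batches so each statement finishes quickly
    // and locks are only held briefly.
    mysql_query(
        "DELETE FROM record_table
         WHERE timeinlong BETWEEN $timefrom AND $timeto
         LIMIT 1000"
    ) or die(mysql_error());

    $deleted = mysql_affected_rows();
} while ($deleted > 0); // stop once no rows remain in the range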

Related

Change the seed of RAND() function in PHP?

I access a table in my database from a PHP script, and sometimes I get the same result repeated on consecutive calls.
I ran this query:
$query ="SELECT Question,Answer1,Answer2 FROM MyTable ORDER BY RAND(UNIX_TIMESTAMP(NOW())) LIMIT 1";
Before this query I tried just ORDER BY RAND(), but it gave me a lot of consecutive repeats, which is why I decided to use ORDER BY RAND(UNIX_TIMESTAMP(NOW())).
But this one still gives me consecutive repeats (although fewer).
Here is an example of what I mean by "continuous repeat results":
Imagine that I have 100 rows in my table: ROW1, ROW2, ROW3, ROW4, ROW5...
When I call my PHP script 5 times in a row I get 5 results:
-ROW2, ROW20, ROW20, ROW50, ROW66
I don't want the same row to come up twice in a row.
I would like, for example: -ROW2, ROW20, ROW50, ROW66, ROW20
I just want to fix it in some easy way.
https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_rand
RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.
If you want 5 results, why not change the limit to 5? That will ensure there are no duplicates within a single call.
The other option is to read all of the data out and then use shuffle() in PHP (see the sketch below):
http://php.net/manual/en/function.shuffle.php
Or select the max id and use a random number generated from PHP:
http://php.net/manual/en/function.mt-rand.php
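A small sketch of the read-everything-and-shuffle option, assuming a PDO connection in $pdo and the column names from the question:
<?php
// Assumes $pdo is an existing PDO connection; the column names come from the question.
$rows = $pdo->query("SELECT Question, Answer1, Answer2 FROM MyTable")
            ->fetchAll(PDO::FETCH_ASSOC);

// Randomize the order in PHP instead of relying on ORDER BY RAND().
shuffle($rows);

// Serve one element per call/page; consecutive entries are guaranteed to be
// different rows until the list is exhausted.
foreach ($rows as $row) {
    echo $row['Question'], "\n";
}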
This is not doable by just redefining the query. You need to change the logic of your PHP script.
If you want the PHP script (and the query) to return exactly ONE row per execution, and you need a guarantee that repeated executions of the PHP script yield different rows, then you need to store the previous result somewhere and use it in the WHERE condition of the query.
So your PHP script becomes something like (pseudocode):
$previousId = ...; // Load the ID of the row fetched by the previous execution
$query = "SELECT Question,Answer1,Answer2
FROM MyTable
WHERE id <> ?
ORDER BY RAND(UNIX_TIMESTAMP(NOW()))
LIMIT 1";
// Execute $query, using the $previousId bound parameter value
$newId = ...; // get the ID of the fetched row.
// Save $newId for the next execution.
You may use all kinds of storages for saving/loading the ID of the fetched rows. The easiest is probably to use a special table with a single row in the same database for this purpose.
Note that you may still get repeated sequential rows if you call your PHP script many times in parallel. Not sure if it matters in your case.
If it does, you may use locks or database transactions to fix this as well.
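One way to flesh that pseudocode out, assuming PDO, an integer primary key id on MyTable, and a one-row helper table last_pick(id) that stores the previously returned row; all of those names are assumptions:
<?php
// Assumes $pdo is an existing PDO connection and last_pick already holds one row.
// Load the ID returned by the previous execution (0 if none recorded yet).
$previousId = (int) $pdo->query("SELECT id FROM last_pick LIMIT 1")->fetchColumn();

// Exclude the previous row so two consecutive executions cannot return the same one.
$stmt = $pdo->prepare(
    "SELECT id, Question, Answer1, Answer2
     FROM MyTable
     WHERE id <> ?
     ORDER BY RAND(UNIX_TIMESTAMP(NOW()))
     LIMIT 1"
);
$stmt->execute([$previousId]);
$row = $stmt->fetch(PDO::FETCH_ASSOC);

// Remember this row's ID for the next execution.
$pdo->prepare("UPDATE last_pick SET id = ?")->execute([$row['id']]);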

Unique Codes - Given to two users who hit script in same second

I have a bunch of unique codes in a database which should only be used once.
Two users hit the script which assigns them at the same time and got the same codes!
The script is in Magento and a user can order multiple codes. The issue is that if one customer orders 1000 codes, the script grabs the top 1000 codes from the DB into an array and then runs through them, setting them to "used" and assigning them to an order. If a second user hits the same script at around the same time, it grabs the top 1000 codes in the DB at that moment, which overlap because the first script hasn't had a chance to finish assigning them.
This is unfortunate but has happened quite a few times!
My idea was to create a new table: when a user hits the script, a row is made with "order_id" and "code_type". Then, in the same script, a check is done; if a row already exists in this new table whose "code_type" matches the one the user is ordering, it waits 60 seconds and checks again until the previous codes are issued and the table is empty, at which point it creates its own row and carries on.
I am not sure if this is the best way, or whether, if two users hit the script in the same second again, two rows will simply be inserted and we are off with the same problem!
Any advice is much appreciated!
The correct answer depends on the database you use.
For example, in MySQL with InnoDB, one possible solution is a transaction with SELECT ... LOCK IN SHARE MODE.
Schematically it works by firing the following queries:
START TRANSACTION;
SELECT * FROM codes WHERE used = 0 LIMIT 1000 LOCK IN SHARE MODE;
// save ids
UPDATE codes SET used=1 WHERE id IN ( ...ids....);
COMMIT;
More information at http://dev.mysql.com/doc/refman/5.7/en/innodb-locking-reads.html
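A minimal PHP sketch of that transaction, assuming a PDO connection with exceptions enabled and the codes table from the schematic above; the 1000-row limit matches the order size in the question:
<?php
// Assumes $pdo is an existing PDO connection with PDO::ERRMODE_EXCEPTION set.
$pdo->beginTransaction();
try {
    // Lock the candidate rows for the duration of the transaction,
    // following the answer's SELECT ... LOCK IN SHARE MODE schematic.
    $ids = $pdo->query(
        "SELECT id FROM codes WHERE used = 0 LIMIT 1000 LOCK IN SHARE MODE"
    )->fetchAll(PDO::FETCH_COLUMN);

    if (count($ids) > 0) {
        // Mark the selected codes as used before the locks are released.
        $in = implode(',', array_fill(0, count($ids), '?'));
        $pdo->prepare("UPDATE codes SET used = 1 WHERE id IN ($in)")->execute($ids);
    }

    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack();
    throw $e;
}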

Most efficient way to move multiple rows at a time

This is a short term solution to a long term problem...
I have a database created by someone else (isn't that always the case). One particular table stores historical transactional data. When this table becomes large, the site performs like crap. When I can get to it, I will redesign the database to 3NF. Until then, I need to limit the table to around 500,000 rows. So I want to periodically run a script to move the oldest rows to an archive table that will probably never be used. Let's say I am moving 5-10K rows at a time; what is the most efficient way to do it?
This is a MYSQL database.
Off the top of my head, I figure I will get a count of the rows, find the id at offset (count - 500,000) with a LIMIT query, and move everything with an ID <= that.
Do I just select, insert and delete or is there a better way to do it?
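A minimal sketch of that select-insert-delete move, assuming an archive table with an identical structure, an auto-increment id, and the legacy mysql_* API used elsewhere on this page; the table names are placeholders:
<?php
// Find the boundary id: everything at or below it lies outside the newest
// 500,000 rows. (Assumes the table currently has more than 500,000 rows.)
$res = mysql_query(
    "SELECT id FROM transactions ORDER BY id DESC LIMIT 500000, 1"
) or die(mysql_error());
$boundary = (int) mysql_result($res, 0);

// Copy one batch of old rows into the archive table...
mysql_query(
    "INSERT INTO transactions_archive
     SELECT * FROM transactions
     WHERE id <= $boundary
     ORDER BY id
     LIMIT 10000"
) or die(mysql_error());

// ...then delete exactly the same batch from the live table.
mysql_query(
    "DELETE FROM transactions
     WHERE id <= $boundary
     ORDER BY id
     LIMIT 10000"
) or die(mysql_error());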

MySQL UPDATE (from PHP) - only runs first few rows; all cells locked after one UPDATE

I run a MySQL UPDATE from PHP. It produces output, which is shown in the browser; no problems there.
The MySQL DB updates a varying number of rows: if I try 10 rows, it updates only the first 5; if I try 4, it updates the first and the last.
I started doing INSERTs for all rows instead, and it inserted 1000+ rows in a few seconds on this exact same database. The UPDATE seems to be way off in some way...
Maybe people have some input as to why this could happen?
The main concern is that after I have run updates on the rows, the rows seem "locked" against further updates through PHP. To my mind this is really weird and I don't get what is going on. I can of course still make updates through phpMyAdmin.
CODE as requested:
mysql_query(" UPDATE `search` SET `pr_image` = '$primage', `pr_link` = '$pr_link' WHERE `s_id` = '$id' ");
Thanks in advance.
UPDATE `search` SET `pr_image` = $primage, `pr_link` = $pr_link WHERE `s_id` = $id
Try with this query.

How to update 100,000 record MySQL database efficiently

I have to update 100,000+ rows in a MySQL database from PHP, pulling the data from an API. It fails if I try to do more than 5,000 at a time.
I'm thinking the best approach might be to do 5,000 at a time using an update query with LIMIT 0, 5000, and to timestamp these records with the time they were updated. Then select the next 5,000 where the last-updated time is more than 20 minutes before the current time.
Can anyone please offer any help on how to construct this query? Or is this approach not optimal?
So this is the solution I have gone with; rightly or wrongly, it works. To recap the problem: I have 100k rows, I need to loop through these and pass a userid to an API that returns a JSON feed.
I use the data returned to update each record. For some reason this fails, either because of a timeout or a server 500 error, which I believe is due to the API. So instead of selecting all 100k records, I just select 5k (LIMIT 0, 5000), and I added a column called 'updated' that is marked true once a record has been updated.
I keep doing this until all records are updated. When that happens, I set the updated column back to false and start the process again. This script runs on a cron job every 30 minutes and seems to work fine. I guess I could find out why it was timing out in the first place, but I suspect it is a php.ini issue (timeout setting) which I don't have access to.
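A rough sketch of that batching, assuming PDO, a table named records with a tinyint updated flag and a data column, and a hypothetical call_api() wrapper around the JSON feed; all of those names are assumptions:
<?php
// Assumes $pdo is an existing PDO connection; call_api() is a hypothetical
// helper that takes a userid and returns the decoded JSON feed.
$batch = $pdo->query(
    "SELECT id, userid FROM records WHERE updated = 0 LIMIT 5000"
)->fetchAll(PDO::FETCH_ASSOC);

if (count($batch) === 0) {
    // Every row has been processed: reset the flags so the next pass starts over.
    $pdo->exec("UPDATE records SET updated = 0");
    exit;
}

$stmt = $pdo->prepare("UPDATE records SET data = ?, updated = 1 WHERE id = ?");
foreach ($batch as $row) {
    $data = call_api($row['userid']);
    $stmt->execute([json_encode($data), $row['id']]);
}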
Thanks
Jonathan
Create a temporary table, multi-insert the update data into it, and then:
UPDATE `table`, `tmp`
SET `table`.`column` = `tmp`.`column`
WHERE `table`.`id` = `tmp`.`id`;
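A minimal sketch of that flow, assuming a PDO connection and that $updates already holds the API data as id => new value pairs; the table and column names simply mirror the answer's placeholders:
<?php
// Assumes $pdo is an existing PDO connection and $updates maps id => new value.
$pdo->exec("CREATE TEMPORARY TABLE `tmp` (`id` INT PRIMARY KEY, `column` VARCHAR(255))");

// Multi-insert the update data into the temporary table in one statement.
if (count($updates) > 0) {
    $placeholders = [];
    $params = [];
    foreach ($updates as $id => $value) {
        $placeholders[] = "(?, ?)";
        $params[] = $id;
        $params[] = $value;
    }
    $pdo->prepare("INSERT INTO `tmp` (`id`, `column`) VALUES " . implode(',', $placeholders))
        ->execute($params);
}

// Apply everything in one joined UPDATE, as in the answer above.
$pdo->exec(
    "UPDATE `table`, `tmp`
     SET `table`.`column` = `tmp`.`column`
     WHERE `table`.`id` = `tmp`.`id`"
);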
