MySQL query on cron overlaps, ignores 'locked' rows - php

I'm trying to lock a row in a table as being "in use" so that I don't process the data twice when my cron runs every minute. Because of the length of time it takes for my script to run, the cron will cause multiple instances of the script to run at once (usually around 5 or 6 at a time). For some reason, my "in use" method is not always working.
I do not want to LOCK the tables because I need them available for simultaneous processing, that is why I went the route of pseudo-locking individual rows with an 'inuse' field. I don't know of a better way to do this.
Here is an illustration of my dilemma:
<?php
//get the first row from table_1 that is not in use
$result = mysqli_query($connect,"SELECT * FROM `table_1` WHERE inuse='no'");
$rows = mysqli_fetch_array($result, MYSQLI_ASSOC);
$data1 = $rows['field1'];
//"lock" our row by setting inuse='yes'
mysqli_query($connect,"UPDATE `table_1` SET inuse='yes' WHERE field1 = '$data1'");
//insert new row into table_2 with our data if it doesn't already exist
$result2 = mysqli_query($connect,"SELECT * FROM `table_2` WHERE field='$data2'");
$numrows = mysqli_num_rows($result2);
if($numrows >= 1) {
//do nothing
} else {
//run some unrelated script to get data
$data2 = unrelatedFunction();
//insert our data into table_2
mysqli_query($connect,"INSERT INTO `table_2` (field) VALUES ('$data2')");
}
//"unlock" our row in table_1
mysqli_query($connect,"UPDATE `table_1` SET inuse='no' WHERE field1 = '$data1'");
?>
You'll see here that $data2 won't be collected and inserted if a row already exists with $data2, but that part is only error-checking and doesn't answer my question, since the error still occurs. What I'm trying to understand is why (even without that error check) my 'inuse' method is sometimes ignored and I end up with duplicate rows in table_2 containing $data2.

There's a lot of time between your first SELECT and the first UPDATE in which another process can do the same operation. You're not using a transaction either, so you're not guaranteeing any order in which your changes become visible to others.
You can either move everything into a transaction with the isolation level you need and use the SELECT ... FOR UPDATE syntax, or you can try claiming the rows a different way. For example, update the N rows you want to process with SET inuse = your_current_pid WHERE inuse IS NULL, then read back the rows you marked for processing. After you finish, reset inuse to NULL.
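For illustration, a minimal sketch of the transactional route, reusing the question's table and column names (requires an InnoDB table; mysqli_begin_transaction needs PHP 5.5+, on older versions run START TRANSACTION as a plain query):

```php
<?php
// Claim one free row under a row lock (InnoDB only).
// A second cron instance running the same SELECT ... FOR UPDATE blocks
// on the locked row; once we commit, it re-evaluates the WHERE clause,
// sees inuse='yes', and moves on to a different row.
mysqli_begin_transaction($connect);

$result = mysqli_query($connect,
    "SELECT * FROM `table_1` WHERE inuse='no' LIMIT 1 FOR UPDATE");
$row = mysqli_fetch_array($result, MYSQLI_ASSOC);

if ($row) {
    $data1 = $row['field1'];
    mysqli_query($connect,
        "UPDATE `table_1` SET inuse='yes' WHERE field1 = '$data1'");
}

mysqli_commit($connect); // releases the lock; the row is now marked in use
?>
```

After the lengthy processing finishes, reset inuse='no' (or delete the row) in a separate statement, exactly as in the original script.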

Related

Mysql only update if row has been inserted before

I want to run the update query only if the row exists (i.e. was inserted before). I tried several different things, but this could be a problem with how I am looping. The insert works and creates the record; the update should take the existing value and add to it each time (10 exists + 15 added, 25 exists + 15 added, 40 exists, and so on). I tried the update inside the loop, but it ran for every item in the list and produced a huge number each time. Also, the page runs each time a link is clicked, so the user exits and comes back.
while($store = $SQL->fetch_array($res_sh))
{
$pm_row = $SQL->query("SELECT * FROM `wishlist` WHERE shopping_id='".$store['id']."'");
$myprice = $store['shprice'];
$sql1 = "insert into posted (uid,price) Select '$uid','$myprice'
FROM posted WHERE NOT EXISTS (select * from `posted` WHERE `uid` = '$namearray[id]') LIMIT 1";
$query = mysqli_query($connection,$sql1);
}
$sql2 = "UPDATE posted SET `price` = price + '$myprice' WHERE shopping_id='".$_GET['id']."'";
$query = mysqli_query($connection,$sql2);
By checking mysqli_affected_rows after the insert query to verify that it actually inserted, you can make the update query conditional.
However, if you're running an update immediately after an insert, it can usually be accomplished in a single statement. In this case, with no further context, you could simply multiply $myprice by 2 before inserting, though you may want to look into whether you can avoid that.
Additionally, though somewhat more complex, you could use SQL transactions for this and make sure you reference exactly the row you want to update. If the insert failed, your update would not happen.
Granted, if you reference the inserted row precisely, a failed insert means the update has no row to match and will not happen anyway. For example, with a primary auto-increment key on these rows, use mysqli_insert_id to get the last inserted ID and update the row with that ID. But this methodology can break down in a high-volume system, or on a random race event, which leads right back to single queries or transactions.
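As an aside: if posted.uid has (or can be given) a UNIQUE index, the insert-or-add-to-price logic can often be collapsed into one atomic statement, which sidesteps the race entirely. A sketch, assuming that unique key exists:

```php
<?php
// Insert the row if no row with this uid exists yet;
// otherwise add $myprice to the existing price.
// Assumes a UNIQUE index on posted.uid.
$sql = "INSERT INTO posted (uid, price)
        VALUES ('$uid', '$myprice')
        ON DUPLICATE KEY UPDATE price = price + VALUES(price)";
mysqli_query($connection, $sql);
?>
```

Because the server resolves the duplicate itself, no affected-rows check or separate update query is needed.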

Starting mysqli_fetch_array from a specific row

I have some records which I iterate over with while ($row = $result->fetch_assoc()), then I get some other data from a different table using while ($row2 = $result2->fetch_assoc()), iterating over each of those as well, and display part of the first table's data and part of the second table's data in an HTML table.
However, when I truncate the first table and then insert new records, the second query's $result2->fetch_assoc() starts from the beginning of its table again and iterates X times, X being the number of rows in the first table. This is not what I want: I want to remember the last position of the iteration over table 2, so that when it is called again it only iterates over the remaining rows of the second table, always dependent on the nth pass over the first table.
I found an answer on Stack Overflow, which you can find here, but I didn't understand it correctly: how can you save the last LIMIT value so as to start from id X when $result2->fetch_assoc() is called again?
I thought about storing a counter in a text document (incremented by the first while loop), then using LIMIT from that number, but I don't really see how to make it work.
Edit: here are some additional info:
Table "aplikimet" schema:
Table "aplikimet_2" schema:
$sql = "SELECT id, emri, mbiemri, email, telefoni, vendbanimi, datelindja, mesazhi FROM aplikimet";
$sql2 = "SELECT statusi, uid FROM aplikimet_2";
$result = $conn->query($sql);
$result2 = $conn->query($sql2);
if (($result->num_rows > 0) AND ($result2->num_rows>0)){
(html table and th are here)
while (($row = $result->fetch_assoc()) AND ($row2 = $result2->fetch_assoc())) {
(html td are here)
Thanks for your help!
Every time you call fetch_assoc(), an internal pointer is incremented.
After the last element the call returns false, so your while(...) loop ends.
To reset it you can call
mysqli_data_seek($result, 0);
or
$result->data_seek(0);
see here
IMHO that's not a great way to do it, though.
If you want to loop over the same rowset multiple times, you can save it in an array after the first complete loop, then loop over your array with foreach() as many times as you need (your connection can even be closed by then).
To limit the number of rows returned by your query, use the SQL LIMIT clause, whose syntax can differ depending on the RDBMS you are using.
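A sketch of that approach, adapted to the question's paired loops: read the second result set into an array once, and keep an index that survives across passes (variable names are made up for illustration):

```php
<?php
// Cache the second rowset once; fetch_assoc() can't rewind on us later.
$rows2 = array();
while ($row2 = $result2->fetch_assoc()) {
    $rows2[] = $row2;
}
$result2->free();

// $pos persists across iterations, so each pass over the first
// table consumes the next still-unused row of the second table.
$pos = 0;
while (($row = $result->fetch_assoc()) and isset($rows2[$pos])) {
    $row2 = $rows2[$pos++];
    // build the HTML td cells from $row and $row2 here
}
?>
```

To persist the position between page loads, store $pos in the session (or a table) instead of a text file.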

Delete row by row or a bulk

I would like to delete a bulk of data. The table currently has approximately 11,207,333 rows.
The data to be deleted is approximately 300k rows. I have two methods to do this but am unsure which one performs faster.
My first option:
$start_date = "2011-05-01 00:00:00";
$end_date = "2011-05-31 23:59:59";
$sql = "DELETE FROM table WHERE date>='$start_date' and date <='$end_date'";
$mysqli->query($sql);
printf("Affected rows (DELETE): %d\n", $mysqli->affected_rows);
second option:
$query = "SELECT count(*) as count FROM table WHERE date>='$start_date' and date <='$end_date'";
$result = $mysqli->query($query);
$row = $result->fetch_array(MYSQLI_ASSOC);
$total = $row['count'];
if ($total > 0) {
$query = "SELECT * FROM table WHERE date>='$start_date' and date <='$end_date' LIMIT 0,$total";
$result = $mysqli->query($query);
while ($row = $result->fetch_array(MYSQLI_ASSOC)) {
$table_id = $row['table_id']; // primary key
$query = "DELETE FROM table WHERE table_id = $table_id";
$mysqli->query($query);
}
}
This table's data is displayed to clients, and I'm afraid that if the deletion goes wrong it will affect them.
I was wondering whether there is any method better than mine.
If you need more info from me, just let me know.
Thank you.
In my opinion, the first option is faster.
The second option contains a loop, which I think will be slower because it keeps iterating, looking up your table id each time.
As long as you don't provide the wrong start and end dates, I think you're safe with either option, but option 1 is faster in my opinion.
And yes, I don't see the bulk deletion in option 2, but I assume you have it in mind via the looping method.
Option one is your best bet.
If you are afraid something will "go wrong", you could protect yourself by backing up the data first, exporting the rows you plan to delete, or implementing a logical delete flag.
Assuming that there is indeed a DELETE query in it, the second method is not only slower, it may also break if another connection deletes one of the rows you intend to delete in your while loop before it has had the chance to. For it to work, you need to wrap it in a transaction:
$mysqli->query("START TRANSACTION");
# your series of queries...
$mysqli->query("COMMIT");
This will allow the correct processing of your queries in isolation of the rest of the events happening in the db.
At any rate, if you want the first query to be faster, you need to tune your table definition by adding an index on the column used for the deletion, namely `date` (however, recall that this new index may hamper other queries in your app if there are already several indexes on that table).
Without that index, MySQL will basically process the query more or less the same way as in method 2, but without:
PHP interpretation,
network communication and
query analysis overhead.
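Adding such an index is a one-line schema change; a sketch, assuming the table and column really are named `table` and `date` as in the question (the index name is arbitrary):

```sql
ALTER TABLE `table` ADD INDEX idx_date (`date`);
```

Note that building the index on an 11-million-row table takes time itself, so do it during a quiet period.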
You don't need any SELECTS to make the delete in a loop. Just use LIMIT in your delete query and check if there are affected rows:
$start_date = "2011-05-01 00:00:00";
$end_date = "2011-05-31 23:59:59";
$deletedRecords = 0;
$sql = "DELETE FROM table WHERE date>='$start_date' and date <='$end_date' LIMIT 100";
do {
$mysqli->query($sql);
$deletedRecords += $mysqli->affected_rows;
} while ($mysqli->affected_rows > 0);
printf("Affected rows (DELETE): %d\n", $deletedRecords);
Which method is better depends on the storage engine you are using.
If you are using InnoDB, this is the recommended way. The reason is that the DELETE statement runs in a transaction (even in auto-commit mode, every SQL statement runs in a transaction in order to be atomic: if it fails in the middle, the whole delete is rolled back and you won't end up with half the data). A single big DELETE therefore means a long-running transaction with a lot of locked rows, which will block anyone who wants to update that data (it can even block inserts if unique indexes are involved), and reads will have to go through the rollback log. In other words, for InnoDB, large deletes are faster if performed in chunks.
In MyISAM, however, a delete locks the entire table. If you do it in lots of small chunks, you will execute too many LOCK/UNLOCK commands, which will actually slow the process down. I would still use a loop for MyISAM, to give other processes a chance to use the table, but with larger chunks than for InnoDB. I would never do it row by row on a MyISAM table because of the LOCK/UNLOCK overhead.

mysql_fetch_row not returning fresh updated rows

I have a table in mysql like below-
++++++++++++
+ my_table +
++++++++++++
id
is_done
++++++++++++
And I have a PHP script which performs operation like below-
<?php
$q=mysql_query("SELECT * FROM my_table WHERE is_done=0");
while($res=mysql_fetch_assoc($q)){
$id=$res['id'];
//do something lengthy
sleep(60);
mysql_query("UPDATE my_table SET is_done=1 WHERE id='$id'");
}
?>
Now, if I manually change one row in MySQL and set is_done=1 for that row, the script still processes it. How can I adjust my script to read fresh, updated rows from MySQL and skip any row that has been marked as done in the meantime?
Why don't you keep it simple? What is the purpose of using sleep?
mysql_query("UPDATE my_table SET is_done=1 WHERE is_done=0");
You don't have to query the whole table and change rows one by one.
If you need to use sleep, then try this:
mysql_query("UPDATE my_table SET is_done=1 WHERE is_done=0 and id='$id'");
If you are worrying about calling unnecessary sql statements I'd suggest to aggregate the id's to an array and update outside the loop (performance):
<?php
$q=mysql_query("SELECT * FROM my_table WHERE is_done=0");
$ids = array();
while($res=mysql_fetch_assoc($q)){
$ids[]=$res['id'];
//do something lengthy
sleep(60);
}
mysql_query("UPDATE my_table SET is_done=1 WHERE id IN ('".implode("','", $ids)."')");
?>
MySQL won't do anything for rows that already contain the new data (no affected rows where is_done=1).
If it's the actions you are performing that you are worried about, you can use either a transaction to make sure they are atomic, or table locking to make sure no changes take place while the script is running.
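A cheaper alternative to a full table lock is to make the claim itself atomic: try to mark the row done before processing it, and skip it if the UPDATE matched nothing because someone else got there first. A sketch in the question's (legacy) mysql_* style:

```php
<?php
$q = mysql_query("SELECT id FROM my_table WHERE is_done=0");
while ($res = mysql_fetch_assoc($q)) {
    $id = $res['id'];
    // The "AND is_done=0" guard makes the claim atomic: if the row
    // was marked done meanwhile, zero rows are affected and we skip it.
    mysql_query("UPDATE my_table SET is_done=1 WHERE id='$id' AND is_done=0");
    if (mysql_affected_rows() == 0) {
        continue;
    }
    // do something lengthy
    sleep(60);
}
?>
```

Note this flags the row before the lengthy work runs, so a crash mid-way leaves it marked done; a separate "in progress" status value is safer in practice.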

Can I add a new row to a table if a checkbox is checked using one query rather than nested? PHP, MySQL

After much head-scratching, I've got this query working - but it looks clunky and feels slow when it runs.
I have a table called UserTable which has a field called 'Item' populated if the specific user says 'yes' to that item. I only want to add a row for that item into UserTable in that instance - in other words, I don't want to have lots of user_ID/Item/'no' relationships in the table, only the user_ID/Item/'yes' responses.
I've built some code which shows the user the whole dataset and allows them to change their preference and then press update. When they update, an array called $checkbox is output which includes the item numbers (eg "1","3","6") which they've ticked as 'yes'. If they don't tick anything, $checkbox is set to "".
Here's the relevant code - as I say, it's very clunky, with a WHILE inside a FOREACH as well as two validating IF statements. Can I get rid of one (or both!) of the loops and replace with a SELECT type command?
foreach ($checkbox as $value) {
    if ($value != "") {
        $sql = "SELECT count(Item) as row_exists
                FROM UserTable
                WHERE Item = '$value' and
                User_ID = '$current_user_id'";
        $result = mysqli_query($mysqli, $sql) or die(mysqli_error($mysqli));
        while ($iteminfo = mysqli_fetch_array($result)) {
            if ((int)$iteminfo['row_exists'] == 0) {
                $sql = "INSERT INTO UserTable
                        (User_ID, Item, Date) VALUES
                        ('$current_user_id', '$value', now())";
                $add_new_row = mysqli_query($mysqli, $sql) or die(mysqli_error($mysqli));
            }
        }
    }
}
Many thanks in advance.
You can eliminate both if statements:
Filter your checkbox array on != "" and loop over those results. That gets rid of the first if that checks for != "".
Augment your initial query to include row_exists = 0, and iterate over those results. That gets rid of the second if.
In fact, you could probably merge your two SQL statements into one composite conditional insertion. MySQL allows insertions of the form:
INSERT INTO table (columns) SELECT ...
So you could adapt your first query to serve as the SELECT ... part, and substitute your second insertion in place of the INSERT INTO ... above.
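A sketch of what the merged statement could look like with the question's tables and variables (FROM DUAL lets the SELECT carry a WHERE clause without a real source table):

```sql
INSERT INTO UserTable (User_ID, Item, Date)
SELECT '$current_user_id', '$value', NOW()
FROM DUAL
WHERE NOT EXISTS (
    SELECT 1 FROM UserTable
    WHERE User_ID = '$current_user_id' AND Item = '$value'
);
```

Run once per checked $value; rows that already exist are simply skipped. Alternatively, a UNIQUE key on (User_ID, Item) together with INSERT IGNORE achieves the same effect.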
So if one user can be associated with multiple items, it seems that you should normalize this and have probably three tables - one for users, one for items, and one many-to-many table relating users to items.
