I have a script for sending emails in the background. The script runs in parallel to send out multiple emails simultaneously. It works basically like this, with a mixture of MySQL and PHP:
/* TransmissionId is a PRIMARY KEY */
/* StatusId is a FOREIGN KEY */
/* Token is UNIQUE */
/* Pick a queued (StatusId=1) transmission and set it to pending (StatusId=2) */
/* This is a trick to both update a row and store its id for later retrieval in one query */
SET @Ids = 0;
UPDATE transmission
SET StatusId=IF(@Ids := TransmissionId,2,2), LatestStatusChangeDate=NOW()
WHERE StatusId = 1
ORDER BY TransmissionId ASC
LIMIT 1;
/* Fetch the id of the picked transmission */
$Id = SELECT @Ids;
try {
/* Fetch the email and try to send it */
$Email = FetchEmail($Id);
$Email->Send();
/* Set the status to sent (StatusId=3) */
$StatusId = 3;
} catch(Exception $E) {
/* The email could not be sent, set the status to failed (StatusId=4) */
$StatusId = 4;
} finally {
/* Save the new transmission status */
UPDATE transmission
SET StatusId=$StatusId, LatestStatusChangeDate=NOW(), Token='foobar'
WHERE TransmissionId = $Id;
}
The issue is that I sometimes get a deadlock: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction. This has happened when executing the last query. I've not seen it happen when executing the first query. Can anyone understand how a deadlock can happen in this case? Could it be that the first query and the last query lock StatusId and TransmissionId in opposite order? But I don't think the first query needs to lock TransmissionId, nor do I think the last query needs to lock StatusId. How can I find this out, and how can I fix it?
Edit
There is another query that might also play a role. Whenever someone opens the email, this query is run:
/* Selector is UNIQUE */
UPDATE transmission SET
OpenCount=OpenCount+1
WHERE Selector = 'barfoo'
InnoDB uses automatic row-level locking. You can get deadlocks even in the case of transactions that just insert or delete a single row. That is because these operations are not really “atomic”; they automatically set locks on the (possibly several) index records of the row inserted or deleted. dev.mysql.com/doc/refman/5.7/en/innodb-deadlocks-handling.html
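The error message itself says to try restarting the transaction, and the manual page quoted above is about handling deadlocks rather than eliminating them. A minimal sketch of that retry idea in PHP follows. The helper name `retryOnDeadlock`, the attempt count, and the delay are assumptions for illustration and are not part of the original script. It also assumes PDO is in exception mode, so that the MySQL error code 1213 shows up in `PDOException::$errorInfo`.

```php
<?php
/**
 * Run $operation and retry it when MySQL reports a deadlock (error 1213).
 * Hypothetical helper: the name, attempt count and delay are assumptions.
 */
function retryOnDeadlock(callable $operation, int $maxAttempts = 3, int $delayMicroseconds = 100000)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (PDOException $e) {
            // errorInfo is [SQLSTATE, driver error code, driver message] when the driver sets it
            $driverCode = (int) ($e->errorInfo[1] ?? 0);
            if ($driverCode !== 1213 || $attempt >= $maxAttempts) {
                throw $e; // not a deadlock, or no attempts left
            }
            usleep($delayMicroseconds * $attempt); // short, growing pause before the next try
        }
    }
}
```

The final UPDATE from the question (the one that reports the deadlock) would go inside the closure passed to `retryOnDeadlock`. This only works around the symptom; finding which locks collide still requires looking at the output of SHOW ENGINE INNODB STATUS.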
Can two Laravel workers use the same DB transaction?
I have Job Process A which will call/dispatch Job Process B if there is data in table A with flag is_processed = 0. What it does is:
-- first select data with lock
SELECT *
FROM tableA
WHERE is_proccesed = 0
LIMIT 1000
FOR UPDATE OF tableA SKIP LOCKED
-- insert data to tableB
INSERT tableB VALUES SELECT values from tableA
-- update data
UPDATE tableA SET is_proccesed = 1 where id = (from any id i have select)
Then trigger job process B:
ProcessB::dispatch(from any id i have select as string)->onQueue('queueA');
I have Job Process B which will be triggered by Job Process A or cron which works every minute.
--first select data with lock
SELECT *
FROM tableB
WHERE is_proccesed = 0 AND id in (parameter get from job A if any)
LIMIT 1000
FOR UPDATE OF tableB SKIP LOCKED
-- call API with parameter value is from tableB
-- update data
If (call API is success) then:
UPDATE tableB SET is_proccesed = 1 where id = (from any id i have select)
if (call API is fail) then:
UPDATE tableB SET is_proccesed = 0 where id = (from any id i have select)
I have a cron job running every minute that dispatches Job Process A if any row in table A has is_processed = 0.
I have another cron job running every minute that dispatches Job Process B if any row in table B has is_processed = 0.
I use supervisor to do this in real time and use max-retry for jobs that fail 3 times.
My problem is:
Job Process B sometimes calls the API twice for the same data.
Scrolling through my logs, I found that the SELECT returned the same rows to 2 different processes at the same time (in some cases with 2000 or more rows to process).
It does not always happen when only a small amount of data is processed.
My question is:
Does selecting data with a lock not work with queue jobs?
Is it correct to create a cron job that manually re-dispatches unsuccessful data, or should I rely only on failed-job handling to rework jobs?
I have not seen many Web languages that use database locks correctly. Without looking at the Laravel code, I would guess that it does not use database locks correctly for jobs. I know that it does not use locks for migrate. Running migrate from >2 web nodes is not safe.
If you use Redis or some other technology for jobs instead of a SQL database, a lot of concurrency problems will probably go away.
Manage your own global lock
You can manage your own lock and add synchronization between your own processes.
$results = \DB::select('SELECT GET_LOCK("process-b", 120) as obtain_lock');
if (!$results[0]->obtain_lock) { return 0; }
//120 is seconds to wait for lock or fail
//load one record
//call API
//update one record
//free lock
$results = \DB::select('SELECT RELEASE_LOCK("process-b") as release_lock');
if (!$results[0]->release_lock) { return -1; } //couldn't release lock, stop process, free mysql connection
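In PostgreSQL they are called "advisory locks", and the lock key has to be an integer rather than a string.
\DB::select('SELECT pg_advisory_lock(1337)');
//pg_advisory_lock waits until the lock is granted and returns nothing, so there is no result to check
//(pg_try_advisory_lock returns true/false immediately instead of waiting)
//load one record
//call API
//update one record
//free lock
$results = \DB::select('SELECT pg_advisory_unlock(1337) as released');
if (!$results[0]->released) { return -1; } //pg_advisory_unlock returns false if this session did not hold the lock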
Use "SELECT ... FOR UPDATE"
I'm not sure if you are trying to use FOR UPDATE locks and it is not working, or you are skipping the lock with intention.
A FOR UPDATE lock is only held until the end of the current transaction, so you need to either turn off autocommit (SET autocommit=0) or start a transaction explicitly.
\DB::transaction(function () use ($id) {
    //lock the row for the duration of the transaction
    $results = \DB::table('table_b')->where('id', $id)->lockForUpdate()->get();
    \DB::table('table_b')->where('id', $id)->update(['x' => 'y']); //placeholder column and value
});
Where ProcA sends jobs to ProcB, you can make 1 ProcB job for each ID that is processed=0 - OR - you can make 1 ProcB job whenever you find any processed=0 records.
So, if ProcB will only work with 1 record ID, then global lock solution is probably not good.
You can check that your lock for update is working by putting sleep() and creating 10-20 ProcB jobs with the same record ID. If you sleep for 3 seconds, and it takes 30-60 seconds to finish all ProcB jobs, then the lock for update is working properly. If they all finish in 3 seconds, then they are not respecting the lock on the record.
Bonus
Add this to your routes/console.php to get concurrent-safe artisan lockingmigrate command
$signature = 'lockingmigrate {--database= : The database connection to use}
{--force : Force the operation to run when in production}
{--path=* : The path(s) to the migrations files to be executed}
{--realpath : Indicate any provided migration file paths are pre-resolved absolute paths}
{--pretend : Dump the SQL queries that would be run}
{--seed : Indicates if the seed task should be re-run}
{--step : Force the migrations to be run so they can be rolled back individually}';
Artisan::command($signature, function ($database=false, $seed=false, $step=false, $pretend=false, $force=false) {
$results = \DB::select('SELECT GET_LOCK("artisan-migrate", 120) as migrate');
if (!$results[0]->migrate) { return -1; }
$params = [
'--pretend' => $pretend,
'--force' => $force,
'--step' => $step,
'--seed' => $seed,
];
$retval = Artisan::call('migrate', $params);
$outputLines = explode("\n", trim(\Artisan::output()));
dump($outputLines);
\DB::select('SELECT RELEASE_LOCK("artisan-migrate")');
return $retval;
})->describe('Concurrent-safe migrate');
I'm developing a webapp that shares a database with an in-production desktop app (i.e. I cannot modify the database, only try to mimic its behavior). The module I'm working on now stores notes in this database's notes table. I got it to work: I added notes and they showed up in the desktop app. After some time, though, I realized the notes' text and descriptions were being overwritten. Looking at the rows in the database, I noticed the modified_by user was set, which told me there had been a duplicate key on insert, followed later by an update. The primary key for this table is set to auto-increment, so I was very confused. After some digging I found a table called counters with a column called notes, holding a count that matched the current index of the notes table. Before simply adding 1 to the counter on every insert, I installed Wireshark on the db server, recorded the traffic on the db port, and found this:
(Procedure when adding a note from desktop app)
UPDATE counters SET in_use = 'Y';
SELECT notes FROM counters WHERE key_col = 1;
/* Desktop app uses current count for new index */
UPDATE counters SET notes = /* current count +1 */ WHERE key_col = 1;
UPDATE counters SET in_use = 'N';
/* ...Inserts new note here with explicit ID = current count ... */
Now I'm even more confused. Why set the table to auto-increment at all? Second, there was never any checking of in_use before selecting the count and adding one... so what's the point of in_use? Couldn't this code lead to overwrites if two users inserted at the same time? Wouldn't the correct way to do this be to lock the counters table for every operation? I could try this, but I'm not sure how the desktop app will handle encountering a lock (based on experience - fatal error).
Aside from exactly duplicating this procedure and hoping for the best, I'm not exactly sure where to go from here. One thought is to:
<?php
const MAX_ATTEMPTS = 3;
$curKey = null;
for($i = 0; $i < MAX_ATTEMPTS; $i++){
/*
SELECT in_use, notes from counters where key_col = 1;
...
*/
if( 'N' === $result['in_use'] ){
$curKey = $result['notes'];
/* INSERT count here - $curKey++ */
break;
}
/* Sleep for .25 seconds to allow for current operation to finish */
usleep(250000);
}
if( null === $curKey ){
throw new \Exception('Could not insert note because counter table locked after '. MAX_ATTEMPTS .' attempts');
}
/* INSERT note code here... */
This seems ok, but could still possibly overwrite because a) time between select count and insert new count b) Desktop app does not seem to do any checking.
Any thoughts/suggestions?
EDIT: Made a stored procedure to do checking during select and insert.
DELIMITER $$
CREATE DEFINER=`testUser`@`%` FUNCTION `getNextNoteIndex`(appKey INTEGER) RETURNS int(11)
BEGIN
SELECT IF(`in_use` = 'N', `notes`, NULL) INTO @curIndex FROM `counters` WHERE `app_key` = appKey;
IF @curIndex IS NOT NULL
THEN
SET @newIndex = @curIndex + 1;
UPDATE `counters` SET `notes` = @newIndex WHERE `app_key` = appKey AND `in_use` = 'N' AND `notes` = @curIndex;
IF ROW_COUNT() = 1
THEN
RETURN @newIndex;
END IF;
END IF;
RETURN NULL;
END$$
DELIMITER ;
Usage:
SELECT testDB.getNextNoteIndex(1) AS $index;
I do not know why they would create a separate table to do the auto-incrementing; it doesn't sound like a standard solution.
I'm confused as to what you can and cannot change (db, backend code, etc):
If you're on the inside, can you ask the developer who built that intermediate incrementing table what it is for, and potentially get clarity there or bypass it altogether?
If you're on the outside, does it make sense to ask them for an API and use the endpoints they give you? Then any problems that arise from overwriting fall in their court.
If I've got a database of users that have filled out a form, can I use a cron job to send an automated email? If so, what is the best way to "loop" it so that it sends the email once to each user?
$data = mysql_query("
SELECT *
FROM completed
WHERE
followupsent='0000-00-00 00:00:00'
AND valuesent + INTERVAL 4 DAY <= NOW()
")
or die(mysql_error());
while($info = mysql_fetch_array( $data ))
{
}
This checks to see if "followupsent" has been updated already as it updates with NOW() when it sends and also checks to see how many days since the value was sent.
I'm worried that putting the email-sending code inside the while loop will run it for every row and end up sending a ton of emails.
Would using an if instead of a while work:
if($info = mysql_fetch_array( $data ))
{
}
In order to send out to the first in the database and then let the cron job handle the rest by checking every minute which one is next?
That's a perfectly fine way to do it. It will only send once per record (assuming you have no duplicates in your completed table).
I assume you are updating the followupsent field in the loop.
I suggest:
You ensure that your completed table uses a transactional storage engine, e.g. InnoDB:
ALTER TABLE completed ENGINE=InnoDB;
You define a new BOOLEAN column that indicates (to any other database connections) that a given record is in the process of being updated:
ALTER TABLE completed ADD COLUMN updating BOOLEAN NOT NULL DEFAULT FALSE;
You use a locking read to SELECT the records that are to be emailed, followed within the same transaction by an UPDATE to the newly created column. For example, using PDO:
$dbh->beginTransaction();
$select = $dbh->query('
SELECT *
FROM completed
WHERE followupsent IS NULL
AND valuesent <= CURRENT_TIMESTAMP - INTERVAL 4 DAY
AND NOT updating
FOR UPDATE
');
$dbh->exec('
UPDATE completed
SET updating = TRUE
WHERE followupsent IS NULL
AND valuesent <= CURRENT_TIMESTAMP - INTERVAL 4 DAY
AND NOT updating
');
$dbh->commit();
You can then update the database (removing the updating flag, together with whatever other status you require) as each email is sent:
$success = $dbh->prepare('
UPDATE completed
SET followupsent = CURRENT_TIMESTAMP,
updating = FALSE
WHERE id = ?
');
$failure = $dbh->prepare('
UPDATE completed
SET updating = FALSE
WHERE id = ?
');
foreach ($select as $row) {
$command = mail(...) ? $success : $failure;
$command->execute(array($row['id']));
}
Note that if the script terminates prematurely, the database may be left in an undesirable state—i.e. records may have updating=TRUE but there is no longer any script that is processing them; this could lead to some records not being emailed at all. You may want to guard against this eventuality by registering a custom shutdown function, or else by periodically inspecting the database for such "orphaned" records (however you must of course be sure that they are not currently being processed by a running script).
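The periodic inspection for "orphaned" records mentioned above could look roughly like the sketch below. It assumes a hypothetical `updating_since` DATETIME column, which is not part of the answer's schema, that is set whenever `updating` is switched to TRUE. The function name and the 15-minute default are likewise illustrative. Rows claimed longer ago than the threshold are assumed to belong to a script that died, so the threshold has to be comfortably longer than any real run.

```php
<?php
/**
 * Clear the `updating` flag on rows that were claimed too long ago.
 * Assumes a hypothetical `updating_since` column (not in the answer above)
 * that is filled in whenever `updating` is set to TRUE.
 */
function releaseOrphanedRows(PDO $dbh, int $maxAgeSeconds = 900): int
{
    // Cutoff is computed in PHP; this assumes PHP and the database use the same time zone.
    $cutoff = date('Y-m-d H:i:s', time() - $maxAgeSeconds);

    $stmt = $dbh->prepare('
        UPDATE completed
        SET    updating = 0, updating_since = NULL
        WHERE  updating = 1
          AND  followupsent IS NULL
          AND  updating_since < ?
    ');
    $stmt->execute(array($cutoff));

    return $stmt->rowCount(); // number of rows handed back to the queue
}
```

A cron job could call `releaseOrphanedRows($dbh)` before selecting the next batch, so that abandoned records become eligible for emailing again.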
When I insert a new record into one table (work_log), I update a record in another table (employers) with the last inserted record from work_log.
But if updating employers-table doesn't succeed, after successfully inserted the new record into work_log, I need to remove the newly added record to work_log since that entry would not be valid anymore.
Here's my script so far:
/**
* This first part has no direct effect on the question, but serves as additional information to understand the script better.
* - - -
* First, a new work session is inserted (this session has nothing to do with browser session)
* If this fails, the script does not continue, and the user is redirected back to the form with an error-message.
* otherwise, the script continues, and try to activate the session by adding a new work_log entry.
*/
$ins_session = $con['site']->prepare('INSERT INTO work_sessions (fields) VALUES (?)');
$ins_session->execute(array(values));
if($ins_session){
KD::notice('success','New work session is created.');
$session_id = $con['site']->lastInsertId();
} else {
KD::notice('error','Work session was not created.');
KD::redirect(); // stops the script, and redirects
}
/**
* This part affects my question
* - - -
* Add a new entry to the work log in order to automatically start the work session.
* If this entry is successfully inserted, then add an indicator to the corresponding employer, in the employers table, to indicate that this employer has an active session (and which one it is).
*/
$ins_work_log = $con['site']->prepare('INSERT INTO work_log (fields) VALUES (?)');
$ins_work_log->execute(array(values));
if($ins_work_log){
$upd_employer = $con['site']->prepare('UPDATE employers SET fk_work_sessions_id = ? WHERE id = ?');
$upd_employer->execute(array($session_id,$_POST['employer_id']));
if($upd_employer){
KD::notice('success','New session was created and started.');
KD::redirect();
} else {
// need to remove the entry from work_log.
KD::notice('Work session was created, but not started. Please start the session manually.');
}
}
To my understanding, I have to delete the last inserted record in the work_log-table?
Is there any other way to do this? like, in another order, or to automatically remove the entry from work_log if this (the update query) fails?
The work_log-table is innoDB, and row format is compact if that is important to know...
UPDATE
I've set it up like this:
It seems to work, but I'm a bit unsure if I'm using it correctly regarding the if/else statements.
$con['site']->beginTransaction();
$ins_work_log = $con['site']->prepare('INSERT INTO work_log (fields) VALUES (?)');
$ins_work_log->execute(array(values));
if($ins_work_log){
# update employer
$upd_employer = $con['site']->prepare('UPDATE employers SET fk_work_sessions_id = ? WHERE id = ?');
$upd_employer->execute(array($session_id,$_POST['employer_id']));
if($upd_employer){
$con['site']->commit();
KD::notice('success','New session was created and started.');
} else {
$con['site']->rollBack();
KD::notice('error','Work session was created, but not started. Please start the session manually.');
}
//
} else {
$con['site']->rollBack();
KD::notice('error','');
}
KD::redirect();
Will if($ins_work_log) and if($upd_employer) have any effect when the query hasn't been committed yet?
This is a classic case for using START TRANSACTION, COMMIT, and ROLLBACK.
http://dev.mysql.com/doc/refman/5.0/en/commit.html
Just make sure you are using a database engine that supports it.
Pseudocode:
query("START TRANSACTION;");
query("INSERT INTO table1 ...");
if (query("INSERT INTO table2 ..."))
query("COMMIT;");
else
query("ROLLBACK;");
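With PDO, the same idea might be sketched as below. Note that in the question's code `if($ins_work_log)` tests the statement object returned by `prepare()`, not the outcome of `execute()`, so it does not detect a failed query. This sketch relies on exceptions instead. The function name and the `session_id` column are assumptions, since the question elides the real field list. The `rowCount()` check is an extra assumption about what counts as failure, and MySQL by default counts changed rows rather than matched rows.

```php
<?php
/**
 * Insert a work_log row and point the employer at the session, or do neither.
 * Hypothetical sketch: the session_id column stands in for the elided fields.
 */
function startWorkSession(PDO $pdo, int $sessionId, int $employerId): bool
{
    // Make failed queries throw instead of returning false
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->beginTransaction();
    try {
        $insert = $pdo->prepare('INSERT INTO work_log (session_id) VALUES (?)');
        $insert->execute(array($sessionId));

        $update = $pdo->prepare('UPDATE employers SET fk_work_sessions_id = ? WHERE id = ?');
        $update->execute(array($sessionId, $employerId));

        // An UPDATE can succeed without touching any row; treat that as a failure here
        if ($update->rowCount() !== 1) {
            throw new RuntimeException('Employer was not updated');
        }

        $pdo->commit();
        return true;
    } catch (Throwable $e) {
        $pdo->rollBack(); // also discards the work_log insert
        return false;
    }
}
```

Because both statements run inside one transaction, there is no need to delete the work_log row by hand when the employer update fails.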
The High Level Idea:
I have a microcontroller that can connect to my site via an HTTP request. I want to feed the device a response as soon as a change is noted in the database.
Because the end device is a client (i.e. a microcontroller), I'm unaware of a method to pass data to it without setting up port forwarding, which is heavily undesired. The problem arises when trying to send data from an external network to an internal one: either (A) use port forwarding, or (B) have the client device initiate the request. That leads me to the idea of having the device send an HTTP request to a file that polls for changes.
Update:
Many thanks to Ollie Jones. I have implemented some of his suggestions here.
Jason McCreary suggested having a modified column, which is a big improvement as it should increase speed and reliability. Great suggestion! :)
If overworking the database is a concern in this example, maybe the following would work: when the data is inserted into the database, the changes are also written to a file, and the loop continuously checks that file for an update. Thoughts?
I have table1 and I want to see if a specific row (based on a UID/key) has been updated since the last time I checked, as well as continuously check for 60 seconds whether the record gets updated.
I'm thinking i can do this using the INFORMATION_SCHEMA database.
This database contains information about tables, views, columns, etc.
attempt at a solution:
<?php
$timer = time() + (60);//add 60 seconds
$KEY=$_POST['KEY'];
$done=0;
if(isset($KEY)){
//login stuff
require_once('Connections/check.php');
$mysqli = mysqli_connect($hostname_check, $username_check, $password_check,$database_check);
if (mysqli_connect_errno($mysqli))
{ echo "Failed to connect to MySQL: " . mysqli_connect_error(); }
//end login
$query = "SELECT data1, data2
FROM station
WHERE client = $KEY
AND noted = 0;";
$update=" UPDATE station
SET noted=1
WHERE client = $KEY
AND noted = 0;";
while($done==0) {
$result = mysqli_query($mysqli, $query);
$update = mysqli_query($mysqli, $update);
$row_cnt = mysqli_num_rows($result);
if ($row_cnt > 0) {
$row = mysqli_fetch_array($result);
echo 'data1:'.$row['data1'].'/';
echo 'data2:'.$row['data2'].'/';
print $row[0];
$done=1;
}
else {
$current = time();
if($timer > $current){ $done=0; sleep(1); } //no update yet, so loop back and check again until 60 seconds have passed
else { $done=1; echo 'done:nochange';}//60seconds pass end loop
}}
mysqli_close($mysqli);
echo 'time:'.time();
}
else {echo 'error:nokey';}
?>
Is this an adequate method? Any suggestions to improve the speed and reliability?
If I understand your application correctly, your client is a microcontroller. It issues an HTTP request to your php / mysql web app once in a while. The frequency of that request is up to the microcontroller, but seems to be once a minute or so.
The request basically asks, "dude, got anything new for me?"
Your web app needs to send the answer, "not now" or "here's what I have."
Another part of your app is providing the information in question. And it's doing so asynchronously with your microcontroller (that is, whenever it wants to).
To make the microcontroller query efficient is your present objective.
(Note, if I have any of these assumptions wrong, please correct me.)
Your table will need a last_update column, a microcontroller_id column or the equivalent, and a notified column. Just for grins, let's also put in value1 and value2 columns. You haven't told us what kind of data you're keeping in the table.
Your software which updates the table needs to do this:
UPDATE theTable
SET notified=0, last_update = now(),
value1=?data,
value2=?data
WHERE microcontroller_id = ?microid
It can do this as often as it needs to. The new data values replace and overwrite the old ones.
Your software which handles the microcontroller request needs to do this sequence of queries:
START TRANSACTION;
SELECT value1, value2
FROM theTable
WHERE notified = 0
AND microcontroller_id = ?microid
FOR UPDATE;
UPDATE theTable
SET notified=1
WHERE microcontroller_id = ?microid;
COMMIT;
This will retrieve the latest value1 and value2 items (your application's data, whatever it is) from the database, if it has been updated since last queried. Your php program which handles that request from the microcontroller can respond with that data.
If the SELECT statement returns no rows, your php code responds to the microcontroller with "no changes."
This all assumes microcontroller_id is a unique key. If it isn't, you can still do this, but it's a little more complicated.
Notice we didn't use last_update in this example. We just used the notified flag.
If you want to wait until sixty seconds after the last update, it's possible to do that. That is, if you want to wait until value1 and value2 stop changing, you could do this instead.
START TRANSACTION;
SELECT value1, value2
FROM theTable
WHERE notified = 0
AND last_update <= NOW() - INTERVAL 60 SECOND
AND microcontroller_id = ?microid
FOR UPDATE;
UPDATE theTable
SET notified=1
WHERE microcontroller_id = ?microid;
COMMIT;
For these queries to be efficient, you'll need this index:
(microcontroller_id, notified, last_update)
In this design, you don't need to have your PHP code poll the database in a loop. Rather, you query the database when your microcontroller checks in for an update.
If all table1 changes are handled by PHP, then there's no reason to poll the database. Add the logic you need at the PHP level when you're updating table1.
For example (assuming OOP):
public function update() {
if ($row->modified > (time() - 60)) {
// perform code for modified in last 60 seconds
}
// run mysql queries
}