I am writing an application that tracks financial transactions (as in a bank) and maintains a balance for each account. To keep performance in check (and avoid calculating the balance at runtime), I am denormalizing, as discussed Here and Here.
Now I am facing a race condition: if two people perform a transaction against the same entity simultaneously, the balance calculation described above can return/set inconsistent data, as discussed Here and Here, and also as suggested in the answers.
So I am going for MySQL transactions.
Now my questions are:
What happens to other, similar queries while a MySQL transaction is underway?
I wish to know whether the other transactions fail (as in an Error 500) or whether they are queued and executed once the first transaction finishes.
I also need to know how to deal with either outcome from the PHP point of view.
And since these transactions will be part of a larger operation in PHP with many prior INSERT queries, should I also devise a mechanism to roll back those already-executed queries, since I want atomicity not only for individual queries but for the whole operation logic (PHP)?
Edit 1:
Also, if the former is the case, should I check for the error, wait a few seconds, and then retry that specific transaction query?
Edit 2:
Also, MySQL triggers are not an option for me.
With code like this, there is no race condition. Instead, one transaction could be aborted (ROLLBACK'd).
BEGIN;
SELECT balance FROM Accounts WHERE acct_id = 123 FOR UPDATE;
-- if balance < 100, then ROLLBACK and exit with "insufficient funds"
UPDATE Accounts SET balance = balance - 100 WHERE acct_id = 123;
UPDATE Accounts SET balance = balance + 100 WHERE acct_id = 456;
COMMIT;
And check for errors at each step. If there is an error, ROLLBACK and rerun the transaction. The second time through it will probably succeed; if not, abort -- it is probably a logic bug. Only then should you return HTTP error 500.
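In PHP with PDO, that check-and-retry logic could look roughly like this. A minimal sketch under my own assumptions (the function name, account ids, and two-attempt limit are mine; the original answer only gives the SQL; $pdo is assumed to be in exception mode):

<?php
// Sketch: retry wrapper around the transfer pseudocode above.
function transfer(PDO $pdo, int $from, int $to, int $amount): void
{
    for ($attempt = 1; $attempt <= 2; $attempt++) {
        try {
            $pdo->beginTransaction();
            $stmt = $pdo->prepare(
                "SELECT balance FROM Accounts WHERE acct_id = ? FOR UPDATE");
            $stmt->execute([$from]);
            if ($stmt->fetchColumn() < $amount) {
                $pdo->rollBack();
                throw new RuntimeException("insufficient funds"); // not retried
            }
            $pdo->prepare("UPDATE Accounts SET balance = balance - ? WHERE acct_id = ?")
                ->execute([$amount, $from]);
            $pdo->prepare("UPDATE Accounts SET balance = balance + ? WHERE acct_id = ?")
                ->execute([$amount, $to]);
            $pdo->commit();
            return; // success
        } catch (PDOException $e) {
            if ($pdo->inTransaction()) {
                $pdo->rollBack();
            }
            if ($attempt === 2) {
                throw $e; // failed twice: probably a logic bug, give the 500 here
            }
            // first failure (deadlock / lock wait): fall through and rerun
        }
    }
}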
When two users "simultaneously" try to do similar transactions, then one of these things will happen:
The 'second' user will be stalled until the first finishes.
If that stall lasts longer than innodb_lock_wait_timeout, your queries are too slow or something else is wrong; you need to fix the system.
If you get a "Deadlock", there may be ways to repair the code. Meanwhile, simply restarting the transaction is likely to succeed.
But it will not mess up the data (assuming the logic is correct).
There is no need to "wait a second" -- unless you have transactions that take "a second". Such would be terribly slow for this type of code.
What I am saying works for "real money", non-real money, non-money, etc.; whatever you need to tally carefully.
Related
I'm trying to create an application that can sell gift cards, but I'm worried about data consistency, so let me explain what I mean:
I have 3 tables
transactions(id, created_date),
cards(id, name) and
vouchers(id, code, card_id, transaction_id).
The database contains many cards and vouchers and each voucher belongs to one card.
The user will want to select a card to buy and choose a quantity.
So, the application will select vouchers for the chosen card with a LIMIT matching the quantity, then create a new transaction and write its transaction_id into the selected vouchers to flag them as bought in this new transaction.
What I am afraid of is: what if multiple users send the same buying request for the same card at the exact same time? Will some data collision occur, and what is the best approach to prevent it?
I am currently using MySQL 8 Community Edition with InnoDB engine for all tables.
What I am looking for is some kind of queue that runs the transactions one by one without collisions, even if it means users have to wait for their turn in the queue.
Thanks in advance
This is a job for MySQL transactions. With SQL you can do something like this.
START TRANSACTION;
SELECT id AS card_id FROM cards WHERE name = <<<<chosen name>>>> FOR UPDATE;
-- do whatever it takes in SQL to complete the operation
COMMIT;
Multiple PHP processes may try to do something with the same row of cards concurrently. Using START TRANSACTION combined with SELECT ... FOR UPDATE will prevent that by making the next process wait until the first process does COMMIT.
Internally, MySQL keeps a queue of waiting processes. As long as you COMMIT promptly after START TRANSACTION, your users won't notice this. At the application level this usually isn't called "queuing" but rather transaction consistency.
Laravel / Eloquent makes this super easy. It does the START TRANSACTION and COMMIT for you, and offers lockForUpdate().
DB::transaction(function () use ($chosenName) {
    // lockForUpdate() adds FOR UPDATE to the query, holding the row
    // lock until the surrounding transaction commits.
    DB::table('cards')->where('name', '=', $chosenName)->lockForUpdate()->get();
    /* do whatever you need to do for your transaction */
});
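If you're not on Laravel, the same pattern in plain PDO might look like the sketch below. The $pdo connection, $chosenName, and the elided voucher-claiming steps are my assumptions based on the question's schema:

$pdo->beginTransaction();
try {
    // Lock the chosen card row; a second buyer of the same card
    // waits on this statement until we COMMIT.
    $stmt = $pdo->prepare("SELECT id FROM cards WHERE name = ? FOR UPDATE");
    $stmt->execute([$chosenName]);
    $cardId = $stmt->fetchColumn();

    // ... claim the requested quantity of unsold vouchers for $cardId,
    // insert the transactions row, and set transaction_id on the vouchers ...

    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack();
    throw $e;
}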
I have a table called Settings which has only one row. The settings are critical everywhere in my program, and the row is read 200 to 300 times every second. I haven't used any caching yet. The problem is that I cannot update the Settings table from an API, e.g. to change a value like Limit from 5 to 10.
For example, changing Limit Products from 5 to 10: the UPDATE query runs forever.
From Workbench I can update the record, but from the admin panel through the API it doesn't update or takes far too long. The table uses InnoDB.
1. Already tried locking with read/write locks.
2. Tried transactions.
3. Made a view of the table and tried to update through it; the same issue remains.
4. The UPDATE query is fine from Workbench, but through the API it runs all day.
Is there any way I can block the read operations on the table while I update it? There is only one row in the table.
Any help would be highly appreciated. Thanks in advance.
This sounds like a really good use case for the query cache. (Note: the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so the following applies to older versions only.)
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.
The query cache can be useful in an environment where you have tables that do not change very often and for which the server receives many identical queries.
To enable the query cache, you can run:
SET GLOBAL query_cache_size = 1000000;
And then edit your mysql config file (typically /etc/my.cnf or /etc/mysql/my.cnf):
query_cache_size=1000000
query_cache_type=2
query_cache_limit=100000
And then for your query you can change it to:
SELECT SQL_CACHE * FROM your_table;
And that should make it so you are able to update the table (as it won't be constantly locked).
You would need to restart the server for the config file changes to take effect.
As an alternative, you could implement caching in your PHP application. I would use something like memcached, but as a very simplistic solution you could do something like:
// Read the cached settings; fall back to an empty array if the file is missing.
$settings = is_file("/path/to/settings.json")
    ? json_decode(file_get_contents("/path/to/settings.json"), true)
    : [];
$minute = intval(date('i'));
// Refresh from MySQL at most once a minute (also when the cache is missing).
if (!isset($settings['minute']) || $settings['minute'] !== $minute) {
    $settings = get_settings_from_mysql();
    $settings['minute'] = $minute;
    file_put_contents("/path/to/settings.json", json_encode($settings), LOCK_EX);
}
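And with the Memcached extension the same idea looks roughly like this. A sketch only: the key name and the 60-second TTL are arbitrary choices of mine, and get_settings_from_mysql() is the helper from the snippet above:

// Cache the one-row Settings table for up to 60 seconds so readers
// stop hammering the table between changes.
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

$settings = $cache->get('settings');
if ($settings === false) {                  // cache miss or expired
    $settings = get_settings_from_mysql();  // helper from the snippet above
    $cache->set('settings', $settings, 60); // expiration in seconds
}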
Are the queries being run in the context of transactions, say with the REPEATABLE READ transaction isolation level? It sounds like the update isn't able to complete due to a lock on the table, in which case caching isn't likely to help you, as the cache will be purged on a write anyway. More information on REPEATABLE READ can be found at https://www.percona.com/blog/2012/08/28/differences-between-read-committed-and-repeatable-read-transaction-isolation-levels/.
The PHP Documentation says:
If you've never encountered transactions before, they offer 4 major features: Atomicity, Consistency, Isolation and Durability (ACID). In layman's terms, any work carried out in a transaction, even if it is carried out in stages, is guaranteed to be applied to the database safely, and without interference from other connections, when it is committed.
QUESTION:
Does this mean that I can have two separate php scripts running transactions simultaneously without them interfering with one another?
ELABORATING ON WHAT I MEAN BY "INTERFERING":
Imagine we have the following employees table:
+------+--------+----------+
| id   | name   | salary   |
+------+--------+----------+
|    1 | ana    |    10000 |
+------+--------+----------+
If I have two scripts with similar/same code and they run at the exact same time:
script1.php and script2.php (both have the same code):
$conn->beginTransaction();
$stmt = $conn->prepare("SELECT * FROM employees WHERE name = ?");
$stmt->execute(['ana']);
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$salary = $row['salary'];
$salary = $salary + 1000; // increasing salary
$stmt = $conn->prepare("UPDATE employees SET salary = ? WHERE name = ?");
$stmt->execute([$salary, 'ana']);
$conn->commit();
and assuming the sequence of events is as follows:
script1.php selects data
script2.php selects data
script1.php updates data
script2.php updates data
script1.php commit() happens
script2.php commit() happens
What would the resulting salary of ana be in this case?
Would it be 11000? And would this then mean that one transaction overlapped the other because the information was read before either commit happened?
Would it be 12000? And would that mean that regardless of the order in which data was selected and updated, the commit() calls forced the transactions to happen one at a time?
Please feel free to elaborate as much as you want on how transactions and separate scripts can interfere (or don't interfere) with one another.
You are not going to find the answer in the PHP documentation, because this has nothing to do with PHP or PDO.
The InnoDB engine in MySQL offers 4 isolation levels, in line with the SQL standard. The isolation levels, in conjunction with blocking vs. non-blocking reads, determine the result of the above example. You need to understand the implications of the various isolation levels and choose the appropriate one for your needs.
To sum up: if you use the SERIALIZABLE isolation level with autocommit turned off, the result will be 12000. In all other isolation levels, and in SERIALIZABLE with autocommit enabled, the result will be 11000. If you start using locking reads, the result can be 12000 under all isolation levels.
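For illustration, here is how you might request SERIALIZABLE from PDO before starting the transaction. A sketch only, reusing the question's $conn; note that a plain SET TRANSACTION affects just the next transaction on that connection:

// Request SERIALIZABLE for the next transaction on this connection.
$conn->exec("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
$conn->beginTransaction();
// ... the SELECT and UPDATE from the question ...
$conn->commit();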
Judging by the given conditions (a solitary DML statement), you don't need a transaction here, but a table lock. It's a very common confusion.
You need a transaction if you need to make sure that ALL your DML statements were performed correctly or weren't performed at all.
That means:
you don't need a transaction for any number of SELECT queries
you don't need a transaction if only one DML statement is performed
Although, as noted in the excellent answer from Shadow, you may use a transaction here with an appropriate isolation level, it would be rather confusing. What you need here is locking. The InnoDB engine lets you lock particular rows instead of locking the entire table, and that should be preferred.
In case you want the salary to be 12000 - then use locks.
Or - a simpler way - just run an atomic update query:
UPDATE employees SET salary = salary + 1000 WHERE name = ?
In this case no increment will be lost.
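In PDO this is a single statement, with no explicit transaction needed (a minimal sketch, reusing the question's $conn):

// The read-modify-write happens inside the one UPDATE statement,
// so concurrent runs cannot lose an increment.
$conn->prepare("UPDATE employees SET salary = salary + 1000 WHERE name = ?")
     ->execute(['ana']);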
If your goal is different, better express it explicitly.
But again: you have to understand that transactions in general have nothing to do with separate script executions. Regarding your topic of race conditions, you are interested not in transactions but in table/row locking. This is a very common confusion, and you'd better learn the distinction straight:
a transaction is there to ensure that a set of DML queries within one script is executed successfully as a whole.
table/row locking is there to ensure that other script executions won't interfere.
The only place where transactions and locking meet is a deadlock, but again - that happens only when a transaction uses locking.
Alas, the "without interference" needs some help from the programmer. It needs BEGIN and COMMIT to define the extent of the 'transaction'. And...
Your example is inadequate. The first statement needs to be SELECT ... FOR UPDATE. This tells the transaction processing that there is likely to be an UPDATE coming for the row(s) that the SELECT fetches; that warning is critical to "preventing interference". Now the timeline reads:
script1.php BEGINs
script2.php BEGINs
script1.php selects data (FOR UPDATE)
script2.php selects data is blocked, so it waits
script1.php updates data
script1.php commit() happens
script2.php selects data (and will get the newly-committed value)
script2.php updates data
script2.php commit() happens
(Note: This is not a 'deadlock', just a 'wait'.)
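Applied to the question's script, the essential change is one line; everything else stays as in the original (a sketch):

$conn->beginTransaction();
// FOR UPDATE makes this a locking read: a concurrent script blocks here
// until this transaction commits, and then reads the updated salary.
$stmt = $conn->prepare("SELECT * FROM employees WHERE name = ? FOR UPDATE");
$stmt->execute(['ana']);
$row = $stmt->fetch(PDO::FETCH_ASSOC);
$salary = $row['salary'] + 1000; // increasing salary
$stmt = $conn->prepare("UPDATE employees SET salary = ? WHERE name = ?");
$stmt->execute([$salary, 'ana']);
$conn->commit();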
I have a cron task running every x seconds on n servers. It runs SELECT ... FROM table WHERE time_scheduled < CURRENT_TIME and then performs a lengthy task on the result set.
My problem is now: how do I avoid having two separate servers perform the same task at the same time?
The idea is to push time_scheduled forward by a set interval after selecting it. But if two servers happen to run the query at the same time, that will be too late, no?
All ideas are welcome. It doesn't have to be a strictly MySQL solution.
Thanks!
I am guessing you have a single MySQL instance, and connections from your n servers to run this processing job. You're implementing a job queue here.
The table you mention needs to use the InnoDB access method (or one of the other transaction-friendly access methods offered by Percona or MariaDB).
Do these items in your table need to be processed in batches? That is, are they somehow inter-related? Or is it possible for your server processes to handle them one-by-one? This is an important question, because you'll get better load balancing between your server processes if you can handle them individually or in small batches. Let's assume the small batches.
The idea is to prevent any server process from grabbing onto a row in your table if some other server process has that row. I've had to do this kind of thing a lot, and here is my suggestion; I know this works.
First, add an integer column to your table. Call it "working" or some such thing. Give it a default value of zero.
Second, assign a permanent id number to each server. The last part of the server's IP address (for example, if the server's IP address is 10.1.0.123, the id number is 123) is a good choice, because it's probably unique in your environment.
Then, when a server's grabbing work to do, use these two SQL queries.
UPDATE table
SET working = :this_server_id
WHERE working = 0
AND time_scheduled < CURRENT_TIME
ORDER BY time_scheduled
LIMIT 1
SELECT table_id, whatever, whatever
FROM table
WHERE working = :this_server_id
The first query will consistently grab a batch of rows to work on. If another server process comes in at the same time, it will never grab the same rows, because no process can grab rows unless working = 0. Notice that LIMIT 1 caps your batch size; you don't have to limit it, but you can. I also threw in ORDER BY so the rows that have been waiting the longest get processed first. That's probably a useful way to do things.
The second query retrieves the information you need to do the work. Don't forget to retrieve the primary key values (I called them table_id) for the rows you're working on.
Then, your server process does whatever it needs to do.
When it's done, it needs to throw the row back into the queue for a later time. To do that, the server process needs to set the time_scheduled to whatever it needs to be, then to set working = 0. So, for example, you could run this query for each row you're processing.
UPDATE table
SET time_scheduled = CURRENT_TIME + INTERVAL 5 MINUTE,
working = 0
WHERE table_id = ?table_id_from_previous_query
That's it.
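Tying the queries together, one polling cycle per worker might look like this in PHP. A sketch under stated assumptions: $pdo and process_one_row() are hypothetical, and the placeholder name table is backticked because TABLE is a reserved word in MySQL:

function run_cycle(PDO $pdo, int $serverId): void
{
    // 1. Claim at most one unclaimed, due row for this server.
    $pdo->prepare(
        "UPDATE `table` SET working = ?
          WHERE working = 0 AND time_scheduled < CURRENT_TIME
          ORDER BY time_scheduled
          LIMIT 1"
    )->execute([$serverId]);

    // 2. Fetch whatever this server has claimed.
    $stmt = $pdo->prepare("SELECT table_id FROM `table` WHERE working = ?");
    $stmt->execute([$serverId]);

    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        process_one_row($row); // hypothetical: the lengthy task itself

        // 3. Reschedule the row and release the claim.
        $pdo->prepare(
            "UPDATE `table`
                SET time_scheduled = CURRENT_TIME + INTERVAL 5 MINUTE,
                    working = 0
              WHERE table_id = ?"
        )->execute([$row['table_id']]);
    }
}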
Except for one thing. In the real world these queuing systems get fouled up sometimes. Server processes crash. Etc. Etc. See Murphy's Law. You need a monitoring query. That's easy in this system.
This query will give a list of all jobs that are more than five minutes overdue, along with the server that's supposed to be working on them.
SELECT working, COUNT(*) stale_jobs
FROM table
WHERE time_scheduled < CURRENT_TIME - INTERVAL 5 MINUTE
GROUP BY working
If this query comes up empty, all is well. If it comes up with lots of jobs with working set to zero, your servers aren't keeping up. If it comes up with jobs with working set to some server's id number, that server is taking a lunch break.
You can reset all the jobs assigned to the server that's gone to lunch with this query, if need be.
UPDATE table
SET working=0
WHERE working=?server_id_at_lunch
By the way, a compound index on (working, time_scheduled) will probably help this perform well.
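For reference, creating that index might look like this (a sketch; $pdo and the index name are assumptions, and `table` is the placeholder name used above):

$pdo->exec("ALTER TABLE `table`
            ADD INDEX idx_working_scheduled (working, time_scheduled)");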
I sometimes get MySQL deadlock errors saying:
'Deadlock found when trying to get lock; try restarting transaction'
I have a queues table, and multiple PHP processes run simultaneously selecting rows from it. However, I want each process to grab a unique batch of rows on each fetch, so that no overlapping rows are selected.
So I run this query (which is the query I get the deadlock error on):
$this->db->query("START TRANSACTION;");
$sql = "SELECT mailer_queue_id
          FROM mailer_queues
         WHERE process_id IS NULL
         LIMIT 250
           FOR UPDATE;";
...
$sql = "UPDATE mailer_queues
           SET process_id = 33044,
               status = 'COMPLETED'
         WHERE mailer_queue_id IN (1,2,3...);";
...
if ($this->db->affected_rows() > 0) {
    $this->db->query("COMMIT;");
} else {
    $this->db->query("ROLLBACK;");
}
I'm also:
inserting rows into the table (with no transactions/locks) at the same time
updating rows in the table (with no transactions/locks) at the same time
deleting rows from the table (with no transactions/locks) at the same time
Also, my updates and deletes only touch rows that already have a process_id assigned, while the SELECT ... FOR UPDATE transactions target rows where process_id IS NULL. In theory they should never overlap.
I'm wondering if there is a proper way to avoid these deadlocks?
Can a deadlock occur because one transaction locks the table for too long while it selects/updates, and another process tries to perform the same transaction and just times out?
Any help is much appreciated.
Deadlocks occur when two or more processes request locks in such a way that the locked resources overlap but are acquired in different orders, so each process ends up waiting for a resource locked by another process, while that other process waits for a lock the first one holds.
In real world terms, consider a construction site: You've got one screwdriver, and one screw. Two workers need to drive in a screw. Worker #1 grabs the screwdriver, and worker #2 grabs the screw. Worker #1 goes to grab the screw as well, but can't, because it's being held by worker #2. Worker #2 needs the screwdriver, but can't get it because worker #1 is holding it. So now they're deadlocked, unable to proceed, because they've got 1 of the 2 resources they need, and neither of them will be polite and "step back".
Given that you've got out-of-transaction changes occurring, it's possible that one (or more) of your updates/deletes are overlapping the locked areas you're reserving inside the transactions.
You might want to try LOCK TABLES before starting the transaction, thereby assuring you have explicit control over the tables. The lock will wait until all activity on the particular tables has completed.
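One caveat if you try that: LOCK TABLES implicitly commits any open transaction, so the pattern documented in the MySQL manual is to disable autocommit instead of using START TRANSACTION. A sketch, reusing the question's $this->db wrapper:

// Table-level locking around the batch, per the MySQL manual's
// LOCK TABLES + transactions pattern.
$this->db->query("SET autocommit = 0");
$this->db->query("LOCK TABLES mailer_queues WRITE");
// ... SELECT the batch, then UPDATE process_id and status ...
$this->db->query("COMMIT");        // commit before releasing the lock
$this->db->query("UNLOCK TABLES");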
I think deadlocks themselves have been explained very well by everyone here.
MySQL provides a very good log for checking the last deadlocks that happened and which queries were stuck at that time.
Run SHOW ENGINE INNODB STATUS (see the MySQL documentation page for it) and search for the LATEST DETECTED DEADLOCK section.
It's a great log; it has helped me find many subtle deadlocks.