I have two tables, 'reservation' and 'spot'. During the reservation process, the 'spotStatus' column in the spot table is checked and, if the spot is free, updated. A user is allowed to reserve only one spot, so to make sure that no other user can reserve the same spot, what can I do?
Referring to some answers here, I found row locking and table locking suggested as solutions. Should I perform a query like
"select * from spot where spotId = id for update;"
and then perform the necessary update to the status, or is there a more elegant way to do it?
My concern is what happens to the locked row if:
1. the transaction does not complete successfully?
2. both users try to reserve the same row at the same time? Are both transactions cancelled?
And when is the lock released?
The problem here is race conditions, which even transactions will not prevent by default if used naively: even if two reservations happen simultaneously, for example originating from two different Apache processes running PHP, transactional locking will just ensure the reservations are properly serialized, and as such the second one will still overwrite the first.
Usually this situation is of no real concern: given the speed of databases and servers as a whole compared to the load on an average reservation site, the chances of this ever causing a problem are less than winning the state lottery twice in a row. If, however, you are implementing a site that's going to sell 50k Coldplay concert tickets in 30 seconds, the chances rise dramatically.
A simple solution to this is to implement a kind of 'reservation intent': instead of overwriting the spot reservation directly, append the intent-to-reserve to a separate timestamped table. After this insertion you can clean that table up for duplicates, preferring the oldest, and apply that one to the real data.
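A minimal sketch of such an intent table (the reservation_intent table and its columns are illustrative assumptions, not part of the original schema):

CREATE TABLE reservation_intent (
    spotId INT NOT NULL,
    userId INT NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (spotId, userId)
);

-- every would-be reservation is just an append, so nothing is overwritten
INSERT INTO reservation_intent (spotId, userId, created_at) VALUES (42, 7, NOW());

-- later, the oldest intent per spot wins and is applied to the spot table
SELECT userId FROM reservation_intent
WHERE spotId = 42
ORDER BY created_at ASC
LIMIT 1;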
If it does not complete successfully, the database returns to the state it was in before the transaction (a rollback), as if the transaction never happened.
The same as if they had not arrived at the same time: only one of them will acquire the lock, and the other reservation won't be created.
If you are using Teradata, you can use its queue table concept.
What happens if there are two people sending the same query at the same time to the database and one makes the other query return something different?
I have a shop where there is one item left. Two or more people buy the item and the queries arrive at the exact same time on the MySQL server. My guess is that it will just queue, but if so, how does MySQL pick which one to execute first, and can I influence this?
sending the same query at the same time
QUERIES DO NOT ALWAYS RUN IN PARALLEL
It depends on the storage engine. With MyISAM, nearly every query acquires a table-level lock, meaning that the queries are run sequentially, as a queue. With most of the other engines they may run in parallel.
echo_me says "nothing happens at the exact same time and a CPU does not do everything at once"
That's not exactly true. It's possible for a DBMS to run on a machine with more than one CPU and with more than one network interface. It's very improbable that two queries could arrive at the same time, but not impossible, hence there is a mutex to ensure that the parsing/execution transition only runs as a single thread of execution (not necessarily the same lightweight process).
There are two approaches to solving concurrent DML. One is to use transactions (where each user effectively gets a clone of the database): when the queries have completed, the DBMS tries to reconcile any changes, and if the reconciliation fails, the DBMS rolls back one of the queries and reports it as failed. The other approach is row-level locking: the DBMS identifies the rows a query will update and marks them as reserved for update (other users can read the original version of each row, but any attempt to update the data will be blocked until the row is available again).
Your problem is that you have two MySQL clients, each of which has retrieved the fact that there is one item of stock left. This is further complicated by the fact that (since you mention PHP) the stock level may have been retrieved in a different DBMS session than the subsequent stock adjustment; you cannot have a transaction spanning more than one HTTP request. Hence you need to revalidate any fact maintained outside the DBMS within a single transaction, as in the sketch below.
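One way to do that revalidation in a single atomic statement (a sketch; the stock table and its columns are hypothetical names):

UPDATE stock SET quantity = quantity - 1 WHERE item_id = 100 AND quantity > 0;
-- ROW_COUNT() = 0 here means the fact you read earlier is stale: the item is sold out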
Optimistic locking can provide a pseudo-transaction control mechanism: you flag a record you are about to modify with a timestamp and a user identifier (with PHP, the PHP session ID is a good choice); if, when you come to modify it, something else has changed it, your code knows the data it retrieved previously is invalid. However, this can lead to other complications.
They are executed as soon as the users request them, so if there are 10 users requesting the query at the exact same time, then there will be 10 queries executed at the exact same time.
nothing happens at the exact same time and a CPU does not do everything at once. It does things one at a time (per core and/or thread). If 10 users are accessing pages which run queries they will "hit" the server in a specific order and be processed in that order (although that order may be in milliseconds). However, if there are multiple queries on a page you can't be sure that all the queries on one user's page will complete before the queries on another user's page are started. This can lead to concurrency problems.
Edit: run SHOW PROCESSLIST to find the ID of the connection you want to kill. SHOW PROCESSLIST will give you a list of all currently running queries (from the MySQL documentation).
MySQL will perform well with fast CPUs because each query runs in a single thread and can't be parallelized across CPUs.
from the MySQL manual page "How MySQL Uses Memory".
Consider a query similar to:
UPDATE items SET quantity = quantity - 1 WHERE id = 100
However many queries the MySQL server runs in parallel, if 2 such queries run and the row with id 100 has quantity 1, then something like this will happen by default:
The first query locks the row in items where id is 100
The second query tries to do the same, but the row is locked, so it waits
The first query changes the quantity from 1 to 0 and unlocks the row
The second query tries again and now sees the row is unlocked
The second query locks the row in items where id is 100
The second query changes the quantity from 0 to -1 and unlocks the row
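To avoid that -1 outcome, one common approach (a sketch reusing the items table from above) is to make the update conditional and check the affected-row count:

UPDATE items SET quantity = quantity - 1 WHERE id = 100 AND quantity > 0;
-- now the second query matches no rows; ROW_COUNT() = 0 tells it the item is sold out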
This is essentially a concurrency question. There are ways to ensure concurrency in MySQL by using transactions. This means that in your e-shop you can ensure that race conditions like the ones you describe won't be an issue. See the links below about transactions in MySQL.
http://dev.mysql.com/doc/refman/5.0/en/sql-syntax-transactions.html
http://zetcode.com/databases/mysqltutorial/transactions/
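For example, a sketch of a transactional decrement using the items table from above (this assumes InnoDB, since MyISAM does not support transactions):

START TRANSACTION;
SELECT quantity FROM items WHERE id = 100 FOR UPDATE;
-- in application code: proceed only if the selected quantity is greater than 0
UPDATE items SET quantity = quantity - 1 WHERE id = 100;
COMMIT;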
Depending on your isolation level different outcomes will be returned from two concurrent queries.
Queries in MySQL are handled in parallel. You can read more about the implementation in the MySQL documentation.
I have a PHP/jQuery/AJAX/MySQL app built for managing databases. I want to implement the ability to prevent multiple users from editing the same database row at the same time.
1. What is this called?
2. Do I use a token system, where whoever holds the token can edit the row until they release the token?
3. Do I use a "last edit date/time" and compare the time you loaded the HTML form against the time in the database, warning you if the database has a more recent edit?
4. Do I lock the row using database functions?
I'm just not sure which is best. Assume between 10 and 15 concurrent users.
There are two general approaches-- optimistic and pessimistic locking.
Optimistic locking is generally much easier to implement in a web-based environment because it is fundamentally stateless. It scales much better as well. The downside is that it assumes your users generally won't be trying to edit the same set of rows at the same time. For most applications that's a very reasonable assumption, but you'd have to verify that your application isn't one of the outliers where users would regularly be stepping on each other's toes. In optimistic locking, you would have some sort of last_modified_timestamp column that you would SELECT when a user fetched the data and then use in the WHERE clause when you go to update the data, i.e.
UPDATE table_name
SET col1 = <<new value>>,
col2 = <<new values>>,
last_modified_timestamp = <<new timestamp>>
WHERE primary_key = <<key column>>
AND last_modified_timestamp = <<last modified timestamp you originally queried>>
If that updates 1 row, you know you were successful. Otherwise, if it updates 0 rows, you know that someone else has modified the data in the interim and you can take some action (generally showing the user the new data and asking them if they want to overwrite but you can adopt other conflict resolution approaches).
Pessimistic locking is more challenging to implement in a web-based application, particularly when users can close their browser without logging out, or may start editing some data and go to lunch before hitting Submit. It makes the application harder to scale and generally more difficult to administer. It's really only worth considering if users will regularly try to update the same rows, or if updating a row takes a user a long time, so that it's worth letting them know up front that someone else has locked it.
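For contrast, a minimal sketch of the pessimistic version, reusing the placeholder table_name/primary_key names from the optimistic example above (note that the row lock lasts only until COMMIT, so it cannot span separate HTTP requests):

START TRANSACTION;
SELECT col1, col2 FROM table_name WHERE primary_key = 42 FOR UPDATE; -- blocks other writers
UPDATE table_name SET col1 = 'new value' WHERE primary_key = 42;
COMMIT;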
I was going to implement this in one of my own systems.
You could add a new column, called timelocked, to your table of records.
When a record is opened, you would set that record's timelocked column to the current time. During editing of the record, send a keepalive back to the server through Ajax every 2 minutes. On each keepalive, the server updates timelocked to the time the request was sent, and so forth (this will make sense in a second). When the user is finished editing, clear timelocked (set it back to NULL).
Now, if someone goes to open a record which is already open, the PHP would check:
if timelocked is NULL, the record is not being edited;
otherwise, the record may be being edited, but what if the user closed their browser window? That's why the keepalive is used:
if the difference between the current time and timelocked is larger than 2 minutes, they are no longer actively editing, and you can safely open it.
Hopefully you understand all that.
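A sketch of the claim/heartbeat step as a single atomic statement (timelocked as a DATETIME column; the records table, the locked_by column, and the session ID value are illustrative additions):

-- claim the record, refresh your own claim, or take over an expired one
UPDATE records
SET timelocked = NOW(), locked_by = 'my-session-id'
WHERE id = 42
  AND (locked_by = 'my-session-id'
       OR timelocked IS NULL
       OR timelocked < NOW() - INTERVAL 2 MINUTE);
-- ROW_COUNT() = 1 means you hold the lock; 0 means someone else is actively editing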
Don't try to prevent it. Let them decide what to do in the case of an edit conflict.
Add a timestamp to the table. Compare the timestamp of when the row was retrieved with the current timestamp. Make them aware of changes between their load and their save, and let them decide what action to take.
So yeah, number 3.
I personally would not prevent this. If it were a requirement of the job, I would track each user's current/last known location and disallow someone from editing the same row someone else is editing. I have seen people add a column to a table such as isLocked or isBeingWorkedOn, but I have seen this type of system fail far more often, or require moderation to unlock stuck records if someone closed the page while working on one.
1) This is called locking. There are two main types of locking when referring to relational databases (like MySQL): table locking and row locking. Table locking ensures only one session at a time is making changes to a table, whereas row locking ensures only one session at a time is making changes to a particular row. You can think of row locking as a more fine-grained approach to concurrent access than table locking. Row locking is more complicated, but allows multiple concurrent sessions to write to the same table (important if your database has lots of concurrent writes; table locking should be fine for 10-15 users).
2-3) MySQL takes care of concurrent access for you! It automatically implements locking in the background. The type of locking (row or table) depends on which storage engine you use. For example, MyISAM uses table locking and InnoDB uses row locking. MySQL uses an internal table to manage this. You can query the status of this table (and all locks on your database) by checking the Table_locks_immediate and Table_locks_waited variables (it uses your option number 2).
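For example, to inspect those counters:

SHOW STATUS LIKE 'Table_locks%';
-- Table_locks_immediate: lock requests granted immediately
-- Table_locks_waited: lock requests that had to wait (a sign of contention)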
When you issue an INSERT or UPDATE statement while another session is using the table (or row), the calling application (i.e. PHP in this case) will pause for a few milliseconds until the other session is done writing.
4) Again, MySQL will automatically take care of locking, but you can manually manage table locking with the LOCK TABLES and UNLOCK TABLES commands. If you are using row locking with InnoDB, there is a host of functions you can use to manually manage concurrent access.
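A quick illustration of the manual commands (the accounts table here is a hypothetical stand-in):

LOCK TABLES accounts WRITE;
-- no other session can read or write the table until it is unlocked
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
UNLOCK TABLES;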
See MySQL's page on Internal Locking for an overview of MySQL's locking system, and InnoDB Locking for InnoDB's row-locking features.
As others have said, it's much easier to deal with a conflicting update.
What you are suggesting is called pessimistic locking. It's called that because it assumes it is all too likely that two users will try to edit the same record at the same time.
Is that true?
And is it a disaster if a user has to start again because the data they tried to update was changed by someone else?
Locking has costs: you always lock in a pessimistic scheme, so you have an overhead, and that's before you start looking at related data and such.
Then there's making it robust: dealing with "no one can edit this now because something went wrong"...
If I had anything short of editing an entire file that needed pessimistic locking, I'd take a hard look at my design, on the basis that it isn't fit for purpose.
I am currently looking into how I can manage a high number of bids on my auction site project. As it is quite possible that some people may send bids at exactly the same time, it has become apparent that I need locks to prevent any data corruption.
I have come down to using SELECT ... LOCK IN SHARE MODE, whose documentation states: "If any of these rows were changed by another transaction that has not yet committed, your query waits until that transaction ends and then uses the latest values."
http://dev.mysql.com/doc/refman/5.1/en/innodb-locking-reads.html
This suggests to me that the bids will enter a queue where each bid is dealt with in turn and checked to ensure it is higher than the current bid, and that if anything has changed in the meantime, the latest bid amount is used.
However, I have read that there can be damaging deadlock issues where two users try to place bids at the same time and no query can obtain a lock. I have therefore also considered SELECT ... FOR UPDATE, but that will also block reads, which I am quite unsure about.
If anybody could shed any light on this issue, that would be appreciated. If you could suggest another database, such as a NoSQL store, that would be more suitable, that would be very helpful!
EDIT: This is essentially a concurrency problem: I don't want to check the current bid against incorrect/old data, which would produce a 'lost update' on certain bids.
By itself, two simultaneous updates will not cause a deadlock, just transient blocking. Let's call them Bid A and Bid B.
Although we're considering them simultaneous, one will acquire a lock first. We'll say that A gets there 1 ms faster.
A acquires a lock on the row in question. B's lock request goes into a queue and must wait for the lock belonging to A to be released. As soon as A's lock is released, B acquires its lock.
There may be more to your code, but from your question as I've described it, there is no deadlock scenario. In order to deadlock, A must be waiting for B to release its lock on another resource, while B will not release its lock until it acquires a lock on A's resource.
If you need to validate the bid in real time you can either:
A. Use the appropriate transaction isolation level (repeatable read, probably, which is the default in InnoDB) and perform both your select and update in an explicit transaction.
START TRANSACTION;
SELECT ... FOR UPDATE;
-- if the new bid is higher than the selected current bid:
UPDATE ...;
COMMIT;
B. Perform your check logic in your UPDATE statement itself. In other words, construct your UPDATE query so that it will only affect rows where the current bid is less than the new bid. If no records were affected, the bid was too low. This is a possible approach and reduces work on the DB, but has its own considerations; see the fuller sketch after the skeleton below.
UPDATE ...
WHERE currentBid < newBid
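A slightly fuller sketch of option B (the auctions table and its columns are hypothetical names):

UPDATE auctions
SET current_bid = 125.00, current_bidder = 42
WHERE auction_id = 7
  AND current_bid < 125.00;
-- ROW_COUNT() = 0 means the bid was too low (someone else got there first)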
Personally my vote would be to opt for A because I don't know how complex your logic is.
A repeatable read isolation level ensures that every time you read a given record in a transaction, the value is guaranteed to be the same. It does this by holding a lock on the row, which prevents others from updating the given row until your transaction either commits or rolls back. One connection cannot update your table until the last one has completed its transaction.
The bottom line is your select/update will be atomic in your DB so you don't have to worry about lost updates.
Regarding concurrency, the key there is to keep your transactions as short as possible. Get in, get out. By default you can't read a record that is being updated because it is in an indeterminate state. These updates and reads should be taking small fractions of a second.
How long can a MySQL transaction last until it times out? I'm asking because I'm planning to code a payment process for my e-commerce project somewhere along the lines of this (PHP/MySQL pseudo-code):
START TRANSACTION;
SELECT ... WHERE id IN (1,2,3) AND available = 1 FOR UPDATE; //lock rows where "available" is true
//Do payment processing...
//add to database, commit or rollback based on payment results
I cannot think of another way to lock the products being bought (so that if two users buy at the same time and there is only one left in stock, one of them won't be able to buy), process the payment if the products are available, and create a record based on the payment result...
That technique would also block users who simply wanted to see the products other people are buying. I'd be exceptionally wary of any technique that relies on database row locking to enforce inventory management.
Instead, why not simply record the number of items currently tied up in an active "transaction" (here meaning the broader commercial sense, rather than the technical database sense). If you have a current_inventory field, add an on_hold or being_paid_for or not_really_available_because_they_are_being_used_elsewhere field that you can update with information on current payments.
Better yet, why not use a purchase / payment log to sum the items currently "on hold" or "in processing" for several different users.
This is the general approach you often see on sites like Ticketmaster that declare, "You have X minutes to finish this page, or we'll put these tickets back on the market." They're recording which items the user is currently trying to buy, and those records can even persist across PHP page requests.
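A sketch of that hold-log idea (the holds and products tables and the 10-minute window are illustrative assumptions):

CREATE TABLE holds (
    product_id INT NOT NULL,
    session_id VARCHAR(64) NOT NULL,
    qty INT NOT NULL,
    created_at DATETIME NOT NULL
);

-- effective availability = physical inventory minus unexpired holds
SELECT current_inventory
       - (SELECT COALESCE(SUM(qty), 0)
          FROM holds
          WHERE product_id = 100
            AND created_at > NOW() - INTERVAL 10 MINUTE) AS available
FROM products
WHERE id = 100;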
If you have to ask how long it is before a database connection times out, then your transactions take orders of magnitude too long.
Long open transactions are a big problem and a frequent cause of poor performance, unrepeatable bugs, and even deadlocking of the complete application. Certainly in a web application you want tight, fast transactions to make sure all table- and row-level locks are quickly freed.
I found that even several 100ms can become troublesome.
Then there is the problem of sharing a transaction over multiple requests which may happen concurrently.
If you need to "emulate" long running transactions, cut it into smaller pieces which can be executed fast, and keep a log so you can rollback using the log by undoing the transactions.
Now, if the payment service completes in 98% of cases in less than 2 sec and you do not have hundreds of concurrent requests going on, it might just be fine.
Timeout depends on server settings, both MySQL's (for example wait_timeout, and innodb_lock_wait_timeout for row locks) and those of the language you are using to interact with MySQL. Look in the settings files for your server.
I don't think what you are doing would cause a timeout, but if you are worried you might want to rethink the location of your check so that it doesn't actually lock the tables across queries. You could instead have a stored procedure that is built into the data layer rather than relying on two separate calls. Or, maybe a conditional insert or a conditional update?
All in all, as another person noted, I don't like the idea of locking entire table rows which you might want to be able to select from for other purposes outside of the actual "purchase" step, as it could result in problems or bottlenecks elsewhere in your application.
Let's say you have a table with some winning numbers in it. Each of these numbers is meant to be "won" by only one person.
How could I prevent two simultaneous web requests that submit the same number from both checking and seeing that the number is still available, and then giving the prize to both of them before the number is marked as no longer available?
The winning solution in this question feels like what I was thinking of doing, as it can also be applied on most database platforms.
Is there any other common pattern that can be applied to this?
These numbers are randomly generated or something?
I would rely on the transactional semantics of the database itself: create a table with two columns, number and claimed, and use a single update:
UPDATE winners SET claimed=1 WHERE claimed=0 AND number=#num;
Then check the number of affected rows.
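In MySQL you can read that count in the same session (a small usage sketch):

SELECT ROW_COUNT();
-- 1: you claimed the number; 0: it was already claimed (or does not exist)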
Use transactions. You should never have multiple threads or processes changing the same data without transactional locks, and any decent database supports transactions today. Start the transaction, "grab" the winning number, and then commit. Another thread would be blocked until the commit, and would only get its chance after the records are updated, at which point it would see the number is already claimed.
A non-database solution could be to have the client make the request asynchronously and push it onto a FIFO queue, so that only one request at a time gets evaluated; then respond to the client when the evaluation is complete. The advantage here is that under high load the UI would not freeze the way it might with transactional locking at the database level.