I have a question about concurrency control and when to worry about it. I've created a PHP/MySQL site (InnoDB). I know how to avoid transaction issues, but then there is concurrency control.
My site is an e-commerce site which holds user-inserted goods. As an example: when a user inserts a new item to the site, the DB creates a productId as primary key (auto-incremented by the DB) and stores the other data about that product that is submitted through a form. I'm using prepared statements.
Do I have to worry about two or more users doing this at exactly the same time? Is there any chance that submitting two or more items at the same time will mess up the data of the different rows?
To be able to insert products the user has to be logged in using sessions if that matters to the question.
Thanks in advance, Markus.
Unless they're writing to the exact same row, I don't see why you'd have any concurrency issues. Even if they were writing to the same row, InnoDB has row-level locking in place, which means that a row will be locked until a user has finished writing to it, leaving subsequent users to "wait it out" until the lock has been released. In regards to the possibility of "conflicting INSERT queries": If you're inserting new data into a table that is using an auto-incrementing primary key, you're guaranteed to get a unique ID each time, which means that concurrency should never be an issue on INSERT.
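To make the point about auto-incremented keys concrete, here is a minimal sketch. It is not PHP/MySQL: it uses Python's built-in sqlite3 as a stand-in so it can be run anywhere, and the table and column names other than productId (product, seller, title) are made up for the illustration. Each thread opens its own connection, roughly like separate PHP requests would, and every insert still comes back with a distinct ID.

```python
import os
import sqlite3
import tempfile
import threading


def insert_products(db_path, seller, count, ids):
    # Each worker uses its own connection, similar to separate web requests.
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        for n in range(count):
            with conn:  # commits on success, rolls back on error
                cur = conn.execute(
                    "INSERT INTO product (seller, title) VALUES (?, ?)",
                    (seller, f"item {n}"),
                )
                ids.append(cur.lastrowid)  # ID the database assigned to this row
    finally:
        conn.close()


def run_demo(sellers=4, per_seller=25):
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "shop.db")
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE product ("
            " productId INTEGER PRIMARY KEY AUTOINCREMENT,"
            " seller TEXT NOT NULL,"
            " title TEXT NOT NULL)"
        )
        conn.commit()
        conn.close()

        ids = []
        threads = [
            threading.Thread(
                target=insert_products,
                args=(db_path, f"user{i}", per_seller, ids),
            )
            for i in range(sellers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return ids
```

Calling run_demo() returns the list of assigned IDs; with the defaults that is 100 values with no duplicates. SQLite serializes writers differently from InnoDB, so treat this only as an illustration of the "one unique ID per insert" guarantee, not of InnoDB's locking behaviour.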
You can also lock the whole table explicitly, which is quick to do and especially useful for updates.
So I'm making a web-based game similar to Torn City, where there could potentially be millions of users.
My issue is regarding user inventories. I started out creating dynamic tables based on each user's id. e.g Table name = [UserID]_Inventory.
From what I've found out, this can create a load of hacker-friendly entry points for SQL injection and such because of the dynamic creation.
My only other option seems to be creating 1 giant table holding every item that every player has and all the varied details of each item. This seems like it would take longer and longer to load once user count increases and the user's inventory will likely be accessed often.
Is there another option?
My only idea so far is to create some kind of temporary inventory that grabs only the active player inventories. That helps the database search time issues but still brings me back to creating dynamic tables.
At this stage I don't really need coding help, rather I need database structure help.
Code is appreciated tho.
Cheers.
Use the big table. Index it optimally. It should not give you trouble until you get well past a billion rows.
Here's a trick to optimizing use of such a table. Instead of
PRIMARY KEY(id),
INDEX(user_id)
have
PRIMARY KEY(user_id, id),
INDEX(id)
Since the PK is "clustered" with the data and the data is ordered according to the PK, this puts all of one user's data rows next to each other. In huge tables, this cuts back significantly on I/O, and hence improves overall speed. It also cuts back on pressure on the buffer_pool. (I assume you are using InnoDB?)
The INDEX(id) is sufficient for AUTO_INCREMENT.
There could be more suggestions, but I need more details. Please provide SHOW CREATE TABLE (as it stands now) and the main SELECTs. I am likely to suggest more changes to the indexes, datatypes, and query formulations.
(Dynamic tables is a mistake, and your troubles in that direction have only begun.)
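A rough sketch of the "one big table" layout described above, with the caveat that it uses Python's sqlite3 rather than MySQL. The assumptions: a SQLite WITHOUT ROWID table (stored in primary-key order) stands in for InnoDB's clustered primary key, there is no AUTO_INCREMENT here so the caller supplies id, and the item and quantity columns are invented for the example.

```python
import sqlite3


def build_inventory_db():
    conn = sqlite3.connect(":memory:")
    # One table for every player's items, keyed by (user_id, id) so that
    # a single user's rows are stored next to each other.
    conn.execute(
        """
        CREATE TABLE inventory (
            user_id  INTEGER NOT NULL,
            id       INTEGER NOT NULL,
            item     TEXT    NOT NULL,
            quantity INTEGER NOT NULL DEFAULT 1,
            PRIMARY KEY (user_id, id)
        ) WITHOUT ROWID
        """
    )
    # Secondary index on id, mirroring INDEX(id) in the answer.
    conn.execute("CREATE INDEX inventory_id ON inventory (id)")
    return conn


def add_item(conn, user_id, item_id, item, quantity=1):
    # Values are bound as parameters, never concatenated into the SQL text.
    conn.execute(
        "INSERT INTO inventory (user_id, id, item, quantity) VALUES (?, ?, ?, ?)",
        (user_id, item_id, item, quantity),
    )


def items_for_user(conn, user_id):
    return conn.execute(
        "SELECT id, item, quantity FROM inventory WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()


def query_plan(conn, user_id):
    # Ask SQLite how it would look up one user's rows.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, item, quantity FROM inventory WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    return " ".join(row[3] for row in rows)
```

Because the table name is fixed and the user id is only ever a bound parameter, there is nothing dynamic for an attacker to inject into. query_plan() should report a search on the primary key for a single user's lookup, which is the access pattern the answer is optimizing for.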
I have two tables, 'reservation' and 'spot'. During the reservation process the 'spotStatus' column in the spot table is checked and, if the spot is free, it is updated. A user is allowed to reserve only one spot, so what can I do to make sure that no other user can reserve the same spot?
Referring to some answers here, I found row locking and table locking as solutions. Should I perform a query like
"select * from spot where spotId = id for update;"
and then perform the necessary update to the status, or are there other, more elegant ways to do it?
My concern is what happens to the locked row:
1. What happens if the transaction does not complete successfully?
2. What happens if both users try to reserve the same row at the same time? Are both transactions cancelled?
And when is the lock released?
The problem here is race conditions, which even transactions will not prevent by default if used naively. Even if 2 reservations happen simultaneously, for example originating from 2 different Apache processes running PHP, transactional locking will just ensure the reservations are properly serialized, and as such the second one will still overwrite the first.
Usually this situation is of no real concern: given the speed of databases and servers as a whole compared to the load on an average reservation site, the chances of this ever causing a problem are less than winning the state lottery twice in a row. If, however, you are implementing a site that's going to sell 50k Coldplay concert tickets in 30 seconds, the chances rise sharply.
A simple solution to this is to implement a sort of 'reservation intent' by not overwriting the spot reservation directly, but by appending the intent-to-reserve to a separate timestamped table. After this insertion you can then clean up this table for duplicates, preferring the oldest, and apply that one to the real-time data.
1. If it's not successful, the database returns to the data it had before the transaction (a rollback), as if it never happened.
2. The same as if they had not happened at the same time: only one of them gets the lock, and the other reservation won't be created.
If you are using Teradata, you can use the queue table concept.
I have a problem with a project I am currently working on, built in PHP & MySQL. The project itself is similar to an online bidding system. Users bid on a project, and they get a chance to win if they follow up their bid by clicking and clicking again.
The problem is this: if 5 users, for example, enter the game at the same time, I get an 8-10 second delay in the database. I update the database using UNIX_TIMESTAMP(CURRENT_TIMESTAMP), so the delay makes the whole bidding system useless.
I want to mention too that the project is very database intensive (around 30-40 queries per page) and I was thinking maybe the queries get delayed, but I'm not sure if that's happening. If that's the case though, any suggestions how to avoid this type of problem?
Hope I've been at least clear with this issue. It's the first time it happened to me and I would appreciate your help!
You can decide on
Optimizing or minimizing required queries.
You can cache queries that do not need to be updated on each visit.
You can use summary tables.
Re-run the queries only when the data changes.
You have to do this cleverly. You can follow this MySQLPerformanceBlog
I'm not clear on what you're doing, but let me elaborate on what you said. If you're using UNIX_TIMESTAMP(CURRENT_TIMESTAMP()) in your MySQL query you have a serious problem.
The problem with your approach is that you are using MySQL functions to supply the timestamp record that will be stored in the database. This is an issue, because then you have to wait on MySQL to parse and execute your query before that timestamp is ever generated (and some MySQL engines like MyISAM use table-level locking). Other engines (like InnoDB) have slower writes due to row-level locking granularity. This means the time stored in the row will not necessarily reflect the time the request was generated to insert said row. Additionally, it can also mean that the time you're reading from the database is not necessarily the most current record (assuming you are updating records after they were inserted into the table).
What you need is for the PHP request that generates the SQL query to provide the TIMESTAMP directly in the SQL query. This means the timestamp reflects the time the request is received by PHP and not necessarily the time that the row is inserted/updated into the database.
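As a rough illustration of supplying the timestamp from the application rather than from the database, here is a sketch in Python with sqlite3 (in PHP the equivalent would be capturing the time at the top of the request and binding it as a parameter). The bid table, its columns and the function names are assumptions for the example.

```python
import sqlite3
import time


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bid ("
        " bidId INTEGER PRIMARY KEY AUTOINCREMENT,"
        " userName TEXT NOT NULL,"
        " amount INTEGER NOT NULL,"
        " receivedAt REAL NOT NULL)"
    )
    return conn


def handle_bid(conn, user, amount, received_at=None):
    # Capture the time as early as possible in the request, before any
    # database work, and store that value instead of a DB-side NOW().
    if received_at is None:
        received_at = time.time()
    with conn:
        conn.execute(
            "INSERT INTO bid (userName, amount, receivedAt) VALUES (?, ?, ?)",
            (user, amount, received_at),
        )
    return received_at


def latest_bid(conn):
    # Order by when the request arrived, not by when the row was written.
    return conn.execute(
        "SELECT userName, amount FROM bid"
        " ORDER BY receivedAt DESC, bidId DESC LIMIT 1"
    ).fetchone()
```

With this arrangement a bid that was received first but written late (because its request was stuck behind other queries) still sorts by its arrival time.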
You also have to be clear about which MySQL engine your table is using. For example, engines like InnoDB use MVCC (Multi-Version Concurrency Control). This means that while a row is being read it can be written to at the same time. If this happens, the database engine keeps the previous version of the row (InnoDB stores it in its undo log) so the client can keep reading the existing value while the new value is being written. That way you get row-level locking with faster and more consistent reads, but potentially slower writes.
I have a PHP/jQuery/AJAX/MySQL app built for managing databases. I want to implement the ability to prevent multiple users from editing the same database row at the same time.
What is this called?
Do I use a token system and who ever has the token can edit it until they release the token?
Do I use a "last edit date/time", comparing the time you loaded the HTML form with the time in the database, and warn you if the database holds the more recent edit?
Do I lock the row using database functions?
I'm just not sure which is the best. Assuming between 10 - 15 concurrent users
There are two general approaches-- optimistic and pessimistic locking.
Optimistic locking is generally much easier to implement in a web-based environment because it is fundamentally stateless. It scales much better as well. The downside is that it assumes that your users generally won't be trying to edit the same set of rows at the same time. For most applications, that's a very reasonable assumption, but you'd have to verify that your application isn't one of the outliers where users would regularly be stepping on each other's toes. In optimistic locking, you would have some sort of last_modified_timestamp column that you would SELECT when a user fetched the data and then use in the WHERE clause when you go to update the data, i.e.
UPDATE table_name
SET col1 = <<new value>>,
col2 = <<new values>>,
last_modified_timestamp = <<new timestamp>>
WHERE primary_key = <<key column>>
AND last_modified_timestamp = <<last modified timestamp you originally queried>>
If that updates 1 row, you know you were successful. Otherwise, if it updates 0 rows, you know that someone else has modified the data in the interim and you can take some action (generally showing the user the new data and asking them if they want to overwrite but you can adopt other conflict resolution approaches).
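Here is a small runnable version of that UPDATE, sketched with Python's sqlite3 rather than PHP/MySQL. The record table, the single col1 column and the function names are placeholders for the example; the row-count check is the part that carries over.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record ("
        " id INTEGER PRIMARY KEY,"
        " col1 TEXT NOT NULL,"
        " last_modified_timestamp INTEGER NOT NULL)"
    )
    return conn


def load_row(conn, pk):
    # Returns (col1, last_modified_timestamp); the caller keeps the timestamp
    # (e.g. in a hidden form field) until the user submits the edit.
    return conn.execute(
        "SELECT col1, last_modified_timestamp FROM record WHERE id = ?", (pk,)
    ).fetchone()


def save_row(conn, pk, new_col1, seen_timestamp, new_timestamp):
    """Return True if the save went through, False on an edit conflict."""
    with conn:
        cur = conn.execute(
            "UPDATE record"
            " SET col1 = ?, last_modified_timestamp = ?"
            " WHERE id = ? AND last_modified_timestamp = ?",
            (new_col1, new_timestamp, pk, seen_timestamp),
        )
    # 1 row updated: nobody changed it since we loaded it.
    # 0 rows updated: someone else saved first.
    return cur.rowcount == 1
```

If two users load the same row and both submit, the first save_row() returns True and the second returns False, at which point the application can show the newer data and ask what to do. An integer version counter that is incremented on each save would work the same way and avoids two edits landing on an identical timestamp.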
Pessimistic locking is more challenging to implement, particularly in a web-based application, and particularly when users can close their browser without logging out or may start editing some data and go to lunch before hitting Submit. It makes it harder to scale and generally makes the application more difficult to administer. It's really only worth considering if users will regularly try to update the same rows, or if updating a row takes a large amount of time for a user, so it's worth letting them know up front that someone else has locked the row.
I was going to implement this into one of my own systems.
You could create a new column in your table of records, called timelocked.
When a record is opened, you would set that record's timelocked column to the current time. While the record is being edited, send a keepalive back to the server through AJAX every 2 minutes. When it receives the keepalive, the server sets timelocked to the time the request was sent, and so forth (this will make sense in a second). When the user has finished editing, set timelocked to false.
Now, if someone goes to open a record which is already open, the PHP would check:
if timelocked == false, it means it's not being edited,
otherwise, the record may be being edited, but what if the user closed their browser window? That's why the keepalive is used.
if the difference between the current time and timelocked is larger than 2 minutes, it means they're no longer actively editing, which would allow you to open it.
Hopefully you understand all that.
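In case it helps, the scheme above might look something like this. It is a sketch in Python with sqlite3, not PHP/MySQL; NULL plays the role of "false", times are plain seconds passed in by the caller, and the function names are invented for the example.

```python
import sqlite3

LOCK_TIMEOUT = 120  # seconds; matches the 2-minute keepalive interval above


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record ("
        " id INTEGER PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " timelocked REAL)"  # NULL means nobody is editing
    )
    return conn


def try_open_for_edit(conn, record_id, now):
    """Take the lock if the record is free or its lock has gone stale."""
    with conn:
        cur = conn.execute(
            "UPDATE record SET timelocked = ?"
            " WHERE id = ?"
            " AND (timelocked IS NULL OR ? - timelocked > ?)",
            (now, record_id, now, LOCK_TIMEOUT),
        )
    return cur.rowcount == 1


def keepalive(conn, record_id, now):
    # Called from the editor's periodic AJAX ping.
    with conn:
        conn.execute(
            "UPDATE record SET timelocked = ? WHERE id = ?", (now, record_id)
        )


def finish_editing(conn, record_id):
    with conn:
        conn.execute(
            "UPDATE record SET timelocked = NULL WHERE id = ?", (record_id,)
        )
```

Doing the check and the lock in a single UPDATE avoids two people both seeing "not locked" and both opening the record. In practice the timeout probably wants to be a bit longer than the keepalive interval, so that one slightly late ping does not release the lock.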
Don't try to prevent it. Let them decide what to do in the case of an edit conflict.
Add a timestamp to the table. Compare the timestamp of when the row was retrieved with the current timestamp. Make them aware of changes between their load and their save, and let them decide what action to take.
So yeah, number 3.
I personally would not prevent this. If it was a requirement of the job I would track the users' current / last known location and disallow someone from editing the same line someone else is editing this way. I have seen people add a row to a table saying isLocked or isBeingWorkedOn etc... but I have seen this type of system fail far more often as well, or require moderation to unlock stuck tables if someone closed it while working on it etc...
1) This is called locking. There are two main types of locking when referring to relational databases (like MySQL): table locking and row locking. Table locking ensures only one session at a time is making changes to a table, whereas row locking ensures only one session at a time is making changes to a particular row. You can think of row locking as a more fine-grained approach to concurrent access than table locking. Row locking is more complicated, but allows multiple concurrent sessions to write to the same table (important if your database has lots of concurrent writes; table locking should be fine for 10-15 users).
2-3) MySQL takes care of concurrent access for you! It automatically implements locking in the background. The type of locking (row or table) depends on which storage engine you use. For example, MyISAM uses table locking and InnoDB uses row locking. MySQL tracks these locks internally. You can get a sense of table-lock contention by checking the Table_locks_immediate and Table_locks_waited status variables (it uses your option number 2).
When you issue an INSERT or UPDATE statement while another session is using the table (or row), the calling application (i.e. PHP in this case) will pause for a few milliseconds until the other session is done writing.
4) Again, MySQL will automatically take care of locking, but you can manually manage table locking with the LOCK TABLES and UNLOCK TABLES commands. If you are using row locking with InnoDB, there is a host of functions you can use to manually manage concurrent access.
See MySQL's page on Internal Locking for an overview of MySQL's locking system, and Concurrent Inserts for InnoDB's row locking features.
As others have said it's much easier to deal with a conflicting update.
What you are suggesting is called pessimistic locking. It's called that because it assumes it's all too likely that two users will try to edit the same record at the same time.
Is that true?
And is it a disaster if a user has to start again because the data they tried to update was changed by someone else?
Locking costs, you always lock in a pessimistic scheme, so you have an overhead, and that's before you start looking at related data and such.
Making it robust, dealing with no one can do it now coz sumfin' went wrong...
If I had something short of editing an entire file, that needed pessimistic locking, I'd be having a look at my design, on the basis that it isn't fit for purpose.
How to implement pessimistic locking in a php/mysql web application?
web-user opens a page to edit one dataset (row)
web-user clicks on the button "lock", so other users are able to read but not to write this dataset
web-user makes some modifications (takes maybe 1 to 30 minutes)
web-user clicks "save" or "cancel" and the "lock" is removed
Are there standard methods in PHP/MySQL for this scenario? What happens if the web-user never clicks on "save"/"cancel" but closes Internet Explorer?
You need to implement a LOCKDATE and a LOCKWHO field in your table. I've done that in many applications outside of PHP/MySQL and it's always the same way.
The lock is terminated when the TTL has passed, so you could do a subtraction of dates using NOW and LOCKDATE to see if the object has been locked for more than 30 minutes or 1 hour, as you wish.
Another factor to consider is whether the current user is the one locking the object. That's why you also need a LOCKWHO. This can be a user_id from your database or a session_id from PHP. But keep it to something that identifies a user; an IP address is not a good way to do it.
Finally, always think of a mass-unlock feature that simply resets all LOCKDATEs and LOCKWHOs...
Cheers
I would write the locks in one centralized table instead of adding fields to all tables.
Example table structure :
tblLocks
TableName (The name of the locked table)
RowID (Primary key of locked table row)
LockDateTime (When the row was locked)
LockUser (Who locked the row)
With this approach you can find all locks that are made by a user without having to scan all tables. You could kill all locks when user logs out for example.
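A possible shape for that centralized table, sketched with Python's sqlite3 instead of PHP/MySQL. The column names follow the structure above; the composite primary key, the 30-minute TTL default and the function names are assumptions added for the example.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tblLocks ("
        " TableName TEXT NOT NULL,"
        " RowID INTEGER NOT NULL,"
        " LockDateTime REAL NOT NULL,"
        " LockUser TEXT NOT NULL,"
        " PRIMARY KEY (TableName, RowID))"  # at most one lock per row
    )
    return conn


def acquire_lock(conn, table, row_id, user, now, ttl=1800):
    """Return True if `user` holds the lock on (table, row_id) afterwards."""
    with conn:
        # Drop the existing lock first if it has outlived its TTL.
        conn.execute(
            "DELETE FROM tblLocks"
            " WHERE TableName = ? AND RowID = ? AND ? - LockDateTime > ?",
            (table, row_id, now, ttl),
        )
        try:
            conn.execute(
                "INSERT INTO tblLocks (TableName, RowID, LockDateTime, LockUser)"
                " VALUES (?, ?, ?, ?)",
                (table, row_id, now, user),
            )
            return True
        except sqlite3.IntegrityError:
            # Row is already locked; succeed only if we are the holder,
            # and refresh the lock time in that case.
            cur = conn.execute(
                "UPDATE tblLocks SET LockDateTime = ?"
                " WHERE TableName = ? AND RowID = ? AND LockUser = ?",
                (now, table, row_id, user),
            )
            return cur.rowcount == 1


def release_locks_for_user(conn, user):
    # E.g. on logout: remove every lock this user holds, in any table.
    with conn:
        cur = conn.execute("DELETE FROM tblLocks WHERE LockUser = ?", (user,))
    return cur.rowcount
```

The primary key on (TableName, RowID) lets the database itself refuse a second lock on the same row, and release_locks_for_user() is the "kill all locks when the user logs out" step without scanning every table.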
Traditionally this is done with a boolean locked column on the record in the database that is flagged appropriately.
It is a function of this sort of locking that the lock has to be released, and circumstances may prevent this happening naturally (system crashes, user stupidity, dropped network packets, etc.). This is why you would need to provide some manual unlock method and/or impose a time limit (maybe with a cron job?) on how long a record can be locked for. You could implement some kind of AJAX poll to keep the record locked if the browser is still open. At any rate, you would probably be best to verify that the data in the record is the same as it was when the lock was acquired before you modify it.
The limitation of this type of behaviour is particularly prevalent in web applications, but is true of anything that uses this approach. Sage Line 50, for one, is a bugger for it; I regularly have to delete lock files after machine/application crashes.