I've developed a web application using Apache, MySQL and PHP.
This web app allows multiple users to log in.
Then, through the application, they have access to the Database.
Since race conditions can occur when two or more users try to SELECT/UPDATE/DELETE the same information (in my case a row of a table), I am looking for the best way to avoid them.
I've tried using mysqli with autocommit set to OFF and SELECT ... FOR UPDATE, but this doesn't work because, to my understanding, with PHP each transaction is committed automatically and each connection to the DB is released as soon as the generated HTML page is delivered to the user.
After reading some posts, there seem to be two possible solutions for my problem :
Use PDO. To my understanding, PDO can keep connections to the DB that are not released when the HTML page loads. Special precautions would still be needed, though, as locks may remain if, for example, the user leaves the page while the PDO connection has not been released...
Add a "locked" column to the corresponding table to flag locked rows. An UPDATE would then only be performed if the corresponding user has locked the row for editing; other users would not be allowed to modify it.
The main issue I may have with PDO is that I have to modify the PHP code in order to replace mysqli with PDO, where applicable.
The issue with scenario 2 is that I would also need to modify my DB schema, add code for locking/unlocking, and deal with the possibility of "hanging" locked rows. That may mean adding further columns to the table (e.g. to store the time the row was locked and the lockedBy information) and more code as well (e.g. JavaScript on the user's side that keeps refreshing the locked time so that the row stays flagged while the user works on it...).
Your comments based on your experience would be highly appreciated!!!
Thank you.
This might be more of an opinion than a technical answer, but it is too long to write as a comment.
I like to think of it like booking a seat for a movie or a flight: when a user selects a seat and presses next, the seat is reserved for that user for a certain amount of time, and if the user doesn't finish within that time, they get a timeout and no further processing happens. You could put an edit button beside each row; when a user clicks it, the server checks whether the row is reserved by someone else, and if not, reserves it for that user. Other users who click the edit button after that won't get an edit form. I don't know how database systems handle this internally, though.
But one way to make sure is to re-read the row after the user edits and commits it, and display the result to the user. If some lock mechanism prevented the row from being updated, the user will know because they won't see their change in the row.
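A minimal sketch of that reservation idea in PHP with PDO, assuming a ticket table with hypothetical locked_by and locked_at columns and a five-minute timeout (all of these names are illustrative, not from the question):

<?php
// Try to reserve the row for the current user before showing the edit form.
// Returns true if the reservation succeeded, false if someone else holds a live lock.
function reserveTicket(PDO $pdo, int $ticketId, int $userId): bool
{
    // The row can be taken if it is free, already ours, or its lock is older
    // than five minutes (a stale lock left behind by someone who walked away).
    $sql = "UPDATE ticket
               SET locked_by = ?, locked_at = NOW()
             WHERE idticket = ?
               AND (locked_by IS NULL
                    OR locked_by = ?
                    OR locked_at < NOW() - INTERVAL 5 MINUTE)";
    $stmt = $pdo->prepare($sql);
    $stmt->execute([$userId, $ticketId, $userId]);

    // rowCount() reports affected rows here; connect with PDO::MYSQL_ATTR_FOUND_ROWS
    // if you need matched rows instead.
    return $stmt->rowCount() === 1;
}

If this returns false, the edit form is simply not shown, which matches the "other users won't get an edit form" behaviour described above.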
I've been reading through several topics now and did some research about logging changes to a MySQL table. First, let me explain my situation:
I've a ticket system with a table: 'ticket'
As of now I've created triggers that insert a duplicate entry into my table 'ticket_history', which has "action", "user" and "timestamp" as additional columns. After some weeks of testing I'm not entirely happy with that design, since every change creates a full copy of the row in the history table. I do understand that disk space is cheap and I shouldn't worry about it, but retrieving some kind of log or nice-looking history for the user is painful, at least for me. Also, with the trigger I've written I get a new row in the history even if there is no change. But that is just a design flaw of my trigger!
Here my trigger:
DELIMITER $$
-- trigger name is illustrative; the original snippet omitted the CREATE TRIGGER line
CREATE TRIGGER ticket_before_update
BEFORE UPDATE ON ticket FOR EACH ROW
BEGIN
    INSERT INTO ticket_history
    SET idticket        = NEW.idticket,
        time_arrival    = NEW.time_arrival,
        idticket_status = NEW.idticket_status,
        tmp_user        = NEW.tmp_user,
        action          = 'update',
        `timestamp`     = NOW();
END$$
DELIMITER ;
My new approach in order to avoid having triggers
After spending some time on this topic I came up with an approach I would like to discuss and implement. But first I have some questions about it:
My idea is to create a new table:
id | sql_fwd   | sql_bwd   | keys   | values | user | timestamp
---|-----------|-----------|--------|--------|------|----------
1  | UPDATE... | UPDATE... | status | 5      | 14   | 12345678
2  | UPDATE... | UPDATE... | status | 4      | 7    | 12345678
The flow would look like this in my mind:
First, I would select one or more values from the DB:
SELECT keys FROM ticket;
Then I display the data in 2 input fields:
<input name="key" value="value" />
<input type="hidden" name="key" value="value" />
Hit submit and give it to my function:
I would start with a SELECT again: SELECT * FROM ticket;
and make sure that the hidden input field == the value from the latest SELECT. If so, I can proceed and know that no other user has changed anything in the meantime. If the hidden field does not match, I bring the user back to the form and display a message.
Next I would build the SQL Queries for the action and also the query to undo those changes.
$sql_fwd = "UPDATE ticket
SET idticket_status = 1
WHERE idticket = '".$c_get['id']."';";
$sql_bwd = "UPDATE ticket
SET idticket_status = 0
WHERE idticket = '".$c_get['id']."';";
Having that I run the UPDATE on ticket and insert a new entry in my new table for logging.
With that I can try to catch possible overwrites while two users are editing the same ticket at the same time, and for my history I could simply look up the keys and values and generate some kind of list. Also, having the sql_bwd, I can simply undo changes.
My questions to that would be:
Would it be noticeable to do an additional SELECT every time I want to update something?
Do I lose some benefits I would have with triggers?
Are there any big disadvantages?
Are there any functions on my mysql server or with php which already do something like that?
Or might there be a much easier way to do something like that?
Would a slight change to the trigger I already have be enough?
If I understand this right, MySQL only performs an update if the value has changed, but the trigger is executed anyway, right?
If I change the trigger, can I still somehow prevent data from being overwritten while two users try to edit the ticket at the same time on the MySQL server, or would I have to do this in PHP anyway?
Thank you for the help already
Another approach...
When a worker starts to make a change...
Store the time and worker_id in the row.
Proceed to do the tasks.
When the worker finishes, fetch the last worker_id that touched the record; if it is himself, all is well. Clear the time and worker_id.
If, on the other hand, another worker slips in, then some resolution is needed. This gets into your concept that some things can proceed in parallel.
Comments could be added to a different table, hence no conflict.
Changing the priority may not be an issue by itself.
Other things may be messier.
It may be better to have another table for the time & worker_ids (& ticket_id). This would allow for flagging that multiple workers are currently touching a single record.
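A rough sketch of that extra table, with invented names (ticket_touch, touched_at) purely for illustration:

<?php
// Record that a worker has started touching a ticket.
function touchTicket(PDO $pdo, int $ticketId, int $workerId): void
{
    $pdo->prepare("INSERT INTO ticket_touch (ticket_id, worker_id, touched_at)
                   VALUES (?, ?, NOW())")
        ->execute([$ticketId, $workerId]);
}

// When a worker finishes, find out who touched the ticket last.
// If it is not the finishing worker, some conflict resolution is needed.
function lastToucher(PDO $pdo, int $ticketId): ?int
{
    $stmt = $pdo->prepare("SELECT worker_id FROM ticket_touch
                            WHERE ticket_id = ?
                            ORDER BY touched_at DESC
                            LIMIT 1");
    $stmt->execute([$ticketId]);
    $worker = $stmt->fetchColumn();
    return $worker === false ? null : (int) $worker;
}

If lastToucher() returns someone else's id, the application can show both versions and let the workers sort it out, as described above.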
As for History versus Current, I (usually) like to have 2 tables:
History -- blow-by-blow list of what changes were made, when, and by whom. This table is only INSERTed into.
Current -- the current status of the ticket. This table is mostly UPDATEd.
Also, I prefer to write the History directly from the "database layer" of the app, not via Triggers. This gives me much better control over the details of what goes into each table and when. Plus the 'transactions' are clear. This gives me confidence that I am keeping the two tables in sync:
BEGIN; INSERT INTO History...; UPDATE Current...; COMMIT;
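A hedged PHP sketch of that pattern, assuming a PDO connection in exception mode and made-up column names, since the line above is only an outline:

<?php
// Write the audit row and the current state in one transaction,
// so History and Current cannot drift apart.
function updateTicketStatus(PDO $pdo, int $ticketId, int $newStatus, int $userId): void
{
    $pdo->beginTransaction();
    try {
        $pdo->prepare("INSERT INTO History (ticket_id, status, changed_by, changed_at)
                       VALUES (?, ?, ?, NOW())")
            ->execute([$ticketId, $newStatus, $userId]);

        $pdo->prepare("UPDATE Current SET status = ? WHERE ticket_id = ?")
            ->execute([$newStatus, $ticketId]);

        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();   // neither table is changed if either statement fails
        throw $e;
    }
}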
I've answered a similar question before. You'll see some good alternatives in that question.
In your case, I think you're merging several concerns - one is "storing an audit trail", and the other is "managing the case where many clients may want to update a single row".
Firstly, I don't like triggers. They are a side effect of some other action, and for non-trivial cases, they make debugging much harder. A poorly designed trigger or audit table can really slow down your application, and you have to make sure that your trigger logic is coordinated between lots of developers. I realize this is personal preference and bias.
Secondly, in my experience, the requirement is rarely "show the status of this one table over time" - it's nearly always "allow me to see what happened to the system over time", and if that requirement exists at all, it's usually fairly high priority. With a ticketing system, for instance, you probably want the name and email address of the users who created and changed the ticket status; the name of the category/classification, perhaps the name of the project, etc. All of those attributes are likely to be foreign keys onto other tables. And when something does happen that requires an audit, the requirement is likely "let me see it immediately", not "get a database developer to spend hours trying to piece together the picture from 8 different history tables". In a ticketing system, it's likely a requirement for the ticket detail screen to show this.
If all that is true, then I don't think history tables populated by triggers are a good idea - you have to build all the business logic into two sets of code, one to show the "regular" application, and one to show the "audit trail".
Instead, you might want to build "time" into your data model (that was the point of my answer to the other question).
Since then, a new style of data architecture has come along, known as CQRS. This requires a very different way of looking at application design, but it is explicitly designed for reactive applications; these offer much nicer ways of dealing with the "what happens if someone edits the record while the current user is completing the form" question. Stack Overflow is an example - we can see, whilst typing our comments or answers, whether the question was updated, or other answers or comments are posted. There's a reactive library for PHP.
I do understand that disk space is cheap and I should not worry about it but in order to retrieve some kind of log or nice looking history for the user is painful, at least for me.
A large history table is not necessarily a problem. Huge tables only use disk space, which is cheap. They slow things down only when you run queries on them. Fortunately, the history is not something you'd use all the time; most likely it is only used to solve problems or for auditing.
It is useful to partition the history table, for example by month or week. This allows you to simply drop very old records, and, more importantly, since the history of the previous months has already been backed up, your daily backup schedule only needs to back up the current month. This means a huge history table will not slow down your backups.
With that I can try to catch possible overwrites while two users are editing the same ticket in the same time
There is a simple solution:
Add a column "version_number".
When you select with intent to modify, you grab this version_number.
Then, when the user submits new data, you do:
UPDATE ...
SET all modified columns,
version_number=version_number+1
WHERE ticket_id=...
AND version_number = (the value you got)
If someone came in-between and modified it, then they will have incremented the version number, so the WHERE will not find the row. The query will return a row count of 0. Thus you know it was modified. You can then SELECT it, compare the values, and offer conflict resolution options to the user.
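For illustration only, roughly what that looks like in PHP with PDO (ticket and idticket_status come from the question; version_number is the added column, and the variable names are placeholders):

<?php
// $oldVersion is the version_number that was read when the edit form was built.
$stmt = $pdo->prepare(
    "UPDATE ticket
        SET idticket_status = ?,
            version_number  = version_number + 1
      WHERE idticket = ?
        AND version_number = ?"
);
$stmt->execute([$newStatus, $ticketId, $oldVersion]);

if ($stmt->rowCount() === 0) {
    // Someone else incremented version_number in the meantime:
    // re-SELECT the row and offer the user conflict resolution options.
}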
You can also add columns like who modified it last, and when, and present this information to the user.
If you want the user who opens the modification page to lock out other users, it can be done too, but this needs a timeout (in case they leave the window open and go home, for example). So this is more complex.
Now, about history:
You don't want to have, say, one large TEXT column called "comments" where everyone enters stuff, because it will need to be copied into the history every time someone adds even a single letter.
It is much better to view it like a forum: each ticket is like a topic, which can have a string of comments (like posts), stored in another table, with the info about who wrote it, when, etc. You can also keep a history of that table in the same way.
The drawback of using a trigger is that the trigger does not know about the user who is logged in, only the MySQL user. So if you want to record who did what, you will have to add a column with the user_id as I proposed above. You can also use Rick James' solution. Both would work.
Remember though that MySQL triggers don't fire on foreign key cascade deletes... so if the row is deleted in this way, it won't work. In this case doing it in the application is better.
Overview
Consider the following details:
We have a table named user. In it is a column named wallet.
We have a table named walletAction. We insert a new entry for each wallet action a user performs. This table acts as a kind of log in the database, with some calculations.
We have a CRON command that runs an update every N minutes. Each CRON run gets some data from a standalone API and inserts a new walletAction entry. At the same time, it updates the value of user.wallet.
A user can buy stuff from our site. When the user clicks the buy button, we insert a new walletAction entry and change the user.wallet column.
Problem
I am afraid that the CRON update and the user's click of the buy button will happen at the exact same time, causing the entries in the walletAction table to end up with wrong calculations.
I need some kind of 'lock' on the CRON update execution or something along those lines.
Questions
Should I be afraid of this situation?
How can I avoid this problem?
Can I avoid this trouble by using MySQL transactions?
What isolation level should I use and in which case should I use it? (In the CRON command or in the action of the user when they click the buy button?)
It seems that we don't have concurrency primitives in PHP the way we do in Go or Java. You can implement some technical tricks, but almost all of them create new problems for you :). To solve your problem I suggest using an optimistic lock. For more information see http://www.yiiframework.com/doc-2.0/guide-db-active-record.html#optimistic-locks.
Yes, in this case I would recommend using transactions with the strongest isolation level, yii\db\Transaction::SERIALIZABLE.
This level should prevent "phantom reads" and "non-repeatable reads".
Moreover, I recommend always using transactions when you perform more than one related change, because it helps keep the DB consistent.
This prevents problems where a PHP exception occurs after new rows were successfully inserted into walletAction but before user.wallet has been updated.
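Outside of Yii, a rough PDO equivalent (with invented column names for walletAction and an assumed id column on user) might look like this:

<?php
// Assumes PDO::ERRMODE_EXCEPTION so failures throw.
// SERIALIZABLE applies to all following transactions on this connection.
$pdo->exec("SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE");

$pdo->beginTransaction();
try {
    $pdo->prepare("INSERT INTO walletAction (user_id, amount, created_at)
                   VALUES (?, ?, NOW())")
        ->execute([$userId, $amount]);

    $pdo->prepare("UPDATE user SET wallet = wallet + ? WHERE id = ?")
        ->execute([$amount, $userId]);

    $pdo->commit();    // both changes become visible together
} catch (Throwable $e) {
    $pdo->rollBack();  // a failure between the two statements rolls the INSERT back too
    throw $e;
}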
How does PHP handle multiple requests from users? Does it process them all at once or one at a time waiting for the first request to complete and then moving to the next.
Actually, I'm adding a bit of wiki functionality to a static site where users will be able to edit the addresses of businesses if they find them inaccurate or if they can be improved. Only registered users may do so. When a user edits a business name, that name, along with its other occurrences, is changed in different rows of the table. I'm a little worried about what would happen if 10 users were doing this simultaneously. It'd be a real mishmash of things. So does PHP do things one at a time, in the order received, per script (update.php), or all at once?
Requests are handled in parallel by the web server (which runs the PHP script).
Updating data in the database is pretty fast, so any update will appear instantaneous, even if you need to update multiple tables.
Regarding the mishmash: for the DB, handling 10 requests within 1 second is the same as 10 requests within 10 seconds; it won't confuse them and will just execute them one after the other.
If you need to update 2 tables and absolutely need these 2 updates to run back to back without being interrupted by another update query, then you can use transactions.
EDIT:
If you don't want 2 users editing the same form at the same time, you have several options to prevent them. Here are a few ideas:
You can "lock" that record for edition whenever a user opens the page to edit it, and not let other users open it for edition. You might run into a few problems if a user doesn't "unlock" the record after they are done.
You can notify in real time (with AJAX) a user that the entry they are editing was modified, just like on stack overflow when a new answer or comment was posted as you are typing.
When a user submits an edit, you can check if the record was edited between when they started editing and when they tried to submit it, and show them the new version beside their version, so that they manually "merge" the 2 updates.
There probably are more solutions but these should get you started.
It depends on which version of Apache you are using and how it is configured, but a common default configuration uses multiple workers with multiple threads to handle simultaneous requests. See http://httpd.apache.org/docs/2.2/mod/worker.html for a rundown of how this works. The end result is that your PHP scripts may together have dozens of open database connections, possibly sending several queries at the exact same time.
However, your DBMS is designed to handle this. If you are only doing simple INSERT queries, then your code doesn't need to do anything special. Your DBMS will take care of the necessary locks on its own. Row-level locking will be fastest for multiple INSERTs, so if you use MySQL, you should consider the InnoDB storage engine.
Of course, your query can always fail, whether it's due to too many database connections, a conflict on a unique index, etc. Wrap your queries in try/catch blocks to handle this case.
If you have other application-layer concerns about concurrency, such as one user overwriting another user's changes, then you will need to handle these in the PHP script. One way to handle this is to use revision numbers stored along with your data, and refusing to execute the query if the revision number has changed, but how you handle it all depends on your application.
I have created an office scheduling program that uses jQuery to post to a PHP file, which then inserts an appointment into a PostgreSQL database. This has not happened yet, but I can foresee the problem in the future: two office workers try to schedule an appointment in the same slot at the same time, creating a race condition in which one set of customer data would be lost, or at least I'd have to dig it out of a log. I was wondering whether there is a flag I could set in the database, whether I need to create some kind of gatekeeper program to control server connections, or whether there is some kind of mutex/lock/semaphore I can use with JavaScript/PHP/SQL to keep this race condition from occurring.
You can lock it with a database flag, but a better strategy is to detect collisions, since they only happen in rare cases.
To detect the problem, you can save a timestamp from the database containing the last updated time. Send this along with the form, and compare the timestamp before you update the record. If the timestamp has changed, then present the user with all the data and ask them what they want to do. This offers a way for the second saving user to modify their changes based on the previously saved data if they wish.
There are other ways to solve this problem, and the proper solution depends the nature of the specific problem.
I am writing a PHP/MySQL application (using CodeIgniter) that uses some jQuery functionality for dragging table rows. I have a table in which the user can drag rows to the desired order (kind of a queue for which I need to preserve the rank of each row). I've been trying to figure out how to (and whether I should) update the database each time the user drops a row, in order to simplify the UI and avoid a "Save" button.
I have the jQuery working and can send a serialized list back to the server onDrop, but is it good design practice to run an update query this often? The table will usually have 30-40 rows max, but if the user drags row 1 far down the list, then potentially all the rows would need to be updated to update the rank field.
I've been wondering whether to send a giant query to the server, to loop through the rows in PHP and update each row with its own Update query, to send a small serialized list to a stored procedure to let the server do all the work, or perhaps a better method I haven't considered. I've read that stored procedures in MySQL are not very efficient and use a separate process for each call. Any advice as to the right solution here? Thanks very much for your help!
Any question that includes "The table will usually have 30-40 rows max" ends with "Do whatever you want to it." I can't imagine an operation, however frequently it's performed, that would have any appreciable performance impact on a table that tiny.
The only real question is what the visitor will be doing while your request is going to and returning from the server. Will they be locked out of making other changes? If not, make sure you have a mechanism to ensure that the most recent change is the one that's really taken effect. (It's possible for requests to reach the server out of order, and you wouldn't want an outdated request to get saved as the final state.)
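One hedged way to get that guarantee, assuming the client sends a monotonically increasing sequence number (for example Date.now()) with each drop, and assuming invented table names queue and queue_item:

<?php
// Apply a reorder only if it is newer than the last one we saved;
// out-of-date requests that arrive late are silently dropped.
function saveOrder(PDO $pdo, int $queueId, array $orderedItemIds, int $clientSeq): bool
{
    $pdo->beginTransaction();

    $stmt = $pdo->prepare("UPDATE queue SET last_seq = ? WHERE id = ? AND last_seq < ?");
    $stmt->execute([$clientSeq, $queueId, $clientSeq]);

    if ($stmt->rowCount() === 0) {      // a newer order was already applied
        $pdo->rollBack();
        return false;
    }

    $update = $pdo->prepare("UPDATE queue_item SET `rank` = ? WHERE id = ?");
    foreach ($orderedItemIds as $rank => $itemId) {
        $update->execute([$rank, $itemId]);   // 30-40 rows, so this loop is cheap
    }

    $pdo->commit();
    return true;
}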