I have created an office scheduling program that uses jQuery to post to a PHP file, which then inserts an appointment into a PostgreSQL database. It has not happened yet, but I can foresee a problem: two office workers try to schedule an appointment in the same slot at the same time, creating a race condition, and one set of customer data is lost (or at least I'd have to dig it out of a log). Is there a flag I can set in the database, do I need to create some kind of gatekeeper program to control server connections, or is there some kind of mutex/lock/semaphore I can use with JavaScript/PHP/SQL to keep this race condition from occurring?
You can either lock it with a database flag, or a better strategy is to detect collisions, since this only happens in rare cases.
To detect the problem, you can save a timestamp from the database containing the last updated time. Send this along with the form, and compare the timestamp before you update the record. If the timestamp has changed, then present the user with all the data and ask them what they want to do. This offers a way for the second saving user to modify their changes based on the previously saved data if they wish.
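A minimal sketch of the timestamp check described above, written in Python with an in-memory SQLite database standing in for PostgreSQL so that it runs on its own. The table `appointment`, the column `updated_at`, and the function names are hypothetical. The comparison and the write happen in one conditional UPDATE, so there is no gap between checking the timestamp and saving.

```python
import sqlite3


def make_db():
    # In-memory SQLite is an assumption made for a runnable example.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE appointment ("
        " id INTEGER PRIMARY KEY, customer TEXT, updated_at TEXT)"
    )
    conn.execute(
        "INSERT INTO appointment VALUES (1, 'Alice', '2024-01-01 09:00:00')"
    )
    conn.commit()
    return conn


def save_appointment(conn, appt_id, customer, seen_updated_at, now):
    """Update only if the row still has the timestamp the form was loaded with.

    Returns (True, None) on success, or (False, current_row) on a collision,
    so the caller can show the user the data that was saved in the meantime.
    """
    cur = conn.execute(
        "UPDATE appointment SET customer = ?, updated_at = ? "
        "WHERE id = ? AND updated_at = ?",
        (customer, now, appt_id, seen_updated_at),
    )
    conn.commit()
    if cur.rowcount == 1:
        return True, None
    current = conn.execute(
        "SELECT customer, updated_at FROM appointment WHERE id = ?",
        (appt_id,),
    ).fetchone()
    return False, current
```

If two workers both load the form at `2024-01-01 09:00:00`, the first call to `save_appointment` succeeds and the second gets back `False` together with the row the first worker saved. One caveat: if the timestamp has only one-second resolution, two saves within the same second can go undetected, so a finer-grained timestamp or a version counter is safer.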
There are other ways to solve this problem, and the proper solution depends on the nature of the specific problem.
I've developed a web application using Apache, MySQL and PHP.
This web app allows multiple users to log in to the application.
Then, through the application, they have access to the Database.
Since race conditions may apply when two or more users try to SELECT/UPDATE/DELETE the same information (in my case a row of a Table), I am searching for the best way to avoid such race conditions.
I've tried using mysqli with autocommit set to OFF and SELECT ... FOR UPDATE, but this fails to work because, to my understanding, with PHP each transaction commits automatically and each connection to the database is released when the PHP-generated HTML page is served to the user.
After reading some posts, there seem to be two possible solutions for my problem:
1. Use PDO. To my understanding, PDO creates connections to the DB which are not released when the HTML page loads. Special precautions should be taken, though, as locks may remain if e.g. the user quits the page and the PDO connection has not been released...
2. Add a "locked" column in the corresponding table to flag locked rows. For example, an UPDATE may only be performed if the corresponding user has locked the row for editing. Other users are not allowed to modify it.
The main issue I may have with PDO is that I have to modify the PHP code in order to replace mysqli with PDO, where applicable.
The issue with scenario 2 is that I also need to modify my DB schema, add code for lock/unlock, and consider the possibility of "hanging" locked rows. That may require additional columns in the table (e.g. to store the time the row was locked and the lockedBy information) and more code as well (e.g. JavaScript running on the user's side that keeps updating the locked time, so that the user continuously flags the row while using it).
Your comments based on your experience would be highly appreciated!!!
Thank you.
It might be an opinion instead of a technical answer, but too long to write it as a comment.
I think of it like booking a seat at a movie or on a flight: when a user selects a seat and presses next, the seat is reserved for that user for a certain amount of time, and if the user doesn't finish within that time, they get a timeout and processing stops. You can put an edit button beside the row; when the user clicks it, the server checks whether the row is reserved for someone else and, if not, reserves it for that user. Other users who click the edit button after that won't get an edit form. I don't know how database systems handle this, though.
One way to make sure, though, is to re-read the row after the user edits and commits, and display it to the user. If any lock mechanism prevented the row from being updated, the user will know because the change will be missing from the row.
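The seat-reservation idea can be sketched roughly as follows, in Python with in-memory SQLite standing in for MySQL. The `locked_by` and `locked_until` columns, the five-minute timeout, and the function names are all assumptions made for illustration. An expired reservation is treated as free, which covers the case where a user closes the browser without saving.

```python
import sqlite3

LOCK_TTL = 300  # seconds a reservation lasts without renewal (assumed value)


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record ("
        " id INTEGER PRIMARY KEY, body TEXT,"
        " locked_by TEXT, locked_until INTEGER)"
    )
    conn.execute("INSERT INTO record (id, body) VALUES (1, 'original')")
    conn.commit()
    return conn


def try_reserve(conn, record_id, user, now):
    """Reserve the row if it is free, expired, or already held by `user`.

    Calling this again while holding the reservation renews it, which is
    what a periodic ping from the edit page would do.
    """
    cur = conn.execute(
        "UPDATE record SET locked_by = ?, locked_until = ? "
        "WHERE id = ? "
        "AND (locked_by IS NULL OR locked_until <= ? OR locked_by = ?)",
        (user, now + LOCK_TTL, record_id, now, user),
    )
    conn.commit()
    return cur.rowcount == 1


def save_and_release(conn, record_id, user, body, now):
    """Write the edit only while the caller's reservation is still valid."""
    cur = conn.execute(
        "UPDATE record SET body = ?, locked_by = NULL, locked_until = NULL "
        "WHERE id = ? AND locked_by = ? AND locked_until > ?",
        (body, record_id, user, now),
    )
    conn.commit()
    return cur.rowcount == 1
```

Here `now` is a Unix timestamp passed in by the caller. Both functions return `False` when the reservation belongs to someone else or has timed out, which is the point at which the application would refuse the edit form or reject the save.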
I work on a market research database centric website, developed in PHP and MySQL.
It consists of two big parts: one in which users insert and update their own data (let's say one table T with a user_id field) and another in which a website administrator can insert new records or update existing ones (same table).
Obviously, in some cases end users will have their data overridden by the administrator while in other cases, administrator entered data is updated by end users (it is fine both ways).
The requirement is to highlight the view/edit forms with (let’s say) blue if end user was the last to update a certain field or red if the administrator is to “blame”.
I am looking into an efficient and consistent method to implement this.
So far, I have the following options:
1. For each field in table T, add a companion column ( char(1) ) in which to write ‘U’ if the end user inserted/updated the field or ‘A’ if the administrator did so. When the view/edit form is rendered, use this information to highlight each field accordingly.
2. Create a new table H storing an edit history containing something like user_id, field_name, last_update_user_id. Keep table H up-to-date when fields are updated in main table T. When the view/edit form is rendered, use this information to highlight each form field accordingly.
What are the pros/cons of these options; can you suggest others?
I suppose it just depends how forward-looking you want to be.
Your first approach has the advantage of being very simple to implement, is very straightforward to update and utilize, and also will only increase your storage requirements very slightly, but it's also the extreme minimum in terms of the amount of information you're storing.
If you go with the second approach and store a more complete history, if you need to add an "edit history" in the future, you'll already have things set up for that, and a lot of data waiting around. But if you end up never needing this data, it's a bit of a waste.
Or if you want the best of both worlds, you could combine them. Keep a full edit history but also update the single-character flag in the main record. That way you don't have to do any processing of the history to find the most recent edit, just look at the flag. But if you ever do need the full history, it's available.
Personally, I prefer keeping more information than I think I'll need at the time. Storage space is very cheap, and you never know when it's going to come in handy. I'd probably go even further than what you proposed, and also make it so the edit history keeps track of what they changed, and the before/after values. That can be very handy for debugging, and could be useful in the future depending on the project's exact needs.
Yes, implement an audit table that holds copies of the historical data, by whom it was changed, etc. I currently work on a system that keeps it simple and writes the value changes as name-value string pairs along with the date and the person who made the change. It requires a mandatory adjustment to the master record, but it works well for tracking. You could implement this easily with a trigger.
The best way to audit data changes is through a trigger on the database table. In your case you may want to just update the last person to make the change. Or you may want a full auditing solution where you store the previous values making it easy to restore them if they were made in error. But the key to this is to do this on the database and not through the application. Database changes are often made through sources other than the application and you will want to know if this happened as well. Suppose someone hacked into the database and updated the data, wouldn't you like to be able to find the old data easily or know who did it even if he or she did it through a query window and not through the application? You might also need to know if the data was changed through a data import if you ever have to get large amounts of data at one time.
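A trigger-based audit along these lines might look like the sketch below, shown in Python with in-memory SQLite so that it runs standalone; trigger syntax in MySQL differs somewhat. The tables `t` and `t_history`, the `answer` field, and the `last_update_by` flag ('U' or 'A') are hypothetical names, not taken from the original schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE t (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    answer TEXT,
    last_update_by TEXT          -- 'U' = end user, 'A' = administrator
);
CREATE TABLE t_history (
    t_id INTEGER,
    field_name TEXT,
    old_value TEXT,
    new_value TEXT,
    changed_by TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Fires for every UPDATE that sets the answer column, whatever its source.
CREATE TRIGGER t_audit_answer AFTER UPDATE OF answer ON t
BEGIN
    INSERT INTO t_history (t_id, field_name, old_value, new_value, changed_by)
    VALUES (OLD.id, 'answer', OLD.answer, NEW.answer, NEW.last_update_by);
END;
"""


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO t VALUES (1, 42, 'first draft', 'U')")
    conn.commit()
    return conn


def update_answer(conn, row_id, answer, who):
    # The application only updates the main table; the trigger logs history.
    conn.execute(
        "UPDATE t SET answer = ?, last_update_by = ? WHERE id = ?",
        (answer, who, row_id),
    )
    conn.commit()


def last_editor(conn, row_id, field_name):
    """Return 'U' or 'A' for the most recent change, or None if never edited."""
    row = conn.execute(
        "SELECT changed_by FROM t_history "
        "WHERE t_id = ? AND field_name = ? ORDER BY rowid DESC LIMIT 1",
        (row_id, field_name),
    ).fetchone()
    return row[0] if row else None
```

The form renderer would call something like `last_editor` to pick blue or red for each field. In this sketch the trigger trusts the `last_update_by` value supplied by the UPDATE; a change made from a query window that leaves that column alone would be logged with the stale flag. PostgreSQL and MySQL can instead record the database session user inside the trigger (`current_user` and `CURRENT_USER()` respectively).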
We have this PHP application which selects a row from the database, works on it (calls an external API which uses a web service), and then inserts a new record based on the work done. There's an AJAX display which informs the user of how many records have been processed.
The data is mostly text, so it's rather heavy data.
The process handles thousands of records at a time. The user can choose how many records to start working on. The data is read from one table, where processed records are marked as "done". No "WHERE" condition, except the optional "WHERE date BETWEEN date1 AND date2".
We had an argument over which approach is better:
Select one record, work on it, and insert the new data
Select all of the records, work with them in memory and insert them in the database after all the work was done.
Which approach do you consider the most efficient one for a web environment with PHP and PostgreSQL? Why?
It really depends how much you care about your data (seriously):
Does reliability matter in this case? If the process dies, can you just re-process everything? Or can't you?
Typically when calling a remote web service, you don't want to be calling it twice for the same data item. Perhaps there are side effects (like credit card charges), or maybe it is not a free API...
Anyway, if you don't care about potential duplicate processing, then take the batch approach. It's easy, it's simple, and fast.
But if you do care about duplicate processing, then do this:
SELECT 1 record from the table FOR UPDATE (i.e. lock it in a transaction)
UPDATE that record with a status of "Processing"
Commit that transaction
And then
Process the record
Update the record contents, AND
SET the status to "Complete", or "Error" in case of errors.
You can run this code concurrently without fear of it running over itself. You will be able to have confidence that the same record will not be processed twice.
You will also be able to see any records that "didn't make it", because their status will be "Processing", and any errors.
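The claim-then-process steps above could be sketched as follows. This is Python with SQLite so that it runs standalone; SQLite has no SELECT ... FOR UPDATE, so `BEGIN IMMEDIATE` (which takes the write lock up front) stands in for the row lock that PostgreSQL would take. The `job` table, its `status` values, and the function names are hypothetical, and `work` is whatever function calls the external web service.

```python
import sqlite3


def make_db(path=":memory:"):
    # isolation_level=None stops the sqlite3 module from opening transactions
    # implicitly, so the BEGIN/COMMIT statements below are explicit.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE job ("
        " id INTEGER PRIMARY KEY, payload TEXT,"
        " status TEXT DEFAULT 'Pending', result TEXT)"
    )
    conn.executemany("INSERT INTO job (payload) VALUES (?)", [("a",), ("b",)])
    return conn


def claim_next(conn):
    """Lock, mark one pending record as 'Processing', and commit."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM job "
            "WHERE status = 'Pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE job SET status = 'Processing' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row


def process_one(conn, work):
    """Claim a record, run `work` outside the lock, then store the outcome."""
    row = claim_next(conn)
    if row is None:
        return False  # nothing left to process
    job_id, payload = row
    try:
        result, status = work(payload), "Complete"
    except Exception as exc:
        result, status = str(exc), "Error"
    conn.execute(
        "UPDATE job SET result = ?, status = ? WHERE id = ?",
        (result, status, job_id),
    )
    return True
```

The lock is held only while the record is being claimed, not during the slow external call. A record left in "Processing" after a crash is one whose web-service call may or may not have happened, so it needs manual review rather than an automatic retry.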
If the data is heavy and so is the load, and the application is not real-time dependent, the best approach is most definitely to fetch the needed data, work on all of it, and then put it back.
In terms of efficiency, regardless of language, if you open single items and work on them individually, you are probably closing the database connection each time. This means that if you have thousands of items, you will open and close thousands of connections. That overhead far outweighs the overhead of returning all of the items and working on them.
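For comparison, a rough sketch of the batch approach: read a chunk of unprocessed records over one connection, do the work in memory, and write everything back in a single transaction. It uses Python with in-memory SQLite in place of PostgreSQL; the `source` and `output` tables, the `done` flag, and `process_batch` are hypothetical names.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source ("
        " id INTEGER PRIMARY KEY, body TEXT, done INTEGER DEFAULT 0)"
    )
    conn.execute("CREATE TABLE output (source_id INTEGER, body TEXT)")
    conn.executemany(
        "INSERT INTO source (body) VALUES (?)", [("x",), ("y",), ("z",)]
    )
    conn.commit()
    return conn


def process_batch(conn, work, limit):
    """Fetch up to `limit` unprocessed rows, work in memory, write back once."""
    rows = conn.execute(
        "SELECT id, body FROM source WHERE done = 0 ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()
    results = [(row_id, work(body)) for row_id, body in rows]
    with conn:  # one transaction: commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO output (source_id, body) VALUES (?, ?)", results
        )
        conn.executemany(
            "UPDATE source SET done = 1 WHERE id = ?",
            [(row_id,) for row_id, _ in results],
        )
    return len(results)
```

If `work` fails partway through, nothing is written and the whole batch is retried, which is the duplicate-processing risk raised earlier. A single connection can also be reused across records in the one-at-a-time approach, so the connection overhead applies only when the code reconnects for every record.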
Handling Multiple Users
Requirements:
I have an application (MySQL, PHP, jQuery) where users can:
Review records and update certain fields.
Issue invoices by selecting orders.
Issues:
The issue is that an invoice should not be issued twice for the same time period. Also, a field should not be updated by two or more users at the same time.
Possible Solutions:
Lock the tables when they get updated, and if the user performs an action, notify and reload.
Implement a lock system so that when a user performs certain actions, other users are blocked from performing them.
...
Look up 'optimistic locking'. It basically means adding a version attribute, passing it back with each update, and incrementing it, to make sure nobody else got there first. If N users try the same operation based on the same version, one wins and the others lose. It's fast, simple, and easy for a wide variety of cases.
Don't know if this will help you or not, but I first read about this in the context of .NET's DataTable adapter, which tracks the changes made to the data rows since you read them and sends them back to the database after changing. What it does is send all the fields instead of just the changed ones.
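A minimal sketch of optimistic locking with a version column, using Python and in-memory SQLite in place of MySQL. The `orders` table, its `version` column, and the function names are assumptions made for illustration.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY, note TEXT,"
        " version INTEGER NOT NULL DEFAULT 1)"
    )
    conn.execute("INSERT INTO orders (id, note) VALUES (1, 'draft')")
    conn.commit()
    return conn


def read_order(conn, order_id):
    """Return (note, version); the version travels with the edit form."""
    return conn.execute(
        "SELECT note, version FROM orders WHERE id = ?", (order_id,)
    ).fetchone()


def update_order(conn, order_id, note, expected_version):
    """Apply the update only if nobody has bumped the version since the read."""
    cur = conn.execute(
        "UPDATE orders SET note = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (note, order_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means another user got there first
```

Two users who both read version 1 can both call `update_order` with `expected_version=1`, but only the first call matches a row. The second returns `False`, and the application can then reload the record and notify the user.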
You can use time-stamps for the rows. Read the time stamp with other info and before saving check if the current time-stamp (of rows) is newer than what you have. This way you can minimize locking to just this portion, comparing time-stamps and updating if you are the first one to reach there.
Thank you both. Will look into both options: optimistic locking (http://cwiki.apache.org/CAY/optimistic-locking-explained.html) and the timestamp approach.
I'm developing a php / mysql application that handles multiple simultaneous users. I'm thinking of the best approach to take when it comes to locking / warning against records that are currently being viewed / edited.
The scenario to avoid is two users viewing the record, one making a change, then the other doing likewise - with the potential that one change might overwrite the previous.
In the latest versions of WordPress they use some method to detect this, but it does not seem wholly reliable - often returning false positives, at least in my experience.
I assume some form of ajax must be in place to 'ping' the application and let it know the record is still being viewed / edited (otherwise, a user might simply close their browser window, and then how would the application know that).
Another solution I could see is to check the last updated time when a record is submitted for update, to see if in the interim it has been updated elsewhere - and then offer the user a choice to proceed or discard their own changes.
Perhaps I'm barking up the wrong tree in terms of a solution. What are people's experiences of implementing what must be a fairly common requirement?
I would do this: Store the time of the last modification in the edit form. Compare this time on submission with the time stored in the database. If they are the same, lock the table, update the data (along with the modification time) and unlock the table. If the times are different, notify the user about it and ask for the next step.
Good idea with the timestamp comparison. It's inexpensive to implement, and it's an inexpensive operation to run in production. You just have to write the logic to send back to the user the status message that their write/update didn't occur because someone beat them to it.
Perhaps consider storing the username on each update in a field called something like 'LastUpdateBy', and return it to the user whose update was pre-empted. Just a little nicety for the user. That is nice in a corporate setting, though perhaps not in an environment where showing usernames would be inappropriate.