Data Blocking in PHP, MySQL

I'm writing a queue management system for a small clinic. Multiple users will be trying to do the same thing, so there is a concurrency problem. I'm familiar with the ACID guarantees and understand the notion of a transaction. I know that two people cannot change the same data at the same time.
But here's my problem: I have a PHP function isFree($time) which determines whether a particular doctor is free at that time. I'm afraid that if two users call the function at the same moment, both may get a positive result and mess things up, so I need to either queue concurrent users or accept only one.
The easiest way to solve this would be to restrict my function so that only one user can call it at a time. I probably need some kind of flag or blocking system, but I have no exact idea how to do it.
On the other hand, it would be even faster to restrict only those calls that may overlap. For example, calling isFree($time) for Monday and for Tuesday at the same time won't cause any problems.

You're effectively asking for a lock.
I am guessing your queue system uses MySQL as its database. If so, you can LOCK the table you're using (or, on some storage engines, just the specific rows you are using). The syntax is LOCK TABLES yourTableName WRITE.
A WRITE lock prevents every other session from reading or writing the table (a READ lock would only block writers) until:
Your session ends
You free the lock (using UNLOCK TABLES)
Table locks work with every storage engine. InnoDB additionally supports row-level locking through transactions: instead of issuing a plain SELECT, suffix the query with FOR UPDATE to get an exclusive lock on the row(s) you just read.
The MySQL documentation on InnoDB locking sheds more light on the mechanism. The row locks are released when the transaction ends, that is, when you COMMIT or ROLLBACK; an UPDATE on its own does not release them.
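A minimal sketch of how the isFree() check could be combined with a row lock, assuming PDO and hypothetical tables doctors and appointments (none of these names come from the question). Locking the doctor's row means two bookings for the same doctor queue up, while bookings for different doctors proceed in parallel. The demo at the bottom runs on in-memory SQLite only so the sketch is self-contained; the FOR UPDATE clause is added only when the driver is MySQL.

```php
<?php
// Sketch only: table and column names (doctors, appointments, doctor_id,
// slot_time) and the function name bookSlot() are assumptions.

/**
 * Try to book $slotTime for $doctorId. Returns true if the slot was free
 * and is now booked, false if it is already taken or the doctor is unknown.
 */
function bookSlot(PDO $pdo, int $doctorId, string $slotTime): bool
{
    // FOR UPDATE is MySQL/InnoDB syntax; it is skipped on other drivers
    // (such as SQLite in the demo below) so the sketch still runs there.
    $lock = $pdo->getAttribute(PDO::ATTR_DRIVER_NAME) === 'mysql' ? ' FOR UPDATE' : '';

    $pdo->beginTransaction();
    try {
        // Lock the doctor's row: concurrent bookings for the same doctor
        // wait here until the transaction holding the lock ends.
        $stmt = $pdo->prepare('SELECT id FROM doctors WHERE id = ?' . $lock);
        $stmt->execute([$doctorId]);
        if ($stmt->fetchColumn() === false) {
            $pdo->rollBack();
            return false; // unknown doctor
        }

        // The isFree() check, done while the lock is held.
        $stmt = $pdo->prepare(
            'SELECT COUNT(*) FROM appointments WHERE doctor_id = ? AND slot_time = ?'
        );
        $stmt->execute([$doctorId, $slotTime]);
        if ((int) $stmt->fetchColumn() > 0) {
            $pdo->rollBack();
            return false; // somebody else already has this slot
        }

        $stmt = $pdo->prepare('INSERT INTO appointments (doctor_id, slot_time) VALUES (?, ?)');
        $stmt->execute([$doctorId, $slotTime]);
        $pdo->commit(); // ends the transaction and releases the row lock
        return true;
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}

// Demo against an in-memory SQLite database (single connection, no contention).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE doctors (id INTEGER PRIMARY KEY)');
$pdo->exec('CREATE TABLE appointments (doctor_id INTEGER, slot_time TEXT)');
$pdo->exec('INSERT INTO doctors (id) VALUES (1)');

$first  = bookSlot($pdo, 1, '2024-01-08 09:00:00'); // slot is free
$second = bookSlot($pdo, 1, '2024-01-08 09:00:00'); // same slot again
$other  = bookSlot($pdo, 1, '2024-01-09 09:00:00'); // different day
```

The check and the insert live in one function on purpose: a separate isFree() call followed later by an insert would leave a gap in which another request could take the slot.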

Related

How can I prevent concurrency while writing and reading from my DB?

I have a table that holds user login and registration information. When two users try to add their details at nearly the same time:
Will the two writes clash, so that the table won't be updated?
Using threads for these writes seems like a bad idea, since a new thread would be created for each write and that would clog the server. Is the server responsible for managing this on its own?
Is locking the table a good idea?
My back-end runs on PHP/Apache with MySQL (InnoDB) for the database.

Relational databases are designed to avoid these kinds of conditions. You don't need to worry about them unless you are designing your own relational database from scratch.
In short, just know this: any time a write is initiated, the database (InnoDB, in your case) takes a row-level lock. If another transaction wants to write to that same row, it has to wait until the first transaction releases the lock. This is a fundamental part of relational databases. You don't need to add a lock because they've already thought of that :)
The MySQL documentation on InnoDB locking explains in more detail how MySQL uses locks and how it handles deadlocks and other transaction errors.

If you're really paranoid about this, or perhaps you are doing multiple things when you register a user and need them done atomically, you might want to look at using Transactions in MySQL. There's a decent write-up about Transactions here http://www.mysqltutorial.org/mysql-transaction.aspx

BEGIN;
-- do related reads/writes to the data
COMMIT;
Inside that "transaction", the connection sees a consistent view of the data, and blocks anyone else from messing with that view.
There are exceptions. The main one is when you read values and then change them based on what you read:
BEGIN;
SELECT ... FOR UPDATE;
-- fiddle with the values SELECTed
UPDATE ...; -- and change those values
COMMIT;
The SELECT ... FOR UPDATE announces what should not be tampered with. If another connection wants to mess with the same rows, it will have to wait until your COMMIT, at which time it may find that things have changed and that it needs to do something different. But, in general, this avoids a "deadlock" wherein two transactions are stepping on each other so badly that one has to be "rolled back".
With techniques like this, the "concurrency" is prevented only briefly and relatively precisely. That is, if two connections are working with different rows, both can proceed -- there is no need to "prevent concurrency".

What makes SQL Server decide what lock level to use (row, page or table lock?)

SQL Server has many ways of locking a resource. I am trying to understand what makes SQL Server pick a particular lock level. When will it use a page or table lock over a row lock?
Problem
I have a PHP application that wraps every HTTP request in a transaction to ensure all queries are executed before a commit. One issue that is puzzling me is that when many (5+) people use the application, it seems to hang (spinning for long periods of time). Nothing I can think of would cause such behavior except database locks. The scenario I suspect is that SQL Server is choosing a page or table lock over a row lock for some reason. I want to ensure that SQL Server takes row locks, not page or table locks. I am using an ORM, so I can't use the ROWLOCK hint in my queries.
Is there a way for me to run a query's explain plan to see what lock level will be used?

There is no default granularity in lock modes.
In general, the optimizer will choose the best course of action to handle this.
Could it be a case of livelock, where a long-running transaction leads to resource starvation?
You can also read up on lock escalation, but I'd suggest not disabling it for any table.

MySql table lock in PHP

I am making a web service in PHP which does a series of calculations based on a select from a table and then updates the table with the new results.
However, I want to prevent the case where one person calls the web service while another person's session is still doing an update.
Is the right thing here to lock the entire table and then unlock it again? If so, how do I lock and unlock a MySQL table using PHP PDO?

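Database management systems like MySQL are smart enough to prevent many concurrency violations like these.
Look up database isolation levels (read uncommitted, read committed, repeatable read, serializable) and the problems they guard against (dirty read, non-repeatable read, ...); Wikipedia has a good overview.
Personally, I would not recommend a table lock in your case. You are better off wrapping your calculations and database operations in a transaction and relying on the DBMS to manage the rest. Be aware that a transaction alone does not stop two sessions from reading the same old values, so if the update depends on what you read, read those rows with SELECT ... FOR UPDATE inside the transaction.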
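As a rough illustration of wrapping the calculation in a transaction with PDO, here is a small helper that commits when the work succeeds and rolls back when it throws. The helper name transactional() and the results table are made up for this sketch, and the demo uses in-memory SQLite purely so it can run anywhere.

```php
<?php
// Sketch only: the helper name and the `results` table are assumptions.

/**
 * Run $work inside a transaction: commit if it returns, roll back if it throws.
 */
function transactional(PDO $pdo, callable $work)
{
    $pdo->beginTransaction();
    try {
        $result = $work($pdo);
        $pdo->commit();
        return $result;
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE results (id INTEGER PRIMARY KEY, total INTEGER)');
$pdo->exec('INSERT INTO results (id, total) VALUES (1, 10)');

// Successful run: read, calculate, write back, all committed together.
transactional($pdo, function (PDO $pdo) {
    // On MySQL/InnoDB, append FOR UPDATE to this SELECT so that a concurrent
    // request cannot read the same old total before the UPDATE below runs.
    $total = (int) $pdo->query('SELECT total FROM results WHERE id = 1')->fetchColumn();
    $stmt = $pdo->prepare('UPDATE results SET total = ? WHERE id = 1');
    $stmt->execute([$total * 2]);
});

// Failing run: the exception undoes the partial write.
$failed = false;
try {
    transactional($pdo, function (PDO $pdo) {
        $pdo->exec('UPDATE results SET total = 0 WHERE id = 1');
        throw new RuntimeException('calculation failed');
    });
} catch (RuntimeException $e) {
    $failed = true;
}

$total = (int) $pdo->query('SELECT total FROM results WHERE id = 1')->fetchColumn();
```

After both runs the stored total reflects only the committed calculation; the write made by the failing run never becomes visible.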

I posted a comment:
Not a direct answer, but I don't think this is a problem. The calculations and fetching data from the database are done within a few milliseconds. The chance of two people interacting at the same time is so small that most people don't bother with a lock like this.
But if these calculations are critical, you could prevent the problem by adding a new field called occupied, busy or something like that.
When you run your script, check whether this field is set to, for example, 1. If it is, make the script sleep for 1-3 seconds and then retry. If the field is 0, update it to 1, do the calculations and set it back to 0 again. The check and the update to 1 have to happen in a single statement, otherwise two scripts can both see 0 and both proceed.
This would prevent two people from accessing the same values at the same time.
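One way the occupied flag could be claimed safely is to test and set it in a single UPDATE and look at the affected-row count. This is only a sketch: the jobs table, the function names and the SQLite demo are assumptions, not part of the original suggestion.

```php
<?php
// Sketch only: the `jobs` table and the function names are assumptions.

/** Returns true if this caller flipped occupied from 0 to 1. */
function tryClaim(PDO $pdo, int $id): bool
{
    // Test and set in ONE statement. The database applies updates to a row
    // one at a time, so at most one caller gets an affected-row count of 1.
    $stmt = $pdo->prepare('UPDATE jobs SET occupied = 1 WHERE id = ? AND occupied = 0');
    $stmt->execute([$id]);
    return $stmt->rowCount() === 1;
}

/** Clears the flag so the next caller can claim the row. */
function release(PDO $pdo, int $id): void
{
    $stmt = $pdo->prepare('UPDATE jobs SET occupied = 0 WHERE id = ?');
    $stmt->execute([$id]);
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE jobs (id INTEGER PRIMARY KEY, occupied INTEGER NOT NULL DEFAULT 0)');
$pdo->exec('INSERT INTO jobs (id) VALUES (1)');

$a = tryClaim($pdo, 1); // first caller wins
$b = tryClaim($pdo, 1); // second caller would sleep and retry
release($pdo, 1);
$c = tryClaim($pdo, 1); // free again after release
```

In real use the calculations would sit in a try/finally block so that release() runs even when they fail; a flag that is never reset is exactly the "stuck lock" problem this approach is prone to.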

What is the best method to make sure two people don't edit the same row on my web app?

I have a PHP/jQuery/AJAX/MySQL app built for managing databases. I want to prevent multiple users from editing the same database row at the same time.
1) What is this called?
2) Do I use a token system, where whoever has the token can edit the row until they release the token?
3) Do I use a "last edit date/time" and compare the time you loaded the HTML form with the time in the database, warning you if the database holds the more recent edit?
4) Do I lock the row using database functions?
I'm just not sure which is best. Assume between 10 and 15 concurrent users.

There are two general approaches: optimistic and pessimistic locking.
Optimistic locking is generally much easier to implement in a web-based environment because it is fundamentally stateless. It scales much better as well. The downside is that it assumes that your users generally won't be trying to edit the same set of rows at the same time. For most applications, that's a very reasonable assumption, but you'd have to verify that your application isn't one of the outliers where users would regularly be stepping on each other's toes. In optimistic locking, you would have some sort of last_modified_timestamp column that you would SELECT when a user fetched the data and then use in the WHERE clause when you go to update the data, i.e.
UPDATE table_name
   SET col1 = <<new value>>,
       col2 = <<new value>>,
       last_modified_timestamp = <<new timestamp>>
 WHERE primary_key = <<key column>>
   AND last_modified_timestamp = <<last modified timestamp you originally queried>>
If that updates 1 row, you know you were successful. Otherwise, if it updates 0 rows, you know that someone else has modified the data in the interim and you can take some action (generally showing the user the new data and asking them if they want to overwrite but you can adopt other conflict resolution approaches).
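The same UPDATE could look roughly like this from PHP with PDO. The patients table, its columns and the function name are invented for the sketch, and the timestamp is stored as a string with microseconds so that two saves within one second still produce different values (on MySQL that would need something like a DATETIME(6) column).

```php
<?php
// Sketch only: table `patients`, its columns and saveIfUnchanged() are
// assumed names used for illustration.

/**
 * Save $name only if the row still carries the timestamp the user loaded.
 * Returns false when someone else changed the row in the meantime.
 */
function saveIfUnchanged(PDO $pdo, int $id, string $name, string $loadedTimestamp): bool
{
    $now = (new DateTimeImmutable())->format('Y-m-d H:i:s.u');
    $stmt = $pdo->prepare(
        'UPDATE patients
            SET name = ?, last_modified_timestamp = ?
          WHERE id = ? AND last_modified_timestamp = ?'
    );
    $stmt->execute([$name, $now, $id, $loadedTimestamp]);
    return $stmt->rowCount() === 1; // 1 row: saved, 0 rows: edit conflict
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, last_modified_timestamp TEXT)');
$pdo->exec("INSERT INTO patients VALUES (1, 'Ann', '2024-01-01 00:00:00.000000')");

// Two users load the form at the same time and see the same timestamp.
$loaded = $pdo->query('SELECT last_modified_timestamp FROM patients WHERE id = 1')->fetchColumn();

$userA = saveIfUnchanged($pdo, 1, 'Anne', $loaded); // first save goes through
$userB = saveIfUnchanged($pdo, 1, 'Anna', $loaded); // stale timestamp, rejected

$name = $pdo->query('SELECT name FROM patients WHERE id = 1')->fetchColumn();
```

An integer version column that is incremented on every save would work the same way and sidesteps any question of timestamp precision.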
Pessimistic locking is more challenging to implement, particularly in a web-based application where users can close their browser without logging out, or may start editing some data and go to lunch before hitting Submit. It makes the application harder to scale and generally more difficult to administer. It's really only worth considering if users will regularly try to update the same rows, or if updating a row takes the user so long that it's worth letting them know up front that someone else has locked the row.

I was going to implement this in one of my own systems.
You could add a new column to your table of records, called timelocked.
When a record is opened, set its timelocked column to the current time. While the record is being edited, send a keepalive to the server through AJAX every 2 minutes. On each keepalive, the server moves timelocked forward to the time the request was received, and so forth (this will make sense in a second). When the user has finished editing, set timelocked to false.
Now, if someone tries to open a record which is already open, the PHP would check:
if timelocked == false, the record is not being edited;
otherwise, the record may be being edited, but the user might have closed their browser window, which is why the keepalive is used;
if the difference between the current time and timelocked is larger than 2 minutes, the user is no longer actively editing, so you can allow the record to be opened.
Hopefully you understand all that.
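The keepalive scheme above might be sketched like this. Beyond timelocked, everything here is an assumption: the locked_by column (needed so the server can tell whose keepalive it is), the function names, and the use of Unix timestamps with NULL standing in for "false". The current time is passed in as a parameter so the expiry logic is easy to follow in the demo.

```php
<?php
// Sketch only: table `records`, column `locked_by` and the function names
// are assumptions; `timelocked` holds a Unix timestamp or NULL (unlocked).

const LOCK_TTL = 120; // seconds without a keepalive before a lock expires

/**
 * Acquire the edit lock or refresh it (keepalive). Succeeds when the row is
 * unlocked, the old lock has expired, or $userId already holds the lock.
 */
function acquireOrRefresh(PDO $pdo, int $recordId, int $userId, int $now): bool
{
    $stmt = $pdo->prepare(
        'UPDATE records
            SET locked_by = ?, timelocked = ?
          WHERE id = ?
            AND (timelocked IS NULL OR timelocked < ? OR locked_by = ?)'
    );
    $stmt->execute([$userId, $now, $recordId, $now - LOCK_TTL, $userId]);
    // Note for MySQL: rowCount() counts changed rows by default, so a
    // keepalive that writes identical values reports 0 unless the connection
    // is opened with PDO::MYSQL_ATTR_FOUND_ROWS.
    return $stmt->rowCount() === 1;
}

/** Called when the user finishes editing. */
function releaseLock(PDO $pdo, int $recordId, int $userId): void
{
    $stmt = $pdo->prepare(
        'UPDATE records SET locked_by = NULL, timelocked = NULL WHERE id = ? AND locked_by = ?'
    );
    $stmt->execute([$recordId, $userId]);
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE records (id INTEGER PRIMARY KEY, locked_by INTEGER, timelocked INTEGER)');
$pdo->exec('INSERT INTO records (id) VALUES (1)');

$open      = acquireOrRefresh($pdo, 1, 1, 1000); // user 1 opens the record
$blocked   = acquireOrRefresh($pdo, 1, 2, 1060); // user 2 tries 60 s later
$keepalive = acquireOrRefresh($pdo, 1, 1, 1100); // user 1's keepalive
$takeover  = acquireOrRefresh($pdo, 1, 2, 1230); // 130 s of silence: lock expired
$late      = acquireOrRefresh($pdo, 1, 1, 1240); // user 1 comes back too late
```

Because the whole decision is made inside one UPDATE, two users who open an expired record at the same instant cannot both obtain the lock.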

Don't try to prevent it. Let them decide what to do in the case of an edit conflict.
Add a timestamp to the table. Compare the timestamp of when the row was retrieved with the current timestamp. Make them aware of changes between their load and their save, and let them decide what action to take.
So yeah, number 3.

I personally would not prevent this. If it was a requirement of the job I would track the users' current / last known location and disallow someone from editing the same line someone else is editing this way. I have seen people add a row to a table saying isLocked or isBeingWorkedOn etc... but I have seen this type of system fail far more often as well, or require moderation to unlock stuck tables if someone closed it while working on it etc...

1) This is called locking. There are two main types of locking in relational databases like MySQL: table locking and row locking. Table locking ensures only one session at a time is making changes to a table, whereas row locking ensures only one session at a time is making changes to a particular row. You can think of row locking as a more fine-grained approach to concurrent access than table locking. Row locking is more complicated, but it allows multiple concurrent sessions to write to the same table (important if your database has lots of concurrent writes; table locking should be fine for 10-15 users).
2-3) MySQL takes care of concurrent access for you! It automatically implements locking in the background. The type of locking (row or table) depends on which storage engine you use. For example, MyISAM uses table locking and InnoDB uses row locking. MySQL manages this internally. You can monitor table-lock contention by checking the Table_locks_immediate and Table_locks_waited status variables (it uses your option number 2).
When you issue an INSERT or UPDATE statement while another session holds a conflicting lock on the table (or row), the calling application (i.e. PHP in this case) will wait, usually for a few milliseconds, until the other session is done writing.
4) Again, MySQL will automatically take care of locking, but you can manually manage table locking with the LOCK TABLES and UNLOCK TABLES commands. If you are using row locking with InnoDB, there are several statements you can use to manage concurrent access manually, such as SELECT ... FOR UPDATE.
See MySQL's page on Internal Locking Methods for an overview of MySQL's locking system, and the InnoDB Locking page for InnoDB's row locking features.

As others have said it's much easier to deal with a conflicting update.
What you are suggesting is called pessimistic locking. It's called that because it assumes it's all too likely that two users will try to edit the same record at the same time.
Is that true?
And is it a disaster if a user has to start again because the data they tried to update was changed by someone else?
Locking costs: in a pessimistic scheme you always lock, so you always pay an overhead, and that's before you start looking at related data and such.
Then there's making it robust, dealing with "no one can do it now because something went wrong"...
If I had something short of editing an entire file, that needed pessimistic locking, I'd be having a look at my design, on the basis that it isn't fit for purpose.

Web game concurrency control

There are some other SO questions about concurrency but they don't quite address my scenario.
So let's say that I have a game where users interact with each other, fighting and whatnot. At any given time, a player could potentially be involved in multiple interactions with other players, all of whom can see the event happening. When any one of these players hits the site, it needs to update any data involved and show that to the user.
Example situation: Player A is fighting with player B, and events happen every few minutes in this fight. At the same time, player A is also interacting with player C. By dumb luck, the events for both interactions happen to next be due at the exact same second.
When that second arrives, by dumb luck again, both player B and player C hit the site at the same time, in order to check the status of their fights with player A. Fighting requires updates to information about player A. If I don't code properly, A's data can get messed up.
I have two games with this situation, each with a different solution and different issues. One of them uses a lock, so when a user hits the site, they acquire a lock on a db row, read the data for locks they successfully acquired, then write the changes and release the lock. But sometimes, for reasons still unknown, this fails and the lock gets stuck forever, users complain and we have to fix it manually. My other game uses a daemon to execute these transactions, making the issue (nearly) moot as there is only one process ever making these changes. But players could still do other things at the same time, and potentially cause the same issue.
I've read a bit about different solutions to this, like optimistic or timestamp-based control. I would like to ask:
Which of these is most commonly used for situations like mine, and which is easiest to implement?
My next project is using Kohana (PHP) and its ORM, so my db writes will by default take the form "just overwrite all these fields." Will I need to write my own update queries for this or can I get a solution that is compatible with the ORM?
What about transactions that involve multiple tables? The outcome of a combat has to change the table of combats, and the table of player information, possibly more things too. Which solutions are easier to work with here? Will all of my tables need transaction timestamp columns?
A lot of these solutions say that when there is a conflict, either retry or ignore. What does this mean for me? Does "retry" mean restart my entire script, which would cause additional load time for the user? I don't think ignore is a valid option, since the events have to execute at some point. In the other questions I found, presenting a conflict error to the user was usually a valid option - for me, it isn't.
What are the performance implications of concurrency control - is it even worth it?

I think what you are looking for is already contained in your question: transactions.
If you are using MySQL, you will need to set up your tables with the InnoDB engine to be able to use transactions. Some documentation:
http://dev.mysql.com/doc/refman/5.1/en/commit.html
http://www.php.net/manual/en/pdo.begintransaction.php
http://www.php.net/manual/en/mysqli.autocommit.php
Don't reinvent the wheel when you don't have to.
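On the question of what "retry" means in practice: it usually only means re-running the failed transaction, not the whole script. A rough sketch follows, assuming PDO and hypothetical players and combats tables. The helper name, the retry limit and the back-off are made up, and the deadlock in the demo is simulated by throwing a PDOException by hand, since in-memory SQLite will not produce one. SQLSTATE 40001 is the code MySQL reports when a transaction is chosen as a deadlock victim.

```php
<?php
// Sketch only: helper name, table names and the retry limit are assumptions.

/**
 * Run $work in a transaction and re-run just that transaction (not the whole
 * request) when the database reports SQLSTATE 40001 (serialization failure,
 * which is how MySQL reports a deadlock).
 */
function retryTransaction(PDO $pdo, callable $work, int $maxAttempts = 3)
{
    for ($attempt = 1; ; $attempt++) {
        $pdo->beginTransaction();
        try {
            $result = $work($pdo);
            $pdo->commit();
            return $result;
        } catch (Throwable $e) {
            if ($pdo->inTransaction()) {
                $pdo->rollBack();
            }
            $retryable = $e instanceof PDOException && (string) $e->getCode() === '40001';
            if (!$retryable || $attempt >= $maxAttempts) {
                throw $e;
            }
            usleep(10000 * $attempt); // brief back-off before trying again
        }
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE players (id INTEGER PRIMARY KEY, hp INTEGER)');
$pdo->exec('CREATE TABLE combats (id INTEGER PRIMARY KEY, rounds INTEGER)');
$pdo->exec('INSERT INTO players VALUES (1, 100)');
$pdo->exec('INSERT INTO combats VALUES (1, 0)');

// One combat event touches two tables; both changes commit or neither does.
$calls = 0;
retryTransaction($pdo, function (PDO $pdo) use (&$calls) {
    $calls++;
    $pdo->exec('UPDATE players SET hp = hp - 10 WHERE id = 1');
    if ($calls === 1) {
        // Stand-in for a real deadlock reported by the server.
        throw new PDOException('simulated deadlock', 40001);
    }
    $pdo->exec('UPDATE combats SET rounds = rounds + 1 WHERE id = 1');
});

$hp     = (int) $pdo->query('SELECT hp FROM players WHERE id = 1')->fetchColumn();
$rounds = (int) $pdo->query('SELECT rounds FROM combats WHERE id = 1')->fetchColumn();
```

The first attempt's hp change is rolled back before the second attempt runs, so the player loses 10 hp once, not twice, and the user only notices a slightly longer page load.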
