I have a table that stores user login and registration information. When two users try to add their details at the same time:
Will the two writes clash, leaving the table not updated?
Is using threads for these writes a bad idea, since a new thread would be created for each write and could clog the server? Or is the server responsible for managing this on its own?
Is locking the table a good idea?
My back-end runs on PHP/Apache with MySQL (InnoDB) for the database.
Relational databases are designed to avoid exactly these kinds of race conditions. You don't need to worry about them unless you are designing your own relational database from scratch.
In short, just know this: any time a write is initiated on an InnoDB table, a row-level lock is taken. If another transaction wants to write to that same row, it has to wait until the first transaction releases the lock. This is a fundamental part of relational databases. You don't need to add a lock because they've already thought of that :)
You can read more about how MySQL performs locks to avoid deadlocking and other transaction errors here.
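To see what that means in practice, here is a sketch with a hypothetical users table (id is the primary key); run the sessions side by side:

-- session 1
BEGIN;
UPDATE users SET email = 'a@example.com' WHERE id = 1;  -- InnoDB takes a row lock on id = 1

-- session 2, run concurrently: this blocks until session 1 commits
UPDATE users SET email = 'b@example.com' WHERE id = 1;

-- session 3, run concurrently: a different row, so it proceeds immediately
UPDATE users SET email = 'c@example.com' WHERE id = 2;

-- session 1
COMMIT;  -- session 2's UPDATE now goes through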
If you're really paranoid about this, or perhaps you are doing multiple things when you register a user and need them done atomically, you might want to look at using Transactions in MySQL. There's a decent write-up about Transactions here: http://www.mysqltutorial.org/mysql-transaction.aspx
BEGIN;
-- do related reads/writes to the data
COMMIT;
Inside that "transaction", the connection sees a consistent view of the data, and blocks anyone else from changing the rows it has written.
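From PHP, the same pattern through PDO looks roughly like this (a minimal sketch; $db is assumed to be a connected PDO instance, and the table and column names are invented for illustration):

$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
try {
    $db->beginTransaction();
    // both inserts succeed together or not at all
    $stmt = $db->prepare('INSERT INTO users (name, email) VALUES (?, ?)');
    $stmt->execute([$name, $email]);
    $log = $db->prepare('INSERT INTO registration_log (user_id) VALUES (?)');
    $log->execute([$db->lastInsertId()]);
    $db->commit();
} catch (PDOException $e) {
    $db->rollBack();  // undo the partial work
    throw $e;
}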
There are exceptions where you have to ask for the locks yourself. The main one is
BEGIN;
SELECT ... FOR UPDATE;
-- fiddle with the values SELECTed
UPDATE ...; -- and change those values
COMMIT;
The SELECT ... FOR UPDATE announces what must not be tampered with. If another connection wants to mess with the same rows, it has to wait until your COMMIT, at which point it may find that things have changed and it will need to do something different. But, in general, this avoids a "deadlock", wherein two transactions step on each other so badly that one has to be "rolled back".
With techniques like this, concurrency is blocked only briefly and only where necessary: if two connections are working with different rows, both can proceed, and there is no need to prevent concurrency at all.
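A concrete instance of the pattern, assuming a hypothetical stock table:

BEGIN;
SELECT quantity FROM stock WHERE item_id = 42 FOR UPDATE;  -- this row is now locked
-- application code checks that quantity > 0 before selling
UPDATE stock SET quantity = quantity - 1 WHERE item_id = 42;
COMMIT;  -- lock released; a waiting connection re-reads the new quantity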
Related
SQL Server has many ways of locking a resource. I am trying to understand what makes SQL Server pick a particular level of lock. When will it use a page or table lock instead of a row lock?
Problem
I have a PHP application that uses a transaction with every HTTP request to ensure all queries are executed before a commit. One issue that is puzzling me is that when many (5+) people use the application, the app seems to hang (spinning for long periods of time)! Nothing I can think of would cause such behavior except database locks! The scenario I think is happening is that SQL Server is choosing a page or table lock over a row lock for some reason. I am trying to ensure that SQL Server is taking row locks, not page or table locks. I am using an ORM, so I can't use the ROWLOCK hint in my queries.
Is there a way for me to run a query's explain plan to see what lock level will be used?
As you can see here, there is no default granularity in lock modes.
In general the optimizer will choose the best course of action to handle this.
Could it be a case of livelock, due to a long-running transaction that leads to resource starvation?
You can also check here and here for information on lock escalation, but I'd suggest to not disable it for any table.
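One thing not covered above: while there is no EXPLAIN-style prediction of lock granularity, you can observe the locks a session has actually taken while its transaction is still open by querying the sys.dm_tran_locks view (a sketch, assuming SQL Server 2005 or later):

-- run in the session holding the open transaction
SELECT resource_type,   -- OBJECT (table), PAGE, KEY (index row), RID (heap row), ...
       request_mode,    -- S, X, IS, IX, ...
       request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;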
I am making a web service in PHP which does a series of calculations based on a SELECT from a table, and then updates the table afterwards with the new results.
However, I want to prevent the case where another person calls the same web service while the first person's session is still doing an update.
Is locking the entire table and then unlocking it again the right approach here? If so, how do I lock and unlock a MySQL table using PHP PDO?
Database Management Systems like MySQL are smart enough to prevent concurrency violations like these.
Look into database isolation levels (read uncommitted, read committed, repeatable read, serializable) and the problems each one permits (dirty reads, non-repeatable reads, ...) -> Wikipedia.
Personally, I would not recommend a table lock in your case. You are better off wrapping your calculations and database operations in a transaction and relying on the DBMS to manage things.
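A sketch of that approach (the table and column names are invented, and $db is a connected PDO instance; note the SELECT carries FOR UPDATE, which goes a step beyond plain transaction wrapping but is what actually makes a second concurrent caller wait on the same rows):

try {
    $db->beginTransaction();
    // lock the rows we are about to recalculate
    $stmt = $db->query('SELECT id, value FROM measurements WHERE batch = 7 FOR UPDATE');
    $upd = $db->prepare('UPDATE measurements SET value = ? WHERE id = ?');
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $newValue = $row['value'] * 1.1;  // stand-in for the real calculation
        $upd->execute([$newValue, $row['id']]);
    }
    $db->commit();  // row locks are released here
} catch (PDOException $e) {
    $db->rollBack();
    throw $e;
}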
I posted a comment:
Not a direct answer, but I don't think this is any problem. The calculations and fetching data from the database are done within a few milliseconds. The chances of two people interacting at the same time are so small that most people don't bother making a lock like this.
But if these calculations are critical, you could prevent this problem by adding a new field and simply calling it occupied, busy, or something like that.
When you run your script, check whether this field is set to, say, 1. If it is, make the script sleep for a couple of seconds and then retry. If it is set to 0, update it to 1, do the calculations, and set it back to 0 again.
This would prevent two people from accessing the same values at the same time.
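If you go this route, make the check-and-claim a single atomic statement; with a separate read followed by a write, two clients can both see 0 and both claim the flag. A sketch, with an invented jobs table holding the busy flag and $db as a PDO connection:

// atomic claim: only one client can flip occupied from 0 to 1
function acquire($db) {
    $n = $db->exec('UPDATE jobs SET occupied = 1 WHERE id = 1 AND occupied = 0');
    return $n === 1;  // exec() returns the number of affected rows
}

$tries = 0;
while (!acquire($db) && $tries++ < 5) {
    sleep(1);  // busy: wait and retry, as described above
}
// ... do the calculations ...
$db->exec('UPDATE jobs SET occupied = 0 WHERE id = 1');  // release the flag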
I'm writing a Queue Management System for a small clinic. There will be multiple users trying to do the same thing, so there is a concurrency problem. I'm familiar with the ACID guarantees and understand the notion of a transaction. I know that two people cannot change the same data at the same time.
But here's my problem: I have a PHP function isFree($time) which determines whether a particular doctor is free at that time. I'm afraid that if two users call the same function at the same time, both of them may get a positive result and mess things up, so somehow I need to either queue concurrent users or accept only one.
The easiest way to solve this problem would be to restrict the function so that it can only be called by one user at a time. I probably need some kind of flag or blocking system, but I have no exact idea of how to do it.
On the other hand, it would be even faster to restrict only those calls which may overlap. For example, calling isFree($time) for Monday and for Tuesday at the same time won't cause any problems.
You're effectively asking for a lock.
I am guessing your queue system runs on MySQL. If so, you can LOCK the table you're using (or, with some storage engines, the specific rows you are using!). The syntax is LOCK TABLES yourTableName WRITE.
This will block every other session from reading or writing that table until:
Your session ends
You free the lock (using UNLOCK TABLES)
Table locks like this work with every storage engine. InnoDB additionally supports row-level locking through transactions: suffix your SELECT query with FOR UPDATE to get an exclusive lock on the row(s) you just read.
This will hopefully shed more light on the locking mechanism of MySQL/InnoDB. The row locks are released when you commit (or roll back) the transaction, so the usual sequence is SELECT ... FOR UPDATE, then UPDATE, then COMMIT.
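Applied to the isFree($time) case, one robust variant is to take the FOR UPDATE lock on a row that is guaranteed to exist, the doctor's own row, which also gives exactly the granularity asked about above: calls for different doctors never block each other. A sketch with invented table names:

$db->beginTransaction();
// a second caller for the same doctor blocks on this line until we commit
$lock = $db->prepare('SELECT id FROM doctors WHERE id = ? FOR UPDATE');
$lock->execute([$doctorId]);
// safe to check and book: no other session can interleave for this doctor
$chk = $db->prepare('SELECT COUNT(*) FROM appointments WHERE doctor_id = ? AND slot = ?');
$chk->execute([$doctorId, $time]);
if ($chk->fetchColumn() == 0) {  // the slot is free
    $ins = $db->prepare('INSERT INTO appointments (doctor_id, slot) VALUES (?, ?)');
    $ins->execute([$doctorId, $time]);
}
$db->commit();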
For fun I am replacing the mysqli extension in my app with PDO.
Once in a while I need to use transactions plus table locking.
In these situations, according to the MySQL manual, the syntax needs to be a bit different. Instead of calling START TRANSACTION, you do it like so...
SET autocommit=0;
LOCK TABLES t1 WRITE, t2 READ, ...;
... do something with tables t1 and t2 here ...
COMMIT;
UNLOCK TABLES;
(http://dev.mysql.com/doc/refman/5.0/en/lock-tables-and-transactions.html)
My question is: how does this interact with PDO::beginTransaction? Can I use PDO::beginTransaction in this case, or should I manually send the SQL ("SET autocommit = 0;" and so on)?
Thanks for the advice,
When you call PDO::beginTransaction(), it turns off autocommit.
So you can do:
$db->beginTransaction();
$db->exec('LOCK TABLES t1 WRITE, t2 READ, ...');  // give each table the lock type you need
# do something with tables
$db->commit();
$db->exec('UNLOCK TABLES');
After a commit() or rollBack(), the connection will be back in autocommit mode.
I have spent a huge amount of time going round in circles on this issue, and the PHP documentation in this area is vague at best. A few things I have found, running PHP 7 with a MySQL InnoDB table:
PDO::beginTransaction doesn't just turn off autocommit: having tested the answer provided by Olhovsky with code that fails, rollbacks do not work and there is no transactional behaviour, so it can't be that simple.
Beginning a transaction may itself be locking the used tables... I eagerly await someone telling me I'm wrong about this, but here are the reasons it could be true: this comment, which shows a table being inaccessible once a transaction has started, without being explicitly locked; and this PHP documentation page, which slips the following in at the end:
... while the transaction is active, you are guaranteed that no one else can make changes while you are in the middle of your work
To me this behaviour is quite smart, and it also provides enough wiggle room for PDO to cope with every database, which is after all the aim. If this is what is going on, though, it's just massively under-documented and should have been called something else, to avoid confusion with a true database transaction, which doesn't imply locking.
Charles' answer is, I think, probably the best if you are after certainty with a workload that requires high concurrency: do it by hand using explicit queries to the database, and then you can go by the database's documentation.
Update
I have had a production server up and running using the PDO transaction functions for a while now, recently on AWS's Aurora database (fully compatible with MySQL but built to scale automatically, etc.). I have proven these two points to myself:
Transactions (purely the ability to commit all database changes together) work using PDO::beginTransaction(). In short, I know many scripts have failed halfway through their database selects/updates and data integrity has been maintained.
Table locking isn't happening; I've had an index duplication error to prove it.
So, to further my conclusion, the behaviour of these functions seems to change based on the database engine (and possibly other factors). As far as I can tell, both from experience and from the documentation, there is no way to know programmatically what is going on... whoop...
In MySQL, beginning a transaction is different from turning off autocommit, due to how LOCK/UNLOCK TABLES works: LOCK TABLES commits any open transaction, but turning off autocommit doesn't actually start a transaction. MySQL is funny that way.
In PDO, starting a transaction using beginTransaction doesn't actually start a new transaction; it just turns off autocommit. In most databases this is sane, but it can have side effects given MySQL's behavior mentioned above.
You probably should not rely on this behavior and how it interacts with MySQL's quirks. If you're going to deal with MySQL's table locking and DDL, avoid PDO's transaction methods for that work. If you want autocommit off, turn it off by hand. If you want to open a transaction, open a transaction by hand.
You can freely mix the PDO API methods of working with transactions and SQL commands when not otherwise working with MySQL's oddities.
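Doing it by hand through PDO, following the manual's recipe quoted earlier, looks roughly like this (a sketch using the same t1/t2 tables, with $db as the PDO connection):

$db->exec('SET autocommit = 0');              // managed by hand, not beginTransaction()
$db->exec('LOCK TABLES t1 WRITE, t2 READ');
// ... do something with tables t1 and t2 here ...
$db->exec('COMMIT');                          // commit before releasing the locks
$db->exec('UNLOCK TABLES');
$db->exec('SET autocommit = 1');              // restore the connection's default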
Is it possible to do a simple COUNT(*) query in a PHP script while another PHP script is running an INSERT...SELECT query?
The situation is that I need to create a table with ~1M or more rows from another table, and while inserting I do not want the user to feel the page is freezing. So I am trying to keep updating a count, but with a SELECT COUNT(*) FROM table while the insert runs in the background, I get only 0 until the insert is completed.
So is there any way to ask MySQL to return partial results first? Or is there a fast way to do a series of inserts with data fetched from a previous SELECT query, with about the same performance as a single INSERT...SELECT query?
The environment is PHP 4.3 and MySQL 4.1.
Without reducing performance? Not likely. With a little performance loss, maybe...
But why are you regularly creating tables and inserting millions of rows? If you do this only very seldom, can't you just warn the admin (presumably the only one allowed to do such a thing) that it takes a long time? If you're doing this all the time, are you really sure you're not doing it wrong?
I agree with Stein's comment that this is a red flag if you're copying 1 million rows at a time during a PHP request.
I believe that in a majority of cases where people are trying to micro-optimize SQL, they could get much greater performance and throughput by approaching the problem in a different way. SQL shouldn't be your bottleneck.
If you're doing a single INSERT...SELECT, then no, you won't be able to get intermediate results. In fact this would be a Bad Thing, as users should never see a database in an intermediate state showing only a partial result of a statement or transaction. For more information, read up on ACID compliance.
That said, the MyISAM engine may play fast and loose with this. I'm pretty sure I've seen MyISAM commit some but not all of the rows from an INSERT...SELECT when I've aborted it part of the way through. You haven't said which engine your table is using, though.
The other users can't see the insertion until it's committed. That's normally a good thing, since it makes sure they can't see half-done data. However, if you want them to see intermediate data, you could throw in an occasional call to "commit" while you're inserting.
By the way, don't let anybody tell you to turn autocommit on. That's a HUGE time waster. I have a "delete and re-insert" job on my database that takes a third as long when I turn off autocommit.
Just to be clear, MySQL 4 isn't configured by default to use transactions. It uses the MyISAM table type, which locks the entire table for each insert, if I remember correctly.
Your best bet would be to use one of MySQL's bulk insertion mechanisms, such as LOAD DATA INFILE, as these are dramatically faster at inserting large amounts of data. As for the counting, you could break the inserts into N groups of 1000 rows (or some other size Y), then divide your progress meter into N sections and update it after each group finishes; a sketch follows below.
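A sketch of that chunked approach, written with PDO for brevity (on the PHP 4.3 setup mentioned above you would use the old mysql_* functions instead); the table names and progress helper are invented, and an indexed id column is assumed so the chunks are deterministic:

$total = (int) $db->query('SELECT COUNT(*) FROM source')->fetchColumn();
$chunk = 1000;
for ($offset = 0; $offset < $total; $offset += $chunk) {
    // each chunk commits on its own, so a COUNT(*) on dest sees progress
    $db->exec("INSERT INTO dest SELECT * FROM source ORDER BY id LIMIT $offset, $chunk");
    report_progress(min($offset + $chunk, $total), $total);  // hypothetical helper
}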
Edit: Another thing to consider: if this is static data for a template, you could use a "select into" to create a new table with the same data. I'm not sure what your application or the intended functionality is, but that could work as well.
If you can get to the console, you can ask various status questions that will give you the information you are looking for. There's a command that goes something like SHOW PROCESSLIST.