I am building a web service in PHP that runs a series of calculations based on a SELECT from a table and then updates the table with the new results.
However, I want to prevent the case where one person calls the web service while another person's request is still in the middle of its update.
Is locking the entire table and then unlocking it again the right approach here? If so, how do I lock and unlock a MySQL table using PHP PDO?
Database management systems like MySQL have built-in mechanisms for preventing concurrency violations like these.
Look up database isolation levels (read uncommitted, read committed, repeatable read, serializable) and the problems each level can still allow (dirty reads, non-repeatable reads, ...), for example on Wikipedia.
Personally, I would not recommend a table lock in your case. You would do better to wrap your calculations and database operations in a transaction and rely on the DBMS to manage the concurrency.
I posted a comment:
Not a direct answer, but I don't think this is a real problem. The
calculations and the fetching of data from the database are done within a few
milliseconds. The chance of two people interacting at the same time
is so small that most people don't bother adding a lock like this.
But if these calculations are critical you could prevent this problem by adding a new field and simply call it occupied, busy or something like that.
When you run your script, check whether this field is set to, for example, 1. If it is, make the script sleep for 1 to 3 seconds and then retry. If the field is set to 0, update it to 1, do the calculations and set it back to 0 again.
This would prevent two people from accessing the same values at the same time.
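A caveat on the flag approach: if the check and the update are two separate queries, two requests can both read 0 before either writes 1. A minimal sketch of one way around that, assuming a hypothetical `jobs` table with a `busy` column, is to do the check and the set in a single conditional UPDATE and look at the affected-row count. Python's built-in `sqlite3` is used here only so the example is self-contained; the same SQL statement should work on MySQL through PDO.

```python
import sqlite3


def try_claim(conn, row_id):
    """Flip busy from 0 to 1 in one statement; True only if this caller won."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE jobs SET busy = 1 WHERE id = ? AND busy = 0", (row_id,)
        )
    # Exactly one row changes if the flag was free, zero rows if it was taken.
    return cur.rowcount == 1


def release(conn, row_id):
    """Set the flag back to 0 once the calculations are done."""
    with conn:
        conn.execute("UPDATE jobs SET busy = 0 WHERE id = ?", (row_id,))


def make_demo_db():
    """Hypothetical schema, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, busy INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO jobs (id) VALUES (1)")
    conn.commit()
    return conn
```

A caller that gets `False` back would sleep and retry, as described above. Note that a crashed script leaves the flag set, so a real version would probably need a timeout or a cleanup step.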
I have a table that holds user login and registration information. So when two users try to add their details at almost the same moment:
Will the two writes clash, so that the table isn't updated?
Using threads for these writes is a bad idea, since a new thread would be created for each write and that would clog the server. Is the server responsible for managing this on its own?
Is locking the table a good idea?
My back-end runs on PHP/Apache with MySQL (InnoDB) for the database.
Relational databases are designed to avoid these kinds of conditions. You don't need to worry about them unless you are designing your own relational database from scratch.
In short, just know this: with InnoDB, any time a write is initiated, it takes a row-level lock. If another transaction wants to write to that same row, it has to wait until the first transaction releases the lock. This is a fundamental part of relational databases. You don't need to add a lock because they've already thought of that :)
You can read more about how MySQL performs locks to avoid deadlocking and other transaction errors here.
If you're really paranoid about this, or perhaps you are doing multiple things when you register a user and need them done atomically, you might want to look at using Transactions in MySQL. There's a decent write-up about Transactions here http://www.mysqltutorial.org/mysql-transaction.aspx
BEGIN;
-- do related reads/writes to the data
COMMIT;
Inside that "transaction", the connection sees a consistent view of the data (a snapshot, under InnoDB's default REPEATABLE READ isolation level), and changes made by other connections do not disturb that view.
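As a rough illustration of the "multiple things done atomically" case, here is a minimal sketch that inserts a user row and a profile row inside one transaction, so that either both rows are stored or neither is. The `users` and `profiles` tables and the `register_user` function are hypothetical, and Python's built-in `sqlite3` stands in for MySQL so the example runs on its own; with PDO the equivalent calls would be `beginTransaction()`, `commit()` and `rollBack()`.

```python
import sqlite3


def register_user(conn, username, email):
    """Insert a user and its profile row atomically: both rows or neither."""
    with conn:  # COMMIT on success, ROLLBACK if an exception escapes the block
        cur = conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
        user_id = cur.lastrowid
        conn.execute(
            "INSERT INTO profiles (user_id, email) VALUES (?, ?)", (user_id, email)
        )
    return user_id


def make_user_db():
    """Hypothetical schema, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE profiles (user_id INTEGER NOT NULL, email TEXT NOT NULL)"
    )
    return conn
```

If the second INSERT fails (for example because `email` is missing), the first INSERT is rolled back as well, so no half-registered user is left behind.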
There are exceptions. The main one is
BEGIN;
SELECT ... FOR UPDATE;
-- fiddle with the values SELECTed
UPDATE ...; -- and change those values
COMMIT;
The SELECT ... FOR UPDATE announces what should not be tampered with. If another connection wants to mess with the same rows, it will have to wait until your COMMIT, at which time it may find that things have changed and that it needs to do something different. But, in general, this avoids a "deadlock" wherein two transactions are stepping on each other so badly that one has to be "rolled back".
With techniques like this, the "concurrency" is prevented only briefly and relatively precisely. That is, if two connections are working with different rows, both can proceed -- there is no need to "prevent concurrency".
This seems like a pretty basic question but one I don't know the answer to.
I wrote a script in PHP that loops through some data and then performs an UPDATE to records in our database. There are roughly some 150,000 records, so the script certainly takes a while to complete.
Could I potentially harm or interfere with the data insertion if I run a basic SELECT statement?
Say I want to make sure the script is working properly, so I run a basic SELECT COUNT() to see whether the number is increasing in real time as the script runs. Is this possible, or would it screw something up?
Thank you!
Generally a SELECT call is incapable of "causing harm" provided you're not talking about SQL injection problems.
The InnoDB engine, which you should be using, has what's called Multi-Version Concurrency Control, or MVCC for short. It means that until your UPDATE statement, or the transaction that the statement is part of, has been committed, the SELECT runs against the last consistent (committed) state of the database.
If you're using MyISAM, which is a very bad idea in most production environments due to the limitations of that engine and the way the data is stored without a rollback journal, the SELECT call will probably block until the UPDATE is applied since it does not support MVCC.
I have a problem with a project I am currently working on, built in PHP & MySQL. The project itself is similar to an online bidding system. Users bid on a project, and they get a chance to win if they follow up their bid by clicking and clicking again.
The problem is this: if 5 users for example, enter the game at the same time, I get a 8-10 seconds delay in the database - I update the database using the UNIX_TIMESTAMP(CURRENT_TIMESTAMP), which makes the whole system of the bids useless.
I want to mention too that the project is very database intensive (around 30-40 queries per page) and I was thinking maybe the queries get delayed, but I'm not sure if that's happening. If that's the case though, any suggestions how to avoid this type of problem?
Hope I've been at least clear with this issue. It's the first time it happened to me and I would appreciate your help!
You can decide on one or more of the following:
Optimize or minimize the queries each page requires.
Cache the results of queries that do not need to be refreshed on every visit.
Use summary tables.
Re-run the queries only when the underlying data changes.
You have to do this cleverly. The MySQLPerformanceBlog is a good place to start.
I'm not clear on what you're doing, but let me elaborate on what you said. If you're using UNIX_TIMESTAMP(CURRENT_TIMESTAMP()) in your MySQL query, you have a serious problem.
The problem with your approach is that you are using MySQL functions to supply the timestamp record that will be stored in the database. This is an issue, because then you have to wait on MySQL to parse and execute your query before that timestamp is ever generated (and some MySQL engines like MyISAM use table-level locking). Other engines (like InnoDB) have slower writes due to row-level locking granularity. This means the time stored in the row will not necessarily reflect the time the request was generated to insert said row. Additionally, it can also mean that the time you're reading from the database is not necessarily the most current record (assuming you are updating records after they were inserted into the table).
What you need is for the PHP request that generates the SQL query to provide the TIMESTAMP directly in the SQL query. This means the timestamp reflects the time the request is received by PHP and not necessarily the time that the row is inserted/updated into the database.
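A minimal sketch of that idea, assuming a hypothetical `bids` table with a `placed_at` column: the application reads the clock once, as the request arrives, and binds that value as a query parameter instead of calling a timestamp function inside the SQL. Python's built-in `sqlite3` is used only to keep the example self-contained; in PHP the value would come from `time()` and be bound through a PDO prepared statement.

```python
import sqlite3
import time


def record_bid(conn, user_id, amount, received_at):
    """Store the bid with the timestamp supplied by the application."""
    with conn:
        conn.execute(
            "INSERT INTO bids (user_id, amount, placed_at) VALUES (?, ?, ?)",
            (user_id, amount, received_at),
        )


def handle_request(conn, user_id, amount):
    """Capture the arrival time first, then do the (possibly slow) database work."""
    received_at = int(time.time())  # Unix timestamp taken when the request arrives
    # ... validation and other queries could take time here ...
    record_bid(conn, user_id, amount, received_at)
    return received_at


def make_bids_db():
    """Hypothetical schema, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bids (user_id INTEGER NOT NULL, amount INTEGER NOT NULL, "
        "placed_at INTEGER NOT NULL)"
    )
    return conn
```

However long the INSERT waits on locks, the stored `placed_at` still reflects when the request was received.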
You also have to be clear about which MySQL engine your table is using. For example, engines like InnoDB use MVCC (Multi-Version Concurrency Control). This means that a row can be written to while it is being read. When that happens, the database engine keeps the previous version of the row (InnoDB stores it in its undo log) so that the reading client still sees the existing value while the new value is being written. That way you get row-level locking with faster and more stable reads, but potentially slower writes.
I'm writing a queue management system for a small clinic. There will be multiple users trying to do the same thing, so there is a concurrency problem. I'm familiar with the ACID guarantees and also understand the notion of a transaction. I know that two people cannot change the same data at the same time.
But here's my problem: I have a PHP function isFree($time) which determines whether a particular doctor is free at that time. I'm afraid that if two users call the function at the same moment, both of them may get a positive result and mess things up, so somehow I need to either queue concurrent users or accept only one.
The easiest way to solve this problem would be to ensure that my function can only be called by one user at a time. I probably need some kind of flag or blocking system, but I have no exact idea how to do it.
On the other hand, it would be even faster to restrict only those function calls that may overlap. For example, calling isFree($time) for Monday and for Tuesday at the same time won't cause any problems.
You're effectively asking for a lock.
I am guessing your queue system uses MySQL as its database. If so, you can LOCK the table you're using (or, on some storage engines, the specific rows you are using!). The statement is LOCK TABLES yourTableName WRITE.
This will effectively prevent anyone else from reading or writing anything in the table (a READ lock would only block other sessions' writes) until:
Your session is ended
You free the lock (using UNLOCK TABLES)
This is true for all storage engines. InnoDB additionally supports row-level locking through transactions: instead of using a plain SELECT query, suffix it with FOR UPDATE inside a transaction to lock the row(s) you just read.
This will hopefully shed more light on the locking mechanism of MySQL/InnoDB. To free the row locks, commit or roll back the transaction.
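For the booking case in the question, a different technique from the explicit locks described above may also be worth considering: let a unique key on (doctor, time slot) decide which of two simultaneous requests wins. The sketch below is only an illustration under assumptions; the `appointments` table, its columns and `book_slot` are hypothetical, and Python's built-in `sqlite3` stands in for MySQL. Requests for different slots never conflict, which matches the Monday/Tuesday observation in the question.

```python
import sqlite3


def book_slot(conn, doctor_id, slot, patient):
    """Try to book; the UNIQUE constraint rejects a second booking of the same slot."""
    try:
        with conn:  # commit on success, roll back on failure
            conn.execute(
                "INSERT INTO appointments (doctor_id, slot, patient) VALUES (?, ?, ?)",
                (doctor_id, slot, patient),
            )
        return True
    except sqlite3.IntegrityError:
        # Someone else already holds this doctor/slot pair.
        return False


def make_clinic_db():
    """Hypothetical schema, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE appointments ("
        "doctor_id INTEGER NOT NULL, slot TEXT NOT NULL, patient TEXT NOT NULL, "
        "UNIQUE (doctor_id, slot))"
    )
    return conn
```

With this design the availability check and the booking happen in one statement, so there is no window between "is the doctor free?" and "book the doctor" for another request to slip into.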
Is it possible to do a simple count(*) query in a PHP script while another PHP script is doing insert...select... query?
The situation is that I need to create a table with ~1M or more rows from another table. While the insert is running, I do not want the user to feel that the page is frozen, so I am trying to keep a running count updated. But when I run select count(*) from table while the insert is running in the background, I get only 0 until the insert is completed.
So is there any way to ask MySQL to return a partial result first? Or is there a fast way to do a series of inserts with data fetched from a previous select query, with about the same performance as an insert...select... query?
The environment is PHP 4.3 and MySQL 4.1.
Without reducing performance? Not likely. With a little performance loss, maybe...
But why are you regularly creating tables and inserting millions of rows? If you do this only very seldom, can't you just warn the admin (presumably the only one allowed to do such a thing) that this takes a long time? If you're doing this all the time, are you really sure you're not doing it wrong?
I agree with Stein's comment that this is a red flag if you're copying 1 million rows at a time during a PHP request.
I believe that in a majority of cases where people are trying to micro-optimize SQL, they could get much greater performance and throughput by approaching the problem in a different way. SQL shouldn't be your bottleneck.
If you're doing a single INSERT...SELECT, then no, you won't be able to get intermediate results. In fact this would be a Bad Thing, as users should never see a database in an intermediate state showing only a partial result of a statement or transaction. For more information, read up on ACID compliance.
That said, the MyISAM engine may play fast and loose with this. I'm pretty sure I've seen MyISAM commit some but not all of the rows from an INSERT...SELECT when I've aborted it part of the way through. You haven't said which engine your table is using, though.
The other users can't see the insertion until it's committed. That's normally a good thing, since it makes sure they can't see half-done data. However, if you want them to see intermediate data, you could throw in an occasional call to "commit" while you're inserting.
By the way, don't let anybody tell you to turn autocommit on. That's a HUGE time waster. I have a "delete and re-insert" job on my database that takes 1/3rd as long when I turn off autocommit.
Just to be clear, MySQL 4 isn't configured by default to use transactions. It uses the MyISAM table type which locks the entire table for each insert, if I remember correctly.
Your best bet would be to use one of the MySQL bulk insertion functions, such as LOAD DATA INFILE, as these are dramatically faster at inserting large amounts of data. As for the counting, well, you could break the inserts into N groups of 1000 (or Y) then divide your progress meter into N sections and just update it on each group's request.
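The "groups of 1000" idea, combined with the occasional commit suggested earlier, could look roughly like the sketch below. It copies rows from a hypothetical `source` table to a hypothetical `target` table in batches, commits after each batch so other connections can see the rows so far, and reports progress through a callback. Python's built-in `sqlite3` is used only so the example runs on its own; the table names, columns and `copy_in_batches` are assumptions, not part of the original setup.

```python
import sqlite3


def copy_in_batches(conn, batch_size=1000, on_progress=None):
    """Copy source -> target in batches, committing and reporting after each one."""
    total = conn.execute("SELECT COUNT(*) FROM source").fetchone()[0]
    copied = 0
    last_id = 0
    while True:
        # Fetch the next batch by primary key, so each batch starts where the last ended.
        rows = conn.execute(
            "SELECT id, payload FROM source WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        with conn:  # one transaction (and one commit) per batch
            conn.executemany("INSERT INTO target (id, payload) VALUES (?, ?)", rows)
        copied += len(rows)
        last_id = rows[-1][0]
        if on_progress:
            on_progress(copied, total)  # e.g. update a progress meter
    return copied


def make_copy_db(n_rows):
    """Hypothetical schema and data, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO source (id, payload) VALUES (?, ?)",
        ((i, "row %d" % i) for i in range(1, n_rows + 1)),
    )
    conn.commit()
    return conn
```

This trades some speed for visibility: a single INSERT...SELECT is likely faster, but it cannot report progress along the way.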
Edit: Another thing to consider is, if this is static data for a template, then you could use CREATE TABLE ... SELECT (MySQL's counterpart to "select into") to create a new table with the same data. Not sure what your application is, or the intended functionality, but that could work as well.
If you can get to the console, you can ask various status questions that will give you the information you are looking for. The command for listing running queries is SHOW PROCESSLIST.