How to "close" database when the query happens? - php

$database->count = "SELECT * FROM table WHERE item_id = 1"
if($database->count == 1)
{
$database->update = "UPDATE users SET money = money - 1000";
$database->delete = "DELETE table WHERE item_id = 1";
}
Let's say I have this code (I've just made it up) on my index.php page. Can the "SELECT * FROM table WHERE item_id = 1" query run at the same time for two visitors, so that both get a count of 1 and both have 1000 money deducted? If so, how can I avoid that?
Thank you.

If you're worried about two queries running at the same time causing an unbalanced state in your DB, you should be using transactions: http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-transactions.html
Transactions are helpful in keeping the state of your data correct.
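For example, a minimal sketch of the check above wrapped in a transaction, assuming InnoDB tables and a mysqli connection in $db (both assumptions; the question's $database wrapper is pseudocode):
<?php
// Minimal sketch: assumes InnoDB tables and a mysqli connection in $db.
$db->begin_transaction();

// FOR UPDATE locks the matching row, so a second request blocks here
// until this transaction commits or rolls back.
$result = $db->query("SELECT * FROM `table` WHERE item_id = 1 FOR UPDATE");

if ($result->num_rows == 1) {
    // Same statements as in the question (note the DELETE FROM syntax).
    $db->query("UPDATE users SET money = money - 1000");
    $db->query("DELETE FROM `table` WHERE item_id = 1");
}

$db->commit();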

You can run LOCK TABLES `table` WRITE before the queries and UNLOCK TABLES after them.
http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html
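A rough sketch of that approach, again assuming a mysqli connection in $db; every table touched between LOCK TABLES and UNLOCK TABLES has to be listed:
<?php
// Sketch of the table-locking approach (works for MyISAM tables too).
$db->query("LOCK TABLES `table` WRITE, users WRITE");

$result = $db->query("SELECT * FROM `table` WHERE item_id = 1");
if ($result->num_rows == 1) {
    $db->query("UPDATE users SET money = money - 1000");
    $db->query("DELETE FROM `table` WHERE item_id = 1");
}

$db->query("UNLOCK TABLES");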

You need transactions.

If you are using InnoDB, you can adjust the transaction isolation level so that dirty reads are not allowed. Make sure you use REPEATABLE READ as your transaction isolation level.
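For instance, a small sketch of setting that isolation level for the current session, assuming a mysqli connection in $db:
<?php
// Assumption: $db is a mysqli connection and the tables use InnoDB.
$db->query("SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ");
$db->begin_transaction();
// ... run the SELECT / UPDATE / DELETE here ...
$db->commit();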
BTW The DELETE line should say DELETE FROM table WHERE item_id = 1;

Not all databases have transaction support, so since you are using MySQL with PHP you will need a table-locking technique.
Your table stays locked until all the work is done and you unlock it; you can use row-level locking as well.
http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html

Related

How to implement locking mechanism for mysql table row to avoid race condition?

<?php
$time      = getdate();
$temp_time = $time['month'] . " " . $time['year']; // textual month plus year, e.g. "January 2024"

$q = "SELECT * FROM websiteviews ORDER BY id DESC LIMIT 1";
$r = mysqli_query($dbc, $q);
$r = mysqli_fetch_assoc($r);

if ($temp_time == $r['timespan']) {
    $count = $r['view'] + 1;
    $q = "UPDATE websiteviews SET view='" . $count . "' WHERE id='" . $r['id'] . "'";
    mysqli_query($dbc, $q);
    echo mysqli_error($dbc);
}
?>
Hi friends, I am updating my website's page-view count every time a webpage is loaded.
This piece of code is placed at the top of the page. My question is: can this mechanism cause a race condition? I want to lock my row until it is updated. Please help me with a MySQL query to lock the particular row until it is updated.
You should not select and then update the record you want to update.
UPDATE
websiteviews
SET
view = view+1
WHERE
-- Assuming 'timespan' is a varchar column and the stored format is 'MM YYYY'
timespan = DATE_FORMAT(NOW(), '%m %Y')
The above query updates the record in one step, so the race condition is handled by the DBMS. You may need to change the WHERE condition to match your requirements.
If the column view is nullable, then change the SET part to view = COALESCE(view, 0)+1
Storage engines
In MySQL, tables can be stored with different storage engines, and each engine has different locking support (if any).
Additional resource: About locking when InnoDB is the storage engine
I think you need to use transaction locking, but your tables need to use an engine that supports transactions, such as InnoDB.
You can read more on
http://dev.mysql.com/doc/refman/5.0/en/lock-tables-restrictions.html
http://dev.mysql.com/doc/refman/5.0/en/innodb-lock-modes.html
You can use Transactions to solve the race condition, something like below:
START TRANSACTION;
SELECT view, id FROM websiteviews ORDER BY id DESC LIMIT 1 FOR UPDATE;
#id obtained from previous step
#view obtained from previous step
UPDATE websiteviews SET `view`=<view>+1 WHERE id= <id>;
COMMIT;
NOTE: Take note of the FOR UPDATE clause used with the SELECT statement.
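For example, a rough PHP version of the same idea, reusing the $dbc connection from the question (mysqli and an InnoDB table are assumed):
<?php
// Sketch: lock the row with FOR UPDATE, then update it, all in one transaction.
mysqli_begin_transaction($dbc);

$q   = "SELECT id, view FROM websiteviews ORDER BY id DESC LIMIT 1 FOR UPDATE";
$row = mysqli_fetch_assoc(mysqli_query($dbc, $q));

$q = "UPDATE websiteviews SET view='" . ($row['view'] + 1) . "' WHERE id='" . $row['id'] . "'";
mysqli_query($dbc, $q);

mysqli_commit($dbc); // the row lock is released here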

Delete row by row or in bulk

I would like to delete a bulk of data. The table currently has approximately 11,207,333 rows, and roughly 300k of them will be deleted.
I have two methods to do this but am unsure which one performs faster.
My first option:
$start_date = "2011-05-01 00:00:00";
$end_date = "2011-05-31 23:59:59";
$sql = "DELETE FROM table WHERE date>='$start_date' and date <='$end_date'";
$mysqli->query($sql);
printf("Affected rows (DELETE): %d\n", $mysqli->affected_rows);
My second option:
$query  = "SELECT count(*) as count FROM table WHERE date>='$start_date' and date <='$end_date'";
$result = $mysqli->query($query);
$row    = $result->fetch_array(MYSQLI_ASSOC);
$total  = $row['count'];
if ($total > 0) {
    $query  = "SELECT * FROM table WHERE date>='$start_date' and date <='$end_date' LIMIT 0,$total";
    $result = $mysqli->query($query);
    while ($row = $result->fetch_array(MYSQLI_ASSOC)) {
        $table_id = $row['table_id']; // primary key
        $query = "DELETE FROM table WHERE table_id = $table_id";
        $mysqli->query($query);
    }
}
This table's data is displayed to clients, so I am afraid that if the deletion goes wrong it will affect them.
I was wondering whether there is any better method than mine.
If you need more info from me, just let me know.
Thank you.
In my opinion, the first option is faster.
The second option contains a loop, which I think will be slower because it keeps looping and looking up each table id.
If you did not provide the wrong start and end dates, I think you're safe with either option, but option 1 is faster in my opinion.
And yeah, I don't see any deletion in option 2, but I assume you have it in mind, just using the looping method.
Option one is your best bet.
If you are afraid something will "go wrong" you could protect yourself by backing up the data first, exporting the rows you plan to delete, or implementing a logical delete flag.
Assuming that there is indeed a DELETE query in it, the second method is not only slower, it may also break if another connection deletes one of the rows you intend to delete in your while loop before you get a chance to do it. For it to work, you need to wrap it in a transaction:
$mysqli->query("START TRANSACTION");
# your series of queries...
$mysqli->query("COMMIT");
This will allow the correct processing of your queries in isolation of the rest of the events happening in the db.
At any rate, if you want the first query to be faster, you need to tune your table definition by adding an index on the column used for the deletion, namely `date` (however, recall that this new index may hamper other queries in your app, if there are already several indexes on that table).
Without that index, MySQL will basically process the query the same way as in method 2, but without:
PHP interpretation,
network communication and
query analysis overhead.
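Coming back to the index suggestion above, a sketch of adding it (the index name idx_date is an assumption):
<?php
// One-off DDL; run once, not on every request.
$mysqli->query("ALTER TABLE `table` ADD INDEX idx_date (`date`)");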
You don't need any SELECTs to do the delete in a loop. Just use LIMIT in your DELETE query and check the number of affected rows:
$start_date = "2011-05-01 00:00:00";
$end_date = "2011-05-31 23:59:59";
$deletedRecords = 0;
$sql = "DELETE FROM table WHERE date >= '$start_date' AND date <= '$end_date' LIMIT 100";
do {
    $mysqli->query($sql);
    $deletedRecords += $mysqli->affected_rows;
} while ($mysqli->affected_rows > 0);
printf("Affected rows (DELETE): %d\n", $deletedRecords);
Which method is better depends on the storage engine you are using.
If you are using InnoDB, this (deleting in chunks) is the recommended way. The reason is that a DELETE statement runs in a transaction (even in auto-commit mode, every SQL statement runs in its own transaction in order to be atomic: if it fails in the middle, the whole delete is rolled back and you won't end up with half the data deleted). A single big DELETE therefore means a long-running transaction with a lot of locked rows for its duration, which will block anyone who wants to update that data (it can block inserts if unique indexes are involved), and reads will have to go through the rollback log. In other words, for InnoDB, large deletes are faster when performed in chunks.
In MyISAM, however, a DELETE locks the entire table. If you do it in lots of small chunks, you will execute too many LOCK/UNLOCK operations, which will actually slow the process down. I would still do it in a loop for MyISAM, to give other processes a chance to use the table, but in larger chunks than for InnoDB. I would never do it row by row on a MyISAM table because of the LOCK/UNLOCK overhead.

What is the best way to manually increment a counter in PHP/MySQL?

I need to use a batchId in one of my projects; one or more rows can have the same batchId. So when I insert a batch of 1000 rows for a single user, I give those 1000 rows a single batchId, which is the next auto-incremented batchId.
Currently I maintain a separate unique_ids table and store the last batchId there.
Whenever I need to insert a batch of rows into a table, I increment the batchId in the unique_ids table by 1 and use it for the batch insert.
update unique_ids set nextId = nextId + 1 where `key` = 'batchId';
select nextId from unique_ids where `key` = 'batchId';
I call up a function which fires above two queries and return me the nextId for batch (batchId).
Here is my PHP class and the function call for it. I am using ADODB; you can ignore the ADODB-related code.
class UniqueId
{
    static public $db;

    public function __construct()
    {
    }

    static public function getNextId()
    {
        self::$db = getDBInstance();

        $updUniqueIds = "Update unique_ids set nextId = nextId + 1 where `key` = 'batchId'";
        self::$db->EXECUTE($updUniqueIds);

        $selUniqueId = "Select nextId from unique_ids where `key` = 'batchId'";
        $resUniqueId = self::$db->EXECUTE($selUniqueId);

        return $resUniqueId->fields['nextId'];
    }
}
Now whenever I need the next batchId, I just call the line of code below.
`$batchId = UniqueId::getNextId();`
But the real problem is that when there are hundreds of simultaneous requests in a second, it gives the same batchId to two different batches. This is a serious issue for me and I need to solve it.
Please suggest what I should do. Can I restrict this class to a single instance, so that simultaneous requests cannot call this function at the same time and a single batchId is never given to two different batches?
Have a look into atomic operations or transactions. It will lock the database and only allow one write query at any given instance in time.
This might affect your performance, since other users now have to wait for an unlocked database!
I am not sure what sort of support ADODB provides for atomicity though!
Basic concept is:
Acquire Lock
Read from DB
Write to DB with new ID
Release Lock
If a lock is already acquired, the script will be blocked (busy waiting) until it is released again. But this way you are guaranteed no data hazards occur.
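One hedged way to implement that acquire/release pattern is a MySQL named lock via GET_LOCK(); the sketch below assumes a plain mysqli connection in $db rather than ADODB (ADODB's Execute() could issue the same SQL), and the lock name is arbitrary:
<?php
// Sketch of the acquire/release pattern with a MySQL named lock.
// Real code should check that GET_LOCK() returned 1 before continuing.
$db->query("SELECT GET_LOCK('batchId_lock', 10)"); // acquire, wait up to 10 seconds

$db->query("UPDATE unique_ids SET nextId = nextId + 1 WHERE `key` = 'batchId'");
$res     = $db->query("SELECT nextId FROM unique_ids WHERE `key` = 'batchId'");
$batchId = $res->fetch_assoc()['nextId'];

$db->query("SELECT RELEASE_LOCK('batchId_lock')"); // release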
Begin tran
Update
Select
Commit
That way the update lock prevents two concurrent runs from pulling the same value.
If you select first, the shared lock will not isolate the two.
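A minimal sketch of that sequence (begin, update, select, commit), assuming mysqli instead of ADODB and an InnoDB unique_ids table:
<?php
$db->begin_transaction();

// The UPDATE takes an exclusive lock on the batchId row...
$db->query("UPDATE unique_ids SET nextId = nextId + 1 WHERE `key` = 'batchId'");

// ...so this SELECT sees the value this transaction just wrote.
$res     = $db->query("SELECT nextId FROM unique_ids WHERE `key` = 'batchId'");
$batchId = $res->fetch_assoc()['nextId'];

$db->commit(); // the lock is held until here, so two requests cannot get the same batchId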

Insert automatically on new table?

I will create 5 tables, namely data1, data2, data3, data4 and data5. Each table can only store 1000 records.
When there is a new entry, i.e. when I want to insert new data, I must do a check:
<?php
$data1 = mysql_query("SELECT * FROM data1");
if (mysql_num_rows($data1) > 1000) {
    $data2 = mysql_query("SELECT * FROM data2");
    if (mysql_num_rows($data2) > 1000) {
        // and so on...
    }
}
I think this is not the right way, is it? I mean, if I am user 4500, it would take some time to do all the checks. Is there any better way to solve this problem?
I haven't decided on the numbers; it could be 5000 or 10000 records. The reason is flexibility and portability. Well, one of my SQL gurus suggested I do it this way.
Unless your guru was talking about something like Partitioning, I'd seriously doubt his advice. If your database can't handle more than 1000, 5000 or 10000 rows, look for another database. Unless you have a really specific example of how a record limit will help you, it probably won't. With the amount of overhead it adds, it probably only complicates things for no gain.
A properly set up database table can easily handle millions of records. Splitting it into separate tables will most likely increase neither flexibility nor portability. If you accumulate enough records to run into performance problems, congratulate yourself on a job well done and worry about it then.
Read up on how to count rows in MySQL.
Depending on which storage engine you are using, COUNT(*) operations on InnoDB tables are quite expensive, and those counts should instead be maintained by triggers and tracked in an adjacent information table.
The structure you describe is often designed around a mapping table first. One queries the mapping table to find the destination table associated with a primary key.
You can keep a "tracking" table to keep track of the current table between requests.
Also be on alert for race conditions (use transactions, or ensure only one process is running at a time).
Also, don't do $data1 = mysql_query("SELECT * FROM data1"); with nested ifs; do something like:
$i = 1;
do {
    $result   = mysql_query("SELECT COUNT(*) FROM data$i");
    $rowCount = (int) mysql_result($result, 0);
    $i++;
} while ($rowCount >= 1000);
I'd be surprised if MySQL doesn't have some fancy-pants way to manage this automatically (or at least, better than what I'm about to propose), but here's one way to do it.
1. Insert record into 'data'
2. Check the length of 'data'
3. If >= 1000,
- CREATE TABLE 'dataX' LIKE 'data';
(X will be the number of tables you have + 1)
- INSERT INTO 'dataX' SELECT * FROM 'data';
- TRUNCATE 'data';
This means you will always be inserting into the 'data' table, and 'data1', 'data2', 'data3', etc are your archived versions of that table.
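A rough sketch of that archival step, assuming a mysqli connection in $mysqli and that $next (the number for the next archive table) is tracked elsewhere:
<?php
// Sketch: archive the live 'data' table once it reaches 1000 rows.
$count = $mysqli->query("SELECT COUNT(*) FROM data")->fetch_row()[0];

if ($count >= 1000) {
    $mysqli->query("CREATE TABLE data{$next} LIKE data");       // same structure as 'data'
    $mysqli->query("INSERT INTO data{$next} SELECT * FROM data");
    $mysqli->query("TRUNCATE TABLE data");                       // 'data' starts empty again
}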
You can create a MERGE table like this:
CREATE TABLE all_data ([col_definitions]) ENGINE=MERGE UNION=(data1,data2,data3,data4,data5);
Then you would be able to count the total rows with a query like SELECT COUNT(*) FROM all_data.
If you're using MySQL 5.1 or above, you can let the database handle this (nearly) automatically using partitioning:
Read this article or the official documentation
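For instance, a hedged sketch of what such a partitioned table could look like (the column names are assumptions, not taken from the question):
<?php
// Sketch only: one hash-partitioned table instead of data1..data5.
$mysqli->query("
    CREATE TABLE data (
        id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        payload VARCHAR(255)
    )
    PARTITION BY HASH(id)
    PARTITIONS 5
");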

Best way to update user rankings without killing the server

I have a website that has user ranking as a central part, but the user count has grown to over 50,000 and it is putting a strain on the server to loop through all of those to update the rank every 5 minutes. Is there a better method that can be used to easily update the ranks at least every 5 minutes? It doesn't have to be with php, it could be something that is run like a perl script or something if something like that would be able to do the job better (though I'm not sure why that would be, just leaving my options open here).
This is what I currently do to update ranks:
$get_users = mysql_query("SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
$i = 0;
while ($a = mysql_fetch_array($get_users)) {
    $i++;
    mysql_query("UPDATE users SET month_rank = '$i' WHERE id = '$a[id]'");
}
UPDATE (solution):
Here is the solution code, which takes less than 1/2 of a second to execute and update all 50,000 rows (make rank the primary key as suggested by Tom Haigh).
mysql_query("TRUNCATE TABLE userRanks");
mysql_query("INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
mysql_query("UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.userid");
Make userRanks.rank an auto-incrementing primary key. If you then insert userids into userRanks in descending score order, the rank column will be incremented on every row. This should be extremely fast.
TRUNCATE TABLE userRanks;
INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC;
UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.userid;
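The userRanks table itself is never shown; a plausible definition along these lines (an assumption, not part of the original post) would be:
<?php
// Assumed schema: rank auto-increments, so rows inserted in descending
// score order receive rank 1, 2, 3, ... automatically.
mysql_query("CREATE TABLE userRanks (
    `rank` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    userid INT UNSIGNED NOT NULL
)");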
My first question would be: why are you doing this polling-type operation every five minutes?
Surely rank changes will be in response to some event and you can localize the changes to a few rows in the database at the time when that event occurs. I'm pretty certain the entire user base of 50,000 doesn't change rankings every five minutes.
I'm assuming the "status = '1'" indicates that a user's rank has changed so, rather than setting this when the user triggers a rank change, why don't you calculate the rank at that time?
That would seem to be a better solution as the cost of re-ranking would be amortized over all the operations.
Now I may have misunderstood what you meant by ranking in which case feel free to set me straight.
A simple alternative for bulk update might be something like:
SET @rnk := 0;
UPDATE users
SET month_rank = (@rnk := @rnk + 1)
ORDER BY month_score DESC;
This code uses a user variable (@rnk) that is incremented on each update. Because the update is applied over the ordered list of rows, the month_rank column is set to the incremented value for each row.
Updating the users table row by row will be a time consuming task. It would be better if you could re-organise your query so that row by row updates are not required.
I'm not 100% sure of the syntax (as I've never used MySQL before) but here's a sample of the syntax used in MS SQL Server 2000
DECLARE @tmp TABLE
(
    [MonthRank] [INT] IDENTITY(1,1) NOT NULL,
    [UserId] [INT] NOT NULL
)

INSERT INTO @tmp ([UserId])
SELECT [id]
FROM [users]
WHERE [status] = '1'
ORDER BY [month_score] DESC

UPDATE users
SET month_rank = [tmp].[MonthRank]
FROM @tmp AS [tmp], [users]
WHERE [users].[Id] = [tmp].[UserId]
In MS SQL Server 2005/2008 you would probably use a CTE.
Any time you have a loop of any significant size that executes queries inside, you've got a very likely antipattern. We could look at the schema and processing requirement with more info, and see if we can do the whole job without a loop.
How much time does it spend calculating the scores, compared with assigning the rankings?
Your problem can be handled in a number of ways. Honestly more details from your server may point you in a totally different direction. But doing it that way you are causing 50,000 little locks on a heavily read table. You might get better performance with a staging table and then some sort of transition. Inserts into a table no one is reading from are probably going to be better.
Consider
mysql_query("delete from month_rank_staging;");
while (bla) {
    mysql_query("insert into month_rank_staging values ('$id', '$i');");
}
mysql_query("update month_rank_staging src, users set users.month_rank=src.month_rank where src.id=users.id;");
That'll cause one (bigger) lock on the table, but might improve your situation. But again, that may be way off base depending on the true source of your performance problem. You should probably look deeper at your logs, mysql config, database connections, etc.
Possibly you could use shards by time or other category. But read this carefully before...
You can split up the rank processing and the updating execution. So, run through all the data and process the query. Add each update statement to a cache. When the processing is complete, run the updates. You should have the WHERE portion of the UPDATE reference a primary key set to auto_increment, as mentioned in other posts. This will prevent the updates from interfering with the performance of the processing. It will also prevent users later in the processing queue from wrongfully taking advantage of the values from the users who were processed before them (if one user's rank affects that of another). It also prevents the database from clearing out its table caches from the SELECTS your processing code does.
