I want to add 1 to the previous row's hit_counter value, but I'm afraid that may not be safe if multiple queries run in quick succession (i.e. the page is loaded several times a second). This is for a web app I'm making, so I want to make sure any number of page loads is handled well.
Here's what I had in mind:
$result = mysql_query("SELECT * FROM rotation");  // read the previous row
$fetch = mysql_fetch_assoc($result);
$update_hit = $fetch['hit_counter'] + 1;          // increment in PHP
$query = "INSERT INTO rotation (hit_counter, rotation_name) VALUES ('$update_hit', '$rotation_name')";
$result = mysql_query($query);                    // write it back as a new row
I thought about setting the hit_counter column to a UNIQUE KEY, but I don't know what else I'd do after that.
I would use AUTO_INCREMENT but the problem is, I need the actual hit_counter value within the rest of the script.
Any ideas, comments, advice would be greatly appreciated!
Edit: I used hit_count and hit_counter, was a typo. Updated to avoid any confusion.
You can use the ON DUPLICATE KEY functionality when you make rotation_name a UNIQUE key:
INSERT INTO rotation SET hit_counter = 1, rotation_name = 'name'
ON DUPLICATE KEY UPDATE hit_counter = hit_counter + 1;
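For ON DUPLICATE KEY to fire, rotation_name needs a unique index; a minimal sketch, assuming the table from the question:

ALTER TABLE rotation ADD UNIQUE (rotation_name);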
Performance-wise (and if your requirements allow it) I advise pushing updates in bulk (e.g. once per 100 hits) using a caching mechanism such as memcached; see the sketch below.
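A rough sketch of that batching idea with the pecl Memcached extension (the key name and the single-row-per-rotation_name layout are assumptions for this example, not part of the question):

$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$hits = $mc->increment('rotation_hits'); // returns false if the key doesn't exist yet
if ($hits === false) {
    $mc->set('rotation_hits', 1);
    $hits = 1;
}
if ($hits % 100 == 0) {
    // flush to MySQL once per 100 hits
    mysql_query("UPDATE rotation SET hit_counter = hit_counter + 100 WHERE rotation_name = '$rotation_name'");
}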
You could use AUTO_INCREMENT; if you need the inserted id within the rest of the script, you can use mysql_insert_id() to get it.
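For example (a sketch reusing the question's table, with hit_counter switched to AUTO_INCREMENT):

mysql_query("INSERT INTO rotation (rotation_name) VALUES ('$rotation_name')");
$hit_counter = mysql_insert_id(); // the AUTO_INCREMENT value MySQL just assigned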
I have a question that seems pretty basic but am having trouble finding the most efficient solution.
Suppose I have this table, KEYS:

KEY_ID | VALUE  | USED
-------+--------+-----
     1 | 123ASD |    1
     2 | ASD234 |    0
     3 | 123456 |    0
I want to have an API (I'll call it get_key.php here) that will access the db for the last value with USED = 0, return the key in JSON format to the user via AJAX, and then mark the key as used in the db.
I've seen thoughts about lock table, but my worry is that there is a script constantly generating and inserting keys into the DB while tons of users will be requesting keys.
What is the best way to achieve this while staying safe against duplicate keys being sent out, avoiding table locks that cause long delays in page delivery, and still being able to insert while retrieving?
If you are still confused, here is a basic example...
get_key.php
//not real file just pseudo
//(lock table?)
//SELECT VALUE, KEY_ID FROM `KEYS` WHERE USED = 0 LIMIT 1;
//$key = $response['VALUE']
//echo out key in json format
//$key_id = $response['KEY_ID']
//UPDATE `KEYS` SET USED = 1 WHERE KEY_ID = $key_id;
//(unlock table?)
insert_key.php
//$key = $_GET['value']
//(lock tables?)
//INSERT INTO `KEYS` (VALUE) VALUES ($key)
//(unlock tables?)
I know this setup would be extremely insecure in a production setting, but I'm trying to keep it as simple as possible so you can understand my question properly.
Thanks so much for your time!
Use InnoDB or any other engine which supports row-level locks. Then, inside a transaction, you only ever have to lock the one row in question that you're selecting/updating, e.g.
START TRANSACTION;

SELECT VALUE, KEY_ID
FROM `KEYS`
WHERE USED = 0
LIMIT 1
FOR UPDATE;  -- <--- add this; it must come at the end of the SELECT

-- ... do other stuff

UPDATE `KEYS`
SET USED = 1
WHERE KEY_ID = xxx;

COMMIT;
MySQL will lock the record it finds, and then only this particular DB connection will be able to modify that record until it's unlocked or the connection is closed.
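A minimal get_key.php sketch of that pattern with PDO (connection details are placeholders; table and column names come from the question):

$db = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->beginTransaction();
try {
    // FOR UPDATE locks just this one row until COMMIT
    $stmt = $db->query("SELECT KEY_ID, VALUE FROM `KEYS` WHERE USED = 0 LIMIT 1 FOR UPDATE");
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row) {
        $upd = $db->prepare("UPDATE `KEYS` SET USED = 1 WHERE KEY_ID = ?");
        $upd->execute(array($row['KEY_ID']));
    }
    $db->commit();
    echo json_encode(array('key' => $row ? $row['VALUE'] : null));
} catch (Exception $e) {
    $db->rollBack(); // releases the row lock
}

Because only the selected row is locked, the generator script can keep inserting new keys concurrently without waiting on this transaction.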
I'm working on a portal / large website and I have a question as to how to optimise my MySQL / PDO queries in a special case.
I developed it this way: when inserting an id (unique / primary) I run the following code to find the highest unused id in a specific table, and then do an INSERT with that id ($next_avail).
After a short chat on Stack Overflow over the last few days I got the idea that AUTO_INCREMENT is best for this.
But now I realize that in most cases I also use $next_avail (what the AUTO_INCREMENT value would be) as a column value when inserting into other tables.
So my code makes sense for these inserts.
My question is: how fast would the code below be with millions of rows, given that every insert I do depends on it?
Please write comments and ask me to clarify what is not clear for you, in this question.
Thanks, Adrian
$next_avail = 1;
$stmt = $db->prepare("SELECT news_id FROM mya_news ORDER BY news_id DESC LIMIT 1");
$stmt->execute();
while (list($id) = $stmt->fetch(PDO::FETCH_BOTH)) {
    $next_avail = $id + 1;
}
You don't need to do that; just use an AUTO_INCREMENT column in your table, and use the PHP functions to get the last inserted id: mysql_insert_id() with the old mysql extension, or lastInsertId() since you're using PDO.
Take a look at these pages:
http://php.net/manual/en/function.mysql-insert-id.php
and this one for PDO:
http://php.net/manual/en/pdo.lastinsertid.php
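For example, with PDO (the title column is made up for this sketch):

$stmt = $db->prepare("INSERT INTO mya_news (title) VALUES (?)");
$stmt->execute(array($title));
$next_avail = $db->lastInsertId(); // safe to reuse as a column value in other tables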
Possibly use MAX() rather than ordering the results and using a LIMIT.
However, this is very risky: there is a chance that two bits of processing will both get the same $next_avail at the same time. I would suggest changing the order in which you insert rows (or even inserting a dummy row to get the next id and updating the row later on) so you can use the AUTO_INCREMENT column value.
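For completeness, the MAX() variant mentioned above would look something like this; it has the same race condition as the original code:

$next_avail = 1 + (int) $db->query("SELECT COALESCE(MAX(news_id), 0) FROM mya_news")->fetchColumn();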
This is for a file sharing website. In order to make sure a "passcode", which is unique to each file, is truly unique, I'm trying this:
$genpasscode = mysql_real_escape_string(sha1($row['name'].time())); // Make passcode out of time + filename.
$i = 0;
while ($i < 1) // Create new passcode in loop until $i = 1;
{
    $query = "SELECT * FROM files WHERE passcode='".$genpasscode."'";
    $res = mysql_query($query);
    if (mysql_num_rows($res) == 0) // Passcode doesn't exist yet? Stop making a new one!
    {
        $i = 1;
    }
    else // Passcode exists? Make a new one!
    {
        $genpasscode = mysql_real_escape_string(sha1($row['name'].time()));
    }
}
This really only prevents a double passcode if two users upload a file with the same name at the exact same time, but hey, better safe than sorry, right? My question is: does this work the way I intend it to? I have no way to reliably (read: easily) test it, because even one second off would generate a unique passcode anyway.
UPDATE:
Lee suggest I do it like this:
do {
    $query = "INSERT IGNORE INTO files
              (filename, passcode) VALUES ('whatever', SHA1(NOW()))";
    $res = mysql_query($query);
} while ($res && (0 == mysql_affected_rows()));
[Edit: I updated above example to include two crucial fixes. See my answer below for details. -@Lee]
But I'm afraid it will update someone else's row. That wouldn't be a problem if filename and passcode were the only fields in the database, but in addition there are also checks for mime type etc., so I was thinking of this:
//Add file
$sql = "INSERT INTO files (name) VALUES ('".$str."')";
mysql_query($sql) or die(mysql_error());
//Add passcode to last inserted file
$lastid = mysql_insert_id();
$genpasscode = mysql_real_escape_string(sha1($str.$lastid.time())); //Make passcode out of time + id + filename.
$sql = "UPDATE files SET passcode='".$genpasscode."' WHERE id=$lastid";
mysql_query($sql) or die(mysql_error());
Would that be the best solution? The last-inserted-id field is always unique so the passcode should be too. Any thoughts?
UPDATE2: Apparently IGNORE does not replace a row if it already exists. This was a misunderstanding on my part, so that's probably the best way to go!
Strictly speaking, your test for uniqueness won't guarantee uniqueness under a concurrent load. The problem is that you check for uniqueness prior to (and separately from) the place where you insert a row to "claim" your newly generated passcode. Another process could be doing the same thing, at the same time. Here's how that goes...
Two processes generate the exact same passcode. They each begin by checking for uniqueness. Since neither process has (yet) inserted a row to the table, both processes will find no matching passcode in database, and so both processes will assume that the code is unique. Now as the processes each continue their work, eventually they will both insert a row to the files table using the generated code -- and thus you get a duplicate.
To get around this, you must perform the check, and do the insert in a single "atomic" operation. Following is an explanation of this approach:
If you want passcode to be unique, you should define the column in your database as UNIQUE. This will ensure uniqueness (even if your php code does not) by refusing to insert a row that would cause a duplicate passcode.
CREATE TABLE files (
    id int(10) unsigned NOT NULL auto_increment PRIMARY KEY,
    filename varchar(255) NOT NULL,
    passcode varchar(64) NOT NULL UNIQUE
);
Now, use mysql's SHA1() and NOW() to generate your passcode as part of the insert statement. Combine this with INSERT IGNORE ... (docs), and loop until a row is successfully inserted:
do {
    $query = "INSERT IGNORE INTO files
              (filename, passcode) VALUES ('whatever', SHA1(NOW()))";
    $res = mysql_query($query);
} while ($res && (0 == mysql_affected_rows()));
if (!$res) {
    // an error occurred (eg. lost connection, insufficient permissions on table, etc)
    // no passcode was generated. handle the error, and either abort or retry.
} else {
    // success, unique code was generated and inserted into db.
    // you can now do a select to retrieve the generated code (described below)
    // or you can proceed with the rest of your program logic.
}
Note: The above example was edited to account for the excellent observations posted by @martinstoeckli in the comments section. The following changes were made:
changed mysql_num_rows() (docs) to mysql_affected_rows() (docs) -- num_rows doesn't apply to inserts. Also removed the argument to mysql_affected_rows(), as this function operates on the connection level, not the result level (and in any case, the result of an insert is boolean, not a resource number).
added error checking in the loop condition, and added a test for error/success after loop exits. The error handling is important, as without it, database errors (like lost connections, or permissions problems), will cause the loop to spin forever. The approach shown above (using IGNORE, and mysql_affected_rows(), and testing $res separately for errors) allows us to distinguish these "real database errors" from the unique constraint violation (which is a completely valid non-error condition in this section of logic).
If you need to get the passcode after it has been generated, just select the record again:
$res = mysql_query("SELECT * FROM files WHERE id=LAST_INSERT_ID()");
$row = mysql_fetch_assoc($res);
$passcode = $row['passcode'];
Edit: changed above example to use the mysql function LAST_INSERT_ID(), rather than PHP's function. This is a more efficient way to accomplish the same thing, and the resulting code is cleaner, clearer, and less cluttered.
Personally I would have written it a different way, but I'll offer you a much easier solution: sessions.
I guess you're familiar with sessions? Sessions are server-side remembered variables that time out at some point, depending on the server configuration (the PHP default is 24 minutes). The session is linked to a client using a session id, a randomly generated string.
If you start a session on the upload page, an id will be generated which is guaranteed to be unique for as long as the session is not destroyed. That means that when you combine the session id and the current time you'll never get the same passcode: a session id plus the current time (in microseconds, milliseconds or seconds) is never the same twice.
In your upload page:
session_start();
In the page where you handle the upload:
$genpasscode = mysql_real_escape_string(sha1($row['name'].time().session_id()));
// No need for the slow, whacky while loop, insert immediately
// Optionally you can destroy the session id
If you do destroy the session id, there is a very slim chance that another client could be given the same session id, so I wouldn't advise that. I'd just allow the session to expire.
Your question is:
does this work the way I intend it to?
Well, I'd say... yes, it does work, but it could be optimized.
Database
To make sure you never have the same value in the passcode field at the database layer, add a unique key:
/* SQL */
ALTER TABLE `yourtable` ADD UNIQUE `passcode` (`passcode`);
(duplicate key handling has to be taken care of then, of course)
Code
Waiting a second until a new hash is created is OK, but if you're talking heavy load, then a single second might be a tiny eternity. Therefore I'd rather add another component to the sha1 part of your code, maybe a file id from the same database record, a user id, or whatever else makes this really unique.
If you don't have a unique id at hand, you can still fall back to PHP's rand() function for a random number.
I don't think mysql_real_escape_string is needed in this context. sha1 returns a 40-character hexadecimal string anyway, even if there are some bad characters in your rows.
$genpasscode = sha1(rand().$row['name'].time());
...should suffice.
Style
The passcode generation code appears twice in your sample. Start cleaning this up by moving it into a function:
$genpasscode = gen_pc($row['name']);
...
function gen_pc($name)
{
    return sha1(rand() . $name . time());
}
If I were doing it, I'd do it differently: I'd use session_id() to avoid duplicates as well as possible. That way you wouldn't need to loop and possibly talk to your database several times inside that loop.
You can add unique constraint to your table.
ALTER TABLE files ADD UNIQUE (passcode);
PS: You can use microtime() or uniqid() to make the passcode more unique; see the sketch below.
Edit:
Do your best to generate a unique value in PHP, and let the unique constraint guarantee it on the database side. If your value is very nearly unique but in some rare case fails to be, just show a message like "The system is busy now. Please try again." :)
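A sketch of the uniqid()/microtime() idea from the PS above; uniqid($prefix, true) asks PHP for extra entropy, so two requests in the same second still differ:

$genpasscode = sha1(uniqid($row['name'], true) . microtime());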
How do I go about looking in a table to see if a row exists? Background: the table is called enemies. Every row has a unique id which is set to auto_increment. Each row also has a unique value called monsterid; the monsterid isn't auto_increment.
When a monster dies its row is deleted and replaced by a new row, so the id is always changing, and the monsterid changes too.
In PHP I am using the $_GET method, and the monsterid is passed through it.
Basically I am trying to do this:
$monsterID = 334322; // this is the id passed through $_GET

checkMonsterId = "check to see if the monster id exists within the enemies table"

if monsterid exists
    {RUN PHP}
else
    {RUN PHP}
If you need any more clarity, please ask. Thanks in advance for the help!
Use count! If it returns > 0, it exists, else, it doesn't.
select count(*) from enemies where monsterid = 334322
You would use it in PHP thusly (after connecting to the database):
$monsterID = mysql_real_escape_string($monsterID);
$res = mysql_query('select count(*) from enemies where monsterid = ' . $monsterID) or die(mysql_error());
$row = mysql_fetch_row($res);
if ($row[0] > 0)
{
    // Monster exists
}
else
{
    // It doesn't
}
Use count, like
select count(*) from enemies where monsterid = 334322
However, be sure you've added an index on monsterid to the table. If you don't, and this isn't the primary key, the RDBMS will be forced to issue a full table scan - read every row - to give you the value back. On small datasets this doesn't matter, as the table will probably sit in memory anyway, but once the number of rows becomes significant and you're hitting the disk to do the scan, the speed difference can easily be two orders of magnitude or more.
If the number of rows is very small then not indexing is rational, as using a non-primary-key index adds overhead when inserting data; however, this should be a deliberate decision. (I regularly impress clients who've used a programmer who doesn't understand databases by adding indexes to tables which were fine when the coder created them but subsequently slowed to a crawl when loaded with real volumes of data. It's quite amazing how one line of SQL to add an index will buy you guru status in your client's eyes, because you made their system usable again.)
If you're doing more complex queries against the database using subselects - something like finding all locations where there is no monster - then look up the SQL EXISTS clause. It is often overlooked by programmers (the temptation is to return a count of actual values), and using it is generally faster than the alternatives.
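For illustration, such an EXISTS query might look like this (the locations table and its columns are made up for the example):

SELECT l.id
FROM locations l
WHERE NOT EXISTS (SELECT 1 FROM enemies e WHERE e.location_id = l.id);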
Simpler:
select 1 from enemies where monsterid = 334322
If it returns a row, you have a row; if not, you don't.
The mysql_real_escape_string is important to prevent SQL injection.
$monsterid = mysql_real_escape_string($_GET['monsterid']);
$res = mysql_query("SELECT count(*) FROM enemies WHERE monsterid = '$monsterid'");
$count = mysql_result($res, 0);
if ($count > 0) {
    // monster exists
} else {
    // monster doesn't exist
}
I have a website where user ranking is a central feature, but the user count has grown to over 50,000, and looping through all of them to update the rank every 5 minutes is putting a strain on the server. Is there a better method that can update the ranks at least every 5 minutes? It doesn't have to be PHP; it could be something run as a Perl script if that would do the job better (though I'm not sure why that would be, just leaving my options open here).
This is what I currently do to update ranks:
$get_users = mysql_query("SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
$i = 0;
while ($a = mysql_fetch_array($get_users)) {
    $i++;
    mysql_query("UPDATE users SET month_rank = '$i' WHERE id = '$a[id]'");
}
UPDATE (solution):
Here is the solution code, which takes less than 1/2 of a second to execute and update all 50,000 rows (make rank the primary key as suggested by Tom Haigh).
mysql_query("TRUNCATE TABLE userRanks");
mysql_query("INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
mysql_query("UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.id");
Make userRanks.rank an autoincrementing primary key. If you then insert userids into userRanks in descending rank order it will increment the rank column on every row. This should be extremely fast.
TRUNCATE TABLE userRanks;
INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC;
UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.userid;
My first question would be: why are you doing this polling-type operation every five minutes?
Surely rank changes will be in response to some event and you can localize the changes to a few rows in the database at the time when that event occurs. I'm pretty certain the entire user base of 50,000 doesn't change rankings every five minutes.
I'm assuming the "status = '1'" indicates that a user's rank has changed so, rather than setting this when the user triggers a rank change, why don't you calculate the rank at that time?
That would seem to be a better solution as the cost of re-ranking would be amortized over all the operations.
Now I may have misunderstood what you meant by ranking in which case feel free to set me straight.
A simple alternative for bulk update might be something like:
SET @rnk = 0;
UPDATE users
SET month_rank = (@rnk := @rnk + 1)
ORDER BY month_score DESC;
This code uses a user variable (@rnk) that is incremented on each update. Because the update is applied over the ordered list of rows, the month_rank column is set to the incremented value for each row.
Updating the users table row by row will be a time consuming task. It would be better if you could re-organise your query so that row by row updates are not required.
I'm not 100% sure of the syntax (as I've never used MySQL before) but here's a sample of the syntax used in MS SQL Server 2000
DECLARE @tmp TABLE
(
    [MonthRank] [INT] IDENTITY(1,1) NOT NULL,
    [UserId] [INT] NOT NULL
)

INSERT INTO @tmp ([UserId])
SELECT [id]
FROM [users]
WHERE [status] = '1'
ORDER BY [month_score] DESC

UPDATE [users]
SET month_rank = [tmp].[MonthRank]
FROM @tmp AS [tmp], [users]
WHERE [users].[Id] = [tmp].[UserId]
In MS SQL Server 2005/2008 you would probably use a CTE.
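For what it's worth, a sketch of that CTE route using ROW_NUMBER(), updating through the CTE (a hedged example, not tested against your schema):

WITH ranked AS
(
    SELECT [month_rank],
           ROW_NUMBER() OVER (ORDER BY [month_score] DESC) AS [rnk]
    FROM [users]
    WHERE [status] = '1'
)
UPDATE ranked SET [month_rank] = [rnk]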
Any time you have a loop of any significant size that executes queries inside, you've got a very likely antipattern. With more information about the schema and the processing requirements we could see whether the whole job can be done without a loop.
How much time does it spend calculating the scores, compared with assigning the rankings?
Your problem can be handled in a number of ways, and honestly more details from your server may point you in a totally different direction. But doing it that way, you are causing 50,000 little locks on a heavily read table. You might get better performance with a staging table and then some sort of transition step; inserts into a table no one is reading from are probably going to be better.
Consider
mysql_query("delete from month_rank_staging;");
while(bla){
mysql_query("insert into month_rank_staging values ('$id', '$i');");
}
mysql_query("update month_rank_staging src, users set users.month_rank=src.month_rank where src.id=users.id;");
That'll cause one (bigger) lock on the table, but might improve your situation. But again, that may be way off base depending on the true source of your performance problem. You should probably look deeper at your logs, mysql config, database connections, etc.
Possibly you could use shards by time or other category. But read this carefully before...
You can split up the rank processing and the update execution. So, run through all the data and process the query, adding each update statement to a cache; when the processing is complete, run the updates. You should have the WHERE portion of the UPDATE reference a primary key set to auto_increment, as mentioned in other posts. This will prevent the updates from interfering with the performance of the processing, and it prevents users later in the processing queue from wrongfully taking advantage of the values from the users who were processed before them (if one user's rank affects another's). It also prevents the database from clearing out its table caches from the SELECTs your processing code does.
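A rough sketch of that split, reusing the userRanks staging table from the accepted solution above (so this assumes userRanks has an auto_increment primary key named rank and a userid column):

$values = array();
$res = mysql_query("SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
while ($row = mysql_fetch_assoc($res)) {
    $values[] = '(' . (int)$row['id'] . ')';
}
mysql_query("TRUNCATE TABLE userRanks");
// one bulk insert instead of 50,000 single-row UPDATEs
mysql_query("INSERT INTO userRanks (userid) VALUES " . implode(',', $values));
mysql_query("UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.userid");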