I have a question that seems pretty basic but am having trouble finding the most efficient solution.
Suppose I have this table, table KEYS
KEYS
KEY_ID VALUE USED
1 123ASD 1
2 ASD234 0
3 123456 0
I want to have an API (going to call it get_key.php here) that will fetch the last value with used=0 from the db, return the key in JSON format to be delivered to the user via ajax, and then mark the key as used in the db.
I've seen suggestions about locking the table, but my worry is that there is a script constantly generating and inserting keys into the DB while tons of users will be requesting keys.
What is the best way to achieve this while still being safe against duplicate entries being sent out, table locks causing long delays in web page delivery, and still being able to insert while retrieving?
If you are still confused, here is a basic example...
get_key.php
//not real file just pseudo
//(lock table?)
//SELECT VALUE, KEY_ID FROM `KEYS` WHERE USED = 0 LIMIT 1;
//$key = $response['VALUE']
//echo out key in json format
//$key_id = $response['KEY_ID']
//UPDATE `KEYS` SET USED = 1 WHERE KEY_ID = $key_id;
//(unlock table?)
insert_key.php
//$key = $_GET['value']
//(lock tables?)
//INSERT INTO `KEYS` (VALUE) VALUES ($key)
//(unlock tables?)
I know this setup in a production setting would be extremely insecure, but I'm trying to make it as simple as possible so you can understand my question properly.
Thanks so much for your time!
Use InnoDB or any other engine which supports row-level locks. Then you only ever have to lock the one row in question that you're selecting/updating, e.g.
SELECT VALUE, KEY_ID
FROM `KEYS`
WHERE USED = 0
LIMIT 1
FOR UPDATE; -- <--- add this clause (it goes at the end of the statement)
... do other stuff
UPDATE `KEYS`
SET USED = 1
WHERE KEY_ID = xxx;
COMMIT;
MySQL will lock the record it finds, and then only this particular DB connection will be able to modify that record until it's unlocked or the connection is closed.
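A minimal sketch of the whole get_key.php flow using this approach (assuming a PDO connection and the table from the question; error handling is trimmed for brevity):

$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->beginTransaction();
try {
    // Lock exactly one unused row; concurrent requests block only on this row.
    $stmt = $pdo->query("SELECT KEY_ID, VALUE FROM `KEYS` WHERE USED = 0 LIMIT 1 FOR UPDATE");
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        $pdo->rollBack();
        echo json_encode(['error' => 'no keys available']);
        exit;
    }
    // Mark the key as used while we still hold the row lock.
    $update = $pdo->prepare("UPDATE `KEYS` SET USED = 1 WHERE KEY_ID = ?");
    $update->execute([$row['KEY_ID']]);
    $pdo->commit(); // committing releases the lock
    echo json_encode(['key' => $row['VALUE']]);
} catch (PDOException $e) {
    $pdo->rollBack();
    throw $e;
}

Because only the matched row is locked, the insert script can generally keep adding new keys while keys are being handed out.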
Related
I have found two different ways to, first, get the next invoice number and, then, save the invoice in a multi-tenant database where, of course, each tenant will have his own invoices with different incremental numbers.
My first (and current) approach is this (it works fine):
1. Add a new record to the invoices table; the invoice number doesn't matter yet (for example, 0 or empty).
2. I get the unique ID of THAT created record after the insert.
3. Now I do a "SELECT * FROM table WHERE ID = $lastcreatedID **FOR UPDATE**".
4. Here I get the latest saved invoice number with "SELECT @A:=MAX(NUMBER)+1 FROM TABLE WHERE......".
5. Finally I update the previously saved record with that invoice number with an "UPDATE table SET NUMBER = $mynumber WHERE ID = $lastcreatedID".
This works fine, but I don't know if the "for update" is really needed or if this is the correct way to do this in a multi-tenant DB, due to performance, etc.
The second (and simpler) approach is this (it works too, but I don't know whether it is a safe approach):
INSERT INTO table (NUMBER,TENANT) SELECT COALESCE(MAX(NUMBER),0)+1,$tenant FROM table WHERE....
That's it
Both methods are working, but I would like to know the differences between them regarding speed, performance, if it may create duplicates, etc.
Or... is there any better way to do this?
I'm using MySQL and PHP. The application is an invoice/sales cloud software that will be used by a lot of customers (tenants).
Thanks
Regardless of whether you're using these values as database IDs, re-using IDs is virtually guaranteed to cause problems at some point. Even if you're not re-using IDs, you're going to run into the case where two invoice creation requests run at the same time and get the same MAX()+1 result.
To get around all this you need to implement a simple sequence generator that locks its storage while a value is being issued. E.g.:
CREATE TABLE client_invoice_serial (
    -- note: also FK this back to the client record
    client_id INTEGER UNSIGNED NOT NULL PRIMARY KEY,
    serial INTEGER UNSIGNED NOT NULL DEFAULT 0
);
$dbh = new PDO('mysql:...');

/* This defaults to 'on', making every query an implicit transaction. It needs
   to be off for this. You may or may not want to set this globally, or just
   turn it off before this and back on at the end. */
$dbh->setAttribute(PDO::ATTR_AUTOCOMMIT, 0);

// Simple best practice, ensures that SQL errors MUST be dealt with.
// Assumed to be enabled for the try/catch below.
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$dbh->beginTransaction();
try {
    // the below will lock the selected row
    $select = $dbh->prepare("SELECT * FROM client_invoice_serial WHERE client_id = ? FOR UPDATE;");
    $select->execute([$client_id]);
    if( $select->rowCount() === 0 ) {
        $insert = $dbh->prepare("INSERT INTO client_invoice_serial (client_id, serial) VALUES (?, 1);");
        $insert->execute([$client_id]);
        $invoice_id = 1;
    } else {
        $invoice_id = $select->fetch(PDO::FETCH_ASSOC)['serial'] + 1;
        $update = $dbh->prepare("UPDATE client_invoice_serial SET serial = serial + 1 WHERE client_id = ?");
        $update->execute([$client_id]);
    }
    $dbh->commit();
} catch(\PDOException $e) {
    // Make sure the transaction is cleaned up ASAP, then let the exception
    // bubble up into your general error handling.
    $dbh->rollback();
    throw $e; // or throw a more pertinent error/exception of your choosing
}
// both committing and rolling back will release the lock
// both committing and rolling back will release the lock
At a very basic level this is what MySQL is doing in the background for AUTO_INCREMENT columns.
Do not use MAX(id)+1. It will, someday, bite you. There will be two invoices with the same number, and it will take us a few paragraphs to explain why it happened.
Instead, use AUTO_INCREMENT the way it is intended.
INSERT INTO Invoices (id, ...) VALUES (NULL, ...);
SELECT LAST_INSERT_ID(); -- specific to the connection
That is safe even when multiple connections are doing the same thing. No FOR UPDATE, no BEGIN, etc is necessary. (You may want such for other purposes.)
And, never delete rows. Instead, use the standard business practice of invalidating bad invoices. Imagine being audited.
All that said, there is still a potential problem. After a ROLLBACK or system crash, an id may be "burned". Also things like INSERT IGNORE allocate the id before checking to see whether it will be needed.
If you can live with the caveats, use AUTO_INCREMENT.
If not, then create a 1-row, 2-column table to simulate a sequence number generator: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#sequence
Or use MariaDB's SEQUENCE.
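If you go the sequence-table route, a minimal sketch looks like this (the table and column names are made up for illustration; the linked cookbook has a fuller treatment). The trick is LAST_INSERT_ID(expr), which stashes the updated value for this connection so it can be read back without a race:

CREATE TABLE invoice_seq (
    id TINYINT UNSIGNED NOT NULL PRIMARY KEY,  -- always 1; the table stays one row
    next_num INT UNSIGNED NOT NULL
);
INSERT INTO invoice_seq VALUES (1, 0);

-- Issue the next number atomically:
UPDATE invoice_seq SET next_num = LAST_INSERT_ID(next_num + 1) WHERE id = 1;
SELECT LAST_INSERT_ID();  -- the freshly issued number, per connection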
Both approaches work, but each has its own demerits in high-traffic situations.
The first approach runs 3 queries for every invoice you create, putting extra load on your server.
The second approach can lead to duplicates when two invoices are generated with very little time difference (such that the SELECT query returns the same max number for both invoices).
Two solutions to these problems are listed below:
Use generated columns: MySQL supports generated columns, which are derived from other column values for each row; see the MySQL manual on generated columns.
Calculate the invoice number on the fly: since you're using the primary key as part of the invoice number, let the DB generate unique primary keys, and then build the invoice number on the fly in your business logic from the id of each invoice, as in the sketch below.
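A hedged sketch of that second idea (the prefix format, function name, and tenant code are invented for illustration):

// Derive a display invoice number from the AUTO_INCREMENT id at read time.
function invoice_number(int $id, string $tenantCode): string
{
    return sprintf('%s-INV-%06d', $tenantCode, $id);
}

echo invoice_number(42, 'ACME'); // ACME-INV-000042

Note that deriving the number from a global AUTO_INCREMENT id gives globally increasing, not per-tenant consecutive, numbers; whether that is acceptable depends on the invoicing rules you must follow.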
I have the following call to my database to retrieve the last row ID from an AUTO_INCREMENT column, which I use to find the next row ID:
$result = $mysqli->query("SELECT articleid FROM article WHERE articleid=(SELECT MAX(articleid) FROM article)");
$row = $result->fetch_assoc();
$last_article_id = $row["articleid"];
$last_article_id = $last_article_id + 1;
$result->close();
I then use $last_article_id as part of a filename system.
This is working perfectly... until I delete a row, meaning the call retrieves an ID further down the order than the one I want.
An example would be:
ID
0
1
2
3
4-(deleted row)
5-(deleted row)
6-(next ID to be used for INSERT call)
I'd like the filename to be something like 6-0.jpg; however, the filename ends up being 4-0.jpg, as the call targets ID 3 and adds 1, etc.
Any thoughts on how I can get the next MySQL row ID when any number of previous rows have been deleted?
You are making a significant error by trying to predict the next auto-increment value. If you want your system to scale, you do not have a choice: you have to either insert the row first, or rename the file later.
This is a classic oversight I see developers make -- you are coding this as if there would only ever be a single user on your site. It is extremely likely that at some point two articles will be created at almost the same time. Both queries will "predict" the same id, both will use the same filename, and one of the files will disappear, one of the table entries may point to the wrong file, and the other entry will reference a file that does not exist. And you'll be scratching your head asking "how did this happen?!"
Predicting auto-increment values is bad practice. Don't do it. Plan for concurrency.
Also, the information_schema tables are not really tables; they are server internals exposed through the SQL interface. Querying information_schema.TABLES, or running SHOW TABLE STATUS, is expensive, and not something you want to do in production, so don't be tempted to use something you find there.
You can use the insert_id property after you insert the new row to retrieve the new key:
$mysqli->query($yourQueryHere);
$newId = $mysqli->insert_id; // a property in mysqli, not a method
That requires the id field to be an AUTO_INCREMENT column, though.
As for the filename, you could store it in a variable, then do the query, then change the name and then write the file.
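A minimal sketch of that order of operations (the title column and the upload-handling variables are assumptions for illustration):

// Insert the row first, then build the filename from the id MySQL assigned.
$stmt = $mysqli->prepare("INSERT INTO article (title) VALUES (?)");
$stmt->bind_param('s', $title);
$stmt->execute();

$articleId = $mysqli->insert_id;    // the id actually used, gaps and all
$filename  = $articleId . '-0.jpg'; // e.g. 6-0.jpg, matching the question
move_uploaded_file($_FILES['image']['tmp_name'], $uploadDir . $filename);

This removes the race entirely: the filename is based on an id that has already been assigned to your row.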
I have an array of data that generates unique data on the fly, in a manner of speaking. It's actually an array with 5 hashes.
What I want to do is a basic SELECT query with a WHERE clause that checks each hash via OR: basically a single query rather than a query for each array item.
I'm attempting to ensure that no hash that enters the db is the same as another. I know the probability of that actually happening is virtually nil, but it's a possibility nonetheless; better safe than sorry is my perspective on the matter.
Anyway, the query I'm thinking of makes no sense as written, because if a match is found the query just returns the match. What I want to do is find the hashes from the original array that are not in the db and use one of those; if all 5 are not found I'll just randomly pick one, I guess. In the end I want to form a result of 1 to 5 hashes in a new array so I can randomly pick from that result.
Is this possible, or would it just be easier to cycle over each one with a single query?
"SELECT
CASE hashes.hash
WHEN $hashes[0] THEN 0
WHEN $hashes[1] THEN 1
WHEN $hashes[2] THEN 2
WHEN $hashes[3] THEN 3
...
END
FROM hashes WHERE hashes.hash IN(".implode($hashes).")"
This should tell you exactly which of the hashes you sent to the server have been found on the server.
The result set would be the index keys (0, 1, 2, 3) of the array that generated the query.
If you send a query based on an array of 100 hashes and you get a result set of 99 hashes, that means exactly one hash was not found in the db.
You could cycle through the result set like this:
while( $row = $stmt->fetch() ) { // fetch from the executed statement, not the PDO object
    $index = $row[0]; // first column of the result set
    unset($hashes[$index]);
}
When the while loop finishes, the only hashes left in the array should be the ones that weren't found in the database.
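Putting it together with a prepared statement instead of interpolating the hashes into the SQL (a sketch; assumes a PDO connection in $pdo and the table layout used above):

// Find which of the candidate hashes are NOT yet in the table.
$placeholders = implode(',', array_fill(0, count($hashes), '?'));
$stmt = $pdo->prepare("SELECT hash FROM hashes WHERE hash IN ($placeholders)");
$stmt->execute(array_values($hashes));

// Drop every hash the database already knows about; what's left is usable.
$found  = $stmt->fetchAll(PDO::FETCH_COLUMN);
$unused = array_diff($hashes, $found);

This avoids both the quoting pitfalls and any SQL-injection risk from building the IN list by hand.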
My opinion is that it would be easier to cycle over each one with a single query. From what you say, there appears to be no major benefit in doing it all at once.
In that case I would suggest:
alter table myTable add column id_bkp int;
update myTable set id_bkp=account_id;
update myTable set account_id=56 where id_bkp=100;
update myTable set account_id=54 where id_bkp=56;
alter table myTable drop column id_bkp;
Of course that will depend on what DB system you are using.
Do you mean something like this?
$sql = "SELECT * FROM `table` WHERE `field` = ";
$where_string = "'" . implode("' OR `field` = '",$my_array) . "'";
$sql .= $where_string;
You could use:
$my_array = array_unique($my_array);
To remove duplicate values.
This is for a file sharing website. In order to make sure a "passcode", which is unique to each file, is truly unique, I'm trying this:
$genpasscode = mysql_real_escape_string(sha1($row['name'].time())); //Make passcode out of time + filename.
$i = 0;
while ($i < 1) //Create new passcode in loop until $i = 1
{
    $query = "SELECT * FROM files WHERE passcode='".$genpasscode."'";
    $res = mysql_query($query);
    if (mysql_num_rows($res) == 0) // Passcode doesn't exist yet? Stop making a new one!
    {
        $i = 1;
    }
    else // Passcode exists? Make a new one!
    {
        $genpasscode = mysql_real_escape_string(sha1($row['name'].time()));
    }
}
This really only prevents a double passcode if two users upload a file with the same name at the exact same time, but hey, better safe than sorry, right? My question is: does this work the way I intend it to? I have no way to reliably (read: easily) test it, because even one second's difference would generate a unique passcode anyway.
UPDATE:
Lee suggests I do it like this:
do {
    $query = "INSERT IGNORE INTO files
              (filename, passcode) values ('whatever', SHA1(NOW()))";
    $res = mysql_query($query);
} while( $res && (0 == mysql_affected_rows()) );
[Edit: I updated the above example to include two crucial fixes. See my answer below for details. -- @Lee]
But I'm afraid it will update someone else's row, which wouldn't be a problem if filename and passcode were the only fields in the database. But in addition to those there are also checks for mime type etc., so I was thinking of this:
//Add file
$sql = "INSERT INTO files (name) VALUES ('".$str."')";
mysql_query($sql) or die(mysql_error());
//Add passcode to last inserted file
$lastid = mysql_insert_id();
$genpasscode = mysql_real_escape_string(sha1($str.$lastid.time())); //Make passcode out of time + id + filename.
$sql = "UPDATE files SET passcode='".$genpasscode."' WHERE id=$lastid";
mysql_query($sql) or die(mysql_error());
Would that be the best solution? The last-inserted-id field is always unique so the passcode should be too. Any thoughts?
UPDATE 2: Apparently IGNORE does not replace a row if it already exists. This was a misunderstanding on my part, so that's probably the best way to go!
Strictly speaking, your test for uniqueness won't guarantee uniqueness under a concurrent load. The problem is that you check for uniqueness prior to (and separately from) the place where you insert a row to "claim" your newly generated passcode. Another process could be doing the same thing, at the same time. Here's how that goes...
Two processes generate the exact same passcode. They each begin by checking for uniqueness. Since neither process has (yet) inserted a row to the table, both processes will find no matching passcode in database, and so both processes will assume that the code is unique. Now as the processes each continue their work, eventually they will both insert a row to the files table using the generated code -- and thus you get a duplicate.
To get around this, you must perform the check, and do the insert in a single "atomic" operation. Following is an explanation of this approach:
If you want passcode to be unique, you should define the column in your database as UNIQUE. This will ensure uniqueness (even if your php code does not) by refusing to insert a row that would cause a duplicate passcode.
CREATE TABLE files (
    id int(10) unsigned NOT NULL auto_increment PRIMARY KEY,
    filename varchar(255) NOT NULL,
    passcode varchar(64) NOT NULL UNIQUE
)
Now, use mysql's SHA1() and NOW() to generate your passcode as part of the insert statement. Combine this with INSERT IGNORE ... (docs), and loop until a row is successfully inserted:
do {
    $query = "INSERT IGNORE INTO files
              (filename, passcode) values ('whatever', SHA1(NOW()))";
    $res = mysql_query($query);
} while( $res && (0 == mysql_affected_rows()) );
if( !$res ) {
    // an error occurred (e.g. lost connection, insufficient permissions on table, etc)
    // no passcode was generated. handle the error, and either abort or retry.
} else {
    // success, unique code was generated and inserted into db.
    // you can now do a select to retrieve the generated code (described below)
    // or you can proceed with the rest of your program logic.
}
Note: The above example was edited to account for the excellent observations posted by #martinstoeckli in the comments section. The following changes were made:
changed mysql_num_rows() (docs) to mysql_affected_rows() (docs) -- num_rows doesn't apply to inserts. Also removed the argument to mysql_affected_rows(), as this function operates on the connection level, not the result level (and in any case, the result of an insert is boolean, not a resource number).
added error checking in the loop condition, and added a test for error/success after loop exits. The error handling is important, as without it, database errors (like lost connections, or permissions problems), will cause the loop to spin forever. The approach shown above (using IGNORE, and mysql_affected_rows(), and testing $res separately for errors) allows us to distinguish these "real database errors" from the unique constraint violation (which is a completely valid non-error condition in this section of logic).
If you need to get the passcode after it has been generated, just select the record again:
$res = mysql_query("SELECT * FROM files WHERE id=LAST_INSERT_ID()");
$row = mysql_fetch_assoc($res);
$passcode = $row['passcode'];
Edit: changed above example to use the mysql function LAST_INSERT_ID(), rather than PHP's function. This is a more efficient way to accomplish the same thing, and the resulting code is cleaner, clearer, and less cluttered.
Personally I would have written it a different way, but I'll offer you a much easier solution: sessions.
I guess you're familiar with sessions? Sessions are server-side remembered variables that time out at some point, depending on the server configuration (the default value is 10 minutes or longer). The session is linked to a client using a session id, a randomly generated string.
If you start a session at the upload page, an id will be generated which is guaranteed to be unique as long as the session is not destroyed, which should take about 10 minutes. That means that when you combine the session id and the current time you'll never have the same passcode. A session id plus the current time (in microseconds, milliseconds or seconds) is never the same twice.
In your upload page:
session_start();
In the page where you handle the upload:
$genpasscode = mysql_real_escape_string(sha1($row['name'].time().session_id()));
// No need for the slow, whacky while loop, insert immediately
// Optionally you can destroy the session id
If you do destroy the session id, there's a very slim chance that another client could generate the same session id, so I wouldn't advise that. I'd just allow the session to expire.
Your question is:
does this work the way I intend it to?
Well, I'd say... yes, it does work, but it could be optimized.
Database
To make sure the passcode field never holds the same value twice at the database layer, add a unique key to it:
/* SQL */
ALTER TABLE `yourtable` ADD UNIQUE `passcode` (`passcode`);
(duplicate-key handling then has to be taken care of, of course)
Code
Waiting a second until a new hash is created is OK, but if you're talking heavy load, then a single second might be a tiny eternity. Therefore I'd rather add another component to the sha1 part of your code: maybe a file id from the same database record, a user id, or whatever else makes this really unique.
If you don't have a unique id at hand, you can still fall back to PHP's rand() function for a random number.
I don't think mysql_real_escape_string is needed in this context. sha1 returns a 40-character hexadecimal string anyway, even if there are some bad characters in your rows.
$genpasscode = sha1(rand().$row['name'].time());
...should suffice.
Style
The passcode generation code appears twice in your sample. Start cleaning this up by moving it into a function.
$genpasscode = gen_pc($row['name']);
...
function gen_pc($x)
{
    return sha1(rand().$x.time());
}
If I were doing it, I'd do it differently: I'd use session_id() to avoid duplicates as well as possible. That way you wouldn't need to loop, possibly talking to your database several times inside that loop.
You can add unique constraint to your table.
ALTER TABLE files ADD UNIQUE (passcode);
PS: You can use microtime or uniqid to make the passcode more unique.
Edit:
Do your best to generate a unique value in PHP, and let the unique constraint guarantee it on the database side. If your value is almost always unique but in some very rare case fails to be, just show a message like "The system is busy now. Please try again. :)"
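A sketch of that combination (the retry loop and helper name are illustrative; assumes a PDO connection in $pdo, PHP 7+ for random_bytes(), and the unique key from above):

// Generate a passcode that is unique with overwhelming probability.
function make_passcode(): string
{
    // uniqid() contributes a time component; random_bytes() adds real entropy.
    return sha1(uniqid('', true) . bin2hex(random_bytes(16)));
}

// Let the UNIQUE constraint catch the rare collision (MySQL error 1062)
// and simply retry with a fresh passcode.
for ($attempt = 0; $attempt < 3; $attempt++) {
    try {
        $stmt = $pdo->prepare("INSERT INTO files (name, passcode) VALUES (?, ?)");
        $stmt->execute([$filename, make_passcode()]);
        break; // success
    } catch (PDOException $e) {
        if ($e->errorInfo[1] != 1062) {
            throw $e; // not a duplicate-key error, rethrow
        }
        // duplicate passcode: loop and try again
    }
}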
I want to add 1 to the value of the previous row for hit_counter, but I'm afraid doing it this way may not be safe if multiple queries run in quick succession (i.e. the page is being loaded several times a second; this is for a web app I'm making, and I want to make sure any number of page loads is supported well).
Here's what I had in mind:
$result = mysql_query("SELECT * FROM rotation");
$fetch = mysql_fetch_assoc($result);
$update_hit = $fetch['hit_counter']+1;
$query = "INSERT INTO rotation (hit_counter, rotation_name) VALUES ('$update_hit', '$rotation_name')";
$result = mysql_query($query);
I thought about setting the hit_counter column to a UNIQUE KEY, but I don't know what else I'd do after that.
I would use AUTO_INCREMENT but the problem is, I need the actual hit_counter value within the rest of the script.
Any ideas, comments, advice would be greatly appreciated!
Edit: I had used both hit_count and hit_counter; that was a typo. Updated to avoid any confusion.
You can use the ON DUPLICATE KEY functionality when you make rotation_name a unique key:
INSERT INTO rotation SET hit_counter = 1, rotation_name= 'name'
ON DUPLICATE KEY UPDATE hit_counter = hit_counter + 1
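For completeness, the unique key that makes the ON DUPLICATE KEY clause fire would look something like this (a sketch; adjust to your actual schema):

ALTER TABLE rotation ADD UNIQUE (rotation_name);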
Performance-wise (and if your requirements allow it) I advise pushing updates in bulk (per 100 hits) using a caching mechanism, e.g. memcached.
You could use AUTO_INCREMENT; if you need the inserted id within the rest of the script, you can use mysql_insert_id to get it.
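A minimal sketch of that variant (assuming hit_counter is the AUTO_INCREMENT primary key, which fits only if every row represents a single hit):

// Let MySQL assign hit_counter, then read the assigned value back.
mysql_query("INSERT INTO rotation (rotation_name) VALUES ('$rotation_name')");
$hit_counter = mysql_insert_id(); // the value assigned on this connection only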