I have this code, which parses JSON and is run every 5 minutes via the php file.php command.
foreach($data->chatters->staff as $viewers)
{
$sql = "INSERT INTO viewers (user,points)
VALUES('$viewers','10')
ON DUPLICATE KEY UPDATE
points = points+5";
if ($conn->query($sql) === TRUE) {
//echo "Staff: Done";
} else {
echo "Error: " . $sql . "<br>" . $conn->error;
}
}
For some reason, every cycle it increases the ID count.
Now my database looks like this:
id user points
1 uname1 123
2 uname2 123
18 uname3 123
256 uname4 123
And the ID value just keeps increasing. What am I missing?
Thanks.
Before explaining: what you are experiencing is normal.
The id number gets generated for every INSERT query. It's an atomic operation, and the feature is designed to work in a concurrent environment (many people hitting one db with lots of requests).
In order to serve requests concurrently, MySQL isolates transactions from each other, and each transaction lives in its own "bubble". That means that every time an insert happens and an auto_increment value is generated, it can't produce the same value for different queries.
The only way for this to work is for MySQL to always increase the number and never re-use it. That's the only possible solution and the fastest one as well.
In practice, this means that every time your query hits a unique constraint, the value that was generated for auto_increment is "wasted": the insert doesn't happen and the UPDATE part gets executed instead.
You can't fix it, you can just break it. Leave it as it is, you've done it right.
The job of auto_increment is only to provide unique numbers. If they got spent somehow (by a failed insert) then that's fine and not something you need to worry about.
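ON DUPLICATE KEY UPDATE does not create a new row when it 'updates' a row, but every attempt still goes through the INSERT path first, and that is what consumes an auto-increment value.
If you don't want it to be like that, you'll have to make two queries: an UPDATE first, then an INSERT only when the UPDATE matched no row.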
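The two-query alternative could be sketched roughly as follows. This is only an illustration under assumptions: it uses PDO rather than the mysqli connection from the question, the function name awardPoints is made up, it assumes a UNIQUE key on viewers.user, and it assumes a single writer (the cron job), since two concurrent runs could both try the INSERT for the same new user.

```php
<?php
// Hypothetical helper: UPDATE first, INSERT only when no row was updated.
// Because the INSERT runs only for genuinely new users, existing users
// no longer consume an auto-increment value on every run.
function awardPoints(PDO $pdo, iterable $users): void
{
    // Prepared statements also keep user names out of the SQL string.
    $update = $pdo->prepare('UPDATE viewers SET points = points + 5 WHERE user = ?');
    $insert = $pdo->prepare('INSERT INTO viewers (user, points) VALUES (?, 10)');

    foreach ($users as $user) {
        $update->execute([$user]);
        if ($update->rowCount() === 0) {
            // No existing row was changed, so this is a new viewer.
            $insert->execute([$user]);
        }
    }
}
```

In the question's loop this would be called as awardPoints($pdo, $data->chatters->staff). The trade-off is one extra round trip for new users in exchange for ids that stay closer together; the gaps themselves are harmless either way.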
I have found two different ways to, first, get the next invoice number and, then, save the invoice in a multi-tenant database where, of course, each tenant will have his own invoices with different incremental numbers.
My first (and current) approach is this (works fine):
Add a new record to the invoices tables. No matter the invoice number yet (for example, 0, or empty)
I get the unique ID of THAT created record after insert
Now I do a "SELECT table where ID = $lastcreatedID **FOR UPDATE**"
Here I get the latest saved invoice number with "SELECT @A:=MAX(NUMBER)+1 FROM TABLE WHERE......"
Finally I update the previously saved record with that invoice number with an "UPDATE table SET NUMBER = $mynumber WHERE ID = $lastcreatedID"
This works fine, but I don't know if the "for update" is really needed or if this is the correct way to do this in a multi-tenant DB, due to performance, etc.
The second (and simpler) approach is this (and works too, but I don't know if it is a secure approach):
INSERT INTO table (NUMBER,TENANT) SELECT COALESCE(MAX(NUMBER),0)+1,$tenant FROM table WHERE....
That's it
Both methods are working, but I would like to know the differences between them regarding speed, performance, if it may create duplicates, etc.
Or... is there any better way to do this?
I'm using MySQL and PHP. The application is an invoice/sales cloud software that will be used by a lot of customers (tenants).
Thanks
Regardless of if you're using these values as database IDs or not, re-using IDs is virtually guaranteed to cause problems at some point. Even if you're not re-using IDs you're going to run into the case where two invoice creation requests run at the same time and get the same MAX()+1 result.
To get around all this you need to reimplement a simple sequence generator that locks its storage while a value is being issued. Eg:
CREATE TABLE client_invoice_serial (
-- note: also FK this back to the client record
client_id INTEGER UNSIGNED NOT NULL PRIMARY KEY,
serial INTEGER UNSIGNED NOT NULL DEFAULT 0
);
$dbh = new PDO('mysql:...');
/* this defaults to 'on', making every query an implicit transaction. it needs to
be off for this. you may or may not want to set this globally, or just turn it off
before this, and back on at the end. */
$dbh->setAttribute(PDO::ATTR_AUTOCOMMIT,0);
// simple best practice, ensures that SQL errors MUST be dealt with. is assumed to be enabled for the below try/catch.
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->beginTransaction();
try {
// the below will lock the selected row
$select = $dbh->prepare("SELECT * FROM client_invoice_serial WHERE client_id = ? FOR UPDATE;");
$select->execute([$client_id]);
if( $select->rowCount() === 0 ) {
$insert = $dbh->prepare("INSERT INTO client_invoice_serial (client_id, serial) VALUES (?, 1);");
$insert->execute([$client_id]);
$invoice_id = 1;
} else {
$invoice_id = $select->fetch(PDO::FETCH_ASSOC)['serial'] + 1;
$update = $dbh->prepare("UPDATE client_invoice_serial SET serial = serial + 1 WHERE client_id = ?");
$update->execute([$client_id]);
}
$dbh->commit();
} catch(\PDOException $e) {
// make sure that the transaction is cleaned up ASAP, then let the exception bubble up into your general error handling.
$dbh->rollback();
throw $e; // or throw a more pertinent error/exception of your choosing.
}
// both committing and rolling back will release the lock
At a very basic level this is what MySQL is doing in the background for AUTO_INCREMENT columns.
Do not use MAX(id)+1. It will, someday, bite you. There will be two invoices with the same number, and it will take us a few paragraphs to explain why it happened.
Instead, use AUTO_INCREMENT the way it is intended.
INSERT INTO Invoices (id, ...) VALUES (NULL, ...);
SELECT LAST_INSERT_ID(); -- specific to the connection
That is safe even when multiple connections are doing the same thing. No FOR UPDATE, no BEGIN, etc is necessary. (You may want such for other purposes.)
And, never delete rows. Instead, use the standard business practice of invalidating bad invoices. Imagine being audited.
All that said, there is still a potential problem. After a ROLLBACK or system crash, an id may be "burned". Also things like INSERT IGNORE allocate the id before checking to see whether it will be needed.
If you can live with the caveats, use AUTO_INCREMENT.
If not, then create a 1-row, 2-column table to simulate a sequence number generator: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#sequence
Or use MariaDB's SEQUENCE
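A per-tenant variant of such a sequence table might look like the sketch below. It is a rough illustration, not the linked cookbook's code: it assumes the client_invoice_serial table from the earlier answer, a PDO connection in exception mode, and the made-up function name nextInvoiceSerial. The UPDATE takes a row lock that is held until commit, so a concurrent request for the same client waits instead of reading the same number.

```php
<?php
// Hypothetical sequence generator: one counter row per client.
function nextInvoiceSerial(PDO $pdo, int $clientId): int
{
    $pdo->beginTransaction();
    try {
        // Incrementing first locks the client's row until commit/rollback.
        $update = $pdo->prepare(
            'UPDATE client_invoice_serial SET serial = serial + 1 WHERE client_id = ?'
        );
        $update->execute([$clientId]);

        if ($update->rowCount() === 0) {
            // First invoice for this client: create the counter row.
            // Two simultaneous "first" requests would collide on the
            // primary key here; one of them fails and can be retried.
            $insert = $pdo->prepare(
                'INSERT INTO client_invoice_serial (client_id, serial) VALUES (?, 1)'
            );
            $insert->execute([$clientId]);
        }

        // A transaction sees its own writes, so this reads the new value.
        $select = $pdo->prepare(
            'SELECT serial FROM client_invoice_serial WHERE client_id = ?'
        );
        $select->execute([$clientId]);
        $serial = (int) $select->fetchColumn();

        $pdo->commit();
        return $serial;
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}
```

Because this sketch commits on its own, a number is still lost if saving the invoice fails afterwards. If gaps are not acceptable, the invoice INSERT would have to run inside the same transaction as the counter update.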
Both the approaches do work, but each with its own demerits in high traffic situations.
The first approach runs 3 queries for every invoice you create, putting extra load on your server.
The second approach can lead to duplicates when two invoices are generated with very little time between them (such that the SELECT query returns the same max number for both invoices).
Both the approaches may lead to problems in high traffic conditions.
Two solutions to the problems are listed below:
Use generated columns: MySQL supports generated columns, whose values are derived from other columns in the same row.
Calculate invoice number on the fly: Since you're using the primary key as part of the invoice, let the DB handle generating unique primary keys, and then generate invoice numbers on the fly in your business logic using the id for each invoice.
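As a small illustration of the second suggestion, the displayed invoice number can be derived from values the database already guarantees to be unique. The function name formatInvoiceNumber and the "INV-tenant-id" layout are invented for this sketch; any format built from the primary key would work the same way.

```php
<?php
// Hypothetical formatter: builds a display number from the tenant id and
// the invoice's auto-increment primary key. Nothing extra is stored, so
// there is nothing that can get out of sync or be duplicated.
function formatInvoiceNumber(int $tenantId, int $invoiceId): string
{
    return sprintf('INV-%04d-%08d', $tenantId, $invoiceId);
}

echo formatInvoiceNumber(12, 345), PHP_EOL; // INV-0012-00000345
```

Note that numbers built this way are unique but not consecutive per tenant, because all tenants share the same auto-increment counter.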
Hi, I have a bunch of unique codes in a database which should only be used once.
Two users hit a script which assigns them at the same time and got the same codes!
The script is in Magento and the user can order multiple codes. The issue is if one customer orders 1000 codes the script grabs the top 1000 codes from the DB into an array and then runs through them setting them to "Used" and assigning them to an order. If a second user hits the same script at a similar time the script then grabs the top 1000 codes in the DB at that point in time which crosses over as the first script hasn't had a chance to finish assigning them.
This is unfortunate but has happened quite a few times!
My idea was to create a new table: once the user hits the script, a row is inserted with "order_id" and "code_type". Then, in the same script, a check is done: if a row already exists in this new table and its "code_type" matches the one the user is ordering, the script waits 60 seconds and checks again, until the previous codes have been issued and the table is empty; it then creates its own row and carries on.
I am not sure if this is the best way or if two users hit at the same second again whether two rows will just be inserted and off we go with the same problem!
Any advice is much appreciated!
The correct answer depends on the database you use.
For example in MySQL with InnoDB the possible solution is a transaction with SELECT ... LOCK IN SHARE MODE.
Schematically, it works by firing the following queries:
START TRANSACTION;
SELECT * FROM codes WHERE used = 0 LIMIT 1000 LOCK IN SHARE MODE;
-- save ids
UPDATE codes SET used=1 WHERE id IN ( ...ids....);
COMMIT;
More information at http://dev.mysql.com/doc/refman/5.7/en/innodb-locking-reads.html
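In PHP, the transaction above might be sketched as follows. Note one deliberate change: this sketch uses FOR UPDATE instead of LOCK IN SHARE MODE, because two transactions that both hold shared locks on the same rows and then both try to UPDATE them end up in a deadlock, whereas FOR UPDATE makes the second transaction wait. The order_id column and the function names are assumptions for illustration; codes, id and used come from the queries above.

```php
<?php
// Builds "?,?,?" for an IN (...) list with $n values.
function placeholders(int $n): string
{
    return implode(',', array_fill(0, $n, '?'));
}

// Hypothetical helper: reserves $count unused codes for one order.
function claimCodes(PDO $pdo, int $orderId, int $count): array
{
    if ($count < 1) {
        throw new InvalidArgumentException('count must be at least 1');
    }

    $pdo->beginTransaction();
    try {
        // %d forces an integer, so the LIMIT cannot carry injected SQL.
        $select = $pdo->query(sprintf(
            'SELECT id FROM codes WHERE used = 0 ORDER BY id LIMIT %d FOR UPDATE',
            $count
        ));
        $ids = $select->fetchAll(PDO::FETCH_COLUMN);

        if (count($ids) < $count) {
            throw new RuntimeException('Not enough unused codes');
        }

        // The selected rows stay locked until commit, so a second
        // request cannot pick the same codes in the meantime.
        $update = $pdo->prepare(
            'UPDATE codes SET used = 1, order_id = ? WHERE id IN ('
            . placeholders(count($ids)) . ')'
        );
        $update->execute(array_merge([$orderId], $ids));

        $pdo->commit();
        return $ids;
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }
}
```

A second request that arrives while the first transaction is open blocks on the SELECT and, once the first commits, sees those codes as used and moves on to the next unused ones.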
I have a small PHP function on my website which basically does 3 things:
check if user is logged in
if yes, check if he has the right to do this action (DB Select)
if yes, do the related action (DB Insert/Update)
If I have several users connected at the same time on my website that try to access this specific function, is there any possibility of concurrency problem, like we can have in Java for example? I've seen some examples about semaphore or native PHP synchronization, but is it relevant for this case?
My PHP code is below:
if ( user is logged ) {
sql execution : "SELECT....."
if(sql select give no results){
sql execution : "INSERT....."
}else if(sql select give 1 result){
if(selected column from result is >= 1){
sql execution : "UPDATE....."
}
}else{
nothing here....
}
}else{
nothing important here...
}
Each user who accesses your website is running a dedicated PHP process. So, you do not need semaphores or anything like that. Taking care of the simultaneous access issues is your database's problem.
Not in PHP. But you might have users inserting or updating the same content.
You have to make sure this does not happen.
If they only update their own user profile, which only that user can access, no collision will occur.
BUT if they are editing shared content, like in a content management system, they can overwrite each other's edits. Then you have to implement some locking mechanism.
For example (there are a lot of ways), you could write an update on the content that records the current time and the user.
The user then holds a lock on the content for, say, 10 minutes. You should show the user that 10-minute countdown in the frontend, along with a cancel button to unlock the content. You probably get the idea.
If another person tries to load the content within those 10 minutes, they get an error: "user xy is already... lock expires at xx:xx"
Hope this helps.
In general, it is not safe to decide whether to INSERT or UPDATE based on a SELECT result, because a concurrent PHP process can INSERT the row after you executed your SELECT and saw no row in the table.
There are two solutions. Solution number one is to use REPLACE or INSERT ... ON DUPLICATE KEY UPDATE. These two query types are "atomic" from the perspective of your script, and solve most cases. REPLACE tries to insert the row, but if it hits a duplicate key it deletes the conflicting existing row and inserts the values you provide. INSERT ... ON DUPLICATE KEY UPDATE is a little more sophisticated (it updates the existing row in place), but is used in similar situations. See the documentation here:
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
http://dev.mysql.com/doc/refman/5.0/en/replace.html
For example, if you have a table product_description, and want to insert a product with ID = 5 and a certain description, but if a product with ID 5 already exists, you want to update the description, then you can just execute the following query (assuming there's a UNIQUE or PRIMARY key on ID):
REPLACE INTO product_description (ID, description) VALUES(5, 'some description')
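It will insert a new row with ID 5 if it does not exist yet, or replace the existing row with ID 5 if it already exists (the old row is deleted and a new one inserted, so any columns you do not list revert to their defaults), which is probably exactly what you want.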
If it is not, then approach number two is to use locking, like so:
query('LOCK TABLES users WRITE');
if (num_rows('SELECT * FROM users WHERE ...')) {
    query('UPDATE users ...');
}
else {
    query('INSERT INTO users ...');
}
query('UNLOCK TABLES');
I'm timing various part of the site's "initialisation" code (including such things as verifying the user is logged in, connecting to the database, importing functions...)
This query is currently taking up about half the total initialisation time all by itself:
$sql = "update `users` set `lastclick`=now(),".(substr($_SERVER['PHP_SELF'],0,6) == "/ajax/" ? "" : " `lastactive`=now(),")." `lastip`='".addslashes($_SERVER['REMOTE_ADDR'])."' where `id`=".$userdata['id'];
Generating the query takes no time at all, it's the running that's the problem. Example result query:
update `users` set `lastclick`=now(), `lastactive`=now(), `lastip`='192.168.0.1' where `id`=1
Simple enough query, right? I am the only user on the server right now, there is literally nothing else running. So why does a simple update take up more time than connecting to the database, SELECTing the user data in the first place, validating the cookies, and defining a bunch of functions all combined?
(I just tried replacing now() with a literal value, but that made no difference - in fact it ended up taking 13ms the first time instead of 4...)
EDIT: As requested:
explain select * from `users` where `id`=1
1 row returned
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE users const PRIMARY PRIMARY 4 const 1
Solved my own mystery. Turns out one of the fields being updated (lastactive) was in an index, and the slowness was coming from rebuilding that index.
Since the only time that index might be used is in updating the list of users who are online, and that only happens by cron every set interval, I've dropped the index and now the query runs a heck of a lot faster.
Thanks to those who tried to help - you did help me find the problem, indirectly!
How do I go about looking into a table to see if a row exists? The background is that the table is called enemies. Every row has a unique id, which is set to auto_increment. Each row also has a unique value called monsterid. The monsterid isn't auto_increment.
When a monster dies, the row is deleted and replaced by a new row, so the id is always changing. The monsterid changes as well.
In PHP I am using the $_GET method, and the monsterid is passed through it.
Basically I am trying to do this:
$monsterID = 334322 //this is the id passed through the $_GET
checkMonsterId = "check to see if the monster id exist within the enemies table"
if monsterid exist then
{RUN PHP}
else
{RUN PHP}
If you need any more clarification, please ask. Thanks for the help in advance.
Use count! If it returns > 0, it exists, else, it doesn't.
select count(*) from enemies where monsterid = 334322
You would use it in PHP thusly (after connecting to the database):
$monsterID = (int) $monsterID; // cast to integer: the value is used unquoted in the query
$res = mysql_query('select count(*) from enemies where monsterid = ' . $monsterID) or die();
$row = mysql_fetch_row($res);
if ($row[0] > 0)
{
//Monster exists
}
else
{
//It doesn't
}
Use count, like
select count(*) from enemies where monsterid = 334322
However be sure to make certain you've added an index on monsterid to the table. Reason being that if you don't, and this isn't the primary key, then the rdbms will be forced to issue a full table scan - read every row - to give you the value back. On small datasets this doesn't matter as the table will probably sit in core anyway, but once the number of rows becomes significant and you're hitting the disk to do the scan the speed difference can easily be two orders of magnitude or more.
If the number of rows is very small then not indexing is rational, as using a non-primary key index requires additional overhead when inserting data. However, this should be a deliberate decision (I regularly impress clients who've used a programmer who doesn't understand databases by adding indexes to tables which were fine when the coder created them but subsequently slowed to a crawl when loaded with real volumes of data. It's quite amazing how one line of SQL to add an index will buy you guru status in your client's eyes because you made their system usable again).
If you're doing more complex queries against the database using subselect, something like finding all locations where there is no monster, then look up the use of the sql EXISTS clause. This is often overlooked by programmers (the temptation is to return a count of actual values) and using it is generally faster than the alternatives.
Simpler :
select 1 from enemies where monsterid = 334322
If it returns a row, you have a row, if not, you don't.
The mysql_real_escape_string is important to prevent SQL injection.
$monsterid = mysql_real_escape_string($_GET['monsterid']);
$res = mysql_query("SELECT count(*) FROM enemies WHERE monsterid = '$monsterid'");
if (mysql_result($res, 0) > 0) {
// monster exists
} else {
// monster doesn't exist
}
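The mysql_* functions used in these answers were removed in PHP 7.0, so on a current PHP version the same check could be sketched with PDO and a prepared statement. The function name monsterExists is made up for this example, and a PDO connection is assumed to exist already; the table and column names are the ones from the question.

```php
<?php
// Hypothetical helper: returns true if a row with this monsterid exists.
function monsterExists(PDO $pdo, int $monsterId): bool
{
    // SELECT 1 ... LIMIT 1 lets the database stop at the first match;
    // the bound parameter takes care of SQL injection.
    $stmt = $pdo->prepare('SELECT 1 FROM enemies WHERE monsterid = ? LIMIT 1');
    $stmt->execute([$monsterId]);

    // fetchColumn() returns false when there is no row.
    return $stmt->fetchColumn() !== false;
}
```

It would be used as if (monsterExists($pdo, (int) $_GET['monsterid'])) { ... } else { ... }. As noted above, an index on monsterid keeps this lookup fast as the table grows.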