I'm creating an API that interacts with a MySQL inventory database. We have 15 users that can reserve products, updating the database in the following way:
Decreasing the on-hand value and increasing the reserved value of a product.
The inventory table looks like this:
id int
sku varchar
on-hand int
reserved int
The problem is: How to handle the update of the row if 2 users try to update it at the same time?
The first approach I was thinking about was using transactions:
<?php
function reserveStock()
{
    $db->beginTransaction();
    // SELECT on-hand, reserved from inventory
    // Update inventory values
    $db->commit();
    return response()->json(['success' => 1, 'data' => $data]);
}
The second one was using pessimistic locking:
<?php
function reserveStock()
{
    // SELECT on-hand, reserved from inventory with ->sharedLock()
    // Update inventory values
    return response()->json(['success' => 1, 'data' => $data]);
}
The third one was to create an updating field with a default value of zero. When selecting the products to update, I'd check the updating field before doing anything with those rows. The problem I see here is that I'd have to keep looping over the ones with updating != 0 until they become available, so more SELECTs and UPDATEs come from this approach.
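Roughly, the third idea would look something like this (the updating column doesn't exist yet, and here the check and the claim are folded into one conditional UPDATE, so this is only a sketch of the idea, not working code):

<?php
// Sketch only: claim the row only if nobody else is updating it.
$claim = $db->prepare('UPDATE inventory SET updating = 1 WHERE id = ? AND updating = 0');
$claim->execute([$productId]);

if ($claim->rowCount() === 1) {
    // we "own" the row: adjust stock, then release the flag
    $db->prepare('UPDATE inventory SET `on-hand` = `on-hand` - 1, reserved = reserved + 1, updating = 0 WHERE id = ?')
       ->execute([$productId]);
} else {
    // someone else is updating this row; loop/retry as described above
}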
Which course of action is the best? There may be more options than the ones I've written here.
Do not use transactions or pessimistic locking.
Do not use UPDATE either.
For a race condition situation, you'd better review your database design and get rid of the UPDATE; use INSERT instead. Make a pivot table to accommodate the connection between users and products, and treat only the first connecting record (the one with the smaller id) as the actual one.
If more explanation is needed, here is an example:
Say 2 users are racing to get the product. They both create records in the pivot table, almost simultaneously, but someone has to be first, right? They both commit a very small transaction to persist their data. And then they both read the pivot table to verify who succeeded. The first record will be the same for both of them, and there is no blocking in use (at least not explicit). So one customer will get his record and be happy; the other one will have to try again for another product.
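To put it in code, a rough sketch (the product_claims pivot table and the PDO handle $db are just example names, not an exact implementation):

<?php
// Sketch: both users INSERT a claim row, then check who was first.
// Assumed pivot table: product_claims(id AUTO_INCREMENT PRIMARY KEY, product_id, user_id).
$db->prepare('INSERT INTO product_claims (product_id, user_id) VALUES (?, ?)')
   ->execute([$productId, $userId]);
$myClaimId = (int) $db->lastInsertId();

// the smallest claim id for this product wins; no explicit locking needed
$first = $db->prepare('SELECT MIN(id) FROM product_claims WHERE product_id = ?');
$first->execute([$productId]);

if ((int) $first->fetchColumn() === $myClaimId) {
    // this user got the product
} else {
    // somebody was first; offer another product instead
}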
And problem solved.
I know I am late, but I can help a person like me who came here with a similar issue.
You can solve this by using jobs/queues. That way no 'transaction' related to a user will run at the same time as another. Consider checking this article about concurrency attacks and I am sure you will appreciate it.
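For instance, in Laravel something along these lines (the class name and queue name are only illustrative, not from the question):

<?php
// Illustrative sketch: push each reservation onto a queue; with a single worker
// on that queue, reservations are processed one at a time.
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;

class ReserveStockJob implements ShouldQueue
{
    use Queueable;

    public function __construct(public int $productId, public int $userId) {}

    public function handle(): void
    {
        // decrease on-hand, increase reserved for $this->productId here
    }
}

// in the controller:
dispatch(new ReserveStockJob($productId, $userId))->onQueue('reservations');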
Well, I'm afraid that I will not be able to post a minimal reproducible example, and for that I apologize. But here goes nothing.
Ours is a weekly prepared meals service. I track order volume in many ways. Here is the structure of the relevant table:
So then I utilize the highlighted fields in many ways, such as indicating to delivery drivers that a customer is returning after more than a month since their prior order (last_order_w - prev_order_w > 4), for instance.
Lately I have been noticing that the data is not consistently updating properly. Over the past 3 weeks I would put the occurrence at about 5%. If it were more consistent, I would be more confident in my ability to track down the issue, but I am not even sure how to provoke it, as I only really notice it after the fact.
The code that should cause the update is below:
<?php
//retrieve and iterate over IDs of orders placed since last synchronization.
$newOrders=array_map('reset',$dbh->query("select id from wp_posts where id > (select max(synced) from fitaf_weeks) and post_type='shop_order' and post_status='wc-processing'")->fetchAll(PDO::FETCH_NUM));
foreach($newOrders as $no){
//retrieve the metadata for the current order
$newMetas=array_map('reset',$dbh->query("select meta_key,meta_value from wp_postmeta where post_id=$no")->fetchAll(PDO::FETCH_GROUP|PDO::FETCH_UNIQUE));
//check if the current order is associated with an existing customer
$exist=$dbh->query("select * from fitaf_customers where id=".$newMetas['_customer_user'])->fetch();
//if not, gather the information we want to store from this post
$noExist=[$newMetas['_customer_user'],$newMetas['_shipping_first_name'],$newMetas['_shipping_last_name'],$newMetas['_shipping_address_1'],(strlen($newMetas['_shipping_address_2'])==0?NULL:$newMetas['_shipping_address_2']),$newMetas['_shipping_city'],$newMetas['_shipping_state'],$newMetas['_shipping_postcode'],$phone,$newMetas['_billing_email'],1,1,$no,$newMetas['_paid_date'],$week[3],$newMetas['_order_total']];
if($exist){
//if we found a record in the customer table, retrieve the data we want to modify
$oldO=$dbh->query("select last_order_id,last_order,last_order_w,lo,num_orders from fitaf_customers where id=".$newMetas['_customer_user'])->fetch(PDO::FETCH_GROUP|PDO::FETCH_ASSOC|PDO::FETCH_UNIQUE);
//make changes to the retrieved data, and make sure we are storing the most recently used delivery address and prepare the data points for the update command
$exists=[$phone,$newMetas['_shipping_first_name'],$newMetas['_shipping_last_name'],$newMetas['_shipping_postcode'],$newMetas['_shipping_address_1'],(strlen($newMetas['_shipping_address_2'])==0?NULL:$newMetas['_shipping_address_2']),$newMetas['_shipping_city'],$newMetas['_shipping_state'],$newMetas['_paid_date'],$no,$week[3],$oldO['last_order'],$oldO['last_order_id'],$oldO['last_order_w'],($oldO['num_orders']+1),($oldO['lo']+$newMetas['_order_total']),$newMetas['_customer_user']];
}
if(!$exist){
//if the customer did not exist, perform an insert
$dbh->prepare("insert into fitaf_customers(id,fname,lname,addr1,addr2,city,state,zip,phone,email,num_orders,num_weeks,last_order_id,last_order,last_order_w,lo) values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)")->execute($noExist);
}
else{
//if the customer did exist, update their data
$dbh->prepare("update fitaf_customers set phone=?,fname=?,lname=?,zip=?,addr1=?,addr2=?,city=?,`state`=?,last_order=?,last_order_id=?,last_order_w=?,prev_order=?,prev_order_id=?,prev_order_w=?,num_orders=?,lo=? where id=?")->execute($exists);
}
}
//finally retrieve the most recent post ID and update the field we check against when the synchronization script runs
$lastPlaced=$dbh->query('select max(id) from wp_posts where post_type="shop_order"')->fetch()[0];
$updateSync=$dbh->query("update fitaf_weeks set synced=$lastPlaced order by id desc limit 1");
?>
Unfortunately I don't have any relevant error logs to show. However, as I documented the code for this post, I realized a potential shortcoming: I should be utilizing the data retrieved from the initial query of new posts, rather than selecting the highest post id after performing this logic. But I have timers running on my scripts, and this section hasn't taken over 3 seconds to run in a long time, so it seems unlikely that the script, which runs on a cron every 5 minutes, is experiencing this unintended overlap.
While I have made the change to pop the highest ID off of $newOrders, and hope it solves the issue, I am still curious to see if anyone has any insights on what could cause this logic to fail at such a low occurrence.
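For reference, the change I made looks roughly like this (using the IDs already fetched instead of re-querying wp_posts):

<?php
//use the highest ID from the orders we just processed as the new sync marker
if(!empty($newOrders)){
    $lastPlaced=max($newOrders);
    $updateSync=$dbh->query("update fitaf_weeks set synced=$lastPlaced order by id desc limit 1");
}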
It seems likely your problem comes from race conditions between multiple operations accessing your db.
First of all, your last few lines of code do SELECT MAX(ID) and then use that value to update something. You Can't Do That™. If somebody else adds a row to that wp_posts table any time after the entry you think is relevant, you'll use the wrong ID. I don't understand your app well enough to recommend a fix, but I do know this is a serious and notorious problem.
You have another possible race condition as well. Your logic is this:
SELECT something.
make a decision based on what you SELECTED.
INSERT or UPDATE based on that decision.
If some other operation, done by some other user of the db, intervenes between step 1 and step 3, your decision might be wrong.
You fix this with a db transaction. The ->beginTransaction() operation, well, begins the transaction. The ->commit() operation concludes it. And, the SELECT operation you use for step one should say SELECT ... FOR UPDATE.
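In PDO terms the pattern looks roughly like this (the variables and the choice of columns are placeholders, not your exact code):

<?php
// Sketch of the transaction + SELECT ... FOR UPDATE pattern.
$dbh->beginTransaction();

// FOR UPDATE locks the matching row until COMMIT, so a concurrent run of the
// script cannot slip an INSERT/UPDATE in between the SELECT and your write.
$check = $dbh->prepare('SELECT id FROM fitaf_customers WHERE id = ? FOR UPDATE');
$check->execute([$customerId]);

if ($check->fetch()) {
    $dbh->prepare('UPDATE fitaf_customers SET last_order_id = ? WHERE id = ?')
        ->execute([$orderId, $customerId]);
} else {
    $dbh->prepare('INSERT INTO fitaf_customers (id, last_order_id) VALUES (?, ?)')
        ->execute([$customerId, $orderId]);
}

$dbh->commit();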
I am trying to do the following.
I am querying an external database using a web service. What the web service does is bring me all the products from an ERP system my client uses. As the server and the connection are not really fast, what I decided to do is basically synchronize the database onto my web server and handle most operations there, so that the website can run smoothly.
Everything works fine; I just need one last step to guarantee that the inventory on the website matches the one available in the ERP. The only issue comes when they (the client) delete something in the ERP system.
At the moment I am thinking about what the ideal strategy would be (least resource- and time-consuming) to remove products from my Products table if I don't receive them in the web service result.
So I basically have the following process:
I query the web service for all the products, apply a little formatting, and store them in an array. The final size is about 600 entries.
Then I run a foreach loop over that array with the following subprocess:
I query my database to check if product_id is present.
If the product is present, I just update it with the latest info, stock data.
If the product is not present, I just insert it.
So, I was thinking of doing the following, but I do not think it's the ideal way:
Do a SELECT * FROM Products and generate an array that has all the products.
Do a foreach loop over the resulting array, and in each iteration scan the ERP array to check whether the specific product exists. If not, I delete it; if yes, I continue with the next product.
Now, considering that after all the previous steps this would involve a couple of nested foreach loops, I am a little worried that it might consume too much memory and also take longer to process.
I was thinking that maybe something like array_diff or array_map could solve the issue, but I am not really experienced with these functions, and the structure of the two arrays differs a lot, so I am not sure it would work that easily.
What would you guys recommend?
It's actually quite simple:
SELECT id FROM Products
Then you have an array of your product Ids, for example:
[123,5679,345]
Then as you go and do your updates or inserts, remove the id from the array.
[For updates] "I query my database to check if product_id is present."
This is redundant now.
There are a few ways to remove the value from the array (when you do an update), this is the way I would probably do it.
if(false !== ($index = array_search($data['product_id'], $myids))){
    //note the !== type comparison: array_search can return 0 for the first index, so we must check for boolean false
    //$index is the position of the product id in our list of IDs from the local DB
    unset($myids[$index]);
    //our incoming product_id is in the local list, so we Do Update
}else{
    //otherwise we Do Insert
}
As I mentioned above, when doing your updates/inserts you no longer have to check whether the ID exists, because you already know this from the array of IDs pulled from the database. This alone saves you n queries (approximately 600).
Then it's very simple if you have IDs left over.
//I wouldn't normally concatenate variables into SQL, but in this case it's a list of integer IDs from our own database.
//You can of course build a prepared statement with placeholders instead; for the sake of simplicity I'll leave that as an exercise for another day.
'DELETE FROM Products WHERE id IN('.implode(',', $myids).')'
And because you unset these when updating, the only thing left is products that no longer exist.
Conclusion:
You have no choice (other than doing an ON DUPLICATE KEY query, or ignoring exceptions) but to pull out the product IDs. You're already doing this on a row-by-row basis, so we can effectively kill two birds with one stone.
If you need more data than just the ID, for example to check whether the product was changed before doing an update, then pull that data out too, but I would recommend using PDO and the FETCH_GROUP option. I won't go into the specifics of that, except to say it lets you easily build your array this way:
[{product_id} => [ {product_name}, {product_price} etc..]];
Basically the product_id is the key, with a nested array of the row data; this will make lookups easier.
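Something along these lines (the column names are only an example; here FETCH_UNIQUE is paired with FETCH_ASSOC, which yields exactly the shape above):

<?php
// Example: key the result set by the first selected column (product id) so lookups are O(1).
$stmt = $pdo->query('SELECT id, name, price FROM Products');
$myids = $stmt->fetchAll(PDO::FETCH_UNIQUE | PDO::FETCH_ASSOC);
// $myids now looks like [123 => ['name' => ..., 'price' => ...], 5679 => [...], ...]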
This way you can look it up like this.
//then instead of array_search
//if(false !== ($index = array_search($data['product_id'],$myids))){
if(isset($myids[$data['product_id']])){
unset($myids[$data['product_id']]);
//do your checks, then your update
}else{
//do inserts
}
References:
http://php.net/manual/en/function.array-search.php
array_search — Searches the array for a given value and returns the first corresponding key if successful
WARNING This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE. Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.
UPDATE
There is one other really good way to do this, and that is to add a field called sync_date. When you do your insert or update, set sync_date to the current date.
This way, when you are done, the products with a sync date older than this run can be deleted. In this case it's best to cache the time at the start of the script so you use the exact same value throughout.
$time = date('Y-m-d H:i:s'); //or time() if you prefer a timestamp
//use this same variable for the whole course of the script.
Then you can do
"DELETE FROM Products WHERE sync_date != '$time'"
This may actually be a bit better because it has more utility: when did the sync last run? Now you know.
I have a small PHP function on my website which basically does 3 things:
check if user is logged in
if yes, check if he has the right to do this action (DB Select)
if yes, do the related action (DB Insert/Update)
If I have several users connected at the same time on my website that try to access this specific function, is there any possibility of concurrency problem, like we can have in Java for example? I've seen some examples about semaphore or native PHP synchronization, but is it relevant for this case?
My PHP code is below:
if ( user is logged in ) {
    sql execution : "SELECT....."
    if (sql select gives no results) {
        sql execution : "INSERT....."
    } else if (sql select gives 1 result) {
        if (selected column from result is >= 1) {
            sql execution : "UPDATE....."
        }
    } else {
        nothing here....
    }
} else {
    nothing important here...
}
Each request to your website is handled by its own PHP process (or thread), so you do not need semaphores or anything like that. Taking care of the simultaneous access issues is your database's problem.
Not in PHP itself. But you might have users inserting or updating the same content.
You have to make sure this does not happen.
So if they only update their own user profile, which only that user can access, no collision will occur.
BUT if they are editing shared content, as in a content management system, they can overwrite each other's edits. Then you have to implement some locking mechanism.
For example (there are a lot of ways to do this): when a user opens content for editing, you write an update on the content row storing the current time and user.
That user then holds a lock on the content for, say, 10 minutes. You should show the 10-minute countdown in the frontend, plus a cancel button to unlock the content... you probably get the idea.
If another person tries to load the content within those 10 minutes, they get an error: "user xy is already editing, lock expires at xx:xx".
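A very rough sketch of such a lock (the table/column names and variables are invented for the example):

<?php
// Rough sketch: locked_by / locked_at columns on the content table.
$cutoff = date('Y-m-d H:i:s', time() - 10 * 60); // locks older than 10 minutes count as expired

// take the lock only if nobody else holds an unexpired one
$stmt = $pdo->prepare(
    'UPDATE content SET locked_by = ?, locked_at = NOW()
     WHERE id = ? AND (locked_by IS NULL OR locked_by = ? OR locked_at < ?)'
);
$stmt->execute([$userId, $contentId, $userId, $cutoff]);

if ($stmt->rowCount() === 0) {
    // somebody else is editing: show "user xy is already editing, lock expires at xx:xx"
}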
Hope this helps.
In general, it is not safe to decide whether to INSERT or UPDATE based on a SELECT result, because a concurrent PHP process can INSERT the row after you executed your SELECT and saw no row in the table.
There are two solutions. Solution number one is to use REPLACE or INSERT ... ON DUPLICATE KEY UPDATE. These two query types are "atomic" from the perspective of your script and solve most cases. REPLACE tries to insert the row, but if it hits a duplicate key it replaces the conflicting existing row with the values you provide; INSERT ... ON DUPLICATE KEY UPDATE is a little more sophisticated but is used in similar situations. See the documentation here:
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
http://dev.mysql.com/doc/refman/5.0/en/replace.html
For example, if you have a table product_descriptions and want to insert a product with ID = 5 and a certain description, but if a product with ID 5 already exists you want to update the description, then you can just execute the following query (assuming there's a UNIQUE or PRIMARY key on ID):
REPLACE INTO product_descriptions (ID, description) VALUES(5, 'some description')
It will insert a new row with ID 5 if it does not exist yet, or will update the existing row with ID 5 if it already exists, which is probably exactly what you want.
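Written with INSERT ... ON DUPLICATE KEY UPDATE instead, the same example would be:

INSERT INTO product_descriptions (ID, description) VALUES(5, 'some description')
ON DUPLICATE KEY UPDATE description = VALUES(description)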
If it is not, then approach number two is to use locking, like so:
query('LOCK TABLE users WRITE');
if (num_rows('SELECT * FROM users WHERE ...')) {
    query('UPDATE users ...');
}
else {
    query('INSERT INTO users ...');
}
query('UNLOCK TABLES');
I'm building an eCommerce site with Codeigniter which will allow users to register, buy products and then track the orders.
I'm using the following in several places around the site, mainly when a user is submitting an order:
$this->db->insert_id();
Basically when a user submits an order, it will add the order to one table, and then, within the same segment of code (immediately after the insert query), add each order item to another table using the ID created when the order is inserted into the first table.
My question is: Out of the following, what does $this->db->insert_id(); do:
1) Does it get the ID that has just been inserted in (and only from) insert query just run?
2) Does it get the last inserted ID from the latest entry in the database regardless of what query its come from?
Basically I'm trying to avoid orders being mixed up, say for example if several customers were submitting orders at the same time, I don't want one customer's order items to be added to the incorrect order.
I think the answer is 1, and that there's no problem, but I wanted to be sure.
Thanks!
It gets the ID that was last inserted by the last query, so what you said in #1.
Just a suggestion: another way to do this is to generate a random string and use that to associate the cart items and the order together, instead of the order id. You would still use the order id as the "order number".
This gives you the option of generating that random string when the shopping session first begins and using it to tie the cart items, shipping, billing, etc. together as the purchase proceeds. That way you are starting the order immediately, but you haven't had to commit a space in the final order table until the transaction verifies.
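e.g. one way to generate such a string (just an example, PHP 7+):

<?php
// example: a random token to tie cart items, shipping, billing etc. together
$cartToken = bin2hex(random_bytes(16)); // 32 hex characters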
Your question exposes a potential bug in the codeigniter environment. If two inserts are done in rapid succession, how do you have confidence that the ID returned from insert_id is the proper ID?
Codeigniter documentation does not answer this question
http://ellislab.com/codeigniter/user-guide/database/helpers.html
A relevant blog entry from EllisLab does not resolve the question. It concludes that the appropriate resolution is to take your chances.
http://ellislab.com/forums/viewthread/63052/
If this function is a wrapper for mysqli_insert_id, the documentation at php.net does not clarify the matter either.
http://www.php.net/manual/en/mysqli.insert-id.php
It states the ID is from "the last query". It does not say whose last query.
Two successive inserts and the return of a wrong ID would compromise the integrity of your data. The way to be sure is to lock the table.
$this->db->query('LOCK TABLE (your table name) WRITE');
$this->db->insert('(your table name)');
$int_id = $this->db->insert_id();
$this->db->query('UNLOCK TABLES');
This has a negative impact on execution time, but depending on your server's capacity is likely preferable to data corruption.
I am having a wee problem, and I am sure there is a more convenient/simpler way to achieve the solution, but all my searches are throwing up blanks at the moment!
I have a MySQL db that is regularly updated by a PHP page [via a cron job] that adds or deletes entries as appropriate.
My issue is that I also need to check whether any details [i.e. the phone number or similar] for an entry have changed, but doing this on every call is not possible [not only does it seem like overkill to me, but I am restricted by a 3rd-party API call limit]. Plus this is not critical info.
So I was thinking it might be best to just check one entry per page call, and iterate through the rows/entries with each successive page call.
What would be the best way of doing this, i.e. keeping track of which entry/row in the table should be checked next?
I have 2 ideas of how to implement this:
1) The id of the current row could be saved to a file on the server [surely not the best way].
2) An extra boolean field [check] is added to the table, set to TRUE on the first entry and FALSE on all others.
Then on each page call it:
finds 'where check = TRUE',
runs the update check on this row,
sets check = FALSE,
sets [the next row] check = TRUE.
Is this the best way to do this, or does anyone have any better suggestion?
thanks in advance !
.k
PS sorry about the title
Not sure if this is a good solution, but if I have to make massive nightly updates, I'll write the updates to a new blank table, then do a SQL SELECT to join the tables and tell me where they are different, then do another SQL UPDATE like:
UPDATE table, temptable
SET table.col1=temptable.col1, table.col2=temptable.col2 ......
WHERE table.id = temptable.id;
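The "tell me where they are different" SELECT can be a join along these lines (col1/col2 stand in for the real columns, and table/temptable are the same placeholders as above):

SELECT t.id
FROM `table` t
JOIN temptable tt ON tt.id = t.id
WHERE t.col1 <> tt.col1 OR t.col2 <> tt.col2;

If the columns can be NULL, compare with the NULL-safe operator instead, e.g. NOT (t.col1 <=> tt.col1).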
You can store the timestamp at which a row was updated, either implicitly using ON UPDATE CURRENT_TIMESTAMP (http://dev.mysql.com/doc/refman/5.0/en/timestamp.html) or explicitly in your update SQL. Then all you need to do is select the row(s) with the lowest timestamp (using ORDER BY and LIMIT) and you have the next row to process, as long as you ensure that the timestamp is updated each time.
e.g. say you used a field last_polled_on TIMESTAMP to store the time you last polled a row.
Your insert looks like:
INSERT INTO table (..., last_polled_on) VALUES (..., NOW());
Your update looks like:
UPDATE table SET ..., last_polled_on = NOW() WHERE ...;
And your select for the next row to poll looks like:
SELECT ... FROM table ORDER BY last_polled_on LIMIT 1;