Duplicate entry '...' for key 'PRIMARY' during the transaction


This one happened to me last night. I am quite familiar with the nature of the error, but I still cannot figure out what could have caused it. I have a hunch, but I am not sure. I'll begin with some basic info about the app:
My app has 3 entities: Loan, SystemPage and TextPage. Whenever someone adds a loan, one or more system pages are added to the DB. Basically, it goes something like this:
    if ($form->isValid()) {
        $this->em->getConnection()->beginTransaction();
        $this->em->persist($loan);
        $this->em->flush();
        while ($someCondition) {
            $page = new SystemPage();
            //... Fill the necessary data into page
            $page->setObject($loan);
            $this->em->persist($page);
        }
        $this->em->flush();
        $this->em->getConnection()->commit();
    }
Please ignore potential typos, I am writing this from memory.
Entity Loan is mapped to table loans, and SystemPage is mapped (via inheritance mapping) to system_pages and base_pages. Both of the latter have an id field which is set to AUTO_INCREMENT.
My hunch: There is another table called text_pages. Given that text_pages and base_pages on the one hand, and system_pages and base_pages on the other, share IDs, I am thinking that it could easily cause this:
User1: Create BasePage, acquire autoincrement ID (value = 1)
User2: Create BasePage, acquire autoincrement ID (value = 1)
User1: Create TextPage, use the ID from step 1
User2: Create SystemPage, use the ID from step 2
Two problems with this theory:
Transactions: that is why I used them in the first place.
At the time of the error there was no other activity on the app by another user.
Important: After waiting for a minute, resubmitting went through OK.
Could this be some weird MySQL transaction isolation bug? Any hint would be greatly appreciated...
Edit:
Part of DB Schema:
Please ignore the column names, which are in Serbian.

The flush() operation flushes all changes in one single transaction, so you have redundant code here...
You didn't state whether you can reproduce this bug, and it would be helpful if you could provide the DB schema.

It seems there is no right answer to this question, only speculation, so I will provide some troubleshooting ideas based on my own experiences with a problem like this:
You mention there was no other activity on the app, but I would triple check that by looking at the query logs. There must be a duplicate query that was executed.
Maybe the form was submitted twice accidentally. The user double-clicked on the submit button, or they clicked again when the UI did not respond. You can check this idea by looking at the Apache log files for POST requests on your form around the same timestamp. You may need to implement some JavaScript code to prevent double-clicks on your form page submit button.
Your hunch is probably quite close to correct, in that there is some kind of race condition. Using transactions won't prevent race conditions, but they do provide the means to roll back gracefully. Wrap your code in a try/catch block so that you can catch the MySQL exception and present the user with a friendly error and the option to retry.
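The try/catch idea could look roughly like the following. This is a minimal sketch, not the asker's Doctrine code: it uses plain PDO, an in-memory SQLite database standing in for MySQL, and a hypothetical helper named saveInTransaction. It relies on SQLSTATE class 23000 (integrity constraint violation), which is the class under which MySQL reports "Duplicate entry" errors.

```php
<?php
// Hypothetical helper: runs $work inside a transaction and turns a
// duplicate-key failure into a friendly, retryable result.
function saveInTransaction(PDO $pdo, callable $work): array
{
    $pdo->beginTransaction();
    try {
        $work($pdo);
        $pdo->commit();
        return ['ok' => true, 'message' => 'Saved.'];
    } catch (PDOException $e) {
        $pdo->rollBack(); // undo everything written so far
        // SQLSTATE 23000 = integrity constraint violation
        $sqlState = $e->errorInfo[0] ?? (string) $e->getCode();
        if ($sqlState === '23000') {
            return ['ok' => false, 'message' => 'Could not save, please try again.'];
        }
        throw $e; // anything else is unexpected
    }
}

// Demo: in-memory SQLite stands in for MySQL (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE base_pages (id INTEGER PRIMARY KEY, title TEXT)');
$pdo->exec("INSERT INTO base_pages (id, title) VALUES (1, 'existing')");

$result = saveInTransaction($pdo, function (PDO $pdo): void {
    $pdo->exec("INSERT INTO base_pages (id, title) VALUES (2, 'new')");
    $pdo->exec("INSERT INTO base_pages (id, title) VALUES (1, 'duplicate')");
});

// The rollback also removed the row with id 2.
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM base_pages')->fetchColumn();
```

The point of the design is that the user never sees a half-written state: either every insert is committed, or none is and the form can simply be resubmitted.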

Related

What do you think of this approach for logging changes in mysql and have some kind of audit trail

I've been reading through several topics now and did some research about logging changes to a mysql table. First let me explain my situation:
I've a ticket system with a table: 'ticket'
As of now I've created triggers which insert a duplicate entry into my table 'ticket_history', which has "action", "user" and "timestamp" as additional columns. After some weeks of testing I'm not happy with that setup, since every change creates a full copy of my row in the history table. I do understand that disk space is cheap and I should not worry about it, but retrieving some kind of log or nice-looking history for the user is painful, at least for me. Also, with the trigger I've written I get a new row in the history even if there is no change. But this is just a design flaw of my trigger!
Here is my trigger:
    BEFORE UPDATE ON ticket FOR EACH ROW
    BEGIN
        INSERT INTO ticket_history
        SET
            idticket = NEW.idticket,
            time_arrival = NEW.time_arrival,
            idticket_status = NEW.idticket_status,
            tmp_user = NEW.tmp_user,
            action = 'update',
            timestamp = NOW();
    END
My new approach in order to avoid having triggers
After spending some time on this topic I came up with an approach I would like to discuss and implement. But first I have some questions about it:
My idea is to create a new table:
    id  sql_fwd    sql_bwd    keys    values  user  timestamp
    ---------------------------------------------------------
    1   UPDATE...  UPDATE...  status  5       14    12345678
    2   UPDATE...  UPDATE...  status  4       7     12345678
The flow would look like this in my mind:
First I would select one or more values from the DB:
SELECT keys FROM ticket;
Then I display the data in 2 input fields:
<input name="key" value="value" />
<input type="hidden" name="key" value="value" />
Hit submit and give it to my function:
I would start with a SELECT again: SELECT * FROM ticket;
and make sure that the hidden input field == the value from the latest select. If so I can proceed and know that no other user has changed something in the meanwhile. If the hidden field does not match I bring the user back to the form and display a message.
Next I would build the SQL Queries for the action and also the query to undo those changes.
    $sql_fwd = "UPDATE ticket
                SET idticket_status = 1
                WHERE idticket = '".$c_get['id']."';";
    $sql_bwd = "UPDATE ticket
                SET idticket_status = 0
                WHERE idticket = '".$c_get['id']."';";
With those in hand I run the UPDATE on ticket and insert a new entry in my new table for logging.
That way I can try to catch possible overwrites when two users are editing the same ticket at the same time, and for my history I could simply look up the keys and values and generate some kind of list. Also, with sql_bwd I can simply undo changes.
My questions to that would be:
Would it be noticeable to do an additional select every time I want to update something?
Do I lose some benefits I would have with triggers?
Are there any big disadvantages?
Are there any functions on my MySQL server or in PHP which already do something like that?
Or is there a much easier way to do something like that?
Would a slight change to the trigger I already have be enough?
If I understand this right, MySQL only performs an update if the value has changed, but the trigger is executed anyway, right?
If I'm able to change the trigger, can I still somehow prevent data from being overwritten on the MySQL server while 2 users try to edit the ticket at the same time, or would I do this with PHP anyway?
Thank you for the help already.

Another approach...
When a worker starts to make a change...
Store the time and worker_id in the row.
Proceed to do the tasks.
When the worker finishes, fetch the last worker_id that touched the record; if it is himself, all is well. Clear the time and worker_id.
If, on the other hand, another worker slips in, then some resolution is needed. This gets into your concept that some things can proceed in parallel.
Comments could be added to a different table, hence no conflict.
Changing the priority may not be an issue by itself.
Other things may be messier.
It may be better to have another table for the time & worker_ids (& ticket_id). This would allow for flagging that multiple workers are currently touching a single record.
As for History versus Current, I (usually) like to have 2 tables:
History -- blow-by-blow list of what changes were made, when, and by whom. This table is only INSERTed into.
Current -- the current status of the ticket. This table is mostly UPDATEd.
Also, I prefer to write the History directly from the "database layer" of the app, not via Triggers. This gives me much better control over the details of what goes into each table and when. Plus the 'transactions' are clear. This gives me confidence that I am keeping the two tables in sync:
BEGIN; INSERT INTO History...; UPDATE Current...; COMMIT;
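Written from the application's database layer, that sequence might look like the sketch below. It assumes PDO, uses in-memory SQLite in place of MySQL, and invents the function name changeTicketStatus plus the user_id and changed_at columns purely for illustration; the ticket and ticket_history table names are borrowed from the question.

```php
<?php
// Hypothetical database-layer function: writes the history row and the
// current row in one transaction so the two tables cannot drift apart.
function changeTicketStatus(PDO $pdo, int $ticketId, int $newStatus, int $userId): void
{
    $pdo->beginTransaction();
    try {
        $insert = $pdo->prepare(
            'INSERT INTO ticket_history (idticket, idticket_status, user_id, action, changed_at)
             VALUES (?, ?, ?, ?, ?)'
        );
        $insert->execute([$ticketId, $newStatus, $userId, 'update', date('Y-m-d H:i:s')]);

        $update = $pdo->prepare('UPDATE ticket SET idticket_status = ? WHERE idticket = ?');
        $update->execute([$newStatus, $ticketId]);

        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack(); // neither table keeps a partial change
        throw $e;
    }
}

// Demo: in-memory SQLite stands in for MySQL (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE ticket (idticket INTEGER PRIMARY KEY, idticket_status INTEGER)');
$pdo->exec('CREATE TABLE ticket_history (
    idticket INTEGER, idticket_status INTEGER, user_id INTEGER, action TEXT, changed_at TEXT)');
$pdo->exec('INSERT INTO ticket (idticket, idticket_status) VALUES (1, 0)');

changeTicketStatus($pdo, 1, 5, 14);

$status = (int) $pdo->query('SELECT idticket_status FROM ticket WHERE idticket = 1')->fetchColumn();
$historyRows = (int) $pdo->query('SELECT COUNT(*) FROM ticket_history')->fetchColumn();
```

Because the application passes the user id itself, the history row can record the logged-in application user rather than the MySQL account.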

I've answered a similar question before. You'll see some good alternatives in that question.
In your case, I think you're merging several concerns - one is "storing an audit trail", and the other is "managing the case where many clients may want to update a single row".
Firstly, I don't like triggers. They are a side effect of some other action, and for non-trivial cases, they make debugging much harder. A poorly designed trigger or audit table can really slow down your application, and you have to make sure that your trigger logic is coordinated between lots of developers. I realize this is personal preference and bias.
Secondly, in my experience, the requirement is rarely "show the status of this one table over time" - it's nearly always "allow me to see what happened to the system over time", and if that requirement exists at all, it's usually fairly high priority. With a ticketing system, for instance, you probably want the name and email address of the users who created the ticket and changed its status; the name of the category/classification, perhaps the name of the project etc. All of those attributes are likely to be foreign keys to other tables. And when something does happen that requires audit, the requirement is likely "let me see immediately", not "get a database developer to spend hours trying to piece together the picture from 8 different history tables". In a ticketing system, it's likely a requirement for the ticket detail screen to show this.
If all that is true, then I don't think history tables populated by triggers are a good idea - you have to build all the business logic into two sets of code, one to show the "regular" application, and one to show the "audit trail".
Instead, you might want to build "time" into your data model (that was the point of my answer to the other question).
Since then, a new style of data architecture has come along, known as CQRS. This requires a very different way of looking at application design, but it is explicitly designed for reactive applications; these offer much nicer ways of dealing with the "what happens if someone edits the record while the current user is completing the form" question. Stack Overflow is an example - we can see, whilst typing our comments or answers, whether the question was updated, or other answers or comments are posted. There's a reactive library for PHP.

"I do understand that disk space is cheap and I should not worry about it but in order to retrieve some kind of log or nice looking history for the user is painful, at least for me."
A large history table is not necessarily a problem. Huge tables only use disk space, which is cheap. They slow things down only when making queries on them. Fortunately, the history is not something you'd use all the time, most likely it is only used to solve problems or for auditing.
It is useful to partition the history table, for example by month or week. This allows you to simply drop very old records and, more importantly, since the history of the previous months has already been backed up, your daily backup schedule only needs to back up the current month. This means a huge history table will not slow down your backups.
"With that I can try to catch possible overwrites while two users are editing the same ticket in the same time"
There is a simple solution:
Add a column "version_number".
When you select with intent to modify, you grab this version_number.
Then, when the user submits new data, you do:
    UPDATE ...
    SET all modified columns,
        version_number = version_number + 1
    WHERE ticket_id = ...
      AND version_number = (the value you got)
If someone came in-between and modified it, then they will have incremented the version number, so the WHERE will not find the row. The query will return a row count of 0. Thus you know it was modified. You can then SELECT it, compare the values, and offer conflict resolution options to the user.
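As a concrete sketch of that check, assuming PDO, an in-memory SQLite database standing in for MySQL, and a hypothetical function updateTicketStatus (the idticket and idticket_status column names are borrowed from the question):

```php
<?php
// Hypothetical helper: returns true if the update was applied, false if
// someone else changed the row since $expectedVersion was read.
function updateTicketStatus(PDO $pdo, int $ticketId, int $newStatus, int $expectedVersion): bool
{
    $stmt = $pdo->prepare(
        'UPDATE ticket
            SET idticket_status = ?, version_number = version_number + 1
          WHERE idticket = ? AND version_number = ?'
    );
    $stmt->execute([$newStatus, $ticketId, $expectedVersion]);
    return $stmt->rowCount() === 1; // 0 rows means the version no longer matches
}

// Demo: in-memory SQLite stands in for MySQL (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE ticket (
    idticket INTEGER PRIMARY KEY, idticket_status INTEGER, version_number INTEGER)');
$pdo->exec('INSERT INTO ticket (idticket, idticket_status, version_number) VALUES (1, 0, 1)');

// Both users loaded the form while version_number was 1.
$first = updateTicketStatus($pdo, 1, 5, 1);  // applied, version becomes 2
$second = updateTicketStatus($pdo, 1, 4, 1); // stale version, nothing updated

$finalStatus = (int) $pdo->query('SELECT idticket_status FROM ticket WHERE idticket = 1')->fetchColumn();
```

When the function returns false, the application can re-read the row and show the user what changed.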
You can also add columns like who modified it last, and when, and present this information to the user.
If you want the user who opens the modification page to lock out other users, it can be done too, but this needs a timeout (in case they leave the window open and go home, for example). So this is more complex.
Now, about history:
You don't want to have, say, one large TEXT column called "comments" where everyone enters stuff, because it will need to be copied into the history every time someone adds even a single letter.
It is much better to view it like a forum: each ticket is like a topic, which can have a string of comments (like posts), stored in another table, with the info about who wrote it, when, etc. You can also historize that.
The drawback of using a trigger is that the trigger does not know about the user who is logged in, only the MySQL user. So if you want to record who did what, you will have to add a column with the user_id as I proposed above. You can also use Rick James' solution. Both would work.
Remember though that MySQL triggers don't fire on foreign key cascade deletes... so if the row is deleted in this way, it won't work. In this case doing it in the application is better.

How do I deal with concurrency problems using Yii2 and MySQL transactions?

Overview
Consider the following details:
We have a table named user. In it is a column named wallet.
We have a table named walletAction. We insert a new entry on each wallet action a user is doing. This table acts like some sort of logs in the database with some calculations.
We have a CRON command that does an update every N minutes. Each CRON action gets some data by using a standalone API and 'inserts' a new walletAction entry. At the same time, it updates the user.wallet's value.
A user can buy stuff from our site. When the user clicks the buy button, we insert a new walletAction entry and change the user.wallet column.
Problem
I am afraid that the CRON update and the action of the user when they click the buy button will happen at the exact same time causing the entries in the walletAction table to have wrong calculations.
I need some kind of 'lock' on the CRON update execution or something along those lines.
Questions
Should I be afraid of this situation?
How can I avoid this problem?
Can I avoid this trouble by using MySQL transactions?
What isolation level should I use and in which case should I use it? (In the CRON command or in the action of the user when they click the buy button?)

PHP does not seem to have the kind of in-process concurrency that Go or Java have. You can implement some technical tricks, but most of them create new problems for you :). To solve your problem I suggest using an optimistic lock. For more information see http://www.yiiframework.com/doc-2.0/guide-db-active-record.html#optimistic-locks.

Yes, in this case I would recommend using transactions with the strongest isolation level, yii\db\Transaction::SERIALIZABLE.
This level should prevent "phantom reads" and "non-repeatable reads".
Moreover, I recommend always using transactions when you perform more than one related change, because it helps keep the DB consistent.
This can prevent the problem where a PHP exception is thrown after new rows were successfully inserted into walletAction but before user.wallet is updated.

Logging user activities in applications

The problem I'm here to talk about (and ask about, of course) is not new. I searched the web and Stack Overflow and found ideas on many parts of this problem (pros and cons), but some parts are still missing in my mind. So I thought it would be a good idea to collect them in one place (it will of course be more complete with others' ideas) and ask about the rest.
The problem is clear: "We want to log every single action of a user". Once we solve the big problem, smaller ones (like logging only one action) should be a piece of cake.
First, what I have gathered from the web and Stack Overflow:
Use DB instead of File: That's good advice, although it always depends on the situation. But because of the many benefits of a DB, it is the better solution in the long term and in general.
DB Layer or Application Layer: It depends. For example, if you really want to monitor everything (I mean every single row that changes in the database), it seems there is only one choice: database triggers. There are, however, many discussions around MySQL saying that triggers slow down the DB and advising against them. So, depending on the level of detail you need, you can put your logging system in the DB layer or the application layer (for example, a common function call such as $logClass->logThis()).
Use Observers: Clean code is always better. If you are familiar with observers, you can use them to do things for you when an action happens, so you don't have to add $logClass->logThis() every time a CRUD operation happens in your application.
What To Log: Simple and short answer is: Based on your needs, but there are some common fields you will need:
user_id (if a unique user ID is available)
timestamp (unix maybe)
ip (not everyone knows how to fake it, so use it; even a faked one gives you some insight into user behavior)
action_id (should be predefined actions, for more uniform queries and reports)
object_id (the unique row ID of the record that was changed)
action (this is the part my question is about)
and so on...
I would appreciate it if anyone corrected me where I made a mistake or added other useful information to this post, so it can become a good reference for other users.
And now my question: How to store actions? For better understanding, consider the following scenario.
I have a table named "product" and a table named "companies". The business logic requires assigning products to companies, which led to a table "company_product". Now when a user inserts a new product and simultaneously assigns its companies, two tables are affected (the same goes for delete and update): "product" and "company_product", and we want to know:
what's inserted?
what's deleted?
what's updated to what?
For performance reasons, and because I don't know enough about triggers, I want to do the logging in the application layer, so I ended up with the idea of saving the action field in the database as an array or JSON structure. But as I developed my solution I ran into a problem: how to make this log understandable for non-technical users? For example, I want to save something like this in the action field of the database when deleting (or inserting) product with id 20:
action : [{id: 20, product_id:2, company_id: 1},{id: 21, product_id:2, company_id: 2}]
And this is not easy for everyone to read and understand. I could make this JSON more readable, something like this:
action : ["Product A Deleted From Company X", "Product A Deleted From Company Y"]
and save the previous form in a technical_action field for further diagnosis. But that needs additional work and more queries for something (the log) that is not always needed.
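One way to limit that extra work is to store only the technical record and build the readable text when the log is displayed. The sketch below assumes a hypothetical helper describeAction and assumes the product and company names have already been fetched into plain id => name arrays; none of these names come from an existing library.

```php
<?php
// Hypothetical helper: turns the technical rows of one log entry into
// readable sentences, using id => name lookup arrays.
function describeAction(string $verb, array $rows, array $productNames, array $companyNames): array
{
    $messages = [];
    foreach ($rows as $row) {
        // Fall back to the raw id if the name can no longer be resolved.
        $product = $productNames[$row['product_id']] ?? ('product #' . $row['product_id']);
        $company = $companyNames[$row['company_id']] ?? ('company #' . $row['company_id']);
        $messages[] = sprintf('%s %s from %s', $product, $verb, $company);
    }
    return $messages;
}

// The technical record as it might be stored in the action column.
$technical = json_decode(
    '[{"id":20,"product_id":2,"company_id":1},{"id":21,"product_id":2,"company_id":2}]',
    true
);

$messages = describeAction(
    'deleted',
    $technical,
    [2 => 'Product A'],
    [1 => 'Company X', 2 => 'Company Y']
);
```

The trade-off is that a name deleted or renamed later cannot be recovered at display time, which is why the sketch falls back to the raw id.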
I would appreciate any additional information on this article (I'm definitely sure that there exist other criteria that can be discussed), and answer to my question.

You are essentially gathering details for analytics.
It would be better to go for flat tables rather than relational tables.
If you want to do more analysis, relational tables will not be a good choice, because they perform worse for that.

how to prevent two separate ajax calls in php from modifying the same database

I've got a JavaScript client that sometimes sends two ajax requests within milliseconds of each other to the (PHP) server. (It's a JavaScript bug that I have no control over from the client side; I only have control over the server side.)
The first request checks if a voucher already exists in the database (given a couple of parameters, i.e. cust id etc.). If the voucher already exists, it just re-uses the voucher and updates its value; if it doesn't, it creates a new one from scratch.
The problem is that before it has finished checking if the voucher exists, the second request comes in and checks if the voucher exists as well. At that point the first hasn't created the voucher yet.
So long story short, we end up with 2 duplicate vouchers (and the database doesn't restrict the voucher name to be unique; I have no control over the database either).
So how do I prevent the second ajax request from doing anything until the first has done its thing?
Keep in mind that the two requests are two different threads, so if I make any $isVoucherCreationInProgress variables, they would be useless, as the second call would be completely oblivious to them.
ideas?

If the 2 ajax requests are from the same client, make a lock-like system on the server side: when a request starts checking whether a voucher exists, set a session variable and keep it until the request has finished what it has to do. When the second request comes in, it first checks the session, and if someone else is working on the voucher, it ends with a "check later" message. On the client side, when a request comes back denied, you can simply resend it with a 1 second delay, to be sure nobody is "working".
Hope this helps.
Or you could see this question in order to make a mutex in php: PHP mutual exclusion (mutex)
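One common way to build such a mutex in PHP is an advisory file lock with flock(). The sketch below is not necessarily what the linked question recommends; the helper withLock, the lock-file naming and the in-memory $vouchers array (standing in for the voucher table) are assumptions for illustration, and it only helps when all requests are served by the same machine.

```php
<?php
// Hypothetical helper: runs $work while holding an exclusive lock for $key.
// A second request with the same key blocks in flock() until the first is done.
function withLock(string $key, callable $work)
{
    $path = sys_get_temp_dir() . '/voucher-' . md5($key) . '.lock';
    $handle = fopen($path, 'c'); // create if missing, do not truncate
    if ($handle === false) {
        throw new RuntimeException('Cannot open lock file');
    }
    try {
        if (!flock($handle, LOCK_EX)) {
            throw new RuntimeException('Cannot acquire lock');
        }
        return $work();
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}

$vouchers = []; // stands in for the voucher table (assumption)

// Check-then-create happens entirely inside the lock.
$createOrUpdate = function (string $customerId, int $value) use (&$vouchers): string {
    return withLock($customerId, function () use (&$vouchers, $customerId, $value): string {
        $exists = isset($vouchers[$customerId]);
        $vouchers[$customerId] = $value;
        return $exists ? 'updated' : 'created';
    });
};

$first = $createOrUpdate('cust-42', 10);
$second = $createOrUpdate('cust-42', 15);
```

Locking per customer key rather than globally keeps unrelated customers from waiting on each other.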

Personally I would think of a very simple method. On your server, you must have a procedure that creates the voucher. Keep a global array, and just before creating the voucher, add an entry to it as key => value, where the key may be the id of the voucher and the value may be a status such as "creating". After creating the voucher, remove the entry using the id of the voucher as the key.
Now, every time just before creating the voucher, check whether the key already exists in the global array. If it does and the value is "creating", the voucher is already being created, so you exit.
Hope it helps, :-)

Use transactions. If you really can't touch the database (not even make your own statements), you can use STM or the like. Wouldn't be too hard with locks either, but either way requires that your application is running continuously. You can run a server with software like phpdaemon and forward a specific path to that server, to get that continuance.

I understand that you create a new row in one table of your database.
You should add a uniqueness constraint so that you can't add it twice. Is it possible that you have to create several vouchers? Could you give more info on this?
Regarding the update, you should add a 'version' field to your row. The client side needs to have the correct version number to update the row. This avoids the problem of unwanted concurrent updates. This is a best practice with ORMs; you can look it up under 'optimistic update'.
As you have no control over the db, create a cache of requests (i.e. a static object) on your server, and create/update a row only if there is nothing for this customer (plus other parameters if needed) in your cache (like this one for example http://www.php.net/manual/en/book.memcache.php). Your cache should clean itself after a while (I guess there are cache solutions in PHP).
Another idea (ugly, but it seems you are so limited in solutions): just make it slower. Wait long enough to make sure there is no one else (you will need a loop which checks and undoes if needed, with a random delay for convergence).

You can either set a flag (in JavaScript) when your ajax request starts - check to see if it's set then RETURN, or you can change your AJAX request to synchronous.

Can I use foreign key restrictions to return meaningful UI errors with PHP

I want to start by saying that I am a big fan of using foreign keys and have a tendency to use them even on small projects to keep my database from being filled with orphaned data. On larger projects I end up with gobs of keys which end up covering upwards of 8 - 10 layers of data.
I want to know if anyone could suggest a graceful way of handling 'expected errors' from the MySQL database in a way that I can construct meaningful messages for the end user. I will explain 'expected errors' with an example.
Let's say I have a set of tables used for basic discussions:
discussion
questions
responses
users
Hierarchically they would probably look something like this:
-users
--discussion
---questions
----responses
When I attempt to delete a user, the FKs will check discussions, and if any discussion exists the deletion is restricted; deleting a discussion checks questions, deleting questions checks responses. An 'expected error' in this case would be attempting to delete a user: unless they are newly created, I can anticipate that one or more foreign keys will fail, causing an error.
What I WANT to do is to catch that error on deletion and be able to tell the end user something like 'We're sorry, but all discussions must be removed before you can delete this user...'.
Now I know I can keep and maintain matching arrays in PHP and map specific errors to messages but that is messy and prone to becoming stagnant, or I could manually run a set of selects prior to attempting the deletion, but then I am doing just as much work as without using FKs.
Any help here would be greatly appreciated, or if I am just looking at this completely wrong then please let me know.
On a side note I generally use CodeIgniter for my application development, so if that would open up an avenue through that framework please consider that in your answers.
Thanks in Advance

Sadly, MySQL does not expose the ability to define a custom error like you would with SQL Server or Oracle.
Bug/Feature Request #16999
Worklog #2110 specs the behavior for v5.5
Workaround
Check this blog post about using a UDF to be able to define custom errors.

Sounds like you need to define your foreign keys with ON DELETE CASCADE. This will delete any referenced data in other tables.

You shouldn't be relying on the database to create errors for your application code. The FKs are there for when your app code messes up and tries to delete something it shouldn't.
If you really want to give the user a nice error message you will have to run the selects first, and build the appropriate error message.
edit
You can check for foreign keys in one select. If you are using an ORM like doctrine, you don't even have to specify the join, just tell it what fields to select, then check each table for nonzero rows.
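Without an ORM, a single-select check could look like the sketch below. It assumes PDO, in-memory SQLite in place of MySQL, a hypothetical function userDeletionError, and guessed foreign key columns (discussion.user_id, questions.discussion_id); the real schema may differ.

```php
<?php
// Hypothetical pre-delete check: one query counts the rows that would
// block the deletion, and the result is turned into a user-facing message.
function userDeletionError(PDO $pdo, int $userId): ?string
{
    $stmt = $pdo->prepare(
        'SELECT COUNT(DISTINCT d.id) AS discussions, COUNT(q.id) AS questions
           FROM discussion d
           LEFT JOIN questions q ON q.discussion_id = d.id
          WHERE d.user_id = ?'
    );
    $stmt->execute([$userId]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);

    $discussions = (int) $row['discussions'];
    if ($discussions === 0) {
        return null; // nothing references this user, safe to delete
    }
    return sprintf(
        "We're sorry, but %d discussion(s) containing %d question(s) must be removed before you can delete this user.",
        $discussions,
        (int) $row['questions']
    );
}

// Demo: in-memory SQLite stands in for MySQL (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE discussion (id INTEGER PRIMARY KEY, user_id INTEGER)');
$pdo->exec('CREATE TABLE questions (id INTEGER PRIMARY KEY, discussion_id INTEGER)');
$pdo->exec('INSERT INTO discussion (id, user_id) VALUES (1, 1), (2, 1)');
$pdo->exec('INSERT INTO questions (id, discussion_id) VALUES (1, 1), (2, 1), (3, 1)');

$blocked = userDeletionError($pdo, 1);
$free = userDeletionError($pdo, 2);
```

The foreign keys stay in place as the safety net; the select only exists to produce a message the user can act on.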
