Generating a sequential unique code often causes a duplicate entry error - PHP

I am currently using PHP & Laravel.
What I want to do is generate a unique, sequential string to use as a code.
In this example, the code is for Purchase Orders.
What I want is something like this:
PO/000001
PO/000002
PO/000003
PO/000004
PO/000005
PO/000006
Database table schema:
create table `purchase_orders` (
`id` int unsigned not null auto_increment primary key,
`code` varchar(191) not null,
...
...
)
alter table `purchase_orders` add unique `purchase_orders_code_unique`(`code`)
This seems simple: I just need to grab the auto_increment ID so that the code corresponds with the id, more or less like this:
id | code
1 | PO/000001
2 | PO/000002
3 | PO/000003
4 | PO/000004
5 | PO/000005
6 | PO/000006
The code I use: (uses Laravel's ORM syntax)
$count = PurchaseOrders::max('id'); // Equivalent to SELECT MAX(`id`) FROM purchase_orders
$code = 'PO/' . sprintf('%06d', $count + 1); // %06d pads to six digits: PO/000001
In theory it works great, but in practice 'collisions' happen often: the id does not match the code generated for it, and sometimes the code runs ahead of the id. For example, it often ends up like this:
id | code
... | ...
199 | PO/000199
200 | PO/000200
201 | PO/000202
The next transaction will have the id 202, and the generated code is supposed to be PO/000202. This triggers an integrity constraint violation error, because PO/000202 is already used by id 201.
I use DB transactions & commits heavily, creating a purchase order sometimes takes a few moments, and multiple users create purchase orders at the same time. I don't know exactly how it happens, but a collision occurs quite often, roughly once every 100 transactions or so.
Here's an occurrence from a live project:
As you can see, the code is ahead of the id. The next code to be inserted will be ...000205, and my client reported an error:
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry
'PRCV/JKT/G/2019/00000205' for key 'dlvr_payments_code_unique'
You can also see that the id and the code are not equal at all.
I also tried another code generation method before: instead of MAX('id') I used COUNT(*), but it was even worse than MAX('id').
So, the question is: what did I do wrong? How can I ensure a unique code every time, so this won't happen again in the future?
I've been thinking of using Redis or some key-value database to store the counter; is that necessary?

Don't do it!
Don't depend on the PK to achieve sequential numbers. It leads you into all kinds of dead ends.
I would recommend storing your last inserted purchase order number in a separate table and updating it accordingly.
Along with maintaining that table, storing the counter in a cache (i.e. Redis) can improve your application's performance.
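A minimal sketch of that counter-table approach in plain SQL (the counters table and its columns are assumptions; in Laravel the same pattern would be a lockForUpdate() query inside DB::transaction()):
-- Assumed schema: counters(name VARCHAR(32) PRIMARY KEY, value INT UNSIGNED)
START TRANSACTION;
-- Lock the counter row; concurrent sessions block here until COMMIT
SELECT value + 1 INTO @next FROM counters WHERE name = 'purchase_orders' FOR UPDATE;
UPDATE counters SET value = @next WHERE name = 'purchase_orders';
INSERT INTO purchase_orders (code) VALUES (CONCAT('PO/', LPAD(@next, 6, '0')));
COMMIT;
Because the row lock serializes the counter reads, two concurrent requests can never see the same value, regardless of how long the surrounding transaction takes.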
Hope it helps!

Related

How to generate id number like 'xxx0000', 'xxx0001'?

This is my table:
| ID |
| xxx0000 |
| xxx0001 |
| xxx0002 |
I want to make my ID pattern like that, but I don't know how to generate it.
You have two different pieces of data, so make two different columns.
ID INT NOT NULL AUTO_INCREMENT,
SomethingElse SomeOtherType NOT NULL
What SomethingElse is named and what data type it is would be up to you. xxx doesn't tell us much.
If both of these things combined make up your primary key, you can use a composite key of multiple columns:
PRIMARY KEY (SomethingElse, ID)
The same integrity checks for any primary key will continue to apply, the database will simply check both columns for combined uniqueness.
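Put together as a runnable sketch (the table name and the CHAR(3) type are assumptions; the extra KEY (ID) is there because InnoDB requires an auto-increment column to be the first column of some index):
CREATE TABLE things (
    ID INT NOT NULL AUTO_INCREMENT,
    SomethingElse CHAR(3) NOT NULL,
    PRIMARY KEY (SomethingElse, ID),
    KEY (ID)
);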
At this point you have the data you want; now it's just a matter of displaying it how you want. Whether you do that in SQL or in PHP is up to you. Whether the application sees them as a single value or as the underlying separate values is also up to you.
Selecting them as a single value from SQL could be simple enough. Something like:
SELECT CONCAT(SomethingElse, ID) AS ID FROM ...
If you always want those padded zeroes then this question will help. Other string manipulations you might want to do would also be tackled one at a time (and each could result in its own Stack Overflow question if you need assistance with it).
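For example, LPAD handles the zero-padding (assuming the four-digit suffix from the sample data and the sketch table above):
SELECT CONCAT(SomethingElse, LPAD(ID, 4, '0')) AS ID FROM things;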
But the basic idea here is that what you have is a composite value. In a relational database you would generally store the underlying values and perform the composition when querying the data.

Preventing and delaying concurent mysql insert

I'm trying to create an ordering system that generates a unique serial number to distinguish each order. It worked well until two orders arrived at almost the same time (the difference was just seconds, about 10 seconds), and their unique serial numbers came out identical (the serial number is incremented from the last serial number in the DB).
I'm creating the id based on a specific format, and it has to be reset each month, so I can't use uniqid().
Do you guys have any idea about this? I read about some DB locking, but when I tried the solution from "Can we lock tables while executing a query in Zend-Db", it didn't work either.
---EDIT---
The format is
project number - year - order number for this month
the order number for this month is 4 digits, starting from 0001 up to 9999
after 9999 it starts again from A001 ... A999 B001 ... ZZZZ
and this is the column
| order_id | varchar(17) | utf8_general_ci | NO | PRI |
I hope this makes it clearer now :)
Thanks!
Primarily I'd look into using an AUTO_INCREMENT primary key - see the manual for details.
If that's not possible and you are using InnoDB, you should be able to create the order in a transaction. In your application you can then detect if there were duplicates and re-issue a new ID as needed. Using a transaction will ensure that no residual data is left in the database if your order creation fails.
EDIT based on the additional information:
I'd add an AUTO_INCREMENT primary key and use a separate "OrderName" column for the desired format. That should allow you to do the following, for example:
UPDATE orders o
JOIN (
    SELECT YEAR(o2.dte) AS y,
           MONTH(o2.dte) AS m,
           MIN(o2.Order_ID) AS minid
    FROM orders o2
    GROUP BY y, m
) AS t ON (t.m = MONTH(o.dte) AND t.y = YEAR(o.dte))
SET o.OrderName = CONCAT('p-n-', YEAR(o.dte), '-', o.Order_ID - t.minid);
The id column is INT PRIMARY KEY AUTO_INCREMENT; it ensures that the orders are always in the correct order and requires no locking. In this example the CONCAT dictates your order-number format. You can run this UPDATE in a trigger, if you wish, so that OrderName is populated immediately. Of course, if you run it in a trigger, you don't need to repopulate the whole table.
It seems we must use a transaction with SERIALIZABLE isolation. It prevents other sessions from reading and writing the affected rows until the transaction completes; a sketch follows the links below.
See:
http://en.wikipedia.org/wiki/Isolation_%28database_systems%29
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
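A rough sketch of that approach (all table and column names here are made up for illustration):
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
-- Under SERIALIZABLE plain reads take shared locks; if two sessions race,
-- one hits a deadlock and must retry instead of silently duplicating
SELECT COALESCE(MAX(seq), 0) + 1 INTO @next
  FROM orders WHERE project = 'P01' AND yr = 2014 AND mon = 5;
INSERT INTO orders (order_id, project, yr, mon, seq)
VALUES (CONCAT('P01-2014-', LPAD(@next, 4, '0')), 'P01', 2014, 5, @next);
COMMIT;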

MYSQL INNODB Manual Increment with locking

ID|RID|SID|Name
1| 1 | 1 |Alpha
2| 1 | 2 |Beta
3| 2 | 1 |Charlie
ID is auto-incrementing and unique; that's not the problem here. RID groups sets of data together, and I need a way to make the SID unique per RID 'group'. This structure is correct; I anticipate someone saying 'split it into multiple tables', but that's not an option here (it's taxonomic classification).
As shown, under RID 1, the SID increments from 1 to 2, but when the RID changes to 2, the SID is 1.
I have the code to get the next value: SELECT IFNULL(MAX(SID),0)+1 AS NextVal FROM t WHERE RID=1, the question is how do I use that value when inserting a new record?
I can't simply run two queries, as that can result in duplication, so somehow the table needs to be locked, ideally for writes only. What would be the correct way to do this?
First, you should constrain your data to be exactly the way you want it, so put a unique combined index on (RID, SID).
For your problem you should start a transaction (BEGIN) and then take an exclusive lock on just the rows you need, which blocks all access to those rows for other connections (not the whole table, which would be bad for performance!):
SELECT .... FOR UPDATE
This locks all selected rows exclusively. Furthermore, you should not use READ UNCOMMITTED as the isolation level. You can check in the manual how to view the current isolation level and how to change it.
REPEATABLE READ is the default isolation level, which would be fine here.
Then run your INSERT and commit (COMMIT) the transaction.
This should prevent duplicates altogether, since you created a unique index, and it should also keep your scripts from simply failing with a unique-check error; instead they wait for other scripts to finish and then insert the next row.
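A sketch of the whole pattern against the table from the question ('Delta' is just an example value; the IFNULL query is the one the asker already has):
ALTER TABLE t ADD UNIQUE KEY uq_rid_sid (RID, SID);

START TRANSACTION;
-- Exclusively lock the rows for this RID; concurrent writers for the
-- same RID wait here until COMMIT
SELECT IFNULL(MAX(SID), 0) + 1 INTO @next FROM t WHERE RID = 2 FOR UPDATE;
INSERT INTO t (RID, SID, Name) VALUES (2, @next, 'Delta');
COMMIT;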

Check if value exists in table or handle MySQL's unique constraint exception?

I'm developing an iOS app that will enable users to receive push notifications. Therefore, I have to store each user's APN device token. I have decided to piggy-back the APN device token on each API request and store it in the database if it doesn't exist yet.
The table design is very simple, only one column 'device_token':
mysql> desc apn_devices;
+--------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| device_token | varchar(64) | NO | PRI | NULL | |
+--------------+-------------+------+-----+---------+-------+
Now I'm wondering, what would be more expensive to ensure all tokens are unique? I see two options:
Put a unique constraint on 'device_token', just insert the token and ignore any exception that might occur
First, perform a query to check if the token already exists and if not, insert it.
The first seems to be the 'cleaner' solution, in the sense that it takes less code, which is easier to maintain. The second solution only adds an extra query when the token does not yet exist, and it prevents the inevitable constraint violation of the first. Logically, the latter should be less expensive, but it takes more code to accomplish.
What is best practice here? Is it expensive for MySQL to throw exceptions?
Or, you can simply do an INSERT IGNORE ... (no need for a unique constraint as it's the primary key, and hence by construction must be unique)
From MySQL Reference Manual:
If you use the IGNORE keyword, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row still is not inserted, but no error occurs. Ignored errors may generate warnings instead, although duplicate-key errors do not.
Note that if you had other columns in your table you'd want to update each time (e.g. a "last used" timestamp), you could also use INSERT ... ON DUPLICATE KEY UPDATE ...
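For example (the token value is made up, and last_used is a hypothetical extra column for the second variant):
-- Insert the token; a duplicate is silently skipped
INSERT IGNORE INTO apn_devices (device_token) VALUES ('f0e1d2c3a4b5');

-- If the table also had a last_used column to refresh on every request:
INSERT INTO apn_devices (device_token, last_used)
VALUES ('f0e1d2c3a4b5', NOW())
ON DUPLICATE KEY UPDATE last_used = NOW();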

Avoiding collisions on primary keys from separate MySQL databases

I have several servers running their own instance of a particular MySQL database which, unfortunately, cannot be set up for replication/clustering. Each server inserts data into several user-related tables which have foreign key constraints between them (e.g. user, user_vote). Here is how the process goes:
all the servers start with the same data
each server grows its own set of data independently from the other servers
periodically, the data from all the servers is merged manually together and applied back to each server (the process therefore repeats itself from step 1).
This is made possible because, in addition to its primary key, the user table contains a unique email field which makes it possible to identify which users already exist in each database and to merge the new ones, changing the primary and foreign keys to avoid collisions and to maintain the correct foreign key constraints. It works, but it's quite some effort, because primary and foreign keys have to be changed to avoid collisions; hence my question:
Is there a way to have each server use primary keys that don't collide with other servers to facilitate the merging?
I initially wanted to use a composite primary key (e.g. server_id, id) but I am using Doctrine which doesn't support primary keys composed of multiple foreign keys so I would have problems with my foreign key constraints.
I thought about using a VARCHAR as an id and using part of the string as a prefix (SERVER1-1, SERVER1-2, SERVER2-1, SERVER2-2...), but I'm thinking it will make the DB slower, as I will have to do some manipulation with the ids (e.g. on insert, I have to parse the existing ids, extract the highest one, increment it, and concatenate it with the server id...).
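For illustration, the per-insert bookkeeping with such string ids might look like this (the prefix and the user table column are just examples; this is exactly the overhead I'd like to avoid):
-- 'SERVER1-' is 8 characters, so the numeric part starts at position 9
SELECT COALESCE(MAX(CAST(SUBSTRING(id, 9) AS UNSIGNED)), 0) + 1 AS next_id
  FROM user WHERE id LIKE 'SERVER1-%';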
PS: Another option would be to implement replication with read from slaves and write to master but this option was discarded because of issues such as replication lag and single point of failure on the master which can't be solved for now.
You can make sure each server uses a different auto-increment step and a different start offset:
Change the step auto_increment fields increment by
(assuming you are using auto-increments)
I've only ever used this across two servers, so my set-up had one with even ids and one with odd.
When they are merged back together nothing will collide, as long as you make sure all tables follow the above idea.
To implement this for 4 servers, you would set up the following offsets:
Server 1 = 1
Server 2 = 2
Server 3 = 3
Server 4 = 4
You would set your increment like so (I've used 10 to leave room for extra servers):
Server 1 = 10
Server 2 = 10
Server 3 = 10
Server 4 = 10
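These are ordinary MySQL system variables (auto_increment_offset and auto_increment_increment), so on server 3, for example:
-- In my.cnf, or at runtime:
SET GLOBAL auto_increment_offset = 3;
SET GLOBAL auto_increment_increment = 10;
-- Server 3 now generates ids 3, 13, 23, 33, ...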
Then, after you have merged and before copying back to each server, you would just need to update the auto-increment value for each table so it has the correct offset again. Imagine each server had created 100 rows; the next auto-increment values would be:
Server 1 = 1001
Server 2 = 1002
Server 3 = 1003
Server 4 = 1004
This is where it gets tricky with four servers, for imagine certain tables had no rows inserted from a particular server. You could end up with some tables whose last auto-increment id came not from server 4 but from server 2 instead. That would make it very tricky to work out what the next auto-increment value should be for any particular table.
For this reason it is probably best to also include a column in each of your tables that records the server number when any rows are inserted.
id | field1 | field2 | ... | server
That way you can easily find out what the last auto-increment value from a particular server was, by selecting the following on any of your tables:
SELECT MAX(id) FROM `table` WHERE `server`=4 LIMIT 0,1
Using this value you can reset the next autoinc value you need for each table on each server, before rolling the merged dataset out to the server in question.
-- information_schema is read-only, so use ALTER TABLE instead; it cannot
-- take a variable directly, so build the statement dynamically:
SET @next = (SELECT MAX(id) FROM `table` WHERE `server` = s) + n;
SET @sql = CONCAT('ALTER TABLE `table` AUTO_INCREMENT = ', @next);
PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt;
Where s is the server number and n is set to the offset, so in my example it would be 10.
Prefixing the ID would do the trick. As for the DB being slower - it depends on how much traffic is served there. You can also have the "prefixed id" split into two columns, "prefix" and "id", and these can be of any type. It would require some logic to cope with it in requests, but it may be worth evaluating.
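A rough sketch of that two-column variant (table and column names are placeholders):
CREATE TABLE items (
    prefix VARCHAR(16) NOT NULL,  -- e.g. 'SERVER1'
    id INT NOT NULL,
    PRIMARY KEY (prefix, id)
);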
