What is the best way to generate consecutive values when you have a load-balanced database and several instances of your application?
For example, I have a load-balanced MySQL database.
My PHP application is deployed with Docker and has 3 containers.
I have to generate consecutive IDs. I cannot use auto-increment because I have to generate unique IDs depending on relations (for example, I have to generate a unique bill number depending on which society it is related to).
My bill can be generated but not emitted. I must generate the unique value when the bill is emitted.
Is a TRIGGER ON UPDATE a good solution or not?
Thanks for your answers.
I would save the current id in a db table.
Each time you want to increase the id, do the following:
Start a transaction
Lock the id row in the db table: in MySQL use SELECT ... FOR UPDATE
Read the current id
Increase the id
Generate the bill with the id
Store the id back to the db
Commit the transaction
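A minimal SQL sketch of that pattern, assuming a counter table bill_counters(society_id, last_bill_number) and a bill_number column on the bill table (all names here are illustrative):

    START TRANSACTION;

    -- Lock the counter row for this society; concurrent emitters wait here
    SELECT last_bill_number
      FROM bill_counters
     WHERE society_id = 42
       FOR UPDATE;

    -- Suppose the SELECT returned 17; the next bill number is 18
    UPDATE bill_counters
       SET last_bill_number = last_bill_number + 1
     WHERE society_id = 42;

    UPDATE bill
       SET bill_number = 18
     WHERE id = 1001;  -- the bill being emitted

    COMMIT;

Because the counter row stays locked until COMMIT, two application containers emitting bills for the same society are serialized and cannot allocate the same number.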
I'd go for MAX(id)+1
You can get the next number in the sequence with a query like:
SELECT COALESCE(MAX(id),0) + 1 FROM bill
WHERE society = 'XYZ'
You'll have to take steps to ensure that two processes don't generate the same number and that can be complicated but not insurmountable.
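One hedged sketch of such a step: allocate and insert in a single statement, and back it with a UNIQUE index on (society, id) so the loser of any remaining race gets a duplicate-key error it can simply retry (the amount column is just an assumed example):

    INSERT INTO bill (society, id, amount)
    SELECT 'XYZ', COALESCE(MAX(id), 0) + 1, 100.00
      FROM bill
     WHERE society = 'XYZ';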
Personally, I would always avoid a trigger. I've never used a trigger without regretting it later.
I have a table which has an id (primary key with auto-increment), a uid (a key referring to a user's id, for example) and something else which doesn't matter for my question.
I want to make, let's call it, a different auto-increment key on id for each uid entry.
So, I will add an entry with uid 10, and the id field for this entry will be 1 because there were no previous entries with a value of 10 in uid. Then I will add a new one with uid 4, and its id will be 3 because there were already two entries with uid 4.
...A very obvious explanation, but I am trying to be as explanatory and clear as I can to demonstrate the idea... clearly.
What SQL engine can provide such functionality natively? (non Microsoft/Oracle based)
If there is none, how could I best replicate it? Triggers perhaps?
Does this functionality have a more suitable name?
In case you know about a non-SQL database engine providing such functionality, name it anyway; I am curious.
Thanks.
MySQL's MyISAM engine can do this. See their manual, in section Using AUTO_INCREMENT:
For MyISAM tables you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
The docs go on after that paragraph, showing an example.
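That example looks roughly like this; the id restarts from 1 within each grp:

    CREATE TABLE animals (
        grp  ENUM('fish','mammal','bird') NOT NULL,
        id   MEDIUMINT NOT NULL AUTO_INCREMENT,
        name CHAR(30) NOT NULL,
        PRIMARY KEY (grp, id)
    ) ENGINE=MyISAM;

    INSERT INTO animals (grp, name) VALUES
        ('mammal','dog'), ('mammal','cat'),
        ('bird','penguin'), ('fish','lax'),
        ('mammal','whale'), ('bird','ostrich');

    -- ids count up separately per grp
    SELECT * FROM animals ORDER BY grp, id;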
The InnoDB engine in MySQL does not support this feature, which is unfortunate because it's better to use InnoDB in almost all cases.
You can't emulate this behavior using triggers (or any SQL statements limited to transaction scope) without locking tables on INSERT. Consider this sequence of actions:
Mario starts transaction and inserts a new row for user 4.
Bill starts transaction and inserts a new row for user 4.
Mario's session fires a trigger to compute MAX(id)+1 for user 4. It gets 3.
Bill's session fires a trigger to compute MAX(id)+1 for user 4. It also gets 3.
Bill's session finishes his INSERT and commits.
Mario's session tries to finish his INSERT, but the row with (userid=4, id=3) now exists, so Mario gets a primary key conflict.
In general, you can't control the order of execution of these steps without some kind of synchronization.
The solutions to this are either:
Get an exclusive table lock. Before trying an INSERT, lock the table. This is necessary to prevent concurrent INSERTs from creating a race condition like in the example above. It's necessary to lock the whole table: since you're trying to restrict INSERT, there's no specific row to lock (if you were trying to govern access to a given row with UPDATE, you could lock just that specific row). But locking the table causes access to the table to become serial, which limits your throughput.
Do it outside transaction scope. Generate the id number in a way that won't be hidden from two concurrent transactions. By the way, this is what AUTO_INCREMENT does. Two concurrent sessions will each get a unique id value, regardless of their order of execution or order of commit. But tracking the last generated id per userid requires access to the database, or a duplicate data store. For example, a memcached key per userid, which can be incremented atomically.
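As a sketch of that second option in MySQL, without memcached: keep a counter table and update it with autocommit enabled (its own tiny transaction), using the LAST_INSERT_ID(expr) idiom from the MySQL manual so the read-and-increment is atomic. Table and column names here are illustrative:

    CREATE TABLE per_user_sequence (
        userid  INT NOT NULL PRIMARY KEY,
        last_id INT NOT NULL DEFAULT 0
    );

    -- Run with autocommit, outside the main transaction:
    UPDATE per_user_sequence
       SET last_id = LAST_INSERT_ID(last_id + 1)
     WHERE userid = 4;
    SELECT LAST_INSERT_ID();  -- the id just allocated for userid 4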
It's relatively easy to ensure that inserts get unique values. But it's hard to ensure they will get consecutive ordinal values. Also consider:
What happens if you INSERT in a transaction but then roll back? You've allocated id value 3 in that transaction, and then I allocated value 4, so if you roll back and I commit, now there's a gap.
What happens if an INSERT fails because of other constraints on the table (e.g. another column is NOT NULL)? You could get gaps this way too.
If you ever DELETE a row, do you need to renumber all the following rows for the same userid? What does that do to your memcached entries if you use that solution?
SQL Server should allow you to do this. If you can't implement this using a computed column (probably not - there are some restrictions), surely you can implement it in a trigger.
MySQL also would allow you to implement this via triggers.
In a comment you ask about efficiency. Unless you are dealing with extreme volumes, storing an 8-byte DATETIME isn't much of an overhead compared to using, for example, a 4-byte INT.
It also massively simplifies your data inserts, as well as being able to cope with records being deleted without creating 'holes' in your sequence.
If you DO need this, be careful with the field names. If you have uid and id in a table, I'd expect id to be unique in that table, and uid to refer to something else. Perhaps, instead, use the field names property_id and amendment_id.
In terms of implementation, there are generally two options.
1). A trigger
Implementations vary, but the logic remains the same. As you don't specify an RDBMS (other than not MS/Oracle), the general logic is simple (a sketch follows after this list)...
Start a transaction (often this is implicitly already started inside triggers)
Find the MAX(amendment_id) for the property_id being inserted
Update the newly inserted value with MAX(amendment_id) + 1
Commit the transaction
Things to be aware of are...
- multiple records being inserted at the same time
- records being inserted with amendment_id already populated
- updates altering existing records
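A minimal MySQL sketch of option 1, assuming a table amendments(property_id, amendment_id, ...) with a UNIQUE key on (property_id, amendment_id); it is subject to the caveats above, in particular concurrent inserts:

    DELIMITER //
    CREATE TRIGGER amendments_bi
    BEFORE INSERT ON amendments
    FOR EACH ROW
    BEGIN
        -- Only assign a value if the caller didn't supply one
        IF NEW.amendment_id IS NULL THEN
            SET NEW.amendment_id = (
                SELECT COALESCE(MAX(a.amendment_id), 0) + 1
                  FROM amendments a
                 WHERE a.property_id = NEW.property_id
            );
        END IF;
    END//
    DELIMITER ;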
2). A Stored Procedure
If you use a stored procedure to control writes to the table, you gain a lot more control.
Implicitly, you know you're only dealing with one record.
You simply don't provide a parameter for DEFAULT fields.
You know what updates / deletes can and can't happen.
You can implement all the business logic you like without hidden triggers
I personally recommend the Stored Procedure route, but triggers do work.
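A rough MySQL sketch of the stored procedure route; serializing callers per property_id with GET_LOCK is my own addition here (not the only way to do it), and there is no error handling:

    DELIMITER //
    CREATE PROCEDURE add_amendment(IN p_property_id INT, IN p_description TEXT)
    BEGIN
        -- Serialize callers working on the same property_id
        DO GET_LOCK(CONCAT('amendment_', p_property_id), 10);

        INSERT INTO amendments (property_id, amendment_id, description)
        SELECT p_property_id,
               COALESCE(MAX(a.amendment_id), 0) + 1,
               p_description
          FROM amendments a
         WHERE a.property_id = p_property_id;

        DO RELEASE_LOCK(CONCAT('amendment_', p_property_id));
    END//
    DELIMITER ;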
It is important to get your data types right.
What you are describing is a multi-part key. So use a multi-part key. Don't try to encode everything into a magic integer; you will poison the rest of your code.
If a record is identified by (entity_id,version_number) then embrace that description and use it directly instead of mangling the meaning of your keys. You will have to write queries which constrain the version number but that's OK. Databases are good at this sort of thing.
version_number could be a timestamp, as a_horse_with_no_name suggests. This is quite a good idea. There is no meaningful performance disadvantage to using timestamps instead of plain integers. What you gain is meaning, which is more important.
You could maintain a "latest version" table which contains, for each entity_id, only the record with the most-recent version_number. This will be more work for you, so only do it if you really need the performance.
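A small sketch of that shape, with a timestamp as the version (names are illustrative):

    CREATE TABLE entity_versions (
        entity_id      INT       NOT NULL,
        version_number TIMESTAMP NOT NULL,
        payload        TEXT,
        PRIMARY KEY (entity_id, version_number)
    );

    -- "Latest version of entity 42":
    SELECT *
      FROM entity_versions
     WHERE entity_id = 42
     ORDER BY version_number DESC
     LIMIT 1;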
Together with my team, I am working on functionality to generate invoice numbers. The requirements say that:
there should be no gaps between invoice numbers
the numbers should start from 0 every year (together with the year they will form a unique key)
the invoice numbers should grow according to the creation time of the invoices
We are using PHP and Postgres. We thought to implement this in the following way:
each time a new invoice is persisted on the database we use a BEFORE INSERT trigger
the trigger executes a function that retrieves a new value from a postgres sequence and writes it on the invoice as its number
Considering that multiple invoices could be created during the same transaction, my question is: is this a sufficiently safe approach? What are its flaws? How would you suggest to improve it?
Introduction
I believe the most crucial point here is:
there should be no gaps between invoice numbers
In this case you cannot use a sequence or an auto-increment field (as others propose in the comments). An auto-increment field uses a sequence under the hood, and the nextval(regclass) function increments the sequence's counter regardless of whether the transaction succeeds or fails (as you point out yourself).
Update:
What I mean is that you shouldn't use sequences at all; in particular, the solution you propose doesn't eliminate the possibility of gaps. Your trigger gets a new sequence value, but the INSERT could still fail.
Sequences work this way because they are mainly meant to be used for generating PRIMARY KEY and OID values, where uniqueness and a non-blocking mechanism are the ultimate goals and gaps between values are really no big deal.
In your case, however, the priorities may be different, and there are a couple of things to consider.
Simple solution
A first possible solution to your problem is to compute the new number as the maximum of the currently existing ones plus one. It can be done in your trigger:

NEW.invoice_number = COALESCE(
    (SELECT foo.invoice_number
       FROM invoices foo
      WHERE foo._year = NEW._year
      ORDER BY foo.invoice_number DESC NULLS LAST
      LIMIT 1),
    0) + 1; /*query 1*/
This query can use your composite UNIQUE INDEX if it was created with the "proper" syntax and column order, i.e. with the "year" column in first place, e.g.:
CREATE UNIQUE INDEX invoice_number_unique
ON invoices (_year, invoice_number DESC NULLS LAST);
In PostgreSQL, UNIQUE CONSTRAINTs are implemented simply as UNIQUE INDEXes, so most of the time there is no difference which command you use. However, the particular syntax presented above makes it possible to define the order of the index. It's a really nice trick which makes /*query 1*/ quicker than a simple SELECT max(invoice_number) FROM invoices WHERE _year = NEW._year as the invoices table gets bigger.
This is a simple solution but it has one big drawback: there is the possibility of a race condition when two transactions try to insert an invoice at the same time. Both could acquire the same max value, and the UNIQUE CONSTRAINT will prevent the second one from committing. Despite that, it could be sufficient in a small system with a special insert policy.
Better solution
You may create a table
CREATE TABLE invoice_numbers(
_year INTEGER NOT NULL PRIMARY KEY,
next_number_within_year INTEGER
);
to store the next available number for each year. Then, in an AFTER INSERT trigger, you could:
Lock invoice_numbers so that no other transaction can even read the number: LOCK TABLE invoice_numbers IN ACCESS EXCLUSIVE MODE;
Get the new invoice number: new_invoice_number = (SELECT foo.next_number_within_year FROM invoice_numbers foo WHERE foo._year = NEW._year);
Update the number value of the newly added invoice row
Increment the counter: UPDATE invoice_numbers SET next_number_within_year = next_number_within_year + 1 WHERE _year = NEW._year;
Because the table lock is held by the transaction until it commits, this should probably be the last trigger fired (see the PostgreSQL documentation on trigger execution order).
Update:
Instead of locking the whole table with the LOCK command, check the link provided by Craig Ringer.
The drawback in this case is an INSERT performance drop: only one transaction at a time can perform an insert.
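A rough PL/pgSQL sketch of those steps, assuming the invoices table has a surrogate primary key id and that a row for NEW._year already exists in invoice_numbers; this is a sketch of the locking approach above, not production code:

    CREATE OR REPLACE FUNCTION assign_invoice_number() RETURNS trigger AS $$
    BEGIN
        -- Lock first, so no other transaction can read the counter
        LOCK TABLE invoice_numbers IN ACCESS EXCLUSIVE MODE;

        -- Read the next number and put it on the newly inserted invoice row
        UPDATE invoices
           SET invoice_number = (SELECT next_number_within_year
                                   FROM invoice_numbers
                                  WHERE _year = NEW._year)
         WHERE id = NEW.id;

        -- Increment the counter for that year
        UPDATE invoice_numbers
           SET next_number_within_year = next_number_within_year + 1
         WHERE _year = NEW._year;

        RETURN NULL;  -- the return value is ignored for AFTER row triggers
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER invoice_number_assignment
    AFTER INSERT ON invoices
    FOR EACH ROW EXECUTE PROCEDURE assign_invoice_number();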
My setup:
MySQL and PHP
System Scenario:
I have more than 10 types of system users:
For example: Customer and Employee.
Every time a customer or employee is added to the system, the system will automatically generate an ID for each user based on the current date.
Ex (Customer):
Today is June 20, 2015 and this customer is the 3rd to sign up, so his ID would be 06202015-03. Every time a user (of any type) signs up, the sequence number increments by 1, on a per-day basis only. At the start of each new day the sequence counter resets.
General Question: Given that my concern about ID generation is solved, is it good practice to pre-compute the next sequence number, i.e. the system just pulls the next sequence number from a db table? Or should I only compute the next sequence number at the moment a new user signs up?
UPDATE (Added best possible scenario):
Example Date: June 20,2015
Customer 1 signup = Generated ID would be 06202015-01
Customer 2 signup = Generated ID would be 06202015-02
and so on...
Worst possible scenario during signup:
2 or more user signing up simoltaneously
If customer1 is deleted (by admin) on that same day and customer2 signed up, the customer 2 should get the #1 id (06202015-01) and not *-02 as the customer1 is being deleted already.
I would like to know the best way to generate a sequence number efficiently:
1. Would a stored procedure be the best fit for this, or should I use #2? (see below)
2. Is it good practice to just compute the next sequence number (using a PHP function) every time a user signs up?
I think #2 is the easiest way to handle auto ID generation, but what if 2 or more users sign up simultaneously?
As of my latest update, the sequence is obviously predictable. My only concern is the best or most efficient way to get the sequence number: through a stored procedure or through a PHP function, given the worst-case scenarios stated above.
General Question: Given that my concern about ID generation is solved, is it good practice to pre-compute the next sequence number, i.e. the system just pulls the next sequence number from a db table? Or should I only compute the next sequence number at the moment a new user signs up?
If the id is dependent on the date a user signs up, you can't predict the next id because you don't know when the next user will sign up (unless you are a clairvoyant).
To make it easier to obtain the next value, I would split the id into two columns, a column with the date and a column with the sequence; then you can use:
IFNULL((SELECT MAX(sequence) FROM usertable WHERE signup_date = CURRENT_DATE), 0) + 1
IMO there's no best practice; it's a personal preference.
There's also a third option, a BEFORE INSERT trigger.
To avoid duplicates, add a unique index over both columns.
In addition you can lock the table:
LOCK TABLES usertable WRITE;
/* CALL(sproc) or INSERT statement, or SELECT and INSERT statements */
UNLOCK TABLES;
With a write lock, no other session can access the table until the lock is released (it will wait).
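Putting those pieces together in MySQL (table and column names here are assumptions):

    LOCK TABLES usertable WRITE;

    SET @next_seq = IFNULL(
        (SELECT MAX(sequence) FROM usertable WHERE signup_date = CURRENT_DATE), 0) + 1;

    INSERT INTO usertable (signup_date, sequence, name)
    VALUES (CURRENT_DATE, @next_seq, 'Alice');

    UNLOCK TABLES;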
I am trying to develop a system to assign room numbers to tenants of a hostel upon registration, using the auto-increment feature of SQL.
However, it automatically increases by one after every entry. Because the hostel accommodates four people in one room, I want the id/room number to increase only after every 4 entries.
How do I go about this? I am using PHP and SQL. If the auto-increment feature can't do this, can you please suggest another way to achieve it? Thanks.
You would need:
http://dev.mysql.com/doc/refman/5.1/en/replication-options-master.html#sysvar_auto_increment_increment
It works like this:
mysql> SET @@auto_increment_increment=4;
So when you insert 4 rows, the auto-increment column will be:
4, 8, 12, 16 (assuming auto_increment_offset is also set to 4; with the default offset of 1 you would get 1, 5, 9, 13)
To the best of my knowledge you cannot change the step of an auto-increment field. I suggest adding another field and writing a trigger to update its value based on the auto-increment field (e.g. CEIL(auto_increment_value / 4)).
I don't think this is possible with auto-increment alone.
Maybe you can do something like this:
//Pseudo code
//First, count how many tenants already share the highest room id:
SELECT COUNT(*) FROM table WHERE id=(SELECT id FROM table ORDER BY id DESC LIMIT 1)
//If the result of that query is >= 4, insert the next customer with id + 1;
//otherwise insert them with the current highest id (the room still has space).
Don't use auto_increment for this - it can't handle a situation where multiple records will share the same number and although you can reset it manually (see below) it's also not designed for a situation where numbers may get reused in a random order.
You could just have a room_number field with one of the mysql integer types (e.g. tinyint, smallint, mediumint…) or you could separate your database into two tables, one for people (each of whom have an id) and a second to map those ids to rooms.
However you do it, you'd then write a select query to check which room numbers are available before you add the person's details to the database.
You may need to read up on relational databases if that doesn't sound very clear.
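For example, the two-table idea might look roughly like this (all names here are illustrative):

    CREATE TABLE people (
        id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(100) NOT NULL
    );

    CREATE TABLE room_assignments (
        person_id   INT NOT NULL PRIMARY KEY,
        room_number SMALLINT NOT NULL,
        FOREIGN KEY (person_id) REFERENCES people (id)
    );

    -- Rooms that still have fewer than 4 occupants:
    SELECT room_number
      FROM room_assignments
     GROUP BY room_number
    HAVING COUNT(*) < 4;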
If you do need to reset the auto_increment (sometimes it's nice to do it if you've filled a database with test data which you're about to wipe, and you want the real "production" data to begin at 1) you can use:
ALTER TABLE [tablename] AUTO_INCREMENT = 1
https://dev.mysql.com/doc/refman/5.0/en/example-auto-increment.html
I'm trying to build a very simple login system for my site (just for practice for a project I'm working on). The way I've decided to implement it is to use a table with fields for ID, Name, Password, and username, and to search that table for the entered information.
For registration, it simply inserts the supplied information into the table, and I would like to assign a customer ID number. My idea for assigning an ID number is to simply find the size of the ID column (which will contain the IDs 1, 2, 3, etc. up to the end) and assign the new registration the length + 1. For this purpose I'll need a way to get the size of the column, but I'm just learning PHP and SQL so I'm not sure what the syntax would be.
TL;DR: is there a function in SQL that I can use from PHP to get the length of a particular column (i.e. the number of entries stored in that column)?
Set the ID column to PRIMARY KEY with AUTO_INCREMENT.
You don't include it in your query; it is generated on its own.
You'd probably be better off just using an IDENTITY or AUTO_INCREMENT column. The problem with checking for the "size of the column" (by which I assume you mean the count of rows in that column) is that you could end up inserting duplicate IDs, for example:
ID | ...
---------
1
2
4
So if you did a SELECT COUNT(ID)+1 FROM MyTable, it would return 4, and you have an ID collision.
You could do something like SELECT MAX(ID)+1 FROM MyTable, but even then there could be concurrency problems (process A and process B both try to run that query at the same time, before either has a chance to insert the new ID of 5). You're really best off just letting your RDBMS take care of it.
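A minimal sketch of that approach, using the column names from the question (the hash value is just a placeholder):

    CREATE TABLE users (
        id       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        username VARCHAR(50)  NOT NULL UNIQUE,
        name     VARCHAR(100) NOT NULL,
        password VARCHAR(255) NOT NULL
    );

    INSERT INTO users (username, name, password)
    VALUES ('alice', 'Alice', 'hashed-password-here');

    -- The id the database just assigned to this row:
    SELECT LAST_INSERT_ID();

From PHP, PDO's lastInsertId() method returns the same value right after the INSERT.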