My setup:
MySQL and PHP
System Scenario:
I have more than 10 types of system users,
for example customers and employees.
Every time a customer or employee is added to the system, the system automatically generates an ID for that user based on the current date.
Ex (Customer):
Today is June 20, 2015 and this customer is the 3rd to sign up, so his
ID would be 06202015-03. Every time a user (of any type) signs up,
the sequence number increments by 1, on a per-day basis only. On each new day
the sequence counter resets to 0.
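For illustration, the ID format described above can be sketched in a few lines. This is written in Python rather than PHP, and the function name `format_user_id` is made up for the example. It assumes the sequence is zero-padded to two digits, as in 06202015-03.

```python
from datetime import date


def format_user_id(signup_date: date, sequence: int) -> str:
    """Build an ID such as 06202015-03 from a signup date and a daily sequence number.

    Assumption: the sequence is zero-padded to two digits; values above 99
    simply grow to three digits.
    """
    return f"{signup_date:%m%d%Y}-{sequence:02d}"


print(format_user_id(date(2015, 6, 20), 3))  # 06202015-03
```

Only the sequence part needs any coordination between concurrent signups; the date part can always be derived at insert time.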
General question: Assuming my ID generation concern is solved, is it good practice to pre-compute the next sequence number, so that the system just pulls the next sequence number saved in a db table? Or should I compute the next sequence number only when a new user actually signs up?
UPDATE (Added best possible scenario) :
Example Date: June 20,2015
Customer 1 signup = Generated ID would be 06202015-01
Customer 2 signup = Generated ID would be 06202015-02
and so on...
Worst possible scenario during signup:
2 or more users signing up simultaneously
If customer1 is deleted (by an admin) on that same day and customer2 then signs up, customer2 should get the #1 ID (06202015-01) and not *-02, because customer1 has already been deleted.
I would like to know the best way to generate a sequence number efficiently:
1. Would a stored procedure be the best fit for this, or should I use #2 (see below)?
2. Is it good practice to just compute the next sequence number (using a PHP function) every time a user signs up?
I think process #2 is the best and easiest way to handle automatic ID generation, but what if 2 or more users
sign up simultaneously?
After my latest update, the sequence is obviously predictable. My only concern is the best or most efficient way to get the sequence number: through a stored procedure or through a PHP function, given the worst-case scenarios stated above.
General question: Assuming my ID generation concern is solved, is it good practice to pre-compute the next sequence number, so that the system just pulls the next sequence number saved in a db table? Or should I compute the next sequence number only when a new user actually signs up?
If the id depends on the date a user signs up, you can't predict the next id, because you don't know when the next user will sign up (unless you are clairvoyant).
To make it easier to obtain the next value, I would split the id into two columns, one with the date and one with the sequence. Then you can use:
IFNULL((SELECT MAX(sequence) FROM usertable WHERE signup_date = CURRENT_DATE), 0) + 1
IMO there's no best practice here; it's a personal preference.
There's also a third option: a BEFORE INSERT trigger.
To avoid duplicates, add a unique index over both columns.
In addition you can lock the table:
LOCK TABLES user_table WRITE;
/* CALL(sproc) or INSERT statement, or SELECT and INSERT statements */
UNLOCK TABLES;
With a write lock, no other session can access the table until the lock is released (other sessions wait).
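A minimal sketch of the two-column approach, with loud assumptions: it uses Python's built-in `sqlite3` as a stand-in for MySQL, the `name` column and the function names are hypothetical, and the `user_table`, `signup_date` and `sequence` names follow the answer above. The next sequence is computed inside the INSERT itself, and the unique index is the safety net that rejects a duplicate if two sessions ever compute the same value.

```python
import sqlite3


def make_user_db():
    """Create an in-memory table with the date and the sequence in separate columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_table ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " signup_date TEXT NOT NULL,"
        " sequence INTEGER NOT NULL,"
        " UNIQUE (signup_date, sequence))"  # rejects duplicate (date, sequence) pairs
    )
    return conn


def sign_up(conn, name, signup_date):
    """Insert a user and return the daily sequence number that was assigned."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO user_table (name, signup_date, sequence) "
            "SELECT ?, ?, IFNULL(MAX(sequence), 0) + 1 "
            "FROM user_table WHERE signup_date = ?",
            (name, signup_date, signup_date),
        )
        (sequence,) = conn.execute(
            "SELECT sequence FROM user_table WHERE id = ?", (cur.lastrowid,)
        ).fetchone()
    return sequence
```

If the unique index rejects an insert, the caller can simply retry. Note that MAX(sequence) + 1 only reuses a deleted number when the deleted row held the highest sequence of the day.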
What is the best way to generate consecutive values when you have a load-balanced database and multiple instances of your application?
For example, I have a load-balanced MySQL database.
My PHP application is deployed with Docker and runs in 3 containers.
I have to generate consecutive ids. I cannot use auto increment because the ids must be unique per relation (for example, I have to generate a unique bill number depending on which society the bill is related to).
A bill can be generated without being emitted. I must generate the unique value only when the bill is emitted.
Is an ON UPDATE trigger a good solution or not?
Thanks for your answers.
I would save the current id in a db table.
Each time you want to increase the id, do the following:
Start a transaction
Lock the id row in the db table (in MySQL, use SELECT ... FOR UPDATE)
Read the current id
Increase the id
generate the bill with the id
Store the id back to the db
Commit the transaction
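The steps above can be sketched as follows. This is an illustration under stated assumptions, not a drop-in implementation: it uses Python's `sqlite3` instead of PHP and MySQL, `BEGIN IMMEDIATE` stands in for the row lock that `SELECT ... FOR UPDATE` would take in MySQL, `INSERT OR REPLACE` is SQLite syntax, and the `bill_counter` and `bill` tables and their columns are hypothetical.

```python
import sqlite3


def make_bill_db():
    # isolation_level=None: the sketch issues BEGIN/COMMIT itself.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE bill_counter (society TEXT PRIMARY KEY, current_id INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE bill (society TEXT NOT NULL, bill_number INTEGER NOT NULL,"
        " UNIQUE (society, bill_number))"
    )
    return conn


def emit_bill(conn, society):
    """Assign the next consecutive bill number for one society inside a transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the counter
    try:
        row = conn.execute(
            "SELECT current_id FROM bill_counter WHERE society = ?", (society,)
        ).fetchone()
        next_id = (row[0] if row else 0) + 1  # read the current id and increase it
        conn.execute(
            "INSERT INTO bill (society, bill_number) VALUES (?, ?)", (society, next_id)
        )
        conn.execute(
            "INSERT OR REPLACE INTO bill_counter (society, current_id) VALUES (?, ?)",
            (society, next_id),
        )  # store the id back
        conn.execute("COMMIT")
        return next_id
    except Exception:
        conn.execute("ROLLBACK")  # a failed bill does not consume a number
        raise
```

Because the counter update and the bill insert commit together, a rollback leaves no gap in the numbering.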
I'd go for MAX(id)+1
You can get the next number in the sequence with a query like:
SELECT COALESCE(MAX(id),0) + 1 FROM bill
WHERE society = 'XYZ'
You'll have to take steps to ensure that two processes don't generate the same number, which can be complicated but is not insurmountable.
Personally, I would always avoid a trigger. I've never used a trigger and not regretted it later.
I have a query that, on every user purchase, gets the currently highest receipt_counter number from the receipts table in order to create a new receipt. receipt_counter is not unique in the table because it resets every year.
receipt_counter is just an integer that is used in generating receipt_label that looks like "pos_id"-"receipt_counter".
There is a possibility that people can buy a product simultaneously on the same point of sale (pos_id).
Function that gets new receipt_counter looks like this:
SELECT (MAX(receipt_counter) + 1) as next_receipt_counter FROM receipts
The problem is that when multiple people buy a product simultaneously, which triggers generating a new receipt (along with a receipt number), a collision sometimes occurs (multiple people get the same receipt number), because there is some delay between retrieving the receipt counter and inserting the new receipt into the DB.
Is there a best practice for dealing with this kind of problem? Do I need to use some kind of locking, or is my initial idea flawed, so that I need to change my tactic for generating the receipt counter altogether?
EDIT: receipt_counter needs to be a sequential number without gaps.
there is some delay between retrieving receipt counter and inserting new receipt into DB
You can change your software so that, instead of retrieving the ID without creating the actual receipt, it creates the receipt first (with a "pending" state or something like this) and then retrieves its ID. At the point where you currently create the receipt, you would just set its status to "active" or something similar.
Doing it this way gets rid of the time gap between getting an ID and storing the record, which, from my point of view, is the main source of your problems.
You can create a separate table for ids only and enable auto_increment on its id column. Then add a receipt in 2 steps: first add a new record to the id table to get back a generated id, then add the actual receipt using that id. When you want to reset the counter, just truncate the id table.
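A rough sketch of the separate id table idea, assuming Python's `sqlite3` in place of MySQL. SQLite's `AUTOINCREMENT` stands in for MySQL's auto_increment, and deleting the table's row from `sqlite_sequence` plays the role of TRUNCATE. The `receipt_ids` table and the function names are hypothetical.

```python
import sqlite3


def make_receipt_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE receipt_ids (id INTEGER PRIMARY KEY AUTOINCREMENT)")
    conn.execute(
        "CREATE TABLE receipts (pos_id INTEGER NOT NULL, receipt_counter INTEGER NOT NULL)"
    )
    return conn


def new_receipt(conn, pos_id):
    """Step 1: let the id table hand out a number. Step 2: store the receipt with it."""
    with conn:  # both steps commit together or not at all
        counter = conn.execute("INSERT INTO receipt_ids DEFAULT VALUES").lastrowid
        conn.execute(
            "INSERT INTO receipts (pos_id, receipt_counter) VALUES (?, ?)",
            (pos_id, counter),
        )
    return f"{pos_id}-{counter}"


def reset_counter(conn):
    """Yearly reset: empty the id table and forget its high-water mark."""
    with conn:
        conn.execute("DELETE FROM receipt_ids")
        conn.execute("DELETE FROM sqlite_sequence WHERE name = 'receipt_ids'")
```

One caveat: in MySQL an auto_increment value is not given back when a transaction rolls back, so this approach alone does not guarantee a gap-free sequence.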
Does the receipt_counter need to be an increasing number without gaps?
If an increasing large number with gaps is okay, how about generating a number out of the current date/time? If you go down to milliseconds or nanoseconds, the chance of a collision is pretty low.
For example:
2013-11-13 13:08:15.012 -> 1113130815012
(I omitted the year because you said the number is reset every year anyway)
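The conversion in the example can be written out as a small Python sketch. The function name is hypothetical, and millisecond resolution is assumed, matching the example above.

```python
from datetime import datetime


def counter_from_timestamp(ts: datetime) -> int:
    """Turn a timestamp into a number of the form MMDDhhmmssSSS (year omitted)."""
    millis = ts.microsecond // 1000
    return int(f"{ts:%m%d%H%M%S}{millis:03d}")


# 2013-11-13 13:08:15.012
print(counter_from_timestamp(datetime(2013, 11, 13, 13, 8, 15, 12000)))  # 1113130815012
```

Two receipts created within the same millisecond would still collide, so a unique index on the column remains a sensible safety net.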
I have about 50k entries that I am inserting into MySQL via PHP. The table is simple, with an ID (auto-increment) and a few VARCHAR fields...
This system allows multiple users to log in at the same time and perform the same insert operation. So let's say USER1 starts the "insert" and in the very same millisecond USER2 starts inserting data into the same table. I am curious whether MySQL will wait for USER1 to finish and then process USER2's entries, or whether it will do the inserts simultaneously.
Following that logic, will USER1's insert IDs run from 1 to 50k and USER2's from 50k to 100k, or, on the other hand, will it shuffle them?
...and if it does "shuffle" them, is there a way to prevent this?
P.S. I know I can add an additional user_id field to distinguish which user made each entry, but in this case I would really like to reserve the range 1-50k for USER1 and 50k-100k for USER2...
To reserve the table and thereby the auto increment ids for yourself you should LOCK the table before beginning your inserts. Otherwise "shuffling" of ids is very likely.
Note though that nothing, even locking, guarantees the continuity of autoincrement ids. I understand that it would be "a nice touch" to have a block of inserts with continuous ids, but that's not what they are for and there are no guarantees in the system to make it so. So don't rely on it or expect it in any way.
You shouldn't care about that.
An id is just a uniquely created identifier with no special meaning.
Any time you put special meaning on this identifier you will face a disaster.
So my app needs to let users generate random alphanumeric codes like A6BU31, 38QV3B, R6RK7T. Currently they consist of 6 characters, and the letters I and O are not used (so we have 34^6 possibilities). These codes are then printed out and used for something else.
I must now ensure that many users can "reserve" up to 100 codes per request, so user A might want to get 50 codes, user B wants to generate 10 and so on. These codes must be unique across all users, so user A and user B may not both receive the code ABC123.
My current approach (using PHP and MySQL) is to have two InnoDB tables for this:
One (the "repository") contains a large list of pre-generated codes (since the possibility of collisions will increase over time and I do not want to go the try-insert-if-fails-try-another-code approach). The repository contains just the codes and an auto-incremented ID (so I can sort them, see below).
The other table holds the reserved keys (i.e. code + owning user).
Whenever a user wants to reserve N keys, I planned to do the following
BEGIN;
INSERT INTO reserved_codes (code, user_id)
SELECT code, ? FROM repository WHERE 1 ORDER BY id LIMIT N; -- ? is the id of the reserving user
DELETE FROM repository WHERE 1 ORDER BY id LIMIT N;
COMMIT;
This should work, but I'm not sure. It seems like I'm building a WTF solution.
After insertion I must select the just reserved codes to display them to the user. And that's the tricky part, since I don't really know how to identify the just reserved codes after my transaction is done. I could of course add just another column to my reserved_codes table, holding some kind of random token, but this seems even more WTFy.
My favorite solution would be to have a random number sequence, so that I can just perform INSERT operations in the reserved_codes table.
So, how to do this unique, random and transactional-safe sequence in MySQL? One idea was to have a regular auto-increment on the reserved_codes table and derive the random code value from that numeric column, but I was wondering whether there was a better way.
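The idea of deriving the code from an auto-increment value can be sketched like this. This is only one possible mapping, not something from the question: it multiplies the id by a constant that shares no factor with 34, adds an offset, reduces modulo 34^6, and writes the result in base 34. Such a mapping is a bijection, so distinct ids always give distinct codes. `MULTIPLIER` and `OFFSET` are made-up constants, and the result only looks random; anyone who knows the constants can predict it.

```python
ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"  # 34 symbols: digits and letters without I and O
CODE_LENGTH = 6
SPACE = len(ALPHABET) ** CODE_LENGTH  # 34**6 possible codes

# Hypothetical constants. MULTIPLIER must be odd and not divisible by 17,
# so that it shares no factor with 34 and the mapping stays one-to-one.
MULTIPLIER = 1000003
OFFSET = 123456789


def code_from_id(n: int) -> str:
    """Map an auto-increment id to a unique, scrambled-looking 6-character code."""
    if not 0 <= n < SPACE:
        raise ValueError("id outside the range of available codes")
    x = (n * MULTIPLIER + OFFSET) % SPACE
    chars = []
    for _ in range(CODE_LENGTH):
        x, digit = divmod(x, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))
```

With this approach only the reserved_codes table is needed: each INSERT gets its id from auto-increment, and the code is computed from that id, so no pre-generated repository has to be stored.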
UPDATE: I forgot to mention that it would be advantageous to keep the table of reserved codes rather small, as I later have to find single codes again to update them (reserved_codes has a couple more attributes). So letting the reserved table grow slowly is good (instead of having a huge index over ~1 million pre-generated codes).
If you already have a repository table, I would just add a user column and then run this query:
UPDATE repository SET user_id = ? WHERE user_id IS NULL LIMIT N;
Afterwards, you can select the records again. This has two distinct disadvantages:
you need an index on user_id
you can't use the codes in your table for anything else but binding it to users.
In a table of Users, I want to keep track of the time of day each user logs in as running totals. For example
UserID  midnightTo6am  6amToNoon  noonTo6pm  6pmToMidnight
User1   3              2          7          1
User2   4              9          1          8
Note that this is part of a larger table that contains more information about a user, such as address and gender, hair color, etc, etc.
In this example, what is the best way to store this data? Should it be part of the users table, despite knowing that not every user will log in during every period (a user may never log in between 6am and noon)? Or is this table a 1NF failure because of repeating columns that should be moved to a separate table?
If stored as part of the Users Table, there may be empty cells that never get populated with data because the user never logs in at that time.
If this data is a 1NF failure and the data is to be put in a separate table, how would I ensure that a +1 for a certain time goes smoothly? Would I search for the user in the separate table to see if they have logged in at that time before and +1? Or add a column to that table if it is their first time logging in during that time period?
Any clarifications or other solutions are welcome!
I would recommend storing the login events either in a file based log or in a simple table with just the userid and DATETIME of the login.
Once a day, or however often you need to report on the data you illustrated in your question, aggregate that data up into a table in the shape that you want. This way you're not throwing away any raw data and can always reaggregate for different periods, by hour, etc at a later date.
Addition: I suspect that the fastest way of deriving the aggregated data would be to run a range query for each of your aggregation periods, so you're searching for (e.g.) login dates in the range 2011-12-25 00:00:00 - 2011-12-25 03:00:00. If you go with that approach, an index on (datetime, user_id) would work well. It seems counter-intuitive, as you want to do stuff on a user-centric basis, but the leading DATETIME column of the index would allow easy finding of the rows, and the trailing user_id column would then allow for fast grouping.
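As a small illustration of the aggregation step, the sketch below rolls raw (user id, login time) events up into the four periods from the question. It is plain Python rather than SQL, the function name is hypothetical, and it assumes the events have already been read from the log or table.

```python
from collections import Counter
from datetime import datetime

# Period names taken from the table in the question; each covers six hours.
BUCKETS = ["midnightTo6am", "6amToNoon", "noonTo6pm", "6pmToMidnight"]


def aggregate_logins(events):
    """Count logins per user and period from an iterable of (user_id, datetime) pairs."""
    totals = {}
    for user_id, ts in events:
        bucket = BUCKETS[ts.hour // 6]  # hours 0-5 -> 0, 6-11 -> 1, 12-17 -> 2, 18-23 -> 3
        totals.setdefault(user_id, Counter())[bucket] += 1
    return totals


events = [
    ("User1", datetime(2011, 12, 25, 2, 30)),
    ("User1", datetime(2011, 12, 25, 14, 5)),
    ("User2", datetime(2011, 12, 25, 23, 59)),
]
totals = aggregate_logins(events)
```

Because the raw events are kept, the same data can later be regrouped by hour or by any other period simply by changing how the bucket is computed.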
A couple of things. Firstly, this is not a violation of 1NF; doing it as 4 columns may in fact be acceptable. Secondly, if you do go with this design, you should not use nulls; use zero instead (with the possible exception of existing records). Thirdly, whether you should use this design or split it into another table (or two) depends on your purpose and usage. If your standard use of the table does not make use of this information, it should go into another table with a 1-to-1 relationship. If you may need to increase the granularity of the login times, then you should use another table. Finally, if you do split this off into another table with a timestamp, give some consideration to privacy.