I am designing a system with multiple shops, where each shop should have its own set of sequential numbers for its invoices. My primary ID column will obviously be sequential across all invoices in the system, so I will need another column to store this "shop specific" invoice number.
What is the best way to store and fetch the next value for this shop-specific number? For example, would it be safe to simply read it from the invoices table with something like SELECT MAX(INV_NUM) FROM INVOICES WHERE SHOP_ID = #, add one, and then create the new invoice record? Or would there be timing issues if two instances of my script ran at the same time? For example, the first script fetches the next number, but before it gets the chance to create the new invoice record, the second instance requests the next number and gets the same one as the first script.
I was then thinking about storing the last used number in a separate table: as soon as I request the next invoice number, I immediately write the new value back and persist it, so that the window between fetching the next number and creating the record the next request relies on is kept to an absolute minimum. Just a few lines of code:
$nextId = $shop->getLastId() + 1;
$shop->setLastId($nextId);
$em->persist($shop);
$em->flush();
Invoices
---------------------------
| ID | INV_NUM | SHOP_ID |
---------------------------
|  1 |      99 |       1 |
|  2 |     100 |       2 |
|  3 |     100 |       1 |
---------------------------
Shops
-----------------
| ID | LAST_ID |
-----------------
|  1 |     100 |
|  2 |     100 |
-----------------
If you're using Doctrine, which I assume you are since you're using Symfony, then you can use lifecycle events to listen for changes in your entities. Before saving, you can then update your second column to the incremented value.
Regarding race conditions: to be sure you don't end up with bad data in your database, you can put a unique constraint on the combination of shop ID and invoice number.
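At the SQL level, a minimal sketch of that constraint plus a lock-based way of handing out the next number (assuming InnoDB and the table and column names from the example above) could look like this:
ALTER TABLE Invoices
    ADD CONSTRAINT uq_shop_invoice UNIQUE (SHOP_ID, INV_NUM);
-- Allocate the next number inside a transaction; FOR UPDATE blocks a
-- concurrent request for the same shop until this transaction commits.
START TRANSACTION;
SELECT LAST_ID + 1 INTO @next_num
FROM Shops
WHERE ID = 1
FOR UPDATE;
UPDATE Shops SET LAST_ID = @next_num WHERE ID = 1;
INSERT INTO Invoices (INV_NUM, SHOP_ID) VALUES (@next_num, 1);
COMMIT;
If two requests race, the second one either waits on the row lock or, at worst, hits the unique constraint instead of silently duplicating a number.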
When I started designing my application database schema a few months ago, I was told not to store the same data or calculated data in more than one place in the database (normalization). If I do, I create room for bugs where I update the data in one place and leave the other place without updating. So I made an orders table and an orderDetail table, something like this:
-- orders table
+-----+---------+----------+
| ID | clintID | date |
+-----+---------+----------+
| 1 | 1 |2018-02-22|
| 2 | 1 |2018-02-23|
| 3 | 2 |2018-02-24|
+-----+---------+----------+
-- orderDetail table
+-----+---------+------------+----------+----------+
| ID | orderID | itemNumber | quantity | unitPrice|
+-----+---------+------------+----------+----------+
| 1 | 1 | 12345 | 3 | 100.75 |
| 2 | 1 | 12346 | 3 | 100.75 |
| 3 | 2 | 12347 | 3 | 100.75 |
| 4 | 2 | 12345 | 3 | 100.75 |
| 5 | 3 | 12347 | 3 | 100.75 |
| 6 | 3 | 12345 | 3 | 100.75 |
+-----+---------+------------+----------+----------+
And to make the queries easier for me, I made a view "allOrdersSummary" like:
-- allOrdersSummary
SELECT orders.*,
       SUM(orderDetail.quantity * orderDetail.unitPrice) AS totalAmount
FROM orders
INNER JOIN orderDetail ON orders.ID = orderDetail.orderID
GROUP BY orders.ID;
and I used this view later for my queries, but now I started to get the MAX_JOIN_SIZE error.
So I thought of saving the calculated total order amount in the orders table (ID, clintID, date, totalAmount), and whenever I change something in the orderDetail table I would update the calculated totalAmount column in the orders table. I don't know if this is good or bad!
This situation (I don't know if it is even considered a problem) comes up many times. For example, to know how many unread messages the client making the request has, I have to do something like select count(*) as unread from messages where `to` = ? and isRead = 0.
A) Should I make another column for the calculated totalAmount in the orders table, or is it a normal thing in databases to calculate totalAmount from the orderDetail table every time I need it?
B) If you recommend adding another column to the orders table, what is the best way to update it every time a change happens in the orderDetail table? Should I update it at the PHP layer whenever I update orderDetail, or is this something that needs a stored procedure?
Yes, it is normal to store values in a database that are pre-calculated from other data in that database. But not necessarily for the reason you mention; I never had a problem with MAX_JOIN_SIZE.
The main, and probably only, reason for storing calculated values is speed. So you do it for values that don't change that often and that may be used in queries that use a lot of data and may therefore be too slow if you didn't use them.
For instance: If you want to know the average value of all the orders in your database the query would be a lot faster if you already have the order totals.
Why, and how, you update the values is completely up to you. However, you have to be consistent about it. If you use the MVC pattern it would make sense to integrate it in the controller. Or in simple terms: whenever a form is submitted that could change one of the values out of which the pre-calculated value is computed, you need to recompute it.
This is a clear case where 'normalization' is not entirely maintained. It's not really pretty, but sometimes worth it. You could, of course, argue that the calculated value represents 'new' information and therefore does not violate 'normalization'.
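To illustrate the recompute step (assuming you add a totalAmount column to the orders table; the column definition here is only an example), the update after any change to orderDetail could look roughly like this:
-- Hypothetical denormalized column on orders
ALTER TABLE orders ADD COLUMN totalAmount DECIMAL(12,2) NOT NULL DEFAULT 0;
-- Recompute the stored total for the one order whose details just changed
UPDATE orders
SET totalAmount = (
        SELECT COALESCE(SUM(quantity * unitPrice), 0)
        FROM orderDetail
        WHERE orderDetail.orderID = orders.ID
    )
WHERE orders.ID = 1;   -- id of the affected order
Whether you run this from the PHP layer or wrap it in a stored procedure is up to you; the important part is that every code path that modifies orderDetail also runs it.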
You have an "inflate-deflate" problem.
The JOIN of the two tables builds a much larger intermediate table, and the GROUP BY then has to shrink it back to one row per row of the original (orders) table. Moving the SUM into a correlated subquery avoids the problem:
SELECT orders.*,
       ( SELECT SUM(quantity * unitPrice)
         FROM orderDetail
         WHERE orderDetail.orderID = orders.ID
       ) AS totalAmount
FROM orders;
Please let me know how your experience is with this one. It is one of the simplest examples of the inflate-deflate problem.
I have a PHP script pulling a JSON file that is static and updates every 10 seconds. It has details about some events that happen, and new events are simply added to the top of the file. I then insert them into a MySQL database.
Because I have to pull every event every time I pull the file, I only want to insert the new ones. The easy way would be to look each event up in the database first (the primary keys are not the same), but I am talking about ~4000 events every day, and I do not want that many queries just to check whether a row already exists.
I am aware of INSERT IGNORE, but it looks like it only uses the PRIMARY KEY to do this.
What can I do (preferably easily) to prevent duplicates on two keys?
Example:
I have a table events with the following columns:
ID (irrelevant, really)
event_id (that I need to store from the source)
action_id (many action_ids belong to one event_id)
timestamp
whatever...
And the data in my JSON comes out on the first pull like this:
event_id|action_id|...
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3
Then the next pull is this:
event_id|action_id|...
1 | 1
1 | 2
1 | 3
1** | 4**
1** | 5**
2 | 1
2 | 2
2 | 3
2** | 4**
I only want the rows marked with asterisks to be inserted, and the others to be ignored. Remember, the primary key column id is irrelevant in this table; I just keep it for consistency.
What command can I use to INSERT every event I pull, but only add those that aren't already present according to the two columns event_id and action_id?
Thanks.
Create a unique index across both columns.
CREATE UNIQUE INDEX event_action
    ON tablename (event_id, action_id);
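With that index in place, INSERT IGNORE (and INSERT ... ON DUPLICATE KEY UPDATE) respects any unique index, not just the primary key, so duplicate (event_id, action_id) pairs are skipped. A sketch, reusing the events table and columns from the question:
INSERT IGNORE INTO events (event_id, action_id, timestamp)
VALUES (1, 4, NOW()),
       (1, 5, NOW()),
       (2, 4, NOW());
-- Rows whose (event_id, action_id) pair already exists are silently skipped,
-- so only the starred rows from the second pull actually get inserted.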
I have one MySQL database with more than 5000 tables of the same type, as shown below, each having 5 to 60 entries, and I use PHP for all actions. Now I am considering changing this database to have only a single table holding all the entries (which would then contain about 1.5 million rows). The queries mostly performed are INSERT, CREATE TABLE, UPDATE, searching a table and retrieving all its values (in JSON format), and sometimes DELETE.
If I change the database to the second design with only one table, the required actions are INSERT, UPDATE, searching all matching rows for a specific cityname, and sometimes deleting an entry. I am confused about which of the two is the more optimal design. Please suggest how to optimize this database.
table1: id | name | nameinanotherlang | min   | max    | model | date
         1 | abcd | gdyugdedu         | 214.2 | 212.22 | 3212  | 2015-04-28
        ..... (5 to 60 entries)
table2: id | name | nameinanotherlang | min   | max    | model | date
         1 | abcd | gdyugdedu         | 214.2 | 212.22 | 3212  | 2015-04-28
        ..... (5 to 60 entries)
table..... (more than 5000 tables)
Another method, with only one table:
id | cityname | name | nameinanotherlang | min | max | model | date
...... (then it will be 1.5 million entries)
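For comparison, here is a sketch of the consolidated single-table design. The table name, column types, and index are assumptions; only the column list comes from the question. An index on cityname keeps the per-city queries from scanning all 1.5 million rows:
CREATE TABLE city_data (
    id                INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    cityname          VARCHAR(100) NOT NULL,
    name              VARCHAR(100) NOT NULL,
    nameinanotherlang VARCHAR(100),
    min               DECIMAL(8,2),
    max               DECIMAL(8,2),
    model             INT,
    date              DATE,
    INDEX idx_cityname (cityname)   -- supports "all rows for one city" lookups
);
-- Typical lookup that replaces "read everything from the per-city table"
SELECT * FROM city_data WHERE cityname = 'abcd';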
I need to store and retrieve items of a course plan in sequence. I also need to be able to add or remove items at any point.
The data looks like this:
-- chapter 1
--- section 1
----- lesson a
----- lesson b
----- drill b
...
I need to be able to identify the sequence so that when the student completes lesson a, I know that he needs to move to lesson b. I also need to be able to insert items in the sequence, like say drill a, and of course now the student goes from lesson a to drill a instead of going to lesson b.
I understand relational databases are not really designed for sequences. Originally, I thought about using a simple auto-increment column and using that to handle the sequence, but the insert requirement makes it unworkable.
I have seen this question and the first answer is interesting:
items table
item_id | item
1 | section 1
2 | lesson a
3 | lesson b
4 | drill a
sequence table
item_id | sequence
1 | 1
2 | 2
3 | 4
4 | 3
That way, I would keep adding items in the items table with whatever id and work out the sequence in the sequence table. The only problem with that system is that I need to change the sequence numbers for all items in the sequence table after an insertion. For instance, if I want to insert quiz a before drill a I need to update the sequence numbers.
Not a huge deal, but the solution seems a little overcomplicated. Is there an easier, smarter way to handle this?
Just relate records to the parent and use a sequence column. You will still need to update all the records when you insert in the middle, but I can't really think of a simple way around that without leaving yourself space to begin with.
items table:
id | name | parent_id | sequence
--------------------------------------
1 | chapter 1 | null | 1
2 | section 1 | 1 | 2
3 | lesson a | 2 | 3
4 | lesson b | 2 | 5
5 | drill a | 2 | 4
When you need to insert a record in the middle, a pair of queries like this will work:
UPDATE items SET sequence = sequence + 1 WHERE sequence > 3;
INSERT INTO items (name, parent_id, sequence) VALUES ('quiz a', 2, 4);
To select the data in order, your query will look like:
SELECT * FROM items ORDER BY sequence;
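If you want to avoid renumbering on every insert, the "leaving yourself space" idea mentioned above can be made concrete by spacing the sequence values out; the gap size of 10 here is just an example:
-- Seed the items with gaps of 10 so new items can usually slot in
-- without touching the existing rows
INSERT INTO items (name, parent_id, sequence) VALUES
    ('chapter 1', NULL, 10),
    ('section 1', 1,    20),
    ('lesson a',  2,    30),
    ('lesson b',  2,    40);
-- Insert "drill a" between lesson a and lesson b without renumbering
INSERT INTO items (name, parent_id, sequence) VALUES ('drill a', 2, 35);
You only fall back to the renumbering UPDATE when a gap is exhausted.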
I have a table that records tickets, separated by a column that denotes the "database". I have a unique key on the database and cid columns so that cid increments independently for each database (cid has the AUTO_INCREMENT attribute to accomplish this). I increment id manually since I cannot have two AUTO_INCREMENT columns (and I'd rather let AUTO_INCREMENT take care of the more complicated task of per-database uniqueness).
This makes my data look like this basically:
-----------------------------
| id | cid | database |
-----------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 2 |
-----------------------------
This works perfectly well.
I am trying to make a feature that will allow a ticket to be "moved" to another database; frequently a user may enter the ticket in the wrong database. Instead of having to close the ticket and completely create a new one (copy/pasting all the data over), I'd like to make it easier for the user of course.
I want to be able to change the database and cid fields uniquely without having to tamper with the id field. I want to do an UPDATE (or the like) since there are foreign key constraints on other tables that link to the id field; this is why I don't simply do a REPLACE or a DELETE then INSERT, as I don't want it to delete all of the other table data and then have to recreate it (log entries, transactions, appointments, etc.).
How can I get the next unique AUTO_INCREMENT value (based on the new database value), then use that to update the desired row?
For example, in the above dataset, I want to change the first record to go to "database #2". Whatever query I make needs to make the data change to this:
-----------------------------
| id | cid | database |
-----------------------------
| 1 | 3 | 2 |
| 2 | 1 | 2 |
| 3 | 2 | 2 |
-----------------------------
I'm not sure if the AUTO_INCREMENT needs to be incremented, as my understanding is that the unique key makes it just calculate the next appropriate value on the fly.
I actually ended up making it work once I re-read the following excerpt from the MySQL documentation on using AUTO_INCREMENT in a multiple-column index.
For MyISAM and BDB tables you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
This was the clue I needed. I simply mimicked the query MySQL runs internally according to that quote, and joined it into my UPDATE query as follows. Assume $new_database is the database to move to, and $id is the current ticket id.
UPDATE `tickets` AS t1,
    (
        SELECT MAX(cid) + 1 AS new_cid
        FROM `tickets`
        WHERE `database` = {$new_database}
    ) AS t2
SET t1.cid = t2.new_cid,
    t1.`database` = {$new_database}
WHERE t1.id = {$id};
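With the example data above, moving ticket id 1 into database 2 computes new_cid = 3 (MAX(cid) for database 2 is 2, plus 1). Substituting literal values for the PHP placeholders, the same statement looks like this; if a concurrent move grabbed the same number first, the existing unique key on database and cid would reject the second row rather than create a duplicate:
UPDATE `tickets` AS t1,
    (
        SELECT MAX(cid) + 1 AS new_cid   -- next free cid within database 2
        FROM `tickets`
        WHERE `database` = 2
    ) AS t2
SET t1.cid = t2.new_cid,
    t1.`database` = 2
WHERE t1.id = 1;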