Big problem...
I'm implementing an online ticket sale system in PHP and MySQL. I have a table called "block_of_tickets", or something like that...
This table looks like:
+---------+------------+---------------+--------------+--------------+
| idblock | block_name | total_tickets | block_gender | idblock_pair |
+---------+------------+---------------+--------------+--------------+
|       1 | Block 1- M |           100 | MALE         |            2 |
+---------+------------+---------------+--------------+--------------+
|       2 | Block 1- F |           100 | FEMALE       |            1 |
+---------+------------+---------------+--------------+--------------+
Where:
idblock: The id (primary key) of the block of tickets.
block_name: The name of the block. In the example I have a "Block 1- M" and a "Block 1- F" to represent "Block 1 - Male" and "Block 1 - Female", respectively.
total_tickets: the total of available tickets
block_gender: the gender of the block of tickets
idblock_pair: the block which is the pair of the current block.
Note: There are also other columns, like "price", etc.
Here is the (big) problem:
When there is an "idblock_pair", it means that both blocks of tickets share the same total_tickets (available tickets), so both cells must have exactly the same value in this case. As you can see in the example above, block 1 points to block 2 and vice versa.
Lots of people buy lots of tickets at (almost) the same time, which means that each sold ticket must decrement the "total_tickets" field by 1, for both cells.
Database Normalization can solve this. However, it would lose a lot in performance.
I'm almost sure that I should use "SELECT... FOR UPDATE"... but I don't know how, since it's the same table, and a "deadlock" can occur...
How can I solve this problem? Do I have to use triggers? Procedures? Do I have to use PHP processing (and transactions) to solve this?
In the example below, one ticket was sold, and now I'm decrementing total_tickets by 1:
START TRANSACTION;
SELECT *
FROM block_of_tickets
WHERE idblock in (1,2) FOR UPDATE;
UPDATE block_of_tickets
SET total_tickets = (total_tickets - 1)
WHERE idblock in (1,2);
COMMIT;
Is this a nice solution?
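For reference, the paired decrement described above can also be sketched as one guarded UPDATE inside a transaction. This is only a minimal sketch under stated assumptions: Python's built-in sqlite3 stands in for MySQL (SQLite has no SELECT ... FOR UPDATE), the function name sell_ticket is hypothetical, and the "total_tickets > 0" guard is an added assumption so that the counter cannot go negative.

```python
import sqlite3

def sell_ticket(conn, idblock):
    """Take one ticket from a block and from its pair; return True if sold."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            """
            UPDATE block_of_tickets
               SET total_tickets = total_tickets - 1
             WHERE (idblock = ? OR idblock_pair = ?)
               AND total_tickets > 0
            """,
            (idblock, idblock),
        )
        # rowcount is 0 when no tickets were left, so nothing was sold
        return cur.rowcount > 0

# In-memory stand-in for the table in the question (other columns omitted)
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE block_of_tickets (
           idblock INTEGER PRIMARY KEY,
           block_name TEXT,
           total_tickets INTEGER,
           block_gender TEXT,
           idblock_pair INTEGER)"""
)
conn.executemany(
    "INSERT INTO block_of_tickets VALUES (?, ?, ?, ?, ?)",
    [(1, "Block 1- M", 100, "MALE", 2), (2, "Block 1- F", 100, "FEMALE", 1)],
)
conn.commit()

sold = sell_ticket(conn, 1)
print(sold, conn.execute(
    "SELECT idblock, total_tickets FROM block_of_tickets ORDER BY idblock"
).fetchall())
```

Because block 1 and block 2 point at each other, the WHERE clause matches both rows, so the two counters change in the same statement and stay equal.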
Related
When I started designing my application's database schema a few months ago, I was told not to store the same data or calculated data in more than one place in the database (normalization). If I do, I open the door to bugs whenever I update the data in one place and leave the other without updating. So I made an orders table and an orderDetail table, something like this:
-- orders table
+-----+---------+----------+
| ID | clintID | date |
+-----+---------+----------+
| 1 | 1 |2018-02-22|
| 2 | 1 |2018-02-23|
| 3 | 2 |2018-02-24|
+-----+---------+----------+
-- orderDetail table
+-----+---------+------------+----------+----------+
| ID | orderID | itemNumber | quantity | unitPrice|
+-----+---------+------------+----------+----------+
| 1 | 1 | 12345 | 3 | 100.75 |
| 2 | 1 | 12346 | 3 | 100.75 |
| 3 | 2 | 12347 | 3 | 100.75 |
| 4 | 2 | 12345 | 3 | 100.75 |
| 5 | 3 | 12347 | 3 | 100.75 |
| 6 | 3 | 12345 | 3 | 100.75 |
+-----+---------+------------+----------+----------+
And to make the queries easier for me I made a view "allOrdersSummary" like
-- allOrdersSummary
SELECT
orders.*, SUM(orderDetail.quantity * orderDetail.unitPrice) totalAmount
FROM orders INNER JOIN orderDetail ON orders.ID = orderDetail.orderID
GROUP BY orders.ID;
and I used this view later for my queries, but now I started to get the MAX_JOIN_SIZE error.
So I thought of saving the calculated total order amount in the orders table (ID, clintID, date, totalAmount), and whenever I change something in the orderDetail table I would update the calculated totalAmount column in the orders table. I don't know if this is good or bad!
This situation (I don't know whether it counts as a problem) comes up many times. For example, to get the number of unread messages for the client making the request I have to run something like SELECT COUNT(*) AS unread FROM messages WHERE to = ? AND isRead = 0.
A) Should I add another column for the calculated totalAmount in the orders table, or is it normal in databases to calculate the totalAmount from the orderDetail table every time I need it?
B) If you recommend adding another column to the orders table, what is the best way to update it every time a change happens in the orderDetail table? Should I update it at the PHP layer whenever I update the orderDetail table, or is this something that needs a stored procedure?
Yes, it is normal to store pre-calculated values, based on other data in the database, in a database. But not necessarily for the reason you mention. I never had a problem with MAX_JOIN_SIZE.
The main, and probably only, reason for storing calculated values is speed. So you do it for values that don't change that often and that may be used in queries that use a lot of data and may therefore be too slow if you didn't use them.
For instance: If you want to know the average value of all the orders in your database the query would be a lot faster if you already have the order totals.
Why, and how, you update the values is completely up to you. However you have got to be consistent about it. If you use the MVC pattern it would make sense to integrate it in the controller. Or in simple terms: Whenever a form is submitted that could change one of the values, out of which the pre-calculated value is computed, you need to recompute it.
This is a clear demonstration where 'normalization' is not entirely maintained. It's not really pretty, but sometimes worth it. You could, of course, argue, that the calculated value represents 'new' information, and therefore does not offend against 'normalization'.
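The "recompute it whenever the source values change" rule above can be sketched as follows. This is a minimal sketch, not the asker's actual code: Python's sqlite3 stands in for PHP and MySQL, the helper name add_order_detail is hypothetical, and the totalAmount column is the one the question proposes adding to orders.

```python
import sqlite3

def add_order_detail(conn, order_id, item_number, quantity, unit_price):
    """Insert a detail row and refresh the stored total in one transaction."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "INSERT INTO orderDetail (orderID, itemNumber, quantity, unitPrice) "
            "VALUES (?, ?, ?, ?)",
            (order_id, item_number, quantity, unit_price),
        )
        # Recompute from the detail rows instead of adding a delta,
        # so the stored value cannot drift from the source data.
        conn.execute(
            """UPDATE orders
                  SET totalAmount = (SELECT COALESCE(SUM(quantity * unitPrice), 0)
                                       FROM orderDetail WHERE orderID = ?)
                WHERE ID = ?""",
            (order_id, order_id),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (ID INTEGER PRIMARY KEY, clintID INTEGER, date TEXT, "
    "totalAmount REAL NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE orderDetail (ID INTEGER PRIMARY KEY, orderID INTEGER, "
    "itemNumber INTEGER, quantity INTEGER, unitPrice REAL)"
)
conn.execute("INSERT INTO orders (ID, clintID, date) VALUES (1, 1, '2018-02-22')")
conn.commit()

add_order_detail(conn, 1, 12345, 3, 100.75)
add_order_detail(conn, 1, 12346, 3, 100.75)
total = conn.execute("SELECT totalAmount FROM orders WHERE ID = 1").fetchone()[0]
print(total)
```

Keeping the insert and the recomputation in the same transaction is the consistency part: every code path that touches orderDetail would have to go through a helper like this one.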
You have an "inflate-deflate" problem.
JOIN the two tables to make a much larger temporary table.
GROUP BY to shrink back to one row per row of the original (orders) table.
This avoids the problem:
SELECT orders.*,
( SELECT SUM(quantity * unitPrice)
FROM orderDetail WHERE orderID = orders.ID
) AS totalAmount
FROM orders;
Please let me know how your experience is with this one. It is one of the simplest examples of the inflate-deflate problem.
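To compare the two shapes side by side, here is a small sketch that runs the JOIN + GROUP BY form and the correlated-subquery form on the sample rows from the question. It assumes Python's sqlite3 as a stand-in for MySQL, so it says nothing about MAX_JOIN_SIZE itself; it only shows that both forms produce the same totals on this data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (ID INTEGER PRIMARY KEY, clintID INTEGER, date TEXT);
    CREATE TABLE orderDetail (ID INTEGER PRIMARY KEY, orderID INTEGER,
                              itemNumber INTEGER, quantity INTEGER, unitPrice REAL);
    INSERT INTO orders VALUES
        (1, 1, '2018-02-22'), (2, 1, '2018-02-23'), (3, 2, '2018-02-24');
    INSERT INTO orderDetail VALUES
        (1, 1, 12345, 3, 100.75), (2, 1, 12346, 3, 100.75),
        (3, 2, 12347, 3, 100.75), (4, 2, 12345, 3, 100.75),
        (5, 3, 12347, 3, 100.75), (6, 3, 12345, 3, 100.75);
    """
)

# "Inflate-deflate": join first (one row per detail), then group back down.
join_totals = conn.execute(
    """SELECT orders.ID, SUM(orderDetail.quantity * orderDetail.unitPrice)
         FROM orders INNER JOIN orderDetail ON orders.ID = orderDetail.orderID
        GROUP BY orders.ID ORDER BY orders.ID"""
).fetchall()

# Correlated subquery: one row per order throughout, no GROUP BY needed.
subquery_totals = conn.execute(
    """SELECT orders.ID,
              (SELECT SUM(quantity * unitPrice)
                 FROM orderDetail WHERE orderID = orders.ID) AS totalAmount
         FROM orders ORDER BY orders.ID"""
).fetchall()

print(join_totals == subquery_totals)
```

One difference worth knowing: an order with no detail rows disappears from the INNER JOIN version, while the subquery version keeps it with a NULL totalAmount.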
Say I have a table like this:
itemID | PriceA | PriceB | PriceC | other columns...
1 | 8.0 | 6.95 | 0.5 | ...
2 | 5.9 | 6.97 | 4.1 | ...
3 | 0.2 | 1.12 | 3.5 | ...
I want a user to log in, but only see certain rows, and only one Price column. For example, user Susie can see only rows 1 and 2, and only Price B for those items. User Hanna can see rows 2 and 3 at Price A.
Maybe it doesn't need to be database-level security. Basically, users will log in on a website (a WordPress site) and, after logging in, will see certain products at a certain price.
As well, more than one user can access any given row or column. It isn't a one-to-one relationship. I think this differs from typical row-level mysql security.
I have 2 questions:
Should this be database-level security or should it be something else? PHP code?
Any suggestions on how I can implement this?
Actually, I think creating views will solve my problem. Does that seem secure?
I found this: How can I allow users sql access to a table limited to certain rows?
I need to store and retrieve items of a course plan in sequence. I also need to be able to add or remove items at any point.
The data looks like this:
-- chapter 1
--- section 1
----- lesson a
----- lesson b
----- drill b
...
I need to be able to identify the sequence so that when the student completes lesson a, I know that he needs to move to lesson b. I also need to be able to insert items in the sequence, like say drill a, and of course now the student goes from lesson a to drill a instead of going to lesson b.
I understand relational databases are not intended for sequences. Originally, I thought about using a simple autoincrement column and use that to handle the sequence, but the insert requirement makes it unworkable.
I have seen this question and the first answer is interesting:
items table
item_id | item
1 | section 1
2 | lesson a
3 | lesson b
4 | drill a
sequence table
item_id | sequence
1 | 1
2 | 2
3 | 4
4 | 3
That way, I would keep adding items in the items table with whatever id and work out the sequence in the sequence table. The only problem with that system is that I need to change the sequence numbers for all items in the sequence table after an insertion. For instance, if I want to insert quiz a before drill a I need to update the sequence numbers.
Not a huge deal, but the solution seems a little overcomplicated. Is there an easier, smarter way to handle this?
Just relate records to the parent and use a sequence flag. You will still need to update all the records when you insert in the middle but I can't really think of a simple way around that without leaving yourself space to begin with.
items table:
id | name | parent_id | sequence
--------------------------------------
1 | chapter 1 | null | 1
2 | section 1 | 1 | 2
3 | lesson a | 2 | 3
4 | lesson b | 2 | 5
5 | drill a | 2 | 4
When you need to insert a record in the middle a query like this will work:
UPDATE items SET sequence=sequence+1 WHERE sequence > 3;
insert into items (name, parent_id, sequence) values('quiz a', 2, 4);
To select the data in order your query will look like:
select * from items order by sequence;
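The shift-then-insert step above can be wrapped in one transaction so that readers never see a half-renumbered list. A minimal sketch, assuming Python's sqlite3 in place of MySQL and a hypothetical helper named insert_after:

```python
import sqlite3

def insert_after(conn, name, parent_id, after_sequence):
    """Open a gap after the given sequence number and put the new item in it."""
    with conn:  # the renumbering and the insert commit together
        conn.execute(
            "UPDATE items SET sequence = sequence + 1 WHERE sequence > ?",
            (after_sequence,),
        )
        conn.execute(
            "INSERT INTO items (name, parent_id, sequence) VALUES (?, ?, ?)",
            (name, parent_id, after_sequence + 1),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, "
    "parent_id INTEGER, sequence INTEGER)"
)
conn.executemany(
    "INSERT INTO items (id, name, parent_id, sequence) VALUES (?, ?, ?, ?)",
    [
        (1, "chapter 1", None, 1),
        (2, "section 1", 1, 2),
        (3, "lesson a", 2, 3),
        (4, "lesson b", 2, 5),
        (5, "drill a", 2, 4),
    ],
)
conn.commit()

# Put "quiz a" right after "lesson a" (sequence 3), i.e. before "drill a".
insert_after(conn, "quiz a", 2, 3)
ordered = [row[0] for row in conn.execute("SELECT name FROM items ORDER BY sequence")]
print(ordered)
```

After the call the student goes from lesson a to quiz a, then drill a, then lesson b.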
I am developing a personal finance tracker (for fun!) and I have a table of categories. Each category is an entry in the table and at the end of the month they are all duplicated with their relevant balances reset to the start of the month reading for the new month.
Among others, these categories can be of type 'savings' and so have a running total. If I want to retrieve a category or update it then I used the category_id field and this works fine for the current working month but linking months together is breaking my brain. For the savings categories I want to show how the running_total has increased over the previous six months but in my current DB design, categories don't "know" about their previous months as they are created new at the start of each month.
The only way I could currently retrieve the last 6 months of a savings running_total is to search by the category name but this is potentially unreliable.
I have considered adding a field to the table called "previous_month_category_id", which would work as a way to link the categories together but would be expensive to implement, as it would require 6 MySQL operations, each time grabbing the "previous_month_category_id" from the result and then re-running the query.
If MySQL can do some kind of recursion then maybe this could work, but I feel like there is a more obvious answer staring me in the face.
I'm using Codeigniter and MYSQL but not scared of vanilla PHP if required.
Help on how to do this would be great.
UPDATE 1:
Below is a sample of what the savings category might look like mixed in amongst other categories. At the end of each month the entry is duplicated with the same category_name, type, budget, year, and users_id, but the category_id auto-increments, the month updates to the new month number, and the running total is the previous running_total + the budget. How would I do one database query to retrieve these without using the category_name? The name could change if the user decided to call it "Bigger TV" at the end of July.
+-------------+---------------+------+--------+---------------+------+-------+----------+
| category_id | category_name | type | budget | running_total | year | month | users_id |
+-------------+---------------+------+--------+---------------+------+-------+----------+
|          44 | Big TV        | sav  |     20 |           240 | 2012 |     8 |       77 |
+-------------+---------------+------+--------+---------------+------+-------+----------+
|          32 | Big TV        | sav  |     20 |           220 | 2012 |     7 |       77 |
+-------------+---------------+------+--------+---------------+------+-------+----------+
|          24 | Big TV        | sav  |     20 |           200 | 2012 |     6 |       77 |
+-------------+---------------+------+--------+---------------+------+-------+----------+
UPDATE 2:
I'm not sure I'm explaining myself very well, so I'll put some more detail around how the app works and see if that helps.
I have tables called "categories", "transactions" and "users". A category can be one of three types, 1: Cash, 2: Regular Payment, 3: Savings. Think of cash and regular payment types as buckets, at the start of each month each bucket is full and the aim is to take money out of it and make sure there is still a bit left at the end of the month (or at least not negative).
This is fine on a month by month basis and works very well (for me, I have used this system for 2 years now I think). The trip up comes with Savings as they are linked month by month and are more like a big bucket that is added to each month (with a set increment called budget) until it overspills and is then drained (like Big TV would be when you buy it), or taken from a little bit here and there and the aim is to build up an emergency fund (like "When my car breaks down" type thing).
When the relevant information is displayed for each category only the current month is shown for cash and regular as that is all that is important, for the savings however the current amount is also shown but it would be nice to show a small history graph of how it had built up (or depleted) over time. To do this I need some way of searching for the previous end of month states of these categories so that the graph can be plotted but currently I can't work out how to link them all by anything other than the category_name.
I have tried to implement a bit of DB normalisation but this is the first schema I've implemented having known about normalisation so I've probably missed some aspects of it and possibly avoided any over normalisation where it didn't feel right.
Below are my tables:
categories
+-------------+--------------+------+--------+---------------+------+-------+----------+
| category_id |category_name | type | budget | running_total | year | month | users_id |
+-------------+--------------+------+--------+---------------+------+-------+----------+
transactions
+----------------+-------------+--------+------+----------+------------------------+
| transaction_id | description | amount | date | users_id | categories_category_id |
+----------------+-------------+--------+------+----------+------------------------+
They are joined on categories_category_id, which is a foreign key.
I have always worked off the premise that each category needs a new entry for each month, but it seems from the comments and answers below that I would be better off having just one category entry regardless of month and then just calculating everything on the fly?
However, the budgets can be changed by the user, so for record keeping I'm not sure this would work. Also, the "deposits" never really happen; it is just the category being duplicated at the end of the month, so I guess that would need to be dealt with.
The aim of this app has always been to decouple financial tracking from the physical transactions that occur in a bank account and provide a layer over someone's finances, thus allowing the user to avoid hard-to-explain transactions etc. and just focus on overall cash position. There is no concept of an "income" in this system, or a bank account.
It seems to me like your database design could use some work. I'm still not completely familiar with what you're really trying to do, but my initial thoughts would be to store each transaction as a single row in a table, and then query that table in different ways to generate different types of reports on it. Something like this:
transactions:
+----+---------+--------+---------------+-----------+-------------+
| id | user_id | amount | running_total | datestamp | category_id |
+----+---------+--------+---------------+-----------+-------------+
categories:
+----+------+------+
| id | name | type |
+----+------+------+
Don't increment the categories based on time. Add an entry to the categories table when you actually have a new category. If a transaction could possibly belong to multiple categories, then use a third (relational) table that relates transactions (based on transaction ID) to categories (based on category ID).
When you have a deposit, the amount field will be positive and for withdrawals, it will be negative. You can get your current running total by doing something like:
SELECT running_total FROM transactions
WHERE id = (SELECT MAX(id) FROM transactions WHERE user_id = '$userID');
You can find your total difference for a particular month by doing this:
SELECT SUM(amount) FROM transactions WHERE DATE_FORMAT(datestamp, '%c') = '$monthNumber';
You can find the total spending for a particular category by doing this:
SELECT SUM(t.amount) FROM transactions t
INNER JOIN categories c ON t.category_id = c.id WHERE c.name = 'Big TV';
There are plenty of other possibilities, but the purpose here is just to demonstrate a possibly better way to store your data.
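As one of those possibilities, the six-month savings history asked about in the question can be derived from a transactions table keyed by a stable category id, with no dependence on the category name. This is a rough sketch under assumptions: Python's sqlite3 stands in for MySQL (so strftime is used where MySQL would use DATE_FORMAT), the stored running_total column is left out and computed instead, the function name monthly_history is hypothetical, and the sample rows are made up for illustration.

```python
import sqlite3
from itertools import accumulate

def monthly_history(conn, category_id):
    """Return (month, running total at end of month) pairs for one category."""
    rows = conn.execute(
        """SELECT strftime('%Y-%m', datestamp) AS ym, SUM(amount)
             FROM transactions
            WHERE category_id = ?
            GROUP BY ym
            ORDER BY ym""",
        (category_id,),
    ).fetchall()
    months = [ym for ym, _ in rows]
    # Cumulative sum of the monthly net changes gives the running total.
    totals = list(accumulate(delta for _, delta in rows))
    return list(zip(months, totals))

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE transactions (id INTEGER PRIMARY KEY, user_id INTEGER,
                               amount INTEGER, datestamp TEXT, category_id INTEGER);
    INSERT INTO categories VALUES (1, 'Big TV', 'sav');
    -- Illustrative rows only: three monthly deposits and one withdrawal.
    INSERT INTO transactions (user_id, amount, datestamp, category_id) VALUES
        (77, 20, '2012-06-01', 1),
        (77, 20, '2012-07-01', 1),
        (77, 20, '2012-08-01', 1),
        (77, -15, '2012-08-20', 1);
    """
)

history = monthly_history(conn, 1)
print(history)
```

Renaming the category to "Bigger TV" only changes one row in categories, so the history query keeps working because it filters on category_id.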
Take a look at the items table below. As you can see, this table is not normalized: name should be in a separate table to normalize it.
mysql> select * from items;
+---------+--------+-----------+------+
| item_id | cat_id | name | cost |
+---------+--------+-----------+------+
| 1 | 102 | Mushroom | 5.00 |
| 2 | 2 | Mushroom | 5.40 |
| 3 | 173 | Pepperoni | 4.00 |
| 4 | 109 | Chips | 1.00 |
| 5 | 35 | Chips | 1.00 |
+---------+--------+-----------+------+
This table is not normalized because on the backend admin site, staff simply select a category and type in the item name to add data quickly. It is very quick. There are hundreds of items with the same name, but the cost is not always the same.
If I do normalize this table to something like this:
mysql> select * from items;
+---------+--------+--------------+------+
| item_id | cat_id | item_name_id | cost |
+---------+--------+--------------+------+
| 1 | 102 | 1 | 5.00 |
| 2 | 2 | 1 | 5.40 |
| 3 | 173 | 2 | 4.00 |
| 4 | 109 | 3 | 1.00 |
| 5 | 35 | 3 | 1.00 |
+---------+--------+--------------+------+
mysql> select * from item_name;
+--------------+-----------+
| item_name_id | name |
+--------------+-----------+
| 1 | Mushroom |
| 2 | Pepperoni |
| 3 | Chips |
+--------------+-----------+
Now that this table has been normalized, how can I add an item on the admin backend (from a data entry point of view)? I don't want something like a dropdown to select the item name: there will be thousands of different item names, and it will take a lot of time to find the item name and then type in the cost.
There needs to be a way to add items as quickly as possible. What is the solution to this? I have developed the backend in PHP.
Also, what is the solution for editing the item name? Staff might rename an item completely, for example Fish Kebab to Chicken Kebab, and that will affect all the categories without them realising it. There will also be some spelling mistakes that need correcting, like F1sh Kebab, which should be Fish Kebab (this is where the normalized tables are useful, since I will see the item name updated in every category).
I don't want like a dropdown to select item name - there will be thousands of different item name - it will take a lot of of time to find the item name and then type in the cost.
There are options for selecting existing items other than drop down boxes. You could use autocompletion, and only accept known values. I just want to be clear there are UI friendly ways to achieve your goals.
As for whether to do so or not, that is up to you. If the product names are varied slightly, is that a problem? Can small data integrity issues like this be corrected with batch jobs or similar if they are a problem?
Decide what your data should look like first, based on the design of your system. Worry about the best way to structure a UI after you've made that decision. Like I said, there are usable ways to design UI regardless of your data structuring.
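The autocompletion idea mentioned above usually comes down to a prefix query that the UI calls as the user types. A minimal sketch, assuming Python's sqlite3 in place of PHP and MySQL, the item_name table from the question, and a hypothetical function name suggest_names:

```python
import sqlite3

def suggest_names(conn, prefix, limit=10):
    """Return known item names that start with what the user has typed."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = (
        prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT name FROM item_name WHERE name LIKE ? ESCAPE '\\' "
        "ORDER BY name LIMIT ?",
        (escaped + "%", limit),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_name (item_name_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
)
conn.executemany(
    "INSERT INTO item_name (name) VALUES (?)",
    [("Mushroom",), ("Pepperoni",), ("Chips",), ("Chicken Kebab",), ("Fish Kebab",)],
)
conn.commit()

suggestions = suggest_names(conn, "ch")
print(suggestions)
```

Staff keep typing as they do today; the only change is that the form offers matching known names and can refuse or flag anything that is not in item_name.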
I think you are good to go with your current design. For you, name is the product name and not the category name, and you probably want to avoid cases where renaming a single product would rename too many of them at once.
Normalization is a good thing, but you have to measure it against your specific needs, and in this case I really would not add an extra item_name table as you have shown above.
just my two cents :)
What are the dependencies supposed to be represented by your table? What are the keys? Based on what you've said I don't see how your second design is any more normalized than your first.
Presumably the determinants of "name" in the first design are the same as the determinants of "item_name_id" in the second? If so then moving name to another table won't make any difference to the normal forms satisfied by your items table.
User interface design has nothing to do with database design. You cannot let the UI drive the database design and expect sensible results.
You need to validate the data and check for existence prior to adding it to see if it's a new value.
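$value = trim($_POST['userSubmittedValue']);
// Never trust user input: bind it as a parameter instead of putting it in the SQL string.
// $mysqli is assumed to be an open mysqli connection (the old mysql_* functions were removed in PHP 7).
$stmt = $mysqli->prepare('SELECT item_name_id FROM item_name WHERE name = ?');
$stmt->bind_param('s', $value);
$stmt->execute();
$stmt->bind_result($itemNameId);
$found = $stmt->fetch(); // true when a matching row exists
$stmt->close();
if($found)
{
//add the record with the id in $itemNameId to the items table
}
else
{
//this will be a new value so run queries to add the new value to both items and item_name tables
}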
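Both branches of the check above, including the inserts that the comments leave out, can be sketched end to end. This is only an illustration under assumptions: Python's sqlite3 stands in for PHP and MySQL, the helper names get_or_create_item_name_id and add_item are hypothetical, and a UNIQUE constraint on item_name.name is assumed so the database itself rejects duplicate names.

```python
import sqlite3

def get_or_create_item_name_id(conn, name):
    """Return the id of an existing name, inserting it first if it is new."""
    row = conn.execute(
        "SELECT item_name_id FROM item_name WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO item_name (name) VALUES (?)", (name,))
    return cur.lastrowid

def add_item(conn, cat_id, name, cost):
    """Add an item by typed name; staff never deal with ids directly."""
    with conn:  # the name lookup/insert and the item insert commit together
        name_id = get_or_create_item_name_id(conn, name.strip())
        conn.execute(
            "INSERT INTO items (cat_id, item_name_id, cost) VALUES (?, ?, ?)",
            (cat_id, name_id, cost),
        )
        return name_id

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_name (item_name_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
)
conn.execute(
    "CREATE TABLE items (item_id INTEGER PRIMARY KEY, cat_id INTEGER, "
    "item_name_id INTEGER, cost REAL)"
)
conn.commit()

first = add_item(conn, 102, "Mushroom", 5.00)
second = add_item(conn, 2, "Mushroom", 5.40)
third = add_item(conn, 173, "Pepperoni", 4.00)
print(first == second, first != third)
```

From the data entry point of view nothing changes: staff still pick a category and type a name, and the second "Mushroom" reuses the row created by the first.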
There need to be a way to add item/data quick as possible. What is the
solution to this? I have developed backend in PHP.
User interface issues and database structure are separate issues. For a given database structure, there are usually several user-friendly ways to present and change the data. Data integrity comes from the database. The user interface just needs to know where to find unique values. The programmer decides how to use those unique values. You might use a drop-down list, pop up a search form, use autocomplete, compare what the user types to the elements in an array, or query the database to see whether the value already exists.
From your description, it sounds like you had a very quick way to add data in the first place: "staff simply select a category and type in the item name to add data quickly". (Replacing "mushroom" with '1' doesn't have anything to do with normalization.)
Also what is the solution for editing the item name? Staff might
rename the item name completely for example: Fish Kebab to Chicken
Kebab and that will effect all the categories without realising it.
You've allowed the wrong person to edit item names. Seriously.
This kind of issue arises in every database application. Allow only someone trained and trustworthy to make these kinds of changes. (See your dbms docs for GRANT and REVOKE. Also take a look at ON UPDATE RESTRICT.)
In our production database at work, I can insert new states (for the United States), and I can change existing state names to whatever I want. But if I changed "Alabama" to "Kyrgyzstan", I'd get fired. Because I'm supposed to know better than to do stuff like that.
But even though I'm the administrator, I can't edit a San Francisco address and change its ZIP code to '71601'. The database "knows" that '71601' isn't a valid ZIP code for San Francisco. Maybe you can add a table or two to your database, too. I can't tell from your description whether something like that would help you.
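One way a database can "know" which ZIP codes belong to a city is a lookup table plus a composite foreign key. The sketch below is an assumption about how such a rule could be built, not a description of the production database mentioned above: Python's sqlite3 stands in for the real DBMS, and the table names city_zip and addresses, the helper change_zip, and the sample rows are all hypothetical.

```python
import sqlite3

def change_zip(conn, address_id, new_zip):
    """Try to change an address's ZIP; return False if the database refuses."""
    try:
        with conn:  # rolls back automatically when the constraint fails
            conn.execute(
                "UPDATE addresses SET zip = ? WHERE id = ?", (new_zip, address_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript(
    """
    CREATE TABLE city_zip (
        city TEXT NOT NULL,
        zip  TEXT NOT NULL,
        PRIMARY KEY (city, zip)
    );
    CREATE TABLE addresses (
        id     INTEGER PRIMARY KEY,
        street TEXT,
        city   TEXT NOT NULL,
        zip    TEXT NOT NULL,
        FOREIGN KEY (city, zip) REFERENCES city_zip (city, zip)
    );
    -- Sample data for illustration only
    INSERT INTO city_zip VALUES ('San Francisco', '94103'), ('San Francisco', '94110');
    INSERT INTO addresses VALUES (1, '1 Example St', 'San Francisco', '94103');
    """
)

rejected = change_zip(conn, 1, "71601")   # not a listed San Francisco ZIP
accepted = change_zip(conn, 1, "94110")   # listed, so the update goes through
print(rejected, accepted)
```

The same rule applies to every user, administrators included, because it lives in the schema rather than in the application or in anyone's permissions.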
On systems where I'm not the administrator, I'd expect to have no permissions to insert rows into the table of states. In other tables, I might have permission to insert rows, but not to update or delete them.
There will be some spelling mistake that may need correcting like F1sh
Kebab which should be Fish Kebab
The lesson is the same. Some people should be allowed to update items.name, and some people should not. Revoke permissions, restrict cascading updates, increase data integrity using more tables, or increase training.