Can MySQL calculate totals from a multi-level parent child relationship?


For the sake of simplicity, let's assume the following tables exist:
Table 1 - List of Sellers
ID | Parent_ID | Percentage
----------------------------
1 | - | .5
2 | 1 | .4
3 | 2 | .3
This table shows three sellers: ID 1 is the main parent, ID 2 is an individual with parent 1, and ID 3 has parent 2 and super-parent 1. ID 1 gets a 50% commission on all individual sales PLUS, for any subagents, the difference between his/her percentage level and theirs.
For example:
The following table would represent a list of sales by Agent:
Table 2 - Sales by Agent
ID | Cost
-------------
2 | 10.00
2 | 5.00
3 | 9.00
In this scenario:
Seller ID 3 would earn 30% of his sale, or $2.70
Seller ID 2 would earn 40% of his/her sales, (.4 * $5) + (.4 * $10) = $6, PLUS an override on his/her child. In this case, he/she earns the difference in commission rates times the amount of the sale, (.4 - .3) * $9.00 = $0.90, so Seller ID 2 would earn $6.90 total.
Seller ID 1 had no individual sales but earns an override on his/her child and all subsequent descendants: (.5 - .4) * $10.00 + (.5 - .4) * $5.00 + (.5 - .4) * $9.00 = $2.40
The super parent (ID 1 in this case) could have been the direct parent of ID 3 and earned (.5 - .3). This is to say, the parent-child relationship is not always linear and the hierarchy has no set depth.
Ultimately, I am trying to develop a MySQL or PHP (or combined) formula to determine what each seller is due in the scenario above. Calculating the total of sales for each seller and the individual seller earnings based on those sales is easy. Figuring out how to apply earnings to ID 1 and 2 based on 3's production is another story.
Any prior experience in this area?

Unfortunately, MySQL doesn't make trees easy.
MSSQL has something called common table expressions that make this easier (MySQL only gained recursive common table expressions in version 8.0).
With older MySQL versions, the best solution is to use a recursive stored procedure. But those can be very difficult to get right.
Instead, I would suggest having another table that stores all the parents of a child and how many levels above they are. This complicates your insertion and deletion code since this table needs to be kept up to date (triggers can help), but it will make your calculations much easier.
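As a rough sketch of the ancestor-table idea, the snippet below uses Python rather than SQL or PHP so that it can be run on its own. It builds the (child, ancestor, levels above) rows from Table 1 and then credits each sale to the seller and to every ancestor. The names sellers, sales, closure_rows and earnings are made up for this illustration, and the rule that each ancestor earns its own rate minus the rate of the seller directly beneath it in the chain is an assumption read off the worked example in the question.

```python
from collections import defaultdict

# Table 1: seller id -> (parent id, commission rate). None marks the top-level seller.
sellers = {1: (None, 0.5), 2: (1, 0.4), 3: (2, 0.3)}
# Table 2: (seller id, sale amount).
sales = [(2, 10.00), (2, 5.00), (3, 9.00)]

def closure_rows(sellers):
    """Return (child, ancestor, levels_above) rows, i.e. the extra table's contents."""
    rows = []
    for child in sellers:
        level, parent = 1, sellers[child][0]
        while parent is not None:
            rows.append((child, parent, level))
            level, parent = level + 1, sellers[parent][0]
    return rows

def earnings(sellers, sales):
    """Total due per seller: own commission plus overrides on descendants' sales."""
    ancestors = defaultdict(list)
    for child, ancestor, level in sorted(closure_rows(sellers), key=lambda r: r[2]):
        ancestors[child].append(ancestor)  # nearest ancestor first
    due = defaultdict(float)
    for seller, amount in sales:
        rate_below = sellers[seller][1]
        due[seller] += rate_below * amount
        for ancestor in ancestors[seller]:
            rate = sellers[ancestor][1]
            due[ancestor] += (rate - rate_below) * amount  # override on this sale
            rate_below = rate
    return {seller: round(total, 2) for seller, total in sorted(due.items())}

print(closure_rows(sellers))
print(earnings(sellers, sales))
```

With the example data this gives 2.40 for seller 1, 6.90 for seller 2 and 2.70 for seller 3, matching the figures in the question. In MySQL the closure rows would live in their own table and the same sums could be done with joins.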

Related

Select an option from a database based on different variables?

I am looking to make PHP code that selects the best option in a data table. What is considered "best" would be based off of the variables/columns. I understand that I would need to start a mysqli query and create a couple of loops to search through the database, but I am not entirely sure how to implement something like this.
To give a more in-depth explanation of what I am talking about, here is an example.
(START EXAMPLE)
Let's say I have a database and there is a table with items in it. There are 3 columns: Item ID, Type, On Sale. I want to make it so that a user is able to pick out the best option based on those variables. When several items are equally "best", it selects the one that is listed first (in this case, the lowest Item ID).
Imagine this table:
Item ID | Type | On Sale
---------------------
1 | Chair | 0
2 | Table | 1
3 | Chair | 1
4 | Oven | 0
5 | Table | 1
6 | Oven | 0
The order of importance is Type > On Sale > Item ID (lowest).
A user is looking for a chair. Item 3 is selected because it is a chair and it is the first one that is also on sale.
A user is looking for a table. Item 2 is selected over Item 5 because it is listed higher (or in this case, has a lower Item ID)
A user is looking for an oven. Item 4 is selected because no ovens are on sale. Because no options are on sale, it selects the lowest Item ID of the ovens listed.
(END EXAMPLE)
So how should I go about this? Any answers would be greatly appreciated!

SELECT * FROM table_name WHERE `Type` = 'type_specified_by_user' ORDER BY `On Sale` DESC, `Item ID` ASC LIMIT 1
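The ordering the question asks for (matching type first, then on-sale items, then the lowest Item ID) can be tried out with a small script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, the table name items and the helper best_item are invented for the example, and SQLite quotes the column names with double quotes where MySQL would use backticks.

```python
import sqlite3

def best_item(conn, wanted_type):
    """Return the Item ID of the best match for wanted_type, or None if there is none."""
    row = conn.execute(
        'SELECT "Item ID" FROM items WHERE "Type" = ? '
        'ORDER BY "On Sale" DESC, "Item ID" ASC LIMIT 1',
        (wanted_type,),
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE items ("Item ID" INTEGER PRIMARY KEY, "Type" TEXT, "On Sale" INTEGER)')
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "Chair", 0), (2, "Table", 1), (3, "Chair", 1),
     (4, "Oven", 0), (5, "Table", 1), (6, "Oven", 0)],
)
for wanted in ("Chair", "Table", "Oven"):
    print(wanted, best_item(conn, wanted))
```

This picks item 3 for a chair, item 2 for a table and item 4 for an oven, as in the example. Sorting On Sale in descending order is what puts the on-sale rows first, and the database does the searching, so no loops are needed in the application code.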

Linking together multiple database table entries from multiple months

I am developing a personal finance tracker (for fun!) and I have a table of categories. Each category is an entry in the table and at the end of the month they are all duplicated with their relevant balances reset to the start of the month reading for the new month.
Among others, these categories can be of type 'savings' and so have a running total. If I want to retrieve a category or update it then I used the category_id field and this works fine for the current working month but linking months together is breaking my brain. For the savings categories I want to show how the running_total has increased over the previous six months but in my current DB design, categories don't "know" about their previous months as they are created new at the start of each month.
The only way I could currently retrieve the last 6 months of a savings running_total is to search by the category name but this is potentially unreliable.
I have considered adding a field to the table called "previous_month_category_id", which would work as a way to link the categories together, but it would be expensive to use: it would require 6 MySQL operations, each time grabbing the "previous_month_category_id" from the result and then re-running the query.
If MySQL can do some kind of recursion then maybe this could work, but I feel like there is a more obvious answer staring me in the face.
I'm using CodeIgniter and MySQL but am not scared of vanilla PHP if required.
Help on how to do this would be great.
UPDATE 1:
Below is a sample of what the savings category might look like mixed in amongst other categories. At the end of each month the entry is duplicated with the same category_name, type, budget, year, and users_id, but the category_id auto-increments, the month updates to the new month number and the running total is the previous running_total + the budget. How would I do one database query to retrieve these without using the category_name? The name could change if the user decided to call it "Bigger TV" at the end of July.
+-------------+---------------+------+--------+---------------+------+-------+----------+
| category_id | category_name | type | budget | running_total | year | month | users_id |
+-------------+---------------+------+--------+---------------+------+-------+----------+
| 44          | Big TV        | sav  | 20     | 240           | 2012 | 8     | 77       |
| 32          | Big TV        | sav  | 20     | 220           | 2012 | 7     | 77       |
| 24          | Big TV        | sav  | 20     | 200           | 2012 | 6     | 77       |
+-------------+---------------+------+--------+---------------+------+-------+----------+
UPDATE 2:
I'm not sure I'm explaining myself very well, so I'll put some more detail around how the app works and see if that helps.
I have tables called "categories", "transactions" and "users". A category can be one of three types, 1: Cash, 2: Regular Payment, 3: Savings. Think of cash and regular payment types as buckets, at the start of each month each bucket is full and the aim is to take money out of it and make sure there is still a bit left at the end of the month (or at least not negative).
This is fine on a month by month basis and works very well (for me, I have used this system for 2 years now I think). The trip up comes with Savings as they are linked month by month and are more like a big bucket that is added to each month (with a set increment called budget) until it overspills and is then drained (like Big TV would be when you buy it), or taken from a little bit here and there and the aim is to build up an emergency fund (like "When my car breaks down" type thing).
When the relevant information is displayed for each category only the current month is shown for cash and regular as that is all that is important, for the savings however the current amount is also shown but it would be nice to show a small history graph of how it had built up (or depleted) over time. To do this I need some way of searching for the previous end of month states of these categories so that the graph can be plotted but currently I can't work out how to link them all by anything other than the category_name.
I have tried to implement a bit of DB normalisation but this is the first schema I've implemented having known about normalisation so I've probably missed some aspects of it and possibly avoided any over normalisation where it didn't feel right.
Below are my tables:
categories
+-------------+---------------+------+--------+---------------+------+-------+----------+
| category_id | category_name | type | budget | running_total | year | month | users_id |
+-------------+---------------+------+--------+---------------+------+-------+----------+
transactions
+----------------+-------------+--------+------+----------+------------------------+
| transaction_id | description | amount | date | users_id | categories_category_id |
+----------------+-------------+--------+------+----------+------------------------+
They are joined on categories_category_id, which is a foreign key.
I have always worked off the premise that each category needs a new entry for each month, but it seems from the comments and answers below that I would be better off having just one category entry regardless of month and then just calculating everything on the fly?
However, the budgets can be changed by the user, so for record keeping I'm not sure if this would work. Also, the "deposits" never really happen; it is just the category being duplicated at the end of the month, so I guess that would need to be dealt with.
The aim of this app has always been to decouple financial tracking from the physical transactions that occur in a bank account and to provide a layer over someone's finances, thus allowing the user to avoid hard-to-explain transactions etc. and just focus on overall cash position. There is no concept of an "income" in this system, or a bank account.

It seems to me like your database design could use some work. I'm still not completely familiar with what you're really trying to do, but my initial thoughts would be to store each transaction as a single row in a table, and then query that table in different ways to generate different types of reports on it. Something like this:
transactions:
+----+---------+--------+---------------+-----------+-------------+
| id | user_id | amount | running_total | datestamp | category_id |
+----+---------+--------+---------------+-----------+-------------+
categories:
+----+------+------+
| id | name | type |
+----+------+------+
Don't increment the categories based on time. Add an entry to the categories table when you actually have a new category. If a transaction could possibly belong to multiple categories, then use a third (relational) table that relates transactions (based on transaction ID) to categories (based on category ID).
When you have a deposit, the amount field will be positive and for withdrawals, it will be negative. You can get your current running total by doing something like:
SELECT running_total FROM transactions
WHERE id = (SELECT MAX(id) FROM transactions WHERE user_id = '$userID');
You can find your total difference for a particular month by doing this:
SELECT SUM(amount) FROM transactions WHERE DATE_FORMAT(datestamp, '%c') = '$monthNumber';
You can find the total spending for a particular category by doing this:
SELECT SUM(t.amount) FROM transactions t
INNER JOIN categories c ON t.category_id = c.id WHERE c.name = 'Big TV';
There are plenty of other possibilities, but the purpose here is just to demonstrate a possibly better way to store your data.
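To show how the six-month savings history could come out of a layout like this, here is a small sketch. It uses Python's sqlite3 module with an in-memory database in place of MySQL, and it leaves out the running_total column and derives the running total from the amounts instead. The sample rows (an opening balance of 180 followed by three monthly deposits of 20) and the helper name monthly_history are invented for the illustration.

```python
import sqlite3
from itertools import accumulate

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL,
        datestamp TEXT, category_id INTEGER REFERENCES categories(id));
    INSERT INTO categories VALUES (1, 'Big TV', 'sav');
    -- Hypothetical rows: an opening balance, then the monthly budget of 20.
    INSERT INTO transactions (user_id, amount, datestamp, category_id) VALUES
        (77, 180, '2012-05-31', 1),
        (77, 20, '2012-06-30', 1),
        (77, 20, '2012-07-31', 1),
        (77, 20, '2012-08-31', 1);
""")

def monthly_history(conn, category_id):
    """Return [(YYYY-MM, running total at month end)] for one category."""
    rows = conn.execute(
        "SELECT strftime('%Y-%m', datestamp) AS ym, SUM(amount) "
        "FROM transactions WHERE category_id = ? GROUP BY ym ORDER BY ym",
        (category_id,),
    ).fetchall()
    totals = accumulate(amount for _, amount in rows)
    return [(ym, total) for (ym, _), total in zip(rows, totals)]

print(monthly_history(conn, 1))
```

Because the history is keyed on the category id, renaming "Big TV" to "Bigger TV" in the categories table would not break the link between months.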

OO design issue (using Symfony2)

I'm developing a sports court booking system and I need to generate a "booking table" that shows the columns in the table header as courts and the rows as time slots for bookings.
E.g.,
___________________________________
| | | |
| Court 1 | Court 2 | Court 3 |
|___________|___________|___________|
| | | |
| 10.00 am | 10.00 am | 10.00 am |
|___________|___________|___________|
| | | |
| 11.00 am | 11.00 am | 11.00 am |
|___________|___________|___________|
Requirements:
A club can have any number of courts
A club can have any time increment for bookings (e.g., 1 hour as shown above, 30 minutes, 40 minutes, etc)
Each cell in the table represents a "booking"
I want to make sure I do this right from the start so I have a few questions:
What entities would you create to achieve this
How would you go about generating this booking table
How would you link a cell in the above table to a booking
Thanks in advance.

Well, I think this is kind of standard?
First, you need a club entity. Each club can have n courts:
Club 1:n Court
Then there is a booking table, which is 1:n to a court:
Court 1:n Booking
I don't know if your second requirement means that one club has one time increment (in which case this is one variable on the club entity) or if it can have many (then there would be a TimeIncrement entity).
Generating the table can be a bit tricky. Thinking about it for a few minutes, I came up with 5-6 solutions which might work. You could use special objects which you can ask for the booking for a specific court and time and which search a Collection. Or you could build up an array where you have one key for every time, and if there is no booking it's null: have one array for each court, then do 2 nested for loops and read every value from the arrays. You could build up queries which rearrange the data so you can use them directly. Or maybe you can ask the court object itself for the booking on a specific date and time.
But I guess that is what the developer is for... Find out what works best for the given requirements and implement it.
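One of the approaches mentioned above, a lookup of bookings read by two nested loops, might look roughly like this. It is a sketch in Python rather than Symfony2/PHP, bookings are plain dictionaries instead of Doctrine entities, and the function name build_booking_table, the dictionary keys and the example date are all invented for the illustration.

```python
from datetime import datetime, timedelta

def build_booking_table(courts, bookings, opens, closes, increment_minutes):
    """Return rows of (slot_start, [booking or None for each court])."""
    # Index bookings by (court, start) so every cell is a single dictionary lookup.
    by_cell = {(b["court"], b["start"]): b for b in bookings}
    step = timedelta(minutes=increment_minutes)
    rows = []
    slot = opens
    while slot < closes:  # outer loop: one row per time slot
        cells = [by_cell.get((court, slot)) for court in courts]  # inner loop: one cell per court
        rows.append((slot, cells))
        slot += step
    return rows

courts = ["Court 1", "Court 2", "Court 3"]
day = datetime(2024, 1, 1)  # arbitrary example date
bookings = [{"id": 1, "court": "Court 2", "start": day.replace(hour=11)}]
table = build_booking_table(courts, bookings, day.replace(hour=10), day.replace(hour=12), 60)
for slot, cells in table:
    print(slot.strftime("%I.%M %p"), ["free" if c is None else f"booking {c['id']}" for c in cells])
```

The time increment is just a parameter here, so a club with 30 or 40 minute slots only changes increment_minutes. Each cell holds either None or the booking itself, which is what links a cell to a booking.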

What entities would you create to achieve this
Off the top of my head it looks like you'll need 3: Club, Court, Booking
How would you go about generating this booking table
The table should probably consist of id, court_id, start_time, end_time
How would you link a cell in the above table to a booking
As mentioned above, start/end times are columns in the bookings table.

I would just query the data from the database and turn it into json and pass it into the website. The frontend then can build the table with javascript.
For that I would create a custom entity BookingTable that returns data on request directly as an array which then can be easily turned into json with json_encode.
You can then concentrate on the more detailed pages that show a single booking, for which you will automatically create the entities you need (if you haven't already created them to formulate the DQL for the custom table entity).

Conditionally Update Several Records Across Tables Based On Value Of A Column's Value For Each Row

I'm trying to streamline updating database records. I've investigated several methods and options, but any that focus on PHP instead of MySQL would require hundreds if not thousands of queries, which is clearly not desirable. I'm working with three tables: one contains rows that identify various aspects of a product, one connects the descriptive values to products, and the last holds the products. The descriptive table has a column that is to trigger an update for all products that use it. The trouble I'm having is that the update might require the product to use a new description, and that description might have to be created and then set in the product connect table. The description table is also used to calculate the new value based off of another entry.
Description Table:
id | unit | type              | update
---------------------------------------
1  | 4oz  | weight            |
2  | 3    | servings          | 1
3  | 7oz  | weight            |
4  | 4    | servings          |
5  | 0.66 | price_per_serving |
6  | 0.49 | price_per_oz      |
Connector Table:
id | description_id | product_id
--------------------------------
1  | 1              | 1
2  | 2              | 1
3  | 2              | 2
4  | 4              | 3
Product Table:
id | name      | price
-----------------------
1  | soup      | 1.99
2  | crackers  | 2.00
3  | chips     | 0.79
4  | candy bar | 0.99
So from the example tables I need to find that item 2 in the description table needs updating. Through the connector table I see the soup and crackers have three servings. I need to take the price of each item using entry 2 and divide it by the number of servings. If that new value isn't in the description table (say 0.78 price_per_serving), it needs to be created and the connector table needs to hold the id of the existing matching entry if one exists or the new one that was created if it didn't. The entry update field will now be cleared.
I can perform all of these steps in PHP with no problem, but as you can see it would place an enormous burden on resources.
I've been trying to wrap my head around sub-queries and reading plenty of articles, manuals and other questions on this site but I'm just so overwhelmed I don't know how to get started. I've found queries like the example shown in this post (http://stackoverflow.com/questions/2542571/mysql-case-when-then-returning-wrong-data-type-blob) but my MySQL experience is mostly select, update, insert.

Statistical method for grading a set of exponential data

I have a PHP application that allows the user to specify a list of countries and a list of products. It tells them which retailer is the closest match. It does this using a formula similar to this:
(
(number of countries matched / number of countries selected) * (importance of country match)
+
(number of products matched / number of products selected) * (importance of product match)
)
*
(significance of both country and solution matching * (coinciding matches / number of possible coinciding matches))
Where [importance of country match] is 30%, [importance of product match] is 10% and [significance of both country and solution matching] is 2.5
So to simplify it: (country match + product match) * multiplier.
Think of it as [do they operate in that country? + do they sell that product?] * [do they sell that product in that country?]
This gives us a match percentage for each retailer which I use to rank the search results.
My data table looks something like this:
id | country | retailer_id | product_id
========================================
1 | FR | 1 | 1
2 | FR | 2 | 1
3 | FR | 3 | 1
4 | FR | 4 | 1
5 | FR | 5 | 1
Until now it's been fairly simple as it has been a binary decision. The retailer either operates in that country or sells that product or they don't.
However, I've now been asked to add some complexity to the system. I've been given the revenue data, showing how much of that product each retailer sells in each country. The data table now looks something like this:
id | country | retailer_id | product_id | revenue
===================================================
1 | FR | 1 | 1 | 1000
2 | FR | 2 | 1 | 5000
3 | FR | 3 | 1 | 10000
4 | FR | 4 | 1 | 400000
5 | FR | 5 | 1 | 9000000
My problem is that I don't want retailer 3 selling ten times as much as retailer 1 to make them ten times better as a search result. Similarly, retailer 5 shouldn't be nine thousand times better as a match than retailer 1. I've looked into using the mean, the mode and median. I've tried using the deviation from the mean. I'm stumped as to how to make the big jumps less significant. My ignorance of the field of statistics is showing.
Help!

Consider using the log10() function. This reduces the direct scaling of results, like you were describing. If you take log10() of the revenue, then someone with a revenue 1000 times larger receives a score only 3 points higher.
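To see the effect on the revenue figures from the question, here is a quick sketch (in Python for brevity; PHP has a log10() function too). The helper name damped is made up, and it assumes every revenue value is greater than zero, since the logarithm of zero is undefined.

```python
import math

# Revenue per retailer, taken from the table in the question.
revenue = {1: 1000, 2: 5000, 3: 10000, 4: 400000, 5: 9000000}

def damped(revenue_value):
    """Log-scaled revenue; assumes revenue_value > 0."""
    return math.log10(revenue_value)

scores = {retailer: round(damped(value), 2) for retailer, value in revenue.items()}
print(scores)
```

The scores come out at 3.0, 3.7, 4.0, 5.6 and 6.95, so retailer 5 ends up a little over twice retailer 1 rather than nine thousand times as much.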

Logarithms are a classic way of "dampening" huge increases in value. If you look at the Wikipedia article on logarithms, you will see that the function value initially grows fairly quickly but then much less so. As mentioned in another answer, a logarithm with base 10 means that each time you multiply the input value by ten, the output value increases by one. Similarly, a logarithm with base two will grow by one each time you multiply the input value by two.
If you want to weaken the effect of the logarithm, you could look into combining it with, say, a linear function, e.g. f(x) = log2 x + 0.0001 x... but that multiplier there would need to be tuned very carefully so that the linear part doesn't quickly overshadow the logarithmic part.
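A quick way to judge the multiplier is to print the two parts of f(x) = log2 x + 0.0001 x side by side for the revenues in the question. This is only a sketch in Python; the function name blended and its linear_weight parameter are invented for the example.

```python
import math

def blended(x, linear_weight=0.0001):
    """Return the (logarithmic part, linear part) of f(x) = log2(x) + linear_weight * x."""
    return math.log2(x), linear_weight * x

for x in (1000, 10000, 400000, 9000000):
    log_part, linear_part = blended(x)
    print(x, round(log_part, 2), round(linear_part, 2))
```

With 0.0001 the linear part is negligible at 1,000 (0.1 against roughly 10) but already larger than the logarithmic part at 400,000 (about 40 against about 18.6), and at 9,000,000 it dominates completely, so for this data set the multiplier would have to be much smaller.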
Coming up with this kind of weighting is inherently tricky, especially if you don't know exactly what the function is supposed to look like. However, there are programs that do curve fitting, i.e. you can give it pairs of function input/output and a template function, and the program will find good parameters for the template function to approximate the desired curve. So, in theory you could draw your curve and then make a program figure out a good formula. That can be a bit tricky, too, but I thought you might be interested. One such program is the open source tool QtiPlot.
