I have the following tables
customers
cust_id cust_name
1 a company
2 a company 2
3 a company 3
tariffs
tariff_id cost_1 cost_2 cost_3
1 2 0 3
2 1 1 1
3 4 0 0
terminals
term_id cust_id term_number tariff_id
1 1 12345 1
2 1 67890 2
3 2 14324 1
4 3 78788 3
usage
term_ident usage_type usage_amount date
12345 1 20 11/12/2010
67890 2 10 31/12/2010
14324 1 1 01/01/2011
14324 2 5 01/01/2011
78788 1 0 14/01/2011
In real life the tables are quite large - there are 5000 customers, 250 tariffs, 500000 terminals and 5 million usage records.
In the terminals table - term_id, cust_id and tariff_id are all foreign keys. There are no foreign keys in the usage table - this is just raw data imported from a csv file.
There could be terminals in the usage table that do not exist in the terminals table - these can be ignored.
What I need to do is produce an invoice per customer based on usage. I only want to include usage between 15/12/2010 and 15/01/2011, which is the billing period. I need to calculate the invoice line items from the usage and the terminal's tariff. For example, take the first record in the usage table: the cost for usage_type 1 on terminal 12345 (term_id 1) would be 20 x 2 = 40, because that terminal uses tariff_id 1, where cost_1 is 2.
The output should be as follows
Customer 2
date terminal usage_cost_1 usage_cost_2 usage_cost_3 total cost
01/01/2011 14324 18 0 6 24
I am a competent PHP developer - but only a beginner with SQL. What I need some advice on is the most efficient process for producing the invoices - perhaps there is an SQL query that would help me before the processing in PHP starts to calculate the costs - or perhaps the SQL statement could produce the costs too ? Any advice is welcome ....
Edit:
1) There is something currently running this process. It's written in C++ and takes around 24 hours to process this; I do not have access to the source.
2) I am using Doctrine in Symfony. I'm not sure how helpful Doctrine will be, as retrieving data as objects is only going to slow down the process, and I'm not sure the use of objects will help much here.
Edit #13:54 ->
Had the usage table specified incorrectly ... Sorry !
I have to map each usage_type to a cost on the tariff for that terminal, i.e. a usage_type of 1 maps to cost_1 in the appropriate tariff. I guess that makes it slightly more complicated?
There you go, should take less than 24 hours ;)
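SELECT u.date, u.term_ident AS terminal,
(ta.cost_1 * u.usage_1) AS usage_cost_1,
(ta.cost_2 * u.usage_2) AS usage_cost_2,
(ta.cost_3 * u.usage_3) AS usage_cost_3,
-- column aliases cannot be reused in the same SELECT list, so the total repeats the expressions
(ta.cost_1 * u.usage_1 + ta.cost_2 * u.usage_2 + ta.cost_3 * u.usage_3) AS total_cost
-- note: if this is MySQL, USAGE is a reserved word and the table name must be quoted with backticks
FROM usage u
INNER JOIN terminals te ON te.term_number = u.term_ident
INNER JOIN tariffs ta ON ta.tariff_id = te.tariff_id
INNER JOIN customers c ON c.cust_id = te.cust_id
WHERE u.date BETWEEN '2010-12-15' AND '2011-01-15'
AND c.cust_id = 2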
This query is only for the customer with cust_id = 2. If you want a result for the whole dataset, just remove the condition.
Update
It's not that trivial with your new requirements. You could transform the new usage table back into the layout you posted before (one usage column per type).
To make a decision inside a SELECT query you can use a CASE expression like this. It is not the result you expect, because it returns one cost per usage row, but it could be used to build the transformed usage table.
SELECT u.date, u.term_ident AS terminal,
-- pick the tariff cost that matches this row's usage_type
CASE u.usage_type
WHEN 1 THEN ta.cost_1 * u.usage_amount
WHEN 2 THEN ta.cost_2 * u.usage_amount
WHEN 3 THEN ta.cost_3 * u.usage_amount
END AS usage_cost
FROM usage u
INNER JOIN terminals te ON te.term_number = u.term_ident
INNER JOIN tariffs ta ON ta.tariff_id = te.tariff_id
INNER JOIN customers c ON c.cust_id = te.cust_id
WHERE u.date BETWEEN '2010-12-15' AND '2011-01-15'
AND c.cust_id = 2
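One way to get from the per-row CASE above to the invoice layout asked for in the question is conditional aggregation: group by customer, date and terminal, and sum each usage type into its own column. The following is only a sketch, run through Python's sqlite3 module so it can be tried without a server. The table name usage_log is a stand-in (USAGE is a reserved word in MySQL), and it assumes the dates are converted to ISO format (YYYY-MM-DD) on import so that BETWEEN compares correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tariffs (tariff_id INTEGER, cost_1 INTEGER, cost_2 INTEGER, cost_3 INTEGER);
CREATE TABLE terminals (term_id INTEGER, cust_id INTEGER, term_number INTEGER, tariff_id INTEGER);
-- usage_log is a hypothetical name for the question's usage table
CREATE TABLE usage_log (term_ident INTEGER, usage_type INTEGER, usage_amount INTEGER, date TEXT);
INSERT INTO tariffs VALUES (1,2,0,3),(2,1,1,1),(3,4,0,0);
INSERT INTO terminals VALUES (1,1,12345,1),(2,1,67890,2),(3,2,14324,1),(4,3,78788,3);
INSERT INTO usage_log VALUES
  (12345,1,20,'2010-12-11'),(67890,2,10,'2010-12-31'),
  (14324,1,1,'2011-01-01'),(14324,2,5,'2011-01-01'),(78788,1,0,'2011-01-14');
""")

# One output row per customer, date and terminal; each usage_type is summed
# into its own cost column. The inner joins drop unknown terminals.
rows = conn.execute("""
SELECT te.cust_id, u.date, u.term_ident AS terminal,
       SUM(CASE WHEN u.usage_type = 1 THEN ta.cost_1 * u.usage_amount ELSE 0 END) AS usage_cost_1,
       SUM(CASE WHEN u.usage_type = 2 THEN ta.cost_2 * u.usage_amount ELSE 0 END) AS usage_cost_2,
       SUM(CASE WHEN u.usage_type = 3 THEN ta.cost_3 * u.usage_amount ELSE 0 END) AS usage_cost_3,
       SUM(CASE u.usage_type
             WHEN 1 THEN ta.cost_1
             WHEN 2 THEN ta.cost_2
             WHEN 3 THEN ta.cost_3
             ELSE 0
           END * u.usage_amount) AS total_cost
FROM usage_log u
JOIN terminals te ON te.term_number = u.term_ident
JOIN tariffs ta ON ta.tariff_id = te.tariff_id
WHERE u.date BETWEEN '2010-12-15' AND '2011-01-15'
GROUP BY te.cust_id, u.date, u.term_ident
ORDER BY te.cust_id, u.date, u.term_ident
""").fetchall()

for row in rows:
    print(row)
```

With 5 million usage rows, an index on terminals(term_number) and on the usage date column would probably matter more than the exact shape of the query.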
Related
I need a rating system for my app in which a user can rate a thread from 1 to 5. The calculation I was going to use is shown below:
ID UID TID Rating
--------------------------------------
1 1 37 5
2 4 37 5
3 8 37 5
4 22 37 5
5 2 37 5
This is a sample table. The way I calculated the rating was:
(r1 + r2 + r3 + r4 + r5) / number of ratings
= ((0 x 1) + (0 x 2) + (0 x 3) + (0 x 4) + (5 x 5)) / 5 = 5
No user (UID) gave a rating of 1, so r1 = (0 x 1); five users gave a rating of 5, so r5 = (5 x 5); r2, r3 and r4 are 0 because there are no ratings with those values. I hope that is clear; if not, I'll explain more.
So I get a rating of 5 stars.
But my problem is this: if 100 different users all rate a thread 5, the formula gives a rating of 5, and if only 5 different users rate another thread 5, that thread also gets a rating of 5. I don't want that result, because both threads tie at the top. I know I could order the threads in SQL by the number of users who rated them, so that the thread with 100 raters goes to the top. That works, but the thread rated 5 by only 5 users will still be second.
Is there another way to rate these threads that takes into account how many users have rated each one?
I hope my question is clear; if not, I'll edit it.
I also need to write a PHP script that calculates this in my SELECT statement when I retrieve the rating, but I'll ask another question once this is solved. Thanks.
I saw that you mentioned SELECT statements, so I assume you meant the SQL statements required.
What you asked for can be done purely using SQL itself, using the below line.
SELECT TID, AVG(RATING) FROM ratings GROUP BY TID ORDER BY AVG(RATING) DESC, SUM(RATING) DESC;
Here is the data I used for testing.
CREATE TABLE ratings(ID INT,UID INT,TID INT,RATING INT);
INSERT INTO ratings VALUES (1,1,37,5);
INSERT INTO ratings VALUES (2,4,37,5);
INSERT INTO ratings VALUES (3,8,37,5);
INSERT INTO ratings VALUES (4,22,37,5);
INSERT INTO ratings VALUES (5,2,37,5);
INSERT INTO ratings VALUES ( 6,1,12,5);
INSERT INTO ratings VALUES ( 7,4,12,5);
SELECT TID, AVG(RATING) FROM ratings GROUP BY TID ORDER BY AVG(RATING) DESC, SUM(RATING) DESC;
The output I got
TID AVG(RATING)
------------------
37 5
12 5
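To make the tie-break visible, the same test data can be run with a vote count next to the average. This is a small sketch using Python's sqlite3 module rather than the answerer's original setup; the votes column is an addition for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (ID INT, UID INT, TID INT, RATING INT);
INSERT INTO ratings VALUES (1,1,37,5),(2,4,37,5),(3,8,37,5),(4,22,37,5),(5,2,37,5);
INSERT INTO ratings VALUES (6,1,12,5),(7,4,12,5);
""")

# Same ordering as the answer's query: average first, then the sum of ratings.
# For threads with equal averages, the larger sum means more voters.
rows = conn.execute("""
SELECT TID, AVG(RATING) AS avg_rating, COUNT(*) AS votes
FROM ratings
GROUP BY TID
ORDER BY AVG(RATING) DESC, SUM(RATING) DESC
""").fetchall()

for row in rows:
    print(row)
```

Note that the sum only breaks ties between equal averages: a thread with a slightly lower average still sorts below, however many users rated it.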
I was wondering if somebody can think of a more elegant solution to my problem. I have trouble finding similar cases.
I have 5 tables. 3 are details for employees, skills and subskills. The remaining 2 are linking tables.
skill_links
skill_id subskill_id
1 4
1 5
2 4
2 6
emp_skill_links
employee_id subskill_id acquired
1 4 2013-04-05 00:00:00
1 5 2014-02-24 00:00:00
2 6 2012-02-26 00:00:00
2 5 2011-06-14 00:00:00
Both have many-to-many relations. Skills with subskills (skill_links) and employees with subskills (emp_skill_links).
I want to pick employees who have acquired all subskills for a skill. I tried doing it with one query, but couldn't manage it with the grouping involved. At the moment my solution is two separate queries and matching these in php array later. That is:
SELECT sl.skill_id, COUNT(sl.subskill_id) as expected
FROM skill_links sl
GROUP BY sl.skill_id
to be compared with:
SELECT sl.skill_id, esl.employee_id, COUNT(esl.subskill_id) as provided
FROM emp_skill_links esl
INNER JOIN skill_links sl
ON sl.subskill_id = esl.subskill_id
GROUP BY sl.skill_id, esl.employee_id
Is there a more efficient single query solution to my problem? Or would it not be worth the complexity involved?
If you consider a query consisting of sub-queries as meeting your requirement for "a more efficient single query solution" (depends on your definition of "single query"), then this will work.
SELECT employeeTable.employee_id
FROM
(SELECT sl.skill_id, COUNT(*) AS subskill_count
FROM skill_links sl
GROUP BY sl.skill_id) skillTable
JOIN
(SELECT esl.employee_id, sl2.skill_id, COUNT(*) AS employee_subskills
FROM emp_skill_links esl
JOIN skill_links sl2 ON esl.subskill_id = sl2.subskill_id
GROUP BY esl.employee_id, sl2.skill_id) employeeTable
ON skillTable.skill_id = employeeTable.skill_id
WHERE employeeTable.employee_subskills = skillTable.subskill_count
What the query does:
Select the count of sub-skills for each skill
Select the count of sub-skills for each employee for each main skill
Join those results based on the main skill
Select the employees from that who have a sub-skill count equal to the count of sub-skills for the main skill
DEMO
In this example, users 1 and 3 each have all sub-skills of main skill 1. User 2 only has 2 of the 3 sub-skills of main skill 2.
You'll note that the logic here is similar to what you're already doing, but it has the advantage of just one db request (instead of two) and it doesn't involve the PHP work of creating, looping through, comparing, and reducing arrays.
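As a check against the sample rows from the question (not the demo data), the query can be run through Python's sqlite3 module. This is a sketch under two assumptions: neither linking table contains duplicate rows, and skill_id is added to the output so it is clear which skill each employee has completed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE skill_links (skill_id INTEGER, subskill_id INTEGER);
CREATE TABLE emp_skill_links (employee_id INTEGER, subskill_id INTEGER, acquired TEXT);
INSERT INTO skill_links VALUES (1,4),(1,5),(2,4),(2,6);
INSERT INTO emp_skill_links VALUES
  (1,4,'2013-04-05 00:00:00'),(1,5,'2014-02-24 00:00:00'),
  (2,6,'2012-02-26 00:00:00'),(2,5,'2011-06-14 00:00:00');
""")

# Compare, per employee and skill, how many sub-skills were acquired
# against how many sub-skills the skill requires.
rows = conn.execute("""
SELECT employeeTable.employee_id, employeeTable.skill_id
FROM (SELECT sl.skill_id, COUNT(*) AS subskill_count
      FROM skill_links sl
      GROUP BY sl.skill_id) skillTable
JOIN (SELECT esl.employee_id, sl2.skill_id, COUNT(*) AS employee_subskills
      FROM emp_skill_links esl
      JOIN skill_links sl2 ON esl.subskill_id = sl2.subskill_id
      GROUP BY esl.employee_id, sl2.skill_id) employeeTable
  ON skillTable.skill_id = employeeTable.skill_id
WHERE employeeTable.employee_subskills = skillTable.subskill_count
ORDER BY employeeTable.employee_id, employeeTable.skill_id
""").fetchall()

print(rows)
```

On this data only employee 1 qualifies, for skill 1: employee 2 holds one of the two sub-skills of each skill.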
I am developing a small gaming website for a college fest where users take part in a few contests and, based on their ranks in the result table, points are updated in the user table. The result table is then truncated for the next event. The schemas are as follows:
user
-------------------------------------------------------------
user_id | name | college | points |
-------------------------------------------------------------
result
---------------------------
user_id | score
---------------------------
Now, the first 3 students are given 100 points, the next 15 are given 50 points, and all others are given 10 points each.
I am having a problem building the queries because I don't know in advance how many users will attempt the contest, so I have to append that many ? placeholders to the query and then close it with ).
My queries look like this:
$query_top3 = "UPDATE user SET points = points + 100 WHERE user_id IN (?,?,?)";
$query_next15 = "UPDATE user SET points = points + 50 WHERE user_id IN (?,?,...,?)";
$query_others = "UPDATE user SET points = points + 10 WHERE user_id IN (?,?,...,?)";
How can I prepare those queries dynamically? Or is there a better approach?
EDIT
Though it's similar to this question, in my scenario I have 3 different dynamic queries.
If I understand your requirements correctly, you can rank the results and update the users table (adding points) all in one query:
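UPDATE users u JOIN
(
SELECT user_id,
(
-- rank = 1 + number of other users with an equal or higher score
SELECT 1 + COUNT(*)
FROM result
WHERE score >= r.score
AND user_id <> r.user_id
) user_rank -- RANK is a reserved word from MySQL 8.0 on, so the alias avoids it
FROM result r
) q
ON u.user_id = q.user_id
SET u.points = u.points +
CASE
WHEN q.user_rank BETWEEN 1 AND 3 THEN 100
WHEN q.user_rank BETWEEN 4 AND 18 THEN 50
ELSE 10
END;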
It is totally dynamic, based on the contents of the result table. You no longer need to deal with each user_id individually.
Here is SQLFiddle demo
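If you would rather keep three separate prepared statements, as in the question, the placeholder list can be generated from the number of ids. The following is a minimal sketch in Python with sqlite3, not the answer's single-query approach; award_points is a hypothetical helper, the table is named users as in the answer, and ties in score are broken by user_id, which is an assumption.

```python
import sqlite3

def award_points(conn, user_ids, points):
    """Add `points` to every user in `user_ids` with one parameterised UPDATE."""
    if not user_ids:  # nothing to update for this tier
        return
    # One "?" per id, e.g. "?,?,?" for three ids
    placeholders = ",".join("?" * len(user_ids))
    conn.execute(
        f"UPDATE users SET points = points + ? WHERE user_id IN ({placeholders})",
        [points, *user_ids],
    )

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, points INTEGER NOT NULL DEFAULT 0);
CREATE TABLE result (user_id INTEGER, score INTEGER);
""")
# 20 made-up contestants; user 1 has the highest score, user 20 the lowest
conn.executemany("INSERT INTO users (user_id) VALUES (?)", [(i,) for i in range(1, 21)])
conn.executemany("INSERT INTO result VALUES (?, ?)", [(i, 1000 - i) for i in range(1, 21)])

# Rank once, then slice the ordered ids into the three tiers
ranked = [row[0] for row in
          conn.execute("SELECT user_id FROM result ORDER BY score DESC, user_id")]
award_points(conn, ranked[:3], 100)    # first 3
award_points(conn, ranked[3:18], 50)   # next 15
award_points(conn, ranked[18:], 10)    # everyone else

points = dict(conn.execute("SELECT user_id, points FROM users"))
print(points)
```

The same idea carries over to PHP: build the comma-separated string of ? from the count of ids and bind the id array. Only the placeholders are interpolated into the SQL, never the values themselves.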
I have a web application that uses 2 tables, one for storing product information and the other for storing the votes for each product.
Now I'd like to display the products ordered by the number of votes each product has received. Below is the table structure:
Products:
PRODUCT_ID TITLE
1 product1
2 product2
3 product3
4 product4
Votes:
PRODUCT_ID USER_ID
1 1
1 1
2 2
3 2
And I am expecting a result to display the products in descending order of the votes
PRODUCT_ID TITLE VOTES
1 product1 2
2 product2 1
3 product3 1
Currently I am using a query like this
SELECT p.product_id, p.title, count(*) AS total FROM products p
INNER JOIN votes v ON v.product_id = p.product_id GROUP BY p.product_id
ORDER BY count(*) DESC LIMIT 110
The products table has around 30,000 records and the votes table has around 90,000 records.
The problem is that the query takes a long time (anywhere between 18 and 30 seconds). Since the number of records in the tables isn't that high, I wonder why it takes so long.
One thing I noticed is that when I run the query a second time it fetches the results in a few milliseconds, which I think is the ideal time for a simple query like this.
Again, I am pretty new to the database side of programming.
I am not sure whether there is anything wrong with the query or whether the table structure is inefficient (at least for fetching the records quickly).
First, your query is fine, although I would be inclined to format it differently:
SELECT p.product_id, p.title, count(*) AS total
FROM products p INNER JOIN
votes v
ON v.product_id = p.product_id
GROUP BY p.product_id
ORDER BY count(*) DESC
LIMIT 110;
As mentioned in another answer, an index on votes(product_id) would definitely help the query, if you don't have one already. Even with the improvement in the join performance, you still have the overhead of an aggregation. And, in MySQL that can be a lot of overhead.
If you are expecting lots and lots more votes -- getting into the millions -- then you may have to take another approach. One approach is to add a trigger that keeps track of votes in some table (perhaps the products table) as they come in. Then the query would fly. Another approach would be to periodically summarize the data, similar to using a trigger but using a job instead.
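The trigger idea could look roughly like the following sketch. It uses SQLite through Python's sqlite3 module, so the trigger syntax differs slightly from MySQL's; the vote_count column and the trigger and index names are hypothetical additions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- vote_count is a hypothetical running total maintained by the trigger
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    title TEXT,
    vote_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE votes (product_id INTEGER, user_id INTEGER);
-- the index suggested for the original join query
CREATE INDEX idx_votes_product ON votes (product_id);

-- bump the running total every time a vote row is inserted
CREATE TRIGGER votes_after_insert AFTER INSERT ON votes
BEGIN
    UPDATE products SET vote_count = vote_count + 1
    WHERE product_id = NEW.product_id;
END;

INSERT INTO products (product_id, title) VALUES
  (1,'product1'),(2,'product2'),(3,'product3'),(4,'product4');
INSERT INTO votes VALUES (1,1),(1,1),(2,2),(3,2);
""")

# The listing no longer needs a join or an aggregation
rows = conn.execute("""
SELECT product_id, title, vote_count
FROM products
WHERE vote_count > 0
ORDER BY vote_count DESC, product_id
LIMIT 110
""").fetchall()

for row in rows:
    print(row)
```

If votes can be withdrawn, a matching AFTER DELETE trigger would be needed to keep the total in step.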
I am writing a PHP script to determine the fuel usage of trucks. I use a MySQL table for this.
There are several locations where a truck can get fuel, say locations A, B, C and D.
The truck gets fuel from whichever of these locations is closest. Every time the truck gets fuel, the person responsible enters the amount of fuel and the odometer value into the program.
sequence_id locations fuelDispensed odometer
1 C 700 8100
2 A 400 9700
3 B 500 15500
4 C 600 17950
and so on.
With this info from the db, it is easy to find how many km or miles the truck travelled from one location to another, just by calculating the odometer difference between successive rows ordered by sequence_id.
The problem is that people may forget, or be unable, to enter the values at the time and do it later. The data then looks like this:
sequence_id locations fuelDispensed odometer
1 C 700 8100
2 B 500 15500
3 C 600 17950
4 A 400 9700
In this case, it is not possible to calculate between successive rows based on sequence_id. Sorting the odometer values in ascending order and then calculating between successive rows seems logical, but I could not work out how to do this.
Edit: My query is something like this:
SELECT
t1.odometer AS km1,
t2.odometer AS km2
FROM fueldispensed AS t2, fueldispensed AS t1
WHERE (t1.sequence_id + 1 = t2.sequence_id) AND (t1.truck_id = '$truckid') AND (t2.truck_id = '$truckid') ORDER BY t1.sequence_id
Adding ORDER BY to this query has no effect, since the rows are paired on sequence_id.
Add an ORDER BY to your SQL select statement:
ORDER BY odometer ASC
EDIT
OK! I think I understand your problem now.
SELECT t1.truck_id,
t1.odometer AS km1,
-- the next reading is the smallest odometer value above the current one
MIN(t2.odometer) AS km2
FROM fueldispensed AS t1,
fueldispensed AS t2
WHERE t2.truck_id = t1.truck_id
AND t2.odometer > t1.odometer
GROUP BY t1.truck_id,
t1.odometer
ORDER BY t1.truck_id,
t1.odometer
This should give you something that works, though it is not as efficient as it could be.
Add your truck_id filter to the query as appropriate.
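For reference, here is the same self-join run against the out-of-order sample rows from the question, with the distance between readings added. It is a sketch using Python's sqlite3 module; the truck_id value 7 and the column name location are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fueldispensed (
    sequence_id INTEGER, truck_id INTEGER, location TEXT,
    fuelDispensed INTEGER, odometer INTEGER
);
-- rows entered late, so sequence_id no longer follows the odometer
INSERT INTO fueldispensed VALUES
  (1,7,'C',700,8100),(2,7,'B',500,15500),(3,7,'C',600,17950),(4,7,'A',400,9700);
""")

# Pair each reading with the smallest odometer value above it for the same
# truck; the last reading has no successor and drops out of the join.
rows = conn.execute("""
SELECT t1.odometer AS km1,
       MIN(t2.odometer) AS km2,
       MIN(t2.odometer) - t1.odometer AS distance
FROM fueldispensed AS t1
JOIN fueldispensed AS t2
  ON t2.truck_id = t1.truck_id AND t2.odometer > t1.odometer
WHERE t1.truck_id = ?
GROUP BY t1.truck_id, t1.odometer
ORDER BY t1.odometer
""", (7,)).fetchall()

for row in rows:
    print(row)
```

Passing the truck id as a bound parameter, rather than interpolating '$truckid' into the string, also avoids SQL injection.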