I am trying to make a "top purchaser" module on my store and I am a bit confused about the MySQL query.
I have a table with all transactions and I need to select the person (who could have one or many transactions) with the highest amount of money spent in the past month.
What I have:
name | money spent
------------------
john | 50
mike | 12
john | 10
jane | 504
carl | 99
jane | 12
jane | 1
What I want to see:
With a query, I need to see:
name | money spent last month
-----------------------------
jane | 517
carl | 99
john | 60
mike | 12
How do I do that?
I do not really seem to find many good solutions since my MySQL query skills are quite basic. I thought of making a table in which money is added to the user when he buys something.
That's a simple aggregate query:
SELECT t.name, SUM(t.moneyspent) money_spent_last_month
FROM mytable t
GROUP BY t.name
ORDER BY money_spent_last_month DESC
LIMIT 1
The query sums the total money spent per customer name. The results are ordered by descending total money spent, and only the first row is retained.
If you want to restrict the data to last month, you need a column in the table that keeps track of the transaction date, say transaction_date, and then you can just add a WHERE clause to the query, like:
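As a quick check of the aggregation, here is a minimal sketch that loads the sample rows from the question and ranks every customer (the LIMIT is dropped so the full list is visible). It assumes an in-memory SQLite database as a stand-in for MySQL; the table and column names are the ones used in the query above.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, moneyspent INTEGER)")
conn.executemany(
    "INSERT INTO mytable (name, moneyspent) VALUES (?, ?)",
    [("john", 50), ("mike", 12), ("john", 10), ("jane", 504),
     ("carl", 99), ("jane", 12), ("jane", 1)],
)

# Sum per customer and sort by the total, highest first.
# The alias is referenced without the table prefix in ORDER BY.
ranking = conn.execute(
    """
    SELECT t.name, SUM(t.moneyspent) AS money_spent
    FROM mytable t
    GROUP BY t.name
    ORDER BY money_spent DESC
    """
).fetchall()

for name, total in ranking:
    print(name, total)
```

Adding LIMIT 1 to the same statement keeps only the first row, which is the top purchaser.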
SELECT t.name, SUM(t.moneyspent) money_spent_last_month
FROM mytable t
WHERE
t.transaction_date >=
DATE_ADD(LAST_DAY(DATE_SUB(NOW(), INTERVAL 2 MONTH)), INTERVAL 1 DAY)
AND t.transaction_date <=
DATE_SUB(NOW(), INTERVAL 1 MONTH)
GROUP BY t.name
ORDER BY money_spent_last_month DESC
LIMIT 1
This method is usually more efficient than using DATE_FORMAT to format dates as strings and comparing the results.
Hello, I think I'm trying something complicated here. Maybe you can help me out.
I have two tables: earnings and payouts.
Earnings has user_id, amount, a timestamp stored as datetime, and other columns. It just records when a user earned something.
Example:
id | user_id | amount | timestamp |
1 | 2 | 1050 | 31days ago |
2 | 1 | 20 | 10days ago |
3 | 1 | 10 | 9 days ago |
4 | 2 | 10000 | 9 days ago |
...
Payouts has user_id, payout_amount and a timestamp stored as datetime, and holds an entry for each payout made once a user's earnings exceed some threshold x, let's say 1000. Example:
id | user_id | payout_amount | timestamp |
1 | 2 | 1050 | 30days ago |
...
Now to my problem. I want to COUNT how many payouts have NOT been made (users with no matching entry in payouts). That means I need to compare payouts.timestamp with earnings.timestamp for the same user_id and check whether there are earnings newer than the last payout. If so, count them (so I think the earnings need to be summed first). I am not even sure if this is possible.
I can also do it with PHP if this isn't possible with MySQL alone.
For example, the result should be count = 2: user_id 1 has only 30 in earnings, so he hasn't reached 1000 and has no entry in the payouts table. User_id 2 has 10000 but still no payout, because we haven't executed it or made an entry in the payouts table. He only has an old payout, and the new earnings haven't been paid.
Edit: the "10 days ago" values are just examples. I use real datetime types there.
EDIT2: I forgot to say I tried this one:
select COUNT(e.amount) FROM earnings e, payouts p where p.payout_timestamp < e.timestamp AND p.user_id = e.user_id GROUP BY p.user_id, e.user_id
and got this:
| Count(e.amount) |
| 2 |
| 1 |
I think that comparing timestamps is not the only way to get the result you need. If these two tables correctly represent the history of earnings and payments, then you can just sum up the total earnings and payments for each user and compare them, counting every user_id that has less in payments than in earnings.
For example:
SELECT user_id, SUM(amount) as amount FROM earnings GROUP BY user_id;
sums up earnings for each user,
SELECT user_id, SUM(payout_amount) as amount FROM payouts GROUP BY user_id;
sums up payments for each user.
Now left join them on user_id and count the users who have less in payments than in earnings (a user with no payout at all is treated as having been paid 0):
SELECT COUNT(e1.user_id)
FROM (SELECT user_id, SUM(amount) as amount FROM earnings GROUP BY user_id) as e1
LEFT JOIN (SELECT user_id, SUM(payout_amount) as amount FROM payouts GROUP BY user_id) as p1
ON p1.user_id = e1.user_id
WHERE e1.amount > COALESCE(p1.amount, 0);
For your tables example my result was:
+-------------------+
| COUNT(e1.user_id) |
+-------------------+
| 2 |
+-------------------+
And to find the users whose payments have not been made, just use the same query and select e1.user_id without the COUNT() function:
+---------+
| user_id |
+---------+
| 2 |
| 1 |
+---------+
In my opinion this way is more reliable, because a recent payment can be smaller than the user's total earnings at the moment of payment, e.g.:
1) user earned 1000
2) then user earned 2000
3) and after that the user was paid 1000
In this situation, comparing timestamps without comparing the payment amounts will tell you that this user has no payments outstanding, whereas you still need to pay him 2000.
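The totals comparison described in this answer can be tried on the example rows from the question. This is only a sketch under some assumptions: SQLite in memory stands in for MySQL, the timestamp columns are left out because they play no part in the comparison, and the two per-user totals are joined on user_id with a missing payout counted as 0.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE earnings (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER);
CREATE TABLE payouts  (id INTEGER PRIMARY KEY, user_id INTEGER, payout_amount INTEGER);
INSERT INTO earnings (user_id, amount) VALUES (2, 1050), (1, 20), (1, 10), (2, 10000);
INSERT INTO payouts  (user_id, payout_amount) VALUES (2, 1050);
""")

# Total earned vs. total paid per user; a user without any payout row
# gets NULL from the LEFT JOIN, which COALESCE turns into 0.
rows = conn.execute("""
    SELECT e1.user_id
    FROM (SELECT user_id, SUM(amount) AS amount
          FROM earnings GROUP BY user_id) AS e1
    LEFT JOIN (SELECT user_id, SUM(payout_amount) AS amount
               FROM payouts GROUP BY user_id) AS p1
           ON p1.user_id = e1.user_id
    WHERE e1.amount > COALESCE(p1.amount, 0)
    ORDER BY e1.user_id
""").fetchall()

unpaid_users = [user_id for (user_id,) in rows]
print(unpaid_users, len(unpaid_users))
```

User 1 has earned 30 and been paid nothing; user 2 has earned 11050 and been paid 1050, so both are reported.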
I've searched for a few hours now, but couldn't find a relevant solution to a specific algorithm I am working on. To simplify the problem, I would like to present the information in just one table.
_____________________________________
| User | Item | price | qty |
-------------------------------------
| Annie | Dress | 80 | 1 |
| Bob | Jeans | 65 | 3 |
| Cathy | Shoes | 60 | 4 |
| David | Shirts | 40 | 6 |
| Annie | Shoes | 60 | 2 |
| Bob | Shirts | 55 | 2 |
| Cathy | Jeans | 65 | 1 |
| David | Ties | 20 | 5 |
-------------------------------------
Problem # 1: Show users whose total price for shopping at the store is 300 or more and quantity of their purchase is less than or equal to 3. These shoppers will be mailed a coupon for $40.
Problem # 2: Show users whose total qty is greater than or equal to 7 and the total for price is 275 or more. These shoppers will be mailed a coupon for $20.
The rows within the table are not transaction specific. The table can represent separate transactions within a month. We're just trying to find certain returning customers who we would like to reward for shopping with us.
I'm not sure if this can be done only via MySQL, or if I need to have separate queries and store rows into arrays and compare them one by one.
What I have tried so far is the following:
SELECT * FROM table where SUM(price) as Total >= 300 AND SUM(qty) <=3;
I've also tried the following after the research:
SELECT SUM(price) as Total FROM table WHERE SUM(qty) <=3;
I keep getting syntax errors in MySQL shell. You don't have to solve the problems for me, but if you can guide me through the logic on how to solve the problems, I'd appreciate it very much.
Lastly, I'd like to ask once more: can I solve this with only MySQL, or do I need to store the rows in PHP arrays and compare each index?
You can't use an aggregate function in the WHERE clause, you have to use HAVING. WHERE operates on individual rows during the selection, HAVING operates on the final results after aggregating.
SELECT user, SUM(price*qty) as Total
FROM table
GROUP BY user
HAVING Total >= 300 AND SUM(qty) <= 3
SUM is an aggregate function, meaning it applies to a group of rows. Say I am grouping the table data by NAME; the SUM function then adds up all the prices for one NAME.
Having said this, if you think about it logically, it would not make sense to put SUM(PRICE) in a WHERE clause, because the WHERE clause would not know which NAME's SUM(PRICE) to operate on (WHERE is evaluated row by row, before any grouping takes place).
That is why SQL has the HAVING clause. It is used to test the results of aggregate functions once each group has been aggregated.
Consider it like this:
In the WHERE clause, when an ANNIE row from your DB is returned, it does not know what SUM(PRICE) means.
In the HAVING clause, the SUM(PRICE) > 300 condition is evaluated only once SQL has finished grouping all the ANNIE data into one group and calculated SUM(PRICE) for her.
For question 1:
SELECT USER, SUM(PRICE)
FROM table
GROUP BY user
HAVING SUM(PRICE) >= 300 AND SUM(QTY) <= 3
For Question 2:
SELECT USER, SUM(PRICE)
FROM table
GROUP BY user
HAVING SUM(PRICE) >= 275 AND SUM(QTY) >= 7
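To see the WHERE versus HAVING difference on the sample table, here is a small sketch. It assumes an in-memory SQLite database in place of MySQL, a hypothetical table name purchases (since table is a reserved word), and the price * qty reading of "total price" from the first answer; with plain SUM(price) the thresholds would select different shoppers.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (user TEXT, item TEXT, price INTEGER, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [("Annie", "Dress", 80, 1), ("Bob", "Jeans", 65, 3),
     ("Cathy", "Shoes", 60, 4), ("David", "Shirts", 40, 6),
     ("Annie", "Shoes", 60, 2), ("Bob", "Shirts", 55, 2),
     ("Cathy", "Jeans", 65, 1), ("David", "Ties", 20, 5)],
)

# An aggregate inside WHERE is rejected by the engine.
try:
    conn.execute("SELECT user FROM purchases WHERE SUM(qty) <= 3").fetchall()
    where_failed = False
except sqlite3.Error:
    where_failed = True

def shoppers(having_clause):
    """Group by user, then filter the groups with the given HAVING condition."""
    sql = f"""
        SELECT user, SUM(price * qty) AS total, SUM(qty) AS total_qty
        FROM purchases
        GROUP BY user
        HAVING {having_clause}
        ORDER BY user
    """
    return conn.execute(sql).fetchall()

problem_1 = shoppers("SUM(price * qty) >= 300 AND SUM(qty) <= 3")
problem_2 = shoppers("SUM(price * qty) >= 275 AND SUM(qty) >= 7")
print(where_failed, problem_1, problem_2)
```

With these rows nobody meets the Problem 1 conditions, and only David meets the Problem 2 conditions.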
Suppose I have a MySQL table that looks like the following, where I keep track of when (Date) a user (User.id) read an article on my website (Article.id):
------------------------------------------
Article_Impressions
------------------------------------------
date | user_id | article_id
--------------------+---------+-----------
2013-04-02 15:33:23 | 815 | 2342
2013-04-02 15:38:21 | 815 | 108
2013-04-02 15:39:33 | 161 | 4815
...
I'm trying to determine how many sessions I had, as well as the average session duration per user on a given day. A session ends when no article is read within 30 minutes of the previous one.
Question
How can I efficiently determine how many sessions I had on a given day? I'm using PHP and MySQL.
My first idea is to query all the data for a given day, sorted by user. Then I iterate through each user, check whether each impression was within 30 minutes of the previous impression, and tally up the total number of sessions each user had that day.
Since we have around 2 million impressions a day on our site, I'm trying to optimize this report generator.
Try this query
Query 1:
select
@sessionId:=if(@prevUser=user_id AND diff <= 1800 , @sessionId, @sessionId+1) as sessionId,
@prevUser:=user_id AS user_id,
article_id,
date,
diff
from
(select @sessionId:=0, @prevUser:=0) b
join
(select
TIME_TO_SEC(if(@prevU=user_id, TIMEDIFF(date, @prevD), '00:00')) as diff,
@prevU:=user_id as user_id,
@prevD:=date as date,
article_id
from
tbl
join
(select @prevD:=0, @prevU:=0)a
order by
user_id,
date) a
[Results]:
| SESSIONID | USER_ID | ARTICLE_ID | DATE | DIFF |
-----------------------------------------------------------------
| 1 | 161 | 4815 | 2013-04-02 15:39:33 | 0 |
| 2 | 815 | 2342 | 2013-04-02 15:33:23 | 0 |
| 2 | 815 | 108 | 2013-04-02 15:38:21 | 298 |
| 3 | 815 | 108 | 2013-04-02 16:38:21 | 3600 |
This query will return a unique session for every new user and also for same user if the next article read is after 30 mins as per your requirement mentioned in your question. The diff column returns the seconds difference between the 2 articles by the same user which helps us count the sessionId. Now using this result it will be easy for you to count the average time per user and also total time per session.
Hope this helps you...
If the concept of the user "session" is important to your analytics, then I would start logging data in your table to make querying of session-related data not such a painful process. A simple approach would be to log your PHP session ID. If your PHP session id is set to have the same 30 minute expiry, and you log the PHP session ID to this table then you would basically have exactly what you are looking for.
Of course that won't help you with your existing records. I would probably go ahead and create the session field and then back-populate it with randomly generated "session" id's. I wouldn't look for a fully SQL solution for this, as it may not do what you want in terms of handling edge cases (sessions spanning across days, etc.). I would write a script to perform this backfill, which would contain all the logic you need.
My general approach would be to SELECT all the records like this:
SELECT user_id, date /* plus any other fields like unique id that you would need for insert */
FROM Article_Impressions
WHERE session_id IS NULL
ORDER BY user_id ASC, date ASC
Note: make sure you have index on both user_id and date fields.
I would then loop through the result set, building a temp array for each user_id, and loop through that array's date values, assigning a randomly generated session id that changes each time the gap between consecutive dates is greater than 30 minutes. Once the user_id changes, I would issue updates to set the session_id values for the previous user, then reset the temp array to empty and continue the process with the next user.
Note that it is probably important to take the approach of keeping a relatively small temp/working array like this, as with the number of records you are talking about, you are likely not going to be able to read the entire result set into an array in memory.
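The loop described above can be sketched as follows. This is an illustration in Python rather than PHP, and it makes two assumptions not in the answer: session ids are a simple running counter instead of randomly generated values, and the rows arrive already sorted by user_id and then date, as in the SELECT above. The function name assign_sessions is hypothetical.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)

def assign_sessions(rows):
    """Yield (user_id, date, session_id) for rows sorted by user_id, then date.

    Rows are consumed one at a time, so the whole result set never has to
    be held in memory.
    """
    session_id = 0
    prev_user = None
    prev_date = None
    for user_id, date in rows:
        # Start a new session for a new user, or when the gap since the
        # previous impression of the same user exceeds 30 minutes.
        if user_id != prev_user or date - prev_date > SESSION_GAP:
            session_id += 1
        yield user_id, date, session_id
        prev_user, prev_date = user_id, date

rows = [
    (161, datetime(2013, 4, 2, 15, 39, 33)),
    (815, datetime(2013, 4, 2, 15, 33, 23)),
    (815, datetime(2013, 4, 2, 15, 38, 21)),
    (815, datetime(2013, 4, 2, 16, 38, 21)),
]
labelled = list(assign_sessions(rows))
session_ids = [session for _, _, session in labelled]
print(session_ids)
```

On these four impressions, user 161 gets one session and user 815 gets two, because the last impression comes a full hour after the previous one.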
Once your data is populated, the query becomes trivial:
Unique sessions for each day:
SELECT DATE(date) as `day`, COUNT(DISTINCT session_id) AS `unique_sessions`
FROM Article_Impressions
GROUP BY `day`
ORDER BY `day` DESC /* or ASC depending on how you want to view it */
Average sessions per day:
SELECT AVG(sessions_per_day.`unique_sessions`) AS `average_sessions_per_day`
FROM
(
SELECT DATE(date) as `day`, COUNT(DISTINCT session_id) AS `unique_sessions`
FROM Article_Impressions
GROUP BY `day`
) AS sessions_per_day
Note: you need an index on the new session_id field.
I have a little app (PHP/MySQL) to manage condos. There's a condos table, an apartments table, an owners table and an account table.
In the account table I have the fields month_paid and year_paid (among others).
Each time someone pays the monthly fee, the table is updated with the number of the month and the year.
Here's some sample table structure:
condos table:
+----+------------+---------+
| id | condo_name | address |
+----+------------+---------+
apartments table:
+----+----------------+----------+
| id | apartment_name | condo_id |
+----+----------------+----------+
owners table:
+----+--------------+------------+
| id | apartment_id | owner_name |
+----+--------------+------------+
account table:
+----+----------+----------+------------+-----------+
| id | owner_id | condo_id | month_paid | year_paid |
+----+----------+----------+------------+-----------+
So, if I have a record in account table like this, it means this owner paid August 2012:
+----+----------+----------+------------+-----------+
| id | owner_id | condo_id | month_paid | year_paid |
+----+----------+----------+------------+-----------+
| 1 | 1 | 1 | 8 | 2012 |
+----+----------+----------+------------+-----------+
What I would like to know is how to write a SQL query (using PHP) to get the owners with three or more months in debt or, in other words, owners who have not paid the fee for the last three months or more.
If possible, the data should be grouped by condo, like this:
CONDO XPTO:
Owner 1: 3 months debt
Owner 2: 5 months debt
CONDO BETA
Owner 1: 4 months debt
Owner 2: 6 months debt
Thanks
You need to write a query something like this:
SELECT
*
FROM
owners
JOIN
account
ON
owners.id = account.owner_id
WHERE
CONCAT( account.year_paid , '-' , account.month_paid , '-01') <= DATE_ADD( NOW(), INTERVAL -3 MONTH );
Sadly that's about all I can give you with the information you have provided. If you could show more detailed table structure, I could help you out more.
You are making it harder on yourself by storing it this way. Now you need to calculate the difference in months yourself. You cannot just compare the months, because e.g. in January you also need to take the year into consideration.
SELECT *
FROM owners
JOIN account ON owners.id=account.owner_id
WHERE (
account.year_paid = YEAR(NOW())
AND (
MONTH(NOW())-account.month_paid>=3
)
) OR (
account.year_paid = YEAR(NOW())-1
AND (
MONTH(NOW())>=3
OR (
account.month_paid - MONTH(NOW()) <= 9
AND MONTH(NOW()) = 1
)
OR (
account.month_paid - MONTH(NOW()) <= 9
AND MONTH(NOW()) = 2
)
)
) OR (
account.year_paid < YEAR(NOW())-1
)
Better to just store the last-paid time in a DATETIME column so you can use date functions.
To fix your account table (the DROP COLUMNs are optional; you might want to keep the columns if they have dependencies):
ALTER TABLE account ADD COLUMN date_paid DATETIME;
UPDATE account SET date_paid = CONCAT(year_paid,'-',month_paid,'-01');
-- ALTER TABLE account DROP COLUMN year_paid;
-- ALTER TABLE account DROP COLUMN month_paid;
This is how you’d get your data. I used LEFT OUTER JOINs in case you have any missing owner or condo records.
SELECT c.condo_name,
o.owner_name,
min(PERIOD_DIFF(DATE_FORMAT(now(), '%Y%m'), DATE_FORMAT(date_paid, '%Y%m'))) min_months_debt
FROM account a
LEFT OUTER JOIN condos c ON (c.id = a.condo_id)
LEFT OUTER JOIN owners o ON (a.owner_id = o.id)
GROUP BY c.condo_name, o.owner_name
HAVING MAX(a.date_paid) <= DATE_ADD(NOW(), INTERVAL -3 MONTH)
ORDER BY c.condo_name
P.S. The above only works for condo owners who have made at least one payment. If you want to see condo owners who have never paid then you’re going to have to associate condos with owners outside of the accounts table. Or perhaps you create an account record when you associate a condo with an owner, in which case you don’t have a problem.
If I understand correctly, you have two different fields for month and for year. Why?
If you had one field, say paid_date, this query would work for you:
SELECT owner_id from account WHERE NOW()>DATE_ADD(paid_date, INTERVAL 3 MONTH)
If you can't change the field, then sorry. I hope this is helpful.
UPD:
Then I'd suggest concatenating the two fields (year and month) in the query so that they form a real date in YYYY-MM-DD format, and using that in the DATE_ADD function instead of the paid_date field.
Try this,
select o.owner_name,c.condo_name from owners o join account a join condos c
on o.id=a.owner_id and c.id =a.condo_id where year(curdate())=a.year_paid
and a.month_paid<month(curdate())-3
I haven't seen any answers that actually produce what you are looking for. The following calculates the months in debt by calculating the current month minus the most recent payment date. It then concatenates the results into a string:
select c.condo_name, o.owner_name,
concat(cast((YEAR(now())*12+MONTH(now())) - MAX(year_paid*12+month_paid) as char), ' month(s) debt') as debt
from account a join
owners o
on o.id = a.owner_id join
condos c
on c.id = a.condo_id
group by c.condo_name, o.owner_name
order by 1, 2
If you want only delinquent payers, then add a having clause after the group by (the condition uses an aggregate, so it cannot go in a where clause) to the effect of:
having (YEAR(now())*12+MONTH(now())) - MAX(year_paid*12+month_paid) > 1
Because you don't have the day of the month of the payment, there are some borderline conditions you might miss.
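The year*12+month arithmetic can be checked with a small sketch. Everything here beyond the table layout from the question is an assumption for illustration: the sample rows are made up, SQLite in memory stands in for MySQL, the reference month is fixed at January 2013 instead of being taken from now() so the output is repeatable, and the filter keeps owners with three or more months of debt as the question asks.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condos  (id INTEGER PRIMARY KEY, condo_name TEXT);
CREATE TABLE owners  (id INTEGER PRIMARY KEY, apartment_id INTEGER, owner_name TEXT);
CREATE TABLE account (id INTEGER PRIMARY KEY, owner_id INTEGER, condo_id INTEGER,
                      month_paid INTEGER, year_paid INTEGER);
-- Hypothetical sample rows.
INSERT INTO condos VALUES (1, 'XPTO'), (2, 'BETA');
INSERT INTO owners VALUES (1, 1, 'Owner 1'), (2, 2, 'Owner 2'), (3, 3, 'Owner 3');
INSERT INTO account (owner_id, condo_id, month_paid, year_paid) VALUES
    (1, 1, 8, 2012), (1, 1, 9, 2012), (2, 1, 12, 2012), (3, 2, 11, 2011);
""")

# Fixed reference month (January 2013) expressed as a month count.
ref_year, ref_month = 2013, 1
ref = ref_year * 12 + ref_month

# Months in debt = reference month count minus the latest paid month count.
debtors = conn.execute(
    """
    SELECT c.condo_name, o.owner_name,
           :ref - MAX(a.year_paid * 12 + a.month_paid) AS months_debt
    FROM account a
    JOIN owners o ON o.id = a.owner_id
    JOIN condos c ON c.id = a.condo_id
    GROUP BY c.condo_name, o.owner_name
    HAVING :ref - MAX(a.year_paid * 12 + a.month_paid) >= 3
    ORDER BY c.condo_name, o.owner_name
    """,
    {"ref": ref},
).fetchall()

for condo, owner, months in debtors:
    print(condo, owner, months)
```

Converting both dates to a month count avoids the special cases around January and February, since the year boundary disappears in the subtraction.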
UPDATE: I've found my solution, which I've posted here. Thanks to everyone for your help!
I'm developing a Facebook application which requires a leaderboard. Scores and time taken to complete the game are recorded and these are organised by score first, then in the case of two identical scores, the time is used. If a user has played multiple times, their best score is used.
The lower the score, the better the performance in the game.
My table structure is:
id
facebook_id - (Unique Identifier for the user)
name
email
score
time - (time to complete game in seconds)
timestamp - (unix timestamp of entry)
date - (readable format of timestamp)
ip
The query I thought would work is:
SELECT *
FROM entries
ORDER BY score ASC, time ASC
GROUP BY facebook_id
The problem I'm having is in some cases it's pulling in the user's first score in the database, not their highest score. I think this is down to the GROUP BY statement. I would have thought the ORDER BY statement would have fixed this, but apparently not.
For example:
----------------------------------------------------------------------------
| ID | NAME | SCORE | TIME | TIMESTAMP | DATE | IP |
----------------------------------------------------------------------------
| 1 | Joe Bloggs | 65 | 300 | 1234567890 | XXX | XXX |
----------------------------------------------------------------------------
| 2 | Jane Doe | 72 | 280 | 1234567890 | XXX | XXX |
----------------------------------------------------------------------------
| 3 | Joe Bloggs | 55 | 285 | 1234567890 | XXX | XXX |
----------------------------------------------------------------------------
| 4 | Jane Doe | 78 | 320 | 1234567890 | XXX | XXX |
----------------------------------------------------------------------------
When I use the query above, I get the following result:
1. Joe Bloggs - 65 - 300 - (Joe's first entry, not his best entry)
2. Jane Doe - 72 - 280
I would have expected...
1. Joe Bloggs - 55 - 285 - (Joe's best entry)
2. Jane Doe - 72 - 280
It's like the Group By is ignoring the Order - and just overwriting the values.
Using MIN(score) with the GROUP BY selects the lowest score, which is correct. However, it pairs that score with the time from the user's first record in the database, so the result is often wrong.
So, how can I select a user's highest score and the associated time, name, etc and order the results by score, then time?
Thanks in advance!
Your query does not actually make sense, because the order by should be after the group by. What SQL engine are you using? Most would give an error.
I think what you want is more like:
select e.facebook_id, minscore, min(e.time) as mintime -- or do you want maxtime?
from entries e join
(select facebook_id, min(score) as minscore
from entries
group by facebook_id
) esum
on e.facebook_id = esum.facebook_id and
e.score = esum.minscore
group by e.facebook_id, minscore
You can also do this with window functions, but that depends on your database.
One approach would be this:
SELECT entries.facebook_id, MIN(entries.score) AS score, MIN(entries.time) AS time
FROM entries
INNER JOIN (
SELECT facebook_id, MIN(score) AS score
FROM entries
GROUP BY facebook_id) highscores
ON highscores.facebook_id = entries.facebook_id
AND entries.score = highscores.score
GROUP BY entries.facebook_id
ORDER BY MIN(entries.score) ASC, MIN(entries.time) ASC
If you need more information from the entries table, you can then use this as a subquery, and join again on the information presented (facebook_id, score, time) to get one row per user.
You need to aggregate twice, is the crux of this; once to find the minimum score for the user, and again to find the minimum time for that user and score. You could reverse the order of the aggregation, but I would expect that this will filter most quickly and thus be most efficient.
You might also want to check which is faster, aggregating the second time: using the minimum score or grouping using the score as well.
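To illustrate the double aggregation, here is a sketch that runs the query above on the rows from the question. The facebook_id values (100 and 200) are made up, one extra Joe Bloggs entry with the same best score but a slower time is added to exercise the tie-break, and an in-memory SQLite database is assumed in place of MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, facebook_id INTEGER, "
    "name TEXT, score INTEGER, time INTEGER)"
)
conn.executemany(
    "INSERT INTO entries (facebook_id, name, score, time) VALUES (?, ?, ?, ?)",
    [(100, "Joe Bloggs", 65, 300), (200, "Jane Doe", 72, 280),
     (100, "Joe Bloggs", 55, 285), (200, "Jane Doe", 78, 320),
     (100, "Joe Bloggs", 55, 290)],  # hypothetical tie on Joe's best score
)

# Inner query: best (lowest) score per user.
# Outer query: among the rows with that score, the lowest time.
leaderboard = conn.execute("""
    SELECT entries.facebook_id, MIN(entries.score) AS score, MIN(entries.time) AS time
    FROM entries
    INNER JOIN (
        SELECT facebook_id, MIN(score) AS score
        FROM entries
        GROUP BY facebook_id
    ) highscores
        ON highscores.facebook_id = entries.facebook_id
       AND entries.score = highscores.score
    GROUP BY entries.facebook_id
    ORDER BY MIN(entries.score) ASC, MIN(entries.time) ASC
""").fetchall()

for facebook_id, score, time_taken in leaderboard:
    print(facebook_id, score, time_taken)
```

Joe's row reports score 55 with time 285, the faster of his two runs at that score, and Jane follows with 72 and 280.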
You need to min the score
SELECT
facebook_id,
name,
email,
min(score) as high_score
FROM
entries
GROUP BY
facebook_id,
name,
email
ORDER BY
min(score) ASC
Thanks for your help. @Penguat had the closest answer. Here is my final query for anyone who might have the same issue:
SELECT f.facebook_id, f.name, f.score, f.time FROM
(SELECT facebook_id, name, min(score)
AS highscore FROM golf_entries
WHERE time > 0
GROUP BY facebook_id)
AS x
INNER JOIN golf_entries as f
ON f.facebook_id = x.facebook_id
AND f.score = x.highscore
ORDER BY score ASC, time ASC
Thanks again!
If you want their best time, you want to use the MIN() function - you said that the lower the score, the better they did.
SELECT facebook_id, MIN(score), time, name, ...
FROM entries
GROUP BY facebook_id, time, name, ...
ORDER BY score, time