I have a database with access control log entries:
time : datetime (this is the access timestamp)
src: text (this is the userid)
I want a list showing how many of the current day's users had access on how many of the past 7 days. The result should look like this:
number of days with access | count
1 | 30
2 | 54
3 | 123
4 | 843
5 | 3490
6 | 71
7 | 23
What I have so far:
The query below returns the number of users with a log entry on 2015-03-08 who also had an entry on 2015-03-07.
SELECT Count(DISTINCT a.src)
FROM contacts AS a
LEFT JOIN contacts AS b
ON a.src = b.src
WHERE a.time BETWEEN Cast('2015-03-08 05:00:00' AS DATETIME) AND Cast('2015-03-09 05:00:00' AS DATETIME)
AND b.time BETWEEN Cast('2015-03-07 05:00:00' AS DATETIME) AND Cast('2015-03-08 05:00:00' AS DATETIME)
But I'm stuck on getting the count for each number of days as described above. If there is no SQL-only solution, a performant approach using PHP would be fine as well. Thanks for any help.
I don't see why you need to join table b.
SELECT
DAY(a.time),
COUNT(DISTINCT a.src)
FROM contacts AS a
WHERE a.time
BETWEEN (TIMESTAMP(CURDATE()) - INTERVAL 1 WEEK)
AND TIMESTAMP(CONCAT(CURDATE(),' 23:59:59'))
GROUP BY DAY(a.time)
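For the distribution the question asks for (number of days with access | count), one possible sketch is a two-level aggregation. It assumes calendar days rather than the 05:00 day boundary used in the question, and the names days_with_access, user_count and per_user are made up for illustration:

```sql
-- Inner query: for each user with an entry today, count the distinct
-- calendar days with at least one entry in the 7-day window ending today.
-- Outer query: count how many users share each day count.
SELECT per_user.days_with_access,
       COUNT(*) AS user_count
FROM (
    SELECT c.src,
           COUNT(DISTINCT DATE(c.time)) AS days_with_access
    FROM contacts AS c
    WHERE c.time >= CURDATE() - INTERVAL 6 DAY
      AND c.time <  CURDATE() + INTERVAL 1 DAY
    GROUP BY c.src
    HAVING MAX(c.time) >= CURDATE()   -- keep only users seen today
) AS per_user
GROUP BY per_user.days_with_access
ORDER BY per_user.days_with_access;
```

Day counts that no user has simply do not appear in the result; they would have to be filled in by the application if a full 1 to 7 list is needed.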
I am trying to make a "top purchaser" module on my store and I am a bit confused about the MySQL query.
I have a table with all transactions and I need to select the person (which could have one or many transactions) with the highest amount of money spent in the past month.
What I have:
name | money spent
------------------
john | 50
mike | 12
john | 10
jane | 504
carl | 99
jane | 12
jane | 1
What I want to see:
With a query, I need to see:
name | money spent last month
-----------------------------
jane | 517
carl | 99
john | 60
mike | 12
How do I do that?
I do not really seem to find many good solutions since my MySQL query skills are quite basic. I thought of making a table in which money is added to the user when he buys something.
That's a simple aggregate query:
SELECT t.name, SUM(t.moneyspent) money_spent_last_month
FROM mytable t
GROUP BY t.name
ORDER BY money_spent_last_month DESC
LIMIT 1
The query sums the total money spent per customer name. The results are ordered by descending total money spent, and only the first row is retained.
If you want to filter the data to the last month, you need a column in the table that keeps track of the transaction date, say transaction_date; then you can just add a WHERE clause to the query, like:
SELECT t.name, SUM(t.moneyspent) money_spent_last_month
FROM mytable t
WHERE
t.transaction_date >=
DATE_ADD(LAST_DAY(DATE_SUB(NOW(), INTERVAL 2 MONTH)), INTERVAL 1 DAY)
AND t.transaction_date <=
DATE_SUB(NOW(), INTERVAL 1 MONTH)
GROUP BY t.name
ORDER BY money_spent_last_month DESC
LIMIT 1
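Comparing the raw column against computed date bounds like this is usually more efficient than using DATE_FORMAT to format dates as strings and comparing the results, because it lets MySQL use an index on transaction_date.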
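The expected output in the question lists every customer, not just the top one. A hedged variant that ranks all customers over the whole previous calendar month could look like the sketch below. It keeps the assumed transaction_date column from above and uses a half-open range from the first day of last month up to, but not including, the first day of this month:

```sql
SELECT t.name,
       SUM(t.moneyspent) AS money_spent_last_month
FROM mytable AS t
WHERE t.transaction_date >=
        -- first day of the current month, shifted back one month
        (CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY) - INTERVAL 1 MONTH
  AND t.transaction_date <
        -- first day of the current month (exclusive upper bound)
        (CURDATE() - INTERVAL (DAYOFMONTH(CURDATE()) - 1) DAY)
GROUP BY t.name
ORDER BY money_spent_last_month DESC;
```

Adding LIMIT 1 again would return only the top purchaser.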
I have the following table
id | user_id | date | status
1 | 53 | 2018-09-18 06:59:54 | 1
2 | 62 | 2018-09-18 07:00:16 | 1
3 | 53 | 2018-09-18 09:34:12 | 2
4 | 53 | 2018-09-18 12:16:27 | 1
5 | 53 | 2018-09-18 18:03:19 | 2
6 | 62 | 2018-09-18 18:17:41 | 2
I would like to get the total working hours (from date range) and group them by user_id
UPDATE
The system does not "require" a check-out, so if there is only one value, can we set a default check-out time, let's say 19:00:00? If not, I can check every day at 21:00:00 whether a check-out time is missing and manually insert it at 19:00:00.
UPDATE 2
I have added a new field "status" to the table, so the very first check-in of the day has status = 1 and every second check-in has status = 2.
So if a user checks in for the 3rd time during the day, the status will be 1 again, etc.
I hope this will make things easier.
Thanks
For the case of multiple check-ins and check-outs within a day for a user:
Using a correlated subquery, we can find the corresponding "checkout_time" for every "checkin_time".
Note the use of the Ifnull() and Timestamp() functions to default the "checkout_time" to 19:00:00 when there is no corresponding entry.
Then, treating this enhanced data set as a derived table, we group it by user_id and date. The date (yyyy-mm-dd) can be determined using the Date() function.
Finally, use the Timestampdiff() function with Sum aggregation to determine the total work seconds for a user_id on a particular date.
You can easily convert these total seconds to hours, either in your application code or in the query itself (divide the seconds by 3600).
I have preferred to compute in seconds because the Timestampdiff() function returns an integer only, so computing in hours could introduce truncation errors when there are multiple check-ins and check-outs.
Use the following query (replace your_table with your actual table name):
SELECT inner_nest.user_id,
DATE(inner_nest.checkin_time) AS work_date,
SUM(TIMESTAMPDIFF(SECOND,
inner_nest.checkin_time,
inner_nest.checkout_time)) AS total_work_seconds
FROM
(
SELECT t1.user_id,
t1.date as checkin_time,
t1.status,
IFNULL( (
SELECT t2.date
FROM your_table AS t2
WHERE t2.user_id = t1.user_id
AND t2.status = 2
AND t2.date > t1.date
AND DATE(t2.date) = DATE(t1.date)
ORDER BY t2.date ASC LIMIT 1
),
TIMESTAMP(DATE(t1.date),'19:00:00')
) AS checkout_time
FROM `your_table` AS t1
WHERE t1.status = 1
) AS inner_nest
GROUP BY inner_nest.user_id, DATE(inner_nest.checkin_time)
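To illustrate the truncation point with the two check-in/check-out pairs of user 53 from the sample data, here is a small standalone sketch (the column aliases are made up for illustration):

```sql
SELECT
    -- Summing whole hours truncates each interval first: 2 + 5
    TIMESTAMPDIFF(HOUR, '2018-09-18 06:59:54', '2018-09-18 09:34:12')
  + TIMESTAMPDIFF(HOUR, '2018-09-18 12:16:27', '2018-09-18 18:03:19')
        AS hours_summed,          -- 7
    -- Summing seconds and dividing once keeps the fractions: 30070 / 3600
    ROUND((
        TIMESTAMPDIFF(SECOND, '2018-09-18 06:59:54', '2018-09-18 09:34:12')
      + TIMESTAMPDIFF(SECOND, '2018-09-18 12:16:27', '2018-09-18 18:03:19')
    ) / 3600, 2)
        AS hours_from_seconds;    -- 8.35
```

The same ROUND(... / 3600, 2) wrapper can be put around the SUM in the main query if hours are wanted directly.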
Additional: the following solution works for the case where there is a single check-in and a corresponding check-out on the same date.
You first need to group the data set by user_id and date. The date (yyyy-mm-dd) can be determined using the Date() function.
Now use aggregate functions like Min() and Max() to find the starting and closing time for a user_id on a particular date.
Finally, use the Timestampdiff() function to determine the working hours for a user_id on a particular date (the difference between the closing and starting time).
Try the following query (replace your_table with your actual table name):
SELECT user_id,
DATE(`date`) AS working_date,
TIMESTAMPDIFF(HOUR, MIN(`date`), MAX(`date`)) AS working_hours
FROM your_table
GROUP BY
user_id,
DATE(`date`)
Use the TIMESTAMPDIFF function.
The query would look something like this:
SELECT t1.user_id, TIMESTAMPDIFF(HOUR,t1.date,t2.date) as difference
FROM your_table t1
INNER JOIN your_table t2 on t1.user_id = t2.user_id
Group By t1.user_id
For reference, see TimeStampDiff.
I have an interesting query here. I have a table that stores each visitor's ip and page_id along with a timestamp (date). I would like to count the number of visitors per day for each page_id so I can then take that output and calculate which page_id is trending (on a daily basis).
Table visitors_counter looks like this:
id|page_id | ip | date
1 | 37 |1.1.1.1| 2017-02-10 14:03:16
2 | 38 |1.2.1.1| 2017-02-10 11:04:16
3 | 39 |1.1.3.1| 2017-02-10 16:05:16
4 | 37 |1.5.1.1| 2017-02-10 17:08:16
5 | 37 |1.1.1.1| 2017-02-10 19:07:16
And what I would like to achieve would be something like:
id|page_id |visitors | date
1 | 37 |3 | 2017-02-10 14:03:16
2 | 38 |1 | 2017-02-10 11:04:16
3 | 39 |1 | 2017-02-10 16:05:16
So far I've been able to get a count of unique values per day with
SELECT DATE(date) Date, COUNT(DISTINCT page_id) uniqueperday
FROM visitors_counter
GROUP BY DATE(date)
I know I'm close, but it's not quite what I want, as I don't know which page_id values are the most visited.
Thanks
For your result you should use GROUP BY and COUNT(*) (you have the same ip twice for page 37 and want the result 3):
SELECT DATE(date) Date, page_id, COUNT(*) visitperday
FROM visitors_counter
GROUP BY DATE(date), page_id
Otherwise, Giorgios's answer is the right one.
Simply add page_id in the GROUP BY clause:
SELECT DATE(date) Date, page_id, COUNT(DISTINCT ip) uniqueperday
FROM visitors_counter
GROUP BY DATE(date), page_id
The above query returns the number of unique ip visits per page per day.
You can aggregate on page and date to find the count of distinct visitors per page per day. Also, use a user variable to generate the sequence id.
set @id := 0;
select @id := @id + 1 id,
page_id,
count(distinct ip) visitors,
min(date)
from visitors_counter
group by page_id, DATE(date)
order by visitors desc, page_id;
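On MySQL 8.0 or later, the sequence id can also be generated with the ROW_NUMBER() window function instead of a user variable. A minimal sketch, assuming that version and using made-up aliases (v, first_visit):

```sql
SELECT ROW_NUMBER() OVER (ORDER BY v.visitors DESC, v.page_id) AS id,
       v.page_id,
       v.visitors,
       v.first_visit
FROM (
    -- distinct visitors per page per day, as in the query above
    SELECT page_id,
           COUNT(DISTINCT ip) AS visitors,
           MIN(`date`)        AS first_visit
    FROM visitors_counter
    GROUP BY page_id, DATE(`date`)
) AS v
ORDER BY id;
```

This avoids relying on the evaluation order of user variable assignments inside a SELECT.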
I am working with MySQL and Symfony2. I need to build a cohort analysis table. I need to compare how many users in each cohort log in to the website at least once a week after they register. What I tried to do is get the number of registered users by week; basically, these are my cohorts.
SELECT DATE_FORMAT(date_added,'%d %b %y') as reg_date, COUNT(*) AS user_count
FROM user
WHERE date_added>='2016-02-01' AND date_added<=NOW()
GROUP BY WEEK(date_added)
This query gets the distinct users who logged in to the website, by week.
SELECT WEEK(login_date) AS week, COUNT(DISTINCT user_id) AS user_count
FROM user_log
WHERE login_date>='2016-02-01' AND login_date<=NOW()
GROUP BY WEEK(login_date)
My problem: I can't figure out how to group logged in users by cohorts and compare cohorts by weeks. I hope I stated problem clearly. English is not my first language. Thanks.
Sample data:
user table
id | date_added (in WEEK() format)
A | 1
B | 1
C | 1
D | 2
E | 2
F | 2
G | 2
------------
user_log table
user_id | login_date (in WEEK() format)
A | 1
B | 1
B | 1
A | 2
D | 2
A | 2
D | 2
E | 2
Expected table. Cohort 1 is users registered in week 1, cohort 2 is users registered in week 2, etc. Size is the number of registered users. Week 1 is how many users logged back in to the website in the first week after registration; Week 2 is how many logged back in during the second week after registration.
Cohort   | size | Week1 | Week2 |
Cohort 1 | 3    | 2     | 1     |
Cohort 2 | 4    | 2     | -     |
This is borrowed from my modification of @Andriy M's answer to this question: Cohort analysis in SQL
This query gets unique user logins by week after registering.
SELECT DISTINCT
user_id,
FLOOR(DATEDIFF(user_log.login_date, user.date_added)/7) AS Offset
FROM user_log
LEFT JOIN user ON (user.id = user_log.user_id)
WHERE user_log.login_date >= CURDATE() - INTERVAL 14 DAY
This query gets all the users created in the past 14 days and formats the date to the week they signed up:
SELECT
id,
DATE_FORMAT(date_added, "%x-%v") AS cohort
FROM user
WHERE date_added >= CURDATE() - INTERVAL 14 DAY
We can put those two queries together to get a table with how many people came back after registering:
SELECT STR_TO_DATE(CONCAT(u.cohort, ' Monday'), '%x-%v %W') as date,
SUM(s.Offset = 0) AS size,
SUM(s.Offset = 1) AS Week1,
SUM(s.Offset = 2) AS Week2
FROM (
SELECT
id,
-- %x-%v (Monday-first week numbering) matches the format parsed by STR_TO_DATE above
DATE_FORMAT(date_added, "%x-%v") AS cohort
FROM user
WHERE date_added >= CURDATE() - INTERVAL 21 DAY
) as u
LEFT JOIN (
SELECT DISTINCT
user_id,
FLOOR(DATEDIFF(user_log.login_date, user.date_added)/7) AS Offset
FROM user_log
LEFT JOIN user ON (user.id = user_log.user_id)
WHERE user_log.login_date >= CURDATE() - INTERVAL 21 DAY
) as s
ON s.user_id = u.id
GROUP BY u.cohort
ORDER BY u.cohort
Since we aren't counting how many people registered in a given week, the size column is only accurate if we assume that every user logged in at least once in the week they registered.
Also, you'll have to rework this to get a number for the cohort instead of the date, but I find dates more helpful.
You can also extend this to more weeks: you'll have to change the number of days after INTERVAL in both subqueries, and you can add more columns to the main select statement to get more weeks.
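If the assumption behind the size column does not hold, the cohort sizes can be counted directly from the user table. A minimal sketch of that alternative (not part of the query above; the alias size mirrors its column name):

```sql
-- Number of registrations per cohort week, independent of any logins.
-- Use the same DATE_FORMAT string here as for the cohort column in the
-- main query so that the two results can be matched on cohort.
SELECT DATE_FORMAT(date_added, '%x-%v') AS cohort,
       COUNT(*)                         AS size
FROM user
WHERE date_added >= CURDATE() - INTERVAL 21 DAY
GROUP BY cohort
ORDER BY cohort;
```

Joining this result to the main query on cohort would replace SUM(s.Offset = 0) as the source of the size column.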
I'm building a simple availability calendar with PHP and MySQL.
I have a table which stores the available dates for a property (currently all of them are blocks of 7 days)
available_dates:
start_date DATE
end_date DATE
available_id INT PRIMARY KEY
property_id INT
booked TINYINT(1)
And a table of booked dates which references the available_id of my available_dates table:
bookings
booking_id INT
available_id INT
***user details***
I plan on having rows added to available_dates for each property to mark which dates can be booked, and then setting the booked flag on that table when somebody books that block.
What I'd like to do is show a list of dates (in blocks of x days, 7 in this case) that have no availability set - so the date does not appear in that table - for the next 24 months or so.
I'm having trouble wrapping my head around this, and I know there is a simpler way to do it than my first idea of looping through each property, then each block of 7 days, etc.
Can anyone enlighten me?
Update:
Thanks to @ZaneBien's brilliant and comprehensive answer, I've managed to get the results I need by using his yeardate table & procedure.
What I've done is when the page that needs to show the dates with no availability set is requested, the PHP will call the procedure to add more yeardates if there aren't any for CURYEAR()+2.
Then to get my results, a slightly modified version of Zane's query:
SELECT
a.yeardate AS blockstart,
DATE_ADD(a.yeardate, INTERVAL 7 DAY) AS blockend
FROM
yeardates a
LEFT JOIN
available_dates b
ON(a.yeardate BETWEEN b.start_date AND b.end_date)
OR
(DATE_ADD(a.yeardate, INTERVAL 7 DAY) BETWEEN b.start_date AND b.end_date)
WHERE
b.available_id IS NULL AND WEEKDAY(a.yeardate)=5;
In my case, the blocks are 7 days long, Saturday to Saturday, so I added the second condition to the WHERE clause so that each row is a distinct one-week Saturday-to-Saturday block, one following the other.
So instead of:
+------------+------------+
| blockstart | blockend |
+------------+------------+
| 2012-01-01 | 2012-01-08 |
| 2012-01-02 | 2012-01-09 |
| 2012-01-03 | 2012-01-10 |
| 2012-01-04 | 2012-01-11 |
I get this:
+------------+------------+
| blockstart | blockend |
+------------+------------+
| 2012-01-07 | 2012-01-14 |
| 2012-01-14 | 2012-01-21 |
| 2012-01-21 | 2012-01-28 |
| 2012-01-28 | 2012-02-04 |
Which is exactly what I need. Thanks again to Zane for a great answer.
Understanding your question as Retrieve all 7 day interval blocks of the current and next year whose ranges do not overlap any interval blocks already existing in the available_dates table:
To work with all days of the current and next year, we have to create a separate table (yeardates) containing DATEs of all days of the current and next year. This will facilitate our OUTER JOIN operation in the retrieval query.
Code to define the yeardates table and insert dates:
CREATE TABLE yeardates
(
yeardate DATE NOT NULL,
PRIMARY KEY (yeardate)
) ENGINE = MyISAM;
DELIMITER $$
CREATE PROCEDURE PopulateYear(IN inputyear INT)
BEGIN
DECLARE i INT;
DECLARE i_end INT;
SET i = 1;
-- DAYOFYEAR of 31 December is 365 or 366, so century years are handled correctly
SET i_end = DAYOFYEAR(CONCAT(inputyear, '-12-31'));
START TRANSACTION;
WHILE i <= i_end DO
INSERT INTO yeardates VALUES (MAKEDATE(inputyear, i));
SET i = i + 1;
END WHILE;
COMMIT;
END$$
DELIMITER ;
CALL PopulateYear(2012);
CALL PopulateYear(2013);
The table is then created and contains all days of the current and next year. If we ever need to insert days for subsequent years, we just CALL the procedure again with the year as the parameter (e.g. 2014, 2015, etc.).
Then we can get the 7-day blocks that don't overlap blocks in the available_dates table:
SELECT
a.yeardate AS blockstart,
DATE_ADD(a.yeardate, INTERVAL 7 DAY) AS blockend
FROM
yeardates a
LEFT JOIN
available_dates b ON
(a.yeardate BETWEEN b.start_date AND b.end_date)
OR
(DATE_ADD(a.yeardate, INTERVAL 7 DAY) BETWEEN b.start_date AND b.end_date)
WHERE
b.available_id IS NULL
That retrieves all free 7-day blocks based on the bookings of all properties, but if we need to get the free 7-day blocks for just a particular property, we can use:
SELECT
a.yeardate AS blockstart,
DATE_ADD(a.yeardate, INTERVAL 7 DAY) AS blockend
FROM
yeardates a
LEFT JOIN
(
SELECT *
FROM available_dates
WHERE property_id = <property_id here>
) b ON
(a.yeardate BETWEEN b.start_date AND b.end_date)
OR
(DATE_ADD(a.yeardate, INTERVAL 7 DAY) BETWEEN b.start_date AND b.end_date)
WHERE
b.available_id IS NULL
Where <property_id here> is the property_id. We can even do the selection based on multiple properties at a time by simply changing it to WHERE property_id IN (<comma sep'd list of property_ids here>).
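One caveat about the join condition used above: the two BETWEEN checks only test whether the start or the end of a candidate block falls inside an available range, so an available range lying strictly inside a candidate block (possible only if it is shorter than 7 days) would not be detected. A sketch using the general interval-overlap test instead, under the same table and column names:

```sql
SELECT a.yeardate                             AS blockstart,
       DATE_ADD(a.yeardate, INTERVAL 7 DAY)   AS blockend
FROM yeardates a
LEFT JOIN available_dates b
       -- two ranges overlap when each one starts before the other ends
       ON b.start_date <= DATE_ADD(a.yeardate, INTERVAL 7 DAY)
      AND b.end_date   >= a.yeardate
WHERE b.available_id IS NULL;
```

Like BETWEEN, this treats ranges that merely touch at an endpoint as overlapping; switch to strict < and > if back-to-back blocks should count as free.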
I think you've got it backwards.
All dates are potentially available unless booked. Record what has been booked in the database and knock those dates out of your results.