I need help writing a MySQL query.
I have a table full of logs where one of the columns is a Unix timestamp.
I want to group (GROUP BY) those records so that events that occurred within a short time of each other (e.g. 5 seconds between consecutive events) end up in one group.
For example:
Table:
timestamp
----------
1429016966
1429016964
1429016963
1429016960
1429016958
1429016957
1429016950
1429016949
1429016943
1429016941
1429016940
1429016938
Should be grouped like this:
GROUP_CONCAT(timestamp) | COUNT(*)
-----------------------------------------------------------------------------
1429016966,1429016964,1429016963,1429016960,1429016958,1429016957 | 6
1429016950,1429016949 | 2
1429016943,1429016941,1429016940,1429016938 | 4
Of course I can process the data array afterwards in PHP, but I think MySQL would do it faster.
I started by using a variable to get the position of each row, where 1 is the highest time column and ending with the lowest, like this:
SET @a := 0;
SELECT timeCol, @a := @a + 1 AS position
FROM myTable
ORDER BY timeCol DESC;
For simplicity, we will call this positionsTable so that the rest of the query will be more readable. Once I created that table, I used a 'time_group' variable that checked if a previous row was within the last 5 seconds. If it was, we keep the same time_group. It sounds ugly, and looks kind of ugly, but it's like this:
SET @time_group := 0;
SELECT m.timeCol, m.position,
  CASE WHEN (SELECT p.timeCol FROM positionsTable p WHERE p.position = m.position - 1) <= m.timeCol + 5
    THEN @time_group
    ELSE @time_group := @time_group + 1 END AS timeGroup
FROM positionsTable m;
And then ultimately, using that as a subquery, you can group them:
SET @time_group := 0;
SELECT GROUP_CONCAT(timeCol), COUNT(*)
FROM(
  SELECT m.timeCol, m.position,
    CASE WHEN (SELECT p.timeCol FROM positionsTable p WHERE p.position = m.position - 1) <= m.timeCol + 5
      THEN @time_group
      ELSE @time_group := @time_group + 1 END AS timeGroup
  FROM positionsTable m) tmp
GROUP BY timeGroup;
Here is an SQL Fiddle example.
http://sqlfiddle.com/#!9/37d88/20
SELECT GROUP_CONCAT(t1.t) as `time`,
       COUNT(*)
FROM (SELECT *
      FROM table1
      ORDER BY t) as t1
GROUP BY CASE WHEN (@start+5)>=t THEN @start
              ELSE @start:=t END
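On MySQL 8.0 or later, the same grouping can be sketched without user variables by using window functions. This is only a sketch under assumptions: it reuses the myTable and timeCol names from the first answer, and the CTE names flagged and numbered are made up for illustration. A row starts a new group when the gap to the previous (earlier) event is more than 5 seconds; a running sum of those flags then numbers the groups.

```sql
-- Assumes MySQL 8.0+ and a table myTable(timeCol) holding Unix timestamps.
WITH flagged AS (
    SELECT timeCol,
           -- 1 when this row starts a new group: either there is no earlier
           -- row (LAG returns NULL) or the gap to it is more than 5 seconds.
           CASE WHEN timeCol - LAG(timeCol) OVER (ORDER BY timeCol) <= 5
                THEN 0 ELSE 1 END AS isNewGroup
    FROM myTable
),
numbered AS (
    SELECT timeCol,
           -- Running total of the flags gives each group its own number.
           SUM(isNewGroup) OVER (ORDER BY timeCol) AS timeGroup
    FROM flagged
)
SELECT GROUP_CONCAT(timeCol ORDER BY timeCol DESC) AS timestamps,
       COUNT(*) AS cnt
FROM numbered
GROUP BY timeGroup
ORDER BY timeGroup DESC;
```

Note that this compares each event with the one directly before it, as the question asks, whereas the GROUP BY CASE query above starts a new group 5 seconds after the first event of the current group, so the two can give different results on the same data.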
I need to execute these two queries from PHP. Is there a way to merge them into a single query, or do I have to use a stored procedure?
SET @rn=0;
UPDATE `nl_emails` SET `row_num`=(@rn:=@rn+1);
Thanks in advance
It doesn't look like it is possible. We could create @rn in the query but it will be local and the value will be lost from one row to another.
Here is another way of doing what I believe you want to do.
create table nl_emails (id int not null primary key ,row_num int);
insert into nl_emails values(10,10),(20,20),(30,30);
with cte as(
select id, row_num,
row_number() over (order by id)rn
from nl_emails)
update nl_emails join cte on nl_emails.id = cte.id
set nl_emails.row_num = rn;
select * from nl_emails;
id | row_num
-: | ------:
10 | 1
20 | 2
30 | 3
db<>fiddle here
Since the question is tagged php, you can use a PHP PDO solution:
<?php
$sql = "SET @rn = 0;
UPDATE nl_emails SET row_num = (@rn:=coalesce(@rn, 0) + 1);";
$pdo->exec($sql);
Test PHP PDO online
You can do it in a single query, but you should check the performance.
Use:
UPDATE `nl_emails` n1
INNER JOIN (
SELECT (@row_number:=@row_number + 1) AS num,
id
FROM nl_emails, (SELECT @row_number:=0) AS t
) as t1 on n1.id=t1.id
SET n1.`row_num`=t1.num;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=bf7d4a9243eed3c3e3aeb934846294b7
The key part is the cross join, which initializes the variable inside the same statement:
SELECT (@row_number:=@row_number + 1) AS num,
id
FROM nl_emails, (SELECT @row_number:=0) AS t
;
I have this table :
id idm date_play
1 5 2017-08-23 12:12:12
2 5 2017-08-23 12:12:12
3 6 2017-08-23 12:14:13
I want to identify whether a user has more than one insert in the same second. In the case described, I want to get the user id, which is 5.
I tried this:
SELECT `idm`, MAX(`s`) `conseq` FROM
(
SELECT
@s := IF(@u = `idm` AND (UNIX_TIMESTAMP(`date_play`) - @pt) BETWEEN 1 AND 100000, @s + 1, 0) s,
@u := `idm` `idm`,
@pt := UNIX_TIMESTAMP(`date_play`) pt
FROM table
WHERE date_play >= '2017-08-23 00:00:00'
AND date_play <= '2017-08-23 23:59:59'
ORDER BY `date_play`
) AS t
GROUP BY `idm`
Can you help me, please? Thanks in advance, and sorry for my English.
Assuming your dates are accurate down to the second level, you can do this with a single aggregation:
select idm
from t
group by idm
having count(*) > count(distinct date_play);
If date_play has fractional seconds, then you would need to remove those (say by converting to a string).
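As a rough sketch of that fractional-seconds case, assuming date_play is declared with sub-second precision (for example DATETIME(6)), DATE_FORMAT can be used to turn the value into a string that stops at whole seconds before counting distinct values. The table name t and the columns are the ones used above.

```sql
-- Assumes table t(id, idm, date_play) where date_play may carry fractional seconds.
SELECT idm
FROM t
GROUP BY idm
-- DATE_FORMAT with %s keeps only whole seconds, so two inserts within the
-- same second collapse to one distinct string and the counts differ.
HAVING COUNT(*) > COUNT(DISTINCT DATE_FORMAT(date_play, '%Y-%m-%d %H:%i:%s'));
```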
If you want the play dates where there are duplicates:
select idm, date_play
from t
group by idm, date_play
having count(*) >= 2;
Or, for just the idms, you could use select distinct with group by:
select distinct idm
from t
group by idm, date_play
having count(*) >= 2;
(I only mention this because this is the only type of problem that I know of where using select distinct with group by makes sense.)
If you want all the rows that are duplicated, I would go for exists instead:
select t.*
from t
where exists (select 1
from t t2
where t2.idm = t.idm and t2.date_play = t.date_play and
t2.id <> t.id
);
This should have reasonable performance with an index on (idm, date_play, id).
If your table is called mytable, the following should work:
SELECT t.`idm`
FROM mytable t INNER JOIN mytable t2
ON t.`idm`=t2.`idm` AND t.`date_play`=t2.`date_play` AND t.`id`!=t2.`id`
GROUP BY t.`idm`
Basically we join the table with itself, pairing records that have the same idm and date_play, but not the same id. This will have the effect of matching up any two records with the same user and datetime. We then group results by user so you don't get the same user id listed multiple times.
Edit:
Gordon Linoff's and tadman's suggestions led me to this query, which is probably much more efficient (credit to them):
SELECT DISTINCT t.`idm`
FROM mytable t
GROUP BY t.`idm`, t.`date_play`
HAVING COUNT(t.`id`)>1
Here is my SQL query to find, for each currency in the currency_price table, the row with the maximum insert date. My question is how to find the second maximum. That is, how can I change this query to find the row with the second-highest date in each group?
select currency_id,buy,sell
from (select * from currency_price order by `currency_id`, cu_date desc,buy,sell) x
group by `currency_id`
With this query I get one row for each id, so I have sell and buy for each id. For example:
id sell buy
1000 500 480
1001 20 19
...
But here I want the row with the second maximum date for each id.
I know some queries for finding a second maximum, but none of them gets me to my answer.
If it is MySQL, use LIMIT 1,1, which skips one row and returns one row (offset 1, row count 1).
http://dev.mysql.com/doc/refman/5.7/en/select.html
Use ROW_NUMBER()
Sample
SELECT * FROM
(
SELECT *,ROW_NUMBER() OVER (ORDER BY AGE DESC) RNUM FROM TEST_TABLE
) QUERY1 WHERE RNUM=2
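Applied to the question's table, a per-group version might look like the sketch below. It assumes MySQL 8.0+ (window functions are not available in earlier versions) and the currency_price columns named in the question; the alias names rn and ranked are illustrative. PARTITION BY restarts the numbering for every currency_id, so rn = 2 is the second most recent row of each currency.

```sql
-- Assumes MySQL 8.0+ and currency_price(currency_id, cu_date, buy, sell).
SELECT currency_id, buy, sell
FROM (
    SELECT currency_id, buy, sell,
           -- Number the rows of each currency from newest to oldest.
           ROW_NUMBER() OVER (PARTITION BY currency_id
                              ORDER BY cu_date DESC) AS rn
    FROM currency_price
) ranked
WHERE rn = 2;
```

If several rows of one currency can share the same cu_date and should count as a single date, DENSE_RANK() in place of ROW_NUMBER() may be closer to what is wanted.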
You could manually add a group order number to your initial ordered query and then select the second row from each group.
The inner query, orders as required & numbers the rows starting from 1, resetting each time the currency_id changes.
set @num := 0, @ci := -1;
select currency_id,buy,sell
from
(select *,
@num := if(@ci = currency_id, @num + 1, 1) as gp_number,
@ci := currency_id as dummy
from currency_price
order by `currency_id`, cu_date desc,buy,sell) x
where gp_number=2
This could be put into a stored procedure from workbench as follows :
DELIMITER $$
CREATE PROCEDURE SecondMaximum()
BEGIN
set @num := 0, @ci := -1;
select currency_id,buy,sell
from
(select *,
@num := if(@ci = currency_id, @num + 1, 1) as gp_number,
@ci := currency_id as dummy
from currency_price
order by `currency_id`, cu_date desc,buy,sell) x
where gp_number=2;
END$$
DELIMITER ;
And from PHP you execute "CALL SecondMaximum();"
If you wanted to be able to change tables and/or fields, then you could pass these as string variables to the procedure & create & execute a prepared statement within the stored procedure. Just do a google search for tutorials on those.
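A minimal sketch of that idea follows, assuming MySQL 8.0+ and passing only the table name as a parameter. The procedure name SecondMaximumFrom and the parameter tbl are hypothetical, and the inner query uses ROW_NUMBER() rather than the user-variable approach above. The table name is concatenated into the statement text, so it is quoted with backticks and any backtick inside it is doubled.

```sql
DELIMITER $$
CREATE PROCEDURE SecondMaximumFrom(IN tbl VARCHAR(64))
BEGIN
    -- Build the statement text; the target table must have the columns
    -- currency_id, cu_date, buy and sell.
    SET @sql := CONCAT(
        'SELECT currency_id, buy, sell FROM (',
        ' SELECT currency_id, buy, sell,',
        ' ROW_NUMBER() OVER (PARTITION BY currency_id ORDER BY cu_date DESC) AS rn',
        ' FROM `', REPLACE(tbl, '`', '``'), '`',
        ') ranked WHERE rn = 2');
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;
```

It would then be called with, for example, CALL SecondMaximumFrom('currency_price');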
I have a database table like this:
I want to display different 5-year age ranges and the count of students in each range, like below:
Here, the lowest age is 10, so we first calculate the range 10-15. There are 5 students within that range. For the second range, we need to find the lowest age > 15, which is 18. So, the second range is 18-23, and so on. I would appreciate any help where the ranges are calculated automatically and the rows within each range are counted.
You can use a condition inside of a SUM() statement to get a count where that condition holds. I would count the conditions where the age is BETWEEN() the necessary range. Try this:
SELECT
SUM(age BETWEEN 10 AND 15) AS '10-15',
SUM(age BETWEEN 18 AND 23) AS '18-23',
SUM(age BETWEEN 26 AND 31) AS '26-31',
SUM(age BETWEEN 34 AND 39) AS '34-39'
FROM myTable;
This will only return one row, but it will have everything you need. Here is an SQL Fiddle example.
EDIT I misunderstood your question; you want the ranges calculated automatically. I will leave my previous answer here because it may be useful to future readers looking for hard-coded ranges. To do this, you'll have to set up a variable. I took a running-total approach to get the groups. I started by setting @a to 0 before the query. Then, I needed to get two values:
The minimum age from the table where age > @a
5 greater than that variable.
I did this by changing the value of @a as necessary:
@a := (SELECT MIN(age) FROM myTable WHERE age > @a)
@a := @a + 5
Then, I included these in a CONCAT() block and cast these values as chars in order to get the groups that I needed. It may look complicated, so I hope I explained the concept:
SELECT CONCAT
(CAST(@a := (SELECT MIN(age) FROM myTable WHERE age > @a) AS CHAR),
' - ',
CAST((@a := @a + 5) AS CHAR)) AS ageRange
FROM myTable
WHERE @a <= (SELECT MAX(age) FROM myTable);
Doing this gave me four rows, each with the age ranges you expect. I had to add the where clause because otherwise I would get one result row for each row in the table, which would give us several null rows.
Last, I included a subquery to get the count of students whose age is within the necessary range. Note that the first part changes the value of @a, so instead of checking from @a to @a + 5, I check from @a - 5 to @a. Here is the final query:
SET @a = 0;
SELECT CONCAT(CAST(@a := (SELECT MIN(age) FROM myTable WHERE age > @a) AS CHAR), ' - ', CAST((@a := @a + 5) AS CHAR)) AS ageRange,
(SELECT COUNT(*) FROM myTable WHERE age BETWEEN @a - 5 AND @a) AS numStudents
FROM myTable
WHERE @a <= (SELECT MAX(age) FROM myTable)
GROUP BY ageRange;
It worked beautifully in SQL Fiddle. Completely dynamic and returns the various groups of 5 without any prior knowledge of which groups to take.
SELECT
CASE
WHEN age>=10 AND age<=15 THEN '10-15'
WHEN age>=18 AND age<=23 THEN '18-23'
WHEN age>=26 AND age<=31 THEN '26-31'
WHEN age>=34 AND age<=39 THEN '34-39'
ELSE 'OTHER'
END
AS age_range,
COUNT(*) as number_of_students
FROM table
GROUP BY age_range
I have a series of values in a database that I need to pull to create a line chart. Because I don't require high resolution, I would like to resample the data by selecting every 5th row from the database.
SELECT *
FROM (
SELECT
@row := @row +1 AS rownum, [column name]
FROM (
SELECT @row :=0) r, [table name]
) ranked
WHERE rownum % [n] = 1
You could try mod 5 to get rows where the ID is multiple of 5. (Assuming you have some sort of ID column that's sequential.)
select * from table where table.id mod 5 = 0;
Since you said you're using MySQL, you can use user variables to create a continuous row numbering. You do have to put that in a derived table (subquery) though.
SET @x := 0;
SELECT *
FROM (SELECT (@x:=@x+1) AS x, mt.* FROM mytable mt ORDER BY RAND()) t
WHERE x MOD 5 = 0;
I added ORDER BY RAND() to get a pseudorandom sampling, instead of allowing every fifth row of the unordered table to be in the sample every time.
An anonymous user tried to edit this to change x MOD 5 = 0 to x MOD 5 = 1. I have changed it back to my original.
For the record, one can use any value between 0 and 4 in that condition, and there's no reason to prefer one value over another.
SET @a = 0;
SELECT * FROM t where (@a := @a + 1) % 2 = 0;
I had been looking for something like this. The answers from Taylor and Bill led me to improve on their ideas.
The table data1 has the fields read_date and value.
We want to select every 2nd record from a query limited to a read_date range.
The name of the derived table is arbitrary; here it is called DT.
Query:
SET @row := 0;
SELECT * FROM ( SELECT @row := @row +1 AS rownum, read_date, value FROM data1
WHERE read_date>= 1279771200 AND read_date <= 1281844740 ) as DT WHERE MOD(rownum,2)=0
If you're using MariaDB 10.2, MySQL 8 or later, you can do this more efficiently, and I think more clearly, using common table expressions and window functions.
WITH ordering AS (
SELECT ROW_NUMBER() OVER (ORDER BY name) AS n, example.*
FROM example ORDER BY name
)
SELECT * FROM ordering WHERE MOD(n, 5) = 0;
Conceptually, this creates a temporary table with the contents of the example table ordered by the name field, adds an additional field called n which is the row number, and then fetches only those rows with numbers which are exactly divisible by 5, i.e. every 5th row. In practice, the database engine is often able to optimise this better than that. But even if it doesn't optimise it any further, I think it's clearer than using user variables iteratively as you had to in earlier versions of MySQL.
You can use this query:
set @n=2; -- nth row
select * from (SELECT t.*,
@rowid := @rowid + 1 AS ID
FROM TABLE t,
(SELECT @rowid := 0) dummy) A where A.ID mod @n = 0;
or you can replace n with your nth value
SELECT *
FROM (
SELECT @row := @row +1 AS rownum, posts.*
FROM (
SELECT @row :=0) r, posts
) ranked
WHERE rownum %3 = 1
where posts is my table.
If you don't require the row number in the result set you can simplify the query.
SELECT
[column name]
FROM
(SELECT @row:=0) temp,
[table name]
WHERE (@row:=@row + 1) % [n] = 1
Replace the following placeholders:
Replace [column name] with a list of columns you need to fetch.
Replace [table name] with the name of your table.
Replace [n] with a number. e.g. if you need every 5th row, replace it with 5