MySql Query optimization : Order by online member - php

I have the following query, which works fine:
SELECT
a.i_am,a.seeking_a,
a.zodiac,
a.my_marital_status,
a.city,a.country,
a.my_age,
a.intrested_in1,
a.intrested_in2,
a.intrested_in3,
a.intrested_in4,
a.intrested_in5,
a.intrested_in6,
a.user_id,
a.picture_type,
a.photo,
a.block_photo,
r.username,
u.title,
u.about_myself
FROM ".TBL_ABOUT_MYSELF." a
INNER JOIN ".TBLREGISTRATION." r ON a.user_id=r.ID
INNER JOIN ".TBLUSERDETAILS." u ON a.user_id=u.user_id
LEFT JOIN ".TBLSTEPS." s ON a.user_id=s.user_id
WHERE 1 $condition $gender_condition
ORDER BY a.user_id
DESC LIMIT $from,10
it takes around 0.5 seconds to fetch results. There are 1,000,000 results in total.
Now if I put join one more table (for user online status:
SELECT
a.i_am,a.seeking_a,
a.zodiac,
a.my_marital_status,
a.city,a.country,
a.my_age,
a.intrested_in1,
a.intrested_in2,
a.intrested_in3,
a.intrested_in4,
a.intrested_in5,
a.intrested_in6,
a.user_id,
a.picture_type,
a.photo,
a.block_photo,
r.username,
u.title,
u.about_myself
FROM ".TBL_ABOUT_MYSELF." a
INNER JOIN ".TBLREGISTRATION." r ON a.user_id=r.ID
INNER JOIN ".TBLUSERDETAILS." u ON a.user_id=u.user_id
LEFT JOIN ".TBLSTEPS." s ON a.user_id=s.user_id
LEFT JOIN ".TBL_USER_ONLINE." ON a.user_id=o.user_id <-- New Line
WHERE 1 $condition $gender_condition
ORDER BY o.user_id DESC, a.user_id DESC LIMIT $from,10 <-- One more order by
and now it takes 12 seconds to fetch results.
How can I optimize this?
mysql Query Explained:
SQL result
Host: localhost
Database: mysite_new
Generation Time: Dec 06, 2012 at 09:09 AM
Generated by: phpMyAdmin 3.2.0.1 / MySQL 5.1.36-community-log
SQL query: EXPLAIN SELECT a.i_am,a.seeking_a,a.zodiac,a.my_marital_status,a.city,a.country,a.my_age,a.intrested_in1,a.intrested_in2,a.intrested_in3,a.intrested_in4,a.intrested_in5,a.intrested_in6,a.user_id,a.picture_type,a.photo,a.block_photo,r.username,u.title, u.about_myself ,o.user_id as online_user_id FROM mysite_about_myself_new a INNER JOIN mysite_user_registration_new r ON a.user_id=r.ID INNER JOIN mysite_user_details_new u ON a.user_id=u.user_id LEFT JOIN mysite_steps_new s ON a.user_id=s.user_id LEFT JOIN mysite_usersonline_new o ON a.user_id=o.user_id WHERE 1 and r.block_by_webmaster='0' and a.approved='1' and s.step1='1' AND a.i_am ='2' AND a.seeking_a = '1' ORDER BY o.user_id DESC LIMIT 0,10;
Rows: 5
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a ref user_id,i_am i_am 2 const 30011 Using where; Using temporary; Using filesort
1 SIMPLE o ref user_id user_id 4 mysite_new.a.user_id 2 Using index
1 SIMPLE s eq_ref user_id user_id 4 mysite_new.a.user_id 1 Using where
1 SIMPLE u eq_ref user_id user_id 4 mysite_new.s.user_id 1 Using where
1 SIMPLE r eq_ref PRIMARY PRIMARY 4 mysite_new.u.user_id 1 Using where
Second Explain as Asked:
SQL result
Host: localhost
Database: mysite_new
Generation Time: Dec 06, 2012 at 10:02 AM
Generated by: phpMyAdmin 3.2.0.1 / MySQL 5.1.36-community-log
SQL query: EXPLAIN SELECT a.i_am,a.seeking_a,a.zodiac,a.my_marital_status,a.city,a.country,a.my_age,a.intrested_in1,a.intrested_in2,a.intrested_in3,a.intrested_in4,a.intrested_in5,a.intrested_in6,a.user_id,a.picture_type,a.photo,a.block_photo,r.username,u.title, u.about_myself ,o.user_id as online_user_id FROM mysite_about_myself_new a INNER JOIN mysite_user_registration_new r ON a.user_id=r.ID INNER JOIN mysite_user_details_new u ON a.user_id=u.user_id LEFT JOIN mysite_steps_new s ON a.user_id=s.user_id LEFT JOIN mysite_usersonline_new o ON a.user_id=o.user_id WHERE 1 and r.block_by_webmaster='0' and a.approved='1' and s.step1='1' AND a.i_am ='2' AND a.seeking_a = '1' ORDER BY a.user_id DESC LIMIT 0,10;
Rows: 5
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a index user_id,i_am user_id 4 NULL 38 Using where
1 SIMPLE o ref user_id user_id 4 mysite_new.a.user_id 2 Using index
1 SIMPLE s eq_ref user_id user_id 4 mysite_new.a.user_id 1 Using where
1 SIMPLE u eq_ref user_id user_id 4 mysite_new.s.user_id 1 Using where
1 SIMPLE r eq_ref PRIMARY PRIMARY 4 mysite_new.u.user_id 1 Using where

The problem is most likely in the sorting. Whereas in the first query MySQL could sort using the a.user_id index, in the second query it must perform a filesort of a million records in order to sort over the columns of two tables (and then seek to some point in the resultset to effect the LIMIT clause).
Sorting by o.user_id should be sufficient, since it is either NULL or equal to a.user_id.
Also, if you were instead able to restrict the query with an inequality on o.user_id in a WHERE clause, you would at least not need to seek as far.

why you use order by on two columns
ORDER BY o.user_id DESC, a.user_id DESC
I think o.user_id and a.user_id will be same values for a single user so you can just use one.
o.user_id if you want to include null from tables TBL_USER_ONLINE in order
a.user_id if you want to order only on user ids.
Also use EXPLAIN and create any necessary indexes for joins, where conditions and order by columns

change this new line to
LEFT JOIN ".TBL_USER_ONLINE." o on ON a.user_id=o.user_id
EDIT :
then your problem is with the ORDER BY.
your query takes longs time to ORDER BY two user_id of two tables, try use just one user_id
from one table
EDIT2 :
add this o.user_id also to the SELECT part
like that
SELECT
a.i_am,a.seeking_a,
a.zodiac,
a.my_marital_status,
a.city,a.country,
a.my_age,
a.intrested_in1,
a.intrested_in2,
a.intrested_in3,
a.intrested_in4,
a.intrested_in5,
a.intrested_in6,
a.user_id,
a.picture_type,
a.photo,
a.block_photo,
r.username,
u.title,
u.about_myself
o.user_id <-- New Line

Related

Subquery order by not working for outer group by query in mysql [duplicate]

There is a table messages that contains data as shown below:
Id Name Other_Columns
-------------------------
1 A A_data_1
2 A A_data_2
3 A A_data_3
4 B B_data_1
5 B B_data_2
6 C C_data_1
If I run a query select * from messages group by name, I will get the result as:
1 A A_data_1
4 B B_data_1
6 C C_data_1
What query will return the following result?
3 A A_data_3
5 B B_data_2
6 C C_data_1
That is, the last record in each group should be returned.
At present, this is the query that I use:
SELECT
*
FROM (SELECT
*
FROM messages
ORDER BY id DESC) AS x
GROUP BY name
But this looks highly inefficient. Any other ways to achieve the same result?
MySQL 8.0 now supports windowing functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:
WITH ranked_messages AS (
SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
FROM messages AS m
)
SELECT * FROM ranked_messages WHERE rn = 1;
This and other approaches to finding groupwise maximal rows are illustrated in the MySQL manual.
Below is the original answer I wrote for this question in 2009:
I write the solution this way:
SELECT m1.*
FROM messages m1 LEFT JOIN messages m2
ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL;
Regarding performance, one solution or the other can be better, depending on the nature of your data. So you should test both queries and use the one that is better at performance given your database.
For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.
I'll write a query to find the most recent post for a given user ID (mine).
First using the technique shown by #Eric with the GROUP BY in a subquery:
SELECT p1.postid
FROM Posts p1
INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
FROM Posts pi GROUP BY pi.owneruserid) p2
ON (p1.postid = p2.maxpostid)
WHERE p1.owneruserid = 20860;
1 row in set (1 min 17.89 sec)
Even the EXPLAIN analysis takes over 16 seconds:
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 76756 | |
| 1 | PRIMARY | p1 | eq_ref | PRIMARY,PostId,OwnerUserId | PRIMARY | 8 | p2.maxpostid | 1 | Using where |
| 2 | DERIVED | pi | index | NULL | OwnerUserId | 8 | NULL | 1151268 | Using index |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
3 rows in set (16.09 sec)
Now produce the same query result using my technique with LEFT JOIN:
SELECT p1.postid
FROM Posts p1 LEFT JOIN posts p2
ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
WHERE p2.postid IS NULL AND p1.owneruserid = 20860;
1 row in set (0.28 sec)
The EXPLAIN analysis shows that both tables are able to use their indexes:
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| 1 | SIMPLE | p1 | ref | OwnerUserId | OwnerUserId | 8 | const | 1384 | Using index |
| 1 | SIMPLE | p2 | ref | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8 | const | 1384 | Using where; Using index; Not exists |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
2 rows in set (0.00 sec)
Here's the DDL for my Posts table:
CREATE TABLE `posts` (
`PostId` bigint(20) unsigned NOT NULL auto_increment,
`PostTypeId` bigint(20) unsigned NOT NULL,
`AcceptedAnswerId` bigint(20) unsigned default NULL,
`ParentId` bigint(20) unsigned default NULL,
`CreationDate` datetime NOT NULL,
`Score` int(11) NOT NULL default '0',
`ViewCount` int(11) NOT NULL default '0',
`Body` text NOT NULL,
`OwnerUserId` bigint(20) unsigned NOT NULL,
`OwnerDisplayName` varchar(40) default NULL,
`LastEditorUserId` bigint(20) unsigned default NULL,
`LastEditDate` datetime default NULL,
`LastActivityDate` datetime default NULL,
`Title` varchar(250) NOT NULL default '',
`Tags` varchar(150) NOT NULL default '',
`AnswerCount` int(11) NOT NULL default '0',
`CommentCount` int(11) NOT NULL default '0',
`FavoriteCount` int(11) NOT NULL default '0',
`ClosedDate` datetime default NULL,
PRIMARY KEY (`PostId`),
UNIQUE KEY `PostId` (`PostId`),
KEY `PostTypeId` (`PostTypeId`),
KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
KEY `OwnerUserId` (`OwnerUserId`),
KEY `LastEditorUserId` (`LastEditorUserId`),
KEY `ParentId` (`ParentId`),
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
) ENGINE=InnoDB;
Note to commenters: If you want another benchmark with a different version of MySQL, a different dataset, or different table design, feel free to do it yourself. I have shown the technique above. Stack Overflow is here to show you how to do software development work, not to do all the work for you.
UPD: 2017-03-31, the version 5.7.5 of MySQL made the ONLY_FULL_GROUP_BY switch enabled by default (hence, non-deterministic GROUP BY queries became disabled). Moreover, they updated the GROUP BY implementation and the solution might not work as expected anymore even with the disabled switch. One needs to check.
Bill Karwin's solution above works fine when item count within groups is rather small, but the performance of the query becomes bad when the groups are rather large, since the solution requires about n*n/2 + n/2 of only IS NULL comparisons.
I made my tests on a InnoDB table of 18684446 rows with 1182 groups. The table contains testresults for functional tests and has the (test_id, request_id) as the primary key. Thus, test_id is a group and I was searching for the last request_id for each test_id.
Bill's solution has already been running for several hours on my dell e4310 and I do not know when it is going to finish even though it operates on a coverage index (hence using index in EXPLAIN).
I have a couple of other solutions that are based on the same ideas:
if the underlying index is BTREE index (which is usually the case), the largest (group_id, item_value) pair is the last value within each group_id, that is the first for each group_id if we walk through the index in descending order;
if we read the values which are covered by an index, the values are read in the order of the index;
each index implicitly contains primary key columns appended to that (that is the primary key is in the coverage index). In solutions below I operate directly on the primary key, in you case, you will just need to add primary key columns in the result.
in many cases it is much cheaper to collect the required row ids in the required order in a subquery and join the result of the subquery on the id. Since for each row in the subquery result MySQL will need a single fetch based on primary key, the subquery will be put first in the join and the rows will be output in the order of the ids in the subquery (if we omit explicit ORDER BY for the join)
3 ways MySQL uses indexes is a great article to understand some details.
Solution 1
This one is incredibly fast, it takes about 0,8 secs on my 18M+ rows:
SELECT test_id, MAX(request_id) AS request_id
FROM testresults
GROUP BY test_id DESC;
If you want to change the order to ASC, put it in a subquery, return the ids only and use that as the subquery to join to the rest of the columns:
SELECT test_id, request_id
FROM (
SELECT test_id, MAX(request_id) AS request_id
FROM testresults
GROUP BY test_id DESC) as ids
ORDER BY test_id;
This one takes about 1,2 secs on my data.
Solution 2
Here is another solution that takes about 19 seconds for my table:
SELECT test_id, request_id
FROM testresults, (SELECT #group:=NULL) as init
WHERE IF(IFNULL(#group, -1)=#group:=test_id, 0, 1)
ORDER BY test_id DESC, request_id DESC
It returns tests in descending order as well. It is much slower since it does a full index scan but it is here to give you an idea how to output N max rows for each group.
The disadvantage of the query is that its result cannot be cached by the query cache.
Use your subquery to return the correct grouping, because you're halfway there.
Try this:
select
a.*
from
messages a
inner join
(select name, max(id) as maxid from messages group by name) as b on
a.id = b.maxid
If it's not id you want the max of:
select
a.*
from
messages a
inner join
(select name, max(other_col) as other_col
from messages group by name) as b on
a.name = b.name
and a.other_col = b.other_col
This way, you avoid correlated subqueries and/or ordering in your subqueries, which tend to be very slow/inefficient.
I arrived at a different solution, which is to get the IDs for the last post within each group, then select from the messages table using the result from the first query as the argument for a WHERE x IN construct:
SELECT id, name, other_columns
FROM messages
WHERE id IN (
SELECT MAX(id)
FROM messages
GROUP BY name
);
I don't know how this performs compared to some of the other solutions, but it worked spectacularly for my table with 3+ million rows. (4 second execution with 1200+ results)
This should work both on MySQL and SQL Server.
Solution by sub query fiddle Link
select * from messages where id in
(select max(id) from messages group by Name)
Solution By join condition fiddle link
select m1.* from messages m1
left outer join messages m2
on ( m1.id<m2.id and m1.name=m2.name )
where m2.id is null
Reason for this post is to give fiddle link only.
Same SQL is already provided in other answers.
An approach with considerable speed is as follows.
SELECT *
FROM messages a
WHERE Id = (SELECT MAX(Id) FROM messages WHERE a.Name = Name)
Result
Id Name Other_Columns
3 A A_data_3
5 B B_data_2
6 C C_data_1
We will look at how you can use MySQL at getting the last record in a Group By of records. For example if you have this result set of posts.
id
category_id
post_title
1
1
Title 1
2
1
Title 2
3
1
Title 3
4
2
Title 4
5
2
Title 5
6
3
Title 6
I want to be able to get the last post in each category which are Title 3, Title 5 and Title 6. To get the posts by the category you will use the MySQL Group By keyboard.
select * from posts group by category_id
But the results we get back from this query is.
id
category_id
post_title
1
1
Title 1
4
2
Title 4
6
3
Title 6
The group by will always return the first record in the group on the result set.
SELECT id, category_id, post_title
FROM posts
WHERE id IN (
SELECT MAX(id)
FROM posts
GROUP BY category_id );
This will return the posts with the highest IDs in each group.
id
category_id
post_title
3
1
Title 3
5
2
Title 5
6
3
Title 6
Reference Click Here
Here are two suggestions. First, if mysql supports ROW_NUMBER(), it's very simple:
WITH Ranked AS (
SELECT Id, Name, OtherColumns,
ROW_NUMBER() OVER (
PARTITION BY Name
ORDER BY Id DESC
) AS rk
FROM messages
)
SELECT Id, Name, OtherColumns
FROM messages
WHERE rk = 1;
I'm assuming by "last" you mean last in Id order. If not, change the ORDER BY clause of the ROW_NUMBER() window accordingly. If ROW_NUMBER() isn't available, this is another solution:
Second, if it doesn't, this is often a good way to proceed:
SELECT
Id, Name, OtherColumns
FROM messages
WHERE NOT EXISTS (
SELECT * FROM messages as M2
WHERE M2.Name = messages.Name
AND M2.Id > messages.Id
)
In other words, select messages where there is no later-Id message with the same Name.
Clearly there are lots of different ways of getting the same results, your question seems to be what is an efficient way of getting the last results in each group in MySQL. If you are working with huge amounts of data and assuming you are using InnoDB with even the latest versions of MySQL (such as 5.7.21 and 8.0.4-rc) then there might not be an efficient way of doing this.
We sometimes need to do this with tables with even more than 60 million rows.
For these examples I will use data with only about 1.5 million rows where the queries would need to find results for all groups in the data. In our actual cases we would often need to return back data from about 2,000 groups (which hypothetically would not require examining very much of the data).
I will use the following tables:
CREATE TABLE temperature(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
groupID INT UNSIGNED NOT NULL,
recordedTimestamp TIMESTAMP NOT NULL,
recordedValue INT NOT NULL,
INDEX groupIndex(groupID, recordedTimestamp),
PRIMARY KEY (id)
);
CREATE TEMPORARY TABLE selected_group(id INT UNSIGNED NOT NULL, PRIMARY KEY(id));
The temperature table is populated with about 1.5 million random records, and with 100 different groups.
The selected_group is populated with those 100 groups (in our cases this would normally be less than 20% for all of the groups).
As this data is random it means that multiple rows can have the same recordedTimestamps. What we want is to get a list of all of the selected groups in order of groupID with the last recordedTimestamp for each group, and if the same group has more than one matching row like that then the last matching id of those rows.
If hypothetically MySQL had a last() function which returned values from the last row in a special ORDER BY clause then we could simply do:
SELECT
last(t1.id) AS id,
t1.groupID,
last(t1.recordedTimestamp) AS recordedTimestamp,
last(t1.recordedValue) AS recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
ORDER BY t1.recordedTimestamp, t1.id
GROUP BY t1.groupID;
which would only need to examine a few 100 rows in this case as it doesn't use any of the normal GROUP BY functions. This would execute in 0 seconds and hence be highly efficient.
Note that normally in MySQL we would see an ORDER BY clause following the GROUP BY clause however this ORDER BY clause is used to determine the ORDER for the last() function, if it was after the GROUP BY then it would be ordering the GROUPS. If no GROUP BY clause is present then the last values will be the same in all of the returned rows.
However MySQL does not have this so let's look at different ideas of what it does have and prove that none of these are efficient.
Example 1
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT t2.id
FROM temperature t2
WHERE t2.groupID = g.id
ORDER BY t2.recordedTimestamp DESC, t2.id DESC
LIMIT 1
);
This examined 3,009,254 rows and took ~0.859 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 2
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
INNER JOIN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
) t5 ON t5.id = t1.id;
This examined 1,505,331 rows and took ~1.25 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 3
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
WHERE t1.id IN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
)
ORDER BY t1.groupID;
This examined 3,009,685 rows and took ~1.95 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 4
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT max(t2.id)
FROM temperature t2
WHERE t2.groupID = g.id AND t2.recordedTimestamp = (
SELECT max(t3.recordedTimestamp)
FROM temperature t3
WHERE t3.groupID = g.id
)
);
This examined 6,137,810 rows and took ~2.2 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 5
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
t2.id,
t2.groupID,
t2.recordedTimestamp,
t2.recordedValue,
row_number() OVER (
PARTITION BY t2.groupID ORDER BY t2.recordedTimestamp DESC, t2.id DESC
) AS rowNumber
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
) t1 WHERE t1.rowNumber = 1;
This examined 6,017,808 rows and took ~4.2 seconds on 8.0.4-rc
Example 6
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
last_value(t2.id) OVER w AS id,
t2.groupID,
last_value(t2.recordedTimestamp) OVER w AS recordedTimestamp,
last_value(t2.recordedValue) OVER w AS recordedValue
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
WINDOW w AS (
PARTITION BY t2.groupID
ORDER BY t2.recordedTimestamp, t2.id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
) t1
GROUP BY t1.groupID;
This examined 6,017,908 rows and took ~17.5 seconds on 8.0.4-rc
Example 7
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
LEFT JOIN temperature t2
ON t2.groupID = g.id
AND (
t2.recordedTimestamp > t1.recordedTimestamp
OR (t2.recordedTimestamp = t1.recordedTimestamp AND t2.id > t1.id)
)
WHERE t2.id IS NULL
ORDER BY t1.groupID;
This one was taking forever so I had to kill it.
Here is another way to get the last related record using GROUP_CONCAT with order by and SUBSTRING_INDEX to pick one of the record from the list
SELECT
`Id`,
`Name`,
SUBSTRING_INDEX(
GROUP_CONCAT(
`Other_Columns`
ORDER BY `Id` DESC
SEPARATOR '||'
),
'||',
1
) Other_Columns
FROM
messages
GROUP BY `Name`
Above query will group the all the Other_Columns that are in same Name group and using ORDER BY id DESC will join all the Other_Columns in a specific group in descending order with the provided separator in my case i have used || ,using SUBSTRING_INDEX over this list will pick the first one
Fiddle Demo
Hi #Vijay Dev if your table messages contains Id which is auto increment primary key then to fetch the latest record basis on the primary key your query should read as below:
SELECT m1.* FROM messages m1 INNER JOIN (SELECT max(Id) as lastmsgId FROM messages GROUP BY Name) m2 ON m1.Id=m2.lastmsgId
I've not yet tested with large DB but I think this could be faster than joining tables:
SELECT *, Max(Id) FROM messages GROUP BY Name
SELECT
column1,
column2
FROM
table_name
WHERE id IN
(SELECT
MAX(id)
FROM
table_name
GROUP BY column1)
ORDER BY column1 ;
You can take view from here as well.
http://sqlfiddle.com/#!9/ef42b/9
FIRST SOLUTION
SELECT d1.ID,Name,City FROM Demo_User d1
INNER JOIN
(SELECT MAX(ID) AS ID FROM Demo_User GROUP By NAME) AS P ON (d1.ID=P.ID);
SECOND SOLUTION
SELECT * FROM (SELECT * FROM Demo_User ORDER BY ID DESC) AS T GROUP BY NAME ;
If you need the most recent or oldest record of a text column in a grouped query, and you would rather not use a subquery, you can do this...
Ex. You have a list of movies and need to get the count in the series and the latest movie
id
series
name
1
Star Wars
A New hope
2
Star Wars
The Empire Strikes Back
3
Star Wars
Return of The Jedi
SELECT COUNT(id), series, SUBSTRING(MAX(CONCAT(id, name)), LENGTH(id) + 1),
FROM Movies
GROUP BY series
This returns...
id
series
name
3
Star Wars
Return of The Jedi
MAX will return the row with the highest value, so by concatenating the id to the name, you now will get the newest record, then just strip off the id for your final result.
More efficient than using a subquery.
So for the given example:
SELECT MAX(Id), Name, SUBSTRING(MAX(CONCAT(Id, Other_Columns)), LENGTH(Id) + 1),
FROM messages
GROUP BY Name
Happy coding, and "May The Force Be With You" :)
Try this:
SELECT jos_categories.title AS name,
joined .catid,
joined .title,
joined .introtext
FROM jos_categories
INNER JOIN (SELECT *
FROM (SELECT `title`,
catid,
`created`,
introtext
FROM `jos_content`
WHERE `sectionid` = 6
ORDER BY `id` DESC) AS yes
GROUP BY `yes`.`catid` DESC
ORDER BY `yes`.`created` DESC) AS joined
ON( joined.catid = jos_categories.id )
Here is my solution:
SELECT
DISTINCT NAME,
MAX(MESSAGES) OVER(PARTITION BY NAME) MESSAGES
FROM MESSAGE;
SELECT * FROM table_name WHERE primary_key IN (SELECT MAX(primary_key) FROM table_name GROUP BY column_name )
**
Hi, this query might help :
**
SELECT
*
FROM
message
WHERE
`Id` IN (
SELECT
MAX(`Id`)
FROM
message
GROUP BY
`Name`
)
ORDER BY
`Id` DESC
i find best solution in https://dzone.com/articles/get-last-record-in-each-mysql-group
select * from `data` where `id` in (select max(`id`) from `data` group by `name_id`)
The below query will work fine as per your question.
SELECT M1.*
FROM MESSAGES M1,
(
SELECT SUBSTR(Others_data,1,2),MAX(Others_data) AS Max_Others_data
FROM MESSAGES
GROUP BY 1
) M2
WHERE M1.Others_data = M2.Max_Others_data
ORDER BY Others_data;
If you want the last row for each Name, then you can give a row number to each row group by the Name and order by Id in descending order.
QUERY
SELECT t1.Id,
t1.Name,
t1.Other_Columns
FROM
(
SELECT Id,
Name,
Other_Columns,
(
CASE Name WHEN #curA
THEN #curRow := #curRow + 1
ELSE #curRow := 1 AND #curA := Name END
) + 1 AS rn
FROM messages t,
(SELECT #curRow := 0, #curA := '') r
ORDER BY Name,Id DESC
)t1
WHERE t1.rn = 1
ORDER BY t1.Id;
SQL Fiddle
If performance is really your concern you can introduce a new column on the table called IsLastInGroup of type BIT.
Set it to true on the columns which are last and maintain it with every row insert/update/delete. Writes will be slower, but you'll benefit on reads. It depends on your use case and I recommend it only if you're read-focused.
So your query will look like:
SELECT * FROM Messages WHERE IsLastInGroup = 1
MariaDB 10.3 and newer using GROUP_CONCAT.
The idea is to use ORDER BY + LIMIT:
SELECT GROUP_CONCAT(id ORDER BY id DESC LIMIT 1) AS id,
name,
GROUP_CONCAT(Other_columns ORDER BY id DESC LIMIT 1) AS Other_columns
FROM t
GROUP BY name;
db<>fiddle demo
How about this:
SELECT DISTINCT ON (name) *
FROM messages
ORDER BY name, id DESC;
I had similar issue (on postgresql tough) and on a 1M records table. This solution takes 1.7s vs 44s produced by the one with LEFT JOIN.
In my case I had to filter the corrispondant of your name field against NULL values, resulting in even better performances by 0.2 secs
Yet another option without subqueries.
This solution uses MySQL LAST_VALUE window function, exploiting Window Function Frame available MySQL tool from .
SELECT DISTINCT
LAST_VALUE(Id)
OVER(PARTITION BY Name
ORDER BY Id
ROWS BETWEEN 0 PRECEDING
AND UNBOUNDED FOLLOWING),
Name,
LAST_VALUE(Other_Columns)
OVER(PARTITION BY Name
ORDER BY Id
ROWS BETWEEN 0 PRECEDING
AND UNBOUNDED FOLLOWING)
FROM
tab
Try it here.
Hope below Oracle query can help:
WITH Temp_table AS
(
Select id, name, othercolumns, ROW_NUMBER() over (PARTITION BY name ORDER BY ID
desc)as rank from messages
)
Select id, name,othercolumns from Temp_table where rank=1
Another approach :
Find the propertie with the max m2_price withing each program (n properties in 1 program) :
select * from properties p
join (
select max(m2_price) as max_price
from properties
group by program_id
) p2 on (p.program_id = p2.program_id)
having p.m2_price = max_price
What about:
select *, max(id) from messages group by name
I have tested it on sqlite and it returns all columns and max id value for all names.
As of MySQL 8.0.14, this can also be achieved using Lateral Derived Tables:
SELECT t.*
FROM messages t
JOIN LATERAL (
SELECT name, MAX(id) AS id
FROM messages t1
WHERE t.name = t1.name
GROUP BY name
) trn ON t.name = trn.name AND t.id = trn.id
db<>fiddle

joining mysql tables or making the queries run faster

I admit am not the brightest when it comes to joining database tables so would like some help. Currently I have 2 queries, the main one being:
SELECT
s.id AS show_id,
start_date,
end_date
FROM `show` s, `theater_type` tt
WHERE
s.theater = tt.theater_id AND
image = 1 AND
start_date > '2015-05-21' AND
genre_id IN (1,2,3,12,13,17,21) AND
tt.type_id IN (1,2,3,4)
This is filtering and returning shows I need. So far it is excellent but for each show I also need its average rating. As a lumberjack I am looping through the results from above and using this query to retrieve the average for each show_id with:
SELECT AVG(rating) AS avgr FROM pro_reviews WHERE show_id = X
This is fine too but often times the first query returns more than 1k of results so the whole process runs more than 10s which is unacceptable for the end user.
I'm looking to join the queries and get it all in one shot. Maybe this won't be faster either but I'm out of options.
This is what I tried, obviously it's wrong because it only returns one row:
SELECT
AVG(rating) AS avgr,
allshows.show_id,
allshows.start_date,
allshows.end_date
FROM
pro_reviews pr,
(
SELECT s.id AS show_id, start_date, end_date
FROM `show` s, theater_type tt
WHERE s.theater = tt.theater_id AND image = 1 AND start_date > '2015-05-21' AND genre_id IN (1,2,3,12,13,17,21) AND tt.type_id IN (1,2,3,4)
) allshows
WHERE allshows.show_id = pr.show_id
Explain first select returns
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tt ALL NULL NULL NULL NULL 209 Using where
1 SIMPLE s ALL NULL NULL NULL NULL 5678 Using where; Using join buffer
SELECT
AVG(pr.rating) AS avgr
s.id AS show_id,
start_date,
end_date
FROM `show` s
JOIN `theater_type` tt
ON s.theater = tt.theater_id
LEFT JOIN `pro_reviews` pr
ON s.id = pr.show_id
WHERE
image = 1 AND
start_date > '2015-05-21' AND
genre_id IN (1,2,3,12,13,17,21) AND
tt.type_id IN (1,2,3,4)
GROUP BY s.id
Create a view
CREATE VIEW all_shows AS
SELECT s.id AS show_id, start_date, end_date
FROM show s, theater_type tt
WHERE s.theater = tt.theater_id AND image = 1 AND
genre_id IN (1,2,3,12,13,17,21) AND tt.type_id IN (1,2,3,4);
Then do Your select using this view
SELECT
AVG(pro_reviews.rating) AS avgr,
all_shows.show_id,
all_shows.start_date,
all_shows.end_date
FROM pro_reviews
INNER JOIN
all_shows ON (all_shows.show_id = pr.show_id)
WHERE all_shows.start_date > '2015-05-21'
GROUP BY pro_reviews.show_id;

MySQL select distinct rows and the latest one based on timestamp [duplicate]

There is a table messages that contains data as shown below:
Id Name Other_Columns
-------------------------
1 A A_data_1
2 A A_data_2
3 A A_data_3
4 B B_data_1
5 B B_data_2
6 C C_data_1
If I run a query select * from messages group by name, I will get the result as:
1 A A_data_1
4 B B_data_1
6 C C_data_1
What query will return the following result?
3 A A_data_3
5 B B_data_2
6 C C_data_1
That is, the last record in each group should be returned.
At present, this is the query that I use:
SELECT
*
FROM (SELECT
*
FROM messages
ORDER BY id DESC) AS x
GROUP BY name
But this looks highly inefficient. Any other ways to achieve the same result?
MySQL 8.0 now supports windowing functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:
WITH ranked_messages AS (
SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
FROM messages AS m
)
SELECT * FROM ranked_messages WHERE rn = 1;
This and other approaches to finding groupwise maximal rows are illustrated in the MySQL manual.
Below is the original answer I wrote for this question in 2009:
I write the solution this way:
SELECT m1.*
FROM messages m1 LEFT JOIN messages m2
ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL;
Regarding performance, one solution or the other can be better, depending on the nature of your data. So you should test both queries and use the one that is better at performance given your database.
For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.
I'll write a query to find the most recent post for a given user ID (mine).
First using the technique shown by #Eric with the GROUP BY in a subquery:
SELECT p1.postid
FROM Posts p1
INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
FROM Posts pi GROUP BY pi.owneruserid) p2
ON (p1.postid = p2.maxpostid)
WHERE p1.owneruserid = 20860;
1 row in set (1 min 17.89 sec)
Even the EXPLAIN analysis takes over 16 seconds:
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 76756 | |
| 1 | PRIMARY | p1 | eq_ref | PRIMARY,PostId,OwnerUserId | PRIMARY | 8 | p2.maxpostid | 1 | Using where |
| 2 | DERIVED | pi | index | NULL | OwnerUserId | 8 | NULL | 1151268 | Using index |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
3 rows in set (16.09 sec)
Now produce the same query result using my technique with LEFT JOIN:
SELECT p1.postid
FROM Posts p1 LEFT JOIN posts p2
ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
WHERE p2.postid IS NULL AND p1.owneruserid = 20860;
1 row in set (0.28 sec)
The EXPLAIN analysis shows that both tables are able to use their indexes:
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| 1 | SIMPLE | p1 | ref | OwnerUserId | OwnerUserId | 8 | const | 1384 | Using index |
| 1 | SIMPLE | p2 | ref | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8 | const | 1384 | Using where; Using index; Not exists |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
2 rows in set (0.00 sec)
Here's the DDL for my Posts table:
CREATE TABLE `posts` (
`PostId` bigint(20) unsigned NOT NULL auto_increment,
`PostTypeId` bigint(20) unsigned NOT NULL,
`AcceptedAnswerId` bigint(20) unsigned default NULL,
`ParentId` bigint(20) unsigned default NULL,
`CreationDate` datetime NOT NULL,
`Score` int(11) NOT NULL default '0',
`ViewCount` int(11) NOT NULL default '0',
`Body` text NOT NULL,
`OwnerUserId` bigint(20) unsigned NOT NULL,
`OwnerDisplayName` varchar(40) default NULL,
`LastEditorUserId` bigint(20) unsigned default NULL,
`LastEditDate` datetime default NULL,
`LastActivityDate` datetime default NULL,
`Title` varchar(250) NOT NULL default '',
`Tags` varchar(150) NOT NULL default '',
`AnswerCount` int(11) NOT NULL default '0',
`CommentCount` int(11) NOT NULL default '0',
`FavoriteCount` int(11) NOT NULL default '0',
`ClosedDate` datetime default NULL,
PRIMARY KEY (`PostId`),
UNIQUE KEY `PostId` (`PostId`),
KEY `PostTypeId` (`PostTypeId`),
KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
KEY `OwnerUserId` (`OwnerUserId`),
KEY `LastEditorUserId` (`LastEditorUserId`),
KEY `ParentId` (`ParentId`),
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
) ENGINE=InnoDB;
Note to commenters: If you want another benchmark with a different version of MySQL, a different dataset, or different table design, feel free to do it yourself. I have shown the technique above. Stack Overflow is here to show you how to do software development work, not to do all the work for you.
UPD: 2017-03-31, the version 5.7.5 of MySQL made the ONLY_FULL_GROUP_BY switch enabled by default (hence, non-deterministic GROUP BY queries became disabled). Moreover, they updated the GROUP BY implementation and the solution might not work as expected anymore even with the disabled switch. One needs to check.
Bill Karwin's solution above works fine when item count within groups is rather small, but the performance of the query becomes bad when the groups are rather large, since the solution requires about n*n/2 + n/2 of only IS NULL comparisons.
I made my tests on a InnoDB table of 18684446 rows with 1182 groups. The table contains testresults for functional tests and has the (test_id, request_id) as the primary key. Thus, test_id is a group and I was searching for the last request_id for each test_id.
Bill's solution has already been running for several hours on my dell e4310 and I do not know when it is going to finish even though it operates on a coverage index (hence using index in EXPLAIN).
I have a couple of other solutions that are based on the same ideas:
if the underlying index is BTREE index (which is usually the case), the largest (group_id, item_value) pair is the last value within each group_id, that is the first for each group_id if we walk through the index in descending order;
if we read the values which are covered by an index, the values are read in the order of the index;
each index implicitly contains primary key columns appended to that (that is the primary key is in the coverage index). In solutions below I operate directly on the primary key, in you case, you will just need to add primary key columns in the result.
in many cases it is much cheaper to collect the required row ids in the required order in a subquery and join the result of the subquery on the id. Since for each row in the subquery result MySQL will need a single fetch based on primary key, the subquery will be put first in the join and the rows will be output in the order of the ids in the subquery (if we omit explicit ORDER BY for the join)
3 ways MySQL uses indexes is a great article to understand some details.
Solution 1
This one is incredibly fast, it takes about 0,8 secs on my 18M+ rows:
SELECT test_id, MAX(request_id) AS request_id
FROM testresults
GROUP BY test_id DESC;
If you want to change the order to ASC, put it in a subquery, return the ids only and use that as the subquery to join to the rest of the columns:
SELECT test_id, request_id
FROM (
SELECT test_id, MAX(request_id) AS request_id
FROM testresults
GROUP BY test_id DESC) as ids
ORDER BY test_id;
This one takes about 1,2 secs on my data.
Solution 2
Here is another solution that takes about 19 seconds for my table:
SELECT test_id, request_id
FROM testresults, (SELECT #group:=NULL) as init
WHERE IF(IFNULL(#group, -1)=#group:=test_id, 0, 1)
ORDER BY test_id DESC, request_id DESC
It returns tests in descending order as well. It is much slower since it does a full index scan but it is here to give you an idea how to output N max rows for each group.
The disadvantage of the query is that its result cannot be cached by the query cache.
Use your subquery to return the correct grouping, because you're halfway there.
Try this:
select
a.*
from
messages a
inner join
(select name, max(id) as maxid from messages group by name) as b on
a.id = b.maxid
If it's not id you want the max of:
select
a.*
from
messages a
inner join
(select name, max(other_col) as other_col
from messages group by name) as b on
a.name = b.name
and a.other_col = b.other_col
This way, you avoid correlated subqueries and/or ordering in your subqueries, which tend to be very slow/inefficient.
I arrived at a different solution, which is to get the IDs for the last post within each group, then select from the messages table using the result from the first query as the argument for a WHERE x IN construct:
SELECT id, name, other_columns
FROM messages
WHERE id IN (
SELECT MAX(id)
FROM messages
GROUP BY name
);
I don't know how this performs compared to some of the other solutions, but it worked spectacularly for my table with 3+ million rows. (4 second execution with 1200+ results)
This should work both on MySQL and SQL Server.
Solution by sub query fiddle Link
select * from messages where id in
(select max(id) from messages group by Name)
Solution By join condition fiddle link
select m1.* from messages m1
left outer join messages m2
on ( m1.id<m2.id and m1.name=m2.name )
where m2.id is null
Reason for this post is to give fiddle link only.
Same SQL is already provided in other answers.
An approach with considerable speed is as follows.
SELECT *
FROM messages a
WHERE Id = (SELECT MAX(Id) FROM messages WHERE a.Name = Name)
Result
Id Name Other_Columns
3 A A_data_3
5 B B_data_2
6 C C_data_1
We will look at how you can use MySQL at getting the last record in a Group By of records. For example if you have this result set of posts.
id
category_id
post_title
1
1
Title 1
2
1
Title 2
3
1
Title 3
4
2
Title 4
5
2
Title 5
6
3
Title 6
I want to be able to get the last post in each category which are Title 3, Title 5 and Title 6. To get the posts by the category you will use the MySQL Group By keyboard.
select * from posts group by category_id
But the results we get back from this query is.
id
category_id
post_title
1
1
Title 1
4
2
Title 4
6
3
Title 6
The group by will always return the first record in the group on the result set.
SELECT id, category_id, post_title
FROM posts
WHERE id IN (
SELECT MAX(id)
FROM posts
GROUP BY category_id );
This will return the posts with the highest IDs in each group.
id
category_id
post_title
3
1
Title 3
5
2
Title 5
6
3
Title 6
Reference Click Here
Here are two suggestions. First, if mysql supports ROW_NUMBER(), it's very simple:
WITH Ranked AS (
SELECT Id, Name, OtherColumns,
ROW_NUMBER() OVER (
PARTITION BY Name
ORDER BY Id DESC
) AS rk
FROM messages
)
SELECT Id, Name, OtherColumns
FROM messages
WHERE rk = 1;
I'm assuming by "last" you mean last in Id order. If not, change the ORDER BY clause of the ROW_NUMBER() window accordingly. If ROW_NUMBER() isn't available, this is another solution:
Second, if it doesn't, this is often a good way to proceed:
SELECT
Id, Name, OtherColumns
FROM messages
WHERE NOT EXISTS (
SELECT * FROM messages as M2
WHERE M2.Name = messages.Name
AND M2.Id > messages.Id
)
In other words, select messages where there is no later-Id message with the same Name.
Clearly there are lots of different ways of getting the same results, your question seems to be what is an efficient way of getting the last results in each group in MySQL. If you are working with huge amounts of data and assuming you are using InnoDB with even the latest versions of MySQL (such as 5.7.21 and 8.0.4-rc) then there might not be an efficient way of doing this.
We sometimes need to do this with tables with even more than 60 million rows.
For these examples I will use data with only about 1.5 million rows where the queries would need to find results for all groups in the data. In our actual cases we would often need to return back data from about 2,000 groups (which hypothetically would not require examining very much of the data).
I will use the following tables:
CREATE TABLE temperature(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
groupID INT UNSIGNED NOT NULL,
recordedTimestamp TIMESTAMP NOT NULL,
recordedValue INT NOT NULL,
INDEX groupIndex(groupID, recordedTimestamp),
PRIMARY KEY (id)
);
CREATE TEMPORARY TABLE selected_group(id INT UNSIGNED NOT NULL, PRIMARY KEY(id));
The temperature table is populated with about 1.5 million random records, and with 100 different groups.
The selected_group is populated with those 100 groups (in our cases this would normally be less than 20% for all of the groups).
As this data is random it means that multiple rows can have the same recordedTimestamps. What we want is to get a list of all of the selected groups in order of groupID with the last recordedTimestamp for each group, and if the same group has more than one matching row like that then the last matching id of those rows.
If hypothetically MySQL had a last() function which returned values from the last row in a special ORDER BY clause then we could simply do:
SELECT
last(t1.id) AS id,
t1.groupID,
last(t1.recordedTimestamp) AS recordedTimestamp,
last(t1.recordedValue) AS recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
ORDER BY t1.recordedTimestamp, t1.id
GROUP BY t1.groupID;
which would only need to examine a few 100 rows in this case as it doesn't use any of the normal GROUP BY functions. This would execute in 0 seconds and hence be highly efficient.
Note that normally in MySQL we would see an ORDER BY clause following the GROUP BY clause however this ORDER BY clause is used to determine the ORDER for the last() function, if it was after the GROUP BY then it would be ordering the GROUPS. If no GROUP BY clause is present then the last values will be the same in all of the returned rows.
However MySQL does not have this so let's look at different ideas of what it does have and prove that none of these are efficient.
Example 1
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT t2.id
FROM temperature t2
WHERE t2.groupID = g.id
ORDER BY t2.recordedTimestamp DESC, t2.id DESC
LIMIT 1
);
This examined 3,009,254 rows and took ~0.859 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 2
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
INNER JOIN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
) t5 ON t5.id = t1.id;
This examined 1,505,331 rows and took ~1.25 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 3
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
WHERE t1.id IN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
)
ORDER BY t1.groupID;
This examined 3,009,685 rows and took ~1.95 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 4
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT max(t2.id)
FROM temperature t2
WHERE t2.groupID = g.id AND t2.recordedTimestamp = (
SELECT max(t3.recordedTimestamp)
FROM temperature t3
WHERE t3.groupID = g.id
)
);
This examined 6,137,810 rows and took ~2.2 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 5
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
t2.id,
t2.groupID,
t2.recordedTimestamp,
t2.recordedValue,
row_number() OVER (
PARTITION BY t2.groupID ORDER BY t2.recordedTimestamp DESC, t2.id DESC
) AS rowNumber
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
) t1 WHERE t1.rowNumber = 1;
This examined 6,017,808 rows and took ~4.2 seconds on 8.0.4-rc
Example 6
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
last_value(t2.id) OVER w AS id,
t2.groupID,
last_value(t2.recordedTimestamp) OVER w AS recordedTimestamp,
last_value(t2.recordedValue) OVER w AS recordedValue
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
WINDOW w AS (
PARTITION BY t2.groupID
ORDER BY t2.recordedTimestamp, t2.id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
) t1
GROUP BY t1.groupID;
This examined 6,017,908 rows and took ~17.5 seconds on 8.0.4-rc
Example 7
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
LEFT JOIN temperature t2
ON t2.groupID = g.id
AND (
t2.recordedTimestamp > t1.recordedTimestamp
OR (t2.recordedTimestamp = t1.recordedTimestamp AND t2.id > t1.id)
)
WHERE t2.id IS NULL
ORDER BY t1.groupID;
This one was taking forever so I had to kill it.
Here is another way to get the last related record using GROUP_CONCAT with order by and SUBSTRING_INDEX to pick one of the record from the list
SELECT
`Id`,
`Name`,
SUBSTRING_INDEX(
GROUP_CONCAT(
`Other_Columns`
ORDER BY `Id` DESC
SEPARATOR '||'
),
'||',
1
) Other_Columns
FROM
messages
GROUP BY `Name`
Above query will group the all the Other_Columns that are in same Name group and using ORDER BY id DESC will join all the Other_Columns in a specific group in descending order with the provided separator in my case i have used || ,using SUBSTRING_INDEX over this list will pick the first one
Fiddle Demo
Hi #Vijay Dev if your table messages contains Id which is auto increment primary key then to fetch the latest record basis on the primary key your query should read as below:
SELECT m1.* FROM messages m1 INNER JOIN (SELECT max(Id) as lastmsgId FROM messages GROUP BY Name) m2 ON m1.Id=m2.lastmsgId
I've not yet tested with large DB but I think this could be faster than joining tables:
SELECT *, Max(Id) FROM messages GROUP BY Name
SELECT
column1,
column2
FROM
table_name
WHERE id IN
(SELECT
MAX(id)
FROM
table_name
GROUP BY column1)
ORDER BY column1 ;
You can take view from here as well.
http://sqlfiddle.com/#!9/ef42b/9
FIRST SOLUTION
SELECT d1.ID,Name,City FROM Demo_User d1
INNER JOIN
(SELECT MAX(ID) AS ID FROM Demo_User GROUP By NAME) AS P ON (d1.ID=P.ID);
SECOND SOLUTION
SELECT * FROM (SELECT * FROM Demo_User ORDER BY ID DESC) AS T GROUP BY NAME ;
If you need the most recent or oldest record of a text column in a grouped query, and you would rather not use a subquery, you can do this...
Ex. You have a list of movies and need to get the count in the series and the latest movie
id
series
name
1
Star Wars
A New hope
2
Star Wars
The Empire Strikes Back
3
Star Wars
Return of The Jedi
SELECT COUNT(id), series, SUBSTRING(MAX(CONCAT(id, name)), LENGTH(id) + 1),
FROM Movies
GROUP BY series
This returns...
id
series
name
3
Star Wars
Return of The Jedi
MAX will return the row with the highest value, so by concatenating the id to the name, you now will get the newest record, then just strip off the id for your final result.
More efficient than using a subquery.
So for the given example:
SELECT MAX(Id), Name, SUBSTRING(MAX(CONCAT(Id, Other_Columns)), LENGTH(Id) + 1),
FROM messages
GROUP BY Name
Happy coding, and "May The Force Be With You" :)
Try this:
SELECT jos_categories.title AS name,
joined .catid,
joined .title,
joined .introtext
FROM jos_categories
INNER JOIN (SELECT *
FROM (SELECT `title`,
catid,
`created`,
introtext
FROM `jos_content`
WHERE `sectionid` = 6
ORDER BY `id` DESC) AS yes
GROUP BY `yes`.`catid` DESC
ORDER BY `yes`.`created` DESC) AS joined
ON( joined.catid = jos_categories.id )
Here is my solution:
SELECT
DISTINCT NAME,
MAX(MESSAGES) OVER(PARTITION BY NAME) MESSAGES
FROM MESSAGE;
SELECT * FROM table_name WHERE primary_key IN (SELECT MAX(primary_key) FROM table_name GROUP BY column_name )
**
Hi, this query might help :
**
SELECT
*
FROM
message
WHERE
`Id` IN (
SELECT
MAX(`Id`)
FROM
message
GROUP BY
`Name`
)
ORDER BY
`Id` DESC
i find best solution in https://dzone.com/articles/get-last-record-in-each-mysql-group
select * from `data` where `id` in (select max(`id`) from `data` group by `name_id`)
The below query will work fine as per your question.
SELECT M1.*
FROM MESSAGES M1,
(
SELECT SUBSTR(Others_data,1,2),MAX(Others_data) AS Max_Others_data
FROM MESSAGES
GROUP BY 1
) M2
WHERE M1.Others_data = M2.Max_Others_data
ORDER BY Others_data;
If you want the last row for each Name, then you can give a row number to each row group by the Name and order by Id in descending order.
QUERY
SELECT t1.Id,
t1.Name,
t1.Other_Columns
FROM
(
SELECT Id,
Name,
Other_Columns,
(
CASE Name WHEN #curA
THEN #curRow := #curRow + 1
ELSE #curRow := 1 AND #curA := Name END
) + 1 AS rn
FROM messages t,
(SELECT #curRow := 0, #curA := '') r
ORDER BY Name,Id DESC
)t1
WHERE t1.rn = 1
ORDER BY t1.Id;
SQL Fiddle
If performance is really your concern you can introduce a new column on the table called IsLastInGroup of type BIT.
Set it to true on the columns which are last and maintain it with every row insert/update/delete. Writes will be slower, but you'll benefit on reads. It depends on your use case and I recommend it only if you're read-focused.
So your query will look like:
SELECT * FROM Messages WHERE IsLastInGroup = 1
MariaDB 10.3 and newer using GROUP_CONCAT.
The idea is to use ORDER BY + LIMIT:
SELECT GROUP_CONCAT(id ORDER BY id DESC LIMIT 1) AS id,
name,
GROUP_CONCAT(Other_columns ORDER BY id DESC LIMIT 1) AS Other_columns
FROM t
GROUP BY name;
db<>fiddle demo
How about this:
SELECT DISTINCT ON (name) *
FROM messages
ORDER BY name, id DESC;
I had similar issue (on postgresql tough) and on a 1M records table. This solution takes 1.7s vs 44s produced by the one with LEFT JOIN.
In my case I had to filter the corrispondant of your name field against NULL values, resulting in even better performances by 0.2 secs
Yet another option without subqueries.
This solution uses MySQL LAST_VALUE window function, exploiting Window Function Frame available MySQL tool from .
SELECT DISTINCT
LAST_VALUE(Id)
OVER(PARTITION BY Name
ORDER BY Id
ROWS BETWEEN 0 PRECEDING
AND UNBOUNDED FOLLOWING),
Name,
LAST_VALUE(Other_Columns)
OVER(PARTITION BY Name
ORDER BY Id
ROWS BETWEEN 0 PRECEDING
AND UNBOUNDED FOLLOWING)
FROM
tab
Try it here.
Hope below Oracle query can help:
WITH Temp_table AS
(
Select id, name, othercolumns, ROW_NUMBER() over (PARTITION BY name ORDER BY ID
desc)as rank from messages
)
Select id, name,othercolumns from Temp_table where rank=1
Another approach :
Find the propertie with the max m2_price withing each program (n properties in 1 program) :
select * from properties p
join (
select max(m2_price) as max_price
from properties
group by program_id
) p2 on (p.program_id = p2.program_id)
having p.m2_price = max_price
What about:
select *, max(id) from messages group by name
I have tested it on sqlite and it returns all columns and max id value for all names.
As of MySQL 8.0.14, this can also be achieved using Lateral Derived Tables:
SELECT t.*
FROM messages t
JOIN LATERAL (
SELECT name, MAX(id) AS id
FROM messages t1
WHERE t.name = t1.name
GROUP BY name
) trn ON t.name = trn.name AND t.id = trn.id
db<>fiddle

mysql multiple COUNT() from multiple tables with LEFT JOIN

I want to show the conclusion of all users.
I have 3 tables.
table post
post_id(index) user_id
1 1
2 3
3 3
4 4
table photo
photo_id(index) user_id
1 2
2 4
3 1
4 1
table video
photo_id(index) user_id
1 4
2 4
3 3
4 3
and in table user
user_id(index) user_name
1 mark
2 tommy
3 john
4 james
in fact, it has more than 4 rows for every tables.
I want the result like this.
id name post photo videos
1 mark 1 2 0
2 tommy 0 1 0
3 john 2 0 2
4 james 1 1 2
5 .. .. .. ..
Code below is SQL that can work correctly but very slow, I will be true appreciated if you help me how it using LEFT JOIN for it. Thanks.
SQL
"select user.*,
(select count(*) from post where post.userid = user.userid) postCount,
(select count(*) from photo where photo.userid = user.userid) photoCount,
(select count(*) from video where video .userid = user.userid) videoCount
from user order by user.id"
(or ORDER BY postCount, photoCount or videoCount ASC or DESC as i want )
I done researched before but no any helped me.
SELECT u.user_id,
u.user_name,
COUNT(DISTINCT p.post_id) AS `postCount`,
COUNT(DISTINCT ph.photo_id) AS `photoCount`,
COUNT(DISTINCT v.video_id) AS `videoCount`
FROM user u
LEFT JOIN post p
ON p.user_id = u.user_id
LEFT JOIN photo ph
ON ph.user_id = u.user_id
LEFT JOIN video v
ON v.user_id = u.user_id
GROUP BY u.user_id
ORDER BY postCount;
Live DEMO
Your method of doing this is quite reasonable. Here is your query:
select user.*,
(select count(*) from post where post.userid = user.userid) as postCount,
(select count(*) from photo where photo.userid = user.userid) as photoCount,
(select count(*) from video where video.userid = user.userid) as videoCount
from user
order by user.id;
For this query, you want the following indexes:
post(userid)
photo(userid)
video(userid)
user(id)
You probably already have the last one, because user.id is probably the primary key of the table.
Note that a left join approach is a bad idea in this case. The three tables -- posts, photos, and videos -- are independent of each other. If a user has five of each, then joining them together would produce 125 intermediate rows. If a user has fifty of each, it would be 125,000 -- a lot of extra processing.
Your answer is probably slow as it is using a correlated sub-query i.e. the sub query is running once for each user_id (unless the optimizer is doing something smart - which shouldn't be counted on).
You could use a left outer join and count or use something temporary like:
SELECT u.user_id,
u.user_name,
ph.user_count AS 'photoCount',
p.user_count AS 'postCount',
v.user_count AS 'videoCount'
FROM user u
INNER JOIN ( SELECT user_id,
COUNT(*) AS user_count
FROM photo
GROUP BY user_id
) ph
ON ph.user_id=u.user_id
INNER JOIN ( SELECT user_id,
COUNT(*) AS user_count
FROM post
GROUP BY user_id
) p
ON p.user_id=u.user_id
INNER JOIN ( SELECT user_id,
COUNT(*) AS user_count
FROM video
GROUP BY user_id
) v
ON v.user_id=u.user_id
There are pros and cons for both (depending on indexes). Always have a look at the query plan (using EXPLAIN for MySQL).

Mysql left-join, count by unique value in a field not working

I have a table like below.
I want to be able to get a count of group_id then group by group_id then left join groups table where groups_user.group_id = groups.id but I'm only getting one result back.
I want group by unique group_id then count of each duplicate group_id. So far my query is like below:
SELECT 'groups.group' ,COUNT('groups_users.group_id') as groups_count
FROM `groups_users`
LEFT JOIN groups
ON 'groups_users.group_id' = 'groups.id'
GROUP BY 'groups_users.group_id'
Table:
id group_id user_id
26 3 1
22 2 1
19 1 1
20 1 2
21 1 4
Where am I getting wrong here?
I think you've got the JOIN backwards. Try:
SELECT g.group, COUNT(gu.group_id) AS groups_count
FROM groups g
LEFT JOIN groups_users gu
ON g.id = gu.group_id
GROUP BY g.group;

Categories