MySQL GROUP BY ignoring ORDER BY clause - php

I have two queries, the only difference being the GROUP BY clause
SELECT * FROM `packages_sorted_YHZ` WHERE `hotel_city` = 'Montego Bay'
ORDER BY `deal_score` DESC
LIMIT 0,3;
SELECT * FROM `packages_sorted_YHZ` WHERE `hotel_city` = 'Montego Bay'
GROUP BY `hotel_name`
ORDER BY `deal_score` DESC
LIMIT 0,3;
The first query returns the first result with a deal_score of 75 but the second query returns the first result with the deal_score of just 72.
I would have thought that regardless of the GROUP BY clause, the first result would have the highest deal score possible (75)
The purpose of the GROUP BY clause is to optionally select a unique hotel_name for each result.
Does anyone know what I'm doing wrong here?

Without being able to look at all the data, my best guess is that Group By is merging the data and giving you an arbitrary value that matches the Where clause. This will happen if hotel name isn't unique, and you won't be given the maximum score unless you specifically query for it.
Try putting a Max() around deal_score. In MySQL, GROUP BY can be used far too easily; I like how MSSQL enforces the use of aggregate functions and requires grouping by every field that isn't aggregated. Try this query:
SELECT `hotel_name`, MAX( `deal_score` ) AS `max_score` FROM `packages_sorted_YHZ` WHERE `hotel_city` = 'Montego Bay'
GROUP BY `hotel_name`
ORDER BY `max_score` DESC
LIMIT 0,3;
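To see the aggregate-then-sort behaviour in isolation, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. The table name `packages`, the hotel names, and the scores are invented for illustration; only the shape of the query comes from the answer above.

```python
import sqlite3

# In-memory database with made-up sample data (not the asker's real table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages (hotel_name TEXT, hotel_city TEXT, deal_score INTEGER)"
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("Hotel A", "Montego Bay", 72),
        ("Hotel A", "Montego Bay", 60),
        ("Hotel B", "Montego Bay", 75),
        ("Hotel B", "Montego Bay", 50),
        ("Hotel C", "Montego Bay", 40),
        ("Hotel D", "Negril", 99),
    ],
)

# One row per hotel, carrying that hotel's best score, best hotels first.
top_hotels = conn.execute(
    """
    SELECT hotel_name, MAX(deal_score) AS max_score
    FROM packages
    WHERE hotel_city = 'Montego Bay'
    GROUP BY hotel_name
    ORDER BY max_score DESC
    LIMIT 3
    """
).fetchall()
print(top_hotels)
```

Because the ORDER BY refers to the aggregate, the sort is applied to each hotel's maximum rather than to whichever row the engine happened to keep for the group.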

It looks like you are facing a very MySQL-specific issue. In theory, your second query is not valid and should return an error. But MySQL allows selection of so-called hidden columns: columns that are neither mentioned in the GROUP BY clause nor aggregated.
As stated in the manual, the values of hidden columns are indeterminate, but in practice MySQL usually picks the first row it encounters while walking the index used, regardless of the sorting specified by ORDER BY, because sorting is performed after grouping.
This is a vendor-specific issue, so your second query should fail on most other RDBMSs. The correct implementation should be something like
SELECT max(`deal_score`) as maxdeal, `hotel_name` FROM `packages_sorted_YHZ` WHERE `hotel_city` = 'Montego Bay'
GROUP BY `hotel_name`
ORDER BY maxdeal DESC
LIMIT 0,3;

You should not use GROUP BY but instead DISTINCT since you want a unique hotel_name.
example:
SELECT DISTINCT hotel_name -- add other fields here
FROM `packages_sorted_YHZ`
WHERE `hotel_city` = 'Montego Bay'
ORDER BY `deal_score` DESC
LIMIT 0,3;

SELECT max(`deal_score`) AS maxdealscore, `hotel_name` FROM `packages_sorted_YHZ` WHERE `hotel_city` = 'Montego Bay'
GROUP BY `hotel_name`
ORDER BY maxdealscore DESC
LIMIT 0,3;

Related

MySQL - ORDER BY in IN clause

I have a simple question.
I have to join multiple queries, and most of them have an IN clause.
Unfortunately, MySQL doesn't allow ORDER BY inside a UNION (only outside all the unions), but I need a specific ordering that I can't get with an outer ORDER BY.
The question is:
If I have an IN clause like
foo in ('newyork','boston','atlanta')
may I assume MySQL's engine will order the resulting rows by IN position (so first all the rows with foo = 'newyork', then 'boston', etc.)?
Thanks.
No. It's nothing like that. You need to explicitly add an order by clause.
If you want to order your rows based on position in a set, you can use field function.
order by field(foo,'newyork','boston','atlanta')
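FIELD() is a MySQL function, so it cannot be run outside MySQL. As a rough, portable illustration of the same idea (an explicit ORDER BY that encodes the desired position), the sketch below swaps in a CASE expression and runs it through Python's standard-library sqlite3 module. The table name `places` and its rows are hypothetical.

```python
import sqlite3

# Made-up table; rows are inserted deliberately out of the desired order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (foo TEXT)")
conn.executemany(
    "INSERT INTO places VALUES (?)",
    [("atlanta",), ("newyork",), ("boston",), ("newyork",)],
)

# The CASE expression maps each value to its position in the wanted order,
# which is what FIELD(foo, 'newyork', 'boston', 'atlanta') does in MySQL.
rows = conn.execute(
    """
    SELECT foo
    FROM places
    WHERE foo IN ('newyork', 'boston', 'atlanta')
    ORDER BY CASE foo
        WHEN 'newyork' THEN 1
        WHEN 'boston' THEN 2
        WHEN 'atlanta' THEN 3
    END
    """
).fetchall()
ordered = [row[0] for row in rows]
print(ordered)
```

The IN list only filters rows; the ordering comes entirely from the ORDER BY expression.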
Select *
from
(
Select * from Temp1
order by x
) Tbl1
UNION
Select *
from
(
Select * from Temp2
order by y
) Tbl2
Try this.

How can I get this database to order before the GROUP BY [duplicate]

I made a website for golf scorecards. The page I am working on is the player's profile. When you access a player's profile, it shows each course in order of last played (DESC). However, the order of last played is jumbled by the ORDER BY command below: when it GROUPs, it takes the earliest date rather than the most recent.
After the grouping is done, it correctly shows them in descending order, but it is the wrong order because the courses are grouped by date_of_game ASC rather than DESC. Hope this isn't too confusing. Thank you.
$query_patrol321 = "SELECT t1.*,t2.* FROM games t1 LEFT JOIN scorecards t2 ON t1.game_id=t2.game_id WHERE t2.player_id='$player_id' GROUP BY t1.course_id ORDER BY t1.date_of_game DESC";
$result_patrol321 = mysql_query($query_patrol321) or die ("<br /><br />There's an error in the MySQL-query: ".mysql_error());
while ($row_patrol321 = mysql_fetch_array($result_patrol321)) {
$player_id_rank = $row_patrol321["player_id"];
$course_id = $row_patrol321["course_id"];
$game_id = $row_patrol321["game_id"];
$top_score = $row_patrol321["total_score"];
Try to remove the GROUP BY-clause from the query. You should use GROUP BY only when you have both normal columns and aggregate functions (min, max, sum, avg, count) in your SELECT. You have just normal columns.
The fact that it shows the grouping result in ASC order is a coincidence because that is the order of their insertion. In contrast to other RDBMS like MS SQL Server, MySQL allows you to add non-aggregated columns to a GROUPed query. This non-standard behavior creates the confusion you're seeing. If this were not MySQL, you'd need to define the aggregation for all your selected columns given the grouping.
MySQL's behavior is (I believe) to take the first row matching the GROUP for non-aggregated columns. I would advise against doing this.
Even though you're aggregating, you're not ORDERing by the aggregated column.
So what you want to do is ORDER BY the MAX date DESC.
In this way, you are ordering by the latest date per course (your grouping criteria).
SELECT
t1.* -- It would be better if you actually listed the aggregations you wanted
,t2.* -- Which columns do you really want?
FROM
games t1
LEFT JOIN
scorecards t2
ON t2.game_id = t1.game_id
WHERE
t2.player_id = '$player_id'
GROUP BY
t1.course_id
ORDER BY
MAX(t1.date_of_game) DESC
If you want the maximum date, then insert logic to get it. Don't depend on the ordering of columns or on undocumented MySQL features. MySQL explicitly discourages the use of non-aggregated columns in the group by when the values are not identical:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. (See the MySQL manual.)
How do you do what you want? The following query finds the most recent date on each course and just uses that -- and no group by:
SELECT t1.*, t2.*
FROM games t1 LEFT JOIN
scorecards t2
ON t1.game_id = t2.game_id
WHERE t2.player_id = '$player_id' AND
t1.date_of_game IN (SELECT MAX(date_of_game)
FROM games g JOIN
scorecards ss
ON g.game_id = ss.game_id AND
ss.player_id = '$player_id'
WHERE t1.course_id = g.course_id
)
ORDER BY t1.date_of_game DESC
If game_id is auto incrementing, you can use that instead of date_of_game. This is particularly important if two games can be on the same course on the same date.
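A runnable sketch of the correlated-subquery approach, using Python's standard-library sqlite3 module in place of MySQL. The schema is reduced to the columns named in the question, and all rows are invented. The player id is passed as a bound parameter rather than interpolated into the string, which is an assumption about how you would want to call it, not something taken from the original code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE games (game_id INTEGER PRIMARY KEY, course_id INTEGER, date_of_game TEXT);
    CREATE TABLE scorecards (game_id INTEGER, player_id INTEGER, total_score INTEGER);
    INSERT INTO games VALUES (1, 10, '2013-05-01'), (2, 10, '2013-07-15'),
                             (3, 20, '2013-06-10'), (4, 20, '2013-04-02');
    INSERT INTO scorecards VALUES (1, 7, 90), (2, 7, 85), (3, 7, 88), (4, 7, 95), (2, 8, 70);
    """
)

player_id = 7  # hypothetical player

# Keep only the game whose date equals the player's latest date on that course.
latest = conn.execute(
    """
    SELECT g.course_id, g.date_of_game, s.total_score
    FROM games g
    JOIN scorecards s ON s.game_id = g.game_id
    WHERE s.player_id = ?
      AND g.date_of_game = (
          SELECT MAX(g2.date_of_game)
          FROM games g2
          JOIN scorecards s2 ON s2.game_id = g2.game_id
          WHERE s2.player_id = s.player_id
            AND g2.course_id = g.course_id
      )
    ORDER BY g.date_of_game DESC
    """,
    (player_id,),
).fetchall()
print(latest)
```

If two games by the same player fall on the same course and date, this returns both; comparing on an auto-incrementing game_id instead, as suggested above, avoids that.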

Group by and get the last row

When I used the
SELECT * FROM reports WHERE staff='$username' GROUP BY taskid
on my query, I get a result of the first row from that group.
What do I need to add to get the result from the last row of that group?
last row means having id greater than the other row from that group.
I tried adding
ORDER BY id DESC
or
ORDER BY id
but it did not return the intended result.
You are using a group by function without any aggregate functions. You probably want to do an order by instead (No group by in the query):
SELECT * FROM reports WHERE staff='$username' order BY taskid desc;
Group By functions are commonly used when you want to use an aggregate function on a particular column (such as get an average row value, or a sum) and the like. If you are not using any aggregate function, then using Group By will not do anything.
If you only want to get one row from the query you can add a limit clause like this:
SELECT * FROM reports WHERE staff='$username' order BY taskid desc limit 1;
if you want the "last" row per group, then you need to specify which field defines uniqueness (i.e. what you mean by "first/last row") and then isolate those rows in a subquery.
e.g.
this gets you the max(id) for each group
SELECT taskid, max(id) as max_id FROM reports
WHERE staff ='$username'
GROUP BY taskid
and this gets the entire row(s):
select * from reports where id
in
(
SELECT max(id) as max_id FROM reports
WHERE staff='$username'
GROUP BY taskid
)
This of course assumes that id is unique and assigned in ascending order and that therefore max(id) indicates the last row per group.
Alternatively you could rewrite this using a join:
select * from reports r
inner join
(
SELECT max(id) as max_id FROM reports
WHERE staff='$username'
GROUP BY taskid
) x
on r.id = x.max_id
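The join variant can be tried end to end with the following sketch, which uses Python's standard-library sqlite3 module rather than MySQL. The `note` column and all rows are hypothetical, added only so there is something visible to distinguish the rows within a group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports (id INTEGER PRIMARY KEY, taskid INTEGER, staff TEXT, note TEXT)"
)
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?, ?)",
    [
        (1, 100, "alice", "first draft"),
        (2, 100, "alice", "second draft"),
        (3, 200, "alice", "started"),
        (4, 100, "bob", "review"),
        (5, 200, "alice", "finished"),
    ],
)

# The derived table finds the highest id per task for one staff member;
# joining back on id retrieves the complete row for each of those ids.
last_rows = conn.execute(
    """
    SELECT r.id, r.taskid, r.note
    FROM reports r
    INNER JOIN (
        SELECT MAX(id) AS max_id
        FROM reports
        WHERE staff = ?
        GROUP BY taskid
    ) x ON r.id = x.max_id
    ORDER BY r.taskid
    """,
    ("alice",),
).fetchall()
print(last_rows)
```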

grouping by time in mysql

I have got this query:
SELECT * FROM `mfw_navnode` n1
WHERE (n1.node_name='Eby' OR n1.node_name='Lana' OR n1.node_name='MF' OR n1.node_name='Amur' OR n1.node_name='Asin' )
GROUP BY n1.node_name
It has another field called time_added, which is the time each row was added. I want to select the node_name values ordered by the latest time I added them.
How do I do it? Each node_name has a time, but I want to group each node_name by the latest time it was added.
I thought of using MAX(time_added) and adding it to the GROUP BY, but that gave me an error.
What could be a possible solution?
If I understand the question, the following may be what you are looking for. It isn't quite clear to me if you want them ordered with the latest times first or last (add DESC as needed):
SELECT `node_name`, max(`time_added`) `maxtime` FROM `mfw_navnode` n1 ...
GROUP BY node_name order by `maxtime`;
One thing to note is that the query in the OP, which includes fields not in the GROUP BY clause is ambiguous and is not allowed in most SQL implementations.
SELECT * FROM `mfw_navnode` n1
WHERE (n1.node_name='Eby' OR n1.node_name='Lana' OR n1.node_name='MF' OR n1.node_name='Amur' OR n1.node_name='Asin' )
GROUP BY n1.node_name
ORDER BY n1.time_added DESC
Order by DESC automatically shows most recently created first.

SQL order by, group by, having

I'm using a database to store results of an election with the columns id, candidate, post_time and result. Results are put in the database during 'counting the votes'. When a new update is available, a new entry will be inserted.
From this database, I would like to create a table with the most recent results (MAX post_time) per candidate (GROUP BY candidate), ordered by result (ORDER BY result).
How can I translate this to a working SQL-statement?
(I've tried mysql order and groupby without success)
I've tried:
SELECT *, MAX(time_post)
FROM [database]
GROUP BY candidate
HAVING MAX(time_post) = time_post
ORDER BY result
Assuming that you don't have multiple results per candidate at the same time, the following should work:
select r.candidate, r.result
from results r
inner join (
select candidate, max(post_time) as ptime
from results
group by candidate
) r2 on r2.candidate=r.candidate and r2.ptime=r.post_time
order by r.result
Note that MAX will not select the record with the maximum time, but it will select the maximum value from any record. So
SELECT MAX(a), MAX(b) FROM example
where example contains the two records a=1, b=2 and a=4, b=0, will result in a=4, b=2, which wasn't in the data. You should probably create a view with the latest votes only from each candidate, then query that. For performance, it may be sensible to use a materialized view.
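The two-record case described above is small enough to check directly. This sketch uses Python's standard-library sqlite3 module; the table and values are exactly the ones from the explanation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO example VALUES (?, ?)", [(1, 2), (4, 0)])

# Each MAX() is computed independently over its own column.
maxima = conn.execute("SELECT MAX(a), MAX(b) FROM example").fetchone()
stored_rows = conn.execute("SELECT a, b FROM example").fetchall()

print(maxima)                 # the combined maxima
print(maxima in stored_rows)  # whether that pair exists as a real row
```

The pair of maxima mixes values from different records, which is why selecting "the row with the latest time" needs a join or subquery rather than MAX alone.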
Is the post_time likely to be the same for all the most recent results? Also does each candidate only appear once per post_time?
This could be achieved by just using a SELECT statement. Is there a reason you need the results in a new table?
If each candidate only appears once per post_time:
SELECT candidate, result
FROM table
WHERE post_time = (SELECT MAX(post_time) FROM table)
If you want to count how many times a candidate appears in the table for the last post_time:
SELECT candidate, count(result) as ResultCount
FROM table
WHERE post_time = (SELECT MAX(post_time) FROM table)
GROUP BY candidate
From what I see of your attempts, I think you should use this
SELECT MAX(post_time) FROM `table` GROUP BY candidate ORDER BY result
but the MAX function only returns a single value, so I don't see why ORDER BY would be needed.
If you want multiple results, try looking up the TOP statement.
One way (tied results shown):
SELECT t.*
FROM tableX AS t
JOIN
( SELECT candidate
, MAX(time_post) AS time_post
FROM tableX
GROUP BY candidate
) AS m
ON (m.candidate, m.time_post) = (t.candidate, t.time_post)
ORDER BY t.result
and another one (no ties, only one row per candidate shown):
SELECT t.*
FROM
( SELECT DISTINCT candidate
FROM tableX
) AS d
JOIN
tableX AS t
ON t.PK = -- the Primary Key of the table, here
( SELECT ti.PK -- and here
FROM tableX AS ti
WHERE ti.candidate = d.candidate
ORDER BY ti.time_post DESC
LIMIT 1
)
ORDER BY t.result
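For a quick check of the "join against the per-candidate maximum" pattern that several answers here use, the following sketch runs it with Python's standard-library sqlite3 module instead of MySQL. The table name `results`, the candidate names, and the vote counts are all invented; the column names follow the question (id, candidate, post_time, result).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, candidate TEXT, post_time TEXT, result INTEGER)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        (1, "Adams", "2012-03-01 20:00", 100),
        (2, "Baker", "2012-03-01 20:00", 150),
        (3, "Adams", "2012-03-01 21:00", 400),
        (4, "Baker", "2012-03-01 21:30", 300),
        (5, "Clark", "2012-03-01 20:30", 50),
    ],
)

# The derived table holds each candidate's most recent post_time;
# the join keeps only the rows that match it, then sorts by result.
latest_results = conn.execute(
    """
    SELECT r.candidate, r.result
    FROM results r
    JOIN (
        SELECT candidate, MAX(post_time) AS ptime
        FROM results
        GROUP BY candidate
    ) m ON m.candidate = r.candidate AND m.ptime = r.post_time
    ORDER BY r.result
    """
).fetchall()
print(latest_results)
```

If a candidate has two rows with the same latest post_time, both are returned, which matches the "tied results shown" variant above.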
