I would like to better optimize my code. I'd like to have a single query that allows an alias name to have its own limit and also include a result with no limit.
Currently I'm using two queries like this:
// ALL TIME //
$mikep = mysqli_query($link, "SELECT tasks.EID, reports.how_did_gig_go FROM tasks INNER JOIN reports ON tasks.EID=reports.eid WHERE `priority` IS NOT NULL AND `partners_name` IS NOT NULL AND mike IS NOT NULL GROUP BY EID ORDER BY tasks.show_date DESC;");
$num_rows_mikep = mysqli_num_rows($mikep);
$rating_sum_mikep = 0;
while ($row = mysqli_fetch_assoc($mikep)) {
$rating_mikep = $row['how_did_gig_go'];
$rating_sum_mikep += $rating_mikep;
}
$average_mikep = $rating_sum_mikep/$num_rows_mikep;
// AND NOW WITH A LIMIT 10 //
$mikep_limit = mysqli_query($link, "SELECT tasks.EID, reports.how_did_gig_go FROM tasks INNER JOIN reports ON tasks.EID=reports.eid WHERE `priority` IS NOT NULL AND `partners_name` IS NOT NULL AND mike IS NOT NULL GROUP BY EID ORDER BY tasks.show_date DESC LIMIT 10;");
$num_rows_mikep_limit = mysqli_num_rows($mikep_limit);
$rating_sum_mikep_limit = 0;
while ($row = mysqli_fetch_assoc($mikep_limit)) {
$rating_mikep_limit = $row['how_did_gig_go'];
$rating_sum_mikep_limit += $rating_mikep_limit;
}
$average_mikep_limit = $rating_sum_mikep_limit/$num_rows_mikep_limit;
This allows me to show an all-time average and also an average over the last 10 reviews. Is it really necessary for me to set up two queries?
Also, I understand I could get the sum in the query, but not all the values are numbers, so I've actually converted them in PHP, but left out that code in order to try and simplify what is displayed in the code.
All-time average and average over the last 10 reviews
In the best-case scenario, where your column how_did_gig_go is 100% numeric, a single query like this could work:
SELECT
      AVG(how_did_gig_go) AS avg_how_did_gig_go
      -- dividing by 10 assumes there are at least 10 rows
    , SUM(CASE
            WHEN rn <= 10 THEN how_did_gig_go
            ELSE 0
          END) / 10 AS latest10_avg
FROM (
      SELECT
            @num := @num + 1 AS rn
          , tasks.show_date
          , reports.how_did_gig_go
      FROM tasks
      INNER JOIN reports ON tasks.EID = reports.eid
      CROSS JOIN ( SELECT @num := 0 AS n ) AS v
      WHERE priority IS NOT NULL
      AND partners_name IS NOT NULL
      AND mike IS NOT NULL
      ORDER BY tasks.show_date DESC
     ) AS d
However, unless all the "numbers" are in fact numeric, you are stuck with sending every row back from the server for PHP to process, unless you can clean up the data in MySQL somehow.
You might avoid sending all that data twice if your PHP uses only the top 10 from the whole list. There are probably ways of doing that in PHP.
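One way to use a single ordered result for both figures is to slice off the newest rows on the client. The following is a minimal sketch in Python rather than PHP (the same loop carries over directly); the function name and the sample ratings are hypothetical, and it assumes the rows arrive newest first and have already been converted to numbers.

```python
def all_time_and_recent_average(ratings, recent_count=10):
    """Return (all-time average, average of the newest `recent_count` ratings).

    `ratings` is assumed to be ordered newest first, the way
    ORDER BY tasks.show_date DESC would return them.
    """
    if not ratings:
        return None, None  # no reviews: avoid dividing by zero
    recent = ratings[:recent_count]
    return sum(ratings) / len(ratings), sum(recent) / len(recent)


# Hypothetical ratings, newest first.
sample = [5, 4, 4, 3, 5, 5, 2, 4, 5, 3, 1, 1]
overall, last_ten = all_time_and_recent_average(sample)
print(overall, last_ten)
```

Dividing by the length of the slice rather than a fixed 10 keeps the recent average correct when fewer than 10 reviews exist.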
If you wanted assistance from SQL to do that, then having two columns might help; it would reduce the number of table scans.
SELECT
      EID
    , how_did_gig_go
    , CASE
        WHEN rn <= 10 THEN how_did_gig_go
        ELSE 0
      END AS latest10_how_did_gig_go
FROM (
      SELECT
            @num := @num + 1 AS rn
          , tasks.EID
          , reports.how_did_gig_go
      FROM tasks
      INNER JOIN reports ON tasks.EID = reports.eid
      CROSS JOIN ( SELECT @num := 0 AS n ) AS v
      WHERE priority IS NOT NULL
      AND partners_name IS NOT NULL
      AND mike IS NOT NULL
      ORDER BY tasks.show_date DESC
     ) AS d
In future (MySQL 8.x), ROW_NUMBER() OVER (ORDER BY tasks.show_date DESC) would be a better method than the "roll your own" row numbering (using @num + 1) shown before.
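A sketch of the window-function variant follows. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained (this assumes SQLite 3.25 or later; MySQL 8 accepts the same ROW_NUMBER() syntax). The table layout and sample rows are made up, the IS NOT NULL filters are left out for brevity, and AVG over a CASE with no ELSE is used instead of SUM(...) / 10 so that fewer than 10 rows still average correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks   (EID INTEGER PRIMARY KEY, show_date TEXT);
    CREATE TABLE reports (eid INTEGER, how_did_gig_go INTEGER);
""")
# Hypothetical data: 12 gigs on consecutive days, rating = i % 5 + 1.
for i in range(1, 13):
    conn.execute("INSERT INTO tasks VALUES (?, ?)", (i, f"2020-01-{i:02d}"))
    conn.execute("INSERT INTO reports VALUES (?, ?)", (i, i % 5 + 1))

query = """
    SELECT AVG(how_did_gig_go) AS avg_all,
           -- CASE yields NULL for older rows, and AVG ignores NULLs
           AVG(CASE WHEN rn <= 10 THEN how_did_gig_go END) AS avg_latest10
    FROM (
        SELECT reports.how_did_gig_go,
               ROW_NUMBER() OVER (ORDER BY tasks.show_date DESC) AS rn
        FROM tasks
        INNER JOIN reports ON tasks.EID = reports.eid
    ) AS d
"""
avg_all, avg_latest10 = conn.execute(query).fetchone()
print(avg_all, avg_latest10)
```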
Related
I'm having some issues with trying to fix this SQL Query
This is a custom search query which is searching for the word 'weddings' on all pages on this CMS system.
At the moment I am getting the same page appearing in the first 5 rows because the word 'weddings' appears 5 times. What I want to do is combine the rows with the same ID number into 1 row so it doesn't appear multiple times.
I thought adding a GROUP BY at the end of this statement would do this, but I keep getting an SQL syntax error:
GROUP BY `documents`.`id`
I have attached the full SQL below with an image of the output I currently get. Any ideas?
SELECT `documents`.*,
`documenttypes`.`name` as `doctype`,
`articles`.`id` as `article_id`,
`articles`.`language_id`,
`articles`.`title`,
`articles`.`template`,
`articles`.`slug`,
`articles`.`path`,
`articles`.`slug_title`,
MATCH ( elements.textvalue )AGAINST ( 'weddings' ) AS score,
elements.textvalue AS matching,
LOWER(`articles`.`title`)
LIKE '%weddings%' as 'like_title',
( MATCH ( elements.textvalue )
AGAINST ( 'weddings' ) ) + IF(( LOWER(`articles`.`title`)
LIKE '%weddings%'),1, 0) + IF((LOWER(`elements`.`textvalue`)
LIKE '%weddings%'),1, 0) as total FROM (`documents`)
LEFT JOIN `articles` ON `articles`.`document_id` = `documents`.`id`
LEFT JOIN `documenttypes` ON `documents`.`documenttype_id` = `documenttypes`.`id`
LEFT JOIN `documents_users` AS du ON `documents`.`id` = du.`document_id`
LEFT JOIN `documents_usergroups` AS dug ON `documents`.`id` = dug.`document_id`
LEFT JOIN elements ON `elements`.`article_id` = `articles`.`id`
WHERE `documents`.`trashed` = 0
AND `documents`.`published` = 1
AND `articles`.`status_id` = 1
AND `articles`.`language_id` = 1
AND (`documents`.`no_search` = '0'
OR `documents`.`no_search` IS NULL)
AND ( (dug.usergroup_id IS NULL)
AND (du.user_id IS NULL) )
AND (`documents`.`startdate` < NOW()
OR `documents`.`startdate` = '0000-00-00 00:00:00' OR `documents`.`startdate` IS NULL)
AND (`documents`.`enddate` > NOW()
OR `documents`.`enddate` = '0000-00-00 00:00:00'
OR `documents`.`enddate` IS NULL)
HAVING (total > 0)
ORDER BY label ASC,
total DESC LIMIT 0,10
You can try to use the statement DISTINCT:
SELECT DISTINCT `documents`.*,
`documenttypes`.`name` as `doctype`,
`articles`.`id` as `article_id`,
...
GROUP BY lets you use aggregate functions, like AVG, MAX, MIN, SUM, and COUNT which apparently you don't use.
I recently made a system that ranks players by their points. The way the system gets the points is rather confusing. After using this system for over 24 hours, I found that it is not ordering players according to their points. Then it occurred to me that I could be calculating the points in a way that does not match the SQL query. Here is the SQL query that my rankings use:
SELECT * , playeruid AS player_id, (
(
SELECT COALESCE(sum(player1points),0)
FROM `mybb_matches`
WHERE player1uid = player_id AND gid = $id AND timestamp < $time AND winneruid is NOT NULL AND dispute != 3 )
+
(
SELECT COALESCE(sum(player2points),0)
FROM `mybb_matches`
WHERE player2uid = player_id AND gid = $id AND timestamp < $time AND winneruid is NOT NULL AND dispute != 3 )
+
(
SELECT SUM( rank )
FROM `mybb_matchesgame`
WHERE playeruid = player_id AND gid = $id )
)
AS points
FROM mybb_matchesgame WHERE gid = $id
ORDER BY points DESC
Now that this is shown, I was wondering if there's any way to grab the value of "points" and display it somehow so I can verify the number. Is this possible?
There are no group by statements in the queries, so the SUM is most likely not over the expected set. Also COALESCE can be replaced with IFNULL, which might be a bit more efficient.
SELECT q.*, q.playeruid AS player_id,
       IFNULL(a.points,0) + IFNULL(b.points,0) + IFNULL(c.points,0) AS points
FROM mybb_matchesgame q
LEFT JOIN (
    -- points scored as player 1, one row per player
    SELECT player1uid AS player_id, SUM(player1points) AS points
    FROM `mybb_matches`
    WHERE gid = $id AND timestamp < $time AND winneruid IS NOT NULL AND dispute != 3
    GROUP BY player1uid) a ON q.playeruid = a.player_id
LEFT JOIN (
    -- points scored as player 2, one row per player
    SELECT player2uid AS player_id, SUM(player2points) AS points
    FROM `mybb_matches`
    WHERE gid = $id AND timestamp < $time AND winneruid IS NOT NULL AND dispute != 3
    GROUP BY player2uid) b ON q.playeruid = b.player_id
LEFT JOIN (
    SELECT playeruid AS player_id, SUM(`rank`) AS points
    FROM `mybb_matchesgame`
    WHERE gid = $id
    GROUP BY playeruid) c ON q.playeruid = c.player_id
WHERE q.gid = $id
ORDER BY points DESC;
Can anyone help me optimise this query? I have the following table:
cdu_user_progress:
--------------------------------------------------------------
|id |uid |lesson_id |game_id |date |score |
--------------------------------------------------------------
For each user, I'm trying to obtain the difference between the best and first scores for a particular game_id for a particular lesson_id, and order the results by that difference ('progress' in my query):
SELECT ms.uid AS id, ms.max_score - fs.first_score AS progress
FROM (
SELECT up.uid, MAX(CASE WHEN game_id = 3 THEN score ELSE NULL END) AS max_score
FROM cdu_user_progress up
WHERE (up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086')) AND (up.lesson_id = '65') AND (up.score > '-1')
GROUP BY up.uid
) ms
LEFT JOIN (
SELECT up.uid, up.score AS first_score
FROM cdu_user_progress up
INNER JOIN (
SELECT up.uid, MIN(CASE WHEN game_id = 3 THEN date ELSE NULL END) AS first_date
FROM cdu_user_progress up
WHERE (up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086')) AND (up.lesson_id = '65') AND (up.score > '-1')
GROUP BY up.uid
) fd ON fd.uid = up.uid AND fd.first_date = up.date
) fs ON fs.uid = ms.uid
ORDER BY progress DESC
Any help would be most appreciated!
Absent any EXPLAIN output or index definitions, we can't make any recommendations. (I noted in a comment that it looks like some join predicates are missing. If we don't have guaranteed uniqueness on the (uid,date) tuple in cdu_user_progress, there's potential that we are going to get rows that are for a different lesson_id or a score that isn't greater than '-1'.)
In the query text, immediately before ) fs , I'd be adding
AND up.lesson_id = '65'
AND up.score > '-1'
GROUP BY up.uid
I'd also wrap the up.score column (in the SELECT list of the fs view) in an aggregate function, either MIN() or MAX(), for compliance with the ANSI standard (even though it isn't required by MySQL when SQL_MODE doesn't include ONLY_FULL_GROUP_BY).
If I didn't have a suitable index defined, I'd consider adding an index:
... ON cdu_user_progress (lesson_id, uid, score, game_id, date)
There's some overhead for the derived tables (materializing the inline views), and those derived tables aren't going to have indexes on them (in MySQL 5.5 and earlier). But the GROUP BY in each inline view ensures that we'll have fewer than 20 rows, so that's not really going to be a problem.
So, if there's a performance issue, it's in the view queries. Again, we'd really need to see the output from EXPLAIN and the index definitions, and some cardinality estimates, in order to make recommendations.
FOLLOWUP
Given that there's not a unique constraint on (uid,date), I'd add those predicates in the fs view query. I'd also use unique table aliases in the query (for each reference to cdu_user_progress) to make both the statement and the EXPLAIN output easier to read. With the GROUP BY clause and the aggregate function added in the fs view as well, I'd write the query like this:
SELECT ms.uid AS id
, ms.max_score - fs.first_score AS progress
FROM ( SELECT up.uid
, MAX(CASE WHEN up.game_id = 3 THEN up.score ELSE NULL END) AS max_score
FROM cdu_user_progress up
WHERE up.uid IN ('1671','1672','1673','1674','1675','1676','1679','1716','1725','1726','1937','1964','1996','2062','2065','2066','2085','2086')
AND up.lesson_id = '65'
AND up.score > '-1'
GROUP BY up.uid
) ms
LEFT
JOIN ( SELECT uo.uid
, MIN(uo.score) AS first_score
FROM ( SELECT un.uid
, MIN(CASE WHEN un.game_id = 3 THEN un.date ELSE NULL END) AS first_date
FROM cdu_user_progress un
WHERE un.uid IN ('1671','1672','1673','1674','1675','1676','1679','1716','1725','1726','1937','1964','1996','2062','2065','2066','2085','2086')
AND un.lesson_id = '65'
AND un.score > '-1'
GROUP BY un.uid
) fd
JOIN cdu_user_progress uo
ON uo.uid = fd.uid
AND uo.date = fd.first_date
AND uo.lesson_id = '65'
AND uo.score > '-1'
GROUP BY uo.uid
) fs
ON fs.uid = ms.uid
ORDER BY progress DESC
And I believe that would make the index I recommended above suitable for all of the references to cdu_user_progress.
I have a table in a MySQL database (level_records) which has 3 columns (id, date, reading). I want to put the differences between the most recent 20 readings (by date) into an array and then average them, to find the average difference.
I have looked everywhere, but no one seems to have a scenario quite like mine.
I will be very grateful for any help. Thanks.
SELECT AVG(difference)
FROM (
    SELECT @next_reading - reading AS difference, @next_reading := reading
    FROM (SELECT reading
          FROM level_records
          ORDER BY date DESC
          LIMIT 20) AS recent20
    CROSS JOIN (SELECT @next_reading := NULL) AS var
) AS recent_diffs
If we consider "differences" to be signed, and if we ignore/exclude any rows that have a NULL value of reading...
If you want to return just the values of the difference between a reading and the immediately preceding reading (to get the latest nineteen differences), then you could do something like this:
SELECT d.diff
  FROM ( SELECT e.reading - @prev_reading AS diff
              , @prev_reading AS prev_reading
              , @prev_reading := e.reading AS reading
           FROM ( SELECT r.date
                       , r.reading
                    FROM level_records r
                   CROSS
                    JOIN (SELECT @prev_reading := NULL) p
                   ORDER BY r.date DESC
                   LIMIT 20
                ) e
          ORDER BY e.date ASC
       ) d
That'll get you the rows returned from MySQL and you can monkey with them in PHP however you want. (The question of how to monkey around with arrays in PHP is a question that doesn't really have anything to do with MySQL.)
If you want to know how to return rows from a SQL resultset into a PHP array, that doesn't really have anything to do with "latest twenty", "difference", or "average" at all. You'd use the same pattern you'd use for returning the result from any query. There's nothing at all unique about that, and there are plenty of examples (most of them unfortunately using the deprecated mysql_ interface; for new development, you want to use either PDO or mysqli_).
If you mean by "all 19 sets of differences" that you want to get the difference between a reading and every other reading, and do that for each reading, such that you get a total of 380 rows ( = 20 * (20-1) rows ) then:
SELECT a.reading - b.reading AS diff
, a.id AS a_id
, a.date AS a_date
, a.reading AS a_reading
, b.id AS b_id
, b.date AS b_date
, b.reading AS b_reading
FROM ( SELECT aa.id
, aa.date
, aa.reading
FROM level_records aa
WHERE aa.reading IS NOT NULL
ORDER BY aa.date DESC, aa.id DESC
LIMIT 20
) a
JOIN ( SELECT bb.id
, bb.date
, bb.reading
FROM level_records bb
WHERE bb.reading IS NOT NULL
ORDER BY bb.date DESC, bb.id DESC
LIMIT 20
) b
WHERE a.id <> b.id
ORDER BY a.date DESC, b.date DESC
Sometimes, we only want differences in one direction; that is, if we have the difference between r13 and r15, we essentially already have the inverse, the difference between r15 and r13. And sometimes, it's more convenient to have the inverse copies.
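To make the row counts concrete, here is a small Python check (the ids 1 through 20 simply stand in for the latest twenty readings): keeping both directions gives 20 * 19 ordered pairs, while keeping one direction gives half as many.

```python
from itertools import combinations, permutations

ids = range(1, 21)  # stand-ins for the latest twenty readings

both_directions = list(permutations(ids, 2))  # (a, b) and (b, a) both appear
one_direction = list(combinations(ids, 2))    # each pair appears once

print(len(both_directions), len(one_direction))  # 380 190
```

In the SQL above, changing the join condition from a.id <> b.id to a.id < b.id is one way to keep a single direction, although the direction it keeps is by id rather than by date.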
What query you run really depends on what result set you want returned.
If the goal is to get an "average", then rather than monkeying with PHP arrays, we know that the average of the differences between the latest twenty readings will be the same as the difference between the first and last readings (in the latest twenty), divided by nineteen.
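The reason is that signed consecutive differences telescope: every intermediate reading is added once and subtracted once, leaving only last minus first. A quick Python check with made-up readings (six here rather than twenty, so the divisor is five):

```python
# Hypothetical readings, oldest first.
readings = [10.0, 12.5, 11.0, 15.0, 14.0, 18.0]

# Signed difference between each reading and the one before it.
diffs = [later - earlier for earlier, later in zip(readings, readings[1:])]

mean_of_diffs = sum(diffs) / len(diffs)
shortcut = (readings[-1] - readings[0]) / (len(readings) - 1)

print(mean_of_diffs, shortcut)
```

The shortcut only holds for signed differences. Once absolute values are taken, the intermediate readings no longer cancel, so each difference has to be computed individually.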
If we only want to return a row if there are at least twenty readings available, then something like this:
SELECT (l.reading - f.reading)/19 AS avg_difference
FROM ( SELECT ll.reading
FROM level_records ll
WHERE ll.reading IS NOT NULL
ORDER BY ll.date DESC LIMIT 1
) l
CROSS
JOIN (SELECT ff.reading
FROM level_records ff
WHERE ff.reading IS NOT NULL
ORDER BY ff.date DESC LIMIT 19,1
) f
NOTE: That query will return a row only if there are at least twenty rows with non-NULL values of reading in the level_records table.
For the more general case, if there are fewer than twenty rows in the table (i.e. fewer than nineteen differences) and we want an average of the differences between the latest available rows, we can do something like this:
SELECT (l.reading - f.reading)/f.cnt AS avg_difference
  FROM ( SELECT ll.reading
           FROM level_records ll
          WHERE ll.reading IS NOT NULL
          ORDER BY ll.date DESC
          LIMIT 1
       ) l
 CROSS
  JOIN (SELECT ee.reading
             , ee.cnt
          FROM ( SELECT e.date
                      , e.reading
                      , (@i := @i + 1) AS cnt
                   FROM level_records e
                  CROSS
                   JOIN (SELECT @i := -1) i
                  WHERE e.reading IS NOT NULL
                  ORDER BY e.date DESC
                  LIMIT 20
               ) ee
         ORDER BY ee.date ASC
         LIMIT 1
       ) f
But if we need to treat "differences" as unsigned (that is, we are taking the absolute value of the differences between the readings), then we'd need to get the actual differences between the readings and average their absolute values.
We could make use of a MySQL user variable to keep track of the "previous" reading, and have that available when we process the next row, so we can get the difference between them, something like this:
SELECT AVG(d.abs_diff)
  FROM ( SELECT ABS(e.reading - @prev_reading) AS abs_diff
              , @prev_reading AS prev_reading
              , @prev_reading := e.reading AS reading
           FROM ( SELECT r.date
                       , r.reading
                    FROM level_records r
                   CROSS
                    JOIN (SELECT @prev_reading := NULL) p
                   ORDER BY r.date DESC
                   LIMIT 20
                ) e
          ORDER BY e.date ASC
       ) d
I have two tables
Customer (idCustomer, etc.)
Comment (idCustomer, idComment, etc.)
Obviously the two tables are joined together, for example:
SELECT * FROM Comment AS co
JOIN Customer AS cu ON cu.idCustomer = co.idCustomer
With this I select every comment from that table together with its Customer, but now I want to limit the results to at most 2 comments per Customer.
The first thing I thought of was GROUP BY cu.idCustomer, but that limits the result to 1 comment per Customer, and I want 2 comments per Customer.
How can I achieve that?
One option in MySQL is server-side variables. For example:
set @num := 0, @customer := -1;
select  *
from    (
        select  idCustomer
        ,       commentText
        ,       @num := if(@customer = idCustomer, @num + 1, 1)
                    as row_number
        ,       @customer := idCustomer
        from    Comments
        order by
                idCustomer, PostDate desc
        ) as co
join    Customer cu
on      co.idCustomer = cu.idCustomer
where   co.row_number <= 2
This version doesn't require the SET operation:
select *
from (select idCustomer
           , commentText
           , @num := if(@customer = idCustomer, @num + 1, 1) as row_number
           , @customer := idCustomer
      from Comments
      JOIN (SELECT @num := 0, @customer := -1) r
      order by idCustomer, PostDate desc) as co
join Customer cu on co.idCustomer = cu.idCustomer
where co.row_number <= 2
SELECT * FROM Comments AS cm1
LEFT JOIN Comments AS cm2 ON cm1.idCustomer = cm2.idCustomer
LEFT JOIN Customer AS cu ON cm1.idCustomer = cu.idCustomer
WHERE cm1.idComment != cm2.idComment
GROUP BY cm1.idCustomer
However, if you are going to change the number of comments it's better to use Andomar's solution.
There is no need to use a cursor, which is very slow. See my answer to Complicated SQL Query About Joining And Limitting. DENSE_RANK will do the trick without all the cursor intricacies.
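A sketch of the ranking-function approach follows, run against an in-memory SQLite database through Python's sqlite3 module so that it is self-contained (this assumes SQLite 3.25 or later; MySQL 8 supports the same syntax). It uses ROW_NUMBER() rather than DENSE_RANK() so that ties on the date cannot push a customer past two rows. The PostDate column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comment ("
    "idComment INTEGER PRIMARY KEY, idCustomer INTEGER, PostDate TEXT)"
)
# Hypothetical rows; idComment is assigned 1..6 in insertion order.
conn.executemany(
    "INSERT INTO Comment (idCustomer, PostDate) VALUES (?, ?)",
    [(1, "2020-01-01"), (1, "2020-01-02"), (1, "2020-01-03"),
     (2, "2020-01-01"), (3, "2020-01-05"), (3, "2020-01-04")],
)

query = """
    SELECT idCustomer, idComment
    FROM (
        SELECT idCustomer, idComment,
               -- number each customer's comments, newest first
               ROW_NUMBER() OVER (PARTITION BY idCustomer
                                  ORDER BY PostDate DESC) AS rn
        FROM Comment
    ) AS ranked
    WHERE rn <= 2
    ORDER BY idCustomer, rn
"""
latest_two = conn.execute(query).fetchall()
print(latest_two)
```

Changing the limit from 2 to any other number only requires editing the rn <= 2 condition.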
If you are using a scripting language such as PHP to process the results, you could limit the number of results shown per customer after running the query. Set up an array to hold all the results, set up another array to hold the number of results per customer and stop adding the query results to the result set after the count exceeds your limit like so:
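$RESULTS = array();
$COUNTS = array();
$limit = 2;
$query = "SELECT customer_id, customer_name, customer_comment FROM customers ORDER BY RAND()";
// $link is assumed to be an open mysqli connection; the mysql_* functions were removed in PHP 7.
$request = mysqli_query($link, $query);
while ($ROW = mysqli_fetch_assoc($request))
{
    $c = $ROW['customer_id'];
    // Default to 0 the first time a customer is seen (avoids an undefined-index notice).
    $n = isset($COUNTS[$c]) ? $COUNTS[$c] : 0;
    if ($n < $limit)
    {
        $RESULTS[] = $ROW;
        $COUNTS[$c] = $n + 1;
    }
}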
This guarantees that at most two comments per customer will be shown, pulled randomly or however you want; the rest get thrown out. Granted, you are pulling ALL the results, but this is (probably) faster than doing a complex join.
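The same counting idea, sketched in Python for illustration with hypothetical in-memory rows in place of a database result (the function name is made up, and the column names follow the query above):

```python
from collections import defaultdict


def cap_per_customer(rows, limit=2):
    """Keep at most `limit` rows per customer_id, preserving input order."""
    counts = defaultdict(int)  # rows kept so far for each customer
    kept = []
    for row in rows:
        customer = row["customer_id"]
        if counts[customer] < limit:
            kept.append(row)
            counts[customer] += 1
    return kept


# Hypothetical rows standing in for the fetched result set.
rows = [
    {"customer_id": 1, "customer_comment": "a"},
    {"customer_id": 1, "customer_comment": "b"},
    {"customer_id": 2, "customer_comment": "c"},
    {"customer_id": 1, "customer_comment": "d"},
    {"customer_id": 2, "customer_comment": "e"},
    {"customer_id": 2, "customer_comment": "f"},
]
kept = cap_per_customer(rows)
print([r["customer_comment"] for r in kept])  # ['a', 'b', 'c', 'e']
```

Which two comments survive depends entirely on the order of the incoming rows, so the ORDER BY in the query decides whether they are random, newest, or something else.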