I have the following query:
SELECT diary_id,
       (SELECT COUNT(*)
        FROM `comments` AS c
        WHERE c.d_id = d.diary_id) AS diary_comments
FROM `diaries` AS d
It takes a long time (about 0.119415 seconds in my case).
How can I make it faster?
I see only one way: doing an additional query for the comment count for each row of my main query. But that would mean running queries in a loop, something like:
while ($r = mysql_fetch_array($res))
{
    // one extra query per diary row -- the pattern I want to avoid
    $result = mysql_query("SELECT COUNT(*) FROM `comments` WHERE d_id = " . $r['diary_id']);
    $comments = mysql_result($result, 0);
}
I think this is a bad strategy. Any other advice?
SELECT d.diary_id, count(c.d_id) as diary_comments
FROM diaries d
LEFT OUTER JOIN comments c ON (d.diary_id = c.d_id)
GROUP BY d.diary_id
I seem to have been downvoted because you can actually retrieve all the data needed from just the comments table. I assumed that this was a simplified example and that in reality other fields from the diaries table would be required; this method also brings back diaries which have no comments. If you don't need either of those two things, I would go with the other answer.
It looks like you have all the data you need in the comments table, so I don't see a reason for the join or subquery.
SELECT d_id AS diary_id, COUNT(*) AS diary_comments
FROM `comments`
GROUP BY d_id
Definitely plus one for tomhaigh's solution; a GROUP BY is exactly the right tool in this situation.
One other thing always worth remembering is that you've got two choices here: either calculate the value on every read, or calculate it on every write and store the result. In the second case you would need an additional field on diaries called 'comments_count', which would need to be incremented when a new comment is inserted. Obviously this could be less accurate than working out the count on every read, and it will slow your writes down. But if read performance is hurting and writes are noticeably less common than reads, then this can be a handy way to approach the problem.
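A minimal sketch of that write-side option, assuming the table and column names from the question; the comments_count column, the body column, and the id 42 are made up for illustration:

ALTER TABLE `diaries` ADD COLUMN comments_count INT UNSIGNED NOT NULL DEFAULT 0;

-- Wherever the application inserts a new comment (42 and `body` are placeholders):
INSERT INTO `comments` (d_id, body) VALUES (42, 'some comment text');
UPDATE `diaries` SET comments_count = comments_count + 1 WHERE diary_id = 42;

-- Reads become a plain column fetch, no join or subquery needed:
SELECT diary_id, comments_count AS diary_comments FROM `diaries`;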
I want to fetch records from MySQL from last to first, LIMIT 20. My database has over 1M records. I am aware of ORDER BY, but from my understanding, when using ORDER BY it takes forever to load 20 records and I have no idea why. I think MySQL fetches all the records before ordering.
SELECT bookings.created_at, bookings.total_amount,
       passengers.name, passengers.id_number, payments.amount,
       passengers.ticket_no, bookings.phone, bookings.source,
       bookings.destination, bookings.date_of_travel
FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments ON payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
I suppose that if you execute the query without the ORDER BY, the response time is satisfactory?
You might try to create an index on the column you are ordering by:
create index idx_bookings_booking_id on bookings(booking_id)
You can try to find out the complexity of the query using
EXPLAIN SELECT bookings.created_at, bookings.total_amount,
       passengers.name, passengers.id_number, payments.amount,
       passengers.ticket_no, bookings.phone, bookings.source,
       bookings.destination, bookings.date_of_travel
FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments ON payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
then check that the proper index has been created on the table:
SHOW INDEX FROM `db_name`.`table_name`;
If the index is not there, create a proper index on each table (a sketch follows below).
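A sketch of what those indexes might look like, assuming the join columns from the query above and that no indexes exist on them yet (the index names are made up):

CREATE INDEX idx_passengers_booking_id ON passengers (booking_id);
CREATE INDEX idx_payments_booking_id ON payments (booking_id);
-- bookings.booking_id is presumably the primary key and already indexed.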
please add if anything is missing
The index lookup table needs to be able to reside in memory, if I'm not mistaken (filesort is much slower than in-mem lookup).
Use small index / column size
For double the capacity, use UNSIGNED columns if you don't need negative values.
Tune sort_buffer_size and read_rnd_buffer_size (maybe better at the connection level, not globally).
See https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html, particularly regarding using EXPLAIN and maybe trying another execution plan strategy.
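A minimal sketch of trying those buffers at the connection level; the sizes are purely illustrative, not recommendations:

SET SESSION sort_buffer_size = 4194304;      -- 4 MB, illustrative value
SET SESSION read_rnd_buffer_size = 2097152;  -- 2 MB, illustrative value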
You seem to need another workaround like materialized views.
Tell me if this sounds like it:
Create another table like the booking table, e.g. CREATE TABLE booking_short LIKE booking, though you only need the booking_id column.
Then check your code for where exactly you create booking orders, i.e. where you first insert into booking. At that point, run SELECT COUNT(*) FROM booking_short; if it is > 20, delete the oldest record, then insert the new booking_id (a sketch of this follows below).
You can then select the IDs from this small table and join from there to the rest of the tables for the details.
You won't need limit or sorting.
Of course, this needs heavy documentation to avoid maintenance problems.
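A minimal sketch of that bookkeeping, assuming the booking_short name from above, a single-column variant (since only booking_id is needed), and a limit of 20 rows; the booking_id value is a placeholder:

CREATE TABLE booking_short (
    booking_id INT UNSIGNED NOT NULL PRIMARY KEY
);

-- At the point in the code where a new booking is inserted:
INSERT INTO booking_short (booking_id) VALUES (12345);

-- If SELECT COUNT(*) FROM booking_short now exceeds 20, drop the oldest id:
DELETE FROM booking_short ORDER BY booking_id ASC LIMIT 1;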
Either that or https://stackoverflow.com/a/5912827/6288442
I tried to compare two zipcode columns between two tables to see if values were missing in the second one.
I first wanted to do it with mysql, my query was something like
'SELECT code FROM t1 WHERE code NOT IN (SELECT code FROM t2)'
But it was really slow, so I tried another way:
I made two SELECTs and then compared the results with array_diff().
With MySQL: a few minutes, and sometimes a crash.
With PHP: less than 1 second.
Can someone explain these differences ?
Is my SQL query wrong ?
If your main table has 50k rows, using a subselect in your query results in 1 + 50k select executions: one for the main table, plus 50k more, one for each row. The server compares each row against your subselect, which is re-run for every row of the main table. That is why your SQL takes so long, and it can be a huge memory problem as well.
See serjoscha's information about joins to fix it in SQL; it should be even faster than your PHP solution.
Checking which values are missing from one table (compared to another) can easily be done with a LEFT or RIGHT JOIN; they are made for exactly this kind of thing. Alternatively, take a look at this: How to Find Missing Value Between Two Mysql Tables – serjoscha
One solution to:
SELECT code FROM t1
WHERE code NOT IN ( SELECT code FROM t2 )
will be:
SELECT t1.code
FROM t1
LEFT JOIN t2
ON t1.code = t2.code
WHERE t2.code is null
Give it a try. Also have a look at indexing, as Cyclone suggests:
If you don't have an index you should definitely add one, since this will speed up your query. You could add an index like this: ALTER TABLE ADD INDEX code_idx (code); this should be done for both tables. If you then execute EXPLAIN for the query, you would see something like Using where; Using index; Using join buffer, which is good – Cyclone
Indexing speeds up your query. If the table only has a single column, an index on it would just duplicate the table's content and be redundant. Otherwise I strongly recommend indexing the code column of t2, which leads to a big increase in performance and lower memory consumption.
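As a sketch, spelling out the table names that the quoted ALTER TABLE statement leaves out:

ALTER TABLE t1 ADD INDEX code_idx (code);
ALTER TABLE t2 ADD INDEX code_idx (code);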
After searching for a damn long time, I've not found a query to make this happen.
I have an "offers" table with a "listing_id" field and a "user_id" field and I need to get ALL the records for all listing_id's where at least one record matches the given user_id.
In other words, I need a query that determines the listing_id's that the given user is involved in, and then returns all the offer records of those listing_id's regardless of user_id.
That last part is the problem: getting all the other users' offer records to return when I'm only providing one user's id and no listing ids.
I was thinking of first determining the listing_ids in a separate query and then using a PHP loop to build a WHERE clause for a second query consisting of a bunch of "listing_id = $var ||" conditions, but I couldn't bring myself to do it because I figured there must be a better way.
Hopefully this is easy and the only reason it has escaped me is because I've had my head up my ass. Will be happy to get this one behind me.
Thanks for taking the time.
Josh
You could let MySQL do the two queries in one go, like this:
SELECT * FROM offers WHERE listing_id IN (SELECT listing_id FROM offers WHERE user_id = 1)
If I understand what you are after, you should join offers to itself, matching on listing_id and filtering on user_id = the given user:
SELECT * FROM offers AS t1
INNER JOIN offers AS t2 ON t1.listing_id = t2.listing_id AND t1.user_id = 1;
I'm having problems with my script, which needs to select all my posts and their related comments.
Right now I have the following query:
$sql = "SELECT posts.post_title, posts.post_modified, post_content, update_modified, update_content
FROM posts
LEFT JOIN updates
ON posts.post_ID = updates.update_post_ID";
The query works great, except that if a post has multiple comments it gives me multiple rows for that post.
I've searched around, but unfortunately I wasn't able to rewrite my query to fit my needs.
I really hope someone can help me out.
I think you want the DISTINCT keyword, used as SELECT DISTINCT ..., to avoid duplicates. However, if I understand correctly, your comments are in the updates table and you're pulling update_modified and update_content into your recordset. Assuming those are (potentially) unique values, DISTINCT will not collapse them. It might be best to pull only updates.update_post_ID with DISTINCT, then pull whatever you need from updates based on the IDs you retrieve, when you need it (see the sketch below).
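A minimal sketch of that two-step idea, using the table and column names from the question (the post ID value is a placeholder):

-- Step 1: the distinct post IDs that have updates/comments.
SELECT DISTINCT updates.update_post_ID
FROM posts
LEFT JOIN updates ON posts.post_ID = updates.update_post_ID;

-- Step 2: fetch the comment details for one of those IDs only when you need them.
SELECT update_modified, update_content
FROM updates
WHERE update_post_ID = 42;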
If you want to return only one row per post, with all of that post's comments, the easiest way is to use GROUP_CONCAT(). This returns a CSV of all the column data. Assuming that update_content holds the post comments, try something like:
SELECT posts.post_title, posts.post_modified, post_content,
       GROUP_CONCAT(update_modified), GROUP_CONCAT(update_content)
FROM posts
LEFT JOIN updates
ON posts.post_ID = updates.update_post_ID
GROUP BY posts.post_ID
Note: GROUP_CONCAT() has a group_concat_max_len default of 1024. If your comments get long, you will want to increase this before running the GROUP_CONCAT() query, or the concatenated comments will be truncated:
SET [GLOBAL | SESSION] group_concat_max_len = 10240; -- must be in multiples of 1024
SELECT id, name,
       GROUP_CONCAT(comment) AS comment
FROM `table`
GROUP BY name;
You will also need to be aware of max_allowed_packet, as this is the upper limit you can set group_concat_max_len to.
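As a quick sketch, you can check both limits before relying on GROUP_CONCAT():

SHOW VARIABLES LIKE 'group_concat_max_len';
SHOW VARIABLES LIKE 'max_allowed_packet';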
At the moment, I select rows from table01 and table02 using:
SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.UUID = 'whatever';
The UUID column has a unique index, is of type char(15), and holds alphanumeric input. I know this isn't the fastest way to select data from the database, but the UUID is the only row identifier that is available to the front-end.
Since I have to select by UUID and not ID, I need to know which of these two options I should go for if, say, the table holds 100,000 rows. What speed difference would I be looking at, and would the index on the UUID grow too large and slow down the DB?
Get the ID before doing the "big" select
1. $id = SELECT ID FROM table01 WHERE UUID = '{alphanumeric character}';
2. SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.ID = $id;
Or keep it the way it is now, using the UUID.
2. SELECT t1.*,t2.* FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.UUID = 'whatever';
Side note: all new rows are created by checking whether the system-generated UUID already exists before inserting, keeping the column always unique.
Why not just try it out? Create a new db with those tables. Write a quick php script to populate the tables with more records than you can imagine being stored (if you're expecting 100k rows, insert 10 million). Then experiment with different indexes and queries (remember, EXPLAIN is your friend)...
When you finally get something you think works, put the query into a script on a webserver and hit it with ab (Apache Bench). You can watch what happens as you increase the concurrency of the requests (1 at a time, 2 at a time, 10 at a time, etc).
All this shouldn't take too long (maybe a few hours at most), but it will give you a FAR better answer than anyone at SO could for your specific problem (as we don't know your DB server config, exact schema, memory limits, etc)...
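A minimal sketch of one way to bulk up the test data from SQL alone, assuming table01 and its char(15) UUID column from the question and that the remaining columns have defaults; each run roughly doubles the row count:

-- Seed a few rows first, then repeat this statement until the table is big enough.
INSERT IGNORE INTO table01 (UUID)
SELECT LEFT(MD5(RAND()), 15) FROM table01;
-- INSERT IGNORE skips the rare duplicate that would collide with the unique index.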
The second solution has the best performance. You will need to look up the row by UUID in both solutions, but in the first solution you do that lookup first and then add a second, faster lookup by primary key. Since the UUID lookup has already found the right row, it doesn't matter that the second lookup is faster: it is unnecessary altogether.
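If in doubt, EXPLAIN should confirm that the unique index on UUID gives a single-row (const) lookup on table01, e.g.:

EXPLAIN SELECT t1.*, t2.*
FROM table01 AS t1
INNER JOIN table02 AS t2 ON (t1.ID = t2.t1ID)
WHERE t1.UUID = 'whatever';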