Mysql query takes a lot of time to execute - php

I am working on a timesheet application and writing PHP code to fetch all the timesheets to date. This is the query that I have written to fetch the timesheets:
SELECT a.accnt_name, u.username, DATE_FORMAT(t.in_time, '%H:%i') inTime, DATE_FORMAT(t.out_time, '%H:%i') outTime, DATE_FORMAT(t.work_time, '%H:%i') workTime, w.wrktyp_name, t.remarks, DATE_FORMAT(t.tmsht_date, '%d-%b-%Y') tmshtDate, wl.loctn_name, s.serv_name, t.status_code, t.conv_kms convkms, t.conv_amount convamount FROM timesheets t, accounts a, services s, worktypes w, work_location wl, users WHERE a.accnt_code=t.accnt_code and w.wrktyp_code=t.wrktyp_code and wl.loctn_code=t.loctn_code and s.serv_code=t.serv_code and t.usr_code = u. ORDER BY tmsht_date desc
The where clause contains the clauses to get the actual values of respective codes from respective tables.
The issue is that this query takes a long time to execute, and the application crashes after a few minutes.
I ran this query in phpMyAdmin, where it works without any issues.
Need help in understanding what might be the cause behind the slowness in the execution.

Use EXPLAIN to see the execution plan for the query. Make sure MySQL has suitable indexes available, and is using those indexes.
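A minimal sketch of that check, using only the timesheets and accounts tables from the question. The index name is hypothetical, and whether the index helps depends on what EXPLAIN actually reports:

```sql
-- Show the execution plan; rows with type = ALL and key = NULL are full table scans.
EXPLAIN
SELECT a.accnt_name, t.tmsht_date, t.status_code
  FROM timesheets t
  JOIN accounts a ON a.accnt_code = t.accnt_code
 ORDER BY t.tmsht_date DESC;

-- List the indexes that already exist on the driving table.
SHOW INDEX FROM timesheets;

-- Hypothetical index supporting the ORDER BY; worth adding only if the plan
-- shows "Using filesort" on a large number of rows.
CREATE INDEX timesheets_tmsht_date_ix ON timesheets (tmsht_date);
```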
The query text seems to be missing the name of a column here...
t.usr_code = u. ORDER
^^^
We can "guess" that's supposed to be u.usr_code, but that's just a guess.
How many rows are supposed to be returned? How large is the resultset?
Is your client attempting to "store" all of the rows in memory, and crashing because it runs out of memory?
If so, I recommend you avoid doing that, and fetch the rows as you need them.
Or, consider adding some additional predicates in the WHERE clause to return just the rows you need, rather than all the rows in the table.
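As an illustration of such predicates, the sketch below restricts the result to recent timesheets and caps the row count. The 30-day window and the limit of 50 are arbitrary assumptions, not values from the question:

```sql
-- Return only recent rows instead of every timesheet ever recorded.
SELECT t.tmsht_date, t.status_code, t.remarks
  FROM timesheets t
 WHERE t.tmsht_date >= CURDATE() - INTERVAL 30 DAY   -- assumed cutoff
 ORDER BY t.tmsht_date DESC
 LIMIT 50;                                           -- assumed page size
```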
It's 2015. Time to ditch the old-school comma syntax for join operation, and use JOIN keyword instead, and move join predicates from the WHERE clause to the ON clause. And format it. The database doesn't care, but it will make it easier on the poor soul that needs to decipher your SQL statement.
SELECT a.accnt_name
, u.username
, DATE_FORMAT(t.in_time ,'%H:%i') AS inTime
, DATE_FORMAT(t.out_time ,'%H:%i') AS outTime
, DATE_FORMAT(t.work_time,'%H:%i') AS workTime
, w.wrktyp_name
, t.remarks
, DATE_FORMAT(t.tmsht_date, '%d-%b-%Y') AS tmshtDate
, wl.loctn_name
, s.serv_name
, t.status_code
, t.conv_kms AS convkms
, t.conv_amount AS convamount
FROM timesheets t
JOIN accounts a
ON a.accnt_code = t.accnt_code
JOIN services s
ON s.serv_code = t.serv_code
JOIN worktypes w
ON w.wrktyp_code = t.wrktyp_code
JOIN work_location wl
ON wl.loctn_code = t.loctn_code
JOIN users u
ON u.usr_code = t.usr_code
ORDER BY t.tmsht_date DESC
Ordering on the formatted date column is very odd. Much more likely you want results returned in "date" order, not in the string order with month and day before the year. (Do you really want to sort on the day value first, before the year?)
FOLLOWUP
If this exact same query completes quickly, returning the entire resultset (of approx 720 rows), from a different client (same database, same user), then the issue is likely something other than this SQL statement.
We would not expect the execution of the SQL statement to cause PHP to "crash".
If you are storing the entire resultset (for example, using mysqli store_result), you need to have sufficient memory for that. But the thirteen expressions in the select list all look relatively short (formatted dates, names and codes), and we wouldn't expect "remarks" would be over a couple of KB.
For debugging this, as others have suggested, try adding a LIMIT clause on the query, e.g. LIMIT 1 and observe the behavior.
Alternatively, use a dummy query for testing; use a query that is guaranteed to return specific values and a specific number of rows.
SELECT 'morpheus' AS accnt_name
, 'trinity' AS username
, '01:23' AS inTime
, '04:56' AS outTime
, '00:45' AS workTime
, 'neo' AS wrktyp_name
, 'yada yada yada' AS remarks
, '27-May-2015' AS tmshtDate
, 'zion' AS loctn_name
, 'nebuchadnezzar' AS serv_name
, '' AS status_code
, '123' AS convkms
, '5678' AS convamount
I suspect that the query is not the root cause of the behavior you are observing. I suspect the problem is somewhere else in the code.
How to debug small programs http://ericlippert.com/2014/03/05/how-to-debug-small-programs/

phpMyAdmin automatically adds a LIMIT to the query; that's why you got fast results.
Check how many rows are in table
Run your query with limit

First of all: modify your query so that it looks like the one given by Spencer
Do you get an error message when your application 'crashes' or does it just stop?
You could try:
ini_set('max_execution_time', 0);
in your php code. This sets the maximum execution time to unlimited. So if there are no errors, your script should execute to the end. So you can see if your query gets the desired results.
Also just as a test end your query with
LIMIT 10
This should greatly speed up your query as it will only take the first ten results.
You can later change this value to one better suited for your needs. Unless you absolutely need the complete result set, I suggest you always use LIMIT in your queries.

Related

MySQL query is extremely long from PHP

I have a query that takes 0.0002s in PHPMyAdmin and takes hundreds of seconds if I do it from PHP. Here it is:
SELECT id, id_pages, link, childlink, url, hiddencontent, cansearch,
(
SELECT p.id
FROM pages as p
WHERE pages.hiddencontent=1
AND p.id_pages IS NOT NULL
AND p.hiddencontent=0
AND pages.id=p.id_pages
order by p.npp asc
limit 1
) as id_firstchild
FROM pages
It returns around 24k rows and I don't know why it takes so long. My friend tried it on his PC and it worked lightning fast and his pc is not better. I don't know the reason of this PHP behavior, maybe I should make some changes in the configuration file?
You have two questions:
Why the timing?
Is the "Query cache" turned on? That's about the only way it can run in 0.2ms. (Any non-trivial SELECT that runs in under 1ms almost certainly did not run, but was found in that cache.)
And, as pointed out by others, phpmyadmin silently adds a LIMIT. However, other clues (mostly in Comments) point to the Query cache giving anomalous results.
How to speed up.
SELECT id, id_pages, link, childlink, url, hiddencontent, cansearch,
IF(o.hiddencontent = 1, -- only run the subquery for rows that can match
( SELECT p.id
FROM pages AS p
WHERE p.hiddencontent = 0
AND p.id_pages = o.id -- never true when p.id_pages is NULL
ORDER BY p.npp ASC
LIMIT 1
), NULL) AS id_firstchild
FROM pages AS o -- alias clarifies which is which ("outer" is a reserved word in MySQL)
and have this 'composite' and 'covering' index:
INDEX(hiddencontent, id_pages, npp, id)
Two improvements:
Avoid calling the subquery when not needed.
Have an index that will allow the subquery to look at only one row, and only in the index's BTree, hence be 'blazingly fast'.
Whenever you try to decide if searching or fetching is slow, use a LIMIT 1 at the end of your query (and comment out the ORDER BY part if there's any). This way, you get the first row so you will know how long that takes. It should be blazing fast.
Another useful technique is to enclose the whole query in another one, like SELECT ... FROM (SELECT ...), where the outer query only counts the rows returned. This gives you the total time needed to identify all the rows that would be fetched, without actually fetching them. It helps determine whether the SQL query is poorly written or there is simply a lot of data to fetch. (If both of the above run fast and the query is still slow, it's the fetch.)
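A sketch of that wrapping, applied to the query from the question. The alias all_rows and the column name row_count are made up for the example, and depending on how the optimizer treats the derived table the timing should be read as approximate:

```sql
-- Count the rows the query would produce without sending ~24k rows to the client.
SELECT COUNT(*) AS row_count
  FROM (
        SELECT id, id_pages, link, childlink, url, hiddencontent, cansearch,
               ( SELECT p.id
                   FROM pages AS p
                  WHERE pages.hiddencontent = 1
                    AND p.id_pages IS NOT NULL
                    AND p.hiddencontent = 0
                    AND pages.id = p.id_pages
                  ORDER BY p.npp ASC
                  LIMIT 1
               ) AS id_firstchild
          FROM pages
       ) AS all_rows;   -- MySQL requires an alias on a derived table
```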
You can also make use of EXPLAIN to see if your performance issue is because of insufficient or improper indexing.
As for phpmyadmin, the first comment on your post pretty much sums it up: phpmyadmin uses a LIMIT so it will run faster even if your query itself is slow.

PHP Code running very slow although there are only 17,257 rows in mysql

I have 17,257 rows in MySQL (Size: 6.6 MiB), whenever I am running my PHP code, it's too slow and takes more than 30 minutes to open the webpage. I read somewhere to change mysqli_fetch_array to fetch_assoc, but still I can't see any change. Any suggestions?
Initially I had a complex code, so I changed it to the one present below, but still I can't observe any change.
$md=$db->query("SELECT MDid,MD_FullName FROM MDList");
while($row=$md->fetch_assoc())
{
$mdid=$row['MDid'];
$mdname=$row['MD_FullName'];
$distinct_filenames=$db->query("SELECT DISTINCT(FileName) AS Files FROM InitialLog WHERE MDid='$mdid' AND FileName NOT LIKE '%Patient Names%'");
while($row2=$distinct_filenames->fetch_assoc())
{
$filename=$row2['Files'];
$finalquery=$db->query("SELECT LinesCount,CharCount,WordCount,PageCount FROM InitialLog WHERE FileName='$filename' AND (DateLastSaved>='$firstdate' AND DateLastSaved<='$presentdate') AND MONTH(DateLastSaved) = (SELECT MIN(MONTH(DateLastSaved)) FROM InitialLog WHERE FileName='$filename') ORDER BY DAY(DateLastSaved) DESC LIMIT 1");
while($row3=$finalquery->fetch_assoc())
{
$linecount=$linecount+$row3['LinesCount'];
$charcount=$charcount+$row3['CharCount'];
$wordcount=$wordcount+$row3['WordCount'];
$pagecount=$pagecount+$row3['PageCount'];
}
}
}
What I want to achieve through the queries is:
Tables:
MDList (Consist of MD ids of all the MDs)
InitialLog (Consist of FileNames of each MDid and the counts)
My first query chooses each MDid one by one from the table MDlist.
Second query takes distinct file names from InitialLogs table for that specific MD chosen from first query (File names can be same)
The third query returns the counts for each distinct file name of that MD. If only one row exists for a file name, its counts are returned as they are. If several rows exist, it returns the counts from the row with the latest day within the first month in which the file appears; for example, for rows dated 01-01-2016, 22-01-2016 and 23-02-2016, it returns the counts from the 22-01-2016 row.
In the end I sum all the counts returned for each MD.
You are making a zillion SQL queries.
Well, somewhere in the region of <Number of MD Results> * <Number of distinct filenames> SQL queries.
Since you are just adding up some stats, it will likely be more efficient to create a single query that sums up the correct values to start with.
Check out SUM() and JOINs.
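A simplified sketch of what a single aggregating query could look like, using the table and column names from the question. The date literals stand in for $firstdate and $presentdate, and the result aliases are made up. Note that it sums every matching row per MD; it does not implement the "last row of the first month" rule described in the question, which would need an extra derived table:

```sql
-- One round trip: totals per MD instead of one query per MD and per file name.
SELECT md.MDid,
       md.MD_FullName,
       SUM(l.LinesCount) AS total_lines,
       SUM(l.CharCount)  AS total_chars,
       SUM(l.WordCount)  AS total_words,
       SUM(l.PageCount)  AS total_pages
  FROM MDList md
  JOIN InitialLog l ON l.MDid = md.MDid
 WHERE l.FileName NOT LIKE '%Patient Names%'
   AND l.DateLastSaved >= '2016-01-01'   -- placeholder for $firstdate
   AND l.DateLastSaved <= '2016-03-31'   -- placeholder for $presentdate
 GROUP BY md.MDid, md.MD_FullName;
```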
As said, you should avoid executing queries in loops at (pretty much) any cost. Your DBMS engine is designed to handle data aggregation, joins, exclusions and such.
It should be better this way, but please read the notes below about why it's still not a good idea. It's a direct transcription of your query logic, which could be rewritten for better performance and safety.
SELECT
sum(log.LinesCount), sum(log.CharCount),
sum(log.WordCount), sum(log.PageCount)
FROM InitialLog log
INNER JOIN (
SELECT l2.FileName, l2.MDid AS MD_id
FROM InitialLog l2
WHERE l2.FileName NOT LIKE '%Patient Names%'
) filtered_name
ON filtered_name.FileName=log.FileName
INNER JOIN MDList md
ON filtered_name.MD_id = md.MDid
INNER JOIN (
SELECT MIN(MONTH(l3.DateLastSaved)) as minmonth
FROM InitialLog l3
WHERE l3.FileName='$filename'
) lastSaved
ON lastSaved.minmonth = MONTH(log.DateLastSaved)
WHERE
log.DateLastSaved>='$firstdate'
AND log.DateLastSaved<='$presentdate'
ORDER BY
DAY(log.DateLastSaved) DESC
LIMIT 1;
First, NOT LIKE '%whatever%' is usually a bad idea because it forces a full scan; it would be much more efficient to use a JOIN with a NULL test, a view, or some other way of avoiding the scan altogether (adding a column, etc.). At the very least, try to avoid wildcards (%) at the start of the pattern.
Next, you're using string concatenation to inject parameters into your query, and that's bad. You should use prepared queries with real parameters to avoid SQL injection.
Finally, you should consider altering your dates (or adding a column updated by a trigger, setting up a view, whatever) to avoid inconsistent comparisons.
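To illustrate the point about real parameters, here is a sketch using MySQL's own PREPARE and EXECUTE statements on the second query from the question. The statement name, the variable names and the MD id value are made up; from PHP the same ? placeholders would normally be bound through mysqli's prepare() and bind_param() instead:

```sql
-- The values never become part of the SQL text, so they cannot change its structure.
PREPARE distinct_files FROM
  'SELECT DISTINCT FileName AS Files
     FROM InitialLog
    WHERE MDid = ?
      AND FileName NOT LIKE ?';

SET @mdid = 42;                      -- hypothetical MD id
SET @skip = '%Patient Names%';

EXECUTE distinct_files USING @mdid, @skip;

DEALLOCATE PREPARE distinct_files;
```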

Joining a count query mysql for performance

Have searched but can't find an answer which suits the exact needs for this mysql query.
I have the following queries on multiple tables to generate "stats" for an application:
SELECT COUNT(id) as count FROM `mod_**` WHERE `published`='1';
SELECT COUNT(id) as count FROM `mod_***` WHERE `published`='1';
SELECT COUNT(id) as count FROM `mod_****`;
SELECT COUNT(id) as count FROM `mod_*****`;
Pretty simple: each one just counts the rows, sometimes based on a status.
However, in the pursuit of performance I would love to get this into one query to save resources.
I'm using PHP to fetch this data with a simple mysql_fetch_assoc and reading $res['count'], if it makes a difference (PDO isn't guaranteed to be available, so plain old mysql here).
The overhead of sending a query and getting a single-row response is very small.
There is nothing to gain here by combining the queries.
If you don't have indexes yet an INDEX on the published column will greatly speed up the first two queries.
You can use something like
SELECT SUM(published=1)
for some of that. MySQL will take the boolean result of published=1 and translate it to an integer 0 or 1, which can be summed up.
But it looks like you're dealing with MULTIPLE tables (if that's what the **, *** etc... are), in which case you can't really. You could use a UNION query, e.g.:
SELECT ...
UNION ALL
SELECT ...
UNION ALL
SELECT ...
etc...
That can be fired off as one single query to the DB, but it'll still execute each sub-query as its own query, and simply aggregate the individual result sets into one larger set.
Disagreeing with @Halcyon: I think there is an appreciable difference, especially if the MySQL server is on a different machine, as every single query uses at least one network packet.
I recommend you UNION the queries with a marker field to protect against the unexpected.
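A sketch of such a UNION with a marker column. The real table names are masked in the question, so mod_articles, mod_pages and mod_users are hypothetical stand-ins, as are the aliases src and cnt:

```sql
-- The literal in the first column says which table each count belongs to,
-- so the result does not depend on the order in which rows come back.
SELECT 'articles' AS src, COUNT(*) AS cnt FROM mod_articles WHERE published = '1'
UNION ALL
SELECT 'pages', COUNT(*) FROM mod_pages WHERE published = '1'
UNION ALL
SELECT 'users', COUNT(*) FROM mod_users;
```

In PHP the rows can then be keyed by src rather than by position.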
As @Halcyon said, there is not much to gain here. You can still chain several UNIONs to get all the results in one query.

Making an SQL query more efficient

I have a query that works, but it's taking at least 3 seconds to run so I think it can probably be faster. It's used to populate a list of new threads and show how many unread posts there are in each thread. I generate the query string before throwing it into $db->query_read(). In order to only grab results from valid forums, $ids is string with up to 50 values separated by commas.
The userthreadviews table has existed for 1 week and there are roughly 9,500 rows in it. I'm not sure if I need to set up a cron job to regularly clear out thread views more than a week old, or if I will be fine letting it grow.
Here's the query as it currently stands:
SELECT
`thread`.`title` AS 'r_title',
`thread`.`threadid` AS 'r_threadid',
`thread`.`forumid` AS 'r_forumid',
`thread`.`lastposter` AS 'r_lastposter',
`thread`.`lastposterid` AS 'r_lastposterid',
`forum`.`title` AS 'f_title',
`thread`.`replycount` AS 'r_replycount',
`thread`.`lastpost` AS 'r_lastpost',
`userthreadviews`.`replycount` AS 'u_replycount',
`userthreadviews`.`id` AS 'u_id',
`thread`.`postusername` AS 'r_postusername',
`thread`.`postuserid` AS 'r_postuserid'
FROM
`thread`
INNER JOIN
`forum`
ON (`thread`.`forumid` = `forum`.`forumid`)
LEFT JOIN
(`userthreadviews`)
ON (`thread`.`threadid` = `userthreadviews`.`threadid`
AND `userthreadviews`.`userid`=$userid)
WHERE
`thread`.`forumid` IN($ids)
AND `thread`.`visible`=1
AND `thread`.`lastpost`> time() - 604800
ORDER BY `thread`.`lastpost` DESC LIMIT 0, 30
An alternate query that joins the post table (to only show threads where user has posted) is actually twice as fast, so I think there's got to be something in here that could be changed to speed it up. Could someone provide some advice?
Edit: Sorry, I had put the EXPLAIN in front of the alternate query. Here is the correct output:
As Requested, here is the output generated by EXPLAIN SELECT:
Have a look at the mysql explain statement. It gives you a execution plan of your query.
Once you know the plan, you can check if you have got a index on the fields involved in the plan. If not, create them.
Perhaps the plan reveals details about how the query can be written in another way, such that the query will be more optimized.
Your queries are slow because no indexes are used for the joins and the WHERE clause (key = NULL in the EXPLAIN output). You should index those columns like this:
CREATE INDEX thread_forumid_index ON thread(forumid);
CREATE INDEX userthreadviews_forumid_index ON userthreadviews(forumid);
Try to index the table forumid if it is not indexed
Suggestions:
move the conditions from the WHERE clause to the JOIN clause
put the JOIN with the conditions before the other JOIN
make sure you have proper indexes and that they are being used in the query (create the ones you'll need... too many indexes can be as bad as too few)
Here is my suggestion for the query:
SELECT
`thread`.`title` AS 'r_title',
`thread`.`threadid` AS 'r_threadid',
`thread`.`forumid` AS 'r_forumid',
`thread`.`lastposter` AS 'r_lastposter',
`thread`.`lastposterid` AS 'r_lastposterid',
`forum`.`title` AS 'f_title',
`thread`.`replycount` AS 'r_replycount',
`thread`.`lastpost` AS 'r_lastpost',
`userthreadviews`.`replycount` AS 'u_replycount',
`userthreadviews`.`id` AS 'u_id',
`thread`.`postusername` AS 'r_postusername',
`thread`.`postuserid` AS 'r_postuserid'
FROM
`thread`
INNER JOIN (`forum`)
ON ((`thread`.`visible` = 1)
AND (`thread`.`lastpost` > $time)
AND (`thread`.`forumid` IN ($ids))
AND (`thread`.`forumid` = `forum`.`forumid`))
LEFT JOIN (`userthreadviews`)
ON ((`thread`.`threadid` = `userthreadviews`.`threadid`)
AND (`userthreadviews`.`userid` = $userid))
ORDER BY
`thread`.`lastpost` DESC
LIMIT
0, 30
These are good candidates to be indexed:
- `forum`.`forumid`
- `userthreadviews`.`threadid`
- `userthreadviews`.`userid`
- `thread`.`forumid`
- `thread`.`threadid`
- `thread`.`visible`
- `thread`.`lastpost`
It seems you already have lots of indexes... so, make sure you keep the ones you really need and remove the useless ones.
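As a sketch of how those candidates might be combined into composite indexes rather than seven single-column ones (the index names are hypothetical, and the existing indexes should be checked first with SHOW INDEX):

```sql
-- Covers the visible = 1 filter plus the lastpost range and ORDER BY.
CREATE INDEX thread_visible_lastpost_ix ON thread (visible, lastpost);

-- Covers the LEFT JOIN lookup by user and thread.
CREATE INDEX userthreadviews_user_thread_ix ON userthreadviews (userid, threadid);

-- Confirm what is already there before keeping or dropping anything.
SHOW INDEX FROM thread;
SHOW INDEX FROM userthreadviews;
```

Re-running EXPLAIN afterwards shows whether the optimizer actually picks them up.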

Optimizing a PHP page: MySQL bottleneck

I have a page that is taking 37 seconds to load. While it is loading it pegs MySQL's CPU usage through the roof. I did not write the code for this page and it is rather convoluted so the reason for the bottleneck is not readily apparent to me.
I profiled it (using kcachegrind) and find that the bulk of the time on the page is spent doing MySQL queries (90% of the time is spent in 25 different mysql_query calls).
The queries take the form of the following with the tag_id changing on each of the 25 different calls:
SELECT * FROM tbl_news WHERE news_id
IN (select news_id from
tbl_tag_relations WHERE tag_id = 20)
Each query is taking around 0.8 seconds to complete with a few longer delays thrown in for good measure... thus the 37 seconds to completely load the page.
My question is, is it the way the query is formatted with that nested select that is causing the problem? Or could it be any one of a million other things? Any advice on how to approach tackling this slowness is appreciated.
Running EXPLAIN on the query gives me this (but I'm not clear on the impact of these results... the NULL on primary key looks like it would be bad, yes? The number of results returned seems high to me as well as only a handful of results are returned in the end):
1 PRIMARY tbl_news ALL NULL NULL NULL NULL 1318 Using where
2 DEPENDENT SUBQUERY tbl_tag_relations ref FK_tbl_tag_tags_1 FK_tbl_tag_tags_1 4 const 179 Using where
I've addressed this point in Database Development Mistakes Made by App Developers. Basically, favour joins over aggregation. IN isn't aggregation as such, but the same principle applies. A good optimizer will make these two queries equivalent in performance:
SELECT * FROM tbl_news WHERE news_id
IN (select news_id from
tbl_tag_relations WHERE tag_id = 20)
and
SELECT tn.*
FROM tbl_news tn
JOIN tbl_tag_relations ttr ON ttr.news_id = tn.news_id
WHERE ttr.tag_id = 20
as I believe Oracle and SQL Server both do but MySQL doesn't. The second version is basically instantaneous. With hundreds of thousands of rows I did a test on my machine and got the first version to sub-second performance by adding appropriate indexes. The join version with indexes is basically instantaneous but even without indexes performs OK.
By the way, the above syntax I use is the one you should prefer for doing joins. It's clearer than putting them in the WHERE clause (as others have suggested) and the above can do certain things in an ANSI SQL way with left outer joins that WHERE conditions can't.
So I would add indexes on the following:
tbl_news (news_id)
tbl_tag_relations (news_id)
tbl_tag_relations (tag_id)
and the query will execute almost instantaneously.
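A sketch of the corresponding statements, assuming news_id is not already the primary key of tbl_news (if it is, the first index is redundant). The index names are hypothetical, and the composite index on tbl_tag_relations is a variation on the two single-column indexes listed above:

```sql
-- Lets the join find each news row directly by news_id.
CREATE INDEX tbl_news_news_id_ix ON tbl_news (news_id);

-- Finds all rows for a tag and supplies news_id from the index itself.
CREATE INDEX tbl_tag_relations_tag_news_ix ON tbl_tag_relations (tag_id, news_id);

-- Check that the join version now uses both indexes.
EXPLAIN
SELECT tn.*
  FROM tbl_news tn
  JOIN tbl_tag_relations ttr ON ttr.news_id = tn.news_id
 WHERE ttr.tag_id = 20;
```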
Lastly, don't use * to select all the columns you want. Name them explicitly. You'll get into less trouble as you add columns later.
The SQL Query itself is definitely your bottleneck. The query has a sub-query in it, which is the IN(...) portion of the code. This is essentially running two queries at once. You can likely halve (or more!) your SQL times with a JOIN (similar to what d03boy mentions above) or a more targeted SQL query. An example might be:
SELECT *
FROM tbl_news, tbl_tag_relations
WHERE tbl_tag_relations.tag_id = 20 AND
tbl_news.news_id = tbl_tag_relations.news_id
To help SQL run faster you also want to try to avoid using SELECT *, and only select the information you need; also put a limiting statement at the end. eg:
SELECT news_title, news_body
...
LIMIT 5;
You also will want to look into the database schema itself. Make sure you are indexing all of the commonly referred to columns so that the queries will run faster. In this case, you probably want to check your news_id and tag_id fields.
Finally, you will want to take a look at the PHP code and see if you can make one single all-encompassing SQL query instead of iterating through several separate queries. If you post more code we can help with that, and it will probably be the single greatest time savings for your posted problem. :)
If I understand correctly, this is just listing the news stories for a specific set of tags.
First of all, you really shouldn't ever SELECT *
Second, this can probably be accomplished within a single query, thus reducing the overhead cost of multiple queries. It seems like it is getting fairly trivial data so it could be retrieved within a single call instead of 20.
A better approach to using IN might be to use a JOIN with a WHERE condition instead. When using an IN it will basically be a lot of OR statements.
Your tbl_tag_relations should definitely have an index on tag_id
select *
from tbl_news, tbl_tag_relations
where
tbl_tag_relations.tag_id = 20 and
tbl_news.news_id = tbl_tag_relations.news_id
limit 20
I think this gives the same results, but I'm not 100% sure. Sometimes simply limiting the results helps.
Unfortunately MySQL doesn't do very well with uncorrelated subqueries like the one in your case. The plan is basically saying that for every row of the outer query, the inner query will be performed. This will get out of hand quickly. Rewriting as a plain old join, as others have mentioned, will work around the problem but may then cause the undesired effect of duplicate rows.
For instance the original query would return 1 row for each qualifying row in the tbl_news table but this query:
SELECT news_id, name, blah
FROM tbl_news n
JOIN tbl_tag_relations r ON r.news_id = n.news_id
WHERE r.tag_id IN (20,21,22)
would return 1 row for each matching tag. You could stick DISTINCT on there which should only have a minimal performance impact depending on the size of the dataset.
Not to troll too badly, but most other databases (PostgreSQL, Firebird, Microsoft, Oracle, DB2, etc) would handle the original query as an efficient semi-join. Personally I find the subquery syntax to be much more readable and easier to write, especially for larger queries.
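One way to keep the one-row-per-news-item behaviour without DISTINCT is to write the semi-join explicitly with EXISTS. This is only a sketch using the tables from the question; how well it performs depends on the MySQL version and on the indexes available:

```sql
-- Each tbl_news row is returned at most once, however many of the tags match.
SELECT n.news_id
  FROM tbl_news n
 WHERE EXISTS (SELECT 1
                 FROM tbl_tag_relations r
                WHERE r.news_id = n.news_id
                  AND r.tag_id IN (20, 21, 22));
```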
