Too heavy query execution - php

I have a database with 187,840 rows.
When I execute this query, I get the message "Query execution was interrupted".
The query that is too heavy:
SELECT days.day,count(U.sig_name) as number
FROM days
LEFT JOIN linked U ON
days.day = date(timestamp)
AND
U.sig_name REGEXP "^Tester"
GROUP BY days.day;
What is the solution?

This is your query:
select days.day, count(U.sig_name) as number
from days left join
linked U
on days.day = date(timestamp) AND U.sig_name REGEXP "^Tester"
group by days.day;
You have a problem because of the function call around timestamp: wrapping the column in date() prevents MySQL from using an index on it. You might find this version better:
select days.day,
(select count(*)
from linked u
where u.timestamp >= days.day and u.timestamp < date_add(days.day, interval 1 day) and
u.sig_name like 'Tester%'
) as number
from days;
For performance, you want a composite index on linked(timestamp, sig_name). This eliminates the outer aggregation (the aggregation uses the index instead), and allows an index to be used for the matching.
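As a rough check that the range form counts the same rows as the date() join, here is a minimal Python sketch using an in-memory SQLite database as a stand-in for MySQL. The sample rows and the index name idx_linked_ts_sig are made up for illustration, and SQLite's date(day, '+1 day') replaces MySQL's date_add. It only demonstrates equivalence of the results, not MySQL's query plan.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE days (day TEXT PRIMARY KEY);
    CREATE TABLE linked (timestamp TEXT, sig_name TEXT);
    CREATE INDEX idx_linked_ts_sig ON linked (timestamp, sig_name);
    INSERT INTO days VALUES ('2014-01-01'), ('2014-01-02'), ('2014-01-03');
    INSERT INTO linked VALUES
        ('2014-01-01 08:00:00', 'Tester A'),
        ('2014-01-01 23:59:59', 'Tester B'),
        ('2014-01-01 12:00:00', 'Other'),
        ('2014-01-02 00:00:00', 'Tester C');
""")

# Original shape: a function call wraps the timestamp column.
join_form = """
    SELECT days.day, COUNT(u.sig_name)
    FROM days
    LEFT JOIN linked u
      ON days.day = date(u.timestamp) AND u.sig_name LIKE 'Tester%'
    GROUP BY days.day
    ORDER BY days.day
"""

# Rewritten shape: a half-open range on the bare column.
range_form = """
    SELECT days.day,
           (SELECT COUNT(*)
            FROM linked u
            WHERE u.timestamp >= days.day
              AND u.timestamp < date(days.day, '+1 day')
              AND u.sig_name LIKE 'Tester%')
    FROM days
    ORDER BY days.day
"""

join_counts = conn.execute(join_form).fetchall()
range_counts = conn.execute(range_form).fetchall()
print(join_counts)
print(range_counts)
```

Both queries should return one row per day, including days with a count of zero.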

You can process a large result set in chunks using LIMIT:
$limit_size = 10000;
$flag_done = false;
for ($i = 0; !$flag_done; $i++) {
    // LIMIT only accepts literal numbers, so compute the offset in PHP
    $offset = $i * $limit_size;
    $queryString = "SELECT days.day, count(U.sig_name) as number from days left join linked U on days.day = date(timestamp) AND U.sig_name REGEXP '^Tester' group by days.day order by days.day LIMIT $offset, $limit_size";
    $result = mysql_query($queryString, $db);
    if ($result && mysql_num_rows($result) > 0) {
        [WHAT YOU WANT TO DO WITH RESULT HERE]
    } else {
        $flag_done = true; // the query failed or there are no rows left
    }
}

Related

How to retrieve record in group by with join table depend on MAX?

This is my query in the model:
return $this->db->join('tbl_customer', 'tbl_customer.cus_code = tbl_cus_account.custcode')
->where("status", 1)
->where("DATE_FORMAT(nextbillingdate,'%Y-%m') <= ", date('Y-m'))
->select('*,MAX(issuewithmain) AS issuewithmain, SUM(monthlyfee) AS monthlyfee, count(accountcode) as rows')
->group_by('custcode')
->get('tbl_cus_account');
Below are my tables for the join query:
The result I want is shown in the image below:
Your query can be something like this:
SELECT t.*, t2.rows, t2.monthlyfee
FROM tbl_cus_account t
JOIN (
    SELECT MAX(issuewithmain) AS issuewithmain, custcode, count(accountcode) AS rows, SUM(monthlyfee) AS monthlyfee
    FROM tbl_cus_account
    JOIN tbl_customer ON tbl_customer.cus_code = tbl_cus_account.custcode
    WHERE status = 1
      AND DATE_FORMAT(nextbillingdate,"%Y-%m") <= DATE_FORMAT(now(),"%Y-%m")
    GROUP BY custcode
) t2 ON t.issuewithmain = t2.issuewithmain AND t.custcode = t2.custcode
Here you make an extra join to the same table, so that you get only the records that match your MAX(issuewithmain).
You need to translate this query to CodeIgniter and attach each WHERE condition to the corresponding table.
Maybe something like this (I haven't tested it):
UPDATE
return $this->db
    ->join('(SELECT MAX(issuewithmain) AS issuewithmain, custcode, count(accountcode) as rows, SUM(monthlyfee) AS monthlyfee from tbl_cus_account where status = 1 AND DATE_FORMAT(nextbillingdate,"%Y-%m") <= "'.date('Y-m').'" GROUP BY custcode) t2', 'tbl_cus_account.issuewithmain = t2.issuewithmain AND t2.custcode = tbl_cus_account.custcode')
    ->select('tbl_cus_account.*, t2.monthlyfee, t2.rows')
    ->join('tbl_customer', 'tbl_customer.cus_code = t2.custcode')
    ->group_by('t2.custcode')
    ->get('tbl_cus_account');

Optimize sql query PHP Phalcon

I have this code:
if(($banners = $this->app->cache5min->get($key_cache)) === NULL) {
$find_items = array();
$banners = $this->app->db->fetchAll("
SELECT b.id,b.active, b.richmedia, b.rich_position, b.ip_limit,
b.cookie_limit, b.cookie_interval, b.day_limit, b.limit_interval,
b.frequency, b.close_btn, b.use_geo,
bpl.s_x, bpl.s_y
FROM bs_items_places AS bp
LEFT JOIN bs_items AS b ON b.id = bp.item_id
LEFT JOIN bs_places AS bpl ON bpl.id = bp.place_id
WHERE b.active = 1
AND b.date_start < '".$dateNow."'
AND b.date_stop > '".$dateNow."'
AND bpl.active = 1
AND bp.place_id = {$idpl}
AND (IF((time_from!='00:00:00' AND time_to!='00:00:00'),
(time_from<='".$dateTimeNow."'
AND time_to >='".$dateTimeNow."'), 1)
)
GROUP BY b.id
ORDER BY b.frequency DESC, b.day_limit DESC
");
foreach($banners AS $bnr) {
$find_items[$bnr['id']] = $bnr;
}
$banners = $find_items;
$this->app->cache5min->save($key_cache,$banners);
}
//echo '<pre>'; print_r($banners); exit;
if(!$banners || !count($banners)) { return $this->getDefaultBanner($idpl,$x); }
When I remove the first statement, CPU usage is fine. When I put it back, CPU usage goes to 100%, and it is the MySQL process that is responsible. How can I optimize the SELECT query?
First step is to get rid of LEFT, unless you need it. This may open up more optimization options.
This construct is essentially un-optimizable:
WHERE start < ...
AND end > ...
There is an "explode-implode" problem. First, the number of rows explodes due to the JOINs, then it implodes due to the GROUP BY. But... Meanwhile, there are no "aggregates" such as COUNT() or SUM(). So this smells like a poorly formed query -- Why do you need to do the GROUP BY? If you are getting 'duplicate' rows, then you have a worse problem -- what happens in the columns that are not duplicated? You will get random values. See what happens without the GROUP BY.
This composite index on bp may help:
INDEX(place_id, item_id)

Best way to sum and separate by date in MySQL with/without PHP

Hi, I have a table with this information:
What I want to do, either in PHP with a while loop or purely in MySQL, is to SUM(time_used) over the rows with status 44 until a row with status 55 is reached. After that, the summing should start over from zero.
The first query should return 37 and the second 76 (keep in mind that it should be universal, for an unlimited number of status 55 rows).
I thought of a way using time/date filtering and have this:
select sum(time_used) as sumed
from timelog
where start_time > (select end_time from timelog where (status='55')
ORDER BY id DESC LIMIT 1) ORDER BY id DESC
But this works only for the last combination of 44 and 55.
I know I will need two-way filtering (< end_time and > end_time) so that it works for all cases, but I can't think of a way to do it in PHP.
Can anyone help me?
EDIT:
sqlfiddle whoever want it:
http://sqlfiddle.com/#!2/33820/2/0
There are two ways to do it: plain SQL or PHP. If you process thousands of rows, it may be worth choosing between the two by testing performance.
Plain SQL
select project_id, task_id, user_id, sum(time_used) as time_used,
min(start_time) as start_time, max(end_time) as end_time, max(comment) as comment from
(select t.id, t.project_id, t.task_id, t.user_id, t.time_used,
count(t2.id) as count55, t.start_time, t.end_time, t.comment
from timelog t
left join timelog t2 on t.id>t2.id and t2.status=55 and t.task_id=t2.task_id
group by t.id) as t
group by count55;
I assume here that a task can belong to one user only
SQL and PHP
$link = mysqli_connect( ... );
$query = "select id, project_id, task_id, user_id, time_used, start_time, end_time, status
from timelog order by id";
$result = mysqli_query($link, $query);
$table = array();
$time_used = 0;
$start_sum = true;
$i = 0;
while($row = mysqli_fetch_assoc($result)){
    if($start_sum){
        // first row of a new group: copy it as the starting point
        $table[$i] = $row;
        $start_sum = false;
    } else {
        $table[$i]['time_used'] += $row['time_used'];
        // keep the latest end time rather than adding times together
        $table[$i]['end_time'] = $row['end_time'];
    }
    // the query selects the column as "status", so read it under that name
    if($row['status'] == 55){
        $i++;
        $start_sum = true;
    }
}
If two tasks can run simultaneously, solution 1 will work, but solution 2 will need to be adapted to take this into account.
Here is my interpretation:
http://sqlfiddle.com/#!2/33820/45
set @n=0;
select project_id, task_id, user_id,sum(time_used) from (
SELECT time_used,project_id, task_id, user_id,
@n:=if(status=55,@n+1,@n),
if(status=55,-1,@n) as grouper FROM timelog
) as t
where grouper>-1
group by grouper;
I'm neither a php nor MySQL programmer, but I can explain the logic you want to follow. You can then code it.
First, query your db and return the results to php.
Next, set two sum variables to 0.
Start looping through your query results. Increment the first sum variable until you reach the first row that has status 55. Once you do, start incrementing the second variable.
The tricky part will be to sort your query by the row number of the table. Here is a link that will help you with that part.
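The looping logic described above can be sketched as follows. This is a minimal Python illustration rather than PHP. The function name sum_until_marker and the sample rows are hypothetical, and it assumes the rows are already ordered by id and that status 55 rows are not themselves added to the sum. It keeps one total per run instead of two fixed variables, so it handles any number of status 55 rows.

```python
def sum_until_marker(rows, work_status=44, end_status=55):
    """Return one total per run of work rows; each end row closes a run."""
    totals = []
    running = 0
    open_run = False
    for status, time_used in rows:
        if status == end_status:
            # A status 55 row closes the current run and resets the sum.
            totals.append(running)
            running = 0
            open_run = False
        elif status == work_status:
            running += time_used
            open_run = True
    if open_run:
        # Trailing status 44 rows with no closing 55 row yet.
        totals.append(running)
    return totals


# Hypothetical (status, time_used) rows, already ordered by id.
sample = [(44, 10), (44, 27), (55, 0), (44, 40), (44, 36), (55, 0), (44, 5)]
totals = sum_until_marker(sample)
print(totals)
```

With this sample, the first two totals cover the two closed runs and the last one is the still-open run.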

How To Optimize PostgreSQL generate_series function

I have a query that uses the PostgreSQL generate_series function, but with large amounts of data the query can be slow. An example of the code that generates the query is below:
$yesterday = date('Y-m-d',(strtotime ( '-1 day' ) ));
$query = "
WITH interval_step AS (
SELECT gs::date AS interval_dt, random() AS r
FROM generate_series('$yesterday'::timestamp, '2015-01-01', '1 day') AS gs)
SELECT articles.article_id, article_title, article_excerpt, article_author, article_link, article_default_image, article_date_published, article_bias_avg, article_rating_avg
FROM development.articles JOIN interval_step ON articles.article_date_added::date=interval_step.interval_dt ";
if (isset($this -> registry -> get['category'])) {
$query .= "
JOIN development.feed_articles ON articles.article_id = feed_articles.article_id
JOIN development.rss_feeds ON feed_articles.rss_feed_id = rss_feeds.rss_feed_id
JOIN development.news_categories ON rss_feeds.news_category_id = news_categories.news_category_id
WHERE news_category_name = $1";
$params = array($category_name);
$query_name = 'browse_category';
}
$query .= " ORDER BY interval_step.interval_dt DESC, RANDOM() LIMIT 20;";
This series looks only for content going back one day and sorts the results in random order. My question is: in what ways can generate_series be optimized to improve performance?
You don't need that generate_series at all. And do not concatenate query strings. Avoid it by making the parameter an empty string (or null) if it is not set:
if (!isset($this -> registry -> get['category']))
$category_name = '';
$query = "
select articles.article_id, article_title, article_excerpt, article_author, article_link, article_default_image, article_date_published, article_bias_avg, article_rating_avg
from
development.articles
inner join
development.feed_articles using (article_id)
inner join
development.rss_feeds using (rss_feed_id)
inner join
development.news_categories using (news_category_id)
where
(news_category_name = $1 or $1 = '')
and articles.article_date_added >= current_date - 1
order by
date_trunc('day', articles.article_date_added) desc,
random()
limit 20;
";
$params = array($category_name);
Passing $yesterday to the query is also not necessary as it can be done entirely in SQL.
If $category_name is empty it will return all categories:
(news_category_name = $1 or $1 = '')
Imho, try removing that random() in your order by statement. It probably has a much larger performance impact than you think. As things are it's probably ordering the entire set by interval_dt desc, random(), and then picking the top 20. Not advisable...
Try fetching e.g. 100 rows ordered by interval_dt desc instead, then shuffle them per the same logic, and pick 20 in your app. Or wrap the entire thing in a subquery limit 100, and re-order accordingly along the same lines.
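The app-side shuffle suggested above might look like this minimal Python sketch. The function name pick_random_recent and the sample data are hypothetical. It assumes the query returned up to 100 rows already sorted by day, newest first, and it shuffles only within each day to mimic ORDER BY day DESC, random(). Which 100 rows the database returns within a day is not random, so this only approximates the original ordering.

```python
import random
from datetime import date, timedelta
from itertools import groupby


def pick_random_recent(rows, limit=20, rng=random):
    """rows: (article_id, day) tuples already sorted by day, newest first."""
    picked = []
    for _, same_day in groupby(rows, key=lambda row: row[1]):
        bucket = list(same_day)
        rng.shuffle(bucket)  # random order within one day only
        picked.extend(bucket)
        if len(picked) >= limit:
            break  # older days cannot reach the top `limit` rows
    return picked[:limit]


# Hypothetical result set: 100 rows spread over three days, newest first.
today = date(2015, 1, 10)
candidates = [(i, today - timedelta(days=i // 40)) for i in range(100)]
chosen = pick_random_recent(candidates, rng=random.Random(0))
print(chosen)
```

Sorting 100 rows in the application is cheap, while the database no longer has to assign a random value to every matching row before applying the limit.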

MySQL data - Count number of phone calls at the same time

I have a MySQL table with phone calls. Every row means one phone call.
Columns are:
start_time
start_date
duration
I need to get the maximum number of phone calls in progress at the same time. This is needed for dimensioning the telephone exchange.
My solution is to create two timestamp columns timestamp_start and timestamp_end. Then I run a loop second by second, day by day and ask MySQL something like:
SELECT Count(*) FROM tbl WHERE start_date IN (thisday, secondday) AND "this_second_checking" BETWEEN timestamp_start AND timestamp_end;
It's quite slow.
Is there a better solution? Thank you!
EDIT - I use this solution and it gives me proper results. It uses the dibi SQL layer - http://dibiphp.com/cs/quick-start .
$starts = dibi::query("SELECT ts_start, ts_end FROM " . $tblname . " GROUP BY ts_start");
if(count($starts) > 0):
foreach ($starts as $row) {
if(isset($result)) unset($result);
$result = dibi::query('SELECT Count(*) FROM ' . $tblname . ' WHERE "'.$row->ts_start.'" BETWEEN ts_start AND ts_end');
$num = $result->fetchSingle();
if($total_max < $num):
$total_max = $num;
endif;
}
endif;
echo "Total MAX: " . $total_max;
Instead of running it second by second, you should for each row (phonecall) see what other phone calls were active at that time. After that you group all of the results by the row's ID, and check which has the maximum count. So basically something like this:
SELECT MAX(calls.count)
FROM (
SELECT a.id, COUNT(*) AS count
FROM tbl AS a
INNER JOIN tbl AS b ON (
(b.timestamp_start BETWEEN a.timestamp_start AND a.timestamp_end)
OR
(b.timestamp_end BETWEEN a.timestamp_start AND a.timestamp_end)
)
GROUP BY a.id
) AS calls
Creating an index on the timestamp columns will help as well.
I'm going to add something to @reko_t's answer. I think there is one more case to consider:
Calls that started before and ended after the call in question, i.e. calls that completely overlap it
So, how about:
SELECT MAX(calls.count)
FROM (
SELECT a.id, COUNT(*) AS count
FROM tbl AS a
INNER JOIN tbl AS b ON (
(b.timestamp_start BETWEEN a.timestamp_start AND a.timestamp_end)
OR
(b.timestamp_end BETWEEN a.timestamp_start AND a.timestamp_end)
OR
(b.timestamp_start <= a.timestamp_start AND b.timestamp_end >= a.timestamp_end)
)
GROUP BY a.id
) AS calls
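As an alternative to the self-joins above (a different technique, not taken from these answers), the peak can also be computed in application code with a sweep over start and end events. This is a minimal Python sketch. The function name max_concurrent_calls and the sample intervals are hypothetical, and it treats end times as inclusive, like BETWEEN, so a call ending at the same instant another starts counts as overlapping.

```python
def max_concurrent_calls(calls):
    """calls: iterable of (start, end) pairs with comparable timestamps."""
    events = []
    for start, end in calls:
        events.append((start, 0))  # 0 sorts a start before an end at the same instant
        events.append((end, 1))
    events.sort()

    active = 0
    peak = 0
    for _, kind in events:
        if kind == 0:
            active += 1
            peak = max(peak, active)
        else:
            active -= 1
    return peak


# Hypothetical calls as (timestamp_start, timestamp_end) in seconds.
sample_calls = [(0, 10), (5, 15), (10, 20), (30, 40)]
peak = max_concurrent_calls(sample_calls)
print(peak)
```

The sweep sorts the events once and then makes a single pass, so it avoids comparing every call with every other call.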
How about:
SELECT MAX(callCount) FROM (SELECT COUNT(duration) AS callCount, CONCAT(start_date,start_time) AS callTime FROM tbl GROUP BY callTime) AS t
That would give you the max number of calls in a single "time". Assuming start_date and start_time are strings. If they're integer times, you could probably optimise it somewhat.
