Optimize sql query PHP Phalcon - php

I have this code:
if(($banners = $this->app->cache5min->get($key_cache)) === NULL) {
$find_items = array();
$banners = $this->app->db->fetchAll("
SELECT b.id,b.active, b.richmedia, b.rich_position, b.ip_limit,
b.cookie_limit, b.cookie_interval, b.day_limit, b.limit_interval,
b.frequency, b.close_btn, b.use_geo,
bpl.s_x, bpl.s_y
FROM bs_items_places AS bp
LEFT JOIN bs_items AS b ON b.id = bp.item_id
LEFT JOIN bs_places AS bpl ON bpl.id = bp.place_id
WHERE b.active = 1
AND b.date_start < '".$dateNow."'
AND b.date_stop > '".$dateNow."'
AND bpl.active = 1
AND bp.place_id = {$idpl}
AND (IF((time_from!='00:00:00' AND time_to!='00:00:00'),
(time_from<='".$dateTimeNow."'
AND time_to >='".$dateTimeNow."'), 1)
)
GROUP BY b.id
ORDER BY b.frequency DESC, b.day_limit DESC
");
foreach($banners AS $bnr) {
$find_items[$bnr['id']] = $bnr;
}
$banners = $find_items;
$this->app->cache5min->save($key_cache,$banners);
}
//echo '<pre>'; print_r($banners); exit;
if(!$banners || !count($banners)) { return $this->getDefaultBanner($idpl,$x); }
When I remove the first statement, CPU usage is fine. When I put it back, CPU usage goes to 100%, and it is the MySQL process that consumes it. How can I optimize the SELECT query?

First step is to get rid of LEFT, unless you need it. This may open up more optimization options.
This construct is essentially un-optimizable:
WHERE start < ...
AND end > ...
There is an "explode-implode" problem. First, the number of rows explodes due to the JOINs, then it implodes due to the GROUP BY. Meanwhile, there are no "aggregates" such as COUNT() or SUM(), so this smells like a poorly formed query. Why do you need the GROUP BY? If you are getting 'duplicate' rows, then you have a worse problem: what happens in the columns that are not duplicated? You will get random values. See what happens without the GROUP BY.
This composite index on bp may help:
INDEX(place_id, item_id)
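A minimal sketch of how the suggestions above could be combined, assuming the table and column names from the question. The function name buildBannerQuery and the placeholder names are made up for illustration. The IF(...) test is rewritten as an OR chain that should be equivalent as long as time_from and time_to are never NULL, and the GROUP BY is dropped on the assumption that the calling code keeps keying the rows by id. The function only builds the SQL text and the parameters; running it through a prepared statement is left to whatever database layer is in use.

```php
<?php
// Hypothetical helper: returns [SQL text, bound parameters] for the banner lookup.
function buildBannerQuery(int $placeId, string $dateNow, string $timeNow): array
{
    // Plain JOINs instead of LEFT JOINs: the WHERE clause already requires
    // matching rows in bs_items and bs_places.
    $sql = "
        SELECT b.id, b.active, b.richmedia, b.rich_position, b.ip_limit,
               b.cookie_limit, b.cookie_interval, b.day_limit, b.limit_interval,
               b.frequency, b.close_btn, b.use_geo,
               bpl.s_x, bpl.s_y
        FROM bs_items_places AS bp
        JOIN bs_items  AS b   ON b.id = bp.item_id
        JOIN bs_places AS bpl ON bpl.id = bp.place_id
        WHERE bp.place_id = :place_id
          AND b.active = 1
          AND bpl.active = 1
          AND b.date_start < :date_a
          AND b.date_stop  > :date_b
          AND (time_from = '00:00:00' OR time_to = '00:00:00'
               OR (time_from <= :time_a AND time_to >= :time_b))
        ORDER BY b.frequency DESC, b.day_limit DESC";

    // Each named placeholder is used once, because native (non-emulated)
    // PDO prepared statements do not allow reusing the same name.
    $params = [
        ':place_id' => $placeId,
        ':date_a'   => $dateNow,
        ':date_b'   => $dateNow,
        ':time_a'   => $timeNow,
        ':time_b'   => $timeNow,
    ];

    return [$sql, $params];
}

[$sql, $params] = buildBannerQuery(7, '2024-01-15 10:00:00', '10:00:00');
echo $sql, PHP_EOL;
print_r($params);
```

Binding the values instead of concatenating them also avoids quoting problems with $idpl and the date strings.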

Related

Too heavy query execution

I have a database table with 187,840 rows.
When I execute this query, I get the message "Query execution was interrupted".
The query that is too heavy:
SELECT days.day,count(U.sig_name) as number
FROM days
LEFT JOIN linked U ON
days.day = date(timestamp)
AND
U.sig_name REGEXP "^Tester"
GROUP BY days.day;
What is the solution?
This is your query:
select days.day, count(U.sig_name) as number
from days left join
linked U
on days.day = date(timestamp) AND U.sig_name REGEXP "^Tester"
group by days.day;
You have a problem because of the function call around timestamp. You might find this version better:
select days.day,
(select count(*)
from linked u
where u.timestamp >= days.day and u.timestamp < date_add(days.day, interval 1 day) and
u.sig_name like 'Tester%'
) as number
from days;
For performance, you want a composite index on linked(timestamp, sig_name). This eliminates the outer aggregation (the aggregation uses the index instead), and allows an index to be used for the matching.
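The same idea applies when a single day is counted from PHP: compare the bare timestamp column against two constants instead of wrapping it in date(). A small sketch, where the helper name dayBounds is made up for illustration:

```php
<?php
// Hypothetical helper: half-open [start, end) bounds covering one calendar day.
function dayBounds(string $day): array
{
    $start = new DateTimeImmutable($day . ' 00:00:00');
    $end = $start->modify('+1 day');
    return [$start->format('Y-m-d H:i:s'), $end->format('Y-m-d H:i:s')];
}

[$from, $to] = dayBounds('2024-02-29');
// Bind $from and $to into: WHERE timestamp >= ? AND timestamp < ?
echo $from, ' .. ', $to, PHP_EOL; // 2024-02-29 00:00:00 .. 2024-03-01 00:00:00
```

The upper bound is exclusive, so rows stamped exactly at midnight of the next day are not counted twice.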
You can handle massive data using LIMIT:
$limit_size = 10000;
$flag_done = false;
for ($i = 0; ! $flag_done; $i++) {
// LIMIT does not accept an expression, so compute the offset in PHP
$offset = $i * $limit_size;
$queryString = "SELECT days.day,count(U.sig_name) as number from days left join linked U on days.day = date(timestamp) AND U.sig_name REGEXP '^Tester' group by days.day LIMIT $offset, $limit_size";
$result = mysql_query($queryString, $db);
// stop on error or when a page comes back empty
if($result && mysql_num_rows($result) > 0){
[WHAT YOU WANT TO DO WITH RESULT HERE]
} else $flag_done = true;
}

Best way to sum and separate by date in MySQL with/without PHP

Hi, I have a table with this information:
What I want to do, either in PHP with a while loop or in MySQL alone, is to SUM(time_used) over the rows with status 44 until a row with status 55 is reached. After that, the summing should start over.
The first sum should return 37 and the second 76 (keep in mind it should be universal, working for any number of status 55 rows).
I thought of a way using time/date filtering and have this:
select sum(time_used) as sumed
from timelog
where start_time > (select end_time from timelog where (status='55')
ORDER BY id DESC LIMIT 1) ORDER BY id DESC
But this works only for the last combination of 44 and 55.
I know I will need two-way filtering (< end_time and > end_time) so that it works for all cases, but I can't think of a way to do it in PHP.
Can anyone help me?
EDIT:
An sqlfiddle for whoever wants it:
http://sqlfiddle.com/#!2/33820/2/0
There are two ways to do it: plain SQL or PHP. If you process thousands of rows, it may be worth testing the performance of both before choosing.
Plain SQL
select project_id, task_id, user_id, sum(time_used) as time_used,
min(start_time) as start_time, max(end_time) as end_time, max(comment) as comment from
(select t.id, t.project_id, t.task_id, t.user_id, t.time_used,
count(t2.id) as count55, t.start_time, t.end_time, t.comment
from timelog t
left join timelog t2 on t.id>t2.id and t2.status=55 and t.task_id=t2.task_id
group by t.id) as t
group by count55;
I assume here that a task can belong to one user only
SQL and PHP
$link = mysqli_connect( ... );
$query = "select id, project_id, task_id, user_id, time_used, start_time, end_time, status
from timelog order by id";
$result = mysqli_query($link, $query);
$table = array();
$time_used = 0;
$start_sum = true;
$i = 0;
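while($row = mysqli_fetch_assoc($result)){
if($start_sum){
// first row of a new group: copy it as the starting point
$table[$i] = $row;
$start_sum = false;
} else {
$table[$i]['time_used'] += $row['time_used'];
// keep the latest end_time rather than adding timestamps together
$table[$i]['end_time'] = $row['end_time'];
}
// a status 55 row closes the current group
if($row['status'] == 55){
$i++;
$start_sum = true;
}
}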
If two tasks can run simultaneously, solution 1 will work, but solution 2 will need to be adapted to take this into account.
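One possible adaptation for interleaved tasks, as a sketch only: keep one running total per task_id and close it when that task's status 55 row arrives. The function name and the sample rows are made up for illustration, and the closing row's own time_used is counted in its group, as in the loop above.

```php
<?php
// Sums time_used per task, closing a group each time a status 55 row is seen.
// Rows must already be ordered by id.
function sumUntilStatus55(array $rows): array
{
    $open = [];    // task_id => running total of the group still in progress
    $closed = [];  // finished groups, in the order they were closed
    foreach ($rows as $row) {
        $task = $row['task_id'];
        $open[$task] = ($open[$task] ?? 0) + $row['time_used'];
        if ((int) $row['status'] === 55) {
            $closed[] = ['task_id' => $task, 'time_used' => $open[$task]];
            unset($open[$task]);
        }
    }
    return ['closed' => $closed, 'open' => $open];
}

// Made-up sample rows with two tasks interleaved.
$rows = [
    ['task_id' => 1, 'status' => 44, 'time_used' => 5],
    ['task_id' => 2, 'status' => 44, 'time_used' => 4],
    ['task_id' => 1, 'status' => 44, 'time_used' => 10],
    ['task_id' => 1, 'status' => 55, 'time_used' => 3],
    ['task_id' => 2, 'status' => 55, 'time_used' => 6],
    ['task_id' => 1, 'status' => 44, 'time_used' => 7],
];
print_r(sumUntilStatus55($rows));
```

With these rows, task 1 closes with 18, task 2 closes with 10, and task 1 is left with an open group of 7 that has not yet seen a status 55 row.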
Here is my interpretation:
http://sqlfiddle.com/#!2/33820/45
set @n=0;
select project_id, task_id, user_id,sum(time_used) from (
SELECT time_used,project_id, task_id, user_id,
@n:=if(status=55,@n+1,@n),
if(status=55,-1,@n) as grouper FROM timelog
) as t
where grouper>-1
group by grouper;
I'm neither a php nor MySQL programmer, but I can explain the logic you want to follow. You can then code it.
First, query your db and return the results to php.
Next, set two sum variables to 0.
Start looping through your query results. Increment the first sum variable until you reach the first row that has status 55. Once you do, start incrementing the second variable.
The tricky part will be to sort your query by the row number of the table. Here is a link that will help you with that part.

Error handling in the following sql query

Fiddle with tables here
I'm using the following SQL with the tables in the fiddle to check whether a user has reached the borrowing limit. The problem is that if an invalid item number is supplied it returns NULL, and if a user has not borrowed any items it also returns NULL. This way, I cannot tell whether an invalid item number was supplied or the user simply has not borrowed any books. What would be a good way to check whether an invalid item number was supplied or a member has not borrowed anything under that category?
set @mId = 3; -- Has not borrowed anything till now.
set @id = 21; -- This item does not appear in the collection_db table and is therefore invalid.
set @country = 'US';
SELECT col1.id, col1.holder, col2.borrowMax maxLimit, count(lend.borrowedId) as `count`
FROM collection_db col1
INNER JOIN collection_db col2
ON col1.holder = col2.id
INNER JOIN lendings lend
ON col1.holder = lend.holder and col1.country = lend.country
WHERE col1.id = @id and col1.country = @country
AND col2.category = 10
AND lend.memId = @mId and lend.country = @country
The furthest I could get with the one query is (had to take out php and "country" vars for fiddle to work):
SELECT col1.id, col1.holder, col2.borrowMax maxLimit, count(lend.borrowedId) as `count`
,case when valid1.id is not null then 'true' else 'false' end as validId
FROM collection_db col1
INNER JOIN collection_db col2
ON col1.holder = col2.id
INNER JOIN lendings lend
ON col1.holder = lend.holder,(
Select Distinct a.id From collection_db a
Where a.id = 4) valid1
WHERE col1.id = 4
AND col2.category = 10
AND lend.memId = 1
You may have to do a preparatory query checking for a valid memId:
$theQuery = "SELECT DISTINCT memId FROM lendings WHERE memId = 1";
Then test it here:
if (mysql_num_rows(mysql_query($theQuery)) <= 0) { /* No memId exists */ }
else { /* Do big query here */ }
You can use a tableA LEFT JOIN tableB, which will return results for the tableA even if tableB has no matches and will return NULL values for those in tableB.
Unfortunately, I can't quite figure out where you need LEFT JOINS, but probably you want them in both places.
You also might have to reorder the tables if it is the first table that should be on the right side of a LEFT JOIN. You could use a RIGHT JOIN but it is less readable to me.
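A sketch of how a LEFT JOIN can separate the two cases. It uses an in-memory SQLite database through PDO purely so that it runs without a server (this assumes the pdo_sqlite extension is available), and the two tables are simplified, hypothetical stand-ins for collection_db and lendings. The member filter sits in the ON clause rather than in WHERE, so an existing item with no lendings still yields one row with a count of 0, while a missing item yields no row at all.

```php
<?php
// In-memory SQLite stands in for MySQL here; the schema is simplified and hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, borrow_max INTEGER)');
$pdo->exec('CREATE TABLE lendings (item_id INTEGER, mem_id INTEGER)');
$pdo->exec('INSERT INTO items VALUES (4, 2)');
$pdo->exec('INSERT INTO lendings VALUES (4, 1)');

function borrowStatus(PDO $pdo, int $itemId, int $memId): string
{
    $stmt = $pdo->prepare(
        'SELECT i.borrow_max, COUNT(l.item_id) AS cnt
           FROM items i
           LEFT JOIN lendings l ON l.item_id = i.id AND l.mem_id = :mem
          WHERE i.id = :item
          GROUP BY i.id, i.borrow_max'
    );
    $stmt->execute([':item' => $itemId, ':mem' => $memId]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        return 'invalid item';            // no such item: no row came back
    }
    return (int) $row['cnt'] === 0
        ? 'nothing borrowed'              // item exists, member has no lendings
        : 'borrowed ' . $row['cnt'];
}

echo borrowStatus($pdo, 21, 3), PHP_EOL;
echo borrowStatus($pdo, 4, 3), PHP_EOL;
echo borrowStatus($pdo, 4, 1), PHP_EOL;
```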
Maybe you should try a LEFT JOIN if col1 does not have too much data, or do the query step by step.

More efficient way to do SQL queries

I've been using the PHP and SQL below to load schedule information and real-time information for passenger trains in the UK. Essentially, you have to find the relevant schedules and then load the real-time information for each schedule, which is in a different table relating to today's trains.
The query is taking a little longer than is really ideal and using a lot of CPU, which again isn't ideal. I'm pretty weak when it comes to SQL programming, so any pointers as to what is inefficient would be great.
This is for an Android app, so I've tried to do it all with one call over HTTP. The printed * and > characters are for splitting the string at the other end.
Here is the code:
<?
//Connect to the database
mysql_connect("localhost","XXXX","XXXX")
or die ("No connection could be made to the OpenRail Database");
mysql_select_db("autotrain");
//Set todays date from system and get HTTP parameters for the station,time to find trains and todays locations table.
$date = date('Y-m-d');
$test = $_GET['station'];
$time = $_GET['time'];
$table = $_GET['table'];
//Find the tiploc associated with the station being searched.
$tiplocQuery = "SELECT tiploc_code FROM allstations WHERE c LIKE '$test';";
$tiplocResult =mysql_query($tiplocQuery);
$tiplocRow = mysql_fetch_assoc($tiplocResult);
$tiploc=$tiplocRow['tiploc_code'];
//Now find the timetabled trains for the station where there exists no departure information. Goes back two hours to account for any late running.
$timeTableQuery = "SELECT tiplocs.tps_description AS 'C', locations$table.public_departure, locations$table.id,schedules.stp_indicator
,schedules.train_uid
FROM locations$table, tiplocs, schedules_cache, schedules,activations
WHERE locations$table.id = schedules_cache.id
AND schedules_cache.id = schedules.id
AND schedules.id =activations.id
AND '$date'
BETWEEN schedules.date_from
AND schedules.date_to
AND locations$table.tiploc_code = '$tiploc'
AND locations$table.real_departure LIKE '0'
AND locations$table.public_departure NOT LIKE '0'
AND locations$table.public_departure >='$time'-300
AND locations$table.public_departure <='$time'+300
AND schedules.runs_th LIKE '1'
AND schedules_cache.destination = tiplocs.tiploc
ORDER BY locations$table.public_departure ASC
LIMIT 0,30;";
$timeTableResult=mysql_query($timeTableQuery);
while($timeTablerow = mysql_fetch_assoc($timeTableResult)){
$output[] = $timeTablerow;
}
//Now for each id returned in the timetable, get the locations and departure times so the app may calculate expected arrival times.
foreach ($output as $value) {
$id = $value['id'];
$realTimeQuery ="SELECT locations$table.id,locations$table.location_order,locations$table.arrival,locations$table.public_arrival,
locations$table.real_arrival,locations$table.pass,locations$table.departure,locations$table.public_departure,locations$table.real_departure,locations$table.location_cancelled,
tiplocs.tps_description FROM locations$table,tiplocs WHERE id =$id AND locations$table.tiploc_code=tiplocs.tiploc;";
$realTimeResult =mysql_query($realTimeQuery);
while($row3 = mysql_fetch_assoc($realTimeResult)){
$output3[] = $row3;
}
print json_encode($output3);
print("*");
unset($output3);
unset($id);
}
print('>');
print json_encode($output);
?>
Many Thanks
Matt
The biggest issue with your setup is this foreach loop because it is unnecessary and results in n number of round trips to the database to execute a query, fetch and analyze the results.
foreach ($output as $value) {
Rewrite the initial query to include all of the fields you will need to do your later calculations.
Something like this would work.
SELECT tl.tps_description AS 'C', lc.public_departure, lc.id, s.stp_indicator, s.train_uid,
lc.id, lc.location_order, lc.arrival, lc.public_arrival, lc.real_arrival, lc.pass, lc.departure, lc.real_departure, lc.location_cancelled
FROM locations$table lc INNER JOIN schedules_cache sc ON lc.id = sc.id
INNER JOIN schedules s ON s.id = sc.id
INNER JOIN activations a ON s.id = a.id
INNER JOIN tiplocs tl ON sc.destination = tl.tiploc
WHERE '$date' BETWEEN s.date_from AND s.date_to
AND lc.tiploc_code = '$tiploc'
AND lc.real_departure LIKE '0'
AND lc.public_departure NOT LIKE '0'
AND lc.public_departure >='$time'-300
AND lc.public_departure <='$time'+300
AND s.runs_th LIKE '1'
ORDER BY lc.public_departure ASC
LIMIT 0,30;
Eliminating n query executions from your page load should dramatically reduce response time.
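If the per-stop rows for every returned schedule are still needed, another way to avoid the per-id round trips is one IN (...) query followed by grouping in PHP. This is only a sketch: the helper names, the table name locations_today (standing in for locations$table) and the sample ids and rows are made up for illustration.

```php
<?php
// Builds "?, ?, ?" for an IN (...) list of the given size.
function placeholders(int $count): string
{
    if ($count < 1) {
        throw new InvalidArgumentException('IN () needs at least one value');
    }
    return implode(', ', array_fill(0, $count, '?'));
}

// Groups flat result rows by their schedule id, preserving row order.
function groupById(array $rows): array
{
    $grouped = [];
    foreach ($rows as $row) {
        $grouped[$row['id']][] = $row;
    }
    return $grouped;
}

$ids = [101, 102, 103]; // made-up ids standing in for the timetable result
$sql = 'SELECT l.id, l.location_order, l.public_departure, l.real_departure, t.tps_description'
     . ' FROM locations_today l JOIN tiplocs t ON l.tiploc_code = t.tiploc'
     . ' WHERE l.id IN (' . placeholders(count($ids)) . ')'
     . ' ORDER BY l.id, l.location_order';
echo $sql, PHP_EOL;

// Made-up rows in the shape the query would return.
$rows = [
    ['id' => 101, 'location_order' => 1],
    ['id' => 101, 'location_order' => 2],
    ['id' => 102, 'location_order' => 1],
];
print_r(groupById($rows));
```

The ids are passed as bound values when the statement is executed, so the database sees one query instead of one per schedule.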
Ignoring the problems with the code, in order to speed up your query, use the EXPLAIN command to evaluate where you need to add indexes to your query.
At a guess, you probably will want to create an index on whatever locations$table.public_departure evaluates to.
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
A few things I noticed.
First, you are joining tables in the where clause, like this
from table1, table2
where table1.something = table2.something
Joining in the from clause is faster
from table1 join table2 on table1.something = table2.something
Next, I'm not a php programmer, but it looks like you are running similar queries inside a loop. If that's true, look for a way to run just one query.
Edit starts here
This is in response to gazarsgo's request that I back up my claim about joins in the from clause being faster. He is right, I was wrong. This is what I did. The programming language is ColdFusion:
<cfsetting showdebugoutput="no">
<cfscript>
fromtimes = ArrayNew(1);
wheretimes = ArrayNew(1);
</cfscript>
<cfloop from="1" to="1000" index="idx">
<cfquery datasource="burns" name="fromclause" result="fromresult">
select count(distinct hscnumber)
from burns_patient p join burns_case c on p.patientid = c.patientid
</cfquery>
<cfset ArrayAppend(fromtimes, fromresult.executiontime)>
<cfquery datasource="burns" name="whereclause" result="whereresult">
select count(distinct hscnumber)
from burns_patient p, burns_case c
where p.patientid = c.patientid
</cfquery>
<cfset ArrayAppend(wheretimes, whereresult.executiontime)>
</cfloop>
<cfdump var="#ArrayAvg(fromtimes)#" metainfo="no" label="from">
<cfdump var="#ArrayAvg(wheretimes)#" metainfo="no" label="where">
I ran it 5 times. The results, in milliseconds, follow.
9.563 9.611
9.498 9.584
9.625 9.548
9.831 9.769
9.792 9.813
The first number represents joining in the from clause, the second joining in the where clause. The first number is lower only 60% of the time. Had it been lower 100% of the time, it would have shown that joining in the from clause is faster, but that's not the case.

while (mysql_fetch_array) in a while loop

I have this code:
while ($sum<16 || $sum>18){
$totala = 0;
$totalb = 0;
$totalc = 0;
$ranka = mysql_query("SELECT duration FROM table WHERE rank=1 ORDER BY rand() LIMIT 1");
$rankb = mysql_query("SELECT duration FROM table WHERE rank=2 ORDER BY rand() LIMIT 1");
$rankc = mysql_query("SELECT duration FROM table WHERE rank=3 ORDER BY rand() LIMIT 1");
while ($rowa = mysql_fetch_array($ranka)) {
echo $rowa['duration'] . "<br/>";
$totala = $totala + $rowa['duration'];
}
while ($rowb = mysql_fetch_array($rankb)) {
$totalb = $totalb + $rowb['duration'];
}
while ($rowc = mysql_fetch_array($rankc)) {
$totalc = $totalc + $rowc['duration'];
}
$sum=$totala+$totalb+$totalc;
}
echo $sum;
It works fine, but the problem is that "echo $rowa['duration']" executes on every pass until $sum lands between 16 and 18. Is there a way to echo only the value from the last execution of the "while ($rowa = mysql_fetch_array($ranka))" loop?
Most of the time it prints all the numbers drawn until $sum reaches 16.
You are explicitly echoing $rowa['duration'] in the first inner while loop. If you only want to print the last duration from the $ranka set, simply change the echo to $rowa_duration = $rowa['duration'] and then echo it outside the loop.
while ($rowa = mysql_fetch_array($ranka)) {
$rowa_duration = $rowa['duration'];
$totala = $totala + $rowa['duration'];
}
echo $rowa_duration . '<br/>';
What you are doing there is bad on multiple levels. And your English is horrid. Well, practice makes perfect. You could try joining the ##php chat room on the FreeNode server. That would improve both your English and PHP skills; it sure helped me a lot. Anyway...
The SQL
First of all, using ORDER BY RAND() is extremely ignorant (at best). As your tables begin to get larger, this operation will make your queries slower. It has n * log2(n) complexity, which means that querying a table with 1000 entries will take ~300 times longer than querying a table with 10 entries.
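A quick check of what an n * log2(n) cost model implies for 1000 rows versus 10 rows. This only evaluates the formula; it says nothing about real query timings.

```php
<?php
// Relative cost under an n * log2(n) model.
function nLogN(int $n): float
{
    return $n * log($n, 2);
}

$ratio = nLogN(1000) / nLogN(10);
printf("%.0f\n", $ratio); // prints 300
```

Since log2(1000) is three times log2(10), the ratio is (1000 * 3) / 10 = 300.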
To learn more about it, you should read this blog post, but as for your current queries, the solution would look like:
SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
AND rank = 1
LIMIT 1
This would select random duration from the table.
But since you are actually selecting data with 3 different ranks (1, 2 and 3), it would make sense to create a UNION of three queries (each SELECT is wrapped in parentheses so that its LIMIT applies to that SELECT only):
(SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
AND rank = 1
LIMIT 1)
UNION ALL
(SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
AND rank = 2
LIMIT 1)
UNION ALL
(SELECT duration
FROM table
JOIN (SELECT CEIL(RAND()*(SELECT MAX(id) FROM table)) AS id) as choice
WHERE
table.id >= choice.id
AND rank = 3
LIMIT 1)
It looks scary, but it will actually be faster than what you are currently using, and the result will be three entries from the duration column.
PHP with SQL
You are still using the old mysql_* functions to access the database. This API is more than 10 years old and should not be used when writing new code. The old functions are no longer maintained (fixed and/or improved), and the community has even begun the process of deprecating them.
Instead you should be using either PDO or MySQLi. Which one to use depends on your personal preferences and what is actually available to you. I prefer PDO (because of named parameters and support for other RDBMS), but that's somewhat subjective choice.
Another issue with your PHP/MySQL code is that you seem to loop through items pointlessly. Your queries have LIMIT 1, which means that there will be only one row. There is no point in making a loop.
There is potential for an endless loop if the maximum value for duration is 1. At the start of the loop you will have $sum === 15, which fits the first while condition. And at the end of that loop you can have $sum === 18, which satisfies the second loop condition ... and then it is off to infinity and your SQL server chokes.
And if you are using fractions for duration, then the total value of 3 new results needs to be even smaller. Just over 2. It starts with 15.99 and ends with 18.01 (that's an additional 2.02 in duration, or less than 0.7 each). Again: endless loop.
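One way to guard against a loop that never ends is to cap the number of attempts. A minimal sketch, in which the function name, the $draw callable (standing in for whatever fetches the three durations) and the sample values are all hypothetical:

```php
<?php
// Retries $draw until the sum lands in [$min, $max] or $maxAttempts is used up.
// $draw is a hypothetical callable returning the three durations of one attempt.
function drawUntilInRange(callable $draw, float $min, float $max, int $maxAttempts): ?array
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $durations = $draw();
        $sum = array_sum($durations);
        if ($sum >= $min && $sum <= $max) {
            return ['durations' => $durations, 'sum' => $sum, 'attempts' => $attempt];
        }
    }
    return null; // gave up: nothing in range within the attempt budget
}

// Deterministic stand-in for the database: the first draw misses, the second hits.
$queue = [[1, 1, 1], [5, 6, 6]];
$draw = function () use (&$queue) {
    return array_shift($queue);
};
print_r(drawUntilInRange($draw, 16, 18, 10));
```

When the function returns null, the caller can report that no suitable combination was found instead of hammering the database forever.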
Suggestion
Here is how i would do it:
$pdo = new PDO('mysql:dbname=my_db;host=localhost', 'username', 'password');
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$sum = 0;
while ( $sum < 16 )
{
$query = 'that LARGE query above';
$statement = $pdo->prepare( $query );
if ( $statement->execute() )
{
$data = $statement->fetchAll( PDO::FETCH_ASSOC );
$sum += $data[0]['duration']+$data[1]['duration']+$data[2]['duration'];
}
}
echo $data[0]['duration'];
This should do what your code did, or at least what I assume your intention was.
