MySQLi performance, multiple (separate) queries vs subqueries - php

I need to count the number of rows from different(!) tables and save the results for some kind of statistic. The script is quite simple and works as expected, but I'm wondering whether it's better to use a single query with (in this case) 8 subqueries, or 8 separate queries, or whether there's an even better, faster and more advanced solution...
I'm using MySQLi with prepared statements, so the single query could look like this:
$sql = 'SELECT
(SELECT COUNT(cat1_id) FROM `cat1`),
(SELECT COUNT(cat2_id) FROM `cat2`),
(SELECT COUNT(cat2_id) FROM `cat2` WHERE `date` >= DATE(NOW())),
(SELECT COUNT(cat3_id) FROM `cat3`),
(SELECT COUNT(cat4_id) FROM `cat4`),
(SELECT COUNT(cat5_id) FROM `cat5`),
(SELECT COUNT(cat6_id) FROM `cat6`),
(SELECT COUNT(cat7_id) FROM `cat7`)';
$stmt = $db->prepare($sql);
$stmt->execute();
$stmt->bind_result($var1, $var2, $var3, $var4, $var5, $var6, $var7, $var8);
$stmt->fetch();
$stmt->free_result();
$stmt->close();
while the separate queries would look like this (x 8):
$sql = 'SELECT
COUNT(cat1_id)
FROM
`cat1`';
$stmt = $db->prepare($sql);
$stmt->execute();
$stmt->bind_result($var1);
$stmt->fetch();
$stmt->free_result();
$stmt->close();
So, which would be faster or "better style" for this kind of query (e.g. statistics, counters)?

My inclination is to put queries into the FROM rather than the SELECT, where possible. In this example, it requires a cross join between the tables:
select c1.val, c2.val . . .
from (select count(cat1_id) as val from cat1) c1 cross join
(select count(cat2_id) as val from cat2) c2 cross join
. . .
The performance should be the same. However, the advantage appears with your cat2 table:
select c1.val, c2.val, c2.valnow, . . .
from (select count(cat1_id) as val from cat1) c1 cross join
(select count(cat2_id) as val,
count(case when date >= date(now()) then cat2_id end) as valnow
from cat2
) c2 cross join
. . .
You get a real savings here by not having to scan the table twice to get two values. This also helps when you realize that you might want to modify queries to return more than one value.
I believe the cross join and select-within-select would have the same performance characteristics. The only way to really be sure is to test different versions.
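To make the single-scan idea concrete, here is a minimal sketch. It assumes an in-memory SQLite database reached through PDO so that it runs without a MySQL server; the table contents are made up, and SQLite's DATE('now') stands in for MySQL's DATE(NOW()). It is an illustration of conditional aggregation, not a benchmark.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL so the example is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical miniature versions of cat1 and cat2.
$pdo->exec('CREATE TABLE cat1 (cat1_id INTEGER PRIMARY KEY)');
$pdo->exec('CREATE TABLE cat2 (cat2_id INTEGER PRIMARY KEY, date TEXT)');
$pdo->exec('INSERT INTO cat1 (cat1_id) VALUES (1), (2), (3)');
$pdo->exec("INSERT INTO cat2 (cat2_id, date) VALUES
    (1, '2000-01-01'), (2, '2999-01-01'), (3, '2999-06-01')");

// One pass over cat2 yields both the total and the "today or later" count.
// In MySQL the comparison would be: date >= DATE(NOW())
$sql = "SELECT c1.val AS cat1_total, c2.val AS cat2_total, c2.valnow AS cat2_current
        FROM (SELECT COUNT(cat1_id) AS val FROM cat1) c1
        CROSS JOIN (
            SELECT COUNT(cat2_id) AS val,
                   COUNT(CASE WHEN date >= DATE('now') THEN cat2_id END) AS valnow
            FROM cat2
        ) c2";

$counts = $pdo->query($sql)->fetch(PDO::FETCH_ASSOC);
print_r($counts);
```

COUNT ignores NULLs, and a CASE without ELSE yields NULL, which is why the second COUNT only counts the rows that satisfy the date condition.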

The better way is to use just one query, because it needs only one round trip to the database. If you use many queries, each one is a separate round trip (and, if you also open a new connection for each, a connect and disconnect as well), which is slower.

Just to follow up your comment, here is an example using one of my DBs. Using a prepared statement here buys you nothing. This multiple query in fact only executes one RPC to the D/B engine. All of the other calls are local to the PHP runtime system.
$db = new mysqli('localhost', 'user', 'password', 'blog');
$table = explode( ' ', 'articles banned comments config language members messages photo_albums photos');
foreach( $table as $t ) {
$sql[] = "select count(*) as count from blog_$t";
}
if ($db->multi_query( implode(';',$sql) )) {
foreach( $table as $t ) {
if ( ($rs = $db->store_result() ) &&
($row = $rs->fetch_row() ) ) {
$result[$t] = $row[0];
$rs->free();
$db->next_result(); // you must execute one per result set
}
}
}
$db->close();
var_dump( $result );
Just out of interest, I did an strace on this and the relevant four lines are
16:54:09.894296 write(4, "\211\1\0\0\3select count(*) as count fr"..., 397) = 397
16:54:09.895264 read(4, "\1\0\0\1\1\33\0\0\2\3def\0\0\0\5count\0\f?\0\25\0\0\0\10\201"..., 16384) = 544
16:54:09.896090 write(4, "\1\0\0\0\1", 5) = 5
16:54:09.896192 shutdown(4, 2 /* send and receive */) = 0
There was ~1 ms between the query and the response to and from the MySQLd process (this was on localhost, and the results were in its query cache, BTW), and 0.8 ms later the DB close was executed. And that's on my 4-year-old laptop.

Regarding TerryE's example and the advice to use multi_query(!), I checked the manual and changed the script to fit my needs. Finally I got a solution that looks like this:
$sql = 'SELECT COUNT(cat1_id) as `cat1` FROM `cat1`;';
$sql .= 'SELECT COUNT(cat2_id) as `cat2` FROM `cat2`;';
$sql .= 'SELECT COUNT(cat2_id) as `cat2_b` FROM `cat2` WHERE `date` >= DATE(NOW());';
$sql .= 'SELECT COUNT(cat3_id) as `cat3` FROM `cat3`;';
$sql .= 'SELECT COUNT(cat4_id) as `cat4` FROM `cat4`;';
$sql .= 'SELECT COUNT(cat5_id) as `cat5` FROM `cat5`;';
$sql .= 'SELECT COUNT(cat6_id) as `cat6` FROM `cat6`;';
$sql .= 'SELECT COUNT(cat7_id) as `cat7` FROM `cat7`;';
if ($db->multi_query($sql))
{
do
{
if ($stmt = $db->store_result())
{
while ($row = $stmt->fetch_assoc())
{
foreach ($row as $key => $value)
{
$count[$key] = $value;
}
}
$stmt->free_result();
}
} while ($db->more_results() && $db->next_result());
}
There are some differences from TerryE's example, but the result is the same. I'm aware that there are 7 lines at the beginning that are almost identical, but as soon as I need a WHERE clause or something else, I prefer this solution to a foreach loop where I'd need to add queries manually or handle exceptions with if { ... } ...
As far as I can see, there should be no problem with my solution, or did I miss something?

Related

SQL Update taking a long time

I am trying to update many rows (100 000+) in my database but it's taking a while (over 10 mins and still not finished). I'm wondering if this is intended behavior or is there something wrong in my code. To prevent the database from hanging while performing the update I've been told to update one row at a time, not sure if this is how it should be implemented.
I am updating images in my song table to be null if those songs were played in my playlist table
private function updateBlogSongs ($blog_id) {
$db = Yii::app()->db;
$affectedRows = 0;
$sql = "SELECT *
FROM `firstdatabase`.song s
INNER JOIN `seconddatabase`.playlist p ON s.name LIKE p.song_name";
$dataReader = $db->createCommand($sql)->query(); // Rows from the song table that were played in the given blog
$row = $dataReader->read();
while ($row != false) {
$sql = "UPDATE `firstdatabase`.song s
SET s.image = NULL
WHERE s.song_id = " . $row['song_id'];
$affectedRows += $db->createCommand($sql)->execute();
$row = $dataReader->read();
}
return $affectedRows;
}
Edit: after reading The Dog's comment I made some changes:
With 500 000 rows in the song table it takes about 10 minutes if I increase my batchSize to 10000 (it was taking 8 hours with the code above). With a batch size of 250 it takes about 50 minutes. I chose 250 because each query then takes about 1 second to run, whereas at a batch size of 10000 each takes 10+ seconds (my constraint is 1 second per query). I would like to make it faster, but I'm not sure what else to change.
$batchSize = 250;
$lastSongID = 0;
$rowIndex = 0;
$affectedRows = 0;
$sql = "SELECT max(song_id) FROM `firstdatabase`.song";
$lastSongID = intval($db->createCommand($sql)->query()->read()['max(song_id)']);
echo($lastSongID . ' songs in table.' . PHP_EOL);
echo('Updating songs...' . PHP_EOL);
while($rowIndex <= $lastSongID) {
$startTime = microtime(true);
$sql = "UPDATE `firstdatabase`.song
SET image = NULL
WHERE song_id in (
SELECT song_id
FROM (
SELECT song_id, name
FROM `firstdatabase`.song
WHERE song_id > " . $rowIndex . "
LIMIT " . $batchSize . "
) s
INNER JOIN (
SELECT DISTINCT song_name
FROM `seconddatabase`.playlist
) p ON s.name LIKE p.song_name
ORDER BY s.song_id ASC
)";
$affectedRows += $db->createCommand($sql)->execute();
$rowIndex += $batchSize;
$endTime = microtime(true);
$elapsedTime = round($endTime - $startTime, 2);
}
This is really more a question for the SQL world than the PHP world, but here are my recommendations:
Don't do this one row at a time in a while loop. Make a more complex update statement that can do it all in one database hit. Database commands are the slowest part of your php code, you want to limit the number of calls you do to the database.
When you are confident that you can get the operation done in one sql command, or even if you don't think it is possible then pull your code into a stored procedure in the database. Having complex sql queries as stored procedures can help a lot with maintaining your code.
Make sure you have indexes on your tables. You need to make sure your queries hit those indexes for best performance.
Here's an option for the single query:
update `firstdatabase`.song
set image = null
where song_id in (
select s.song_id
from `firstdatabase`.song s
INNER JOIN `seconddatabase`.playlist p
ON s.name LIKE p.song_name
);
Obviously we don't have access to your database so you'll need to make changes where necessary but hopefully it can get you on the right track.
EDIT:
Try replacing your second code set with the following:
$lastSongID = 0;
$rowIndex = 0;
$affectedRows = 0;
$sql = "SELECT max(song_id) FROM `firstdatabase`.song";
$lastSongID = intval($db->createCommand($sql)->query()->read()['max(song_id)']);
echo($lastSongID . ' songs in table.' . PHP_EOL);
echo('Updating songs...' . PHP_EOL);
$startTime = microtime(true);
$sql = "
update `firstdatabase`.song
set image = null
where song_id in (
select s.song_id
from `firstdatabase`.song s
INNER JOIN `seconddatabase`.playlist p
ON s.name LIKE p.song_name
)";
$affectedRows += $db->createCommand($sql)->execute();
$endTime = microtime(true);
$elapsedTime = round($endTime - $startTime, 2);
If it works, then let me know the time it takes to run, if it doesn't work, is it an issue with the SQL (again I can't see the tables so I'm guessing).
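For reference, here is a small runnable sketch of the "one database hit" idea. It assumes an in-memory SQLite database through PDO and made-up rows, so the schema-qualified names from the question are dropped. It also uses EXISTS against playlist only: as far as I know, MySQL can reject an UPDATE whose subquery selects from the table being updated (error 1093), and this form sidesteps that. Treat it as one possible shape, not a tuned solution.

```php
<?php
// Sketch only: in-memory SQLite stands in for the two MySQL schemas.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE song (song_id INTEGER PRIMARY KEY, name TEXT, image TEXT)');
$pdo->exec('CREATE TABLE playlist (song_name TEXT)');
$pdo->exec("INSERT INTO song (song_id, name, image) VALUES
    (1, 'Alpha', 'a.png'), (2, 'Beta', 'b.png'), (3, 'Gamma', 'c.png')");
$pdo->exec("INSERT INTO playlist (song_name) VALUES ('Alpha'), ('Gamma'), ('Alpha')");

// One statement replaces the per-row loop. The subquery reads only playlist,
// so the table being updated is not selected from inside its own UPDATE.
$affectedRows = $pdo->exec(
    'UPDATE song
     SET image = NULL
     WHERE image IS NOT NULL
       AND EXISTS (SELECT 1 FROM playlist p WHERE song.name LIKE p.song_name)'
);

echo $affectedRows . " rows updated" . PHP_EOL;
```

If the names are compared without wildcards, LIKE behaves much like equality; switching to = and indexing song.name and playlist.song_name may help, but that is worth measuring on the real data.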

Pull number of rows from a SQL query and put it in PHP as a variable?

This is 4 queries put into one. This is really old code and once I can make this work we can update it later to PDO for security. What I am trying to do is count rows from
select count(*) from dialogue_employees d_e,
dialogue_leaders d_l where
d_l.leader_group_id = d_e.leader_group_id
and use it in a formula where I also count how many rows from dialogue.status = 1.
The formula is on the bottom to create a percentage total from the results. This is PHP and MySQL and I wasn't sure the best way to count the rows and put them as a variable in php to be used in the formula on the bottom?
function calculate_site_score($start_date, $end_date, $status){
while($rows=mysql_fetch_array($sqls)){
$query = "
SELECT
dialogue.cycle_id,
$completecount = sum(dialogue.status) AS calculation,
$total_employees = count(dialogue_employees AND dialogue_leaders), dialogue_list.*,
FROM dialogue,
(SELECT * FROM dialogue_list WHERE status =1) AS status,
dialogue_employees d_e,
u.fname, u.lname, d_e.*
user u,
dialogue_list,
dialogue_leaders d_l
LEFT JOIN dialogue_list d_list
ON d_e.employee_id = d_list.employee_id,
WHERE
d_l.leader_group_id = d_e.leader_group_id
AND d_l.cycle_id = dialogue.cycle_id
AND u.userID = d_e.employee_id
AND dialogue_list.employee_id
AND site_id='$_SESSION[siteID]'
AND start_date >= '$start_date'
AND start_date <= '$end_date'";
$sqls=mysql_query($query) or die(mysql_error());
}
$sitescore=($completecount/$total_employees)*100;
return round($sitescore,2);
}
If you separate out your queries you will gain more control over your data. You have to be careful what you're counting. It's pretty crowded in there.
If you just want to clean up your function, you can stack your queries like this so they make more sense:
function calculate_site_score($mysqli, $start_date, $end_date, $status){
$query = "select * from dialogue;";
if ($result = $mysqli->query($query)) {
// iterate your result; 'elem' and 'otherElem' are placeholder column names
while ($row = $result->fetch_assoc()) {
$neededElem = (int) $row['elem'];
$query = "select * from dialogue_list where status = 1 and otherElem = " . $neededElem . ";";
// give it a name other than $sqls, something that makes sense.
$list = $mysqli->query($query);
// iterate list, and parse results for what you need
foreach ($list as $k => $v) {
// go a level deeper, or calculate, rinse and repeat
}
}
}
}
Then do your counts separately, each in its own query.
Here is a count example: How do I count columns of a table
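As a rough illustration of doing the counts separately and feeding them into the percentage formula, here is a minimal sketch. It assumes PDO with an in-memory SQLite database and simplified, hypothetical versions of the tables (the real ones have more columns and the site and date filters are left out). The variable names $completecount, $total_employees and $sitescore are taken from the question.

```php
<?php
// Sketch only: in-memory SQLite with simplified, hypothetical versions of the tables.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE dialogue_leaders (leader_group_id INTEGER)');
$pdo->exec('CREATE TABLE dialogue_employees (employee_id INTEGER, leader_group_id INTEGER)');
$pdo->exec('CREATE TABLE dialogue (employee_id INTEGER, status INTEGER)');
$pdo->exec('INSERT INTO dialogue_leaders VALUES (10)');
$pdo->exec('INSERT INTO dialogue_employees VALUES (1, 10), (2, 10), (3, 10), (4, 10)');
$pdo->exec('INSERT INTO dialogue VALUES (1, 1), (2, 1), (3, 0), (4, 1)');

// Each count is its own small query; fetchColumn() returns the single value.
$total_employees = (int) $pdo->query(
    'SELECT COUNT(*)
     FROM dialogue_employees d_e
     JOIN dialogue_leaders d_l ON d_l.leader_group_id = d_e.leader_group_id'
)->fetchColumn();

$completecount = (int) $pdo->query(
    'SELECT COUNT(*) FROM dialogue WHERE status = 1'
)->fetchColumn();

// Guard against division by zero when there are no employees.
$sitescore = $total_employees > 0
    ? round(($completecount / $total_employees) * 100, 2)
    : 0.0;

echo $sitescore . PHP_EOL;
```

With user-supplied values such as the dates or the site ID, the same queries would be prepared with bound parameters rather than built by string concatenation.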

How to use a PHP function to get ranks for multiple columns in a table more efficiently

I am trying to get a single rank for a user for each stat column in the table. I am trying to do this more efficiently, because I know it can be done.
I have a table called userstats with 4 columns: user_id, stat_1, stat_2, and stat_3. I want to be able to get the rank for each stat for the associated user_id. With my current code below I would have to duplicate the code three times and change the column names to get my result. Please look at the examples below. Thanks!
This is how I currently get the rank for a user:
$rankstat1 = getUserRank($userid);
<code>
function getUserRank($userid){
$sql = "SELECT * FROM ".DB_USERSTATS." ORDER BY stat_1 DESC";
$result = mysql_query($sql);
$rows = '';
$data = array();
if (!empty($result))
$rows = mysql_num_rows($result);
else
$rows = '';
if (!empty($rows)){
while ($rows = mysql_fetch_assoc($result)){
$data[] = $rows;
}
}
$rank = 1;
foreach($data as $item){
if ($item['user_id'] == $userid){
return $rank;
}
++$rank;
}
return 1;
}
</code>
I believe there is a way for me to get what i need with something like this but i cant get it to work.
$rankstat1 = getUserRank($userid, 'stat_1');
$rankstat2 = getUserRank($userid, 'stat_2');
$rankstat3 = getUserRank($userid, 'stat_3');
You can get all the stat ranks using one query without doing all the PHP looping and checking.
I have used PDO in this example because the value of the $userid variable needs to be used in the query, and the deprecated mysql database extension does not support prepared statements, which should be used to reduce the risk of SQL injection.
The function could be adapted to use the same query with mysqli, or even mysql if you must use it.
function getUserRanks($userid, $pdo) {
$sql = "SELECT
COUNT(DISTINCT s1.user_id) + 1 AS stat_1_rank,
COUNT(DISTINCT s2.user_id) + 1 AS stat_2_rank,
COUNT(DISTINCT s3.user_id) + 1 AS stat_3_rank
FROM user_stats f
LEFT JOIN user_stats s1 ON f.stat_1 < s1.stat_1
LEFT JOIN user_stats s2 ON f.stat_2 < s2.stat_2
LEFT JOIN user_stats s3 ON f.stat_3 < s3.stat_3
WHERE f.user_id = ?";
$stmt = $pdo->prepare($sql);
$stmt->bindValue(1, $userid);
$stmt->execute();
$ranks = $stmt->fetchObject();
return $ranks;
}
This should return an object with properties containing the ranks of the given user for each stat. An example of using this function:
$pdo = new PDO($dsn, $user, $pw);
$ranks = getUserRanks(3, $pdo); // get the stat ranks for user 3
echo $ranks->stat_2_rank; // show user 3's rank for stat 2
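If you would rather keep the call shape from the question, getUserRank($userid, 'stat_1'), a per-column variant could look like the sketch below. It assumes PDO with an in-memory SQLite database and made-up rows; the extra $pdo argument and the whitelist are my additions. Column names cannot be bound as parameters, so the column is checked against a fixed list before it is placed in the SQL.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL; table contents are made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE userstats (user_id INTEGER PRIMARY KEY, stat_1 INTEGER, stat_2 INTEGER, stat_3 INTEGER)');
$pdo->exec('INSERT INTO userstats VALUES (1, 50, 5, 7), (2, 90, 1, 7), (3, 70, 9, 2)');

function getUserRank(PDO $pdo, int $userid, string $column): int
{
    // Column names cannot be bound as parameters, so only known names are allowed.
    $allowed = ['stat_1', 'stat_2', 'stat_3'];
    if (!in_array($column, $allowed, true)) {
        throw new InvalidArgumentException("Unknown stat column: $column");
    }
    // Rank = number of users with a strictly higher value, plus one.
    $sql = "SELECT COUNT(*) + 1 FROM userstats
            WHERE $column > (SELECT $column FROM userstats WHERE user_id = ?)";
    $stmt = $pdo->prepare($sql);
    $stmt->execute([$userid]);
    return (int) $stmt->fetchColumn();
}

echo getUserRank($pdo, 3, 'stat_1') . PHP_EOL;
```

Users with equal values share a rank in this version, and an unknown user_id falls through to rank 1, which matches the fallback in the original function.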
$sql = "SELECT user_id, stat_1, stat_2, stat_3 FROM ".DB_USERSTATS." ORDER BY stat_1 DESC";
Also, unless there is a reason you need ALL users results, limit your query with a WHERE clause so you're only getting the results you actually need.
Assuming you limit your sql query to just one user, this will get that user's stats.
foreach($data as $item){
$stat_1 = $item['stat_1'];
$stat_2 = $item['stat_2'];
$stat_3 = $item['stat_3'];
}
If you get more than one user's stats with your sql query, consider passing your $data array back to the calling function and loop through the array to match the users stats to particular user id's.

PDO Memory Exhausted

This is a common issue, but I have no choice but to code it like this to get the appropriate header and body in the Excel file.
Here is how it starts.
When a print request is made, I first run a query to fetch the headers from the database:
SELECT instruments.in_id, instrument_parameters.ip_id,
CASE WHEN gv_x_ipid = -1 THEN 'datetime' ELSE '' END xlabel,
CASE WHEN ip_label LIKE '%Reservoir%' THEN 0 ELSE in_order END legendIndex,
CASE WHEN in_name = 'General' THEN ip_label ELSE in_name END ylabel
FROM graph_plot
LEFT JOIN attributes gptype ON gp_type = gptype.at_id
LEFT JOIN graph_value ON gp_id = gv_gpid
LEFT JOIN instrument_parameters ON gv_y_ipid = ip_id
LEFT JOIN attributes pmunit ON ip_unit = pmunit.at_id
LEFT JOIN instrument_reading yvalue ON gv_y_ipid = iv_ipid
LEFT JOIN instruments ON iv_inid = in_id
WHERE gp_diid = :di_id AND
gp_type = :rpt_type AND
iv_status = 'Y' AND
iv_inid in (".implode(",", $coll->inid).") AND
gv_y_ipid in (".implode(",", $coll->ipid).")
GROUP BY ylabel
ORDER BY legendIndex
This produces a number of headers, which I arrange like this:
DATE | Instrument1 | Instrument2 | Instrument3
The Instrument columns are dynamic, based on the query above. I store them in a new variable, but the original variable that holds the database results remains intact.
Later, using the same parameters, :di_id and :rpt_type, plus the additional parameters startDt and endDt, I run another query that returns a long list of the dates available in the database between startDt and endDt.
$sql2 = "SELECT iv_reading FROM instrument_reading WHERE iv_inid = :inid AND iv_ipid = :ipid AND iv_date = :dt AND iv_status = 'Y'";
When it finishes getting the dates, I run two nested loops like this:
foreach ($dates as $key => $dt) {
foreach ($resp as $InstNo => $InstRow) {
try {
$stmt2->execute(array(':dt' => $dt, ':inid' => $InstRow->in_id, ':ipid' => $InstRow->ip_id));
$rowDb = $stmt2->fetch(PDO::FETCH_NUM, PDO::FETCH_ORI_NEXT);
} catch(PDOException $e) {
echo '{"error":{"text":"'. $e->getMessage() .'"}}';
}
}
}
First it loops over the dates, and second it loops over the headers (based on the query made right before getting the dates). My problem is that I always get stuck here:
$stmt2->execute(array(':dt' => $dt, ':inid' => $InstRow->in_id, ':ipid' => $InstRow->ip_id));
What do you think? Is there any better way to handle this?
For your information, I use Slim and PHPExcel. PHPExcel might have a memory issue, and I'm thinking of switching to Spout, but its documentation still only covers the basics.
In your SQL, you may consider a limit clause to ease the memory load as follows:
$handle = fopen("file.csv", "wb");
$statement = "
SELECT instruments.in_id, instrument_parameters.ip_id,
CASE WHEN gv_x_ipid = -1 THEN 'datetime' ELSE '' END xlabel,
CASE WHEN ip_label LIKE '%Reservoir%' THEN 0 ELSE in_order END legendIndex,
CASE WHEN in_name = 'General' THEN ip_label ELSE in_name END ylabel
FROM graph_plot
LEFT JOIN attributes gptype ON gp_type = gptype.at_id
LEFT JOIN graph_value ON gp_id = gv_gpid
LEFT JOIN instrument_parameters ON gv_y_ipid = ip_id
LEFT JOIN attributes pmunit ON ip_unit = pmunit.at_id
LEFT JOIN instrument_reading yvalue ON gv_y_ipid = iv_ipid
LEFT JOIN instruments ON iv_inid = in_id
WHERE gp_diid = :di_id
AND gp_type = :rpt_type
AND iv_status = 'Y'
AND iv_inid in (".implode(",", $coll->inid).")
AND gv_y_ipid in (".implode(",", $coll->ipid).")
GROUP BY ylabel
ORDER BY legendIndex
LIMIT 250
";
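$offset = 0;
do {
$prep = $dbh->prepare($statement . ' OFFSET ' . $offset);
$prep->execute($params); // $params: assumed array holding :di_id and :rpt_type
$rows = $prep->fetchAll(PDO::FETCH_NUM);
foreach ($rows as $row) { fputcsv($handle, $row); }
$offset += 250;
} while (count($rows) === 250);
fclose($handle);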
Alternatively, you could use system and call SELECT INTO, set the permissions (if necessary) and Bob's your uncle.
You have not terminated the fetch loop.
$rowDb = $stmt2->fetch(PDO::FETCH_NUM, PDO::FETCH_ORI_NEXT);
gets the "next" row or closes the 'cursor' and terminates.
Are you expecting to get exactly one row? If so, consider doing fetchAll. (Caution: the resultset may be an extra level deep in arrays.)
The PDO MySQL driver will do some buffering, which causes memory exhaustion when looping over large datasets. You can turn this off using $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false); which should solve the problem.
$pdo = new PDO('mysql:host=localhost', $username, $password);
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$stmt = $pdo->prepare('SELECT * FROM instrument...');
$stmt->execute($parameters);
while($row = $stmt->fetch()) {
// Insert logic to write the row to the destination
}
If you'd rather set the attribute for that query only, you can do that as well:
$stmt = $pdo->prepare('SELECT * FROM instrument...', [
PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false
]);
Keep in mind that you won't be able to run other queries until you are done with your unbuffered one. You can close the old cursor prematurely with $stmt->closeCursor() if you don't need the remaining results. I also cannot speak to the performance of this, but it solved my issue while writing a one-off script.
The setting is mentioned briefly in MySQL's documentation:
https://dev.mysql.com/doc/connectors/en/apis-php-pdo-mysql.html
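The fetch-and-write loop above can be sketched end to end as follows. This assumes an in-memory SQLite database and a cut-down, hypothetical instrument_reading table, so the MySQL-only buffering attribute is not set here; with the MySQL driver you would add it as shown earlier. The point is only the shape of the loop: each row is written out as soon as it is fetched, rather than collected with fetchAll().

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL, so the MySQL-only
// PDO::MYSQL_ATTR_USE_BUFFERED_QUERY attribute is not set here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE instrument_reading (iv_inid INTEGER, iv_date TEXT, iv_reading REAL, iv_status TEXT)');
$pdo->exec("INSERT INTO instrument_reading VALUES
    (1, '2020-01-01', 1.5, 'Y'), (1, '2020-01-02', 2.5, 'Y'), (2, '2020-01-01', 9.0, 'N')");

$stmt = $pdo->prepare(
    'SELECT iv_date, iv_inid, iv_reading
     FROM instrument_reading
     WHERE iv_status = :status
     ORDER BY iv_date, iv_inid'
);
$stmt->execute([':status' => 'Y']);

// php://memory keeps the example free of real files; use a file path in practice.
$out = fopen('php://memory', 'w+');
$rowsWritten = 0;
// Write each row as soon as it is fetched instead of collecting them all first.
while ($row = $stmt->fetch(PDO::FETCH_NUM)) {
    fputcsv($out, $row, ',', '"', '');
    $rowsWritten++;
}
rewind($out);
$csv = stream_get_contents($out);
fclose($out);
echo $csv;
```

The delimiter, enclosure and escape arguments to fputcsv() are passed explicitly so the call does not rely on defaults that differ between PHP versions.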

How does one go about deleting records from an array with Joins

I have a query with multiple joins in it. After I take the results and run it through a Id-checker I want to be able to delete records from that array where the IDDestination equals $ID.
Since this query has joins on it and I am filtering them based on one of the joined tables, How do I go about deleting those records from the array based off that joined table?
And I only wanted this to happen after the user confirms.
$query = "
select d.IDCourse,
d.name as course_name,
d.slug,
d.short_description,
d.address,
e.city_name,
e.state_code,
d.zip,
e.city_slug,
e.state_slug,
h.IDDestination,
LOWER(e.iso_code) as country_slug, a.*,
b.IDTeetimeType,
c.name as teetime_type,
b.start_time,b.end_time,
(case dayofweek(a.teetime_dt)
when 1 then `b`.`sun`
when 2 then `b`.`mon`
when 3 then `b`.`tue`
when 4 then `b`.`wed`
when 5 then `b`.`thu`
when 6 then `b`.`fri`
when 7 then `b`.`sat`
end) AS `price`, g.tax_rate, f.alias
from cart_course_teetimes a
join course_priceplan b
on a.IDCoursePricePlan = b.IDCoursePricePlan
join course_teetime_type c
on b.IDTeetimeType = c.IDTeetimeType
join course d
on b.IDCourse = d.IDCourse
join vw_cities e
on d.IDCity = e.IDCity
join destinations_cities h
on h.IDCity= d.IDCity
LEFT JOIN (SELECT * FROM media_mapping WHERE is_main_item=1 AND IDGalleryType=3) f
ON d.IDGallery = f.IDGallery
left join course_tax
g on a.IDCourseTax = g.IDCourseTax
where a.IDCart = :cart_id
order by d.name, a.teetime_dt, b.start_time;";
$prepared = array(
"cart_id" => $idCart,
);
$conn = new DBConnection();
$results = $conn->fetch($query, $prepared);
$conn = null;
$results = !empty($results) ? $results : array();
$id = null;
foreach($results as $row) {
// Set ID for the first record.
if($id === null)
$id = $row['IDDestination'];
// will stay true, otherwise it's false and we should kill the loop.
if($id != $row['IDDestination']) {
$newid=$row['IDDestination'];
echo "<script type='text/javascript'> emptycart();</script>";
$query = "DELETE FROM cart_course_teetimes a WHERE h.IDDestination='.$id.'";
$res =mysql_query($query) or die (mysql_error());
break;
}
}
This is incorrect PHP:
$query = "DELETE FROM cart_course_teetimes a WHERE h.IDDestination='.$id.'";
You're already inside a double-quoted string, so the . characters are not PHP concatenation operators; they are just literal periods.
You want either of these instead:
$query = "DELETE FROM cart_course_teetimes a WHERE h.IDDestination='" . $id . "'";
$query = "DELETE FROM cart_course_teetimes a WHERE h.IDDestination='$id'";
Right now you're producing
... WHERE h.IDDestination='.42.'
which compares against the literal string '.42.' instead of your ID.
Plus it appears you're mixing database libraries. You've got $conn->fetch() which implies you're using one of the OO db libraries (mysqli? pdo? custom wrapper?). But you then call mysql_query(). Unless you've EXPLICITLY called mysql_connect(), that delete query will never execute. Database connections made in one of the libraries are utterly useless in any of the other libraries.
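One further point: the DELETE in the question refers to h, an alias that exists only in the SELECT, so the destination has to be reached some other way. Below is a minimal sketch of one option, a prepared DELETE that filters through a subquery over the joined tables. It assumes PDO with an in-memory SQLite database and cut-down, hypothetical versions of the tables; the IDs are made up. MySQL also has a multi-table DELETE syntax, which is not shown here.

```php
<?php
// Sketch only: in-memory SQLite with hypothetical, cut-down versions of the tables.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE course (IDCourse INTEGER, IDCity INTEGER)');
$pdo->exec('CREATE TABLE course_priceplan (IDCoursePricePlan INTEGER, IDCourse INTEGER)');
$pdo->exec('CREATE TABLE destinations_cities (IDDestination INTEGER, IDCity INTEGER)');
$pdo->exec('CREATE TABLE cart_course_teetimes (IDCart INTEGER, IDCoursePricePlan INTEGER)');
$pdo->exec('INSERT INTO course VALUES (1, 100), (2, 200)');
$pdo->exec('INSERT INTO course_priceplan VALUES (11, 1), (22, 2)');
$pdo->exec('INSERT INTO destinations_cities VALUES (42, 100), (43, 200)');
$pdo->exec('INSERT INTO cart_course_teetimes VALUES (7, 11), (7, 22), (8, 11)');

// The destination lives in a joined table, so the DELETE reaches it through a subquery.
$stmt = $pdo->prepare(
    'DELETE FROM cart_course_teetimes
     WHERE IDCart = :cart_id
       AND IDCoursePricePlan IN (
           SELECT b.IDCoursePricePlan
           FROM course_priceplan b
           JOIN course d ON b.IDCourse = d.IDCourse
           JOIN destinations_cities h ON h.IDCity = d.IDCity
           WHERE h.IDDestination = :destination_id
       )'
);
$stmt->execute([':cart_id' => 7, ':destination_id' => 42]);
$deleted = $stmt->rowCount();
echo $deleted . " cart rows deleted" . PHP_EOL;
```

Binding the cart and destination IDs as parameters also removes the quoting problem described above, since no value is spliced into the SQL string.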
