I'm in the process of converting an old mysql_X site to work with PDO, and so far, so good (I think). I have a question regarding the reliability of counting the results of queries, however.
Currently, I've got everything going like this (I've removed the try / catch error code to make this easier to read):
$stm = $db->prepare("SELECT COUNT(*) FROM table WHERE somevar = '1'");
$stm->execute();
$count = $stm->fetchColumn();
if ($count > 0){
$stm = $db->prepare("SELECT * FROM table WHERE somevar = '1'");
$stm->execute();
$result = $stm->fetchAll();
}
There might be stupid problems with doing it this way, and I invite you to tell me if there are, but my question is really about cutting down on database queries. I've noticed that if I cut the first statement out, run the second by itself, and then use PHP's count() to count the results, I still seem to get a reliable row count, with only one query, like this:
$stm = $db->prepare("SELECT * FROM table WHERE somevar = '1'");
$stm->execute();
$result = $stm->fetchAll();
$count = count($result);
if ($count > 0){
//do whatever
}
Are there any pitfalls to doing it this way instead? Is it reliable? And are there any glaring, stupid mistakes in my PDO here? Thanks for the help!
Doing the count in MySQL is preferable, especially if the count value is the only result you're interested in. Compare your versions to the equivalent question "how many chocolate bars does the grocery store have in stock?"
1) Count in the db: SELECT count(*) .... Drive to the store, count the chocolate bars, write down the number, drive home, read the number off your slip of paper.
2) Count in PHP: SELECT * .... Drive to the store. Buy all the chocolate bars. Truck them home. Count them on your living room floor. Write the result on a piece of paper. Throw away the chocolate bars. Read the number off the paper.
Which one is more efficient/less costly? It's not a big deal if your db/table only has a few records. When you start reaching the thousands/millions of records, version 2) is absolutely ludicrous and likely to burn through your bandwidth, blow up your PHP memory limit, and drive your CPU usage into the stratosphere.
That being said, there's no point in running two queries, one just to count how many records you MAY get. Such a system is vulnerable to race conditions: e.g. you do your count and get (say) one record, but by the time you run the second query and fetch that record, some OTHER parallel process has gone and inserted another record, or deleted the one you wanted.
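A minimal sketch of the single-query approach, then (placeholder table/column names taken from the question; the bound parameter is my addition, not part of the original code):
// One round trip: fetch the rows and count them client-side.
// Binding the value instead of embedding '1' in the SQL string
// protects against injection once the value comes from user input.
$stm = $db->prepare("SELECT * FROM table WHERE somevar = ?");
$stm->execute(array('1'));
$result = $stm->fetchAll(PDO::FETCH_ASSOC);

if (count($result) > 0) {
    // work with $result -- the count and the data came from one query
}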
In the first case you are counting using MySQL, and in the second case you are counting using PHP. Both give essentially the same result.
Your usage of the queries is correct. The only problem will appear when you use LIMIT, because the COUNT(*) and the count($result) will be different.
COUNT(*) will count all the rows that the query would have returned (given that the counting query is the same and not using LIMIT)
count($result) will count just the returned rows, so if you use LIMIT, you will just get the results up to the given limit.
Yes it's reliable in this use case!
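For illustration, a hedged sketch of that LIMIT discrepancy (placeholder names from the question; assume 50 rows match the WHERE clause):
// Only the returned rows are counted client-side...
$stm = $db->prepare("SELECT * FROM table WHERE somevar = '1' LIMIT 10");
$stm->execute();
$rows = $stm->fetchAll();
echo count($rows);        // 10 -- capped by LIMIT

// ...while COUNT(*) reports every matching row.
$stm = $db->prepare("SELECT COUNT(*) FROM table WHERE somevar = '1'");
$stm->execute();
echo $stm->fetchColumn(); // 50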
For example, I have a table "tbl_book" with 100 records or more, with multiple columns like book_name, book_publisher, book_author and book_rate, in the MySQL database "db_bookshop". Now I would like to fetch them all with one query, without iterating 100 times, instead looping just once or twice. Is it possible? Is there any tricky way to do that? Generally we do this:
$result = mysql_query("SELECT desire_column_name FROM table_name WHERE clause");
while( $row = mysql_fetch_array($result) ) {
$row['book_name'];
$row['book_publisher'];
$row['book_author'];
..........
$row['book_rate'];
}
// Or we may use mysqli_query() with mysqli_fetch_row(), mysqli_fetch_array(), mysqli_fetch_assoc();
My question is: is there any idea or tricky way to avoid iterating 100 times just to fetch 100 records? It may sound weird to someone, but one of the most experienced programmers I knew told me it's possible. Unfortunately I was not able to learn it from him, and I feel sorry, because he is no longer with us. Thanks in advance for sharing your ideas.
You should not use mysql_query; the mysql extension is deprecated:
This extension is deprecated as of PHP 5.5.0, and has been removed as of PHP 7.0.0.
-- https://secure.php.net/manual/en/intro.mysql.php
When you use PDO you can fetch all items without writing the fetch loop yourself, like this:
$connection = new PDO('mysql:host=localhost;dbname=testdb', 'dbuser', 'dbpass');
$statement = $connection->query('SELECT ...');
$rows = $statement->fetchAll();
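A short usage note (illustrative only; the column name is an assumption borrowed from the question's tbl_book):
// $rows is a plain PHP array once fetchAll() returns, so the usual
// array functions apply -- no further database round trips needed.
echo count($rows), " rows fetched\n";
if (!empty($rows)) {
    echo $rows[0]['book_name']; // first row, one assumed column
}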
The short answer - NO, it's impossible to fetch more than one record from a database without a loop.
But the real question here is that you don't actually want that.
There is no point in "just fetching" the data - you're always going to do something with it. With each row. Obviously, a loop is a natural way to do something with each row. Therefore, there is no point in trying to avoid a loop.
Which renders your question rather meaningless.
Regarding performance: the truth is that you will not experience a single performance problem related to fetching just 100 records from a database, which renders your problem an imaginary one.
The only plausible question I can draw from your post concerns your performance as a programmer, as a lack of education makes you write a lot of unnecessary code. If you manage to ask a specific question regarding that matter, you'll be shown a way to avoid the useless repetitive typing; one such way is sketched below.
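For illustration (my sketch, not the answerer's code): you can iterate over the row itself instead of naming every column by hand, reusing the PDO connection from the snippet above:
// Loop over whatever columns each row happens to have.
$statement = $connection->query('SELECT * FROM tbl_book');
foreach ($statement->fetchAll(PDO::FETCH_ASSOC) as $row) {
    foreach ($row as $column => $value) {
        echo htmlspecialchars("$column: $value"), "<br>\n";
    }
}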
Have you tried using mysql_fetch_assoc?
$result = mysql_query("SELECT desire_column_name FROM table_name WHERE clause");
while ($row = mysql_fetch_assoc($result)) {
// do stuff here like..
if (!empty($row['some_field'])){
echo $row["some_field"];
}
}
It is possible to read all 100 records without a loop by hardcoding the key column values, but that would involve 100 x the number of columns to be listed, and there could be a limit on the number of columns MySQL lets you select.
e.g.,
select
    max(case when book_name = 'abc' then book_name end)      as Name1,
    max(case when book_name = 'abc' then book_publisher end) as Publisher1,
    max(case when book_name = 'abc' then book_author end)    as Author1,
    max(case when book_name = 'xyz' then book_name end)      as Name2,
    max(case when book_name = 'xyz' then book_publisher end) as Publisher2,
    max(case when book_name = 'xyz' then book_author end)    as Author2,
    ...
    ...
from
    db_bookshop.tbl_book;
It's not practical but if you have less rows to query you might find it useful.
The time taken to ask the MySQL server for something is far greater than one iteration through a client-side WHILE loop. So, to improve performance, the goal is to have the SELECT go to the server in one round trip. Different API calls do this or don't do this; read their details.
I have written a lot of UIs with MySQL under the covers. I think nothing of fetching a few dozen rows at once and then building a <table> (or something) with the results. I rarely fetch more than 100, not because of performance, but because 100 is (usually) too much for the user to take in on a single web page.
Also, I think nothing of issuing several, maybe dozens, of queries in support of a single web page. The delay is insignificant, especially when compared to the user's time for reading, digesting, and moving to the next page. So, I try to give the user a digestible amount of info without having to click to another page to get more. There are tradeoffs.
When it is practical to have SQL do the 'digesting', do so. It is faster for MySQL to do a SUM() and return just the total, rather than return dozens of rows for the client to add up. This is mostly a 'bandwidth' issue. Either way, MySQL will fetch (internally) all the needed rows.
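A hedged illustration of that point (the orders table, amount column, and $mysqli handle are hypothetical):
// Let MySQL add things up and ship back a single number...
$result = $mysqli->query("SELECT SUM(amount) FROM orders");
$total = $result->fetch_row()[0];

// ...instead of shipping every row to PHP just to sum it there:
$total = 0;
$result = $mysqli->query("SELECT amount FROM orders");
while ($row = $result->fetch_assoc()) {
    $total += $row['amount'];
}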
Is there any difference in performance between SELECT * and SELECT COUNT(*) when no rows will be found?
My chat script checks every second for new messages, so it would be good to know whether I need COUNT(*).
What would be faster and better for the server:
$result = mysqli_query($con, "blablabla");
if(mysqli_num_rows($result) > 0)
{
}
OR
$result = mysqli_query($con, "blablabla");
$menge = mysqli_fetch_row($result);
$menge = $menge[0];
if($menge > 0)
{
}
Why "when no rows will be found"? I'm asking for an AJAX chat, and most of the time there will be no new rows.
With a basic query the time difference between the two would be negligible (well within natural variation when executing a query repeatedly). With a complex query there could be a significant difference, as with just a COUNT you could likely use a simplified query.
For example, if the full query needs a subquery to get the latest details for a grouped set, then this could make the full query quite slow. However, the count might only be interested in whether there is a row (rather than the details of the latest row) and so not require the subquery. This could make a major difference.
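For illustration, a hedged sketch of such a simplified check (the messages table and $lastSeenId are hypothetical; EXISTS lets MySQL stop at the first match):
// We only care whether any new row exists, so skip the expensive
// subquery entirely and let MySQL bail out at the first hit.
$result = mysqli_query($con,
    "SELECT EXISTS(SELECT 1 FROM messages WHERE id > $lastSeenId)");
$row = mysqli_fetch_row($result);

if ($row[0]) {
    // there are new messages; only now fetch the full details
}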
Assuming you had an index on the id, and the value you are using is at or close to the max value (i.e., you are not expecting to return the full 1M records, but rather a couple at most from the last few seconds), then I wouldn't expect a significant difference caused by the amount of data returned.
The big constant performance hits are making the connection to MySQL and the basic processing of a query by the database (i.e., processing the request, not executing the query). These impose a fairly constant overhead on any query, and quite a significant one for a simple query that is fast once it executes. Because of this, the overhead from performing two queries sometimes (when new rows have been added) is likely to dwarf the negligible saving from just getting a count when there are no new records.
In a site I maintain, I have a need to query the same table (articles) twice, once for each category of article. AFAICT there are basically two ways of doing this (maybe someone can suggest a better, third way?):
Perform the db query twice, meaning the db server has to sort through the entire table twice. After each query, I iterate over the cursor to generate html for a list entry on the page.
Perform the query just once and pull out all the records, then sort them into two separate arrays. After this, I have to iterate over each array separately in order to generate the HTML.
So it's this:
$newsQuery = $mysqli->query("SELECT * FROM articles WHERE type='news' ");
while($newRow = $newsQuery->fetch_assoc()){
// generate article summary in html
}
// repeat for informational articles
vs this:
$query = $mysqli->query("SELECT * FROM articles ");
$news = Array();
$info = Array();
while($row = $query->fetch_assoc()){
if($row['type'] == "news"){
$news[] = $row;
}else{
$info[] = $row;
}
}
// iterate over each array separately to generate article summaries
The recordset is not very large, currently <200, and will probably grow to 1000-2000. Is there a significant difference in the times between the two approaches, and if so, which one is faster?
(I know this whole thing seems awfully inefficient, but it's a poorly coded site I inherited and have to take care of without a budget for refactoring the whole thing...)
I'm writing in PHP, no framework :( , on a MySql db.
Edit
I just realized I left out one major detail. On a given page in the site, we will display (and thus retrieve from the db) no more than 30 records at once - but here's the catch: 15 info articles, and 15 news articles. On each page we pull the next 15 of each kind.
You know you can sort in the DB, right?
SELECT * FROM articles ORDER BY type
EDIT
Due to the change made to the question, I'm updating my answer to address the newly revealed requirement: 15 rows for 'news' and 15 rows for not-'news'.
The gist of the question is the same: "which is faster... one query or two separate queries?" The gist of the answer remains the same: each database roundtrip incurs overhead (extra time, especially over a network connection to a separate database server), so with all else being equal, reducing the number of database roundtrips can improve performance.
The new requirement really doesn't impact that. What the newly revealed requirement really impacts is the actual query to return the specified resultset.
For example:
( SELECT n.*
FROM articles n
WHERE n.type='news'
LIMIT 15
)
UNION ALL
( SELECT o.*
FROM articles o
WHERE NOT (o.type<=>'news')
LIMIT 15
)
Running that statement as a single query is going to require fewer database resources and be faster than running two separate statements and retrieving two disparate resultsets.
We weren't provided any indication of what the other values for type can be, so the statement offered here simply addresses two general categories of rows: rows that have type='news', and all other rows that have some other value for type.
That query assumes that type allows for NULL values, and we want to return rows that have a NULL for type. If that's not the case, we can adjust the predicate to be just
WHERE o.type <> 'news'
Or, if there are specific values for type we're interested in, we can specify that in the predicate instead
WHERE o.type IN ('alert','info','weather')
If "paging" is a requirement... "next 15", the typical pattern we see applied, LIMIT 30,15 can be inefficient. But this question isn't asking about improving efficiency of "paging" queries, it's asking whether running a single statement or running two separate statements is faster.
And the answer to that question is still the same.
ORIGINAL ANSWER below
There's overhead for every database roundtrip. In terms of database performance, for small sets (like you describe) you're better off with a single database query.
The downside is that you're fetching all of those rows and materializing an array. (But it looks like that's the approach you're using in either case.)
Given the choice between the two options you've shown, go with the single query. That's going to be faster.
As far as a different approach, it really depends on what you are doing with those arrays.
You could actually have the database return the rows in a specified sequence, using an ORDER BY clause.
To get all of the 'news' rows first, followed by everything that isn't 'news', you could
ORDER BY type<=>'news' DESC
That's MySQL shorthand for the more ANSI-standards-compliant:
ORDER BY CASE WHEN t.type = 'news' THEN 1 ELSE 0 END DESC
Rather than fetch every single row and store it in an array, you could just fetch from the cursor as you output each row, e.g.
while($row = $query->fetch_assoc()) {
echo "<br>Title: " . htmlspecialchars($row['title']);
echo "<br>byline: " . htmlspecialchars($row['byline']);
echo "<hr>";
}
The best way of dealing with a situation like this is to test it for yourself. It doesn't matter how many records you have at the moment; you can simulate whatever amount you'd like, that's never a problem. Also, 1000-2000 is really a small set of data.
I don't quite understand why you'd have to iterate over all the records twice. You should never retrieve all the records in a query either way, only the small subset you need to be working with. In a typical site where you manage articles it's usually about 10 records per page MAX. No user will ever go through 2000 articles in a way that would require you to pull all the records at once. Utilize paging and smart querying.
// iterate over each array separately to generate article summaries
I'm not entirely sure what you mean by this, but something tells me this data should be stored in the database as well. I really hope you're not generating article excerpts on the fly for every page hit.
It all sounds to me more like a bad architecture design than anything else...
PS: I believe sorting/ordering/filtering of database data should be done on the database server, not in the application itself. You may save some traffic by doing a single query, but it won't help much if you transfer too much data at once that you won't be using anyway.
What is the difference (performance-wise) between
$var = mysql_query("select * from tbl where id='something'");
$count = mysql_num_rows($var);
if ($count > 1) {
    // do something
}
and
$var = mysql_query("select count(*) from tbl where id='somthing'");
P.S.: I know mysql_* is deprecated.
The first version returns the entire result set. This can be a large data volume, if your table is wide. If there is an index on id, it still needs to read the original data to fill in the values.
The second version returns only the count. If there is an index on id, then it will use the index and not need to read in any data from the data pages.
If you only want the count and not the data, the second version is clearer and should perform better.
select * asks MySQL to fetch all the data from that table (given the conditions) and hand it to you. This is not a very optimizable operation, and it results in a lot of data being organised and sent over the socket to PHP.
Since you then do nothing with this data, you have asked mysql to do a whole lot of data processing for nothing.
Instead, just asking mysql to count() the number of rows that fit the conditions will not result in it trying to send you all that data, and will result in a faster query, especially if the id field is indexed.
Overall, though, if your PHP application is still simple, this might be regarded as a micro-optimization; it is still good practice, however.
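Since mysql_* is deprecated, here is a hedged mysqli sketch of the COUNT version (table and column names from the question; the $mysqli handle, $id variable, and bound parameter are my additions):
$stmt = $mysqli->prepare("SELECT COUNT(*) FROM tbl WHERE id = ?");
$stmt->bind_param('s', $id);  // bind instead of interpolating the value
$stmt->execute();
$stmt->bind_result($count);
$stmt->fetch();
$stmt->close();

if ($count > 1) {
    // do something
}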
I would use the second for two reasons:
As you stated, mysql_* is deprecated
If your table is huge, you're putting quite a large amount of data in $var only to count it.
SELECT * FROM tbl_university_master;
2785 row(s) returned
Execution Time : 0.071 sec
Transfer Time : 7.032 sec
Total Time : 8.004 sec
SELECT COUNT(*) FROM tbl_university;
1 row(s) returned
Execution Time : 0.038 sec
Transfer Time : 0 sec
Total Time : 0.039 sec
The former collects all the data and counts the number of rows in the resultset, which is performance-intensive. The latter just does a quick count, which is much faster.
However, if you need both the data and the count, it's more sensible to execute the first query and use mysql_num_rows (or something similar in PDO) than to execute yet another query to do the counting.
And indeed, mysql_* is to be deprecated. But the same applies when using MySQLi or PDO, as sketched below.
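A hedged PDO sketch of that fetch-once-and-count approach ($pdo and the literal value are assumptions; note that PDOStatement::rowCount() is not guaranteed to report SELECT row counts on every driver, so counting the fetched array is the portable route):
$stmt = $pdo->prepare("SELECT * FROM tbl WHERE id = ?");
$stmt->execute(array('something'));
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

if (count($rows) > 0) {
    // use $rows -- no second COUNT(*) query needed
}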
I think using
$var = mysql_query("select count(*) from tbl where id='somthing'");
is more efficient, because you aren't allocating memory based on the number of rows that get returned from MySQL.
select * from tbl where id='something' selects all the data from the table matching the ID condition.
The COUNT() function returns the number of rows that matches a specified criteria.
For more reading, practice, and demonstrations, please visit w3schools.
I've been reading a lot about the disadvantages of using "order by rand", so I don't need an update on that.
I was thinking: since I only need a limited number of rows retrieved from the db to be randomized, maybe I should do:
$r = $db->query("select * from table limit 500");
for($i;$i<500;$i++)
$arr[$i]=mysqli_fetch_assoc($r);
shuffle($arr);
(I know this only randomizes the first 500 rows, so be it.)
would that be faster than
$r = $db->("select * from table order by rand() limit 500");
Let me just mention: say the db tables were packed with more than... 10,000 rows.
Why don't you test it yourself?!? Well, I have, but I'm looking for your experienced opinion.
Thanks!
500 or 10K, the sample size is too small to be able to draw tangible conclusions. At 100K, you're still looking at the 1/2 second region on this graph. If you're still concerned with performance, look at the two options for a randomized number I provided in this answer.
We don't have your data or setup, so it's left to you to actually test the situation. There are numerous pages on how to calculate elapsed time in PHP: create two pages, one using shuffle and the other using the RAND() query, run at least 10 of each, and take a look. A minimal timing sketch follows.
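A hedged timing sketch along those lines ($db is assumed to be a mysqli handle, as in the question; microtime(true) returns seconds as a float):
// Variant A: let MySQL randomize.
$start = microtime(true);
$r = $db->query("SELECT * FROM table ORDER BY RAND() LIMIT 500");
while ($row = mysqli_fetch_assoc($r)) { $a[] = $row; }
$timeA = microtime(true) - $start;

// Variant B: plain LIMIT, then shuffle in PHP.
$start = microtime(true);
$r = $db->query("SELECT * FROM table LIMIT 500");
while ($row = mysqli_fetch_assoc($r)) { $b[] = $row; }
shuffle($b);
$timeB = microtime(true) - $start;

printf("ORDER BY RAND(): %.4fs, shuffle(): %.4fs\n", $timeA, $timeB);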
I am looking at this from experience with MySQL.
Let's talk about the first piece of code:
$r = $db->query("select * from table");
for($i=0;$i<500;$i++){
$arr[$i] = mysqli_fetch_assoc($r);
}
shuffle($arr);
Clearly it would be more efficient to LIMIT the number of rows in the SQL statement instead of doing it in PHP.
Thus:
$r = $db->query("SELECT * FROM table LIMIT 500");
while($arr[] = mysqli_fetch_assoc($r)){}
shuffle($arr);
The SQL operation would be faster than doing it in PHP, especially with such a large number of rows. One good way to find out is to benchmark both and see which is faster. My bet is that the SQL would be faster than shuffling in PHP.
So my vote goes for:
$r = $db->query("SELECT * FROM table ORDER BY RAND() LIMIT 500");
while($arr[] = mysqli_fetch_assoc($r)){}
I'm pretty sure the shuffle takes longer in your case, but you may want to see this link for examples of fetching fast random sets from a database. It requires a bit of extra SQL, but if speed is important to you, do this.
http://devzone.zend.com/article/4571-Fetching-multiple-random-rows-from-a-database
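For reference, a hedged sketch of the kind of technique that article describes: pick random ids in PHP first, then fetch only those rows (assumes an auto-increment id column without large gaps; all names are illustrative):
// 1. One cheap, indexed query for the current maximum id.
$max = $db->query("SELECT MAX(id) FROM table")->fetch_row()[0];

// 2. Pick random ids in PHP -- no ORDER BY RAND() full-table sort.
$ids = array();
while (count($ids) < 500) {
    $ids[mt_rand(1, $max)] = true; // array keys de-duplicate for us
}

// 3. Fetch only those rows. Gaps in the id sequence may yield fewer
//    than 500 rows; loop again if you need exactly 500.
$in = implode(',', array_keys($ids));
$r = $db->query("SELECT * FROM table WHERE id IN ($in)");
while ($row = mysqli_fetch_assoc($r)) {
    $arr[] = $row;
}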