using count instead of mysql_num_rows? - php

What is the difference, performance-wise, between
$var = mysql_query("select * from tbl where id='something'");
$count = mysql_num_rows($var);
if($count > 1){
// do something
}
and
$var = mysql_query("select count(*) from tbl where id='something'");
P.S: I know mysql_* are deprecated.

The first version returns the entire result set. This can be a large data volume, if your table is wide. If there is an index on id, it still needs to read the original data to fill in the values.
The second version returns only the count. If there is an index on id, then it will use the index and not need to read in any data from the data pages.
If you only want the count and not the data, the second version is clearer and should perform better.
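As a rough illustration of the two versions, here is a minimal sketch using PDO. It assumes an in-memory SQLite database as a stand-in for MySQL so that it runs without a server; the table tbl, its payload column, and the sample rows are invented for the example.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; the table "tbl" and its
// rows are invented for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE tbl (id TEXT, payload TEXT)');
$pdo->exec('CREATE INDEX idx_tbl_id ON tbl (id)');

$insert = $pdo->prepare('INSERT INTO tbl (id, payload) VALUES (?, ?)');
foreach (['something', 'something', 'something', 'other'] as $id) {
    $insert->execute([$id, str_repeat('x', 1000)]);
}

// Version 1: pull every matching row (including the wide payload) into PHP,
// then count the rows on the PHP side.
$stmt = $pdo->prepare('SELECT * FROM tbl WHERE id = ?');
$stmt->execute(['something']);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
$countFromRows = count($rows);

// Version 2: let the database count; only a single one-column row comes back.
$stmt = $pdo->prepare('SELECT COUNT(*) FROM tbl WHERE id = ?');
$stmt->execute(['something']);
$countFromSql = (int) $stmt->fetchColumn();

echo "fetched and counted: $countFromRows, COUNT(*): $countFromSql\n";
```

Both versions arrive at the same number; the difference is how much data has to be read and handed to PHP to get there.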

select * asks MySQL to fetch all the data from that table (given the conditions) and hand it to you. This is hard to optimize and results in a lot of data being organised and sent over the socket to PHP.
Since you then do nothing with this data, you have asked MySQL to do a whole lot of data processing for nothing.
Asking MySQL to count() the rows that fit the conditions instead means it never tries to send you all that data, which gives a faster query, especially if the id field is indexed.
Overall though, while this is good practice, it might be regarded as a micro-optimization if your PHP application is still simple.

I would use the second, for two reasons:
As you stated, the mysql_* functions are deprecated.
If your table is huge, you're putting a large amount of data in $var only to count it.

SELECT * FROM tbl_university_master;
2785 row(s) returned
Execution Time : 0.071 sec
Transfer Time : 7.032 sec
Total Time : 8.004 sec
SELECT COUNT(*) FROM tbl_university;
1 row(s) returned
Execution Time : 0.038 sec
Transfer Time : 0 sec
Total Time : 0.039 sec

The first collects all data and counts the number of rows in the resultset, which is performance-intensive. The latter just does a quick count which is way faster.
However, if you need both the data and the count, it's more sensible to execute the first query and use mysql_num_rows (or something similar in PDO) than to execute yet another query to do the counting.
And indeed, mysql_* is to be deprecated. But the same applies when using MySQLi or PDO.

I think using
$var = mysql_query("select count(*) from tbl where id='something'");
is more efficient, because you aren't allocating memory based on the number of rows returned from MySQL.

select * from tbl where id='something' selects all the data from the table that matches the id condition.
The COUNT() function returns the number of rows that match the specified criteria.
For more reading, practice, and demonstrations, see w3schools.

Related

Change the seed of RAND() function in PHP?

I access a database table from a PHP script, and sometimes I get the same result several times in a row.
I ran this query:
$query ="SELECT Question,Answer1,Answer2 FROM MyTable ORDER BY RAND(UNIX_TIMESTAMP(NOW())) LIMIT 1";
Before this query I tried plain ORDER BY RAND(), but it gave me a lot of consecutive repeats, which is why I decided to use ORDER BY RAND(UNIX_TIMESTAMP(NOW())).
But this one still gives me consecutive repeats (although fewer).
Here is an example of what I mean by "continuous repeat results":
Imagine that I have 100 rows in my table: ROW1,ROW2,ROW3,ROW4,ROW5...
When I call my PHP script 5 times in succession, I get 5 results:
-ROW2,ROW20,ROW20,ROW50,ROW66
I don't want the same row twice in a row.
I would like, for example: -ROW2,ROW20,ROW50,ROW66,ROW20
I just want an easy way to fix this.
https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_rand
RAND() is not meant to be a perfect random generator. It is a fast way
to generate random numbers on demand that is portable between
platforms for the same MySQL version.
If you want 5 results, why not change the limit to 5? This will ensure that there are no duplicates.
Another option is to read all of the data out and then use shuffle in PHP:
http://php.net/manual/en/function.shuffle.php
Or select the max and use a random number generated in PHP:
http://php.net/manual/en/function.mt-rand.php
This is not doable by just redefining the query. You need to change the logic of your PHP script.
If you want the PHP script (and the query) to return exactly ONE row per execution, and you need a guarantee that consecutive executions of the PHP script yield different rows, then you need to store the previous result somewhere and use it in the WHERE condition of the query.
So your PHP script becomes something like (pseudocode):
$previousId = ...; // Load the ID of the row fetched by the previous execution
$query = "SELECT Question,Answer1,Answer2
FROM MyTable
WHERE id <> ?
ORDER BY RAND(UNIX_TIMESTAMP(NOW()))
LIMIT 1";
// Execute $query, using the $previousId bound parameter value
$newId = ...; // get the ID of the fetched row.
// Save $newId for the next execution.
You may use any kind of storage for saving and loading the ID of the fetched row. The easiest is probably a dedicated single-row table in the same database.
Note that you may still get repeated sequential rows if you call your PHP script many times in parallel. Not sure if it matters in your case.
If it does, you may use locks or database transactions to fix this as well.
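The pseudocode above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: it uses an in-memory SQLite database in place of MySQL (so RANDOM() replaces RAND()), keeps the previous ID in a plain variable rather than persistent storage, and the helper name pickQuestion() and the sample rows are hypothetical.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL, so RANDOM() replaces RAND().
// The table contents and the pickQuestion() helper are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE MyTable (id INTEGER PRIMARY KEY, Question TEXT)');
$pdo->exec("INSERT INTO MyTable (Question) VALUES ('Q1'), ('Q2'), ('Q3')");

function pickQuestion(PDO $pdo, ?int $previousId): array
{
    // Exclude the row returned last time; -1 matches no ID on the first call.
    $stmt = $pdo->prepare(
        'SELECT id, Question FROM MyTable WHERE id <> ? ORDER BY RANDOM() LIMIT 1'
    );
    $stmt->bindValue(1, $previousId ?? -1, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetch(PDO::FETCH_ASSOC);
}

// In a real script the previous ID would be loaded from and saved to a
// session, file, or table between requests; a variable is used here.
$previousId = null;
$picked = [];
for ($i = 0; $i < 10; $i++) {
    $row = pickQuestion($pdo, $previousId);
    $previousId = (int) $row['id'];
    $picked[] = $previousId;
}
echo implode(',', $picked), "\n";
```

The same row can still come up again later in the sequence; the WHERE condition only rules out an immediate repeat.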

PDO::rowCount VS COUNT(*)

I have a query that uses PDO. It counts the rows first and, if row >1, then fetches the data:
SELECT * WHERE id=:id
$row=$SQL->rowCount();
if($row>0){
while($data=$SQL->fetch(PDO::FETCH_ASSOC)){...
}
}
else{echo "no result";}
or
SELECT COUNT(*), * WHERE id=:id
$data=$SQL->fetch(PDO::FETCH_NUM);
$row=$data[0];
if($row>0){
//fetch data
}
else{echo "no result";}
Which will give better performance?
Second question: if I have set up an index on id, which one is better, COUNT(id) or COUNT(*)?
1st question:
With COUNT(), the server (MySQL) processes the request differently internally.
When doing COUNT(), the server (MySQL) only allocates memory to store the result of the count.
When using $row=$SQL->rowCount(); the server (Apache/PHP) processes the entire result set, allocates memory for all those results, and puts the server in fetching mode, which involves a lot of different details, such as locking.
Take note that PDOStatement::rowCount() returns the number of rows affected by the last statement, not the number of rows returned. If the last SQL statement executed by the associated PDOStatement was a SELECT statement, some databases may return the number of rows returned by that statement. However, this behaviour is not guaranteed for all databases and should not be relied on for portable applications.
By my analysis, if you use COUNT(), the processing is divided between MySQL and PHP, while if you use $row=$SQL->rowCount(); more of the processing falls on PHP.
Therefore COUNT() in MySQL is faster.
2nd question:
COUNT(*) is better than COUNT(id).
Explanation:
COUNT(*) in MySQL is optimized to find the number of rows. Using the wildcard means it does not fetch every row; it only finds the count. So use COUNT(*) wherever possible.
Sources:
PDOStatement::rowCount
MySQL COUNT(*)
As a matter of fact, neither PDO rowCount nor COUNT(*) is ever required here.
if row >1 then fetch data
is a faulty statement.
In a sanely designed web application (I know it sounds like a joke for PHP) one doesn't have to do it this way.
The most sensible way would be
to fetch first
to use the fetched data
if needed, we can use this very fetched data to see whether anything was returned:
$data = $stmt->fetch();
if($data){
//use it whatever
} else {
echo "No record";
}
Easy, straightforward, and no questions like "which useless feature is better" at all.
In your case, assuming id is a unique index, only one row can be returned. Therefore, you don't need the while statement at all. Just use the snippet above both to fetch and to tell whether anything has been fetched.
If many rows are expected, just change fetch() to fetchAll() and then use foreach to iterate over the returned array:
$data = $stmt->fetchAll();
if($data){
foreach ($data as $row) {
//use it whatever
}
} else {
echo "No records";
}
Note that you should never select more rows than needed. This means your query on a regular web page should never return more rows than will be displayed.
Speaking of the question itself: it makes no sense. One cannot compare rowCount with COUNT(*), because they are not comparable. These two functions have entirely different purposes and cannot be interchanged:
COUNT(*) returns one single row with the count, and has to be used ONLY if one needs the count of records but not the records themselves.
If you need the records, count(whatever) is neither faster nor slower: it's pointless.
rowCount() returns the number of already selected rows, and therefore you scarcely need it, as shown above.
Not to mention that the second example will fetch no rows at all.
COUNT(id) or COUNT(*) will use an index scan, so it will be better for performance. rowCount returns only affected rows and is useful on insert/update/delete.
EDIT:
Since the question was edited to compare COUNT(id) and COUNT(*), there is a slight difference. COUNT(*) returns the number of rows matched by the select. COUNT(column) returns the number of non-null values, but since the column is id, there won't be any nulls. So it makes no difference in this case.
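The NULL-handling difference described above can be checked with a small sketch. It assumes an in-memory SQLite database in place of MySQL (both follow standard SQL here); the table t and its note column are invented for the example.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; table "t" is invented.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)');
$pdo->exec("INSERT INTO t (note) VALUES ('a'), (NULL), ('c')");

$row = $pdo
    ->query('SELECT COUNT(*) AS all_rows, COUNT(id) AS ids, COUNT(note) AS notes FROM t')
    ->fetch(PDO::FETCH_ASSOC);

// COUNT(*) and COUNT(id) agree because id is never NULL;
// COUNT(note) skips the row whose note is NULL.
printf(
    "COUNT(*)=%d COUNT(id)=%d COUNT(note)=%d\n",
    $row['all_rows'],
    $row['ids'],
    $row['notes']
);
```

With these three rows the script prints COUNT(*)=3 COUNT(id)=3 COUNT(note)=2.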
The performance difference should be negligible to none, since you are issuing only one query in both cases. The 2nd query has to fetch an extra column with the same value for every row matching id, hence it might have a larger memory footprint. Even without the COUNT(*) the row count should be available, hence you should go with the 1st solution.
About your 2nd question, AFAIK either COUNT(id) or COUNT(*) will be faster with the index on id, since the db engine will have to perform a range scan to retrieve the rows in question, and range scans are faster with indexes when filtering on the indexed column (in your case id = SOME_ID).
Count(*) will be faster.
PDOStatement::rowCount() is not guaranteed to work according to the PDO documentation:
"not guaranteed for all databases and should not be relied on for portable applications."
So in your case I'd suggest using count(*).
See reference:
pdostatement.rowcount Manual

Using count(*) vs num_rows

To get the number of rows in a result set there are two ways:
1. Use a query to get the count
$query="Select count(*) as count from some_table where type='t1'";
and then retrieve the value of count.
2. Get the count via num_rows() in PHP.
So which one is better performance-wise?
If your goal is to actually count the rows, use COUNT(*). num_rows is ordinarily (in my experience) only used to confirm that more than zero rows were returned and continue on in that case. It will probably take MySQL longer to read out many selected rows compared to the aggregation on COUNT too even if the query itself takes the same amount of time.
There are a few differences between the two:
num_rows is the number of result rows (records) received.
count(*) is the number of records in the database matching the query.
The database may be configured to limit the number of returned results (MySQL allows this for instance), in which case the two may differ in value if the limit is lower than the number of matching records. Note that limits may be configured by the DBA, so it may not be obvious from the SQL query code itself what limits apply.
Using num_rows to count records implies "transmitting" each record, so if you only want a total number (which would be a single record/row) you are far better off getting the count instead.
Additionally, count can be used in more complex query scenarios to do things like sub-totals, which is not easily done with num_rows.
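To make the first two differences concrete, here is a small sketch. It assumes an in-memory SQLite database in place of MySQL, and it uses an explicit LIMIT in the query to play the role of a configured row limit; the table contents are invented.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; an explicit LIMIT plays
// the role of a server-side row limit. Table and data are invented.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE some_table (id INTEGER PRIMARY KEY, type TEXT)');

$insert = $pdo->prepare('INSERT INTO some_table (type) VALUES (?)');
for ($i = 0; $i < 10; $i++) {
    $insert->execute(['t1']);
}

// Rows actually received by PHP: capped by the limit.
$received = $pdo
    ->query("SELECT * FROM some_table WHERE type = 't1' LIMIT 4")
    ->fetchAll();
$numRows = count($received);

// Rows matching in the database: unaffected by how many were fetched.
$matching = (int) $pdo
    ->query("SELECT COUNT(*) FROM some_table WHERE type = 't1'")
    ->fetchColumn();

echo "received: $numRows, matching: $matching\n";
```

Here PHP receives 4 rows while 10 rows match, so counting the received rows would under-report the total.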
count is much more efficient, both performance-wise and memory-wise, as you don't have to retrieve so much data from the database server. If you count by a single column such as a unique id, you may make it a little more efficient.
It depends on your implementation. If you're dealing with a lot of rows, count(*) is better because it doesn't have to pass all of those rows to PHP. If, on the other hand, you're dealing with a small number of rows, the difference is negligible.
num_rows() would be better if you have a small quantity of rows, and count(*) will give you better performance if there is a large number of rows and you only have to select one and send it to PHP.

Getting total number of records from mysql table - too slow

I have a file that goes through a large data set and spits out the rows in a paginated manner. The dataset contains about 210k rows, which isn't even that much; it will grow to 3Mil+ in a few weeks, but it's already slow.
I have a first query that gets the total number of items in the DB for a particular WHERE clause combination, the most basic one looks like this:
SELECT count(v_id) as num_items FROM versions
WHERE v_status = 1
It takes 0.9 seconds to run.
The 2nd query is a LIMIT query that gets the actual data for that page. This query is really quick. (less than 0.001 s).
SELECT
v_id,
v_title,
v_desc
FROM versions
WHERE v_status = 1
ORDER BY v_dateadded DESC
LIMIT 0, 25
There is an index on v_status, v_dateadded
I use PHP. I cache the result in memcache, so subsequent requests are really fast, but the first request is laggy. Especially once I throw a fulltext search in there, it starts taking 2-3 seconds for the 2 queries.
I don't think this is right, but try making it count(*). I think count(x) has to go through every row and count only the ones that don't have a null value (so it has to go through all the rows).
Given that v_id is a PRIMARY KEY it should not have any nulls, so try count(*) instead...
But I don't think it will help, since you have a where clause.
Not sure if this is the same for MySQL, but in MS SQL Server COUNT(*) is almost always faster than COUNT(column). The parser determines the fastest column to count and uses that.
Run an explain plan to see how the optimizer is running your queries.
That'll probably tell you what Andreas Rehm told you: you'll want to add indices that cover your where clauses.
EDIT: For me FOUND_ROWS() was the fastest way of doing this:
SELECT
SQL_CALC_FOUND_ROWS
v_id,
v_title,
v_desc
FROM versions
WHERE v_status = 1
ORDER BY v_dateadded DESC
LIMIT 0, 25;
Then in a secondary query just do:
SELECT FOUND_ROWS();
If you are outputting to PHP you do this:
$totalnumber = mysql_result(mysql_query($secondquery), 0, 0);
I was previously trying to do the same thing as OP, putting COUNT(column) on the first query, but it took about three times longer than even the slowest WHERE and ORDER BY query that I could do (with a LIMIT set). I tried changing to COUNT(*) and it improved a lot. But results in my case were even better using MySQL's FOUND_ROWS();
I am testing in PHP with microtime and repeating the query. In OP's case, if he ran COUNT(*) I think he will save some time, but it is not the fastest way of doing this. I ran some tests on COUNT(*) VS. FOUND_ROWS() and FOUND_ROWS() is quite a bit faster.
Using FOUND_ROWS() was nearly twice as fast in my case.
I first started doing EXPLAIN on the COUNT(*) query. In OP's case you would see that MySQL still checks a total of 210k rows in the first query. It checks every row before even starting the LIMIT query and doesn't seem to get any performance benefit from doing this.
If you run EXPLAIN on the LIMIT query it will probably check less than 100 rows as you have limited the results to 25. But this is still overlap and there will be some cases where you can't afford this or at the least you should still compare performance with FOUND_ROWS().
I thought this might only save time on large LIMIT requests, but when I run EXPLAIN on my LIMIT query it was actually only checking 25 rows to get 15 values. However, there was still a very noticeable difference in query time - on average I got down from .25 to .14 seconds and achieved the same results.
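For comparison, the two-query approach from the question (one COUNT(*) for the total, one LIMIT query for the page) might look like this with PDO. This is a sketch only: it assumes an in-memory SQLite database in place of MySQL, a cut-down versions table with invented rows, and hypothetical $page and $perPage variables. Note also that, as far as I know, the MySQL manual lists SQL_CALC_FOUND_ROWS and FOUND_ROWS() as deprecated from version 8.0.17 and suggests a separate COUNT(*) query instead, so it is worth measuring both on your own server.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; the rows are invented and
// the table is a cut-down version of the one in the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec(
    'CREATE TABLE versions (
        v_id INTEGER PRIMARY KEY,
        v_title TEXT,
        v_status INTEGER,
        v_dateadded INTEGER
    )'
);
$pdo->exec('CREATE INDEX idx_status_date ON versions (v_status, v_dateadded)');

$insert = $pdo->prepare(
    'INSERT INTO versions (v_title, v_status, v_dateadded) VALUES (?, ?, ?)'
);
for ($i = 1; $i <= 60; $i++) {
    // Every third row gets status 0, so 40 of the 60 rows have status 1.
    $insert->execute(["title $i", $i % 3 === 0 ? 0 : 1, $i]);
}

// Query 1: total number of matching rows, used for the page links.
$total = (int) $pdo
    ->query('SELECT COUNT(*) FROM versions WHERE v_status = 1')
    ->fetchColumn();

// Query 2: only the rows for the requested page.
$perPage = 25;
$page = 2; // 1-based page number (hypothetical request parameter)
$stmt = $pdo->prepare(
    'SELECT v_id, v_title FROM versions
     WHERE v_status = 1
     ORDER BY v_dateadded DESC
     LIMIT :limit OFFSET :offset'
);
$stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
$stmt->bindValue(':offset', ($page - 1) * $perPage, PDO::PARAM_INT);
$stmt->execute();
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

$pages = (int) ceil($total / $perPage);
echo "total: $total, pages: $pages, rows on page $page: ", count($rows), "\n";
```

Binding LIMIT and OFFSET with PDO::PARAM_INT keeps them as integers rather than quoted strings, which matters on MySQL when prepared statements are emulated.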

How do PHP/MySQL database queries work exactly?

I have used MySQL a lot, but I always wondered exactly how does it work - when I get a positive result, where is the data stored exactly? For example, I write like this:
$sql = "SELECT * FROM TABLE";
$result = mysql_query($sql);
while ($row = mysql_fetch_object($result)) {
echo $row->column_name;
}
When a result is returned, I am assuming it's holding all the data results or does it return in a fragment and only returns where it is asked for, like $row->column_name?
Or does it really return every single row of data even if you only wanted one column in $result?
Also, if I paginate using LIMIT, does it hold THAT original (old) result even if the database is updated?
The details are implementation dependent but generally speaking, results are buffered. Executing a query against a database will return some result set. If it's sufficiently small all the results may be returned with the initial call or some might be and more results are returned as you iterate over the result object.
Think of the sequence this way:
1. You open a connection to the database;
2. There is possibly a second call to select a database or it might be done as part of (1);
3. That authentication and connection step is (at least) one round trip to the server (ignoring persistent connections);
4. You execute a query on the client;
5. That query is sent to the server;
6. The server has to determine how to execute the query;
7. If the server has previously executed the query the execution plan may still be in the query cache. If not a new plan must be created;
8. The server executes the query as given and returns a result to the client;
9. That result will contain some buffer of rows that is implementation dependent. It might be 100 rows or more or less. All columns are returned for each row;
10. As you fetch more rows eventually the client will ask the server for more rows. This may be when the client runs out or it may be done preemptively. Again this is implementation dependent.
The idea of all this is to minimize roundtrips to the server without sending back too much unnecessary data, which is why if you ask for a million rows you won't get them all back at once.
LIMIT clauses--or any clause in fact--will modify the result set.
Lastly, (7) is important because SELECT * FROM table WHERE a = 'foo' and SELECT * FROM table WHERE a = 'bar' are two different queries as far as the database optimizer is concerned so an execution plan must be determined for each separately. But a parameterized query (SELECT * FROM table WHERE a = :param) with different parameters is one query and only needs to be planned once (at least until it falls out of the query cache).
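A parameterized query of the kind described above might be used like this with PDO. The sketch assumes an in-memory SQLite database in place of MySQL and an invented table t. One caveat to verify for your setup: the PDO MySQL driver emulates prepared statements by default, so the server only sees a single reusable statement if PDO::ATTR_EMULATE_PREPARES is turned off.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; table "t" is invented.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE t (a TEXT)');
$pdo->exec("INSERT INTO t (a) VALUES ('foo'), ('foo'), ('bar')");

// Prepared once: the SQL text is the same no matter which value is bound.
$stmt = $pdo->prepare('SELECT COUNT(*) FROM t WHERE a = :param');

$counts = [];
foreach (['foo', 'bar'] as $value) {
    // Executed many times: only the parameter value changes.
    $stmt->execute([':param' => $value]);
    $counts[$value] = (int) $stmt->fetchColumn();
    $stmt->closeCursor(); // release the result before the next execution
}

print_r($counts);
```

Binding the value also keeps it out of the SQL string, so the statement is not exposed to injection through $value.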
I think you are confusing the two types of variables you're dealing with, and neither answer really clarifies that so far.
$result is a MySQL result object. It does not "contain any rows." When you say $result = mysql_query($sql), MySQL executes the query, and knows what rows will match, but the data has not been transferred over to the PHP side. $result can be thought of as a pointer to a query that you asked MySQL to execute.
When you say $row = mysql_fetch_object($result), that's when PHP's MySQL interface retrieves a row for you. Only that row is put into $row (as a plain old PHP object, but you can use a different fetch function to ask for an associative array, or specific column(s) from each row.)
Rows may be buffered with the expectation that you will be retrieving all of the rows in a tight loop (which is usually the case), but in general, rows are retrieved when you ask for them with one of the mysql_fetch_* functions.
If you only want one column from the database, then you should SELECT that_column FROM .... Using a LIMIT clause is also a good idea whenever possible, because MySQL can usually perform significant optimizations if it knows that you only want a certain group of rows.
The first question can be answered by reading up on resources.
Since you are SELECTing "*", every column is returned for each mysql_fetch_object call. Just look at print_r($row) to see.
In simple words, the resource returned is like an ID that the MySQL library associates with other data. I think of it like the identification card in your wallet: it's just a number and some information, but it is associated with a lot more information if you give it to the government, your cell-phone company, etc.
