I'm doing a pagination feature using Codeigniter but I think this applies to PHP/mySQL coding in general.
I am retrieving directory listings using offset and limit depending on how many results I want per page. However to know the total number of pages required, I need to know (total number of results)/(limit). Right now I am thinking of running the SQL query a second time then count the number of rows required but without using LIMIT. But I think this seems to be a waste of computational resources.
Are there any better ways? Thanks!
EDIT: My SQL query uses WHERE as well to select all rows with a particular 'category_id'
Take a look at SQL_CALC_FOUND_ROWS
SELECT COUNT(*) FROM table_name WHERE column = 'value' will return the total number of records in a table matching that condition very quickly.
Database SELECT operations are usually "cheap" (resource-wise), so don't feel too bad about using them in a reasonable manner.
EDIT: Added WHERE after the OP mentioned that they need that feature.
The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS()
function are deprecated as of MySQL 8.0.17 and will be removed in a
future MySQL version. As a replacement, considering executing your
query with LIMIT, and then a second query with COUNT(*) and without
LIMIT to determine whether there are additional rows. For example,
instead of these queries:
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();
Use these queries instead:
SELECT * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT COUNT(*) WHERE id > 100;
COUNT(*) is subject to certain optimizations. SQL_CALC_FOUND_ROWS
causes some optimizations to be disabled.
Considering that SQL_CALC_FOUND_ROWS requires invoking FOUND_ROWS() afterwards, if you wanted to fetch the total count with the results returned from your limit without having to invoke a second SELECT, I would use JOIN results derived from a subquery:
SELECT * FROM `table` JOIN (SELECT COUNT(*) FROM `table` WHERE `category_id` = 9) t2 WHERE `category_id` = 9 LIMIT 50
Note: Every derived table must have its own alias, so be sure to name the joined table. In my example I used t2.
you can use 2 queries as below, One to fetch the data with limit and other to get the no of total matched rows.
Ex:
SELECT * FROM tbl_name WHERE id > 1000 LIMIT 10;
SELECT COUNT(*) FROM tbl_name WHERE id > 1000;
As described by Mysql guide , this is the most optimized way, and also SQL_CALC_FOUND_ROWS query modifier and FOUND_ROWS() function are deprecated as of MySQL 8.0.17
as of final of 2021, why not:
SELECT
t1.*,
COUNT(t1.*) OVER (PARTITION BY RowCounter) as TotalRecords
FROM (
SELECT a, b, c, 1 as RowCounter
FROM MyTable
) t1
LIMIT 120,10
using a subquery with a column marking every row with the same value, will give us the possibility to count all of the same values of the the resulted column with PARTITION BY window function's group
SELECT COUNT(id) FROM `table` WHERE `category_id` = 9
Gives you the number of rows for your specific category.
Related
I'm working on php small project, here I need first 5 records from beginning of records and last record 1 from end of the table's record. I don't know how to write a single mysqli query.
Any help will be appreciated. Thanks in advance.
SQL tables represent unordered sets. So, there is no such thing as the first five rows or last row -- unless a column explicitly defines the ordering.
Often, a table has some sort of auto-incremented id column, which can be used for this purpose. If so, you can do:
(select t.*
from t
order by id asc
limit 5
) union all
(select t.*
from t
order by id desc
limit 1
);
Notes:
Sometimes, an insert date/time column is the appropriate column to use.
You want to use union all rather than union -- unless you want to incur the overhead of removing duplicate values.
For this formulation, if there are fewer than 6 rows, then you will get a duplicate.
The UNION operator allows this, carefully toying with the ORDER BY and LIMIT clauses :
(SELECT * FROM table ORDER BY field ASC LIMIT 5)
UNION
(SELECT * FROM table ORDER BY field DESC LIMIT 1)
I have this SQL query here that grabs the 5 latest news posts. I want to make it so it also grabs the total likes and total news comments in the same query. But the query I made seems to be a little slow when working with large amounts of data so I am trying to see if I can find a better solution. Here it is below:
SELECT *,
`id` as `newscode`,
(SELECT COUNT(*) FROM `likes` WHERE `type`="newspost" AND `code`=`newscode`) as `total_likes`,
(SELECT COUNT(*) FROM `news_comments` WHERE `post_id`=`newscode`) as `total_comments`
FROM `news` ORDER BY `id` DESC LIMIT 5
Here is a SQLFiddle as well: http://sqlfiddle.com/#!2/d3ecbf/1
I would recommend adding a total_likes and total_comments fields to the news table which gets incremented/decremented whenever a like and/or comment is added or removed.
Your likes and news_comments tables should be used for historical purposes only.
This strenuous counting should not be performed every time a page is loaded because that is a complete waste of resources.
You could rewrite this using joins, MySQL has known issues with subqueries, especially when dealing with large data sets:
SELECT n.*,
`id` as `newscode`,
COALESCE(l.TotalLikes, 0) AS `total_likes`,
COALESCE(c.TotalComments, 0) AS `total_comments`
FROM `news` n
LEFT JOIN
( SELECT Code, COUNT(*) AS TotalLikes
FROM `likes`
WHERE `type` = "newspost"
GROUP BY Code
) AS l
ON l.`code` = n.`id`
LEFT JOIN
( SELECT post_id, COUNT(*) AS TotalComments
FROM `news_comments`
GROUP BY post_id
) AS c
ON c.`post_id` = n.`id`
ORDER BY n.`id` DESC LIMIT 5;
The reason is that when you use a join as above, MySQL will materialise the results of the subquery when it is first needed, e.g at the start of this query, mySQL will put the results of:
SELECT post_id, COUNT(*) AS TotalComments
FROM `news_comments`
GROUP BY post_id
into an in memory table and hash post_id for faster lookups. Then for each row in news it only has to look up TotalComments from this hashed table, when you use a correlated subquery it will execute the query once for each row in news, which when news is large will result in a large number of executions. If the initial result set is small you may not see a performance benefit and it may be worse.
Examples on SQL Fiddle
Finally, you may want to index the relevant fields in news_comments and likes. For this particular query I think the following indexes will help:
CREATE INDEX IX_Likes_Code_Type ON Likes (Code, Type);
CREATE INDEX IX_newcomments_post_id ON news_comments (post_id);
Although you may need to split the first index into two:
CREATE INDEX IX_Likes_Code ON Likes (Code);
CREATE INDEX IX_Likes_Type ON Likes (Type);
First check for helping indexes on columns id, post_id and type,code.
I assume this is T-SQL, as that is what I am most familiar with.
First I would check indexes. If that looks good, then I'd check statement. Take a look at your query map to see how it's populating your result.
SQL works backward, so it starts with your last AND statement and goes from there. It'll group them all by code, and then type, and finally give you a count.
Right now, you're grabbing everything with certain codes, regardless of date. When you stated that you want the latest, I assume there is a date column somewhere.
In order to speed things up, add another AND to your WHERE and account for the date. Either last 24 hours, last week, whatever.
Well, this is a very old question never gotten real solution. We want 3 random rows from a table with about 30k records. The table is not so big in point of view MySQL, but if it represents products of a store, it's representative. The random selection is useful when one presents 3 random products in a webpage for example. We would like a single SQL string solution that meets these conditions:
In PHP, the recordset by PDO or MySQLi must have exactly 3 rows.
They have to be obtained by a single MySQL query without Stored Procedure used.
The solution must be quick as for example a busy apache2 server, MySQL query is in many situations the bottleneck. So it has to avoid temporary table creation, etc.
The 3 records must be not contiguous, ie, they must not to be at the vicinity one to another.
The table has the following fields:
CREATE TABLE Products (
ID INT(8) NOT NULL AUTO_INCREMENT,
Name VARCHAR(255) default NULL,
HasImages INT default 0,
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The WHERE constraint is Products.HasImages=1 permitting to fetch only records that have images available to show on the webpage. About one-third of records meet the condition of HasImages=1.
Searching for a Perfection, we first let aside the existent Solutions that have drawbacks:
I. This basic solution using ORDER BY RAND(),
is too slow but guarantees 3 really random records at each query:
SELECT ID, Name FROM Products WHERE HasImages=1 ORDER BY RAND() LIMIT 3;
*CPU about 0.10s, scanning 9690 rows because of WHERE clause, Using where; Using temporary; Using filesort, on Debian Squeeze Double-Core Linux box, not so bad but
not so scalable to a bigger table as temporary table and filesort are used, and takes me 8.52s for the first query on the test Windows7::MySQL system. With such a poor performance, to avoid for a webpage isn't-it ?
II. The bright solution of riedsio using JOIN ... RAND(),
from MySQL select 10 random rows from 600K rows fast, adapted here is only valid for a single random record, as the following query results in an almost always contiguous records. In effect it gets only a random set of 3 continuous records in IDs:
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (RAND() * (SELECT MAX(ID) FROM Products)) AS ID)
AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
*CPU about 0.01 - 0.19s, scanning 3200, 9690, 12000 rows or so randomly, but mostly 9690 records, Using where.
III. The best solution seems the following with WHERE ... RAND(),
seen on MySQL select 10 random rows from 600K rows fast proposed by bernardo-siu:
SELECT Products.ID, Products.Name FROM Products
WHERE ((Products.Hasimages=1) AND RAND() < 16 * 3/30000) LIMIT 3;
*CPU about 0.01 - 0.03s, scanning 9690 rows, Using where.
Here 3 is the number of wished rows, 30000 is the RecordCount of the table Products, 16 is the experimental coefficient to enlarge the selection in order to warrant the 3 records selection. I don't know on what basis the factor 16 is an acceptable approximation.
We so get at the majority of cases 3 random records and it's very quick, but it's not warranted: sometimes the query returns only 2 rows, sometimes even no record at all.
The three above methods scan all records of the table meeting WHERE clause, here 9690 rows.
A better SQL String?
Ugly, but quick and random. Can become very ugly very fast, especially with tuning described below, so make sure you really want it this way.
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
UNION ALL
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
UNION ALL
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
First row appears more often than it should
If you have big gaps between IDs in your table, rows right after such gaps will have bigger chance to be fetched by this query. In some cases, they will appear significatnly more often than they should. This can not be solved in general, but there's a fix for a common particular case: when there's a gap between 0 and the first existing ID in a table.
Instead of subquery (SELECT RAND()*<max_id> AS ID) use something like (SELECT <min_id> + RAND()*(<max_id> - <min_id>) AS ID)
Remove duplicates
The query, if used as is, may return duplicate rows. It is possible to avoid that by using UNION instead of UNION ALL. This way duplicates will be merged, but the query no longer guarantees to return exactly 3 rows. You can work around that too, by fetching more rows than you need and limiting the outer result like this:
(SELECT ... LIMIT 1)
UNION (SELECT ... LIMIT 1)
UNION (SELECT ... LIMIT 1)
...
UNION (SELECT ... LIMIT 1)
LIMIT 3
There's still no guarantee that 3 rows will be fetched, though. It just makes it more likely.
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (RAND() * (SELECT MAX(ID) FROM Products)) AS ID) AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
Of course the above is given "near" contiguous records you are feeding it the same ID every time without much regard to the seed of the rand function.
This should give more "randomness"
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (ROUND((RAND() * (max-min))+min)) AS ID) AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
Where max and min are two values you choose, lets say for example sake:
max = select max(id)
min = 225
This statement executes really fast (19 ms on a 30k records table):
$db = new PDO('mysql:host=localhost;dbname=database;charset=utf8', 'username', 'password');
$stmt = $db->query("SELECT p.ID, p.Name, p.HasImages
FROM (SELECT #count := COUNT(*) + 1, #limit := 3 FROM Products WHERE HasImages = 1) vars
STRAIGHT_JOIN (SELECT t.*, #limit := #limit - 1 FROM Products t WHERE t.HasImages = 1 AND (#count := #count -1) AND RAND() < #limit / #count) p");
$products = $stmt->fetchAll(PDO::FETCH_ASSOC);
The Idea is to "inject" a new column with randomized values, and then sort by this column. The generation of and sorting by this injected column is way faster than the "ORDER BY RAND()" command.
There "might" be one caveat: You have to include the WHERE query twice.
What about creating another table containing only items with image ? This table will be much lighter as it will contain only one-third of the items the original table has !
------------------------------------------
|ID | Item ID (on the original table)|
------------------------------------------
|0 | 0 |
------------------------------------------
|1 | 123 |
------------------------------------------
.
.
.
------------------------------------------
|10 000 | 30 000 |
------------------------------------------
You can then generate three random IDs in the PHP part of the code and just fetch'em the from the database.
I've been testing the following bunch of SQLs on a 10M-record, poorly designed database.
SELECT COUNT(ID)
INTO #count
FROM Products
WHERE HasImages = 1;
PREPARE random_records FROM
'(
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
) UNION (
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
) UNION (
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
)';
SET #l1 = ROUND(RAND() * #count);
SET #l2 = ROUND(RAND() * #count);
SET #l3 = ROUND(RAND() * #count);
EXECUTE random_records USING #l1
, #l2
, #l3;
DEALLOCATE PREPARE random_records;
It took almost 7 minutes to get the three results. But I'm sure its performance will be much better in your case. Yet if you are looking for a better performance I suggest the following ones as they took less than 30 seconds for me to get the job done (on the same database).
SELECT COUNT(ID)
INTO #count
FROM Products
WHERE HasImages = 1;
PREPARE random_records FROM
'SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1';
SET #l1 = ROUND(RAND() * #count);
SET #l2 = ROUND(RAND() * #count);
SET #l3 = ROUND(RAND() * #count);
EXECUTE random_records USING #l1;
EXECUTE random_records USING #l2;
EXECUTE random_records USING #l3;
DEALLOCATE PREPARE random_records;
Bear in mind that both these commands require MySQLi driver in PHP if you want to execute them in one go. And their only difference is that the later one requires calling MySQLi's next_result method to retrieve all three results.
My personal belief is that this is the fastest way to do this.
On the off-chance that you're willing to accept an 'outside the box' type of answer, I'm going to repeat what I said in some of the comments.
The best way to approach your problem is to cache your data in advance (be that in an external JSON or XML file, or in a separate database table, possibly even an in-memory table).
This way you can schedule your performance-hit on the products table to times when you know the server will be quiet, and reduce your worry about creating a performance hit at "random" times when the visitor arrives to your site.
I'm not going to suggest an explicit solution, because there are far too many possibilities on how to build a solution. However, the answer suggested by #ahmed is not silly. If you don't want to create a join in your query, then simply load more of the data that you require into the new table instead.
I have to modify an existing PHP script so that it displays the total number of matches in the database before it shows the results for the current page. The query is:
SELECT products.features, products.make_id, products.model, products.price,
products.id, brands.name AS brandname, brands.id AS brandid
FROM products
LEFT JOIN brands ON products.make_id=brands.id
WHERE products.active=1 AND brands.active=1 AND (products.category=22)
ORDER BY id DESC LIMIT 0,12
How do I get the number of results? mysql_num_rows() obviously returns 12 but I don't know how to use count in this query, everything I try returns some error.
I could just execute the query once without the limit (clause?) but that seems a bit inefficient.
Thanks.
Use the MySQL function FOUND_ROWS():
A SELECT statement may include a LIMIT
clause to restrict the number of rows
the server returns to the client. In
some cases, it is desirable to know
how many rows the statement would have
returned without the LIMIT, but
without running the statement again.
To obtain this row count, include a
SQL_CALC_FOUND_ROWS option in the
SELECT statement, and then invoke
FOUND_ROWS() afterward:
so run the two following queries in PHP, the first statement gives the first 12 rows, the second one gives the total of rows:
$sql1 = "SELECT SQL_CALC_FOUND_ROWS
products.features
, products.make_id
, products.model
, products.price
, products.id
, brands.name AS brandname
, brands.id AS brandid
FROM products
LEFT JOIN brands ON products.make_id=brands.id
WHERE (products.active=1) AND (brands.active=1) AND (products.category=22)
ORDER BY id DESC LIMIT 0,12"
$sql2 = "SELECT FOUND_ROWS()"
This would return you the total number of rows the current select matches. The result will be one row, with one field and it will hold the total
SELECT COUNT(*) AS total
FROM products
LEFT JOIN brands ON products.make_id=brands.id
WHERE products.active=1 AND brands.active=1 AND products.category=22
You will have to run this as a separate query.
<?php
$q=mysql_query('above query');
$result=mysql_fetch_array($q);
echo $result['total']; // will print out the desired number of rows
You were probably trying to get the total and the limit in the same select and while that might be possible with a union, but since you were worrying about inefficiency before, this is possible worse.
Since in your OP you hint that this is for pagination. Let me tell you, LIMIT m,n may not be as fast as it sounds. Learn how to improve it and read more about Efficient Pagination Using MySQL
As I am trying to count the number of records in a table, even when the SQL statement has a LIMIT into it, overall it works, however something weird happens, the code:
$sql = "SELECT COUNT(*) AS count FROM posts
ORDER BY post_date DESC
LIMIT 5";
// ... mysql_query, etc
while($row = mysql_fetch_array($result))
{
// ... HTML elements, etc
echo $row['post_title'];
// ... HTML elements, etc
echo $row['count']; // this displays the number of posts (which shows "12").
}
Although, when displaying through the while loop, it displays this:
Notice: Undefined index: post_title in /Applications/MAMP/htdocs/blog/index.php on line 55
If I remove the COUNT(*) AS count, everything will display perfectly... how come it's doing this?
Don't use COUNT(*) to count the number of rows (for a lot of reasons). Write out your full query, and add SQL_CALC_FOUND_ROWS right after SELECT:
SELECT SQL_CALC_FOUND_ROWS id, title FROM foo LIMIT 5;
Then, after that query executed (right after), run:
SELECT FOUND_ROWS();
That will return the number of rows the original SELECT would have returned if you didn't have the LIMIT on the end (accounting for all joins and where clauses).
It's not portable, but it's very efficient (and IMHO the right way of handling this type of problem).
LIMIT does not do anything here because you're selecting a single scalar. The error is shown because you are not selecting the post title, so it is not in the $row hash.
This is happening because COUNT() is an aggregate function. You will have to do two separate queries in order to get both the count of rows in the table and separate records.
I had the same problem and found this article: http://www.mysqldiary.com/limited-select-count/
best way (especially for big tables) is using it like this:
SELECT COUNT(*) AS count FROM (SELECT 1 FROM posts
ORDER BY post_date DESC
LIMIT 5) t
now it returns a number between 0 and 5
You're asking the SELECT statement to return only COUNT(*) AS count. If you want more columns returned, you have to specify them in the SELECT statement.
You're mixing up an aggregate function COUNT() with a standard selection. You can't get an accurate post title without a GROUP BY clause in your aggregate query. The easiest thing you should do is to do this two queries - one for your count, the other for your post information. Also, there's no sense in using LIMIT on an aggregate function with no other columns in your (omitted) GROUP BY - the current query will always return one row.
What everyone is trying to say is that you should opt to use 2 separate queries instead of 1 single one.
1) get row count of table
$sql = "SELECT COUNT(*) AS count FROM posts;
2) get last 5 rows of table
$sql = "SELECT * FROM posts
ORDER BY post_date DESC
LIMIT 5";
By including COUNT(*) AS count you've removed the actual columns that you want to get the information from. Try using
SELECT *, COUNT(*) AS count FROM posts ORDER BY post_date DESC LIMIT 5
That query will accomplish both things you're trying to do in one query, but really, these should be broken out into 2 separate queries. What is the overall purpose for having a COUNT in the query? It should be done a different way.
This works for me:
$sql = "
SELECT COUNT( po.id ) AS count, p.post_title
FROM posts p
JOIN posts po
GROUP BY p.post_title
ORDER BY p.post_date DESC
LIMIT 5
";
Important is to JOIN the same table, but name the table differently. Do not forget GROUP BY, must be used.