I am about to use the Zend_Paginator class in my project. I found examples of the class on the internet. One of them is:
$sql = 'SELECT * FROM table_name ';
$result = $db->fetchAll($sql);
$page=$this->_getParam('page',1);
$paginator = Zend_Paginator::factory($result);
$paginator->setItemCountPerPage(10);
$paginator->setCurrentPageNumber($page);
$this->view->paginator=$paginator;
On the first line, it actually selects all the rows from table_name. What if I have a table with 50000 rows? That would be very inefficient.
Is there any other way to use Zend Paginator?
About this problem, you might be interested in this section of the manual: 39.2.2. The DbSelect and DbTableSelect adapter, which states (quoting, emphasis mine):
... the database adapters require a
more detailed explanation. Contrary to
popular believe, these adapters do not
fetch all records from the database in
order to count them. Instead, the
adapters manipulates the original
query to produce the corresponding
COUNT query. Paginator then executes
that COUNT query to get the number of
rows. This does require an extra
round-trip to the database, but this
is many times faster than fetching an
entire result set and using count().
Especially with large collections of
data.
(There is more to read on that page -- and there is an example that should give you more information)
The idea is that you will not fetch all the data yourself anymore; instead, you tell Zend_Paginator which Adapter it must use to access your data.
This Adapter is specific to "data that is fetched via an SQL query", and knows how to paginate it directly on the database side, which means fetching only what is required, and not all the data like you initially did.
I recommend passing a Zend_Db_Select object, as in Zend_Paginator::factory($select);, rather than passing a result rowset. Otherwise, you're selecting the entire result set and then doing the pagination. In your current solution, if you had a million rows, you'd select all of them before getting the chunk of rows defined by the current page.
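To make the database-side pagination described above concrete, here is a minimal sketch using plain PDO with an in-memory SQLite database. It is not Zend_Paginator's actual implementation; the fetchPage() helper, the label column, and the sample data are assumptions made for illustration. It only shows the general pattern of one COUNT query plus one LIMIT/OFFSET query per page.

```php
<?php
// Sketch only: plain PDO + in-memory SQLite, not Zend_Paginator itself.
// The "label" column and the sample rows are made up for illustration.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE table_name (id INTEGER PRIMARY KEY, label TEXT)');

// Insert 25 sample rows.
$insert = $db->prepare('INSERT INTO table_name (label) VALUES (?)');
for ($i = 1; $i <= 25; $i++) {
    $insert->execute(["row $i"]);
}

// Hypothetical helper: returns one page of rows plus the total row count.
function fetchPage(PDO $db, int $page, int $perPage): array
{
    // First query: the database counts the rows; no rows are transferred.
    $total = (int) $db->query('SELECT COUNT(*) FROM table_name')->fetchColumn();

    // Second query: fetch only the rows that belong to the requested page.
    $stmt = $db->prepare(
        'SELECT id, label FROM table_name ORDER BY id LIMIT :limit OFFSET :offset'
    );
    $stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
    $stmt->bindValue(':offset', ($page - 1) * $perPage, PDO::PARAM_INT);
    $stmt->execute();

    return ['total' => $total, 'rows' => $stmt->fetchAll(PDO::FETCH_ASSOC)];
}

// Page 3 with 10 items per page covers ids 21 to 25.
$result = fetchPage($db, 3, 10);
```

However many rows the table holds, only one page of rows ever reaches PHP; the total used for the page links comes from the COUNT query.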
Related
For example I need to get review count, one way of doing it is like this:
public function getActiveReviews()
{
return $this->getReviews()->filter(function(Review $review) {
return $review->isActive() && !$review->isDeleted();
})->count();
}
Another way is to use Query Builder like this:
$count = $this->createQueryBuilder('r')
->select('count(r)')
->where('r.active = true')
->andWhere('r.deleted = false')
->getQuery()
->getSingleScalarResult();
Which way will give me better performance and why?
Of course the count query will be faster, because it results in a single SQL query that returns a single value.
Iterating over entities requires:
Running an SQL query to fetch the data rows
Actually fetching the data
Instantiating entity objects and populating them with the fetched data
Depending on the amount of affected data, the difference may be very big.
The only case when running count over entities may be fast enough is a case when you already have all entities fetched and just need to count them.
It depends on the count() implementation in Symfony, but the query will probably give you better performance. Usually an RDBMS counts its rows faster internally, and it requires far fewer resources.
In the first case you request a whole rowset, which can be huge, then you iterate through it, apply your filter function to every row, and then you just look at the size of the filtered rowset and drop everything. (But, of course, this might be optimized by your framework somehow.)
In the second case you just ask the database how many rows satisfy the criteria. The DB returns a number, and that's all.
As other people said, the only case when the first choice might be quicker is when you already have the rowset cached (no need to connect to the DB) and your DB connection is very slow at the same time.
I have seen databases that were slow on some COUNT requests on big tables (Oracle), but they were still faster than PHP code on the same rowset. DBs are optimized for filtering and counting data, and COUNT requests are usually very fast.
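The difference between the two approaches can be sketched without Doctrine. The example below uses plain PDO with an in-memory SQLite database; the review table layout and the sample rows are assumptions for illustration. Both variants produce the same number, but the first one transfers every row to PHP before counting.

```php
<?php
// Sketch only: plain PDO instead of Doctrine; the table layout is assumed.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE review (id INTEGER PRIMARY KEY, active INTEGER, deleted INTEGER)');
$db->exec('INSERT INTO review (active, deleted) VALUES (1, 0), (1, 1), (0, 0), (1, 0)');

// Approach 1: fetch every row, then filter and count in PHP.
$rows = $db->query('SELECT active, deleted FROM review')->fetchAll(PDO::FETCH_ASSOC);
$countInPhp = count(array_filter($rows, function (array $row) {
    return $row['active'] == 1 && $row['deleted'] == 0;
}));

// Approach 2: let the database filter and count; a single value comes back.
$countInDb = (int) $db
    ->query('SELECT COUNT(*) FROM review WHERE active = 1 AND deleted = 0')
    ->fetchColumn();
```

With four rows the cost is negligible either way; the gap grows with the table, because approach 1 scales with the number of rows transferred and approach 2 always returns one value.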
I'm trying to understand the Zend Paginator and would mostly like to make sure it doesn't break my scripts.
For example, I have the following snippet which successfully loads some contacts one at a time:
$offset = 1;
//returns a paginator instance using a dbSelect;
$contacts = $ContactsMapper->fetchAll($fetchObj);
$contacts->setCurrentPageNumber($offset);
$contacts->setItemCountPerPage(1);
$allContacts = count($contacts);
while($allContacts >= $offset) {
foreach($contacts as $contact) {
//do something
}
$offset++;
$contacts->setCurrentPageNumber($offset);
$contacts->setItemCountPerPage(1);
}
However I can have hundreds of thousands of contacts in the database and matched by the SELECT I send to the paginator. Can I be sure it only loads one at a time in this example? And how does it do it, does it run a customized query with limit and offset?
From the official documentation : Zend Paginator Usage
Note
Instead of selecting every matching row of a given query, the DbSelect
adapter retrieves only the smallest amount of data necessary for
displaying the current page. Because of this, a second query is
dynamically generated to determine the total number of matching rows.
If you're using Zend\Paginator\Adapter\DbSelect, it will apply limit and offset to the query you pass it, and it will fetch only the wanted records. This is done in the getItems() function of DbSelect; you can see this in the source code.
You could also read this from the documentation :
This adapter does not fetch all records from the database in order
to count them. Instead, the adapter manipulates the original query to
produce a corresponding COUNT query. Paginator then executes that
COUNT query to get the number of rows. This does require an extra round-trip to the database, but this is many times faster than
fetching an entire result set and using count(), especially with
large collections of data.
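For the looping use case in the question, the same limit/offset idea can be wrapped in a generator, so that only one page of rows is held in memory at a time. This is a sketch with plain PDO and in-memory SQLite, not the DbSelect adapter's code; the contacts table, the name column, and the iterateInPages() helper are assumptions for illustration.

```php
<?php
// Sketch only: plain PDO + in-memory SQLite; names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT)');
$insert = $db->prepare('INSERT INTO contacts (name) VALUES (?)');
for ($i = 1; $i <= 7; $i++) {
    $insert->execute(["contact $i"]);
}

// Hypothetical helper: yields every contact, loading $perPage rows per query.
function iterateInPages(PDO $db, int $perPage): Generator
{
    $stmt = $db->prepare(
        'SELECT id, name FROM contacts ORDER BY id LIMIT :limit OFFSET :offset'
    );
    for ($offset = 0; ; $offset += $perPage) {
        $stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
        $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
        $stmt->execute();
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        if ($rows === []) {
            return; // No more pages.
        }
        foreach ($rows as $row) {
            yield $row;
        }
    }
}

$names = [];
foreach (iterateInPages($db, 3) as $contact) {
    $names[] = $contact['name']; // "do something" with one contact
}
```

A page size larger than 1 is usually a better trade-off than one query per contact, since every page costs a round trip to the database.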
I have a database design that, in simplified form, looks like this:
Table building:
id
attribute1
attribute2
Data in there is like:
(1, 1, 1)
(2, 1, 2)
(3, 5, 4)
And the tables, attribute1_values and attribute2_values, structured as:
id
value
Which contains information like:
(1, "Textual description of option 1")
(2, "Textual description of option 2")
...
(6, "Textual description of option 6")
I am unsure whether this is the best setup or not, but it is done this way per the requirements of my project manager. There is definitely some sense in it, as you can now modify the text easily without messing up the IDs.
However now I have come to a page where I need to list the attributes, so how do I go about there? I see two major options:
1) Make one big query which gathers all values from building and at the same time picks the correct textual representation from the attribute{x}_values table.
2) Make a small query that gathers all values from the building table. Then after that get the textual representation of each attribute one at a time.
What is the best option to pick? Is option 1 even faster than option 2 at all? If so, is it worth the extra trouble concerning maintenance?
Another suggestion would be to create a view on the server with only the data you need and query from that. That would keep the work on the server end, and you can pull just what you need each time.
If the attribute tables have a small number of rows, then I suggest fetching them first, all of them, and storing them in an array that uses the id as the index key.
Then you can proceed with the building data; you just have to look up each attribute value in the respective array.
I would recommend something in between. Parse the result from the first table in PHP, and figure out which attributes you need to select from each attribute[x]_values table.
You can then select attributes in bulk using one query per table, rather than one query per attribute, or one query per building.
Here is a PHP solution:
// $connection is assumed to be an open mysqli connection.
$query = "SELECT * FROM building";
$result = mysqli_query($connection, $query);
$query = "SELECT * FROM attribute1_values";
$result2 = mysqli_query($connection, $query);
$query = "SELECT * FROM attribute2_values";
$result3 = mysqli_query($connection, $query);
$n = mysqli_num_rows($result);
for ($i = 1; $i <= $n; $i++) {
    $row = mysqli_fetch_array($result);
    // Jump to the row at position (id - 1) in the attribute1 result set.
    mysqli_data_seek($result2, $row['attribute1'] - 1);
    $row2 = mysqli_fetch_array($result2);
    $value1 = $row2['value']; // Use this as the value for attribute one of this object.
    // Same lookup for attribute2.
    mysqli_data_seek($result3, $row['attribute2'] - 1);
    $row3 = mysqli_fetch_array($result3);
    $value2 = $row3['value']; // Use this as the value for attribute two of this object.
}
Keep in mind that this solution requires that the tables attribute1_values and attribute2_values start at 1 and increase by 1 every single row.
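The "one query per table" idea can also be sketched so that it does not depend on sequential ids. The example below uses PDO with in-memory SQLite and loads only the attribute ids that the buildings actually reference; the loadValues() helper and the attribute1_text / attribute2_text keys are assumptions for illustration, not part of the original schema.

```php
<?php
// Sketch only: PDO + in-memory SQLite, using the sample data from the question.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE building (id INTEGER PRIMARY KEY, attribute1 INTEGER, attribute2 INTEGER)');
$db->exec('CREATE TABLE attribute1_values (id INTEGER PRIMARY KEY, value TEXT)');
$db->exec('CREATE TABLE attribute2_values (id INTEGER PRIMARY KEY, value TEXT)');
$db->exec('INSERT INTO building VALUES (1, 1, 1), (2, 1, 2), (3, 5, 4)');
for ($i = 1; $i <= 6; $i++) {
    $db->exec("INSERT INTO attribute1_values VALUES ($i, 'Textual description of option $i')");
    $db->exec("INSERT INTO attribute2_values VALUES ($i, 'Textual description of option $i')");
}

// Hypothetical helper: loads the given ids from one lookup table
// and returns them as an id => value map.
function loadValues(PDO $db, string $table, array $ids): array
{
    if ($ids === []) {
        return [];
    }
    $placeholders = implode(',', array_fill(0, count($ids), '?'));
    // $table is hard-coded by the caller below, never taken from user input.
    $stmt = $db->prepare("SELECT id, value FROM $table WHERE id IN ($placeholders)");
    $stmt->execute(array_values($ids));
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

// Query 1: all buildings. Queries 2 and 3: one bulk lookup per attribute table.
$buildings = $db->query('SELECT id, attribute1, attribute2 FROM building ORDER BY id')
    ->fetchAll(PDO::FETCH_ASSOC);
$att1 = loadValues($db, 'attribute1_values', array_unique(array_column($buildings, 'attribute1')));
$att2 = loadValues($db, 'attribute2_values', array_unique(array_column($buildings, 'attribute2')));

// Attach the textual descriptions using the id-indexed arrays.
foreach ($buildings as &$building) {
    $building['attribute1_text'] = $att1[$building['attribute1']];
    $building['attribute2_text'] = $att2[$building['attribute2']];
}
unset($building);
```

This keeps the number of queries fixed at three regardless of how many buildings are listed, and because lookups go through the id key, gaps in the id sequence do not matter.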
Oracle / Postgres / MySql DBA here:
Running a query many times has quite a bit of overhead. There are multiple round trips to the DB, and if it's on a remote server, this can add up. MySQL will likely have to parse the same query multiple times, which will be terribly inefficient if there are tons of rows. One advantage of your PHP method (multiple queries) is that it uses less memory, as it releases results once they're no longer needed. That holds only if you run the queries as a nested loop; if you query all the results up front, you'll have a lot of memory overhead, depending on the table sizes.
The optimal approach would be to run it as one query and fetch the results one at a time, displaying each one as needed and discarding it. This can wreak havoc with MVC frameworks unless you're comfortable either running model code in your view or rendering small view fragments.
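A minimal sketch of that single-query approach, using PDO with in-memory SQLite: one JOIN resolves both attributes, and rows are fetched and handled one at a time. The column aliases and the shortened sample descriptions are assumptions for illustration.

```php
<?php
// Sketch only: PDO + in-memory SQLite; sample descriptions are shortened.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE building (id INTEGER PRIMARY KEY, attribute1 INTEGER, attribute2 INTEGER)');
$db->exec('CREATE TABLE attribute1_values (id INTEGER PRIMARY KEY, value TEXT)');
$db->exec('CREATE TABLE attribute2_values (id INTEGER PRIMARY KEY, value TEXT)');
$db->exec('INSERT INTO building VALUES (1, 1, 1), (2, 1, 2), (3, 5, 4)');
for ($i = 1; $i <= 6; $i++) {
    $db->exec("INSERT INTO attribute1_values VALUES ($i, 'A1 option $i')");
    $db->exec("INSERT INTO attribute2_values VALUES ($i, 'A2 option $i')");
}

// One query: the database resolves both attribute ids to their text.
$stmt = $db->query(
    'SELECT b.id, a1.value AS attribute1_text, a2.value AS attribute2_text
       FROM building b
       JOIN attribute1_values a1 ON a1.id = b.attribute1
       JOIN attribute2_values a2 ON a2.id = b.attribute2
      ORDER BY b.id'
);

// Fetch one row at a time; each row can be rendered and then discarded.
$lines = [];
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $lines[] = "{$row['id']}: {$row['attribute1_text']} / {$row['attribute2_text']}";
}
```

Here the lines are collected in an array only so the result can be inspected; in a real page each row would be echoed inside the loop instead.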
Your question is very generic, and I think that to get an answer you should give more hints about what this page will look like and how big the dataset is.
Will you list all the buildings with their attributes, or just one at a time?
Your data structure looks very simple, and anything more powerful than a Raspberry Pi can handle it well.
If you need one record at a time you don't need any special technique, just JOIN the tables.
If you need to list all buildings and you want to save DB time, you have to measure your data.
If you have more attributes than buildings you have to choose one way; if you have 8 attributes and 2000 buildings, you can cache the attributes in an array with one SELECT per table and then just print them using the array. I don't think you will see any speed drop or improvement with such simple tables on a modern computer.
$att1[1]='description1';
$att1[2]='description2';
....
Never do one-at-a-time queries; try to combine them into a single one.
MySQL will cache your query and it will run much faster. PHP loops are faster than doing many requests to the database.
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again.
http://dev.mysql.com/doc/refman/5.1/en/query-cache.html
I am relatively new to the Zend Framework, having worked with it for only about a month. I have a query that's running way too slow: the page takes a few minutes to load. In my query, is it grabbing all the records, or is it getting only the amount I need for the page? Can I apply a limit when fetching the contents from the DB?
My code is as follows.
public function init() {
$db = Zend_Registry::get('db');
$sql = 'SELECT * FROM employee ';
$result = $db->fetchAll($sql);
$page=$this->_getParam('page',1);
$paginator = Zend_Paginator::factory($result);
$paginator->setItemCountPerPage(5);
$paginator->setCurrentPageNumber($page);
$this->view->paginator=$paginator;
}
I will be grateful for any help you can give me.
Because you are supplying a Db query to your Paginator factory, Paginator will in most cases use the DbSelect Paginator adapter. This adapter will make the smallest query possible to satisfy the demands on the Paginator.
To make sure the Paginator uses the correct adapter you can pass a string as the second arg.
$paginator = Zend_Paginator::factory($result, 'DbSelect');
Excerpt from Reference:
Note: Instead of selecting every matching row of a given query, the
DbSelect and DbTableSelect adapters retrieve only the smallest amount
of data necessary for displaying the current page.
Because of this, a second query is dynamically generated to determine
the total number of matching rows. However, it is possible to directly
supply a count or count query yourself. See the setRowCount() method
in the DbSelect adapter for more information.
If you use database profiling, you can see where the limit and offset queries are generated.
Note:
In most instances Zend_Db will convert SQL select queries into Zend_Db_Select objects. If you find you are in one of the rare situations where this is not the case you will either need to use the array adapter for Paginator or refactor your sql as a Zend_Db_Select object. This should be very rare.
Also it is rather unusual to see a paginator instance in the init() method. The init() will be run on every request to that controller.
I have used MySQL a lot, but I always wondered exactly how does it work - when I get a positive result, where is the data stored exactly? For example, I write like this:
$sql = "SELECT * FROM TABLE";
$result = mysql_query($sql);
while ($row = mysql_fetch_object($result)) {
echo $row->column_name;
}
When a result is returned, I assume it holds all the result data. Or does it come back in fragments, returning data only when it is asked for, as with $row->column_name?
Or does it really return every single row of data even if you only wanted one column in $result?
Also, if I paginate using LIMIT, does it hold THAT original (old) result even if the database is updated?
The details are implementation dependent but generally speaking, results are buffered. Executing a query against a database will return some result set. If it's sufficiently small all the results may be returned with the initial call or some might be and more results are returned as you iterate over the result object.
Think of the sequence this way:
1) You open a connection to the database;
2) There is possibly a second call to select a database or it might be done as part of (1);
3) That authentication and connection step is (at least) one round trip to the server (ignoring persistent connections);
4) You execute a query on the client;
5) That query is sent to the server;
6) The server has to determine how to execute the query;
7) If the server has previously executed the query the execution plan may still be in the query cache. If not a new plan must be created;
8) The server executes the query as given and returns a result to the client;
9) That result will contain some buffer of rows that is implementation dependent. It might be 100 rows or more or less. All columns are returned for each row;
10) As you fetch more rows eventually the client will ask the server for more rows. This may be when the client runs out or it may be done preemptively. Again this is implementation dependent.
The idea of all this is to minimize roundtrips to the server without sending back too much unnecessary data, which is why if you ask for a million rows you won't get them all back at once.
LIMIT clauses--or any clause in fact--will modify the result set.
Lastly, (7) is important because SELECT * FROM table WHERE a = 'foo' and SELECT * FROM table WHERE a = 'bar' are two different queries as far as the database optimizer is concerned so an execution plan must be determined for each separately. But a parameterized query (SELECT * FROM table WHERE a = :param) with different parameters is one query and only needs to be planned once (at least until it falls out of the query cache).
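The parameterized form can be sketched with a PDO prepared statement: the statement is prepared once and executed with different parameter values. The example uses in-memory SQLite, and the items table and its sample rows are assumptions for illustration. Whether the server really plans the query only once depends on the database and on driver settings.

```php
<?php
// Sketch only: PDO + in-memory SQLite; table name and data are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, a TEXT)');
$db->exec("INSERT INTO items (a) VALUES ('foo'), ('bar'), ('foo')");

// Prepared once: the SQL text is identical for every execution.
$stmt = $db->prepare('SELECT COUNT(*) FROM items WHERE a = :param');

// Executed twice with different parameter values.
$counts = [];
foreach (['foo', 'bar'] as $value) {
    $stmt->execute([':param' => $value]);
    $counts[$value] = (int) $stmt->fetchColumn();
}
```

Binding values this way also keeps them out of the SQL string, so they need no manual escaping.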
I think you are confusing the two types of variables you're dealing with, and neither answer really clarifies that so far.
$result is a MySQL result object. It does not "contain any rows." When you say $result = mysql_query($sql), MySQL executes the query, and knows what rows will match, but the data has not been transferred over to the PHP side. $result can be thought of as a pointer to a query that you asked MySQL to execute.
When you say $row = mysql_fetch_object($result), that's when PHP's MySQL interface retrieves a row for you. Only that row is put into $row (as a plain old PHP object, but you can use a different fetch function to ask for an associative array, or specific column(s) from each row.)
Rows may be buffered with the expectation that you will be retrieving all of the rows in a tight loop (which is usually the case), but in general, rows are retrieved when you ask for them with one of the mysql_fetch_* functions.
If you only want one column from the database, then you should SELECT that_column FROM .... Using a LIMIT clause is also a good idea whenever possible, because MySQL can usually perform significant optimizations if it knows that you only want a certain group of rows.
The first question can be answered by reading up on resources
Since you are SELECTing "*", every column is returned for each mysql_fetch_object call. Just look at print_r($row) to see.
In simple words, the resource returned is like an ID that the MySQL library associates with other data. I think of it like the identification card in your wallet: it's just a number and some information, but it is associated with a lot more information if you give it to the government, your cell-phone company, etc.