Limiting mysql calls for the same information in cakephp

I'm using cakephp but this could potentially apply to any framework / php-based environment.
I have a blogging platform where people can like, share, comment, etc. Each like, share and comment has an associated user, which means the same user is requested from the database many times for different things, running this same query:
SELECT `User`.`id`, `User`.`username`, `User`.`fullname`, `User`.`avatar` FROM `db`.`users` AS `User` WHERE `User`.`id` = 127
Is there a way I can stop this from happening, apart from caching? Or does it not really matter that MySQL runs the same query 5 or more times for the same information?
Thanks

This is the Cake way: run many simple queries and cache them.
Check all your $this->YourModelName->find() calls. You are probably not using the recursive or contain options, so when you fetch comment data, Cake fetches the related models too (the hasMany/belongsTo relations in your models). You can:
use $this->loadModel('YourModelName'); to load a model on the fly
use $this->YourModelName->Behaviors->attach('Containable'); and the contain option to select exactly which related models and fields are fetched
set 'recursive' => -1 to get data from one model/table only
write your own query with the joins option (a single query instead of the several that contain or recursive would run)
store some data, such as the current user's data, in the session with $this->Session->write()
if you use the Auth component, read the logged-in user's data from it, e.g. $this->Auth->user('id')
you can always set $useTable = false on all models and link models on the fly, but that is not efficient or economical (especially when you have to maintain the project later)
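As a rough illustration of the recursive and joins options mentioned above, here is a minimal sketch that only builds the options array for a CakePHP 2.x find() call. The Comment model, the post_id and user_id columns, the Comment.body field and the helper name commentFindOptions are assumptions made for the example, not something taken from the question.

```php
<?php
// Hypothetical helper: builds find() options for a Comment model that
// belongsTo User. Model, table and column names are assumptions.
function commentFindOptions($postId)
{
    return array(
        'conditions' => array('Comment.post_id' => (int) $postId),
        // Query the comments table only; no automatic association queries.
        'recursive'  => -1,
        // Fetch the author's columns in the same query via an explicit join.
        'joins'      => array(
            array(
                'table'      => 'users',
                'alias'      => 'User',
                'type'       => 'INNER',
                'conditions' => array('User.id = Comment.user_id'),
            ),
        ),
        'fields'     => array(
            'Comment.id', 'Comment.body',
            'User.id', 'User.username', 'User.fullname', 'User.avatar',
        ),
    );
}

// In a CakePHP 2.x controller this might be used as:
// $comments = $this->Comment->find('all', commentFindOptions($postId));
```

With options like these, each comment row should arrive together with its author's columns, so no separate per-user SELECT is needed for the comment list.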

You can set recursive to -1 on the model; reads and finds will then fetch data only from that model unless you override it on the fly. However, I prefer the Containable behavior for reducing queries. You can read about it here:
http://book.cakephp.org/2.0/en/core-libraries/behaviors/containable.html

Related

Yii application log littered with COUNT(*)

When running SELECT queries it seems as if Yii is often performing each one twice. The first is a COUNT() and the second is the actual query.
What is causing this? It seems terribly inefficient.
In a related note, why does Yii perform a SHOW COLUMNS FROM and SHOW CREATE TABLE so often? Doesn't setting up a relation within the Model tell Yii enough about the schema?

I assume you are using active records a lot in conjunction with listing widgets such as CGridView and CListView.
"What is causing this? It seems terribly inefficient."
Well, in order for the pagination to work in CListView and CGridView, the assigned CActiveDataProvider (or actually any data provider) needs to fetch the total item count. This won't work with the result set which usually has a LIMIT clause applied. Hence, an additional COUNT() is performed to retrieve said number.
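To make the need for the total count concrete, here is a small standalone sketch (not Yii code; the function name paginate is hypothetical) of the arithmetic a pager has to do. The page count cannot be derived from a result set that was already cut down by LIMIT, which is why a separate COUNT query is issued.

```php
<?php
// Hypothetical helper: derives LIMIT/OFFSET and the number of pages
// from a total row count, as a pagination widget must do.
function paginate($totalCount, $pageSize, $page)
{
    $pageCount = (int) ceil($totalCount / $pageSize);
    // Clamp the requested page into the valid range (at least page 1).
    $page = max(1, min($page, max(1, $pageCount)));

    return array(
        'pageCount' => $pageCount,
        'limit'     => $pageSize,
        'offset'    => ($page - 1) * $pageSize,
    );
}

// 95 rows at 10 per page: 10 pages, and page 3 starts at offset 20.
$p = paginate(95, 10, 3);
```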
"In a related note, why does Yii perform a SHOW COLUMNS FROM and SHOW CREATE TABLE so often? Doesn't setting up a relation within the Model tell Yii enough about the schema?"
No. Yii does far more than managing related models. Part of the AR abstraction layer is also to determine which fields are available in a table and hence can be accessed on a model representing a table row. However, you don't have to live with this as schemata can be cached conveniently. To do so, follow these steps:
1. Configure a caching component such as CApcCache in your protected/config/main.php in the components stanza.
2. Change the configuration of your db component so it contains the following lines:
'schemaCacheID'=>'cache', // This is the ID of the cache component you
// configured in step 1. It's also the default value.
'schemaCachingDuration'=>3600, // Cache table schemata for an hour.
// Set this higher if you like.
A word of advice: don't do this in your development environment. If your database design changes, AR models might not reflect this due to stale caches.
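Put together, the two steps might look roughly like the sketch below. It assumes Yii 1.1, where CDbConnection documents the properties as schemaCacheID and schemaCachingDuration; the connection string, username and password are placeholders, and in the real protected/config/main.php the array is returned rather than assigned to a variable.

```php
<?php
// Sketch of the relevant part of protected/config/main.php (Yii 1.1).
// In the real file this array is returned: return array(...);
$config = array(
    'components' => array(
        // Step 1: a cache component registered under the ID "cache".
        'cache' => array(
            'class' => 'CApcCache',
        ),
        // Step 2: tell the db component to cache table schemata.
        'db' => array(
            // Placeholder connection settings (assumptions).
            'connectionString'      => 'mysql:host=localhost;dbname=mydb',
            'username'              => 'dbuser',
            'password'              => 'secret',
            // ID of the cache component configured above.
            'schemaCacheID'         => 'cache',
            // Keep table metadata cached for one hour (in seconds).
            'schemaCachingDuration' => 3600,
        ),
    ),
);
```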

Create view or use innerjoins?

I have a normalized database, with foreign keys/primary keys giving one-to-many relationships.
I plan to access this database with PHP for the basic frontend/backend display. Now, my question comes from these two example queries:
CREATE VIEW `view` AS
SELECT
functiondetails.Detail,
functionnames.ID,
functionnames.FunctionName,
functionnames.Catogory
FROM functiondetails
INNER JOIN functionnames ON functiondetails.AsscID = functionnames.ID
or
SELECT
functiondetails.Detail,
functionnames.ID,
functionnames.FunctionName,
functionnames.Catogory
FROM functiondetails
INNER JOIN functionnames ON functiondetails.AsscID = functionnames.ID
There is no error in the query, as I've run both without fail, but my overall question is this:
If I plan to constantly reference a lot of information from my database, wouldn't it be easier to create a view, which will then always reflect newly added information? Or would it be better practice to put the second query in my PHP code? Example:
$Query = $MySQLi->prepare("
SELECT
functiondetails.Detail,
functionnames.ID,
functionnames.FunctionName,
functionnames.Catogory
FROM functiondetails
INNER JOIN functionnames ON functiondetails.AsscID = functionnames.ID
");
$Query->execute();
$Results = $Query->get_result();
$Array = $Results->fetch_array(MYSQLI_ASSOC);
Or to select from my view?
$Query = $MySQLi->prepare("SELECT * FROM `view`");
$Query->execute();
$Results = $Query->get_result();
$Array = $Results->fetch_array(MYSQLI_ASSOC);
So which one would be a better method to use for querying my database?

Views are an abstraction layer and the usual reason for creating an abstraction layer is to give you a tool to make your life easier.
Some of the big advantages to using views include:
Security
You can control who has access to a view without granting them access to the underlying tables.
Clarification
Column headers often aren't as descriptive as they could be. Views allow you to add clarity to the data being returned.
Performance
Performance-wise, views do not hurt you. However, you will not see a performance gain from using views either, as MySQL does not support materialized views.
Ease in Coding
Views can be used to reuse complex queries with less room for user error.
Ease of Management
It makes your life easier whenever your table schema changes.
For example, say you have a table that contains homes you have for sale, homes_for_sale, but later on you decide you want that table to handle all homes you've ever had for sale/have for sale currently, all_homes. Obviously, the schema of the new table would be much different than the first.
If you have a ton of queries pulling from homes_for_sale, now you have to go through all your code and update all the queries. This opens you up to user error and a management nightmare.
The better way to address the change is to replace the table with a view of the same name. The view would return the exact same schema as the original table, even though the actual schema has changed. Then you can go through your code at your own pace, if needed at all, and update your query calls.

You may be assuming that MySQL stores the results of a view somewhere, and updates that result as data in the underlying tables change. MySQL does not do this. Querying a view is exactly like running the query.
It can even perform worse than running the bare SQL query, because MySQL may accumulate the results of the base query in a temporary table so that you can use further SQL clauses in your query against the view. I say "may" because it varies by view algorithm.
See http://dev.mysql.com/doc/refman/5.6/en/view-algorithms.html for a description of how MySQL uses either the "merge" algorithm or the "temptable" algorithm for executing a view.
If you want materialized views, there's a tool called FlexViews that maintains materialized views:
Flexviews is a materialized views implementation for MySQL. It includes a simple API that is used to create materialized views and to refresh them. The advantage of using Flexviews is that the materialized views are incrementally refreshed, that is, the views are updated efficiently by using special logs which record the changes to database tables. Flexviews includes tools which create and maintain these logs. The views created by Flexviews include support for JOINs and for all major aggregation functions.

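Creating a view is preferable if you:
are sure about the required columns
want to reuse your view somewhere else as well
like coding in an abstract way (hiding technical details)
need fast access by creating an index on it (note that MySQL views cannot be indexed, so this only applies to databases that support indexed or materialized views)
want to give specific access to a few users (point taken from the comments)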

A view is simply a stored text query. You can apply WHERE and ORDER BY against it, and the execution plan will be calculated with those clauses taken into consideration. I think it would be useful if you want to keep your code "clean".
What you need to keep in mind is that a view is a little harder to modify, so if you are not quite sure about the columns, or they will change later, stick to a query.
As for performance, it is the same.
Best regards!

Performance wise they should be the same, but the view is better for a few practical reasons.
I prefer views because they encourage better reuse and refactoring of complex queries: you alter them in one place instead of having to copy-paste a newer version everywhere if you use the query in multiple places.
Also running an update query against a view can look a lot cleaner and simpler, but be aware that you sometimes can't update multiple columns in a view that belong to different underlying tables. So if you have to update 2 different columns, you'll have to run two different update queries.
Using a view also makes sense because you offload complex database logic to the database where it belongs instead of building it into your application code.
On the downside, a view can take a little longer to set up if you don't have your database management tool at the ready. Also, if you have a lot of views, you'll probably have to come up with some way to organize and document them all. This gets more complex if views start building off of other views, so you'll have to plan ahead and maintain dependencies.

How to filter my Doctrine queries with Symfony ACL

Symfony ACL allows me to grant access to an entity, and then check it:
if (false === $securityContext->isGranted('EDIT', $comment)) {
throw new AccessDeniedException();
}
However, if I have thousands of entities in the database and the user has access only to 10 of them, I don't want to load all the entities in memory and hydrate them.
How can I do a simple "SELECT * FROM X" while filtering only on the entities the user has access (at SQL level)?

Well there it is: it's not possible.
In the last year I've been working on an alternative ACL system that would allow to filter directly in database queries.
My company recently agreed to open source it, so here it is: http://myclabs.github.io/ACL/

As pointed out by @gregor in the previous discussion,
In your first query, get a list (with a custom query) of all the object_identity_ids (for a specific entity/class X) a user has access to.
Then, when querying a list of objects for entity/class X, add "IN (object_identity_ids)" to your query.
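A minimal sketch of the second step, assuming the first query has already produced the list of IDs. The helper name buildInQuery and the table name comment are hypothetical; the table name is interpolated directly, so it must come from trusted code, never from user input.

```php
<?php
// Hypothetical helper: builds a parameterised "id IN (...)" query from
// the list of object IDs the user is allowed to see.
function buildInQuery($table, array $ids)
{
    // Normalise to unique integers with consecutive keys.
    $ids = array_values(array_unique(array_map('intval', $ids)));

    if (count($ids) === 0) {
        // "IN ()" is not valid SQL, so return a query matching nothing.
        return array("SELECT * FROM {$table} WHERE 1 = 0", array());
    }

    // One "?" placeholder per ID, bound later by the database driver.
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));

    return array(
        "SELECT * FROM {$table} WHERE id IN ({$placeholders})",
        $ids,
    );
}

list($sql, $params) = buildInQuery('comment', array(3, 7, 7, 12));
// $sql:    SELECT * FROM comment WHERE id IN (?, ?, ?)
// $params: array(3, 7, 12)
```

The SQL string and the parameter list can then be handed to a prepared statement in whichever database layer is in use.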
Matthieu, I wasn't satisfied with replying with more conjecture (since conjecture doesn't add anything valuable to the conversation), so I did some benchmarking on this approach (Digital Ocean $5/mo VPS).
As expected, table size doesn't matter when using the IN array approach. But a big array size indeed makes things get out of control.
So, Join approach vs IN array approach?
JOIN is indeed better when the array size is huge. BUT, this is assuming that we shouldn't consider the table size. Turns out, in practice IN array is faster - except when there's a large table of objects and the acl entries cover almost every object (see the linked question).
I've expanded on my reasoning on a separate question. Please see When using Symfony's ACL, is it better to use a JOIN query or an IN array query?

You could have a look into the Doctrine filters. That way you could extend all queries. I have not done this yet and there are some limitations documented. But maybe it helps you. You'll find a description of the ACL database tables here.
UPDATE
Each filter will return a string and all those strings will be added to the SQL queries like so:
SELECT ... FROM ... WHERE ... AND (<result of filter 1> AND <result of filter 2> ...)
Also the table alias is passed to the filter method. So I think you can add Subqueries here to filter your entities.
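The idea can be sketched without Doctrine as plain string building. In Doctrine itself the hook would be the addFilterConstraint() method of a class extending SQLFilter; the standalone functions below only mimic what such a filter returns and how the results are combined. The table user_object_access and its columns are hypothetical, not the real Symfony ACL schema.

```php
<?php
// Hypothetical stand-in for a filter: returns the SQL fragment that
// restricts rows of the aliased table to those the user may access.
function aclFilterConstraint($tableAlias, $userId)
{
    // Cast to int so the value is safe to embed in the fragment.
    $userId = (int) $userId;

    return sprintf(
        '%s.id IN (SELECT a.object_id FROM user_object_access a WHERE a.user_id = %d)',
        $tableAlias,
        $userId
    );
}

// Mimics how the filter results are appended to the WHERE clause.
function applyFilters($where, array $constraints)
{
    return $where . ' AND (' . implode(' AND ', $constraints) . ')';
}

$where = applyFilters(
    'c0_.published = 1',
    array(aclFilterConstraint('c0_', 42))
);
```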

Where should filtering with an Acl be performed?

Let's say I have three tables: users, books, and users_books.
In one of my views, I want to display a list of all the books the current user has access to. A user has access to a book if a row matching a user and a book exists in users_books.
There are (at least) two ways I can accomplish this:
In my fetchAll() method in the books model, execute a join of some sort on the users_books table.
In an Acl plugin, first create a resource out of every book. Then, create a role out of every user. Next, allow or deny users access to each resource based on the users_books table. Finally, in the fetchAll() method of the books model, call isAllowed() on each book we find, using the current user as the role.
I see the last option as the best, because then I could use the Acl in other places in my application. That would remove the need to perform duplicate access checks.
What would you suggest?

I'd push it all down into the database:
Doing it in the database through JOINs will be a lot faster than filtering things in your PHP.
Doing it in the database will let you paginate things properly without having to jump through hoops like fetching more data than you need (and then fetching even more if you end up throwing too much out).
I can think of two broad strategies you could employ for managing the ACLs.
You could set up explicit ACLs in the database with a single table sort of like this:
id: The id of the thing (book, picture, ...) in question.
id_type: The type or table that id comes from.
user: The user that can look at the thing.
The (id, id_type) pair gives you a pseudo-FK that you can use for sanity checking your database, and the id_type can be used to select a class that provides the necessary glue to interact with the type-specific parts of the ACLs and add SQL snippets to queries to properly join the ACL table.
Alternatively, you could use a naming convention to attach an ACL sidecar table to each table that needs an ACL. For table t, you could have a table t_acl with columns like:
id: The id of the thing in t (with a real foreign key for integrity).
user: The user that can look at the thing.
Then, you could have a single ACL class that could adjust your SQL given the base table name.
The main advantage of the first approach is that you have a single ACL store for everything so it is easy to answer questions like "what can user X look at?". The main advantage of the second approach is that you can have real referential integrity and less code (through naming conventions) for gluing it all together.
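The second (sidecar table) approach could be sketched as a single helper that derives the ACL table from the base table name. The function name withAcl and the :user placeholder are assumptions; only the t_acl naming convention and the id and user columns come from the description above.

```php
<?php
// Hypothetical helper: given a base table t, joins it to its sidecar
// table t_acl so that only rows visible to one user are returned.
function withAcl($table)
{
    // Table names cannot be bound as parameters, so validate strictly.
    if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $table)) {
        throw new InvalidArgumentException("Unexpected table name: {$table}");
    }

    $acl = $table . '_acl';

    return "SELECT t.* FROM {$table} AS t "
         . "INNER JOIN {$acl} AS acl ON acl.id = t.id "
         . "WHERE acl.user = :user";
}

// For the books example: joins books to books_acl; :user is bound
// to the current user's id when the statement is executed.
$sql = withAcl('books');
```

Because the filtering happens in the query itself, LIMIT and OFFSET can be appended to the same statement and pagination stays correct.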
Hopefully the above will help your thinking.

I would separate your database access code from your models by creating a repository class with a finder method like getBooksByUser(User $user) that returns a collection of book objects.
I'm not entirely sure you need ACLs from what you describe. I may be wrong.

Getting data from multiple tables when using the Zend framework?

Is there a best practice in getting data from multiple database tables using Zend? I would like to know rather than end up wanting to refactor the code I write in the near future. I was reading the Zend documentation and it said that:
"You can not specify columns from a JOINed table to be returned in a row/rowset. Doing so will trigger a PHP error. This was done to ensure the integrity of the Zend_Db_Table is retained. i.e. A Zend_Db_Table_Row should only reference columns derived from its parent table."
I assume I therefore need to use multiple models -- is that correct? If, for example, I want to get out all orders for a particular user id where the date is in between two dates what would I do?
I know that it would be possible to access the two different models from a controller and then combine their respective data in the action but I would not feel happy doing this since I have been reading survivethedeepend.com and it tells me that I shouldn't do this...
Where, why, and how? :)
Thanks!

If you're reading ZFSTDE, in chapter 9 (http://www.survivethedeepend.com/zendframeworkbook/en/1.0/implementing.the.domain.model.entries.and.authors) this problem is addressed by using a data mapper.
Also, you can join two tables; just be sure to first call the setIntegrityCheck(false) method on the select object. The docs say that a row should reference its parent table, which doesn't mean it cannot reference others :)

Stop thinking about Zend_Db_Table as your "model".
You should write your own, rich, domain-centric model classes to sit between your controllers (and views), and your persistence logic (anything that uses Zend_Db/Zend_Db_Table/Zend_Db_Select) to load/store data from the database.

Sure, you can query several db tables at the same time. Take a look at the official ZF docs here http://framework.zend.com/manual/en/zend.db.select.html#zend.db.select.building.join
As for your example with getting all orders of a single user, table relationships are the answer http://framework.zend.com/manual/en/zend.db.table.relationships.html
