I'm trying to cache some results retrieved from the database using Yii framework 1.1.12. Here is what I am doing, in short:
public static function getCategories()
{
if (self::$_categories !== null)
return self::$_categories;
print "Getting categories...";
self::$_categories = Yii::app()->cache->get("categoriesList");
if (self::$_categories === false)
{
$sql = "SELECT id, parent_id, name FROM {{category}} WHERE id > 0 AND is_deleted = 0";
$categoriesList = Yii::app()->db->createCommand($sql)->queryAll();
// Doing some work with $categoriesList and obtaining self::$_categories as the result
// ...
$dependency = new CDbCacheDependency("SELECT MAX(update_time) FROM {{category}}");
Yii::app()->cache->set("categoriesList", self::$_categories, 3600, $dependency);
}
return self::$_categories;
}
Using the profiling tool I can see that everything works. On the first request both queries are executed (each query once):
SELECT MAX(update_time) FROM arrenda_category
SELECT id, parent_id, name FROM arrenda_category WHERE id > 0 AND is_deleted = 0
On further requests only the first one is executed.
The problem appears when I increase the max value of update_time in the arrenda_category table (directly from the MySQL command line, not even through my own edit script) and refresh the page: the query SELECT MAX(update_time) FROM arrenda_category is then executed twice. Further refreshes give only one execution again. Interestingly, if I clear the cache I also get only one execution of the SELECT MAX(...) ... query.
So I don't understand why the cache dependency query is executed twice when the condition changes. Is there something wrong with my code, or is it something else?
P.S. I'm sure that SELECT MAX(update_time) FROM arrenda_category can be executed only in the function described above. I also see that the line print "Getting categories..." is reached once per page request.
Yes. It is expected.
EXPLANATION
Suppose the data is already in the cache. When you call getCategories(), the line Yii::app()->cache->get("categoriesList") executes the dependency query to check whether the data has changed. Since it has not changed, the query is executed only once.
Now you change the update_time value externally (or from some other code in your app) and call getCategories() again:
The line Yii::app()->cache->get("categoriesList") executes the dependency query to check whether the data in the cache is still valid. It finds that the data is invalid and returns false.
Then the query SELECT id, parent_id, name FROM {{category}} WHERE id > 0 AND is_deleted = 0 is executed to fetch the updated data from the database.
The line Yii::app()->cache->set("categoriesList", self::$_categories, 3600, $dependency); executes the dependency query SELECT MAX(update_time) FROM {{category}} AGAIN to get the latest MAX(update_time), whose result is stored along with the data. That's why the query is executed twice.
So the point is that every time you set() a value in the cache, the dependency value must be stored along with it, since subsequent get() calls need it to check whether the dependency has changed.
PS:
If you want more clarification, check the source code of the set() function of your cache application component. It calls the evaluateDependency() function of the CDbCacheDependency class, which in turn calls generateDependentData(), which executes the dependency query.
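The sequence above can be reproduced without Yii in a small simulation. This is only a sketch: FakeDb, SketchDependency, SketchCache and simulateRequest() are hypothetical stand-ins written for this illustration, not Yii classes, and they only mimic the "evaluate on set(), re-check on get()" behaviour described above.

```php
<?php
// Hypothetical stand-ins for illustration only; none of these are Yii classes.

// Pretends to be the database and counts how often the dependency query runs.
final class FakeDb
{
    public int $maxUpdateTime = 100;
    public int $dependencyQueries = 0;

    public function queryMaxUpdateTime(): int
    {
        $this->dependencyQueries++;
        return $this->maxUpdateTime;
    }
}

final class SketchDependency
{
    private FakeDb $db;
    private ?int $stored = null;

    public function __construct(FakeDb $db)
    {
        $this->db = $db;
    }

    // Called by set(): run the query and remember its result.
    public function evaluate(): void
    {
        $this->stored = $this->db->queryMaxUpdateTime();
    }

    // Called by get(): run the query again and compare with the stored result.
    public function hasChanged(): bool
    {
        return $this->db->queryMaxUpdateTime() !== $this->stored;
    }
}

final class SketchCache
{
    private array $items = [];

    public function set(string $key, $value, SketchDependency $dependency): void
    {
        $dependency->evaluate();
        $this->items[$key] = [$value, $dependency];
    }

    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return false; // cache miss: there is no dependency to check
        }
        [$value, $dependency] = $this->items[$key];
        return $dependency->hasChanged() ? false : $value;
    }
}

// One simulated page request; returns how many dependency queries it caused.
function simulateRequest(SketchCache $cache, FakeDb $db): int
{
    $before = $db->dependencyQueries;
    if ($cache->get('categoriesList') === false) {
        $categories = ['rebuilt from the main SELECT']; // placeholder data
        $cache->set('categoriesList', $categories, new SketchDependency($db));
    }
    return $db->dependencyQueries - $before;
}

$db = new FakeDb();
$cache = new SketchCache();
$counts = [];
$counts[] = simulateRequest($cache, $db); // empty cache: only set() queries
$counts[] = simulateRequest($cache, $db); // warm cache: only get() queries
$db->maxUpdateTime = 200;                 // row updated outside the app
$counts[] = simulateRequest($cache, $db); // stale entry: get() and set() both query
$counts[] = simulateRequest($cache, $db); // warm again
echo implode(', ', $counts), PHP_EOL;     // prints "1, 1, 2, 1"
```

The per-request counts match what the question reports: one dependency query on an empty or warm cache, and two on the request that finds the entry stale.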
I've created a pull request to add a Database\ResultInterface::getNumRows() function to CodeIgniter 4. Some changes I've made have broken a unit test. Specifically, an SQLSRV table optimization test is barfing because this framework tries to fetch a result array from the SQLSRV response to this table optimization query:
ALTER INDEX all ON db_job REORGANIZE
It's also a bit tricky to trace all the code being called that leads to this, but here's an exception:
1) CodeIgniter\Database\Live\DbUtilsTest::testUtilsOptimizeDatabase
CodeIgniter\Database\Exceptions\DatabaseException: Error retrieving row count
/home/runner/work/CodeIgniter4/CodeIgniter4/system/Database/SQLSRV/Result.php:198
/home/runner/work/CodeIgniter4/CodeIgniter4/system/Database/BaseResult.php:193
/home/runner/work/CodeIgniter4/CodeIgniter4/system/Database/BaseUtils.php:177
/home/runner/work/CodeIgniter4/CodeIgniter4/tests/system/Database/Live/DbUtilsTest.php:105
Dissecting that, we see that line 105 of DbUtilsTest calls the BaseUtils::optimizeTable function with a table name, "db_job", and that function uses the SQLSRV\Utils::optimizeTable property to construct the query I provided above. This SQL is then fed to a db object that calls CodeIgniter\Database\SQLSRV\Connection::query(), which, through inheritance and various function calls, routes the SQL through BaseConnection::query(), then BaseConnection::simpleQuery, then SQLSRV\Connection::execute, which finally feeds the SQL to sqlsrv_query and returns the result of that function back up through the call stack. So I guess this brings me to my first question:
Question 1: What is the value of $stmt if this ALTER command a) succeeds or b) fails?:
$stmt = sqlsrv_query($connID, 'ALTER INDEX all ON db_job REORGANIZE');
According to the sqlsrv_query documentation, this function:
Returns a statement resource on success and false if an error occurred.
Whatever that value is gets returned back up the call stack to line 625 of BaseConnection::query where, if successful, it is stored as $this->resultID and fed as the second parameter to the constructor at line 676 which effectively returns new SQLSRV\Result($this->connID, $this->resultID). To be clear, connID refers to the SQLSRV db connection and resultID refers to whatever value of $stmt was returned by the sqlsrv_query call above.
The resulting $query variable is an instance of system\Database\SQLSRV\Result in this function:
public function optimizeTable(string $tableName)
{
if ($this->optimizeTable === false)
{
if ($this->db->DBDebug)
{
throw new DatabaseException('Unsupported feature of the database platform you are using.');
}
return false;
}
$query = $this->db->query(sprintf($this->optimizeTable, $this->db->escapeIdentifiers($tableName)));
if ($query !== false)
{
$query = $query->getResultArray();
return current($query);
}
return false;
}
An instance of SQLSRV\Result will not be false, so that code will attempt to call SQLSRV\Result::getResultArray, which through inheritance calls BaseResult::getResultArray. It seems wrong to call getResultArray on a query that optimizes a table, or on any sort of ALTER query; however, a MySQL server will return a series of records in response to an OPTIMIZE query. Also, the sqlsrv_query function we just ran only returns some sort of SQL Server statement resource as its result.
I guess this brings me to my second question:
Question 2: How does one tell from the $stmt result of sqlsrv_query whether the ALTER statement above succeeded?
Curiously, none of these peculiarities caused any problem before. The problem appears to have arisen because I've deprecated an unused BaseResult::numRows property that was never being set in the old getResultArray code and have replaced it, and all references to it, with a new getNumRows method. In my new BaseResult::getResultArray function we now check getNumRows instead of numRows. This works fine for MySQLi, SQLite3, and PostgreSQL, but barfs in SQLSRV because the aforementioned $stmt result of the sqlsrv_query ALTER statement gets fed to sqlsrv_num_rows, which returns false, signifying an error according to the docs. Here's the code for the SQLSRV\Result::getNumRows function:
public function getNumRows() : int
{
// $this->resultID contains the sqlsrv_query result of our ALTER statement
// and $retval comes up false
$retval = sqlsrv_num_rows($this->resultID);
if ($retval === false)
{
throw new DatabaseException('Error retrieving row count');
}
return intval($retval);
}
This brings me to:
Question 3: would it ever make any sense to try and fetch results from or call sqlsrv_num_rows on the result of an ALTER INDEX query?
What I am trying to do
I want to query a specific set of records using active model like so
$jobModel = Jobs::find()->select('JOB_CODE')->distinct()->where(['DEPT_ID'=>$dept_id])->all();
Then I want to assign a flag attribute to the records in this activerecord based on whether they appear in a relationship table
What I have tried
So in my job model, I have declared a new attribute inAccount. Then I added this function in the job model that sets the inAccount flag to -1 or 0 based on whether a record is found in the relationship table with the specified account_id
public function assignInAccount($account_id){
if(JobCodeAccounts::find()->where(['JOB_CODE'=>$this->JOB_CODE])->andWhere(['ACCOUNT_ID'=>$account_id])->one() == null){
$this->inAccount=0;
}
else{
$this->inAccount = -1;
}
}
What I have been doing is assigning each value individually using foreach like so
foreach($jobModel as $job){
$job->assignInAccount($account_id);
}
However, this is obviously very slow: if $jobModel holds a large number of records and each one makes a db query in assignInAccount(), the loop can take a long time, especially if the db is slow.
What I am looking for
I am wondering if there is a more efficient way to do this, so that I can assign inAccount to all job records at once. I considered using afterFind(), but I don't think this would work as I need to pass in a specific parameter. I am wondering if there is a way I can pass in an entire model (or at least an array of models/model attributes) and then do all the assignments with only a single query.
I should mention that I do need to end up with the original $jobModel activerecord as well.
Thanks to scaisEdge's answer I was able to come up with an alternative solution, first finding the array of jobs that need to be flagged like so:
$inAccountJobs = array_column(Yii::$app->db->createCommand('Select * from job_code_accounts where ACCOUNT_ID = :account_id')
->bindValues([':account_id' => $account_id])->queryAll(), 'JOB_CODE');
and then checking each job record to see if it appears in this array like so
foreach($jobModel as $job){
if(in_array($job->JOB_CODE, $inAccountJobs))
$job->inAccount = -1;
else
$job->inAccount = 0;
}
This does seem to be noticeably faster, as it requires only a single query.
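If the list of job codes grows large, the in_array() check inside the loop becomes a linear scan per record. A minimal sketch of one way around that, assuming the JOB_CODE values are strings or integers (array_flip() requires that): flip the array once and test membership with isset(). The sample codes below are made up for illustration and stand in for the query results above.

```php
<?php
// Hypothetical sample data standing in for the query results above.
$inAccountJobs = ['A10', 'B20'];        // JOB_CODEs linked to the account
$jobCodes      = ['A10', 'C30', 'B20']; // JOB_CODEs of the loaded job models

// Flip once so each membership test is a hash lookup instead of a linear scan.
$lookup = array_flip($inAccountJobs);

$flags = [];
foreach ($jobCodes as $code) {
    // Same convention as above: -1 when linked to the account, 0 otherwise.
    $flags[$code] = isset($lookup[$code]) ? -1 : 0;
}

print_r($flags);
```

In the real loop, the assignment would be $job->inAccount = isset($lookup[$job->JOB_CODE]) ? -1 : 0; the number of database queries stays at one.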
I have a method for a scheduled system cleanup that goes through all the files in the "storage" table, selects the type of files we need (property photos), and then checks for each of them whether the corresponding listing still exists in the database. If not, it removes the record from the DB and deletes the file itself.
Now about the problem. Originally I didn't use chunk(), it was just Model::all() to select everything and it all worked well. But at this point I've got 200000 records in that storage table and these operations began to crash because of enormous memory consumption. So I decided to go with chunk().
So the problem is that now it works as it should, however, at some random moments (somewhere around the middle of the process) the code execution just stops as if the operation was completed, so no errors logged anywhere and the task is not fully completed.
Can you please suggest what can be the cause of such strange behavior?
public function verifyPhotos() {
// Instantiating required models and putting them into a single array so they can be passed to a closure
$models = [];
$models['storage'] = App::make('Store');
$models['condo'] = App::make('Condo');
$models['commercial'] = App::make('Commercial');
$models['residential'] = App::make('Residential');
// Obtaining and processing all records from the storage chunk by chunk
$models['storage']->where('subject_type', '=', 'property_photo')->chunk(10000, function($files) use(&$models) {
// Going through each record in current chunk
foreach ($files as $photo) {
// If record's subject type is Condo
if ($photo->subject_name == 'CondoProperty') {
// Selecting Condo model to work with
$current_model = $models['condo'];
// If record's subject type is Commercial
} elseif ($photo->subject_name == 'CommercialProperty') {
// Selecting Commercial model to work with
$current_model = $models['commercial'];
// If record's subject type is Residential
} elseif($photo->subject_name == 'ResidentialProperty') {
// Selecting Residential model to work with
$current_model = $models['residential'];
}
// If linked listing doesn't exist anymore
if (!$current_model->where('ml_num', '=', $photo->owner_id)->first()) {
// Deleting a storage record and physical file
Storage::delete('/files/property/photos/'.$photo->file_name);
$models['storage']->unregisterFile($photo->id);
}
}
});
}
Using chunk() in Eloquent will add a limit and an offset to your SQL query and execute it for every chunk. If you change your data in the database reducing the rows matched by the query, you will skip over the matching rows in the next execution because of the offset.
I.e. if you have 9 rows with id = 1...9 and subject_type = 'property_photo' and you use chunk(3, ...) the resulting queries are:
select * from store where subject_type = 'property_photo' limit 3 offset 0;
select * from store where subject_type = 'property_photo' limit 3 offset 3;
select * from store where subject_type = 'property_photo' limit 3 offset 6;
select * from store where subject_type = 'property_photo' limit 3 offset 9;
If, inside each chunk, you set subject_type = 'something' for each row (or delete the rows, as your cleanup does), those rows no longer match, and the next query, which offsets by 3, effectively skips the next 3 matching rows.
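The skipping can be seen in a small simulation that uses no database at all. This is a sketch under stated assumptions: the "table" is a plain array of ids, array_slice() stands in for LIMIT/OFFSET, and both function names are made up for this illustration. The second function shows one common alternative, paging by the last id seen instead of by offset; newer Laravel versions provide chunkById() for this, if it is available in your version.

```php
<?php
// Hypothetical in-memory "table" of ids; array_slice stands in for LIMIT/OFFSET.

// Offset paging, as chunk() does: page N starts at offset N * $size.
function processByOffset(array $table, int $size): array
{
    $processed = [];
    for ($page = 0; ; $page++) {
        $rows = array_slice($table, $page * $size, $size);
        if ($rows === []) {
            break;
        }
        foreach ($rows as $id) {
            $processed[] = $id;
            // Deleting the row shrinks what the next "query" will match.
            $table = array_values(array_diff($table, [$id]));
        }
    }
    return ['processed' => $processed, 'left' => $table];
}

// Keyset paging: each page starts after the last id already seen.
function processByLastId(array $table, int $size): array
{
    $processed = [];
    $lastId = 0;
    while (true) {
        $matching = array_values(array_filter($table, fn ($id) => $id > $lastId));
        $rows = array_slice($matching, 0, $size);
        if ($rows === []) {
            break;
        }
        foreach ($rows as $id) {
            $processed[] = $id;
            $lastId = $id;
            $table = array_values(array_diff($table, [$id]));
        }
    }
    return ['processed' => $processed, 'left' => $table];
}

$byOffset = processByOffset(range(1, 9), 3);
$byLastId = processByLastId(range(1, 9), 3);

// Offset paging handles 1,2,3 and 7,8,9 but never sees 4,5,6.
echo implode(',', $byOffset['processed']), ' | ', implode(',', $byOffset['left']), PHP_EOL;
// Keyset paging handles all nine rows.
echo implode(',', $byLastId['processed']), ' | ', implode(',', $byLastId['left']), PHP_EOL;
```

With offset paging the loop ends normally after the third "query" returns nothing, which is consistent with the symptom in the question: the job stops partway through without any error.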
It may be possible to use the Collection::each() closure instead like below, although it would still have to load the entire result set into a collection:
$models['storage']->where('subject_type', '=', 'property_photo')->get()->each(function ($photo) use (&$models) {
if ($photo->subject_name == 'CondoProperty') {
//...
}
//...
});
Remember you can also run DB::disableQueryLog(); to save memory on large database operations.
You should wrap the suspicious parts of the code in try ... catch and write the exception message to a log file. I once ran into the same problem and eventually found that it was also caused by memory consumption.
The most suspicious part for me is the reuse of models in $current_model->where(). I suspect that the memory may not be released after each query. Basically, each query object should be used only once. Is there any reason to reuse it?
Try changing it to $current_model = App::make('YourModel'); instead of reusing the instance from $models and see if that solves it.
I'm trying to execute a query, similar to the following one, using doctrine dql:
Doctrine_Query::create()
->update('Table a')
->set('a.amount',
'(SELECT sum(b.amount) FROM Table b WHERE b.client_id = a.id AND b.regular = ? AND b.finished = ?)',
array(false, false))
->execute();
But it raises a Doctrine_Query_Exception with the message: "Unknown component alias b"
Is there a restriction on using sub-queries inside the 'set' clause? Can you give me some help?
Thanks in advance.
Years later but may help.
Yes
If you need/want/have to, you can use the QueryBuilder to execute an update query containing a sub-select statement, instead of using the underlying connection layer directly.
The idea here is to use the QueryBuilder twice.
Build a select statement to compute the new value.
Build the actual update query to submit to the database, in which you will inject the former select DQL, as you expected in order to issue a single database request.
Example
Consider an application where users can sell objects. Each transaction involves a buyer and a seller. After a transaction ends, sellers and buyers can leave a review on how the deal went with their counterpart.
You might need a User table, a Review table and a Transaction table.
The User table contains a field named rating which will hold the average rating for a user. The Review table stores a transaction id, the author id (who submitted the review), a value (from 0 to 5). Finally, the transaction contains a reference for both the seller and the buyer.
Now let's say you would like to update the average rating for a user after a review has been submitted by the counterpart. The update query will compute the average rating for a user and put the result as the value of the User.rating property.
I used the following snippet with Doctrine 2.5 and Symfony 3. Since the work is about users, it makes sense to create a new public function called updateRating( User $user) inside the AppBundle\Entity\UserRepository.php repository.
/**
* Update the average rating for a user
* @param User $user The user entity object target
*/
public function updateRating( User $user )
{
// Compute Subrequest. The reference table being Transaction, we get its repository first.
$transactionRepo = $this->_em->getRepository('AppBundle:Transaction');
$tqb = $transactionRepo->createQueryBuilder('t');
#1 Computing select
$select = $tqb->select('SUM(r.value)/count(r.value)')
// My Review table as no association declared inside annotation (because I do not need it elsewhere)
// So I need to specify the glue part in order join the two tables
->leftJoin('AppBundle:Review','r', Expr\Join::WITH, 'r.post = t.id AND r.author <> :author')
// In case you have an association declared inside the Transaction entity schema, simply replace the above leftJoin with something like
// ->leftJoin('t.reviews', 'r')
// Specify index first (Transaction has been declared as terminated)
->where( $tqb->expr()->eq('t.ended', ':ended') )
// The user can be seller or buyer
->andWhere( $tqb->expr()->orX(
$tqb->expr()->eq('t.seller', ':author'),
$tqb->expr()->eq('t.buyer', ':author')
));
#2 The actual update query, containing the above sub-request
$update = $this->createQueryBuilder('u')
// We want to update a row
->update()
// Setting the new value using the above sub-request
->set('u.rating', '('. $select->getQuery()->getDQL() .')')
// should apply to the user we want
->where('u.id = :author')
// Set parameters for both the main & sub queries
->setParameters([ 'ended' => 1, 'author' => $user->getId() ]);
// Run the update; for an UPDATE query, execute() returns the number of affected rows
return $update->getQuery()->execute();
}
Now from the controller
// … Update User's rating
$em->getRepository('AppBundle:User')->updateRating($member);
// …
I'm not sure if there's a restriction on this, but I remember fighting with it some time ago. I eventually got it working with:
$q = Doctrine_Manager::getInstance()->getCurrentConnection();
$q->execute("UPDATE table a SET a.amount = (SELECT SUM(b.amount) FROM table b WHERE b.client_id = a.id AND b.regular = 0 AND b.finished = 0)");
See if that does the trick. Note that automatic variable escaping doesn't get executed with this query as it's not DQL.
In my symfony project, I have a "complex" query that looks like:
$d = Doctrine_Core::getTable('MAIN_TABLE')
// Create the base query with some keywords
->luceneSearch($keywords)
->innerJoin('w.T1 ws')
->innerJoin('ws.T2 s')
->innerJoin('w.T3 piv')
->innerJoin('piv.T4 per')
->innerJoin('w.T5 st')
...
->innerJoin('doc.T12 docT')
->innerJoin('w.Lang lng')
->execute();
I added all those innerJoin calls to reduce the number of queries required by my data model. All the data is in fact retrieved with this single query, but it takes from 2 to 20 seconds depending on the keywords.
I decided to use memcache because the data does not change all the time.
What I've done is configure memcache and add
...
->useResultCache(true)
->execute();
to my query.
What is strange is that:
The first time (when the cache is empty/flushed), only one query is executed.
The second time, ~130 queries are executed and it takes more time than the first...
Those "new" queries retrieve the inner-joined data for each record.
What I don't understand is why the inner-joined data is not saved in the cache.
I tried changing the hydration mode, but it does not seem to have any effect.
Does anyone have an idea?
After a whole day of googling, analysing Doctrine and growing desperate, I found an article that explains the solution:
class User extends BaseUser{
public function serializeReferences($bool=null)
{
return true;
}
}
The problem was the profile object was not getting stored in the result cache and thus causing a query each time it was called from the user object. After much hunting around, a long time in #doctrine, and a few leads from a couple of people, it turns out, by default, Doctrine will only serialize the immediate relation to the main object. However, you can make it so that it will serialize objects further down the line by overriding the function serializeReferences to return true in the class you want to serialize references from. In my example this is the User class. Since our application will never only need the ‘User’ class to be serialized on a result cache I completely overrode the function and made it always return true
http://shout.setfive.com/2010/04/28/using-doctrine-result-cache-with-two-deep-relations/