I've just installed Memcached and I'm trying to use it to cache the results of various queries run through Doctrine ORM (Doctrine 2.4.8+, Symfony 2.8+).
My app/config/config_prod.yml contains this:
doctrine:
    orm:
        metadata_cache_driver: memcached
        result_cache_driver: memcached
        query_cache_driver: memcached
And I tried useResultCache() on two queries like this (I just replaced the cache id here for the example): return $query->useResultCache(true, 300, "my_cache_id")->getArrayResult();. These are special queries because they're native queries (SQL) due to their complexity, but the method is available for any query (class AbstractQuery), so I assume it should work.
Unfortunately, it doesn't. Every time I refresh the page, if I've just made a change in the database, the change is displayed. I've checked the Memcached stats and there do seem to be some cache hits, but given the behaviour above I don't really see how.
Does anyone have an idea why the cache doesn't seem to be used here to fetch the supposedly cached results? Did I misunderstand something, and the TTL is somehow ignored?
No error is generated, and the Memcached log is empty.
As requested by @nifr, here is the layout of the code that builds the two native queries I put the Memcached test on:
$rsm = new ResultSetMapping;
$rsm->addEntityResult('my_entity_user', 'u');
// some $rsm->addFieldResult('u', 'column', 'field');
// some $rsm->addScalarResult('column', 'alias');

$sqlQuery = 'SELECT
    ...
    FROM ...
    INNER JOIN ... ON ...
    INNER JOIN ... ON ...
    // some more joins
    WHERE condition1
    AND condition2';

// some conditions added depending on params passed to this function

$sqlQuery .= '
    AND (fieldX = (subrequest1))
    AND (fieldY = (subrequest2))
    AND condition3
    AND condition4
    GROUP BY ...
    ORDER BY ...
    LIMIT :nbPerPage
    OFFSET :offset
';

$query = $this->_em->createNativeQuery($sqlQuery, $rsm)
    // some ->setParameter('param', value)
;
return $query->useResultCache(true, 300, "my_cache_id")->getArrayResult();
So, it seems that for some reason Doctrine doesn't manage to get the result cache driver. I tried to set it before calling useResultCache(), but I got an exception from Memcached: Error: Call to a member function get() on null.
I've decided to do it more directly by calling Memcached myself. I guess I'll do this kind of thing in controllers and repositories, depending on my needs. After some tests it works perfectly.
Here is basically what I do:
$cacheHit = false;
$cacheId = md5("my_cache_id"); // Generate a hash for your cache id

// Check that the Memcached extension exists (it may not be installed in the dev environment, for instance)
if (class_exists('Memcached'))
{
    $cache = new \Memcached();
    $cache->addServer('localhost', 11211);
    $cacheContent = $cache->get($cacheId);

    // Check whether the content is already cached
    if ($cacheContent != false)
    {
        // Content cached, that's a cache hit
        $content = $cacheContent;
        $cacheHit = true;
    }
}

// No cache hit? Do our stuff and set the cache content for future requests
if ($cacheHit == false)
{
    // Do the stuff you want to cache here and put it in a variable, $content for instance
    if (class_exists('Memcached')) {
        $cache->set($cacheId, $content, time() + 600); // Here the cache entry will expire in 600 seconds
    }
}
I'll probably put this in a service. Not sure yet what the "best practice" is for this kind of thing.
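If it does end up as a service, a minimal definition could look roughly like this (the class and service names below are placeholders for illustration, not my actual code):

# app/config/services.yml - hypothetical sketch, names are placeholders
services:
    app.memcached_helper:
        class: AppBundle\Cache\MemcachedHelper
        arguments: ['localhost', 11211]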
Edit: I did make a service. But it only works with native SQL... so the problem remains unsolved.
Edit²: I've found a working solution for this null issue (meaning Doctrine couldn't find Memcached). The bit of code:
$memcached = new \Memcached();
$memcached->addServer('localhost', 11211);
$doctrineMemcached = new \Doctrine\Common\Cache\MemcachedCache();
$doctrineMemcached->setMemcached($memcached);
$query
    ->setResultCacheDriver($doctrineMemcached)
    ->useResultCache(true, 300);
Now, I would like to know what I should put in config_prod.yml so I can just use the useResultCache() function.
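For reference, older DoctrineBundle versions let you describe the Memcached connection directly in the driver config, which should make the manual setResultCacheDriver() call unnecessary. Treat the exact keys below as something to verify against your DoctrineBundle version, not as a definitive answer:

# app/config/config_prod.yml - sketch, verify the keys against your DoctrineBundle version
doctrine:
    orm:
        result_cache_driver:
            type: memcached
            host: localhost
            port: 11211
            instance_class: Memcached
        # repeat the same block for metadata_cache_driver and query_cache_driver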
I'm trying to use Redis with tagging in my Symfony 5 app but can't seem to get RedisTagAwareAdapter to work. Saving to Redis without tags works just fine like this:
use Symfony\Component\Cache\Adapter\RedisAdapter;
use Symfony\Contracts\Cache\ItemInterface;
$client = RedisAdapter::createConnection(
    'redis://redis'
);
$cache = new RedisAdapter($client);
$key = "testkey";
$data = "hello world";
$cacheItem = $cache->getItem($key);
$cacheItem->set($data);
$cacheItem->expiresAfter(3600);
$cache->save($cacheItem);
But if I switch to using the RedisTagAwareAdapter, as this suggests, then nothing gets saved:
use Symfony\Component\Cache\Adapter\RedisAdapter;
use Symfony\Component\Cache\Adapter\RedisTagAwareAdapter;
use Symfony\Contracts\Cache\ItemInterface;
$client = RedisAdapter::createConnection(
    'redis://redis'
);
$cache = new RedisTagAwareAdapter($client);
$key = "testkey";
$data = "hello world";
$cacheItem = $cache->getItem($key);
$cacheItem->set($data);
$cacheItem->tag('greeting');
$cacheItem->expiresAfter(3600);
$cache->save($cacheItem);
The $cache->save() call returns false. No other errors are thrown.
I'm using Symfony 5, Redis server version 6.2.5, and have phpredis installed. Any ideas on what I'm doing wrong? TIA
The solution makes everything a lot simpler :D
The callback-based get() from the Symfony cache contracts makes things, although a bit weird at first glance, loads simpler.
In your case the code would look like this:
[...]
$cacheItem = $cache->get($key, function (ItemInterface $item) {
    $item->expiresAfter(3600);
    $item->tag('greeting');

    return "hello world";
});
Now I know this looks a bit counterintuitive and makes hardly any sense at first glance, but here's what actually happens:
First it tries to get the item at $key, and if that is not a hit, the callback kicks in. It creates a new item, sets the tagging and lifetime, and assigns the callback's return value as the value stored at the key (that is the weird part). And while it looks convoluted, it's actually really smart, since all the data gathering is done ONLY if there's no cache hit for that specific key.
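The tags then pay off when you need to invalidate: a tag-aware adapter exposes invalidateTags(), so, sticking with the 'greeting' tag from the example above, something along these lines should drop every entry carrying that tag:

// Invalidate every cache entry tagged 'greeting' (sketch based on the adapter above)
$cache->invalidateTags(['greeting']);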
I have a Symfony command that uses the Doctrine Paginator, on PHP 7.0.22. The command must process data from a large table, so I do it in chunks of 100 items. The issue is that after a few hundred loops it fills up 256M of RAM. As measures against OOM (out of memory) I use:
$em->getConnection()->getConfiguration()->setSQLLogger(null); - disables the SQL logger, which fills memory with logged queries in scripts that run many SQL commands
$em->clear(); - detaches all objects from Doctrine at the end of every loop
I've put some dumps with memory_get_usage() to check what's going on, and it seems the garbage collector doesn't clean up as much as the command adds at every $paginator->getIterator()->getArrayCopy(); call.
I've even tried to manually collect the garbage on every loop with gc_collect_cycles(), but still no difference: the command starts at 18M and increases by ~2M every few hundred items. I also tried to manually unset the results and the query builder... nothing. I removed all the data processing and kept only the select query and the paginator, and got the same behaviour.
Does anyone have an idea where I should look next?
Note: 256M should be more than enough for this kind of operation, so please don't recommend solutions that suggest increasing the allowed memory.
The stripped-down execute() method looks something like this:
protected function execute(InputInterface $input, OutputInterface $output)
{
    // Remove SQL logger to avoid out of memory errors
    $em = $this->getEntityManager(); // method defined in base class
    $em->getConnection()->getConfiguration()->setSQLLogger(null);

    $firstResult = 0;

    // Get latest ID
    $maxId = $this->getMaxIdInTable('AppBundle:MyEntity'); // method defined in base class
    $this->getLogger()->info('Working for max media id: ' . $maxId);

    do {
        // Get data
        $dbItemsQuery = $em->createQueryBuilder()
            ->select('m')
            ->from('AppBundle:MyEntity', 'm')
            ->where('m.id <= :maxId')
            ->setParameter('maxId', $maxId)
            ->setFirstResult($firstResult)
            ->setMaxResults(self::PAGE_SIZE)
        ;
        $paginator = new Paginator($dbItemsQuery);
        $dbItems = $paginator->getIterator()->getArrayCopy();

        $totalCount = count($paginator);
        $currentPageCount = count($dbItems);

        // Clear Doctrine objects from memory
        $em->clear();

        // Update first result
        $firstResult += $currentPageCount;
        $output->writeln($firstResult);
    } while ($currentPageCount == self::PAGE_SIZE);

    // Finish message
    $output->writeln("\n\n<info>Done running <comment>" . $this->getName() . "</comment></info>\n");
}
The memory leak was generated by the Doctrine Paginator. I replaced it with a native query using Doctrine prepared statements, and that fixed it.
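As an illustration only (not the exact code from my command), a DBAL 2.x-style prepared-statement version of the loop could look roughly like this; the table and column names are placeholders, and using keyset pagination on the id avoids OFFSET entirely:

// Hypothetical sketch: paginate with the DBAL connection instead of the ORM Paginator.
$conn = $em->getConnection();
$lastId = 0;

do {
    $stmt = $conn->prepare(
        'SELECT * FROM my_entity WHERE id > :lastId AND id <= :maxId ORDER BY id ASC LIMIT ' . self::PAGE_SIZE
    );
    $stmt->bindValue('lastId', $lastId, \PDO::PARAM_INT);
    $stmt->bindValue('maxId', $maxId, \PDO::PARAM_INT);
    $stmt->execute();
    $rows = $stmt->fetchAll();

    foreach ($rows as $row) {
        // Process the plain array row; no entities are hydrated, so nothing piles up in the unit of work.
        $lastId = $row['id'];
    }
} while (count($rows) == self::PAGE_SIZE);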
Other things that you should take into consideration:
If you replace the Doctrine Paginator, you should rebuild the pagination functionality by adding a limit to your query.
Run your command with the --no-debug flag or --env=prod, or maybe both. The thing is that commands run in the dev environment by default, which enables some data collectors that are not used in the prod environment. See more on this topic in the Symfony documentation - How to Use the Console.
Edit: In my particular case I was also using the eightpoints/guzzle-bundle bundle, which wraps the Guzzle HTTP library (I had some API calls in my command). This bundle was also leaking, apparently through some middleware. To fix this, I had to instantiate the Guzzle client directly, without the EightPoints bundle.
I have an API written in Laravel. There is the following code in it:
public function getData($cacheKey)
{
    if (Cache::has($cacheKey)) {
        return Cache::get($cacheKey);
    }

    // if cache is empty for the key, get data from external service
    $dataFromService = $this->makeRequest($cacheKey);
    $dataMapped = array_map([$this->transformer, 'transformData'], $dataFromService);

    Cache::put($cacheKey, $dataMapped);

    return $dataMapped;
}
In getData(), if the cache contains the necessary key, the data is returned from the cache.
If the cache does not have the necessary key, the data is fetched from the external API, processed, placed in the cache, and then returned.
The problem is: when there are many concurrent requests to this method, the data gets corrupted. I guess the data is written to the cache incorrectly because of race conditions.
You seem to be experiencing some sort of critical-section problem. But here's the thing: Redis operations are atomic, however Laravel does its own checks before calling Redis.
The major issue here is that all the concurrent requests will each make the external call and then all of them will write the results to the cache (which is definitely not good). I would suggest implementing a simple mutual exclusion lock in your code.
Replace your current method body with the following:
public function getData($cacheKey)
{
    $mutexKey = "getDataMutex";

    if (!Redis::setnx($mutexKey, true)) {
        // Already running: you can either busy-wait until the cache key is ready,
        // or fail this request and assume that another one will succeed.
        // Definitely don't trust what the cache says at this point.
    }

    $value = Cache::rememberForever($cacheKey, function () use ($cacheKey) { // this is just the convenience method, it doesn't change anything
        $dataFromService = $this->makeRequest($cacheKey);
        $dataMapped = array_map([$this->transformer, 'transformData'], $dataFromService);

        return $dataMapped;
    });
    Redis::del($mutexKey);

    return $value;
}
setnx is a native Redis command that sets a value only if the key doesn't already exist. This is done atomically, so it can be used to implement a simple locking mechanism, but (as mentioned in the manual) it will not work if you're using a Redis cluster. In that case the Redis manual describes a method to implement distributed locks.
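One thing worth adding to the sketch above: give the mutex key a TTL, otherwise a request that crashes between setnx and del leaves the lock held forever. A simple variant (how you pass the expiry depends on your Laravel version and Redis driver, so treat this as a sketch):

// Variant of the lock above with an expiry so a crashed request can't hold it forever.
if (Redis::setnx($mutexKey, true)) {
    Redis::expire($mutexKey, 10); // lock auto-releases after 10 seconds
}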
In the end I came to the following solution: I use the retry() helper from Laravel 5.5 to keep reading the cache value, with an interval of 1 second, until it has been written there properly.
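For illustration, a rough sketch of that approach (the attempt count and the exception used as the "not ready yet" signal are assumptions, not the exact code):

// retry($times, $callback, $sleepMilliseconds) re-runs the callback until it stops throwing;
// here we throw while the key is still missing from the cache.
$value = retry(10, function () use ($cacheKey) {
    if (!Cache::has($cacheKey)) {
        throw new \RuntimeException('Cache not ready yet');
    }

    return Cache::get($cacheKey);
}, 1000); // wait 1 second between attempts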
I'm trying to insert a large amount of data (30,000+ rows) into a MySQL database using Doctrine2 and the Symfony2 fixture bundle.
I looked for the right way to do it. I saw lots of questions about memory leaks and Doctrine, but no answer that satisfied me. The Doctrine clear() function often comes up.
So, I tried various shapes of this:
while ($data = getData()) {
    $iteration++;

    $obj = new EntityObject();
    $obj->setName('henry');
    // Fill object...

    $manager->persist($obj);

    if ($iteration % 500 == 0) {
        $manager->flush();
        $manager->clear();
        // Also tried some sort of:
        // $manager->clear($obj);
        // $manager->detach($obj);
        // gc_collect_cycles();
    }
}
PHP memory still goes wild, right after the flush() (I'm sure of that). In fact, every time the entities are flushed, memory goes up by a certain amount depending on the batch size and the entities, until it reaches the deadly "Allowed memory size exhausted" error. With a very, very tiny entity it works, but memory consumption still increases far too much: several MB whereas it should be KB.
clear(), detach() and calling the GC don't seem to have any effect at all. They only free a few KB.
Is my approach flawed? Did I miss something, somewhere? Is it a bug?
More info:
Without flush(), memory barely moves;
Lowering the batch size does not change the outcome;
Data comes from a CSV that needs to be sanitized;
EDIT (partial solution):
@qooplmao brought a solution that significantly decreases memory consumption: disabling the Doctrine SQL logger with $manager->getConnection()->getConfiguration()->setSQLLogger(null);
However, memory usage is still abnormally high and keeps increasing.
I resolved my problem using this resource, as @Axalix suggested.
This is how I modified the code:
// IMPORTANT - Disable the Doctrine SQL logger
$manager->getConnection()->getConfiguration()->setSQLLogger(null);

// SUGGESTION - make getData() a generator (using yield) to save even more memory.
while ($data = getData()) {
    $iteration++;

    $obj = new EntityObject();
    $obj->setName('henry');
    // Fill object...

    $manager->persist($obj);

    // IMPORTANT - Temporarily store the entities (of course, the array must be defined outside the loop first)
    $tempObjects[] = $obj;

    if ($iteration % 500 == 0) {
        $manager->flush();

        // IMPORTANT - detach the entities
        foreach ($tempObjects as $tempObject) {
            $manager->detach($tempObject);
        }

        $tempObjects = [];
        gc_enable();
        gc_collect_cycles();
    }
}

// Do not forget the last flush
$manager->flush();
And, last but not least, since I use this script with Symfony data fixtures, adding the --no-debug flag to the command is also very important; with it, memory consumption stays stable.
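For example, assuming the standard DoctrineFixturesBundle command (the exact command name is an assumption about the setup):

php app/console doctrine:fixtures:load --no-debug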
I found out that Doctrine logs all SQL statements during execution. I recommend disabling this with the code below; it can really save memory:
use Doctrine\ORM\EntityManagerInterface;

public function __construct(EntityManagerInterface $entity_manager)
{
    $em_connection = $entity_manager->getConnection();
    $em_connection->getConfiguration()->setSQLLogger(null);
}
My suggestion is to drop the Doctrine approach for bulk inserts. I really like Doctrine, but I just hate this kind of thing with bulk inserts.
MySQL has a great feature called LOAD DATA. I would rather use it, even if I have to sanitize my CSV first and do the LOAD afterwards.
If you need to change the values, I would read the CSV into an array with $csvData = array_map("str_getcsv", file($csv));, change whatever you need in the array, and write it back out. After that, use the new .csv with LOAD DATA in MySQL.
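A rough sketch of that flow (the table and column names are placeholders, and LOCAL INFILE has to be enabled on both the MySQL client and the server for the last statement to work):

// Sanitize the CSV in PHP, write a clean copy, then bulk-load it with LOAD DATA.
$csvData = array_map('str_getcsv', file($csv));
foreach ($csvData as &$row) {
    $row[0] = trim($row[0]); // whatever sanitizing you need
}
unset($row);

$fp = fopen('/tmp/clean.csv', 'w');
foreach ($csvData as $row) {
    fputcsv($fp, $row);
}
fclose($fp);

// Placeholder table/columns; run this through your DBAL connection or the mysql client.
$manager->getConnection()->executeUpdate(
    "LOAD DATA LOCAL INFILE '/tmp/clean.csv'
     INTO TABLE entity_object
     FIELDS TERMINATED BY ','
     LINES TERMINATED BY '\\n'
     (name, other_column)"
);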
To support my claim on why I wouldn't use Doctrine for this, see the issues described at the top.
This is my current setup:
snc_redis:
    clients:
        default:
            type: predis
            alias: cache
            dsn: "redis://127.0.0.1"
    doctrine:
        metadata_cache:
            client: cache
            entity_manager: default
            document_manager: default
        result_cache:
            client: cache
            entity_manager: [bo, aff, fs]
        query_cache:
            client: cache
            entity_manager: default
I have an API that gets multiple duplicate requests (usually in quick succession). Can I use this setup to send back a cached response on a duplicate request? And is it possible to set a cache expiry?
From the config sample you provided, I'm guessing you want to cache the Doctrine results rather than the full HTTP responses (although the latter is possible, see below).
If so, the easiest way to do this is: whenever you create a Doctrine query, tell it to use the result cache, which you've configured above to use Redis.
$qb = $em->createQueryBuilder();
// do query things
$query = $qb->getQuery();
$query->useResultCache(true, 3600, 'my_cache_id');
This will cache the results of that query for an hour under your cache ID. Clearing the cache is a bit of a faff:
$cache = $em->getConfiguration()->getResultCacheImpl();
$cache->delete('my_cache_id');
If you want to cache full responses - i.e. you do some processing in-app which takes a long time - then there are numerous ways of doing that. Serializing and popping it into redis is possible:
$myResults = $service->getLongRunningResults();
$serialized = serialize($myResults);
$redisClient = $container->get('snc_redis.default');
$redisClient->setex('my_id', 3600, $serialized);
Alternatively, look into dedicated HTTP caching solutions like Varnish, or see the Symfony documentation on HTTP caching.
Edit: The SncRedisBundle provides its own version of Doctrine's CacheProvider. So whereas in your answer you create your own class, you could also do:
my_cache_service:
    class: Snc\RedisBundle\Doctrine\Cache\RedisCache
    calls:
        - [ setRedis, [ "@snc_redis.default" ] ]
This will do almost exactly what your class does. So instead of $app_cache->get('id') you do $app_cache->fetch('id'). This way you can swap out the cache backend without changing your app class, just the service definition.
In the end I created a cache manager and registered it as a service called @app_cache.
use Predis;

class CacheManager
{
    protected $client;

    function __construct()
    {
        $this->client = new Predis\Client();
    }

    /**
     * @return Predis\Client
     */
    public function getInstance()
    {
        return $this->client;
    }
}
In the controller I can then md5 the request URI:
$id = md5($request->getRequestUri());
Check whether it exists; if it does, return the $result:
if ($result = $app_cache->get($id)) {
    return $result;
}
If it doesn't, do whatever you need to do and save the response for next time:
$app_cache->set($id,$response);
To set the expiry, use the 3rd and 4th parameters: ex = seconds and px = milliseconds.
$app_cache->set($id,$response,'ex',3600);