PHP APCu not persistent in Laravel queued/dispatched jobs

(Laravel 8, PHP 8)
Hi. I have a bunch of data in the PHP APC cache that I can access across my Laravel application with the apcu commands.
I decided I should fire an async job to process some of that data for the user during a session and throw the results in the database.
So I made a middleware that fires (correctly) when the user accesses the page, and (correctly) dispatches a job called "MemoryProvider".
The dispatch command promptly instantiates the MemoryProvider class, running its constructor, and then queues the job for execution.
About a second later, the queue is processed and the handle method in MemoryProvider is run.
I check the contents of the PHP cache with apcu_cache_info() and apcu_exists() in the middleware, in the MemoryProvider constructor, and in its handle method.
The problem:
The PHP cache appears populated throughout my Laravel app.
The PHP cache appears populated in the middleware.
The PHP cache appears populated in the job's constructor.
The PHP cache appears EMPTY in the job's handle method.
Here's the middleware:
public function handle($request, Closure $next)
{
    $a = apcu_cache_info();      // 250,000 entries
    $b = apcu_exists('the:2:0'); // true
    MemoryProvider::dispatch($request);
    return $next($request);
}
Here's the job's (MemoryProvider) constructor:
{
$this->request = $request->all();
$a = apcu_cache_info(); // 250,000 entries
$b = apcu_exists('the:2:0'); // true
}
And here's the job's (MemoryProvider) handle method:
{
$a = apcu_cache_info(); // 0 entries
$b = apcu_exists('the:2:0'); // false
}
Question: is this a PHP limitation or a Laravel problem? And how can I access the contents of my PHP cache from an async job?
p.s. I have apc.enable_cli=1 in php.ini

I found the answer. Apparently, it's a PHP limitation.
According to a good explanation given by gview back in 2017, a CLI process doesn't share state or memory with other processes, so the APCu memory space will never be shared this way. That also explains the symptoms: the job's constructor runs during dispatch, inside the web request, which is why it still sees the cache; handle() runs later inside the queue worker, a separate CLI process with its own (empty) APCu store.
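One way to see this for yourself (a hypothetical debugging aid, not part of the original post) is to log the process ID and SAPI on both sides:
use Illuminate\Support\Facades\Log;

// In the middleware (runs in the web process, e.g. under PHP-FPM):
Log::info('middleware', ['pid' => getmypid(), 'sapi' => PHP_SAPI]);

// In the job's handle() method (runs in the queue worker, a separate CLI process):
Log::info('job handle', ['pid' => getmypid(), 'sapi' => PHP_SAPI]);
The two log entries show different PIDs (and typically different SAPIs), confirming that handle() cannot see the web process's APCu memory.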
I did find a workaround for my specific case: instead of running an async process to handle the heavy work in the background, I can get the same effect by simply issuing an AJAX request. The request is handled independently by PHP, with full access to the APC cache, and I can populate my database and let the user know when it's all done (or gradually done, as is the case).
I wish I had thought of this sooner.
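A minimal sketch of that workaround (the route and controller names here are hypothetical, Laravel 8 style):
// routes/web.php
use App\Http\Controllers\MemoryController;

Route::post('/process-memory', [MemoryController::class, 'process']);

// app/Http/Controllers/MemoryController.php
public function process(Request $request)
{
    // This runs in a normal web request, so the shared APCu cache is visible.
    $b = apcu_exists('the:2:0'); // true again here

    // ... do the heavy processing and write the results to the database ...

    return response()->json(['done' => true]);
}
The page then fires this endpoint via AJAX instead of dispatching a queued job.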

Related

Symfony Lock does not lock when making two requests from the same browser

I want to prevent a user from making the same request two times by using the Symfony Lock component, because right now users can click on a link two times (by accident?) and duplicate entities are created. I want to use the Unique Entity Constraint, which does not protect against race conditions by itself.
The Symfony Lock component does not seem to work as expected. When I create a lock at the beginning of a page and open the page two times at the same time, the lock can be acquired by both requests. When I open the test page in a standard and an incognito browser window, the second request doesn't acquire the lock. But I can't find anything in the docs about this being linked to a session. I have created a small test file in a fresh project to isolate the problem. This is using PHP 7.4, Symfony 5.3 and the Lock component.
<?php
namespace App\Controller;

use Sensio\Bundle\FrameworkExtraBundle\Configuration\Template;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\Lock\LockFactory;
use Symfony\Component\Routing\Annotation\Route;

class LockTest extends AbstractController
{
    /**
     * @Route("/test")
     * @Template("lock/test.html.twig")
     */
    public function test(LockFactory $factory): array
    {
        $lock = $factory->createLock("test");
        $acquired = $lock->acquire();
        dump($lock, $acquired);
        sleep(2);
        dump($lock->isAcquired());
        return ["message" => "testing"];
    }
}
I slightly rewrote your controller like this (with Symfony 5.4 and PHP 8.1):
class LockTestController extends AbstractController
{
    #[Route("/test")]
    public function test(LockFactory $factory): JsonResponse
    {
        $lock = $factory->createLock("test");
        $t0 = microtime(true);
        $acquired = $lock->acquire(true);
        $acquireTime = microtime(true) - $t0;
        sleep(2);
        return new JsonResponse(["acquired" => $acquired, "acquireTime" => $acquireTime]);
    }
}
It waits for the lock to be released, and it measures how long the controller waits to acquire the lock.
I ran two requests with curl against a Caddy server.
curl -k 'https://localhost/test' & curl -k 'https://localhost/test'
The output confirms one request was delayed while the first one slept with the acquired lock.
{"acquired":true,"acquireTime":0.0006971359252929688}
{"acquired":true,"acquireTime":2.087146043777466}
So, the lock works to guard against concurrent requests.
If the lock is not blocking:
$acquired = $lock->acquire(false);
The output is:
{"acquired":true,"acquireTime":0.0007710456848144531}
{"acquired":false,"acquireTime":0.00048804283142089844}
Notice how the second lock is not acquired. You should use this flag to reject the user's request with an error instead of creating the duplicate entity.
If the two requests are sufficiently spaced apart to each get the lock in turn, you can check that the entity exists (because it had time to be fully committed to the db) and return an error.
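A minimal sketch of that rejection pattern (the route and entity-creation code are hypothetical; same Symfony 5.4 / PHP 8.1 setup as above):
#[Route("/entity/create", methods: ["POST"])]
public function create(LockFactory $factory): Response
{
    // A real key would identify the operation, e.g. include the user id.
    $lock = $factory->createLock("entity-create");
    if (!$lock->acquire(false)) {
        // A concurrent request holds the lock: reject instead of duplicating.
        return new Response("Duplicate request", Response::HTTP_CONFLICT);
    }
    try {
        // ... check the entity doesn't already exist, then persist it ...
        return new Response("Created", Response::HTTP_CREATED);
    } finally {
        $lock->release();
    }
}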
Despite those encouraging results, the doc mentions this note:
Unlike other implementations, the Lock Component distinguishes lock instances even when they are created for the same resource. It means that for a given scope and resource one lock instance can be acquired multiple times. If a lock has to be used by several services, they should share the same Lock instance returned by the LockFactory::createLock method.
I understand two locks acquired by two distinct factories should not block each other. Unless the note is outdated or wrongly phrased, it seems possible to have non working locks under some circumstances. But not with the above test code.
StreamedResponse
A lock is released when it goes out of scope.
As a special case, when a StreamedResponse is returned, the lock goes out of scope when the response is returned by the controller. But the StreamedResponse has yet to return anything!
To keep the lock while the response is generated, it must be passed to the function executed by the StreamedResponse:
public function export(LockFactory $factory): Response
{
    // create a lock with a TTL of 60s
    $lock = $factory->createLock("test", 60);
    if (!$lock->acquire(false)) {
        return new Response("Too many downloads", Response::HTTP_TOO_MANY_REQUESTS);
    }
    $response = new StreamedResponse(function () use ($lock) {
        // now $lock is still alive when this function is executed
        $lockTime = time();
        while (have_some_data_to_output()) {
            if (time() - $lockTime > 50) {
                // refresh the lock well before it expires, to be on the safe side
                $lock->refresh();
                $lockTime = time();
            }
            output_data();
        }
        $lock->release();
    });
    $response->headers->set('Content-Type', 'text/csv');
    // the lock would be released here if it hadn't been passed to the StreamedResponse
    return $response;
}
The above code refreshes the lock every 50 seconds to cut down on communication with the storage engine (such as Redis), and the lock remains held for at most 60 seconds should the PHP process suddenly die.

Rate limiting PHP function

I have a PHP function which gets called when someone visits POST www.example.com/webhook. However, the external service, which I cannot control, sometimes calls this URL twice in rapid succession, messing with my logic since the webhook persists stuff in the database, which takes a few ms to complete.
In other words, when the second request comes in (and it cannot be ignored), the first request is likely not completed yet; however, the requests need to be completed in the order they came in.
So I've created a little hack in Laravel which should "throttle" execution with 5 seconds in between. It seems to work most of the time, but due to an error in my code or some other oversight, this solution does not work every time.
function myWebhook() {
    // Read the cached timestamp (defaults to 0) and compare it with the current time.
    while (Cache::get('g2a_webhook_timestamp', 0) + 5 > Carbon::now()->timestamp) {
        // Postpone execution.
        sleep(1);
    }
    // Store the current timestamp in the cache (file storage).
    Cache::put('g2a_webhook_timestamp', Carbon::now()->timestamp, 1);
    // Execute rest of code ...
}
Does anyone have a watertight solution for this issue?
You have essentially designed your own simplified queue system, which is the right approach, but you can make use of the native Laravel queue for a more robust solution to your problem.
Define a job, e.g. ProcessWebhook
When a POST request is received at /webhook, queue the job
The Laravel queue worker will process one job at a time[1] in the order they're received, ensuring that no matter how many requests come in, they'll be processed one by one and in order.
The implementation of this would look something like this:
Create a new Job, e.g: php artisan make:job ProcessWebhook
Move your webhook processing code into the handle method of the job, e.g:
public $data; // the raw webhook payload

public function __construct($data)
{
    $this->data = $data;
}

public function handle()
{
    Model::where('x', 'y')->update([
        'field' => $this->data->newValue
    ]);
}
Modify your Webhook controller to dispatch a new job when a POST request is received, e.g:
public function webhook(Request $request)
{
    $data = $request->getContent();
    ProcessWebhook::dispatch($data);
}
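For completeness, a minimal route registration for the webhook endpoint (a sketch; the controller class name is hypothetical):
// routes/api.php
use App\Http\Controllers\WebhookController;

Route::post('/webhook', [WebhookController::class, 'webhook']);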
Start your queue worker, php artisan queue:work, which will run in the background processing jobs in the order they arrive, one at a time.
That's it, a maintainable solution to processing webhooks in order, one-by-one. You can read the Queue documentation to find out more about the functionality available, including retrying failed jobs which can be very useful.
[1] Laravel will process one job at a time per worker. You can add more workers to improve queue throughput for other use cases but in this situation you'd just want to use one worker.

Laravel cache returns corrupt data (redis driver)

I have an API written in Laravel. There is the following code in it:
public function getData($cacheKey)
{
    if (Cache::has($cacheKey)) {
        return Cache::get($cacheKey);
    }
    // if the cache is empty for the key, get data from the external service
    $dataFromService = $this->makeRequest($cacheKey);
    $dataMapped = array_map([$this->transformer, 'transformData'], $dataFromService);
    Cache::put($cacheKey, $dataMapped);
    return $dataMapped;
}
In getData(), if the cache contains the necessary key, the data is returned from the cache.
If the cache does not have the key, the data is fetched from the external API, processed, placed into the cache, and then returned.
The problem: when there are many concurrent requests to the method, the data gets corrupted. I suspect the data is written to the cache incorrectly because of race conditions.
You seem to be experiencing a critical-section problem. But here's the thing: Redis operations are atomic; however, Laravel does its own checks before calling Redis.
The major issue here is that every concurrent request will trigger a call to the external service, and then all of them will write their results to the cache (which is definitely not good). I would suggest implementing a simple mutual-exclusion lock around your code.
Replace your current method body with the following:
public function getData($cacheKey)
{
    $mutexKey = "getDataMutex";
    if (!Redis::setnx($mutexKey, true)) {
        // Already running: you can either busy-wait until the cache key is ready,
        // or fail this request and assume that another one will succeed.
        // Definitely don't trust what the cache says at this point.
    }
    $value = Cache::rememberForever($cacheKey, function () use ($cacheKey) { // just a convenience method, it doesn't change anything
        $dataFromService = $this->makeRequest($cacheKey);
        $dataMapped = array_map([$this->transformer, 'transformData'], $dataFromService);
        return $dataMapped;
    });
    Redis::del($mutexKey);
    return $value;
}
setnx is a native Redis command that sets a value only if it doesn't already exist. This is done atomically, so it can be used to implement a simple locking mechanism, but (as mentioned in the manual) it will not work if you're using a Redis cluster. In that case, the Redis manual describes a method to implement distributed locks.
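As a side note, a plain setnx lock never expires, so a crashed process can leave it stuck forever. A sketch of acquiring the mutex with a TTL instead (this assumes Laravel's Redis facade, which normalizes the SET arguments for both predis and phpredis):
// SET mutexKey true EX 10 NX: set with a 10-second expiry, only if it doesn't already exist
$acquired = Redis::set($mutexKey, true, 'EX', 10, 'NX');
if (!$acquired) {
    // another process holds the lock
}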
In the end I came to the following solution: I use the retry() function from the Laravel 5.5 helpers to read the cache value at 1-second intervals until it has been written there normally.
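A sketch of that approach (assuming the retry() helper from Laravel 5.5, whose third argument is the pause between attempts in milliseconds):
$value = retry(5, function () use ($cacheKey) {
    if (!Cache::has($cacheKey)) {
        // Not written yet: throw so retry() sleeps and tries again.
        throw new \RuntimeException('cache not ready');
    }
    return Cache::get($cacheKey);
}, 1000); // 1 second between attempts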

Use server-sent events in combination with doctrine events to detect changes?

In my app I'm using server-sent events and have the following situation (pseudo code):
$response = new StreamedResponse();
$response->setCallback(function () {
while(true) {
// 1. $data = fetchData();
// 2. echo "data: $data";
// 3. sleep(x);
}
});
$response->send();
My SSE Response class accepts a callback to gather the data (step 1), which actually performs a database query. Now to my problem: as I am trying to avoid polling the database every X seconds, I want to make use of Doctrine's onFlush event to set a flag that the corresponding entity has actually changed, which would then be checked within the fetchData callback. Normally, I would do this by setting a flag on the current user session, but as the streaming loop constantly writes data, the session cannot be accessed within the callback. So does anybody have an idea how to resolve this problem?
BTW: I'm using Symfony 3.3 and Doctrine 2.5 - thanks for any help!
I know that this question is from a long time ago, but here's a suggestion:
Use shared memory (the PHP shm_*() functions). That way your flag isn't tied to a specific session.
Be sure to lock and unlock around access to the shared memory (I usually use a semaphore).
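A minimal sketch of that idea (assuming the sysvshm/sysvsem extensions; the ftok() project IDs and the variable slot 1 are arbitrary examples):
$sem = sem_get(ftok(__FILE__, 's'));
$shm = shm_attach(ftok(__FILE__, 'm'), 1024);

// Writer side (e.g. in the onFlush listener): set the "changed" flag.
sem_acquire($sem);
shm_put_var($shm, 1, true);
sem_release($sem);

// Reader side (inside the SSE loop): check and clear the flag.
sem_acquire($sem);
$changed = shm_has_var($shm, 1) && shm_get_var($shm, 1);
if ($changed) {
    shm_put_var($shm, 1, false);
}
sem_release($sem);

if ($changed) {
    // fetch fresh data and echo the SSE frame
}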

How to get all the queries executed by Doctrine?

I'm creating a console command for my bundle with Symfony 2. This command executes several queries against the database (MySQL). In order to debug my command I need to know how many SQL queries have been executed during the command's execution and, if possible, to show these queries (like the Symfony profiler does).
I have the same problem with AJAX requests: when I make an AJAX request, I can't know how many queries have been executed during the request.
You can enable the Doctrine logging like this:
$doctrine = $this->getDoctrine(); // or $this->get('doctrine')
$em = $doctrine->getConnection();
// Note: $doctrine->getManager() did not work for me
// (resulted in $stack->queries being an empty array)
$stack = new \Doctrine\DBAL\Logging\DebugStack();
$em->getConfiguration()->setSQLLogger($stack);

// ... do some queries ...

var_dump($stack->queries);
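Each entry in $stack->queries contains the SQL, its parameters and the execution time, so the count the question asks for is simply:
echo count($stack->queries) . " queries executed\n";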
You can go and see this: http://vvv.tobiassjosten.net/symfony/logging-doctrine-queries-in-symfony2/
To render unto Caesar what is Caesar's: I found it here: Count queries to database in Doctrine2
You can put all this logic into the domain model and treat the command only as an invoker. Then you can use the same domain model from a web controller and use the profiler to diagnose.
Second, you should have an integration test for this, and you can verify execution time with that test.
