I have this function that tries to read a value from the cache. If the value does not exist, it should call an alternative source API and save the new value into the cache. However, the server is heavily overloaded, and almost every time a value does not exist, more than one request is created (a lot of API calls), and each of them stores the new value in the cache. What I want is to allow the API to be called many times, but have only one process/request store the result in the cache:
function fetch_cache($key, $alternativeSource) {
    $redis = new Redis();
    $redis->pconnect(ENV_REDIS_HOST);
    $value = $redis->get($key);
    if( $value === FALSE ) { // phpredis returns FALSE (not NULL) for a missing key
        $value = file_get_contents($alternativeSource);
        // here goes part that I need help with
        $semaphore = sem_get(6000, 1); // does this need to be called each time this function is called?
        if( $semaphore === FALSE ) {
            // This means I have failed to create semaphore?
        }
        if( sem_acquire($semaphore, true) ) {
            // we have acquired semaphore so here
            $redis->set($key, $value);
            sem_release($semaphore); // releasing lock
        }
        // This must be called because I have called sem_get()?
        sem_remove($semaphore);
    }
    return $value;
}
Is this proper use of semaphore in PHP5?
Short answer
You don't need to create and remove semaphores within the fetch_cache function. Put sem_get() into an initialization method (such as __construct).
You should remove semaphores with sem_remove(), but in a cleanup method (such as __destruct). Or, you might want to keep them even longer - depends on the logic of your application.
Use sem_acquire() to acquire locks, and sem_release() to release them.
Description
sem_get()
Creates a set of three semaphores.
The underlying C function semget is not atomic, so there is a possibility of a race condition when two processes call semget at the same time. Therefore, semget should be called in some initialization process. The PHP extension overcomes this issue by means of three semaphores:
Semaphore 0 a.k.a. SYSVSEM_SEM
Is initialized to sem_get's $max_acquire and decremented as processes acquire it.
The first process that called sem_get fetches the value of the SYSVSEM_USAGE semaphore (see below). For the first process, it equals 1, because the extension sets it to 1 with the atomic semop function right after semget. If this really is the first process, the extension sets the SYSVSEM_SEM semaphore value to $max_acquire.
Semaphore 1 a.k.a. SYSVSEM_USAGE
The number of processes using the semaphore.
Semaphore 2 a.k.a. SYSVSEM_SETVAL
Plays a role of a lock for internal SETVAL and GETVAL operations (see man 2 semctl). For example, it is set to 1 while the extension sets SYSVSEM_SEM to $max_acquire, then is reset back to zero.
Finally, sem_get wraps a structure (containing the semaphore set ID, key and other information) into a PHP resource and returns it.
So you should call it in some initialization process, when you're only preparing to work with semaphores.
sem_acquire()
This is where the $max_acquire goes into play.
SYSVSEM_SEM's value (let's call it semval) is initially equal to $max_acquire. semop() blocks until semval becomes greater than or equal to 1. Then 1 is subtracted from semval.
If $max_acquire = 1, then semval becomes zero after the first call, and the next calls to sem_acquire() will block until semval is restored by sem_release() call.
Call it when you need to acquire the next "lock" from the available set ($max_acquire).
sem_release()
Does pretty much the same as sem_acquire(), except it increments SYSVSEM_SEM's value.
Call it when you no longer need the "lock" acquired previously with sem_acquire().
sem_remove()
Immediately removes the semaphore set, awakening all processes blocked in semop on the set (from the IPC_RMID section of the SEMCTL(2) man page).
So this is effectively the same as removing a semaphore with ipcrm command.
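The advice above (get the semaphore once during initialization, then acquire and release it around the critical section) might be sketched roughly as follows. The class name CacheLock and its methods withLock() and remove() are hypothetical names chosen for illustration. The sketch assumes PHP 7.1+ with the sysvsem extension loaded.

```php
<?php
// Hypothetical wrapper; CacheLock, withLock() and remove() are illustrative names.
final class CacheLock
{
    private $semaphore;

    public function __construct(int $key, int $maxAcquire = 1, int $permissions = 0666)
    {
        // sem_get() is called once, at initialization time.
        $semaphore = sem_get($key, $maxAcquire, $permissions);
        if ($semaphore === false) {
            throw new RuntimeException('sem_get() failed');
        }
        $this->semaphore = $semaphore;
    }

    // Runs $criticalSection while holding the semaphore and always releases it.
    public function withLock(callable $criticalSection)
    {
        if (!sem_acquire($this->semaphore)) {
            throw new RuntimeException('sem_acquire() failed');
        }
        try {
            return $criticalSection();
        } finally {
            sem_release($this->semaphore);
        }
    }

    // Explicit cleanup; call it only when no other process needs the set anymore.
    public function remove(): void
    {
        sem_remove($this->semaphore);
    }
}
```

With something like this, fetch_cache() would only call withLock() around the $redis->set() call, and the semaphore itself would be created once per process rather than once per cache miss.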
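Note that in sem_get(6000, 1) the value 6000 is the semaphore key, not the file permissions. The permissions are the third argument, and 0666 (the default) is what you need for what you're trying to do.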
Related
(Laravel 8, PHP 8)
Hi. I have a bunch of data in the PHP APC cache that I can access across my Laravel application with the apcu commands.
I decided I should fire an async job to process some of that data for the user during a session and throw the results in the database.
So I made a middleware that fires (correctly) when the user accesses the page, and (correctly) dispatches a job called "MemoryProvider".
The dispatch command promptly instantiates the MemoryProvider class, running its constructor, and then queues the job for execution.
About a second later, the queue is processed and the handle method in MemoryProvider is run.
I check the content of the php cache with "apcu_cache_info()" and "apcu_exists()" in the middleware and both in the MemoryProvider constructor and in its handle method.
The problem:
The PHP cache appears populated throughout my Laravel app.
The PHP cache appears populated in the middleware.
The PHP cache appears populated in the job's constructor.
The PHP cache appears EMPTY in the job's handle method.
Here's the middleware:
{
$a = apcu_cache_info(); // 250,000 entries
$b = apcu_exists('the:2:0'); // true
MemoryProvider::dispatch($request);
return $next($request);
}
Here's the job's (MemoryProvider) constructor:
{
$this->request = $request->all();
$a = apcu_cache_info(); // 250,000 entries
$b = apcu_exists('the:2:0'); // true
}
And here's the job's (MemoryProvider) handle method:
{
$a = apcu_cache_info(); // 0 entries
$b = apcu_exists('the:2:0'); // false
}
Question: is this a PHP limitation or a bad Laravel problem? And how can I access the content of my PHP cache in an async class?
p.s. I have apc.enable_cli=1 in php.ini
I found the answer. Apparently, it's a PHP limitation.
According to a good explanation given by gview back in 2017, a CLI process doesn't share state or memory with other CLI processes, so the APC memory space will never be shared this way.
I did find a workaround for my specific case: instead of running an async process to handle the heavy work in the background, I can get the same effect by simply issuing an AJAX request. The request is handled independently by PHP, with full access to the APC cache, and I can populate my database and let the user know when it's all done (or gradually done, as is the case).
I wish I had thought of this sooner.
I have an API written in Laravel. There is the following code in it:
public function getData($cacheKey)
{
    if(Cache::has($cacheKey)) {
        return Cache::get($cacheKey);
    }
    // if cache is empty for the key, get data from external service
    $dataFromService = $this->makeRequest($cacheKey);
    $dataMapped = array_map([$this->transformer, 'transformData'], $dataFromService);
    Cache::put($cacheKey, $dataMapped);
    return $dataMapped;
}
In getData() if cache contains necessary key, data returned from cache.
If cache does not have necessary key, data is fetched from external API, processed and placed to cache and after that returned.
The problem is: when there are many concurrent requests to the method, data is corrupted. I guess, data is written to cache incorrectly because of race conditions.
You seem to be experiencing some sort of critical section problem. Individual Redis operations are atomic; however, Laravel does its own checks before calling Redis, so the has/get/put sequence as a whole is not.
The major issue here is that all concurrent requests will all cause a request to be made and then all of them will write the results to the cache (which is definitely not good). I would suggest implementing a simple mutual exclusion lock on your code.
Replace your current method body with the following:
public function getData($cacheKey)
{
    $mutexKey = "getDataMutex";
    if (!Redis::setnx($mutexKey, true)) {
        // Already running: either busy-wait until the cache key is ready, or fail this request and assume that another one will succeed.
        // Definitely don't trust what the cache says at this point.
    }
    // rememberForever is just a convenience method; it doesn't change anything.
    // The closure must import $cacheKey with use(), otherwise it is undefined inside.
    $value = Cache::rememberForever($cacheKey, function () use ($cacheKey) {
        $dataFromService = $this->makeRequest($cacheKey);
        $dataMapped = array_map([$this->transformer, 'transformData'], $dataFromService);
        return $dataMapped;
    });
    Redis::del($mutexKey);
    return $value;
}
setnx is a native Redis command that sets a value only if it doesn't already exist. This is done atomically, so it can be used to implement a simple locking mechanism, but (as mentioned in the manual) it will not work if you're using a Redis cluster. In that case, the Redis manual describes a method for implementing distributed locks.
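One thing to watch with a setnx-based mutex is that the key should be deleted only by the request that actually obtained it, and also when the work throws. A minimal sketch of that shape is below. InMemoryLockStore is a hypothetical in-memory stand-in for the Redis SETNX/DEL pair so that the example runs without a server, and runExclusively() is likewise an illustrative name, not a Laravel or Redis API.

```php
<?php
// Hypothetical in-memory stand-in for Redis SETNX/DEL (not a real Redis client).
final class InMemoryLockStore
{
    private $keys = [];

    // Mirrors SETNX: returns true only if the key did not exist yet.
    public function setnx(string $key, string $value): bool
    {
        if (array_key_exists($key, $this->keys)) {
            return false;
        }
        $this->keys[$key] = $value;
        return true;
    }

    // Mirrors DEL for a single key.
    public function del(string $key): void
    {
        unset($this->keys[$key]);
    }
}

// Runs $work only if the mutex was obtained, and always releases a mutex it holds.
// Returns null when another holder already has the mutex.
function runExclusively(InMemoryLockStore $store, string $mutexKey, callable $work)
{
    if (!$store->setnx($mutexKey, '1')) {
        return null; // the caller decides whether to wait, retry or fail
    }
    try {
        return $work();
    } finally {
        $store->del($mutexKey); // released even if $work() throws
    }
}
```

In a real application the store would be the Redis connection itself, and it is usually worth giving the mutex key an expiry so that a crashed process cannot hold it forever.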
In the end I came to the following solution: I use the retry() function from the Laravel 5.5 helpers to re-read the cache value, at intervals of 1 second, until it has been written there properly.
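The polling idea can be sketched in plain PHP, independent of Laravel's helper. waitForValue() is a hypothetical function written for this illustration; $read stands for whatever reads the cache and is assumed to return null while the value is not ready yet.

```php
<?php
// Hypothetical helper: polls $read until it returns a non-null value
// or the attempts are used up. Returns null if the value never appeared.
function waitForValue(callable $read, int $maxAttempts, int $sleepMicroseconds)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $value = $read();
        if ($value !== null) {
            return $value;
        }
        if ($attempt < $maxAttempts) {
            usleep($sleepMicroseconds); // wait before the next read
        }
    }
    return null;
}
```

A call such as waitForValue($readFromCache, 5, 1000000) would then try up to five times, one second apart.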
In my app I'm using server-sent events and have the following situation (pseudo code):
$response = new StreamedResponse();
$response->setCallback(function () {
    while(true) {
        // 1. $data = fetchData();
        // 2. echo "data: $data";
        // 3. sleep(x);
    }
});
$response->send();
My SSE Response class accepts a callback to gather the data (step 1), which actually performs a database query. Now to my problem: As I am trying to avoid polling the database each X seconds, I want to make use of Doctrine's onFlush event to set a flag that the corresponding entity has been actually changed, which would then be checked within fetchData callback. Normally, I would do this by setting a flag on current user session, but as the streaming loop constantly writes data, the session cannot be accessed within the callback. So has anybody an idea how to resolve this problem?
BTW: I'm using Symfony 3.3 and Doctrine 2.5 - thanks for any help!
I know that this question is from a long time ago, but here's a suggestion:
Use shared memory (the php shm_*() functions). That way your flag isn't tied to a specific session.
Be sure to lock and unlock around access to the shared memory (I usually use a semaphore).
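As a rough sketch of that suggestion, the "entity changed" flag could live in a System V shared memory segment guarded by a semaphore. The key constants and the function names setChangedFlag() and consumeChangedFlag() are assumptions made for this example. It needs the sysvshm and sysvsem extensions and leaves out error handling for brevity.

```php
<?php
// Illustrative keys; any integers agreed on by writer and reader would do.
const FLAG_SHM_KEY = 0x53534501;
const FLAG_SEM_KEY = 0x53534502;
const FLAG_VAR = 1; // variable slot inside the shared memory segment

// Called by the writer (for example from an onFlush listener).
function setChangedFlag(bool $changed): void
{
    $sem = sem_get(FLAG_SEM_KEY);
    $shm = shm_attach(FLAG_SHM_KEY, 4096);
    sem_acquire($sem);
    try {
        shm_put_var($shm, FLAG_VAR, $changed);
    } finally {
        sem_release($sem);
        shm_detach($shm);
    }
}

// Called by the streaming loop: returns the flag and resets it to false.
function consumeChangedFlag(): bool
{
    $sem = sem_get(FLAG_SEM_KEY);
    $shm = shm_attach(FLAG_SHM_KEY, 4096);
    sem_acquire($sem);
    try {
        $changed = shm_has_var($shm, FLAG_VAR) && shm_get_var($shm, FLAG_VAR);
        shm_put_var($shm, FLAG_VAR, false);
        return (bool) $changed;
    } finally {
        sem_release($sem);
        shm_detach($shm);
    }
}
```

The streaming loop would then query the database only when consumeChangedFlag() returns true.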
I am trying to implement a hashmap (an associative array in PHP) that is available application-wide, i.e. stored in the application context, so that it is not lost when the program ends. How can I achieve this in PHP?
Thanks,
If you are using Zend's version of php, it's easy.
You do not need to serialize your data.
Only contents can be cached. Resources such as filehandles can not.
To store true/false, use 1,0 so you can differentiate a cache failure from a result with ===.
Store:
zend_shm_cache_store('cache_namespace::this_cache_name',$any_variable,$expire_in_seconds);
Retrieve:
$any_variable = zend_shm_cache_fetch('cache_namespace::this_cache_name');
if ( $any_variable === false ) {
# cache was expired or did not exist.
}
For long lived data you can use:
zend_disk_cache_store();zend_disk_cache_fetch();
For those without zend, the corresponding APC versions of the above:
Store:
apc_store('cache_name',$any_variable,$expire_in_seconds);
Retrieve:
$any_variable = apc_fetch('cache_name');
if ( $any_variable === false ) {
# cache was expired or did not exist.
}
Never used any of the other methods mentioned.
If you don't have shared memory available to you, you could serialize/unserialize the data to disk. Of course shared memory is much faster and the nice thing about zend is it handles concurrency issues for you and allows namespaces:
Store:
file_put_contents('/tmp/some_filename',serialize($any_variable));
Retrieve:
$any_variable = unserialize(file_get_contents('/tmp/some_filename') );
Edit: To handle concurrency issues yourself, I think the easiest way would be to use locking. I can still see the possibility of a race condition in this pseudocode between "lock exists" and "get lock", but you get the point.
Pseudocode:
while ( lock exists ) {
microsleep;
}
get lock.
check we got lock.
write value.
release lock.
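One way to avoid the gap between "lock exists" and "get lock" is to let the operating system do the check-and-take in one step with flock(). The sketch below applies that to the serialize-to-disk approach. storeWithLock() and fetchWithLock() are hypothetical helper names, and flock() is advisory, so this assumes every reader and writer goes through these helpers.

```php
<?php
// Hypothetical helpers illustrating flock()-based locking around a cache file.
function storeWithLock(string $path, $value): void
{
    // 'c' creates the file if needed but does not truncate it before we hold the lock.
    $handle = fopen($path, 'c');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        if (!flock($handle, LOCK_EX)) { // blocks until the exclusive lock is granted
            throw new RuntimeException("Cannot lock $path");
        }
        ftruncate($handle, 0);
        fwrite($handle, serialize($value));
        fflush($handle);
        flock($handle, LOCK_UN);
    } finally {
        fclose($handle);
    }
}

// Returns the stored value, or false if the file is missing or empty.
function fetchWithLock(string $path)
{
    $handle = @fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    try {
        flock($handle, LOCK_SH); // shared lock: readers do not block each other
        $contents = stream_get_contents($handle);
        flock($handle, LOCK_UN);
    } finally {
        fclose($handle);
    }
    if ($contents === false || $contents === '') {
        return false;
    }
    return unserialize($contents);
}
```

As with the cache examples above, a stored false cannot be told apart from a miss, so storing 1/0 instead of true/false still applies.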
You can use APC or similars for this, the data you put there will be available in shared memory.
Bear in mind that this will not persist between server restarts, of course.
How can I see how many objects of a class are loaded in PHP? Also, do the objects get loaded in a single session on the server, or can one track objects from other sessions as well on the server side?
Actually, I am confused. When an object is loaded with PHP, where does it reside? Is it in the browser? Is it in the session, expiring as soon as the session expires?
Will this help?
<?php
class Hello {
    public function __construct() {
    }
}
$hello = new Hello;
$hi = new Hello;
$i = 0;
foreach (get_defined_vars() as $key => $value) {
    if (is_object($value) && get_class($value) == 'Hello') {
        $i++;
    }
}
echo 'There are ' . $i . ' instances of class Hello';
How can I see how many objects of a class are loaded in PHP?
I don't think there is a way to do this without you actually keeping count in the class's constructor.
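Keeping count in the class itself could look something like this. CountedHello and countInstances() are names made up for the illustration. The counter only covers objects created through the constructor in the current request.

```php
<?php
// Illustrative class: counts its own live instances within one request.
class CountedHello
{
    private static $instances = 0;

    public function __construct()
    {
        self::$instances++;
    }

    public function __destruct()
    {
        // Runs when the last reference to the object goes away.
        self::$instances--;
    }

    public static function countInstances(): int
    {
        return self::$instances;
    }
}
```

Calling CountedHello::countInstances() at any point then gives the number of objects of that class currently alive in the script.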
When an object is loaded with PHP, where does it reside? Is it in the browser? Is it in the session, expiring as soon as the session expires?
It resides in the memory allocated by the PHP process that handles that one request. It ceases to exist as soon as the current request has finished or been terminated (or earlier, if it is unset()).
The session is something that helps identify a user across multiple requests. It survives longer - it expires when it gets destroyed, when the user's session cookie is deleted, or when the session reaches its expiry time.
An object is just a complex variable. It can hold a couple of simple types together and it can have functions.
Despite the numerous differences between simple types and objects, an object is just a variable. Objects are not shared across sessions or sent to browsers, any more than simple integers or strings are.
An object exists only on the server, in memory, and only for the lifetime of the script's execution unless saved into the user's $_SESSION. Even when saved, it ceases to be an object and instead becomes a serialized string. It can be reconstituted again into an object in the same session or a later session.
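The round trip through a string can be seen directly with serialize() and unserialize(), which is roughly what happens when an object is put into the session and read back on a later request. The Cart class here is a made-up example.

```php
<?php
// Made-up class for the illustration.
class Cart
{
    public $items = [];
}

$cart = new Cart();
$cart->items[] = 'book';

// What gets stored is a plain string, not a live object.
$stored = serialize($cart);

// Later (possibly in another request) a new object is rebuilt from that string.
$restored = unserialize($stored);
```

The restored object has the same class and property values as the original, but it is a separate instance. The class definition must be loaded before unserialize() is called for the object to come back as a Cart.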
The script's lifetime runs from the moment the web server calls it until the moment the script's final line has been processed. The PHP engine may dispose of objects no longer needed by the script through garbage collection, even before the script has fully terminated.