I currently have a website written in PHP, utilizing the curl_multi for polling external APIs. The server forks child processes to standalone from web requests and is working well, but it is somewhat limited to a per process basis.
Occasionally it hit bandwidth bottlenecks and needs a better centralized queuing logic.
I am currently trying PHP IPC with a standalone background process to handle all the outgoing requests, but was stuck in things that, normally said, is not likely to be catered by casual programmers. Says, garbage collection, inter-process exception handling, request-response matching... and etc. Am I going the wrong way?
Is there a common practise (implementation theory) out there, or even libraries I could make use of?
EDIT
Using localhost TCP/IP communication would double the stress of the local trafic, which is definitely not a good approach.
I am currently working on IPC message queue with some home-brew protocol... not looking right at all. Would appreciate any help.
There are several different things to distinguish here:
jobs : you have N jobs to process. Executed tasks can crash or hang, in any way all jobs should be proceeded without any data loss.
resources : you are processing your jobs in a single machine and/or in a single connection, so you need to take care of your cpu and bandwith.
synchronization : if you have interactions between your processes, you need to share information, taking care of concurrent data access.
Keep control on resources
Everybody want to get into the bus...
Because there is no built-in threads in PHP, we will need to simulate mutexes. The principle is quite simple:
1 All jobs are put in a queue
2 There is N resources available and no more in a pool
3 We iterate the queue (a while on each job)
4 Before execution, job asks for a resource in the pool
5 If there is available resources, job is executed
6 If there is no more resources, pool hangs until a job has finished or is considered dead
How to do that in PHP?
To proceed, we have several possibilities but the principle is the same:
We have 2 programs:
There is a process launcher that will launch no more than N tasks simultaneously.
There is a process child, it represents a thread's context
How can look a process launcher?
The process launcher knows how many tasks should be run, and run them without taking care of their result. It only control their execution (a process starts, finishes, or hangs, and N are already running).
PHP I give you the idea here, I'll give you usable examples later:
<?php
// launcher.php
require_once("ProcessesPool.php");
// The label identifies your process pool, it should be unique for your process launcher and your process children
$multi = new ProcessesPool($label = 'test');
// Initialize a new pool (creates the right directory or file, cleans a database or whatever you want)
// 10 is the maximum number of simultaneously run processes
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
// We need to launch N processes, stored in $count
$count = 100; // maybe count($jobs)
// We execute all process, one by one
for ($i = 0; ($i < $count); $i++)
{
// The waitForResources method looks for how many processes are already run,
// and hangs until a resource is free or the maximum execution time is reached.
$ret = $multi->waitForResource($timeout = 10, $interval = 500000);
if ($ret)
{
// A resource is free, so we can run a new process
echo "Execute new process: $i\n";
exec("/usr/bin/php ./child.php $i > /dev/null &");
}
else
{
// Timeout is reached, we consider all children as dead and we kill them.
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
// All process has been executed, but this does not mean they finished their work.
// This is important to follow the last executed processes to avoid zombies.
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000);
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
// We destroy the process pool because we run all processes.
$multi->destroy();
echo "Finish.\n";
And what about the process child, the simulated thread's context
A child (executed job) has only 3 things to do :
tell the process launcher it started
do his job
tell the process launcher it finished
PHP It could contains something like this:
<?php
// child.php
require_once("ProcessesPool.php");
// we create the *same* instance of the process pool
$multi = new ProcessesPool($label = 'test');
// child tells the launcher it started (there will be one more resource busy in pool)
$multi->start();
// here I simulate job's execution
sleep(rand() % 5 + 1);
// child tells the launcher it finished his job (there will be one more resource free in pool)
$multi->finish();
Your usage example is nice, but where is the ProcessPool class?
There is a lot of ways to synchronize tasks but it really depends on your requirements and constraints.
You can synchronize your tasks using :
an unique file
a database
a directory and several files
probably other methods (such as system IPCs)
As we seen already, we need at least 7 methods:
1 create() will create an empty pool
2 start() takes a resource on the pool
3 finish() releases a resource
4 waitForResources() hangs if there is no more free resource
5 killAllResources() get all launched jobs in the pool and kills them
6 waitForTheEnd() hangs until there is no more busy resource
7 destroy() destroys pool
So let's begin by creating an abstract class, we'll be able to implement it using the above ways later.
PHP AbstractProcessPool.php
<?php
// AbstractProcessPool.php
abstract class AbstractProcessesPool
{
abstract protected function _createPool();
abstract protected function _cleanPool();
abstract protected function _destroyPool();
abstract protected function _getPoolAge();
abstract protected function _countPid();
abstract protected function _addPid($pid);
abstract protected function _removePid($pid);
abstract protected function _getPidList();
protected $_label;
protected $_max;
protected $_pid;
public function __construct($label)
{
$this->_max = 0;
$this->_label = $label;
$this->_pid = getmypid();
}
public function getLabel()
{
return ($this->_label);
}
public function create($max = 20)
{
$this->_max = $max;
$ret = $this->_createPool();
return $ret;
}
public function destroy()
{
$ret = $this->_destroyPool();
return $ret;
}
public function waitForResource($timeout = 120, $interval = 500000, $callback = null)
{
// let enough time for children to take a resource
usleep(200000);
while (44000)
{
if (($callback != null) && (is_callable($callback)))
{
call_user_func($callback, $this);
}
$age = $this->_getPoolAge();
if ($age == -1)
{
return false;
}
if ($age > $timeout)
{
return false;
}
$count = $this->_countPid();
if ($count == -1)
{
return false;
}
if ($count < $this->_max)
{
break;
}
usleep($interval);
}
return true;
}
public function waitForTheEnd($timeout = 3600, $interval = 500000, $callback = null)
{
// let enough time to the last child to take a resource
usleep(200000);
while (44000)
{
if (($callback != null) && (is_callable($callback)))
{
call_user_func($callback, $this);
}
$age = $this->_getPoolAge();
if ($age == -1)
{
return false;
}
if ($age > $timeout)
{
return false;
}
$count = $this->_countPid();
if ($count == -1)
{
return false;
}
if ($count == 0)
{
break;
}
usleep($interval);
}
return true;
}
public function start()
{
$ret = $this->_addPid($this->_pid);
return $ret;
}
public function finish()
{
$ret = $this->_removePid($this->_pid);
return $ret;
}
public function killAllResources($code = 9)
{
$pids = $this->_getPidList();
if ($pids == false)
{
$this->_cleanPool();
return false;
}
foreach ($pids as $pid)
{
$pid = intval($pid);
posix_kill($pid, $code);
if ($this->_removePid($pid) == false)
{
return false;
}
}
return true;
}
}
Synchronisation using a directory and several files
If you want to use the directory method (on a /dev/ram1 partition for example), the implementation will be :
1 create() will create an empty directory using the given $label
2 start() creates a file in the directory, named by the child's pid
3 finish() destroy the child's file
4 waitForResources() counts file inside that directory
5 killAllResources() reads directory content and kill all pids
6 waitForTheEnd() reads directory until there is no more files
7 destroy() removes directory
This method looks costy, but it is really efficient if you want to run hundred tasks simultaneously without taking as many database connections as there is jobs to execute.
Implementation :
PHP ProcessPoolFiles.php
<?php
// ProcessPoolFiles.php
class ProcessesPoolFiles extends AbstractProcessesPool
{
protected $_dir;
public function __construct($label, $dir)
{
parent::__construct($label);
if ((!is_dir($dir)) || (!is_writable($dir)))
{
throw new Exception("Directory '{$dir}' does not exist or is not writable.");
}
$sha1 = sha1($label);
$this->_dir = "{$dir}/pool_{$sha1}";
}
protected function _createPool()
{
if ((!is_dir($this->_dir)) && (!mkdir($this->_dir, 0777)))
{
throw new Exception("Could not create '{$this->_dir}'");
}
if ($this->_cleanPool() == false)
{
return false;
}
return true;
}
protected function _cleanPool()
{
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
if (unlink($this->_dir . '/' . $file) == false)
{
return false;
}
}
}
closedir($dh);
return true;
}
protected function _destroyPool()
{
if ($this->_cleanPool() == false)
{
return false;
}
if (!rmdir($this->_dir))
{
return false;
}
return true;
}
protected function _getPoolAge()
{
$age = -1;
$count = 0;
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$stat = #stat($this->_dir . '/' . $file);
if ($stat['mtime'] > $age)
{
$age = $stat['mtime'];
}
$count++;
}
}
closedir($dh);
clearstatcache();
return (($count > 0) ? (#time() - $age) : (0));
}
protected function _countPid()
{
$count = 0;
$dh = opendir($this->_dir);
if ($dh == false)
{
return -1;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$count++;
}
}
closedir($dh);
return $count;
}
protected function _addPid($pid)
{
$file = $this->_dir . "/" . $pid;
if (is_file($file))
{
return true;
}
echo "{$file}\n";
$file = fopen($file, 'w');
if ($file == false)
{
return false;
}
fclose($file);
return true;
}
protected function _removePid($pid)
{
$file = $this->_dir . "/" . $pid;
if (!is_file($file))
{
return true;
}
if (unlink($file) == false)
{
return false;
}
return true;
}
protected function _getPidList()
{
$array = array ();
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$array[] = $file;
}
}
closedir($dh);
return $array;
}
}
PHP demo, the process launcher:
<?php
// pool_files_launcher.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");
$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
$count = 20;
for ($i = 0; ($i < $count); $i++)
{
$ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
if ($ret)
{
echo "Execute new process: $i\n";
exec("/usr/bin/php ./pool_files_calc.php $i > /dev/null &");
}
else
{
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
$multi->destroy();
echo "Finish.\n";
function test_waitForResource($multi)
{
echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}
function test_waitForTheEnd($multi)
{
echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}
PHP demo, the process child:
<?php
// pool_files_calc.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");
$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");
$multi->start();
// here I simulate job's execution
sleep(rand() % 7 + 1);
$multi->finish();
Synchronisation using a database
MySQL If you prefer using the database method, you'll need a table like:
CREATE TABLE `processes_pool` (
`label` varchar(40) PRIMARY KEY,
`nb_launched` mediumint(6) unsigned NOT NULL,
`pid_list` varchar(2048) default NULL,
`updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Then, the implementation will be something like:
1 create() will insert a new row in the above table
2 start() inserts a pid on the pid list
3 finish() remove one pid from the pid list
4 waitForResources() reads nb_launched field
5 killAllResources() gets and kills every pid
6 waitForTheEnd() hangs and checks regularily until nb_launched equals 0
7 destroy() removes the row
Implementation:
PHP ProcessPoolMySql.php
<?php
// ProcessPoolMysql.php
class ProcessesPoolMySQL extends AbstractProcessesPool
{
protected $_sql;
public function __construct($label, PDO $sql)
{
parent::__construct($label);
$this->_sql = $sql;
$this->_label = sha1($label);
}
protected function _createPool()
{
$request = "
INSERT IGNORE INTO processes_pool
VALUES ( ?, ?, NULL, CURRENT_TIMESTAMP )
";
$this->_query($request, $this->_label, 0);
return $this->_cleanPool();
}
protected function _cleanPool()
{
$request = "
UPDATE processes_pool
SET
nb_launched = ?,
pid_list = NULL,
updated = CURRENT_TIMESTAMP
WHERE label = ?
";
$this->_query($request, 0, $this->_label);
return true;
}
protected function _destroyPool()
{
$request = "
DELETE FROM processes_pool
WHERE label = ?
";
$this->_query($request, $this->_label);
return true;
}
protected function _getPoolAge()
{
$request = "
SELECT (CURRENT_TIMESTAMP - updated) AS age
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($request, $this->_label);
if ($ret === null)
{
return -1;
}
return $ret['age'];
}
protected function _countPid()
{
$req = "
SELECT nb_launched AS nb
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($req, $this->_label);
if ($ret === null)
{
return -1;
}
return $ret['nb'];
}
protected function _addPid($pid)
{
$request = "
UPDATE processes_pool
SET
nb_launched = (nb_launched + 1),
pid_list = CONCAT_WS(',', (SELECT IF(LENGTH(pid_list) = 0, NULL, pid_list )), ?),
updated = CURRENT_TIMESTAMP
WHERE label = ?
";
$this->_query($request, $pid, $this->_label);
return true;
}
protected function _removePid($pid)
{
$req = "
UPDATE processes_pool
SET
nb_launched = (nb_launched - 1),
pid_list =
CONCAT_WS(',', (SELECT IF (LENGTH(
SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1))) = 0, null,
SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1)))), (SELECT IF (LENGTH(
SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))))) = 0, null,
SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))
)
)
)
)
),
updated = CURRENT_TIMESTAMP
WHERE label = ?";
$this->_query($req, $pid, $pid, $pid, $pid, $this->_label);
return true;
}
protected function _getPidList()
{
$req = "
SELECT pid_list
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($req, $this->_label);
if ($ret === null)
{
return false;
}
if ($ret['pid_list'] == null)
{
return array();
}
$pid_list = explode(',', $ret['pid_list']);
return $pid_list;
}
protected function _query($request)
{
$return = null;
$stmt = $this->_sql->prepare($request);
if ($stmt === false)
{
return $return;
}
$params = func_get_args();
array_shift($params);
if ($stmt->execute($params) === false)
{
return $return;
}
if (strncasecmp(trim($request), 'SELECT', 6) === 0)
{
$return = $stmt->fetch(PDO::FETCH_ASSOC);
}
return $return;
}
}
PHP demo, the process launcher:
<?php
// pool_mysql_launcher.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$multi = new ProcessesPoolMySQL($label = 'test', $dbh);
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
$count = 20;
for ($i = 0; ($i < $count); $i++)
{
$ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
if ($ret)
{
echo "Execute new process: $i\n";
exec("/usr/bin/php ./pool_mysql_calc.php $i > /dev/null &");
}
else
{
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
$multi->destroy();
echo "Finish.\n";
function test_waitForResource($multi)
{
echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}
function test_waitForTheEnd($multi)
{
echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}
PHP demo, the process child:
<?php
// pool_mysql_calc.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$multi = new ProcessesPoolMySQL($label = 'test', $dbh);
$multi->start();
// here I simulate job's execution
sleep(rand() % 7 + 1);
$multi->finish();
What is the output of the above code?
Demo output Those demo gives - fortunately - about the same output. If the timeout is not reached (the dreams's case), output is:
KolyMac:TaskManager ninsuo$ php pool_files_launcher.php
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 10
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 11
Waiting for available resource ( test )...
Execute new process: 12
Waiting for available resource ( test )...
Execute new process: 13
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 14
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 15
Waiting for available resource ( test )...
Execute new process: 16
Waiting for available resource ( test )...
Execute new process: 17
Waiting for available resource ( test )...
Execute new process: 18
Waiting for available resource ( test )...
Execute new process: 19
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Finish.
Demo output In a worse case (I change sleep(rand() % 7 + 1); to sleep(rand() % 7 + 100);, this gives:
KolyMac:TaskManager ninsuo$ php pool_files_launcher.php
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
(...)
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
WaitForResources Timeout! Killing zombies...
Waiting for all resources to finish ( test )...
Finish.
Go to page 2 to continue reading this answer.
Page 2: there is a limit for SO answer's body to 30k chars, so I need to create a new one.
Keep control on results
No mistake allowed!
YEAH ! You can launch tons of processes without taking care of resources. But what if one child process fails? There will be one undone or incomplete job!...
In fact, this is simpler (far simpler) than controlling process execution. We have a queue of jobs executed using a pool, and we only need to know if one has failed or succeded after its execution. If there is failure when the whole pool is executed, then failed processes are put on a new pool, and are executed again.
How to proceed in PHP ?
This principle is based on clusters: queue contains several jobs, but represents only one entity. Each calcul of the cluster
should succeed to get that entity complete.
Roadmap:
1 We create a todo list (to mismatch with the queue, used for process management) containing all calculs of the cluster. Each job has a status: waiting (not executed), running (executed and not finished), success and error (according to their result), and of course, at this step, their status is WAITING.
2 We run all jobs using the process manager (to keep control on resources), each one begin by telling the task manager it runs, and according to his own context, it finishes by indicating his state (failed or success).
3 When the entire queue is executed, the task manager creates a new queue with failed jobs, and loops again.
4 When all jobs succeed, you're done, and you're sure nothing went wrong. Your cluster is complete and your entity usable at upper level.
Proof of concept
There is nothing more to tell about the subject, so let's write some code, continuing the previous sample code.
As for process management, you can use several ways to synchronize your parents and children, but there is no hard logic, so there is no need for an abstraction.
So I developped a MySQL example (faster to write), you'll be free to adapt this concept for your requirements and constraints.
MySQL Create the following table:
CREATE TABLE `tasks_manager` (
`cluster_label` varchar(40),
`calcul_label` varchar(40),
`status` enum('waiting', 'running', 'failed', 'success') default 'waiting',
`updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
PRIMARY KEY (`cluster_label`, `calcul_label`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
PHP Here is the TaskManager.php file:
<?php
class TasksManager
{
protected $_cluster_label;
protected $_calcul_label;
protected $_sql;
const WAITING = "waiting";
const RUNNING = "running";
const SUCCESS = "success";
const FAILED = "failed";
public function __construct($label, PDO $sql)
{
$this->_sql = $sql;
$this->_cluster_label = substr($label, 0, 40);
}
public function getClusterLabel()
{
return $this->_cluster_label;
}
public function getCalculLabel()
{
return $this->_calcul_label;
}
public function destroy()
{
$request = "
DELETE FROM tasks_manager
WHERE cluster_label = ?
";
$this->_query($request, $this->_cluster_label);
return $this;
}
public function start($calcul_label)
{
$this->_calcul_label = $calcul_label;
$this->add($calcul_label, TasksManager::RUNNING);
return $this;
}
public function finish($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
if (is_null($this->_cluster_label))
{
throw new Exception("finish() called, but task never started.");
}
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
AND calcul_label = ?
";
$this->_query($request, $status, $this->_cluster_label, substr($this->_calcul_label, 0, 40));
return $this;
}
public function add($calcul_label, $status = TasksManager::WAITING)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
INSERT INTO tasks_manager (
cluster_label, calcul_label, status
) VALUES (
?, ?, ?
)
ON DUPLICATE KEY UPDATE
status = ?
";
$calcul_label = substr($calcul_label, 0, 40);
$this->_query($request, $this->_cluster_label, $calcul_label, $status, $status);
return $this;
}
public function delete($calcul_label)
{
$request = "
DELETE FROM tasks_manager
WHERE cluster_label = ?
AND calcul_label = ?
";
$this->_query($request, $this->_cluster_label, substr($calcul_label, 0, 40));
return $this;
}
public function countStatus($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
SELECT COUNT(*) AS cnt
FROM tasks_manager
WHERE cluster_label = ?
AND status = ?
";
$ret = $this->_query($request, $this->_cluster_label, $status);
return $ret[0]['cnt'];
}
public function count()
{
$request = "
SELECT COUNT(id) AS cnt
FROM tasks_manager
WHERE cluster_label = ?
";
$ret = $this->_query($request, $this->_cluster_label);
return $ret[0]['cnt'];
}
public function getCalculsByStatus($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
SELECT calcul_label
FROM tasks_manager
WHERE cluster_label = ?
AND status = ?
";
$ret = $this->_query($request, $this->_cluster_label, $status);
$array = array();
if (!is_null($ret))
{
$array = array_map(function($row) {
return $row['calcul_label'];
}, $ret);
}
return $array;
}
public function switchStatus($statusA = TasksManager::RUNNING, $statusB = null)
{
if (!$this->_isStatus($statusA))
{
throw new Exception("{$statusA} is not a valid status.");
}
if ((!is_null($statusB)) && (!$this->_isStatus($statusB)))
{
throw new Exception("{$statusB} is not a valid status.");
}
if ($statusB != null)
{
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
AND status = ?
";
$this->_query($request, $statusB, $this->_cluster_label, $statusA);
}
else
{
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
";
$this->_query($request, $statusA, $this->_cluster_label);
}
return $this;
}
private function _isStatus($status)
{
if (!is_string($status))
{
return false;
}
return in_array($status, array(
self::FAILED,
self::RUNNING,
self::SUCCESS,
self::WAITING,
));
}
protected function _query($request)
{
$return = null;
$stmt = $this->_sql->prepare($request);
if ($stmt === false)
{
return $return;
}
$params = func_get_args();
array_shift($params);
if ($stmt->execute($params) === false)
{
return $return;
}
if (strncasecmp(trim($request), 'SELECT', 6) === 0)
{
$return = $stmt->fetchAll(PDO::FETCH_ASSOC);
}
return $return;
}
}
PHP The task_launcher.php is the usage example
<?php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);
$pool->create($max = "10");
// Initializing task manager
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->destroy();
// Simulating jobs
$count = 20;
$todo_list = array ();
for ($i = 0; ($i < $count); $i++)
{
$todo_list[$i] = "Job {$i}";
$multi->add($todo_list[$i], TasksManager::WAITING);
}
// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
$continue = false;
echo "Starting to run jobs in queue ...\n";
// put all failed jobs to WAITING status
$multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);
foreach ($todo_list as $job)
{
$ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");
if ($ret)
{
echo "Executing job: $job\n";
exec(sprintf("/usr/bin/php ./tasks_program.php %s > /dev/null &", escapeshellarg($job)));
}
else
{
echo "waitForResource timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
break;
}
}
$ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
if ($ret == false)
{
echo "waitForTheEnd timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
}
echo "All jobs in queue executed, looking for errors...\n";
// Counts if there is failures
$nb_failed = $multi->countStatus(TasksManager::FAILED);
if ($nb_failed > 0)
{
$todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
$continue = true;
}
}
function waitResource($multi)
{
echo "Waiting for a resource ....\n";
}
function waitEnd($multi)
{
echo "Waiting for the end .....\n";
}
// All jobs finished, destroying task manager
$multi->destroy();
// Destroying process pool
$pool->destroy();
echo "Finish.\n";
PHP And here is the child program (a calcul)
<?php
if (!isset($argv[1]))
{
die("This program must be called with an identifier (calcul_label)\n");
}
$calcul_label = $argv[1];
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);
// Takes one resource in pool
$pool->start();
// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->start($calcul_label);
// Simulating execution time
$secs = (rand() % 2) + 3;
sleep($secs);
// Simulating job status
$status = rand() % 3 == 0 ? TasksManager::FAILED : TasksManager::SUCCESS;
// Job finishes indicating his status
$multi->finish($status);
// Releasing pool's resource
$pool->finish();
Demo output This demo will give you something like this (too large for SO).
Synchronization and communication between processes
There is solutions to communicate easily without errors.
Take a coffee, we're close to the end!
We are now able to launch tons of processes and they are all giving the expected result, well, this is not bad.
But now, all our processes are executed stand-alone, and they actually can't communicate each others. That is your
core problem, and there is a lot of solutions.
It is really difficult to tell you exactly what kind of communication you need. You were speaking about what you tried (IPCs,
communication using files, or home-made protocols), but not what kind of information are shared between your processes.
Anyway, I invite you to think about an OOP solution.
PHP is powerful.
PHP has magic methods :
__get($property) let us implement the access of a $property on an object
__set($property, $value) let us implement the assignation of a $property on an object
PHP can handle files, with concurrent-access management
fopen($file, 'c+') opens a file with advisory lock options enabled (allow you to use flock)
flock($descriptor, LOCK_SH) takes a shared lock (for reading)
flock($descriptor, LOCK_EX) takes an exclusive lock (for writting)
Finally, PHP has:
json_encode($object) to get a json representation of an object
json_decode($string) to get back an object from a json string
You see where I'm going? We will create a Synchro class that will work the same way as the stdClass class, but it will be always safely synchronized on a file. Our processes will be able to access the same instance of that object at the same time.
Some Linux systems tricks
Of course, if you have 150 processes working on the same file at the same time, your hard drive will slow down your processes. To
handle this issue, why not creating a filesystem partition on RAM? Writing into that file will be about as quick as writing
in memory!
shell As root, type the following commands:
mkfs -q /dev/ram1 65536
mkdir -p /ram
mount /dev/ram1 /ram
Some notes :
65536 is in kilobytes, here you get a 64M partition.
if you want to mount that partition at startup, create a shell script and call it inside /etc/rc.local file.
Implementation
PHP Here is the Synchro.php class.
<?php
class Synchro
{
private $_file;
public function __construct($file)
{
$this->_file = $file;
}
public function __get($property)
{
// File does not exist
if (!is_file($this->_file))
{
return null;
}
// Check if file is readable
if ((is_file($this->_file)) && (!is_readable($this->_file)))
{
throw new Exception(sprintf("File '%s' is not readable.", $this->_file));
}
// Open file with advisory lock option enabled for reading and writting
if (($fd = fopen($this->_file, 'c+')) === false)
{
throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
}
// Request a lock for reading (hangs until lock is granted successfully)
if (flock($fd, LOCK_SH) === false)
{
throw new Exception(sprintf("Can't lock '%s' file for reading.", $this->_file));
}
// A hand-made file_get_contents
$contents = '';
while (($read = fread($fd, 32 * 1024)) !== '')
{
$contents .= $read;
}
// Release shared lock and close file
flock($fd, LOCK_UN);
fclose($fd);
// Restore shared data object and return requested property
$object = json_decode($contents);
if (property_exists($object, $property))
{
return $object->{$property};
}
return null;
}
public function __set($property, $value)
{
// Check if directory is writable if file does not exist
if ((!is_file($this->_file)) && (!is_writable(dirname($this->_file))))
{
throw new Exception(sprintf("Directory '%s' does not exist or is not writable.", dirname($this->_file)));
}
// Check if file is writable if it exists
if ((is_file($this->_file)) && (!is_writable($this->_file)))
{
throw new Exception(sprintf("File '%s' is not writable.", $this->_file));
}
// Open file with advisory lock option enabled for reading and writting
if (($fd = fopen($this->_file, 'c+')) === false)
{
throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
}
// Request a lock for writting (hangs until lock is granted successfully)
if (flock($fd, LOCK_EX) === false)
{
throw new Exception(sprintf("Can't lock '%s' file for writing.", $this->_file));
}
// A hand-made file_get_contents
$contents = '';
while (($read = fread($fd, 32 * 1024)) !== '')
{
$contents .= $read;
}
// Restore shared data object and set value for desired property
if (empty($contents))
{
$object = new stdClass();
}
else
{
$object = json_decode($contents);
}
$object->{$property} = $value;
// Go back at the beginning of file
rewind($fd);
// Truncate file
ftruncate($fd, strlen($contents));
// Save shared data object to the file
fwrite($fd, json_encode($object));
// Release exclusive lock and close file
flock($fd, LOCK_UN);
fclose($fd);
return $value;
}
}
Demonstration
We will continue (and finish) our processes / tasks example by making our processes communicate each others.
Rules :
Our goal is to get the sum of all numbers between 1 and 20.
We have 20 processes, with ID from 1 to 20.
Those processes are randomly queued for execution.
Each process (except process 1) can do only one calculation : its id + the previous process's result
Process 1 directly put his id
Each process succeed if it can do his calculation (means, if the previous process's result is available), else it fails (and is candidate for a new queue)
The pool's timeout expires after 10 seconds
Well, it looks complicated but actually, it is a good representation of what you'll find in a real-life situation.
PHP synchro_launcher.php file.
<?php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");
// Removing old synchroized object
if (is_file("/tmp/synchro.txt"))
{
unlink("/tmp/synchro.txt");
}
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);
$pool->create($max = "10");
// Initializing task manager
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->destroy();
// Simulating jobs
$todo_list = array ();
for ($i = 1; ($i <= 20); $i++)
{
$todo_list[$i] = $i;
$multi->add($todo_list[$i], TasksManager::WAITING);
}
// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
$continue = false;
echo "Starting to run jobs in queue ...\n";
// Shuffle all jobs (else this will be too easy :-))
shuffle($todo_list);
// put all failed jobs to WAITING status
$multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);
foreach ($todo_list as $job)
{
$ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");
if ($ret)
{
echo "Executing job: $job\n";
exec(sprintf("/usr/bin/php ./synchro_program.php %s > /dev/null &", escapeshellarg($job)));
}
else
{
echo "waitForResource timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
break;
}
}
$ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
if ($ret == false)
{
echo "waitForTheEnd timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
}
echo "All jobs in queue executed, looking for errors...\n";
// Counts if there is failures
$multi->switchStatus(TasksManager::WAITING, TasksManager::FAILED);
$nb_failed = $multi->countStatus(TasksManager::FAILED);
if ($nb_failed > 0)
{
$todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
$continue = true;
}
}
function waitResource($multi)
{
echo "Waiting for a resource ....\n";
}
function waitEnd($multi)
{
echo "Waiting for the end .....\n";
}
// All jobs finished, destroying task manager
$multi->destroy();
// Destroying process pool
$pool->destroy();
// Recovering final result
$synchro = new Synchro("/tmp/synchro.txt");
echo sprintf("Result of the sum of all numbers between 1 and 20 included is: %d\n", $synchro->result20);
echo "Finish.\n";
PHP And its associed synchro_calcul.php file.
<?php
if (!isset($argv[1]))
{
die("This program must be called with an identifier (calcul_label)\n");
}
$current_id = $argv[1];
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);
// Takes one resource in pool
$pool->start();
// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->start($current_id);
// ------------------------------------------------------
// Job begins here
$synchro = new Synchro("/tmp/synchro.txt");
if ($current_id == 1)
{
$synchro->result1 = 1;
$status = TasksManager::SUCCESS;
}
else
{
$previous_id = $current_id - 1;
if (is_null($synchro->{"result{$previous_id}"}))
{
$status = TasksManager::FAILED;
}
else
{
$synchro->{"result{$current_id}"} = $synchro->{"result{$previous_id}"} + $current_id;
$status = TasksManager::SUCCESS;
}
}
// ------------------------------------------------------
// Job finishes indicating his status
$multi->finish($status);
// Releasing pool's resource
$pool->finish();
Output
The following demo will give you something like this output (too large for SO)
Conclusion
Task management in PHP is not really easy because of the missing threads. As many developers, I hope this feature will come
built-in someday. Anyway, this is possible to control resources and results, and to share data between processes, so we can do,
with some work I assume, task management efficiently.
Synchronization and communication can be done by several ways, but you need to check out pros and cons for each one, according
to your constraints and requirements. For example:
if you need to launch 500 tasks at once and want to use a MySQL synchronization method, you'll need 1+500 simultaneous connections to the database (it may not appreciate a lot).
If you need to share a big amount of data, using only one file may be inefficient.
If you are using file for synchronization, don't forget to have a look to system built-in tools such as /dev/sdram.
Try to stay as more as possible in an object-oriented programming to handle your troubles. Home-made protocols or such will make your app harder to maintain.
I gave you my 2-cents about this interesting subject, and I hope it will give you some ideas to solve your issues.
I recommend you take a look at this library called PHP-Queue: https://github.com/CoderKungfu/php-queue
An short description from its github page:
A unified front-end for different queuing backends. Includes a REST
server, CLI interface and daemon runners.
Check out its github page for more details.
With a bit of tinkering, I think this library will help you solve your problem.
Hope this helps.
Related
I got the following two functions that I use to lock a Redis key. I am trying to prevent concurrency execution of a block of code using Redis. So what I do is use the following functions in such a way that they prevent the execution of the same code by different threads.
lockRedisKey("ABC");
CODE THAT I DON'T WANT TO RUN CONCURRENTLY!
unlockRedisKey("ABC");
Unfortunately, it doesn't seem to work and causes an infinitely loop at lockRedisKey() until exit_time is reached. What could be wrong?
static public function lockRedisKey($key, $value = "true") {
$redis = RedisClient::getInstance();
$time = microtime(true);
$exit_time = $time + 10;
$sleep = 10000;
do {
// Lock Redis with PX and NX
$lock = $redis->setnx("lock:" . $key, $value);
if ($lock == 1) {
$redis->expire("lock:" . $key, "10");
return true;
}
usleep($sleep);
} while (microtime(true) < $exit_time);
return false;
}
static public function unlockRedisKey($key) {
$redis = RedisClient::getInstance();
$redis->del("lock:" . $key);
}
I'm aware that I might be facing deadlocks, so I decided to use transactions, but I continue to face the issue.
static public function lockRedisKey($key, $value = "true") {
$redis = RedisClient::getInstance();
$time = microtime(true);
$exit_time = $time + 10;
$sleep = 10000;
do {
// Lock Redis with PX and NX
$redis->multi();
$redis->set('lock:' . $key, $value, array('nx', 'ex' => 10));
$ret = $redis->exec();
if ($ret[0] == true) {
return true;
}
usleep($sleep);
} while (microtime(true) < $exit_time);
return false;
}
static public function unlockRedisKey($key) {
$redis = RedisClient::getInstance();
$redis->multi();
$redis->del("lock:" . $key);
$redis->exec();
}
The locking is working fine. It's just the code between the locking that is crashing and causing the lock not to be released :-)
First of all, PHP isn't async script language as JS f.e. You haven't any control on the script execution ordering, so it means you can't run code that below lockRedisKey() call in concurent way.
I assume you're not completely understand what usleep function is actually doing, it's just delay process and not postpone it, so it just blocks process for $sleep time and after this continue script execution. So you're not performing concurrent write by calling your script.
I have php7 CLI daemon which serially parses json with filesize over 50M. I'm trying to save every 1000 entries of parsed data using a separate process with pcntl_fork() to mysql, and for ~200k rows it works fine.
Then I get pcntl_fork(): Error 35.
I assume this is happening because mysql insertion becomes slower than parsing, which causes more and more forks to be generated until CentOS 6.3 can't handle it any more.
Is there a way to catch this error to resort to single-process parsing and saving? Or is there a way to check child process count?
Here is the solution that I did based on #Sander Visser comment. Key part is checking existing processes and resorting to same process if there are too many of them
class serialJsonReader{
const MAX_CHILD_PROCESSES = 50;
private $child_processes=[]; //will store alive child PIDs
private function flushCachedDataToStore() {
//resort to single process
if (count($this->child_processes) > self::MAX_CHILD_PROCESSES) {
$this->checkChildProcesses();
$this->storeCollectedData() //main work here
}
//use as much as possible
else {
$pid = pcntl_fork();
if (!$pid) {
$this->storeCollectedData(); //main work here
exit();
}
elseif ($pid == -1) {
die('could not fork');
}
else {
$this->child_processes[] = $pid;
$this->checkChildProcesses();
}
}
}
private function checkChildProcesses() {
if (count($this->child_processes) > self::MAX_CHILD_PROCESSES) {
foreach ($this->child_processes as $key => $pid) {
$res = pcntl_waitpid($pid, $status, WNOHANG);
// If the process has already exited
if ($res == -1 || $res > 0) {
unset($this->child_processes[$key]);
}
}
}
}
}
I'm using PCNTL to multiprocess a big script in PHP on an ubuntu server.
Here is the code ( simplified and commented )
function signalHandler($signo = null) {
$pid = posix_getpid();
switch ($signo) {
case SIGTERM:
case SIGINT:
case SIGKILL:
// a process is asked to stop (from user or father)
exit(3);
break;
case SIGCHLD:
case SIGHUP:
// ignore signals
break;
case 10: // signal user 1
// a process finished its work
exit(0);
break;
case 12: // signal user 2
// a process got an error.
exit(3);
break;
default:
// nothing
}
}
public static function run($nbProcess, $nbTasks, $taskFunc, $args) {
$pid = 0;
// there will be $nbTasks tasks to do, and no more than $nbProcess children must work at the same time
$MAX_PROCESS = $nbProcess;
$pidFather = posix_getpid();
$data = array();
pcntl_signal(SIGTERM, "signalHandler");
pcntl_signal(SIGINT, "signalHandler");
// pcntl_signal(SIGKILL, "signalHandler"); // SIGKILL can't be overloaded
pcntl_signal(SIGCHLD, "signalHandler");
pcntl_signal(SIGHUP, "signalHandler");
pcntl_signal(10, "signalHandler"); // user signal 1
pcntl_signal(12, "signalHandler"); // user signal 2
for ($indexTask = 0; $indexTask < $nbTasks ; $indexTask++) {
$pid = pcntl_fork();
// Father and new child both read code from here
if ($pid == -1) {
// log error
return false;
} elseif ($pid > 0) {
// We are in father process
// storing child id in an array
$arrayPid[$pid] = $indexTask;
} else {
// We are in child, nothing to do now
}
if ($pid == 0) {
// We are in child process
$pidChild = posix_getpid();
try {
//$taskFunc is an array containing an object, and the method to call from that object
$ret = (array) call_user_func($taskFunc, $indexTask, $args);// similar to $ret = (array) $taskFunc($indexTask, $args);
$returnArray = array(
"tasknb" => $indexTask,
"time" => $timer,
"data" => $ret,
);
} catch(Exception $e) {
// some stuff to exit child
}
$pdata = array();
array_push($pdata, $returnArray);
$data_str = serialize($pdata);
$shm_id = shmop_open($pidChild, "c", 0644, strlen($data_str));
if (!$shm_id) {
// log error
} else {
if(shmop_write($shm_id, $data_str, 0) != strlen($data_str)) {
// log error
}
}
// We are in a child and job is done. Let's exit !
posix_kill($pidChild, 10); // sending user signal 1 (OK)
pcntl_signal_dispatch();
} else {
// we are in father process,
// we check number of running children
while (count($arrayPid) >= $MAX_PROCESS) {
// There are more children than allowed
// waiting for any child to send signal
$pid = pcntl_wait($status);
// A child sent a signal !
if ($pid == -1) {
// log error
}
if (pcntl_wifexited($status)) {
$statusChild = pcntl_wexitstatus($status);
} else
$statusChild = $status;
// father ($pidFather) saw a child ($pid) exiting with status $statusChild (or $status ?)
// ^^^^ ^^^^^^
// (=3) (= random number ?)
if(isset($arrayPid[$pid])) {
// father knows this child
unset($arrayPid[$pid]);
if ($statusChild == 0 || $statusChild == 10 || $statusChild == 255) {
// Child did not report any error
$shm_id = shmop_open($pid, "a", 0, 0);
if ($shm_id === false)
// log error
else {
$shm_data = unserialize(shmop_read($shm_id, 0, shmop_size($shm_id)));
shmop_delete($shm_id);
shmop_close($shm_id);
$data = array_merge($data, $shm_data);
}
// kill (again) child
posix_kill($pid, 10);
pcntl_signal_dispatch();;
}
else {
// Child reported an error
}
}
}
}
}
}
The problem I'm facing is about the value returned by wexitstatus.
To make it simple, there is a father-process, that must create 200 threads.
He makes process one at a time, and wait for a process to finish if there are more than 8 threads actually running.
I added many logs, so I see a child finished its work.
I see it calling the line posix_kill($pidChild, 10);.
I see the signal handler is called with signal user 1 (which results in an exit(0)).
I see the father awakening, but when he gets the returned code from wexitstatus, he sees a code 3, and so thinks the child got an error, whereas it has exited with code 0 !!.
The pid is the good child's pid.
Maybe I misunderstand how signals work... Any clue ?
I found the problem.
In my application, register_shutdown_function(myFrameworkShutdownFunction) was used to shut down the script "smoothly".
So an exit(0) didn't immediately stop the child process. It first went into myFrameworkShutdownFunction, and converted the return code 0 to a code 3 (because of a misconfigured variable).
I have webapp that is logging application and I need backup/restore/import/export feature there. I did this successfully with laravel but have some complications with Phalcon. I don't see native functions in phalcon that would split on chunks execution of large php scripts.
The thing is that logs will be backed up and restored as well as imported by users in ADIF format (adif.org) I have parser for that format which converts file to array of arrays then every record should search through another table, containing 2000 regular expressions, and find 3-10 matches there and connect imported records in one table to those in another table (model relation hasMany) That means that every imported record should have quite some processing time. laravel did it somehow with 3500 records imported, I dont know how it will handle more. The average import will contain 10000 records and each of them need to be verified with 2000 regular expression.
The main issue is how to split this huge processing mount into smaller chunks so I wouldnt get timeouts?
Here is the function that could flawlessly do the job with adding 3862 records in one table and as a result of processing of every record add 8119 records in another table:
public function restoreAction()
{
$this->view->disable();
$user = Users::findFirst($this->session->auth['id']);
if ($this->request->isPost()) {
if ($this->request->isAjax()) {
$frontCache = new CacheData(array(
"lifetime" => 21600
));
$cache = new CacheFile($frontCache, array(
"cacheDir" => "../plc/app/cache/"
));
$cacheKey = $this->request->getPost('fileName').'.cache';
$records = $cache->get($cacheKey);
if ($records === null) {
$rowsPerChunk = 50;
$adifSource = AdifHelper::parseFile(BASE_URL.'/uploads/'.$user->getUsername().'/'.$this->request->getPost('fileName'));
$records = array_chunk($adifSource, $rowsPerChunk);
$key = array_keys($records);
$size = count($key);
}
for ($i = 0; $i < $size; $i++) {
if (!isset($records[$i])) {
break;
}
set_time_limit(50);
for ($j=0; $j < $rowsPerChunk; $j++) {
$result = $records[$i][$j];
if (!isset($result)) {
break;
}
if(isset($result['call'])) {
$p = new PrefixHelper($result['call']);
}
$bandId = (isset($result['band']) && (strlen($result['band']) > 2)) ? Bands::findFirstByName($result['band'])->getId() : null;
$infos = (isset($p)) ? $p->prefixInfo() : null;
if (is_array($infos)) {
if (isset($result['qsl_sent']) && ($result['qsl_sent'] == 'q')) {
$qsl_rcvd = 'R';
} else if (isset($result['eqsl_qsl_sent']) && ($result['eqsl_qsl_sent'] == 'c')) {
$qsl_rcvd = 'y';
} else if (isset($result['qsl_rcvd'])) {
$qsl_rcvd = $result['qsl_rcvd'];
} else {
$qsl_rcvd ='i';
}
$logRow = new Logs();
$logRow->setCall($result['call']);
$logRow->setDatetime(date('Y-m-d H:i:s',strtotime($result['qso_date'].' '.$result['time_on'])));
$logRow->setFreq(isset($result['freq']) ? $result['freq'] : 0);
$logRow->setRst($result['rst_sent']);
$logRow->setQslnote(isset($result['qslmsg']) ? $result['qslmsg'] : '');
$logRow->setComment(isset($result['comment']) ? $result['comment'] : '');
$logRow->setQslRcvd($qsl_rcvd);
$logRow->setQslVia(isset($result['qsl_sent_via']) ? $result['qsl_sent_via'] : 'e');
$logRow->band_id = $bandId;
$logRow->user_id = $this->session->auth['id'];
$success = $logRow->save();
if ($success) {
foreach ($infos as $info) {
if (is_object($info)) {
$inf = new Infos();
$inf->setLat($info->lat);
$inf->setLon($info->lon);
$inf->setCq($info->cq);
$inf->setItu($info->itu);
if (isset($result['iota'])) {
$inf->setIota($result['iota']);
}
if (isset($result['pfx'])) {
$inf->setPfx($result['pfx']);
}
if (isset($result['gridsquare'])) {
$inf->setGrid($result['gridsquare']);
} else if (isset($result['grid'])) {
$inf->setGrid($result['grid']);
}
$inf->qso_id = $logRow->getId();
$inf->prefix_id = $info->id;
$infSuccess[] = $inf->save();
}
}
}
}
}
sleep(1);
}
}
}
}
I know, the script needs a lot of improvement but for now the task was just to make it work.
I think that the good practice for large processing task in php is console applications, that doesn't have restrictions in execution time and can be setup with more memory for execution.
As for phalcon, it has builtin mechanism for running and processing cli tasks - Command Line Applications (this link will always point to the documentation of a phalcon latest version)
How do you calculate the server load of PHP/apache? I know in vBulletin forums there's the server load displayed like 0.03 0.01 0.04 but it's not really understandable to a average joe. So I thought about a 1-100 percentile scale. I wanted to display a server load visual that reads from a DIV:
$load = $serverLoad*100;
<div class=\"serverLoad\">
<div style=\"width: $load%;\"></div>//show visual for server load on a 1-100 % scale
</div>
<p>Current server load is $load%</p>
</div>
However, I don't know how to detect server load. Is it possible to do turn this server load into a percentile scale? I don't know where to start. May someone please help me out?
Thanks.
I have a very old function that should still do the trick:
function getServerLoad($windows = false){
$os=strtolower(PHP_OS);
if(strpos($os, 'win') === false){
if(file_exists('/proc/loadavg')){
$load = file_get_contents('/proc/loadavg');
$load = explode(' ', $load, 1);
$load = $load[0];
}elseif(function_exists('shell_exec')){
$load = explode(' ', `uptime`);
$load = $load[count($load)-1];
}else{
return false;
}
if(function_exists('shell_exec'))
$cpu_count = shell_exec('cat /proc/cpuinfo | grep processor | wc -l');
return array('load'=>$load, 'procs'=>$cpu_count);
}elseif($windows){
if(class_exists('COM')){
$wmi=new COM('WinMgmts:\\\\.');
$cpus=$wmi->InstancesOf('Win32_Processor');
$load=0;
$cpu_count=0;
if(version_compare('4.50.0', PHP_VERSION) == 1){
while($cpu = $cpus->Next()){
$load += $cpu->LoadPercentage;
$cpu_count++;
}
}else{
foreach($cpus as $cpu){
$load += $cpu->LoadPercentage;
$cpu_count++;
}
}
return array('load'=>$load, 'procs'=>$cpu_count);
}
return false;
}
return false;
}
This returns processor load. You can also use memory_get_usage and memory_get_peak_usage for memory load.
If you can't handle finding percentages based on this... sigh, just post and we'll try together.
This function in PHP might do that trick: sys_getloadavg().
Returns three samples representing the average system load (the number of processes in the system run queue) over the last 1, 5 and 15 minutes, respectively.
function get_server_load()
{
$serverload = array();
// DIRECTORY_SEPARATOR checks if running windows
if(DIRECTORY_SEPARATOR != '\\')
{
if(function_exists("sys_getloadavg"))
{
// sys_getloadavg() will return an array with [0] being load within the last minute.
$serverload = sys_getloadavg();
$serverload[0] = round($serverload[0], 4);
}
else if(#file_exists("/proc/loadavg") && $load = #file_get_contents("/proc/loadavg"))
{
$serverload = explode(" ", $load);
$serverload[0] = round($serverload[0], 4);
}
if(!is_numeric($serverload[0]))
{
if(#ini_get('safe_mode') == 'On')
{
return "Unknown";
}
// Suhosin likes to throw a warning if exec is disabled then die - weird
if($func_blacklist = #ini_get('suhosin.executor.func.blacklist'))
{
if(strpos(",".$func_blacklist.",", 'exec') !== false)
{
return "Unknown";
}
}
// PHP disabled functions?
if($func_blacklist = #ini_get('disable_functions'))
{
if(strpos(",".$func_blacklist.",", 'exec') !== false)
{
return "Unknown";
}
}
$load = #exec("uptime");
$load = explode("load average: ", $load);
$serverload = explode(",", $load[1]);
if(!is_array($serverload))
{
return "Unknown";
}
}
}
else
{
return "Unknown";
}
$returnload = trim($serverload[0]);
return $returnload;
}
If "exec" is activated
/**
* Return the current server load
* It needs "exec" activated
*
* e.g.
* 15:06:37 up 10 days, 5:59, 12 users, load average: 1.40, 1.45, 1.33
* returns
* float(1.40)
*/
public function getServerLoad()
{
preg_match('#(\d\.\d+),[\d\s\.]+,[\d\s\.]+$#', exec("uptime"), $m);
return (float)$m[1];
}
<?php
$load = sys_getloadavg();
if ($load[0] > 0.80) {
header('HTTP/1.1 503 Too busy, try again later');
die('Server too busy. Please try again later.');
}
?>
Its Pretty easy in LAMP environment if you have proper permissions,
print_r(sys_getloadavg());
OUTPUT : Array ( [0] => 0 [1] => 0.01 [2] => 0.05 )
Array values are average load in the past 1, 5 and 15 minutes respectively.
Here is a windows/linux compatible php server load function:
function get_server_load() {
$load = '';
if (stristr(PHP_OS, 'win')) {
$cmd = 'wmic cpu get loadpercentage /all';
#exec($cmd, $output);
if ($output) {
foreach($output as $line) {
if ($line && preg_match('/^[0-9]+$/', $line)) {
$load = $line;
break;
}
}
}
} else {
$sys_load = sys_getloadavg();
$load = $sys_load[0];
}
return $load;
}
function _loadavg()
{
if(class_exists("COM"))
{
$wmi = new COM("WinMgmts:\\\\.");
$cpus = $wmi->InstancesOf("Win32_Processor");
foreach ($cpus as $cpu)
{
return $cpu->LoadPercentage;
}
}
}