I currently have a website written in PHP, utilizing the curl_multi for polling external APIs. The server forks child processes to standalone from web requests and is working well, but it is somewhat limited to a per process basis.
Occasionally it hit bandwidth bottlenecks and needs a better centralized queuing logic.
I am currently trying PHP IPC with a standalone background process to handle all the outgoing requests, but was stuck in things that, normally said, is not likely to be catered by casual programmers. Says, garbage collection, inter-process exception handling, request-response matching... and etc. Am I going the wrong way?
Is there a common practise (implementation theory) out there, or even libraries I could make use of?
EDIT
Using localhost TCP/IP communication would double the stress of the local trafic, which is definitely not a good approach.
I am currently working on IPC message queue with some home-brew protocol... not looking right at all. Would appreciate any help.
There are several different things to distinguish here:
jobs : you have N jobs to process. Executed tasks can crash or hang, in any way all jobs should be proceeded without any data loss.
resources : you are processing your jobs in a single machine and/or in a single connection, so you need to take care of your cpu and bandwith.
synchronization : if you have interactions between your processes, you need to share information, taking care of concurrent data access.
Keep control on resources
Everybody want to get into the bus...
Because there is no built-in threads in PHP, we will need to simulate mutexes. The principle is quite simple:
1 All jobs are put in a queue
2 There is N resources available and no more in a pool
3 We iterate the queue (a while on each job)
4 Before execution, job asks for a resource in the pool
5 If there is available resources, job is executed
6 If there is no more resources, pool hangs until a job has finished or is considered dead
How to do that in PHP?
To proceed, we have several possibilities but the principle is the same:
We have 2 programs:
There is a process launcher that will launch no more than N tasks simultaneously.
There is a process child, it represents a thread's context
How can look a process launcher?
The process launcher knows how many tasks should be run, and run them without taking care of their result. It only control their execution (a process starts, finishes, or hangs, and N are already running).
PHP I give you the idea here, I'll give you usable examples later:
<?php
// launcher.php
require_once("ProcessesPool.php");
// The label identifies your process pool, it should be unique for your process launcher and your process children
$multi = new ProcessesPool($label = 'test');
// Initialize a new pool (creates the right directory or file, cleans a database or whatever you want)
// 10 is the maximum number of simultaneously run processes
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
// We need to launch N processes, stored in $count
$count = 100; // maybe count($jobs)
// We execute all process, one by one
for ($i = 0; ($i < $count); $i++)
{
// The waitForResources method looks for how many processes are already run,
// and hangs until a resource is free or the maximum execution time is reached.
$ret = $multi->waitForResource($timeout = 10, $interval = 500000);
if ($ret)
{
// A resource is free, so we can run a new process
echo "Execute new process: $i\n";
exec("/usr/bin/php ./child.php $i > /dev/null &");
}
else
{
// Timeout is reached, we consider all children as dead and we kill them.
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
// All process has been executed, but this does not mean they finished their work.
// This is important to follow the last executed processes to avoid zombies.
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000);
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
// We destroy the process pool because we run all processes.
$multi->destroy();
echo "Finish.\n";
And what about the process child, the simulated thread's context
A child (executed job) has only 3 things to do :
tell the process launcher it started
do his job
tell the process launcher it finished
PHP It could contains something like this:
<?php
// child.php
require_once("ProcessesPool.php");
// we create the *same* instance of the process pool
$multi = new ProcessesPool($label = 'test');
// child tells the launcher it started (there will be one more resource busy in pool)
$multi->start();
// here I simulate job's execution
sleep(rand() % 5 + 1);
// child tells the launcher it finished his job (there will be one more resource free in pool)
$multi->finish();
Your usage example is nice, but where is the ProcessPool class?
There is a lot of ways to synchronize tasks but it really depends on your requirements and constraints.
You can synchronize your tasks using :
an unique file
a database
a directory and several files
probably other methods (such as system IPCs)
As we seen already, we need at least 7 methods:
1 create() will create an empty pool
2 start() takes a resource on the pool
3 finish() releases a resource
4 waitForResources() hangs if there is no more free resource
5 killAllResources() get all launched jobs in the pool and kills them
6 waitForTheEnd() hangs until there is no more busy resource
7 destroy() destroys pool
So let's begin by creating an abstract class, we'll be able to implement it using the above ways later.
PHP AbstractProcessPool.php
<?php
// AbstractProcessPool.php
abstract class AbstractProcessesPool
{
abstract protected function _createPool();
abstract protected function _cleanPool();
abstract protected function _destroyPool();
abstract protected function _getPoolAge();
abstract protected function _countPid();
abstract protected function _addPid($pid);
abstract protected function _removePid($pid);
abstract protected function _getPidList();
protected $_label;
protected $_max;
protected $_pid;
public function __construct($label)
{
$this->_max = 0;
$this->_label = $label;
$this->_pid = getmypid();
}
public function getLabel()
{
return ($this->_label);
}
public function create($max = 20)
{
$this->_max = $max;
$ret = $this->_createPool();
return $ret;
}
public function destroy()
{
$ret = $this->_destroyPool();
return $ret;
}
public function waitForResource($timeout = 120, $interval = 500000, $callback = null)
{
// let enough time for children to take a resource
usleep(200000);
while (44000)
{
if (($callback != null) && (is_callable($callback)))
{
call_user_func($callback, $this);
}
$age = $this->_getPoolAge();
if ($age == -1)
{
return false;
}
if ($age > $timeout)
{
return false;
}
$count = $this->_countPid();
if ($count == -1)
{
return false;
}
if ($count < $this->_max)
{
break;
}
usleep($interval);
}
return true;
}
public function waitForTheEnd($timeout = 3600, $interval = 500000, $callback = null)
{
// let enough time to the last child to take a resource
usleep(200000);
while (44000)
{
if (($callback != null) && (is_callable($callback)))
{
call_user_func($callback, $this);
}
$age = $this->_getPoolAge();
if ($age == -1)
{
return false;
}
if ($age > $timeout)
{
return false;
}
$count = $this->_countPid();
if ($count == -1)
{
return false;
}
if ($count == 0)
{
break;
}
usleep($interval);
}
return true;
}
public function start()
{
$ret = $this->_addPid($this->_pid);
return $ret;
}
public function finish()
{
$ret = $this->_removePid($this->_pid);
return $ret;
}
public function killAllResources($code = 9)
{
$pids = $this->_getPidList();
if ($pids == false)
{
$this->_cleanPool();
return false;
}
foreach ($pids as $pid)
{
$pid = intval($pid);
posix_kill($pid, $code);
if ($this->_removePid($pid) == false)
{
return false;
}
}
return true;
}
}
Synchronisation using a directory and several files
If you want to use the directory method (on a /dev/ram1 partition for example), the implementation will be :
1 create() will create an empty directory using the given $label
2 start() creates a file in the directory, named by the child's pid
3 finish() destroy the child's file
4 waitForResources() counts file inside that directory
5 killAllResources() reads directory content and kill all pids
6 waitForTheEnd() reads directory until there is no more files
7 destroy() removes directory
This method looks costy, but it is really efficient if you want to run hundred tasks simultaneously without taking as many database connections as there is jobs to execute.
Implementation :
PHP ProcessPoolFiles.php
<?php
// ProcessPoolFiles.php
class ProcessesPoolFiles extends AbstractProcessesPool
{
protected $_dir;
public function __construct($label, $dir)
{
parent::__construct($label);
if ((!is_dir($dir)) || (!is_writable($dir)))
{
throw new Exception("Directory '{$dir}' does not exist or is not writable.");
}
$sha1 = sha1($label);
$this->_dir = "{$dir}/pool_{$sha1}";
}
protected function _createPool()
{
if ((!is_dir($this->_dir)) && (!mkdir($this->_dir, 0777)))
{
throw new Exception("Could not create '{$this->_dir}'");
}
if ($this->_cleanPool() == false)
{
return false;
}
return true;
}
protected function _cleanPool()
{
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
if (unlink($this->_dir . '/' . $file) == false)
{
return false;
}
}
}
closedir($dh);
return true;
}
protected function _destroyPool()
{
if ($this->_cleanPool() == false)
{
return false;
}
if (!rmdir($this->_dir))
{
return false;
}
return true;
}
protected function _getPoolAge()
{
$age = -1;
$count = 0;
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$stat = #stat($this->_dir . '/' . $file);
if ($stat['mtime'] > $age)
{
$age = $stat['mtime'];
}
$count++;
}
}
closedir($dh);
clearstatcache();
return (($count > 0) ? (#time() - $age) : (0));
}
protected function _countPid()
{
$count = 0;
$dh = opendir($this->_dir);
if ($dh == false)
{
return -1;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$count++;
}
}
closedir($dh);
return $count;
}
protected function _addPid($pid)
{
$file = $this->_dir . "/" . $pid;
if (is_file($file))
{
return true;
}
echo "{$file}\n";
$file = fopen($file, 'w');
if ($file == false)
{
return false;
}
fclose($file);
return true;
}
protected function _removePid($pid)
{
$file = $this->_dir . "/" . $pid;
if (!is_file($file))
{
return true;
}
if (unlink($file) == false)
{
return false;
}
return true;
}
protected function _getPidList()
{
$array = array ();
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$array[] = $file;
}
}
closedir($dh);
return $array;
}
}
PHP demo, the process launcher:
<?php
// pool_files_launcher.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");
$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
$count = 20;
for ($i = 0; ($i < $count); $i++)
{
$ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
if ($ret)
{
echo "Execute new process: $i\n";
exec("/usr/bin/php ./pool_files_calc.php $i > /dev/null &");
}
else
{
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
$multi->destroy();
echo "Finish.\n";
function test_waitForResource($multi)
{
echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}
function test_waitForTheEnd($multi)
{
echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}
PHP demo, the process child:
<?php
// pool_files_calc.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");
$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");
$multi->start();
// here I simulate job's execution
sleep(rand() % 7 + 1);
$multi->finish();
Synchronisation using a database
MySQL If you prefer using the database method, you'll need a table like:
CREATE TABLE `processes_pool` (
`label` varchar(40) PRIMARY KEY,
`nb_launched` mediumint(6) unsigned NOT NULL,
`pid_list` varchar(2048) default NULL,
`updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Then, the implementation will be something like:
1 create() will insert a new row in the above table
2 start() inserts a pid on the pid list
3 finish() remove one pid from the pid list
4 waitForResources() reads nb_launched field
5 killAllResources() gets and kills every pid
6 waitForTheEnd() hangs and checks regularily until nb_launched equals 0
7 destroy() removes the row
Implementation:
PHP ProcessPoolMySql.php
<?php
// ProcessPoolMysql.php
class ProcessesPoolMySQL extends AbstractProcessesPool
{
protected $_sql;
public function __construct($label, PDO $sql)
{
parent::__construct($label);
$this->_sql = $sql;
$this->_label = sha1($label);
}
protected function _createPool()
{
$request = "
INSERT IGNORE INTO processes_pool
VALUES ( ?, ?, NULL, CURRENT_TIMESTAMP )
";
$this->_query($request, $this->_label, 0);
return $this->_cleanPool();
}
protected function _cleanPool()
{
$request = "
UPDATE processes_pool
SET
nb_launched = ?,
pid_list = NULL,
updated = CURRENT_TIMESTAMP
WHERE label = ?
";
$this->_query($request, 0, $this->_label);
return true;
}
protected function _destroyPool()
{
$request = "
DELETE FROM processes_pool
WHERE label = ?
";
$this->_query($request, $this->_label);
return true;
}
protected function _getPoolAge()
{
$request = "
SELECT (CURRENT_TIMESTAMP - updated) AS age
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($request, $this->_label);
if ($ret === null)
{
return -1;
}
return $ret['age'];
}
protected function _countPid()
{
$req = "
SELECT nb_launched AS nb
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($req, $this->_label);
if ($ret === null)
{
return -1;
}
return $ret['nb'];
}
protected function _addPid($pid)
{
$request = "
UPDATE processes_pool
SET
nb_launched = (nb_launched + 1),
pid_list = CONCAT_WS(',', (SELECT IF(LENGTH(pid_list) = 0, NULL, pid_list )), ?),
updated = CURRENT_TIMESTAMP
WHERE label = ?
";
$this->_query($request, $pid, $this->_label);
return true;
}
protected function _removePid($pid)
{
$req = "
UPDATE processes_pool
SET
nb_launched = (nb_launched - 1),
pid_list =
CONCAT_WS(',', (SELECT IF (LENGTH(
SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1))) = 0, null,
SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1)))), (SELECT IF (LENGTH(
SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))))) = 0, null,
SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))
)
)
)
)
),
updated = CURRENT_TIMESTAMP
WHERE label = ?";
$this->_query($req, $pid, $pid, $pid, $pid, $this->_label);
return true;
}
protected function _getPidList()
{
$req = "
SELECT pid_list
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($req, $this->_label);
if ($ret === null)
{
return false;
}
if ($ret['pid_list'] == null)
{
return array();
}
$pid_list = explode(',', $ret['pid_list']);
return $pid_list;
}
protected function _query($request)
{
$return = null;
$stmt = $this->_sql->prepare($request);
if ($stmt === false)
{
return $return;
}
$params = func_get_args();
array_shift($params);
if ($stmt->execute($params) === false)
{
return $return;
}
if (strncasecmp(trim($request), 'SELECT', 6) === 0)
{
$return = $stmt->fetch(PDO::FETCH_ASSOC);
}
return $return;
}
}
PHP demo, the process launcher:
<?php
// pool_mysql_launcher.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$multi = new ProcessesPoolMySQL($label = 'test', $dbh);
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
$count = 20;
for ($i = 0; ($i < $count); $i++)
{
$ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
if ($ret)
{
echo "Execute new process: $i\n";
exec("/usr/bin/php ./pool_mysql_calc.php $i > /dev/null &");
}
else
{
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
$multi->destroy();
echo "Finish.\n";
function test_waitForResource($multi)
{
echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}
function test_waitForTheEnd($multi)
{
echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}
PHP demo, the process child:
<?php
// pool_mysql_calc.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$multi = new ProcessesPoolMySQL($label = 'test', $dbh);
$multi->start();
// here I simulate job's execution
sleep(rand() % 7 + 1);
$multi->finish();
What is the output of the above code?
Demo output Those demo gives - fortunately - about the same output. If the timeout is not reached (the dreams's case), output is:
KolyMac:TaskManager ninsuo$ php pool_files_launcher.php
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 10
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 11
Waiting for available resource ( test )...
Execute new process: 12
Waiting for available resource ( test )...
Execute new process: 13
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 14
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 15
Waiting for available resource ( test )...
Execute new process: 16
Waiting for available resource ( test )...
Execute new process: 17
Waiting for available resource ( test )...
Execute new process: 18
Waiting for available resource ( test )...
Execute new process: 19
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Finish.
Demo output In a worse case (I change sleep(rand() % 7 + 1); to sleep(rand() % 7 + 100);, this gives:
KolyMac:TaskManager ninsuo$ php pool_files_launcher.php
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
(...)
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
WaitForResources Timeout! Killing zombies...
Waiting for all resources to finish ( test )...
Finish.
Go to page 2 to continue reading this answer.
Page 2: there is a limit for SO answer's body to 30k chars, so I need to create a new one.
Keep control on results
No mistake allowed!
YEAH ! You can launch tons of processes without taking care of resources. But what if one child process fails? There will be one undone or incomplete job!...
In fact, this is simpler (far simpler) than controlling process execution. We have a queue of jobs executed using a pool, and we only need to know if one has failed or succeded after its execution. If there is failure when the whole pool is executed, then failed processes are put on a new pool, and are executed again.
How to proceed in PHP ?
This principle is based on clusters: queue contains several jobs, but represents only one entity. Each calcul of the cluster
should succeed to get that entity complete.
Roadmap:
1 We create a todo list (to mismatch with the queue, used for process management) containing all calculs of the cluster. Each job has a status: waiting (not executed), running (executed and not finished), success and error (according to their result), and of course, at this step, their status is WAITING.
2 We run all jobs using the process manager (to keep control on resources), each one begin by telling the task manager it runs, and according to his own context, it finishes by indicating his state (failed or success).
3 When the entire queue is executed, the task manager creates a new queue with failed jobs, and loops again.
4 When all jobs succeed, you're done, and you're sure nothing went wrong. Your cluster is complete and your entity usable at upper level.
Proof of concept
There is nothing more to tell about the subject, so let's write some code, continuing the previous sample code.
As for process management, you can use several ways to synchronize your parents and children, but there is no hard logic, so there is no need for an abstraction.
So I developped a MySQL example (faster to write), you'll be free to adapt this concept for your requirements and constraints.
MySQL Create the following table:
CREATE TABLE `tasks_manager` (
`cluster_label` varchar(40),
`calcul_label` varchar(40),
`status` enum('waiting', 'running', 'failed', 'success') default 'waiting',
`updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
PRIMARY KEY (`cluster_label`, `calcul_label`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
PHP Here is the TaskManager.php file:
<?php
class TasksManager
{
protected $_cluster_label;
protected $_calcul_label;
protected $_sql;
const WAITING = "waiting";
const RUNNING = "running";
const SUCCESS = "success";
const FAILED = "failed";
public function __construct($label, PDO $sql)
{
$this->_sql = $sql;
$this->_cluster_label = substr($label, 0, 40);
}
public function getClusterLabel()
{
return $this->_cluster_label;
}
public function getCalculLabel()
{
return $this->_calcul_label;
}
public function destroy()
{
$request = "
DELETE FROM tasks_manager
WHERE cluster_label = ?
";
$this->_query($request, $this->_cluster_label);
return $this;
}
public function start($calcul_label)
{
$this->_calcul_label = $calcul_label;
$this->add($calcul_label, TasksManager::RUNNING);
return $this;
}
public function finish($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
if (is_null($this->_cluster_label))
{
throw new Exception("finish() called, but task never started.");
}
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
AND calcul_label = ?
";
$this->_query($request, $status, $this->_cluster_label, substr($this->_calcul_label, 0, 40));
return $this;
}
public function add($calcul_label, $status = TasksManager::WAITING)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
INSERT INTO tasks_manager (
cluster_label, calcul_label, status
) VALUES (
?, ?, ?
)
ON DUPLICATE KEY UPDATE
status = ?
";
$calcul_label = substr($calcul_label, 0, 40);
$this->_query($request, $this->_cluster_label, $calcul_label, $status, $status);
return $this;
}
public function delete($calcul_label)
{
$request = "
DELETE FROM tasks_manager
WHERE cluster_label = ?
AND calcul_label = ?
";
$this->_query($request, $this->_cluster_label, substr($calcul_label, 0, 40));
return $this;
}
public function countStatus($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
SELECT COUNT(*) AS cnt
FROM tasks_manager
WHERE cluster_label = ?
AND status = ?
";
$ret = $this->_query($request, $this->_cluster_label, $status);
return $ret[0]['cnt'];
}
public function count()
{
$request = "
SELECT COUNT(id) AS cnt
FROM tasks_manager
WHERE cluster_label = ?
";
$ret = $this->_query($request, $this->_cluster_label);
return $ret[0]['cnt'];
}
public function getCalculsByStatus($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
SELECT calcul_label
FROM tasks_manager
WHERE cluster_label = ?
AND status = ?
";
$ret = $this->_query($request, $this->_cluster_label, $status);
$array = array();
if (!is_null($ret))
{
$array = array_map(function($row) {
return $row['calcul_label'];
}, $ret);
}
return $array;
}
public function switchStatus($statusA = TasksManager::RUNNING, $statusB = null)
{
if (!$this->_isStatus($statusA))
{
throw new Exception("{$statusA} is not a valid status.");
}
if ((!is_null($statusB)) && (!$this->_isStatus($statusB)))
{
throw new Exception("{$statusB} is not a valid status.");
}
if ($statusB != null)
{
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
AND status = ?
";
$this->_query($request, $statusB, $this->_cluster_label, $statusA);
}
else
{
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
";
$this->_query($request, $statusA, $this->_cluster_label);
}
return $this;
}
private function _isStatus($status)
{
if (!is_string($status))
{
return false;
}
return in_array($status, array(
self::FAILED,
self::RUNNING,
self::SUCCESS,
self::WAITING,
));
}
protected function _query($request)
{
$return = null;
$stmt = $this->_sql->prepare($request);
if ($stmt === false)
{
return $return;
}
$params = func_get_args();
array_shift($params);
if ($stmt->execute($params) === false)
{
return $return;
}
if (strncasecmp(trim($request), 'SELECT', 6) === 0)
{
$return = $stmt->fetchAll(PDO::FETCH_ASSOC);
}
return $return;
}
}
PHP The task_launcher.php is the usage example
<?php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);
$pool->create($max = "10");
// Initializing task manager
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->destroy();
// Simulating jobs
$count = 20;
$todo_list = array ();
for ($i = 0; ($i < $count); $i++)
{
$todo_list[$i] = "Job {$i}";
$multi->add($todo_list[$i], TasksManager::WAITING);
}
// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
$continue = false;
echo "Starting to run jobs in queue ...\n";
// put all failed jobs to WAITING status
$multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);
foreach ($todo_list as $job)
{
$ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");
if ($ret)
{
echo "Executing job: $job\n";
exec(sprintf("/usr/bin/php ./tasks_program.php %s > /dev/null &", escapeshellarg($job)));
}
else
{
echo "waitForResource timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
break;
}
}
$ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
if ($ret == false)
{
echo "waitForTheEnd timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
}
echo "All jobs in queue executed, looking for errors...\n";
// Counts if there is failures
$nb_failed = $multi->countStatus(TasksManager::FAILED);
if ($nb_failed > 0)
{
$todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
$continue = true;
}
}
function waitResource($multi)
{
echo "Waiting for a resource ....\n";
}
function waitEnd($multi)
{
echo "Waiting for the end .....\n";
}
// All jobs finished, destroying task manager
$multi->destroy();
// Destroying process pool
$pool->destroy();
echo "Finish.\n";
PHP And here is the child program (a calcul)
<?php
if (!isset($argv[1]))
{
die("This program must be called with an identifier (calcul_label)\n");
}
$calcul_label = $argv[1];
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);
// Takes one resource in pool
$pool->start();
// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->start($calcul_label);
// Simulating execution time
$secs = (rand() % 2) + 3;
sleep($secs);
// Simulating job status
$status = rand() % 3 == 0 ? TasksManager::FAILED : TasksManager::SUCCESS;
// Job finishes indicating his status
$multi->finish($status);
// Releasing pool's resource
$pool->finish();
Demo output This demo will give you something like this (too large for SO).
Synchronization and communication between processes
There is solutions to communicate easily without errors.
Take a coffee, we're close to the end!
We are now able to launch tons of processes and they are all giving the expected result, well, this is not bad.
But now, all our processes are executed stand-alone, and they actually can't communicate each others. That is your
core problem, and there is a lot of solutions.
It is really difficult to tell you exactly what kind of communication you need. You were speaking about what you tried (IPCs,
communication using files, or home-made protocols), but not what kind of information are shared between your processes.
Anyway, I invite you to think about an OOP solution.
PHP is powerful.
PHP has magic methods :
__get($property) let us implement the access of a $property on an object
__set($property, $value) let us implement the assignation of a $property on an object
PHP can handle files, with concurrent-access management
fopen($file, 'c+') opens a file with advisory lock options enabled (allow you to use flock)
flock($descriptor, LOCK_SH) takes a shared lock (for reading)
flock($descriptor, LOCK_EX) takes an exclusive lock (for writting)
Finally, PHP has:
json_encode($object) to get a json representation of an object
json_decode($string) to get back an object from a json string
You see where I'm going? We will create a Synchro class that will work the same way as the stdClass class, but it will be always safely synchronized on a file. Our processes will be able to access the same instance of that object at the same time.
Some Linux systems tricks
Of course, if you have 150 processes working on the same file at the same time, your hard drive will slow down your processes. To
handle this issue, why not creating a filesystem partition on RAM? Writing into that file will be about as quick as writing
in memory!
shell As root, type the following commands:
mkfs -q /dev/ram1 65536
mkdir -p /ram
mount /dev/ram1 /ram
Some notes :
65536 is in kilobytes, here you get a 64M partition.
if you want to mount that partition at startup, create a shell script and call it inside /etc/rc.local file.
Implementation
PHP Here is the Synchro.php class.
<?php
class Synchro
{
private $_file;
public function __construct($file)
{
$this->_file = $file;
}
public function __get($property)
{
// File does not exist
if (!is_file($this->_file))
{
return null;
}
// Check if file is readable
if ((is_file($this->_file)) && (!is_readable($this->_file)))
{
throw new Exception(sprintf("File '%s' is not readable.", $this->_file));
}
// Open file with advisory lock option enabled for reading and writting
if (($fd = fopen($this->_file, 'c+')) === false)
{
throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
}
// Request a lock for reading (hangs until lock is granted successfully)
if (flock($fd, LOCK_SH) === false)
{
throw new Exception(sprintf("Can't lock '%s' file for reading.", $this->_file));
}
// A hand-made file_get_contents
$contents = '';
while (($read = fread($fd, 32 * 1024)) !== '')
{
$contents .= $read;
}
// Release shared lock and close file
flock($fd, LOCK_UN);
fclose($fd);
// Restore shared data object and return requested property
$object = json_decode($contents);
if (property_exists($object, $property))
{
return $object->{$property};
}
return null;
}
public function __set($property, $value)
{
// Check if directory is writable if file does not exist
if ((!is_file($this->_file)) && (!is_writable(dirname($this->_file))))
{
throw new Exception(sprintf("Directory '%s' does not exist or is not writable.", dirname($this->_file)));
}
// Check if file is writable if it exists
if ((is_file($this->_file)) && (!is_writable($this->_file)))
{
throw new Exception(sprintf("File '%s' is not writable.", $this->_file));
}
// Open file with advisory lock option enabled for reading and writting
if (($fd = fopen($this->_file, 'c+')) === false)
{
throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
}
// Request a lock for writting (hangs until lock is granted successfully)
if (flock($fd, LOCK_EX) === false)
{
throw new Exception(sprintf("Can't lock '%s' file for writing.", $this->_file));
}
// A hand-made file_get_contents
$contents = '';
while (($read = fread($fd, 32 * 1024)) !== '')
{
$contents .= $read;
}
// Restore shared data object and set value for desired property
if (empty($contents))
{
$object = new stdClass();
}
else
{
$object = json_decode($contents);
}
$object->{$property} = $value;
// Go back at the beginning of file
rewind($fd);
// Truncate file
ftruncate($fd, strlen($contents));
// Save shared data object to the file
fwrite($fd, json_encode($object));
// Release exclusive lock and close file
flock($fd, LOCK_UN);
fclose($fd);
return $value;
}
}
Demonstration
We will continue (and finish) our processes / tasks example by making our processes communicate each others.
Rules :
Our goal is to get the sum of all numbers between 1 and 20.
We have 20 processes, with ID from 1 to 20.
Those processes are randomly queued for execution.
Each process (except process 1) can do only one calculation : its id + the previous process's result
Process 1 directly put his id
Each process succeed if it can do his calculation (means, if the previous process's result is available), else it fails (and is candidate for a new queue)
The pool's timeout expires after 10 seconds
Well, it looks complicated but actually, it is a good representation of what you'll find in a real-life situation.
PHP synchro_launcher.php file.
<?php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");
// Removing old synchroized object
if (is_file("/tmp/synchro.txt"))
{
unlink("/tmp/synchro.txt");
}
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);
$pool->create($max = "10");
// Initializing task manager
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->destroy();
// Simulating jobs
$todo_list = array ();
for ($i = 1; ($i <= 20); $i++)
{
$todo_list[$i] = $i;
$multi->add($todo_list[$i], TasksManager::WAITING);
}
// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
$continue = false;
echo "Starting to run jobs in queue ...\n";
// Shuffle all jobs (else this will be too easy :-))
shuffle($todo_list);
// put all failed jobs to WAITING status
$multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);
foreach ($todo_list as $job)
{
$ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");
if ($ret)
{
echo "Executing job: $job\n";
exec(sprintf("/usr/bin/php ./synchro_program.php %s > /dev/null &", escapeshellarg($job)));
}
else
{
echo "waitForResource timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
break;
}
}
$ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
if ($ret == false)
{
echo "waitForTheEnd timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
}
echo "All jobs in queue executed, looking for errors...\n";
// Counts if there is failures
$multi->switchStatus(TasksManager::WAITING, TasksManager::FAILED);
$nb_failed = $multi->countStatus(TasksManager::FAILED);
if ($nb_failed > 0)
{
$todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
$continue = true;
}
}
function waitResource($multi)
{
echo "Waiting for a resource ....\n";
}
function waitEnd($multi)
{
echo "Waiting for the end .....\n";
}
// All jobs finished, destroying task manager
$multi->destroy();
// Destroying process pool
$pool->destroy();
// Recovering final result
$synchro = new Synchro("/tmp/synchro.txt");
echo sprintf("Result of the sum of all numbers between 1 and 20 included is: %d\n", $synchro->result20);
echo "Finish.\n";
PHP And its associed synchro_calcul.php file.
<?php
if (!isset($argv[1]))
{
die("This program must be called with an identifier (calcul_label)\n");
}
$current_id = $argv[1];
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);
// Takes one resource in pool
$pool->start();
// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->start($current_id);
// ------------------------------------------------------
// Job begins here
$synchro = new Synchro("/tmp/synchro.txt");
if ($current_id == 1)
{
$synchro->result1 = 1;
$status = TasksManager::SUCCESS;
}
else
{
$previous_id = $current_id - 1;
if (is_null($synchro->{"result{$previous_id}"}))
{
$status = TasksManager::FAILED;
}
else
{
$synchro->{"result{$current_id}"} = $synchro->{"result{$previous_id}"} + $current_id;
$status = TasksManager::SUCCESS;
}
}
// ------------------------------------------------------
// Job finishes indicating his status
$multi->finish($status);
// Releasing pool's resource
$pool->finish();
Output
The following demo will give you something like this output (too large for SO)
Conclusion
Task management in PHP is not really easy because of the missing threads. As many developers, I hope this feature will come
built-in someday. Anyway, this is possible to control resources and results, and to share data between processes, so we can do,
with some work I assume, task management efficiently.
Synchronization and communication can be done by several ways, but you need to check out pros and cons for each one, according
to your constraints and requirements. For example:
if you need to launch 500 tasks at once and want to use a MySQL synchronization method, you'll need 1+500 simultaneous connections to the database (it may not appreciate a lot).
If you need to share a big amount of data, using only one file may be inefficient.
If you are using file for synchronization, don't forget to have a look to system built-in tools such as /dev/sdram.
Try to stay as more as possible in an object-oriented programming to handle your troubles. Home-made protocols or such will make your app harder to maintain.
I gave you my 2-cents about this interesting subject, and I hope it will give you some ideas to solve your issues.
I recommend you take a look at this library called PHP-Queue: https://github.com/CoderKungfu/php-queue
An short description from its github page:
A unified front-end for different queuing backends. Includes a REST
server, CLI interface and daemon runners.
Check out its github page for more details.
With a bit of tinkering, I think this library will help you solve your problem.
Hope this helps.
Problem: I want to implement several php-worker processes who are listening on a MQ-server queue for asynchronous jobs. The problem now is that simply running this processes as daemons on a server doesn't really give me any level of control over the instances (Load, Status, locked up)...except maybe for dumping ps -aux.
Because of that I'm looking for a runtime environment of some kind that lets me monitor and control the instances, either on system (process) level or on a higher layer (some kind of Java-style appserver)
Any pointers?
Here's some code that may be useful.
<?
define('WANT_PROCESSORS', 5);
define('PROCESSOR_EXECUTABLE', '/path/to/your/processor');
set_time_limit(0);
$cycles = 0;
$run = true;
$reload = false;
declare(ticks = 30);
function signal_handler($signal) {
switch($signal) {
case SIGTERM :
global $run;
$run = false;
break;
case SIGHUP :
global $reload;
$reload = true;
break;
}
}
pcntl_signal(SIGTERM, 'signal_handler');
pcntl_signal(SIGHUP, 'signal_handler');
function spawn_processor() {
$pid = pcntl_fork();
if($pid) {
global $processors;
$processors[] = $pid;
} else {
if(posix_setsid() == -1)
die("Forked process could not detach from terminal\n");
fclose(stdin);
fclose(stdout);
fclose(stderr);
pcntl_exec(PROCESSOR_EXECUTABLE);
die('Failed to fork ' . PROCESSOR_EXECUTABLE . "\n");
}
}
function spawn_processors() {
global $processors;
if($processors)
kill_processors();
$processors = array();
for($ix = 0; $ix < WANT_PROCESSORS; $ix++)
spawn_processor();
}
function kill_processors() {
global $processors;
foreach($processors as $processor)
posix_kill($processor, SIGTERM);
foreach($processors as $processor)
pcntl_waitpid($processor);
unset($processors);
}
function check_processors() {
global $processors;
$valid = array();
foreach($processors as $processor) {
pcntl_waitpid($processor, $status, WNOHANG);
if(posix_getsid($processor))
$valid[] = $processor;
}
$processors = $valid;
if(count($processors) > WANT_PROCESSORS) {
for($ix = count($processors) - 1; $ix >= WANT_PROCESSORS; $ix--)
posix_kill($processors[$ix], SIGTERM);
for($ix = count($processors) - 1; $ix >= WANT_PROCESSORS; $ix--)
pcntl_waitpid($processors[$ix]);
} elseif(count($processors) < WANT_PROCESSORS) {
for($ix = count($processors); $ix < WANT_PROCESSORS; $ix++)
spawn_processor();
}
}
spawn_processors();
while($run) {
$cycles++;
if($reload) {
$reload = false;
kill_processors();
spawn_processors();
} else {
check_processors();
}
usleep(150000);
}
kill_processors();
pcntl_wait();
?>
It sounds like you already have a MQ up and running on a *nix system and just want a way to manage workers.
A very simple way to do so is to use GNU screen. To start 10 workers you can use:
#!/bin/sh
for x in `seq 1 10` ; do
screen -dmS worker_$x php /path/to/script.php worker$x
end
This will start 10 workers in the background using screens named worker_1,2,3 and so on.
You can reattach to the screens by running screen -r worker_ and list the running workers by using screen -list.
For more info this guide may be of help:
http://www.kuro5hin.org/story/2004/3/9/16838/14935
Also try:
screen --help
man screen
or google.
For production servers I would normally recommend using the normal system startup scripts, but I have been running screen commands from the startup scripts for years with no problems.
Do you actually need it to be continuously running?
If you only want to spawn new process on request, you can register it as a service in xinetd.
a pcntl plugin type server daemon for PHP
http://dev.pedemont.com/sonic/
Bellow is our working implementation of #chaos answer. Code to handle signals was removed as this script lives usually just few milliseconds.
Also, in code we added 2 functions to save pids between calls: restore_processors_state() and save_processors_state(). We've used redis there, but you can decide to use implementation on files.
We run this script every minute using cron. Cron checks if all processes alive. If not - it re-run them and then dies. If we want to kill existing processes then we simply run this script with argument kill: php script.php kill.
Very handy way of running workers without injecting scripts into init.d.
<?php
include_once dirname( __FILE__ ) . '/path/to/bootstrap.php';
define('WANT_PROCESSORS', 5);
define('PROCESSOR_EXECUTABLE', '' . dirname(__FILE__) . '/path/to/worker.php');
set_time_limit(0);
$run = true;
$reload = false;
declare(ticks = 30);
function restore_processors_state()
{
global $processors;
$redis = Zend_Registry::get('redis');
$pids = $redis->hget('worker_procs', 'pids');
if( !$pids )
{
$processors = array();
}
else
{
$processors = json_decode($pids, true);
}
}
function save_processors_state()
{
global $processors;
$redis = Zend_Registry::get('redis');
$redis->hset('worker_procs', 'pids', json_encode($processors));
}
function spawn_processor() {
$pid = pcntl_fork();
if($pid) {
global $processors;
$processors[] = $pid;
} else {
if(posix_setsid() == -1)
die("Forked process could not detach from terminal\n");
fclose(STDIN);
fclose(STDOUT);
fclose(STDERR);
pcntl_exec('/usr/bin/php', array(PROCESSOR_EXECUTABLE));
die('Failed to fork ' . PROCESSOR_EXECUTABLE . "\n");
}
}
function spawn_processors() {
restore_processors_state();
check_processors();
save_processors_state();
}
function kill_processors() {
global $processors;
foreach($processors as $processor)
posix_kill($processor, SIGTERM);
foreach($processors as $processor)
pcntl_waitpid($processor, $trash);
unset($processors);
}
function check_processors() {
global $processors;
$valid = array();
foreach($processors as $processor) {
pcntl_waitpid($processor, $status, WNOHANG);
if(posix_getsid($processor))
$valid[] = $processor;
}
$processors = $valid;
if(count($processors) > WANT_PROCESSORS) {
for($ix = count($processors) - 1; $ix >= WANT_PROCESSORS; $ix--)
posix_kill($processors[$ix], SIGTERM);
for($ix = count($processors) - 1; $ix >= WANT_PROCESSORS; $ix--)
pcntl_waitpid($processors[$ix], $trash);
}
elseif(count($processors) < WANT_PROCESSORS) {
for($ix = count($processors); $ix < WANT_PROCESSORS; $ix++)
spawn_processor();
}
}
if( isset($argv) && count($argv) > 1 ) {
if( $argv[1] == 'kill' ) {
restore_processors_state();
kill_processors();
save_processors_state();
exit(0);
}
}
spawn_processors();
PHP must track the amount of CPU time a particular script has used in order to enforce the max_execution_time limit.
Is there a way to get access to this inside of the script? I'd like to include some logging with my tests about how much CPU was burnt in the actual PHP (the time is not incremented when the script is sitting and waiting for the database).
I am using a Linux box.
If all you need is the wall-clock time, rather than the CPU execution time, then it is simple to calculate:
//place this before any script you want to calculate time
$time_start = microtime(true);
//sample script
for($i=0; $i<1000; $i++){
//do anything
}
$time_end = microtime(true);
//dividing with 60 will give the execution time in minutes otherwise seconds
$execution_time = ($time_end - $time_start)/60;
//execution time of the script
echo '<b>Total Execution Time:</b> '.$execution_time.' Mins';
// if you get weird results, use number_format((float) $execution_time, 10)
Note that this will include the time that PHP is sat waiting for external resources such as disks or databases, which is not used for max_execution_time.
On unixoid systems (and in php 7+ on Windows as well), you can use getrusage, like:
// Script start
$rustart = getrusage();
// Code ...
// Script end
function rutime($ru, $rus, $index) {
return ($ru["ru_$index.tv_sec"]*1000 + intval($ru["ru_$index.tv_usec"]/1000))
- ($rus["ru_$index.tv_sec"]*1000 + intval($rus["ru_$index.tv_usec"]/1000));
}
$ru = getrusage();
echo "This process used " . rutime($ru, $rustart, "utime") .
" ms for its computations\n";
echo "It spent " . rutime($ru, $rustart, "stime") .
" ms in system calls\n";
Note that you don't need to calculate a difference if you are spawning a php instance for every test.
Shorter version of talal7860's answer
<?php
// At start of script
$time_start = microtime(true);
// Anywhere else in the script
echo 'Total execution time in seconds: ' . (microtime(true) - $time_start);
As pointed out, this is 'wallclock time' not 'cpu time'
<?php
// Randomize sleeping time
usleep(mt_rand(100, 10000));
// As of PHP 5.4.0, REQUEST_TIME_FLOAT is available in the $_SERVER superglobal array.
// It contains the timestamp of the start of the request with microsecond precision.
$time = microtime(true) - $_SERVER["REQUEST_TIME_FLOAT"];
echo "Did nothing in $time seconds\n";
?>
I created an ExecutionTime class out of phihag answer that you can use out of box:
class ExecutionTime
{
private $startTime;
private $endTime;
public function start() {
$this->startTime = getrusage();
}
public function end() {
$this->endTime = getrusage();
}
private function runTime($ru, $rus, $index) {
return ($ru["ru_$index.tv_sec"]*1000 + intval($ru["ru_$index.tv_usec"]/1000))
- ($rus["ru_$index.tv_sec"]*1000 + intval($rus["ru_$index.tv_usec"]/1000));
}
public function __toString() {
return "This process used " . $this->runTime($this->endTime, $this->startTime, "utime") .
" ms for its computations\nIt spent " . $this->runTime($this->endTime, $this->startTime, "stime") .
" ms in system calls\n";
}
}
Usage:
$executionTime = new ExecutionTime();
$executionTime->start();
// Code
$executionTime->end();
echo $executionTime;
Note: In PHP 5, the getrusage function only works in Unix-oid systems. Since PHP 7, it also works on Windows.
It is going to be prettier if you format the seconds output like:
echo "Process took ". number_format(microtime(true) - $start, 2). " seconds.";
will print
Process took 6.45 seconds.
This is much better than
Process took 6.4518549156189 seconds.
Gringod at developerfusion.com gives this good answer:
<!-- put this at the top of the page -->
<?php
$mtime = microtime();
$mtime = explode(" ",$mtime);
$mtime = $mtime[1] + $mtime[0];
$starttime = $mtime;
;?>
<!-- put other code and html in here -->
<!-- put this code at the bottom of the page -->
<?php
$mtime = microtime();
$mtime = explode(" ",$mtime);
$mtime = $mtime[1] + $mtime[0];
$endtime = $mtime;
$totaltime = ($endtime - $starttime);
echo "This page was created in ".$totaltime." seconds";
;?>
From (http://www.developerfusion.com/code/2058/determine-execution-time-in-php/)
To show minutes and seconds you can use:
$startTime = microtime(true);
$endTime = microtime(true);
$diff = round($endTime - $startTime);
$minutes = floor($diff / 60); // Only minutes
$seconds = $diff % 60; // Remaining seconds, using modulo operator
echo "script execution time: minutes:$minutes, seconds:$seconds"; // Value in seconds
The cheapest and dirtiest way to do it is simply make microtime() calls at places in your code you want to benchmark. Do it right before and right after database queries and it's simple to remove those durations from the rest of your script execution time.
A hint: your PHP execution time is rarely going to be the thing that makes your script timeout. If a script times out it's almost always going to be a call to an external resource.
PHP microtime documentation:
http://us.php.net/microtime
I think you should look at xdebug. The profiling options will give you a head start toward knowing many process related items.
http://www.xdebug.org/
$_SERVER['REQUEST_TIME']
check out that too. i.e.
...
// your codes running
...
echo (time() - $_SERVER['REQUEST_TIME']);
when there is closure functionality in PHP, why not we get benefit out of it.
function startTime(){
$startTime = microtime(true);
return function () use ($startTime){
return microtime(true) - $startTime;
};
}
Now with the help of the above function, we can track time like this
$stopTime = startTime();
//some code block or line
$elapsedTime = $stopTime();
Every call to startTime function will initiate a separate time tracker. So you can initiate as many as you want and can stop them wherever you want them.
Small script that print, centered in bottom of the page, the script execution that started at server call with microsecond precision.
So as not to distort the result and to be 100% compatible with content in page, I used, to write the result on the page, a browser-side native javascript snippet.
//Uncomment the line below to test with 2 seconds
//usleep(2000000);
$prec = 5; // numbers after comma
$time = number_format(microtime(true) - $_SERVER['REQUEST_TIME_FLOAT'], $prec, '.', '');
echo "<script>
if(!tI) {
var tI=document.createElement('div');
tI.style.fontSize='8px';
tI.style.marginBottom='5px';
tI.style.position='absolute';
tI.style.bottom='0px';
tI.style.textAlign='center';
tI.style.width='98%';
document.body.appendChild(tI);
}
tI.innerHTML='$time';
</script>";
Another approach is to make the snippet as small as possible, and style it with a class in your stylesheet.
Replace the echo ...; part with the following:
echo "<script>if(!tI){var tI=document.createElement('div');tI.className='ldtme';document.body.appendChild(tI);}tI.innerHTML='$time';</script>";
In your CSS create and fill the .ldtme{...} class.
I wrote a function that check remaining execution time.
Warning: Execution time counting is different on Windows and on Linux platform.
/**
* Check if more that `$miliseconds` ms remains
* to error `PHP Fatal error: Maximum execution time exceeded`
*
* #param int $miliseconds
* #return bool
*/
function isRemainingMaxExecutionTimeBiggerThan($miliseconds = 5000) {
$max_execution_time = ini_get('max_execution_time');
if ($max_execution_time === 0) {
// No script time limitation
return true;
}
if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN') {
// On Windows: The real time is measured.
$spendMiliseconds = (microtime(true) - $_SERVER["REQUEST_TIME_FLOAT"]) * 1000;
} else {
// On Linux: Any time spent on activity that happens outside the execution
// of the script such as system calls using system(), stream operations
// database queries, etc. is not included.
// #see http://php.net/manual/en/function.set-time-limit.php
$resourceUsages = getrusage();
$spendMiliseconds = $resourceUsages['ru_utime.tv_sec'] * 1000 + $resourceUsages['ru_utime.tv_usec'] / 1000;
}
$remainingMiliseconds = $max_execution_time * 1000 - $spendMiliseconds;
return ($remainingMiliseconds >= $miliseconds);
}
Using:
while (true) {
// so something
if (!isRemainingMaxExecutionTimeBiggerThan(5000)) {
// Time to die.
// Safely close DB and done the iteration.
}
}
Further expanding on Hamid's answer, I wrote a helper class that can be started and stopped repeatedly (for profiling inside a loop).
class ExecutionTime
{
private $startTime;
private $endTime;
private $compTime = 0;
private $sysTime = 0;
public function Start() {
$this->startTime = getrusage();
}
public function End() {
$this->endTime = getrusage();
$this->compTime += $this->runTime($this->endTime, $this->startTime, "utime");
$this->systemTime += $this->runTime($this->endTime, $this->startTime, "stime");
}
private function runTime($ru, $rus, $index) {
return ($ru["ru_$index.tv_sec"]*1000 + intval($ru["ru_$index.tv_usec"]/1000)) - ($rus["ru_$index.tv_sec"]*1000 + intval($rus["ru_$index.tv_usec"]/1000));
}
public function __toString() {
return "This process used " . $this->compTime . " ms for its computations\n" . "It spent " . $this->systemTime . " ms in system calls\n";
}
}
You may only want to know the execution time of parts of your script. The most flexible way to time parts or an entire script is to create 3 simple functions (procedural code given here but you could turn it into a class by putting class timer{} around it and making a couple of tweaks). This code works, just copy and paste and run:
$tstart = 0;
$tend = 0;
function timer_starts()
{
global $tstart;
$tstart=microtime(true); ;
}
function timer_ends()
{
global $tend;
$tend=microtime(true); ;
}
function timer_calc()
{
global $tstart,$tend;
return (round($tend - $tstart,2));
}
timer_starts();
file_get_contents('http://google.com');
timer_ends();
print('It took '.timer_calc().' seconds to retrieve the google page');
Just to contribute to this conversation:
what happens if the measurement targets two points A and B in different php files?
what if we need different measurements like time based, code execution duration, external resource access duration?
what if we need to organize our measurements in categories where every one has a different starting point?
As you suspect we need some global variables to be accessed by a class object or a static method: I choose the 2nd approach and here it is:
namespace g3;
class Utils {
public function __construct() {}
public static $UtilsDtStart = [];
public static $UtilsDtStats = [];
public static function dt() {
global $UtilsDtStart, $UtilsDtStats;
$obj = new \stdClass();
$obj->start = function(int $ndx = 0) use (&$UtilsDtStart) {
$UtilsDtStart[$ndx] = \microtime(true) * 1000;
};
$obj->codeStart = function(int $ndx = 0) use (&$UtilsDtStart) {
$use = \getrusage();
$UtilsDtStart[$ndx] = ($use["ru_utime.tv_sec"] * 1000) + ($use["ru_utime.tv_usec"] / 1000);
};
$obj->resourceStart = function(int $ndx = 0) use (&$UtilsDtStart) {
$use = \getrusage();
$UtilsDtStart[$ndx] = $use["ru_stime.tv_usec"] / 1000;
};
$obj->end = function(int $ndx = 0) use (&$UtilsDtStart, &$UtilsDtStats) {
$t = #$UtilsDtStart[$ndx];
if($t === null)
return false;
$end = \microtime(true) * 1000;
$dt = $end - $t;
$UtilsDtStats[$ndx][] = $dt;
return $dt;
};
$obj->codeEnd = function(int $ndx = 0) use (&$UtilsDtStart, &$UtilsDtStats) {
$t = #$UtilsDtStart[$ndx];
if($t === null)
return false;
$use = \getrusage();
$dt = ($use["ru_utime.tv_sec"] * 1000) + ($use["ru_utime.tv_usec"] / 1000) - $t;
$UtilsDtStats[$ndx][] = $dt;
return $dt;
};
$obj->resourceEnd = function(int $ndx = 0) use (&$UtilsDtStart, &$UtilsDtStats) {
$t = #$UtilsDtStart[$ndx];
if($t === null)
return false;
$use = \getrusage();
$dt = ($use["ru_stime.tv_usec"] / 1000) - $t;
$UtilsDtStats[$ndx][] = $dt;
return $dt;
};
$obj->stats = function(int $ndx = 0) use (&$UtilsDtStats) {
$s = #$UtilsDtStats[$ndx];
if($s !== null)
$s = \array_slice($s, 0);
else
$s = false;
return $s;
};
$obj->statsLength = function() use (&$UtilsDtStats) {
return \count($UtilsDtStats);
};
return $obj;
}
}
Now all you have is to call the method that belongs to the specific category with the index that denotes it's unique group:
File A
------
\call_user_func_array(\g3\Utils::dt()->start, [0]); // point A
...
File B
------
$dt = \call_user_func_array(\g3\Utils::dt()->end, [0]); // point B
Value $dt contains the milliseconds of wall clock duration between points A and B.
To estimate the time took for php code to run:
File A
------
\call_user_func_array(\g3\Utils::dt()->codeStart, [1]); // point A
...
File B
------
$dt = \call_user_func_array(\g3\Utils::dt()->codeEnd, [1]); // point B
Notice how we changed the index that we pass at the methods.
The code is based on the closure effect that happens when we return an object/function from a function (see that \g3\Utils::dt() repeated at the examples).
I tested with php unit and between different test methods at the same test file it behaves fine so far!
Hope that helps someone!
As an alternative you can just put this line in your code blocks and check php logs, for really slow functions it's pretty useful:
trigger_error("Task done at ". strftime('%H:%m:%S', time()), E_USER_NOTICE);
For serious debugging use XDebug + Cachegrind, see https://blog.nexcess.net/2011/01/29/diagnosing-slow-php-execution-with-xdebug-and-kcachegrind/
There are several way of doing this, listed here. But each have their own pro's and con's. And (in my opinion) the readability of all longer answers are terrible.
So I decided to put it all together in one answer, that is easily usable and readable.
Usage
$start = get_timers();
for( $i = 0; $i < 100000; $i++ ){
// Code to check
}
$end = get_timers();
display_timer_statistics( $start, $end );
Function definitions
function display_timer_statistics( $start_timers, $end_timers ){
// Settings
$key_width = '100px';
$decimals = 4;
$decimals_wallclock = $decimals;
$decimals_request_time_float = $decimals;
// Variables
$start_resource_usage_timer = $start_timers[0];
$start_wallclock = $start_timers[1];
$end_resource_usage_timer = $end_timers[0];
$end_wallclock = $end_timers[1];
// # User time
// Add seconds and microseconds for the start/end, and subtract from another
$end_user_time_seconds = $end_resource_usage_timer["ru_utime.tv_sec"]*1000;
$end_user_time_microseconds = intval($end_resource_usage_timer["ru_utime.tv_usec"]/1000);
$start_user_time_seconds = $start_resource_usage_timer["ru_utime.tv_sec"]*1000;
$start_user_time_microseconds = intval($start_resource_usage_timer["ru_utime.tv_usec"]/1000);
$total_user_time = ($end_user_time_seconds + $end_user_time_microseconds) - ($start_user_time_seconds + $start_user_time_microseconds);
// # System time
// Add seconds and microseconds for the start/end, and subtract from another
$end_system_time_seconds = $end_resource_usage_timer["ru_stime.tv_sec"]*1000;
$end_system_time_microseconds = intval($end_resource_usage_timer["ru_stime.tv_usec"]/1000);
$start_system_time_seconds = $start_resource_usage_timer["ru_stime.tv_sec"]*1000;
$start_system_time_microseconds = intval($start_resource_usage_timer["ru_stime.tv_usec"]/1000);
$total_system_time = ($end_system_time_seconds + $end_system_time_microseconds) - ($start_system_time_seconds + $start_system_time_microseconds);
// Wallclock
$total_wallclock_time = number_format( ( $end_wallclock - $start_wallclock), $decimals_wallclock );
// Server request_time_float
$request_time_float = microtime(true) - $_SERVER["REQUEST_TIME_FLOAT"];
$request_time_float = number_format( $request_time_float, $decimals_request_time_float );
// Print
$span_start = "<span style='width: $key_width; display: inline-block;'>";
$span_end = "</span>";
$output = "# RUNTIME AND TIMERS " . PHP_EOL ;
$output .= PHP_EOL;
$output .= $span_start . $total_user_time . $span_end . " User time (utime)" . PHP_EOL;
$output .= $span_start . $total_system_time . $span_end . " System time (stime)" . PHP_EOL;
$output .= PHP_EOL;
$output .= $span_start . $total_wallclock_time . $span_end . " Wallclock" . PHP_EOL;
$output .= PHP_EOL;
$output .= $span_start . $request_time_float . $span_end . " REQUEST_TIME_FLOAT" . PHP_EOL . PHP_EOL . PHP_EOL;
echo nl2br( $output );
}
function get_timers(){
return [ getrusage(), microtime( true ) ];
}
Glossary
All gotten from PHP docs for getrusage
Wallclock = How long it takes
ru = Resource usage
utime = User time used
stime = System time used
tv_sec = In seconds.
tv_usec = In microseconds.
tv = ?? Dunno
return microtime(true) - $_SERVER["REQUEST_TIME_FLOAT"];