I currently have a website written in PHP that uses curl_multi to poll external APIs. The server forks child processes so the polling runs standalone from web requests, and this works well, but it is limited to a per-process basis.
Occasionally it hits bandwidth bottlenecks and needs better centralized queuing logic.
I am currently trying PHP IPC with a standalone background process to handle all the outgoing requests, but I am stuck on things that casual programmers do not usually have to deal with: garbage collection, inter-process exception handling, request-response matching, and so on. Am I going the wrong way?
Is there a common practice (or implementation theory) out there, or even libraries I could make use of?
EDIT
Using localhost TCP/IP communication would double the local traffic, which is definitely not a good approach.
I am currently working on an IPC message queue with a home-brew protocol... it is not looking right at all. I would appreciate any help.
There are several different things to distinguish here:
jobs: you have N jobs to process. Executed tasks can crash or hang; either way, all jobs should be processed without any data loss.
resources: you are processing your jobs on a single machine and/or over a single connection, so you need to take care of your CPU and bandwidth.
synchronization: if your processes interact, you need to share information between them while taking care of concurrent data access.
Keep control of resources
Everybody wants to get on the bus...
Because PHP has no built-in threads, we will need to simulate mutexes. The principle is quite simple:
1 All jobs are put in a queue
2 There are N resources available in a pool, and no more
3 We iterate over the queue (a while loop on each job)
4 Before execution, a job asks for a resource from the pool
5 If a resource is available, the job is executed
6 If no resource is available, the pool hangs until a job has finished or is considered dead
How to do that in PHP?
To proceed, we have several possibilities but the principle is the same:
We have 2 programs:
There is a process launcher that will launch no more than N tasks simultaneously.
There is a process child; it represents a thread's context.
What does a process launcher look like?
The process launcher knows how many tasks should be run and runs them without taking care of their results. It only controls their execution (whether a process has started, finished, or hung, and whether N are already running).
PHP I give you the idea here; I'll give you usable examples later:
<?php
// launcher.php
require_once("ProcessesPool.php");
// The label identifies your process pool; it must be the same for your process launcher and your process children (and unique per pool)
$multi = new ProcessesPool($label = 'test');
// Initialize a new pool (creates the right directory or file, cleans a database or whatever you want)
// 10 is the maximum number of simultaneously run processes
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
// We need to launch N processes, stored in $count
$count = 100; // maybe count($jobs)
// We execute all process, one by one
for ($i = 0; ($i < $count); $i++)
{
// The waitForResource method checks how many processes are already running,
// and hangs until a resource is free or the maximum execution time is reached.
$ret = $multi->waitForResource($timeout = 10, $interval = 500000);
if ($ret)
{
// A resource is free, so we can run a new process
echo "Execute new process: $i\n";
exec("/usr/bin/php ./child.php $i > /dev/null &");
}
else
{
// Timeout is reached, we consider all children as dead and we kill them.
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
// All processes have been launched, but this does not mean they have finished their work.
// It is important to wait for the last launched processes to avoid zombies.
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000);
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
// We destroy the process pool because we run all processes.
$multi->destroy();
echo "Finish.\n";
And what about the process child, the simulated thread's context?
A child (an executed job) has only 3 things to do:
tell the process launcher it started
do its job
tell the process launcher it finished
PHP It could contain something like this:
<?php
// child.php
require_once("ProcessesPool.php");
// we create the *same* instance of the process pool
$multi = new ProcessesPool($label = 'test');
// the child tells the launcher it has started (one more resource becomes busy in the pool)
$multi->start();
// here I simulate the job's execution
sleep(rand() % 5 + 1);
// the child tells the launcher it has finished its job (one more resource becomes free in the pool)
$multi->finish();
Your usage example is nice, but where is the ProcessesPool class?
There are many ways to synchronize tasks, but it really depends on your requirements and constraints.
You can synchronize your tasks using:
a single file
a database
a directory and several files
probably other methods (such as System V IPC; a minimal semaphore sketch follows this list)
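Among those "other methods", PHP's sysvsem extension can do the resource-counting part of the pool natively, because a semaphore can be created with a maximum acquire count. This is only a minimal sketch of that idea (the ftok() key derivation and the sleep() are placeholders), not one of the implementations developed below:
<?php
// semaphore_pool_sketch.php (hypothetical example; requires the sysvsem extension, Unix only)
$key = ftok(__FILE__, 'p');   // derive an IPC key shared by the launcher and its children
$max = 10;                    // at most 10 processes may hold the semaphore at once
$sem = sem_get($key, $max);
if (sem_acquire($sem))        // blocks until one of the $max slots is free
{
    // ... the child's real work goes here (one outgoing HTTP request, for example) ...
    sleep(1);
    sem_release($sem);        // free the slot for the next waiting process
}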
As we have already seen, we need at least 7 methods:
1 create() will create an empty pool
2 start() takes a resource on the pool
3 finish() releases a resource
4 waitForResource() hangs if there are no free resources left
5 killAllResources() gets all launched jobs in the pool and kills them
6 waitForTheEnd() hangs until there are no more busy resources
7 destroy() destroys pool
So let's begin by creating an abstract class; we'll be able to implement it using the above methods later.
PHP AbstractProcessesPool.php
<?php
// AbstractProcessPool.php
abstract class AbstractProcessesPool
{
abstract protected function _createPool();
abstract protected function _cleanPool();
abstract protected function _destroyPool();
abstract protected function _getPoolAge();
abstract protected function _countPid();
abstract protected function _addPid($pid);
abstract protected function _removePid($pid);
abstract protected function _getPidList();
protected $_label;
protected $_max;
protected $_pid;
public function __construct($label)
{
$this->_max = 0;
$this->_label = $label;
$this->_pid = getmypid();
}
public function getLabel()
{
return ($this->_label);
}
public function create($max = 20)
{
$this->_max = $max;
$ret = $this->_createPool();
return $ret;
}
public function destroy()
{
$ret = $this->_destroyPool();
return $ret;
}
public function waitForResource($timeout = 120, $interval = 500000, $callback = null)
{
// let enough time for children to take a resource
usleep(200000);
while (true)
{
if (($callback != null) && (is_callable($callback)))
{
call_user_func($callback, $this);
}
$age = $this->_getPoolAge();
if ($age == -1)
{
return false;
}
if ($age > $timeout)
{
return false;
}
$count = $this->_countPid();
if ($count == -1)
{
return false;
}
if ($count < $this->_max)
{
break;
}
usleep($interval);
}
return true;
}
public function waitForTheEnd($timeout = 3600, $interval = 500000, $callback = null)
{
// let enough time for the last child to take a resource
usleep(200000);
while (true)
{
if (($callback != null) && (is_callable($callback)))
{
call_user_func($callback, $this);
}
$age = $this->_getPoolAge();
if ($age == -1)
{
return false;
}
if ($age > $timeout)
{
return false;
}
$count = $this->_countPid();
if ($count == -1)
{
return false;
}
if ($count == 0)
{
break;
}
usleep($interval);
}
return true;
}
public function start()
{
$ret = $this->_addPid($this->_pid);
return $ret;
}
public function finish()
{
$ret = $this->_removePid($this->_pid);
return $ret;
}
public function killAllResources($code = 9)
{
$pids = $this->_getPidList();
if ($pids == false)
{
$this->_cleanPool();
return false;
}
foreach ($pids as $pid)
{
$pid = intval($pid);
posix_kill($pid, $code);
if ($this->_removePid($pid) == false)
{
return false;
}
}
return true;
}
}
Synchronization using a directory and several files
If you want to use the directory method (on a /dev/ram1 partition, for example), the implementation will be:
1 create() will create an empty directory using the given $label
2 start() creates a file in the directory, named after the child's pid
3 finish() destroys the child's file
4 waitForResource() counts the files inside that directory
5 killAllResources() reads the directory content and kills all pids
6 waitForTheEnd() reads the directory until there are no more files
7 destroy() removes the directory
This method looks costly, but it is really efficient if you want to run hundreds of tasks simultaneously without opening as many database connections as there are jobs to execute.
Implementation :
PHP ProcessesPoolFiles.php
<?php
// ProcessPoolFiles.php
class ProcessesPoolFiles extends AbstractProcessesPool
{
protected $_dir;
public function __construct($label, $dir)
{
parent::__construct($label);
if ((!is_dir($dir)) || (!is_writable($dir)))
{
throw new Exception("Directory '{$dir}' does not exist or is not writable.");
}
$sha1 = sha1($label);
$this->_dir = "{$dir}/pool_{$sha1}";
}
protected function _createPool()
{
if ((!is_dir($this->_dir)) && (!mkdir($this->_dir, 0777)))
{
throw new Exception("Could not create '{$this->_dir}'");
}
if ($this->_cleanPool() == false)
{
return false;
}
return true;
}
protected function _cleanPool()
{
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
if (unlink($this->_dir . '/' . $file) == false)
{
return false;
}
}
}
closedir($dh);
return true;
}
protected function _destroyPool()
{
if ($this->_cleanPool() == false)
{
return false;
}
if (!rmdir($this->_dir))
{
return false;
}
return true;
}
protected function _getPoolAge()
{
$age = -1;
$count = 0;
$dh = opendir($this->_dir);
if ($dh == false)
{
return -1;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$stat = @stat($this->_dir . '/' . $file);
if ($stat['mtime'] > $age)
{
$age = $stat['mtime'];
}
$count++;
}
}
closedir($dh);
clearstatcache();
return (($count > 0) ? (time() - $age) : (0));
}
protected function _countPid()
{
$count = 0;
$dh = opendir($this->_dir);
if ($dh == false)
{
return -1;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$count++;
}
}
closedir($dh);
return $count;
}
protected function _addPid($pid)
{
$file = $this->_dir . "/" . $pid;
if (is_file($file))
{
return true;
}
echo "{$file}\n";
$file = fopen($file, 'w');
if ($file == false)
{
return false;
}
fclose($file);
return true;
}
protected function _removePid($pid)
{
$file = $this->_dir . "/" . $pid;
if (!is_file($file))
{
return true;
}
if (unlink($file) == false)
{
return false;
}
return true;
}
protected function _getPidList()
{
$array = array ();
$dh = opendir($this->_dir);
if ($dh == false)
{
return false;
}
while (($file = readdir($dh)) !== false)
{
if (($file != '.') && ($file != '..'))
{
$array[] = $file;
}
}
closedir($dh);
return $array;
}
}
PHP demo, the process launcher:
<?php
// pool_files_launcher.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");
$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
$count = 20;
for ($i = 0; ($i < $count); $i++)
{
$ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
if ($ret)
{
echo "Execute new process: $i\n";
exec("/usr/bin/php ./pool_files_calc.php $i > /dev/null &");
}
else
{
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
$multi->destroy();
echo "Finish.\n";
function test_waitForResource($multi)
{
echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}
function test_waitForTheEnd($multi)
{
echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}
PHP demo, the process child:
<?php
// pool_files_calc.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolFiles.php");
$multi = new ProcessesPoolFiles($label = 'test', $dir = "/tmp");
$multi->start();
// here I simulate job's execution
sleep(rand() % 7 + 1);
$multi->finish();
Synchronization using a database
MySQL If you prefer using the database method, you'll need a table like:
CREATE TABLE `processes_pool` (
`label` varchar(40) PRIMARY KEY,
`nb_launched` mediumint(6) unsigned NOT NULL,
`pid_list` varchar(2048) default NULL,
`updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Then, the implementation will be something like:
1 create() will insert a new row in the above table
2 start() adds a pid to the pid list
3 finish() removes one pid from the pid list
4 waitForResource() reads the nb_launched field
5 killAllResources() gets and kills every pid
6 waitForTheEnd() hangs and checks regularly until nb_launched equals 0
7 destroy() removes the row
Implementation:
PHP ProcessesPoolMySQL.php
<?php
// ProcessesPoolMySQL.php
class ProcessesPoolMySQL extends AbstractProcessesPool
{
protected $_sql;
public function __construct($label, PDO $sql)
{
parent::__construct($label);
$this->_sql = $sql;
$this->_label = sha1($label);
}
protected function _createPool()
{
$request = "
INSERT IGNORE INTO processes_pool
VALUES ( ?, ?, NULL, CURRENT_TIMESTAMP )
";
$this->_query($request, $this->_label, 0);
return $this->_cleanPool();
}
protected function _cleanPool()
{
$request = "
UPDATE processes_pool
SET
nb_launched = ?,
pid_list = NULL,
updated = CURRENT_TIMESTAMP
WHERE label = ?
";
$this->_query($request, 0, $this->_label);
return true;
}
protected function _destroyPool()
{
$request = "
DELETE FROM processes_pool
WHERE label = ?
";
$this->_query($request, $this->_label);
return true;
}
protected function _getPoolAge()
{
$request = "
SELECT TIMESTAMPDIFF(SECOND, updated, CURRENT_TIMESTAMP) AS age
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($request, $this->_label);
if ($ret === null)
{
return -1;
}
return $ret['age'];
}
protected function _countPid()
{
$req = "
SELECT nb_launched AS nb
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($req, $this->_label);
if ($ret === null)
{
return -1;
}
return $ret['nb'];
}
protected function _addPid($pid)
{
$request = "
UPDATE processes_pool
SET
nb_launched = (nb_launched + 1),
pid_list = CONCAT_WS(',', (SELECT IF(LENGTH(pid_list) = 0, NULL, pid_list )), ?),
updated = CURRENT_TIMESTAMP
WHERE label = ?
";
$this->_query($request, $pid, $this->_label);
return true;
}
protected function _removePid($pid)
{
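// Decrements nb_launched and rebuilds pid_list without the given pid: the query keeps the part of the comma-separated list before the pid and the part after it, dropping whichever half is empty.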
$req = "
UPDATE processes_pool
SET
nb_launched = (nb_launched - 1),
pid_list =
CONCAT_WS(',', (SELECT IF (LENGTH(
SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1))) = 0, null,
SUBSTRING_INDEX(pid_list, ',', (FIND_IN_SET(?, pid_list) - 1)))), (SELECT IF (LENGTH(
SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))))) = 0, null,
SUBSTRING_INDEX(pid_list, ',', (-1 * ((LENGTH(pid_list) - LENGTH(REPLACE(pid_list, ',', ''))) + 1 - FIND_IN_SET(?, pid_list))
)
)
)
)
),
updated = CURRENT_TIMESTAMP
WHERE label = ?";
$this->_query($req, $pid, $pid, $pid, $pid, $this->_label);
return true;
}
protected function _getPidList()
{
$req = "
SELECT pid_list
FROM processes_pool
WHERE label = ?
";
$ret = $this->_query($req, $this->_label);
if ($ret === null)
{
return false;
}
if ($ret['pid_list'] == null)
{
return array();
}
$pid_list = explode(',', $ret['pid_list']);
return $pid_list;
}
protected function _query($request)
{
$return = null;
$stmt = $this->_sql->prepare($request);
if ($stmt === false)
{
return $return;
}
$params = func_get_args();
array_shift($params);
if ($stmt->execute($params) === false)
{
return $return;
}
if (strncasecmp(trim($request), 'SELECT', 6) === 0)
{
$return = $stmt->fetch(PDO::FETCH_ASSOC);
}
return $return;
}
}
PHP demo, the process launcher:
<?php
// pool_mysql_launcher.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$multi = new ProcessesPoolMySQL($label = 'test', $dbh);
if ($multi->create($max = '10') == false)
{
echo "Pool creation failed ...\n";
exit();
}
$count = 20;
for ($i = 0; ($i < $count); $i++)
{
$ret = $multi->waitForResource($timeout = 10, $interval = 500000, 'test_waitForResource');
if ($ret)
{
echo "Execute new process: $i\n";
exec("/usr/bin/php ./pool_mysql_calc.php $i > /dev/null &");
}
else
{
echo "WaitForResources Timeout! Killing zombies...\n";
$multi->killAllResources();
break;
}
}
$ret = $multi->waitForTheEnd($timeout = 10, $interval = 500000, 'test_waitForTheEnd');
if ($ret == false)
{
echo "WaitForTheEnd Timeout! Killing zombies...\n";
$multi->killAllResources();
}
$multi->destroy();
echo "Finish.\n";
function test_waitForResource($multi)
{
echo "Waiting for available resource ( {$multi->getLabel()} )...\n";
}
function test_waitForTheEnd($multi)
{
echo "Waiting for all resources to finish ( {$multi->getLabel()} )...\n";
}
PHP demo, the process child:
<?php
// pool_mysql_calc.php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$multi = new ProcessesPoolMySQL($label = 'test', $dbh);
$multi->start();
// here I simulate job's execution
sleep(rand() % 7 + 1);
$multi->finish();
What is the output of the above code?
Demo output: these demos give - fortunately - about the same output. If the timeout is not reached (the ideal case), the output is:
KolyMac:TaskManager ninsuo$ php pool_files_launcher.php
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 10
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 11
Waiting for available resource ( test )...
Execute new process: 12
Waiting for available resource ( test )...
Execute new process: 13
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 14
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Execute new process: 15
Waiting for available resource ( test )...
Execute new process: 16
Waiting for available resource ( test )...
Execute new process: 17
Waiting for available resource ( test )...
Execute new process: 18
Waiting for available resource ( test )...
Execute new process: 19
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Waiting for all resources to finish ( test )...
Finish.
Demo output: in a worse case (changing sleep(rand() % 7 + 1); to sleep(rand() % 7 + 100);), this gives:
KolyMac:TaskManager ninsuo$ php pool_files_launcher.php
Waiting for available resource ( test )...
Execute new process: 0
Waiting for available resource ( test )...
Execute new process: 1
Waiting for available resource ( test )...
Execute new process: 2
Waiting for available resource ( test )...
Execute new process: 3
Waiting for available resource ( test )...
Execute new process: 4
Waiting for available resource ( test )...
Execute new process: 5
Waiting for available resource ( test )...
Execute new process: 6
Waiting for available resource ( test )...
Execute new process: 7
Waiting for available resource ( test )...
Execute new process: 8
Waiting for available resource ( test )...
Execute new process: 9
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
(...)
Waiting for available resource ( test )...
Waiting for available resource ( test )...
Waiting for available resource ( test )...
WaitForResources Timeout! Killing zombies...
Waiting for all resources to finish ( test )...
Finish.
Keep control of results
No mistakes allowed!
YEAH! You can now launch tons of processes without worrying about resources. But what if one child process fails? There would be an undone or incomplete job!...
In fact, this is simpler (far simpler) than controlling process execution. We have a queue of jobs executed using a pool, and we only need to know whether each one failed or succeeded after its execution. If there are failures once the whole queue has been executed, the failed jobs are put into a new queue and executed again.
How to proceed in PHP ?
This principle is based on clusters: the queue contains several jobs, but represents only one entity. Each calculation of the cluster must succeed for that entity to be complete.
Roadmap:
1 We create a todo list (not to be confused with the queue used for process management) containing all calculations of the cluster. Each job has a status: waiting (not executed), running (started but not finished), success or failed (according to its result); of course, at this step, every status is WAITING.
2 We run all jobs using the process manager (to keep control of resources); each one begins by telling the task manager it is running, and, according to its own context, finishes by indicating its state (failed or success).
3 When the entire queue has been executed, the task manager creates a new queue with the failed jobs and loops again.
4 When all jobs succeed, you're done and you're sure nothing went wrong. Your cluster is complete and your entity is usable at the upper level.
Proof of concept
There is nothing more to tell about the subject, so let's write some code, continuing the previous sample code.
As with process management, you can use several methods to synchronize parents and children, but there is no complicated logic here, so there is no need for an abstraction.
So I developed a MySQL example (faster to write); you are free to adapt this concept to your requirements and constraints.
MySQL Create the following table:
CREATE TABLE `tasks_manager` (
`cluster_label` varchar(40),
`calcul_label` varchar(40),
`status` enum('waiting', 'running', 'failed', 'success') default 'waiting',
`updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
PRIMARY KEY (`cluster_label`, `calcul_label`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
PHP Here is the TasksManager.php file:
<?php
class TasksManager
{
protected $_cluster_label;
protected $_calcul_label;
protected $_sql;
const WAITING = "waiting";
const RUNNING = "running";
const SUCCESS = "success";
const FAILED = "failed";
public function __construct($label, PDO $sql)
{
$this->_sql = $sql;
$this->_cluster_label = substr($label, 0, 40);
}
public function getClusterLabel()
{
return $this->_cluster_label;
}
public function getCalculLabel()
{
return $this->_calcul_label;
}
public function destroy()
{
$request = "
DELETE FROM tasks_manager
WHERE cluster_label = ?
";
$this->_query($request, $this->_cluster_label);
return $this;
}
public function start($calcul_label)
{
$this->_calcul_label = $calcul_label;
$this->add($calcul_label, TasksManager::RUNNING);
return $this;
}
public function finish($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
if (is_null($this->_calcul_label))
{
throw new Exception("finish() called, but task never started.");
}
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
AND calcul_label = ?
";
$this->_query($request, $status, $this->_cluster_label, substr($this->_calcul_label, 0, 40));
return $this;
}
public function add($calcul_label, $status = TasksManager::WAITING)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
INSERT INTO tasks_manager (
cluster_label, calcul_label, status
) VALUES (
?, ?, ?
)
ON DUPLICATE KEY UPDATE
status = ?
";
$calcul_label = substr($calcul_label, 0, 40);
$this->_query($request, $this->_cluster_label, $calcul_label, $status, $status);
return $this;
}
public function delete($calcul_label)
{
$request = "
DELETE FROM tasks_manager
WHERE cluster_label = ?
AND calcul_label = ?
";
$this->_query($request, $this->_cluster_label, substr($calcul_label, 0, 40));
return $this;
}
public function countStatus($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
SELECT COUNT(*) AS cnt
FROM tasks_manager
WHERE cluster_label = ?
AND status = ?
";
$ret = $this->_query($request, $this->_cluster_label, $status);
return $ret[0]['cnt'];
}
public function count()
{
$request = "
SELECT COUNT(*) AS cnt
FROM tasks_manager
WHERE cluster_label = ?
";
$ret = $this->_query($request, $this->_cluster_label);
return $ret[0]['cnt'];
}
public function getCalculsByStatus($status = TasksManager::SUCCESS)
{
if (!$this->_isStatus($status))
{
throw new Exception("{$status} is not a valid status.");
}
$request = "
SELECT calcul_label
FROM tasks_manager
WHERE cluster_label = ?
AND status = ?
";
$ret = $this->_query($request, $this->_cluster_label, $status);
$array = array();
if (!is_null($ret))
{
$array = array_map(function($row) {
return $row['calcul_label'];
}, $ret);
}
return $array;
}
public function switchStatus($statusA = TasksManager::RUNNING, $statusB = null)
{
if (!$this->_isStatus($statusA))
{
throw new Exception("{$statusA} is not a valid status.");
}
if ((!is_null($statusB)) && (!$this->_isStatus($statusB)))
{
throw new Exception("{$statusB} is not a valid status.");
}
if ($statusB != null)
{
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
AND status = ?
";
$this->_query($request, $statusB, $this->_cluster_label, $statusA);
}
else
{
$request = "
UPDATE tasks_manager
SET status = ?
WHERE cluster_label = ?
";
$this->_query($request, $statusA, $this->_cluster_label);
}
return $this;
}
private function _isStatus($status)
{
if (!is_string($status))
{
return false;
}
return in_array($status, array(
self::FAILED,
self::RUNNING,
self::SUCCESS,
self::WAITING,
));
}
protected function _query($request)
{
$return = null;
$stmt = $this->_sql->prepare($request);
if ($stmt === false)
{
return $return;
}
$params = func_get_args();
array_shift($params);
if ($stmt->execute($params) === false)
{
return $return;
}
if (strncasecmp(trim($request), 'SELECT', 6) === 0)
{
$return = $stmt->fetchAll(PDO::FETCH_ASSOC);
}
return $return;
}
}
PHP The task_launcher.php file is the usage example:
<?php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);
$pool->create($max = "10");
// Initializing task manager
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->destroy();
// Simulating jobs
$count = 20;
$todo_list = array ();
for ($i = 0; ($i < $count); $i++)
{
$todo_list[$i] = "Job {$i}";
$multi->add($todo_list[$i], TasksManager::WAITING);
}
// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
$continue = false;
echo "Starting to run jobs in queue ...\n";
// put all failed jobs to WAITING status
$multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);
foreach ($todo_list as $job)
{
$ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");
if ($ret)
{
echo "Executing job: $job\n";
exec(sprintf("/usr/bin/php ./tasks_program.php %s > /dev/null &", escapeshellarg($job)));
}
else
{
echo "waitForResource timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
break;
}
}
$ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
if ($ret == false)
{
echo "waitForTheEnd timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
}
echo "All jobs in queue executed, looking for errors...\n";
// Counts if there is failures
$nb_failed = $multi->countStatus(TasksManager::FAILED);
if ($nb_failed > 0)
{
$todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
$continue = true;
}
}
function waitResource($multi)
{
echo "Waiting for a resource ....\n";
}
function waitEnd($multi)
{
echo "Waiting for the end .....\n";
}
// All jobs finished, destroying task manager
$multi->destroy();
// Destroying process pool
$pool->destroy();
echo "Finish.\n";
PHP And here is the child program, tasks_program.php (a single calculation):
<?php
if (!isset($argv[1]))
{
die("This program must be called with an identifier (calcul_label)\n");
}
$calcul_label = $argv[1];
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "pool test", $dbh);
// Takes one resource in pool
$pool->start();
// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "jobs test", $dbh);
$multi->start($calcul_label);
// Simulating execution time
$secs = (rand() % 2) + 3;
sleep($secs);
// Simulating job status
$status = rand() % 3 == 0 ? TasksManager::FAILED : TasksManager::SUCCESS;
// The job finishes by indicating its status
$multi->finish($status);
// Releasing pool's resource
$pool->finish();
Demo output: this demo gives output similar to the runs above (too large to include here).
Synchronization and communication between processes
There are solutions for communicating easily and without errors.
Grab a coffee, we're close to the end!
We are now able to launch tons of processes, and they all give the expected result; well, that's not bad.
But for now, all our processes are executed stand-alone and can't actually communicate with each other. That is your core problem, and there are many solutions.
It is really difficult to tell you exactly what kind of communication you need. You were speaking about what you tried (IPC, communication using files, or home-made protocols), but not about what kind of information is shared between your processes.
Anyway, I invite you to think about an OOP solution.
PHP is powerful.
PHP has magic methods:
__get($property) lets us implement access to a $property on an object
__set($property, $value) lets us implement assignment of a $property on an object
PHP can handle files, with concurrent-access management
fopen($file, 'c+') opens a file in a mode suited to advisory locking (allowing you to use flock)
flock($descriptor, LOCK_SH) takes a shared lock (for reading)
flock($descriptor, LOCK_EX) takes an exclusive lock (for writing)
Finally, PHP has:
json_encode($object) to get a json representation of an object
json_decode($string) to get back an object from a json string
You see where I'm going? We will create a Synchro class that works the same way as stdClass, but is always safely synchronized through a file. Our processes will be able to access the same instance of that object at the same time.
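As a quick preview of the intended usage (a minimal sketch; the file path and the counter property are made-up examples), two processes opening the same file see the same properties:
<?php
// synchro_usage_sketch.php (hypothetical example)
require_once("Synchro.php");
// Process A
$shared = new Synchro("/tmp/synchro.txt");
$shared->counter = 1;            // __set(): exclusive lock, object written to the file as JSON
// Process B, possibly running at the same time
$shared = new Synchro("/tmp/synchro.txt");
echo $shared->counter . "\n";    // __get(): shared lock, JSON read back, prints 1
Note that each property access is one lock/unlock cycle, so a read-modify-write sequence such as $shared->counter = $shared->counter + 1 is not atomic as a whole.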
Some Linux system tricks
Of course, if you have 150 processes working on the same file at the same time, your hard drive will slow down your processes. To handle this issue, why not create a filesystem partition in RAM? Writing to that file will be about as quick as writing to memory!
shell As root, type the following commands:
mkfs -q /dev/ram1 65536
mkdir -p /ram
mount /dev/ram1 /ram
Some notes:
65536 is in kilobytes; here you get a 64 MB partition.
if you want to mount that partition at startup, create a shell script and call it from the /etc/rc.local file.
Implementation
PHP Here is the Synchro.php class.
<?php
class Synchro
{
private $_file;
public function __construct($file)
{
$this->_file = $file;
}
public function __get($property)
{
// File does not exist
if (!is_file($this->_file))
{
return null;
}
// Check if file is readable
if ((is_file($this->_file)) && (!is_readable($this->_file)))
{
throw new Exception(sprintf("File '%s' is not readable.", $this->_file));
}
// Open file with advisory lock option enabled for reading and writing
if (($fd = fopen($this->_file, 'c+')) === false)
{
throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
}
// Request a lock for reading (hangs until lock is granted successfully)
if (flock($fd, LOCK_SH) === false)
{
throw new Exception(sprintf("Can't lock '%s' file for reading.", $this->_file));
}
// A hand-made file_get_contents
$contents = '';
while (($read = fread($fd, 32 * 1024)) !== '')
{
$contents .= $read;
}
// Release shared lock and close file
flock($fd, LOCK_UN);
fclose($fd);
// Restore shared data object and return requested property
$object = json_decode($contents);
if (property_exists($object, $property))
{
return $object->{$property};
}
return null;
}
public function __set($property, $value)
{
// Check if directory is writable if file does not exist
if ((!is_file($this->_file)) && (!is_writable(dirname($this->_file))))
{
throw new Exception(sprintf("Directory '%s' does not exist or is not writable.", dirname($this->_file)));
}
// Check if file is writable if it exists
if ((is_file($this->_file)) && (!is_writable($this->_file)))
{
throw new Exception(sprintf("File '%s' is not writable.", $this->_file));
}
// Open file with advisory lock option enabled for reading and writing
if (($fd = fopen($this->_file, 'c+')) === false)
{
throw new Exception(sprintf("Can't open '%s' file.", $this->_file));
}
// Request a lock for writing (hangs until lock is granted successfully)
if (flock($fd, LOCK_EX) === false)
{
throw new Exception(sprintf("Can't lock '%s' file for writing.", $this->_file));
}
// A hand-made file_get_contents
$contents = '';
while (($read = fread($fd, 32 * 1024)) !== '')
{
$contents .= $read;
}
// Restore shared data object and set value for desired property
if (empty($contents))
{
$object = new stdClass();
}
else
{
$object = json_decode($contents);
}
$object->{$property} = $value;
// Go back to the beginning of the file
rewind($fd);
// Truncate the file before writing (if the new JSON is shorter than the old contents, stale bytes would otherwise remain at the end)
ftruncate($fd, 0);
// Save shared data object to the file
fwrite($fd, json_encode($object));
// Release exclusive lock and close file
flock($fd, LOCK_UN);
fclose($fd);
return $value;
}
}
Demonstration
We will continue (and finish) our processes / tasks example by making our processes communicate with each other.
Rules:
Our goal is to get the sum of all numbers between 1 and 20.
We have 20 processes, with IDs from 1 to 20.
Those processes are randomly queued for execution.
Each process (except process 1) can do only one calculation: its ID plus the previous process's result
Process 1 directly stores its ID
Each process succeeds if it can do its calculation (that is, if the previous process's result is available); otherwise it fails (and is a candidate for a new queue)
The pool's timeout expires after 10 seconds
Well, it looks complicated, but it is actually a good representation of what you'll find in a real-life situation.
PHP synchro_launcher.php file.
<?php
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");
// Removing old synchronized object
if (is_file("/tmp/synchro.txt"))
{
unlink("/tmp/synchro.txt");
}
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);
$pool->create($max = "10");
// Initializing task manager
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->destroy();
// Simulating jobs
$todo_list = array ();
for ($i = 1; ($i <= 20); $i++)
{
$todo_list[$i] = $i;
$multi->add($todo_list[$i], TasksManager::WAITING);
}
// Infinite loop until all jobs are done
$continue = true;
while ($continue)
{
$continue = false;
echo "Starting to run jobs in queue ...\n";
// Shuffle all jobs (else this will be too easy :-))
shuffle($todo_list);
// put all failed jobs to WAITING status
$multi->switchStatus(TasksManager::FAILED, TasksManager::WAITING);
foreach ($todo_list as $job)
{
$ret = $pool->waitForResource($timeout = 10, $interval = 500000, "waitResource");
if ($ret)
{
echo "Executing job: $job\n";
exec(sprintf("/usr/bin/php ./synchro_program.php %s > /dev/null &", escapeshellarg($job)));
}
else
{
echo "waitForResource timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
break;
}
}
$ret = $pool->waitForTheEnd($timeout = 10, $interval = 500000, "waitEnd");
if ($ret == false)
{
echo "waitForTheEnd timeout!\n";
$pool->killAllResources();
// All jobs currently running are considered dead, so, failed
$multi->switchStatus(TasksManager::RUNNING, TasksManager::FAILED);
}
echo "All jobs in queue executed, looking for errors...\n";
// Counts if there is failures
$multi->switchStatus(TasksManager::WAITING, TasksManager::FAILED);
$nb_failed = $multi->countStatus(TasksManager::FAILED);
if ($nb_failed > 0)
{
$todo_list = $multi->getCalculsByStatus(TasksManager::FAILED);
echo sprintf("%d jobs failed: %s\n", $nb_failed, implode(', ', $todo_list));
$continue = true;
}
}
function waitResource($multi)
{
echo "Waiting for a resource ....\n";
}
function waitEnd($multi)
{
echo "Waiting for the end .....\n";
}
// All jobs finished, destroying task manager
$multi->destroy();
// Destroying process pool
$pool->destroy();
// Recovering final result
$synchro = new Synchro("/tmp/synchro.txt");
echo sprintf("Result of the sum of all numbers between 1 and 20 included is: %d\n", $synchro->result20);
echo "Finish.\n";
PHP And its associated synchro_program.php file (the child program launched above):
<?php
if (!isset($argv[1]))
{
die("This program must be called with an identifier (calcul_label)\n");
}
$current_id = $argv[1];
require_once("AbstractProcessesPool.php");
require_once("ProcessesPoolMySQL.php");
require_once("TasksManager.php");
require_once("Synchro.php");
// Initializing database connection
$dbh = new PDO("mysql:host=127.0.0.1;dbname=fuz", 'root', 'root');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Initializing process pool (with same label as parent)
$pool = new ProcessesPoolMySQL($label = "synchro pool", $dbh);
// Takes one resource in pool
$pool->start();
// Initializing task manager (with same label as parent)
$multi = new TasksManager($label = "synchro tasks", $dbh);
$multi->start($current_id);
// ------------------------------------------------------
// Job begins here
$synchro = new Synchro("/tmp/synchro.txt");
if ($current_id == 1)
{
$synchro->result1 = 1;
$status = TasksManager::SUCCESS;
}
else
{
$previous_id = $current_id - 1;
if (is_null($synchro->{"result{$previous_id}"}))
{
$status = TasksManager::FAILED;
}
else
{
$synchro->{"result{$current_id}"} = $synchro->{"result{$previous_id}"} + $current_id;
$status = TasksManager::SUCCESS;
}
}
// ------------------------------------------------------
// The job finishes by indicating its status
$multi->finish($status);
// Releasing pool's resource
$pool->finish();
Output
The demo above gives output similar to the earlier runs (too large to include here).
Conclusion
Task management in PHP is not really easy because of the missing threads. Like many developers, I hope this feature will be built in someday. Anyway, it is possible to control resources and results, and to share data between processes, so with some work we can do task management efficiently.
Synchronization and communication can be done in several ways, but you need to weigh the pros and cons of each one according to your constraints and requirements. For example:
if you need to launch 500 tasks at once and want to use the MySQL synchronization method, you'll need 1+500 simultaneous connections to the database (it may not appreciate that).
If you need to share a large amount of data, using only one file may be inefficient.
If you are using files for synchronization, don't forget to have a look at built-in system tools such as RAM disks (/dev/ram).
Try to stay as object-oriented as possible when handling these problems. Home-made protocols and the like will make your app harder to maintain.
I've given you my two cents on this interesting subject, and I hope it gives you some ideas for solving your issues.
I recommend you take a look at this library called PHP-Queue: https://github.com/CoderKungfu/php-queue
A short description from its GitHub page:
A unified front-end for different queuing backends. Includes a REST
server, CLI interface and daemon runners.
Check out its github page for more details.
With a bit of tinkering, I think this library will help you solve your problem.
Hope this helps.
I am working with SilverStripe, and I am building a news page.
I use the DataObjectAsPage module (http://www.ssbits.com/tutorials/2012/dataobject-as-pages-the-module/), and I got it working when I use the admin to publish news items.
Now I want to use the DataObjectManager module instead of the admin module to manage my news items, but this is where the problem appears. Everything works fine in draft mode: I can make a new news item and it shows up in draft. But when I publish a news item, it won't show up in live or published mode.
I'm using the following tables:
-Dataobjectaspage table,
-Dataobjectaspage_live table,
-NewsArticle table,
-NewsArticle_Live table
On publishing, the articles are inserted into the Dataobjectaspage table and into the NewsArticle table... but not into the _Live tables...
It seems the doPublish() function isn't called while 'publishing'.
So I'm trying to use the following:
function onAfterWrite() {
parent::onAfterWrite();
DataObjectAsPage::doPublish();
}
But when I use this, I get an error.
It seems to be stuck in a loop...
I've got the NewsArticle.php file where I use this function:
function onAfterWrite() {
parent::onAfterWrite();
DataObjectAsPage::doPublish();
}
This function calls the DataObjectAsPage.php file and uses this code:
function doPublish() {
if (!$this->canPublish()) return false;
$original = Versioned::get_one_by_stage("DataObjectAsPage", "Live", "\"DataObjectAsPage\".\"ID\" = $this->ID");
if(!$original) $original = new DataObjectAsPage();
// Handle activities undertaken by decorators
$this->invokeWithExtensions('onBeforePublish', $original);
$this->Status = "Published";
//$this->PublishedByID = Member::currentUser()->ID;
$this->write();
$this->publish("Stage", "Live");
// Handle activities undertaken by decorators
$this->invokeWithExtensions('onAfterPublish', $original);
return true;
}
And then it goes to the DataObject.php file and uses the write() function:
public function write($showDebug = false, $forceInsert = false, $forceWrite = false, $writeComponents = false) {
$firstWrite = false;
$this->brokenOnWrite = true;
$isNewRecord = false;
if(self::get_validation_enabled()) {
$valid = $this->validate();
if(!$valid->valid()) {
// Used by DODs to clean up after themselves, eg, Versioned
$this->extend('onAfterSkippedWrite');
throw new ValidationException($valid, "Validation error writing a $this->class object: " . $valid->message() . ". Object not written.", E_USER_WARNING);
return false;
}
}
$this->onBeforeWrite();
if($this->brokenOnWrite) {
user_error("$this->class has a broken onBeforeWrite() function. Make sure that you call parent::onBeforeWrite().", E_USER_ERROR);
}
// New record = everything has changed
if(($this->ID && is_numeric($this->ID)) && !$forceInsert) {
$dbCommand = 'update';
// Update the changed array with references to changed obj-fields
foreach($this->record as $k => $v) {
if(is_object($v) && method_exists($v, 'isChanged') && $v->isChanged()) {
$this->changed[$k] = true;
}
}
} else{
$dbCommand = 'insert';
$this->changed = array();
foreach($this->record as $k => $v) {
$this->changed[$k] = 2;
}
$firstWrite = true;
}
// No changes made
if($this->changed) {
foreach($this->getClassAncestry() as $ancestor) {
if(self::has_own_table($ancestor))
$ancestry[] = $ancestor;
}
// Look for some changes to make
if(!$forceInsert) unset($this->changed['ID']);
$hasChanges = false;
foreach($this->changed as $fieldName => $changed) {
if($changed) {
$hasChanges = true;
break;
}
}
if($hasChanges || $forceWrite || !$this->record['ID']) {
// New records have their insert into the base data table done first, so that they can pass the
// generated primary key on to the rest of the manipulation
$baseTable = $ancestry[0];
if((!isset($this->record['ID']) || !$this->record['ID']) && isset($ancestry[0])) {
DB::query("INSERT INTO \"{$baseTable}\" (\"Created\") VALUES (" . DB::getConn()->now() . ")");
$this->record['ID'] = DB::getGeneratedID($baseTable);
$this->changed['ID'] = 2;
$isNewRecord = true;
}
// Divvy up field saving into a number of database manipulations
$manipulation = array();
if(isset($ancestry) && is_array($ancestry)) {
foreach($ancestry as $idx => $class) {
$classSingleton = singleton($class);
foreach($this->record as $fieldName => $fieldValue) {
if(isset($this->changed[$fieldName]) && $this->changed[$fieldName] && $fieldType = $classSingleton->hasOwnTableDatabaseField($fieldName)) {
$fieldObj = $this->dbObject($fieldName);
if(!isset($manipulation[$class])) $manipulation[$class] = array();
// if database column doesn't correlate to a DBField instance...
if(!$fieldObj) {
$fieldObj = DBField::create('Varchar', $this->record[$fieldName], $fieldName);
}
// Both CompositeDBFields and regular fields need to be repopulated
$fieldObj->setValue($this->record[$fieldName], $this->record);
if($class != $baseTable || $fieldName!='ID')
$fieldObj->writeToManipulation($manipulation[$class]);
}
}
// Add the class name to the base object
if($idx == 0) {
$manipulation[$class]['fields']["LastEdited"] = "'".SS_Datetime::now()->Rfc2822()."'";
if($dbCommand == 'insert') {
$manipulation[$class]['fields']["Created"] = "'".SS_Datetime::now()->Rfc2822()."'";
//echo "<li>$this->class - " .get_class($this);
$manipulation[$class]['fields']["ClassName"] = "'$this->class'";
}
}
// In cases where there are no fields, this 'stub' will get picked up on
if(self::has_own_table($class)) {
$manipulation[$class]['command'] = $dbCommand;
$manipulation[$class]['id'] = $this->record['ID'];
} else {
unset($manipulation[$class]);
}
}
}
$this->extend('augmentWrite', $manipulation);
// New records have their insert into the base data table done first, so that they can pass the
// generated ID on to the rest of the manipulation
if(isset($isNewRecord) && $isNewRecord && isset($manipulation[$baseTable])) {
$manipulation[$baseTable]['command'] = 'update';
}
DB::manipulate($manipulation);
if(isset($isNewRecord) && $isNewRecord) {
DataObjectLog::addedObject($this);
} else {
DataObjectLog::changedObject($this);
}
$this->onAfterWrite();
$this->changed = null;
} elseif ( $showDebug ) {
echo "<b>Debug:</b> no changes for DataObject<br />";
// Used by DODs to clean up after themselves, eg, Versioned
$this->extend('onAfterSkippedWrite');
}
// Clears the cache for this object so get_one returns the correct object.
$this->flushCache();
if(!isset($this->record['Created'])) {
$this->record['Created'] = SS_Datetime::now()->Rfc2822();
}
$this->record['LastEdited'] = SS_Datetime::now()->Rfc2822();
} else {
// Used by DODs to clean up after themselves, eg, Versioned
$this->extend('onAfterSkippedWrite');
}
// Write ComponentSets as necessary
if($writeComponents) {
$this->writeComponents(true);
}
return $this->record['ID'];
}
Look at the $this->onAfterWrite();
It probably goes to my own function in NewsArticle.php, and that is where the loop starts! I'm not sure though, so I could use some help!
Does anyone know how to use the doPublish() function?
The reason that is happening is that the doPublish() method of DataObjectAsPage calls ->write() (see your 3rd code sample).
So what happens is: it calls ->write(); at the end of ->write() your onAfterWrite() method is called, which calls doPublish(), which calls write() again.
If you remove the onAfterWrite() function that you've added, it should work as expected.
The doPublish() method on DataObjectAsPage will take care of publishing from Stage to Live for you.
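If you do want publishing to stay automatic on every write, one possible alternative (only a sketch, assuming NewsArticle extends DataObjectAsPage and calling doPublish() on the instance rather than statically; the guard property name is made up) is a simple re-entrancy guard so the write() triggered by publishing does not call doPublish() again:
<?php
// NewsArticle.php (sketch only)
class NewsArticle extends DataObjectAsPage
{
    protected $isPublishing = false; // guards against the write() -> doPublish() -> write() loop
    function onAfterWrite()
    {
        parent::onAfterWrite();
        if (!$this->isPublishing)
        {
            $this->isPublishing = true;
            $this->doPublish();      // the write() inside doPublish() re-enters onAfterWrite(), but the guard stops the recursion
            $this->isPublishing = false;
        }
    }
}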
I have a MySQL table holding lots of records that I want to give the user access to. I don't want to dump the entire table onto the page, so I need to break it up into 25 records at a time; hence I need a page index. You have probably seen these on other pages; they look something like this at the base of the page:
< 1 2 3 4 5 6 7 8 9 >
For example, when the user clicks the '4' link, the page refreshes and the offset moves on (4th page x 25 records). Here is what I already have:
function CreatePageIndex($ItemsPerPage, $TotalNumberOfItems, $CurrentOffset, $URL, $URLArguments = array())
{
$FirstIndexDone = false;
foreach($URLArguments as $Key => $Value)
{
if($FirstIndexDone == false)
{
$URL .= sprintf("?%s=%s", $Key, $Value);
$FirstIndexDone = true;
}
else
{
$URL .= sprintf("&%s=%s", $Key, $Value);
}
}
Print("<div id=\"ResultsNavigation\">");
Print("Page: ");
Print("<span class=\"Links\">");
$NumberOfPages = ceil($TotalNumberOfItems / $ItemsPerPage);
for($x = 0; $x < $NumberOfPages; $x++)
{
if($x == $CurrentOffset / $ItemsPerPage)
{
Print("<span class=\"Selected\">".($x + 1)." </span>");
}
else
{
if(empty($URLArguments))
{
Print("".($x + 1)." ");
}
else
{
Print("".($x + 1)." ");
}
}
}
Print("</span>");
Print(" (".$TotalNumberOfItems." results)");
Print("</div>");
}
Obviously this piece of code does not create a dynamic index; it just dumps the whole index at the bottom of the page, one link for every page available. What I need is a dynamic solution that only shows the previous 5 pages and the next 5 pages (if they exist), along with a >> or something to move ahead 5 or so pages.
Has anybody seen an elegant and reusable way of implementing this? I feel like I'm re-inventing the wheel. Any help is appreciated.
Zend Framework is becoming a useful collection and includes a Zend_Paginator class, which might be worth a look. Bit of a learning curve and might only be worth it if you want to invest the time in using other classes from the framework.
It's not too hard to roll your own though. Get a total count of records with a COUNT(*) query, then obtain a page of results with a LIMIT clause.
For example, if you want 20 items per page, page 1 would use LIMIT 0,20 while page 2 would use LIMIT 20,20:
$count=getTotalItemCount();
$pagesize=20;
$totalpages=ceil($count/$pagesize);
$currentpage=isset($_GET['pg'])?intval($_GET['pg']):1;
$currentpage=min(max($currentpage, 1),$totalpages);
$offset=($currentpage-1)*$pagesize;
$limit="LIMIT $offset,$pagesize";
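To get the sliding window of page links the question asks for (only a few pages before and after the current one), you can clamp the loop bounds around $currentpage. Here is a minimal sketch continuing the variables above (the 'pg' parameter name is the same assumption as in the snippet, and the "Selected" class is borrowed from the question's code):
<?php
// Sliding window of page links around $currentpage (sketch, continues the snippet above)
$window = 5;
$first  = max(1, $currentpage - $window);
$last   = min($totalpages, $currentpage + $window);
if ($first > 1) {
    echo '<a href="?pg=1">&laquo;</a> ';                    // jump back to the first page
}
for ($p = $first; $p <= $last; $p++) {
    if ($p == $currentpage) {
        echo '<span class="Selected">' . $p . '</span> ';   // current page, not a link
    } else {
        echo '<a href="?pg=' . $p . '">' . $p . '</a> ';
    }
}
if ($last < $totalpages) {
    echo '<a href="?pg=' . $totalpages . '">&raquo;</a> ';  // jump ahead to the last page
}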
It's called Pagination:
a few examples:
A nice one without SQL
A long tutorial
Another tutorial
And Another
And of course.. google
How about this jQuery-plugin?
So all the work is done on the clientside.
http://plugins.jquery.com/project/pagination
demo: http://d-scribe.de/webtools/jquery-pagination/demo/demo_options.htm
Here's an old class I dug out that I used to use in PHP. Now I handle most of this in JavaScript. The object takes an array (the one you are splitting into pages) and returns the current view. This can become tedious on giant tables, so keep that in mind. I generally use it for paging through small data sets of under 1000 items. It can also optionally generate your jump menu for you.
class pagination {
function pageTotal($resultCount, $splitCount) {
if (is_numeric($resultCount) && is_numeric($splitCount)) {
if ($resultCount > $splitCount) {
$pageAverage = (integer)$resultCount / $splitCount;
$pageTotal = ceil($pageAverage);
return $pageTotal;
} else {
return 1;
}
} else {
return false;
}
}
function pageTotalFromStack($resultStack, $splitCount) {
if (is_numeric($splitCount) && is_array($resultStack)) {
if (count($resultStack) > $splitCount) {
$resultCount = count($resultStack);
$pageAverage = (integer)$resultCount / $splitCount;
$pageTotal = ceil($pageAverage);
return $pageTotal;
} else {
return 1;
}
} else {
return false;
}
}
function makePaginationURL($preURL, $pageTotal, $selected=0, $linkAttr=0, $selectedAttr=0) {
if (!empty($preURL) && $pageTotal >= 1) {
$pageSeed = 1;
$passFlag = 0;
$regLink = '<a href="{url}&p={page}"';
if (is_array($linkAttr)) $regLink .= $this->setAttributes($linkAttr); //set attributes
$regLink .= '>{page}</a>';
$selLink = '<a href="{url}&p={page}"';
if (is_array($selectedAttr)) $selLink .= $this->setAttributes($selectedAttr); //set attributes
$selLink .= '>{page}</a>';
while($pageSeed <= $pageTotal) {
if ($pageSeed == $selected) {
$newPageLink = str_replace('{url}', $preURL, $selLink);
$newPageLink = str_replace('{page}', $pageSeed, $newPageLink);
} else {
$newPageLink = str_replace('{url}', $preURL, $regLink);
$newPageLink = str_replace('{page}', $pageSeed, $newPageLink);
}
if ($passFlag == 0) {
$passFlag = 1;
$linkStack = $newPageLink;
} else {
$linkStack .= ', ' . $newPageLink;
}
$pageSeed++;
}
return $linkStack;
} else {
return false;
}
}
function splitPageArrayStack($stackArray, $chunkSize) {
if (is_array($stackArray) && is_numeric($chunkSize)) {
return $multiArray = array_chunk($stackArray, $chunkSize);
} else {
return false;
}
}
}
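For reference, here is a minimal usage sketch of this class (the file name, the $rows data, and the URL are made-up examples). It splits the stack into pages, renders one chunk, and prints the jump links; the optional attribute arrays are left at their defaults because they rely on a setAttributes() helper not shown in the class above.
<?php
// pagination_usage_sketch.php (hypothetical example)
require_once("pagination.php");          // assuming the class above is saved in this file
$pager   = new pagination();
$rows    = range(1, 95);                 // pretend result set of 95 items
$perPage = 20;
$totalPages  = $pager->pageTotal(count($rows), $perPage);     // ceil(95 / 20) = 5
$currentPage = isset($_GET['p']) ? intval($_GET['p']) : 1;
$currentPage = min(max($currentPage, 1), $totalPages);        // clamp to a valid page
// array_chunk()-based split: index 0 holds page 1, and so on
$chunks = $pager->splitPageArrayStack($rows, $perPage);
foreach ($chunks[$currentPage - 1] as $item) {
    echo $item . "<br />\n";             // render the current page of items
}
// comma-separated "1, 2, 3..." jump menu
echo $pager->makePaginationURL("list.php?q=foo", $totalPages, $currentPage);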