Connection errors in forked PHP processes - php

I have a PHP script that takes N documents from MongoDB, forks into K child PHP processes, and each process does some work on each document and tries to update the document's info (see the code below).
On my local environment (Docker) everything is fine, but on the server (no Docker there), strange things sometimes happen during the loop...
Randomly, all forked processes fail to connect to MongoDB. The updateOne command returns an error:
"Failed to send "update" command with database "databasename": Invalid reply from server. in /vendor/mongodb/mongodb/src/Operation/Update.php on line 158".
This happens to all processes at the same time, but only for one (or several) random loop iterations. When each process moves on to the next iteration (takes the next document), everything is fine again. I make 5 attempts to write to MongoDB.
Each attempt adds one second of delay to the previous one: the first attempt happens immediately; if an exception is caught, I wait a second and try again; the next attempt waits 2 seconds, and so on. But this does not help: all 5 attempts fail (a rough sketch of this retry is shown after the code below).
This is not a MongoDB problem: its log is empty, and it doesn't even receive anything from PHP when the error happens.
I have also noticed that the more simultaneous processes I run, the more frequently the errors occur.
It is not a server resource problem either: when the error occurs, half of the RAM (4 GB) is free and the CPU is running at about half capacity.
Maybe PHP has some configuration for this? Some memory limits or something...
I use PHP v 7.1.30
MongoDB v 3.2.16
PHP Package mongodb/mongodb v 1.1.2
<?php
$processesAmount = 5;
$documents = $this->mongoResource->getDocuments();

for ($processNumber = 0; $processNumber < $processesAmount; $processNumber++) {
    // create child process
    $pid = pcntl_fork();

    // do not create new processes in child processes
    if ($pid === 0) {
        break;
    }

    if ($pid === -1) {
        // some error-handling stuff here...
    } else if ($pid === 0) {
        // create new MongoDB connection
    } else {
        // Protect against zombie children:
        // the main process waits until all child processes end
        for ($i = 0; $i < $processesAmount; $i++) {
            pcntl_wait($status);
        }
        return null;
    }

    // spread documents to each process without duplicates
    for ($i = $processNumber; $i < count($documents); $i += $processesAmount) {
        $newDocumentData = $this->doSomeStaffWithDocument($documents[$i]);
        $this->mongoResource->updateDocument($documents[$i], $newDocumentData);
    }
}
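The retry-with-increasing-delay mentioned above happens inside updateDocument(); it looks roughly like this sketch (the property, method and exception handling shown here are placeholders, not the actual code):

public function updateDocument($document, $newData)
{
    for ($try = 1; $try <= 5; $try++) {
        try {
            $this->collection->updateOne(
                ['_id' => $document['_id']],
                ['$set' => $newData]
            );
            return;
        } catch (\MongoDB\Driver\Exception\Exception $e) {
            if ($try < 5) {
                sleep($try); // wait 1s, 2s, 3s, 4s before the next attempt
            }
        }
    }
    throw new \RuntimeException('All 5 update attempts failed');
}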

There could be many issues here. One is that all processes share a single DB connection, so the first one to disconnect kills the connection for all of them. Check the second example in the docs here: https://www.php.net/manual/en/ref.pcntl.php
If that doesn't help: the way I read your code, the "spreading" part happens in every process, when it should happen once. Shouldn't you put the "work" in the child section, like below?
$processesAmount = 5;
$documents = $this->mongoResource->getDocuments();
$numDocs = count($documents);
$children = [];

for ($processNumber = 0; $processNumber < $processesAmount; $processNumber++) {
    // create child
    $pid = pcntl_fork();
    if ($pid === -1) {
        // some error-handling stuff here...
    } else if ($pid) {
        // parent: remember the child's PID
        $children[] = $pid;
    } else {
        // child: take every Nth document, starting at this process's offset
        for ($i = $processNumber; $i < $numDocs; $i += $processesAmount) {
            $doc = $documents[$i] ?? null;
            if ($doc === null) {
                continue;
            }
            $newDocumentData = $this->doSomeStaffWithDocument($doc);
            $this->mongoResource->updateDocument($doc, $newDocumentData);
        }
        exit(0); // the child must exit so it does not fork children of its own
    }
}

// protect against zombies and wait for the children
// $children is always empty unless we are in the parent
while (!empty($children)) {
    foreach ($children as $key => $pid) {
        $status = null;
        $res = pcntl_waitpid($pid, $status, WNOHANG);
        if ($res == -1 || $res > 0) { // the process has already exited
            unset($children[$key]);
        }
    }
}
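If the shared-connection theory is right, the other half of the fix is to open a new MongoDB client inside each child after the fork instead of reusing one created before forking. A minimal standalone sketch, assuming the mongodb/mongodb library; the connection URI, database and collection names are placeholders, not taken from the question:

<?php
require 'vendor/autoload.php';

$pid = pcntl_fork();
if ($pid === -1) {
    exit('Could not fork');
}
if ($pid === 0) {
    // child: build its OWN client here, never before the fork,
    // so it cannot tear down a connection the parent or a sibling is using
    $client = new MongoDB\Client('mongodb://127.0.0.1:27017');
    $collection = $client->selectCollection('databasename', 'documents');
    $collection->updateOne(['_id' => 1], ['$set' => ['processed' => true]]);
    exit(0);
}
pcntl_waitpid($pid, $status); // parent reaps the child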

Related

Interrupt a function from another function

I have a Symfony job that has two functions:
launch and stop.
My launch function imports contacts from the database, for example 4 at a time, and sends a message to each of them.
public function launchAction()
{
    $offset = 0;
    $limit = 4;
    // $sizeData holds the total number of contacts (set elsewhere)
    $sizeData /= $limit;
    for ($i = 0; $i < $sizeData; $i++) {
        $contacts = $repository->getListByLimit($offset, $limit);
        $sender->setContacts($contacts);
        $sender->send();
        $offset += $limit;
    }
}
When I launch it, it takes, for example, 20 seconds to import all contacts and send them the message.
But if I want to stop it, how can the stop function interrupt the launch function?
public function stopAction()
{
}
I will not answer this fully, but here are two hints on how it could work.
1:
Save a file with the process id on launch();
on stop() you check for its existence and kill the process by its id (see the sketch below).
2:
In launch() you check for a specific DB entry inside the loop and break if the value is present;
on stop() you set this DB entry.
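A minimal sketch of the first hint, assuming a writable path in /tmp and the posix/pcntl extensions; the constant name and the signal are my choices, not part of the original answer:

<?php
const PID_FILE = '/tmp/launch_job.pid';

function launch()
{
    file_put_contents(PID_FILE, getmypid()); // remember which process is running
    // ... import contacts and send messages here ...
}

function stop()
{
    if (is_file(PID_FILE)) {
        $pid = (int) file_get_contents(PID_FILE);
        posix_kill($pid, SIGTERM); // ask the running job to terminate
        unlink(PID_FILE);
    }
}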
If your only purpose is to be able to stop the script, I don't think you need a full event-loop implementation. You can listen on a local socket and break when you receive data.
For example, you could run this in launchAction:
public function launchAction()
{
    $offset = 0;
    $limit = 4;
    // $sizeData holds the total number of contacts (set elsewhere)
    $sizeData /= $limit;

    // Init IPC connection
    $server = stream_socket_server("tcp://127.0.0.1:1337", $errno, $errorMessage);
    if ($server === false) {
        throw new UnexpectedValueException("Could not bind to socket: $errorMessage");
    }

    for ($i = 0; $i < $sizeData; $i++) {
        // Check our socket for data (a short timeout keeps the loop moving)
        $client = @stream_socket_accept($server, 1);
        if ($client) {
            // Read sent data
            $data = fread($client, 1024);
            // Probably break
            if ($data === 'whatever') {
                break;
            }
        }
        $contacts = $repository->getListByLimit($offset, $limit);
        $sender->setContacts($contacts);
        $sender->send();
        $offset += $limit;
    }

    // Close the sockets after sending all messages
    if (isset($client) && is_resource($client)) {
        fclose($client);
    }
    fclose($server);
}
And stopAction could hit the socket to terminate the connection like so:
public function stopAction()
{
    $socket = stream_socket_client('tcp://127.0.0.1:1337');
    fwrite($socket, 'whatever');
    fclose($socket);
}
This should work if you run both functions on the same machine. Also note that PHP can only listen on sockets that are not already occupied, so you might need to change the port number. And if you start a second process to send messages in parallel, the new one will not be able to bind to the same socket.
A great blogpost explaining some detail is https://www.christophh.net/2012/07/24/php-socket-programming/
If, however, you wish to start a long-running process, I suggest you take a look at ReactPHP, an excellent event-loop implementation that runs on several different setups. It also includes timers and other useful libraries.
You might want to take a look at this blogpost series, to get an idea https://blog.wyrihaximus.net/2015/01/reactphp-introduction/

Symfony 1.4 functional test - reduce memory usage

I have a csv file that defines routes to test, and the expected status code each route should return.
I am working on a functional test that iterates over the csv file and makes a request to each route, then checks to see if the proper status code is returned.
$browser = new sfTestFunctional(new sfBrowser());
foreach ($routes as $route)
{
    $browser->
        get($route['path'])->
        with('response')->begin()->
            isStatusCode($route['code'])->
        end()
    ;
    print(memory_get_usage());
}
/*************** OUTPUT: *************************
ok 1 - status code is 200
97953280# get /first_path
ok 2 - status code is 200
109607536# get /second_path
ok 3 - status code is 403
119152936# get /third_path
ok 4 - status code is 200
130283760# get /fourth_path
ok 5 - status code is 200
140082888# get /fifth_path
...
/***************************************************/
This continues until I get an allowed memory exhausted error.
I have increased the amount of allowed memory, which temporarily solved the problem. That is not a permanent solution since more routes will be added to the csv file over time.
Is there a way to reduce the amount of memory this test is using?
I faced the same out-of-memory problem. I needed to crawl a very long list of URIs (around 30K) to generate the HTML cache. Thanks to Marek, I tried forking processes. There is still a small leak, but it is insignificant.
As input I had a text file with one URI per line. Of course, you can easily adapt the following script to a CSV.
const NUMBER_OF_PROCESS = 4;
const SIZE_OF_GROUPS = 5;

require_once(dirname(__FILE__).'/../../config/ProjectConfiguration.class.php');
$configuration = ProjectConfiguration::getApplicationConfiguration('frontend', 'prod', false);
sfContext::createInstance($configuration);

$file = new SplFileObject(dirname(__FILE__).'/list-of-uri.txt');
while ($file->valid())
{
    $count = 0;
    $uris = array();
    while ($file->valid() && $count < NUMBER_OF_PROCESS * SIZE_OF_GROUPS) {
        $uris[] = trim($file->current());
        $file->next();
        $count++;
    }

    $urisGroups = array_chunk($uris, SIZE_OF_GROUPS);
    $childs = array();
    echo "---\t\t\t Forking ".sizeof($urisGroups)." process \t\t\t ---\n";
    foreach ($urisGroups as $uriGroup) {
        $pid = pcntl_fork();
        if ($pid == -1)
            die('Could not fork');
        if (!$pid) {
            $b = new sfBrowser();
            foreach ($uriGroup as $key => $uri) {
                $starttime = microtime(true);
                $b->get($uri);
                $time = microtime(true) - $starttime;
                echo 'Mem: '.memory_get_peak_usage().' - '.$time.'s - URI N°'.($key + 1).' PID '.getmypid().' - Status: '.$b->getResponse()->getStatusCode().' - URI: '.$uri."\n";
            }
            exit();
        }
        if ($pid) {
            $childs[] = $pid;
        }
    }

    while (count($childs) > 0) {
        foreach ($childs as $key => $pid) {
            $res = pcntl_waitpid($pid, $status, WNOHANG);
            // If the process has already exited
            if ($res == -1 || $res > 0)
                unset($childs[$key]);
        }
        sleep(1);
    }
}
const NUMBER_OF_PROCESS defines the number of parallel processes (so you save time if you have a multi-core processor).
const SIZE_OF_GROUPS defines the number of URIs that will be crawled by sfBrowser in each process. You can decrease it if you still have out-of-memory problems.

pcntl_fork() results in defunct parent process

So, I have this PHP daemon worker that listens for IPC messages.
The weird thing is that the parent process (the result of pcntl_fork) leaves a [php] <defunct> process until the child process is done, but ONLY when the script is started from a cronjob, not directly from the command line.
I know <defunct> processes aren't evil, but I can't figure out why it happens only when running from a cronjob.
Command
/path/to/php/binary/php /path/to/php/file/IpcServer.php
Forking code:
$iParent = posix_getpid();
$iChild = pcntl_fork();

if ($iChild == -1) {
    throw new Exception("Unable to fork into child process.");
} elseif ($iChild) {
    echo "Forking into background [{$iChild}].\n";
    Log::d('Killing parent process (' . $iParent . ').');
    exit;
}
Output
Forking into background [20835].
Killing parent process (20834).
ps aux | grep php
root 20834 0.0 0.0 0 0 ? Zs 14:28 0:00 [php] <defunct>
root 20835 0.0 0.2 275620 8064 ? Ss 15:35 0:00 /path/to/php/binary/php /path/to/php/file/IpcServer.php
I've found out that it's a known Apache bug, but then why do I get this bug when running from a cronjob?
In PHP, a child becomes a zombie process unless the parent waits for it to return with pcntl_wait() or pcntl_waitpid(). The zombies are destroyed once all processes have ended or have been handled. It looks like the parent will become a zombie too if the children aren't handled and a child runs longer than the parent.
Example from the pcntl_fork page:
$pid = pcntl_fork();
if ($pid == -1) {
    die('could not fork');
} else if ($pid) {
    // we are the parent
    pcntl_wait($status); // Protect against Zombie children
} else {
    // we are the child
}
Or use signal handling like so to prevent waiting on the main thread:
$processes = array();    // List of running processes
$signal_queue = array(); // List of signals for main thread read

// register signal handler
pcntl_signal(SIGCHLD, 'childSignalHandler');

// fork. Can loop around this for lots of children too.
switch ($pid = pcntl_fork()) {
    case -1: // FAILED
        break;
    case 0: // CHILD
        break;
    default: // PARENT
        // ID the process. Auto Increment or whatever unique thing you want
        $processes[$pid] = $someID;
        if (isset($signal_queue[$pid])) {
            childSignalHandler(SIGCHLD, $pid, $signal_queue[$pid]);
            unset($signal_queue[$pid]);
        }
        break;
}
function childSignalHandler($signo, $pid = null, $status = null)
{
    global $processes, $signal_queue;
    // If no pid is provided, let's wait to figure out which child process ended
    if (!$pid) {
        $pid = pcntl_waitpid(-1, $status, WNOHANG);
    }
    // Get all exited children
    while ($pid > 0) {
        if ($pid && isset($processes[$pid])) {
            // I don't care about exit status right now.
            // $exitCode = pcntl_wexitstatus($status);
            // if ($exitCode != 0) {
            //     echo "$pid exited with status ".$exitCode."\n";
            // }
            // Process is finished, so remove it from the list.
            unset($processes[$pid]);
        } else if ($pid) {
            // Job finished before the parent process could record it as launched.
            // Store it to handle when the parent process is ready.
            $signal_queue[$pid] = $status;
        }
        $pid = pcntl_waitpid(-1, $status, WNOHANG);
    }
    return true;
}
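Note that a handler registered with pcntl_signal() only runs when signals are actually dispatched. A minimal addition, assuming PHP 7.1+ where pcntl_async_signals() is available; on older versions you would use declare(ticks=1) or call pcntl_signal_dispatch() periodically instead:

<?php
// Without this, the SIGCHLD handler above would never fire:
// pcntl_signal() only registers the callback, it does not dispatch signals.
pcntl_async_signals(true);
pcntl_signal(SIGCHLD, 'childSignalHandler');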

How to run a PHP script asynchronously?

I am creating a PHP script that will be run via the command line. As part of this script, there are times where I might need to spawn/fork a different script that could take a long time to complete. I don't want to block the original script from completing. If I were doing this with JavaScript, I could run AJAX requests in the background. That is essentially what I am trying to do here. I don't need to know when the forks complete, just that they start and complete themselves.
How can I run these PHP scripts asynchronously?
foreach ($lotsOfItems as $item) {
    if ($item->needsExtraHelp) {
        // start some asynchronous process here, and pass it $item
    }
}
$pids = array();
foreach ($lotsOfItems as $item) {
    if ($item->needsExtraHelp) {
        $pid = pcntl_fork();
        if ($pid == 0) {
            // you're in the child
            var_dump($item);
            exit(0); // don't forget this one!!
        } else if ($pid == -1) {
            // failed to fork process
        } else {
            // you're in the parent
            $pids[] = $pid;
        }
    }
    usleep(100); // prevent CPU from peaking
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $exitcode, WNOHANG); // prevents zombie processes
    }
}
Looking at the user-contributed notes on exec, it looks like you could use it. Check out:
http://de3.php.net/manual/en/function.exec.php#86329
<?php
function execInBackground($cmd) {
    if (substr(php_uname(), 0, 7) == "Windows") {
        pclose(popen("start /B " . $cmd, "r"));
    } else {
        exec($cmd . " > /dev/null &");
    }
}
?>
This will execute $cmd in the background (no cmd window) without PHP waiting for it to finish, on both Windows and Unix.
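Usage is then a single call; the script path here is just an example, not part of the original note:

execInBackground('php /path/to/long_running_worker.php');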
int pcntl_fork ( void )
The pcntl_fork() function creates a child process that differs from the parent process only in its PID and PPID. Please see your system's fork(2) man page for specific details as to how fork works on your system.
details : http://php.net/manual/en/function.pcntl-fork.php
related question : PHP: What does pcntl_fork() really do?
Process control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.
details: http://www.php.net/manual/en/intro.pcntl.php
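A small guard along those lines (a sketch; the message text is mine) keeps fork-based code out of a web server environment:

<?php
// Only allow fork-based code to run from the CLI,
// in line with the manual's warning above.
if (php_sapi_name() !== 'cli') {
    exit('This script must be run from the command line.');
}
$pid = pcntl_fork();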

Explain basic PHP Socket server code

I'm trying to learn how to make a chat with a socket server.
I noticed everybody uses the same code (a rip-off from the Zend Developer Zone).
The problem is that no one really explains how it works, especially the cryptic code after while(true) {.
This would benefit many, so I hope someone could take the time to explain the code in detail (DETAIL!).
You can find the code here
I'll answer it myself. I went over it line by line; this is how it works (I'm only explaining the part inside the while(true) loop).
1.
// Setup clients listen socket for reading
$read[0] = $sock;
for ($i = 0; $i < $max_clients; $i++) {
    if (isset($client[$i]['sock']))
        $read[$i + 1] = $client[$i]['sock'];
}
This assigns the listening socket and any existing client connections to the $read array, so they will be watched for incoming data.
// Set up a blocking call to socket_select()
if (socket_select($read, $write = NULL, $except = NULL, $tv_sec = 5) < 1)
    continue;
Watches the $read array for new data: socket_select() blocks for up to $tv_sec seconds and, on return, strips from $read every socket that has nothing to read.
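A minimal standalone illustration of that in-place behaviour (not part of the chat server; the port is arbitrary):

<?php
// socket_select() edits $read in place: after the call it only contains
// the sockets that are ready for reading (or have a pending connection).
$listen = socket_create_listen(8080);
$read   = [$listen];
$write  = null;
$except = null;
if (socket_select($read, $write, $except, 5) > 0) {
    // $read now holds only the ready sockets; here that can only be $listen
    $conn = socket_accept($listen);
}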
/* if a new connection is being made add it to the client array */
if (in_array($sock, $read)) {
    for ($i = 0; $i < $max_clients; $i++) {
        if (empty($client[$i]['sock'])) {
            $client[$i]['sock'] = socket_accept($sock);
            echo "New client connected $i\r\n";
            break;
        } elseif ($i == $max_clients - 1) {
            echo "Too many clients...\r\n";
        }
    }
}
This detects when a new connection is being made, then finds an empty spot in the $client array and stores the new socket there.
This next part I'll split up for easier explanation.
for ($i = 0; $i < $max_clients; $i++) { // for each client
    if (isset($client[$i]['sock'])) {
This loops through the whole $client array but only works on the entries that actually have a connection.
        if (in_array($client[$i]['sock'], $read)) {
            $input = socket_read($client[$i]['sock'], 1024);
            if ($input == null) {
                echo "Client disconnecting $i\r\n";
                // Zero length string meaning disconnected
                unset($client[$i]);
            } else {
                echo "New input received $i\r\n";
                // send it to the other clients
                for ($j = 0; $j < $max_clients; $j++) {
                    if (isset($client[$j]['sock']) && $j != $i) {
                        echo "Writing '$input' to client $j\r\n";
                        socket_write($client[$j]['sock'], $input, strlen($input));
                    }
                }
                if ($input == 'exit') {
                    // requested disconnect
                    socket_close($client[$i]['sock']);
                }
            }
        } else {
            echo "Client disconnected $i\r\n";
            // Close the socket
            socket_close($client[$i]['sock']);
            unset($client[$i]);
        }
First it checks whether the client's socket showed up in $read; if not, it treats the client as disconnected and closes the socket.
If it did show up, the data is read; a zero-length read means the client is disconnecting.
If there is data, it is passed along to all the other clients (but not back to the sender).
That's it. I hope I got it right.
PHP is not multithreaded, so you can't build a good socket server with it.
Use Python instead.
http://docs.python.org/library/socketserver.html
http://www.rohitab.com/discuss/index.php?showtopic=24808
