I want to run parallel processes to process at least 4 users simultaneously.
ob_end_flush();
for ($i = 0; $i < 4; $i++) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        echo("Could not create a new child.");
        exit;
    } else if ($pid == 0) {
        // This will get executed only in the child process
        echo 'parent1';
        sleep(5);
        echo 'parent2';
        exit;
    }
}
The output is something like this:
> php testscript.php
parent1
parent1
parent1
parent1
> parent2
parent2
parent2
parent2
What currently happens is that after parent1 is displayed 4 times, control of the script is lost and a new empty command prompt appears. After waiting 5 more seconds, the remaining output arrives.
How can I avoid that from happening? I want to be able to break the script whenever I want.
EDIT
If I send a signal via the terminal, say by pressing CTRL+C, the script does not stop. I need a way to tell the script to stop at that instant.
You are creating dirty zombie/parentless processes that are being adopted by others who certainly don't want them either =)
When you're working with forking you need to understand that you're creating a copy of the current process (stack/heap and all), giving it a new process id (PID) and setting its parent process id (PPID) to the forking process, which is what makes the spawned process a child. Here the parent of these 4 new children finished all its apparent work and then exited without any error, because it had nothing left to do. But as any respectful parent should, your parent process should stick around and take care of its children. That's where signals become useful, especially the SIGCHLD signal, which is sent to a parent process whose child has just passed away. Combine that with a small event-handling loop and your parent can gracefully manage its children \o/
tl;dr I have an example fork wrapper on my blog: http://codehackit.blogspot.be/2012/01/childpids-break-pcntlsignaldispatch.html
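For illustration, here is a minimal sketch of that idea (my own, not the wrapper from the blog, and assuming PHP 5.3+ for pcntl_signal_dispatch()): the parent installs a SIGCHLD handler, reaps children as they exit, and runs a small dispatch loop until they are all gone.
<?php
// A parent that keeps track of its children: reap them in a SIGCHLD handler
// and run a small dispatch loop until every child has been collected.
$children = array();

function reap_children($signo)
{
    global $children;
    // Several children may have exited before we get here, and SIGCHLD
    // signals are not queued, so reap in a loop. WNOHANG keeps it non-blocking.
    while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
        unset($children[$pid]);
        echo "Reaped child {$pid}".PHP_EOL;
    }
}
pcntl_signal(SIGCHLD, 'reap_children');

for ($i = 0; $i < 4; $i++) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        die("Could not create a new child.".PHP_EOL);
    } elseif ($pid == 0) {
        sleep(5);               // the child's work goes here
        exit(0);
    }
    $children[$pid] = true;     // parent remembers its child
}

// Small event loop: deliver pending signals until all children are gone.
while (count($children) > 0) {
    pcntl_signal_dispatch();    // PHP 5.3+; on older versions use declare(ticks = 1)
    usleep(100000);
}
echo "All children finished, parent exiting.".PHP_EOL;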
You need to use pcntl_signal to set up a signal handler for CTRL+C (which corresponds to SIGINT, with a value of 2) if you want to react to it in a custom fashion.
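For example, something along these lines should work (a rough sketch of my own, not tested against the exact script above): install a SIGINT handler that sets a flag, then let the fork loop and the wait loop check that flag.
<?php
// Catch CTRL+C (SIGINT): the handler only sets a flag, and the normal code
// paths check that flag so the script can stop cleanly at that instant.
declare(ticks = 1);             // make PHP deliver pending signals regularly

$shutdown = false;
$children = array();

function sigint_handler($signo)
{
    global $shutdown;
    echo "Caught SIGINT ({$signo}), shutting down...".PHP_EOL;
    $shutdown = true;
}
pcntl_signal(SIGINT, 'sigint_handler');

for ($i = 0; $i < 4 && !$shutdown; $i++) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        die("Could not fork".PHP_EOL);
    }
    if ($pid == 0) {
        sleep(5);               // the child's work; it inherits the handler too
        exit(0);
    }
    $children[] = $pid;
}

// Reap all children; if CTRL+C was pressed, ask them to stop first.
while (count($children) > 0) {
    if ($shutdown) {
        foreach ($children as $pid) {
            posix_kill($pid, SIGTERM);
        }
    }
    $pid = pcntl_wait($status);         // returns -1 if interrupted by a signal
    if ($pid > 0) {
        unset($children[array_search($pid, $children)]);
    }
}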
Related
While building a page to serve back to the user, I'd like to submit a background script to do some numerical analysis of data in the database and send the result to the user in an email. This process may take a minute or so, so I don't want to delay the page serve while it runs.
Is there a way to trigger another PHP script from the script that's building the page so it can send the page and be done while the other script runs in the background?
For testing, this TEST.PHP:
<?php
set_time_limit(0);
mail ('myemail@myemail.com','Test Email from ClockPie', 'foobar');
?>
Then I put this in the script that builds the page serve:
...
shell_exec ('test.php');
...
I'm running under Windoze 7 Home Premium. Is there something obviously wrong with this?
And yes, I know this is essentially a duplicate question and there are other existing questions about this same thing, but I'm too much of a peon here on StackOverflow to simply add comments and participate in the discussions :-(
If you want to fork another PHP process, you would need to use pcntl_fork. However I don't think that this is the best method here. Instead I would make the script accessed by the browser add the data to be processed to a queue. Then set up a scheduled task to run every 'x' number of minutes that will process the queue and email the user when done.
I am assuming, of course, that you will make this scheduled script extremely lightweight when there is actually nothing in the queue (i.e. it should take a few milliseconds to complete). If not, you will be bogging down your server every few minutes, and in that case looking into something like pcntl_fork would be the better solution.
One disadvantage of doing it this way is if you set it to run, for example, every 5 minutes, but it only takes 2 minutes to process the data, the user would be required to wait up to 3 extra minutes to receive the email. If this is a problem, tweak it to run more often.
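As a rough illustration of the queue idea (the email_queue table, the runNumericalAnalysis() helper and the mysqli credentials are all placeholders I've made up, not anything from your code), the cron-run worker could look something like this:
<?php
// process_queue.php -- run from cron, e.g.:  */5 * * * * php /path/to/process_queue.php
// Hypothetical schema: email_queue(id, recipient, payload, processed_at NULL until done)

$db = new mysqli('localhost', 'user', 'pass', 'mydb');   // placeholder credentials

$result = $db->query("SELECT id, recipient, payload FROM email_queue WHERE processed_at IS NULL");

// Nothing queued? Exit right away so the cron run costs almost nothing.
if ($result->num_rows === 0) {
    exit(0);
}

while ($row = $result->fetch_assoc()) {
    $body = runNumericalAnalysis($row['payload']);        // the slow work happens here
    mail($row['recipient'], 'Your analysis results', $body);

    $id = (int) $row['id'];
    $stmt = $db->prepare("UPDATE email_queue SET processed_at = NOW() WHERE id = ?");
    $stmt->bind_param('i', $id);
    $stmt->execute();
}

function runNumericalAnalysis($payload)
{
    // stand-in for the real number crunching
    return "Results for: " . $payload;
}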
shell_exec() is just like sitting at a command prompt, typing the string value and pressing return. You can't just type "test.php" and have it fire; you need to run PHP and point it at your PHP file.
Depending on how you have PHP installed, you may be able to simply change it to shell_exec('php test.php');, or you may have to provide a path to PHP, like shell_exec('C:\php\bin\php.exe test.php');. The path will depend on your environment.
In the code you show, you're just sending a single email, which shouldn't take more than a few seconds, so I wouldn't even go down this route. But I assume this isn't the end result of your code, just a sample.
What you can do is output the page and flush() it so all data gets sent to the user. Then you send the email. The user will see the page while, in the background, it's still loading. This doesn't require anything like shell_exec().
This method is also used by software like phpBB to handle cron jobs that take quite some time.
flush() documentation: http://php.net/manual/en/function.flush.php
Basic Program Architecture:
Build & Echo Page
flush()
Send Email
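A minimal sketch of that architecture (the page markup, the email address and the sleep() standing in for the real analysis are placeholders):
<?php
// Build & echo the page, flush it to the browser, then do the slow work.
ignore_user_abort(true);    // keep running even if the user navigates away
set_time_limit(0);

// 1. Build & Echo Page
echo "<html><body>Your report is being prepared and will be emailed to you.</body></html>";

// 2. flush()
flush();
if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request();   // under PHP-FPM this closes the connection entirely
}

// 3. Send Email (the sleep stands in for the numerical analysis)
sleep(60);
mail('user@example.com', 'Your analysis results', 'foobar');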
You can use shell_exec() to run your 'sub PHP script' asynchronously, meaning it will run in the background while your main PHP script continues, by appending the & symbol to the command that fires off your 'sub PHP script'. So, it would be something like this:
$cmd="php /path/to/you/php/sub/script.php &";
shell_exec($cmd);
//main script continues running here, while sub script runs in the background...
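One caveat: on Unix-like systems shell_exec() will still block until the command's output stream is closed, so you usually want to redirect stdout/stderr as well; and on Windows the & trick doesn't apply at all. A rough sketch (paths are placeholders):
$cmd = "php /path/to/your/php/sub/script.php > /dev/null 2>&1 &";
shell_exec($cmd);
// main script continues immediately; the sub script's output goes to /dev/null

// On Windows (the asker's environment) '&' does not background the command;
// one common workaround is popen() with "start /B", e.g. (path is a placeholder):
// pclose(popen('start /B php C:\path\to\test.php', 'r'));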
The following code works on my system. Note that I'm just a hobbyist, not an expert, so this might not fall under the category of 'best practices', and I have no idea what the security implications might be, but this absolutely works, creating multiple threads that all run concurrently. Never mind about the folder name 'Calories'. That just happens to be the folder I was working in when I threw together this example code.
main.php:
<?php
error_log('Hello, world, from main!');
$numberOfThreadsToCreate = 3;
for ($i = 0; $i < $numberOfThreadsToCreate; ++$i) {
    error_log("Main starting child {$i}");
    $fp = fsockopen('localhost', 8888, $errno, $errstr);
    if (!$fp) {
        error_log("$errstr ($errno)");
        exit;
    }
    $firstSleep = $numberOfThreadsToCreate - $i;
    $header = "GET /Calories/thread.php?threadID={$i}&firstSleep={$firstSleep}"
            . " HTTP/1.1\r\n"
            . "Host: localhost\r\n"
            . "Connection: Close\r\n\r\n";
    $r = fputs($fp, $header);
    fclose($fp);
    sleep(1);
}
for ($i = 0; $i < 5; ++$i) {
    sleep(1);
    error_log('Main is still running');
}
error_log("Goodbye, cruel world, from main!");
thread.php
<?php
$myThreadID = $_GET['threadID'];
$sleep = $_GET['firstSleep'];
error_log("Hello, world, from child thread, ID={$myThreadID}!");
for ($i = 0; $i < 5; ++$i) {
    error_log("Child {$myThreadID} sleeping for {$sleep} seconds");
    sleep($sleep);
    $sleep = 1;
}
error_log("Goodbye, cruel world, from child thread, ID={$myThreadID}!");
And the logfile results:
Hello, world, from main!
Main starting child 0
Hello, world, from child thread, ID=0!
Child 0 sleeping for 3 seconds
Main starting child 1
Hello, world, from child thread, ID=1!
Child 1 sleeping for 2 seconds
Main starting child 2
Hello, world, from child thread, ID=2!
Child 2 sleeping for 1 seconds
Child 1 sleeping for 1 seconds
Child 2 sleeping for 1 seconds
Child 0 sleeping for 1 seconds
Child 1 sleeping for 1 seconds
Child 2 sleeping for 1 seconds
Child 0 sleeping for 1 seconds
Main is still running
Child 1 sleeping for 1 seconds
Child 2 sleeping for 1 seconds
Child 0 sleeping for 1 seconds
Main is still running
Child 1 sleeping for 1 seconds
Child 2 sleeping for 1 seconds
Child 0 sleeping for 1 seconds
Main is still running
Goodbye, cruel world, from child thread, ID=1!
Goodbye, cruel world, from child thread, ID=2!
Main is still running
Goodbye, cruel world, from child thread, ID=0!
Main is still running
Goodbye, cruel world, from main!
Disclaimer
I am well aware that PHP might not have been the best choice in this case for a socket server. Please refrain from suggesting
different languages/platforms - believe me - I've heard it from all
directions.
Working in a Unix environment and using PHP 5.2.17, my situation is as follows - I have constructed a socket server in PHP that communicates with Flash clients. My first hurdle was that each incoming connection blocked the subsequent connections until it had finished being processed. I solved this by utilizing PHP's pcntl_fork(). I was successfully able to spawn numerous child processes (saving their PIDs in the parent) that took care of broadcasting messages to the other clients, thereby "releasing" the parent process and allowing it to continue to process the next connection[s].
My main issue right now is handling the collection of these dead/zombie child processes and terminating them. I have read (over and over) the relevant PHP manual pages for pcntl_fork() and realize that the parent process is in charge of cleaning up its children. The parent process receives a SIGCHLD signal from its child when the child executes an exit(0). I am able to "catch" that signal using the pcntl_signal() function to set up a signal handler.
My signal_handler looks like this :
declare(ticks = 1);

function sig_handler($signo){
    global $forks; // this is an array that holds all the child PID's
    foreach ($forks AS $key => $childPid){
        echo "has my child {$childPid} gone away?".PHP_EOL;
        if (posix_kill($childPid, 9)){
            echo "Child {$childPid} has tragically died!".PHP_EOL;
            unset($forks[$key]);
        }
    }
}
I am indeed seeing both echoes, including the relevant and correct child PID that needs to be removed, but it seems that
posix_kill($childPid, 9)
which I understand to be synonymous with kill -9 $childPid, is returning TRUE although it is in fact NOT removing the process...
Taken from the man pages of posix_kill :
Returns TRUE on success or FALSE on failure.
I am monitoring the child processes with the ps command. They appear like this on the system :
web5 5296 5234 0 14:51 ? 00:00:00 [php] <defunct>
web5 5321 5234 0 14:51 ? 00:00:00 [php] <defunct>
web5 5466 5234 0 14:52 ? 00:00:00 [php] <defunct>
As you can see, all these processes are child processes of the parent, which has the PID 5234.
Am I missing something in my understanding? I seem to have managed to get everything to work (and it does), but I am left with countless zombie processes on the system!
My plans for a zombie apocalypse are rock solid -
but what on earth can I do when even sudo kill -9 does not kill the zombie child processes?
Update 10 Days later
I've answered this question myself after some additional research; if you can still stand my ramblings, proceed at will.
I promise there is a solution at the end :P
Alright... so here we are, 10 days later and I believe that I have solved this issue. I didn't want to add onto an already longish post so I'll include in this answer some of the things that I tried.
Taking @sym's advice, and reading further into the documentation and the comments on it, the pcntl_waitpid() description states:
If a child as requested by pid has already exited by the time of the call (a so-called
"zombie" process), the function returns immediately. Any system resources used by the child
are freed...
So I set up my pcntl_signal() handler like this -
function sig_handler($signo){
    global $childProcesses;
    $pid = pcntl_waitpid(-1, $status, WNOHANG);
    echo "Sound the alarm! ";
    if ($pid != 0){
        if (posix_kill($pid, 9)){
            echo "Child {$pid} has tragically died!".PHP_EOL;
            unset($childProcesses[$pid]);
        }
    }
}

// These define the signal handling
// pcntl_signal(SIGTERM, "sig_handler");
// pcntl_signal(SIGHUP, "sig_handler");
// pcntl_signal(SIGINT, "sig_handler");
pcntl_signal(SIGCHLD, "sig_handler");
For completion, I'll include the actual code I'm using for forking a child process -
function broadcastData($socketArray, $data){
    global $db, $childProcesses;
    $pid = pcntl_fork();
    if ($pid == -1) {
        // Something went wrong (handle errors here)
        // Log error, email the admin, pull emergency stop, etc...
        echo "Could not fork()!!";
    } elseif ($pid == 0) {
        // This part is only executed in the child
        foreach ($socketArray AS $socket) {
            // There's more happening here but the essence is this
            socket_write($socket, $msg, strlen($msg));
            // TODO : Consider additional forking here for each client.
        }
        // This is where the signal is fired
        exit(0);
    }
    // If the child process did not exit above, then this code would be
    // executed by both parent and child. In my case, the child will
    // never reach these commands.
    $childProcesses[] = $pid;
    // The child process is now occupying the same database
    // connection as its parent (in my case mysql). We have to
    // reinitialize the parent's DB connection in order to continue using it.
    $db = dbEngine::factory(_dbEngine);
}
Yea... That's a ratio of 1:1 comments to code :P
So this was looking great and I saw the echo of :
Sound the alarm! Child 12345 has tragically died!
However, when the socket server loop did its next iteration, the socket_select() function failed, throwing this error:
PHP Warning: socket_select(): unable to select [4]: Interrupted system call...
The server would now hang and not respond to any requests other than manual kill commands from a root terminal.
I'm not going to get into why this was happening or what I did after that to debug it... let's just say it was a frustrating week...
much coffee, sore eyes and 10 days later...
Drum roll please
TL;DR - The Solution:
Mentioned here in a comment from 2007 in the PHP sockets documentation and in this tutorial on stuporglue (search for "good parenting"), one can simply "ignore" signals coming in from the child processes (SIGCHLD) by passing SIG_IGN to the pcntl_signal() function -
pcntl_signal(SIGCHLD, SIG_IGN);
Quoting from that linked blog post :
If we are ignoring SIGCHLD, the child processes will be reaped automatically upon completion.
Believe it or not - I included that pcntl_signal() line, deleted all the other handlers and things dealing with the children and it worked! There were no more <defunct> processes left hanging around!
In my case, it really did not interest me to know exactly when a child process died or which one it was - I just wanted them not to hang around and crash my entire server :P
Regarding your disclaimer - PHP is no better or worse than many other languages for writing a server in. There are some things which are not possible to do (lightweight processes, asynchronous I/O), but these do not really apply to a forking server. If you're using OO code, then do ensure that you've got the circular-reference-checking garbage collector enabled.
Once a child process exits, it becomes a zombie until the parent process cleans it up. Your code seems to send a KILL signal to every child on receipt of any signal. It won't clean up the process entries; it will only terminate processes which have not yet called exit. To get the child processes reaped correctly you should call waitpid (see also this example on the pcntl_wait manual page).
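As a sketch of what that could look like in your handler (keeping the global $forks array from the question, and reaping in a loop because SIGCHLD signals are not queued):
declare(ticks = 1);

function sig_handler($signo)
{
    global $forks;
    // Reap every child that has exited so far; waitpid() is what actually
    // removes the <defunct> entries, no kill needed.
    while (($childPid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
        echo "Child {$childPid} has been reaped".PHP_EOL;
        if (($key = array_search($childPid, $forks)) !== false) {
            unset($forks[$key]);
        }
    }
}
pcntl_signal(SIGCHLD, 'sig_handler');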
http://www.linuxsa.org.au/tips/zombies.html
Zombies are dead processes. You cannot kill the dead. All processes
eventually die, and when they do they become zombies. They consume
almost no resources, which is to be expected because they are dead!
The reason for zombies is so the zombie's parent (process) can
retrieve the zombie's exit status and resource usage statistics. The
parent signals the operating system that it no longer needs the zombie
by using one of the wait() system calls.
When a process dies, its child processes all become children of
process number 1, which is the init process. Init is "always"
waiting for children to die, so that they don't remain as zombies.
If you have zombie processes it means those zombies have not been
waited for by their parent (look at PPID displayed by ps -l). You
have three choices: Fix the parent process (make it wait); kill the
parent; or live with it. Remember that living with it is not so hard
because zombies take up little more than one extra line in the output
of ps.
I know only too well how hard you have to search for a solution to the problem of zombie processes. My concern with potentially having hundreds or thousands of them was (rightly or wrongly, as I don't know if this would actually be a problem) running out of inodes, as all hell can break loose when that happens.
If only the pcntl_fork() manual page linked to posix_setsid(), many of us would have discovered years ago that the solution was so simple.
I have a php script that divides a task into multiple parts and runs each part in a separate child process. The code looks like this:
foreach ($users as $k => $arr) {
    if (($pid = pcntl_fork()) === -1) continue;
    if ($pid) {
        pcntl_wait($status, WNOHANG);
        continue;
    }
    ob_start();
    posix_setsid();
    dbConnect();
    // do stuff to data
    exit();
}
I'm running this script using crontab on a Debian server, but the problem is some processes keep running and reserve memory. After a while the server's memory is flooded.
I need to find a way to make sure all processes finish correctly.
I think the issue is the use of WNOHANG in the pcntl_wait call. This makes pcntl_wait return before the child process has exited - which is what you want, in order to be able to fork the other child processes concurrently. But it has the side effect that the parent may finish before some of the children. This link http://www.devshed.com/c/a/PHP/Managing-Standalone-Scripts-in-PHP/2/ describes how to loop using pcntl_wait with WNOHANG until all children complete, as sketched below.
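One way to apply that pattern to your loop (a simplified sketch that leaves out the ob_start()/posix_setsid() calls from your original code):
$children = array();

foreach ($users as $k => $arr) {
    $pid = pcntl_fork();
    if ($pid === -1) continue;      // fork failed, skip this user
    if ($pid === 0) {
        dbConnect();                // child opens its own DB connection
        // do stuff to data
        exit();
    }
    $children[] = $pid;             // parent just remembers the child
}

// Parent: poll with WNOHANG until every child has been reaped.
while (count($children) > 0) {
    foreach ($children as $i => $pid) {
        $res = pcntl_waitpid($pid, $status, WNOHANG);
        if ($res === -1 || $res > 0) {
            unset($children[$i]);   // child exited (or an error occurred)
        }
    }
    sleep(1);
}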
The stuff you do to the data takes too long, or forever. You need to debug the operations you execute.
This is just a test script. I'm testing basic child fork functionality before making a simple lib to let me spawn out multiple processes to process data batches in parallel in php. Anything else I should test to understand before proceeding? I understand that all resources are copied during a fork, so initialize/open any resources needed after the fork.
<?php
$childcount = 10;
for ($i = 1; $i <= $childcount; $i++) {
    $pid = pcntl_fork();
    if ($pid == -1) {
        echo "failed to fork on loop $i of forking\n";
    } else if ($pid) {
        // we are the parent
        $pidArray[$pid] = $pid;
        // and we want to wait on all children at the end of the loop
    } else {
        // we are the child
        echo "Child is outputting it's count and dying. Count: $i \n ";
        doMessage($i);
        die;
    }
}

echo "sleeping to see if child finished events queue\n";
sleep(10);
print_r($pidArray);

for ($j = 1; $j <= $childcount; $j++) {
    echo "parent is waiting on child\n";
    $pid = pcntl_wait($status); // Wait for random child to finish
    $pidArray[$pid] = "terminated";
    echo "parent found $j of the finished children\n";
}
print_r($pidArray);

function doMessage($location)
{
    sleep(rand(4, 20));
    echo "outputting concurrently : $location \n";
}
Output looks like this:
me#myhost:~/$]: php test.php
sleeping to see if child finished events queue
Child is outputting it's count and dying. Count: 3
Child is outputting it's count and dying. Count: 4
Child is outputting it's count and dying. Count: 5
Child is outputting it's count and dying. Count: 6
Child is outputting it's count and dying. Count: 8
Child is outputting it's count and dying. Count: 9
Child is outputting it's count and dying. Count: 10
Child is outputting it's count and dying. Count: 7
Child is outputting it's count and dying. Count: 2
Child is outputting it's count and dying. Count: 1
outputting concurrently : 9
outputting concurrently : 1
outputting concurrently : 6
Array
(
[22700] => 22700
[22701] => 22701
[22702] => 22702
[22703] => 22703
[22704] => 22704
[22705] => 22705
[22706] => 22706
[22707] => 22707
[22708] => 22708
[22709] => 22709
)
parent is waiting on child
parent found 1 of the finished children
parent is waiting on child
parent found 2 of the finished children
parent is waiting on child
parent found 3 of the finished children
parent is waiting on child
outputting concurrently : 5
parent found 4 of the finished children
parent is waiting on child
outputting concurrently : 2
parent found 5 of the finished children
parent is waiting on child
outputting concurrently : 3
parent found 6 of the finished children
parent is waiting on child
outputting concurrently : 8
parent found 7 of the finished children
parent is waiting on child
outputting concurrently : 7
parent found 8 of the finished children
parent is waiting on child
outputting concurrently : 4
outputting concurrently : 10
parent found 9 of the finished children
parent is waiting on child
parent found 10 of the finished children
Array
(
[22700] => terminated
[22701] => terminated
[22702] => terminated
[22703] => terminated
[22704] => terminated
[22705] => terminated
[22706] => terminated
[22707] => terminated
[22708] => terminated
[22709] => terminated
)
I've also confirmed that an additional pcntl_wait will just return immediately with a pid of -1.
Your example does not touch the issues you can encounter with pcntl_fork.
Remember that fork() makes a copy of the program, which means all descriptors are copied. Unfortunately, this is a rather bad situation for a PHP program because most descriptors are handled internally by PHP or a PHP extension.
The simple, and probably "proper", way to solve this issue is to fork beforehand. There really should be no need to fork at many different points in a program; you simply fork and then delegate the work. Use a master/worker hierarchy.
For example, if you need many processes that use a MySQL connection, just fork before the connection is made. That way each child has its own connection to MySQL that it, and it alone, manages (see the sketch below).
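A quick sketch of that "fork first, connect afterwards" layout (the mysqli credentials and the SELECT 1 are placeholders for your real work):
<?php
// Fork first, connect afterwards: each worker opens and owns its own MySQL
// connection, so parent and children never share one socket.
$workers  = 4;
$children = array();

for ($i = 0; $i < $workers; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("fork failed\n");
    }
    if ($pid === 0) {
        // Child: connect *after* the fork, do the work, close, exit.
        $db = new mysqli('localhost', 'user', 'pass', 'mydb');   // placeholder credentials
        $db->query("SELECT 1");      // stand-in for the real work
        $db->close();
        exit(0);
    }
    $children[] = $pid;
}

// The parent never touched a connection; it just waits for its workers.
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
}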
Another thing to watch out for is when a child process dies. Its death should be handled by the parent. If it isn't, the child becomes a zombie: it doesn't consume resources, but it still is a process with a PID and all that. This is undesirable, since most (all?) operating systems have an upper limit on the number of processes they can handle.
When a child dies, a signal is sent to the parent (SIGCHLD). The parent can then handle the death of the child for internal processing. The correct way to un-zombie a child is using pcntl_waitpid(). You can use that function to wait until the child dies, or to detect that a child has already died. Use pcntl_wait() when you want to do this for a myriad of children. Look at the relevant section of the PHP manual for more options (including letting the function know not to suspend normal operation).
Using SIGCHLD, however, is not always foolproof. When you quickly create many short-lived children, handling SIGCHLD in combination with pcntl_waitpid() might not catch all zombie processes.
Hope this helped. The documentation on php.net is OK, but in my opinion they could have gone deeper into the subject and its potential pitfalls, as these functions are easy to misuse.
Here is a trivial example (ripped from a php.net comment) that shows bad sharing of resources between the parent process and its child:
<?php
mysql_connect(/* enter a working server here maybe? */);
$f=pcntl_fork();
while (true) {
    sleep(rand(0, 10) / 100);
    $r = mysql_query("select $f;");
    if (!$r) die($f.": ".mysql_error()."\n");
    list($x) = mysql_fetch_array($r);
    echo ($f) ? "." : "-";
    if ($x != $f) echo ($f.": fail: $x!=$f\n ");
}
?>
Running this on the CLI will output different results:
very often it just hangs and doesn't output anything any more
also very often, the server closes the connection, probably because it receives interleaved requests it can't process
sometimes one process gets the result of the OTHER process's query! (because both send their queries down the same socket, and it's pure luck who gets the reply)
Hope this helps when you expand on your example to fetch the data and/or use resources.
Happy coding!
Hey there, I have a simple script that is supposed to load 2 separate pages at the same time and grab some text from them. However it loads either the parent process or the child process, depending on which finishes first. What am I doing wrong? I want the 2 processes to work simultaneously. Here is the example code:
<?php
$pid = pcntl_fork();
if ($pid == -1) {
    die("could not fork");
} else if ($pid) {
    $url = "http://www.englishpage.com/verbpage/simplepresent.html";
    $readurl = file_get_contents($url);
    $pattern = '#Examples(.*?)Forms#s';
    preg_match($pattern, $readurl, $match);
    echo "Test1:".$match[1];
} else {
    $url = "http://www.englishpage.com/verbpage/simplepresent.html";
    $readurl = file_get_contents($url);
    $pattern = '#Examples(.*?)Forms#s';
    preg_match($pattern, $readurl, $match);
    echo "Test2:".$match[1];
}
echo "<br>Finished<br>";
?>
any help would be appreciated!
I am not quite sure that I really understand what you want to achieve, but if you want your "Finished" message to be displayed:
only once
only when the two processes have done their work
You should :
Use pcntl_wait in the parent process, so it waits for its child to die
Echo "finished" from the parent process, after it has finished waiting.
For instance, something like this should do :
$pid = pcntl_fork();
if ($pid == -1) {
    die("could not fork");
} else if ($pid) { // Father
    sleep(mt_rand(0, 5));
    echo "Father done\n";
    pcntl_wait($status); // Wait for the children to finish / die
    echo "All Finished\n\n";
} else { // Child
    sleep(mt_rand(0, 5));
    echo "Child done\n";
}
With this, each process will do its work, and only when both have finished will the parent display that everything is done:
if the parent is done first, it'll wait for the child
if the child ends first, the parent will not wait... but will still finish after it.
As a sidenote: you are using two separate processes; once forked, you cannot "easily" share data between them -- so it's not easy to pass data from the child to the father, nor the other way around.
If you need to do that, you can take a look at the Shared Memory Functions (see the sketch below) -- or just use plain files ^^
Hope this helps -- and that I understood the question correctly ^^
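For completeness, here is a rough sketch of passing a result from the child back to the father through shared memory (shmop); the key, segment size and payload are arbitrary values for illustration:
<?php
$shmKey  = ftok(__FILE__, 'a');
$shmSize = 1024;
$shm     = shmop_open($shmKey, 'c', 0644, $shmSize);

$pid = pcntl_fork();
if ($pid == -1) {
    die("could not fork");
} else if ($pid == 0) {
    // Child: fetch/compute something and write it into the shared segment.
    $data = "Test2: result produced by the child";
    shmop_write($shm, str_pad($data, $shmSize, "\0"), 0);
    exit(0);
}

// Father: wait for the child, then read what it wrote.
pcntl_waitpid($pid, $status);
$fromChild = rtrim(shmop_read($shm, 0, $shmSize), "\0");
echo "Father read: {$fromChild}\n";
shmop_delete($shm);     // free the segment when done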
From the Process Control Extension Introduction
Process Control support in PHP implements the Unix style of process creation, program execution, signal handling and process termination. Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.
So basically, you shouldn't use any of the pcntl functions when you are running a PHP script through the apache module.
If you just want to fetch the data from those 2 pages simultaneously, then you should be able to use stream_select to achieve this. You can find an example at http://www.ibm.com/developerworks/web/library/os-php-multitask/.
BTW, apparently curl supports this too, via its curl_multi interface; an example of how to use that can be found at http://www.somacon.com/p537.php (see also the sketch below).
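For the concrete case in the question, a sketch using the curl_multi interface could look like this (both URLs happen to be the same here, as in the original code):
<?php
$urls = array(
    "http://www.englishpage.com/verbpage/simplepresent.html",
    "http://www.englishpage.com/verbpage/simplepresent.html",
);

$mh      = curl_multi_init();
$handles = array();

foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// Drive both transfers in parallel until they are done.
do {
    curl_multi_exec($mh, $running);
    if (curl_multi_select($mh) === -1) {
        usleep(100000);     // avoid a tight loop if select() fails
    }
} while ($running > 0);

foreach ($handles as $i => $ch) {
    $html = curl_multi_getcontent($ch);
    if (preg_match('#Examples(.*?)Forms#s', $html, $match)) {
        echo "Test" . ($i + 1) . ":" . $match[1];
    }
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
echo "<br>Finished<br>";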