PHP: some forked processes keep running

I have a php script that divides a task into multiple parts and runs each part in a separate child process. The code looks like this:
foreach($users as $k => $arr) {
    if(($pid = pcntl_fork()) === -1) continue;
    if($pid) {
        pcntl_wait($status, WNOHANG);
        continue;
    }
    ob_start();
    posix_setsid();
    dbConnect();
    // do stuff to data
    exit();
}
I'm running this script via crontab on a Debian server, but the problem is that some processes keep running and hold on to their memory. After a while the server runs out of memory.
I need to find a way to make sure all processes finish correctly.

I think the issue is the use of WNOHANG in the pcntl_wait call. With WNOHANG, pcntl_wait returns immediately instead of blocking until a child exits - which you want, in order to be able to fork the other child processes concurrently. But it has the side effect that the parent can finish before some of the children, leaving them unreaped. This link http://www.devshed.com/c/a/PHP/Managing-Standalone-Scripts-in-PHP/2/ describes how to loop using pcntl_wait with WNOHANG until all children complete.
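For reference, a minimal sketch of that reap loop (assuming the fork loop above has already run and you counted the children as you forked them):
$children = count($users); // one child forked per user, minus any failed forks
// Reap until every child has exited. With WNOHANG each call returns
// immediately: a positive PID means a child was reaped, 0 means none
// has exited yet, -1 means there are no children left (or an error).
while ($children > 0) {
    $pid = pcntl_wait($status, WNOHANG);
    if ($pid > 0) {
        $children--;
    } elseif ($pid == -1) {
        break;
    } else {
        usleep(100000); // nothing exited yet; don't busy-spin
    }
}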

The stuff you do to the data takes too long, or runs forever. You need to debug the operations you execute.

Related

php never ending loop

I need a function that executes by itself in PHP without the help of cron. I have come up with the following code that works well for me, but as it is a never-ending loop, will it cause any problem for my server or script? If so, could you give me some suggestions or alternatives, please? Thanks.
$interval = 60; //minutes
set_time_limit(0);
while (1) {
    $now = time();
    #do the routine job, trigger a php function and what not.
    sleep($interval*60 - (time() - $now));
}
We have used the infinite loop in a live system environment to basically wait for incoming SMS and then process it. We found out that doing it this way made the server resource-intensive over time, and we had to restart the server in order to free up memory.
Another issue we encountered is that when you execute a script with an infinite loop in your browser, even if you hit the stop button it will continue to run unless you restart Apache.
while (1) { //infinite loop
    // write code to insert text to a file
    // The file size will still continue to grow
    // even when you click 'stop' in your browser.
}
The solution is to run the PHP script as a daemon on the command line. Here's how:
nohup php myscript.php &
The & puts your process in the background.
Not only did we find this method to be less memory-intensive, but you can also kill it without restarting Apache by running the following command:
kill processid
Edit: As Dagon pointed out, this is not really the true way of running PHP as a 'daemon', but using the nohup command can be considered the poor man's way of running a process as a daemon.
You can use the time_sleep_until() function. It returns TRUE or FALSE.
$interval = 60; //minutes
set_time_limit(0);
$sleep = time() + $interval*60; // the next wake-up timestamp
while (1) {
    if (time() < $sleep) {
        // the loop pauses until the specific time it was set to sleep to;
        // it will loop again once it finishes sleeping.
        time_sleep_until($sleep);
    }
    $sleep = time() + $interval*60; // schedule the next wake-up
    #do the routine job, trigger a php function and what not.
}
There are many ways to create a daemon in PHP, and there have been for a very long time.
Just running something in the background isn't good enough. If it tries to print something and the console is closed, for example, the program dies.
One method I have used on Linux is pcntl_fork() in a php-cli script, which basically splits your script into two PIDs. Have the parent process kill itself, and have the child process fork itself again. Again have the parent process kill itself. The child process will now be completely divorced and can happily hang out in the background doing whatever you want it to do.
$i = 0;
do {
    $pid = pcntl_fork();
    if ($pid == -1) {
        die("Could not fork, exiting.\n");
    } else if ($pid != 0) {
        // We are the parent
        die("Level $i forking worked, exiting.\n");
    } else {
        // We are the child.
        ++$i;
    }
} while ($i < 2);
// This is the daemon child, do your thing here.
Unfortunately, this model has no way to restart itself if it crashes, or if the server is rebooted. (This can be resolved through creativity, but...)
To get the robustness of respawning, try an Upstart script (if you are on Ubuntu). Here is a tutorial, but I have not yet tried this method.
while(1) is an infinite loop. If you want to exit it, you should break on a condition, e.g.:
while (1) { //infinite loop
    $now = time();
    #do the routine job, trigger a php function and what not.
    sleep($interval*60 - (time() - $now));
    if (condition) break; //it will break when the condition is true
}

Executing functions in parallel in PHP

Can PHP call a function without waiting for it to return? So something like this:
function callback($pause, $arg) {
    sleep($pause);
    echo $arg, "\n";
}
header('Content-Type: text/plain');
fast_call_user_func_array('callback', array(3, 'three'));
fast_call_user_func_array('callback', array(2, 'two'));
fast_call_user_func_array('callback', array(1, 'one'));
would output
one (after 1 second)
two (after 2 seconds)
three (after 3 seconds)
rather than
three (after 3 seconds)
two (after 3 + 2 = 5 seconds)
one (after 3 + 2 + 1 = 6 seconds)
The main script is intended to be run as a permanent process (a TCP server). The callback() function would receive data from a client, execute an external PHP script and then do something based on other arguments that are passed to callback(). The problem is that the main script must not wait for the external PHP script to finish. The result of the external script is important, so exec('php -f file.php &') is not an option.
Edit:
Many have recommended taking a look at PCNTL, so it seems that such functionality can be achieved. PCNTL is not available on Windows, and I don't have access to a Linux machine right now, so I can't test it, but if so many people have advised it, then it should do the trick :)
Thanks, everyone!
On Unix platforms you can enable the PCNTL functions, and use pcntl_fork to fork the process and run your jobs in child processes.
Something like:
function fast_call_user_func_array($func, $args) {
    if (pcntl_fork() == 0) {
        // Child process: run the callback, then exit so the child
        // doesn't fall through and keep executing the parent's code.
        call_user_func_array($func, $args);
        exit(0);
    }
}
Once you call pcntl_fork, two processes will execute your code from the same position. The parent process will get the child's PID returned from pcntl_fork, while the child process will get 0. (If there is an error, pcntl_fork returns -1 in the parent, which is worth checking for in production code.)
You can check out PHP Process Control:
http://us.php.net/manual/en/intro.pcntl.php
Note: This is not threading, but the handling of separate processes. There is more overhead attached.
Wouldn't it solve your problem to fork, keeping the parent process free for other connections and actions? See http://www.php.net/pcntl_fork. If you need an answer back, you could have the parent listen on a socket and the child write to it. A simple while(true) loop with a read could do, and you probably already have that basic functionality if you run a permanent TCP server. Another option is to keep track of your child process IDs in an accessible store somewhere (file/database/memcached etc.), and call pcntl_wait in the main process with WNOHANG to check which process has exited, then retrieve its data from the store.
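A rough sketch of that PID-tracking idea (the store here is just an in-memory array; swap in a file/database/memcached as suggested, and $jobs / handle_job() are hypothetical placeholders):
$children = array();
foreach ($jobs as $job) {
    $pid = pcntl_fork();
    if ($pid == 0) {
        handle_job($job); // placeholder for the real work
        exit(0);
    } elseif ($pid > 0) {
        $children[$pid] = $job; // remember which child handles what
    }
}
while (!empty($children)) {
    $pid = pcntl_waitpid(-1, $status, WNOHANG); // -1 means "any child"
    if ($pid > 0) {
        // This child exited; retrieve its data from the store here.
        unset($children[$pid]);
    } else {
        usleep(100000); // no child has exited yet
    }
}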
You can do some threading in PHP if you use the function pcntl_fork.
http://ca.php.net/manual/en/function.pcntl-fork.php
I have never used this myself, but there are some good examples of how to use it on php.net.
PHP doesn't have this functionality as far as I know.
You can emulate the function using a different technique, like this one:
Parallel functions in PHP
PHP does not support multi-threading, so there's no other option than taking advantage of the OS's or the web server's multi-processing capabilities. Note that you can actually fetch both the result and the output of exec:
string exec ( string $command [, array &$output [, int &$return_var ]] )
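For example (the command and path here are only illustrative):
$output = array();
$returnVar = 0;
exec('wc -l /etc/passwd', $output, $returnVar);
echo implode("\n", $output), "\n"; // every line the command printed
echo "exit code: $returnVar\n";    // the command's exit status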
You can, at least, prevent the parent process from hanging until the child process is done by ignoring the child signals using pcntl_signal(SIGCHLD, SIG_IGN).
So, let's say you want to fork a process and execute another PHP function that takes a while without making the parent wait for it to finish (since you want the main process to finish in a timely manner):
pcntl_signal(SIGCHLD, SIG_IGN);
$pid = pcntl_fork();
if ($pid < 0) {
    exit(0);
} elseif (!$pid) {
    my_slow_function();
    exit(0);
}
// Parent keeps executing and finishes before the child does
If you want to execute a slow external script as the child process, pcntl_exec is handy:
$script = array('/path/to/my/script'); // E.g. /home/my_user/my_script.php
pcntl_exec('/path/to/program/executable', $script); // E.g. /usr/bin/php

PHP Launch script after background process completes?

I am converting a PDF with PDF2SWF and indexing it with XPDF, using exec(). The problem is that the execution time is really high.
Is it possible to run it as background process and then launch a script when it is done converting?
In general, PHP does not implement threads.
But there is a Zend Framework class which may be suitable for you:
http://framework.zend.com/manual/en/zendx.console.process.unix.overview.html
ZendX_Console_Process_Unix allows developers to spawn an object as a new process, and so do multiple tasks in parallel on console environments. Through its specific nature, it is only working on *nix based systems like Linux, Solaris, Mac/OSX and such. Additionally, the shmop_*, pcntl_* and posix_* modules are required for this component to run. If one of the requirements is not met, it will throw an exception after instantiating the component.
A suitable example:
class MyProcess extends ZendX_Console_Process_Unix
{
    protected function _run()
    {
        // doing pdf and flash stuff
    }
}
$process1 = new MyProcess();
$process1->start();
while ($process1->isRunning()) {
    sleep(1);
}
echo 'Process completed';
Try using popen() instead of exec().
This hack will work on any standard PHP installation, even on Windows, no additional libraries required. You can't really control all aspects of the processes you spawn this way, but sometimes this is enough:
$p1 = popen("/bin/bash ./some_shell_script.sh argument_1", "r");
$p2 = popen("/bin/bash ./some_other_shell_script.sh argument_2", "r");
$p3 = popen("/bin/bash ./yet_other_shell_script.sh argument_3", "r");
The three spawned shell scripts will run simultaneously, and as long as you don't do a pclose($p1) (or $p2 or $p3) or try to read from any of these pipes, they will not block your PHP execution.
When you're done with your other stuff (the one that you are doing with your PHP script) you can call pclose() on the pipes, and that will pause your script execution until the process you are pclosing finishes. Then your script can do something else.
Note that your PHP script will not conclude or die() until those spawned scripts have finished: reaching the end of the script, or calling die(), will make it wait for them.
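For instance, after your own work is done, you could drain and close each pipe; a small sketch reusing the handles from above:
// ... do your other PHP work here while the scripts run ...
foreach (array($p1, $p2, $p3) as $p) {
    while (!feof($p)) {
        echo fgets($p); // read the script's output line by line
    }
    pclose($p); // returns the script's exit status
}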
If you are running it from the command line, you can fork a php process using pcntl_fork
There are also daemon classes that would do the same trick:
http://pear.php.net/package/System_Daemon
$pid = pcntl_fork();
if ($pid == -1) {
    die('could not fork');
} else if ($pid) {
    // We are the parent, exit
    exit();
} else {
    // We are the child, do something interesting then call the script at the end.
}

PHP forking randomly runs the parent or child process depending on what finished first - what am I doing wrong?

Hey there, I have a simple script which is supposed to load 2 separate pages at the same time and grab some text from them. However, it runs either the parent process or the child process depending on which finishes first. What am I doing wrong? I want the 2 processes to work simultaneously. Here is the example code:
<?php
$pid = pcntl_fork();
if ($pid == -1) {
    die("could not fork");
}
else if ($pid) {
    $url = "http://www.englishpage.com/verbpage/simplepresent.html";
    $readurl = file_get_contents($url);
    $pattern = '#Examples(.*?)Forms#s';
    preg_match($pattern, $readurl, $match);
    echo "Test1:".$match[1];
}
else {
    $url = "http://www.englishpage.com/verbpage/simplepresent.html";
    $readurl = file_get_contents($url);
    $pattern = '#Examples(.*?)Forms#s';
    preg_match($pattern, $readurl, $match);
    echo "Test2:".$match[1];
}
echo "<br>Finished<br>";
?>
any help would be appreciated!
I am not quite sure that I really understand what you are trying to achieve, but if you want your "Finished" message to be displayed:
only once
only when the two processes have done their work
You should:
Use pcntl_wait in the parent process, so it waits for its child to die
Echo "finished" from the parent process, after it has finished waiting.
For instance, something like this should do :
$pid = pcntl_fork();
if ($pid == -1) {
    die("could not fork");
}
else if ($pid) { // Father
    sleep(mt_rand(0, 5));
    echo "Father done\n";
    pcntl_wait($status); // Wait for the child to finish / die
    echo "All Finished\n\n";
}
else { // Child
    sleep(mt_rand(0, 5));
    echo "Child done\n";
}
With this, each process will do its work, and only when both have finished will the parent display that everything is done:
if the parent is done first, it'll wait for the child
if the child ends first, the parent will not wait... But still finish after it.
As a sidenote: you are using two separate processes; once forked, you cannot "easily" share data between them, so it's not easy to pass data from the child to the father, nor the other way around.
If you need to do that, you can take a look at Shared Memory Functions -- or just use plain files ^^
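A bare-bones sketch of the shared-memory route (segment key and size are picked arbitrarily here; this assumes the shmop extension is available):
$shmKey = ftok(__FILE__, 'a'); // derive a System V IPC key
$shmId = shmop_open($shmKey, 'c', 0644, 1024); // create a 1 KB segment
$pid = pcntl_fork();
if ($pid == 0) {
    // Child: write its result into the shared segment.
    shmop_write($shmId, "child result", 0);
    exit(0);
}
pcntl_waitpid($pid, $status); // wait for the child to finish
echo shmop_read($shmId, 0, 1024), "\n"; // read what the child wrote
shmop_delete($shmId); // mark the segment for removal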
Hope this helps -- and that I understood the question correctly ^^
From the Process Control Extension Introduction
Process Control support in PHP implements the Unix style of process creation, program execution, signal handling and process termination. Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.
So basically, you shouldn't use any of the pcntl functions when you are running a PHP script through the apache module.
If you just want to fetch the data from those 2 pages simultaneously then you should be able to use stream_select to achieve this. You can find an example at http://www.ibm.com/developerworks/web/library/os-php-multitask/.
BTW Apparently curl supports this too, using curl_multi_select, an example on how to use that can be found at http://www.somacon.com/p537.php.
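A trimmed-down curl_multi version of the original fetch might look like this (it downloads both pages concurrently; the preg_match from the question can then be run on each body):
$urls = array(
    "http://www.englishpage.com/verbpage/simplepresent.html",
    "http://www.englishpage.com/verbpage/simplepresent.html",
);
$mh = curl_multi_init();
$handles = array();
foreach ($urls as $i => $url) {
    $handles[$i] = curl_init($url);
    curl_setopt($handles[$i], CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $handles[$i]);
}
// Drive all transfers until every handle has finished.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for activity instead of spinning
} while ($running > 0);
foreach ($handles as $h) {
    $html = curl_multi_getcontent($h); // the fetched page body
    curl_multi_remove_handle($mh, $h);
}
curl_multi_close($mh);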

Speeding up a PHP App

I have a list of data that needs to be processed. The way it works right now is this:
A user clicks a process button.
The PHP code takes the first item that needs to be processed, takes 15-25 secs to process it, moves on to the next item, and so on.
This takes way too long. What I'd like instead is that:
The user clicks the process button.
A PHP script takes the first item and starts to process it.
Simultaneously another instance of the script takes the next item and processes it.
And so on, so around 5-6 of the items are being process simultaneously and we get 6 items processed in 15-25 secs instead of just one.
Is something like this possible?
I was thinking of using CRON to launch an instance of the script every second. All items that need to be processed will be flagged as such in the MySQL database, so whenever an instance is launched through CRON, it will simply take the next item flagged to be processed and remove the flag.
Thoughts?
Edit: To clarify something, each 'item' is stored in a MySQL database table as a separate row. Whenever processing starts on an item, it is flagged as being processed in the db, hence each new instance will simply grab the next row which is not being processed and process it. Hence I don't have to supply the items as command line arguments.
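One way to keep that "grab the next unprocessed row" step safe when several instances run at once is an atomic claim in SQL. A sketch, assuming a hypothetical items table with id, payload and status columns:
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
// Claim one pending row and capture its id in a user variable in the
// same statement, so two concurrent workers can never grab the same row.
$claimed = $pdo->exec(
    "UPDATE items SET status = 'processing', id = (@claimed := id)
     WHERE status = 'pending' LIMIT 1"
);
if ($claimed) {
    $id = (int)$pdo->query("SELECT @claimed")->fetchColumn();
    $row = $pdo->query("SELECT payload FROM items WHERE id = $id")->fetch();
    // ... the 15-25 second processing of $row['payload'] happens here ...
    $pdo->exec("UPDATE items SET status = 'done' WHERE id = $id");
}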
Here's one solution, not the greatest, but will work fine on Linux:
Split the processing PHP into a separate CLI scripts in which:
The command line inputs include `$id` and `$item`
The script writes its PID to a file in `/tmp/$id.$item.pid`
The script echos results as XML or something that can be read into PHP to stdout
When finished the script deletes the `/tmp/$id.$item.pid` file
Your master script (presumably on your webserver) would do:
`exec("nohup php myprocessing.php $id $item > /tmp/$id.$item.xml");` for each item
Poll the `/tmp/$id.$item.pid` files until all are deleted (sleep/check poll is enough)
If they are never deleted kill all the processing scripts and report failure
If successful, read from the `/tmp/$id.$item.xml` files to format output for the user
Delete the XML files if you don't want to cache for later use
A backgrounded nohup started application will run independent of the script that started it.
This interested me sufficiently that I decided to write a POC.
test.php
<?php
$dir = realpath(dirname(__FILE__));
$start = time();

// Time in seconds after which we give up and kill everything
$timeout = 25;

// The unique identifier for the request
$id = uniqid();

// Our "items" which would be supplied by the user
$items = array("foo", "bar", "0xdeadbeef");

// We exec a nohup command that is backgrounded which returns immediately
foreach ($items as $item) {
    exec("nohup php proc.php $id $item > $dir/proc.$id.$item.out &");
}

echo "<pre>";

// Run until timeout or all processing has finished
while (time() - $start < $timeout)
{
    echo (time() - $start), " seconds\n";
    clearstatcache(); // Required since PHP will cache for file_exists
    $running = array();
    foreach ($items as $item)
    {
        // If the pid file still exists the process is still running
        if (file_exists("$dir/proc.$id.$item.pid")) {
            $running[] = $item;
        }
    }
    if (empty($running)) break;
    echo implode(',', $running), " running\n";
    flush();
    sleep(1);
}

// Clean up if we timed out
if (!empty($running)) {
    clearstatcache();
    foreach ($items as $item) {
        // Kill the process of anything still running (i.e. that has a pid file)
        if (file_exists("$dir/proc.$id.$item.pid")
            && $pid = file_get_contents("$dir/proc.$id.$item.pid")) {
            posix_kill($pid, 9);
            unlink("$dir/proc.$id.$item.pid");
            // Would want to log this in the real world
            echo "Failed to process: ", $item, " pid ", $pid, "\n";
        }
        // delete the useless data
        unlink("$dir/proc.$id.$item.out");
    }
} else {
    echo "Successfully processed all items in ", time() - $start, " seconds.\n";
    foreach ($items as $item) {
        // Grab the processed data and delete the file
        echo(file_get_contents("$dir/proc.$id.$item.out"));
        unlink("$dir/proc.$id.$item.out");
    }
}

echo "</pre>";
?>
proc.php
<?php
$dir = realpath(dirname(__FILE__));
$id = $argv[1];
$item = $argv[2];

// Write out our pid file
file_put_contents("$dir/proc.$id.$item.pid", posix_getpid());

for ($i = 0; $i < 80; ++$i)
{
    echo $item, ':', $i, "\n";
    usleep(250000);
}

// Remove our pid file to say we're done processing
unlink("$dir/proc.$id.$item.pid");
?>
Put test.php and proc.php in the same folder on your server, load test.php and enjoy.
You will of course need nohup (unix) and PHP cli to get this to work.
Lots of fun, I may find a use for it later.
Use an external work queue like Beanstalkd, which your PHP script writes a bunch of jobs to. You have as many worker processes pulling jobs from beanstalkd and processing them as fast as possible. You can spin up as many workers as you have memory / CPU. Your job body should contain as little information as possible, maybe just some IDs which you hit the DB with. beanstalkd has a slew of client APIs and itself has a very basic API, think memcached.
We use beanstalkd to process all of our background jobs, and I love it. Easy to use, and it's very fast.
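A sketch of what that looks like with the Pheanstalk client (the tube name and job payload are made up, and the code assumes a Pheanstalk 3-style constructor):
// Producer (e.g. the web request): push a tiny job body onto a tube.
require 'vendor/autoload.php';
$queue = new Pheanstalk\Pheanstalk('127.0.0.1');
$queue->useTube('process-items')->put(json_encode(array('item_id' => 42)));

// Worker (a long-running CLI process; start as many as you have CPU for):
while (true) {
    $job = $queue->watch('process-items')->reserve(); // blocks until a job arrives
    $data = json_decode($job->getData(), true);
    // ... hit the DB with $data['item_id'] and process the item ...
    $queue->delete($job); // remove the job once it's done
}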
There is no multithreading in PHP, however you can use fork.
php.net:pcntl-fork
Or you could execute a system() command and start another process which is multithreaded.
Can you implement threading in JavaScript on the client side? It seems to me I've seen a JavaScript library (from Google perhaps?) that implements it. Google it and I'm sure you'll find something. I've never done it, but I know it's possible. Anyway, your client-side JavaScript could activate (via AJAX) a PHP script once for each item in separate threads. That might be easier than trying to do it all on the server side.
-don
If you are running a high traffic PHP server you are INSANE if you do not use Alternative PHP Cache: http://php.net/manual/en/book.apc.php . You do not have to make code modifications to run APC.
Another useful technique that can work along with APC is using the Smarty template system which allows you to cache output so that pages do not have to be rebuilt.
To solve this problem, I've used two different products; Gearman and RabbitMQ.
The benefit of putting your jobs into some sort of queuing software like Gearman or Rabbit is that if you have multiple machines, they can all participate in processing items off the queue(s).
Gearman is easier to set up, so I'd suggest poking around with it a bit first. If you find you need something more heavy-duty with queue robustness, look into RabbitMQ.
http://www.danga.com/gearman/
http://pear.php.net/package/Net_Gearman (PEAR library)
You can use pcntl_fork() and family to fork a process - however you may need something like IPC to communicate back to the parent process that the child process (the one you fork'd) is finished.
You could have them write to shared memory, like via memcache or a DB.
You could also have each child process write its completed data to a file that the parent process keeps checking: as each child process completes, the file is created/written to/updated, and the parent process can grab them, one at a time, and throw them back to the callee/client.
The parent's job is to control the queue, to make sure the same data isn't processed twice, and also to sanity-check the children (better kill that runaway process and start over... etc.).
Something else to keep in mind: on Windows platforms you are going to be severely limited - I don't even think you have access to the pcntl_* functions unless you compiled PHP with support for them.
Also, can you cache the data once it's been processed, or is it unique data every time? That would surely speed things up.
