PHP proc_open() with timeout

I want to call proc_open() to execute a script in the background and have that background process terminate after a few seconds. The script compiles and runs user-submitted C/Java/Python code, so I need to be able to kill the process after some amount of time.
What I want to achieve is that when the background script's execution time exceeds, say, 3 seconds, the process is halted and stops writing to the file. For example, if it runs a for loop that writes 1 million lines of some string to a file, then at time >= 3 seconds the process stops; when I read the file back I will get something like 200k lines, which I can then display in the browser.
I am currently using the function exec_timeout from https://blog.dubbelboer.com/2012/08/24/execute-with-timeout.html.
When I call exec_timeout("exec nohup java -cp some_dir compiled_java_file &", 3), the background process is not terminated even after it exceeds the timeout value; instead it keeps writing to the file until it completes, and only then can I echo the result back to the browser. If the user submits code that runs forever, the process just hangs there until I kill it manually on the EC2 Linux instance.
Any idea why it is not working as expected, or is there a better approach to achieve my goal? My application is written in PHP and hosted on AWS Elastic Beanstalk.

From the first user-contributed note on the proc_terminate() manual page:
As explained in http://bugs.php.net/bug.php?id=39992, proc_terminate()
leaves children of the child process running. In my application, these
children often have infinite loops, so I need a sure way to kill
processes created with proc_open(). When I call proc_terminate(), the
/bin/sh process is killed, but the child with the infinite loop is
left running.
Applied to exec_timeout, this means the line:
proc_terminate($process, 9);
should be replaced by:
$status = proc_get_status($process);
if ($status['running'] == true) { // process ran too long, kill it
    // get the parent pid of the process we want to kill
    $ppid = $status['pid'];
    // use ps to get all the children of this process, and kill them
    $pids = preg_split('/\s+/', `ps -o pid --no-heading --ppid $ppid`);
    foreach ($pids as $pid) {
        if (is_numeric($pid)) {
            echo "Killing $pid\n";
            posix_kill($pid, 9); // 9 is the SIGKILL signal
        }
    }
    proc_close($process);
}
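Putting the question and that note together, a rough sketch of a timeout wrapper built directly on proc_open() might look like this. It assumes a Linux host with the posix extension and the ps utility available; the function name run_with_timeout and the 100 ms polling interval are made up for illustration, not taken from the blog post.
// Rough sketch: poll proc_get_status() and kill the whole child tree on timeout.
function run_with_timeout($cmd, $timeoutSeconds)
{
    // "exec" removes the intermediate /bin/sh, so proc_get_status()
    // reports the pid of the real child instead of the shell wrapper.
    $process = proc_open('exec ' . $cmd, array(), $pipes);
    if (!is_resource($process)) {
        return false;
    }

    $start = time();
    do {
        usleep(100000); // poll every 100 ms
        $status = proc_get_status($process);
    } while ($status['running'] && (time() - $start) < $timeoutSeconds);

    if ($status['running']) {
        // Timed out: kill any children of the process first, then the process itself.
        $ppid = $status['pid'];
        $pids = preg_split('/\s+/', trim(`ps -o pid --no-heading --ppid $ppid`));
        foreach ($pids as $pid) {
            if (is_numeric($pid)) {
                posix_kill((int) $pid, 9); // SIGKILL
            }
        }
        posix_kill($ppid, 9);
    }

    return proc_close($process);
}

// Usage: the command writes its own output file, as in the question,
// so whatever was written before the 3-second cutoff is still there.
run_with_timeout('java -cp some_dir compiled_java_file > /tmp/output.txt 2>&1', 3);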

Related

php never ending loop

I need a function that runs by itself in PHP, without the help of cron. I have come up with the following code, which works well for me, but since it is a never-ending loop, will it cause any problems for my server or script? If so, could you give me some suggestions or alternatives, please? Thanks.
$interval = 60; // minutes
set_time_limit(0);
while (1) {
    $now = time();
    # do the routine job, trigger a php function and what not.
    sleep($interval * 60 - (time() - $now));
}
We have used an infinite loop in a live system environment, basically to wait for incoming SMS and then process them. We found that doing it this way made the server resource-intensive over time, and we had to restart the server to free up memory.
Another issue we encountered: when you execute a script with an infinite loop from your browser, it will continue to run even if you hit the stop button, unless you restart Apache.
while (1) { // infinite loop
    // write code to insert text to a file
    // The file size will still continue to grow
    // even when you click 'stop' in your browser.
}
The solution is to run the PHP script as a daemon on the command line. Here's how:
nohup php myscript.php &
The & puts your process in the background.
Not only did we find this method to be less memory-intensive, but you can also kill the process without restarting Apache by running the following command:
kill processid
Edit: As Dagon pointed out, this is not really the true way of running PHP as a 'daemon', but using the nohup command can be considered the poor man's way of running a process as a daemon.
You can use the time_sleep_until() function. It returns TRUE on success or FALSE on failure.
$interval = 60; // minutes
set_time_limit(0);
$sleep = time() + $interval * 60; // the first wake-up time
while (1) {
    if (time() < $sleep) {
        // the loop pauses until the time it was told to sleep until,
        // then runs again once it finishes sleeping.
        time_sleep_until($sleep);
    }
    # do the routine job, trigger a php function and what not.
    $sleep = time() + $interval * 60; // schedule the next wake-up
}
There are many ways to create a daemon in PHP, and there have been for a very long time.
Just running something in the background isn't enough. If it tries to print something after the console is closed, for example, the program dies.
One method I have used on Linux is pcntl_fork() in a php-cli script, which basically splits your script into two PIDs. Have the parent process kill itself, and have the child process fork itself again; again have the parent process kill itself. The child process will now be completely divorced and can happily hang out in the background doing whatever you want it to do.
$i = 0;
do {
    $pid = pcntl_fork();
    if ($pid == -1) {
        die("Could not fork, exiting.\n");
    } else if ($pid != 0) {
        // We are the parent
        die("Level $i forking worked, exiting.\n");
    } else {
        // We are the child.
        ++$i;
    }
} while ($i < 2);
// This is the daemon child, do your thing here.
Unfortunately, this model has no way to restart itself if it crashes, or if the server is rebooted. (This can be resolved through creativity, but...)
To get the robustness of respawning, try an Upstart script (if you are on Ubuntu). Here is a tutorial - but I have not yet tried this method.
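On the earlier point about the program dying when it prints to a closed console: a minimal sketch of the usual extra daemonization steps (assuming the posix extension is available) is to make the child a session leader with posix_setsid() and point the standard streams at /dev/null:
$pid = pcntl_fork();
if ($pid == -1) {
    die("Could not fork, exiting.\n");
} else if ($pid != 0) {
    exit(0); // parent exits
}
// Child: become session leader so the controlling terminal can no longer reach us.
posix_setsid();
// Close the inherited standard streams and reopen them against /dev/null,
// so stray echoes and warnings go nowhere instead of breaking the process.
fclose(STDIN);
fclose(STDOUT);
fclose(STDERR);
$stdin  = fopen('/dev/null', 'r');
$stdout = fopen('/dev/null', 'w');
$stderr = fopen('/dev/null', 'w');
// Daemon work goes here.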
while(1) means an infinite loop. If you want to break out of it, you should use break with a condition.
For example:
while (1) { // infinite loop
    $now = time();
    # do the routine job, trigger a php function and what not.
    sleep($interval * 60 - (time() - $now));
    if ($condition) break; // it will break out when the condition is true
}

Catch Command Line Input, Exit If Input = x; Else Continue

I have a CLI script that runs for days. It processes batches, each of which takes around 7 minutes. Sometimes I need to stop the script, but only once a batch has been processed, during a 2-second sleep I have put in between batches. Is there any way I can catch input at any stage of the script's execution, and if that input = x, stop the script at the end of the next batch; else continue?
I have come across:
$handle = fopen ("php://stdin","r");
$line = fgets($handle);
but this requires input; it blocks until a line is entered.
I don't think you're going to get it the way you are thinking. You can catch stdout, but I don't think it will do you much good in terms of stopping the script. If I were using this on the CLI and it ran all the time, but I wanted to pause it for a certain amount of time, there are many things you could do, but this is probably how I would tackle it as a quick fix.
Restructure your PHP code a tiny bit and put the batch processing inside a function, if it isn't already. Then create an infinite loop using while, and have it check for the existence of a pause file after each batch. If the file exists, don't start the next batch, effectively pausing the script; if it doesn't exist, proceed as normal.
For example, your PHP file could look like this:
<?php
// path to pause file
$filename = "/root/pause";

while (1) {
    if (!file_exists($filename)) {
        batch();
    }
}

function batch() {
    // batch processing
    echo "batching\n";
    // fake processing using a usleep pause
    usleep(3000000);
}
?>
Then, when you want to pause the script, just create the pause file; once the current batch completes, the script will stop processing.
To create the file on Linux, cd to the directory named in the script and run:
touch pause
or use the full path, like touch /path/to/pause. Just make sure it matches the path in your script. When you are done, delete the file (rm -f pause) and the script will resume processing batches.
Note that while it's paused and just looping without processing, CPU usage may jump a little, but it should be fine.
Longer term, you can look at this little example to get going in that direction:
http://www.phpmysqlitutorials.com/2013/05/08/php-standard-input-and-loops-on-the-command-line/
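For the original goal of catching a typed command without stopping the batches, the stdin article linked above boils down to switching php://stdin to non-blocking mode. A minimal sketch of that idea (assuming Linux, where stream_set_blocking() works on the CLI's stdin) could be:
$stdin = fopen('php://stdin', 'r');
stream_set_blocking($stdin, false); // fgets() now returns immediately instead of waiting

while (1) {
    batch();
    // Check whether anything was typed while the batch was running.
    $line = fgets($stdin);
    if ($line !== false && trim($line) === 'x') {
        echo "Stopping after this batch.\n";
        break;
    }
}

function batch() {
    echo "batching\n";
    usleep(3000000); // fake processing, as in the example above
}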

In PHP, exec() sometimes fails silently when calling many exec commands, but the same command run again later will work

I have a PHP script that uses exec('command args > /log/file &'); within a loop to create multiple child scripts that run at the same time. Basically, the parent script gets user information out of a database and creates child scripts running in parallel, then the child script creates an email to send to a single user. This happens approximately 50,000 times.
To prevent the creation of 50,000 simultaneously running processes, I have a database table that keeps track of the currently running processes, and before creating a new process the parent checks the current child count and sleeps if 25 children are currently active. The child, upon completing its task, deletes its row in the table, freeing the parent to create more children.
The problem is, about 10% of the exec commands fail silently, and for seemingly no reason. I can run the parent script again (it's smart enough not to email the same user twice), and it will work, once again, 90% of the time using the same exec commands that failed last time. Running the script five or six times in a row will email everyone.
By putting a sleep immediately after the exec, I can increase my success rate to around 95%.
Why would exec be failing, if the same command will work later? I can just keep the script repeating until it completes, but I'd much rather solve the exec problem.
Some highly simplified sample code:
Parent script:
do {
    // get user, group, and supergroup information for users that haven't
    // been emailed yet
    foreach ($users as $userArray) {
        $processId = insertIntoProcessQueue($userArray);
        $cmd = 'sudo php -q ./childScript.php ' . cliArg($userArray) . ' ' .
               cliArg($groupArray) . ' ' . cliArg($supergroupArray) .
               ' ' . $processId . ' > file.log &';
        exec($cmd);
        do {
            $waiting = false;
            if (numChildren() >= 25) {
                sleep(1);
                $waiting = true;
            }
        } while ($waiting);
    }
    $incomplete = moreUsersToEmail() > 0 ? true : false;
} while ($incomplete);

function cliArg($array) {
    return escapeshellarg(json_encode($array));
}
Child script:
ignore_user_abort(true);
$user = json_decode($argv[1]);
$group = json_decode($argv[2]);
$supergroup = json_decode($argv[3]);
print_r($user);
$email = createEmail($user, $group, $supergroup);
$email->sendEmail();
removeFromProcessQueue($argv[4]);
flush();
exit;
The print_r will only show up in the log file when the script completes and I never get any errors, so I can't get any data about why it's failing. To add to that, it doesn't fail consistently on any individual users, and it doesn't fail running a single user at a time, so I have to run the script through everyone and try and catch the errors amidst the 45,000 that are working properly. And, since the parent and child never communicate beyond the parent starting the child, I can't detect (from the parent) when a child fails (otherwise I could immediately try and start any failed children again instead of rerunning the parent post-hoc).
Edit: So it turns out there's an included script that's dynamically generated and is destroyed and regenerated every time it's used (don't ask me why), which creates a race condition while running processes in parallel that caused the script to fail.
Thanks everyone for your unfortunately wasted time.
I just looked at the PHP docs for exec(): you can pass an array by reference as the second parameter, and it will be filled with the output of the command. You can use this to determine a) why the command is failing and b) when it fails, and integrate that into your code.
So I'd change:
exec($cmd);
To something like:
function check_exec_results($results)
{
    // use this to figure out what output you're getting from the exec commands,
    // then remove it once you've worked out how to set $results_look_good below
    echo '<HR><PRE>', print_r($results, true), '</PRE><HR>';
    $results_look_good = ?; // you will need to edit this yourself to actually do some kind of check
    return $results_look_good;
}

$successful_exec = false;
do {
    $exec_results = array();
    exec($cmd, $exec_results);
    $successful_exec = check_exec_results($exec_results);
} while (!$successful_exec);
Note that this is potentially an infinite loop, so I'd also go a step further and set a limit on the number of times exec() can be called for each user.
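For example, a sketch of that limit (the cap of 3 attempts is arbitrary) could wrap the loop above like this:
$max_attempts = 3; // arbitrary cap per user
$attempt = 0;
$successful_exec = false;
while (!$successful_exec && $attempt < $max_attempts) {
    $exec_results = array();
    exec($cmd, $exec_results);
    $successful_exec = check_exec_results($exec_results);
    ++$attempt;
}
if (!$successful_exec) {
    // give up on this user and move on rather than looping forever
    error_log("exec failed $max_attempts times for: $cmd");
}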
So it turns out there's an included script that's dynamically generated and is destroyed and regenerated every time it's used (don't ask me why), which creates a race condition while running processes in parallel that caused the script to fail.
Thanks everyone for your unfortunately wasted time.

Killing processes opened with popen()?

I'm opening a long-running process with popen(). For debugging, I'd like to terminate the process before it has completed. Calling pclose() just blocks until the child completes.
How can I kill the process? I don't see any easy way to get the pid out of the resource that popen() returns so that I can send it a signal.
I suppose I could do something kludgey and try to fudge the pid into the output using some sort of command-line hackery...
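For what it's worth, that kludge is workable: have the shell print its own pid first and then exec the real command, so the pid stays the same. A sketch (the sleep command stands in for the long-running process):
// 'echo $$' prints the shell's pid; 'exec' then replaces the shell with the
// real command, which keeps that same pid.
$handle = popen('echo $$; exec sleep 300', 'r');
$pid = (int) fgets($handle); // first line of output is the pid

// ... later, when debugging is done:
posix_kill($pid, 15); // SIGTERM
pclose($handle);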
Well, I landed on a solution: I switched back to proc_open() instead of popen(). Then it's as simple as:
$s = proc_get_status($p);
posix_kill($s['pid'], SIGKILL);
proc_close($p);
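For context, $p above comes from proc_open(); a sketch of the whole round trip (the command and descriptor setup here are placeholders) would be:
$descriptors = array(
    0 => array('pipe', 'r'),
    1 => array('pipe', 'w'),
    2 => array('pipe', 'w'),
);
// "exec" avoids an intermediate /bin/sh, so the pid from proc_get_status()
// is the long-running process itself.
$p = proc_open('exec some-long-running-command', $descriptors, $pipes);

// ... debugging done, stop it early:
$s = proc_get_status($p);
posix_kill($s['pid'], SIGKILL);
foreach ($pipes as $pipe) {
    fclose($pipe);
}
proc_close($p);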
Just send a kill (or abort) signal using the kill function:
php http://php.net/manual/en/function.posix-kill.php
c/c++ http://linux.die.net/man/3/kill
You can find the pid and check that you're really its parent by doing:
// Find child processes according to the current pid
$res = trim(exec('ps -eo pid,ppid |grep "'.getmypid().'" |head -n2 |tail -n1'));
if (preg_match('~^(\d+)\s+(\d+)$~', $res, $pid) === 1 && (int) $pid[2] === getmypid())
{
    // I'm the parent PID, just send a KILL
    posix_kill((int) $pid[1], 9);
}
It's working quite well on a fast-cgi PHP server.

Speeding up a PHP App

I have a list of data that needs to be processed. The way it works right now is this:
A user clicks a process button.
The PHP code takes the first item that needs to be processed, takes 15-25 secs to process it, moves on to the next item, and so on.
This takes way too long. What I'd like instead is this:
The user clicks the process button.
A PHP script takes the first item and starts to process it.
Simultaneously another instance of the script takes the next item and processes it.
And so on, so that around 5-6 items are being processed simultaneously and we get 6 items processed in 15-25 secs instead of just one.
Is something like this possible?
I was thinking I could use cron to launch an instance of the script every second. All items that need to be processed are flagged as such in the MySQL database, so whenever an instance is launched through cron, it simply takes the next item flagged for processing and removes the flag.
Thoughts?
Edit: To clarify, each 'item' is stored as a separate row in a MySQL database table. When processing starts on an item, it is flagged as being processed in the db, so each new instance simply grabs the next row that is not being processed and processes it. Hence I don't have to supply the items as command-line arguments.
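A sketch of that claim-the-next-row step, assuming a hypothetical items table with id, payload, status, and claimed_by columns (all names made up here), using an atomic UPDATE so two instances never grab the same row:
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Claim exactly one pending row; the UPDATE itself is atomic.
$token = uniqid('worker', true);
$pdo->prepare(
    "UPDATE items SET status = 'processing', claimed_by = :token
     WHERE status = 'pending' LIMIT 1"
)->execute(array('token' => $token));

// Fetch the row we just claimed (if any) and process it.
$stmt = $pdo->prepare("SELECT * FROM items WHERE claimed_by = :token");
$stmt->execute(array('token' => $token));
$item = $stmt->fetch(PDO::FETCH_ASSOC);

if ($item) {
    // processItem($item); // the 15-25 second work goes here
    $pdo->prepare("UPDATE items SET status = 'done' WHERE id = :id")
        ->execute(array('id' => $item['id']));
}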
Here's one solution; it's not the greatest, but it will work fine on Linux:
Split the processing PHP into a separate CLI script in which:
The command line inputs include `$id` and `$item`
The script writes its PID to a file in `/tmp/$id.$item.pid`
The script echoes its results to stdout as XML or something else that can be read back into PHP
When finished the script deletes the `/tmp/$id.$item.pid` file
Your master script (presumably on your webserver) would do:
`exec("nohup php myprocessing.php $id $item > /tmp/$id.$item.xml");` for each item
Poll the `/tmp/$id.$item.pid` files until all are deleted (sleep/check poll is enough)
If they are never deleted kill all the processing scripts and report failure
If successful, read the results from `/tmp/$id.$item.xml` to format/output to the user
Delete the XML files if you don't want to cache for later use
A backgrounded, nohup-started application will run independently of the script that started it.
This interested me sufficiently that I decided to write a POC.
test.php
<?php
$dir = realpath(dirname(__FILE__));
$start = time();

// Time in seconds after which we give up and kill everything
$timeout = 25;
// The unique identifier for the request
$id = uniqid();
// Our "items" which would be supplied by the user
$items = array("foo", "bar", "0xdeadbeef");

// We exec a nohup command that is backgrounded, which returns immediately
foreach ($items as $item) {
    exec("nohup php proc.php $id $item > $dir/proc.$id.$item.out &");
}

echo "<pre>";
// Run until timeout or all processing has finished
$running = array();
while (time() - $start < $timeout) {
    echo (time() - $start), " seconds\n";
    clearstatcache(); // Required since PHP will cache for file_exists
    $running = array();
    foreach ($items as $item) {
        // If the pid file still exists the process is still running
        if (file_exists("$dir/proc.$id.$item.pid")) {
            $running[] = $item;
        }
    }
    if (empty($running)) break;
    echo implode(',', $running), " running\n";
    flush();
    sleep(1);
}

// Clean up if we timed out
if (!empty($running)) {
    clearstatcache();
    foreach ($items as $item) {
        // Kill the process of anything still running (i.e. that has a pid file)
        if (file_exists("$dir/proc.$id.$item.pid")
            && $pid = file_get_contents("$dir/proc.$id.$item.pid")) {
            posix_kill((int) $pid, 9);
            unlink("$dir/proc.$id.$item.pid");
            // Would want to log this in the real world
            echo "Failed to process: ", $item, " pid ", $pid, "\n";
        }
        // delete the useless data
        unlink("$dir/proc.$id.$item.out");
    }
} else {
    echo "Successfully processed all items in ", time() - $start, " seconds.\n";
    foreach ($items as $item) {
        // Grab the processed data and delete the file
        echo(file_get_contents("$dir/proc.$id.$item.out"));
        unlink("$dir/proc.$id.$item.out");
    }
}
echo "</pre>";
?>
proc.php
<?php
$dir = realpath(dirname(__FILE__));
$id = $argv[1];
$item = $argv[2];

// Write out our pid file
file_put_contents("$dir/proc.$id.$item.pid", posix_getpid());

for ($i = 0; $i < 80; ++$i) {
    echo $item, ':', $i, "\n";
    usleep(250000);
}

// Remove our pid file to say we're done processing
unlink("$dir/proc.$id.$item.pid");
?>
Put test.php and proc.php in the same folder on your server, load test.php, and enjoy.
You will of course need nohup (unix) and PHP cli to get this to work.
Lots of fun, I may find a use for it later.
Use an external work queue like Beanstalkd, which your PHP script writes a bunch of jobs to. You then have as many worker processes as you like pulling jobs from beanstalkd and processing them as fast as possible; you can spin up as many workers as you have memory/CPU for. Your job body should contain as little information as possible, maybe just some IDs that you hit the DB with. Beanstalkd has a slew of client APIs and itself has a very basic API; think memcached.
We use beanstalkd to process all of our background jobs, and I love it. It's easy to use and very fast.
There is no multithreading in PHP; however, you can fork.
php.net:pcntl-fork
Or you could execute a system() command and start another process which is multithreaded.
Can you implement threading in JavaScript on the client side? It seems to me I've seen a JavaScript library (from Google, perhaps?) that implements it. Google it and I'm sure you'll find something. I've never done it, but I know it's possible. Anyway, your client-side JavaScript could activate (via Ajax) a PHP script once for each item, in separate threads. That might be easier than trying to do it all on the server side.
-don
If you are running a high-traffic PHP server, you are INSANE if you do not use Alternative PHP Cache: http://php.net/manual/en/book.apc.php. You do not have to make code modifications to run APC.
Another useful technique that works alongside APC is the Smarty template system, which lets you cache output so that pages do not have to be rebuilt.
To solve this problem, I've used two different products: Gearman and RabbitMQ.
The benefit of putting your jobs into some sort of queuing software like Gearman or Rabbit is that if you have multiple machines, they can all participate in processing items off the queue(s).
Gearman is easier to set up, so I'd suggest poking around with it a bit first. If you find you need something more heavy-duty with queue robustness, look into RabbitMQ.
http://www.danga.com/gearman/
http://pear.php.net/package/Net_Gearman (PEAR library)
You can use pcntl_fork() and family to fork a process - however you may need something like IPC to communicate back to the parent process that the child process (the one you fork'd) is finished.
You could have them write to shared memory, like via memcache or a DB.
You could also have each child process write its completed data to a file that the parent process keeps checking: as each child completes, the file is created/written to/updated, and the parent can grab them one at a time and throw them back to the caller/client.
The parent's job is to control the queue, to make sure the same data isn't processed twice, and also to sanity-check the children (better to kill that runaway process and start over, etc.).
Something else to keep in mind: on Windows platforms you are going to be severely limited; I don't think you even have access to pcntl_* unless you compiled PHP with support for it.
Also, can you cache the data once it's been processed, or is it unique every time? That would surely speed things up.
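If the processed data is reusable, a minimal caching sketch (assuming the Memcached extension and a memcached server on localhost; the key scheme, the one-hour TTL, and expensiveProcessing() are made up) could look like this:
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

function processItem($item, Memcached $cache)
{
    $key = 'processed_' . md5(serialize($item)); // hypothetical cache key
    $result = $cache->get($key);
    if ($result !== false) {
        return $result; // cache hit: skip the 15-25 second computation
    }
    $result = expensiveProcessing($item); // stands in for the real work
    $cache->set($key, $result, 3600);     // keep for an hour
    return $result;
}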
