I have written a PHP script that would probably take about 3 hours to complete. I run it from a browser, and after about 45 minutes it stops doing anything. I know this because it polls certain web addresses and then saves some data to the database, and it stops writing any data, which led me to conclude that it has stopped. The browser still shows the page as loading, but it never finishes.
There aren't any errors, so it is probably some kind of timeout. But where it occurs, and how I can prevent it, is a mystery. In my case I can't use the CLI; I must use a browser client to initiate the script.
I have tried to put
set_time_limit(0);
But it had no apparent effect. Any suggestions what could cause the timeout and a fix for it?
Try this:
set_time_limit(0);
ignore_user_abort(true);
ini_set('max_execution_time', 0);
Most webhosts kill processes that run for a certain length of time. This is intended as a failsafe against infinite loops.
Ask your host about this and see if there's any way it can be disabled for this particular script of yours. In some cases, the killer doesn't apply to Cron tasks, or processes run by SSH. However, this varies from host to host.
Might be the browser that's timing out, not sure if browsers do that or not, but then I've also never had a page require so much time.
Suggestion: I'm assuming you are running a loop. Load a page then run each iteration of the loop in an ajax call to another page, not firing the next iteration until the previous one returns.
There's a setting in PHP to kill processes after some time. This is especially important for shared servers (you do not want one process to slow down the whole server).
You need to ask your host if you can make modifications to php.ini (through .htaccess). In particular the max_execution_time setting.
If you are using sessions, then you would need to look at 'session.cookie_lifetime' and not set_time_limit. If you are building up a large array, it might also exhaust the available memory.
Without more info on how your script handles the task, it would be difficult to identify.
I'm trying to run a long process on PHP/Apache/Ubuntu (AWS).
This is a simple process that builds a cache during the night.
The process can run for a few hours, and is initiated by crontab accessing a special URL with curl.
Sometimes the process stops at random with no error. I suspect that it is killed by Apache, although I set
@set_time_limit(0);
@ini_set('max_execution_time', -1);
Is it a known issue with PHP/Apache/Ubuntu?
Is there a way to solve it?
Currently, my solution is to run the process every 5 minutes, and store the state on the disk, and continue from where it stopped.
But I would like to know more about this issue and if there is a better way to tackle it?
NOTE:
The process stops randomly or doesn't stop at all. The longer the process runs (i.e. the bigger the cache), the higher the chance that it will stop.
One possible reason is that the client disconnects (e.g. after a timeout): PHP stops the request processing by default in this case. To prevent this, you can use ignore_user_abort:
ignore_user_abort(true);
Also note that the set_time_limit call may actually fail (e.g. in a restricted environment), so it might make sense to remove the error suppression (@) or explicitly check whether set_time_limit(0) returned true.
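A minimal sketch of those two suggestions combined, assuming the script is started over HTTP. The helper name prepareLongRun is hypothetical and not part of the question's code:

```php
<?php
// Hypothetical helper: prepares a request for a long run and reports
// whether the time limit could actually be lifted.
function prepareLongRun(): bool
{
    // Keep running even if the client (browser, curl) disconnects.
    ignore_user_abort(true);

    // No error suppression here: set_time_limit() returns false when it
    // fails, so check the result instead of hiding the problem.
    $ok = set_time_limit(0);
    if ($ok === false) {
        error_log('set_time_limit(0) failed; the configured limit still applies');
    }
    return $ok;
}
```

Calling prepareLongRun() at the top of the cache-building script and logging a false result would at least show whether the limit is really being lifted.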
If I have a PHP page that is doing a task that takes a long time, and I try to load another page from the same site at the same time, that page won't load until the first page has timed out. For instance if my timeout was set to 60 seconds, then I wouldn't be able to load any other page until 60 seconds after the page that was taking a long time to load/timeout. As far as I know this is expected behaviour.
What I am trying to figure out is whether an erroneous/long loading PHP script that creates the above situation would also affect other people on the same network. I personally thought it was a browser issue (i.e. if I loaded http://somesite.com/myscript.php in Chrome and it started working its magic in the background, I couldn't then load http://somesite.com/myscript2.php until that had timed out, but I could load that page in Firefox). However, I've heard contradictory statements, saying that the timeout would happen to everyone on the same network (IP address?).
My script works on some data imported from Sage and takes quite a long time to run. Sometimes it can time out before it finishes (e.g. if the Sage import crashes over the weekend), so I run it again and it picks up where it left off. I am worried that other staff in the office will not be able to access the site while this is running.
The problem you have here is actually related to the fact that (I'm guessing) you are using sessions. This may be a bit of a stretch, but it would account for exactly what you describe.
This is not in fact "expected behaviour" unless your web server is set up to run a single process with a single thread, which I highly doubt. This would create a situation where the web server is only able to handle a single request at any one time, and this would affect everybody on the network. This is exactly why your web server probably won't be set up like this - in fact I suspect you will find it is impossible to configure your server like this, as it would make the server somewhat useless. And before some smart alec chimes in with "what about Node.js?" - that is a special case, as I am sure you are already well aware.
When a PHP script has a session open, it has an exclusive lock on the file in which the session data is stored. This means that any subsequent request will block at the call to session_start() while PHP tries to acquire that exclusive lock on the session data file, which it can't, because your previous request still has one. As soon as your previous request finishes, it releases its lock on the file and the next request is able to complete. Since sessions are per-machine (in fact per-browsing session, as the name suggests, which is why it works in a different browser) this will not affect other users of your network, but leaving your site set up so that this is an issue even just for you is bad practice and easily avoidable.
The solution to this is to call session_write_close() as soon as you have finished with the session data in a given script. This causes the script to close the session file and release its lock. You should try to either finish with the session data before you start the long running process, or not call session_start() until after it has completed.
In theory you can call session_write_close() and then call session_start() again later in the script, but I have found that PHP sometimes exhibits buggy behaviour in this respect (I think this is cookie related, but don't quote me on that). Obviously, pay attention to the fact that setting cookies modifies the headers, so you have to call session_start() before you output any data, or enable output buffering.
For example, consider this script:
<?php
session_start();
if (!isset($_SESSION['someval'])) {
    $_SESSION['someval'] = 1;
} else {
    $_SESSION['someval']++;
}
echo "someval is {$_SESSION['someval']}";
sleep(10);
With the above script, you will have to wait 10 seconds before you are able to make a second request. However, if you add a call to session_write_close() after the echo line, you will be able to make another request before the previous request has completed.
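A sketch of that change, with the session handling pulled into a small function so the lock is released before any slow work starts. The name bumpCounterAndRelease is hypothetical; the counter key is the one from the script above:

```php
<?php
// Hypothetical helper: increment the per-session counter, then release
// the session lock before the caller starts any long-running work.
function bumpCounterAndRelease(): int
{
    session_start();
    $_SESSION['someval'] = ($_SESSION['someval'] ?? 0) + 1;
    $someval = $_SESSION['someval'];

    // Write the session data and release the exclusive lock on the
    // session file, so a second request is no longer blocked.
    session_write_close();

    return $someval;
}
```

The script would then echo the returned value and call sleep(10) afterwards. The sleep still takes 10 seconds, but it no longer holds the session lock while it runs.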
Hmm... I did not check but I think that each request to the webserver is handled in a thread of its own. Thereby a different request should not be blocked. Just try :-) Use a different browser and access your page while the big script is running!
Err.. I just see that this worked for you :-) And it should for others, too.
I'm currently running a Linux based VPS, with 768MB of RAM.
I have an application which collects details of domains and then connects to a service via cURL to retrieve details of the pagerank of these domains.
When I run a check on about 50 domains, it takes the remote page about 3 mins to load with all the results, before the script can parse the details and return them to my script. This causes a problem as nothing else seems to function until the script has finished executing, so users on the site will just get a timer / 'ball of death' while waiting for pages to load.
(The remote page retrieves the domain details and updates the page by AJAX, but the cURL request (rightfully) doesn't return the page until loading is complete.)
Can anyone tell me if I'm doing anything obviously wrong, or if there is a better way of doing it? (There can be anything between 10 and 10,000 domains queued, so I need a process that can run in the background without affecting the rest of the site.)
Thanks
A more sensible approach would be to "batch process" the domain data using a cron-triggered PHP CLI script.
As such, once you'd inserted the relevant domains into a database table with a "processed" flag set as false, the background script would then:
Scan the database for domains that aren't marked as processed.
Carry out the cURL lookup, etc.
Update the database record accordingly and mark it as processed.
...
To avoid overlapping with a batch run that is still executing, invoke the PHP script from cron only every five minutes and, within the PHP script itself, check at the start of each "scan" stage how long the script has been running, exiting if it's been running for four minutes or longer. (You might want to adjust these figures, but hopefully you can see where I'm going with this.)
By using this approach, you'll be able to leave the background script running indefinitely (as it's invoked via cron, it'll automatically start after reboots, etc.) and simply add domains to the database/review the results of processing, etc. via a separate web front end.
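A rough sketch of that scan/lookup/mark loop with the four-minute cut-off. The function name runBatch and its three callables are hypothetical stand-ins for your own database and cURL code, and the 240-second default is just the "four minutes" figure from above:

```php
<?php
// Hypothetical worker loop for the cron-triggered script.
// $fetchNext     returns the next unprocessed domain, or null when none is left
// $lookup        performs the slow lookup for one domain
// $markProcessed stores the result and sets the "processed" flag
function runBatch(
    callable $fetchNext,
    callable $lookup,
    callable $markProcessed,
    int $maxSeconds = 240
): int {
    $startedAt = time();
    $handled = 0;

    // Check the elapsed time before every scan, so this run stops
    // before cron starts the next one.
    while ((time() - $startedAt) < $maxSeconds) {
        $domain = $fetchNext();
        if ($domain === null) {
            break; // queue is empty
        }
        $markProcessed($domain, $lookup($domain));
        $handled++;
    }

    return $handled;
}
```

Note that the time check only happens between domains, so a single very slow lookup can still push a run past the cut-off.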
This isn't the ideal solution, but if you need to trigger this process based on a user request, you can add the following at the end of your script.
set_time_limit(0);
flush();
This will allow the PHP script to continue running, but it will return output to the user. But seriously, you should use batch processing. It will give you much more control over what's going on.
Firstly I'm sorry but I'm an idiot! :)
I've loaded the site in another browser (FF) and it loads fine.
It seems Chrome puts some sort of lock on a domain when it's waiting for a server response, and I was testing the script manually through a browser.
Thanks for all your help and sorry for wasting your time.
CJ
While I agree with others that you should consider processing these tasks outside of your webserver, in a more controlled manner, I'll offer an explanation for the "server standstill".
If you're using native php sessions, php uses an exclusive locking scheme so only a single php process can deal with a given session id at a time. Having a long running php script which uses sessions can certainly cause this.
You can search for combinations of terms like:
php session concurrency lock session_write_close()
I'm sure it's been discussed many times here. I'm too lazy to search for you. Maybe someone else will come along and make an answer with bulleted lists and pretty hyperlinks in exchange for stackoverflow reputation :) But not me :)
good luck.
I'm not sure how your code is structured but you could try using sleep(). That's what I use when batch processing.
I have a PHP script that grabs a chunk of data from a database, processes it, and then looks to see if there is more data. This process runs indefinitely and I run several of these at a time on a single server.
It looks something like:
<?php
while($shouldStillRun)
{
    // do stuff
}
logThatWeExitedLoop();
?>
The problem is, after some time, something causes the process to stop running and I haven't been able to debug it and determine the cause.
Here is what I'm using to get information so far:
error_log - Logging all errors, but no errors are shown in the error log.
register_shutdown_function - Registered a custom shutdown function. This does get called so I know the process isn't being killed by the server, it's being allowed to finish. (or at least I assume that is the case with this being called?)
debug_backtrace - Logged a debug_backtrace() in my custom shutdown function. This shows only one call and it's my custom shutdown function.
Log if reaches the end of script - Outside of the loop, I have a function that logs that the script exited the loop (and therefore would be reaching the end of the source file normally). When the script dies randomly, it's not logging this, so whatever kills it, kills it while it's in the middle of processing.
What other debugging methods would you suggest for finding the culprit?
Note: I should add that this is not an issue with max_execution_time, which is disabled for these scripts. The time before being killed is inconsistent. It could run for 10 seconds or 12 hours before it dies.
Update/Solution: Thank you all for your suggestions. By logging the output, I discovered that when a MySQL query failed, the script was set to die(). D'oh. Updated it to log the MySQL errors and then terminate. Got it working now like a charm!
I'd log memory usage of your script. Maybe it acquires too much memory, hits memory limit and dies?
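A small sketch of what such a log line could look like. The helper name memoryLogLine is hypothetical; the memory functions themselves are built into PHP:

```php
<?php
// Hypothetical helper: formats current and peak memory use for a log line.
function memoryLogLine(string $label): string
{
    return sprintf(
        '[%s] %s mem=%.1fMB peak=%.1fMB limit=%s',
        date('c'),
        $label,
        memory_get_usage(true) / 1048576,      // bytes currently allocated
        memory_get_peak_usage(true) / 1048576, // highest allocation so far
        ini_get('memory_limit')                // configured limit, e.g. "128M"
    );
}
```

Calling error_log(memoryLogLine('end of iteration')) once per loop iteration should show whether usage creeps towards the limit before the script dies.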
Remember, PHP has a setting in the ini file that says how long a script may run: max_execution_time
Make sure that you are not going over this, or use set_time_limit() to increase the execution time. Is this program running through a web server or via the CLI?
Adding: My Bad Experiences with PHP. Looking through some background scripts I wrote earlier this year. Sorry, but PHP is a terrible scripting language for doing anything for long lengths of time. I see that the newer PHP (which we haven't upgraded to) adds the functionality to force the GC to run. The problem I've been having is from using too much memory because the GC almost never runs to clean up itself. If you use things that recursively reference themselves, they also will never be freed.
Creating an array of 100,000 items allocates memory, but then setting the array to an empty array, or splicing everything out, does NOT free it immediately and doesn't mark it as unused (so making a new 100,000-element array increases memory further).
My personal solution was to write a perl script that ran forever, and system("php my_php.php"); when needed, so that the interpreter would free completely. I'm currently supporting 5.1.6, this might be fixed in 5.3+ or at the very least, now they have GC commands that you can use to force the GC to cleanup.
Simple script
#!/usr/bin/perl -w
use strict;
while(1) {
    if( system("php /to/php/script.php") != 0 ) {
        sleep(30);
    }
}
then in your php script
<?php
// do a single processing block
if( $moreblockstodo ) {
    exit(0);
} else {
    // no? then lets sleep for a bit until we get more
    exit(1);
}
?>
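For anyone on PHP 5.3 or later, the GC functions mentioned above are gc_enable() and gc_collect_cycles(). A minimal sketch of forcing a collection after creating self-referencing objects follows; the helper names are hypothetical, and the type declarations assume a much newer PHP than the 5.1.6 discussed here:

```php
<?php
// Builds two objects that reference each other and then drops them.
// Reference counting alone cannot free such a pair.
function makeCycle(): void
{
    $a = new stdClass();
    $b = new stdClass();
    $a->peer = $b;
    $b->peer = $a;
}

// Hypothetical helper: create $count cycles, force a collection, and
// return whatever gc_collect_cycles() reports as collected.
function leakThenCollect(int $count): int
{
    gc_enable();
    for ($i = 0; $i < $count; $i++) {
        makeCycle();
    }
    return gc_collect_cycles();
}
```

Calling gc_collect_cycles() at the end of each processing block is one thing to try before resorting to restarting the interpreter from an outer script.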
I'd log the state of the function to a file in a few different places in each loop.
You can get the contents of most variables as a string with var_export, using the var_export($varname,true) form.
You could just log this to a certain file, and keep an eye on it. The latest state of the function before the log ends should provide some clues.
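A minimal sketch of such a logger, assuming a writable log file. The helper name logState and its parameters are hypothetical:

```php
<?php
// Hypothetical helper: append a labelled snapshot of a variable to a log file.
function logState(string $file, string $label, $value): void
{
    // var_export(..., true) returns the representation as a string
    // instead of printing it.
    $line = sprintf("[%s] %s = %s\n", date('c'), $label, var_export($value, true));

    // FILE_APPEND keeps earlier entries; LOCK_EX avoids interleaved
    // writes when several workers share one log file.
    file_put_contents($file, $line, FILE_APPEND | LOCK_EX);
}
```

Calling it at the top, middle and bottom of each loop iteration means the last entry in the file shows roughly where the script was when it stopped.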
Sounds like whatever is happening is not a standard php error. You should be able to throw your own errors using a try... catch statement that should then be logged. I don't have more details other than that because I'm on my phone away from a pc.
I've encountered this before on one of our projects at work. We have a similar setup - a PHP script checks the DB if there are tasks to be done (such as sending out an email, updating records, processing some data as well). The PHP script has a while loop inside, which is set to
while(true) {
    //do something
}
After a while, the script will also be killed somehow. I've already tried most of what has been said here, like setting max_execution_time, using var_export to log all output, adding a try/catch, redirecting the script's output ( php ... > output.txt) etc., and we've never been able to find out what the problem is.
I think PHP just isn't built to do background tasks by itself. I know it's not answering your question (how to debug this) but the way we worked around this is that we used a cronjob to call the PHP file every 5 minutes. This is similar to Jeremy's answer of using a perl script: it ensures that the interpreter is freed after the execution is done.
If this is on Linux, try to look into system logs - the process could be killed by the OOM (out-of-memory) killer (unlikely, you'd also see other problems if this was happening), or a segmentation fault (some versions of PHP don't like some versions of extensions, resulting in weird crashes).
this is more of a fundamental question at how apache/threading works.
in this hypothetical (read: sometimes i suck and write terrible code), i write some code that enters the infinite-recursion phase of its life. then, what's expected, happens. the server stalls.
even if i close the tab, open up a new one, and hit the site again (locally, of course), it does nothing. even if i hit a different domain i'm hosting through a vhost declaration, nothing. i normally have to wait a number of seconds before apache can begin handling traffic again. most of the time i just get tired and restart the server manually.
can someone explain this process to me? i have the php runtime setting 'ignore_user_abort' set to true to allow ajax calls that are initiated to keep running even if they close their browser, but would this being set to false affect it?
any help would be appreciated. didn't know what to search for.
thanks.
ignore_user_abort() allows your script (and Apache) to ignore a user disconnecting (closing browser/tab, moving away from page, hitting ESC, etc.) and continue processing. This is useful in some cases, for instance in a shopping cart once the user hits "yes, place the order". You really don't want an order to die halfway through the process, e.g. order's in the database, but the charge hasn't been sent to the payment facility yet. Or vice-versa.
However, while this script is busily running away in "the background", it will lock up resources on the server, especially the session file. PHP locks the session file to make sure that multiple parallel requests won't stomp all over the file, so while your infinite loop is running in the background, you won't be able to use any other session-enabled part of the site. And if the loop is intensive enough, it could tie up the CPU enough that Apache is unable to handle any other requests on other hosted sites, where the session lock might not apply.
If it is an infinite loop, you'll have to wait until PHP's own maximum allowed run time (set via set_time_limit() or the max_execution_time setting) kicks in and kills the script. There are also some server-side limiters, like Apache's RLimitCPU and TimeOut, that can handle situations like this.
Note that except on Windows, PHP doesn't count "external" time in the set_time_limit. So if your runaway process is doing database stuff, calling external programs via system() and the like, the time spent running those external calls is NOT accounted for in the parent's time limit.
If you write code that causes an (effectively) neverending loop, then apache will execute that, and be unable to respond to any additional new requests for a page, because it's trying to determine the page content (for the served page which caused the neverending loop) by executing the (non-terminating) php code.
Solution: don't write code that doesn't terminate (in a reasonable amount of time). Understand loop invariants.