I am using a script with set_time_limit(60*60*24) to process a large number of images. But after 1k images or so (1 or 2 minutes), the script stops without showing any errors on the command line.
I'm also using a logger that, on shutdown (via register_shutdown_function), writes to a file any error thrown by the script. But when this script stops, nothing is written (it should write something even if no errors are thrown; it has worked perfectly with every other script, in every other situation I've had).
Apache error_log doesn't show anything either.
Any ideas?
Edit: My environment is CentOS 5.5 with PHP 5.3.
It is probably running out of memory.
ini_set('memory_limit', '1024M');
May get you going if you can allocate that much.
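One thing worth ruling out before (or alongside) raising the limit is image resources that are never freed - for example if they are kept in an array between iterations. A minimal sketch, assuming the images are loaded with GD ($imagePaths and the JPEG loader are placeholders):
foreach ($imagePaths as $path) {        // $imagePaths: your own list of files
    $img = imagecreatefromjpeg($path);  // or imagecreatefrompng(), etc.
    // ... resize / convert / save ...
    imagedestroy($img);                 // free the GD resource explicitly, don't keep it around
}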
Please make sure you're not running in safe mode:
http://php.net/manual/en/features.safe-mode.php
Please note that register_shutdown_function does NOT guarantee that the associated function will be executed every time, so you should not rely on it.
See http://php.net/register_shutdown_function
To debug the issue, check the PHP error log (which is NOT the Apache error log when you're running PHP from the console; check your php.ini or ini_get('error_log') to find out where it is).
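A quick way to check which php.ini the CLI is using and where errors are being sent (a small sketch to run from the console):
// Which ini file did the CLI load, and where do errors go?
// An empty error_log means errors are written to stderr for the CLI SAPI.
var_dump(php_ini_loaded_file());
var_dump(ini_get('log_errors'));
var_dump(ini_get('error_log'));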
A solution may be to write a simple wrapper script in bash that executes the PHP script and then does whatever you want executed at the end.
Also note that PHP doesn't count the time spent in external, non-PHP activities, like network calls, some library functions, ImageMagick, etc.
So the time limit you set may actually last much longer than you expect it to.
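You can see this for yourself on a typical Linux build (a tiny sketch; sleep() counts as external time there, so it is not charged against the limit):
set_time_limit(2);          // nominal 2-second limit
$start = microtime(true);
sleep(5);                   // external time, not counted towards max_execution_time on Linux
echo 'Still running after ', round(microtime(true) - $start), " seconds\n";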
Related
I am running a cron job to update the inventory in our database from the ERP inventory CSV file. The ERP inventory CSV file contains almost 19K records. The cron picks the records up one by one and updates the matching inventory in the database. But for the past few days, only 13K-14K of the 19K records get parsed and the script breaks in the middle.
I have tried running the script directly from the browser as well, but it raises the same issue. No error is displayed in the error log.
I thought it was a timeout issue and increased max_execution_time to 1500 (25 min), but the issue is still not resolved.
Can anyone suggest how to solve this issue? Thanks in advance!
Did you check the cron log? Which operating system are you using?
No error is displayed in the error log.
Then your first course of action is to verify that your error logging is working as expected. If it is a (PHP) timeout or a memory limit issue then the reason will be flagged - but it might not be getting reported.
You forgot to tell us how cron runs the task - is this via the CLI SAPI, or are you using an HTTP client (wget, curl etc.) to invoke it via the webserver? The SAPIs have very different behaviours and usually use separate php.ini files.
I thought it was a timeout issue
Because you've checked and it always bombs out at the same interval after starting?
But for the past few days, only 13K-14K of the 19K records get parsed
And previously it was taking less than the identified amount of time to complete?
and increased max_execution_time
How? In the script? In the (right) php.ini? Note that if the script is running via the webserver, then there may also be a timeout configured on the webserver.
You might consider prefixing your script with:
set_time_limit(3000);
error_reporting(E_ALL);
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
And capturing the stderr and stdout from the script.
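If you also want those errors kept somewhere you can read later, not just printed to the terminal, you could add something like this (the log path is purely illustrative - pick somewhere the cron user can write):
ini_set('log_errors', 1);
ini_set('error_log', '/tmp/inventory-import.log');  // illustrative path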
Try splitting the CSV into two files and running the cron on each - I think your CSV file has a problem. If one of them executes without trouble then the process itself has no issue; repeat the process to find the block with the problem.
A few years ago I worked on an interface which read a file to export to SAP; sometimes a single special character would make the script break.
I run a large number of PHP scripts via CLI executed by CRON in the following format :
/usr/bin/php /var/www/html/a15432/run.php
/usr/bin/php /var/www/html/a29821/run.php
In some instances there is a problem with one of the files, whereby the execution of the file enters a loop or simply does not terminate due to an error in that particular script.
What I would like to achieve is that if either of the above occurs, PHP CLI simply stops executing the script.
I have searched this and all indications are that I need to use this command, which I have entered at the beginning of each file, but it does not seem to be making any difference.
ini_set('max_execution_time', 30);
Is there any other way I can better protect my server so that one (or more) of these rogue files cannot be allowed to bring down the server by entering some kind of infinite loop and using all of the server's resources trying to process the file? To make the problem worse, these files can be triggered every 5 minutes, which means that at times there are many copies of the same files trying to process.
Of course I realise that the correct solution is to fix the files so that there is better error handling and that they are never allowed to enter this state, but at the moment I'd like to understand why this doesn't work as expected.
Thank you.
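For the every-5-minutes trigger piling up overlapping copies of the same script, one common guard is a lock file. This is only a sketch (the lock path and the flock-based approach are my own assumption, not something from the scripts above):
// Refuse to start if another instance of this script still holds the lock.
$lock = fopen('/tmp/run-a15432.lock', 'c');          // illustrative lock path
if ($lock === false || !flock($lock, LOCK_EX | LOCK_NB)) {
    exit(0);                                         // a previous run is still going - bail out quietly
}
// ... the original run.php work goes here ...
flock($lock, LOCK_UN);
fclose($lock);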
I am working on web scraping with PHP and cURL to scrape a whole website,
but it takes more than one day to complete the process of scraping.
I have even used
ignore_user_abort(true);
set_error_handler(array(&$this, 'customError'));
set_time_limit (0);
ini_set('memory_limit', '-1');
I have also cleared the memory after scraping each page. I am using Simple HTML DOM
to get the scraping details from a page.
But the process still runs and works fine only for some number of links; after that it stops, although the browser keeps on loading, and no error log is generated.
I could not understand what the problem seems to be.
Also, I need to know whether PHP can
run a process for two or three days?
Thanks in advance
PHP can run for as long as you need it to, but the fact it stops after what seems like the same point every time indicates there is an issue with your script.
You said you have tried ignore_user_abort(true);, but then indicated you were running this via a browser. This setting only works in command line as closing a browser window for a script of this type will not terminate the process anyway.
Do you have Xdebug? Simple HTML DOM will throw some rather interesting errors with malformed HTML (a link within a broken link, for example). Xdebug will throw a MAX_NESTING_LEVEL error in a browser, but will not throw this in a console unless you have explicitly told it to with the -d flag.
There are lots of other errors, notices, warnings etc which will break/stop your script without writing anything to error_log.
Are you getting any errors?
When using cURL in this way it is important to use multi cURL to process URLs in parallel - depending on your environment, 150-200 URLs at a time is easy to achieve.
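A minimal curl_multi sketch of that idea (the URL list and options are placeholders, and error handling is trimmed down):
$urls = array('http://example.com/a', 'http://example.com/b');  // placeholder batch of URLs
$mh = curl_multi_init();
$handles = array();

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Run all handles until every transfer has finished.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

foreach ($handles as $url => $ch) {
    $html = curl_multi_getcontent($ch);   // fetched body, ready for the DOM parser
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
    // ... hand $html to your parser here ...
}
curl_multi_close($mh);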
If you have truly sorted out the memory issue and freed all available space like you have indicated, then the issue must be with a particular page it is crawling.
I would suggest running your script via a console and finding out exactly when it stops to run that URL separately - at least this will indicate if it is a memory issue or not.
Also remember that set_error_handler(array(&$this, 'customError')); will NOT catch every type of error PHP can throw.
When you next run it, debug via a console to show progress, and keep a track of actual memory use - either via PHP (printed to console) or via your systems process manager. This way you will be closer to finding out what the actual issue with your script is.
Even if you set unlimited memory, there is still a physical limit.
If you call the URLs recursively, memory can fill up.
Try to do a loop and work with a database instead (a sketch follows):
scan a page, and store the found links if they aren't in the database yet.
When finished, do a select and get the first unscanned URL.
{loop}
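A rough sketch of that queue-driven loop, assuming a PDO connection and a urls table with id, url and scanned columns, plus a UNIQUE index on url (all names here are illustrative, and scanPage() stands in for your own scraping code):
$pdo = new PDO('mysql:host=localhost;dbname=crawler', 'user', 'pass');  // illustrative credentials
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

while (true) {
    // Get the next unscanned URL from the queue.
    $row = $pdo->query("SELECT id, url FROM urls WHERE scanned = 0 LIMIT 1")->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        break;                              // nothing left to scan
    }

    $links = scanPage($row['url']);         // your own function: fetch + parse, return an array of URLs

    $insert = $pdo->prepare("INSERT IGNORE INTO urls (url, scanned) VALUES (?, 0)");
    foreach ($links as $link) {
        $insert->execute(array($link));     // INSERT IGNORE relies on the UNIQUE index on url
    }

    $pdo->prepare("UPDATE urls SET scanned = 1 WHERE id = ?")->execute(array($row['id']));
}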
I have a PHP script that grabs a chunk of data from a database, processes it, and then looks to see if there is more data. This processes runs indefinitely and I run several of these at a time on a single server.
It looks something like:
<?php
while($shouldStillRun)
{
    // do stuff
}
logThatWeExitedLoop();
?>
The problem is, after some time, something causes the process to stop running and I haven't been able to debug it and determine the cause.
Here is what I'm using to get information so far:
error_log - Logging all errors, but no errors are shown in the error log.
register_shutdown_function - Registered a custom shutdown function. This does get called so I know the process isn't being killed by the server, it's being allowed to finish. (or at least I assume that is the case with this being called?)
debug_backtrace - Logged a debug_backtrace() in my custom shutdown function. This shows only one call and it's my custom shutdown function.
Log if reaches the end of script - Outside of the loop, I have a function that logs that the script exited the loop (and therefore would be reaching the end of the source file normally). When the script dies randomly, it's not logging this, so whatever kills it, kills it while it's in the middle of processing.
What other debugging methods would you suggest for finding the culprit?
Note: I should add that this is not an issue with max_execution_time, which is disabled for these scripts. The time before being killed is inconsistent. It could run for 10 seconds or 12 hours before it dies.
Update/Solution: Thank you all for your suggestions. By logging the output, I discovered that when a MySQL query failed, the script was set to die(). D'oh. I updated it to log the MySQL errors and then terminate. Got it working now like a charm!
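For reference, the fix described above looks roughly like this (a sketch using the old mysql_* API that a script of that era would likely be on; the query is a placeholder):
$result = mysql_query("SELECT ...");            // placeholder query
if ($result === false) {
    // Log the reason instead of dying silently, then stop.
    error_log('MySQL error: ' . mysql_error());
    exit(1);
}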
I'd log the memory usage of your script. Maybe it acquires too much memory, hits the memory limit and dies?
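Something like this inside the loop would show whether usage keeps climbing (the log path and the interval are arbitrary):
$iteration = 0;
while ($shouldStillRun) {
    // do stuff

    if (++$iteration % 1000 === 0) {
        // current usage in bytes; a steady climb points at a leak
        file_put_contents('/tmp/worker-memory.log',
            date('c') . ' ' . memory_get_usage(true) . "\n", FILE_APPEND);
    }
}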
Remember, PHP has a setting in the ini file that says how long a script should run: max_execution_time.
Make sure that you are not going over this, or use set_time_limit() to increase the execution time. Is this program running through a web server or via CLI?
Adding: my bad experiences with PHP, from looking through some background scripts I wrote earlier this year. Sorry, but PHP is a terrible scripting language for doing anything that runs for long lengths of time. I see that newer PHP (which we haven't upgraded to) adds the ability to force the GC to run. The problem I've been having is using too much memory, because the GC almost never runs to clean up after itself. If you use things that recursively reference themselves, they will also never be freed.
Creating an array of 100,000 items allocates memory, but then setting the array to an empty array or splicing it all out does NOT free it immediately, and doesn't mark it as unused (i.e. making a new 100,000-element array increases memory).
My personal solution was to write a Perl script that ran forever and called system("php my_php.php"); when needed, so that the interpreter would be freed completely. I'm currently supporting 5.1.6; this might be fixed in 5.3+, or at the very least they now have GC commands that you can use to force the GC to clean up.
Simple script
#!/usr/bin/perl -w
use strict;
while(1) {
    if( system("php /to/php/script.php") != 0 ) {
        sleep(30);
    }
}
then in your php script
<?php
// do a single processing block
if( $moreblockstodo ) {
    exit(0);
} else {
    // no? then lets sleep for a bit until we get more
    exit(1);
}
?>
I'd log the state of the function to a file in a few different places in each loop.
You can get the contents of most variables as a string with var_export, using the var_export($varname,true) form.
You could just log this to a certain file, and keep an eye on it. The latest state of the function before the log ends should provide some clues.
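For example, something along these lines at a few checkpoints (the path, label and variable are placeholders):
file_put_contents('/tmp/worker-state.log',                               // illustrative path
    date('c') . ' checkpoint A: ' . var_export($currentRow, true) . "\n",
    FILE_APPEND);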
Sounds like whatever is happening is not a standard PHP error. You should be able to throw your own exceptions using a try...catch statement and log them when caught. I don't have more details other than that because I'm on my phone, away from a PC.
I've encountered this before on one of our projects at work. We have a similar setup - a PHP script checks the DB for tasks to be done (such as sending out an email, updating records, or processing some data). The PHP script has a while loop inside, which is set to
while(true) {
    // do something
}
After a while, the script will also be killed somehow. I've already tried most of what has been said here, like setting max_execution_time, using var_export to log all output, placing a try/catch, making the script write its output to a file (php ... > output.txt), etc., and we've never been able to find out what the problem is.
I think PHP just isn't built to do background tasks by itself. I know it's not answering your question (how to debug this), but the way we worked around this is that we used a cronjob to call the PHP file every 5 minutes. This is similar to Jeremy's answer of using a Perl script - it ensures that the interpreter is freed after the execution is done.
If this is on Linux, try to look into system logs - the process could be killed by the OOM (out-of-memory) killer (unlikely, you'd also see other problems if this was happening), or a segmentation fault (some versions of PHP don't like some versions of extensions, resulting in weird crashes).
After my host enabled suPHP, a previously working script has been timing out after ~3 min (it varies, but AFAIK the script has not run for more than 3).
The odd part is, the script is not throwing any errors that I can see (and yes, full PHP error reporting/logging is enabled, and all MySQL queries have been checked for errors as well); it simply stops.
Refreshing the page will load more of the data the script is supposed to process (probably because the MySQL queries have been cached...), but if there is a lot of data to process it never fully executes.
The other oddity is that I can run test scripts for over 10 minutes on the same host w/ set_time_limit(0); / etc.
Has anyone else had to deal with this, or does anyone know what is causing the timeout and how to fix it (assuming that dropping suPHP is not an option)? There was also a simultaneous update from PHP 5.2.x to 5.3.x, but I doubt that is causing the issue. :/
I've seen this happen when memory runs out - the script just ends without error. If you have a loop, try using the memory functions to dump the memory status. Also, use phpinfo() to see what your maximum memory allowance is - the switch to suPHP may have changed that to your detriment.