I have a massive amount of data that needs to be read from MySQL, analyzed, and, based on the results, split up and stored in new records.
Five records take about 20 seconds, but the records vary in length so I can't really estimate how long the program will take. However, I have calculated that the process should not take much longer than 5 hours, so I'd like to run it overnight and feel quite sure that when I come back to the office the next morning the program is done.
Assuming the code is fail-safe (I know, right? ;) how should I set up the Apache / PHP / MySQL settings so that when I execute the script I can be sure that the program will not time out and/or run out of RAM?
(It is basically running in a loop, fetching sets of 100 rows until it can't fetch any more, so I am hoping that the variables being reset at the beginning of each iteration will keep the memory usage constant.)
The actual size of the database when dumped is 14 MB, so the volume of data is not that high.
(On a side note, it might also be that I haven't assigned the maximum resources in the server settings, so maybe that's why it takes 20 seconds to run 5 records.)
Make sure you have removed any max_execution_time limits by setting this to 0 (unlimited) in your PHP.ini or by calling set_time_limit(0). This will ensure that PHP doesn't stop the script mid-execution.
If at all possible, you should run the script from the CLI so that you don't have to worry about Apache timing your request out (it shouldn't, but it might).
Since you are working with only 15 MB of data I wouldn't worry about memory usage (128 MB is the default in PHP). If you are really worried you can remove memory limits in PHP by modifying the memory_limit to be either a higher number or -1 (infinite memory).
Keep in mind that modifying php.ini will affect all scripts interpreted by that installation. I prefer to use the appropriate ini-setting functions at the top of my scripts to prevent dangerous global changes.
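For example, a minimal sketch of per-script overrides (the values here are arbitrary placeholders) could look like this:

<?php
// Sketch only: raise the limits for this one script instead of editing php.ini.
set_time_limit(0);                 // no execution time limit for this run
ini_set('memory_limit', '256M');   // example value; -1 would mean no limit
// ... the long-running batch work goes here ...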
On a side note: This doesn't really sound like a job for PHP. I'm not trying to discourage your use of PHP here, but there are other languages that are better suited for command line usage.
Better to make your script exit and then restart it, storing the point where it left off last time. This ensures you do not have memory leaks, that the script does not run out of memory due to some error in garbage collection, and that execution continues after an unexpected failure.
A simple shell command would be:
while [ 1 ]; do php myPhpScript.php a; done
You can add other checks to ensure it is running properly.
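A rough sketch of the checkpoint idea (the file name, table, and credentials are made up for illustration):

<?php
// Process one batch per invocation, remembering the last processed id in a file.
$checkpointFile = 'last_id.txt';
$lastId = file_exists($checkpointFile) ? (int) file_get_contents($checkpointFile) : 0;

$db = mysqli_connect('localhost', 'user', 'pass', 'mydb');
$result = $db->query("SELECT * FROM records WHERE id > $lastId ORDER BY id LIMIT 100");

if ($result->num_rows === 0) {
    exit(0);   // nothing left to do; the shell loop will just start us again
}

while ($row = $result->fetch_assoc()) {
    // ... analyze the record and store the new records here ...
    $lastId = $row['id'];
}

file_put_contents($checkpointFile, $lastId);   // remember where we stopped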
I'd like to point out that scripts run via the CLI in PHP default to having no time limit, unlike scripts run through CGI, mod_php, etc.
And, as stated, avoid running this via Apache.
However, if you MUST do this, consider breaking it down. You could make a page that processes 5-10 results, appends to the dump file, then prints out either a meta refresh or some JavaScript to reload the page with a parameter telling it where it's up to, and continues until done.
Not recommended though.
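If you do go down that road anyway, a rough sketch of the self-reloading page (the offset parameter, batch size, and processRows() helper are only illustrative):

<?php
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;
$batch  = 10;   // process a handful of rows per request

$done = processRows($offset, $batch);   // hypothetical function doing the real work

if (!$done) {
    $next = $offset + $batch;
    // Reload the page, telling it where it is up to.
    echo '<meta http-equiv="refresh" content="1;url=?offset=' . $next . '">';
    echo "Processed up to row $next...";
} else {
    echo 'All done.';
}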
Adding to some of the other good options here: you might want to look at http://www.electrictoolbox.com/article/php/process-forking/ and also consider redirecting output to /dev/null if you don't need the requests to give back feedback.
Don't do this using a web interface. Run it from the command line, but look to see if your code can be optimised, or set breakpoints and do it in "chunks".
First of all, put set_time_limit(0); at the beginning of the script (see http://php.net/manual/en/function.set-time-limit.php).
As for the memory, you should take care of that by unsetting any variables, arrays, or references that you do not need on each iteration.
Better to run the script from the shell (CLI) or as a cron job.
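A small sketch of that pattern (fetchBatches(), analyze() and store() are placeholders, not real functions):

<?php
// Free per-iteration data explicitly so memory usage stays flat.
set_time_limit(0);
foreach (fetchBatches() as $rows) {      // e.g. a generator yielding 100 rows at a time
    foreach ($rows as $row) {
        $parts = analyze($row);
        store($parts);
        unset($parts);                   // drop what we no longer need
    }
    unset($rows);
    gc_collect_cycles();                 // optional: nudge the garbage collector
}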
As far as I know MySQL connections do not time out, so you should be safe by setting:
php_value max_execution_time X
in a .htaccess file or placing set_time_limit(X) at the beginning of your script where X is a comfortable value in seconds.
I have a script that updates my database with listings from eBay. The amount of sellers it grabs items from is always different and there are some sellers who have over 30,000 listings. I need to be able to grab all of these listings in one go.
I already have all the data pulling/storing working since I've created the client side app for this. Now I need an automated way to go through each seller in the DB and pull their listings.
My idea was to use CRON to execute the PHP script which will then populate the database.
I keep getting Internal Server Error pages when I'm trying to execute a script that takes a very long time to execute.
I've already set
ini_set('memory_limit', '2G');
set_time_limit(0);
error_reporting(E_ALL);
ini_set('display_errors', true);
in the script but it still keeps failing at about the 45 second mark. I've checked ini_get_all() and the settings are sticking.
Are there any other settings I need to adjust so that the script can run for as long as it needs to?
Note the warnings from the set_time_limit function:
This function has no effect when PHP is running in safe mode. There is no workaround other than turning off safe mode or changing the time limit in the php.ini.
Are you running in safe mode? Try turning it off.
This is the bigger one:
The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real.
Are you using external system calls to make the requests to eBay, or long calls to the database?
Profile your PHP script and look for particularly long operations (> 45 seconds). Try to break those operations into smaller chunks.
Well, as it turns out, I overlooked the fact that I was testing the script through the browser, which means Apache was handling the PHP process. It was executed with mod_fcgid, which had a timeout of exactly 45 seconds.
Executing the script directly from shell and CRON works just fine.
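For reference, a crontab entry along these lines (paths and the script name are placeholders) runs the script through the CLI binary once an hour, bypassing Apache and mod_fcgid entirely:

0 * * * * /usr/bin/php /path/to/update_listings.php >> /var/log/update_listings.log 2>&1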
According to the documentation:
max_execution_time only affect the execution time of the script itself.
Any time spent on activity that happens outside the execution of the script
such as system calls using system(), stream operations, database queries, etc.
is not included when determining the maximum time that the script has been running.
This is not true on Windows where the measured time is real.
This is confirmed by testing:
Will not time out
<?php
set_time_limit(5);
$sql = mysqli_connect('localhost','root','root','mysql');
// The 10-second SLEEP() runs inside MySQL, so on Linux it does not count
// against PHP's 5-second execution limit.
$query = "SELECT SLEEP(10) FROM mysql.user;";
$sql->query($query) or die($query.'<br />'.$sql->error);
echo "You got the page";
Will time out
<?php
set_time_limit(5);
// Busy-waiting consumes CPU time inside PHP itself, so the 5-second limit applies.
while (true) {
    // do nothing
}
echo "You got the page";
Our problem is that we really would like PHP to time out, regardless of what it is doing, after a given amount of time (as we don't want to keep resources busy if we know we've failed to deliver a page in an acceptable amount of time, like 10 seconds). We know we can play with settings such as the MySQL wait_timeout for the SQL queries, but the page timeout will depend on the number of queries that are executed.
Some people have tried to come up with workarounds and it doesn't seem implementable.
Q: Is there an easy way to get a real PHP max_execution_time on linux, or are we better timing out elsewhere, such as Apache level?
This is quite tricky advice, but it will definitely do what you want, if you are willing to modify and recompile PHP.
Take a look at the PHP source code at https://github.com/php/php-src/blob/master/Zend/zend_execute_API.c (the file is Zend/zend_execute_API.c), at function zend_set_timeout. This is the function that implements time limit. Here's how it works on different platforms:
on Windows, create a new thread, start a timer on it, and when it finishes, set a global variable called timed_out to 1; the PHP execution core checks this variable for every instruction and then exits (very simplified)
on Cygwin, use itimer with ITIMER_REAL, which measures real time, including any sleep, wait, whatever, then raise a signal that will interrupt any processing and stop the script
on other Unix systems, use itimer with ITIMER_PROF, which only measures CPU time spent by the current process (both in user mode and kernel mode). This means waiting for other processes (like MySQL) doesn't count toward the limit.
Now what you want to do is change the itimer on your Linux from ITIMER_PROF to ITIMER_REAL, which of course you need to do manually: edit, recompile, install, etc. The other difference between the two is that they also use a different signal when the timer runs out. So my suggestion is to change the ifdef:
# ifdef __CYGWIN__
into
# if 1
so that you set both ITIMER_REAL and the signal that PHP waits for to SIGALRM.
Anyway, this whole idea is largely untested (I use it on one very specific system where ITIMER_PROF is broken, and it seems to work), unsupported, etc. Use it at your own risk. It may work with PHP itself, but it could break other modules, in PHP and in Apache, if they, for whatever reason, use the SIGALRM signal or another timer.
This is an old and answered question. But for the sake of helping others, I wanted to point out the request_terminate_timeout php-fpm option. If you're using PHP-FPM, it is most likely what you need.
If set, this option allows you to tell PHP-FPM to kill a request after N seconds, regardless of what PHP does.
See http://php.net/manual/en/install.fpm.configuration.php#request-terminate-timeout for details.
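For example, in the pool configuration (e.g. www.conf), a setting like the following would kill any request after 10 seconds, no matter what PHP is doing:

; terminate the worker handling a request after 10 seconds
request_terminate_timeout = 10s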
From httpd.conf:
Timeout: The number of seconds before receives and sends time out
Timeout 300
I am looking for the PHP equivalent for VB doevents.
I have written a realtime analysis package in VB and used doevents to release to the operating system.
Doevents allows me to stay in memory and run continuously without filling up memory and allows me to respond to user input.
I have rewritten the package in PHP and I am looking for that same doevents feature.
If it doesn't exist I could reschedule myself and exit.
But I currently don't know how to do that and I think that would add a lot more overhead.
Thank you, gerardg
usleep is what you are looking for. It delays program execution for the given number of microseconds.
http://php.net/manual/en/function.usleep.php
It's been almost 10 years since I last wrote anything in VB, and as I recall, the DoEvents() function allowed the application to yield to the processor during intensive processing (usually to allow other system events to fire, the most common being WM_PAINT, so that your UI won't appear hung).
I don't think PHP has such functionality - your script will run as a single process and end (either when it's done or when it hits the default 30 second timeout).
If you are thinking in terms of threads (as most Windows programmers tend to do) and needing to spawn more than one instance of your script, perhaps you should take a look at PHP's Process Control functions as a start.
I'm not entirely sure which aspects of doevents you're looking to emulate, so here's pretty much everything that could be useful for you.
You can use ob_implicit_flush(true) at the top of your script to enable implicit output buffer flushing. That means that whenever your script calls echo or print or whatever you use to display stuff, PHP will automatically send it all to the user's browser. You could also just use ob_flush() after each call to display something, which acts more like Application.DoEvents() in VB with regards to keeping your UI active, but must be called each time something is output.
Naturally if your script uses the output buffer already, you could build a copy of the buffer before flushing, with ob_get_contents().
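A minimal sketch of the implicit-flushing approach described above (the loop body is just a stand-in for real work):

<?php
ob_implicit_flush(true);        // flush automatically on every piece of output
while (@ob_end_flush()) {}      // drop any buffers started before this script ran

for ($i = 1; $i <= 10; $i++) {
    sleep(1);                   // stand-in for one unit of real work
    echo "Finished step $i of 10<br>\n";
    flush();                    // push the bytes out to the web server / browser
}
echo 'Done.';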
If you need to allow the script to run for more time than usual, you can set a longer timeout with set_time_limit($time). If you need more memory, and you have access to edit your .htaccess file, place the following code and edit the value:
php_value memory_limit 64M
That sets the memory limit to 64 megabytes.
For running multiple scripts at once, you can use pcntl_exec to start another one running.
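Note that pcntl_exec() replaces the current process rather than spawning a new one, so a typical sketch forks first and execs in the child (worker.php is a hypothetical script name):

<?php
$pid = pcntl_fork();
if ($pid === -1) {
    die("fork failed\n");
} elseif ($pid === 0) {
    // Child: replace this process with the other script.
    pcntl_exec(PHP_BINARY, ['worker.php']);
    exit(1);                    // only reached if pcntl_exec() fails
}
// Parent: carry on, or wait for the child to finish.
pcntl_waitpid($pid, $status);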
If I am missing something important about DoEvents(), let me know and I will try to help you make it work.
PHP is designed for on-demand request processing. However, it can be forced to become a background task with a little hackery.
As PHP is running as a single thread, you do not have to worry about letting the CPU do other things, as that is already taken care of. If this were not the case, a web server would only be able to serve up one page at a time and all other requests would have to sit in a queue. You will need to write some sort of loop that never exits until some detectable condition happens (like the "now please exit" message you set in the DB or something).
As pointed out by others, you will need to call set_time_limit($something);, perhaps with usleep() stopping the code from running "too fast" if it eats too much CPU each loop. However, if you are also using a database connection, most of your script's time is actually spent waiting for the database (by far the biggest overhead for a script).
I have seen PHP worker threads created by using screen and detaching it to a background task. Other approaches also work as long as you do not have a session that will time out or exit (say, when the web browser is closed). A cron job that checks whether the script is running every x minutes or hours gives you automatic recovery from forced exits and/or system restarts.
TL;DR: doevents is "baked in" to PHP and you don't have to worry about it.
I have a PHP script, run as a cron job, that executes a set of simple tasks that loops for each user in the database and takes about 30 minutes to complete. This process starts over every hour and needs to be as fast and efficient as possible. The problem I'm having is that, as with any server script, execution time varies and I need to figure out the best cron time settings.
If I run cron every minute, I need to stop the last loop of the script 20 seconds before the end of the minute to make sure that the current loop finishes in time. Over the course of the hour this adds up to a lot of wasted time.
I'm wondering if it's a bad idea to simply remove the PHP execution time limit and run the script once an hour and let it run to completion... is this a bad idea?
Instead of setting the max_execution_time you could also use set_time_limit() to reset the counter on every loop. This will ensure your script never runs out of time unless there is something seriously hanging within the current loop (and taking longer than the max_execution_time).
Basically this should make your script run as long as it needs while giving it a 30-second timeout between two set_time_limit() calls (assuming the default max_execution_time of 30 seconds).
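In other words, something along these lines (getAllUsers() and processUser() are only placeholders):

<?php
foreach (getAllUsers() as $user) {
    set_time_limit(30);    // give each user's work its own 30-second budget
    processUser($user);
}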
Assuming you'd like the work done ASAP, don't use cron. Cron is good for things that need to happen at specific times. It's often abused to simulate a background process that would ideally process work as soon as work appears. You should probably write a daemon that runs continuously. (Note: you could also look at a message/work-queue type system; there are nice libraries out there to do this too.)
You can write a daemon from scratch using the pcntl functions (since you don't care about multiple worker processes, it's super-easy to get a process running in the background), or cheat and just make a script that runs forever and run it via screen, or leverage some solid library code like PEAR's System_Daemon or nanoserv.
Once the daemonization stuff is taken care of, all you really care about is having a loop that runs forever. You'll want to take care that your script doesn't leak memory, or consume too many resources.
Generally, you can do something like:
<?php
// some setup code
while (true) {
    $todo = figureOutIfIHaveWorkToDo();
    foreach ($todo as $something) {
        // do stuff with $something
        // remember to clean up resources so you don't leak memory!
        usleep(/* some integer */);
    }
    usleep(/* some other integer */);
}
And it'll work pretty well.
Setting the time limit to 0 and letting it do its thing is fairly typical of PHP based cronjobs (in my experience), but this is also the point when you should ask yourself a few important questions, such as "Should I rewrite this job in a compiled language?" and "Am I using all of my tools (database, etc) to their maximum efficiency?"
That said, maybe better than completely removing the time limit would be to set it to the upper limit you actually want. If that means 48 minutes, then set_time_limit(48 * 60);
I really think you shouldn't set the timeout to 0; that is just looking for trouble. At most, set it to 59*60 seconds. Setting it to 0 might cause security problems: if a script hangs, it will hang almost forever until the server host stops the execution. It is considered bad practice to do so.
I have used the php command-line interface for similar long running tasks in the past. You probably do not want to remove the execution time limit for any request.
Sounds like a great idea if there's little chance that it will take more than an hour. Note, however, that the wrong bug can be a really good way of making it take longer than expected.
To avoid all sorts of nasty problems, you should have a guard file containing the process ID of the script. On startup, you should check that the file doesn't exist, or, if it does, that the process whose ID is in the file is no longer running (via a posix_kill($pid, 0) call, which only checks for existence). If these conditions are met, create a new file with the script's PID and delete the file when you're done.
This is the same trick that many daemons use to ensure they aren't already running. If the daemon was killed suddenly, the file will still exist, but the PID recorded in it is unlikely to belong to a running process.
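A sketch of that guard-file trick (the lock path is arbitrary):

<?php
$lockFile = '/tmp/myscript.pid';

if (file_exists($lockFile)) {
    $oldPid = (int) file_get_contents($lockFile);
    // posix_kill() with signal 0 only checks whether the process exists.
    if ($oldPid > 0 && posix_kill($oldPid, 0)) {
        exit("Already running as PID $oldPid\n");
    }
    // Otherwise the previous run died without cleaning up; carry on.
}

file_put_contents($lockFile, getmypid());

// ... do the actual work here ...

unlink($lockFile);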
Depending on what your script does, removing the time limit can lead to problems. If, for example, you are polling an external server that is unresponsive while the job is running, and your cron job takes 2 hours instead of 30 minutes to complete, you may get a stack of PHP processes being fired up even though the previous ones haven't completed yet. This can cause system instability and crashes.
You probably have two options:
Make sure that no other instance of your script is running beforehand, otherwise exit() on start.
Consider changing your cronjob into a daemon.
Does it have to run hourly like clockwork?
If not, split the job (you mentioned it was more than one simple task) and do each task every hour?
Or split it per user: do A-M one hour, then N-Z the next?
I have a PHP app that calls a class called Client. Every so often I get a sort of timeout error. I thought it was SQL at first, but it turns out it's pointing to the class itself.
Fatal error: Maximum execution time of 30 seconds exceeded in C:\Program Files (x86)\Apache Software Foundation\Apache2.2\htdocs\ClientPortal\classes\Connections.php on line 3
<?php
session_start();
class Connections { //line 3
Does anyone know what's going on here?
thanks,
Billy
PHP scripts have a maximum time they're allowed to execute for, as declared in the php.ini.
You can circumvent this if you really want by adding the following line:
ini_set('max_execution_time', 123456);
where 123456 is the number of seconds you want the limit to be.
You can also use the set_time_limit function, which I only just found out about and assume does the same thing. I've always just done the former though.
You can change it in the php.ini file, but you might be using your script to do a batch operation or something. You wouldn't want a PHP script that is being accessed by an end user to sit there hanging for 30 seconds or more though, so you're better off leaving it at the default or even turning it down in the php.ini file, and setting the max_execution_time on an as-needed basis.
As seengee points out in the comment below, you can set the max_execution_time to 0 to stop the error from ever happening, but seengee is right to say that at least for a web request, you really shouldn't do this. For the php command line interpreter, this behaviour is the default though.
If you're seeing this problem for things that are supposed to be used by end-users through a web request, you might have to do some profiling to work out the real cause. If you're doing MySQL queries, start by turning on the slow query log. It's particularly good at letting you know when you've forgotten an index, or if you're doing something else inefficient.
You can also shove a few $s = microtime(true); yourstuff(); var_dump(microtime(true)-$s); snippets around to get a vague overview of which bits are slowing things down; just make sure you don't leave any of them in afterwards!
If you're still struggling to find the root cause, set xdebug up on your local machine and run the profiler. The extension is available as a precompiled windows binary (although there seems to be a confusing array of versions). You can inspect the results of running the profiler using wincachegrind.