What are the performance implications of a php script calling 'exit'? - php

I've noticed many times that some PHP scripts call exit. It seems to me that this will force an exit of the httpd/apache child (of course another will be started if required for the next request).
But in the CMS, that next request will then require the entire init.php initialization again, plus of course the cost of tearing down and starting PHP in the first place.
It seems that the php files usually start with
if ( !defined( 'SMARTY_DIR' ) ) {
    include_once( 'init.php' );
}
which suggests that somebody was imagining that one php process would serve multiple requests. But if every script exits, then each php/apache process will serve one request only.
Any thoughts on the performance and security implications of removing many of the exit calls (especially from the most-frequently-called scripts like index.php etc) to allow one process to serve multiple requests?
Thanks, Peter
--ADDENDUM --
Thank you for the answers. That (PHP will never serve more than one request) is what I thought originally, until last week, when I was debugging a config variable that could only have been set in one script (because of the way the path was set up) but was still set in another script (this is on a webserver with about 20 hits/sec). In that case, I did not have a PHP exit call in the one script that set up its config slightly differently. But when I added the exit call to that one script (in the alternate directory), it solved the misconfiguration I was experiencing in all my main scripts in the main directory (which was due to having a CSS directory variable set erroneously in a previous page execution). So now I'm confused again, because according to all the answers so far, PHP should never serve more than one request.

exit does nothing to Apache processes (it certainly doesn't kill a worker!). It simply ends the execution of the PHP script and returns execution to the Apache process, which'll send the results to the browser and continue on to the next request.
The Smarty code you've excerpted doesn't have anything to do with a PHP process serving multiple requests. It just ensures that Smarty is initialised exactly once, which is useful if a PHP script might be either included by another script or accessed directly.

I think your confusion comes from what include_once is for. PHP is basically a "shared-nothing" system, where there are no real persistent server objects. include_once doesn't mean once per Apache child, but once per web request.
PHP may hack up a hairball if you include the same file twice. For instance, a function with a particular name can only be defined once. This led people to implement a copy of the old C #ifndef/#define/#include idiom once for each included file; include_once was the fix for this.
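The guard idiom that include_once replaced can be sketched like this (a minimal illustration; LIB_PHP_INCLUDED, greet, and lib.php are hypothetical names):

```php
<?php
// Pre-include_once idiom: guard each library file with a constant,
// mirroring C's #ifndef/#define/#include pattern.
if (!defined('LIB_PHP_INCLUDED')) {
    define('LIB_PHP_INCLUDED', true);

    // Defining this twice would be a fatal "cannot redeclare" error,
    // hence the guard above.
    function greet($name) {
        return "Hello, $name";
    }
}

// With include_once, the guard inside the library is unnecessary:
// include_once 'lib.php';   // safe to call any number of times
```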

Even if you do not call exit your PHP script is still going to end execution, at which point any generated HTML will be returned to the web server to send on to your browser.
The exit keyword allows you to signal to the PHP engine that your work is done and no further processing needs to take place.
Also note that exit is typically used for error handling and flow control - removing it from includes will likely break your application.
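The flow-control use of exit can be sketched like this (wrapped in a function purely for illustration; requireLogin and login.php are hypothetical names):

```php
<?php
// Typical flow-control use of exit: halt after issuing a redirect so
// that nothing after it runs.
function requireLogin(?int $userId): void {
    if ($userId === null) {
        header('Location: login.php');
        exit;   // without this, the rest of the page would still execute
    }
}

requireLogin(42);   // logged in, so execution continues
echo "Only logged-in users see this.\n";
```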

Related

Reliable PHP script reentrant lock

I have to make sure a certain PHP script (started by a web request) does not run more than once simultaneously.
With binaries, it is quite easy to check if a process of a certain binary is already around.
However, a PHP script may be run via several pathways, e.g. CGI, FCGI, inside webserver modules, etc., so I cannot use system commands to find it.
So how can I reliably check whether another instance of a certain script is currently running?
The exact same strategy is used as one would choose with local applications:
The process manages a "lock file".
You define a static location in the file system. Upon script startup, you check whether a lock file exists at that location; if so, you bail out. If not, you first create that lock file, then proceed. During teardown of your script you delete that lock file again. Such a lock file is a simple passive file; only its existence is of interest, usually not its content. This is a standard procedure.
You can win extra candy points if you use the lock file not only as a passive semaphore, but also store the process ID of the generating process in it. That allows subsequent attempts to verify whether that process still exists or has crashed in the meantime. This matters because such a crash would leave behind a stale lock file, creating a deadlock.
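A minimal sketch of the PID lock file (POSIX-only; assumes /tmp is writable and the posix extension is available, and the lock path is an arbitrary choice):

```php
<?php
// PID lock-file sketch: refuse to start if another live instance
// holds the lock; take over a stale lock left by a crashed run.
$lockFile = '/tmp/mytask.lock';

if (file_exists($lockFile)) {
    $pid = (int) file_get_contents($lockFile);
    // posix_kill() with signal 0 tests for process existence
    // without actually sending a signal.
    if ($pid > 0 && function_exists('posix_kill') && posix_kill($pid, 0)) {
        exit("Another instance (PID $pid) is running.\n");
    }
    // Stale lock from a crashed run: fall through and take over.
}

file_put_contents($lockFile, getmypid());
register_shutdown_function(function () use ($lockFile) {
    unlink($lockFile);   // release the lock on normal termination
});

// ... do the actual work here ...
```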
To work around the issue discussed in the comments (which correctly notes that in some scenarios where PHP scripts are used in a web environment, a process ID by itself may not be enough to reliably test whether a given task has been successfully and completely processed), one could use a slightly modified setup:
The incoming request does not directly trigger the task-performing PHP script itself, but merely a wrapper script. That wrapper manages the lock file while delegating the actual task into a sub-request to the HTTP server. This allows the controlling wrapper script to use the additional information of the request state. If the task-performing PHP script really crashes without prior notice, then the requesting wrapper knows about it: each request is terminated with a specific HTTP status code, which allows the wrapper to decide whether the task-performing request terminated normally or not. That setup should be reliable enough for most purposes. The chance of the trivial wrapper script itself crashing or being terminated falls into the area of a system failure, which is something no locking strategy can reliably handle.
As PHP does not always provide a reliable way of file locking (it depends on how the script is run, e.g. CGI, FCGI, server modules, and the configuration), some other environment for locking should be used.
The PHP script can, for example, call another PHP interpreter in its CLI variant. That would provide a unique PID that could be checked for locking. The PID can then be stored in a lock file, which can be checked for staleness by querying whether a process with that PID is still around.
Maybe it is also possible to do all tasks needing the lock inside a shell script. Shell scripts also provide a unique PID and release it reliably on exit. A shell script may also use a unique filename that can be checked to see if it is still running.
Also, semaphores (http://php.net/manual/de/book.sem.php) could be used, which are explicitly managed by the PHP interpreter to reflect a script's lifetime. They seem to work quite well, but there is not much information about how reliable they are in the case of premature script death.
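A minimal semaphore sketch (requires the sysvsem extension; the key 0x1234 is an arbitrary choice). The interpreter releases the semaphore automatically when the script ends, which is the lifetime behaviour described above:

```php
<?php
// At most 1 simultaneous holder of this semaphore.
$sem = sem_get(0x1234, 1);

if (!sem_acquire($sem, true)) {   // true = non-blocking acquire
    exit("Another instance holds the semaphore.\n");
}

// ... critical section: only one script instance runs this at a time ...

sem_release($sem);
```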
Also keep in mind that external processes launched by a PHP script may continue executing even after the script ends. For example, a user abort under FCGI releases passthru() processes, which carry on working even though the client connection is closed. They may be killed later if enough output accumulates, or not at all.
So such external processes have to be locked as well, which can't be done by the PHP-acquired semaphores alone.

How php files are executed

I am not sure that I correctly understand how php works in general.
I will explain how I understand it.
When you make a request to a webpage, for example http://supersite.com/index.php, the PHP interpreter first parses the whole index.php file and all its dependencies, checking for errors; after that, it is executed from the beginning of index.php to the end.
How do all frameworks work?
They do something called bootstrapping: in the first lines of index.php there can be a method call that directs the execution flow elsewhere (to another file), doing a lot of work there, such as loading classes into memory.
But if there is no exit command, execution will end at the end of the index.php file anyway.
If I understand correctly, this means that for each request PHP starts interpretation without any previous state, so when PHP responds to a user request it clears all objects in memory and finishes. For the user it seems like he is working with one program, but for the server and PHP each request is stateless, so the PHP interpreter has to load all classes required for execution into memory on each request?
Am I correct, or does PHP store in memory the last loaded class declarations (like the permanent generation in Java) and static variables?

How do php 'daemons' work?

I'm learning PHP and I'd like to write a simple forum monitor, but I came to a problem. How do I write a script that downloads a file regularly? When the page is loaded, the PHP is executed just once, and if I put it into a loop, it would all have to be run before the page finished loading. But I want to, say, download a file every minute and make a notification on the page when the file changes. How do I do this?
Typically, you'll act in two steps :
First, you'll have a PHP script that will run every minute -- using the crontab
This script will do the heavy job : downloading and parsing the page
And storing some information in a shared location -- a database, typically
Then, your webpages will only have to check in that shared location (database) if the information is there.
This way, your webpages will always work :
Even if there are many users, only the cronjob will download the page
And even if the cronjob doesn't work for a while, the webpage will still work; the worst possible outcome is some information being outdated.
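The two-step pattern above can be sketched like this (the URL and cache path are hypothetical stand-ins for the forum page and the shared location, and a file is used here instead of a database):

```php
<?php
// Cron-side half of the pattern. Schedule with a crontab entry such as:
//   * * * * * php /path/to/fetch.php
$url   = 'https://example.com/forum/latest';
$cache = '/tmp/forum_cache.html';

$data = @file_get_contents($url);   // may fail; the page then keeps the old cache
if ($data !== false) {
    file_put_contents($cache, $data);
}

// Web-page side: only ever read from the shared location, never fetch directly.
$page = is_readable($cache) ? file_get_contents($cache) : 'no data yet';
```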
Others have already suggested using a periodic cron script, which I'd say is probably the better option, though as Paul mentions, it depends upon your use case.
However, I just wanted to address your question directly, which is to say, how does a daemon in PHP work? The answer is that it works in the same way as a daemon in any other language - you start a process which doesn't end immediately, and put it into the background. That process then polls files or accepts socket connections or somesuch, and in so doing, accepts some work to do.
(This is obviously a somewhat simplified overview, and of course you'd typically need to have mechanisms in place for process management, signalling the service to shut down gracefully, and perhaps integration into the operating system's daemon management, etc. but the basics are pretty much the same.)
How do I write a script that downloads
a file regularly?
there are schedulers to do that, like 'cron' on Linux (or Unix)
When the page is loaded, the php is
executed just once,
just once, just like the index.php of your site....
If you want to update a page which is shown in a browser, then you should use some form of AJAX;
if you want something else, then your question is not clear to me...

Best practice to find source of PHP script termination

I have a PHP script that grabs a chunk of data from a database, processes it, and then looks to see if there is more data. This process runs indefinitely, and I run several of these at a time on a single server.
It looks something like:
<?php
while ($shouldStillRun)
{
    // do stuff
}
logThatWeExitedLoop();
?>
The problem is, after some time, something causes the process to stop running and I haven't been able to debug it and determine the cause.
Here is what I'm using to get information so far:
error_log - Logging all errors, but no errors are shown in the error log.
register_shutdown_function - Registered a custom shutdown function. This does get called so I know the process isn't being killed by the server, it's being allowed to finish. (or at least I assume that is the case with this being called?)
debug_backtrace - Logged a debug_backtrace() in my custom shutdown function. This shows only one call and it's my custom shutdown function.
Log if reaches the end of script - Outside of the loop, I have a function that logs that the script exited the loop (and therefore would be reaching the end of the source file normally). When the script dies randomly, it's not logging this, so whatever kills it, kills it while it's in the middle of processing.
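The shutdown hook mentioned above can be sketched like this (the log path is an arbitrary example; error_get_last() is used because it still sees fatal errors that bypass normal error handlers):

```php
<?php
// Log a timestamp and the last error (if any) when the script shuts
// down, whether it ends normally or dies on a fatal error.
register_shutdown_function(function () {
    $entry = date('c') . ' shutdown; last error: '
           . var_export(error_get_last(), true) . "\n";
    file_put_contents('/tmp/worker_shutdown.log', $entry, FILE_APPEND);
});
```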
What other debugging methods would you suggest for finding the culprit?
Note: I should add that this is not an issue with max_execution_time, which is disabled for these scripts. The time before being killed is inconsistent. It could run for 10 seconds or 12 hours before it dies.
Update/Solution: Thank you all for your suggestions. By logging the output, I discovered that when a MySql query failed, the script was set to die(). D'oh. Updated it to log the mysql errors and then terminate. Got it working now like a charm!
I'd log memory usage of your script. Maybe it acquires too much memory, hits memory limit and dies?
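A minimal sketch of that memory logging (the loop body and the cutoff are placeholders so the sketch terminates):

```php
<?php
// Log memory usage on each pass through the loop to test the
// memory-limit theory.
$shouldStillRun = true;
$i = 0;
while ($shouldStillRun) {
    // ... do stuff ...
    error_log(sprintf('iteration %d: %d bytes used (peak %d)',
        $i, memory_get_usage(true), memory_get_peak_usage(true)));
    if (++$i >= 3) {          // demo cutoff so the sketch terminates
        $shouldStillRun = false;
    }
}
```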
Remember, PHP has a setting in the ini file that says how long a script may run: max_execution_time.
Make sure that you are not going over this, or use set_time_limit() to increase the execution time. Is this program running through a web server or via the CLI?
Adding: My bad experiences with PHP. Looking through some background scripts I wrote earlier this year: sorry, but PHP is a terrible scripting language for doing anything for long lengths of time. I see that the newer PHP (which we haven't upgraded to) adds the ability to force the GC to run. The problem I've been having is using too much memory because the GC almost never runs to clean up after itself. If you use things that recursively reference themselves, they will also never be freed.
Creating an array of 100,000 items allocates memory, but then setting the array to an empty array or splicing it all out does NOT free it immediately, and doesn't mark it as unused (i.e. making a new 100,000-element array increases memory).
My personal solution was to write a perl script that ran forever, and system("php my_php.php"); when needed, so that the interpreter would free completely. I'm currently supporting 5.1.6, this might be fixed in 5.3+ or at the very least, now they have GC commands that you can use to force the GC to cleanup.
Simple script
#!/usr/bin/perl -w
use strict;
while (1) {
    if (system("php /to/php/script.php") != 0) {
        sleep(30);
    }
}
then in your php script
<?php
// do a single processing block
if ($moreblockstodo) {
    exit(0);
} else {
    // no? then let's sleep for a bit until we get more
    exit(1);
}
?>
I'd log the state of the function to a file in a few different places in each loop.
You can get the contents of most variables as a string with var_export(), using the var_export($varname, true) form.
You could just log this to a certain file, and keep an eye on it. The latest state of the function before the log ends should provide some clues.
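A sketch of that state logging (the log path, label, and variable names are examples only):

```php
<?php
// Dump chosen variables with var_export() at checkpoints inside the
// loop; the last entry before the log stops is the clue.
function logState(string $label, array $vars): void {
    $line = date('c') . " [$label] " . var_export($vars, true) . "\n";
    file_put_contents('/tmp/worker_state.log', $line, FILE_APPEND);
}

$batchId = 17;
$rows    = ['a', 'b'];
logState('before-processing', compact('batchId', 'rows'));
```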
Sounds like whatever is happening is not a standard PHP error. You should be able to catch your own exceptions using a try ... catch statement and log them. I don't have more details than that because I'm on my phone, away from a PC.
I've encountered this before on one of our projects at work. We have a similar setup - a PHP script checks the DB if there are tasks to be done (such as sending out an email, updating records, processing some data as well). The PHP script has a while loop inside, which is set to
while (true) {
    // do something
}
After a while, the script will also be killed somehow. I've already tried most of what has been said here, like setting max_execution_time, using var_export to log all output, placing a try/catch, redirecting the script's output (php ... > output.txt), etc., and we've never been able to find out what the problem is.
I think PHP just isn't built to do background tasks by itself. I know it's not answering your question (how to debug this), but the way we worked around this is that we used a cronjob to call the PHP file every 5 minutes. This is similar to Jeremy's answer of using a Perl script; it ensures that the interpreter is freed after the execution is done.
If this is on Linux, try to look into system logs - the process could be killed by the OOM (out-of-memory) killer (unlikely, you'd also see other problems if this was happening), or a segmentation fault (some versions of PHP don't like some versions of extensions, resulting in weird crashes).

why does apache/server die when i make a typo and cause a never ending loop?

this is more of a fundamental question at how apache/threading works.
in this hypothetical (read: sometimes i suck and write terrible code), i write some code that enters the infinite-recursion phase of its life. then what's expected happens: the server stalls.
even if i close the tab, open up a new one, and hit the site again (locally, of course), it does nothing. even if i hit a different domain i'm hosting through a vhost declaration, nothing. i normally have to wait a number of seconds before apache can begin handling traffic again. most of the time i just get tired and restart the server manually.
can someone explain this process to me? i have the php runtime setting 'ignore_user_abort' set to true to allow ajax calls that are initiated to keep running even if they close their browser, but would this being set to false affect it?
any help would be appreciated. didn't know what to search for.
thanks.
ignore_user_abort() allows your script (and Apache) to ignore a user disconnecting (closing the browser/tab, moving away from the page, hitting ESC, etc.) and continue processing. This is useful in some cases - for instance, in a shopping cart once the user hits "yes, place the order". You really don't want an order to die halfway through the process, e.g. the order is in the database but the charge hasn't been sent to the payment facility yet. Or vice versa.
However, while this script is busily running away in "the background", it will lock up resources on the server, especially the session file - PHP locks the session file to make sure that multiple parallel requests won't stomp all over the file, so while your infinite loop is running in the background, you won't be able to use any other session-enabled part of the site. And if the loop is intensive enough, it could tie up the CPU enough that Apache is unable to handle any other requests on other hosted sites, where the session lock might not apply.
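For legitimate long-running work, the session-lock contention can be avoided by releasing the lock early; a minimal sketch (the ini_set is only there so the sketch runs anywhere /tmp is writable):

```php
<?php
// Release the session-file lock before long-running work so the
// user's other requests aren't blocked behind it.
ini_set('session.save_path', '/tmp');
session_start();
$userId = $_SESSION['user_id'] ?? null;   // read what you need first

session_write_close();   // write and unlock the session file now

// ... long-running work here no longer blocks parallel requests ...
```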
If it is an infinite loop, you'll have to wait until PHP's own maximum allowed run time (set_time_limit() and the max_execution_time ini setting) kicks in and kills the script. There are also some server-side limiters, like Apache's RLimitCPU and TimeOut, that can handle situations like this.
Note that except on Windows, PHP doesn't count "external" time in the set_time_limit. So if your runaway process is doing database stuff, calling external programs via system() and the like, the time spent running those external calls is NOT accounted for in the parent's time limit.
If you write code that causes an (effectively) never-ending loop, then Apache will execute that and be unable to respond to any additional new requests for a page, because it's still trying to determine the page content (for the page which caused the never-ending loop) by executing the non-terminating PHP code.
Solution: don't write code that doesn't terminate (in a reasonable amount of time). Understand loop invariants.
