I will try to summarize my problem in order to make it understandable.
I have a script serverHandler.php that can start multiple servers using another script, server.php.
So I start a new server like this:
$server = shell_exec("php server.php");
So now I will have a server.php script running in the background until I manually kill it.
Is there a way to manage the killing of this server directly within the script serverHandler.php, like this?
// Start the script server.php
$server = shell_exec("php server.php");
// Stop the script that runs in the background
// So the server will be stopped
killTask($server);
Shell management of tasks is typically done using the ID of a process (PID). In order to kill the process, you must keep track of this PID and then provide it to your kill command. If your serverHandler is a command line script then keeping a local copy of the PID could suffice, but in a web interface over HTTP/HTTPS you would need to send back the PID so it could be managed.
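For illustration, a minimal sketch of that approach on a Unix-like system (the redirects and the shell's $! variable do the real work here; killTask() is the hypothetical helper from the question, assumed to receive a PID rather than the script's output):

// shell_exec() alone would block until server.php exits, so the command
// backgrounds the server and echoes the PID of the spawned process.
$pid = (int) shell_exec('php server.php > /dev/null 2>&1 & echo $!');

// A possible killTask() built on that PID:
function killTask(int $pid): void
{
    exec('kill ' . $pid); // sends SIGTERM; 'kill -9' only as a last resort
}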
Using a stateless language like PHP for this is not recommended, however, as retrieving process information, determining whether the process is one of the server processes previously dispatched, and other fine little details will be unnecessarily complicated and, if you're not careful, error-prone and potentially even dangerous.
Better would be to use a stateful language like Java or Python for managing these processes. By using a single point of access with a maintained state, you can have several threads "waiting" on these processes so that:
you know for certain which PIDs are expected to be valid at all times,
you can avoid the need for excessive PID validation,
you minimize the security risks of bad PID validation,
you know if these processes end prematurely so you can remove them from the list of expected processes automatically, and
you can keep track of which PID is associated with which server instance.
Use the right tools for the problem you're trying to solve. PHP really isn't the tool for this particular problem (your servers can be written in PHP, but use a different language for your serverHandler to avoid headaches).
Related
I am testing my python script macro.py that runs in a loop from the terminal.
My plan is to code a Laravel application so that I can set up multiple instances of macro.py running on the server, and these can be started and stopped at any time.
Is this possible without too much difficulty?
Invoke the external script the same way it is mentioned in this question. It's very important to store the process pid somewhere persistent (database, file). The retrieved pid number can later be used to stop the process using the kill -9 process_pid command.
Be careful: if your python script "breaks" in the background (between the start and stop actions called from the application), there is a chance that another process will be assigned the same pid number! I recommend storing the process startup time as well as the pid number. Before killing the process, do the additional check of the stored startup time and kill the process only if the test passes (otherwise assume that the process stopped unexpectedly and show appropriate information in the user interface).
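A sketch of that pid + startup-time check in PHP (Unix assumed; ps -o lstart= prints a process's start time, and the function names and storage details are illustrative):

// Start macro.py in the background and record pid plus startup time.
function startMacro(): array
{
    $pid = (int) shell_exec('python macro.py > /dev/null 2>&1 & echo $!');
    $startedAt = trim((string) shell_exec('ps -o lstart= -p ' . $pid));
    return ['pid' => $pid, 'started_at' => $startedAt]; // persist both values
}

// Kill only if the stored startup time still matches the live process.
function stopMacro(int $pid, string $startedAt): bool
{
    $current = trim((string) shell_exec('ps -o lstart= -p ' . $pid));
    if ($current === '' || $current !== $startedAt) {
        return false; // process gone, or the pid was recycled by another one
    }
    exec('kill -9 ' . $pid);
    return true;
}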
I have to make sure a certain PHP script (started by a web request) does not run more than once simultaneously.
With binaries, it is quite easy to check if a process of a certain binary is already around.
However, a PHP script may be run through several pathways (e.g. CGI, FCGI, inside webserver modules, etc.), so I cannot use system commands to find it.
So how can I reliably check whether another instance of a certain script is currently running?
The exact same strategy is used as one would choose with local applications:
The process manages a "lock file".
You define a static location in the file system. Upon script startup you check whether a lock file exists in that location; if so, you bail out. If not, you first create that lock file, then proceed. During tear-down of your script you delete that lock file again. Such a lock file is a simple passive file; only its existence is of interest, usually not its content. This is a standard procedure.
You can win extra candy points if you use the lock file not only as a passive semaphore, but also store the process id of the generating process in it. That allows subsequent attempts to verify whether that process actually still exists or has crashed in the meantime. That matters because such a crash would leave a stale lock file behind and thus create a deadlock.
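A minimal sketch of such a lock file with the PID stored in it (the /proc check is Linux-specific; the location is illustrative):

$lockFile = '/tmp/myscript.lock'; // the static, well-known location

// fopen mode 'x' fails if the file already exists, which closes the race
// between "check whether it exists" and "create it".
$handle = @fopen($lockFile, 'x');
if ($handle === false) {
    $oldPid = (int) @file_get_contents($lockFile);
    if ($oldPid > 0 && file_exists('/proc/' . $oldPid)) {
        exit('Another instance is running.'); // live lock: bail out
    }
    // Stale lock left behind by a crashed process: take it over.
    file_put_contents($lockFile, (string) getmypid());
} else {
    fwrite($handle, (string) getmypid());
    fclose($handle);
}

// Tear-down: remove the lock file again, even after fatal errors.
register_shutdown_function(fn () => @unlink($lockFile));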
To work around the issue discussed in the comments, which correctly states that in some of the scenarios in which PHP scripts are used in a web environment a process ID by itself may not be enough to reliably test whether a given task has been successfully and completely processed, one could use a slightly modified setup:
The incoming request does not directly trigger the task-performing PHP script itself, but merely a wrapper script. That wrapper manages the lock file whilst delegating the actual task into a sub-request to the HTTP server, which allows the controlling wrapper script to use the additional information of the request state. If the task-performing PHP script really crashes without prior notice, the requesting wrapper knows about it: each request is terminated with a specific HTTP status code, which allows deciding whether the task-performing request terminated normally or not. That setup should be reliable enough for most purposes. The chance of the trivial wrapper script itself crashing or being terminated falls into the area of a system failure, which is something no locking strategy can reliably handle.
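A sketch of such a wrapper (the URL and file names are illustrative; it judges the sub-request purely by its HTTP status code):

$lockFile = '/tmp/task.lock';
$handle = @fopen($lockFile, 'x'); // fails if the lock already exists
if ($handle === false) {
    exit('Task already running.');
}
fclose($handle);

// Delegate the actual work to a sub-request against the HTTP server.
$ch = curl_init('http://localhost/real-task.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
$status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

unlink($lockFile); // the wrapper itself survived, so release the lock

if ($status !== 200) {
    // The task-performing request did not terminate normally.
    error_log('task request ended abnormally (HTTP ' . $status . ')');
}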
As PHP does not always provide a reliable way of file locking (it depends on how the script is run, e.g. CGI, FCGI, server modules, and the configuration), some other environment for locking should be used.
The PHP script can, for example, call another PHP interpreter in its CLI variant. That provides a unique PID that can be used for locking. The PID should then be stored in a lock file, which can be checked for a stale lock by querying whether a process with that PID is still around.
Maybe it is also possible to do all tasks needing the lock inside a shell script. Shell scripts also provide a unique PID and release it reliably on exit. A shell script may also use a unique filename that can be checked to see whether it is still running.
Also, semaphores (http://php.net/manual/de/book.sem.php) could be used; these are explicitly managed by the PHP interpreter to reflect a script's lifetime. They seem to work quite well, however there is not much information around about how reliable they are in case of premature script death.
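A sketch of that semaphore approach (requires the sysvsem extension; the key derivation is illustrative):

$key = ftok(__FILE__, 'a'); // derive a System V IPC key from this file
$sem = sem_get($key, 1);    // a semaphore with a single slot

// Non-blocking acquire: a second instance fails instead of waiting.
if (!sem_acquire($sem, true)) {
    exit('Another instance is running.');
}

// ... the work that must not run concurrently ...

sem_release($sem); // also released implicitly when the script dies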
Also keep in mind that external processes launched by a PHP script may continue executing even if the script ends. For example, a user abort on FCGI releases passthru processes, which carry on working even though the client connection is closed. They may be killed later if enough output accumulates, or not at all.
So such external processes have to be locked as well, which can't be done by the PHP-acquired semaphores alone.
I want to have my own variable (most likely an array) storing what my PHP application is up to right now.
The application can trigger a few background processes (like downloading files) and I want to have a list of what is currently being processed.
For example
if PHP calls exec() that will be downloading for 15 mins
and then another download starts
and another download starts
then if I access my application I want to be able to see that 3 downloads are in progress (if none of them has finished yet).
Can I do that? Only in memory, without storing anything on the disk?
I thought that the solution would be some kind of server variable.
PHP doesn't have knowledge of previous processes. As soon as a PHP process is finished, everything it knows about itself goes with it.
I can think of two options. Write knowledge about spawned processes to a file or database and use it to sync all your PHP requests (store the PID of each spawned process),
Or
Create a daemon. The people behind PHP have worked hard to clean up PHP's memory handling and such to make this more feasible. Take a look at their PEAR package - http://pear.php.net/package/System_Daemon
Off the top of my head, a quick architecture would be composed of three pieces:
Part A) The web app that will take in requests for downloads and report back the progress of all requests
Part B) Your daemon, which accepts requests for downloads, spawns processes, and will report back the status of all spawned requests
Part C) The spawned process that will perform the download you need.
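For the first option, a minimal sketch of tracking spawned downloads through a shared file (the file name, the use of wget, and the Linux-specific /proc check are all illustrative):

$statusFile = '/tmp/downloads.json';

// Spawn a background download and record its pid in the shared file.
function spawnDownload(string $url, string $statusFile): int
{
    $pid = (int) shell_exec(
        'wget -q ' . escapeshellarg($url) . ' > /dev/null 2>&1 & echo $!'
    );
    $status = json_decode((string) @file_get_contents($statusFile), true) ?: [];
    $status[$pid] = $url;
    file_put_contents($statusFile, json_encode($status), LOCK_EX);
    return $pid;
}

// List only the downloads whose processes are still alive.
function activeDownloads(string $statusFile): array
{
    $status = json_decode((string) @file_get_contents($statusFile), true) ?: [];
    return array_filter(
        $status,
        fn ($url, $pid) => file_exists('/proc/' . $pid),
        ARRAY_FILTER_USE_BOTH
    );
}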
Anyone for shared memory?
Obviously you would have to have some sort of daemon, but you could use the built-in semaphore functions to easily have contact between each of the scripts. You need to be careful, though, because if you don't close a memory block properly you could end up with no blocks left.
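A sketch using the companion sysvshm shared-memory functions (segment size and slot key are arbitrary choices):

$key = ftok(__FILE__, 'd');
$shm = shm_attach($key, 16384); // attach to (or create) a 16 KB segment

// One script records the list of downloads in progress...
shm_put_var($shm, 1, ['file1.zip', 'file2.zip', 'file3.zip']);

// ...and another request reads it back.
$active = shm_has_var($shm, 1) ? shm_get_var($shm, 1) : [];

shm_detach($shm); // detach without destroying the segment
// shm_remove($shm) would free the block; never freeing blocks is how you
// end up with none left.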
You can't store your own variables in $_SERVER. The best method would be to store your data in a database and query/update it as required.
See also Having a PHP script loop forever doing computing jobs from a queue system, but that doesn't answer all my questions.
If I want to run a PHP script forever, accessing a queue and doing jobs:
What is the potential for memory problems? How do I avoid them? (Are there any flush functions or something similar I should use?)
What if the script dies for some reason? What would be a good method to automatically start it up again?
What would be the best basic approach to start the script? Since it runs forever, I don't need cron. But how do I start it up? (See also 2.)
Set the queue processor up as a cron script. Have it execute every 10 seconds. When the script fires up, check whether a lock file (something like .lock) is present. If there is, exit immediately. If not, create the .lock and start processing. If any errors occur, email/log them, delete the .lock, and exit. If there are no tasks, then exit.
I think this approach is ideal, since PHP isn't really designed to run a script for extended periods of time like you're asking. To avoid potential memory leaks, crashes, etc., repeatedly re-executing the script in short runs is a better approach.
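A sketch of that cron-driven worker (paths and the queue step are illustrative):

$lock = __DIR__ . '/.lock';

if (file_exists($lock)) {
    exit; // a previous run is still processing
}
file_put_contents($lock, (string) getmypid());

try {
    // ... pull tasks from the queue and process them; stop when empty ...
} catch (Throwable $e) {
    error_log($e->getMessage()); // email/log errors here
} finally {
    unlink($lock); // always release the lock, even on error
}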
While PHP can access (publish to and consume from) MQs, if at all possible try to use a fully functional MQ application to do this.
A fully functional MQ application (in ruby, perl, .NET, java etc) will handle all of the concurrency, error logging, state management and scalability issues that you discuss.
Without going too far into state machines, it is at least a good idea to introduce states both for 'jobs' (example: flv2avi conversion) and 'tasks' (flv2avi 1.flv).
In my script (Perl), zombie processes sometimes start to degrade the whole script's performance. It is a rare case, but it is native to the source, so the script should be able to stop reading the queue, allowing a new instance to continue its tasks & jobs, while keeping as much of the running tasks' data as possible. Once the first instance has only 1-2 tasks left, it gets killed.
On start:
check for common errors (due to shutdown)
check for known errors (out of space, can't read input)
kill whatever may be killed and set status to 'waiting'
start everything that is 'waiting'.
If you run piped jobs (vlc | ffmpeg, tail -f | grep), you can avoid doing too much I/O in your own program by using fork() (a bad idea for PHP?) or by just calling /bin/bash -c "prog1 | prog2", which saves a lot of CPU load.
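For example, handing a whole pipeline to the shell from PHP (command names are placeholders):

// The shell wires prog1's stdout to prog2's stdin; PHP never touches the data.
exec("/bin/bash -c 'prog1 | prog2 > /tmp/out 2>&1' &");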
Start points: both /etc/rc.d and cron (check processes; run the first instance || run a second with a 'debug' argument).
Does anyone know how to close the connection (besides just flush()?) but keep executing some code afterwards?
I don't want the client to see the long process that may occur after the page is done.
You might want to look at pcntl_fork() -- it allows you to fork your current script and run the copy as a separate process.
I used it in a project where a user uploaded a file and then the script performed various operations on it, including communicating with a third-party server, which could take a long time. After the initial upload, the script forked and displayed the next page to the user, and the parent killed itself off. The child then continued executing, and was queried by the returned page for its status using AJAX. It made the application much more responsive, and the user got feedback on the status while it was executing.
This link has more on how to use it:
Thorough look at PHP's pcntl_fork() (Apr 2007; by Frans-Jan van Steenbeek)
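A minimal sketch of that pattern (CLI/CGI only; the pcntl extension is typically unavailable under mod_php, and process_upload() is a hypothetical stand-in for the long-running work):

$pid = pcntl_fork();
if ($pid === -1) {
    die('Could not fork');
}
if ($pid > 0) {
    // Parent: answer the user right away, then bow out.
    echo 'Upload received - processing has started.';
    exit(0);
}

// Child: detach from the session and keep working.
posix_setsid();                // requires the posix extension
process_upload('/tmp/upload'); // hypothetical long-running task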
If you can't use pcntl_fork, you can always fall back to returning a page quickly that fires an AJAX request to execute more items from a queue.
mvds reminds the following (which can apply in a specific server configuration): Don't fork the entire apache webserver, but start a separate process instead. Let that process fork off a child which lives on. Look for proc_open to get full fd interaction between your php script and the process.
I don't want the client to see the long process that may occur after the page is done.
Sadly, the page isn't done until after the long process has finished - hence what you ask for is impossible to implement in the way you imply, I'm afraid.
The key here, pointed to by Jhong's answer and inversely suggested by animusen's comment, is that the whole point of what we do with HTTP as web developers is to respond to a request as quickly as possible - that's it. If you're doing anything else, it points to a design decision that could perhaps have been a little better :)
Typically, you take the additional task you are doing after returning the 'page' and hand it over to some other process; normally that means placing the task in a job queue and having a CLI daemon or a cron job pick it up and do what's needed.
The exact solution is specific to what you're doing, and is the answer to a different (set of) question(s); but for this one it comes down to: no, you can't close the connection, and one would advise you to look at refactoring the long-running process out of that script/page.
Take a look at PHP's ignore_user_abort setting. You can set it using the ignore_user_abort() function.
An example of (optional) use has been given (and has been reported working by the OP) in the following duplicate question:
close a connection early (Sep 2008)
It basically gives reference to user-notes in the PHP manual. A central one is
Connection Handling user-note #71172 (Nov 2006)
which is also the base for the following two I'd like to suggest you to look into:
Connection Handling user-note #89177 (Feb 2009)
Connection Handling user-note #93441 (Sep 2009)
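The core of the pattern from user-note #71172, condensed (whether the connection really closes depends on the SAPI and any buffers in front of PHP; under PHP-FPM, fastcgi_finish_request() is a cleaner alternative):

ignore_user_abort(true); // keep running even after the client disconnects
ob_start();
echo 'All done, thanks!'; // whatever the user should see
header('Connection: close');
header('Content-Length: ' . ob_get_length());
ob_end_flush();
flush(); // push the response out to the client

// The client now has its complete page; continue the long-running work.
do_long_running_work(); // hypothetical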
Don't fork the entire apache webserver, but start a separate process instead. Let that process fork off a child which lives on. Look for proc_open to get full fd interaction between your php script and the process.
We solved this issue by inserting the work that needs to be done into a job queue and then having a cron script pick up the backend jobs regularly. Probably not exactly what you need, but it works very well for data-intensive processes.
(you could also use Zend Server's job queue, if you've got a wad of cash and want a tried-and-tested solution)