Anyone know how to close the connection (besides just flush()?) but keep executing some code afterwards?
I don't want the client to see the long process that may occur after the page is done.
You might want to look at pcntl_fork() -- it allows you to fork your current script and run the copy as a separate process.
I used it in a project where a user uploaded a file and then the script performed various operations on it, including communicating with a third-party server, which could take a long time. After the initial upload, the script forked and displayed the next page to the user, and the parent killed itself off. The child then continued executing, and was queried by the returned page for its status using AJAX. It made the application much more responsive, and the user got feedback as to the status while it was executing.
This link has more on how to use it:
Thorough look at PHP's pcntl_fork() (Apr 2007; by Frans-Jan van Steenbeek)
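For illustration, here's a minimal sketch of that fork-and-detach pattern (process_upload() and the status file are placeholders of mine; pcntl_fork() needs the pcntl extension and isn't available under mod_php):

$pid = pcntl_fork();

if ($pid == -1) {
    die('Could not fork');
} elseif ($pid) {
    // Parent: respond to the user right away, then exit.
    echo 'Upload received - processing has started.';
    exit;
} else {
    // Child: keeps running after the parent has responded.
    process_upload();                              // hypothetical long-running task
    file_put_contents('/tmp/job.status', 'done');  // for the page's AJAX poll to read
}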
If you can't use pcntl_fork, you can always fall back to returning a page quickly that fires an AJAX request to execute more items from a queue.
mvds reminds the following (which can apply in a specific server configuration): Don't fork the entire apache webserver, but start a separate process instead. Let that process fork off a child which lives on. Look for proc_open to get full fd interaction between your php script and the process.
"I don't want the client to see the long process that may occur after the page is done."
Sadly, the page isn't done until after the long process has finished - hence what you ask for is impossible to implement in the way you imply, I'm afraid.
The key here, pointed to by Jhong's answer and inversely suggested by animusen's comment, is that the whole point of what we do with HTTP as web developers is to respond to a request as quickly as possible - that's it. If you're doing anything else, it points to some design decision that could perhaps have been a little better :)
Typically, you take the additional task you are doing after returning the 'page' and hand it over to some other process; normally that means placing the task in a job queue and having a CLI daemon or a cron job pick it up and do what's needed.
The exact solution is specific to what you're doing, and the answer to a different (set of) question(s); but for this one it comes down to: no, you can't close the connection, and I'd advise you to look at refactoring the long-running process out of that script/page.
Take a look at PHP's ignore_user_abort setting. You can set it using the ignore_user_abort() function.
An example of (optional) use has been given (and has been reported working by the OP) in the following duplicate question:
close a connection early (Sep 2008)
It basically gives reference to user-notes in the PHP manual. A central one is
Connection Handling user-note #71172 (Nov 2006)
which is also the base for the following two I'd like to suggest you to look into:
Connection Handling user-note #89177 (Feb 2009)
Connection Handling user-note #93441 (Sep 2009)
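For convenience, a minimal sketch of the early-close technique those notes describe (assuming Apache with no output compression such as mod_deflate getting in the way):

// Keep running even after the client disconnects.
ignore_user_abort(true);
set_time_limit(0);

ob_start();
echo 'Page content the client should see';
$size = ob_get_length();

// Declare the response complete and push it out to the client.
header("Content-Length: $size");
header('Connection: close');
ob_end_flush();
flush();

// Release the session lock so the user's other requests aren't blocked.
if (session_id()) {
    session_write_close();
}

// ... the long process runs here, invisible to the client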
Don't fork the entire apache webserver, but start a separate process instead. Let that process fork off a child which lives on. Look for proc_open to get full fd interaction between your php script and the process.
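A rough sketch of that setup might look like this (worker.php is a placeholder; per the above, it should fork off its own long-lived child and exit, so the web request returns quickly):

$descriptors = array(
    0 => array('pipe', 'r'),   // child's stdin
    1 => array('pipe', 'w'),   // child's stdout
    2 => array('pipe', 'w'),   // child's stderr
);

$process = proc_open('php worker.php', $descriptors, $pipes);

if (is_resource($process)) {
    // Hand the job description to the worker over stdin.
    fwrite($pipes[0], json_encode(array('file' => '/tmp/upload.dat'))); // hypothetical payload
    fclose($pipes[0]);

    // worker.php forks a long-lived child and exits, so this returns quickly.
    fclose($pipes[1]);
    fclose($pipes[2]);
    proc_close($process);
}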
We solved this issue by inserting the work that needs to be done into a job queue and then having a cron script pick up the backend jobs regularly. Probably not exactly what you need, but it works very well for data-intensive processes.
(you could also use Zend Server's job queue, if you've got a wad of cash and want a tried-and-tested solution)
Related
I'm looking for some ideas to do the following. I need a PHP script to perform a certain action for quite a long time. This is an extension for a CMS, and it can't be anything else but PHP. It also can't be a command-line script, because it should be usable by common people who will have only the standard means of the CMS. One of the options is having a cron job (most simple hostings have it) that triggers the script often, so that instead of working for a long time it could perform the action step by step, preserving its state from one launch to the next. This is not perfect, but I can't see any other solution. If the script keeps redirecting to itself, the server will interrupt it. What other options could suit?
Thanks everyone in advance!
What you're talking about is a daemon, or long-running program, that waits for calls by client programs, performs an action, provides a response, then keeps on waiting for more calls.
You might be familiar w/ these in the form of Apache & MySQL ;) Anyway, PHP is generally OK in this regard; it has the ability to work over raw sockets as well as fork sub-processes to handle multiple requests simultaneously.
Having said that, PHP daemons are a tool where YMMV. Some folks will say they work great; other folks like me will say they have issues w/ interprocess communication and leaking memory even amidst a plethora of unset() calls.
Anyway, you likely won't be able to deploy a daemon of any type in a shared hosting environment. You'll need to get a better server package or stick with a cron-based solution.
Here's a link about writing a PHP daemon.
Also, one more note. Daemons do crash from time to time, so you may still need to store state about what's going on, just in case someone trips over the power cord to your shared server :)
I would also suggest that you think about making it a daemon but if not then you can simply use
set_time_limit(0);
ignore_user_abort(true);
at the top to tell it not to time out and not to get interrupted by anything. Then call it from cron to start it every day or whatever. I have this on many long-processing daily tasks and it works great for me. However, it won't be able to easily talk to the outside world (other scripts can't query it or anything -- if that is what you want, look into PHP services), so once you get it running, make sure it will stop and have it print its progress to a logfile.
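Roughly, such a cron-started task could look like this (the queue/task functions and the logfile path are placeholders of mine):

set_time_limit(0);        // never time out
ignore_user_abort(true);  // don't stop if the connection drops

$log = fopen('/var/log/myapp/daily.log', 'a');

foreach (fetch_pending_items() as $i => $item) {  // hypothetical work queue
    process_item($item);                          // hypothetical task
    fwrite($log, date('c') . " processed item $i\n");
}

fwrite($log, date('c') . " run complete\n");
fclose($log);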
I want to have my own variable (most likely an array) that stores what my PHP application is up to right now.
The application can trigger a few background processes (like downloading files), and I want to have a list of what is currently being processed.
For example:
- PHP calls exec() that will be downloading for 15 mins,
- then another download starts,
- and another download starts.
Then if I access my application, I want to be able to see that 3 downloads are in progress (if none of them has finished yet).
Can I do that only in memory, without storing anything on disk?
I thought that the solution would be some kind of server variable.
PHP doesn't have knowledge of previous processes. As soon as a PHP process is finished, everything it knew about itself goes with it.
I can think of two options. Write knowledge about spawned processes to a file or database and use it to sync all your PHP requests (store the PID of each spawned process).
Or
Create a daemon. The people behind PHP have worked hard to clean up PHP's memory handling and such to make this more feasible. Take a look at their PEAR package - http://pear.php.net/package/System_Daemon
Off the top of my head, a quick architecture would be composed of 3 pieces:
Part A) The web app that will take in requests for downloads, and report back the progress of all requests
Part B) Your daemon, which accepts requests for downloads, spawns processes, and will report back the status of all spawned requests
Part C) The spawned process that will perform the download you need.
Anyone for shared memory?
Obviously you would have to have some sort of daemon, but you could use the built-in semaphore functions to easily communicate between each of the scripts. You need to be careful, though, because if you're not closing the memory blocks properly, you risk ending up with no blocks left.
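A hedged sketch of that idea using PHP's System V semaphore/shared-memory functions (requires the sysvsem and sysvshm extensions; the variable key 1 and block size are arbitrary choices of mine):

$downloadUrl = 'http://example.com/file.zip';  // hypothetical download

$key = ftok(__FILE__, 'd');
$sem = sem_get($key);
sem_acquire($sem);                      // serialize access to the block

$shm  = shm_attach($key, 4096);
$jobs = shm_has_var($shm, 1) ? shm_get_var($shm, 1) : array();

// Register this process's download under its PID.
$jobs[getmypid()] = array('url' => $downloadUrl, 'started' => time());
shm_put_var($shm, 1, $jobs);

shm_detach($shm);
sem_release($sem);

// Remove your entry when the download ends, and eventually shm_remove()
// the block, or you'll leak the very blocks the answer warns about.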
You can't store your own variables in $_SERVER. The best method would be to store your data in a database and query/update it as required.
As my server does not support cron jobs, I want a file on my server to trigger its action at a particular time every day.
Please let me know whether it is possible to run a script at a particular time from the server side itself, without any external act.
I agree with Kel's answer.
You could try out one of the free cronjob services available, if your server doesn't support it.
Online Cronjobs
Set Cronjob
Those are just the first two found on Google; there are likely to be more if you search a little.
You cannot start a script without ANY external act.
If your server has SSH or an HTTP server or something like that, you can configure a cron job on another server to start your script via SSH / HTTP / whatever.
Also, you can create a PHP script which sleeps in a loop all the time, and wakes up to do some job only when the current time is near some specific value. You will have to adjust the maximum execution time for the PHP script (see here for details), and you will have to start the script on server startup. BTW, this does not look like a good solution.
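For completeness, that (not recommended) loop might look roughly like this (do_daily_job() is a placeholder):

set_time_limit(0);   // the script must be allowed to run forever

while (true) {
    if (date('H:i') === '03:00') {  // near the chosen time of day
        do_daily_job();             // hypothetical task
        sleep(61);                  // step past the matching minute
    }
    sleep(30);
}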
As mentioned before, this is not possible literally "without external act".
A nice solution I found in the ThinkUp software (I don't know where else this is used) is to use an RSS feed reader. From the point of simplicity, this is probably the best option.
The idea is that you use your feed reader to automatically call a script on your site every XX hours (or whatever interval you want). When called, this script executes the maintenance tasks or whatever it is that you want to do.
To make sure that not everybody can run that script and cause your server to break down (I suppose this is a somewhat heavy task), you can append a unique long identifier string as a URL parameter, to make sure that the script only gets called by you.
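A tiny sketch of that check (the token value and task function are made up):

$secret = 'v1zS8Qo3...some-long-random-string';   // hypothetical token

if (!isset($_GET['token']) || $_GET['token'] !== $secret) {
    header('HTTP/1.0 403 Forbidden');
    exit;
}

run_maintenance_tasks();   // hypothetical heavy job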
Other than that, you can use one of the "poor man's" web cron job services that have been suggested in other answers.
if (rand(0, 100) == 0) {
    $tf = 'tmp/job.crontime';
    if (!file_exists($tf) || (time() - filemtime($tf)) > 60 * 60 * 24) {
        // ... your tasks
        touch($tf);
    }
}
This simple & stupid script uses a file to store the time of the last job execution. If more than 60*60*24 seconds have passed, it launches the job code. rand(0,100) should lower the overhead of checking for jobs on each request: there is a 1/100 chance of running your jobs.
Put it at the end of your 'index.php'. Don't use it in projects with moderate to high load :))
The Great Disadvantage: it won't run if you don't have any visitors.
UPD: Write a script that runs indefinitely and every 30s does touch('tmp/job.crontime') to report it's still alive. It should also check the current time and perform actions.
In index.php, if more than 30s have passed since the last touch, re-launch the daemon with an HTTP request. Ugly, but fully functional. You'll also have to deal with time limits; be careful!
Well, if this is on a public web server and you have enough visits, you could always use those to run code that checks for a given value, say the hour of day or the number of times a file has been accessed (or store your counter in a file). Just put your PHP code at the top of a web page.
I'm currently running a Linux based VPS, with 768MB of Ram.
I have an application which collects details of domains and then connects to a service via cURL to retrieve details of the pagerank of these domains.
When I run a check on about 50 domains, it takes the remote page about 3 mins to load with all the results, before the script can parse the details and return it to my script. This causes a problem as nothing else seems to function until the script has finished executing, so users on the site will just get a timer / 'ball of death' while waiting for pages to load.
(The remote page retrieves the domain details and updates the page by AJAX, but the cURL request (rightfully) doesn't return the page until loading is complete.)
Can anyone tell me if I'm doing anything obviously wrong, or if there is a better way of doing it? (There can be anything between 10 and 10,000 domains queued, so I need a process that can run in the background without affecting the rest of the site.)
Thanks
A more sensible approach would be to "batch process" the domain data via the use of a cron-triggered PHP CLI script.
As such, once you'd inserted the relevant domains into a database table with a "processed" flag set as false, the background script would then:
Scan the database for domains that aren't marked as processed.
Carry out the CURL lookup, etc.
Update the database record accordingly and mark it as processed.
...
To ensure no overlap with an already-executing batch processing script, you should only invoke the PHP script every five minutes from cron and (within the PHP script itself) check how long the script has been running at the start of the "scan" stage, exiting if it's been running for four minutes or longer. (You might want to adjust these figures, but hopefully you can see where I'm going with this.)
By using this approach, you'll be able to leave the background script running indefinitely (as it's invoked via cron, it'll automatically start after reboots, etc.) and simply add domains to the database/review the results of processing, etc. via a separate web front end.
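A skeleton of such a batch script might look like this (the table, columns, service URL, and parse_rank() helper are made up for illustration):

$started = time();
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Scan for domains that aren't marked as processed.
$rows = $db->query('SELECT id, domain FROM domains WHERE processed = 0');

foreach ($rows as $row) {
    // Exit before the next cron invocation (every 5 minutes) can overlap.
    if (time() - $started > 4 * 60) {
        exit;
    }

    // Carry out the cURL lookup.
    $ch = curl_init('http://example.com/rank?domain=' . urlencode($row['domain']));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($ch);
    curl_close($ch);

    // Update the record and mark it as processed.
    $upd = $db->prepare('UPDATE domains SET rank = ?, processed = 1 WHERE id = ?');
    $upd->execute(array(parse_rank($result), $row['id'])); // parse_rank() is hypothetical
}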
This isn't the ideal solution, but if you need to trigger this process based on a user request, you can add the following at the end of your script.
set_time_limit(0);
flush();
This will allow the PHP script to continue running while still returning output to the user. But seriously, you should use batch processing. It will give you much more control over what's going on.
Firstly I'm sorry, but I'm an idiot! :)
I've loaded the site in another browser (FF) and it loads fine.
It seems Chrome puts some sort of lock on a domain when it's waiting for a server response, and I was testing the script manually through a browser.
Thanks for all your help and sorry for wasting your time.
CJ
While I agree with others that you should consider processing these tasks outside of your webserver, in a more controlled manner, I'll offer an explanation for the "server standstill".
If you're using native PHP sessions, PHP uses an exclusive locking scheme, so only a single PHP process can deal with a given session ID at a time. Having a long-running PHP script which uses sessions can certainly cause this.
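In short, the fix is to release the session lock before the long-running part, something like:

session_start();
$userId = $_SESSION['user_id'];   // read whatever you need first

// Release the exclusive session lock; other requests with this session
// ID can now proceed while we do the slow cURL work.
session_write_close();

// ... long-running work here ($_SESSION writes are no longer persisted)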
You can search for combinations of terms like:
php session concurrency lock session_write_close()
I'm sure it's been discussed many times here. I'm too lazy to search for you. Maybe someone else will come along and make an answer with bulleted lists and pretty hyperlinks in exchange for Stack Overflow reputation :) But not me :)
good luck.
I'm not sure how your code is structured but you could try using sleep(). That's what I use when batch processing.
I'm writing a simple application in PHP which needs to occasionally carry out a fairly intensive set of MySQL updates. I don't particularly want this to cause a delay for the user, so I'm wondering about using pcntl_fork().
I'm not sure how this really works though: will the child process continue running after the parent process finishes? Will the parent process end, and the user's page load fully complete before the child process completes?
In other words, is this a safe way to have a PHP script (running under Apache) do some time-consuming updates without delaying the user, or should I just ask my users to put up with some delay?
The parent process will end, the user's page will load fully, the child process will continue, and the user will have no feedback as to whether or not the child process finished successfully.
Someone out there can probably tell you in detail what happens when you call that under Apache, but the chances are you will get answers that aren't always true, depending on what versions and combinations of Apache and PHP you are using. You should use AJAX and have two requests: respond once with a page that says what you are doing, then have that page poll a second request for the status; the second request is where you actually do the work.
If PHP runs under Apache as the mod_php module, forking will not work at all; you'll get a warning saying that the function pcntl_fork() is undefined. In that case a good solution is to use exec() instead to run a separate PHP job from the command line.
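A common sketch of that workaround (worker.php and the job argument are placeholders; redirecting output and appending & lets exec() return immediately):

$arg = escapeshellarg($jobId);   // hypothetical job identifier
exec("php /path/to/worker.php $arg > /dev/null 2>&1 &");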
I think it is a bad idea. I have done similar stuff, and Apache redirected the output of the parent to its child; that is, your browser showed the info from one of the child processes.
Click this for more information.
Hope it helps you.