I have an AJAX script that takes about 10 minutes to do its thing. I would like to be able to tell the user 'Hey listen, the task is being completed, we'll let you know how it turns out', but it won't let the user run any other scripts on the server until that one completes (I believe this is a consequence of PHP being single threaded, but I'm not sure). Is there a way to assign that AJAX script to a separate PHP or Apache process, so that the user can continue to click around in the application without having to wait for the task to finish?
You can use a database or files to implement a locking mechanism that prevents the task from running multiple times simultaneously. Then you just need to spawn a PHP process using the nohup command (no hang up); for more details see this article: https://nsaunders.wordpress.com/2007/01/12/running-a-background-process-in-php/ or this question: nohup: run PHP process in background .
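A minimal sketch of spawning such a background process from a web request (the worker script path and log file are placeholders):
<?php
// nohup keeps the worker alive after this request ends; redirecting its
// output and backgrounding it with & makes exec() return immediately.
exec('nohup php /path/to/worker.php > /tmp/worker.log 2>&1 &');
echo "The task is running in the background; we'll let you know how it turns out.";
?>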
I searched for hours, but in the end the solution turned out to be very easy for me using cron jobs. In cPanel you can go to Advanced -> Cron Jobs and schedule a task there that runs a PHP script from the command line.
An example command that executes a PHP script:
/usr/bin/wget http://www.example.com/miscript.php
or better:
php /home/USER/public_html/miscript.php
Are you using PHP sessions? If so, then a likely cause is that the long-running script keeps the session locked until it finishes. Any other request trying to access that same session will have to wait until the first one is done (often long enough to exceed request timeouts).
To fix that you'll need session_write_close():
Session data is usually stored after your script terminated without the need to call session_write_close(), but as session data is locked to prevent concurrent writes only one script may operate on a session at any time. When using framesets together with sessions you will experience the frames loading one by one due to this locking. You can reduce the time needed to load all the frames by ending the session as soon as all changes to session variables are done.
So simply call that function right around where you tell the user they'll have to wait. If you need (read) access to session variables later on in that script, consider storing them in local variables, then close the session immediately afterwards before moving on to whatever's taking a long time. If you need write access you could try re-running session_start() at the end, but if the session is currently locked elsewhere it'll have the same blocking problem. You could work around that by, for example, storing something in the database from the background script and fetching it from the regular user session.
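A minimal sketch of that pattern (the session key and the stand-in task are assumptions, not your actual code):
<?php
session_start();

// Copy anything you still need into local variables...
$userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null;

// ...then release the session lock so this user's other requests
// are no longer blocked while the long task runs.
session_write_close();

echo "The task is being completed, we'll let you know how it turns out.";

// The long-running work goes here; write results to the database rather
// than back into $_SESSION, since the session is now closed.
sleep(5); // stand-in for the real 10-minute task
?>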
This time I come with a question that I hope you can guide me through.
I have created a PHP script that loads a CSV file containing a large amount of data (I upload it via an AJAX request). This script extracts the data from the file, checks that the data is not already stored in the database, uses another script to obtain information about each record extracted from the file, and finally saves the records that pass all of that validation into a database table.
It is a process that can last a few seconds or many minutes, because some of the files I upload contain more than 100,000 records, so I would not like to have to leave the browser open for the whole time the process runs.
What I want to know is how I could leave this process running internally on the server when I close the browser. Something like putting it in a queue and letting it continue running after I close my browser.
Then, once I reopen the browser and go to the script's page, it should show me how the process is currently going. The idea is that the data processing is not interrupted when I close my browser.
Any suggestions or examples you could give me to achieve this?
Based on your description, I think you'd better run a dedicated daemon (either a third-party one or one written by yourself) which does the background work.
The rationale for why I don't think it's right to do that in your PHP code is:
If you fork it from your server code, you have to install something extra, and since it is a fork, the process you spawn will inherit data from the parent process that isn't useful to it at all.
With a dedicated daemon, it's easier for you to track the status of each job and, more importantly, you won't end up spawning a whole bunch of processes, which is what happens if you just fork a new process for each job in the server code.
I'm currently running a Linux-based VPS with 768MB of RAM.
I have an application which collects details of domains and then connects to a service via cURL to retrieve the pagerank of those domains.
When I run a check on about 50 domains, it takes the remote page about 3 minutes to load with all the results before my script can parse the details and return them. This causes a problem as nothing else seems to function until the script has finished executing, so users on the site just get a timer / 'ball of death' while waiting for pages to load.
(The remote page retrieves the domain details and updates itself via AJAX, but the cURL request (rightfully) doesn't return until the page has finished loading.)
Can anyone tell me if I'm doing anything obviously wrong, or if there is a better way of doing it. (There can be anything between 10 and 10,000 domains queued, so I need a process that can run in the background without affecting the rest of the site)
Thanks
A more sensible approach would be to "batch process" the domain data via the use of a cron triggered PHP cli script.
As such, once you'd inserted the relevant domains into a database table with a "processed" flag set as false, the background script would then:
Scan the database for domains that aren't marked as processed.
Carry out the CURL lookup, etc.
Update the database record accordingly and mark it as processed.
...
To ensure no overlap with an already-executing batch processing script, you should only invoke the PHP script every five minutes from cron and (within the PHP script itself) check how long the script has been running at the start of the "scan" stage, exiting if it has been running for four minutes or longer. (You might want to adjust these figures, but hopefully you can see where I'm going with this.)
By using this approach, you'll be able to leave the background script running indefinitely (as it's invoked via cron, it'll automatically start after reboots, etc.) and simply add domains to the database/review the results of processing, etc. via a separate web front end.
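A rough sketch of such a batch script, assuming a hypothetical domains table with a processed flag (table, column and credential values are placeholders):
<?php
// batch_process.php -- invoked from cron, e.g.:
// */5 * * * * php /home/USER/batch_process.php
$started    = time();
$maxRuntime = 4 * 60; // stop before the next cron invocation begins

$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

while (time() - $started < $maxRuntime) {
    // Scan for one domain that hasn't been processed yet.
    $row = $pdo->query("SELECT id, domain FROM domains WHERE processed = 0 LIMIT 1")
               ->fetch(PDO::FETCH_ASSOC);
    if (!$row) {
        break; // nothing left to do
    }

    // ...carry out the cURL lookup for $row['domain'] here...
    $rank = 0; // placeholder for the looked-up value

    // Update the record and mark it as processed.
    $stmt = $pdo->prepare("UPDATE domains SET pagerank = ?, processed = 1 WHERE id = ?");
    $stmt->execute(array($rank, $row['id']));
}
?>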
This isn't the ideal solution, but if you need to trigger this process based on a user request, you can add the following at the end of your script.
set_time_limit(0);
flush();
This will allow the PHP script to keep running after the output has been returned to the user. But seriously, you should use batch processing; it will give you much more control over what's going on.
Firstly I'm sorry but I'm an idiot! :)
I've loaded the site in another browser (FF) and it loads fine.
It seems Chrome puts some sort of lock on a domain when it's waiting for a server response, and I was testing the script manually through a browser.
Thanks for all your help and sorry for wasting your time.
CJ
While I agree with others that you should consider processing these tasks outside of your webserver, in a more controlled manner, I'll offer an explanation for the "server standstill".
If you're using native php sessions, php uses an exclusive locking scheme so only a single php process can deal with a given session id at a time. Having a long running php script which uses sessions can certainly cause this.
You can search for combinations of terms like:
php session concurrency lock session_write_close()
I'm sure it's been discussed many times here. I'm too lazy to search for you. Maybe someone else will come along and make an answer with bulleted lists and pretty hyperlinks in exchange for stackoverflow reputation :) But not me :)
Good luck.
I'm not sure how your code is structured but you could try using sleep(). That's what I use when batch processing.
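For example, something along these lines (the data source and chunk size are placeholders):
<?php
// Example data; in practice this would come from your queue or database.
$domains = array('example.com', 'example.org', 'example.net');

// Process the work in small chunks, pausing between chunks so the
// script doesn't hog the server.
foreach (array_chunk($domains, 50) as $chunk) {
    foreach ($chunk as $domain) {
        // ...do the cURL lookup for $domain here...
    }
    sleep(1); // give other requests and processes a chance to run
}
?>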
I created a script that gets data from some web services and our database, formats a report, then zips it and makes it available for download. When I first started I made it a command line script to see the output as it came out and to get around the script timeout limit you get when viewing in a browser. But because I don't want my user to have to use it from the command line or have to run php on their computer, I want to make this run from our webserver instead.
Because this script could take minutes to run, I need a way to let it process in the background and then start the download once the file has been created successfully. What's the best way to let this script run without triggering the timeout? I've attempted this before (using backticks to run the script separately and such) but gave up, so I'm asking here. Ideally, the user would click the submit button on the form to start the request and then be returned to the page instead of staring at a blank browser window. When the zip file exists (meaning the process has finished), it should notify them (via AJAX? a reloaded page? I don't know yet).
This is on windows server 2007.
You should run it in a different process. Make a daemon that runs continuously, hits a database and looks for a flag, like "ShouldProcessData". Then when you hit that website, switch the flag to true. Your daemon process will see the flag on its next iteration and begin the processing. Stick the results into the database. Use the database as the communication mechanism between the website and the long-running process.
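A rough sketch of that daemon loop (the settings table and flag name are just illustrative assumptions):
<?php
// daemon.php -- started once (e.g. via nohup or as a service) and left running.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

while (true) {
    // Check the flag the website sets when it wants processing to start.
    $flag = $pdo->query("SELECT value FROM settings WHERE name = 'ShouldProcessData'")
                ->fetchColumn();

    if ($flag == '1') {
        // ...do the long-running processing here and write the results
        // to the database so the website can display them later...

        // Clear the flag so the job doesn't run again until requested.
        $pdo->exec("UPDATE settings SET value = '0' WHERE name = 'ShouldProcessData'");
    }

    sleep(10); // poll every ten seconds
}
?>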
In PHP you have to specify what time-out you want for your process.
See PHP manual set_time_limit()
You may have another problem: the time-out of the browser itself (which could be around 1~2 minutes). While that time-out can usually be changed within the browser (for each browser), you can generally prevent the client-side time-out from being triggered by sending some data to the browser every 20 seconds, for instance (like the download header; you can then send other headers, such as encoding, etc.).
Gearman is very handy for this (create a background task, let JavaScript poll for progress). It does of course require having Gearman installed and workers created. See: http://www.php.net/gearman
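A minimal client/worker sketch using the pecl gearman extension (the function name 'generate_report' and the payload are placeholders):
<?php
// client.php -- called from the web request; returns immediately.
$client = new GearmanClient();
$client->addServer(); // defaults to 127.0.0.1:4730
$handle = $client->doBackground('generate_report', json_encode(array('user_id' => 42)));
// Store $handle (in the session or the database) so a later AJAX request
// can poll $client->jobStatus($handle) for progress.
?>

<?php
// worker.php -- a separate long-running CLI process.
$worker = new GearmanWorker();
$worker->addServer();
$worker->addFunction('generate_report', function (GearmanJob $job) {
    $params = json_decode($job->workload(), true);
    // ...do the slow work here, e.g. build and zip the report...
});
while ($worker->work());
?>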
Why don't you make an AJAX call from the page where you want to offer the download, and then just wait for the AJAX call to return, with set_time_limit(0) set on the other page?
I'm trying to use AJAX to make multiple simultaneous requests to a PHP script; however, it only seems to handle one at a time, and the next call cannot connect until the previous one has finished. What do I have to do to make it run them all at the same time? I'm using Apache (XAMPP) on Windows. I've also tested this on my Unix server and the same thing happens there as well.
In theory, there is nothing preventing one PHP script from being executed several times in parallel -- else, a lot of websites would have big problems ;-)
So, there is probably, in your situation, some locking mechanism that prevents this...
If your script is using sessions, and those are file-based (which is the default), those sessions could cause that kind of problem: with the default session handler, it's not possible to have several scripts accessing the same session data (i.e. the session data that corresponds to a given user) at the same time; that's to prevent one script from overwriting the data of another, and should probably not be disabled.
So, if your script is using sessions : would it be OK for you to stop using sessions ?
If not, you should try to close them as soon as you don't need them -- to unlock the files that are used to store them.
Here's a quote from the manual page of session_write_close, about that :
Session data is usually stored after your script terminated without the need to call session_write_close(), but as session data is locked to prevent concurrent writes only one script may operate on a session at any time. When using framesets together with sessions you will experience the frames loading one by one due to this locking. You can reduce the time needed to load all the frames by ending the session as soon as all changes to session variables are done.
For example, there is a very simple PHP script which updates some tables in a database, but this process takes a long time (maybe 10 minutes). Therefore, I want this script to continue processing even if the user closes the browser, because sometimes users do not wait and they close the browser or go to another webpage.
If the task takes 10 minutes, do not use a browser to execute it directly. You have lots of other options:
Use a cronjob to execute the task periodically.
Have the browser request insert a new row into a database table, so that a regular cronjob can process the new row and execute the PHP script with the appropriate arguments.
Have the browser request write a message to a queue system, which has a subscriber listening for such events (which then executes the script).
While some of these suggestions are probably overkill for your situation, the key, combining feature is to de-couple the browser request from the execution of the job, so that it can be completed asynchronously.
If you need the browser window updated with progress, you will need to use a periodically-executed AJAX request to retrieve the job status.
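For instance, the status endpoint that the AJAX request polls can be as small as this (the jobs table and its columns are assumptions about how you store job state):
<?php
// status.php -- polled by the page every few seconds.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$stmt = $pdo->prepare("SELECT status, rows_processed, rows_total FROM jobs WHERE id = ?");
$stmt->execute(array((int) $_GET['job_id']));

header('Content-Type: application/json');
echo json_encode($stmt->fetch(PDO::FETCH_ASSOC));
?>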
To answer your question directly, see ignore_user_abort
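In its simplest form, that looks something like this (a sketch, not a complete job runner):
<?php
// Keep running even if the user closes the browser or aborts the request,
// and lift the default execution time limit.
ignore_user_abort(true);
set_time_limit(0);

echo "Your job has been started; check back later for the result.";
flush(); // push the message out to the browser

// ...the long-running work continues here regardless of the client...
?>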
More broadly, you probably have an architecture problem here.
If many users can initiate this stuff, you'll want the web application to add jobs to some kind of queue, and have a set number of background processes that chew through all the work.
The PHP script will keep running after the client terminates the connection (not doing so would be a security risk), but only up to max_execution_time (set in php.ini or through a PHP script, generally 30 seconds by default).
For example:
<?php
$fh = fopen("bluh.txt", 'w');
for ($i = 0; $i < 20; $i++) {
    echo $i . "<br/>";      // send progress to the browser
    fwrite($fh, $i . "\n"); // record it in the file as well
    sleep(1);               // roughly 20 seconds of total runtime
}
fclose($fh);
?>
Start running that in your browser and close the browser before it completes. You'll find that after 20 seconds the file contains all of the values of $i.
Change the upper bound of the for loop to 100 instead of 20, and you'll find it only runs from 0 to 29. Because of PHP's max_execution_time the script times out and dies.
If the script is completely server-based (no feedback to the user), it will run to completion even if the client is closed.
The general architecture of PHP is that a client sends a request to a script, which sends a reply back to the user. If nothing is given back to the user, the script will still execute even if the user is no longer on the other side. Put more simply: there is no constant connection between server and client for a regular script.
You can make the PHP script run every 20 minutes using a crontab entry, which contains the time and the command to run; in this case the command would be the PHP script.
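For example, a crontab entry along these lines (the script path is a placeholder) runs it every 20 minutes:
*/20 * * * * php /path/to/script.php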
Yes. The server doesn't know if the user closed the browser. At least it doesn't notice that immediately.
No: the server probably (depending on how it is configured) won't allow a PHP script to run for 10 minutes. On cheap shared hosting I wouldn't rely on a script running for longer than a reasonable response time.
A server-side script will carry on with what it is doing regardless of what the client is doing.
EDIT: By the way, are you sure that you want to have pages that take 10 minutes to open? I suggest you employ a task queue (whose items are executed by cron on a regular basis) and redirect the user to an "ok, I am on it" page.