I'm not an expert on HTTP requests, so this question might be trivial for some. I'm sending a request to a PHP script which takes a long time to process a file and return a response. Is there a way to send a response before this script finishes its task, to let the user know about the process status? Since this task can take up to several minutes, I'd like to notify the user when key parts of the process are finished.
Note: I cannot break this request into several others.
I might not have the correct approach here; if so, do you have other ideas for how this could be handled?
Technically yes, but it would require fine-grained control of the HTTP stack, which you may or may not have in a typical PHP setup. I would suggest you look into other solutions (e.g. make a request to start the task, then poll to get an update on the progress).
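For illustration, the polling side could be as simple as a status endpoint the client hits every few seconds. This is only a sketch, assuming the long-running script writes its progress to a per-task JSON file; the file path and JSON shape are assumptions.

<?php
// status.php - polled by the client (e.g. with setInterval + AJAX).
// Assumes the long-running script periodically writes its progress to
// /tmp/task_<id>.json, e.g. {"step": "parsing", "percent": 40}.

header('Content-Type: application/json');

$taskId = preg_replace('/[^a-zA-Z0-9_-]/', '', $_GET['task'] ?? '');
$file   = "/tmp/task_{$taskId}.json";

if ($taskId === '' || !is_file($file)) {
    echo json_encode(['status' => 'unknown']);
    exit;
}

// Relay whatever the long-running script last reported.
echo file_get_contents($file);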
http://www.redips.net/javascript/ajax-progress-bar/
Here's a great article that goes over creating an AJAX progress bar to use with PHP.
Let me know if it doesn't make sense!
I think the best way to handle long-running requests is cron jobs. You can send a request which creates a 'task', and have a cron job pick that task up. The cron job can change the task status while working, and you can check the task status via interval requests. I can't imagine another way to inform users about request processing: as soon as you send a response, your headers are sent and PHP stops.
EDIT: it should be noted that cron jobs are only available on Linux/Unix servers. Windows servers would require access to the Task Scheduler, which most web hosts will not allow.
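As a rough illustration of that idea (the table name, columns, DSN and crontab entry are all assumptions), the script run by cron could look something like this:

<?php
// worker.php - intended to be run by cron, e.g. (assumed entry):
//   * * * * * php /path/to/worker.php
// It picks one pending task, marks it as processing, does the work,
// then marks it as done so the interval requests can report the status.

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$task = $pdo->query("SELECT * FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1")
            ->fetch(PDO::FETCH_ASSOC);

if (!$task) {
    exit; // nothing to do on this run
}

$update = $pdo->prepare('UPDATE tasks SET status = ? WHERE id = ?');
$update->execute(['processing', $task['id']]);

// ... the actual long-running work on $task['payload'] goes here ...

$update->execute(['done', $task['id']]);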
I've been reading for a long time about how to solve my problem, but I can't find the solution.
I'm working with Symfony, and I have a long-running process to execute when a user calls an action. Can I process the data after the request has finished? The idea is to launch a polling process from the client with jQuery and wait until the process finishes, then redirect to another action.
Now, I'm doing that with a ContainerAwareCommand, but it waits until the background process finishes.
Please, could you help me?
Thanks in advance.
Yes, it's possible to run a background process in Symfony after the Response has been sent to the user.
You need to write a listener for the kernel.terminate event and define your long-running process inside the callback.
Just be aware of a few things:
This technique does not work if the Response is sent gzip-encoded, so you should force Apache/Nginx not to use gzip for this particular response.
It's tricky to set any session data during this request, because the session will only be saved after your long-running process has finished. This means you need to find an alternative to flash bag messages.
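For illustration, a minimal kernel.terminate subscriber might look like this (Symfony 4.3+ class names assumed; the route name and the doWork() helper are placeholders):

<?php

namespace App\EventSubscriber;

use Symfony\Component\EventDispatcher\EventSubscriberInterface;
use Symfony\Component\HttpKernel\Event\TerminateEvent;
use Symfony\Component\HttpKernel\KernelEvents;

class LongTaskSubscriber implements EventSubscriberInterface
{
    public static function getSubscribedEvents(): array
    {
        // kernel.terminate is dispatched after the response has been sent
        return [KernelEvents::TERMINATE => 'onKernelTerminate'];
    }

    public function onKernelTerminate(TerminateEvent $event): void
    {
        // Only run the heavy work for the action that triggered it
        // ('app_start_process' is a placeholder route name).
        if ($event->getRequest()->attributes->get('_route') !== 'app_start_process') {
            return;
        }

        // The response is already on its way to the client; nothing echoed
        // here (and no session/flash data set here) will reach the user.
        $this->doWork($event->getRequest()->get('payload'));
    }

    private function doWork($payload): void
    {
        // ... long-running processing ...
    }
}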
This is a good case for a queue such as RabbitMQ or Redis. Put a message into a queue as each file is uploaded. One or more PHP daemons read out of the queue, process each file, and update status for the user (e.g. update a row in a database).
This lets you run your processing on servers separate from your web requests and scales easily. With a queue you can also break your processing up into multiple concurrent tasks, if what you need to do allows it.
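A minimal sketch of that pattern using Redis lists (the phpredis extension, the key name and the payload shape are assumptions):

<?php
// Producer side (in the upload request handler) - phpredis extension assumed.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$job = ['file' => '/uploads/report.csv', 'user_id' => 42]; // example payload
$redis->rPush('file_jobs', json_encode($job));

// Consumer side (a separate, long-running CLI daemon):
while (true) {
    // Block for up to 5 seconds waiting for the next job.
    $item = $redis->blPop(['file_jobs'], 5);
    if (!$item) {
        continue; // timed out, loop again
    }

    $job = json_decode($item[1], true);

    // ... process $job['file'] here ...

    // Then update a status row so the web app can show progress to the user.
}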
I have this scenario:
User submits a link to my PHP website and closes the browser. Now that the server has the link, it will analyse the submitted link (page) for broken links, and after it has completely analysed the posted link, it will send an email to the user. I have a complete understanding of the second part, i.e. how to analyse the page for broken links and send the mail to the user. The only problem I have is with the first part, i.e. how may I make the server keep running these actions on its own, even if there is no request made by the client end?
I have learned that "Crontab" or a "fork" may work for me. What do you say about these? Is it possible to achieve what I want, using these? What are the alternatives?
Crontab would be the way to go for something like this.
Essentially you have two applications:
A web site where users submit data to a database.
An offline script, scheduled to run via cron, which checks for records in the database and performs the analysis, sending notifications of the results when complete.
Both of these applications share the same database, but are otherwise oblivious to each other.
A website itself isn't well suited to this sort of offline work; it's mainly a request/response system. But a scheduled task works for this. Unless the user is expecting an immediate response, the small delay of waiting for the next scheduled run of the offline task is fine.
The server should run the script independently of the browser. Once the request is submitted, the PHP server runs the script and returns the result to the browser (if it has a result to return).
An alternative would be to add the request to a database and then use crontab to run the PHP script at a given interval. The script would then check the database to see if there's anything that needs to be processed. You could limit the script to processing one database entry per run (or whatever works). This will help prevent performance problems if you have a lot of requests at once, but will be slower to send the email.
A typical approach would be to enter the link into a database when the user submits it. You would then use a cron job to execute a script periodically, which will process any pending links.
Exactly how to set up a cron job (or equivalent scheduled task) depends on your server. If you have a host which provides a web-based admin tool (such as cPanel), there will often be a way to do it in there.
The PHP script will keep running after the client closes the browser (terminating the connection).
Just keep in mind that a PHP script's maximum execution time is limited by the "max_execution_time" directive.
Of course, here I assume the link submission happens by calling your script page... I'm not sure whether this is your use case...
For the sake of simplicity, a cronjob could work wonders. A user submits a link, and the web handler simply saves the link into a DB (let's pretend here that the table is named "queued_links"). Then a cronjob scheduled to run each minute (for example) selects every link from queued_links, does the application logic (finds broken page links) and sends the email. It then also deletes the link from queued_links (or updates a flag to mark that the link has already been processed).
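A rough sketch of that cron script (the table and column names and the shape of the report are assumptions; the actual link-checking logic is left as a placeholder):

<?php
// check_links.php - scheduled via cron, e.g. every minute.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$rows = $pdo->query('SELECT id, url, email FROM queued_links WHERE processed = 0')
            ->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    // ... fetch $row['url'], extract its links, test each one for broken responses ...
    $report = "Broken-link report for {$row['url']}: ..."; // build from the results

    mail($row['email'], 'Your link analysis is ready', $report);

    // Mark as processed (or DELETE the row instead, as noted above).
    $pdo->prepare('UPDATE queued_links SET processed = 1 WHERE id = ?')
        ->execute([$row['id']]);
}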
For the sake of scale and speed, a cronjob wouldn't fit as well as a message queue (see RabbitMQ, ActiveMQ, Gearman, and beanstalkd; Gearman and beanstalkd are my favourite two, simple and they fit well with PHP). Instead of spawning a cronjob every minute, a queue processor listens for 'events' (think 'onLinkSubmission($link)') and processes the messages asynchronously, as soon as possible. The cronjob solution is just a simplified implementation of one of these MQ solutions; a proper MQ will give better / more predictable results, but at the cost of adding new services to maintain, etc.
Well, there are a couple of ways; the simplest of them would be:
When a user submits a request, save it somewhere (let's call it a jobs table) and inform the customer that their request has been received and that they'll be updated once the site finishes processing it, or whatever suits you.
Now, create a script (or multiple scripts, depending on your requirements) and run it from cron. This script will pick requests from the jobs table, process them, and do whatever is required.
Alternatively, you can evaluate the possibility of a message queue, or maybe a job server, for this.
So, it all depends on your requirements.
I have a web application written in PHP using a Postgres database.
The next phase of development is for background batch processes to be built that will need to be executed once a day (or ad hoc, as requested) for each user of the app. The process will query third-party services, await their responses, and process those responses to feed information into the user's account within the web application.
Are there good ways to do this?
How would batches be triggered every day at 3am for each user?
Given there could be a delay in the response, is this a good scenario in which to use something like Node.js?
Is it best to have the output of the batch process directly update the web application's database with the appropriate data?
Or, is there some other way to handle the output?
Update: The process doesn't have to run at 3am. The key is that a few batch processes may need to run for each user. The execution of batches could be spread throughout the day. I want this to be a "background" process separate from the app.
You could write a PHP script that runs through any users that need to be processed, and set up a cron job to run your script at 3am. Running as a cron job means you don't need to worry so much about how slow the third party call is. Obviously you'd need to store any necessary data in the database.
Alternatively, if the process is triggered by the user doing something on the site, you could use exec() to trigger the PHP script to process just that user, right away, without the user having to wait. The risk with this is that you can't control how rapidly the process is triggered.
A third option is to just do the request live and make the user wait, but it sounds like this is not an option for you.
It really depends on what third party you're calling and why: how long the third party takes to respond, how reliable they are, what kind of rate limits they might enforce, etc.
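For the exec() option mentioned above, the usual trick is to background the child process and detach its output so the web request returns immediately. A sketch, where the script path and process_user.php are assumptions:

<?php
// Fire-and-forget: the "&" backgrounds the process and the redirects detach
// its output, so exec() returns without waiting for the script to finish.
$userId = 42; // example
$cmd = sprintf(
    'php /var/www/scripts/process_user.php %s > /dev/null 2>&1 &',
    escapeshellarg((string) $userId)
);
exec($cmd);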
I am currently working with a really, really slow API, and in many instances the website users have to wait for those calls to finish, e.g. when a contact form is submitted and the information is sent via the API.
Now, I am wondering how I can speed the API calls up, at least from the user's perspective. Is it OK to make an asynchronous AJAX call to a separate PHP file and make the API call from there? If so, what happens if the user closes the page while the API call is still running? They might think that everything has already been sent.
Is it OK to make an asynchronous AJAX call to a separate PHP file and make the API call from there?
Yes, definitely; that would be the best way.
If so, what happens if the user closes the page while the API call is still running? They might think that everything has already been sent.
It likely is sent; the PHP script running the API call continues on its merry way, and it's only when it tries to send a response back (a confirmation or error, likely) that it finds the client went away. If the API call generates an email eventually, it will complete whether the user waits or not (unless there's an error in the API call itself).
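If you want to be defensive about the client disconnecting mid-call, the separate PHP file can explicitly tell PHP to carry on regardless. A sketch; api_call.php and sendToApi() are placeholders:

<?php
// api_call.php - target of the asynchronous AJAX request.

// Keep running even if the browser disconnects, and don't let the
// default time limit kill a slow API round trip.
ignore_user_abort(true);
set_time_limit(0);

// Acknowledge quickly so the AJAX caller isn't left hanging.
echo json_encode(['queued' => true]);

// Flush the output to the client before doing the slow part.
if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request(); // PHP-FPM only
} else {
    flush();
}

// ... now make the slow API call; sendToApi() is a placeholder ...
// sendToApi($_POST);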
If you have the ability to run cron or a scheduled task, I would convert it to an offline process. E.g. save the data in the DB locally and return immediately. Then write a script that will run periodically via cron to process the new entries.
I blogged an article about this a while back that describes pretty much this exact process: Building A Scalable Queueing System With PHP
I have a login script that passes data to another script for processing. The processing is unrelated to the login script but it does a bit of data checking and logging for internal analysis.
I am using cURL to pass this data, but cURL waits for the response. I do not want to wait for the response, because it forces the user to wait for the analysis to complete before they can log in.
I am aware that the request could fail, but I am not overly concerned.
I basically want it to work like a multi-threaded application where cURL is being used to fork a process. Is there any way to do this?
My code is below:
// Log user in
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://site.com/userdata.php?e=' . $email);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
curl_close($ch);
// Redirect user to their home page
That's all it does, but at the moment it has to wait for the cURL request to get a response.
Is there any way to make a get request and not wait for the response?
You don't need curl for this. Just open a socket and fire off a manual HTTP request and then close the socket. This is also useful because you can use a custom user agent so as not to skew your logging.
See this answer for an example.
Obviously, it's not "true" async/forking, but it should be quick enough.
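A minimal sketch of that socket approach, reusing $email from the question's snippet (the host, path and user-agent string are examples): open the connection, write the request, and close without reading the response.

<?php
// Fire-and-forget GET via a raw socket; we never read the response.
$host = 'site.com';
$path = '/userdata.php?e=' . urlencode($email);

$fp = @fsockopen($host, 80, $errno, $errstr, 1); // 1-second connect timeout
if ($fp) {
    $request  = "GET {$path} HTTP/1.1\r\n";
    $request .= "Host: {$host}\r\n";
    $request .= "User-Agent: internal-logger\r\n"; // custom UA so logs can ignore it
    $request .= "Connection: Close\r\n\r\n";

    fwrite($fp, $request);
    fclose($fp); // close immediately instead of waiting for a response
}
// ... redirect the user to their home page here ...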
I like Matt's idea the best; however, to speed up your request you could
a) just make a HEAD request (CURLOPT_NOBODY), which is significantly faster (no response body)
or
b) just set the request time limit really low. However, I guess you should test whether aborting the request is really faster than only HEADing.
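As a rough sketch on the question's snippet (the millisecond timeout values are arbitrary, and $email comes from the original code), either tweak would look like this:

<?php
$ch = curl_init('http://site.com/userdata.php?e=' . urlencode($email));

// Option a) ask for headers only - no response body to download.
curl_setopt($ch, CURLOPT_NOBODY, true);

// Option b) give up very quickly instead of waiting for the full response.
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 100);       // overall limit, in ms
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT_MS, 100);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);   // may return false on the deliberate timeout - that's expected
curl_close($ch);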
Another possibility: Since there's apparently no need to do the analysis immediately, why do it immediately? If your provider allows cron jobs, just have the script that curl calls store the passed data quickly in a database or file, and have a cron job execute the processing script once a minute or hour or day. Or, if you can't do that, set up your own local machine to regularly run a script that invokes the remote one which processes the stored data.
It strikes me that what you're describing is a queue. You want to kick off a bunch of offline processing jobs and process them independently of user interaction. There are plenty of systems for doing that, though I'd particularly recommend beanstalkd using pheanstalk in PHP. It's far more reliable and controllable (e.g. managing retries in case of failures) than a cron job, and it's also very easy to distribute processing across multiple servers.
The equivalent of your calling a URL and ignoring the response is creating a new job in a 'tube'. It solves your particular problem because it will return more or less instantly and there is no response body to speak of.
At the processing end you don't need exec - run a CLI script in an infinite loop that requests jobs from the queue and processes them.
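For illustration, a minimal producer/worker pair with pheanstalk could look like this (a v4-style API is assumed; the tube name and payload are examples):

<?php
require 'vendor/autoload.php';

use Pheanstalk\Pheanstalk;

$pheanstalk = Pheanstalk::create('127.0.0.1');

// Producer (e.g. in the login script): enqueue and return immediately.
$pheanstalk->useTube('analysis')->put(json_encode(['email' => 'user@example.com']));

// Worker (a separate CLI script running in an infinite loop):
$pheanstalk->watch('analysis');
while (true) {
    $job = $pheanstalk->reserveWithTimeout(5); // block up to 5 seconds
    if ($job === null) {
        continue;
    }

    $data = json_decode($job->getData(), true);
    // ... do the data checking / logging for $data['email'] ...

    $pheanstalk->delete($job); // remove the job once it's been handled
}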
You could also look at ZeroMQ.
Overall this is not dissimilar to what GZipp suggests, it's just using a system that's designed specifically for this mode of operation.
If you have a restrictive ISP that won't let you run other software, it may be time to find a new ISP - Amazon AWS will give you a free EC2 micro instance for a year.