Is there any sane way to make an HTTP request asynchronously in PHP without throwing out the response? I.e., something similar to AJAX - the PHP script initiates the request, does its own thing and later, when the response is received, a callback function/method or another script handles the response.
One approach has crossed my mind - spawning a new PHP process with another script for each request - the second script does the request, waits for the response and then parses the data and does whatever it should, while the original script goes on spawning new processes. I have doubts, though, about performance in this case - there must be some performance penalty from having to create a new process every time.
Yes, depending on the traffic of your site, spawning a separate PHP process for running a script could be devastating. It would be more efficient to use shell_exec() to start a background process that saves the output to a filename you already know, but even this could be resource intensive.
You could also have a request queue stored in a database. A single, separate background process would pull the job, execute it, and save the output, possibly setting a flag in the DB that your web process could check.
If you're going to use the DB queue approach, use the curl_multi_* family of functions to send all queued requests at once. This limits the execution time of each iteration of your background process to the duration of the longest request.
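For reference, a rough sketch of that curl_multi_* approach might look like this (the $urls array stands in for whatever your queue table returns, and the DB update is only indicated by a comment):
$mh = curl_multi_init();
$handles = array();
foreach ($urls as $id => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$id] = $ch;
}

// Run all handles in parallel; total wall time is roughly that of the slowest request.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

foreach ($handles as $id => $ch) {
    $response = curl_multi_getcontent($ch);
    // ... save $response and flag job $id as done in the DB ...
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);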
PHP V5 may not be threaded, but you can create applications that exploit in-process multitasking.
Check out the article "Develop multitasking applications with PHP V5" from IBM DeveloperWorks: http://www.ibm.com/developerworks/web/library/os-php-multitask/
This time I come with a question that I hope you can guide me toward solving.
I have created a PHP script that loads a CSV file containing a large amount of data (I upload it via an AJAX request). The script extracts the data from the file, checks that the data is not already stored in the database, uses another script to obtain information about each record extracted from the file, and finally saves the records that successfully pass all of that validation into a DB table.
It is a process that can last a few seconds or many minutes, because some of the files I upload contain more than 100 thousand records, so I would rather not leave the browser open for the entire duration of the process.
What I want to know is how I could leave this process running internally on the server when I close the browser. Something like putting it in a queue and letting it continue running after I close my browser.
When I reopen the browser, I would open the page of the script that shows me how the process is currently going. The idea is that the data processing is not interrupted when I close my browser.
Any suggestions or examples you could give me to achieve this?
Based on your description, I think you'd be better off running a dedicated daemon (either a third-party one or one you write yourself) that does the background work.
The rationale for why I don't think it's right to do this in your web-facing PHP code is:
If you fork from your server code, you have to install something extra (the pcntl extension, for instance), and since it is a fork, the process you spawn will inherit data from the parent process that is of no use to it at all.
With a dedicated daemon, it's easier to track the status of each job and, more importantly, you avoid spawning a whole bunch of processes, which is what happens if you fork a new process for each job in the server code.
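As a rough illustration of that daemon approach, a minimal polling worker might look something like the following (the DSN, table and column names are all made up for the example; a real daemon would also need error handling and signal handling):
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
while (true) {
    // Grab one pending job; the import_jobs table is hypothetical.
    $job = $pdo->query("SELECT id, csv_path FROM import_jobs WHERE status = 'pending' LIMIT 1")
               ->fetch(PDO::FETCH_ASSOC);
    if (!$job) {
        sleep(5); // nothing to do, wait and poll again
        continue;
    }
    $pdo->prepare("UPDATE import_jobs SET status = 'running' WHERE id = ?")->execute([$job['id']]);
    // ... read the CSV, validate each record, insert the good ones ...
    $pdo->prepare("UPDATE import_jobs SET status = 'done' WHERE id = ?")->execute([$job['id']]);
}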
I've been reading for a long time about how to solve my problem, but I can't find the solution.
I'm working with Symfony, and I have a long-running process to execute when a user calls an action. Can I process the data after the request has finished? The idea is to launch a polling process from the client with jQuery and wait until the process finishes before redirecting to another action.
Right now I'm doing this with a ContainerAwareCommand, but the request waits until the background process finishes.
Please, could you help me?
Thanks in advance.
Yes, it's possible to do some background processing in Symfony after the Response has been sent to the user.
You need to write a listener for the kernel.terminate event and run your long-running process inside its callback.
Just be aware of a few things:
This technique does not work if the Response is sent gzip-encoded, so you should force Apache/nginx not to gzip this particular response.
It's tricky to set any session data during this request, because the session will only be written after your long-running process has finished. That means you need to find an alternative to flash bag messages.
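A minimal sketch of such a listener, assuming the older Symfony 2/3 event class (newer versions use TerminateEvent instead of PostResponseEvent) and a made-up route name; it would need to be registered as a service tagged kernel.event_listener for the kernel.terminate event:
namespace AppBundle\EventListener;

use Symfony\Component\HttpKernel\Event\PostResponseEvent;

class LongTaskListener
{
    public function onKernelTerminate(PostResponseEvent $event)
    {
        // Only react to the action that starts the long job
        // ('app_long_action' is a hypothetical route name).
        if ($event->getRequest()->attributes->get('_route') !== 'app_long_action') {
            return;
        }

        // The Response has already been sent to the user at this point,
        // so the heavy work below does not delay the redirect or the polling.
        // ... run the long process and record its progress somewhere ...
    }
}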
This is a good case for a queue such as RabbitMQ or Redis. Put a message into a queue as each file is uploaded. One or more PHP daemons read out of the queue, process each file, and update status for the user (e.g. update a row in a database).
This lets you run your processing on servers separate from your web requests and scales easily. With a queue you can also break your processing up into multiple concurrent tasks, if what you need to do allows it.
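As a rough sketch of the queue idea with Redis and the phpredis extension (the queue key, payload shape and status update are illustrative; a blocking brPop would avoid the sleep-based polling):
// In the upload request: enqueue the work and return immediately.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
$redis->lPush('csv_jobs', json_encode(['file' => $uploadedPath, 'user' => $userId]));

// In a separate PHP daemon/worker process:
while (true) {
    $payload = $redis->rPop('csv_jobs');
    if ($payload === false) {
        sleep(1); // queue empty, poll again
        continue;
    }
    $job = json_decode($payload, true);
    // ... process $job['file'] and update a status row for $job['user'] ...
}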
I have a web application written in PHP using a Postgres database.
The next phase of development is to build background batch processes that will need to be executed once a day (or ad hoc on request) for each user of the app. Each process will query third-party services, await their responses, and process those responses to feed information into the user's account within the web application.
Are there good ways to do this?
How would batches be triggered every day at 3am for each user?
Given there could be a delay in the response, is this a good scenario to use something like node.js?
Is it best to have the output of the batch process directly update the web application's database with the appropriate data?
Or, is there some other way to handle the output?
Update: The process doesn't have to run at 3am. The key is that a few batch processes may need to run for each user. The execution of batches could be spread throughout the day. I want this to be a "background" process separate from the app.
You could write a PHP script that runs through any users that need to be processed, and set up a cron job to run your script at 3am. Running as a cron job means you don't need to worry so much about how slow the third party call is. Obviously you'd need to store any necessary data in the database.
Alternatively, if the process is triggered by the user doing something on the site, you could use exec() to trigger the PHP script to process just that user, right away, without the user having to wait. The risk with this is that you can't control how rapidly the process is triggered.
A third option is to just do the request live and make the user wait, but it sounds like that is not an option for you.
It really depends on what third party you're calling and why: how long the third party takes to respond, how reliable it is, what kind of rate limits it might enforce, and so on.
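To make the cron option concrete, a minimal sketch of the batch script might look like this (the DSN, table and column names, and the third-party URL are all hypothetical; a crontab entry such as 0 3 * * * /usr/bin/php /var/www/app/batch/process_users.php would run it daily at 3am):
$pdo = new PDO('pgsql:host=localhost;dbname=app', 'user', 'pass');

$users = $pdo->query("SELECT id, api_token FROM users WHERE batch_enabled = true");
foreach ($users as $user) {
    // Call the third-party service for this user; a timeout matters here
    // because the response may be slow.
    $ch = curl_init('https://thirdparty.example.com/report?token=' . urlencode($user['api_token']));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    $response = curl_exec($ch);
    curl_close($ch);

    if ($response !== false) {
        $pdo->prepare("UPDATE accounts SET external_data = ? WHERE user_id = ?")
            ->execute([$response, $user['id']]);
    }
}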
I want to have my own variable (most likely an array) storing what my PHP application is up to right now.
The application can trigger a few background processes (like downloading files), and I want to have a list of what is currently being processed.
For example: PHP calls exec() to start a download that will run for 15 minutes, then another download starts, and then another. If I then access my application, I want to be able to see that three downloads are in progress (assuming none of them has finished yet).
Can I do that? Only in memory, without storing anything on disk?
I thought the solution would be some kind of server variable.
PHP doesn't have knowledge of previous processes. As soon as a PHP process finishes, everything it knows about itself goes with it.
I can think of two options. Write knowledge about spawned processes to a file or database and use it to sync all your PHP requests (store the PID of each spawned process).
Or
Create a daemon. The people behind PHP have worked hard to clean up PHP's memory handling and such to make this more feasible. Take a look at the PEAR package System_Daemon: http://pear.php.net/package/System_Daemon
Off the top of my head, a quick architecture would be composed of three pieces:
Part A) The web app that takes in requests for downloads and reports back the progress of all requests
Part B) Your daemon, which accepts requests for downloads, spawns processes, and reports back the status of all spawned requests
Part C) The spawned process that performs the download you need.
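As a rough example of the "store the PID of each spawned process" idea (Linux-specific; the wget command, URL and file paths are illustrative):
// Spawn a background download and capture its PID.
$cmd = 'nohup wget -q ' . escapeshellarg($url) . ' -O /tmp/file.zip > /dev/null 2>&1 & echo $!';
$pid = (int) trim(shell_exec($cmd));
file_put_contents('/tmp/downloads.pids', $pid . PHP_EOL, FILE_APPEND);

// Later, in the web app, list which downloads are still running.
$active = [];
foreach (file('/tmp/downloads.pids', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $pid) {
    if (file_exists('/proc/' . $pid)) { // /proc/<pid> exists while the process is alive
        $active[] = $pid;
    }
}
echo count($active) . " downloads in progress\n";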
Anyone for shared memory?
Obviously you would have to have some sort of daemon, but you could use the built-in semaphore and shared memory functions to communicate between the scripts. You need to be careful, though: if you don't release the memory blocks properly, you can end up with no blocks left.
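A minimal sketch using PHP's System V shared memory and semaphore functions (they require the sysvshm/sysvsem extensions; the IPC key, segment size and data layout are arbitrary choices for the example):
$key = ftok(__FILE__, 'd');        // derive an IPC key from this file
$sem = sem_get($key, 1);           // one-slot semaphore guarding the segment
$shm = shm_attach($key, 16384);    // 16 KB shared memory segment

sem_acquire($sem);
$downloads = shm_has_var($shm, 1) ? shm_get_var($shm, 1) : array();
$downloads[] = array('pid' => getmypid(), 'started' => time());
shm_put_var($shm, 1, $downloads);  // store the updated list of active downloads
sem_release($sem);

// Another script can attach with the same key and read variable 1 to list
// the downloads. Remember to shm_remove()/sem_remove() when finished, or
// the blocks will be left behind.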
You can't store your own variables in $_SERVER. The best method would be to store your data in a database and query/update it as required.
I have a login script that passes data to another script for processing. The processing is unrelated to the login script but it does a bit of data checking and logging for internal analysis.
I am using cURL to pass this data, but cURL waits for the response. I do not want to wait for the response, because it forces the user to wait until the analysis is complete before they can log in.
I am aware that the request could fail, but I am not overly concerned.
I basically want it to work like a multi-threaded application, where cURL is used to fork off a process. Is there any way to do this?
My code is below:
// Log user in
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://site.com/userdata.php?e=' . urlencode($email));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch); // blocks here until the remote script responds
curl_close($ch);
// Redirect user to their home page
That's all it does, but at the moment it has to wait for the cURL request to get a response.
Is there any way to make a get request and not wait for the response?
You don't need curl for this. Just open a socket and fire off a manual HTTP request and then close the socket. This is also useful because you can use a custom user agent so as not to skew your logging.
See this answer for an example.
Obviously, it's not "true" async/forking, but it should be quick enough.
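For example, a minimal fire-and-forget sketch along those lines (the host and path come from the question; the 1-second connect timeout and user agent string are arbitrary):
$host = 'site.com';
$path = '/userdata.php?e=' . urlencode($email);

$fp = @fsockopen($host, 80, $errno, $errstr, 1); // 1-second connect timeout
if ($fp) {
    $request  = "GET {$path} HTTP/1.1\r\n";
    $request .= "Host: {$host}\r\n";
    $request .= "User-Agent: internal-logger\r\n"; // custom UA keeps analytics clean
    $request .= "Connection: Close\r\n\r\n";
    fwrite($fp, $request);
    fclose($fp); // close immediately without waiting for or reading the response
}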
I like Matt's idea the best; however, to speed up your request you could:
a) just make a HEAD request (CURLOPT_NOBODY), which is significantly faster (no response body)
or
b) just set the request timeout really low, though you should test whether aborting the request is actually faster than only sending a HEAD request (a quick sketch combining both is below)
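A minimal sketch combining (a) and (b), assuming the URL from the question (CURLOPT_TIMEOUT_MS needs a reasonably recent cURL, and CURLOPT_NOSIGNAL is often required for sub-second timeouts):
$ch = curl_init('http://site.com/userdata.php?e=' . urlencode($email));
curl_setopt($ch, CURLOPT_NOBODY, true);       // (a) HEAD request: no response body
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 200);    // (b) give up after roughly 200 ms
curl_setopt($ch, CURLOPT_NOSIGNAL, true);     // allow sub-second timeouts on most builds
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);  // a timeout error here is expected and can simply be ignored
curl_close($ch);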
Another possibility: Since there's apparently no need to do the analysis immediately, why do it immediately? If your provider allows cron jobs, just have the script that curl calls store the passed data quickly in a database or file, and have a cron job execute the processing script once a minute or hour or day. Or, if you can't do that, set up your own local machine to regularly run a script that invokes the remote one which processes the stored data.
It strikes me that what you're describing is a queue. You want to kick off a bunch of offline processing jobs and process them independently of user interaction. There are plenty of systems for doing that, though I'd particularly recommend beanstalkd using pheanstalk in PHP. It's far more reliable and controllable (e.g. managing retries in case of failures) than a cron job, and it's also very easy to distribute processing across multiple servers.
The equivalent of your calling a URL and ignoring the response is creating a new job in a 'tube'. It solves your particular problem because it will return more or less instantly and there is no response body to speak of.
At the processing end you don't need exec - run a CLI script in an infinite loop that requests jobs from the queue and processes them.
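To make that concrete, a rough pheanstalk sketch (the constructor differs between pheanstalk versions; this follows the older 3.x style, and the tube name and payload are illustrative):
// In the web request: enqueue the job and return immediately.
$pheanstalk = new Pheanstalk\Pheanstalk('127.0.0.1');
$pheanstalk->useTube('logins')->put(json_encode(array('email' => $email)));

// In a long-running CLI worker:
while (true) {
    $job = $pheanstalk->watch('logins')->ignore('default')->reserve();
    $data = json_decode($job->getData(), true);
    // ... do the data checking and logging for $data['email'] ...
    $pheanstalk->delete($job); // remove the job once it has been processed
}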
You could also look at ZeroMQ.
Overall this is not dissimilar to what GZipp suggests, it's just using a system that's designed specifically for this mode of operation.
If you have a restrictive ISP that won't let you run other software, it may be time to find a new ISP - Amazon AWS will give you a free EC2 micro instance for a year.