I work on a somewhat large web application, and the backend is mostly in PHP. There are several places in the code where I need to complete some task, but I don't want to make the user wait for the result. For example, when creating a new account, I need to send them a welcome email. But when they hit the 'Finish Registration' button, I don't want to make them wait until the email is actually sent, I just want to start the process, and return a message to the user right away.
Up until now, in some places I've been using what feels like a hack with exec(). Basically doing things like:
exec("doTask.php $arg1 $arg2 $arg3 >/dev/null 2>&1 &");
Which appears to work, but I'm wondering if there's a better way. I'm considering writing a system which queues up tasks in a MySQL table, and a separate long-running PHP script that queries that table once a second, and executes any new tasks it finds. This would also have the advantage of letting me split the tasks among several worker machines in the future if I needed to.
Am I re-inventing the wheel? Is there a better solution than the exec() hack or the MySQL queue?
I've used the queuing approach, and it works well: you can defer that processing until your server load is low, which lets you manage your load quite effectively if you can easily partition off "tasks which aren't urgent".
Rolling your own isn't too tricky; here are a few other options to check out:
GearMan - this answer was written in 2009, and since then Gearman has become a popular option; see the comments below.
ActiveMQ if you want a full blown open source message queue.
ZeroMQ - this is a pretty cool socket library which makes it easy to write distributed code without having to worry too much about the socket programming itself. You could use it for message queuing on a single host - you would simply have your webapp push something to a queue that a continuously running console app would consume at the next suitable opportunity
beanstalkd - only found this one while writing this answer, but looks interesting
dropr is a PHP based message queue project, but hasn't been actively maintained since Sep 2010
php-enqueue is a wrapper around a variety of queue systems, actively maintained as of 2017
Finally, a blog post about using memcached for message queuing
Another, perhaps simpler, approach is to use ignore_user_abort - once you've sent the page to the user, you can do your final processing without fear of premature termination, though this does have the effect of appearing to prolong the page load from the user perspective.
When you just want to execute one or several HTTP requests without having to wait for the response, there is a simple PHP solution, as well.
In the calling script:
$socketcon = fsockopen($host, 80, $errno, $errstr, 10);
if ($socketcon) {
    $socketdata = "GET $remote_house/script.php?parameters=... HTTP/1.1\r\nHost: $host\r\nConnection: Close\r\n\r\n";
    fwrite($socketcon, $socketdata);
    fclose($socketcon);
}
// repeat this with different parameters as often as you like
// repeat this with different parameters as often as you like
On the called script.php, you can invoke these PHP functions in the first lines:
ignore_user_abort(true);
set_time_limit(0);
This causes the script to keep running, with no time limit, even after the HTTP connection has been closed.
Another way to fork processes is via curl. You can set up your internal tasks as a webservice. For example:
http://domain/tasks/t1
http://domain/tasks/t2
Then in your user accessed scripts make calls to the service:
$service->addTask('t1', $data); // post data to URL via curl
Your service can keep track of the queue of tasks with MySQL or whatever you like. The point is that it's all wrapped up within the service, and your script just consumes URLs. This frees you up to move the service to another machine/server if necessary (i.e. it scales easily).
Adding HTTP authorization or a custom authorization scheme (like Amazon's web services) lets you open up your tasks to be consumed by other people/services (if you want), and you could take it further and add a monitoring service on top to keep track of queue and task status.
http://domain/queue?task=t1
http://domain/queue?task=t2
http://domain/queue/t1/100931
It does take a bit of set-up work but there are a lot of benefits.
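For illustration, here is a rough sketch of what the $service->addTask() helper above might look like. The class name, endpoint layout, and timeout are assumptions; the idea is that the /tasks/<name> endpoint just enqueues the work and returns quickly, so the calling page isn't held up.
<?php
class TaskService
{
    private $baseUrl;

    public function __construct($baseUrl)
    {
        $this->baseUrl = $baseUrl;
    }

    public function addTask($name, array $data)
    {
        // POST the task payload to the internal webservice, which only enqueues it.
        $ch = curl_init($this->baseUrl . '/tasks/' . urlencode($name));
        curl_setopt_array($ch, array(
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => http_build_query($data),
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => 5, // enqueueing should be quick
        ));
        $ok = curl_exec($ch) !== false;
        curl_close($ch);
        return $ok;
    }
}

// Usage from a user-facing script:
$service = new TaskService('http://domain');
$service->addTask('t1', array('user_id' => 123));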
If it is just a question of offloading expensive tasks, and php-fpm is available, why not use the fastcgi_finish_request() function?
This function flushes all response data to the client and finishes the request. This allows for time consuming tasks to be performed without leaving the connection to the client open.
Used this way, you don't really need asynchronicity at all:
Make all your main code first.
Execute fastcgi_finish_request().
Then do all the heavy stuff.
Once again php-fpm is needed.
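A minimal sketch of that flow under PHP-FPM; sendWelcomeEmail() is a hypothetical slow task standing in for whatever heavy work you need to do:
<?php
// 1. Do the main work and produce the response.
echo json_encode(['status' => 'ok', 'message' => 'Registration complete']);

// 2. Flush the response and end the request from the client's point of view (PHP-FPM only).
if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request();
}

// 3. The user already has their response; the heavy work continues here.
ignore_user_abort(true);
set_time_limit(0);
sendWelcomeEmail($newUserId); // hypothetical slow task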
I've used Beanstalkd for one project, and planned to again. I've found it to be an excellent way to run asynchronous processes.
A couple of things I've done with it are:
Image resizing - and with a lightly loaded queue passing off to a CLI-based PHP script, resizing large (2mb+) images worked just fine, but trying to resize the same images within a mod_php instance was regularly running into memory-space issues (I limited the PHP process to 32MB, and the resizing took more than that)
near-future checks - beanstalkd has delays available to it (make this job available to run only after X seconds) - so I can fire off 5 or 10 checks for an event, a little later in time
I wrote a Zend-Framework based system to decode a 'nice' url, so for example, to resize an image it would call QueueTask('/image/resize/filename/example.jpg'). The URL was first decoded to an array(module,controller,action,parameters), and then converted to JSON for injection to the queue itself.
A long running cli script then picked up the job from the queue, ran it (via Zend_Router_Simple), and if required, put information into memcached for the website PHP to pick up as required when it was done.
One wrinkle I did also put in was that the cli-script only ran for 50 loops before restarting, but if it did want to restart as planned, it would do so immediately (being run via a bash-script). If there was a problem and I did exit(0) (the default value for exit; or die();) it would first pause for a couple of seconds.
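As a rough illustration of the pattern described above, here is a sketch using the pda/pheanstalk client. Method names like put/reserve/delete are stable, but constructors and signatures vary between library versions, so treat this as a sketch rather than copy-paste code:
<?php
use Pheanstalk\Pheanstalk;

// Producer (web request): enqueue the decoded 'nice' URL as a JSON job.
$queue = Pheanstalk::create('127.0.0.1'); // older versions: new Pheanstalk('127.0.0.1')
$queue->useTube('tasks');
$queue->put(json_encode([
    'module'     => 'image',
    'controller' => 'resize',
    'action'     => 'run',
    'params'     => ['filename' => 'example.jpg'],
]));

// Consumer (long-running CLI worker):
$worker = Pheanstalk::create('127.0.0.1');
$worker->watch('tasks');
while (true) {
    $job     = $worker->reserve();               // blocks until a job arrives
    $payload = json_decode($job->getData(), true);
    // ... dispatch $payload through your router / task runner here ...
    $worker->delete($job);                       // remove the job once it is done
}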
Here is a simple class I coded for my web application. It allows for forking PHP scripts and other scripts. Works on UNIX and Windows.
class BackgroundProcess {
    static function open($exec, $cwd = null) {
        if (!is_string($cwd)) {
            $cwd = @getcwd();
        }
        @chdir($cwd);
        if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
            $WshShell = new COM("WScript.Shell");
            $WshShell->CurrentDirectory = str_replace('/', '\\', $cwd);
            $WshShell->Run($exec, 0, false);
        } else {
            exec($exec . " > /dev/null 2>&1 &");
        }
    }

    static function fork($phpScript, $phpExec = null) {
        $cwd = dirname($phpScript);
        @putenv("PHP_FORCECLI=true");
        if (!is_string($phpExec) || !file_exists($phpExec)) {
            if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
                $phpExec = str_replace('/', '\\', dirname(ini_get('extension_dir'))) . '\php.exe';
                if (@file_exists($phpExec)) {
                    BackgroundProcess::open(escapeshellarg($phpExec) . " " . escapeshellarg($phpScript), $cwd);
                }
            } else {
                $phpExec = exec("which php-cli");
                if ($phpExec[0] != '/') {
                    $phpExec = exec("which php");
                }
                if ($phpExec[0] == '/') {
                    BackgroundProcess::open(escapeshellarg($phpExec) . " " . escapeshellarg($phpScript), $cwd);
                }
            }
        } else {
            if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
                $phpExec = str_replace('/', '\\', $phpExec);
            }
            BackgroundProcess::open(escapeshellarg($phpExec) . " " . escapeshellarg($phpScript), $cwd);
        }
    }
}
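A quick usage example (the script path here is hypothetical):
// Kick off a background PHP script and return immediately.
BackgroundProcess::fork('/var/www/app/doTask.php');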
PHP HAS multithreading; it's just not enabled by default. There is an extension called pthreads which does exactly that.
You'll need PHP compiled with ZTS (Zend Thread Safety), though.
Links:
Examples
Another tutorial
pthreads PECL Extension
UPDATE: since PHP 7.2, the parallel extension comes into play
Tutorial/Example
reference manual
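As a rough sketch of what the parallel extension looks like in use (assumes PHP 7.2+ built with ZTS and the pecl parallel extension installed; the task body is illustrative):
<?php
use parallel\Runtime;

// Spawn a separate thread and run a closure in it.
$runtime = new Runtime();
$future  = $runtime->run(function () {
    // Heavy work happens in its own thread.
    sleep(2);
    return 'welcome email sent';
});

echo "Main thread keeps going...\n";
echo $future->value(), "\n"; // blocks only when/if you actually need the result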
This is the same method I have been using for a couple of years now and I haven't seen or found anything better. As people have said, PHP is single threaded, so there isn't much else you can do.
I have actually added one extra level to this, and that's getting and storing the process id. This allows me to redirect to another page and have the user sit on that page, using AJAX to check whether the process is complete (the process id no longer exists). This is useful for cases where the length of the script would cause the browser to time out, but the user needs to wait for that script to complete before the next step. (In my case it was processing large ZIP files of CSV-like files that add up to 30,000 records in the database, after which the user needs to confirm some information.)
I have also used a similar process for report generation. I'm not sure I'd use "background processing" for something such as an email, unless there is a real problem with a slow SMTP server. Instead I might use a table as a queue and then have a process that runs every minute to send the emails within the queue. You would need to be wary of sending emails twice or other similar problems. I would consider a similar queueing process for other tasks as well.
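For reference, a rough sketch of that table-as-queue pattern. The email_queue table and its columns are made up for illustration; claiming rows before sending is what guards against sending an email twice when the job runs again:
<?php
// Run from cron every minute. Assumes a hypothetical `email_queue` table with
// columns: id, recipient, subject, body, status ('pending'|'sending'|'sent'), claimed_by.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Claim a batch of rows first, so a second worker (or a re-run) can't pick them up again.
$claim = uniqid('worker_', true);
$stmt = $pdo->prepare("UPDATE email_queue SET status = 'sending', claimed_by = :claim
                       WHERE status = 'pending' LIMIT 50");
$stmt->execute(array('claim' => $claim));

$rows = $pdo->prepare("SELECT * FROM email_queue WHERE claimed_by = :claim");
$rows->execute(array('claim' => $claim));

foreach ($rows as $row) {
    mail($row['recipient'], $row['subject'], $row['body']);
    $pdo->prepare("UPDATE email_queue SET status = 'sent' WHERE id = :id")
        ->execute(array('id' => $row['id']));
}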
It's a great idea to use cURL as suggested by rojoca.
Here is an example. You can monitor text.txt while the script is running in background:
<?php
function doCurl($begin)
{
    echo "Do curl<br />\n";
    $url = 'http://'.$_SERVER['SERVER_NAME'].$_SERVER['REQUEST_URI'];
    $url = preg_replace('/\?.*/', '', $url);
    $url .= '?begin='.$begin;
    echo 'URL: '.$url.'<br>';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($ch);
    echo 'Result: '.$result.'<br>';
    curl_close($ch);
}

if (empty($_GET['begin'])) {
    doCurl(1);
}
else {
    while (ob_get_level())
        ob_end_clean();
    header('Connection: close');
    ignore_user_abort(true); // keep running after the client disconnects
    ob_start();
    echo 'Connection Closed';
    $size = ob_get_length();
    header("Content-Length: $size");
    ob_end_flush();
    flush();

    $begin = $_GET['begin'];
    $fp = fopen("text.txt", "w");
    fprintf($fp, "begin: %d\n", $begin);
    for ($i = 0; $i < 15; $i++) {
        sleep(1);
        fprintf($fp, "i: %d\n", $i);
    }
    fclose($fp);
    if ($begin < 10)
        doCurl($begin + 1);
}
?>
There is a PHP extension, called Swoole.
Although it might not be enabled by default, it is available on my hosting and can be enabled at the click of a button.
Worth checking out. I haven't had time to use it yet; I was searching here for info when I stumbled across it and thought it worth sharing.
Unfortunately PHP does not have any kind of native threading capabilities. So I think in this case you have no choice but to use some kind of custom code to do what you want to do.
If you search around the net for PHP threading stuff, some people have come up with ways to simulate threads on PHP.
If you set the Content-Length HTTP header in your "Thank You For Registering" response, then the browser should close the connection after the specified number of bytes are received. This leaves the server side process running (assuming that ignore_user_abort is set) so it can finish working without making the end user wait.
Of course you will need to calculate the size of your response content before rendering the headers, but that's pretty easy for short responses (write output to a string, call strlen(), call header(), render string).
This approach has the advantage of not forcing you to manage a "front end" queue, and although you may need to do some work on the back end to prevent racing HTTP child processes from stepping on each other, that's something you needed to do already, anyway.
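A minimal sketch of that approach, assuming ignore_user_abort is enabled and no other output buffering gets in the way; generateReport() is a hypothetical slow task:
<?php
ignore_user_abort(true);

// Render the short response into a string first, so we know its exact size.
$body = 'Thank You For Registering!';

header('Content-Length: ' . strlen($body));
header('Connection: close');
echo $body;
flush();

// The browser can stop reading now; the slow work continues server-side.
set_time_limit(0);
generateReport($newUserId); // hypothetical slow task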
If you don't want the full-blown ActiveMQ, I recommend considering RabbitMQ. RabbitMQ is lightweight messaging that uses the AMQP standard.
I also recommend looking into php-amqplib - a popular AMQP client library for accessing AMQP-based message brokers.
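For example, a short producer sketch with php-amqplib, mirroring the official RabbitMQ PHP tutorial; the queue name, payload, and credentials are placeholders:
<?php
require __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel    = $connection->channel();

// Durable queue so queued jobs survive a broker restart.
$channel->queue_declare('task_queue', false, true, false, false);

$msg = new AMQPMessage(
    json_encode(['task' => 'send_welcome_email', 'user_id' => 123]),
    ['delivery_mode' => AMQPMessage::DELIVERY_MODE_PERSISTENT]
);
$channel->basic_publish($msg, '', 'task_queue');

$channel->close();
$connection->close();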
Spawning new processes on the server using exec(), or hitting another server directly using curl, doesn't scale all that well. With exec you are basically filling your web server with long-running processes which could be handled by other, non-web-facing servers, and with curl you tie up another server unless you build in some sort of load balancing.
I have used Gearman in a few situations and I find it better for this sort of use case. I can use a single job queue server to handle queuing of all the jobs needing to be done, and spin up worker servers, each of which can run as many instances of the worker process as needed, then scale the number of worker servers up or down as needed and spin them down when not needed. It also lets me shut down the worker processes entirely when needed, and the jobs queue up until the workers come back online.
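A minimal client/worker pair using the pecl gearman extension; the function name and payload are illustrative:
<?php
// Client side (inside the web request): queue a background job and return immediately.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
$client->doBackground('send_welcome_email', json_encode(['user_id' => 123]));

// Worker side (a long-running CLI process, possibly on another machine):
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);
$worker->addFunction('send_welcome_email', function (GearmanJob $job) {
    $payload = json_decode($job->workload(), true);
    // ... actually send the email here ...
});
while ($worker->work()) {
    // handles one job per iteration, forever
}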
I think you should try this technique. It lets you call as many pages as you like; all of them run at once, independently, without waiting for each page's response (i.e. asynchronously).
cornjobpage.php //mainpage
<?php
post_async("http://localhost/projectname/testpage.php", "Keywordname=testValue");
//post_async("http://localhost/projectname/testpage.php", "Keywordname=testValue2");
//post_async("http://localhost/projectname/otherpage.php", "Keywordname=anyValue");
//call as many as pages you like all pages will run at once independently without waiting for each page response as asynchronous.
?>
<?php
/*
* Executes a PHP page asynchronously so the current page does not have to wait for it to finish running.
*
*/
function post_async($url, $params)
{
    $post_string = $params;
    $parts = parse_url($url);

    $fp = fsockopen($parts['host'],
        isset($parts['port']) ? $parts['port'] : 80,
        $errno, $errstr, 30);
    if (!$fp) {
        return; // could not connect; give up silently
    }

    $out  = "GET ".$parts['path']."?$post_string"." HTTP/1.1\r\n"; // you can use POST instead of GET if you like
    $out .= "Host: ".$parts['host']."\r\n";
    $out .= "Content-Type: application/x-www-form-urlencoded\r\n";
    $out .= "Content-Length: ".strlen($post_string)."\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    fclose($fp);
}
?>
testpage.php
<?php
echo $_REQUEST["Keywordname"]; // case 1 output: testValue
?>
PS: if you want to send URL parameters in a loop, then follow this answer: https://stackoverflow.com/a/41225209/6295712
PHP is a single-threaded language, so there is no official way to start an asynchronous process with it other than using exec or popen. There is a blog post about that here. Your idea for a queue in MySQL is a good idea as well.
Your specific requirement here is for sending an email to the user. I'm curious as to why you are trying to do that asynchronously since sending an email is a pretty trivial and quick task to perform. I suppose if you are sending tons of email and your ISP is blocking you on suspicion of spamming, that might be one reason to queue, but other than that I can't think of any reason to do it this way.
I'm building an integration that communicates data to several different systems via API (REST). I need to process data as quickly as possible. This is a basic layout:
1. Parse and process data (probably into an array, as below): $data = array( Title => "Title", Subtitle => "Test", .....
2. Submit data into service (1): $result1 = $class1->functionservice1($data);
3. Submit data into service (2): $result2 = $class2->functionservice2($data);
4. Submit data into service (3): $result3 = $class3->functionservice3($data);
5. Report completion: echo "done";
Run as a straight script like the above, I'll need to wait for each function to finish before the next one starts (taking three times longer).
Is there an easy way to run each service function asynchronously, but wait for all of them to complete before (5) reporting completion? I need to be able to extract data from each $result and return that as one post to a 4th service.
Sorry if this is an easy question - I'm a PHP novice
Many thanks, Ben
Yes, there are multiple ways.
The most efficient is to use an event loop that leverages non-blocking I/O to achieve concurrency and cooperative multitasking.
One such event loop implementation is Amp. There's an HTTP client that works with Amp called Artax; an example is included in its README. You should have a look at how promises and coroutines work. There's Amp\wait to mix synchronous code with asynchronous code.
<?php
Amp\run(function() {
    $client = new Amp\Artax\Client;

    // Dispatch two requests at the same time
    $promises = $client->requestMulti([
        'http://www.google.com',
        'http://www.bing.com',
    ]);

    try {
        // Yield control until all requests finish
        list($google, $bing) = (yield Amp\all($promises));
        var_dump($google->getStatus(), $bing->getStatus());
    } catch (Exception $e) {
        echo $e;
    }
});
Other ways include using threads and/or processes to achieve concurrency. Using multiple processes is the easiest way if you want to reuse your current code. However, spawning processes isn't cheap, and using threads in PHP isn't really a good thing to do.
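If the three service calls are plain HTTP requests, another option that works with stock PHP is curl_multi, which runs the requests concurrently and lets you collect every $result before reporting completion. The URLs below are placeholders:
<?php
$requests = [
    'service1' => 'https://api.example.com/service1',
    'service2' => 'https://api.example.com/service2',
    'service3' => 'https://api.example.com/service3',
];

$mh = curl_multi_init();
$handles = [];
foreach ($requests as $key => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data)); // $data from step 1
    curl_multi_add_handle($mh, $ch);
    $handles[$key] = $ch;
}

// Drive all transfers until they are finished.
do {
    $status = curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh); // wait for activity instead of busy-looping
    }
} while ($active && $status === CURLM_OK);

$results = [];
foreach ($handles as $key => $ch) {
    $results[$key] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

echo "done";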
You can also put your code in another PHP file and call it like this:
exec("nohup /usr/bin/php -f your_script.php > /dev/null 2>&1 &");
If you want asynchronicity like you can have in other languages, i.e. using threads, you will need to install the pthreads extension from PECL, because PHP does not support threading out of the box.
You can find an explanation of how to use threads in this question:
How can one use multi threading in PHP applications
I'm working with the cURL implementation in PHP and leveraging curl_multi_init() and curl_multi_exec() to execute batches of requests in parallel. I've been doing this for a while, and understand this piece of it.
However, the request bodies contain a signature that is calculated with a timestamp. From the moment this signature is generated, I have a limited window of time to make the request before the server will reject the request once it's made. Most of the time this is fine. However, in some cases, I need to do large uploads (5+ GB).
If I batch requests into a pool of 100, 200, 1000, 20000, or anything in-between, and I'm uploading large amounts of data to the server, the initial requests that execute will complete successfully. Later requests, however, won't have started until after the timestamp in the signature expires, so the server rejects those requests out-of-hand.
The current flow I'm using goes something like this:
Do any processing ahead of time.
Add the not-yet-executed cURL handles to the batch.
Let cURL handle executing all of the requests.
Look at the data that came back and parse it all.
I'm interested in finding a way to execute a callback function that can generate a signature on-demand and update the request body at the moment that PHP/cURL goes to execute that particular request. I know that you can bind a callback function to a cURL handle that will execute repeatedly while the request is happening, and you have access to the cURL handle all along the way.
So my question is this: Is there any way to configure an onBefore and/or onAfter callback for a cURL handle? Something that can execute immediately before the cURL executes the request, and then something that can execute immediately after the response comes back so that the response data can be parsed.
I'd like to do something a bit more event oriented, like so:
Add a not-yet-executed cURL handle to the batch, assigning a callback function to execute when cURL (not myself) executes the request (both before and after).
Take the results of the batch request and do whatever I want with the data.
No, this isn't possible with the built in functions of cURL. However, it would be trivial to implement a wrapper around the native functions to do what you want.
For instance, vaguely implementing the Observer pattern:
<?php
class CurlWrapper {
    private $ch;
    private $listeners = [];

    public function __construct($url) {
        $this->ch = curl_init($url);
        $this->setopt(CURLOPT_RETURNTRANSFER, true);
    }

    public function setopt($opt, $value) {
        $this->notify('setopt', array('option' => $opt, 'value' => $value));
        curl_setopt($this->ch, $opt, $value);
    }

    public function setopt_array($opts) {
        $this->notify('setopt_array', array('options' => $opts));
        curl_setopt_array($this->ch, $opts);
    }

    public function exec() {
        $this->notify('beforeExec', array());
        $ret = curl_exec($this->ch);
        $this->notify('afterExec', array('result' => $ret));
        return $ret;
    }

    public function attachListener($event, $fn) {
        if (is_callable($fn)) {
            $this->listeners[$event][] = $fn;
        }
    }

    private function notify($event, $data) {
        if (isset($this->listeners[$event])) {
            foreach ($this->listeners[$event] as $listener) {
                $listener($this, $data);
            }
        }
    }
}

$c = new CurlWrapper('http://stackoverflow.com');
$c->setopt(CURLOPT_HTTPGET, true);
$c->attachListener('beforeExec', function($handle, $data) {
    echo "before exec\n";
});
$result = $c->exec();
echo strlen($result), "\n";
You can add event listeners (which must be callables) to the object with attachListener, and they will automatically be called at the relevant moment.
Obviously you would need to do some more work to this to make it fit your requirements, but it isn't a bad start, I think.
Anything to do with cURL is not advanced PHP. It's "advanced mucking about".
If you have these huge volumes of data going through cURL I would recommend not using cURL at all (actually, I would always recommend not using cURL)
I'd look into a socket implementation. Good ones aren't easy to find, but not that hard to write yourself.
Ok, so you say that the requests are parallelized, I'm not sure exactly what that means, but that's not too important.
As an aside, I'll explain what I mean by Asynchronous. If you open a raw TCP socket, you can call the socket_set_blocking function on the connection, this means that read / write operations don't block. You can take several of these connections and write a small amount of data to each of them in a loop, this way you are sending your requests "at once".
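A rough sketch of that idea using PHP's stream functions (stream_set_blocking is the stream-level equivalent of socket_set_blocking; the hosts and payload are placeholders):
<?php
$targets = array('host-a.example.com', 'host-b.example.com');
$sockets = array();

// Open a connection per target, then switch each one to non-blocking mode.
foreach ($targets as $host) {
    $fp = stream_socket_client("tcp://$host:80", $errno, $errstr, 5);
    if ($fp) {
        stream_set_blocking($fp, false);
        $sockets[$host] = $fp;
    }
}

// Write to every socket in a loop; a slow peer no longer blocks the others.
foreach ($sockets as $host => $fp) {
    fwrite($fp, "GET / HTTP/1.1\r\nHost: $host\r\nConnection: Close\r\n\r\n");
}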
The reason I asked whether you have to wait until the whole message is consumed before the endpoint validates the signature is that even if Curl is sending the requests "all at once", there's always a possibility that the time it takes to upload will mean that the validation fails. Presumably it's slower to upload 2000 requests at once than to upload 5, so you'd expect more failures for the former case? Similarly, if your requests are processing synchronously (i.e. one at a time) then you'll see the same error for the same reason, although in this case it's the later requests that are expected to fail. Maybe you need to think about the data upload rate required to upload a message of a particular size within a particular time frame, then try and calculate an optimum multi-payload size. Perhaps the best approach is the simplest: upload one at a time and calculate the signature just before each upload?
A better approach might be to put the signature in a message header, this way the signature can be read earlier in the upload process.
The question sort of says it all - is there a function which does the same as the JavaScript function setTimeout() for PHP? I've searched php.net, and I can't seem to find any...
There is no way to delay execution of part of the code in the current script. It wouldn't make much sense, either, as the processing of a PHP script takes place entirely on the server side and you would just delay the overall execution of the script. There is sleep(), but that will simply halt the process for a certain time.
You can, of course, schedule a PHP script to run at a specific time using cron jobs and the like.
There's the sleep function, which pauses the script for a determined amount of time.
See also usleep, time_nanosleep and time_sleep_until.
PHP isn't event driven, so a setTimeout doesn't make much sense. You can certainly mimic it and in fact, someone has written a Timer class you could use. But I would be careful before you start programming in this way on the server side in PHP.
A few things I'd like to note about timers in PHP:
1) Timers in PHP make sense when used in long-running scripts (daemons and, maybe, in CLI scripts). So if you're not developing that kind of application, then you don't need timers.
2) Timers can be blocking and non-blocking. If you're using sleep(), then it's a blocking timer, because your script just freezes for a specified amount of time.
For many tasks blocking timers are fine. For example, sending statistics every 10 seconds. It's ok to block the script:
while (true) {
sendStat();
sleep(10);
}
3) Non-blocking timers make sense only in event-driven apps, like a websocket server. In such applications an event can occur at any time (e.g. an incoming connection), so you must not block your app with sleep() (obviously).
For this purpose there are event-loop libraries, like reactphp/event-loop, which allow you to handle multiple streams in a non-blocking fashion and also have a timer/interval feature.
4) Non-blocking timeouts in PHP are possible.
They can be implemented by means of the stream_select() function with its timeout parameter (see how it's implemented in reactphp/event-loop StreamSelectLoop::run()); a minimal sketch follows this list.
5) There are PHP extensions like libevent, libev, and event which allow timer implementations (if you want to go hardcore)
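Here is the minimal sketch of a non-blocking timeout built on stream_select() mentioned in point 4; the host and the follow-up work are placeholders:
<?php
$stream = stream_socket_client('tcp://example.com:80', $errno, $errstr, 5);
fwrite($stream, "GET / HTTP/1.1\r\nHost: example.com\r\nConnection: Close\r\n\r\n");

$read   = array($stream);
$write  = null;
$except = null;

// Returns as soon as the stream is readable, or after the 2-second timeout,
// whichever comes first; on timeout the script is free to do other work.
$ready = stream_select($read, $write, $except, 2);

if ($ready > 0) {
    echo fread($stream, 8192);
} else {
    echo "Timed out, doing something else instead\n";
}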
Not really, but you could try the tick count function.
http://php.net/manual/en/class.evtimer.php is probably what you are looking for. You can have a function called at set intervals, similar to setInterval in JavaScript. It is a PECL extension; if you have WHM/cPanel you can easily install it through the PECL software/extension installer page.
I hadn't noticed this question is from 2010, and the EvTimer class only started to be coded in 2012-2013. So, as an update to an old question: there is now a class that can do this, similar to JavaScript's setTimeout/setInterval.
Warning: You should note that while the sleep command can make a PHP process hang, or "sleep" for a given amount of time, you'd generally implement visual delays within the user interface.
Since PHP is a server-side language that merely writes its execution output (generally in the form of HTML) to a web server response, using sleep in this fashion will generally just stall or delay the response.
With that being said, sleep does have practical purposes. Delaying execution can be used to implement back off schemes, such as when retrying a request after a failed connection. Generally speaking, if you need to use a setTimeout in PHP, you're probably doing something wrong.
Solution: If you still want to implement setTimeout in PHP, to answer your question explicitly: Consider that setTimeout possesses two parameters, one which represents the function to run, and the other which represents the amount of time (in milliseconds). The following code would actually meet the requirements in your question:
<?php
// Build the setTimeout function.
// This is the important part.
function setTimeout($fn, $timeout) {
    // sleep() takes seconds, so convert the millisecond timeout.
    sleep($timeout / 1000);
    $fn();
}

// Some example function we want to run.
$someFunctionToExecute = function() {
    echo 'The function executed!';
};

// This will run the function after a 3 second sleep.
// We're using the functional property of first-class functions
// to pass the function that we wish to execute.
setTimeout($someFunctionToExecute, 3000);
?>
The output of the above code will be three seconds of delay, followed by the following output:
The function executed!
If you need to trigger an action on the client after some PHP code has executed, you can do it with an echo:
echo "Success.... <script>setTimeout(function(){alert('Hello')}, 3000);</script>";
So after a delay on the client (browser) side you can do something else, like redirect to another PHP script or echo an alert.
Generators (the Generator class) are available in PHP 5.5 and later; the yield keyword lets you pause a function and continue it later.
generator-example.php
<?php
function myGeneratorFunction()
{
    echo "One", "\n";
    yield;
    echo "Two", "\n";
    yield;
    echo "Three", "\n";
    yield;
}
// get our Generator object (remember, all generator functions return
// a generator object, and a generator function is any function that
// uses the yield keyword)
$iterator = myGeneratorFunction();
At this point nothing has been printed yet: the body of a generator function only starts executing once you begin iterating over it.
To run the code up to the first yield, and then on to the second one, add these lines:
// run up to the first yield (prints "One")
$value = $iterator->current();
// resume and run up to the next yield (prints "Two")
$value = $iterator->next();
// and so on for the yield after that
// $value = $iterator->next();
Now you will get the output
One
Two
If you look closely, setTimeout() relies on an event loop.
In PHP there are many libraries for this; e.g. amphp is a popular one that provides an event loop to execute code asynchronously.
Javascript snippet
setTimeout(function () {
console.log('After timeout');
}, 1000);
console.log('Before timeout');
Converting the above JavaScript snippet to PHP using Amp:
use Amp\Loop; // assumes the amphp/amp package is installed

Loop::run(function () {
    Loop::delay(1000, function () {
        echo date('H:i:s') . ' After timeout' . PHP_EOL;
    });

    echo date('H:i:s') . ' Before timeout' . PHP_EOL;
});
Check this Out!
<?php
set_time_limit(20);
$i = 0;
while ($i <= 10)
{
    echo "i=$i ";
    sleep(100);
    $i++;
}
?>
Output:
i=0 i=1 i=2 i=3 i=4 i=5 i=6 i=7 i=8 i=9 i=10