How to make parallel cURL requests in PHP?

I have PHP code on my server, and in this code I would like to execute multiple cURL requests in parallel.
Each cURL request is used to analyze the API of a website, so I do not want to execute the requests in sequence but in parallel (so that the script finishes sooner).
I tried to use multi-threading, but I get this error:
Class 'Thread' not found
class AsyncOperation extends Thread {
    public function request() {
        first_request();
    }
}
// Create an array
$stack = array();
// Create the threads
foreach ( range("A", "D") as $i ) {
    $stack[] = new AsyncOperation($i);
}
// Start the threads
foreach ( $stack as $h ) {
    $h->start();
}
I don't know whether the error comes from the code itself or from multi-threading not being available on my server.
How can I solve this, or otherwise run many requests in parallel?
Thanks a lot
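One option that avoids the Thread class entirely is PHP's built-in curl_multi_* functions, which drive several transfers concurrently from a single process. The following is only a minimal sketch, assuming the cURL extension is loaded; fetch_all() is a made-up helper name, and error handling is left out.

```php
<?php
/**
 * Fetch several URLs concurrently with curl_multi.
 * fetch_all() is a hypothetical helper name, not part of PHP.
 *
 * @param string[] $urls
 * @return string[] response bodies, keyed like $urls
 */
function fetch_all(array $urls, int $timeoutSeconds = 10): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep the body instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none of them is still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // nothing to wait on yet; avoid a busy loop
        }
    } while ($running && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $results;
}
```

Calling fetch_all(['a' => 'http://example.com/', 'b' => 'http://example.org/']) would return both bodies under the same keys; the total wall time is then roughly that of the slowest request rather than the sum of all of them.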

Related

Laravel CRON or Event process respond to api request via long poll - how to re-vitalise the session

I have a poll route on an API on Laravel 5.7 server, where the api user can request any information since the last poll.
The easy part is to respond immediately to a valid request if there is new information: return $this->prepareResult($newData);
If there is no new data, I store a poll request in the database, and a cron utility can then check once a minute for all poll requests and respond to any polls where data has been updated. Alternatively, I can create an event listener for data updates and fire off a response to the poll when the data is updated.
I'm stuck on how to restore each session to match the device waiting for the update. I can store or pass the session ID, but how do I make sure the cron task or event processor can respond to the correct IP address, just as if it were responding to the original request? Can PHP even do this?
I am trying to avoid websockets, as I will have lots of devices but with limited updates and interactions.
Clients poll for updates, APIs do not push updates.
REST APIs are supposed to be stateless, so trying to have the backend keep track goes against REST.
To answer your question specifically, if you do not want to use websockets, the client app is going to have to continue to poll the endpoint till data is available.
Long polling is a valid technique, but I think it is a bad idea to tie the poll to a session, since a session belongs only to the original user. You can run your long poll with the PHP CLI instead, and check in your middleware that the poll route is only reachable from the CLI. You can use pthreads via the CLI to run the long poll; note that pthreads v3 cannot be used safely and sensibly anywhere but the CLI. Use cron to trigger your thread once every hour. Then, in your controller, store $time = time(); to mark the start of execution, and create a do-while loop to run the poll process. The while condition can be (time() < $time + 3600) or some other condition. Inside the loop, check whether a poll exists and, if so, run it. At the bottom of the loop body, sleep for a few seconds, for example 2 seconds.
In your background.php (this file is executed by cron):
<?php
error_reporting(-1);
ini_set('display_errors', 1);
class Atomic extends Threaded {
public function __construct($data = NULL) {
$this->data = $data;
}
private $data;
private $method;
private $class;
private $config;
}
class Task extends Thread {
public function __construct(Atomic $atomic) {
$this->atomic = $atomic;
}
public function run() {
$this->atomic->synchronized(function($atomic)
{
chdir($atomic->config['root']);
$exec_statement = array(
"php7.2.7",
$atomic->config['index'],
$atomic->class,
$atomic->method
);
echo "Running Command".PHP_EOL. implode(" ", $exec_statement)." at: ".date("Y-m-d H:i:s").PHP_EOL;
$data = shell_exec(implode(" ", $exec_statement));
echo $data.PHP_EOL;
}, $this->atomic);
}
private $atomic;
}
$config = array(
"root" => "/var/www/api.example.com/api/v1.1",
"index" => "index.php",
"interval_execution_time" => 200
);
chdir($config['root']);
$threads = array();
$list_threads = array(
array(
"class" => "Background_workers",
"method" => "send_email",
"total_thread" => 2
),
array(
"class" => "Background_workers",
"method" => "updating_data_user",
"total_thread" => 2
),
array(
"class" => "Background_workers",
"method" => "sending_fcm_broadcast",
"total_thread" => 2
)
);
for ($i=0; $i < count($list_threads); $i++)
{
$total_thread = $list_threads[$i]['total_thread'];
for ($j=0; $j < $total_thread; $j++)
{
$atomic = new Atomic();
$atomic->class = $list_threads[$i]['class'];
$atomic->method = $list_threads[$i]['method'];
$atomic->thread_number = $j;
$atomic->config = $config;
$threads[] = new Task($atomic);
}
}
foreach ($threads as $thread) {
$thread->start();
usleep(200);
}
foreach ($threads as $thread)
$thread->join();
?>
And this in your controller:
<?php
defined('BASEPATH') OR exit('No direct script access allowed');
class Background_workers extends MX_Controller {
public function __construct()
{
parent::__construct();
$this->load->database();
$this->output->enable_profiler(FALSE);
$this->configuration = $this->config->item("configuration_background_worker_module");
}
public function sending_fcm_broadcast() {
$time_run = time();
$time_stop = strtotime("+1 hour");
do{
$time_run = time();
modules::run("Background_worker_module/sending_fcm_broadcast", $this->configuration["fcm_broadcast"]["limit"]);
sleep(2);
}
while ($time_run < $time_stop);
}
}
This is sample running code from a CodeIgniter controller.
Long polling requires holding the connection open. That can only happen through an infinite loop of checking to see if the data exists and then adding a sleep.
There is no need to revitalize the session as the response is fired only on a successful data hit.
Note that this method is very CPU and memory intensive, as the connection and the FPM worker remain open until a successful data hit. WebSockets are a much better solution regardless of the number of devices and the frequency of updates.
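The check-then-sleep loop described above could look roughly like this. It is only a sketch: long_poll() and its parameters are invented names, and the callable stands in for whatever query finds new data. A deadline is added so the worker is released even when nothing arrives.

```php
<?php
/**
 * Poll $check until it returns something other than null, or give up.
 * long_poll() and its parameters are hypothetical names for this sketch.
 *
 * @param callable $check          returns new data, or null when there is none yet
 * @param int      $timeoutSeconds how long to hold the request open
 * @param int      $sleepMicros    pause between two checks, in microseconds
 * @return mixed the data, or null if the deadline passed without a hit
 */
function long_poll(callable $check, int $timeoutSeconds = 25, int $sleepMicros = 500000)
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $data = $check();
        if ($data !== null) {
            return $data; // successful data hit: respond right away
        }
        usleep($sleepMicros);
    } while (microtime(true) < $deadline);
    return null; // nothing new; the client is expected to poll again
}
```

A controller could then return $this->prepareResult($data) on a hit and an empty response on timeout, after which the client simply issues the next poll.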
You can use notifications: browser notifications for web clients, and FCM and APN notifications for mobile clients.
Another option is SSE (server-sent events). It is a socket-like connection, but over plain HTTP: the client sends a normal request, and the server can respond on that same request multiple times, at any time, for as long as the client stays connected.
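A rough sketch of what an SSE endpoint could look like in plain PHP follows. The helper name sse_message() and the payload (the current time) are placeholders; a real endpoint would emit the updated data instead.

```php
<?php
/**
 * Format one Server-Sent Events message.
 * sse_message() is a hypothetical helper name.
 */
function sse_message(string $data, ?string $event = null, ?string $id = null): string
{
    $out = '';
    if ($id !== null) {
        $out .= "id: $id\n";
    }
    if ($event !== null) {
        $out .= "event: $event\n";
    }
    // Each line of the payload needs its own "data:" field.
    foreach (explode("\n", $data) as $line) {
        $out .= "data: $line\n";
    }
    return $out . "\n"; // a blank line terminates the message
}

// Endpoint body: only runs when served over HTTP, not from the CLI.
if (PHP_SAPI !== 'cli') {
    header('Content-Type: text/event-stream');
    header('Cache-Control: no-cache');
    while (!connection_aborted()) {
        // Placeholder payload; replace with the real update data.
        echo sse_message(json_encode(['time' => time()]), 'update');
        if (ob_get_level() > 0) {
            ob_flush();
        }
        flush();
        sleep(2);
    }
}
```

Like long polling, this keeps one PHP worker busy per connected client, so the same capacity concerns apply.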

Best way to offload one-shot worker threads in PHP? pthreads? fcntl?

How should I multithread some php-cli code that needs a timeout?
I'm using PHP 5.6 on Centos 6.6 from the command line.
I'm not very familiar with multithreading terminology or code. I'll simplify the code here but it is 100% representative of what I want to do.
The non-threaded code currently looks something like this:
$datasets = MyLibrary::getAllRawDataFromDBasArrays();
foreach ($datasets as $dataset) {
MyLibrary::processRawDataAndStoreResultInDB($dataset);
}
exit; // just for clarity
I need to prefetch all my datasets, and each processRawDataAndStoreResultInDB() cannot fetch its own dataset. Sometimes processRawDataAndStoreResultInDB() takes too long to process a dataset, so I want to limit the amount of time it has to process it.
So you can see that making it multithreaded would:
Speed it up by allowing multiple processRawDataAndStoreResultInDB() to execute at the same time
Use set_time_limit() to limit the amount of time each one has to process each dataset
Notice that I don't need to come back to my main program. Since this is a simplification, you can trust that I don't want to collect all the processed datasets and do a single save into the DB after they are all done.
I'd like to do something like:
class MyWorkerThread extends SomeThreadType {
public function __construct($timeout, $dataset) {
$this->timeout = $timeout;
$this->dataset = $dataset;
}
public function run() {
set_time_limit($this->timeout);
MyLibrary::processRawDataAndStoreResultInDB($this->dataset);
}
}
$numberOfThreads = 4;
$pool = somePoolClass($numberOfThreads);
$pool->start();
$datasets = MyLibrary::getAllRawDataFromDBasArrays();
$timeoutForEachThread = 5; // seconds
foreach ($datasets as $dataset) {
$thread = new MyWorkerThread($timeoutForEachThread, $dataset);
$thread->addCallbackOnTerminated(function() use ($dataset) {
if ($this->isTimeout()) {
MyLibrary::saveBadDatasetToDb($dataset);
}
});
$pool->addToQueue($thread);
}
$pool->waitUntilAllWorkersAreFinished();
exit; // for clarity
From my research online I've found the PHP extension pthreads which I can use with my thread-safe php CLI, or I could use the PCNTL extension or a wrapper library around it (say, Arara/Process)
https://github.com/krakjoe/pthreads (and the example directory)
https://github.com/Arara/Process (pcntl wrapper)
When I look at them and their examples though (especially the pthreads pool example) I get confused quickly by the terminology and which classes I should use to achieve the kind of multithreading I'm looking for.
I wouldn't even mind creating the pool class myself, if I had isRunning(), isTerminated(), getTerminationStatus() and execute() functions on a thread class, as it would be a simple queue.
Can someone with more experience please direct me to which library, classes and functions I should be using to map to my example above? Am I taking the wrong approach completely?
Thanks in advance.
Here comes an example using worker processes. I'm using the pcntl extension.
/**
 * Spawns a worker process and returns its pid or -1
 * if something goes wrong.
 *
 * @param callable $callback function, closure or method to call
 * @return integer
 */
function worker($callback) {
$pid = pcntl_fork();
if($pid === 0) {
// Child process
exit($callback());
} else {
// Main process or an error
return $pid;
}
}
$datasets = array(
array('test', '123'),
array('foo', 'bar')
);
$maxWorkers = 1;
$numWorkers = 0;
foreach($datasets as $dataset) {
$pid = worker(function () use ($dataset) {
// Do DB stuff here
var_dump($dataset);
return 0;
});
if($pid !== -1) {
$numWorkers++;
} else {
// Handle fork errors here
echo 'Failed to spawn worker';
}
// If $maxWorkers is reached we need to wait
// for at least one child to return
if($numWorkers === $maxWorkers) {
// $status is passed by reference
$pid = pcntl_wait($status);
echo "child process $pid returned $status\n";
$numWorkers--;
}
}
// (Non-blocking) wait for the remaining children
while(true) {
// $status is passed by reference
$pid = pcntl_wait($status, WNOHANG);
if(is_null($pid) || $pid === -1) {
break;
}
if($pid === 0) {
// Be patient ...
usleep(50000);
continue;
}
echo "child process $pid returned $status\n";
}
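The question also asked for a time limit per dataset, which the example above leaves open. One possible way, sketched here under the assumption that the pcntl extension is available on a Unix CLI, is to let each child arm pcntl_alarm(): with no SIGALRM handler installed, the signal's default action terminates the child, and the parent can detect that from the wait status. run_with_timeout() is an invented name.

```php
<?php
/**
 * Run $job in a forked child that is killed after $timeout seconds.
 * run_with_timeout() is a hypothetical helper; it needs the pcntl extension.
 *
 * @return string 'ok', 'failed', 'timeout' or 'fork-error'
 */
function run_with_timeout(callable $job, int $timeout): string
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        return 'fork-error';
    }
    if ($pid === 0) {
        // Child: no SIGALRM handler is installed, so the default action
        // (terminate the process) applies when the alarm fires.
        pcntl_alarm($timeout);
        exit((int) $job());
    }
    // Parent: block until this child ends, then inspect how it ended.
    pcntl_waitpid($pid, $status);
    if (pcntl_wifsignaled($status) && pcntl_wtermsig($status) === SIGALRM) {
        return 'timeout';
    }
    return pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0 ? 'ok' : 'failed';
}
```

This version waits for each child in turn to keep the sketch short. The same pcntl_wifsignaled() / pcntl_wtermsig() check can be applied to the $status values returned by the pcntl_wait() calls in the pool loop above, which is where a bad dataset could be recorded.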

Fastest or most robust way to make 7 soap api requests in parallel

My web app requires making 7 different SOAP (WSDL) API requests to complete one task, and users need to wait for the result of all the requests. The average response time is 500 ms to 1.7 seconds per request, so I need to run all these requests in parallel to speed up the process.
What's the best way to do that?
pthreads
Gearman workers
forking processes
curl multi (I would have to build the XML SOAP body myself)
Well, the first thing to say is that it's never really a good idea to create threads in direct response to a web request. Think about how far that will actually scale.
If you create 7 threads for everyone that comes along and 100 people turn up, you'll be asking your hardware to execute 700 threads concurrently, which is quite a lot to ask of anything really...
However, scalability is not something I can usefully help you with, so I'll just answer the question.
<?php
/* the first service I could find that worked without authorization */
define("WSDL", "http://www.webservicex.net/uklocation.asmx?WSDL");
class CountyData {
/* this works around simplexmlelements being unsafe (and shit) */
public function __construct(SimpleXMLElement $element) {
$this->town = (string)$element->Town;
$this->code = (string)$element->PostCode;
}
public function run(){}
protected $town;
protected $code;
}
class GetCountyData extends Thread {
public function __construct($county) {
$this->county = $county;
}
public function run() {
$soap = new SoapClient(WSDL);
$result = $soap->getUkLocationByCounty(array(
"County" => $this->county
));
foreach (simplexml_load_string(
$result->GetUKLocationByCountyResult) as $element) {
$this[] = new CountyData($element);
}
}
protected $county;
}
$threads = [];
$thread = 0;
$threaded = true; # change to false to test without threading
$counties = [ # will create as many threads as there are counties
"Buckinghamshire",
"Berkshire",
"Yorkshire",
"London",
"Kent",
"Sussex",
"Essex"
];
while ($thread < count($counties)) {
$threads[$thread] =
new GetCountyData($counties[$thread]);
if ($threaded) {
$threads[$thread]->start();
} else $threads[$thread]->run();
$thread++;
}
if ($threaded)
foreach ($threads as $thread)
$thread->join();
foreach ($threads as $county => $data) {
printf(
"Data for %s %d\n", $counties[$county], count($data));
}
?>
Note that the SoapClient instance is not, and cannot be, shared between threads: each thread creates its own. This may well slow you down, so you might want to enable caching of WSDLs ...
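WSDL caching can be switched on through the soap.wsdl_cache_* ini settings or per client through the cache_wsdl option. The fragment below is a sketch that assumes the soap extension is loaded; the one-day TTL is an arbitrary value chosen for illustration, and the URL is the one used in the example above.

```php
<?php
// Assumption: the soap extension is loaded.
$wsdl = "http://www.webservicex.net/uklocation.asmx?WSDL";

ini_set('soap.wsdl_cache_enabled', '1'); // turn the WSDL cache on
ini_set('soap.wsdl_cache_ttl', '86400'); // keep a cached WSDL for one day (arbitrary)

$soap = new SoapClient($wsdl, [
    'cache_wsdl' => WSDL_CACHE_BOTH,     // cache in memory and on disk
]);
```

With the disk cache in place, a client created in a later thread or request can load the parsed WSDL locally instead of downloading it again.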

curl_multi_exec kills whole process when there is a single timeout

I am using curl_multi to send out emails in a rolling cURL script similar to this one, but I added a CURLOPT_TIMEOUT of 10 seconds and a CURLOPT_CONNECTTIMEOUT of 20 seconds:
http://www.onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blocking/
While testing it, I reduced the timeouts to 1 ms by using CURLOPT_TIMEOUT_MS and CURLOPT_CONNECTTIMEOUT_MS respectively, just to see how it handles a timeout. But the timeout kills the entire cURL process. Is there a way to continue with the other threads even if one times out?
Thanks.
-devo
https://github.com/krakjoe/pthreads
<?php
class Possibilities extends Thread {
    public $response;
    public function __construct($url){
        $this->url = $url;
    }
    public function run(){
        /*
         * Or use curl, this is quicker to make an example ...
         * The body is stored on the object because the return
         * value of run() is discarded.
         */
        $this->response = file_get_contents($this->url);
    }
}
$threads = array();
$urls = get_my_urls_from_somewhere();
foreach($urls as $index => $url){
    $threads[$index]=new Possibilities($url);
    $threads[$index]->start();
}
foreach($threads as $index => $thread ){
    /* join() only reports whether the thread was joined */
    if( $thread->join() && $thread->response !== false ){
        /** good, got a response */
    } else { /** we do not care **/ }
}
?>
My guess is, you are using curl multi as it's the only option for concurrent execution of the code sending out emails ... if this is the case, I do not suggest that you use anything like the code above, I suggest that you thread the calls to mail() directly as this will be faster and more efficient by far.
But now you know, you can thread in PHP .. enjoy :)
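Staying with curl_multi is also possible: a timeout on one transfer does not have to stop the others, because each easy handle's outcome is reported separately through curl_multi_info_read(). The sketch below assumes the cURL extension; multi_results() is an invented helper name.

```php
<?php
/**
 * Run all transfers and report each one's cURL result code separately.
 * multi_results() is a hypothetical helper name.
 *
 * @param string[] $urls
 * @return int[] CURLE_* result code per URL key (CURLE_OK is 0)
 */
function multi_results(array $urls, int $timeoutMs = 10000): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT_MS, $timeoutMs);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    $codes = [];
    do {
        $status = curl_multi_exec($mh, $running);
        // Collect every transfer that has finished so far, failed or not.
        while (($info = curl_multi_info_read($mh)) !== false) {
            $key = array_search($info['handle'], $handles, true);
            $codes[$key] = $info['result']; // e.g. CURLE_OPERATION_TIMEDOUT
            curl_multi_remove_handle($mh, $info['handle']);
        }
        if ($running && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // nothing to wait on yet; avoid a busy loop
        }
    } while ($running && $status === CURLM_OK);

    curl_multi_close($mh);
    return $codes;
}
```

A transfer that timed out shows up with the code CURLE_OPERATION_TIMEDOUT while the remaining handles keep running, so the caller can retry or log just that one.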

Executing functions in parallel

I have a function that needs to go over around 20K rows from an array, and apply an external script to each. This is a slow process, as PHP is waiting for the script to be executed before continuing with the next row.
In order to make this process faster, I was thinking of running the function in different parts at the same time. So, for example, rows 0 to 2000 as one function call, 2001 to 4000 as another one, and so on. How can I do this in a neat way? I could create different cron jobs, one for each call with different params: myFunction(0, 2000), then another cron job with myFunction(2001, 4000), etc., but that doesn't seem too clean. What's a good way of doing this?
If you'd like to execute parallel tasks in PHP, I would consider using Gearman. Another approach would be to use pcntl_fork(), but I'd prefer actual workers when it's task based.
The only waiting time you suffer is between getting the data and processing the data. Processing the data is actually completely blocking anyway (you just simply have to wait for it). You will not likely gain any benefits past increasing the number of processes to the number of cores that you have. Basically I think this means the number of processes is small so scheduling the execution of 2-8 processes doesn't sound that hideous. If you are worried about not being able to process data while retrieving data, you could in theory get your data from the database in small blocks, and then distribute the processing load between a few processes, one for each core.
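Splitting the rows into one slice per process, as suggested above, can be done with array_chunk(). In this sketch split_for_workers() is an invented name, range(1, 20000) stands in for the real rows, and 4 is an assumed core count.

```php
<?php
/**
 * Split $rows into at most $workers roughly equal slices.
 * split_for_workers() is a hypothetical helper name.
 */
function split_for_workers(array $rows, int $workers): array
{
    if ($rows === [] || $workers < 1) {
        return [];
    }
    $size = (int) ceil(count($rows) / $workers);
    return array_chunk($rows, $size);
}

$rows = range(1, 20000);               // stand-in for the 20K rows
$chunks = split_for_workers($rows, 4); // 4 = assumed number of cores
foreach ($chunks as $i => $chunk) {
    printf("worker %d gets %d rows\n", $i, count($chunk));
}
```

Each slice can then be handed to one forked child or one Gearman job, instead of maintaining a separate cron entry per range.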
I think I align more with the forking child processes approach for actually running the processing threads. There is a brilliant demonstration in the comments on the pcntl_fork doc page showing an implementation of a job daemon class
http://php.net/manual/en/function.pcntl-fork.php
<?php
declare(ticks=1);
//A very basic job daemon that you can extend to your needs.
class JobDaemon{
public $maxProcesses = 25;
protected $jobsStarted = 0;
protected $currentJobs = array();
protected $signalQueue=array();
protected $parentPID;
public function __construct(){
echo "constructed \n";
$this->parentPID = getmypid();
pcntl_signal(SIGCHLD, array($this, "childSignalHandler"));
}
/**
* Run the Daemon
*/
public function run(){
echo "Running \n";
for($i=0; $i<10000; $i++){
$jobID = rand(0,10000000000000);
while(count($this->currentJobs) >= $this->maxProcesses){
echo "Maximum children allowed, waiting...\n";
sleep(1);
}
$launched = $this->launchJob($jobID);
}
//Wait for child processes to finish before exiting here
while(count($this->currentJobs)){
echo "Waiting for current jobs to finish... \n";
sleep(1);
}
}
/**
* Launch a job from the job queue
*/
protected function launchJob($jobID){
$pid = pcntl_fork();
if($pid == -1){
//Problem launching the job
error_log('Could not launch new job, exiting');
return false;
}
else if ($pid){
// Parent process
// Sometimes you can receive a signal to the childSignalHandler function before this code executes if
// the child script executes quickly enough!
//
$this->currentJobs[$pid] = $jobID;
// In the event that a signal for this pid was caught before we get here, it will be in our signalQueue array
// So let's go ahead and process it now as if we'd just received the signal
if(isset($this->signalQueue[$pid])){
echo "found $pid in the signal queue, processing it now \n";
$this->childSignalHandler(SIGCHLD, $pid, $this->signalQueue[$pid]);
unset($this->signalQueue[$pid]);
}
}
else{
//Forked child, do your deeds....
$exitStatus = 0; //Error code if you need to or whatever
echo "Doing something fun in pid ".getmypid()."\n";
exit($exitStatus);
}
return true;
}
public function childSignalHandler($signo, $pid=null, $status=null){
//If no pid is provided, that means we're getting the signal from the system. Let's figure out
//which child process ended
if(!$pid){
$pid = pcntl_waitpid(-1, $status, WNOHANG);
}
//Make sure we get all of the exited children
while($pid > 0){
if($pid && isset($this->currentJobs[$pid])){
$exitCode = pcntl_wexitstatus($status);
if($exitCode != 0){
echo "$pid exited with status ".$exitCode."\n";
}
unset($this->currentJobs[$pid]);
}
else if($pid){
//Oh no, our job has finished before this parent process could even note that it had been launched!
//Let's make note of it and handle it when the parent process is ready for it
echo "..... Adding $pid to the signal queue ..... \n";
$this->signalQueue[$pid] = $status;
}
$pid = pcntl_waitpid(-1, $status, WNOHANG);
}
return true;
}
}
You can use pthreads.
It is very easy to install and works great on Windows.
Download it from here: http://windows.php.net/downloads/pecl/releases/pthreads/2.0.4/
Extract the zip file, and then:
Move the file 'php_pthreads.dll' to the php\ext\ directory.
Move the file 'pthreadVC2.dll' to the php\ directory.
Then add this line to your 'php.ini' file:
extension=php_pthreads.dll
Save the file.
You're done :-)
Now let's see an example of how to use it:
class ChildThread extends Thread {
public $data;
public function run() {
/* Do some expensive work */
$this->data = 'result of expensive work';
}
}
$thread = new ChildThread();
if ($thread->start()) {
/*
* Do some expensive work, while already doing other
* work in the child thread.
*/
// wait until thread is finished
$thread->join();
// we can now even access $thread->data
}
For more information about pthreads, read the PHP docs on pthreads.
If you're using WAMP like me, you should also add 'pthreadVC2.dll' to
\wamp\bin\apache\ApacheX.X.X\bin
and edit the 'php.ini' file in that same path, adding the same line as before:
extension=php_pthreads.dll
Good luck!
What you are looking for is parallel, a succinct concurrency API for PHP 7.2+:
$runtime = new \parallel\Runtime();
$future = $runtime->run(function() {
for ($i = 0; $i < 500; $i++) {
echo "*";
}
return "easy";
});
for ($i = 0; $i < 500; $i++) {
echo ".";
}
printf("\nUsing \\parallel\\Runtime is %s\n", $future->value());
Output:
.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*..*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
Using \parallel\Runtime is easy
Have a look at pcntl_fork. This allows you to spawn child processes which can then do the separate work that you need.
I'm not sure whether this is a solution for your situation, but you can redirect the output of system calls to a file, so that PHP does not wait until the program has finished. Be aware that this may overload your server.
http://www.php.net/manual/en/function.exec.php - If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
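As an illustration of the redirection described in the manual excerpt, the sketch below builds a command that a Unix shell runs in the background. background_command(), the script name process_rows.php and the log path are all invented for this example.

```php
<?php
/**
 * Build a shell command that runs in the background with its output
 * redirected, so that exec() returns immediately (Unix shells only).
 * background_command() is a hypothetical helper name.
 */
function background_command(string $program, array $args, string $logFile = '/dev/null'): string
{
    $parts = array_map('escapeshellarg', $args);
    array_unshift($parts, escapeshellcmd($program));
    // "> file 2>&1" redirects stdout and stderr; "&" detaches the process.
    return implode(' ', $parts) . ' > ' . escapeshellarg($logFile) . ' 2>&1 &';
}

// Example: process rows 0 to 2000 with an assumed external script.
$cmd = background_command('php', ['process_rows.php', '0', '2000'], '/tmp/rows_0.log');
echo $cmd, PHP_EOL;
// exec($cmd); // uncomment to actually launch it
```

Because nothing limits how many such commands are started, the caller should cap the number of launches itself, which is the overload risk mentioned above.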
There's Guzzle with its concurrent requests:
use GuzzleHttp\Client;
use GuzzleHttp\Promise;
$client = new Client(['base_uri' => 'http://httpbin.org/']);
$promises = [
'image' => $client->getAsync('/image'),
'png' => $client->getAsync('/image/png'),
'jpeg' => $client->getAsync('/image/jpeg'),
'webp' => $client->getAsync('/image/webp')
];
$responses = Promise\Utils::unwrap($promises);
There's the overhead of promises, but more importantly, Guzzle only works with HTTP requests. It works with version 7+ and with frameworks like Laravel.
