PHP multi-cURL request: do some work while waiting for the response

Here's a typical multi-curl request example for PHP:
$mh = curl_multi_init();
foreach ($urls as $index => $url) {
    $curly[$index] = curl_init($url);
    curl_setopt($curly[$index], CURLOPT_RETURNTRANSFER, 1);
    curl_multi_add_handle($mh, $curly[$index]);
}
// execute the handles
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);
// get content and remove handles
$result = array();
foreach ($curly as $index => $c) {
    $result[$index] = curl_multi_getcontent($c);
    curl_multi_remove_handle($mh, $c);
}
// all done
curl_multi_close($mh); // note: $mh, not $this->mh, in this standalone example
The process involves 3 steps:
1. Preparing data
2. Sending requests and waiting until they're finished
3. Collecting responses
I'd like to split step #2 in two: first send all the requests, then collect all the responses, and do some useful work in between instead of just waiting, for example processing the responses of the previous group of requests.
So, how can I split this part of code
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);
into separate parts?
1. Send all the requests
2. Do some other work while waiting
3. Retrieve all the responses
I tried like this:
// Send the requests
$running = null;
curl_multi_exec($mh, $running);
// Do some job while waiting
// ...
// Get all the responses
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);
but it doesn't seem to work properly.

I found the solution. Here's how to launch all the requests without waiting for the responses:
do {
    curl_multi_exec($mh, $running);
} while (curl_multi_select($mh) === -1);
Then we can do any other work and collect the responses at any time later.
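Putting the whole split together, the pattern might look like the sketch below. `fetchWhileWorking()` and `doOtherWork` are hypothetical names of my own; the three phases are the ones described above.

```php
<?php
// Sketch of the three-phase pattern: start transfers, do other work,
// then drain. doOtherWork is a placeholder for your own job.
function fetchWhileWorking(array $urls, callable $doOtherWork): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $handles[$key] = curl_init($url);
        curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $handles[$key]);
    }

    // Phase 1: kick the transfers off without blocking.
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
    } while ($status === CURLM_CALL_MULTI_PERFORM);

    // Phase 2: do something useful while the sockets are busy.
    $doOtherWork();

    // Phase 3: drive the transfers to completion.
    while ($running > 0 && $status === CURLM_OK) {
        curl_multi_select($mh, 0.1);
        $status = curl_multi_exec($mh, $running);
    }

    // Collect results and clean up.
    $results = [];
    foreach ($handles as $key => $h) {
        $results[$key] = curl_multi_getcontent($h);
        curl_multi_remove_handle($mh, $h);
        curl_close($h);
    }
    curl_multi_close($mh);
    return $results;
}
```

Phase 2 can of course be a loop that interleaves work with occasional `curl_multi_exec()` calls if the transfers should keep progressing while you work.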

Related

PHP - multiple curl requests curl_multi speed optimizations

I'm using curl_multi to process multiple API requests in parallel.
However, I've noticed there is a lot of fluctuation in the time it takes to complete the requests.
Is this related to the speed of the APIs themselves, or the timeout I set on curl_multi_select? Right now it is 0.05. Should it be less? How can I know this process is finishing the requests as fast as possible without wasted time in between checks to see if they're done?
<?php
// Build the multi-curl handle, adding each curl handle
$handles = array(/* many curl handles */);
$mh = curl_multi_init();
foreach ($handles as $curl) {
    curl_multi_add_handle($mh, $curl);
}
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh, 0.05); // Should this value be less than 0.05?
} while ($running > 0);
// Close the handles
foreach ($handles as $curl) { // note: $handles, not the undefined $results
    curl_multi_remove_handle($mh, $curl);
}
curl_multi_close($mh);
?>
The current implementation of curl_multi_select() in PHP doesn't block and doesn't respect the timeout parameter; maybe it will be fixed later. The proper way of waiting is not implemented in your code: it has to be two loops. I'll post some tested code from my bot as an example:
$running = 1;
while ($running)
{
    # execute request
    if ($a = curl_multi_exec($this->murl, $running)) {
        throw BotError::text("curl_multi_exec[$a]: ".curl_multi_strerror($a));
    }
    # check finished
    if (!$running) {
        break;
    }
    # wait for activity ($wait is the poll interval in seconds, defined elsewhere in the bot)
    while (!$a)
    {
        if (($a = curl_multi_select($this->murl, $wait)) < 0)
        {
            throw BotError::text(
                ($a = curl_multi_errno($this->murl))
                    ? "curl_multi_select[$a]: ".curl_multi_strerror($a)
                    : 'system select failed'
            );
        }
        usleep($wait * 1000000); # wait for some time, <1 sec
    }
}
Doing
$running = null;
for (;;) {
    curl_multi_exec($mh, $running);
    if ($running < 1) {
        break;
    }
    curl_multi_select($mh, 1);
}
should be better; then you'll avoid a useless select() when nothing is running.

How do I know when an individual request in a curl_multi_exec() has finished in PHP?

I would like to time my individual requests. They were timed before as separate curl requests. Now I would like to combine them with curl_multi_exec and continue to time how long they take. Here is my code:
$mh = curl_multi_init();
curl_multi_add_handle($mh, $ebayRequest);
curl_multi_add_handle($mh, $prosperentRequest);
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);
$ebay = curl_multi_getcontent($ebayRequest);
$prosperent = curl_multi_getcontent($prosperentRequest);
// close the handles
curl_multi_remove_handle($mh, $ebayRequest);
curl_multi_remove_handle($mh, $prosperentRequest);
curl_multi_close($mh);
Here is the old code which timed them.
$ebayTime = microtime(true);
// $ebay = $this->ebayExecuteSearch($ebay);
Yii::warning("Ebay time: ".(microtime(true) - $ebayTime));
I thought curl_multi_info_read might help, but it doesn't look useful.
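For what it's worth, one approach is to poll curl_multi_info_read() inside the multi loop and read CURLINFO_TOTAL_TIME from each handle as it completes. The sketch below is my own (function name `timedMultiFetch` is an assumption, not from the question) and has not been tried against the original Yii setup:

```php
<?php
// Sketch: time each transfer inside a curl_multi loop by polling
// curl_multi_info_read() and reading CURLINFO_TOTAL_TIME per handle.
function timedMultiFetch(array $handles): array
{
    $mh = curl_multi_init();
    foreach ($handles as $h) {
        curl_multi_add_handle($mh, $h);
    }
    $times = [];
    $running = 0;
    do {
        curl_multi_exec($mh, $running);
        // Each completed transfer surfaces here exactly once,
        // even if it failed (the 'result' field carries the error).
        while ($info = curl_multi_info_read($mh)) {
            $h = $info['handle'];
            $key = array_search($h, $handles, true);
            // Total transfer time as measured by libcurl for this handle.
            $times[$key] = curl_getinfo($h, CURLINFO_TOTAL_TIME);
        }
        if ($running > 0) {
            curl_multi_select($mh, 0.1);
        }
    } while ($running > 0);
    foreach ($handles as $h) {
        curl_multi_remove_handle($mh, $h);
    }
    curl_multi_close($mh);
    return $times;
}
```

The content would still be read with curl_multi_getcontent() afterwards, as in the code above.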

curl_multi_exec() not requesting all

I am trying to use curl_multi_exec() in PHP with roughly 4000 POST calls, collecting the JSON returns. However, after 234 records, my print_r of the results starts showing nothing. I can change the post-call URLs, since each of my URLs has a different postfield, but I would still get 234 results. Does anybody know if there are any limits on curl_multi_exec()? I am using an XAMPP server on my computer to retrieve the JSON off a remote server. Is an option in my XAMPP install preventing more results, or is there a server-end limit on my connections?
Thanks. My code for the function is below. The function takes an input $opt, which is an array of the cURL options.
$ch = array();
$results = array();
$mh = curl_multi_init();
foreach ($opt as $handler => $array)
{
    //print_r($array);
    //echo "<br><br>";
    $ch[$handler] = curl_init();
    curl_setopt_array($ch[$handler], $array);
    curl_multi_add_handle($mh, $ch[$handler]);
}
$running = null;
do {
    curl_multi_exec($mh, $running);
} while ($running > 0);
// Get content and remove handles.
foreach ($ch as $key => $val) {
    $results[$key] = json_decode(curl_multi_getcontent($val), true);
    curl_multi_remove_handle($mh, $val);
}
curl_multi_close($mh);
return $results;
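There is no documented hard cap in curl_multi_exec() itself; a ceiling like this usually comes from the OS file-descriptor limit, the remote server, or memory. One common workaround (a sketch of my own, not a confirmed fix for the 234-record symptom) is to run the handles in bounded batches instead of adding all 4000 at once:

```php
<?php
// Sketch: run a large set of cURL option-arrays in batches of
// $batchSize so thousands of sockets are never open at once.
function multiFetchBatched(array $opts, int $batchSize = 100): array
{
    $results = [];
    // array_chunk with preserve_keys keeps the original handler keys.
    foreach (array_chunk($opts, $batchSize, true) as $chunk) {
        $mh = curl_multi_init();
        $handles = [];
        foreach ($chunk as $key => $optArray) {
            $handles[$key] = curl_init();
            curl_setopt_array($handles[$key], $optArray);
            curl_multi_add_handle($mh, $handles[$key]);
        }
        // Drive this batch to completion, sleeping on select()
        // instead of busy-spinning.
        $running = 0;
        do {
            curl_multi_exec($mh, $running);
            if ($running > 0) {
                curl_multi_select($mh, 0.1);
            }
        } while ($running > 0);
        foreach ($handles as $key => $h) {
            $results[$key] = curl_multi_getcontent($h);
            curl_multi_remove_handle($mh, $h);
            curl_close($h);
        }
        curl_multi_close($mh);
    }
    return $results;
}
```

With batching, json_decode can also be applied per batch, which keeps peak memory down compared to holding 4000 raw responses at once.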

Increase speed of my script

I have a script which takes a some.txt file, reads the links, and reports whether my website's backlink is there or not. The problem is that it is very slow, and I want to increase its speed. Is there any way to do that?
<?php
ini_set('max_execution_time', 3000);
$source = file_get_contents("your-backlinks.txt");
$needle = "http://www.submitage.com";
$new = explode("\n", $source);
$found = array();
$notfound = array();
foreach ($new as $check) {
    $a = file_get_contents(trim($check));
    if (strpos($a, $needle) !== false) { // !== false: a match at offset 0 is still a match
        $found[] = $check;
    } else {
        $notfound[] = $check;
    }
}
echo "Matches that were found: \n".implode("\n", $found)."\n";
echo "Matches that were not found: \n".implode("\n", $notfound);
?>
Your biggest bottleneck is the fact that you are executing the HTTP requests in sequence, not in parallel. curl is able to perform multiple requests in parallel. Here's an example from the documentation, heavily adapted to use a loop and actually collect the results. I cannot promise it's correct, I only promise I've followed the documentation correctly:
$mh = curl_multi_init();
$handles = array();
foreach ($new as $check) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $check);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // collect bodies instead of echoing them
    curl_multi_add_handle($mh, $ch);
    $handles[$check] = $ch;
}
// verbatim from the demo
$active = null;
// execute the handles
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}
// end of verbatim code
foreach ($handles as $check => $ch) { // was "for", which is a syntax error
    $a = curl_multi_getcontent($ch);
    // ... check $a for $needle as in the original script
}
You won't be able to squeeze any more speed out of the operation by optimizing the PHP, except maybe some faux-multithreading solution.
However, you could create a queue system that would allow you to run the check as a background task. Instead of checking the URLs as you iterate through them, add them to the queue instead. Then write a cron script that grabs unchecked URLs from the queue one by one, checks if they contain a reference to your domain and saves the result.
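A minimal sketch of that queue idea, using SQLite as the queue store. The schema, function names, and the injectable `$fetch` callable are all my own assumptions, not from the question; the cron job would simply call `workOnce()` in a loop.

```php
<?php
// Sketch of a queue-backed checker: URLs are enqueued, and a cron
// worker pops unchecked rows, fetches them, and stores the result.
function initQueue(PDO $db): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS url_queue (
        url TEXT PRIMARY KEY,
        checked INTEGER DEFAULT 0,
        found INTEGER)');
}

function enqueue(PDO $db, array $urls): void
{
    $stmt = $db->prepare('INSERT OR IGNORE INTO url_queue (url) VALUES (?)');
    foreach ($urls as $u) {
        $stmt->execute([trim($u)]);
    }
}

// Processes one queued URL; returns false when the queue is drained.
// $fetch is injectable (e.g. 'file_get_contents') so it can be stubbed.
function workOnce(PDO $db, string $needle, callable $fetch): bool
{
    $row = $db->query(
        'SELECT url FROM url_queue WHERE checked = 0 LIMIT 1'
    )->fetch(PDO::FETCH_ASSOC);
    if (!$row) {
        return false; // nothing left to check
    }
    $body = (string) @$fetch($row['url']);
    $found = strpos($body, $needle) !== false ? 1 : 0;
    $db->prepare('UPDATE url_queue SET checked = 1, found = ? WHERE url = ?')
       ->execute([$found, $row['url']]);
    return true;
}
```

The results then accumulate in the table, and the page that used to block on 3000 seconds of fetching just reads the `found` column.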

Curl Multi Threading

I'm looking for a cURL function which can open a given number of webpages at a time; having no output (or return data set to false) would be even better. I need to access 5-10 URLs at the same time. I've heard about cURL multi-threading but don't have a proper function or class to use it.
I found some by searching, but most of them seem to use a loop, meaning they don't use concurrent connections, just one request after another. I want something which can make multiple connections at a time, not one by one!
I made one :
function mutload($url) {
    if (!is_array($url)) {
        exit;
    }
    for ($i = 0; $i < count($url); $i++) {
        // create a cURL resource (the original called curl_init() twice
        // per iteration, leaving stray unconfigured handles)
        $ch[$i] = curl_init();
        // set URL and other appropriate options
        curl_setopt($ch[$i], CURLOPT_URL, $url[$i]);
        curl_setopt($ch[$i], CURLOPT_HEADER, 0);
        curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, 0);
    }
    // create the multiple cURL handle
    $mh = curl_multi_init();
    for ($i = 0; $i < count($url); $i++) {
        // add the handles
        curl_multi_add_handle($mh, $ch[$i]);
    }
    $active = null;
    // execute the handles
    do {
        $mrc = curl_multi_exec($mh, $active);
    } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    while ($active && $mrc == CURLM_OK) {
        if (curl_multi_select($mh) != -1) {
            do {
                $mrc = curl_multi_exec($mh, $active);
            } while ($mrc == CURLM_CALL_MULTI_PERFORM);
        }
    }
    // close the handles
    for ($i = 0; $i < count($url); $i++) {
        curl_multi_remove_handle($mh, $ch[$i]);
    }
    curl_multi_close($mh);
}
OK, but I'm confused: will it connect to all the URLs at a time or one by one? Moreover, I'm still getting the content; I only want to connect or send a request to each site and don't need any content back. I set RETURNTRANSFER to false, but that didn't work. Please help, thanks!
You're looking for the curl_multi_* family of functions. Have a look at curl_multi_exec.
Set CURLOPT_NOBODY to prevent cURL from downloading any content.
I didn't test your code, but curl_multi adds items to a queue from a loop and processes them in parallel. Sometimes there can be issues if you are trying to load hundreds of URLs, but it should be fine for a few. If you have long DNS lookups or slow servers, all your results will have to wait for the slowest request.
This code is tested and should work, it is somewhat similar to yours:
http://www.onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blocking/
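To just hit the URLs in parallel without downloading bodies, CURLOPT_NOBODY (which makes cURL send a HEAD-style request) can be combined with the multi interface. A minimal sketch; the function name `pingAll` is my own:

```php
<?php
// Sketch: request several URLs concurrently, discarding the bodies.
// CURLOPT_NOBODY tells cURL not to transfer the response body, so we
// only open the connections and drive them to completion.
function pingAll(array $urls): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $handles[$key] = curl_init($url);
        curl_setopt($handles[$key], CURLOPT_NOBODY, true);
        curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $handles[$key]);
    }
    $running = 0;
    do {
        curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 0.1);
        }
    } while ($running > 0);
    $codes = [];
    foreach ($handles as $key => $h) {
        // HTTP status per URL; 0 means the connection itself failed.
        $codes[$key] = curl_getinfo($h, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $h);
        curl_close($h);
    }
    curl_multi_close($mh);
    return $codes;
}
```

All handles are added before the exec loop starts, so the connections are opened concurrently rather than one by one.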
