I was wondering if it's possible to open multiple URLs with cURL or maybe something else.
This is what I have tried so far:
$urls = array(
"http://google.com",
"http://youtube.com",
);
foreach($urls as $url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT,0);
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 200);
curl_exec($ch);
curl_close($ch);
}
The 200 ms timeout is there to give the site time to open fully.
Maybe you know of an alternative.
Is it possible to open multiple URLs in PHP at the same time? I mean server-side, not client-side.
What you are looking for is simultaneous cURL HTTP requests, which PHP exposes through the curl_multi_* functions.
To get started quickly, you can use this function (thanks to phpied):
function multiRequest($data, $options = array()) {
// array of curl handles
$curly = array();
// data to be returned
$result = array();
// multi handle
$mh = curl_multi_init();
// loop through $data and create curl handles
// then add them to the multi-handle
foreach ($data as $id => $d) {
$curly[$id] = curl_init();
$url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
curl_setopt($curly[$id], CURLOPT_URL, $url);
curl_setopt($curly[$id], CURLOPT_HEADER, 0);
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
// post?
if (is_array($d)) {
if (!empty($d['post'])) {
curl_setopt($curly[$id], CURLOPT_POST, 1);
curl_setopt($curly[$id], CURLOPT_POSTFIELDS, $d['post']);
}
}
// extra options?
if (!empty($options)) {
curl_setopt_array($curly[$id], $options);
}
curl_multi_add_handle($mh, $curly[$id]);
}
// execute the handles
$running = null;
do {
curl_multi_exec($mh, $running);
} while($running > 0);
// get content and remove handles
foreach($curly as $id => $c) {
$result[$id] = curl_multi_getcontent($c);
curl_multi_remove_handle($mh, $c);
}
// all done
curl_multi_close($mh);
return $result;
}
And use it like this:
$data = array(
'http://search.yahooapis.com/VideoSearchService/V1/videoSearch?appid=YahooDemo&query=Pearl+Jam&output=json',
'http://search.yahooapis.com/ImageSearchService/V1/imageSearch?appid=YahooDemo&query=Pearl+Jam&output=json',
'http://search.yahooapis.com/AudioSearchService/V1/artistSearch?appid=YahooDemo&artist=Pearl+Jam&output=json'
);
$r = multiRequest($data);
echo '<pre>';
print_r($r);
Hope it helps.
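The do/while loop in multiRequest() calls curl_multi_exec() continuously until every transfer ends, which can keep a CPU core busy while the script is only waiting on the network. Here is a minimal sketch of an alternative, assuming a hypothetical helper named runMultiHandle() (it is not part of the phpied function) that waits with curl_multi_select() between calls:

```php
<?php
// Hypothetical helper, not part of the original multiRequest():
// drives every handle attached to $mh to completion without busy-waiting.
function runMultiHandle($mh, float $selectTimeout = 1.0): int
{
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($status === CURLM_CALL_MULTI_PERFORM) {
            // Older libcurl versions ask to be called again right away.
            continue;
        }
        if ($status !== CURLM_OK) {
            break;
        }
        if ($running > 0 && curl_multi_select($mh, $selectTimeout) === -1) {
            // select() could not wait on any socket; pause briefly instead.
            usleep(1000);
        }
    } while ($running > 0);

    return $status;
}
```

Inside multiRequest(), the existing loop could then be replaced by a single call such as runMultiHandle($mh). The one-second select timeout is an arbitrary choice for this sketch.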
I am trying to write a simple parser in PHP that gives me only the Content-Length of an HTML page. So far I have this code:
$urls = array(
'http://Link1.com/',
'http://Link2.com'
);
$mh = curl_multi_init();
$connectionArray = array();
foreach($urls as $key => $url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_multi_add_handle($mh, $ch);
$connectionArray[$key] = $ch;
}
$running = null;
do
{
curl_multi_exec($mh, $running);
}while($running > 0);
foreach($connectionArray as $key => $ch)
{
$content = curl_multi_getcontent($ch);
echo $content."<br>";
curl_multi_remove_handle($mh, $ch);
}
curl_multi_close($mh);
How can I get the Content-Length from $content?
You can call curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD) on each handle once the transfers have finished and while the handle is still open. It returns:
Content length of download, read from Content-Length: field
Note that -1 is a valid return value here:
Since 7.19.4, this returns -1 if the size isn't known.
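As a rough sketch of how that could fit into the question's final loop: the helper name resolveContentLength() is hypothetical, and falling back to strlen() of the body is an assumption about what is wanted when the server sends no Content-Length header. For compressed responses the header value and the decoded body size can differ.

```php
<?php
// Hypothetical helper: prefer the value reported by cURL and fall back to
// the size of the downloaded body when the length is unknown (-1).
function resolveContentLength($reported, string $body): int
{
    if ($reported !== null && $reported >= 0) {
        return (int) $reported;
    }
    return strlen($body);
}

// Possible use inside the question's final loop (left as comments so the
// sketch does not depend on live handles):
// foreach ($connectionArray as $key => $ch) {
//     $content  = (string) curl_multi_getcontent($ch);
//     $reported = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
//     echo $key . ': ' . resolveContentLength($reported, $content) . '<br>';
//     curl_multi_remove_handle($mh, $ch);
// }
```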
I am trying to speed up my website by processing its cURL requests efficiently. I am making three requests, two of which go to the same server. Here is my code:
$profile = curl_init();
curl_setopt($profile, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($profile, CURLOPT_RETURNTRANSFER, true);
curl_setopt($profile, CURLOPT_FAILONERROR, true);
curl_setopt($profile, CURLOPT_URL,"https://owapi.net/api/v2/u/".$battletag."/stats/".$mode."?platform=".$platform);
$result = curl_exec($profile); //grab API data
curl_close($profile);
$stats = json_decode($result, true); //decode JSON data
$profile1 = curl_init();
curl_setopt($profile1, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($profile1, CURLOPT_RETURNTRANSFER, true);
curl_setopt($profile1, CURLOPT_FAILONERROR, true);
curl_setopt($profile1, CURLOPT_URL,"https://api.lootbox.eu/".$platform."/us/".$battletag."/profile");
$result1 = curl_exec($profile1); //grab API data
curl_close($profile1);
$stats1 = json_decode($result1, true);
$hero_stats = curl_init();
curl_setopt($hero_stats, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($hero_stats, CURLOPT_RETURNTRANSFER, true);
curl_setopt($hero_stats, CURLOPT_FAILONERROR, true);
curl_setopt($hero_stats, CURLOPT_URL,"https://api.lootbox.eu/".$platform."/us/".$battletag."/competitive-play/heroes");
$hero_play_time = curl_exec($hero_stats); //grab API data
curl_close($hero_stats);
$heroes_info = json_decode($hero_play_time, true);
How can I process these requests at the same time without restarting the connection? I want to speed up the load time of my site because right now it takes a long time. I have heard of the curl_multi_init() function, but I am not sure how to use it properly. Any help with that would be welcome.
Thanks.
Well, thanks for all the help, guys. Especially @Andrew.
I found a solution that ended up working. Posting it here for other people with a similar problem.
function multiRequest($data, $options = array()) {
// array of curl handles
$curly = array();
// data to be returned
$result = array();
// multi handle
$mh = curl_multi_init();
// loop through $data and create curl handles
// then add them to the multi-handle
foreach ($data as $id => $d) {
$curly[$id] = curl_init();
$url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
curl_setopt($curly[$id], CURLOPT_URL, $url);
curl_setopt($curly[$id], CURLOPT_HEADER, 0);
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
// post?
if (is_array($d)) {
if (!empty($d['post'])) {
curl_setopt($curly[$id], CURLOPT_POST, 1);
curl_setopt($curly[$id], CURLOPT_POSTFIELDS, $d['post']);
}
}
// extra options?
if (!empty($options)) {
curl_setopt_array($curly[$id], $options);
}
curl_multi_add_handle($mh, $curly[$id]);
}
// execute the handles
$running = null;
do {
curl_multi_exec($mh, $running);
} while($running > 0);
// get content and remove handles
foreach($curly as $id => $c) {
$result[$id] = json_decode(curl_multi_getcontent($c), true);
curl_multi_remove_handle($mh, $c);
}
// all done
curl_multi_close($mh);
return $result;
}
$data = array(
'https://owapi.net/api/v2/u/'.$battletag.'/stats/'.$mode.'?platform='.$platform,
'https://api.lootbox.eu/'.$platform.'/us/'.$battletag.'/profile',
'https://api.lootbox.eu/'.$platform.'/us/'.$battletag.'/competitive-play/heroes'
);
$r = multiRequest($data);
This worked; I just had to wrap the curl_multi_getcontent() call in json_decode().
Thanks again everyone. Really appreciate the help.
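One thing to watch with decoding inside the loop: when a transfer fails, curl_multi_getcontent() can return null or an empty string, and json_decode() then yields null without saying why. Below is a minimal sketch that keeps failures separate, assuming multiRequest() is left returning the raw strings as in the original phpied version. The helper name decodeJsonBodies() and the 'decoded'/'failed' keys are made up for this example.

```php
<?php
// Hypothetical helper: split raw response bodies into decoded JSON and failures.
function decodeJsonBodies(array $bodies): array
{
    $decoded = array();
    $failed  = array();
    foreach ($bodies as $id => $body) {
        if (!is_string($body) || $body === '') {
            $failed[] = $id;            // transfer failed or returned nothing
            continue;
        }
        $value = json_decode($body, true);
        if (json_last_error() !== JSON_ERROR_NONE) {
            $failed[] = $id;            // body was not valid JSON
            continue;
        }
        $decoded[$id] = $value;
    }
    return array('decoded' => $decoded, 'failed' => $failed);
}
```

It could be called as decodeJsonBodies(multiRequest($data)), after which the 'failed' list shows which of the three API calls need a retry or an error message.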
I found the following function that I've been able to use to collect and cache share counts on various social networks. Thus far, I can feed Twitter, LinkedIn, Facebook, and Pinterest URLs in an array to this function, and they all return a response that I can parse to get the count.
In an effort to speed up the process, I recently found this process that uses cURL multi to fetch all the shares at the same time instead of processing one request at a time.
However, the cURL that I had been using for Google Plus has a lot more configuration in order to make it work. Is it possible to get this configuration into this function so that as it's looping through the requests, if it sees Google Plus, it adds all of this information to that specific request, but still runs the request simultaneously to the others?
Here's the cURL multi function that I'm using:
function sw_fetch_shares_via_curl_multi($data, $options = array()) {
// array of curl handles
$curly = array();
// data to be returned
$result = array();
// multi handle
$mh = curl_multi_init();
// loop through $data and create curl handles
// then add them to the multi-handle
foreach ($data as $id => $d) {
$curly[$id] = curl_init();
$url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
curl_setopt($curly[$id], CURLOPT_URL, $url);
curl_setopt($curly[$id], CURLOPT_HEADER, 0);
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
// post?
if (is_array($d)) {
if (!empty($d['post'])) {
curl_setopt($curly[$id], CURLOPT_POST, 1);
curl_setopt($curly[$id], CURLOPT_POSTFIELDS, $d['post']);
}
}
// extra options?
if (!empty($options)) {
curl_setopt_array($curly[$id], $options);
}
curl_multi_add_handle($mh, $curly[$id]);
}
// execute the handles
$running = null;
do {
curl_multi_exec($mh, $running);
} while($running > 0);
// get content and remove handles
foreach($curly as $id => $c) {
$result[$id] = curl_multi_getcontent($c);
curl_multi_remove_handle($mh, $c);
}
// all done
curl_multi_close($mh);
return $result;
}
The array that I feed into it is basically something like this:
$request_url['pinterest'] = 'https://api.pinterest.com/v1/urls/count.json?url='.$url;
$request_url['twitter'] = 'https://urls.api.twitter.com/1/urls/count.json?url=' . $url;
And so on and so forth for the other networks. I pass those into the cURL multi function, and they send me some json that I can parse and work with.
Here's the configuration for Google Plus that I would like to integrate into the same function so that I can easily pass it in as well:
function sw_fetch_googlePlus_shares($url) {
$url = rawurlencode($url);
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, "https://clients6.google.com/rpc");
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_POSTFIELDS, '[{"method":"pos.plusones.get","id":"p","params":{"nolog":true,"id":"'.rawurldecode($url).'","source":"widget","userId":"@viewer","groupId":"@self"},"jsonrpc":"2.0","key":"p","apiVersion":"v1"}]');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
$curl_results = curl_exec ($curl);
curl_close ($curl);
$json = json_decode($curl_results, true);
return isset($json[0]['result']['metadata']['globalCounts']['count'])?intval( $json[0]['result']['metadata']['globalCounts']['count'] ):0;
}
Can I set up that loop so that, if the $id is 'googlePlus', it adds all of this configuration to that particular request? Is it possible to check whether it's Google Plus right before the curl_setopt lines and then pass these other options in instead? Thanks.
It turns out that this was a lot easier than I thought:
function sw_fetch_shares_via_curl_multi($data, $options = array()) {
// array of curl handles
$curly = array();
// data to be returned
$result = array();
// multi handle
$mh = curl_multi_init();
// loop through $data and create curl handles
// then add them to the multi-handle
foreach ($data as $id => $d) {
$curly[$id] = curl_init();
if($id === 'googlePlus'): // strict comparison, so integer keys such as 0 never match
curl_setopt($curly[$id], CURLOPT_URL, "https://clients6.google.com/rpc");
curl_setopt($curly[$id], CURLOPT_POST, true);
curl_setopt($curly[$id], CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curly[$id], CURLOPT_POSTFIELDS, '[{"method":"pos.plusones.get","id":"p","params":{"nolog":true,"id":"'.rawurldecode($d).'","source":"widget","userId":"@viewer","groupId":"@self"},"jsonrpc":"2.0","key":"p","apiVersion":"v1"}]');
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, true);
curl_setopt($curly[$id], CURLOPT_HTTPHEADER, array('Content-type: application/json'));
else:
$url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
curl_setopt($curly[$id], CURLOPT_URL, $url);
curl_setopt($curly[$id], CURLOPT_HEADER, 0);
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
endif;
// extra options?
if (!empty($options)) {
curl_setopt_array($curly[$id], $options);
}
curl_multi_add_handle($mh, $curly[$id]);
}
// execute the handles
$running = null;
do {
curl_multi_exec($mh, $running);
} while($running > 0);
// get content and remove handles
foreach($curly as $id => $c) {
$result[$id] = curl_multi_getcontent($c);
curl_multi_remove_handle($mh, $c);
}
// all done
curl_multi_close($mh);
return $result;
}
You can now use this function to fetch all five major networks using simultaneous connections instead of one at a time.
For the array, all the other networks require the full request URL, but Google Plus has its request information built into the function, so it only requires the URL of the page that you want information for.
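An alternative to hard-coding the 'googlePlus' check inside the function is to let each entry of $data carry its own options. The sketch below assumes a hypothetical 'options' key on each entry and a hypothetical helper buildCurlOptions(). Neither exists in the original function. It uses array_replace() rather than array_merge() because the CURLOPT_* constants are integers and array_merge() would renumber them.

```php
<?php
// Hypothetical helper: build the option array for one request.
// $d is either a URL string or an array with 'url' and optional 'post'
// and 'options' keys ('options' is an assumption of this sketch).
function buildCurlOptions($d, array $shared = array()): array
{
    $opts = array(
        CURLOPT_URL            => is_array($d) ? $d['url'] : $d,
        CURLOPT_HEADER         => false,
        CURLOPT_RETURNTRANSFER => true,
    );
    if (is_array($d) && !empty($d['post'])) {
        $opts[CURLOPT_POST]       = true;
        $opts[CURLOPT_POSTFIELDS] = $d['post'];
    }
    $own = (is_array($d) && !empty($d['options'])) ? $d['options'] : array();

    // Later arrays win: per-request options override shared ones.
    return array_replace($opts, $shared, $own);
}
```

In the foreach loop, the individual curl_setopt() calls could then be replaced with curl_setopt_array($curly[$id], buildCurlOptions($d, $options)), and the Google Plus entry would simply be an array with its 'url', 'post' body, and a CURLOPT_HTTPHEADER entry under 'options'.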
I need to make a number of curl requests to the same domain one after the other, but cannot make them in parallel.
I found the following code sample at http://technosophos.com/
which does work well in speeding up the repeated curl calls.
function get2($url) {
// Create a handle.
$handle = curl_init($url);
// Set options...
// Do the request.
$ret = curlExecWithMulti($handle);
// Do stuff with the results...
// Destroy the handle.
curl_close($handle);
}
function curlExecWithMulti($handle) {
// In real life this is a class variable.
static $multi = NULL;
// Create a multi if necessary.
if (empty($multi)) {
$multi = curl_multi_init();
}
// Add the handle to be processed.
curl_multi_add_handle($multi, $handle);
// Do all the processing.
$active = NULL;
do {
$ret = curl_multi_exec($multi, $active);
} while ($ret == CURLM_CALL_MULTI_PERFORM);
while ($active && $ret == CURLM_OK) {
if (curl_multi_select($multi) != -1) {
do {
$mrc = curl_multi_exec($multi, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
}
}
// Remove the handle from the multi processor.
curl_multi_remove_handle($multi, $handle);
return TRUE;
}
I have tried multiple times by setting the curl options to get function curlExecWithMulti($handle) to return the results of the curl as a variable, but with no success so far.
Can this be done?
Perhaps this will be of interest; it is easy to follow. It runs your cURL multi requests, returns an array of results, and also supports POST.
<?php
//demo receiver
if($_SERVER['REQUEST_METHOD']=='POST'){
echo $_POST['post_var'];
die;
}
/**
* CURL GET|POST Multi
*/
function curl_multi($data, $options = array()) {
$curly = array();
$result = array();
$mh = curl_multi_init();
foreach ($data as $id=>$d) {
$curly[$id] = curl_init();
$url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
$header = array(); // reset on every iteration so headers do not pile up across requests
$header[0]="Accept: text/xml,application/xml,application/xhtml+xml,application/json,";
$header[0].="text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[]="Cache-Control: max-age=0";
$header[]="Connection: keep-alive";
$header[]="Keep-Alive: 2";
$header[]="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[]="Accept-Language: en-us,en;q=0.5";
$header[]="Pragma: ";
curl_setopt($curly[$id], CURLOPT_URL, $url);
curl_setopt($curly[$id], CURLOPT_HEADER, 0);
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, true);
curl_setopt($curly[$id], CURLOPT_TIMEOUT, 30);
curl_setopt($curly[$id], CURLOPT_USERAGENT, "cURL (http://".$_SERVER['SERVER_NAME'].")");
curl_setopt($curly[$id], CURLOPT_HTTPHEADER, $header);
curl_setopt($curly[$id], CURLOPT_REFERER, $url);
curl_setopt($curly[$id], CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curly[$id], CURLOPT_AUTOREFERER, true);
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, true);
// post?
if (is_array($d)) {
if (!empty($d['post'])) {
curl_setopt($curly[$id], CURLOPT_POST, 1);
curl_setopt($curly[$id], CURLOPT_POSTFIELDS, $d['post']);
}
}
// extra options?
if (!empty($options)) {
curl_setopt_array($curly[$id], $options);
}
curl_multi_add_handle($mh, $curly[$id]);
}
$running = null;
do {
curl_multi_exec($mh, $running);
} while($running > 0);
foreach($curly as $id => $c) {
$result[$id] = curl_multi_getcontent($c);
curl_multi_remove_handle($mh, $c);
}
curl_multi_close($mh);
return $result;
}
$request = array(
array('url'=>'http://localhost:8080/testing.php','post'=>array('post_var'=>'a')),
array('url'=>'http://localhost:8080/testing.php','post'=>array('post_var'=>'b')),
array('url'=>'http://localhost:8080/testing.php','post'=>array('post_var'=>'c')),
);
$curl_result = curl_multi($request);
/*
Array
(
[0] => a
[1] => b
[2] => c
)
*/
echo '<pre>'.print_r($curl_result, true).'</pre>';
?>
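To answer the narrower question of getting curlExecWithMulti() itself to return the response: one way is to set CURLOPT_RETURNTRANSFER on the handle and read the body with curl_multi_getcontent() before the handle is removed from the multi handle. The following is a sketch under those assumptions, not the technosophos code. The name curlExecWithMultiBody() is hypothetical, and it returns false when the transfer fails.

```php
<?php
// Sketch only: a variant of curlExecWithMulti() that returns the response body.
function curlExecWithMultiBody($handle)
{
    static $multi = null;           // reused across calls, as in the original
    if ($multi === null) {
        $multi = curl_multi_init();
    }

    // Without this option the body is printed instead of being buffered.
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($multi, $handle);

    $active = 0;
    do {
        $status = curl_multi_exec($multi, $active);
        if ($status === CURLM_OK && $active > 0
            && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000);           // nothing to wait on yet; pause briefly
        }
    } while ($active > 0
        && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    // Read the transfer result; only one handle is attached at a time.
    $result = CURLE_OK;
    while (($info = curl_multi_info_read($multi)) !== false) {
        $result = $info['result'];
    }

    // Fetch the body before detaching the handle from the multi handle.
    $body = ($status === CURLM_OK && $result === CURLE_OK)
        ? curl_multi_getcontent($handle)
        : false;
    curl_multi_remove_handle($multi, $handle);

    return $body;
}
```

In get2(), the call would then become $ret = curlExecWithMultiBody($handle);, with $ret holding the page content or false.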
I need to add authentication to this function:
function multiRequest($data, $options = array()) {
// array of curl handles
$curly = array();
// data to be returned
$result = array();
// multi handle
$mh = curl_multi_init();
// loop through $data and create curl handles
// then add them to the multi-handle
foreach ($data as $id => $d) {
$curly[$id] = curl_init();
$url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
curl_setopt($curly[$id], CURLOPT_URL, $url);
curl_setopt($curly[$id], CURLOPT_HEADER, 0);
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
// post?
if (is_array($d)) {
if (!empty($d['post'])) {
curl_setopt($curly[$id], CURLOPT_POST, 1);
curl_setopt($curly[$id], CURLOPT_POSTFIELDS, $d['post']);
}
}
// extra options?
if (!empty($options)) {
curl_setopt_array($curly[$id], $options);
}
curl_multi_add_handle($mh, $curly[$id]);
}
// execute the handles
$running = null;
do {
curl_multi_exec($mh, $running);
} while($running > 0);
// get content and remove handles
foreach($curly as $id => $c) {
$result[$id] = curl_multi_getcontent($c);
curl_multi_remove_handle($mh, $c);
}
// all done
curl_multi_close($mh);
return $result;
}
I'm looking to add authentication to this function, something along these lines?
curl_setopt($curly[$id], CURLOPT_USERPWD, "$username:$password");
Can anyone help?
Have you tried the code you posted? That looks good to me:
curl_setopt($curly[$id], CURLOPT_USERPWD, "$username:$password");
But this function already supports auth: just pass the auth string "user:pass" in the $options array, keyed by the CURLOPT_USERPWD constant (not a quoted string).
multiRequest($data, array(CURLOPT_USERPWD => "user:pass"));
If you're looking to set auth for each curl handle, something like this should work:
// auth?
if (is_array($d)) {
if (!empty($d['user']) AND !empty($d['pass'])) {
curl_setopt($curly[$id], CURLOPT_USERPWD, "{$d['user']}:{$d['pass']}");
}
}
Then just pass 'user' and 'pass' elements as part of the multidimensional array.
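The per-handle check above can also be pulled into a small function that returns an option array, which is easy to test without making a request. This is only a sketch: authOptions() is a hypothetical name, and forcing CURLAUTH_BASIC is an assumption about the kind of authentication the server expects.

```php
<?php
// Hypothetical helper: turn the 'user'/'pass' elements of one request entry
// into cURL options; returns an empty array when no credentials are given.
function authOptions($d): array
{
    if (!is_array($d) || empty($d['user']) || !isset($d['pass'])) {
        return array();
    }
    return array(
        CURLOPT_HTTPAUTH => CURLAUTH_BASIC,   // assumption: HTTP Basic auth
        CURLOPT_USERPWD  => $d['user'] . ':' . $d['pass'],
    );
}
```

Inside the foreach loop this could be applied with $auth = authOptions($d); followed by if ($auth) { curl_setopt_array($curly[$id], $auth); }, so entries without credentials are left untouched.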
Can the function definition be changed?
function multiRequest($data,$user,$pwd, $options = array())
Then in the foreach loop, do the following:
...
curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curly[$id], CURLOPT_USERPWD, "$user:$pwd");
// post?
...