I am trying to get the page meta tags and description from a given URL.
I have an array of URLs that I have to loop through, sending a cURL GET request for each one to fetch its meta data, and this takes a lot of time to process.
Is there any way to process all the URLs simultaneously?
I mean, send the requests to all URLs at the same time and then receive each response as soon as its request completes.
For this purpose I have used
curl_multi_init()
but it's not working as expected. I have used this example:
Simultaneuos HTTP requests in PHP with cURL
I have also used the GuzzleHttp example:
Concurrent HTTP requests without opening too many connections
My code:
$urlData = [
    'http://youtube.com',
    'http://dailymotion.com',
    'http://php.net'
];

foreach ($urlData as $url) {
    $promises[] = $this->client->requestAsync('GET', $url);
}

Promise\all($promises)->then(function (array $responses) {
    foreach ($responses as $response) {
        $htmlData = $response->getBody();
        dump($htmlData);
    }
})->wait();
But I got this error:
Call to undefined function GuzzleHttp\Promise\Promise\all()
I am using Guzzle 6 and Promises 1.3.
I need a solution, whether in cURL or in Guzzle, to send simultaneous requests and save time.
Check your use statements. You probably have a mistake there, because the correct name is GuzzleHttp\Promise\all(). Maybe you forgot use GuzzleHttp\Promise as Promise;.
Otherwise the code is correct and should work. Also check that you have the cURL extension enabled in PHP, so Guzzle will use it as the backend. It's probably there already, but worth checking ;)
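For reference, a minimal sketch of the corrected version, assuming Guzzle 6 with the guzzlehttp/promises package installed:

use GuzzleHttp\Client;
use GuzzleHttp\Promise;

$client = new Client();

$promises = [];
foreach ($urlData as $url) {
    $promises[] = $client->requestAsync('GET', $url);
}

// all() lives in the GuzzleHttp\Promise namespace, so with the use statement above
// it resolves to GuzzleHttp\Promise\all(), not GuzzleHttp\Promise\Promise\all()
Promise\all($promises)->then(function (array $responses) {
    foreach ($responses as $response) {
        $htmlData = (string) $response->getBody();
        // parse the meta tags out of $htmlData here
    }
})->wait();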
Related
I would like to cache cURL responses, and I found a couple of ways to do that, but all of them involve saving the response to a file and then retrieving it. The problem is that my code needs to work with the curl_getinfo() data, which is only available after the curl_exec call has finished. So the ideal way would be for cURL itself to cache the response instead of making a new request. I tried that approach using a Cache-Control request header with the value max-age=604800, but I don't see any changes. Any ideas how to accomplish this?
If you have enough information about a request to compile a unique identifier/key, you could use, for example, Memcached:
$key = $url . ':' . $some_other_variable;

$cached = $memcached->get($key);
if ($cached)
{
    return $cached;
}

// Perform cURL request
// ...

$memcached->set($key, $data_to_cache);
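A slightly fuller sketch along those lines, which also caches the curl_getinfo() data the question needs (the Memcached server address, key scheme, and one-week TTL are assumptions):

$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211); // assumed local Memcached instance

$key = 'curl:' . md5($url);
$cached = $memcached->get($key);
if ($cached !== false)
{
    return $cached; // contains both body and info, no HTTP request is made
}

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$body = curl_exec($ch);
$info = curl_getinfo($ch); // only available after curl_exec(), so cache it alongside the body
curl_close($ch);

$data_to_cache = array('body' => $body, 'info' => $info);
$memcached->set($key, $data_to_cache, 604800); // keep for a week, matching the max-age from the question
return $data_to_cache;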
I'm using Guzzle, which I installed via Composer, and I'm failing to do something relatively straightforward.
I might be misunderstanding the documentation, but essentially what I want to do is send a POST request to a server and continue executing code without waiting for a response. Here's what I have:
$client = new \GuzzleHttp\Client(/* baseUrl and auth credentials here */);

$client->post('runtime/process-instances', [
    'future' => true,
    'json'   => $data // is an array
]);

die("I'm done with the call");
Now let's say runtime/process-instances takes about 5 minutes to run; I will not get the die message before those 5 minutes are up, when instead I want it right after the request has been sent to the server.
I don't have access to the server, so I can't make it respond before it finishes processing. I just need to ignore the response.
Any help is appreciated.
Things I've tried:
$client->post(/*blabla*/)->then(function ($response) {});
It is not possible in Guzzle to send a request and immediately exit. Asynchronous requests require that you wait for them to complete; if you do not, the request will not be sent at all.
Also note that you are using post instead of postAsync; the former is a synchronous (blocking) request. To send the POST asynchronously, use the latter. In your example, changing post to postAsync would let the process exit before the request completes, but then the target would never receive the request.
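For illustration, a minimal sketch of the asynchronous variant, reusing the endpoint and $data from the question (it still has to call wait() eventually; it just lets other work run while the request is in flight):

$promise = $client->postAsync('runtime/process-instances', [
    'json' => $data,
]);

// ... do other work here while the request is in flight ...

// Without this wait() the script could exit before Guzzle ever sends the request.
$promise->wait();
echo "I'm done with the call";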
Have you tried setting a low timeout?
I'm trying to get a JSON string from a page in my Laravel Project. Using this:
$json = file_get_contents($url);
$data = json_decode($json, TRUE);

return View::make('adventuretime.marceline')
    ->with('json', $json)
    ->with('title', 'ICE KING')
    ->with('description', 'I am the Ice King')
    ->with('content', 'ice king');
But since I'm only running this on localhost, I think that's why it doesn't output anything. What is the proper way to make this flexible, so it can get the JSON string from any $url value using PHP?
Looking at the comments above, it is possible that the $url you are using is not valid; check it by pointing your browser there and seeing what happens.
If you are sure that the $url is fine but you still get the 404 Not Found error, verify that you have a proper Laravel route defined for that address. If the routes are fine, maybe you forgot to run
composer dump-autoload
after modifying your routes.php. If so, try the above and refresh the browser to see if it helps.
Furthermore, bear in mind that with your current function you can only submit GET requests. What is more, this function might not be available for fetching remote URLs on some hosting servers, for security reasons (it depends on the allow_url_fopen setting). If you still want to use it, it'd be good to check
if($json !== FALSE)
before you process the $json response. If file_get_contents fails, it will return false.
Referring to the part of your question:
what is the proper way for it to be flexible and be able to get the JSON string with any $url
I'd suggest using cURL, as a standard and convenient way to fetch remote content. With cURL you have better control over the process of sending the HTTP request and receiving the response. Personally, in my Laravel 4 apps I often use the package jyggen/curl. You can read the docs for it here: jyggen docs
If you are not satisfied with cURL and you want greater control, try Guzzle. As the authors state, Guzzle is a PHP HTTP client & framework for building RESTful web service clients.
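As a rough sketch of what the plain-cURL route looks like (the timeout value and error handling here are just placeholders):

$ch = curl_init($url);
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,  // return the body instead of echoing it
    CURLOPT_FOLLOWLOCATION => true,  // follow redirects
    CURLOPT_TIMEOUT        => 10,
]);
$json   = curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($json !== false && $status === 200) {
    $data = json_decode($json, true);
} else {
    $data = null; // the request failed, handle the error as appropriate
}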
Possible Duplicate:
cURL Mult Simultaneous Requests (domain check)
I'm trying to check whether a website exists (if it responds at all, that's good enough). The issue is that my array of domains holds 20,000 entries and I'm trying to speed up the process as much as possible.
I've done some research and came across this page, which details simultaneous cURL requests:
http://www.phpied.com/simultaneuos-http-requests-in-php-with-curl/
I also found this page, which seems to be a good way of checking if a domain's webpage is up: http://www.wrichards.com/blog/2009/05/php-check-if-a-url-exists-with-curl/
Any ideas on how to quickly check 20,000 domains to see if they are up?
$http = curl_init($url);
curl_setopt($http, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
$result = curl_exec($http);
$http_status = curl_getinfo($http, CURLINFO_HTTP_CODE);
curl_close($http);

if ($http_status == 200) {
    // good here
}
Check out RollingCurl.
It allows you to execute multiple cURL requests in parallel.
Here is an example:
require 'curl/RollingCurl.php';
require 'curl/RollingCurlGroup.php';

$rc = new RollingCurl('handle_response');
$rc->window_size = 2;

foreach ($domain_array as $domain => $value)
{
    $request = new RollingCurlRequest($value);
    $rc->add($request);
}

$rc->execute();

function handle_response($response, $info)
{
    if ($info['http_code'] === 200)
    {
        // site exists, handle response data
    }
}
I think that if you really want to speed up the process and save a lot of bandwidth (since, as I understand it, you plan to check availability on a regular basis), you should work with sockets rather than cURL. You can open several sockets at a time and arrange 'asynchronous' treatment of each socket. Then, instead of sending a "GET $sitename/ HTTP/1.0\r\n\r\n" request, send "HEAD $sitename/ HTTP/1.0\r\n\r\n". It returns the same status code as the GET request would, but without the response body. You only need to parse the first row of the response to get the answer, so you can simply match it with a regex against the good response codes. As one extra optimization, your code will eventually learn which sites sit on the same IPs, so you can cache the name mappings and order the list by IP. Then you can check several sites over one connected socket (remember to add the 'Connection: keep-alive' header).
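A rough sketch of a single HEAD check over a raw socket, as described above (the host name is only an example; a real implementation would open many non-blocking sockets at once and reuse connections where possible):

$host = 'example.com';
$fp = @fsockopen($host, 80, $errno, $errstr, 5);
if ($fp)
{
    fwrite($fp, "HEAD / HTTP/1.0\r\nHost: $host\r\nConnection: close\r\n\r\n");
    $statusLine = fgets($fp); // e.g. "HTTP/1.1 200 OK"
    fclose($fp);
    if (preg_match('#^HTTP/\d\.\d\s+[23]\d\d#', $statusLine))
    {
        // site is up
    }
}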
You can use multi cURL requests, but you probably want to limit them to about 10 at a time. You would have to track jobs in a separate database for processing the queue: Threads in PHP
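A minimal sketch of that idea with curl_multi, keeping roughly 10 requests in flight at a time; the window size, timeout, and HEAD-style check are assumptions, and the simple in-memory queue stands in for the database-backed job tracking mentioned above:

$urls = $domain_array;  // assuming this holds the 20,000 URLs to check
$concurrency = 10;      // keep ~10 requests in flight at a time

$mh = curl_multi_init();
$inFlight = 0;

$add = function ($url) use ($mh, &$inFlight) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_NOBODY         => true, // the status code is enough, skip the body
        CURLOPT_TIMEOUT        => 10,
    ]);
    curl_multi_add_handle($mh, $ch);
    $inFlight++;
};

// Prime the first batch
while ($inFlight < $concurrency && $urls) {
    $add(array_shift($urls));
}

do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);

    // Harvest finished handles and refill the pool
    while ($info = curl_multi_info_read($mh)) {
        $ch   = $info['handle'];
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $url  = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
        // $code == 200 means the site responded; record $url as "up" here
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
        $inFlight--;
        if ($urls) {
            $add(array_shift($urls));
        }
    }
} while ($inFlight > 0);

curl_multi_close($mh);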
I have been given API documentation which I don't quite get, as there is no URL to connect to:
http://support.planetdomain.com/index.php?_m=downloads&_a=viewdownload&downloaditemid=14&nav=0
I'd prefer doing this in PHP.
How can I run a 10-iteration loop that checks whether a domain is available and, if the response says it is, performs the register command and exits the script (using the code provided in the documentation)?
Thank you.
For the basics, I suggest using cURL to access resources by HTTP POST.
I put this into a function:
function api_call($url, $data, $timeout = 20)
{
    $response = false;
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_NOBODY         => false,
        CURLOPT_TIMEOUT        => $timeout,
        CURLOPT_FORBID_REUSE   => 1,
        CURLOPT_FRESH_CONNECT  => 1,
        CURLOPT_POST           => true
    ));
    // $data is an associative array describing which call you are making, e.g.
    // array('operation'=>'user.verify','admin.username'=>'you','admin.password'=>'pass','reseller.id'=>'xxx')
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    $response = curl_exec($ch);
    $status_code = intval(curl_getinfo($ch, CURLINFO_HTTP_CODE));
    curl_close($ch);
    return array('status' => $status_code, 'url' => $url, 'data' => $response);
}
However, you need to supply a URL. Lucanos noted in the comments it is "api.planetdomain.com/servlet/TLDServlet".
http://support.planetdomain.com/index.php?_m=knowledgebase&_a=viewarticle&kbarticleid=77
By the way, I mostly use cURL for GET requests, so I might be missing some details on how to do the POST right; I tried to fill it in, though.
You ask "How can I run a 10 iteration loop, doing a check if domain is available, if it's response is available, then perform the register command and exit the script (using the code provided in thd documentation)."
Well, here's some pseudocode mixed with valid PHP. I don't know the domainplanet API as you know, so this will NOT work as is but it should give you a decent idea about it.
// set up the list of domains to check (pseudocode, adjust to your needs)
$domains = array('futunarifountain.co.uk', 'megahelicopterunicornassaultlovepageant.ly');

for ($i = 0; $i < 10 && $i < count($domains); $i++)
{
    // set up the domain check call
    $domain_check_call = array('domain.name' => $domains[$i]);
    $domain_info = api_call($dp_base_url, $domain_check_call);
    $info = json_decode($domain_info['data'], true); // IF they use JSON and not XML or something
    if ($info['domain']['status'] == 'available')
    {
        $register_call = something(); // make the API call to register the domain, similar to the above
        if ($register_call['success']) {
            exit(); /* or whatever */
        }
    }
}
Hope that helps get you on the right track.