PHP's cURL is surprisingly hard to debug and obscure. I have a problem downloading JSON API data with cURL, and I want to see exactly what cURL is sending to the remote HTTP server.
Currently the only debugging option I have is to temporarily send the request to a simple HTTP server that writes its input to stdout. I would need to write that server just to debug cURL!
What I do:
function get_data($url) {
$ch = curl_init();
echo "Download: $url.\n";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
// I hoped to get some debug info
// but this setting has no effect
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_HEADER, array(
'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0',
'X-Purpose: Counting downloads.'
));
echo "Sending: \n".curl_getinfo($ch, CURLINFO_HEADER_OUT);
$data = curl_exec($ch);
var_dump($data);
echo curl_error($ch)." ".curl_errno($ch);
curl_close($ch);
return $data;
}
How can I get the data that is sent by cURL as a text?
If you want to define the headers you should use CURLOPT_HTTPHEADER and not CURLOPT_HEADER, i.e.:
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0',
'X-Purpose: Counting downloads.'
));
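To make the difference concrete: CURLOPT_HEADER is a boolean that asks cURL to include the response headers in the returned output, while CURLOPT_HTTPHEADER takes an array of "Name: value" strings to send with the request. A minimal sketch follows; the helper name header_lines() is made up for illustration and is not part of the cURL extension.

```php
<?php
// Hypothetical helper: turns ['Name' => 'value'] into the "Name: value"
// strings that CURLOPT_HTTPHEADER expects.
function header_lines(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

$lines = header_lines([
    'User-Agent' => 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0',
    'X-Purpose'  => 'Counting downloads.',
]);

$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, $lines); // request headers to send
curl_setopt($ch, CURLOPT_HEADER, true);       // include response headers in the curl_exec() output
```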
To get the content cURL is sending, use:
curl_setopt($handle, CURLOPT_VERBOSE, true);
curl_setopt($handle, CURLOPT_STDERR,$f = fopen($verbosePath, "w+"));
function get_data($url) {
    $verbosePath = __DIR__.DIRECTORY_SEPARATOR.'verbose.txt';
    echo "Saving verbose to: $verbosePath\n";
    // Use the URL passed in rather than a hard-coded one
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($handle, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($handle, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($handle, CURLOPT_HTTPHEADER, array(
        'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0',
        'X-Purpose: Counting downloads.'
    ));
    // Write cURL's verbose log to the file instead of STDERR
    curl_setopt($handle, CURLOPT_VERBOSE, true);
    curl_setopt($handle, CURLOPT_STDERR, $f = fopen($verbosePath, "w+"));
    $data = curl_exec($handle);
    curl_close($handle);
    fclose($f);
    return $data;
}
get_data("http://www.google.com/");
verbose.txt
* About to connect() to www.google.com port 80
* Trying 172.217.0.100... * connected
* Connected to www.google.com (172.217.0.100) port 80
> GET / HTTP/1.1
Host: www.google.com
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0
X-Purpose: Counting downloads.
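If the log is wanted as a string rather than a file, the same CURLOPT_STDERR approach can point at an in-memory stream instead. This is a sketch under that assumption; the function name get_data_with_log() is invented here, and the TLS and header options from the example above are left out for brevity.

```php
<?php
// Sketch: fetch a URL and return [body, verbose log] without writing to disk.
function get_data_with_log(string $url): array
{
    $log = fopen('php://temp', 'w+');           // in-memory stream for the verbose output

    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($handle, CURLOPT_VERBOSE, true);
    curl_setopt($handle, CURLOPT_STDERR, $log); // send the verbose log to $log

    $data = curl_exec($handle);                 // false on failure

    rewind($log);                               // move back to the start before reading
    $verbose = stream_get_contents($log);

    // $handle and $log are released when the function returns
    return [$data, $verbose];
}
```

Calling it as list($data, $verbose) = get_data_with_log($url); and then echoing $verbose should show the same kind of lines as verbose.txt above.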
Related
I tried to fetch some data using php curl_setopt in my code. But it went through 2 minutes of loading, then got 504 Gateway Timeout error. Here's the code:
function sendRequest($url, $data, $token = '') {
$header[] = "Connection: keep-alive";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_HTTP_VERSION, 'CURL_HTTP_VERSION_1_1');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36");
$result = curl_exec($ch);
curl_close($ch);
return $result;
}
Strangely, I succeeded in fetching the data instantly using command-line cURL with the same options:
curl --url "http://x.x.x.x" --header "Connection: keep-alive" --http1.1 --verbose --request "POST" -d "data1=value1" -d "data2=value2" --user-agent "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
The server that hosts the code and the destination server are in the same network, and the firewall is not enabled on either server. What could be the cause of the failure?
No clue what $data is. If it's an associative array, try
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
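The reason this can matter: when CURLOPT_POSTFIELDS receives an array, cURL sends the body as multipart/form-data, whereas the command-line -d flags send application/x-www-form-urlencoded. Encoding the array first makes the PHP request body match the command line. A small sketch, assuming $data holds the two fields from the command above; it also passes CURL_HTTP_VERSION_1_1 as a constant rather than a quoted string, which is what the option expects.

```php
<?php
// Assumed contents of $data, taken from the -d flags of the curl command.
$data = ['data1' => 'value1', 'data2' => 'value2'];

// URL-encode the fields into a single request body string.
$body = http_build_query($data);
echo $body, "\n"; // data1=value1&data2=value2

$ch = curl_init();
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $body); // string body => application/x-www-form-urlencoded
// Pass the constant itself; a quoted name is just a string, not the constant.
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
```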
I think Firefox's automatic proxy configuration uses TLS 1.2:
$headers = array(
'host:ip-adress.eu',
'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5',
'Accept-Encoding: gzip, deflate, br',
'Proxy-Authorization: Basic b2xkLmV1cm9tYWtlckB5YWhvby5jb206WEpzV0Uwc0ZqT1pOK2MydDZWZWc4WWFreklaUVVDSUcxbDVrWE1yK0xKVT0=',
'Connection: keep-alive'
);
$proxy = "GQ2S4MZSFYZTMLRRHAYSGMJUG4ZTQMJRGIYDA.cd-n.net:143";
$url = 'ip-adress.eu';
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSLVERSION, 6);
curl_setopt($curl, CURLOPT_HTTPPROXYTUNNEL, true); // OK
curl_setopt($curl, CURLOPT_PROXY, $proxy); // OK
curl_setopt($curl, CURLOPT_URL, trim($url)); // OK
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0'); // OK
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_VERBOSE, false);
$httpresult = curl_exec($curl);
$httpstatus = curl_getinfo($curl, CURLINFO_HTTP_CODE);
print curl_errno ($curl);
curl_close($curl);
This code returns CURL Error - curl: (56) Recv failure: Connection reset by peer
I am using PHP cURL to fetch data from a remote site. My script is:
<?php
$url = 'https://www.(url).com/';
$sleep = rand(10, 12);
sleep($sleep);
$agent= 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36';
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8','accept-encoding:gzip, deflate, sdch','accept:image/webp,image/*,*/*;q=0.8'));
curl_setopt($ch, CURLOPT_PROXY, "x.x.x.x:x");
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_URL,$url);
$result=curl_exec($ch);
$mainPage = new simple_html_dom;
echo $mainPage->load($result);
But it returns a 403 Forbidden error in response.
I also tried sending more elaborate User-Agent strings, but I still get this error.
Thanks in advance for suggestions and comments.
I am trying to use PHP cURL to fetch an Amazon web page but get
HTTP/1.1 503 Service Temporarily Unavailable instead. Is Amazon blocking cURL?
http://www.amazon.com/gp/offer-listing/B003B7Q5YY/
<?php
function get_html_content($url) {
// fake user agent
$userAgent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch,CURLOPT_COOKIEFILE,'cookies.txt');
curl_setopt($ch,CURLOPT_COOKIEJAR,'cookies.txt');
$string = curl_exec($ch);
curl_close($ch);
return $string;
}
echo get_html_content("http://www.amazon.com/gp/offer-listing/B003B7Q5YY");
?>
I use a simple
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $offers_page);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$html = curl_exec($ch);
curl_close($ch);
But I have another problem: if you send a lot of queries to Amazon, they start sending you a 500 page.
I tried to use file_get_contents and cURL to get the content of a website. I also tried to open the same site using Lynx and could not get the content. I got a 406 Not Acceptable, so it seems that the site checks whether I'm using a browser. Is there a workaround?
It probably expects the user agent to be a web browser. You can set this easily using cURL:
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
Where $useragent is the string you want to use for a user agent. Try it with some common ones for the major browsers and see if that helps. This page lists some common user agents.
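To try several user agents and compare the results, something like the following sketch could work. The function name status_by_user_agent() is made up for illustration, and the 5-second connect timeout is an arbitrary choice.

```php
<?php
// Sketch: request $url once per user agent and collect the HTTP status codes.
function status_by_user_agent(string $url, array $userAgents): array
{
    $statuses = [];
    foreach ($userAgents as $useragent) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
        curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
        curl_exec($ch);
        // A status of 0 means no HTTP response was received at all.
        $statuses[$useragent] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    }
    return $statuses;
}
```

Passing the site's URL and a list of browser user agent strings, then printing the result with print_r(), shows at a glance which strings (if any) get past the 406.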
// make a call to the webpage to get his handicap
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.golfspain.com/portalgolf/HCP/handicap_resul.aspx?sLic=CB00693474");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 60);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE );
curl_setopt($ch, CURLOPT_REFERER, "http://google.com" );
curl_setopt($ch, CURLOPT_HEADER, TRUE );
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
$header = array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Language: en-us;q=0.8,en;q=0.6'
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
$html = curl_exec($ch);
curl_close($ch);
$doc = new DOMDocument();
$doc->strictErrorChecking = FALSE;
$doc->loadHTML($html);
$xml = simplexml_import_dom($doc);
Maybe you have to set some more HTTP headers like a 'real' browser. With cURL:
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
$header = array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Language: en-us;q=0.8,en;q=0.6'
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);