browser headers for curl request - php

We have an assignment to filter authentic curl requests from robots. I am sending a curl request to the site, but it returns an invalid image file (I know because the image works when I view it in my browser). It somehow knows my request is not authentic. Is there a field I'm overlooking here? I'm trying to mimic a browser request exactly.
$header_arr = array(
'0' =>'Host: www.myittest.com',
'1' =>'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0',
'2' =>'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'3' =>'Accept-Language: en-US,en;q=0.5',
'4' =>'Accept-Encoding: gzip, deflate',
'5' =>'Connection: keep-alive',
);
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header_arr);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 6);
$raw=curl_exec($ch);

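Your headers ask for a gzip/deflate-compressed response, but because you set Accept-Encoding by hand, curl doesn't know it has to decompress the body, so you get the compressed bytes instead of the image. Setting CURLOPT_ENCODING makes curl send the Accept-Encoding header itself and decode the response; an empty string means "all encodings curl supports". Adding this should fix it: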
curl_setopt($ch, CURLOPT_ENCODING, '');
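To check whether this is what happened to a response you already have, you can look at the first two bytes of the body. The following is a minimal sketch, not part of the original answer: decodeIfGzipped is a made-up helper name, the sample bytes stand in for a real image, and it only handles gzip (not deflate). With CURLOPT_ENCODING set, curl does this decoding for you.

```php
<?php
/**
 * Return the decoded body if it looks gzip-compressed, otherwise return it unchanged.
 * Hypothetical helper for diagnosis only.
 */
function decodeIfGzipped(string $body): string
{
    // A gzip stream starts with the magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Simulate a server that compresses the image because the request advertised gzip.
$original   = "\x89PNG\r\n\x1a\n (image bytes)";
$compressed = gzencode($original);

// The compressed bytes are not a valid image; the decoded bytes are the original.
$restored = decodeIfGzipped($compressed);
```

If the body you saved starts with 0x1f 0x8b, the "invalid image" is simply the compressed payload.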

Related

PHP Curl disable cache

I have a PHP script that periodically performs an HTTP request to a remote API server.
I use curl to perform this task. My code looks like the following example:
$url='http://apiserver.com/services/someservice.php?apikey='.$my_key;
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept-Encoding: gzip, deflate',
'Accept: */*',
'Connection: keep-alive',
'User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:48.0) Gecko/20100101 Firefox/48.0'
));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
$data=curl_exec($ch);
curl_close($ch);
$json_data = json_decode($data,TRUE);
Recently I noticed that it no longer fetches new data; it uses a cached version instead. I tried adding the following curl flags to the code:
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_FORBID_REUSE, 1);
This did not solve the problem. I still receive the same cached response.
If I add an extra parameter to the API URL, like "&time=".time(), that fixes the problem, but I don't want to add extra parameters to the URL.
What can I do to solve the problem?
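One thing that might be worth trying instead of the extra URL parameter is asking any cache along the way to revalidate, by sending Cache-Control and Pragma request headers. This is only a sketch under an assumption: the post does not say where the response is being cached, and a cache or server that ignores these headers will keep returning the old data. The helper name withNoCacheHeaders is made up for illustration.

```php
<?php
/**
 * Append request headers that ask caches not to serve a stored response.
 * Hypothetical helper; whether it helps depends on where the caching happens.
 */
function withNoCacheHeaders(array $headers): array
{
    $headers[] = 'Cache-Control: no-cache';
    $headers[] = 'Pragma: no-cache';
    return $headers;
}

$headers = withNoCacheHeaders(array(
    'Accept: */*',
    'Connection: keep-alive',
));

// Usage with the handle from the question:
// curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
```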

PHP cURL doesn't follow redirects even if the flag is set

Even though I have set curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true), cURL doesn't follow redirects; it only shows the "301 Moved" page. I tried it with multiple sites.
The strange thing is that it works on localhost, but when I upload it to my web space it refuses to work.
Is it possible that my web hosting provider changed some setting so that it doesn't work? I've never seen such a thing :(
Here's the code:
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://google.com');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36');
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5',
'Accept-Encoding: gzip, deflate',
'Connection: keep-alive'
));
$result = curl_exec($ch);
curl_close($ch);
I had a similar issue: cURL was issuing a GET immediately after receiving the redirect header. To fix this I set CURLOPT_CUSTOMREQUEST, which makes cURL keep using the given method after the redirect.
Example:
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
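When redirects are not being followed, it can help to see where the server is trying to send you. One way is to request the response headers (CURLOPT_HEADER set to 1) and read the Location value yourself. The sketch below is an illustration, not part of the answer above: extractLocation is a made-up helper, and the sample response is hand-written rather than captured from a real server.

```php
<?php
/**
 * Return the value of the Location header from a raw header block,
 * or null if the response has no Location header. Hypothetical helper.
 */
function extractLocation(string $rawHeaders): ?string
{
    // "m" lets ^ and $ match at line boundaries, "i" ignores header-name case.
    if (preg_match('/^Location:\s*(.+?)\s*$/mi', $rawHeaders, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Hand-written example of what a 301 response's headers can look like.
$sample = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Location: http://www.google.com/\r\n"
        . "Content-Type: text/html\r\n\r\n";

$target = extractLocation($sample);
```

The returned URL can then be requested with a second curl call, which also shows whether the redirect target itself is reachable from the web host.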

PHP cUrl : HTTP Error 400 while The Browser Displayed Correctly

URL:
You can see the URL here (I put it in a pastebin because it is quite long).
Curl & Header :
$header=array();
$header[]="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[]="Accept-Encoding: gzip, deflate";
$header[]="Accept-Language: en-US,en;q=0.5";
$header[]="Connection: keep-alive";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_exec($ch);
Result:
Error 400--Bad Request
From RFC 2068 Hypertext Transfer Protocol -- HTTP/1.1:
10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
When I open the URL directly in the browser, without cURL:
It displays correctly.
There is probably a problem with your URL; chances are it was computed incorrectly.
If you're generating that long URL from your script, make sure it's the right one.
For example, if you start deleting parameters and end up with https://wftc3.e-travel.com/plnext/garuda-indonesia/Override.action, you will see that accessing this page gives a 400 error.
I hope this helps.
Edit: the following works, so the problem is probably in $url.
<?php
$url = "https://wftc3.e-travel.com/plnext/garuda-indonesia/Override.action?SITE=CBEECBEE&LANGUAGE=ID&EMBEDDED_TRANSACTION=FlexPricerAvailability"; // the URL is cut off at this point in the post
$header=array();
$header[]="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[]="Accept-Encoding: gzip, deflate";
$header[]="Accept-Language: en-US,en;q=0.5";
$header[]="Connection: keep-alive";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_ENCODING, "gzip");
$x = curl_exec($ch);
die($x);
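If the long URL is assembled in the script, building the query string with http_build_query avoids hand-concatenation and encoding mistakes. The sketch below is an assumption about how the URL might be generated, not the asker's code; it uses only the three parameters visible in the post (the rest of the URL is cut off there), and $params and $base are illustrative names.

```php
<?php
// Base address taken from the answer above.
$base = 'https://wftc3.e-travel.com/plnext/garuda-indonesia/Override.action';

// Only the parameters visible in the post; the real request has more.
$params = array(
    'SITE'                 => 'CBEECBEE',
    'LANGUAGE'             => 'ID',
    'EMBEDDED_TRANSACTION' => 'FlexPricerAvailability',
);

// Pass '&' explicitly so the result does not depend on arg_separator.output,
// and use RFC 3986 encoding so spaces become %20 rather than +.
$url = $base . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
```

Printing $url and comparing it with the address that works in the browser is a quick way to spot a missing or badly encoded parameter.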

PHP curl() Headers bad request

I am trying to figure out why passing a custom header is resulting in a 400 BAD REQUEST from the server.
$headers = array(
'API KEY: asdf',
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0" );
curl_setopt($ch, CURLOPT_URL, 'http://url');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1 );
curl_setopt($ch, CURLOPT_POSTFIELDS, 'stuff');
curl_setopt($ch, CURLOPT_COOKIEFILE, './tmp/cookie.txt');
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_PROXYPORT, '0000');
curl_setopt($ch, CURLOPT_PROXYTYPE, 'HTTP');
curl_setopt($ch, CURLOPT_PROXY, '0.0.0.0');
$result = curl_exec($ch);
curl_close($ch);
I thought that using CURLOPT_HTTPHEADER would add a custom header to the request, but I'm now wondering whether it's simply overriding everything else I set?
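A server can return a 400 response for many reasons besides a header value, and without more information about the endpoint it's difficult to say what is causing it here. The trailing "," in the headers array is valid PHP and harmless. One thing that does stand out is the header name itself: 'API KEY' contains a space, which is not allowed in an HTTP header field name, so the server may be rejecting the request as malformed; check the API's documentation for the exact header name it expects. As for the last question, CURLOPT_HTTPHEADER adds your headers to the ones cURL generates and only replaces those with the same name; it does not discard the other options you set. See the cURL options page in the PHP manual.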
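A quick local check can catch header lines that a server is likely to reject. HTTP field names may only contain "token" characters (letters, digits and a few symbols, no spaces). The following is a rough sketch: hasValidHeaderName is a made-up helper, and 'X-Api-Key' is a hypothetical header name used for illustration, not the one this particular API expects.

```php
<?php
/**
 * Check that a "Name: value" header line has a syntactically valid field name.
 * Hypothetical helper; it does not validate the header value.
 */
function hasValidHeaderName(string $headerLine): bool
{
    $colon = strpos($headerLine, ':');
    if ($colon === false || $colon === 0) {
        return false; // no name/value separator, or an empty name
    }
    $name = substr($headerLine, 0, $colon);
    // Allowed "token" characters for an HTTP field name; spaces are not among them.
    return preg_match('/^[A-Za-z0-9!#$%&\'*+.^_`|~-]+$/D', $name) === 1;
}

$fromQuestion = hasValidHeaderName('API KEY: asdf');   // name contains a space
$hypothetical = hasValidHeaderName('X-Api-Key: asdf'); // illustrative name only
```

Running the check over the array before passing it to CURLOPT_HTTPHEADER flags the 'API KEY' line from the question.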

file_get_contents and CURL can't open a specific website

I tried to use file_get_contents and cURL to get the content of a website. I also tried to open the same site using Lynx and could not get the content. I got a 406 Not Acceptable; it seems that the site checks whether I'm using a browser. Is there a workaround?
It probably expects the user agent to be a web browser. You can set this easily using cURL:
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
Where $useragent is the string you want to use for a user agent. Try it with some common ones for the major browsers and see if that helps. This page lists some common user agents.
// make a call to the webpage to get his handicap
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.golfspain.com/portalgolf/HCP/handicap_resul.aspx?sLic=CB00693474");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 60);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE );
curl_setopt($ch, CURLOPT_REFERER, "http://google.com" );
curl_setopt($ch, CURLOPT_HEADER, TRUE );
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
$header = array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Language: en-us;q=0.8,en;q=0.6'
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
$html = curl_exec($ch);
curl_close($ch);
$doc = new DOMDocument();
$doc->strictErrorChecking = FALSE;
$doc->loadHTML($html);
$xml = simplexml_import_dom($doc);
Maybe you have to set some more HTTP headers like a 'real' browser. With cURL:
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
$header = array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept-Language: en-us;q=0.8,en;q=0.6'
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
