Here's my code:
$urls = array('http://www.avantlink.com/click.php?p=62629&pw=18967&pt=3&pri=152223&tt=df');
$curl_multi = curl_multi_init();
$handles = array();
$options = $curl_options + array(
CURLOPT_HEADER => true,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_NOBODY => true,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_HEADERFUNCTION => 'read_header',
CURLOPT_USERAGENT => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36',
CURLOPT_HTTPHEADER => array(
'Accept-Language: en-US,en;q=0.8',
'Accept-Encoding: gzip,deflate,sdch',
'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Connection: keep-alive',
'Host: www.avantlink.com',
));
foreach($urls as $i => $url) {
$handles[$i] = curl_init($url);
curl_setopt_array($handles[$i], $options);
curl_multi_add_handle($curl_multi, $handles[$i]);
}
$active = null;
do {
$status = curl_multi_exec($curl_multi, $active);
sleep(1);
error_log('loading redirect: '.$active.' left');
}
while (!empty($active) && $status == CURLM_OK);
do {
$status = curl_multi_exec($curl_multi, $active);
} while ($status == CURLM_CALL_MULTI_PERFORM);
while ($active && ($status == CURLM_OK)) {
if (curl_multi_select($curl_multi) != -1) {
do {
$status = curl_multi_exec($curl_multi, $active);
} while ($status == CURLM_CALL_MULTI_PERFORM);
}
}
if ($status != CURLM_OK) {
trigger_error("Curl multi read error $status\n", E_USER_WARNING);
}
$results = array();
foreach($handles as $i => $handle) {
$results[$i] = curl_getinfo($handle);
curl_multi_remove_handle($curl_multi, $handle);
curl_close($handle);
}
curl_multi_close($curl_multi);
Here's what it gets me (from the read_header() function):
HTTP/1.1 200 OK
Date: Mon, 18 Nov 2013 22:42:29 GMT
Server: Apache
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Content-Length: 20
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
If I print_r() the curl_getinfo() I get this:
Array
(
[url] => http://www.avantlink.com/click.php?p=62629&pw=18967&pt=3&pri=152223&tt=df
[content_type] => text/html; charset=utf-8
[http_code] => 200
[header_size] => 247
[request_size] => 402
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 2.391713
[namelookup_time] => 0.388584
[connect_time] => 1.389628
[pretransfer_time] => 1.389645
[size_upload] => 0
[size_download] => 0
[speed_download] => 0
[speed_upload] => 0
[download_content_length] => 20
[upload_content_length] => 0
[starttransfer_time] => 2.391203
[redirect_time] => 0
[certinfo] => Array
(
)
)
But if you open that URL in Chrome (or any other browser), you'll see that it responds with a 302 redirect to http://www.alssports.com/product.aspx?pf_id=10212312&avad=18967_b55919f9. I need my code to give me the alssports.com URL. I've been working on this for almost the entire day. What am I doing wrong?
Set CURLOPT_FOLLOWLOCATION to false:
CURLOPT_FOLLOWLOCATION => false
When I connected directly to www.avantlink.com and requested the page, I received these headers:
HTTP/1.1 302 Found
Date: Mon, 18 Nov 2013 23:04:12 GMT
Server: Apache
Set-Cookie: merchant_id_10240=18967_a55923c7-_-10240-df-62629-18967-152223-84%7E; expires=Thu, 17-Apr-2014 23:04:12 GMT; path=/; domain=.avantlink.com
P3P: CP="NOI DSP LAW NID LEG"
Location: http://www.alssports.com/product.aspx?pf_id=10212312&avad=18967_a55923c7
Vary: Accept-Encoding,User-Agent
Content-Length: 0
Content-Type: text/html; charset=utf-8
Curl is simply following the Location header instead of just returning the 302 response.
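With redirect following switched off, the target URL can be read from the Location header of the 302 response. The following is a minimal sketch, not the asker's read_header() callback: extract_location() is a hypothetical helper name, and the sample header text is trimmed from the response quoted above.

```php
<?php
// Hypothetical helper: returns the value of the Location header found in a
// raw response header block, or null when no such header is present.
function extract_location(string $rawHeaders): ?string
{
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Sample input: a shortened copy of the 302 response shown above.
$headers = "HTTP/1.1 302 Found\r\n"
    . "Location: http://www.alssports.com/product.aspx?pf_id=10212312&avad=18967_a55923c7\r\n"
    . "Content-Length: 0\r\n";

echo extract_location($headers), "\n";
```

On PHP 5.3.7 and later, curl_getinfo($handle, CURLINFO_REDIRECT_URL) should report the same URL when CURLOPT_FOLLOWLOCATION is disabled, which avoids parsing headers by hand.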
I have this POST request to login to a website:
http://xxxx.net-kont.it/
POST / HTTP/1.1
Host: xxxx.net-kont.it
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0
Accept: */*
Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Referer: http://xxxx.net-kont.it/
Content-Length: 1904
Cookie: ASP.NET_SessionId=s44bymd3lm4dsykvymjljv5s
Connection: keep-alive
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store
Pragma: no-cache
Content-Type: text/html; charset=utf-8
Expires: -1
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.0.30319
Set-Cookie: SSOAuth=EDCCFF8CD40064D70B3377CD0389FF7F807F0B774F2CE1CA6C015314911D3D69AB819EAB9938C14608842D25991D11D8F1A5A94090DB926BD7001C526B1920A51AC986182EB016C323983716720E8F345B54E02E44C65753E9183843D23F569EF3FE52C03FC8567E809A77387B8C; path=/; HttpOnly
X-Powered-By: ASP.NET
Date: Sun, 22 Oct 2017 12:26:40 GMT
Content-Length: 714
----------------------------------------------------------
http://xxxx.net-kont.it/aspx/Empty.aspx?ControllaRichieste=true&CheckCode=29a29a891a7d4d7773f480064e5c869929bcca40e7c84812111f9affbc3be4628a3b7defe8fb9b14f9911be9c6545e7cd31c2fc04b79a8d1e7280e0277264bdcec7428037a43961c3dda5bbd54a2e7ae&wsid=1a57f5e6-bf68-4f2f-9a71-c43e8e8bfbaf&wsnew=false
GET /aspx/Empty.aspx?ControllaRichieste=true&CheckCode=29a29a891a7d4d7773f480064e5c869929bcca40e7c84812111f9affbc3be4628a3b7defe8fb9b14f9911be9c6545e7cd31c2fc04b79a8d1e7280e0277264bdcec7428037a43961c3dda5bbd54a2e7ae&wsid=1a57f5e6-bf68-4f2f-9a71-c43e8e8bfbaf&wsnew=false HTTP/1.1
Host: xxxx.net-kont.it
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Referer: http://xxxx.net-kont.it/
Cookie: ASP.NET_SessionId=s44bymd3lm4dsykvymjljv5s; SSOAuth=EDCCFF8CD40064D70B3377CD0389FF7F807F0B774F2CE1CA6C015314911D3D69AB819EAB9938C14608842D25991D11D8F1A5A94090DB926BD7001C526B1920A51AC986182EB016C323983716720E8F345B54E02E44C65753E9183843D23F569EF3FE52C03FC8567E809A77387B8C
Connection: keep-alive
Upgrade-Insecure-Requests: 1
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store
Pragma: no-cache
Content-Type: text/html; charset=utf-8
Expires: -1
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Sun, 22 Oct 2017 12:26:40 GMT
Content-Length: 95935
----------------------------------------------------------
The POST request body requires the following fields:
'__LASTFOCUS' => '',
'__EVENTTARGET' => '',
'__EVENTARGUMENT' => '',
'__VIEWSTATE' => $viewstate,
'__VIEWSTATEGENERATOR' => $viewstategenerator,
'ctl00$hwsid' => $hwsid,
'ctl00$PageSessionId' => $pagesessionid,
'ctl00$DefaultUrl' => $defaulturl,
'ctl00$GenericErrorUrl' => $genericerrorurl,
'ctl00$PopupElement' => '',
'ctl00$PollingTimeoutSecs' => $pollingtimeoutsecs,
'ctl00$bodyContent$txtUser' => $user,
'ctl00$bodyContent$txtPassword' => $password,
'__CALLBACKID' => '__Page',
'__CALLBACKPARAM' => '"hwsid="'.$hwsid.'"&PageSessionId="'.$pagesessionid.'"&DefaultUrl="'.$defaulturl.'"&GenericErrorUrl="'.$genericerrorurl.'"&PopupElement="'.'"&PollingTimeoutSecs="'.$pollingtimeoutsecs.'"&txtUser="'.$user.'"&txtPassword="'.$password,
'__EVENTVALIDATION' => $eventvalidation
Analysing the POST request shows that sending the first cookie obtained from the website, "ASP.NET_SessionId=", immediately yields an additional authentication cookie, "SSOAuth=".
How can I get the second cookie "SSOAuth=" so that I can get access to the site? I tried this code:
$user = "xx";
$password = "xx";
$url = 'http://xxx.it/Default.aspx';
$contents = file_get_contents($url);
$dom = new DOMDocument;
$dom->loadHTML($contents);
$xpath = new DOMXpath($dom);
$eventvalidation = $xpath->query('//*[@name="__EVENTVALIDATION"]')->item(0)->getAttribute('value');
$viewstate = $xpath->query('//*[@name="__VIEWSTATE"]')->item(0)->getAttribute('value');
$viewstategenerator = $xpath->query('//*[@name="__VIEWSTATEGENERATOR"]')->item(0)->getAttribute('value');
$hwsid = $xpath->query('//*[@name="ctl00$hwsid"]')->item(0)->getAttribute('value');
$pagesessionid = $xpath->query('//*[@name="ctl00$PageSessionId"]')->item(0)->getAttribute('value');
$defaulturl = $xpath->query('//*[@name="ctl00$DefaultUrl"]')->item(0)->getAttribute('value');
$genericerrorurl = $xpath->query('//*[@name="ctl00$GenericErrorUrl"]')->item(0)->getAttribute('value');
$pollingtimeoutsecs = $xpath->query('//*[@name="ctl00$PollingTimeoutSecs"]')->item(0)->getAttribute('value');
$cookies = array_filter(
$http_response_header,
function($v) {return strpos($v, "Set-Cookie:") === 0;}
);
$headers = [
"Accept-language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3",
"Content-Type: application/x-www-form-urlencoded; charset=utf-8",
"User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0",
];
foreach ($cookies as $cookie) {
$headers[] = preg_replace("/^Set-/", "", $cookie);
}
$request = array(
'http' => array(
'method' => 'POST',
'timeout' => 0,
'header'=> $headers,
'content' => http_build_query(array(
'__LASTFOCUS' => '',
'__EVENTTARGET' => '',
'__EVENTARGUMENT' => '',
'__VIEWSTATE' => $viewstate,
'__VIEWSTATEGENERATOR' => $viewstategenerator,
'ctl00$hwsid' => $hwsid,
'ctl00$PageSessionId' => $pagesessionid,
'ctl00$DefaultUrl' => $defaulturl,
'ctl00$GenericErrorUrl' => $genericerrorurl,
'ctl00$PopupElement' => '',
'ctl00$PollingTimeoutSecs' => $pollingtimeoutsecs,
'ctl00$bodyContent$txtUser' => $user,
'ctl00$bodyContent$txtPassword' => $password,
'__CALLBACKID' => '__Page',
'__CALLBACKPARAM' => '"hwsid="'.$hwsid.'"&PageSessionId="'.$pagesessionid.'"&DefaultUrl="'.$defaulturl.'"&GenericErrorUrl="'.$genericerrorurl.'"&PopupElement="'.'"&PollingTimeoutSecs="'.$pollingtimeoutsecs.'"&txtUser="'.$user.'"&txtPassword="'.$password,
'__EVENTVALIDATION' => $eventvalidation,
'ctl00$bodyContent$btnLogin' => 'Conferma'
)),
)
);
echo "<hr/>";
$context = stream_context_create($request);
$data = file_get_contents($url, false, $context);
echo htmlentities($data);
But I get the following output of "Authentication failed":
<Notification><Error Code="" Alert="True" ClosePopup="True" Fatal="False" Message="Autenticazione fallita." /></Notification>
The session cookie travels in the HTTP headers, while file_get_contents returns only the HTTP body, so you lose the metadata in which the cookie is sent unless you read it separately from $http_response_header.
I'd really recommend using something a bit more advanced than that. @Tarun Lalwani recommended cURL, which can do this, although I prefer something more intuitive such as Guzzle: http://docs.guzzlephp.org/en/stable/
Guzzle uses PSR-7: http://www.php-fig.org/psr/psr-7/
Here is a Guzzle usage example that shows how easy it is to access the headers:
$client = new GuzzleHttp\Client();
$res = $client->request('GET', 'https://api.github.com/user', [
'auth' => ['user', 'pass']
]);
echo $res->getStatusCode();
// "200"
echo $res->getHeaderLine('content-type');
// 'application/json; charset=utf8'
echo $res->getBody();
// {"type":"User"...'
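If you stay with file_get_contents, as the question's code does, the headers of the last response are available in $http_response_header. The sketch below is an assumption about how the cookies could be forwarded, not something confirmed in this thread: cookie_header_from_response() is a hypothetical helper that keeps only each cookie's name=value pair, since attributes such as path or HttpOnly are meant for the client and are not normally sent back.

```php
<?php
// Hypothetical helper: turns the "Set-Cookie:" lines of a response into a
// single "Cookie:" request header. Returns '' when no cookie was set.
function cookie_header_from_response(array $responseHeaders): string
{
    $pairs = array();
    foreach ($responseHeaders as $line) {
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue; // not a cookie line
        }
        $value = trim(substr($line, strlen('Set-Cookie:')));
        // Everything after the first ';' is an attribute (path, HttpOnly, ...).
        $parts = explode(';', $value, 2);
        $pairs[] = trim($parts[0]);
    }
    return $pairs ? 'Cookie: ' . implode('; ', $pairs) : '';
}

// Sample input shaped like $http_response_header after the first request.
$sample = array(
    'HTTP/1.1 200 OK',
    'Set-Cookie: ASP.NET_SessionId=s44bymd3lm4dsykvymjljv5s; path=/; HttpOnly',
);
echo cookie_header_from_response($sample), "\n";
```

The returned string can be appended to the 'header' list of the stream context in place of the preg_replace() loop used in the question.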
I solved it! It was easier than expected: I simply had to delete the double quotes (") from this line:
'__CALLBACKPARAM' => '"hwsid="'.$hwsid.'"&PageSessionId="'.$pagesessionid.'"&DefaultUrl="'.$defaulturl.'"&GenericErrorUrl="'.$genericerrorurl.'"&PopupElement="'.'"&PollingTimeoutSecs="'.$pollingtimeoutsecs.'"&txtUser="'.$user.'"&txtPassword="'.$password,
which becomes:
'__CALLBACKPARAM' => 'hwsid='.$hwsid.'&PageSessionId='.$pagesessionid.'&DefaultUrl='.$defaulturl.'&GenericErrorUrl='.$genericerrorurl.'&PopupElement='.'&PollingTimeoutSecs='.$pollingtimeoutsecs.'&txtUser='.$user.'&txtPassword='.$password,
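The unquoted string can also be assembled from an array, which makes stray quotes harder to introduce. This is only a sketch of the same concatenation: build_callback_param() is a hypothetical helper, and the values are joined without any escaping, exactly as in the corrected line above (http_build_query later encodes the whole string).

```php
<?php
// Hypothetical helper: joins name=value pairs with '&' and adds no quotes.
function build_callback_param(array $fields): string
{
    $pairs = array();
    foreach ($fields as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('&', $pairs);
}

// Placeholder values for illustration only.
$param = build_callback_param(array(
    'hwsid'        => 'abc',
    'PopupElement' => '',
    'txtUser'      => 'xx',
));
echo $param, "\n";
```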
It looks like you are trying to parse data directly from a website. Have you considered asking the website owners about building an API? In any event, I recommend using PhantomJS, so that the scraper code is simpler and the traffic and other JavaScript countermeasures are handled more easily.
I am creating a loyalty discount for a client, and they have an API that returns JSON with discount data.
I know how to generate the API link, and when I output it to my error log and then paste it into a browser or Postman, I get the JSON just fine. But when I try to fetch the JSON using cURL, I get a 400 error:
Status 400: Request authentication failed
The curl part looks like this:
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, $api_url );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
$json = json_decode( curl_exec( $ch ), true );
error_log(print_r($api_url, true));
error_log(print_r($json, true));
if ( curl_errno( $ch ) ) {
wp_die( curl_error( $ch ) );
}
$httpcode = curl_getinfo( $ch, CURLINFO_HTTP_CODE );
curl_close( $ch );
if ( $httpcode >= 200 && $httpcode < 300) {
wp_die( $json );
} else {
wp_die( 'httpcode: ' . $httpcode . ' ' . __( 'Probably entered the wrong data on option menu, or REST API not responsive' ) );
}
When I look at the error log I can see the URL, and it works when I paste it into Postman or a browser.
Postman is set to GET; when I switched it to POST I got the same error. Postman returns the header
CONTENT-LENGTH → 238
I tried setting
curl_setopt( $ch, CURLOPT_CUSTOMREQUEST, 'GET');
before calling curl_exec, but I still get the same error.
Is there any way to see what could be wrong here?
Oh, and I am generating the request using AJAX: the user inputs their ID and submits it via AJAX, and then I should check it and return the JSON.
EDIT:
I've outputted the headers in my error log
[25-Oct-2016 12:07:53 UTC] Array
(
[url] => this is the url
[content_type] =>
[http_code] => 400
[header_size] => 48
[request_size] => 248
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.564651
[namelookup_time] => 0.510374
[connect_time] => 0.520541
[pretransfer_time] => 0.52061
[size_upload] => 0
[size_download] => 41
[speed_download] => 72
[speed_upload] => 0
[download_content_length] => 41
[upload_content_length] => -1
[starttransfer_time] => 0.564589
[redirect_time] => 0
[redirect_url] =>
[primary_ip] => 213.191.137.78
[certinfo] => Array
(
)
[primary_port] => 80
[local_ip] => xxx.xxx.xxx.xxx
[local_port] => 60192
[request_header] => GET /rest/api/v1/webshop/loycard/customer/the endpoint goes here HTTP/1.1
Host: the client host
Accept: */*
)
And when I look at the pasted link in the browser inspector, I see:
GET /rest/api/v1/webshop/loycard/customer/the endpoint goes here HTTP/1.1
Host: the client host
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.59 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: hr-HR,hr;q=0.8,en-US;q=0.6,en;q=0.4
I am connecting to an API with multipart/form-data. If I send the request via curl on the command line I have no issues, but when I send it with PHP cURL I get a 500 Internal Server Error.
The command line is
curl -H "Content-Type: multipart/form-data" -H "X-DevKey: XXX" -H "X-AccessToken: XXX" -F "data=@file.json; type=application/json" -X POST https://api.site.com/v1/Items -i -v
which gives me the following response
> POST /v1/Items HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8y zlib/1.2.5
> Host: api.site.com
> Accept: */*
> X-DevKey: XXX
> X-AccessToken: XXX
> Content-Length: 1008
> Expect: 100-continue
> Content-Type: multipart/form-data; boundary=----------------------------58bf5d52327e
>
< HTTP/1.1 100 Continue
HTTP/1.1 100 Continue
< Set-Cookie: sto-id=XXX; Expires=Fri, 16-May-2025 15:32:27 GMT; Path=/
Set-Cookie: sto-id=XXX; Expires=Fri, 16-May-2025 15:32:27 GMT; Path=/
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Cache-Control: no-cache
Cache-Control: no-cache
< Pragma: no-cache
Pragma: no-cache
< Content-Type: application/json; charset=utf-8
Content-Type: application/json; charset=utf-8
< Expires: -1
Expires: -1
< X-Powered-By: ASP.NET
X-Powered-By: ASP.NET
< Date: Tue, 19 May 2015 15:32:28 GMT
Date: Tue, 19 May 2015 15:32:28 GMT
< Content-Length: 220
Content-Length: 220
< Set-Cookie: sto-id=XXX; Expires=Fri, 16-May-2025 15:32:28 GMT; Path=/
Set-Cookie: sto-id=XXX; Expires=Fri, 16-May-2025 15:32:28 GMT; Path=/
<
* Connection #0 to host api.site.com left intact
{"userMessage":"Item listed with itemID: 484495200","developerMessage":"Item listed with itemID: 484495200","links":[{"rel":"self","href":"https://api.site.com/v1/items/484495200","verb":"GET","title":"484495200"}]}* Closing connection #0
* SSLv3, TLS alert, Client hello (1):
My current PHP function looks like this
function create_listing($buynow, $startingBid, $category, $description, $picture, $sku, $title)
{
$url = "https://api.site.com/v1/Items";
$token = get_access_token();
$headers = array(
"X-DevKey: XXX",
"X-AccessToken: $token",
"Expect: 100-continue",
"Content-Type: multipart/form-data; Boundary=----WebKitFormBoundaryR7f7XrG1vJhOfHzu"
);
$data = array(
"AutoRelist" => 1,
"AutoRelistFixedCount" => 0,
"BuyNowPrice" => 9.99,
"CategoryID" => 2325,
"Condition" => 1,
"CountryCode" => 'US',
"Description" => $description,
"InspectionPeriod" => 1,
//"FixedPrice" => $fixedprice,
"IsFFLRequired" => true,
"ListingDuration" => 1,
"PaymentMethods" => array(
"Check" => false,
"VisaMastercard" => true,
"COD" => false,
"Escrow" => false,
"Amex" => false,
"PayPal" => false,
"Discover" => true,
"SeeItemDesc" => true,
"CertifiedCheck" => true,
"USPSMoneyOrder" => true,
"MoneyOrder" => false
),
"PostalCode" => '85388',
"Quantity" => 1,
//"ReservePrice" => $reserve,
"SalesTaxes" => array(
array(
"State" => 'AZ',
"TaxRate" => 8.5
)
),
//"SerialNumber" => $serial,
"ShippingClassCosts" => array(
"Ground" => 25.00,
"Priority" => 25.00
),
"ShippingClassesSupported" => array(
"Overnight" => false,
"TwoDay" => false,
"ThreeDay" => false,
"Ground" => true,
"FirstClass" => false,
"Priority" => true,
"Other" => false
),
//"ShippingProfileID" => $shippingProfile,
"SKU" => $sku,
"StartingBid" => 0.99,
"Title" => $title,
//"UPC" => $upc,
//"Weight" => $weight,
//"WeightUnit" => 1,
"WhoPaysForShipping" => 8,
//"ItemPremiumFeatures" => array(
//
//),
"WillShipInternational" => false
);
$boundary = '----WebKitFormBoundaryR7f7XrG1vJhOfHzu';
$request = "$boundary\nContent-Disposition: form-data; name=\"data\"\n";
$request .= json_encode($data) . "\n$boundary--";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_POSTFIELDS, $request);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
$verbose = fopen('php://temp', 'rw+');
curl_setopt($ch, CURLOPT_STDERR, $verbose);
$result = curl_exec($ch);
if ($result === FALSE) {
printf("cUrl error (#%d): %s<br>\n", curl_errno($ch),
htmlspecialchars(curl_error($ch)));
}
rewind($verbose);
$verboseLog = stream_get_contents($verbose);
echo "Verbose information:\n<pre>", htmlspecialchars($verboseLog), "</pre>\n";
$info = curl_getinfo($ch);
if (!$result) {
echo curl_error($ch);
}
curl_close($ch);
return $result;
}
But I get the following response:
> POST /v1/Items HTTP/1.1
Host: api.site.com
Accept: */*
Cookie: sto-id=JKHHNMED
X-DevKey: XXX
X-AccessToken: XXX
Expect: 100-continue
Content-Type: multipart/form-data; Boundary=----WebKitFormBoundaryR7f7XrG1vJhOfHzu
Content-Length: 939
< HTTP/1.1 100 Continue
< HTTP/1.1 500 Internal Server Error
< Cache-Control: no-cache
< Pragma: no-cache
< Content-Type: application/json; charset=utf-8
< Expires: -1
< X-Powered-By: ASP.NET
< Date: Tue, 19 May 2015 17:11:40 GMT
< Content-Length: 219
* HTTP error before end of send, stop sending
Any help as to why I'm getting the 500 error would be greatly appreciated.
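One difference between the two requests, offered as a possible cause rather than a confirmed one: the hand-built body does not follow multipart framing. In a multipart body every delimiter line is the boundary preceded by two hyphens, lines end with CRLF, a blank line separates the part headers from the part content, and the closing delimiter carries two more hyphens at the end. The sketch below builds such a body; build_multipart_body() is a hypothetical helper, and the part Content-Type is taken from the type=application/json in the working command line.

```php
<?php
// Hypothetical helper: builds a multipart/form-data body with one JSON part.
function build_multipart_body(string $boundary, string $name, string $json): string
{
    $eol = "\r\n"; // multipart lines are terminated by CRLF
    return '--' . $boundary . $eol
        . 'Content-Disposition: form-data; name="' . $name . '"' . $eol
        . 'Content-Type: application/json' . $eol
        . $eol                               // blank line ends the part headers
        . $json . $eol
        . '--' . $boundary . '--' . $eol;    // closing delimiter
}

$boundary = '----WebKitFormBoundaryR7f7XrG1vJhOfHzu';
// Placeholder payload for illustration only.
$body = build_multipart_body($boundary, 'data', json_encode(array('Title' => 'Example')));
// The request header names the boundary without the leading two hyphens.
$contentType = 'Content-Type: multipart/form-data; boundary=' . $boundary;

echo $contentType, "\n", $body;
```

The string in $body would be passed to CURLOPT_POSTFIELDS and $contentType to CURLOPT_HTTPHEADER, replacing the $request string and the Content-Type entry in the function above.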
When I run curl -I http://api.stackoverflow.com/1.1/badges from my terminal, it shows me the following headers:
HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 42804
Content-Type: application/json; charset=utf-8
Content-Encoding: gzip
X-AspNetMvc-Version: 4.0
X-RateLimit-Max: 300
X-RateLimit-Current: 297
X-AspNet-Version: 4.0.30319
Set-Cookie: .ASPXBrowserOverride=; expires=Mon, 08-Oct-2012 04:29:28 GMT; path=/
Date: Tue, 09 Oct 2012 04:29:27 GMT
Yet, when I run the same cURL request through PHP, I get this:
Array
(
[url] => http://api.stackoverflow.com/1.1/badges?10102
[content_type] => application/json; charset=utf-8
[http_code] => 200
[header_size] => 277
[request_size] => 85
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.168343
[namelookup_time] => 0.023417
[connect_time] => 0.046293
[pretransfer_time] => 0.046365
[size_upload] => 0
[size_download] => 42804
[speed_download] => 254266
[speed_upload] => 0
[download_content_length] => 42804
[upload_content_length] => 0
[starttransfer_time] => 0.097563
[redirect_time] => 0
[certinfo] => Array
(
)
[redirect_url] =>
)
The major difference that matters to me is that when run through PHP, I do not get the Content-Encoding header, without which I do not know if the content needs to be gzip inflated or not.
Is there a way to get the Content-Encoding header, or to check for gzip compression some other way?
There is neither a header_response nor an accept-encoding entry in the array returned by getinfo. I thought CURLINFO_HEADER_OUT on getinfo would give the response headers, but it gives only the request headers.
But you can get the raw headers by setting the CURLOPT_HEADER option to true. So I suggest doing something less natural:
$curl = curl_init();
$opts = array (
CURLOPT_URL => 'http://api.stackoverflow.com/1.1/badges',
CURLOPT_TIMEOUT => 120,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_ENCODING => 'gzip',
CURLOPT_HEADER => true,
);
curl_setopt_array($curl, $opts);
$return = curl_exec($curl);
list($rawHeader, $response) = explode("\r\n\r\n", $return, 2);
$cutHeaders = explode("\r\n", $rawHeader);
$headers = array();
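foreach ($cutHeaders as $row)
{
$cutRow = explode(":", $row, 2);
// The status line ("HTTP/1.1 200 OK") has no colon, so skip rows without a value
if (isset($cutRow[1])) {
$headers[$cutRow[0]] = trim($cutRow[1]);
}
}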
echo $headers['Content-Encoding']; // gzip
If you set CURLOPT_HEADER to true, curl returns the header alongside the body. If you're just interested in the header, you can set CURLOPT_NOBODY to true and the body is not returned (which emulates the -I flag on the command line).
This example sets just the CURLOPT_HEADER, reads the Content-Encoding header (if it is set) and uncompresses the body:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, "http://api.stackoverflow.com/1.1/badges");
curl_setopt($curl, CURLOPT_HEADER, 1);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($curl);
curl_close($curl);
list($header, $body) = explode("\r\n\r\n", $response, 2);
if(preg_match('#Content-Encoding:\s+(\w+)#i', $header, $match)) {
switch (strtolower($match[1])) {
case 'gzip':
$body = gzdecode($body);
break;
case 'compress':
$body = gzuncompress($body);
break;
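case 'deflate':
// gzdeflate() compresses; gzinflate() decodes raw DEFLATE data (zlib-wrapped data needs gzuncompress())
$body = gzinflate($body);
break;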
}
}
echo $header;
echo $body;
Disclaimer: gzdecode might not be available in your PHP version. I've tested it with PHP 5.4.4 and it worked.
You could also install the HTTP_Request2-PEAR package which does that for you (plus you get easy access to the headers without HTTP-header parsing):
include 'HTTP/Request2.php';
$request = new HTTP_Request2('http://api.stackoverflow.com/1.1/badges',
HTTP_Request2::METHOD_GET);
$response = $request->send();
echo $response->getBody();
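As for checking for gzip compression some other way: a gzip stream always begins with the two bytes 0x1f 0x8b, so the body itself can be inspected when the header is unavailable. This is a minimal sketch with a hypothetical helper name, looks_gzipped(). It is only meaningful when cURL has not already decoded the body, which it does automatically when CURLOPT_ENCODING is set.

```php
<?php
// Hypothetical helper: true when the data starts with the gzip magic bytes.
function looks_gzipped(string $body): bool
{
    return strlen($body) >= 2 && substr($body, 0, 2) === "\x1f\x8b";
}

// Sample inputs: the first four bytes of a gzip stream, and plain JSON text.
var_dump(looks_gzipped("\x1f\x8b\x08\x00"));
var_dump(looks_gzipped('{"badges":[]}'));
```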
I have banged my head on this long enough. I hope someone can help me figure this out. I'm not sure anymore whether my problems are caused by cURL, php, Apache, Oracle or a brain fart.
I'm trying to post to a form on an Oracle server. By hand, I can make a GET-like url that will bring up the correct page. I want to do POST to hide the variables (and there could be a lot of them) and because the originating form's method is POST. Either way I can't get the response via my php/cURL program.
My specific questions:
The obvious one, why doesn't it work?
Why can I do the request by hand and not by program?
Why does the access.log have a GET in there?
Why is my request header being rewritten to include the /DAD/scheme/app of the Oracle server?
Will my programming ego ever be the same?
Here's my current code:
<?php
$error_dump = 'stderr.txt';
$error_dump_handle = fopen($error_dump,'a');
$ch = curl_init();
$url = 'http://www2.blah.com/pls/blah/blah.blaQuery';
$url_enc_fields = array(
'LAST_NAME' => 'MacBlahBlah',
'FIRST_NAME' => 'Blahberina',
'CONTAINS' => 'Y',
... and more fields ...
);
$url_enc_fields = http_build_query($url_enc_fields);
//$url = $url.'?'.$url_enc_fields; //previous GET attempt
$content_length = strlen($url_enc_fields);// number of bytes
$content_length = 'Content-Length:' . $content_length;
$headers = array(
'Request: POST ' . $url_enc_fields . 'HTTP/1.1', //An attempt to force the request
'Accept: */*',
'Content-Type: application/x-www-form-urlencoded',
'Referer:http://blah.com/pls/blah/blah.startup?code1=MM&code2=bleep',
'Expect: ',
$content_length
);
curl_setopt($ch,CURLOPT_HEADER,1);
curl_setopt($ch,CURLOPT_HTTPHEADER,$headers);
curl_setopt($ch,CURLINFO_HEADER_OUT,TRUE);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,FALSE);
curl_setopt($ch,CURLOPT_FOLLOWLOCATION,TRUE); // one of many guesses
curl_setopt($ch,CURLOPT_FRESH_CONNECT,TRUE);
// NOTE: no cookies or passwords involved
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,180);
curl_setopt($ch,CURLOPT_TIMEOUT,180);
curl_setopt($ch,CURLOPT_VERBOSE,TRUE);
curl_setopt($ch,CURLOPT_STDERR,$error_dump_handle);
curl_setopt($ch,CURLOPT_ENCODING,'chunked'); // Added because the response was chunked, no difference
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST,TRUE); // Apparently futile
curl_setopt($ch,CURLOPT_POSTFIELDS,$url_enc_fields);
$postResult = curl_exec($ch);
$info = curl_getinfo($ch);
$pretty_info = print_r($info,true);
echo '<br/><pre>'; print_r($info); print_r($url_enc_fields);
print_r($headers);
echo '</pre>';
$now_time = getdate();$format = '-------------- %f -----------';
fprintf($error_dump_handle,$format,$now_time[0]);
fprintf($error_dump_handle,$pretty_info);
fprintf($error_dump_handle,curl_error($ch));
fclose($error_dump_handle);
curl_close($ch);
?>
------------ Here is the current logging -------------
From my error dump:
-------------- 1323725891.000000 -----------Array
(
[url] => http://www2.blah.com/pls/blah/blah.blaQuery
[content_type] => text/html; charset=iso-8859-1
[http_code] => 404
[header_size] => 205
[request_size] => 601
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.203
[namelookup_time] => 0
[connect_time] => 0.078
[pretransfer_time] => 0.078
[size_upload] => 146
[size_download] => 336
[speed_download] => 1655
[speed_upload] => 719
[download_content_length] => -1
[upload_content_length] => 0
[starttransfer_time] => 0.203
[redirect_time] => 0
[request_header] => POST /pls/blah/blah.blaQuery HTTP/1.1
Host: blah.com
Accept-Encoding: chunked
Request: POST LAST_NAME=MacBlahBlah&FIRST_NAME=Blahberina&CONTAINS=Y&...some other fields... HTTP/1.1
Accept: */*
Content-Type: application/x-www-form-urlencoded
Referer:http://blah.com/pls/wllpub/blah.startup?code1=MM&code2=bleep
Content-Length:146
From my Apache access.log:
##.##.##.## - - [12/Dec/2011:14:38:06 -0700] "GET /cgi-bin/mydir/mysubmit_form.php HTTP/1.1" 200 2698 "-" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0"
Why is it doing a GET?
The HTML code returned:
http://www2.blah.com/pls/blah/blah.blaQuery
HTTP/1.1 404 Not Found
Date: Mon, 12 Dec 2011 22:03:53 GMT
Server: Oracle-Application-Server-10g/10.1.2.2.0 Oracle-HTTP-Server
Transfer-Encoding: chunked Content-Type: text/html; charset=iso-8859-1
Not Found
The requested URL pls/blah/blah.blaQuery was not found on this server.
Oracle-Application-Server-10g/10.1.2.2.0 Oracle-HTTP-Server Server at www2 Port 80
Try:
$fields = array(
'LAST_NAME' => 'MacBlahBlah',
'FIRST_NAME' => 'Blahberina',
'CONTAINS' => 'Y',
... and more fields ...
);
$url_enc_fields = http_build_query($fields);
and then
curl_setopt($ch,CURLOPT_POST,count($fields)); // Apparently futile
curl_setopt($ch,CURLOPT_POSTFIELDS,$url_enc_fields);
I haven't tested these changes, but this is how I usually do it.
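Put together, the approach might look like the sketch below. It is an untested outline, not a confirmed fix for the 404: encode_form_fields() and post_form() are hypothetical helper names. When CURLOPT_POSTFIELDS receives a URL-encoded string, cURL sends the request as POST with Content-Type application/x-www-form-urlencoded and computes Content-Length itself, so the hand-written 'Request:' and 'Content-Length:' headers from the question should not be needed.

```php
<?php
// Hypothetical helper: URL-encodes the form fields for the request body.
function encode_form_fields(array $fields): string
{
    // Pass '&' explicitly so the result does not depend on php.ini settings.
    return http_build_query($fields, '', '&');
}

// Hypothetical helper: posts the fields and returns the response body,
// or false on failure. Requires the cURL extension; not called here.
function post_form(string $url, array $fields)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, encode_form_fields($fields));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

// Only the fields visible in the question; the elided ones are left out.
$encoded = encode_form_fields(array(
    'LAST_NAME'  => 'MacBlahBlah',
    'FIRST_NAME' => 'Blahberina',
    'CONTAINS'   => 'Y',
));
echo $encoded, "\n";
```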