I'm trying to build a kind of browser in PHP.
So far my class can send a GET or a POST request with the content type application/x-www-form-urlencoded.
Now I need to send JSON, so I set the Content-Type header to application/json.
With this content type I hit the following issue: a request that I set up as a POST goes out as a GET, which is really weird.
Here is my code:
private function request($url, $reset_cookies, $post_data = null, $custom_headers = null)
{
// Create options
$options = array(
CURLOPT_URL => $url,
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_HEADER => 0,
CURLINFO_HEADER_OUT => 1,
CURLOPT_FAILONERROR => 1,
CURLOPT_USERAGENT => $this->user_agent,
CURLOPT_CONNECTTIMEOUT => 30,
CURLOPT_TIMEOUT => 30,
CURLOPT_FOLLOWLOCATION => 1,
CURLOPT_MAXREDIRS => 10,
CURLOPT_AUTOREFERER => 1,
CURLOPT_COOKIESESSION => $reset_cookies ? 1 : 0,
CURLOPT_COOKIEJAR => $this->cookies_file,
CURLOPT_COOKIEFILE => $this->cookies_file,
CURLOPT_HTTPHEADER => array('Accept-language: en'),
// SSL
/*
CURLOPT_SSL_CIPHER_LIST => 'TLSv1',
CURLOPT_SSL_VERIFYPEER => 1,
CURLOPT_CAINFO => dirname(__FILE__) . '/Entrust.netCertificationAuthority(2048).crt',
*/
);
// Add headers
if (isset($custom_headers)) $options[CURLOPT_HTTPHEADER] = array_merge($options[CURLOPT_HTTPHEADER], $custom_headers);
// Add POST data
if (isset($post_data))
{
$options[CURLOPT_POST] = 1;
$options[CURLOPT_POSTFIELDS] = is_string($post_data) ? $post_data : http_build_query($post_data);
}
// Attach options
curl_setopt_array($this->curl, $options);
// Execute the request and read the response
$content = curl_exec($this->curl);
print_r($options);
print_r(curl_getinfo($this->curl, CURLINFO_HEADER_OUT));
// Clean local variables
unset($url);
unset($reset_cookies);
unset($post_data);
unset($custom_headers);
unset($options);
// Handle any error
if (curl_errno($this->curl))
{
unset($content);
throw new Exception(curl_error($this->curl));
}
return $content;
}
To illustrate my issue, here is an example.
The cURL options as an array:
Array
(
[10002] => http://mywebsite.com/post/
[19913] => 1
[42] => 0
[2] => 1
[45] => 1
[10018] => Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
[78] => 30
[13] => 30
[52] => 1
[68] => 10
[58] => 1
[96] => 0
[10082] => C:\wamp\www\v2\_libs/../_cookies/14d0fd2b-9f15-4ac5-8fae-4246cc6cef49.cookie
[10031] => C:\wamp\www\v2\_libs/../_cookies/14d0fd2b-9f15-4ac5-8fae-4246cc6cef49.cookie
[10023] => Array
(
[0] => Accept-language: en
[1] => RequestVerificationToken: 4PMxvJsQzFJ5oFt3JdUPe6Bp_geIj4obDJCYIRoU09PrrfcBSUgJT9iB3mXnGFc2KSlYrPcRHF7iHdQhGNu0GKLUzd5FywfaADbGS8wjhXraF36W0
[2] => Content-Type: application/json
)
[47] => 1
[10015] => {"usernameOrFeedId":"manitoba","feed_message_body":"Dummy message goes here"}
)
So the request header looks good to me, but I may be wrong.
And here is the real header sent by cURL:
GET /post/ HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2
Host: mywebsite.com
Accept: */*
Referer: http://mywebsite.com/post/
Cookie: ADRUM_BT=R%3a53%7cclientRequestGUID%3a9787a51b-b24d-4400-9d6a-efbd618c74c0%7cbtId%3a18790%7cbtERT%3a44; CSRFToken=o_eoIVji7pWclOsrLaJpZEbOFSBJBm851rHbH0Xqwdzw2tC5j07EAc23mlj-opWowgpj0RkHyiktl1cS6onBqI43afM1; WebSessionId=3aem0m2xpwmvesgphna5gaop; prod=rd101o00000000000000000000ffff0a5a2a74o80; AuthenticateCookie=AAAAAtsQgeb8+UXrJ+wa7CGVJKnqizEAo2bMuFvqvwYMAl1NRaa6z68LBRx9hiHzPBC8tYqiayHID6pHChGXB7VywemwTpGivcRQ3nRlUVuaYQKyxQt21p1mx7OMlLCsRA==; web_lang.prod=fr
Accept-language: en
RequestVerificationToken: 4PMxvJsQzFJ5oFt3JdUPe6Bp_geIj4obDJCYIRoU09PrrfcBSUgJT9iB3mXnGFc2KSlYrPcRHF7iHdQhGNu0GKLUzd5FywfaADbGS8wjhXraF34W0
Content-Type: application/json
As you can see, it's a GET request and the POST data seems to have disappeared.
Am I doing it wrong?
You're following redirects. That means you get a 3xx response code and curl makes a second request to the new URL.
curl acts according to the specific 3xx code, and for some redirects it changes the request method from POST to GET. Enabling VERBOSE will show you whether it does so. The response codes that make curl change the method are 301, 302 and 303. It does so because that's how browsers act on those response codes.
libcurl offers an option called CURLOPT_POSTREDIR that tells curl not to change the method for specific HTTP responses. With it, curl can send a POST even after following one of these redirects.
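As a rough illustration of the option described above, here is a minimal sketch of a cURL options array that keeps POST across redirects. The helper name build_json_post_options() is hypothetical and not part of the asker's class; the bitmask values 1, 2 and 4 are the ones the PHP manual documents for CURLOPT_POSTREDIR.

```php
<?php
// Hypothetical helper: build options for a JSON POST that stays a POST
// when the server answers with a 301, 302 or 303 redirect.
function build_json_post_options(string $url, array $payload): array
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 10,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => json_encode($payload),
        CURLOPT_HTTPHEADER     => array('Content-Type: application/json'),
        // Bitmask: 1 = keep POST on 301, 2 = on 302, 4 = on 303.
        CURLOPT_POSTREDIR      => 1 | 2 | 4,
        // Log the headers of every hop (to STDERR by default).
        CURLOPT_VERBOSE        => true,
    );
}

$options = build_json_post_options(
    'http://mywebsite.com/post/',
    array('feed_message_body' => 'Dummy message goes here')
);

// Usage, left commented out so the sketch makes no network call:
// $curl = curl_init();
// curl_setopt_array($curl, $options);
// $content = curl_exec($curl);
```

Whether re-sending the POST to the redirect target is what the server expects depends on the site, so it is worth checking the VERBOSE output first.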
CURLOPT_FOLLOWLOCATION seems to be the cause, as shown by this header in the request that was sent:
Referer: http://mywebsite.com/post/
It looks like the server is doing a Post/Redirect/Get (PRG):
http://en.wikipedia.org/wiki/Post/Redirect/Get
Disable follow location by setting it to false and remove CURLOPT_MAXREDIRS from your code:
CURLOPT_FOLLOWLOCATION => false,
// CURLOPT_MAXREDIRS => 10,
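With redirects disabled, the 3xx response itself becomes visible, so you can check where the server wants to send you. The sketch below assumes the response headers were captured (for example with CURLOPT_HEADER => 1); find_redirect_target() is a hypothetical helper and the sample header block is made up for illustration.

```php
<?php
// Hypothetical helper: return the value of the Location header from a raw
// response header block, or null when there is none.
function find_redirect_target(string $rawHeaders): ?string
{
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Made-up sample of what a PRG-style redirect might look like.
$sample = "HTTP/1.1 302 Found\r\nLocation: http://mywebsite.com/post/\r\nContent-Length: 0\r\n\r\n";
echo find_redirect_target($sample), "\n";
```

curl_getinfo($curl, CURLINFO_REDIRECT_URL) should report the same target without parsing the headers yourself.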
A simple test to get some data via the API doesn't work. I use the PHP example from their own website, but no result is printed. Source: https://pro.coinmarketcap.com/api/v1#section/Quick-Start-Guide
Am I doing anything wrong? I use the sandbox environment with the demo API key.
cURL is installed on my server. Thanks a lot!
<?php
$url = 'https://sandbox-api.coinmarketcap.com/v1/cryptocurrency/listings/latest';
$parameters = [
'start' => '1',
'limit' => '5000',
'convert' => 'USD'
];
$headers = [
'Accepts: application/json',
'X-CMC_PRO_API_KEY: b54bcf4d-1bca-4e8e-9a24-22ff2c3d462c'
];
$qs = http_build_query($parameters); // query string encode the parameters
$request = "{$url}?{$qs}"; // create the request URL
$curl = curl_init(); // Get cURL resource
// Set cURL options
curl_setopt_array($curl, array(
CURLOPT_URL => $request, // set the request URL
CURLOPT_HTTPHEADER => $headers, // set the headers
CURLOPT_RETURNTRANSFER => 1 // ask for raw response instead of bool
));
$response = curl_exec($curl); // Send the request, save the response
print_r(json_decode($response)); // print json decoded response
curl_close($curl); // Close request
?>
When I var_dump($response) I'm getting:
error code: 1020
This is caused by the API detecting a bot/script.
Consider adding a user agent (see "PHP cURL how to add the User Agent value OR overcome the Servers blocking cURL requests?"):
curl_setopt_array($curl, array(
CURLOPT_URL => $request,
CURLOPT_HTTPHEADER => $headers,
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0'
));
Now, the script outputs:
stdClass Object
(
[status] => stdClass Object
(
[timestamp] => 2021-08-30T15:06:20.200Z
[error_code] => 0
[error_message] =>
[elapsed] => 0
[credit_count] => 1
[notice] =>
)
[data] => Array
... and a lot more ...
When attempting to scrape Rubies, I am unable to get past the login. I have no idea why, but here are the cURL options that I am using. If anyone sees a problem, I would greatly appreciate it!
curl_setopt_array($curl, array(
CURLOPT_URL => "https://www.rubies.com/customer/account/loginPost/",
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => "",
CURLOPT_MAXREDIRS => 10,
CURLOPT_TIMEOUT => 30,
CURLOPT_HEADER => true,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_POST => 1,
CURLOPT_POSTFIELDS => array('form_key' => "****", "login[username]" => "****", "login[password]" => "****", "persistent_remember_me" => 'on', "send" => ''),
CURLOPT_FOLLOWLOCATION => 1,
CURLOPT_COOKIEFILE => 'cookie.txt',
CURLOPT_COOKIEJAR => 'cookie.txt',
CURLOPT_HTTPHEADER => array(
'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Host: www.rubies.com',
'Content-Type: application/x-www-form-urlencoded',
'Origin: https://www.rubies.com',
'Referer: https://www.rubies.com/customer/account/',
'Connection: keep-alive',
'Cache-Control: no-cache',
'Upgrade-Insecure-Requests: 1'
),
CURLOPT_SSL_VERIFYPEER => false,
CURLOPT_SSL_VERIFYHOST => false,
CURLINFO_HEADER_OUT => true
));
I currently have the form key hard-coded, but I am not sure whether it has to change with each login. The response from the POST is empty, but I get redirected twice: once to the account page, then back to the login. If anyone can tell me what is going on, I would appreciate it. I think they are using some kind of basic auth system.
Use Fiddler2 or another packet sniffer to look at the cURL traffic, both requests and responses. Compare that to the traffic when using a browser.
You probably either missed or mistyped a field, or missed follow-up steps like setting cookies and posting additional data.
Code for a login often requires fetching the login page, scraping a one-time token (changes with each page request), then posting as the first step. This might trigger script code to set cookies and/or automatically submit other data.
You're making several mistakes.
You tell the server that your POST body is application/x-www-form-urlencoded, but you give CURLOPT_POSTFIELDS an array, so what you actually send is multipart/form-data. To have curl send the POST data as application/x-www-form-urlencoded, URL-encode the data for CURLOPT_POSTFIELDS; for arrays specifically, http_build_query will do this for you. Furthermore, when POSTing multipart/form-data or application/x-www-form-urlencoded, don't set the Content-Type header at all: curl sets it automatically, depending on which encoding is used. On that note, you shouldn't set the User-Agent header manually either; use CURLOPT_USERAGENT. And you should not set the Host header, because curl generates it automatically and you're more likely than curl to make a mistake.
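To make the difference concrete, here is a small sketch of what http_build_query produces for a login-style field array. The field values are placeholders, not real credentials or a real form_key.

```php
<?php
// Placeholder values for illustration only.
$fields = array(
    'form_key' => 'abc123',
    'login'    => array(
        'username' => 'user@example.com',
        'password' => 'secret',
    ),
    'send'     => '',
);

// Passing $fields itself to CURLOPT_POSTFIELDS would make curl send
// multipart/form-data. Passing this string instead makes curl send
// application/x-www-form-urlencoded and set the Content-Type header itself.
$body = http_build_query($fields);
echo $body, "\n";
```

The nested 'login' array is flattened into the login[username] and login[password] keys that the form expects, with the brackets percent-encoded.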
Also, you send a fake Referer header here. Some websites can detect a fake referer, so it's safer to set CURLOPT_AUTOREFERER and make a real request, thus obtaining a real referer. Also, to actually log in to https://www.rubies.com/customer/account/loginPost/ , you need both a cookie session and a form_key code. The form_key is probably tied to your cookie session, and is probably a form of CSRF token, but you provide no code to acquire either. On top of that, you may need a real referer.
Using hhb_curl from https://github.com/divinity76/hhb_.inc.php/blob/master/hhb_.inc.php ,
here's example code that I think would be able to log in, given a real username/password, making none of the mistakes I listed above:
<?php
declare(strict_types = 1);
require_once ('hhb_.inc.php');
$hc = new hhb_curl ();
$hc->_setComfortableOptions ();
$hc->exec ( 'https://www.rubies.com/customer/account/login/' ); // << getting a referer, form_key (csrf token?), and a session.
$domd = new DOMDocument ();
@$domd->loadHTML ( $hc->getResponseBody () ); // @ suppresses warnings about malformed HTML
$csrf = NULL;
// extract the form_key
foreach ( $domd->getElementsByTagName ( "form" ) as $form ) {
if ($form->getAttribute ( "class" ) !== 'form form-login') {
continue;
}
foreach ( $form->getElementsByTagName ( "input" ) as $input ) {
if ($input->getAttribute ( "name" ) !== 'form_key') {
continue;
}
$csrf = $input->getAttribute ( "value" );
break;
}
break;
}
if ($csrf === NULL) {
throw new \RuntimeException ( 'failed to extract the form_key token!' );
}
$hc->setopt_array ( array (
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => http_build_query ( array (
'form_key' => $csrf,
'login' => array (
'username' => '???',
'password' => '???'
),
'persistent_remember_me' => 'on',
'send' => '' // ??
) )
) );
$hc->exec ( 'https://www.rubies.com/customer/account/login/' );
hhb_var_dump ( $hc->getStdErr (), $hc->getResponseBody () );
EDIT: fixed a URL; the original code definitely wouldn't work, but it should now.
I am making an HTTP API for use in Roblox. I am using PHP and cURL to do so.
However, I am stuck on one method: sending messages through Roblox in PHP. I have captured the message-sending request using Fiddler and put all the headers into my cURL request.
I am also fetching the .ROBLOSECURITY cookie, the X-CSRF token, and the RBXSessionTracker and __RequestVerificationToken cookies.
Here is the messaging code for those of you who want to help:
$RecipientId=$_POST['RecipientId'];
$Subject=$_POST['Subject'];
$Message=$_POST['Message'];
$AuthCookie=decodeJSON($_POST['Login']);
$AuthCookie=$AuthCookie[0];
$TokenCurl=curl_init('roblox.com/build/upload?groupId=1');
curl_setopt_array($TokenCurl,array(
CURLOPT_HTTPGET => 1,
CURLOPT_FRESH_CONNECT => 1,
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_HEADER => 1,
CURLOPT_HTTPHEADER => array('Cookie: .ROBLOSECURITY='.$AuthCookie),
));
$TokenResponse=curl_exec($TokenCurl);
$TokenHeaderSize=curl_getinfo($TokenCurl,CURLINFO_HEADER_SIZE);
$TokenCurlError=curl_error($TokenCurl);
$Cookies=GetCookies(substr($TokenResponse,0,$TokenHeaderSize));
$SessionTracker=$Cookies['RBXSessionTracker'];
$RequestVerification=$Cookies['__RequestVerificationToken'];
$AuthCookie='.ROBLOSECURITY='.$AuthCookie.'; RBXSessionTracker='.$SessionTracker.'; __RequestVerificationToken='.$RequestVerification.';';
curl_close($TokenCurl);
if($TokenCurlError!==''){
die($TokenCurlError);
};
$Cache=time();
preg_match("`<script\stype=\"text/javascript\">\s*?Roblox\.XsrfToken\.setToken\('(.*?)'\);\s*?</script>`",$TokenResponse,$Matches);
$Token=$Matches[1];
if(empty($Token)){
die("Could not get X-CSRF Token");
};
$PostData=json_encode(array('subject'=>$Subject,'body'=>$Message,'recipientid'=>$RecipientId,'cacheBuster'=>$Cache));
$Curl=curl_init('roblox.com/messages/send');
curl_setopt_array($Curl,array(
CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0',
CURLOPT_POST => 1,
CURLOPT_FRESH_CONNECT => 1,
CURLOPT_HEADER => 1,
CURLOPT_POSTFIELDS => $PostData,
CURLOPT_HTTPHEADER => array('Content-Length: '.strlen($PostData),'X-CSRF-TOKEN: '.$Token,'Referer: www.roblox.com/messages/compose?recipientId='.$RecipientId,'X-Requested-With: XMLHttpRequest','Cookie: '.$AuthCookie,'Accept: application/json, text/javascript, */*; q=0.01','Accept-Language: en-US,en;q=0.5','Accept-Encoding: gzip, deflate','Connection: keep-alive','Pragma: no-cache','Cache-Control: no-cache'),
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_FOLLOWLOCATION => 1,
CURLOPT_VERBOSE => 1
));
$ResponseBody=curl_exec($Curl);
echo $ResponseBody;
$HTTPResponseCode=curl_getinfo($Curl,CURLINFO_HTTP_CODE);
$HTTPHeaderSize=curl_getinfo($Curl,CURLINFO_HEADER_SIZE);
$ResponseHeader=substr($ResponseBody,0,$HTTPHeaderSize);
$CurlError=curl_error($Curl);
curl_close($Curl);
if($CurlError!==''){
die($CurlError);
};
$MessageResult=decodeJSON(substr($ResponseBody,$HTTPHeaderSize));
echo json_encode(array('Message'=>$MessageResult['message']));
I removed the 'http://' from the links because of reputation.
Everything works, it just will not send the message. Does anyone know what exactly you need to send a message?
I know that WordPress redirects closely matching URLs to the original URL. However, I need to know the actual URL at the very beginning of code execution. Is that possible?
1) Is there a method that takes a URL as a parameter and returns its original WordPress URL? That would be great, but I'm not sure whether one exists.
2) Where in the code does this test and redirection actually happen?
3) Is there any hook that I can use from a plugin to control this situation?
Thanks in advance.
Example:
Suppose I have a post at this URL: "http://mydomain.com/example-post-page"
Now if I try to access it via "http://mydomain.com/example-post-page/", I am redirected to the first, original URL. The same is true for several other small changes to the URL: all of them redirect to the actual permalink URL.
My goal is to control this redirection. That's why, when I am on "http://mydomain.com/example-post-page/", I would like to know (my code is in the root index.php) whether this is the original permalink or something else, before it redirects.
If you are trying to do this in PHP, you can ask the server what the requested URL was. Try accessing $_SERVER and see if it has what you are looking for.
(Protip: to see an entire variable in PHP, use print_r($_SERVER);)
I executed the command on my WordPress site and got back:
Array
(
[SERVER_SOFTWARE] => Apache
[REQUEST_URI] => /wordpress_NEWURL/cookbook/
[REDIRECT_SCRIPT_URL] => /wordpress_NEWURL/cookbook/
[REDIRECT_SCRIPT_URI] => http://openmdao.org/wordpress_NEWURL/cookbook/
[REDIRECT_HTTPS] => off
[REDIRECT_X-FORWARDED-PROTO] => http
[REDIRECT_X-FORWARDED-SSL] => off
[REDIRECT_STATUS] => 200
[SCRIPT_URL] => /wordpress_NEWURL/cookbook/
[SCRIPT_URI] => http://openmdao.org/wordpress_NEWURL/cookbook/
[HTTPS] => off
[X-FORWARDED-PROTO] => http
[X-FORWARDED-SSL] => off
[HTTP_HOST] => openmdao.org
[HTTP_X_FORWARDED_HOST] => openmdao.org
[HTTP_X_FORWARDED_SERVER] => openmdao.org
[HTTP_X_FORWARDED_FOR] => 128.156.10.80
[HTTP_HTTP_X_FORWARDED_PROTO] => http
[HTTP_HTTPS] => off
[HTTP_X_FORWARDED_PROTO] => http
[HTTP_X_FORWARDED_SSL] => off
[HTTP_CONNECTION] => close
[CONTENT_LENGTH] => 92
[HTTP_CACHE_CONTROL] => max-age=0
[HTTP_ORIGIN] => http://openmdao.org
[HTTP_USER_AGENT] => Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5
[CONTENT_TYPE] => application/x-www-form-urlencoded
[HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
[HTTP_REFERER] => http://openmdao.org/wordpress_NEWURL/cookbook/
[HTTP_ACCEPT_ENCODING] => gzip,deflate,sdch
[HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
[HTTP_ACCEPT_CHARSET] => ISO-8859-1,utf-8;q=0.7,*;q=0.3
[HTTP_COOKIE] => wp-settings-1=m7%3Do%26m11%3Do%26wplink%3D0%26editor%3Dtinymce%26hidetb%3D1%26align%3Dright; wp-settings-time-1=1341576701; wordpress_test_cookie=WP+Cookie+check; wordpress_logged_in_7f72de226f3f83cb831f7a36bd420125=admin%7C1342021433%7C1c54f9729f967300e8d1ecc80eea7e38; w3tc_referrer=http%3A%2F%2Fopenmdao.org%2Fwordpress_test%2Fsdk-1.5.6.2%2F_samples%2F; wp-settings-1=m7%3Do%26m11%3Do%26wplink%3D0; wp-settings-time-1=1339595284; greeting_set=True; _gauges_unique_month=1; _gauges_unique_year=1; _gauges_unique=1; wordpress_test_cookie=WP+Cookie+check; PHPSESSID=0302b71abfbfd5a8d7854ac168aba2f9; sessionid=7b87104b87d95624cab0fe282079aa07; csrftoken=e72c25d0cdc3a6682c89e4680742ff3b; __utma=192130932.1535996015.1339438190.1341938618.1341945509.58; __utmc=192130932; __utmz=192130932.1339605598.7.2.utmcsr=rss|utmccn=new-website|utmcmd=rss
[HTTP_VIA] => 1.1 www-fw.grc.nasa.gov 8B58912A
[PATH] => /usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/home/ph/trunk/python-hosting.com/new_site/monitor:/home/ph/trunk/python-hosting.com/new_site/manual_scripts:/root/bin
[SERVER_SIGNATURE] =>
[SERVER_NAME] => openmdao.org
[SERVER_ADDR] => 127.0.0.1
[SERVER_PORT] => 80
[REMOTE_ADDR] => 128.156.10.80
[DOCUMENT_ROOT] => /home/openmdao/webapps/_
[SERVER_ADMIN] => [no address given]
[SCRIPT_FILENAME] => /home/openmdao/webapps/wp_test/index.php
[REMOTE_PORT] => 51907
[REDIRECT_URL] => /wordpress_NEWURL/cookbook/
[GATEWAY_INTERFACE] => CGI/1.1
[SERVER_PROTOCOL] => HTTP/1.0
[REQUEST_METHOD] => POST
[QUERY_STRING] =>
[SCRIPT_NAME] => /wordpress_NEWURL/index.php
[PHP_SELF] => /wordpress_NEWURL/index.php
[REQUEST_TIME] => 1341948361
[argv] => Array
(
)
[argc] => 0
)
Any one of those parameters can be accessed with $_SERVER['PARAMETER'] from anywhere in your script.
I hope this was helpful!
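As a sketch of how the values in that dump could be combined into the URL that was actually requested, assuming the HTTPS, HTTP_HOST and REQUEST_URI entries are populated as shown there (requested_url() is a hypothetical helper, not a WordPress or PHP built-in):

```php
<?php
// Hypothetical helper: rebuild the requested URL from a $_SERVER-style array.
function requested_url(array $server): string
{
    // Some servers report "off" instead of leaving HTTPS unset.
    $https  = isset($server['HTTPS'])
        && $server['HTTPS'] !== ''
        && strtolower($server['HTTPS']) !== 'off';
    $scheme = $https ? 'https' : 'http';
    $host   = $server['HTTP_HOST'] ?? ($server['SERVER_NAME'] ?? 'localhost');
    $uri    = $server['REQUEST_URI'] ?? '/';

    return $scheme . '://' . $host . $uri;
}

// Values copied from the dump above.
echo requested_url(array(
    'HTTPS'       => 'off',
    'HTTP_HOST'   => 'openmdao.org',
    'REQUEST_URI' => '/wordpress_NEWURL/cookbook/',
)), "\n";

// In a real request you would call: requested_url($_SERVER)
```

Because this reads only the incoming request, it can run at the very top of index.php, before WordPress decides whether to redirect.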
Do you mean the_permalink? http://codex.wordpress.org/Function_Reference/the_permalink
OK, at last I got it done in the following way, and it seems fine for me:
1) Receive the approximate URL.
2) Make a cURL request using this approximate URL.
3) Get the response URL.
4) Return the response URL.
Code given below:
function get_actual_url($approx_url){
    $curlObj = curl_init();
    curl_setopt($curlObj, CURLOPT_URL, $approx_url);
    curl_setopt($curlObj, CURLOPT_RETURNTRANSFER, true);
    // Follow redirects so that the final URL is the one reported below
    curl_setopt($curlObj, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curlObj, CURLOPT_ENCODING, "gzip");
    curl_setopt($curlObj, CURLOPT_TIMEOUT, 180);
    $content = curl_exec($curlObj);
    // $info["url"] holds the last effective URL after all redirects
    $info = curl_getinfo($curlObj);
    curl_close($curlObj);
    if(!empty($info) && !empty($info["url"])){
        return $info["url"];
    }
    else {
        return $approx_url;
    }
}
I need to get the raw server response, with headers. This also means that gzipped or deflated content should still be compressed. I don't want any changes done to what is received.
Is this possible with PHP?
I tried with curl, but that doesn't seem to work. I set these to zero:
CURLOPT_HTTP_CONTENT_DECODING => 0,
CURLOPT_HTTP_TRANSFER_DECODING => 0,
But no help.
I tried with fsockopen but that seems to uncompress automatically as well.
Anything else?
Edit: these are all my cURL options:
$options = array(CURLOPT_URL => 'http://www.example.com/',
CURLOPT_CONNECTTIMEOUT => 20,
CURLOPT_HEADER => 1,
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_FOLLOWLOCATION => 1,
CURLOPT_USERAGENT => $user_agent,
//CURLOPT_CUSTOMREQUEST => 'HEAD',
//CURLOPT_NOBODY => true,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
//CURLOPT_HTTP_CONTENT_DECODING => 0,
//CURLOPT_HTTP_TRANSFER_DECODING => 0,
CURLOPT_BINARYTRANSFER => 1,
CURLOPT_HTTPHEADER => array('Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language' => 'en-us',
'Accept-Encoding' => 'gzip, deflate'));
Thanks.
fsockopen won't automatically decompress, but if you are rolling your own HTTP client, you must tell the server you're ready to accept a gzipped response; otherwise the server will send you an uncompressed one.
You can do this by including an Accept-Encoding header in your request, e.g.
Accept-Encoding: compress, gzip
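A minimal sketch of such a hand-rolled request, assuming a plain HTTP server on port 80. Both build_raw_request() and fetch_raw() are hypothetical helpers. HTTP/1.0 is used here so the server does not answer with chunked transfer encoding, which would otherwise have to be handled separately.

```php
<?php
// Hypothetical helper: build a minimal GET request that asks for gzip.
function build_raw_request(string $host, string $path = '/'): string
{
    return "GET {$path} HTTP/1.0\r\n"
         . "Host: {$host}\r\n"
         . "Accept-Encoding: gzip\r\n"
         . "Connection: close\r\n"
         . "\r\n";
}

// Hypothetical helper: send the request and return headers and body
// exactly as they came off the socket, with no decoding applied.
function fetch_raw(string $host, string $path = '/', int $port = 80): string
{
    $socket = fsockopen($host, $port, $errno, $errstr, 20);
    if ($socket === false) {
        throw new RuntimeException("Connection failed: {$errstr} ({$errno})");
    }
    fwrite($socket, build_raw_request($host, $path));
    $raw = stream_get_contents($socket);
    fclose($socket);
    if ($raw === false) {
        throw new RuntimeException('Reading the response failed');
    }
    return $raw;
}

// Usage, left commented out so the sketch makes no network call:
// $raw = fetch_raw('www.example.com');
```

If the server honours the header, the part of $raw after the blank line is still gzip-compressed; the Content-Encoding response header tells you whether it did.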