I'm trying to log in to Gmail with cURL, but Google says it doesn't see any cookies:
Error. Most likely, your browser does not set a cookie. Check this
setting, or open a new browser window.
Here is my code:
$tmpfname = dirname(__FILE__).'/cookie.txt';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://accounts.google.com/ServiceLoginAuth");
curl_setopt($ch, CURLOPT_POSTFIELDS, "GALX=MS-tSuNi3pg&continue=https%3A%2F%2Fmail.google.com%2Fmail%2F&service=mail&hl=ru&_utf8=%E2%98%83&bgresponse=%21A0Lf0QLPRaBwlUTp7ftMkStPvwIAAAAWUgAAAAcqAQFYvp-abJNRjR3DH8MqLNMd2V1lZIUZ8WD7-V22z_v8Lc-TfjBVXX8E0ElzA2hSNiaMERRhArrPj3NR1EuQ7UUE7KbsJ3DPYmn7jsKtGklYfxzO3Uonm6nKj_cfATL8wXFt_ngIdwFI0rY8J_2Kb51KDoxtcx6eEYfD8P0m-t6NcAITwyy3_0EG-1R12MNb2Lc7uLcMW76sHRTt2vc1zV1SjofqaYf73xJ5r-uatz_VTHQ_mT2JBU-92L32nx8qu9JF5__SAcj3-2umIjEiQvqd7KVxuFrSpKHiOGWkzr7CG9DMwFJVYeNvaE0liWW549s7yNcWIu_ERgau0KR0wyIC9A&pstMsg=1&dnConn=&checkConnection=youtube%3A137%3A1&checkedDomains=youtube&Email=*******&Passwd=*******&signIn=%D0%92%D0%BE%D0%B9%D1%82%D0%B8&PersistentCookie=yes&rmShown=1");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch, CURLOPT_COOKIEJAR, $tmpfname);
curl_setopt($ch, CURLOPT_COOKIEFILE, $tmpfname);
$result = curl_exec($ch);
Cookies are only a minor part of the problem.
Google does not make it easy to log in with curl. Because Google sets cookies in a 301 redirect, curl may not keep them. Sometimes you also have to grab values out of hidden form fields such as <input type="hidden" name="_NAME" value="_VALUE">.
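For the hidden-field part, here is a minimal sketch of how the values could be collected with PHP's DOM extension. extractHiddenFields() is a hypothetical helper name, and the sample HTML and e-mail address are made up for illustration; the real login page will contain different fields.

```php
<?php
// Sketch: collect name/value pairs from <input type="hidden"> elements.
// extractHiddenFields() is a hypothetical helper, not part of cURL.
function extractHiddenFields(string $html): array
{
    $fields = [];
    $doc = new DOMDocument();
    // Real-world login pages are rarely valid HTML; silence parser warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden'
            && $input->getAttribute('name') !== '') {
            $fields[$input->getAttribute('name')] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Made-up sample markup standing in for the login form.
$html = '<form><input type="hidden" name="GALX" value="abc123">'
      . '<input type="text" name="Email"></form>';
$fields = extractHiddenFields($html);

// http_build_query() URL-encodes the values for CURLOPT_POSTFIELDS.
$post = http_build_query($fields + ['Email' => 'user@example.com']);
```

The resulting $post string can be passed to CURLOPT_POSTFIELDS together with the fields you fill in yourself.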
You have some work ahead of you. It's not as simple as you may think, and it certainly cannot be done with one curl HTTP GET. Gmail is a nightmare.
Along with about 50 HTTP GET and POST Requests on top of the redirects, Google also uses over 100 JS XHR GET and POST requests and tons of JSON. Information is embedded as cookies, URL Query Strings, and POST Data.
The big hurdle is that Gmail will not function without JavaScript. Curl has no built-in JavaScript engine, and without JavaScript you get nothing from Gmail.
It is not an impossible feat; it can certainly be done. The question is how long it will take you. My guess is about a year to get from logging in to retrieving and sending mail. That is why I suggest you try one function first. Then you will get a taste of what is ahead of you.
What you may be able to do is go to the page where you want to post or scrape the data from, record current cookies then click the feature, then get all the HTML, JS, and XHR requests and responses. You may be able to duplicate that one function without JavaScript. But you have to replace some/most/all (not sure which) of the JS requests with one of your own using curl.
Be prepared to spend some time updating your code, as Google is a moving target. They keep changing the way things are done and you'll have to keep up with them.
The cookie part, though, is simple.
This is my work around logging into Google Voice
First I would go to https://www.google.com/voice/
Google puts the cookies in a 301 Redirect. Then four more 302 redirects a little further down the road.
So I do not use:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
I use:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
Then I will need access to the headers
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
Grab Cookies from Response Header
$data = curl_exec($ch);
$cookies = array();
if (curl_errno($ch)){
$data .= 'Retrieve Base Page Error: ' . curl_error($ch);
}
else {
$skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
$head = substr($data,0,$skip);
$e = 0;
while(true){
// Find the next Set-Cookie line, starting after the previous cookie
$s = stripos($head,'Set-Cookie: ',$e);
if ($s === false){break;}
$s += 12;
// The name=value pair ends at the first ';' or at the end of the line
$e = $s + strcspn($head,";\r\n",$s);
$cookie = substr($head,$s,$e-$s);
$s = strpos($cookie,'=');
$key = substr($cookie,0,$s);
$value = substr($cookie,$s); // keeps the leading '='
$cookies[$key] = $value;
}
}
Then create cookie for request header:
$cookie = '';
$delim = '';
foreach ($cookies as $k => $v){
$cookie .= "$delim$k$v";
$delim = '; ';
}
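The header-scanning loop above can also be written with a regular expression. This is only a sketch: parseSetCookies() and buildCookieHeader() are hypothetical helper names, the header block is a made-up sample, and cookie attributes such as Path or Expires are deliberately ignored.

```php
<?php
// Sketch: parse every Set-Cookie line of a raw header block into name => value.
// parseSetCookies() and buildCookieHeader() are hypothetical helper names.
function parseSetCookies(string $head): array
{
    $cookies = [];
    // One match per Set-Cookie line; attributes after the first ';' are ignored.
    preg_match_all('/^Set-Cookie:\s*([^=;\r\n]+)=([^;\r\n]*)/mi', $head, $m, PREG_SET_ORDER);
    foreach ($m as $match) {
        $cookies[trim($match[1])] = $match[2];
    }
    return $cookies;
}

// Join the pairs into the value of a "Cookie:" request header.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = "$name=$value";
    }
    return implode('; ', $pairs);
}

// Made-up sample header block.
$head = "HTTP/1.1 302 Found\r\n"
      . "Set-Cookie: GALX=kBQZRL4MXuU; Path=/; Secure\r\n"
      . "Set-Cookie: GAPS=1:abc; HttpOnly\r\n"
      . "Location: https://accounts.google.com/ServiceLogin\r\n\r\n";
$cookies = parseSetCookies($head);
$cookieHeader = buildCookieHeader($cookies);
```

Note that in this variant the stored values do not include the leading '=', unlike the loop above.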
Then catch their redirect location url
$info = curl_getinfo($ch);
$url = $info['redirect_url'];
Look to see if it is a redirect.
if (strlen($url) < 8){
$url='https://accounts.google.com/ServiceLogin';
}
sleep(2);
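If you already have the raw header text in hand, the redirect target can also be read from the Location header directly. A rough sketch, where nextUrl() is a hypothetical helper and the header block is a made-up sample; relative Location values are not resolved here.

```php
<?php
// Sketch: pull the redirect target from a raw header block, with a fallback.
// nextUrl() is a hypothetical helper name used only for illustration.
function nextUrl(string $head, string $fallback): string
{
    if (preg_match('/^Location:\s*(\S+)/mi', $head, $m)) {
        return $m[1];
    }
    // No Location header: not a redirect, so use the fallback URL.
    return $fallback;
}

$fallback = 'https://accounts.google.com/ServiceLogin';
$redirect = "HTTP/1.1 302 Found\r\nLocation: https://accounts.google.com/next\r\n\r\n";
$plain    = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";

$a = nextUrl($redirect, $fallback);
$b = nextUrl($plain, $fallback);
```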
Then put the cookie in the header:
$request = array();
$request[] = "Host: accounts.google.com";
$request[] = "Pragma: no-cache";
$request[] = "Cookie: $cookie";
$request[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$request[] = "User-Agent: MOT-V9mm/00.62 UP.Browser/6.2.3.4.c.1.123 (GUI) MMP/2.0";
$request[] = "Accept-Language: en-US,en;q=0.5";
$request[] = "Connection: keep-alive";
$request[] = "Cache-Control: no-cache";
curl_setopt($ch, CURLOPT_HTTPHEADER, $request);
And when there is a Referer (That is not how I spell Referrer, the guy that added it to HTTP spelled it wrong)
$request[] = 'Referer: https://accounts.google.com/ServiceLogin?service=grandcentral&passive=1209600&continue=https%3A%2F%2Fwww.google.com%2Fvoice&followup=https%3A%2F%2Fwww.google.com%2Fvoice&ltmpl=open';
Get the cookies the same way from the redirect page as previously
Then grab the GALX cookie. Then do the Next Request.
$galax = substr($cookies['GALX'], 1); // the parsed value still starts with '=', so drop it
$post = "GALX=$galax&continue=https://www.google.com/voice&followup=https://www.google.com/voice&service=grandcentral&ltmpl=open&_utf8=%E2%98%83&bgresponse=js_disabled&Email=assratbastard@gmail.com&Passwd=$password&signIn=Sign+in&PersistentCookie=yes&rmShown=1";
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_POST, true);
Lots more of those before you get in.
An example of things further down the road. This is just one HTTP POST request:
Query String Data
continue=https://mail.google.com/mail/
service=mail
sarp=1
Cookies
Cookie: GAPS=1:tx6dl5mwyjNKgiOEtjcvTvzGSNZqQQ:X9TX1quYjhjQfjho; GALX=kBQZRL4MXuU; GMAIL_RTT=216; GMAIL_LOGIN=T1420606553375/1420606553375/1420606580476; NID=67=LbIeO3Xwxjs0nGgZaTOTLrhdJ5bb7_Ce-de10-rKYZVzKVdM4XoKVr3T18sb9NLg_ghRkDoa-G-6vb66FdMR6uIMstAPd0qdQa18s1zGTHtvSOv8lRXaAdDDzqp8p8mguo0xA6VZnz_vV1JnoHMfulS9yoO4PA; SID=DQAAAAkBAADu9krli4XZTP6IWYOSEsmDBjYazF_ywtDmORhqZ8OeVGaC_K-3lSy4cNosYYXfG_-hrMd31fLPbAljFRt3Z5tpOAMLUPmzluYZC0_y1NTWMJ4D7I_bpIgiAsZO5oT9EFobf0vX50KfHLVKTHCetrgckDmLtMd4EkrOqsLkAAK9prD440GMqgCRoICNxLRVu-kS_-5N9mRrIuC3xsOsdi27Qfk4wPOqYNcO5sT1RGGgv1y7jwLqvHzHtz5DmlfARHv9lDtnKM8Gy3jo2Ax_7u8OrwIUP7Tcmz_9FJcj_q_Cz1cu94DbMHDN_qiUIwL1xYzClsdu3Z8EFiHDiEc8esXLg5_HkXPOPOvy-iGO9gTdLQ; LSID=ss:DQAAAAsBAABw1hSyS55goXFvcpcXQZQALGca7K26kfQ6HBc4c_agj3DJe_qMBMzqh0WXc3KNQ8OwP0lCPauBEhr3AdD0DyhCZQDFuIoglHPiw91_r-KIEZ62KjSmuTepv1UYDDEDiZeB5rYEOw4L6l2sOpOBmgBOZOyLfum4azJBLpEYo9kvMsX-OPUlqEJF0z0UMKM-R8Wh1Oxydr0j5R97U_juccmU6DqVsm0DTrP7rjPfv7cfZJ1wdqVemacZdfWjabrExrsXC21fin8ZUtXQI1dL8twk7fM7vo4fvKNdKoACBRUZpxltL9sTtBV-6QcynJF6Km5J6ICynuU3rtZvQNOS5VPIeajbcea7MI5p85XgweiVnw; HSID=A_8tAVmju5qj5J98Y; SSID=A_mBRb5lH8DXaOmm7; APISID=iNCCKNUIqLSXwe-P/AY-19Si5OAZhIv1aj; SAPISID=otuPxzrzp-BltlGm/AKleRqZyVwfhwwCB0; ACCOUNT_CHOOSER=AFx_qI5lJUnyOaSRIf2vxUKACWjny3nvliEw3h7h6NlUUHsklUqbMGc5NH7u6m6u4OSw8s5QqcsmV_fYx7-szFy4TVyvuA6A_itoAFoG-6B9txvdhP2T9gXFJzeRVMKHCQlRie0vibTz
POST Data
GALX=kBQZRL4MXuU&continue=https%3A%2F%2Fmail.google.com%2Fmail%2F&service=mail&rm=false&ltmpl=default&scc=1&ss=1&_utf8=%E2%98%83&bgresponse=%21A0JLwawFPV34bUQ3xjl5OdgBcQIAAABfUgAAABcqAQSqazjYJpDg-kapblPmSujml011OygP0EUqjVds9Vk_fynd6-gmQ4WyRLVnd1EWIKp_M68OiYoQpy-BsmXpxQoIqbS7pIne_scYIkttMyj3BqWGjYqKEQBS0Ynb39G7n7gVBo_e406b1Ww7Ny9f3nouYPJbOG-kMRdGsuhzBAGwT9v-vMum2Z36_N8gThf12ZQ0gNa1hmEUALqwF0H5leXH7Ex7JhXtGppJ7SiuFjvJYgs0SO_L1ptI5o6eHgud_ti8178KC5KXi0WheHrl5kM2NK6Dn3HhH85-5FTD4P74_HKAbqgH72IeKOosril6qqWekPx_ChXOmSLr6itlnhZjdbEr7g&pstMsg=1&dnConn=&checkConnection=youtube%3A384%3A0&checkedDomains=youtube&Email=g%40assratbastard#gamil.com&$password&signIn=Sign+in&PersistentCookie=yes&rmShown=1
I'm making a request to a LinkedIn page and receiving an "HTTP/1.1 999 Request denied" response.
This happens when I run the code on AWS EC2.
On localhost everything works fine.
Here is a sample of my code that fetches the HTML of the page:
<?php
error_reporting(E_ALL);
$url= 'https://www.linkedin.com/pulse/5-essential-strategies-digital-michelle';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
var_dump($response);
var_dump($info);
I don't need the whole page content, just the meta tags (title, og tags).
Note that status code 999 does not exist in the HTTP/1.1 specification; it is presumably a custom code (it sounds like a joke).
LinkedIn doesn't allow direct access. The probable reasons for blocking requests that come from other web servers are to:
Prevent unauthorized copying of information
Prevent intrusions
Prevent abuse through excessive requests
Force use of the API
The IP addresses of some servers are blocked, while the IPs of residential ISPs are not. When you open LinkedIn in a web browser, you are using the IP of your internet provider.
The only way to access the data is to use their APIs. See:
Accessing LinkedIn public pages using Python
Heroku requests return 999
Note: The search engines like Google and Bing probably have their IPs in a "whitelist".
<?php
header("Content-Type: text/plain");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.linkedin.com/company/technistone-a-s-");
$header = array();
$header[] = "Host: www.linkedin.com";
$header[] = "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0";
$header[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[] = "Accept-Language: en-US,en;q=0.5";
$header[] = "Accept-Encoding: gzip, deflate, br";
$header[] = "Connection: keep-alive";
$header[] = "Upgrade-Insecure-Requests: 1";
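curl_setopt($ch, CURLOPT_ENCODING, "gzip");
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
// Return the body from curl_exec() instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$my_var = curl_exec($ch);
echo $my_var;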
LinkedIn does not support the default encoding 'identity', so if you set the header
'Accept-Encoding': 'gzip, deflate'
you should get a response, but you will have to decompress it.
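In PHP the decompression step could look roughly like the sketch below. It assumes the zlib extension is available; decodeBody() is a hypothetical helper name and the sample body is made up. If you set CURLOPT_ENCODING, cURL decompresses the body for you and none of this is needed.

```php
<?php
// Sketch: decode a response body according to its Content-Encoding value.
// decodeBody() is a hypothetical helper; it needs PHP's zlib extension.
function decodeBody(string $body, string $contentEncoding): string
{
    switch (strtolower(trim($contentEncoding))) {
        case 'gzip':
            $decoded = gzdecode($body);
            break;
        case 'deflate':
            // HTTP "deflate" is normally zlib-wrapped data.
            $decoded = gzuncompress($body);
            break;
        default:
            return $body; // identity or unknown: leave untouched
    }
    if ($decoded === false) {
        throw new RuntimeException('Could not decompress response body');
    }
    return $decoded;
}

// Made-up sample: compress a snippet, then decode it again.
$compressed = gzencode('<title>Example</title>');
$plain = decodeBody($compressed, 'gzip');
```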
I ran into this while doing local web development and using the LinkedIn badge feature (profile.js). I was only getting the 999 Request denied in Chrome, so I just cleared my browser cache and localStorage and it started to work again.
UPDATE - Clearing cache was just a coincidence and the issue came back. LinkedIn is having issues with their badge functionality.
I submitted a help thread to their forums.
https://www.linkedin.com/help/linkedin/forum/question/714971
First-time asker, but you have helped me many times over the years. Great job! I'm asking because I'm struggling with an issue I'm unable to solve, and as my PHP (and cURL) knowledge is so limited, I'm lost.
The Background
I'm developing a JavaScript app that needs to connect to several different servers and make XMLRPC calls to them. The app works perfectly when run locally (with cross-domain security disabled), but to run it online I knew I had to use a cross-domain proxy. After several days of searching and investigating I didn't find one that could do the job, so I managed to write one myself (not without blood and sweat). Know what? It (almost) works!!!
This is my proxy.php:
<?
function readHeader($ch, $header) {
//extracting data to send it to the client
$headers = explode("\n", $header);
foreach ($headers as $item) {
// $string= str_replace($delimiter, $mainDelim, $string);
if (strpos($item, 'Set-Cookie:') !== false) {
$cookie = trim(substr($item,strlen('Set-Cookie:')));
header('X-Set-Cookie:' . $cookie);
} else {
header($item);
}
}
return strlen($header);
}
$allowed_domains = array('domain1.com', 'domain2.com');
header('Content-Type: text/html; charset=iso-8859-1');
$REFERRER = $_SERVER['HTTP_REFERER'];
if ($REFERRER == '') {
// What do you do here?
exit(header('Location: index.html'));
}
$domain = substr($REFERRER, strpos($REFERRER, '://') + 3);
$domain = substr($domain, 0, strpos($domain, '/'));
if (!in_array($domain, $allowed_domains)) {
exit(header('Location: index.html'));
}
$XMLRPC_SERVICE = $_SERVER['HTTP_X_PROXY_URL'];
$xml = $HTTP_RAW_POST_DATA;
$header[] = "Content-type: text/xml; charset=utf-8";
$header[] = "Connection: close";
$header[] = "Accept: text/xml";
if ($_SERVER['HTTP_X_SET_COOKIE'])
$cookie = $_SERVER['HTTP_X_SET_COOKIE'];
if ($_SERVER['HTTP_X_PROXY_URL'] === "other-domain.com")
$header[] = "x-custom-header: value";
$ch = curl_init($XMLRPC_SERVICE);
//URL to post to
curl_setopt($ch, CURLOPT_URL, $XMLRPC_SERVICE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
if ($cookie)
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
curl_setopt($ch, CURLOPT_POSTFIELDS, $xml);
curl_setopt($ch, CURLOPT_HEADERFUNCTION, 'readHeader');
$response = curl_exec($ch);
if (curl_errno($ch)) {
echo curl_error($ch);
} else {
curl_close($ch);
echo $response;
}
?>
The Issue
As I've said, I got it working partially. In fact, it works for most of the usual XMLRPC needs.
It gets the remote server address from the HTTP_X_PROXY_URL header of the request, makes the call with cURL, and returns the values to the JavaScript client without issues.
The problem comes when I need to get or send a session cookie (probably when getting it, because the cookie value is quite different when I make calls directly from the app locally). In any case, I can't get the cookies to work. As you can see, I'm working around the browser's Set-Cookie protection on AJAX calls with my own X-Set-Cookie header, which the proxy uses or translates accordingly. But the cookies still don't work, and they are critical for the app's functionality.
Is there a way to redirect the user to another site and fake the referrer at the same time?
I tried this with my code. I know it's wrong, but this is as far as I could get.
<?php
$page1 = "http://google.com"; $page2 = "http://yahoo.com/";
$mypages = array($page1,$page2);
$myrandompage = $mypages[mt_rand(0, count($mypages) -1)];
$sites = array_map("trim", file("links.txt"));
$referer = $sites[array_rand($sites)];
function fake_it($url, $ref, $agent)
{
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, $agent);
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_REFERER, $ref);
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 5000);
$html = curl_exec($curl);
curl_close($curl);
// returns the content provided by the site
return $html;
}
//Below would send a request to the url, with the second parameter as the referrer
echo fake_it($myrandompage, $referer,$_SERVER['HTTP_USER_AGENT']);
?>
What I want is to go from refer.php -> google.com (referer = some other url).
What you can do is redirect the user to an HTTPS site, as damianb described, and also do a meta refresh in your redirect.php script:
redirect.php: (e.g https://www.myurl.com/redirect.php?url=http://www.someotherurl.com)
<?php $destination = isset($_GET['url']) ? $_GET['url'] : ''; ?>
<html><head><meta http-equiv="refresh" content="0;url=<?php echo htmlspecialchars($destination, ENT_QUOTES); ?>/"></head><body></body></html>
Now you fight with two weapons (HTTPS and, for browsers that still send the referer, a refresh tag).
According to RFC 2616 (as summarized on Wikipedia):
1. "If a website is accessed from a HTTP Secure (HTTPS) connection and a link points to anywhere except another secure location, then the referer field is not sent"
Since this is unfortunately not fully true in practice, you can also consider this:
2. "Most web browsers do not send the referer field when they are instructed to redirect using the "Refresh" field. This does not include some versions of Opera and many mobile web browsers. However, this method of redirection is discouraged by the World Wide Web Consortium (W3C).[7]"
http://en.wikipedia.org/wiki/HTTP_referrer#Referer_hiding
Tested with Chrome and Firefox. Good luck!
I don't think you can change referrers at all.
The only way I know of to trash referrers is to either proxy the page loads with something like cURL (which is bad idea, bad bad), or I believe you can go from an HTTPS page outbound.
I am not absolutely sure, but I seem to recall that browsers don't send referrers when they're coming from an HTTPS site for security reasons.
Lemme double-check.
EDIT: According to RFC 2616, browsers should not send referrers when coming from an HTTPS secured site.
reference: https://www.rfc-editor.org/rfc/rfc2616#section-15.1.3
Clients SHOULD NOT include a Referer header field in a (non-secure)
HTTP request if the referring page was transferred with a secure
protocol.
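To make the quoted rule concrete, here is a small sketch of the decision it describes. shouldSendReferer() is a hypothetical name used only for illustration, and the URLs are made up; real browsers apply further rules on top of this one.

```php
<?php
// Sketch of the RFC 2616 section 15.1.3 rule quoted above.
// shouldSendReferer() is a hypothetical name for illustration only.
function shouldSendReferer(string $fromUrl, string $toUrl): bool
{
    $from = strtolower((string) parse_url($fromUrl, PHP_URL_SCHEME));
    $to   = strtolower((string) parse_url($toUrl, PHP_URL_SCHEME));
    // Secure origin, non-secure target: the Referer is withheld.
    return !($from === 'https' && $to !== 'https');
}

$secureToPlain  = shouldSendReferer('https://a.example/page', 'http://b.example/');
$secureToSecure = shouldSendReferer('https://a.example/page', 'https://b.example/');
$plainToPlain   = shouldSendReferer('http://a.example/page', 'http://b.example/');
```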
<?php
if(isset($_GET['token']))
{
$url="http://www.google.com/calendar/feeds/default/allcalendars/full";
$useragent="PHP 5.2";
$header=array( "GET /accounts/AuthSubSessionToken HTTP/1.1",
"Content-Type: application/x-www-form-urlencoded",
"Authorization: AuthSub token=".$_GET['token'],
"User-Agent: PHP/5.2",
"Host: https://www.google.com",
"Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2",
"Connection: keep-alive"
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
print_r($data);
}
?>
The result is "page not found". However, when I open http://www.google.com/calendar/feeds/default/allcalendars/full in Firefox, it returns an XML file. So I think my code may be wrong, but I can't find the error. :(
That is because you are accessing Google Calendar via your personal port. Whenever you access that specific URL, Google checks to see if you are logged in. If not, it sends a 404. If you are, it outputs the calendar based on the settings you provided. That URL does not specify a specific calendar that it's supposed to pull from the site, and it cannot use the cookies stored on the user's computer because it is being fetched from your server, which will not have any cookies for a calendar. When I try to access that page without logging on, I get a 401 Authorization Required error, which I bet is what PHP is getting and you just don't realize it.
You need to go into your Google Calendar settings and find the embedding options to find a URL that is specific to your account so that it will always fetch an XML feed for your calendar.
Read more about the Google 'Calendar Address' here: http://www.google.com/support/calendar/bin/answer.py?answer=34578
View from other applications: http://www.google.com/support/calendar/bin/answer.py?hl=en&answer=37648
I think that you may be overriding the URL with this line in the header:
GET /accounts/AuthSubSessionToken HTTP/1.1
I think that will point CURL to http://www.google.com/accounts/AuthSubSessionToken
What happens when you remove it?
I got it. I changed it like this:
<?php
function make_api_call($url, $token)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$curlheader = array(sprintf("Authorization: AuthSub token=\"%s\"", $token));
curl_setopt($ch, CURLOPT_HTTPHEADER, $curlheader);
$output = curl_exec($ch);
curl_close($ch);
return $output;
}
function get_session_token($onetimetoken) {
$output = make_api_call("https://www.google.com/accounts/AuthSubSessionToken", $onetimetoken);
if (preg_match("/Token=(.*)/", $output, $matches))
{
$sessiontoken = $matches[1];
} else {
echo "Error authenticating with Google.";
exit;
}
return $sessiontoken;
}
if(isset($_GET['token']))
{
$sessiontoken=get_session_token($_GET['token']);
$accountxml = make_api_call("http://www.google.com/m8/feeds/contacts/yourmail@gmail.com/full", $sessiontoken);
print_r($accountxml);
}
else
{
$next=urlencode("http://www.mysteryzillion.org/gdata/index.php");
$scope=urlencode("http://www.google.com/m8/feeds/contacts/yourmail@gmail.com/full");
?>
Click here to authenticate through Google.
<?
}
?>
I have a PHP script that makes an HTTP request on behalf of the browser and then outputs the response to the browser. The problem is that when I click links on this page from the browser, the site complains about cookie variables. I'm assuming it needs the browser's cookie(s) for the site.
How can I intercept the cookies and forward them to the remote site?
This is how I forward all browser cookies to curl and also return all cookies for the curl request back to the browser. For this I needed to solve some problems like getting cookies from curl, parsing http header, sending multiple cookies and session locking:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// get http header for cookies
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
// forward current cookies to curl
$cookies = array();
foreach ($_COOKIE as $key => $value)
{
if ($key != 'Array')
{
$cookies[] = $key . '=' . $value;
}
}
curl_setopt( $ch, CURLOPT_COOKIE, implode(';', $cookies) );
// Stop session so curl can use the same session without conflicts
session_write_close();
$response = curl_exec($ch);
curl_close($ch);
// Session restart
session_start();
// Separate header and body
list($header, $body) = explode("\r\n\r\n", $response, 2);
// Extract cookies from curl and forward them to the browser
// (the pattern stops before "\r" so no stray line ending reaches header())
preg_match_all('/^Set-Cookie:\s*[^\r\n]*/mi', $header, $cookies);
foreach($cookies[0] AS $cookie)
{
header($cookie, false);
}
echo $body;
In fact, it is possible. You just have to take the browser's cookie and pass it as a parameter to curl to mimic the browser.
It's like session hijacking...
Here is a sample code:
// Init curl connection
$curl = curl_init('http://otherserver.com/');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
// You can add your GET or POST param
// Retrieving session ID
$strCookie = 'PHPSESSID=' . $_COOKIE['PHPSESSID'];
// We pass the sessionid of the browser within the curl request
curl_setopt( $curl, CURLOPT_COOKIE, $strCookie );
// We receive the answer as if we were the browser
$curl_response = curl_exec($curl);
It works very well if your purpose is to call another website, but this will fail if you call your web server (the same that is launching the curl command). It's because your session file is still open/locked by this script so the URL you are calling can't access it.
If you want to bypass that restriction (call a page on the same server), you have to close the session file with this code before you execute the curl :
$curl = curl_init('http://sameserver.com/');
//...
session_write_close();
$curl_response = curl_exec($curl);
Hope this will help someone :)
From curl_setopt:
By default, libcurl always stores and loads all cookies, independent if they are session cookies or not.
However you may need to set cookies directly, which can be done using:
curl_setopt($ch, CURLOPT_COOKIE, 'foo=bar');
This sets the Cookie request header that is sent to the server (the counterpart of the Set-Cookie response header). Check that you're not using curl_setopt($ch, CURLOPT_COOKIESESSION, true), as this will make libcurl ignore some cookies.
You can't.
If you curl the request, you will need to parse the output, and replace all links so they go thru your server.
www.yourdomain.com/f?=www.someotherdomain.com/realpage
The only way this would work is if you use persistent cookies in your curl request. CURL can keep cookies itself. Assign a session ID to the cookie file (in curl) so subsequent requests get the same cookies. When a user clicks a link, you will need to curl the request again.
It is a security issue to allow site1 to set cookies for site2. Imagine if you could set cookies in the browser for PayPal and trick the user into thinking they had logged in, or perform some other malicious action.
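The link-rewriting step mentioned above could be sketched like this. It is deliberately simplistic: it uses a regular expression rather than a real HTML parser, only touches absolute http/https links in <a> tags, and both rewriteLinks() and the "url" query parameter are hypothetical names chosen for the example.

```php
<?php
// Sketch: rewrite absolute links so they route back through a proxy script.
// rewriteLinks() and the "url" parameter name are illustrative assumptions.
function rewriteLinks(string $html, string $proxyBase): string
{
    return preg_replace_callback(
        '/(<a\b[^>]*\bhref=)(["\'])(https?:\/\/[^"\']+)\2/i',
        function (array $m) use ($proxyBase): string {
            // Keep the original quote style; URL-encode the real target.
            return $m[1] . $m[2] . $proxyBase . rawurlencode($m[3]) . $m[2];
        },
        $html
    );
}

$html = '<a href="http://www.someotherdomain.com/realpage">link</a>';
$out = rewriteLinks($html, 'http://www.yourdomain.com/f?url=');
```

Your proxy script would then read the target from the query string and curl it with the cookie file that belongs to that visitor.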
The Cookie is usually sent with the HTTP request header like
Host stackoverflow.com
User-Agent ...
Accept-Language en-us,en;q=0.5
Referer http://stackoverflow.com/unanswered
Cookie bla=blabla;blubb=blu
So I guess you just have to modify the Cookie part of your header.
PiTheNumber's answer was great but I ran into some issues with it that caused it to still print the headers to the page. So I adjusted it to use the more reliable curl_getinfo function. This version also follows redirects.
public function get_page_content( $url ) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_HEADER, 1);
// Forward current cookies to curl
$cookies = array();
foreach ($_COOKIE as $key => $value) {
if ($key != 'Array') {
$cookies[] = $key . '=' . $value;
}
}
curl_setopt( $ch, CURLOPT_COOKIE, implode(';', $cookies) );
$destination = $url;
while ($destination) {
session_write_close();
curl_setopt($ch, CURLOPT_URL, $destination);
$response = curl_exec($ch);
$curl_info = curl_getinfo($ch);
$destination = $curl_info["redirect_url"];
session_start();
}
curl_close($ch);
$headers = substr($response, 0, $curl_info["header_size"]);
$body = substr($response, $curl_info["header_size"]);
// Extract cookies from curl and forward them to browser
preg_match_all('/^(Set-Cookie:\s*[^\n]*)$/mi', $headers, $cookies);
foreach($cookies[0] AS $cookie) {
header($cookie, false);
}
return $body;
}
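To illustrate why splitting on the reported header size is more robust than splitting on the first blank line, here is a small offline sketch. splitResponse() is a hypothetical helper and the response is a made-up sample; in a real request the size would come from curl_getinfo($ch, CURLINFO_HEADER_SIZE).

```php
<?php
// Sketch: split a raw cURL response (CURLOPT_HEADER on) by header size.
// splitResponse() is a hypothetical helper; $headerSize would come from
// curl_getinfo($ch, CURLINFO_HEADER_SIZE) in a real request.
function splitResponse(string $response, int $headerSize): array
{
    return [substr($response, 0, $headerSize), substr($response, $headerSize)];
}

// Two header blocks, as seen when an interim "100 Continue" precedes the reply.
$headers  = "HTTP/1.1 100 Continue\r\n\r\n"
          . "HTTP/1.1 200 OK\r\nSet-Cookie: a=b\r\n\r\n";
$response = $headers . "<html>body</html>";

list($head, $body) = splitResponse($response, strlen($headers));

// A naive split on the first blank line leaves the 200 headers in the body.
list(, $naiveBody) = explode("\r\n\r\n", $response, 2);
```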