I'm trying to avoid cURL storing the cookie session into an actual file via "CURLOPT_COOKIEJAR". So I created a method to catch / parse the cookies into a local variable - which is then used via "CURLOPT_COOKIE" to restore the cookie session.
I extract the cookies from the response headers with:
preg_match_all("/^Set-cookie: (.*?);/ism", $header, $cookies);
To use "CURLOPT_COOKIE" we take the key=value pairs and separate them with "; ". However (as far as I'm aware), CURLOPT_COOKIE doesn't allow you to throw in the various flags, i.e. expiration, secure flag, and so on.
Update 1/29/2014 6:45pm
So I think my issue actually occurs where CURLOPT_FOLLOWLOCATION comes into play; I don't think it has to do with the flags. The manual cookie session I keep doesn't seem to update when following a new location (i.e. a site has 2-3 redirects that each append cookies / session data). That would make sense, because CURLOPT_COOKIEJAR directly grabs / updates cookies sent on redirect headers. So I tried following the redirects manually while grabbing / appending the latest cookies; however, this did not work, for reasons I could not yet identify.
Update 1/30/2014 4:22pm
Almost got this figured out. Will be updating with answer shortly. It turns out my method works perfectly fine, it's just a matter of jumping through the manual redirected pages correctly.
Update 1/30/2014 4:51pm
Issue solved -- answered myself below.
So it turns out I was actually doing this correctly and my assumptions were correct.
To keep the cookie session in a variable (vs. CURLOPT_COOKIEJAR), make sure you have CURLOPT_HEADER and CURLINFO_HEADER_OUT enabled.
CURLOPT_FOLLOWLOCATION must be set to false. Otherwise your cookies won't be sent correctly, because cookies set by intermediate redirect responses never reach your variable (this is where CURLOPT_COOKIEJAR does best).
Use preg_match_all to extract the cookies. Then use strpos to find the first occurrence of "=" and split there. Some sites encode values so that they contain "=" characters, which breaks a plain "explode".
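$data = curl_exec($curl);
// Split the headers from the body using the header size cURL reports.
$header_size = curl_getinfo($curl, CURLINFO_HEADER_SIZE);
$header = substr($data, 0, $header_size);
// Capture each "name=value" pair up to the first ";" or the end of the line,
// so cookies sent without any attributes are matched as well.
preg_match_all("/^Set-Cookie:\s*([^;\r\n]*)/im", $header, $cookies);
foreach( $cookies[1] as $cookie ){
    // Split on the first "=" only; the value itself may contain "=".
    $buffer_explode = strpos($cookie, "=");
    $this->cookies[ substr($cookie,0,$buffer_explode) ] = substr($cookie,$buffer_explode+1);
}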
When making your next curl call, pass the stored cookies back in through CURLOPT_COOKIE.
if( count($this->cookies) > 0 ){
    $cookieBuffer = array();
    foreach( $this->cookies as $k=>$c ) $cookieBuffer[] = "$k=$c";
    curl_setopt($curl, CURLOPT_COOKIE, implode("; ",$cookieBuffer) );
}
Because each cookie is stored under its name, a later Set-Cookie for the same name overwrites the old value, so the latest values (e.g. a changing session ID) are always the ones sent.
Hope this helps anyone who bumps into this issue!
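The manual redirect handling described above can be sketched roughly as follows. This is a minimal sketch, not the original poster's exact code: the function names (parseSetCookies, buildCookieHeader, findLocation, followManually, curlFetch) are hypothetical, the fetcher is passed in as a callable so the loop can be tried without a network, and Location headers are assumed to be absolute URLs.

```php
<?php
// Collect "name=value" pairs from every Set-Cookie line in a raw header block.
function parseSetCookies(string $rawHeaders, array $cookies = []): array
{
    // [^;\r\n]* stops at the first ";" or at the end of the line.
    preg_match_all('/^Set-Cookie:\s*([^;\r\n]*)/mi', $rawHeaders, $m);
    foreach ($m[1] as $pair) {
        $pos = strpos($pair, '=');
        if ($pos === false) {
            continue; // not a name=value pair
        }
        // Later cookies with the same name overwrite earlier ones.
        $cookies[trim(substr($pair, 0, $pos))] = trim(substr($pair, $pos + 1));
    }
    return $cookies;
}

// Build the value for CURLOPT_COOKIE: "a=1; b=2".
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = "$name=$value";
    }
    return implode('; ', $pairs);
}

// Return the redirect target from a header block, or null if there is none.
function findLocation(string $rawHeaders): ?string
{
    return preg_match('/^Location:\s*(\S+)/mi', $rawHeaders, $m) ? $m[1] : null;
}

// Follow redirects by hand, carrying cookies from hop to hop.
// $fetch is a callable: fn(string $url, string $cookieHeader): string (raw headers + body).
function followManually(string $url, callable $fetch, int $maxHops = 5): array
{
    $cookies = [];
    for ($hop = 0; ; $hop++) {
        $response = $fetch($url, buildCookieHeader($cookies));
        $headers = explode("\r\n\r\n", $response, 2)[0];
        $cookies = parseSetCookies($headers, $cookies);
        $next = findLocation($headers);
        if ($next === null || $hop >= $maxHops) {
            break;
        }
        $url = $next; // assumption: Location is an absolute URL
    }
    return ['url' => $url, 'cookies' => $cookies, 'response' => $response];
}

// A cURL-backed fetcher with redirects disabled, as the answer requires.
function curlFetch(string $url, string $cookieHeader): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    if ($cookieHeader !== '') {
        curl_setopt($ch, CURLOPT_COOKIE, $cookieHeader);
    }
    $response = curl_exec($ch);
    curl_close($ch);
    return $response === false ? '' : $response;
}
```

Calling followManually($startUrl, 'curlFetch') would then return the final URL, the accumulated cookies, and the last raw response. Relative Location values would need to be resolved against the current URL before the next hop.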
Related
Okay, I haven't been able to find a solution to this as of yet, and I need to start asking questions on SO so I can get my reputation up and hopefully help out others.
I am making a WordPress plugin that retrieves a JSON list of items from a remote site. Recently, the site added a redirecting check for a cookie.
Upon first request without the cookie, 302 headers are provided, pointing to a second page which also returns a 302 redirect pointing to the homepage. On this second page, however, the set-cookie headers are also provided, which prevents the homepage from redirecting yet again.
When I make a cURL request to a url on the site, however, it fails in a redirect loop.
Now, obviously the easiest solution would be to fix this on the remote server. It should not be implementing that redirect for api routes. But that at the moment is not an option for me.
I have found how to retrieve the Set-Cookie header value from a 2xx response; however, I cannot figure out how to access that value when 302 headers are provided and cURL returns nothing but an error.
Is there a way to access the headers even when it reaches the maximum (20) redirects?
Is it possible to stop the execution after a set number of redirects?
How can I get this cookie's value so I can provide it in a final request?
If you use the cURL option CURLOPT_HEADER the data you get back from curl_exec will include the headers from each response, including the 302.
If you enable cookie handling in cURL, it should pick up the cookie set by the 302 response just fine unless you prefer to handle it manually.
I often do something like this when there could be multiple redirects:
$ch = curl_init($some_url_that_302_redirects);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects so redirect_count is populated
curl_setopt($ch, CURLOPT_COOKIEFILE, ''); // enable curl cookie handling
$response = curl_exec($ch);
// $response contains the headers from each response, plus the body of the last response
$info = curl_getinfo($ch); // info will tell us how many redirects were followed
for ($i = 0; $i < intval($info['redirect_count']); ++$i) {
    // get headers from each response
    list($headers, $response) = explode("\r\n\r\n", $response, 2);
    // DO SOMETHING WITH $headers HERE
    // If there was a redirect, headers will be all headers from that response,
    // including Set-Cookie headers
}
list($headers, $body) = explode("\r\n\r\n", $response, 2);
// Now $headers are the headers from the final response
// $body is the content from the final response
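A variant of the same splitting idea that does not rely on redirect_count is sketched below. It peels off header blocks for as long as the remaining text starts with an HTTP status line. The function name splitResponses is hypothetical, and the sketch assumes that intermediate (redirect) responses contribute only headers to the output and that the final body does not itself begin with something that looks like a status line.

```php
<?php
// Split a raw cURL response (CURLOPT_HEADER on, redirects followed) into
// one header block per response plus the body of the final response.
function splitResponses(string $raw): array
{
    $headerBlocks = [];
    // Keep peeling off blocks while the remainder still starts with a status line
    // such as "HTTP/1.1 302" or "HTTP/2 200".
    while (preg_match('#^HTTP/\d(?:\.\d)? \d{3}#', $raw)) {
        $parts = explode("\r\n\r\n", $raw, 2);
        $headerBlocks[] = $parts[0];
        $raw = $parts[1] ?? '';
    }
    return ['headers' => $headerBlocks, 'body' => $raw];
}
```

Each entry in 'headers' can then be scanned for Set-Cookie lines, and 'body' holds whatever followed the last header block.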
You already had problems before you started trying to add cookies into the mix. Even a single redirect is bad for performance. Using a 302 response as a means of dissociating data presentation from data retrieval under HTTP/1.1 or later is bad (it works, but it is a violation of the protocol; you should be using a 303 if you really must redirect).
Trying to set a cookie in a 3xx response will not work consistently across browsers. Setting a cookie in an Ajax response will not work consistently across browsers.
"It should not be implementing that redirect for api routes"
Maybe the people at the remote site are trying to prevent you leeching their content?
Fetch the homepage first in an iframe to populate the cookie and record a flag in your domain on the browser.
I actually found another SO question (of course, after I posted) that led me in the right direction to make this possible; it was linked from the original post.
I used the WebGet class to make the curl request. It has not been maintained for three years, but it still works fine.
It has a function that makes the curl request without following through on the redirect loop.
There are a lot of curl options set in that function, and curl is not returning an error in it, so I'm sure the exact solution could be simpler. The list of curl options (linked from the original post) is there for anyone who would like to delve deeper.
Here is how I handle each of the responses to get the final response
$w = new WebGet();
$cookie_file = 'cookie.txt';
if (!file_exists($cookie_file)) {
    $cookie_file_inter = fopen($cookie_file, "w");
    fclose($cookie_file_inter);
}
$w->cookieFile = $cookie_file; // must exist and be writable
$w->requestContent($url);
// Follow at most two redirects by hand, re-reading the headers after each
// request so the second hop uses the new Location rather than the first one.
for ($hop = 0; $hop < 2; $hop++) {
    $headers = $w->responseHeaders;
    if ($w->responseStatusCode != 302 || !isset($headers['LOCATION'])) {
        break;
    }
    $w->requestContent($headers['LOCATION']);
}
$response = $w->cachedContent;
Of course, this is all extremely bad practice, and has severe performance implications, but there may be some rare use cases that find themselves needing to do this.
I'm using CURLOPT_COOKIEJAR to store cookies to a file and CURLOPT_COOKIEFILE to retrieve them from the file.
What I'm wondering is what happens when multiple users are accessing the script at the same time. Won't it mess up the contents of the cookie file? Also, how do I manage the cookie files so that it's possible to have multiple users at the same time?
CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE are just utilities for handling cookies in a file, like a web browser.
And it's not recommended for your case.
But you can play directly with http headers to set and retrieve cookies.
To set your cookies:
<?php
curl_setopt($ch, CURLOPT_COOKIE, 'user=xxxxxxxx-xxxxxxxx');
?>
To retrieve cookies, just pick out the response headers that start with Set-Cookie:
You can check this document to understand how cookie headers work: http://curl.haxx.se/rfc/cookie_spec.html
Here is a usage example. It is quick and dirty, and definitely not standards-compliant.
With these headers:
<?php
$header_blob = '
Set-Cookie: PART_NUMBER=ROCKET_LAUNCHER_0001; path=/
Set-Cookie: PART_NUMBER=RIDING_ROCKET_0023; path=/ammo
';
Extract cookie headers
$cookies = array();
if (preg_match_all('/Set-Cookie:\s*(?P<cookies>.+?);/i', $header_blob, $matches)) {
    foreach ($matches['cookies'] as $cookie) {
        $cookies[] = $cookie;
    }
    $cookies = array_unique($cookies);
}
var_dump($cookies);
Resend cookies
$cookie_blob = implode('; ', $cookies);
var_dump($cookie_blob);
You'll need to specify a different file for each execution of the script, otherwise you'll have issues with the file being overwritten, etc. as you suggest.
You might want to have a look at tempnam() (example below) as a means of generating a unique file, or simply use uniqid() or similar and create the file yourself.
<?php
session_start();
// !empty() avoids an "undefined index" notice on the first request of a session.
$cookieFilePath = !empty($_SESSION['cookiefilepath'])
    ? $_SESSION['cookiefilepath']
    : tempnam(sys_get_temp_dir(), session_id().'_cookie_');
$_SESSION['cookiefilepath'] = $cookieFilePath;
...
curl_setopt($curlSession, CURLOPT_COOKIEFILE, $cookieFilePath);
...
?>
That said, you'll need to ensure that you remove these files once they're no longer required. (If this isn't within the lifetime of your script, you might want to periodically execute a tidy-up script via cron that uses filemtime or similar.)
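A tidy-up script of the kind mentioned above might look roughly like this. It is only a sketch: the function name removeStaleCookieFiles, the default glob pattern, and the one-hour threshold in the usage note are assumptions for illustration, and the pattern assumes file names containing "_cookie_" as produced by the tempnam() prefix in the example (on some platforms tempnam() truncates the prefix, in which case the pattern would need adjusting).

```php
<?php
// Delete files in $dir that match $pattern and have not been modified for
// more than $maxAgeSeconds. Returns the number of files removed.
function removeStaleCookieFiles(string $dir, int $maxAgeSeconds, string $pattern = '*_cookie_*'): int
{
    $removed = 0;
    $cutoff = time() - $maxAgeSeconds;
    // glob() returns false on error, so fall back to an empty list.
    $files = glob(rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $pattern) ?: [];
    foreach ($files as $file) {
        if (is_file($file) && filemtime($file) < $cutoff && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```

A cron-driven script could call removeStaleCookieFiles(sys_get_temp_dir(), 3600) to drop cookie files untouched for an hour; the right threshold depends on how long your sessions live.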
Incidentally, you can simply provide a full path to the file you want to use. It doesn't have to be in the same directory as the script, despite what is said in the existing question "Can someone explain CURL cookie handling (PHP)?".
Multiple requests will overwrite the same file (and will probably also slow down every other request because of file locking).
You could incorporate the session_id() into the cookie file name so you'll have one cookie file for every client session. I'd also recommend storing the files in something like sys_get_temp_dir().
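Something like:
$cookieFile = sys_get_temp_dir().DIRECTORY_SEPARATOR.session_id().'-cookies.txt';
should work fine for that. (Note DIRECTORY_SEPARATOR, not PATH_SEPARATOR: the latter is the ":" or ";" used between entries of a path list.)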
I'm not sure if I'm asking this properly.
I have two PHP pages located on the same server. The first PHP page sets a cookie with an expiration and the second one checks to see if that cookie was set. If it is set, it returns "on". If it isn't set, it returns "off".
If I just run the pages like
"www.example.com/set_cookie.php"
AND
"www.example.com/is_cookie_set.php"
I get an "on" from is_cookie_set.php.
Here's the problem: in the set_cookie.php file I have a function called is_set. This function executes the following cURL request and returns the contents ("on" or "off"). Unfortunately, the contents are always returned as "off". However, if I check the file manually ("www.example.com/is_cookie_set.php") I can see that the cookie was set.
Here's the function:
<?php
function is_set()
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://example.com/is_cookie_set.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$contents = curl_exec ($ch);
curl_close ($ch);
echo $contents;
}
?>
Please note, I'm not using cURL to GET or SET cookies, only to check a page that checks if the cookie was set.
I've looked into CURLOPT_COOKIEJAR, and CURLOPT_COOKIEFILE, but I believe those are for setting cookies via cURL and I don't want to do this.
I believe you are confusing two things. When you use curl, PHP goes to the trouble of acting like a client (much like a browser) and makes that request for you. That is, the cookies that curl sends have nothing to do with the cookies in your current browser, as far as I know.
I'm not entirely sure what you are trying to do here, but you are aware, as nc3b already states, that in your is_set() function it's PHP acting as the client and not your browser, right? That means that your cookie test will always fail (= return with no cookies).
Cookies are stored by the client and sent along with every request to the server.
If you want to find out in PHP whether a cookie has been set - of course, you need to be on the same domain as the cookie for that - you can use plain if (isset($_COOKIE["cookiename"])).
Maybe you are trying to build a solution to query for a cookie on a remote host. For that, see this SO question:
Cross domain cookies
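If the goal is for the cURL request to carry the visitor's cookies, they have to be passed along explicitly. The sketch below shows one way that might look; the function names cookieHeaderFromArray and isCookieSetRemotely are hypothetical, and it assumes the cookies are already present in $_COOKIE. A cookie set earlier in the same request is not in $_COOKIE yet, because the browser has not received it.

```php
<?php
// Turn an array such as $_COOKIE into a Cookie header value ("a=1; b=2").
function cookieHeaderFromArray(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        if (is_array($value)) {
            continue; // array-style cookies (name[key]) are skipped in this sketch
        }
        // PHP URL-decodes incoming cookie values, so re-encode before sending.
        $pairs[] = $name . '=' . rawurlencode((string) $value);
    }
    return implode('; ', $pairs);
}

// Hypothetical variant of is_set(): the same request, plus the visitor's cookies.
function isCookieSetRemotely(string $url, array $browserCookies): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIE, cookieHeaderFromArray($browserCookies));
    $contents = curl_exec($ch);
    curl_close($ch);
    return $contents === false ? '' : $contents;
}
```

Usage would be along the lines of isCookieSetRemotely('http://example.com/is_cookie_set.php', $_COOKIE). For a same-domain check, isset($_COOKIE["cookiename"]) remains the simpler option.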
cURL acts like your browser as an HTTP client.
If configured, they both receive and store cookies, but they are in no way related.
cURL doesn't use the browser's cookies. If you want to use your browser cookies, you have to pass them with the --cookie option. See the manpage for details: http://curl.haxx.se/docs/manpage.html
For example, Firefox stores them in a file called cookies.txt.
Under Linux it's located at ~/.mozilla/firefox/$profilefolder/cookies.txt
Hint: If you use Firefox >= 3.0, the cookies are stored in a SQLite database. If you want to use them with curl, you have to extract a cookies.txt file yourself.
Here are some examples how to do that:
http://roshan.info/blog/2010/03/14/using-firefox-30-cookies-with-wgetcurl/
http://slacy.com/blog/2010/02/using-cookies-sqlite-in-wget-or-curl/
sqlite3 -separator $'\t' cookies.sqlite \
'select host, "TRUE", path, case isSecure when 0 then "FALSE" else "TRUE" end, expiry, name, value from moz_cookies' > cookies.txt
I have a page called send.email.php which sends an email. It's pretty simple stuff: I pass an order ID, it creates a job request and sends it out. This works fine when used in the context I developed it for (using JavaScript to make an AJAX call to the URL and passing the order_id as a query parameter).
I am now trying to reuse the exact same page in another application; however, I am calling it from PHP using file_get_contents($base_url.'admin/send.email.php?order_id='.$order_id). When I call the page this way, the $_SESSION array is empty (empty() returns 1).
Is this because I am initiating a new session using file_get_contents, so the values I stored in $_SESSION on login are not available to me within there?
--> Thanks for the feedback. It makes sense that the new call doesn't have access to the existing session...
New problem though:
I now get "failed to open stream: HTTP request failed!" when trying to execute:
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
$context = stream_context_create($opts);
$contents = file_get_contents($base_url.'admin/send.sms.php?order_id='.$order_id, false, $context);
Yet the URL works fine if I call it as follows (it just doesn't let me access the session):
$result = file_get_contents($base_url.'admin/send.sms.php?order_id='.$order_id);
file_get_contents() shouldn't be used anywhere you need authentication/session information transmitted. It's not going to send any cookies, so the user's authentication information will not be included by default.
You can kind of hack around it by including the session identifier (e.g. 'PHPSESSID' by default) as a query parameter in the URL, and have the other script check for that. But transmitting session identifiers in the URL is horribly bad practice, even if it's just to the same server.
$contents = file_get_contents("http://.... /send_sms.php?order_id=$order_id&" . session_name() . '=' . session_id());
To do this properly, use cURL and build a full HTTP request, including the cookie information of the parent page.
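A rough sketch of that cURL approach follows. The helper names sessionCookieHeader and callWithSession are hypothetical, and the 10-second timeout is an arbitrary choice. One detail worth noting when both scripts live on the same server and use PHP's default file-based sessions: the calling script holds a lock on the session until it finishes, so the sketch releases it with session_write_close() before making the request. Without that, the called script can block in session_start(), which is one plausible cause of a failed or timed-out request.

```php
<?php
// Build the Cookie header value that carries a PHP session.
function sessionCookieHeader(string $name, string $id): string
{
    return $name . '=' . $id;
}

// Hypothetical helper: call a same-site script with the caller's session cookie.
// Assumes session_start() has already been called in the current request.
function callWithSession(string $url): string
{
    $cookie = sessionCookieHeader(session_name(), session_id());

    // Release the session lock first; otherwise the called script may wait
    // in session_start() until this request ends.
    session_write_close();

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIE, $cookie);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body === false ? '' : $body;
}
```

After session_write_close() the current script can still read $_SESSION, but further writes to it are not saved unless the session is started again.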
You'd have to include the file or call the respective functions from your send.sms.php script. Instead, you are calling it like a web service (which it isn't).
Actually, it's gotten so messy that I'm not even sure curl is the culprit. So, here's the php:
$creds = array(
'pw' => "xxxx",
'login' => "user"
);
$login_url = "https://www.example.net/login-form"; //action value in real form.
$loginpage = curl_init();
curl_setopt($loginpage, CURLOPT_HEADER, 1);
curl_setopt($loginpage, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($loginpage, CURLOPT_URL, $login_url);
curl_setopt($loginpage, CURLOPT_POST, 1);
curl_setopt($loginpage, CURLOPT_POSTFIELDS, $creds);
$response = curl_exec($loginpage);
echo $response;
I get the headers (which match the headers of a normal, successful request), followed by the login page (I'm guessing curl captured this due to a redirect) which has an error to the effect of "Bad contact type".
I thought the problem was that the request had the host set to the requesting server, not the remote server, but then I noticed (in Firebug), that the request is sent as GET, not POST.
If I copy the login site's form, strip it down to just the form elements with values, and put the full URL for the action, it works just great. So I would think this isn't a security issue where the login request has to originate on the same server, etc. (I even get rid of the empty hidden values and all of the JS which set some of the other cookies).
Then again, I get confused pretty quickly.
Any ideas why it's showing up as GET, or why it's not working, for that matter?
When troubleshooting the entire class of PHP-cURL-related problems, you simply have to turn on CURLOPT_VERBOSE and give CURLOPT_STDERR a file handle.
tail -f your file, compare the headers and response to the ones you see in Firebug, and the problem should become clear.
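In code, that debugging setup might look like the sketch below. The function name fetchWithVerboseLog and the log path in the usage note are assumptions for illustration.

```php
<?php
// Run a request with CURLOPT_VERBOSE and append cURL's debug output
// (request headers, response headers, connection details) to $logFile.
function fetchWithVerboseLog(string $url, string $logFile): string
{
    $log = fopen($logFile, 'a');
    if ($log === false) {
        return ''; // could not open the log file for writing
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_STDERR, $log); // send verbose output to the file
    $body = curl_exec($ch);
    curl_close($ch);
    fclose($log);
    return $body === false ? '' : $body;
}
```

With a call such as fetchWithVerboseLog($login_url, '/tmp/curl-debug.log') in place, running tail -f /tmp/curl-debug.log in a terminal shows each request as it is made.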
The request is made from the server and will not show up in Firebug. (You probably confused it with another request made by your browser.) Use Wireshark to find out what really happens. You are not setting CURLOPT_FOLLOWLOCATION, so redirects should not be followed.
Summarizing: Guess less, post more. Link to a pcap dump, and we will be able to tell exactly what you're doing wrong; or post the exact output of the php script, and we might.
The shown code does a multipart formpost (since you pass a hash array to the POSTFIELDS option), which probably is not what the target server expects.
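If the server expects a regular URL-encoded form body, passing a string instead of an array to CURLOPT_POSTFIELDS makes cURL send application/x-www-form-urlencoded rather than multipart/form-data. A minimal sketch, with hypothetical function names encodeFormBody and postUrlEncoded:

```php
<?php
// Encode form fields as an application/x-www-form-urlencoded string.
function encodeFormBody(array $fields): string
{
    return http_build_query($fields, '', '&');
}

// POST the fields as a URL-encoded body and return headers plus body.
function postUrlEncoded(string $url, array $fields): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    // A string here means URL-encoded; an array would mean multipart.
    curl_setopt($ch, CURLOPT_POSTFIELDS, encodeFormBody($fields));
    $response = curl_exec($ch);
    curl_close($ch);
    return $response === false ? '' : $response;
}
```

With the $creds array from the question, encodeFormBody($creds) gives "pw=xxxx&login=user".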
Try throwing in a print_r(curl_getinfo($loginpage)) at the end and see what header data it sent back.
Also, if you're trying to make it look as though you're logging in from their site, you're going to want to make sure you're sending the correct referrer with your POST, so that they "think" you were on the website when you sent it.