I'm having issues with following a redirect while maintaining session cookie and post field information. This is how the process goes:
1) Visit URL, they return a cookie and a 302 response (pointing to the same URL you just visited)
2) Re-Visit URL with the cookie they gave you and you can see the proper page.
I can get through to the proper page with CURLOPT_FOLLOWLOCATION = true, however I guess CURL doesn't keep the post fields when following a redirect, so there is no useful content on the page.
I have tried manually storing the cookie, and performing the 'redirect' myself with the stored cookie, however with this method I never get past the 302 redirect to the proper page. The code for the manual method mentioned here is below.
$tmp_name = tempnam('tmp', 'COOKIE');
$url = "MY_URL";
$options = array(
CURLOPT_RETURNTRANSFER => true,
CURLOPT_REFERER => $url,
CURLOPT_HEADER => true,
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => array(
'field1' => 'postfield1',
'field2' => 'postfield2',
),
CURLOPT_VERBOSE => true,
);
// Make the first request, specifying where to store the cookie
// This request returns the cookie and the 302 response
$ch = curl_init($url);
curl_setopt_array($ch, $options);
curl_setopt($ch, CURLOPT_COOKIEJAR, $tmp_name);
$resp1 = curl_exec($ch);
// Make the second request, using the cookie stored above
// Should return the proper page, but gives me the 302 again instead.
$ch = curl_init($url);
curl_setopt_array($ch, $options);
curl_setopt($ch, CURLOPT_COOKIEFILE, $tmp_name);
$resp2 = curl_exec($ch);
Does anyone know what's wrong with the above code, or if there's another way to accomplish the task?
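First of all, when cURL follows a 301, 302 or 303 redirect it switches to GET by default, so the POST data is not re-sent (CURLOPT_POSTREDIR can change that). So don't worry about that, you don't have to make two requests. Just stick with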
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
I would also suggest the following for further debugging: even if you make two requests, use the same curl resource; don't close it and create a new one. Also, add:
curl_setopt($ch, CURLOPT_FORBID_REUSE, 0);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 0);
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "valid user agent");
You could also use browser addons (e.g. HttpFox) to check the exact cookies and the sequence of requests that are needed. You are trying to emulate a real request, so looking in depth at the one your browser makes can help a lot.
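A minimal sketch of the single-handle approach suggested above, assuming the target page really does need the POST fields again after the 302. The helper name buildRedirectOptions is hypothetical, and CURLOPT_POSTREDIR is not part of the answer; it is added here as one possible way to keep the POST body across the redirect.

```php
<?php
// Hypothetical helper: options for one handle that follows the 302 itself.
function buildRedirectOptions(array $fields, string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $fields,
        // Setting COOKIEFILE/COOKIEJAR enables the cookie engine, so the
        // cookie from the 302 response is sent on the follow-up request.
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_FOLLOWLOCATION => true,
        // Guard against an endless loop if the server keeps redirecting.
        CURLOPT_MAXREDIRS      => 5,
        // Ask libcurl to keep the POST on a 302 instead of switching to GET.
        CURLOPT_POSTREDIR      => CURL_REDIR_POST_302,
    ];
}

// Usage (MY_URL is the placeholder from the question):
// $ch = curl_init('MY_URL');
// curl_setopt_array($ch, buildRedirectOptions(
//     ['field1' => 'postfield1', 'field2' => 'postfield2'],
//     tempnam(sys_get_temp_dir(), 'COOKIE')
// ));
// $page = curl_exec($ch);
// curl_close($ch);
```

Whether the server accepts a repeated POST after its redirect is something only a test against the real site can confirm.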
TL;DR:
I have some very simple PHP code utilizing cURL that makes single HTTP requests (in practice, to a Diaspora* pod, though that shouldn't be relevant to the question). The code takes note of any cookies returned by the web server and then manually sets those values to libcurl's CURLOPT_COOKIE. However, in trying to hunt down a bug, I'm finding that when I use CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR, the values of the cookies in the cookie file are different than when I use CURLOPT_COOKIE. Why is this the case? (See code below.)
PRIOR RESEARCH
I have already looked at other questions such as this one that suggest various ways of manipulating libcurl's options to keep the same resource handle around and the cookies in memory, but this is not suitable for my application. I need to access the cookie values directly, and notably not on a filesystem (to save them into a database, but again, this should not matter with regard to the question).
CODE
For completeness, here is a test case for code I am using:
<?php
// This function simply extracts the cookie set by a webserver by looking at the full HTTP source traffic.
function readCookie ($str) {
$m = array();
preg_match('/Set-Cookie: (.*?);/', $str, $m);
return (!empty($m[1])) ? $m[1] : false;
}
// This function does the same for the CSRF token required for login.
function parseAuthenticityToken ($str) {
$m = array();
preg_match('/content="(.*?)" name="csrf-token"/', $str, $m);
return (!empty($m[1])) ? $m[1] : false;
}
// Get first page, to find the CSRF token.
$ch = curl_init('https://diasp.org/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$resp = curl_exec($ch);
curl_close($ch);
$csrf_token = parseAuthenticityToken($resp);
$params = array(
'user[username]' => 'my_username',
'user[password]' => 'my_password',
'authenticity_token' => $csrf_token
);
// Make POST request to the log in controller.
$ch = curl_init('https://diasp.org/users/sign_in');
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// In order to work, the COOKIEFILE/JAR options must be used. Why?
//curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/test_cookiejar');
//curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/test_cookiejar');
$resp = curl_exec($ch);
curl_close($ch);
$cookies = readCookie($resp);
// Even if the login is successful, this fails if and only if no COOKIEFILE/JAR is specified.
// Why?
$ch = curl_init('https://diasp.org/stream');
curl_setopt($ch, CURLOPT_COOKIE, $cookies);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// If I use COOKIEFILE here, the request works. What is this line doing that CURLOPT_COOKIE is not?
//curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/test_cookiejar');
$resp = curl_exec($ch);
curl_close($ch);
var_dump($resp);
SUMMARY
I am making very simple, step-by-step, procedural calls to a web server. These requests are being made one after the other, and the resulting output (of the entire HTTP conversation, including headers), is saved in a variable, which is then read and the values of the cookies are parsed from the Set-Cookie HTTP header lines. However, these values are never the same as the values that libcurl writes to the COOKIEFILE if those lines are uncommented.
What am I doing wrong with CURLOPT_COOKIE or what am I not doing with it that the CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR options are doing? Is it encoded or decoded in some reversible way? Thanks in advance.
You probably did not notice the difference between CURLOPT_COOKIE and CURLOPT_COOKIELIST/FILE/JAR. They both handle cookies, but CURLOPT_COOKIE does not store the cookies you set in memory, nor in the cookie file specified by CURLOPT_COOKIEJAR; CURLOPT_COOKIELIST does.
There is a mechanism in libcurl called the cookie engine. It is enabled when you set any one of CURLOPT_COOKIELIST/FILE/JAR; from then on libcurl takes care of sending, parsing, reading and storing cookies for all subsequent requests on that handle.
CURLOPT_COOKIE is just a quick hack to set an extra cookie for one go.
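If the cookie values are needed in PHP rather than in a file, one option is to let the cookie engine collect them and read them back with curl_getinfo($ch, CURLINFO_COOKIELIST), which returns one Netscape-format line per cookie. A minimal sketch, assuming tab-separated lines with the name and value in the sixth and seventh fields; cookieListToHeader is a hypothetical helper name.

```php
<?php
// Hypothetical helper: turn Netscape-format cookie lines into a value
// suitable for CURLOPT_COOKIE ("name1=value1; name2=value2").
function cookieListToHeader(array $lines): string
{
    $pairs = [];
    foreach ($lines as $line) {
        // Fields: domain, subdomain flag, path, secure, expiry, name, value.
        $fields = explode("\t", $line);
        if (count($fields) < 7) {
            continue; // skip anything that is not a complete cookie line
        }
        $pairs[] = $fields[5] . '=' . $fields[6];
    }
    return implode('; ', $pairs);
}

// Usage: an empty COOKIEFILE enables the engine without reading any file.
// $ch = curl_init('https://diasp.org/users/sign_in');
// curl_setopt($ch, CURLOPT_COOKIEFILE, '');
// curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// curl_exec($ch);
// $cookies = cookieListToHeader(curl_getinfo($ch, CURLINFO_COOKIELIST));
```

This sketch ignores domain, path and expiry, so it is only reasonable when all requests go to the same host. Note also that preg_match in readCookie above returns only the first Set-Cookie header, so any further cookies set by the server would be dropped; whether that explains the mismatch seen here is an assumption.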
I have a PHP class which is used to POST some data to a server and then GET some data back using the same open connection.
The problem is that this code tries to send the POST data from the 1st request again in the 2nd request...
curl_setopt(self::$ecurl, CURLOPT_CUSTOMREQUEST, "PUT");
curl_setopt(self::$ecurl, CURLOPT_POSTFIELDS, $data);
$request=curl_exec(self::$ecurl);
curl_setopt(self::$ecurl, CURLOPT_CUSTOMREQUEST, "GET");
$request=curl_exec(self::$ecurl);
So I need a way to unset CURLOPT_POSTFIELDS. I tried curl_setopt(self::$ecurl, CURLOPT_POSTFIELDS, null);, but curl still sends Posting 0 bytes... in the request's header.
Also please note that I need to use exactly the same connection, so I can't create another connection via curl_init.
Set the CURLOPT_HTTPGET to true prior to the last request.
From PHP.net:
CURLOPT_HTTPGET
TRUE to reset the HTTP request method to GET. Since GET is the default, this is only necessary if the request method has been changed.
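A minimal sketch of how that could look on a reused handle. switchToGet is a hypothetical helper; its second call repeats the CURLOPT_CUSTOMREQUEST override from the question so that the method string no longer says PUT.

```php
<?php
// Hypothetical helper: put a reused handle back into plain GET mode.
function switchToGet($ch): bool
{
    // Ask libcurl to go back to a GET request, as the quoted docs describe.
    $resetMethod = curl_setopt($ch, CURLOPT_HTTPGET, true);
    // Replace the custom "PUT" method string set for the earlier request.
    $resetCustom = curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
    return $resetMethod && $resetCustom;
}

// Usage, following the question's two requests on one handle:
// curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
// curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
// $putResponse = curl_exec($ch);
// switchToGet($ch);
// $getResponse = curl_exec($ch);
```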
You could add your curl options into an array and use curl_setopt_array
This will make it easy to unset (or set) your options array elements.
$options = array(
CURLOPT_URL => 'http://www.example.com/',
CURLOPT_CUSTOMREQUEST => 'PUT',
CURLOPT_POSTFIELDS => $post_data
);
// condition on which you want to unset
if ($condition == true) {
unset($options[CURLOPT_POSTFIELDS]);
}
...
curl_setopt_array($ch, $options);
// grab URL and pass it to the browser
curl_exec($ch);
I understand this was an old question but still think this method can be handy.
I've got a cURL PHP script which works. It gets my schedule from my school site. Though there is one strange thing: On my webhost it creates the cookie.txt and on my localhost it doesn't.
Why doesn't it create a cookie on my localhost? Any suggestions? Something with relative paths and wampserver?
And the questions that follow from that:
Is there any (speed) advantage to already being logged in on the school site (storing the cookie and thus saving a cURL request)?
I could for example check after the first cURL request if there is evidence in the response that I am already logged in.
If the answer to the above question is: 'no, this doesn't make the script faster' I've got another question:
Is it then best to specify only the CURLOPT_COOKIEFILE option? With an empty value? So no cookie jar?
I can't give you my login information, though here is the script:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL,
'http://www.groenewoud.nl/infoweb/infoweb/index.php');
curl_setopt($curl, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($curl, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HEADER, false);
$tokenSource = curl_exec($curl);
print_r (curl_getinfo($curl));
if (!$tokenSource) echo 'token problem';
// Get the token from within the source codes of infoweb.
preg_match('/name="csrf" value="(.*?)"/', $tokenSource, $token);
$postFields = array(
'user' => $userNum,
'paswoord' => $userPass,
'login' => 'loginform',
'csrf' => $token[1]);
$postData = http_build_query($postFields);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, false);
curl_setopt($curl, CURLOPT_POSTFIELDS, $postData);
$tableSource = curl_exec($curl);
print_r( curl_getinfo($curl));
if (!$tableSource) echo 'post problem';
curl_close($curl);
1) /cookie/cookie.txt means you'd need to have your cookie directory in the ROOT directory of your entire server. cookie/cookie.txt (note: NO leading slash) means the cookie directory would be a sub-directory of your script's CURRENT directory. E.g. your script is running in /a/b/c/, then you'd have /a/b/c/cookie/cookie.txt.
2) For speed advantages, there's no change in HTTP speeds - you're still stuck with the same pipes and transfer rates. But having the cookie initially MIGHT save you a few extra hits on the server to simulate the login-sequence, so would effectively be SLIGHTLY faster.
3) As for creating the cookies, that's entirely up to curl's settings. If you don't specify a cookie file or cookie jar, it won't create or look for the cookie file. Check the configuration/compile options between the two servers to see if one specifies some curl defaults that the other doesn't have.
4) strpos WOULD be faster than a curl request. Think of it as the difference between looking in your fridge for some food versus driving to the grocery store. Fridge is local and therefore faster.
5) CURLOPT_COOKIEFILE tells curl where to load cookies from when it first starts up. CURLOPT_COOKIEJAR tells curl where to store cookies when the handle is closed. They CAN be different files, but don't have to be. If you'd like to keep some "clean" baseline cookies, then you use cookiefile = baseline.txt and cookiejar = newstuff.txt. Once you've got an appropriate cookie environment set up, you point cookiefile at newstuff.txt for subsequent curl runs.
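As a rough illustration of points 1) and 5): in libcurl's documentation CURLOPT_COOKIEFILE names the file cookies are read from and CURLOPT_COOKIEJAR the file they are written to when the handle is closed. The sketch below assumes the cookie file should live next to the script; cookieOptions is a hypothetical helper name.

```php
<?php
// Hypothetical helper: read cookies from one file, write them to another.
function cookieOptions(string $readFrom, string $writeTo): array
{
    return [
        CURLOPT_COOKIEFILE => $readFrom, // loaded when the transfer starts
        CURLOPT_COOKIEJAR  => $writeTo,  // written when the handle is closed
    ];
}

// A relative 'cookie.txt' is resolved against the current working directory,
// which may differ between a local WAMP setup and a web host. An absolute
// path built from __DIR__ avoids that difference.
$jar = __DIR__ . DIRECTORY_SEPARATOR . 'cookie.txt';
$options = cookieOptions($jar, $jar);

// Usage:
// $curl = curl_init('http://www.groenewoud.nl/infoweb/infoweb/index.php');
// curl_setopt_array($curl, $options);
// curl_exec($curl);
// curl_close($curl); // the jar file is only written at this point
```

The directory must also be writable by the user the web server runs as, otherwise no file will appear.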
If I load in a cookie, I am able to get to the page that requires cookies, like this:
$cookie = ".ASPXAUTH=Secret";
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
No problem here, I can run curl_exec, and see the page that requires cookies.
If I also want to send some post data, I can do like this:
$data = array(
'index' => "Some data is here"
);
$cookie = ".ASPXAUTH=Secret";
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
I have set up a dump script on my local server to see if it is working. If I send only the cookie, I can see it in the HTTP headers, and if I send only the post data, I can see the post data.
When I send both, I see only the cookie.
Why?
I finally found a solution.
If I manually set the cookie, using a custom http_header, I am able to get the results wanted.
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie:.ASPXAUTH=secretData"));
Even tried on different servers - same results.
I have never seen how a PUT/DELETE request is sent.
How do I do it in PHP?
I know how to send a GET/POST request with curl:
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
curl_setopt($ch, CURLOPT_COOKIEFILE,$cookieFile);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 4);
But how to do PUT/DELETE request?
For DELETE use curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'DELETE');
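For PUT use curl_setopt($ch, CURLOPT_PUT, true); to upload a file (set with CURLOPT_INFILE and CURLOPT_INFILESIZE), or curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT'); together with CURLOPT_POSTFIELDS to send a string body.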
An alternative that doesn't rely on cURL being installed would be to use file_get_contents with a custom HTTP stream context.
$result = file_get_contents(
'http://example.com/submit.php',
false,
stream_context_create(array(
'http' => array(
'method' => 'DELETE'
)
))
);
Check out these two articles on doing REST with PHP
http://www.gen-x-design.com/archives/create-a-rest-api-with-php/
http://www.gen-x-design.com/archives/making-restful-requests-in-php/
Generally speaking, if you want to send some "non-GET" request, you'll often work with curl.
And you'll use the curl_setopt function to configure the request you're sending; among the large number of possible options, to change the request method, you'll be interested in at least these options (quoting):
CURLOPT_CUSTOMREQUEST : A custom request method to use instead of "GET" or "HEAD" when doing a HTTP request. This is useful for doing "DELETE" or other, more obscure HTTP requests.
CURLOPT_HTTPGET : TRUE to reset the HTTP request method to GET.
CURLOPT_POST : TRUE to do a regular HTTP POST.
CURLOPT_PUT : TRUE to HTTP PUT a file. The file to PUT must be set with CURLOPT_INFILE and CURLOPT_INFILESIZE.
Of course, curl_setopt is not the only function you'll use; see the documentation page of curl_exec for an example of how to send a request with curl.
(Yes, that example is pretty simple, and sends a GET request -- but you should be able to build from there ;-) )
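Building on the options quoted above, here is a minimal sketch for PUT and DELETE requests that carry a string body rather than a file. methodOptions is a hypothetical helper name, and the URL in the usage comment is the example.com placeholder used earlier.

```php
<?php
// Hypothetical helper: options for a request with an arbitrary HTTP method.
function methodOptions(string $method, ?string $body = null): array
{
    $options = [
        // Overrides the method string sent in the request line.
        CURLOPT_CUSTOMREQUEST  => strtoupper($method),
        CURLOPT_RETURNTRANSFER => true,
    ];
    if ($body !== null) {
        // The body is sent as given; it is not a file upload like CURLOPT_PUT.
        $options[CURLOPT_POSTFIELDS] = $body;
    }
    return $options;
}

// Usage:
// $ch = curl_init('http://example.com/submit.php');
// curl_setopt_array($ch, methodOptions('PUT', 'name=value'));
// $response = curl_exec($ch);
// curl_close($ch);
//
// $ch = curl_init('http://example.com/submit.php');
// curl_setopt_array($ch, methodOptions('DELETE'));
// $response = curl_exec($ch);
// curl_close($ch);
```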