Okay, I haven't been able to find a solution to this as of yet, and I need to start asking questions on SO so I can get my reputation up and hopefully help out others.
I am making a WordPress plugin that retrieves a JSON list of items from a remote site. Recently, the site added a redirect that checks for a cookie.
Upon first request without the cookie, 302 headers are provided, pointing to a second page which also returns a 302 redirect pointing to the homepage. On this second page, however, the set-cookie headers are also provided, which prevents the homepage from redirecting yet again.
When I make a cURL request to a url on the site, however, it fails in a redirect loop.
Now, obviously the easiest solution would be to fix this on the remote server. It should not be implementing that redirect for api routes. But that at the moment is not an option for me.
I have found how to retrieve the Set-Cookie header value from a 2xx response, but I cannot figure out how to access that value when 302 headers are provided and cURL returns nothing but an error.
Is there a way to access the headers even when it reaches the maximum (20) redirects?
Is it possible to stop the execution after a set number of redirects?
How can I get this cookie's value so I can provide it in a final request?
If you use the cURL option CURLOPT_HEADER the data you get back from curl_exec will include the headers from each response, including the 302.
If you enable cookie handling in cURL, it should pick up the cookie set by the 302 response just fine unless you prefer to handle it manually.
I often do something like this when there could be multiple redirects:
$ch = curl_init($some_url_that_302_redirects);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
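curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects so redirect_count is populated
curl_setopt($ch, CURLOPT_COOKIEFILE, ''); // enable curl cookie handling
$response = curl_exec($ch);
// $response contains the headers from each response, plus the body of the last response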
$info = curl_getinfo($ch); // info will tell us how many redirects were followed
for ($i = 0; $i < intval($info['redirect_count']); ++$i) {
// get headers from each response
list($headers, $response) = explode("\r\n\r\n", $response, 2);
// DO SOMETHING WITH $headers HERE
// If there was a redirect, headers will be all headers from that response,
// including Set-Cookie headers
}
list($headers, $body) = explode("\r\n\r\n", $response, 2);
// Now $headers are the headers from the final response
// $body is the content from the final response
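The splitting loop above can be tried without any network access. This is only a sketch: `split_responses()` is a hypothetical helper name, and the `$raw` string is a made-up response shaped like what cURL returns when CURLOPT_HEADER is on and two redirects were followed.

```php
<?php
// Hypothetical helper: split a raw cURL response (CURLOPT_HEADER enabled)
// into one header block per response, plus the body of the final response.
function split_responses(string $raw, int $redirectCount): array
{
    $headerBlocks = [];
    $rest = $raw;
    // One header block per followed redirect, then one for the final response.
    for ($i = 0; $i <= $redirectCount; ++$i) {
        $parts = explode("\r\n\r\n", $rest, 2);
        $headerBlocks[] = $parts[0];
        $rest = $parts[1] ?? '';
    }
    return [$headerBlocks, $rest];
}

// Made-up response: two 302s (the second one sets the cookie), then a 200.
$raw = "HTTP/1.1 302 Found\r\nLocation: /check\r\n\r\n"
     . "HTTP/1.1 302 Found\r\nSet-Cookie: seen=1; Path=/\r\nLocation: /\r\n\r\n"
     . "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n"
     . '{"items":[]}';

[$blocks, $body] = split_responses($raw, 2);
```

In real use the second argument would be `$info['redirect_count']` from `curl_getinfo()`.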
You already had problems before you started trying to add cookies into the mix. Doing a single redirect is bad for performance. Using a 302 response as a means of dissociating data presentation from data retrieval under HTTP/1.1 or later is bad (it works, but is a violation of the protocol; you should be using a 303 if you really must redirect).
Trying to set a cookie in a 3xx response will not work consistently across browsers. Setting a cookie in an Ajax response will not work consistently across browsers.
"It should not be implementing that redirect for api routes"
Maybe the people at the remote site are trying to prevent you leeching their content?
Fetch the homepage first in an iframe to populate the cookie and record a flag in your domain on the browser.
I actually found another SO question (of course, after I posted) that led me in the right direction to make this possible, HERE
I used the WebGet class to make the curl request. It has not been maintained for three years, but it still works fine.
It has a function that makes the curl request without following through on the redirect loop.
There are a lot of curl options set in that function, and curl is not returning an error in it, so I'm sure the exact solution could be simpler. HERE is a list of curl options for anyone who would like to delve deeper.
Here is how I handle each of the responses to get the final response
$w = new WebGet();
$cookie_file = 'cookie.txt';
if (!file_exists($cookie_file)) {
$cookie_file_inter = fopen($cookie_file, "w");
fclose($cookie_file_inter);
}
$w->cookieFile = $cookie_file; // must exist and be writable
$w->requestContent($url);
$headers = $w->responseHeaders;
if ($w->responseStatusCode == 302 && isset($headers['LOCATION'])) {
$w->requestContent($headers['LOCATION']);
$headers = $w->responseHeaders; // re-read, so the next check uses the new Location
}
if ($w->responseStatusCode == 302 && isset($headers['LOCATION'])) {
$w->requestContent($headers['LOCATION']);
}
$response = $w->cachedContent;
Of course, this is all extremely bad practice, and has severe performance implications, but there may be some rare use cases that find themselves needing to do this.
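For anyone who would rather not depend on WebGet, the same idea (make one request at a time, follow Location by hand, stop after a fixed number of hops) can be sketched as below. `$fetch` is a hypothetical callable that is not part of any library: it is assumed to perform ONE request without following redirects and return an array with `status`, `headers` and `body`. The stub at the bottom only simulates the two 302s described in the question.

```php
<?php
// $fetch is a hypothetical callable: it takes a URL and returns
// ['status' => int, 'headers' => array, 'body' => string] for ONE request,
// without following redirects itself.
function fetch_following_redirects(callable $fetch, string $url, int $maxRedirects = 5): array
{
    for ($hops = 0; ; ++$hops) {
        $response = $fetch($url);
        $isRedirect = $response['status'] >= 300 && $response['status'] < 400;
        // Header names are case-insensitive, so normalise before the lookup.
        $headers = array_change_key_case($response['headers'], CASE_LOWER);
        if (!$isRedirect || !isset($headers['location'])) {
            return $response; // final response reached
        }
        if ($hops >= $maxRedirects) {
            throw new RuntimeException("Gave up after $maxRedirects redirects");
        }
        $url = $headers['location'];
    }
}

// Stub standing in for a real HTTP client (made-up paths and bodies).
$pages = [
    '/api/items' => ['status' => 302, 'headers' => ['Location' => '/check'], 'body' => ''],
    '/check'     => ['status' => 302, 'headers' => ['Location' => '/home', 'Set-Cookie' => 'seen=1'], 'body' => ''],
    '/home'      => ['status' => 200, 'headers' => [], 'body' => 'home'],
];
$fake = function (string $url) use ($pages): array {
    return $pages[$url];
};

$final = fetch_following_redirects($fake, '/api/items');
```

A real `$fetch` would also have to carry cookies from one call to the next (for example through a shared cookie file), otherwise the loop described in the question comes straight back.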
I'm trying to avoid cURL storing the cookie session into an actual file via "CURLOPT_COOKIEJAR". So I created a method to catch / parse the cookies into a local variable - which is then used via "CURLOPT_COOKIE" to restore the cookie session.
I cut out the cookies via
preg_match_all("/^Set-cookie: (.*?);/ism", $header, $cookies);
To use "CURLOPT_COOKIE" we take the key=value pairs and separate them with "; ". However, as far as I'm aware, CURLOPT_COOKIE doesn't allow you to throw in the various flags, i.e. expiration, the secure flag, and so on.
Update 1/29/2014 6:45pm
So I think my issue actually occurs where CURLOPT_FOLLOWLOCATION comes in. I don't think it has to do with the flags. It doesn't seem like my manual cookie session is updated when following a new location (i.e. a site has 2-3 redirects to append various cookies / session). That would actually make sense, because CURLOPT_COOKIEJAR directly grabs and updates cookies sent on header redirects. So I tried creating a manual redirection path while grabbing and appending the latest cookie, but for some reason this method did not work.
Update 1/30/2014 4:22pm
Almost got this figured out. Will be updating with answer shortly. It turns out my method works perfectly fine, it's just a matter of jumping through the manual redirected pages correctly.
Update 1/30/2014 4:51pm
Issue solved -- answered myself below.
So it turns out I was actually doing this correctly and my assumptions were correct.
To keep the cookie session in a variable (vs. CURLOPT_COOKIEJAR). *Make sure you have CURLOPT_HEADER and CURLINFO_HEADER_OUT enabled.*
CURLOPT_FOLLOWLOCATION must be set to false. Otherwise your cookie won't send correctly (This is where CURLOPT_COOKIEJAR does best).
Use preg_match_all to extract the cookies. Then use strpos to find the first occurrence of "=". Some sites encode their values and include additional "=" characters, which won't work with "explode".
$data = curl_exec($curl);
$header_size = curl_getinfo($curl, CURLINFO_HEADER_SIZE);
$header = substr($data, 0, $header_size);
// Stop at the first ";" or line break so cookies without attributes are matched too
preg_match_all("/^Set-Cookie:\s*([^;\r\n]+)/im", $header, $cookies);
foreach( $cookies[1] as $cookie ){
$buffer_explode = strpos($cookie, "=");
$this->cookies[ substr($cookie,0,$buffer_explode) ] = substr($cookie,$buffer_explode+1);
}
When making your next curl call, re-call the cookie var/object into CURLOPT_COOKIE.
if( count($this->cookies) > 0 ){
$cookieBuffer = array();
foreach( $this->cookies as $k=>$c ) $cookieBuffer[] = "$k=$c";
curl_setopt($curl, CURLOPT_COOKIE, implode("; ",$cookieBuffer) );
}
This keeps the most recent cookie values (e.g. a changing session ID) intact from one request to the next.
Hope this helps anyone who bumps into this issue!
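The two steps above (parse Set-Cookie into a variable, then rebuild the string for CURLOPT_COOKIE) can be checked in isolation. This is a sketch only: `parse_set_cookies()` and `cookie_header_value()` are hypothetical helper names, and the header text is made up. It deliberately ignores attributes such as Path or Expires.

```php
<?php
// Parse every Set-Cookie line in a header block into name => value pairs.
function parse_set_cookies(string $header): array
{
    $cookies = [];
    preg_match_all('/^Set-Cookie:\s*([^;\r\n]+)/mi', $header, $matches);
    foreach ($matches[1] as $pair) {
        $pos = strpos($pair, '=');
        if ($pos === false) {
            continue; // not a name=value pair
        }
        // Split on the FIRST "=" only; values may contain "=" themselves.
        $cookies[trim(substr($pair, 0, $pos))] = trim(substr($pair, $pos + 1));
    }
    return $cookies;
}

// Build the "a=1; b=2" string that CURLOPT_COOKIE expects.
function cookie_header_value(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = "$name=$value";
    }
    return implode('; ', $pairs);
}

// Made-up header block from a redirecting response.
$header = "HTTP/1.1 302 Found\r\n"
        . "Set-Cookie: session=abc123; Path=/; HttpOnly\r\n"
        . "Set-Cookie: token=dG9rZW4=\r\n"
        . "Location: /next\r\n\r\n";

$jar = parse_set_cookies($header);
$cookieString = cookie_header_value($jar);
```

Cookies from a later response can be merged in with `array_replace($jar, parse_set_cookies($nextHeader))`, so that newer values win.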
I will try to explain what I am trying to reach the best way I can.
Let's say there is a page that shows information and sets cookies (I can see the cookies through Firecookie, a Firefox add-on for Firebug). I am able to print the cookies on my localhost with:
$cookies = array();
foreach ($http_response_header as $hdr) {
if (preg_match('/^Set-Cookie:\s*([^;]+)/', $hdr, $matches)) {
parse_str($matches[1], $tmp);
$cookies += $tmp;
}
}
print_r($cookies);
But the original page has request headers, and what I am trying to do is get those request headers and make a request to that same page. I guess I have two questions. Do I get the request headers through the cookies, or separately? And how do I get the request headers of a page and send a request to that page with those headers? I have tried lots of things without success. I don't have the code I tried, since I constantly try new things, so I can only paste the snippet above.
If you're using PHP with Apache, you can get the request headers using the apache_request_headers() function.
http://php.net/manual/en/function.apache-request-headers.php
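If apache_request_headers() is not available, the headers of the incoming request can usually be rebuilt from the `HTTP_*` entries in `$_SERVER`. The helper below is a rough sketch with a hypothetical name, and the sample array is made up; note that it only covers headers sent TO your own script, not what a browser sent to some other site.

```php
<?php
// Rebuild request headers from a $_SERVER-style array.
function request_headers_from_server(array $server): array
{
    $headers = [];
    foreach ($server as $key => $value) {
        if (strpos((string) $key, 'HTTP_') !== 0) {
            continue; // not a request header
        }
        // HTTP_USER_AGENT -> User-Agent
        $words = strtolower(str_replace('_', ' ', substr($key, 5)));
        $headers[str_replace(' ', '-', ucwords($words))] = $value;
    }
    return $headers;
}

// Made-up sample; in a real script you would pass $_SERVER.
$demo = request_headers_from_server([
    'HTTP_USER_AGENT' => 'demo/1.0',
    'HTTP_COOKIE'     => 'seen=1',
    'REQUEST_METHOD'  => 'GET',
]);
```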
I am developing an application in which the input I receive comes through an SMS gateway (and not a browser). I need to process the data obtained through SMS and pass it on to another PHP file, which will finish the processing and send an SMS back to the SMS gateway.
However, when I try to redirect from page1.php to page2.php, it is not working with the following code:
page1.php:
$url = "location:http://www.iweavesolutions.com/$extra?sms=".$msg."&keyword=".$key."&num=".$msg_num."&src=".$source;
header($url);
page2.php:
$msg = $_GET['sms'];
$msg_num = $_GET['num'];
$keyword = $_GET['keyword'];
$src = $_GET['src'];
send_sms($msg,$msg_num);
However, the header call in the first page doesn't seem to work. The PHP documentation says that header is used for browser-related activities. In my application there is no browser at all. So, do I need to change my mechanism for passing values across files? Please help.
Please refer to cURL:
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,"http://www.iweavesolutions.com");
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,2);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'variable1=abc&variable2=123');
curl_setopt($ch,CURLOPT_FOLLOWLOCATION,true);
curl_setopt($ch,CURLOPT_MAXREDIRS,1);
$buffer = curl_exec($ch);
curl_close($ch);
Something like this.
Sending a location:[someUrl] header as an answer to a request just tells the requesting client to do another request to that location. It is up to the client whether to follow this redirect or not. Browsers will usually do this, other clients may not.
If the client you're dealing with (the SMS gateway) does not follow location header redirects, you need to check the client's documentation to see whether there is some mechanism to make it do that. If there is no way to redirect the client, you need to change your server-side logic to get rid of the need for the redirect, i.e. you need to call the processing logic in your 'page2.php' directly from 'page1.php' without the indirection of the redirect (or bundle the whole logic in one file, etc.).
The SMS gateway probably does not implement HTTP properly. IME this is not uncommon.
As a side note, your first script (assuming it is complete) is written as if register_globals were enabled, which has been deprecated for a long time. It also does not URL-encode the values, which may be the cause of the issue here. If not, you'll need to do one of the following:
fix the SMS gateway
change the end point registered on the SMS gateway to eliminate the need for redirection
include the code from the redirected script into the current endpoint script
proxy the request from the gateway in the endpoint script.
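On the URL-encoding point: whichever option you pick, building the query string with http_build_query() avoids broken URLs when a message contains spaces, "&" or "+". A minimal sketch, where the parameter values and the `www.example.com/page2.php` address are placeholders rather than anything from the original script:

```php
<?php
// Hypothetical values standing in for the fields parsed from the SMS.
$params = [
    'sms'     => 'hello world & more',
    'keyword' => 'INFO',
    'num'     => '+15550100',
    'src'     => 'gateway-1',
];

// http_build_query() URL-encodes every value and joins the pairs with "&".
$query = http_build_query($params);

// Placeholder address; use it for a redirect or for a server-side request.
$url = 'http://www.example.com/page2.php?' . $query;
```

On the receiving side, `$_GET['sms']` and the other values are decoded automatically.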
I've got a simple php script to ping some of my domains using file_get_contents(), however I have checked my logs and they are not recording any get requests.
I have
$result = file_get_contents($url);
echo $url . " pinged ok\n"; // double quotes, so that \n is an actual newline
where $url for each of the domains is just a simple string of the form http://mydomain.com/, echo verifies this. Manual requests made by myself are showing.
Why would the get requests not be showing in my logs?
Actually I've got it to register the hit when I send $result to the browser. I guess this means the webserver only records browser requests? Is there any way to mimic such in php?
OK, I tried cURL in PHP:
// create curl resource
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, "getcorporate.co.nr");
//return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// $output contains the output string
$output = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
same effect though - no hit registered in logs. So far it only registers when I feed the http response back from my script to the browser. Obviously this will only work for a single request and not a bunch as is the purpose of my script.
If something else is going wrong, what debugging output can I look at?
Edit: D'oh! See comments below accepted answer for explanation of my erroneous thinking.
If the request is actually being made, it would be in the logs.
Your example code could be failing silently.
What happens if you do:
<?PHP
// Compare against false explicitly: an empty page ("") is still a successful request
if (($result = file_get_contents($url)) !== false){
echo "Success";
}else{
echo "Epic Fail!";
}
If that's failing, you'll want to turn on some error reporting or logging and try to figure out why.
Note: if you're in safe mode, or otherwise have the fopen URL wrappers disabled (allow_url_fopen), file_get_contents() will not grab a remote page. This is the most likely reason things would be failing (assuming there's not a typo in the contents of $url).
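To see why a call fails rather than just that it fails, something along these lines can help. It is only a sketch: `fetch_or_explain()` is a hypothetical helper, and the demo call deliberately points at a local file that should not exist so it runs without network access.

```php
<?php
// Hypothetical helper: fetch a URL (or file) and report why it failed.
function fetch_or_explain(string $url): array
{
    // file_get_contents() returns false on failure; "@" hides the warning
    // so that we can report it ourselves via error_get_last().
    $body = @file_get_contents($url);
    if ($body === false) {
        $error = error_get_last();
        return [
            'ok'              => false,
            'error'           => $error['message'] ?? 'unknown error',
            // Remote URLs need this ini setting to be enabled.
            'allow_url_fopen' => (bool) ini_get('allow_url_fopen'),
        ];
    }
    return ['ok' => true, 'body' => $body];
}

// Demo with a path that should not exist, to trigger the failure branch.
$result = fetch_or_explain(sys_get_temp_dir() . '/no-such-file-for-demo.txt');
```

If `allow_url_fopen` comes back false, remote URLs will never work with file_get_contents() on that server and cURL is the usual alternative.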
Use curl instead?
That's odd. Maybe there is some caching afoot? Have you tried changing the URL dynamically ($url = $url."?timestamp=".time() for example)?
Actually, it's gotten so messy that I'm not even sure curl is the culprit. So, here's the php:
$creds = array(
'pw' => "xxxx",
'login' => "user"
);
$login_url = "https://www.example.net/login-form"; //action value in real form.
$loginpage = curl_init();
curl_setopt($loginpage, CURLOPT_HEADER, 1);
curl_setopt($loginpage, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($loginpage, CURLOPT_URL, $login_url);
curl_setopt($loginpage, CURLOPT_POST, 1);
curl_setopt($loginpage, CURLOPT_POSTFIELDS, $creds);
$response = curl_exec($loginpage);
echo $response;
I get the headers (which match the headers of a normal, successful request), followed by the login page (I'm guessing curl captured this due to a redirect) which has an error to the effect of "Bad contact type".
I thought the problem was that the request had the host set to the requesting server, not the remote server, but then I noticed (in Firebug), that the request is sent as GET, not POST.
If I copy the login site's form, strip it down to just the form elements with values, and put the full URL for the action, it works just great. So I would think this isn't a security issue where the login request has to originate on the same server, etc. (I even get rid of the empty hidden values and all of the JS which set some of the other cookies).
Then again, I get confused pretty quickly.
Any ideas why it's showing up as GET, or why it's not working, for that matter?
When troubleshooting the entire class of PHP-cURL-related problems, you simply have to turn on CURLOPT_VERBOSE and give CURLOPT_STDERR a file handle.
tail -f your file, compare the headers and response to the ones you see in Firebug, and the problem should become clear.
The request is made from the server, and will not show up in Firebug. (You probably confused it with another request made by your browser.) Use Wireshark to find out what really happens. You are not setting CURLOPT_FOLLOWLOCATION, so redirects will not be followed.
Summarizing: Guess less, post more. Link to a pcap dump, and we will be able to tell exactly what you're doing wrong; or post the exact output of the php script, and we might.
The shown code does a multipart formpost (since you pass a hash array to the POSTFIELDS option), which probably is not what the target server expects.
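If the server expects an ordinary form post, passing a string instead of an array to CURLOPT_POSTFIELDS makes cURL send application/x-www-form-urlencoded. A sketch, assuming the cURL extension is loaded; `encode_form_fields()` and `post_form()` are hypothetical helper names, and only the encoding step is exercised here so that nothing touches the network.

```php
<?php
// Encode fields the way a regular HTML form submission does.
function encode_form_fields(array $fields): string
{
    return http_build_query($fields);
}

// Sketch of the POST itself; requires the cURL extension.
function post_form(string $url, array $fields)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    // A string body is sent as application/x-www-form-urlencoded;
    // an array would be sent as multipart/form-data.
    curl_setopt($ch, CURLOPT_POSTFIELDS, encode_form_fields($fields));
    return curl_exec($ch);
}

// Field names taken from the question; values are the same placeholders.
$body = encode_form_fields(['pw' => 'xxxx', 'login' => 'user']);
```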
Try throwing in a print_r(curl_getinfo($loginpage)) at the end to see what header data it sent back.
Also, if you're trying to make it look like you're logging in from their site, you're going to want to make sure you're sending the correct referrer with your post, so that they "think" you were on the website when you sent it.