CURLOPT_COOKIEFILE not sending cookie parameter - php

I'm having a problem requesting a page with a cookie using PHP's cURL library. The code is as follows:
$ch = curl_init();
$options = array(
    CURLOPT_URL            => 'https://website.com/members',
    CURLOPT_HEADER         => false,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_COOKIEFILE     => dirname(__FILE__) . '/../../cookie.txt'
);
curl_setopt_array($ch, $options);
$response = json_decode(curl_exec($ch));
curl_close($ch);
dirname(__FILE__) . '/../../cookie.txt' points to a valid cookie file, as I've verified by dumping the contents of that file using PHP. So PHP does have read permissions for that file (and execute, I set that file to chmod 0777). When I dump the request I see this:
* About to connect() to website.com port 80 (#0)
* Trying 88.19.264.3... * connected
> GET /members HTTP/1.1
Accept: */*
Cookie: __utmz=1.1383258121.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmc=1; __utmb=1.8.10.1383258121; __utma=1.259269816.1383258121.1383258121.1383258121.1; sync=1; lang=en; https=1
Host: website.com
< HTTP/1.1 200 OK
< Server: nginx
< Date: Thu, 31 Oct 2013 22:36:07 GMT
< Content-Type: text/plain
< Transfer-Encoding: chunked
< Connection: keep-alive
< Keep-Alive: timeout=8
< Vary: Accept-Encoding
< Access-Control-Allow-Origin: *
<
* Connection #0 to host website.com left intact
* Closing connection #0
The Cookie header is being set from the contents of cookie.txt, but it's missing the auth parameter, which is required to authenticate the session. The auth parameter definitely exists in the cookie.txt file.
I just don't get why cURL would send only some of the parameters from the cookie.txt file and not the auth parameter. I've tried removing all the unneeded parameters, leaving only auth, but it still doesn't get sent.
For reference, the auth parameter looks like this:
.website.com TRUE / TRUE 1385850138.917861 auth y2a4232344b4x2h403a423fgw2c443g4u2c49hg48494g4a4q2o5g4g4v274b4i4a4k5e4e4y2a4c4g4s2l56323b4d453s5946453b4t2c4y7d47336
I dumped the cookie file using cURL on the command line so I know it's the right format.

The 4th parameter, TRUE, means the cookie will only be sent over secure (HTTPS) connections. Note that your verbose output shows cURL connecting to port 80, i.e. plain HTTP, which is exactly why the secure-only auth cookie is being withheld.
Cookie file format: http://curl.haxx.se/mail/archive-2005-03/0099.html
About secure cookies: http://en.wikipedia.org/wiki/HTTP_cookie#Secure_cookie
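As a minimal sketch of the fix (reusing the placeholder URL and cookie path from the question): since the auth cookie carries the secure flag, cURL will only attach it once the connection is actually made over HTTPS. Keeping CURLOPT_VERBOSE on makes it easy to confirm the log now shows port 443:
$ch = curl_init();
curl_setopt_array($ch, array(
    CURLOPT_URL            => 'https://website.com/members', // must really go over https
    CURLOPT_PORT           => 443,                           // make the port explicit
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_COOKIEFILE     => dirname(__FILE__) . '/../../cookie.txt',
    CURLOPT_VERBOSE        => true,                          // verify "port 443" in the output
));
$response = json_decode(curl_exec($ch));
curl_close($ch);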

Related

Symfony HTTP Client getting original request infos

I'm using the HttpClient component with Symfony 5.2:
$response = $myApi->request($method, $url, $options);
When a request fails, I'd like to get detailed info about the original request, i.e. the request headers.
$response->getInfo() does not return them (only the response headers).
My $options don't always have all that I need because some can come from the client config.
I need to log this somewhere in production. I saw maintainers working on injecting a logger, but didn't find more info about it.
After a quick check on the code, I can see that a logger can be set but it seems to log only the method and URI.
How can I get the request info, like the headers or params/body?
Github Issue opened about this
$response->getInfo('debug') contains the request headers once the request has actually been sent by the client.
dump($response->getInfo('debug'));
* Found bundle for host myapi: 0x7fe4c3f81580 [serially]
* Can not multiplex, even if we wanted to!
* Re-using existing connection! (#0) with host myapi
* Connected to myapi (xxx.xxx.xxx.xxx) port xxxx (#0)
> POST /my/uri/ HTTP/1.1
Host: myapi:xxxx
Content-Type: application/json
Accept: */*
Authorization: Bearer e5vn9569-n76v9nd-v6n978-dv6n98
User-Agent: Symfony HttpClient/Curl
Accept-Encoding: gzip
Content-Length: 202
* upload completely sent off: 202 out of 202 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Content-Type: application/json; charset=utf-8
< Content-Length: 8
< ETag: W/"8-M1u4Sc28uxk+zyXJvSTJaEkyIGw"
< Date: Tue, 06 Apr 2021 07:36:10 GMT
< Connection: keep-alive
<
* Connection #0 to host myapi left intact
Also, TraceableHttpClient seems to be designed for detailed debugging.
Here's a quick sample of one of the options that I posted in my comment:
use Psr\Log\LoggerAwareInterface;
use Psr\Log\LoggerInterface;
use Symfony\Contracts\HttpClient\HttpClientInterface;

public function __construct(HttpClientInterface $client, LoggerInterface $logger)
{
    $this->client = $client;

    // The concrete client (e.g. CurlHttpClient) is logger-aware; the interface is not
    if ($this->client instanceof LoggerAwareInterface) {
        $this->client->setLogger($logger);
    }
}
You could also defer the setLogger() call to the place where you actually use the client.
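For completeness, a hedged sketch of the TraceableHttpClient route (the decorator records each request's method, URL, options and transport info; check the exact array keys against your Symfony version, and note the URL below is the placeholder from the question):
use Symfony\Component\HttpClient\HttpClient;
use Symfony\Component\HttpClient\TraceableHttpClient;

// Wrap any HttpClientInterface; the decorator records every request made through it
$client = new TraceableHttpClient(HttpClient::create());
$response = $client->request('POST', 'https://myapi/my/uri/', ['json' => ['key' => 'value']]);
$response->getContent(); // force the exchange to complete

foreach ($client->getTracedRequests() as $trace) {
    dump($trace['method'], $trace['url'], $trace['options'], $trace['info']);
}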

Twitter Api Search doesn't work through proxy

I have a weird problem with the new Twitter API. I followed the very good answer from this question to create a search on Twitter and used the TwitterAPIExchange.php from here.
Everything works fine as long as I am calling it directly from my server with cURL. But in the live environment I have to use a proxy with Basic Authentication.
All I've done is add the proxy authentication to the performRequest function:
if (defined('WP_PROXY_HOST') && defined('WP_PROXY_PORT') && defined('WP_PROXY_USERNAME') && defined('WP_PROXY_PASSWORD'))
{
    $options[CURLOPT_HTTPPROXYTUNNEL] = 1;
    $options[CURLOPT_PROXYAUTH]       = CURLAUTH_BASIC;
    $options[CURLOPT_PROXY]           = WP_PROXY_HOST . ':' . WP_PROXY_PORT;
    $options[CURLOPT_PROXYPORT]       = WP_PROXY_PORT;
    $options[CURLOPT_PROXYUSERPWD]    = WP_PROXY_USERNAME . ':' . WP_PROXY_PASSWORD;
}
Without the proxy I get a JSON response. But with the proxy I get:
HTTP/1.1 200 Connection established
HTTP/1.1 400 Bad Request
content-type: application/json; charset=utf-8
date: Fri, 20 Dec 2013 09:22:59 UTC
server: tfe
strict-transport-security: max-age=631138519
content-length: 61
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Set-Cookie: guest_id=v1%3A138753137985809686; Domain=.twitter.com; Path=/; Expires=Sun, 20-Dec-2015 09:22:59 UTC
Age: 0

{"errors":[{"message":"Bad Authentication data","code":215}]}
I've tried to simulate a proxy in my local environment with Charles Proxy, and it worked.
I'm assuming the proxy is either not sending the Authentication Header, or is changing data somehow.
Anybody with a clue....
EDIT:
Using the API over plain HTTP works, but over HTTPS it fails. I've tried setting CURLOPT_SSL_VERIFYPEER and CURLOPT_SSL_VERIFYHOST to FALSE, but Twitter's SSL certificate is valid, so disabling verification isn't recommended anyway.
Is the proxy response cached, or is the date in the proxy response old because you did perform the API call on the 20th of December?
If it is cached, your proxy may be serving a cached reply from an earlier, genuinely invalid request.
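If caching does turn out to be the issue, a minimal sketch of a workaround in performRequest (the cache-busting headers are standard HTTP; whether your proxy honours them is an assumption):
// Ask the proxy not to serve a cached copy; merge rather than overwrite,
// since the library already uses CURLOPT_HTTPHEADER for the Authorization header
$noCache = array('Cache-Control: no-cache', 'Pragma: no-cache');
$options[CURLOPT_HTTPHEADER] = array_merge(
    isset($options[CURLOPT_HTTPHEADER]) ? $options[CURLOPT_HTTPHEADER] : array(),
    $noCache
);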

Making PHP cURL request on Windows yields "400 Bad Request" from proxy

Morning all
Basically, I am unable to make successful cURL requests to internal and external servers from my Windows 7 development PC because of an issue involving a proxy server. I'm running cURL 7.21.2 thru PHP 5.3.6 on Apache 2.4.
Here's a most basic request that fails:
<?php
$curl = curl_init('http://www.google.com');
$log_file = fopen(sys_get_temp_dir() . '/curl.log', 'w'); // sys_get_temp_dir() has no trailing slash
curl_setopt_array($curl, array(
    CURLOPT_RETURNTRANSFER => TRUE,
    CURLOPT_VERBOSE        => TRUE,
    CURLOPT_HEADER         => TRUE,
    CURLOPT_STDERR         => $log_file,
));
$response = curl_exec($curl);
#fclose($log_file);
print "<pre>{$response}";
The following (complete) response is received.
HTTP/1.1 400 Bad Request
Date: Thu, 06 Sep 2012 17:12:58 GMT
Content-Length: 171
Content-Type: text/html
Server: IronPort httpd/1.1
Error response
Error code 400.
Message: Bad Request.
Reason: None.
The log file generated by cURL contains the following.
* About to connect() to proxy usushproxy01.unistudios.com port 7070 (#0)
* Trying 216.178.96.20... * connected
* Connected to usushproxy01.unistudios.com (216.178.96.20) port 7070 (#0)
> GET http://www.google.com HTTP/1.1
Host: www.google.com
Accept: */*
Proxy-Connection: Keep-Alive
< HTTP/1.1 400 Bad Request
< Date: Thu, 06 Sep 2012 17:12:58 GMT
< Content-Length: 171
< Content-Type: text/html
< Server: IronPort httpd/1.1
<
* Connection #0 to host usushproxy01.unistudios.com left intact
Explicitly stating the proxy and user credentials, as in the following, makes no difference: the response is always the same.
<?php
$curl = curl_init('http://www.google.com');
$log_file = fopen(sys_get_temp_dir() . '/curl.log', 'w');
curl_setopt_array($curl, array(
    CURLOPT_RETURNTRANSFER => TRUE,
    CURLOPT_VERBOSE        => TRUE,
    CURLOPT_HEADER         => TRUE,
    CURLOPT_STDERR         => $log_file,
    CURLOPT_PROXY          => 'http://usushproxy01.unistudios.com:7070',
    CURLOPT_PROXYUSERPWD   => '<username>:<password>',
));
$response = curl_exec($curl);
#fclose($log_file);
print "<pre>{$response}";
I was surprised to see an absolute URL in the request line ('GET ...'), but I think that's fine when dealing with proxy servers - according to the HTTP spec.
I've tried all sorts of combinations of options - sending a user agent, following various suggestions, and so on - from Stack Overflow questions and other sites, but every request ends in the same response.
The same problem occurs if I run the script on the command line, so it can't be an Apache issue, right?
If I make a request using cURL from a Linux box on the same network, I don't experience a problem.
It's the "Bad Request" thing that's puzzling me: what on earth is wrong with my request? Do you have any idea why I may be experiencing this problem? A Windows thing? A bug in the version of PHP/cURL I'm using?
Any help very gratefully received. Many thanks.
You might be looking at an issue between cURL (you have different versions on Windows and Linux) and your IronPort version. From the IronPort documentation:
Fixed: Web Proxy uses the Proxy-Connection header instead of the Connection header, causing problems with some user agents
Previously, the Web Proxy used the Proxy-Connection header instead of the Connection header when communicating with user agents with explicit forward requests. Because of this, some user agents, such as Real Player, did not work as expected. This no longer occurs. Now, the Web Proxy replies to the client using the Connection header in addition to the Proxy-Connection header. [Defect ID: 46515]
Try removing the Proxy-Connection header (or adding a Connection header) and see whether that solves the problem.
Also, you might want to compare the cURL logs between Windows and Linux hosts.
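As a minimal sketch of that header test (the empty-value trick for suppressing an internally generated header is documented libcurl behaviour; whether it appeases this particular IronPort is an assumption):
// A header name with nothing after the colon tells cURL to drop that header
// entirely; a plain Connection header is offered in its place.
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
    'Proxy-Connection:',       // suppress cURL's generated Proxy-Connection header
    'Connection: Keep-Alive',  // send a Connection header instead
));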

Can't seem to get a web page's contents via cURL - user agent and HTTP headers both set?

For some reason I can't seem to get this particular web page's contents via cURL. I've managed to use cURL to fetch the "top level page" contents fine, but the same self-built quick cURL function doesn't seem to work for one of the linked sub-pages.
Top level page: http://www.deindeal.ch/
A sub page: http://www.deindeal.ch/deals/hotel-cristal-in-nuernberg-30/
My cURL function (in functions.php)
function curl_get($url) {
    $ch = curl_init();
    $header = array(
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
        'Accept-Language: en-us;q=0.8,en;q=0.6'
    );
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => 0,
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13',
        CURLOPT_HTTPHEADER     => $header
    );
    curl_setopt_array($ch, $options);
    $return = curl_exec($ch);
    curl_close($ch);
    return $return;
}
PHP file to get the contents (using echo for testing)
require "functions.php";
require "phpQuery.php";
echo curl_get('http://www.deindeal.ch/deals/hotel-walliserhof-zermatt-2-naechte-30/');
So far I've attempted the following to get this to work:
Ran the file both locally (XAMPP) and remotely (LAMP).
Added the user agent and HTTP headers as recommended here: file_get_contents and CURL can't open a specific website - before that, the function curl_get() contained all the options shown above except CURLOPT_USERAGENT and CURLOPT_HTTPHEADER.
Is it possible for a website to completely block requests via cURL or other remote file opening mechanisms, regardless of how much data is supplied to mimic a real browser request?
Also, is it possible to diagnose why my requests are turning up with nothing?
Any help answering the above two questions, or editing/making suggestions to get the file's contents, even if through a method other than cURL, would be greatly appreciated ;).
Try adding:
CURLOPT_FOLLOWLOCATION => TRUE
to your options.
If you run a simple curl request from the command line (including a -i to see the response headers) then it is pretty easy to see:
$ curl -i 'http://www.deindeal.ch/deals/hotel-cristal-in-nuernberg-30/'
HTTP/1.1 302 FOUND
Date: Fri, 30 Dec 2011 02:42:54 GMT
Server: Apache/2.2.16 (Debian)
Vary: Accept-Language,Cookie,Accept-Encoding
Content-Language: de
Set-Cookie: csrftoken=d127d2de73fb3bd72e8986daeca86711; Domain=www.deindeal.ch; Max-Age=31449600; Path=/
Set-Cookie: generic_cookie=1; Path=/
Set-Cookie: sessionid=987b1a11224ecd0e009175470cf7317b; expires=Fri, 27-Jan-2012 02:42:54 GMT; Max-Age=2419200; Path=/
Location: http://www.deindeal.ch/welcome/?deal_slug=hotel-cristal-in-nuernberg-30
Content-Length: 0
Connection: close
Content-Type: text/html; charset=utf-8
As you can see, it returns a 302 with a Location header. If you hit that location directly, you will get the content you are looking for.
And to answer your two questions:
No, it is not possible to block requests from something like cURL. If the consumer can talk HTTP, then it can get to anything the browser can get to.
Diagnosing with an HTTP proxy could have been helpful for you. Wireshark, Fiddler, Charles, et al. should help you out in the future. Or, do like I did and make a request from the command line.
EDIT
Ah, I see what you are talking about now. So, when you go to that link for the first time you are redirected and a cookie (or cookies) is set. Once you have those cookies, your request goes through as intended.
So, you need to use a cookiejar, like in this example: http://icfun.blogspot.com/2009/04/php-how-to-use-cookie-jar-with-curl.html
So, you will need to make an initial request, save the cookies, and make your subsequent requests including the cookies after that.
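A minimal PHP sketch of that flow, using the sub-page URL from the question (the jar path is a placeholder; CURLOPT_COOKIEJAR writes cookies out when the handle closes, CURLOPT_COOKIEFILE reads them back in and enables the cookie engine for the redirect chain):
$jar = sys_get_temp_dir() . '/deindeal_cookies.txt'; // must be writable

$ch = curl_init('http://www.deindeal.ch/deals/hotel-cristal-in-nuernberg-30/');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true, // follow the 302 automatically
    CURLOPT_COOKIEJAR      => $jar, // save cookies set along the redirect chain
    CURLOPT_COOKIEFILE     => $jar, // and send them on subsequent hops
));
$html = curl_exec($ch);
curl_close($ch);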

PHP cURL: HTTP headers show 302 and cookies set, cookies are saved and sent, same headers appear?

This is kind of a carry on from a question asked yesterday: Can't seem to get a web page's contents via cURL - user agent and HTTP headers both set?
I'm attempting to access a URL's contents; the problem is the way this URL handles requests.
The url: http://www.deindeal.ch/deals/atlas-grand-hotel-2-naechte-30-2/
First request (without cookies):
After "learning" to use curl in the command line (props to #d3v3us), a simple request curl -i http://www.deindeal.ch/deals/atlas-grand-hotel-2-naechte-30-2/ shows the following:
curl -i http://www.deindeal.ch/deals/atlas-grand-hotel-2-naechte-30-2/
HTTP/1.1 302 FOUND
Date: Fri, 30 Dec 2011 13:15:00 GMT
Server: Apache/2.2.16 (Debian)
Vary: Accept-Language,Cookie,Accept-Encoding
Content-Language: de
Set-Cookie: csrftoken=edc8c77fc74f5e788c53488afba4e50a; Domain=www.deindeal.ch; Max-Age=31449600; Path=/
Set-Cookie: generic_cookie=1; Path=/
Set-Cookie: sessionid=740a8a2cb9fb51166dcf865e35b91888; expires=Fri, 27-Jan-2012 13:15:00 GMT; Max-Age=2419200; Path=/
Location: http://www.deindeal.ch/welcome/?deal_slug=atlas-grand-hotel-2-naechte-30-2
Content-Length: 0
Connection: close
Content-Type: text/html; charset=utf-8
Second request (with cookies):
So, I save the cookie using -c, check that it saves as cookie.txt, and run the request again with the addition of -b cookie.txt, getting this:
curl -i -b cookie.txt http://www.deindeal.ch/deals/atlas-grand-hotel-2-naechte-30-2/
HTTP/1.1 302 FOUND
Date: Fri, 30 Dec 2011 13:38:17 GMT
Server: Apache/2.2.16 (Debian)
Vary: Accept-Language,Cookie,Accept-Encoding
Content-Language: de
Set-Cookie: csrftoken=49f5c804d399f8581253630631692f5f; Domain=www.deindeal.ch; Max-Age=31449600; Path=/
Location: http://www.deindeal.ch/welcome/?deal_slug=atlas-grand-hotel-2-naechte-30-2
Content-Length: 0
Connection: close
Content-Type: text/html; charset=utf-8
To me this looks like exactly the same contents, minus one or two parameters in the cookie, but maybe I'm overlooking something?
I'm attempting to get the curl request to function and return the same contents as when requesting that url via a browser, but I'm not sure what I should do next.
Note: I've tagged this PHP, as I am using PHP to make the requests; I'm simply using the command line to easily show the returned headers - so if there are any other PHP libraries or methods that would work (better, or in a place that cURL wouldn't), please feel free to suggest any.
Any help would be greatly appreciated ;).
You need this:
curl -iL -c cookie.txt -b cookie.txt http://www.deindeal.ch/deals/atlas-grand-hotel-2-naechte-30-2/
The -b flag is used to read cookies from a file. For a file to be used to save cookies after the HTTP transaction, use the -c flag. It's called the cookie jar.
Using WebGet (sorry, it's written by me), pulling the contents is quite simple.
require "WebGet.php";
$w = new WebGet();
$w->cookieFile = 'cookie.txt'; // must be writable
$w->requestContent("https://github.com/shiplu/dxtool");
print_r($w->responseHeaders); // prints response headers
print_r($w->cachedContent);   // prints url content
I may be misunderstanding your question, but a 302 response means Found (a redirect), and you just need to follow the Location header, right? cURL will only perform one request, unlike your browser, which will see that 302 (and set the cookies, just as you're doing), then follow the Location header. It looks like the Location has a "?" in it that isn't in the original URL. Run cURL, with that same cookie jar, on the Location URL.
http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#3xx_Redirection
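In PHP, a minimal sketch of that last step (CURLINFO_REDIRECT_URL reports the pending redirect when CURLOPT_FOLLOWLOCATION is off; it needs PHP 5.3.7+/curl 7.18.2+, and the URL and jar name are taken from the question):
$jar = 'cookie.txt';

// First request: returns the 302 and fills the cookie jar on curl_close()
$ch = curl_init('http://www.deindeal.ch/deals/atlas-grand-hotel-2-naechte-30-2/');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_COOKIEJAR      => $jar,
    CURLOPT_COOKIEFILE     => $jar,
));
curl_exec($ch);
$next = curl_getinfo($ch, CURLINFO_REDIRECT_URL); // the Location target
curl_close($ch);

// Second request: hit the Location URL with the saved cookies
$ch = curl_init($next);
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_COOKIEFILE     => $jar, // replay the saved cookies
));
$html = curl_exec($ch);
curl_close($ch);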
