View cURL request headers complete with POST data - php

How can I view the full request headers, including post data, using libcurl in php?
I am trying to simulate the post of a page, which when done from a browser and viewed in Live HTTP Headers looks like this:
https://###.com
POST /###/### HTTP/1.1
Host: ###.###.com
...snipped normal looking headers...
Content-Type: multipart/form-data; boundary=---------------------------28001808731060
Content-Length: 697
-----------------------------28001808731060
Content-Disposition: form-data; name="file_data"; filename="stats.csv"
Content-Type: text/csv
id,stats_id,scope_id,stat_begin,stat_end,value
61281,1,4,2011-01-01 00:00:00,2011-12-31 23:59:59,0
-----------------------------28001808731060
Content-Disposition: form-data; name="-save"
Submit
-----------------------------28001808731060--
So we nicely see the file I'm uploading, its content, everything's there. But all my attempts to get this data out of cURL when I make the same POST from PHP (using CURLOPT_VERBOSE, or CURLINFO_HEADER_OUT) show request headers that lack the POST data, like so:
POST /###/### HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Host: ###.###.com
...snipped normal-looking headers...
Content-Length: 697
Content-Type: multipart/form-data; boundary=----------------------------e90de36c15f5
Based on the Content-Length here, it appears things are going well, but it would really help my debugging efforts to be able to see the complete request. It also irks me that this is difficult; I should be able to see the whole thing, so I know I must be missing something.
--- EDIT ---
What I'm looking for is the equivalent of this:
curl --trace-ascii debugdump.txt http://www.example.com/
which seems to be available in libcurl via the option CURLOPT_DEBUGFUNCTION, but that isn't implemented in PHP. Boo.

I had a need to do precisely this, but I needed to test communication with a bank.
It is extremely easy to use Fiddler2, enable HTTPS traffic decryption, and have cURL use Fiddler2 as a proxy for debugging in this situation:
$proxy = '127.0.0.1:8888';
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
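One caveat when the target is HTTPS: Fiddler2 re-signs the traffic with its own certificate, so cURL's certificate verification must either trust Fiddler's root CA or be relaxed for the debugging session. A minimal sketch (the URL is a placeholder, and Fiddler is assumed to be on its default port 8888):

```php
<?php
// Placeholder URL; substitute the endpoint you are debugging.
$ch = curl_init('https://example.com/upload');

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// Route the request through Fiddler2 (default port 8888).
curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);

// Fiddler re-signs HTTPS traffic with its own certificate, so either
// import Fiddler's root CA, or (for local debugging only) disable checks.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

$response = curl_exec($ch);
curl_close($ch);
```

With "Decrypt HTTPS traffic" enabled in Fiddler, the complete request, multipart body included, shows up in Fiddler's inspectors.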

You are sending multipart/form-data. cURL does show the complete HTTP headers. The "problem" is that a multipart/form-data body consists of multiple parts, each with its own sub-headers. Those are beyond the "first-level" HTTP headers and are part of the HTTP body.
I don't know your environment, but you can also debug by monitoring the TCP traffic. For this, you can use Wireshark or tcpdump; Wireshark can also open dump files created by tcpdump.
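Another option is to take control of the body yourself: build the multipart payload by hand, log it, then send it as a raw string via CURLOPT_POSTFIELDS. A rough sketch; the boundary string, field names, and helper function are made up for illustration:

```php
<?php
// Build a multipart/form-data body by hand so the exact bytes sent
// can be logged. Field names and the boundary are illustrative.
function build_multipart(array $fields, string $boundary): string {
    $body = '';
    foreach ($fields as $name => $value) {
        $body .= "--$boundary\r\n";
        $body .= "Content-Disposition: form-data; name=\"$name\"\r\n\r\n";
        $body .= "$value\r\n";
    }
    $body .= "--$boundary--\r\n"; // closing boundary has trailing dashes
    return $body;
}

$boundary = 'debug-boundary-12345';
$body = build_multipart(['-save' => 'Submit'], $boundary);

// Log the exact body before sending it:
file_put_contents(sys_get_temp_dir() . '/request_body.log', $body);

// Then send it with a matching Content-Type header:
// curl_setopt($ch, CURLOPT_HTTPHEADER,
//     ["Content-Type: multipart/form-data; boundary=$boundary"]);
// curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
```

File parts would additionally need a filename attribute and a per-part Content-Type line, as in the Live HTTP Headers capture above.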


Understanding how XMLHttpRequest sends data to a server

I don't completely understand how data is transferred to the server. What options do I have? When I started learning PHP, I thought there were two ways: GET, which encodes data in the URL, and POST, which sends data to the server some other way. I didn't know where exactly, though.
Now I want to learn about RESTful server backends, and I've learned that GET and POST are just request methods, among others like PUT and DELETE, which don't seem to have anything to do with how data is transferred to the server.
Moreover, I read that additional data can be sent in the HTTP header. Is this how POST actually sends its data, or is there even a difference?
I would like to read POST data regardless of the request method using PHP's $_POST array, but this doesn't work. On the other hand, when I try to manually parse the request from php://input, I cannot see the POST data. Could someone please explain where the data is transferred in the different cases?
My goal is to get parameters from the client regardless of content type (which may be form-data, JSON, or something else) and request method. How can I do this in PHP? Requests will be sent using jQuery's AJAX functionality.
To explain how HTTP works, let's use nc (netcat):
http://linux.die.net/man/1/nc
GET
Start a dummy server listening on port 8888:
$ nc -l 8888
Send a GET request using jQuery (implemented via XHR):
$.get("http://localhost:8888", { a: 1, b: 2 })
nc prints what the XHR sent to the server on stdout:
$nc -l 8888
GET /?a=1&b=2&_=1383234919249 HTTP/1.1
Host: localhost:8888
Connection: keep-alive
Accept: */*
Origin: http://stackoverflow.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
DNT: 1
Referer: http://stackoverflow.com/questions/19710815/understanding-how-xmlhttprequest-sends-data-to-a-server
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4
Thus, PHP parses the query string of GET /?a=1&b=2&_=1383234919249 into the $_GET array.
POST
Using nc the same way to record a POST:
POST / HTTP/1.1
Host: localhost:8888
Connection: keep-alive
Content-Length: 7
Accept: */*
Origin: http://stackoverflow.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
Content-Type: application/x-www-form-urlencoded
DNT: 1
Referer: http://stackoverflow.com/questions/19710815/understanding-how-xmlhttprequest-sends-data-to-a-server
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4
a=1&b=2
Here you can see Content-Type: application/x-www-form-urlencoded, which says that the HTTP body sent by the browser is form-encoded. As a result, PHP parses a=1&b=2 into the $_POST array.
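You can reproduce that parsing step outside of a request with parse_str(), which performs the same urlencoded parsing PHP applies to a form-encoded body:

```php
<?php
// parse_str() performs the urlencoded parsing that PHP uses to
// populate $_POST from a form-encoded request body.
parse_str('a=1&b=2', $params);
// $params is now ['a' => '1', 'b' => '2'] (values arrive as strings)
```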
Why php://input can't see the POST body
According to http://php.net/manual/en/wrappers.php.php, php://input is a stream and can only be read once. From the PHP docs:
Note: A stream opened with php://input can only be read once; the
stream does not support seek operations. However, depending on the
SAPI implementation, it may be possible to open another php://input
stream and restart reading. This is only possible if the request body
data has been saved. Typically, this is the case for POST requests,
but not other request methods, such as PUT or PROPFIND.
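To address the stated goal of reading parameters regardless of content type, one approach is to read php://input once and branch on the Content-Type header. A sketch under those assumptions; the helper function name and structure are my own, not a standard API:

```php
<?php
// Parse a request body according to its Content-Type.
// Helper name and structure are illustrative, not a standard API.
function parse_request_body(string $contentType, string $raw): array {
    if (strpos($contentType, 'application/json') === 0) {
        // JSON bodies (e.g. from $.ajax with contentType: 'application/json')
        return json_decode($raw, true) ?? [];
    }
    if (strpos($contentType, 'application/x-www-form-urlencoded') === 0) {
        // Classic form-encoded bodies; same parsing PHP uses for $_POST
        parse_str($raw, $params);
        return $params;
    }
    return []; // multipart/form-data etc. would need extra handling
}

// In a real request you would call it as:
// $params = parse_request_body($_SERVER['CONTENT_TYPE'] ?? '',
//                              file_get_contents('php://input'));
```

Remember that php://input can only be read once, so read it a single time and pass the string around.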

OAuth/Twitteroauth library and Authorization headers

I am trying to better understand OAuth by experimenting with the twitteroauth php library.
It is my understanding that the way to authenticate over OAuth is to make use of an 'Authorization' header when using cURL. However, examining the source of the twitteroauth library, I can see that the header is set like so for POST requests:
curl_setopt($ci, CURLOPT_HTTPHEADER, array('Expect:'));
And the parameters that should be set in the 'Authorization' header are actually being set in the POST body instead, in the line:
curl_setopt($ci, CURLOPT_POSTFIELDS, $postfields);
What is the reason for it being done this way, when the Twitter API guidelines specify the following implementation for the header:
POST /1/statuses/update.json?include_entities=true HTTP/1.1
Accept: */*
Connection: close
User-Agent: OAuth gem v0.4.4
Content-Type: application/x-www-form-urlencoded
Authorization:
OAuth oauth_consumer_key="xvz1evFS4wEEPTGEFPHBog",
oauth_nonce="kYjzVBB8Y0ZFabxSWbWovY3uYSQ2pTgmZeNu2VS4cg",
oauth_signature="tnnArxj06cWHq44gCs1OSKk%2FjLY%3D",
oauth_signature_method="HMAC-SHA1",
oauth_timestamp="1318622958",
oauth_token="370773112-GmHxMAgYyLbNEtIKZeRNFsMKPR9EyMZeS9weJAEb",
oauth_version="1.0"
Content-Length: 76
Host: api.twitter.com
status=Hello%20Ladies%20%2b%20Gentlemen%2c%20a%20signed%20OAuth%20request%21
A client may add the HTTP Expect header to tell the server "hey, I expect you to behave in a certain way." Twitter's implementation wants you to expect nothing; sending an empty 'Expect:' header suppresses cURL's default Expect: 100-continue handshake on larger POST bodies, which some servers don't handle well. I ran into this with my own home-grown implementation.
You may present your credentials and signature in the POST variables, or in the header. Both will work as long as the correct variables are set (oauth_consumer_key, oauth_nonce, oauth_signature, oauth_signature_method, oauth_timestamp, and oauth_token).
I find setting the Authorization header to be cleaner because it does not depend upon the request method (GET, POST, PUT, etc). But Twitter handles both cases perfectly fine. If that's how they implemented it in their library, so be it.
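For reference, assembling the header form is mostly string formatting: percent-encode each key and value, sort the keys, and join them into a single OAuth ... line. A sketch that assumes the signature has already been computed; the parameter values are placeholders, not real credentials:

```php
<?php
// Build an OAuth 1.0 Authorization header from already-signed parameters.
// The values below are placeholders, not real credentials.
function build_oauth_header(array $params): string {
    ksort($params); // stable, readable ordering
    $pairs = [];
    foreach ($params as $key => $value) {
        // Each pair is percent-encoded and double-quoted
        $pairs[] = rawurlencode($key) . '="' . rawurlencode($value) . '"';
    }
    return 'OAuth ' . implode(', ', $pairs);
}

$header = build_oauth_header([
    'oauth_consumer_key'     => 'xvz1evFS4wEEPTGEFPHBog',
    'oauth_signature_method' => 'HMAC-SHA1',
    'oauth_version'          => '1.0',
]);
// Then attach it to the request:
// curl_setopt($ch, CURLOPT_HTTPHEADER, ["Authorization: $header"]);
```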

Can't seem to get a web page's contents via cURL - user agent and HTTP headers both set?

For some reason I can't seem to get this particular web page's contents via cURL. I've managed to use cURL to get to the "top level page" contents fine, but the same self-built quick cURL function doesn't seem to work for one of the linked off sub web pages.
Top level page: http://www.deindeal.ch/
A sub page: http://www.deindeal.ch/deals/hotel-cristal-in-nuernberg-30/
My cURL function (in functions.php)
function curl_get($url) {
    $ch = curl_init();
    $header = array(
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
        'Accept-Language: en-us;q=0.8,en;q=0.6'
    );
    $options = array(
        CURLOPT_URL => $url,
        CURLOPT_HEADER => 0,
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13',
        CURLOPT_HTTPHEADER => $header
    );
    curl_setopt_array($ch, $options);
    $return = curl_exec($ch);
    curl_close($ch);
    return $return;
}
PHP file to get the contents (using echo for testing)
require "functions.php";
require "phpQuery.php";
echo curl_get('http://www.deindeal.ch/deals/hotel-walliserhof-zermatt-2-naechte-30/');
So far I've attempted the following to get this to work
Ran the file both locally (XAMPP) and remotely (LAMP).
Added in the user agent and HTTP headers as recommended in "file_get_contents and CURL can't open a specific website". Before that, the function curl_get() contained all the options as current, except for CURLOPT_USERAGENT and CURLOPT_HTTPHEADER.
Is it possible for a website to completely block requests via cURL or other remote file opening mechanisms, regardless of how much data is supplied to attempt to make a real browser request?
Also, is it possible to diagnose why my requests are turning up with nothing?
Any help answering the above two questions, or editing/making suggestions to get the file's contents, even through a method other than cURL, would be greatly appreciated ;).
Try adding:
CURLOPT_FOLLOWLOCATION => TRUE
to your options.
If you run a simple curl request from the command line (including a -i to see the response headers) then it is pretty easy to see:
$ curl -i 'http://www.deindeal.ch/deals/hotel-cristal-in-nuernberg-30/'
HTTP/1.1 302 FOUND
Date: Fri, 30 Dec 2011 02:42:54 GMT
Server: Apache/2.2.16 (Debian)
Vary: Accept-Language,Cookie,Accept-Encoding
Content-Language: de
Set-Cookie: csrftoken=d127d2de73fb3bd72e8986daeca86711; Domain=www.deindeal.ch; Max-Age=31449600; Path=/
Set-Cookie: generic_cookie=1; Path=/
Set-Cookie: sessionid=987b1a11224ecd0e009175470cf7317b; expires=Fri, 27-Jan-2012 02:42:54 GMT; Max-Age=2419200; Path=/
Location: http://www.deindeal.ch/welcome/?deal_slug=hotel-cristal-in-nuernberg-30
Content-Length: 0
Connection: close
Content-Type: text/html; charset=utf-8
As you can see, it returns a 302 with a Location header. If you hit that location directly, you will get the content you are looking for.
And to answer your two questions:
No, it is not possible to block requests from something like cURL. If the consumer can talk HTTP, then it can get to anything the browser can get to.
Diagnosing with an HTTP proxy could have been helpful for you. Wireshark, Fiddler, Charles, et al. should help you out in the future. Or do like I did and make a request from the command line.
EDIT
Ah, I see what you are talking about now. So, when you go to that link for the first time, you are redirected and a cookie (or cookies) is set. Once you have those cookies, your request goes through as intended.
So, you need to use a cookiejar, like in this example: http://icfun.blogspot.com/2009/04/php-how-to-use-cookie-jar-with-curl.html
So, you will need to make an initial request, save the cookies, and make your subsequent requests including the cookies after that.
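Putting the pieces together, a cookie-jar-plus-redirect version of the request might look like the sketch below. The jar path is arbitrary; any file writable by PHP will do:

```php
<?php
// Fetch a page that 302-redirects and sets cookies on the first hit.
// The cookie jar path is arbitrary; any writable file will do.
$jar = sys_get_temp_dir() . '/curl_cookies.txt';

$ch = curl_init('http://www.deindeal.ch/deals/hotel-cristal-in-nuernberg-30/');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_FOLLOWLOCATION => 1,    // follow the 302 to the Location URL
    CURLOPT_COOKIEJAR      => $jar, // write received cookies here on close
    CURLOPT_COOKIEFILE     => $jar, // send stored cookies on later requests
));
$html = curl_exec($ch);
curl_close($ch);
```

Reusing the same handle (or the same jar file) for subsequent requests means the session cookies from the first response get sent along automatically.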

Display HTTP Response on Webpage using PHP

I am looking to put together a website that displays the full HTTP request headers and HTTP response headers for the loading of the page itself. For instance, if someone browses to http://example.com/index.php, I want the following to display:
HTTP Request
GET /index.php HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
HTTP Response
HTTP/1.1 200 OK
Date: Mon, 21 Dec 2011 10:20:46 GMT
Server: Apache/2.2.15 (Red Hat)
X-Powered-By: PHP/5.3.3
Content-Length: 1169
Connection: close
Content-Type: text/html; charset=UTF-8
We were able to get the Request header to display fairly simply using the following PHP code:
print $_SERVER['REQUEST_METHOD']." ".$_SERVER['REQUEST_URI']." ".$_SERVER['SERVER_PROTOCOL']."<br>";
foreach (apache_request_headers() as $name => $value) {
    echo "$name: $value<br>";
}
But we are having some difficulties with the HTTP response headers. Anyone have any ideas of how we can do this? It does not have to be PHP if you have a method that works in Perl or CGI or whatever.
To be clear, I don't mean to set the HTTP Response to anything specific, only display the response served by the web server to load the page.
You want to use headers_list()
http://www.php.net/manual/en/function.headers-list.php
headers_list() will return a list of headers to be sent to the browser / client. To determine whether or not these headers have been sent yet, use headers_sent().
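For example, calling headers_list() late in the script shows whatever response headers PHP itself has queued so far. Note the limitation: headers the web server adds on its own (Date, Server, Content-Length, etc.) will not appear, because PHP never sees them. A sketch:

```php
<?php
// headers_list() reports only headers queued by PHP via header() and
// friends, not ones Apache/nginx adds afterwards (Date, Server, ...).
header('Content-Type: text/html; charset=UTF-8');
header('X-Debug: example'); // illustrative custom header

foreach (headers_list() as $headerLine) {
    // Render each queued response header; escape for safe HTML output.
    echo htmlspecialchars($headerLine), "<br>\n";
}
```

Under the CLI SAPI header() is effectively a no-op, so this only produces output when run under a web server.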
Well, here is the issue: the response header is generated after PHP (or any server-side language, for that matter) has already completed its job.
To put it in English, it's like the postman handing you a letter and you asking him to explain how the process of delivering it went. He will probably just look at you dumbfounded.
You will need a client-side language (i.e. JavaScript) to perform this task.
Use PHP to get the headers sent to the web server.
http://www.php.net/manual/en/function.apache-request-headers.php
Use JavaScript to get headers sent by the web server. I would suggest using jQuery for that.
http://api.jquery.com/jQuery.ajax/#jqXHR
This way you are sure that you get all the headers which are either received by the web server or the browser.
Check out get_headers() in the PHP manual.
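Be aware that get_headers() issues a fresh request of its own and returns that response's headers, which is close to, but not the same as, the headers of the page load currently being rendered:

```php
<?php
// get_headers() performs a *new* HTTP request to the URL and returns
// the response headers of that request as an array of strings.
$headers = get_headers('http://example.com/');
foreach ($headers as $line) {
    echo $line, "\n"; // first line is the status, e.g. "HTTP/1.1 200 OK"
}
```

For a dynamic page, the second request may get different headers (cookies, cache behavior) than the one that served the visitor.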

Can a cURL based HTTP request imitate a browser based request completely?

This is a two part question.
Q1: Can a cURL-based request 100% imitate a browser-based request?
Q2: If yes, which options should be set? If not, what extra does the browser do that cannot be imitated by cURL?
I have a website, and I see thousands of requests being made from a single IP in a very short time. These requests harvest all my data. When I looked at the log to identify the agent used, it looks like a request from a browser. So I was curious to know whether it's a bot and not a user.
Thanks in advance
This page has all the answers to your questions. You can imitate a browser for the most part.
R1: I suppose, if you set all the correct headers, that yes, a cURL-based request can imitate a browser-based one: after all, both send an HTTP request, which is just a couple of lines of text following a specific convention (namely, the HTTP RFC).
R2: The best way to answer that question is to take a look at what your browser is sending; with Firefox, for instance, you can use either Firebug or LiveHTTPHeaders to get that.
For instance, to get this page, Firefox sent those request headers :
GET /questions/1926876/can-a-curl-based-http-request-imitate-a-browser-based-request-completely HTTP/1.1
Host: stackoverflow.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2b4) Gecko/20091124 Firefox/3.6b4
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://stackoverflow.com/questions/1926876/can-a-curl-based-http-request-imitate-a-browser-based-request-completely/1926889
Cookie: .......
Cache-Control: max-age=0
(I just removed a couple of pieces of information, but you get the idea ;-) )
Using cURL, you can work with curl_setopt to set the HTTP headers; here, you'd probably have to use a combination of CURLOPT_HTTPHEADER, CURLOPT_COOKIE, CURLOPT_USERAGENT, ...
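Translating the captured headers above into cURL options might look like the following sketch; the cookie value is a placeholder for whatever your browser actually sent:

```php
<?php
// Replay the captured browser request with cURL.
// The cookie value is a placeholder, not a real session.
$ch = curl_init('https://stackoverflow.com/');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_USERAGENT  => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2b4) Gecko/20091124 Firefox/3.6b4',
    CURLOPT_HTTPHEADER => array(
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3',
        'Referer: http://stackoverflow.com/',
    ),
    CURLOPT_COOKIE   => 'name=value', // placeholder cookie
    CURLOPT_ENCODING => '',           // let curl negotiate gzip/deflate
));
$html = curl_exec($ch);
curl_close($ch);
```

Setting CURLOPT_ENCODING to an empty string tells libcurl to send an Accept-Encoding header with every encoding it supports and to decompress the response automatically.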
