This is a two-part question.
Q1: Can a cURL-based request 100% imitate a browser-based request?
Q2: If yes, which options should be set? If not, what extra does the browser do that cannot be imitated by cURL?
I have a website, and I see thousands of requests coming from a single IP address in a very short time. These requests harvest all my data. When I looked at the log to identify the agent used, the requests looked like they came from a browser, so I am curious to know whether it is a bot and not a user.
Thanks in advance
This page has all the answers to your questions. You can imitate most of it.
R1: I suppose that if you set all the correct headers, then yes, a curl-based request can imitate a browser-based one. After all, both send an HTTP request, which is just a few lines of text following a specific convention (namely, the HTTP RFC).
R2: The best way to answer that question is to look at what your browser is sending. With Firefox, for instance, you can use either Firebug or LiveHTTPHeaders to see that.
For instance, to get this page, Firefox sent these request headers:
GET /questions/1926876/can-a-curl-based-http-request-imitate-a-browser-based-request-completely HTTP/1.1
Host: stackoverflow.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2b4) Gecko/20091124 Firefox/3.6b4
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://stackoverflow.com/questions/1926876/can-a-curl-based-http-request-imitate-a-browser-based-request-completely/1926889
Cookie: .......
Cache-Control: max-age=0
(I just removed a few details, but you get the idea ;-) )
Using curl, you can set the HTTP headers with curl_setopt. Here, you'd probably have to use a combination of CURLOPT_HTTPHEADER, CURLOPT_COOKIE, CURLOPT_USERAGENT, ...
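To illustrate the idea of copying the browser's headers onto a scripted request, here is a minimal sketch in Python rather than PHP. It uses only the standard library's urllib.request and builds the request without sending it. The function name build_browser_like_request and the BROWSER_HEADERS table are made up for this example; the header values are taken from the Firefox dump above, and the cookie is left as a parameter because its value was removed there.

```python
from urllib.request import Request

# Header values copied from the Firefox request shown above.
# Accept-Encoding is left out on purpose: urllib does not decompress
# gzip responses by itself, so advertising it would need extra handling.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2b4) "
                  "Gecko/20091124 Firefox/3.6b4",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3",
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7",
    "Cache-Control": "max-age=0",
}


def build_browser_like_request(url, referer=None, cookie=None):
    """Return a GET request carrying browser-style headers (nothing is sent)."""
    headers = dict(BROWSER_HEADERS)
    if referer is not None:
        headers["Referer"] = referer
    if cookie is not None:
        # The caller supplies the cookie string; none is invented here.
        headers["Cookie"] = cookie
    return Request(url, headers=headers, method="GET")


# Hypothetical URLs, used only to show the resulting header list.
request = build_browser_like_request(
    "http://example.com/page", referer="http://example.com/"
)
for name, value in request.header_items():
    print(f"{name}: {value}")
```

On the server side, such a request carries the same header lines a browser would send, which is why a User-Agent string in the log does not by itself tell a bot from a user.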
If a logged-in user navigates to a certain area of the site that uses WebSockets, how can I grab that user's session ID so I can identify them on the server?
My server is basically an endless while loop that holds information about all connected users, so I figured the only suitable moment to grab that ID is at the handshake. Unfortunately, the handshake's request headers contain no cookie data:
Request Headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.5
Cache-Control: no-cache
Connection: keep-alive, Upgrade
DNT: 1
Host: 192.168.1.2:9300
Origin: http://localhost
Pragma: no-cache
Sec-WebSocket-Key: 5C7zarsxeh1kdcAIdjQezg==
Sec-WebSocket-Version: 13
Upgrade: websocket
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0
So how can I really grab that ID? I thought I could somehow force JavaScript to send cookie data along with that request, but any self-respecting website in 2014 will have HttpOnly session cookies, so that won't work. Any help is greatly appreciated!
Here's a link for the server I'm using: https://github.com/Flynsarmy/PHPWebSocket-Chat/blob/master/class.PHPWebSocket.php (thanks to accepted answer)
HttpOnly cookies as well as Secure cookies work fine with WebSockets.
Some websocket modules have chosen to ignore cookies in the request, so you need to read the specs of the module.
Try: websocket node: https://github.com/Worlize/WebSocket-Node.
Make sure to use the secure websocket protocol as wss://xyz.com
Update:
Also, Chrome will not show the cookies in the "Inspect Element" Network tab.
In node try dumping the request, something like:
wsServer.on('request', function(request) {
    console.log(request);
    console.log(request.cookies); // works in WebSocket-Node
});
If you see the cookies somewhere in the log...you've got it.
If you're using secure-only cookies, you need to be in secure web sockets: wss://
Update2:
The cookies are passed in the initial request. Chrome does not always show them, because it sometimes shows provisional headers, which omit cookie information.
It is up to the websocket server to do 'something' with the cookies and attach them to each request.
Looking at the code of your server: https://github.com/Flynsarmy/PHPWebSocket-Chat/blob/master/class.PHPWebSocket.php I do not see the word "cookie" anywhere, so it is not being nicely packaged and attached to each websocket connection. I could be wrong, that's why you might want to contact the developer and see if the whole header is being attached to each connection and how to access it.
This I can say for certain: If you're using secure cookies then cookies will not be transmitted unless you use the secure websocket wss://mysite.com. Plain ws://mysite.com will not work.
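Also, cookies will only be transmitted in the request if the WebSocket host is the same domain as the web page. In the handshake you posted, the Host is 192.168.1.2:9300 while the Origin is http://localhost, so cookies set for localhost are not sent to that host.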
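As a rough illustration of "doing something with the cookies" at handshake time, here is a minimal Python sketch (not the PHP server from the question) that pulls a session ID out of the raw handshake text. The function name session_id_from_handshake is made up for this example, and the cookie name PHPSESSID is an assumption based on PHP's default session cookie name; adjust it to whatever your site uses.

```python
from http.cookies import SimpleCookie
from typing import Optional


def session_id_from_handshake(raw_request: str,
                              cookie_name: str = "PHPSESSID") -> Optional[str]:
    """Return the named cookie's value from a raw HTTP handshake, or None."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:          # skip the request line ("GET ... HTTP/1.1")
        if line == "":              # blank line marks the end of the headers
            break
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "cookie":
            jar = SimpleCookie()
            jar.load(value.strip())
            morsel = jar.get(cookie_name)
            return morsel.value if morsel is not None else None
    return None


# Hypothetical handshake text, for illustration only.
handshake = (
    "GET /chat HTTP/1.1\r\n"
    "Host: example.test:9300\r\n"
    "Upgrade: websocket\r\n"
    "Connection: keep-alive, Upgrade\r\n"
    "Cookie: PHPSESSID=abc123; theme=dark\r\n"
    "\r\n"
)
print(session_id_from_handshake(handshake))
```

The server would call something like this once per connection, while it still has the handshake headers, and store the result alongside the connection. HttpOnly does not get in the way here, because that flag only hides the cookie from page scripts, not from the request itself.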
I'm working through a security assessment report on a PHP app generated by Acunetix.
The report claims a SQL injection vulnerability. The app is PHP with MySQL. Here are the headers it says are making the attack (specifically the accept-language header):
GET /user_login.php HTTP/1.1
user-agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
accept-language: 1;select pg_sleep(1); --
X-Requested-With: XMLHttpRequest
Cookie: PHPSESSID=35kno6h8kmkbin973q02gojp82; uniqueuser=1382404387
Host: xxx.xxx.com
Connection: Keep-alive
Accept-Encoding: gzip,deflate
I haven't found "accept-language" or "accept_language" anywhere in the app. Also, pg_sleep() isn't a MySQL function.
I searched for a known vulnerability in PHP and didn't find anything. Is this a false positive, or am I missing something?
Accept-Language is a request header sent by the client's browser.
Acunetix was manipulating these headers by injecting malicious code to find security holes (imitating hackers), in order to test whether your application is vulnerable to them.
If you don't use the accept-language header, or request headers in general, in your DB queries, then it is probably a false positive. To make sure, look at the response to that request. If the response is normal, then all is OK.
The code probably treats that header as a source for selecting the language, and that is done via a database query. When the query is generated, the contents of the HTTP header are handled improperly.
The reason for you not seeing this might be because the fetching of the HTTP headers is done indirectly (like in $_SERVER[$language_header]).
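If a header value does reach a query somewhere, the usual defence is to pass it as a bound parameter instead of concatenating it into the SQL text. The sketch below shows that in Python with an in-memory SQLite database standing in for MySQL; the languages table and the function names are invented for the example and are not taken from the app in question.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical languages table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE languages (code TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO languages (code, name) VALUES (?, ?)",
        [("en", "English"), ("fr", "French")],
    )
    return conn


def language_name(conn: sqlite3.Connection, accept_language: str):
    """Look up a language by the raw header value, passed as a bound parameter.

    The value never becomes part of the SQL text, so a payload such as
    "1;select pg_sleep(1); --" is compared as an ordinary string.
    """
    row = conn.execute(
        "SELECT name FROM languages WHERE code = ?", (accept_language,)
    ).fetchone()
    return row[0] if row else None


conn = make_demo_db()
print(language_name(conn, "fr"))
print(language_name(conn, "1;select pg_sleep(1); --"))
```

In PHP the same idea is available through prepared statements in PDO or mysqli.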
How can I view the full request headers, including POST data, using libcurl in PHP?
I am trying to simulate the post of a page, which when done from a browser and viewed in Live HTTP Headers looks like this:
https://###.com
POST /###/### HTTP/1.1
Host: ###.###.com
...snipped normal looking headers...
Content-Type: multipart/form-data; boundary=---------------------------28001808731060
Content-Length: 697
-----------------------------28001808731060
Content-Disposition: form-data; name="file_data"; filename="stats.csv"
Content-Type: text/csv
id,stats_id,scope_id,stat_begin,stat_end,value
61281,1,4,2011-01-01 00:00:00,2011-12-31 23:59:59,0
-----------------------------28001808731060
Content-Disposition: form-data; name="-save"
Submit
-----------------------------28001808731060--
So we nicely see the file I'm uploading and its content; everything's there. But all my attempts at getting data out of cURL when I try to make the same POST from PHP (using CURLOPT_VERBOSE or CURLINFO_HEADER_OUT) show request headers that lack the POST data, like so:
POST /###/### HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Host: ###.###.com
...snipped normal-looking headers...
Content-Length: 697
Content-Type: multipart/form-data; boundary=----------------------------e90de36c15f5
Based on the Content-Length here, it appears things are going well, but it would really help my debugging efforts to be able to see the complete request. I am also irked that this is difficult; I should be able to see the whole thing, so I know I must be missing something.
--- EDIT ---
What I'm looking for is the equivalent of this:
curl --trace-ascii debugdump.txt http://www.example.com/
which seems to be available through the CURLOPT_DEBUGFUNCTION option in libcurl, but isn't implemented in PHP. Boo.
I had a need to do precisely this, but I needed to test communication with a bank.
It is extremely easy to use Fiddler2, enable HTTPS traffic decryption, and have cURL use Fiddler2 as a proxy for debugging in this situation:
$proxy = '127.0.0.1:8888';
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
You are sending multipart/form-data. cURL basically shows the HTTP headers completely, I guess. The "problem" is that multipart/form-data consists of multiple parts. The per-part headers and contents are not top-level HTTP headers; they are part of the HTTP request body, which is why they do not appear in the header output.
I don't know your environment, but you can also debug using TCP traffic monitoring. For this, you can use Wireshark or tcpdump. Wireshark can also open dump files created by tcpdump.
How is it possible for client browser data to be saved in an array in PHP?
PHP runs on the server side, so I don't understand how it has access to information about the client's browser.
User agent data is usually sent with every HTTP request, in the User-Agent HTTP header field. You might want to read up on HTTP message formats in general. For example, this is part of the HTTP request that my browser sent to load jQuery on this very page:
GET http://ajax.googleapis.com/ajax/libs/jquery/1.5.2/jquery.min.js HTTP/1.1
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Connection: keep-alive
If-Modified-Since: Fri, 01 Apr 2011 21:23:55 GMT
Accept-Charset: UTF-8,*;q=0.5
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.60 Safari/534.24
Accept: */*
PHP reads the client browser data that you're asking about from the User-Agent header field.
The client sends data to the server, which the server uses to build the array. (I'm assuming you're talking about $_GET, $_POST, $_SERVER, etc.)
You will find it here
$_SERVER['HTTP_USER_AGENT']
You may need to parse this with a regex to get the browser name and version separately.
$_REQUEST
An associative array that by default contains the contents of $_GET, $_POST and $_COOKIE.
The data is submitted by the browser when a new page is requested; PHP just puts it into an array for your convenience.
You should start by reading a bit about HTTP (GET and POST to begin with), and HTTP headers.
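To make the "server builds an array from what the client sent" point concrete, here is a small Python sketch that turns raw request text into a dictionary using the same key convention PHP uses for $_SERVER (HTTP_ prefix, upper case, hyphens turned into underscores). It is an illustration of the idea, not how PHP is implemented internally; the function names and the regex are made up for this example, and the regex only knows about Firefox and Chrome.

```python
import re


def headers_to_server_vars(raw_request: str) -> dict:
    """Map raw request headers to $_SERVER-style keys such as HTTP_USER_AGENT."""
    lines = raw_request.split("\r\n")
    method = lines[0].split(" ", 1)[0]
    server = {"REQUEST_METHOD": method}
    for line in lines[1:]:
        if not line:                # blank line: end of the header block
            break
        name, sep, value = line.partition(":")
        if not sep:
            continue
        key = "HTTP_" + name.strip().upper().replace("-", "_")
        server[key] = value.strip()
    return server


def browser_name_and_version(user_agent: str):
    """Very rough User-Agent parsing: returns (name, version) or None."""
    match = re.search(r"(Firefox|Chrome)/([\d.]+)", user_agent)
    return (match.group(1), match.group(2)) if match else None


# Request text based on the headers quoted in the answer above.
raw = (
    "GET / HTTP/1.1\r\n"
    "Accept-Language: en-US,en;q=0.8\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.24 "
    "(KHTML, like Gecko) Chrome/11.0.696.60 Safari/534.24\r\n"
    "Accept: */*\r\n"
    "\r\n"
)
server = headers_to_server_vars(raw)
print(server["HTTP_USER_AGENT"])
print(browser_name_and_version(server["HTTP_USER_AGENT"]))
```

Everything in that dictionary comes from text the client chose to send, so it can be used for convenience but should not be trusted.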
I'm using a script from this guy:
A. Valums http://valums.com/ajax-upload/
Everything is fine until the file has finished uploading, and then I get a 406 error in Firebug (ONLY). When I right-click the link in Firebug and open it in a new window, the file does exist and does what I expect it to do.
The page in Firebug says:
Not Acceptable
An appropriate representation of the requested resource upload.php could not be found on this server.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
But when I view the page in a new tab, it works fine and returns the right thing. That script from A. Valums makes Ajax requests, by the way.
UPDATE
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Content-Type: application/octet-stream
Referer: http://www.example.com
Content-Length: 192378
Cookie:
Look at the HTTP headers. Your JavaScript is likely adding an Accept header listing types for which the server thinks it has no suitable representation to respond with.
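To check whether an Accept header would rule out a given response type, a simplified matcher like the Python sketch below can help. It is a rough approximation of content negotiation: it honours q=0 for an individual entry but does not implement the full precedence rules, and the function name accepts is made up for this example.

```python
def accepts(accept_header: str, media_type: str) -> bool:
    """Return True if any entry in the Accept header matches media_type."""
    wanted_type, _, wanted_subtype = media_type.lower().partition("/")
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        quality = 1.0
        for param in parts[1:]:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(val)
                except ValueError:
                    quality = 0.0
        if quality <= 0:
            continue                # q=0 means "not acceptable"
        range_type, _, range_subtype = media_range.partition("/")
        if range_type in ("*", wanted_type) and range_subtype in ("*", wanted_subtype):
            return True
    return False


# The Accept header from the request in the question.
firefox_accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(accepts(firefox_accept, "application/json"))
print(accepts("text/html", "application/json"))
```

The Accept header posted in the question ends with */*, which matches any type, so with that header a 406 is unlikely to come from ordinary content negotiation alone.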
It was a server error. Something called "mod_security" needs to be disabled. I have no clue what it is, but if you experience the problem, ask your hosting provider; they should know about it :)