get_headers() does not return an array with the same keys when I change the domain. When and why does this happen?
I want the Content-Length value for hundreds of domains. What do I need to change?
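<?php
$url = 'http://www.ecomexpomelbourne.com.au/sponsors/';
echo "<pre>";
// With 1 as the second argument the result is keyed by header name;
// get_headers() returns false on failure.
$domain_data = get_headers($url, 1);
print_r($domain_data);
// Not every server sends every header, so check before reading a key.
echo $domain_data['Last-Modified'] ?? 'Last-Modified not sent';
?>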
When I use it on the current page's URL, I do get the Content-Length key.
Sometimes the server simply does not send the Content-Length header. You should expect such cases and handle them properly; RFC 2616 makes provisions for them.
When the page is generated dynamically (with PHP or another language), the length of the body is often not yet known when the headers are sent, so the server cannot generate a correct Content-Length header in advance. There are also cases where sending Content-Length is explicitly forbidden: RFC 2616 section 4.4 says a message must not include both Content-Length and a non-identity Transfer-Encoding such as chunked.
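A minimal sketch of how the missing header could be handled when looping over many domains. The helper name find_content_length() is hypothetical; it only assumes the array shape that get_headers($url, 1) returns (header names as keys, repeated headers as nested arrays after redirects).

```php
<?php
/**
 * Return the Content-Length from a get_headers($url, 1) result,
 * or null when the server did not send one.
 * (Hypothetical helper, not part of PHP.)
 */
function find_content_length($headers): ?int
{
    if (!is_array($headers)) {
        return null; // get_headers() returns false on failure
    }
    foreach ($headers as $name => $value) {
        // Header names are case-insensitive; numeric keys hold status lines
        if (is_string($name) && strcasecmp($name, 'Content-Length') === 0) {
            // After redirects the header may appear several times; use the last one
            $last = is_array($value) ? end($value) : $value;
            $last = trim((string) $last);
            return ctype_digit($last) ? (int) $last : null;
        }
    }
    return null;
}

// Possible usage (needs network access, so it is left as a comment):
// foreach ($urls as $url) {
//     $length = find_content_length(@get_headers($url, 1));
//     echo $url, ': ', $length ?? 'no Content-Length sent', "\n";
// }
```

When the result is null, the only way to learn the size is to download the body and measure it yourself.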
PWS is hopelessly outdated; use IIS instead. I'm not sure that PWS lets you configure response headers, and I can't find any docs. This is what it sends now:
Cache-Control:private
Connection:keep-alive
Content-Encoding:gzip
Content-Type:text/html; charset=utf-8
Date:Tue, 09 Dec 2014 09:46:13 GMT
Expires:Mon, 08 Dec 2014 09:46:13 GMT
Server:PWS/8.1.20.5
Set-Cookie:www.ecomexpomelbourne.com.au_trackingData=; path=/
Transfer-Encoding:chunked
Vary:Accept-Encoding
X-AspNet-Version:4.0.30319
X-Powered-By:ASP.NET
X-Px:nc h0-s4.p1-hyd ( h0-s2.p6-lhr), nc h0-s2.p6-lhr ( origin)
There is no Content-Length header; the response uses Transfer-Encoding: chunked instead.
I researched how to prevent CRLF injection in PHP, but I did not find a solution for my case. I am using Burp Suite to inject headers with CRLF characters, as shown below.
// Using my tool, I put CRLF characters at the start of my request URL
GET /%0d%0a%20HackedHeader:By_Hacker controller/action
// This generates a header like the one below
HackedHeader:By_Hacker
So I can modify any header in the same way.
This tool works like a proxy server: it intercepts the request and the response, and lets me modify them however I want.
So I am just modifying the request by injecting headers with CRLF characters, and the server responds with the injected CRLF characters in its response.
I am worried because header fields like Pragma, Cache-Control, and Last-Modified can lead to cache poisoning attacks.
header() and setcookie() contain mitigations against response/header splitting, but they do not help me fix the issue above.
Edit
When I request the contact us page on mysite.com, this is the request I capture in my tool:
Request headers:
GET /contactus HTTP/1.1
Host: mysite.com
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
And I get the response HTML for the above request.
Now, for the same request, I use the tool to add a custom header as shown below:
Request Headers:
GET /%0d%0a%20Hacked_header:By_Hacker/contactus HTTP/1.1
Host: mysite.com
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Response Headers:
HTTP/1.1 302 Found
Date: Fri, 10 Jul 2015 11:51:22 GMT
Server: Apache/2.2.22 (Ubuntu)
Last-Modified: Fri, 10 Jul 2015 11:51:22 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Location: mysite.com
Hacked_header:By_Hacker/..
Vary: Accept-Encoding
Content-Length: 2
Keep-Alive: timeout=5, max=120
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
You can see the injected header Hacked_header:By_Hacker/.. in the above response.
Is there any way, in PHP or in the Apache server configuration, to prevent this kind of header injection?
Not sure why all the downvotes. In fact, it is an interesting question :)
I can see that you have tagged CakePHP, which means your app is using the Cake framework. Excellent! If you are using Cake 3, it automatically strips %0d%0a.
Alternatively, wherever you receive the response header, just strip %0d%0a and you are good!
Where could something like this apply? A third-party API response, a webhook response, or a badly sanitized way of handling internationalization, for example switching lang=en to lang=fr where the GET parameter is set directly as a response header. That would not be a wise move!
Ideally, the values will arrive as GET parameters and not in the header, but either way just strip %0d%0a and you are good.
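To make the lang=en / lang=fr case concrete, here is a small sketch of validating a request parameter before it is copied into a response header. The function name safe_language_header(), the allowed list, and the choice of the Content-Language header are assumptions made for illustration; they do not come from CakePHP or from the question.

```php
<?php
/**
 * Build a Content-Language header line from user input, or return null
 * when the input is not acceptable. (Hypothetical helper.)
 */
function safe_language_header(string $lang, array $allowed = ['en', 'fr']): ?string
{
    // Decode first so that %0d%0a and raw CR/LF are treated alike
    $decoded = rawurldecode($lang);

    // Reject anything that could start a new header line
    if (preg_match('/[\r\n]/', $decoded) === 1) {
        return null;
    }

    // A whitelist is stricter than stripping: only known values pass
    if (!in_array($decoded, $allowed, true)) {
        return null;
    }

    return 'Content-Language: ' . $decoded;
}

// Possible usage inside a request handler:
// $line = safe_language_header($_GET['lang'] ?? 'en');
// if ($line !== null) {
//     header($line);
// }
```

Rejecting unknown values outright avoids having to reason about every encoding an attacker might try.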
Answering your edit.
You can see the injected header Hacked_header:By_Hacker/.. in the above response
That injected header cannot be controlled or stopped, mate. We have no control over what the other server does.
The question is: what do you do with the response header?
The answer is: you sanitize it. As ndm said, you need to sanitize the input, and what you get as a response IS an input. As soon as you detect %0d%0a, discard the response.
Need code?
<?php
// Match a percent-encoded CR or LF in either case (%0d, %0D, %0a, %0A)
$crlf = '/%0d|%0a/i';
$response = ''; // placeholder: assign the response (or header value) you received
if (preg_match($crlf, $response) === 1) {
    throw new \Exception('CRLF detected');
}
I have this HTTP response content :
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
Date: Mon, 12 Aug 2013 15:08:10 GMT
PK�Ctemps_attente.json���n� �߅9�Bw���VU��Uߠs���^��#�CGç��ͷ�r7G�3Hnp����^pYSu\#Qo%~x��FGa�Y�ا����S���-ua���t��j-���s�%э��+,g�xq.��������t�fb� �0:)�:�K�}^�N�L����>�щ%#�̲x`C#��m݃ :^��$~�i8���WzCh�a�ă���7t�O|��AX˂��UO$���<��y"�;�'F��]��{֘Ha}F��<��l6��o벰V���66t�&��f�Ť��x�H��툗���/PKA�Y�1�PK�CA�Y�1�temps_attente.jsonPK#q
I would like to know what format the response is in and how to decode it to get the final content.
I tried the http_chunked_decode function, but without success.
The body (or at least what appears to be the body) of the response is not chunked.
It does appear to be compressed: the leading PK and the embedded file name temps_attente.json suggest a ZIP archive. With HTTP this should be stated explicitly in the headers.
There is no blank line between what appears to be the headers and what appears to be the body.
If this is really the response you are getting, it is not HTTP, and an off-the-shelf function is not going to make sense of it.
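For comparison, a chunked body is a sequence of "hex size, CRLF, data, CRLF" blocks ending with a zero-size chunk. The sketch below is a hand-written decoder plus a check for the ZIP signature; both function names are hypothetical and this is not the pecl_http implementation of http_chunked_decode.

```php
<?php
/**
 * Decode a chunked transfer-encoded body.
 * Returns null when the input is not valid chunked data. (Hypothetical helper.)
 */
function decode_chunked_body(string $body): ?string
{
    $decoded = '';
    $offset = 0;
    $length = strlen($body);

    while ($offset < $length) {
        // Each chunk starts with its size in hexadecimal, terminated by CRLF
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            return null;
        }
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        $sizeHex = trim(explode(';', $sizeLine, 2)[0]); // ignore chunk extensions
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            return null;
        }
        $size = (int) hexdec($sizeHex);
        $offset = $lineEnd + 2;

        if ($size === 0) {
            return $decoded; // a zero-size chunk ends the body
        }
        if ($offset + $size + 2 > $length) {
            return null; // truncated chunk
        }
        $decoded .= substr($body, $offset, $size);
        if (substr($body, $offset + $size, 2) !== "\r\n") {
            return null; // chunk data must be followed by CRLF
        }
        $offset += $size + 2;
    }

    return null; // no terminating zero-size chunk was found
}

/** ZIP archives start with the local file header signature "PK\x03\x04". */
function looks_like_zip(string $body): bool
{
    return strncmp($body, "PK\x03\x04", 4) === 0;
}
```

A body that starts directly with PK fails the decoder immediately, which is consistent with the observation that the data shown is not chunked.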
How can I get header information like the web page below shows, using PHP?
The web page: Check website HTTP Server Header Information
Result:
HTTP Status for: "http://www.abc.com"
The title is: ""
Keywords: ""
Description: ""
HTTP/1.1 200 OK
Date: Fri, 22 Feb 2013 12:00:56 GMT
Server: Apache/2.2.3 (Debian) PHP/4.4.4-8+etch6
X-Powered-By: PHP/4.4.4-8+etch6
Keep-Alive: timeout=300
Connection: Keep-Alive
Content-Type: text/html
Use PHP's get_headers() function:
$headers = get_headers($url, 1);
See: http://php.net/manual/en/function.get-headers.php
If you also want the meta keywords and meta description use get_meta_tags():
$tags = get_meta_tags($url);
You can use the PHP function get_headers().
This function will return an array with all the header fields.
For more information see: http://php.net/manual/en/function.get-headers.php
You could use:
print_r(get_headers($url));
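Without the second argument, get_headers() returns a plain list of raw lines like the ones in the result above. A rough sketch of turning such a list into a status line plus a name/value map follows; parse_header_lines() is a hypothetical helper and keeps only the last response when redirects are followed.

```php
<?php
/**
 * Split raw header lines (as returned by get_headers($url)) into the
 * status line and an associative array of headers. (Hypothetical helper.)
 */
function parse_header_lines(array $lines): array
{
    $parsed = ['status' => null, 'headers' => []];

    foreach ($lines as $line) {
        // A status line marks the start of a (possibly redirected) response
        if (strncmp($line, 'HTTP/', 5) === 0) {
            $parsed['status'] = $line;
            $parsed['headers'] = [];
            continue;
        }
        // Split on the first colon only; values such as dates contain colons
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue;
        }
        $parsed['headers'][trim($parts[0])] = trim($parts[1]);
    }

    return $parsed;
}

// Possible usage (needs network access, so it is left as a comment):
// $info = parse_header_lines(get_headers('http://www.abc.com'));
// echo $info['status'], "\n";
```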
So I just learned about X-Robots-Tag, which can be set as part of a server response header. Now that I know about this particular field, I am wondering whether there are any other specific fields I should be setting when I output a web page via PHP. I did see this list of responses, but what should I be setting manually? What do you like to set manually?
Restated, in addition to...
header('X-Robots-Tag: noindex, nofollow, noarchive, nosnippet', true);
...what else should I be setting?
Thanks in advance!
You don't necessarily need to set any of them manually, and I don't send any unless absolutely necessary: most response headers are the web server's job, not the application's (give or take Location & situational cache-related headers).
As for the "X-*" headers, the X implies they aren't "official," so browsers may or may not interpret them to mean anything - like, you can add an arbitrary "X-My-App-Version" header to a public project to get a rough idea of where people are using it, but it's just extra info unless the requester knows what to do with it.
I think most X-headers are more commonly delivered via HTML as meta tags already. For example, <meta name="robots" content="noindex, nofollow, (etc)" />, which does the same as X-Robots-Tag. That's arguably better handled with the meta tag version anyway, since it won't trip over output buffering as header() can do, and it will be naturally cached since it's part of the page.
These are headers from Stackoverflow (this page), so the answer is, probably none.
You don't want your site indexed (noindex)?
Status=OK - 200
Cache-Control=public, max-age=60
Content-Type=text/html; charset=utf-8
Content-Encoding=gzip
Expires=Tue, 28 Sep 2010 01:23:00 GMT
Last-Modified=Tue, 28 Sep 2010 01:22:00 GMT
Vary=*
Set-Cookie=usr=t=&s=; domain=.stackoverflow.com; expires=Mon, 28-Mar-2011 01:22:00 GMT; path=/; HttpOnly
Date=Tue, 28 Sep 2010 01:21:59 GMT
Content-Length=6929
This header comes in handy for me. Characters are displayed correctly even if the meta tag is missing.
Content-Type: text/html; charset=utf-8
When I send a 304 response, how will the browser interpret the other headers I send together with it?
E.g.
header("HTTP/1.1 304 Not Modified");
header("Expires: " . gmdate("D, d M Y H:i:s", time() + $offset) . " GMT");
Will this make sure the browser will not send another conditional GET request (nor any request) until $offset time has "run out"?
Also, what about other headers?
Should I send headers like this together with the 304:
header('Content-Type: text/html');
Do I have to send:
header("Last-Modified:" . $modified);
header('Etag: ' . $etag);
To make sure the browser sends a conditional GET request the next time the $offset has "run out" or does it simply save the old Last Modified and Etag values?
Are there other things I should be aware about when sending a 304 response header?
This blog post helped me a lot in order to tame the "conditional get" beast.
An interesting excerpt (which partially contradicts Ben's answer) states that:
If a normal response would have included an ETag header, that header must also be included in the 304 response.
Cache headers (Expires, Cache-Control, and/or Vary) must also be included if their values might differ from those sent in a previous response.
This is in complete accordance with the RFC 2616 sec 10.3.5.
Below is a 200 response...
HTTP/1.1 200 OK
Server: nginx/0.8.52
Date: Thu, 18 Nov 2010 16:04:38 GMT
Content-Type: image/png
Last-Modified: Thu, 15 Oct 2009 02:04:11 GMT
Expires: Thu, 31 Dec 2010 02:04:11 GMT
Cache-Control: max-age=315360000
Accept-Ranges: bytes
Content-Length: 6394
Via: 1.1 proxyIR.my.corporate.proxy.name:8080 (IronPort-WSA/6.3.3-015)
Connection: keep-alive
Proxy-Connection: keep-alive
X-Junk: xxxxxxxxxxxxxxxx
...And its optimal valid 304 counterpart.
HTTP/1.1 304 Not Modified
Server: nginx/0.8.52
Date: Thu, 18 Nov 2010 16:10:35 GMT
Expires: Thu, 31 Dec 2011 16:10:35 GMT
Cache-Control: max-age=315360000
Via: 1.1 proxyIR.my.corporate.proxy.name:8080 (IronPort-WSA/6.3.3-015)
Connection: keep-alive
Proxy-Connection: keep-alive
X-Junk: xxxxxxxxxxx
Notice that the Expires header is at most the current date plus one year, as per RFC 2616 section 14.21.
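A minimal sketch of the conditional GET check on the PHP side, following the reading of RFC 2616 section 10.3.5 quoted above (ETag repeated in the 304, cache headers refreshed). The function names are hypothetical, the ETag comparison is a plain string match that ignores weak validators, and $offset is the cache lifetime in seconds from the question.

```php
<?php
/**
 * Decide whether the client's cached copy is still valid.
 * $server is expected to look like $_SERVER. (Hypothetical helper.)
 */
function is_not_modified(array $server, string $etag, int $lastModified): bool
{
    // If-None-Match takes precedence over If-Modified-Since
    if (isset($server['HTTP_IF_NONE_MATCH'])) {
        $candidates = array_map('trim', explode(',', $server['HTTP_IF_NONE_MATCH']));
        return in_array('*', $candidates, true) || in_array($etag, $candidates, true);
    }
    if (isset($server['HTTP_IF_MODIFIED_SINCE'])) {
        $since = strtotime($server['HTTP_IF_MODIFIED_SINCE']);
        return $since !== false && $lastModified <= $since;
    }
    return false;
}

/** Send a 304 with the validator and refreshed cache headers, no body. */
function send_not_modified(string $etag, int $offset): void
{
    header('HTTP/1.1 304 Not Modified');
    header('ETag: ' . $etag);
    header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $offset) . ' GMT');
    header('Cache-Control: max-age=' . $offset);
}

// Possible usage at the top of a script, before any output:
// if (is_not_modified($_SERVER, $etag, $modifiedTimestamp)) {
//     send_not_modified($etag, $offset);
//     exit;
// }
```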
The Content-Type header only applies to responses which contain a body. A 304 response does not contain a body, so that header does not apply. Similarly, you don't want to send Last-Modified or ETag because a 304 response means that the document hasn't changed (and so neither have the values of those two headers).
For an example, see this blog post by Anne van Kesteren examining WordPress' http_modified function. Note that it returns either Last-Modified and ETag or a 304 response.