Checking if a web page has compression enabled with PHP

I want to check if a web page has gzip/deflate compression enabled with PHP. I probably won't manage it with get_headers, so any advice, or any bit of code to check this, would help (I couldn't find anything on the subject). I probably need to look for the compression in the headers; how do I make an HTTP request with compression enabled?

You can use cURL with curl_setopt: set CURLOPT_ENCODING to "gzip,deflate". If the web page has gzip/deflate enabled, the requested encoding will be respected and gzip-compressed content will be returned.
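Putting that together, one way this could look in PHP is sketched below. This is not a definitive implementation: the function names are made up for illustration, and the approach simply sends Accept-Encoding via CURLOPT_ENCODING and then looks for a Content-Encoding header in the raw response headers.

```php
<?php
// Pull the Content-Encoding value (e.g. "gzip") out of raw response
// headers, or return null if the server did not advertise compression.
function parseContentEncoding(string $rawHeaders): ?string
{
    if (preg_match('/^Content-Encoding:\s*(\S+)/mi', $rawHeaders, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Ask the server for compressed content and report which encoding it
// used. Returns "gzip"/"deflate" if compression is enabled, else null.
function detectCompression(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true,            // keep response headers
        CURLOPT_ENCODING       => 'gzip,deflate',  // send Accept-Encoding
    ]);
    $response   = curl_exec($ch);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    curl_close($ch);
    if ($response === false) {
        return null;
    }
    return parseContentEncoding(substr($response, 0, $headerSize));
}
```

Usage would be along the lines of `detectCompression('http://www.example.com/')`, which returns the advertised encoding or null; libcurl decodes the body transparently but leaves the Content-Encoding header visible for inspection.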

Related

HTTP response content parsing [duplicate]


how to use gzip decompression with AFNetworking API call?

I am using the AFNetworking library in my projects. Recently I heard about gzip data compression, which NSURLConnection provides by default and which reduces the loading time of large JSON responses; AFNetworking might have that feature too, as it works on top of NSURLConnection.
But I do not know how to get a gzip-compressed JSON response from a PHP API through AFNetworking.
I need this technique when the JSON response is larger than about 100 KB.
If the server supports gzip, it may still require the client to ask for it. To ask the server to use gzip, you add a specific "Accept-Encoding" header to your requests. You can do this, for example, with these lines of code:
// Get one that serializes your requests,
// for example from your AFHTTPSessionManager subclass
AFHTTPRequestSerializer <AFURLRequestSerialization> *requestSerializer = ...
[requestSerializer setValue:@"gzip, identity" forHTTPHeaderField:@"Accept-Encoding"];
"Accept-Encoding" header must contain gzip while, probably, identity is not required.
If you are using the NSURLSession-based API in AFNetworking, the header is included automatically, as noted here, so you don't need to do anything.
Note that you may not see the header value in the request log, but you can check the response headers to confirm this behaviour if the server supports gzip.
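Since the question mentions a PHP API, the server side of the exchange can be sketched in PHP. This is illustrative only (the function name and shape are made up): the endpoint honours the client's "Accept-Encoding: gzip" header by gzipping its JSON output with the zlib extension's gzencode.

```php
<?php
// Server side: compress the JSON body when the client asked for gzip.
// Returns [header-or-null, body] so the caller can emit both.
function encodeResponse(string $json, string $acceptEncoding): array
{
    if (stripos($acceptEncoding, 'gzip') !== false) {
        // Client accepts gzip: compress and label the body.
        return ['Content-Encoding: gzip', gzencode($json)];
    }
    // No gzip requested: send the JSON as-is.
    return [null, $json];
}

// In a real endpoint this would be emitted roughly like:
// [$hdr, $body] = encodeResponse($json, $_SERVER['HTTP_ACCEPT_ENCODING'] ?? '');
// if ($hdr !== null) { header($hdr); }
// echo $body;
```

On the client, AFNetworking (via NSURLConnection/NSURLSession) then decompresses the gzip body transparently before handing it to the JSON serializer.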

PHP fopen and file_get_contents limited download speed, why?

I'm trying to retrieve a remote file (a 6 MB text file) with PHP, and I noticed that with fopen the speed is limited to 100 KB/s, while with file_get_contents it is 15 KB/s.
However, with wget from the same server the speed is above 5 MB/s.
What controls these speeds?
I checked the live speeds with nethogs.
wget is great on its own for mirroring sites; it can actually parse links from pages and download files.
file_get_contents doesn't send a "Connection" HTTP header, so the remote web server treats it by default as a keep-alive connection and doesn't close the TCP stream for up to 15 seconds (that may not be a standard value; it depends on the server configuration).
A normal browser would consider the page fully loaded once the HTTP payload reaches the length specified in the Content-Length response header; file_get_contents doesn't do this, and that's a shame.
SOLUTION
So, if you want the solution, here it is:
$context = stream_context_create(array('http' => array('header' => "Connection: close\r\n")));
file_get_contents("http://www.something.com/somepage.html", false, $context);
The trick is simply to tell the remote web server to close the connection when the download is complete, because file_get_contents isn't smart enough to do it by itself using the Content-Length response header. (Note the double quotes around the header string: in a single-quoted PHP string, \r\n would be sent literally instead of as a CRLF.)
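The same idea can be wrapped in a small helper, with a read timeout as a backstop. This is a sketch under stated assumptions: the helper name and the default timeout are illustrative, not part of any standard API.

```php
<?php
// Build a stream context that sends "Connection: close" (so the server
// ends the TCP stream as soon as the body is sent) and applies a read
// timeout. The default timeout here is an arbitrary example value.
function closingContext(float $timeout = 10.0)
{
    return stream_context_create([
        'http' => [
            'header'  => "Connection: close\r\n",
            'timeout' => $timeout,
        ],
    ]);
}

// Usage (URL is an example):
// $text = file_get_contents('http://www.example.com/file.txt', false, closingContext());
```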

server sometimes returns chunked transfer encoding

I'm building a REST API. Sometimes the server returns the response with chunked transfer encoding. Why is that? Why can't the server always return the response in the same encoding?
The problem is that I don't know how to read the data when it's returned as chunked.
Assuming your server is running Apache, this is expected behaviour. You can disable it by putting this line in your .htaccess file:
SetEnv downgrade-1.0
However, you should consider modifying your reading code to support different transfer encodings instead. What library are you using to make the HTTP request? Any reasonable HTTP library can handle chunked responses. If your requesting code is written in PHP, use cURL: http://php.net/manual/en/book.curl.php
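If you ever do have to undo the chunked framing yourself (say, after a raw socket read; normally cURL does this for you), the format is just a hexadecimal size line, that many bytes of data, and a trailing CRLF, repeated until a zero-size chunk. A minimal sketch (the function name is made up, and chunk extensions after the size are not handled):

```php
<?php
// Decode an HTTP/1.1 chunked transfer body into the original payload.
// Each chunk is: <hex size>CRLF<data>CRLF; a size of 0 ends the body.
function dechunk(string $body): string
{
    $out = '';
    $pos = 0;
    while (true) {
        $lineEnd = strpos($body, "\r\n", $pos);
        if ($lineEnd === false) {
            break; // malformed or truncated input
        }
        $size = hexdec(trim(substr($body, $pos, $lineEnd - $pos)));
        if ($size === 0) {
            break; // last-chunk marker
        }
        $out .= substr($body, $lineEnd + 2, $size);
        $pos  = $lineEnd + 2 + $size + 2; // skip data + trailing CRLF
    }
    return $out;
}
```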
Taken from Server Fault:
specify the "Content-Length" header, so the server knows the size of the response
use HTTP 1.0 on the requester's side
A problem may be that Apache is gzipping your download, taking care of correcting the Content-Length, or in your case, adding the header
Transfer-Encoding: chunked
You can add a .htaccess RewriteRule to disable gzip:
RewriteRule . - [E=no-gzip:1]

Does cURL NOBODY actually fetch the body?

I'm not sure whether this cURL option just strips the response body out while still loading it fully.
Is that true? I don't want to waste bandwidth; I just want the headers.
CURLOPT_NOBODY makes cURL send a HEAD request to the web server. The server should respond with just the HTTP headers and no body content.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response.
It will only load the headers, it won't load the body of the requested document.
As you can see in the official documentation, it will not download the body if you enable the option:
https://curl.haxx.se/libcurl/c/CURLOPT_NOBODY.html
DESCRIPTION: A long parameter set to 1 tells libcurl to not include the body-part in the output when doing what would otherwise be a download. For HTTP(S), this makes libcurl do a HEAD request. For most other protocols it means just not asking to transfer the body data. Enabling this option means asking for a download but without a body.
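In PHP this could look like the sketch below. The helper names are made up for illustration; the point is that with CURLOPT_NOBODY only the header block crosses the wire, so no body bandwidth is spent.

```php
<?php
// Turn a raw header block into a lowercase name => value map
// (the status line and blank lines are skipped).
function parseHeaderLines(string $raw): array
{
    $headers = [];
    foreach (explode("\r\n", trim($raw)) as $line) {
        if (strpos($line, ':') !== false) {
            [$name, $value] = explode(':', $line, 2);
            $headers[strtolower(trim($name))] = trim($value);
        }
    }
    return $headers;
}

// Issue a HEAD request and return only the response headers,
// or false on a connection error.
function fetchHeaders(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_NOBODY         => true,  // HEAD: no body is transferred
        CURLOPT_HEADER         => true,  // capture the raw header block
    ]);
    $raw = curl_exec($ch);
    curl_close($ch);
    return $raw === false ? false : parseHeaderLines($raw);
}
```

For example, `fetchHeaders('http://www.example.com/')['content-type']` would give the advertised content type without downloading the page.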
