curl request header info and not the contents of the page - php

I'm writing a script that will use cURL to check a number of links. I know I can use curl_getinfo() to get the HTTP status code, but I'm not sure whether that requests the entire page or just the response headers. Is there any cURL option or setting I can use to request only the headers of the URL (e.g. 404 Not Found, moved, etc.)?

You can instruct cURL not to download the contents (body) of the response by setting the CURLOPT_NOBODY option to TRUE. See "Does CURLOPT_NOBODY still download the body - using bandwidth".
P.S. curl_getinfo() does not make requests; it returns information about a request that has already been executed.
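A minimal sketch of that approach, assuming the PHP cURL extension is available. The function names head_status() and link_status_label() are hypothetical, chosen only for illustration, and the 10-second timeout is an arbitrary assumption.

```php
<?php
// Hypothetical helper: returns the HTTP status code for $url using a HEAD
// request, or 0 if the transfer failed (DNS error, timeout, refused, ...).
function head_status(string $url, int $timeoutSeconds = 10): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // ask for headers only (HEAD for HTTP)
        CURLOPT_RETURNTRANSFER => true,  // do not echo anything
        CURLOPT_FOLLOWLOCATION => false, // report redirects (301/302) as-is
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    curl_exec($ch);
    // curl_getinfo() sends nothing; it reads data from the transfer above
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Hypothetical helper: turns a status code into a coarse label for a report.
function link_status_label(int $code): string
{
    if ($code === 0) {
        return 'unreachable';
    }
    if ($code >= 200 && $code < 300) {
        return 'ok';
    }
    if ($code >= 300 && $code < 400) {
        return 'moved';
    }
    if ($code === 404) {
        return 'not found';
    }
    return 'error';
}
```

For a list of links this could be used as: foreach ($urls as $url) { echo $url, ': ', link_status_label(head_status($url)), "\n"; }. Some servers answer HEAD differently from GET, so a surprising result may be worth rechecking with a normal request.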

Related

Setting header for notify file content changed. PHP, Twilio

I'm working on a Twilio project with PHP which will be playing back a frequently changing audio file.
Twilio's TwiML Voice documentation states to:
make sure your web server is sending the proper headers to inform us
that the contents of the file have changed
Which headers are these, and how do I set them in PHP?

Which headers are these?
This is how caching works on Twilio
Twilio requests a .mp3 from your server using a GET request. Your
server sends back a 200 OK, and also sends back an E-Tag header.
Twilio will save the E-Tag header, as well as the mp3 file, in its
database.
The next time Twilio sends a GET request to that URL, it will send
along the E-Tag header (it should look like "If-None-Match"). If the
file has not changed since the last time Twilio accesses it, your
server will send back a 304 Not Modified header. Crucially, it will
not send the mp3 file data. Twilio will use the mp3 file it has
stored in its database. It's much faster for Twilio to read the mp3
file from its database than it is for your server to send it (and it
also saves your server bandwidth).
If you change the content of the mp3 that is being served at the URL,
and Twilio makes a GET request to the URL, then your server will send
back a 200 OK, with a new E-Tag. Twilio will download the file from
your server, and cache it.
How do I set them in PHP?
header("ETag: \"uniqueID\"");
When sending a file, the web server attaches an ID for the file in a header called ETag. When requesting the file again, the browser checks whether it has already downloaded it. If a cached copy is found, the browser sends the ID along with the request. The server checks whether the IDs match; if they do, it sends back header("HTTP/1.1 304 Not Modified"); otherwise it sends the file normally.
One easy way to check is to bypass the cache by adding a fake key-value pair to the end of the URL, like http://yoururl.com/play.mp3?key=somevalue. Your website should still serve the same mp3 as it would for http://yoururl.com/play.mp3, but to Twilio it will appear to be a new (uncached) URL.
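The ETag exchange described in this answer could be sketched in PHP roughly as follows. Both function names are hypothetical, and deriving the ETag from the file size and modification time is an assumption made for illustration; any value that changes whenever the file changes would serve.

```php
<?php
// Hypothetical helper: true when the client's If-None-Match header names $etag.
function etag_matches(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null || trim($ifNoneMatch) === '') {
        return false;
    }
    if (trim($ifNoneMatch) === '*') {
        return true;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // Ignore the weak-validator prefix (W/) when comparing
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $etag) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: sends $path with an ETag, or a bare 304 response
// when the client already holds the current version.
function serve_mp3_with_etag(string $path): void
{
    // Assumption: size + modification time identifies the file version
    $etag = '"' . md5(filesize($path) . '-' . filemtime($path)) . '"';
    header('ETag: ' . $etag);

    if (etag_matches($_SERVER['HTTP_IF_NONE_MATCH'] ?? null, $etag)) {
        http_response_code(304); // no body: the cached copy is reused
        return;
    }

    header('Content-Type: audio/mpeg');
    header('Content-Length: ' . filesize($path));
    readfile($path);
}
```

A script that serves the audio would call serve_mp3_with_etag() with the path of the file before producing any other output.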

Twilio uses Squid to cache MP3 files. You can control how long an item is cached using the Cache-Control header.
cache-control: max-age=3600
http://wiki.squid-cache.org/SquidFaq/InnerWorkings#How_does_Squid_decide_when_to_refresh_a_cached_object.3F
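In PHP, a header like this is sent with header() before any output. A small sketch, where cache_control_line() is a hypothetical helper and 3600 seconds is just the value from the example above:

```php
<?php
// Hypothetical helper: builds the Cache-Control header line for a lifetime
// given in seconds; negative values are clamped to 0.
function cache_control_line(int $maxAgeSeconds): string
{
    return 'Cache-Control: max-age=' . max(0, $maxAgeSeconds);
}

// Usage, before any output is sent:
// header(cache_control_line(3600));
```

A short lifetime suits a file that changes often; max-age=0 asks caches to revalidate on every request.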

304 not modified header and SAME ORIGIN policy

I would like to see whether an external image is sending back the 304 Not Modified header. Can this be done from my server, or will it violate the same-origin policy?

The same-origin policy only causes problems when using Ajax. Images, JavaScript files, etc. aren't affected by this policy, which exists only in browsers.
Simply send a cURL request with the appropriate headers and then read the response headers. See also: Can PHP cURL retrieve response headers AND body in a single request?
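One way this check might look from the server side, assuming the PHP cURL extension is available and the remote server issues an ETag. Both function names are hypothetical. The sketch makes one request to learn the ETag, then repeats it with If-None-Match and looks for a 304.

```php
<?php
// Hypothetical helper: splits a raw header block into [status code, headers].
// Header names are lower-cased; only the last response in the block is kept.
function parse_response_headers(string $raw): array
{
    $status = 0;
    $headers = [];
    foreach (preg_split('/\r\n|\n/', trim($raw)) as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            // A status line starts a new response (e.g. after a redirect)
            $status = (int) $m[1];
            $headers = [];
        } elseif (strpos($line, ':') !== false) {
            [$name, $value] = explode(':', $line, 2);
            $headers[strtolower(trim($name))] = trim($value);
        }
    }
    return [$status, $headers];
}

// Hypothetical helper: true if the server answers 304 Not Modified when it
// is sent back the ETag it has just issued for $url.
function answers_not_modified(string $url): bool
{
    $fetch = function (array $requestHeaders) use ($url): array {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true, // headers only
            CURLOPT_HEADER         => true, // include headers in the result
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HTTPHEADER     => $requestHeaders,
            CURLOPT_TIMEOUT        => 10,   // arbitrary assumption
        ]);
        $raw = curl_exec($ch);
        return parse_response_headers(is_string($raw) ? $raw : '');
    };

    [, $first] = $fetch([]);
    if (!isset($first['etag'])) {
        return false; // nothing to revalidate against
    }
    [$status] = $fetch(['If-None-Match: ' . $first['etag']]);
    return $status === 304;
}
```

Servers that validate with Last-Modified instead of ETag would need the same idea with an If-Modified-Since header.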

Does cURL NOBODY actually fetch the body?

I'm not sure whether this option in cURL just strips out the response body but still downloads it in full.
Is that true? I don't want to waste bandwidth, I just want the headers.

CURLOPT_NOBODY will send a HEAD request to web server. The server should respond with just the HTTP headers and no body content.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response.

It will only load the headers, it won't load the body of the requested document.

As you can see in the official documentation, it will not download the body if you enable this option:
https://curl.haxx.se/libcurl/c/CURLOPT_NOBODY.html
DESCRIPTION A long parameter set to 1 tells libcurl to not include the
body-part in the output when doing what would otherwise be a download.
For HTTP(S), this makes libcurl do a HEAD request. For most other
protocols it means just not asking to transfer the body data.
Enabling this option means asking for a download but without a body.

php cookies problem

I just want to ask whether I need to set something to enable cookies on my hosting.
I have this:
<?php
setcookie("TestCookie","Hello",time()+3600);
print_r($_COOKIE);
?>
It works perfectly on my local server, which is XAMPP, but when I upload it to my hosting it does not work. What should I do? What should I add to the code?

It is also possible that the cookie is actually sent but the client doesn't send the value back to the web server in subsequent requests.
Possible causes:
your server's clock is misconfigured and therefore time()+3600 is in the past from the client's perspective => the client will delete the cookie immediately.
the cookie failed the general tail match check, search for "tail match" in http://curl.haxx.se/rfc/cookie_spec.html
the client is configured not to accept those cookies
There are many add-ons for different browsers that let you see the HTTP headers the client actually received, e.g. Firebug for Firefox. Use them to check whether there is a Set-Cookie header in the response. If there is, the client rejected it for some reason. If there is no such header, you have to check why the server didn't send it.

Cookies are sent as http response headers. Those headers can only be sent before anything from the response body has been sent.
Make sure no output happens before setcookie():
no echo, printf, readfile
nothing outside the <?php ?> tags, not a single white-space or BOM in any of the scripts that have been included and executed for the same request before setcookie()
increase the error_reporting level and check for warnings/notices. Those messages are usually logged in a file (e.g. error.log). You can set display_errors to true to see them in the output.
PHP can be configured to use output buffers. In this case output is not sent directly to the client but held in a buffer until either it's full, the buffer is flushed or the script ends (implicit flush). So until the content of the buffer is actually sent to the client you can set/modify/delete http headers like cookies. see http://docs.php.net/outcontrol.configuration
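To find out whether early output is the cause, headers_sent() can report where the output started. A minimal sketch; set_cookie_or_explain() is a hypothetical helper, and the cookie name and lifetime in the usage line are taken from the question.

```php
<?php
// Hypothetical helper: sets the cookie, or reports where output already
// started so the offending file and line can be fixed.
function set_cookie_or_explain(string $name, string $value, int $lifetimeSeconds): string
{
    if (headers_sent($file, $line)) {
        // Too late: part of the response body has already gone out
        return "Headers already sent; output started at $file:$line";
    }
    $ok = setcookie($name, $value, time() + $lifetimeSeconds);
    return $ok ? 'Set-Cookie header queued' : 'setcookie() reported failure';
}

// Usage: call ob_start() at the very top of the script so that stray output
// is buffered, then:
// echo set_cookie_or_explain('TestCookie', 'Hello', 3600);
```

If the message names a file you did not write, such as one prepended by the hosting provider, that points to a cause outside your own script.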

Use a web developer toolbar to view the cookie.
My own trouble when I was learning PHP reminds me of two causes:
there may be a new line ahead of <?php [this happened to me when I used some proxies to upload]
free hosting providers include("top_ads.php") at the top of your PHP file [this happened to me when I used free hosting]

If I'm not mistaken, cookies are not accessible in the same request that sets them. $_COOKIE is filled from the cookies the browser sends with the request, so a value passed to setcookie() only appears in $_COOKIE on the next request (for example after a reload or a redirect via header()). It's not meant to work on the same page load.
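If the value is needed during the same request, one option is to keep the value you have just set and fall back to $_COOKIE otherwise. A small sketch; cookie_value_now() is a hypothetical helper, not part of PHP.

```php
<?php
// Hypothetical helper: returns the value set during this request if there is
// one, otherwise the value the browser sent, otherwise null.
function cookie_value_now(array $requestCookies, string $name, ?string $justSet): ?string
{
    return $justSet ?? ($requestCookies[$name] ?? null);
}

// Usage with the cookie from the question:
// setcookie('TestCookie', 'Hello', time() + 3600);
// $value = cookie_value_now($_COOKIE, 'TestCookie', 'Hello');
```

On a first visit, print_r($_COOKIE) on its own shows no TestCookie entry even when setcookie() succeeded, which can look like a hosting problem when it is not.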

PHP Curl get HTTP code, not whole document

I'm using cURL in PHP to check the HTTP code when requesting some files. I'm trying to make my script run faster, so I'm wondering whether there is a way to get the HTTP code without actually downloading the web page from the remote host.

Set CURLOPT_NOBODY to true. This means that rather than performing a GET or POST request, a HEAD request will be performed, so the remote server will only return the HTTP headers.
curl_setopt($ch, CURLOPT_NOBODY, true);
There is also some example code in this answer
