So I am trying to retrieve only headers using cURL with the following:
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, true); // we want headers
curl_setopt ($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT,10);
$output = curl_exec($ch);
The problem is that while trying to get headers of large file the script uses all the memory. I would like to avoid getting also the body and I have tried to use:
curl_setopt($ch, CURLOPT_NOBODY, true);
The problem is that this issues a HEAD request instead of GET. And some website retrun an error when you request with HEAD.
Is there any way with curl to retrieve only header without doing HEAD request?
First, don't use CURLOPT_RETURNTRANSFER as that's the option that makes it keep the entire response in memory.
Then, two options:
A) use a write callback and make that abort the transfer as soon as the first byte of the body is returned. There's a write callback example in the PHP.net docs.
B) use CURLOPT_RANGE and ask for only the first byte to be retrieved, 0-0. This avoids the write callback but has the downside that not all HTTP servers and URLs will acknowledge it.
If you dont have to use cURL, you could use get_headers(). By default get_headers uses a GET request to fetch the headers. And you could also modify that request by using stream_context_set_default()
$headers = get_headers('http://example.com');
More Info: PHP: get_headers
Related
Hi I am new to php and want to know some alternate function for the header('location:mysit.php');
I am in a scenario that I am sending the request like this:
header('Location: http://localhost/(some external site).php'&?var='test')
something like this but what I wanna do is that I want to send values of variables to the external site but I actually dont want that page to pop out.
I mean variables should be sent to some external site/page but on screen I want to be redirected to my login page. But seemingly I dont know any alternative please guide me. Thx.
You are searching for PHP cUrl:
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
// grab URL and pass it to the browser
curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
Set the location header to the place you actually want to redirect the browser to and use something like cURL to make an HTTP request to the remote site.
The way you usually would do that is by sending those parameters by cURL, parse the return values and use them however you need.
By using cURL you can pass POST and GET variables to any URL.
Like so:
$ch = curl_init('http://example.org/?aVariable=theValue');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
Now, in $result you have the response from the URL passed to curl_init().
If you need to post data, the code needs a little more:
$ch = curl_init('http://example.org/page_to_post_to.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'variable1=value1&variable2=value2');
$result = curl_exec($ch);
curl_close($ch);
Again, the result from your POST reqeust is saved to $result.
You could connect to another URL in the background in numerous ways. There's cURL ( http://php.net/curl - already mentioned here in previous comments ), there's fopen ( http://php.net/manual/en/function.fopen.php ), there's fsockopen ( http://php.net/manual/en/function.fsockopen.php - little more advanced )
I've been trying to perform an XML request. I've faced so many problems that I managed to solve. But this one I couldn't solve.
this is the script:
$url ="WebServiceUrl";
$xml="XmlRequest";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_MUTE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/xml'));
curl_setopt($ch, CURLOPT_POSTFIELDS, "$xml");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
echo $output;
It is giving me this error:
System.InvalidOperationException: Request format is invalid: text/xml. at System.Web.Services.Protocols.HttpServerProtocol.ReadParameters() at System.Web.Services.Protocols.WebServiceHandler.CoreProcessRequest()
I'm still a noob at this. So go easy on me:)
thanks.
Looks like you're sending stuff as text/xml, which is not what it wants. Find the docs for this web service e.g. WSDL stuff if it's there, and find out what data formats it accepts.
Be sure e.g. that it's not really saying it will respond in XML, after receiving a request as standard HTML POST variables.
There are two main content types used with the HTTP POST method: application/x-www-form-urlencoded and multipart/form-data.
The content-type determines what the format of the CURLOPT_POSTFIELDS should be. If you are using the default, which is "application/x-www-form-urlencoded" you probably want to use build_http_query() to construct the url encoded query string.
If you are sending non-ASCII data you canpass an associative array with keys that match the field names and values that correspond to the value for the field. Using this technique will cause the request to be issued with a multipart/formdata content-type.
At this point, it sounds like your next step should be figuring out what fields the API is expecting.
application/x-www-form-urlencoded or multipart/form-data?
I use the following command in some old scripts:
curl -Lk "https:www.example.com/stuff/api.php?"
I then record the header into a variable and make comparisons and so forth. What I would really like to do is convert the process to PHP. I have enabled curl, openssl, and believe I have everything ready.
What I cannot seem to find is a handy translation to convert that command line syntax to the equivalent commands in PHP.
I suspect something in the order of :
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// What goes here so that I just get the Location and nothing else?
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Get the response and close the channel.
$response = curl_exec($ch);
curl_close($ch);
The goal being $response = the data from the api “OK=1&ect”
Thank you
I'm a little confused by your comment:
// What goes here so that I just get the Location and nothing else?
Anyway, if you want to obtain the response body from the remote server, use:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
If you want to get the headers in the response (i.e.: what your comment might be referring to):
curl_setopt($ch, CURLOPT_HEADER, 1);
If your problem is that there is a redirection between the initial call and the response, use:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
I have a site that has a simple API which can be used via http. I wish to make use of the API and submit data about 1000-1500 times at one time. Here is their API: http://api.jum.name/
I have constructed the URL to make a submission but now I am wondering what is the best way to make these 1000-1500 API GET requests? Here is the PHP CURL implementation I was thinking of:
$add = 'http://www.mysite.com/3rdparty/API/api.php?fn=post&username=test&password=tester&url=http://google.com&category=21&title=story a&content=content text&tags=Season,news';
curl_setopt ($ch, CURLOPT_URL, "$add");
curl_setopt ($ch, CURLOPT_POST, 0);
curl_setopt ($ch, CURLOPT_COOKIEFILE, 'files/cookie.txt');
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, TRUE);
$postdata = curl_exec ($ch);
Shall I close the CURL connection every time I make a submission? Can I re-write the above in a better way that will make these 1000-1500 submissions quicker?
Thanks all
If you have access to php 5.2+ I would highly recommend php's curl_multi.
This allows you to process several curl requests in parallel, which in this case would definitely come in handy.
Related documentation : http://us3.php.net/manual/en/ref.curl.php
An example usage : http://www.somacon.com/p537.php
PHP's curl, by default, reuses a connection for multiple calls to curl_exec().
So in this case, you just ruse the curl handle, you got by curl_init and if the URL matches between calls to curl_exec(), it will send a "Connection: keep-alive" header and reuse the connection.
Do not close the connection and do not set CURLOPT_FORBID_REUSE
also see here:
Persistent/keepalive HTTP with the PHP Curl library?
I have simple code that does a head request for a URL and then prints the response headers. I've noticed that on some sites, this can take a long time to complete.
For example, requesting http://www.arstechnica.com takes about two minutes. I've tried the same request using another web site that does the same basic task, and it comes back immediately. So there must be something I have set incorrectly that's causing this delay.
Here's the code I have:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt ($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
// Only calling the head
curl_setopt($ch, CURLOPT_HEADER, true); // header will be at output
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
$content = curl_exec ($ch);
curl_close ($ch);
Here's a link to the web site that does the same function: http://www.seoconsultants.com/tools/headers.asp
The code above, at least on my server, takes two minutes to retrieve www.arstechnica.com, but the service at the link above returns it right away.
What am I missing?
Try simplifying it a little bit:
print htmlentities(file_get_contents("http://www.arstechnica.com"));
The above outputs instantly on my webserver. If it doesn't on yours, there's a good chance your web host has some kind of setting in place to throttle these kind of requests.
EDIT:
Since the above happens instantly for you, try setting this curl setting on your original code:
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);
Using the tool you posted, I noticed that http://www.arstechnica.com has a 301 header sent for any request sent to it. It is possible that cURL is getting this and not following the new Location specified to it, thus causing your script to hang.
SECOND EDIT:
Curiously enough, trying the same code you have above was making my webserver hang too. I replaced this code:
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
With this:
curl_setopt($ch, CURLOPT_NOBODY, true);
Which is the way the manual recommends you do a HEAD request. It made it work instantly.
You have to remember that HEAD is only a suggestion to the web server. For HEAD to do the right thing it often takes some explicit effort on the part of the admins. If you HEAD a static file Apache (or whatever your webserver is) will often step in an do the right thing. If you HEAD a dynamic page, the default for most setups is to execute the GET path, collect all the results, and just send back the headers without the content. If that application is in a 3 (or more) tier setup, that call could potentially be very expensive and needless for a HEAD context. For instance, on a Java servlet, by default doHead() just calls doGet(). To do something a little smarter for the application the developer would have to explicitly implement doHead() (and more often than not, they will not).
I encountered an app from a fortune 100 company that is used for downloading several hundred megabytes of pricing information. We'd check for updates to that data by executing HEAD requests fairly regularly until the modified date changed. It turns out that this request would actually make back end calls to generate this list every time we made the request which involved gigabytes of data on their back end and xfer it between several internal servers. They weren't terribly happy with us but once we explained the use case they quickly came up with an alternate solution. If they had implemented HEAD, rather than relying on their web server to fake it, it would not have been an issue.
If my memory doesn't fails me doing a HEAD request in CURL changes the HTTP protocol version to 1.0 (which is slow and probably the guilty part here) try changing that to:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt ($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
// Only calling the head
curl_setopt($ch, CURLOPT_HEADER, true); // header will be at output
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1); // ADD THIS
$content = curl_exec ($ch);
curl_close ($ch);
I used the below function to find out the redirected URL.
$head = get_headers($url, 1);
The second argument makes it return an array with keys. For e.g. the below will give the Location value.
$head["Location"]
http://php.net/manual/en/function.get-headers.php
This:
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
I wasn't trying to get headers.
I was just trying to make the page load of some data not take 2 minutes similar to described above.
That magical little options has dropped it down to 2 seconds.