I know that when I set CURLOPT_FOLLOWLOCATION to true, cURL follows the Location header and redirects to the new page. But is it possible to get only the headers of the new page without actually redirecting there? Or is that not possible?
Appears to be a duplicate of PHP cURL: Get target of redirect, without following it
However, this can be done in 3 easy steps:
Step 1. Initialise curl
$ch = curl_init(); //initialise the curl handle
//COOKIESESSION is optional; it starts a new cookie session and ignores session cookies loaded from a previous one
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
Step 2. Get the headers for $url
curl_setopt($ch, CURLOPT_URL, $url); //specify your URL
curl_setopt($ch, CURLOPT_HEADER, true); //include headers in http data
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); //don't follow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //return the response as a string instead of printing it
$http_data = curl_exec($ch); //hit the $url
$curl_info = curl_getinfo($ch);
$headers = substr($http_data, 0, $curl_info["header_size"]); //split out header
Step 3. Parse the headers to get the new URL
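preg_match("!\r\n(?:Location|URI): *(.*?) *\r\n!i", $headers, $matches); //header names are case-insensitive
$url = isset($matches[1]) ? $matches[1] : null; //null when the response is not a redirect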
Once you have the new URL you can then repeat steps 2-3 as often as you like.
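As a rough sketch of repeating steps 2-3 with a hop limit, something like the following could work. The names extractLocation, fetchHeaders and followManually, and the $fetch parameter, are made up for this example and are not part of cURL. It also assumes every Location value is an absolute URL; relative values would need to be resolved against the current URL first.

```php
<?php
// Hypothetical helper: return the Location value from a raw header block, or null.
function extractLocation(string $headers): ?string
{
    // "m" makes ^ and $ match per line; "i" ignores the case of the header name.
    if (preg_match('/^Location:[ \t]*(.*?)[ \t]*\r?$/mi', $headers, $m)) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: request only the headers of $url, without following redirects.
function fetchHeaders(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);           // send a HEAD request
    curl_setopt($ch, CURLOPT_HEADER, true);           // include headers in the result
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);  // do not follow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the result as a string
    $headers = curl_exec($ch);
    return is_string($headers) ? $headers : '';       // curl_exec returns false on failure
}

// Repeat steps 2-3 until there is no Location header or $maxHops is reached.
// $fetch is only a parameter so the loop can be tried without network access.
function followManually(string $url, int $maxHops = 5, ?callable $fetch = null): array
{
    $fetch = $fetch ?? 'fetchHeaders';
    $chain = [$url];
    for ($i = 0; $i < $maxHops; $i++) {
        $next = extractLocation($fetch($url));
        if ($next === null || $next === '') {
            break; // no further redirect
        }
        $url = $next;
        $chain[] = $url;
    }
    return $chain; // every URL visited, starting with the original one
}
```

The last element of the returned array is the final target, and the hop limit guards against redirect loops.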
No. You'd have to disable FOLLOWLOCATION, extract the redirect URL from the response, and then issue a new HEAD request with that URL.
Set CURLOPT_FOLLOWLOCATION to false and CURLOPT_HEADER to true, and read the "Location" value from the response headers.
Yes, you can let cURL follow the redirects and then read the final location once the last response has arrived.
The function to get the last redirect:
function get_redirect_final_target($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
    curl_setopt($ch, CURLOPT_AUTOREFERER, 1); // set referer on redirect
    curl_setopt($ch, CURLOPT_HEADER, false); // to print the response headers, change false to true
    $response = curl_exec($ch);
    $target = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    if ($target)
        return $target; // the location you want
    return false;
}
You can get the redirect URL directly with curl_getinfo:
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIESESSION, false);
curl_setopt($ch, CURLOPT_URL, $url); //specify your URL
curl_setopt($ch, CURLOPT_HEADER, true); //include headers in http data
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); //don't follow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //return the response instead of printing it
$http_data = curl_exec($ch); //hit the $url
$redirect = curl_getinfo($ch)['redirect_url'];
curl_close($ch);
return $redirect;
To analyze the headers, you can use CURLOPT_HEADERFUNCTION.
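A minimal sketch of the CURLOPT_HEADERFUNCTION approach might look like this. cURL calls the callback once per header line and expects it to return the number of bytes it handled. The helper names makeHeaderCollector and redirectTarget are invented for this example.

```php
<?php
// Hypothetical helper: build a CURLOPT_HEADERFUNCTION callback that stores
// each "Name: value" line in $headers, with the header name lower-cased.
function makeHeaderCollector(array &$headers): callable
{
    return function ($ch, string $line) use (&$headers): int {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) { // skips the status line and the blank line
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
        // cURL expects the callback to return the number of bytes it handled.
        return strlen($line);
    };
}

// Hypothetical usage: read the redirect target of $url without following it.
function redirectTarget(string $url): ?string
{
    $headers = [];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);           // headers only
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);  // do not follow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // keep the response out of the output
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, makeHeaderCollector($headers));
    curl_exec($ch);
    return $headers['location'] ?? null;              // null when there is no redirect
}
```

Because the callback receives the headers directly, CURLOPT_HEADER does not need to be enabled for this variant.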
Make sure you set CURLOPT_HEADER to true to get the headers in the response; otherwise the response is returned as an empty string.
Related
I am sending JSON data in a POST request to the target URL, and I am able to get the Location URL. However, the target responds asynchronously. How do I know when to call the Location URL to get the response? Below is my code for getting the Location URL.
$target_url = 'http://websieapi.com/api/messaging/post';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $target_url);
curl_setopt($ch, CURLOPT_HEADER, true); //include headers in http data
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,$json_request);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
$curl_info = curl_getinfo($ch); //needed for header_size below
$headers = substr($result, 0, $curl_info["header_size"]); //split out header
preg_match("!\r\n(?:Location|URI): *(.*?) *\r\n!", $headers, $matches);
$location_url = $matches[1];
curl_close($ch);
Now $location_url holds the generated URL. But how do I know when to call this URL to get the response? Under high traffic, the response will take time to generate. Is there any way to find out whether the response has been generated yet? Any help would be greatly appreciated.
I'm trying to copy a web page, but the URL redirects when I enter it in my browser.
For example,
I enter,
http://www.domain.com/cat/121
it redirects,
http://www.domain.com/cat/121/title-of-the-page/
And when I try to use the PHP copy function on "www.domain.com/cat/121",
it does not work.
How can I get the new, redirected URL?
$url='your url';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow the redirects
curl_setopt($ch, CURLOPT_HEADER, false); // no need to pass the headers to the data stream
curl_setopt($ch, CURLOPT_NOBODY, true); // get the resource without a body
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // accept any server certificate (insecure; avoid in production)
curl_exec($ch);
// get the last used URL
$lastUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
echo $lastUrl;
This will give you the redirected URL.
Use the header() function to redirect to a specific URL.
header('Location: http://www.domain.com/cat/121');
If you are using a CMS, you can use an appropriate plugin.
For instance, for WordPress you could use: https://wordpress.org/plugins/redirection/
Hi, I know this is a very common topic on Stack Overflow.
I have already spent an entire week searching for an answer.
I have a URL: abc.com/default.asp?strSearch=19875379
This redirects to the following URL: abc.com/default.asp?catid={170D4F36-39F9-4C48-88EB-CFC8DDF1F531}&details_type=1&itemid={49F6A281-8735-4B74-A170-B6110AF6CC2D}
I have tried to get the final URL in my PHP code using cURL, but I can't get it to work.
Here is my code:
<?php
$name="19875379";
$url = "http://www.ikea.co.il/default.asp?strSearch=".$name;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$a = curl_exec($ch);
curl_close( $ch );
// the returned headers
$headers = explode("\n",$a);
// if there is no redirection this will be the final url
$redir = $url;
// loop through the headers and check for a Location: str
$j = count($headers);
for($i = 0; $i < $j; $i++){
// if we find the Location header strip it and fill the redir var
//print_r($headers);
if(strpos($headers[$i],"Location:") !== false){
$redir = trim(str_replace("Location:","",$headers[$i]));
break;
}
}
// do whatever you want with the result
echo $redir;
?>
It gives me the URL "abc.com/default.asp?strSearch=19875379" instead of the URL "abc.com/default.asp?catid={170D4F36-39F9-4C48-88EB-CFC8DDF1F531}&details_type=1&itemid={49F6A281-8735-4B74-A170-B6110AF6CC2D}"
Thanks in advance for your kind help :)
Thank you everyone for helping me in my situation.
Actually, I want to develop a scraper in PHP for the IKEA website used in Israel (in Hebrew).
After many hours I realized that there is no server-side redirect for the URL I was requesting. It appears to be a JavaScript redirect.
I have now implemented the code below, and it works for me.
<?php
$name="19875379";
$url = "http://www.ikea.co.il/default.asp?strSearch=".$name;
$ch = curl_init();
$timeout = 0;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$header = curl_exec($ch);
$redir = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
//print_r($header);
$x = preg_match("/<script>location.href=(.|\n)*?<\/script>/", $header, $matches);
$script = $matches[0];
$redirect = str_replace("<script>location.href='", "", $script);
$redirect = "http://www.ikea.co.il" . str_replace("';</script>", "", $redirect);
echo $redirect;
?>
Thanks again everyone :)
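The script-tag parsing above could also be pulled into a small helper so it can be checked without hitting the site. This is only a sketch: the name extractJsRedirect is made up, it only recognises a quoted location.href assignment, and it treats any non-absolute target as relative to the site root, which is an assumption about how the page writes its redirect.

```php
<?php
// Hypothetical helper: find a location.href='...' assignment in $html and
// return it as an absolute URL, or null when no such assignment exists.
function extractJsRedirect(string $html, string $baseUrl): ?string
{
    // Capture the quote character in group 1 and the target in group 2.
    if (!preg_match('/location\.href\s*=\s*([\'"])(.*?)\1/i', $html, $m)) {
        return null;
    }
    $target = $m[2];
    // Already absolute: return it unchanged.
    if (preg_match('#^https?://#i', $target)) {
        return $target;
    }
    // Assumption: anything else is resolved against the site root.
    return rtrim($baseUrl, '/') . '/' . ltrim($target, '/');
}
```

Calling extractJsRedirect($header, "http://www.ikea.co.il") on the cURL output would then replace the two str_replace calls.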
The accepted answer is applicable to a very specific scenario. So, most of us will be better off having a more general answer. Though you can extract the more general answer from within the accepted answer, separately having that part may be more helpful.
So, if you just want to get the last redirected URL, this code will help.
<?php
function redirectedUrl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // set browser info to avoid old browser warnings
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // allow url redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // get the return value of curl execution as a string
    $html = curl_exec($ch);
    // store last redirected url in a variable before closing the curl session
    $lastUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    return $lastUrl;
}
First of all, I didn't see any redirection when I ran your code. Anyway, here are a few things you can do (keeping your approach intact):
First, make sure that the headers are included in your cURL output (in this case, $a).
curl_setopt($ch, CURLOPT_HEADER, true);
Now, separate the header portion from the rest of the HTTP response.
// the headers will be at index 0, and the HTML will be at index 1.
$header = explode("\r\n\r\n", $a, 2);
Explode the header string into an array of headers.
$headers = explode("\r\n", $header[0]);
You can use curl_getinfo() ...
http://php.net/manual/en/function.curl-getinfo.php
I have an affiliate URL like http://track.abc.com/?affid=1234
Opening this link goes to http://www.abc.com
Now I want to request http://track.abc.com/?affid=1234 using cURL.
How can I get http://www.abc.com
with cURL?
If you want cURL to follow redirect headers from the responses it receives, you need to set that option with:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
You may also want to limit the number of redirects it follows using:
curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
So you'd use something similar to this:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://track.abc.com/?affid=1234");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$data = curl_exec($ch);
Edit: Question wasn't exactly clear but from the comment below, if you want to get the redirect location, you need to get the headers from cURL and parse them for the Location header:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://track.abc.com/?affid=1234");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, true);
$data = curl_exec($ch);
This will give you the headers returned by the server in $data, simply parse through them to get the location header and you'll get your result. This question shows you how to do that.
I wrote a function that will extract any header from a cURL header response.
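function getHeader($headerString, $key) {
    // return null when the header is absent instead of raising a notice
    if (!preg_match('#\s\b' . preg_quote($key, '#') . '\b:\s.*\s#i', $headerString, $header)) {
        return null;
    }
    return substr($header[0], strlen($key) + 3, -2);
}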
In this case, you're looking for the value of the header Location. I tested the function by retrieving headers from a TinyURL, that redirects to http://google.se, using cURL.
$url = "http://tinyurl.com/dtrkv";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
$location = getHeader($data, 'Location');
var_dump($location);
Output from the var_dump.
string(16) "http://google.se"
If I have a Twitter t.co link, how can I unshorten it in php?
simple example:
$ch = curl_init("http://t.co/...");
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$yy = curl_exec($ch);
curl_close($ch);
$w = explode("\n",$yy);
$real_url = trim(substr($w[3],10)); # assumes the fourth line is "Location: http://..."; trim() drops the trailing "\r"
echo $real_url;
You'll want to use cURL (with the CURLOPT_HEADER option) to fetch the URL's headers and look for the Location: header.
I would recommend using CURLINFO_EFFECTIVE_URL with curl_getinfo().
See https://stackoverflow.com/a/10661246/168815