I want to get the full redirect path of a URL.
Say source.com redirects to destination.com through several intermediate redirects, like this:
http://www.source.com/ -> http://www.b.com/ -> http://www.c.com/ -> http://www.destination.com/
How do I get all of the redirected URLs?
With the code below I only get http://www.destination.com/. How do I detect the full redirect chain?
<?php
$url='windows.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow the redirects
curl_setopt($ch, CURLOPT_HEADER, false); // do not include the response headers in the output
curl_setopt($ch, CURLOPT_NOBODY, true); // get the resource without a body
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // accept any server certificate
curl_exec($ch);
// get the last used URL
$lastUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
echo $lastUrl;
?>
This code has another problem: it can't detect the target of YouTube redirect links.
Tested URL : https://www.youtube.com/redirect?redir_token=QUFFLUhqbkVxUFZUME9NbWF4RThxdFpGV3pmTTJEdFVWQXxBQ3Jtc0tubGJqU016TzJ6WnlfeUItX0ZmOUItUE1jRlZoZXhxMzNpQllpM0NLSk4ycnBLMGNidTFsX3N6WkU2X3RsUTRZb1lXQVp5SEZjbnU3eDFuZS1VU3dhdzg2QW9ZMTl1azFCZFZHcHRLdFF3dTM1MlRWdw%3D%3D&event=video_description&v=KEa2XWRGf_4&q=https%3A%2F%2Fwww.facebook.com%2Fabhiandniyu
My question is: how do I detect the full redirect chain for all types of redirect requests?
You're probably missing:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
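Add them to your cURL configuration and it should work. (Note that CURLOPT_BINARYTRANSFER has had no effect since PHP 5.1.3, so CURLOPT_RETURNTRANSFER is the one that matters.)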
Don't follow HTTP redirects: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
And output the HTTP headers while testing: curl_setopt($ch, CURLOPT_HEADER, true);
Then you can read the Location header from the HTTP redirect response (301, 302 and so on).
When there is more than one redirect, this has to run in a loop until a non-redirect status such as HTTP 200 is received. In this context, HTTP 200 means that the final destination has been reached.
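The loop described above could look roughly like the sketch below. The names trace_redirects and curl_fetch are made up for illustration, and the hop limit of 10 is an arbitrary assumption. The fetcher is passed in as a callable so the loop can be tried without a network; curl_fetch is one possible fetcher that asks for headers only and lets cURL report the next URL via CURLINFO_REDIRECT_URL.

```php
<?php
// Hypothetical helper: follows redirects one hop at a time and records every URL.
// $fetch takes a URL and returns [int $status, ?string $location].
function trace_redirects(string $url, callable $fetch, int $maxHops = 10): array
{
    $chain = [$url];
    for ($i = 0; $i < $maxHops; $i++) {
        [$status, $location] = $fetch($url);
        // Stop on any non-3xx status, or when no redirect target was given.
        if ($status < 300 || $status >= 400 || $location === null) {
            break;
        }
        $url = $location;
        $chain[] = $url;
    }
    return $chain;
}

// One possible fetcher: header-only request, redirects NOT followed by cURL.
function curl_fetch(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    // With FOLLOWLOCATION off, this is the URL that would be requested next.
    $location = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    return [$status, $location ?: null];
}
```

Usage would be something like print_r(trace_redirects('http://www.source.com/', 'curl_fetch'));. Keep in mind that redirects done with JavaScript or a meta refresh tag are not HTTP redirects, so they will not show up in this chain, and some servers answer header-only (HEAD) requests differently from GET.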
If I visit https://example.com/a/abc?name=jack, I get redirected to https://example.com/b/uuid-123. I want only the end URL, which is https://example.com/b/uuid-123. The content of https://example.com/b/uuid-123 is about 1 MB, but I am not interested in it. How can I get the redirected URL without also loading the 1 MB of content, which wastes my bandwidth and time?
I have seen a few questions about redirection on Stack Overflow, but nothing on how to avoid loading the content.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://example.com/a/abc?name=jack');
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);
$end_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
echo('End URL is ' . $end_url);
For clarity, I'll add it as an answer as well.
You can tell cURL to retrieve only the headers by setting CURLOPT_NOBODY to true.
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
From these headers you can parse the Location line to get the redirected URL.
Edit for potential future readers: CURLOPT_HEADER also needs to be set to true. I had left this out because you already had it in your code.
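As a rough sketch of the parsing step: when CURLOPT_HEADER is on and cURL follows redirects, the returned string contains one header block per hop, so every Location line can be collected in order. The function name extract_locations is made up for this example, and the header text in the usage note is invented sample data.

```php
<?php
// Hypothetical helper: collects the value of every Location header,
// in order, from the raw header text returned by curl_exec().
function extract_locations(string $rawHeaders): array
{
    // Header names are case-insensitive; lines may end in \r\n or \n.
    preg_match_all('/^Location:[ \t]*(.*?)[ \t]*\r?$/mi', $rawHeaders, $matches);
    return $matches[1];
}
```

For example, extract_locations("HTTP/1.1 302 Found\r\nLocation: https://example.com/b/uuid-123\r\n\r\n") gives an array with that one URL. Location values may be relative, in which case they still need to be resolved against the URL that returned them.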
Let's say we have our own script here:
https://example.com/random_name/script.php
// Code inside the script.php
$url = 'https://foo.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$data = curl_exec($ch);
As you can see, the script above fetches the content of this domain:
https://foo.com
I know foo.com can see example.com and its IP address.
But the question is: can foo.com (the page we're getting content from) also detect, by any method, that this exact part,
/random_name/script.php, is making the request? Does it depend on using TLS?
The page (server/application) only has the information that is contained in the request. If the request (in your example, from script.php) does not send the extra data (/random_name/script.php), the page that receives the request will not have it.
If you want the receiving end (foo.com) to know about it, you can use the Referer header:
curl_setopt($ch, CURLOPT_REFERER, "https://example.com/random_name/script.php");
This way, foo.com can read that information from the Referer header.
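To illustrate the receiving side, here is a minimal sketch assuming foo.com also runs PHP (nothing in the question says so). PHP exposes the header as $_SERVER['HTTP_REFERER']; the helper name referer_path is made up for this example.

```php
<?php
// Hypothetical helper for the receiving side: returns the path part of the
// Referer header, or null when the client did not send one.
// $server is expected to look like PHP's $_SERVER array.
function referer_path(array $server): ?string
{
    $referer = $server['HTTP_REFERER'] ?? null;
    if (!is_string($referer)) {
        return null;
    }
    $path = parse_url($referer, PHP_URL_PATH);
    return is_string($path) ? $path : null;
}
```

On foo.com this would be called as referer_path($_SERVER). The header is supplied by the client, so it can be missing or set to anything; it only tells the server what the caller chose to send.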
When I follow this link in the browser:
http://steamcommunity.com/market/priceoverview/?country=US%C2%A4cy=5&appid=570&market_hash_name=Gem%20of%20Taegeuk
it returns { "success": false }, with an error 500 in the headers. But when I make the same request through cURL:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://steamcommunity.com/market/priceoverview/?country=US&currency=5&appid=570&market_hash_name=Gem%20of%20Taegeuk");
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
$curl = curl_exec($ch);
In the response, instead of JSON, I get this:
‹ЄV*.MNN-.VІJKМ)NяятКC4
Tell me how to fix this and what might be the cause of the error (500)?
The server returns a gzipped response (header Content-Encoding: gzip), so you need cURL to decode it automatically. An empty string enables every encoding cURL supports:
curl_setopt($ch,CURLOPT_ENCODING, '');
P.S. Unlike cURL, a browser unpacks the response automatically.
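To show what that unreadable output actually is, here is a small sketch of decoding a gzipped body by hand. It assumes PHP's zlib extension is available, and decode_body is a made-up name. With CURLOPT_ENCODING set, cURL does this for you, so the helper is only for illustration.

```php
<?php
// Hypothetical helper: decodes a response body according to its
// Content-Encoding header value. Only gzip is handled in this sketch.
function decode_body(string $body, string $contentEncoding): string
{
    if (strcasecmp(trim($contentEncoding), 'gzip') === 0) {
        // gzdecode() returns false (and emits a warning) on invalid input.
        $decoded = gzdecode($body);
        if ($decoded === false) {
            throw new RuntimeException('Body is not valid gzip data');
        }
        return $decoded;
    }
    // Any other encoding is returned untouched.
    return $body;
}
```

Round-tripping a string through gzencode() and decode_body(..., 'gzip') gives the original text back, while echoing the gzencode() result directly produces the same kind of binary garbage as in the question.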
Two problems:
1) The example link contains a stray %C2%A4cy where &currency should follow country=US. The URL in the cURL code looks OK.
2) Your cURL options do not follow redirects, and the URL should use https:// (a browser handles that redirect automatically). You can follow redirects with curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
I am trying to fetch a page with cURL from a website that uses a jsessionid. This is new to me.
I have the PHP cURL script shown below. How can I get the correct jsessionid cookies and set them correctly?
<?php
$url = 'http://www.example.com/i/sec/stats.do';
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt"); // Cookie management.
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result= curl_exec ($ch);
curl_close ($ch);
echo $result;
?>
I then get the output below. How can I get and set the jsessionid?
This document you requested has moved temporarily.
It's now at https://www.example.com/i/sec/stats.do;jsessionid=c7dnSdlXc18Zpmqj1Tv1Rxq5TZDwD7dCpt5dpbg7LXmp1gnZs9V9!212735760!1390241207640.
You are possibly getting a 301 or 302 redirect response to your cURL request. Use this option to follow the redirection:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
But if the site doesn't return a 301 or 302 redirect response, parse the URL out of the response body and make another cURL request to it.
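A possible sketch of that fallback is shown here. Both function names are made up, and the patterns are only guesses based on the message shown in the question; a different site may word its page differently.

```php
<?php
// Hypothetical helper: returns the first http(s) URL found in a response
// body, with trailing sentence punctuation removed, or null if none.
function extract_moved_url(string $body): ?string
{
    if (preg_match('~https?://[^\s"\'<>]+~i', $body, $m) !== 1) {
        return null;
    }
    return rtrim($m[0], '.,');
}

// Hypothetical helper: returns the ;jsessionid=... path parameter of a URL,
// or null if the URL has none.
function extract_jsessionid(string $url): ?string
{
    return preg_match('/;jsessionid=([^?#;\/]+)/i', $url, $m) === 1 ? $m[1] : null;
}
```

The URL returned by extract_moved_url($result) can then be passed to a second request via CURLOPT_URL. Reusing the same cookies.txt for CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE keeps any session cookie the server sets.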
For instance, take the following cURL snippet:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); //set target URL
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);// allow redirects
curl_setopt($ch, CURLOPT_POST, $usePost); // set POST method
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); //set headers
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, $returnHeaders);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); //prevent unverified SSL error
Before I run curl_exec on it, what if I want to see the full request headers and body before they are sent (to check that the request correctly follows certain REST API guidelines)?
You could send a request to the local server:
$test_url = 'http://localhost/nonexistent-page';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $test_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
// Other options.
curl_exec($ch);
echo nl2br(curl_getinfo($ch, CURLINFO_HEADER_OUT));
This will give you the request headers; only the path in the request line and the Host: line will differ from your actual request.
If you have access to a graphical environment on your server, you could use Wireshark to examine the network packets being sent and received. Wireshark lets you apply filters to narrow the capture down to specific IP addresses and protocols.
For instance, I use this filter to see all the traffic between my cURL requests/responses and the server with IP w.x.y.z (substitute the IP of the server you are connecting to):
ip.addr == w.x.y.z && http
I can then examine all my requests and responses.
This has given me great insight into what's happening 'under the hood'.