Detect URL redirect path in PHP

I want to get the full redirect path of a URL.
Say source.com reaches destination.com after multiple redirects, like this:
http://www.source.com/ -> http://www.b.com/ -> http://www.c.com/ -> http://www.destination.com/
How do I get all of the redirected URLs?
With the code below I only get http://www.destination.com/. How do I detect the full redirect chain?
<?php
$url='windows.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow the redirects
curl_setopt($ch, CURLOPT_HEADER, false); // no needs to pass the headers to the data stream
curl_setopt($ch, CURLOPT_NOBODY, true); // get the resource without a body
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // accept any server certificate
curl_exec($ch);
// get the last used URL
$lastUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
echo $lastUrl;
?>
This code has another problem: it can't detect the target of YouTube redirect links.
Tested URL : https://www.youtube.com/redirect?redir_token=QUFFLUhqbkVxUFZUME9NbWF4RThxdFpGV3pmTTJEdFVWQXxBQ3Jtc0tubGJqU016TzJ6WnlfeUItX0ZmOUItUE1jRlZoZXhxMzNpQllpM0NLSk4ycnBLMGNidTFsX3N6WkU2X3RsUTRZb1lXQVp5SEZjbnU3eDFuZS1VU3dhdzg2QW9ZMTl1azFCZFZHcHRLdFF3dTM1MlRWdw%3D%3D&event=video_description&v=KEa2XWRGf_4&q=https%3A%2F%2Fwww.facebook.com%2Fabhiandniyu
My question is: how do I detect the full URL redirect chain for all types of redirect requests?

You're probably missing:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
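Add these to your cURL config and it should work. (Note that CURLOPT_BINARYTRANSFER has had no effect since PHP 5.1.3, so CURLOPT_RETURNTRANSFER alone is enough.)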

Don't follow HTTP redirects automatically: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
And include the HTTP headers in the output while testing: curl_setopt($ch, CURLOPT_HEADER, true);
Then you can read the Location header from each HTTP 3xx response (301, 302, 307 and so on).
When there is more than one redirect, this has to run in a loop until a non-redirect response such as HTTP 200 is received. In this context HTTP 200 means that the final destination has been reached.
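The loop described above could look roughly like the sketch below. The function names fetchOnce and traceRedirects are made up for this illustration, and the hop limit of 10 is an arbitrary assumption. Instead of parsing the Location header by hand, the sketch reads CURLINFO_REDIRECT_URL, which cURL fills in with the absolute target when CURLOPT_FOLLOWLOCATION is disabled.

```php
<?php
// Hypothetical helper: one request, redirects NOT followed.
// Returns [HTTP status code, absolute URL of the next hop or null].
function fetchOnce(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // stop at every hop
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request: headers only
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // keep output off the page
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // assumed timeout in seconds
    curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // With FOLLOWLOCATION off, cURL exposes the resolved redirect target here.
    $next = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    return [$status, $next ?: null];
}

// Follows the chain hop by hop and returns every URL visited, in order.
// $fetch is injectable so the loop can be exercised without network access.
function traceRedirects(string $url, int $maxHops = 10, ?callable $fetch = null): array
{
    $fetch = $fetch ?? 'fetchOnce';
    $chain = [$url];
    for ($hop = 0; $hop < $maxHops; $hop++) {
        [$status, $next] = $fetch($url);
        // Anything that is not a 3xx response with a target ends the chain.
        if ($status < 300 || $status >= 400 || $next === null) {
            break;
        }
        $chain[] = $next;
        $url = $next;
    }
    return $chain;
}
```

Calling print_r(traceRedirects('http://www.source.com/')); would then list each hop. Redirects performed by JavaScript or a meta refresh tag inside the page are not HTTP redirects, so a loop like this cannot see them.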

Related

Get the destination URL only without loading the contents with cURL

If I visit https://example.com/a/abc?name=jack, I get redirected to https://example.com/b/uuid-123. I only want the end URL, which is https://example.com/b/uuid-123. The content at that URL is about 1 MB, and I am not interested in it. How can I get the redirected URL without also downloading the 1 MB of content, which wastes bandwidth and time?
I have seen a few questions about redirection on Stack Overflow, but nothing on how to avoid loading the content.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://example.com/a/abc?name=jack');
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);
$end_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
echo('End URL is ' . $end_url);
For clarity, I'll add it as an answer as well.
You can tell cURL to retrieve only the headers by setting CURLOPT_NOBODY to true.
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
From these headers you can parse the Location line to get the redirected URL.
Edit for potential future readers: CURLOPT_HEADER also needs to be set to true; I had left this out because you already had it in your code.
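As a rough sketch of the parsing step, the function below pulls every Location value out of a raw header string such as the one curl_exec returns when CURLOPT_HEADER, CURLOPT_NOBODY and CURLOPT_RETURNTRANSFER are all enabled. The names extractLocations and fetchHeaders are made up for this example, and the regular expression assumes one header per line.

```php
<?php
// Hypothetical helper: returns all Location header values, in order of appearance.
function extractLocations(string $rawHeaders): array
{
    // "m" anchors ^ and $ to each line, "i" also accepts lowercase "location:".
    preg_match_all('/^Location:[ \t]*(.*?)[ \t]*\r?$/mi', $rawHeaders, $matches);
    return $matches[1];
}

// Hypothetical helper: fetches only the headers of $url, following redirects.
// With FOLLOWLOCATION on, the header blocks of every hop are concatenated.
function fetchHeaders(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // no body download
    curl_setopt($ch, CURLOPT_HEADER, true);         // include headers in the result
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return instead of printing
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $raw = curl_exec($ch);
    return $raw === false ? '' : $raw;
}
```

Something like extractLocations(fetchHeaders('https://example.com/a/abc?name=jack')) would then return the redirect targets without the page body being transferred. Location values may be relative, and this sketch does not resolve them against the request URL.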

When using cURL, can the target page detect the exact url?

Let's say we have our own script here:
https://example.com/random_name/script.php
// Code inside the script.php
$url = 'https://foo.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$data = curl_exec($ch);
As you can see, the script above fetches the content of this domain:
https://foo.com
I know foo.com can see example.com and its IP address.
But the question is: can foo.com (the page we're getting content from) also detect, by any method, that this exact part
/random_name/script.php is making the request? Does it depend on using TLS?
The page (server/application) can only see the information contained in the request. If the request (in your example, the one made by script.php) does not send the extra data (/random_name/script.php), the page that receives the request will not have it.
If you want the receiving end (foo.com) to know about it, you can send it in the Referer header:
curl_setopt($ch, CURLOPT_REFERER, "https://example.com/random_name/script.php");
This way foo.com can read that information from the Referer header.
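To illustrate the receiving side, here is a small sketch of what code on foo.com might do. The function name callerPath is made up for this example; it takes the $_SERVER array as a parameter and returns a path only when the client chose to send a Referer header.

```php
<?php
// Hypothetical code for the receiving side (foo.com).
// Returns the path of the calling script, or null if no Referer was sent.
function callerPath(array $server): ?string
{
    $referer = $server['HTTP_REFERER'] ?? null;
    if ($referer === null) {
        return null; // nothing was sent, so nothing can be detected
    }
    $path = parse_url($referer, PHP_URL_PATH);
    return is_string($path) ? $path : null;
}
```

On foo.com this would be called as callerPath($_SERVER). Since the header is set by the client, its value is whatever the sender decided to put there.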

Different responses to the same query with cURL (Steam Market)

When I follow this link in the browser:
http://steamcommunity.com/market/priceoverview/?country=US%C2%A4cy=5&appid=570&market_hash_name=Gem%20of%20Taegeuk
it returns { "success": false }, and the headers show a 500 error. But when I make the same request through cURL
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://steamcommunity.com/market/priceoverview/?country=US&currency=5&appid=570&market_hash_name=Gem%20of%20Taegeuk");
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
$curl = curl_exec($ch);
In response, instead of json I get this:
‹ЄV*.MNN-.VІJKМ)N­яятКC4
How can I fix this, and what might be causing the 500 error?
The server returns a gzipped response (header Content-Encoding: gzip), so you need to let cURL decode it automatically:
curl_setopt($ch,CURLOPT_ENCODING, '');
P.S. Unlike cURL, a browser unpacks the response automatically.
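Setting CURLOPT_ENCODING is the simplest fix. To show what the unreadable output actually is, the sketch below decodes a gzip-compressed body by hand. The function name decodeBody is made up for this example, and it assumes PHP's zlib extension is available.

```php
<?php
// Hypothetical helper: returns the body as text, decompressing it if it is gzip data.
function decodeBody(string $body): string
{
    // Gzip streams start with the magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        // Fall back to the raw bytes if decompression fails.
        return $decoded === false ? $body : $decoded;
    }
    return $body; // already plain text
}
```

This only applies to a body without the response headers in front of it, so CURLOPT_HEADER would have to be off, or the headers split away first.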
Two problems:
1) The example link contains %C2%A4cy where &currency should be: the &curren part of &currency was converted to the ¤ character, so country=US is followed by %C2%A4cy=5. The URL in the cURL code looks OK.
2) Your cURL options do not follow redirects, and the URL should use https:// (the browser handles that automatically). You can follow redirects with curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

Curl Php Jsessionid cannot get or set

OK, so I am trying to make a cURL request to a website that uses a jsessionid; this is new to me.
I have the PHP cURL script shown below. How can I get the correct jsessionid cookies and set them correctly?
<?php
$url = 'http://www.example.com/i/sec/stats.do';
$ch = curl_init();
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt"); // Cookie management.
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result= curl_exec ($ch);
curl_close ($ch);
echo $result;
?>
I then get this output. How can I get and set the jsessionid?
This document you requested has moved temporarily.
It's now at https://www.example.com/i/sec/stats.do;jsessionid=c7dnSdlXc18Zpmqj1Tv1Rxq5TZDwD7dCpt5dpbg7LXmp1gnZs9V9!212735760!1390241207640.
You are probably getting a 301 or 302 redirect response from your cURL request. Use this option to follow redirects:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
But if the site doesn't return a 301 or 302 redirect response, parse the URL from the response body and make another cURL request to it.
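For the second case, parsing the URL out of the response body might look like the sketch below. The function name extractMovedUrl is made up for this example, and the pattern simply takes the first http or https URL it finds, which is an assumption about how the "moved temporarily" page is worded.

```php
<?php
// Hypothetical helper: returns the first URL found in the body, or null.
function extractMovedUrl(string $body): ?string
{
    // Match up to the next whitespace, quote or angle bracket.
    if (preg_match('~https?://[^\s"\'<>]+~i', $body, $m) !== 1) {
        return null;
    }
    // Drop sentence punctuation that trails the URL in running text.
    return rtrim($m[0], '.,');
}
```

The returned URL, including its ;jsessionid=... part, can then be passed to a second cURL request that reuses the same cookie file.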

Using PHP Curl, how can I look at exactly what is being sent?

For instance, take the following cURL snippet:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); //set target URL
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);// allow redirects
curl_setopt($ch, CURLOPT_POST, $usePost); // set POST method
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); //set headers
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, $returnHeaders);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); //prevent unverified SSL error
Before I run curl_exec on it, what if I want to see the full request headers and body before they are sent (to check that the request correctly follows certain REST API guidelines)?
You could send a request to the local server:
$test_url = 'http://localhost/nonexistent-page';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $test_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
// Other options.
curl_exec($ch);
echo nl2br(curl_getinfo($ch, CURLINFO_HEADER_OUT));
This will give you the request headers, with only the request line path and the Host: line being different from your actual request.
If you have access to a graphical environment on your server, you could use Wireshark to examine the network packets being sent and received. Wireshark lets you use filters to narrow the capture to specific IP addresses and protocols.
For instance, I use this filter to see all the traffic from my cURL requests/responses to the server with IP w.x.y.z (substitute the IP of the server you are connecting to):
ip.addr == w.x.y.z && http
I can then examine all my requests and responses.
This has given me great insight into what's happening 'under the hood'.
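If Wireshark is not an option, a similar view can be had from inside PHP by turning on cURL's verbose mode and capturing its log. This is only a sketch: the function name captureVerboseLog is made up, and the 10 second timeout is an arbitrary assumption. In the verbose log, lines starting with ">" are the request line and headers that were sent, and lines starting with "<" are response headers.

```php
<?php
// Hypothetical helper: runs a request and returns the response plus cURL's verbose log.
function captureVerboseLog(string $url, array $options = []): array
{
    $log = fopen('php://temp', 'w+'); // in-memory stream that receives the log
    $ch = curl_init($url);
    // Caller-supplied options win over the defaults below.
    curl_setopt_array($ch, $options + [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_VERBOSE        => true, // log connection details and headers
        CURLOPT_STDERR         => $log, // write the log to our stream, not STDERR
        CURLOPT_TIMEOUT        => 10,
    ]);
    $response = curl_exec($ch);
    rewind($log);
    $text = stream_get_contents($log);
    return ['response' => $response, 'log' => $text];
}
```

For example, echo captureVerboseLog($url, [CURLOPT_HTTPHEADER => $headers])['log']; prints what was exchanged for that request.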
