Converting Rackspace Cloud Files CDN URLs from HTTP to HTTPS - php

I have a series of Rackspace Cloud Files CDN URLs stored which reference an HTTP address and I would like to convert them to the HTTPS equivalent.
Rackspace Cloud Files CDN URLs are in the following format:
http://c186397.r97.cf1.rackcdn.com/CloudFiles Akamai.pdf
And the SSL equivalent for this URL would be:
https://c186397.ssl.cf1.rackcdn.com/CloudFiles Akamai.pdf
The changes to the URL are (source):
HTTP becomes HTTPS
The second URI segment ('r97' in this example) becomes 'ssl'
The 'r00' part seems to vary in length (as some are 'r6' etc.) so I'm having trouble converting these URLs to HTTPS. Here's the code I have so far:
function rackspace_cloud_http_to_https($url)
{
//Replace the HTTP part with HTTPS
$url = str_replace("http", "https", $url, $count = 1);
//Get the position of the .r00 segment
$pos = strpos($url, '.r');
if ($pos === FALSE)
{
//Not present in the URL
return FALSE;
}
//Get the .r00 part to replace
$replace = substr($url, $pos, 4);
//Replace it with .ssl
$url = str_replace($replace, ".ssl", $url, $count = 1);
return $url;
}
This however does not work for URLs where the second segment is of a different length.
Any thoughts appreciated.

I know this is old, but if you are using the php-opencloud library (https://github.com/rackspace/php-opencloud), you can call the getPublicUrl() method on the object. You just need to import the following namespace:
use OpenCloud\ObjectStore\Constants as Constant;
// Code logic to upload file
$https_file_url = $response->getPublicUrl(Constant\UrlType::SSL);

Try this:
function rackspace_cloud_http_to_https($url)
{
    $urlparts = explode('.', $url);
    // check for the 'r' segment (e.g. 'r97' or 'r6'), whatever its length
    if (isset($urlparts[1]) && preg_match('/^r\d+$/', $urlparts[1]))
    {
        // switch the scheme; anchored so only a leading "http://" is rewritten
        $urlparts[0] = preg_replace('#^http://#', 'https://', $urlparts[0]);
        $urlparts[1] = 'ssl';
        // put url back together
        return implode('.', $urlparts);
    }
    else
    {
        return false;
    }
}
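A more compact variation, offered as a sketch rather than a drop-in replacement: a single anchored regular expression can rewrite the scheme and the 'r' segment in one pass, whatever the segment's length. The function name rackspace_http_to_https_regex() is made up for this example, and the pattern assumes the URL format described in the question.

```php
<?php
// Hypothetical helper: the name and the pattern are illustrative assumptions.
function rackspace_http_to_https_regex($url)
{
    $count = 0;
    // ^http://   only plain-HTTP URLs
    // ([^./]+)   first hostname label, e.g. c186397
    // \.r\d+\.   the 'r' segment with any number of digits, e.g. .r97. or .r6.
    $converted = preg_replace(
        '#^http://([^./]+)\.r\d+\.#',
        'https://${1}.ssl.',
        $url,
        1,
        $count
    );
    // $count is filled in by preg_replace(); 0 means the URL did not match.
    return $count === 1 ? $converted : false;
}

echo rackspace_http_to_https_regex('http://c186397.r6.cf1.rackcdn.com/CloudFiles Akamai.pdf'), "\n";
```

Returning false for non-matching input keeps the same contract as the function above, so callers can tell "not a Cloud Files HTTP URL" apart from a converted URL.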

Related

How to follow a URL redirect to the final location?

How can I get the final destination URL for an Airbnb short-link URL in PHP? (e.g. https://abnb.me/Vt3MA7vVyM)
Using Redirect Detective, I can see that the link gets redirected three times.
file_get_contents() follows the redirections defined in the HTTP response. But, there is a redirection in JavaScript using window.top.location. So, you can parse it using strpos() and a simple preg_match():
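$url = 'https://abnb.me/Vt3MA7vVyM';
$ret = file_get_contents($url);
// file_get_contents() returns false on failure, so check before searching
$pos = ($ret !== false) ? strpos($ret, 'window.top.location') : false;
if ($pos !== false) {
    $str = substr($ret, $pos);
    // capture the URL passed to validate("...")
    if (preg_match('~validate\("([^"]*)~', $str, $matches)) {
        echo html_entity_decode($matches[1]);
    }
}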
Output:
https://airbnb.com/rooms/2110908?=&s=41&ref_device_id=43fb193006d0cb8848f689aec67ba15ae5c48471&user_id=10758532&_branch_match_id=519823851281523702
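The parsing step can be tried without any network access. The sketch below wraps it in a function and runs it against a made-up response body; the name extract_js_redirect() and the sample markup are assumptions for illustration, and the real page may change its script at any time.

```php
<?php
// Hypothetical helper; the sample body below is invented for illustration.
function extract_js_redirect($html)
{
    // Find the JavaScript redirect, if there is one.
    $pos = strpos($html, 'window.top.location');
    if ($pos === false) {
        return null;
    }
    // Capture the quoted argument of validate("...") after that point.
    if (!preg_match('~validate\("([^"]*)~', substr($html, $pos), $matches)) {
        return null;
    }
    // The URL is HTML-escaped inside the page, so decode entities such as &amp;.
    return html_entity_decode($matches[1]);
}

$sample = '<script>window.top.location = validate("https://example.com/rooms/1?a=1&amp;b=2");</script>';
echo extract_js_redirect($sample), "\n";
```

Returning null when either step fails avoids reading $matches[1] from a failed match.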

Cloudfront, HTML5 Video and Signed URLs

I am having trouble getting Cloudfront videos to play when using a signed url. If I do NOT require a signed URL, everything works fine. Here is the code that signs the url:
function rsa_sha1_sign($policy, $private_key_filename)
{
$signature = "";
// load the private key
$fp = fopen($private_key_filename, "r");
$priv_key = fread($fp, 8192);
fclose($fp);
//echo $priv_key;
$pkeyid = openssl_get_privatekey($priv_key);
// compute signature
openssl_sign($policy, $signature, $pkeyid);
// free the key from memory
openssl_free_key($pkeyid);
//echo $signature;
return $signature;
}
function url_safe_base64_encode($value)
{
$encoded = base64_encode($value);
// replace unsafe characters +, = and / with
// the safe characters -, _ and ~
return str_replace(
array('+', '=', '/'),
array('-', '_', '~'),
$encoded);
}
// No restriction
$key_pair_id = "KEYPAIRID-DIST-NOT-REQUIRING-SIGNEDURL";
$download_url = "http://URL-DIST-NOT-REQUIRING-SIGNEDURL.cloudfront.net/myvideo.mp4";
//This is just a flag to aid in switching between the 2 testing distributions
if($restrict) {
$key_pair_id = "KEYPAIRID-DIST-REQUIRING-SIGNEDURL";
$download_url = "http://URL-DIST-REQUIRING-SIGNEDURL.cloudfront.net/myvideo.mp4";
}
$DateLessThan = time() + (24*7*60*60);
$policy = '{"Statement":[{"Resource":"'.$download_url.'","Condition":{"DateLessThan":{"AWS:EpochTime":'.$DateLessThan.'}}}]}';
$private_key_file = "/path/to/privatekey.pem";
$signature = rsa_sha1_sign($policy, $private_key_file);
$signature = url_safe_base64_encode($signature);
$final_url = $download_url.'?Policy='.url_safe_base64_encode($policy).'&Signature='.$signature.'&Key-Pair-Id='.$key_pair_id;
echo $final_url;
In the above, if I use the Cloudfront distribution that requires a signed URL (by passing in $restrict=1) then I get an error, "Video not found". In console I see that the GET request for the video was canceled (Status Text: cancelled... weirdly I see this twice). If I use the distribution that does NOT require a signed URL everything works fine and the video loads correctly.
What am I missing? The distributions are identical except for the requiring of the signed URL and they both use the same Amazon S3 bucket source for the video.
The player is flowplayer(HTML5) but since it works fine without the signed url I would assume the player isn't the problem.
Please see my answer here: Amazon S3 signed url not working with flowplayer
Hopefully that will help.
In my case, I needed to remove the "mp4:" prefix before signing the url, and then add it back on again.
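As a rough sketch of that idea (strip the prefix, sign the bare value, put the prefix back), assuming the signing logic is available as a callable: the name sign_with_prefix() and the stand-in signer are hypothetical, and where exactly the "mp4:" prefix appears depends on your player setup.

```php
<?php
// Hypothetical wrapper: $signer is any callable that takes the bare URL or
// path and returns the signed version.
function sign_with_prefix($url, callable $signer, $prefix = 'mp4:')
{
    $hasPrefix = strncmp($url, $prefix, strlen($prefix)) === 0;
    // Sign only the part after the prefix.
    $bare = $hasPrefix ? substr($url, strlen($prefix)) : $url;
    $signed = $signer($bare);
    // Restore the prefix so the player still sees the format it expects.
    return $hasPrefix ? $prefix . $signed : $signed;
}

// Stand-in signer for demonstration only; it does not produce a real signature.
$fakeSigner = function ($bare) {
    return $bare . '?Signature=demo';
};

echo sign_with_prefix('mp4:videos/myvideo.mp4', $fakeSigner), "\n";
```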

Extract top domain from string php

I need to extract the domain name out of a string which could be anything. Such as:
$sitelink="http://www.somewebsite.com/product/3749875/info/overview.html";
or
$sitelink="http://subdomain.somewebsite.com/blah/blah/whatever.php";
In any case, I'm looking to extract the 'somewebsite.com' portion (which could be anything), and discard the rest.
With parse_url($url)
<?php
$url = 'http://username:password@hostname/path?arg=value#anchor';
print_r(parse_url($url));
?>
The above example will output:
Array
(
[scheme] => http
[host] => hostname
[user] => username
[pass] => password
[path] => /path
[query] => arg=value
[fragment] => anchor
)
Using those values:
echo parse_url($url, PHP_URL_HOST); //hostname
or
$url_info = parse_url($url);
echo $url_info['host'];//hostname
here it is
<?php
$sitelink="http://www.somewebsite.com/product/3749875/info/overview.html";
$domain_pieces = explode(".", parse_url($sitelink, PHP_URL_HOST));
$l = sizeof($domain_pieces);
$secondleveldomain = $domain_pieces[$l-2] . "." . $domain_pieces[$l-1];
echo $secondleveldomain;
Note that this is probably not the behavior you are looking for, because for hosts like
stackoverflow.co.uk
it will echo "co.uk"
see:
http://publicsuffix.org/learn/
http://www.dkim-reputation.org/regdom-libs/
http://www.dkim-reputation.org/regdom-lib-downloads/ <-- downloads here, php included
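To show why a suffix list fixes the "co.uk" case, here is a minimal sketch that checks the host against a tiny hand-written list. The function name registrable_domain() and the three-entry list are assumptions for illustration only; real code should load the full Public Suffix List, which also has wildcard and exception rules this sketch ignores.

```php
<?php
// Hypothetical helper: returns the label just left of the longest matching
// suffix, joined with that suffix, or null when no suffix matches.
function registrable_domain($host, array $suffixes)
{
    $labels = explode('.', strtolower($host));
    $n = count($labels);
    // Start with the longest candidate suffix so "co.uk" wins over "uk".
    for ($i = 1; $i < $n; $i++) {
        $suffix = implode('.', array_slice($labels, $i));
        if (in_array($suffix, $suffixes, true)) {
            return $labels[$i - 1] . '.' . $suffix;
        }
    }
    return null;
}

// Illustrative subset only, not the real Public Suffix List.
$suffixes = array('com', 'uk', 'co.uk');

echo registrable_domain('www.stackoverflow.co.uk', $suffixes), "\n";
echo registrable_domain('subdomain.somewebsite.com', $suffixes), "\n";
```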
Two more complex URLs:
$url="https://www.example.co.uk/page/section/younameit";
or
$url="https://example.co.uk/page/section/younameit";
To get "www.example.co.uk":
$host=parse_url($url, PHP_URL_HOST);
To get "example.co.uk" only, strip a leading "www." if there is one:
$domain = preg_replace('/^www\./', '', $host);
Whether or not your URL includes "www.", you get the same end result, i.e. "example.co.uk".
Voilà!
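A note on stripping the prefix: ltrim($host, 'www.') looks like it should work, but ltrim() treats its second argument as a set of characters, not as a literal string. The short comparison below uses a made-up host that starts with "w" to show the difference against an anchored pattern.

```php
<?php
$host = 'web.example.co.uk';

// ltrim() strips every leading character found in the mask: "w" and ".".
$viaLtrim = ltrim($host, 'www.');

// An anchored pattern removes only a literal leading "www.".
$viaRegex = preg_replace('/^www\./', '', $host);

echo $viaLtrim, "\n"; // eb.example.co.uk
echo $viaRegex, "\n"; // web.example.co.uk
```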
You need a package that uses the Public Suffix List. Only this way can you correctly extract domains with two- and three-level suffixes (co.uk, a.bg, b.bg, etc.) and multilevel subdomains. Regex, parse_url() or string functions alone will never produce a fully correct result.
I recommend TLD Extract. Here is an example:
$extract = new LayerShifter\TLDExtract\Extract();
$result = $extract->parse('http://www.somewebsite.com/product/3749875/info/overview.html');
$result->getSubdomain(); // will return (string) 'www'
$result->getHostname(); // will return (string) 'somewebsite'
$result->getSuffix(); // will return (string) 'com'
$result->getRegistrableDomain(); // will return (string) 'somewebsite.com'
For a string that could be anything, new approach:
function extract_plain_domain($text) {
$text=trim($text,"/");
$text=strtolower($text);
$parts=explode("/",$text);
if (substr_count($parts[0],"http")) {
$parts[0]="";
}
// each() was removed in PHP 8, so iterate with foreach instead
foreach ($parts as $val) {
if (!empty($val)) { $text=$val; break; }
}
$parts=explode(".",$text);
if (empty($parts[2])) {
return $parts[0].".".$parts[1];
} else {
$num_parts=count($parts);
return $parts[$num_parts-2].".".$parts[$num_parts-1];
}
} // end function extract_plain_domain
You can use the Utopia Domains library (https://github.com/utopia-php/domains). It returns the domain's TLD and public suffix based on Mozilla's Public Suffix List (https://publicsuffix.org), and it can serve as an alternative to the now-archived TLDExtract package.
You can use the parse_url() function to get the hostname from your URL, and then use the Utopia Domains parser to get the correct suffix and join it with the domain name:
<?php
require_once './vendor/autoload.php';
use Utopia\Domains\Domain;
$url = 'http://demo.example.co.uk/site';
$domain = new Domain(parse_url($url, PHP_URL_HOST)); // demo.example.co.uk
var_dump($domain->get()); // demo.example.co.uk
var_dump($domain->getTLD()); // uk
var_dump($domain->getSuffix()); // co.uk
var_dump($domain->getName()); // example
var_dump($domain->getSub()); // demo
var_dump($domain->isKnown()); // true
var_dump($domain->isICANN()); // true
var_dump($domain->isPrivate()); // false
var_dump($domain->isTest()); // false
var_dump($domain->getName().'.'.$domain->getSuffix()); // example.co.uk

facebook graph api picture - get full url with .jpg

I am able to access my user image from FB with graph api by accessing the user id like so: https://graph.facebook.com/<USER_ID>/picture
However, for my code to work, I need the real path to the image, like http://profile.ak.fbcdn.net/hprofile-ak-snc6/******_**************_********_q.jpg
FB's docs show that by adding ?callback=foo I can get an output, but in practice it doesn't seem to work.
Any suggestions for getting the full path to my image, with the .jpg extension, from the Graph API or with the user id? Thank you.
callback is for JavaScript (JSONP) requests.
For PHP, try appending redirect=false to the URL.
Do a cURL request to:
https://graph.facebook.com/shaverm/picture?redirect=false
If you want to use callback in JS:
$.getJSON('https://graph.facebook.com/zuck/picture?callback=?',function (resp) {
$('body').html(resp.data.url);
});
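On the PHP side, the response to the redirect=false request is JSON, and the JavaScript snippet above reads the image address from data.url. A minimal sketch of the decoding step, run against a made-up response body so it works offline; the function name picture_url_from_response() and the sample URL are assumptions.

```php
<?php
// Hypothetical helper: pulls data.url out of the JSON body, or returns null
// when the body is not valid JSON or lacks that field.
function picture_url_from_response($json)
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded['data']['url'])) {
        return null;
    }
    return $decoded['data']['url'];
}

// Invented sample body for illustration; fetch the real one with cURL.
$sample = '{"data":{"url":"https://example.com/pictures/12345_q.jpg"}}';
echo picture_url_from_response($sample), "\n";
```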
Use the following and you should never get a wrong result:
$URL='FB GRAPH API URL';
$headers = get_headers($URL, 1); // make link request and wait for redirection
if(isset($headers['Location'])) {
$URL = $headers['Location']; // this gets the new url
}
$url_arr = explode ('/',$URL);
$ct = count($url_arr);
$name = $url_arr[$ct-1];
$name_div = explode('.', $name);
$ct_dot = count($name_div);
$img_type = $name_div[$ct_dot -1];
$pos = strrpos($img_type, "&"); // the URL often carries extra parameters after the extension
if ($pos !== false)
{
$pieces = explode("&", $img_type);
$img_type = $pieces[0];
}
$imagename = 'imgnameyouwant'.'.'.$img_type;
$content = file_get_contents($URL);
file_put_contents("fbscrapedimages/$imagename", $content);
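Two parts of the snippet above can be made a little sturdier, sketched here with made-up inputs. When a request is redirected more than once, get_headers($url, 1) returns the repeated Location header as an array, and parse_url() with pathinfo() picks out the extension without being confused by a query string. The helper names last_location() and image_extension_from_url() are hypothetical.

```php
<?php
// Hypothetical helper: returns the final Location value from the array that
// get_headers($url, 1) produces, or null when there was no redirect.
function last_location(array $headers)
{
    if (!isset($headers['Location'])) {
        return null;
    }
    $location = $headers['Location'];
    // Several redirects yield an array of values; take the last one.
    return is_array($location) ? end($location) : $location;
}

// Hypothetical helper: extension of the URL's path, ignoring any query string.
function image_extension_from_url($url)
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}

echo image_extension_from_url('https://example.com/a/b/12345_q.jpg?oh=abc&oe=123'), "\n";
```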

OpenID Discovery Methods - Yadis VS HTML

Recently, I've begun writing my own PHP OpenID consumer class in order to better understand openID. As a guide, I've been referencing the [LightOpenID Class][1]. For the most part, I understand the code and how OpenID works. My confusion comes when looking at the author's discover function:
function discover($url)
{
if(!$url) throw new ErrorException('No identity supplied.');
# We save the original url in case of Yadis discovery failure.
# It can happen when we'll be lead to an XRDS document
# which does not have any OpenID2 services.
$originalUrl = $url;
# A flag to disable yadis discovery in case of failure in headers.
$yadis = true;
# We'll jump a maximum of 5 times, to avoid endless redirections.
for($i = 0; $i < 5; $i ++) {
if($yadis) {
$headers = explode("\n",$this->request($url, 'HEAD'));
$next = false;
foreach($headers as $header) {
if(preg_match('#X-XRDS-Location\s*:\s*(.*)#', $header, $m)) {
$url = $this->build_url(parse_url($url), parse_url(trim($m[1])));
$next = true;
}
if(preg_match('#Content-Type\s*:\s*application/xrds\+xml#i', $header)) {
# Found an XRDS document, now let's find the server, and optionally delegate.
$content = $this->request($url, 'GET');
# OpenID 2
# We ignore it for MyOpenID, as it breaks sreg if using OpenID 2.0
$ns = preg_quote('http://specs.openid.net/auth/2.0/');
if (preg_match('#<Service.*?>(.*)<Type>\s*'.$ns.'(.*?)\s*</Type>(.*)</Service>#s', $content, $m)
&& !preg_match('/myopenid\.com/i', $this->identity)) {
$content = $m[1] . $m[3];
if($m[2] == 'server') $this->identifier_select = true;
$content = preg_match('#<URI>(.*)</URI>#', $content, $server);
$content = preg_match('#<LocalID>(.*)</LocalID>#', $content, $delegate);
if(empty($server)) {
return false;
}
# Does the server advertise support for either AX or SREG?
$this->ax = preg_match('#<Type>http://openid.net/srv/ax/1.0</Type>#', $content);
$this->sreg = preg_match('#<Type>http://openid.net/sreg/1.0</Type>#', $content);
$server = $server[1];
if(isset($delegate[1])) $this->identity = $delegate[1];
$this->version = 2;
$this->server = $server;
return $server;
}
# OpenID 1.1
$ns = preg_quote('http://openid.net/signon/1.1');
if(preg_match('#<Service.*?>(.*)<Type>\s*'.$ns.'\s*</Type>(.*)</Service>#s', $content, $m)) {
$content = $m[1] . $m[2];
$content = preg_match('#<URI>(.*)</URI>#', $content, $server);
$content = preg_match('#<.*?Delegate>(.*)</.*?Delegate>#', $content, $delegate);
if(empty($server)) {
return false;
}
# AX can be used only with OpenID 2.0, so checking only SREG
$this->sreg = preg_match('#<Type>http://openid.net/sreg/1.0</Type>#', $content);
$server = $server[1];
if(isset($delegate[1])) $this->identity = $delegate[1];
$this->version = 1;
$this->server = $server;
return $server;
}
$next = true;
$yadis = false;
$url = $originalUrl;
$content = null;
break;
}
}
if($next) continue;
# There are no relevant information in headers, so we search the body.
$content = $this->request($url, 'GET');
if($location = $this->htmlTag($content, 'meta', 'http-equiv', 'X-XRDS-Location', 'value')) {
$url = $this->build_url(parse_url($url), parse_url($location));
continue;
}
}
if(!$content) $content = $this->request($url, 'GET');
# At this point, the YADIS Discovery has failed, so we'll switch
# to openid2 HTML discovery, then fallback to openid 1.1 discovery.
$server = $this->htmlTag($content, 'link', 'rel', 'openid2.provider', 'href');
$delegate = $this->htmlTag($content, 'link', 'rel', 'openid2.local_id', 'href');
$this->version = 2;
# Another hack for myopenid.com...
if(preg_match('/myopenid\.com/i', $server)) {
$server = null;
}
if(!$server) {
# The same with openid 1.1
$server = $this->htmlTag($content, 'link', 'rel', 'openid.server', 'href');
$delegate = $this->htmlTag($content, 'link', 'rel', 'openid.delegate', 'href');
$this->version = 1;
}
if($server) {
# We found an OpenID2 OP Endpoint
if($delegate) {
# We have also found an OP-Local ID.
$this->identity = $delegate;
}
$this->server = $server;
return $server;
}
throw new ErrorException('No servers found!');
}
throw new ErrorException('Endless redirection!');
}
[1]: http://gitorious.org/lightopenid
Okay, Here's the logic as I understand it (basically):
Check to see if the $url sends you a valid XRDS file that you then parse to figure out the OpenID provider's endpoint.
From my understanding, this is called the Yadis authentication method.
If no XRDS file is found, Check the body of the response for an HTML <link> tag that contains the url of the endpoint.
What. The. Heck.
I mean seriously? Essentially screen scrape the response and hope you find a link with the appropriate attribute value?
Now, don't get me wrong, this class works like a charm and it's awesome. I'm just failing to grok the two separate methods used to discover the endpoint: XRDS (yadis) and HTML.
My Questions
1. Are those the only two methods used in the discovery process?
2. Is one only used in version 1.1 of OpenID and the other in version 2?
3. Is it critical to support both methods?
4. The site I've encountered the HTML method on is Yahoo. Are they nuts?
Thanks again for your time folks. I apologize if I sound a little flabbergasted, but I was genuinely stunned at the methodology once I began to understand what measures were being taken to find the endPoint.
Specification is your friend.
But answering your question:
1. Yes. Those are the only two methods defined by the OpenID specifications (at least for URLs; there is a third method for XRIs).
2. No, both can be used with both versions of the protocol. Read the function carefully, and you'll see that it supports both methods for both versions.
3. If you want your library to work with every provider and user, you'd better. Some users paste the HTML tags into their own sites, so their site's URL can be used as an OpenID.
4. Some providers even use both methods at once, to maintain compatibility with consumers that don't implement Yadis discovery (which isn't part of OpenID 1.1, but can be used with it). So that does make sense.
And yes, HTML discovery is about searching for a <link> in the response body. That's why it's called HTML discovery.
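For a sense of what that search can look like without regular expressions, here is a minimal sketch using PHP's DOM extension. It is not how LightOpenID implements htmlTag(); the function name find_link_href() and the sample page are assumptions. It splits the rel attribute on whitespace so that a single tag can carry several rel values.

```php
<?php
// Hypothetical helper: returns the href of the first <link> whose rel
// attribute contains $rel, or null when there is none.
function find_link_href($html, $rel)
{
    if ($html === '') {
        return null;
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $link) {
        // rel may hold several space-separated values.
        $rels = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        if (in_array(strtolower($rel), $rels, true)) {
            return $link->getAttribute('href');
        }
    }
    return null;
}

// Invented sample page for illustration.
$page = '<html><head>'
      . '<link rel="openid2.provider openid.server" href="https://op.example.com/endpoint">'
      . '<link rel="openid2.local_id" href="https://user.example.com/">'
      . '</head><body></body></html>';

echo find_link_href($page, 'openid2.provider'), "\n";
```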
