Remove protocol and subdomain from URL - php

I have a string like this:
http://www.downlinegoldmine.com/viralmarketing
I need to remove http://www. from the string if it exists, or just http:// if www is not included.
In short, I just need the domain name without any protocol.

parse_url is the perfect tool for the job. You would first call it to split the URL into its parts, then check whether the host part starts with www. and strip it, then assemble the URL back.
Update: code
echo normalize_url('http://www.downlinegoldmine.com/viralmarketing');
function normalize_url($url) {
$parts = parse_url($url);
unset($parts['scheme']);
if (substr($parts['host'], 0, 4) == 'www.') {
$parts['host'] = substr($parts['host'], 4);
}
if (function_exists('http_build_url')) {
// This PECL extension makes life a lot easier
return http_build_url($parts);
}
// Otherwise it's the hard way
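$result = '';
// parse_url() names these components 'user' and 'pass'
if (!empty($parts['user'])) {
$result .= $parts['user'];
if (!empty($parts['pass'])) {
$result .= ':'.$parts['pass'];
}
// Credentials are separated from the host by '@'
$result .= '@';
}
// 'path' is absent for URLs such as http://example.com
$result .= $parts['host'].(isset($parts['path']) ? $parts['path'] : '');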
if (!empty($parts['query'])) {
$result .= '?'.$parts['query'];
}
if (!empty($parts['fragment'])) {
$result .= '#'.$parts['fragment'];
}
return $result;
}
See it in action.

Just use parse_url (see: http://php.net/manual/de/function.parse-url.php ). It will also incorporate different protocols and paths etc.
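As a minimal sketch of that parse_url approach: the helper name strip_scheme_and_www is made up for illustration, and it assumes the input includes a scheme (without one, parse_url() reports no host and the input is handed back unchanged). It keeps only host and path, so a query string or fragment would be dropped.

```php
<?php
// Hypothetical helper: returns the host (minus a leading "www.") plus the path.
// Assumes $url includes a scheme such as http:// or https://.
function strip_scheme_and_www(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return $url; // No host found: return the input unchanged
    }

    $host = $parts['host'];
    if (stripos($host, 'www.') === 0) {
        $host = substr($host, 4); // Drop the leading "www."
    }

    // 'path' is missing for URLs like http://example.com
    return $host . ($parts['path'] ?? '');
}

echo strip_scheme_and_www('http://www.downlinegoldmine.com/viralmarketing'), "\n";
echo strip_scheme_and_www('https://downlinegoldmine.com/viralmarketing'), "\n";
```

Both calls print downlinegoldmine.com/viralmarketing, whatever the scheme.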

$nvar = preg_replace("#http://(www\.)?#i", "", "http://www.downlinegoldmine.com/viralmarketing");
Test:
php> echo preg_replace("#http://(www\.)?#i", "", "http://www.downlinegoldmine.com/viralmarketing");
downlinegoldmine.com/viralmarketing
php> echo preg_replace("#http://(www\.)?#i", "", "http://downlinegoldmine.com/viralmarketing");
downlinegoldmine.com/viralmarketing

There's probably a better way, but:
$url = preg_replace("#^(http://)?(www\\.)?#i", "", $url);

$url = strncmp('http://', $url, 7) ? $url : substr($url, 7);
$url = strncmp('www.', $url, 4) ? $url : substr($url, 4);

You can use the following to remove https://, http://, and www. from a URL. The pattern is anchored to the start of the string and the dot is escaped, so only a leading prefix is removed.
$url = 'http://www.downlinegoldmine.com/viralmarketing';
echo preg_replace('/^(https?:\/\/)?(www\.)?/', '', $url);
The above returns downlinegoldmine.com/viralmarketing
You can use the following to remove the URL's path as well as https://, http://, and www.
$url = 'http://www.downlinegoldmine.com/viralmarketing';
echo implode('/', array_slice(explode('/',preg_replace('/^(https?:\/\/)?(www\.)?/', '', $url)), 0, 1));
The above returns downlinegoldmine.com

Related

Need to change url using php

I am going to make a URL checking system.
I have this URL
https://lasvegas.craigslist.org/mob/6169799901.html
Now I want to make this URL like this
https://lasvegas.craigslist.org/search/mob?query=6169799901
how can I do it using PHP?
Since I ended up (maybe?) solving it anyways, here's one method using URL/path parsing:
$url = 'https://lasvegas.craigslist.org/mob/6169799901.html';
$parsed = parse_url($url);
$basepath = pathinfo($parsed['path']);
echo $parsed['scheme'].
"://".
$parsed['host'].
"/search".
$basepath['dirname'].
"?query=".
$basepath['filename'];
Formatted for readability.
https://3v4l.org/E6Y54
Try this
$url = "https://lasvegas.craigslist.org/mob/6169799901.html";
$id = substr($url, strrpos($url, '/') + 1);
$id = str_replace(".html","",$id);
$result = "https://lasvegas.craigslist.org/search/mob?query=".$id;
echo $result;

Internal Server Error when trying to check url

I'm trying to check the string after the last trailing slash in my URL.
My code is as follows:
$url = "http://$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]";
$data = substr($url, strrpos($url, '/') + 1);
if($data == "dashboard") {
require_once VIEW_ROOT . '/cp/dashboard_view.php';
} else {
echo $data;
}
Once I go to http://MYURL/dashboard/in it should show in as the $data. Instead it gives me a 500 error.
You can simply use the explode() function to split the string. Alternatively, $_SERVER['REQUEST_URI'] on its own gives you everything after the host name.
But to get the data after the last '/', explode() works best.
This will work:
$url = "http://$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]";
$x = explode('/',$url);
$data = $x[sizeof($x)-1];
echo $data;
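You should try:
$url = "http://".$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];
You need to join the http:// string with $_SERVER['HTTP_HOST'] and then $_SERVER['REQUEST_URI'] using the . (dot) operator. Outside a double-quoted string the array keys must be quoted, otherwise PHP treats HTTP_HOST and REQUEST_URI as undefined constants.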
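For the "string after the last slash" part, here is a small sketch that works on the path alone, so a query string or a trailing slash does not get in the way. The function name last_path_segment is hypothetical; in the real page you would pass it $_SERVER['REQUEST_URI'].

```php
<?php
// Hypothetical helper: last non-empty path segment of a request URI.
function last_path_segment(string $requestUri): string
{
    // Keep only the path; this drops any query string
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    // Trim slashes so "/dashboard/in/" and "/dashboard/in" behave the same
    $segments = explode('/', trim($path, '/'));

    // end() returns the last element of the array
    return end($segments);
}

echo last_path_segment('/dashboard/in'), "\n";     // in
echo last_path_segment('/dashboard/?tab=1'), "\n"; // dashboard
```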

PHP replace URL segment with str_replace();

I have "/foo/bar/url/" coming straight after my domain name.
What I want is to find the penultimate slash in my string and replace it with a slash followed by a hash sign, i.e. from / to /# (the problem is not how to get the URL, but how to handle it).
How could this be achieved? What is the best practice for doing something like that?
At the moment I'm pretty sure that I should use str_replace();
UPD. I think preg_replace() would be suitable for my case. But then there is another problem: what should the regexp look like to solve my issue?
P.S. Just in case: I'm using the SilverStripe framework (v3.1.12).
$url = '/foo/bar/url/';
if (false !== $last = strrpos($url, '/')) {
if (false !== $penultimate = strrpos($url, '/', $last - strlen($url) - 1)) {
$url = substr_replace($url, '/#', $penultimate, 1);
}
}
echo $url;
This will output
/foo/bar/#url/
If you want to strip the last /:
echo rtrim($url, '/'); // print /foo/bar/#url
Here is a method that would work. There are probably cleaner ways.
// Let's assume you already have $url_string populated
$url_string = "http://whatever.com/foo/bar/url/";
$url_explode = explode("/", $url_string); // Split on forward slashes
$portion_count = count($url_explode);
$affected_portion = $portion_count - 2; // Second-to-last element: the segment right after the penultimate slash (the last element is the empty string after the trailing slash)
$i = 0;
$output = "";
foreach ($url_explode as $portion){
if ($i > 0){
$output .= "/"; // Put back the slash that explode() removed
}
if ($i == $affected_portion){
$output .= "#"; // Insert the hash right after the penultimate slash
}
$output .= $portion;
$i++;
}
$new_url = $output; // http://whatever.com/foo/bar/#url/
Assuming you now have
$url = $this->Link(); // e.g. /foo/bar/my-urlsegment
You can combine it like
$handledUrl = $this->ParentID
? $this->Parent()->Link() . '#' . $this->URLSegment
: $this->Link();
where $this->Parent()->Link() is e.g. /foo/bar and $this->URLSegment is my-urlsegment
$this->ParentID also checks if we have a parent page or are on the top level of SiteTree
I might be too late to answer this question, but I thought this might help you. You can simply use preg_replace like this:
$url = '/foo/bar/url/';
echo preg_replace('~(\/)(\w+)\/$~',"$1#$2",$url);
Output:
/foo/bar/#url
In my case this solved my problem:
$url = $this->Link();
$url = rtrim($url, '/');
$url = substr_replace($url, '#', strrpos($url, '/') + 1, 0);

Greek URL conversion and trimming unwanted numbers and symbols

This problem is a little complicated since I'm a newbie to PHP encoding.
My site uses UTF-8 encoding.
After a lot of tests, I found a partial solution. I use this kind of code:
function chr_conv($str)
{
// Patterns to look for (list shortened)
$a = array('%CE%B2', '%CE%B3', '%CE%B4', '%CE%B5' /* etc. */);
// Replacement characters, in the same order (list shortened)
$b = array('a', 'b', 'c', 'd' /* etc. */);
return str_replace($a, $b, $str);
}
function replace_old($str)
{
// Fragments to strip (list shortened)
$a1 = array('index.php', '/http://' /* etc. */);
// Their replacements, in the same order (list shortened)
$a2 = array('', '' /* etc. */);
return str_replace($a1, $a2, $str);
}
function sanitize($url)
{
$url= replace_old(replace_old($url));
$url = strtolower($url);
$url = preg_replace('/[0-9]/', '', $url);
$url = preg_replace('/[?]/', '', $url);
$url = substr($url,1);
return $url;
}
function wbz404_process404()
{
$options = wbz404_getOptions();
$urlRequest = $_SERVER['REQUEST_URI'];
$url = chr_conv($urlRequest);
$requestedURL = replace_old(replace_old($url));
$requestedURL .= wbz404_SortQuery($urlParts);
//Get URL data if it's already in our database
$redirect = wbz404_loadRedirectData($requestedURL);
echo sanitize($requestedURL);
echo "</br>";
echo $requestedURL;
echo "</br>";
}
When incoming url is:
/content.php?147-%CE%A8%CE%AC%CF%81%CE%B9-%CE%BC%CE%B5-%CF%80%CF%81%CE%AC%CF%83%CE%B1%28%CE%A7%CE%BF%CF%8D%CE%BC%CF%80%CE%BB%CE%B9%CE%BA%29
I get:
/content.php?147-psari-me-prasa-choumplik
I want only:
/psari-me-prasa-choumplik
without the content.php?147- before URL.
BUT the most important problem is that I get an ENDLESS LOOP instead of the correct URL.
What am I doing wrong?
Keep in mind that an .htaccess solution won't work since I have a lighttpd server, not Apache.
I am assuming it's not always ?147- that you need to skip, but always everything up to the first hyphen. In that case, add the following before the echo:
$requestedURL = substr($requestedURL, strpos($requestedURL, '-') + 1);
This searches for the position of the first hyphen, adds one so you skip the hyphen itself, and uses that to cut $requestedURL from just after the hyphen to the end of the string. Note that it has to be strpos(), which finds the first occurrence; strrpos() would find the last hyphen and leave only choumplik.
If it's always /content.php?147- then replace strpos($requestedURL, '-') + 1 with the number 17.
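To make the difference between the first and the last hyphen concrete, here is a short sketch run against the already-converted URL from the question (the variable name is reused from the question's code; nothing else is assumed):

```php
<?php
$requestedURL = '/content.php?147-psari-me-prasa-choumplik';

// strpos() returns the position of the FIRST hyphen (index 16 here)
echo substr($requestedURL, strpos($requestedURL, '-') + 1), "\n";  // psari-me-prasa-choumplik

// strrpos() returns the position of the LAST hyphen, which cuts far too much
echo substr($requestedURL, strrpos($requestedURL, '-') + 1), "\n"; // choumplik

// Prepend a slash if the leading "/" from the question is wanted
echo '/' . substr($requestedURL, strpos($requestedURL, '-') + 1), "\n"; // /psari-me-prasa-choumplik
```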

PHP Regex to Remove http:// from string

I have full URLs as strings, but I want to remove the http:// at the beginning of the string to display the URL nicely (ex: www.google.com instead of http://www.google.com)
Can someone help?
$str = 'http://www.google.com';
$str = preg_replace('#^https?://#', '', $str);
echo $str; // www.google.com
That will work for both http:// and https://
You don't need a regular expression at all. Use str_replace instead.
str_replace('http://', '', $subject);
str_replace('https://', '', $subject);
Combined into a single operation as follows:
str_replace(array('http://','https://'), '', $urlString);
Better use this:
$url = parse_url($url);
$url = $url['host'];
echo $url;
Simpler and works for http:// https:// ftp:// and almost all prefixes.
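One caveat with the host lookup: parse_url() only reports a host when the string has a scheme (or starts with //). A rough sketch of a fallback, with display_host as a made-up helper name, could look like this:

```php
<?php
// Hypothetical helper: host part for display, tolerating a missing scheme.
function display_host(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        // No scheme: prefix "//" so parse_url() reads the start as the host
        $host = parse_url('//' . ltrim($url, '/'), PHP_URL_HOST);
    }
    // If there is still no host, return the input unchanged
    return is_string($host) ? $host : $url;
}

echo display_host('http://www.google.com/search?q=php'), "\n"; // www.google.com
echo display_host('ftp://example.org/file.txt'), "\n";         // example.org
echo display_host('www.google.com/maps'), "\n";                // www.google.com
```

Note that this returns only the host; the path and query are discarded, which may or may not be what you want to display.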
Why not use parse_url instead?
To remove http://domain ( or https ) and to get the path:
$str = preg_replace('#^https?\:\/\/([\w*\.]*)#', '', $str);
echo $str;
If you insist on using RegEx:
preg_match( "/^(https?:\/\/)?(.+)$/", $input, $matches );
$url = $matches[2]; // Group 2 holds everything after the optional scheme
Yeah, I think that str_replace() and substr() are faster and cleaner than regex. Here is a safe fast function for it. It's easy to see exactly what it does. Note: return substr($url, 7) and substr($url, 8), if you also want to remove the //.
// slash-slash protocol remove https:// or http:// and leave // - if it's not a string starting with https:// or http:// return whatever was passed in
function universal_http_https_protocol($url) {
// Breakout - give back bad passed in value
if (empty($url) || !is_string($url)) {
return $url;
}
// starts with http://
if (strlen($url) >= 7 && "http://" === substr($url, 0, 7)) {
// slash-slash protocol - remove http: leaving //
return substr($url, 5);
}
// starts with https://
elseif (strlen($url) >= 8 && "https://" === substr($url, 0, 8)) {
// slash-slash protocol - remove https: leaving //
return substr($url, 6);
}
// no match, return unchanged string
return $url;
}
<?php
// (PHP 4, PHP 5, PHP 7)
// preg_replace — Perform a regular expression search and replace
$array = [
'https://lemon-kiwi.co',
'http://lemon-kiwi.co',
'lemon-kiwi.co',
'www.lemon-kiwi.co',
];
foreach( $array as $value ){
$url = preg_replace("(^https?://)", "", $value );
echo $url . PHP_EOL; // Print each result on its own line
}
This code outputs:
lemon-kiwi.co
lemon-kiwi.co
lemon-kiwi.co
www.lemon-kiwi.co
See documentation PHP preg_replace
