I'm using some code like this to grab the URL from an inbound link:
$inbound_url = $_SERVER['HTTP_REFERER'];
//then do some stuff writing the url to a database table, but....
//ONLY IF the url doesn't already exist in the table
Let's say the link comes in from the same website, same webpage, but different only in the www. So I get this:
1) http://www.mysite.com/page.html
2) http://mysite.com/page.html
This shows up twice in my table since one has the www and one doesn't.
Is there a way to parse the results of $_SERVER['HTTP_REFERER']; to either:
1) add www. where it's missing, OR
2) strip the leading http://www. or http:// entirely
Thanks in advance as always.
Sure you can. Some simple string manipulation and replacement should be all you need to remove the www from any URL -
$inbound_url = str_replace('http://www.','http://',$inbound_url);
As defined in the documentation -
str_replace() - Replace all occurrences of the search string with the replacement string
Notice that I'm including the http:// in the search so that any other occurrence of the string www in the URL remains untouched.
Use this
$url = 'http://stackoverflow.com';
$parts = explode( '.', str_replace('www.', '', parse_url( $url, PHP_URL_HOST )) ); // array_shift() needs a variable, not a function result
$d = array_shift( $parts );
echo $d; //stackoverflow
or you can also use
http://php.net/manual/en/function.parse-url.php function
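For the original problem (storing each referring page only once), a minimal sketch based on parse_url might look like the following. The function name normalizeReferer is made up for illustration, and the sketch assumes PHP 7 or later and that dropping the scheme and a leading www. is an acceptable definition of "the same page".

```php
<?php
// Hypothetical helper: reduce a referer to host + path (+ query) so that
// the "www." and non-"www." variants of a page compare equal.
function normalizeReferer(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // leave anything we cannot parse untouched
    }

    $host = strtolower($parts['host']);
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4); // drop only a leading "www."
    }

    $path  = $parts['path'] ?? '';
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    return $host . $path . $query;
}

echo normalizeReferer('http://www.mysite.com/page.html'), "\n"; // mysite.com/page.html
echo normalizeReferer('http://mysite.com/page.html'), "\n";     // mysite.com/page.html
```

Both variants now produce the same key, so the "does it already exist in the table" check can compare normalized values.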
I wrote a function using preg_replace. The code is meant to build the preview URL for an Instagram post.
This is the code:
$strings = htmlspecialchars_uni($thread['soc_instagram']);
$searchs = array('~(?:https://instagram\.com/p/)?([a-zA-Z0-9_\-+?:]+)~');
$replaces = array('https://instagram.com/p/$1/media/?size=l');
$soc_instagram = preg_replace($searchs,$replaces,$strings);
The code works perfectly if I post an Instagram link with this URL: https://instagram.com/p/BarUcqwht_u
and it produces
https://instagram.com/p/BaQsAubg6H3/media/?size=l
But the problem is that when I add www to the URL, something like https://www.instagram.com/p/BarUcqwht_u, the code produces a broken string.
The result looks like this:
https://instagram.com/p/https:/media/?size=l//https://instagram.com/p/www/media/?size=l.https://instagram.com/p/instagram/media/?size=l.https://instagram.com/p/com/media/?size=l/https://instagram.com/p/p/media/?size=l/https://instagram.com/p/BarUcqwht_u/media/?size=l/
I tried adding www to my preg_replace pattern, but then the result looks like this:
https://www.instagram.com/p/https:/media/?size=l//https://www.instagram.com/p/instagram/media/?size=l.https://www.instagram.com/p/com/media/?size=l/https://www.instagram.com/p/p/media/?size=l/https://www.instagram.com/p/BaQsAubg6H3/media/?size=l
any help will be nice, thanks
Add the www. after the protocol and make it optional. preg_replace also doesn't require arrays, strings work fine.
$strings = 'https://www.instagram.com/p/BarUcqwht_u';
$searchs = '~(?:https://(?:www\.)?instagram\.com/p/)?([a-zA-Z0-9_\-+?:]+)~';
$replaces = 'https://instagram.com/p/$1/media/?size=l';
$soc_instagram = preg_replace($searchs,$replaces,$strings);
echo $soc_instagram;
Demo: https://3v4l.org/WSors
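What your current implementation does: once www. is present, the optional prefix group no longer matches, so the pattern matches each run of characters from your character class separately (https:, www, instagram, and so on) and replaces every one of them with the URL. See https://regex101.com/r/PXb2h1/2/ for a visual representation.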
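If the input is expected to be either a bare shortcode or a full post URL, anchoring the pattern and using preg_match avoids the repeated replacements altogether. This is only a sketch: the function name instagramMediaUrl is hypothetical, and the allowed shortcode characters (letters, digits, _ and -) are an assumption rather than something Instagram documents here.

```php
<?php
// Hypothetical helper: accept a shortcode or a post URL (with or without
// "www.") and return the media URL, or null when the input does not fit.
function instagramMediaUrl(string $input): ?string
{
    // ^ and $ anchor the whole string, so stray fragments cannot match.
    $pattern = '~^(?:https?://(?:www\.)?instagram\.com/p/)?([A-Za-z0-9_-]+)/?$~';

    if (preg_match($pattern, trim($input), $m) !== 1) {
        return null;
    }

    return 'https://instagram.com/p/' . $m[1] . '/media/?size=l';
}

echo instagramMediaUrl('https://www.instagram.com/p/BarUcqwht_u'), "\n";
echo instagramMediaUrl('BarUcqwht_u'), "\n";
```

Both calls print https://instagram.com/p/BarUcqwht_u/media/?size=l, while an unrelated URL returns null instead of a mangled string.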
Lets say that $content is the content of a textarea
/*Convert the http/https to link */
$content = preg_replace('!((https://|http://)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="$1">$1</a> ', nl2br($_POST['helpcontent'])." ");
/*Convert the www. to link prepending http://*/
$content = preg_replace('!((www\.)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="http://$1">$1</a> ', $content." ");
This was working ok for links, but realised that it was breaking the markup when an image is within the text...
I am trying like this now:
$content = preg_replace('!\s((https?://|http://)+[a-z0-9_./?=&-]+)!i', ' $1 ', nl2br($_POST['content'])." ");
$content = preg_replace('!((www\.)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="http://$1">$1</a> ', $content." ");
As is the images are respected, but the problem is that url's with http:// or https:// format won't be converted now..:
google.com -> Not converted (as expected)
www.google.com -> Well Converted
http://google.com -> Not converted (unexpected)
https://google.com -> Not converted (unexpected)
What am I missing?
-EDIT-
Current almost working solution:
$content = preg_replace('!(\s|^)((https?://)+[a-z0-9_./?=&-]+)!i', ' $2 ', nl2br($_POST['content'])." ");
$content = preg_replace('!(\s|^)((www\.)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="http://$2" target="_blank">$2</a> ', $content." ");
The thing here is that if this is the input:
www.funcook.com http://www.funcook.com https://www.funcook.com
funcook.com http://funcook.com https://funcook.com
All the urls I want (all, except name.domain) are converted as expected, but this is the output
www.funcook.com http://www.funcook.com https://www.funcook.com ;
funcook.com http://funcook.com https://funcook.com
Note that a ; is inserted. Any idea why?
try this:
preg_replace('!(\s|^)((https?://|www\.)+[a-z0-9_./?=&-]+)!i', ' $2 ',$text);
It will pick up links beginning with http:// or with www.
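To turn those matches into anchors in a single pass, the combined pattern can be paired with preg_replace_callback so that http:// is prepended only for www. links. A rough sketch, assuming the same character class as above; the function name linkify is invented for this example:

```php
<?php
// Hypothetical helper: wrap http(s):// and www. URLs in anchors in one pass.
function linkify(string $text): string
{
    return preg_replace_callback(
        '!(\s|^)((?:https?://|www\.)[a-z0-9_./?=&-]+)!i',
        function (array $m): string {
            // www. links have no scheme, so add one for the href only.
            $href = stripos($m[2], 'www.') === 0 ? 'http://' . $m[2] : $m[2];
            return $m[1] . '<a target="_blank" href="' . $href . '">' . $m[2] . '</a>';
        },
        $text
    );
}

echo linkify('www.funcook.com http://funcook.com funcook.com');
```

The first two URLs become links and the bare funcook.com is left alone. Because the match must start at the beginning of the string or after whitespace, URLs inside attributes such as src="..." are skipped.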
You can't do it 100% reliably, because there may be links such as stackoverflow.com which do not have www.
If you're only targeting those links:
!(www\.\S+)!i
Should work well enough for you.
EDIT: As for your newest question about why http links don't get converted but https links do: your first pattern only matches https:// or http://. (with a trailing dot), which never occurs in a real URL. Simplify it by replacing:
(https://|http://\.)
With
(https?://)
Which will make the s optional.
Another way to add hyperlinks is to take the text you want to parse for links and explode it into an array. Then loop through it using foreach (which is very fast - http://www.phpbench.com/) and turn anything that starts with http://, https://, or www., or ends with .com, .org, etc. into a link.
I'm thinking maybe something like this:
$userTextArray = explode(" ",$userText);
foreach( $userTextArray as &$word){
//if statements to test if if it starts with www. or ends with .com or whatever else
//change $word so that it is a link
}
Your changes will be reflected in the array since you put the "&" before $word in your foreach statement.
Now just implode the array back into a string and you're good to go.
This made sense in my head... But I'm not 100% sure that this is what you're looking for
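Filled in, the loop might look roughly like this. It is a sketch under a few assumptions: words are separated by single spaces, only the http://, https:// and www. prefixes are handled (not the "ends with .com" case), and the function name linkifyWords is made up.

```php
<?php
// Hypothetical helper: linkify a text word by word.
function linkifyWords(string $text): string
{
    $words = explode(' ', $text);

    foreach ($words as &$word) { // & lets us modify the array in place
        if (preg_match('~^https?://~i', $word)) {
            $href = $word;
        } elseif (stripos($word, 'www.') === 0) {
            $href = 'http://' . $word; // www. links need a scheme
        } else {
            continue; // not a link, leave the word as it is
        }
        $word = '<a target="_blank" href="' . htmlspecialchars($href) . '">'
              . htmlspecialchars($word) . '</a>';
    }
    unset($word); // drop the reference so later code cannot overwrite the last item

    return implode(' ', $words);
}

echo linkifyWords('see www.funcook.com or http://funcook.com now');
```

The unset() after the loop matters: without it, $word keeps pointing at the last array element.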
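I had a similar problem. Here is a function which helped me. Maybe it will fit your needs too:
function clHost($Address) {
$parseUrl = parse_url(trim($Address));
// Array keys must be quoted strings; a missing key falls back to an empty string
$path = isset($parseUrl['path']) ? $parseUrl['path'] : '';
$full = !empty($parseUrl['host']) ? $parseUrl['host'].$path : $path;
return str_replace("www.", "", trim(trim($full), '/'));
}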
This function will return domain without protocol and "www", so you can add them yourself later.
For example:
$url = "http://www.". clHost($link);
I did it like that, because I couldn't find good regexp.
\s((https?://|www\.)+[a-z0-9_./?=&-]+)
The problem is that your leading \s forces the match to start with whitespace, so if there is no whitespace before the URL the match fails. The regex is fine without the \s, but to avoid replacing the images you need to add something that stops it from matching them.
If the images are pure HTML, use this:
(?<!src=")((https?://|www\.)+[a-z0-9_./?=&-]+)
The negative lookbehind rejects a match that starts immediately after src=", so image URLs are ignored.
If you use another mark up, tell me and I'll try to find another way to avoid the images.
String to pull from : http:\/\/c.ypcdn.com\/2\/c\/rtd?vrid=357c99c36bd7ed631eda2e43fc9e30f8&rid=283d465f-f63b-4b0d-90b0-be6c12ed7617&ptid=943aw4l8qj&ypid=11720135&lid=194823099&tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com
RegEx I have used before: www\..*?\.\w{2,5}
However, the above RegEx will only grab the URL if it has a "www." in it. If I take the "www." out of the RegEx, it just grabs c.ypcdn.com. I want to grab the Cleanation.com at the end of the string.
Needs to be dynamic so it can grab any url that doesn't have a "www." out of that url.
why not use parse_url() and then parse_str() on the returned query index to get it?
edit: example:
$url= "http://c.ypcdn.com/2/c/rtd?vrid=357c99c36bd7ed631eda2e43fc9e30f8&rid=283d465f-f63b-4b0d-90b0-be6c12ed7617&ptid=943aw4l8qj&ypid=11720135&lid=194823099&tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com";
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query,$params);
echo $params['dest'];
If this is always the dest parameter, you can grab it with something like:
"dest=https?%3A%2F%2F([^?&]+)"
If it's always the last parameter, you can grab it with:
"dest=https?%3A%2F%2F(.+)$"
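Since the string in the question has its slashes escaped as \/, the parse_url and parse_str route needs one extra unescaping step first. A small sketch, assuming the target is always in a dest parameter; the function name destinationHost is hypothetical and the sample string is a shortened version of the one in the question:

```php
<?php
// Hypothetical helper: pull the host of the "dest" parameter out of a
// tracking URL whose slashes may be escaped as \/.
function destinationHost(string $raw): ?string
{
    $url = str_replace('\/', '/', $raw); // undo the \/ escaping

    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string at all
    }

    parse_str($query, $params); // also URL-decodes the values
    if (!isset($params['dest'])) {
        return null;
    }

    $host = parse_url($params['dest'], PHP_URL_HOST);
    return is_string($host) ? $host : null;
}

$raw = 'http:\/\/c.ypcdn.com\/2\/c\/rtd?lsrc=SP&dest=http%3A%2F%2FCleanation.com';
echo destinationHost($raw); // Cleanation.com
```

This works whether or not the destination contains www., because nothing in it depends on that prefix.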
here's my code snippet to start with:
$url = $_SERVER["REQUEST_URI"]; // gives /test/test/ from http://example.org/test/test/
echo"$url";
trim ( $url ,'/' );
echo"$url";
I use this in combination with .htaccess rewrite, I’ll get the information from the URL and generate the page for the user with PHP using explode.
I don't want .htaccess to interpret the URL (which would probably be better), because I am more familiar with PHP and I think it’s more flexible.
I already read this (which basically is what I want):
Best way to remove trailing slashes in URLs with PHP
The only problem is, that trim doesn’t trim the leading slashes. Why?
But actually it should work. Replacing '/' with "/", '\47' or '\x2F' doesn’t change anything.
It neither works online nor on localhost.
What am I doing wrong?
The trim function returns the trimmed string. It doesn't modify the original. Your third line should be:
$url = trim($url, '/');
This can be done in one line...
echo trim($_SERVER['REQUEST_URI'], '/');
You need to do:
$url = trim($url, '/');
You also should just do
echo $url;
It is faster.
trim does not modify the original. You'll need to do something such as:
$url = $_SERVER["REQUEST_URI"]; // gives /test/test/ from http://example.org/test/test/
echo"$url";
$url = trim ( $url ,'/' );
echo"$url";
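Since the goal is to explode the URL into parts, here is a short sketch that combines the fix with that step. The function name pathSegments is invented for illustration, and stripping the query string through parse_url is an extra assumption the question does not mention.

```php
<?php
// Hypothetical helper: turn a request URI into an array of path segments.
function pathSegments(string $requestUri): array
{
    // Keep only the path, so "?page=2" does not end up in the last segment.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return [];
    }

    // trim() returns the new string; the result must be assigned.
    $trimmed = trim($path, '/');

    return $trimmed === '' ? [] : explode('/', $trimmed);
}

print_r(pathSegments('/test/test/'));        // two segments: test, test
print_r(pathSegments('/test/test/?page=2')); // same two segments
```

If only one side should be stripped, ltrim and rtrim take the same arguments and likewise return the result instead of changing the variable.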
I’m working on a small hobby project where I want to replace a specific segment of a URL. Let me explain:
I’ve got the URL
http://www.example.com/article/paragraph/low/
I want to keep the URL but replace the last segment /low/ with /high/ so the new URL is:
http://www.example.com/article/paragraph/high/
I’ve tried different explode, split and splice but I just can’t seem to wrap my head around it and make it work. I can change the entire URL but not just the last segment and save it in a new variable.
I’m pretty confident that this is a straightforward case, but I’ve never worked much with arrays or string manipulation in PHP, so I’m pretty lost.
I guess that I have to first split the URL up in segments, using the "\" to separate it (I tried that but have problems by using explode("\", $string)) and then replace the last \low\ with \high\
Hope someone could help or point me in the right direction to what methods to use for doing this.
Sincere
Mestika
how about str_replace?
<?php
$newurl = str_replace('low', 'high', $oldurl);
?>
documentation;
http://php.net/manual/en/function.str-replace.php
edit;
Rik is right; if your domain (or any other part of the url for that matter) includes the string "low", this will mess up your link.
So: if your url may contain multiple 'low' 's, you will have to add an extra indicator in the script. An example of that would be including the /'s in your str_replace.
You mistook \ for /. URL segments are separated by forward slashes:
$url = explode('/', rtrim($url, '/'));
if (end($url) == 'low') {
$url[count($url)-1] = 'high';
}
$url = implode('/', $url) .'/';
Use parse_url to split the URL into its components, modify them as required (here you can use explode to split the path into its segments), and then rebuild the URL with http_build_url (which comes from the PECL http extension, so it may not be available on every install).
<?php
class TestURL extends PHPUnit_Framework_TestCase {
public function testURL() {
$URL = 'http://www.mydomain.com/article/paragraph/low/';
$explode = explode('/', $URL);
$explode[5] = 'high';
$expected = 'http://www.mydomain.com/article/paragraph/high/';
$actual = implode('/', $explode);
$this->assertEquals($expected, $actual);
}
}
--
phpunit simple-test.php
PHPUnit 3.4.13 by Sebastian Bergmann.
.
Time: 0 seconds, Memory: 4.75Mb
OK (1 test, 1 assertion)
This will probably be enough:
$url = "http://www.mydomain.com/article/paragraph/low/";
$newUrl = str_replace('/low/', '/high/', $url);
or with regular expressions (it allows more flexibility)
$url = "http://www.mydomain.com/article/paragraph/low/";
$newUrl = preg_replace('/low(\/?)$/', 'high$1', $url);
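Note that the string approach replaces every /low/ segment, and only when low sits between two slashes. The regex approach replaces low only at the very end of the URL, with or without a trailing /; because the pattern has no leading slash, it would also match a final segment that merely ends in low (such as /below/).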
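For a version that touches only the last path segment and does not need http_build_url, something along these lines could work. It is a sketch, not a general URL builder: the function name replaceLastSegment is hypothetical, and user:password parts of a URL are deliberately ignored when rebuilding.

```php
<?php
// Hypothetical helper: swap the last path segment when it equals $from.
function replaceLastSegment(string $url, string $from, string $to): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['path'])) {
        return $url;
    }

    $path     = $parts['path'];
    $trailing = substr($path, -1) === '/' ? '/' : ''; // remember a trailing slash
    $segments = explode('/', rtrim($path, '/'));

    if (end($segments) !== $from) {
        return $url; // last segment is something else: leave the URL alone
    }
    $segments[count($segments) - 1] = $to;

    // Rebuild by hand (assumption: no user/password in the URL).
    return (isset($parts['scheme']) ? $parts['scheme'] . '://' : '')
        . ($parts['host'] ?? '')
        . (isset($parts['port']) ? ':' . $parts['port'] : '')
        . implode('/', $segments) . $trailing
        . (isset($parts['query']) ? '?' . $parts['query'] : '')
        . (isset($parts['fragment']) ? '#' . $parts['fragment'] : '');
}

echo replaceLastSegment('http://www.example.com/article/paragraph/low/', 'low', 'high');
// http://www.example.com/article/paragraph/high/
```

Because the comparison is made on the whole segment, a domain like lowcost.com or a segment like below is never changed.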