Replace one URL with another - php

In PHP, replace one URL with another within a string e.g.
New post on the site http://stackoverflow.com/xyz1</p>
becomes:
New post on the site http://yahoo.com/abc1</p>
Must work for repeating strings as above. Appreciate this is simple but struggling!

function replace_url($text, $newurl) {
$text = preg_replace('#(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)#', $newurl, $text);
return $text;
}
Should work.
Regex stolen from here. This will replace all URLs in the string with the new one.

Use str_replace():
$text = str_replace('http://stackoverflow.com/xyz1', 'http://yahoo.com/abc1', $text);
That will replace the first URL with the second URL in $text.

Try this:
preg_replace('#(https?://)(www\.)?stackoverflow.com\b#', '\1\2yahoo.com', $text);
If you want to change the path after the url, add another group and use preg_replace_callabck. More information in the PHP documentation.

Related

Find url from string with php

following code is used to find url from a string with php. Here is the code:
$string = "Hello http://www.bytes.com world www.yahoo.com";
preg_match('/(http:\/\/[^\s]+)/', $string, $text);
$hypertext = "" . $text[0] . "";
$newString = preg_replace('/(http:\/\/[^\s]+)/', $hypertext, $string);
echo $newString;
Well, it shows a link but if i provide few link it doesn't work and also if i write without http:// then it doesn't show link. I want whatever link is provided it should be active, Like stackoverflow.com.
Any help please..
A working method for linking with http/https/ftp/ftps/scp/scps:
$newStr = preg_replace('!(http|ftp|scp)(s)?:\/\/[a-zA-Z0-9.?&_/]+!', "\\0",$str);
I strongly advise NOT linking when it only has a dot, because it will consider PHP 5.2, ASP.NET, etc. links, which is hardly acceptable.
Update: if you want www. strings as well, take a look at this.
If you want to detect something like stackoverflow.com, then you're going to have to check for all possible TLDs to rule out something like Web 2.0, which is quite a long list. Still, this is also going to match something as ASP.NET etc.
The regex would looks something like this:
$hypertext = preg_replace(
'{\b(?:http://)?(www\.)?([^\s]+)(\.com|\.org|\.net)\b}mi',
'$1$2$3',
$text
);
This only matches domains ending in .com, .org and .net... as previously stated, you would have to extend this list to match all TLDs
#axiomer your example wasn't work if link will be in format:
https://stackoverflow.com?val1=bla&val2blablabla%20bla%20bla.bl
correct solution:
preg_replace('!(http|ftp|scp)(s)?:\/\/[a-zA-Z0-9.?%=&_/]+!', "\\0", $content);
produces:
https://stackoverflow.com?val1=bla&val2blablabla%20bla%20bla.bl

Validate URL and display as a link in PHP

When I display the text from database, I want to detect whether that text is URL and if that's with URL format, i want to hyperlink those text automatically.
For example, if my text is like this
"Hey, check this out, i found a great website and i would like to share with you all. Here is the website www.google.com"
So in the above text, I would like to hyperlink www.google.com to www.google.com
Which method should i use to detect url format and adding hyperlink ?
Please kindly suggest. Thank you.
function makeClickableLinks($text) {
$text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9#:%_\+.~#?&//=]+)', '\\1', $text);
$text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9#:%_\+.~#?&//=]+)', '\\1\\2', $text);
$text = eregi_replace('([_\.0-9a-z-]+#([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', '\\1', $text);
return $text;
}
This is the right one ;-) works for all HTTP links (with or without http://) and for e-mail links. Usage echo makeClickableLinks($string);
It does not support https as I see, the code is from http://www.totallyphp.co.uk/code/convert_links_into_clickable_hyperlinks.htm here and seems to work. At least this kicks you in the right direction.
you could use this code snippet:
$text = preg_replace('#(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)#', '$1', $text);
found on snipplr.com
This can be done with regular expressions. Something along the lines of:
echo preg_replace("%((http|https|ftp)://(\S*?\.\S*?))(\s|\;|\)|\]|\[|\{|\}|,|\"|'|:|\<|$|\.\s)%ie", "$3$4",$text);
*Edit: Updated regex

PHP url create from string

I would like to check a string and convert all the substrings that could be potential links inside the original string like http://www.google.com, or www.google.com, replaced with
<a href='http://www.google.com'>http://www.google.com</a> so that i can create real links from them.
How can i do this?
you can create the HTML links by calling the following function in PHP:
$stringToCheck = 'http://www.google.com, or www.google.com';
$stringWithHTMLLinks = '';
$stringWithHTMLLinks = preg_replace('/\b((https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&##\/%?=~_|!:,.;]*[A-Z0-9+&##\/%=~_|]/si', '\0', $stringToCheck);
Use this regex provided on Daring Fireball to match an URL.

regex to get current page or directory name?

I am trying to get the page or last directory name from a url
for example if the url is: http://www.example.com/dir/ i want it to return dir or if the passed url is http://www.example.com/page.php I want it to return page Notice I do not want the trailing slash or file extension.
I tried this:
$regex = "/.*\.(com|gov|org|net|mil|edu)/([a-z_\-]+).*/i";
$name = strtolower(preg_replace($regex,"$2",$url));
I ran this regex in PHP and it returned nothing. (however I tested the same regex in ActionScript and it worked!)
So what am I doing wrong here, how do I get what I want?
Thanks!!!
Don't use / as the regex delimiter if it also contains slashes. Try this:
$regex = "#^.*\.(com|gov|org|net|mil|edu)/([a-z_\-]+).*$#i";
You may try tho escape the "/" in the middle. That simply closes your regex. So this may work:
$regex = "/.*\.(com|gov|org|net|mil|edu)\/([a-z_\-]+).*/i";
You may also make the regex somewhat more general, but that's another problem.
You can use this
array_pop(explode('/', $url));
Then apply a simple regex to remove any file extension
Assuming you want to match the entire address after the domain portion:
$regex = "%://[^/]+/([^?#]+)%i";
The above assumes a URL of the format extension://domainpart/everythingelse.
Then again, it seems that the problem here isn't that your RegEx isn't powerful enough, just mistyped (closing delimiter in the middle of the string). I'll leave this up for posterity, but I strongly recommend you check out PHP's parse_url() method.
This should adequately deliver:
substr($s = basename($_SERVER['REQUEST_URI']), 0, strrpos($s,'.') ?: strlen($s))
But this is better:
preg_replace('/[#\.\?].*/','',basename($path));
Although, your example is short, so I cannot tell if you want to preserve the entire path or just the last element of it. The preceding example will only preserve the last piece, but this should save the whole path while being generic enough to work with just about anything that can be thrown at you:
preg_replace('~(?:/$|[#\.\?].*)~','',substr(parse_url($path, PHP_URL_PATH),1));
As much as I personally love using regular expressions, more 'crude' (for want of a better word) string functions might be a good alternative for you. The snippet below uses sscanf to parse the path part of the URL for the first bunch of letters.
$url = "http://www.example.com/page.php";
$path = parse_url($url, PHP_URL_PATH);
sscanf($path, '/%[a-z]', $part);
// $part = "page";
This expression:
(?<=^[^:]+://[^.]+(?:\.[^.]+)*/)[^/]*(?=\.[^.]+$|/$)
Gives the following results:
http://www.example.com/dir/ dir
http://www.example.com/foo/dir/ dir
http://www.example.com/page.php page
http://www.example.com/foo/page.php page
Apologies in advance if this is not valid PHP regex - I tested it using RegexBuddy.
Save yourself the regular expression and make PHP's other functions feel more loved.
$url = "http://www.example.com/page.php";
$filename = pathinfo(parse_url($url, PHP_URL_PATH), PATHINFO_FILENAME);
Warning: for PHP 5.2 and up.

Extract URLs from text in PHP

I have this text:
$string = "this is my friend's website http://example.com I think it is coll";
How can I extract the link into another variable?
I know it should be by using regular expression especially preg_match() but I don't know how?
Probably the safest way is using code snippets from WordPress. Download the latest one (currently 3.1.1) and see wp-includes/formatting.php. There's a function named make_clickable which has plain text for param and returns formatted string. You can grab codes for extracting URLs. It's pretty complex though.
This one line regex might be helpful.
preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $string, $match);
But this regex still can't remove some malformed URLs (ex. http://google:ha.ckers.org ).
See also:
How to mimic StackOverflow Auto-Link Behavior
I tried to do as Nobu said, using Wordpress, but to much dependencies to other WordPress functions I instead opted to use Nobu's regular expression for preg_match_all() and turned it into a function, using preg_replace_callback(); a function which now replaces all links in a text with clickable links. It uses anonymous functions so you'll need PHP 5.3 or you may rewrite the code to use an ordinary function instead.
<?php
/**
* Make clickable links from URLs in text.
*/
function make_clickable($text) {
$regex = '#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#';
return preg_replace_callback($regex, function ($matches) {
return "<a href=\'{$matches[0]}\'>{$matches[0]}</a>";
}, $text);
}
URLs have a quite complex definition — you must decide what you want to capture first. A simple example capturing anything starting with http:// and https:// could be:
preg_match_all('!https?://\S+!', $string, $matches);
$all_urls = $matches[0];
Note that this is very basic and could capture invalid URLs. I would recommend catching up on POSIX and PHP regular expressions for more complex things.
The code that worked for me (especially if you have several links in your $string):
$string = "this is my friend's website https://www.example.com I think it is cool, but this one is cooler https://www.stackoverflow.com :)";
$regex = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&##\/%?=~_|$!:,.;]*[A-Z0-9+&##\/%=~_|$]/i';
preg_match_all($regex, $string, $matches);
$urls = $matches[0];
// go over all links
foreach($urls as $url)
{
echo $url.'<br />';
}
Hope that helps others as well.
If the text you extract the URLs from is user-submitted and you're going to display the result as links anywhere, you have to be very, VERY careful to avoid XSS vulnerabilities, most prominently "javascript:" protocol URLs, but also malformed URLs that might trick your regexp and/or the displaying browser into executing them as Javascript URLs. At the very least, you should accept only URLs that start with "http", "https" or "ftp".
There's also a blog entry by Jeff where he describes some other problems with extracting URLs.
preg_match_all('/[a-z]+:\/\/\S+/', $string, $matches);
This is an easy way that'd work for a lot of cases, not all. All the matches are put in $matches. Note that this do not cover links in anchor elements (<a href=""...), but that wasn't in your example either.
You could do like this..
<?php
$string = "this is my friend's website http://example.com I think it is coll";
echo explode(' ',strstr($string,'http://'))[0]; //"prints" http://example.com
preg_match_all ("/a[\s]+[^>]*?href[\s]?=[\s\"\']+".
"(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/",
$var, &$matches);
$matches = $matches[1];
$list = array();
foreach($matches as $var)
{
print($var."<br>");
}
You could try this to find the link and revise the link (add the href link).
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
// The Text you want to filter for urls
$text = "The text you want to filter goes here. http://example.com";
if(preg_match($reg_exUrl, $text, $url)) {
echo preg_replace($reg_exUrl, "{$url[0]} ", $text);
} else {
echo "No url in the text";
}
refer here: http://php.net/manual/en/function.preg-match.php
There are a lot of edge cases with urls. Like url could contain brackets or not contain protocol etc. Thats why regex is not enough.
I created a PHP library that could deal with lots of edge cases: Url highlight.
Example:
<?php
use VStelmakh\UrlHighlight\UrlHighlight;
$urlHighlight = new UrlHighlight();
$urlHighlight->getUrls("this is my friend's website http://example.com I think it is coll");
// return: ['http://example.com']
For more details see readme. For covered url cases see test.
Here is a function I use, can't remember where it came from but seems to do a pretty good job of finding links in the text. and making them links.
You can change the function to suit your needs. I just wanted to share this as I was looking around and remembered I had this in one of my helper libraries.
function make_links($str){
$pattern = '(?xi)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))';
return preg_replace_callback("#$pattern#i", function($matches) {
$input = $matches[0];
$url = preg_match('!^https?://!i', $input) ? $input : "http://$input";
return '' . "$input";
}, $str);
}
Use:
$subject = 'this is a link http://google:ha.ckers.org maybe don't want to visit it?';
echo make_links($subject);
Output
this is a link http://google:ha.ckers.org maybe don't want to visit it?
<?php
preg_match_all('/(href|src)[\s]?=[\s\"\']?+(.*?)[\s\"\']+.*?/', $webpage_content, $link_extracted);
preview
This Regex works great for me and i have checked with all types of URL,
<?php
$string = "Thisregexfindurlhttp://www.rubular.com/r/bFHobduQ3n mixedwithstring";
preg_match_all('/(https?|ssh|ftp):\/\/[^\s"]+/', $string, $url);
$all_url = $url[0]; // Returns Array Of all Found URL's
$one_url = $url[0][0]; // Gives the First URL in Array of URL's
?>
Checked with lots of URL's can find here http://www.rubular.com/r/bFHobduQ3n
public function find_links($post_content){
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
// Check if there is a url in the text
if(preg_match_all($reg_exUrl, $post_content, $urls)) {
// make the urls hyper links,
foreach($urls[0] as $url){
$post_content = str_replace($url, ' LINK ', $post_content);
}
//var_dump($post_content);die(); //uncomment to see result
//return text with hyper links
return $post_content;
} else {
// if no urls in the text just return the text
return $post_content;
}
}

Categories