Regex to determine text 'http://...' but not in iframes, embeds...etc

This regex is used to replace text links with a clickable anchor tag.
#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i
My problem is that I don't want it to change links that are inside things like <iframe src="http://... or <embed src="http://...
I tried checking for a whitespace character before it by adding \s, but that didn't work.
Or, since it appears the regex first checks that an href=" doesn't already precede the URL (?), maybe I can check for the other things too?
Any thoughts or explanations on how I would do this are greatly appreciated. Mainly, I just need the regex; I can implement it in CakePHP myself.
The actual code comes from CakePHP's Text->autoLink():
function autoLinkUrls($text, $htmlOptions = array()) {
$options = var_export($htmlOptions, true);
$text = preg_replace_callback('#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i', create_function('$matches',
'$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], $matches[0],' . $options . ');'), $text);
return preg_replace_callback('#(?<!href="|">)(?<!http://|https://|ftp://|nntp://)(www\.[^\n\%\ <]+[^<\n\%\,\.\ <])(?<!\))#i',
create_function('$matches', '$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], "http://" . $matches[0],' . $options . ');'), $text);
}

You can expand the lookbehind at the beginning of those regexes to check for src=" as well as href=", like this:
(?<!href="|src="|">)
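To see the effect of the extra alternative, here is a minimal sketch that applies the expanded lookbehind with a plain preg_replace instead of CakePHP's HtmlHelper. The function name autolink_outside_attributes and the sample input are made up for illustration; only the regex comes from the answer above.

```php
<?php
// Hypothetical helper: linkify bare URLs, but skip any URL that directly
// follows href=", src=" or "> (i.e. one already inside markup).
function autolink_outside_attributes(string $text): string
{
    $pattern = '#(?<!href="|src="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i';
    return preg_replace($pattern, '<a href="$1">$1</a>', $text);
}

$sample = 'Visit http://example.com and <iframe src="http://example.org/embed"></iframe>';
echo autolink_outside_attributes($sample), "\n";
```

Only the bare URL is wrapped in an anchor; the iframe source is left alone because src=" sits immediately before it. Note that the lookbehind only inspects the characters directly before the URL, so attributes written with single quotes or extra spaces would still slip through.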

Related

How to add missing http:// to an anchor in a string - PHP

I wrote code that adds a hyperlink to all plain text where it finds http:// or https://. The code works well for https://www.google.com and http://yahoo.com: it converts this text into clickable hyperlinks with the correct address.
<?php
function convert_text_to_link($str)
{
$pattern = "/(?:(https?):\/\/([^\s<]+)|(www\.[^\s<]+?\.[^\s<]+))(?<![\.,:])/i";
return preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);
}
$str = "https://www.google.com is the biggest search engine. It's competitors are http://yahoo.com and www.bing.com.";
echo convert_text_to_link($str);
?>
But when my code sees www.bing.com, it adds a hyperlink, but the href attribute also becomes www.bing.com, with no http:// prepended. The link is therefore unusable: the browser treats it as a relative URL, so http://localhost/myproject/www.bing.com goes nowhere.
How can I add http:// to www.bing.com so that it becomes http://www.bing.com?

Here is your function. Try this.
function convert_text_to_link($str) {
$pattern = '#(http)?(s)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])#';
return preg_replace($pattern, '<a href="http$2://$4" target="_blank">$0</a>', $str);
}
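Another way to approach the question is to decide per match whether a scheme is missing. The following is only a sketch under assumptions: the function name linkify_with_scheme is hypothetical, and the pattern is a simplified variant of the one in the question rather than a complete URL matcher.

```php
<?php
// Hypothetical helper: wrap URLs in anchors and prepend http:// to the href
// when the matched text starts with www. instead of a scheme.
function linkify_with_scheme(string $text): string
{
    // Match http(s) URLs or www. hosts, but not a trailing . , or :
    $pattern = '~(?:https?://[^\s<]+|www\.[^\s<]+\.[^\s<]+)(?<![.,:])~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Keep the URL as-is if it already has a scheme (case-insensitive check).
        $href = preg_match('~^https?://~i', $url) ? $url : 'http://' . $url;
        $safeHref = htmlspecialchars($href, ENT_QUOTES, 'UTF-8');
        $safeText = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        return "<a href='{$safeHref}' target='_blank'>{$safeText}</a>";
    }, $text);
}

echo linkify_with_scheme("Competitors are http://yahoo.com and www.bing.com."), "\n";
```

Using a callback keeps the visible link text unchanged while only the href gets the added scheme.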

You can check whether this works, although note that it is JavaScript that runs in the browser, not PHP:
window.location = window.location.href.replace(/^www./, 'https:');
It might give you your solution.
There are some other approaches too; try them out according to your code and requirements:
1. Replace the www. prefix directly:
str_replace("www.", "http://", $str);
Note that a plain test for a leading http:// is case-sensitive. This means that if the string is initially Http://example.com, prepending will change it to http://Http://example.com, which is probably not what you want.
2. Use a regex that accepts any scheme:
if (!preg_match('/^[a-zA-Z]+:\/\//', $str))
{
$str = 'http://' . $str;
}
Hope this helps.

use php preg_replace to replace alt="20x20" with style="width:20;height:20;"

How would I use php preg_replace to parse a string of HTML and replace
alt="20x20" with style="width:20;height:20;"
Any help is appreciated.
I tried this.
$pattern = '/(<img.*) alt="(\d+)x(\d+)"(.*style=")(.*)$/';
$style = '$1$4width:$2px;height:$3px;$5';
$text = preg_replace($pattern, $style, $text);

You don't need preg_replace to do this. You can use str_replace
$html = '<img alt="20x20" />';
preg_match('/<img.*?alt="(.*?)".*>/',$html,$match);
$search = 'alt="' . $match[1] . '"';
list($width, $height) = explode('x', $match[1]);
if(is_numeric($width) && is_numeric($height))
{
$replace = 'style="width:' . $width . 'px;height:' . $height . 'px;"';
echo str_replace($search, $replace, $html);
}
Output:
<img style="width:20px;height:20px;" />

If you insist on using regular expressions to alter HTML markup you'll no doubt get stuck sometime, at which point you'd do well to look into something like Python's beautiful soup, or possibly the goode olde tidy library, which I think comes included as part of the PHP spec. But for now:
<?php
$originalString = 'Your string containing <img src="xyz.png" alt="20x20">';
$patternToFind = '/alt="20x20"/i';
$replacementString = 'style="width:20;height:20;"';
// preg_replace returns the new string, so capture or print the result.
echo preg_replace($patternToFind, $replacementString, $originalString);
?>
And since it seems a lot of people are mighty peeved at what seems to be a code request, you might check this link out for php.net's guidance. It's not always this clear in explaining PHP's constructs, but would have solved your problem easily in this case:
http://php.net/manual/en/function.preg-replace.php

As mentioned in the comments you should use DOM to manipulate HTML codes.
If you want to do this by preg_replace anyway, I'd suggest to find out the regex by yourself with the help of sites like this one.
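For reference, the DOM approach mentioned above could look roughly like the following. This is a sketch rather than a definitive implementation: the function name alt_to_style is made up, and it assumes the fragment is simple enough that re-serializing the parsed body is acceptable (the parser may normalize the markup, for example by wrapping loose text in a paragraph).

```php
<?php
// Hypothetical helper: turn alt="WxH" on <img> tags into an inline style.
function alt_to_style(string $html): string
{
    $doc = new DOMDocument();
    // Suppress parser warnings for fragments and non-standard markup.
    @$doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('img') as $img) {
        // Only rewrite alt values that look like "20x20".
        if (preg_match('/^(\d+)x(\d+)$/', $img->getAttribute('alt'), $m)) {
            $img->removeAttribute('alt');
            $img->setAttribute('style', "width:{$m[1]}px;height:{$m[2]}px;");
        }
    }

    // Serialize only the children of <body>, not the wrapper the parser adds.
    $out = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}

echo alt_to_style('<img src="xyz.png" alt="20x20">'), "\n";
```

Unlike the regex versions, this does not depend on attribute order or quoting style.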

PHP RegEx Negation Word

My preg_replace pattern is:
/<img(.*?)src="(.*?)"/
This is my replacement:
<img$1src="'.$path.'$2"
I want to exclude one case:
if the img tag has rel="customimg", skip it and do not replace.
Example of a line to skip:
<img rel="customimg" src="http..">
What should I add to this regex pattern?
I searched other posts, but couldn't find exactly this.

Because the src attribute may use single or double quotes, I suggest using
preg_replace(
"/(<img\b(?![^>]*\brel=[\"']customimg[\"'])[^>]*?\bsrc=)([\"']).*?\2/i",
"$1$2" . $path . "$2",
$string);
Edit:
To add a URL prefix instead of replacing the full URL, use
preg_replace(
"/(<img\b(?![^>]*\brel=[\"']customimg[\"'])[^>]*?\bsrc=)([\"'])(.*?)\2/i",
"$1$2" . $path . "$3$2",
$string);

Add a negative lookahead:
/<img(?![^>]*\srel="customimg")(.*?)src="(.*?)"/
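A quick way to check the lookahead is to run it on two tags, one with rel="customimg" and one without. In this sketch the function name prefix_img_src, the value of $path, and the sample HTML are illustrative assumptions; the pattern and replacement are the ones from the question and this answer.

```php
<?php
// Hypothetical helper: prefix src on every <img> except rel="customimg" ones.
function prefix_img_src(string $html, string $path): string
{
    $pattern = '/<img(?![^>]*\srel="customimg")(.*?)src="(.*?)"/';
    return preg_replace($pattern, '<img$1src="' . $path . '$2"', $html);
}

$html = '<img rel="customimg" src="/a.jpg"><img class="x" src="/b.jpg">';
echo prefix_img_src($html, 'http://cdn.example.com'), "\n";
```

Because the lookahead uses [^>]*, it only looks inside the current tag, so a customimg image later on the same line does not stop earlier images from being rewritten.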

Because I only see regex "solutions" coming in. Here is the answer using DOMDocument:
<?php
$path = 'the/path';
$doc = new DOMDocument();
@$doc->loadHTML('<img rel="customimg" src="/image.jpgm"><img src="/image.jpg">');
$xpath = new DOMXPath($doc);
$imageNodes = $xpath->query('//img[not(@rel="customimg")]');
foreach ($imageNodes as $node) {
$node->setAttribute('src', $path . $node->getAttribute('src'));
}
Demo: http://codepad.viper-7.com/uID5wz

It would seem like it'd be easier/more expressive to do
if(strpos($haystackString, '"customimg"') === false) // The === is important
{
// your preg_replace here
}
Edit: Thanks for pointing out missing param guys

PHP using prefix tags to linkify text

I'm trying to write a code library for my own personal use and I'm trying to come up with a solution to linkify URLs and mail links. I was originally going to go with a regex statement to transform URLs and mail addresses to links but was worried about covering all the bases. So my current thinking is perhaps use some kind of tag system like this:
l:www.google.com becomes http://www.google.com and where m:john.doe@domain.com becomes john.doe@domain.com.
What do you think of this solution and can you assist with the expression? (REGEX is not my strong point). Any help would be appreciated.

Maybe a regex like this:
$content = "l:www.google.com some text m:john.doe@domain.com some text";
$pattern = '/([a-z])\:([^\s]+)/'; // One character followed by ':' and then everything after the ':' that is not whitespace
if (preg_match_all($pattern, $content, $results))
{
foreach ($results[0] as $key => $result)
{
// $result is the whole matched expression like 'l:www.google.com'
$letter = $results[1][$key];
$value = $results[2][$key]; // separate name so the input in $content is not overwritten
echo $letter . ' ' . $value . '<br/>';
// You can put str_replace here
}
}
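Building on the loop above, the replacement itself can be done in one pass with preg_replace_callback. This is a sketch under assumptions: the function name expand_prefix_tags is hypothetical, only the l: and m: prefixes from the question are handled, and l: targets without a scheme are assumed to want http://.

```php
<?php
// Hypothetical helper: expand l:<url> into a link and m:<address> into a mailto link.
function expand_prefix_tags(string $text): string
{
    // \b keeps the prefix from matching inside a longer word such as "html:".
    return preg_replace_callback('/\b([lm]):(\S+)/', function (array $m): string {
        $target = htmlspecialchars($m[2], ENT_QUOTES, 'UTF-8');
        if ($m[1] === 'l') {
            // Prepend http:// unless the target already carries a scheme.
            $hasScheme = preg_match('~^[a-z][a-z0-9+.-]*://~i', $m[2]) === 1;
            $href = $hasScheme ? $target : 'http://' . $target;
            return '<a href="' . $href . '">' . $target . '</a>';
        }
        return '<a href="mailto:' . $target . '">' . $target . '</a>';
    }, $text);
}

echo expand_prefix_tags('l:www.google.com some text m:john.doe@domain.com some text'), "\n";
```

The explicit prefixes avoid guessing what is a URL or an address, at the cost of requiring authors to tag every link by hand.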

PHP Explode and Get_Url: Not Showing up the URL

It's a little bit hard to understand.
In header.php I have this code:
<?php
$ID = $link;
$url = downloadLink($ID);
?>
I get the ID from the variable $link --> 12345678
and with $url I get the full link from functions.php.
In functions.php I have this snippet:
function downloadlink ($d_id)
{
$res = @get_url ('' . 'http://www.example.com/' . $d_id . '/go.html');
$re = explode ('<iframe', $res);
$re = explode ('src="', $re[1]);
$re = explode ('"', $re[1]);
$url = $re[0];
return $url;
}
Normally it prints the URL out, but I can't understand the code.

It's written in kind of a strange way, but basically what downloadLink() does is this:
Download the HTML from http://www.example.com/<ID>/go.html
Take the HTML, and split it at every point where the string <iframe occurs.
Now take everything that came after the first <iframe in the HTML, and split it at every point where the string src=" appears.
Now take everything after the first src=" and split it at every point where " appears.
Return whatever was before the first ".
So it's a pretty poor way of doing it, but effectively it looks for the first occurrence of this in the HTML code:
<iframe src="<something>"
And returns the <something>.
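The steps above can be traced on a small sample. In this sketch the function name first_iframe_src and the sample HTML are made up for illustration, and the download step is left out so the string handling can be seen on its own.

```php
<?php
// Hypothetical helper mirroring the explode() chain, without the download step.
function first_iframe_src(string $html): string
{
    // Everything after the first "<iframe".
    $afterIframe = explode('<iframe', $html)[1];
    // Everything after the first src=" inside that remainder.
    $afterSrc = explode('src="', $afterIframe)[1];
    // Everything up to the next double quote is the URL.
    return explode('"', $afterSrc)[0];
}

$html = '<html><body><p>intro</p><iframe src="http://www.example.com/file.zip" width="1"></iframe></body></html>';
echo first_iframe_src($html), "\n"; // prints http://www.example.com/file.zip
```

If the page has no iframe or no src=" after it, index 1 does not exist and PHP raises a warning, which is one reason the approach is fragile.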
Edit: a different method, as requested in comment:
There's not really any particular "right" way to do it, but a fairly straightforward way would be to change it to this:
function downloadlink ($d_id)
{
$html = @get_url ('http://www.example.com/' . $d_id . '/go.html');
// Capture the src of the first iframe; return null if none is found.
if (preg_match('/<iframe src="(.+?)"/', $html, $matches)) {
return $matches[1];
}
return null;
}
