Regex find urls but not BBCode Urls? - php

When I run my parser, it breaks images because their URLs get replaced with links through the Oembed system. How can I edit this regex so it will not capture links within BBCode brackets?
// Parse bbcodes before link parsing for image support
$text = self::parseBBCodes($text);
$text = preg_replace_callback('/(https?:\/\/.*?)(\s|$)/i', function ($match) use (&$oembedCount, &$maxOembedCount) {
I have now tried
$text = preg_replace_callback('/(?<!\])(https?:\/\/.*?)(\s|$)(?!\[)/i', function ($match) use (&$oembedCount, &$maxOembedCount) {
This seems to work, but images are not being converted, though regular BBCode is.
BBCode Function
/**
* Parse BBCode
*
* @param string $text contains the text with BBCode to be parsed
*/
public static function parseBBcodes($text)
{
// BBcode array
$find = array(
'~\[b\](.*?)\[/b\]~s',
'~\[i\](.*?)\[/i\]~s',
'~\[u\](.*?)\[/u\]~s',
'~\[quote\](.*?)\[/quote\]~s',
'~\[size=(.*?)\](.*?)\[/size\]~s',
'~\[color=(.*?)\](.*?)\[/color\]~s',
'~\[img\](https?://.*?\.(?:jpg|jpeg|gif|png|bmp))\[/img\]~s'
);
// HTML tags to replace BBcode
$replace = array(
'<b>$1</b>',
'<i>$1</i>',
'<span style="text-decoration:underline;">$1</span>',
'<pre>$1</'.'pre>',
'<span style="font-size:$1px;">$2</span>',
'<span style="color:$1;">$2</span>',
'<img src="$1" alt="" />'
);
// Replacing the BBcodes with corresponding HTML tags
return preg_replace($find,$replace,$text);
}

You may want to strip the HTML tags from your text:
$text= strip_tags($text);
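If stripping tags is not an option because the generated <img> markup has to survive, another approach is to linkify only the text between tags. The following is a minimal sketch, not the asker's code: linkifyOutsideTags() is a hypothetical helper, and a plain <a> replacement stands in for the Oembed callback from the question.

```php
<?php
// Hypothetical helper: turn bare URLs into links, but only in text
// that sits outside HTML tags, so src="..." attributes stay untouched.
function linkifyOutsideTags(string $html): string
{
    // Split into tags (kept via the capture group) and the text between them.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // With one capture group, odd indexes hold the tags: skip them.
        if ($i % 2 === 1) {
            continue;
        }
        // Stand-in for the Oembed callback: wrap each bare URL in a link.
        $parts[$i] = preg_replace_callback(
            '~https?://[^\s<>"\'\[\]]+~i',
            function (array $m): string {
                return '<a href="' . $m[0] . '">' . $m[0] . '</a>';
            },
            $part
        );
    }

    return implode('', $parts);
}
```

Called after parseBBCodes(), this leaves the URL inside the <img> tag alone and only rewrites URLs in the surrounding text.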

Related

How to remove Wordpress images sizes from filename inside an HTML attribute?

I'm using the following code to remove the sizes added by Wordpress to medias' filenames.
function replace_content($content) {
$content = preg_replace('/-([^-]*(\d+)x(\d+)\.((?:png|jpeg|jpg|gif|bmp)))"/', '.${4}"', $content);
return $content;
}
add_filter('the_content','replace_content');
How to change the regex to apply it only to the href attribute value?
The following regex with the preg_replace() function
$replaced_content = preg_replace( '#<img[^>]*?src[\s]?=[\s]?[\'"]?([^\'">]*?(https|http|\/\/)[^\'">]*?(png|jpeg|jpg|gif|bmp))[^\'" >]*?[\'" ][^>]*?>#',
'<img src="$1">', $content );
cleans this awful img tag
<img ttl='Ren src = https://cdn.wpbeginner.com/wp-content/uploads/2015/01/rename-on-save.png' alt="Rena width=520" height="344" wp-image-25391">
to this clean and nice code
<img src="https://cdn.wpbeginner.com/wp-content/uploads/2015/01/rename-on-save.png">
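For the original question (touching only href values), one option is to anchor the pattern on the attribute name. This is a sketch under assumptions: stripSizeSuffixInHref() is a hypothetical function name, and the size suffix is assumed to always have the form -WIDTHxHEIGHT directly before the file extension.

```php
<?php
// Hypothetical helper: remove "-300x200"-style size suffixes from image
// filenames, but only inside href attribute values.
function stripSizeSuffixInHref(string $content): string
{
    return preg_replace(
        // Group 1: href=" plus everything up to the size suffix.
        // Group 2: the file extension, kept as-is.
        // The lookahead requires the closing quote right after the extension.
        '/(href=["\'][^"\']*?)-\d+x\d+(\.(?:png|jpe?g|gif|bmp))(?=["\'])/i',
        '$1$2',
        $content
    );
}
```

It could be hooked in the same way as the function in the question, with add_filter('the_content', 'stripSizeSuffixInHref').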

PHP preg_replace on correct one

So, I am using this to convert BBCode to HTML:
$text = htmlspecialchars($text);
$advanced_bbcode = array(
'#\[quote](\r\n)?(.+?)\[/quote]#si',
'#\[url](.+)\[/url]#Usi');
$advanced_html = array(
'<blockquote class="quote">$2</blockquote>',
'<a rel="nofollow" target="_blank" href="$1">$1</a>');
$text = preg_replace($advanced_bbcode, $advanced_html,$text);
echo nl2br($text);
public static function nl2br($var)
{
return str_replace(array('\\r\\n','\r\\n','r\\n','\r\n', '\n', '\r'), '<br />', nl2br($var));
}
This works fine if I only have one quote, but not if I use nested quotes like: [quote][quote][quote]first[/quote]second[/quote]end[/quote]
I expect to get:
<blockquote class="quote"><blockquote class="quote"><blockquote class="quote">first</blockquote>second</blockquote>end</blockquote>
But because it takes the first [/quote], it turns into:
<blockquote class="quote">[quote][quote]first</blockquote>second[/quote]end[/quote]
I've looked it up, but I can't find anything that works for me. I am new to this kind of stuff.
Thanks.
Repeat the replacement until no BBCode is left in the string:
do {
$text = preg_replace($advanced_bbcode, $advanced_html,$text,-1,$c);
} while($c);
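A variant of the same loop converts the innermost quote first on each pass, so no half-converted tags appear in between. This is only a sketch: convertNestedQuotes() is a hypothetical helper, it handles just the [quote] tag, and it drops the optional (\r\n) group from the question's pattern.

```php
<?php
// Hypothetical helper: convert nested [quote] blocks from the inside out.
function convertNestedQuotes(string $text): string
{
    // (?:(?!\[quote\]).)*? refuses to step over another opening tag, so a
    // pass can only match quotes that contain no unconverted [quote].
    $pattern = '#\[quote\]((?:(?!\[quote\]).)*?)\[/quote\]#si';

    do {
        $text = preg_replace(
            $pattern,
            '<blockquote class="quote">$1</blockquote>',
            $text,
            -1,
            $count // number of replacements made in this pass
        );
    } while ($count > 0);

    return $text;
}
```

For the nested example from the question, this produces the three nested blockquote elements the asker expected; unbalanced tags are simply left in place.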

Allow custom tags with dashes in kses filter in Wordpress

I wrote a small Wordpress plugin which allows adding user defined HTML Custom Element Tags (like <my-element>) to the HTML of a Post. So with that a User that doesn't have the capability unfiltered_html is at least capable of using such predefined Custom Tags.
The Problem is if I add a filter like so:
add_filter('wp_kses_allowed_html', 'returnAllowedCustomTags', 10, 2);
function returnAllowedCustomTags($allowedTags, $context) {
$myAllowedTags = array('my-element' => array(), 'myelement' => array());
$allowedTags = array_merge($allowedTags, $myAllowedTags);
return $allowedTags;
}
Saving html <myelement>blah</myelement> is possible. But saving html <my-element>blah</my-element> is not possible. I think this is because before filtering the HTML Tags with dashes are removed from HTML string.
Is there a good solution (without adjusting WordPress core files) to prevent WordPress kses from filtering HTML tags with dashes?
And I don't want to give the users the unfiltered_html capability.
Finally after trying several approaches I came up with the following solution:
add_filter('wp_kses_allowed_html', 'returnAllowedCustomTags', 10, 2);
// use this filter to replace all custom tags with dashes before kses filter is applied
add_filter('content_save_pre', 'transformCustomTags', 9);
// use this to retransform filtered html before saving
add_filter('content_save_pre', 'retransformCustomTags', 11);
function transformCustomTags($html) {
$customTags = array('my-element', 'my-element1', 'my-element2');
// iterate over all allowed custom tags and replace them so they won't get stripped out by kses filter
foreach ($customTags as $tag) {
// transform each tag name replacing dash with '0000'
// e.g. <my-element ...> will be replaced with <my0000element ...>
// with that we can prevent kses from stripping them out
$pattern = '<' . $tag;
$replace = '<' . str_ireplace('-', '0000', $tag);
$html = str_ireplace($pattern, $replace, $html);
// replace all closing tags analog to opening tags above
$pattern = '</' . $tag;
$replace = '</' . str_ireplace('-', '0000', $tag);
$html = str_ireplace($pattern, $replace, $html);
}
return $html;
}
function retransformCustomTags($html) {
$customTags = array('my-element', 'my-element1', 'my-element2');
// iterate over all allowed tags and reverse transformation to get the dash again in tag names
foreach ($customTags as $tag) {
// e.g. <my0000element ...> will be replaced with <my-element ...>
$pattern = '<' . str_ireplace('-', '0000', $tag);
$replace = '<' . $tag;
$html = str_ireplace($pattern, $replace, $html);
// replace all closing tags analog to opening tags above
$pattern = '</' . str_ireplace('-', '0000', $tag);
$replace = '</' . $tag;
$html = str_ireplace($pattern, $replace, $html);
}
return $html;
}
function returnAllowedCustomTags($allowedTags, $context) {
$myAllowedTags = array('my-element', 'my-element1', 'my-element2');
foreach ($myAllowedTags as $tag) {
$tagNameEscaped = str_ireplace('-', '0000', $tag);
$allowedTags[$tagNameEscaped] = array();
}
return $allowedTags;
}
Basically, I replace the dash in the tag name of every allowed custom tag: <my-element ...>...</my-element> is transformed to <my0000element ...>...</my0000element>. The kses filter can then do its work and accepts the tag <my0000element>. After the kses filter is done, but before $html is saved to the database, I transform <my0000element ...>...</my0000element> back to <my-element ...>...</my-element>.
With this solution, dashes in tag names are not stripped out for custom tags defined as allowed tags. So users who don't have the unfiltered_html capability are able to use a predefined set of custom elements.
Of course, the replacement pattern for the dash could be a little safer regarding false matches, but for demonstration purposes 0000 was enough.
Note: This approach requires no adjustment to WordPress core code; it only uses the hooks WordPress provides.
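As a rough illustration of what a stricter replacement might look like, the sketch below matches a tag name only when it is complete and uses a less common placeholder. Everything in it is an assumption rather than part of the answer above: the function names maskCustomTags() and unmaskCustomTags(), the placeholder xdashx, and the choice of a regex instead of str_ireplace().

```php
<?php
// Hypothetical placeholder for the dash; letters only, unlikely in real tag names.
const DASH_PLACEHOLDER = 'xdashx';

// Replace the dash in each allowed custom tag name before kses runs.
function maskCustomTags(string $html, array $tags): string
{
    foreach ($tags as $tag) {
        $masked = str_replace('-', DASH_PLACEHOLDER, $tag);
        // Match "<tag" or "</tag" only when the tag name ends there, i.e. it
        // is followed by whitespace, "/" or ">". "<my-element1" is therefore
        // not touched when only "my-element" is listed.
        $pattern = '~(</?)' . preg_quote($tag, '~') . '(?=[\s/>])~i';
        $html = preg_replace($pattern, '${1}' . $masked, $html);
    }
    return $html;
}

// Restore the dash after kses has filtered the content.
function unmaskCustomTags(string $html, array $tags): string
{
    foreach ($tags as $tag) {
        $masked = str_replace('-', DASH_PLACEHOLDER, $tag);
        $pattern = '~(</?)' . preg_quote($masked, '~') . '(?=[\s/>])~i';
        $html = preg_replace($pattern, '${1}' . $tag, $html);
    }
    return $html;
}
```

The two functions would take the place of transformCustomTags() and retransformCustomTags() on the content_save_pre hook, with the allowed-tags filter registering the masked names.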

BBCode preg_replace change link with or without "http://"

I'm working on a parser for my site and I've run into an issue: is there any way to remove or insert "http://" in a string using preg_replace? I know the basics of how to use the function and have been following tutorials, but I'm not exactly sure how to go about doing this.
here's my code:
function showBBcodes($text)
{
// BBcode array
$find = array(
'~\[b\](.*?)\[/b\]~s',
'~\[i\](.*?)\[/i\]~s',
'~\[u\](.*?)\[/u\]~s',
'~\[quote\](.*?)\[/quote\]~s',
'~\[size=(.*?)\](.*?)\[/size\]~s',
'~\[color=(.*?)\](.*?)\[/color\]~s',
'~\[url=([^]]*)\]([^[]*)\[/url\]~s',
'~\[img\](https?://.*?\.(?:jpg|jpeg|gif|png|bmp))\[/img\]~s'
);
// HTML tags to replace BBcode
$replace = array(
'<b>$1</b>',
'<i>$1</i>',
'<span style="text-decoration:underline;">$1</span>',
'<pre>$1</'.'pre>',
'<span style="font-size:$1px;">$2</span>',
'<span style="color:$1;">$2</span>',
'<a href="$1">$2</a>',
'<img src="$1" alt="" />'
);
// Replacing the BBcodes with corresponding HTML tags
return preg_replace($find,$replace,$text);
}
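One way to handle a link given with or without "http://" is to build the href in a callback instead of a fixed replacement string. A minimal sketch, assuming a hypothetical helper convertUrlTags() that handles only the [url=...] tag and treats any target without an http(s) scheme as needing "http://":

```php
<?php
// Hypothetical helper: render [url=target]label[/url] as a link and
// prepend "http://" when the target has no http(s) scheme.
function convertUrlTags(string $text): string
{
    return preg_replace_callback(
        '~\[url=([^\]]*)\]([^\[]*)\[/url\]~is',
        function (array $m): string {
            $href = trim($m[1]);
            // Assumption: anything not starting with http:// or https://
            // is a bare host or path that needs a scheme.
            if (!preg_match('~^https?://~i', $href)) {
                $href = 'http://' . $href;
            }
            // The label is passed through unchanged, as in the question's code.
            return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">' . $m[2] . '</a>';
        },
        $text
    );
}
```

Running this before the array-based preg_replace() in showBBcodes() would let the remaining tags be handled as before.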

PHP preg_replace multiple url replaces

Hey, I am trying to do two preg_replace calls:
1. Turn all URLs into HTML links
2. Turn all image URLs into HTML img tags
But each regex undoes the work of the other.
<?php
// The Regular Expression filter
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
$reg_exImg = "!http://[a-z0-9\-\.\/]+\.(?:jpe?g|png|gif)!Ui";
// The Text you want to filter for urls
$text = "The text you want to filter goes here. http://google.comhttp://www.ynet.co.il http://dogsm.files.wordpress.com/2011/12/d7a1d7a7d795d791d799-d793d795.jpg";
// Check if there is a url in the text
$text = preg_replace($reg_exImg, '<img src=$0 >', $text);
$text = preg_replace($reg_exUrl, '<a href="$0" rel="nofollow">$0</a>', $text);
echo $text;
?>
How can I keep the preg_replace that turns URLs into links from also rewriting the URL inside the <img> tag?
Thanks.
Haim
This might be better as a callback.
Use just the $reg_exUrl, and do this:
$text = preg_replace_callback($reg_exUrl,function($m) {
if( preg_match("/\.(?:jpe?g|png|gif)$/i",$m[0])) {
return "<img src='".$m[0]."' />";
}
else {
return "<a href='".$m[0]."' rel='nofollow'>".$m[0]."</a>";
}
},$text);
