php bbcode parser [tag] in [tag] - php

I have a simple BBCode parser:
function parse($text) {
    $text = htmlspecialchars($text);
    $text = nl2br($text);
    $text = preg_replace("#\[b\](.*?)\[/b\]#si", '<b>\\1</b>', $text);
    $text = preg_replace("#\[i\](.*?)\[/i\]#si", '<i>\\1</i>', $text);
    $text = preg_replace("#\[u\](.*?)\[/u\]#si", '<u>\\1</u>', $text);
    $text = preg_replace("#\[color=(.*?)\](.*?)\[/color\]#si", "<span style=\"color:\\1;\">\\2</span>", $text);
    //and some more rules [...]
    return $text;
}
It works well on simple input, but it breaks when a user nests one color tag inside another.
Example 1:
[b]bold[color=#f00]red[/color][i]italic[/i][/b]
works fine, but when the user tries something like example 2:
[b]bold[color=#f00]red[color=#0f0]green[/color][/color][i]italic[/i][/b]
my function returns:
<b>bold<span style="color:#f00;">red[color=#0f0]green</span>[/color]<i>italic</i></b>
Of course, example 3 works fine:
[b]bold[color=#f00]red[/color][color=#0f0]green[/color][i]italic[/i][/b]
My question: is there any simple way to build something like a DOM and then parse the expression?
For the 2nd example I'd like to get something like this:
<b>bold<span style="color:#f00;">red<span style="color:#0f0;">green</span></span><i>italic</i></b>

You should look into already existing solutions if you're willing to parse complex BBCode (see the post mario linked in a comment for reference).
However, if you're willing to stick with your own implementation, you can use recursive regexes, for example this way:
<?php
function bbcodeColor($input)
{
    $regex = '#\[color=(.*?)\](((?R)|.)*?)\[\/color\]#is';
    if (is_array($input)) {
        $input = '<span style="color:'.$input[1].';">'.$input[2].'</span>';
    }
    return preg_replace_callback($regex, 'bbcodeColor', $input);
}
echo bbcodeColor('[color=#f00]red[color=#0f0]green[/color][/color]');
// <span style="color:#f00;">red<span style="color:#0f0;">green</span></span>
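If the recursive (?R) pattern feels hard to maintain, an alternative is to convert the innermost [color] pairs first and repeat until nothing is left to replace. The sketch below is one way to do that, not the answer's method. The names replaceNestedColor and parseWithNestedColor are hypothetical, and only the [b], [i] and [color] rules from the question are included.

```php
<?php
// Hypothetical helper: converts [color=...] tags from the inside out.
function replaceNestedColor(string $text): string
{
    // (?:(?!\[color=).)*? refuses to cross another opening tag,
    // so each pass only matches the innermost remaining pairs.
    $pattern = '#\[color=([^\]]+)\]((?:(?!\[color=).)*?)\[/color\]#si';
    do {
        $text = preg_replace($pattern, '<span style="color:\1;">\2</span>', $text, -1, $count);
    } while ($count > 0); // stop once a pass replaces nothing
    return $text;
}

// Hypothetical wrapper showing where the helper fits into the parser.
function parseWithNestedColor(string $text): string
{
    $text = nl2br(htmlspecialchars($text));
    $text = preg_replace('#\[b\](.*?)\[/b\]#si', '<b>\1</b>', $text);
    $text = preg_replace('#\[i\](.*?)\[/i\]#si', '<i>\1</i>', $text);
    return replaceNestedColor($text);
}

echo parseWithNestedColor('[b]bold[color=#f00]red[color=#0f0]green[/color][/color][i]italic[/i][/b]');
// <b>bold<span style="color:#f00;">red<span style="color:#0f0;">green</span></span><i>italic</i></b>
```

The loop trades one complex pattern for several simple passes, which tends to be easier to debug on deeply nested input.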

Related

PHP : add a html tag around specifc words

I have a database with texts, and in each text there are words (tags) that start with # (example of a record: "Hi I'm posting an #issue on #Stackoverflow ")
I'm trying to find a way to add HTML code that turns each tag into a link when printing the text.
So the texts are stored as strings in a MySQL database like this:
Some text #tag1 text #tag2 ...
I want to replace each #abcd with a link whose text is
#abcd
and have a final result as follows (each tag rendered as a link):
Some text #tag1 text #tag2 ...
I guess that I should use some regex, but it is not at all my strong suit.
Try the following using preg_replace(..)
$input = "Hi I'm posting an #issue on #Stackoverflow";
echo preg_replace("/#([a-zA-Z0-9]+)/", "<a href='targetpage.php?val=$1'>#$1</a>", $input);
http://php.net/manual/en/function.preg-replace.php
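A variation on the same idea uses preg_replace_callback, so that the tag can be URL-encoded and HTML-escaped before it goes into the link. This is a minimal sketch: linkHashtags is a hypothetical name, targetpage.php and the val parameter are taken from the answer above, and the lookbehind is an added assumption that keeps the pattern from matching inside words or HTML entities such as &#039;.

```php
<?php
// Hypothetical helper: wraps each #tag in an anchor pointing at $page?val=tag.
function linkHashtags(string $text, string $page = 'targetpage.php'): string
{
    return preg_replace_callback(
        // (?<![\w&]) skips a # that follows a word character or an ampersand.
        '/(?<![\w&])#(\w+)/',
        function (array $m) use ($page): string {
            $href = $page . '?val=' . rawurlencode($m[1]);
            return '<a href="' . htmlspecialchars($href) . '">#' . htmlspecialchars($m[1]) . '</a>';
        },
        $text
    );
}

echo linkHashtags("Hi I'm posting an #issue on #Stackoverflow");
// Hi I'm posting an <a href="targetpage.php?val=issue">#issue</a> on <a href="targetpage.php?val=Stackoverflow">#Stackoverflow</a>
```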
A simple solution could look like this:
$re = '/\S*#(\[[^\]]+\]|\S+)/m';
$str = 'Some text #tag1 text #tag2 ...';
$subst = '#$1';
$result = preg_replace($re, $subst, $str);
echo "The result of the substitution is ".$result;
If you are actually after Twitter hashtags and want to go crazy take a look here how it is done in Java.
There is also a JavaScript Twitter library that makes things very easy.
Try this function:
<?php
$demoString1 = "THIS is #test STRING WITH #abcd";
$demoString2 = "Hi I'm posting an #issue on #Stackoverflow";
function wrapWithAnchor($link, $string) {
    $pattern = "/#([a-zA-Z0-9]+)/";
    // $1 is the captured tag name without the leading #
    $replace_with = '<a href="'.$link.'?val=$1">$1</a>';
    return preg_replace($pattern, $replace_with, $string);
}
$link = 'http://www.targetpage.php';
echo wrapWithAnchor($link, $demoString1);
echo '<hr />';
echo wrapWithAnchor($link, $demoString2);
?>

How to remove scripts tags inside another code by regex

I'm trying to remove script tags from the source code using regular expression.
/<\s*script[^>]*[^\/]>(.*?)<\s*\/\s*script\s*>/is
But I ran into a problem when I need to remove a script that sits inside other code.
Please see this screenshot
I tested it at https://regex101.com/r/R6XaUT/1
How do I correctly create a regular expression so that it can cover all the code?
Sample text:
$text = '<b>sample</b> text with <div>tags</div>';
Result for strip_tags($text):
Output: sample text with tags
Result for strip_tags_content($text):
Output: text with
Result for strip_tags_content($text, '<b>'):
Output: <b>sample</b> text with
Result for strip_tags_content($text, '<b>', TRUE):
Output: text with <div>tags</div>
I hope someone finds this useful :)
source link
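Note that strip_tags_content is not a PHP built-in, and its definition is not shown in this answer. A minimal sketch of a helper with the behaviour listed above could look like the following. It assumes the second argument is a tag list such as '<b>' and the third argument inverts the selection; it is regex-based, so it will struggle with same-name nesting and malformed markup.

```php
<?php
// Sketch of a helper that removes tags together with their content.
// $tags:   tag list such as '<b><i>'; those tags are kept (or, with $invert, removed).
// $invert: when true, only the listed tags and their content are removed.
function strip_tags_content(string $text, string $tags = '', bool $invert = false): string
{
    preg_match_all('/<([a-z][a-z0-9]*)\s*\/?>/i', $tags, $m);
    $names = array_unique($m[1]);

    if (count($names) > 0) {
        $alt = implode('|', $names);
        $pattern = $invert
            ? '@<(' . $alt . ')\b[^>]*>.*?</\1>@si'                 // remove listed tags only
            : '@<(?!(?:' . $alt . ')\b)(\w+)\b[^>]*>.*?</\1>@si';   // remove everything else
        return preg_replace($pattern, '', $text);
    }
    if (!$invert) {
        // No list given: remove every element with its content.
        return preg_replace('@<(\w+)\b[^>]*>.*?</\1>@si', '', $text);
    }
    return $text;
}

$text = '<b>sample</b> text with <div>tags</div>';
var_dump(strip_tags_content($text));              // " text with "
var_dump(strip_tags_content($text, '<b>'));       // "<b>sample</b> text with "
var_dump(strip_tags_content($text, '<b>', true)); // " text with <div>tags</div>"
```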
Simply use the PHP function strip_tags. See
http://php.net/manual/de/function.strip-tags.php
$string = "<div>hello</div>";
echo strip_tags($string);
Will output
hello
You also can provide a list of tags to keep.
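A short illustration of the allowed-tags argument, using a made-up input string. It also shows a caveat that matters for the question: strip_tags removes the tags themselves but leaves the text between them, including the body of a script element.

```php
<?php
// Hypothetical sample input for illustration only.
$html = '<div>hello <b>world</b></div><script>alert(1)</script>';

$plain    = strip_tags($html);        // every tag removed, script body left behind
$keepBold = strip_tags($html, '<b>'); // <b> is kept, everything else removed

echo $plain, "\n";    // hello worldalert(1)
echo $keepBold, "\n"; // hello <b>world</b>alert(1)
```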
==
Another approach is this:
// Load a file into $html
$html = file_get_contents('scratch.html');
$matches = [];
preg_match_all("/<\/*([^\s>]*)>/", $html, $matches);
// Build a list in which each tag name appears only once
$tags = array_unique($matches[1]);
// Find the script entry and remove it
$scriptTagIndex = array_search("script", $tags);
if ($scriptTagIndex !== false) unset($tags[$scriptTagIndex]);
// The allowed-tag list must be a string of the form <tagname1><tagname2>...
$allowedTags = array_map(function ($s) { return "<$s>"; }, $tags);
// Strip the HTML, keeping every tag except the removed one (script)
$noScript = strip_tags($html, join("", $allowedTags));
echo $noScript;

replace link with another

I'm struggling with replacing the text of each link.
$reg_ex = "/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
$text = '<br /><p>this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p>';
if(preg_match_all($reg_ex, $text, $urls))
{
foreach($urls[0] as $url)
{
echo $replace = str_replace($url,'http://www.sometext'.$url, $text);
}
}
With the code above I get the same text three times, with the links changed one at a time: each iteration replaces only one link. I know that is because I use foreach.
But I don't know how to replace them all at once.
Your help would be great!
You shouldn't use regexes on HTML; use DOM instead. That being said, your bug is here:
$replace = str_replace(...., $text);
You assign the result to $replace but always pass the original $text as the subject, so every iteration of the loop throws away the previous replacement. You probably want
$text = str_replace(...., $text);
instead, so the changes "propagate".
If you want the final variable to contain all replacements, change it to something like this.
You are not passing the replaced string back in as the "subject". I assume that is what you are expecting, since the question is a bit difficult to understand.
$reg_ex = "/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
$text = '<br /><p>this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p>';
if (preg_match_all($reg_ex, $text, $urls))
{
    $replace = $text;
    foreach ($urls[0] as $url) {
        $replace = str_replace($url, 'http://www.sometext'.$url, $replace);
    }
    echo $replace;
}
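One caveat with the loop: if the same URL occurs more than once, each iteration finds it again inside the already-prefixed text and prefixes it a second time. A single preg_replace call avoids this, because every match is replaced exactly once in one pass. The sketch below reuses the question's pattern; the sample text is made up, since the original links are not visible here.

```php
<?php
$reg_ex = '/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/';
// Hypothetical sample text containing the same URL twice.
$text = 'first http://example.com/a then http://example.com/a again';

// $0 stands for the whole matched URL.
$result = preg_replace($reg_ex, 'http://www.sometext$0', $text);

echo $result;
// first http://www.sometexthttp://example.com/a then http://www.sometexthttp://example.com/a again
```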

Lines get split on preg_replace usage

My code:
$input = "this text is for highlighting a text if it exists in a string. Let us check if it works or not";
$pattern ="/if/";
$replacement= "H1Fontbracket"."if"."H1BracketClose";
echo preg_replace($pattern, $replacement, $input);
Now the problem is that when I run this code, the output is split across multiple lines. What else do I need to do to get it on one line?
For a fixed string like this you can use str_replace rather than preg_replace. Both return a string when given a string, so the choice of function does not cause the line breaks, but str_replace needs the plain search text instead of a delimited pattern:
echo str_replace("if", $replacement, $input);
What do you mean by multiple lines? Of course it'll show up as multiple lines on a webpage if you wrap the ifs in header tags. Headers are block elements. And more importantly, headers are headers. Not for highlighting text.
If you want to highlight something with HTML, you should probably use a span with a class, or you could use the HTML5 element mark:
$input = "this text is for highlighting a text if it exists in an iffy string.";
echo preg_replace('/\\bif\\b/', '<span class="highlighted">$0</span>', $input);
echo preg_replace('/\\bif\\b/', '<mark>$0</mark>', $input);
The \\b (a regex word boundary) makes the pattern match only the whole word "if", not the letters "if" inside a different word such as "iffy". Then in your CSS you can decide how the marked words should show up:
.highlighted { background: yellow }
mark { background: yellow }
Or whatever. I would recommend that you read up a bit on how HTML and CSS works if you're going to make web pages :)
Try this
$input = "this text is for highlighting a text if
it exists in a string. Let us check if it works or not";
$pattern="if";
$replacement="<h1>". $pattern. "</h1>";
$input= str_replace($pattern,$replacement,$input);
echo "$input";
function highlight($str, $search) {
    // Escape regex metacharacters in the search terms
    $patterns = array('/\//', '/\^/', '/\./', '/\$/', '/\|/',
        '/\(/', '/\)/', '/\[/', '/\]/', '/\*/', '/\+/',
        '/\?/', '/\{/', '/\}/', '/\,/');
    $replace = array('\/', '\^', '\.', '\$', '\|', '\(', '\)',
        '\[', '\]', '\*', '\+', '\?', '\{', '\}', '\,');
    $search = preg_replace($patterns, $replace, $search);
    // Treat each space-separated word as an alternative
    $search = str_replace(" ", "|", $search);
    return preg_replace("/(^|\s)($search)/i", '${1}<span class=highlight>${2}</span>', $str);
}
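The manual escaping arrays above do roughly what the built-in preg_quote does. A shorter sketch of the same idea follows; highlightWords is a hypothetical name, and it assumes whole-word matching with the mark element rather than the span used above.

```php
<?php
// Hypothetical helper: highlights each space-separated search word as a whole word.
function highlightWords(string $str, string $search): string
{
    $words = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return $str; // nothing to highlight
    }
    // preg_quote escapes regex metacharacters, including the / delimiter.
    $alt = implode('|', array_map(fn($w) => preg_quote($w, '/'), $words));
    return preg_replace('/\b(' . $alt . ')\b/i', '<mark>$1</mark>', $str);
}

echo highlightWords('Let us check if it works, even when iffy', 'if works');
// Let us check <mark>if</mark> it <mark>works</mark>, even when iffy
```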

What is the best way to strip out all html tags from a string?

Using PHP, given a string such as: this is a <strong>string</strong>; I need a function to strip out ALL html tags so that the output is: this is a string. Any ideas? Thanks in advance.
PHP has a built-in function that does exactly what you want: strip_tags
$text = '<b>Hello</b> World';
print strip_tags($text); // outputs Hello World
If you expect broken HTML, you are going to need to load it into a DOM parser and then extract the text.
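A minimal sketch of that DOM route, assuming the dom extension is available (it is in most default PHP builds). The function name htmlToText is hypothetical. DOMDocument::loadHTML assumes ISO-8859-1 unless the markup declares an encoding, so non-ASCII input may need extra handling that is not shown here.

```php
<?php
// Hypothetical helper: parses the markup leniently and returns only its text.
function htmlToText(string $html): string
{
    if (trim($html) === '') {
        return ''; // loadHTML rejects empty input
    }
    // Collect parser warnings about broken markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc->documentElement !== null ? $doc->documentElement->textContent : '';
}

echo htmlToText('this is a <strong>string</strong>');
// this is a string
```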
What about using strip_tags, which should do just the job ?
For instance (quoting the doc) :
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
will give you :
Test paragraph. Other text
Edit : but note that strip_tags doesn't validate what you give it. Which means that this code :
$text = "this is <10 a test";
var_dump(strip_tags($text));
Will get you :
string 'this is ' (length=8)
(Everything after the thing that looks like a starting tag gets removed).
strip_tags is the function you're after. You'd use it something like this
$text = '<strong>Strong</strong>';
$text = strip_tags($text);
// Now $text = 'Strong'
I find this to be a little more effective than strip_tags() alone, since strip_tags() will not zap javascript or css:
$search = array(
"'<head[^>]*?>.*?</head>'si",
"'<script[^>]*?>.*?</script>'si",
"'<style[^>]*?>.*?</style>'si",
);
$replace = array("","","");
$text = strip_tags(preg_replace($search, $replace, $html));
