Regex nested forum quotes (BBCode) [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
php regex [b] to <b>
I'm having trouble with regex; I'm an absolute regex noob. I can't see what's going wrong when I try to convert the HTML back to the 'BBCode'.
Could somebody take a look at the 'unquote' function and tell me the obvious mistake I'm making? (I know it's obvious because I always find the un-obvious errors.)
NOTE: I'm not using recursive regex because I couldn't get my head around it, and I had already started sorting out the quotes this way so that they're nested.
<?php
function quote($str){
$str = preg_replace('#\[(?i)quote=(.*?)\](.*?)#si', '<div class="quote"><div class="quote-title">\\1 wrote:</div><div class="quote-inner">\\2', $str);
$str = preg_replace('#\[/(?i)quote\]#si', '</div></div>', $str);
return $str;
}
function unquote($str){
$str = preg_replace('#\<(?i)div class="quote"\>\<(?i)div class="quote_title"\>(.*?)wrote:\</(?i)div\><(?i)div class="quote-inner"\>(.*?)#si', '[quote=\\1]\\2', $str);
$str = preg_replace('#\</(?i)div\></(?i)div\>#si', '[/quote]', $str);
}
?>
This is just some code to help test it:
<html>
<head>
<style>
body {
font-family: sans-serif;
}
.quote {
background: rgba(51,153,204,0.4) url(../img/diag_1px.png);
border: 1px solid rgba(116,116,116,0.36);
padding: 5px;
}
.quote-title, .quote_title {
font-size: 18px;
margin: 5px;
}
.quote-inner {
margin: 10px;
}
</style>
</head>
<body>
<?php
$quote_text = '[quote=VCMG][quote=2xAA]DO RECURSIVE QUOTES WORK?[/quote]I have no idea.[/quote]';
$quoted = quote($quote_text);
echo $quoted.'<br><br>'.unquote($quoted); ?>
</body>
Thanks in advance, Sam.

Well, you could start by setting your CSS class to either quote-title or quote_title, but keep it consistent: quote() emits quote-title while the unquote() pattern looks for quote_title, so it never matches.
Then add a return $str; to your second function and you should be nearly there.
You can also simplify your regexes a little:
function quote($str){
$str = preg_replace('#\[quote=(.*)\]#siU', '<div class="quote"><div class="quote-title">\\1 wrote:</div><div class="quote-inner">', $str);
$str = preg_replace('#\[/quote\]#si', '</div></div>', $str);
return $str;
}
function unquote($str){
$str = preg_replace('#<div class="quote"><div class="quote-title">(.*) wrote:</div><div class="quote-inner">#siU', '[quote=\\1]', $str);
$str = preg_replace('#</div></div>#si', '[/quote]', $str);
return $str;
}
But beware of replacing the start and end tags of your quotes in separate calls. I think that unquote can behave strangely if other BBCode also produces </div></div> in its HTML, because those closing tags would be turned into [/quote] too.

Personally, I take advantage of the fact that the resulting HTML is basically:
<div class="quote">Blah <div class="quote">INCEPTION!</div> More blah</div>
Repeatedly run the regex until there are no more matches:
do {
$str = preg_replace( REGEX , REPLACE , $str , -1 , $c);
} while($c > 0);
Also, do it as one regex to make this easier (the outer parentheses are the pattern delimiters, so $1 is the name and $2 is the quoted text):
'(\[quote=(.*?)\](.*?)\[/quote\])is'
'<div class="quote"><div class="quote-title">$1 wrote:</div><div class="quote-inner">$2</div></div>'
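Putting the loop and the single regex together, a minimal sketch could look like this. The function name quote_nested is made up for illustration, and ~ is used as the delimiter instead of parentheses, so the name is $1 and the quoted text is $2:

```php
<?php
// Hypothetical helper: repeats the replacement until no [quote=...] pair is left.
function quote_nested(string $str): string
{
    $pattern = '~\[quote=(.*?)\](.*?)\[/quote\]~is';
    $replace = '<div class="quote"><div class="quote-title">$1 wrote:</div>'
             . '<div class="quote-inner">$2</div></div>';
    do {
        // $count receives the number of replacements made in this pass.
        $str = preg_replace($pattern, $replace, $str, -1, $count);
    } while ($count > 0);
    return $str;
}

$html = quote_nested('[quote=VCMG][quote=2xAA]DO RECURSIVE QUOTES WORK?[/quote]I have no idea.[/quote]');
echo $html;
```

Each pass pairs an opening tag with the nearest closing tag, so a single pass can leave BBCode tags behind; the loop keeps going until none remain. For the sample input the final markup has the 2xAA quote nested inside the VCMG quote.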

Related

Initialize PHP echo without using ' ' or " "?

I am currently working on a web project for my school which is built with HTML, PHP and SQL databases for dynamic content. Until now everything has worked out pretty well, but I have reached a point where I have to echo something that contains many ' and " characters, which pretty much makes it impossible to use PHP echo with those delimiters (' and "). Is there any other way to start a PHP echo?
if ($rows[$number]['kulturschule'] == 1) {
echo '<div class="tp-caption tp-resizeme hover-scale"
data-x="center"
data-y="center"
data-voffset="[290, 290, 250, 210]"
data-hoffset="0"
data-frames='[{"delay":1000,"speed":2000,"frame":"0","from":"sX:0.9;sY:0.9;opacity:0;fb:20px;","to":"o:1;fb:0;","ease":"Power3.easeInOut"},{"delay":"wait","speed":500,"frame":"999","to":"sX:0.9;sY:0.9;opacity:0;fb:20px;","ease":"Power3.easeInOut"}]'
style="z-index: 20; max-width: auto; max-height: auto; white-space: nowrap;"><img src="img/logo/kulturschule.jpg"> ';

This is a perfect situation to use HEREDOC:
// put all the html in a variable:
$html = <<<EOT
<div class="tp-caption tp-resizeme hover-scale"
data-x="center"
data-y="center"
data-voffset="[290, 290, 250, 210]"
data-hoffset="0"
data-frames='[{"delay":1000,"speed":2000,"frame":"0","from":"sX:0.9;sY:0.9;opacity:0;fb:20px;","to":"o:1;fb:0;","ease":"Power3.easeInOut"},{"delay":"wait","speed":500,"frame":"999","to":"sX:0.9;sY:0.9;opacity:0;fb:20px;","ease":"Power3.easeInOut"}]'
style="z-index: 20; max-width: auto; max-height: auto; white-space: nowrap;"><img src="img/logo/kulturschule.jpg">
EOT;
// note that before PHP 7.3, EOT; has to be at the very start of the line.
// then:
echo $html;

I have reached a point where I have to echo something out which contains many Characters like '' and "" which pretty much makes it impossible to use PHP echo with those starting tags
Then you should escape those characters inside the string; cf. http://php.net/string
If you are looking for an alternative, you can use the HEREDOC/NOWDOC syntax (see the link above).

Yes, you have the possibility to use this format:
$text = <<<EOT
Place your text between the EOT. It's
the delimiter that ends the text
of your multiline string.
$var
EOT;
If you want raw strings (no variable interpolation), use the NOWDOC format:
$var = "foo";
$text = <<<'EOT'
My $var
EOT;
This will not interpolate $var; it is printed as-is.
Note:
Before PHP 7.3 you cannot indent the closing EOT;

Another solution is to go way old-school and use php like it was used 10 years ago:
<?php
if ($rows[$number]['kulturschule'] == 1) {
?>
<div class="tp-caption tp-resizeme hover-scale"
data-x="center"
data-y="center"
data-voffset="[290, 290, 250, 210]"
data-hoffset="0"
data-frames='[{"delay":1000,"speed":2000,"frame":"0","from":"sX:0.9;sY:0.9;opacity:0;fb:20px;","to":"o:1;fb:0;","ease":"Power3.easeInOut"},{"delay":"wait","speed":500,"frame":"999","to":"sX:0.9;sY:0.9;opacity:0;fb:20px;","ease":"Power3.easeInOut"}]'
style="z-index: 20; max-width: auto; max-height: auto; white-space: nowrap;"><img src="img/logo/kulturschule.jpg">
<?php } ?>

You can escape the quotes inside echo like this:
if ($rows[$number]['kulturschule'] == 1) {
echo "<div class='tp-caption tp-resizeme hover-scale'
data-x='center'
data-y='center'
data-voffset='[290, 290, 250, 210]'
data-hoffset='0'
data-frames='[{\"delay\":1000,\"speed\":2000,\"frame\":\"0\",\"from\":\"sX:0.9;sY:0.9;opacity:0;fb:20px;\",\"to\":\"o:1;fb:0;\",\"ease\":\"Power3.easeInOut\"},{\"delay\":\"wait\",\"speed\":500,\"frame\":\"999\",\"to\":\"sX:0.9;sY:0.9;opacity:0;fb:20px;\",\"ease\":\"Power3.easeInOut\"}]'
style='z-index: 20; max-width: auto; max-height: auto; white-space: nowrap;'><img src='img/logo/kulturschule.jpg'> ";
}

You can escape characters:
\' single quote
\" double quote
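As a small illustration of those two escapes (the sample markup is made up), the same string can be written with either kind of delimiter:

```php
<?php
// Single-quoted string: only the single quote needs a backslash.
$single = '<p class="note">It\'s done</p>';
// Double-quoted string: only the double quotes need a backslash.
$double = "<p class=\"note\">It's done</p>";

var_dump($single === $double); // bool(true)
```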

How to remove HTML tags inside of brackets[]?

I have a string like this:
[<span style="font-size: 12.1599998474121px; line-height: 15.8079996109009px;">heading </span>heading="h1"]Its a <span style="text-decoration: line-through;">subject</span>.[/<span style="font-size: 12.1599998474121px; line-height: 15.8079996109009px;">heading</span>]
I want to remove HTML tags which are inside of brackets using PHP preg_replace etc. Final string should be like this:
[heading heading="h1"]Its a <span style="text-decoration: line-through;">subject</span>.[/heading]
I searched a lot for finding the solution but no success.

This should work for you:
Here I just apply strip_tags() to every bracketed part of your string and return it.
echo $str = preg_replace_callback("/\[(.*?)\]/", function($m){
return strip_tags($m[0]);
}, $str);

You can use a callback with the following regular expression and utilize strip_tags() ...
$str = preg_replace_callback('~\[[^]]*]~',
function($m) {
return strip_tags($m[0]);
}, $str);
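To see the callback approach at work, here is a small sketch run on a shortened variant of the question's string. The sample input and the variable names are made up for illustration:

```php
<?php
// Shortened, made-up variant of the string from the question.
$str = '[<span style="color: red;">heading </span>heading="h1"]Its a <span>subject</span>.[/<span>heading</span>]';

// Each [...] chunk is passed to the callback; strip_tags() removes the
// markup inside it, while text outside the brackets is never touched.
$result = preg_replace_callback('~\[[^]]*]~', function (array $m): string {
    return strip_tags($m[0]);
}, $str);

echo $result;
```

The span around "subject" survives because it sits outside any pair of brackets.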

It really depends on how much you want to remove.
Example:
Pattern: '<.*?>'
Result: [heading heading="h1"]Its a subject.[/heading]
But judging from your question, you want to keep the HTML tags that are inside your heading. I don't understand: based on which rule exactly? Why is this an exception?

You can use a single regex to get what you want:
$re = "#][^\[\]]*(*SKIP)(*F)|<\/?[a-z].*?>#si";
$str = "[<span style=\"font-size: 12.1599998474121px; line-height: 15.8079996109009px;\">heading </span>heading=\"h1\"]Its a <span style=\"text-decoration: line-through;\">subject</span>.[/<span style=\"font-size: 12.1599998474121px; line-height: 15.8079996109009px;\">heading</span>]";
$result = preg_replace($re, '', $str);
echo $result;
Output of the sample code:
[heading heading="h1"]Its a <span style="text-decoration: line-through;">subject</span>.[/heading]

Regular Expressions to Search for Hex color codes

So I'm trying to minify some code and I'm using the PHP function preg_replace() to do so. I'm trying to compress Hex colors. For example:
#FF0000 => #F00
I found some code over the interwebs and so far, this is what I have:
$hex_char = '[a-f0-9]';
$html = preg_replace("/(?<=^#)($hex_char)\\1($hex_char)\\2($hex_char)\\3\z/i", '\1\2\3', $html);
It works for a string like:
$html = "#FF0000";
OK, so the real problem is that I need the code to search for all the Hex colors in a chunk of code like CSS, etc. It would be something like this:
<?php
$html = '
.this{
color: #FF0000;
background-color: #CCCCCC;
}
';
$hex_char = '[a-f0-9]';
$html = preg_replace("/(?<=^#)($hex_char)\\1($hex_char)\\2($hex_char)\\3\z/i", '\1\2\3', $html);
echo $html;
?>
How can I do this? Thank you!

Remove the lookbehind and end of the string anchor from your regex.
<?php
$html = <<<EOT
.this{
color: #FF0000;
background-color: #CCCCCC;
}
EOT;
$hex_char = '[a-f0-9A-F]';
$html = preg_replace("~#($hex_char)\\1($hex_char)\\2($hex_char)\\3~", '#$1$2$3', $html);
echo $html;
?>
Output:
.this{
color: #F00;
background-color: #CCC;
}
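One possible refinement, offered as a sketch and not as part of the answer above: without anchors the pattern also matches the first six digits of a longer value such as an 8-digit #RRGGBBAA color. A negative lookahead guards against that. The function name shorten_hex_colors and the sample CSS are hypothetical:

```php
<?php
// Hypothetical helper: shortens #AABBCC to #ABC wherever it appears.
function shorten_hex_colors(string $css): string
{
    // (?![0-9a-f]) rejects the match when another hex digit follows,
    // so longer values such as #FF0000AA are left alone.
    $pattern = '~#([0-9a-f])\1([0-9a-f])\2([0-9a-f])\3(?![0-9a-f])~i';
    return preg_replace($pattern, '#$1$2$3', $css);
}

$css = 'color: #FF0000; background: #CCCCCC; border-color: #12ABEF; outline-color: #FF0000AA;';
$min = shorten_hex_colors($css);
echo $min;
```

Only the first two colors are shortened; #12ABEF has no repeated pairs, and #FF0000AA is skipped by the lookahead.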

Just remove the ^ and \z anchors.
'/\#\K([a-f0-9])\1([a-f0-9])\2([a-f0-9])\3/i'
Or
# '/(?<=\#)([a-f0-9])\1([a-f0-9])\2([a-f0-9])\3/i'
(?xi-)
(?<= \# )
( [a-f0-9] )
\1
( [a-f0-9] )
\2
( [a-f0-9] )
\3

preg_replace HTML code in PHP

I want to remove strings like the one below from HTML code:
<span style="font-size: 0.8px; letter-spacing: -0.8px; color: #ecf6f6">3</span>
so I came up with regex.
$pattern = "/<span style=\"font-size: \\d(\\.\\d)?px; letter-spacing: -\\d(\\.\\d)?px; color: #\\w{6}\">\\w\\w?</span>/um";
However, the regex doesn’t work. Can someone point out what I did wrong? I'm new to PHP.
When I tested with a simple regex it worked, so the problem is in the regex itself.
$str = $_POST["txtarea"];
$pattern = $_POST["regex"];
echo preg_replace($pattern, "", $str);

As much as I would advocate DOMDocument to do the job here, you would still need some regular expression down the line, so ...
The expression for the px numeric value can be simply [\d.-]+, since you're not trying to validate anything.
The contents of the span can be simplified to [^<]* (i.e. anything but an opening angle bracket):
$re = '/<span style="font-size: [\d.-]+px; letter-spacing: [\d.-]+px; color: #[0-9a-f]{3,6}">[^<]*<\/span>/';
echo preg_replace($re, '', $str);

Do not use regex for this problem. Use an html parser. Here is a solution in python with BeautifulSoup, because I like this library for these tasks:
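import re
from bs4 import BeautifulSoup  # BeautifulSoup 4 package; the old BeautifulSoup 3 import only works on Python 2
with open('Path/to/file', 'r') as content_file:
    content = content_file.read()
soup = BeautifulSoup(content, 'html.parser')
# Match spans whose style attribute fits the pattern from the question
style_re = re.compile(r"font-size: \d(\.\d)?px; letter-spacing: -\d(\.\d)?px; color: #\w{6}")
for span in soup.find_all('span', {'style': style_re}):
    span.extract()  # remove the span from the tree
with open('Path/to/file.modified', 'w') as output_file:
    output_file.write(str(soup))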

You have an unescaped slash ( / ) in your closing tag ( </span> ), and it collides with the / you use as the regex delimiter.
You need to escape it or use a different delimiter than slash.
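As a sketch of the second option, here is the question's pattern with ~ as the delimiter, so the slash in </span> needs no escaping. The sample $str is an assumption built around the span from the question:

```php
<?php
// Same pattern as in the question, but delimited with ~ instead of /.
$pattern = '~<span style="font-size: \d(\.\d)?px; letter-spacing: -\d(\.\d)?px; color: #\w{6}">\w\w?</span>~u';

// Made-up input wrapping the span from the question.
$str = 'a<span style="font-size: 0.8px; letter-spacing: -0.8px; color: #ecf6f6">3</span>b';

$clean = preg_replace($pattern, '', $str);
echo $clean; // prints "ab"
```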

nested bb code quotes how to>

Hi, I'm using a pretty basic BBCode parser.
Could you guys help me with a problem of mine?
For example, when this is written:
[quote=tanab][quote=1][code]a img{
text-decoration: none;
}[/code][/quote][/quote]
the output is this:
tanab said:
[quote=1]
a img{
text-decoration: none;
}
[/quote]
How would I go about fixing that? I'm really bad at the whole preg_replace stuff.
This is my parser:
function bbcode($input){
$input = htmlentities($input);
$search = array(
'/\[b\](.*?)\[\/b\]/is',
'/\[i\](.*?)\[\/i\]/is',
'/\[img\](.*?)\[\/img\]/is',
'/\[url=(.*?)\](.*?)\[\/url\]/is',
'/\[code\](.*?)\[\/code\]/is',
'/\[\*\](.*?)/is',
'/\\t(.*?)/is',
'/\[quote=(.*?)\](.*?)\[\/quote\]/is',
);
$replace = array(
'<b>$1</b>',
'<i>$1</i>',
'<img src="$1">',
'$2',
'<div class="code">$1</div>',
'<ul><li>$1</li></ul>',
' ',
'<div class="quote"><div class="quote-writer">$1 said:</div><div class="quote-body">$2</div></div>',
);
return preg_replace($search,$replace,$input);
}

This could be adapted with a recursive regex:
'/\[quote=(.*?)\](((?R)|.*?)+)\[\/quote\]/is'
Which will at least ensure that the output divs will not be incorrectly nested. But you would still have to run the regex twice or three times to catch all quote blocks.
Otherwise it would require a rewrite of your code with preg_replace_callback. Which I cannot be bothered to showcase, since this came up a few dozen times already (try the site search!), has been solved before, etc.
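For readers who want a starting point, here is a minimal sketch of the preg_replace_callback route. It is an assumption about how such a rewrite could look, not the answerer's code: the function name render_quotes is made up, the pattern is a stricter variant of the recursive regex above, and the class names are taken from the parser in the question.

```php
<?php
// Hypothetical helper: converts balanced, possibly nested [quote=...] blocks.
function render_quotes(string $input): string
{
    // (?R) recurses into the whole pattern for a nested quote; otherwise
    // (?!\[/?quote\b). consumes one character that does not start a quote tag.
    $pattern = '~\[quote=([^\]]*)\]((?:(?R)|(?!\[/?quote\b).)*)\[/quote\]~is';

    return preg_replace_callback($pattern, function (array $m): string {
        // Convert the body first so that nested quotes are rendered as well.
        $body = render_quotes($m[2]);
        return '<div class="quote"><div class="quote-writer">' . $m[1]
             . ' said:</div><div class="quote-body">' . $body . '</div></div>';
    }, $input);
}

$out = render_quotes('[quote=tanab][quote=1]code[/quote]after[/quote]');
echo $out;
```

Because the pattern only matches balanced blocks, one call handles every nesting level, and an unclosed [quote=...] is left as plain text.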
