Hi, I'm using a pretty basic BBCode parser.
Could you help me with a problem I'm having with it?
When, for example, this is written:
[quote=tanab][quote=1][code]a img{
text-decoration: none;
}[/code][/quote][/quote]
the output is this:
tanab said:
[quote=1]
a img{
text-decoration: none;
}
[/quote]
How would I go about fixing that? I'm really bad at the whole preg_replace stuff.
This is my parser:
function bbcode($input){
$input = htmlentities($input);
$search = array(
'/\[b\](.*?)\[\/b\]/is',
'/\[i\](.*?)\[\/i\]/is',
'/\[img\](.*?)\[\/img\]/is',
'/\[url=(.*?)\](.*?)\[\/url\]/is',
'/\[code\](.*?)\[\/code\]/is',
'/\[\*\](.*?)/is',
'/\\t(.*?)/is',
'/\[quote=(.*?)\](.*?)\[\/quote\]/is',
);
$replace = array(
'<b>$1</b>',
'<i>$1</i>',
'<img src="$1">',
'<a href="$1">$2</a>',
'<div class="code">$1</div>',
'<ul><li>$1</li></ul>',
' ',
'<div class="quote"><div class="quote-writer">$1 said:</div><div class="quote-body">$2</div></div>',
);
return preg_replace($search,$replace,$input);
}
This could be adapted with a recursive regex:
'/\[quote=(.*?)\](((?R)|.*?)+)\[\/quote\]/is'
This at least ensures that the output divs are not incorrectly nested, but you would still have to run the regex two or three times to catch all quote blocks.
Otherwise it would require rewriting your code with preg_replace_callback. I won't showcase that here, since it has come up a few dozen times already (try the site search!) and has been solved before.
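For reference, a minimal sketch of what the preg_replace_callback route might look like. The function name render_quotes is hypothetical, the pattern is a stricter variant of the recursive regex above, and the markup is copied from the question's parser. It assumes quote tags are balanced; an unbalanced tag is simply left as text.

```php
<?php
// Hypothetical helper: renders nested [quote=...] blocks in one call.
function render_quotes(string $input): string
{
    // (?R) recurses into the whole pattern, so the body may contain balanced
    // nested quotes. Any other text is consumed one character at a time,
    // and the lookahead refuses to step over a stray quote tag.
    $pattern = '/\[quote=([^\]]*)\]((?:(?R)|(?!\[quote=|\[\/quote\]).)*)\[\/quote\]/is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Render the inner quotes first, then wrap them in this quote's markup.
            $body = render_quotes($m[2]);
            return '<div class="quote"><div class="quote-writer">' . $m[1]
                . ' said:</div><div class="quote-body">' . $body . '</div></div>';
        },
        $input
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $input;
}

echo render_quotes('[quote=tanab][quote=1]text[/quote][/quote]');
```

Because the callback handles one complete quote at a time, no repeated passes are needed. Very long quote bodies may run into PCRE backtracking or recursion limits, in which case a small tokenizer is the safer choice.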
Despite setting the "white-space" property of my paragraphs to "normal" in CSS, some of my lines are breaking really strangely. Does anyone have an idea as to what may be the cause? Here is an example of this.
Thanks a lot
These lines are probably separated into different p elements.
word-break: break-all;
word-break: break-word;
word-break: keep-all;
overflow-wrap: break-word;
hyphens: manual;
hyphens: auto;
line-break: loose;
line-break: strict;
line-break: anywhere;
text-wrap: balance;
These are some styles you can apply to change the break points in text.
Upon investigation, it turns out my WordPress had been adding random &nbsp; breaks to my <p> elements. I found a solution in another thread, which involves adding the following code to my functions.php file:
function replace_content($content) {
$content = htmlentities($content, null, 'utf-8');
$content = str_replace("&nbsp;", " ", $content);
$content = html_entity_decode($content);
return $content;
}
add_filter('the_content','replace_content', 999999999);
[Credits to 'Bruno' https://wordpress.stackexchange.com/questions/29702/wordpress-automatically-adding-nbsp]
I have this table generated with PHP.
A function generates a string with all the HTML code:
<table><tr><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td><td>7</td><td>8</td><td>9</td><td>10</td></tr><tr><td>2</td><td>4</td><td>6</td><td>8</td><td>10</td><td>12</td><td>14</td><td>16</td><td>18</td><td>20</td></tr><tr><td>3</td><td>6</td><td>9</td><td>12</td><td>15</td><td>18</td><td>21</td><td>24</td><td>27</td><td>30</td></tr><tr><td>4</td><td>8</td><td>12</td><td>16</td> .... </table>
Now I want to make the numbers 1 to 10 bold. I'm trying to replace '<td>(10|[0-9])</td>' with <td style="font-weight: bold">THE-ORIGINAL-NUMBER</td>.
Thanks in advance!
P.S. I know there are a lot of similar answers out there, but I just couldn't figure it out. Is there an actually noob-friendly tutorial or glossary of regex out there? I couldn't really find a modern site.
If you are matching this regular expression:
<td>(10|[0-9])</td>
You are capturing 10|[0-9] into capture group #1. This can be referenced in your replacement with either of the following backreferences:
\1
$1
Full PHP code:
$html = '<td>1</td>';
$html = preg_replace(
'~<td>(10|[0-9])</td>~',
'<td style="font-weight: bold">\1</td>',
$html
);
Use this regex:
(?<=<td>)(10|[0-9])(?=<\/td>)
Replace the match (group #1) with:
<span class="BoldText">$1</span>
Style:
.BoldText {
font-weight: bold;
}
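A small sketch of how this lookaround version might be applied in PHP. The sample row in $html is made up for illustration, and "~" is used as the delimiter so the slash in </td> needs no escaping.

```php
<?php
// Hypothetical sample row; the real table comes from the asker's generator.
$html = '<tr><td>1</td><td>10</td><td>12</td></tr>';

// The lookbehind and lookahead assert the surrounding tags without
// consuming them, so only the number itself is replaced.
$result = preg_replace(
    '~(?<=<td>)(10|[0-9])(?=</td>)~',
    '<span class="BoldText">$1</span>',
    $html
);

echo $result;
```

The cell containing 12 is left alone because the lookahead requires </td> directly after the matched number.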
Using <b> may also be useful:
Replace
'~<td>(10|[0-9])</td>~'
with
'<td><b>\1</b></td>'
I want to remove strings like the one below from HTML code:
<span style="font-size: 0.8px; letter-spacing: -0.8px; color: #ecf6f6">3</span>
So I came up with a regex:
$pattern = "/<span style=\"font-size: \\d(\\.\\d)?px; letter-spacing: -\\d(\\.\\d)?px; color: #\\w{6}\">\\w\\w?</span>/um";
However, the regex doesn't work. Can someone point out what I did wrong? I'm new to PHP.
When I tested with a simple regex, it worked, so the problem lies in the regex itself.
$str = $_POST["txtarea"];
$pattern = $_POST["regex"];
echo preg_replace($pattern, "", $str);
As much as I would advocate DOMDocument to do the job here, you would still need some regular expression down the line, so ...
The expression for the px numeric value can be simply [\d.-]+, since you're not trying to validate anything.
The contents of the span can be simplified to [^<]* (i.e. anything but an opening angle bracket):
$re = '/<span style="font-size: [\d.-]+px; letter-spacing: [\d.-]+px; color: #[0-9a-f]{3,6}">[^<]*<\/span>/';
echo preg_replace($re, '', $str);
Do not use regex for this problem. Use an HTML parser. Here is a solution in Python with BeautifulSoup, because I like this library for these tasks:
import re
from bs4 import BeautifulSoup  # BeautifulSoup 4; the old "BeautifulSoup" module is Python 2 only

with open('Path/to/file', 'r') as content_file:
    content = content_file.read()

soup = BeautifulSoup(content, 'html.parser')
# Same style pattern as in the question, as a raw string so the backslashes survive.
style_re = re.compile(r"font-size: \d(\.\d)?px; letter-spacing: -\d(\.\d)?px; color: #\w{6}")
for span in soup.find_all('span', {'style': style_re}):
    span.extract()  # remove the matching span from the tree

with open('Path/to/file.modified', 'w') as output_file:
    output_file.write(str(soup))
You have a slash (/) in your closing tag (</span>).
You need to escape it or use a delimiter other than the slash.
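Both fixes can be sketched as follows. The sample $str is made up around the span from the question, and the patterns are the asker's own with only the delimiter handling changed (the "m" modifier is dropped because the pattern has no anchors).

```php
<?php
// Hypothetical input: the span from the question with a letter on each side.
$str = 'a<span style="font-size: 0.8px; letter-spacing: -0.8px; color: #ecf6f6">3</span>b';

// Option 1: keep "/" as the delimiter and escape the slash in </span>.
$escaped = '/<span style="font-size: \d(\.\d)?px; letter-spacing: -\d(\.\d)?px; color: #\w{6}">\w\w?<\/span>/u';

// Option 2: switch the delimiter to "~" so the slash needs no escaping.
$tilde = '~<span style="font-size: \d(\.\d)?px; letter-spacing: -\d(\.\d)?px; color: #\w{6}">\w\w?</span>~u';

$viaEscaped = preg_replace($escaped, '', $str);
$viaTilde = preg_replace($tilde, '', $str);

echo $viaEscaped, "\n", $viaTilde, "\n";
```

Single-quoted PHP strings also avoid the doubled backslashes needed in the original double-quoted pattern.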
I'm having trouble with regex, I'm an absolute regex noob. I can't see what's going wrong with trying to convert the HTML back to the 'BBCode'.
Could somebody take a look at the 'unquote' function and tell me the obvious mistake I'm making? (I know it's obvious because I always find the un-obvious errors)
NOTE: I'm not using recursive Regex because I couldn't get my head around it and already started this way round of sorting out the Quotes so they're nested.
<?php
function quote($str){
$str = preg_replace('#\[(?i)quote=(.*?)\](.*?)#si', '<div class="quote"><div class="quote-title">\\1 wrote:</div><div class="quote-inner">\\2', $str);
$str = preg_replace('#\[/(?i)quote\]#si', '</div></div>', $str);
return $str;
}
function unquote($str){
$str = preg_replace('#\<(?i)div class="quote"\>\<(?i)div class="quote_title"\>(.*?)wrote:\</(?i)div\><(?i)div class="quote-inner"\>(.*?)#si', '[quote=\\1]\\2', $str);
$str = preg_replace('#\</(?i)div\></(?i)div\>#si', '[/quote]', $str);
}
?>
This is just some code to help test it:
<html>
<head>
<style>
body {
font-family: sans-serif;
}
.quote {
background: rgba(51,153,204,0.4) url(../img/diag_1px.png);
border: 1px solid rgba(116,116,116,0.36);
padding: 5px;
}
.quote-title, .quote_title {
font-size: 18px;
margin: 5px;
}
.quote-inner {
margin: 10px;
}
</style>
</head>
<body>
<?php
$quote_text = '[quote=VCMG][quote=2xAA]DO RECURSIVE QUOTES WORK?[/quote]I have no idea.[/quote]';
$quoted = quote($quote_text);
echo $quoted.'<br><br>'.unquote($quoted); ?>
</body>
Thanks in advance, Sam.
Well, you could start by setting your class name to either quote-title or quote_title, but keep it consistent.
Then add a return $str; to your second function and you should be nearly there.
And you can simplify your regexes a little:
function quote($str){
$str = preg_replace('#\[quote=(.*)\]#siU', '<div class="quote"><div class="quote-title">\\1 wrote:</div><div class="quote-inner">', $str);
$str = preg_replace('#\[/quote\]#si', '</div></div>', $str);
return $str;
}
function unquote($str){
$str = preg_replace('#<div class="quote"><div class="quote-title">(.*) wrote:</div><div class="quote-inner">#siU', '[quote=\\1]', $str);
$str = preg_replace('#</div></div>#si', '[/quote]', $str);
return $str;
}
But beware of replacing the start and end tags of your quotes in separate calls. I think unquote can behave strangely if other BBCode also produces </div></div> in the output.
Personally, I take advantage of the fact that the resulting HTML is basically:
<div class="quote">Blah <div class="quote">INCEPTION!</div> More blah</div>
Repeatedly run the regex until there are no more matches:
do {
$str = preg_replace( REGEX , REPLACE , $str , -1 , $c);
} while($c > 0);
Also, do it as one regex to make this easier:
'(\[quote=(.*?)\](.*?)\[/quote\])is'
'<div class="quote"><div class="quote-title">$1 wrote:</div><div class="quote-inner">$2</div></div>'
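Put together, the loop might look like the sketch below. The function name quote_loop is hypothetical, "#" replaces the parentheses as delimiter for readability, and the body is taken from the second capture group.

```php
<?php
// Hypothetical wrapper around the repeat-until-no-match loop.
function quote_loop(string $str): string
{
    $regex = '#\[quote=(.*?)\](.*?)\[/quote\]#is';
    $replace = '<div class="quote"><div class="quote-title">$1 wrote:</div>'
        . '<div class="quote-inner">$2</div></div>';

    do {
        // $count receives the number of replacements made in this pass.
        $str = preg_replace($regex, $replace, $str, -1, $count);
    } while ($count > 0);

    return $str;
}

echo quote_loop('[quote=VCMG][quote=2xAA]DO RECURSIVE QUOTES WORK?[/quote]I have no idea.[/quote]');
```

Each pass pairs an opening tag with the first closing tag after it rather than with its true partner. The nesting still comes out right for balanced input, since the opening and closing markup do not depend on each other.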
I found lots of posts about extracting a filename from an img tag, but none about extracting one from an inline CSS style attribute. Here's the source string:
<span style="width: 40px; height: 30px; background-image: url("./files/foo/bar.png");" class="bar">FOO</span>
What I want to get is bar.png.
I tried this:
$pattern = "/background-image: ?.png/";
preg_match($pattern, $string, $matches);
But this didn't work out.
Any help appreciated.
You need to read up about regular expressions.
"/background-image: ?.png/"
means "background-image:" followed optionally by a space, followed by any single character, followed (directly) by "png".
Exactly what you need depends on how much variation you need to allow for in the layout of the tag, but it will be something like
'/background-image\s*:\s*url\s*\(\s*"(?:[^"]*\/)?([^\/"]+)"/'
where each "\s*" allows optional whitespace, the literal parenthesis of url( is escaped, and the capturing parentheses grab the last path segment, which contains neither a slash nor a quote.
Generally, regexp is not a good tool for parsing HTML, but in this limited case it might be OK.
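As one possible way to apply this in PHP, the sketch below captures everything inside url(...) and leaves the directory stripping to basename(). The pattern is an illustrative variant, not the only option, and it assumes the URL itself contains no quotes or closing parenthesis.

```php
<?php
$string = '<span style="width: 40px; height: 30px; background-image: url("./files/foo/bar.png");" class="bar">FOO</span>';

$file = null;
// Capture whatever sits between url( and ), minus optional quotes.
if (preg_match('/background-image\s*:\s*url\(\s*[\'"]?([^\'")]+)[\'"]?\s*\)/i', $string, $m)) {
    // basename() drops the directory part of the captured path.
    $file = basename($m[1]);
}

echo $file;
```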
$string = '<span style="width: 40px; height: 30px; background-image: url("./files/foo/bar.png");" class="bar">FOO</span>';
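// A backreference does not work inside a character class, so match lazily up to the same quote instead.
$pattern = '/background-image:\s*url\(\s*([\'"]?)(?P<file>.+?)\1\s*\)/i';
$matches = array();
if (preg_match($pattern, $string, $matches)) {
echo basename($matches['file']); // bar.png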
}
Something along these lines:
$style = "width: 40px; height: 30px; background-image: url('./files/foo/bar.png');";
preg_match("/url[\s]*\(([\'\"])([^\'\"]+)([\'\"])\)/", $style, $matches);
var_dump($matches[2]);
It won't work for filenames that contain ' or ". It basically matches anything between the parentheses of url() that is not ' or ".