SyntaxHighlighter BBCode PHP

I'm having some problems with the BBCode I created to use with SyntaxHighlighter:
function bb_parse_code($str) {
    while (preg_match_all('`\[(code)=?(.*?)\]([\s\S]*)\[/code\]`', $str, $matches)) {
        foreach ($matches[0] as $key => $match) {
            list($tag, $param, $innertext) = array($matches[1][$key], $matches[2][$key], $matches[3][$key]);
            switch ($tag) {
                case 'code':
                    $replacement = '<pre class="brush: '.$param.'">'.str_replace(" ", " ", str_replace(array("<br>", "<br />"), "\n", $innertext))."</pre>";
                    break;
            }
            $str = str_replace($match, $replacement, $str);
        }
    }
    return $str;
}
And I have the bbcode:
[b]bold[/b]
[u]underlined[/u]
[code=js]function (lol) {
alert(lol);
}[/code]
[b]bold2[/b]
[code=php]
<? echo 'lol' ?>
[/code]
The result is not what I expect: the first [code] block swallows everything up to the last [/code].
I know the problem is in the ([\s\S]*) part of the regex, which matches any character, but how do I make the code work with line breaks?

You should use the following pattern:
`\[(code)=?(.*?)\](.*?)\[/code\]`s
A couple of changes:
The switch to .*? makes the quantifier lazy, so each match stops at the first [/code] instead of the last one.
The s modifier at the end causes . to match newlines too.
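As a rough sketch of how that pattern could be used, the loop from the question can be collapsed into a single preg_replace_callback call. The callback body and the htmlspecialchars call on the brush name are assumptions added for illustration; they are not part of the original answer.

```php
<?php
// Sketch: convert [code=lang]...[/code] blocks into SyntaxHighlighter <pre> tags.
function bb_parse_code(string $str): string
{
    return preg_replace_callback(
        // Lazy groups plus the s modifier: each block ends at its own [/code].
        '`\[code=?(.*?)\](.*?)\[/code\]`s',
        function (array $m): string {
            // Escape the brush name before placing it in an attribute (assumption).
            $brush = htmlspecialchars(trim($m[1]), ENT_QUOTES);
            // Turn <br> variants back into real newlines, as in the question.
            $body = str_replace(['<br>', '<br />'], "\n", $m[2]);
            return '<pre class="brush: ' . $brush . '">' . $body . '</pre>';
        },
        $str
    );
}

$input = "[b]bold[/b]\n[code=js]alert(1);[/code]\n[b]bold2[/b]\n[code=php]\necho 'lol';\n[/code]";
echo bb_parse_code($input);
```

Because the inner group is lazy, each [code] block is converted on its own and the [b]bold2[/b] between the two blocks is left untouched.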

Related

PHP Preg Replace. Remove strings inside {~ string ~} pattern, but skip <pre>{~ string ~}</pre> [duplicate]

I am using a WordPress plugin named Acronyms (https://wordpress.org/plugins/acronyms/). This plugin replaces acronyms with their description. It uses PHP's preg_replace function.
The issue is that it also replaces acronyms inside a <pre> tag, which I use to present source code.
Could you modify this expression so that it won't replace acronyms contained inside <pre> tags (not only as direct children, but at any depth)? Is it possible?
The PHP code is:
$text = preg_replace(
"|(?!<[^<>]*?)(?<![?.&])\b$acronym\b(?!:)(?![^<>]*?>)|msU"
, "<acronym title=\"$fulltext\">$acronym</acronym>"
, $text
);

You can use a PCRE SKIP/FAIL regex trick (also works in PHP) to tell the regex engine to only match something if it is not inside some delimiters:
(?s)<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b$acronym\b
This means: skip all substrings starting with <pre> and ending with </pre>, and only then match $acronym as a whole word.
Here is a sample PHP demo:
<?php
$acronym = "ASCII";
$fulltext = "American Standard Code for Information Interchange";
$re = "/(?s)<pre[^<]*>.*?<\\/pre>(*SKIP)(*F)|\\b$acronym\\b/";
$str = "<pre>ASCII\nSometext\nMoretext</pre>More text \nASCII\nMore text<pre>More\nlines\nASCII\nlines</pre>";
$subst = "<acronym title=\"$fulltext\">$acronym</acronym>";
$result = preg_replace($re, $subst, $str);
echo $result;
Output:
<pre>ASCII
Sometext
Moretext</pre>More text 
<acronym title="American Standard Code for Information Interchange">ASCII</acronym>
More text<pre>More
lines
ASCII
lines</pre>
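The demo interpolates $acronym and $fulltext directly into the pattern and the replacement. A slightly more defensive variant might look like the sketch below. The function name wrap_acronym is hypothetical, and the use of preg_quote, htmlspecialchars, and the <pre\b[^>]*> opening-tag pattern are assumptions layered on top of the answer's SKIP/FAIL idea.

```php
<?php
// Hypothetical helper: wrap $acronym in <acronym> everywhere except inside <pre>...</pre>.
function wrap_acronym(string $text, string $acronym, string $fulltext): string
{
    // Quote the acronym so regex metacharacters in it are matched literally.
    $word = preg_quote($acronym, '/');
    // First alternative consumes a whole <pre> block and then fails, so the
    // engine skips past it; the second alternative matches the acronym itself.
    $re = '/<pre\b[^>]*>.*?<\/pre>(*SKIP)(*F)|\b' . $word . '\b/s';
    $title = htmlspecialchars($fulltext, ENT_QUOTES);

    // A callback avoids any special handling of $ or \ in the replacement text.
    return preg_replace_callback($re, function (array $m) use ($title): string {
        return '<acronym title="' . $title . '">' . $m[0] . '</acronym>';
    }, $text);
}

echo wrap_acronym(
    '<pre class="x">HTML</pre> plain HTML',
    'HTML',
    'HyperText Markup Language'
);
// <pre class="x">HTML</pre> plain <acronym title="HyperText Markup Language">HTML</acronym>
```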

Another option is preg_split: split the text into HTML tags and the text between them, track whether the current part is inside a <code> or <pre> block, transform only the parts outside those blocks, and join everything back into one string:
function replace($s) {
    return str_replace('"', '"', $s); // do something with `$s`
}
$text = 'Your text goes here...';
// Split on HTML tags and keep the tags themselves as separate parts.
$parts = preg_split('#(<\/?[-:\w]+(?:\s[^<>]+?)?>)#', $text, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
$text = "";
$x = 0; // 1 while inside <code> or <pre>, 0 otherwise
foreach ($parts as $v) {
    if (trim($v) === "") {
        $text .= $v;
        continue;
    }
    if ($v[0] === '<' && substr($v, -1) === '>') {
        if (preg_match('#^<(\/)?(?:code|pre)(?:\s[^<>]+?)?>$#', $v, $m)) {
            $x = isset($m[1]) && $m[1] === '/' ? 0 : 1;
        }
        $text .= $v; // this is an HTML tag…
    } else {
        $text .= !$x ? replace($v) : $v; // process or skip…
    }
}
echo $text;

Checking pattern as much as possible

How can I make preg_match_all find every possible match for a regular expression pattern, including overlapping ones?
Here's the code:
<?php
$text = 'Amazing analyzing.';
$regexp = '/(^|\\b)([\\S]*)(a)([\\S]*)(\\b|$)/ui';
$matches = array();
if (preg_match_all($regexp, $text, $matches, PREG_SET_ORDER)) {
foreach ($matches as $match) {
echo "{$match[2]}[{$match[3]}]{$match[4]}\n";
}
}
?>
Output:
Am[a]zing
an[a]lyzing.
Output that I need:
[A]mazing
Am[a]zing
[A]nalyzing.
an[a]lyzing.

You have to use lookbehind/lookahead zero-length assertions (instead of a normal pattern, which consumes the characters around what you are looking for): http://www.regular-expressions.info/lookaround.html

Lookaround assertions won't help, for two reasons:
Since they are zero-length, they won't return characters that you need.
As Avinash Raj noted, PHP lookbehind doesn't allow the * quantifier.
This yields the output that you need:
<?php
$text = 'Amazing analyzing.';
foreach (preg_split('/\s+/', $text) as $word)
{
    $matches = preg_split('/(a)/i', $word, 0, PREG_SPLIT_DELIM_CAPTURE);
    for ($match = 1; $match < count($matches); $match += 2)
    {
        $prefix = join(array_slice($matches, 0, $match));
        $suffix = join(array_slice($matches, $match+1));
        echo "{$prefix}[{$matches[$match]}]{$suffix}\n";
    }
}
?>

Ucfirst all strings within <strong> tag

I am trying to apply ucfirst to every string inside <strong> tags in a sentence. I tried this without any luck:
function getTextBetweenTags($string, $tagname)
{
    $pattern = "/<$tagname>(.*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return ucfirst($matches[1]);
}
$sentence = "Yellow pitty lies <strong>about</strong> the life.";
$finalsentence = getTextBetweenTags($sentence,"strong");
What is the correct way to do that?

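There is a simpler way. Instead of PHP you could use only CSS. Note that ::first-letter applies only to block containers, so the inline <strong> element needs display: inline-block for the rule to take effect:
strong {
    display: inline-block;
}
strong::first-letter {
    text-transform: capitalize;
}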

You need to include matching for the text before and after the tags.
function getTextBetweenTags($string, $tagname)
{
    $pattern = "/(.*<$tagname>)(.*?)(<\/$tagname>.*)/";
    preg_match($pattern, $string, $matches);
    return $matches[1] . ucfirst($matches[2]) . $matches[3];
}
$sentence = "Yellow pitty lies <strong>about</strong> the life.";
$finalsentence = getTextBetweenTags($sentence,"strong");
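The function above rewrites a single match, and because its leading .* is greedy, that match is the last <strong> element in the string. Since the question asks about all of them, a variant based on preg_replace_callback may be closer to what is wanted. This is only a sketch: the name ucfirstInsideTags is hypothetical, and it assumes tags without attributes, as in the question.

```php
<?php
// Hypothetical helper: apply ucfirst to the text of every <tag>...</tag> pair.
function ucfirstInsideTags(string $string, string $tagname): string
{
    $tag = preg_quote($tagname, '/');
    // Lazy inner group so each element is handled separately.
    $pattern = '/(<' . $tag . '>)(.*?)(<\/' . $tag . '>)/s';

    return preg_replace_callback($pattern, function (array $m): string {
        return $m[1] . ucfirst($m[2]) . $m[3];
    }, $string);
}

$sentence = "Yellow pitty lies <strong>about</strong> the <strong>life</strong>.";
echo ucfirstInsideTags($sentence, "strong");
// Yellow pitty lies <strong>About</strong> the <strong>Life</strong>.
```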

highlighting words at the end of a word

I'm not sure how I could have phrased the title better, but my issue is that the highlight function doesn't highlight search keywords that appear at the end of a word. For example, if the search keyword is 'self', it will highlight 'self' or 'self-lessness' or 'Self' (with a capital S), but it will not highlight the 'self' in 'yourself' or 'himself', etc.
This is the highlight function:
function highlightWords($text, $words) {
    preg_match_all('~\w+~', $words, $m);
    if(!$m)
        return $text;
    $re = '~\\b(' . implode('|', $m[0]) . ')~i';
    $string = preg_replace($re, '<span class="highlight">$0</span>', $text);
    return $string;
}

It seems you might have a \b at the beginning of your regex, which means a word boundary. Since the 'self' in 'yourself' doesn't start at a word boundary, it doesn't match. Get rid of the \b.
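A minimal sketch of the question's function with the leading \b removed could look like this. Two further changes are assumptions on my part rather than part of the answer: the emptiness check tests $m[0] (the original !$m is never true, because preg_match_all always fills $m with an array), and each keyword goes through preg_quote as a precaution.

```php
<?php
// Sketch: highlight every occurrence of the keywords, even inside longer words.
function highlightWords(string $text, string $words): string
{
    preg_match_all('~\w+~', $words, $m);
    if (!$m[0]) {
        return $text; // no keywords given
    }
    // Quote each keyword for use inside a ~-delimited pattern.
    $alternatives = array_map(function (string $w): string {
        return preg_quote($w, '~');
    }, $m[0]);
    // No \b here, so 'self' also matches within 'yourself'.
    $re = '~(' . implode('|', $alternatives) . ')~i';

    return preg_replace($re, '<span class="highlight">$0</span>', $text);
}

echo highlightWords('Be yourself, not selfish.', 'self');
// Be your<span class="highlight">self</span>, not <span class="highlight">self</span>ish.
```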

Try something like this:
function highlight($text, $words) {
    if (!is_array($words)) {
        $words = preg_split('#\\W+#', $words, -1, PREG_SPLIT_NO_EMPTY);
    }
    $regex = '#\\b(\\w*(';
    $sep = '';
    foreach ($words as $word) {
        $regex .= $sep . preg_quote($word, '#');
        $sep = '|';
    }
    $regex .= ')\\w*)\\b#i';
    return preg_replace($regex, '<span class="highlight">\\1</span>', $text);
}
$text = "isa this is test text";
$words = array('is');
echo highlight($text, $words); // <span class="highlight">isa</span> <span class="highlight">this</span> <span class="highlight">is</span> test text
The loop is there so that every search word is properly quoted with preg_quote. Note that this version highlights the whole word containing a keyword, not just the keyword itself.
EDIT: Modified the function to take either a string or an array in the $words parameter.
