replacing text with preg_replace_callback and str_replace - php

Okay, so I'm trying to replace really long quotes in my website's comments section, which uses BBCode. What I want to do is encase long quotes in a collapse I already have coded in JS and CSS.
My problem is that it handles the first quote, then any other quotes vanish. I'm obviously missing something, but this is my first time using callbacks like this.
Here's my PHP code right now to do this:
$body = preg_replace_callback("/\[quote\](.*?)\[\/quote\]/is",
function($matches)
{
if (strlen($matches[1]) >= '1000')
{
$matches[0] = str_replace($matches[0], '<div class="box"><div class="collapse_container"><div class="collapse_header"><span>Long quote, click to expand</span></div><div class="collapse_content">' . $matches[1] . '</div></div></div>', $matches[0]);
return $matches[0];
}
}, $body);
Some example text:
[quote]aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa[/quote]
[quote]booohoo[/quote]
[quote]new quoting[/quote]
[b]test[/b]

You need to move the return $matches[0] code outside the if block:
function($matches)
{
if (strlen($matches[1]) >= '1000') {
$matches[0] = str_replace($matches[0], '<div class="box"><div class="collapse_container"><div class="collapse_header"><span>Long quote, click to expand</span></div><div class="collapse_content">' . $matches[1] . '</div></div></div>', $matches[0]);
}
return $matches[0];
}
Also, I advise unrolling your lazy-matching regex as follows:
'~\[quote\]([^[]*(?:\[(?!/quote\])[^[]*)*)\[/quote\]~i'
In the regex demos, the unrolled pattern takes 30 steps while your original pattern takes 2025.
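To see why the other quotes vanish: a PHP function that reaches its end without a return yields null, and preg_replace_callback uses that as an empty replacement string. The following is a minimal sketch of the corrected callback combined with the unrolled pattern. The function name collapseLongQuotes and the $minLength parameter are made up for this example and are not from the original code.

```php
<?php
// Sketch only: collapseLongQuotes() and $minLength are illustrative names.
function collapseLongQuotes(string $body, int $minLength = 1000): string
{
    // Unrolled pattern from the answer: matches [quote]...[/quote]
    // without relying on a lazy .*? quantifier.
    $pattern = '~\[quote\]([^[]*(?:\[(?!/quote\])[^[]*)*)\[/quote\]~i';

    return preg_replace_callback($pattern, function (array $m) use ($minLength): string {
        if (strlen($m[1]) >= $minLength) {
            return '<div class="box"><div class="collapse_container">'
                . '<div class="collapse_header"><span>Long quote, click to expand</span></div>'
                . '<div class="collapse_content">' . $m[1] . '</div></div></div>';
        }
        // Short quotes are returned unchanged; returning nothing here
        // would replace them with an empty string.
        return $m[0];
    }, $body);
}

$body = '[quote]' . str_repeat('a', 1200) . "[/quote]\n[quote]booohoo[/quote]\n[b]test[/b]";
$result = collapseLongQuotes($body);
echo $result;
```

Only the first quote is wrapped in the collapse markup; the short quote and the [b] tag pass through untouched.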

PHP's preg_* functions are greedy by default: a quantifier such as .* matches the longest possible string, so a greedy version of your regex would match everything from the first [quote] to the very last [/quote]. Your pattern already avoids this with the lazy .*? quantifier. The other way to turn greedy matching off is the "U" modifier, but do not combine it with .*?, because U inverts the greediness of every quantifier and would make .*? greedy again:
$body = preg_replace_callback("/\[quote\](.*)\[\/quote\]/isU",...);
For a list of modifiers see http://php.net/manual/en/reference.pcre.pattern.modifiers.php
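How the ? suffix and the U modifier interact can be checked directly. This is a small sketch with a made-up two-quote sample string; the variable names are only for illustration.

```php
<?php
$text = '[quote]one[/quote] and [quote]two[/quote]';

// Greedy: .* runs to the last [/quote], so there is a single match.
preg_match_all('/\[quote\](.*)\[\/quote\]/is', $text, $greedy);

// Lazy: .*? stops at the first [/quote], giving one match per quote.
preg_match_all('/\[quote\](.*?)\[\/quote\]/is', $text, $lazy);

// U with a plain .* behaves like the lazy version.
preg_match_all('/\[quote\](.*)\[\/quote\]/isU', $text, $ungreedy);

// U combined with .*? flips the quantifier back to greedy.
preg_match_all('/\[quote\](.*?)\[\/quote\]/isU', $text, $both);

print_r([$greedy[1], $lazy[1], $ungreedy[1], $both[1]]);
```

The lazy and U-only patterns each capture "one" and "two" separately, while the greedy and U-plus-lazy patterns capture a single span covering both quotes.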

Related

Find next word after colon in regex

I am getting a result as a return of a laravel console command like
Some text as: 'Nerad'
Now I tried
$regex = '/(?<=\bSome text as:\s)(?:[\w-]+)/is';
preg_match_all( $regex, $d, $matches );
but it's returning an empty result.
My guess is that something is wrong with the single quotes, and that I need to change the regex to handle them.
Any ideas?
Note that you get no match because the ' before Nerad is not matched, nor checked with the lookbehind.
If you need to check the context, but avoid including it into the match, in PHP regex, it can be done with a \K match reset operator:
$regex = '/\bSome text as:\s*\'\K[\w-]+/i';
The output array structure will be cleaner than when using a capturing group and you may check for unknown width context (lookbehind patterns are fixed width in PHP PCRE regex):
$re = '/\bSome text as:\s*\'\K[\w-]+/i';
$str = "Some text as: 'Nerad'";
if (preg_match($re, $str, $match)) {
echo $match[0];
} // => Nerad
Alternatively, anchor at the end of the string and capture the word in a group. Group 1 will contain the required string.
/:\s*'(\w+)'$/
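A short sketch of how that end-anchored pattern could be used with preg_match, reusing the sample string from the question. Note that \w+ does not match hyphens, unlike the [\w-]+ class used in the question.

```php
<?php
$str = "Some text as: 'Nerad'";

// The quotes are matched but left outside the group, so only the
// word itself ends up in $m[1].
if (preg_match('/:\s*\'(\w+)\'$/', $str, $m)) {
    echo $m[1];
}
```

Here $m[0] holds the whole match including the colon and quotes, which is why the value has to be read from $m[1].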

Pregmatch using word boundary regex not working as expected

I am using preg_match to find exact words and phrases and replace them with AHREF links. I am using word boundary regex but it is not working correctly. It is matching within words.
Example:
'rings' is being matched to 'earrings'. I don't want that. I just want 'rings'
Is my preg_match regex wrong?
$keyword="rings";
$text="women's earrings, clothing rings, earrings, rings";
if (preg_match("/\b$keyword\b/i",$text))
The asterisks below mark the words that get linked:
output = "women's ear*rings*, clothing *rings*, ear*rings*, *rings*"
expected = "women's earrings, clothing *rings*, earrings, *rings*"
update
I think the problem is in the replace function:
function str_replace_first($from, $to, $subject)
{ $from = '/'.preg_quote($from, '/').'/';
return preg_replace($from, $to, $subject,2);
}
if (preg_match_all("/\b$keyword\b/i",$text,$matches)>0)
{
print_r($matches)."<p> ";
$ahref="<a href='$anchor_url'>$keyword</a>";
$text=str_replace_first($keyword, $ahref, $text);
} ELSE {
echo "<p>no Match<br>";
}
echo $text;
Use preg_replace directly, without collecting the matches since you are not really going to use them (you only need to wrap them up with some other texts):
$keyword="rings";
$anchor_url = "http_//www.t.tt";
$url = "<a href='$anchor_url'>\$0</a>";
$text="women's earrings, clothing rings, earrings, rings";
$newtxt = preg_replace('/\b' . preg_quote($keyword, '/') . '\b/i', $url, $text);
if ($newtxt != $text) {
echo $newtxt;
} else { echo "No matches!"; }
Note that you need \b word boundaries to match a whole word. You also need to preg_quote the keyword and escape the regex delimiter, too. Then, since you are using a case insensitive regex, you cannot use $keyword hardcoded in the replacement, you need to use $0 backreference to the whole match. If you need to check if there was no match, just compare a new string with the original string.
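As an alternative to comparing the old and new strings, preg_replace accepts an optional fifth argument that receives the number of replacements made. A minimal sketch, assuming a placeholder URL (https://example.com/rings is made up for this example):

```php
<?php
$keyword = 'rings';
$text = "women's earrings, clothing rings, earrings, rings";

// preg_quote() escapes regex metacharacters and the "/" delimiter.
$pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';

// $0 refers to the whole match, so the original casing is preserved.
// -1 means "no limit"; $count receives the number of replacements.
$newText = preg_replace($pattern, '<a href="https://example.com/rings">$0</a>', $text, -1, $count);

echo $count > 0 ? $newText : 'No matches!';
```

With this input, $count is 2 and both occurrences of "earrings" are left alone.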

STRTOLOWER in PREG_REPLACE_CALLBACK not working with #hashtag link

For some reason, I can't, for the life of me, get strtolower to work properly with an anchor tag that is linking to a #Hashtag...even using preg_replace_callback().
public static function convertHashtags($str) {
$str = preg_replace_callback(
'/(\#([a-z0-9_]+))/ix',
function( $matches ) {
$uri = strtolower($matches[2]);
// return $uri;
return ''. $matches[1] .'';
}, $str, -1);
return $str;
}
All this needs to do is grab the #hashtag and turn it into a link. The URL needs to be lowercased while the #HashTag retains its original formatting.
Example:
#Palladia turns into:
#Palladia
However, I am noticing something wonky...if I put a # in the return, right before $matches[1] it works fine, but obviously displays 2 #'s. So I thought, ok then, I'll just use $matches[2] with a # in front of it. Nope, doesn't work. For whatever reason it needs that extra # in front of the #Palladia...this results in a not so ideal result:
##Palladia
Oddly enough, if I simply return strtolower($matches[2]), it does lowercase the string...it just doesn't want to work inside of the anchor tag.
Does anyone have any idea how to make it so I do not need that extra # there?
I think the confusion is coming from what is in $matches -- you have two sets of brackets, but you really only need one to capture the text after the hashtag.
I've simplified the code slightly:
public static function convertHashtags($str) {
return preg_replace_callback(
'/#([\w]+)/', // all "word" characters, all digits, and underscore
// brackets around the string AFTER the hashtag
function( $matches ) {
// $matches[0] is the complete match (including the hashtag)
// $matches[1] is the match for the subpattern enclosed in brackets
return '<a href="'. SITE_URL .'/hashtag/'
. strtolower($matches[1]) .'">'
. $matches[0] .'</a>';
}, $str);
}
convertHashtags('#Palladium')
// output: <a href="SITE_URL/hashtag/palladium">#Palladium</a>
Also works with text containing multiple hashtags:
convertHashtags("I love #StackOverflow and #Hashtags. They're awesome! #Awesomesauce");
// output: I love <a href="SITE_URL/hashtag/stackoverflow">#StackOverflow</a> and
// <a href="SITE_URL/hashtag/hashtags">#Hashtags</a>. They're awesome! <a
// href="SITE_URL/hashtag/awesomesauce">#Awesomesauce</a>

Get the number of matched characters in a regex group

I may be pushing the boundaries of Regular Expressions, but who knows...
I'm working in php.
In something like:
preg_replace('/(?:\n|^)(={3,6})([^=]+)(\1)/','<h#>$2</h#>', $input);
Is there a way to figure out how many '=' (={3,6}) matched, so I can backreference it where the '#'s are?
Effectively turning:
===Heading 3=== into <h3>Heading 3</h3>
====Heading 4==== into <h4>Heading 4</h4>
...
On PHP versions before 7.0 you can use the e modifier (deprecated in PHP 5.5 and removed in 7.0):
preg_replace('/(?:\n|^)(={3,6})([^=]+)(\1)/e',
"'<h'.strlen('$1').'>'.'$2'.'</h'.strlen('$1').'>'", $input);
No, PCRE can't do that. You should instead use preg_replace_callback and do some character counting then:
preg_replace_callback('/(?:\n|^)(={3,6})([^=]+)(\1)/', 'cb_headline', $input);
function cb_headline($m) {
list(, $markup, $text) = $m;
$n = strlen($markup);
return "<h$n>$text</h$n>";
}
Additionally you might want to be forgiving with the trailing === signs. Don't use a backreference but allow a variable number.
You might also wish to use the /m flag for your regex, so you can keep ^ in place of the more complex (?:\n|^) assertion.
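Both suggestions can be sketched together as follows. The function name convertHeadings is made up for this example, and excluding \n from the heading text is an added assumption so that a heading cannot span several lines.

```php
<?php
// Sketch only: convertHeadings() is an illustrative name.
function convertHeadings(string $input): string
{
    // With the m flag, ^ matches at the start of every line. The trailing
    // run of = signs is matched loosely (=*) instead of via a backreference.
    return preg_replace_callback('/^(={3,6})([^=\n]+)=*/m', function (array $m): string {
        $level = strlen($m[1]); // number of leading = signs
        return "<h$level>" . trim($m[2]) . "</h$level>";
    }, $input);
}

$html = convertHeadings("===Heading 3===\n====Heading 4==\ntext");
echo $html;
```

The second line has mismatched trailing signs but is still converted, using the count of leading signs as the heading level.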
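It is very simple with the e modifier, with no need for preg_replace_callback (note that e was deprecated in PHP 5.5 and removed in PHP 7.0, so this only works on older versions):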
$str = '===Heading 3===';
echo preg_replace('/(?:\n|^)(={3,6})([^=]+)(\1)/e',
'implode("", array("<h", strlen("$1"), ">$2</h", strlen("$1"), ">"));',
$str);
or this way
echo preg_replace('/(?:\n|^)(={3,6})([^=]+)(\1)/e',
'"<h".strlen("$1").">$2</h".strlen("$1").">"',
$str);
I would do it like this:
<?php
$input = '===Heading 3===';
$h_tag = preg_replace_callback('#(?:\n|^)(={3,6})([^=]+)(\1)#', 'paragraph_replace', $input);
var_dump($h_tag);
function paragraph_replace($matches) {
$length = strlen($matches[1]);
return "<h{$length}>". $matches[2] . "</h{$length}>";
}
?>
Output:
string(18) "<h3>Heading 3</h3>"

regular expression in PHP to create wiki-style links

I'm developing a site which is going to use wiki-style links to internal content eg [[Page Name]]
I'm trying to write a regex to achieve this and I've got as far as turning it into a link and replacing spaces with dashes (this is our space substitute rather than underscores) but only for page names of two words.
I could write a separate regex for all likely numbers of words (say from 10 downwards) but I'm sure there must be a neater way of doing it.
Here's what I have at the moment:
$regex = "#[\[][\[]([^\s\]]*)[\s]([^\s\]]*)[\]][\]]#";
$description = preg_replace($regex,"$1 $2",$description);
If someone can advise me how I can modify this regex so it works for any number of words that would be really helpful.
You can use the preg_replace_callback() function which accepts a callback to process the replacement string. You can also use lazy quantifiers in the pattern instead of a lot of negations inside character classes.
The external preg_replace_callback will extract the matched text and pass it to the callback function, which will return the properly modified version.
$str = '[[Page Name with many words]]';
echo preg_replace_callback('/\[\[(.*?)\]\]/', 'parse_tags', $str);
function parse_tags($match) {
$text = $match[1];
$slug = preg_replace('/\s+/', '-', $text);
return '<a href="' . $slug . '">' . $text . '</a>'; // assumed link format: slug as href
}
You should use a callback function to do the replacement (using preg_replace_callback):
$str = preg_replace_callback('/\[\[([^\]]+)\]\]/', function($matches) {
return '<a href="' . preg_replace('/\s+/', '-', $matches[1]) . '">' . $matches[1] . '</a>';
}, $str);
