preg_replace_callback with array pattern and replacement


I have a function which uses preg_replace() where the pattern and replacements are arrays. I need a counter to track the replacements, so I am converting the function to use preg_replace_callback with a closure, but I can't seem to find a way to distinguish which pattern the match being passed to the callback matches. Is there a way to do array => array replacement using preg_replace_callback?
Ideally this is what I'd like to work, but obviously it won't, since $pattern and $replace are evaluated once before the call and not in between each replacement:
function replaceTags($text)
{
    $i = 1;
    $pattern = array(
        '/\[d\](.*?)\[\/d\]/',
        '/\[s\](.*?)\[\/s\]/',
    );
    $replace = array(
        '<div id="'.($i++).'">$1</div>',
        '<span id="'.($i++).'">$1</span>',
    );
    return preg_replace($pattern, $replace, $text);
}

If I understand it correctly, you just need to maintain state between calls to your callback function. The ideal way to do this is using a member function. The state is stored in the object instance. Each time you call, you can modify the object, changing your state.
I also added an extra capture to your patterns in order to differentiate between the patterns in the callback.
<?php
class TagReplacer {
    private $counter = 0;
    public function replacer($matches) {
        // modify the state
        $this->counter++;
        // return the replacement text using the state
        if ($matches[1] === "d")
            $tag = 'div';
        else
            $tag = 'span';
        return "<{$tag} id=\"{$this->counter}\">{$matches[2]}</{$tag}>";
    }
}
function replaceTags($text) {
    $stateObject = new TagReplacer();
    $patterns = array(
        '/\[(d)\](.*?)\[\/d\]/',
        '/\[(s)\](.*?)\[\/s\]/',
    );
    return preg_replace_callback(
        $patterns,
        array(&$stateObject, "replacer"),
        $text);
}
echo replaceTags("zzz[d]123[/d]zzz[s]456[/s]zzz[d]78[/d]zzz[s]90[/s]zzz");
?>
The output is
zzz<div id="1">123</div>zzz<span id="3">456</span>zzz<div id="2">78</div>zzz<span id="4">90</span>zzz
I was surprised that the ids are not in numerical order. This happens because preg_replace_callback processes the pattern array one pattern at a time, performing every replacement for the first pattern over the whole text before moving on to the next pattern.
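If the ids should follow document order instead, one option is to merge the two patterns into a single regex so that the text is scanned only once. The following is a minimal sketch under that assumption; the function name replaceTagsInOrder and the $tags lookup table are made up for illustration and are not part of the original answer.

```php
<?php
// Sketch: one combined pattern, so matches are visited in document order.
function replaceTagsInOrder(string $text): string
{
    $counter = 0;
    // Maps the captured tag letter to the HTML element to emit.
    $tags = ['d' => 'div', 's' => 'span'];

    return preg_replace_callback(
        // \1 forces the closing tag to use the same letter as the opening tag.
        '/\[([ds])\](.*?)\[\/\1\]/',
        function (array $m) use (&$counter, $tags) {
            // $counter is captured by reference so it survives between calls.
            $counter++;
            $tag = $tags[$m[1]];
            return "<{$tag} id=\"{$counter}\">{$m[2]}</{$tag}>";
        },
        $text
    );
}

echo replaceTagsInOrder("zzz[d]123[/d]zzz[s]456[/s]zzz[d]78[/d]zzz[s]90[/s]zzz");
// zzz<div id="1">123</div>zzz<span id="2">456</span>zzz<div id="3">78</div>zzz<span id="4">90</span>zzz
```

Capturing the counter by reference in a closure also removes the need for a separate state class, although the class-based version works just as well with the combined pattern.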

Related

preg_replace_callback: regular expression search and replace

$details = "text...[book=123]...text...";
$details = preg_replace_callback(
"/\[book=(.+?)\]/smi",
function ($m) {
global $skip_books;
$book = $m[1]; // 123
$feed = $m[2]; // 456
return "<div id=\"view_book_".$book."_".$feed."\"></div>";
},
$details
);
With this pattern I can only get $book ($m[1]):
"/\[book=(.+?)\]/smi"
But I want to get $feed ($m[2]) too, so I changed the tag to [book=123_456].
How do I get "456" ($m[2]) after the underscore? Something like this?
"/\[book=(.+?)_(.+?)\]/smi"

Don't use global here; you're already using a closure, so import the variable with a use clause:
function ($m) use ($skip_books) {
// ...
}
By the way, you're not actually using $skip_books in the code you've shown so far, but I'm assuming that's because you've simplified it.
If your arguments are always numbers, don't use something generic like (.+?) but be specific (the more the better):
/\[book=(\d+)_(\d+)\]/i
I've also removed the /s and /m modifiers, which are useless here.
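Put together, the question's snippet with the stricter pattern might look like the sketch below. It assumes both arguments are always digits, as the answer suggests, and leaves out $skip_books because the question never shows how it is used.

```php
<?php
$details = "text...[book=123_456]...text...";

$details = preg_replace_callback(
    // Two digit-only groups separated by an underscore.
    '/\[book=(\d+)_(\d+)\]/i',
    function (array $m) {
        $book = $m[1]; // 123
        $feed = $m[2]; // 456
        // Braces keep PHP from reading "$book_" as a variable name.
        return "<div id=\"view_book_{$book}_{$feed}\"></div>";
    },
    $details
);

echo $details;
// text...<div id="view_book_123_456"></div>...text...
```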

PHP Spintax Processor

I've been using the recursive SpinTax processor as seen here, and it works just fine for smaller strings. However, it begins to run out of memory when the string goes beyond 20KB, and it's becoming a problem.
If I have a string like this:
{Hello|Howdy|Hola} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!
and I want to have random combinations of the words put together, and not use the technique as seen in the link above (recursing through the string until there are no more words in curly-braces), how should I do it?
I was thinking about something like this:
$array = explode(' ', $string);
foreach ($array as $k=>$v) {
if ($v[0] == '{') {
$n_array = explode('|', $v);
$array[$k] = str_replace(array('{', '}'), '', $n_array[array_rand($n_array)]);
}
}
echo implode(' ', $array);
But it falls apart when there are spaces in-between the options for the spintax. RegEx seems to be the solution here, but I have no idea how to implement it and have much more efficient performance.
Thanks!

You could create a function that uses a callback within to determine which variant of the many potentials will be created and returned:
// Pass in the string for which you'd like a random output
function random($str) {
    // Replaces each { this | that } group with one randomly chosen option
    return preg_replace_callback("/{(.*?)}/", function ($match) {
        // Splits 'foo|bar' strings into an array
        $words = explode("|", $match[1]);
        // Grabs a random array entry and returns it
        return $words[array_rand($words)];
    // The input string, which you provide when calling this function
    }, $str);
}
echo random("{Hello|Howdy|Hola} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!"), "\n";
echo random("{This|That} is so {awesome|crazy|stupid}!"), "\n";
echo random("{StackOverflow|StackExchange} solves all of my {problems|issues}."), "\n";

You can use preg_replace_callback() to specify a replacement function.
$str = "{Hello|Howdy|Hola} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!";
$replacement = function ($matches) {
$array = explode("|", $matches[1]);
return $array[array_rand($array)];
};
$str = preg_replace_callback("/\{([^}]+)\}/", $replacement, $str);
var_dump($str);
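Both answers handle flat groups only. If the spintax can be nested (an assumption, since the question only shows flat groups), a loop that repeatedly resolves the innermost groups avoids recursion entirely. The sketch below uses the optional count argument of preg_replace_callback to detect when nothing is left to replace; the function name spin is made up for illustration.

```php
<?php
// Sketch: resolve spintax iteratively, innermost groups first, without recursion.
function spin(string $text): string
{
    // [^{}]* guarantees the matched group contains no nested braces.
    $pattern = '/\{([^{}]*)\}/';

    do {
        $text = preg_replace_callback(
            $pattern,
            function (array $m) {
                $options = explode('|', $m[1]);
                // Pick one option at random.
                return $options[array_rand($options)];
            },
            $text,
            -1,      // no limit on replacements per pass
            $count   // number of replacements made in this pass
        );
    } while ($count > 0);

    return $text;
}

echo spin("{Hello|Howdy|Hola} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!"), "\n";
echo spin("{A {small|big}|The} test"), "\n";
```

For flat input the loop runs one replacing pass plus one pass that finds nothing, so the cost stays close to that of a single preg_replace_callback call.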

mb_eregi_replace multiple matches get them

$string = 'test check one two test3';
$result = mb_eregi_replace ( 'test|test2|test3' , '<$1>' ,$string ,'i');
echo $result;
This should output: <test> check one two <test3>
Is it possible to find out that test and test3 were matched, without using another match function?

You can use preg_replace_callback instead:
$string = 'test check one two test3';
$matches = array();
// Capture $matches by reference so the callback can append to the outer array
$result = preg_replace_callback('/test|test2|test3/i', function ($match) use (&$matches) {
    $matches[] = $match;
    return '<'.$match[0].'>';
}, $string);
echo $result;
Here preg_replace_callback will call the passed callback function for each match of the pattern (note that its syntax differs from POSIX). In this case the callback function is an anonymous function that adds the match to the $matches array (captured by reference) and returns the substitution string that the match is to be replaced by.
Another approach would be to use preg_split to split the string at the matched delimiters while also capturing the delimiters, which requires a capturing group around the alternation:
$parts = preg_split('/(test|test2|test3)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
The result is an array of alternating non-matching and matching parts.
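One detail worth spelling out is that PCRE tries alternatives from left to right, so with test listed first the pattern never reaches test3. A minimal sketch of the callback approach with the longer alternatives listed first follows; the variable name $found is made up for illustration.

```php
<?php
$string = 'test check one two test3';
$found = [];

$result = preg_replace_callback(
    // Longer alternatives first, otherwise "test" wins inside "test3".
    '/test3|test2|test/i',
    function (array $m) use (&$found) {
        // Record what was matched; $found is captured by reference.
        $found[] = $m[0];
        return '<' . $m[0] . '>';
    },
    $string
);

echo $result, "\n"; // <test> check one two <test3>
print_r($found);    // contains "test" and "test3"
```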

As far as I know, eregi is deprecated.
You could do something like this:
<?php
$str = 'test check one two test3';
$to_match = array("test", "test2", "test3");
$rep = array();
foreach($to_match as $val){
$rep[$val] = "<$val>";
}
echo strtr($str, $rep);
?>
This too allows you to easily add more strings to replace.

The following function checks whether every word in a list occurs in the string:
<?php
function searchword($string, $words)
{
$matchFound = count($words); // the number of words to search for
$tempMatch = 0;
foreach ( $words as $word )
{
preg_match('/'.$word.'/',$string,$matches);
//print_r($matches);
if(!empty($matches))
{
$tempMatch++;
}
}
if($tempMatch==$matchFound)
{
return "found";
}
else
{
return "notFound";
}
}
$string = "test check one two test3";
/*** an array of words to highlight ***/
$words = array('test', 'test3');
$string = searchword($string, $words);
echo $string;
?>

If your string is UTF-8, you could use preg_replace instead:
$string = 'test check one two test3';
$result = preg_replace('/(test|test2|test3)/ui', '<$1>', $string);
echo $result;
Obviously, with this kind of data the result will be suboptimal, because the shorter alternative test matches first:
<test> check one two <test>3
You'll need a longer approach than a direct search and replace with regular expressions (especially if some of your patterns are prefixes of other patterns).

To begin with, the code you want to enhance does not seem to do what it is meant to (at least not on my computer). You can try something like this:
$string = 'test check one two test3';
$result = mb_eregi_replace('(test|test2|test3)', '<\1>', $string);
echo $result;
I've removed the i flag (which of course makes little sense here). Even so, the expression still needs to be made greedy so that test3 is matched in full.
As for the original question, here's a little proof of concept:
function replace($match){
$GLOBALS['matches'][] = $match;
return "<$match>";
}
$string = 'test check one two test3';
$matches = array();
$result = mb_eregi_replace('(test|test2|test3)', 'replace(\'\1\')', $string, 'e');
var_dump($result, $matches);
Please note this code is horrible and potentially insecure. I'd honestly go with the preg_replace_callback() solution proposed by Gumbo.

PHP RegExp Variable in Function

I'm using a Function to parse UBBC and I want to use a function to find data from a database to replace text (a [user] kind of function). However the code is ignoring the RegExp Variable. Is there any way I can get it to recognise the RegExp variable?
PHP Function:
function parse_ubbc($string){
$string = $string;
$tags = array(
"user" => "#\[user\](.*?)\[/user\]#is"
);
$html = array(
"user" => user_to_display("$1", 0)
);
return preg_replace($tags, $html, $string);
}
My function uses the username of the user to get their display name, 0 denotes that it is the username being used and can be ignored for the sake of this.
Any help would be greatly appreciated.

Either rewrite your code to use preg_replace_callback, as advised,
or rewrite the regex to use the #e flag (note that this flag was deprecated in PHP 5.5 and removed in PHP 7.0):
function parse_ubbc($string){
$string = $string;
$tags = array(
"user" => "#\[user\](.*?)\[/user\]#ise"
);
$html = array(
"user" => 'user_to_display("$1", 0)'
);
return preg_replace($tags, $html, $string);
}
For that it's important that PHP does not execute the function in the replacement array immediately. That's why you have to put the function call into single quotes: 'user_to_display("$1", 0)'. preg_replace then executes it later because of the #e flag.
A significant gotcha here is that the username must never contain " double quotes, which would allow the regex placeholder $1 to break out of the evaluated function call and cause havoc. That is why you should rewrite the regex itself to use \w+ instead of .*?. Or, again, just use preg_replace_callback for safety.

You need to use preg_replace_callback if you want to source replacements from a database.

function parse_ubbc($string){
    // A closure avoids redeclaring a named function when parse_ubbc() is called twice
    return preg_replace_callback('#\[user\](.*?)\[/user\]#is', function ($m) {
        // The callback must return the replacement text
        return user_to_display($m[1], 0);
    }, $string);
}
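To try the callback approach without a database, here is a self-contained sketch. The user_to_display() body below is a hypothetical stand-in for the asker's real lookup, and the user names in it are invented.

```php
<?php
// Hypothetical stand-in for the asker's database lookup.
function user_to_display(string $username, int $mode): string
{
    $displayNames = ['jdoe' => 'John Doe'];
    // Fall back to the raw username when no display name is known.
    return $displayNames[$username] ?? $username;
}

function parse_ubbc(string $string): string
{
    return preg_replace_callback(
        '#\[user\](.*?)\[/user\]#is',
        function (array $m) {
            // $m[1] holds the actual matched username, not the literal '$1'.
            return user_to_display($m[1], 0);
        },
        $string
    );
}

echo parse_ubbc('Hello [user]jdoe[/user] and [user]nobody[/user]!');
// Hello John Doe and nobody!
```

Because the function call happens inside the callback, it runs once per match with the real captured text, which is exactly what the original preg_replace version could not do.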

You're calling user_to_display() with the string '$1', not the actual found string. Try:
function parse_ubbc($string){
$string = $string;
$tags = array(
"user" => "#\[user\](.*?)\[/user\]#ise"
);
$html = array(
"user" => 'user_to_display("$1", 0)'
);
return preg_replace($tags, $html, $string);
}
The changes are adding 'e' to the end of the regexp string, and putting the function call in quotes.

Multiple regular expression interfere

I use regex to create html tags in plain text. like this
loop
$SearchArray[] = "/\b(".preg_quote($user['name'], "/").")\b/i";
$ReplaceArray[] = '$1';
-
$str = preg_replace($SearchArray, $ReplaceArray, $str);
I'm looking for a way to not match $user['name'] in a tag.

You could use preg_replace_callback()
for 5.3+:
$callback = function($match) use ($user) {
    return ''.$match[1].'';
};
$regex = "/\b(".preg_quote($user['name'], "/").")\b/i";
$str = preg_replace_callback($regex, $callback, $str);
for 5.2+ (create_function() was deprecated in PHP 7.2 and removed in PHP 8.0):
$method = 'return \'\'.$match[1].\'\';';
$callback = create_function('$match', $method);
$regex = "/\b(".preg_quote($user['name'], "/").")\b/i";
$str = preg_replace_callback($regex, $callback, $str);

So the problem is that you're making several passes over the document, replacing a different user name in each pass, and you're afraid you'll unintentionally replace a name inside a tag that was created in a previous pass, right?
I would try to do all of the replacements in one pass, using preg_replace_callback as @ircmaxwell suggested, and one regex that can match any legal user name. In the callback function, you look up the matched string to see if it's a real user's name. If it is, return the generated link; if not, return the matched string for reinsertion.
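The single-pass idea can be sketched as follows. Everything specific here is an assumption made for illustration: the $users array, its URLs, the linkUsers() name, and the rule that a legal user name is a single run of word characters.

```php
<?php
// Hypothetical user list; names and URLs are invented for illustration.
$users = [
    'alice' => '/users/1',
    'bob'   => '/users/2',
];

function linkUsers(string $text, array $users): string
{
    return preg_replace_callback(
        // Assumption: any legal user name is a single run of word characters.
        '/\b\w+\b/',
        function (array $m) use ($users) {
            $key = strtolower($m[0]);
            if (!isset($users[$key])) {
                // Not a user name: reinsert the matched text unchanged.
                return $m[0];
            }
            return '<a href="' . $users[$key] . '">' . $m[0] . '</a>';
        },
        $text
    );
}

echo linkUsers('Alice wrote to bob about Carol.', $users);
// <a href="/users/1">Alice</a> wrote to <a href="/users/2">bob</a> about Carol.
```

Since the text is scanned only once, the generated anchors are never fed back through the regex, so a name can no longer be replaced inside a tag produced by an earlier replacement.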

It looks like you're trying to add a bunch of anchors to a document. Have you thought of using SimpleXML? This assumes that the anchor tags are part of a larger XHTML document.
//$xhtml_doc is the path to some XHTML document
$doc = simplexml_load_file($xhtml_doc);
//NOTE: find the parent element for all these anchors (maybe with xpath)
//example: $parent = $doc->xpath('//div[@id="parent"]');
foreach($user as $k => $v){
$anchor = $doc->addChild('a', $v['name']);
$anchor->addAttribute('href', $v['url']);
}
return $doc->asXML();
simpleXML helps me a lot in these situations. It'll be a lot faster than regex, even if this isn't exactly what you want to do.
