PHP Dynamic preg_replace - php

$end = preg_replace($pattern, $replacement, $str);
How can I make the replacement string $replacement vary with each match in $str? For example, I want to replace each matched string with an associated image. Something about callbacks... right?

Yes, something with callbacks. Specifically preg_replace_callback, which makes repeated calls redundant. For a list of things to replace:
$src = preg_replace_callback('/(thing1|thing2|thing3)/', 'cb_vars', $src);
Where the callback can do some form of lookup or conversion:
function cb_vars($m) {
return strtoupper($m[1]);
}
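You can likewise do that inline with plain preg_replace and the /e modifier, but /e was deprecated in PHP 5.5 and removed in PHP 7.0, so the callback is the option that still works.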

You need to use either preg_replace_callback or the /e modifier in the pattern string. The first is more powerful; the second is more convenient for something relatively simple, but it only exists in PHP versions before 7.0.
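As a minimal sketch of the callback approach for the image example in the question: the $images lookup table, its file paths, and the replace_with_images helper are hypothetical names introduced here for illustration only.

```php
<?php
// Hypothetical lookup table: matched word => image path (assumed for illustration).
$images = [
    'smile' => 'img/smile.png',
    'wink'  => 'img/wink.png',
];

function replace_with_images(string $str, array $images): string
{
    // Build the alternation from the table keys, escaping regex metacharacters.
    $words = array_map(function ($w) {
        return preg_quote($w, '/');
    }, array_keys($images));
    $pattern = '/\b(' . implode('|', $words) . ')\b/';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($images): string {
            // $m[1] holds the matched word; look up its image.
            $src = htmlspecialchars($images[$m[1]], ENT_QUOTES);
            $alt = htmlspecialchars($m[1], ENT_QUOTES);
            return '<img src="' . $src . '" alt="' . $alt . '">';
        },
        $str
    );
}

echo replace_with_images('a smile and a wink', $images), "\n";
```

The callback runs once per match, so each match can get its own replacement without calling preg_replace repeatedly.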

Related

Matching substrings with PHP preg_match_all()

I'm attempting to create a lightweight BBCode parser without hardcoding regex matches for each element. My approach uses preg_replace_callback() to process each match in a function.
The simple yet frustrating part is using the regex to capture the element's name and then handling each element differently with a switch.
Here is my regex pattern:
'~\[([a-z]+)(?:=(.*))?(?: (.*))?\](.*)(?:\[/\1\])~siU'
And here is the preg_replace_callback() I've got to test.
return preg_replace_callback(
'~\[([a-z]+)(?:=(.*))?(?: (.*))?\](.*)(?:\[/\1\])~siU',
function($matches) {
var_dump($matches);
return "<".$matches[1].">".$matches[4]."</".$matches[1].">";
},
$this->raw
);
This one issue has stumped me. The regex pattern won't seem to recursively match, meaning if it matches an element, it won't match elements inside it.
Take this BBCode for instance:
[i]This is all italics along with a [b]bold[/b].[/i]
This will only match the [i], and won't match any of the elements inside of it, so it looks like
This is all italics along with a [b]bold[/b].
preg_match_all() continues to show this to be the case, and I've tried messing with greedy syntax and modes.
How can I solve this?
Thanks to @Casimir et Hippolyte for their comment, I was able to solve this using a while loop and the count parameter like they said.
The basic regex strings don't work because I would like to use values in the tags like [color=red] or [img width=""].
Here is the finalized code. It isn't perfect but it works.
$str = $this->raw;
do {
$str = preg_replace_callback(
'~\[([a-z]+)(?:=([^]\s]*))?(?: ([^[]*))?\](.*?)(?:\[/\1\])~si',
function($matches) {
return "<".$matches[1].">".$matches[4]."</".$matches[1].">";
},
$str,
-1,
$count
);
} while ($count);
return $str;
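The loop above can be tried outside a class roughly as follows. This is only a sketch: bbcode_to_html is a hypothetical function name, and like the original it passes tag names and content through unescaped, so it is not safe for untrusted input.

```php
<?php
// Hypothetical standalone wrapper around the loop from the answer.
function bbcode_to_html(string $str): string
{
    do {
        $str = preg_replace_callback(
            '~\[([a-z]+)(?:=([^]\s]*))?(?: ([^[]*))?\](.*?)(?:\[/\1\])~si',
            function (array $matches): string {
                // Group 1 is the tag name, group 4 the content between the tags.
                return '<' . $matches[1] . '>' . $matches[4] . '</' . $matches[1] . '>';
            },
            $str,
            -1,
            $count // number of replacements made in this pass
        );
    } while ($count); // repeat until a pass finds no more tags

    return $str;
}

echo bbcode_to_html('[i]This is all italics along with a [b]bold[/b].[/i]'), "\n";
```

The first pass converts the outer [i] pair, the second pass converts the inner [b] pair, and the third pass finds nothing and ends the loop.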

Regex pattern for matching mm <sup>3<sup>

I’m trying to write a regular expression to change mm3 to mL:
<?php
$match = 'mm<sup>3</sup>';
if(preg_match('/\b(mm<sup>3</sup>)\b/', $match))
{
$replacement = 'ml';
$replac = preg_replace('/\b(mm<sup>3</sup>)\b/', $replacement, $match);
echo $replac;
}
?>
But my regular expression doesn't capture the content in $match variable, and the $replac value isn't output. What am I doing wrong?
Change:
if(preg_match('/\b(mm<sup>3</sup>)\b/',$match))
to:
if(preg_match('#\bmm<sup>3</sup>\b#',$match))
and similarly in the preg_replace call.
Since your regular expression contains /, you need to either escape it or use a different delimiter around the regular expression.
There's also no need for the parentheses, since you're not doing anything with the groups.
You need to either escape that / in your regexp (preg_quote does this for a literal string when you pass the delimiter as its second argument), or use a different delimiter (usually # is used).
Also, the \b separator after the > is not necessary, nor are parentheses since you don't seem to be doing capture; you're basically doing a more expensive str_replace.
Finally, you can do everything in one move. If there's no match, nothing will happen.
<?php
$match = 'mm<sup>3</sup>';
$replacement='ML';
$replac = preg_replace('#\\bmm<sup>3</sup>#',
$replacement,
$match);
echo $replac;
?>
If you want to be picky, I guess you should also replace with 'ml', not 'ML' :-)
(for replacement of multiple strings, preg_replace supports arrays).
Note: unless you're sure that is the exact HTML you want replaced, maybe you ought to try
$match = 'mm\\s*<sup>\\s*3\\s*</sup>';
in order to catch mm <sup>3</sup> and similar, in addition to mm<sup>3</sup> (in some circumstances they may look alike, and some editors might use or automatically "correct" either form into the other).
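A small sketch of the whitespace-tolerant variant used as an actual pattern, with # as the delimiter so the / in the closing tag needs no escaping. The function name replace_cubic_mm is hypothetical.

```php
<?php
// Hypothetical helper: replaces mm<sup>3</sup>, allowing optional whitespace
// between "mm", the tags, and the digit.
function replace_cubic_mm(string $html): string
{
    return preg_replace('#\bmm\s*<sup>\s*3\s*</sup>#', 'ml', $html);
}

echo replace_cubic_mm('5 mm<sup>3</sup> or 5 mm <sup> 3 </sup>'), "\n";
```

If the input contains no match, preg_replace returns the string unchanged, so no preg_match guard is needed.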

Regex pattern to match all images with replacement holding self PHP function

I have a preg_replace function to find all images and wrap them inside <figure> tag with different class, which depends on image source:
$pattern = '"/<img[^>]*src="([^"]+)"[^>]*alt="([^"]+)"[^>]*>\i"e';
$replacement = '"<figure class=\"" . check($1) . "\"><img src=\"$1\" alt=\"$2\" /></figure>"';
preg_replace($pattern, $replacement, $content);
Therefore, to set the right class, I want to call a function check($source) with the image source as its parameter, so that the function returns the necessary class.
As you can see in the code above, I am trying to use the e modifier, but it doesn't seem to work.
Do I have to modify my pattern and replacement?
Should I use preg_replace_all to find all the images, if they are many inside my $content variable?
You can use preg_replace_callback() for this purpose. It allows you to define and use a function for replacement. The function should expect an array of matches and it is supposed to return the replacement value.
preg_replace() with the e modifier would also do the trick on older PHP versions, but the modifier was deprecated in PHP 5.5 and removed in PHP 7.0.
Check the regular expression library, there are already some HTML image patterns.
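A sketch of how the callback version might look for this question. The body of check() below is a stand-in, since the question does not show the real rules, and wrap_images is a hypothetical name. The pattern keeps the question's assumption that src comes before alt in each tag.

```php
<?php
// Stand-in for the asker's check(); the real classification is not given.
function check(string $src): string
{
    return strpos($src, 'http') === 0 ? 'external' : 'local';
}

// Hypothetical helper that wraps every matching <img> in a <figure>.
function wrap_images(string $content): string
{
    $pattern = '/<img[^>]*src="([^"]+)"[^>]*alt="([^"]+)"[^>]*>/i';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            // $m[1] is the src value, $m[2] the alt value.
            return '<figure class="' . check($m[1]) . '">'
                 . '<img src="' . $m[1] . '" alt="' . $m[2] . '" />'
                 . '</figure>';
        },
        $content
    );
}

echo wrap_images('<p><img src="a.png" alt="A"> <img src="http://x.test/b.png" alt="B" /></p>'), "\n";
```

preg_replace_callback already processes every match in $content, so no separate "replace all" function is needed.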

using preg_match to strip specified underscore in php

There has always been a confusion with preg_match in php.
I have a string like this:
apsd_01_03s_somedescription
apsd_02_04_somedescription
Can I use preg_match to strip off everything from the 3rd underscore onward, including the 3rd underscore itself?
thanks.
Try this:
preg_replace('/^([^_]*_[^_]*_[^_]*).*/', '$1', $str)
This will take only the first three sequences that are separated by _. So everything from the third _ on will be removed.
If you want to strip the "_somedescription" part: preg_replace('/([^_]*)_([^_]*)_([^_]*)_(.*)/', '$1_$2_$3', $str);
I agree with Gumbo's answer, however, instead of using regular expressions, you can use PHP's array functions:
$s = "apsd_01_03s_somedescription";
$parts = explode("_", $s);
echo implode("_", array_slice($parts, 0, 3));
// apsd_01_03s
This method appears to execute similarly in speed, compared to a regular expression solution.
If the third underscore is the last one, you can do this:
preg_replace('/^(.+)_.+?$/', '$1', $str);

Replacing HTML attributes using a regex in PHP

OK, I know that I should use a DOM parser, but this is to stub out some code that's a proof of concept for a later feature, so I want to quickly get some functionality on a limited set of test code.
I'm trying to strip the width and height attributes from chunks of HTML, in other words, replace
width="number" height="number"
with a blank string.
The function I'm trying to write looks like this at the moment:
function remove_img_dimensions($string,$iphone) {
$pattern = "width=\"[0-9]*\"";
$string = preg_replace($pattern, "", $string);
$pattern = "height=\"[0-9]*\"";
$string = preg_replace($pattern, "", $string);
return $string;
}
But that doesn't work.
How do I make that work?
PHP is unique among the major languages in that, although regexes are specified in the form of string literals like in Python, Java and C#, you also have to use regex delimiters like in Perl, JavaScript and Ruby.
Be aware, too, that you can use single-quotes instead of double-quotes to reduce the need to escape characters like double-quotes and backslashes. It's a good habit to get into, because the escaping rules for double-quoted strings can be surprising.
Finally, you can combine your two replacements into one by means of a simple alternation:
$pattern = '/(width|height)="[0-9]*"/i';
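Put together, the function from the question might look like the sketch below. Two things here are assumptions rather than part of the answer: the unused $iphone parameter is dropped, and the pattern requires leading whitespace (\s+) so that the space before each attribute is removed along with it.

```php
<?php
function remove_img_dimensions(string $html): string
{
    // Delimiters (/.../) are required; the i flag makes the match case-insensitive.
    // \s+ is an addition to the answer's pattern: it consumes the whitespace
    // in front of the attribute so no double spaces are left behind.
    return preg_replace('/\s+(?:width|height)="[0-9]*"/i', '', $html);
}

echo remove_img_dimensions('<img src="a.png" width="100" height="50" alt="A">'), "\n";
```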
Your pattern needs the start/end pattern character. Like this:
$pattern = "/height=\"[0-9]*\"/";
$string = preg_replace($pattern, "", $string);
"/" is the usual character, but most characters would work ("|pattern|","#pattern#",whatever).
I think you're missing the delimiters (which can be //, || or various other pairs of characters) that need to surround a regular expression in the string. Try changing your $pattern assignments to this form:
$pattern = "/width=\"[0-9]*\"/";
...if you want to be able to do a case-insensitive comparison, add an 'i' at the end of the string, thus:
$pattern = "/width=\"[0-9]*\"/i";
Hope this helps!
David
