How to match this specific string in RE? - php

Once again I'm stuck on a regular expression. I can't find any good material for learning the more advanced usage.
I'm trying to match [image width="740" height="249" parameters=""]51lca7dn56.jpg[/image] to $cache->image_tag("$4", $1, $2, "$3").
Everything works great if all the [image] parameters are there, but I need it to match, even if something is missing. So for example [image width="740"]51lca7dn56.jpg[/image].
Current code is:
$text = preg_replace('#\[image width=\"(.*?)\" height=\"(.*?)\" parameters=\"(.*?)\"\](.*?)\[/image\]#e', '$cache->image_tag("$4", $1, $2, "$3")', $text);
Regular expressions are the one thing that always gets me stuck, so if anybody could also point me to a good resource so that I can handle these kinds of issues myself, it would be much appreciated.
My naive version of what I'm trying to do is this:
// match only [image]
$text = preg_replace('#\[image\](.*?)\[/image\]#si', '$cache->image_tag("$1", 0, 0, "")', $text);
// match only width
$text = preg_replace('#\[image width=\"(.*?)\"\](.*?)\[/image\]#si', '$cache->image_tag("$2", $1, 0, "")', $text);
// match only width and height
$text = preg_replace('#\[image width=\"(.*?)\" height=\"(.*?)\"\](.*?)\[/image\]#si', '$cache->image_tag("$3", $1, $2, "")', $text);
// match all parameters
$text = preg_replace('#\[image width=\"(.*?)\" height=\"(.*?)\" parameters=\"(.*?)\"\](.*?)\[/image\]#si', '$cache->image_tag("$4", $1, $2, $3)', $text);
(This code doesn't actually work as expected, but it should make my point clearer.) Basically, I hope to fold this whole mess into a single regex call.
Final code tested and working based on Ωmega's answer:
// Match: [image width="740" height="249" parameters="bw"]51lca7dn56.jpg[/image]
$text = preg_replace('#\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]#si', '$cache->image_tag("$4", $1, $2, "$3")', $text); // the modifiers are #si here so that it is easier to debug; in reality it is #e
However, if width or height is missing, the corresponding group comes back as an empty string, not NULL. So I adopted drew's idea of using preg_replace_callback():
$text = preg_replace_callback('#\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]#', function ($matches) {
// An anonymous function is used here because create_function() was removed in PHP 8.
global $cache;
return $cache->image_tag($matches[4], ($matches[1] ? $matches[1] : 0), ($matches[2] ? $matches[2] : 0), $matches[3]);
}, $text);

Maybe try a regex like this instead which tries to grab extra params in the image tag (if any). This way, the parameters can be in any order with any combination of included and omitted parameters:
$string = 'this is some code and it has bbcode in it like [image width="740" height="249" parameters=""]51lca7dn56.jpg[/image] for example.';
if (preg_match('/\[image([^\]]*)\](.*?)\[\/image\]/i', $string, $match)) {
var_dump($match);
}
Resulting match:
array(3) {
[0]=>
string(68) "[image width="740" height="249" parameters=""]51lca7dn56.jpg[/image]"
[1]=>
string(39) " width="740" height="249" parameters="""
[2]=>
string(14) "51lca7dn56.jpg"
}
So you can then examine $match[1] and parse out the parameters. You may need to use preg_replace_callback to implement the logic inside the callback.
Hope that helps.
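As a minimal sketch of that idea, the callback below collects every name="value" pair from the captured attribute string and falls back to defaults for anything that is missing. The functions replace_image_tags() and image_tag() are hypothetical: image_tag() is only a stand-in for the question's $cache->image_tag(), whose real signature and output are not shown.

```php
<?php
// Hypothetical stand-in for $cache->image_tag(); the real method is not shown in the question.
function image_tag(string $file, int $width, int $height, string $parameters): string
{
    return sprintf('<img src="%s" width="%d" height="%d" data-params="%s">', $file, $width, $height, $parameters);
}

function replace_image_tags(string $text): string
{
    return preg_replace_callback(
        '/\[image([^\]]*)\](.*?)\[\/image\]/is',
        function (array $m): string {
            // Collect every name="value" pair from the attribute string, in any order.
            preg_match_all('/(\w+)="([^"]*)"/', $m[1], $pairs, PREG_SET_ORDER);
            $attrs = [];
            foreach ($pairs as $pair) {
                $attrs[strtolower($pair[1])] = $pair[2];
            }
            // Missing attributes fall back to 0 or an empty string.
            return image_tag(
                $m[2],
                (int) ($attrs['width'] ?? 0),
                (int) ($attrs['height'] ?? 0),
                $attrs['parameters'] ?? ''
            );
        },
        $text
    );
}

echo replace_image_tags('[image width="740"]51lca7dn56.jpg[/image]'), "\n";
// <img src="51lca7dn56.jpg" width="740" height="0" data-params="">
```

Parsing the attributes in a second step keeps the outer pattern short, at the cost of one extra preg_match_all() call per tag.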

I would suggest using this regex:
\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]
Edit:
$string = 'this is some code and it has bbcode in it like [image width="740" height="249" parameters=""]51lca7dn56.jpg[/image] for example and [image parameters="" height="123" width="456"]12345.jpg[/image].';
if (preg_match_all('/\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]/i', $string, $match) > 0) {
var_dump($match);
}
Output:
array(5) {
[0]=>
array(2) {
[0]=>
string(68) "[image width="740" height="249" parameters=""]51lca7dn56.jpg[/image]"
[1]=>
string(63) "[image parameters="" height="123" width="456"]12345.jpg[/image]"
}
[1]=>
array(2) {
[0]=>
string(3) "740"
[1]=>
string(3) "456"
}
[2]=>
array(2) {
[0]=>
string(3) "249"
[1]=>
string(3) "123"
}
[3]=>
array(2) {
[0]=>
string(0) ""
[1]=>
string(0) ""
}
[4]=>
array(2) {
[0]=>
string(14) "51lca7dn56.jpg"
[1]=>
string(9) "12345.jpg"
}
}
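For the replacement itself, the same lookahead pattern can be handed to preg_replace_callback(). This is only a sketch: image_tag() is a hypothetical stand-in for the question's $cache->image_tag() and simply echoes its arguments back, and the sample text is made up. Groups 1 to 3 come back as empty strings when an attribute is absent, so casting width and height to int turns a missing value into 0.

```php
<?php
// Hypothetical stand-in for $cache->image_tag(); it only echoes its arguments back.
function image_tag(string $file, int $width, int $height, string $parameters): string
{
    return sprintf('%s|%d|%d|%s', $file, $width, $height, $parameters);
}

// Same pattern as above, split over two lines for readability.
$pattern = '#\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))'
         . '(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[/image\]#i';

$text = 'A [image height="123" width="456"]12345.jpg[/image] and B [image]plain.jpg[/image]';

$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // Groups 1-3 are empty strings when the attribute is absent; (int) '' is 0.
        return image_tag($m[4], (int) $m[1], (int) $m[2], $m[3]);
    },
    $text
);

echo $result, "\n";
// A 12345.jpg|456|123| and B plain.jpg|0|0|
```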

Related

preg_replace replace only once even if the match is found

My HTML form code replaces some words with <-#word#-> using the code
$string = preg_replace("/($p)/i", '<-#$1#->', $string);
The problem is that if the form has errors, the word becomes <-#<-#<-#word#->#->#-> as the wrapper is added again every time someone resubmits the form. Is it possible to do the replacement only if the word has not already been replaced?
This is what I tried, using ^ as a NOT operator, but it does not work:
$string = preg_replace("/^(<-#)($p)^(#->)/i", '<-#$1#->', $string);
You could use negative lookarounds to assert that what is directly on the left is not <-# and what is directly on the right is not #->:
(?<!<-#)(word)(?!#->)
Your code could look like:
$string = preg_replace("/(?<!<-#)($p)(?!#->)/i", '<-#$1#->', $string);
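A quick way to check that the lookaround version is safe to run repeatedly is to apply it twice and compare the results. In this sketch the helper name wrap_word() and the sample text are made up for illustration; preg_quote() is added on the assumption that the word comes from user input.

```php
<?php
// Wrap every occurrence of $word in <-# ... #->, skipping occurrences that are already wrapped.
function wrap_word(string $text, string $word): string
{
    $p = preg_quote($word, '/'); // escape regex metacharacters in the user-supplied word
    return preg_replace("/(?<!<-#)($p)(?!#->)/i", '<-#$1#->', $text);
}

$once  = wrap_word('a word here', 'word');
$twice = wrap_word($once, 'word'); // simulates resubmitting the form

var_dump($once);            // string(17) "a <-#word#-> here"
var_dump($once === $twice); // bool(true)
```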
Another method might be to use preg_match_all() to find any repeated wrappers and rebuild the string with a single one:
$string = '<-#<-#<-#Any alphanumeric input that user may wish#->#->#->';
preg_match_all("/(<-#)+([A-Za-z0-9_\s]+)(#->)+/s", $string, $matches);
$string = '<-#' . $matches[2][0] . '#->';
var_dump($string);
which outputs:
string(47) "<-#Any alphanumeric input that user may wish#->"
var_dump($matches); would return:
array(4) {
[0]=>
array(1) {
[0]=>
string(59) "<-#<-#<-#Any alphanumeric input that user may wish#->#->#->"
}
[1]=>
array(1) {
[0]=>
string(3) "<-#"
}
[2]=>
array(1) {
[0]=>
string(41) "Any alphanumeric input that user may wish"
}
[3]=>
array(1) {
[0]=>
string(3) "#->"
}
}

Multiple words between curly brackets in PHP

I have the following string:
$string = "Hello from {me} to {you}";
What I want is an array with the words between the curly brackets (without the curly brackets, of course):
array(2) {
[0]=>
string(2) "me"
[1]=>
string(3) "you"
}
I tried the following pattern but it only shows one word (with the brackets) selected.
/\{([^}]+)\}/
or
/\{(\s*?.*?)*?\}/
I am new to regular expressions.
Thanks
Use preg_match_all. In the code below, $results is what you're looking for:
$raw_string = "Hello from {me} to {you}";
$pattern = "/{(.*?)}/"; //will match everything in { }
if(preg_match_all($pattern,$raw_string,$matches)):
$results = $matches[1];
else:
//no matches
endif;
You need to use the third parameter in preg_match_all to get the matched values in an array.
<?php
$string = "Hello from {me} to {you}";
preg_match_all('/\{([^}]+)\}/', $string, $matches);
var_dump($matches);
?>
Which produces,
array(2) {
[0]=>
array(2) {
[0]=>
string(4) "{me}"
[1]=>
string(5) "{you}"
}
[1]=>
array(2) {
[0]=>
string(2) "me"
[1]=>
string(3) "you"
}
}
To get the clean version,
echo $matches[1][$yourKey]; // e.g. $yourKey = 0 gives "me"
Reading Material
preg_match_all();
$string = "Hello from {me} to {you}";
preg_match_all('/{([^}]+)}/', $string, $matches);
print_r($matches[count($matches)-1]);

Regexp for string which shouldn't contain two known chars

For example
I have a string like "12345%67890"
Regexp [^%]* gives me 12345.
How do I get the same result if the delimiter is not "%" but "<%", for example? Thanks a lot.
A bit more information:
I have a huge text in which I make replacements between % delimiters, for example changing %test% to something else using preg_match_all and preg_replace. But if % is used as something other than a separator, everything breaks, e.g. %test 90% test%. So I've decided to change % to something more complicated, like <% test 90% test %>.
Based on your new information it sounds like you control the output, which makes this all kind of weird.
In any case, here's a regex that will capture the contents of the wrapper you've created:
<%(.+?)%>
Notice the ? for a lazy match.
Code sample:
$string = "asdfar <%test123%>farasr%<5 sara><%90% is cool%%><%ooooaaaah%>>>%<%>%%";
preg_match_all('/<%(.+?)%>/', $string, $matches);
var_dump($matches);
Output:
array(2) {
[0]=>
array(3) {
[0]=>
string(11) "<%test123%>"
[1]=>
string(16) "<%90% is cool%%>"
[2]=>
string(13) "<%ooooaaaah%>"
}
[1]=>
array(3) {
[0]=>
string(7) "test123"
[1]=>
string(12) "90% is cool%"
[2]=>
string(9) "ooooaaaah"
}
}
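To see why the lazy quantifier matters here, compare the greedy and lazy forms on a short made-up string. This is just a sketch; the sample string is not from the question.

```php
<?php
$string = 'x <%a%> y <%b%> z';

preg_match_all('/<%(.+)%>/', $string, $greedy); // greedy: runs on to the last %>
preg_match_all('/<%(.+?)%>/', $string, $lazy);  // lazy: stops at the first %>

print_r($greedy[1]); // one element: "a%> y <%b"
print_r($lazy[1]);   // two elements: "a" and "b"
```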
Seems to me you should be doing a split, not a match:
$subject = "12345<%67890";
$result = preg_split('/<%/', $subject);
print_r($result);
output:
Array
(
[0] => 12345
[1] => 67890
)

Regex quantified capture

php > preg_match("#/m(/[^/]+)+/t/?#", "/m/part/other-part/t", $m);
php > var_dump($m);
array(2) {
[0]=>
string(20) "/m/part/other-part/t"
[1]=>
string(11) "/other-part"
}
php > preg_match_all("#/m(/[^/]+)+/t/?#", "/m/part/other-part/t", $m);
php > var_dump($m);
array(2) {
[0]=>
array(1) {
[0]=>
string(20) "/m/part/other-part/t"
}
[1]=>
array(1) {
[0]=>
string(11) "/other-part"
}
}
With this example I would like the capture group to give me both /part and /other-part. Unfortunately, the regex /m(/[^/]+)+/t/? doesn't capture both, as I expected it to.
The solution should not be tied to this one sample. It should capture an arbitrary number of repetitions of the group, e.g. /m/part/other-part/and-another/more/t
UPDATE:
Given that this is expected behavior, my question still stands: how can I achieve this kind of matching?
Try this one out:
preg_match_all("#(?:/m)?/([^/]+)(?:/t)?#", "/m/part/other-part/another-part/t", $m);
var_dump($m);
It gives:
array(2) {
[0]=>
array(3) {
[0]=>
string(7) "/m/part"
[1]=>
string(11) "/other-part"
[2]=>
string(15) "/another-part/t"
}
[1]=>
array(3) {
[0]=>
string(4) "part"
[1]=>
string(10) "other-part"
[2]=>
string(12) "another-part"
}
}
//EDIT
IMO the best way to do what you want is to use the preg_match() from @stema's answer and explode the result on / to get the list of parts you want.
That's how capturing groups work: a repeated capturing group stores only its last iteration once the regex has finished. In your test that is "/other-part".
Try this instead
/m((?:/[^/]+)+)/t/?
See it here on Regexr, while hovering over the match, you can see the content of the capturing group.
Just make your group non-capturing by adding a ?: at the start and put another one around the whole repetition.
In php
preg_match_all("#/m((?:/[^/]+)+)/t/?#", "/m/part/other-part/t", $m);
var_dump($m);
Output:
array(2) {
[0]=> array(1) {
[0]=>
string(20) "/m/part/other-part/t"
}
[1]=> array(1) {
[0]=>
string(16) "/part/other-part"
}
}
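Once the whole repetition is in a single group, the individual parts can be recovered with explode(), as suggested above. A minimal sketch, using the longer sample path from the question; the variable names are arbitrary.

```php
<?php
$subject = '/m/part/other-part/and-another/more/t';
$parts = [];

// Group 1 holds the whole repeated section: "/part/other-part/and-another/more".
if (preg_match('#/m((?:/[^/]+)+)/t/?#', $subject, $m)) {
    // Drop the leading slash, then split on the remaining ones.
    $parts = explode('/', ltrim($m[1], '/'));
}

print_r($parts); // part, other-part, and-another, more
```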
As already written in a comment, you can't do this in one pass because preg_match does not return every iteration of a repeated subgroup (as .NET does through its capture collections; see Get repeated matches with preg_match_all()). So you can divide the operation into multiple steps:
1. Match the subject and extract the part you're interested in.
2. Run a second match on the extracted part only.
Code:
$subject = '/m/part/other-part/t';
$subpattern = '/[^/]+';
$pattern = sprintf('~/m(?<path>(?:%s)+)/t/?~', $subpattern);
$r = preg_match($pattern, $subject, $matches);
if (!$r) return;
$r = preg_match_all("~$subpattern~", $matches['path'], $matches);
var_dump($matches);
Output:
array(1) {
[0]=>
array(2) {
[0]=>
string(5) "/part"
[1]=>
string(11) "/other-part"
}
}

preg_match not returning expected results

I'm attempting to use a regexp to parse a search string that may from time to time contain special syntax. The syntax I'm looking for is [special keyword : value], and I want each match put into an array. Keep in mind that the search string will contain other text that is not intended to be parsed.
$searchString = "[StartDate:2010-11-01][EndDate:2010-11-31]";
$specialKeywords = array();
preg_match("/\[{1}.+\:{1}.+\]{1}/", $searchString, $specialKeywords);
var_dump($specialKeywords);
Output:
array(1) { [0]=> string(42) "[StartDate:2010-11-01][EndDate:2010-11-31]" }
Desired Output:
array(2) { [0]=> string(22) "[StartDate:2010-11-01]"
[1]=> string(20) "[EndDate:2010-11-31]"}
Please let me know if I am not being clear enough.
Your .+ matches across the boundaries between the two [...] parts because it matches any character, and as many of them as possible. You could be more restrictive about which characters may be matched. Also {1} is redundant and can be dropped.
/\[[^:]*:[^\]]*\]/
should work more reliably.
Explanation:
\[ # match a [
[^:]* # match any number of characters except :
: # match a :
[^\]]* # match any number of characters except ]
\] # match a ]
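As a sketch of how this could be used, the variant below adds two capture groups and switches to preg_match_all() so that every keyword is returned. One deliberate change from the pattern above, which is my assumption rather than part of the answer: the first character class also excludes ], so a bracketed word without a colon cannot swallow the text that follows it. The sample string and the $keywords array are made up for illustration.

```php
<?php
$searchString = 'find [StartDate:2010-11-01] some [note] text [EndDate:2010-11-31]';

// Group 1: keyword (no ':' or ']'), group 2: value (no ']').
preg_match_all('/\[([^:\]]*):([^\]]*)\]/', $searchString, $matches, PREG_SET_ORDER);

$keywords = [];
foreach ($matches as $set) {
    $keywords[$set[1]] = $set[2];
}

print_r($keywords);
// StartDate => 2010-11-01, EndDate => 2010-11-31 ([note] is skipped because it has no colon)
```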
This:
$searchString = "[StartDate:2010-11-01][EndDate:2010-11-31]";
preg_match_all('/\[.*?\]/', $searchString, $match);
print_r($match);
gives the expected result, though I'm not sure it satisfies all of your constraints.
Try the following:
$searchString = "[StartDate:2010-11-01][EndDate:2010-11-31]";
$specialKeywords = array();
preg_match_all("/\[\w+:\d{4}-\d\d-\d\d\]/i", $searchString, $specialKeywords);
var_dump($specialKeywords[0]);
Outputs:
array(2) {
[0]=>
string(22) "[StartDate:2010-11-01]"
[1]=>
string(20) "[EndDate:2010-11-31]"
}
Use this regex: "/\[(.*?)\:(.*?)\]{1}/" together with preg_match_all. It will return
array(3) {
[0]=>
array(2) {
[0]=>
string(22) "[StartDate:2010-11-01]"
[1]=>
string(20) "[EndDate:2010-11-31]"
}
[1]=>
array(2) {
[0]=>
string(9) "StartDate"
[1]=>
string(7) "EndDate"
}
[2]=>
array(2) {
[0]=>
string(10) "2010-11-01"
[1]=>
string(10) "2010-11-31"
}
}
/\[.+?\:.+?\]/
I suggest this pattern. It is less complex but handles the same input as Tim's.
