Remove comments from JSON data - php

I need to remove all /*...*/ style comments from JSON data. How do I do it with regular expressions so that string values like this
{
"propName": "Hello \" /* hi */ there."
}
remain unchanged?

You must first skip everything that is inside double quotes, using the backtracking control verbs (*SKIP) and (*FAIL) (or a capture group):
$string = <<<'LOD'
{
"propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;
$result = preg_replace('~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '',$string);
// The same with a capture:
$result = preg_replace('~("(?:[^\\\"]+|\\\.)*+")|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '$1',$string);
Pattern details:
"(?:[^\\\"]+|\\\.)*+"
This part describes the possible content inside quotes:
" # literal quote
(?: # open a non-capturing group
[^\\\"]+ # all characters that are not \ or "
| # OR
\\\.)*+ # escaped char (that can be a quote)
"
Then you can make this subpattern fail with (*SKIP)(*FAIL) or (*SKIP)(?!). (*SKIP) forbids backtracking past this point if the pattern fails later, and (*FAIL) forces the pattern to fail. Thus, quoted parts are skipped (and can't be in the result, since you make the subpattern fail afterwards).
Alternatively, you can use a capturing group and add its reference to the replacement pattern.
/\*(?:[^*]+|\*+(?!/))*+\*/
This part describes the content inside comments.
/\* # open the comment
(?:
[^*]+ # all characters except *
| # OR
\*+(?!/) # * not followed by / (note that you can't use
# a possessive quantifier here)
)*+ # repeat the group zero or more times
\*/ # close the comment
The s modifier is needed here only for the case where a backslash comes right before a newline inside quotes (so that the dot in \\. can match that newline).
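One way to check that the replacement leaves the data intact is to pass the result to json_decode(). The following is only a minimal sketch: the helper name strip_json_comments() is an assumption made for this example and does not come from the answer, and the pattern is stored in a nowdoc so that the backslashes reach the regex engine unchanged.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of the answer above).
function strip_json_comments(string $json): string
{
    // Nowdoc: the pattern is written exactly as the regex engine sees it.
    $pattern = <<<'RX'
~"(?:[^\\"]+|\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s
RX;
    return preg_replace($pattern, '', $json);
}

$string = <<<'LOD'
{
"propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;

$clean = strip_json_comments($string);
$data  = json_decode($clean, true);

var_dump($data !== null);     // the cleaned text still parses as JSON
echo $data['propName'], "\n"; // the comment-like text inside the string survives
```

If json_decode() returns null here, either the input was not valid JSON to begin with or a comment was cut in the wrong place.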

Related

Return code blocks between curly braces in separate lines using regex and in separate regex groups [duplicate]

Given a dummy function as such:
public function handle()
{
if (isset($input['data'])) {
switch($data) {
...
}
} else {
switch($data) {
...
}
}
}
My intention is to get the contents of that function, the problem is matching nested patterns of curly braces {...}.
I've come across recursive patterns but couldn't get my head around a regex that would match the function's body.
I've tried the following (no recursion):
$pattern = "/function\shandle\([a-zA-Z0-9_\$\s,]+\)?". // match "function handle(...)"
'[\n\s]?[\t\s]*'. // regardless of the indentation preceding the {
'{([^{}]*)}/'; // find everything within braces.
preg_match($pattern, $contents, $match);
That pattern doesn't match at all. I am sure it is the last bit that is wrong '{([^{}]*)}/' since that pattern works when there are no other braces within the body.
By replacing it with:
'{([^}]*)}/';
It matched till the closing } of the switch inside the if statement and stopped there (including } of the switch but excluding that of the if).
As well as this pattern, same result:
'{(\K[^}]*(?=)})/m';
Update #2
According to others' comments:
^\s*[\w\s]+\(.*\)\s*\K({((?>"(?:[^"\\]*+|\\.)*"|'(?:[^'\\]*+|\\.)*'|//.*$|/\*[\s\S]*?\*/|#.*$|<<<\s*["']?(\w+)["']?[^;]+\3;$|[^{}<'"/#]++|[^{}]++|(?1))*)})
Note: A short RegEx, i.e. {((?>[^{}]++|(?R))*)}, is enough if you know your input does not contain { or } outside of PHP syntax.
So in what evil cases does the long RegEx work?
You have [{}] in a string between quotation marks ["']
You have those quotation marks escaped inside one another
You have [{}] in a comment block: //... or /*...*/ or #...
You have [{}] in a heredoc or nowdoc: <<<STR or <<<['"]STR['"]
Otherwise, it expects balanced pairs of opening/closing braces, and the depth of nesting does not matter.
Do we have a case where it fails?
No, unless you have a Martian living inside your code.
^ \s* [\w\s]+ \( .* \) \s* \K # how it matches a function definition
( # (1 start)
{ # opening brace
( # (2 start)
(?> # atomic grouping (for its non-capturing purpose only)
"(?: [^"\\]*+ | \\ . )*" # double quoted strings
| '(?: [^'\\]*+ | \\ . )*' # single quoted strings
| // .* $ # a comment block starting with //
| /\* [\s\S]*? \*/ # a multi line comment block /*...*/
| \# .* $ # a single line comment block starting with #...
| <<< \s* ["']? # heredocs and nowdocs
( \w+ ) # (3) ^
["']? [^;]+ \3 ; $ # ^
| [^{}<'"/#]++ # force engine to backtrack if it encounters special characters [<'"/#] (possessive)
| [^{}]++ # default matching behaviour (possessive)
| (?1) # recurse 1st capturing group
)* # zero to many times of atomic group
) # (2 end)
} # closing brace
) # (1 end)
Formatting is done by @sln's RegexFormatter software.
What did I provide in the live demo?
Laravel's Eloquent Model.php file (~3500 lines), picked at random, is given as input. Check it out:
Live demo
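To make the short recursive pattern from the note more concrete, here is a small sketch. The sample function body is made up for the example (it only mirrors the shape of the question's dummy function), and it deliberately contains no braces inside strings or comments, which is the case the short pattern is meant for.

```php
<?php
// Sample input modelled on the question's dummy function (the body is made up).
$contents = <<<'SRC'
public function handle()
{
    if (isset($input['data'])) {
        switch ($data) {
            default: break;
        }
    } else {
        switch ($data) {
            default: break;
        }
    }
}
SRC;

// Short recursive pattern from the note: (?R) re-enters the whole pattern
// each time a nested "{" is met, so braces are matched in balanced pairs.
$pattern = '~{((?>[^{}]++|(?R))*)}~';

if (preg_match($pattern, $contents, $match)) {
    $block = $match[0]; // outermost braces and everything between them
    $body  = $match[1]; // the same text without the outer braces
    echo $body, "\n";
}
```

Group 1 holds the content of the outermost braces because PCRE restores capture values when a recursive call returns.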
This produces header file (.h) declarations from inline function blocks (.c).
Find regular expression:
(void\s[^{};]*)\n^\{($[^}$]*)\}$
Replace with:
$1;
For input:
void bar(int var)
{
foo(var);
foo2();
}
will output:
void bar(int var);
Get the body of the function block with the second capture group:
$2
will output:
foo(var);
foo2();
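The answer does not say which tool it targets, so the following is only a sketch of how the same find/replace could be run through PHP's preg functions. It assumes the m modifier, because the pattern relies on ^ and $ matching at line boundaries.

```php
<?php
$source = "void bar(int var)\n{\nfoo(var);\nfoo2();\n}";

// The m modifier lets ^ and $ match at line boundaries, which the pattern relies on.
$pattern = '~(void\s[^{};]*)\n^\{($[^}$]*)\}$~m';

// Replace the whole definition with its prototype.
$prototype = preg_replace($pattern, '$1;', $source);
echo $prototype, "\n";

// Read the body from the second capture group.
preg_match($pattern, $source, $m);
echo trim($m[2]), "\n";
```

Note that [^}$]* stops at the first closing brace, so this only works for function bodies without nested blocks.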

preg_match a php string with simple or double quotes escaped inside

I want to parse some php files containing something like this :
// form 1
__('some string');
// form 2
__('an other string I\'ve written with a quote');
// form 3
__('an other one
multiline');
// form 4
__("And I want to handle double quotes too !");
// form 5
__("And I want to handle double quotes too !", $second_parameter_may_happens);
The following regex matches everything except the 2nd one:
preg_match_all('#__\((\'|")(.*)\1(?:,.*){0,1}\)#smU', $file_content);
You can use this pattern:
$pattern = '~__\((["\'])(?<param1>(?>[^"\'\\\]+|\\\.|(?!\1)["\'])*)\1(?:,\s*(?<param2>\$[a-z0-9_-]+))?\);~si';
if (preg_match_all($pattern, $data, $matches, PREG_SET_ORDER))
print_r($matches);
But as Jon points out, this kind of pattern may be difficult to maintain. This is why I suggest changing the pattern to this:
$pattern = <<<'LOD'
~
## definitions
(?(DEFINE)
(?<sqc> # content between single quotes
(?> [^'\\]+ | \\. )* #'
# can be written in a more efficient way, with an unrolled pattern:
# [^'\\]*+ (?:\\. [^'\\]*)*+
)
(?<dqc> # content between double quotes
(?> [^"\\]+ | \\. )* #"
)
(?<var> # variable
\$ [a-zA-Z0-9_-]+
)
)
## main pattern
__\(
(?| " (?<param1> \g<dqc> ) " | ' (?<param1> \g<sqc> ) ' )
# note that once you define a named group in the first branch in a branch reset
# group, you don't have to include the name in other branches:
# (?| " (?<param1> \g<dqc>) " | ' ( \g<sqc> ) ' ) does the same. Even if the
# second branch succeeds, the capture group will be named as in the first branch.
# Only the order of groups is taken into account.
(?:, \s* (?<param2> \g<var> ) )?
\);
~xs
LOD;
This simple change makes your pattern more readable and editable.
The quoted-content subpatterns have been designed to deal with escaped quotes. The idea is to match every character preceded by a backslash (which can be a backslash itself), so that literal backslashes and escaped quotes are both handled correctly:
\' # an escaped quote
\\' #'# an escaped backslash and a quote
\\\' # an escaped backslash and an escaped quote
\\\\' #'# two escaped backslashes and a quote
...
Subpattern details:
(?> # open an atomic group (inside which backtracking is forbidden)
[^'\\]+ #'# all that is not a quote or a backslash
| # OR
\\. # an escaped character
)* # repeat the group zero or more times
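As a quick check of the subpattern above, the sketch below runs it against the second form from the question, next to an unrolled variant of the same idea (non-special characters, then any number of "escaped character followed by non-special characters"). The variable names are made up for this example, and both patterns are stored in nowdocs so that no extra backslash escaping is needed.

```php
<?php
// Both patterns are kept in nowdocs so that backslashes need no extra escaping.
$alternation = <<<'RX'
~'((?>[^'\\]+|\\.)*)'~s
RX;
$unrolled = <<<'RX'
~'([^'\\]*+(?:\\.[^'\\]*)*+)'~s
RX;

$source = <<<'SRC'
__('an other string I\'ve written with a quote');
SRC;

preg_match($alternation, $source, $a);
preg_match($unrolled, $source, $b);

echo $a[1], "\n";          // content between the quotes, backslash included
var_dump($a[1] === $b[1]); // both forms capture the same text
```

The captured text still contains the backslash; unescaping it (for instance with stripslashes()) would be a separate step.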
I finally found a solution based on my 1st expression, so I will write it here, using the extended style of Casimir, who gave a really great answer. The x modifier allows whitespace and comments in the pattern. The delimiter has to be something other than #, because an unescaped # inside the pattern (such as the ones that start the comments) would be taken as the closing delimiter:
$pattern = <<<'LOD'
~
__\(
(?<quote>'|") # catch the opening quote
(?<param1>
(?:
[^'"] # anything but quotes
|
\\' # escaped single quotes are ok
|
\\" # escaped double quotes are ok too
)*
)
\k{quote} # find the closing quote
(?:,.*){0,1} # catch any type of 2nd parameter
\)
~smUx
LOD;

variable length masking with preg_replace

I am masking all characters between single quotes (inclusively) within a string using preg_replace_callback(). But I would like to only use preg_replace() if possible, but haven't been able to figure it out. Any help would be appreciated.
This is what I have using preg_replace_callback() which produces the correct output:
function maskCallback( $matches ) {
return str_repeat( '-', strlen( $matches[0] ) );
}
function maskString( $str ) {
return preg_replace_callback( "('.*?')", 'maskCallback', $str );
}
$str = "TEST 'replace''me' ok 'me too'";
echo $str,"\n";
echo maskString( $str ),"\n";
Output is:
TEST 'replace''me' ok 'me too'
TEST ------------- ok --------
I have tried using:
preg_replace( "/('.*?')/", '-', $str );
but the dashes get consumed, e.g.:
TEST -- ok -
Everything else I have tried doesn't work either. (I'm obviously not a regex expert.) Is this possible to do? If so, how?
Yes, you can do it (assuming that the quotes are balanced). Example:
$str = "TEST 'replace''me' ok 'me too'";
$pattern = "~[^'](?=[^']*(?:'[^']*'[^']*)*+'[^']*\z)|'~";
$result = preg_replace($pattern, '-', $str);
The idea is: you can replace a character if it is a quote or if it is followed by an odd number of quotes.
To mask only the characters between the quotes and keep the quotes themselves:
$pattern = "~(?:(?!\A)\G|(?:(?!\G)|\A)'\K)[^']~";
$result = preg_replace($pattern, '-', $str);
The pattern will match a character only when it is contiguous to the previous match (in other words, when it comes immediately after the last match) or when it is preceded by a quote that is not contiguous to the previous match.
\G is the position after the last match (at the beginning, it is the start of the string).
pattern details:
~ # pattern delimiter
(?: # non capturing group: describe the two possibilities
# before the target character
(?!\A)\G # at the position in the string after the last match
# the negative lookahead ensures that this is not the start
# of the string
| # OR
(?: # (to ensure that the quote is not a closing quote)
(?!\G) # not contiguous to a previous match
| # OR
\A # at the start of the string
)
' # the opening quote
\K # remove all previous characters from the match result
# (only one quote here)
)
[^'] # a character that is not a quote
~
Note that since the closing quote is not matched by the pattern, the non-quote characters that follow it can't be matched, because there is no previous match for them to be contiguous with.
EDIT:
The (*SKIP)(*FAIL) way:
Instead of testing whether a single quote is not a closing quote with (?:(?!\G)|\A)' as in the previous pattern, you can break the match contiguity on closing quotes using the backtracking control verbs (*SKIP) and (*FAIL) (which can be shortened to (*F)).
$pattern = "~(?:(?!\A)\G|')(?:'(*SKIP)(*F)|\K[^'])~";
$result = preg_replace($pattern, '-', $str);
Since the pattern fails on each closing quote, the following characters will not be matched until the next opening quote.
The pattern may be more efficient when written like this:
$pattern = "~(?:\G(?!\A)(?:'(*SKIP)(*F))?|'\K)[^']~";
(You can also use (*PRUNE) in place of (*SKIP).)
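For comparison, here is a small sketch that runs three of the patterns above on the question's sample string. The variable names are made up for this example; the expected results in the comments were obtained by tracing the patterns by hand on this input.

```php
<?php
$str = "TEST 'replace''me' ok 'me too'";

// Lookahead version: masks the quotes and everything between them.
$withQuotes = preg_replace("~[^'](?=[^']*(?:'[^']*'[^']*)*+'[^']*\z)|'~", '-', $str);

// \G version: masks only the characters between the quotes.
$contentOnly = preg_replace("~(?:(?!\A)\G|(?:(?!\G)|\A)'\K)[^']~", '-', $str);

// (*SKIP)(*F) version: same result as the \G version.
$skipFail = preg_replace("~(?:(?!\A)\G|')(?:'(*SKIP)(*F)|\K[^'])~", '-', $str);

echo $withQuotes, "\n";  // TEST ------------- ok --------
echo $contentOnly, "\n"; // TEST '-------''--' ok '------'
echo $skipFail, "\n";    // TEST '-------''--' ok '------'
```

Only the first pattern reproduces the output of the original preg_replace_callback() version exactly, since the other two leave the quotes in place.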
Short answer : It's possible !!!
Use the following pattern
' # Match a single quote
(?= # Positive lookahead, this basically makes sure there is an odd number of single quotes ahead in this line
(?:(?:[^'\r\n]*'){2})* # Match anything except single quote or newlines zero or more times followed by a single quote, repeat this twice and repeat this whole process zero or more times (basically a pair of single quotes)
(?:[^'\r\n]*'[^'\r\n]*(?:\r?\n|$)) # You guessed, this is to match a single quote until the end of line
)
| # or
\G(?<!^) # Preceding contiguous match (not beginning of line)
[^'] # Match anything that's not a single quote
(?= # Same as above
(?:(?:[^'\r\n]*'){2})* # Same as above
(?:[^'\r\n]*'[^'\r\n]*(?:\r?\n|$)) # Same as above
)
|
\G(?<!^) # Preceding contiguous match (not beginning of line)
' # Match a single quote
Make sure to use the m modifier.
Online demo.
Long answer : It's a pain :)
Unless your whole team, and not only you, loves regex, think twice before using this regex: it is insane and quite difficult for beginners to grasp. Readability (almost) always comes first.
I'll break down how I wrote such a regex:
1) We first need to know what we actually want to replace, we want to replace every character (including the single quotes) that's between two single quotes with a hyphen.
2) If we're going to use preg_replace() that means our pattern needs to match one single character each time.
3) So the first step would be obvious : '.
4) We'll use \G, which matches at the beginning of the string or right after the previous match. Take this simple example: ~a|\Gb~. It will match a anywhere, and b only if it is at the beginning of the string or immediately follows the previous match. See this demo.
5) We don't want anything to do with the beginning of the string, so we'll use \G(?<!^).
6) Now we need to match anything that's not a single quote ~'|\G(?<!^)[^']~.
7) Now begins the real pain, how do we know that the above pattern wouldn't go match c in 'ab'c ? Well it will, we need to count the single quotes...
Let's recap:
a 'bcd' efg 'hij'
  ^ It will match this first
   ^^^ Then it will match these individually with \G(?<!^)[^']
      ^ It will match since we're matching single quotes without checking anything
       ^^^^^ And it will continue to match ...
What we want could be done in those 3 rules:
  a 'bcd' efg 'hij'
1   ^ Match a single quote only if there is an odd number of single quotes ahead
2    ^^^ Match individually those characters only if there is an odd number of single quotes ahead
3       ^ Match a single quote only if there was a match before this character
8) Checking if there is an odd number of single quotes could be done if we knew how to match an even number :
(?: # non-capturing group
(?: # non-capturing group
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
){2} # Repeat 2 times (We'll be matching 2 single quotes)
)* # Repeat all this zero or more times. So we match 0, 2, 4, 6 ... single quotes
9) An odd number would be easy now, we just need to add :
(?:
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
(?:\r?\n|$) # End of line
)
10) Merging above in a single lookahead:
(?=
(?: # non-capturing group
(?: # non-capturing group
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
){2} # Repeat 2 times (We'll be matching 2 single quotes)
)* # Repeat all this zero or more times. So we match 0, 2, 4, 6 ... single quotes
(?:
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
(?:\r?\n|$) # End of line
)
)
11) Now we need to merge all 3 rules we defined earlier:
~ # A delimiter
#################################### Rule 1 ####################################
' # A single quote
(?= # Lookahead to make sure there is an odd number of single quotes ahead
(?: # non-capturing group
(?: # non-capturing group
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
){2} # Repeat 2 times (We'll be matching 2 single quotes)
)* # Repeat all this zero or more times. So we match 0, 2, 4, 6 ... single quotes
(?:
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
(?:\r?\n|$) # End of line
)
)
| # Or
#################################### Rule 2 ####################################
\G(?<!^) # Preceding contiguous match (not beginning of line)
[^'] # Match anything that's not a single quote
(?= # Lookahead to make sure there is an odd number of single quotes ahead
(?: # non-capturing group
(?: # non-capturing group
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
){2} # Repeat 2 times (We'll be matching 2 single quotes)
)* # Repeat all this zero or more times. So we match 0, 2, 4, 6 ... single quotes
(?:
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
' # Match a single quote
[^'\r\n]* # Match anything that's not a single quote or newline, zero or more times
(?:\r?\n|$) # End of line
)
)
| # Or
#################################### Rule 3 ####################################
\G(?<!^) # Preceding contiguous match (not beginning of line)
' # Match a single quote
~x
Online regex demo.
Online PHP demo
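The behaviour of \G described in steps 4 and 5 can be seen on a tiny input. This is only an illustrative sketch with made-up subject strings; it does not exercise the full pattern above.

```php
<?php
// "a" may match anywhere; "b" only at the very start of the subject
// or directly after the previous match.
echo preg_replace('~a|\Gb~', '-', 'ab cb'), "\n"; // -- cb
echo preg_replace('~a|\Gb~', '-', 'b cb'), "\n";  // - cb

// With the extra (?<!^) check, the start of the subject is excluded as well.
echo preg_replace('~a|\G(?<!^)b~', '-', 'b cb'), "\n"; // b cb
```

In each case the second b is left alone because no match ends right before it.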
Well, just for the fun of it and I seriously wouldn't recommend something like that because I try to avoid lookarounds when they are not necessary, here's one regex that uses the concept of 'back to the future':
(?<=^|\s)'(?!\s)|(?!^)(?<!'(?=\s))\G.
regex101 demo
Okay, it's broken down into two parts:
1. Matching the beginning single quote
(?<=^|\s)'(?!\s)
The rules that I believe should be established here are:
There should be either ^ or \s before the beginning quote (hence (?<=^|\s)).
There is no \s after the beginning quote (hence (?!\s)).
2. Matching the things inside the quote, and the ending quote
(?!^)\G(?<!'(?=\s)).
The rules that I believe should be established here are:
The character can be any character (hence .)
The match is 1 character long and following the immediate previous match (hence (?!^)\G).
There should be no single quote, that is itself followed by a space, before it (hence (?<!'(?=\s)) and this is the 'back to the future' part). This effectively will not match a \s that is preceded by a ' and will mark the end of the characters wrapped between single quotes. In other words, the closing quote will be identified as a single quote followed by \s.

Regular expression for template engine?

I'm learning about regular expressions and want to write a templating engine in PHP.
Consider the following "template":
<!DOCTYPE html>
<html lang="{{print("{hey}")}}" dir="{{$dir}}">
<head>
<meta charset="{{$charset}}">
</head>
<body>
{{$body}}
{{}}
</body>
</html>
I managed to create a regex that will find anything except for {{}}.
Here's my regex:
{{[^}]+([^{])*}}
There's just one problem. How do I allow the literal { and } to be used within {{}} tags?
It will not find {{print("{hey}")}}.
Thanks in advance.
This is a pattern to match the content inside double curly brackets:
$pattern = <<<'LOD'
~
(?(DEFINE)
(?<quoted>
' (?: [^'\\]+ | (?:\\.)+ )++ ' |
" (?: [^"\\]+ | (?:\\.)+ )++ "
)
(?<nested>
{ (?: [^"'{}]+ | \g<quoted> | \g<nested> )*+ }
)
)
{{
(?<content>
(?:
[^"'{}]+
| \g<quoted>
| \g<nested>
)*+
)
}}
~xs
LOD;
Compact version:
$pattern = '~{{((?>[^"\'{}]+|((["\'])(?:[^"\'\\\]+|(?:\\\.)+|(?:(?!\3)["\'])+)++\3)|({(?:[^"\'{}]+|\g<2>|(?4))*+}))*+)}}~s';
The content is in the first capturing group, but you can use the named capture 'content' with the detailed version.
Although this pattern is longer, it allows anything you want inside quoted parts, including escaped quotes, and it is faster than a simple lazy quantifier in many cases.
Nested curly brackets are allowed too, you can write {{ doThat(){ doThis(){ }}}} without problems.
The subpattern for quotes can also be written like this, which avoids repeating the same thing for single and double quotes (I use it in the compact version):
(["']) # the quote type is captured (single or double)
(?: # open a group (for the various alternatives)
[^"'\\]+ # all characters that are not a quote or a backslash
| # OR
(?:\\.)+ # escaped characters (with the s modifier)
| #
(?!\g{-1})["'] # a quote that is not the captured quote
)++ # repeat one or more times
\g{-1} # the captured quote (-1 refers to the last capturing group)
Notice: a literal backslash in the pattern must be written \\ in nowdoc syntax but \\\ or \\\\ inside single quotes.
Explanations for the detailed pattern:
The pattern is divided into two parts:
the definitions, where I define named subpatterns
the main pattern itself
The definition section avoids repeating the same subpattern several times in the main pattern and makes it clearer. You can define subpatterns that you will use later in this space: (?(DEFINE)....)
This section contains 2 named subpatterns:
quoted : that contains the description of quoted parts
nested : that describes nested curly brackets parts
detail of nested
(?<nested> # open the named group "nested"
{ # literal {
## what can contain curly brackets? ##
(?> # open an atomic* group
[^"'{}]+ # all characters one or more times, except "'{}
| # OR
\g<quoted> # quoted content, to avoid curly brackets inside quoted parts
# (I call the subpattern I have defined before, instead of rewrite all)
| \g<nested> # OR curly parts. This is a recursion
)*+ # repeat the atomic group zero or more times (possessive *)
} # literal }
) # close the named group
(* more information about atomic groups and possessive quantifiers)
But all of these are only definitions; the pattern really begins with: {{
Then I open a named capture group (content) and describe what can be found inside (nothing new here).
I use two modifiers, x and s. x activates the verbose mode, which allows you to put spaces freely in the pattern (useful for indenting). s is the singleline mode. In this mode, the dot can match newlines (it can't by default). I use this mode because there is a dot in the subpattern quoted.
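As a rough illustration of how the detailed pattern could be used, the sketch below runs it with preg_match_all() on a shortened copy of the question's template and reads the named capture. The shortened template and the variable names are choices made for this example only.

```php
<?php
$pattern = <<<'LOD'
~
(?(DEFINE)
  (?<quoted>
    ' (?: [^'\\]+ | (?:\\.)+ )++ ' |
    " (?: [^"\\]+ | (?:\\.)+ )++ "
  )
  (?<nested>
    { (?: [^"'{}]+ | \g<quoted> | \g<nested> )*+ }
  )
)
{{
  (?<content>
    (?:
        [^"'{}]+
      | \g<quoted>
      | \g<nested>
    )*+
  )
}}
~xs
LOD;

// Shortened version of the template from the question.
$template = <<<'HTML'
<html lang="{{print("{hey}")}}" dir="{{$dir}}">
<body>
{{$body}}
{{}}
</body>
</html>
HTML;

preg_match_all($pattern, $template, $matches);
print_r($matches['content']); // one entry per {{...}} tag, empty string for {{}}
```

The empty {{}} tag is matched too, because the content group may repeat zero times.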
You can just use "." instead of the character classes. But you then have to make use of non-greedy quantifiers:
\{\{(.+?)\}\}
The quantifier "+?" means it will consume the least necessary number of characters.
Consider this example:
<table>
<tr>
<td>{{print("{first name}")}}</td><td>{{print("{last name}")}}</td>
</tr>
</table>
With a greedy quantifier (+ or *), you'd only get one result, because after the first {{ the .+ consumes as many characters as it can while still letting the pattern match:
{{print("{first name}")}}</td><td>{{print("{last name}")}}
With a non-greedy one (+? or *?) you'll get the two as separate results:
{{print("{first name}")}}
{{print("{last name}")}}
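The difference can be reproduced with preg_match_all(). This is a minimal sketch that uses only the table row from the example; the variable names are arbitrary.

```php
<?php
$row = '<td>{{print("{first name}")}}</td><td>{{print("{last name}")}}</td>';

preg_match_all('~\{\{(.+)\}\}~', $row, $greedy); // greedy: runs to the last }}
preg_match_all('~\{\{(.+?)\}\}~', $row, $lazy);  // lazy: stops at the first }}

print_r($greedy[0]); // a single match spanning both tags
print_r($lazy[0]);   // two separate matches
```

Keep in mind that the lazy version still stops at the first }} it meets, so a tag whose content itself contains }} would be cut short.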
Make your regex less greedy using {{(.*?)}}.
I figured it out. Don't ask me how.
{{[^{}]*("[^"]*"\))?(}})
This will match pretty much anything, for example:
{{print("{{}}}{{{}}}}{}}{}{hey}}{}}}{}7")}}
