Having trouble with this regular expression in PHP - php

I'm trying to run a regular expression on the following string in PHP, using the preg_match_all function:
"{{content 1}}{{content 2}}"
The result I'm looking for is an array with the 2 matches found between {{ and }}.
Here is the expression: '/\{\{(.+)\}\}/'
I suspect that my expression is too greedy, but how do I make it less greedy?

You can make the quantifier ungreedy (lazy) by appending ?, like so:
$regex = '/\{\{.*?\}\}/';
New regex will output:
Array
(
[0] => Array
(
[0] => {{content 1}}
[1] => {{content 2}}
)
)
EDIT:
Just remembered another way to do this. You can add a U (capital u) at the end of your regex string and the result will be the same, like so:
$regex = '/\{\{.+\}\}/U';
Also, here is a useful list of regex modifiers.
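To compare the variants side by side, here is a minimal sketch. The variable names are only illustrative, and the capture group from the question is kept so the inner text is easy to inspect.

```php
<?php
$subject = '{{content 1}}{{content 2}}';

// Greedy: .+ runs on to the last }} in the string, so there is a single match.
preg_match_all('/\{\{(.+)\}\}/', $subject, $greedy);

// Lazy quantifier: .+? stops at the first }} it can reach.
preg_match_all('/\{\{(.+?)\}\}/', $subject, $lazy);

// U modifier: inverts the greediness of every quantifier in the pattern.
preg_match_all('/\{\{(.+)\}\}/U', $subject, $ungreedy);

print_r($greedy[1]);   // one element: content 1}}{{content 2
print_r($lazy[1]);     // two elements: content 1, content 2
print_r($ungreedy[1]); // same two elements as the lazy version
```

Note that U flips every quantifier at once, so a ? after a quantifier makes it greedy again while U is active.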

You can also use the U PCRE modifier (Ungreedy)

Is the regular expression in a single-quoted or a double-quoted PHP string? In a double-quoted string, curly braces can trigger variable interpolation, so the backslashes need to be doubled:
"/\\{\\{(.+)\\}\\}/"

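As a quick check of the quoting question, this sketch compares the single-quoted pattern with the double-quoted one that uses doubled backslashes. It only shows that the two spellings produce the same pattern string; the lazy quantifier is an addition so that both blocks are found.

```php
<?php
// Single-quoted: the backslashes before { and } reach PCRE unchanged.
$single = '/\{\{(.+?)\}\}/';

// Double-quoted with doubled backslashes: each \\ becomes one backslash.
$double = "/\\{\\{(.+?)\\}\\}/";

// Both spellings yield the same pattern string.
var_dump($single === $double);

preg_match_all($double, '{{content 1}}{{content 2}}', $m);
print_r($m[1]);
```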
Related

How do I match this pattern using preg_match in PHP?

I'm writing a simple quiz engine in PHP and supply the question text in this format
question|correct/feedback|wrong/feedback|wrong/feedback
There can be as many wrong/feedback options as necessary. I want to use preg_match to return the results so I can display them. For instance:
q|aaa/aaa|bbb/bbb|ccc/ccc
...should return...
array(
0 => q|aaa/aaa|bbb/bbb|ccc/ccc
1 => q
2 => aaa/aaa
3 => bbb/bbb
4 => ccc/ccc
)
So far, I've got this regular expression, which matches the question and the correct/feedback combination...
([^\|]+)\|([^\/]+\/[^\|$]+)
...but I have no idea how to match the remaining wrong/feedback strings.
You can also use the "glue" feature in your pattern with preg_match_all; this way you can check that the syntax is correct and extract each part at the same time.
The glue feature ensures that each match immediately follows the previous match, with no gap. To do that, I use the A global modifier (anchored to the start of the string or to the position right after the previous match).
$s = 'q|aaa/aaa|bbb/bbb|ccc/ccc';
$pat = '~ (?!\A) \| \K [^|/]+ / [^|/]+ (?: \z (*:END) )? | \A [^|/]+ ~Ax';
if ( preg_match_all($pat, $s, $m) && isset($m['MARK']) ) {
$result = $m[0];
print_r($result);
}
I also use a marker (*:END) to make sure that the end of the string has actually been reached, whatever the pattern constraints. If this marker exists in the matches array, it proves that the syntax is correct. Advantage: you only have to parse the string once (you don't even need to check the whole string syntax in a lookahead assertion anchored at the start of the string).
If you want the whole question as first item in the result array, just write:
$result = array_merge([$s], $m[0]);
So, after the advice, I've decided to use preg_match to check the syntax and then explode to split the string.
This regex seems to match the string format up until any mismatch occurs.
^[^\|/]+(?:\|[^\|/]+/[^\|/]+)+
If I check that the length of the match is the same as the original string I think this will tell me the syntax is correct. Does this sound feasible?
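A minimal sketch of that validate-then-split idea follows. The function name parseQuestion is hypothetical, and instead of comparing match length with string length it anchors the pattern with \A and \z, which is a swapped-in way of getting the same "whole string must match" check.

```php
<?php
// Hypothetical helper: returns the parts, or null when the syntax is wrong.
function parseQuestion(string $line): ?array
{
    // ~ is the delimiter, so / needs no escaping; \z is the very end of the string.
    $syntax = '~\A[^|/]+(?:\|[^|/]+/[^|/]+)+\z~';
    if (preg_match($syntax, $line) !== 1) {
        return null;
    }
    // The syntax is known to be valid here, so a plain split is enough.
    return explode('|', $line);
}

print_r(parseQuestion('q|aaa/aaa|bbb/bbb|ccc/ccc')); // four parts
var_dump(parseQuestion('q|aaa/aaa|bbb'));            // NULL, last option has no feedback
```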

How to use a regular expression with preg_match_all to split a string into blocks following a pattern

I'm going to be working with a long string of data that is serialized into blocks using a pattern (x:y).
However, I struggle with regular expressions, and am looking for resources to help identify how to construct a regex to identify any/all of these blocks as they appear in a string.
For example, given the following string:
$s = 't:user c:red t:admin n:"bob doe" s:expressionsf:json';
Note: the f:json at the end is missing a space on purpose, because the format might vary with how the string is eventually given to me. Each block might be spaced, and they might not.
How would I identify each block of x:y to end with the below result:
Array
(
[0] => t:user
[1] => c:red
[2] => t:admin
[3] => n:"bob doe"
[4] => s:expression
[5] => f:json
)
I've tested various expressions using my limited knowledge, but have not been terribly successful.
I can successfully match the pattern using something like this:
^[ctrns]:.+
But this unfortunately matches the entire string. The part I seem to be missing is how to break each block, while also maintaining the ability to keep spaces within the pairs (see n:"bob doe" example).
Any assistance would be super appreciated! Also, ideally any submission would be explained as to what each token in the expression was accomplishing so that I better my understanding of these techniques.
I've been using https://regexr.com/ to practice.
You may use this regex in preg_match_all:
[ctnsf]:(?:"[^"\\]*(?:\\.[^"\\]*)*"|\S+?(?=[ctnsf]:|\s|$))
RegEx Details:
[ctnsf]:: Match one of ctnsf characters followed by :
(?:"[^"\\]*(?:\\.[^"\\]*)*": Match a quoted substring. This takes care of escaped quotes as well.
|: OR
\S+?: Match 1+ not-whitespace characters (non-greedy)
(?=[ctnsf]:|\s|$): Positive lookahead to assert one of the conditions given in assertions.
Code:
$re = '/[ctnsf]:(?:"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"|\S+?(?=[ctnsf]:|\s|$))/m';
$str = 't:user c:red t:admin n:"bob \\"doe" s:expressionsf:json';
preg_match_all($re, $str, $matches);
// Print the entire match result
print_r($matches[0]);

Regular expression searching special tag

I have a special tag in my text, [Attachment: image;upload;url]. To parse it I need to find all of these tags, so I wrote this regular expression:
preg_match_all("/.*(\[Attachment: (.*);upload;(.*)\]).*/", $text, $matches);
It works fine and returns this:
Array
(
[0] => Array
(
Text
)
[1] => Array
(
[Attachment: image;upload;url]
)
[2] => Array
(
image
)
[3] => Array
(
url
)
)
But there is one problem: when the text contains two or more tags, it returns info only about the last tag found.
You should match only the tags, not the surrounding text:
"/\[Attachment: ([^;]*);upload;([^\]]*)\]/"
Instead of the negated character class you could also use .*? for non-greedy matching; however, I prefer the negated character class.
Remove the .* part from the end of the regex. With the .*, the regex matches to the end of the string, including any of the other substrings that you want to find. (Or at least all the ones on the same line - I can't remember what the default settings are in PHP.) After that it looks for more matches from the end of the string, but can't find any.
This regex should do it:
$regex = '/\[Attachment: (.*?);(.*?);(.*?)\]/';
preg_match_all($regex, $string, $matches);
For me, this came back with what you wanted (3 results).
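Here is a small sketch of matching only the tags when the text contains more than one. The sample text and the url1/url2 values are made up for illustration; PREG_SET_ORDER is used so that each tag's captures stay together.

```php
<?php
// Made-up sample text with two tags.
$text = 'Intro [Attachment: image;upload;url1] middle [Attachment: video;upload;url2] end';

// The brackets are escaped so they match literally; the negated classes
// stop each field at the next ';' or ']'.
$pattern = '/\[Attachment: ([^;]*);upload;([^\]]*)\]/';

// PREG_SET_ORDER groups the captures per tag instead of per group.
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

foreach ($matches as [$tag, $type, $url]) {
    echo "$type => $url\n";
}
```

Because nothing in the pattern consumes the surrounding text, the engine can resume searching right after each closing bracket and find the next tag.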

PHP regular expression not being matched - what is wrong?

I have the following regular expression:
"^[x]{1}[a-z]{3,4}:[a-z0-9]{1,6}"
I want to use it to be able to match strings like:
xabc:z123
However, when I try it with this regex tester, it does not match the pattern. Is my pattern wrong, or is the online tester unreliable?
If my pattern is wrong, could someone point out why?
Also, I want to make the pattern matching case insensitive - but I'm not too sure the best way to do that (thought better to ask rather than trial and error). How do I change the pattern so it matches irrespective of case?
Just add an i for case insensitive matching:
/^[x]{1}[a-z]{3,4}:[a-z0-9]{1,6}/i
By the way, your regular expression works!?
Output:
Array
(
[0] => xabc:z123
)
If you want to have something like:
Array
(
[0] => 'xabc:z123',
[1] => 'x',
[2] => 'abc'
...
)
You need to add groups using (), e.g.:
/^([x]{1})([a-z]{3,4}):([a-z0-9]{1,6})/i
In the tester, you have to enter the regex without the surrounding quotes. In PHP source code, you have to use quotes and a regex delimiter; the tester shows that in the code it generates:
$ptn = "/^[x]{1}[a-z]{3,4}:[a-z0-9]{1,6}/";
To make it case insensitive, you have two options. One is to add an i after the closing delimiter, as @middus's answer demonstrates. The other is to add (?i) to the regex itself:
(?i)^[x]{1}[a-z]{3,4}:[a-z0-9]{1,6}
The tester will accept it either way; if you don't add the delimiters yourself it adds / to either end, which means any slashes in your regex need to be escaped (i.e., it doesn't escape them for you). Be aware that PHP allows you to use other characters as the delimiters, but that tester only recognizes /.
Some further notes:
To match a single x, all you need is x. The square brackets are unnecessary when there's only one letter inside them, and the {1} quantifier never has any effect--it's pure clutter.
If you're using the regex to validate the string, you may want to add a $ anchor to the end.
End result:
/^x[a-z]{3,4}:[a-z0-9]{1,6}$/i
Here is another tester that lets you choose your own delimiters, among other things.
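To try the end result directly in PHP rather than in an online tester, a short sketch like this can help. The sample inputs are made up for illustration.

```php
<?php
$pattern = '/^x[a-z]{3,4}:[a-z0-9]{1,6}$/i';

// Made-up inputs: two that should match and two that should not.
$samples = ['xabc:z123', 'XABCD:Z1', 'xab:z123', 'xabc:z1234567'];

foreach ($samples as $s) {
    // preg_match returns 1 on a match, 0 on no match.
    $verdict = preg_match($pattern, $s) === 1 ? 'match' : 'no match';
    printf("%-14s %s\n", $s, $verdict);
}
```

The third sample fails because only two letters follow the x, and the fourth fails because the $ anchor rejects more than six characters after the colon.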

Negate character group: match "abc'," but not "abc\',"

I need a pattern which can negate a character group and also negate a character inside the negated group
The following pattern works, but I want to do a bit more
(?:(?!'\,).)+
Here I don't want to match strings that contain ',
But what I really need is to integrate a negation inside the negation group - something like this
(?:(?![^\\]'\,).)+
I don't want to match any escaped quote signs
Match: abc',
Don't match: abc\',
argh.. it posts on enter..
$str = "'abc\',',asdf";
preg_match("/^('(?:(?!',).)+')/", $str, $matches);
echo '<pre>';
print_r($matches);
echo '</pre>';
this should output abc\', but it outputs abc\
Judging by your last comment, I think you're trying to match a single-quoted string literal, which might contain single-quotes escaped with backslashes. For example, in this string:
'abc\',','xyz'
...you want to match 'abc\',' and 'xyz'. That's easy enough:
$source = "'abc\',','xyz'";
print "$source\n\n";
preg_match_all("/'(?:[^'\\\\]++|\\\\.)*+'/", $source, $matches);
print_r($matches);
output:
'abc\',','xyz'
Array
(
[0] => Array
(
[0] => 'abc\','
[1] => 'xyz'
)
)
But maybe you want to match all the items in a comma-separated list, which may or may not be quoted--in other words, CSV (or something very similar). If that's the case, you should use a dedicated CSV processing tool; there are many of them out there. In fact, PHP has one built in: http://php.net/manual/en/function.fgetcsv.php
/^(?:(?!\\\\',).)+$/ appears to do what you want. Note that you have to escape the single quote ''. See http://ideone.com/ypln2
If you don't necessarily want to match the full string, remove the ^ and $. See http://ideone.com/G67RV
