regex match between 2 strings - php

For example I have the text
a1aabca2aa3adefa4a
I want to extract 2 and 3 with a regex between abc and def, so 1 and 4 should be not included in the result.
I tried this
if(preg_match_all('#abc(?:a(\d)a)+def#is', file_get_contents('test.txt'), $m, PREG_SET_ORDER))
print_r($m);
I get this
> Array
(
[0] => Array
(
[0] => abca1aa2adef
[1] => 3
)
)
But I want this
Array
(
[0] => Array
(
[0] => abca1aa2adef
[1] => 2
[2] => 3
)
)
Is this possible with one preg_match_all call? How can I do it?
Thanks

preg_match_all(
'/\d # match a digit
(?=.*def) # only if followed by <anything> + def
(?!.*abc) # and not followed by <anything> + abc
/x',
$subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
works on your example. It assumes that there is exactly one instance of abc and def per line in your string.
The reason why your attempt didn't work is that your capturing group (\d) that matches the digit is within another, repeated group (?:a(\d)a)+. With every repetition, the result of the capture is overwritten. This is how regular expressions work.
In other words - see what's happening during the match:
Current position Current part of regex Capturing group 1
--------------------------------------------------------------
a1a no match, advancing... undefined
abc abc undefined
a2a (?:a(\d)a) 2
a3a (?:a(\d)a) (repeated) 3 (overwrites 2)
def def 3

You ask if it is possible with a single preg_match_all.
Indeed it is.
This code outputs exactly what you want.
<?php
$subject='a1aabca2aa3adefa4a';
$pattern='/abc(?:a(\d)a+(\d)a)def/m';
preg_match_all($pattern, $subject, $all_matches,PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER);
$res[0]=$all_matches[0][0][0];
$res[1]=$all_matches[1][0][0];
$res[2]=$all_matches[2][0][0];
var_dump($res);
?>
Here is the output:
array
0 => string 'abca2aa3adef' (length=12)
1 => string '2' (length=1)
2 => string '3' (length=1)

Related

REGEX Pattern for Validation that check all string is integer and split into single integers

I tried multiple time to make a pattern that can validate given string is natural number and split into single number.
..and lack of understanding of regex, the closest thing that I can imagine is..
^([1-9])([0-9])*$ or ^([1-9])([0-9])([0-9])*$ something like that...
It only generates first, last, and second or last-second split-numbers.
I wonder what I need to know to solve this problem.. thanks
You may use a two step solution like
if (preg_match('~\A\d+\z~', $s)) { // if a string is all digits
print_r(str_split($s)); // Split it into chars
}
See a PHP demo.
A one step regex solution:
(?:\G(?!\A)|\A(?=\d+\z))\d
See the regex demo
Details
(?:\G(?!\A)|\A(?=\d+\z)) - either the end of the previous match (\G(?!\A)) or (|) the start of string (^) that is followed with 1 or more digits up to the end of the string ((?=\d+\z))
\d - a digit.
PHP demo:
$re = '/(?:\G(?!\A)|\A(?=\d+\z))\d/';
$str = '1234567890';
if (preg_match_all($re, $str, $matches)) {
print_r($matches[0]);
}
Output:
Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
[5] => 6
[6] => 7
[7] => 8
[8] => 9
[9] => 0
)

need some help on regex in preg_match_all()

so I need to extract the ticket number "Ticket#999999" from a string.. how do i do this using regex.
my current regex is working if I have more than one number in the Ticket#9999.. but if I only have Ticket#9 it's not working please help.
current regex.
preg_match_all('/(Ticket#[0-9])\w\d+/i',$data,$matches);
thank you.
In your pattern [0-9] matches 1 digit, \w matches another digit and \d+ matches 1+ digits, thus requiring 3 digits after #.
Use
preg_match_all('/Ticket#([0-9]+)/i',$data,$matches);
This will match:
Ticket# - a literal string Ticket#
([0-9]+) - Group 1 capturing 1 or more digits.
PHP demo:
$data = "Ticket#999999 ticket#9";
preg_match_all('/Ticket#([0-9]+)/i',$data,$matches, PREG_SET_ORDER);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => Ticket#999999
[1] => 999999
)
[1] => Array
(
[0] => ticket#9
[1] => 9
)
)

Regular expression doesn't work as expected: '/=(\w+\s*)+=/'

This is what I have:
<?php
preg_match_all('/=(\w+\s*)+=/', 'aaa =bbb ccc ddd eee= zzz', $match);
print_r($match);
It matches only eee:
Array
(
[0] => Array
(
[0] => =bbb ccc ddd eee=
)
[1] => Array
(
[0] => eee
)
)
I need it to match bbb, ccc, ddd, eee, e.g.:
...
[1] => Array
(
[0] => bbb
[1] => ccc
[2] => ddd
[3] => eee
)
...
Where is the problem?
Try this regex:
(\w+)(?=[^=]*=[^=]*$)
Explaining:
(\w+) # group all words
(?= # only if right after can be found:
[^=]* # regardless of non '=' character
= # one '=' character
[^=]*$ # non '=' character till the end makes sure the first words are eliminated... You can try remove it from regex101 to see what happens.
)
Regex live here.
Hope it helps.
Thats is expected behaviour. Group captures are overwritten on repetition.
1 group, 1 capture
Instead of trying to get them in 1 match attempt, you should match one token on each attempt. Use \G to match the end of last match.
Something like this should work:
/(?(?=^)[^=]*+=(?=.*=)|\G\s+)([^\s=]+)/
regex101 Demo
Regex break-down
(?(?=^) ... | ... ) IF on start of string
[^=]*+= consume everything up to the first =
(?=.*=) and check there's a closing = as well
ELSE
\G\s+ only match if the last match ended here, consuming preceding spaces
([^\s=]+) Match 1 token, captured in group 1.
If you're also interested in matching more than 1 set of tokens, you need to match the text in between sets as well:
/(?(?=^)[^=]*+=(?=.*=)|\G\s*+(?:=[^=]*+=(?=.*=))?)([^\s=]+)/
regex101 Demo
Your regex starts and ends with an =, so the only possible match is:
=bbb ccc ddd eee=
You can use preg_replace with preg_split, i.e.:
$string = "aaa =bbb ccc ddd eee= zzz";
$matches = preg_split('/ /', preg_replace('/^.*?=|=.*?$/', '', $string));
print_r($matches);
OUTPUT:
Array
(
[0] => bbb
[1] => ccc
[2] => ddd
[3] => eee
)
DEMO:
http://ideone.com/pAmjbk

PHP Regex: How to get optional text if present?

Let's take an example of following string:
$string = "length:max(260):min(20)";
In the above string, :max(260):min(20) is optional. I want to get it if it is present otherwise only length should be returned.
I have following regex but it doesn't work:
/(.*?)(?::(.*?))?/se
It doesn't return anything in the array when I use preg_match function.
Remember, there can be something else than above string. Maybe like this:
$string = "number:disallow(negative)";
Is there any problem in my regex or PHP won't return anything? Dumping preg_match returns int 1 which means the string matches the regex.
Fully Dumped:
int 1
array (size=2)
0 => string '' (length=0)
1 => string '' (length=0)
You're using single character (.) matching in the case of being lazy, at the very beginning. So it stops at the zero position. If you change your preg_match function to preg_match_all you'll see the captured groups.
Another problem is with your Regular Expression. You're killing the engine. Also e modifier is deprecated many many decades before!!! and yet it was used in preg_replace function only.
Don't use s modifier too! That's not needed.
This works at your case:
/([^:]+)(:.*)?/
Online demo
I tried to prepare a regex which can probably solve your issue and also add some value to it
this regex will not only match the optional elements but will also capture in key value pair
Regex
/(?<=:|)(?'prop'\w+)(?:\((?'val'.+?)\))?/g
Test string
length:max(260):min(20)
length
number:disallow(negative)
Result
MATCH 1
prop [0-6] length
MATCH 2
prop [7-10] max
val [11-14] 260
MATCH 3
prop [16-19] min
val [20-22] 20
MATCH 4
prop [24-30] length
MATCH 5
prop [31-37] number
MATCH 6
prop [38-46] disallow
val [47-55] negative
try demo here
EDIT
I think I understand what you meant by duplicate array with different key, it was due to named captures eg. prop & val
here is the revision without named capturing
Regex
/(?<=:|)(\w+)(?:\((.+?)\))?/
Sample code
$str = "length:max(260):min(20)";
$str .= "\nlength";
$str .= "\nnumber:disallow(negative)";
preg_match_all("/(?<=:|)(\w+)(?:\((.+?)\))?/",
$str,
$matches);
print_r($matches);
Result
Array
(
[0] => Array
(
[0] => length
[1] => max(260)
[2] => min(20)
[3] => length
[4] => number
[5] => disallow(negative)
)
[1] => Array
(
[0] => length
[1] => max
[2] => min
[3] => length
[4] => number
[5] => disallow
)
[2] => Array
(
[0] =>
[1] => 260
[2] => 20
[3] =>
[4] =>
[5] => negative
)
)
try demo here

Catching ids and its values from a string with preg_match

I was wondering how can I create preg_match for catching:
id=4
4 being any number and how can I search for the above example in a string?
If this is could be correct /^id=[0-9]/, the reason why I'm asking is because I'm not really good with preg_match.
for 4 being any number, we must set the range for it:
/^id\=[0-9]+/
\escape the equal-sign, plus after the number means 1 or even more.
You should go with the the following:
/id=(\d+)/g
Explanations:
id= - Literal id=
(\d+) - Capturing group 0-9 a character range between 0 and 9; + - repeating infinite times
/g - modifier: global. All matches (don't return on first match)
Example online
If you want to grab all ids and its values in PHP you could go with:
$string = "There are three ids: id=10 and id=12 and id=100";
preg_match_all("/id=(\d+)/", $string, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => id=10
[1] => id=12
[2] => id=100
)
[1] => Array
(
[0] => 10
[1] => 12
[2] => 100
)
)
Example online
Note: If you want to match all you must use /g modifier. PHP doesn't support it but has other function for that which is preg_match_all. All you need to do is remove the g from the regex.

Categories