Regex Extracting After the Match - php

preg_match('/\$(\d+\.\d+)/',$message,$keywords);
dd($keywords);
Hi, I have a few questions.
1) Is it possible to detect or extract the text that follows a regular expression match? For example, I'm matching $1.20 and would like to detect the text after it, such as "per hour", "/hr", "per hr" or "/ hour".
1.1) Maybe something like extracting the 20 characters after the match?
1.2) If I can't extract it, is it possible to get the position of the match?
Example input: $100000/hour test test test
Expected extraction: test test test

1) Put everything you want to extract in the regex, like this:
preg_match('#\$(\d+\.\d+)(\s+per hour|\s*/hr|\s+per hr|\s*/hour)?#',$message,$keywords);
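You'll get the amount in $keywords[1] and the other piece of text in $keywords[2]. Because the second group is optional, $keywords[2] is only present when one of the unit texts follows the amount.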
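A minimal sketch of this approach, assuming a sample message of my own (the content of $message is illustrative, not from the question). It shows the amount and the optional unit text being read from the two groups:

```php
<?php
// Hypothetical sample input; not from the original question.
$message = 'Pay is $1.20 per hour, negotiable.';

// Group 1: the amount. Group 2 (optional): the unit text right after it.
$pattern = '#\$(\d+\.\d+)(\s+per hour|\s*/hr|\s+per hr|\s*/hour)?#';

if (preg_match($pattern, $message, $keywords)) {
    $amount = $keywords[1];
    // Trailing groups that did not take part in the match are left out of
    // the array, so guard the access.
    $unit = isset($keywords[2]) ? trim($keywords[2]) : null;
    echo $amount, ' / ', $unit ?? '(no unit)', "\n";
}
```

The isset() check matters for messages such as 'Cost: $1.20 total', where no unit text follows the amount.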
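1.1) Use /\$(\d+\.\d+)(.{0,20})/ to get at most 20 characters in the second group. Write the lower bound explicitly: PCRE has traditionally treated {,20} as literal text rather than as a quantifier. If you use .{20} instead, the pattern matches only if at least 20 characters follow the amount.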
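As a rough illustration of the "up to 20 characters" idea, here is a sketch with a made-up message. The quantifier is written as {0,20} with an explicit lower bound, and note that . does not match a newline by default:

```php
<?php
// Hypothetical sample input.
$message = 'Rate: $1.20 per hour, paid weekly on Fridays.';

// {0,20} is spelled out because older PCRE versions read "{,20}" as literal text.
if (preg_match('/\$(\d+\.\d+)(.{0,20})/', $message, $keywords)) {
    $amount = $keywords[1];   // the number after the dollar sign
    $after  = $keywords[2];   // at most 20 characters following the amount
    echo $amount, "\n", $after, "\n";
}
```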
1.2) Use the $flags parameter of preg_match(): preg_match('/\$(\d+\.\d+)/',$message,$keywords,PREG_OFFSET_CAPTURE); then check print_r($keywords) to see how the matched values and their offsets are returned.
If you need to find all occurrences rather than just the first, use preg_match_all() instead.
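The two suggestions can be combined. The sketch below, on an invented message, collects every amount with its byte offset and then uses substr() to take up to 20 bytes after each match; using substr() for the tail is my own choice here, not something from the answer:

```php
<?php
// Hypothetical sample input with two amounts.
$message = 'From $1.20 per hour up to $15.75 per hour.';

preg_match_all(
    '/\$(\d+\.\d+)/',
    $message,
    $matches,
    PREG_SET_ORDER | PREG_OFFSET_CAPTURE
);

$found = [];
foreach ($matches as $match) {
    // With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset].
    [$text, $offset] = $match[0];
    // Everything after the match, cut to at most 20 bytes.
    $tail = substr($message, $offset + strlen($text), 20);
    $found[] = [$match[1][0], $offset, $tail];
}
print_r($found);
```

Offsets are counted in bytes, so for multibyte text the positions will not line up with character counts.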

Try this:
$re = '~\$(\d+\.?\d+)/?(\w+)?~m';
$str = "$100000/hour\n$100.2000/min";
preg_match_all($re, $str, $matches);
var_dump($matches);
Output
array (size=3)
0 =>
array (size=2)
0 => string '$100000/hour' (length=12)
1 => string '$100.2000/min' (length=13)
1 =>
array (size=2)
0 => string '100000' (length=6)
1 => string '100.2000' (length=8)
2 =>
array (size=2)
0 => string 'hour' (length=4)
1 => string 'min' (length=3)

Related

PHP preg_split() pattern

I need help finding a PCRE pattern using preg_split().
I'm using the regex pattern below to split a string based on its leading three-character code and semicolons. The pattern works fine in JavaScript, but now I need to use it in PHP. I tried preg_split() but I just get junk back.
// Each group will begin with a three letter code, have three segments separated by a semi-colon. The string will not be terminated with a semi-colon.
// Pseudocode
string_to_split = "AAA;RED;111;BBB;BLUE;22;CCC;GREEN;33;DDD;WHITE;44"
// This works in JS
// https://regex101.com
$pattern = "/[AAA|BBB|CCC|DDD][^;]*;[^;]*[;][^;]*/gi";
Match 1
Full match 0-11 `AAA;RED;111`
Match 2
Full match 12-23 `BBB;BLUE;22`
Match 3
Full match 24-36 `CCC;GREEN;33`
Match 4
Full match 37-49 `DDD;WHITE;44`
$pattern = "/[AAA|BBB|CCC|DDD][^;]*;[^;]*[;][^;]*/";
$split = preg_split($pattern, $string_to_split);
returns
array(5)
0:""
1:";"
2:";"
3:";"
4:""
Based on the additional information in your comments on the answers, I have updated my answer to be specific to your source format.
You might want something like this:
$subject = "AAA;RED;111;AAA;Oh my dog;12.34;AAA;Oh Long John;.4556;BBB;Oh Long Johnson;1.2323;BBB;Oh Don Piano;.33;CCC;Why I eyes ya;1.445;CCC;All the live long day;2.3343;DDD;Faith Hilling;.89";
$pattern = '/(?<=;|^)(AAA|BBB|CCC|DDD);([^;]*);((?:\d*\.)?\d+)(?=;|$)/';
preg_match_all($pattern, $subject,$matches);
var_dump($matches);
giving you
array (size=4)
0 =>
array (size=8)
0 => string 'AAA;RED;111' (length=11)
1 => string 'AAA;Oh my dog;12.34' (length=19)
2 => string 'AAA;Oh Long John;.4556' (length=22)
3 => string 'BBB;Oh Long Johnson;1.2323' (length=26)
4 => string 'BBB;Oh Don Piano;.33' (length=20)
5 => string 'CCC;Why I eyes ya;1.445' (length=23)
6 => string 'CCC;All the live long day;2.3343' (length=32)
7 => string 'DDD;Faith Hilling;.89' (length=21)
1 =>
array (size=8)
0 => string 'AAA' (length=3)
1 => string 'AAA' (length=3)
2 => string 'AAA' (length=3)
3 => string 'BBB' (length=3)
4 => string 'BBB' (length=3)
5 => string 'CCC' (length=3)
6 => string 'CCC' (length=3)
7 => string 'DDD' (length=3)
2 =>
array (size=8)
0 => string 'RED' (length=3)
1 => string 'Oh my dog' (length=9)
2 => string 'Oh Long John' (length=12)
3 => string 'Oh Long Johnson' (length=15)
4 => string 'Oh Don Piano' (length=12)
5 => string 'Why I eyes ya' (length=13)
6 => string 'All the live long day' (length=21)
7 => string 'Faith Hilling' (length=13)
3 =>
array (size=8)
0 => string '111' (length=3)
1 => string '12.34' (length=5)
2 => string '.4556' (length=5)
3 => string '1.2323' (length=6)
4 => string '.33' (length=3)
5 => string '1.445' (length=5)
6 => string '2.3343' (length=6)
7 => string '.89' (length=3)
The start marker must occur at the start of the string or immediately after a semicolon, so we use a lookbehind for either the start or a semicolon:
(?<=;|^)
We look for one of the alternatives AAA, BBB, CCC or DDD and capture it:
(AAA|BBB|CCC|DDD)
After a semicolon we look for any characters except a semicolon. The quantifier * means 0 or more times; use + if you want at least one.
;([^;]*)
After the next semicolon we look for a number. To accept only valid formats, this is split into two parts. We first look for 0 or more digits followed by a dot, and make that part optional:
(?:\d*\.)?
where (?:) is a non-capturing group.
After that we look for at least one digit: \d+
We want to capture both parts of the number, so we put parentheses around them after the semicolon:
;((?:\d*\.)?\d+)
This matches "1234", ".1234", "1.234", "12.34" and "123.4", but not "1234." or "1.2.3".
Finally, we want this to occur immediately before a semicolon or the end of the string, so we use a lookahead:
(?=;|$)
Lookaheads and lookbehinds are zero-width: the text they test, before or after the match, is not part of the captured result.
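To check the number part in isolation, one option is to anchor the sub-pattern and run it over the sample values listed above. The anchored form and the variable names are mine, used only for this test:

```php
<?php
// Anchored copy of the number sub-pattern, used only to test single values.
// The D modifier makes $ match only at the very end of the subject.
$numberPattern = '/^(?:\d*\.)?\d+$/D';

$samples = ['1234', '.1234', '1.234', '12.34', '123.4', '1234.', '1.2.3'];

// Keep the samples that the pattern accepts.
$valid = array_values(array_filter($samples, function ($sample) use ($numberPattern) {
    return preg_match($numberPattern, $sample) === 1;
}));

print_r($valid);
```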
I've modified your pattern a little, and added a couple of flags to preg_split.
The PREG_SPLIT_NO_EMPTY flag will exclude empty matches from the result, and PREG_SPLIT_DELIM_CAPTURE will include the captured value in the result.
$split = preg_split('/([abcd]{3};[^;]+;\d+);?/i', $string, -1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
Result:
Array
(
[0] => AAA;RED;111
[1] => BBB;BLUE;22
[2] => CCC;GREEN;33
[3] => DDD;WHITE;44
)
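Alternatively, and more suitably, you can use preg_match_all with the same pattern. Read the captured group from $matches[1]; $matches[0] holds the full matches, which include the optional trailing semicolon.
preg_match_all('/([abcd]{3};[^;]+;\d+);?/i', $string, $matches);
print_r($matches[1]);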
Result:
Array
(
[0] => AAA;RED;111
[1] => BBB;BLUE;22
[2] => CCC;GREEN;33
[3] => DDD;WHITE;44
)
You don't want to split your string but to match elements, so use preg_match_all:
$str = "AAA;RED;111;AAA;Oh my dog;2.34;AAA;Oh Long John;.4556;BBB;Oh Long Johnson;1.2323;BBB;Oh Don Piano;.33;CCC;Why I eyes ya;1.445;CCC;All the live long day;2.3343;DDD;Faith Hilling;.89";
$res = preg_match_all('/(?:AAA|BBB|CCC|DDD);[^;]*;[^;]*;?/', $str, $m);
print_r($m[0]);
Output:
Array
(
[0] => AAA;RED;111;
[1] => AAA;Oh my dog;2.34;
[2] => AAA;Oh Long John;.4556;
[3] => BBB;Oh Long Johnson;1.2323;
[4] => BBB;Oh Don Piano;.33;
[5] => CCC;Why I eyes ya;1.445;
[6] => CCC;All the live long day;2.3343;
[7] => DDD;Faith Hilling;.89
)
Explanation:
/ : regex delimiter
(?:AAA|BBB|CCC|DDD) : non-capturing group matching AAA or BBB or CCC or DDD
; : a semicolon
[^;]* : 0 or more characters that are not a semicolon
; : a semicolon
[^;]* : 0 or more characters that are not a semicolon
;? : an optional semicolon
/ : regex delimiter

how to find out number from given string using preg_match?

if (
preg_match("/^bundle id/", trim($rows[$key])) &&
preg_match('/\d/', trim($rows[$key]), $temp))
.
bundle id 1 mode active
bundle id 99 mode active
bundle id 999 mode active
How can I extract 1, 99 and 999 with the preg_match expression above?
Your second preg_match needs to become preg_match_all, and the regex needs to look for \d+, which is one or more numerical digits:
preg_match_all('/\d+/', trim($rows[$key]), $temp))
$temp will now contain an array of values:
array (size=1)
0 =>
array (size=3)
0 => string '1' (length=1)
1 => string '99' (length=2)
2 => string '999' (length=3)
Edit
The above answer assumed that the sample string is a single line.
If each line is a separate string, then all you need to do is change the regex to \d+:
if(preg_match("/^bundle id/", trim($rows[$key])) &&
preg_match('/\d+/', trim($rows[$key]), $temp))
I'd do:
preg_match("/^bundle id\s*(\d+)/", trim($rows[$key]), $match)
The number is then in $match[1].
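A short sketch of that last suggestion applied to the three sample lines, assuming they arrive as separate entries of $rows (the loop and the $ids array are illustrative additions):

```php
<?php
// The three sample lines from the question, as separate rows.
$rows = [
    'bundle id 1 mode active',
    'bundle id 99 mode active',
    'bundle id 999 mode active',
];

$ids = [];
foreach ($rows as $row) {
    // One pattern both checks the prefix and captures the digits after it.
    if (preg_match('/^bundle id\s*(\d+)/', trim($row), $match)) {
        $ids[] = (int) $match[1];
    }
}
print_r($ids);
```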

preg match all get group multiple times

I am trying to write a regular expression that captures a subgroup every time it is found. This is my code:
$string2 = 'cabbba';
preg_match_all('#c(a(b)*a)#',$string2,$result3,PREG_SET_ORDER);
var_dump($result3);
My goal is to get 'b' as a captured group each time (so 3 times). This code outputs the following:
array (size=1)
0 =>
array (size=3)
0 => string 'cabbba' (length=6)
1 => string 'abbba' (length=5)
2 => string 'b' (length=1)
I want it to show 'b' each time it appears, so something like this:
array (size=1)
0 =>
array (size=3)
0 => string 'cabbba' (length=6)
1 => string 'abbba' (length=5)
2 => array (size=3)
0 => string 'b' (length=1)
1 => string 'b' (length=1)
2 => string 'b' (length=1)
This is a simplified example, in the real code the subpattern 'b' will be different each time, but it follows the same pattern.
This is possible only with the \G anchor, which makes each match continue where the previous one ended:
(?:ca|\G)(b)(?=b|(a))
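A sketch of how this pattern could be used from PHP. The (?!^) guard is my addition: without it, \G also matches at the very start of the subject, so a string beginning with "b" would match even without a leading "ca".

```php
<?php
$string2 = 'cabbba';

// Each match consumes exactly one "b". The first match starts at "ca";
// later matches start at \G, the position where the previous match ended.
// The lookahead requires another "b" or the closing "a" to follow.
preg_match_all('/(?:ca|(?!^)\G)(b)(?=b|(a))/', $string2, $result);

// Group 1 holds one entry per "b".
print_r($result[1]);
```

Instead of one match with a nested array, this gives three separate matches, each carrying a single "b" in group 1.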
Did you try using a non-greedy modifier for your b*?
$string2 = 'cabbba';
preg_match_all('#c(a(b)*?a)#', $string2, $result3, PREG_SET_ORDER);
var_dump($result3);
Excuse me if this is not what you asked; I'm not sure I fully understood your needs.
UPDATE:
Sorry, the previous answer is wrong; please ignore it. I'm still trying to work out a correct one.
I tried something like
preg_match_all('#c(a(?:(b{1}))*a)#', $string2, $result3, PREG_SET_ORDER);
but it doesn't work either.
UPDATE 2:
See Avinash Raj's answer; I think it's quite good.

preg_match Regex Matching Full String

I have a simple regex, but it's matching more than I want...
Basically, I'm trying to match certain operators (eg. > < != =) followed by a string.
Regex:
/^(<=|>=|<>|!=|=|<|>)(.*)/
Example subject:
>42
What I'm getting:
array (size=3)
0 => string '>42' (length=3)
1 => string '>' (length=1)
2 => string '42' (length=2)
What I'm trying to get:
array (size=2)
0 => string '>' (length=1)
1 => string '42' (length=2)
What I don't understand is that my regex works perfectly on Regex101
Edit: To clarify, how can I get rid of the full string match?
Your result is correct. Group 0 is the whole match, group 1 is the first capture group, and group 2 is the second.
You are getting all three groups: \0, \1 and \2. See the group matching at the bottom of the page.
Assuming your matches are in $matches, you can run array_shift($matches) to remove the \0 match if you wish.
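A minimal sketch of the array_shift() suggestion, using the regex and subject from the question:

```php
<?php
$subject = '>42';

if (preg_match('/^(<=|>=|<>|!=|=|<|>)(.*)/', $subject, $matches)) {
    // Drop index 0 (the full match) so only the captured groups remain.
    // array_shift() also renumbers the remaining entries from 0.
    array_shift($matches);
    print_r($matches);
}
```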

Regex, PHP: assigning matches to an array after inserting expanded words

In a system that matches user input against a regex pattern I allow the pattern to contain "concept words" that are marked by a twiddle (~).
E.g. I can define
~service-type as '"oil change" rotation brake "tune up"' and
~day as 'Monday Tuesday Wednesday Thursday Friday'
I can then have a pre-regex like:
.*get.*~service-type.*~day
Which by some preprocessing gets expanded to:
/.*get.*(oil change|rotation|brake|tune up).*(Monday|Tuesday|Wednesday|Thursday|Friday)/i
So it will match a sentence like: "I'd like to get an oil change on Wednesday."
This gives me a nice $matches array that looks like this:
array
0 => string 'I'd like to get an oil change on Wednesday' (length=42)
1 => string 'oil change' (length=10)
2 => string 'Wednesday' (length=9)
The difficulty now arises that it is possible or sometimes necessary that the regex contains other (...) patterns.
In this example I wouldn't really need it, but it shows the point:
(.*)(get).*~service-type(.*)~day
expands to
/(.*)(get).*(oil change|rotation|brake|tune up)(.*)(Monday|Tuesday|Wednesday|Thursday|Friday)/i
which results in $matches being:
array
0 => string 'I'd like to get an oil change on Wednesday' (length=42)
1 => string 'I'd like to ' (length=12)
2 => string 'get' (length=3)
3 => string 'oil change' (length=10)
4 => string ' on ' (length=4)
5 => string 'Wednesday' (length=9)
What I'm looking for is a quick and elegant way that would allow me in either case to generate some array like:
array
'service-type' => string 'oil change' (length=10)
'day' => string 'Wednesday' (length=9)
By elegant I mean that I don't have to parse the pattern myself to find out how many (...) groups it already contains, where they are, and where I inserted the expanded concept words. If there is no better way, please tell me too; then I can stop agonizing over whether a nice way exists and bite the bullet.
Thanks
This seems like something you could achieve using named subpatterns (named capture groups) in your regex. See http://uk3.php.net/preg_match#example-4885
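One way this might look, as a sketch only: the $concepts table and the expansion callback are assumptions, since the question does not show how the preprocessing is done. Each ~concept is expanded into a named group, and the named entries are then filtered out of $matches. Group names may contain only letters, digits and underscores, so "service-type" is turned into "service_type" here.

```php
<?php
// Hypothetical concept table; the real expansion step is not shown in the question.
$concepts = [
    'service-type' => ['oil change', 'rotation', 'brake', 'tune up'],
    'day'          => ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
];
$preRegex = '(.*)(get).*~service-type(.*)~day';

// Replace each ~concept with a named group such as (?<day>Monday|Tuesday|...).
$regex = preg_replace_callback('/~([a-z-]+)/', function ($m) use ($concepts) {
    $words = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $concepts[$m[1]]);
    $name = str_replace('-', '_', $m[1]);
    return '(?<' . $name . '>' . implode('|', $words) . ')';
}, $preRegex);

preg_match('/' . $regex . '/i', "I'd like to get an oil change on Wednesday.", $matches);

// Named groups appear under both their name and a number; keep the named keys.
$named = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
print_r($named);
```

Because the lookup is by name, it does not matter how many other (...) groups the pre-regex contains or where they sit.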
