PHP regex: how to get all matches with preg_match

I have a string
$s = 'Sections: B3; C2; D4';
and a regexp
preg_match('/Sections(?:[:;][\s]([BCDE][\d]+))+/ui', $s, $m);
Result is
Array
(
[0] => Sections: B3; C2; D4
[1] => D4
)
How can I get an array with all the sections: B3, C2, D4?
I can't use preg_match_all('/[BCDE][\d]+/ui', ...) because the search must only apply to what follows the word Sections:.
There can be any number of elements (B3, C2, ...).

You may use
'~(?:\G(?!^);|Sections:)\s*\K[BCDE]\d+~i'
See the regex demo
Details
(?:\G(?!^);|Sections:) - either the end of the previous match and a ; (\G(?!^);) or (|) a Sections: substring
\s* - 0 or more whitespace chars
\K - a match reset operator
[BCDE] - a char from the character set (due to i modifier, case insensitive)
\d+ - 1 or more digits.
See the PHP demo:
$s = "Sections: B3; C2; D4";
if (preg_match_all('~(?:\G(?!^);|Sections:)\s*\K[BCDE]\d+~i', $s, $m)) {
print_r($m[0]);
}
Output:
Array
(
[0] => B3
[1] => C2
[2] => D4
)
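To illustrate why the \G-based alternation matters, here is a small sketch with an input I made up (the extra codes before and after the list are not from the question). It suggests that only codes chained directly to Sections: are returned:

```php
<?php
// Hypothetical input: codes also appear before the list and after a break in it.
$s = 'B7; Sections: B3; C2; D4 and E5';

// Same pattern as above: a match must start right after "Sections:" or
// right after the previous match followed by ";".
$pattern = '~(?:\G(?!^);|Sections:)\s*\K[BCDE]\d+~i';

$sections = preg_match_all($pattern, $s, $m) ? $m[0] : [];
print_r($sections); // B3, C2, D4 (B7 and E5 are skipped)
```

B7 is ignored because it comes before Sections:, and E5 is ignored because the chain of ";"-separated items is broken after D4.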

You don't need regex; explode will do fine.
Remove "Sections: ", then explode the rest of the string.
$s = 'Sections: B3; C2; D4';
$s = str_replace('Sections: ', '', $s);
$arr = explode("; ", $s);
var_dump($arr);
https://3v4l.org/PcrNK
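If the spacing around the separators is not guaranteed, a slightly more defensive variant of the same idea might look like this. It is only a sketch: it assumes the string must begin with the Sections: prefix, which the question implies but does not state outright.

```php
<?php
$s = 'Sections: B3; C2; D4';
$prefix = 'Sections:';

$sections = [];
// Only proceed when the string really starts with the prefix.
if (strncmp($s, $prefix, strlen($prefix)) === 0) {
    // Cut the prefix off, split on ";" and trim stray whitespace.
    $list = substr($s, strlen($prefix));
    $sections = array_map('trim', explode(';', $list));
}
print_r($sections);
```

Unlike str_replace, this does not touch a second "Sections: " that might appear later in the string.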

Related

PHP string split regular

The pattern is: (Digits)*(A|B|DF|XY)+(Digits)+
I'm really confused about this pattern.
I want to split strings like this in PHP. Can someone help me?
My input may be something like this:
A1234
B 1239
1A123
12A123
1A 1234
12 A 123
1234 B 123456789
12 XY 1234567890
and I want to convert it to this:
Array
(
[0] => 12
[1] => XY
[2] => 1234567890
)
<?php
$input = "12 XY 123456789";
print_r(preg_split('/\d*[(A|B|DF|XY)+\d+]+/', $input, 3));
//print_r(preg_split('/[\s,]+/', $input, 3));
//print_r(preg_split('/\d*[\s,](A|B)+[\s,]\d+/', $input, 3));
You may match and capture the numbers, letters, and numbers:
$input = "12 XY 123456789";
if (preg_match('/^(?:(\d+)\s*)?(A|B|DF|XY)(?:\s*(\d+))?$/', $input, $matches)){
array_shift($matches);
print_r($matches);
}
See the PHP demo and the regex demo.
^ - start of string
(?:(\d+)\s*)? - an optional sequence of:
(\d+) - Group 1: one or more digits
\s* - 0+ whitespaces
(A|B|DF|XY) - Group 2: A, B, DF or XY
(?:\s*(\d+))? - an optional sequence of:
\s* - 0+ whitespaces
(\d+) - Group 3: one or more digits
$ - end of string.
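As a rough check against the sample inputs from the question, the pattern can be wrapped in a small function. parseCode() is a name I made up for this sketch. Note that when the leading digits are missing, Group 1 comes back as an empty string rather than being dropped:

```php
<?php
// parseCode() is a hypothetical helper name, not part of the answer above.
function parseCode(string $input): ?array
{
    $pattern = '/^(?:(\d+)\s*)?(A|B|DF|XY)(?:\s*(\d+))?$/';
    if (!preg_match($pattern, $input, $m)) {
        return null; // input does not follow the digits/letters/digits shape
    }
    array_shift($m); // drop the full match, keep only the groups
    return $m;
}

$inputs = ['A1234', 'B 1239', '1A123', '12 A 123', '12 XY 1234567890'];
foreach ($inputs as $input) {
    echo $input, ' => ', json_encode(parseCode($input)), PHP_EOL;
}
```

If the empty first element is unwanted, array_filter or an explicit check on $m[1] could remove it, depending on what the calling code expects.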

How to split repeated chars and numbers with preg_split?

I'm trying to solve a problem where I need to split a string into runs of repeated characters and runs of digits.
$code = preg_split('/(.)(?!\1|$)\K/', $code);
I tried this, but it also separates the digits one by one; I only want the letters split into runs of the same character.
I have the string 'FFF86C6'.
I need the array (FFF, 86, C, 6);
the pattern '/(.)(?!\1|$)\K/' returns (FFF, 8, 6, C, 6).
Do you have any idea how to do this?
You can use this regex with preg_match_all:
([A-Za-z])(\1*)|\d+
It looks for a letter, followed by some number of the same character, or some digits. By using preg_match_all we find all matches in the string. Usage in PHP:
$string = "FFF86CR6";
$pieces = preg_match_all('/([A-Za-z])(\1*)|\d+/', $string, $matches);
print_r($matches[0]);
Output:
Array (
[0] => FFF
[1] => 86
[2] => C
[3] => R
[4] => 6
)
Demo on 3v4l.org
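For the exact string from the question, the same idea can be sketched as follows. Dropping the second capture group is my own simplification, since only the full matches in $matches[0] are used:

```php
<?php
$string = 'FFF86C6'; // the string from the question

// Either a letter followed by repeats of that same letter, or a run of digits.
preg_match_all('/([A-Za-z])\1*|\d+/', $string, $matches);

$pieces = $matches[0];
print_r($pieces); // FFF, 86, C, 6
```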

Split string after each number

I have a database full of strings that I'd like to split into an array. Each string contains a list of directions that begin with a letter (U, D, L, R for Up, Down, Left, Right) and a number to tell how far to go in that direction.
Here is an example of one string.
$string = "U29R45U2L5D2L16";
My desired result:
['U29', 'R45', 'U2', 'L5', 'D2', 'L16']
I thought I could just loop through the string, but I don't know how to tell whether the number is one or more digits long.
You can use preg_split to break up the string, splitting on anything that looks like a U, L, D or R followed by digits, and using the PREG_SPLIT_DELIM_CAPTURE flag to keep the delimiter text:
$string = "U29R45U2L5D2L16";
print_r(preg_split('/([UDLR]\d+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY));
Output:
Array (
[0] => U29
[1] => R45
[2] => U2
[3] => L5
[4] => D2
[5] => L16
)
Demo on 3v4l.org
A regular expression should help you:
<?php
$string = "U29R45U2L5D2L16";
preg_match_all("/[A-Z]\d+/", $string, $matches);
var_dump($matches);
Because this task is about text extraction and not about text validation, you can merely split on the zero-width position after one or more digits. In other words, match one or more digits, then forget them with \K so that they are not consumed while splitting.
Code: (Demo)
$string = "U29R45U2L5D2L16";
var_export(
preg_split(
'/\d+\K/',
$string,
0,
PREG_SPLIT_NO_EMPTY
)
);
Output:
array (
0 => 'U29',
1 => 'R45',
2 => 'U2',
3 => 'L5',
4 => 'D2',
5 => 'L16',
)
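If the next step is to act on each direction, it may be handy to capture the letter and the number separately instead of keeping them glued together. This is a sketch only; the 'dir' and 'dist' keys are names I picked, not something from the question:

```php
<?php
$string = "U29R45U2L5D2L16";

$steps = [];
// PREG_SET_ORDER groups the captures per match: [full, letter, digits].
if (preg_match_all('/([UDLR])(\d+)/', $string, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $steps[] = ['dir' => $m[1], 'dist' => (int) $m[2]];
    }
}
print_r($steps);
```

Casting the distance to int at this point saves converting it again later when the moves are applied.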

split string by spaces and colon but not if inside quotes

I have a string like this:
$str = "dateto:'2015-10-07 15:05' xxxx datefrom:'2015-10-09 15:05' yyyy asdf";
the desired result is:
[0] => Array (
[0] => dateto:'2015-10-07 15:05'
[1] => xxxx
[2] => datefrom:'2015-10-09 15:05'
[3] => yyyy
[4] => asdf
)
what I get with:
preg_match_all("/\'(?:[^()]|(?R))+\'|'[^']*'|[^(),\s]+/", $str, $m);
is:
[0] => Array (
[0] => dateto:'2015-10-07
[1] => 15:05'
[2] => xxxx
[3] => datefrom:'2015-10-09
[4] => 15:05'
[5] => yyyy
[6] => asdf
)
I also tried preg_split("/[\s]+/", $str), but I have no idea how to skip whitespace when the value is between quotes. Can anyone show me how, and also please explain the regex? Thank you!
I would use the PCRE verbs (*SKIP)(*F):
preg_split("~'[^']*'(*SKIP)(*F)|\s+~", $str);
DEMO
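Since the question also asks for an explanation, here is the same call again with the pattern annotated piece by piece, run against the string from the question:

```php
<?php
$str = "dateto:'2015-10-07 15:05' xxxx datefrom:'2015-10-09 15:05' yyyy asdf";

// '[^']*'      match a whole single-quoted chunk ...
// (*SKIP)(*F)  ... then force a failure and resume scanning after the chunk,
//              so whitespace inside quotes is never treated as a delimiter
// |\s+         otherwise, split on a run of whitespace
$parts = preg_split("~'[^']*'(*SKIP)(*F)|\s+~", $str);
print_r($parts);
```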
Often, when you are looking to split a string, preg_split isn't the best approach (that seems a little counterintuitive, but it's true most of the time). A more efficient way is to find all the items (with preg_match_all) using a pattern that describes everything that is not the delimiter (whitespace here):
$pattern = <<<'EOD'
~(?=\S)[^'"\s]*(?:'[^']*'[^'"\s]*|"[^"]*"[^'"\s]*)*~
EOD;
if (preg_match_all($pattern, $str, $m))
$result = $m[0];
pattern details:
~ # pattern delimiter
(?=\S) # the lookahead assertion only succeeds if there is a non-
# white-space character at the current position.
# (This lookahead is useful for two reasons:
# - it allows the regex engine to quickly find the start of
# the next item without having to test each branch of the
# following alternation at each position in the string
# until one succeeds.
# - it ensures that there's at least one non-white-space.
# Without it, the pattern may match an empty string.
# )
[^'"\s]* #"'# all that is not a quote or a white-space
(?: # optional quoted parts
'[^']*' [^'"\s]* #"# single quotes
|
"[^"]*" [^'"\s]* # double quotes
)*
~
demo
Note that with this somewhat long pattern, the five items of your example string are found in only 60 steps. You can also use this shorter, simpler pattern:
~(?:[^'"\s]+|'[^']*'|"[^"]*")+~
but it's a little less efficient.
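To try the longer pattern end to end, it can be wrapped in a function. tokenize() is a hypothetical name for this sketch, and the second input with double quotes is my own example rather than one from the question:

```php
<?php
// tokenize() is a hypothetical helper name used only for this sketch.
function tokenize(string $str): array
{
    // Nowdoc avoids having to escape the quotes inside the pattern.
    $pattern = <<<'EOD'
~(?=\S)[^'"\s]*(?:'[^']*'[^'"\s]*|"[^"]*"[^'"\s]*)*~
EOD;
    return preg_match_all($pattern, $str, $m) ? $m[0] : [];
}

$str = "dateto:'2015-10-07 15:05' xxxx datefrom:'2015-10-09 15:05' yyyy asdf";
print_r(tokenize($str));

// Made-up input to show that double-quoted values are kept whole as well.
print_r(tokenize('a:"b c" d'));
```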
For your example, you can use preg_split with a negative lookbehind (?<!\d), i.e.:
<?php
$str = "dateto:'2015-10-07 15:05' xxxx datefrom:'2015-10-09 15:05' yyyy asdf";
$matches = preg_split('/(?<!\d)(\s)/', $str);
print_r($matches);
Output:
Array
(
[0] => dateto:'2015-10-07 15:05'
[1] => xxxx
[2] => datefrom:'2015-10-09 15:05'
[3] => yyyy
[4] => asdf
)
Demo:
http://ideone.com/EP06Nt
Regex Explanation:
(?<!\d)(\s)
Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!\d)»
Match a single character that is a “digit” «\d»
Match the regex below and capture its match into backreference number 1 «(\s)»
Match a single character that is a “whitespace character” «\s»
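One limitation worth noting: the lookbehind does not look at quotes at all, it only refuses to split after a digit. The sketch below uses two inputs I made up to show where that can go wrong:

```php
<?php
$pattern = '/(?<!\d)(\s)/';

// Works for the question's data because the space inside the quotes follows a digit.
$fromQuestion = preg_split($pattern, "dateto:'2015-10-07 15:05' xxxx");

// Made-up input: an unquoted token ending in a digit is not split off.
$endsWithDigit = preg_split($pattern, 'page2 next');

// Made-up input: a quoted value whose inner space follows a letter is split.
$quotedWords = preg_split($pattern, "name:'John Smith'");

print_r($fromQuestion);
print_r($endsWithDigit);
print_r($quotedWords);
```

So this approach fits data shaped exactly like the example, while a quote-aware pattern is the safer choice for arbitrary values.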

PHP Regex Word Boundary exclude underscore _

I'm using the regex word boundary \b to match foo in the following $sentence, but the result is not what I need. The underscore is killing me: I want the underscore to act as a word boundary, just like a hyphen or a space:
$sentence = "foo_foo_foo foo-foo_foo";
X X X YES X X
Expected:
$sentence = "foo_foo_foo foo-foo_foo";
YES YES YES YES YES YES
My code:
preg_match("/\bfoo\b/i", $sentence);
You would have to create DIY boundaries.
(?:\b|_\K)foo(?=\b|_)
Does this do what you want?
preg_match_all("/foo/i", $sentence, $matches);
var_dump($matches);
You can subtract _ from the \w and use unambiguous word boundaries:
/(?<![^\W_])foo(?![^\W_])/i
See this regex demo. Note \bfoo = (?<!\w)foo and foo(?!\w) = foo\b, and subtracting a _ from \w (that is equal to [^\W]) results in [^\W_].
In PHP, you can use preg_match_all to find all occurrences:
preg_match_all("/(?<![^\W_])foo(?![^\W_])/i", $sentence)
To replace / remove all occurrences, you may use preg_replace:
preg_replace("/(?<![^\W_])foo(?![^\W_])/i", "YES", $sentence)
See the PHP demo online:
$sentence = "foo_foo_foo foo-foo_foo";
if (preg_match_all("/(?<![^\W_])foo(?![^\W_])/i", $sentence, $matches)) {
print_r($matches[0]);
}
// => Array( [0] => foo [1] => foo [2] => foo [3] => foo [4] => foo [5] => foo)
echo PHP_EOL . preg_replace("/(?<![^\W_])foo(?![^\W_])/i", "YES", $sentence);
// => YES_YES_YES YES-YES_YES
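For comparison, here is a quick sketch that counts matches for the plain \b version and for the two boundary patterns suggested in this thread. The second test string is my own, added to check that longer words are still rejected:

```php
<?php
$sentence = 'foo_foo_foo foo-foo_foo';

// preg_match_all returns the number of full matches.
$plain  = preg_match_all('/\bfoo\b/i', $sentence);                  // "_" counts as a word char
$diy    = preg_match_all('/(?:\b|_\K)foo(?=\b|_)/i', $sentence);    // DIY boundaries
$custom = preg_match_all('/(?<![^\W_])foo(?![^\W_])/i', $sentence); // \w minus "_"

echo "plain: $plain, diy: $diy, custom: $custom", PHP_EOL;

// Made-up input: only the foo wrapped in underscores should match.
$strict = preg_match_all('/(?<![^\W_])foo(?![^\W_])/i', 'foobar foo1 _foo_');
echo "strict: $strict", PHP_EOL;
```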
