a question about preg_match - php

I have a question in PHP:
When using preg_match, why does #^(([a-z]{2})/)?(([a-z\-]{3,})/(([a-z\-]{3,}))?)?$#i match ab/cde/fgh but not ab/cde?
I mean:
preg_match('#^(([a-z]{2})/)?(([a-z\-]{3,})/(([a-z\-]{3,}))?)?$#i','ab/cde/fgh',$match)
$match = Array
(
[0] => ab/cde/fgh
[1] => ab/
[2] => ab
[3] => cde/fgh
[4] => cde
[5] => fgh
[6] => fgh
)
and
preg_match('#^(([a-z]{2})/)?(([a-z\-]{3,})/(([a-z\-]{3,}))?)?$#i','ab/cde',$match)
$match = Array ()

Because as the regex is written, you need a slash after the cde. ab/cde/ should match.

[a-z-]{3,} = 3 or more characters
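A minimal sketch of that point. The $fixed pattern is an assumption about what the asker wants (the third segment optional together with its slash) and does not come from the thread; it simply moves the slash inside the optional group.

```php
<?php
// Original pattern: the slash after the second segment is mandatory.
$original = '#^(([a-z]{2})/)?(([a-z\-]{3,})/(([a-z\-]{3,}))?)?$#i';

// Assumed fix (not from the thread): the slash now belongs to the
// optional third segment, so "ab/cde" needs no trailing slash.
$fixed = '#^(?:([a-z]{2})/)?(?:([a-z\-]{3,})(?:/([a-z\-]{3,}))?)?$#i';

$originalResults = [];
$fixedResults = [];
foreach (['ab/cde/fgh', 'ab/cde/', 'ab/cde'] as $subject) {
    // preg_match() returns 1 on a match and 0 on no match.
    $originalResults[$subject] = preg_match($original, $subject);
    $fixedResults[$subject] = preg_match($fixed, $subject);
}

print_r($originalResults);
print_r($fixedResults);
```

With the original pattern only the subjects that contain the second slash match; with the assumed fix, ab/cde matches and ab/cde/ no longer does.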

Related

PHP preg_match() doesn't match all subpatterns

I have a preg_match() call that matches the pattern but doesn't return the expected matches (in the third parameter).
My regex pattern has multiple subpatterns.
$pattern = "~^&multi&[^&]+(&(?:(p-(?<sad>[1-9]\d*)|page-(?<sad>[1-9]\d*))))?&[^&]+(&(?:(p-(?<gogosi>[1-9]\d*)|page-(?<gogosi>[1-9]\d*))))?&?$~J";
$string = "&multi&mickael&p-23&george&page-34";
preg_match($pattern, $string, $matches);
This is what $matches contains:
Array
(
[0] => &multi&mickael&p-23&george&page-34
[1] => &p-23
[2] => p-23
[sad] =>
[3] => 23
[4] =>
[5] => &page-34
[6] => page-34
[gogosi] => 34
[7] =>
[8] => 34
)
The problem is that [sad] should have the value 23.
If I leave the second page (page-34) out of $string, since it is optional [...]
$string = "&multi&mickael&p-23&george";
[...] I get the correct $matches, because [sad] gets its value:
Array
(
[0] => &multi&mickael&p-23&george
[1] => &p-23
[2] => p-23
[sad] => 23
[3] => 23
)
But I want the regex to return the proper values even when both paginations are in $string.
What should I do so that all subpatterns get their values?
Note: Words such as 'p' and 'page' are only examples. Any words can appear there.
Note: The data above is just an example. Please don't give me workarounds, but something that works for any input data.
You may use a branch reset group, (?|...|...):
'~^&multi&[^&]+(&((?|p-(?<sad>[1-9]\d*)|page-(?<sad>[1-9]\d*))))?&[^&]+(&((?|p-(?<gogosi>[1-9]\d*)|page-(?<gogosi>[1-9]\d*))))?&?$~J'
See the PHP demo:
$pattern = "~^&multi&[^&]+(&((?|p-(?<sad>[1-9]\d*)|page-(?<sad>[1-9]\d*))))?&[^&]+(&((?|p-(?<gogosi>[1-9]\d*)|page-(?<gogosi>[1-9]\d*))))?&?$~J";
$string = "&multi&mickael&p-23&george&page-34";
if (preg_match($pattern, $string, $matches)) {
print_r($matches);
}
Output:
Array
(
[0] => &multi&mickael&p-23&george&page-34
[1] => &p-23
[2] => p-23
[sad] => 23
[3] => 23
[4] => &page-34
[5] => page-34
[gogosi] => 34
[6] => 34
)
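To see what the branch reset group changes in isolation, here is a reduced sketch. The two short patterns are illustrative stand-ins, not the asker's full pattern.

```php
<?php
$subject = 'page-34';

// Plain alternation: each capturing group gets its own number,
// so the digits land in group 2 and group 1 stays empty.
preg_match('~(?:p-(\d+)|page-(\d+))~', $subject, $plain);

// Branch reset: both alternatives share group number 1.
preg_match('~(?|p-(\d+)|page-(\d+))~', $subject, $reset);

print_r($plain);
print_r($reset);
```

In the plain version the value you want moves between group 1 and group 2 depending on which alternative matched, which is the same effect that left [sad] empty in the question.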

Reg Exp - preg_match_all reduce array result

This is my regex: "[c]?[\d+|\D+]\s*". My input is "c7=c4/c5*100" and the result is:
Array
(
[0] => Array
(
[0] => c7
[1] => =
[2] => c4
[3] => /
[4] => c5
[5] => *
[6] => 1
[7] => 0
[8] => 0
)
)
But what I want is:
Array
(
[0] => Array
(
[0] => c7
[1] => =
[2] => c4
[3] => /
[4] => c5
[5] => *
[6] => 100
)
)
I can't seem to get the last part working, and I'm lost as to what to do next. Can anyone help?
Thanks,
Paul
You specified a character class, [\d+|\D+], which matches any single one of the listed characters. I think you meant to use an alternation | inside a grouping construct, c?(?:\d+|\D+)\s*, but in that case \D+ would also consume a c that follows a non-digit, resulting in =c and /c as matches.
Try matching an optional c (c?) followed by one or more digits (\d+), or | a single non-digit (\D):
c?\d+|\D
$re = '/c?\d+|\D/m';
$str = 'c7=c4/c5*100';
preg_match_all($re, $str, $matches);
print_r($matches);
That will result in:
Array
(
[0] => Array
(
[0] => c7
[1] => =
[2] => c4
[3] => /
[4] => c5
[5] => *
[6] => 100
)
)
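A side-by-side sketch of the three patterns discussed above, run on the same input, to make the difference between the character class and the alternation concrete.

```php
<?php
$str = 'c7=c4/c5*100';

// Character class: [\d+|\D+] is ONE character out of
// "digit, +, |, non-digit", so every digit of 100 matches separately.
preg_match_all('/[c]?[\d+|\D+]\s*/', $str, $class);

// Grouped alternation with \D+: the non-digit run swallows the next "c".
preg_match_all('/c?(?:\d+|\D+)\s*/', $str, $grouped);

// Suggested pattern: optional c plus digits, or a single non-digit.
preg_match_all('/c?\d+|\D/', $str, $suggested);

print_r($class[0]);
print_r($grouped[0]);
print_r($suggested[0]);
```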

Match numbers separated with colons, semicolons optionally

I have the following string:
objectsA=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;:objectsB=
And I want to know how I can match the numbers inside objectsA and objectsB, taking into consideration that either one may be empty. For example:
objectsA can be:
objectsA=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
But it can also be
objectsA=:objectsB=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
Or even
objectsA=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;:objectsB=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
The current code:
$line2 = "
2016-07-31 00:39:00 debian-8gb-sfo2-01 gdeliveryd: notice : formatlog:trade:roleidA=3328:roleidB=2161:moneyA=0:moneyB=0:objectsA=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;:objectsB=";
if (strpos($line2, ':trade:roleidA=3328') !== false) {
if (!preg_match('/([\d: -]+)\s*.*\sformatlog:trade:roleidA=(\d+):(.*)roleidB=(\d+):moneyA=(\d+):moneyB=(\d+):objectsA=(regexhere):objectsB=(regexhere).*$/', $line2, $c)) {
// error occurred
}
echo '<pre>';
print_r($c);
}
And the problem is that my current regex, ((\d+\,\d+\,\d\;)+|), behaves oddly in a way that shouldn't happen.
Output:
Array
(
[0] => 2016-07-31 00:39:00 debian-8gb-sfo2-01 gdeliveryd: notice : formatlog:trade:roleidA=3328:roleidB=2161:moneyA=0:moneyB=0:objectsA=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;:objectsB=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
[1] => 2016-07-31 00:39:00
[2] => 3328
[3] =>
[4] => 2161
[5] => 0
[6] => 0
[7] => 38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
[8] => 38155,39,1;
[9] => 38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
[10] => 38155,39,1;
)
For some reason, if the objects have the same size, the regex creates a new array index, which shouldn't happen.
The expected result:
Array
(
[0] => 2016-07-31 00:39:00 debian-8gb-sfo2-01 gdeliveryd: notice : formatlog:trade:roleidA=3328:roleidB=2161:moneyA=0:moneyB=0:objectsA=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;:objectsB=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
[1] => 2016-07-31 00:39:00
[2] => 3328
[4] => 2161
[5] => 0
[6] => 0
[7] => 38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
[8] => 38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
)
Regex: ^(?:\s?\d+(?:[-:]\d+){2}){2}|\w+=\K[^:]+
Details:
(?:) Non-capturing group
[] Match a single character present in the list
\K Resets the starting point of the reported match
+ Matches between one and unlimited times
| Or
PHP code:
$string = "2016-07-31 00:39:00 debian-8gb-sfo2-01 gdeliveryd: notice : formatlog:trade:roleidA=3328:roleidB=2161:moneyA=0:moneyB=0:objectsA=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;:objectsB=38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;";
preg_match_all('~^(?:\s?\d+(?:[-:]\d+){2}){2}|\w+=\K[^:]+~', $string, $matches);
print_r($matches[0]);
Output:
Array
(
[0] => 2016-07-31 00:39:00
[1] => 3328
[2] => 2161
[3] => 0
[4] => 0
[5] => 38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
[6] => 38155,54,1;38155,53,1;38155,45,1;38155,47,1;38155,46,1;2000,55,1;38155,50,1;38155,49,1;38155,48,1;38155,40,1;38155,41,1;38155,42,1;38155,43,1;38155,51,1;38155,52,1;38155,44,1;38155,35,1;38155,33,1;38155,32,1;38155,34,1;38155,36,1;38155,38,1;38155,39,1;
)
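One thing to watch with the \K pattern: because [^:]+ needs at least one character, an empty objectsA or objectsB produces no match at all, so the positions in the result shift. The sketch below shows this on a shortened sample line (an assumption made for brevity) and contrasts it with a key/value variant that is not from the answer above.

```php
<?php
// Shortened sample line with an empty objectsA (illustrative data).
$line = 'formatlog:trade:roleidA=3328:roleidB=2161:moneyA=0:moneyB=0'
    . ':objectsA=:objectsB=38155,54,1;2000,55,1;';

// \K variant from the answer: the empty field yields no match.
preg_match_all('~\w+=\K[^:]+~', $line, $values);

// Assumed alternative: capture the key too and allow an empty value.
preg_match_all('~(\w+)=([^:]*)~', $line, $pairs, PREG_SET_ORDER);
$fields = [];
foreach ($pairs as $pair) {
    $fields[$pair[1]] = $pair[2] ?? '';
}

print_r($values[0]);
print_r($fields);
```

Keying the result by field name means an empty field shows up as an empty string instead of silently disappearing.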
For those navigating through this question: sometimes less is more. The pattern (.*) will do the trick.
([\d: -]+)\s*.*\sformatlog:trade:roleidA=(\d+):roleidB=(\d+):moneyA=(\d+):moneyB=(\d+):objectsA=(.*):objectsB=(.*).*$
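A possible way to wrap this idea into a function, as a sketch only. The function names parseTradeLine and splitObjects, the named groups, and the shortened sample line are all hypothetical. It also assumes the object lists never contain a colon, so it uses [^:]* instead of (.*) to keep each list from running into the next field.

```php
<?php
// Hypothetical helper: turns "a,b,c;d,e,f;" into [[a,b,c],[d,e,f]].
function splitObjects(string $list): array
{
    $items = [];
    foreach (explode(';', $list) as $item) {
        if ($item !== '') {
            $items[] = array_map('intval', explode(',', $item));
        }
    }
    return $items;
}

// Hypothetical helper: returns null when the line is not a trade line.
function parseTradeLine(string $line): ?array
{
    $re = '/^\s*(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s.*\sformatlog:trade'
        . ':roleidA=(?<roleA>\d+):roleidB=(?<roleB>\d+)'
        . ':moneyA=(?<moneyA>\d+):moneyB=(?<moneyB>\d+)'
        . ':objectsA=(?<objectsA>[^:]*):objectsB=(?<objectsB>[^:]*)$/';
    if (!preg_match($re, $line, $m)) {
        return null;
    }
    return [
        'time'     => $m['time'],
        'roleA'    => (int) $m['roleA'],
        'roleB'    => (int) $m['roleB'],
        'moneyA'   => (int) $m['moneyA'],
        'moneyB'   => (int) $m['moneyB'],
        'objectsA' => splitObjects($m['objectsA'] ?? ''),
        'objectsB' => splitObjects($m['objectsB'] ?? ''),
    ];
}

// Shortened sample line; objectsB is empty on purpose.
$line = '2016-07-31 00:39:00 debian-8gb-sfo2-01 gdeliveryd: notice : '
    . 'formatlog:trade:roleidA=3328:roleidB=2161:moneyA=0:moneyB=0'
    . ':objectsA=38155,54,1;2000,55,1;:objectsB=';
$parsed = parseTradeLine($line);
print_r($parsed);
```

Either list may be empty here, and an empty list comes back as an empty array instead of a missing index.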

How to make this weird string explode in PHP?

I have a string like the following
DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]
The above string is formatted in groups, like the following:
A-B[C]-D-E-[F]-G-[H]
The thing is that I'd like to process some of those groups, so I want to do something like explode.
I say "like" because I have tried this code:
$string = 'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]';
$parts = explode( '-', $string );
print_r( $parts );
and I get the following result:
Array
(
[0] => DAS
[1] => 1111[DR
[2] => Helpfull
[3] => R]
[4] => RUN
[5] =>
[6] => [121668688374]
[7] => N
[8] => [+helpfull_+string]
)
which is not what I need.
What I need is the following output:
Array
(
[0] => DAS
[1] => 1111[DR-Helpfull-R]
[2] => RUN
[3] =>
[4] => [121668688374]
[5] => N
[6] => [+helpfull_+string]
)
Can someone please suggest a nice and elegant way to explode this string in the way I need?
What I forgot to mention is that the string can have more or fewer groups. Examples:
DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]
DAS-1111[DR-Helpfull-R]-RUN--[121668688374]
DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]-anotherPart
Update 1
As mentioned by @axiac, preg_split can do the work. But can you please help with the regex now?
I have tried this, but it seems to be incorrect:
(?!\]\-)\-
The code:
$str = 'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]';
$re = '/([^-[]*(?:\[[^\]]*\])?[^-]*)-?/';
$matches = array();
preg_match_all($re, $str, $matches);
print_r($matches[1]);
Its output:
Array
(
[0] => DAS
[1] => 1111[DR-Helpfull-R]
[2] => RUN
[3] =>
[4] => [121668688374]
[5] => N
[6] => [+helpfull_+string]
[7] =>
)
There is an extra empty value at position 7 in the output. It appears because of the zero-or-one quantifier (?) placed at the end of the regex. The quantifier is needed because without it the last piece (at index 6) is not matched.
Alternatively, you can remove the ? after the last - so that the dash (-) must always match. In that case you must append an extra - to your input string.
The regex
(                   # start of the 1st subpattern
                    # the captured value is returned in $matches[1]
  [^-[]*            # match any character but '-' and '[', zero or more times
  (?:               # start of a non-capturing subpattern
    \[              # match an opening square bracket ('[')
    [^\]]*          # match any character but ']', zero or more times
    \]              # match a closing square bracket (']')
  )?                # end of the subpattern; it is optional (can appear 0 or 1 times)
  [^-]*             # match any character but '-', zero or more times
)                   # end of the 1st subpattern
-?                  # match an optional dash ('-')
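The "mandatory dash plus appended dash" variant mentioned above could look like this. It is only a sketch of that suggestion, using the same sample string.

```php
<?php
$str = 'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]';

// Same pattern as before, but the trailing dash is now mandatory.
$re = '/([^-[]*(?:\[[^\]]*\])?[^-]*)-/';

// Append a dash so the last piece still has a terminator to match.
preg_match_all($re, $str . '-', $matches);
$parts = $matches[1];

print_r($parts);
```

Because every match must now end in a dash, the regex can no longer produce an empty match at the very end of the string, so the extra element at index 7 goes away.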
Instead of exploding you should try to match the following pattern:
(?:^|-)([^-\[]*(?:\[[^\]]+\])?)
Here is an example:
$regex = '/(?:^|-)([^-\[]*(?:\[[^\]]+\])?)/';
$tests = array(
'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]',
'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]',
'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]-anotherPart'
);
foreach ($tests as $test) {
preg_match_all($regex, $test, $result);
print_r($result[1]);
}
Output:
// DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]
Array
(
[0] => DAS
[1] => 1111[DR-Helpfull-R]
[2] => RUN
[3] =>
[4] => [121668688374]
[5] => N
[6] => [+helpfull_+string]
)
// DAS-1111[DR-Helpfull-R]-RUN--[121668688374]
Array
(
[0] => DAS
[1] => 1111[DR-Helpfull-R]
[2] => RUN
[3] =>
[4] => [121668688374]
)
// DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]-anotherPart
Array
(
[0] => DAS
[1] => 1111[DR-Helpfull-R]
[2] => RUN
[3] =>
[4] => [121668688374]
[5] => N
[6] => [+helpfull_+string]
[7] => anotherPart
)
This case is perfect for the (*SKIP)(*FAIL) method. You want to split your string on the hyphens, so long as they aren't inside of square brackets.
Easy. Just disqualify these hyphens as delimiters like so:
Pattern: ~\[[^]]+\](*SKIP)(*FAIL)|-~
Code:
$strings=['DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]',
'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]',
'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]-anotherPart'];
foreach($strings as $string){
var_export(preg_split('~\[[^]]+\](*SKIP)(*FAIL)|-~',$string));
echo "\n\n";
}
Output:
array (
0 => 'DAS',
1 => '1111[DR-Helpfull-R]',
2 => 'RUN',
3 => '',
4 => '[121668688374]',
5 => 'N',
6 => '[+helpfull_+string]',
)
array (
0 => 'DAS',
1 => '1111[DR-Helpfull-R]',
2 => 'RUN',
3 => '',
4 => '[121668688374]',
)
array (
0 => 'DAS',
1 => '1111[DR-Helpfull-R]',
2 => 'RUN',
3 => '',
4 => '[121668688374]',
5 => 'N',
6 => '[+helpfull_+string]',
7 => 'anotherPart',
)
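For comparison, a sketch of a different technique that gives the same result on these inputs: a negative lookahead instead of (*SKIP)(*FAIL). This pattern is not from the answer above, and it assumes the brackets are never nested and always balanced.

```php
<?php
$string = 'DAS-1111[DR-Helpfull-R]-RUN--[121668688374]-N-[+helpfull_+string]';

// (*SKIP)(*FAIL): consume a whole [...] block, then discard it as a delimiter.
$viaSkip = preg_split('~\[[^]]+\](*SKIP)(*FAIL)|-~', $string);

// Lookahead alternative (assumption): a dash is a delimiter unless the next
// bracket character after it is a closing one, i.e. we are inside [...].
$viaLookahead = preg_split('~-(?![^\[\]]*\])~', $string);

var_export($viaSkip);
var_export($viaLookahead);
```

The lookahead version works in engines that lack backtracking control verbs, while (*SKIP)(*FAIL) avoids rescanning the text after each dash.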

preg_split with regex giving incorrect output

I'm applying preg_split to a string, but I'm not getting the desired output. For example:
$string = 'Tachycardia limit_from:1900-01-01 limit_to:2027-08-29 numresults:10 sort:publication-date direction:descending facet-on-toc-section-id:Case Reports';
$vals = preg_split("/(\w*\d?):/", $string, NULL, PREG_SPLIT_DELIM_CAPTURE);
is generating the output
Array
(
[0] => Tachycardia
[1] => limit_from
[2] => 1900-01-01
[3] => limit_to
[4] => 2027-08-29
[5] => numresults
[6] => 10
[7] => sort
[8] => publication-date
[9] => direction
[10] => descending facet-on-toc-section-
[11] => id
[12] => Case Reports
)
which is wrong. The desired output is
Array
(
[0] => Tachycardia
[1] => limit_from
[2] => 1900-01-01
[3] => limit_to
[4] => 2027-08-29
[5] => numresults
[6] => 10
[7] => sort
[8] => publication-date
[9] => direction
[10] => descending
[11] => facet-on-toc-section-id
[12] => Case Reports
)
There's something wrong with the regex, but I'm not able to fix it.
I would use
$vals = preg_split("/(\S+):/", $string, -1, PREG_SPLIT_DELIM_CAPTURE);
The output is what you want, although each value keeps the space that follows it, so trim the values if that matters.
It's because the \w class does not include the - character, so I would extend the pattern to allow it too:
/((?:\w|-)*\d?):/
Try this regex instead to include '-' or other characters in your splitting pattern: http://regexr.com?32qgs
((?:[\w\-])*\d?):
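Putting the suggestions together, here is a sketch that also trims the values and builds a key/value map. The leading \s* and the $term / $params names are additions for illustration, not part of the answers above, and the approach assumes values never contain a word followed by a colon.

```php
<?php
$string = 'Tachycardia limit_from:1900-01-01 limit_to:2027-08-29 numresults:10 '
    . 'sort:publication-date direction:descending facet-on-toc-section-id:Case Reports';

// \s* swallows the space before each key, so the values come back trimmed.
$vals = preg_split('/\s*([\w-]+):/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

// Element 0 is the free-text term; the rest alternate key, value.
$term = array_shift($vals);
$params = [];
for ($i = 0; $i + 1 < count($vals); $i += 2) {
    $params[$vals[$i]] = $vals[$i + 1];
}

print_r($params);
```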
