Valid example
12[red,green],13[xs,xl,xxl,some other text with chars like _&-##%]
number[anythingBut ()[]{},anythingBut ()[]{}](,number[anythingBut ()[]{},anythingBut ()[]{}]) or nothing
Full match 12[red,green]
Group 1 12
Group 2 red,green
Full match 13[xs,xl,xxl,some other text with chars like _&-##%]
Group 1 13
Group 2 xs,xl,xxl,some other text with chars like _&-##%
Not valid example
13[xs,xl,xxl 9974-?ds12[dfgd,dfgd]]
What I tried is this: (\d+(?=\[))\[([^\(\[\{\}\]\)]+)\], but it also matches invalid input such as the example above.
If you just need to validate the input, you can add some anchors:
^(?:\d+\[[^\(\[\{\}\]\)]+\](?:,|$))+$
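A minimal sketch of how the anchored pattern could be used for validation. The helper name isValidList is hypothetical; only the pattern comes from the answer.

```php
<?php
// Hypothetical helper; the pattern is the anchored one shown above.
function isValidList(string $input): bool
{
    $pattern = '/^(?:\d+\[[^\(\[\{\}\]\)]+\](?:,|$))+$/';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $input) === 1;
}

var_dump(isValidList('12[red,green],13[xs,xl,xxl]'));         // bool(true)
var_dump(isValidList('13[xs,xl,xxl 9974-?ds12[dfgd,dfgd]]')); // bool(false)
```

Because the separator is written as (?:,|$), a trailing comma such as 12[a], is also accepted. If that matters, the separator would need to be tightened.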
If you also need to extract the matching parts, use a second regex. A single pattern does not handle both validation and extraction well.
$in = '12[red,green],13[xs,xl,xxl,some other text with chars like _&-##%],13[xs,xl,xxl 9974-?ds12[dfgd,dfgd]]';
preg_match_all('/(\d+)\[([^][{}()]+)(?=\](?:,|$))/', $in, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => 12[red,green
[1] => 13[xs,xl,xxl,some other text with chars like _&-##%
)
[1] => Array
(
[0] => 12
[1] => 13
)
[2] => Array
(
[0] => red,green
[1] => xs,xl,xxl,some other text with chars like _&-##%
)
)
Explanation:
/ : regex delimiter
(\d+) : group 1, 1 or more digits
\[ : open square bracket
( : start group 2
[^][{}()]+ : 1 or more characters that are not parentheses, curly braces or square brackets
) : end group 2
(?= : positive lookahead, make sure what follows is
\] : a close square bracket
(?:,|$) : non capture group, a comma or end of string
) : end lookahead
/ : regex delimiter
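The two steps (validate, then extract) could be combined roughly as follows. This is only a sketch: parseList and the array keys number and values are made-up names, and the extraction pattern consumes the closing ] instead of using a lookahead, which is safe only because the whole string was validated first.

```php
<?php
// Hypothetical helper: returns null for invalid input, otherwise one
// entry per "number[values]" item.
function parseList(string $input): ?array
{
    // Step 1: validate the whole string with the anchored pattern.
    if (preg_match('/^(?:\d+\[[^][{}()]+\](?:,|$))+$/', $input) !== 1) {
        return null;
    }

    // Step 2: extract each item; group 1 is the number, group 2 the values.
    preg_match_all('/(\d+)\[([^][{}()]+)\]/', $input, $sets, PREG_SET_ORDER);

    $items = [];
    foreach ($sets as $set) {
        $items[] = [
            'number' => (int) $set[1],
            'values' => explode(',', $set[2]),
        ];
    }

    return $items;
}

print_r(parseList('12[red,green],13[xs,xl]'));
```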
I'm not good at Regex and I've been trying for hours now so I hope you can help me. I have this text:
&#x271D;his is *&#x271D;he* *in&#x271D;erne&#x271D;*
I need to capture (using PREG_OFFSET_CAPTURE) only the &#x271D; in a word surrounded with *, so I only need to capture the last three &#x271D; in this example. The output array should look something like this:
[0] => Array
(
[0] => &#x271D;
[1] => 17
)
[1] => Array
(
[0] => &#x271D;
[1] => 32
)
[2] => Array
(
[0] => &#x271D;
[1] => 44
)
I've tried using (&#x271D;) but of course this selects all instances, including those in words without asterisks. Then I tried \*[^ ]*(&#x271D;)[^ ]*\* but this only gives me the last instance in a word. I've tried many other variations but all were wrong.
To clarify: The asterisks can be anywhere in the string, but always at the beginning and end of a word. The opening asterisk is always preceded by a space, except at the beginning of the string, and the closing asterisk is always followed by a space, except at the end of the string. I must add that punctuation marks can be inside these asterisks. &#x271D; is exactly (and only) what I need to capture and can be at any position in a word.
You could make use of the \G anchor to get iterative matches between the *. The anchor matches either at the start of the string, or at the end of the previous match.
(?:\*|\G(?!^))[^&*]*(?>&(?!#)[^&*]*)*\K&#x271D;(?=[^*]*\*)
Explanation
(?: Non capture group
\* Match *
| Or
\G(?!^) Assert the end of the previous match, not at the start
) Close non capture group
[^&*]* Match 0+ times any char except & and *
(?> Atomic group
&(?!#) Match & only when not directly followed by #
[^&*]* Match 0+ times any char except & and *
)* Close atomic group and repeat 0+ times
\K Clear the match buffer (forget what is matched until now)
&#x271D; Match literally
(?=[^*]*\*) Positive lookahead, assert a * at the right
For example
$re = '/(?:\*|\G(?!^))[^&*]*(?>&(?!#)[^&*]*)*\K&#x271D;(?=[^*]*\*)/m';
$str = '&#x271D;his is *&#x271D;he* *in&#x271D;erne&#x271D;*';
preg_match_all($re, $str, $matches, PREG_OFFSET_CAPTURE);
print_r($matches[0]);
Output
Array
(
[0] => Array
(
[0] => &#x271D;
[1] => 16
)
[1] => Array
(
[0] => &#x271D;
[1] => 31
)
[2] => Array
(
[0] => &#x271D;
[1] => 43
)
)
Note that each offset is 1 less than expected because offsets are counted from 0. See PREG_OFFSET_CAPTURE
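If 1-based positions are needed, they can be derived from the 0-based byte offsets. The sketch below assumes the input contains the literal HTML entity &#x271D; (which is what the reported offsets imply). The variable name $oneBased is made up for illustration.

```php
<?php
$re  = '/(?:\*|\G(?!^))[^&*]*(?>&(?!#)[^&*]*)*\K&#x271D;(?=[^*]*\*)/';
$str = '&#x271D;his is *&#x271D;he* *in&#x271D;erne&#x271D;*';

preg_match_all($re, $str, $matches, PREG_OFFSET_CAPTURE);

$oneBased = [];
foreach ($matches[0] as [$text, $offset]) {
    // Offsets are 0-based byte positions, so substr() with the same
    // offset returns the matched text.
    if (substr($str, $offset, strlen($text)) === $text) {
        $oneBased[] = $offset + 1; // convert to 1-based
    }
}

print_r($oneBased); // 17, 32, 44
```

Keep in mind that the offsets count bytes, not characters, so multibyte text before a match shifts them accordingly.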
If you want to match more variations, you could use a non capturing group and list the ones that you would accept to match. If you don't want to cross newline boundaries you can exclude matching those in the negated character class.
(?:\*|\G(?!^))[^&*\r\n]*(?>&(?!#)[^&*\r\n]*)*\K&#(?:x271D|169);(?=[^*\r\n]*\*)
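A short sketch of the multi-entity variant in use. The test string is invented for illustration; it places &#169; and &#x271D; inside asterisk-delimited words on two separate lines.

```php
<?php
// Both negated character classes exclude \r and \n so a match cannot
// run across a line break.
$re  = '/(?:\*|\G(?!^))[^&*\r\n]*(?>&(?!#)[^&*\r\n]*)*\K&#(?:x271D|169);(?=[^*\r\n]*\*)/';
$str = "*a&#169;b&#x271D;*\n*c&#169;*";

preg_match_all($re, $str, $matches);
print_r($matches[0]); // &#169;, &#x271D;, &#169;
```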
I have the string below. It contains the data (#[ID:username__FULLNAME]) of three mentioned users, which I want to extract. I tried the code below but did not get the desired result.
ID is an integer.
username and FULLNAME may contain numbers, letters and all kinds of special characters.
$t = 'Hi #[4232:mark__MΛRK ATTLEY] how are you ?
Hi #[4232:ryan__RYΛN вυηту] how are you ?
Hi #[4232:david__DΛVID शाहिद ] how are you ?
';
My PHP CODE:
$pattern = "|(?:(#\[[0-9]+:[\s\S(?!\])]+\]*))|";
preg_match_all($pattern, $string, $mentionList, PREG_PATTERN_ORDER);
print_r($mentionList);
Current Result:
Array
(
[0] => Array
(
[0] => #[4232:mark__MΛRK ATTLEY] how are you ?
Hi #[4232:ryan__RYΛN вυηту] how are you ?
Hi #[4232:david__DΛVID शाहिद] how are you ?
)
[1] => Array
(
[0] => #[4232:mark__MΛRK ATTLEY] how are you ?
Hi #[4232:ryan__RYΛN вυηту] how are you ?
Hi #[4232:david__DΛVID शाहिद] how are you ?
)
)
Expected Result:
Array
(
[0] => Array
(
[0] => #[4232:mark__MΛRK ATTLEY]
[1] => #[4232:ryan__RYΛN вυηту]
[2] => #[4232:david__DΛVID शाहिद ]
)
)
Can someone help me getting the desired results?
Thanks.
You can use the following regex: #\[.+\], which matches everything inside the [] plus the leading #. Note that .+ is greedy, so this only works while each line contains a single mention.
You can use this regex with 3 captured groups:
/#\[(\d+):(\S+)\h+(\S+)\h*\]/
RegEx Explanation:
#: Match literal #
\[: Match literal [
(\d+): Match 1+ digits and capture it in group #1 for id
:: Match literal :
(\S+): Match 1+ non-whitespace characters and capture it in group #2 for firstName
\h+: Match 1 or more horizontal whitespaces
(\S+): Match 1+ non-whitespace characters and capture it in group #3 for lastName
\h*: Match 0 or more horizontal whitespaces
\]: Match literal ]
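The three-group pattern above splits on whitespace, so group 2 would hold mark__MΛRK and group 3 only the last word. If the goal is ID, username and full name, one possible variation (an assumption, not taken from the answer) is to split on the __ separator instead. The key names and the /u flag are choices made for this sketch, and it assumes the username itself never contains __.

```php
<?php
$t = "Hi #[4232:mark__MΛRK ATTLEY] how are you ?\n"
   . "Hi #[4232:ryan__RYΛN вυηту] how are you ?\n"
   . "Hi #[4232:david__DΛVID शाहिद ] how are you ?\n";

// Group 1: id, group 2: username (lazy, up to the first "__"),
// group 3: everything up to the closing bracket.
preg_match_all('/#\[(\d+):(.+?)__([^\]]+)\]/u', $t, $sets, PREG_SET_ORDER);

$mentions = [];
foreach ($sets as [, $id, $username, $fullName]) {
    $mentions[] = [
        'id'       => (int) $id,
        'username' => $username,
        'fullName' => trim($fullName), // drop the trailing space before "]"
    ];
}

print_r($mentions);
```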
Not sure if this will give you the exact output you are looking for, but your regex is a bit too greedy. You can simplify it like this: (?:#\[[0-9]+.+?])
The lazy .+? stops at the first ], so each mention should be returned as a separate match.
Not sure if the non-capturing group is needed, so it could be simplified down to (#\[[0-9]+.+?]) or possibly even (#\[.+?]).
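The difference between the greedy and the lazy quantifier shows up as soon as one line contains two mentions. The input below is invented for illustration.

```php
<?php
$line = 'Hi #[1:ann__Ann A] and #[2:bob__Bob B]!';

// Greedy: .+ runs to the last "]" on the line, giving one long match.
preg_match_all('/#\[.+\]/', $line, $greedy);

// Lazy: .+? stops at the first "]", giving one match per mention.
preg_match_all('/#\[[0-9]+.+?]/', $line, $lazy);

print_r($greedy[0]);
print_r($lazy[0]);
```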
I need to extract the ticket number "Ticket#999999" from a string. How do I do this using regex?
My current regex works if there are several digits, as in Ticket#9999, but not if there is only one, as in Ticket#9. Please help.
Current regex:
preg_match_all('/(Ticket#[0-9])\w\d+/i',$data,$matches);
Thank you.
In your pattern, [0-9] matches 1 digit, \w matches 1 word character (a letter, digit or underscore) and \d+ matches 1+ digits, so at least 3 characters are required after #.
Use
preg_match_all('/Ticket#([0-9]+)/i',$data,$matches);
This will match:
Ticket# - a literal string Ticket#
([0-9]+) - Group 1 capturing 1 or more digits.
PHP demo:
$data = "Ticket#999999 ticket#9";
preg_match_all('/Ticket#([0-9]+)/i',$data,$matches, PREG_SET_ORDER);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => Ticket#999999
[1] => 999999
)
[1] => Array
(
[0] => ticket#9
[1] => 9
)
)
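To see the three-character requirement of the original pattern next to the fixed one, a quick comparison along these lines may help. The sample inputs are made up.

```php
<?php
$original = '/(Ticket#[0-9])\w\d+/i'; // pattern from the question
$fixed    = '/Ticket#([0-9]+)/i';     // pattern from the answer

$inputs = ['Ticket#9', 'Ticket#99', 'Ticket#999', 'Ticket#9a1'];

foreach ($inputs as $input) {
    // preg_match() returns 1 when the pattern matches and 0 when it does not.
    printf(
        "%-11s original: %d  fixed: %d\n",
        $input,
        preg_match($original, $input),
        preg_match($fixed, $input)
    );
}
```

The last input shows that \w in the original pattern also accepts a letter, which is probably not intended for a ticket number.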
I have been sitting for hours trying to figure out a regex for a preg_match_all call in PHP.
My problem is that I want two different things from the string.
Say you have the string "Code is fun [and good for the brain.] But the [brain is] tired."
What I need from this is an array of all the words outside the brackets, plus the text inside each pair of brackets as a single string.
Something like this:
[0] => Code
[1] => is
[2] => fun
[3] => and good for the brain.
[4] => But
[5] => the
[6] => brain is
[7] => tired.
Help much appreciated.
You could also try the regex below:
(?<=\[)[^\]]*|[.\w]+
Code:
<?php
$data = "Code is fun [and good for the brain.] But the [brain is] tired.";
$regex = '~(?<=\[)[^\]]*|[.\w]+~';
preg_match_all($regex, $data, $matches);
print_r($matches);
?>
Output:
Array
(
[0] => Array
(
[0] => Code
[1] => is
[2] => fun
[3] => and good for the brain.
[4] => But
[5] => the
[6] => brain is
[7] => tired.
)
)
The first alternative, (?<=\[)[^\]]*, uses a lookbehind to match all the characters inside the square brackets [], and the second, [.\w]+, matches one or more word characters or dots in the rest of the string.
You can use the following regex:
(?:\[([\w .!?]+)\]|(\w+))
The regex contains two alternatives: one to match everything inside a pair of square brackets, and one to capture every other word.
This assumes that the part inside the square brackets doesn't contain any characters other than letters, digits, _, spaces, !, ., and ?. In case you need to add more punctuation, it should be easy enough to add it to the character class.
If you don't want to be that specific about what should be captured, you can use a negated character class instead: specify what not to match rather than what to match. The expression then becomes: (?:\[([^\[\]]+)\]|(\w+))
Explanation:
(?: # Begin non-capturing group
\[ # Match a literal '['
( # Start capturing group 1
[\w .!?]+ # Match everything in between '[' and ']'
) # End capturing group 1
\] # Match literal ']'
| # OR
( # Begin capturing group 2
\w+ # Match rest of the words
) # End capturing group 2
) # End non-capturing group
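Since this pattern puts bracketed text in group 1 and plain words in group 2, the two groups still have to be merged to get the flat array from the question. A minimal sketch using the negated-character-class variant; the variable name $parts is made up.

```php
<?php
$data = 'Code is fun [and good for the brain.] But the [brain is] tired.';

preg_match_all('/(?:\[([^\[\]]+)\]|(\w+))/', $data, $sets, PREG_SET_ORDER);

$parts = [];
foreach ($sets as $set) {
    // Group 1 is non-empty for bracketed text; for a plain word it is
    // an empty string and the word is in group 2.
    $parts[] = $set[1] !== '' ? $set[1] : $set[2];
}

print_r($parts);
```

Because plain words are matched with \w+, the final element comes out as tired without the trailing period.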
Example input:
hjkhwe5boijdfg
I need to split this into 3 variables as below:
hjkhwe5 (any length, always ends in some number (can be any number))
b (always a single letter, can be any letter)
oijdfg (everything remaining at the end, numbers or letters in any combination)
I've got PHP preg_match_all set up but have no idea how to write this regex. Could someone give me a hand?
Have a try with:
$str = 'hjkhwe5boijdfg';
preg_match("/^([a-z]+\d+)([a-z])(.*)$/", $str, $m);
print_r($m);
output:
Array
(
[0] => hjkhwe5boijdfg
[1] => hjkhwe5
[2] => b
[3] => oijdfg
)
Explanation:
^ : beginning of string
( : 1st group
[a-z]+ : 1 or more letters
\d+ : followed by 1 or more digits
) : end of group 1
( : 2nd group
[a-z] : 1 letter
) : end group 2
( : 3rd group
.* : any number of any char
) : end group 3
$ : end of string
You can use preg_match as:
$str = 'hjkhwe5boijdfg';
if(preg_match('/^(\D*\d+)(\w)(.*)$/',$str,$m)) {
// $m[1] has part 1, $m[2] has part 2 and $m[3] has part 3.
}
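Wrapped in a function, the same pattern could look like the sketch below. The name splitCode is hypothetical.

```php
<?php
// Hypothetical helper: returns the three parts, or null when the
// input does not fit the expected shape.
function splitCode(string $str): ?array
{
    if (preg_match('/^(\D*\d+)(\w)(.*)$/', $str, $m) !== 1) {
        return null;
    }

    return [$m[1], $m[2], $m[3]];
}

print_r(splitCode('hjkhwe5boijdfg')); // hjkhwe5, b, oijdfg
```

Note that (\w) also accepts a digit or underscore, so an input like ab12 splits into ab1, 2 and an empty rest. If the second part must be a letter, [a-zA-Z] would be a stricter choice.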