I am using a PHP regex to check whether a pattern occurs in my string. Simply put, I want one of two words followed by a digit. Here's my pattern:
#(guitare|piano)[0-9]#
Basically, if the string contains only the word "guitare", I don't want it to match; it should match only if I have "guitare9" or "piano0".
At this point, if I use this pattern on the following string:
J'aime la guitare9
The array filled by preg_match() contains both guitare9 and guitare: http://www.phpliveregex.com/p/7tZ
What do I have to change in my regex to only match guitare9?
Turn the capturing group in your regex into a non-capturing group. preg_match stores the whole match and also every captured substring in the array, so once the group is non-capturing you get a single array element.
#(?:guitare|piano)[0-9]#
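To make the difference concrete, here is a minimal PHP sketch that runs both patterns against the string from the question. The variable names ($subject, $withCapture, $withoutCapture, $bareWord) are only illustrative.

```php
<?php
// The subject string from the question.
$subject = "J'aime la guitare9";

// Capturing group: index 0 holds the whole match, index 1 holds the group.
preg_match('#(guitare|piano)[0-9]#', $subject, $withCapture);

// Non-capturing group: only the whole match is reported.
preg_match('#(?:guitare|piano)[0-9]#', $subject, $withoutCapture);

print_r($withCapture);    // two elements: "guitare9" and "guitare"
print_r($withoutCapture); // one element: "guitare9"

// The bare word without a trailing digit does not match at all.
$bareWord = preg_match('#(?:guitare|piano)[0-9]#', "J'aime la guitare");
```

If you only need to know whether the pattern occurs, the return value of preg_match (1 or 0) is enough and the third argument can be left out.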
I need a regex in which STRING_{here}G_ can have anywhere from 0 to 4 digits in the {here} position. I tried the following regex:
(?<=TEST_[0-9]{0,4}G_).*
But the tester returns the error:
Your pattern contains one or more errors, please see the explanation section above.
And when I try it manually with two [0-9], it doesn't match my strings:
ABC_TEST_20G_a123-abc1
ABC_TEST_100G_abc1
I need a regex that validates both strings and returns what comes after G_.
Keep in mind that the regex must contain "TEST_"; it is part of the string that I need to validate.
Most regexp engines don't allow lookbehinds to be variable-length, so you can't have a {0,4} quantifier in it.
Instead of a lookbehind, use a capture group to capture everything after this pattern.
TEST_[0-9]{0,4}G_(.*)
Capture group 1 will contain what you want to get.
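As a rough illustration in PHP (the question does not name a language, so that choice and the helper name textAfterMarker are assumptions), the capture-group version could be used like this:

```php
<?php
// Hypothetical helper: returns the text after TEST_<0 to 4 digits>G_,
// or null when the marker is not present in the input.
function textAfterMarker(string $input): ?string
{
    if (preg_match('/TEST_[0-9]{0,4}G_(.*)/', $input, $m) === 1) {
        return $m[1]; // capture group 1: everything after G_
    }
    return null;
}

var_dump(textAfterMarker('ABC_TEST_20G_a123-abc1'));
var_dump(textAfterMarker('ABC_TEST_100G_abc1'));
var_dump(textAfterMarker('ABC_20G_abc1')); // no TEST_ prefix, so no match
```

Because the digits are matched as part of the pattern rather than inside a lookbehind, the {0,4} quantifier is no longer a problem.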
I am trying to grab the text after the first hyphen with this pattern:
<title>.*?-(.*?)(-|<\/title>)
which then grabs DesiredText from the pattern below:
<title>Stuff - DesiredText - Other Stuff</title>
However in this pattern:
<title>Stuff - Unwanted - DesiredText - Otherstuff</title>
I want it to skip the 'Unwanted' text and match the text after the next hyphen instead (DesiredText). I made a regex101 with both patterns. I need to modify my basic regex so that, if the capture group contains a word or words I don't want, it matches the text after the second hyphen instead:
https://regex101.com/r/veSqH3/1
I believe this is what you are looking for. The key is using the caret (^) as the first character inside a square-bracket character class ([]). Together they form a negated character class, which matches only characters that are NOT in the list.
https://regex101.com/r/alAZhj/3
Pattern: <title>.*?-\s*([^-\s]*)\s*- End<\/title>
This matches anything between the middle hyphens that is not a hyphen or whitespace. To allow spaces in the captured text, use the following pattern instead.
Pattern: <title>.*?-\s*([^-]*)\s*- End<\/title>
This will match anything in between the middle hyphens that is not a hyphen, so that you can have less restricted text in there.
This will use a negative lookahead to disqualify Note. There may be ways to optimize the pattern, but I cannot do so with confidence because I don't know how variable your input strings are.
Pattern: /<title>.*?- (?P<title>(?!Note).*?)(?= -|<)/
I am using a positive lookahead to ensure the captured match doesn't have any unwanted trailing characters.
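Here is a small PHP sketch of the same idea applied to the two titles from the question. It is an adaptation, not the exact pattern above: the disqualified word is Unwanted instead of Note, the lookahead is written as (?= -|<), and ~ is used as the delimiter. The helper name desiredText is made up for the example.

```php
<?php
// Hypothetical helper: returns the first hyphen-delimited value
// that does not start with the word "Unwanted", or null.
function desiredText(string $html): ?string
{
    $pattern = '~<title>.*?- (?P<title>(?!Unwanted).*?)(?= -|<)~';
    if (preg_match($pattern, $html, $m) === 1) {
        return $m['title']; // named capture group
    }
    return null;
}

var_dump(desiredText('<title>Stuff - DesiredText - Other Stuff</title>'));
var_dump(desiredText('<title>Stuff - Unwanted - DesiredText - Otherstuff</title>'));
```

When the value right after the first hyphen starts with the disqualified word, the lazy .*? simply extends to the next "- " and the capture starts there.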
If you just want the second-to-last delimited value, you could do something like this to return the value as the full-string match:
~- \K[^-]*(?= - [^-]*?</title>)~
Or faster with a capture group:
~- ([^-]*) - [^-]*?</title>~
This assumes there are no hyphens in the value.
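A quick PHP sketch comparing the two variants on the longer title from the question (the variable names are illustrative only):

```php
<?php
$html = '<title>Stuff - Unwanted - DesiredText - Otherstuff</title>';

// Variant 1: \K discards everything matched so far, so the value
// comes back as the full-string match in index 0.
preg_match('~- \K[^-]*(?= - [^-]*?</title>)~', $html, $viaReset);

// Variant 2: the value is in capture group 1; index 0 also contains
// the surrounding hyphens and the closing tag.
preg_match('~- ([^-]*) - [^-]*?</title>~', $html, $viaGroup);

var_dump($viaReset[0]);
var_dump($viaGroup[1]);
```

Both variants rely on [^-] being unable to cross a hyphen, which is why the value itself must not contain one.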
I took a different approach and focused on returning the capture prior to the last word, rather than any sort of negation. In this way it's highly generic.
This pattern will match what you want in the capture group:
\s-\s([a-zA-Z]+)\s-\s[a-zA-Z]+<\/title>
If you want to make sure this only matches between title tags, you can add the opening tag:
<title>.*?\s-\s([a-zA-Z]+)\s-\s[a-zA-Z]+<\/title>
The only limitation I see is that it expects single words separated by whitespace and hyphens, so if your desired match is "- Some phrase -" then this won't work. That was not indicated in your example, though. It's a bit unclear because you used "other stuff" in one title and "otherstuff" in the other.
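The following PHP sketch shows both the match and the limitation, using the two titles from the question. The ~ delimiter and the variable names are choices made for this example.

```php
<?php
// Pattern from this answer, without the optional <title> prefix.
$pattern = '~\s-\s([a-zA-Z]+)\s-\s[a-zA-Z]+<\/title>~';

$singleWords = '<title>Stuff - Unwanted - DesiredText - Otherstuff</title>';
$withPhrase  = '<title>Stuff - DesiredText - Other Stuff</title>';

// Every value is a single word, so the second-to-last one is captured.
$foundSingle = preg_match($pattern, $singleWords, $m);
$secondLast  = $foundSingle === 1 ? $m[1] : null;

// The last value contains a space ("Other Stuff"), so nothing matches.
$foundPhrase = preg_match($pattern, $withPhrase);

var_dump($secondLast, $foundPhrase);
```

Widening [a-zA-Z]+ to something like [^-]+ would accept phrases, at the cost of also capturing the surrounding spaces.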
Is it possible to have a RegEx fall back to the beginning of the string and begin matching again?
Here's why I ask. Given the string below, I'd like to capture the substrings black, red, blue, and green in that order, regardless of the order in which they occur in the subject string, and only if all of them are present in the subject string.
$str = 'blue-ka93-red-kdke3-green-weifk-black';
So, for all of the below strings, the RegEx should capture black, red, blue, and green (in that order)
'blue-ka93-red-kdke3-green-weifk-black'
'green-ka93-red-kdke3-blue-weifk-black'
'blue-ka93-black-kdke3-green-weifk-red'
'green-ka93-black-kdke3-blue-weifk-red'
I wonder if there isn't a way to match a capture group then fall back to the start of the string and find the next capture group. I was hoping that something like ^.*(?=(black))^.*(?=(red))^.*(?=(blue))^.*(?=(green)) would work but of course the ^ and lookaheads do not behave this way.
Is it possible to construct such a RegEx?
For context, I'll be using the RegEx in PHP.
You can use
^(?=.*(black))(?=.*(red))(?=.*(blue))(?=.*(green))
Note: This will require all these keywords to be in the string.
There is no way to reset the regex index while matching, so the only option is to capture inside positive lookaheads anchored at the start. Each lookahead is evaluated at the empty position at the start of the string (because of ^), and the lookaheads in the regex above run one after another, each only if the previous one succeeded (found text meeting its pattern).
Your regex did not work the same way because the .* subpatterns were outside the lookaheads, so they matched and consumed text, and because you repeated the start-of-string anchor, which automatically fails the regex if you do not use a multiline modifier.
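Since the question mentions PHP, here is a minimal sketch of how the lookahead pattern could be used there. The helper name captureColors is hypothetical.

```php
<?php
// Hypothetical helper: returns [black, red, blue, green] in that order
// when all four words occur in the subject, otherwise null.
function captureColors(string $subject): ?array
{
    $pattern = '/^(?=.*(black))(?=.*(red))(?=.*(blue))(?=.*(green))/';
    if (preg_match($pattern, $subject, $m) !== 1) {
        return null;
    }
    // $m[0] is the empty overall match; the groups start at index 1.
    return array_slice($m, 1);
}

print_r(captureColors('blue-ka93-red-kdke3-green-weifk-black'));
print_r(captureColors('green-ka93-black-kdke3-blue-weifk-red'));
var_dump(captureColors('blue-ka93-red')); // two colors missing
```

The group numbers follow the order of the lookaheads in the pattern, not the order of the words in the subject, which is what gives the fixed output order.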
Why not just use capture groups to maintain the order?
^(?:(black)|(red)|(blue)|(green)|.)+$
Note that this will match any string; all colors are optional.
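A short PHP sketch of this alternation approach follows; the variable names are illustrative. It relies on a capture group inside a repeated group keeping the value from the last iteration in which it matched.

```php
<?php
$pattern = '/^(?:(black)|(red)|(blue)|(green)|.)+$/';

// All four colors present: groups 1 to 4 come back in pattern order.
preg_match($pattern, 'green-ka93-black-kdke3-blue-weifk-red', $all);
$colors = array_slice($all, 1);

// Only blue and red present: the pattern still matches, and the
// groups for the missing colors are empty or absent.
$matched = preg_match($pattern, 'blue-ka93-red', $some);

print_r($colors);
print_r($some);
```

If all four colors are required, you have to check the groups yourself after matching, since the pattern alone does not enforce it.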
I have some text that starts with a run of letters followed by digits.
I need to get the first single digit after the letters.
Example text:
ABC105001
ABC205001
ABC305001
ABCD105001
ABCD205001
ABCD305001
My RegEx:
^(\D*)(\d{1})(?=\d*$)
Link: http://www.regexr.com/390gv
As you can see, the regex works, but it also includes the first group in the results. I need to get only this digit, and when I try to put ?= in the first group like this: ^(?=\D*)(\d{1})(?=\d*$) , the regex doesn't work.
Any ideas?
Thanks in advance.
(?=...) is a lookahead that means "followed by" and checks the string on the right of the current position.
(?<=...) is a lookbehind that means "preceded by" and checks the string on the left of the current position.
What is interesting about these two features is that the content matched inside them is not part of the whole match result. The only problem is that a lookbehind can't match variable-length content.
A way to avoid the problem is to use the \K feature, which removes everything on its left from the match result:
^[A-Z]+\K\d(?=\d*$)
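For illustration, a minimal PHP sketch of the \K pattern applied to the sample values (the helper name firstDigitAfterLetters is made up):

```php
<?php
// Hypothetical helper: returns the first digit that follows the
// leading uppercase letters, or null if the input has another shape.
function firstDigitAfterLetters(string $code): ?string
{
    // \K drops the letters from the reported match, so index 0 holds
    // only the digit; the lookahead checks that only digits follow.
    if (preg_match('/^[A-Z]+\K\d(?=\d*$)/', $code, $m) === 1) {
        return $m[0];
    }
    return null;
}

var_dump(firstDigitAfterLetters('ABC105001'));
var_dump(firstDigitAfterLetters('ABCD305001'));
var_dump(firstDigitAfterLetters('ABC')); // no digit at all
```

No capture group is needed here because the whole match is already the digit.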
You're trying to use a positive lookahead when really you want to use non-capturing groups.
The one match you want will work with this regex:
^(?:\D*)(\d{1})(?:\d*)$
The (?: string starts a non-capturing group. Its content will not come back in the matches.
So, if you used preg_match(';^(?:\D*)(\d{1})(?:\d*)$;', $string, $matches) to find your match, $matches[1] would be the digit you're looking for. (This is because $matches[0] will always be the full match from preg_match.)
Try:
^(?:\D*)(\d{1})(?=\d*$) // (?: is the beginning of a non-capturing group
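A quick PHP check of this pattern, with illustrative variable names, shows what ends up in each index:

```php
<?php
$found = preg_match('/^(?:\D*)(\d{1})(?=\d*$)/', 'ABCD205001', $matches);

// Index 0 is the whole match: the letters plus the first digit
// (the lookahead consumes nothing). Index 1 is the only capture group.
var_dump($matches[0]);
var_dump($matches[1]);
```

The leading letters still appear in index 0 because a non-capturing group only suppresses the separate group entry, not its contribution to the overall match.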
(?P<id>\d*)(/(?P<title>.*))?
Most of the time, we use a regex to match something, but how do we generate the matching string if we already have id and title?
For example, if id=4 and title='hello world', the result should be:
4/hello world
But if we only have id=4, it should be:
4
Because, as the regex indicates, title is optional.
Two answers both misunderstood...
There is no preg_match yet
You propose that if you pass the regular expression (?P<id>\d*)(/(?P<title>.*))? to a function along with the parameters id=4 and title='' that the function would return 4. And that this would work for any regular expression. That is simply impossible. Your regular expression is an example that such a function could never support.
If you call preg_match with your regular expression on the string 4, the match results give 4 for the capturing group id and nothing for the capturing group title. If you call preg_match with the same regex on the string 4/, the match results also give 4 for the capturing group id and the empty string for the capturing group title. By default, PHP reports a group that did not participate in the match as the empty string or, for trailing groups, leaves it out of the array, and most code treats both the same as a group that matched the empty string. Notice that in your regex you did not only make the group optional with the question mark; the .* inside your capturing group is also optional.
Thus, we have two possible matches 4 and 4/ for which preg_match returns 4 for the capturing group id and the empty string for the capturing group title. So how is your requested reverse function supposed to determine whether it should return 4 or 4/ for your two capturing groups? It can't be done without additional information.
In fact, do you realize your regex also matches / as well as the empty string? Everything in your regular expression is optional!
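To see how permissive the pattern is, here is a small PHP sketch. The ^ and $ anchors and the ~ delimiter are additions made for this example (so that the whole string has to match), and the helper name matchesWhole is hypothetical.

```php
<?php
// Hypothetical helper: true when the entire string matches the
// pattern from the question (anchors added for this check).
function matchesWhole(string $candidate): bool
{
    $pattern = '~^(?P<id>\d*)(/(?P<title>.*))?$~';
    return preg_match($pattern, $candidate) === 1;
}

foreach (['4/hello world', '4', '4/', '/', ''] as $candidate) {
    var_dump(matchesWhole($candidate)); // each of these matches
}
```

Both 4 and 4/ are accepted with an id of 4 and no title text, which is the ambiguity described above.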
how to get the original matched string
if we have id and title already?
Original Matched string should be in
$0
The whole regex match is always located in group 0, so you can get it from the matches array of preg_match() by accessing index 0, and/or by using $0 or \0 in the replacement of preg_replace().
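A minimal PHP sketch of both ways to reach group 0; the sample string and variable names are made up for the example.

```php
<?php
$text = 'order 42 shipped';

// Index 0 of the matches array is the whole match.
preg_match('/\d+/', $text, $m);
$whole = $m[0];

// $0 and \0 in the replacement both refer to the whole match.
// Single quotes keep PHP from interpreting $0 or \0 itself.
$withDollar    = preg_replace('/\d+/', '[$0]', $text);
$withBackslash = preg_replace('/\d+/', '<\0>', $text);

var_dump($whole, $withDollar, $withBackslash);
```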
Not sure I understand the question, you mean constructing the string like this?
$string = $id;
// Append the title only when one was actually supplied.
if (isset($title) && $title !== '') {
    $string .= "/$title";
}