I know this sounds easy but I am stuck.
I want to match strings that contain an asterisk (*).
Essentially, I want to allow strings with an asterisk at the front, at the back, or both, but not in the middle:
(At most there will be 2 asterisks, one at the front and one at the back, never in the middle, and the string itself must be present.)
ALLOW:
*string*
*string
string*
string
DENY:
*str*ing*
*str*ing
str*ing*
str*ing
*string*****
I tried
^\\*?((?!\\*).)*\\*?$
and somehow it works.
Can someone explain how this works?
And can you verify that it is correct? Regex is hard to debug and check.
You can use the following regex:
^\*?\w+\*?$
demo: https://regex101.com/r/vwuXv2/1/
Explanations:
^ anchor imposing the start of a line
\*? a * appearing at most one time
\w+ at least 1 word char appearing in the text ([a-zA-Z0-9_] feel free to change it depending on your need)
\*? a * appearing at most one time
$ end of line anchor
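As a quick sanity check, the pattern above can be run against the ALLOW and DENY samples from the question. This is only a minimal Python sketch; the names PATTERN, is_allowed, allowed and denied are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: optional leading *, one or more word chars, optional trailing *.
PATTERN = re.compile(r'^\*?\w+\*?$')

def is_allowed(s: str) -> bool:
    """Return True if s is a word with an optional * at the start and/or end only."""
    return PATTERN.match(s) is not None

# Samples taken from the question.
allowed = ["*string*", "*string", "string*", "string"]
denied = ["*str*ing*", "*str*ing", "str*ing*", "str*ing", "*string*****"]

for s in allowed + denied:
    print(f"{s!r}: {is_allowed(s)}")
```

Every entry in allowed should print True and every entry in denied should print False.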
Now if you are interested in partial line matches, you can use the following regex:
(?<=^| )\*?\w+\*?(?=$| )
demo: https://regex101.com/r/vwuXv2/2/
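Explanation: the added lookbehind and lookahead assertions require the match to be preceded by the start of the line or a space, and followed by the end of the line or a space.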
Adding Japanese characters as requested in the comment (add in [^*\s] all the characters you need to exclude from the words):
^\*?[^*\s]+\*?$
demo: https://regex101.com/r/RaCmwt/1/
or
^\*?[[:alpha:]]+\*?$
(with unicode flag enabled) or just
^\*?\p{L}+\*?$
demo: https://regex101.com/r/RaCmwt/2/
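For what it's worth, here is how the [^*\s] variant behaves in Python, as a rough sketch. Python's built-in re module does not understand \p{L}, so only the negated character class version is shown. The sample strings and the names WORD, samples and results are my own assumptions, not from the answer.

```python
import re

# Negated-class variant from the answer: anything except * and whitespace counts as a word char.
WORD = re.compile(r'^\*?[^*\s]+\*?$')

# Hypothetical samples: two valid ones, one with a middle asterisk, one with a space.
samples = ["*文字列*", "文字列*", "*文*字*", "文字 列"]

# Map each sample to whether the pattern accepts it.
results = {s: bool(WORD.match(s)) for s in samples}

for s, ok in results.items():
    print(f"{s!r}: {ok}")
```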
You can simply say: Optionally start with asterisk, 0 or more arbitrary characters except asterisk, optionally end with asterisk.
^\*?[^*]*\*?$
https://regex101.com/r/bibCEc/2
An alternative is to invert the match and test that there is not (i.e. if(!...)) any asterisk other than at the beginning or end, using a negative lookbehind and a negative lookahead:
(?<!^)\*(?!$)
https://regex101.com/r/8St0M4/2
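The inverted test could look roughly like this in Python. It is a sketch under the assumption that a failed search means "allowed"; MIDDLE_STAR and has_no_middle_asterisk are names I made up. Note that this check on its own does not require the string to be non-empty.

```python
import re

# Matches an asterisk that is neither the first nor the last character.
MIDDLE_STAR = re.compile(r'(?<!^)\*(?!$)')

def has_no_middle_asterisk(s: str) -> bool:
    """Return True when no asterisk appears strictly inside the string."""
    return MIDDLE_STAR.search(s) is None

for s in ["*string*", "string", "*str*ing*", "*string*****"]:
    print(f"{s!r}: {has_no_middle_asterisk(s)}")
```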
According to your recent edit, you would use the quantifier + to match 1 or more characters:
^\*?[^*]+\*?$
https://regex101.com/r/bibCEc/3
Related
I have following problem:
I have a pattern like this:
/(?<=template=")(.*?)(.*\/)/gm
And a text like this:
template="test/widgets/glasgow.phtml"}}
My regex should find the path in front of my file name. I need to cut it out so that the result looks like this:
template="glasgow.phtml"}}
That works fine, but the problem is that I sometimes have a text that looks like this:
block="core/template" template="test/widgets/getcallus.phtml"}}</p>
It cuts everything out up to the </.
This is what gets cut out:
test/widgets/getcallus.phtml"}}</
Instead of:
test/widgets/
I have tried to limit the end with $ but it does nothing.
I am testing it on regexr.com
https://regexr.com/50hi2
You may use the following pattern:
template="\K[^"\/]*\/[^"\/]*\/
See the regex demo. In PHP, you may get rid of backslashes if you specify another regex delimiter:
$regex = '~template="\K[^"/]*/[^"/]*/~';
Details
template=" - literal text
\K - match reset operator
[^"\/]* - 0 or more chars other than / and "
\/ - a / char
[^"\/]* - 0 or more chars other than / and "
\/ - a / char
It is equal to template="\K(?:[^"\/]*\/){2}, where (?:...){2} repeats the non-capturing group sequence of patterns twice.
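If you want to try the same idea outside PHP: Python's built-in re module has no \K, so the sketch below swaps in a capturing group for the template=" prefix and puts it back in the replacement. That substitution is my own workaround, not the answer's method, and STRIP_PATH and strip_template_path are hypothetical names.

```python
import re

# No \K in Python's re: capture the literal prefix, then match the two "dir/" segments.
STRIP_PATH = re.compile(r'(template=")(?:[^"/]*/){2}')

def strip_template_path(text: str) -> str:
    """Remove the two leading path segments from the template attribute value."""
    return STRIP_PATH.sub(r'\1', text)

# Sample inputs taken from the question.
print(strip_template_path('template="test/widgets/glasgow.phtml"}}'))
print(strip_template_path('block="core/template" template="test/widgets/getcallus.phtml"}}</p>'))
```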
Be careful with (.*?)(.*\/)
This pattern is prone to heavy backtracking (a ReDoS risk): (.*?) and (.*\/) can both match the same characters, so the engine may try many ways of splitting the text before the last / between them.
To keep a regex close to yours, you can use
/(?<=template=")([^"]*?\/)*([^"]*)"/
([^"]*?\/)* matches as many blocks as possible, where each block is a lazy run of characters other than " followed by a /.
https://regex101.com/r/SMSv5R/2
I have these two regular expressions
^(((98)|(\+98)|(0098)|0)(9){1}[0-9]{9})+$
^(9){1}[0-9]{9}+$
How can I combine these expressions?
Valid phone numbers:
must start with 0098, +98, 98, 09 or 9
Samples:
00989151855454
+989151855454
989151855454
09151855454
9151855454
You haven't provided what passes and what doesn't, but I think this will work if I understand correctly...
/^\+?0{0,2}98?/
Live demo
^ Matches the start of the string
\+? Matches 0 or 1 plus symbols (the backslash is to escape)
0{0,2} Matches between 0 and 2 (0, 1, and 2) of the 0 character
9 Matches a literal 9
8? Matches 0 or 1 of the literal 8 characters
Looking at your second regex, it looks like you want to make the first part ((98)|(\+98)|(0098)|0) in your first regex optional. Just make it optional by putting ? after it and it will allow the numbers allowed by second regex. Change this,
^(((98)|(\+98)|(0098)|0)(9){1}[0-9]{9})+$
to,
^(?:98|\+98|0098|0)?9[0-9]{9}$
The ? after the non-capturing group makes it optional; the group contains the alternatives you want to allow.
I've made a few more corrections to the regex. {1} is redundant, since matching once is the default for any token, with or without it. You also don't need to group parts of the regex unless you need the groups. And I've removed the outermost parentheses and the + after them, as they are not needed.
Demo
This regex
^(?:98|\+98|0098|0)?9[0-9]{9}$
matches
00989151855454
+989151855454
989151855454
09151855454
9151855454
Demo: https://regex101.com/r/VFc4pK/1/
Note, however, that this requires a 9 as the first digit after the country code or 0.
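To double-check the combined pattern, here is a small Python sketch that runs it over the samples from the question plus a few strings I made up that should be rejected. PHONE, valid and invalid are illustrative names only.

```python
import re

# Combined pattern from the answers: optional prefix, then 9 followed by nine digits.
PHONE = re.compile(r'^(?:98|\+98|0098|0)?9[0-9]{9}$')

# Samples from the question.
valid = ["00989151855454", "+989151855454", "989151855454", "09151855454", "9151855454"]

# Hypothetical bad inputs: wrong first digit, one digit short, one digit too many.
invalid = ["8151855454", "0915185545", "+9891518554541"]

for number in valid + invalid:
    print(number, bool(PHONE.match(number)))
```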
What I can't work out is how to check my previous match against the next character, and how to constrain the starting and ending characters. Please help.
Here is an Example of my string
..A..B..A...B.A.B
What I'm trying to do, from the start of the string:
1=> Check that the first character is .. or A
2=> The string cannot look like ..A..A; it must look like ..A..B.. and continue in that sequence.
3=> The ending character must be .. or B, never A
I can match the first character with ^([A]{1}|[.]{1,100}), but the same approach does not work for the ending character, and I don't see how to do step 2.
Save my day guys. Thanks
Failed Regex: ^[\.{1,40}|A{1}]+(?!A)+(B)+(?!B)+(B|\.{1,40})$
This regex should match the description you've given:
^(?:\.+?)?(A\.+?B\.?|\.\.)+$
^ is the start of the string (or line if m modifier is used).
(?:\.+?)? matches one or more . characters, but the whole group is optional.
A\.+?B\.? looks for an A, then one or more .s, then a B and an optional trailing ..
| is an alternative pattern we'll look at
\.\. are 2 .s
+ allows for the whole group to occur once or more
$ is the end of the string (or line, again depends on modifier being used)
Demo: https://regex101.com/r/OUJxxc/3/ (Probably with a clearer description than I provided)
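Here is a short Python sketch that tries the pattern on the string from the question and on a few variations. The extra test strings and the names SEQ and is_valid_sequence are my own assumptions.

```python
import re

# Pattern from the answer: optional leading dots, then repeated "A...B" blocks or "..".
SEQ = re.compile(r'^(?:\.+?)?(A\.+?B\.?|\.\.)+$')

def is_valid_sequence(s: str) -> bool:
    """Return True if A and B alternate (A first) with dots in between."""
    return SEQ.match(s) is not None

for s in ["..A..B..A...B.A.B", "..A..B..", "..A..A", "..A"]:
    print(f"{s!r}: {is_valid_sequence(s)}")
```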
I have a peculiar use case where I need to detect paragraphs that end in !!. Normal occurrences of ! (a single one) are fine in the paragraph, but the block ends when !! is found.
For example:
test foo bar !!
longer paragraph this time!
goes on and on
and then stops !!
Should be detected as two separate matches, one covering the first line, and another (separate) covering lines 2, 3 and 4. This brings it to a total of 2 matches.
(Preferably it should work with multiline-mode, as it's part of a larger regex that employs this mode.)
How would I accomplish this? I tried [^!!]* which to me says, find as many non-!! characters as possible, but I'm not sure how to leverage that, and worse yet it still finds single occurrences of !.
There is a common idiom in regular expressions that is used for escape sequences. (Like "\n" in a string.) You can use the same concept here.
The trick is to match either NOT the first character, or the first character followed by a valid second character.
In your case, that would be:
(?: # this is a package, either A or B, choose one
[^!] # Not a bang
| # or
![^!] # Bang, followed by not-a-bang
)
This pair of alternatives describes all the characters in your paragraph. So you can repeat it either 0 times (*) or one-or-more times (+) depending on what you are doing in the rest of your pattern.
# All together:
(?:[^!]|![^!])* # zero or more
(?:[^!]|![^!])+ # one or more
(Obviously, you can match '!!' at the end if you like...)
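Put together with a trailing !!, the idiom can be tried like this in Python. It is only a sketch: wrapping the repeated alternation in a capturing group and stripping whitespace afterwards are my additions, and PARAGRAPH, text and paragraphs are made-up names.

```python
import re

# Capture any run of "not a bang" or "bang followed by not-a-bang", then require !!.
PARAGRAPH = re.compile(r'((?:[^!]|![^!])*)!!')

# Sample text from the question.
text = (
    "test foo bar !!\n"
    "longer paragraph this time!\n"
    "goes on and on\n"
    "and then stops !!"
)

# One entry per paragraph, without the terminating !! or surrounding whitespace.
paragraphs = [m.group(1).strip() for m in PARAGRAPH.finditer(text)]

for p in paragraphs:
    print(repr(p))
```

Because [^!] also matches newlines, no multiline or dotall flag is needed for a paragraph to span several lines.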
/^([!]?[^!]+[!]?[^!]+)*[!]{2}$/gm
This regex worked for me. It ensures any single ! characters are separated by non-! characters, but there don't have to be any single ! characters. It worked on multiline mode. This also has the added benefit of extracting the text that comes before an occurrence of "!!" since I assume you want to work with it.
/^([!]?[^!]+[!]?[^!]+)*.?[!]{2}$|^([!]?[^!]+[!]?[^!]+)*[^!]?[!]?$/gm
This slightly longer regex captures text that occurs after the final !! (ie, if the file has text between !! and EOF). I wouldn't recommend using the capturing groups though as on my regex checker, they didn't seem to work properly (that may have just been an implementation glitch, however, as the capturing groups look like they should work properly).
Try this:
([\w\s!]+?\!{2})
DEMO
Output:
MATCH 1
1. [0-15] `test foo bar !!`
MATCH 2
1. [15-76] `
longer paragraph this time!
goes on and on
and then stops !!`
or
(?:\n?([\w\s!]+?)\s?\!{2})
DEMO
Output:
MATCH 1
1. [0-12] `test foo bar`
MATCH 2
1. [16-73] `longer paragraph this time!
goes on and on
and then stops`
Try the following regex using lookbehind and lookahead
VERSION #1
/(?<=!!|^).*?(?=!!)/gms
Please see https://regex101.com/r/cQ0wC0/2
Result should be
OUTPUT:
test foo bar
longer paragraph this time!
goes on and on
and then stops
VERSION #2
Since the OP wants to capture the last paragraph of text after !! even if it does not end with bang signs:
/(?<=!!|^).*?(?=!!)|(?<=!!).*$/gms
Please see demo https://regex101.com/r/cQ0wC0/4
INPUT:
test foo bar !!
longer paragraph this time!
goes on and on
and then stops !!
longer paragraph this time!
goes on and on
OUTPUT:
test foo bar
longer paragraph this time!
goes on and on
and then stops
longer paragraph this time!
goes on and on
I'd like to capture up to four groups of text between <p> and </p>. I can do that using the following regex:
<h5>Trivia<\/h5><p>(.*)<\/p><p>(.*)<\/p><p>(.*)<\/p><p>(.*)<\/p>
The text to match on:
<h5>Trivia</h5><p>Was discovered by a freelance photographer while sunbathing on Bournemouth Beach in August 2003.</p><p>Supports Southampton FC.</p><p>She has 11 GCSEs and 2 'A' Levels.</p><p>Listens to soul, R&B, Stevie Wonder, Aretha Franklin, Usher Raymond, Michael Jackson and George Michael.</p>
It outputs the four lines of text. It also works as intended if there are more trivia items or <p> occurrences.
But if there are fewer than 4 trivia items or <p> groups, it outputs nothing since it cannot find the fourth group. How do I make that group optional?
I've tried: <h5>Trivia<\/h5><p>(.*?)<\/p>(?:<p>(.*?)<\/p>)?(?:<p>(.*?)<\/p>)?(?:<p>(.*?)<\/p>)?(?:<p>(.*?)<\/p>)? and that works according to http://gskinner.com/RegExr/ but it doesn't work if I put it inside PHP code. It only detects one group and puts everything in it.
The magic word is either 'escaping' or 'delimiters', read on.
The first regex:
<h5>Trivia<\/h5><p>(.*)<\/p><p>(.*)<\/p><p>(.*)<\/p><p>(.*)<\/p>
worked because you escaped the / characters in tags like </h5> to <\/h5>.
But in your second regex (which correctly encloses each paragraph in an optional non-capturing group, fetching 1 to 5 paragraphs):
<h5>Trivia</h5><p>(.*?)</p>(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?
you forgot to escape those / characters.
It should then have been:
$pattern = '/<h5>Trivia<\/h5><p>(.*?)<\/p>(?:<p>(.*?)<\/p>)?(?:<p>(.*?)<\/p>)?(?:<p>(.*?)<\/p>)?(?:<p>(.*?)<\/p>)?/';
The above is assuming you were putting your regex between two / "delimiters" characters (out of conventional habit).
To dive a little deeper into the rabbit-hole, one should note that in php the first and last character of a regular expression is usually a "delimiter", so one can add modifiers at the end (like case-insensitive etc).
So instead of escaping your regex, you could also use a ~ character (or #, etc) as a delimiter.
Thus you could also use the same identical (second) regex that you posted and enclose for example like this:
$pattern = '~<h5>Trivia</h5><p>(.*?)</p>(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?~';
Here is a working (web-based) example of that, using # as delimiter (just because we can).
You can use the question mark to make each <p>...</p> optional:
$pattern = '~<h5>Trivia</h5>(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?~';
Using the DOM is a good option too.
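To see how the optional groups behave when fewer paragraphs are present, here is a rough Python sketch of the same pattern (Python needs no delimiters, so the / characters stay unescaped). The three-item sample HTML and the names TRIVIA, html and items are assumptions made for the example.

```python
import re

# Same structure as the PHP pattern: one required <p>, then up to three optional ones.
TRIVIA = re.compile(
    r'<h5>Trivia</h5><p>(.*?)</p>'
    r'(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?(?:<p>(.*?)</p>)?'
)

# Hypothetical input with only three trivia items.
html = "<h5>Trivia</h5><p>One.</p><p>Two.</p><p>Three.</p>"

match = TRIVIA.search(html)

# Groups that did not participate in the match come back as None, so filter them out.
items = [g for g in match.groups() if g is not None] if match else []

print(items)
```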