Regex for word not followed by asterisk

I need a regex (for PHP) that matches a word of 1 or 2 characters that starts with a + and is not followed by a *.
So far I have /\+\b\w{1,2}\b/, which finds +a3 but also finds +a3*, because the asterisk comes after the word boundary.
In a string like +find +in +me* I only want to find the +in, but not the +me*.
I tried /\+\b[\w\*]{1,2}\b/, but that does not seem to make any difference.
preg_replace($regex,'','+do+find +in +me*'); //expected result: '+do+find +me*'

How about:
/\+\w{1,2}\b(?!\*)/
(?!\*) is a negative lookahead that ensures a * doesn't follow the one or two word characters.
The \b between \+ and \w isn't needed: a + followed by a word character always has a word boundary between them.
Edit according to comment:
This matches the "+2c" in "whatever+2c". What would I need to change so that it doesn't match there, but only matches in "whatever +2c" or "+2c whatever"?
Use this one:
/(?:^|\s)\+\w{1,2}(?:\s|$)/
According to comments (with a lookbehind, so the leading whitespace is not part of the match):
/(?<=^|\s)\+\w{1,2}(?:\s|$)/
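As a quick check, the lookbehind version can be run against the sample string from the question. This is only a sketch; the variable names are made up for illustration.

```php
<?php
// Pattern from the answer above: a "+" preceded by start-of-string or
// whitespace, then 1-2 word characters, then whitespace or end-of-string.
$regex = '/(?<=^|\s)\+\w{1,2}(?:\s|$)/';

$input  = '+do+find +in +me*';
$result = preg_replace($regex, '', $input);

var_dump($result); // string(13) "+do+find +me*"
```

The trailing (?:\s|$) consumes the space after +in, which is why only a single space is left between +do+find and +me*.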

Related

Regex prevent selecting characters from previous match

My title probably doesn't explain exactly what I mean. Take the following string:
POWERSTART9^{{2|3}}POWERENDx{{3^EXSTARTxEXEND}}=POWERSTART27^{{1|4}}POWEREND
What I want to do here is isolate the parts that are like this:
{{2|3}} or {{1|4}}
The following expression works to an extent, it selects the first one {{2|3}} with no issue:
\{\{(.*?)\|(.*?)\}\}
The problem is that it does not select just {{2|3}} and then {{1|4}}: after the first one comes {{3^EXSTARTxEXEND}}, so the second match starts at {{3 and runs all the way to the end of the part I actually want, |4}}.
Here it is highlighted on RegExr:
I've never been great with regex and can't work out how to stop it doing that. Any ideas? I basically want it to only match the exact pattern and not something that contains it.

You may use
\{\{((?:(?!{{).)*?)\|(.*?)}}
See the regex demo.
If there can be no { and } inside the {{...}} substrings, you may use a simpler \{\{([^{}|]*)\|([^{}]*)}} expression (see demo).
Details
\{\{ - a {{ substring
((?:(?!{{).)*?) - Capturing group 1: any char (.), as few as possible (*?), that does not start a {{ char sequence (tempered greedy token)
[^{}|]* - any 0 or more chars other than {, } and |
\| - a | char
(.*?) - Capturing group 2: any 0 or more chars, as few as possible
[^{}]* - any 0 or more chars other than { and }
}} - a }} substring.
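To see the difference between the patterns, here is a small sketch that runs the question's original pattern and both suggested patterns over the sample string. The variable names are illustrative only.

```php
<?php
$input = 'POWERSTART9^{{2|3}}POWERENDx{{3^EXSTARTxEXEND}}=POWERSTART27^{{1|4}}POWEREND';

// Original pattern from the question: the second match starts at "{{3".
preg_match_all('/\{\{(.*?)\|(.*?)\}\}/', $input, $loose);

// Tempered greedy token: group 1 may not run across another "{{".
preg_match_all('/\{\{((?:(?!{{).)*?)\|(.*?)}}/', $input, $tempered);

// Simpler variant, assuming no braces can occur inside {{...}}.
preg_match_all('/\{\{([^{}|]*)\|([^{}]*)}}/', $input, $simple);

print_r($loose[0]);    // {{2|3}} and {{3^EXSTARTxEXEND}}=POWERSTART27^{{1|4}}
print_r($tempered[0]); // {{2|3}} and {{1|4}}
print_r($simple[0]);   // {{2|3}} and {{1|4}}
```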

Try this \{\{([^\^|]*)\|([^\^|]*)\}\}
https://regex101.com/r/bLF8Oq/1

preg - Difference between Search Patterns with [] and without

It seems I am not able to understand something very basic with preg regex Patterns in PHP.
What is the difference between these Regex Patterns:
\b([A-Z...]...)
[\b]{1}([A-Z...]...)
The pattern should start with a word boundary, but why is the result different when I put it in []{1}?
The first one works as I expected, but the second does not. The problem is that I want to put more into the [], so that the pattern can start with a word boundary OR a lowercase letter [a-z].
Thank you!
Example Text:
Race1529/05/201512:45K4 Senior Men 1000m
LaneName(s)NFBib(s)TimeRank250m500m750m
152
Martin SCHUBERT / Lukas REUSCHENBACH155
11
153
151Kostja STROINSKI / Kai SPENNER
03:07.740
GER
8
I want to find the names of the racers. Sometimes they have a word-break (\b) at the beginning, sometimes not. (But i need the word-break.)
$pattern = '#\b(['.$GB.$KB.'\s\-]{2,40})\s(['.$GB.'\'\-\s]{2,40})[0-9]{0,5}#';
($GB is a variable with all Uppercase Letters, $KB with lower case letters)
preg_match_all gives me all racers where the Name has a word-break at the beginning. (In this example Schubert, Reuschenbach, Spenner) but of course not Stroinski. So, I try this:
$pattern = '#[\b0-9]+(['.$GB.$KB.'\s\-]{2,40})\s(['.$GB.'\'\-\s]{2,40})[0-9]{0,5}#';
This does not work. Even if I remove the 0-9 and only put [\b]{1} at the beginning, it doesn't find any hits.
I don't see the difference between \b and [\b]{1}. It seems to be a very basic misunderstanding.

The [\b] is a character class that only matches a backspace char (\u0008).
See PHP regex reference:
note that "\b" has a different meaning, namely the backspace character, inside a character class
Also, .{1} is the same as . on its own: the {1} limiting quantifier is always redundant and only makes sense when your patterns are built dynamically from variables.
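A short sketch of the difference, using made-up test strings. The last line is an assumption about what the asker may be after: "word boundary OR a digit" has to be written as an alternation, because a zero-width assertion cannot go inside a character class.

```php
<?php
// Outside a character class, \b is a zero-width word boundary.
$boundary = preg_match('/\bfoo/', 'a foo');

// Inside a character class, \b is the backspace character (0x08).
$classNoBackspace = preg_match('/[\b]foo/', 'a foo');
$classBackspace   = preg_match('/[\b]foo/', "a\x08foo");

// Assumption: "boundary or digit" expressed as an alternation instead.
$alternation = preg_match('/(?:\b|[0-9])Kostja/', '151Kostja');

var_dump($boundary);         // int(1)
var_dump($classNoBackspace); // int(0)
var_dump($classBackspace);   // int(1)
var_dump($alternation);      // int(1)
```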

PHP Regex Not Quite Working

I am using the following regex:
^[0-9.,]*(([.,][-])|([.,][0-9]{2}))?\$
I use this regex to check for valid prices -- so it catches/rejects things like xxx, or llddd or 34.23dsds
and allows things like 100 or 120.00
The problem is that a blank (empty) value passes as valid, which it should not. Any ideas how to change this?
Thanks

Be careful with the dot in your regex: outside a character class it stands for "any character", and if you mean a literal dot you need to escape it like this \. (inside a character class such as [.,] it is already literal, so the escape is optional there).
Also, you should require at least one digit, so replace the asterisk * with a + for "one or more".
Finally, the pattern accepts input such as .,.,.,.,.,.,- if you do not remove the comma and dot from the first part:
^[0-9]+(([\.,][-])|([\.,][0-9]{2}))?$

Taking your regex and just solving the "don't match blanks" problem:
^[0-9.,]+(([.,][-])|([.,][0-9]{2}))?$
The * allows 0 or more, while the + allows 1 or more. The * therefore allowed blanks, but the + will not: there must be at least one character from the class (a digit, dot, or comma).
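A minimal sketch comparing the two quantifiers on a few inputs (variable names are illustrative). It also shows why the pattern still needs tightening: a string of commas passes.

```php
<?php
$star = '/^[0-9.,]*(([.,][-])|([.,][0-9]{2}))?$/'; // original quantifier
$plus = '/^[0-9.,]+(([.,][-])|([.,][0-9]{2}))?$/'; // at least one character

$starEmpty  = preg_match($star, '');
$plusEmpty  = preg_match($plus, '');
$plusPrice  = preg_match($plus, '120.00');
$plusCommas = preg_match($plus, ',,,');

var_dump($starEmpty);  // int(1): empty input passes
var_dump($plusEmpty);  // int(0): empty input is rejected
var_dump($plusPrice);  // int(1)
var_dump($plusCommas); // int(1): still accepted
```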
EDIT:
You should clean this regex up a bit to be
^[0-9]+(?:[.,-](?:[0-9]{2})?)?$
This solves the matching of ",,,"
http://www.regextester.com/?fam=95185
EDIT 2: @Fuzzzzel pointed out that this did not match the case "50,-", which we assume you would like to match, and that removing capturing groups is presumptive. Here's the latest iteration of my suggested regex:
^[0-9]+([.,-](-|([0-9]{2}))?)?$
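A rough way to sanity-check this last pattern is to run it over the examples mentioned in the question and answers. The sample list and helper name below are assumptions made for the sketch.

```php
<?php
$regex   = '/^[0-9]+([.,-](-|([0-9]{2}))?)?$/';
$samples = ['100', '120.00', '50,-', '', ',,,', 'xxx', '34.23dsds'];

// Hypothetical helper: true when the whole string looks like a price.
$isPrice = function (string $s) use ($regex): bool {
    return preg_match($regex, $s) === 1;
};

$accepted = array_values(array_filter($samples, $isPrice));
$rejected = array_values(array_filter($samples, function ($s) use ($isPrice) {
    return !$isPrice($s);
}));

print_r($accepted); // 100, 120.00, 50,-
print_r($rejected); // empty string, ",,,", xxx, 34.23dsds
```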

Quick PHP regex for digit format

I just spent hours figuring out how to write a regular expression in PHP that I need to only allow the following format of a string to pass:
(any digit)_(any digit)
which would look like:
219211_2
so far I tried a lot of combinations, I think this one was the closest to the solution:
/(\\d+)(_)(\\d+)/
Also, if there was a way to limit the last number (the one after the underscore) to a certain number of digits (e.g. at most 12 digits), that would be nice.
I am still learning regular expressions, so any help is greatly appreciated, thanks.

The following:
\d+_\d{1,12}(?!\d)
This will match anywhere in the string. If you need the match to be at the start, at the end, or to be the whole string, modify it with anchors:
^\d+_\d{1,12}(?!\d) - must be at the start
\d+_\d{1,12}$ - must be at the end
^\d+_\d{1,12}$ - must be the entire string
demo: http://regex101.com/r/jG0eZ7
Explanation:
\d+ - at least one digit
_ - literal underscore
\d{1,12} - between 1 and 12 digits
(?!\d) - followed by "something that is not a digit" (negative lookahead)
The lookahead is important; otherwise the pattern would match the first 12 digits and ignore the 13th. If your number happens to be at the end of the string and you used the form I originally had, [^\d], it would fail to match in that specific case.
Thanks to @sln for pointing that out.
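The effect of the anchors and of the lookahead can be checked with a few calls. This is a sketch with invented input strings, not part of the original answer.

```php
<?php
$whole    = '/^\d+_\d{1,12}$/';     // must be the entire string
$anywhere = '/\d+_\d{1,12}(?!\d)/'; // anywhere, but not followed by a digit
$noGuard  = '/\d+_\d{1,12}/';       // anywhere, without the lookahead

$tooLong = '219211_1234567890123';  // 13 digits after the underscore

$okWhole   = preg_match($whole, '219211_2');
$longWhole = preg_match($whole, $tooLong);
$longAny   = preg_match($anywhere, "id $tooLong");
preg_match($noGuard, "id $tooLong", $m);

var_dump($okWhole);   // int(1)
var_dump($longWhole); // int(0)
var_dump($longAny);   // int(0)
var_dump($m[0]);      // string(19) "219211_123456789012": the 13th digit is ignored
```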

You don't need to double-escape \\d in PHP.
Use this regex:
"/^(\d+)_(\d{1,12})$/"
\d{1,12} will match 1 to 12 digits.
It is better to use start/end anchors to avoid matching unexpected input.

Try this:
$regex = '~^(\d+)_(\d+)$~';
$input = '219211_2';
if (preg_match($regex, $input, $result)) {
    // $result[1] is the number before the underscore, $result[2] the one after
    print_r($result);
}

Just try with following regex:
^(\d+)_(\d{1,12})$

Regex - matching all between second set of brackets ([])

I have the following string, from which I need to match only the last seven digits between the [] brackets. The string looks like this:
[15211Z: 2012-09-12] ([5202900])
I only need to match the 5202900 contained between ([ and ]). A similar number could appear anywhere in the string, so something like (\d{7}) won't work.
I also tried the following regex:
([[0-9]{1,7}])
but this includes the [] in the match?

If you just want the 7 digits, not the brackets, but want to make sure that the digits are surrounded with brackets:
(?<=\[)\d{7}(?=\])
FYI: (?<=\[) is called a positive lookbehind and (?=\]) a positive lookahead.
Good source on the topic: http://www.regular-expressions.info/lookaround.html

Try matching \(\[(\d{7})\]\), so you match this whole regular expression, then you take group 1, the one between unescaped parentheses. You can replace {7} with a '*' for zero or more, + for 1 or more or a precise range like you already showed in your question.
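For comparison, here is a sketch of both approaches in PHP on the string from the question; the variable names are illustrative. With lookaround the digits are the whole match, while with the capture group they are in group 1.

```php
<?php
$input = '[15211Z: 2012-09-12] ([5202900])';

// Lookaround: the brackets are asserted but not part of the match.
preg_match('/(?<=\[)\d{7}(?=\])/', $input, $around);

// Capture group: the whole match includes "([" and "])"; group 1 is the digits.
preg_match('/\(\[(\d{7})\]\)/', $input, $captured);

var_dump($around[0]);   // string(7) "5202900"
var_dump($captured[0]); // string(11) "([5202900])"
var_dump($captured[1]); // string(7) "5202900"
```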

You can try to use
\[(\d{1,7})\]

If first pattern looks like yours (not only digits), then this should work for you to extract group of digits surrounded by brackets like ([123]):
\(\[(\d+)\]\)

From your details, lookbehind and lookahead seem to be a good option. You can also use this one:
(\d{7})\]\)$
Since the seven-digit pattern is expected at the end of the line, the engine needs to do less work to find the match.
Hope it helps!

Here is a benchmark (in Perl, but I think the results are much the same in PHP) that compares the lookaround approach with a capture group:
use Benchmark qw(:all);
my $str = q/[15211Z: 2012-09-12] ([5202900])/;
my $count = -3;
cmpthese($count, {
    'lookaround' => sub {
        $str =~ /(?<=\[)\d{7}(?=\])/;
    },
    'capture group' => sub {
        $str =~ /\[(\d{7})\]/;
    },
});
result:
                  Rate    lookaround capture group
lookaround    274914/s            --          -70%
capture group 931043/s          239%            --
As we can see, capture is more than 3 times faster than lookaround.
