I have a problem that is simple to state but tricky to solve.
Here I have a METAR (a weather report in a very specific string format):
LIEA 051550Z 21005KT 9999 FEW020 19/14 Q1011
In this string, 051550Z means that the weather bulletin was issued on the 5th of the month at 15:50 UTC,... and 9999 indicates the visibility,...
Well, I tried to write a RegExp that would return the visibility, but I couldn't get it to work.
preg_match_all() returns the numbers
0515 (from the time group)
2100 (from the wind group)
9999 (wanted)
1011 (from the pressure group)
with the RegExp I've tried
([0-9]{4})
And then, I blindly added a
(?!Z)
trying not to get at least the time group...
But it doesn't work...
Looking at the problem itself, is it better to always take the third element of the array (without the (?!Z) addition), or to try to capture the right value directly?
In my opinion the latter would be better...
So, how can I get the visibility?
You could use a word boundary \b and then match 4 digits to get the visibility:
\b\d{4}\b
If it has to be the 4 digits at the fourth position, you could also match the first 3 groups: 1+ non-whitespace characters \S+ followed by 1+ horizontal whitespace characters \h+, repeated 3 times.
Then use \K to forget what was matched so far, and match 4 digits followed by a word boundary.
^(?:\S+\h+){3}\K\d{4}\b
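As a rough sketch, both ideas can be tried in Python (used here only for illustration; the question itself uses PHP's preg_match_all()). Python's re module has no \K or \h, so the positional variant below uses a capturing group and [ \t] instead, which is an adaptation rather than the exact pattern above. The helper name visibility is hypothetical.

```python
import re

METAR = "LIEA 051550Z 21005KT 9999 FEW020 19/14 Q1011"

# Variant 1: any standalone 4-digit token (word boundary on both sides).
standalone = re.findall(r"\b\d{4}\b", METAR)

# Variant 2: skip three whitespace-separated groups, then capture 4 digits.
# Adaptation: a capturing group replaces \K, and [ \t] replaces \h.
FOURTH_GROUP = re.compile(r"^(?:\S+[ \t]+){3}(\d{4})\b")

def visibility(metar):
    """Return the 4-digit visibility group, or None if it is absent."""
    m = FOURTH_GROUP.match(metar)
    return m.group(1) if m else None

print(standalone)         # ['9999']
print(visibility(METAR))  # 9999
```

The positional variant is stricter: it returns None when the fourth group is not 4 digits, while the first variant would pick up any standalone 4-digit token anywhere in the string.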
I'm creating a regex that searches for a text, but only if there isn't a dash after the match. I'm using a lookahead for this:
Regex: Text[\s\.][0-9]*(?!-)
Input      Expected result   Result
-----      ---------------   -------
Text 11    Text 11           Text 11
Text 52-   <No Match>        Text 5
Test case: https://regex101.com/r/doklxc/1/
The lookahead only seems to apply to the last character, which leaves me with Text 5, while I need it to not return a match at all.
I'm checking the https://www.regular-expressions.info/ guides and tried using groups, but I can't wrap my head around this one.
How can I make the lookahead apply to the entire preceding match?
I'm using the default .NET System.Text.RegularExpressions library.
The [0-9]* backtracks and lets the regex engine find a match even if there is a -.
There are two ways: either use atomic groups or check for a digit in the lookahead:
Text[\s.][0-9]*(?![-\d])
Or
Text(?>[\s.][0-9]*)(?!-)
Details
Text[\s.][0-9]*(?![-\d]) matches Text, then a dot or a whitespace, then 0 or more digits, and then it checks if there is a - or a digit immediately to the right; if there is, it fails the match. Even when the engine backtracks and tries to match fewer digits than it grabbed before, the \d in the lookahead fails those attempts.
Text(?>[\s.][0-9]*)(?!-) matches Text, then an atomic group starts: once the patterns inside the group have matched, the engine cannot backtrack into the group. As a result, (?!-) is only checked once, right after [0-9]* has grabbed all the digits it can.
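The backtracking effect is easy to reproduce outside .NET. The sketch below uses Python purely as an illustration (the question targets .NET) and only exercises the lookahead-based fix, since atomic groups are available in Python's re only from version 3.11 on.

```python
import re

# Original pattern: [0-9]* can give back digits, so (?!-) is satisfied early.
naive = re.compile(r"Text[\s.][0-9]*(?!-)")
# Fix without atomic groups: also forbid a digit right after the match.
fixed = re.compile(r"Text[\s.][0-9]*(?![-\d])")

for sample in ("Text 11", "Text 52-"):
    n = naive.search(sample)
    f = fixed.search(sample)
    print(repr(sample),
          "naive:", n.group() if n else None,
          "fixed:", f.group() if f else None)
```

For "Text 52-" the naive pattern returns Text 5, because the engine gives back the 2 and the lookahead then sees a digit instead of a dash; the fixed pattern returns no match.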
Assume I have a set of numbers (from 1 to 22) separated by some trivial delimiters (comma, point, space, etc). I need to make sure that this set of numbers does not contain any repetition of the same number. Examples:
1,14,22,3 // good
1,12,12,3 // not good
Is it possible to do via regular expression?
I know it's easy to do using just PHP, but I really wonder how to make it work with a regex.
Yes, you can achieve this with a regex by using a negative lookahead.
^(?!.*\b(\d+)\b.*\b\1\b)\d+(?:,\d+)+$
(?!.*\b(\d+)\b.*\b\1\b) Negative lookahead at the start asserts that no number is repeated anywhere in the string. \b(\d+)\b.*\b\1\b matches a repeated number.
\d+ matches one or more digits.
(?:,\d+)+ One or more occurrences of a comma followed by one or more digits.
$ Asserts that we are at the end of the string.
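A quick way to check the pattern, sketched in Python for illustration (the question mentions PHP; the pattern itself is unchanged). The function name has_unique_numbers is hypothetical.

```python
import re

# Comma-separated list of at least two numbers with no repeated value.
UNIQUE = re.compile(r"^(?!.*\b(\d+)\b.*\b\1\b)\d+(?:,\d+)+$")

def has_unique_numbers(text):
    """True if text is a comma-separated list without repeated numbers."""
    return UNIQUE.match(text) is not None

print(has_unique_numbers("1,14,22,3"))  # True
print(has_unique_numbers("1,12,12,3"))  # False
print(has_unique_numbers("1,12,3,12"))  # False
```

The word boundaries around the group and around \1 matter: without them, 1 and 11 would be reported as a repetition.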
OR
Regex for the numbers separated by space, dot, comma as delimiters.
^(?!.*\b(\d+)\b.*\b\1\b)\d+(?:([.\s,])\d+)(?:\2\d+)*$
(?:([.\s,])\d+) The capturing group inside this non-capturing group records the first delimiter, and \2 then requires every following delimiter to be the same character, i.e. the above regex won't match strings like 2,3 5.6
You can use this regex:
^(?!.*?(\b\d+)\W+\1\b)\d+(\W+\d+)*$
The negative lookahead (?!.*?(\b\d+)\W+\1\b) rejects the string when the same number appears twice in a row, separated by 1 or more non-word characters. Duplicates that are not adjacent, as in 1,12,3,12, are not detected.
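To see what "one after another" means in practice, here is a small Python check (Python is only an illustration here; the pattern is the one given above).

```python
import re

# Rejects a number that is immediately followed by the same number.
ADJACENT_ONLY = re.compile(r"^(?!.*?(\b\d+)\W+\1\b)\d+(\W+\d+)*$")

results = {}
for sample in ("1,14,22,3", "1,12,12,3", "1,12,3,12"):
    results[sample] = ADJACENT_ONLY.match(sample) is not None
    print(sample, results[sample])
```

The last sample is accepted even though 12 occurs twice, because the two occurrences are not next to each other.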
Here is the solution that fits my current need:
^(?>(?!\2\b|\3\b)(1\d{1}|2[0-2]{1}|\d{1}+)[,.; ]+)(?>(?!\1\b|\3\b)(1\d{1}|2[0-2]{1}|\d{1}+)[,.; ]+)(?>(?!\1\b|\2\b)(1\d{1}|2[0-2]{1}|\d{1}+))$
It matches sequences of unique numbers separated by one or more separators, limits each number to the range 1 to 22, and allows only 3 numbers in the sequence.
It's not perfect yet, but it works fine! Thanks a lot to everyone who gave me a hand on this!
I have these
name
name[one]
name[one][two]
name[one][two][three]
I want to be able to match them like this:
[name]
[name, one]
[name, one, two]
[name, one, two, three]
Here's my regex I've tried:
/([\w]+)(?:(?:\[([\w]+)\])+)?/
I just can't seem to get it right: it only captures the last pair of square brackets.
You can't have a dynamic number of captures; the number of captures is exactly equal to the number of capture parenthesis pairs ((?:...) don't count). You have two capture parenthesis pairs, that means you get two captures - no more, no less.
To handle variable number of matches, use submatches (in a replace with a function, if your language supports that), or split.
You haven't tagged the question with a programming language, so this is as specific as I can go.
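Since no language was given, here is one possible reading of the "submatches" advice, sketched in Python as an assumption: validate the overall shape once, then collect every word with a global search. The function name parts is hypothetical.

```python
import re

def parts(expr):
    """Split 'name[one][two]' into ['name', 'one', 'two']; None if malformed."""
    # Validate the overall shape first: a word followed by [word] groups.
    if not re.fullmatch(r"\w+(?:\[\w+\])*", expr):
        return None
    # Then collect every word; findall is not limited by the group count.
    return re.findall(r"\w+", expr)

for sample in ("name", "name[one]", "name[one][two][three]"):
    print(parts(sample))
```

The number of results depends on the input rather than on the number of capture groups in the pattern, which is the point of the answer above.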
This should do ([\w]+)(?:\[([\w]+)\]\+)?
http://regex101.com/r/mF8pC8/3
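Changes from the original regex: removed the extra non-capturing group and added \ before the last +. Note that \+ matches a literal + character, so on inputs such as name[one] the optional group does not match and only the first capture is filled.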
1st Capturing group ([\w]+)
[\w]+ match a single character present in the list below
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\w match any word character [a-zA-Z0-9_]
(?:\[([\w]+)\]\+)? Non-capturing group
Quantifier: Between zero and one time, as many times as possible, giving back as needed [greedy]
\[ matches the character [ literally
2nd Capturing group ([\w]+)
[\w]+ match a single character present in the list below
Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
\w match any word character [a-zA-Z0-9_]
\] matches the character ] literally
\+ matches the character + literally
g modifier: global. All matches (don't return on first match)
A repeated capturing group only keeps its last match, so you can't collect every repetition that way. You can write the group out a number of times, though. This works for one to three groups in square brackets. You can add more if you like.
(\w+)\[(\w+)\](?:\[(\w+)\])?(?:\[(\w+)\])?
You cannot have a dynamic number of captures with PHP regexps...
Why not just write something like explode('[', strtr('name[one][two][three]', [']' => ''])) instead? It will give you the desired result.
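The same split-based idea carries over to other languages. A minimal Python sketch, assuming the input is well formed (the function name split_brackets is hypothetical):

```python
def split_brackets(expr):
    """Drop every ']' and split on '[' (mirrors the strtr + explode idea)."""
    return expr.replace("]", "").split("[")

print(split_brackets("name"))                   # ['name']
print(split_brackets("name[one][two][three]"))  # ['name', 'one', 'two', 'three']
```

Like the PHP one-liner, this does no validation, so malformed input such as unbalanced brackets is split without complaint.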
I'm trying to get a six-digit number that is not surrounded by any other digit and is not part of a longer sequence of digits. This number can exist at the beginning of the string, anywhere in it, and at the end. It can also have commas and text in front of it, but most importantly it must be a distinct block of exactly 6 digits. I've pulled my hair out doing lookaheads and conditions and can't find a complete solution that solves all issues.
Sample data:
00019123211231731ORDER NO 761616 BR ADDRESS 123 A ST
ORDER NO. 760641 JOHN DOE
REF: ORDER #761625
OP212312165 ORDER NUMBER 759699 /REC/YR 123 A ST
766911
761223,761224,761225
(^|\D)(\d{6})(\D|$). You will find your needed 6 digit match in capturing group 2. Notice that this solution is reliable only for one match. It won't find both numbers in 123456,567890 (Thank you Alan for pointing this out!). If multiple matches are needed a lookaround solution should be used.
With look-arounds:
(?<=^|\D)\d{6}(?=\D|$)
or with look-arounds and the condition to be a valid number (i.e. the first digit is not 0):
(?<=^|\D)[1-9]\d{5}(?=\D|$)
You can use a negative lookbehind and negative lookahead to make sure there are no digits adjacent to the match:
(?<!\d)\d{6}(?!\d)
This only matches the number, and not the adjacent characters.
Also, it works if the match is at the beginning or end of the string.
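The difference between the consuming pattern and the lookaround pattern shows up on the comma-separated sample. The sketch below uses Python for illustration; it sticks to the negative lookbehind form because Python's re requires fixed-width lookbehinds, so (?<=^|\D) is not used here.

```python
import re

SIX_DIGITS = re.compile(r"(?<!\d)\d{6}(?!\d)")
CONSUMING = re.compile(r"(^|\D)(\d{6})(\D|$)")

samples = [
    "00019123211231731ORDER NO 761616 BR ADDRESS 123 A ST",
    "REF: ORDER #761625",
    "766911",
    "761223,761224,761225",
]
found = [SIX_DIGITS.findall(line) for line in samples]
for line, numbers in zip(samples, found):
    print(numbers)

# The consuming pattern swallows the comma, so the second number is missed.
consumed = [m.group(2) for m in CONSUMING.finditer("123456,567890")]
print(consumed)                             # ['123456']
print(SIX_DIGITS.findall("123456,567890"))  # ['123456', '567890']
```

Because the lookarounds match no characters themselves, the delimiter stays available as the left-hand context of the next number.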
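Couldn't you just as easily use this regex?
[^0-9](\d{6})[^0-9]
It matches any 6-digit number that has a non-digit character on each side, so the number is not part of a longer digit sequence. Note that it needs those two surrounding characters, so it does not match a number at the very beginning or end of the string.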
I am using the following regex to match an account number. When we originally put this regex together, the rule was that an account number would only ever begin with a single letter. That has since changed and I have an account number that has 3 letters at the beginning of the string.
I'd like to have a regex that will match a minimum of 1 letter and a maximum of 3 letters at the beginning of the string. The last issue is the length of the string. It can be as long as 9 characters and a minimum of 3.
Here is what I am currently using.
'/^([A-Za-z]{1})([0-9]{7})$/'
Is there a way to match all of this?
You want:
^[A-Za-z]([A-Za-z]{2}|[A-Za-z][0-9]|[0-9]{2})[0-9]{0,6}$
The initial [A-Za-z] ensures that it starts with a letter, the second bit ([A-Za-z]{2}|[A-Za-z][0-9]|[0-9]{2}) ensures that it's at least three characters long and consists of between one and three letters at the start, and the final bit [0-9]{0,6} allows you to go up to 9 characters in total.
Further explaining:
^ Start of string/line anchor.
[A-Za-z] First character must be alpha.
( [A-Za-z]{2} Second/third character are either alpha/alpha,
|[A-Za-z][0-9] alpha/digit,
|[0-9]{2} or digit/digit
) (also guarantees minimum length of three).
[0-9]{0,6} Then up to six digits (to give length of 3 thru 9).
$ End of string/line marker.
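A few spot checks of the pattern, sketched in Python for illustration (the question uses PHP-style delimiters; the pattern body is the same). The sample account numbers are made up.

```python
import re

ACCOUNT = re.compile(
    r"^[A-Za-z]([A-Za-z]{2}|[A-Za-z][0-9]|[0-9]{2})[0-9]{0,6}$"
)

def is_account(value):
    """True if value has 1-3 leading letters, then digits, 3-9 chars total."""
    return ACCOUNT.match(value) is not None

for value in ("A12", "AB1", "ABC123456", "A1", "ABCD12", "A1B", "ABC1234567"):
    print(value, is_account(value))
```

The first three values are accepted; A1 is too short, ABCD12 has four letters, A1B has a letter after a digit, and ABC1234567 is ten characters long.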
Try this:
'/^([A-Za-z]{1,3})([0-9]{0,6})$/'
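That will give you 1 to 3 letters followed by up to 6 digits, so at most 9 characters in total. Note that it does not enforce the minimum length of 3: a single letter such as A also matches.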
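The length behaviour of this shorter pattern can be checked directly. In the Python sketch below, the WITH_MIN variant is an assumption added for illustration, not part of the answer: it prepends a lookahead to require 3 to 9 characters in total.

```python
import re

SIMPLE = re.compile(r"^([A-Za-z]{1,3})([0-9]{0,6})$")
# Hypothetical variant: a lookahead enforces the overall length of 3 to 9.
WITH_MIN = re.compile(r"^(?=.{3,9}$)([A-Za-z]{1,3})([0-9]{0,6})$")

for value in ("A", "AB", "A12", "ABC123456"):
    print(value,
          SIMPLE.match(value) is not None,
          WITH_MIN.match(value) is not None)
```

The simple pattern accepts all four values, while the variant with the lookahead rejects A and AB.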