regex repetitive parts in monetary amounts - php

In a monetary pattern I look for thousand separator
(?:[. ]\d{3})*
In this case the thousand separator could be either . or a space. But how can I make sure that the pattern does not match when the thousand separators are mixed?
Only match patterns like
.123.123.123
123 123 123
Do not match
.123 123 123.123

You can match the first separator and then use a backreference to match the same separator for the remaining digit groups:
^(?:([. ])\d{3}(\1\d{3})*)?$
Demo: https://regex101.com/r/u4zPuf/1

If you don't want to match empty strings:
^([. ])\d{3}(?:\1\d{3})*$
In parts the pattern matches:
^ Start of string
([. ]) Capture group 1, match either . or a space
\d{3} Match 3 digits
(?:\1\d{3})* Optionally repeat a back reference to what is captured in group 1 followed by 3 digits
$ End of string
Regex demo
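A minimal sketch of how the non-empty variant behaves in PHP. The sample strings come from the question, and the second one is assumed to start with a space, since the pattern requires a leading separator.

```php
<?php
// Pattern from the answer above: group 1 captures the first separator,
// and \1 forces every later separator to be the same character.
$pattern = '/^([. ])\d{3}(?:\1\d{3})*$/';

// Samples from the question; the leading space on the second one is an assumption.
$samples = ['.123.123.123', ' 123 123 123', '.123 123 123.123'];

$results = [];
foreach ($samples as $sample) {
    $results[$sample] = preg_match($pattern, $sample) === 1;
}

foreach ($results as $sample => $ok) {
    printf("%-20s %s\n", "'$sample'", $ok ? 'match' : 'no match');
}
```

Keep the pattern in a single-quoted string: inside double quotes PHP would read \1 as an octal escape instead of passing it to the regex engine.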

Related

regex repetitive pattern with monetary amount

(?:[.,]?\d{3})* is part of a monetary amount pattern; it matches the thousand separator and the 3 digits after it. The thousand separator can be either ., , or nothing.
But how do I make it consistent across repetitions? If the first thousand separator is . then the rest should also be .
You can use a capture group with a repeating backreference:
^(?:\d{4,}|\d{1,3}(?:([,.])\d{3}(?:\1\d{3})*)?)$
Explanation
^ Start of string
(?: Non capture group for the alternatives
\d{4,} Match 4 or more digits
| Or
\d{1,3} Match 1-3 digits
(?: Non capture group
([,.])\d{3} capture either a . or comma in group 1
(?:\1\d{3})* Optionally repeat the backreference to group 1 (the same char) followed by 3 digits
)? Close the non capture group and make it optional (to also match 1-3 digits)
) Close the non capture group
$ End of string
See a regex demo.
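The pattern above can be tried out like this. It is only a sketch: isAmount is a made-up helper name and the sample values are invented for illustration.

```php
<?php
// Hypothetical helper wrapping the pattern from the answer above.
function isAmount(string $value): bool
{
    // Either 4+ plain digits, or 1-3 digits followed by groups of 3 digits
    // that all use the separator captured in group 1.
    $pattern = '/^(?:\d{4,}|\d{1,3}(?:([,.])\d{3}(?:\1\d{3})*)?)$/';
    return preg_match($pattern, $value) === 1;
}

// Invented sample values.
foreach (['1234567', '1,234,567', '1.234.567', '1,234.567', '12', '1234,567'] as $value) {
    echo $value, ' => ', isAmount($value) ? 'match' : 'no match', "\n";
}
```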
Take it block by block: the beginning, the middle and the end. Then use an alternation (|) to match the numbers < 1000.
^((\d{1,3}[.,]){,2}(\d{3}[,.])*(\d{3})|(\d{1,3}))$
(This works with python regex on this tool: https://regex101.com/)
For PHP I had to modify it like this:
^(\d{1,3}|(\d{1,3}[.,])+(\d{3}[.,])*\d{3})$
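One thing worth checking before using this PHP variant: as far as I can tell it does not tie the later separators to the first one, so mixed separators still pass. A small sketch comparing it with the backreference pattern from the other answer (the variable names and sample values are mine):

```php
<?php
// PHP variant from this answer: every separator is matched by its own [.,] class.
$loose = '/^(\d{1,3}|(\d{1,3}[.,])+(\d{3}[.,])*\d{3})$/';
// Backreference pattern from the other answer, for comparison.
$strict = '/^(?:\d{4,}|\d{1,3}(?:([,.])\d{3}(?:\1\d{3})*)?)$/';

$looseConsistent = preg_match($loose, '1.234.567');  // consistent separators
$looseMixed      = preg_match($loose, '1,234.567');  // mixed separators
$strictMixed     = preg_match($strict, '1,234.567'); // mixed separators

var_dump($looseConsistent, $looseMixed, $strictMixed);
```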

Regex optional groups and digit length

Maybe some regex-Master can solve my problem.
I have a big list with many addresses with no separators ( , ; ).
The address string contains following Information:
The first group is the street name
The second group is the street number
The third group is the zipcode (optional)
The last group is the town name (optional)
As you can see on the image above the last two test strings are not matching.
I need the last two regex groups to be optional and the third group should be either 4 or 5 digits.
I tried (\d{4,5}) for allowing 4 and 5 digits. But this only works halfway, as you can see here: https://regex101.com/r/ZurqHh/1
(This sometimes mixes the street number and zipcode together)
I also tried (?:\d{5})? to make the third and fourth group optional. But this destroys my whole group layout...
https://regex101.com/r/EgxeMy/1
This is my current regex:
/^([a-zäöüÄÖÜß\s\d.,-]+?)\s*([\d\s]+(?:\s?[-|+\/]\s?\d+)?\s*[a-z]?)?\s*(\d{5})\s*(.+)?$/im
Try it out yourself:
https://regex101.com/r/zC8NCP/1
My brain is only farting at this moment and I can't think straight anymore.
Please help me fix this problem so I can die in peace.
You can use
^(.*?)(?:\s+(\d+(?:\s*[-|+\/]\s*\d+)*\s*[a-z]?\b))?(?:\s+(\d{4,5})(?:\s+(.*))?)?$
See the regex demo (note all \s are replaced with \h to only match horizontal whitespaces).
Details:
^ - start of string
(.*?) - Group 1: any zero or more chars other than line break chars
(?:\s+(\d+(?:\s*[-|+\/]\s*\d+)*\s*[a-z]?\b))? - an optional non-capturing group matching
\s+ - one or more whitespaces
(\d+(?:\s*[-|+\/]\s*\d+)*\s*[a-z]?\b) - Group 2:
\d+ - one or more digits
(?:\s*[-|+\/]\s*\d+)* - zero or more sequences of zero or more whitespaces, -, +, | or /, zero or more whitespaces, one or more digits
\s* - zero or more whitespaces
[a-z]?\b - an optional lowercase ASCII letter and a word boundary
(?:\s+(\d{4,5})(?:\s+(.*))?)? - an optional non-capturing group matching
\s+ - one or more whitespaces
(\d{4,5}) - Group 3: four or five digits
(?:\s+(.*))? - an optional sequence of one or more whitespaces and then any zero or more chars other than line break chars as many as possible
$ - end of string.
Please note that the (?:\s+(.*))? optional group must be inside the (?:\s+(\d{4,5})...)? group to work.
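Here is a rough sketch of using that pattern from PHP. The helper name splitAddress and the sample addresses are made up (the question's test strings are not reproduced here), and the i flag is assumed, as in the question's own regex.

```php
<?php
// Pattern from the answer above; i flag assumed so [a-z] also covers capitals.
$pattern = '~^(.*?)(?:\s+(\d+(?:\s*[-|+\/]\s*\d+)*\s*[a-z]?\b))?(?:\s+(\d{4,5})(?:\s+(.*))?)?$~i';

// Hypothetical helper: returns the four parts, with null for missing ones.
function splitAddress(string $pattern, string $address): ?array
{
    if (preg_match($pattern, $address, $m) !== 1) {
        return null;
    }
    $parts = [];
    foreach (['street', 'number', 'zip', 'town'] as $i => $name) {
        // Trailing groups that did not take part in the match are absent from $m.
        $value = $m[$i + 1] ?? '';
        $parts[$name] = $value === '' ? null : $value;
    }
    return $parts;
}

// Invented sample addresses.
foreach (['Hauptstrasse 12 12345 Berlin', 'Am Markt 7a 1234 Wien', 'Hauptstrasse 12'] as $address) {
    print_r(splitAddress($pattern, $address));
}
```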
It is difficult to parse addresses because we are halfway between formatted text and natural language. Here is a pattern that tries to keep the number of optional parts as low as possible while still succeeding with the examples offered, without asking too much of the regex engine. To do this, I mainly rely on character classes, atomic groups, and a relatively accurate description of the street names. Obviously, the examples in the question cannot be representative of reality, and characters could be added to or removed from the classes to deal with new cases. Nevertheless, the structure of this pattern is a good starting point.
~
^
(?<strasse> [\pL\d-]+ \.? (?> \h+ [\pL\d-]+ \.? )*? ) \h*
(?<nummer> \b (?> \d+ | [-+/\h]+ | [a-z] \b )*? )
(?: \h+ (?<plz> \d{4,5} )
\h+ (?<stadt> .+ ) )?
$
~mxui
demo
Note that in the above link you can also see a previous version of this pattern with a more accurate description of the street number (a bit more efficient but longer).
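For what it's worth, a minimal sketch of calling this pattern from PHP and reading the named groups. The sample address is invented, so treat the result as an illustration only.

```php
<?php
// Pattern from the answer above, unchanged. The x flag lets it span several
// lines; single quotes keep the backslashes intact.
$pattern = '~
^
(?<strasse> [\pL\d-]+ \.? (?> \h+ [\pL\d-]+ \.? )*? ) \h*
(?<nummer> \b (?> \d+ | [-+/\h]+ | [a-z] \b )*? )
(?: \h+ (?<plz> \d{4,5} )
    \h+ (?<stadt> .+ ) )?
$
~mxui';

$address = 'Hauptstrasse 12 12345 Berlin'; // made-up sample
$found = preg_match($pattern, $address, $m) === 1;

if ($found) {
    // Named groups are available under their names in $m.
    printf("%s | %s | %s | %s\n", $m['strasse'], $m['nummer'], $m['plz'] ?? '', $m['stadt'] ?? '');
}
```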

Regex Preg_match for licence key 25 alphanumeric and 4 hyphens

I'm still trying to get to grips with regex patterns and just after a little double-checking if someone wouldn't mind obliging!
I have a string which should either contain:
A 10 digit (numbers and letters) licence key, for example: 1234567890 OR
A 25 digit (numbers and letters) licence key, for example: ABCD1EFGH2IJKL3MNOP4QRST5 OR
A 29 digit licence number (25 numbers and letters, separated into 5 groups by hyphens), for example: ABCD1-EFGH2-IJKL3-MNOP4-QRST5
I can match the first two fine, using ctype_alnum and strlen functions. However, for the last one I think I'll need to use regex and preg_match.
I had a go over at regex101.com and came up with the following:
preg_match('^([A-Za-z0-9]{5})+-+([A-Za-z0-9]{5})+-+([A-Za-z0-9]{5})+-([A-Za-z0-9]{5})+-+([A-Za-z0-9]{5})', $str);
Which seems to match what I'm looking for.
I want the string to only contain an exact match for a string beginning with the licence number, and contain nothing other than mixed upper/lower case letters and numbers in any order and hyphens between each group of 5 characters (so a total of 29 characters - I don't want any further matches). No white space, no other characters and nothing else before or after the 29 digit key.
Will the above work, without allowing any other combinations? Will it stop checking at 29 characters? I'm not sure if there is a simpler way to express this in regex?
Thanks for your time!
The main point is that you need to use both ^ (start of string) and $ (end of string) anchors. Also, when you use + after (...), you allow 1 or more repetitions of the whole subpattern inside the (...). So, you need to remove the +s and add the $ anchor. Also, you need regex delimiters for your regex to work in PHP preg_match. I prefer ~ so as not to escape /. Maybe it is not the case here, but this is a habit.
So, the regex can look like
'~^[A-Za-z0-9]{5}(?:-[A-Za-z0-9]{5}){4}$~'
See the regex demo
The (?:-[A-Za-z0-9]{5}){4} matches 4 occurrences of -[A-Za-z0-9]{5} subpattern. The (?:...) is a non-capturing group whose matched text does not get stored in any buffer (unlike the capturing group).
See the IDEONE demo:
$re = '~^[A-Za-z0-9]{5}(?:-[A-Za-z0-9]{5}){4}$~';
$str = "ABCD1-EFGH2-IJKL3-MNOP4-QRST5";
if (preg_match($re, $str, $matches)) {
    echo "Matched!";
}
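To cover all three key formats from the question in one place, something like the following could work. It is a sketch only: isValidLicenceKey is a made-up name, and the D modifier is my addition, because without it $ also matches just before a trailing newline.

```php
<?php
// Hypothetical helper; the name and structure are illustrative only.
function isValidLicenceKey(string $key): bool
{
    $length = strlen($key);
    // 10- or 25-character keys: letters and digits only.
    if ($length === 10 || $length === 25) {
        return ctype_alnum($key);
    }
    // Otherwise: five alphanumeric groups of five, joined by single hyphens.
    // D keeps $ from matching before a final newline.
    return preg_match('~^[A-Za-z0-9]{5}(?:-[A-Za-z0-9]{5}){4}$~D', $key) === 1;
}

var_dump(isValidLicenceKey('ABCD1-EFGH2-IJKL3-MNOP4-QRST5'));
```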
How about:
preg_match('/^([a-z0-9]{5})(?:-(?1)){4}$/i', $str);
Explanation:
/ : regex delimiter
^ : beginning of string
( : begin group 1
[a-z0-9]{5} : exactly 5 alphanum.
) : end of group 1
(?: : begin NON capture group
- : a dash
(?1) : same as definition in group 1 (ie. [a-z0-9]{5})
){4} : this group must be repeated 4 times
$ : end of string
/i : regex delimiter with case insensitive modifier
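Note that (?1) re-runs the sub-pattern of group 1, whereas a backreference \1 would require the same text again. A short sketch of the difference (the variable names and the second sample key are mine):

```php
<?php
$key       = 'ABCD1-EFGH2-IJKL3-MNOP4-QRST5';
$sameGroup = 'ABCD1-ABCD1-ABCD1-ABCD1-ABCD1'; // invented sample

// (?1) reuses the pattern of group 1, so each group may hold different text.
$subroutine = '/^([a-z0-9]{5})(?:-(?1)){4}$/i';
// \1 reuses the text matched by group 1, so every group must repeat it.
$backreference = '/^([a-z0-9]{5})(?:-\1){4}$/i';

$subOnKey  = preg_match($subroutine, $key);
$backOnKey = preg_match($backreference, $key);
$backOnSame = preg_match($backreference, $sameGroup);

var_dump($subOnKey, $backOnKey, $backOnSame);
```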

PHP regex and repeated patterns

I'd like to match these strings using preg_match, basically only repeated patterns of digits and a comma (optional), no letters
123
123,
123,456
123,456,
123,456,789
123,456,789,
etc...
but not
abc
123,abc
123,abc,456
abc,123,456
thanks
Put the comma and the pattern that matches one or more digits inside a non-capturing group and then make it repeat zero or more times. And also don't forget to add an optional comma at the end.
^[0-9]+(?:,[0-9]+)*,?$
DEMO
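A quick way to check the pattern against the strings from the question (isDigitList is just a made-up helper name):

```php
<?php
// Hypothetical helper around the pattern from the answer above.
function isDigitList(string $value): bool
{
    // Digits, then zero or more ",digits" groups, then an optional final comma.
    return preg_match('/^[0-9]+(?:,[0-9]+)*,?$/', $value) === 1;
}

// Strings taken from the question.
$shouldMatch = ['123', '123,', '123,456', '123,456,', '123,456,789', '123,456,789,'];
$shouldFail  = ['abc', '123,abc', '123,abc,456', 'abc,123,456'];

foreach (array_merge($shouldMatch, $shouldFail) as $value) {
    echo $value, ' => ', isDigitList($value) ? 'match' : 'no match', "\n";
}
```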

Phone no contains this pattern AABBCC e.g 112233

I want to check if phone no contains this pattern AABBCC
Where A[0-9],B[0-9],C[0-9] They should be different e.g 112233,553322,887766
Let Us Suppose
I Have a phone no 03334112233
It will say yes pattern matched.
PHP Code but It Is For Exact String
$str = 'aabbaabbccaass'; //or whatever
if (preg_match('/(?!.*?aabbcc)^.*$/', $str))
echo "accepted\n";
else
echo "rejected\n";
Problem: I don't know how to do this when the string contains numbers
Possible Duplicate
but it does not contain answer and exact detail.
Edited :
I want to match the last 6 characters of the string in this pattern AABBCC e.g 03329112233
To match number with AABBCC format, you can use this pattern:
(?:(\d)\1(?!\1)){2}(\d)\2
example of use:
if (preg_match('/(?:(\d)\1(?!\1)){2}(\d)\2/', $str))
echo "rejected\n";
else
echo "accepted\n";
But if you have other tests to do (for example that there is only digits), it can be more flexible to use it in this way:
if (preg_match('/(?!.*(?:(\d)\1(?!\1)){2}(\d)\2)^\d+$/', $str))
echo "accepted\n";
else
echo "rejected\n";
pattern details:
(?: # open a non capturing group that describes a repeated digit
(\d) # capture the first digit with group 1
\1 # a backreference to group 1 (the same digit thus)
(?!\1) # check with a negative lookahead that the same digit doesn't follow
){2} # repeat the group two times
(\d)\2 # same thing for digits 5 & 6 (the lookahead isn't needed here)
Note that the digit in the capture group changes at each repetition of the non capturing group (because the negative lookahead forces it).
Notice: if you want to reject numbers that contain, for example, 111122 or 112222 or 111111, you only need to remove the negative lookahead.
If all three digits must differ, so that formats like 112211 or 448844 no longer count as AABBCC, you must change the pattern like this: (\d)\1(?!\d{0,2}\1)(\d)\2(?!\2)(\d)\3
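To see how the two patterns differ, here is a small sketch. The names $loose, $strict and containsPattern are mine; the phone number comes from the question.

```php
<?php
// First pattern from this answer: neighbouring pairs must differ.
$loose = '/(?:(\d)\1(?!\1)){2}(\d)\2/';
// Last pattern from this answer: all three digits must differ.
$strict = '/(\d)\1(?!\d{0,2}\1)(\d)\2(?!\2)(\d)\3/';

// Hypothetical helper: true when the pattern occurs anywhere in the number.
function containsPattern(string $pattern, string $number): bool
{
    return preg_match($pattern, $number) === 1;
}

var_dump(containsPattern($loose, '03334112233'));  // contains 112233
var_dump(containsPattern($loose, '112211'));       // A and C are the same digit
var_dump(containsPattern($strict, '112211'));
```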
As I understand it, you only want to match the last 6 characters of the string if they are digits and consist of 3 digit pairs that are all different. I would also use a lookahead and a pattern like this:
(?>((\d)\2)(?!.*\1)){3}$
\2 checks for an equivalent of 2nd capturing group, which is one digit (shorthand \d)
using a negative lookahead to check, if not followed by .* any amount of any characters, followed by equivalent of 1st capturing group (which contains 2 equal digits).
{3} 3 repetitions at $ end of string.
Test on regex101.com, Regex FAQ
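A minimal sketch of the end-anchored pattern in PHP. The first number is from the question; the others are invented to show the failing cases.

```php
<?php
// Pattern from this answer: three digit pairs at the end of the string,
// where no pair may occur again later in the string.
$pattern = '/(?>((\d)\2)(?!.*\1)){3}$/';

$endsWithPairs  = preg_match($pattern, '03329112233'); // ends with 112233
$repeatedPair   = preg_match($pattern, '03329112211'); // 11 occurs twice
$pairsNotAtEnd  = preg_match($pattern, '0332911223');  // pairs are not at the end

var_dump($endsWithPairs, $repeatedPair, $pairsNotAtEnd);
```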
Your regex should be like this:
^((\d)\2){3}$
It is simpler and also works.
You can use capturing groups and backreferences like this:
if (preg_match('/(?!.*(.)\1(.)\2(.)\3)^.*$/', $str))
The (.) will match any single character and assign it to a group. The first instance is assigned to group 1, the second to group 2 and so on. Later in the pattern, the backreference \1 will match exactly what was previously captured in the first group, \2 will match what was captured in the second group, etc.
You probably will also want to use \d to match any single digit (it's only necessary to use this outside of the lookahead) and a {n,m} quantifier to match between n and m digits. For example, the following will match any sequence of 7 to 10 digits that does not contain a subsequence like AABBCC:
if (preg_match('/(?!.*(.)\1(.)\2(.)\3)^\d{7,10}$/', $str))
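A short sketch of that last check, with invented sample numbers. Keep in mind that this lookahead does not require the three pairs to differ, so a run like 111111 is treated as AABBCC as well.

```php
<?php
// Pattern from the answer above: 7 to 10 digits with no three consecutive pairs.
$pattern = '/(?!.*(.)\1(.)\2(.)\3)^\d{7,10}$/';

// Hypothetical helper name.
function isAccepted(string $pattern, string $number): bool
{
    return preg_match($pattern, $number) === 1;
}

// Invented sample numbers.
foreach (['1234567', '1122334', '1111111', '12345'] as $number) {
    echo $number, ' => ', isAccepted($pattern, $number) ? "accepted\n" : "rejected\n";
}
```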
