PHP and regex: Check if string follows the pattern with square brackets - php

I have no experience with regular expressions whatsoever, hence my question.
I have a string which should look like this:
[1,24,2,59]
As the string can be manipulated by the user and hence changed, I want to check whether it is still following the same organization pattern, and only containing numbers,square brackets and commas.

You can use this regex in preg_match to validate your input:
'/\[\d+(,\d+)*\]/'
\[ matches the character [ literally
\d+ match a digit [0-9]
Quantifier: Between one and unlimited times, as many times as possible, giving back as
needed [greedy]
1st Capturing group (,\d+)*
Quantifier: Between zero and unlimited times, as many times as possible, giving back as
needed [greedy]
Note: A repeated capturing group will only capture the last iteration. Put a capturing
group around the repeated group to capture all iterations or use a non-capturing group
instead if you're not interested in the data
, matches the character , literally
\d+ match a digit [0-9]
Quantifier: Between one and unlimited times, as many times as possible, giving back as
needed [greedy]
\] matches the character ] literally

'/^\[\d+(,\d+)*\]$/'
Same as the other answer, except it requires the whole string to match, otherwise a valid substring would suffice (e.g. "bla bla [1,3] bla bla")
'/^\[([1-9]\d+|\d)(,([1-9]\d+|\d))*\]$/'
Same, but numbers must have leading zeros removed, so "[12]" is OK but "[012]" is not.
'/^\s*\[\s*([1-9]\d+|\d)(\s*,\s*([1-9]\d+|\d))*\s*\]\s*$/'
Allows white spaces, e.g. " [ 1 , 12 , 12 ] " will be accepted but not "[1 2]".

Related

PHP Regex: allowed no more 3 digits but any count for another chars

I need to modify the exp:
/^[-\p{L}\d]+$/u
The purpose is to allow maximum 3 digits in whole string, but letter chars and dashes should be allowed without quantity restrictions. No matters where digits are located within string. For instance:
test345 //this should match
345te-st //this should match
3454dtest //this shouldn\'t match
I have already tried some patterns, but is don't work properly:
/^[-\p{L}\d{0,3}]+$/u
/^[-\p{L}]+[\d]{0,3}+$/u
/^([-\p{L}]+)([\d]+){0,3}$/u
The patterns that you tried don't work properly matching max 3 digits anywhere in the string because between the start and end anchors:
This notation is fully in a character class [-\p{L}\d{0,3}]+ which matches repeating 1+ times any of the listed characters between [...] (So there is no digit limit due to the +)
This notation [-\p{L}]+[\d]{0,3}+ matches 1+ times [-\p{L}] followed by 0-3 consecutive digits (So the digits can not be at the start for example)
This notation ([-\p{L}]+)([\d]+){0,3} uses 2 capture groups, but the order here is also 1+ times [-\p{L}] and 0-3 consecutive digits (only here is the capture group repeated)
If empty strings are also allowed:
^(?:[\p{L}-]*\d){0,3}[\p{L}-]*$
^ Start of string
(?:[\p{L}-]*\d){0,3} Match 0-3 times optional repetitions of [\p{L}-] and a single digit
[\p{L}-]* Match optional repetitions of [\p{L}-] at the end
$ End of string
Regex demo
Else you can assert that the string is not empty using (?!$)
^(?!$)(?:[\p{L}-]*\d){0,3}[\p{L}-]*$
Regex demo

Regex to get the first number after a certain string followed by any data until the number

I have a piece of data, retrieved from the database and containing information I need. Text is entered in a free form so it's written in many different ways. The only thing I know for sure is that I'm looking for the first number after a given string, but after that certain string (before the number) can be any text as well.
I tried this (where mytoken is the string I know for sure its there) but this doesn't work.
/(mytoken|MYTOKEN)(.*)\d{1}/
/(mytoken|MYTOKEN)[a-zA-Z]+\d{1}/
/(mytoken|MYTOKEN)(.*)[0-9]/
/(mytoken|MYTOKEN)[a-zA-Z]+[0-9]/
Even mytoken can be written in capitals, lowercase or a mix of capitals and lowercase character. Can the expression be case insensitive?
You do not need any lazy matching since you want to match any number of non-digit symbols up to the first digit. It is better done with a \D*:
/(mytoken)(\D*)(\d+)/i
See the regex demo
The pattern details:
(mytoken) - Group 1 matching mytoken (case insensitively, as there is a /i modifier)
(\D*) - Group 2 matching zero or more characters other than a digit
(\d+) - Group 3 matching 1 or more digits.
Note that \D also matches newlines, . needs a DOTALL modifier to match across newlines.
You need to use a lazy quantifier. You can do that by putting a question mark after the star quantifier in the regex: .*?. Otherwise, the numbers will be matched by the dot operator until the last number, which will be matched by \d.
Regex: /(mytoken|MYTOKEN)(.*?)\d/
Regex demo
You can use the opposite:
/(mytoken|MYTOKEN)(\D+)(\d)/
This says: mytoken, followed by anything not a number, followed by a number. The (lazy) dot-star-soup is not always your best bet. The desired number will be in $3 in this example.

Regex to return different length sentences

I'm tyring to match different length sentences with digits at the begining.
How can I do this and return matches of different lengths?
eg "2341' Macbeth",
"2354' The Hunger Games",
"1236' Crimson Peak"
preg_match_all("d+\\'\s\w+\s\w+(?(?=w))~", $string, $array);
Clearly I'm new to regex and programming in general, any responses would be greatly appreciated.
Thank you.
You can just use \d+.
working demo
If you want to capture then use capturing groups
(\d+)
Match information
MATCH 1
1. [0-4] `2341`
MATCH 2
1. [17-21] `2354`
MATCH 3
1. [43-47] `1236`
Btw, if you have a multiline sentence, just add the ^ at the beginning to match only line starting with numbers:
^(\d+)
Working demo
I guess this will work
/^\d+.*?$/
DEMO
https://regex101.com/r/fY9yA5/1
REGEX EXPLANATION
^\d+.*?$
Assert position at the beginning of a line «^»
Match a single character that is a “digit” «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of a line «$»

php - regex - how to extract a number with decimal (dot and comma) from a string (e.g. 1,120.01)?

how to extract a number with decimal (dot and comma) from a string (e.g. 1,120.01) ?
I have a regex but doesn't seem to play well with commas
preg_match('/([0-9]+\.[0-9]+)/', $s, $matches);
The correct regex for matching numbers with commas and decimals is as follows (The first two will validate that the number is correctly formatted):
decimal optional (two decimal places)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$
Debuggex Demo
Explained:
number (decimal optional)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$
Options: case insensitive
Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
The character “+” «+»
The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “,” literally «,?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character in the range between “0” and “9” «[0-9]{3}»
Exactly 3 times «{3}»
Match the regular expression below «(?:\.[0-9]{2})?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Will Match:
1,432.01
456.56
654,246.43
432
321,543
Will not Match
454325234.31
324,123.432
,,,312,.32
123,.23
decimal mandatory (two decimal places)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$
Debuggex Demo
Explained:
number (decimal required)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$
Options: case insensitive
Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
The character “+” «+»
The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “,” literally «,?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character in the range between “0” and “9” «[0-9]{3}»
Exactly 3 times «{3}»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Will Match:
1,432.01
456.56
654,246.43
324.75
Will Not Match:
1,43,2.01
456,
654,246
324.7523
Matches Numbers separated by commas or decimals indiscriminately:
^(\d+(.|,))+(\d)+$
Debuggex Demo
Explained:
Matches Numbers Separated by , or .
^(\d+(.|,))+(\d)+$
Options: case insensitive
Match the regular expression below and capture its match into backreference number 1 «(\d+(.|,))+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «+»
Match a single digit 0..9 «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regular expression below and capture its match into backreference number 2 «(.|,)»
Match either the regular expression below (attempting the next alternative only if this one fails) «.»
Match any single character that is not a line break character «.»
Or match regular expression number 2 below (the entire group fails if this one fails to match) «,»
Match the character “,” literally «,»
Match the regular expression below and capture its match into backreference number 3 «(\d)+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «+»
Match a single digit 0..9 «\d»
Will Match:
1,32.543,2
5456.35,3.2,6.1
2,7
1.6
Will Not Match:
1,.2 // two ., side by side
1234,12345.5467. // ends in a .
,125 // begins in a ,
,.234 // begins in a , and two symbols side by side
123,.1245. // ends in a . and two symbols side by side
Note: wrap either in a group and then just pull the group, let me know if you need more specifics.
Description: This type of RegEx works with any language really (PHP, Python, C, C++, C#, JavaScript, jQuery, etc). These Regular Expressions are good for currency mainly.
You can use this regex: -
/((?:[0-9]+,)*[0-9]+(?:\.[0-9]+)?)/
Explanation: -
/(
(?:[0-9]+,)* # Match 1 or more repetition of digit followed by a `comma`.
# Zero or more repetition of the above pattern.
[0-9]+ # Match one or more digits before `.`
(?: # A non-capturing group
\. # A dot
[0-9]+ # Digits after `.`
)? # Make the fractional part optional.
)/
Add the comma to the range that can be in front of the dot:
/([0-9,]+\.[0-9]+)/
# ^ Comma
And this regex:
/((?:\d,?)+\d\.[0-9]*)/
Will only match
1,067120.01
121,34,120.01
But not
,,,.01
,,1,.01
12,,,.01
# /(
# (?:\d,?) Matches a Digit followed by a optional comma
# + And at least one or more of the previous
# \d Followed by a digit (To prevent it from matching `1234,.123`)
# \.? Followed by a (optional) dot
# in case a fraction is mandatory, remove the `?` in the previous section.
# [0-9]* Followed by any number of digits --> fraction? replace the `*` with a `+`
# )/
The locale-aware float (%f) might be used with sscanf.
$result = sscanf($s, '%f')
That doesn't split the parts into an array though. It simply parses a float.
See also: http://php.net/manual/en/function.sprintf.php
A regex approach:
/([0-9]{1,3}(?:,[0-9]{3})*\.[0-9]+)/
This should work
preg_match('/\d{1,3}(,\d{3})*(\.\d+)?/', $s, $matches);
Here is a great working regex. This accepts numbers with commas and decimals.
/^-?(?:\d+|\d{1,3}(?:,\d{3})+)?(?:\.\d+)?$/

Can Someone explain this reg ex to me?

I recently asked a question on formatting a telephone number and I got lots of responses. Most of the responses were great but one i really wanted to figure out what its doing because it worked great. If phone is the following how do the other lines work...what are they doing so i can learn
$phone = "(407)888-9999";
$phone = preg_replace("~[^0-9]~", "", $phone);
preg_match('~([0-9]{3})([0-9]{3})([0-9]{4})~', $phone, $matches);
Let's break the code into two lines.
preg_replace("~[^0-9]~", "", $phone);
First, we're going to replace matches to a regex with an empty string (in other words, delete matches from the string). The regex is [^0-9] (the ~ on each end is a delimiter). [...] in a regex defines a character class, which tells the regex engine to match one character within the class. Dashes are generally special characters inside a character class, and are used to specify a range (ie. 0-9 means all characters between 0 and 9, inclusive).
You can think of a character class like a shorthand for a big OR condition: ie. [0-9] is a shorthand for 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9. Note that classes don't have to contain ranges, either -- [aeiou] is a character class that matches a or e or i or o or u (or in other words, any vowel).
When the first character in the class is ^, the class is negated, which means that the regex engine should match any character that isn't in the class. So when you put all that together, the first line removes anything that isn't a digit (a character between 0 and 9) from $phone.
preg_match('~([0-9]{3})([0-9]{3})([0-9]{4})~', $phone, $matches);
The second line tries to match $phone against a second expression, and puts the results into an array called $matches, if a match is made. You will note there are three sets of brackets; these define capturing groups -- ie. if there is a match of a pattern as a whole, you will end up with three submatches, which in this case will contain the area code, prefix and suffix of the phone number. In general, anything contained in brackets in a regular expression is capturing (while there are exceptions, they are beyond the scope of this explanation). Groups can be useful for other things too, without wanting the overhead of capturing, so a group can be made non-capturing by prefacing it with ?: (ie. (?:...)).
Each group does a similar thing: [0-9]{3} or [0-9]{4}. As we saw above, [0-9] defines a character class containing the digits between 0 and 9 (as the classes here don't start with ^, these are not negated groups). The {3} or {4} is a repetition operator, which says "match exactly 3 (or 4) of the previous token (or group)". So [0-9]{3} will match exactly three digits in a row, and [0-9]{4} will match exactly four digits in a row. Note that the digits don't have to be all the same (ie. 111), because the character class is evaluate for each repetition (so 123 will match because 1 matches [0-9], then 2 matches [0-9], and then 3 matches [0-9]).
In the preg_replace it looks for anything that is not, ^ inside of the [], 0-9 (basically not a number) and replaces / removes it from that string given the replacement is "".
For the first section, it pulls out the first 3 numbers ([0-9]{3}) the {3} is the number of characters to match the items inside the [] are what to match and since this is inside of paranthesis () it stores it as a match in the array $matches. The second part pulls out the next 3 numbers and the last part pulls out the last 4 numbers from $phone and stores the matches that were matched in $matches.
The ~ are delimeters for the regular expressions.
You know it's a regular expression from the regex tag.
So, you are pattern matching.
The pattern you are matching is: [^0-9] followed by the phone number.
[^0-9] is NOT '^' any one digit
So, the match after that is any 3 digits, followed by any 3 digits, followed by any 4 digits.
I don't think it will match because of the () around the area code and the dash are missing.
I'd do this:
~\(([0-9]{3})\)([0-9]{3})-([0-9]{4})~'
"[^0-9]" means everything but numbers from 0 to 9. So basically, first line replace everything but numbers with "" (nothing)
[0-9]{3} means number from 0 to 9, 3 times in a row.
So it check if you have 3 numbers then 3 numbers than 4 numbers and try to match it with $matches.
Check this tuts
Using Regular Expressions with PHP
http://www.webcheatsheet.com/php/regular_expressions.php
$phone = "(407)888-9999";
$phone = preg_replace("~[^0-9]~", "", $phone);
In php you have to delimit regex pattern in some non-alphanumeric character "~" is used here.
[^0-9] is regex pattern used to remove anything out of $phone that is not in 0-9 range remember [^...] will negate the pattern it precedes.
preg_match('~([0-9]{3})([0-9]{3})([0-9]{4})~', $phone, $matches);
Again in this line of code you have "~" as delimiter and
([0-9]{3}) this part of pattern will return 3 numbers from string (note: {} is used to specify range/number of characters to match) in a different output array dimension (check your $matches variable for result) using ( ) in a pattern results in groups/submatches

Categories