regular expression match, Need to match both number - php

How can I match both numbers, 123,340.00 and 1.9e10?
I have tried the regex below:
^-?\d+(,)*(\d+\.(\d+e?))
But it matches only 123,340.00, and I am looking to match both numbers. Any ideas?
Note: I tested it with the online regex tool at https://regex101.com

You should allow some digits after e at least. Also, ,* matches zero or more commas, and I think you should only allow comma + digits groups.
I suggest using
'~^-?\d+(?:,\d+)*(?:\.\d+)?(?:e[+-]?\d+)?$~i'
Pattern explanation:
^ - start of string
-? - an optional - (you may use [-+]? to match plus or minus)
\d+ - 1 or more digits
(?:,\d+)* - zero or more sequences of a comma + 1 or more digits
(?:\.\d+)? - an optional decimal part, a dot and 1+ digits
(?:e[+-]?\d+)? - an optional exponent part, e, optional minus or plus, and 1+ digits
$ - end of string.
Note that the i modifier (placed after the closing ~ delimiter) is used to match both e and E.
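A quick way to check the suggested pattern from PHP is sketched below. The pattern is copied from the answer; the sample inputs other than the two numbers from the question are made up for illustration.

```php
<?php
// Pattern from the answer: optional sign, digits, comma groups,
// optional decimal part, optional exponent, case-insensitive.
$pattern = '~^-?\d+(?:,\d+)*(?:\.\d+)?(?:e[+-]?\d+)?$~i';

// Illustrative inputs (the last three are assumptions, not from the question).
$samples = ['123,340.00', '1.9e10', '-4E-3', '1,,2', '1.9e'];

$results = [];
foreach ($samples as $s) {
    // preg_match() returns 1 on a match and 0 on no match.
    $results[$s] = preg_match($pattern, $s) === 1;
}

foreach ($results as $s => $ok) {
    echo $s, ' => ', $ok ? 'match' : 'no match', PHP_EOL;
}
```

The two malformed inputs fail because a comma must be followed by digits and an e must be followed by at least one digit.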

A slightly less complicated version:
^-?\d+(,\d+)*(\.\d+(e\d+)?)?$
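To see where this shorter pattern behaves differently from the pattern with the optional exponent group suggested earlier, a small comparison can help. This is only a sketch; the extra sample inputs are assumptions chosen to show the differences.

```php
<?php
// Both patterns are copied from the answers above.
$full    = '~^-?\d+(?:,\d+)*(?:\.\d+)?(?:e[+-]?\d+)?$~i';
$shorter = '~^-?\d+(,\d+)*(\.\d+(e\d+)?)?$~';

// Illustrative inputs: the last three are not from the question.
$samples = ['123,340.00', '1.9e10', '1e10', '1.9e-3', '1.9E10'];

$comparison = [];
foreach ($samples as $s) {
    $comparison[$s] = [
        'full'    => preg_match($full, $s) === 1,
        'shorter' => preg_match($shorter, $s) === 1,
    ];
}

foreach ($comparison as $s => $r) {
    printf(
        "%-10s full=%s shorter=%s\n",
        $s,
        var_export($r['full'], true),
        var_export($r['shorter'], true)
    );
}
```

The shorter pattern only allows an exponent after a decimal part, has no sign in the exponent, and is case-sensitive, so it rejects the last three samples.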

If you are looking for something very strict, you can use:
^-?[1-9](?:(?:\.[0-9]*[1-9])?[eE][+-]?[1-9][0-9]*|[0-9]{0,2}(?:,[0-9]{3})*(?:\.[0-9]+)?)$
It's a bit long but it doesn't match when the number has:
012 # leading zeroes
1.20e5 # trailing zeroes in scientific notation
12.3e4 # more than one digit before the dot
1.2e0 # a zero as exponent
1,2345,678,9 # a crazy digit separator
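The list above can be turned into a small check. This sketch assumes the two numbers from the question are the inputs that should pass; the rejected inputs are the ones listed above.

```php
<?php
// Strict pattern copied from the answer, wrapped in ~ delimiters.
$strict = '~^-?[1-9](?:(?:\.[0-9]*[1-9])?[eE][+-]?[1-9][0-9]*|[0-9]{0,2}(?:,[0-9]{3})*(?:\.[0-9]+)?)$~';

$accepted = ['123,340.00', '1.9e10'];
$rejected = ['012', '1.20e5', '12.3e4', '1.2e0', '1,2345,678,9'];

// preg_grep() returns the array entries that match the pattern.
$acceptedHits = preg_grep($strict, $accepted);
$rejectedHits = preg_grep($strict, $rejected);

printf("accepted inputs matched: %d of %d\n", count($acceptedHits), count($accepted));
printf("rejected inputs matched: %d of %d\n", count($rejectedHits), count($rejected));
```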

Related

PHP regex preg_match to identify url pattern

Is there any way to make the rule allow only examples 1 and 3, and not all four of them?
/^(en\/|)([\d]{1,3})([-])(.+?)([\/])$/
examples:
12-blog/
12-blog/blog2/
en/12-blog/
en/12-blog/blog2/
https://www.phpliveregex.com/p/tFe
You might use an optional part for en/, followed by 1-3 digits and a -, and then match any character except / one or more times using a negated character class.
Note that you can omit the square brackets for [\d], [-] and [\/]. If you choose a different delimiter than / you don't have to escape the forward slash.
^(?:en/)?\d{1,3}-[^/]+/$
In parts
^ Start of string
(?:en/)? Optionally match en/
\d{1,3} Match 1-3 digits
- Match a literal -
[^/]+/ Match 1+ times any char except /, then a literal /
$ End of string
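As a rough check of the pattern above against the four example paths, the following sketch uses ~ as the delimiter so that the forward slashes need no escaping.

```php
<?php
// Pattern from the answer; ~ is used as delimiter so / stays unescaped.
$pattern = '~^(?:en/)?\d{1,3}-[^/]+/$~';

// The four example paths from the question.
$paths = ['12-blog/', '12-blog/blog2/', 'en/12-blog/', 'en/12-blog/blog2/'];

// Keep only the paths that match, reindexed from 0.
$allowed = array_values(preg_grep($pattern, $paths));

print_r($allowed);
```

Only examples 1 and 3 remain, because [^/]+ cannot cross the second slash.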

Update a regex that matches twitter like mentions to allow for dots

I have already found helpful answers with regexes that match Twitter-like username mentions:
(?<=^|(?<=[^a-zA-Z0-9-_\.]))#([A-Za-z]+[A-Za-z0-9_]+)
(?<=^|(?<=[^a-zA-Z0-9-_\.]))#([A-Za-z]+[A-Za-z0-9-_]+)
However, I need to update this regex to also include usernames that contain dots.
One or more dots are allowed in a username.
The username must not start or end with a dot.
No two consecutive dots are allowed.
Example of a matched string:
#valid.user.name
^^^^^^^^^^^^^^^^
Examples of non-matched strings:
#.user.name // starts with a dot
#user.name. // ends with a dot
#user..name // has two consecutive dots
You can use this refactored regex:
(?<=[^\w.-]|^)#([A-Za-z]+(?:\.\w+)*)$
RegEx Details:
(?<=[^\w.-]|^): Lookbehind to assert that we have start of line or any non-word, non-dot, non-hyphen character before current position
#: Match a literal #
(: Start capture group
[A-Za-z]+: Match 1+ ASCII letters
(?:\.\w+)*: Match 0 or more instances of a dot followed by 1+ word characters
): End capture group
$: End of line
The (?<=^|(?<=[^a-zA-Z0-9-_\.])) is a positive lookbehind that requires the match to be at the start of the string or right after a character that is not an alphanumeric, -, _ or a dot. You may write it more compactly as a negative lookbehind, (?<![\w.-]).
Next, ([A-Za-z]+[A-Za-z0-9_]+) captures 1+ ASCII letters and then 1+ ASCII letters, digits or underscores. You seem to want the first char to be a letter, after which any number of sequences of . and 1+ word chars are allowed, so you may use [A-Za-z]\w*(?:\.\w+)*.
As you do not want to match it if there is a . right after the expected match, you need to set a lookahead that will require a space or end of string, (?!\S).
So, combining it, you can use
'~(?<![\w.-])#([A-Za-z]\w*(?:\.\w+)*)(?!\S)~'
Details
(?<![\w.-]) - no letters, digits, _, . and - immediately to the left of the current location are allowed
# - a # char
([A-Za-z]\w*(?:\.\w+)*) - Group 1:
[A-Za-z] - an ASCII letter
\w* - 0+ letters, digits, _
(?:\.\w+)* - 0+ sequences of
\. - dot
\w+ - 1+ letters, digits, _
(?!\S) - whitespace or end of string are required immediately to the right of the current location.
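To try the combined pattern on running text, preg_match_all can collect the captured usernames. This is a sketch; the sample sentence is an assumption built from the valid and invalid examples in the question.

```php
<?php
// Pattern copied from the answer above.
$pattern = '~(?<![\w.-])#([A-Za-z]\w*(?:\.\w+)*)(?!\S)~';

// Illustrative text mixing valid and invalid mentions.
$text = 'hi #valid.user.name and #user..name or #.user.name '
      . 'then #user.name. also x#glued #ok_1';

// $m[1] receives the contents of capture group 1 for every match.
preg_match_all($pattern, $text, $m);
$usernames = $m[1];

print_r($usernames);
```

The mention glued to x is skipped by the lookbehind, and the ones with a leading, trailing or doubled dot are skipped by the letter requirement and the (?!\S) lookahead.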
EDIT: Simpler version (same result)
^#[a-zA-Z](\.?[\w-]+)*$
Original
Another one:
^#[a-zA-Z][a-zA-Z_-]?(\.?[\w\d-]+){0,}$
^# starts with #
[a-zA-Z] first char
[a-zA-Z_-]? match a-zA-Z_- 0 or 1 times
( start group
\.? match . (optional)
[\w\d-]+ match a-zA-Z0-9-_ 1 or more times
) end group
{0,} repeat group 0 to infinite times
$ end
Tests
valid:
#validusername
#valid.user.name
#valid-user-name
#valid_user-name
#valid-user123_name
#a.valid-user123_name
not valid:
#-invalid.user
#_invalid.user
#1notvalid-user_123name33
#.user.name
#user.name.
#user..name
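The test lists above can be run against the longer pattern with a few lines of PHP. This is a minimal sketch; the variable names are arbitrary.

```php
<?php
// The longer pattern from the answer above, wrapped in / delimiters.
$pattern = '/^#[a-zA-Z][a-zA-Z_-]?(\.?[\w\d-]+){0,}$/';

$valid = [
    '#validusername', '#valid.user.name', '#valid-user-name',
    '#valid_user-name', '#valid-user123_name', '#a.valid-user123_name',
];
$invalid = [
    '#-invalid.user', '#_invalid.user', '#1notvalid-user_123name33',
    '#.user.name', '#user.name.', '#user..name',
];

// preg_grep() keeps only the entries that match the pattern.
$validHits   = preg_grep($pattern, $valid);
$invalidHits = preg_grep($pattern, $invalid);

printf("valid matched: %d of %d\n", count($validHits), count($valid));
printf("invalid matched: %d of %d\n", count($invalidHits), count($invalid));
```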

PHP: Complex Multi-Line RegEx, Tilde delimiters

I need to generate a regex that will match the following format:
-1 LKSJDF LSAALSKJ~
Syjsdf
lkjdf
This block may contain multiple characters including digits, colons, etc. Any character other than a tilde.
~
I'm currently using this:
/(-\d|\d)\s([^$\~][a-zA-Z\s]*)\~\n/s
Which matches the first line fine. I need to capture the -1 through 60 that begins the pattern, the words after the space and up until the first tilde. I then need to capture all of the text BETWEEN the tildes.
I'm not the strongest with regex in the first place, but I'm having trouble getting this to work without also capturing the tildes.
You can use
'/^(-?\d+)\s+([^~]*)~([^~]+)~/m'
The regex matches:
^ - start of a line (with the /m modifier, ^ matches at the start of every line, not only at the start of the string)
(-?\d+) - (Group 1) an optional - followed by one or more digits
\s+ - one or more whitespace symbols (to only match tab and regular spaces, use \h+ instead)
([^~]*) - (Group 2) zero or more characters other than a ~ (you can force to match these characters on the first line only by adding a \n\r to the negated character class - [^~\n\r])
~ - a literal leading tilde
([^~]+) - (Group 3) one or more characters other than a tilde
~ - a literal trailing tilde
If you need to only match these strings if the number is an integer between -1 and 60, you can use
'/^(-1|[1-5]?[0-9]|60)\s+([^~]*)~([^~]+)~/m'
Here, the first group matches integer numbers from -1 to 60 with (-1|[1-5]?[0-9]|60) alternation group. -1 and 60 match literal numbers, and [1-5]?[0-9] matches one or zero (optional) digit from 1 to 5 (replace with [0-5]? if a leading zero is allowed) and then any one digit may follow.
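A short sketch of how the two patterns could be used with preg_match_all follows. The input text is an assumption modelled on the format in the question, with a second block whose number (61) is deliberately out of range.

```php
<?php
// Both patterns are copied from the answer above.
$any   = '/^(-?\d+)\s+([^~]*)~([^~]+)~/m';
$range = '/^(-1|[1-5]?[0-9]|60)\s+([^~]*)~([^~]+)~/m';

// Illustrative input: two blocks in the question's format.
$input = "-1 LKSJDF LSAALSKJ~\nSyjsdf\nlkjdf\nAny text: digits 123, colons, etc.\n~\n"
       . "61 OUT OF RANGE~\nbody two\n~\n";

// PREG_SET_ORDER yields one array per match: [full, group 1, group 2, group 3].
preg_match_all($any, $input, $anyMatches, PREG_SET_ORDER);
preg_match_all($range, $input, $rangeMatches, PREG_SET_ORDER);

foreach ($anyMatches as $m) {
    // trim() drops the newlines that surround the text between the tildes.
    printf("number=%s header=%s body=%s\n", $m[1], $m[2], str_replace("\n", ' / ', trim($m[3])));
}
printf("blocks with a number from -1 to 60: %d\n", count($rangeMatches));
```

The tildes themselves are never part of a capture group, since each group is a negated character class that excludes ~.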

php regex - find uppercase string with number and spaces in text

I want to write a PHP regular expression that finds an uppercase string in a text. The string may also contain one number and spaces.
For example, from the text "some text to contain EXAM PL E 7STRING uppercase word" I want to get the string EXAM PL E 7STRING.
The found string should start and end with an uppercase letter, but between the uppercase letters it may also contain (but not necessarily) one number and spaces. So the regex should match any of these patterns:
1) EXAMPLESTRING - just uppercase string
2) EXAMP4LESTRING - with number
3) EXAMPLES TRING - with space
4) EXAM PL E STRING - with more than one spaces
5) EXAMP LE4STRING - with number and space
6) EXAMP LE 4ST RI NG - with number and spaces
The string should also contain at least 4 letters in total.
I wrote this regex, '/[A-Z]{1,}([A-Z\s]{2,}|\d?)[A-Z]{1,}/', which finds the first 4 patterns, but I cannot figure out how to also match the last 2 patterns.
Thanks
There is a neat trick called a lookahead. It just checks what is following after the current position. That can be used to check for multiple conditions:
'/(?<![A-Z])(?=(?:[A-Z][\s\d]*){3}[A-Z])(?!(?:[A-Z\s]*\d){2})[A-Z][A-Z\s\d]*[A-Z]/'
The first lookaround is actually a lookbehind and checks that there is no previous uppercase letter. This is just a little speedup for strings that would fail the match anyway. The second lookaround (a lookahead) checks that there are at least four letters. The third one checks that there are no two digits. The rest then simply matches a string of the allowed characters, starting and ending with an uppercase letter.
Note that in the case of two digits this will not match at all (instead of matching everything up to the second digit). If you do want to match in such a case, you could incorporate the "1 digit" rule into the actual match instead:
'/(?<![A-Z])(?=(?:[A-Z][\s\d]*){3}[A-Z])[A-Z][A-Z\s]*\d?[A-Z\s]*[A-Z]/'
EDIT:
As Ωmega pointed out, this will cause problems if there are fewer than four letters before the second digit, but more after it. This is actually quite tough, because the assertion needs to be that there are at least four letters before the second digit. Since we do not know where the first digit occurs among those four letters, we have to check all possible positions. For this I would do away with the lookaheads altogether and simply provide the three different alternatives. (I will keep the lookbehind as an optimization for non-matching parts.)
'/(?<![A-Z])[A-Z]\s*(?:\d\s*[A-Z]\s*[A-Z]|[A-Z]\s*\d\s*[A-Z]|[A-Z]\s*[A-Z][A-Z\s]*\d?)[A-Z\s]*[A-Z]/'
Or here with added comments:
'/
(?<!                 # negative lookbehind
  [A-Z]              # current position is not preceded by a letter
)                    # end of lookbehind
[A-Z]                # match has to start with uppercase letter
\s*                  # optional spaces after first letter
(?:                  # subpattern for possible digit positions
  \d\s*[A-Z]\s*[A-Z]
                     # digit comes after first letter, we need two more letters before last one
|                    # OR
  [A-Z]\s*\d\s*[A-Z]
                     # digit comes after second letter, we need one more letter before last one
|                    # OR
  [A-Z]\s*[A-Z][A-Z\s]*\d?
                     # digit comes after third letter, or later, or not at all
)                    # end of subpattern for possible digit positions
[A-Z\s]*             # arbitrary amount of further letters and whitespace
[A-Z]                # match has to end with uppercase letter
/x'
That gives the same result on Ωmega's lengthy test input.
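As a sanity check, the single-line version of the three-alternative pattern can be run against the sentence from the question. This is only a sketch; the second and third inputs are assumptions added to show one more accepted string and a too-short candidate.

```php
<?php
// Single-line form of the three-alternative pattern from the answer above.
$pattern = '/(?<![A-Z])[A-Z]\s*(?:\d\s*[A-Z]\s*[A-Z]|[A-Z]\s*\d\s*[A-Z]|[A-Z]\s*[A-Z][A-Z\s]*\d?)[A-Z\s]*[A-Z]/';

// Sentence from the question.
$text = 'some text to contain EXAM PL E 7STRING uppercase word';
preg_match($pattern, $text, $m);
$found = $m[0] ?? null;

// Pattern 6 from the question's list: number and several spaces.
preg_match($pattern, 'EXAMP LE 4ST RI NG', $m6);
$foundSix = $m6[0] ?? null;

// Illustrative negative case: only three uppercase letters around the digit.
$tooShort = preg_match($pattern, 'see AB1C here');

var_dump($found, $foundSix, $tooShort);
```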
I suggest using this regex pattern:
[A-Z][ ]*(\d)?(?(1)(?:[ ]*[A-Z]){3,}|[A-Z][ ]*(\d)?(?(2)(?:[ ]*[A-Z]){2,}|[A-Z][ ]*(\d)?(?(3)(?:[ ]*[A-Z]){2,}|[A-Z][ ]*(?:\d|(?:[ ]*[A-Z])+[ ]*\d?))))(?:[ ]*[A-Z])*
(see this demo).
[A-Z][ ]*(?:\d(?:[ ]*[A-Z]){2}|[A-Z][ ]*\d[ ]*[A-Z]|(?:[A-Z][ ]*){2,}\d?)[A-Z ]*[A-Z]
(see this demo)

php regular expression for "|" and digits

I want to know what regular expression I should use for my string. My string can contain only "|" and digits, for example: "111|333|111|333". The string must also begin with a number. I am using this code, but it is ugly:
if (!preg_match('/\|d/', $ids)) {
$this->_redirect(ROOT_PATH . '/commission/payment/active');
}
Thank you in advance. Sorry for my english.
Looking at your example, I assume you are looking for a regex that matches strings which begin and end with numbers, with the numbers separated by |. If so, you can use:
^\d+(?:\|\d+)*$
Explanation:
^ - Start anchor.
\d+ - One or more digits, that is a number.
(?: ) - Non-capturing group, used for grouping.
\| - | is a regex meta char used for alternation,
to match a literal pipe, escape it.
* - Quantifier for zero or more.
$ - End anchor.
The regex is:
^\d[|\d]*$
^ - Start matching only from the beginning of the string
\d - Match a digit
[] - Define a class of possible matches. Match any of the following cases:
| - (inside a character class) Match the '|' character
\d - Match a digit
$ - End matching only at the end of the string
Note: Escaping the | is not necessary in this situation.
A string that contains only | or digits and begins with a digit is written as ^\d(\||\d)*$. That means: either \| (notice the escape!) or a digit, written as \d, multiple times.
The ^ and $ mean: from start to end, i.e. there’s no other character before or after that.
I think /^\d[\d\|]*$/ would work. However, if you always have three digits separated by bars, you need /^\d{3}(?:\|\d{3})*$/.
EDIT:
Finally, if you always have sequences of one or more digits separated by bars, this will do: /^\d+(?:\|\d+)*$/.
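The answers above differ in how strict they are. The sketch below compares the character-class pattern with the grouped pattern on a few made-up inputs; only the first sample comes from the question.

```php
<?php
// Two of the patterns suggested above.
$loose  = '/^\d[|\d]*$/';        // starts with a digit, then any mix of | and digits
$strict = '/^\d+(?:\|\d+)*$/';   // numbers separated by single bars

// Illustrative inputs (only the first one is from the question).
$samples = ['111|333|111|333', '42', '|111', '111||333', '111|', '11a|22'];

// preg_grep() returns the matching entries; array_values() reindexes them.
$looseOk  = array_values(preg_grep($loose, $samples));
$strictOk = array_values(preg_grep($strict, $samples));

echo 'loose:  ', implode(', ', $looseOk), PHP_EOL;
echo 'strict: ', implode(', ', $strictOk), PHP_EOL;
```

The loose pattern also accepts doubled and trailing bars, so the grouped pattern is the safer choice when every bar must sit between two numbers.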
