I am using this RegEx:
\s*((\w*\.*\s*)+[.*(\w)]+(?=\d{4}))(\d{4}\s*\,)*
The goal is to match a run of words in which the last one ends with a dot, followed by 4 digits and a comma, for example:
This is a test.2014,
It works fine. Now I would like to allow a whitespace character (\s) between "test." and "2014,". The whitespace is optional: it should be matched if present, but the match must not depend on it.
Could you please help me add that to my regex? How do you make part of a pattern optional?
Thank you.
Try this pattern
\s*((\w*\.*\s*)+[.*(\w)]+\s*(?=\d{4}))(\d{4}\s*\,)*
\s* matches zero or more whitespace characters. I added it before the date pattern.
Or try:
\s*((\w*\.*\s*)+[.*(\w)]+(?=\s*\d{4}))(\s*\d{4}\s*\,)*
There are a couple of ways to do this: use a question mark, an asterisk, or an explicit range of allowed occurrences, as shown in the table below.
Regular Expression Quantifiers
*        0 or more
+        1 or more
?        0 or 1
{3}      Exactly 3
{3,}     3 or more
{3,5}    3, 4 or 5
\s*((\w*\.*\s*)+[.*(\w)]+(\s+)(?=\d{4}))(\d{4}\s*\,)*
\s*((\w*\.*\s*)+[.*(\w)]+(\s*)(?=\d{4}))(\d{4}\s*\,)*
\s*((\w*\.*\s*)+[.*(\w)]+\s{0,1}(?=\d{4}))(\d{4}\s*\,)*
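To see how these quantifiers differ for the whitespace in question, here is a small PHP sketch. It uses a simplified, made-up pattern (just test\. plus the quantified \s plus \d{4},) rather than the full regex, and the variable names are only for illustration.

```php
<?php
// Simplified, hypothetical pattern: only the part around the optional whitespace.
$quantifiers = ['\s?', '\s*', '\s{0,1}', '\s+'];
$inputs = ['test.2014,', 'test. 2014,', 'test.  2014,'];

$results = [];
foreach ($quantifiers as $q) {
    foreach ($inputs as $input) {
        // preg_match() returns 1 on a match and 0 otherwise.
        $results[$q][$input] = preg_match('/test\.' . $q . '\d{4},/', $input);
    }
}

// One row per quantifier, one column per input (no space, one space, two spaces).
foreach ($results as $q => $row) {
    echo str_pad($q, 9), implode(' ', $row), "\n";
}
```

Note that \s+ (as in the first of the three patterns above) makes the whitespace mandatory, so only ?, * or {0,1} keep it optional.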
And the goal is to match words with the last one that ends with a dot,
followed by 4 digits ending with a comma
(?:\s*\w+)*\s*\w+\.\d{4},
Now, I would like to add the possibility to have a whitespace (\s)
between the "test." and "2014,", but the whitespace is not a mandatory
parameter,
(?:\s*\w+)*\s*\w+\.\s*\d{4},
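A quick way to check that the \s* really is optional is to run the pattern against both variants of the sample line. This is only a sketch in PHP; the variable names are made up.

```php
<?php
// Pattern from the answer above, with \s* between the dot and the year.
$pattern = '/(?:\s*\w+)*\s*\w+\.\s*\d{4},/';

$withoutSpace = preg_match($pattern, 'This is a test.2014,', $m1);
$withSpace    = preg_match($pattern, 'This is a test. 2014,', $m2);

// Both calls return 1; $m1[0] and $m2[0] hold the full matched text.
var_dump($withoutSpace, $m1[0]);
var_dump($withSpace, $m2[0]);
```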
Try this
/\s*((\w*\.*\s*)+[.*(\w)]+(\s*)(?=\d{4}))(\d{4}\s*\,)*/
I have the text:
<a href="blahblahblah-dynamic" class="blahblahblah-dynamic"
title="blahblahblah-dynamic">2.550,00 €</a>1000 € 900 € 5000 € ......
and the expression:
#(\d+[\.\,]\d*?)\s?[€]#su
that matches:
550,00
example in: regexr
How can I match the whole:
2.550,00 ?
P.S. I don't want to match the others (1000, 900) or any number without a , and/or a .
In other words, I want to match d,d or d.d,d
so the question suggested as a possible duplicate does not cover my case.
Can someone help me with this?
You might use:
([0-9]{1,3}(?:\.[0-9]{3})*\,[0-9]+)\s?€
Inside a capturing group, this matches 1-3 digits, then zero or more repetitions of a non-capturing group made of a literal dot and 3 digits, and finally a comma followed by one or more digits.
After the capturing group, \s?€ is matched but is not part of the group.
If you want to match exactly 2 digits after the comma, you could change ,[0-9]+ to ,[0-9]{2}
As an alternative, you could match your value without a capturing group and use a positive lookahead (?=\s?€) to assert that what is on the right side is an optional whitespace character followed by €
[0-9]{1,3}(?:\.[0-9]{3})*\,[0-9]+(?=\s?€)
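Here is a rough PHP sketch of both variants against text shaped like the one in the question. The dot is escaped so it only matches a literal period, the u modifier is my addition for the UTF-8 € sign, and the variable names are made up.

```php
<?php
// Sample text modelled on the question; the attribute values are placeholders.
$html = '<a href="blahblahblah-dynamic" class="blahblahblah-dynamic" '
      . 'title="blahblahblah-dynamic">2.550,00 €</a>1000 € 900 € 5000 €';

// Capturing-group variant: the price lands in group 1.
preg_match_all('/([0-9]{1,3}(?:\.[0-9]{3})*,[0-9]+)\s?€/u', $html, $withGroup);

// Lookahead variant: the whole match is the price, the € is only asserted.
preg_match_all('/[0-9]{1,3}(?:\.[0-9]{3})*,[0-9]+(?=\s?€)/u', $html, $withLookahead);

print_r($withGroup[1]);     // only 2.550,00
print_r($withLookahead[0]); // only 2.550,00
```

The plain numbers 1000, 900 and 5000 are skipped because the pattern requires a comma followed by digits.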
I'm guessing you have that value in a variable or something, so why not try getting it with
$value = mb_substr($dataReceivedFromA, 0, -1, 'UTF-8'); // returns "2.550,00 " for UTF-8 input; plain substr() would cut off only one byte of the multi-byte € sign
I really think there is no point in using a regex here if you only want to get rid of the sign
I have to create a regex to match ugly abbreviations and numbers. These can be in one of the following "formats":
1) [any alphabet char length of 1 char][0-9]
2) [double][whitespace][2-3 length of any alphabet char]
I tried to match double:
preg_match("/^-?(?:\d+|\d*\.\d+)$/", $source, $matches);
But I couldn't get it to select the following example: 1.1 AA My test title. What is wrong with my regex, and how can I add the other formats to it too?
Your regex says: "start of string, followed by an optional -, followed by either at least one digit, or zero or more digits followed by a dot and at least one digit, followed by the end of string."
So your regex could match, for example, 4.5, -.1 etc. This is exactly what you tell it to do.
Your test input string does not match because other characters follow the number 1.1, and even if it somehow magically matched, your "double" matching regex is wrong.
For a double without scientific notation you usually use this regex :
[-+]?\b[0-9]+(\.[0-9]+)?\b
Now that we have this out of our way we need a whitespace \s and
[2-3 length of alphabet]
Now I have no idea what [2-3 length of alphabet] means but by combining the above you get a regex like this :
[-+]?\b[0-9]+(\.[0-9]+)?\b\s[2-3 length of alphabet]
You can also place anchors ^$ if you want the string to match entirely :
^[-+]?\b[0-9]+(\.[0-9]+)?\b\s[2-3 length of alphabet]$
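To make this concrete, here is a sketch in PHP. It assumes that "[2-3 length of alphabet]" means two or three ASCII letters, i.e. [A-Za-z]{2,3}; that reading is a guess, not something stated in the question. I also swapped the trailing $ for \b so that text may follow the abbreviation, as in the sample title.

```php
<?php
// Assumption: "[2-3 length of alphabet]" means two or three ASCII letters.
$pattern = '/^[-+]?\b[0-9]+(\.[0-9]+)?\b\s[A-Za-z]{2,3}\b/';

$samples = ['1.1 AA My test title', '-2 abc', '1.1 A title', '1.1 ABCD'];
$results = [];
foreach ($samples as $sample) {
    // Store the matched prefix, or null when the sample does not match.
    $results[$sample] = preg_match($pattern, $sample, $m) ? $m[0] : null;
}
var_dump($results);
```

The first two samples match ("1.1 AA" and "-2 abc"); the last two do not, because the letter run is too short or too long.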
Feel free to ask if you are stuck! :)
I see multiple issues with your regex:
You try to match the whole string (as a number) because of the anchors: ^ at the beginning and $ at the end. If you don't want that, remove them.
The number group is non-capturing. It will be checked for matches, but those won't be added to $matches. That's because of the ?: you put in (?:...). Remove ?: to make the group capturing.
You place the shorter digit pattern (\d+) before the longer one (\d*\.\d+). The regex engine tries alternatives from left to right, so if you swap the order it will try the longer pattern first and, on success, prefer it over the shorter one.
Maybe this already solves your issue:
preg_match("/-?(\d*\.\d+|\d+)/", $source, $matches);
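The effect of the alternation order is easy to see on the sample string from the question. A small sketch (variable names are made up):

```php
<?php
$source = '1.1 AA My test title';

// Shorter alternative first: \d+ succeeds on "1" and the engine stops there.
preg_match('/-?(\d+|\d*\.\d+)/', $source, $shortFirst);

// Longer alternative first: \d*\.\d+ consumes the whole "1.1".
preg_match('/-?(\d*\.\d+|\d+)/', $source, $longFirst);

echo $shortFirst[1], "\n"; // 1
echo $longFirst[1], "\n";  // 1.1
```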
I need a regular expression for string validation. The string can be empty, can have exactly 5 digits, or can have exactly 9 digits. Anything else is invalid. I am using this regex:
/\d{5}|\d{9}/
But it doesn't work.
Just as Marc B said in the comments, I would use this regular expression:
/^(\d{5}(\d{4})?)?$/
This matches either exactly five digits that might be followed by another four digits (thus nine digits in total) or no characters at all (note the ? quantifier after the outer group, which makes the whole group optional).
The advantage of this pattern over the other mentioned patterns with alternation is that it avoids the backtracking an alternation needs on a nine-digit input, where \d{5} matches first, $ fails, and the engine has to go back and try \d{9}.
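For what it's worth, here is how that pattern behaves on a few inputs. The helper name isValidCode is made up for this sketch.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function isValidCode(string $value): bool
{
    return preg_match('/^(\d{5}(\d{4})?)?$/', $value) === 1;
}

foreach (['', '12345', '123456789', '1234', '123456', '12345a'] as $value) {
    // The first three are valid (empty, 5 digits, 9 digits); the rest are not.
    printf("%-12s %s\n", var_export($value, true), isValidCode($value) ? 'valid' : 'invalid');
}
```

One thing to be aware of: in PCRE, $ also matches just before a final newline, so "12345\n" would pass too. Adding the D modifier makes $ match only at the very end.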
Use anchors, and "?" to allow the empty string
/^(\d{5}|\d{9})?$/
~^(?:\d{5}|\d{9}|)$~
You forgot the anchors ^ and $. Without them the pattern matches those digits anywhere in the string, not only when they make up the whole string. Furthermore, you didn't cover the empty string case.
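A short sketch of what goes wrong with the unanchored pattern from the question (the variable name is made up):

```php
<?php
$unanchored = '/\d{5}|\d{9}/';

$sevenDigits = preg_match($unanchored, '1234567');     // 1: "12345" is found inside
$embedded    = preg_match($unanchored, 'abc12345xyz'); // 1: digits anywhere are enough
$empty       = preg_match($unanchored, '');            // 0: the empty string is rejected

var_dump($sevenDigits, $embedded, $empty);
```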
"doesn't work" isn't much help. but wouldn't it be something like this?
/^(\d{5}|\d{9}|)$/
(Bit rusty on regexp, but what I'm trying to do is "start, then 5 digits OR 9 digits OR nothing, then end")
The reason it doesn't work is that with Perl-style regexes, alternations are tried from left to right.
Change it to:
/\d{9}|\d{5}/ (Though this won't tell you anything about strings of 6-8 or 10 and more digits
unless it's anchored with assertions or something else.)
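A small sketch of the left-to-right behaviour, and of why reordering alone is not enough (variable names are made up):

```php
<?php
preg_match('/\d{5}|\d{9}/', '123456789', $fiveFirst);
preg_match('/\d{9}|\d{5}/', '123456789', $nineFirst);

echo $fiveFirst[0], "\n"; // 12345
echo $nineFirst[0], "\n"; // 123456789

// Still unanchored: a 7-digit string passes because 5 of its digits match.
$sevenDigits = preg_match('/\d{9}|\d{5}/', '1234567');
var_dump($sevenDigits); // int(1)
```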
/^(\d{5}|\d{9}|)$/
I'm trying to figure out how to write a regex that can detect if in my string, any character is repeated more than five times consecutively? For example it wouldn't detect "hello", but it would detect "helloooooooooo".
Any ideas?
Edit: Sorry, to clarify, I need it to detect the same character repeated more than five times, not any sequence of five characters. And I also need it to work with any character, not just "o" like in my example. ".{5,}" is no good because it just detects any sequence of five or more characters, not the same character.
This should do it
(\w)\1{5,}
(\w) matches any word character and puts it in the first group
\1{5,} checks that the character captured by the first group is repeated at least 5 more times.
Usage :
$input = 'helloooooooooo';
if (preg_match('/(\w)\1{5,}/', $input)) {
# Successful match
} else {
# Match attempt failed
}
Correction, should be (.)\1{5,}, I believe. My mistake. This gets you:
(.) #Any character
\1 #The character captured by (.)
{5,} #At least 5 more repetitions (total of at least 6)
You can also restrict it to word characters (letters, digits and underscore) with (\w)\1{5,}, or to letters only with ([a-zA-Z])\1{5,}
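The difference between (.) and (\w) shows up as soon as the repeated character is punctuation. A quick sketch with made-up sample inputs:

```php
<?php
$inputs = ['hello', 'helloooooooooo', 'wow!!!!!!', 'ooooo'];
$results = [];
foreach ($inputs as $input) {
    $results[$input] = [
        'any'  => preg_match('/(.)\1{5,}/', $input),  // any character
        'word' => preg_match('/(\w)\1{5,}/', $input), // word characters only
    ];
    echo "$input: any={$results[$input]['any']} word={$results[$input]['word']}\n";
}
```

Only "helloooooooooo" is caught by both; "wow!!!!!!" (six exclamation marks) is caught by (.) alone, and "ooooo" has just five repetitions in total, one short of the required six.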
You can use the regex:
(.)\1{5,}
Explanation:
.    : Metacharacter that matches any char.
()   : Used for grouping and remembering the matched single char.
\1   : Backreference to the single char that was remembered in the previous step.
{5,} : Quantifier for 5 or more.
and in PHP you can use it as:
$input = 'helloooooooooo';
if(preg_match('/(.)\1{5,}/',$input,$matches)) {
echo "Found repeating char $matches[1] in $input";
}
Output:
Found repeating char o in helloooooooooo
Yep.
(.)\1+
This will match repeated sequences of any character.
The \1 looks at the contents of the first set of brackets (so if you have a more complex regex, you'd need to adjust the number so it picks up the right set of brackets).
If you need to specify, say more than three of them:
(.)\1{3,}
The \1 syntax is quite powerful -- e.g. you can also use it elsewhere in your regex to search for the same character appearing in different places in your search string.
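Two small illustrations of using \1 away from the group it refers to. Both patterns and sample words are made up for this sketch.

```php
<?php
// Does the string start and end with the same character?
$sameEnds = '/^(.).*\1$/';
$level = preg_match($sameEnds, 'level'); // 1: starts and ends with "l"
$hello = preg_match($sameEnds, 'hello'); // 0: "h" does not reappear at the end

// Find every doubled word character in a string.
preg_match_all('/(\w)\1/', 'bookkeeper', $doubles);

var_dump($level, $hello);
print_r($doubles[0]); // oo, kk, ee
```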
I recently asked a question on formatting a telephone number and got lots of responses. Most of them were great, but there is one I really want to understand because it worked so well. If $phone is the following, how do the other lines work? What are they doing, so I can learn?
$phone = "(407)888-9999";
$phone = preg_replace("~[^0-9]~", "", $phone);
preg_match('~([0-9]{3})([0-9]{3})([0-9]{4})~', $phone, $matches);
Let's break the code into two lines.
preg_replace("~[^0-9]~", "", $phone);
First, we're going to replace matches to a regex with an empty string (in other words, delete matches from the string). The regex is [^0-9] (the ~ on each end is a delimiter). [...] in a regex defines a character class, which tells the regex engine to match one character within the class. Dashes are generally special characters inside a character class, and are used to specify a range (ie. 0-9 means all characters between 0 and 9, inclusive).
You can think of a character class like a shorthand for a big OR condition: ie. [0-9] is a shorthand for 0 or 1 or 2 or 3 or 4 or 5 or 6 or 7 or 8 or 9. Note that classes don't have to contain ranges, either -- [aeiou] is a character class that matches a or e or i or o or u (or in other words, any vowel).
When the first character in the class is ^, the class is negated, which means that the regex engine should match any character that isn't in the class. So when you put all that together, the first line removes anything that isn't a digit (a character between 0 and 9) from $phone.
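A few one-liners may help make plain, ranged and negated classes concrete. This is just a sketch; the sample strings and variable names are mine.

```php
<?php
// A plain class: remove every vowel.
$noVowels = preg_replace('~[aeiou]~', '', 'regular expression');

// A range: remove every digit.
$noDigits = preg_replace('~[0-9]~', '', '(407)888-9999');

// A negated range: remove everything that is NOT a digit.
$onlyDigits = preg_replace('~[^0-9]~', '', '(407)888-9999');

echo $noVowels, "\n";   // rglr xprssn
echo $noDigits, "\n";   // ()-
echo $onlyDigits, "\n"; // 4078889999
```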
preg_match('~([0-9]{3})([0-9]{3})([0-9]{4})~', $phone, $matches);
The second line tries to match $phone against a second expression, and puts the results into an array called $matches, if a match is made. You will note there are three sets of brackets; these define capturing groups -- ie. if there is a match of a pattern as a whole, you will end up with three submatches, which in this case will contain the area code, prefix and suffix of the phone number. In general, anything contained in brackets in a regular expression is capturing (while there are exceptions, they are beyond the scope of this explanation). Groups can be useful for other things too, without wanting the overhead of capturing, so a group can be made non-capturing by prefacing it with ?: (ie. (?:...)).
Each group does a similar thing: [0-9]{3} or [0-9]{4}. As we saw above, [0-9] defines a character class containing the digits between 0 and 9 (as the classes here don't start with ^, these are not negated classes). The {3} or {4} is a repetition operator, which says "match exactly 3 (or 4) of the previous token (or group)". So [0-9]{3} will match exactly three digits in a row, and [0-9]{4} will match exactly four digits in a row. Note that the digits don't have to be all the same (ie. 111), because the character class is evaluated for each repetition (so 123 will match because 1 matches [0-9], then 2 matches [0-9], and then 3 matches [0-9]).
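Putting the two lines together, here is a sketch of what ends up in $matches and how the groups could be used to format the number. The dashed output format is just an example I picked, not something from the original question.

```php
<?php
$phone = "(407)888-9999";
$phone = preg_replace("~[^0-9]~", "", $phone); // "4078889999"

$formatted = null;
if (preg_match('~([0-9]{3})([0-9]{3})([0-9]{4})~', $phone, $matches)) {
    // $matches[0] is the whole match; [1]..[3] are the three capturing groups.
    $formatted = "{$matches[1]}-{$matches[2]}-{$matches[3]}";
}

print_r($matches);     // 4078889999, 407, 888, 9999
echo $formatted, "\n"; // 407-888-9999
```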
In the preg_replace, it looks for anything that is not 0-9 (the ^ inside the [] negates the class, so basically anything that is not a digit) and removes it from the string, since the replacement is "".
For the first section, ([0-9]{3}) pulls out the first 3 digits: the {3} is the number of characters to match, the items inside the [] are what to match, and since this is inside parentheses () it is stored as a match in the array $matches. The second part pulls out the next 3 digits and the last part pulls out the last 4 digits from $phone, and those matches are stored in $matches as well.
The ~ are delimiters for the regular expressions.
You know it's a regular expression from the regex tag.
So, you are pattern matching.
The pattern you are matching is: [^0-9] followed by the phone number.
[^0-9] means NOT (the '^') any one digit
So, the match after that is any 3 digits, followed by any 3 digits, followed by any 4 digits.
I don't think it will match, because the () around the area code and the dash are missing.
I'd do this:
~\(([0-9]{3})\)([0-9]{3})-([0-9]{4})~
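As a sketch, this is how that pattern behaves when applied directly to the raw input, without stripping anything first (variable names are made up):

```php
<?php
$raw = "(407)888-9999";

// Escaped \( and \) match the literal parentheses; the dash is matched as-is.
$found = preg_match('~\(([0-9]{3})\)([0-9]{3})-([0-9]{4})~', $raw, $parts);

var_dump($found);                                // int(1)
echo implode(' ', array_slice($parts, 1)), "\n"; // 407 888 9999
```

The trade-off is that this version only accepts that one exact layout, whereas stripping non-digits first accepts any punctuation around the ten digits.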
"[^0-9]" means everything but numbers from 0 to 9. So basically, first line replace everything but numbers with "" (nothing)
[0-9]{3} means a digit from 0 to 9, 3 times in a row.
So it checks whether you have 3 digits, then 3 digits, then 4 digits, and stores what it matched in $matches.
Check this tutorial:
Using Regular Expressions with PHP
http://www.webcheatsheet.com/php/regular_expressions.php
$phone = "(407)888-9999";
$phone = preg_replace("~[^0-9]~", "", $phone);
In PHP you have to delimit the regex pattern with some non-alphanumeric character; "~" is used here.
[^0-9] is the regex pattern used to remove anything from $phone that is not in the 0-9 range. Remember that a leading ^ inside [...] negates the character class.
preg_match('~([0-9]{3})([0-9]{3})([0-9]{4})~', $phone, $matches);
Again, in this line of code "~" is the delimiter, and
([0-9]{3}) matches 3 digits from the string (note: {} specifies the number, or range, of repetitions to match). Using ( ) in a pattern creates groups/submatches, each returned in its own element of the output array (check your $matches variable for the result).