http://www.tehplayground.com/#0qrTOzTh3
$inputs = array(
'2', // no match
'29.2', // no match
'2.48',
'8.06.16', // no match
'-2.41',
'-.54', // no match
'4.492', // no match
'4.194,32',
'39,299.39',
'329.382,39',
'-188.392,49',
'293.392,193', // no match
'-.492.183,33', // no match
'3.492.249,11',
'29.439.834,13',
'-392.492.492,43'
);
$number_pattern = '-?(?:[0-9]|[0-9]{2}|[0-9]{3}[\.,]?)?(?:[0-9]|[0-9]{2}|[0-9]{3})[\.,][0-9]{2}(?!\d)';
foreach($inputs as $input){
preg_match_all('/'.$number_pattern.'/m', $input, $matches);
print_r($matches);
}
It seems you are looking for
$number_pattern = '-?(?<![\d.,])\d{1,3}(?:[,.]\d{3})*[.,]\d{2}(?![\d.])';
See the PHP demo and a regex demo.
The anchors are not used, there are lookarounds on both sides of the pattern instead.
Pattern details:
-? - an optional hyphen
(?<![\d.,]) - there cannot be a digit, comma or dot before the current location
\d{1,3} - 1 to 3 digits
(?:[,.]\d{3})* - zero or more sequences of a comma or dot followed with 3 digits
[.,] - a comma or dot
\d{2} - 2 digits that are
(?![\d.]) - not followed with a digit or dot.
Note that in PHP you do not need the /m MULTILINE modifier or the $ end-of-string anchor here,
preg_match_all('/'.$number_pattern.'/', $input, $matches);
is enough to match the numbers you need in larger texts.
If you need to match them as standalone strings, use a simpler
^-?\d{1,3}(?:[,.]\d{3})*[.,]\d{2}$
See the regex demo.
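As a quick check, here is a minimal sketch that runs the lookaround pattern over the inputs from the question and collects the ones that match. The $matched array is only an illustrative name, not part of the original code.

```php
<?php
// Inputs copied from the question.
$inputs = ['2', '29.2', '2.48', '8.06.16', '-2.41', '-.54', '4.492', '4.194,32',
    '39,299.39', '329.382,39', '-188.392,49', '293.392,193', '-.492.183,33',
    '3.492.249,11', '29.439.834,13', '-392.492.492,43'];

$number_pattern = '-?(?<![\d.,])\d{1,3}(?:[,.]\d{3})*[.,]\d{2}(?![\d.])';

// Collect every input in which the pattern finds a number.
$matched = [];
foreach ($inputs as $input) {
    if (preg_match('/' . $number_pattern . '/', $input, $m)) {
        $matched[] = $m[0];
    }
}
print_r($matched);
```

The inputs marked "no match" in the question should be absent from the printed list.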
I have a few hundred thousand strings that are laid out like the following
AX23784268B2
LJ93842938A1
MN39423287S
IY289383N2
With PHP I'm racking my brain how to return B2, A1, S, and N2.
Tried all sorts of substr, strstr, strlen manipulation and am coming up short.
substr('MN39423287S', -2); // returns 7S, not S
This is a simpler regexp than the other answer:
preg_match('/[A-Z][^A-Z]*$/', $token, $matches);
echo $matches[0];
[A-Z] matches a letter, [^A-Z] matches a non-letter. * makes the preceding pattern match any number of times (including 0), and $ matches the end of the string.
So this matches a letter followed by any number of non-letters at the end of the string.
$matches[0] contains the portion of the string that the entire regexp matched.
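For what it's worth, a small sketch of that pattern applied to the four sample strings from the question. The $suffixes array is just an illustrative name.

```php
<?php
$tokens = ['AX23784268B2', 'LJ93842938A1', 'MN39423287S', 'IY289383N2'];

$suffixes = [];
foreach ($tokens as $token) {
    // Last uppercase letter plus whatever non-letters follow it.
    if (preg_match('/[A-Z][^A-Z]*$/', $token, $matches)) {
        $suffixes[] = $matches[0];
    }
}
echo implode(', ', $suffixes), PHP_EOL; // B2, A1, S, N2
```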
There are many ways to do this.
One example would be a regex:
<?php
$regex = "/.+([A-Z].?+)$/";
$tokens = [
'AX23784268B2',
'LJ93842938A1',
'MN39423287S',
'IY289383N2',
];
foreach($tokens as $token)
{
preg_match($regex, $token, $matches);
var_dump($matches[1]);
// B2, A1, S, N2
}
How the regex works:
.+ - one or more of any character except newline
( - create a group
[A-Z] - match any A-Z character
.?+ - optionally match one more character after it (possessive, so it is not given back)
) - end group
$ - match the end of the string
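One thing to keep in mind: .?+ allows at most one character after the letter. A rough sketch of the difference, using a made-up token (AX1B23 is not from the question) that has two digits after its last letter:

```php
<?php
$token = 'AX1B23'; // hypothetical token, two digits after the last letter

// The letter must sit in the last or second-to-last position for this to match.
$possessive = preg_match('/.+([A-Z].?+)$/', $token, $m1);

// The negated-class pattern accepts any number of trailing non-letters.
$negated = preg_match('/[A-Z][^A-Z]*$/', $token, $m2);

var_dump($possessive); // int(0)
var_dump($m2[0]);      // string(3) "B23"
```

For the four strings in the question both patterns give the same result, so this only matters if longer suffixes can occur.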
I have the following:
$pattern = "/^([\w_]{1})(.+)([\w_]{1}#)/u";
$replacement = "$1*$3***$4";
$email = "testa#weste.de";
echo "obfuscated: ".preg_replace($pattern, $replacement, $email).RT;
The result is: t*a#***weste.de
But I would like to have: t*#w***.de
How to grab the letter after the # and not before. And how does it work with the .de part?
For the replacement in the example data, you might use \K after the first character: \K forgets what has been matched so far, so that first character is left out of the match and stays in the result.
To keep the first character after the # sign, you can use a capture group and use that in the replacement.
^\w\K[^\s#]+#(\w)[^\s.#]+
^ Start of string
\w Match a single word char (That will also match _)
\K Forget what is matched so far
[^\s#]+ Match 1+ chars other than # or a whitespace char
# Match the # char
(\w) Capture group 1, match a word char (to keep)
[^\s.#]+ Match 1+ chars other than #, a whitespace char or dot
Regex demo | Php demo
In the replacement use a single capture group *#$1***
$email = "testa#weste.de";
$pattern = "/^\w\K[^\s#]+#(\w)[^\s.#]+/";
$replacement = "*#$1***";
echo preg_replace($pattern, $replacement, $email);
Output
t*#w***.de
You can make the pattern as specific as you would like. For example, if the string should end with a dot followed by at least 2 chars a-z, and you don't want to stop matching at the first dot after the #:
^\w\K[^\s#]+#(\w)[^\s#]+(?=\.[a-z]{2,}$)
Regex demo
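To compare the two patterns side by side, here is a minimal sketch. The function names mask_address and mask_address_keep_tld are made up for illustration, and the second sample address is hypothetical; the # separator follows the sample data.

```php
<?php
// Hypothetical helper: stops masking at the first dot after the #.
function mask_address(string $email): string
{
    return preg_replace('/^\w\K[^\s#]+#(\w)[^\s.#]+/', '*#$1***', $email);
}

// Hypothetical helper: masks everything up to the final ".xx" part.
function mask_address_keep_tld(string $email): string
{
    return preg_replace('/^\w\K[^\s#]+#(\w)[^\s#]+(?=\.[a-z]{2,}$)/', '*#$1***', $email);
}

echo mask_address('testa#weste.de'), PHP_EOL;                     // t*#w***.de
echo mask_address('john.doe#mail.example.com'), PHP_EOL;          // j*#m***.example.com
echo mask_address_keep_tld('john.doe#mail.example.com'), PHP_EOL; // j*#m***.com
```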
I found this way to do it:
$email = 'someemail#domain.com';
[$firstPart, $lastPart] = explode('#', $email);
$maskedEmail = str_replace(substr($firstPart, 0, 7), str_repeat('*', 7), $email);
Uses PHP native functions and works just fine!
Say we have two price strings in different format:
$s_price = '85.95' or '1500.00'
$r_price = '$ 85.95' or '1,500'
But all these prices are the same and should match.
I have a regex to do that but don't know if this is how we do it:
(\d+)*(,)?\d+(.)?\d*
To retrieve and parse a float from a string in PHP, use the floatval() function.
For the symbols, it depends on whether you always use the same conventions for your currencies (comma as thousands separator and dot for decimals). In that case, you should remove everything except digits and dots with the preg_replace() function (the corresponding regex could be /[^0-9.]/).
<?php
function sanitize($price) {
return floatval(preg_replace('/[^0-9.]/', '', $price));
}
$a1 = '85.95';
$a2 = '1500.00';
$b1 = '$ 85.95';
$b2 = '1,500';
sanitize($a1); // (float) 85.95
sanitize($a2); // (float) 1500
sanitize($b1); // (float) 85.95
sanitize($b2); // (float) 1500
sanitize($a1) === sanitize($b1); // (bool) true
sanitize($a2) === sanitize($b2); // (bool) true
sanitize($a1) <= sanitize($a2); // (bool) true
sanitize($b1) >= sanitize($b2); // (bool) false
Hope it will help !
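If the prices are later used in arithmetic, comparing floats with === can become fragile. One possible variation, sketched here under the same assumption (comma for thousands, dot for decimals), is to compare integer cents instead. The name to_cents is made up for this example.

```php
<?php
// Hypothetical helper: converts a price string to an integer number of cents.
// Assumes comma = thousands separator and dot = decimal separator.
function to_cents(string $price): int
{
    $clean = preg_replace('/[^0-9.]/', '', $price);
    // round() absorbs the small binary error left by the multiplication.
    return (int) round(floatval($clean) * 100);
}

var_dump(to_cents('85.95') === to_cents('$ 85.95')); // bool(true)
var_dump(to_cents('1500.00') === to_cents('1,500')); // bool(true)
```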
You have a lot of optional parts in your pattern using ? and * and you could omit the capturing groups if you are not referring to them in the code.
What you might do is match an optional part for the dollar sign followed by 0+ horizontal whitespace chars.
Then match 1+ digits followed by an optional part to match a dot or comma and 1+ digits:
(?<!\S)(?:\$\h*)?\d+(?:[,.]\d+)?\b
Explanation
(?<!\S) Assert what is on the left is not a non whitespace char
(?:\$\h*)? Optionally match a dollar sign and 0+ horizontal whitespace chars
\d+(?:[,.]\d+)? Match 1+ digits followed by an optional part to match either a dot or comma and 1+ digits
\b word boundary to prevent the digit being part of a larger word
Regex demo | Php demo
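A short sketch of how this could be used to pull prices out of a longer text with preg_match_all. The sample sentence is invented, and the pattern is written with the trailing group marked optional, as the explanation describes.

```php
<?php
// Invented sample text containing both price formats from the question.
$text = 'Was $ 85.95 and is now 1,500 in total';

$pattern = '/(?<!\S)(?:\$\h*)?\d+(?:[,.]\d+)?\b/';
preg_match_all($pattern, $text, $m);

print_r($m[0]); // the two matches: "$ 85.95" and "1,500"
```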
You should store the numbers as integer or float,
and to compare you need to use || not or.
Hope that was helpful.
Let's say I want to split this string in two variables:
$string = "levis 501";
I will use
preg_match('/\d+/', $string, $num);
preg_match('/\D+/', $string, $text);
but then let's say I want to split this one in two
$string = "levis 5° 501";
as $text = "levis 5°"; and $num = "501";
So my guess is I should add a rule to the preg_match('/\d+/', $string, $num); that looks for numbers only at the END of the string and I want it to be between 2 and 3 digits.
But also the $text match now has one number inside...
How would you do it?
To split a string in two parts, use any of the following:
preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
This regex matches:
^ - the start of the string
(.*?) - Group 1 capturing any zero or more characters, as few as possible (as *? is a "lazy" quantifier) up to...
\s* - zero or more whitespace symbols
(\d+) - Group 2 capturing 1 or more digits
\D* - zero or more characters other than digit (it is the opposite shorthand character class to \d)
$ - end of string.
The s modifier (placed after the closing ~ delimiter) is the DOTALL one: it makes . match any character, including a newline, which it does not match without this modifier.
Or
preg_split('~\s*(?=\s*\d+\D*$)~', $s);
This \s*(?=\s*\d+\D*$) pattern:
\s* - zero or more whitespaces, but only if followed by...
(?=\s*\d+\D*$) - zero or more whitespaces followed with 1+ digits followed with 0+ characters other than digits followed with end of string.
The (?=...) construct is a positive lookahead that does not consume characters and just checks if the pattern inside matches and if yes, returns "true", and if not, no match occurs.
See IDEONE demo:
$s = "levis 5° 501";
preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
print_r($matches[1] . ": ". $matches[2]. PHP_EOL);
print_r(preg_split('~\s*(?=\s*\d+\D*$)~', $s, 2));
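The question also asked for the trailing number to be 2 to 3 digits long. A possible variation for that, wrapped in a hypothetical helper (split_trailing_number is not from the answer) and assuming the input is valid UTF-8:

```php
<?php
// Hypothetical helper: splits off a trailing run of exactly 2 or 3 digits.
// Returns null when the string does not end in such a run.
function split_trailing_number(string $s): ?array
{
    // (?<!\d) keeps the run from being the tail of a longer number.
    if (preg_match('~^(.*?)\s*(?<!\d)(\d{2,3})$~su', $s, $m)) {
        return ['text' => $m[1], 'num' => $m[2]];
    }
    return null;
}

print_r(split_trailing_number('levis 5° 501')); // text: "levis 5°", num: "501"
print_r(split_trailing_number('levis 501'));    // text: "levis", num: "501"
var_dump(split_trailing_number('levis 5'));     // NULL
```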
What would be Regex to match the following 10-digit numbers:
0108889999 //can contain nothing except 10 digits
011 8889999 //can contain a whitespace at that place
012 888 9999 //can contain two whitespaces like that
013-8889999 // can contain one dash
014-888-9999 // can contain two dashes
If you're just looking for the regex itself, try this:
^(\d{3}(\s|\-)?){2}\d{4}$
Put slightly more legibly:
^ # start at the beginning of the line (or input)
(
\d{3} # find three digits
(
\s # followed by a space
| # OR
\- # a hyphen
)? # neither of which might actually be there
){2} # do this twice,
\d{4} # then find four more digits
$ # finish at the end of the line (or input)
EDIT: Oops! The above was correct, but it was also too lenient. It would match things like 01088899996 (one too many characters) because it liked the first (or the last) 10 of them. Now it's more strict (I added the ^ and $).
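A minimal sketch that runs the anchored pattern against the five samples from the question, plus the 11-digit string mentioned in the edit. The $valid array is just an illustrative name.

```php
<?php
$pattern = '/^(\d{3}(\s|\-)?){2}\d{4}$/';

$samples = [
    '0108889999',
    '011 8889999',
    '012 888 9999',
    '013-8889999',
    '014-888-9999',
    '01088899996', // one digit too many, should be rejected
];

// Keep only the samples that the anchored pattern accepts.
$valid = [];
foreach ($samples as $sample) {
    if (preg_match($pattern, $sample) === 1) {
        $valid[] = $sample;
    }
}
print_r($valid);
```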
I'm assuming you want a single regex to match any of these examples:
if (preg_match('/(\d{3})[ \-]?(\d{3})[ \-]?(\d{4})/', $value, $matches)) {
$number = $matches[1] . $matches[2] . $matches[3];
}
preg_match('/\d{3}[\s-]?\d{3}[\s-]?\d{4}/', $string);
0108889999 // true
011 8889999 // true
012 888 9999 // true
013-8889999 // true
014-888-9999 // true
To match the specific parts:
preg_match('/(\d{3})[\s-]?(\d{3})[\s-]?(\d{4})/', $string, $matches);
echo $matches[1]; // first 3 numbers
echo $matches[2]; // next 3 numbers
echo $matches[3]; // next 4 numbers
You can try this pattern. It satisfies your requirements.
[0-9]{3}[-\s]?[0-9]{3}[-\s]?[0-9]{4}
Also, you can add more conditions to the last character by appending [\s.,]+: (phone# ending with space, dot or comma)
[0-9]{3}[-\s]?[0-9]{3}[-\s]?[0-9]{4}[\s.,]+