I'm trying to check whether a string has a certain number of occurrences of a character.
Example:
$string = '123~456~789~000';
I want to verify if this string has exactly 3 instances of the character ~.
Is that possible using regular expressions?
Yes
/^[^~]*~[^~]*~[^~]*~[^~]*$/
Explanation:
^ ... $ means the whole string in many regex dialects
[^~]* a string of zero or more non-tilde characters
~ a tilde character
The string can have as many non-tilde characters as necessary, appearing anywhere in the string, but must have exactly three tildes, no more and no less.
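As a quick check of the explanation above, here is a minimal sketch that wraps the pattern in a function. The name hasExactlyThreeTildes is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper; the pattern is the one given in the answer above.
function hasExactlyThreeTildes(string $s): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^[^~]*~[^~]*~[^~]*~[^~]*$/', $s) === 1;
}

var_dump(hasExactlyThreeTildes('123~456~789~000')); // bool(true)
var_dump(hasExactlyThreeTildes('123~456~789'));     // bool(false): two tildes
var_dump(hasExactlyThreeTildes('1~2~3~4~5'));       // bool(false): four tildes
```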
As a single character is technically a substring, and the task is to count the number of its occurrences, I suppose the most efficient approach is to use a dedicated PHP function, substr_count:
$string = '123~456~789~000';
if (substr_count($string, '~') === 3) {
// string is valid
}
Obviously, this approach won't work if you need to count the number of pattern matches (for example, while you can count the number of '0' characters in your string with substr_count, you had better use preg_match_all to count digits).
Yet for this specific question it should be faster overall, as substr_count is optimized for one specific goal, counting substrings, whereas preg_match_all is more general-purpose.
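To illustrate the distinction between counting a literal substring and counting pattern matches, a small sketch using the example string from the question (the variable names are only illustrative):

```php
<?php
$string = '123~456~789~000';

// A literal substring: substr_count() is enough.
$tildes = substr_count($string, '~');

// "Any digit" is a pattern, not a fixed substring, so use preg_match_all(),
// which returns the number of full pattern matches.
$digits = preg_match_all('/[0-9]/', $string);

var_dump($tildes); // int(3)
var_dump($digits); // int(12)
```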
I believe this should work for a variable number of characters:
^(?:[^~]*~[^~]*){3}$
The advantage here is that you just replace 3 with however many you want to check.
To make it more efficient, it can be written as
^[^~]*(?:~[^~]*){3}$
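A sketch of how the repeat count, and the character itself, could be made parameters. The function name hasExactCount is hypothetical, and the code assumes $char is a single one-byte character.

```php
<?php
// Hypothetical helper: true if $char occurs exactly $count times in $s.
// Assumes $char is a single one-byte character.
function hasExactCount(string $s, string $char, int $count): bool
{
    // Escape the character so regex metacharacters such as "." are literal.
    $c = preg_quote($char, '/');
    // Same shape as ^[^~]*(?:~[^~]*){3}$ with the character and count filled in.
    $pattern = '/^[^' . $c . ']*(?:' . $c . '[^' . $c . ']*){' . $count . '}$/';
    return preg_match($pattern, $s) === 1;
}

var_dump(hasExactCount('123~456~789~000', '~', 3)); // bool(true)
var_dump(hasExactCount('a.b.c', '.', 2));           // bool(true)
var_dump(hasExactCount('a.b.c', '.', 3));           // bool(false)
```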
This is what you are looking for:
EDIT based on comment below:
<?php
$string = '123~456~789~000';
$total = preg_match_all('/~/', $string);
echo $total; // Shows 3
I have this function to check my phone number:
function isValid( $what, $data ) {
switch( $what ) {
// validate a phone number
case 'phone_number':
$pattern = "/^[0-9-+]+$/";
break;
default:
return false;
break;
}
return preg_match($pattern, $data) ? true : false;
}
I want to change that regex to also accept the ( and ) characters, as in (800), and the space character.
For example, this number should pass the validation, but right now it does not:
+1 (201) 223-3213
Let us construct the regular expression step by step. Assume also that spaces are removed before matching.
At the beginning there may or may not be a + sign. It needs to be escaped: \+?
Then come one or more digits, before the part in parentheses: [0-9]+ You might want to write [0-9]* if the number can begin directly with a group in parentheses.
Then, optionally, comes a group of digits in parentheses: (\([0-9]+\))? This assumes that only one such group is allowed.
Then comes the local phone number, where hyphens are also allowed: [0-9-]*
The final character must be a digit, [0-9]; a hyphen is not allowed here.
^\+?[0-9]+(\([0-9]+\))?[0-9-]*[0-9]$
See the result here. Removing the spaces looks like $trimmed = str_replace(' ', '', $data);.
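Put together, the steps could be sketched as follows. The function name isValidPhone is hypothetical; the pattern is the one constructed in this answer, applied after spaces are removed.

```php
<?php
// Hypothetical helper combining the space removal and the pattern above.
function isValidPhone(string $data): bool
{
    // Remove spaces first, as the answer assumes.
    $trimmed = str_replace(' ', '', $data);
    return preg_match('/^\+?[0-9]+(\([0-9]+\))?[0-9-]*[0-9]$/', $trimmed) === 1;
}

var_dump(isValidPhone('+1 (201) 223-3213')); // bool(true)
var_dump(isValidPhone('223-3213-'));         // bool(false): ends with a hyphen
var_dump(isValidPhone('(800) 555-1234'));    // bool(false): [0-9]+ requires a digit before the group
```

The last case passes only if [0-9]+ is relaxed to [0-9]*, as the answer notes.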
How about this regexp:
/^[0-9-+()\s]+$/
See it in action here
'/\(?\b[0-9]{3}\)?[-. ]?[0-9]{3,5}[-. ]?[0-9]{4,8}\b/'
Since you seem to be using this for validation, you can use $data = preg_replace('/[\s+\-()]/', '', $data) to get a string that should (if the phone number is valid) contain only digits; note that str_replace would not work here, because it does not interpret patterns. You can then test this assumption easily by running preg_match('/^\d{11}$/', $data) (the {11} means 11 digits; if a range is allowed, use {min,max}, e.g. \d{10,11}).
It's worth noting that this isn't as thorough as Lorlin's answer in that you're ignoring any invalid use of brackets, +s or -s. You may want to use a combination of the two, or whatever suits your needs the best.
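A minimal sketch of the strip-then-count idea, using preg_replace because the separators are described by a character class. The function name hasDigitCount and its default bounds are assumptions made for the example.

```php
<?php
// Hypothetical helper: strip separators, then require $min..$max digits.
function hasDigitCount(string $data, int $min = 10, int $max = 11): bool
{
    // Drop whitespace, "+", "-", "(" and ")" wherever they appear.
    $digitsOnly = preg_replace('/[\s+\-()]/', '', $data);
    // Whatever remains must consist of digits only, within the length range.
    return preg_match('/^\d{' . $min . ',' . $max . '}$/', $digitsOnly) === 1;
}

var_dump(hasDigitCount('+1 (201) 223-3213')); // bool(true): 11 digits remain
var_dump(hasDigitCount('223-3213'));          // bool(false): only 7 digits
var_dump(hasDigitCount('+1 (201) CALL-NOW')); // bool(false): letters remain
```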
I need to be able to search within a string and find out if [topic] is equal to a number and grab that number only from within the string.
For example, a string like so:
[topic]=10[board]=1
should return 10
But a string like this:
[topic][board]=1
should return 0 or false
A string like this:
[topic]=1.5[board]=2
should return 1, because we need to round down, as floor() does.
Also, we aren't worried about negative numbers, because they will never occur.
How can I grab just the number, rounded down, from strings like these, but only if [topic] is present in the string and followed by an equals sign?
Thanks guys :)
The idea below uses preg_match and a regular expression that looks for the word "topic" inside square brackets followed by an equals sign and one or more digits. Before matching, I set the default value of the topic (false in this case). If a topic is found, I then convert it to an integer.
This will ignore the decimal point and any numbers that follow as \d only contains the numbers 0 through 9.
Example:
<?php
$string = '[topic]=10[board]=1';
$topic = false;
if (preg_match('/\[topic\]=(?P<topic>\d+)/', $string, $matches)) {
$topic = (int)$matches['topic'];
}
var_dump($topic);
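To check the three strings from the question, the same logic can be wrapped in a function. This is only a sketch; the name extractTopic is hypothetical.

```php
<?php
// Hypothetical wrapper around the pattern from the answer above.
// Returns the integer part of the topic value, or false if none is present.
function extractTopic(string $string)
{
    if (preg_match('/\[topic\]=(\d+)/', $string, $matches)) {
        // \d+ stops at the decimal point, so "1.5" yields "1".
        return (int) $matches[1];
    }
    return false;
}

var_dump(extractTopic('[topic]=10[board]=1'));  // int(10)
var_dump(extractTopic('[topic][board]=1'));     // bool(false)
var_dump(extractTopic('[topic]=1.5[board]=2')); // int(1)
```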
This isn't a big issue for me (as far as I'm aware), it's more of something that's interested me. But what is the main difference, if any, of using is_numeric over preg_match (or vice versa) to validate user input values.
Example One:
<?php
$id = $_GET['id'];
if (!preg_match('/^[0-9]*$/', $id)) {
// Error
} else {
// Continue
}
?>
Example Two:
<?php
$id = $_GET['id'];
if (!is_numeric($id)) {
// Error
} else {
// Continue
}
?>
I assume both do exactly the same but is there any specific differences which could cause problems later somehow? Is there a "best way" or something I'm not seeing which makes them different.
is_numeric() tests whether a value is a number. It doesn't necessarily have to be an integer though: it could be a decimal number or a number in scientific notation.
The preg_match() example you've given only checks that a value contains the digits zero to nine; any number of them, and in any sequence.
Note that the regular expression you've given also isn't a perfect integer checker, the way you've written it. It doesn't allow for negatives; it does allow for a zero-length string (i.e. with no digits at all, which presumably shouldn't be valid); and it allows the number to have any number of leading zeros, which again may not be intended.
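The differences can be seen by running both checks over a few sample inputs. This is just a sketch; the sample values are chosen for illustration.

```php
<?php
$pattern = '/^[0-9]*$/'; // the pattern from the question

// Print how each check classifies each sample value.
foreach (['42', '007', '', '-5', '3.14', '1e4'] as $value) {
    printf(
        "%-6s is_numeric=%-5s regex=%s\n",
        var_export($value, true),
        var_export(is_numeric($value), true),
        var_export(preg_match($pattern, $value) === 1, true)
    );
}
```

The empty string passes the regex but not is_numeric(), while '-5', '3.14' and '1e4' pass is_numeric() but not the regex.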
[EDIT]
As per your comment, a better regular expression might look like this:
/^[1-9][0-9]*$/
This forces the first digit to only be between 1 and 9, so you can't have leading zeros. It also forces it to be at least one digit long, so solves the zero-length string issue.
You're not worried about negatives, so that's not an issue.
You might want to restrict the number of digits, because as things stand, it will allow strings that are too big to be stored as integers. To restrict this, you would change the star into a length restriction like so:
/^[1-9][0-9]{0,15}$/
This would allow the string to be between 1 and 16 digits long (ie the first digit plus 0-15 further digits). Feel free to adjust the numbers in the curly braces to suit your own needs. If you want a fixed length string, then you only need to specify one number in the braces.
According to http://www.php.net/manual/en/function.is-numeric.php, is_numeric allows something like "+0123.45e6" or "0xFF" (hexadecimal strings were accepted only before PHP 7.0). I think this is not what you expect.
preg_match can be slow, and you can have something like 0000 or 0051.
I prefer using ctype_digit (works only with strings, it's ok with $_GET).
<?php
$id = $_GET['id'];
if (ctype_digit($id)) {
echo 'ok';
} else {
echo 'nok';
}
?>
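A sketch of how ctype_digit behaves on a few inputs. The wrapper name isDigitString is hypothetical; the is_string() guard is an addition for the case where the value is not a string (for example, a missing or array-valued parameter).

```php
<?php
// Hypothetical wrapper: true only for non-empty strings made of digits 0-9.
function isDigitString($value): bool
{
    // ctype_digit() is meant for strings, so reject everything else first.
    return is_string($value) && ctype_digit($value);
}

var_dump(isDigitString('123'));  // bool(true)
var_dump(isDigitString('0051')); // bool(true): leading zeros are still digits
var_dump(isDigitString(''));     // bool(false)
var_dump(isDigitString('-1'));   // bool(false)
var_dump(isDigitString('12.5')); // bool(false)
```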
is_numeric() allows any form of number, so 1, 3.14159265, and 2.71828e10 are all "numeric", while your regex boils down to an integer-only check (note that is_int() itself tests the variable's type, so it returns false for any string).
is_numeric would accept "-0.5e+12" as a valid ID.
Not exactly the same.
From the PHP docs of is_numeric:
'42' is numeric
'1337' is numeric
'1e4' is numeric
'not numeric' is NOT numeric
'Array' is NOT numeric
'9.1' is numeric
With your regex you only check for 'basic' numeric values.
Also is_numeric() should be faster.
is_numeric checks whether it is any sort of number, while your regex checks whether it is an integer, possibly with leading 0s. For an id, stored as an integer, it is quite likely that we will want to not have leading 0s. Following Spudley's answer, we can do:
/^[1-9][0-9]*$/
However, as Spudley notes, the resulting string may be too large to be stored as a 32-bit or 64-bit integer value. The maximum value of a signed 32-bit integer is 2,147,483,647 (10 digits), and the maximum value of a signed 64-bit integer is 9,223,372,036,854,775,807 (19 digits). However, many 10- and 19-digit integers are larger than the maximum 32-bit and 64-bit integers respectively. A simple regex-only solution would be:
/^[1-9][0-9]{0,8}$/
or
/^[1-9][0-9]{0,17}$/
respectively, but these "solutions" unhappily restrict each to 9- and 18-digit integers; hardly a satisfying result. A better solution might be something like:
$expr = '/^[1-9][0-9]*$/';
if (preg_match($expr, $id) && filter_var($id, FILTER_VALIDATE_INT)) {
echo 'ok';
} else {
echo 'nok';
}
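The same combination as a function, to show that an over-long digit string is rejected by the filter_var step even though it passes the regex. The name isPositiveIntId is hypothetical.

```php
<?php
// Hypothetical helper: digits only, no leading zero, and within PHP's int range.
function isPositiveIntId(string $id): bool
{
    // The regex rules out signs, whitespace, leading zeros and the empty string.
    if (preg_match('/^[1-9][0-9]*$/', $id) !== 1) {
        return false;
    }
    // FILTER_VALIDATE_INT returns false when the value does not fit in an int.
    return filter_var($id, FILTER_VALIDATE_INT) !== false;
}

var_dump(isPositiveIntId('42'));                   // bool(true)
var_dump(isPositiveIntId('007'));                  // bool(false): leading zeros
var_dump(isPositiveIntId('99999999999999999999')); // bool(false): exceeds a 64-bit int
```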
is_numeric checks more:
Finds whether the given variable is numeric. Numeric strings consist
of optional sign, any number of digits, optional decimal part and
optional exponential part. Thus +0123.45e6 is a valid numeric value.
Hexadecimal notation (0xFF) is allowed too but only without sign,
decimal and exponential part.
You can use this code for number validation:
if (!preg_match("/^[0-9]+$/i", $phone)) {
$errorMSG = 'Invalid Number!';
$error = 1;
}
If you're only checking if it's a number, is_numeric() is much much better here. It's more readable and a bit quicker than regex.
The issue with your regex here is that it won't allow decimal values, so essentially you've just written is_int() in regex. Regular expressions should only be used when there is a non-standard data format in your input; PHP has plenty of built in validation functions, even an email validator without regex.
PHP's is_numeric function allows floats as well as integers. At the same time, the is_int function is too strict if you want to validate form data (which always arrives as strings). Therefore, you are usually best off using regular expressions for this.
Strictly speaking, integers are whole numbers positive and negative, and also including zero. Here is a regular expression for this:
/^0$|^[-]?[1-9][0-9]*$/
OR, if you want to allow leading zeros:
/^[-]?[0-9]+$/
Note that this will allow for values such as -0000, which does not cause problems in PHP, however. (MySQL will also cast such values as 0.)
You may also want to confine the length of your integer for considerations of 32/64-bit PHP platform features and/or database compatibility. For instance, to limit the length of your integer to 9 digits (excluding the optional - sign), you could use:
/^0$|^[-]?[1-9][0-9]{0,8}$/
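A short sketch exercising the length-limited pattern above; the function name isIntegerString is hypothetical.

```php
<?php
// Hypothetical helper using the pattern above: "0", or an optional minus sign
// followed by 1 to 9 digits with no leading zero.
function isIntegerString(string $value): bool
{
    return preg_match('/^0$|^[-]?[1-9][0-9]{0,8}$/', $value) === 1;
}

var_dump(isIntegerString('0'));          // bool(true)
var_dump(isIntegerString('-12'));        // bool(true)
var_dump(isIntegerString('007'));        // bool(false): leading zeros
var_dump(isIntegerString('-0'));         // bool(false)
var_dump(isIntegerString('1234567890')); // bool(false): 10 digits
```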
Meanwhile, all the patterns above restrict the values to integers only,
so I use
/^[1-9][0-9\.]{0,15}$/
to allow float values too. Note that this pattern also accepts strings with more than one dot, such as 1.2.3.
You can use filter_var() to check for integers in strings
<?php
$intnum = "1000022";
if (filter_var($intnum, FILTER_VALIDATE_INT) !== false){
echo $intnum.' is an int now';
}else{
echo "$intnum is not an int.";
}
// will output 1000022 is an int now
I read on a forum that you can't completely trust is_numeric(). It lets through "0xFF" for example which is an allowed hexadecimal...
So my question is can you trick is_numeric? Will I need to use a regex to do it correctly?
Here is what is_numeric() considers to be a numeric string:
Numeric strings consist of optional sign, any number of digits, optional decimal part and optional exponential part. Thus +0123.45e6 is a valid numeric value. Hexadecimal notation (0xFF) is allowed too but only without sign, decimal and exponential part.
If you only want to check if a string consists of decimal digits 0-9, you could use ctype_digit().
One can also use ctype_digit() to check whether it's a true number.
Regex would obviously be your better option; however, it does come with an overhead. So it really depends on your situation and what you want to do.
Is it for validating user input? Then the overhead of a regexp, or of is_numeric() combined with a check that the input contains no "x", wouldn't be significant.
If you just want to check that something is an integer, try this:
function isInteger($value){
return (is_numeric($value) ? intval($value) == $value : false);
}
If you want to check for floats too then this won't work obviously :)
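A sketch of what this helper accepts and rejects. The function is reproduced from the answer so the block runs on its own; the sample inputs are illustrative.

```php
<?php
// Reproduced from the answer above so the example is self-contained.
function isInteger($value)
{
    return (is_numeric($value) ? intval($value) == $value : false);
}

var_dump(isInteger('12'));   // bool(true)
var_dump(isInteger('12.5')); // bool(false)
var_dump(isInteger('abc'));  // bool(false)
var_dump(isInteger('12.0')); // bool(true): the loose == compares numerically
```

Because of the loose comparison, a string such as '12.0' is accepted; if only plain digit strings should pass, ctype_digit() or an anchored regex is a stricter choice.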