I have a function that is used throughout my code. The function expects that the passed parameter is a positive integer. Since PHP is loosely typed, the data type is unimportant. But it is important that it contain nothing but digits. Currently, I am using a regular expression to check the value before continuing.
Here is a simplified version of my code:
function do_something($company_id) {
if (preg_match('/\D/', $company_id)) exit('Invalid parameter');
//do several things that expect $company_id to be an integer
}
I come from a Perl background and tend to reach for regular expressions often. However, I know their usage is controversial.
I considered using intval() or (int) and forcing $company_id to be an integer. However, I could end up with some unexpected values and I want it to fail fast.
The other option is:
if (!ctype_digit((string) $company_id)) exit('Invalid parameter');
Is this scenario a valid use of regular expressions? Is one way preferred over the other? If so, why? Are there any gotchas I haven't considered?
The Goal
The original question is about validating a value of unknown data type and discarding all values except those that contain nothing but digits. There seem to be only two ways to achieve this desired result.
If the goal is to fail fast, one would want to check for invalid values and then fail rather than checking for valid values and having to wrap all code in an if block.
Option 1 from Question
if (preg_match('/\D/', $company_id)) exit('Invalid parameter');
Uses a regex and fails if any non-digit character is found. Con: the regex engine has overhead.
Option 2 from Question
if (!ctype_digit((string) $company_id)) exit('Invalid parameter');
Uses ctype_digit and fails if it returns FALSE. Con: the value must be cast to a string, which is a (small) extra step.
You must cast the value to a string because ctype_digit expects a string and PHP will not cast the parameter for you. If you pass an integer to ctype_digit, you will get unexpected results.
This is documented behaviour. For example:
ctype_digit('42'); // true
ctype_digit(42); // false (ASCII 42 is the * character)
Difference Between Option 1 and 2
Due to the overhead of the regex engine, option two is probably the best option. However, worrying about the difference between these two options may fall into the premature optimization category.
Note: There is also a functional difference between the two options above. The first option considers NULL and empty strings valid; the second does not (as of PHP 5.1.0). That may make one method more desirable than the other. To make the regex option behave the same as the ctype_digit version, use this instead:
if (!preg_match('/^\d+$/', $company_id)) exit('Invalid parameter');
Note: The 'start of string' ^ and 'end of string' $ anchors in the above regex are very important. Otherwise, abc123def would be considered valid.
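One more gotcha with the anchored regex: by default `$` also matches just before a trailing newline, so "123\n" passes `/^\d+$/`. The sketch below is one way to close that gap, using `\A` and `\z` instead of `^` and `$`. The helper name `is_digits_only` is made up for illustration, and rejecting everything that is not an int or a string is my assumption, not something the question requires.

```php
<?php
// Hypothetical helper (name not from the original post).
function is_digits_only($value): bool
{
    // Assumption: only ints and strings are candidates; null, bools,
    // floats, arrays and objects are rejected outright.
    if (!is_int($value) && !is_string($value)) {
        return false;
    }
    // \A and \z anchor to the absolute start and end of the subject,
    // so a trailing newline is not tolerated the way it is with $.
    return preg_match('/\A\d+\z/', (string) $value) === 1;
}
```

With this version '123' and 42 pass, while '', 'abc123def', "123\n", -5 and NULL all fail.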
Other Options
There are other methods that have been suggested here and in other questions that will not achieve the stated goals, but I think it is important to mention them and explain why they won't work as it might help someone else.
is_numeric allows exponential parts, floats, and hex values
is_int checks data type rather than value which is not useful for validation if '1' is to be considered valid. And form input is always a string. If you aren't sure where the value is coming from, you can't be sure of the data type.
filter_var with FILTER_VALIDATE_INT allows negative integers and float values such as 1.0 (the string '1.0' is rejected). This seems like the best function to actually validate an integer regardless of data type, but it doesn't work if you want only digits. Note: It's important to check identity with FALSE rather than just truthy/falsey if 0 is to be considered a valid value.
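To make the filter_var behaviour concrete, here is a small sketch of what FILTER_VALIDATE_INT returns for a few inputs. The array keys are just labels I picked; the point is that '0' comes back as int 0, which is falsey, so a loose check would wrongly reject it.

```php
<?php
// Each call returns the validated int, or false when validation fails.
$results = [
    'negative' => filter_var('-5', FILTER_VALIDATE_INT),    // accepted, so not digits-only
    'zero'     => filter_var('0', FILTER_VALIDATE_INT),     // int 0, falsey but valid
    'float'    => filter_var(1.0, FILTER_VALIDATE_INT),     // float 1.0 is accepted
    'decimal'  => filter_var('1.0', FILTER_VALIDATE_INT),   // the string '1.0' is rejected
    'letters'  => filter_var('12abc', FILTER_VALIDATE_INT), // rejected
];

// Strict comparison is what separates "valid zero" from "invalid".
$zeroIsValid = ($results['zero'] !== false);
```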
What about filter_var + FILTER_VALIDATE_INT ?
if (FALSE === ($id = filter_var($_GET['id'], FILTER_VALIDATE_INT))) {
// $_GET['id'] does not look like a valid int
} else {
// $id is an int because $_GET['id'] looks like a valid int
}
Besides, it has min_range/max_range options.
The base idea of this function is more or less equivalent to :
function validate_int($string) {
if (!ctype_digit($string)) {
return FALSE;
} else {
return intval($string);
}
}
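The min_range/max_range options of FILTER_VALIDATE_INT can cover the "positive integer" requirement directly. A minimal sketch, assuming "positive" means 1 or greater; the function name `positive_int_or_false` is hypothetical.

```php
<?php
// Hypothetical helper: returns the int when $value validates as an
// integer >= 1, otherwise false.
function positive_int_or_false($value)
{
    return filter_var($value, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1],
    ]);
}
```

Here '5' yields int 5, while '0', '-3' and 'abc' all yield false. Note that this still accepts things a digits-only check would not, such as a leading plus sign.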
Also, if you expect an integer, you could use is_int. Unfortunately, type hinting is limited to objects and arrays (in PHP versions before 7.0).
Both methods will cast the variable into a string. preg_match does not accept a subject of type integer so it will be cast to a string once passed to the function. ctype_digit is definitely the best solution in this case.
Related
I need to check if a number is an absolute integer.
I seem to be easily able to convert a number to an absint abs( intval( $_POST['number'] ) ); but can't figure out how to check for one:
// pseudo code
if ( is_absint( $_POST['number'] ) ) {
echo 'yay';
}
There are many logical ways to check that, but one particular function - ctype_digit() - happens to do exactly what you want.
It's probably worth noting that I'm saying "happens" and "what you want" for two specific reasons:
ctype_digit() is designed to check if a string value (it won't work on values that are actually of the integer type) consists entirely of digit characters, meaning that it won't accept the minus/dash sign.
Any $_POST value (excluding arrays) is guaranteed to be a string unless you've modified it from within your code - user inputs always come as strings ... you're just calling it an "integer" here.
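Putting those two points together, the is_absint() from the pseudo code could be sketched roughly like this. The function does not exist in PHP; the name comes from the question, and accepting real non-negative ints in addition to digit strings is my own assumption.

```php
<?php
// Hypothetical implementation of the is_absint() pseudo code.
function is_absint($value): bool
{
    // Real integers: only the sign matters.
    if (is_int($value)) {
        return $value >= 0;
    }
    // ctype_digit() is only meaningful for strings; it rejects '',
    // the minus sign, decimal points and whitespace.
    return is_string($value) && ctype_digit($value);
}
```

So '42', '007' and 42 pass, while '-42', '4.2', '' and -1 do not.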
I am writing a controller action that takes two inputs from a form: an existing user id and a new user id.
Both should be integer values.
To avoid any potential security problems, is it enough to simply check is_int?
ie:
if (is_int($existingUserId)) {
}
This should avoid any problems, I think, but I am not 100% sure.
You should use is_numeric instead of is_int.
According to the is_int documentation: is_int('23') = bool(false). Also, the documentation notes:
To test if a variable is a number or a numeric string (such as form input, which is always a string), you must use is_numeric().
I'm sure someone asked this before but I just can't find a post similar.
how necessary is it to validate an ID field from $_GET variable?
I'm using is_numeric() to make sure I'm getting a number at least but am I just putting in unnecessary code?
ex.
www.test.com/user.php?user_id=5
if (isset($_GET['user_id']) && is_numeric($_GET['user_id'])) {
*PDO query for user information*
}
is the is_numeric() necessary?
is there a possibility of an attack by changing user_id in the address?
The best way to sanitize a numeric id is by using an (int) cast.
$id = (int) $_GET['ID'];
with strings you just never know.
Is the is_int() necessary?
You are probably retrieving data by id, so converting the string to an int is the simplest way to go. On a side note, is_int will always return false if applied to a string.
Is there a possibility of an attack by changing user_id in the address?
Well, strings are always dirty. You never know what strange characters a user might input and how that will affect the query. For example, I don't know if it applies in this case, but you should take a look at NULL byte attacks.
If you want to properly validate an integer before performing the query, you should use filter_input(); the outcome is either a valid integer, false if it's not a valid integer or null if the parameter wasn't passed at all.
if (is_int($userId = filter_input(INPUT_GET, 'user_id', FILTER_VALIDATE_INT))) {
*PDO query for user information*
}
If you're using prepared statements this won't really matter so much, but if you wish to return a failure response based on whether the input conforms to what's expected, you can use the above.
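The three outcomes (valid int, false, null) are easy to mix up, so here is a sketch that mimics them on a plain array instead of the real request, which makes it runnable outside a web server. The function name `read_user_id` and the array argument are stand-ins of mine; in a real request you would call filter_input(INPUT_GET, ...) as shown in the snippet above.

```php
<?php
// Hypothetical stand-in for filter_input(): $query plays the role of $_GET.
function read_user_id(array $query)
{
    // Parameter not passed at all: null.
    if (!isset($query['user_id'])) {
        return null;
    }
    // Passed: either the validated int, or false if it is not a valid int.
    return filter_var($query['user_id'], FILTER_VALIDATE_INT);
}
```

The is_int() wrapper used above is what makes 0 safe: it only lets the int case through, whereas a plain truthiness check would treat user_id=0 like a failure.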
If you don't want to use prepared statements, PDO::quote should be the correct function:
Returns a quoted string that is theoretically safe to pass into an SQL statement.
is_int will not work, because GET variables are always passed as strings.
Personally, I like to test for a valid integer with:
if(strval(intval($_GET['user_id'])) === $_GET['user_id'])
However, this can be overkill. After all, if you're using prepared statements then there's no need to handle any escaping, and searching for a row that doesn't exist will just return no results. I'd throw in intval($_GET['user_id']), but only to really make it clear to future coders that the ID is a number.
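For what it's worth, the strval(intval()) round trip is stricter than it looks, because the string has to come back exactly as it went in. A small sketch, with a hypothetical wrapper name and an is_string() guard that I added:

```php
<?php
// Hypothetical wrapper around the round-trip test.
function is_canonical_int_string($value): bool
{
    // The string must survive string -> int -> string unchanged, which
    // rules out leading zeros, whitespace, decimals and empty input.
    return is_string($value) && strval(intval($value)) === $value;
}
```

'42' and '-7' pass; '042', ' 42', '4.2', '' and 'abc' fail. Note that negative values are accepted, so this is not a digits-only check.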
is_int checks the type of a variable, but $_GET['id'] will always be a string. Better to use filter_var.
But you must use a prepared statement anyway.
P.S. With prepared statements you can skip validation; the DB will simply report that nothing was found. But if you want to warn the user about a bad request, you must validate it before querying.
I am wondering, If I have a value I know should be numeric, is multiplying it by 1 a safe method to clean it?
function x($p1){
$p1*=1;
$sql = "select * from t where id = {$p1}";
//run query..
}
Although my example uses an ID, this is being used for many types of numeric values I have in my app (can be money, can be pai etc).
I don't see why it wouldn't be. But what's wrong with using prepared statements? That's always going to be safer than using PHP variables directly in SQL statements.
You can use is_numeric()
I'm sure there is a more "appropriate" way, but for the scope of your question, I would say yes. If some sort of string is passed PHP will interpret it as a zero when doing the mathematical operation.
You can also use is_int()
While that'll probably work, intval seems like a better solution. http://php.net/manual/en/function.intval.php. Your intent will likely be more obvious to someone else reading your code.
If you want to check if a value is numeric before converting it to an int, use is_numeric ( http://php.net/manual/en/function.is-numeric.php ). It'll check for strings that are numeric as well as integers. For example, if a number was coming back from a text input form via AJAX, it might be a string. In that case, is_int would return false, but is_numeric would return true.
EDIT
Now that I know you use DECIMAL for the MySQL column type, you can do something like this:
function getItem($pValue)
{
if (!is_numeric($pValue))
{
return false;
}
$Query = sprintf
(
'SELECT * FROM %s WHERE %s = %.2f',
'TableName',
'Price',
$pValue
);
// Do something with $Query
}
It works most of the time as it will cast strings to integers or doubles, but you have to be careful. It's going to work correctly for scalar values. However, if you do this:
x(new stdClass);
You'll get an E_NOTICE. This is not so bad, right? This:
x(array());
And you'll get an E_ERROR, Unsupported operand types, and the script terminates.
Maybe you'd think that it isn't so bad, but a fatal error at an inopportune moment can leave your system in an unstable state, for example by losing referential integrity or leaving a series of queries unfinished.
Only you know if a case like the above can happen. But if this data comes from a user in any way, I'd go with Murphy's Law on this one and not trust it.
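One way to keep the multiply-by-1 idea while avoiding the array/object cases is to check is_numeric() first. This is only a sketch; the function name `to_number_or_null` and the choice to return null for rejected input are assumptions of mine.

```php
<?php
// Hypothetical guard around the "multiply by 1" trick.
function to_number_or_null($value)
{
    // is_numeric() is false for arrays, objects and non-numeric strings,
    // so the arithmetic below never sees an unsupported operand.
    if (!is_numeric($value)) {
        return null;
    }
    // Numeric strings become int or float depending on their content.
    return $value * 1;
}
```

With this, '12' becomes int 12 and '1.5' becomes float 1.5, while array(), new stdClass and 'abc' give null instead of a notice or a fatal error.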
It appears intval returns 0 when the input is not an integer (and I sometimes want 0 as input), is_int doesn't evaluate string input to tell me whether it's an integer, and I'm not familiar with what casting a variable with (int) does if it's not an integer.
What's the correct way to go about this?
You might want to take a look at the ctype_digit() function (quoting) :
bool ctype_digit ( string $text )
Returns TRUE if every character in the string $text is a decimal digit, FALSE otherwise.
Using this to test what the input is made of, you should then be able to decide what to do with it, depending on the fact it contains an integer or not.
I suppose something like this should do the trick :
if (ctype_digit($_POST['your_field'])) {
// it's an integer => use it
} else {
// Not an integer
}
Maybe try is_numeric. According to the examples in the documentation, this is exactly what you want.
Could you use is_int in conjunction with is_string?
For example:
if (is_int($x) == 0 && is_string($x))
Same answer as matt but easier to read:
preg_match('/^\d+$/', $x)
But ctype is probably faster if your system has it.
$input = preg_match("/^[0-9]{1,}$/",$input) ? (int)$input:$input;