Regex validation pattern only allows one character [duplicate] - php

This question already has answers here:
How to regex match entire string instead of a single character
(2 answers)
Closed 3 years ago.
I'm trying to make functions for validating usernames, emails, and passwords and my regex isn't working. My regex for usernames is ^[a-zA-Z0-9 _-]$ and when I put anything through that should work it always returns false.
As I understand it, the ^ and $ at the beginning and the end means that it makes sure the entire string matches this regular expression, the a-z and A-Z allows all letters, 0-9 allows all numbers, and the last three characters (the space, underscore, and dash) allow the respective characters.
Why is my regular expression not evaluating properly?

You need a quantifier, + or *. As written, the pattern matches exactly one character from the character class.
Your a-zA-Z0-9_ can also be replaced with \w. Try:
^[\w -]+$
+ requires 1 or more matches. * requires 0 or more matches so if an empty string is valid use *.
Additionally you could use \h in place of the space character if tabs are allowed. That is the metacharacter for a horizontal space. I find it easier to read than the literal space.
Per comment, Update:
Since it looks like you want the length of the string to fall within a certain range, we can make the regex more specific. A range quantifier {x,y} (at least x and at most y repetitions) replaces the + or *.
^[\w -]{3,30}$
Additionally, in PHP you must provide delimiters at the start and end of the regex.
preg_match('/^[\w -]{3,30}$/', $username);
Additionally, you should enable error reporting so you get these useful errors in the future. See https://stackoverflow.com/a/21429652/3783243
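A minimal sketch of the difference a quantifier makes, assuming PHP's preg_match. The helper name isValidUsername is hypothetical and only wraps the pattern from this answer.

```php
<?php
// Hypothetical helper: 3 to 30 word characters, spaces, or hyphens.
function isValidUsername(string $username): bool
{
    // Single quotes keep the backslash and the $ literal in the PHP string.
    return preg_match('/^[\w -]{3,30}$/', $username) === 1;
}

// Without a quantifier the class matches exactly one character.
var_dump(preg_match('/^[a-zA-Z0-9 _-]$/', 'a'));      // int(1)
var_dump(preg_match('/^[a-zA-Z0-9 _-]$/', 'alice'));  // int(0)

var_dump(isValidUsername('alice_01'));  // bool(true)
var_dump(isValidUsername('al'));        // bool(false): shorter than 3
```

Note that $ also matches just before a trailing newline; adding the D modifier (or using \z) is one way to rule that out if it matters.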

You're not specifying the character count. Let's try this instead:
^[A-Za-z0-9]*$
Where [A-Za-z0-9] states that any alphanumeric character is allowed, with upper and lower case listed explicitly. (Avoid the shorthand range A-z: it also covers the six ASCII characters between Z and a, namely [ \ ] ^ _ and `.)
The * specifies how many characters, and in this case is unlimited (including none). If you wanted to cap your username length at 10 characters, you could change it to:
^[A-Za-z0-9]{0,10}$
Whereby {0,10} specifies a maximum of 10 characters; a bare {10} would require exactly 10.
UPDATE
To also allow underscores, hyphens and blank spaces (anywhere in the string), use the below:
^[A-Za-z0-9 _-]{0,10}$
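A quick check, sketched with PHP's preg_match, of what the quantifier forms and the range A-z actually accept. The variable names are illustrative only.

```php
<?php
// {10} means exactly 10 repetitions; {0,10} means at most 10.
$exactlyTen = '/^[A-Za-z0-9]{10}$/';
$atMostTen  = '/^[A-Za-z0-9]{0,10}$/';

var_dump(preg_match($exactlyTen, 'bob'));  // int(0)
var_dump(preg_match($atMostTen, 'bob'));   // int(1)

// The range A-z spans ASCII 65 to 122, which includes [ \ ] ^ _ and the backtick.
var_dump(preg_match('/^[A-z]+$/', 'a^b'));     // int(1)
var_dump(preg_match('/^[A-Za-z]+$/', 'a^b'));  // int(0)
```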


Allow Underscores at the start and end of Usernames [duplicate]

This question already has answers here:
Regular expression for alphanumeric and underscores
(21 answers)
Closed 3 years ago.
Looking to see how I can edit the Username creation process to allow underscores and hyphens at the beginning and end of usernames.
Currently, if you end your username with a _, it drops it from the creation process.
$regex = '/^[A-Za-z0-9]+[A-Za-z0-9_.]*[A-Za-z0-9]+$/';
if (!preg_match($regex, $_POST['username'])) {
    $_SESSION['error'][] = $language->register->error_message->username_characters;
}
You just need to add the underscore _ and hyphen - to your first and last character sets so that a username can start or end with those characters:
^[A-Za-z0-9_-]+[A-Za-z0-9_.]*[A-Za-z0-9_-]+$
Since \w is equivalent to [a-zA-Z0-9_], you can shorten the regex to:
^[\w-]+[\w.]*[\w-]+$
One more point: whenever you write a hyphen - in a character set, place it first or last (or escape it); otherwise it may be read as a range specifier rather than a literal hyphen. In the regex above the sets that contain a hyphen hold only \w and -, so its placement is not a concern here.
Also, I am not sure whether you want to allow usernames of just one character (as is generally allowed for variable names). The regex above requires at least two characters; if you do want to allow one, you can modify your regex to this:
^[\w-]+([\w.]*[\w-]+)?$
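A small sketch of how the two patterns from this answer behave in PHP. The function name isAllowedUsername is hypothetical and not part of the original registration code.

```php
<?php
// Hypothetical helper wrapping the variant that also accepts one character.
function isAllowedUsername(string $name): bool
{
    return preg_match('/^[\w-]+([\w.]*[\w-]+)?$/', $name) === 1;
}

var_dump(isAllowedUsername('_alice_'));      // bool(true): underscores at both ends
var_dump(isAllowedUsername('-bob.smith-'));  // bool(true): dot allowed in the middle
var_dump(isAllowedUsername('a'));            // bool(true): single character
var_dump(isAllowedUsername('alice.'));       // bool(false): may not end with a dot

// The shorter pattern without the optional group needs at least two characters.
var_dump(preg_match('/^[\w-]+[\w.]*[\w-]+$/', 'a'));  // int(0)
```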

Regex for numbers, hyphen and plus + [duplicate]

This question already has answers here:
Regex to only allow alphanumeric, comma, hyphen, underscore and semicolon
(3 answers)
Closed 9 years ago.
I'm using the jquery validation engine and trying to add a custom validation with regexp.
But how to set it to only allow numbers, hyphens and + sign?
I tried it in different ways like:
^[0-9a-zA-Z -+]+$
^[0-9a-zA-Z -\+]+$
But none of them worked if I put the plus sign after the hyphen. Why?
In a character class, the - (hyphen) is used to indicate character ranges (just like 0-9 means from 0 to 9). You can either escape it, or put it at the end to make it work properly:
^[0-9a-zA-Z\-+]+$
^[0-9a-zA-Z+-]+$
EDIT: Also, I'm not sure why you put the space in there. I removed it due to your restriction, but you can add it back (before the hyphen for the second regex) if need be.
The hyphen creates a range (just like you did with 0-9). In your case it generates a range from space to + (in ASCII/Unicode order). That's quite a bunch of characters: !"#$%&'()*+ and the space itself.
Escape the hyphen or put it at the end (and remove the space unless you want to accept spaces):
^[0-9a-zA-Z+-]+$
- has special meaning in a character class. In fact, you're even using it to that end in your a-z, A-Z and 0-9 groups.
To use a literal hyphen, either escape it \- or just put it at the end of the class (right before the ])
Try this regex pattern:
^[0-9a-zA-Z+-]+$
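The accidental range can be observed directly. The sketch below uses PHP's preg_match for consistency with the rest of this page (the question itself is about a JavaScript validator); character-class ranges behave the same way for these patterns. Variable names are illustrative.

```php
<?php
$rangeByAccident = '/^[0-9a-zA-Z -+]+$/';  // " -+" is a range from space to +
$literalHyphen   = '/^[0-9a-zA-Z+-]+$/';   // hyphen placed last, so it is literal

var_dump(preg_match($rangeByAccident, '12#34'));  // int(1): # lies inside the range
var_dump(preg_match($rangeByAccident, '12-34'));  // int(0): - is not in the class
var_dump(preg_match($literalHyphen, '+12-34'));   // int(1)
var_dump(preg_match($literalHyphen, '12#34'));    // int(0)
```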

php regular expression for 4 characters

I am trying to construct a regular expression for a string which can have 0 up to 4 characters. The characters can only be 0 to 9 or a to z or A to Z.
I have the following expression; it works, but I don't know how to set it so that a maximum of 4 characters is accepted. In this expression, 0 to infinity characters that match the pattern are accepted.
'([0-9a-zA-Z\s]*)'
You can use {0,4} instead of the * which will allow zero to four instances of the preceding token:
'([0-9a-zA-Z\s]{0,4})'
(* is actually the same as {0,}, i.e. at least zero and unbounded.)
If you want to match a string that consists entirely of zero to four of those characters, you need to anchor the regex at both ends:
'(^[0-9a-zA-Z]{0,4}$)'
I took the liberty of removing the \s because it doesn't fit your problem description. Also, I don't know if you're aware of this, but those parentheses do not form a group, capturing or otherwise. They're not even part of the regex; PHP is using them as regex delimiters. Your regex is equivalent to:
'/^[0-9a-zA-Z]{0,4}$/'
If you really want to capture the whole match in group #1, you should add parentheses inside the delimiters:
'/(^[0-9a-zA-Z]{0,4}$)/'
... but I don't see why you would want to; the whole match is always captured in group #0 automatically.
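To see the delimiter point in practice, here is a minimal sketch that compares the match arrays produced by the two forms. The variable names are illustrative.

```php
<?php
// Parentheses used as delimiters: no capture group is created.
preg_match('(^[0-9a-zA-Z]{0,4}$)', 'ab12', $viaParens);

// Slash delimiters with explicit parentheses inside: group 1 exists.
preg_match('/(^[0-9a-zA-Z]{0,4}$)/', 'ab12', $viaGroup);

var_dump(count($viaParens));  // int(1): only group 0
var_dump(count($viaGroup));   // int(2): groups 0 and 1

var_dump(preg_match('/^[0-9a-zA-Z]{0,4}$/', ''));       // int(1): empty string allowed
var_dump(preg_match('/^[0-9a-zA-Z]{0,4}$/', 'abcde'));  // int(0): five characters
```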
You can use { } to specify finite quantifiers:
[0-9a-zA-Z\s]{0,4}
http://www.regular-expressions.info/reference.html
You can avoid regular expressions completely.
if ($str === '' || (strlen($str) <= 4 && ctype_alnum($str))) {
    // contains 0-4 characters, each a letter or a digit
}
Note that ctype_alnum() returns false for an empty string, so the empty case is checked separately.

Allow + in regex email validate email [duplicate]

This question already has answers here:
How to validate an email address in PHP
(15 answers)
Closed 2 years ago.
Regex is blowing my mind. How can I change this to validate emails with a plus sign, so I can sign up with test+spam@gmail.com?
if(!preg_match("/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$/i", $_GET['em'])) {
It seems like you aren't really familiar with what your regex is doing currently; understanding it would be a good first step before modifying it. Let's walk through your regex using the email address john.robert.smith@mail.com (each item below notes which part of the address that section matches):
1. ^ is the start of string anchor. It specifies that any match must begin at the beginning of the string. If the pattern is not anchored, the regex engine can match a substring, which is often undesired. Anchors are zero-width, meaning that they do not capture any characters.
2. [_a-z0-9-]+ is made up of two elements, a character class and a repetition modifier:
   - [...] defines a character class, which tells the regex engine that any of these characters are valid matches. In this case the class contains the characters a-z, numbers 0-9 and the dash and underscore (in general, a dash in a character class defines a range, so you can use a-z instead of abcdefghijklmnopqrstuvwxyz; when given as the last character in the class, it acts as a literal dash).
   - + is a repetition modifier that specifies that the preceding token (in this case, the character class) can be repeated one or more times. There are two other repetition operators: * matches zero or more times; ? matches exactly zero or one times (i.e. makes something optional).
   (matches "john" in john.robert.smith@mail.com)
3. (\.[_a-z0-9-]+)* again contains a repeated character class. It also contains a group and an escaped character:
   - (...) defines a group, which allows you to group multiple tokens together (in this case, the group will be repeated as a whole). Let's say we wanted to match 'abc', zero or more times (i.e. abcabcabc matches, abcccc doesn't). If we tried to use the pattern abc*, the repetition modifier would only apply to the c, because c is the last token before the modifier. In order to get around this, we can group abc ((abc)*), in which case the modifier would apply to the entire group, as if it was a single token.
   - \. specifies a literal dot character. The reason this is needed is because . is a special character in regex, meaning any character. Since we want to match an actual dot character, we need to escape it.
   (matches ".robert.smith" in john.robert.smith@mail.com)
4. @ is not a special character in regex, so, like all other non-special characters, it matches literally.
   (matches "@" in john.robert.smith@mail.com)
5. [a-z0-9-]+ again defines a repeated character class, like item #2 above.
   (matches "mail" in john.robert.smith@mail.com)
6. (\.[a-z0-9-]+)* is almost exactly the same pattern as #3 above.
   (matches ".com" in john.robert.smith@mail.com)
7. $ is the end of string anchor. It works the same as ^ above, except that it matches the end of the string.
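The grouping and escaping points from the walkthrough can be checked with a few one-liners. This is only an illustrative sketch using PHP's preg_match; the anchors are added so that the whole string has to match.

```php
<?php
// abc* repeats only the c; (abc)* repeats the whole group.
var_dump(preg_match('/^abc*$/', 'abccc'));     // int(1)
var_dump(preg_match('/^abc*$/', 'abcabc'));    // int(0)
var_dump(preg_match('/^(abc)*$/', 'abcabc'));  // int(1)
var_dump(preg_match('/^(abc)*$/', 'abccc'));   // int(0)

// An unescaped dot matches any character; \. matches only a literal dot.
var_dump(preg_match('/^a.c$/', 'abc'));   // int(1)
var_dump(preg_match('/^a\.c$/', 'abc'));  // int(0)
var_dump(preg_match('/^a\.c$/', 'a.c'));  // int(1)
```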
With that in mind, it should be a bit clearer how to add a section that matches a plus segment. As we saw above, + is a special character so it has to be escaped. Then, since the + has to be followed by some characters, we can define a character class with the characters we want to match and define its repetition. Finally, we should make the whole group optional because email addresses don't need to have a + segment:
(\+[a-z0-9-]+)?
When inserted into your regex, it'd look like this:
/^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$/i
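A minimal check of the extended pattern, written with @ as the address separator (an assumption about the intended character). The sample addresses are illustrative.

```php
<?php
// Pattern from the answer with the optional plus segment inserted.
$pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$/i';

var_dump(preg_match($pattern, 'test+spam@gmail.com'));         // int(1)
var_dump(preg_match($pattern, 'john.robert.smith@mail.com'));  // int(1): plus segment is optional
var_dump(preg_match($pattern, 'test+@gmail.com'));             // int(0): + needs characters after it
```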
Save your sanity. Get a pre-made PHP RFC 822 Email address parser
I've used this regex to validate emails, and it works just fine with emails that contain a +:
/^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/
\+ will match a literal + sign, but be aware: You still won't be close to matching all possible email addresses according to the RFC spec, because the actual regex for that is madness. It's almost certainly not worth it; you should use a real email parser for this.
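As one alternative to a hand-written pattern, PHP ships a built-in validator, filter_var with FILTER_VALIDATE_EMAIL. This is not the RFC 822 parser mentioned above and it does not accept every address the RFCs allow; the sketch below only shows that it handles a plus segment. The helper name isValidEmail is hypothetical.

```php
<?php
// Hypothetical helper around PHP's built-in email filter.
function isValidEmail(string $email): bool
{
    // filter_var returns the filtered string on success and false on failure.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(isValidEmail('test+spam@gmail.com'));  // bool(true)
var_dump(isValidEmail('not-an-email'));         // bool(false)
```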
This is another solution (similar to the solution found by David):
//Escaped for .NET
^[_a-zA-Z0-9-]+((\\.[_a-zA-Z0-9-]+)*|(\\+[_a-zA-Z0-9-]+)*)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,4})$
//Native
^[_a-zA-Z0-9-]+((\.[_a-zA-Z0-9-]+)*|(\+[_a-zA-Z0-9-]+)*)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})$
This is another solution:
/^[_a-z0-9-+]+(\.[_a-z0-9-+]+)*(\+[a-z0-9-]+)?@[a-z0-9-.]+(\.[a-z0-9]+)$/
or, for a Razor page (where @ is written as \u0040):
/^[_a-z0-9-+]+(\.[_a-z0-9-+]+)*(\+[a-z0-9-]+)?\u0040[a-z0-9-.]+(\.[a-z0-9]+)$/

Regex for netbios names

I'm having trouble figuring out how to build a regexp for verifying a NetBIOS name. According to the MS standard, these characters are illegal:
\/:*?"<>|
So that's what I'm trying to detect. My regex looks like this:
^[\\\/:\*\?"\<\>\|]$
But that won't work.
Can anyone point me in the right direction? (not regexlib.com please...)
And if it matters, I'm using php with preg_match.
Thanks
Your regular expression has two problems:
you insist that the match should span the entire string. As Andrzej says, you are only matching strings of length 1.
you are quoting too many characters. In a character class (i.e. []), you only need to quote characters that are special within character classes, i.e. hyphen, square bracket, backslash.
The following call works for me:
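/* '\\\\' in the PHP source becomes \\ in the regex (a literal backslash); \/ escapes the delimiter */
preg_match('/[\\\\\/:*?"<>|]/', "foo"); /* gives 0: does not include invalid characters */
preg_match('/[\\\\\/:*?"<>|]/', "f<oo"); /* gives 1: does include invalid characters */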
As it stands at the moment, your regex will match the start of the string (^), then exactly one of the characters in the square brackets (i.e. the illegal characters), then the end of the string ($).
So this likely isn't working because a string of length > 1 will trivially fail to match the regex, and thus be considered OK.
You likely don't need the start and end anchors (the ^ and $). If you remove these, then the regex should match one of the bracketed characters occurring anywhere on the input text, which is what you want.
(Depending on the exact regex dialect, you may canonically need fewer backslashes within the square brackets, but they are unlikely to do any harm in any case.)
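Putting the advice together, here is a minimal sketch of an unanchored check in PHP. The function name hasIllegalNetbiosChars is hypothetical, and ~ is chosen as the delimiter purely so that the forward slash needs no escaping.

```php
<?php
// Hypothetical helper: true when $name contains any of \ / : * ? " < > |
function hasIllegalNetbiosChars(string $name): bool
{
    // '\\\\' in PHP source is \\ in the regex, i.e. one literal backslash.
    // No anchors: the class may match anywhere in the string.
    return preg_match('~[\\\\/:*?"<>|]~', $name) === 1;
}

var_dump(hasIllegalNetbiosChars('foo'));    // bool(false)
var_dump(hasIllegalNetbiosChars('f<oo'));   // bool(true)
var_dump(hasIllegalNetbiosChars('a\\b'));   // bool(true): the string is a\b
```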
