This question already has answers here:
Regex to only allow alphanumeric, comma, hyphen, underscore and semicolon
(3 answers)
Closed 9 years ago.
I'm using the jquery validation engine and trying to add a custom validation with regexp.
But how to set it to only allow numbers, hyphens and + sign?
I tried it in different ways like:
^[0-9a-zA-Z -+]+$
^[0-9a-zA-Z -\+]+$
But none of them worked if I put the plus sign after the hyphen. Why?
In a character class, the - (hyphen) is used to indicate character ranges (just like 0-9 means from 0 to 9). You can either escape it, or put it at the end to make it work properly:
^[0-9a-zA-Z\-+]+$
^[0-9a-zA-Z+-]+$
EDIT: Also, I'm not sure why you put the space in there. I removed it due to your restriction, but you can add it back (before the hyphen for the second regex) if need be.
The hyphen creates a range (just like you did with 0-9). In your case it generates a range from space to + (in ASCII/Unicode order). That's quite a bunch of characters: !"#$%&'()*+ and the space itself. The hyphen itself (0x2D) lies outside that range, so it is no longer accepted.
Escape the hyphen or put it at the end (and remove the space unless you want to accept spaces):
^[0-9a-zA-Z+-]+$
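The difference between the two classes can be checked with a short JavaScript sketch. The variable names and sample inputs are illustrative assumptions, not part of the original question:

```javascript
// Original class: " -+" is read as a range from space (0x20) to plus (0x2B).
const ranged = /^[0-9a-zA-Z -+]+$/;
// Hyphen moved to the end of the class, where it is a literal character.
const literal = /^[0-9a-zA-Z+-]+$/;

// Hypothetical sample inputs.
const samples = ["12-34", "12+34", "12!34", "12 34"];

const results = samples.map((s) => ({
  input: s,
  ranged: ranged.test(s),   // accepts "!", space and "+", but rejects "-"
  literal: literal.test(s), // accepts "-" and "+", rejects "!" and space
}));

console.log(results);
```

With the ranged class, "12-34" is rejected while "12!34" slips through, which is the behaviour described in the question.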
- has special meaning in a character class. In fact, you're even using it to that end in your a-z, A-Z and 0-9 groups.
To use a literal hyphen, either escape it (\-) or just put it at the end of the class (right before the ]).
Try this regex pattern:
^[0-9a-zA-Z+-]+$
Related
This question already has an answer here:
Javascript Regex restrict underscore at start and end
(1 answer)
Closed 4 months ago.
I need to compose a regular expression for a string with a maximum length of 6 characters, containing only lowercase Latin letters and an optional underscore separator, with no underscore at the start or the end.
I tried the following
^[a-z_]{1,6}$
But it allows underscore at the start and the end.
I also tried:
^([a-z]_?[a-z]){1,6}$
^(([a-z]+)_?([a-z]+)){1,6}$
^([a-z](?:_?)[a-z]){1,6}$
But nothing works. Please help.
Expecting:
Valid:
ex_bar
Not valid:
_exbar
exbar_
_test_
This is a fairly simple pattern that should work ^(?!_)[a-z_]{0,5}[a-z]$. See here for a breakdown.
I would express your requirement as:
^(?!.{7,}$)[a-z](?:[a-z_]*[a-z])*$
This pattern matches:
^ from the start of the string
(?!.{7,}$) assert that at most 6 characters are present
[a-z] first letter must be a-z
(?:[a-z_]*[a-z])* match a-z or underscore in the middle, but only a-z at the end
$ end of the string
Note that with this pattern a one-character match must be a single letter a-z, and a two-character match must be two letters a-z. Only matches of three characters or longer can contain an underscore, and only in the middle.
Here is a running demo.
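As a rough check, both patterns suggested so far can be run against the examples from the question in JavaScript. The helper `check` and the extra inputs "a" and "abcdefg" are assumptions added for illustration:

```javascript
// Pattern with an explicit length lookahead.
const lookahead = /^(?!.{7,}$)[a-z](?:[a-z_]*[a-z])*$/;
// Shorter pattern: up to 5 letters/underscores, then a final letter.
const compact = /^(?!_)[a-z_]{0,5}[a-z]$/;

// Examples from the question plus two edge cases (1 and 7 characters).
const cases = ["ex_bar", "_exbar", "exbar_", "_test_", "a", "abcdefg"];

// Hypothetical helper: returns one boolean per test case.
function check(pattern) {
  return cases.map((s) => pattern.test(s));
}

console.log(check(lookahead)); // [true, false, false, false, true, false]
console.log(check(compact));   // [true, false, false, false, true, false]
```

Both patterns still accept consecutive underscores in the middle (for example "a__b"), so a stricter rule would need a different pattern.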
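^(?!_)[a-z_]{1,6}(?<!_)$
You could use a negative look-ahead and a negative look-behind to ensure that the string doesn't start or end with an _ underscore. The ^ and $ anchors tie the match to the whole string, and {1,6} allows anything from 1 to 6 characters rather than exactly 6.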
https://regex101.com/r/sMho0c/1
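A minimal sketch of the lookaround approach in JavaScript, assuming an anchored variant with a {1,6} quantifier and an engine that supports look-behind (ES2018 or later). The inputs "ab" and "abcdefg" are added for illustration:

```javascript
// Anchored lookaround variant (an assumption, see the lead-in):
// no underscore right after the start, none right before the end.
const anchored = /^(?!_)[a-z_]{1,6}(?<!_)$/;

// Examples from the question plus a short and an over-long input.
const inputs = ["ex_bar", "_exbar", "exbar_", "_test_", "ab", "abcdefg"];

const verdicts = inputs.map((s) => anchored.test(s));
console.log(verdicts); // [true, false, false, false, true, false]
```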
This question already has answers here:
How to regex match entire string instead of a single character
(2 answers)
Closed 3 years ago.
I'm trying to make functions for validating usernames, emails, and passwords and my regex isn't working. My regex for usernames is ^[a-zA-Z0-9 _-]$ and when I put anything through that should work it always returns false.
As I understand it, the ^ and $ at the beginning and the end means that it makes sure the entire string matches this regular expression, the a-z and A-Z allows all letters, 0-9 allows all numbers, and the last three characters (the space, underscore, and dash) allow the respective characters.
Why is my regular expression not evaluating properly?
You need a quantifier, + or *. As written, the pattern allows exactly one character from the character class.
Your a-zA-Z0-9_ also can be replaced with \w. Try:
^[\w -]+$
+ requires 1 or more matches. * requires 0 or more matches so if an empty string is valid use *.
Additionally you could use \h in place of the space character if tabs are allowed. That is the metacharacter for a horizontal space. I find it easier to read than the literal space.
Per comment, Update:
Since it looks like you want the length of the string to fall within a certain range, we can get more specific with the regex. A range can be created with {x,y}, which replaces the quantifier.
^[\w -]{3,30}$
Additionally, in PHP you must provide delimiters at the start and end of the regex. Single quotes keep PHP from interpreting the backslash and $ inside the pattern string.
preg_match('/^[\w -]{3,30}$/', $username);
Additionally, you should enable error reporting so you get these useful errors in the future. See https://stackoverflow.com/a/21429652/3783243
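The effect of the missing quantifier can be sketched in JavaScript, where these particular patterns behave the same way as in PHP's PCRE. The variable names and sample usernames are assumptions:

```javascript
// No quantifier: the class must match exactly one character.
const single = /^[a-zA-Z0-9 _-]$/;
// One or more characters from the class.
const oneOrMore = /^[\w -]+$/;
// Between 3 and 30 characters from the class.
const bounded = /^[\w -]{3,30}$/;

console.log(single.test("bob"));           // false: three characters
console.log(single.test("b"));             // true: exactly one character
console.log(oneOrMore.test("bob smith"));  // true
console.log(oneOrMore.test(""));           // false: + needs at least one
console.log(bounded.test("ab"));           // false: shorter than 3
console.log(bounded.test("bob-smith_01")); // true
```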
You're not specifying the character count. Let's try this instead:
^[A-Za-z0-9]*$
Where [A-Za-z0-9] states that you can use any alphanumeric character, in upper or lower case. (Avoid the shorthand A-z: that range also covers [, \, ], ^, _ and `, which sit between Z and a in ASCII.)
The * specifies how many characters, and in this case is unlimited. If you wanted to cap your username length at 10 characters, you could change it to:
^[A-Za-z0-9]{1,10}$
Whereby the {1,10} specifies a minimum of 1 and a maximum of 10 characters; a bare {10} would require exactly 10.
UPDATE
To also allow the use of underscores, hyphens and blank spaces (anywhere in the string), use the below:
^[A-Za-z0-9 _-]{1,10}$
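Two details here are easy to get wrong and can be checked quickly in JavaScript: the range A-z covers more than letters, and {10} means exactly ten rather than up to ten. The sample strings below are assumptions:

```javascript
// A-z spans ASCII 0x41..0x7A, which includes [ \ ] ^ _ and the backtick.
const loose = /^[A-z0-9]*$/;
const strict = /^[A-Za-z0-9]*$/;

console.log(loose.test("user^name"));  // true: "^" lies between Z and a
console.log(strict.test("user^name")); // false

// {10} demands exactly ten characters; {1,10} allows one to ten.
const exactlyTen = /^[A-Za-z0-9]{10}$/;
const upToTen = /^[A-Za-z0-9]{1,10}$/;

console.log(exactlyTen.test("alice"));    // false: only five characters
console.log(upToTen.test("alice"));       // true
console.log(upToTen.test("alice678901")); // false: eleven characters
```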
This question already has answers here:
Regular expression for alphanumeric and underscores
(21 answers)
Closed 3 years ago.
Looking to see how I can edit the Username creation process to allow underscores and hyphens at the beginning and end of usernames.
Currently, if you end your username with a _, it drops it from the creation process.
$regex = '/^[A-Za-z0-9]+[A-Za-z0-9_.]*[A-Za-z0-9]+$/';
if(!preg_match($regex, $_POST['username'])) {
$_SESSION['error'][] = $language->register->error_message->username_characters;
}
You just need to add the underscore _ and the hyphen - to your first and last character sets, so that a username can start or end with those two characters. Your regex then looks like this:
^[A-Za-z0-9_-]+[A-Za-z0-9_.]*[A-Za-z0-9_-]+$
And since \w is the same as writing [a-zA-Z0-9_], you can compact your regex to this:
^[\w-]+[\w.]*[\w-]+$
One more point: whenever you write a hyphen - in a character set, place it as either the first or the last character of the set. Otherwise the hyphen may act as a range specifier rather than as a literal hyphen. In the regex above the character set contains only \w and -, so the placement of the hyphen is not a concern here.
Regex Demo
Also, I am not sure whether you want to allow usernames of just one character (a variable name, by contrast, is generally allowed to be a single character), but if you do, you can modify your regex to this:
^[\w-]+([\w.]*[\w-]+)?$
Regex Demo allowing just one character as username
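The two username patterns can be compared with a short JavaScript sketch. JavaScript's \w is [A-Za-z0-9_], the same definition the answer relies on; the sample usernames are assumptions:

```javascript
// Requires at least two characters: one for each of the outer classes.
const twoOrMore = /^[\w-]+[\w.]*[\w-]+$/;
// Optional group makes a single-character username valid as well.
const oneOrMore = /^[\w-]+([\w.]*[\w-]+)?$/;

console.log(twoOrMore.test("_user_"));     // true: underscores at both ends
console.log(twoOrMore.test("first.last")); // true: dot only in the middle
console.log(twoOrMore.test("user."));      // false: cannot end with a dot
console.log(twoOrMore.test("a"));          // false: needs two characters
console.log(oneOrMore.test("a"));          // true
```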
This question already has answers here:
PHP Regex: How to match \r and \n without using [\r\n]?
(7 answers)
Closed 1 year ago.
I have this nice preg_match regex:
if(preg_match("%^[A-Za-z0-9ążśźęćńółĄŻŚŹĘĆŃÓŁ\.\,\-\?\!\(\)\"\ \/\t\/\n]{2,50}$%", stripslashes(trim($_POST['x'])))){...}
It should allow all characters that could be used in the eventual text content of a post. The problem is that, despite the \n in it, the function still doesn't work for new lines in my post, so input such as
foo
bar
would not work.
Does anybody know why the function would not work properly?
Any help would be gratefully appreciated.
By default a preg_match() with a pattern using ^ and $ will consider the whole string, even if it contains newlines.
This behaviour can be altered using Pattern Modifiers, of which I will list the ones that fit this topic:
s (PCRE_DOTALL): by default, the dot (.) will not match newlines, but by using the modifier s it will. However, character classes (e.g. [a-z] and [^a-z]) never treat the newline as a special character anyway, thus this modifier will not affect their behaviour like it will for the dot (.).
m (PCRE_MULTILINE): by default, the start (^) and end ($) anchors match the start and end of the whole string that is subjected to pattern matching, even if that string contains newlines. However, when this modifier is used, the anchors also match at the start and end of every line, so each part of the string that is separated by newlines is treated like a complete string. With preg_match_all(), "foo\nbar\nbar" therefore yields three matches (1: foo, 2: bar, 3: bar) against the pattern /^[a-z]+$/m. Without the m modifier, /^[a-z]+$/ does not match this string at all, because the character class does not include the newline.
D (PCRE_DOLLAR_ENDONLY): by default, the end ($) anchor will not only match the very end of a string, but also right before a trailing newline (trailing meaning: at the very end of the string). To undo this behaviour and make it strictly match only the very end of the string, use this pattern modifier.
YOUR PROBLEM:
if(preg_match("%^[A-Za-z0-9ążśźęćńółĄŻŚŹĘĆŃÓŁ\.\,\-\?\!\(\)\"\ \/\t\/\n]{2,50}$%m", stripslashes(trim($_POST['x'])))){...}
I don't see much wrong with your pattern, except that it is not required that you escape characters other than \, -, ^ (only at the start of the character class) and ] (only when not at the start of the character class), but the PHP doc says it's not a violation to still do so.
It might be, though, that your text snippet contains newlines in the form of \r\n and since \r is not included in the character class of your pattern, it will not be matched.
Since my original post mentioned the use of the Pattern Modifier m, and you replied that it worked, I wonder what the issue really was.
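The multiline and \r\n points can be illustrated with a JavaScript sketch. JavaScript's regex engine is not PCRE: it has no D modifier, and its $ does not match before a trailing newline. The m flag and character classes behave as described above, though. The simplified class [a-z\n] is an assumption standing in for the longer class in the question:

```javascript
const text = "foo\nbar\nbar";

// Without m, ^ and $ only match at the very start and end of the input,
// and [a-z] cannot consume the newlines in between.
const withoutM = text.match(/^[a-z]+$/g);
// With m, ^ and $ also match at line boundaries.
const withM = text.match(/^[a-z]+$/gm);

console.log(withoutM); // null
console.log(withM);    // ["foo", "bar", "bar"]

// A class that lists \n but not \r rejects Windows-style line endings.
const lfOnly = /^[a-z\n]{2,50}$/;
const crlfAware = /^[a-z\r\n]{2,50}$/;

console.log(lfOnly.test("foo\nbar"));      // true
console.log(lfOnly.test("foo\r\nbar"));    // false: \r is not in the class
console.log(crlfAware.test("foo\r\nbar")); // true
```

If form input arrives with \r\n line endings, adding \r to the class is one way to accept it without relying on the m modifier.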
This question already has answers here:
How to validate an email address in PHP
(15 answers)
Closed 2 years ago.
Regex is blowing my mind. How can I change this to validate emails with a plus sign, so I can sign up with test+spam@gmail.com?
if(!preg_match("/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$/i", $_GET['em'])) {
It seems like you aren't really familiar with what your regex is currently doing, and understanding that would be a good first step before modifying it. Let's walk through your regex using the email address john.robert.smith@mail.com (each section below notes which part of the address it captures):
1. ^ is the start of string anchor. It specifies that any match must begin at the beginning of the string. If the pattern is not anchored, the regex engine can match a substring, which is often undesired. Anchors are zero-width, meaning that they do not capture any characters.
2. [_a-z0-9-]+ is made up of two elements, a character class and a repetition modifier:
   [...] defines a character class, which tells the regex engine that any of these characters are valid matches. In this case the class contains the characters a-z, the numbers 0-9, and the dash and underscore (in general, a dash in a character class defines a range, so you can use a-z instead of abcdefghijklmnopqrstuvwxyz; when given as the last character in the class, it acts as a literal dash).
   + is a repetition modifier that specifies that the preceding token (in this case, the character class) can be repeated one or more times. There are two other repetition operators: * matches zero or more times; ? matches exactly zero or one times (i.e. makes something optional).
   (captures "john" in john.robert.smith@mail.com)
3. (\.[_a-z0-9-]+)* again contains a repeated character class. It also contains a group and an escaped character:
   (...) defines a group, which allows you to group multiple tokens together (in this case, the group will be repeated as a whole). Let's say we wanted to match 'abc', zero or more times (i.e. abcabcabc matches, abcccc doesn't). If we tried to use the pattern abc*, the repetition modifier would only apply to the c, because c is the last token before the modifier. In order to get around this, we can group abc ((abc)*), in which case the modifier would apply to the entire group, as if it was a single token.
   \. specifies a literal dot character. The reason this is needed is because . is a special character in regex, meaning any character. Since we want to match an actual dot character, we need to escape it.
   (captures ".robert.smith" in john.robert.smith@mail.com)
4. @ is not a special character in regex, so, like all other non-special characters, it matches literally.
   (captures "@" in john.robert.smith@mail.com)
5. [a-z0-9-]+ again defines a repeated character class, like item #2 above.
   (captures "mail" in john.robert.smith@mail.com)
6. (\.[a-z0-9-]+)* is almost exactly the same pattern as #3 above.
   (captures ".com" in john.robert.smith@mail.com)
7. $ is the end of string anchor. It works the same as ^ above, except that it matches the end of the string.
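The grouping point from the walkthrough (abc* versus (abc)*) can be checked with a small JavaScript sketch. The anchors are added here so that the whole input has to match; the variable names are assumptions:

```javascript
// The * applies only to the last token, c.
const lastTokenOnly = /^abc*$/;
// The * applies to the whole group abc.
const wholeGroup = /^(abc)*$/;

console.log(lastTokenOnly.test("abcccc"));    // true: "ab" followed by c's
console.log(lastTokenOnly.test("abcabcabc")); // false
console.log(wholeGroup.test("abcabcabc"));    // true: abc repeated 3 times
console.log(wholeGroup.test("abcccc"));       // false
```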
With that in mind, it should be a bit clearer how to add a section that captures a plus segment. As we saw above, + is a special character so it has to be escaped. Then, since the + has to be followed by some characters, we can define a character class with the characters we want to match and define its repetition. Finally, we should make the whole group optional because email addresses don't need to have a + segment:
(\+[a-z0-9-]+)?
When inserted into your regex, it'd look like this:
/^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$/i
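A quick JavaScript comparison of the pattern with and without the optional plus segment, written with @ as the separator between local part and domain. The addresses are sample inputs, and the pattern is only a sketch of this answer's approach, not a complete email validator:

```javascript
// Pattern from the question: no plus segment allowed.
const original =
  /^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$/i;
// Same pattern with one optional "+tag" segment before the @.
const withPlus =
  /^[_a-z0-9-]+(\.[_a-z0-9-]+)*(\+[a-z0-9-]+)?@[a-z0-9-]+(\.[a-z0-9-]+)*$/i;

console.log(original.test("john.robert.smith@mail.com")); // true
console.log(original.test("test+spam@gmail.com"));        // false
console.log(withPlus.test("john.robert.smith@mail.com")); // true
console.log(withPlus.test("test+spam@gmail.com"));        // true
console.log(withPlus.test("a+b+c@x.com"));                // false: one tag only
```

Because the group is optional but not repeated, an address with two plus segments is still rejected; whether that matters depends on the addresses you expect.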
Save your sanity. Get a pre-made PHP RFC 822 Email address parser
I've used this regex to validate emails, and it works just fine with emails that contain a +:
/^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/
\+ will match a literal + sign, but be aware: You still won't be close to matching all possible email addresses according to the RFC spec, because the actual regex for that is madness. It's almost certainly not worth it; you should use a real email parser for this.
This is another solution (it is similar to the solution found by David):
//Escaped for .Net
^[_a-zA-Z0-9-]+((\\.[_a-zA-Z0-9-]+)*|(\\+[_a-zA-Z0-9-]+)*)*@[a-zA-Z0-9-]+(\\.[a-zA-Z0-9-]+)*(\\.[a-zA-Z]{2,4})$
//Native
^[_a-zA-Z0-9-]+((\.[_a-zA-Z0-9-]+)*|(\+[_a-zA-Z0-9-]+)*)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,4})$
This is another solution:
/^[_a-z0-9-+]+(\.[_a-z0-9-+]+)*(\+[a-z0-9-]+)?@[a-z0-9-.]+(\.[a-z0-9]+)$/
Or, for a Razor page, where @ has to be written as \u0040:
/^[_a-z0-9-+]+(\.[_a-z0-9-+]+)*(\+[a-z0-9-]+)?\u0040[a-z0-9-.]+(\.[a-z0-9]+)$/