I'm still a newbie for regular expressions. I want to create a regular expression with this rule:
if (preg_match('^[ A-Za-z0-9_-#]^', $text) == 1) {
return true;
}
else {
return false;
}
In short, I would like $text to accept letters, numbers, spaces, underscores, dashes, and hashes (#).
Is the above regular expression correct? It always returns true.
First off, you shouldn't use ^ as the expression delimiter, because it is also used as an anchor inside the expression; use /, ~ or # instead.
Second, the dash in the character set should be in the last position; otherwise _-# is read as a range from _ to #, which is not what you want (and because _ comes after # in ASCII, PCRE rejects it as an out-of-order range).
Third, the expression as written only matches a single character; you will want a quantifier such as + or *.
Lastly, you should anchor the expression so that only those valid characters are present in the string:
/^[ \w#-]+$/
Btw, I've replaced A-Za-z0-9_ with \w as a shortcut.
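As a rough sketch of how that pattern could be used, the snippet below wraps it in a small helper. The function name isAllowedText is made up for illustration, and \w is assumed to keep its default ASCII meaning.

```php
<?php
// Hypothetical helper: true when $text consists only of letters, digits,
// underscores, spaces, dashes and hashes (and is not empty).
function isAllowedText(string $text): bool
{
    // "/" is the delimiter; "-" sits last in the class so it is a literal dash.
    return preg_match('/^[ \w#-]+$/', $text) === 1;
}

var_dump(isAllowedText('Room #12 - west_wing')); // bool(true)
var_dump(isAllowedText('50% off'));              // bool(false): "%" is not allowed
```

Swapping + for * would make the empty string acceptable as well.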
Here is what you can do:
\w stands for [a-zA-Z0-9_].
The character - has a special meaning in a character class, since it is used to define ranges, so you must place it at the beginning or at the end of the class (or escape it).
The preg_match function returns 1 on a match, 0 if there is no match, and false when an error occurs, so you don't need to test whether the result equals 1; you can use the return value of preg_match directly as the condition.
Example (anchored with ^ and $ so that the whole string is checked):
if (preg_match('~^[\w #-]++$~', $subject))
    ...
else
    ...
Related
Hi, I have been working on a regular expression for a long time and have run into some difficulties.
Regex rules:
The user can use lowercase letters ([a-z]), uppercase letters ([A-Z]), digits ([0-9]) and the following symbols:
!~*:;<>+#-£$&_?(){}[] and one space. The characters can appear in any order,
but with the following restrictions:
The input cannot start with a digit.
The user can use zero or one whitespace character anywhere in the input, but the input cannot start or end with whitespace.
The input must contain at least one of the following special characters, in any position: !~*:;<>+#-£$&_?(){}[]
The input length must be between 6 and 15 characters.
Question: I need a regular expression that fulfills the above requirements. I have spent many hours on it, but so far I have only come up with the following expression:
Regex='/^([a-zA-Z]|!|\~|*|\:|\;|\<|>|+|#|-|\£|\$|\&|_|\?|{|}|[|]|(|)){1,20}(\s){0,1}([a-zA-Z]|!|\~|*|\:|\;|\<|>|+|#|-|\d|\£|\$|\&|_|\?|{|}|[|]|(|)){1,20}(!|\~|*|\:|\;|\<|>|+|#|-|\£|\$|\&|_|\?|{|}|[|]|(|)){1,}$/i';
It fulfills every rule except rule no. 4 (the length limit).
A complete regex that fulfills all of the above rules would be appreciated.
Some hints before getting to the solution.
You use the modifier i, which means case-insensitive matching, so a-zA-Z is not needed; just use a-z or A-Z.
From your list of characters: [a-zA-Z]|!|\~|*|\:|\;|\<|>|+|#|-|\£|\$|\&|_|\?|{|}|[|]|(|)
Some of these characters need escaping, since they are special in regex (for example *, +, [, ( and ) when used outside a character class).
Skip the alternation and put all the characters in a character class (most of the escaping then becomes unnecessary).
To enforce some of your rules you need lookahead assertions.
So your regex (for php) can look like:
^(?![\d ])(?![^ ]*[ ][^ ]*[ ])(?=.*[!~*:;<>+#\-£$&_?{}\[\]()])[a-z\d!~*:;<>+#\-£$&_?{}\[\]() ]{6,15}(?<![ ])$
If you need the regex for JavaScript, note that lookbehind assertions are only available in engines that implement ES2018 or later. For older engines you can replace the lookbehind with another lookahead:
^(?![\d ])(?!.* $)(?![^ ]*[ ][^ ]*[ ])(?=.*[!~*:;<>+#\-£$&_?{}\[\]()])[a-z\d!~*:;<>+#\-£$&_?{}\[\]() ]{6,15}$
See it here on Regexr. (Note: I have used [^ \r] there, just because I need multiline mode for testing.)
The regex explained:
[a-z\d!~*:;<>+#\-£$&_?{}\[\]() ]{6,15} matches all characters you want to allow, in the required length.
(?![\d ]) negative lookahead assertion, that ensures the string does not start with a digit or a space.
(?![^ ]*[ ][^ ]*[ ]) negative lookahead assertion, that ensures the string does not have more than one space
(?=.*[!~*:;<>+#\-£$&_?{}\[\]()]) positive lookahead assertion, that ensures the string does have one of your special symbols
(?<![ ])$ negative lookbehind assertion, that ensures the string does not end with a space.
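To see the assertions at work, here is a small sketch that runs the PHP version of the pattern against a few sample inputs. The / delimiters and the i and u modifiers are my additions (u is there so that £ is treated as one UTF-8 character, which assumes the source file is saved as UTF-8), and the sample strings are made up.

```php
<?php
// The PHP pattern from this answer, wrapped in "/" delimiters.
// i = case-insensitive; u = treat pattern and subject as UTF-8.
$pattern = '/^(?![\d ])(?![^ ]*[ ][^ ]*[ ])(?=.*[!~*:;<>+#\-£$&_?{}\[\]()])'
         . '[a-z\d!~*:;<>+#\-£$&_?{}\[\]() ]{6,15}(?<![ ])$/iu';

// Made-up samples: the first two follow every rule, each of the others breaks one.
$samples = ['abc#de', 'ab c#de', '1abc#de', 'abcdef', 'a b c#d', 'abc#', 'abc#de '];

foreach ($samples as $sample) {
    $verdict = preg_match($pattern, $sample) === 1 ? 'valid' : 'invalid';
    printf("%-10s %s\n", "'$sample'", $verdict);
}
```

Only the first two samples should come out as valid; the rest fail on the leading digit, the missing symbol, the second space, the length, and the trailing space respectively.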
Unicode:
JavaScript supports this only in regexes that use the u flag (ES2018 and later); older engines do not support it natively.
If you want to support Unicode letters instead of only the ASCII letters, then replace
[a-z] with \p{L} and, in PHP, add the u modifier. You can then also remove the i modifier, since \p{L} is a Unicode property that matches all letters in any language (only complete letters, not combined ones; for those you could use [\p{L}\p{Mn}\p{Mc}]).
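A minimal sketch of the difference in PHP, assuming the file and the input strings are UTF-8 encoded; the variable names and sample words are only for illustration.

```php
<?php
// "u" switches PCRE to UTF-8 mode so that \p{...} works on whole characters.
$asciiOnly = '/^[a-z]+$/i';
$anyLetter = '/^\p{L}+$/u';
$withMarks = '/^[\p{L}\p{Mn}\p{Mc}]+$/u';

$precomposed = 'Łukasz';        // every character is a single letter code point
$decomposed  = "cafe\u{0301}";  // "e" followed by a combining acute accent (U+0301)

var_dump(preg_match($asciiOnly, $precomposed)); // int(0)
var_dump(preg_match($anyLetter, $precomposed)); // int(1)
var_dump(preg_match($anyLetter, $decomposed));  // int(0): U+0301 is a mark, not a letter
var_dump(preg_match($withMarks, $decomposed));  // int(1)
```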
Here's how I'd do it.
<?php
$symbols = '!~*:;<>+#\-£$&_?(){}\[\]';
$regex = "/^(?=.*[$symbols])(?=.{6,15}\$)(?!.* {2})[a-zA-Z][a-zA-Z0-9 $symbols]+[a-zA-Z0-9]\$/";
?>
If any of the following explanations are unclear, please ask, referring to their number:
1. Note that we've factored out the list of symbols for convenience.
2. Note that we've escaped -, [, and ], as these are meaningful characters in character classes. It's possible to not escape them as long as the - is at the beginning or end, and the ] is at the beginning, but since we're mixing $symbols with other characters, we can't be sure where the "beginning" or "end" really is.
3. (?=...) are known as lookaheads. They're useful for asserting multiple conditions. For example, the (?=.*[$symbols]) asserts that there is a symbol somewhere (hence the .*); and the (?=.{6,15}\$) asserts that, from beginning to end, the string is between 6 and 15 characters in length (note that the $ is escaped only because it sits inside a double-quoted string).
4. The (?!...) is known as a negative lookahead. The (?!.* {2}) asserts that there are no two consecutive spaces anywhere.
5. The remainder should be obvious.
As suggested by @Juhana in the comments, why not test your rules separately rather than build a single overly complex regex? Something like this (these are not tested solutions, as you didn't provide any test strings; they are more an example of how to think differently about your problem):
JavaScript
function verify(string) {
    var length = string.length;
    // Reject as soon as one rule is broken. The last test is negated because at
    // least one symbol is required, and the dash is escaped so that #-£ is not a range.
    if (length < 6 || length > 15 || /^\d/.test(string) || /^\s/.test(string) || /\s$/.test(string) || /\s.*\s/.test(string) || !/[!~*:;<>+#\-£$&_?(){}[\]]/.test(string)) {
        return false;
    }
    return true;
}
PHP
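function verify($string) {
    // Note: strlen() counts bytes, so a UTF-8 encoded £ counts as two.
    $length = strlen($string);
    // The symbol test is negated (at least one symbol is required) and the dash is
    // escaped; the u modifier makes PCRE treat £ as a single UTF-8 character.
    if ($length < 6 || $length > 15 || preg_match('/^\d/', $string) || preg_match('/^\s/', $string) || preg_match('/\s$/', $string) || preg_match('/\s.*\s/', $string) || !preg_match('/[!~*:;<>+#\-£$&_?(){}\[\]]/u', $string)) {
        return false;
    }
    return true;
}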
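One thing separate checks make easy is telling the user which rule was broken. The following is only a sketch of that idea: the function name brokenRules and the message texts are invented here, the extra allowed-characters check is my addition, and the file is assumed to be UTF-8 encoded.

```php
<?php
// Hypothetical helper: returns a message for every rule that $input breaks
// (an empty array means the input is valid).
function brokenRules(string $input): array
{
    // Symbol list from the question, escaped for use inside a character class.
    $symbols = '!~*:;<>+#\-£$&_?(){}\[\]';
    $broken  = [];

    // Count characters, not bytes: preg_match_all() returns the number of matches.
    $length = preg_match_all('/./su', $input);
    if ($length < 6 || $length > 15) {
        $broken[] = 'length must be between 6 and 15';
    }
    if (preg_match('/^\d/', $input)) {
        $broken[] = 'must not start with a digit';
    }
    if (preg_match('/^\s|\s\z/', $input)) {
        $broken[] = 'must not start or end with whitespace';
    }
    if (preg_match('/\s.*\s/s', $input)) {
        $broken[] = 'at most one whitespace character is allowed';
    }
    if (!preg_match('/[' . $symbols . ']/u', $input)) {
        $broken[] = 'must contain at least one symbol';
    }
    if (preg_match('/[^a-zA-Z0-9 ' . $symbols . ']/u', $input)) {
        $broken[] = 'contains a character outside the allowed set';
    }
    return $broken;
}

print_r(brokenRules('abc#de'));    // empty array: every rule passes
print_r(brokenRules('1 abcdef ')); // several messages at once
```

Each rule stays a one-line regex that can be read and tested on its own, which is the main point of splitting them up.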
To be honest, I don't really get RegEx. So I'm completely oblivious as to where I'm going wrong here.
I'm looking for a RegEx that accepts alphanumeric characters only (and underscores, it's for usernames). I've searched around here and found numerous example RegExes that I've tried and not one of them has worked.
Among others, which I've mostly gotten from answers around here, I've tried
^[a-zA-Z0-9_]*$
/[^a-z_\-0-9]/i
/^\w+$/
To match these, I've tried (with each of the regexes)
if(preg_match("/^\w+$/", $username)) {
//don't accept
}
and
if(!preg_match("/^\w+$/", $username)) {
//don't accept
}
and
if(preg_match("/^\w+$/", $username) == 1) {
//don't accept
}
and
if(preg_match("/^\w+$/", $username) == 0) {
//don't accept
}
etc...
Each and every single time it's accepting special characters (I've tried &, $, ^, and %).
What exactly am I doing wrong here? Is it the format of the RegEx? Is it how I'm asking it to check?
Also, what exactly is the return value I get if it has found special characters (i.e. ones I don't want to accept)?
preg_match returns 1 if the input string matched the pattern you gave, and 0 if it didn't.
You want each character in your usernames to be alphanumeric (plus underscore). One PCRE way of expressing that is with a character class inside square brackets, like this one: [A-Za-z0-9_]. There are a couple of ways you could use this basic class to do what you want.
One way is a "negative" search: try to match a non-alphanumeric character, and if you do, then the test fails. For this, we just add a caret at the front of the character class. This means we're matching any character not in that set.
So, the following pattern matches "any non-alphanumeric, non-underscore character." Here, a match means an invalid username:
if (preg_match('/[^A-Za-z0-9_]/', $username)) {
// invalid username
}
Or, you could do the opposite kind of match, where you give a pattern for a valid username and check if you match that. This time, we don't change the character class itself at all, but we add the + quantifier after it, meaning we're matching one or more of the "good" characters.
Additionally, we wrap the ^ and $ beginning-and-end-of-string anchors around our pattern. (It's a little confusing, but a caret at the beginning of a pattern has a completely different meaning from a caret at the beginning of a character class, within the brackets.)
The end result is a pattern that means: "1 or more alphanumeric characters (plus underscore) and nothing else." A match on this one means a valid username:
if (preg_match('/^[A-Za-z0-9_]+$/', $username)) {
// valid username
}
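To compare the two approaches side by side, here is a small sketch; the function names hasInvalidChar and onlyValidChars and the sample usernames are made up for the example.

```php
<?php
// "Negative" check: look for one character that is NOT allowed.
function hasInvalidChar(string $username): bool
{
    return preg_match('/[^A-Za-z0-9_]/', $username) === 1;
}

// "Positive" check: the whole string must consist of allowed characters.
function onlyValidChars(string $username): bool
{
    return preg_match('/^[A-Za-z0-9_]+$/', $username) === 1;
}

foreach (['john_doe42', 'john&doe', ''] as $name) {
    printf(
        "%-14s negative check: %-8s positive check: %s\n",
        var_export($name, true),
        hasInvalidChar($name) ? 'invalid' : 'valid',
        onlyValidChars($name) ? 'valid' : 'invalid'
    );
}
```

The two checks disagree on the empty string: the negative check finds nothing to object to, while the positive check requires at least one character. They also disagree on input with a trailing newline, because $ is allowed to match just before a final newline; using \z instead of $ (or adding the D modifier) closes that gap.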
if (preg_match('/^[a-zA-Z0-9_]+$/', $username) === 1) {
    // Good username
}
else {
    // Bad username
}
The strict equality operator (===) means we're comparing what preg_match() returns to the integer 1, with no type conversion. If it returns 0, there was no match; if it returns the boolean false, an error occurred. Check out the manual page for preg_match for more information: http://php.net/manual/en/function.preg-match.php
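The three possible return values can be seen in a short sketch like the one below. The sample strings are invented, and the @ operator is used only to keep the demonstration quiet, not as a recommendation.

```php
<?php
$pattern = '/^[a-zA-Z0-9_]+$/';

var_dump(preg_match($pattern, 'good_name')); // int(1): matched
var_dump(preg_match($pattern, 'bad name!')); // int(0): no match

// A pattern without delimiters cannot be compiled: preg_match() emits a
// warning (silenced here with @) and returns false.
$result = @preg_match('^[a-zA-Z0-9_]+$', 'good_name');

var_dump($result);       // bool(false)
var_dump($result == 0);  // bool(true): loose comparison cannot tell false from 0
var_dump($result === 0); // bool(false)
```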
Per the PHP manual, preg_match will return 0 if it can't find a match for your regex and FALSE if an error occurs. So if you want to make sure you're testing for 0, and not something that merely evaluates to false, you should use the === operator.
If you only want letters and underscores, you can use the character class [a-z_], which matches any character from a to z as well as the _ symbol. The + following the class specifies that you want one or more such characters. The ^ says the pattern must match from the beginning of the text, while the $ says that the pattern must match up to the end of the text. The i modifier makes the match case-insensitive.
if (preg_match("/^[a-z_]+$/i", $text_variable) === 1) {
//"A match was found.";
} else {
//"A match was not found.";
}
Regex is very easy to understand if you get the basics :)
I'll try to explain to you all three expressions you tried:
With ^[a-zA-Z0-9_]*$ a string will be matched which:
^ // from the beginning...
[a-zA-Z0-9_] // contains only characters a-z or A-Z or 0-9 or _ sign
* // and has 0 or more of such characters
$ // to the end
Matched strings for example:
(empty string - since you told 0 or more characters)
abc09
fidjwieofoj4fio3j4fiojrfioj3ijfo
000000000000000000000
__________
and_many_many_more_as_long_as_they_contain_alpha_characters_and___sign
With /[^a-z_\-0-9]/i a string will be matched which:
[^a-z_\-0-9]
// ^ means "the opposite" so that subset describes characters
// which are not included in it
// (are not a-z or _ sign, or - dash sign, or 0-9 numbers)
i modifier
// stands for case-insensitive: upper- and lowercase letters are treated alike
You did not add * or ? or + after the subset so basically you are looking for one character only, and because you did not put your regexp between ^ and $ signs, this expression will finally match any text which contains at least one character which is not A-Z or a-z, or _ sign, or - dash sign, or 0-9 numbers.
Matched strings for example:
!
a>a
A<9
ffffffffff.dflskfdfd
00000,
]]]]]]]]]]]]]]]]]]
and so-on
With /^\w+$/ a string will be matched which:
^ // from the beginning
\w // contains only characters a-z or A-Z or 0-9 or _ sign
+ // and the string must be at least 1 character long
$ // to the end
Probably the most useful regular expression. Remember, \w is just an alias for [a-zA-Z0-9_]. This regexp matches only a whole string that is not empty and contains only alphanumeric characters and the _ sign.
Matched strings for example:
mike
alice
bob10
0000000000
1111
9
php
user_example
Hope that helps. In my opinion, the most useful expression for matching valid usernames would be /^\w{3,15}$/, as it matches any string that is 3 to 15 characters long and consists only of alphanumeric characters and the underscore sign (a-z A-Z 0-9 _).
Try this:
<?php
function isValidUsername($username)
{
return preg_match('/^\w{3,15}$/', $username) == 1;
}
echo isValidUsername('mike999') ? 'Yes' : 'No' , '<br>';
echo isValidUsername('alice!') ? 'Yes' : 'No';
Cheers.
I'm learning regular expression, so please go easy with me!
A username is considered valid when it does not start with _ (underscore) and contains only word characters (letters, digits and the underscore itself):
namespace Gremo\ExtraValidationBundle\Validator\Constraints;
use Symfony\Component\Validator\Constraint;
use Symfony\Component\Validator\ConstraintValidator;
class UsernameValidator extends ConstraintValidator
{
public function validate($value, Constraint $constraint)
{
// Violation if username starts with underscore
if (preg_match('/^_/', $value, $matches)) {
$this->context->addViolation($constraint->message);
return;
}
// Violation if username does not contain all word characters
if (!preg_match('/^\w+$/', $value, $matches)) {
$this->context->addViolation($constraint->message);
}
}
}
In order to merge them into one regular expression, I've tried the following:
^_+[^\w]+$
To be read as: add a violation if starts with an underscore (eventually more than one) and if at least one character following is not allowed (not a letter, digit or underscore). Does not work with "_test", for example.
Can you help me to understand where I'm wrong?
You can add a negative lookahead assertion to your 2nd regex:
^(?!_)\w+$
This tries to match the entire string rather than just a part of it: the string must not begin with an underscore and must consist of one or more word characters.
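A quick sketch of the lookahead version in PHP; the helper name isValidUsername and the sample values are made up for the example.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function isValidUsername(string $value): bool
{
    // (?!_) inspects the first character without consuming it.
    return preg_match('/^(?!_)\w+$/', $value) === 1;
}

var_dump(isValidUsername('test_user')); // bool(true)
var_dump(isValidUsername('_test'));     // bool(false): leading underscore
var_dump(isValidUsername('te$t'));      // bool(false): "$" is not a word character
```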
The problem is De Morgan's Law. ^_+[^\w]+$ will only match if it starts with one or more underscores and all subsequent characters are non-word characters. You need to match if it starts with an underscore or any character is a non-word character.
I think it's simpler, in this case, to focus on the valid usernames: they start with a word character other than an underscore, and all remaining characters are word characters. In other words, valid usernames are described by the pattern ^[^\W_]\w*$. So, you can write:
if (! preg_match('/^[^\W_]\w*$/', $value, $matches)) {
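The difference between "and" and "or" can be checked with a few sample values. In this sketch the $violation pattern is my own spelling of "starts with an underscore or contains a non-word character", not something taken from the question.

```php
<?php
$attempt   = '/^_+[^\w]+$/';  // pattern from the question
$violation = '/^_|\W/';       // assumption: leading underscore OR any non-word character
$valid     = '/^[^\W_]\w*$/'; // "valid username" pattern from this answer

foreach (['_test', 'te$t', '_$$', 'test_1'] as $value) {
    printf(
        "%-8s attempt: %d  violation: %d  valid: %d\n",
        $value,
        preg_match($attempt, $value),
        preg_match($violation, $value),
        preg_match($valid, $value)
    );
}
```

Of these samples, only _$$ satisfies the attempted pattern, because it needs both conditions at once. One caveat: the empty string matches neither $violation nor $valid, so the two formulations are not exact opposites there.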
The simple solution is this:
if (!preg_match('/^[a-zA-Z0-9]+$/', $value, $matches)) {
You just wanted the \w group (which includes the underscore) but without the underscore, so [a-zA-Z0-9] is equivalent to \w without the underscore. Note that this rejects an underscore anywhere in the username, not only at the start.
There are, of course, many different ways of doing this. I'd probably go with something along the lines of /^(?!_)[\w\d_]+$/.
The [\w\d_]+ part, combined with the anchors (^ and $), asserts that the entire string consists only of those characters (\w already covers digits and the underscore, so \w+ alone would do the same). The (?!_) part is a negative lookahead assertion: it checks that the next character isn't an underscore. Since it's right next to the ^ anchor, this ensures the first character isn't an underscore.
I am trying to use the following regular expression to check whether a string is a positive number with either zero decimal places, or 2:
^\d+(\.(\d{2}))?$
When I try to match this using preg_match, I get the error:
Warning: preg_match(): No ending delimiter '^' found in /Library/WebServer/Documents/lib/forms.php on line 862
What am I doing wrong?
The error is about a missing delimiter, such as / or #. Make sure that you wrap the regular expression in delimiters:
if (preg_match('/^\d+(\.(\d{2}))?$/', $text))
{
// more code
return true;
}
Or:
if (preg_match('#^\d+(\.(\d{2}))?$#', $text))
{
// more code
return true;
}
For PHP's preg suite of functions, the regexes must be specified with a delimiter, such as /; for instance, /[a-z]/ will match any character from a to z. When you give it the string "^\\d+(.(\\d{2}))?$", it treats ^ as the delimiter and the regex as \\d+(.(\\d{2}))?$, but it can't find the second ^. Fixing that is as simple as writing "/^\\d+(.(\\d{2}))?$/".
The other thing you need to fix is the .; that's a metacharacter which matches any non-newline character. For a literal period, you want \.. This gives you the regex "/^\d+(\.(\d{2}))?$/".
Also, note that if you don't want the outer capturing group, you can use "/^\d+(?:\.(\d{2}))?$/", which will put the digits after the decimal point in $1 instead of $2.
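The effect of the non-capturing group is easiest to see in the $matches array. A small sketch, with variable names and sample inputs chosen only for illustration:

```php
<?php
// Outer group captures ".34", inner group captures "34".
preg_match('/^\d+(\.(\d{2}))?$/', '12.34', $withCapture);

// (?: ... ) groups without capturing, so the digits move up to index 1.
preg_match('/^\d+(?:\.(\d{2}))?$/', '12.34', $withoutCapture);

print_r($withCapture);    // [0] => 12.34, [1] => .34, [2] => 34
print_r($withoutCapture); // [0] => 12.34, [1] => 34

var_dump(preg_match('/^\d+(\.(\d{2}))?$/', '12'));   // int(1): no decimals is fine
var_dump(preg_match('/^\d+(\.(\d{2}))?$/', '12.3')); // int(0): exactly two decimals required
```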
I want a regular expression which ALLOWS only this:
letters a-z
case insensitive
allows underscores
allows any digits
How should this be written?
Thanks
That would be
\w
if I'm not mistaken. (As it turns out, it depends: in PHP the meaning of \w changes with the locale that's currently in effect.) You can use a more explicit form to nail it down:
[A-Za-z0-9_]
To use it in context, add start-of-string and end-of-string anchors and a quantifier that defines how many characters you will allow:
^[A-Za-z0-9_]+$
PHP:
if (preg_match('/[^a-z0-9_]/i', $input)) {
// invalid input
} else {
// valid input
}
So [a-z0-9_] is a character set for your valid characters. Adding a ^ to the front ([^a-z0-9_]) negates it. The logic is, if any character matches something that ISN'T in the valid character set, the input is considered invalid.
The /i at the end makes the match case insensitive.
How should it be written? (breaking it into multiple lines)
/ # Start RegExp Pattern
^ # Match beginning of string only
[a-z0-9_]* # Match characters in the set [ a-z, 0-9 and _ ] * = Zero or more times
$ # Match end of string
/i # End Pattern - Case Insensitive Matching
Giving you
if (preg_match('/^[a-z0-9_]*$/i', $input)) {
// input is valid
}
You could also use a + instead of * if you want to force at least one character as well.
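The difference between the two quantifiers only shows up on the empty string, as this short sketch with made-up inputs illustrates:

```php
<?php
var_dump(preg_match('/^[a-z0-9_]*$/i', ''));        // int(1): "*" accepts the empty string
var_dump(preg_match('/^[a-z0-9_]+$/i', ''));        // int(0): "+" needs at least one character
var_dump(preg_match('/^[a-z0-9_]+$/i', 'User_01')); // int(1)
var_dump(preg_match('/^[a-z0-9_]+$/i', 'user-01')); // int(0): "-" is not in the set
```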
if(preg_match('/^[0-9a-z_]+$/i', $string)) {
//if it matches
}
else {
//if it doesn't match
}
[0-9a-z_] is a character class that defines the digits 0 through 9, the letters a through z and the underscore. The i at the end makes the match case-insensitive. ^ and $ are anchors that match the beginning and end of the string respectively. The + means 1 or more characters.