I've made this regex:
^[a-zA-Z0-9_.-]*$
Supports:
letters [uppercase and lowercase]
numbers [from 0 to 9]
underscores [_]
dots [.]
hyphens [-]
Now, I want to add these:
spaces [ ]
comma [,]
exclamation mark [!]
parentheses [()]
plus [+]
equal [=]
apostrophe [']
double quotation mark ["]
at [#]
dollar [$]
percent [%]
asterisk [*]
For example, this pattern, which is meant to accept some of the symbols above:
^[a-zA-Z0-9 _.,-!()+=“”„#"$#%*]*$
Returns:
Warning: preg_match(): Compilation failed: range out of order in character class at offset 16
Make sure to put the hyphen - either at the start or at the end of the character class; otherwise it needs to be escaped. Try this regex:
^[a-zA-Z0-9 _.,!()+=`,"#$#%*-]*$
Also note that because of the * quantifier it will even match an empty string. If you don't want to match empty strings, use + instead:
^[a-zA-Z0-9 _.,!()+=`,"#$#%*-]+$
Or better:
^[\w .,!()+=`,"#$#%*-]+$
TEST:
$text = "_.,!()+=,#$#%*-";
if(!preg_match('/\A[\w .,!()+=`,"#$#%*-]+\z/', $text)) {
echo "error.";
}
else {
echo "OK.";
}
Prints:
OK.
The hyphen is being treated as a range marker: when the engine sees ,-! it thinks you're asking for the range of all characters that fall between , and ! (the same way that A-Z works). Because , (ASCII 44) comes after ! (ASCII 33), that range is out of order, hence the error. This isn't what you want.
Either make sure the hyphen is the last character in the character class, as it was before, or escape it with a backslash.
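To see the difference in practice, here is a minimal sketch using shortened versions of the two patterns (the variable names and the sample string are only for illustration). It relies on preg_match() returning false when the pattern fails to compile; the @ operator is used just to keep the warning out of the output.

```php
<?php
// Hyphen between ',' (ASCII 44) and '!' (ASCII 33): a descending range,
// so PCRE refuses to compile the pattern.
$broken = '/^[a-z,-!]*$/';

// Same characters with the hyphen moved to the end, where it is a literal.
$fixed = '/^[a-z,!-]*$/';

// @ hides the "range out of order" warning; the return value is false.
$brokenResult = @preg_match($broken, 'a-b');
$fixedResult  = preg_match($fixed, 'a-b');

var_dump($brokenResult); // bool(false): the pattern did not compile
var_dump($fixedResult);  // int(1): the hyphen is matched literally
```

Checking for === false is worth doing in real code too, since a pattern that fails to compile is otherwise easy to mistake for "no match".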
I would also point out that the quote characters you're using “”„ are part of an extended charset, and are not the same as the basic ASCII quotes "'. You may want to include both sets in your pattern. If you do need to include the non-ASCII characters in the pattern, you should also add the u modifier after the end of your pattern so that the pattern and the subject are treated as UTF-8 and the Unicode characters are matched correctly.
Try escaping your regex: [a-zA-Z0-9\-\(\)\*]
Check if this helps you: How to escape regular expression special characters using javascript?
Inside of a character class [...] the hyphen - has a special meaning unless it is the first or last character, so you need to escape it:
^[a-zA-Z0-9 _.,\-!()+=“”„#"$#%*]*$
None of the other characters need to be escaped in the character class (except ]). You will also need to escape whichever quote character delimits the PHP string that holds the pattern, e.g.
'/[\']/'
"/[\"]/"
try this
^[A-Z0-9][A-Z0-9*&!_^%$#!~#,=+,./\|}{)(~`?][;:\'""-]{0,8}$
use this link to test
The trick is that I reversed the order of the parentheses and the other braces, which took care of some problems. Square brackets must be escaped.
Related
$.validator.addMethod('AZ09_', function (value) {
return /^[a-zA-Z0-9.-_]+$/.test(value);
}, 'Only letters, numbers, and _-. are allowed');
When I use something like test-123 it still triggers as if the hyphen is invalid. I tried \- and --
Escaping using \- should be fine, but you can also try putting it at the beginning or the end of the character class. This should work for you:
/^[a-zA-Z0-9._-]+$/
Escaping the hyphen using \- is the correct way.
I have verified that the expression /^[a-zA-Z0-9.\-_]+$/ does allow hyphens. You can also use the \w class to shorten it to /^[\w.\-]+$/.
(Putting the hyphen last in the expression actually causes it to not require escaping, as it then can't be part of a range, however you might still want to get into the habit of always escaping it.)
The \- may not have been working because you passed the whole pattern from the server inside a string. If that's the case, you first need to escape the \ itself so that the server-side program handles it correctly too.
In a server side string: \\-
On the client side: \-
In regex (covers): -
Or you can simply put the hyphen at the end of the [] brackets.
With the hyphen (-) in a regex, it is important to note the difference between escaping it (\-) and not escaping it (-), because inside a character class the hyphen is not only a character in its own right but is also parsed as the range operator.
In the first case, with an escaped hyphen (\-), the regex treats it purely as a literal hyphen, as in /^[+\-.]+$/
In the second case, without escaping, as in /^[+-.]+$/, the hyphen sits between the plus and the dot, so the class matches all characters with ASCII values from 43 (plus) to 46 (dot), which includes the comma (ASCII value 44) as a side effect.
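The same two classes can be checked from PHP, since PCRE parses them the same way. This is only a small sketch; the variable names and test strings are made up for the illustration.

```php
<?php
$escaped   = '/^[+\-.]+$/'; // plus, literal hyphen, dot
$unescaped = '/^[+-.]+$/';  // range from '+' (43) to '.' (46)

// Both accept the three intended characters.
$escapedPlain   = preg_match($escaped, '+-.');
$unescapedPlain = preg_match($unescaped, '+-.');

// Only the range version lets the comma (44) through.
$escapedComma   = preg_match($escaped, '+,.');
$unescapedComma = preg_match($unescaped, '+,.');

var_dump($escapedPlain, $unescapedPlain); // int(1) int(1)
var_dump($escapedComma, $unescapedComma); // int(0) int(1)
```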
\- should work to escape the - in the character range. Can you quote what you tested when it didn't seem to? Because it seems to work: http://jsbin.com/odita3
A more generic way of matching hyphens is to use the Unicode category for dash punctuation ("\p{Pd}" without quotes). If you are dealing with text from various cultures and sources, you may find that there is more than one kind of hyphen or dash character out there. You can add \p{Pd} inside the [] expression.
I have a question regarding one character in the preg_match syntax below.
I just want to completely understand.
\w matches alphanumeric characters and the underscore.
My question is what does the \ mean after \w and before the # sign?
Does this mean that it will allow:
any alphanumeric
any backslash
any dash
or is this backslash meant to single out the character that follows?
When I test it in the w3schools.com example, I can put backslashes in the email address and it still validates, but they are removed when the address is echoed out.
$email = test_input($_POST["email"]);
// check if e-mail address syntax is valid
if (!preg_match("/([\w\-]+\#[\w\-]+\.[\w\-]+)/",$email))
{
$emailErr = "Invalid email format";
}
The backslash is used to escape characters that have a special meaning in a regex to obtain a literal character. There are twelve characters that must be escaped: [ { ( ) . ? * + | \ ^ $
If I want to write a literal $ in a pattern, I must write \$
Note: you don't need to escape { if the situation is not ambiguous (that is, if it cannot be read as a quantifier {m,n} or {m})
Note 2: The delimiter of the pattern must be escaped too, inside and outside a character class.
Inside a character class these characters no longer need to be escaped, since they lose their special meaning and are seen as literals (the backslash is the exception: it remains the escape character). However, there are three characters that have a special meaning when they are in a particular position in the character class. These characters are: ^ - ]
^ at the first position is used to negate a character class ([^M] => anything that is not an M). If you want to use it as a literal character at the first position, you must write: [\^]
- between two characters defines a character range ([a-z]). This means that you don't need to escape it at the beginning (or immediately after ^) or at the end of the class. You only need to escape it between two characters. - is seen as a literal (and doesn't define a range) in all these examples:
[-abcd]
[^-abcd]
[abcd-]
[ab\-cd]
[\s-abcd] # because \s is not a character (older PCRE only: PHP 7.3+, which uses PCRE2, rejects this as an invalid range)
] since it is used to close the character class, it must be escaped except at the first position or immediately after the ^. []] and [^]] are correct.
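The position rules above can be checked with a short loop. This is a sketch only: the $cases table and the $results array are names invented for the example, and each class is tested against a single character.

```php
<?php
// [character class, character to test]
$cases = [
    ['[-abcd]',  '-'], // hyphen first: literal
    ['[abcd-]',  '-'], // hyphen last: literal
    ['[ab\-cd]', '-'], // escaped hyphen: literal
    ['[^-abcd]', '-'], // negated class that excludes the hyphen
    ['[\^]',     '^'], // escaped caret at the first position: literal
    ['[]]',      ']'], // closing bracket at the first position: literal
];

$results = [];
foreach ($cases as [$class, $char]) {
    // Anchor the class so the whole one-character subject must match.
    $results[$class] = preg_match('/^' . $class . '$/', $char);
    printf("%-10s against %s => %d\n", $class, $char, $results[$class]);
}
```

Every line should report 1 except [^-abcd], which reports 0 because the hyphen is one of the excluded characters.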
If I write the pattern without unneeded backslashes, I obtain:
/([\w-]+#[\w-]+\.[\w-]+)/
To answer your question ("What does it mean?"): Nothing, unneeded escapes are ignored by the regex engine.
I'm trying to catch any characters that are not letters, numbers, or .-_ (period, dash, underscore)
My code is
return !preg_match('/[^A-Za-z0-9.-_]/', $strToFilter);
My hope is that it will return false when it finds an invalid character. As of now it allows ._ (period and underscore) but does not allow - (dash). It also does not detect characters like /, \, [, ], %, ^, etc. as invalid characters.
What is wrong with my expression?
In regex character classes, a hyphen is only matched literally when it:
is immediately against either bracket,
follows the negating caret (^), or
is escaped with a backslash (\).
The hyphen can be included right after the opening bracket, or right before the closing bracket, or right after the negating caret. Both [-x] and [x-] match an x or a hyphen. [^-x] and [^x-] match any character that is not an x or a hyphen. This works in all flavors discussed in this tutorial. Hyphens at other positions in character classes where they can't form a range may be interpreted as literals or as errors. Regex flavors are quite inconsistent about this.
Source - See Metacharacters Inside Character Classes.
Just escape the dash:
return !preg_match('/[^A-Za-z0-9.\-_]/', $strToFilter);
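The symptoms in the question follow from the ASCII table: .-_ is the range from 46 (.) to 95 (_), which contains /, [, \, ] and ^ but not the hyphen itself (45). A small sketch comparing the original class with a version that has the hyphen moved to the end (the sample strings and variable names are made up for the illustration):

```php
<?php
$buggy = '/[^A-Za-z0-9.-_]/'; // .-_ is read as the range 46..95
$fixed = '/[^A-Za-z0-9._-]/'; // hyphen last: a literal

$results = [];
foreach (['a-b', 'a/b', 'a[b]'] as $input) {
    // 1 means "an invalid character was found", 0 means "all characters allowed".
    $results[$input] = [
        'buggy' => preg_match($buggy, $input),
        'fixed' => preg_match($fixed, $input),
    ];
    printf("%-5s buggy: %d  fixed: %d\n", $input, $results[$input]['buggy'], $results[$input]['fixed']);
}
```

With the buggy class only a-b is flagged; with the fixed class a-b passes and the other two strings are flagged.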
I have a regular expression that allows only specific characters from the name fields in an HTML form, namely letters, white space, single quotes, hyphens and periods. Here is the pattern:
return mb_ereg_match("^[\w\s'-\.]+$", $name);
Problem is this pattern, for some reason, returns true when there are literal asterisks in $name. This shouldn't be possible unless I'm missing something. I've done multiple searches on literal asterisks and all I found was the "\*" pattern for intentionally matching them.
The same pattern in preg_match() also returns a match when passed a string like "*John".
What the heck am I missing?
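In a double-quoted PHP string it is safest to put a double backslash in front of these escape sequences: the first backslash escapes the second one for PHP, so that the regex engine receives a single backslash.
You also need to escape the -, otherwise it accepts all characters "between" ' and . (ASCII 39 to 46, which includes *).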
return mb_ereg_match("^[\\w\\s'\\-\\.]+$", $name);
Have a look at a working case (using preg_match): http://ideone.com/E8afAM
When enclosed in square brackets, the hyphen acts as a special character that denotes a range. In your case, it's matching all characters in the range from ' to . (which includes the asterisk).
Escaping the hyphen should return the desired result:
^[\w\s'\-\.]+$
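A quick check with preg_match() shows the effect of the escape. This is a minimal sketch; the variable names and the second sample name are assumptions made for the example, and single-quoted PHP strings are used so that only the apostrophe needs a PHP-level escape.

```php
<?php
// Unescaped hyphen: '-. is the range from ' (39) to . (46), which includes * (42).
$unescaped = '/^[\w\s\'-.]+$/';
// Escaped hyphen: apostrophe, hyphen and dot are three separate literals.
$escaped = '/^[\w\s\'\-.]+$/';

$asteriskUnescaped = preg_match($unescaped, '*John');
$asteriskEscaped   = preg_match($escaped, '*John');
$validName         = preg_match($escaped, "Mary-Ann O'Neil Jr.");

var_dump($asteriskUnescaped); // int(1): the asterisk slips through the range
var_dump($asteriskEscaped);   // int(0): the asterisk is rejected
var_dump($validName);         // int(1): hyphen, apostrophe and period still pass
```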
I have a regular expression that allows only specific characters from the name fields in an HTML form, namely letters, white space, single quotes, hyphens and periods.
You are missing that \w matches more than just letters. php.net says:
A "word" character is any letter or digit or the underscore character, that is, any character which can be part of a Perl "word".
And, the perl definition is:
A \w matches a single alphanumeric character (an alphabetic character, or a decimal digit) or a connecting punctuation character, such as an underscore ("_").
As I read it, the connecting punctuation character should mean only _, so this may be a bug in the multibyte extension.
If you use mb_ereg_match only to match whole Unicode strings, give preg_match's /u modifier and the Unicode character properties feature a try; they are available since PHP 5.1.0.
I'm writing a password regex in PHP that should return false for any string that has at least one character that is not:
a lowercase letter a-z
an uppercase letter A-Z
a number 0-9
a whitespace " *"
a punctuation symbol :,.!().?";
So far I have this:
<?php
$password = 'azAZ0 giggles 9*":,.!() .?";';
$regex1 = '#^[a-zA-Z0-9" *":,.!().?";\']+$#i';
if (preg_match($regex1, $password)) {
echo "A match was found.";
} else {
echo "A match was not found.";
}
?>
Does this seem to be working as I intend it to, or do you see any glaring errors?
And what should I add to the regex so that it should return false for any string that has at least one character that is not:
a hyphen -
Your regex is pretty close to the target, but not totally correct.
I would use this one:
$regex1 = '/^[a-z0-9 :,.!().?";\'-]+$/i';
Points of interest:
Moved the hyphen to the end of the list, so that it won't be mistaken for a character range delimiter
Included an apostrophe by escaping it with a backslash, as per PHP's string escaping rules
Removed the A-Z part since the regex includes the case-insensitive modifier
Replaced * (which in this context means "a space or an asterisk") with just a space -- if you want to also allow tabs and newlines as part of the password (unlikely), replace it with \s
You simply need to escape ' using \. Try this
$regex1 = '#^[a-zA-Z0-9" *":,.!-().?";\']+$#i';
And you already seem to have - in the regex.
Within a character class (denoted by square brackets in regex), a minus - between two characters introduces a range: [A-Z].
You have !-(, which is read as the range of every character from ! to ( and therefore does not do what you think. Solution:
Move the - to the start or the end of the character class: [-A-Z...] / [A-Z...-]
Escape the -: [A-Z\-...]
The other question you ask is "How do I get a single quote into a PHP string?" and really has nothing to do with regex. But "escape it" is the answer, of course.
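When it is unclear what a class actually accepts, one way to find out is to test every printable ASCII character against it. The helper below, charsMatchedBy(), is a hypothetical function written for this illustration, not part of PHP; it assumes the class contains no unescaped / since that is used as the delimiter.

```php
<?php
// Hypothetical helper: returns the printable ASCII characters (32..126)
// that a given character class matches.
function charsMatchedBy(string $class): string
{
    $matched = '';
    for ($code = 32; $code <= 126; $code++) {
        $char = chr($code);
        if (preg_match('/^' . $class . '$/', $char) === 1) {
            $matched .= $char;
        }
    }
    return $matched;
}

$asRange   = charsMatchedBy('[!-(]'); // hyphen in the middle: a range
$asLiteral = charsMatchedBy('[!(-]'); // hyphen last: three literals

echo $asRange, "\n";   // !"#$%&'(
echo $asLiteral, "\n"; // !(-
```

The first line of output shows that !-( quietly admits ", #, $, %, & and ' as well, which is the unintended range described above.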