I need to compose a regular expression for a string with a maximum length of 6 characters, containing only lowercase Latin letters, with an optional underscore separator, and with no leading or trailing underscore.
I tried the following:
^[a-z_]{1,6}$
But it allows underscore at the start and the end.
I also tried:
^([a-z]_?[a-z]){1,6}$
^(([a-z]+)_?([a-z]+)){1,6}$
^([a-z](?:_?)[a-z]){1,6}$
But nothing works. Please help.
Expecting:
Valid:
ex_bar
Not valid:
_exbar
exbar_
_test_
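This is a fairly simple pattern that should work: ^(?!_)[a-z_]{0,5}[a-z]$. The (?!_) look-ahead rejects a leading underscore, [a-z_]{0,5} allows up to five letters or underscores, and the final [a-z] forces the string to end with a letter.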
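As a quick sanity check, the pattern can be run against the examples from the question. This is only a minimal sketch using Python's re module (these particular constructs behave the same way there as in JavaScript); the helper name is_valid and the two extra test strings "a" and "toolong" are my own additions, not part of the question.

```python
import re

# Pattern from the answer: no leading underscore, at most 6 characters,
# and the last character must be a lowercase letter.
PATTERN = re.compile(r"^(?!_)[a-z_]{0,5}[a-z]$")

def is_valid(s: str) -> bool:
    """Return True if the whole string satisfies the pattern (hypothetical helper)."""
    return PATTERN.fullmatch(s) is not None

for s in ["ex_bar", "_exbar", "exbar_", "_test_", "a", "toolong"]:
    print(f"{s!r}: {is_valid(s)}")
```

Only 'ex_bar' and 'a' are reported as valid.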
I would express your requirement as:
^(?!.{7,}$)[a-z](?:[a-z_]*[a-z])*$
This pattern matches:
^ from the start of the string
(?!.{7,}$) assert that at most 6 characters are present
[a-z] first letter must be a-z
(?:[a-z_]*[a-z])* match a-z or underscore in the middle, but only a-z at the end
$ end of the string
Note that with this pattern a one-character match must be a single letter a-z, and a two-character match must be two letters a-z. Only in matches of three or more characters can an underscore appear in the middle.
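The length-dependent behaviour described above can be checked with a short script. This is a sketch in Python's re module under the assumption that the pattern is applied to the whole string; the function name matches and the sample strings are illustrative only.

```python
import re

# Pattern from the answer: at most 6 characters, letters at both ends,
# underscores allowed only in between.
PATTERN = re.compile(r"^(?!.{7,}$)[a-z](?:[a-z_]*[a-z])*$")

def matches(s: str) -> bool:
    """Return True if the whole string satisfies the pattern (hypothetical helper)."""
    return PATTERN.fullmatch(s) is not None

print(matches("a"))        # True: one character, must be a letter
print(matches("_"))        # False: a lone underscore is rejected
print(matches("ab"))       # True: two characters, both letters
print(matches("a_"))       # False: cannot end with an underscore
print(matches("a_b"))      # True: underscore in the middle
print(matches("ex_bar"))   # True: 6 characters
print(matches("abcdefg"))  # False: 7 characters
```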
Here is a running demo.
(?!^_)([a-z_]{6})(?<!_$)
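You could use a negative look-ahead and a negative look-behind to ensure that the string neither starts nor ends with an underscore. Note that, as written, {6} requires exactly six characters and the pattern is not anchored; for "at most six" an anchored variant such as ^(?!_)[a-z_]{1,6}(?<!_)$ would be needed.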
https://regex101.com/r/sMho0c/1
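The look-around approach can be tried out the same way. The sketch below uses Python's re module; ORIGINAL is the pattern exactly as posted, while ANCHORED is my own assumed adaptation for "at most six characters" and is not part of the answer.

```python
import re

# Pattern as posted: exactly six characters, no anchors.
ORIGINAL = re.compile(r"(?!^_)([a-z_]{6})(?<!_$)")

# Assumed variant (not from the answer): anchored, one to six characters.
ANCHORED = re.compile(r"^(?!_)[a-z_]{1,6}(?<!_)$")

def original_found(s: str) -> bool:
    """True if the posted pattern matches anywhere in s."""
    return ORIGINAL.search(s) is not None

def anchored_ok(s: str) -> bool:
    """True if the whole of s satisfies the anchored variant."""
    return ANCHORED.fullmatch(s) is not None

for s in ["ex_bar", "_exbar", "exbar_", "ex_ba", "x_exbar"]:
    print(f"{s!r}: original={original_found(s)} anchored={anchored_ok(s)}")
```

The posted pattern rejects the five-character 'ex_ba' and finds a six-character substring inside the seven-character 'x_exbar', which is why the anchored variant may be closer to the stated requirement.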
Title says it all: I am checking whether a user's username contains anything that isn't a number or letter, such as €{¥]^}+<€, punctuation, spaces, or even things like âæłęč. Is this possible in PHP?
You can use the ctype_alnum() function in PHP.
From the manual:
Check for alphanumeric character(s)
Returns TRUE if every character in text is either a letter or a digit, FALSE otherwise.
var_dump(ctype_alnum("æøsads")); // false
var_dump(ctype_alnum("123asd")); // true
Live demo at https://3v4l.org/5etr7
PHP does REGEX
What you want to do is fairly trivial; PHP has a number of regex functions.
Testing a String For a Character
If all you want is to know whether a string contains non-alphanumeric characters, then just use preg_match():
preg_match( '/[^A-Za-z0-9]/', $userName );
This returns 1 if the username contains anything other than an alphanumeric character (A-Z, a-z, or 0-9), and 0 if it does not. Note that the class must not be followed by *, since a * quantifier also matches the empty string and would make preg_match() return 1 for every input.
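The detection idea can be sketched outside PHP as well. The example below uses Python's re module as a stand-in for preg_match(); the function name has_non_alnum is hypothetical. The class is used without a quantifier here, because a trailing * can match the empty string and would therefore report a match for every input.

```python
import re

# Character class WITHOUT a quantifier: matches one non-alphanumeric character.
NON_ALNUM = re.compile(r"[^A-Za-z0-9]")

def has_non_alnum(username: str) -> bool:
    """Return True if username contains any character outside A-Z, a-z, 0-9."""
    return NON_ALNUM.search(username) is not None

print(has_non_alnum("123asd"))     # False
print(has_non_alnum("æøsads"))     # True
print(has_non_alnum("bob smith"))  # True

# With a trailing *, the pattern also matches the empty string,
# so even a purely alphanumeric input produces a match:
print(re.search(r"[^A-Za-z0-9]*", "123asd") is not None)  # True
```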
Regex Pattern Elements
PCRE regex patterns open and close with a delimiter such as a slash /, and the whole pattern needs to be passed as a (quoted) string: '/myPattern/'. Some other key features are:
[ brackets contain match sets ]
[a-z] // means match any lowercase letter
This pattern means check the current character in the $String relative to the pattern in these brackets, in this case match any lowercase letter a to z.
^ Caret (Meta-Character)
[^a-z] // means no lowercase letters
If the caret ^ (aka hat) is the first character inside brackets, it NEGATES the pattern inside the brackets, so [^A7] means match anything EXCEPT uppercase A and the numeral 7. (Note: when outside brackets, the caret ^ means the start of the string.)
\w\W\d\D\s\S. Meta-Characters (WildCards)
\w // match all alphanumeric
An escaped (i.e. preceded by a backslash \) lowercase w means match any "word" character, i.e. alphanumeric and the underscore _; this is shorthand for [A-Za-z0-9_]. The uppercase \W is the NOT-word character, equivalent to [^A-Za-z0-9_] or [^\w].
. // (dot) match ANY single character except return/newline
\w // match any word character [A-Za-z0-9_]
\W // NOT any word character [^A-Za-z0-9_]
\d // match any digit [0-9]
\D // NOT any digit [^0-9].
\s // match any whitespace (tab, space, newline)
\S // NOT any whitespace
.*+?| Meta-Characters (Quantifiers)
These modify the behavior outside of a set []
* // match previous character or [set] zero or more times,
// so .* means match everything (including nothing) until reaching a return/newline.
+ // match previous at least one or more times.
? // match previous only zero or one time (i.e. optional).
| // means logical OR eg.: com|net means match either literal "com" or "net"
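To make the quantifiers and the alternation above concrete, here is a small sketch using Python's re module, where these metacharacters have the same meaning as in PCRE. The sample text and variable names are made up for illustration.

```python
import re

text = "a ab abbb"

star = re.findall(r"ab*", text)  # 'a' followed by zero or more 'b'
plus = re.findall(r"ab+", text)  # 'a' followed by one or more 'b'
opt = re.findall(r"ab?", text)   # 'a' followed by an optional single 'b'

# Alternation: match either the literal "com" or the literal "net".
tlds = re.findall(r"com|net", "example.com example.net example.org")

print(star)  # ['a', 'ab', 'abbb']
print(plus)  # ['ab', 'abbb']
print(opt)   # ['a', 'ab', 'ab']
print(tlds)  # ['com', 'net']
```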
Not shown: capture groups, backreferences, substitution (the real power of regex). See https://www.phpliveregex.com/#tab-preg-match for more including a live pattern-match playground that is based on the PHP functions, and delivers results as arrays.
Back To Your String Cleaning
So for your pattern, to match anything that is not a letter or digit (underscores included), you need either '/[^A-Za-z0-9]/' or '/[\W_]/'.
Strip Search
If instead you want to STRIP all the non-alpha characters from a string then use preg_replace( $Regex, $Replacement, $StringToClean )
<?php
$username = 'Svéñ déGööfinøff';
echo preg_replace('/[\W_]+/', '', $username);
?>
The output is: SvdGfinff
If you'd prefer to replace certain accented letters with standard Latin ones to keep the names reasonably readable, then I believe you'd need a lookup table (array). There is one ready to use at the PHP site.
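For comparison, the same stripping step can be sketched with Python's re.sub(). The re.ASCII flag is my assumption for approximating PHP's behaviour without the /u modifier, where accented letters do not count as word characters; the function name strip_non_alnum is hypothetical.

```python
import re

def strip_non_alnum(s: str) -> str:
    """Remove every run of characters that is not an ASCII letter or digit."""
    # re.ASCII makes \W treat accented letters as non-word characters.
    return re.sub(r"[\W_]+", "", s, flags=re.ASCII)

print(strip_non_alnum("Svéñ déGööfinøff"))  # SvdGfinff
print(strip_non_alnum("user_name 42!"))     # username42
```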
I am trying to find 8-character words using regex.
Each word may contain digits, letters, or both.
After those 8 characters it should end with .php.
It must have exactly 8 characters, not 7 or 6.
I tried this: \b\d{8}\b.php
But it only works for digits, for example:
12121212.php
23232323.php
Also, I don't want to match:
something-catergory.php
AB787C-category.php
has-bookshok.php
The final result should be like abcd1234.php rather than something-abcd1234.php
You can use a character class:
\b[a-zA-Z0-9]{8}\b
\b - Word boundary.
[a-zA-Z0-9]{8} - match digits, letters, or both ({8} means the length must be exactly 8 characters).
Update
The final result should be like abcd1234.php rather than
something-abcd1234.php
\b[a-zA-Z0-9]{8}\.php$
Demo
If you want the complete string to match, you need to use the ^ anchor at the start and $ at the end instead of \b:
^[a-zA-Z0-9]{8}\.php$
If your data is a list of filenames then this regex will work:
/^[a-z0-9]{8}\.php$/i
It asserts that the filename is exactly 8 [a-zA-Z0-9] characters followed by .php. Note that the i modifier makes it case insensitive so we don't have to specify A-Z in the character class as well.
Here's a demo on 3v4l.org
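To illustrate the filtering, here is a sketch in Python's re module applied to the filenames from the question. fullmatch() plays the role of the ^ and $ anchors, and the re.ASCII flag is my addition so that the case-insensitive class stays limited to ASCII letters; the variable names are illustrative.

```python
import re

# Exactly 8 letters/digits followed by ".php", case-insensitive.
PHP_FILE = re.compile(r"[a-z0-9]{8}\.php", re.IGNORECASE | re.ASCII)

names = [
    "12121212.php",
    "abcd1234.php",
    "something-catergory.php",
    "AB787C-category.php",
    "has-bookshok.php",
    "something-abcd1234.php",
]

# fullmatch() requires the entire filename to fit the pattern.
matching = [n for n in names if PHP_FILE.fullmatch(n)]
print(matching)  # ['12121212.php', 'abcd1234.php']
```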
I'm having a bit of trouble getting my pattern to validate the string entry correctly. The PHP portion of this assignment is working correctly, so I won't include that here, to make this easier to read. Can someone tell me why this pattern isn't matching what I'm trying to do?
This pattern has these validation requirements:
Should first have 3-6 lowercase letters
This is immediately followed by either a hyphen or a space
Followed by 1-3 digits
$codecheck = '/^([[:lower:]]{3,6}-)|([[:lower:]]{3,6} ?)\d{1,3}$/';
Currently this catches most of the requirements, but it only seems to validate the minimum character requirements, and doesn't return false when more than 6 letters or more than 3 digits are entered.
Thanks in advance for any assistance!
The problem here lies in how you group the alternatives. Right now, the regex matches a string that
^([[:lower:]]{3,6}-) - starts with 3-6 lowercase letters followed by a hyphen
| - or
([[:lower:]]{3,6} ?)\d{1,3}$ - ends with 3-6 lowercase letters followed by an optional space and then 1-3 digits.
In fact, you can get rid of the alternation altogether:
$codecheck = '/^\p{Ll}{3,6}[- ]\d{1,3}$/';
See the regex demo
Explanation:
^ - start of string
\p{Ll}{3,6} - 3-6 lowercase letters
[- ] - a positive character class matching one character, either a hyphen or a space
\d{1,3} - 1-3 digits
$ - end of string
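The breakdown above can be exercised with a few inputs. This is a sketch in Python's re module, which has no \p{Ll}, so [a-z] stands in for it here (an assumption that restricts the check to ASCII lowercase letters); the function name is_valid_code and the sample codes are made up.

```python
import re

# [a-z] is a stand-in for \p{Ll}; fullmatch() supplies the ^ and $ anchors.
CODE = re.compile(r"[a-z]{3,6}[- ]\d{1,3}", re.ASCII)

def is_valid_code(s: str) -> bool:
    """3-6 lowercase letters, then a hyphen or space, then 1-3 digits."""
    return CODE.fullmatch(s) is not None

print(is_valid_code("abc-1"))       # True
print(is_valid_code("abcdef 123"))  # True
print(is_valid_code("abcdefg-1"))   # False: 7 letters
print(is_valid_code("abc-1234"))    # False: 4 digits
print(is_valid_code("abc1"))        # False: separator missing
```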
You need to delimit the scope of the | operator in the middle of your regex.
As it is now:
the right-side argument of that OR runs up to the very end of your regex, including the $. So neither the digits nor the end-of-string condition applies to the left side of the |.
the left-side argument of the OR starts with ^, and only applies to the left side.
That is why you get a match when you supply 7 lowercase letters followed by digits: the first letter is skipped, and the rest matches the right side of the pattern.
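The scope problem can be made visible by printing what each side of the | actually matches. This sketch uses Python's re module with [a-z] standing in for [[:lower:]], which Python does not support; the input strings are invented for illustration.

```python
import re

# The question's pattern, with [a-z] in place of [[:lower:]].
BROKEN = re.compile(r"^([a-z]{3,6}-)|([a-z]{3,6} ?)\d{1,3}$")

# 7 letters: the right-hand alternative is not anchored at the start,
# so the match simply begins at the second letter.
m1 = BROKEN.search("abcdefg 123")
print(m1.group(0))  # bcdefg 123

# The left-hand alternative has no digits and no end anchor,
# so anything may follow the hyphen.
m2 = BROKEN.search("abc-xyz!!!")
print(m2.group(0))  # abc-
```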
I currently have this regular expression:
/(^| )[a-z]{5}-[a-z]{5}( |$)/i
start of string or space, 5 letters, literal dash, 5 letters, space or end of string, case insensitive.
This finds a string that looks like this: pejnd-zxdgn
I need to allow the first letter only to be a digit instead of a letter.
How do I write this?
Edit:
To clarify
should match: pejnd-zxdgn or 7ejnd-zxdgn
Should not match 7pejnd-zxdgn or 7ejn-zxdgn or p7ejnd-zxdgn
Just add a pattern for digit before the [a-z] part. And change the quantifier to {4}:
/(^| )[0-9][a-z]{4}-[0-9][a-z]{4}( |$)/i
After your update, you just want the first character to be a digit or a letter. Also, you can use word boundaries (\b) at the beginning and the end, as noted in the comments. So, change your regex to:
/\b[a-z0-9][a-z]{4}-[a-z]{5}\b/i
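The examples from the question's edit can be run against this pattern. The sketch below uses Python's re module, where \b and the character classes used here work as in PCRE; the function name found is hypothetical.

```python
import re

# First character: letter or digit; then 4 letters, a dash, and 5 letters.
PATTERN = re.compile(r"\b[a-z0-9][a-z]{4}-[a-z]{5}\b", re.IGNORECASE | re.ASCII)

def found(s: str) -> bool:
    """Return True if the pattern matches somewhere in s."""
    return PATTERN.search(s) is not None

for s in ["pejnd-zxdgn", "7ejnd-zxdgn"]:
    print(f"{s!r}: {found(s)}")  # both True

for s in ["7pejnd-zxdgn", "7ejn-zxdgn", "p7ejnd-zxdgn"]:
    print(f"{s!r}: {found(s)}")  # all False
```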
I'm still kinda new to using Regular Expressions, so here's my plight. I have some rules for acceptable usernames and I'm trying to make an expression for them.
Here they are:
1-15 Characters
a-z, A-Z, 0-9, and spaces are acceptable
Must begin with a-z or A-Z
Cannot end in a space
Cannot contain two spaces in a row
This is as far as I've gotten with it.
/^[a-zA-Z]{1}([a-zA-Z0-9]|\s(?!\s)){0,14}[^\s]$/
It works, for the most part, but doesn't match a single character such as "a".
Can anyone help me out here? I'm using PCRE in PHP if that makes any difference.
Try this:
/^(?=.{1,15}$)[a-zA-Z][a-zA-Z0-9]*(?: [a-zA-Z0-9]+)*$/
The look-ahead assertion (?=.{1,15}$) checks the length and the rest checks the structure:
[a-zA-Z] ensures that the first character is an alphabetic character;
[a-zA-Z0-9]* allows any number of following alphanumeric characters;
(?: [a-zA-Z0-9]+)* allows any number of sequences of a single space (not \s that allows any whitespace character) that must be followed by at least one alphanumeric character (see PCRE subpatterns for the syntax of (?:…)).
You could also remove the look-ahead assertion and check the length with strlen.
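Both variants, the look-ahead and the separate length check, can be compared side by side. This is a sketch in Python's re module rather than PHP's PCRE functions, with len() standing in for strlen(); the function names and sample usernames are illustrative only.

```python
import re

# Variant 1: length enforced inside the regex by the look-ahead.
WITH_LOOKAHEAD = re.compile(r"^(?=.{1,15}$)[a-zA-Z][a-zA-Z0-9]*(?: [a-zA-Z0-9]+)*$")

# Variant 2: structure only; the length is checked separately.
STRUCTURE = re.compile(r"[a-zA-Z][a-zA-Z0-9]*(?: [a-zA-Z0-9]+)*")

def valid_v1(name: str) -> bool:
    return WITH_LOOKAHEAD.fullmatch(name) is not None

def valid_v2(name: str) -> bool:
    return 1 <= len(name) <= 15 and STRUCTURE.fullmatch(name) is not None

samples = ["a", "John Smith 42", "John  Smith", "John ", "1john", "a" * 15, "a" * 16]
for name in samples:
    print(f"{name!r}: {valid_v1(name)} {valid_v2(name)}")
```

Both functions should agree on every sample: 'a', 'John Smith 42', and the 15-character name are accepted, while the double space, the trailing space, the leading digit, and the 16-character name are rejected.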
Make everything after your first character optional:
^[a-zA-Z](([a-zA-Z0-9]|\s(?!\s)){0,13}[^\s])?$
The main problem with your regexp is that it needs at least two characters to have a match:
one for the [a-zA-Z]{1} part
one for the [^\s] part
Besides this problem, I see some parts of your regexp that could be improved:
The [^\s] class will match any character except whitespace, so a dot or semicolon will be accepted; try using the [a-zA-Z0-9] class here to ensure the character is a valid one.
You can delete the {1} at the beginning, as the regexp will match exactly one character by default.