Regex for alphanumeric characters plus brackets and spaces - php

I am trying to build a regex to match these strings:
jfldfldf ldjfdlf ldfl
ldfldf 8998 dfjldjf 89dfdf dfdf899
ljdljf [dff]dfdf (fdfdf) 898
Requirements:
The string should start only with a letter, lowercase or uppercase (a-z, A-Z)
It may contain spaces or brackets (( ) [ ])
Any other special characters are not allowed
I tried /^[a-zA-Z]+[\sa-zA-Z0-9\[\]\(\)].+/m, but it is still accepting other special characters.

So close.
/^[a-zA-Z]+[\sa-zA-Z0-9\[\]\(\)].+/m
          ^                     ^ ^-- missing $
          ^                     ^-- delete this dot
          ^-- you could also delete this plus, but that's not as important

/^[a-zA-Z]{1}[a-zA-Z0-9\ \[\]\(\)]+$/m
\s allows any whitespace (spaces, tabs and newlines), so this should probably be "\ " (an escaped literal space).
Because only the first character has to be an uppercase or lowercase letter, strictly it is {1}, as + means one or more.
A $ is needed at the end to mark the end of the line, so that nothing else can follow it.

The biggest problem in that regex is the single '.', which is a wildcard matching any character except a newline. The + after the first class is not needed, the second class needs a * quantifier so that the rest of the string can be any length, and the end-of-string anchor '$' is missing.
/^[a-zA-Z][\sa-zA-Z0-9\[\]\(\)]*$/m
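A quick check of the corrected pattern against the sample strings from the question, as a minimal sketch in JavaScript. The question is about PHP, but for single-line input without a trailing newline this pattern should behave the same way in both engines. The helper name isValid and the two rejected inputs are made up for illustration, and a literal space is used instead of \s, as one of the answers suggests.

```javascript
// Hypothetical helper: one letter first, then letters, digits, spaces or brackets.
const pattern = /^[a-zA-Z][a-zA-Z0-9 \[\]()]*$/;

function isValid(s) {
  return pattern.test(s);
}

const samples = [
  "jfldfldf ldjfdlf ldfl",
  "ldfldf 8998 dfjldjf 89dfdf dfdf899",
  "ljdljf [dff]dfdf (fdfdf) 898",
  "8 starts with a digit", // rejected: first character is not a letter
  "no special chars!",     // rejected: "!" is not in the allowed set
];

for (const s of samples) {
  console.log(JSON.stringify(s), "=>", isValid(s));
}
```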

Related

Regular expression alphanumeric with dash and underscore and space, but not at the beginning or at the end of the string [duplicate]

I want to design an expression for not allowing whitespace at the beginning and at the end of a string, but allowing in the middle of the string.
The regex I've tried is this:
\^[^\s][a-z\sA-Z\s0-9\s-()][^\s$]\
This should work:
^[^\s]+(\s+[^\s]+)*$
If you want to include character restrictions:
^[-a-zA-Z0-9-()]+(\s+[-a-zA-Z0-9-()]+)*$
Explanation:
the starting ^ and ending $ denote the start and end of the string.
considering the first regex I gave, [^\s]+ means at least one not whitespace and \s+ means at least one white space. Note also that parentheses () groups together the second and third fragments and * at the end means zero or more of this group.
So, if you take a look, the expression is: begins with at least one non whitespace and ends with any number of groups of at least one whitespace followed by at least one non whitespace.
For example if the input is 'A' then it matches, because it matches with the begins with at least one non whitespace condition. The input 'AA' matches for the same reason. The input 'A A' matches also because the first A matches for the at least one not whitespace condition, then the ' A' matches for the any number of groups of at least one whitespace followed by at least one non whitespace.
' A' does not match because the "begins with at least one non whitespace" condition is not satisfied. 'A ' does not match because the "ends with any number of groups of at least one whitespace followed by at least one non whitespace" condition is not satisfied.
If you want to restrict which characters to accept at the beginning and end, see the second regex. I have allowed a-z, A-Z, 0-9, the hyphen and () at the beginning and end. Only these are allowed.
Regex playground: http://www.regexr.com/
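The five inputs walked through above can be checked directly. This is only a small sketch in JavaScript; the variable names are made up for illustration.

```javascript
// Pattern from the answer: runs of non-whitespace separated by runs of whitespace.
const noEdgeSpace = /^[^\s]+(\s+[^\s]+)*$/;

const cases = ["A", "AA", "A A", " A", "A "];
const results = cases.map((s) => noEdgeSpace.test(s));

// Print each input next to its result.
cases.forEach((s, i) => console.log(JSON.stringify(s), "=>", results[i]));
```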
This RegEx will allow neither white-space at the beginning nor at the end of your string/word.
^[^\s].+[^\s]$
Any string of at least three characters that doesn't begin or end with whitespace will be matched.
Explanation:
^ denotes the beginning of the string.
\s denotes white-spaces and so [^\s] denotes NOT white-space. You could alternatively use \S to denote the same.
. denotes any character except a line break.
+ is a quantifier which denotes one or more times. That means the character which + follows can be repeated one or more times.
You can use this as RegEx cheat sheet.
In cases when you have a specific pattern, say, ^[a-zA-Z0-9\s()-]+$, that you want to adjust so that spaces at the start and end were not allowed, you may use lookaheads anchored at the pattern start:
^(?!\s)(?![\s\S]*\s$)[a-zA-Z0-9\s()-]+$
^^^^^^^^^^^^^^^^^^^^
Here,
(?!\s) - a negative lookahead that fails the match if (since it is after ^) immediately at the start of string there is a whitespace char
(?![\s\S]*\s$) - a negative lookahead that fails the match if, (since it is also executed after ^, the previous pattern is a lookaround that is not a consuming pattern) immediately at the start of string, there are any 0+ chars as many as possible ([\s\S]*, equal to [^]*) followed with a whitespace char at the end of string ($).
In JS, you may use the following equivalent regex declarations:
var regex = /^(?!\s)(?![\s\S]*\s$)[a-zA-Z0-9\s()-]+$/
var regex = /^(?!\s)(?![^]*\s$)[a-zA-Z0-9\s()-]+$/
var regex = new RegExp("^(?!\\s)(?![^]*\\s$)[a-zA-Z0-9\\s()-]+$")
var regex = new RegExp(String.raw`^(?!\s)(?![^]*\s$)[a-zA-Z0-9\s()-]+$`)
If you know there are no linebreaks, [\s\S] and [^] may be replaced with .:
var regex = /^(?!\s)(?!.*\s$)[a-zA-Z0-9\s()-]+$/
See the regex demo.
JS demo:
var strs = ['a b c', ' a b b', 'a b c '];
var regex = /^(?!\s)(?![\s\S]*\s$)[a-zA-Z0-9\s()-]+$/;
for (var i=0; i<strs.length; i++){
console.log('"',strs[i], '"=>', regex.test(strs[i]))
}
If the string must be at least 1 character long, if newlines are allowed in the middle together with any other characters, and the first and last character can really be anything except whitespace (including @#$!...), then you are looking for:
^\S$|^\S[\s\S]*\S$
explanation and unit tests: https://regex101.com/r/uT8zU0
This worked for me:
^[^\s].+[a-zA-Z]+[a-zA-Z]+$
Hope it helps.
How about:
^\S.+\S$
This will match any string that doesn't begin or end with any kind of space.
^[^\s].+[^\s]$
That's it! It allows any string of any characters (apart from \n) without whitespace at the beginning or end. If you want to allow \n in the middle, use the s (dotall) option, or replace .+ with [\s\S]+ (a . inside a character class is a literal dot, so [.\n]+ would not work).
pattern="^[^\s]+[-a-zA-Z\s]+([-a-zA-Z]+)*$"
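This will help you accept only letters and hyphens after the first run of characters, and won't allow whitespace at the start. Note that a trailing space is still accepted, because the final group is optional and [-a-zA-Z\s]+ can end on whitespace.
This is the regex for no whitespace at the beginning or at the end, but only one in between. It also works without a 3-character limit: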
\^([^\s]*[A-Za-z0-9]\s{0,1})[^\s]*$\ - just remove {0,1} and add * in order to have limitless space between.
As a modification of @Aprillion's answer, I prefer:
^\S$|^\S[ \S]*\S$
It will not match a space at the beginning, end, or both.
It matches any number of spaces between a non-whitespace character at the beginning and end of a string.
It also matches only a single non-whitespace character (unlike many of the answers here).
It will not match any newline (\n), \r, \t, \f, nor \v in the string (unlike Aprillion's answer). I realize this isn't explicit to the question, but it's a useful distinction.
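To make the difference from the ^[^\s].+[^\s]$ style answers concrete, here is a small sketch in JavaScript comparing the two patterns. The names threePlus, anyLength and compare are made up for illustration.

```javascript
const threePlus = /^[^\s].+[^\s]$/;    // needs at least three characters
const anyLength = /^\S$|^\S[ \S]*\S$/; // one character, or two and more

// Hypothetical helper returning the verdict of both patterns for one input.
function compare(s) {
  return { threePlus: threePlus.test(s), anyLength: anyLength.test(s) };
}

for (const s of ["a", "ab", "a b", " ab", "ab ", "a\tb"]) {
  console.log(JSON.stringify(s), compare(s));
}
```

The one- and two-character inputs are only accepted by the second pattern, while the tab in the middle is only accepted by the first, since . matches a tab but [ \S] does not.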
Letters and numbers divided only by one space. Also, no spaces allowed at beginning and end.
/^[a-z0-9]+( [a-z0-9]+)*$/gi
I found a reliable way to do this is just to specify what you do want to allow for the first character and check the other characters as normal e.g. in JavaScript:
RegExp("^[a-zA-Z][a-zA-Z- ]*$")
So that expression accepts only a single letter at the start, and then any number of letters, hyphens or spaces thereafter.
Use /^[^\s].([A-Za-z]+\s)*[A-Za-z]+$/. It only accepts one space between words and no spaces at the beginning or end.
If we do not need a specific set of valid characters (that is, we accept characters from any language) and just want to prevent spaces at the start and end, the simplest pattern is this:
/^(?! ).*[^ ]$/
Try on HTML Input:
input:invalid {box-shadow:0 0 0 4px red}
/* Note: ^ and $ removed from pattern. Because HTML Input already use the pattern from First to End by itself. */
<input pattern="(?! ).*[^ ]">
Explanation
^ Start of string
(?!...) (Negative lookahead) Not followed by ... (applies to the next set)
Just Space / \s (Space & Tabs & Next line chars)
(?! ) Do not accept a space at the start of the next set (.*)
. Any character (except \n\r line breaks)
* Zero or more (length of the set)
[^ ] Set/class of any character except space
$ End of string
Try it live: https://regexr.com/6e1o4
^[^0-9 ]{1}([a-zA-Z]+\s{1})+[a-zA-Z]+$
- for no more than one whitespace character in between, and no spaces at the start or end.
^[^0-9 ]{1}([a-zA-Z ])+[a-zA-Z]+$
- for more than one whitespace character in between, and no spaces at the start or end.
Other answers introduce a limit on the length of the match. This can be avoided using Negative lookaheads and lookbehinds:
^(?!\s)([a-zA-Z0-9\s])*?(?<!\s)$
This starts by checking that the first character is not whitespace ^(?!\s). It then captures the characters you want a-zA-Z0-9\s non greedily (*?), and ends by checking that the character before $ (end of string/line) is not \s.
Check that lookaheads/lookbehinds are supported in your platform/browser.
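A short sketch of this pattern in JavaScript, assuming an engine with lookbehind support (for example a recent Node.js or browser). The name edgeTrimmed is made up for illustration. Note that the empty string also matches, because neither lookaround needs a character to be present, so a separate length check may be needed.

```javascript
// Lookahead guards the first character, lookbehind guards the last one.
const edgeTrimmed = /^(?!\s)([a-zA-Z0-9\s])*?(?<!\s)$/;

const inputs = ["a", "a b c", " a b", "a b ", "a-b", ""];
const verdicts = inputs.map((s) => edgeTrimmed.test(s));

inputs.forEach((s, i) => console.log(JSON.stringify(s), "=>", verdicts[i]));
```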
Here you go,
\b^[^\s][a-zA-Z0-9]*\s+[a-zA-Z0-9]*\b
\b refers to word boundary
\s+ means allowing white-space one or more at the middle.
(^(\s)+|(\s)+$)
This expression will match the leading and trailing whitespace of the text.

PHP RegEx Remove words from string which contain non-letters/numbers

Could anyone please help me with this regular expression, as I'm not sure how to implement it.
I need a regex for removing all words from a string which contain at least one character which is not a UTF-8 letter or number, or punctuation in the middle of the word (but not at the end).
Examples:
This is ®Aix string
A bad str?ng is here
The first example contains ®, which is not a letter, number or punctuation.
The second example contains punctuation in the middle.
I need to remove these bad words, but keep the rest of the string intact. E.g. This is string, A bad is here.
Please note that A bad string? is here would not contain any bad words, as the punctuation is at the end of the word.
Thank you in advance for your help.
How about this:
$result = preg_replace(
    '/\b                 # Start of word
    [\p{L}\p{N}]+        # One or more Unicode letters or digits
    [^\s\p{L}\p{N}]      # One character that is not a letter, digit or whitespace, followed by
    [^\s\p{P}]+          # at least one non-whitespace, non-punctuation character
    \b                   # End of word
    \s*                  # Optional following whitespace
    /xu',
    '', $subject);

Match only alpha characters and whitespace

I have this regex:
/[^a-z\s]/i
This is supposed to match any character from a-z and A-Z and any whitespace encountered. It works for characters, but not for spaces. Why?
I'm checking in php like this :
if (preg_match('/[^a-z\s]/i', $username)) {
...
}
I'm checking to see if the username contains any other character than letters ( a-z,A-Z ) or than space.
Your regex should look like this:
/^[a-z\s]+$/i
if (preg_match('/^[a-z\s]+$/i', $username)) {
//the username is ok.
}
/[^a-z\s]/i will only match characters that aren't in the case-insensitive set a-z and space. Try removing the ^, which negates the characters inside your brackets. The pattern to match all letters and spaces should read:
/[a-z\s]/i
Note that \s won't just match spaces. It will match any whitespace character (like tabs and newlines) as well.
If you want to force matches to begin with a letter or space, you must move the ^ outside of the brackets like so:
/^[a-z\s]/i
Finally, if you're trying to match strings that begin with one or more occurrences of letters and spaces, you need to add the + quantifier. Otherwise it will only match a single character:
/^[a-z\s]+/i
because the ^ character is an anchor and you've placed it incorrectly...if you use ^ and $ for the start and end string markers they need to appear at the absolute beginning and end respectively.
So it sounds like you'd want:
^[a-zA-Z\s]$
or if you want to match multiples of alpha and/or spaces then:
^[a-zA-Z\s]*$
Works for me! Perhaps you should include a fully reproducible example, but it picks up spaces for me.
You can also rewrite this regex to do the opposite, which is a bit more obvious for me personally:
/^[a-z\s]*$/i
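The two approaches discussed in this thread answer opposite questions: the negated class finds a forbidden character, while the anchored pattern confirms that every character is allowed. A minimal sketch in JavaScript follows (the question uses PHP's preg_match, but these two patterns read the same way there). The names hasForbidden, onlyAllowed and check are made up for illustration.

```javascript
const hasForbidden = /[^a-z\s]/i;  // true when some character is NOT a letter or whitespace
const onlyAllowed = /^[a-z\s]+$/i; // true when EVERY character is a letter or whitespace

// Hypothetical helper returning both verdicts for one username.
function check(username) {
  return {
    hasForbidden: hasForbidden.test(username),
    onlyAllowed: onlyAllowed.test(username),
  };
}

for (const name of ["john smith", "John Smith", "john_smith", ""]) {
  console.log(JSON.stringify(name), check(name));
}
```

The empty string is the one input where the two are not exact opposites: it contains no forbidden character, but the + quantifier still requires at least one allowed character.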

Regular expression any character but a white space

I'm creating a password validator which takes any character but whitespaces and with at least 6 characters.
After searching, the best I came up with is this example:
What is the regular expression for matching that contains no white space in between text?
It disallows any spaces in between but does allow starting and ending with a space. I want to disallow any space in the string passed.
I tried this but it doesn't work:
if (preg_match("/^[^\s]+[\S+][^\s]{6}$/", $string)) {
return true;
} else {
return false;
}
Thanks.
Something like this:
/^\S{6,}\z/
Can be quoted like:
preg_match('/^\S{6,}\z/', $string)
All answers using $ are wrong (at least without any special flags). You should use \z instead of $ if you do not want to allow a line break at the end of the string.
$ matches end of string or before a line break at end of string (if no modifiers are used)
\z matches end of string (independent of multiline mode)
From http://www.pcre.org/pcre.txt:
^    start of subject
     also after internal newline in multiline mode
\A   start of subject
$    end of subject
     also before newline at end of subject
     also before internal newline in multiline mode
\Z   end of subject
     also before newline at end of subject
\z   end of subject
The simplest expression:
^\S{6,}$
^ means the start of the string
\S matches any non-whitespace character
{6,} means 6 or more
$ means the end of the string
In PHP, that would look like
preg_match('/^\S{6,}$/', $string)
Edit:
>> preg_match('/^\S{6,}$/', "abcdef\n")
1
>> preg_match('/^\S{6,}\z/', "abcdef\n")
0
>> preg_match('/^\S{6,}$/D', "abcdef\n")
0
Qtax is right. Good call! Although if you're taking input from an HTML <input type="text"> you probably won't have any newlines in it.
I think you should be fine using the following, which would match any string of at least 1 character with no whitespace:
^[^\s]+$
You can see the test here: http://regexr.com?2ua2e.
Try this. This will match at least 6 non whitespace characters followed by any number of additional non whitespace characters.
^[^\s]{6}[^\s]*$
\S - Matches any non-white-space character. Equivalent to the Unicode character categories [^\f\n\r\t\v\x85\p{Z}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \S is equivalent to [^ \f\n\r\t\v].
The start of string you can do : ^[ \t]+, and for end : [ \t]+$ (tab and spaces)
ETA:
By the way, in your regex you wrote [\S+]; I think you're looking for [\S]+

Regex to match numbers, # # % signs

I am trying to write a regex that matches all numbers (0-9) and # # % signs.
I have tried ^[0-9#%#]$ , it doesn't work.
I want it to match, for example: 1234345, 2323, 1, 3#, %#, 9, 23743, #####, or whatever...
There must be something missing?
Thank you
You're almost right... All you're missing is something to tell the regular expression there may be more than one of those characters, like a * (0 or more) or a + (1 or more).
^[0-9#%#]+$
The ^ and $ are used to indicate the start and end of a string, respectively. Make sure that your string only contains those characters, otherwise it won't work (e.g. "The number is 89#1" wouldn't work because the string begins with something other than 0-9, #, %, or #).
Your pattern ^[0-9#%#]$ only matches strings that are one character long. The [] construct matches a single character, and the ^ and $ anchors mean that nothing can come before or after the character matched by the [].
If you just want to know if the string has one of those characters in it, then [0-9#%#] will do that. If you want to match a string that must have at least one character in it, then use ^[0-9#%#]+$. The "+" means to match one or more of the preceding item. If you also want to match empty strings, then use ^[0-9#%#]*$. The "*" means to match zero or more of the preceding item.
It should be /^[0-9#%#]+$/. The + is a quantifier that means "one or more of the preceding".
The problem with your current regex is that it will only match one character that could either be a number or #, %, or #. This is because the ^ and $ characters match the beginning and the end of the line respectively. By adding the + quantifier, you are saying that you want to match one or more of the preceding character class, and that the entire line consists of one or more of the characters in the specified character class.
remove the caret (^), it is used to match from the start of the string.
You forgot "+"
^[0-9#%#]+$ must work
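The effect of the missing + can be seen by running both patterns side by side. This is only a sketch in JavaScript, with the character class copied exactly as it appears in this thread. The names single, oneOrMore and compareBoth are made up for illustration.

```javascript
const single = /^[0-9#%#]$/;     // the original attempt: exactly one character
const oneOrMore = /^[0-9#%#]+$/; // with the + quantifier: one or more characters

// Hypothetical helper returning the verdict of both patterns for one input.
function compareBoth(s) {
  return { single: single.test(s), oneOrMore: oneOrMore.test(s) };
}

for (const s of ["9", "1234345", "3#", "%#", "The number is 89#1", ""]) {
  console.log(JSON.stringify(s), compareBoth(s));
}
```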
