/(?![a-z]+:)/
Does anyone know what this regex does?
The / characters are delimiters.
?! is negative lookahead.
[a-z] is a character class (any character in the a-z range)
+ is one-or-more times of the preceding pattern ([a-z] in this case)
: is just the colon literal
It roughly means "look ahead and make sure there are no alpha characters followed by a colon".
This regex would make more sense if it had a start-of-string anchor: /^(?![a-z]+:)/, so it wouldn't match abc: (as one of the other answers says), but without the ^ I don't know how useful this is.
According to RegexBuddy (a product I highly recommend):
Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?![a-z]+:)»
Match a single character in the range between “a” and “z” «[a-z]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “:” literally «:»
(?!REGEX) is the syntax for negative lookahead. Check the link for an explanation of lookaheads.
The regex fails at the current position if the pattern [a-z]+: can be matched starting there. If the pattern is not found, the regex succeeds, but it doesn't consume any characters.
With a start-of-string anchor (^), it would match 123: or abc but not abc:
Without an anchor, it would match at the position just before the : in abc: (an empty match).
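The difference between the anchored and unanchored forms can be checked with a short sketch. The helper name firstMatchOffset is hypothetical and only exists for this illustration; it reports the offset of the first (possibly empty) match.

```php
<?php
// Hypothetical helper: offset of the first match, or null if there is none.
function firstMatchOffset(string $pattern, string $subject): ?int
{
    if (preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE) === 1) {
        return $m[0][1]; // offset of the (possibly empty) match
    }
    return null;
}

// Unanchored: the engine moves forward until the lookahead succeeds,
// which happens just before the ":" in "abc:".
var_dump(firstMatchOffset('/(?![a-z]+:)/', 'abc:'));  // int(3)

// Anchored: only position 0 is tried.
var_dump(firstMatchOffset('/^(?![a-z]+:)/', 'abc:')); // NULL
var_dump(firstMatchOffset('/^(?![a-z]+:)/', '123:')); // int(0)
var_dump(firstMatchOffset('/^(?![a-z]+:)/', 'abc'));  // int(0)
```

The match is always empty because a lookahead consumes nothing; only its position carries information.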
I am trying to write a regular expression in PHP to ensure a password matches the following criteria:
It should be at least 8 characters long
It should include at least one special character
It should include at least one capital letter.
I have written the following expression:
$pattern=([a-zA-Z\W+0-9]{8,})
However, it doesn't seem to work as per the listed criteria. Could I get another pair of eyes to aid me please?
Your regex, ([a-zA-Z\W+0-9]{8,}), searches a larger text for a substring of at least 8 characters made up of English letters, non-word characters (anything other than [a-zA-Z0-9_]), and digits in any combination, so it does not enforce two of your requirements. Those can be enforced with lookaheads.
Here is a fixed regex:
^(?=.*\W.*)(?=.*[A-Z].*).{8,}$
Actually, you can replace [A-Z] with \p{Lu} if you also want to match/allow non-English letters. You can also consider using \p{S} instead of \W, or make your criterion for a special character more precise by adding symbols or character classes, e.g. [\p{P}\p{S}] (this will also include all Unicode punctuation).
An enhanced regex version:
^(?=.*[\p{S}\p{P}].*)(?=.*\p{Lu}.*).{8,}$
A human-readable explanation:
^ - Beginning of a string
(?=.*\W.*) - Requirement to have at least 1 non-word character
OR (?=.*[\p{S}\p{P}].*) - At least 1 Unicode special or punctuation symbol
(?=.*[A-Z].*) - Requirement to have at least 1 uppercase English letter
OR (?=.*\p{Lu}.*) - At least 1 Unicode uppercase letter
.{8,} - Requirement to have at least 8 symbols
$ - End of string
See Demo 1 and Demo 2 (Enhanced regex)
Sample code:
if (preg_match('/^(?=.*\W.*)(?=.*[A-Z].*).{8,}$/u', $header)) {
    // PASS
} else {
    // FAIL
}
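As a minimal sketch of how the lookahead pattern above might be wrapped for reuse, assuming a hypothetical function name meetsPasswordPolicy (it is not part of the answer). The trailing .* inside each lookahead is dropped here, which does not change what is accepted.

```php
<?php
// Hypothetical wrapper around the lookahead-based pattern.
function meetsPasswordPolicy(string $password): bool
{
    // (?=.*\W)     at least one non-word character
    // (?=.*[A-Z])  at least one uppercase ASCII letter
    // .{8,}        at least 8 characters in total
    return preg_match('/^(?=.*\W)(?=.*[A-Z]).{8,}$/u', $password) === 1;
}

var_dump(meetsPasswordPolicy('Passw0rd!')); // bool(true)
var_dump(meetsPasswordPolicy('Passw_rd'));  // bool(false): "_" is a word character
```

Comparing against 1 rather than relying on truthiness keeps a preg_match error (which returns false) from being mistaken for a plain non-match elsewhere in the code.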
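Positive lookaheads (?=...) are used to check the password requirements. Note that the lookaheads in this pattern are chained rather than all applied at the start of the string, so the capital letter has to appear before the special character, and only the first 8 characters are checked for whitespace.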
Requirements for strong password:
At least 8 chars long
At least 1 Capital Letter
At least 1 Special Character
Regex:
^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$
PHP implementation:
if (preg_match('/^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$/u', $password)) {
# Strong Password
} else {
# Weak Password
}
Examples:
12345678 - WEAK
1234%fff - WEAK
1234_44A - WEAK
133333A$ - STRONG
Regex Explanation:
^ assert position at start of the string
1st Capturing group ((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))
(?=[\S]{8}) Positive Lookahead - Assert that the regex below can be matched
[\S]{8} match a single character present in the list below
Quantifier: {8} Exactly 8 times
\S match any kind of visible character [\P{Z}\H\V]
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[A-Z]{1}) Positive Lookahead - Assert that the regex below can be matched
[A-Z]{1} match a single character present in the list below
Quantifier: {1} Exactly 1 time (meaningless quantifier)
A-Z a single character in the range between A and Z (case sensitive)
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
(?=[\p{S}]) Positive Lookahead - Assert that the regex below can be matched
[\p{S}] match a single character present in the list below
\p{S} matches math symbols, currency signs, dingbats, box-drawing characters, etc
(?:.*) Non-capturing group
.* matches any character (except newline) [unicode]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
$ assert position at end of the string
u modifier: unicode: In PHP, pattern and subject strings are treated as UTF-8. Also causes escape sequences to match unicode characters
Demo:
https://regex101.com/r/hE2dD2/1
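Because the lookaheads in this pattern are separated by (?:.*), each one is tested at a later position than the previous one, so the order of the characters in the password matters. The sketch below compares it with an assumed alternative (not from the answer) in which every lookahead starts at the beginning of the string; the constant and function names are hypothetical, and the alternative also requires every character, not just the first 8, to be non-whitespace.

```php
<?php
// Pattern from the answer above, unchanged.
const CHAINED = '/^((?=[\S]{8})(?:.*)(?=[A-Z]{1})(?:.*)(?=[\p{S}])(?:.*))$/u';

// Assumed alternative: each lookahead scans the whole string from position 0.
const INDEPENDENT = '/^(?=.*[A-Z])(?=.*\p{S})\S{8,}$/u';

// Hypothetical helper that applies one of the patterns.
function isStrong(string $pattern, string $password): bool
{
    return preg_match($pattern, $password) === 1;
}

var_dump(isStrong(CHAINED, '133333A$'));     // bool(true)
var_dump(isStrong(CHAINED, '$A133333'));     // bool(false): symbol comes before the capital
var_dump(isStrong(INDEPENDENT, '$A133333')); // bool(true)
```

Both patterns use \p{S}, which covers symbols such as $ but not punctuation such as ! or %, so whether that class fits the intended definition of "special character" is a separate decision.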
In my PHP project, I have some (PCRE) regexes like this one:
preg_match('/[\s^0-9]{0,1}([0-9]{2})[\s^0-9]{0,1}/',$chanson['nom'],$resultPreg1)
This regex catches two digits that may or may not be delimited by a single space, and that cannot be delimited by a digit. What I want is for there to be either a space (and no digit) at the beginning, or a space (and no digit) at the end; there must be at least one delimiter.
How can I do this?
You simply need to split it up and test each case:
/\s\d{2}\D|\D\d{2}\s/
This will match a space, two digits, and any non-digit character or a non-digit character, two digits and a space.
Note: \d is a digit, equivalent to [0-9]. \D is a non-digit, equivalent to [^0-9].
The above regex requires there to be at least one non-digit on each side of the numbers, however. Also, if you had a pattern like .11 22., it would not match both numbers, because the space would be eaten up by the first match. If this is a problem, you can use look-arounds:
/\s\d{2}(?!\d)|(?<!\d)\d{2}\s/
This matches a space, then two digits not followed by another digit or two digits not preceded by a digit, followed by a space.
(?!...) is negative look-ahead. It means "the match cannot be followed by this."
(?<!...) is negative look-behind, meaning "the match cannot be preceded by this."
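A small sketch can show how the two patterns above behave on a few inputs. The helper allMatches and the constant names are hypothetical. One caveat: the look-around pattern still consumes the space itself, so on .11 22. it finds only the first number; the third pattern is an assumed variation (not from the answer) that turns the space into a look-around as well.

```php
<?php
// Hypothetical helper: every full match of $pattern in $subject.
function allMatches(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $m);
    return $m[0];
}

// The two patterns from the answer above.
const CONSUMING  = '/\s\d{2}\D|\D\d{2}\s/';
const LOOKAROUND = '/\s\d{2}(?!\d)|(?<!\d)\d{2}\s/';
// Assumed variation: the space is asserted, not consumed.
const ZERO_WIDTH = '/(?<=\s)\d{2}(?!\d)|(?<!\d)\d{2}(?=\s)/';

print_r(allMatches(CONSUMING, '11 '));      // no match: nothing precedes "11"
print_r(allMatches(LOOKAROUND, '11 '));     // one match: "11 "
print_r(allMatches(LOOKAROUND, '.11 22.')); // one match: "11 " (the space is used up)
print_r(allMatches(ZERO_WIDTH, '.11 22.')); // two matches: "11" and "22"
```

With the zero-width variation the matches contain only the digits, so the capture group from the original question is no longer needed to isolate them.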
You can't mix the negative and positive character classes in a single set of square brackets. A "space" OR "not a number" could be written \s|[^0-9]. But a space isn't a number, so no need to put it in specially, just [^0-9] will suffice for you. Your syntax for "zero or one" of {0,1} is technically correct, but there is a much more concise syntax for the same thing: ?.
preg_match('/[^0-9]?([0-9]{2})[^0-9]?/',$chanson['nom'],$resultPreg1)
You could almost use word boundaries (\b) around your number to get what you are looking for, except that it wouldn't find numbers embedded in letters, like "abc23def".
preg_match('/\b([0-9]{2})\b/',$chanson['nom'],$resultPreg1)
How would I write a preg_match() call in PHP to pick out the 250 value? I have a large string of HTML code that I want to pick the 250 out of, and I can't seem to get the regular expression right.
This is the html pattern I want to match - note that I want to extract the integer where the 250 is:
<span class="price-ld">H$250</span>
I have been trying for hours to do this and I can't get it to work lol
// \$ matches a literal dollar sign; an unescaped $ would be an end-of-string anchor
preg_match('/<span class="price-ld">H\$(\d+)<\/span>/i', $your_html, $matches);
print "It's ".$matches[1]." USD";
The regex actually depends on your code. Where are you exactly searching for?
This is the regex you're looking for:
(?<=<span class="price-ld">H\$)\d+(?=</span>)
You can see the results here.
And here's the explanation:
Options: case insensitive; ^ and $ match at line breaks
Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=<span class="price-ld">H\$)»
Match the characters “<span class="price-ld">H” literally «<span class="price-ld">H»
Match the character “$” literally «\$»
Match a single digit 0..9 «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=</span>)»
Match the characters “</span>” literally «</span>»
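In PHP, the look-around version could be used roughly as follows. This is a sketch: the function name extractPrice is hypothetical, and ~ is chosen as the delimiter so that the / in </span> needs no escaping.

```php
<?php
// Hypothetical helper: returns the digits after "H$" inside the price span, or null.
function extractPrice(string $html): ?string
{
    // The lookbehind and lookahead assert the surrounding markup without
    // including it in the match, so $m[0] holds only the digits.
    if (preg_match('~(?<=<span class="price-ld">H\$)\d+(?=</span>)~i', $html, $m) === 1) {
        return $m[0];
    }
    return null;
}

var_dump(extractPrice('<p><span class="price-ld">H$250</span></p>')); // string(3) "250"
```

The pattern is tied to this exact markup (attribute order, quoting, no extra whitespace); for HTML that varies, a DOM parser is usually the more robust choice.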
I would like to match the whole "word"—one that starts with a number character and that may include special characters but does not end with a '%'.
Match these:
112 (whole numbers)
10-12 (ranges)
11/2 (fractions)
11.2 (decimal numbers)
1,200 (thousand separator)
but not
12% (percentages)
A38 (words starting with an alphabetic character)
I've tried these regular expressions:
(\b\p{N}\S*)
but that returns '12%' in '12%'
(\b\p{N}(?:(?!%)\S)*)
but that returns '12' in '12%'
Can I make an exception to the \S term that disregards %?
Or will I have to do something else?
I'll be using it in PHP, but just write as you would like and I'll convert it to PHP.
This matches your specification:
\b\p{N}\S*+(?<!%)
Explanation:
\b # Start of number
\p{N} # One Digit
\S*+ # Any number of non-space characters, match possessively
(?<!%) # Last character must not be a %
The possessive quantifier \S*+ makes sure that the regex engine will not backtrack into a string of non-space characters it has already matched. Therefore, it will not "give back" a % to match 12 within 12%.
Of course, that will also match 1!abc, so you might want to be more specific than \S which matches anything that's not a whitespace character.
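The effect of the possessive quantifier can be seen by running the pattern with and without it. This is a sketch; the function name numericWords and its $possessive parameter are hypothetical.

```php
<?php
// Hypothetical helper: finds "words" that start with a digit and do not end in "%".
// With $possessive = false the plain greedy \S* is used for comparison.
function numericWords(string $text, bool $possessive = true): array
{
    $pattern = $possessive
        ? '/\b\p{N}\S*+(?<!%)/u'  // cannot give back the "%"
        : '/\b\p{N}\S*(?<!%)/u';  // backtracks and returns "12" for "12%"
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

print_r(numericWords('112 10-12 11/2 11.2 1,200 12% A38'));
// matches: 112, 10-12, 11/2, 11.2, 1,200

print_r(numericWords('12%', false));
// matches: 12
```

The u modifier is included because \p{N} is a Unicode property and the subject may contain non-ASCII digits.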
Can I make an exception to the \S term that disregards %?
Yes you can:
[^%\s]
See this expression \b\d[^%\s]* here on Regexr
\d+([-/\.,]\d+)?(?!%)
Explanation:
\d+ one or more digits
(
[-/\.,] one "-", "/", "." or ","
\d+ one or more digits
)? the group above zero or one times
(?!%) not followed by a "%" (negative lookahead); because \d+ can backtrack, 12% still yields the partial match 1
KISS (restrictive):
/[0-9][0-9.,-/]*\s/
Try this one:
preg_match("/^[0-9].*[^%]$/", $string);
Try this PCRE regex:
/^(\d[^%]+)$/
It should give you what you need.
I would suggest just:
(\b[\p{N},.-]++(?!%))
That's not very exact regarding decimal delimiters or ranges (just as an example), but the ++ possessive quantifier will consume as many digits and delimiters as it can, so you only need to check the following character with a simple assertion. It worked for your examples.
Can someone explain what this function
preg_replace('/&\w;/', '', $buf)
does? I have looked at various tutorials and found that it replaces the pattern /&\w;/ with string ''. But I can't understand the pattern /&\w;/. What does it represent?
Similarly in
preg_match_all("/(\b[\w+]+\b)/", $buf, $words)
I can't understand what the string "/(\b[\w+]+\b)/" represents.
Please help. Thanks in advance :)
The explanation of your first expression is simple, it is:
& # Match the character “&” literally
\w # Match a single character that is a “word character” (letters, digits, and underscores)
; # Match the character “;” literally
The second one is:
( # Match the regular expression below and capture its match into backreference number 1
\b # Assert position at a word boundary
[\w+] # Match a single character present in the list below
# A word character (letters, digits, and underscores)
# The character “+”
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\b # Assert position at a word boundary
)
The preg_replace function makes use of regular expressions. Regular expressions allow you to find patterns in text in a really powerful way.
To be able to use functions like preg_replace or preg_match I recommend you to take a look first at how regular expressions work.
You can gather a lot of info on this site http://www.regular-expressions.info/
And you can use software tools to help you understand the regex (like RegexBuddy)
In regular expressions, \w stands for any "word" character. That is: a-z, A-Z, 0-9 and underscore. \b stands for "word boundary", that is the beginning and end of a word (a series of word characters).
So, /&\w;/ is a regular expression that matches the & sign, followed by exactly one word character, followed by a ;. For example, &a; would match, and preg_replace will replace it with an empty string. A longer name such as &foobar; would not match, because \w has no quantifier; that would require /&\w+;/.
In the same manner, /(\b[\w+]+\b)/ matches a word boundary, followed by one or more characters that are each either a word character or a literal +, followed by another word boundary. The parentheses capture each match. So, this regular expression will essentially return the words in a string as an array.
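A short sketch makes the behaviour of both patterns concrete. The function names are hypothetical, and the /&\w+;/ variant is shown only for comparison; it is not in the original code.

```php
<?php
// The pattern from the question: removes "&", ONE word character, ";".
function stripSingleCharEntities(string $buf): string
{
    return preg_replace('/&\w;/', '', $buf);
}

// Assumed variant with a quantifier: removes "&", one or more word characters, ";".
function stripNamedEntities(string $buf): string
{
    return preg_replace('/&\w+;/', '', $buf);
}

// The second pattern from the question: runs of word characters and "+".
function extractWords(string $buf): array
{
    preg_match_all('/(\b[\w+]+\b)/', $buf, $words);
    return $words[1]; // contents of capture group 1
}

var_dump(stripSingleCharEntities('a &b; c &amp; d')); // "a  c &amp; d"
var_dump(stripNamedEntities('a &b; c &amp; d'));      // "a  c  d"
print_r(extractWords('foo bar+baz'));                 // foo, bar+baz
```

Neither entity pattern handles numeric entities such as &#39;, since # is not a word character.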