RegEx for matching specific HTML Entity pattern (Emoji) - php

I am working on a regular expression that should match exactly the following pattern. Anything that exceeds the pattern should not be considered a match:
I want exactly 6 digits starting with #, but if I write {5} it returns true. The same happens with ;: I want exactly one, and it must be at the end. Also, I don't know how to use $ here to specify the final character.
if(preg_match(('/^(#)+([0-9]{6}){1}(;)/'),"#128515;")){
return true;
}
SHOULD BE IN THIS FORMAT:
#128515; that is, #DDDDDD; and not ##DDDD;;
Exactly 6 digits, starting with one # and finishing with one ;

preg_match returns 1 when the pattern matches the subject. Your pattern has no end anchor, so anything after the first ; is ignored, and (#)+ accepts more than one leading #. That is why input such as ##128515;; still matches.
You could add anchors ^ and $ to assert the start and the end of the string so it matches exactly 6 digits.
From your pattern you can omit {1} because the group is already matched 1 time.
If you don't refer to the groups in your code, you could also omit them and just use a plain match.
You could use:
^#[0-9]{6};$
^ Start of string
# Match #
[0-9]{6}; Match 6 digits followed by ;
$ Assert end of string
Your code could look like
if(preg_match(('/^#[0-9]{6};$/'),"#128515;")){
return true;
}
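To see the effect of the end anchor, a minimal sketch can run the pattern from the question and the anchored pattern against a few inputs. The helper name isEntityCode and the sample inputs are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: true only for one '#', exactly six digits, then one ';'.
function isEntityCode(string $s): bool
{
    return preg_match('/^#[0-9]{6};$/', $s) === 1;
}

// The pattern from the question, kept for comparison.
$original = '/^(#)+([0-9]{6}){1}(;)/';

foreach (['#128515;', '##128515;', '#128515;;', '#12851;', '#1285155;'] as $input) {
    printf(
        "%-11s original=%d anchored=%d\n",
        $input,
        preg_match($original, $input),
        (int) isEntityCode($input)
    );
}
```

Note that $ also matches just before a trailing newline. If that matters, the D modifier or \z in place of $ should make the end check strict.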

Related

Creating a regular expression that will match requirements in a string

The issue
I need to write a regular expression that will match the following requirements in a string with the structure {A/B}.
Requirements/Conditions:
A and B can only be exactly one of [UGWRB].
A structure where U or G do not appear is invalid.
A structure where both characters are equal is invalid.
U or G must appear in the combination at least once.
The structure can repeat or continue infinite times, as long as each following instance is still valid when read alone. (see valid matches below)
Valid matches:
{U/G}{U/G}{U/G}
{W/G}{U/B}
{U/G}{U/B}
{U/G}
{G/U}
{U/B}
...
Invalid matches:
{U/U}{U/U}
{U/U}{G/G}
{U/G}{U/U}
{U/G}{R/B}
{G/G}
{R/B}
{W/R}
{B/W}
...
My attempt
This is what I have gotten so far, but out of all the combinations of UGWRB, I'm only getting 8 matches out of 14.
{([UG])(?(1)|\w)\/(?(1)\w|[UG])}
You have to work with lookaheads both negative and positive in order to accomplish the task:
^(?:{(?=[^{}]*[UG])([UGWRB])\/(?!\1)(?1)})+$
Note that the m flag should be set if you test several strings at once, one per line.
Regex breakdown:
^ Match start of input string
(?: Start of non-capturing group
{ Match { literally
(?= Start of positive lookahead
[^{}]*[UG] Look for [UG] in combination
) End of lookahead
([UGWRB]) Match and capture a letter from character class
\/(?!\1)(?1) Match /, assert that the next char is not the same as the recently captured one, then match a letter by re-using the pattern of group 1
} Match } literally
)+ End of group, repeat at least once
$ Match end of input string
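As a quick check, the pattern above can be run against the valid and invalid examples from the question. This is only a sketch: the braces are escaped here for readability, and the variable names are assumptions.

```php
<?php
// First pattern from the answers; braces escaped here only for readability.
$pattern = '/^(?:\{(?=[^{}]*[UG])([UGWRB])\/(?!\1)(?1)\})+$/';

// Examples taken from the question.
$valid   = ['{U/G}{U/G}{U/G}', '{W/G}{U/B}', '{U/G}{U/B}', '{U/G}', '{G/U}', '{U/B}'];
$invalid = ['{U/U}{U/U}', '{U/U}{G/G}', '{U/G}{U/U}', '{U/G}{R/B}', '{G/G}', '{R/B}', '{W/R}', '{B/W}'];

foreach (array_merge($valid, $invalid) as $s) {
    echo $s, ' => ', preg_match($pattern, $s) ? 'match' : 'no match', "\n";
}
```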
Try this regex:
^(?!.*{([UGWRB])\/\1})(?:{(?(?=[UG]).\/[UGWRB]|[WRB]\/[UG])})+$
Explanation:
^ - matches the start of the string
(?!.*{([UGWRB])\/\1}) - negative lookahead to make sure that the structures like {G/G} or {U/U} or {R/R} are not present anywhere in the string
{ - matches {
(?(?=[UG]).\/[UGWRB]|[WRB]\/[UG]) - Regex conditional. If the current position is followed by either U or G, then match that character followed by / and the character class [UGWRB]. Otherwise, match the character class [WRB] followed by / followed by U or G
} - matches }
+ - matches 1+ occurrences of the above sub-sequence (?:{(?(?=[UG]).\/[UGWRB]|[WRB]\/[UG])})
$ - matches the end of the string
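The question mentions 14 valid combinations of UGWRB. A small sketch, assuming the conditional pattern above with the braces escaped for readability, can enumerate all 25 single structures {A/B} and count how many it accepts.

```php
<?php
// Second pattern from the answers; braces escaped here only for readability.
$pattern = '/^(?!.*\{([UGWRB])\/\1\})(?:\{(?(?=[UG]).\/[UGWRB]|[WRB]\/[UG])\})+$/';

$letters = str_split('UGWRB');
$matched = [];

// Build every {A/B} combination and keep the ones the pattern accepts.
foreach ($letters as $a) {
    foreach ($letters as $b) {
        $candidate = '{' . $a . '/' . $b . '}';
        if (preg_match($pattern, $candidate) === 1) {
            $matched[] = $candidate;
        }
    }
}

echo count($matched), " of 25 combinations match:\n";
echo implode(' ', $matched), "\n";
```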

Capturing group with optional start and end characters

I have the following string: find me String1\String2\String3, and I want to capture String1, String2 and String3 if they exist. String3 is optional.
So far, what I have is: (?<=find me)\s(\\?[\w]+\\?){1,3}. My assumptions were:
The string should have find me at the beginning, but it should not be captured
a whitespace
a group with \ as an optional character at the beginning, a word following it, and an optional \ at the end; the group can appear from 1 to 3 times.
What is wrong with my regex pattern?
Assuming your regex flavor supports \G, you can use this regex to capture all 3 strings separately:
(?<=find me |(?<!^)\G\\)\w+
\G asserts the position at the end of the previous match, or the start of the string for the first match. Here the negative lookbehind (?<!^) rules out the start of the string, so \G can only match where a previous match ended. For your example that happens twice: at the end of String1 and at the end of String2.
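A minimal sketch of how this could look in PHP with preg_match_all, assuming the subject string from the question. The variable names are illustrative, and each regex backslash is doubled again for the PHP string literal.

```php
<?php
// Subject in single quotes so the backslashes stay literal.
$subject = 'find me String1\String2\String3';

// Regex from the answer: (?<=find me |(?<!^)\G\\)\w+
$pattern = '/(?<=find me |(?<!^)\G\\\\)\w+/';

// Each full match is one of the backslash-separated parts.
preg_match_all($pattern, $subject, $matches);
print_r($matches[0]);
```

Because String3 is simply one more match, the same call also handles the case where it is missing.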

Phone number contains the pattern AABBCC, e.g. 112233

I want to check whether a phone number contains the pattern AABBCC,
where A, B and C are each a digit [0-9] and all three are different, e.g. 112233, 553322, 887766.
Suppose I have the phone number 03334112233.
It should report that the pattern matched.
My PHP code so far, which only works for an exact string:
$str = 'aabbaabbccaass'; //or whatever
if (preg_match('/(?!.*?aabbcc)^.*$/', $str))
echo "accepted\n";
else
echo "rejected\n";
The problem is that I don't know how to do this when the string consists of digits.
There is a possible duplicate, but it does not contain an answer or enough detail.
Edit:
I want to match the last 6 characters of the string against the pattern AABBCC, e.g. 03329112233
To match number with AABBCC format, you can use this pattern:
(?:(\d)\1(?!\1)){2}(\d)\2
Example of use:
if (preg_match('/(?:(\d)\1(?!\1)){2}(\d)\2/', $str))
echo "rejected\n";
else
echo "accepted\n";
But if you have other tests to do (for example, that the string contains only digits), it can be more flexible to use it this way:
if (preg_match('/(?!.*(?:(\d)\1(?!\1)){2}(\d)\2)^\d+$/', $str))
echo "accepted\n";
else
echo "rejected\n";
pattern details:
(?: # open a non capturing group that describes a repeated digit
(\d) # capture the first digit with group 1
\1 # a backreference to group 1 (the same digit thus)
(?!\1) # check with a negative lookahead that the same digit doesn't follow
){2} # repeat the group two times
(\d)\2 # same thing for digits 5 & 6 (the lookahead isn't needed here)
Note that the digit in the capture group changes at each repetition of the non-capturing group (because the negative lookahead forces it).
Notice: if you want to reject numbers that contain, for example, 111122 or 112222 or 111111, you only need to remove the negative lookahead.
If the pattern should also not match numbers of the form 112211 or 448844, where the first and third pairs are equal, change it like this: (\d)\1(?!\d{0,2}\1)(\d)\2(?!\2)(\d)\3
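To compare the two patterns from this answer, a short sketch can print what each returns for a few numbers. The sample numbers other than 03334112233 and the variable names are assumptions chosen for illustration.

```php
<?php
// Pattern from the answer: two digit pairs, each followed by a different digit, then a third pair.
$aabbcc = '/(?:(\d)\1(?!\1)){2}(\d)\2/';

// Variant from the answer that also keeps the third pair different from the first.
$allDifferent = '/(\d)\1(?!\d{0,2}\1)(\d)\2(?!\2)(\d)\3/';

foreach (['03334112233', '03334111122', '112211', '1234567'] as $number) {
    printf(
        "%-12s aabbcc=%d allDifferent=%d\n",
        $number,
        preg_match($aabbcc, $number),
        preg_match($allDifferent, $number)
    );
}
```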
As I understand it, you only want to match the last 6 characters of the string if they are digits forming 3 pairs of all different digits. I would also use a lookahead, with a pattern like this:
(?>((\d)\2)(?!.*\1)){3}$
\2 matches the same text as the 2nd capturing group, which is one digit (shorthand \d).
The negative lookahead (?!.*\1) checks that what follows does not contain, after any amount of any characters (.*), the text of the 1st capturing group (which holds the 2 equal digits).
{3} means 3 repetitions, and $ anchors them at the end of the string.
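A short sketch of this end-anchored check, assuming a few sample numbers picked for illustration (only 03329112233 comes from the question):

```php
<?php
// Pattern from the answer: three digit pairs at the end of the string,
// none of which occurs again later in the string.
$pattern = '/(?>((\d)\2)(?!.*\1)){3}$/';

foreach (['03329112233', '03329112211', '11223344', '0332911223'] as $number) {
    echo $number, ' => ', preg_match($pattern, $number) ? 'match' : 'no match', "\n";
}
```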
Your regex should be like this:
^((\d)\2){3}$
It is simpler and also works.
You can use capturing groups and backreferences like this:
if (preg_match('/(?!.*(.)\1(.)\2(.)\3)^.*$/', $str))
The (.) will match any single character and assign it to a group. The first instance is assigned to group 1, the second to group 2 and so on. Later in the pattern, the backreference \1 will match exactly what was previously captured in the first group, \2 will match what was captured in the second group, etc.
You probably will also want to use \d to match any single digit (it's only necessary to use this outside of the lookahead) and a {n,m} quantifier to match between n and m digits. For example, the following will match any sequence of 7 to 10 digits that does not contain a subsequence like AABBCC:
if (preg_match('/(?!.*(.)\1(.)\2(.)\3)^\d{7,10}$/', $str))
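A minimal sketch of that last check, with sample numbers that are assumptions made for illustration:

```php
<?php
// Pattern from the answer: 7 to 10 digits with no three consecutive pairs anywhere.
$pattern = '/(?!.*(.)\1(.)\2(.)\3)^\d{7,10}$/';

foreach (['0333411223', '0334112233', '1234567', '12345'] as $number) {
    echo $number, ' => ', preg_match($pattern, $number) ? 'accepted' : 'rejected', "\n";
}
```

Since (.) does not force the three pairs to differ, a run such as 111111 also counts as a forbidden subsequence here.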

PHP check that string has 2 numbers, 8 chars and 1 capital

I found lots of php regex and other options to determine string length, and if it contains one letter or one number, but how do I determine if a string has 2 numbers in it?
I am trying to validate a password that
Must have exactly 8 characters
One of them must be an Uppercase letter
2 of them must be numbers
Is there a one line regex solution for this?
if (preg_match(
    '/^            # Start of string
    (?=.*\p{Lu})   # at least one uppercase letter
    (?=.*\d.*\d)   # at least two digits
    .{8}           # exactly 8 characters
    $              # End of string
    /xu',
    $subject)) {
    # Successful match
}
(?=...) is a lookahead assertion. It checks if a certain regex can be matched at the current position, but doesn't actually consume any part of the string, so you can just place several of those in a row.
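As an illustration, the pattern could be wrapped in a small helper and tried on a few candidates. The function name isValidPassword and the sample passwords are assumptions, not part of the answer.

```php
<?php
// Hypothetical helper wrapping the pattern from the answer.
function isValidPassword(string $subject): bool
{
    $pattern = '/^
        (?=.*\p{Lu})   # at least one uppercase letter
        (?=.*\d.*\d)   # at least two digits
        .{8}           # exactly 8 characters
        $
        /xu';

    return preg_match($pattern, $subject) === 1;
}

foreach (['Passw0r1', 'Passw0rd', 'passw0r1', 'Passw0r12'] as $candidate) {
    echo $candidate, ' => ', isValidPassword($candidate) ? 'valid' : 'invalid', "\n";
}
```

In x mode, a comment runs to the end of the line, so the comments must not contain the / delimiter.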

Regexp return true, but author of a book says it shouldn't

Reading an online resource on PHP about Regexp(TuxRadar).
According to the author the following should not match "aaa1" to the pattern and therefore return false(0), but I get true(1).
<?php
$str = "aaa1";
print preg_match("/[a-z]+[0-9]?[a-z]{1}/", $str);
?>
Why?
Regular Expressions
Are you sure there isn't supposed to be a trailing $ there? Without it, returning true makes a lot of sense - the first [a-z] block matches the first 2 a characters, the [0-9] matches nothing, and the last [a-z] matches the 3rd a. The trailing 1 is ignored.
Looking at the link to the book, it does seem there's an error there:
Must end with a lower case letter
This is only true if the regular expression is anchored to the end of the string with a $.
It matches because [0-9]? matches a digit zero or one times.
<?php
$str = "aaa1";
print preg_match("/[a-z]+[0-9]+[a-z]{1}/", $str);
?>
won't result in a match.
Let's break down the regular expression:
[a-z]+ means one or more letters; being greedy, it could match a, aa or aaa
[0-9]? means an optional digit, so it matches one digit or nothing
[a-z] means to match a single letter, which could be an a
Therefore, because [0-9] is optional, the first part matches aa, the second matches nothing, and the third matches the last a.
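The breakdown above can be checked with a short sketch. The capturing groups in the last pattern are added here only to show which part matches what.

```php
<?php
$str = 'aaa1';

// Original pattern: unanchored, so it matches inside "aaa" and ignores the "1".
var_dump(preg_match('/[a-z]+[0-9]?[a-z]{1}/', $str));

// With a trailing $, the match has to end on a lower-case letter.
var_dump(preg_match('/[a-z]+[0-9]?[a-z]{1}$/', $str));

// Capturing each part shows how the original pattern splits "aaa".
preg_match('/([a-z]+)([0-9]?)([a-z]{1})/', $str, $parts);
print_r($parts);
```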
