Regex to match one of many strings in PHP

I am totally new to regex. I want to check whether the value is any one of the following:
cs,ch,es,ee,it,me
Till now I have tried
if (preg_match("/^[cs|ch|es|ee|it|me]{2}$/",$val))
echo "true";
else
echo "false";
It works fine for the true cases, but it also returns true for their reverses, like sc, hc, etc.
It would also be really helpful if you could point me to some good sources or books for learning regex in PHP.

Remove the character class [] from your regex and wrap the alternatives in () instead. Also remove the {2}, as it is no longer necessary.
if (preg_match("/^(cs|ch|es|ee|it|me)$/",$val))
And this will do for you.

You need to use () instead of []
/^(cs|ch|es|ee|it|me)$/
Note: While using parentheses do not use {2}
So your Final code is:
if (preg_match("/^(cs|ch|es|ee|it|me)$/",$val))
echo "true";
else
echo "false";
To learn regex for PHP, I suggest this book; it is a good one for quick reference. Or refer to this question for more.

You must use the grouping delimiters (parentheses). The character class delimiters (square brackets) are used for matching ranges of characters.
/^(cs|ch|es|ee|it|me)$/
If you only use the regular expressions to match something (and not capture anything) then you can use the (?:) grouping.
/^(?:cs|ch|es|ee|it|me)$/
One of the better websites for learning regular expressions is regular-expressions.info.

Do you know what [] does?
Let's take an example: [abcdef]
It matches any single one of the letters listed in the square brackets. Suppose you provide: ^[cs|ch|es|ee|it|me]{2}$
The class matches a single character from the list cs|heitm
You can repeat a letter inside the brackets as many times as you like, but it only counts once.
So with {2}, the pattern matches any two-character string made up of the characters cs|heitm
so it will match cs, hs, |s, etc.
Hope that makes sense :)
The corrected regex should be
/^(cs|ch|es|ee|it|me)$/
This matches the exact literal words rather than individual letters.
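To see the difference in practice, here is a minimal sketch that runs both patterns over a few sample values. The helper names and the sample values are made up for illustration; only preg_match() comes from PHP itself.

```php
<?php
// Hypothetical helper names, used only for this illustration.

// Character class version from the question: any two characters
// taken from the set c, s, |, h, e, i, t, m.
function matchesCharClass(string $val): bool
{
    return preg_match('/^[cs|ch|es|ee|it|me]{2}$/', $val) === 1;
}

// Alternation version from the answers: exactly one of the listed strings.
function matchesAlternation(string $val): bool
{
    return preg_match('/^(?:cs|ch|es|ee|it|me)$/', $val) === 1;
}

foreach (['cs', 'sc', 'hc', 'me', 'xx'] as $val) {
    printf(
        "%s: class=%s alternation=%s\n",
        $val,
        var_export(matchesCharClass($val), true),
        var_export(matchesAlternation($val), true)
    );
}
```

For a fixed list like this, in_array($val, ['cs', 'ch', 'es', 'ee', 'it', 'me'], true) would also do the job without any regex.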

Related

please solve for me my problem if you can [duplicate]

Obviously, you can use the | (pipe?) to represent OR, but is there a way to represent AND as well?
Specifically, I'd like to match paragraphs of text that contain ALL of a certain set of phrases, in no particular order.
Use a non-consuming regular expression.
The typical (i.e. Perl/Java) notation is:
(?=expr)
This means "match expr but after that continue matching at the original match-point."
You can do as many of these as you want, and this will be an "and." Example:
(?=match this expression)(?=match this too)(?=oh, and this)
You can even add capture groups inside the non-consuming expressions if you need to save some of the data therein.
You need to use lookahead as some of the other responders have said, but the lookahead has to account for other characters between its target word and the current match position. For example:
(?=.*word1)(?=.*word2)(?=.*word3)
The .* in the first lookahead lets it match however many characters it needs to before it gets to "word1". Then the match position is reset and the second lookahead seeks out "word2". Reset again, and the final part matches "word3"; since it's the last word you're checking for, it isn't necessary that it be in a lookahead, but it doesn't hurt.
In order to match a whole paragraph, you need to anchor the regex at both ends and add a final .* to consume the remaining characters. Using Perl-style notation, that would be:
/^(?=.*word1)(?=.*word2)(?=.*word3).*$/m
The 'm' modifier is for multiline mode; it lets the ^ and $ match at paragraph boundaries ("line boundaries" in regex-speak). It's essential in this case that you not use the 's' modifier, which lets the dot metacharacter match newlines as well as all other characters.
Finally, you want to make sure you're matching whole words and not just fragments of longer words, so you need to add word boundaries:
/^(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*$/m
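In PHP, the final pattern above could be tried out roughly like this. This is only a sketch: containsAllWords is a made-up helper name, and word1, word2 and word3 are the placeholder words from the answer.

```php
<?php
// Hypothetical helper: true if at least one line of $text contains
// all three placeholder words, in any order, as whole words.
function containsAllWords(string $text): bool
{
    $pattern = '/^(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*$/m';
    return preg_match($pattern, $text) === 1;
}

var_dump(containsAllWords("word3 then word1 and word2")); // all three on one line
var_dump(containsAllWords("word1 and word2 only"));       // word3 is missing
var_dump(containsAllWords("word1 word2 word30"));         // word30 is not the word word3
var_dump(containsAllWords("word1\nword2 word3"));         // words split across lines
```

Only the first call should report a match; the last one fails because, without the 's' modifier, each lookahead stays within a single line.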
Look at this example:
We have 2 regexps A and B and we want to match both of them, so in pseudo-code it looks like this:
pattern = "/A AND B/"
It can be written without using the AND operator like this:
pattern = "/NOT (NOT A OR NOT B)/"
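in PCRE, where negation is written as a negative lookahead (?!...) (a ^ outside a character class is an anchor, not a negation):
"/(?!(?!A)|(?!B))/"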
regexp_match(pattern,data)
The AND operator is implicit in the RegExp syntax.
The OR operator has instead to be specified with a pipe.
The following RegExp:
var re = /ab/;
means the letter a AND the letter b.
It also works with groups:
var re = /(co)(de)/;
it means the group co AND the group de.
Replacing the (implicit) AND with an OR would require the following lines:
var re = /a|b/;
var re = /(co)|(de)/;
You can do that with a regular expression, but you will probably want to do something else. For example, use several regexps and combine them in an if clause.
You can enumerate all possible permutations with a standard regexp, like this (matches a, b and c in any order):
(abc)|(bca)|(acb)|(bac)|(cab)|(cba)
However, this makes for a very long and probably inefficient regexp if you have more than a couple of terms.
If you are using some extended regexp version, like Perl's or Java's, they have better ways to do this. Other answers have suggested using positive lookahead operation.
Is it not possible in your case to do the AND on several matching results? in pseudocode
regexp_match(pattern1, data) && regexp_match(pattern2, data) && ...
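In PHP, that pseudocode might look something like the sketch below. matchesAll is a hypothetical helper name, and the patterns and sample strings are made up.

```php
<?php
// Hypothetical helper: true only if every pattern in $patterns matches $data.
function matchesAll(array $patterns, string $data): bool
{
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $data) !== 1) {
            return false; // one failed pattern is enough to fail the AND
        }
    }
    return true;
}

var_dump(matchesAll(['/apple/', '/pear/'], 'pear and apple')); // both present, any order
var_dump(matchesAll(['/apple/', '/pear/'], 'apple only'));     // pear is missing
```

This keeps each pattern short and readable, at the cost of scanning the input once per pattern.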
Why not use awk?
With awk, combining regexes with AND and OR is simple:
awk '/WORD1/ && /WORD2/ && /WORD3/' myfile
The order is always implied in the structure of the regular expression. To accomplish what you want, you'll have to match the input string multiple times against different expressions.
What you want to do is not possible with a single regexp.
If you use Perl regular expressions, you can use positive lookahead:
For example
(?=[1-9][0-9]{2})[0-9]*[05]\b
would match numbers of 100 or more that are divisible by 5
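A quick way to check what that pattern accepts is to run it over a few sample numbers in PHP. This is just a sketch with made-up sample values; note that 100 itself matches, since the lookahead only requires three digits with a non-zero first digit.

```php
<?php
// The lookahead demands at least three digits starting with 1-9;
// the rest of the pattern demands that the number ends in 0 or 5.
$pattern = '/(?=[1-9][0-9]{2})[0-9]*[05]\b/';

foreach (['95', '100', '102', '105', '1005'] as $n) {
    printf("%s: %s\n", $n, preg_match($pattern, $n) === 1 ? 'match' : 'no match');
}
```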
In addition to the accepted answer
I will provide some practical examples that should make things clearer. For example, let's say we have these three lines of text:
[12/Oct/2015:00:37:29 +0200] // only this + will get selected
[12/Oct/2015:00:37:x9 +0200]
[12/Oct/2015:00:37:29 +020x]
See demo here DEMO
What we want to do here is to select the + sign but only if it's after two numbers with a space and if it's before four numbers. Those are the only constraints. We would use this regular expression to achieve it:
'~(?<=\d{2} )\+(?=\d{4})~g'
Note if you separate the expression it will give you different results.
Or perhaps you want to select some text between tags... but not the tags! Then you could use:
'~(?<=<p>).*?(?=<\/p>)~g'
for this text:
<p>Hello !</p> <p>I wont select tags! Only text with in</p>
See demo here DEMO
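If you want to try these two patterns from PHP, keep in mind that PHP has no g modifier; preg_match_all() does the "find every match" part instead. A minimal sketch, using the sample texts from this answer (the variable names are made up):

```php
<?php
// Text between <p> tags, without the tags themselves.
$html = '<p>Hello !</p> <p>I wont select tags! Only text with in</p>';
preg_match_all('~(?<=<p>).*?(?=<\/p>)~', $html, $matches);
print_r($matches[0]);

// The + sign, only when preceded by two digits and a space
// and followed by four digits.
$log = "[12/Oct/2015:00:37:29 +0200]\n"
     . "[12/Oct/2015:00:37:x9 +0200]\n"
     . "[12/Oct/2015:00:37:29 +020x]";
$count = preg_match_all('~(?<=\d{2} )\+(?=\d{4})~', $log, $plus);
echo $count, "\n"; // number of + signs that satisfy both conditions
```

Both lookbehinds here have a fixed length, which is what PCRE traditionally requires.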
You could pipe your output to another regex. Using grep, you could do this:
grep A | grep B
((yes).*(no))|((no).*(yes))
Will match sentence having both yes and no at the same time, regardless the order in which they appear:
Do i like cookies? **Yes**, i do. But milk - **no**, definitely no.
**No**, you may not have my phone. **Yes**, you may go f yourself.
Both will match, provided the match is case-insensitive (for example, with the i modifier).
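Here is a small PHP sketch of that pattern with the i modifier added so that case is ignored. The sample sentences are made up for the illustration.

```php
<?php
// Either "yes ... no" or "no ... yes", case-insensitively.
$pattern = '/((yes).*(no))|((no).*(yes))/i';

$samples = [
    'Yes to tea, no to coffee.',
    'No sugar, but yes to milk.',
    'Yes, yes, yes.',
];

foreach ($samples as $sample) {
    printf("%s => %s\n", $sample, preg_match($pattern, $sample) === 1 ? 'match' : 'no match');
}
```

Be aware that this matches substrings too, so the "no" inside a word like "know" counts; adding \b around each word avoids that.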
Use AND outside the regular expression. In PHP the lookahead operator did not seem to work for me, so instead I used this:
if( preg_match("/^.{3,}$/",$pass1) && !preg_match("/\s{1}/",$pass1))
return true;
else
return false;
The above regex will match if the password length is 3 characters or more and there are no spaces in the password.
Here is a possible "form" for "and" operator:
Take the following regex for an example:
If we want to match words without the "e" character, we could do this:
/\b[^\We]+\b/g
\W means NOT a "word" character.
^\W means a "word" character.
[^\We] means a "word" character, but not an "e".
see it in action: word without e
"and" Operator for Regular Expressions
I think this pattern can be used as an "and" operator for regular expressions.
In general, if:
A = not a
B = not b
then:
[^AB] = not(A or B)
= not(A) and not(B)
= a and b
Difference Set
So, if we want to implement the concept of difference set in regular expressions, we could do this:
a - b = a and not(b)
= a and B
= [^Ab]
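To check the "word without e" pattern from this answer in PHP, something like the following sketch should do. The sample sentence is made up.

```php
<?php
// [^\We] is "a word character AND not e"; the \b anchors make sure
// the whole word is covered, so words containing an e are skipped.
preg_match_all('/\b[^\We]+\b/', 'the quick brown fox', $words);
print_r($words[0]);
```

Here "the" is skipped because of its e, while the other three words are returned.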

PHP regular expression : match the closest one

I have a string like this
<div><span style="">toto</span> some character <span>toto2</span></div>
My regex:
/(<span .*>)(.*)(<\/span>)/
I used preg_match and it returns the entire string
<span style="">toto</span> some character <span>toto2</span>
I want it returns:
<span style="">toto</span>
and
<span>toto2</span>
What do I need to do to achieve this? Thanks.
How about this:
/(<span[^>]*>)(.*?)(<\/span>)/
Check the docs here at PHP preg_match Repetition:
By default, the quantifiers are "greedy", that is, they match as much as possible
and
However, if a quantifier is followed by a question mark, then it becomes lazy, and instead matches the minimum number of times possible
Even though I guess all the previous answers are correct, I just want to add that, since you only want to capture the whole expression (i.e. from <span> to </span>), you don't have to capture everything inside the regexp with ()
The following does what you expect without capturing additional expressions
/(<span\w*[^>]*>[^<]*<\/span>)/
(tested on http://rubular.com/)
EDIT : of course there might be some differences between PHP and ruby regexp implementations, but the idea is the same :)
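For what it's worth, here is a short PHP sketch comparing the greedy pattern from the question with a lazy one, using the string from the question. Note that preg_match() stops after the first match, so getting both spans takes preg_match_all(); the variable names are made up.

```php
<?php
$html = '<div><span style="">toto</span> some character <span>toto2</span></div>';

// Greedy pattern from the question: a single match that runs from
// the first <span to the last </span>.
preg_match('/(<span .*>)(.*)(<\/span>)/', $html, $greedy);

// Lazy pattern, collected with preg_match_all() to get every span.
preg_match_all('/<span[^>]*>.*?<\/span>/', $html, $lazy);

echo $greedy[0], "\n";
print_r($lazy[0]);
```

For anything beyond simple, well-formed snippets like this one, an HTML parser such as DOMDocument is usually the safer tool.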

Case-sensitive regex that works

I have tried to change this regular expression to case-sensitive with a lot of possible solutions (/[u=|&l=|&dl=|&f=]/i and so on) but I couldn't get it to work the way I want.
u=, &l=, &dl=, and &f= is taken from profile-photos.php?u=edgren&dl=. I use this regular expression to only get the username edgren and identify those other GETs (l, dl, and f) for example;
Looking at '.properize($profile).' '.(isset($_GET['l']) ? 'likes' : (isset($_GET['dl']) ? 'dislikes' : 'favorites')) which prints "Looking at edgrens dislikes" with the URL profile-photos.php?u=edgren&dl=.
The regular expression I have now, prints egren (example at regexpal.com) if the GET is &dl= which is wrong. I want to print the whole username and not the half of it, so to speak.
How can I fix my problem?
Thanks in advance.
You are confusing alternation with character classes. If you want to match one of several strings, use round brackets: (u=|&l=|&dl=|&f=). Square brackets are for character classes (which have the meaning "match one character if it is one of those specified between these square brackets").
Also, the i modifier makes the regex explicitly case-insensitive.
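The question doesn't show how the pattern is applied, so the sketch below assumes it is used with preg_replace() to strip the parameter names from the query string. Under that assumption, it reproduces the "egren" result and shows the alternation fixing it.

```php
<?php
// Assumption: the pattern is passed to preg_replace() to strip parameter names.
$query = 'u=edgren&dl=';

// Character class: removes every single u, =, |, &, l, d and f,
// including the d inside "edgren".
$broken = preg_replace('/[u=|&l=|&dl=|&f=]/i', '', $query);

// Alternation: removes only the whole tokens u=, &l=, &dl= and &f=.
$fixed = preg_replace('/(?:u=|&l=|&dl=|&f=)/i', '', $query);

var_dump($broken, $fixed);
```

Since the code in the question already reads $_GET, using $_GET['u'] directly for the username is probably simpler than any regex.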

Regular Expressions PHP

I'm new with Regex in PHP and what I want to know is how to match words that are equal or like each other.
Example:
I have the word "designer" and the word "design", if we try to match the designer with design will return false, but if we try to match design with designer it will return a match. I need to match both cases using one preg_match statement.
Can Anyone help me?
I believe you are looking for stemming:
http://en.wikipedia.org/wiki/Stemming
If you are only looking to match on those two words then do as nickb suggested and keep it simple. If you are seeking to replicate this matching on many words then you could use this PorterStemmer class: http://tartarus.org/~martin/PorterStemmer/php.txt
What I think you're looking for is an optional match:
/design(?:er)?/
The parentheses group the "er", "?:" makes it non-capturing, and the "?" following make that group optional.
In more general terms, if you want to capture a word or any longer version of that word:
/design\w*/
That matches on "design" and zero or more ("*") word characters ("\w").
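A quick sketch to compare the two patterns in PHP; the word list is made up, and the variable names are only for illustration.

```php
<?php
foreach (['design', 'designer', 'designs', 'sign'] as $word) {
    // "design" with an optional "er" suffix.
    $optional = preg_match('/design(?:er)?/', $word, $m1) === 1 ? $m1[0] : null;
    // "design" followed by any number of word characters.
    $longer = preg_match('/design\w*/', $word, $m2) === 1 ? $m2[0] : null;

    printf(
        "%-9s optional=%s longer=%s\n",
        $word,
        var_export($optional, true),
        var_export($longer, true)
    );
}
```

Neither pattern is anchored, so both also match inside longer text; add ^ and $ (or \b) if you need whole-string or whole-word matches.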

What Regex for this?

I'm trying to learn regular expressions, because I can't do without them.
So, this is a list of different dimension patterns (for products for sale):
40x30x75
46x38x23-27
Ø30H30
Ø25-18H27
So, what pattern should I use to match each kind of dimension?
For example, right now I'm using this to match patterns like 40x30x75, but it does not work:
if(preg_match("#^[0-9][x][0-9][x][0-9]#", $dimension))
echo "ok"
Could you help me ?
Try the following regex:
(^[0-9]+x[0-9]+x[0-9]+$)|(^[0-9]+x[0-9]+x[0-9]+-[0-9]+$)|(^Ø[0-9]+H[0-9]+$)|(^Ø[0-9]+-[0-9]+H[0-9]+$)
So:
if (preg_match("/(^[0-9]+x[0-9]+x[0-9]+$)|(^[0-9]+x[0-9]+x[0-9]+-[0-9]+$)|(^Ø[0-9]+H[0-9]+$)|(^Ø[0-9]+-[0-9]+H[0-9]+$)/", $dimension))
echo "ok";
It probably can be simplified even more, maybe someone would want to have a go at that?
By the way, did you know about a website called RegExr? It allows you to test your regular expressions; it has been very useful to me whenever I work with regexes.
Your regex is missing quantifiers. Add a + sign after the character classes in question to signal that you're looking for one or more matches:
if(preg_match("#^[0-9]+x[0-9]+x[0-9]+#", $dimension))
echo "ok";
By default it's looking for one character of the class only. Single characters do not need the character class (albeit it was not wrong). See the x'es in the example above.
Your regex should be:
^[0-9]{2}x[0-9]{2}x[0-9]{2}$
[0-9] means a single character which is between 0 and 9. So, you either need to have two of those, or use a quantifier like {2}. Instead of [0-9] you could also use \d, meaning any digit. So, you could for example write:
^\d\dx\d\dx\d\d$
Tip: If you can't do without regular expressions, want to learn it and have an easier life, I can recommend you get RegexBuddy. Bought it for myself when I just got started, and it has helped me a lot.
This will validate the first two:
^[0-9]+x[0-9]+x[0-9]+-?[0-9]*$
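Putting the answers together, all four formats from the question could be covered by one pattern with two alternatives, each with an optional "-number" part. This is only a sketch: isDimension is a made-up helper name, and it assumes the source file and the input are UTF-8 encoded (because of the Ø character and the u modifier).

```php
<?php
// Hypothetical helper: true if $dimension has one of the four formats
// from the question (40x30x75, 46x38x23-27, Ø30H30, Ø25-18H27).
function isDimension(string $dimension): bool
{
    $pattern = '/^(?:'
        . '[0-9]+x[0-9]+x[0-9]+(?:-[0-9]+)?' // 40x30x75 or 46x38x23-27
        . '|'
        . 'Ø[0-9]+(?:-[0-9]+)?H[0-9]+'       // Ø30H30 or Ø25-18H27
        . ')$/u';
    return preg_match($pattern, $dimension) === 1;
}

foreach (['40x30x75', '46x38x23-27', 'Ø30H30', 'Ø25-18H27', '40x30', '4x3x'] as $d) {
    printf("%s: %s\n", $d, isDimension($d) ? 'ok' : 'not ok');
}
```

Using (?:-[0-9]+)? instead of -?[0-9]* means a trailing dash without a number, as in 40x30x75-, is rejected.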
