How to match multiple words in regex - php

Just a simple regex I don't know how to write.
The regex has to make sure a string matches all 3 words. I see how to make it match any of the 3:
/advancebrain|com_ixxocart|p\=completed/
but I need to make sure that all 3 words are present in the string.
Here are the words
advancebrain
com_ixxocart
p=completed

Use lookahead assertions:
^(?=.*advancebrain)(?=.*com_ixxocart)(?=.*p=completed)
will match if all three terms are present.
You might want to add \b word boundaries around your search terms to ensure that they are matched as complete words and not as substrings of other words (like advancebraindeath), if you need to avoid this:
^(?=.*\badvancebrain\b)(?=.*\bcom_ixxocart\b)(?=.*\bp=completed\b)
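For anyone who wants to try the lookahead approach, here is a minimal sketch. It uses Python's re module rather than PHP, purely as an assumption for illustration (the patterns themselves should carry over to preg_match unchanged), and the sample strings and the helper name has_all are made up.

```python
import re

# Each lookahead checks for one term anywhere in the string, so order is irrelevant.
ALL_THREE = re.compile(r'^(?=.*advancebrain)(?=.*com_ixxocart)(?=.*p=completed)')

# Same idea with \b, so that "advancebraindeath" no longer counts as a hit.
ALL_THREE_WORDS = re.compile(
    r'^(?=.*\badvancebrain\b)(?=.*\bcom_ixxocart\b)(?=.*\bp=completed\b)'
)

def has_all(pattern, s):
    """Return True if every required term is present in s."""
    return pattern.search(s) is not None

# Hypothetical sample strings, invented for this sketch.
ok = 'index.php?option=com_ixxocart&p=completed&ref=advancebrain'
missing = 'index.php?option=com_ixxocart&p=completed'
substring = 'advancebraindeath com_ixxocart p=completed'

print(has_all(ALL_THREE, ok))               # True
print(has_all(ALL_THREE, missing))          # False
print(has_all(ALL_THREE, substring))        # True
print(has_all(ALL_THREE_WORDS, substring))  # False
```

Note that . does not match a newline by default, so for multi-line input you would probably need the s (dotall) modifier.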

^(?=.*?p=completed)(?=.*?advancebrain)(?=.*?com_ixxocart).*$
Spent too long testing and refining =/ Oh well.. Will still post my answer

Use lookahead:
(?=.*\badvancebrain)(?=.*\bcom_ixxocart)(?=.*\bp=completed)
Order won't matter. All three are required.

Related

Regular expression to match any combination of repeated values

I need to test strings for repeated chars. Is there a single regular expression I could use for this, or should I compile a list of multiple different regular expressions?
111333555777
aaaabbbbccccdddd
aabbcc
11111
abcabcabc
There's a couple of different types of repetition
Not sure if I get you right, but maybe this regex would be what you want
^(?:(.*)\1+)*$
matches
111333555777
aaaabbbbccccdddd
aabbcc
11111
abcabcabc
Using a capturing group and a backreference, check whether the string consists only of repeated values:
^(?:(\w+)\1+)+$
See demo at regex101
This is like the others, except the inner capture expression is non-greedy.
Not really sure if it matters, though it ensures the finest granularity.
(?:(.+?)\1+)+
It is probably impossible, though, to get the repeating boundaries via capture group info, because a group inside a repetition only keeps its last capture.
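A quick way to check the backreference pattern against the sample strings is sketched below. Python's re module is used here only as an assumption for illustration; the helper name only_repeats and the three negative examples are made up.

```python
import re

# One or more "runs": a captured chunk followed by at least one copy of itself.
REPEATS = re.compile(r'^(?:(\w+)\1+)+$')

def only_repeats(s):
    """Return True if s consists entirely of repeated chunks."""
    return REPEATS.search(s) is not None

samples = ['111333555777', 'aaaabbbbccccdddd', 'aabbcc', '11111', 'abcabcabc']
for s in samples:
    print(s, only_repeats(s))

# Strings with an unrepeated part are rejected.
for s in ['abc', 'aab', 'abcd']:
    print(s, only_repeats(s))
```

Every string in the first loop should print True and every string in the second loop False.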

Regex with negative lookahead to ignore the word "class"

I'm going insane over this. It's so simple, yet I can't figure out the right regex. I need a regex that will match blacklisted words, e.g. "ass".
For example, in this string:
<span class="bob">Blacklisted word was here</span>bass
I tried that regex:
((?!class)ass)
It should match the "ass" in the word "bass" but NOT the one in "class".
Instead, this regex flags "ass" in both occurrences. I checked multiple negative lookaheads on Google and none works.
NOTE: This is for a CMS, for moderators to easily find potentially bad words, I know you cannot rely on a computer to do the filtering.
If you have lookbehind available (PHP's PCRE does; JavaScript only gained it in ES2018), this is trivial:
(?<!cl)(ass)
Without lookbehind, you probably need to do something like this:
(?:(?!cl)..|^.?)(ass)
That's ass, with any two characters before as long as they are not cl, or ass that's zero or one characters after the beginning of the line.
Note that this is probably not the best way to implement a blacklist, though. You probably want this:
\bass\b
Which will match the word ass but not any word that includes ass in it (like association or bass or whatever else).
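The three patterns from this answer can be compared side by side. This is only a sketch under assumptions: it uses Python's re module instead of PHP, and the helper name hit_offsets is invented. The test string is the one from the question.

```python
import re

text = '<span class="bob">Blacklisted word was here</span>bass'

# "ass" not immediately preceded by "cl" (needs lookbehind support).
lookbehind = re.compile(r'(?<!cl)(ass)')
# Same intent without lookbehind: two preceding chars that are not "cl",
# or at most one char after the start of the string.
no_lookbehind = re.compile(r'(?:(?!cl)..|^.?)(ass)')
# Whole word only.
whole_word = re.compile(r'\bass\b')

def hit_offsets(pattern, s):
    """Offsets where 'ass' was flagged (group 1 if the pattern has one)."""
    return [m.start(m.lastindex or 0) for m in pattern.finditer(s)]

bass_offset = text.index('bass') + 1  # the "ass" inside "bass"

print(hit_offsets(lookbehind, text) == [bass_offset])     # True
print(hit_offsets(no_lookbehind, text) == [bass_offset])  # True
print(hit_offsets(whole_word, text))                      # []
print(hit_offsets(whole_word, 'what an ass'))             # [8]
```

Both of the first two patterns skip "class" and flag only "bass", while \bass\b flags neither.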
It seems to me that you're actually trying to use two lists here: one for words that should be excluded (even if one is a part of some other word), and another for words that should not be changed at all - even though they have the words from the first list as substrings.
The trick here is to know where to use the lookbehind:
/ass(?<!class)/
In other words, the good word negative lookbehind should follow the bad word pattern, not precede it. Then it would work correctly.
You can even get some of them in a row:
/ass(?<!class)(?<!pass)(?<!bass)/
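This, though, will skip both passhole and pass, because the lookbehind only sees the letters right before the end of "ass". To make it more bullet-proof, we can add word-boundary checks, so that only the whole whitelisted words are spared: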
/ass(?<!\bclass\b)(?<!\bpass\b)(?<!\bbass\b)/
UPDATE: of course, it's more efficient to check for parts of the string, with (?<!cl)(?<!b) etc. But my point was that you can still use the whole words from whitelist in the regex.
Then again, perhaps it'd be wise to prepare the whitelists accordingly (so shorter patterns will have to be checked).
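To see the difference the word boundaries make in the trailing-lookbehind patterns, here is a small sketch. It assumes Python's re module as a stand-in for PHP's PCRE (both require fixed-width lookbehind, which these patterns satisfy); the helper name flagged and the test words are made up.

```python
import re

# Whitelist checked after the bad word, without word boundaries.
plain = re.compile(r'ass(?<!class)(?<!pass)(?<!bass)')
# Same, but only the complete whitelisted words are spared.
bounded = re.compile(r'ass(?<!\bclass\b)(?<!\bpass\b)(?<!\bbass\b)')

def flagged(pattern, word):
    """Return True if the pattern flags 'ass' somewhere in word."""
    return pattern.search(word) is not None

for word in ['class', 'pass', 'bass', 'passhole', 'asshole', 'grass']:
    print(word, flagged(plain, word), flagged(bounded, word))
```

With these inputs, the two patterns should disagree only on passhole: the plain version spares it because the letters before the end of "ass" spell "pass", while the bounded version flags it.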
Is this what you want? (?<!class)(\w+ass)

Discard character in matching group

I have a couple of matching groups one after another in a long Regex pattern. Around the middle I have
...(?<number>(?:/(?:digit|num))?\d+|)...
which should match something like /num9, /digit9 or 9 or blank (because I need the named group to appear in the resulting associative array even if it's empty).
The pattern works, but is it possible to discard the / character if one of the first two cases is matched? I tried a positive lookahead, but it seems that you can't use those if you have expressions before the lookahead.
Is what I'm trying to accomplish possible using Regex?
Based on your input, I think you need to match the / at some point anyway, otherwise the whole regex fails. At the same time you want to ignore it, so it cannot be part of your named group. Putting it outside the group and making it optional, while ensuring that a digit is not directly preceded by a /, gives the desired result:
^/?(?<number>(?:(?:digit|num))?(?<!/)\d+|)$
However given your lack of a more complete input and regex, I am not 100% sure this will work for all your cases.
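A rough way to check what ends up in the named group is sketched below. It assumes Python's re module, which spells named groups (?P<number>...) instead of (?<number>...); otherwise the pattern is the one above. The helper name number_of and the inputs are invented for illustration.

```python
import re

# Python syntax for the named group; the rest is the pattern from the answer.
PATTERN = re.compile(r'^/?(?P<number>(?:(?:digit|num))?(?<!/)\d+|)$')

def number_of(s):
    """Return the 'number' group, or None if s does not match at all."""
    m = PATTERN.match(s)
    return m.group('number') if m else None

print(number_of('/num9'))    # num9
print(number_of('/digit9'))  # digit9
print(number_of('9'))        # 9
print(repr(number_of('')))   # ''
print(number_of('/9'))       # None
```

The leading / stays out of the group, the group is still present (and empty) for blank input, and a bare /9 is rejected by the (?<!/) check.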

Regular expression to match words with no space

I am trying to do a preg_match to filter unwanted spam queries and I would like to match any word that is listed in the preg_match and filter it if it has no space after it.
So for example if I have the word balloon in the preg_match then I want to filter anything like "balloon1" or "balloond" or "balloonedfbdg" etc and allow anything with a space after balloon like "balloon big", "balloon small" etc.
I have a lot of queries from google that take a single word and add a whole bunch of crap to it that I want to filter out. It is only a few words but it is irritating for me enough to come here and find an answer to fix this.
I already use a preg_match for some of the spam queries using regular expressions but I do not know how to match something that is not spaced and allow something that has a space.
Any help is appreciated, Thanks.
Your Expression: /(balloon|otherwordone|othertwo)[^\s]/i
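This matches the listed words if they're followed by any non-whitespace character (\s is whitespace). A listed word at the very end of the string is not matched, because [^\s] has to consume a character.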
Edit: Using \B (not a word boundary):
/(balloon|otherwordone|othertwo)\B/i
This prevents common sentence symbols from triggering the regex (like dot, comma).
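The two variants differ mainly on punctuation, which the following sketch tries to show. It assumes Python's re module in place of preg_match (re.IGNORECASE stands in for the i modifier); the helper name is_spam and the sample queries are made up.

```python
import re

WORDS = r'(balloon|otherwordone|othertwo)'

# Listed word followed by any non-whitespace character.
not_space = re.compile(WORDS + r'[^\s]', re.IGNORECASE)
# Listed word followed by a non-boundary, i.e. by another word character.
not_boundary = re.compile(WORDS + r'\B', re.IGNORECASE)

def is_spam(pattern, query):
    """Return True if the query should be filtered."""
    return pattern.search(query) is not None

for q in ['balloon1', 'balloond', 'balloonedfbdg', 'balloon big', 'balloon.', 'balloon']:
    print(repr(q), is_spam(not_space, q), is_spam(not_boundary, q))
```

Both patterns filter balloon1, balloond and balloonedfbdg and allow "balloon big" and a bare "balloon"; only the [^\s] version filters "balloon.".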

Regex to check if exact string exists

I am looking for a way to check if an exact string match exists in another string using Regex or any better method suggested. I understand that you can tell regex to match a space or any other non-word character at the beginning or end of a string. However, I don't know exactly how to set it up.
Search String: t
String 1: Hello World, Nice to see you! t
String 2: Hello World, Nice to see you!
String 3: T Hello World, Nice to see you!
I would like to use the search string and compare it to String 1, String 2 and String 3 and only get a positive match from String 1 and String 3 but not from String 2.
Requirements:
Search String may be at any character position in the Subject.
There may or may not be a white-space character before or after it.
I do not want it to match if it is part of another string; such as part of a word.
For the sake of this question:
I think I would do this using this pattern: /\bt\b/gi
/\b{$search_string}\b/gi
Does this look right? Can it be made better? Any situations where this pattern wouldn't work?
Additional info: this will be used in PHP 5
Your suggestion of \bt\b with the case-insensitive modifier will work and is probably the way to go. You've correctly used \b for word boundaries. One caveat: PHP's preg_* functions have no g modifier, so write /\bt\b/i and use preg_match_all() if you need every match rather than just the first. Simple, straightforward, clean. Look no further than what you've already come up with.
Looks fine to me. You might want to check the exact meaning of the \b assertion to make sure it's exactly what you need.
Can't really name any situation where this pattern "wouldn't work" without a more elaborate description, but \b would work fine for your testcases.
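The three test strings from the question can be checked with a short sketch. It assumes Python's re module rather than PHP; re.escape plays the role that preg_quote would play in PHP when the search string is interpolated into the pattern. The helper name contains_word is invented.

```python
import re

def contains_word(search, subject):
    """Return True if search occurs in subject as a whole word, ignoring case."""
    # Escape the search string so characters like "." or "+" are taken literally.
    pattern = r'\b' + re.escape(search) + r'\b'
    return re.search(pattern, subject, re.IGNORECASE) is not None

s1 = 'Hello World, Nice to see you! t'
s2 = 'Hello World, Nice to see you!'
s3 = 'T Hello World, Nice to see you!'

print(contains_word('t', s1))  # True
print(contains_word('t', s2))  # False ("to" contains t, but not as a word)
print(contains_word('t', s3))  # True
```

One situation where \b may surprise you: if the search string starts or ends with a non-word character (for example "t!"), the boundary is evaluated against that character, so the pattern can behave differently from what "whole word" suggests.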
As the old saying goes, give a man a regular expression and he is happy for a day; teach him to write regular expressions and he is happy for a lifetime (or something to that effect). Try out the "Regulator".
It provides a GUI and some pretty good examples for reg exp needs.
