Regular expression to search in html template - php

I'm trying to find certain strings in a lot of files (approx. 2000 files).
What I basically need is to find every id attribute in an HTML file except for certain IDs.
For example I want to find:
<a id="certain_id">sdfsdf</a>
But I want to exclude:
<a id="manage">Manage</a>
I only need a regular expression, since I want to use it in Eclipse Search. If that isn't possible, I can write a PHP script to do it instead.
Something like: id="(?!manage)" or similar. I don't want to replace anything; I just want a list of matching elements in each file.
Thanks for the help

This worked for me:
id="(?!(manage|something)").*?"
^   ^                      ^
|   |                      Any character (not greedy), followed by a quote
|   Negative lookahead to make sure there isn't manage|something and a quote
Match the id=" characters literally
Where manage and something are two IDs you don't want to match.
You could also use a negated character class instead of the non-greedy quantifier:
id="(?!(manage|something)")[^"]+"
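Since the question mentions falling back to a PHP script, here is a minimal sketch of the same negative lookahead applied with preg_match_all. The function name find_ids and the sample markup are hypothetical, and manage/something stand in for whatever IDs you want to skip.

```php
<?php
// Hypothetical helper: return every id value in $html except the excluded ones.
function find_ids(string $html, array $excluded): array
{
    // Quote each excluded ID so regex metacharacters in it are taken literally.
    $alternatives = implode('|', array_map(
        fn (string $id): string => preg_quote($id, '/'),
        $excluded
    ));

    // Same idea as above: the lookahead rejects an excluded ID followed by a quote.
    $pattern = '/id="(?!(?:' . $alternatives . ')")([^"]+)"/';

    preg_match_all($pattern, $html, $matches);

    return $matches[1]; // captured ID values only
}

// Made-up sample markup.
$html = '<a id="certain_id">sdfsdf</a> <a id="manage">Manage</a> <a id="manager">x</a>';
print_r(find_ids($html, ['manage', 'something']));
```

With this sample, certain_id and manager should be listed but manage should not, because the lookahead only rejects an ID when the closing quote follows it immediately.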

Regular expressions don't have a natural "not" operator.
I believe a workaround exists: a negative lookahead. For instance, you could do something like: id="(?!(undesirable1|undesirable2|undesirable3))"
It's been a little while since I did anything with regex, but I think that should work.
Edit: I think nick's answer is better

http://txt2re.com/ Best Tool Ever

Related

(preg_match ('#^/thank-you/hello/#', $_SERVER['REQUEST_URI'])

So basically I'm trying to select all content that is in /thank-you/hello/, so this can be /thank-you/hello/x/, /thank-you/hello/y/, /thank-you/hello/z/, etc.
This is what I'm using right now:
preg_match('#^/thank-you/hello/#', $_SERVER['REQUEST_URI'])
This block of code only works for stuff that is in /thank-you/hello/.
How should I change this snippet to include all the other folders that are after /hello/?
I suggest reading more about regex, and I recommend regex101 for testing and studying patterns.
In your pattern, you can stand in for the part that varies with .*?
.: Matches any character other than a newline (or including line terminators with the /s flag)
a*: Matches zero or more consecutive a characters.
a?: Matches an a character or nothing. Placed after another quantifier, as in .*?, the ? instead makes that quantifier lazy, so it matches as little as possible.
These descriptions may seem a little incomplete without examples; regex101 shows examples for each.
example:
preg_match('#^/thank-you/hello/.*?/#', $_SERVER['REQUEST_URI']);
It may not be exactly what you want, and your requirements may grow or shrink later, so you may need to change it.
I think everyone should learn regex so that they can implement exactly what they need.
I do not think it is a good idea to use patterns whose meaning you do not understand.
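A quick way to see what each pattern accepts is to run both against a few sample paths. This is only a sketch: the paths are made up and the helper name matches_section is hypothetical. Note that the pattern from the question has no end anchor, so it already matches any URI that starts with /thank-you/hello/, while the .*?/ variant additionally requires one more slash somewhere after that prefix.

```php
<?php
// Hypothetical helper: does $uri match $pattern?
function matches_section(string $pattern, string $uri): bool
{
    return preg_match($pattern, $uri) === 1;
}

$prefixOnly = '#^/thank-you/hello/#';      // pattern from the question
$withSubdir = '#^/thank-you/hello/.*?/#';  // pattern from the answer

// Made-up sample paths; in a real request this would be $_SERVER['REQUEST_URI'].
$uris = ['/thank-you/hello/', '/thank-you/hello/x/', '/thank-you/hello/y', '/thank-you/bye/'];

foreach ($uris as $uri) {
    printf(
        "%-22s prefixOnly=%s withSubdir=%s\n",
        $uri,
        var_export(matches_section($prefixOnly, $uri), true),
        var_export(matches_section($withSubdir, $uri), true)
    );
}
```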

how to find two groups and replace one with the other one using regexp?

I have a situation where I need to find 2 groups in some text and replace $1 with $2.
I'm not sure it's possible. Also, I don't want to use php but simple search and replace.
See here or see this text:
<li>
<div>
</div><div class="videoTb_title">Kevin Austin</div>
</li>
What I want is to find video-477 and replace it with Kevin Austin.
This is just an example; I have more li elements with different names and video IDs.
Any ideas?
Thanks
Assuming /video/testimonials is actually part of most of your urls, you can capture and replace with the following regex:
(<a[^"]+\"/video/testimonials/)([^"]+)(\".+</a>\r?\n?.*?videoTb_title\"\>)([^<]+)</div>
to be replaced with
$1$4$3$4</div>
Keep in mind that this produces href="/video/testimonials/Kevin Austin", and whitespace in URLs is generally considered bad practice.
We would really need more information about the structure of all of the elements to make sure that we could produce a regex that always worked.
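For anyone who does end up scripting it, the capture-and-replace above can be tried with preg_replace. This is a sketch under assumptions: the link element in $html is a guess, since the sample in the question does not show it, and the pattern is the one above with the unnecessary backslashes before " and > dropped.

```php
<?php
// Assumed markup: the <a> element is a guess at what the original sample contained.
$html = '<div><a href="/video/testimonials/video-477">Watch</a>' . "\n"
      . '</div><div class="videoTb_title">Kevin Austin</div>';

// Groups: 1 = link up to the path prefix, 2 = old video id,
//         3 = everything between the id and the title text, 4 = title text.
$pattern = '~(<a[^"]+"/video/testimonials/)([^"]+)(".+</a>\r?\n?.*?videoTb_title">)([^<]+)</div>~';

// Rebuild the match with group 4 (the title) in place of group 2 (the id).
$result = preg_replace($pattern, '$1$4$3$4</div>', $html);

echo $result, "\n";
```

The pattern relies on the title div following the link within at most one line break, so markup laid out differently would need a different pattern.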
Here is an example of a regex that would work for your example:
/video/testimonials/([^"]+)
"video-477" would then be stored in $1.

What Regex for this?

I'm trying to learn regular expressions, because I can't do without them.
So, this is a list of different dimension patterns (for products for sale):
40x30x75
46x38x23-27
Ø30H30
Ø25-18H27
So, what pattern should I use to find each kind of dimension?
For example, I'm currently using this to find patterns like 40x30x75, but it doesn't work:
if(preg_match("#^[0-9][x][0-9][x][0-9]#", $dimension))
echo "ok";
Could you help me?
Try the following regex:
(^[0-9]+x[0-9]+x[0-9]+$)|(^[0-9]+x[0-9]+x[0-9]+-[0-9]+$)|(^Ø[0-9]+H[0-9]+$)|(^Ø[0-9]+-[0-9]+H[0-9]+$)
So:
if (preg_match("/(^[0-9]+x[0-9]+x[0-9]+$)|(^[0-9]+x[0-9]+x[0-9]+-[0-9]+$)|(^Ø[0-9]+H[0-9]+$)|(^Ø[0-9]+-[0-9]+H[0-9]+$)/", $dimension))
echo "ok";
It can probably be simplified further; maybe someone would want to have a go at that?
By the way, do you know the website RegExr? It lets you test your regular expressions, and it has been very useful to me whenever I work with regexes.
Your regex is missing quantifiers. Add a + sign after the character classes in question to signal that you're looking for one or more matches:
if(preg_match("#^[0-9]+x[0-9]+x[0-9]+#", $dimension))
echo "ok";
By default, a character class matches exactly one character. Single characters do not need a character class (although using one was not wrong); see the x characters in the example above.
Your regex should be:
^[0-9]{2}x[0-9]{2}x[0-9]{2}$
[0-9] means a single character which is between 0 and 9. So, you either need to have two of those, or use a quantifier thing like {2}. Instead of [0-9] you could also use \d, meaning any digit. So, you could for example write:
^\d\dx\d\dx\d\d$
Tip: If you can't do without regular expressions, want to learn it and have an easier life, I can recommend you get RegexBuddy. Bought it for myself when I just got started, and it has helped me a lot.
This will validate the first two:
^[0-9]+x[0-9]+x[0-9]+-?[0-9]*$
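As one possible simplification covering all four listed shapes, the optional range parts can be folded into optional groups. This is a sketch, not a definitive answer: the helper name is_dimension is hypothetical, and it assumes both the source file and the input are UTF-8 so that Ø is read as a single character.

```php
<?php
// Hypothetical helper: true if $dimension has one of the four listed shapes.
function is_dimension(string $dimension): bool
{
    // Either  NxNxN  with an optional -N range at the end,
    // or      ØN  with an optional -N range, followed by HN.
    // The u flag makes PCRE treat pattern and subject as UTF-8.
    $pattern = '/^(?:[0-9]+x[0-9]+x[0-9]+(?:-[0-9]+)?|Ø[0-9]+(?:-[0-9]+)?H[0-9]+)$/u';

    return preg_match($pattern, $dimension) === 1;
}

foreach (['40x30x75', '46x38x23-27', 'Ø30H30', 'Ø25-18H27', '40x30'] as $sample) {
    echo $sample, ' => ', is_dimension($sample) ? 'ok' : 'no match', "\n";
}
```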

Regex equals condition except for certain condition

I have written the following Regex in PHP for use within preg_replace().
/\b\S*(.com|.net|.us|.biz|.org|.info|.xxx|.mx|.ca|.fr|.in|.cn|.hk|.ng|.pr|.ph|.tv|.ru|.ly|.de|.my|.ir)\S*\b/i
This regex removes all URLs from a string pretty effectively so far (though I am sure I could write a better one). I need to be able to add an exclusion for a specific domain, though. So the pseudocode would look like this:
IF string contains: .com or .net or .biz etc... and does not contain: foo.com THEN execute condition.
Any idea on how to do this?
Just add a negative lookahead assertion:
/(?<=\s|^)(?!\S*foo\.com)\S*\.(com|net|us|biz|org|info|xxx|mx|ca|fr|in|cn|hk|ng|pr|ph|tv|ru|ly|de|my|ir)\S*\b/im
Also, remember that you need to escape the dot - and that you can move it outside the alternation since each of the alternatives starts with a dot.
Use preg_replace_callback instead.
Let your callback decide whether to replace.
It can give more flexibility if the requirements become too complicated for a simple regex.
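A minimal sketch of the callback approach, under assumptions: the helper name strip_urls_except is hypothetical, the TLD list is shortened for illustration, and the allow-list check is a plain substring test rather than real hostname parsing.

```php
<?php
// Hypothetical helper: remove URL-like tokens unless they mention an allowed domain.
function strip_urls_except(string $text, array $allowedDomains): string
{
    // Shortened TLD list for illustration; extend it as needed.
    $pattern = '/\S*\.(?:com|net|org|biz)\b\S*/i';

    return preg_replace_callback(
        $pattern,
        function (array $match) use ($allowedDomains): string {
            foreach ($allowedDomains as $domain) {
                // Naive substring check; it would also keep e.g. "notfoo.com".
                if (stripos($match[0], $domain) !== false) {
                    return $match[0]; // keep the token unchanged
                }
            }
            return ''; // drop everything else
        },
        $text
    );
}

echo strip_urls_except('visit foo.com or bar.net now', ['foo.com']), "\n";
```

Keeping the decision in the callback means the pattern itself stays simple, and the exclusion rules can grow without turning into nested lookaheads.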

regex: find the part which doesn't contain any of some words

How can I match the sentence if it contains none of {word1,word2,word3}?
Where must I put the ^ symbol?
I think it must look like this:
^([^word1|word2|word3])$
but it doesn't work.
Could you help? Thanks
Regex isn't the best tool for testing these sorts of conditions, but if you must then you can do it with negative lookaheads:
^(?!.*word1)(?!.*word2)(?!.*word3).*$
What you are trying to do won't work because [^...] is a negated character class containing an unordered set of characters, not words. What you wrote is equivalent to:
^([^123dorw|])$
Note also that depending on your needs you might also want to include word-boundaries in your regular expression:
^(?!.*\bword1\b)(?!.*\bword2\b)(?!.*\bword3\b).*$
I'm not familiar with the use of regex in htaccess, so my thoughts may be a bit high level.
What you are trying looks something like:
if sentence contains not word1 | not word2 | not word3
then do something
I would suggest this approach instead:
if sentence contains word1|word2|word3
then do nothing
else do something
In other words, don't put the negation in the "query"; apply it to the result, which makes the regex simpler.
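Both approaches can be compared side by side in PHP. This is a sketch with hypothetical function names, and it assumes single-line input (the lookahead version uses ., which does not match newlines by default).

```php
<?php
// Approach 1: one negative lookahead per forbidden word.
function lacks_words_lookahead(string $sentence): bool
{
    return preg_match('/^(?!.*word1)(?!.*word2)(?!.*word3).*$/', $sentence) === 1;
}

// Approach 2: search for any forbidden word and negate the result.
function lacks_words_negated(string $sentence): bool
{
    return preg_match('/word1|word2|word3/', $sentence) !== 1;
}

// Made-up sample sentences.
foreach (['a clean sentence', 'this has word2 inside'] as $sentence) {
    echo $sentence, ' => ',
        var_export(lacks_words_lookahead($sentence), true), ' / ',
        var_export(lacks_words_negated($sentence), true), "\n";
}
```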
