This question already has answers here:
Warning: preg_replace(): Unknown modifier
(3 answers)
Closed 3 years ago.
I'm a newbie with regular expressions and I need some help :).
I have this:
$url = '<img src="http://mi.url.com/iconos/oks/milan.gif" alt="Milan">';
$pattern = '/<img src="http:\/\/mi.url.com/iconos/oks/(.*)" alt="(.*)"\>/i';
preg_match_all($pattern, $url, $matches);
print_r($matches);
And I get this error:
Warning: preg_match_all() [function.preg-match-all]: Unknown modifier 'c'
I want to select that 'milan.gif'.
How can I do that?
If you’re using / as delimiter, you need to escape every occurrence of that character inside the regular expression. You didn’t:
/<img src="http:\/\/mi.url.com/iconos/oks/(.*)" alt="(.*)"\>/i
                              ^
Here the marked / is treated as the end delimiter of the regular expression, and everything after it is treated as a modifier. i is a valid modifier but c isn’t (see your error message).
So:
/<img src="http:\/\/mi\.url\.com\/iconos\/oks\/(.*)" alt="(.*)"\>/i
But as Pekka already noted in the comments, you shouldn’t try to use regular expressions on a non-regular language like HTML. Use an HTML parser instead. Take a look at Best methods to parse HTML.
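For the parser route, here is a minimal sketch using PHP's bundled DOM extension. It assumes the markup from the question is held in $url, as in the original code; the $files array is just an illustrative name, and real pages may need more careful error handling.

```php
<?php
$url = '<img src="http://mi.url.com/iconos/oks/milan.gif" alt="Milan">';

// Collect libxml parse warnings instead of printing them; HTML fragments
// and sloppy markup often trigger them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($url);
libxml_clear_errors();

$files = array();
foreach ($doc->getElementsByTagName('img') as $img) {
    // Take the path part of the src URL, then keep its last segment.
    $path = parse_url($img->getAttribute('src'), PHP_URL_PATH);
    $files[] = basename($path);
}

print_r($files); // the array holds the single file name milan.gif
```

Because the parser handles quoting and attribute order, this keeps working if the tag gains extra attributes or uses single quotes.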
The problem is that you haven't escaped the forward slashes in the url string (you have escaped the ones in the http:// part, but not the url path).
Therefore, when PHP reaches the first unescaped slash (the one after .com), it thinks that is the end of the regex, so it treats everything after that slash as 'modifier' codes.
The next character ('i') is a valid modifier (as you know, since you're actually using it in your example), so that passes the test. However the next character ('c') is not, so it throws an error, which is what you're seeing.
To fix it, simply escape the slashes. So your example would look like this:
$pattern = '/<img src="http:\/\/mi.url.com\/iconos\/oks\/(.*)" alt="(.*)"\\>/i';
Hope that helps.
Note, as someone has already said, it's generally not advisable to use regex to match HTML, since HTML can be too complex to match accurately. It's generally preferable to use a DOM parser. In your example, the regex could fail if the alt attribute or the end of the image URL contains unexpected characters, or if the quoting in the HTML code isn't as you expect.
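If a regex is still acceptable for a quick one-off, a rough sketch like the following avoids the escaping problem altogether. It is not the pattern from either answer: it assumes # as the delimiter (so the slashes in the URL are ordinary characters) and swaps (.*) for [^"]* so a capture cannot run past its closing quote.

```php
<?php
$url = '<img src="http://mi.url.com/iconos/oks/milan.gif" alt="Milan">';

// '#' is the delimiter, so '/' needs no backslash.
// \. matches a literal dot; [^"]* stops at the next double quote.
$pattern = '#<img src="http://mi\.url\.com/iconos/oks/([^"]*)" alt="([^"]*)">#i';

if (preg_match($pattern, $url, $matches)) {
    echo $matches[1], "\n"; // milan.gif
    echo $matches[2], "\n"; // Milan
}
```

It still depends on the exact attribute order and quoting, which is the weakness the note above describes.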
This question already has answers here:
preg_match() Unknown modifier '[' help
(2 answers)
Closed 8 years ago.
I have a script that downloads the latest newsletter from a group inbox on a spare touchscreen in our office. It works fine, but people keep accidentally unsubscribing us so I want to hide the unsubscribe link from the email.
preg_replace seems like it would work because I can set up a pattern that simply removes any link with the word "unsubscribe" in it. I validated the pattern below using the tool at http://regex101.com/ , and it even picks up variations like "manage subscription" as well. It is OK if the odd legitimate link with the word subscribe also gets removed - there won't be many and it's only for internal use.
However, when I execute I get an error.
Here's my code:
line 53: $pat='<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>';
line 54: $themail[bodycontent]= preg_replace($pat, ' ',$themail[bodycontent]);
and I get this error:
preg_replace() [function.preg-replace]: Unknown modifier ']' in /home/trev/public_html/bigscreen/screen-functions.php on line 54
It must be something really simple like an unescaped char but I have gone code blind and can't for the life of me see it.
How do I get this pattern:
<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>
to run in a simple php script?
Thanks
You haven't used any delimiters so it's treating the < character as the delimiter
Try something like this instead
$pat='#<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>#';
You have no delimiter. Or rather you do, but it's not the one you meant. PCRE is interpreting your first < as the opening delimiter (you can use matching brackets as delimiters - in fact, I use parentheses to help remind myself that the entire match is index 0). Then it sees the first > as the ending delimiter. Anything after that should be a modifier, but of course ] is not a modifier.
Wrap your regex with (...) to give it a proper set of delimiters.
$themail[bodycontent] should be either $themail['bodycontent'] or $themail[$bodycontent].
Without quotes, PHP reads bodycontent as a constant name rather than as a string key.
Patterns used in preg_match need to be enclosed by a pair of delimiter characters.
For example, a / or a ~ at the start and end of the string.
Anything outside of these delimiters at the end of the string is considered to be a regex "modifier".
Your example doesn't have delimiters, so PHP is wrongly assuming that the < character is the opening delimiter. It therefore sees the next > character (the one inside [^>]) as the closing delimiter, and anything after that as a modifier. Obviously all that stuff is supposed to be inside the pattern and isn't valid as modifiers, which is why PHP is complaining.
Solution: Add a pair of delimiter characters:
$pat='~<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>~';
      ^                                                   ^
      add this                                  ...and this
(It doesn't have to be ~; you can choose your own delimiter character to suit your needs. The best one to use is one that doesn't occur in your pattern, although you can escape it if it does.)
Starting and ending of pattern with slash /
$pat='/<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>/';
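To see the delimited pattern working end to end, here is a small sketch. The sample e-mail body and its example.com links are made up for illustration, the array key is quoted as suggested above, and the i modifier stands in for the (a|A) alternation (which also makes "ubscri" case-insensitive).

```php
<?php
// Hypothetical stand-in for the e-mail body the real script downloads.
$themail = array(
    'bodycontent' => '<p>News</p><a href="http://example.com/u">Unsubscribe</a>'
                   . '<a href="http://example.com/a">Read more</a>',
);

// '~' delimiters: '<', '>' and '/' inside the pattern are ordinary characters.
$pat = '~<\s*a\s+[^>]*>[^<>]*ubscri[^<>]*<\s*/a\s*>~i';

// Replace every matching link with a single space.
$themail['bodycontent'] = preg_replace($pat, ' ', $themail['bodycontent']);

echo $themail['bodycontent'], "\n";
```

Only the link whose text contains "ubscri" is replaced; the "Read more" link is left alone.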
This question already has answers here:
How can I convert ereg expressions to preg in PHP?
(4 answers)
Closed 8 years ago.
I'm using PHP 5.2.17. I want to remove some surplus data from a JSON string and I thought I could use a replace function to do so. Specifically, I'm using ereg_replace with the following expression:
'^.*?(?=\"created_at)'
Which I've validated at http://www.regexpal.com. I've pasted my JSON string in there and the match is right. However, when I make the call:
$tweets = eregi_replace('^.*?(?=\"created_at)', $temp, 'something');
and then I echo the $tweets variable, there's no output. No errors in the console either. The Apache error log, however, complains about an error called REG_BADRPT. There's a comment in the PHP docs for eregi_replace suggesting this can happen because special characters need escaping, but I've already escaped the " character. I've tried escaping others too, with no change in behavior.
Where could the problem be then?
I don't think that ereg supports lookarounds. preg_replace exists in php 5.2, so you should really use that instead. It will work with your expression with delimiters.
$tweets = preg_replace('#^.*?(?=\"created_at)#i', 'something', $temp);
As other people have pointed out, the ereg functions are deprecated, so use preg_replace. You also have to enclose your regex in delimiters such as slashes (/). You can put your regex flags after the closing delimiter.
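A short sketch of the preg_replace version follows. The JSON string is invented for illustration (the real $temp comes from the asker's code), and the replacement is an empty string so the surplus prefix is simply dropped. Note the argument order: pattern, replacement, subject.

```php
<?php
// Hypothetical stand-in for the asker's $temp string.
$temp = '{"junk":1,"created_at":"Mon Sep 24 03:35:21 +0000 2012"}';

// '#' delimiters; the lazy .*? stops as soon as the lookahead
// (?="created_at) succeeds, so "created_at itself is kept.
$tweets = preg_replace('#^.*?(?="created_at)#i', '', $temp);

echo $tweets, "\n"; // "created_at":"Mon Sep 24 03:35:21 +0000 2012"}
```

The double quote needs no backslash inside a single-quoted PHP string, so it is left unescaped here.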
I've been searching for hours trying to find a solution to this. I am trying to determine if the REQUEST URI is legit and break it down from there.
$samplerequesturi = "/variable/12345678910";
To determine if it is legit, the first section variable is only letters and is variable in length. The second section is numbers, which should have 11 total. My problem is escaping the forward slash so it is matched in the uri. I've tried:
preg_match("/^[\/]{1}[a-z][\/]{1}[0-9]{11}+$/", $samplerequesturi)
preg_match("/^[\\/]{1}[a-z][\\/]{1}[0-9]{11}+$/", $samplerequesturi)
preg_match("/^#/#{1}[a-z]#/#{1}[0-9]{11}+$/", $samplerequesturi)
preg_match("/^|/|{1}[a-z]|/|{1}[0-9]{11}+$/", $samplerequesturi)
Among others which I can't remember now.
The request usually errors out:
preg_match(): Unknown modifier '|'
preg_match(): Unknown modifier '#'
preg_match(): Unknown modifier '['
Edit:
I guess I should state that the REQUEST URI is already known. I'm trying to check the whole string to make sure it isn't bogus, i.e. to make sure the 1st set is only lower case letters, and the 2nd set is only 11 numbers.
/ is not the only thing you can use as a delimiter. In fact, you can use almost any character that is not alphanumeric, a backslash, or whitespace. Personally I like to use () because it reminds me that the first item of the result array is the entire match and it also never needs escaping in the pattern.
preg_match("(^/([a-z]+)/(\d+)$)i",$samplerequesturi,$out);
var_dump($out);
That should do it.
If you want to use regex (which I don't think is necessary in this case; simply splitting on "/" should be fine):
$samplerequesturi = "/variable/12345678910";
preg_match("#^/([A-Za-z]+)/(\d+)$#", $samplerequesturi, $out);
echo $out[1];
echo $out[2];
should get you going
Your problem may be that you are using the / forward-slash as a regex delimiter (at the start and end of the regex expression). Switch to using a character other than the forward-slash, such as a # hash symbol or any other symbol which will never need to appear in this particular expression. Then you won't need to escape the forward-slash character at all in the expression.
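Putting the delimiter advice together with the original requirements (lowercase letters only, then exactly 11 digits), a sketch might look like this. The function name parseRequestUri is hypothetical, and the {11} quantifier is an addition here: the patterns in the answers above accept any number of digits.

```php
<?php
// Hypothetical helper: returns array(name, digits) for a valid URI, or null.
function parseRequestUri($uri)
{
    // '#' is the delimiter, so '/' is an ordinary character in the pattern.
    // \A and \z anchor the match to the very start and end of the string.
    if (preg_match('#\A/([a-z]+)/([0-9]{11})\z#', $uri, $out) === 1) {
        return array($out[1], $out[2]);
    }
    return null;
}

var_dump(parseRequestUri('/variable/12345678910')); // array: "variable", "12345678910"
var_dump(parseRequestUri('/variable/123'));         // NULL (too few digits)
```

Using \z rather than $ means a trailing newline is rejected as well.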
This question already has answers here:
Warning: preg_replace(): Unknown modifier
(3 answers)
Closed 3 years ago.
I'm trying to search for something on a page but I keep getting this silly error.
This is the error I am getting:
Warning: preg_match() [function.preg-match]: Unknown modifier 'd'
This is the code I'm using:
$qa = file_get_contents($_GET['url']);
preg_match('/<a href="/download.php\?g=(?P<number>.+)">Click here</a>/',$qa,$result);
And $_GET['url'] can equal many things, but in one case it was http://freegamesforyourwebsite.com/game/18-wheeler--2.html
The HTML of that URL is basically
Anyone got a clue :S ? I don't even know where to start because I don't know what a modifier is and the php.net site is no help
Thank you!
You need to escape the '/' before download.php otherwise it thinks you are ending your regex and providing 'd' as a modifier for your regex. You will also need to escape the next '/' in the ending anchor tag.
preg_match('/<a href="\/download.php\?g=(?P<number>.+)">Click here<\/a>/',$qa,$result);
You have to escape your pattern delimiters or use different ones:
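# escape the '/' characters inside the pattern
preg_match('/<a href="\/download.php\?g=(?P<number>.+)">Click here<\/a>/',$qa,$result);
# or use hash marks as delimiters instead
preg_match('#<a href="/download.php\?g=(?P<number>.+)">Click here</a>#',$qa,$result);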
Your regular expression needs to be escaped correctly.
It should be:
'/<a href="\/download.php\?g=(?P<number>.+)">Click here<\/a>/'
The problem is that your regular expression is delimited by / characters, but also contains / characters as data. What it's complaining about is /download -- it thinks the / has ended your regular expression and the d that follows is a modifier for your regular expression. However, there is no such modifier d.
The easiest solution is to use some character that is not contained in the regex to delimit it. In this case, # would work well.
preg_match('#<a href="/download.php\?g=(?P<number>.+)">Click here</a>#',$qa,$result);
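As a quick check of the # delimiter approach, here is a sketch run against a made-up page fragment. The g=1234 value is purely illustrative, and [^"]+ is used in place of .+ so the capture stops at the closing quote.

```php
<?php
// Hypothetical fragment standing in for file_get_contents($_GET['url']).
$qa = '<p><a href="/download.php?g=1234">Click here</a></p>';

// '#' delimiters: the slashes in the href and in </a> need no escaping.
$pattern = '#<a href="/download\.php\?g=(?P<number>[^"]+)">Click here</a>#';

if (preg_match($pattern, $qa, $result)) {
    echo $result['number'], "\n"; // 1234
}
```

The named group (?P<number>...) makes the captured value available as $result['number'] as well as $result[1].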
if (preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)', '2010/02/14/this-is-something'))
{
// do stuff
}
The above code works. However this one doesn't.
if (preg_match('/\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+/u', '2010/02/14/this-is-something'))
{
// do stuff
}
Maybe someone could shed some light as to why the one below doesn't work. This is the error that is being produced:
A PHP Error was encountered
Severity: Warning
Message: preg_match() [function.preg-match]: Unknown modifier '\'
Try this (delimit the regex with #):
if (preg_match('#\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+#', '2010/02/14/this-is-something'))
{
// do stuff
}
Edited
The modifier u is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32.
Also, as nvl observed, you are using / as the delimiter and you are not escaping the / present in the regex. So you'll have to use:
/\p{Nd}{4}\/\p{Nd}{2}\/\p{Nd}{2}\/\p{L}+/u
To avoid this escaping you can use a different set of delimiters like:
#\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+#
As a tip, if your delimiter is present in your regex, it's better to choose a different delimiter not found in the regex. This keeps the regex clean and short.
In the second regex you're using / as the regex delimiter, but you're also using it in the regex. The compiler is trying to interpret this part as a complete regex:
/\p{Nd}{4}/
It thinks the next character after the second / should be a modifier like 'u' or 'm', but it sees a backslash instead, so it emits that cryptic warning.
In the first regex you're using parentheses as regex delimiters; if you wanted to add the u modifier, you would put it after the closing paren:
'(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)u'
Although it's legal to use parentheses or other bracketing characters ({}, [], <>) as regex delimiters, it's not a good idea IMO. Most people prefer to use one of the less common punctuation characters. For example:
'~\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+~u'
'%\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+%u'
Of course, you could also escape the slashes in the regex with backslashes, but why bother?
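For completeness, a minimal runnable sketch of the ~ delimited version with the u modifier, using the sample string from the question. The variable names are illustrative only.

```php
<?php
$path = '2010/02/14/this-is-something';

// '~' delimiters leave '/' as a plain character; 'u' treats the pattern
// and subject as UTF-8 so \p{Nd} and \p{L} work on Unicode input.
$pattern = '~\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+~u';

if (preg_match($pattern, $path, $m) === 1) {
    // \p{L}+ matches letters only, so the match ends before the first '-'.
    echo $m[0], "\n"; // 2010/02/14/this
}
```

If the whole slug should be matched, the final class would need to allow hyphens too, for example [\p{L}-]+.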