How to convert eregi to preg_match? - php

I am using a lib which uses
eregi($match="^http/[0-9]+\\.[0-9]+[ \t]+([0-9]+)[ \t]*(.*)\$",$line,$matches)
but since eregi is deprecated, I want to convert the above to preg_match. I tried the following:
preg_match($match="/^http/[0-9]+\\.[0-9]+[ \t]+([0-9]+)[ \t]*(.*)\$/i",$line,$matches)
but it throws an error saying Unknown modifier '[' in filename.php
Any ideas how to resolve this issue?
Thanks

If you use / as the regex delimiter (i.e. preg_match('/.../i', ...)), you need to escape any instances of / in your pattern, or PHP will think each one marks the end of the pattern.
You can also use a different character such as % as your delimiter:
preg_match('%^http/[0-9]+\.[0-9]+[ \t]+([0-9]+)[ \t]*(.*)$%i',$line,$matches)

You need to escape the delimiters inside the regular expression (in this case the /):
"/^http\\/[0-9]+\\.[0-9]+[ \t]+([0-9]+)[ \t]*(.*)\$/i"
But you could also choose a different delimiter like ~:
"~^http/[0-9]+\\.[0-9]+[ \t]+([0-9]+)[ \t]*(.*)\$~i"

You can try:
preg_match("#^http/[0-9]+\\.[0-9]+[ \t]+([0-9]+)[ \t]*(.*)\$#i",$line,$matches)
You can drop the $match= assignment.
You are using / as the delimiter, and there is another / in the regex after http, which effectively marks the end of your regex. When PHP sees the [ after this, it complains. You can either use a different delimiter such as # or escape the / after http.
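As a rough sketch of the converted call in context, the snippet below wraps the pattern in a small function and runs it against a sample status line. The function name parse_status_line and the sample input are hypothetical and not part of the asker's library; the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original library): parse an HTTP status line.
function parse_status_line(string $line): ?array
{
    // '#' is the delimiter, so the '/' after "http" needs no escaping.
    // Single quotes pass the backslashes through to PCRE unchanged; the
    // trailing 'i' supplies the case-insensitivity that eregi provided.
    $pattern = '#^http/[0-9]+\.[0-9]+[ \t]+([0-9]+)[ \t]*(.*)$#i';

    if (preg_match($pattern, $line, $matches) !== 1) {
        return null; // no match, or the pattern failed to compile
    }

    return ['code' => (int) $matches[1], 'reason' => $matches[2]];
}

$status = parse_status_line('HTTP/1.1 404 Not Found');
// $status['code'] holds the numeric status, $status['reason'] the text after it.
```

Comparing the return value against 1 separates a real match from both "no match" (0) and a pattern error (false).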

Related

Unexpected ] error in simple preg replace script [duplicate]

I have a script that downloads the latest newsletter from a group inbox on a spare touchscreen in our office. It works fine, but people keep accidentally unsubscribing us so I want to hide the unsubscribe link from the email.
preg_replace seems like it would work because I can set up a pattern that simply removes any link with the word "unsubscribe" in it. I validated the pattern below using the tool at http://regex101.com/ , and it even picks up variations like "manage subscription" as well. It is OK if the odd legitimate link with the word subscribe also gets removed; there won't be many and it's only for internal use.
However, when I execute I get an error.
Here's my code:
line 53: $pat='<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>';
line 54: $themail[bodycontent]= preg_replace($pat, ' ',$themail[bodycontent]);
and I get this error:
preg_replace() [function.preg-replace]: Unknown modifier ']' in /home/trev/public_html/bigscreen/screen-functions.php on line 54
It must be something really simple like an unescaped char but I have gone code blind and can't for the life of me see it.
How do I get this pattern:
<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>
to run in a simple php script?
Thanks
You haven't used any delimiters, so it's treating the < character as the delimiter.
Try something like this instead:
$pat='#<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>#';
You have no delimiter. Or rather you do, but it's not the one you meant. PCRE is interpreting your first < as the opening delimiter (you can use matching brackets as delimiters - in fact, I use parentheses to help remind myself that the entire match is index 0). Then it sees the first > as the ending delimiter. Anything after that should be a modifier, but of course ] is not a modifier.
Wrap your regex with (...) to give it a proper set of delimiters.
$themail[bodycontent] should be either $themail['bodycontent'] or $themail[$bodycontent].
It's trying to parse bodycontent] ... as the array index.
Patterns used in preg_match need to be enclosed by a pair of delimiter characters.
For example, a / or a ~ at the start and end of the string.
Anything outside of these delimiters at the end of the string is considered to be a regex "modifier".
Your example doesn't have delimiters, so PHP assumes that the leading < character is the opening delimiter. Because < is a bracket-style delimiter, PHP takes the first > (the one inside [^>]) as the closing delimiter, and therefore treats anything after that as a modifier. All of that is supposed to be part of the pattern and isn't valid as modifiers, which is why PHP is complaining.
Solution: Add a pair of delimiter characters, one at the very start of the pattern and one at the very end:
$pat='~<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>~';
(It doesn't have to be ~; you can choose your own delimiter character to suit your needs. The best one to use is one that doesn't occur in your pattern, although you can escape it if it does.)
Start and end the pattern with a slash /:
$pat='/<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>/';
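Putting the answers together, a minimal sketch might look like the following. It keeps the asker's pattern unchanged, adds ~ delimiters, and quotes the array key. The helper name strip_subscription_links and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helper: remove links whose text contains "ubscri".
function strip_subscription_links(string $html): string
{
    // Same pattern as in the question, now wrapped in '~' delimiters.
    $pat = '~<\s*(a|A)\s+[^>]*>[^<>]*ubscri[^<>]*<\s*\/(a|A)\s*>~';

    // preg_replace returns null on error; fall back to the untouched input.
    return preg_replace($pat, ' ', $html) ?? $html;
}

// Sample data (assumed); note the quoted array key 'bodycontent'.
$themail = [
    'bodycontent' => '<p>News</p><a href="http://example.com/u">Unsubscribe</a><a href="/home">Home</a>',
];
$themail['bodycontent'] = strip_subscription_links($themail['bodycontent']);
// The "Unsubscribe" link is replaced by a single space; the "Home" link is untouched.
```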

regular expression error php error

I have written a regular expression to remove a script tag from an imported page (fetched with cURL).
<script[\s\S]*?/script> is my expression.
When I used it with preg_replace to remove the tag, it gave me this error:
Warning: preg_replace() [function.preg-replace]: Unknown modifier 'c' in C:\xampp\htdocs\get_page.php on line 21
Can anyone help me?
thanks
You should choose a suitable delimiter for your regular expression (preferably one that doesn't occur anywhere in your pattern, so that you don't need to escape it). For example:
"#<script[\s\S]*?/script>#"
Also, don't do that if you are trying to prevent malicious people from injecting Javascript into your page. It can easily be worked around. Use a whitelist of known safe constructs rather than trying to remove dangerous code.
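PHP requires delimiters on regular expression patterns. Your expression can also be shortened, although the shorter form is not equivalent: .+ is greedy and, without the s modifier, does not match newlines.
|<script.+/script>|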
Did you wrap your regexp in forward slashes?
$str = preg_replace('/<script[\s\S]*?\/script>/', ...);
Did you surround your regular expression with a delimiter, such as /? If you didn't, you need to. If you did, and you used / (as opposed to your other choices) you'll need to escape the / in your /script, so it'll look like \/script instead.
Use the following code:
$result = preg_replace('%<script[\s\S]*?/script>%', $change_to, $subject);
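To see how the lazy pattern behaves, and how it differs from the shorter |<script.+/script>| form suggested above, here is a small sketch. The sample strings are invented for illustration.

```php
<?php
// Sample input (assumed): two script blocks, one of them spanning several lines.
$html = "<p>a</p><script>one()</script><p>keep</p><script>\ntwo()\n</script><p>b</p>";

// Lazy quantifier with '%' as the delimiter: each script block is removed
// separately, and [\s\S] also matches the newlines inside the second block.
$lazy = preg_replace('%<script[\s\S]*?/script>%', '', $html);

// Greedy .+ with '|' as the delimiter: on a single line it runs from the
// first "<script" to the last "/script>", swallowing the markup in between.
$oneLine = '<script>one()</script><p>keep</p><script>two()</script>';
$greedy = preg_replace('|<script.+/script>|', '', $oneLine);
```

In the greedy case the paragraph between the two script blocks is lost, which is usually not what you want when cleaning a fetched page.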

simple question about preg_match() in php

I used the code below to check whether http is included in the URL the user enters:
if (!preg_match("/http:///", $user_website))
but I got this error:
Warning: preg_match() [function.preg-match]: Unknown modifier '/' in
I know it's because of the // in http://, but how do I work around this?
Instead of having to escape every / in URL regexes, it's often useful to pick an alternative delimiter character to mark the start and end of the pattern.
if (!preg_match("#http://#", $user_website))
The delimiter you are using / is found in the pattern as well. In such cases you can either escape the delimiter in the pattern:
if (!preg_match("/http:\/\//", $user_website))
or you can choose a different delimiter. This will keep the pattern clean and short:
if (!preg_match("#http://#", $user_website))
You can escape the slashes like the other answers mention, or alternatively you can use different delimiters, preferably characters you won't use in your regex:
preg_match('~http://~', ...)
preg_match('!http://!', ...)
And you don't really need regex for this. String matching should be enough:
if (strpos($user_website, 'http://') !== false) {
// do something
}
See: strpos()
Surely you must do
$parts = parse_url($my_url);
$parts['scheme'] will then contain the url scheme (might be 'http').
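A minimal sketch of the parse_url approach follows. The helper name has_http_scheme is hypothetical, and accepting https as well as http is an assumption that goes beyond what the question asked for.

```php
<?php
// Hypothetical helper: true when the URL's scheme is http or https.
function has_http_scheme(string $url): bool
{
    // With PHP_URL_SCHEME, parse_url returns just the scheme as a string,
    // null when the URL has no scheme, or false for a seriously malformed URL.
    $scheme = parse_url($url, PHP_URL_SCHEME);

    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

$withScheme    = has_http_scheme('http://example.com/page');
$withoutScheme = has_http_scheme('example.com/page');
$otherScheme   = has_http_scheme('ftp://example.com/file');
```

Unlike a substring search, this only looks at the scheme at the start of the URL, so an http:// that appears later in a query string is not counted.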
Escape / characters with \ characters.
You need to escape literal characters. Place a back-slash before your forward slashes.
http:// becomes http:\/\/
if (!preg_match("/http:\/\//", $user_website))

php preg_split error when switching from split to preg_split

I get this warning from php after the change from split to preg_split for php 5.3 compatibility :
PHP Warning: preg_split(): Delimiter must not be alphanumeric or backslash
the php code is :
$statements = preg_split("\\s*;\\s*", $content);
How can I fix the regex so that it no longer uses \?
Thanks!
The error is because you need a delimiter character around your regular expression.
$statements = preg_split("/\s*;\s*/", $content);
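For illustration, here is the delimited pattern applied to some sample input. The SQL text is made up, and the PREG_SPLIT_NO_EMPTY flag is an optional addition that is not in the original answer; it drops the empty string that would otherwise follow the last semicolon.

```php
<?php
// Sample content (assumed): three statements separated by semicolons.
$content = "CREATE TABLE t (id INT) ;  INSERT INTO t VALUES (1);\nSELECT * FROM t;";

// '/' delimits the pattern; single quotes mean the backslashes need no doubling.
// -1 means "no limit"; PREG_SPLIT_NO_EMPTY discards empty pieces.
$statements = preg_split('/\s*;\s*/', $content, -1, PREG_SPLIT_NO_EMPTY);
```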
Although the question was tagged as answered two minutes after being asked, I'd like to add some information for the record.
Similar to the way strings are delimited by quotation marks, regular expressions in many languages, such as Perl or JavaScript, are delimited by forward slashes. This will lead to expressions looking like this:
/\s*;\s*/
This syntax also allows you to specify modifiers:
/\s*;\s*/Ui
PHP's Perl-compatible regular expressions (aka preg_... functions) inherit this. However, PHP itself doesn't support this syntax so feeding preg_split() with /\s*;\s*/ would raise a parse error. Instead, you enclose it with quotes to build a regular string.
One more thing you must take into account is that PHP allows you to change the delimiter. For instance, you can use this:
#\s*;\s*#Ui
What is it good for? It simplifies the use of forward slashes inside the expression since you don't need to escape them. Compare:
/^\/home\/.*$/i
#^/home/.*$#i
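The two forms compared above can be checked side by side, as in this short sketch. The sample path is invented, and the preg_quote call is an addition not mentioned in the answer: its second argument names the delimiter so that it gets escaped too, which helps when a literal string is spliced into a pattern.

```php
<?php
$path = '/home/alice/notes.txt'; // sample input (assumed)

// Same regex, two delimiters: the '#' version needs no escaped slashes.
$withSlashes = preg_match('/^\/home\/.*$/i', $path);
$withHashes  = preg_match('#^/home/.*$#i', $path);

// preg_quote escapes regex metacharacters in a literal string; passing '/'
// as the second argument makes it escape the delimiter as well.
$quoted   = preg_quote('/home/', '/');
$viaQuote = preg_match('/^' . $quoted . '.*$/i', $path);
```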
If you don't like delimiters, you can use the T-Regx library:
pattern("\\s*;\\s*")->split($content);
You can also use Pattern::of("\\s*;\\s*")->split()

Weird error using preg_match and unicode

if (preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)', '2010/02/14/this-is-something'))
{
// do stuff
}
The above code works. However this one doesn't.
if (preg_match('/\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+/u', '2010/02/14/this-is-something'))
{
// do stuff
}
Maybe someone could shed some light as to why the one below doesn't work. This is the error that is being produced:
A PHP Error was encountered
Severity: Warning
Message: preg_match()
[function.preg-match]: Unknown
modifier '\'
Try this (delimit the regex with #):
if (preg_match('#\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+#', '2010/02/14/this-is-something'))
{
// do stuff
}
Edited
The modifier u is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32.
Also, as nvl observed, you are using / as the delimiter and you are not escaping the / present in the regex. So you'll have to use:
/\p{Nd}{4}\/\p{Nd}{2}\/\p{Nd}{2}\/\p{L}+/u
To avoid this escaping you can use a different set of delimiters like:
#\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+#
As a tip, if your delimiter is present in your regex, it's better to choose a different delimiter not found in the regex. This keeps the regex clean and short.
In the second regex you're using / as the regex delimiter, but you're also using it in the regex. The compiler is trying to interpret this part as a complete regex:
/\p{Nd}{4}/
It thinks the next character after the second / should be a modifier like 'u' or 'm', but it sees a backslash instead, so it emits that cryptic warning.
In the first regex you're using parentheses as regex delimiters; if you wanted to add the u modifier, you would put it after the closing paren:
'(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)u'
Although it's legal to use parentheses or other bracketing characters ({}, [], <>) as regex delimiters, it's not a good idea IMO. Most people prefer to use one of the less common punctuation characters. For example:
'~\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+~u'
'%\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+%u'
Of course, you could also escape the slashes in the regex with backslashes, but why bother?
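The three working variants discussed in this thread can be compared in a quick sketch like the one below; the variable names are arbitrary. Note that \p{L}+ stops at the first hyphen, so the overall match is shorter than the whole subject string.

```php
<?php
$subject = '2010/02/14/this-is-something';

// '~' as delimiter, 'u' modifier after the closing delimiter.
$tilde = preg_match('~\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+~u', $subject, $m1);

// Parentheses as delimiters: they delimit the pattern and do not form a
// capture group, so only index 0 (the whole match) is populated.
$paren = preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)u', $subject, $m2);

// '/' as delimiter with every slash inside the pattern escaped.
$escaped = preg_match('/\p{Nd}{4}\/\p{Nd}{2}\/\p{Nd}{2}\/\p{L}+/u', $subject, $m3);
```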
