I am wondering if someone could please help me convert a piece of PHP code that is now deprecated.
Here is the single line I am trying to convert:
eregi("<text>(.*)TYPE[ \r\n]*(OF|or)[ \r\n]*REPORTING[ \r\n]*PERSON",$string,$outp);
When I convert to the following:
preg_match("/<text>(.*)TYPE[ \r\n]*(OF|or)[ \r\n]*REPORTING[ \r\n]*PERSON/i",$string,$outp);
It matched nothing. The original eregi function works well.
You need the s modifier in addition to i, that is, /is at the end of the regex.
The reason is that in the preg_ functions . does not match line breaks by default, whereas in the old ereg functions it did.
Otherwise your regular expression should work unchanged with PCRE.
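To make the difference concrete, here is a minimal sketch that runs the pattern from the question with and without the s modifier. The sample input in $string is made up for illustration; only the pattern comes from the question.

```php
<?php
// Hypothetical sample input: the text between <text> and TYPE spans several lines.
$string = "<text>Some header\nline two\nTYPE OF\nREPORTING PERSON";

// Without s, "." stops at a newline, so (.*) cannot reach across lines.
$withoutS = preg_match(
    '/<text>(.*)TYPE[ \r\n]*(OF|or)[ \r\n]*REPORTING[ \r\n]*PERSON/i',
    $string,
    $a
);

// With s (PCRE_DOTALL), "." also matches newlines.
$withS = preg_match(
    '/<text>(.*)TYPE[ \r\n]*(OF|or)[ \r\n]*REPORTING[ \r\n]*PERSON/is',
    $string,
    $outp
);

var_dump($withoutS, $withS);
```

On this input the first call finds no match and the second one does, with the multi-line text captured in $outp[1].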
I'm using PHP 5.2.17. I want to remove some surplus data from a JSON string, and I thought I could use a replace function to do so. Specifically, I'm using eregi_replace with the following expression:
'^.*?(?=\"created_at)'
Which I've validated at http://www.regexpal.com. I've pasted my JSON string in there and the match is right. However, when I make the call:
$tweets = eregi_replace('^.*?(?=\"created_at)', $temp, 'something');
and then I echo the $tweets variable, there's no output. There are no errors in the console either. The Apache error log, however, complains about an error called REG_BADRPT. A comment in the PHP docs for eregi_replace suggests this can happen when special characters need to be escaped, but I've already escaped the " character. I've tried escaping others too, but the behavior doesn't change.
Where could the problem be then?
I don't think that ereg supports lookarounds. preg_replace exists in PHP 5.2, so you should really use that instead. It will work with your expression once you add delimiters. Note the argument order as well: pattern, replacement, subject.
$tweets = preg_replace('#^.*?(?=\"created_at)#i', 'something', $temp);
As other people have pointed out, the ereg functions are deprecated, so use preg_replace. You also have to enclose your regex in delimiters, typically slashes (/). You can put your regex flags after the closing delimiter.
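As a quick check of the advice above, here is a small sketch of the preg_replace call on a made-up JSON fragment. The contents of $temp are hypothetical and the replacement is an empty string, so that the surplus prefix is simply dropped.

```php
<?php
// Hypothetical input: some surplus data in front of the "created_at" key.
$temp = 'garbage {"id":1,"created_at":"Mon","text":"hi"}';

// Lazy .*? plus a lookahead: remove everything before the first "created_at.
// Both constructs are PCRE features and need the preg_ functions.
$tweets = preg_replace('#^.*?(?="created_at)#i', '', $temp);

echo $tweets, "\n";
```

If the surplus data can contain line breaks, the s modifier is needed too, since . does not match newlines by default.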
I found that syntax of preg_match() and the deprecated ereg() is different.
For example:
I thought that
preg_match('/^<div>(.*)<\/div>$/', $content);
means the same as
ereg('^<div>(.*)</div>$', $content);
but I was wrong. With preg_match(), . doesn't match special characters such as line breaks, the way it does with ereg().
So I started to use this syntax:
preg_match('/^<div>([^<]*)<\/div>$/', $content);
but it isn't exactly what I need.
Can anyone suggest how to solve this problem without using deprecated functions?
For parsing HTML I'd suggest reading this question and choosing a built in PHP extension.
If for some reason you need or want to use RegEx to do it you should know that:
preg_match() is a greedy little bugger: it will try to eat everything it can with (.*) till it gets sick (meaning it hits the recursion or backtracking limits). You can change this with the U modifier [1].
By default the engine treats the subject as a single line: . does not match newlines, and ^ and $ match only at the start and end of the string. You can change this with the s or m modifiers [1].
Using your 'not a < character' ([^<]*) hack does a good job, as it forces the engine to stop at the first < character, but it works only if the <div> doesn't contain other tags!
[1] PCRE Pattern Modifiers
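The effect of these modifiers can be seen in a short sketch. The $content string is a made-up example with a line break inside the first <div>, and # is used as the delimiter so that the slash in </div> needs no escaping.

```php
<?php
// Hypothetical input: two <div> blocks, the first one spanning two lines.
$content = "<div>first\nline</div><div>second</div>";

// s lets "." match newlines; the greedy (.*) runs to the LAST </div>.
preg_match('#<div>(.*)</div>#s', $content, $greedy);

// A lazy quantifier stops at the FIRST </div>.
preg_match('#<div>(.*?)</div>#s', $content, $lazy);

// The U modifier flips the default, so a plain (.*) becomes lazy.
preg_match('#<div>(.*)</div>#sU', $content, $ungreedy);

var_dump($greedy[1], $lazy[1], $ungreedy[1]);
```

The lazy and U variants capture only the first block, while the greedy one swallows the markup between the two blocks as well. None of them copes with nested <div> tags, which is why a DOM extension is the safer choice for real HTML.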
I'm looking for a multi-byte function to replace preg_match_all(). I need one that will give me an array of matched strings, like the $matches argument from preg_match(). The function mb_ereg_match() doesn't seem to do it -- it only gives me a boolean indicating if there were any matches.
Looking at the mb_* functions page, I don't offhand see anything that replaces the functionality of preg_match(). What do I use?
Edit: I'm an idiot. I originally posted this question asking for a replacement for preg_match, which of course is ereg_match. However, both of those only return the first result. What I wanted was a replacement for preg_match_all, which returns all matched texts. Anyway, the u modifier works in my case for preg_match_all, as hakre pointed out.
Have you taken a look into mb_ereg?
Additionally, you can pass a UTF-8 encoded string into preg_match using the u modifier, which might be the kind of multi-byte support you need. The other option is to convert the input to UTF-8 and then convert the results back to the original encoding.
See also an answer to a related question: Are the PHP preg_functions multibyte safe?
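Here is a minimal sketch of the u modifier with preg_match_all. It assumes the source file and the sample text are UTF-8 encoded; the sample text itself is made up.

```php
<?php
// Assumes this file is saved as UTF-8.
$text = 'naïve café déjà';

// With u, the pattern and subject are treated as UTF-8, so \p{L}+ matches
// whole words including the accented letters.
preg_match_all('/\p{L}+/u', $text, $matches);
print_r($matches[0]);

// Without u, "." matches single bytes; with u it matches whole characters.
$bytes = preg_match_all('/./', 'é');
$chars = preg_match_all('/./u', 'é');
var_dump($bytes, $chars);
```

Note that with the u modifier the preg_ functions fail (returning false) when the subject is not valid UTF-8, so input in another encoding has to be converted first.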
PHP: preg_grep manual
$matches = preg_grep('/(needles|to|find)/u', $inputArray);
Returns an array indexed using the keys from the input array.
Note the /u modifier which enables multibyte support.
Hope it helps others.
I get this warning from PHP after changing split to preg_split for PHP 5.3 compatibility:
PHP Warning: preg_split(): Delimiter must not be alphanumeric or backslash
The PHP code is:
$statements = preg_split("\\s*;\\s*", $content);
How can I fix the regex so that it no longer starts with \?
Thanks!
The error is because you need a delimiter character around your regular expression.
$statements = preg_split("/\s*;\s*/", $content);
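A short sketch of the corrected call on a made-up $content string. The PREG_SPLIT_NO_EMPTY flag in the second call is an optional extra, not part of the original fix; it drops the empty element that a trailing semicolon would otherwise produce.

```php
<?php
// Hypothetical input: statements separated by semicolons and loose whitespace.
$content = "SELECT 1 ;  SELECT 2;SELECT 3;";

// The pattern is wrapped in / delimiters; single quotes keep the backslashes literal.
$statements = preg_split('/\s*;\s*/', $content);

// Same split, but without the empty trailing element.
$nonEmpty = preg_split('/\s*;\s*/', $content, -1, PREG_SPLIT_NO_EMPTY);

print_r($statements);
print_r($nonEmpty);
```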
Although the question was tagged as answered two minutes after being asked, I'd like to add some information for the record.
Similar to the way strings are delimited by quotation marks, regular expressions in many languages, such as Perl or JavaScript, are delimited by forward slashes. This will lead to expressions looking like this:
/\s*;\s*/
This syntax also allows you to specify modifiers:
/\s*;\s*/Ui
PHP's Perl-compatible regular expressions (aka the preg_... functions) inherit this. However, PHP itself has no regex literal syntax, so feeding preg_split() a bare /\s*;\s*/ would raise a parse error. Instead, you enclose the whole thing, delimiters included, in quotes to build a regular string.
One more thing to take into account is that PHP allows you to change the delimiter. For instance, you can use this:
#\s*;\s*#Ui
What is it good for? It simplifies the use of forward slashes inside the expression since you don't need to escape them. Compare:
/^\/home\/.*$/i
#^/home/.*$#i
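The two patterns above behave identically, as this small sketch shows on a made-up path. The preg_quote() call at the end is an addition for illustration: when given the delimiter as its second argument, it escapes that character too, which helps when the pattern is built from a variable.

```php
<?php
// Hypothetical subject string.
$path = '/home/alice/docs';

// Slash delimiter: every / inside the pattern has to be escaped.
$withSlashes = preg_match('/^\/home\/.*$/i', $path);

// Hash delimiter: the slashes can stay as they are.
$withHashes = preg_match('#^/home/.*$#i', $path);

// preg_quote() escapes regex metacharacters and, optionally, the delimiter.
$quoted = preg_quote('/home/', '/');

var_dump($withSlashes, $withHashes, $quoted);
```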
If you don't like delimiters, you can use the T-Regx library:
pattern("\\s*;\\s*")->split($content);
You can also use Pattern::of("\\s*;\\s*")->split()
I'm trying to find the proper regular expression to convert eregi($1,$2) to preg_match("/$1/i",$2).
I need to handle arguments that are function calls containing (), and these may appear more than once.
Can anyone please provide a regular expression that does this?
Thanks
You don't want to use a regular expression to parse code.
You want to use a parser.
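As a rough illustration of the parser route, here is a minimal sketch built on PHP's bundled tokenizer (token_get_all). The helper name findEregiCalls and the sample source are hypothetical, and the sketch only locates eregi identifiers; rewriting the calls is left out.

```php
<?php
// Hypothetical helper: return the line numbers where eregi is used as an identifier.
function findEregiCalls(string $source): array
{
    $lines = [];
    foreach (token_get_all($source) as $token) {
        // Named tokens are arrays of [id, text, line]; punctuation is a plain string.
        if (is_array($token)
            && $token[0] === T_STRING
            && strcasecmp($token[1], 'eregi') === 0
        ) {
            $lines[] = $token[2];
        }
    }
    return $lines;
}

// Sample source held in a nowdoc, so nothing inside it is interpolated.
$source = <<<'PHP'
<?php
$s = 'eregi("a", $b)';
if (eregi('hello', $world)) {
    echo 'match';
}
PHP;

$found = findEregiCalls($source);
print_r($found);
```

The mention of eregi inside the string literal on line 2 is ignored, because the tokenizer reports it as part of a string rather than as an identifier. A plain regex over the source text cannot make that distinction.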
Are you trying to modify your source code because eregi is deprecated? For simple calls with two plain arguments, this regex will do the trick:
$source= <<<STR
eregi(\$1, \$2);
eregi('hello', 'world');
STR;
$source2= preg_replace("/eregi\(['\"]*([^\'\"),]+)['\"]*,\s*['\"]*([^'\"),]+)['\"]*\)/", 'preg_match("/$1/i", "$2")', $source);
var_dump($source2);