This question already has answers here:
How can I convert ereg expressions to preg in PHP?
(4 answers)
Closed 3 years ago.
Dear Sir/Ma'am,
How can I replace the deprecated ereg_replace with preg_replace or str_replace
and still have the same functionality as in the code below?
return ereg_replace("^(.*)%%number%%(.*)$","\\1$i\\2",$number);
// this doesn't work
return preg_replace("^(.*)%%number%%(.*)$","\\1$i\\2",$number);
Anyone smarter have a clue?
Try this:
return ereg_replace("^(.*)%%number%%(.*)$","\\1$i\\2",$number);
becomes
return preg_replace("/^(.*)%%number%%(.*)$/","\\1$i\\2",$number);
Note the / around the regex.
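One thing to watch when porting this particular call: if $i holds digits, "\\1$i\\2" can turn into something like \\17, which preg_replace may read as a two-digit backreference rather than \\1 followed by a literal 7. A minimal sketch that sidesteps this with the ${1} form (the sample values for $number and $i are made up for illustration):

```php
<?php
// Hypothetical sample values, not taken from the original question.
$number = 'Invoice %%number%% of 2024';
$i = 7;

// ${1} and ${2} are unambiguous backreferences, so the digit in $i
// cannot be absorbed into the backreference number.
$result = preg_replace('/^(.*)%%number%%(.*)$/', '${1}' . $i . '${2}', $number);
echo $result, "\n"; // Invoice 7 of 2024

// When the placeholder occurs exactly once, str_replace yields the same text.
$plain = str_replace('%%number%%', (string) $i, $number);
echo $plain, "\n"; // Invoice 7 of 2024
```

Note that the two are not interchangeable in general: the greedy regex replaces only the last placeholder, while str_replace replaces every occurrence.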
I'll go with a read the fabulous manual approach.
The PHP Manual has a section for moving from POSIX Regex to PCRE.
The PCRE functions require that the pattern is enclosed by delimiters.
Unlike POSIX, the PCRE extension does not have dedicated functions for case-insensitive matching. Instead, this is supported using the /i pattern modifier. Other pattern modifiers are also available for changing the matching strategy.
The POSIX functions find the longest of the leftmost match, but PCRE stops on the first valid match. If the string doesn't match at all it makes no difference, but if it matches it may have dramatic effects on both the resulting match and the matching speed. To illustrate this difference, consider the following example from "Mastering Regular Expressions" by Jeffrey Friedl. Using the pattern one(self)?(selfsufficient)? on the string oneselfsufficient with PCRE will result in matching oneself, but using POSIX the result will be the full string oneselfsufficient. Both (sub)strings match the original string, but POSIX requires that the longest be the result.
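The PCRE half of the Friedl example quoted above is easy to check. The sketch below also shows one possible way (my own suggestion, not from the manual) to recover the longer match by listing the longer alternative first:

```php
<?php
$subject = 'oneselfsufficient';

// PCRE keeps the first path that succeeds: (self)? consumes "self",
// after which (selfsufficient)? can only match the empty string.
preg_match('/one(self)?(selfsufficient)?/', $subject, $first);
echo $first[0], "\n"; // oneself

// Putting the longer alternative first makes PCRE try it first.
preg_match('/one(selfsufficient|self)?/', $subject, $longest);
echo $longest[0], "\n"; // oneselfsufficient
```

Patterns that silently relied on POSIX longest-match behaviour are the ones worth reviewing by hand after a conversion.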
Good luck,
Alin
Perl Compatible Regular Expressions, used by the preg_ functions in PHP, require a delimiter character around the pattern string. The delimiters mark where the actual pattern starts and ends, and where modifiers for extra functionality, such as case insensitivity, are placed.
For example:
$pattern = "/dog/i"; // Search pattern for "dog", case insensitive.
$replace = "cat";
$subject = "Dogs are cats.";
$result = preg_replace($pattern, $replace, $subject);
Related
How can I make the following regex ignore case sensitivity? It should match all the correct characters but ignore whether they are lower or uppercase.
G[a-b].*
Assuming you want the whole regex to ignore case, you should look for the i flag. Nearly all regex engines support it:
/G[a-b].*/i
string.match("G[a-b].*", "i")
Check the documentation for your language/platform/tool to find how the matching modes are specified.
If you want only part of the regex to be case insensitive (as my original answer presumed), then you have two options:
Use the (?i) and [optionally] (?-i) mode modifiers:
(?i)G[a-b](?-i).*
Put all the variations (i.e. lowercase and uppercase) in the regex - useful if mode modifiers are not supported:
[gG][a-bA-B].*
One last note: if you're dealing with Unicode characters besides ASCII, check whether or not your regex engine properly supports them.
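As a rough illustration in PHP (PCRE), here is the difference between the whole-pattern i flag and the inline (?i)/(?-i) modifiers. The trailing literal x is my own substitution for .* so that the case-sensitive part becomes visible, and the test strings are made up:

```php
<?php
// Whole pattern is case-insensitive because of the trailing i modifier.
$whole = preg_match('/G[a-b].*/i', 'gB and more');

// Only "G[a-b]" is case-insensitive; the "x" after (?-i) is case-sensitive.
$partLower = preg_match('/(?i)G[a-b](?-i)x/', 'gAx');
$partUpper = preg_match('/(?i)G[a-b](?-i)x/', 'gAX');

var_dump($whole, $partLower, $partUpper); // int(1) int(1) int(0)
```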
Depends on implementation
but I would use
(?i)G[a-b].
VARIATIONS:
(?i) case-insensitive mode ON
(?-i) case-insensitive mode OFF
Modern regex flavors allow you to apply modifiers to only part of the regular expression. If you insert the modifier (?im) in the middle of the regex then the modifier only applies to the part of the regex to the right of the modifier. With these flavors, you can turn off modes by preceding them with a minus sign (?-i).
Description is from the page:
https://www.regular-expressions.info/modifiers.html
Regular expression to validate 'abc', ignoring case:
(?i)(abc)
The i flag is normally used for case insensitivity. You don't give a language here, but it'll probably be something like /G[ab].*/i or /(?i)G[ab].*/.
Just for the sake of completeness I wanted to add the solution for regular expressions in C++ with Unicode:
std::tr1::wregex pattern(szPattern, std::tr1::regex_constants::icase);
if (std::tr1::regex_match(szString, pattern))
{
...
}
JavaScript
If you want to make it case insensitive just add i at the end of regex:
'Test'.match(/[A-Z]/gi) //Returns ["T", "e", "s", "t"]
Without i
'Test'.match(/[A-Z]/g) //Returns ["T"]
In JavaScript you should pass the i flag to the RegExp constructor as stated in MDN:
const regex = new RegExp('(abc)', 'i');
regex.test('ABc'); // true
As I discovered from this similar post (ignorecase in AWK), on old versions of awk (such as on vanilla Mac OS X), you may need to use 'tolower($0) ~ /pattern/'.
IGNORECASE or (?i) or /pattern/i will either generate an error or return true for every line.
C#
using System.Text.RegularExpressions;
...
Regex.Match(
input: "Check This String",
pattern: "Regex Pattern",
options: RegexOptions.IgnoreCase)
specifically: options: RegexOptions.IgnoreCase
[gG][aAbB].* is probably the simplest solution if the pattern is not too complicated or long.
Addition to the already-accepted answers:
Grep usage:
Note that for grepping it is simply the addition of the -i option. Ex: grep -rni regular_expression to search for this 'regular_expression' 'r'ecursively, case 'i'nsensitive, showing line 'n'umbers in the result.
Also, here's a great tool for verifying regular expressions: https://regex101.com/
References:
man pages (man grep)
http://droptips.com/using-grep-and-ignoring-case-case-insensitive-grep
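In Kotlin (rather than plain Java), the Regex constructor has the form
Regex(pattern: String, option: RegexOption)
So to ignore case, use
option = RegexOption.IGNORE_CASE
In Java itself, the equivalent is Pattern.compile(pattern, Pattern.CASE_INSENSITIVE).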
Kotlin:
"G[a-b].*".toRegex(RegexOption.IGNORE_CASE)
You can also convert the string you are going to check to lower case and use only lower-case characters in your pattern.
You can practice Regex In Visual Studio and Visual Studio Code using find/replace.
You need to select both Match Case and Regular Expressions for case-sensitive regular expressions. Otherwise [A-Z] won't work as expected.
I found that syntax of preg_match() and the deprecated ereg() is different.
For example:
I thought that
preg_match('/^<div>(.*)<\/div>$/', $content);
means the same as
ereg('^<div>(.*)</div>$', $content);
but I was wrong. preg_match() does not match special characters such as newlines the way ereg() does.
So I started to use this syntax:
preg_match('/^<div>([^<]*)<\/div>$/', $content);
but it isn't exactly what I need.
Can anyone suggest me how to solve this problem, without using deprecated functions?
For parsing HTML I'd suggest reading this question and choosing a built in PHP extension.
If for some reason you need or want to use RegEx to do it you should know that:
preg_match() is a greedy little bugger and it will try to eat your anything (.*) till it gets sick (meaning it hits recursion or backtracking limits). You change this with the U modifier [1].
the engine expects to be fed a single line. You change this with the m or s modifiers [1].
using your 'not a < character' ([^<]*) hack does a good job as it forces the engine to stop at the first < char, but will work only if the <div> doesn't contain other tags inside!
ref: [1] PCRE Pattern Modifiers
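For what it's worth, the newline and greediness points can be sketched like this. The sample $content is invented, and ~ is used as the delimiter only so that the / in </div> needs no escaping:

```php
<?php
// Hypothetical sample input with a newline inside the <div>.
$content = "<div>line one\nline two</div>";

// Without the s modifier "." stops at the newline, so nothing matches.
$withoutS = preg_match('~^<div>(.*)</div>$~', $content);

// With s (dot also matches newlines) the whole body is captured.
$withS = preg_match('~^<div>(.*)</div>$~s', $content, $m);
echo $withoutS, ' ', $withS, "\n"; // 0 1

// A lazy quantifier (.*?) stops at the first closing tag instead of the last.
preg_match_all('~<div>(.*?)</div>~s', '<div>a</div><div>b</div>', $all);
print_r($all[1]); // a and b as separate captures
```

This is still only a sketch for flat markup; nested tags remain a job for an HTML parser, as the answer says.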
I work on a large PHP application (>1 million lines, 10 yrs old) which makes extensive use of ereg and ereg_replace - currently 1,768 unique regular expressions in 516 classes.
I'm very aware why ereg is being deprecated but clearly migrating to preg could be highly involved.
Does anyone know how long ereg support is likely to be maintained in PHP, and/or have any advice for migrating to preg on this scale. I suspect automated translation from ereg to preg is impossible/impractical?
I'm not sure when ereg will be removed but my bet is as of PHP 6.0.
Regarding your second issue (translating ereg to preg), it doesn't seem that hard. If your application has more than 1 million lines, surely you have the resources to put someone on this job for a week at most. I would grep all the ereg_ instances in your code and set up some macros in your favorite IDE (simple stuff like adding delimiters, modifiers and so on).
I bet most of the 1768 regexes can be ported using a macro, and the others, well, a good pair of eyes.
Another option might be to write wrappers around the ereg functions if they are not available, implementing the changes as needed:
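if (function_exists('ereg') !== true)
{
    // Fallback for PHP builds without the POSIX regex extension.
    // Note: returns preg_match()'s 1/0 rather than ereg()'s match length/false.
    function ereg($pattern, $string, &$regs = null)
    {
        // Wrap the POSIX pattern in ~ delimiters, escaping any literal ~.
        return preg_match('~' . addcslashes($pattern, '~') . '~', $string, $regs);
    }
}
if (function_exists('eregi') !== true)
{
    // Same as above, with the i modifier for case-insensitive matching.
    function eregi($pattern, $string, &$regs = null)
    {
        return preg_match('~' . addcslashes($pattern, '~') . '~i', $string, $regs);
    }
}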
You get the idea. Also, PEAR package PHP Compat might be a viable solution too.
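Along the same lines, a replacement wrapper might look roughly like the sketch below. The function name ereg_replace_compat and its $ignoreCase flag are hypothetical (chosen so it cannot clash with a built-in ereg_replace), and it assumes the existing patterns do not depend on POSIX-only behaviour such as longest-match semantics:

```php
<?php
// Hypothetical helper; not part of PHP or of the PHP_Compat package.
function ereg_replace_compat($pattern, $replacement, $string, $ignoreCase = false)
{
    // Wrap the POSIX pattern in ~ delimiters, escaping any literal ~,
    // and append the i modifier when eregi_replace() behaviour is wanted.
    $delimited = '~' . addcslashes($pattern, '~') . '~' . ($ignoreCase ? 'i' : '');
    return preg_replace($delimited, $replacement, $string);
}

echo ereg_replace_compat('[0-9]+', '#', 'room 12, floor 3'), "\n"; // room #, floor #
echo ereg_replace_compat('abc', 'x', 'ABC abc', true), "\n";       // x x
```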
Differences from POSIX regex
As of PHP 5.3.0, the POSIX Regex extension is deprecated. There are a number of differences between POSIX regex and PCRE regex. This page lists the most notable ones that are necessary to know when converting to PCRE.
The PCRE functions require that the pattern is enclosed by delimiters.
Unlike POSIX, the PCRE extension does not have dedicated functions for case-insensitive matching. Instead, this is supported using the /i pattern modifier. Other pattern modifiers are also available for changing the matching strategy.
The POSIX functions find the longest of the leftmost match, but PCRE stops on the first valid match. If the string doesn't match at all it makes no difference, but if it matches it may have dramatic effects on both the resulting match and the matching speed. To illustrate this difference, consider the following example from "Mastering Regular Expressions" by Jeffrey Friedl. Using the pattern one(self)?(selfsufficient)? on the string oneselfsufficient with PCRE will result in matching oneself, but using POSIX the result will be the full string oneselfsufficient. Both (sub)strings match the original string, but POSIX requires that the longest be the result.
My intuition says that they are never going to remove ereg on purpose. PHP still supports really old and deprecated stuff like register globals. There're simply too many outdated apps out there. There's however a little chance that the extension has to be removed because someone finds a serious vulnerability and there's just nobody to fix it.
In any case, it's worth noting that:
You are not forced to upgrade your PHP installation. It's pretty common to keep outdated servers to run legacy apps.
The PHP_Compat PEAR package offers plain PHP version of some native functions. If ereg disappears, it's possible that it gets added.
BTW... In fact, PHP 6 is dead. They realised that their approach to make PHP fully Unicode compliant was harder than they thought and they are rethinking it all. The conclusion is: you can never make perfect predictions.
I had this problem on a much smaller scale - an application more like 10,000 lines. In every case, all I needed to do was switch to preg_replace() and put delimiters around the regex pattern.
Anyone should be able to do that - even a non-programmer can be given a list of filenames and line numbers.
Then just run your tests to watch for any failures that can be fixed.
ereg functions will be removed from PHP6, by the way - http://jero.net/articles/php6.
All ereg functions will be removed as of PHP 6, I believe.
Can anyone give me a quick summary of the differences between str_replace and preg_replace, please?
To my mind, are they both doing the same thing?
str_replace replaces a specific occurrence of a string, for instance "foo" will only match and replace that: "foo". preg_replace will do regular expression matching, for instance "/f.{2}/" will match and replace "foo", but also "fey", "fir", "fox", "f12", etc.
[EDIT]
See for yourself:
$string = "foo fighters";
$str_replace = str_replace('foo','bar',$string);
$preg_replace = preg_replace('/f.{2}/','bar',$string);
echo 'str_replace: ' . $str_replace . ', preg_replace: ' . $preg_replace;
The output is:
str_replace: bar fighters, preg_replace: bar barhters
:)
str_replace will just replace a fixed string with another fixed string, and it will be much faster.
The regular expression functions allow you to search for and replace with a non-fixed pattern called a regular expression. There are many "flavors" of regular expression which are mostly similar but differ in certain details; the one we are talking about here is Perl Compatible Regular Expressions (PCRE).
If they look the same to you, then you should use str_replace.
str_replace searches for pure text occurrences, while preg_replace searches for patterns.
I have not tested this myself, but it is probably worth testing: according to some sources, preg_replace is 2x faster on PHP 7 and above.
See more here: preg_replace vs string_replace.
I'm trying to strip all punctuation out of a string using a simple regular expression and the php preg_replace function, although I get the following error:
Compilation failed: POSIX named classes are supported only within a class at offset 0
I guess this means I can't use POSIX named classes outside of a class at offset 0. My question is, what does it mean when it says "within a class at offset 0"?
$string = "I like: perl";
if (eregi('[[:punct:]]', $string))
$new = preg_replace('[[:punct:]]', ' ', $string); echo $new;
The preg_* functions expect Perl compatible regular expressions with delimiters. So try this:
preg_replace('/[[:punct:]]/', ' ', $string)
NOTE: The g modifier is not needed with PHP's PCRE implementation!
In addition to Gumbo's answer, use the g modifier to replace all occurrences of punctuation:
preg_replace('/[[:punct:]]/g', ' ', $string)
// ^
From Johnathan Lonowski (see comments):
> [The g modifier] means "Global" -- i.e., find all existing matches. Without it, regex functions will stop searching after the first match.
An explanation of why you're getting that error: PCRE uses Perl's loose definition of what a delimiter is. Your outer []s look like valid delimiters to it, causing it to read [:punct:] as the regex part.
(Oh, and avoid the ereg functions if you can - they are deprecated as of PHP 5.3.)
I just added g to the regexp as suggested in one of the answers. It did the opposite of what's expected and DIDN'T filter out the punctuation. It turns out preg_replace doesn't require g, as it's global/recursive in the first place.
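The point that preg_replace is already global can be checked with a short sketch. The sample string is made up; the optional fourth argument ($limit) is the way to replace only the first match:

```php
<?php
// Hypothetical sample string with several punctuation characters.
$string = 'I like: perl, PHP; and awk!';

// No g modifier needed: every match is replaced by default.
$all = preg_replace('/[[:punct:]]/', ' ', $string);
echo $all, "\n"; // "I like  perl  PHP  and awk "

// Passing a limit of 1 replaces only the first match.
$first = preg_replace('/[[:punct:]]/', ' ', $string, 1);
echo $first, "\n"; // "I like  perl, PHP; and awk!"
```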