Php replace characters matching the regular expression


I tried using preg_replace to replace characters matching a regular expression, but I am getting the error message
"Warning: preg_replace(): No ending delimiter '_' found"
$oldString = "";
$newString = preg_replace("/[^a-z0-9_]/ig", "", $oldString);
Here I am trying to remove all characters other than letters, digits, and underscores.

PHP does not support the g (global) modifier. Removing it will do, because preg_replace already replaces every match by default.
See the PHP manual for the list of supported pattern modifiers.

I think PHP doesn't accept the g modifier after your trailing /. I've had trouble with this as well, and removing the g fixes it. preg_replace takes optional parameters after the subject string that control how many replacements are made; it is global by default.
The manual says you set the limit with the 4th parameter (limit), and you can optionally pass a 5th parameter (count), which receives the number of replacements performed.
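As a rough illustration of the limit and count parameters described above, here is a minimal sketch. The sample input string is an assumption made up for the demonstration.

```php
<?php
// preg_replace() replaces every match by default, so no "g" modifier is needed.
$oldString = "Hello, World! foo_bar 42"; // sample input (assumed)

// Remove everything except letters, digits, and underscores.
$all = preg_replace('/[^a-z0-9_]/i', '', $oldString);

// 4th argument: maximum number of replacements (-1 means no limit).
// 5th argument: receives the number of replacements actually made.
$firstOnly = preg_replace('/[^a-z0-9_]/i', '', $oldString, 1, $count);

echo $all, "\n";       // HelloWorldfoo_bar42
echo $firstOnly, "\n"; // Hello World! foo_bar 42
echo $count, "\n";     // 1
```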
For my money this is just another thing that PHP does 1/2 right, which all adds up to it being just about a perfectly 1/2 assed language. But that's neither here nor there :)
Oh, and welcome to Stack! :)

First of all, there is no g modifier for preg_replace.
$oldString = "";
$newString = preg_replace("/[^a-z0-9_]+/i", "", $oldString);
Second, you can put a quantifier such as + after the character class so that each run of unwanted characters is removed as a single match.

In regex, \W matches any character that is not a letter, digit, or underscore. Keep in mind that this will also remove spaces.
$oldString = "This, is not _all_ alpha-numeric";
$newString = preg_replace("/\W+/", "", $oldString);
# Gives "Thisisnot_all_alphanumeric"
$newString = preg_replace("/[^\w ]+/", "", $oldString);
# Gives "This is not _all_ alphanumeric"

Related

Function preg_quote works incorrect?

Suppose I want to validate input so that it allows Unicode letters and numbers plus a configured set of symbols.
$allow_symbols = './*!#%&[]:,-_ ';
// $allow_symbols = '';
$pattern = '/^['.preg_quote($allow_symbols).'\p{L}\p{N}]+$/iu';
print $pattern."\n";
preg_match($pattern, '');
Sandbox is here: http://sandbox.onlinephpfunctions.com/code/b99a8f042695d1dc1528834d21e6eb6ad62972e6
I got
Warning: preg_match(): Unknown modifier '\' in [...][...] on line 9
The problem originates from $allow_symbols: if I override it with an empty string, as in the commented-out line, nothing goes wrong. And when I paste the exact printed pattern into https://www.phpliveregex.com/p/rxj it works fine.
So, what's the matter and how do I deal with it?

preg_quote does not escape the regex's delimiter by default, because it can be any non-alphanumeric, non-backslash, non-whitespace character.
Set its second parameter ($delimiter) to also escape forward slashes:
$escaped_symbols = preg_quote($allow_symbols, '/');
$pattern = "/^[$escaped_symbols\p{L}\p{N}]+$/iu";
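To make the effect of the second argument concrete, here is a small sketch that compares preg_quote() with and without the delimiter and then uses the safe pattern. The test strings are made up for illustration.

```php
<?php
$allow_symbols = './*!#%&[]:,-_ ';

// Without the second argument the "/" stays unescaped and would end the pattern early.
$unsafe = preg_quote($allow_symbols);
// With the delimiter passed in, "/" is escaped as "\/".
$safe = preg_quote($allow_symbols, '/');

$pattern = '/^[' . $safe . '\p{L}\p{N}]+$/iu';

var_dump(strpos($unsafe, '\/') !== false); // bool(false)
var_dump(strpos($safe, '\/') !== false);   // bool(true)

// Unicode letters, digits, and the configured symbols are accepted.
var_dump(preg_match($pattern, "H\u{00e9}llo, w\u{00f6}rld #42!")); // int(1)
// "<" and ">" are not in the allowed set.
var_dump(preg_match($pattern, 'no <tags>'));                       // int(0)
```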

You can use the T-Regx library, which automatically chooses delimiters and handles unsafe characters:
$allow_symbols = './*!#%&[]:,-_ ';
Pattern::prepare(['^[', [$allow_symbols], '\p{L}\p{N}]+$'], 'iu')->match('');

Regular Expression to check if string ends with one underscore and two letters with php

I'm trying to check if a string ends with one _ and two letters on an old system with PHP. I've checked here on Stack Overflow for answers and found one that wanted to do the same but with one . and two digits.
I tried to change it to work with my needs, and I got this:
\\.*\\_\\a{2,2}$
Then I went to php and tried this:
$regex = '(\\.*\\_\\a{2,2}$)';
echo preg_match($regex, $key);
But this always returns an error, saying the following:
preg_match(): Delimiter must not be alphanumeric or backslash
I guess this happens because I can't use the backslashes or something. How can I do this correctly? And is my regex correct? (I don't know how to form these expressions or how they work.)

You can use this regex with delimiters:
$regex = '/_[a-z]{2}$/i';
You're getting that error because in PHP every regex needs a delimiter (note the use of / above; it could be another character such as ~).

^.*_[a-zA-Z]{2}$
This should do it for you.
$re = "/^.*_[a-zA-Z]{2}$/";
$str = "abc_ac";
preg_match($re, $str);
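For a quick check of the suffix pattern, here is a minimal sketch. The helper name endsWithUnderscoreAndTwoLetters is hypothetical, and the type declarations assume PHP 7 or later (drop them on an older system).

```php
<?php
// Hypothetical helper: true when $key ends with "_" followed by exactly two letters.
function endsWithUnderscoreAndTwoLetters(string $key): bool
{
    // i: case-insensitive; D: "$" matches only at the very end of the string,
    // not before a trailing newline.
    return preg_match('/_[a-z]{2}$/iD', $key) === 1;
}

var_dump(endsWithUnderscoreAndTwoLetters('abc_ac'));    // bool(true)
var_dump(endsWithUnderscoreAndTwoLetters('report_EN')); // bool(true)
var_dump(endsWithUnderscoreAndTwoLetters('abc_a1'));    // bool(false)
var_dump(endsWithUnderscoreAndTwoLetters('abc_abc'));   // bool(false)
```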

Regexp for preg_replace in PHP

I have strings like this (some examples):
F7998FM3213/02F
J442554NM/05
K439459845/34D
I need to use PHP with preg_replace and regular expressions to delete all non-numeric characters in any string, after the forward-slash, '/'.
For example the codes above would look like this afterwards:
F7998FM3213/02
J442554NM/05
K439459845/34

If you're going for readability, something like this would be perfect:
$parts = explode("/",$line,2);
$parts[1] = preg_replace("/\D/","",$parts[1]);
$output = implode("/",$parts);
However, for conciseness and based entirely on the examples you have given, try this:
$output = preg_replace("/\D+$/","",$input);
This will strip any non-numeric characters from the end of the string, which seems to be what you're after based on your examples.
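The readable explode-based approach can be wrapped in a function, sketched below. The name stripNonDigitsAfterSlash and the guard for strings without a slash are assumptions added for illustration.

```php
<?php
// Hypothetical helper: removes every non-digit character after the first "/".
function stripNonDigitsAfterSlash(string $line): string
{
    $parts = explode('/', $line, 2);
    if (count($parts) < 2) {
        return $line; // no slash: nothing to clean (assumed behavior)
    }
    $parts[1] = preg_replace('/\D/', '', $parts[1]);
    return implode('/', $parts);
}

foreach (['F7998FM3213/02F', 'J442554NM/05', 'K439459845/34D'] as $code) {
    echo stripNonDigitsAfterSlash($code), "\n";
}
// F7998FM3213/02
// J442554NM/05
// K439459845/34
```

Unlike the shorter "/\D+$/" version, this also handles non-digits in the middle of the suffix, such as 34D34.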

You can use this:
$subject = <<<LOD
F7998FM3213/02F
J442554NM/05
K439459845/34D
K439459845/34D34
LOD;
echo preg_replace('~^[^/]*+/\K|[^\d\n]++~m', '', $subject);
Explanation:
The regex is an alternation between two things:
the beginning of the line up to and including the /, which \K then drops from the match so that it is not replaced
one or more characters after the / that are neither a digit nor a newline
Since the beginning of each line is tried first, only the non-digit characters after the / are removed.

To remove all \D anywhere after a / you could replace:
(?:/\K|\G(?!^))(\d*)\D+
with $1. Like:
preg_replace(',(?:/\K|\G(?!^))(\d*)\D+,', '$1', $str);
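Here is a short sketch that applies the \G-based pattern to single strings. The helper name cleanSuffix and the extra sample inputs are assumptions for illustration.

```php
<?php
// Hypothetical helper applying the \G-based pattern to one code at a time.
function cleanSuffix(string $str): string
{
    // "/\K" starts a match right after the slash; "\G(?!^)" continues from the
    // end of the previous match. Each match keeps its digits ($1) and drops
    // the non-digits that follow them.
    return preg_replace(',(?:/\K|\G(?!^))(\d*)\D+,', '$1', $str);
}

echo cleanSuffix('F7998FM3213/02F'), "\n";  // F7998FM3213/02
echo cleanSuffix('J442554NM/05'), "\n";     // J442554NM/05
echo cleanSuffix('K439459845/34D34'), "\n"; // K439459845/3434
echo cleanSuffix('A/1B2C3'), "\n";          // A/123
```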

preg_replace PHP not working?

Why doesn't preg_replace return anything in this scenario? I've been trying to figure it out all night.
Here is the text contained within $postContent:
Test this. Here is a quote: [Quote]1[/Quote] Quote is now over.
Here is my code:
echo "Test I'm Here!!!";
$startQuotePos = strpos($postContent,'[Quote]')+7;
$endQuotePos = strpos($postContent,'[/Quote]');
$postStrLength = strlen($postContent);
$quotePostID = substr($postContent,$startQuotePos,($endQuotePos-$postStrLength));
$quotePattern = '[Quote]'.$quotePostID.'[/Quote]';
$newPCAQ = preg_replace($quotePattern,$quotePostID,$postContent);
echo "<br />$startQuotePos<br />$endQuotePos<br />$quotePostID<br />Qpattern:$quotePattern<br />PCAQ: $newPCAQ<br />";
This is my results:
Test I'm Here!!!
35
36
1
Qpattern:[Quote]1[/Quote]
PCAQ:

For preg_replace(), "[Quote]" matches a single character that is one of the following: Q, u, o, t, or e.
If you want preg_replace() to find the literal "[Quote]", you need to escape it as "\[Quote\]". preg_quote() is the function to use for that: preg_quote("[Quote]").
Your code is also wrong because a regular expression must start with a delimiter. In the preg_replace() call shown below the delimiter is #, but you could use another character, as long as it doesn't appear unescaped in the regular expression and is also used at the end of it. (In my case, the closing # is followed by a pattern modifier; pattern modifiers are the only characters allowed after the closing delimiter.)
If you are going to use preg_replace(), there is no need to first find where "[Quote]" is. I would rather use the following code:
$newPCAQ = preg_replace('#\[Quote\](.+?)\[/Quote\]#i', '\1', $postContent);
Here is how the regular expression works:
The final '#i' tells preg_replace() to ignore the difference between lowercase and uppercase characters; the string could contain "[QuOte]234[/QuOTE]", and that substring would still match.
I use a question mark in "(.+?)" so that ".+" is not greedy and does not match too many characters. Without it, a single match could span a substring like "[Quote]234[/Quote] Other text [Quote]475[/Quote]", while this should be matched as two substrings: "[Quote]234[/Quote]" and "[Quote]475[/Quote]".
The '\1' replacement string tells preg_replace() to use the text matched by the sub-group "(.+?)" as the replacement. In other words, the call to preg_replace() removes the "[Quote]" and "[/Quote]" surrounding other text. (It doesn't replace a "[/Quote]" that has no matching "[Quote]" before it, such as in "[/Quote] Other text [Quote]".)
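The difference between the lazy and greedy quantifier can be seen in a small sketch. The sample string with two quotes is made up for the demonstration.

```php
<?php
$postContent = 'A [Quote]234[/Quote] B [quote]475[/QUOTE] C'; // sample input (assumed)

// Lazy (.+?): each [Quote]...[/Quote] pair is matched separately.
$lazy = preg_replace('#\[Quote\](.+?)\[/Quote\]#i', '\1', $postContent);

// Greedy (.+): one match runs from the first [Quote] to the last [/Quote].
$greedy = preg_replace('#\[Quote\](.+)\[/Quote\]#i', '\1', $postContent);

echo $lazy, "\n";   // A 234 B 475 C
echo $greedy, "\n"; // A 234[/Quote] B [quote]475 C
```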

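Your regex must start and end with a delimiter such as '/', and any '/' or bracket inside the pattern must be escaped:
$quotePattern = '/' . preg_quote('[Quote]'.$quotePostID.'[/Quote]', '/') . '/';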

The reason you don't see anything for the return value of preg_replace is that it has returned NULL (see the manual for details). This is what preg_replace returns when an error occurs, which is what happened in your situation. The string value of NULL is a zero-length string. You can see this by using var_dump instead, which will tell you that preg_replace returned NULL.
Your regular expression is invalid, and as such PHP will raise an E_WARNING level error: Warning: preg_replace(): Unknown modifier '['
There are a couple of reasons for this. First, you need to specify an opening and closing delimiter for your regular expression, as the preg_* functions use PCRE-style regular expressions. Second, you should also consider using preg_quote on your pattern (sans the delimiter) to ensure it is escaped properly.
$postContent = "Test this. Here is a quote: [Quote]1[/Quote] Quote is now over.";
/* Specify a delimiter for your regular expression */
$delimiter = '#';
$startQuotePos = strpos($postContent,'[Quote]')+7;
$endQuotePos = strpos($postContent,'[/Quote]');
$postStrLength = strlen($postContent);
$quotePostID = substr($postContent,$startQuotePos,($endQuotePos-$postStrLength));
/* Make sure you use the delimiter in your pattern and escape it properly */
$quotePattern = $delimiter . preg_quote("[Quote]{$quotePostID}[/Quote]", $delimiter) . $delimiter;
$newPCAQ = preg_replace($quotePattern,$quotePostID,$postContent);
echo "<br />$startQuotePos<br />$endQuotePos<br />$quotePostID<br />Qpattern:$quotePattern<br />PCAQ: $newPCAQ<br />";
The output will be:
35
36
1
Qpattern:#\[Quote\]1\[/Quote\]#
PCAQ: Test this. Here is a quote: 1 Quote is now over.
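The NULL return value can be checked directly, as in this minimal sketch. The @ operator is used here only to keep the warning out of the demonstration output.

```php
<?php
$postContent = "Test this. Here is a quote: [Quote]1[/Quote] Quote is now over.";

// Invalid pattern: no proper delimiters. preg_replace() raises an E_WARNING
// (silenced here with @) and returns NULL.
$bad = @preg_replace('[Quote]1[/Quote]', '1', $postContent);
var_dump($bad); // NULL

// Valid pattern: delimiter added and the literal text escaped with preg_quote().
$pattern = '#' . preg_quote('[Quote]1[/Quote]', '#') . '#';
$good = preg_replace($pattern, '1', $postContent);

echo $pattern, "\n"; // #\[Quote\]1\[/Quote\]#
echo $good, "\n";    // Test this. Here is a quote: 1 Quote is now over.
```

Comparing the result against null before echoing it makes this kind of failure visible instead of printing an empty string.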

Making a url regex global

I've been searching for a regex to replace plain-text URLs in a string (the string can contain more than one URL) with:
url
and I found this:
http://mathiasbynens.be/demo/url-regex
I would like to use diegoperini's regex (which according to the tests is the best):
_^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$_iuS
But I want to make it global so that it replaces all the URLs in a string.
When I use this:
/_(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?_iuS/g
It does not work. How do I make this regex global, and what do the underscore at the beginning and the "_iuS" at the end mean?
I would like to use it with PHP, so I am using:
preg_replace($regex, '$0', $examplestring);

The underscores are the regex delimiters; i, u and S are pattern modifiers:
i (PCRE_CASELESS)
If this modifier is set, letters in the pattern match both upper and lower
case letters.
u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible
with Perl. Pattern and subject strings are treated as UTF-8.
S
When a pattern is going to be used several times, it is worth spending more
time analyzing it in order to speed up the time taken for matching. If this
modifier is set, then this extra analysis is performed. At present, studying
a pattern is useful only for non-anchored patterns that do not have a single
fixed starting character.
For more information see http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
When you added / ... /g, you added a second pair of regex delimiters plus the modifier g, which does not exist in PCRE; that's why it did not work.
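To illustrate that no g modifier is needed, here is a minimal sketch. It uses a deliberately simplified stand-in pattern (an assumption, not the full regex from the question) and a made-up sample string.

```php
<?php
$text = 'See http://example.com/a and https://example.org/b?x=1 for details.'; // sample (assumed)

// Simplified stand-in for the long URL regex; "_" is the delimiter, "iu" the modifiers.
$regex = '_(?:https?|ftp)://\S+_iu';

// preg_replace() replaces all matches by default; $count receives how many.
$result = preg_replace($regex, '<$0>', $text, -1, $count);

echo $result, "\n"; // See <http://example.com/a> and <https://example.org/b?x=1> for details.
echo $count, "\n";  // 2

// With ^ and $ the pattern must match the whole string, so nothing is replaced here.
$anchored = preg_replace('_^(?:https?|ftp)://\S+$_iu', '<$0>', $text);
var_dump($anchored === $text); // bool(true)
```

The anchors are what stop the original validation regex from finding URLs inside a longer string, which is why they have to be removed for replacement use.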

I agree with @verdesmarald and used this pattern in the following function:
$string = preg_replace_callback(
    '_(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?_iuS',
    // An anonymous function replaces create_function(), which was removed in PHP 8.
    function ($match) {
        // Normalize the matched URL, then strip the scheme and a leading "www."
        $m = trim(strtolower($match[0]));
        $m = str_replace(array("http://", "https://", "ftp://", "www."), "", $m);
        // Truncate long URLs for display
        if (strlen($m) > 25) {
            $m = substr($m, 0, 25) . "...";
        }
        return $m;
    },
    $string
);
return $string;
It seems to do the trick and resolved an issue I was having. As @verdesmarald said, removing the ^ and $ characters allowed the pattern to work even in my preg_replace_callback().
The only thing that concerns me is how efficient the pattern is. If used in a busy, high-traffic web app, could it cause a bottleneck?
UPDATE
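The above regex pattern breaks if there is a trailing dot at the end of the path section of a URL, as in http://www.mydomain.com/page. (including the final dot). To solve this I modified the final part of the regex pattern by adding ^. to the character class, making the final part look like so: [^\s^.]. As I read it: do not match a trailing space or dot. (Inside a character class only the first ^ negates; the second ^ is a literal caret, so [^\s.] is sufficient.)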
In my tests so far it seems to be working fine.
