Capture Group Not Captured in Preg_replace Replacement

I have the following code:
$f = 'IMG_1474.PNG';
echo preg_replace('/(.*)([.][^.]+$)/', '$1' . time() . '$2', $f);
which should split the IMG_1474 and .PNG and add the epoch in between. The first capture group appears to be empty though.
If preg_match is used I can see the first capture group is not empty (so regex performs the same in PHP as on regex101).
preg_match('/(.*)([.][^.]+$)/', $f, $match);
print_r($match);
Functional demo: https://3v4l.org/VgDWJ (the 604147502 is the time() result)
My presumption is that something is happening with the concatenation on the replace bit.

In fact, the issue is the concatenation of the capture groups with time(). Use this version instead:
$f = 'IMG_1474.PNG';
$output = preg_replace('/(.*)([.][^.]+$)/', '${1}' . time() . '${2}', $f);
echo $output;
This prints:
IMG_14741604148316.PNG
        ^^^^^^^^^^ seconds since the epoch (1970-01-01)
To understand why this is happening, consider that the string concatenation is evaluated before preg_replace is called, so the function only ever sees the combined string. Your concatenated replacement term:
'$1' . time() . '$2'
actually first becomes:
'$1' . '1604148623' . '$2'
which is:
'$11604148623' . '$2'
PHP reads at most two digits after $ in a replacement string, so the "first" capture group here is interpreted as $11, followed by the literal text 604148623. Group 11 does not exist, so it expands to an empty string. This is why you currently get the rest of the timestamp and the second capture group, but not the first one. The solution I suggested uses ${1}, which delimits the group number so the reference to group 1 survives the concatenation as you intend.
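A minimal sketch of the difference, using a fixed timestamp in place of time() so the output is reproducible. The $timestamp value and the variable names $broken and $fixed are assumptions for illustration only:

```php
<?php
$f = 'IMG_1474.PNG';
$timestamp = '1604148623'; // stand-in for time(), fixed for reproducibility
$pattern = '/(.*)([.][^.]+$)/';

// Ambiguous: the replacement becomes '$11604148623$2'.
// '$11' refers to group 11, which does not exist and expands to ''.
$broken = preg_replace($pattern, '$1' . $timestamp . '$2', $f);

// Unambiguous: the braces end the group number before the digits that follow.
$fixed = preg_replace($pattern, '${1}' . $timestamp . '${2}', $f);

echo $broken, "\n"; // 604148623.PNG
echo $fixed, "\n";  // IMG_14741604148623.PNG
```

Note that the broken output still contains most of the timestamp: only its first digit was consumed as part of the group number.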

Related

php preg_replace not putting dash back in?

So, I'm doing some manipulation on lat/long pairs, and I need to turn this:
39.1889375383777,-94.48019109594397
into:
39.1889375383777 -94.48019109594397
I can't use str_replace, unless I want to have an array of 10 search and 10 replace strings, so I was hoping to use preg_replace:
$query1 = preg_replace( "/([0-9-]),([0-9-])/", "\1 \2", $query );
The problem is that the "-" gets lost:
39.1889375383777 94.48019109594397
Note, that I have a string containing a list of these, trying to do all at once:
[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]
I managed to make this work with preg_replace_callback:
$str = preg_replace_callback(
    "/([0-9-]),([0-9-])/",
    function ($matches) { return $matches[1] . " " . $matches[2]; },
    $query
);
But I'm still not sure why the simpler preg_replace didn't work?

Your main issue is that, in a double-quoted PHP string, "\1 \2" defines the string "\x1\x20\x2", where the first character is the SOH control character and the third is the STX character (see the ASCII table). To define backreferences, you need a literal backslash, "\\1 \\2", or, better, the $n notation, ideally inside a single-quoted string literal.
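The effect of the two quoting styles can be checked with a short sketch. The variable names $wrong and $right are illustrative and not from the original post:

```php
<?php
$query = '39.1889375383777,-94.48019109594397';
$pattern = '/([0-9-]),([0-9-])/';

// Double quotes: "\1" and "\2" are octal escapes for the bytes 0x01 and 0x02,
// so preg_replace never sees a backreference.
$wrong = preg_replace($pattern, "\1 \2", $query);

// Single quotes: '$1 $2' reaches preg_replace unchanged.
$right = preg_replace($pattern, '$1 $2', $query);

var_dump(strpos($wrong, "\x01") !== false); // bool(true): a control byte is in the output
echo $right, "\n"; // 39.1889375383777 -94.48019109594397
```

The control bytes are usually invisible when printed, which is why the dash and the neighbouring digit simply appear to vanish.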
You can also use a solution without backreferences:
preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str)
Details:
(?<=\d) - a location that is immediately preceded with a digit
, - a comma
(?=-?\d) - a location that is immediately followed with an optional - and a digit.
See the PHP demo:
$str = '[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]';
echo preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str);
// => [[39.1889375383777 -94.48019109594397],[39.18425796890108 -94.28288005131176],[39.41972019529712 -94.19956344733345],[39.41412315915102 -94.41932608390658],[39.34785744845041 -94.4893603307242],[39.1889375383777 -94.48019109594397]]

PHP Array str_replace Whole Word

I'm doing str_replace on a very long string and my $search is an array.
$search = array(
    " tag_name_item ",
    " tag_name_item_category "
);
$replace = array(
    " tag_name_item{$suffix} ",
    " tag_name_item_category{$suffix} "
);
echo str_replace($search, $replace, $my_really_long_string);
The reason why I added spaces on both $search and $replace is because I want to only match whole words. As you would have guessed from my code above, if I removed the spaces and my really long string is:
...
tag_name_item ...
tag_name_item_category ...
...
Then I would get something like
...
tag_name_item_sfx ...
tag_name_item_sfx_category ...
...
This is wrong because I want the following result:
...
tag_name_item_sfx ...
tag_name_item_category_sfx ...
...
So what's wrong?
Nothing really, it works. But I don't like it. Looks dirty, not well coded, inefficient.
I realized I can do something like this using regular expressions using the \b modifier but I'm not good with regex and so I don't know how to preg_replace.

A possible approach using regular expressions would/could look like this:
$result = preg_replace(
    '/\b(tag_name_item(_category)?)\b/',
    '$1' . $suffix,
    $string
);
How it works:
\b: A word boundary, as you say. It ensures we only match whole words, not parts of words.
(: We want to use part of our match in the replacement string (tag_name_item has to be replaced with itself + a suffix). That's why we use a match group, so we can refer back to the match in the replacement string.
tag_name_item: A literal match for that string.
(_category)?: Another literal match, grouped and made optional through use of the ? operator. This ensures that we're matching both tag_name_item and tag_name_item_category
): end of the first group (the optional _category match is the second group). This group, essentially, holds the entire match we're going to replace
\b: word boundary again
These matches are replaced with '$1' . $suffix. The $1 is a reference to the first match group (everything inside the outer brackets in the expression). You could refer to the second group using $2, but we're not interested in that group right now.
That's all there is to it really
More generic:
So, you're trying to suffix all strings starting with tag_name, which judging by your example, can be followed by any number of snake_cased words. A more generic regex for that would look something like this:
$result = preg_replace(
    '/\b(tag_name[a-z_]*)\b/',
    '$1' . $suffix,
    $string
);
Like before, the use of \b, () and the tag_name literal remains the same. What changed is this:
[a-z_]*: This is a character class. It matches characters a-z (a to z), and underscores zero or more times (*). It matches _item and _item_category, just as it would match _foo_bar_zar_fefe.
These regexes are case-sensitive. If you want to match things like tag_name_XYZ, you'll probably want to use the i flag (case-insensitive): /\b(tag_name[a-z_]*)\b/i
Like before, the entire match is grouped, and used in the replacement string, to which we add $suffix, whatever that might be
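One caveat with '$1' . $suffix: if $suffix happens to begin with a digit, the combined replacement string turns into something like $12, and the group reference is misread. A hedged sketch with a hypothetical numeric suffix, using ${1} to keep the reference unambiguous (the sample string and suffix are assumptions, not from the question):

```php
<?php
$string = 'tag_name_item and tag_name_item_category';
$suffix = '2'; // hypothetical suffix that starts with a digit
$pattern = '/\b(tag_name[a-z_]*)\b/';

// '$1' . '2' is '$12', i.e. group 12, which does not exist and is empty.
$misread = preg_replace($pattern, '$1' . $suffix, $string);

// '${1}' . '2' is '${1}2', i.e. group 1 followed by a literal 2.
$safe = preg_replace($pattern, '${1}' . $suffix, $string);

var_dump($misread); // string(5) " and "
echo $safe, "\n";   // tag_name_item2 and tag_name_item_category2
```

With a suffix such as _sfx the plain '$1' form works fine, so this only matters when the suffix is not under your control.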

To avoid the problem, you can use strtr that parses the string only once and chooses the longest match:
$pairs = [ " tag_name_item " => " tag_name_item{$suffix} ",
" tag_name_item_category " => " tag_name_item_category{$suffix} " ];
$result = strtr($str, $pairs);
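Because strtr tries the longest key first at each position and never rescans text it has already replaced, the padding spaces are arguably not needed for this particular pair of tags. A small sketch comparing it with str_replace, assuming $suffix is '_sfx' as in the question's expected output (the variable names are illustrative):

```php
<?php
$suffix = '_sfx'; // assumed value, taken from the question's expected output
$str = 'tag_name_item tag_name_item_category';

$pairs = [
    'tag_name_item'          => 'tag_name_item' . $suffix,
    'tag_name_item_category' => 'tag_name_item_category' . $suffix,
];

// strtr: single pass, longest matching key wins at each position.
$viaStrtr = strtr($str, $pairs);

// str_replace: each pair is applied in turn to the whole string,
// so the shorter key also hits the start of the longer tag.
$viaStrReplace = str_replace(array_keys($pairs), array_values($pairs), $str);

echo $viaStrtr, "\n";      // tag_name_item_sfx tag_name_item_category_sfx
echo $viaStrReplace, "\n"; // tag_name_item_sfx tag_name_item_sfx_category
```

This is still not whole-word matching: a longer word such as tag_name_items would also be changed, so the \b-based regex remains the safer choice when such words can occur.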

This function removes whole words (not substrings of longer words) that match one of the patterns in an array:
<?php
function removePrepositions($text) {
    // The \b anchors keep 'the' and 'or' from matching inside longer words.
    $patterns = array('/\b,\b/i', '/\bthe\b/i', '/\bor\b/i');
    foreach ($patterns as $pattern) {
        $text = preg_replace($pattern, '', trim($text));
    }
    return trim($text);
}
?>

Search and replace number in string with PHP preg_replace

I've got this format: 00-0000 and would like to get to 0000-0000.
My code so far:
<?php
$string = '11-2222';
echo $string . "\n";
echo preg_replace('~(\d{2})[-](\d{4})~', "$1_$2", $string) . "\n";
echo preg_replace('~(\d{2})[-](\d{4})~', "$100$2", $string) . "\n";
The problem is - the 0's won't be added properly (I guess preg_replace thinks I'm talking about argument $100 and not $1)
How can I get this working?

You could try the below.
echo preg_replace('~(?<=\d{2})-(?=\d{4})~', "00", $string) . "\n";
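This replaces the hyphen with 00, giving 11002222, and avoids backreferences altogether.
To keep the hyphen (1100-2222), insert the zeros in front of it instead:
echo preg_replace('~(?<=\d{2})-(?=\d{4})~', "00-", $string) . "\n";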

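The replacement string "$100$2" is not read as group 1 followed by 00. PHP takes up to two digits after $, so it is interpreted as the content of capturing group 10 (which does not exist and is empty), followed by a literal 0 and the content of capturing group 2.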
In order to force it to use the content of capturing group 1, you can specify it as:
echo preg_replace('~(\d{2})[-](\d{4})~', '${1}00$2', $string) . "\n";
Note that I specify the replacement string as a single-quoted string. In a double-quoted string, PHP would attempt (and fail) to expand ${1} as a variable named 1.
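Both replacement strings can be compared side by side. This is a small sketch; the variable names $ambiguous and $explicit are illustrative:

```php
<?php
$string = '11-2222';
$pattern = '~(\d{2})[-](\d{4})~';

// '$100$2': group 10 (empty), a literal 0, then group 2.
$ambiguous = preg_replace($pattern, '$100$2', $string);

// '${1}00$2': group 1, the literal 00, then group 2.
$explicit = preg_replace($pattern, '${1}00$2', $string);

echo $ambiguous, "\n"; // 02222
echo $explicit, "\n";  // 11002222
```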

PHP replacement is empty, or not replaced

Well, I'm trying to replace the first number in a string in PHP, but it does not behave as expected.
$str = 'A12:B17';
$newvalue = '987';
echo preg_replace('/(^[A-Za-z])\d+(.*)/', '\1'.$newvalue.'\2', $str);
The problem is that \1 is replaced correctly when I use it alone, but when I add $newvalue and \2, the first \1 is ignored.
input1:
echo preg_replace('/(^[A-Za-z])\d+(.*)/', '\1'.$newvalue.'\2', $str);
output1:
87:B17 // the first character disappears :/
input2:
echo preg_replace('/(^[A-Za-z])\d+(.*)/', '\1', $str);
output2:
A
desired result:
A987:B17
NOTE: I need a regex solution, this applies to other similar problems.

You can use:
echo preg_replace('/(^[A-Za-z])\d+(.*)/', '${1}' . $newvalue . '${2}', $str);
//=> OUTPUT: A987:B17
The problem is that in your code the backreference \1 becomes \1987, which is read as group 19 (undefined, so empty) followed by the literal 87. That is why the first character is missing. ${1} keeps the reference separate from 987, so the values are replaced properly.

anubhava's answer is great, but you could also use a lookbehind assertion like this:
echo preg_replace('/(?<=^[A-Za-z])\d+/', $newvalue, $str);
The lookbehind ensures that the matched string (\d+) immediately follows a string which matches the pattern, ^[A-Za-z]. However, unlike your original, the portion of the string which matches the lookbehind is not captured in the match, so the entire match is 12.
And just to provide yet another solution, you could also use a callback:
echo preg_replace_callback('/(^[A-Za-z])\d+/', function($m) use (&$newvalue) {
    return $m[1].$newvalue;
}, $str);

How to append or replace trailing question marks using preg_replace?

I want to enforce a single question mark at the end of the string. In JavaScript it works perfectly:
var re = /[?7!1]*$/;
document.write('lolwut'.replace(re, '?'));
document.write('lolwut?'.replace(re, '?'));
document.write('lolwut??'.replace(re, '?'));
document.write('lolwut???'.replace(re, '?'));
document.write('lolwut???!!!11'.replace(re, '?'));
All of the returned values equal "lolwut?"
The PHP variant doesn't work that smoothly:
$re = '/[?7!1]*$/';
echo preg_replace($re, '?', 'lolwut') . "\n";
echo preg_replace($re, '?', 'lolwut?') . "\n";
echo preg_replace($re, '?', 'lolwut??') . "\n";
echo preg_replace($re, '?', 'lolwut???') . "\n";
echo preg_replace($re, '?', 'lolwut???!!!11') . "\n";
output is:
lolwut?
lolwut??
lolwut??
lolwut??
lolwut??
What i'm doing wrong here?
Update:
$ (Dollar) Assert end of string
An assertion is a test on the characters following or preceding the current matching point that does not actually consume any characters.
is my confusion here, along with implicit global flag of preg_replace, thanks to salathe for providing a clue. (you guys should vote his answer up, really)

Check out rtrim() - http://php.net/manual/en/function.rtrim.php
echo rtrim('lolwut','?7!1').'?'; // lolwut?
echo rtrim('lolwut?','?7!1').'?'; // lolwut?
echo rtrim('lolwut??','?7!1').'?'; // lolwut?
echo rtrim('lolwut???!!!11','?7!1').'?'; // lolwut?
echo rtrim('lolwut1??!7!11','?7!1').'?'; // lolwut?
rtrim will Strip whitespace (or other characters) from the end of a string
The second argument:
You can also specify the characters you want to strip, by means of the charlist parameter. Simply list all characters that you want to be stripped. With .. you can specify a range of characters.

Just to answer the question asked ("What i'm doing wrong here?"), you're being confused about what precisely the regular expression matches. With the strings presented, bar the first one, the regex actually matches twice which is why you get two question marks (two matches, two replacements). The root of this behaviour is a mixture of the quantifier (* allows matching nothing) and the end-anchor ($ matches the end of the string).
For lolwut???!!!11:
The regex first matches ???!!!11 which is what you expect
Giving the string a new value of lolwut?
Then it also matches at the point right at the end of the new string
Leading to a final replaced value of lolwut??
If you wanted to continue using the same regex with preg_replace then simply restrict it to one replacement by providing a value of 1 to the fourth ($limit) argument:
preg_replace('/[?7!1]*$/', '?', 'lolwut???!!!111', 1);
// limit to only one replacement ------------------^
As for a better solution, as the others have said, use rtrim.
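To see the double match directly, preg_replace can report how many replacements it made through its optional fifth argument. A small sketch (the variable names are illustrative):

```php
<?php
$re = '/[?7!1]*$/';

// No limit (-1): the run of trailing characters matches first,
// then the empty string at the very end matches as well.
$unlimited = preg_replace($re, '?', 'lolwut???!!!11', -1, $count);
echo $unlimited, ' (', $count, " replacements)\n"; // lolwut?? (2 replacements)

// With $limit = 1 only the first match is replaced.
$limited = preg_replace($re, '?', 'lolwut???!!!11', 1, $countLimited);
echo $limited, ' (', $countLimited, " replacement)\n"; // lolwut? (1 replacement)
```

For the plain 'lolwut' input the count is 1 in both cases, because only the empty match at the end exists.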

You should use the trim function:
echo trim('lolwut???!!!11', '?7!1');
output is:
lolwut
