Replace only a specific group in a file with preg_replace - php

I have txt file with content:
fggfhfghfghf
$config['website'] = 'Olpa';
asdasdasdasdasdas
And a PHP script that replaces text in the file with preg_replace:
write_file('tekst.txt', preg_replace('/\$config\[\'website\'] = \'(.*)\';/', 'aaaaaa', file_get_contents('tekst.txt')));
But it doesn't work the way I want it to.
The script replaces the whole match, so after the change the file looks like this:
fggfhfghfghf
aaaaaa
asdasdasdasdasdas
And that's bad.
I don't want to change the whole match $config['website'] = 'Olpa'; but only the Olpa part.
As you can see, it belongs to Group 2 of the match information.
All I want is to change that one specific group,
so that after the script runs the file looks like this:
fggfhfghfghf
$config['website'] = 'aaaaaa';
asdasdasdasdasdas

You need to change your preg_replace to
preg_replace('/(\$config\[\'website\'] = \').*?(\';)/', '$1aaaaaa$2', file_get_contents('tekst.txt'))
That is, capture what you need to keep (and use backreferences to restore that text) and just match what you need to replace.
See the regex demo.
Pattern details:
(\$config\[\'website\'] = \') - Group 1: captures a literal $config['website'] = ' substring (later referenced with $1)
.*? - any 0+ chars other than line break chars, as few as possible
(\';) - Group 2: a ' followed by ; (later referenced with $2)
In case your aaa actually starts with a digit, you would need a ${1} backreference.
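A minimal sketch of that digit case, assuming a hypothetical replacement value of 123 and the sample text from the question. Without the braces, '$1123$2' would be read as a reference to group 11 followed by 23, so the braces delimit the group number:

```php
<?php
// Hypothetical input mirroring the file content from the question.
$txt = "fggfhfghfghf\n\$config['website'] = 'Olpa';\nasdasdasdasdasdas";

$pattern = '/(\$config\[\'website\'] = \').*?(\';)/';

// ${1} and ${2} keep the group numbers separate from the digits that follow.
$result = preg_replace($pattern, '${1}123${2}', $txt);

echo $result;
```

The middle line of the output should read $config['website'] = '123';.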

I have a better, faster, leaner solution for you. No capture groups are required, it only requires careful attention to escaping the single quotes:
Pattern: \$config\['website'] = '\K[^']+
\K means "start the fullstring match here", this combined with the negated character class ([^']+) affords the omission of capture groups.
Pattern Demo (just 25 steps)
PHP Implementation:
$txt='fggfhfghfghf
$config[\'website\'] = \'Olpa\';
asdasdasdasdasdas';
print_r(preg_replace('/\$config\[\'website\'\] = \'\K[^\']+/','aaaaaa',$txt));
Using single quotes around the pattern is crucial so that $config isn't interpreted as a variable. As a result, all of the single quotes inside of the pattern must be escaped.
Output:
fggfhfghfghf
$config['website'] = 'aaaaaa';
asdasdasdasdasdas
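To apply the same pattern to a file, here is a sketch that assumes plain PHP with file_get_contents() and file_put_contents() instead of the write_file() helper from the question. The temporary file is a hypothetical stand-in for tekst.txt, and the optional $count argument of preg_replace is used to skip the write when nothing matched:

```php
<?php
// Hypothetical temporary file standing in for tekst.txt.
$path = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($path, "fggfhfghfghf\n\$config['website'] = 'Olpa';\nasdasdasdasdasdas");

$pattern = '/\$config\[\'website\'\] = \'\K[^\']+/';

// The fifth argument receives the number of replacements performed.
$updated = preg_replace($pattern, 'aaaaaa', file_get_contents($path), -1, $count);

// Only write back when something was actually replaced.
if ($count > 0) {
    file_put_contents($path, $updated);
}

echo file_get_contents($path);
unlink($path);
```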

Related

Replace data after .PNG extension in image tag regular expression

Here is my code
<img src="folder/img1.jpg?somestring">
<img src="folder/img2.jpg?somediffstring">
I want to replace somestring and somediffstring with another string throughout the HTML. Please suggest a regular expression that works in PHP.
example
change to using regular expression or anything
First of all, you shouldn't parse HTML with Regular Expressions.
Solution 1
Now, if you are exclusively parsing img tags, you could come up with a satisfying enough solution like this:
(\b\.jpg|\b\.png)\?(.+?)\"
That is:
(\b\.jpg|\b\.png)   # 1st Capturing Group
    \b\.jpg         # 1st Alternative: match ``.jpg`` literally
    \b\.png         # 2nd Alternative: match ``.png`` literally
\?                  # Match the character ? literally
(.+?)               # 2nd Capturing Group
    .+?             # Match any character between one and unlimited times,
                    # as few times as possible, expanding as needed.
\"                  # Match the character " literally
Problem
What's the problem? We are not checking if we are inside an img tag. This will match everywhere in the HTML.
Solution 2
Let's add the check for img > src:
<img.+?src=\".*?(\b\.jpg|\b\.png)\?(.+?)\"
That is:
<img       # Match ``<img`` literally
.+?        # Match any character between one and unlimited times,
           # as few times as possible, expanding as needed.
           # Needed in case there are rel or alt options inside the img tag.
src=\"     # Match ``src="`` literally
...        # The rest is same as before.
Problem
Does this really do its job? Apparently yes, but in reality no.
Consider the following HTML code
<img src="data:image/png;base64,iVBORw0KG" />
<div style="background-image: url(../images/test-background.jpg?)">
blah blah
</div>
It shouldn't match right? But it does (if you remove line-breaks). The regular expression above starts the match at <img src=", and will stop at "> of the div tag. The capturing group will contain the characters between ? and ": ), substituting it will break the HTML.
This was just an example, but many other situations will match even if they should not.
Other solutions...?
No matter how many constraints you can add to your RegEx and how sophisticated it becomes... HTML is a Context-Free Language and it can't be captured by a Regular Expression, which only recognizes Regular Languages.
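As a sketch of the parser-based alternative, and not something the answer itself prescribes, PHP's DOM extension can locate the img elements so that a small regex only has to deal with the src value. The sample markup and the replacement value newstring are assumptions for illustration:

```php
<?php
// Hypothetical input based on the question's markup.
$html = '<img src="folder/img1.jpg?somestring"><img src="folder/img2.png?somediffstring">';

$doc = new DOMDocument();
// Suppress warnings that libxml may raise for HTML fragments.
@$doc->loadHTML($html);

$sources = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src');
    // Replace the query string only for .jpg/.png URLs that have one.
    $new = preg_replace('/(\.(?:jpg|png))\?.*$/i', '$1?newstring', $src);
    $img->setAttribute('src', $new);
    $sources[] = $new;
}

print_r($sources);
```

Because only real img elements are visited, a background-image URL inside a style attribute is never touched. The modified document can then be serialized with $doc->saveHTML().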
In PHP
Still sure you're gonna use Regular Expressions? Alright, then your PHP function is preg_replace. You only need to keep in mind that it will replace everything that matched, not only the capturing groups. Hence, you need to wrap what you want to "remember" into another capturing group:
$str = '<img src="folder/img1.jpg?foo">';
$pattern = '/(<img.+?src=\".*?(\b\.jpg|\b\.png)\?)(.+?)(\")/';
$replacement = '$1' . 'bar' . '$4';
$str_replaced = preg_replace($pattern, $replacement, $str);
// Now you have $str_replaced = '<img src="folder/img1.jpg?bar">';
With reference to "How can I use the captured group in the same regex":
Suppose you want to change img1.jpg?somestring to img1.jpg?somestringAAA
and img2.jpg?somediffstring to img2.jpg?somediffstringAAA.
Search for: src="([a-zA-Z.0-9_/]*)[?]([a-zA-Z.0-9_]*)">
Replace with: src="$1?$2AAA">
Here $1 represents whatever is inside the first pair of parentheses (), i.e., folder/img1.jpg,
and $2 represents the second pair.
UPDATE:
$string = 'img1.jpg?somestring';
$pattern = '/([a-zA-Z.0-9_]*)[?]([a-zA-Z.0-9_]*)/i';
$replacement = '$1?$2AAA';
echo preg_replace($pattern, $replacement, $string);
You can do it this way:
<?php
$url_value = "folder/img2.jpg?somediffstring";
// Keep everything before the "?" and drop the old query string.
echo $url = substr($url_value, 0, strpos($url_value, "?"));
?>
You can use the regex \?(\w*)"
If you want to replace somestring and somediffstring with xx, search with the regex \?(\w*)" and use ?xx" as the replacement (the closing quote is part of the match, so it has to be put back).
https://regex101.com/r/S5pPuW/1

PHP preg_replace - in case of match remove the beginning and end of the string partly matched by regex with one call?

In PHP I am trying to achieve the following (if possible with only the preg_replace function):
Examples:
$example1 = "\\\\\\\\\\GLS\\\\\\\\\\lorem ipsum dolor: T12////GLS////";
$example2 = "\\\\\\GLS\\\\\\hakunamatata ::: T11////GLS//";
$result = preg_replace("/(\\\\)*GLS(\\\\)*(.)*(\/)*GLS(\/)*/", "REPLACEMENT", $example1);
// current $result: REPLACEMENT (that means the regex works, but how to replace this?)
// desired $result
// for $example1: lorem ipsum dolor: T12
// for $example2: hakunamatata ::: T11
I have consulted http://php.net/manual/en/function.preg-replace.php, of course, but my experiments with the replacement have not been successful yet.
Is this possible with a single preg_replace, or do I have to split the regular expression and replace the front match and the back match separately?
If the regex does not match at all, I would like to receive an error, but I can cover that with preg_match first.
The main point is to match and capture what you need with a capturing group and then replace with the back-reference to that group. In your regex, you applied a quantifier to the group ((.)*) and thus you lost access to the whole substring, only the last character is saved in that group.
Note that (.)* matches the same string as (.*), but in the former case you will have 1 character in the capture group as the regex engine grabs a character and saves it in the buffer, then grabs another and re-writes the previous one and so on. With the (.*) expression, all the characters are grabbed together in one chunk and saved into the buffer as one whole substring.
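The difference can be seen with a short sketch. The sample string GLSabcGLS is an assumption chosen to keep the output easy to read:

```php
<?php
$str = "GLSabcGLS";

// Quantified group: the capture is overwritten on every iteration,
// so only the last character survives.
preg_match('/GLS(.)*GLS/', $str, $repeated);

// Quantifier inside the group: the whole run is captured at once.
preg_match('/GLS(.*)GLS/', $str, $whole);

echo $repeated[1], "\n"; // c
echo $whole[1], "\n";    // abc
```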
Here is a possible way:
$re = "/\\\\*GLS\\\\*([^\\/]+)\\/+GLS\\/+/";
// Or to use fewer escapes, use other delimiters
// $re = "~\\\\*GLS\\\\*([^/]+)/+GLS/+~";
$str = "\\\\\\GLS\\\\\\hakunamatata ::: T11////GLS//";
$result = preg_replace($re, "$1", $str);
echo $result;
Result of the IDEONE demo: hakunamatata ::: T11.
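The question also asks for an error when nothing matches. One possible sketch, assuming a hypothetical helper named stripGlsWrapper, uses the optional $count argument of preg_replace instead of a separate preg_match call:

```php
<?php
// Hypothetical helper: strips the GLS wrapper or throws when it is absent.
function stripGlsWrapper(string $str): string
{
    // Same pattern as above, written with ~ delimiters and single quotes.
    $re = '~\\\\*GLS\\\\*([^/]+)/+GLS/+~';
    $result = preg_replace($re, '$1', $str, -1, $count);

    // preg_replace returns null on a regex error; $count is 0 when nothing matched.
    if ($result === null || $count === 0) {
        throw new InvalidArgumentException('Input does not match the expected GLS wrapper.');
    }
    return $result;
}

echo stripGlsWrapper("\\\\\\GLS\\\\\\hakunamatata ::: T11////GLS//"), "\n";

try {
    stripGlsWrapper("no wrapper here");
} catch (InvalidArgumentException $e) {
    echo "Error: ", $e->getMessage(), "\n";
}
```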

php Regular Expression Issues - Can't remove/strip out and replace a string within a string

I have never worked with regular expressions before and I need them now and I am having some issues getting the expected outcome.
Consider this for example:
[x:3xerpz1z]Some Text[/x:3xerpz1z] Some More Text
Using the php preg_replace() function, I want to replace [x:3xerpz1z] with <start> and [/x:3xerpz1z] with </end> but I can't figure this out. I have read some regular expression tutorials but I am still confused.
I have tried this for the starting tag:
preg_replace('/(.*)\[x:/','<start>', $source_string);
The above would return:<start>3xerpz1z
As you can see, the "3xerpz1z" isn't getting removed and it needs to be stripped out. I can't hard code and search and replace "3xerpz1z" because the "3xerpz1z" chars are randomly generated and the characters are always different but the length of the tag is the same.
This is the desired output I want:
<start>Some Text</end> Some More Text
I haven't even tried processing [/x:3xerpz1z] because I can't even get the first tag going.
You must use capturing groups (....):
$data = '[x:3xerpz1z]Some Text[/x:3xerpz1z] Some More Text';
$result = preg_replace('~\[x:([^]]+)](.*?)\[/x:\1]~s', '<start>$2</end>', $data);
pattern details:
~ # pattern delimiter: better than / here (no need to escape slashes)
\[x:
([^]]+) # capture group 1: all that is not a ]
]
(.*?) # capture group 2: content
\[/x:\1] # \1 is a backreference to the first capturing group
~s # s allows the dot to match newlines
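A short sketch of what the \1 backreference buys you, using made-up tag ids (aaa, bbb) as assumptions: each opening tag is only paired with a closing tag that carries the same id, and a mismatched pair is left alone.

```php
<?php
$pattern = '~\[x:([^]]+)](.*?)\[/x:\1]~s';

// Two well-formed pairs with different ids: both are converted.
$matched = preg_replace($pattern, '<start>$2</end>', '[x:aaa]one[/x:aaa] and [x:bbb]two[/x:bbb]');

// Opening and closing ids differ: nothing is replaced.
$mismatched = preg_replace($pattern, '<start>$2</end>', '[x:aaa]one[/x:bbb]');

echo $matched, "\n";    // <start>one</end> and <start>two</end>
echo $mismatched, "\n"; // [x:aaa]one[/x:bbb]
```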

how to match this pattern in php

I am looking for a regular expression in PHP to parse a string with the following pattern. The commands are wrapped in double square brackets, as in
[[a src="" desc=""]]
where a, src and desc are the keywords (they won't change). src must be given but desc is optional, and the value of src or desc can be wrapped in double or single quotes. src and desc can be given in any order. For example, the following patterns are all valid
[[a src="http://a.c.d" desc ="hello"]]
[[a src ="http://a.c.d" desc= 'hello']]
[[a desc ="hello " src= 'http://a.c.d' ]]
[[a src = "http://a.c.d" ]]
[[a src="http://a.c.d" desc ="hello"]]
Any space between a value and 'a', 'src', 'desc', or '=' (outside the quotation marks) should be ignored. I am going to replace this command with an HTML tag like
SOMETHING_EXTRACT_FROM_DESC
It seems pretty tough to come up with one regex to do the work. For now I have 3 regexes set up to handle the different cases separately. It looks like this
$pattern = '/\[\[a[:blank:]+src[:blank:]*=[:blank:]*"(.*?)"[:blank:]+desc[:blank:]*=[:blank:]+"(.*?)"\]\]/i';
$rtn = preg_replace($pattern, '${2}', $src);
$pattern = '/\[\[a[:blank:]+desc[:blank:]*=[:blank:]*"(.*?)"[:blank:]+src[:blank:]*=[:blank:]+"(.*?)"\]\]/i';
$rtn = preg_replace($pattern, '${2}', $rtn);
$pattern = '/\[\[a[:blank:]+src[:blank:]*=[:blank:]+"(.*?)"\]\]/i';
$rtn = preg_replace($pattern, '${2}', $rtn);
But this doesn't work; regular expressions are hard to learn :(
I wrote a regular expression that matches everything you requested, but it also matches a bit too much, as I'll explain at the end. But first the regex:
Looks like this:
\[\[a(\s+(src|desc)\s*=\s*('[^']*'|"[^"]*")){1,2}\s*\]\]
I'll break it down so you can understand it:
\[\[ ... \]\] matches [[ ... ]], the beginning and ending
\s matches any whitespace character (space, tab, line break), \s+ expects at least one
(src|desc) matches either the string src or the string desc. It's an OR operator: match src OR desc.
'[^']*' matches two single quotes and anything in between that is not a single quote
"[^"]*" same with double quotes
('[^']*'|"[^"]*") matches one of the above two
(src|desc)\s*=\s*('[^']*'|"[^"]*") matches a token like src='something'
{1,2} matches the preceding element once or twice; appended to the above expression, it matches one or two of those tokens
And that's pretty much it. The only problem is that it will also match this:
[[a src="http://a.c.d" src="http://a.c.d"]]
Which I think is a mismatch. If it doesn't bother you, you're good to go, otherwise you'll need to change the whole concept of using a big atom with ors (i.e.: |) and take a different approach. You could use look-aheads for example. But it will get real nasty pretty fast.
You can test it online HERE
The regex is much more readable if I remove the backslashes and the \s stuffs. This won't work, but I think it will help you understand it:
[[a ( (src|desc)=('[^']*'|"[^"]*") ){1,2} ]]
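One way to put the regex to work is sketched below with preg_replace_callback. The helper name renderLinkCommands and the generated anchor markup are assumptions, since the target HTML is not shown in full in the question. The callback reads the attributes in whatever order they appear and leaves a command untouched when src is missing:

```php
<?php
// Hypothetical helper; the <a href="..."> output format is an assumption.
function renderLinkCommands(string $text): string
{
    // Same structure as the regex above, with the attribute list captured in group 1.
    $command = '/\[\[a((?:\s+(?:src|desc)\s*=\s*(?:\'[^\']*\'|"[^"]*")){1,2})\s*\]\]/i';
    $attribute = '/(src|desc)\s*=\s*(?:\'([^\']*)\'|"([^"]*)")/i';

    return preg_replace_callback($command, function (array $m) use ($attribute) {
        $attrs = [];
        preg_match_all($attribute, $m[1], $pairs, PREG_SET_ORDER);
        foreach ($pairs as $p) {
            // Group 2 holds a single-quoted value, group 3 a double-quoted one.
            $value = ($p[2] !== '') ? $p[2] : ($p[3] ?? '');
            $attrs[strtolower($p[1])] = $value;
        }
        // src is mandatory; leave the command untouched when it is missing.
        if (!isset($attrs['src'])) {
            return $m[0];
        }
        // Fall back to the URL when no description was given.
        $label = $attrs['desc'] ?? $attrs['src'];
        return '<a href="' . htmlspecialchars($attrs['src']) . '">' . htmlspecialchars($label) . '</a>';
    }, $text);
}

echo renderLinkCommands("[[a desc =\"hello \" src= 'http://a.c.d' ]]"), "\n";
echo renderLinkCommands('[[a src = "http://a.c.d" ]]'), "\n";
```

If src is given twice, the later value simply overwrites the earlier one in $attrs.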

Need help with REGEX in PHP. A simple one. Help!

Recently I have been playing with something related to BBCode in phpBB3. When I looked at a random post in the posts table of my database, I found that the image tag is written this way: [img:fcjsgy5j]. There are 8 random characters generated between [img: ... ] for each post.
[img:fcjsgy5j]http://imageurl.jpg[/img]
My question is, how can I use preg_replace() to turn a tag with those random characters into this?
<img src="http://imageurl.jpg">
$output = preg_replace("`\[img:.+?\](.*?)\[/img\]`i", '<img src="$1"/>', $input);
[ begins a character set. We don't want that; we want to match the literal [ character, so we have to escape it with a \
. matches any character
+ means we match 1 or more of the previous thing (any character)
? makes the previous quantifier ungreedy (.+ would match everything, right to the very end of the string; that's not what we want, we want it to match as little as possible... just up to the next ])
(.*?) matches all the junk between the [img] tags. Ungreedy again. We put () around it to make it a capturing group
The ` (back-tick) at the start and the end could be any character... whatever character you start with, you have to end with. A lot of people use / but I prefer the back-tick because it rarely appears anywhere inside the regular expression, thus I don't need to escape it.
The i at the very end means The expression will be case insensitive. (will match img, IMG, ImG, etc.)
The $1 in the replace refers back to the () section we denoted earlier... it basically takes whatever was matched there, and plops it into the place of $1
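To see why the ungreedy quantifiers matter, here is a small sketch that runs the pattern above and a greedy variant over a hypothetical line containing two image tags:

```php
<?php
// Hypothetical input with two tags on one line.
$input = '[img:aaaaaaaa]http://a.jpg[/img] [img:bbbbbbbb]http://b.jpg[/img]';

// Ungreedy, as in the answer: each tag is matched on its own.
$lazy = preg_replace('`\[img:.+?\](.*?)\[/img\]`i', '<img src="$1"/>', $input);

// Greedy variant: one match swallows the whole line.
$greedy = preg_replace('`\[img:.+\](.*)\[/img\]`i', '<img src="$1"/>', $input);

echo $lazy, "\n";   // <img src="http://a.jpg"/> <img src="http://b.jpg"/>
echo $greedy, "\n"; // <img src="http://b.jpg"/>
```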
$result = preg_replace('%\[img:[^]]+\]([^[]+)\[/img\]%', '<img src="\1">', $subject);
or, as a commented regex:
$result = preg_replace(
    '%\[img:   # match [img:
    [^]]+      # match one or more non-] characters
    \]         # match ]
    ([^[]+)    # match one or more non-[ characters
    \[/img\]   # match [/img]
    %x',
    '<img src="\1">', $subject);
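Try this code:
<?php
// $string holds the post text containing the [img:...] tags.
$search = array(
    // Patterns need delimiters; group 1 captures the image URL.
    '/\[img:.+?\](.*?)\[\/img\]/i'
);
$replace = array(
    // \\1 refers to the first (and only) capturing group.
    '<img src="\\1">'
);
$result = preg_replace($search, $replace, $string);
?>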
I used the array form of preg_replace so that you can add more search and replace patterns in the future. I think you are trying to replace some BBCode tags. There are plenty of libraries on the net that handle BBCode correctly.
Edited
Like this one :
http://php.net/manual/en/book.bbcode.php
