Regex replace recursive with one pattern - php

$array[key][key]...[key]
replace to
$array['key']['key']...['key']
I managed only to add quotes to the first keyword of the array.
\$([a-zA-Z0-9]+)\[([a-zA-Z_-]+[0-9]*)\] replace to \$\1\[\'\2\3\'\]

You can use a regex that matches consecutively rather than recursively:
$re = '/(\$\w+|(?!^)\G)\[([^]]*)\]/';
$str = "\$array[key][key][key]";
$subst = "$1['$2']";
$result = preg_replace($re, $subst, $str);
echo $result;
See IDEONE demo
The regex (\$\w+|(?!^)\G)\[([^]]*)\] matches every square-bracketed substring (capturing its contents into Group 2 with \[([^]]*)\]) that either comes right after a '$' followed by word characters (the \$\w+ part) or directly follows the previous match (thanks to (?!^)\G).
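A variation on the pattern above, as a minimal sketch. It assumes that empty, numeric, already-quoted, and variable keys should be left alone, which the question does not state. The function name quoteArrayKeys is hypothetical.

```php
<?php
// Hypothetical helper: quotes bare array keys, e.g. $a[key] becomes $a['key'].
function quoteArrayKeys(string $code): string
{
    // Same pattern as above: a bracket pair follows either $name or the previous match.
    $re = '/(\$\w+|(?!^)\G)\[([^]]*)\]/';

    return preg_replace_callback($re, function (array $m): string {
        $key = $m[2];
        // Assumption: skip empty, numeric, quoted, and variable keys.
        if ($key === '' || ctype_digit($key) || strpbrk($key[0], '\'"$') !== false) {
            return $m[0];
        }
        return $m[1] . "['" . $key . "']";
    }, $code) ?? $code;
}

echo quoteArrayKeys('$array[key][0][\'done\'][other]'), PHP_EOL;
// $array['key'][0]['done']['other']
```

Returning the match unchanged for skipped keys still consumes the bracket pair, so \G keeps the chain going for the keys that follow.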

This shouldn't need anything fancy: match the bracketed part, then add the quotes in a callback.
Untested:
$new_input = preg_replace_callback(
    '/(?i)\$[a-z]+\K(?:\[[^\[\]]*\])+/',
    function ($matches) {
        // Turn each "[" into "['" and each "]" into "']"
        return preg_replace('/(\[)|(\])/', "$1'$2", $matches[0]);
    },
    $input
);


Operation on string in PHP. Remove part of string

How can I remove part of a string? For example:
##lang_eng_begin##test##lang_eng_end##
##lang_fr_begin##school##lang_fr_end##
##lang_esp_begin##test33##lang_esp_end##
I always want to pull out the middle of the string: test, school, test33.
I read about ltrim, substr and others, but I had no good idea how to do this, because the language part can have a different length in each string, for example:
'eng', 'fr'
I just want the string in the middle, between ## and ##. Maybe someone can help me? I tried:
foreach ($article as $art) {
    $title = str_replace("##lang_eng_begin##", "", $art->title);
    $art->cleanTitle = str_replace("##lang_eng_end##", "", $title);
}
But
##lang_eng_end##
can change to
##lang_ger_end##
in the next row, so I have no idea how to fix that.
If your strings are always in this format, explode is an easy way:
$str = "##lang_eng_begin##test##lang_eng_end## ";
$res = explode("##", $str)[2];
echo $res;
You may use a regex and extract the value in between the non-starting ## and next ##:
$re = "/(?!^)##(.*?)##/";
$str = "##lang_eng_begin##test##lang_eng_end## ";
preg_match($re, $str, $match);
print_r($match[1]);
See the PHP demo. Here, the regex matches a ## that is not at the string start ((?!^)##), then captures into Group 1 any 0+ chars other than newline as few as possible ((.*?)) up to the first ## substring.
Or, remove all ##...## substrings with preg_replace:
$re = "/##.*?##/";
$str = "##lang_eng_begin##test##lang_eng_end## ";
echo preg_replace($re, "", $str);
See another demo. Here, we just remove all non-overlapping substrings beginning with ##, then having any 0+ chars other than a newline up to the first ##.
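If the begin and end markers must carry the same language code, a backreference can enforce that. A minimal sketch, assuming the codes consist only of lowercase letters (as in eng, fr, esp); the function name extractTitles is hypothetical:

```php
<?php
// Hypothetical helper: returns [language code => text between the markers].
function extractTitles(string $text): array
{
    $titles = [];
    // \1 requires the end marker to repeat the code captured in the begin marker.
    $re = '/##lang_([a-z]+)_begin##(.*?)##lang_\1_end##/s';
    if (preg_match_all($re, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $titles[$m[1]] = $m[2];
        }
    }
    return $titles;
}

$text = "##lang_eng_begin##test##lang_eng_end##\n"
      . "##lang_fr_begin##school##lang_fr_end##\n"
      . "##lang_esp_begin##test33##lang_esp_end##";
print_r(extractTitles($text));
// eng => test, fr => school, esp => test33
```

Keying the result by language code means the caller does not need to know in advance which language a row uses.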

How to not perform preg_replace if subject starts with quote

I'm trying to convert plain links to HTML links using preg_replace. However it's replacing links that are already converted.
To combat this I'd like it to ignore the replacement if the link starts with a quote.
I think a positive lookahead may be needed but everything I've tried hasn't worked.
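$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$string = preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $string);
var_dump($string);
The above outputs:
<a href="<a href="http://www.example.com">http://www.example.com</a>">test</a> <a href="http://www.example.com">http://www.example.com</a>
When it should output:
<a href="http://www.example.com">test</a> <a href="http://www.example.com">http://www.example.com</a>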
You might get along with lookarounds.
Lookarounds are zero-width assertions that make sure to match/not to match anything immediately around the string in question. They do not consume any characters.
That being said, a negative lookbehind might be what you need in your situation:
(?<![">])\bhttps?://\S+\b
In PHP this would be:
<?php
$string = 'I want to be transformed to a proper link: http://www.google.com ';
$string .= 'But please leave me alone ';
$string .= '(<a href="https://www.google.com">https://www.google.com</a>).';
$regex = '~ # delimiter
(?<![">]) # a neg. lookbehind
https?://\S+ # http:// or https:// followed by not a whitespace
\b # a word boundary
~x'; # verbose to enable this explanation.
$string = preg_replace($regex, "<a href='$0'>$0</a>", $string);
echo $string;
?>
See a demo on ideone.com. However, maybe a parser is more appropriate.
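The lookbehind only inspects the character before the URL and does not become part of the match, so $0 in the replacement is the URL alone. A small sketch with an input that contains both a bare URL and an existing anchor; the function name linkify and the example hosts are made up for illustration, and a URL preceded by anything other than " or > is still converted:

```php
<?php
// Hypothetical helper: wraps bare URLs in anchors, skipping URLs preceded by " or >.
function linkify(string $html): string
{
    $re = '~(?<![">])https?://\S+\b~';
    return preg_replace($re, '<a href="$0">$0</a>', $html) ?? $html;
}

$in = 'see http://a.example and <a href="http://b.example">b</a>';
echo linkify($in), PHP_EOL;
// see <a href="http://a.example">http://a.example</a> and <a href="http://b.example">b</a>
```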
Since preg_replace accepts arrays of patterns and replacements, this may be convenient, depending on what you want to achieve:
<?php
$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$rx = array("&(<a.+https?:\/\/[\w]+[^ \,\"\n\r\t<]*>)(.*)(<\/a\>)&si", "&(\s){1,}(https?:\/\/[\w]+[^ \,\"\n\r\t<]*)&");
$rp = array("$1$2$3", "<a href=\"$2\">$2</a>");
$string = preg_replace($rx,$rp, $string);
var_dump($string);
// DUMPS:
// '<a href="http://www.example.com">test</a><a href="http://www.example.com">http://www.example.com</a>'
The Idea
You can split your string at the already existing anchors, and only parse the pieces in between.
The Code
$input = '<a href="http://www.example.com">test</a> http://www.example.com';
// Split the string at existing anchors
// PREG_SPLIT_DELIM_CAPTURE flag includes the delimiters in the results set
// (it is the fourth argument; the third is the limit, -1 meaning no limit)
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
// Use array_map to parse each piece, and then join all pieces together
$output = join(array_map(function ($key, $part) {
    // Because we return the delimiter in the results set,
    // every $part with an uneven key is an existing anchor and is left untouched.
    return $key % 2
        ? $part
        : preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $part);
}, array_keys($parts), $parts));

Matching all of a certain character after a Positive Lookbehind

I have been trying to get the regex right for this all morning and I have hit a wall. In the following string I want to match every forward slash that follows .com/<first_word>, except any / after the URL.
$string = "http://example.com/foo/12/jacket Input/Output";
match------------------------^--^
The length of the words between slashes should not matter.
Regex: (?<=.com\/\w)(\/) results:
$string = "http://example.com/foo/12/jacket Input/Output"; // no match
$string = "http://example.com/f/12/jacket Input/Output";
matches--------------------^
Regex: (?<=\/\w)(\/) results:
$string = "http://example.com/foo/20/jacket Input/O/utput"; // misses the /'s in the URL
matches----------------------------------------^
$string = "http://example.com/f/2/jacket Input/O/utput"; // don't want the match between Input/Output
matches--------------------^-^--------------^
Because the lookbehind can have no modifiers and needs to be a zero length assertion I am wondering if I have just tripped down the wrong path and should seek another regex combination.
Is the positive lookbehind the right way to do this? Or am I missing something other than copious amounts of coffee?
NOTE: tagged with PHP because the regex should work in any of the preg_* functions.
If you want to use preg_replace then this regex should work:
$re = '~(?:^.*?\.com/|(?<!^)\G)[^/\h]*\K/~';
$str = "http://example.com/foo/12/jacket Input/Output";
echo preg_replace($re, '|', $str);
//=> http://example.com/foo|12|jacket Input/Output
This replaces each / with a | after the first / that follows the initial .com.
The negative lookbehind (?<!^) is needed to avoid replacing in a string that doesn't start with the .com part, like /foo/bar/baz/abcd.
RegEx Demo
Use \K here along with \G, then grab the groups.
^.*?\.com\/\w+\K|\G(\/)\w+\K
See demo.
https://regex101.com/r/aT3kG2/6
$re = "/^.*?\\.com\\/\\w+\\K|\\G(\\/)\\w+\\K/m";
$str = "http://example.com/foo/12/jacket Input/Output";
preg_match_all($re, $str, $matches);
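Replace (with \K moved in front of the slash, so that the slash itself is the matched text that gets replaced):
$re = "/^.*?\\.com\\/\\w+\\K\\/|\\G\\w+\\K\\//m";
$str = "http://example.com/foo/12/jacket Input/Output";
$subst = "|";
$result = preg_replace($re, $subst, $str);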
Another \G and \K based idea.
$re = '~(?:^\S+\.com/\w|\G(?!^))\w*+\K/~';
The (?: non-capturing group sets the entry point ^\S+\.com/\w or glues matches to it with \G(?!^).
\w*+\K/ possessively matches any number of word characters up to a slash. \K resets the start of the reported match.
See demo at regex101
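A callback-based alternative that avoids \G, as a minimal sketch: match the rest of the URL once, then replace the slashes inside it with str_replace. It assumes, as in the question, a .com host and a URL without whitespace; the function name replacePathSlashes is hypothetical.

```php
<?php
// Hypothetical helper: replaces slashes in the URL path after the first segment.
function replacePathSlashes(string $str, string $with = '|'): string
{
    // \K drops ".com/<first_word>" from the match; \S* takes the rest of the URL.
    return preg_replace_callback(
        '~\.com/\w+\K\S*~',
        function (array $m) use ($with): string {
            return str_replace('/', $with, $m[0]);
        },
        $str
    ) ?? $str;
}

echo replacePathSlashes('http://example.com/foo/12/jacket Input/Output'), PHP_EOL;
// http://example.com/foo|12|jacket Input/Output
```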

Why is my regex rejecting apostrophes?

I'm writing a regex that should match everything like this: [[First example]] or [[I'm an example]].
Unfortunately, it doesn't match [[I'm an example]] because of the apostrophe.
Here it is:
preg_replace_callback('/\[\[([^?"`*%#\\\\:<>]+)\]\]/iU', ...)
Simple apostrophes (') are allowed, so I really do not understand why it doesn't work.
Any ideas?
EDIT: Here is what happens before I use this regex:
// This matches something [[[like this]]]
$contents = preg_replace_callback('/\[\[\[(.+)\]\]\]/isU', function ($matches) {
    return '<blockquote>'.$matches[1].'</blockquote>';
}, $contents);
// This matches something [[like that]] but doesn't work with an apostrophe/quote once
// the first preg_replace_callback has done its job
$contents = preg_replace_callback('/\[\[([^?"`*%#\\\\:<>]+)\]\]/iU', ..., $contents);
Try this:
$string = '[[First example]]';
$pattern = '/\[\[(.*?)\]\]/';
preg_match($pattern, $string, $matches);
var_dump($matches);
You can use this regex:
\[\[.*?]]
Working demo
PHP code
$re = '/\[\[.*?]]/';
$str = "not match this but [[Match this example]] and not this";
preg_match_all($re, $str, $matches);
Btw, if you want to capture the content within brackets you have to use capturing groups:
\[\[(.*?)]]
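The original pattern does accept a plain apostrophe, which can be checked directly. One possible explanation, which is an assumption the thread does not confirm, is that the text was HTML-escaped before the regex ran: an apostrophe encoded as &#039; contains #, which the character class excludes. A minimal check:

```php
<?php
// The pattern from the question, unchanged.
$re = '/\[\[([^?"`*%#\\\\:<>]+)\]\]/iU';

// A raw apostrophe is not in the excluded set, so this matches.
$raw = "[[I'm an example]]";
var_dump(preg_match($re, $raw));      // int(1)

// Assumption: the apostrophe was escaped earlier to the entity &#039;.
// The "#" inside the entity is excluded by the character class, so this fails.
$escaped = "[[I&#039;m an example]]";
var_dump(preg_match($re, $escaped));  // int(0)
```

If that is what happens, the more permissive \[\[(.*?)]] pattern above sidesteps the problem because it excludes nothing.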

preg_replace everything but # sign

I've searched for an example of this, but can't seem to find it.
I'm looking to remove everything from a string except the #texthere:
$Input = this is #cool isn't it?
$Output = #cool
I can remove the #cool using preg_replace("/#(\w+)/", "", $Input); but I can't figure out how to do the opposite.
You could match #\w+ and then replace the original string. Or, if you need to use preg_replace, you should be able to replace everything with the first capture group:
$output = preg_replace('/.*(#\w+).*/', '\1', $input);
Solution using preg_match (I assume this will perform better):
$matches = array();
preg_match('/#\w+/', $input, $matches);
$output = $matches[0];
Neither pattern above addresses how to handle inputs that match multiple times, such as "this is #cool and #awesome, right?".
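For inputs with several tags, preg_match_all collects every match. A minimal sketch, assuming the tags should be kept in order of appearance and joined with spaces (the question does not say how multiple tags should be combined); the function name extractHashtags is hypothetical:

```php
<?php
// Hypothetical helper: returns every #tag in the input, in order of appearance.
function extractHashtags(string $input): array
{
    preg_match_all('/#\w+/', $input, $matches);
    return $matches[0];
}

$tags = extractHashtags('this is #cool and #awesome, right?');
echo implode(' ', $tags), PHP_EOL;
// #cool #awesome
```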
