I'd like to replace the letters "KELLY" between the "#" runs with the same number of "#" characters (here, five #'s in place of 'KELLY').
$str = "####KELLY#####"; // any alphabetic letters may appear
preg_replace('/(#{3,})[A-Z]+(#{3,})/', "$1$2", $str);
It returns ######### (four hashes then five hashes) without 'KELLY'.
How can I get ############## which is four original leading hashes, then replace each letter with a hash, then the five original trailing hashes?
The \G continue metacharacter makes for a messier pattern, but it lets you use preg_replace() instead of preg_replace_callback().
Effectively, it looks for three leading hashes (so any run of three or more qualifies), then makes single-letter replacements until it reaches the finishing sequence of three hashes.
This technique also allows hash markers to be "shared" -- I don't actually know if this is something that is desired.
Code: (Demo)
$str = "####KELLY##### and ###ANOTHER###### not ####foo#### but: ###SHARE###MIDDLE###HASHES### ?";
echo $str . "\n";
echo preg_replace('/(?:#{3}|\G(?!^))\K[A-Z](?=[A-Z]*#{3})/', '#', $str);
Output:
####KELLY##### and ###ANOTHER###### not ####foo#### but: ###SHARE###MIDDLE###HASHES### ?
############## and ################ not ####foo#### but: ############################# ?
Breakdown:
/ #starting pattern delimiter
(?: #start non-capturing group
#{3} #match three hash symbols
| # OR
\G(?!^) #continue matching, disallow matching from the start of string
) #close non-capturing group
\K #forget any characters matched up to this point
[A-Z] #match a single letter
(?= #lookahead (do not consume any characters) for...
[A-Z]* #zero or more letters then
#{3} #three hash symbols (any further hashes are not inspected, so three or more qualify)
) #close the lookahead
/ #ending pattern delimiter
Or you can achieve the same result with preg_replace_callback().
Code: (Demo)
echo preg_replace_callback(
'/#{3}\K[A-Z]+(?=#{3})/',
function($m) {
return str_repeat('#', strlen($m[0]));
},
$str
);
I solved the problem with preg_replace_callback function in php.
Thanks CBroe for the tips.
preg_replace_callback('/#{3,}([A-Z]+)#{3,}/i', 'replaceLetters', $str);
function replaceLetters($matches) {
$ret = '';
for($i=0; $i < strlen($matches[0]); $i++) {
$ret .= "#";
}
return $ret;
}
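One practical difference between the answers above is whether neighbouring words may share a run of hashes. The following is only a sketch to make that difference visible; the helper names maskShared and maskConsumed are hypothetical and do not appear in the original answers. The second helper keeps the i modifier from the self-answer, so it also masks lowercase letters.

```php
<?php
// Hypothetical helpers for comparison; the names are illustrative only.

// Lookahead variant: the trailing hashes are asserted but not consumed,
// so they can also serve as the leading hashes of the next word.
function maskShared(string $str): string
{
    return preg_replace_callback(
        '/#{3}\K[A-Z]+(?=#{3})/',
        function ($m) {
            return str_repeat('#', strlen($m[0]));
        },
        $str
    );
}

// Consuming variant: both hash runs are part of the match, so a run that
// closed one word cannot open the next one.
function maskConsumed(string $str): string
{
    return preg_replace_callback(
        '/#{3,}[A-Z]+#{3,}/i',
        function ($m) {
            return str_repeat('#', strlen($m[0]));
        },
        $str
    );
}

$input = '###SHARE###MIDDLE###HASHES###';
echo maskShared($input), "\n";   // every letter is replaced
echo maskConsumed($input), "\n"; // MIDDLE is left untouched
```

If shared hash runs are not wanted, the consuming variant is the one to keep.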
Related
I have the following content
"aa_bb" : "foo"
"pp_Qq" : "bar"
"Xx_yY_zz" : "foobar"
And I want to convert the content on the left side to camelCase
"aaBb" : "foo"
"ppQq" : "bar"
"xxYyZz" : "foobar"
And the code:
// selects the left part
$newString = preg_replace_callback("/\"(.*?)\"(.*?):/", function($matches) {
// selects the characters following underscores
$matches[1] = preg_replace_callback("/_(.?)/", function($matches) {
//removes the underscore and uppercases the character
return strtoupper($matches[1]);
}, $matches[1]);
// lowercases the first character before returning
return "\"".lcfirst($matches[1])."\" : ".$matches[2];
}, $string);
Can this code be simplified?
Note: The content will always be a single string.
First, since you already have working code that you want to improve, consider posting your question on Code Review instead of Stack Overflow next time.
Let's start to improve your original approach:
$result = preg_replace_callback('~"[^"]*"\s*:~', function ($m) {
return preg_replace_callback('~_+(.?)~', function ($n) {
return strtoupper($n[1]);
}, strtolower($m[0]));
}, $str);
pro: patterns are relatively simple and the idea is easy to understand.
cons: nested preg_replace_callback's may hurt the eyes.
After this warm-up exercise for the eyes, we can try a \G-based pattern approach:
$pattern = '~(?|\G(?!^)_([^_"]*)|("(?=[^"]*"\s*:)[^_"]*))~';
$result = preg_replace_callback($pattern, function ($m) {
return ucfirst(strtolower($m[1]));
}, $str);
pro: the code is shorter, no need to use two preg_replace_callback's.
cons: the pattern is by far more complicated.
notice: when you write a long pattern, nothing prevents you from using free-spacing mode with the x modifier and adding comments:
$pattern = '~
(?| # branch reset group: in which capture groups have the same number
\G # contiguous to the last successful match
(?!^) # but not at the start of the string
_
( [^_"]* ) # capture group 1
|
( # capture group 1
"
(?=[^"]*"\s*:) # lookahead to check if it is the "key part"
[^_"]*
)
)
~x';
Are there compromises between these two extremes, and which one is good? Two suggestions:
$result = preg_replace_callback('~"[^"]+"\s*:~', function ($m) {
return array_reduce(explode('_', strtolower($m[0])), function ($c, $i) {
return $c . ucfirst($i);
});
}, $str);
pro: minimal use of regex.
cons: needs two callback functions except that this time the second one is called by array_reduce and not by preg_replace_callback.
$result = preg_replace_callback('~["_][^"_]*(?=[^"]*"\s*:)~', function ($m) {
return ucfirst(strtolower(ltrim($m[0], '_')));
}, $str);
pro: the pattern is relatively simple and the callback function stays simple too. It looks like a good compromise.
cons: the pattern isn't very restrictive (but should suffice for your use case)
pattern description: the pattern looks for a _ or a " and matches the following characters that aren't a _ or a ". A lookahead assertion then checks that these characters are inside the key part by looking ahead for a closing quote followed by a colon. The match always looks like _aBc or "aBc (the leading underscore is trimmed in the callback function, and a leading " is left unchanged by ucfirst).
pattern details:
["_] # one " or _
[^"_]* # zero or more characters that aren't " or _
(?= # open a lookahead assertion (followed with)
[^"]* # all that isn't a "
" # a literal "
\s* # optional whitespace
: # a literal :
) # close the lookahead assertion
There's no good answer and what looks simple or complicated really depends on the reader.
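To see the last suggestion at work on the sample input from the question, here is a minimal sketch. The wrapper name camelCaseKeys is hypothetical; the pattern and callback are the ones described above.

```php
<?php
// Hypothetical wrapper name; the pattern and callback come from the answer.
function camelCaseKeys(string $str): string
{
    return preg_replace_callback(
        '~["_][^"_]*(?=[^"]*"\s*:)~',
        function ($m) {
            // Drop a leading underscore, lowercase the chunk, then uppercase
            // its first character (a leading quote is unaffected by ucfirst).
            return ucfirst(strtolower(ltrim($m[0], '_')));
        },
        $str
    );
}

$str = "\"aa_bb\" : \"foo\"\n\"pp_Qq\" : \"bar\"\n\"Xx_yY_zz\" : \"foobar\"";
echo camelCaseKeys($str), "\n";
```

With this input the values on the right-hand side are left alone, because the lookahead only succeeds when the next quote is followed by a colon.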
You might make use of preg_replace_callback in combination with the \G anchor and capturing groups.
(?:"\K([^_\r\n]+)|\G(?!^))(?=[^":\r\n]*")(?=[^:\r\n]*:)_?([a-zA-Z])([^"_\r\n]*)
In parts
(?: Non capturing group
"\K([^_\r\n]+) Match ", capture group 1 match 1+ times any char except _ or newline
| Or
\G(?!^) Assert position at the previous match, not at the start
) Close group
(?=[^":\r\n]*") Positive lookahead, assert "
(?=[^:\r\n]*:) Positive lookahead, assert :
_? Match optional _
([a-zA-Z]) Capture group 2 match a-zA-Z
([^"_\r\n]*) Capture group 3 match 0+ times any char except ", _ or newline
In the replacement concatenate a combination of strtolower and strtoupper using the 3 capturing groups.
For example
$re = '/(?:"\K([^_\r\n]+)|\G(?!^))(?=[^":\r\n]*")(?=[^:\r\n]*:)_?([a-zA-Z])([^"_\r\n]*)/';
$str = '"aa_bb" : "foo"
"pp_Qq" : "bar"
"Xx_yY_zz" : "foobar"
"Xx_yYyyyyyYyY_zz_a" : "foobar"';
$result = preg_replace_callback($re, function($matches) {
return strtolower($matches[1]) . strtoupper($matches[2]) . strtolower($matches[3]);
}, $str);
echo $result;
Output
"aaBb" : "foo"
"ppQq" : "bar"
"xxYyZz" : "foobar"
"xxYyyyyyyyyyZzA" : "foobar"
To replace a whitespace with a comma and whitespace in a string I should do something like this:
$result = preg_replace('/[ ]+/', ', ', trim($value));
The result: Some, example, here, for, you
However, I only want to replace the third whitespace, so that the result would look like this:
Some example here, for you
How do I do that?
You may use something like
$value = " Some example here for you ";
$result = preg_replace('/^\S+(?:\s+\S+){2}\K\s+/', ',$0', trim($value), 1);
echo $result; // => Some example here, for you
See the PHP demo and the regex demo.
Pattern details
^ - start of string
\S+ - 1+ non-whitespaces
(?:\s+\S+){2} - two consecutive occurrences of
\s+ - 1+ whitespaces
\S+ - 1+ non-whitespaces
\K - a match reset operator
\s+ - (the $0 in the replacement pattern references this substring) 1+ whitespaces.
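The same pattern generalizes to the n-th whitespace by making the quantifier a parameter. This is only a sketch: the helper name commaAfterNthWord is hypothetical, and it assumes $n is at least 1.

```php
<?php
// Hypothetical helper: insert a comma before the whitespace that follows
// the $n-th word. Assumes $n >= 1.
function commaAfterNthWord(string $value, int $n): string
{
    // After the first word, ($n - 1) further "whitespace + word" pairs follow.
    $pattern = '/^\S+(?:\s+\S+){' . ($n - 1) . '}\K\s+/';
    // ",$0" keeps the matched whitespace and puts a comma in front of it.
    return preg_replace($pattern, ',$0', trim($value), 1);
}

echo commaAfterNthWord(' Some example here for you ', 3), "\n";
```

If the string has too few words, the pattern does not match and the trimmed string is returned unchanged.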
You can use a callback function and control when to replace:
<?php
$string = 'Some example here for you';
$i = 0;
$string = preg_replace_callback('/\s+/',function($m) use(&$i){
$i++;
if($i == 3) {
return ', ';
}
return ' ';
},$string);
echo $string;
Try this
$result = preg_replace('/^([^\s]+)\s+((?1)\s+(?1))/', '\1 \2,', trim($value));
Explanation:
^ start of string
([^\s]+) - capture everything not a space
\s+ space 1 or more
((?1)\s+(?1)) - (?1) re-runs the pattern of the first capture group (it is a subpattern call, not a backreference to the captured text); we do this twice with whitespace in between and capture the result. You could capture the two words separately, but there is no need to.
The nice thing about (?1) is that if you have to change the regex for the word capturing, you only have to change it once, not three times. It probably doesn't matter much here, but I like using it...
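A short illustration of how a subpattern call differs from a backreference (the sample strings are made up for this sketch):

```php
<?php
// (?1) re-runs the pattern of group 1, so the second number may differ.
var_dump(preg_match('/^(\d+)-(?1)$/', '12-345')); // int(1)

// \1 must repeat the exact text captured by group 1.
var_dump(preg_match('/^(\d+)-\1$/', '12-345'));   // int(0)
var_dump(preg_match('/^(\d+)-\1$/', '12-12'));    // int(1)
```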
I have to replace characters in URLs, but only from a certain point, and also handle duplicate characters.
The URLs look like this:
http://example.com/001-one-two.html#/param-what-ever
http://example.com/002-one-two-three.html#/param-what--ever-
http://example.com/003-one-two-four.html#/param2-what-ever-
http://example.com/004-one-two-five.html#/param33-what--ever---here-
and they should look like this:
http://example.com/001-one-two.html#/param-what_ever
http://example.com/002-one-two-three.html#/param-what_ever_
http://example.com/003-one-two-four.html#/param2-what_ever_
http://example.com/004-one-two-five.html#/param33-what_ever_here_
In words: replace each run of - characters (of any length) with a single _ character, but skip the first - after #/
The string length after the #/ varies obviously and I couldn't figure out a way to do this.
How can I do this?
Here is a way to go, using preg_replace_callback:
$in = array(
'http://example.com/001-one-two.html#/param-what-ever',
'http://example.com/002-one-two-three.html#/param-what--ever-',
'http://example.com/003-one-two-four.html#/param2-what-ever-',
'http://example.com/004-one-two-five.html#/param33-what--ever---here-'
);
foreach($in as $str) {
$res = preg_replace_callback('~^.*?#/[^-]+-\K(.+)$~', function ($m) {
return preg_replace('/-+/', '_', $m[1]);
},
$str);
echo "$res\n";
}
Explanation:
~ : regex delimiter
^ : start of string
.*? : 0 or more any character, not greedy
#/ : literally #/
[^-]+ : 1 or more any character that is not a dash
- : a dash
\K : forget all we have seen until here
(.+) : group 1, contains everything after the first dash after #/
$ : end of string
~ : regex delimiter
Output:
http://example.com/001-one-two.html#/param-what_ever
http://example.com/002-one-two-three.html#/param-what_ever_
http://example.com/003-one-two-four.html#/param2-what_ever_
http://example.com/004-one-two-five.html#/param33-what_ever_here_
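The same result can probably be had with a single preg_replace() by borrowing the \G technique. This is a sketch rather than part of the answer above: the helper name underscoreAfterFirstDash is hypothetical, and it assumes "#/" occurs at most once in the URL.

```php
<?php
// Hypothetical helper name. Assumes "#/" occurs at most once in the URL.
function underscoreAfterFirstDash(string $url): string
{
    // First alternative: "#/", the first chunk and its dash (left untouched).
    // Second alternative: continue right after the previous match.
    // Then skip non-dash characters, forget them with \K, and match a dash run.
    return preg_replace('~(?:#/[^-]*-|\G(?!^))[^-]*\K-+~', '_', $url);
}

$in = array(
    'http://example.com/001-one-two.html#/param-what-ever',
    'http://example.com/002-one-two-three.html#/param-what--ever-',
    'http://example.com/003-one-two-four.html#/param2-what-ever-',
    'http://example.com/004-one-two-five.html#/param33-what--ever---here-'
);
foreach ($in as $str) {
    echo underscoreAfterFirstDash($str), "\n";
}
```

Dashes before "#/" are never touched, because \G only matches where the previous match ended.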
I want to manipulate a string like "...4+3(4-2)-...." to become "...4+3*(4-2)-....", but of course it should recognize any number, d, followed by a '(' and change it to 'd*('. I also want to change ')(' to ')*(' at the same time if possible. It would be nice if support for constants like pi or e could be added too.
For now, I just do it this stupid way:
private function make_implicit_multiplication_explicit($string)
{
$i=1;
if(strlen($string)>1)
{
while(($i=strpos($string,"(",$i))!==false)
{
if(strpos("0123456789",substr($string,$i-1,1))!==false) // strict check: strpos returns 0 (falsy) for the digit "0"
{
$string=substr_replace($string,"*(",$i,1);
$i++;
}
$i++;
}
$string=str_replace(")(",")*(",$string);
}
return $string;
}
But I believe this could be done much more nicely with preg_replace or some other regex function. Those manuals are really hard to grasp, though.
Let's start with what you are looking for:
either of the following: ((a|b) will match either a or b)
any number, \d
the character ): \)
followed by (: \(
Which creates this pattern: (\d|\))\(. But since you want to modify the string and keep both parts, you can group the \( which results in (\() making it worse to read but better to handle.
Now everything left is to tell how to rearrange, which is simple: \\1*\\2, leaving you with code like this
$regex = "/(\d|\))(\()/";
$replace = "\\1*\\2";
$new = preg_replace($regex, $replace, $test);
To see that the pattern actually matches all cases, see this example.
To recognize any number followed by a ( OR a combination of a )( and place an asterisk in between them, you can use a combination of lookaround assertions.
echo preg_replace("/
(?<=[0-9)]) # look behind to see if there is: '0' to '9', ')'
(?=\() # look ahead to see if there is: '('
/x", '*', '(4+3(4-2)-3)(2+3)');
The positive lookbehind asserts that what precedes is either a digit or a right parenthesis, while the positive lookahead asserts that what follows is a left parenthesis.
Another option is to use the \K escape sequence in place of the lookbehind. \K resets the starting point of the reported match: any previously consumed characters are no longer included in it.
echo preg_replace("/
[0-9)] # any character of: '0' to '9', ')'
\K # resets the starting point of the reported match
(?=\() # look ahead to see if there is: '('
/x", '*', '(4+3(4-2)-3)(2+3)');
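The question also asked about constants such as pi or e, which the patterns above do not cover. One possible extension of the \K variant, offered as a sketch only: the helper name explicitMultiplication is hypothetical, "pi" and "e" are the only constants assumed, and a constant is recognized only when no letter precedes it, so that a function name ending in "e" is left alone.

```php
<?php
// Hypothetical helper; "pi" and "e" are the only constants assumed here.
function explicitMultiplication(string $expr): string
{
    return preg_replace(
        '~(?:[0-9)]            # a digit or a closing parenthesis
           | (?<![A-Za-z])     # or, when not preceded by a letter,
             (?:pi|e)          # one of the assumed constants
          )
          \K                   # forget what was matched so far
          (?=\()               # an opening parenthesis must follow
        ~x',
        '*',
        $expr
    );
}

echo explicitMultiplication('(4+3(4-2)-3)(2+3)'), "\n";
echo explicitMultiplication('pi(2+1)'), "\n";
```

This does not insert an asterisk between a number and a constant (for example "2pi"); that case would need its own rule.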
Your php code should be,
<?php
$mystring = "4+3(4-2)-(5)(3)";
$regex = '~\d+\K\(~';
$replacement = "*(";
$str = preg_replace($regex, $replacement, $mystring);
$regex1 = '~\)\K\(~';
$replacement1 = "*(";
echo preg_replace($regex1, $replacement1, $str); //=> 4+3*(4-2)-(5)*(3)
?>
Explanation:
~\d+\K\(~ matches one or more digits followed by a (. Because of \K, the \d+ part is excluded from the match.
Again it replaces the matched part with *( which in turn produces 3*( and the result was stored in another variable.
\)\K\( Matches )( and excludes the first ). This would be replaced by *( which in turn produces )*(
Silly method :^ )
$value = '4+3(4-2)(1+2)';
$search = ['1(', '2(', '3(', '4(', '5(', '6(', '7(', '8(', '9(', '0(', ')('];
$replace = ['1*(', '2*(', '3*(', '4*(', '5*(', '6*(', '7*(', '8*(', '9*(', '0*(', ')*('];
echo str_replace($search, $replace, $value);
How would I remove repeating characters (e.g. remove the letter k in cakkkke for it to be cake)?
One straightforward way to do this would be to loop through each character of the string and append each character of the string to a new string if the character isn't a repeat of the previous character.
Here is some code that can do this:
$newString = '';
$oldString = 'cakkkke';
$lastCharacter = '';
for ($i = 0; $i < strlen($oldString); $i++) {
if ($oldString[$i] !== $lastCharacter) {
$newString .= $oldString[$i];
}
$lastCharacter = $oldString[$i];
}
echo $newString;
Is there a way to do the same thing more concisely using regex or built-in functions?
Use backreferences
echo preg_replace("/(.)\\1+/", "$1", "cakkke");
Output:
cake
Explanation:
(.) captures any character
\\1 is a backreference to the first capture group, the (.) above in this case.
+ makes the backreference match at least once (so that it matches aa, aaa, aaaa, but not a)
Replacing it with $1 replaces the complete matched text kkk in this case, with the first capture group, k in this case.
You want to first match a character, followed by that character repeated: (.)\1+. Replace that with the first character. The parentheses create a capture group holding the first character, which you reference both to match the repeated instances (\1) and as the replacement text ($1).
preg_replace('/(.)\1+/', '$1', $str);
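Two caveats are worth a sketch: without the u modifier the dot matches single bytes rather than whole UTF-8 characters, and the pattern also collapses legitimate doubles. The helper below is hypothetical (its name and the $minRun parameter are not from the answers above) and assumes $minRun is at least 2.

```php
<?php
// Hypothetical helper. The u modifier makes "." match a whole UTF-8 character.
// Assumes $minRun >= 2.
function collapseRepeats(string $str, int $minRun = 2): string
{
    // A run of at least $minRun identical characters is one character
    // followed by ($minRun - 1) or more copies of it.
    $pattern = '/(.)\1{' . ($minRun - 1) . ',}/u';
    return preg_replace($pattern, '$1', $str);
}

echo collapseRepeats('cakkkke'), "\n";    // cake
echo collapseRepeats('book'), "\n";       // bok (legitimate doubles collapse too)
echo collapseRepeats('bookkkk', 3), "\n"; // book
```

Raising $minRun to 3 keeps ordinary double letters intact while still squeezing longer runs.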