Replace 3rd whitespace with comma and whitespace in string - php

To replace a whitespace with a comma and whitespace in a string I should do something like this:
$result = preg_replace('/[ ]+/', ', ', trim($value));
The result: Some, example, here, for, you
However, I only want to replace the third whitespace, so that the result would look like this:
Some example here, for you
How do I do that?

You may use something like
$value = " Some example here for you ";
$result = preg_replace('/^\S+(?:\s+\S+){2}\K\s+/', ',$0', trim($value), 1);
echo $result; // => Some example here, for you
See the PHP demo and the regex demo.
Pattern details
^ - start of string
\S+ - 1+ non-whitespaces
(?:\s+\S+){2} - two consecutive occurrences of
\s+ - 1+ whitespaces
\S+ - 1+ non-whitespaces
\K - a match reset operator
\s+ - (the $0 in the replacement pattern references this substring) 1+ whitespaces.
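If the position needs to vary, the same pattern can be built with the repeat count as a parameter. This is only a sketch: the helper name commaBeforeNthWhitespace and its $n parameter are my own, not part of the answer above.

```php
<?php
// Hypothetical helper (name and signature are assumptions): insert a comma
// before the $n-th run of whitespace in the trimmed $value.
function commaBeforeNthWhitespace(string $value, int $n): string
{
    if ($n < 1) {
        return $value; // nothing sensible to do for n < 1
    }
    // $n - 1 "whitespace + word" pairs must come before the target whitespace.
    $pattern = '/^\S+(?:\s+\S+){' . ($n - 1) . '}\K\s+/';
    return preg_replace($pattern, ',$0', trim($value), 1);
}

echo commaBeforeNthWhitespace(' Some example here for you ', 3), "\n"; // Some example here, for you
echo commaBeforeNthWhitespace('Some example here for you', 1), "\n";   // Some, example here for you
```

If the string has fewer than $n whitespace runs, the pattern does not match and the trimmed string comes back unchanged.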

You can use a callback function and control when to replace:
<?php
$string = 'Some example here for you';
$i = 0;
$string = preg_replace_callback('/\s+/',function($m) use(&$i){
$i++;
if($i == 3) {
return ', ';
}
return ' ';
},$string);
echo $string;
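Note that the callback above turns every whitespace run into a single space. If the original spacing should be kept, a variant along these lines might do; the function name commaAtNthGap and the $n parameter are assumptions for illustration only.

```php
<?php
// Hypothetical helper: prefix the $n-th whitespace run with a comma and
// return every whitespace run exactly as it was matched.
function commaAtNthGap(string $string, int $n): string
{
    $i = 0; // counts whitespace runs seen so far
    return preg_replace_callback('/\s+/', function ($m) use (&$i, $n) {
        $i++;
        return $i === $n ? ',' . $m[0] : $m[0];
    }, $string);
}

echo commaAtNthGap('Some example here for you', 3); // Some example here, for you
```

Keeping the counter inside the function avoids leaking $i into the surrounding scope.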

Try this
$result = preg_replace('/^([^\s]+)\s+((?1)\s+(?1))/', '\1 \2,', trim($value));
Test it
Explanation:
^ start of string
([^\s]+) - capture everything not a space
\s+ space 1 or more
((?1)\s+(?1)) - (?1) reuses the pattern of the first capture group; we do this 2x with a space between, and capture that. I guess you could capture them separately, but what's the point.
The nice thing about (?n), where n is the group number, is that if you have to change the regex for the word capturing, you only have to change it once, not three times. Probably it doesn't matter here so much, but I like using it...

Related

PHP - How to modify matched pattern and replace

I have a string which contains spaces in its HTML tags
$mystr = "&lt; h3&gt; hello mom ?&lt; / h3&gt;";
so I wrote a regex to detect the spaces in it
$pattern = '/(?<=<)\s\w+|\s\/\s\w+|\s\/(?=>)/mi';
Next I want to modify the matches by removing the spaces and replace them. Any idea how this can be done, so that I can fix my string like this?
"&lt;h3&gt; hello mom ?&lt;/h3&gt;"
I know there is the PHP function preg_replace, but I am not sure how I can modify the matches
$result = preg_replace( $pattern, $replace , $mystr );
For the specific tags like you showed, you can use
$result = preg_replace_callback('/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui', function($m) {
return preg_replace('/\s+/u', '', $m[0]);
}, $mystr);
The regex - note the u flag to deal with Unicode chars in the string - matches
&lt; - a literal string
(?:\s*\/)? - an optional sequence of zero or more whitespaces and a / char
\s* - zero or more whitespaces
\w+ - one or more word chars
\s* - zero or more whitespaces
&gt; - a literal string.
The preg_replace('/\s+/u', '', $m[0]) line in the anonymous callback function removes all chunks of whitespaces (even those non-breaking spaces).
You could keep it simple and do:
$output = str_replace(['&lt; / ', '&lt; ', '&gt; '],
['&lt;/', '&lt;', '&gt;'], $input);
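For what it's worth, the two approaches are not quite equivalent on the sample string: the '&gt; ' pair in the str_replace version also removes the space after the opening tag. A quick side-by-side sketch (the variable names $viaCallback and $viaStrReplace are mine):

```php
<?php
$mystr = '&lt; h3&gt; hello mom ?&lt; / h3&gt;';

// Approach 1: strip whitespace only inside each tag-like match.
$viaCallback = preg_replace_callback(
    '/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui',
    function ($m) {
        return preg_replace('/\s+/u', '', $m[0]);
    },
    $mystr
);

// Approach 2: fixed-string replacements, applied in array order.
$viaStrReplace = str_replace(
    ['&lt; / ', '&lt; ', '&gt; '],
    ['&lt;/', '&lt;', '&gt;'],
    $mystr
);

echo $viaCallback, "\n";   // &lt;h3&gt; hello mom ?&lt;/h3&gt;
echo $viaStrReplace, "\n"; // &lt;h3&gt;hello mom ?&lt;/h3&gt;
```

If the space after the opening tag matters, dropping the '&gt; ' pair from the str_replace arrays should bring the two results in line for this input.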

Replace uppercase letters between special pattern

I'd like to replace the letters "KELLY" between "#" with the same number of "#" (here, five repeated #'s instead of 'KELLY').
$str = "####KELLY#####"; // any alphabet letters can come.
preg_replace('/(#{3,})[A-Z]+(#{3,})/', "$1$2", $str);
It returns ######### (four hashes then five hashes) without 'KELLY'.
How can I get ############## which is four original leading hashes, then replace each letter with a hash, then the five original trailing hashes?
The \G continue metacharacter makes for a messier pattern, but it enables the ability to use preg_replace() instead of preg_replace_callback().
Effectively, it looks for the leading three-or-more hashes, then makes single-letter replacements until it reaches the finishing sequence of three-or-more hashes.
This technique also allows hash markers to be "shared" -- I don't actually know if this is something that is desired.
Code: (Demo)
$str = "####KELLY##### and ###ANOTHER###### not ####foo#### but: ###SHARE###MIDDLE###HASHES### ?";
echo $str . "\n";
echo preg_replace('/(?:#{3}|\G(?!^))\K[A-Z](?=[A-Z]*#{3})/', '#', $str);
Output:
####KELLY##### and ###ANOTHER###### not ####foo#### but: ###SHARE###MIDDLE###HASHES### ?
############## and ################ not ####foo#### but: ############################# ?
Breakdown:
/ #starting pattern delimiter
(?: #start non-capturing group
#{3} #match three hash symbols
| # OR
\G(?!^) #continue matching, disallow matching from the start of string
) #close non-capturing group
\K #forget any characters matched up to this point
[A-Z] #match a single letter
(?= #lookahead (do not consume any characters) for...
[A-Z]* #zero or more letters then
#{3} #match three hash symbols (enough to confirm a run of three or more)
) #close the lookahead
/ #ending pattern delimiter
Or you can achieve the same result with preg_replace_callback().
Code: (Demo)
echo preg_replace_callback(
'/#{3}\K[A-Z]+(?=#{3})/',
function($m) {
return str_repeat('#', strlen($m[0]));
},
$str
);
I solved the problem with preg_replace_callback function in php.
Thanks CBroe for the tips.
preg_replace_callback('/#{3,}([A-Z]+)#{3,}/i', 'replaceLetters', $str);
function replaceLetters($matches) {
$ret = '';
for($i=0; $i < strlen($matches[0]); $i++) {
$ret .= "#";
}
return $ret;
}

Regex only grabbing first digit

I'm trying to grab everything after the following digits, so I end up with just the store name in this string:
full string: /stores/1077029-gacha-pins
what I want to ignore: /stores/1077029-
what I need to grab: gacha-pins
Those digits can change at any time so it's not specifically that ID, but any numbers after /stores/
My attempt so far is only grabbing /stores/1
\/stores\/[0-9]
I'm still trying, just thought I would see if I can get some help in the meantime too, will post an answer if I solve.
You may use
'~/stores/\d+-\K[^/]+$~'
Or a more specific one:
'~/stores/\d+-\K\w+(?:-\w+)*$~'
See the regex demo and this regex demo.
Details
/stores/ - a literal string
\d+ - 1+ digits
- - a hyphen
\K - match reset operator
[^/]+ - any 1+ chars other than /
\w+(?:-\w+)* - 1+ word chars and then 0+ sequences of - and 1+ word chars
$ - end of string.
See the PHP demo:
$s = "/stores/1077029-gacha-pins";
$rx = '~/stores/\d+-\K[^/]+$~';
if (preg_match($rx, $s, $matches)) {
echo "Result: " . $matches[0];
}
// => Result: gacha-pins
You should do it like this:
$string = '/stores/1077029-gacha-pins';
preg_match('#/stores/[0-9-]+(.*)#', $string, $matches);
$part = $matches[1];
print_r($part);
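If both the ID and the name are needed, named capture groups keep the two parts apart. A minimal sketch, assuming the path always has the /stores/<digits>-<name> shape; the function name parseStorePath and the group names id and name are made up for this example.

```php
<?php
// Hypothetical helper: split a store path into its numeric ID and its name.
// Returns null when the path does not have the expected shape.
function parseStorePath(string $path): ?array
{
    if (preg_match('~^/stores/(?<id>\d+)-(?<name>[^/]+)$~', $path, $m)) {
        return ['id' => $m['id'], 'name' => $m['name']];
    }
    return null;
}

$store = parseStorePath('/stores/1077029-gacha-pins');
echo $store['id'], "\n";   // 1077029
echo $store['name'], "\n"; // gacha-pins
```

Unlike the [0-9-]+ version, this consumes only one run of digits and a single hyphen, so a store name that itself starts with a digit stays intact.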

split string in numbers and text but accept text with a single digit inside

Let's say I want to split this string in two variables:
$string = "levis 501";
I will use
preg_match('/\d+/', $string, $num);
preg_match('/\D+/', $string, $text);
but then let's say I want to split this one in two
$string = "levis 5° 501";
as $text = "levis 5°"; and $num = "501";
So my guess is I should add a rule to the preg_match('/\d+/', $string, $num); that looks for numbers only at the END of the string and I want it to be between 2 and 3 digits.
But also the $text match now has one number inside...
How would you do it?
To split a string in two parts, use any of the following:
preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
This regex matches:
^ - the start of the string
(.*?) - Group 1 capturing any zero or more characters, as few as possible (as *? is a "lazy" quantifier) up to...
\s* - zero or more whitespace symbols
(\d+) - Group 2 capturing 1 or more digits
\D* - zero or more characters other than digit (it is the opposite shorthand character class to \d)
$ - end of string.
The s modifier (after the closing ~ delimiter) is a DOTALL one forcing the . to match any character, even a newline, that it does not match without this modifier.
Or
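preg_split('~\s*(?=\s*\d+\D*$)~', $s, 2);
The limit of 2 matters: the pattern can also match the empty string between the trailing digits, so without a limit the number itself would be split into pieces.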
This \s*(?=\s*\d+\D*$) pattern:
\s* - zero or more whitespaces, but only if followed by...
(?=\s*\d+\D*$) - zero or more whitespaces followed with 1+ digits followed with 0+ characters other than digits followed with end of string.
The (?=...) construct is a positive lookahead that does not consume characters and just checks if the pattern inside matches and if yes, returns "true", and if not, no match occurs.
See IDEONE demo:
$s = "levis 5° 501";
preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
print_r($matches[1] . ": ". $matches[2]. PHP_EOL);
print_r(preg_split('~\s*(?=\s*\d+\D*$)~', $s, 2));
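The question also asked for the number to sit at the very end and be 2 to 3 digits long. A sketch of that stricter variant, under the assumption that nothing may follow the digits; the function name splitTrailingNumber is hypothetical.

```php
<?php
// Hypothetical helper: split "text + trailing number" where the number is
// exactly 2 or 3 digits at the end of the string.
// Returns [text, number], or null when there is no such trailing number.
function splitTrailingNumber(string $s): ?array
{
    // (?<!\d) stops a longer digit run (e.g. 1234) from being cut into 1 + 234.
    if (preg_match('~^(.*?)\s*(?<!\d)(\d{2,3})$~s', $s, $m)) {
        return [$m[1], $m[2]];
    }
    return null;
}

print_r(splitTrailingNumber("levis 5° 501")); // [0] => levis 5°, [1] => 501
```

Strings without a qualifying number, such as "levis" or "levis 1234", return null here, so the caller can decide what to do with them.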

Remove repeating character

How would I remove repeating characters (e.g. remove the letter k in cakkkke for it to be cake)?
One straightforward way to do this would be to loop through each character of the string and append each character of the string to a new string if the character isn't a repeat of the previous character.
Here is some code that can do this:
$newString = '';
$oldString = 'cakkkke';
$lastCharacter = '';
for ($i = 0; $i < strlen($oldString); $i++) {
if ($oldString[$i] !== $lastCharacter) {
$newString .= $oldString[$i];
}
$lastCharacter = $oldString[$i];
}
echo $newString;
Is there a way to do the same thing more concisely using regex or built-in functions?
Use backreferences
echo preg_replace("/(.)\\1+/", "$1", "cakkke");
Output:
cake
Explanation:
(.) captures any character
\\1 is a backreference to the first capture group (the . above in this case).
+ makes the backreference match at least once (so that it matches aa, aaa, aaaa, but not a)
Replacing it with $1 replaces the complete matched text kkk in this case, with the first capture group, k in this case.
You want to first match a character, followed by that character repeated: (.)\1+. Replace that with the first character. The parentheses create a capture group for the first character, which you reference both to match the repeated instances (\1) and as the replacement text ($1).
preg_replace('/(.)\1+/', '$1', $str);
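Two things to watch with this pattern: without the u modifier it works on bytes, so repeated multibyte characters are not collapsed, and it also collapses legitimate double letters. A small sketch, assuming UTF-8 input; the function name collapseRepeats is made up for illustration.

```php
<?php
// Hypothetical helper: collapse every run of the same character to one.
// The u modifier makes . match whole UTF-8 characters instead of bytes.
function collapseRepeats(string $str): string
{
    return preg_replace('/(.)\1+/u', '$1', $str);
}

echo collapseRepeats('cakkkke'), "\n";  // cake
echo collapseRepeats('éééclair'), "\n"; // éclair
echo collapseRepeats('book'), "\n";     // bok (legitimate doubles are collapsed too)
```

If only one specific letter should be collapsed, a narrower pattern such as /k{2,}/ with 'k' as the replacement avoids touching words like "book".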
