Looking for specific character in capture group - php

I need to replace all double quotes in any (variable) given string.
For example:
$text = 'data-caption="hello"world">';
$pattern = '/data-caption="[[\s\S]*?"|(")]*?">/';
$output = preg_replace($pattern, '"', $text);
should result in:
"hello"world"
(The above pattern is my attempt at getting it to work)
The problem is that I don't now in advance if and how many double quotes are going to be in the string.
How can i replace the " with quot; ?

You may match strings between data-caption=" and "> and then replace all " inside that match with " using a mere str_replace:
$text = 'data-caption="<element attribute1="wert" attribute2="wert">Name</element>">';
$pattern = '/data-caption="\K.*?(?=">)/';
$output = preg_replace_callback($pattern, function($m) {
return str_replace('"', '"', $m[0]);
}, $text);
print_r($output);
// => data-caption="<element attribute1="wert" attribute2="wert">Name</element>">
See the PHP demo
Details
data-caption=" - starting delimiter
\K - match reset operator
.*? - any 0+ chars other than line break chars, as few as possible
(?=">) - a positive lookahead that requires the "> substring immediately to the right of the current location.
The match is passed to the anonymous function inside preg_replace_callback (accessible via $m[0]) and that is where it is possible to replace all " symbols in a convenient way.

Related

PHP - How to modify matched pattern and replace

I have string which contain space in its html tags
$mystr = "&lt; h3&gt; hello mom ?&lt; / h3&gt;"
so i wrote regex expression for it to detect the spaces in it
$pattern = '/(?<=<)\s\w+|\s\/\s\w+|\s\/(?=>)/mi';
so next i want to modify the matches by removing space from it and replace it, so any idea how it can be done? so that i can fix my string like
"&lt;h3&gt; hello mom ?&lt;/h3&gt;"
i know there is php function pre_replace but not sure how i can modify the matches
$result = preg_replace( $pattern, $replace , $mystr );
For the specific tags like you showed, you can use
preg_replace_callback('/&lt;(?:\s*\/)?\s*\w+\s*&gt;/ui', function($m) {
return preg_replace('/\s+/u', '', $m[0]);
}, $mystr)
The regex - note the u flag to deal with Unicode chars in the string - matches
&lt; - a literal string
(?:\s*\/)? - an optional sequence of zero or more whitespaces and a / char
\s* - zero or more whitespaces
\w+ - one or more word chars
\s* - zero or more whitespaces
&gt; - a literal string.
The preg_replace('/\s+/u', '', $m[0]) line in the anonymous callback function removes all chunks of whitespaces (even those non-breaking spaces).
You could keep it simple and do:
$output = str_replace(['&lt; / ', '&lt; ', '&gt; '],
['&lt;/', '&lt;', '&gt;'], $input);

Find words starting and ending with dollar signs $ in PHP

I am looking to find and replace words in a long string. I want to find words that start looks like this: $test$ and replace it with nothing.
I have tried a lot of things and can't figure out the regular expression. This is the last one I tried:
preg_replace("/\b\\$(.*)\\$\b/im", '', $text);
No matter what I do, I can't get it to replace words that begin and end with a dollar sign.
Use single quotes instead of double quotes and remove the double escape.
$text = preg_replace('/\$(.*?)\$/', '', $text);
Also a word boundary \b does not consume any characters, it asserts that on one side there is a word character, and on the other side there is not. You need to remove the word boundary for this to work and you have nothing containing word characters in your regular expression, so the i modifier is useless here and you have no anchors so remove the m (multi-line) modifier as well.
As well * is a greedy operator. Therefore, .* will match as much as it can and still allow the remainder of the regular expression to match. To be clear on this, it will replace the entire string:
$text = '$fooo$ bar $baz$ quz $foobar$';
var_dump(preg_replace('/\$(.*)\$/', '', $text));
# => string(0) ""
I recommend using a non-greedy operator *? here. Once you specify the question mark, you're stating (don't be greedy.. as soon as you find a ending $... stop, you're done.)
$text = '$fooo$ bar $baz$ quz $foobar$';
var_dump(preg_replace('/\$(.*?)\$/', '', $text));
# => string(10) " bar quz "
Edit
To fix your problem, you can use \S which matches any non-white space character.
$text = '$20.00 is the $total$';
var_dump(preg_replace('/\$\S+\$/', '', $text));
# string(14) "$20.00 is the "
There are three different positions that qualify as word boundaries \b:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
$ is not a word character, so don't use \b or it won't work. Also, there is no need for the double escaping and no need for the im modifiers:
preg_replace('/\$(.*)\$/', '', $text);
I would use:
preg_replace('/\$[^$]+\$/', '', $text);
You can use preg_quote to help you out on 'quoting':
$t = preg_replace('/' . preg_quote('$', '/') . '.*?' . preg_quote('$', '/') . '/', '', $text);
echo $t;
From the docs:
This is useful if you have a run-time string that you need to match in some text and the string may contain special regex characters.
The special regular expression characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -
Contrary to your use of word boundary markers (\b), you actually want the inverse effect (\B)-- you want to make sure that there ISN'T a word character next to the non-word character $.
You also don't need to use capturing parentheses because you are not using a backreference in your replacement string.
\S+ means one or more non-whitespace characters -- with greedy/possessive matching.
Code: (Demo)
$text = '$foo$ boo hi$$ mon$k$ey $how thi$ $baz$ bar $foobar$';
var_export(
preg_replace(
'/\B\$\S+\$\B/',
'',
$text
)
);
Output:
' boo hi$$ mon$k$ey $how thi$ bar '

preg_replace vs trim PHP

I am working with a slug function and I dont fully understand some of it and was looking for some help on explaining.
My first question is about this line in my slug function $string = preg_replace('# +#', '-', $string); Now I understand that this replaces all spaces with a '-'. What I don't understand is what the + sign is in there for which comes after the white space in between the #.
Which leads to my next problem. I want a trim function that will get rid of spaces but only the spaces after they enter the value. For example someone accidentally entered "Arizona " with two spaces after the a and it destroyed the pages linked to Arizona.
So after all my rambling I basically want to figure out how I can use a trim to get rid of accidental spaces but still have the preg_replace insert '-' in between words.
ex.. "Sun City West " = "sun-city-west"
This is my full slug function-
function getSlug($string){
if(isset($string) && $string <> ""){
$string = strtolower($string);
//var_dump($string); echo "<br>";
$string = preg_replace('#[^\w ]+#', '', $string);
//var_dump($string); echo "<br>";
$string = preg_replace('# +#', '-', $string);
}
return $string;
}
You can try this:
function getSlug($string) {
return preg_replace('#\s+#', '-', trim($string));
}
It first trims extra spaces at the beginning and end of the string, and then replaces all the other with the - character.
Here your regex is:
#\s+#
which is:
# = regex delimiter
\s = any space character
+ = match the previous character or group one or more times
# = regex delimiter again
so the regex here means: "match any sequence of one or more whitespace character"
The + means at least one of the preceding character, so it matches one or more spaces. The # signs are one of the ways of marking the start and end of a regular expression's pattern block.
For a trim function, PHP handily provides trim() which removes all leading and trailing whitespace.

Attempting to understand handling regular expressions with php

I am trying to make sense of handling regular expression with php. So far my code is:
PHP code:
$string = "This is a 1 2 3 test.";
$pattern = '/^[a-zA-Z0-9\. ]$/';
$match = preg_match($pattern, $string);
echo "string: " . $string . " regex response: " , $match;
Why is $match always returning 0 when I think it should be returning a 1?
[a-zA-Z0-9\. ] means one character which is alphanumeric or "." or " ". You will want to repeat this pattern:
$pattern = '/^[a-zA-Z0-9. ]+$/';
^
"one or more"
Note: you don't need to escape . inside a character group.
Here's what you're pattern is saying:
'/: Start the expressions
^: Beginning of the string
[a-zA-Z0-9\. ]: Any one alphanumeric character, period or space (you should actually be using \s for spaces if your intention is to match any whitespace character).
$: End of the string
/': End the expression
So, an example of a string that would yield a match result is:
$string = 'a'
Of other note, if you're actually trying to get the matches from the result, you'll want to use the third parameter of preg_match:
$numResults = preg_match($pattern, $string, $matches);
You need a quantifier on the end of your character class, such as +, which means match 1 or more times.
Ideone.

Replace the leading space with with the same number of times using a PHP regular expression

I want to replace leading space with with the same number of occurrences.
Explanation:
If one leading space exist in the input then it should replace it with one .
If two leading spaces exist in input then it should replace with two s.
If n leading spaces are exist in the input then it should replace it with exactly n number of times with .
Example 1:
My name is XYZ
Output:
My name is XYZ
Example 2:
My name is XYZ
Output:
My name is XYZ
How can I replace only leading spaces, using a PHP regular expression?
preg_replace('/\G /', ' ', $str);
\G matches the position where the last match ended, or the beginning of the string if there isn't any previous match.
Actually, in PHP it matches where the next match is supposed to begin. That isn't necessarily the same as where the previous match ended.
$len_before = strlen($str);
$str = ltrim($str, ' ');
$len_after = strlen($str);
$str = str_repeat(' ', $len_before - $len_after) . $str;
Using preg_replace there is also
$str = preg_replace('/^( +)/e', 'str_repeat(" ", strlen("$1"))', $str);
but note that it uses the /e flag.
See http://www.ideone.com/VWNKZ for the result.
Use:
preg_replace('/^ +/m', ' ', $str);
You can test it here.
Use preg_match with PREG_OFFSET_CAPTURE flag set. The offset is the length of the "spaces". Then use str_repeat with the offset.

Categories