I want to replace the space between two words inside an href with %20 using a regex, but this does not seem to work:
$pattern = "#\<a href=\"(.+?) (.+?)\">#is";
$txt = preg_replace($pattern, "\<a href=\"\\1%20\\2\">", $txt);
I also need this to work for multiple words, but only within the tags, as the rest of the text should keep its spaces. So a str_replace() won't work (I think?).
Any tips?
The stable solution would be: use DOM to retrieve the href value, use str_replace() to replace the spaces, and then write the value back using DOM again.
Don't use regexes to handle HTML / XML.
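The DOM approach described above could look roughly like the following. This is only a sketch: the sample markup in $html is an assumption (the question did not include an input), and it relies on PHP's bundled DOM extension being enabled.

```php
<?php
// Hypothetical sample input; not taken from the question.
$html = '<p>Some text <a href="my file name.html">a link</a> more text</p>';

$doc = new DOMDocument();
// loadHTML() wraps the fragment in <html><body>, which is fine for this demo.
$doc->loadHTML($html);

foreach ($doc->getElementsByTagName('a') as $link) {
    if ($link->hasAttribute('href')) {
        // Only the attribute value is touched; the surrounding text keeps its spaces.
        $href = $link->getAttribute('href');
        $link->setAttribute('href', str_replace(' ', '%20', $href));
    }
}

$firstHref = $doc->getElementsByTagName('a')->item(0)->getAttribute('href');
$paragraphText = $doc->getElementsByTagName('p')->item(0)->textContent;

echo $firstHref, PHP_EOL;
echo $paragraphText, PHP_EOL;
```

To get markup back out, $doc->saveHTML() can be called afterwards; note that it returns the whole wrapped document, not just the original fragment.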
Try this regex code to remove white spaces
\s+(?=[^()]*\
You can try the regex:
$txt = preg_replace('~(?:href="|(?<!^)\G)\K([^" ]*)\s+~', "$1%20", $txt);
\G matches at the end of the previous match, so multiple spaces in a single attribute are replaced one after another. PHP has no g modifier; preg_replace() already replaces every match.
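A small check of the \G pattern, assuming a sample input that is not from the question. It uses the pattern without a g modifier, since PHP does not accept one and preg_replace() is global anyway.

```php
<?php
// Hypothetical sample input: spaces inside the href and in the surrounding text.
$txt = 'Some text <a href="my file name.html">a link</a> more text';

// href=" starts a run; \G continues it at the end of the previous match.
// \K drops everything matched so far from the replaced portion.
$out = preg_replace('~(?:href="|(?<!^)\G)\K([^" ]*)\s+~', '$1%20', $txt);

echo $out, PHP_EOL;
// Some text <a href="my%20file%20name.html">a link</a> more text
```

Because [^" ]* cannot cross the closing quote, the run stops at the end of the attribute and the rest of the text is left alone.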
regex101 demo.
Removing redundant tags <p><br></p> at the beginning and end of the string, and in the middle leaving only one.
Input:
<p><br></p><p><br></p><p><br></p><p>gfdsgfdsgfds</p><p><br></p><p><br></p><p><br></p><p>gfdsgfdsgfdsgfds</p><p><br></p><p><br></p><p><br></p>
Desired output:
<p>gfdsgfdsgfds</p><p><br></p><p>gfdsgfdsgfdsgfds</p>
Alternative desired output:
<p>gfdsgfdsgfds</p><p><br></p><p><br></p><p><br></p><p>gfdsgfdsgfdsgfds</p>
I've tried to use: preg_replace
$string = preg_replace('/(<p><br></p>)+/', '', $string);
But the result is null.
You need to escape the slash / character in your regex, because / is also used as the pattern delimiter:
$string = preg_replace('/(<p><br><\/p>)+/', '', $string);
Also note that this removes every run of these patterns, including the ones in the middle, resulting in the following:
<p>gfdsgfdsgfds</p><p>gfdsgfdsgfdsgfds</p>
To remove duplicates but leave one instance, you could do the following:
$string = preg_replace('/(<p><br><\/p>)+/', '<p><br></p>', $string);
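To get the first desired output (nothing at the beginning or end, one instance in the middle), the two ideas can be combined. The following is a sketch using the input from the question; it uses ~ as the delimiter so the slash does not need escaping.

```php
<?php
$string = '<p><br></p><p><br></p><p><br></p><p>gfdsgfdsgfds</p>'
        . '<p><br></p><p><br></p><p><br></p><p>gfdsgfdsgfdsgfds</p>'
        . '<p><br></p><p><br></p><p><br></p>';

// Step 1: collapse every run of <p><br></p> into a single instance.
$string = preg_replace('~(?:<p><br></p>)+~', '<p><br></p>', $string);

// Step 2: drop whatever is left at the very beginning or the very end.
$string = preg_replace('~^(?:<p><br></p>)+|(?:<p><br></p>)+$~', '', $string);

echo $string, PHP_EOL;
// <p>gfdsgfdsgfds</p><p><br></p><p>gfdsgfdsgfdsgfds</p>
```

Skipping step 1 keeps the three middle instances and gives the alternative desired output instead.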
Maybe HTML Purifier (http://htmlpurifier.org/) can help you. It can clean up HTML code and also remove JavaScript, for example, if needed.
I have a CSV file with the following content:
"Title","Firstname","Lastname","Description"
"Mr","Peter","Tester",,
"Mrs",,"Master","Chief, Supporter"
"Mr","Seg, Jr.","Tuti","Head, Developer"
Now I want to remove the commas that are surrounded by quotes with preg_replace ("Chief, Supporter"; "Seg, Jr."; "Head, Developer").
But I am not able to build a suitable regex.
My last result looks like: /\"(.[^\",]*),(.[^\"]*)\"/i
Your requirements are a little unclear, but if I understand correctly, you want to remove the comma if one exists within a double-quoted string so that
e.g. "Head, Developer" becomes "Head Developer", etc
Based on that assumption then
/\"([^\"]+?),+ *(\w[^\"]+?)\"/gmi
will find those commas, and
"$1 $2"
will replace it with a space.
see demo here
PHP example (I'm not very conversant with php so the character escaping, etc might need tweaking)
<?php
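// Sample row with commas inside the quoted fields
$string = '"Mr","Seg, Jr.","Tuti","Head, Developer"';
// PHP has no "g" modifier; preg_replace() already replaces every match
$pattern = '/\"([^\"]+?),+ *(\w[^\"]+?)\"/mi';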
$replacement = '"$1 $2"';
echo preg_replace($pattern, $replacement, $string);
?>
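As an alternative sketch (not the answerer's method), a callback can be run on each quoted field, which also copes with fields that contain more than one comma. It assumes the fields never contain escaped double quotes; the CSV data is the sample from the question.

```php
<?php
$csv = <<<'CSV'
"Title","Firstname","Lastname","Description"
"Mr","Peter","Tester",,
"Mrs",,"Master","Chief, Supporter"
"Mr","Seg, Jr.","Tuti","Head, Developer"
CSV;

// Match each quoted field and replace any comma (plus following spaces) inside it.
$out = preg_replace_callback(
    '/"[^"]*"/',
    function (array $m) {
        return preg_replace('/,\s*/', ' ', $m[0]);
    },
    $csv
);

$lines = preg_split('/\R/', $out);
echo $lines[2], PHP_EOL; // "Mrs",,"Master","Chief Supporter"
echo $lines[3], PHP_EOL; // "Mr","Seg Jr.","Tuti","Head Developer"
```

The separating commas sit outside the quoted matches, so they are never touched.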
Use this regex: ((?>\*\*)),((?>\*\*))
Capture the ** in parentheses and replace with ,, see DEMO
$str="Your LaTeX document can \DIFaddbegin \DIFadd{test}\DIFaddend be easily
and the text can have multiple lines in it like this\DIFaddbegin \DIFadd{test2}
\DIFaddend"
I need to convert all \DIFaddbegin \DIFadd{test}\DIFaddend to \added{test}.
I tried
$o= preg_replace_callback('/\\DIFaddbegin\\s\DIFadd{(.*?)}\\DIFaddend/',
function($m) {return preg_replace('/$m[0]/','\added{$m[1]}',$m[0]);},$str);
But no luck. Which would be correct pattern for this? And also even if the string contains new line character the pattern should work.
You don't need a callback, using preg_replace() is fine for this task. To match a single backslash you need to double escape it meaning \\\\. To match possible whitespace between each substring, you can use \s* meaning whitespace "zero or more" times.
$str = preg_replace('~\\\\DIFaddbegin\s*\\\\DIFadd({[^}]*})\s*\\\\DIFaddend~', '\added$1', $str);
Try this:
$new_str = preg_replace("/\\\\DIFaddbegin\s*\\\\DIFadd\{(.*?)\}\s*\\\\DIFaddend/s","\\added{\$1}",$str);
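A runnable check against the string from the question, as a sketch. The input is written as a single-quoted string so the backslashes are stored literally, and \s* lets the match span the line break before the second \DIFaddend.

```php
<?php
$str = 'Your LaTeX document can \DIFaddbegin \DIFadd{test}\DIFaddend be easily
and the text can have multiple lines in it like this\DIFaddbegin \DIFadd{test2}
\DIFaddend';

// '\\\\' in PHP source is two backslashes, which the regex reads as one literal backslash.
$pattern = '~\\\\DIFaddbegin\s*\\\\DIFadd\{([^}]*)\}\s*\\\\DIFaddend~';

// In the replacement, a doubled backslash yields a single literal backslash.
$out = preg_replace($pattern, '\\\\added{$1}', $str);

echo $out, PHP_EOL;
```

Both occurrences become \added{test} and \added{test2}, including the one whose \DIFaddend sits on the next line.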
I am having a problem with a regular expression.
I am using the following regular expression to obtain text between html tags.
preg_replace("/<.*>/ix", " ", $input_lines);
This expression works well with
<a href="some.html">Somelink
output is
Somelink
But it doesn't work with
Somelink
it shows a blank output.
My Actual input is like this
Somelink<anytag>Somelink</anytag>
And Desired output is
Somelink Somelink
All tags, whether opening or closing, should be replaced by spaces.
And a Small question:
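In your regex, .* is greedy, so it matches everything up to the last > in the input.
So it should be .*? (non-greedy).
More safely, use [^>]*, which can never run past the end of the current tag.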
If I understood your problem correctly, you can use the strip_tags() function. Perhaps it helps you.
Try strip_tags function.
For replacing, try this
$result = preg_replace('/[ ]{2,}/imx', ' ', $subject);
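Putting the suggestions together, as a sketch with the input from the question: replace each tag with a space using [^>]*, collapse repeated spaces, and trim. strip_tags() is shown for comparison; it removes the tags without inserting anything in their place.

```php
<?php
$input = 'Somelink<anytag>Somelink</anytag>';

// [^>]* stops at the first closing bracket, so each tag is matched separately.
$spaced = preg_replace('/<[^>]*>/', ' ', $input);

// Collapse runs of spaces and trim the ends.
$result = trim(preg_replace('/ {2,}/', ' ', $spaced));

// For comparison: strip_tags() leaves no space where the tags were.
$stripped = strip_tags($input);

echo $result, PHP_EOL;   // Somelink Somelink
echo $stripped, PHP_EOL; // SomelinkSomelink
```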
I currently have this regex:
$text = preg_replace("#<sup>(?:(?!</?sup).)*$key(?:(?!</?sup).)*<\/sup>#is", '<sup>'.$val.'</sup>', $text);
The objective of the regex is to take <sup>[stuff here]$key[stuff here]</sup> and remove the stuff within the [stuff here] locations.
What I actually would like to do, is not remove $key[stuff here]</sup>, but simply move the stuff to $key</sup>[stuff here]
I've tried using $1-$4 and \\1-\\4 and I can't seem to get the text to be added after </sup>
Try this;
$text = preg_replace(
'#<sup>((?:(?!</?sup).)*)'.$key.'((?:(?!</?sup).)*)</sup>#is',
'<sup>'.$val.'</sup>\1\2',
$text
);
The (?:...)* bit is a non-capturing group, so it is not available as a backreference; wrapping it in an extra pair of parentheses captures it. Also, if you use ' rather than " for string literals, you will only need to escape \ and '
// Cheers, Morten
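A small demonstration of the capturing version, as a sketch. The values of $key, $val and $text are made up for illustration, and preg_quote() is added as a precaution in case $key contains regex metacharacters.

```php
<?php
// Hypothetical sample values; not from the question.
$key  = 'KEY';
$val  = 'VAL';
$text = 'before <sup>abKEYcd</sup> after';

// The extra parentheses around each (?:...)* make the surrounding text available as $1 and $2.
$pattern = '#<sup>((?:(?!</?sup).)*)' . preg_quote($key, '#') . '((?:(?!</?sup).)*)</sup>#is';

// Keep only $val inside <sup> and move the captured text after the closing tag.
$out = preg_replace($pattern, '<sup>' . $val . '</sup>$1$2', $text);

echo $out, PHP_EOL;
// before <sup>VAL</sup>abcd after
```

If $val can contain $ or \ followed by digits, it would need escaping as well, since those are interpreted in the replacement string.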
You have to combine preg_match() and preg_replace():
Match the desired text with preg_match() and store it in a variable.
Replace it with an empty string, using the same regex.
Append the stored variable at the end.