regex matching content that's between angle brackets [duplicate] - php

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 2 years ago.
How can I convert all multiple white space inside any given HTML element to a single space using regex and preg_replace in php?
Eg: <div class="myClass" jsaction="UjQMac:.CLIENT" data-id="3739" >Edit</div>
Cleaned: <div class="myClass" jsaction="UjQMac" data-id="3739">Edit</div> All runs of multiple spaces are removed and only single spaces are retained. Also, the " >" (a space before the closing bracket) is replaced with ">".
I've been trying unsuccessfully with this regex \<(\s+)\>. Can you help?
Edit:
The regex (?:(\s{2,})|(\s>)) from the answer below works fine, but does not match only between < & >

This will do it: (?:(\s{2,})|(\s>))
It matches any run of two or more whitespace characters, and also a > that is preceded by a single whitespace character.
See: https://regex101.com/r/NN9YUU/2/
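To restrict the replacement to the text between < and >, one option is to match each tag first and clean it in a callback. The following is only a sketch: the function name collapse_tag_whitespace is made up for illustration, and it assumes that attribute values never contain < or > (whitespace inside quoted attribute values is collapsed as well).

```php
<?php
// Hypothetical helper: normalizes whitespace inside each <...> tag only.
function collapse_tag_whitespace(string $html): string
{
    return preg_replace_callback(
        '/<[^<>]+>/', // one tag, assuming no < or > inside attribute values
        function (array $m): string {
            // Collapse runs of two or more whitespace characters to one space.
            $tag = preg_replace('/\s{2,}/', ' ', $m[0]);
            // Drop any whitespace left directly before the closing bracket.
            return preg_replace('/\s+>$/', '>', $tag);
        },
        $html
    );
}

$in = '<div  class="myClass"   data-id="3739" >Edit  me</div>';
echo collapse_tag_whitespace($in);
// <div class="myClass" data-id="3739">Edit  me</div>
```

The double space in "Edit  me" is left alone because it is outside any tag.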

Related

remove colon between < and > php [duplicate]

This question already has answers here:
Regex to replace surrounded word multiple times
(3 answers)
PHP Regex replace all instances of a character in similar strings
(4 answers)
PHP replace all occurences with preg_replace_callback
(1 answer)
php regex preg_replace_callback fails to handle mutliple lines
(1 answer)
Replace unwanted characters inside opening HTML tag only
(4 answers)
Closed 17 days ago.
Hi I have trouble understanding the preg_replace rules in php.
I have a string:
<fa:foo r:attr="ddd">fa:foo</fa:foo>,
and I am looking for a way to replace the colons with underscores only within each pair of < and >,
to reach the following result:
<fa_foo r_attr="ddd">fa:foo</fa_foo>
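One possible approach, sketched here under the assumption that attribute values never contain < or >, is to match every tag with preg_replace_callback and run str_replace on the matched tag only. Note that colons inside attribute values would be replaced too. The variable names are illustrative.

```php
<?php
$input = '<fa:foo r:attr="ddd">fa:foo</fa:foo>';

// Match each <...> tag and replace colons inside the match only.
$output = preg_replace_callback(
    '/<[^<>]+>/',
    function (array $m): string {
        return str_replace(':', '_', $m[0]);
    },
    $input
);

echo $output; // <fa_foo r_attr="ddd">fa:foo</fa_foo>
```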

My regex function seems to be right but it doesn't work [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 2 years ago.
I'm trying to remove a script that contains malware from my database.
It was injected into a lot of records in my table.
The script starts with a <script> tag and ends with a </script> tag.
I'm using the following code to find and replace it:
$content = $post->post_content;
$new_content = preg_replace('/(<script>.+?)+(<\/script>)/i', '', $content);
I've tested it on regex101.com and it works fine there, but in my code it doesn't work. Does anyone know what's wrong?
Here is my goto regex for <script>...</script> tags with their contents:
(\<script\>)([\s\S]*?)(<\/script>)
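Your pattern uses ., which does not match line breaks unless the s modifier is set, so it fails as soon as the script spans more than one line; you're not capturing everything which could be in the contents of the tags. (Escaping < and > is optional in PCRE.)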
Here is an explanation of the content capturing group:
\s matches any whitespace character
\S matches any non-whitespace character
*? matches between zero and unlimited times, as few times as possible, expanding as needed
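A small sketch of the difference between the two patterns on multi-line input; the sample string is invented for illustration.

```php
<?php
// Sample content with a script that spans several lines (made-up data).
$content = "<p>ok</p><script>\nevil();\n</script><p>end</p>";

// Pattern from the question: . stops at the line break, so nothing matches.
$withDot = preg_replace('/(<script>.+?)+(<\/script>)/i', '', $content);

// [\s\S] matches any character, line breaks included.
$withAny = preg_replace('/(<script>)([\s\S]*?)(<\/script>)/i', '', $content);

var_dump($withDot === $content); // bool(true): the input is unchanged
echo $withAny;                   // <p>ok</p><p>end</p>
```

Adding the s modifier to the original pattern would have the same effect as switching to [\s\S].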
As I stated before, you really shouldn't do this. You should use a PHP DOM parser instead.
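A minimal sketch of the DOM route using PHP's bundled DOMDocument class. The function name strip_scripts and the wrapper div are assumptions made for this example, and it assumes the input is an HTML fragment in an encoding libxml handles without extra hints (plain ASCII here).

```php
<?php
// Hypothetical helper: removes every <script> element from an HTML fragment.
function strip_scripts(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // silence warnings about sloppy markup
    // Wrap the fragment so it has a single root element.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();

    // Copy the live node list before removing nodes from the tree.
    foreach (iterator_to_array($doc->getElementsByTagName('script')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$clean = strip_scripts("<p>Hello</p><script>\nevil();\n</script><p>World</p>");
var_dump(stripos($clean, '<script') === false); // no script tag should remain
```

Unlike the regex, this also handles script tags with attributes, such as <script type="text/javascript">.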

regex to replace content of second <p>-tag [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 5 years ago.
Output (var $DESC)
<p>erster Absatz</p>
<p>zweiter Absatz</p>
Regex (PHP)
preg_replace("<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>{2}", '', $DESC)
I would like to delete only the second p but this regex finds both. Thanks for any help.
Normally I would just tell you to use an HTML parser instead of regex, but since your requirement is so specific, this can actually be accomplished with regex quite safely.
(?<=<\/p>)\s+<p>[\w ]+<\/p>
https://regex101.com/r/Yqaajy/6
Explanation:
(?<=<\/p>) - Make sure the rest of the pattern is preceded by a </p> ending tag (positive lookbehind).
\s+ - One or more whitespace characters, here the line break between the two paragraphs.
<p>[\w ]+<\/p> - A paragraph block containing one or more word characters (digits, letters, and underscore) and spaces.
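Applied in PHP, the pattern above could be used roughly like this. It is a sketch: the limit argument of 1 is an addition so that only the first paragraph following another paragraph is removed, and [\w ] without the u modifier only covers ASCII word characters, so text with umlauts would need a broader character class.

```php
<?php
$DESC = "<p>erster Absatz</p>\n<p>zweiter Absatz</p>";

// Remove a paragraph (plus the whitespace before it) that directly
// follows a closing </p>; the 4th argument limits this to one replacement.
$result = preg_replace('/(?<=<\/p>)\s+<p>[\w ]+<\/p>/', '', $DESC, 1);

echo $result; // <p>erster Absatz</p>
```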
Try this:
$DESC ='<p>erster Absatz</p>
<p>zweiter Absatz</p>';
$DESC = preg_replace('#\</p\>[^\<]*\<p[^\>]*\>(.*?)\</p\>#i', '</p>', $DESC);
echo $DESC; // <p>erster Absatz</p>

What is the regular expression I should use in preg_replace() to find tabs inside all quoted strings and replace with a single space? [duplicate]

This question already has answers here:
How do I replace tabs with spaces within variables in PHP?
(9 answers)
Replace all occurences of char inside quotes on PHP
(5 answers)
Closed 5 years ago.
I know what needs to be done, but have been unsuccessful with the correct regex.
What is the regular expression I should use in preg_replace() to find tabs inside all quoted strings and replace with a single space?
any help would be much appreciated.
Example:
$string = "\"Foo\tMan\tChoo\""; // double quotes, so that \t is a real tab
$string = preg_replace($expression_string, ' ', $string);
echo $string; // desired result ----> '"Foo Man Choo"'
I think this question is not a duplicate
Answer: you will need to repeat preg_replace as many times as the maximum number of \t you expect in a field. (I never bothered coming up with a more generic solution; this one is usually good enough.)
Also make sure every line ends with a tab, otherwise the last field will not be processed.
Then repeat the replacement, starting with the maximum possible number of \t (2 in the example):
$string = '"Foo'."\t".'Man'."\t".'Choo"'."\t".'boo'."\t".'"Foo'."\t".'Man"'."\t";
$string = preg_replace('/"([^"]*)'."\t".'([^"]*)'."\t".'([^"]*)"'."\t".'/','"$1_$2_$3"'."\t",$string);
$string = preg_replace('/"([^"]*)'."\t".'([^"]*)"'."\t".'/','"$1_$2"'."\t",$string);
echo $string;
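A more generic variant, sketched here as an alternative to repeating preg_replace, is to match each quoted string once and replace the tabs inside the match in a callback. It assumes the quoted strings contain no escaped double quotes, and it uses a space as the replacement, as the question asks.

```php
<?php
$string = "\"Foo\tMan\tChoo\"\tboo\t\"Foo\tMan\"";

// Match every "..." run and replace tabs inside that match only.
$result = preg_replace_callback(
    '/"[^"]*"/',
    function (array $m): string {
        return str_replace("\t", ' ', $m[0]);
    },
    $string
);

// Tabs between the fields are kept; tabs inside quotes become spaces.
var_dump($result === "\"Foo Man Choo\"\tboo\t\"Foo Man\""); // bool(true)
```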

Remove spaces from string but not all [duplicate]

This question already has answers here:
How can I convert ereg expressions to preg in PHP?
(4 answers)
Closed 7 years ago.
To be clear, I want this:
Europe    Qualifications
to transform to this:
Europe Qualifications
I'm getting data from an XML feed, and some strings like the one above contain more than one space. The problem is that I don't know how many spaces a string will contain. I want to remove all spaces before and after the string, and also collapse runs of separator spaces in the middle, so that there is exactly one space between words rather than 2, 3 or 10.
$newstring = preg_replace('/(\s+)/', ' ', $str);
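The line above collapses inner whitespace but leaves a single space at each end. A minimal sketch that also strips the ends wraps it in trim; the sample value of $str is made up.

```php
<?php
$str = "  Europe    Qualifications \n";

// Collapse every whitespace run to one space, then strip both ends.
$newstring = trim(preg_replace('/\s+/', ' ', $str));

echo $newstring; // Europe Qualifications
```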
