I think my regex is off (I'm not very good at regex yet). What I'm trying to do is remove the first and last <section> tags (though this is set to replace all, if it worked). I set it up like this so it would completely remove any attributes of the tag, along with the closing tag.
The code:
//Remove from string
$content = "<section><p>Test</p></section>";
$section = "<(.*?)section(.*?)>";
$output= str_replace($section, "", $content);
echo $output;
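You are looking for strip_tags. Its second argument lists the tags to keep, so name the tags you want to preserve (here <p>) rather than <section>.
Try this:
print strip_tags($content, '<p>');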
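If only the <section> tags should go and every other tag should stay, a regex passed to preg_replace (not str_replace, which does no pattern matching) is one option. A minimal sketch, assuming the markup is simple enough that no attribute value contains a ">" character; the helper name remove_section_tags is made up for illustration:

```php
<?php
// Hypothetical helper: removes opening and closing <section> tags
// (with any attributes) and keeps everything between them.
function remove_section_tags(string $html): string
{
    // </?section  optional slash followed by the tag name
    // \b          stops the name there, so <sectionx> is not matched
    // [^>]*       any attributes up to the closing bracket
    return preg_replace('~</?section\b[^>]*>~i', '', $html);
}

echo remove_section_tags('<section class="intro"><p>Test</p></section>');
// prints: <p>Test</p>
```

This removes every <section> tag in the string, not just the first and last one.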
Related
I have a set of <p></p> tags wrapping a set of data, that data includes other tags such as <script></script> however that content could contain any number of different tags.
I just need to remove any paragraph tags from the content
Example below
$text = "<p><script>example text inside script.<script></p>";
I see the strip_tags function but I believe that will remove all the tags.
How would I go about just removing the paragraph?
Thanks.
Try,
$text = "<p><script>example text inside script.<script></p>";
$formatted_text = str_replace(['<p>', '</p>'], '', $text);
You can list the tags to keep as the second argument of strip_tags(),
as in this example:
$text = "<p><script>example text inside script.<script></p>";
echo strip_tags($text, '<script>');
Try this.
<?php
$text = "<p><script>example text inside script.<script></p>";
$replace = array('<p>','</p>');
echo str_replace($replace,'',$text);
http://sandbox.onlinephpfunctions.com/code/43150c7af4e7e5f572827d91abca5756213ab7ba
Version 2 (works for classes)
echo preg_replace('%<p(.*?)>|</p>%s','',$text);
http://sandbox.onlinephpfunctions.com/code/6c5414773efc1317578b5f0581b68e5acabb9a2b
Hope this helps.
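One caveat with the Version 2 pattern: <p(.*?)> also matches <pre>, <param> and any other tag whose name starts with "p". A hedged variant with a word boundary after the tag name avoids that; the helper name remove_p_tags is an assumption for illustration:

```php
<?php
// Hypothetical helper: strips <p ...> and </p> but leaves <pre>, <param>, etc. alone.
function remove_p_tags(string $html): string
{
    // \b requires the tag name to end right after "p".
    return preg_replace('~</?p\b[^>]*>~i', '', $html);
}

echo remove_p_tags('<p class="lead"><pre>code</pre> and text</p>');
// prints: <pre>code</pre> and text
```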
You can use str_replace(). Add these lines:
$text = str_replace('<p>','',$text);
$text = str_replace('</p>','',$text);
This removes both <p> and </p>.
So I have a string, and I used the strip_tags() function to remove all tags except IMG, but I still have plain text next to my IMG element. Here is a visual example:
$myvariable = "This text needs to be removed<a href='blah_blah_blah'>Blah</a><img src='blah.jpg'>";
Using PHP strip_tags() I was able to remove all tags except the <img> tag (which is what I want), but it didn't remove the text.
How do I remove the leftover text? The text will always be either before or after the tag.
[ADDED MORE DETAILS]
$description = 'crazy stuff<img src="https://scontent.cdninstagram.com/t51.2885-15/e15/14287934_1389514537744146_673363238_n.jpg?ig_cache_key=MTMzNzM3MzgwNjAyNDY5NDAzMA%3D%3D.2">';
that's what the variable is actually holding.
Thanks in Advance
Instead of replacing something, you can extract the part you want. Because <img> has no closing tag, match the tag on its own:
<img\b[^>]*>
To be used with preg_match().
IN PHP:
<?php
$regex = '~<img\b[^>]*>~i';
$string = 'crazy stuff<img src="https://scontent.cdninstagram.com/t51.2885-15/e15/14287934_1389514537744146_673363238_n.jpg?ig_cache_key=MTMzNzM3MzgwNjAyNDY5NDAzMA%3D%3D.2">here as well';
if (preg_match($regex, $string, $match)) {
    echo $match[0]; // the <img> tag without the surrounding text
}
?>
Please show your whole piece of code with the use of strip_tags.
You can try: preg_replace('~.*(<img[^>]+>).*~s', '$1', $myvariable);
The leading and trailing .* drop the text on both sides of the tag.
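If the string can hold more than one image, collecting the tags with preg_match_all is another way to drop everything else. A minimal sketch, assuming no attribute value contains a ">" character; extract_img_tags is a hypothetical helper name:

```php
<?php
// Hypothetical helper: returns every <img ...> tag found in $html as an array,
// discarding all other text and tags.
function extract_img_tags(string $html): array
{
    preg_match_all('~<img\b[^>]*>~i', $html, $matches);
    return $matches[0];
}

$myvariable = "This text needs to be removed<a href='blah_blah_blah'>Blah</a><img src='blah.jpg'>trailing text";
echo implode('', extract_img_tags($myvariable));
// prints: <img src='blah.jpg'>
```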
I need to strip out the <p> tags inside a <pre> tag. How can I do this in PHP? My code looks like this:
<pre class="brush:php;">
<p>Guna</p><p>Sekar</p>
</pre>
I need to keep the text inside the <p> tags and remove only the <p> and </p> tags themselves.
This can be done with a single regex. It was tested in PowerShell (.NET regex) and relies on a variable-length lookbehind, which .NET supports but PHP's PCRE does not, so it will not work unchanged in preg_replace.
$string = '<pre class="brush:php;"><p>Guna</p><p>Sekar</p></pre><pre class="brush:php;"><p>Point</p><p>Miner</p></pre>'
$String -replace '(?<=<pre.*?>[^>]*?)(?!</pre)(<p>|</p>)(?=.*?</pre)', ""
Yields
<pre class="brush:php;">GunaSekar</pre><pre class="brush:php;">PointMiner</pre>
Dissecting the regex:
the lookbehind (?<=...) checks that there is a pre tag before the current match
the negative lookahead (?!</pre) is meant to ensure there is no /pre tag between the pre tag and the match
(<p>|</p>) tests for both p and /p
the final lookahead (?=.*?</pre) ensures there is a closing /pre tag after the match
You could use basic Regexp.
<?php
$str = <<<STR
<pre class="brush:php;">
<p>Guna</p><p>Sekar</p>
</pre>
STR;
echo preg_replace("/<[ ]*p( [^>]*)?>|<\/[ ]*p[ ]*>/i", "", $str);
You can try the following code. It runs 2 regex commands to list all the <p> tags inside <pre> tags.
preg_match('/<pre .*?>(.*?)<\/pre>/s', $string, $matches1);
preg_match_all('/<p>.*?<\/p>/', $matches1[1], $ptags);
The matching <p> tags will be available in $ptags array.
You could use preg_replace_callback() to match everything that's in a <pre> tag and then use strip_tags() to remove all html tags:
$html = '<pre class="brush:php;">
<p>Guna</p><p>Sekar</p>
</pre>
';
$removed_tags = preg_replace_callback('#(<pre[^>]*>)(.+?)(</pre>)#is', function($m){
return($m[1].strip_tags($m[2]).$m[3]);
}, $html);
var_dump($removed_tags);
Note: this works only with PHP 5.3+, because it uses an anonymous function.
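strip_tags() removes every tag inside the <pre> block. If only the <p> tags should go, the callback can run a narrower replacement instead. A sketch of that variant, assuming <pre> blocks are not nested; strip_p_inside_pre is a hypothetical name:

```php
<?php
// Hypothetical helper: removes <p> and </p> only inside <pre>...</pre> blocks.
function strip_p_inside_pre(string $html): string
{
    return preg_replace_callback(
        '#(<pre\b[^>]*>)(.*?)(</pre>)#is',
        function (array $m): string {
            // $m[1] = opening <pre ...>, $m[2] = inner content, $m[3] = closing </pre>
            $inner = preg_replace('~</?p\b[^>]*>~i', '', $m[2]);
            return $m[1] . $inner . $m[3];
        },
        $html
    );
}

echo strip_p_inside_pre('<p>outside</p><pre class="brush:php;"><p>Guna</p><p>Sekar</p></pre>');
// prints: <p>outside</p><pre class="brush:php;">GunaSekar</pre>
```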
It looked like simple work, but it took hours to find a way. This is what I did:
Downloaded Simple HTML DOM Parser from SourceForge
Traversed each <pre> tag and stripped out the <p> tags
Rewrote the content into the <pre> tag
Retrieved the modified content
Here is the full code:
include_once 'simple_html_dom.php';
$text='<pre class="brush:php;"><p>Guna</p><p>Sekar</p></pre>';
$html = str_get_html($text);
$strip_chars=array('<p>','</p>');
foreach($html->find('pre') as $element){
$code = $element->getAttribute('innertext');
$code=str_replace($strip_chars,'',$code);
$element->setAttribute('innertext',$code);
}
echo $html->root->innertext();
This will output:
<pre class="brush:php;">GunaSekar</pre>
Thanks for all your suggestions.
How can I strip HTML tags everywhere except inside the pre tag?
Code:
$content="
<div id="wrapper">
Notes
</div>
<pre>
<div id="loginfos">asdasd</div>
</pre>
";
When using strip_tags($content, ''), the HTML inside the pre tag is stripped too, but I don't want the HTML inside pre stripped.
Try:
echo strip_tags($content, '<pre>');
This keeps the <pre> tags themselves, but tags nested inside them are still stripped.
You may do the following:
Use preg_replace_callback to replace the contents of pre tags with placeholder strings like ###1###, ###2###, etc., while storing the contents in an array (the preg_replace 'e' modifier once used for this was deprecated in PHP 5.5 and removed in PHP 7.0)
Run strip_tags()
Run preg_replace_callback again to restore ###1###, etc. to the original contents.
A bit kludgy but should work.
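<?php
// Decode entities, then strip <script> blocks, any remaining tags, and whitespace that follows a line break.
$document=html_entity_decode($content);
$search = array ("'<script[^>]*?>.*?</script>'si","'<[/!]*?[^<>]*?>'si","'([\r\n])[\s]+'","'&(quot|#34);'i","'&(amp|#38);'i","'&(lt|#60);'i","'&(gt|#62);'i","'&(nbsp|#160);'i","'&(iexcl|#161);'i","'&(cent|#162);'i","'&(pound|#163);'i","'&(copy|#169);'i");
$replace = array ("","","\\1","\"","&","<",">"," ",chr(161),chr(162),chr(163),chr(169));
// The original '&#(\d+);'e entry is dropped: the 'e' modifier was removed in PHP 7.0,
// and html_entity_decode() above already decodes numeric entities.
$text = preg_replace($search, $replace, $document);
echo $text;
?>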
$text = 'YOUR CODE HERE';
$org_text = $text;
// hide content within pre tags
$text = preg_replace( '/(<pre[^>]*>)(.*?)(<\/pre>)/is', '$1###pre###$3', $text );
// filter content
$text = strip_tags( $text, '<pre>' );
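// insert back content of pre tags
if ( preg_match_all( '/(<pre[^>]*>)(.*?)(<\/pre>)/is', $org_text, $parts ) ) {
    foreach ( $parts[2] as $code ) {
        // a callback keeps "$1" or "\1" inside $code from being read as backreferences
        $text = preg_replace_callback( '/###pre###/', function () use ( $code ) { return $code; }, $text, 1 );
    }
}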
print_r( $text );
OK, that leaves you one choice: regular expressions. Nobody likes them, but they get the job done. First, save the original <pre> blocks (for example with preg_match_all), then replace them with an unlikely placeholder, like this:
$content = preg_replace("#<pre>(.+?)</pre>#s", "||k||", $content);
This will effectively change your
<pre> blah, blah, bllah....</pre>
into something else. Then call
$content = strip_tags($content);
After that, you can replace each ||k|| (or whatever placeholder you chose) with the original value and you'll get the desired result.
I think your content is not stored correctly in the $content variable: the inner double quotes end the string early.
Could you try converting the inner double quotes to single quotes?
$content="
<div id='wrapper'>
Notes
</div>
<pre>
<div id='loginfos'>asdasd</div>
</pre>
";
strip_tags($content, '<pre>');
Could you please write full code. I understood, but something goes wrong. Please write full programming code
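A complete, self-contained sketch of the "protect the <pre> blocks, strip everything else" idea follows. Instead of placeholders it splits the string around the <pre> blocks, which is a swapped-in technique and not the code from the answers above. It assumes <pre> blocks are not nested; strip_tags_outside_pre is a hypothetical helper name:

```php
<?php
// Hypothetical helper: strips tags everywhere except inside <pre>...</pre> blocks.
function strip_tags_outside_pre(string $html): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the captured <pre> blocks land at odd indexes
    // and the text between them at even indexes.
    $parts = preg_split('~(<pre\b[^>]*>.*?</pre>)~is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = strip_tags($part); // outside <pre>: remove all tags
        }
    }
    return implode('', $parts);
}

$content = "<div id='wrapper'>Notes</div><pre><div id='loginfos'>asdasd</div></pre>";
echo strip_tags_outside_pre($content);
// prints: Notes<pre><div id='loginfos'>asdasd</div></pre>
```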
So, I have a regex that searches for HTML tags and modifies them slightly. It's working great, but I need to do something special with the last closing HTML tag I find. Not sure of the best way to do this. I'm thinking some sort of reverse reg ex, but haven't found a way to do that. Here's my code so far:
$html = '<div id="test"><p style="hello_world">This is a test.</p></div>';
$pattern = array('/<([A-Z][A-Z0-9]*)(\b[^>]*)>/i');
$replace = array('<tag>');
$html = preg_replace($pattern,$replace,$html);
// Outputs: <tag><tag>This is a test.</p></div>
I'd like to replace the last occurrence of <tag> with something special, for example <end_tag>.
Any ideas?
If I read this right, you want to find the last closing tag in the document.
You could find the last occurrence of </*> which has no more '<>' characters after it. This will be the last tag, assuming all remaining angle brackets are encoded as &lt; and &gt;:
<?php
$html = '<div id="test"><p style="hello_world">This is a test.</p></div>';
// Outputs:
// '<div id="test"><p style="hello_world">This is a test.</p></tag>'
echo preg_replace('/<\/[A-Z][A-Z0-9]*>([^<>]*)$/i', '</tag>$1', $html);
This will replace the final </div> with </tag>, preserving any content that follows the final closing tag.
I don't know why you'd want to do this with only the closing tag, as if you change it you also have to change the matching opening tag. Also, this will fail to find the last self-closing tag, like <img /> or <br />.
I believe this method works the same as @meager's, but is more concise:
<?php
$html = '<div id="test"><p style="hello_world">This is a test.</p></div>';
$readmore = ' Read More…';
// Outputs:
// '<div id="test"><p style="hello_world">This is a test.</p> Read More…</div>'
echo preg_replace('#(</\w+>\s*)$#', $readmore . '$1', $html);
?>
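Both answers above require the closing tag to sit at the very end of the string. A hedged alternative that does not depend on that is to collect all closing tags with their offsets and replace the last one. This is only a sketch; replace_last_closing_tag is a made-up helper name, and like the answers above it ignores self-closing tags:

```php
<?php
// Hypothetical helper: replaces the last closing tag in $html with $replacement.
function replace_last_closing_tag(string $html, string $replacement): string
{
    // Collect every closing tag together with its byte offset.
    if (!preg_match_all('~</[a-z][a-z0-9]*\s*>~i', $html, $m, PREG_OFFSET_CAPTURE)) {
        return $html; // no closing tag found
    }
    [$tag, $offset] = end($m[0]); // last match: [matched text, offset]
    return substr_replace($html, $replacement, $offset, strlen($tag));
}

$html = '<div id="test"><p style="hello_world">This is a test.</p></div>';
echo replace_last_closing_tag($html, '<end_tag>');
// prints: <div id="test"><p style="hello_world">This is a test.</p><end_tag>
```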