Regular expression from wysiwyg - php

I have HTML code that comes from users of a WYSIWYG editor.
I need to clean the code of tags like <b ..><i ..><strong><p><a ..>, and strip all the usual inline JS, such as onclick and other handlers.
Thanks.

Use strip_tags to remove HTML tags from the text. Example below.
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
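Note that strip_tags leaves attributes such as onclick untouched on any tag you allow. One way to handle both requirements is sketched below, assuming the allow-list from the question and PHP's DOM extension; the function name clean_wysiwyg is made up for illustration, the sketch assumes ASCII input, and it is not a complete XSS filter.

```php
<?php
// Hypothetical helper: keeps a small allow-list of tags and removes
// inline event handlers (onclick, onload, ...) and javascript: links.
function clean_wysiwyg(string $html): string
{
    // Step 1: drop every tag outside the allow-list (taken from the question).
    $html = strip_tags($html, '<b><i><strong><p><a>');

    // Step 2: parse what is left; the wrapper <div> lets us extract it again.
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc = new DOMDocument();
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $node) {
        // Collect names first, because removing attributes while
        // iterating the live attribute map can skip entries.
        $names = [];
        foreach ($node->attributes as $attr) {
            $names[] = $attr->nodeName;
        }
        foreach ($names as $name) {
            $value = trim($node->getAttribute($name));
            $isHandler = stripos($name, 'on') === 0;
            $isJsUrl = $name === 'href' && stripos($value, 'javascript:') === 0;
            if ($isHandler || $isJsUrl) {
                $node->removeAttribute($name);
            }
        }
    }

    // Step 3: serialize only the children of the wrapper <div>.
    $out = '';
    foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo clean_wysiwyg('<p onclick="alert(1)">Hi <b>there</b></p>');
```

For anything security-sensitive, a maintained sanitizer such as HTML Purifier is likely the safer choice.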

Related

Display Joomla Intro Article without <p> tag

I'm customizing the layout of the category blog view for my template and I need to display the article intro text without the <p> tag. In my custom file "blogalternativelayout_item.php", I use:
<?php echo substr(($this->item->introtext),0,75); ?>
However, this renders the intro text as
<p>Lorem ipsum etc...</p>
How can I remove the paragraph tags?
Thanks in advance.
You can use PHP's strip_tags() function, e.g.:
echo strip_tags($this->item->introtext);
The code above will strip all the HTML tags in introtext.
If you want to strip all tags except <a> tags, then you can put it like this:
echo strip_tags($this->item->introtext, "<a>");
You can also use a regex for this:
<?php
// Keep the contents of each <p> element and drop the tags themselves
$text = preg_replace('/<p\b[^>]*>(.*?)<\/p>/is', '$1', $this->item->introtext);
// Truncate afterwards, so the closing </p> is still there for the pattern to match
$result = substr($text, 0, 75);
echo $result;
?>
Thanks for both suggestions. I've put together this code and it works:
<?php
$desctrunc = substr(($this->item->introtext),0,75);
$desc = strip_tags($desctrunc);
echo $desc . '...';
?>
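A possible variation, sketched under the assumption that the mbstring extension is available: stripping the tags before truncating means the 75-character limit counts only visible text and can never cut a tag in half. The helper name intro_excerpt is made up for illustration.

```php
<?php
// Hypothetical helper; the default limit of 75 mirrors the question.
function intro_excerpt(string $html, int $limit = 75): string
{
    // Strip markup first so the limit applies to visible characters only.
    $text = trim(strip_tags($html));
    if (mb_strlen($text) <= $limit) {
        return $text; // short enough, no ellipsis needed
    }
    return mb_substr($text, 0, $limit) . '...';
}

echo intro_excerpt('<p>Lorem ipsum dolor sit amet</p>', 11); // Lorem ipsum...
```

In the Joomla layout this would be called as intro_excerpt($this->item->introtext).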
Thanks.

To remove special characters like <p>, <li>

I want to display content on the client side. The problem is that I am getting output like this:
<p> Aliette is a systemic fungicide effective against Oomcytes fungi like downy mildew
diseases of grapes and damping off and Azhukal diseases of cardamom.</p> <span> Despite
its extensive use since 1978, there is no report of resistance development in fungus.
True systemic action makes application of Aliette as the best prophylactic solution for
downy mildew control in grape.</span>
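Now I want to remove those special characters, i.e. <p>, </p>, <span>, </span>.
The value stored in the database is description = "<p> test <p>";
// Note: the mysql_* functions were removed in PHP 7, and concatenating request
// input into SQL allows injection; mysqli or PDO prepared statements address both.
$sel_pro = "select * from bayer_product where product_group like '%".$_REQUEST['searchfield']."%'";
$res_pro = mysql_query($sel_pro);
$num_pro = mysql_num_rows($res_pro);
while ($row_pro = mysql_fetch_assoc($res_pro)) {
    // Remove all HTML tags from the stored description before printing it
    $desc = strip_tags($row_pro['description']);
    echo $desc;
}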
If they are tags then you can use strip_tags().
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
The output will be:
Test paragraph. Other text //strip all tags
<p>Test paragraph.</p> Other text //strip all tags except <p> & <a>
To remove all HTML tags from text, use the strip_tags() function; see the docs for usage. If you need to strip all tags except a few that you want to keep, pass those in the "allowable_tags" parameter.
http://www.nusphere.com/kb/phpmanual/function.strip-tags.htm
Use the PHP function strip_tags to remove the tags:
<?php
$text = '<p> Aliette is a systemic fungicide effective against Oomcytes fungi like downy mildew
diseases of grapes and damping off and Azhukal diseases of cardamom.</p> <span> Despite
its extensive use since 1978, there is no report of resistance development in fungus.
True systemic action makes application of Aliette as the best prophylactic solution for
downy mildew control in grape.</span>';
echo strip_tags($text);
echo "\n";
// Allow <p> and <span>
echo strip_tags($text, '<p><span>');
?>

how to remove links from a html content using php

I have the following html content:
<p>My name is way2project</p>
Now I want this text as <p>My name is way2project</p>
Is there any way to do this? Please help me thanks
I used preg_replace but in vain.
Thanks again
You can use the strip_tags function:
$string = '<p>My name is way2project</p>';
echo strip_tags($string,'<p>');
Note that the second parameter is the list of allowed tags, i.e. the tags you want to keep.
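As a minimal sketch of this allow-list approach, assuming the original markup wrapped the name in a link (the href below is a placeholder, not taken from the question):

```php
<?php
// The href is a made-up placeholder; only the <a> wrapper matters here.
$string = '<p>My name is <a href="https://example.com/">way2project</a></p>';

// Keep <p>; every other tag (including <a>) is removed, its inner text stays.
$clean = strip_tags($string, '<p>');

echo $clean; // <p>My name is way2project</p>
```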
This seems strange, but not knowing the complete scope of your issue and seeing that you want to do this in PHP, you can try:
$origstring = '<p>My name is way2project</p>';
$newstring = str_replace('way2project', 'way2project', $origstring);
echo $newstring;
Check out Simple HTML DOM Parser:
$html = str_get_html('<html><body>Hello!<a>SO</a></body></html>');
echo $html->find('a',0)->innertext; //prints "SO"
You can also use strip_tags to remove HTML tags.

Replace pagebreak tag with regular expression

Here's what I am trying to accomplish. The CMS editor of our Magento webshop has a button to insert a <!-- pagebreak --> tag. I would like to use this to create "read more" functionality. I thought I would search for this tag and replace it to do this.
I want to search inside <p> tags, and I want people to be able to use this tag as often as they want.
Suppose this is my original HTML:
<p>This is my example text, but<!-- pagebreak --> this should be readable after 'click more'<!-- pagebreak --> with even more click more possible</p>
I would like to convert it to something like the following. I think the first one is the easiest to accomplish, maybe by doing a preg_replace in a while loop? The second one is probably cleaner HTML (less nesting).
<p>This is my example text, but <a href="#" onClick='#'>read more</a><div class='hiddenreadmore' id='hiddenreadmore-1'> this should be readable after 'click more'<a href="#" onClick='#'>read more</a><div class='hiddenreadmore' id='hiddenreadmore-2'> with even more click more possible</div></div></p>
or
<p>This is my example text, but <a href="#" onClick='#'>read more</a><div class='hiddenreadmore' id='hiddenreadmore-1'> this should be readable after 'click more'<a href="#" onClick='#'>read more</a></div><div class='hiddenreadmore' id='hiddenreadmore-2'> with even more click more possible</div></p>
So I came up with this, but I think there should be a way to do it with one replace.
$pattern = '#\<p\>(.+?)\<\!-- pagebreak --\>(.+?)\<\/p\>#s';
$count = true;
// Repeat until a pass makes no more replacements
while ($count) {
    $text = preg_replace($pattern, '<p>$1 read more<div class="hidden">$2</div></p>', $text, -1, $count);
}
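Well, if you don't need to check whether it's inside a <p> tag, you can use something like this. Note that str_replace does not expand $1/$2 placeholders, so the replacement has to be a literal string and the closing </div> tags must be added separately:
$text = str_replace("<!-- pagebreak -->", ' read more<div class="hidden">', $text, $count);
It's a lot lighter on the system.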
I guess this would do the job:
$pattern = '#\<p>(.*?)\<!-- pagebreak -->(.*?)\</p>#s';
$text = "<p>some test <!-- pagebreak --> hidden content</p> second test <p>lolo <!-- pagebreak --> more hidden content</p>";
echo preg_replace($pattern, '<p>$1 read more<div class="hidden">$2</div></p>', $text, -1, $count);
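For the flatter second variant with any number of markers per paragraph, one option is a single preg_replace_callback pass that splits each paragraph on the marker. This is only a sketch: the function name add_read_more is made up, the class and id names are copied from the question, and like the patterns above it only matches <p> tags without attributes.

```php
<?php
// Hypothetical helper: turns each <!-- pagebreak --> inside a <p> into a
// "read more" link followed by a hidden <div> (the non-nested variant).
function add_read_more(string $html): string
{
    $counter = 0; // running id suffix, shared across all paragraphs
    $link = '<a href="#">read more</a>';

    $result = preg_replace_callback(
        '#<p>(.*?)</p>#s',
        function (array $m) use (&$counter, $link): string {
            $parts = explode('<!-- pagebreak -->', $m[1]);
            $last = count($parts) - 1;
            if ($last === 0) {
                return $m[0]; // no marker here, leave the paragraph as is
            }
            // Visible text, followed by the first "read more" link.
            $out = $parts[0] . $link;
            for ($i = 1; $i <= $last; $i++) {
                $counter++;
                // Every hidden block except the last one ends with a link.
                $more = ($i < $last) ? $link : '';
                $out .= "<div class='hiddenreadmore' id='hiddenreadmore-{$counter}'>"
                      . $parts[$i] . $more . '</div>';
            }
            return '<p>' . $out . '</p>';
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

echo add_read_more('<p>intro<!-- pagebreak -->more<!-- pagebreak -->even more</p>');
```

The click handling that reveals each hidden block is left out here, as it is in the question.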

Replace Links Location (href='...')

I would like to replace the link location (of anchor tag) of a page as follows.
Sample Input:
text text text <a href='http://test1.com/'> click </a> text text
other text <a class='links' href="gallery.html" title='Look at the gallery'> Gallery</a>
more text
Sample Output
text text text <a href='http://example.com/p.php?q=http://test1.com/'> click </a> text text
other text <a class='links' href="http://example.com/p.php?q=gallery.html" title='Look at the gallery'> Gallery</a>
more text
I hope I have made it clear. Anyway, I am trying to do it with PHP and regex. Would you please point me in the right direction?
Thank you
Sadi
Don't use regular expressions for parsing HTML.
Do use PHP's built-in XML parsing engine. It works quite well on your question (and answers the question to boot):
<?php
libxml_use_internal_errors(true); // ignore malformed HTML
$xml = new DOMDocument();
$xml->loadHTMLFile("http://stackoverflow.com/questions/3099187/replace-links-location-href");
foreach ($xml->getElementsByTagName('a') as $link) {
    $link->setAttribute('href', "http://www.google.com/?q=" . $link->getAttribute('href'));
}
echo $xml->saveHTML(); // output to browser, save to file, etc.
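The same idea applied to an HTML string rather than a remote page might look like the sketch below. The function name prefix_links and the wrapper <div> are made up for illustration, the prefix is the one from the question's sample output, and the sketch assumes ASCII input.

```php
<?php
// Hypothetical helper: prepends $prefix to the href of every link in $html.
function prefix_links(string $html, string $prefix): string
{
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so it can be extracted again afterwards.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $link) {
        if ($link->hasAttribute('href')) {
            $link->setAttribute('href', $prefix . $link->getAttribute('href'));
        }
    }

    // Serialize only the children of the wrapper, not the whole document.
    $out = '';
    foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$input = "text <a href='http://test1.com/'> click </a> text";
echo prefix_links($input, 'http://example.com/p.php?q=');
```

The original URL is appended unencoded to match the sample output in the question; if p.php reads q as a query parameter, wrapping the original href in urlencode() is probably what you want.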
Try using str_replace() (note that this only matches double-quoted href attributes):
$string = 'your text';
$newstring = str_replace('href="', 'href="http://example.com/p.php?q=', $string);
