PHP: cut text from a specific word in an HTML string

I would like to cut all text (image alt included) in an HTML string from a specific word onward.
For example, this is the string:
<?php
$string = '<div><img src="img.jpg" alt="cut this text from here" />cut this text from here</div>';
?>
And this is what I would like to output:
<div>
<a href="#">
<img src="img.jpg" alt="cut this text" />
cut this text
</a>
</div>
The $string is actually an element of an object, but I didn't want to post overly long code here.
Obviously I can't use explode on the whole string because that would destroy the HTML markup.
str_replace and substr are also out, because the length of the text before and after the word where it needs to be cut is not constant.
So what can I do to achieve this?

OK, I solved my problem, and I am posting an answer to my own question only because it could help someone.
This is what I did:
<?php
$string = '<div><img src="img.jpg" alt="cut this text from here" />cut this text from here</div>';
$txt_only = strip_tags($string);
$explode = explode(' from', $txt_only);
$find_txt = array(' from', $explode[1]);
$new_str = str_replace($find_txt, '', $string);
echo $new_str;
?>
This might not be the best solution, but it was quick and did not involve a DOM parser.
If anybody wants to try this, make sure that your href, src, or any other attribute that must stay untouched does not contain any of the strings in $find_txt, otherwise they will be replaced there too.
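One way to reduce the risk described above is to apply the cut only to text between tags and to alt values, leaving src, href and other attributes alone. The following is only a sketch under assumptions: the helper names cut_at_word and cut_html_from_word are made up for illustration, alt values are assumed to be double-quoted, and neither text nor attribute values may contain a literal < or >. It also matches the word as a plain substring, not as a whole word.

```php
<?php
// Hypothetical helper: cut $text at the first occurrence of $word.
function cut_at_word(string $text, string $word): string
{
    $pos = strpos($text, $word);
    return $pos === false ? $text : rtrim(substr($text, 0, $pos));
}

// Hypothetical helper: apply the cut to text nodes and alt="..." values only.
function cut_html_from_word(string $html, string $word): string
{
    // Split into tags (kept as captured delimiters) and the text between them.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    foreach ($parts as $i => $part) {
        if ($part[0] === '<') {
            // Inside a tag, only touch the value of a double-quoted alt attribute.
            $parts[$i] = preg_replace_callback(
                '/(\salt=")([^"]*)(")/i',
                function ($m) use ($word) {
                    return $m[1] . cut_at_word($m[2], $word) . $m[3];
                },
                $part
            );
        } else {
            // Plain text between tags.
            $parts[$i] = cut_at_word($part, $word);
        }
    }
    return implode('', $parts);
}

$string = '<div><img src="img.jpg" alt="cut this text from here" />cut this text from here</div>';
$result = cut_html_from_word($string, 'from');
echo $result;
```

For markup that is less predictable than this example, a real HTML parser is likely the more robust choice.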

Related

PHP can't take <img> tag from page

I have a problem with the PHP preg_match function.
In the DLE CMS, I am trying to extract a picture from a news item (image-x), but in a module that I access via a direct link.
//remove <p></p> tags
$row[$i]['short_story'] = str_replace( "</p><p>", " ",$row[$i]['short_story'] );
//remove the \" escapes (DLE put it in the MySQL column)
$row[$i]['short_story'] = str_replace("\\\"", " ", $row[$i]['short_story']);
//remove all tags except <img>, but there remains a simple text that is stored without tags
$row[$i]['img'] = strip_tags($row[$i]['short_story'], "<img>");
//try to find <img> (by '>'), to remove the simple text;
preg_match(".*>", $row[$i]['img'], $matches);
// print only <br/> (matches is empty)
print_r($matches."<br/>\n");
For example, print_r($row[$i]['img']) gives:
<img src="somelink" class="fr-fic" fr-dib="" alt=""> Some text
And I need only:
<img src="somelink" class="fr-fic" fr-dib="" alt="">
Your regex pattern for selecting <img> is incorrect: it has no delimiters and does not target the tag. Use /<img[^>]+>/ as the pattern instead. The code should change to
preg_match("/<img[^>]+>/", $row[$i]['img'], $matches);
You can also use preg_replace() to remove the additional text after <img>:
preg_replace("/(<img[^>]+>)[\w\s]+/", "$1", $string)
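As a small self-contained check of the pattern above, here is a sketch that uses the sample value from the question. The variable names $html and $img are only for illustration. Note that $matches is an array, so the matched tag is read from $matches[0] and not by concatenating $matches with a string.

```php
<?php
// Sample value taken from the question.
$html = '<img src="somelink" class="fr-fic" fr-dib="" alt=""> Some text';

// preg_match returns 1 on a match; the full match is stored in $matches[0].
$img = preg_match('/<img[^>]+>/', $html, $matches) ? $matches[0] : null;

echo $img;
```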

substr doesn't work after using strip_tags

I want to remove all tags before showing them on preview mode (just some text).
I have this code:
$text = strip_tags($item['content']);
echo substr($text,0,13);
My $item['content'] is something like this:
<div class="note note-success">
<p>
Font Awesome gives you scalable
vector icons that can instantly be customized — size, color, drop
shadow, and anything that can be done with the power of CSS. The
complete set of 439 icons in Font Awesome 4.1.0
</p>
For more info check out: <a target="_blank" href="http://fortawesome.github.io/Font-Awesome/icons/">http://fortawesome.github.io/Font-Awesome/icons/</a>
</div>
The problem is that when I use substr it doesn't show anything, but when I use normal echo, it shows the content of the variable that was stripped before.
Does strip_tags not give string output?
Try removing the whitespace before outputting your substring:
$new = str_replace(' ','',$text); (Use trim instead, as @mario.klump said.)
$text = strip_tags($item['content']);
$new = trim($text);
echo substr($new,0,13);
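A likely explanation, assuming the content is indented like the example in the question: strip_tags removes the tags but keeps the newlines and indentation between them, so the first 13 characters can be whitespace only, which a browser renders as nothing. The sketch below uses a shortened, made-up $content value to show this, and also collapses inner whitespace runs before cutting.

```php
<?php
// Shortened stand-in for $item['content'], indented like the original.
$content = "<div class=\"note note-success\">\n    <p>\n        Font Awesome gives you scalable\n        vector icons\n    </p>\n</div>";

$text = strip_tags($content);

// Without trimming, the first 13 characters are only newlines and spaces.
$raw = substr($text, 0, 13);

// Collapse whitespace runs to single spaces, then trim the ends.
$clean = trim(preg_replace('/\s+/', ' ', $text));
$preview = substr($clean, 0, 13);

echo $preview;
```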
The strip_tags() function only works on real HTML markup like the following. If your content is stored HTML-encoded (as entities), it will not be parsed as tags.
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
For your example you can use something like this:
$text = htmlentities($item['content']);
echo substr(html_entity_decode($text),0,13);
// or
echo substr($text,0,13);

Remove characters in link within string

My string:
some text some text < b >some text< /b > some text < a href = http ://sometextsometext< b >sometext< /b >sometextsometext >text< /a > some text some text
Is there any way using preg_match or str_replace to only remove the < b >< /b > in the link tag?
Thanks
Okay, so here's an idea using PHP's function preg_replace_callback
<?php
// SET TEXT TO BE USED
$string = 'some text some text <b>some text</b> some text <a href=http://sometextsometext<b>sometext</b>sometextsometext>text</a> some text some text. And We Have A <a href=http://google.com>Google</a> Link';
// USE A CALLBACK FUNCTION TO SCAN THROUGH LINKS
$string = preg_replace_callback('~<a.*?</a>~', 'remove_crap_from_links', $string);
print $string;
// THIS IS THE CALLBACK FUNCTION ... EACH LINK IS STORED AS '$m'
function remove_crap_from_links($m) {
    // PULL OUT THE PART OF THE LINK BEFORE THE CLOSING LINK BRACKET
    // (THE NEGATED CHARACTER CLASS [^<>] MAKES SURE THE LINK TEXT CAN'T HAVE ANY OPENING/CLOSING HTML BRACKETS IN THERE)
    if (!preg_match('~<a(.*?)>(?:[^<>]*?)</a>~i', $m[0], $url_matches)) {
        // NOTHING TO CLEAN: RETURN THE LINK UNCHANGED
        return $m[0];
    }
    // RUN A PHP strip_tags FUNCTION TO PULL OUT ANY HTML TAGS FOUND IN THE LINK'S OPENING TAG
    $stripped_url = strip_tags($url_matches[1]);
    // REBUILD THE LINK, USING THE $stripped_url IN PLACE OF WHAT WAS ALREADY THERE
    return preg_replace('~(<a)(.*?)(>(?:[^<>]*?)</a>)~', '${1}'.$stripped_url.'${3}', $m[0]);
}
So basically, I'm taking the part that has been suggested a couple of times with PHP's strip_tags function, but only using on parts that it finds inside of link tags.
You can use the strip_tags function in PHP to remove HTML tags.
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
Output: Test paragraph. Other text
strip_tags - Strip HTML and PHP tags from a string
PHP Code
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
The above example will output:
Test paragraph. Other text
<p>Test paragraph.</p> Other text
Ref: http://php.net/manual/en/function.strip-tags.php

Using PHP to remove a html element from a string

I am having trouble working out how to do this. I have a string that looks something like this...
$text = "<p>This is some example text This is some example text This is some example text</p>
<p><em>This is some example text This is some example text This is some example text</em></p>
<p>This is some example text This is some example text This is some example text</p>";
I basically want to use something like preg_replace and a regex to remove
<em>This is some example text This is some example text This is some example text</em>
So I need to write some PHP code that will search for the opening <em> and closing </em> and delete all text in-between
hope someone can help,
Thanks.
$text = preg_replace('/([\s\S]*)(<em>)([\s\S]*)(<\/em>)([\s\S]*)/', '$1$5', $text);
In case you are interested in a non-regex solution, the following would work as well:
<?php
$text = "<p>This is some example text This is some example text This is some example text</p>
<p><em>This is some example text This is some example text This is some example text</em></p>
<p>This is some example text This is some example text This is some example text</p>";
$emStartPos = strpos($text,"<em>");
$emEndPos = strpos($text,"</em>");
if ($emStartPos !== false && $emEndPos !== false) { // strpos can return 0, which is falsy
$emEndPos += 5; // include the closing </em> tag (5 characters) as well
$len = $emEndPos - $emStartPos;
$text = substr_replace($text, '', $emStartPos, $len);
}
?>
This will remove all the content in between tags.
$text = '<p>This is some example text This is some example text This is some example text</p>
<p><em>This is the em text</em></p>
<p>This is some example text This is some example text This is some example text</p>';
preg_match("#<em>(.+?)</em>#", $text, $output);
echo $output[0]; // This will output it with em style
echo '<br /><br />';
echo $output[1]; // This will output only the text between the em
For this example to work, I changed the <em></em> contents a little, otherwise all your text is the same and you cannot really understand if the script works.
However, if you want to get rid of the <em> and not to get the contents:
$text = '<p>This is some example text This is some example text This is some example text</p>
<p><em>This is the em text</em></p>
<p>This is some example text This is some example text This is some example text</p>';
echo preg_replace("/<em>(.+)<\/em>/", "", $text);
Use strpos to find the first element and
strrpos to find the last element.
Use substr to get that part of the string.
Then replace the substring with an empty string in the original string.
To remove only the tags themselves and keep the text between them:
$text = str_replace('<em>','',$text);
$text = str_replace('</em>','',$text);
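The regex answers above can be condensed into one non-greedy replace. This is a sketch, not the only way to do it: the sample $text is made up, the s modifier lets the dot match newlines, and the lazy .*? stops at the nearest closing tag so that several separate <em> blocks are removed individually. It assumes <em> tags are not nested and carry no attributes.

```php
<?php
// Made-up sample with two <em> blocks, one spanning a line break.
$text = "<p>Keep one</p>\n<p><em>Drop\nthis</em></p>\n<p>Keep <em>drop too</em>two</p>";

// '#' is used as the delimiter so the slash in </em> needs no escaping.
$result = preg_replace('#<em>.*?</em>#s', '', $text);

echo $result;
```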

Storing a piece of the string for later use in replacement

Not sure if the subject was clear, hard to describe, easy to show:
I need to convert this:
$text = 'Bunch of text %img_Green %img_Red and some more text here';
to this:
$text = 'Bunch of text <img src="/assets/images/Green.gif" alt="Green" /> <img src="/assets/images/Red.gif" alt="Red" /> and some more text here';
Thanks in advance.
This should do it:
$text = preg_replace('/%img_(\w+)/', '<img src="/assets/images/\1.gif" alt="\1" />', $text);
Check out the preg_replace documentation page to see how this works.
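To see the backreference at work, here is a short sketch using the input from the question. The capture group (\w+) holds the name after %img_, and \1 in the replacement inserts it twice. The variable names $result and $expected are only for illustration.

```php
<?php
$text = 'Bunch of text %img_Green %img_Red and some more text here';

// \1 refers to whatever (\w+) captured, e.g. "Green" or "Red".
$result = preg_replace('/%img_(\w+)/', '<img src="/assets/images/\1.gif" alt="\1" />', $text);

// The target string given in the question.
$expected = 'Bunch of text <img src="/assets/images/Green.gif" alt="Green" /> <img src="/assets/images/Red.gif" alt="Red" /> and some more text here';

echo $result;
```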
