How to remove first image from string - php

I have the results from a database that contains images and text. I would like to remove the first image.
Example:
$string = 'This is some text and images
<img src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
I would like to remove the first image, it is not always the same url.
Thanks for any help.

$string = 'This is some text and images
<img src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
print preg_replace('/<img(.*)>/i','',$string,1);
The above should return
This is some text and images
,
more text and <img src="http://linktoanotherimage"> and so on
Assuming you know it'll be prefixed by spaces and a line break, and suffixed by a comma and line break (and you want to remove these, too), you can do
print preg_replace("/\n <img(.*)>\,\n /i",'',$string,1);
Which will give you
This is some text and images more text and <img src="http://linktoanotherimage"> and so on

There was a great answer on another Thread
function get_first_image($html){
require_once('SimpleHTML.class.php');
$post_dom = str_get_dom($html);
$first_img = $post_dom->find('img', 0);
if($first_img !== null) {
return $first_img->src;
}
return null;
}
You can do it with regular expressions, but regex isn't really suited to parsing HTML.
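As a rough illustration of the parser route for the original question (removing the first image, not just reading its src), here is a minimal sketch that uses PHP's bundled DOMDocument rather than the SimpleHTML class above. The helper name remove_first_image and the wrapper div are assumptions made for this example, not something from the thread.

```php
<?php
// Sketch only: remove_first_image() is a hypothetical helper name.
function remove_first_image(string $html): string
{
    if ($html === '') {
        return $html; // loadHTML() rejects an empty string
    }
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate sloppy real-world markup
    // Wrap the fragment in a known container; the XML prolog hints UTF-8 input.
    $dom->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();

    // The wrapper is the first <div> in document order.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $img = $wrapper->getElementsByTagName('img')->item(0);
    if ($img !== null) {
        $img->parentNode->removeChild($img);
    }

    // Serialize only the wrapper's children, dropping the wrapper itself.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

The parser may normalize the remaining markup (attribute quoting, entities), so the result is not guaranteed to be byte-identical to the input minus the tag. It also assumes the fragment has no stray closing div that would end the wrapper early.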

$var = 'This is some text and images
<img src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
echo preg_replace('/<img.*?>/', '123', $var, 1);
This should do it; pass '' instead of '123' to remove the tag rather than replace it. The ? in the regex makes the match ungreedy.

Being a RegEx newbie, I tend to avoid it and use something like:
$i = strpos($string,'<img');
if ($i !== false) {
$j = strpos($string, '>', $i);
if ($j !== false) $string = substr($string,0,$i) . substr($string,$j+1);
}
It has to be $j+1 so that the closing > is dropped as well; $i is already the offset of the opening < and needs no adjustment.

Finally nailed it down:
preg_replace('/<img[\S\s]*?>/i','',$txt,1)
This one also works for cases when you might have:
$string = 'This is some text and images
<img
src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
(a new line character in the tag.)
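To see why the newline matters, here is a small sketch comparing three patterns on a tag that spans two lines. The variable names are made up for the example; the s (DOTALL) modifier is a standard PCRE alternative to [\S\s] that the answer does not mention.

```php
<?php
// First tag contains a newline, second tag is on one line.
$txt = "before <img\nsrc=\"http://linktoimage\" border=\"0\"> after <img src=\"http://linktoanotherimage\">";

// Without the s modifier "." cannot cross the newline, so the first tag
// is skipped and the second (single-line) tag is removed instead.
$dotOnly = preg_replace('/<img.*?>/i', '', $txt, 1);

// [\S\s] matches any character, newlines included.
$anyChar = preg_replace('/<img[\S\s]*?>/i', '', $txt, 1);

// "." behaves the same way once the s (DOTALL) modifier is set.
$dotAll = preg_replace('/<img.*?>/is', '', $txt, 1);

echo $anyChar, "\n"; // before  after <img src="http://linktoanotherimage">
```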

This code works for me
$html is the variable holding the HTML content from which you want to remove the first image tag.
$str = preg_replace('/(<img[^>]+>)/i','',$html,1);

Related

PHP preg_replace: Replace all anchor tags in text with their href value with Regex

I want to replace all anchor tags within a text with their href value, but my pattern does not work right.
$str = 'This is a text with multiple anchor tags. This is the first one: Link 1 and this one the second: Link 2 after that a lot of other text. And here the 3rd one: Link 3 Some other text.';
$test = preg_replace("/<a\s.+href=['|\"]([^\"\']*)['|\"].*>[^<]*<\/a>/i",'\1', $str);
echo $test;
At the end the text should look like this:
This is a text with multiple anchor tags. This is the first one: https://www.link1.com/ and this one the second: https://www.link2.com/ after that a lot of other text. And here the 3rd one: https://www.link3.com/ Some other text.
Thank you very much!
Just don't.
Use a parser instead.
$dom = new DOMDocument();
// since you have a fragment, wrap it in a <body>
$dom->loadHTML("<body>".$str."</body>");
$links = $dom->getElementsByTagName("a");
while($link = $links[0]) {
$link->parentNode->insertBefore(new DOMText($link->getAttribute("href")),$link);
$link->parentNode->removeChild($link);
}
$result = $dom->saveHTML($dom->getElementsByTagName("body")[0]);
// remove <body>..</body> wrapper
$output = substr($result, strlen("<body>"), -strlen("</body>"));
Demo on 3v4l
In case you're still set on regex, this should work:
preg_replace("/<a\s+href=['\"]([^'\"]+)['\"][^\>]*>[^<]+<\/a>/i",'$1', $str);
But you're probably better off with a solution like what Andreas posted.
FYI: the reason your previous regex didn't work was this little number:
.*>
Because . matches any character and * is greedy, .*> ran on to the last > it could reach, so a single match swallowed everything from the first anchor's URL to the end of the last anchor. This is why it appeared to replace only the first anchor tag and cut off the rest.
Changing that to
[^\>]*
Ensures that this particular selection is constrained to only the portion of the string which exists between the url and the ending bracket of the a tag.
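The difference between the two quantifiers can be checked on a short sample. This is only a sketch: the two-anchor input string is invented for the demonstration, and the patterns are the one from this answer with .* and [^>]* swapped in.

```php
<?php
// Sample input with two anchors; the markup is a placeholder for illustration.
$str = 'First: <a href="https://www.link1.com/">Link 1</a>, second: <a href="https://www.link2.com/">Link 2</a>.';

// Greedy ".*" runs to the last ">" that still allows a match,
// so one match covers both anchors.
$greedy = preg_replace('/<a\s+href=[\'"]([^\'"]+)[\'"].*>[^<]+<\/a>/i', '$1', $str);

// "[^>]*" cannot cross the ">" that closes the opening tag,
// so each anchor is matched separately.
$bounded = preg_replace('/<a\s+href=[\'"]([^\'"]+)[\'"][^>]*>[^<]+<\/a>/i', '$1', $str);

echo $greedy, "\n";  // First: https://www.link1.com/.
echo $bounded, "\n"; // First: https://www.link1.com/, second: https://www.link2.com/.
```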
Perhaps not simpler, but safer: loop over the string with strpos to find each anchor, then cut the markup out with substr.
$str = 'This is a text with multiple anchor tags. This is the first one: <a class="funky-style" href="https://www.link1.com/" title="Link 1">Link 1</a> and this one the second: Link 2 after that a lot of other text. And here the 3rd one: Link 3 Some other text.';
$pos = strpos($str, '<a');
while($pos !== false){
// Find start of html and remove up to link (<a href=")
$str = substr($str, 0, $pos) . substr($str, strpos($str, 'href="', $pos)+6);
// Find end of link and remove that.(" title="Link 1">Link 1</a>)
$str = substr($str, 0, strpos($str,'"', $pos)) . substr($str, strpos($str, '</a>', $pos)+4);
// Find next link if possible
$pos = strpos($str, '<a');
}
echo $str;
https://3v4l.org/vdN7E
Edited to handle a different attribute order in the a tag.
If you want to replace a tags with href values you can do:
$post = preg_replace("/<a.*?href=\"(.*?)\".*?>(.*?)<\/a>/","$1",$post);
If you want to replace with text values:
$post = preg_replace("/<a.*?href=\"(.*?)\".*?>(.*?)<\/a>/","$2",$post);

How to get first image in string using php?

I use this code:
<?php
$texthtml = '<p>test</p><br><p><img src="1.jpeg" alt=""><br></p><p><img src="2.png" alt=""><br><img src="3.png" alt=""></p>';
preg_match('/<img.+src=[\'"](?P<src>.+?)[\'"].*>/i', $texthtml, $image);
echo $image['src'];
?>
However, when I test it, I get the last image (3.png) from the string.
How can I get the first image (1.jpeg) in the string instead?
Try:
preg_match('/<img(?: [^<>]*?)?src=([\'"])(.*?)\1/', $texthtml, $image);
echo isset($image[2]) ? $image[2] : 'default.png'; // group 1 is the quote character, group 2 is the src
Regex is not suitable for html tags.
You can read about it here: RegEx match open tags except XHTML self-contained tags
I suggest DOM document if it's more complex than what you have shown here.
If it's not more complex than this I suggest strpos to find the words and "trim" it with substr.
$texthtml = '<p>test</p><br><p><img src="1.jpeg" alt=""><br></p><p><img src="2.png" alt=""><br><img src="3.png" alt=""></p>';
$search = 'img src="';
$pos = strpos($texthtml, $search) + strlen($search); // find the position of 'img src="' and add its length
$length = strpos($texthtml, '"', $pos) - $pos; // find the closing " and subtract $pos to get the length of the image path
echo substr($texthtml, $pos, $length); // 1.jpeg
https://3v4l.org/48iiI
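For the DOM route suggested above, a minimal sketch might look like the following. The function name first_image_src is an assumption for this example. The regex in the question returns 3.png because its leading .+ is greedy and backtracks only as far as the last src= in the string; a parser avoids that class of problem.

```php
<?php
// Sketch only: first_image_src() is a hypothetical helper name.
function first_image_src(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // suppress warnings on imperfect markup
    $dom->loadHTML($html);
    libxml_clear_errors();

    // getElementsByTagName() returns nodes in document order,
    // so item(0) is the first <img> in the string.
    $img = $dom->getElementsByTagName('img')->item(0);
    if ($img === null || !$img->hasAttribute('src')) {
        return null;
    }
    return $img->getAttribute('src');
}

$texthtml = '<p>test</p><br><p><img src="1.jpeg" alt=""><br></p><p><img src="2.png" alt=""><br><img src="3.png" alt=""></p>';
echo first_image_src($texthtml) ?? 'default.png'; // 1.jpeg
```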

PHP cut text from a specific word in an HTML string

I would like to cut all text (image alt included) in an HTML string from a specific word onward.
For example, this is the string:
<?php
$string = '<div><img src="img.jpg" alt="cut this text from here" />cut this text from here</div>';
?>
and this is what I would like to output
<div>
<a href="#">
<img src="img.jpg" alt="cut this text" />
cut this text
</a>
</div>
The $string is actually an element of an object, but I didn't want to post overly long code here.
Obviously I can't use explode because that would kill the HTML markup.
And str_replace or substr is out as well, because the length of the text before or after the word where it needs to be cut is not constant.
So what can I do to achieve this?
Ok I solved my problem and I only post an answer to my question because it could help someone.
so this is what I did:
<?php
$string = '<div><img src="img.jpg" alt="cut this text from here" />cut this text from here</div>';
$txt_only = strip_tags($string);
$explode = explode(' from', $txt_only);
$find_txt = array(' from', $explode[1]);
$new_str = str_replace($find_txt, '', $string);
echo $new_str;
?>
This might not be the best solution, but it was quick and did not involve DOM parsing.
If anybody wants to try this, make sure that your href, src, or any other attribute that needs to stay untouched doesn't contain any of the strings in $find_txt, or they will be replaced there too.

Removing characters from a variable created using preg_replace

So I'm trying to hack off a few characters at the end of a URL I'm getting from a preg_replace function. However it doesn't seem to be working. I'm not familiar with using these variables in preg_replace (it was just something I found that "mostly" worked).
Here's my attempt:
function addlink_replace($string) {
$pattern = '/<ul(.*?)class="slides"(.*?)<img(.*?)src="(.*?)"(.*?)>(.*?)<\/ul>/is';
$URL = substr($4, 0, -8);;
$replacement = '<ul$1class="slides"$2<a rel=\'shadowbox\' href="'.$URL.'"><img$3src="$4"$5></a>$6</ul>';
return preg_replace($pattern, $replacement, $string);
}
add_filter('the_content', 'addlink_replace', 9999);
Basically I need to remove the last bit of my .jpg file name, so I can show the LARGE image rather than the THUMBNAIL it's generating, but the "$4" doesn't seem to want to be manipulated.
This answer is based off of what you're looking to accomplish in this question with the HTML structure of your other question. The regex that is posted in your question will not match anything other than the first set of <li> and <img> tags , and you've indicated that you need to match all <li> and <img> tags within a <ul> so I've written a larger function to do so.
It will wrap all <img> tags that are inside of an <li> within a <ul> with the class of slides with an <a> with the source being the image's URL with the -110x110 string removed, while preserving the thumbnail source in the <img> tag.
function addlink_replace($string) {
$new_ul_block = '';
$ul_pattern = '/<ul(.*?)class="slides"(.*?)>(.*?)<\/ul>/is';
$img_pattern = '/<li(.*?)>(.*?)<img(.*?)src="(.*?)"(.*?)>(.*?)<\/li>/is';
preg_match($ul_pattern, $string, $ul_matches);
if (!empty($ul_matches[3]))
{
preg_match_all($img_pattern, $ul_matches[3], $img_matches);
if (!empty($img_matches[0]))
{
$new_ul_block .= "<ul{$ul_matches[1]}class=\"slides\"{$ul_matches[2]}>";
foreach ($img_matches[0] as $id => $img)
{
$new_img = str_replace('-110x110', '', $img_matches[4][$id]);
$new_ul_block .= "<li{$img_matches[1][$id]}>{$img_matches[2][$id]}<a href=\"{$new_img}\">";
$new_ul_block .= "<img{$img_matches[3][$id]}src=\"{$img_matches[4][$id]}\"{$img_matches[5][$id]}></a>{$img_matches[6][$id]}</li>";
}
$new_ul_block .= "</ul>";
}
}
if (!empty($new_ul_block))
{
$replace_pattern = '/<ul.*?class="slides".*?>.*?<\/ul>/is';
return preg_replace($replace_pattern, $new_ul_block, $string);
}
else
{
return $string;
}
}
The change of the <a>'s href attribute from what the image had is specifically done on the line
$new_img = str_replace('-110x110', '', $img_matches[4][$id]);
if you would like to modify it. If you need to remove anything other than -110x110 from the URL you may need to change it from str_replace to a preg_replace, or if you want to remove a specific number of characters from the end of the URL, you could use substr:
$new_img = substr($img_matches[4][$id], 0, -12);
Where -12 is the number of characters you want to remove from the end of the string (it's negative because it's starting at the end).
I've posted a working example of this function here.
You may want to consider modifying the source of what is generating this code block, rather than using this regex, as this regex may be hard to maintain in the future if the HTML structure changes.
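The str_replace and preg_replace options mentioned above could look roughly like this. The thumbnail URL is a made-up example, and the pattern assumes the size suffix has the form -WIDTHxHEIGHT directly before the file extension, which matches the -110x110 case in this answer but may not match every setup.

```php
<?php
// Hypothetical thumbnail URL in the "-WIDTHxHEIGHT" naming style.
$thumb = 'http://example.com/uploads/photo-110x110.jpg';

// Fixed, known suffix: a plain string replacement is enough.
$fullFixed = str_replace('-110x110', '', $thumb);

// Any "-<digits>x<digits>" suffix that sits right before the extension.
// The lookahead keeps the extension itself out of the match.
$fullAnySize = preg_replace('/-\d+x\d+(?=\.[a-z0-9]+$)/i', '', $thumb);

echo $fullFixed, "\n";   // http://example.com/uploads/photo.jpg
echo $fullAnySize, "\n"; // http://example.com/uploads/photo.jpg
```

Unlike substr with a fixed negative length, both variants leave the URL unchanged when the suffix is absent.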

Check if a string had been hyperlinked

What regular expression should I use to detect whether the text I want to hyperlink has already been hyperlinked?
Example:
I use
$text = preg_replace('/((http)+(s)?:\/\/[^<>\s]+)/i', '\\0', $text);
to link regular URL, and
$text = preg_replace('/[#]+([A-Za-z_0-9]+)/', '#\\1', $text);
to link Twitter handle.
I want to detect whether the text I'm going to hyperlink has already been wrapped in an <a> tag.
Maybe not an answer but another possible solution; You could also search to see if the starting a element exists
$text = 'here';
if (gettype(strpos($text, "<a")) == "integer"){
//<a start tag was found
};
or just strip all tags regardless and build the link anyway
$text = 'here';
echo '' . strip_tags($text) . '';
Simple: replace the regular URLs first. That won't affect anything starting with a #, because no URL starts with a #. Then replace the Twitter handles.
That way you don't need to detect if it's been hyperlinked already.
if (strpos($str, '<a ') !== FALSE) echo 'ok';
else echo 'error';
$html = 'Stephen Ou';
$str = 'Stephen Ou';
if (strlen(str_replace($str, '', $html)) !== strlen($html)) {
echo 'I got a feeling';
}
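Another way to avoid linking the same text twice, sketched under assumptions: split the text on existing anchor elements and run the URL replacement only on the pieces in between. The helper name linkify_once and the <a href> replacement markup are assumptions for illustration; the URL pattern is a simplified form of the one in the question.

```php
<?php
// Sketch only: linkify_once() is a hypothetical helper name.
function linkify_once(string $text): string
{
    // Split into existing <a>...</a> elements (captured as delimiters)
    // and the plain text between them.
    $parts = preg_split('/(<a\b[^>]*>.*?<\/a>)/is', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // With a single capture group, odd indexes hold the captured
        // anchors; leave those untouched.
        if ($i % 2 === 1) {
            continue;
        }
        // Link bare URLs in the remaining text.
        $parts[$i] = preg_replace('/https?:\/\/[^<>\s]+/i', '<a href="$0">$0</a>', $part);
    }
    return implode('', $parts);
}

$in = 'see http://a.com and <a href="http://b.com">http://b.com</a>';
echo linkify_once($in);
// see <a href="http://a.com">http://a.com</a> and <a href="http://b.com">http://b.com</a>
```

Because already-linked URLs are skipped, running the function a second time leaves the text unchanged. It does not protect URLs inside attributes of other tags (an img src, for example), so it is only a fit for text whose sole markup is anchors.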
