I have the Summernote WYSIWYG plugin. Whenever I add an image, it converts the image into
<img data-filename="Untitled-1.png" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAoUAAAELCAIAAAAgGWu2AA" style="width: 645px;">
Now all I want is to detect the first such tag, get its src value, and store it in the DB to show it as a featured image.
For example, if there are two img tags with a data-filename attribute:
<img data-filename="Untitled-1.png" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAoUAAAELCAIAAAAgGWu2AA" style="width: 645px;">
<img data-filename="Untitled-2.png" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAoUAAAELCAIAAAAgGWu2AA" style="width: 645px;">
I want to get the src value of Untitled-1.png only, not Untitled-2.png.
Here is what I've tried:
preg_match('/(<img .*?>)/', $go, $img_tag);
$feature = $img_tag[0];
Use DOMDocument and DOMXPath to easily target what you want using the HTML structure:
$content = <<<'EOD'
<img data-filename="Untitled-1.png" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAoUAAAELCAIAAAAgGWu2AA" style="width: 645px;">
<img data-filename="Untitled-2.png" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAoUAAAELCAIAAAAgGWu2AA" style="width: 645px;">
EOD;
$dom = new DOMDocument;
$dom->loadHTML($content);
$xp = new DOMXPath($dom);
$result = $xp->evaluate('string(//img[@data-filename]/@src)');
# img node anywhere ------------^    ^                ^---- src attribute
# in the DOM tree                    '---- predicate: must have a
#                                          data-filename attribute
if (!empty($result))
echo $result, PHP_EOL;
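For the featured-image use case, the same query could be wrapped in a small helper that returns null when nothing matches. This is only a minimal sketch: the function name firstUploadedImageSrc is hypothetical, the sample base64 payloads are shortened placeholders, and storing the value in the database is left to the caller.

```php
<?php
// Hypothetical helper (not part of Summernote or PHP): returns the src of the
// first <img> that carries a data-filename attribute, or null if there is none.
function firstUploadedImageSrc(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() does not accept empty input
    }

    // Collect libxml parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    // (//img[@data-filename])[1] selects the first match in document order.
    $src = $xp->evaluate('string((//img[@data-filename])[1]/@src)');

    return $src !== '' ? $src : null;
}

// Placeholder content: two images, only the first one should be returned.
$content = '<p>intro</p>'
    . '<img data-filename="a.png" src="data:image/png;base64,AAAA" style="width: 645px;">'
    . '<img data-filename="b.png" src="data:image/png;base64,BBBB" style="width: 645px;">';

$featured = firstUploadedImageSrc($content);
echo $featured ?? 'no image found', PHP_EOL;
```

Returning null instead of an empty string makes it easier to tell "no image in the post" apart from a stored value.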
I have a string containing different types of html tags and stuff, including some <img> elements. I am trying to wrap those <img> elements inside a <figure> tag. So far so good using a preg_replace like this:
preg_replace( '/(<img.*?>)/s','<figure>$1</figure>',$content);
However, if the <img> tag has a neighboring <figcaption> tag, the result is rather ugly and produces a stray end tag for the figure element:
<figure id="attachment_9615">
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text"></figure>Caption title here</figcaption>
</figure>
I've tried a whole bunch of preg_replace regex variations to wrap both the img-tag and figcaption-tag inside figure, but can't seem to make it work.
My latest try:
preg_replace( '/(<img.*?>)(<figcaption .*>*.<\/figcaption>)?/s',
'<figure">$1$2</figure>',
$content);
As others pointed out, it is better to use a parser such as DOMDocument. The following code wraps a <figure> tag around each img whose next sibling is a <figcaption>:
<?php
$html = <<<EOF
<html>
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text">Caption title here</figcaption>
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text">Caption title here</figcaption>
</html>
EOF;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
# get all images
$imgs = $xpath->query("//img");
foreach ($imgs as $img) {
    # find the next sibling, skipping whitespace-only text nodes
    $next = $img->nextSibling;
    while ($next instanceof DOMText && trim($next->nodeValue) === '') {
        $next = $next->nextSibling;
    }
    if ($next instanceof DOMElement && $next->tagName == 'figcaption') {
        # create a new figure tag and append the cloned elements
        $figure = $dom->createElement('figure');
        $figure->appendChild($img->cloneNode(true));
        $figure->appendChild($next->cloneNode(true));
        # insert the newly generated elements right before $img
        $img->parentNode->insertBefore($figure, $img);
        # and remove both the figcaption and the image from the DOM
        $next->parentNode->removeChild($next);
        $img->parentNode->removeChild($img);
    }
}
$dom->formatOutput = true;
echo $dom->saveHTML();
See a demo on ideone.com.
To have a <figure> tag around all your images, you might want to add an else branch:
    } else {
        $figure = $dom->createElement('figure');
        $figure->appendChild($img->cloneNode(true));
        $img->parentNode->insertBefore($figure, $img);
        $img->parentNode->removeChild($img);
    }
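The two branches can also be combined by moving the existing nodes into the new <figure> instead of cloning and removing them. The following is a rough sketch under a few assumptions: the helper name wrapImagesInFigures and the sample markup are made up for illustration, and images that already sit directly inside a <figure> are skipped.

```php
<?php
// Hypothetical helper: wraps every <img> in a <figure> and pulls a directly
// following <figcaption> into the same <figure>.
function wrapImagesInFigures(DOMDocument $dom): void
{
    $xpath = new DOMXPath($dom);

    // The XPath result is not a live list, so the DOM can be changed while looping.
    foreach ($xpath->query('//img[not(parent::figure)]') as $img) {
        // Find the next sibling, ignoring whitespace-only text nodes.
        $next = $img->nextSibling;
        while ($next instanceof DOMText && trim($next->nodeValue) === '') {
            $next = $next->nextSibling;
        }

        $figure = $dom->createElement('figure');
        $img->parentNode->insertBefore($figure, $img);
        $figure->appendChild($img); // appendChild() moves a node that is already attached

        if ($next instanceof DOMElement && $next->tagName === 'figcaption') {
            $figure->appendChild($next);
        }
    }
}

// Sample markup (placeholder file names).
$html = '<div><img src="a.png" alt="a"><figcaption>Caption A</figcaption>'
      . '<img src="b.png" alt="b"></div>';

libxml_use_internal_errors(true); // some libxml2 versions warn about <figcaption>
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

wrapImagesInFigures($dom);
echo $dom->saveHTML();
```

Moving nodes keeps the loop shorter, at the cost of changing the document in place rather than building the figure from copies.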
I want to replace all images in my HTML, but the code replaces one image, skips the next one, and so on.
I use DOMDocument to replace the images in my content with the code below. The problem is that the code skips images.
For example, with images 1, 2, 3 and 4, the code replaces images one and three and skips two and four:
$dom = new \DOMDocument();
$dom->loadHTML("data");
$dom->preserveWhiteSpace = true;
$count = 1;
$images = $dom->getElementsByTagName('img');
foreach ($images as $img) {
    $src = $img->getAttribute('src');
    $newsrc = $dom->createElement("newimg");
    $newsrc->nodeValue = $src;
    $newsrc->setAttribute("id","qw".$count);
    $img->parentNode->replaceChild($newsrc, $img);
    $count++;
}
$html = $dom->saveHTML();
return $html;
The HTML code is:
<p><img class="img-responsive" src="http://www.jarofquotes.com/img/quotes/86444b28aa86d706e33246b823045270.jpg" alt="" width="600" height="455" /></p>
<p> </p>
<p>some text</p>
<p> </p>
<p><img class="img-responsive" src="http://40.media.tumblr.com/c0bc20fd255cc18dca150640a25e13ef/tumblr_nammr75ACv1taqt2oo1_500.jpg" alt="" width="480" height="477" /></p>
<p> </p>
<p><span class="marker"><img class="img-responsive" src="http://wiselygreen.com/wp-content/uploads/green-living-coach-icon.png" alt="" width="250" height="250" /><br /><br /></span></p>
I want the output HTML to have every image replaced with
<newimg>Src </newimg>
Ok, I couldn't find a dupe suitable for PHP, so I am answering this one.
The issue you are facing is that NodeLists returned by getElementsByTagName() are live lists. That means that when you call replaceChild(), you are altering the NodeList you are currently iterating over.
Let's assume we have this HTML:
$html = <<< HTML
<html>
<body>
<img src="1.jpg"/>
<img src="2.jpg"/>
<img src="3.jpg"/>
</body>
</html>
HTML;
Now let's load it into a DOMDocument and get the img elements:
$dom = new DOMDocument;
$dom->loadHTML($html);
$allImages = $dom->getElementsByTagName('img');
echo $allImages->length, PHP_EOL;
This will print 3 because there are three img elements in the DOM right now.
Let's replace the first img element with a p element:
$allImages->item(0)->parentNode->replaceChild(
$dom->createElement("p"),
$allImages->item(0)
);
echo $allImages->length, PHP_EOL;
This now prints 2 because only two img elements are left. Essentially:
item 0: img is removed from the list
item 1: img becomes item 0
item 2: img becomes item 1
You are using foreach, so you first replace item 0 and then move on to item 1. But because the list is live, the img that used to be item 1 has become item 0 and the img that used to be item 2 is now item 1, so the loop skips the one you would expect next.
To get around this, use a while loop and always replace the first element:
while ($allImages->length > 0) {
    $allImages->item(0)->parentNode->replaceChild(
        $dom->createElement("p"),
        $allImages->item(0)
    );
}
This will then catch all the img elements.
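Another way to sidestep the live list is to copy it into a plain array first and loop over the copy. The sketch below applies that idea to the newimg replacement from the question; the sample HTML is a shortened placeholder, and the id prefix qw is taken from the question's code.

```php
<?php
// Shortened placeholder markup with three images in different positions.
$html = '<p><img src="1.jpg"></p><p>some text</p>'
      . '<p><img src="2.jpg"></p><p><span><img src="3.jpg"><br></span></p>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Copy the live list into a plain array; the array does not change
// when nodes are replaced in the document.
$images = iterator_to_array($dom->getElementsByTagName('img'));

$count = 1;
foreach ($images as $img) {
    // "newimg" is the custom element name used in the question.
    $replacement = $dom->createElement('newimg');
    $replacement->appendChild($dom->createTextNode($img->getAttribute('src')));
    $replacement->setAttribute('id', 'qw' . $count++);
    $img->parentNode->replaceChild($replacement, $img);
}

echo $dom->saveHTML();
```

Using createTextNode() for the src value avoids problems with characters such as & that can appear in URLs.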
I am trying to remove all text from a node, but when I remove the text, only the plain text is removed, not the text inside the table and the inner divs.
Here is my code:
$dom = new DOMDocument();
$result = $dom->loadHTML($html);
$finder = new DomXPath($dom);
//$nodes = $finder->query('//div[starts-with(@id, "post_message_")]');
$nodes = $finder->query('//div[contains(text(), "") and .//img and .//a and starts-with(@id, "post_message_")]');
But it gives me this HTML in the node:
<div id="post_message_31962189">.<br><div align="center"><img src="http://s3.postimage.odf.jpg" border="0" alt=""></div><br><b><div align="center"><font size="5"><font color="Blue"><br><br>
WATERMARKED <br><br>
ADDED 4 IN LAST PAGE<br><br></font></font></div></b><br>
=============================================================================<br>
IN HOTEL <br><br><b><font size="4"><font color="Red"> i promise </font></font></b><br><br><b><div align="center"><font size="5"><font color="Blue">ADDED 4 NEW </font></font></div></b><br><br><br>Ashoka hotel<br><br><br><br><img src="http:/img.jpg" border="0" alt=""></div>
I want to remove everything except the img, a and br elements.
I have an RSS feed from which I'm trying to extract data through SimplePie (in WordPress).
I have to extract the content tag. It works with <?php echo $item->get_content(); ?>. It outputs all this markup (this is just one entry; the others have the same structure):
<table><tr valign="top">
<td width="67">
<a href="http://www.anobii.com/books/Lapproccio_sistemico_al_governo_dellimpresa/9788813230944/014c5c45a7ddaab1ec/" style="border: 1px solid #333333">
<img src="http://image.anobii.com/anobi/image_book.php?type=3&item_id=014c5c45a7ddaab1ec&time=0">
</a>
</td><td style="margin-left: 10px;padding-left: 10px">[person name] put "[title]" onto shelf<br/></td></tr></table>
What I need, though, is just the value of the src attribute (the image URL). How can I extract only that?
You can do it using DOMDocument (the best way):
$doc = new DOMDocument();
@$doc->loadHTML($html);
$imgs = $doc->getElementsByTagName('img');
$res = $imgs->item(0)->getAttribute('src');
print_r($res);
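Since each feed entry may contain zero or several images, a variant that collects every src and uses libxml_use_internal_errors() in place of the @ operator might look like the sketch below. The function name imageSources and the example.com URLs are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: returns the src of every <img> found in an HTML fragment.
function imageSources(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    // Keep libxml parse warnings out of the output without using @.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $sources[] = $img->getAttribute('src');
        }
    }

    return $sources;
}

// Placeholder entry shaped like the feed content in the question.
$html = '<table><tr valign="top"><td width="67">'
      . '<a href="http://www.example.com/book"><img src="http://www.example.com/cover.png"></a>'
      . '</td><td>[person name] put "[title]" onto shelf<br/></td></tr></table>';

print_r(imageSources($html));
```

An empty array then simply means the entry had no image, so the caller does not need a separate null check.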
With a regex (the bad way):
if (preg_match('~\bsrc\s*=\s*["\']\K[^"\']*+~i', $html, $match))
    print_r($match);
I fetch a value from the DB like:
<p><img alt="" src="images/1.jpg" style="width: 2450px; height: 1054px;" /></p>
and want to get only src="images/1.jpg", but I don't know how. Please guide me.
If you need the source, use a DOM Parser:
// Construct a new DOMDocument with your fragment
$domDoc = new DOMDocument;
$domDoc->loadHTML( '<p><img src="images/1.jpg" style="width: 2450px;" /></p>' );
// Locate the first image in the document
$img = $domDoc->getElementsByTagName( "img" )->item( 0 );
// Echo its src value
echo $img->attributes->getNamedItem( "src" )->nodeValue;
Results: http://codepad.org/oMXGK9Iu
Ideally you would ensure the image element exists before accessing item #0. Likewise, you would ensure the attribute exists before grabbing it.
Further reading: http://www.php.net/manual/en/class.domdocument.php
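The existence checks mentioned above could be sketched as follows. This assumes the same fragment as in the question; the fallback text 'No source' is just a placeholder.

```php
<?php
$domDoc = new DOMDocument();
$domDoc->loadHTML('<p><img alt="" src="images/1.jpg" style="width: 2450px; height: 1054px;" /></p>');

// item(0) returns null when the document contains no <img> at all.
$img = $domDoc->getElementsByTagName('img')->item(0);

// Only read the attribute when both the element and the attribute exist.
if ($img !== null && $img->hasAttribute('src')) {
    $src = $img->getAttribute('src');
} else {
    $src = null;
}

echo $src ?? 'No source', PHP_EOL;
```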
If you just want to grab that particular portion of the text, you could use a simple regular expression:
// Prep our html
$html = '<p><img src="images/1.jpg" style="width: 2450px;" /></p>';
// Look for the source string
preg_match( '/src=\".*?\"/', $html, $matches );
// If we found it, spit it out.
echo $matches ? $matches[0] : "No source";
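If only the value is needed as well, a capture group can separate it from the attribute name. A minimal sketch, assuming the attribute is always written with double quotes:

```php
<?php
$html = '<p><img alt="" src="images/1.jpg" style="width: 2450px; height: 1054px;" /></p>';

// Group 0 is the whole src="..." text; group 1 is only the value between the quotes.
if (preg_match('/\bsrc\s*=\s*"([^"]*)"/i', $html, $matches)) {
    echo $matches[0], PHP_EOL; // the full attribute
    echo $matches[1], PHP_EOL; // only the value
} else {
    echo 'No source', PHP_EOL;
}
```

Matching [^"]* instead of .*? keeps the match from running past the closing quote of the attribute.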
If alt="" is empty by default and the style is always width: 2450px; height: 1054px;, you could use:
<?php
$str = '<p><img alt="" src="images/1.jpg" style="width: 2450px; height: 1054px;" /></p>';
$str = str_replace('<p><img alt="" src="','', $str);
$str = str_replace('" style="width: 2450px; height: 1054px;" /></p>','',$str);
echo $str; //Outputs: images/1.jpg
?>