I want to replace all images in my HTML, but the code replaces one image, skips the next, and so on
I use DOMDocument to replace the images in my content with the code below. The problem is that the code skips images.
For example, with images 1, 2, 3 and 4, the code replaces one and three and skips two and four, and so on.
$dom = new \DOMDocument();
$dom->loadHTML("data");
$dom->preserveWhiteSpace = true;
$count = 1;
$images = $dom->getElementsByTagName('img');
foreach ($images as $img) {
$src = $img->getAttribute('src');
$newsrc = $dom->createElement("newimg");
$newsrc->nodeValue = $src;
$newsrc->setAttribute("id","qw".$count);
$img->parentNode->replaceChild($newsrc, $img);
$count++;
}
$html = $dom->saveHTML();
return $html;
The HTML is:
<p><img class="img-responsive" src="http://www.jarofquotes.com/img/quotes/86444b28aa86d706e33246b823045270.jpg" alt="" width="600" height="455" /></p>
<p> </p>
<p>some text</p>
<p> </p>
<p><img class="img-responsive" src="http://40.media.tumblr.com/c0bc20fd255cc18dca150640a25e13ef/tumblr_nammr75ACv1taqt2oo1_500.jpg" alt="" width="480" height="477" /></p>
<p> </p>
<p><span class="marker"><img class="img-responsive" src="http://wiselygreen.com/wp-content/uploads/green-living-coach-icon.png" alt="" width="250" height="250" /><br /><br /></span></p>
I want the output HTML to have every image replaced with:
<newimg>Src </newimg>
Ok, I couldn't find a dupe suitable for PHP, so I am answering this one.
The issue you are facing is that the NodeLists returned by getElementsByTagName() are live lists. That means that when you call replaceChild(), you are altering the NodeList you are currently iterating over.
Let's assume we have this HTML:
$html = <<< HTML
<html>
<body>
<img src="1.jpg"/>
<img src="2.jpg"/>
<img src="3.jpg"/>
</body>
</html>
HTML;
Now let's load it into a DOMDocument and get the img elements:
$dom = new DOMDocument;
$dom->loadHTML($html);
$allImages = $dom->getElementsByTagName('img');
echo $allImages->length, PHP_EOL;
This will print 3 because there are three img elements in the DOM right now.
Let's replace the first img element with a p element:
$allImages->item(0)->parentNode->replaceChild(
$dom->createElement("p"),
$allImages->item(0)
);
echo $allImages->length, PHP_EOL;
This now gives 2 because there are only two img elements left. In effect:
item 0: img will be removed from the list
item 1: img will become item 0
item 2: img will become item 1
You are using foreach, so you first replace item 0 and then move on to item 1. But because the list is live, the element that used to be item 1 has already shifted down to item 0, and what is now item 1 is the element that used to be item 2. The element you expected next is therefore skipped.
To get around this, use a while loop and always replace the first element:
while ($allImages->length > 0) {
$allImages->item(0)->parentNode->replaceChild(
$dom->createElement("p"),
$allImages->item(0)
);
}
This will then catch all the img elements.
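An alternative to the while loop is to copy the live list into a plain array first and iterate over that copy. The following is only a minimal sketch applied to the question's goal; the sample HTML is made up, and the newimg element and qw ids are taken from the question.

```php
<?php
// Hypothetical sample input with three images.
$html = '<p><img src="1.jpg"></p><p>some text</p>'
      . '<p><img src="2.jpg"></p><p><span><img src="3.jpg"><br></span></p>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Snapshot the live NodeList into a static array before changing the DOM.
$images = iterator_to_array($dom->getElementsByTagName('img'));

$count = 1;
foreach ($images as $img) {
    $new = $dom->createElement('newimg');
    // A text node avoids entity problems with "&" in URLs.
    $new->appendChild($dom->createTextNode($img->getAttribute('src')));
    $new->setAttribute('id', 'qw' . $count++);
    $img->parentNode->replaceChild($new, $img);
}

$result = $dom->saveHTML();
```

Because the array no longer changes when nodes are replaced, a plain foreach visits every image exactly once.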
Related
I have summernote WYSIWYG plugin, Now whenever i add any images it converts the image into
<img data-filename="Untitled-1.png" src="" style="width: 645px;">
Now all I want is to detect the first such tag, get its src value and store it in the db to show it as a featured image.
For example, if there are two img tags with data-filename attributes:
<img data-filename="Untitled-1.png" src="" style="width: 645px;">
<img data-filename="Untitled-2.png" src="" style="width: 645px;">
I want to get the src value of Untitled-1.png only, not that of Untitled-2.png.
Here is what I've tried
preg_match('/(<img .*?>)/', $go, $img_tag);
$feature = $img_tag[0];
Use DOMDocument and DOMXPath to easily target what you want using the HTML structure:
$content = <<<'EOD'
<img data-filename="Untitled-1.png" src="" style="width: 645px;">
<img data-filename="Untitled-2.png" src="" style="width: 645px;">
EOD;
$dom = new DOMDocument;
$dom->loadHTML($content);
$xp = new DOMXPath($dom);
$result = $xp->evaluate('string(//img[@data-filename]/@src)');
# //img             : an img node anywhere in the DOM tree
# [@data-filename]  : predicate, the node must have a data-filename attribute
# /@src             : its src attribute
if (!empty($result))
echo $result, PHP_EOL;
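If you need the element itself rather than just the string, you can select the first match explicitly. This is a small sketch, assuming made-up src values (first.png and second.png) so the result is visible:

```php
<?php
// Hypothetical input: the src values are invented for illustration.
$content = '<img data-filename="Untitled-1.png" src="first.png" style="width: 645px;">'
         . '<img data-filename="Untitled-2.png" src="second.png" style="width: 645px;">';

$dom = new DOMDocument();
$dom->loadHTML($content);
$xp = new DOMXPath($dom);

// (...)[1] picks the first img with a data-filename attribute in document order.
$first = $xp->query('(//img[@data-filename])[1]')->item(0);

// item(0) is null when nothing matched, so guard before reading the attribute.
$src = $first instanceof DOMElement ? $first->getAttribute('src') : null;
```

The null check matters when the content has no uploaded image at all.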
I have a string containing different types of html tags and stuff, including some <img> elements. I am trying to wrap those <img> elements inside a <figure> tag. So far so good using a preg_replace like this:
preg_replace( '/(<img.*?>)/s','<figure>$1</figure>',$content);
However, if the <img> tag has a neighboring <figcaption> tag, the result is rather ugly and produces a stray end tag for the figure element:
<figure id="attachment_9615">
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text"></figure>Caption title here</figcaption>
</figure>
I've tried a whole bunch of preg_replace regex variations to wrap both the img-tag and figcaption-tag inside figure, but can't seem to make it work.
My latest try:
preg_replace( '/(<img.*?>)(<figcaption .*>*.<\/figcaption>)?/s',
'<figure">$1$2</figure>',
$content);
As others pointed out, it is better to use a parser such as DOMDocument instead. The following code wraps a <figure> tag around each img whose next sibling is a <figcaption>:
<?php
$html = <<<EOF
<html>
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text">Caption title here</figcaption>
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text">Caption title here</figcaption>
</html>
EOF;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
# get all images
$imgs = $xpath->query("//img");
foreach ($imgs as $img) {
if ($img->nextSibling instanceof DOMElement && $img->nextSibling->tagName == 'figcaption') {
# create a new figure tag and append the cloned elements
$figure = $dom->createElement('figure');
$figure->appendChild($img->cloneNode(true));
$figure->appendChild($img->nextSibling->cloneNode(true));
# insert the newly generated elements right before $img
$img->parentNode->insertBefore($figure, $img);
# and remove both the figcaption and the image from the DOM
$img->nextSibling->parentNode->removeChild($img->nextSibling);
$img->parentNode->removeChild($img);
}
}
$dom->formatOutput=true;
echo $dom->saveHTML();
See a demo on ideone.com.
To have a <figure> tag around all your images, you might want to add an else branch:
} else {
$figure = $dom->createElement('figure');
$figure->appendChild($img->cloneNode(true));
$img->parentNode->insertBefore($figure, $img);
$img->parentNode->removeChild($img);
}
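The same idea can also be written by moving the existing nodes into the new <figure> instead of cloning and removing them. This is only a sketch under some assumptions: the sample HTML is invented, and whitespace-only text between the img and the figcaption is skipped.

```php
<?php
// Hypothetical sample: one image with a caption, one without.
$html = '<div><img src="a.png"> <figcaption>One</figcaption><img src="b.png"></div>';

libxml_use_internal_errors(true); // figcaption is unknown to the HTML4 parser
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// DOMXPath::query() returns a static list, so changing the DOM in the loop is safe.
foreach ($xpath->query('//img') as $img) {
    $figure = $dom->createElement('figure');
    $img->parentNode->insertBefore($figure, $img);

    // Find the next sibling, ignoring whitespace-only text nodes.
    $next = $img->nextSibling;
    while ($next instanceof DOMText && trim($next->data) === '') {
        $next = $next->nextSibling;
    }

    // appendChild() moves an attached node, so no clone or remove is needed.
    $figure->appendChild($img);
    if ($next instanceof DOMElement && $next->nodeName === 'figcaption') {
        $figure->appendChild($next);
    }
}

$out = $dom->saveHTML();
```

Every image ends up in its own <figure>, and a caption is pulled in only when it directly follows the image.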
I'm parsing Wordpress post HTML through PHP. I want all images to be centered. This alone is easy enough, however, I also want images on the same line to be centered together. In order to do this I need to apply the attribute class="image-content" to the <p> block.
How do I do this with PHP?
This is what the post would look like in the editor:
And this is the HTML that Wordpress provides for this post:
<p>Single line paragraph.</p>
<p>
<a href="image.png">
<img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="150" height="150" />
</a>
</p>
<p>
Multi line paragraph which is a multi line paragraph
which is a multi line paragraph which is a multi line
paragraph which is a multi line paragraph which is a
multi line paragraph which is a multi line paragraph
which is a multi line paragraph which is a multi line
paragraph which is a multi line paragraph which is a
multi line paragraph which is a multi line paragraph
which is a multi line paragraph which is a multi line
paragraph which is a multi line paragraph which is a
multi line paragraph which is a multi line paragraph.
</p>
<p>
<a href="image.png">
<img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="150" height="150" />
</a>
<a href="image.png">
<img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="300" height="300" />
</a>
</p>
<p>End of post.</p>
You can do this with DOMDocument, xpath and a simple replacement.
$parse = new \DOMDocument();
$parse->loadHTML($html);
$xpath = new \DOMXpath($parse);
$images = $xpath->query('//p//img');
$re = "/(.*)/";
$subst = "$1 image-content";
foreach ($images as $image) {
$class = preg_replace($re, $subst, $image->getAttribute('class'), 1);
$image->setAttribute('class',$class);
}
$htmlFinal = $parse->saveHTML();
EDIT
If you want to attach the class to the containing p Element, you can use it like this:
$parse = new \DOMDocument();
$parse->loadHTML($html);
$xpath = new \DOMXpath($parse);
$ps = $xpath->query('//p');
foreach ($ps as $p) {
if ($p->getElementsByTagName('img')->length > 0) $p->setAttribute('class', 'image-content');
}
$htmlFinal = $parse->saveHTML();
If the p tags may already have a class set before parsing the DOM, you should combine those two examples to add the new class instead of overwriting the existing one.
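One possible way to combine the two examples is sketched below. The sample HTML and the "intro" class are made up; the class list is split on whitespace so that image-content is appended only once and existing classes are kept.

```php
<?php
// Hypothetical sample: the second paragraph already has a class.
$html = '<p>Single line paragraph.</p>'
      . '<p class="intro"><a href="i.png"><img src="i.png"></a></p>'
      . '<p><img src="j.png"></p>';

$parse = new DOMDocument();
$parse->loadHTML($html);
$xpath = new DOMXPath($parse);

// Select only the p elements that contain an img somewhere inside.
foreach ($xpath->query('//p[.//img]') as $p) {
    $classes = preg_split('/\s+/', $p->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    if (!in_array('image-content', $classes, true)) {
        $classes[] = 'image-content';
    }
    $p->setAttribute('class', implode(' ', $classes));
}

$htmlFinal = $parse->saveHTML();
```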
In a nutshell, this is what I'm trying to do:
Get all <img> tags from a document
Set a data-src attribute (for lazy loading)
Empty their sources (for lazy loading)
Inject a <noscript> tag after this image
1-3 are fine. I just can't get the created <noscript> tag to be beside the image correctly.
I'm trying with insertBefore but I'm open for suggestions:
// Create a DOMDocument instance
$dom = new DOMDocument;
$dom->formatOutput = true;
$dom->preserveWhiteSpace = false;
// Loads our content as HTML
$dom->loadHTML($content);
// Get all of our img tags
$images = $dom->getElementsByTagName('img');
// How many of them
$len = count($images);
// Loop through all the images in this content
for ($i = 0; $i < $len; $i++) {
// Reference this current image
$image = $images->item($i);
// Create our fallback image before changing this node
$fallback_image = $image->cloneNode();
// Add the src as a data-src attribute instead
$image->setAttribute('data-src', $src);
// Empty the src of this img
$image->setAttribute('src', '');
// Now prepare our <noscript> markup
// E.g <noscript><img src="foobar.jpg" /></noscript>
$noscript = $dom->createElement("noscript");
$noscript->appendChild( $fallback_image );
$image->parentNode->insertBefore( $noscript, $image );
}
return $dom->saveHTML();
Having two images in the page, this is the result (abbreviated for clarity's sake):
Before:
<div>
<img />
<p />
</div>
<p>
<img />
</p>
After:
<div>
<img /> <!-- this should be the fallback wrapped in <noscript> that is missing -->
<p>
<img />
</p>
</div>
<p>
<img /> <!-- nothing happened here -->
</p>
Using $dom->appendChild works but the <noscript> tag should be beside the image and not at the end of the document.
My PHP skills are very rusty so I'd appreciate any clarification or suggestions.
UPDATE
Just realised saveHTML() was also adding <DOCTYPE><html><body> tags, so I've added a preg_replace (until I find a better solution) to take care of removing that.
Also, the output I have pasted before was based on the inspector of Chrome's Developer Tools.
I checked the page source to see what was really going on (and thus found out about the tag).
This is what's really happening:
https://eval.in/114620
<div>
<img /> </noscript> <!-- wha? just a closing noscript tag -->
<p />
</div>
<p>
<img /> <!-- nothing happened here -->
</p>
SOLVED
So this is how I fixed it:
https://eval.in/117959
I think it's a good idea to work with new nodes only after they have been inserted into the DOM:
$noscript = $dom->createElement("noscript");
$noscriptnode = $image->parentNode->insertBefore( $noscript, $image );
// Only now work with noscript by adding its contents etc...
Also, when a node is inserted with insertBefore(), it's a good idea to save the reference it returns.
$noscriptnode = $image->parentNode->insertBefore( $noscript, $image );
And another thing: I was running this code within Wordpress. Some hooks were being run afterwards, which was messing up my markup.
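Putting these points together, the whole loop might look roughly like the sketch below. It is not the code behind the eval.in links: the sample $content is invented, the list of images is copied into an array first (the cloned fallback images would otherwise join the live list), and the <noscript> is placed after the image as the numbered list in the question asks.

```php
<?php
// Hypothetical sample content with two images.
$content = '<div><img src="a.jpg"><p>x</p></div><p><img src="b.jpg"></p>';

$dom = new DOMDocument();
$dom->loadHTML($content);

// Static copy: the clones added below must not show up in the loop.
$images = iterator_to_array($dom->getElementsByTagName('img'));

foreach ($images as $image) {
    // Clone the untouched image to use as the fallback.
    $fallback = $image->cloneNode(true);

    // Move the real source to data-src and empty src.
    $image->setAttribute('data-src', $image->getAttribute('src'));
    $image->setAttribute('src', '');

    // Insert the <noscript> right after the image; a null reference node appends.
    $noscript = $image->parentNode->insertBefore(
        $dom->createElement('noscript'),
        $image->nextSibling
    );

    // Only now fill the inserted node.
    $noscript->appendChild($fallback);
}

$output = $dom->saveHTML();
```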
The code below retrieves a series of images from the search results of a site, along with the corresponding age data. It works fine; however, I get a list of images followed by a list of the information in the age field.
img img img img age age age age and so on.
How do I combine these so I can display them in sets: img age img age img age
<?php
error_reporting(-1);
$html = new DOMDocument();
@$html->loadHTMLFile('http://www.site.com/searchresults.html');
$xpath = new DOMXPath( $html );
$nodelist = $xpath->query( "//div[@class='age']" );
$tags = $html->getElementsByTagName('img');
foreach ($tags as $tag) {
$image = $tag->getAttribute('src');
echo '<img src='. $image .' alt="image" ><br>';
}
foreach ($nodelist as $n)
{
echo $n->nodeValue."<br>";
}
?>
Sample page. I want to extract the img src and the title data from <div class="age" title="30 usa">:
<div id="sr-15763292" class="search-result">
<div class="thumb-wrapper">
<a class="bioLink" href="http://www.site.com/user/" title="View user"><img src="http://www.site.com/img/15763292.jpg" class="thumb" alt="user" width="140" height="105"></a>
<p class="status"><a href="http://www.site.com/user/" >Online</a></p>
</div>
<div class="rating">
<div class="rating-stars rating4"></div>
</div>
<div class="age" title="30 usa">
<p>30</p>
<p class="gender m">m</p>
<p>USA</p>
</div>
<div>
<p class="headline">Hello there.</p>
</div>
</div>
It's hard to answer if we don't know what the HTML looks like! Assuming it looks something like this
<div class="age"><p>21</p>
<img src="a.jpg" />
</div>
<div class="age"><p>51</p>
<img src="b.jpg" />
</div>
you need to find each div and then find the image inside each div. getElementsByTagName() will give you a list even if there's only one result, so use item() to fetch the first.
error_reporting(-1);
$html = new DOMDocument();
@$html->loadHTMLFile('results.html');
$xpath = new DOMXPath( $html );
$nodelist = $xpath->query( "//div[@class='age']" );
foreach ($nodelist as $node) {
$tags = $node->getElementsByTagName('img');
$image = $tags->item(0)->getAttribute('src');
echo '<img src="'. $image .'" alt="image" ><br>';
echo $node->textContent . '<br>';
}
If the HTML is like this
<div class="age"><p>21</p></div><img src="a.jpg" />
you can try
$node->nextSibling
As a general point, trace through the HTML and think: how do I get from A to B? Go forwards? Backwards? Up to the parent, across to the next node and down again?
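For the sample page in the question, where the img and the age div share a search-result wrapper, one way to walk from A to B is to loop over the wrappers and run relative XPath queries inside each one. This is a sketch only; the second result and its values are invented, and the class attribute is assumed to be exactly "search-result".

```php
<?php
// Hypothetical page with two results, modelled on the sample in the question.
$page = '<div id="sr-1" class="search-result">'
      . '<div class="thumb-wrapper"><a href="#"><img src="http://www.site.com/img/1.jpg" class="thumb" alt="user"></a></div>'
      . '<div class="age" title="30 usa"><p>30</p><p>USA</p></div></div>'
      . '<div id="sr-2" class="search-result">'
      . '<div class="thumb-wrapper"><a href="#"><img src="http://www.site.com/img/2.jpg" class="thumb" alt="user"></a></div>'
      . '<div class="age" title="25 uk"><p>25</p><p>UK</p></div></div>';

$html = new DOMDocument();
$html->loadHTML($page);
$xpath = new DOMXPath($html);

$pairs = [];
foreach ($xpath->query("//div[@class='search-result']") as $result) {
    // The leading "." makes each query relative to the current result.
    $src = $xpath->evaluate('string(.//img/@src)', $result);
    $age = $xpath->evaluate("string(.//div[@class='age']/@title)", $result);
    $pairs[] = [$src, $age];
}

foreach ($pairs as [$src, $age]) {
    echo '<img src="' . htmlspecialchars($src) . '" alt="image"><br>';
    echo htmlspecialchars($age) . '<br>';
}
```

Each image is printed together with its own age data, giving img age img age instead of two separate lists.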