Find <p> that contains image(s) and change style of that <p> - php

I'm parsing WordPress post HTML through PHP. I want all images to be centered. That alone is easy enough, but I also want images on the same line to be centered together. To do this, I need to apply the attribute class="image-content" to the <p> block that contains them.
How do I do this with PHP?
This is the HTML that WordPress provides for the post:
<p>Single line paragraph.</p>
<p>
<a href="image.png">
<img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="150" height="150" />
</a>
</p>
<p>
Multi line paragraph which is a multi line paragraph
which is a multi line paragraph which is a multi line
paragraph which is a multi line paragraph which is a
multi line paragraph which is a multi line paragraph
which is a multi line paragraph which is a multi line
paragraph which is a multi line paragraph which is a
multi line paragraph which is a multi line paragraph
which is a multi line paragraph which is a multi line
paragraph which is a multi line paragraph which is a
multi line paragraph which is a multi line paragraph.
</p>
<p>
<a href="image.png">
<img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="150" height="150" />
</a>
<a href="image.png">
<img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="300" height="300" />
</a>
</p>
<p>End of post.</p>

You can do this with DOMDocument, XPath and a simple replacement. This first version adds the class to every img inside a p:
$parse = new \DOMDocument();
$parse->loadHTML($html);
$xpath = new \DOMXpath($parse);
$images = $xpath->query('//p//img');
$re = "/(.*)/";
$subst = "$1 image-content";
foreach ($images as $image) {
$class = preg_replace($re, $subst, $image->getAttribute('class'), 1);
$image->setAttribute('class',$class);
}
$htmlFinal = $parse->saveHTML();
EDIT
If you want to attach the class to the containing p Element, you can use it like this:
$parse = new \DOMDocument();
$parse->loadHTML($html);
$xpath = new \DOMXpath($parse);
$ps = $xpath->query('//p');
foreach ($ps as $p) {
if ($p->getElementsByTagName('img')->length > 0) $p->setAttribute('class', 'image-content');
}
$htmlFinal = $parse->saveHTML();
If the p tags may already have a class before you parse the DOM, combine the two examples so that the new class is appended to the existing value instead of overwriting it.
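One way to combine the two examples is sketched below. It assumes the post HTML is a fragment without its own <html> or <body>, and the function name addClassToImageParagraphs is made up for illustration. It selects only the p elements that contain an img, appends the class while keeping any existing ones, and returns just the children of the body element that loadHTML() wraps around the fragment.

```php
<?php
// Hypothetical helper: appends $class to every <p> that contains an <img>.
function addClassToImageParagraphs(string $html, string $class = 'image-content'): string
{
    if (trim($html) === '') {
        return $html; // nothing to parse
    }

    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // The predicate [.//img] keeps only paragraphs with an <img> descendant.
    foreach ($xpath->query('//p[.//img]') as $p) {
        $classes = preg_split('/\s+/', $p->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
        if (!in_array($class, $classes, true)) {
            $classes[] = $class;
        }
        $p->setAttribute('class', implode(' ', $classes));
    }

    // loadHTML() wraps the fragment in <html><body>; return only the body's children.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo addClassToImageParagraphs('<p class="intro"><a href="image.png"><img src="image.png"></a></p>');
```

Filtering in the XPath expression avoids the extra getElementsByTagName() call per paragraph, and checking for the class first keeps it from being added twice if the function runs again on its own output.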

Related

Remove <p> tags around images

I have a string:
$str = '<p>line</p>
<p><img src="images/01.jpg">line with image</p>
<p><img src="images/02.jpg">line with image</p>';
and want to turn it into:
$str = '<p>line</p>
<img src="images/01.jpg"><p>line with image</p>
<img src="images/02.jpg"><p>line with image</p>';
I tried
$result = preg_replace('%(.*?)<p>\s*(<img[^<]+?)\s*</p>(.*)%is', '$1$2$3', $str);
but it's only removing one image not the second one. Please suggest a regex.
This will remove the <p> tag from around each img, using the Simple HTML DOM parser (str_get_html):
$html = str_get_html('<p>line</p>
<p><img src="images/01.jpg">line with image</p>
<p><img src="images/02.jpg">line with image</p>');
foreach($html->find('img') as $img) {
$str ="<p>".$img->parent()->plaintext."</p>";
$img->parent()->outertext=$img;
$img->parent()->outertext .=$str;
}
echo $html;
Output:
<p>line</p>
<img src="images/01.jpg">
line with image
<img src="images/02.jpg">
line with image
I think I found the solution. These two regexes together solve my problem:
$str = '<p>line</p>
<p><img src="images/01.jpg">line with image</p>
<p>line with image<img src="images/02.jpg"></p>';
$str = preg_replace('/<p>(<img[^>]*>)/', '$1<p>', $str);
$str = preg_replace('/(<img[^>]*>)<\/p>/', '</p>$1', $str);
echo $str;
Output:
<p>line</p>
<img src="images/01.jpg"><p>line with image</p>
<p>line with image</p><img src="images/02.jpg">
Thanks a lot, everybody, and especially #bobblebobble.
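For comparison, here is a minimal sketch of the same idea using PHP's built-in DOMDocument instead of regexes. It assumes the input is an HTML fragment; the function name moveImagesOutOfParagraphs is hypothetical. Unlike the two regexes above, it always places the image before its paragraph, wherever the image sat inside it.

```php
<?php
// Hypothetical helper: moves every <img> that is a direct child of a <p>
// so that it becomes the sibling immediately before that <p>.
function moveImagesOutOfParagraphs(string $html): string
{
    if (trim($html) === '') {
        return $html;
    }

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // XPath results are a snapshot, so moving nodes inside the loop is safe.
    foreach ($xpath->query('//p/img') as $img) {
        $p = $img->parentNode;
        // insertBefore() moves a node that is already part of the document.
        $p->parentNode->insertBefore($img, $p);
    }

    // Return only the children of the <body> that loadHTML() added.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo moveImagesOutOfParagraphs('<p>line</p><p><img src="images/01.jpg">line with image</p>');
```

Whitespace between block elements in the result may differ from the input, because the markup is re-serialized by the parser.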

preg_replace regex to remove stray end tag

I have a string containing different types of html tags and stuff, including some <img> elements. I am trying to wrap those <img> elements inside a <figure> tag. So far so good using a preg_replace like this:
preg_replace( '/(<img.*?>)/s','<figure>$1</figure>',$content);
However, if the <img> tag has a neighboring <figcaption> tag, the result is rather ugly and produces a stray end tag for the figure element:
<figure id="attachment_9615">
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text"></figure>Caption title here</figcaption>
</figure>
I've tried a whole bunch of preg_replace regex variations to wrap both the img-tag and figcaption-tag inside figure, but can't seem to make it work.
My latest try:
preg_replace( '/(<img.*?>)(<figcaption .*>*.<\/figcaption>)?/s',
'<figure">$1$2</figure>',
$content);
As others pointed out, better use a parser, i.e. DOMDocument instead. The following code wraps a <figure> tag around each img where the next sibling is a <figcaption>:
<?php
$html = <<<EOF
<html>
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text">Caption title here</figcaption>
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text">Caption title here</figcaption>
</html>
EOF;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
# get all images
$imgs = $xpath->query("//img");
foreach ($imgs as $img) {
# find the next sibling, skipping whitespace-only text nodes
$next = $img->nextSibling;
while ($next instanceof DOMText && trim($next->nodeValue) === '') {
$next = $next->nextSibling;
}
if ($next instanceof DOMElement && $next->tagName == 'figcaption') {
# create a new figure tag and insert it right before $img
$figure = $dom->createElement('figure');
$img->parentNode->insertBefore($figure, $img);
# appendChild() moves existing nodes, so no cloning or removal is needed
$figure->appendChild($img);
$figure->appendChild($next);
}
}
$dom->formatOutput=true;
echo $dom->saveHTML();
See a demo on ideone.com.
To have a <figure> tag around all your images, you might want to add an else branch:
} else {
$figure = $dom->createElement('figure');
$figure->appendChild($img->cloneNode(true));
$img->parentNode->insertBefore($figure, $img);
$img->parentNode->removeChild($img);
}
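Putting both branches together, a self-contained version might look like the sketch below. The function name wrapImagesInFigures is invented for this example, and it assumes the content is an HTML fragment. It additionally skips images that already sit inside a figure, which is the situation in the question's markup, and it tolerates whitespace between the img and its figcaption.

```php
<?php
// Hypothetical helper: wraps each <img> in a <figure>, pulling in a directly
// following <figcaption>, and leaves images that already have a <figure> alone.
function wrapImagesInFigures(string $html): string
{
    if (trim($html) === '') {
        return $html;
    }

    $dom = new DOMDocument();
    // The parser may report <figure> and <figcaption> as unknown tags; collect
    // those warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//img[not(ancestor::figure)]') as $img) {
        // Look for the next sibling, ignoring whitespace-only text nodes.
        $next = $img->nextSibling;
        while ($next instanceof DOMText && trim($next->nodeValue) === '') {
            $next = $next->nextSibling;
        }

        $figure = $dom->createElement('figure');
        $img->parentNode->insertBefore($figure, $img);
        $figure->appendChild($img); // moves the image into the figure
        if ($next instanceof DOMElement && strtolower($next->tagName) === 'figcaption') {
            $figure->appendChild($next); // moves the caption as well
        }
    }

    // Return only the children of the <body> that loadHTML() added.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo wrapImagesInFigures('<img src="pic.png" alt="name"> <figcaption>Caption title here</figcaption>');
```

Because the XPath result is a snapshot rather than a live list, moving nodes during the loop does not disturb the iteration.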

DOMDocument with php

I want to replace all images in my HTML, but the code replaces one image, skips the next, and so on.
I use DOMDocument to replace the images in my content with the code below. For example, with images 1, 2, 3 and 4, the code replaces the first and third and skips the second and fourth.
$dom = new \DOMDocument();
$dom->loadHTML("data");
$dom->preserveWhiteSpace = true;
$count = 1;
$images = $dom->getElementsByTagName('img');
foreach ($images as $img) {
$src = $img->getAttribute('src');
$newsrc = $dom->createElement("newimg");
$newsrc->nodeValue = $src;
$newsrc->setAttribute("id","qw".$count);
$img->parentNode->replaceChild($newsrc, $img);
$count++;
}
$html = $dom->saveHTML();
return $html;
the html code is
<p><img class="img-responsive" src="http://www.jarofquotes.com/img/quotes/86444b28aa86d706e33246b823045270.jpg" alt="" width="600" height="455" /></p>
<p> </p>
<p>some text</p>
<p> </p>
<p><img class="img-responsive" src="http://40.media.tumblr.com/c0bc20fd255cc18dca150640a25e13ef/tumblr_nammr75ACv1taqt2oo1_500.jpg" alt="" width="480" height="477" /></p>
<p> </p>
<p><span class="marker"><img class="img-responsive" src="http://wiselygreen.com/wp-content/uploads/green-living-coach-icon.png" alt="" width="250" height="250" /><br /><br /></span></p>
I want the output HTML to have every image replaced with:
<newimg>Src </newimg>
Ok, I couldn't find a dupe suitable for PHP, so I am answering this one.
The issue you are facing is that the NodeLists returned by getElementsByTagName() are live lists. That means that when you call replaceChild(), you are altering the NodeList you are currently iterating.
Let's assume we have this HTML:
$html = <<< HTML
<html>
<body>
<img src="1.jpg"/>
<img src="2.jpg"/>
<img src="3.jpg"/>
</body>
</html>
HTML;
Now let's load it into a DOMDocument and get the img elements:
$dom = new DOMDocument;
$dom->loadHTML($html);
$allImages = $dom->getElementsByTagName('img');
echo $allImages->length, PHP_EOL;
This will print 3 because there are three img elements in the DOM right now.
Let's replace the first img element with a p element:
$allImages->item(0)->parentNode->replaceChild(
$dom->createElement("p"),
$allImages->item(0)
);
echo $allImages->length, PHP_EOL;
This now prints 2 because only two img elements are left. In effect:
item 0: img will be removed from the list
item 1: img will become item 0
item 2: img will become item 1
You are using foreach, so you first replace item 0 and then move on to item 1. But because the list is live, the element that was at index 1 has already shifted to index 0, and index 1 now holds what was originally item 2. The original item 1 is therefore skipped.
To get around this, use a while loop and always replace the first element:
while ($allImages->length > 0) {
$allImages->item(0)->parentNode->replaceChild(
$dom->createElement("p"),
$allImages->item(0)
);
}
This will then catch all the img elements.
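An alternative to the while loop is to copy the live list into a plain array before looping, so that replacements no longer affect the iteration. The sketch below applies this to the question's newimg replacement; the function name replaceImagesWithNewimg is invented for the example, while the newimg element and the qw id prefix come from the question.

```php
<?php
// Hypothetical helper: replaces every <img> with <newimg id="qwN">src</newimg>.
function replaceImagesWithNewimg(string $html): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $count = 1;
    // iterator_to_array() takes a snapshot of the live DOMNodeList.
    $images = iterator_to_array($dom->getElementsByTagName('img'));
    foreach ($images as $img) {
        $new = $dom->createElement('newimg');
        // A text node escapes characters such as & in the URL correctly.
        $new->appendChild($dom->createTextNode($img->getAttribute('src')));
        $new->setAttribute('id', 'qw' . $count);
        $img->parentNode->replaceChild($new, $img);
        $count++;
    }

    return $dom->saveHTML();
}

echo replaceImagesWithNewimg('<p><img src="1.jpg"></p><p><img src="2.jpg"></p>');
```

The snapshot keeps the foreach structure and the running counter from the question, at the cost of one extra array.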

Counting img tags line by line

I'm trying to count all <img> tags inside a string, line by line, but can't figure it out.
I already split the string line by line and then try to count the <img> tags in each line.
Example :
$string = 'some text <img src="" /> some text <img src="" /> some text <img src="" /> some text
some text <img src="" /> some text <img src="" /> some text <img src="" /> some text';
This is my code so far. First, I split the string line by line:
$array = explode("\n", $string);
Then I want to count how many <img> tags there are in the first line of $string:
$first_line = $array['0'];
I was using preg_match() to match the img tags:
$img_line = preg_match("#<img.+>#U", $array['0']);
echo count($img_line);
This doesn't work for me: there are 3 <img src=""> tags per line in $string, but my code gives me only 1.
Any hints or tips are highly appreciated.
If you explode a line on '<img ', the number of pieces is one more than the number of tags, so subtract one to get the count:
$explode = explode('<img ', $array[0]);
echo count($explode) - 1;
Got it..
After splitting the string per line.
$first_line = $array['0'];
$match = preg_match_all("#<img.+>#U", $first_line, $matches);
print_r($matches);
echo count($matches['0']);
the code above will return this..
Array
(
[0] => Array
(
[0] =>
[1] =>
[2] =>
)
)
3
You can try the following code:
<?php
$string = <<<TXT
some text <img src="" /> some text <img src="" /> some text <img src="" /> some text
some text <img src="" /> some text <img src="" /> some text <img src="" /> some text
TXT;
$lines = explode("\n", $string);
// For each line
$count = array_map(function ($v) {
// If one or more img tags are found
if (preg_match_all('#<img [^>]*>#i', $v, $matches, PREG_SET_ORDER)) {
// We return the count of tags.
return count($matches);
}
// No img tag on this line
return 0;
}, $lines);
/*
Array
(
[0] => 3 // Line 1
[1] => 3 // Line 2
)
*/
print_r($count);
Here, PREG_SET_ORDER orders the results so that $matches[0] holds the first match (with its captures), $matches[1] the second match, and so on. Thus count($matches) is the number of matches on the line.
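Since preg_match_all() itself returns the number of full matches, the per-line count can also be written without inspecting $matches at all. A minimal sketch, where the function name countImgTagsPerLine is made up for the example:

```php
<?php
// Hypothetical helper: returns one <img> count per line of $text.
function countImgTagsPerLine(string $text): array
{
    // \R matches any line-break sequence (\n, \r\n or \r).
    $lines = preg_split('/\R/', $text);

    return array_map(static function (string $line): int {
        // preg_match_all() returns the number of full pattern matches.
        return (int) preg_match_all('/<img\b[^>]*>/i', $line);
    }, $lines);
}

$text = "a <img src=\"\" /> b <img src=\"\" />\nno image here\n<img src=\"x.png\">";
print_r(countImgTagsPerLine($text)); // counts are 2, 0 and 1
```

Lines without any image yield 0 rather than null, which keeps the result usable with functions such as array_sum().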
<?php
$string = 'some text <img src="" /> some text <img src="" /> some text <img src="" /> some text \n
some text <img src="" /> some text `<img src="" /> some text <img src="" /> some text ';
$count = preg_match_all("/<img/is", $string, $matches);
echo $count;
?>

Strip tags, but keep the first one

How can I keep for example the first img tag but strip all the others?
(from a HTML string)
example:
<p>
some text
<img src="aimage.jpg" alt="desc" width="320" height="200" />
<img src="aimagethatneedstoberemoved.jpg" ... />
</p>
so it should be just:
<p>
some text
<img src="aimage.jpg" alt="desc" width="320" height="200" />
</p>
The function from this example can be used to keep the first N IMG tags, and removes all the other <img>s.
// Function to keep first $nrimg IMG tags in $str, and strip all the other <img>s
// From: http://coursesweb.net/php-mysql/
function keepNrImgs($nrimg, $str) {
// gets an array with all <img> tags from $str
if(preg_match_all('/(\<img[^\>]+\>)/i', $str, $mt)) {
// gets array with the <img>s that must be stripped ($nrimg+), and removes them
$remove_img = array_slice($mt[1], $nrimg);
$str = str_ireplace($remove_img, '', $str);
}
return $str;
}
// Test, keeps the first two IMG tags in $str
$str = 'First img: <img src="img1.jpg" alt="img 1" width="30" />, second image: <img src="img_2.jpg" alt="img 2" width="30">, another Img tag <img src="img3.jpg" alt="img 3" width="30" />, etc.';
$str = keepNrImgs(2, $str);
echo $str;
/* Output:
First img: <img src="img1.jpg" alt="img 1" width="30" />, second image: <img src="img_2.jpg" alt="img 2" width="30">, another Img tag , etc.
*/
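You might be able to accomplish this with a complex regex, but my suggestion would be to use preg_replace_callback, particularly if you are on PHP 5.3+, where you can pass a closure. http://www.php.net/manual/en/function.preg-replace-callback.php
$tagTracking = array();
// '#' is used as the delimiter so the '/' in '/>' needs no escaping; $str is the HTML string
$result = preg_replace_callback('#<[^<]+?(>|/>)#', function($match) use(&$tagTracking) {
// your code to track tags here, and apply as you desire.
// $tagTracking is captured by reference so that updates persist between calls
return $match[0];
}, $str);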
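As a concrete illustration of the callback approach, the sketch below counts img tags while replacing and drops every tag after the first $keep. The function name keepFirstImages and its parameters are hypothetical. Counting during the replacement also means that a later tag which is textually identical to a kept one is still removed, without touching the kept copy.

```php
<?php
// Hypothetical helper: keeps the first $keep <img> tags and strips the rest.
function keepFirstImages(string $html, int $keep = 1): string
{
    $seen = 0;

    $result = preg_replace_callback(
        '/<img\b[^>]*>/i',
        // $seen is captured by reference so the count persists across calls.
        function (array $match) use (&$seen, $keep): string {
            $seen++;
            return $seen <= $keep ? $match[0] : '';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

echo keepFirstImages('<p>some text <img src="aimage.jpg" alt="desc"><img src="other.jpg"></p>');
```

Like the other regex answers, this treats the HTML as plain text, so an img tag inside a comment or script would be counted too.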
