I have some HTML snippets retrieved through PHP/JSON such as:
<div>
<p>Some Text</p>
<img src="example.jpg" />
<img src="example2.jpg" />
<img src="example3.jpg" />
</div>
I am loading it with DOMDocument() and xpath and would like to be able to manipulate it so I can add lazy loading to the images like so:
<div>
<p>Some Text</p>
<img class="lazy" src="blank.gif" data-src="example.jpg" />
<img class="lazy" src="blank.gif" data-src="example2.jpg" />
<img class="lazy" src="blank.gif" data-src="example3.jpg" />
</div>
Which entails:
Add class .lazy
Add data-src attribute from original src attribute
Modify src attribute to blank.gif
I am trying the following:
foreach ($xpath->query("//img") as $node) {
$node->setAttribute( "class", $node->getAttribute("class")." lazy");
$node->setAttribute( "data-src", $node->getAttribute("src"));
$node->setAttribute( "src", "./inc/image/blank.gif");
}
but it isn't working.
Are you sure? The following works for me.
<?php
$html = <<<EOQ
<div>
<p>Some Text</p>
<img src="example.jpg" />
<img src="example2.jpg" />
<img src="example3.jpg" />
</div>
EOQ;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//img') as $node) {
$node->setAttribute('class', $node->getAttribute('class') . ' lazy');
$node->setAttribute( "data-src", $node->getAttribute("src"));
$node->setAttribute( "src", "./inc/image/blank.gif");
}
echo $dom->saveHTML();
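One optional refinement, offered as a sketch rather than a correction: the loop above leaves a leading space in the class attribute when the image had no class, and running it a second time would overwrite data-src with the placeholder. The helper below guards against both. The function name addLazyLoading() and the $placeholder parameter are my own assumptions for illustration, and the LIBXML flags assume the snippet has a single root element, as in the question.

```php
<?php
// Hypothetical helper: the name and the $placeholder default are assumptions.
function addLazyLoading(string $html, string $placeholder = 'blank.gif'): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    // These flags stop libxml from adding a doctype and <html>/<body> wrappers.
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//img[@src]') as $img) {
        // Split the existing class attribute into tokens, dropping empty ones.
        $classes = preg_split('/\s+/', $img->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
        if (in_array('lazy', $classes, true)) {
            continue; // already processed, leave data-src alone
        }
        $classes[] = 'lazy';
        $img->setAttribute('class', implode(' ', $classes));
        $img->setAttribute('data-src', $img->getAttribute('src'));
        $img->setAttribute('src', $placeholder);
    }
    return $dom->saveHTML();
}

echo addLazyLoading('<div><p>Some Text</p><img src="example.jpg" /></div>');
```

Calling the function on its own output should leave the images untouched, because the lazy class acts as the "already done" marker.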
This is the content:
<div class="image">
<img src="https://www.gravatar.com/avatar/" alt="test" width="50" height="50">
</div>
I want to use preg_replace to add a data-mfp-src attribute (taking its value from the src attribute), so that the final code looks like this:
<div class="image">
<img src="https://www.gravatar.com/avatar/" data-mfp-src="https://www.gravatar.com/avatar/" alt="test" width="50" height="50">
</div>
This is my code and it works without any issues, but I want to use preg_replace for some specific reasons:
function lazyload_images( $content ){
$content = mb_convert_encoding($content, 'HTML-ENTITIES', "UTF-8");
$dom = new DOMDocument;
libxml_use_internal_errors(true);
@$dom->loadHTML($content);
libxml_use_internal_errors(false);
$xpath = new DOMXPath($dom);
foreach ($xpath->evaluate('//div[img]') as $paragraphWithImage) {
//$paragraphWithImage->setAttribute('class', 'test');
foreach ($paragraphWithImage->getElementsByTagName('img') as $image) {
$image->setAttribute('data-mfp-src', $image->getAttribute('src'));
$image->removeAttribute('src');
}
};
return preg_replace('~<(?:!DOCTYPE|/?(?:html|head|body))[^>]*>\s*~i', '', $dom->saveHTML($dom->documentElement));
}
For a robust way to isolate the src value and copy it into the new attribute, I urge you to avoid regex. It can be done, but the snippet below won't break if more classes are added to the <div> or if the <img> attributes are reordered.
Code:
$html = <<<HTML
<div class="image">
<img src="https://www.gravatar.com/avatar/" alt="test" width="50" height="50">
</div>
HTML;
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
// using a loop in case there are multiple occurrences
foreach ($xpath->query("//div[contains(@class, 'image')]/img") as $node) {
$node->setAttribute('data-mfp-src', $node->getAttribute('src'));
}
echo $dom->saveHTML();
Output:
<div class="image">
<img src="https://www.gravatar.com/avatar/" alt="test" width="50" height="50" data-mfp-src="https://www.gravatar.com/avatar/">
</div>
Resources:
http://php.net/manual/en/domelement.setattribute.php
http://php.net/manual/en/domelement.getattribute.php
Just to show you what the regex might look like...
Find: ~<img src="([^"]*)"~
Replace: <img src="$1" data-mfp-src="$1"
Demo: https://regex101.com/r/lXIoFw/1 but again, I don't recommend it because it could silently let you down in the future.
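To make the "silently let you down" point concrete, here is a small sketch of that find/replace pair run through preg_replace. The sample strings are mine and only meant as an illustration: the first call works, the second has the attributes in a different order and comes back unchanged without any error.

```php
<?php
$html = '<div class="image"><img src="https://www.gravatar.com/avatar/" alt="test" width="50" height="50"></div>';

// Works only while src is the first attribute and is double-quoted.
$result = preg_replace('~<img src="([^"]*)"~', '<img src="$1" data-mfp-src="$1"', $html);
echo $result, "\n";

// Same image with the attributes reordered: the pattern no longer matches,
// so the input is returned as-is and nothing signals the failure.
$reordered = '<img alt="test" src="https://www.gravatar.com/avatar/">';
$unchanged = preg_replace('~<img src="([^"]*)"~', '<img src="$1" data-mfp-src="$1"', $reordered);
var_dump($unchanged === $reordered); // bool(true)
```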
I have an example:
<a href="http://test.html" class="watermark" target="_blank">
<img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
</a>
I am using preg_replace to change the class value of the a tag and the src of the img tag:
$content = preg_replace('#<a(.*?)href="([^"]*/)?(([^"/]*)\.[^"]*)"([^>]*?)><img(.*?)src="([^"]*/)?(([^"/]*)\.[^"]*)"([^>]*?)></a>#', '<img$1src="http://test.html/uploads/2013/10/10_new.jpg">', $content);
How can I get this result?
<a href="http://test.html" class="fancybox" target="_blank">
<img width="399" height="4652" src="http://test.html/uploads/2013/10/10_new.jpg" class="aligncenter size-full wp-image-78360">
</a>
Regex, as is mentioned many times daily here on SO, is not the best tool for HTML manipulation - luckily we have the DOMDocument object!
If you're supplied with just that string you can make the changes like so:
$orig = ' <a href="http://test.html" class="watermark" target="_blank">
<img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
</a>';
$doc = new DOMDocument();
$doc->loadHTML($orig);
$anchor = $doc->getElementsByTagName('a')->item(0);
if($anchor->getAttribute('class') == 'watermark')
{
$anchor->setAttribute('class','fancybox');
$img = $anchor->getElementsByTagName('img')->item(0);
$currSrc = $img->getAttribute('src');
$img->setAttribute('src',preg_replace('/(\.[^\.]+)$/','_new$1',$currSrc));
}
$newStr = $doc->saveHTML($anchor);
Otherwise, if you're working with the full HTML source of a document:
$orig = '<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title></title>
</head>
<body>
<a href="http://test.html" class="watermark" target="_blank">
<img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
</a>
<span>random</span>
<a href="http://test.html" class="watermark" target="_blank">
<img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
</a>
<a href="#foobar" class="gary">
<img src="/imgs/yay.png" />
</a>
</body>
</html>';
$doc = new DOMDocument();
$doc->loadHTML($orig);
$anchors = $doc->getElementsByTagName('a');
foreach($anchors as $anchor)
{
if($anchor->getAttribute('class') == 'watermark')
{
$anchor->setAttribute('class','fancybox');
$img = $anchor->getElementsByTagName('img')->item(0);
$currSrc = $img->getAttribute('src');
$img->setAttribute('src',preg_replace('/(\.[^\.]+)$/','_new$1',$currSrc));
}
}
$newStr = $doc->saveHTML();
As a brain exercise, here is a regex solution as well, since that was the original question and DOMDocument can sometimes feel like an excessive amount of code (though it is still preferable):
$newStr = preg_replace('#<a(.+?)class="watermark"(.+?)<img(.+?)src="(.+?)(\.[^.]+?)"(.*?>.*?</a>)#s','<a$1class="fancybox"$2<img$3src="$4_new$5"$6',$orig);
Don't parse HTML with regex.
Find all links in html that have watermark class, change class to fancybox, and update first child image src.
$dom = new DOMDocument;
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//a[contains(@class, "watermark")]') as $a) {
$a->setAttribute('class', 'fancybox');
$img = $xpath->query('descendant::img', $a)->item(0);
# old value = $img->getAttribute('src');
$img->setAttribute('src', 'new_value');
}
echo $dom->saveHTML();
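A possible variation, as a sketch under my own assumptions: contains() on the class attribute also matches values such as "watermarked", and 'new_value' above is only a placeholder. The hypothetical helper swapWatermark() below matches "watermark" as a whole class token and derives the new src by inserting "_new" before the file extension, which is the rule implied by the question's example.

```php
<?php
// Hypothetical helper; the name is an assumption, the "_new" rule follows the question.
function swapWatermark(string $html): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Pad the class list with spaces so "watermark" only matches as a full token.
    $query = '//a[contains(concat(" ", normalize-space(@class), " "), " watermark ")]';
    foreach ($xpath->query($query) as $a) {
        $a->setAttribute('class', 'fancybox');
        $img = $xpath->query('descendant::img', $a)->item(0);
        if ($img === null) {
            continue; // link without an image
        }
        // Insert "_new" before the extension: 10.jpg becomes 10_new.jpg.
        $src = preg_replace('/(\.[^.\/]+)$/', '_new$1', $img->getAttribute('src'));
        $img->setAttribute('src', $src);
    }
    return $dom->saveHTML();
}

echo swapWatermark('<a href="http://test.html" class="watermark" target="_blank"><img src="http://test.html/uploads/2013/10/10.jpg"></a>');
```

Because loadHTML() is called without flags here, the output includes the doctype and the html/body wrappers.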
I have a page, image.php,
where the images are kept in a container like the one below. Note: there are other images outside the container div too; I only want the images from the container div.
<!DOCTYPE html>
<head>
<title>Image Holder</title>
</head>
<body>
<header>
<img src="http://examepl.com/logo.png">
<div id="side">
<div id="facebook"><img src="http://examepl.com/fb.png"></div>
<div id="twiiter"><img src="http://examepl.com/t.png"></div>
<div id="gplus"><img src="http://examepl.com/gp.png"></div>
</div>
</header>
<div class="container">
<p>SOme Post</p>
<img src="http://examepl.com/some.png" title="some image" />
<p>SOme Post</p>
<img src="http://examepl.com/some.png" title="some image" />
<p>SOme Post</p>
<img src="http://examepl.com/some.png" title="some image" />
</div>
<footer>
<div id="foot">
copyright © 2013
</div>
</footer>
</body>
</html>
I am trying to fetch only those images from my image.php file with preg_match_all, but it returns boolean(false) :(
My PHP code:
<?php
$file = file_get_contents("image.php");
preg_match_all("/<div class=\"container\">(.*?)</div>/", $file, $match);
preg_match_all("/<img src=\"(.*?)\">/", $match, $images);
var_dump($images);
?>
Both files are in the root folder, and now I am getting a blank page :(
Any help would be great.
Thanks
I think this will work for you. Use the link below to test your regex.
preg_match_all("/<div class=\"container\">(.*?)<\/div>/s", $file, $match); // "s" lets . match the newlines inside the div
preg_match_all("/<img .*?(?=src)src=\"([^\"]+)\"/", $match[1][0], $images);
http://www.phpliveregex.com
You're better off not using regex for this. PHP provides a nice DOM API for the job. Consider code like the following:
$html = <<< EOF
<div class="container">
<p>SOme Post</p>
<img src="http://examepl.com/some1.png" title="some image" />
<p>SOme Post</p>
<img src="http://examepl.com/some2.png" title="some image" />
<p>SOme Post</p>
<img src="http://examepl.com/some3.png" title="some image" />
</div>
EOF;
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html); // loads your html
$xpath = new DOMXPath($doc);
$nodelist = $xpath->query("//div[@class='container']/img");
$img = array();
for($i=0; $i < $nodelist->length; $i++) {
$node = $nodelist->item($i);
$img[] = $node->getAttribute('src');
}
print_r($img);
OUTPUT:
Array
(
[0] => http://examepl.com/some1.png
[1] => http://examepl.com/some2.png
[2] => http://examepl.com/some3.png
)
Live Demo: http://ideone.com/iBhVMF
You can easily obtain what you want with an XPath query:
$url = 'http://examepl.com/image.php';
$doc = new DOMDocument();
@$doc->loadHTMLFile($url);
$xpath = new DOMXPath($doc);
$srcs = $xpath->query("//div[@class='container']//img/attribute::src");
foreach ($srcs as $src) {
echo '<br/>' . $src->value;
}
preg_match_all("/<img src=\"(.*?)\">/", $match, $images);
replace with
preg_match_all("/<img src=\"(.*?)\"/", $match, $images); // stripped ">" char
My code below retrieves a series of images from a site's search results, along with the corresponding age data. It works, but I get a list of images followed by a list of the values from the age field:
img img img img age age age age and so on.
How do I combine these so I can display them in sets: img age img age img age?
<?php
error_reporting(-1);
$html = new DOMDocument();
@$html->loadHtmlFile('http://www.site.com/searchresults.html');
$xpath = new DOMXPath( $html );
$nodelist = $xpath->query( "//div[@class='age']" );
$tags = $html->getElementsByTagName('img');
foreach ($tags as $tag) {
$image = $tag->getAttribute('src');
echo '<img src='. $image .' alt="image" ><br>';
}
foreach ($nodelist as $n)
{
echo $n->nodeValue."<br>";
}
?>
Sample page. I want to extract the img src and the title data from <div class="age" title="30 usa">:
<div id="sr-15763292" class="search-result">
<div class="thumb-wrapper">
<a class="bioLink" href="http://www.site.com/user/" title="View user"><img src="http://www.site.com/img/15763292.jpg" class="thumb" alt="user" width="140" height="105"></a>
<p class="status"><a href="http://www.site.com/user/" >Online</a></p>
</div>
<div class="rating">
<div class="rating-stars rating4"></div>
</div>
<div class="age" title="30 usa">
<p>30</p>
<p class="gender m">m</p>
<p>USA</p>
</div>
<div>
<p class="headline">Hello there.</p>
</div>
</div>
It's hard to answer if we don't know what the HTML looks like! Assuming it looks something like this
<div class="age"><p>21</p>
<img src="a.jpg" />
</div>
<div class="age"><p>51</p>
<img src="b.jpg" />
</div>
you need to find each div and then find the image inside each div. getElementsByTagName() will give you a list even if there's only one result, so use item() to fetch the first.
error_reporting(-1);
$html = new DOMDocument();
@$html->loadHtmlFile('results.html');
$xpath = new DOMXPath( $html );
$nodelist = $xpath->query( "//div[@class='age']" );
foreach ($nodelist as $node) {
$tags = $node->getElementsByTagName('img');
$image = $tags->item(0)->getAttribute('src');
echo '<img src="'. $image .'" alt="image" ><br>';
echo $node->textContent . '<br>';
}
If the HTML is like this
<div class="age"><p>21</p></div><img src="a.jpg" />
you can try
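$node->nextSibling
Note that nextSibling is a property, not a method, and that it may return a whitespace text node if there is whitespace between the tags.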
As a general point, trace through the HTML and ask yourself: how do I get from A to B? Forwards? Backwards? Up to the parent, across to the next node, and down again?
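For the sample page in the question, one way to apply that idea is to start from each search-result container and look up the image and the age div relative to it, so each pair stays together. This is only a sketch: the HTML is trimmed down from the question's sample, and the $pairs array is my own naming.

```php
<?php
$html = <<<HTML
<div id="sr-15763292" class="search-result">
  <div class="thumb-wrapper">
    <a class="bioLink" href="http://www.site.com/user/"><img src="http://www.site.com/img/15763292.jpg" class="thumb" alt="user"></a>
  </div>
  <div class="age" title="30 usa"><p>30</p><p>USA</p></div>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings quiet
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$pairs = [];
// Visit each result container, then query its image and age relative to it.
foreach ($xpath->query("//div[@class='search-result']") as $result) {
    $img = $xpath->query(".//img[@class='thumb']", $result)->item(0);
    $age = $xpath->query(".//div[@class='age']", $result)->item(0);
    if ($img === null || $age === null) {
        continue; // skip incomplete results
    }
    $pairs[] = ['src' => $img->getAttribute('src'), 'age' => $age->getAttribute('title')];
}

// Print image and age together, one set per result.
foreach ($pairs as $pair) {
    echo '<img src="' . htmlspecialchars($pair['src']) . '" alt="image"> '
        . htmlspecialchars($pair['age']) . "<br>\n";
}
```

The leading dot in ".//img" is what keeps each inner query scoped to the current container instead of the whole document.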
Hi,
I have a string filled with HTML content. What I need to do now is replace every image with its alt text.
The html can look like this:
<h1> some h1</h1>
<img src="images/image.jpg" alt="My Alt-text" width="540" height="304" />
<img src="images_2003/basket.jpg" alt="My other alt text" width="540" height="304" />
<h2>some h2</h2>
<img src="images/image2.jpg" alt="My next Alt-text" width="540" height="304" />
<img src="images/image45.jpg" alt="Yet other alt text" width="540" height="304" />
...
What it should be:
<h1> some h1</h1>
My Alt-text
My other alt text
<h2>some h2</h2>
My next Alt-text
Yet other alt text
...
What would be the best way to accomplish this?
Using a DOM parser this is pretty straightforward:
$contents = <<<EOS
<h1> some h1</h1>
<img src="images/image.jpg" alt="My Alt-text" width="540" height="304" />
<img src="images_2003/basket.jpg" alt="My other alt text" width="540" height="304" />
<h2>some h2</h2>
<img src="images/image2.jpg" alt="My next Alt-text" width="540" height="304" />
<img src="images/image45.jpg" alt="Yet other alt text" width="540" height="304" />
EOS;
$doc = new DOMDocument;
libxml_use_internal_errors(true);
$doc->loadHTML($contents);
libxml_clear_errors();
$xp = new DOMXPath($doc);
foreach ($xp->query('//img') as $node) {
$text = $doc->createTextNode($node->getAttribute('alt') . "\n");
$node->parentNode->replaceChild($text, $node);
}
echo $doc->saveHTML();
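Since the input is a fragment, saveHTML() above also prints the doctype and the html/body wrappers that loadHTML() adds. If only the fragment is wanted back, one option is to wrap it in a div and serialize that div's children. This is a sketch: the function name imagesToAltText() and the wrapper trick are my assumptions, and non-ASCII input would additionally need an encoding hint, which is left out here.

```php
<?php
// Hypothetical helper; the name and the wrapper <div> are assumptions for illustration.
function imagesToAltText(string $fragment): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings quiet
    // Wrap the fragment so it has one known root element inside <body>.
    $doc->loadHTML('<div>' . $fragment . '</div>');
    libxml_clear_errors();

    $xp = new DOMXPath($doc);
    foreach ($xp->query('//img') as $img) {
        // Swap each image for a text node holding its alt text.
        $text = $doc->createTextNode($img->getAttribute('alt'));
        $img->parentNode->replaceChild($text, $img);
    }

    // Serialize only the wrapper's children, skipping <html>, <body> and the wrapper.
    $wrapper = $doc->getElementsByTagName('body')->item(0)->firstChild;
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo imagesToAltText('<h1> some h1</h1><img src="images/image.jpg" alt="My Alt-text" width="540" height="304" />');
```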
I think you can use jQuery for this.
Maybe try this code (I haven't tested it):
$("img").each(function() {
var alt_txt = $(this).attr('alt');
$(this).replaceWith(alt_txt);
});
Hope this helps! Bye!
If you need a jQuery solution try this
$(document).ready(function() {
var alt = "";
$( "img" ).each(function( index ) {
alt = $(this).attr("alt");
$(this).replaceWith("<p>" + alt + "</p>"); // modify p with someother tag if needed.
});
});