preg_match() get the source link of image using regex

I want to extract the image link from HTML content with the preg_match() function.
I tried the following, but I am not getting the correct source link:
$data = '<div class="poster">
<div class="pic">
<img class="xfieldimage img" src="https://bobtor.com/uploads/posts/2019-01/1546950927_mv5bnji5yta2mtetztmzny00odc5lwfimzctnme2owqwnwnkywm1xkeyxkfqcgdeqxvyntm3mdmymdq._v1_-1.jpg" alt="Song of Back and Neck 2018" title="Song of Back and Neck 2018">
</div>
</div>';
preg_match("'<img class=\"xfieldimage img\" src=\"(.*?)\" alt=\"(.*?)\" title=\"(.*?)\" />'si", $data, $movie_poster);
print_r($movie_poster);
It's not working.

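Regex is a poor fit for parsing HTML (see the well-known "self-contained tags" answer). Your pattern also expects the tag to end with " />", while the tag in $data ends with ">", so it can never match. Use DOMDocument with XPath instead: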
$dom = new DOMDocument();
$dom->loadHTML($data);
$xpath = new DOMXPath($dom);
$image = $xpath->query("//img[@class='xfieldimage img']")->item(0);
echo $image->getAttribute("src");
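The snippet above assumes the image is always present. A minimal sketch of a more defensive variant follows; the shortened example.com URL is a hypothetical stand-in for the real poster link, and the libxml error handling is an optional extra rather than part of the original answer.

```php
<?php
// Sample markup modelled on the question; the URL is a hypothetical placeholder.
$data = '<div class="poster"><div class="pic">'
      . '<img class="xfieldimage img" src="https://example.com/poster.jpg" alt="A" title="A">'
      . '</div></div>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($data);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$image = $xpath->query("//img[@class='xfieldimage img']")->item(0);

// item(0) returns null when nothing matched, so guard before calling a method on it.
$src = $image instanceof DOMElement ? $image->getAttribute('src') : null;
echo $src ?? 'no matching image', PHP_EOL;
```

Note that @class='xfieldimage img' compares the whole attribute string, so it only matches when the classes appear exactly in that order.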

Related

PHP dom parser: How to get element count only if it comes after another element?

I'm trying to get a count of how many images are on an HTML page sprinkled throughout an article but I do not want to count the image if it comes before the text of the article begins. The problem is the classes are exactly the same, so I can't use that to help me, and not every article is even going to start with an image. So the HTML might look like this:
<img class="image-asset" src="image.jpg">
<p>First line</p>
<p>Second line</p>
<img class="image-asset" src="second_image.jpg">
<p>Third line</p>
<img class="image-asset" src="third_image.jpg">
In this instance, I want to only count the second and third images. Here's my code, which is successfully counting every image at the moment:
$photoCount = count($html->find('div.image-asset'));

I believe you are looking for something along these lines:
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$target = $xpath->query('//img[preceding-sibling::p]');
echo count($target), PHP_EOL;
// and, just to be on the safe side, print each matched node:
foreach ($target as $t) {
    echo $t->ownerDocument->saveHTML($t), PHP_EOL;
}
Output:
2
<img class="image-asset" src="second_image.jpg">
<img class="image-asset" src="third_image.jpg">
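The preceding-sibling axis only works when the images and paragraphs share the same parent. If the images might be wrapped in other elements, one possible variation is the preceding axis, combined with evaluate() to get the count directly. This is a sketch under that assumption; the wrapping <div> elements are hypothetical and not part of the question's markup.

```php
<?php
// Hypothetical markup: same images as in the question, but each wrapped in a <div>.
$html = '<div><img class="image-asset" src="image.jpg"></div>'
      . '<p>First line</p>'
      . '<div><img class="image-asset" src="second_image.jpg"></div>'
      . '<p>Third line</p>'
      . '<div><img class="image-asset" src="third_image.jpg"></div>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// preceding::p matches any <p> that ends before the image starts, at any depth.
// evaluate() returns the XPath number as a float, hence the cast.
$photoCount = (int) $xpath->evaluate('count(//img[preceding::p])');
echo $photoCount, PHP_EOL;
```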

Replace all images in HTML with text

I am trying to replace all images in some HTML which meet specific requirements with the appropriate text. The specific requirements are that they are of class "replaceMe" and the image src filename is in $myArray. Upon searching for solutions, it appears that some sort of PHP DOM technique is appropriate, however, I am very new with this. For instance, given $html, I wish to return $desired_html. At the bottom of this post is my attempted implementation which currently doesn't work. Thank you
$myArray=array(
'goodImgage1'=>'Replacement for Good Image 1',
'goodImgage2'=>'Replacement for Good Image 2'
);
$html = '<div>
<p>Random text and an <img src="goodImgage1.png" alt="" class="replaceMe">. More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="replaceMe">. More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="dontReplaceMe">. More random text.</p>
<p>Random text and an <img src="badImgage1.png" alt="" class="replaceMe">. More random text.</p>
</div>';
$desiredHtml = '<div>
<p>Random text and an Replacement for Good Image 1. More random text.</p>
<p>Random text and an Replacement for Good Image 2. More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="dontReplaceMe">. More random text.</p>
<p>Random text and an <img src="badImgage1.png" alt="" class="replaceMe">. More random text.</p>
</div>';
Below is what I am attempting to do.
libxml_use_internal_errors(true); // Temporarily disable errors resulting from improperly formed HTML
$doc = new DOMDocument();
$doc->loadHTML($html);
//What does this do for me?
$imgs= $doc->getElementsByTagName('img');
foreach ($imgs as $img){}
$xpath = new DOMXPath($doc);
foreach( $xpath->query( '//img') as $img) {
if(true){ //How do I check class and image name?
$new = $doc->createTextNode("New Attribute");
$img->parentNode->replaceChild($new,$img);
}
}
$html=$doc->saveHTML();
libxml_use_internal_errors(false);

Do it like this; you were on the right track:
$myArray=array(
'goodImgage1.png'=>'Replacement for Good Image 1',
'goodImgage2.png'=>'Replacement for Good Image 2'
);
$html = '<div>
<p>Random text and an <img src="goodImgage1.png" alt="" class="replaceMe">. More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="replaceMe">. More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="dontReplaceMe">. More random text.</p>
<p>Random text and an <img src="badImgage1.png" alt="" class="replaceMe">. More random text.</p>
</div>';
$classesToReplace = array('replaceMe');
libxml_use_internal_errors(true); // Temporarily disable errors resulting from improperly formed HTML
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//img') as $img) {
    // split the class attribute into an array of the classes assigned to the element
    $classes = explode(' ', $img->getAttribute('class'));
    $classMatches = array_intersect($classes, $classesToReplace);
    // the src value is used as-is, because the $myArray keys are full file names
    $imageName = $img->getAttribute('src');
    if (isset($myArray[$imageName]) && $classMatches) {
        $new = $doc->createTextNode($myArray[$imageName]);
        $img->parentNode->replaceChild($new, $img);
    }
}
$html = $doc->saveHTML();
var_dump($html);
Please note the following:
I made the code check for images that have the replaceMe class, potentially in addition to other classes
I added the full image file names to your $myArray keys, basically for simplicity.
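The class check can also be pushed into the XPath expression itself. The sketch below uses the common XPath 1.0 idiom for matching one class token among several; the file names and the extra "thumb" class are made up for illustration.

```php
<?php
// Hypothetical markup: one image with a single class, one with two, one that must not match.
$html = '<p><img src="a.png" class="replaceMe"> '
      . '<img src="b.png" class="thumb replaceMe"> '
      . '<img src="c.png" class="dontReplaceMe"></p>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Pad the class list with spaces so that only the whole token "replaceMe" matches.
$query = "//img[contains(concat(' ', normalize-space(@class), ' '), ' replaceMe ')]";

$matched = [];
foreach ($xpath->query($query) as $img) {
    $matched[] = $img->getAttribute('src');
}
print_r($matched);
```

With this query the loop body only needs the $myArray lookup, since every node it receives already carries the class.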

likeitlikeit was faster, but I'll post my answer anyway because it differs in some details: XPath selects only the <img> elements with the appropriate class attribute, and pathinfo() extracts the file name without its extension.
$doc = new DOMDocument();
$doc->loadHTML($h); // assume HTML in $h
$xpath = new DOMXPath($doc);
$imgs = $xpath->query("//img[@class = 'replaceMe']");
foreach ($imgs as $img) {
$imgfile = pathinfo($img->getAttribute("src"),PATHINFO_FILENAME);
if (array_key_exists($imgfile, $myArray)) {
$replacement = $doc->createTextNode($myArray[$imgfile]);
$img->parentNode->replaceChild($replacement, $img);
}
}
echo "<pre>" . htmlentities($doc->saveHTML()) . "</pre>";
see it working: http://codepad.viper-7.com/11XZt7
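To illustrate the pathinfo() step in isolation, here is a small sketch. The "images/" directory prefix is a hypothetical addition showing that the directory part is dropped along with the extension.

```php
<?php
$myArray = array(
    'goodImgage1' => 'Replacement for Good Image 1',
    'goodImgage2' => 'Replacement for Good Image 2',
);

// Hypothetical src value with a directory part.
$src = 'images/goodImgage1.png';

// PATHINFO_FILENAME strips both the directory and the extension.
$key = pathinfo($src, PATHINFO_FILENAME);
$replacement = array_key_exists($key, $myArray) ? $myArray[$key] : null;

echo $key, ' => ', $replacement, PHP_EOL;
```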

preg_match move selection above paragraph

I want to move images above their container paragraphs in a large body of text using preg_replace.
So, I might have
$body = '<p><img src="a" alt="image"></p><img src="b" alt="image"><p>something here<img src="c" alt="image"> text</p>';
What I want (apart from the 40' yacht etc etc):
<img src="a" alt="image"><p></p><img src="b" alt="image"><img src="c" alt="image"><p>something here text</p>
I've got this, which isn't working:
$body = preg_replace('/(<p>.*\s*)(<img.*\s*?image">)(.*\s*?<\/p>)/', '$2$1$3',$body);
It results in:
<img src="c" alt="image"><p><img src="a" alt="image"></p><img src="b" alt="image"><p>something here text</p>

You should load the HTML with DOMDocument and use its operations to move nodes around:
$content = <<<EOM
<p><img src="a" alt="image"></p>
<img src="b" alt="image"><p>something here<img src="c" alt="image"> text</p>
EOM;
$doc = new DOMDocument;
$doc->loadHTML($content);
$xp = new DOMXPath($doc);
// find images that are a direct child of a paragraph
foreach ($xp->query('//p/img') as $img) {
    $parent = $img->parentNode;
    // move the image so that it becomes the previous sibling of its parent
    $parent->parentNode->insertBefore($img, $parent);
}
echo $doc->saveHTML();
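saveHTML() without arguments prints the whole document, including the doctype and the <html> and <body> wrappers that loadHTML() adds. If only the rearranged fragment is wanted, one possible approach is to serialize the children of <body> one by one. This is a sketch, not part of the original answer.

```php
<?php
$content = '<p><img src="a" alt="image"></p><img src="b" alt="image">'
         . '<p>something here<img src="c" alt="image"> text</p>';

$doc = new DOMDocument;
$doc->loadHTML($content);
$xp = new DOMXPath($doc);

// Move every image that is a direct child of a paragraph in front of that paragraph.
foreach ($xp->query('//p/img') as $img) {
    $parent = $img->parentNode;
    $parent->parentNode->insertBefore($img, $parent);
}

// Serialize only the nodes inside <body>, skipping the wrappers added by loadHTML().
$body = $doc->getElementsByTagName('body')->item(0);
$fragment = '';
foreach ($body->childNodes as $child) {
    $fragment .= $doc->saveHTML($child);
}
echo $fragment, PHP_EOL;
```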

How to get img tag value inside a specific div and specific anchor tag using regular expression

I am new to regular expressions. I have tried many times to get the src value of an image tag inside an anchor tag.
This is my HTML:
<div class="smallSku" id="ctl00_ContentPlaceHolder1_smallImages">
<a title="" name="http://www.playg.in/productImages/med/PNC000051_PNC000051.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051.jpg" onclick="return showPic(this)" onmouseover="return showPic(this)">
<img border="0" alt="" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051.jpg"></a> <a title="PNC000051_PNC000051_1.jpg" name="http://www.playg.in/productImages/med/PNC000051_PNC000051_1.jpg" href="http://www.playg.in/productImages/lrg/PNC000051_PNC000051_1.jpg" onclick="return showPic(this)" onmouseover="return showPic(this)">
<img border="0" alt="PNC000051_PNC000051_1.jpg" src="http://www.playg.in/productImages/thmb/PNC000051_PNC000051_1.jpg"></a>
</div>
I want to return only the src value of each image tag. I tried a pattern with preg_match_all(), and the pattern was:
"#<div[\s\S]class="smallSku"[\s\S]id="ctl00_ContentPlaceHolder1_smallImages"\><a title=\"\" name="[\w\W]" href="[\w\W]" onclick=\"[\w\W]" onmouseover="[\w\W]"\><img[\s\S]src="(.*)"[\s\S]></a><\/div>#"
Please help. I have tried this many times, and I also tried the approach from this question: Match image tag not nested in an anchor tag using regular expression

Regular expressions are not the right tool for parsing HTML. See this FAQ: How to parse and process HTML/XML?
Here is an example of how to get the src attribute using your HTML:
$doc = new DOMDocument();
$doc->loadHTML($your_html_string);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//div[@class="smallSku"]/a/img/@src') as $attr) {
    $src = $attr->value;
    print $src;
}
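A self-contained sketch of the same idea that collects the values into an array. Selecting the container by its id instead of its class is a variation on the answer, and the shortened relative paths are hypothetical stand-ins for the real image URLs.

```php
<?php
// Trimmed-down version of the question's markup; the paths are hypothetical.
$html = '<div class="smallSku" id="ctl00_ContentPlaceHolder1_smallImages">'
      . '<a href="lrg/one.jpg"><img border="0" alt="" src="thmb/one.jpg"></a> '
      . '<a href="lrg/two.jpg"><img border="0" alt="" src="thmb/two.jpg"></a>'
      . '</div>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$sources = [];
// The trailing /@src selects the attribute nodes themselves, not the <img> elements.
foreach ($xpath->query('//div[@id="ctl00_ContentPlaceHolder1_smallImages"]/a/img/@src') as $attr) {
    $sources[] = $attr->value;
}
print_r($sources);
```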

Try this, sunith:
$content = file_get_contents('your url');
preg_match_all("|<div class='items'>.*</div>|", $content, $arr, PREG_PATTERN_ORDER);
preg_match_all("/src='([^']+)'/", $arr[0][0], $arrr, PREG_PATTERN_ORDER);
echo '<pre>';
print_r($arrr);

How to read the <strong> text and the link url using DOMdocument?

I have this html:
<a href=" URL TO KEEP" class="class_to_check">
<strong> TEXT TO KEEP</strong>
</a>
I have a long HTML document with many links like the one above. I need to keep only the links that contain a <strong> element, and from each of those I need the href of the link and the text inside the <strong>. How can I do this using DOMDocument?
Thank you!

$html = "...";
$dom = new DOMDocument();
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$a = $xp->query('//a')->item(0);
$href = $a->getAttribute('href');
$strong = $a->nodeValue;
Of course, this XPath stuff works for just this particular html snippet. You'll have to adjust it to work with a more fully populated HTML tree.
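One way to adjust it for a page with many links is sketched below: the predicate [strong] keeps only anchors that have a <strong> child, and a relative query reads the text from that child. The example.com URLs and the second link are hypothetical test data.

```php
<?php
// Hypothetical markup: one link with a <strong> child and one without.
$html = '<a href="https://example.com/keep" class="class_to_check"><strong> TEXT TO KEEP</strong></a>'
      . '<a href="https://example.com/skip" class="class_to_check">no strong here</a>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

$links = [];
// //a[strong] selects only the anchors that have a <strong> element as a direct child.
foreach ($xp->query('//a[strong]') as $a) {
    // Query relative to the current anchor to read the text of its <strong> child.
    $strong = $xp->query('strong', $a)->item(0);
    $links[] = [
        'href' => trim($a->getAttribute('href')),
        'text' => trim($strong->textContent),
    ];
}
print_r($links);
```

If the class matters as well, the query can be narrowed to //a[@class="class_to_check"][strong].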
