DOMXPath query check for div if it exists - php

I have this code:
$html = '<div class="container">A<div class="wrapper">B</div>C</div>';
$dom = new DOMDocument;
@$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$links = $xp->query('//div[contains(@class,"container")]');
I want the DOMXPath query to select the <div> element with class="container", but to select only <div class="wrapper"></div> when that element exists. In other words: select <div class="container"> when <div class="wrapper"> doesn't exist, and select only <div class="wrapper"> when it does.
Thanks in advance.

First, you can count all <div class="wrapper"> elements via:
$wrapper = $xp->query('//div[contains(@class,"wrapper")]')->length;
If this returns int(0), no element with the wrapper class was found. With this information, we can easily modify your code to something like this:
$html = '<div class="container">A<div class="wrapper">B</div>C</div>';
$dom = new DOMDocument;
@$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$wrapper = $xp->query('//div[contains(@class,"wrapper")]');
if ($wrapper->length == 0) {
    // wrapper class NOT FOUND, now we can select container class
    $links = $xp->query('//div[contains(@class,"container")]');
}
else {
    // 1 or MORE wrapper class FOUND, do something with your .wrapper class
}
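If it helps, the same fallback can be wrapped in a small function. This is only a sketch: selectWrapperOrContainer is a made-up name (not part of DOMXPath), and it assumes that a wrapper found anywhere in the document should win over the container.

```php
<?php
// Hypothetical helper: returns the wrapper divs if any exist,
// otherwise falls back to the container divs.
function selectWrapperOrContainer(DOMXPath $xp): DOMNodeList
{
    $wrappers = $xp->query('//div[contains(@class,"wrapper")]');
    if ($wrappers->length > 0) {
        return $wrappers;
    }
    return $xp->query('//div[contains(@class,"container")]');
}

$html = '<div class="container">A<div class="wrapper">B</div>C</div>';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

$nodes = selectWrapperOrContainer($xp);
foreach ($nodes as $node) {
    // Print the class and text of whichever element was selected
    echo $node->getAttribute('class'), ': ', $node->textContent, "\n";
}
```

With the sample markup from the question the wrapper exists, so only that div comes back. Remove the inner div from $html and the container is returned instead.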

Related

XPath extract attribute from <div> in PHP

I want to extract an attribute from a <div> and display its value.
<div class="b-text-4xl b-text-btc-first b-font-bold btcecc-animated liveup livedown" data-price="38696.15125182" data-live-price="bitcoin" data-rate="1" data-currency="USD" data-timeout="1610051644181"><span>38,696.15</span> <b class="fiat-symbol">$</b></div>
I need the value of "data-price".
The location of the full html is at https://www.btc-echo.de/kurs/bitcoin/
I tried this:
$url = "https://www.btc-echo.de/kurs/bitcoin/";
libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->loadHTML(utf8_encode(file_get_contents($url)));
$xpath = new DOMXpath($doc);
foreach ($xpath->query('*[@id="main"]/div[1]/div[3]/div[2]/div[1]/div/div/div[1]/div/@data-price') as $textNode) {
    echo $textNode->nodeValue;
}
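One possible approach, sketched against the snippet above rather than the live page: instead of a long positional path, select the div by one of its own attributes and read data-price from it. The assumption here (not confirmed by the question) is that data-live-price="bitcoin" identifies the right div; the class list in the sample is shortened for readability.

```php
<?php
// Sample markup abbreviated from the question; the live page may differ.
$html = '<div class="b-text-4xl" data-price="38696.15125182" data-live-price="bitcoin" data-rate="1" data-currency="USD"><span>38,696.15</span> <b class="fiat-symbol">$</b></div>';

libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Assumption: the price div is the one carrying data-live-price="bitcoin".
$prices = [];
foreach ($xpath->query('//div[@data-live-price="bitcoin"]/@data-price') as $attr) {
    // Each result is an attribute node; nodeValue holds its string value
    $prices[] = $attr->nodeValue;
}
echo implode("\n", $prices), "\n";
```

A leading // searches the whole document, so the query does not depend on how deeply the div is nested.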

Getting all elements between html tag in php

I referred to this question.
But I want to iterate over and get all the elements inside the html tag.
This is what I did:
$homepage = file_get_contents('http://www.example.com');
$homepage then contains the following:
<html>
<body>
<div class = "alpha">hey</div>
<div class = "beta">one</div>
<div class = "beta">two</div>
</body>
</html>
Here I need to get all the elements with the class beta.
How can I do this?
Here's the code that I have tried so far:
$dom = new DOMDocument();
$dom->loadHTML($homepage);
foreach($dom->getAllElements as $element ){
if(!$element->hasClass('beta')){
echo $element;
}
}
But it says DOMDocument::loadHTML(): Tag nav invalid in Entity.
Try this:
<?php
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML("<html>
<body>
<div class = 'alpha'>hey</div>
<div class = 'beta'>one</div>
<div class = 'beta'>two</div>
</body>
</html>");
libxml_clear_errors();
$classname = "beta";
$finder = new DOMXPath($dom);
$spaner = $finder->query("//*[contains(@class, '$classname')]");
foreach ($spaner as $element) {
    print_r($element);
}
?>
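One caveat with contains(@class, ...): it is a substring test, so a query for beta would also match a class such as beta-old. A common workaround, sketched below, pads the class list with spaces and looks for the whole token. The extra sample divs are made up for illustration.

```php
<?php
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML('<html><body>
<div class="alpha">hey</div>
<div class="beta">one</div>
<div class="beta extra">two</div>
<div class="beta-old">three</div>
</body></html>');
libxml_clear_errors();

$classname = 'beta';
$finder = new DOMXPath($dom);

// Pad the normalized class list with spaces so that only a whole
// class token equal to $classname can match.
$query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]";

$texts = [];
foreach ($finder->query($query) as $element) {
    $texts[] = $element->textContent;
}
print_r($texts);
```

Here the beta-old div is skipped, while both divs that carry beta as a separate class are kept.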

PHP DOMDocument: Delete elements by class

I'm trying to delete every node with a given class.
To find the elements I use:
$xpath = new DOMXPath($dom);
foreach( $xpath->query('//div[contains(attribute::class, "foo")]') as $e ) {
// Delete this node
}
But how can I delete the elements in this foreach-loop?
Edit: By the way, how can I check first whether there is an element with the class "foo" in the DOM (before starting the loop)?
Update:
This is my HTML:
<div class="main">
<div class="delete_this" contenteditable="true">Target</div>
<div class="class1"></div>
<div class="content"><p>Anything</p></div>
</div>
This doesn't work for the example above:
$xpath = new DOMXPath($dom);
foreach( $xpath->query('//div[contains(attribute::class, "delete_this")]') as $e ) {
$e->parentNode->removeChild($e);
}
You need to use the removeChild() method of the parent element:
$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::class, "foo")]') as $e ) {
// Delete this node
$e->parentNode->removeChild($e);
}
Btw, about your second question: if no elements are found, the loop won't iterate at all.
Here is a working example:
$html = <<<EOF
<div class="main">
<div class="delete_this" contenteditable="true">Target</div>
<div class="class1"></div>
<div class="content"><p>Anything</p></div>
</div>
EOF;
$doc = new DOMDocument();
$doc->loadHTML($html);
$selector = new DOMXPath($doc);
foreach($selector->query('//div[contains(attribute::class, "delete_this")]') as $e ) {
$e->parentNode->removeChild($e);
}
echo $doc->saveHTML($doc->documentElement);
For the second part of the question, the result of the query has a length property which you can use to see if anything was matched:
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[contains(attribute::class, "foo")]');
printf('Removing %d nodes', $nodes->length);
The examples above remove only div elements with that class.
To remove elements of any type with that class, use * instead of div:
$selector = new \DOMXPath( $doc );
foreach ( $selector->query( '//*[contains(attribute::class, "' . $class . '")]' ) as $e ) {
$e->parentNode->removeChild( $e );
}
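The removal loop and the length check can be combined into one function. A minimal sketch, assuming $class contains no double quotes (it is concatenated straight into the XPath string); removeByClass is a hypothetical name, not a DOM method.

```php
<?php
// Hypothetical helper: removes every element whose class attribute
// contains $class and returns how many nodes were removed.
function removeByClass(DOMDocument $doc, string $class): int
{
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//*[contains(attribute::class, "' . $class . '")]');
    $count = $nodes->length; // 0 means nothing matched, the loop is skipped
    foreach ($nodes as $e) {
        $e->parentNode->removeChild($e);
    }
    return $count;
}

$doc = new DOMDocument();
$doc->loadHTML('<div class="main"><div class="delete_this">Target</div><p class="delete_this">Also</p><div class="content"><p>Anything</p></div></div>');

$removed = removeByClass($doc, 'delete_this');
printf("Removed %d nodes\n", $removed);
echo $doc->saveHTML($doc->documentElement), "\n";
```

Both the div and the p are removed here because the query uses * rather than div.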

Iterate through elements with DOMDocument & DOMXPath

I am trying to iterate through every child element of the containing div:
$html = ' <div id="roothtml">
<h1>
Introduction</h1>
<p>text</p>
<h2>
text</h2>
<p>
test</p>
</div>';
And I have this PHP:
$dom = new DOMDocument();
$dom->loadHTML($html);
$dom->preserveWhitespace = false;
$xpath = new DOMXPath($dom);
$els = $xpath->query("/div");
print_r($els);
All I get though is DOMNodeList Object ( )
Having looked at the IBM tutorial I should be getting an array. What is it I am doing wrong?
Any help is appreciated.
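You're using the wrong query string. /div only matches a div that is the root element of the document, and loadHTML() wraps your markup in html and body elements, so nothing is found. You should be using //div.
Iterate over the list like this: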
$els = $xpath->query("//div");
foreach( $els as $el) {
echo $el->textContent;
}
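If the goal is to visit each child element of the containing div rather than the div itself, one option is to append /* to the query. This is a sketch using a compacted copy of the markup from the question and assuming the div keeps the id roothtml.

```php
<?php
$html = '<div id="roothtml"><h1>Introduction</h1><p>text</p><h2>text</h2><p>test</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// "/*" after the div selects only its direct child elements,
// so whitespace text nodes are not part of the result.
$names = [];
foreach ($xpath->query('//div[@id="roothtml"]/*') as $child) {
    $names[] = $child->nodeName;
    echo $child->nodeName, ': ', trim($child->textContent), "\n";
}
```

The result is still a DOMNodeList, not an array, but foreach works on it directly.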

PHP: Fetch content from a html page using xpath()

I'm trying to fetch the content of a div in a html page using xpath and domdocument. This is the structure of the page:
<div id="content">
<div class="div1"></div>
<span class="span1"></span>
<p></p>
<p></p>
<p></p>
<p></p>
<p></p>
<div class="div2"></div>
</div>
I want to get only the content of the p elements, not the spans and divs. I came up with this XPath expression, .//*[@id='content']/p, but something's not right because I'm getting only the first p. I tried other expressions using following-sibling and node(), but they all return only the first p.
.//*[@id='content']/span/following-sibling::p
.//*[@id='content']/node()[self::p]
This is how I used XPath:
$domDocument=new DOMDocument();
$domDocument->encoding = 'UFT8';
$domDocument->loadHTML($page);
$domXPath = new DOMXPath($domDocument);
$domNodeList = $domXPath->query($this->xpath);
$content = $this->GetHTMLFromDom($domNodeList);
And this is how I get the HTML from the nodes:
private function GetHTMLFromDom($domNodeList){
$domDocument = new DOMDocument();
$node = $domNodeList->item(0);
foreach($node->childNodes as $childNode)
$domDocument->appendChild($domDocument->importNode($childNode, true));
return $domDocument->saveHTML();
}
This XPath expression:
//div[@id='content']/p
results in the wanted node set (five p elements).
EDIT: Now it's clear what your problem is. You need to iterate over the NodeList:
private function GetHTMLFromDom($domNodeList){
    $domDocument = new DOMDocument();
    // Import every matched node, not just the first one
    foreach ($domNodeList as $node) {
        $domDocument->appendChild($domDocument->importNode($node, true));
    }
    return $domDocument->saveHTML();
}
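As an alternative to importing the nodes into a second document, saveHTML() also accepts a node argument and serializes just that node. A minimal standalone sketch, with a shortened copy of the page structure and made-up paragraph text; this is a different technique from the importNode approach above, not a correction of it.

```php
<?php
// Shortened version of the structure from the question; the text is illustrative.
$page = '<div id="content"><div class="div1"></div><span class="span1"></span><p>one</p><p>two</p><p>three</p><div class="div2"></div></div>';

$domDocument = new DOMDocument();
$domDocument->loadHTML($page);
$domXPath = new DOMXPath($domDocument);

$htmlOut = '';
foreach ($domXPath->query("//div[@id='content']/p") as $p) {
    // Passing a node to saveHTML() outputs only that node's markup
    $htmlOut .= $domDocument->saveHTML($p);
}
echo $htmlOut, "\n";
```

Only the p elements end up in $htmlOut; the span and the two inner divs are left out because the query never selects them.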
