How to get all child nodes from DOMDocument? - php

I have the following
$string = '<html><head></head><body><ul id="mainmenu">
<li id="1">Hallo</li>
<li id="2">Welt
<ul>
<li id="3">Sub Hallo</li>
<li id="4">Sub Welt</li>
</ul>
</li>
</ul></body></html>';
$dom = new DOMDocument;
$dom->loadHTML($string);
Now I want to collect all li IDs in one array.
I tried the following:
$all_li_ids = array();
$menu_nodes = $dom->getElementById('mainmenu')->childNodes;
foreach($menu_nodes as $li_node){
if($li_node->nodeName=='li'){
$all_li_ids[]=$li_node->getAttribute('id');
}
}
print_r($all_li_ids);
As you might see, this will print out [1,2]
How do I get all children (the subchildren as well [1,2,3,4])?

In my test, $dom->getElementById('mainmenu') did not return an element. If it works for you, you do not need XPath to find the ul:
$xpath = new DOMXPath($dom);
$ul = $xpath->query("//*[@id='mainmenu']")->item(0);
$all_li_ids = array();
// Find all inner li tags
$menu_nodes = $ul->getElementsByTagName('li');
foreach($menu_nodes as $li_node){
$all_li_ids[]=$li_node->getAttribute('id');
}
print_r($all_li_ids); // ids 1, 2, 3, 4

One way to do it would be to add another foreach loop, i.e.:
foreach($menu_nodes as $node){
if($node->nodeName=='li'){
$all_li_ids[]=$node->getAttribute('id');
// Also collect the li elements nested inside this li
foreach($node->getElementsByTagName('li') as $sub_node){
$all_li_ids[]=$sub_node->getAttribute('id');
}
}
}
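If the nesting can go deeper than two levels, a single XPath query avoids stacking loops. The following is a minimal sketch, assuming the same markup as in the question; the variable name $ids is only illustrative.

```php
<?php
$string = '<html><head></head><body><ul id="mainmenu">
<li id="1">Hallo</li>
<li id="2">Welt
<ul>
<li id="3">Sub Hallo</li>
<li id="4">Sub Welt</li>
</ul>
</li>
</ul></body></html>';

$dom = new DOMDocument;
$dom->loadHTML($string);
$xpath = new DOMXPath($dom);

$ids = array();
// The descendant axis (//) matches li elements at any depth below the ul
foreach ($xpath->query("//ul[@id='mainmenu']//li[@id]") as $li) {
    $ids[] = $li->getAttribute('id');
}
print_r($ids);
```

The query returns nodes in document order, so the IDs come out as 1, 2, 3, 4.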

Related

Change class of all ancestor li elements with PHP DOMDocument

I have nested bullet lists of ul > li > ul > li, etc.
<ul>
<li>Mammals
<ul>
<li>Canine
<ul>
<li>Fox</li>
<li>Wolf</li>
</ul>
</li>
<li>Feline</li>
</ul>
</li>
<li>Fish</li>
</ul>
How can I apply a class to all "li" elements (recursively) which are ancestors of the target element? I have:
<?php
$list = ob_get_clean();
$dom = new DOMDocument;
$dom->loadHTML($list);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//li');
foreach ($nodes as $object) {
$parts = parse_url($object->nodeValue);
parse_str($parts['query'], $query);
if (true) {
//if certain requirements are met, modify the current object
}
//also modify all ancestor li elements
//$object-> ?? ->setAttribute('class', 'current');
}
?>
There are reasons that the target objects must be identified before searching through each ones' ancestors. I just stripped this code down for relevancy.
Work up the chain of parent nodes, altering those which are li nodes:
<?php
$parentNode = $node->parentNode; // ul; $node is the target li ($object in the question)
while ($parentNode = $parentNode->parentNode) {
if ($parentNode->nodeName == 'li') { // loadHTML() yields lowercase element names
$parentNode->setAttribute('class', 'current');
}
}
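The same walk can probably also be expressed with the XPath ancestor axis. This is only a sketch: the markup is a compacted copy of the list in the question, and picking the "Fox" item as the target is a hypothetical stand-in for whatever condition identifies the real target.

```php
<?php
$html = '<ul><li>Mammals<ul><li>Canine<ul><li>Fox</li><li>Wolf</li></ul></li>'
      . '<li>Feline</li></ul></li><li>Fish</li></ul>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Hypothetical target: the li whose first text node is "Fox"
$target = $xpath->query('//li[normalize-space(text()) = "Fox"]')->item(0);

// ancestor::li selects every li above the context node, at any depth
foreach ($xpath->query('ancestor::li', $target) as $ancestor) {
    $ancestor->setAttribute('class', 'current');
}

echo $dom->saveHTML();
```

Here the "Canine" and "Mammals" items receive the class, while the target itself and its siblings are left untouched.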

How to get list sub-item nodes separately using PHP DOM

I saw this tip:
PHP DOM get items from first ul element
But in this case:
<li>First item
<ul>
<li>
First SubItem
</li>
<li>
Second SubItem
</li>
</ul>
</li>
PHP Code:
$DOM = new DOMDocument;
libxml_use_internal_errors(true);
$DOM->loadHTML( $output);
$items = $DOM->getElementsByTagName('ul');
echo '<ul>';
foreach ($items->item(3)->getElementsByTagName('li') as $li) {
var_dump($li);die();
echo '<li>'.$li->nodeValue;
$ul = $li->getElementsByTagName('ul');
echo '<ul>';
echo '--->'.$ul->length.'<br>';
for($u=0;$u<$ul->length;$u++){
foreach ($ul->item($u)->getElementsByTagName('li') as $lii) {
echo '<li>'.$lii->nodeValue.'</li>';
}
}
echo '</ul>';
echo '</li>';
}
echo '</ul>';
The problem is:
From $li->nodeValue I'm getting "First itemFirst SubItemSecond SubItem" for the first node.
I need to get these items (the sub-items) separately.
I'm assuming you just want to retrieve the text values from those <li> tags.
You can greatly simplify the query with DOMXPath as ->query('//li') will fetch all <li> tags in your code snippet.
$DOM = new DOMDocument();
$DOM->loadHTML($output);
$xPath = new DOMXPath($DOM);
if($xpResponse = $xPath->query('//li/text()')) {
echo "<ul>\n";
foreach($xpResponse as $xNode) {
echo "<li>" . trim($xNode->nodeValue) . "</li>\n";
}
echo "</ul>\n";
}
This will simply output (as HTML):
First item
First SubItem
Second SubItem
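One caveat: //li/text() may also match whitespace-only text nodes, such as the line break after a nested </ul>, which would show up as empty items. A hedged variant that filters them out in the query itself, assuming markup like that in the question (the wrapping <ul> and the $texts array are added here for illustration):

```php
<?php
$output = '<ul><li>First item
<ul>
<li>
First SubItem
</li>
<li>
Second SubItem
</li>
</ul>
</li></ul>';

$DOM = new DOMDocument();
$DOM->loadHTML($output);
$xPath = new DOMXPath($DOM);

$texts = array();
// [normalize-space()] keeps only text nodes that contain non-whitespace characters
foreach ($xPath->query('//li/text()[normalize-space()]') as $xNode) {
    $texts[] = trim($xNode->nodeValue);
}
print_r($texts);
```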

DOMXpath & PHP: how to wrap a bunch of <li> inside an <ul>

I have a html-document with this not-so-nice markup, without the 'ul':
<p>Lorem</p>
<p>Ipsum...</p>
<li class='item'>...</li>
<li class='item'>...</li>
<li class='item'>...</li>
<div>...</div>
I am now trying to "grab" all li-elements and wrap them inside an ul-list which I'd like to place in the same spot, using PHP and DOMXPath. I manage to find and "remove" the li-elements:
$elements = $xpath->query('//li[@class="item"]');
$wrapper = $document->createElement('ul');
foreach($elements as $child) {
$wrapper->appendChild($child);
}
Maybe you can get the parentNode of the first <li> and then use the insertBefore method:
$html = <<<HTML
<p>Lorem</p>
<p>Ipsum...</p>
<li class='item'>...</li>
<li class='item'>...</li>
<li class='item'>...</li>
<div>...</div>
HTML;
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$elements = $xpath->query('//li[@class="item"]');
$wrapper = $doc->createElement('ul');
$elements->item(0)->parentNode->insertBefore(
$wrapper, $elements->item(0)
);
foreach($elements as $child) {
$wrapper->appendChild($child);
}
echo $doc->saveHTML();
Here's what you need. You may need to tweak the XPath query for your real HTML.
$document = new DOMDocument;
// We don't want to bother with white spaces
$document->preserveWhiteSpace = false;
$html = <<<EOT
<p>Lorem</p>
<p>Ipsum...</p>
<li class='item'>...</li>
<li class='item'>...</li>
<li class='item'>last...</li>
<div>...</div>
EOT;
$document->loadHTML($html);
$xpath = new DOMXPath($document);
$elements = $xpath->query('//li[@class="item"]');
// Saves a reference to the Node that is positioned right after our li's
$ref = $xpath->query('//li[@class="item"][last()]')->item(0)->nextSibling;
$wrapper = $document->createElement('ul');
foreach($elements as $child) {
$wrapper->appendChild($child);
}
$ref->parentNode->insertBefore($wrapper, $ref);
echo $document->saveHTML();
Running example: https://repl.it/B3UO/24

PHP - Get links from within an element after element has been found

I have the following code....
<div class="outer">
<div>
<h1>Christmas</h1>
<ul>
<li>Holiday</li>
<li>Fun</li>
<li>Joy</li>
</ul>
<h1>4th July</h1>
<ul>
<li>Fireworks</li>
<li>Happy</li>
<li>Spectral</li>
</ul>
</div>
</div>
<div class="outer">
<div>
<h1>Christmas2</h1>
<ul>
<li>Holiday</li>
<li>Fun</li>
<li>Joy</li>
</ul>
<h1>4th July</h1>
<ul>
<li>Fireworks2</li>
<li>Happy</li>
<li>Spectral</li>
</ul>
</div>
</div>
I already know that I can find the DIV and then look inside the DIV for the elements etc by doing...
$doc->loadHTML($output); //$output being the text above
$xpath = new DOMXpath($doc);
$elements = $xpath->query('//div[@class="outer"]'); //Check outer
I know the three lines above will get the elements within the listed DIV, but what I really want is to get the text of each H1 and then display the li values next to it.
The output I'm looking for is:
Christmas - Holiday, Fun, Joy
4th July - Fireworks, Happy, Spectral
Christmas2 - Holiday, Fun, Joy
4th July2 - Fireworks, Happy, Spectral
Yes you can continue to use xpath to traverse the elements on the header and get its following sibling, the list. Example:
$doc = new DOMDocument();
$doc->loadHTML($output);
$xpath = new DOMXpath($doc);
$elements = $xpath->query('//div[@class="outer"]/div');
if($elements->length > 0) {
foreach($elements as $div) {
foreach ($xpath->query('./h1', $div) as $e) {
$header = $e->nodeValue;
$list = array();
foreach ($xpath->query('./following-sibling::ul[1]/li', $e) as $li) { // [1]: only the nearest following ul
$list[] = $li->nodeValue;
}
echo $header . ' - ' . implode(', ', $list) . '<br/>';
}
echo '<hr/>';
}
}
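For reference, here is a compact, self-contained sketch of the same idea that collects the result into an array first. It assumes a shortened copy of the markup from the question; the $groups array is illustrative. The [1] predicate matters: without it, following-sibling::ul would match every later list in the same div, not just the one directly after the heading.

```php
<?php
$output = '<div class="outer"><div>
<h1>Christmas</h1>
<ul><li>Holiday</li><li>Fun</li><li>Joy</li></ul>
<h1>4th July</h1>
<ul><li>Fireworks</li><li>Happy</li><li>Spectral</li></ul>
</div></div>';

$doc = new DOMDocument();
$doc->loadHTML($output);
$xpath = new DOMXPath($doc);

$groups = array();
foreach ($xpath->query('//div[@class="outer"]/div/h1') as $h1) {
    $items = array();
    // ul[1] limits the match to the nearest following ul sibling
    foreach ($xpath->query('following-sibling::ul[1]/li', $h1) as $li) {
        $items[] = trim($li->nodeValue);
    }
    $groups[trim($h1->nodeValue)] = $items;
}

// One line per heading, e.g. "Christmas - Holiday, Fun, Joy"
foreach ($groups as $header => $items) {
    echo $header . ' - ' . implode(', ', $items) . "\n";
}
```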
I've used phpQuery for this type of issue in the past:
// include phpquery
require('phpQuery/phpQuery.php');
// initialize
$doc = phpQuery::newDocumentHTML($markup);
// get the text from the various elements
$h1Value = $doc['h1:first']->text(); // Christmas
// ... etc.
(untested)

pick the elements INSIDE the ul NOT the ul itself

this is the php code:
include_once('simple_html_dom.php');
$html = file_get_html('URL');
$elem = $html->find('ul[id=members-list]', 0);
echo $elem;
I would like to be able to pick the inside of the UL so the elements per se, not the ul itself.
html as follows:
<ul id="members-list">
<li>1</li>
<li>2</li>
<li>3</li>
<li>4</li>
</ul>
So when I do echo $elem, it returns the ul as well. I want to leave it out and return just:
<li>1</li>
<li>2</li>
<li>3</li>
<li>4</li>
You just forgot to use children() method. Consider this example:
$ul = $html->find('ul[id="members-list"]', 0)->children();
foreach($ul as $li) {
echo $li;
}
As stated in the manual:
How to traverse the DOM tree? -> Traverse the DOM Tree
mixed $e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise returns an array of children.
Or the much easier way: ->innertext magic attribute
$ul = $html->find('ul[id="members-list"]', 0);
echo $ul->innertext;
You can use:
$('#members-list li')
to iterate over them:
$('#members-list li').each(function(){
console.log(this);//object of current li
});
Have a look at this; it will also print the values inside the li tags:
<?php
$html = file_get_contents('2.html');
$dom = new DOMDocument;
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('ul') as $node) {
foreach($node->childNodes as $childNode){
echo $childNode->nodeValue;
}
}
?>
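If the goal is the inner markup of the ul (the li tags included) rather than only their text, DOMDocument can serialize each child node on its own. A minimal sketch, assuming the markup from the question is available as a string instead of being read from a file; $inner is an illustrative name:

```php
<?php
$html = '<ul id="members-list">
<li>1</li>
<li>2</li>
<li>3</li>
<li>4</li>
</ul>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$ul = $xpath->query('//ul[@id="members-list"]')->item(0);

$inner = '';
// saveHTML($node) serializes a single node, so joining the children
// gives the markup inside the ul without the ul tag itself
foreach ($ul->childNodes as $child) {
    $inner .= $dom->saveHTML($child);
}
echo trim($inner);
```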
You have to use .html() for the selected ul element :
include_once('simple_html_dom.php');
$html = file_get_html('URL');
$elem = $html->find('ul[id=members-list]', 0)->html(); // to get all child elements with their tags
echo $elem;
