How to get list sub-item nodes separately using PHP DOM

I was looking at this tip:
PHP DOM get items from first ul element
But in my case the markup is:
<li>First item
<ul>
<li>
First SubItem
</li>
<li>
Second SubItem
</li>
</ul>
</li>
PHP Code:
$DOM = new DOMDocument;
libxml_use_internal_errors(true);
$DOM->loadHTML( $output);
$items = $DOM->getElementsByTagName('ul');
echo '<ul>';
foreach ($items->item(3)->getElementsByTagName('li') as $li) {
var_dump($li);die();
echo '<li>'.$li->nodeValue;
$ul = $li->getElementsByTagName('ul');
echo '<ul>';
echo '--->'.$ul->length.'<br>';
for($u=0;$u<$ul->length;$u++){
foreach ($ul->item($u)->getElementsByTagName('li') as $lii) {
echo '<li>'.$lii->nodeValue.'</li>';
}
}
echo '</ul>';
echo '</li>';
}
echo '</ul>';
The problem is:
$li->nodeValue returns "First itemFirst SubItemSecond SubItem" for the first node.
I need to get these items (the sub-items) separately.

I'm assuming you just want to retrieve the text values from those <li> tags.
You can greatly simplify the query with DOMXPath as ->query('//li') will fetch all <li> tags in your code snippet.
$DOM = new DOMDocument();
$DOM->loadHTML($output);
$xPath = new DOMXPath($DOM);
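// [normalize-space()] skips whitespace-only text nodes, which would otherwise print as empty <li> items
if($xpResponse = $xPath->query('//li/text()[normalize-space()]')) {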
echo "<ul>\n";
foreach($xpResponse as $xNode) {
echo "<li>" . trim($xNode->nodeValue) . "</li>\n";
}
echo "</ul>\n";
}
This will simply output (as HTML):
First item
First SubItem
Second SubItem
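If the nesting matters, a minimal sketch along the same lines can keep each top-level item together with its own sub-items. The helper name ownText and the shape of the $tree array are my own choices for illustration, and the sample markup assumes the question's <li> sits inside a <ul>:

```php
<?php
// Hypothetical helper: the text that belongs to this <li> itself,
// ignoring any text inside nested lists.
function ownText(DOMXPath $xpath, DOMNode $li): string
{
    $parts = [];
    foreach ($xpath->query('./text()[normalize-space()]', $li) as $textNode) {
        $parts[] = trim($textNode->nodeValue);
    }
    return implode(' ', $parts);
}

// Assumed sample input, modelled on the snippet in the question.
$output = '<ul><li>First item<ul><li>First SubItem</li><li>Second SubItem</li></ul></li></ul>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($output);
$xpath = new DOMXPath($dom);

$tree = [];
// Top-level items are the <li> elements that are not inside another <li>.
foreach ($xpath->query('//li[not(ancestor::li)]') as $li) {
    $subItems = [];
    // Direct sub-items only: <li> children of a <ul> child of this item.
    foreach ($xpath->query('./ul/li', $li) as $subLi) {
        $subItems[] = ownText($xpath, $subLi);
    }
    $tree[] = ['text' => ownText($xpath, $li), 'subItems' => $subItems];
}
print_r($tree);
```

Each entry of $tree then holds the item's own text plus an array of its sub-items, so nodeValue's concatenated string is never needed.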

Related

Want to get specific data from a webpage

I am trying hard to get data from the following portion of a webpage:
<div id="menu_pannel">
<ul class="sf-menu" id="nav">
<li class="current"><a href="/" class="current" >Home</a></li>
<li class="">Schedule</li>
<li class="">All Channels</li>
<li class="">Sports Channels
<ul id="submenu">
<li>Sky Sports 1</li>
<li>Sky Sports 2</li>
<li><a href="http://www.time4tv.com/2011/03/sky-sports-3.php">Sky Sports
I want to get the data from it, and for that I am using
$pattern = '|<ul id="nav" class="sf-menu">(.*?)</ul>|';
preg_match($pattern, $html, $data);
but I am getting an empty array.
If strip_tags($html) doesn't return what you want, you can use this example to get an array of text:
function getTextBetweenTags($string, $tagname) {
preg_match_all("#<$tagname.*?>([^<]+)</$tagname>#", $string, $matches);
return $matches[1];
}
$values = getTextBetweenTags ($html, 'a' );
foreach($values as $value) {
echo $value . '<br>';
}
where $html is a variable containing your HTML.
If you decide to use a DOM parser:
$doc = new DOMDocument();
$doc->loadHTML($str);
$x = new DomXpath($doc);
$ul = $x->query('//ul[@id="nav"]'); // 'id' is a unique identifier!
// Echo outerHTML of ul[@id="nav"]
echo $doc->saveHTML($ul->item(0));
Use DOMDocument class for manipulating HTML content:
// $html_str is your html fragment
$doc = new DOMDocument();
$doc->loadHTML($html_str);
$ul_content = "";
$ul = $doc->getElementsByTagName("ul")->item(0);
if ($ul && $ul->getAttribute('class') == 'sf-menu') {
foreach ($ul->childNodes as $n) {
$ul_content .= $doc->saveHTML($n);
}
}
echo $ul_content;
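As a rough sketch of how the menu labels themselves could be collected with XPath rather than a regex: the markup below is a shortened, made-up version of the menu (the original snippet is cut off), and the $labels array is just an illustrative name.

```php
<?php
// Assumed, shortened sample of the menu markup from the question.
$html = <<<'HTML'
<div id="menu_pannel">
<ul class="sf-menu" id="nav">
<li class="current"><a href="/" class="current">Home</a></li>
<li class="">Schedule</li>
<li class="">Sports Channels
<ul id="submenu">
<li>Sky Sports 1</li>
<li>Sky Sports 2</li>
</ul>
</li>
</ul>
</div>
HTML;

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$labels = [];
// Only the direct <li> children of the menu, not the submenu entries.
foreach ($xpath->query('//ul[@id="nav"]/li') as $li) {
    // First non-blank text node, either directly in the <li> or inside its <a>.
    $text = $xpath->query('(./text() | ./a/text())[normalize-space()]', $li)->item(0);
    $labels[] = $text ? trim($text->nodeValue) : '';
}
print_r($labels);
```

Selecting by id this way does not depend on the order of the class and id attributes or on line breaks inside the list, which a literal regex pattern is sensitive to.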

How to get all child nodes from DOMDocument?

I have the following
$string = '<html><head></head><body><ul id="mainmenu">
<li id="1">Hallo</li>
<li id="2">Welt
<ul>
<li id="3">Sub Hallo</li>
<li id="4">Sub Welt</li>
</ul>
</li>
</ul></body></html>';
$dom = new DOMDocument;
$dom->loadHTML($string);
Now I want to have all li IDs inside one array.
I tried the following:
$all_li_ids = array();
$menu_nodes = $dom->getElementById('mainmenu')->childNodes;
foreach($menu_nodes as $li_node){
if($li_node->nodeName=='li'){
$all_li_ids[]=$li_node->getAttribute('id');
}
}
print_r($all_li_ids);
As you might see, this will print out [1,2]
How do I get all children (the subchildren as well [1,2,3,4])?
In my test, $dom->getElementById('mainmenu') doesn't return the element. If it works for you, you don't need XPath for this step.
$xpath = new DOMXPath($dom);
$ul = $xpath->query("//*[@id='mainmenu']")->item(0);
$all_li_ids = array();
// Find all inner li tags
$menu_nodes = $ul->getElementsByTagName('li');
foreach($menu_nodes as $li_node){
$all_li_ids[]=$li_node->getAttribute('id');
}
print_r($all_li_ids); // ids 1, 2, 3, 4
One way to do it would be to add another foreach loop over the li elements nested inside each li, i.e.:
foreach($menu_nodes as $node){
if($node->nodeName=='li'){
$all_li_ids[]=$node->getAttribute('id');
// A DOMNode cannot be iterated directly, so ask the li for its nested li elements
foreach($node->getElementsByTagName('li') as $sub_node){
$all_li_ids[]=$sub_node->getAttribute('id');
}
}
}
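For comparison, here is a minimal sketch that lets XPath do the descent in a single query. The descendant step (//li) reaches li elements at any depth below the menu, and /@id selects the attribute nodes directly. It reuses the $string from the question:

```php
<?php
$string = '<html><head></head><body><ul id="mainmenu">
<li id="1">Hallo</li>
<li id="2">Welt
<ul>
<li id="3">Sub Hallo</li>
<li id="4">Sub Welt</li>
</ul>
</li>
</ul></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($string);
$xpath = new DOMXPath($dom);

$all_li_ids = [];
// Every id attribute of every <li> below the menu, in document order.
foreach ($xpath->query('//ul[@id="mainmenu"]//li/@id') as $idAttr) {
    $all_li_ids[] = $idAttr->nodeValue;
}
print_r($all_li_ids);
```

Because the query matches on the id attribute itself, it does not depend on whether getElementById finds the element.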

How to get value of onclick= using xpath?

I have a string that has lots of <li> sets of data. I want to get this value:
1: call.php?category=fruits&fruitid=123456
inside onclick using XPath. My current XPath doesn't return the onclick value, which I need so that I can parse it further to get my required data. Could anyone tell me the correct XPath to get the value of onclick?
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($code2);
$xpath = new DOMXPath($dom);
// Empty array to hold all links to return
$result = array();
//Loop through each <li> tag in the dom
foreach($dom->getElementsByTagName('li') as $li) {
//Loop through each <a> tag within the li, then extract the node value
foreach($li->getElementsByTagName('a') as $links){
$result[] = $links->nodeValue;
echo $result[0] . "\n";
}
$onclicks = $xpath->query("//li/a/onclick");
foreach ($onclicks as $onclick) {
echo $onclick->nodeValue . "\n";
}
}
data:
<li><a id="FR123456" onclick="setFood(false);setSeasonFruitID('123456');getit('call.php?category=fruits&fruitid=123456&',detailFruit,false);">mango season</a><img src="http://imagehosting.com/images/fru_123456.png">
</li>
onclick is an attribute, and you use @attribute_name to reference an attribute in XPath:
$onclicks = $xpath->query("//li/a/@onclick");
foreach ($onclicks as $onclick) {
echo $onclick->nodeValue . "\n";
}
Try something like this:
$links = $xpath->query("//li/a");
foreach ($links as $link) {
echo $link->getAttribute('onclick'). "\n";
}
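Once the onclick value is available, the URL still has to be pulled out of the JavaScript. A minimal sketch, assuming the URL is always the first quoted argument of getit(...) as in the sample data (the regex and the $urls array are my own illustration, not part of the answers above):

```php
<?php
// Sample data from the question.
$code2 = <<<'HTML'
<li><a id="FR123456" onclick="setFood(false);setSeasonFruitID('123456');getit('call.php?category=fruits&fruitid=123456&',detailFruit,false);">mango season</a></li>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($code2);
$xpath = new DOMXPath($dom);

$urls = [];
foreach ($xpath->query('//li/a/@onclick') as $onclick) {
    // Assumption: the URL is the first single-quoted argument of getit(...).
    if (preg_match("/getit\\('([^']+)'/", $onclick->nodeValue, $m)) {
        // Drop the trailing "&" that the sample data carries.
        $urls[] = rtrim($m[1], '&');
    }
}
print_r($urls);
```

If the onclick handlers on the real page are shaped differently, the pattern would need adjusting.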

PHP - Get links from within an element after element has been found

I have the following code....
<div class="outer">
<div>
<h1>Christmas</h1>
<ul>
<li>Holiday</li>
<li>Fun</li>
<li>Joy</li>
</ul>
<h1>4th July</h1>
<ul>
<li>Fireworks</li>
<li>Happy</li>
<li>Spectral</li>
</ul>
</div>
</div>
<div class="outer">
<div>
<h1>Christmas2</h1>
<ul>
<li>Holiday</li>
<li>Fun</li>
<li>Joy</li>
</ul>
<h1>4th July</h1>
<ul>
<li>Fireworks2</li>
<li>Happy</li>
<li>Spectral</li>
</ul>
</div>
</div>
I already know that I can find the DIV and then look inside it for the elements by doing:
$doc->loadHTML($output); //$output being the text above
$xpath = new DOMXpath($doc);
$elements = $xpath->query('//div[@class="outer"]'); //Check outer
I know these three lines will get the elements within the listed DIVs, but what I really want is to get the text of each H1 and then display the li values next to it.
The output I'm looking for is:
Christmas - Holiday, Fun, Joy
4th July - Fireworks, Happy, Spectral
Christmas2 - Holiday, Fun, Joy
4th July2 - Fireworks, Happy, Spectral
Yes, you can continue to use XPath: select each header, then get its following sibling, the list. Example:
$doc = new DOMDocument();
$doc->loadHTML($output);
$xpath = new DOMXpath($doc);
$elements = $xpath->query('//div[@class="outer"]/div');
if($elements->length > 0) {
foreach($elements as $div) {
foreach ($xpath->query('./h1', $div) as $e) {
$header = $e->nodeValue;
$list = array();
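// ul[1] keeps only the nearest list; without it, later lists would be attached to this header too
foreach ($xpath->query('./following-sibling::ul[1]/li', $e) as $li) {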
$list[] = $li->nodeValue;
}
echo $header . ' - ' . implode(', ', $list) . '<br/>';
}
echo '<hr/>';
}
}
I've used phpQuery for this type of issue in the past:
// include phpquery
require('phpQuery/phpQuery.php');
// initialize
$doc = phpQuery::newDocumentHTML($markup);
// get the text from the various elements
$h1Value = $doc['h1:first']->text(); // Christmas
// ... etc.
(untested)

html DOM program to find href value

I am new to PHP and have been assigned a project to fetch the HREF value from the following HTML snippet:
<p class="title">
<a href="http://canon.com/">Canon Pixma iP100 + Accu Kit
</a>
</p>
For this I am using the following code:
$dom = new DOMDocument();
@$dom->loadHTML($html);
foreach($dom->getElementsByTagName('p') as $link) {
# Show the <a href>
foreach($link->getElementsByTagName('a') as $link)
{
echo $link->getAttribute('href');
echo "<br />";
}
}
This code gives me the HREF value of every <a href> in every <P> tag on the page. I want to parse only the <P> with the class "title". I can't use Simple_HTML_DOM or any other library here.
Thanks in advance.
Alternatively, you could use DOMXpath for this one. Like this:
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXpath($dom);
// target anchor tags inside p tags with the class "title"
$target_element = $xpath->query('//p[@class="title"]/a');
if($target_element->length > 0) {
foreach($target_element as $link) {
echo $link->getAttribute('href'); // http://canon.com/
}
}
Or, if you want to traverse the tree yourself, you need to search it manually:
foreach($dom->getElementsByTagName('p') as $p) {
// if p tag has a "title" class
if($p->getAttribute('class') == 'title') {
foreach($p->childNodes as $child) {
// if the child is an anchor with an href (nodeName, unlike tagName, is also defined for text nodes)
if($child->nodeName == 'a' && $child->hasAttribute('href')) {
echo $child->getAttribute('href'); // http://canon.com/
}
}
}
}
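One caveat with both versions: @class="title" and getAttribute('class') == 'title' only match when "title" is the entire class attribute. A minimal sketch of the usual XPath 1.0 workaround for matching one class among several follows. The sample markup with the extra "featured" class and the second paragraph is invented for illustration:

```php
<?php
// Assumed sample: the first <p> carries "title" plus a second, made-up class.
$html = '<p class="title featured"><a href="http://canon.com/">Canon Pixma iP100 + Accu Kit</a></p>'
      . '<p class="subtitle"><a href="http://example.com/">Other</a></p>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Pad the class list with spaces so " title " matches a whole class name only
// (and not, for example, "subtitle").
$query = '//p[contains(concat(" ", normalize-space(@class), " "), " title ")]/a/@href';

$hrefs = [];
foreach ($xpath->query($query) as $href) {
    $hrefs[] = $href->nodeValue;
}
print_r($hrefs);
```

This stays within the built-in DOM extension, so no external library is involved.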
