How to get id of HTML elements - php

In PHP, I want to parse an HTML page and obtain the ids of certain elements. I am able to obtain all the elements, but unable to obtain their ids.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><h3 id="h3-elem-id">A</h3></body></html>');
$divs = $doc->getElementsByTagName('h3');
foreach($divs as $n) {
(...)
}
Is there a way to also obtain the id of the element?
Thank you.

If you want the id attribute values, then you need to use getAttribute():
$doc = new DOMDocument();
$doc->loadHTML('<html><body><h3 id="h3-elem-id">A</h3></body></html>');
$divs = $doc->getElementsByTagName('h3');
foreach($divs as $n) {
echo $n->getAttribute('id') . '<br/>';
}
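A minimal sketch of the same idea, assuming some of the elements may have no id at all: getAttribute() returns an empty string for a missing attribute, so hasAttribute() can be checked first. The sample markup and the $ids variable are made up for illustration.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<html><body><h3 id="first">A</h3><h3>B</h3><h3 id="third">C</h3></body></html>');

$ids = [];
foreach ($doc->getElementsByTagName('h3') as $heading) {
    // Skip headings that carry no id attribute.
    if ($heading->hasAttribute('id')) {
        $ids[] = $heading->getAttribute('id');
    }
}

echo implode(', ', $ids), "\n";
```

Collecting the values into an array first keeps the lookup separate from however the ids are displayed later.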

Related

php read html and handle double id-appearance

For my project I'm reading an external website which has used the same ID twice. I can't change that.
I need the content from the second appearance of that ID, but my code only returns the first one and does not see the second.
Also, count($data) returns 1 rather than 2.
I'm desperate. Does anyone have an idea how to access the second ID 'hours'?
<?PHP
$url = 'myurl';
$contents = file_get_contents($url);
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
$data = $dom->getElementById("hours");
echo $data->nodeValue."\n";
echo count($data);
?>
As @rickdenhaan points out, getElementById always returns a single element: the first one that has that specific id value. However, you can use DOMXPath to find all nodes which have a given id value and then pick out the one you want (in this code it will find the second one):
$url = 'myurl';
$contents = file_get_contents($url);
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
$xpath = new DOMXPath($dom);
$count = 0;
foreach ($xpath->query("//*[@id='hours']") as $node) {
if ($count == 1) echo $node->nodeValue;
$count++;
}
As @NigelRen points out in the comments, you can simplify this further by directly selecting the second match in the XPath, i.e.
$node = $xpath->query("(//*[@id='hours'])[2]")[0];
echo $node->nodeValue;
Demo on 3v4l.org
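For a version that runs without the external site, here is a small sketch using inline markup. The HTML string and the variable names $matches, $second and $secondText are assumptions for illustration; the XPath expression is the one from the answer.

```php
<?php
$html = '<html><body><div id="hours">Mon-Fri</div><div id="hours">Sat-Sun</div></body></html>';

$dom = new DOMDocument();
// Duplicate ids are invalid HTML, so collect libxml's complaints instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$matches = $xpath->query("//*[@id='hours']");

// item(1) is the second match; it is null when fewer than two nodes matched.
$second = $matches->item(1);
$secondText = $second !== null ? $second->nodeValue : null;

echo $matches->length, "\n";
echo $secondText, "\n";
```

Checking for null before reading nodeValue avoids an error on pages where the id turns out to appear only once.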

Append HTML code to DOMDocument in PHP

I have two pieces of HTML code (both can contain many tags and sub-tags). I iterate through the first one with DOMDocument and DOMXPath and count the text length inside each tag. When the counter exceeds X, I want to add the second HTML to the current node in the first HTML. I use this code, but I don't know how to use appendChild or similar functions to append my HTML.
$doc = new DOMDocument();
$doc->loadHTML($HTML1);
$xpath = new DOMXPath($doc);
$characterCounter = 0;
foreach ($xpath->evaluate('//*[count(*) = 0]') as $node)
{
$characterCounter += strlen($node->nodeValue);
if($characterCounter > 150)
{
//Here I have to append second HTML but it does not append
$node->appendChild($doc->createTextNode($HTML2));
break;
}
}
$doc->saveHTML();
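createTextNode() treats its argument as plain text, so any markup in $HTML2 is escaped rather than parsed. One possible approach, sketched here under the assumption that $HTML2 is an HTML fragment, is to parse it in a second DOMDocument and copy its nodes across with importNode(). The sample strings, $fragmentDoc and $target are hypothetical.

```php
<?php
$HTML1 = '<html><body><p>First paragraph</p></body></html>';
$HTML2 = '<div><b>Inserted</b> block</div>';

$doc = new DOMDocument();
$doc->loadHTML($HTML1);

// Parse the second snippet in its own document, wrapped so it has a body.
$fragmentDoc = new DOMDocument();
$fragmentDoc->loadHTML('<html><body>' . $HTML2 . '</body></html>');
$body = $fragmentDoc->getElementsByTagName('body')->item(0);

// Stand-in for the node chosen by the character counter in the question.
$target = $doc->getElementsByTagName('p')->item(0);

foreach ($body->childNodes as $child) {
    // importNode(..., true) makes a deep copy that belongs to $doc.
    $target->appendChild($doc->importNode($child, true));
}

$result = $doc->saveHTML();
echo $result;
```

Note that saveHTML() returns the markup as a string, so its return value has to be stored or echoed to be of any use.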

Looping through list elements of an unordered list using xpath

I want to display the list elements of the ul, but not the last one. I have used DOM, but it takes a long time. Can someone please give me the XPath expression to solve this?
Please Provide the whole solution code.
$doc = new DOMDocument();
@$doc->loadHTMLFile($sel_image['snapdeal_content']);
$divs = $doc->getElementsByTagName('ul');
foreach($divs as $div) {
if ($div->getAttribute('class') == 'key-features') {
$li = $div->getElementsByTagName('li');
for($j=0;$j<$li->length-1;$j++){
echo "->".$li->item($j)->nodeValue;
echo "<br />";
}
}
}
Try this excerpt to replace the for-loops in your solution. $li is then a DOMNodeList containing every <li> except the last one from each <ul class="key-features"> in the document.
$xpath = new DOMXPath($doc);
$query = '//ul[@class = "key-features"]/li[position() < last()]';
$li = $xpath->query($query);
Also see http://www.php.net/manual/en/domxpath.query.php and XSL for-each: how to detect last node? .
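Since the question asks for the whole solution, here is a self-contained sketch of that query in use. The inline list and the $items array are made up for illustration; real input would come from loadHTMLFile() as in the question.

```php
<?php
$html = '<ul class="key-features"><li>One</li><li>Two</li><li>Three</li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$query = '//ul[@class = "key-features"]/li[position() < last()]';

$items = [];
foreach ($xpath->query($query) as $li) {
    // nodeValue is the text content of the <li>, including nested tags' text.
    $items[] = trim($li->nodeValue);
}

echo implode("\n", $items), "\n";
```

The comparison @class = "key-features" only matches when the class attribute is exactly that string, so a list with several classes would need a different test.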

xpath extract complete html

I am trying to extract a complete table including the HTML tags, with XPath, that I can store in a variable, do a bit of string replacement on, then echo directly to the screen. I have found numerous posts on getting the text out of the table but I want to retain the HTML formatting since I am just going to display it (after minor modification).
At present I am extracting the table using string functions stristr, substr etc. but I would prefer to use XPath.
I can display the contents of the table with the following but it just displays the table TD fields with no formatting. It also does not store it in a variable that I can manipulate.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$arr = $xpath->query('//table');
foreach($arr as $el) {
echo $el->textContent;
}
I tried this but got no output:
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$arr = $xpath->query('//table');
echo $arr->saveHTML();
Use DOMNode::C14N():
foreach($arr as $el) {
echo $el->C14N();
}
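C14N() serializes the node as canonical XML. If HTML serialization is preferred, DOMDocument::saveHTML() also accepts a node argument, which additionally makes it easy to keep the markup in a variable for string replacement. A minimal sketch, with a made-up table and hypothetical $tables and $output variables:

```php
<?php
$html = '<html><body><table><tr><td>Cell</td></tr></table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$tables = [];
foreach ($xpath->query('//table') as $table) {
    // saveHTML() belongs to the document; passing a node serializes only that subtree.
    $tables[] = $dom->saveHTML($table);
}

// The markup is now an ordinary string and can be modified before display.
$output = str_replace('Cell', 'Changed', $tables[0]);
echo $output, "\n";
```

This also shows why the second attempt in the question printed nothing: saveHTML() is a method of DOMDocument, not of the DOMNodeList returned by query().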

count number of items in xml with php

I have to parse xml files that look like this : http://goo.gl/QQirq
How can I count number of items/records in this xml- by 'item' I mean a 'productItem' element, ie there are 5 items in the example xml. I don't specify the tag name 'productItem' when parsing the xml, so I can't count occurrences of 'productItem'. Here is the code I have:
<?php
$doc = new DOMDocument();
$doc->load("test.xml");
$xpath = new DOMXpath( $doc );
$nodes = $xpath->query( '//*| //@*' );
$nodeNames = array();
foreach( $nodes as $node )
{
$nodeNames = $node->nodeName;
$name=$node->nodeName;
$value=$node->nodeValue;
echo ''.$name.':'.$value.'<br>';
}
?>
How can I count number of items and display them one by one, like this ideally : http://goo.gl/O1FI8 ?
Why don't you use DOMDocument::getElementsByTagName?
//get the number of product items
echo $doc->getElementsByTagName('productitem')->length;
//traverse the collection of productitem
foreach($doc->getElementsByTagName('productitem') as $element){
//$element is a DOMElement
$nodeNames = $element->nodeName;
$name=$element->nodeName;
$value=$element->nodeValue;
echo ''.$name.':'.$value.'<br>';
}
Since you want to traverse your whole document, using XPath is just greedy: the query instantiates every node of the document even if you only want one or two.
You can use the hasChildNodes() method and the childNodes property to traverse your document:
function searchInNode(DOMNode $node){
if(isGoodNode($node)){//if your node is good according to your database
mapTheNode($node);
}
if($node->hasChildNodes()){
foreach($node->childNodes as $nodes){
searchInNode($nodes);
}
}
}
searchInNode($domdocument);
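The traversal above leaves isGoodNode() and mapTheNode() undefined. A runnable sketch follows, assuming that a "good" node is simply an element named productItem and that mapping means collecting the node in an array. Both choices, the sample XML, and the extra $found parameter are stand-ins for illustration, not the real file or database check.

```php
<?php
$xml = '<catalog><productItem><name>A</name></productItem>'
     . '<productItem><name>B</name></productItem></catalog>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Hypothetical test: replace with whatever decides that a node is an item.
function isGoodNode(DOMNode $node): bool
{
    return $node->nodeType === XML_ELEMENT_NODE && $node->nodeName === 'productItem';
}

// Depth-first walk that collects every matching node into $found.
function searchInNode(DOMNode $node, array &$found): void
{
    if (isGoodNode($node)) {
        $found[] = $node;
    }
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            searchInNode($child, $found);
        }
    }
}

$found = [];
searchInNode($doc, $found);

echo count($found), "\n";
foreach ($found as $item) {
    echo $item->nodeName, ': ', $item->nodeValue, "\n";
}
```

Collecting the nodes first gives both the count and the one-by-one listing the question asks for from a single pass over the document.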
