How to verify that the nth element exists with the DiDOM HTML parser - php

I am using the DiDOM HTML parser library. From its documentation (https://github.com/Imangazaliev/DiDOM#verify-if-element-exists):
If you need to check if element exist and then get it:
if ($document->has('.post')) {
$elements = $document->find('.post');
// code
}
But what if I need to check for the existence of the n-th element in the array of elements with the '.post' class, for example:
$elements = $document->find('.post')[1];
The code below doesn't work and throws errors:
if ($document->has('.post')[1]) {
$elements = $document->find('.post')[1];
// code
}

I found the solution. DiDOM's has() method doesn't offer an nth-child option, so I've used the pseudo-class selector nth-child(n) to check whether the n-th element is present.
The code now looks like this:
if ($document->find('.post:nth-child(2)')) {
// find() returns an array of matches, so take the first one
$elements = $document->find('.post:nth-child(2)')[0]->text();
} else {
echo "there is no such item";
}
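One caveat with the pseudo-class approach: in CSS, `.post:nth-child(2)` matches an element that has the class `post` and is the second child of its parent, which is not necessarily the second `.post` match in the document. A minimal alternative sketch, assuming `find()` returns a plain PHP array of matches (as the `[1]` indexing in the question implies), is to index the result and test for a missing entry. The helper name `nthMatch` is hypothetical, and a plain array stands in for the `find()` result so the snippet runs without DiDOM.

```php
<?php
// Hypothetical helper (not part of DiDOM): returns the match at a
// zero-based index, or null when there are fewer matches than that.
function nthMatch(array $matches, int $index)
{
    return $matches[$index] ?? null;
}

// Stand-in for $document->find('.post'); with DiDOM the call would be
// nthMatch($document->find('.post'), 1).
$posts = ['first post', 'second post'];

$second = nthMatch($posts, 1);
$third = nthMatch($posts, 2);

echo $second !== null ? $second : 'there is no such item', "\n";
echo $third !== null ? $third : 'there is no such item', "\n";
```

This runs the selector only once and makes the zero-based index explicit, which may be easier to follow than repeating the selector string.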

Related

get elements by name from a DOMNodeList

In the following code, I retrieve with DOM a list of nodes from an xml document. Then I would like to select, by their tag name, some of these nodes.
$index = new DOMDocument();
$index->load('index.xml');
$xpath = new DOMXpath($index);
$related_notions = $xpath->query("/index/notion[name='" . $name . "']/relations/*"); // the variable $name is dynamically defined previously in the script
foreach ($related_notions->getElementsByTagName("superordinate") as $item) {
// do something
}
I get the following error: Uncaught Error: Call to undefined method DOMNodeList::getElementsByTagName()
I don't understand why the method getElementsByTagName() is not defined for DOMNodeList. After all, getting elements by their name seems to me something obvious that one might want to do with a node list. At any rate, my actual question is: How can I do what I want to do? That is, in the absence of the method getElementsByTagName(), how do I get elements by tag name from a node list?
Thanks in advance for your help!
I found a solution to my problem, namely to test for the nodeName with an if statement inside the foreach:
$index = new DOMDocument();
$index->load('index.xml');
$xpath = new DOMXpath($index);
$related_notions = $xpath->query("/index/notion[name='" . $name . "']/relations/*");
foreach ($related_notions as $item) {
if ($item->nodeName == "superordinate") {
// do something
}
}
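An alternative to filtering inside the loop is to let XPath do the selection by naming the element in the query instead of using `*`. The sketch below assumes a made-up structure for index.xml (inlined with loadXML) and a hard-coded $name; the sample notions are invented for illustration.

```php
<?php
// Invented sample data standing in for index.xml.
$xml = <<<XML
<index>
  <notion>
    <name>dog</name>
    <relations>
      <superordinate>mammal</superordinate>
      <subordinate>poodle</subordinate>
      <superordinate>pet</superordinate>
    </relations>
  </notion>
</index>
XML;

$index = new DOMDocument();
$index->loadXML($xml);
$xpath = new DOMXPath($index);

$name = 'dog'; // assumed value; the original script sets this dynamically

// Name the element in the path instead of selecting * and filtering later.
$superordinates = $xpath->query(
    "/index/notion[name='" . $name . "']/relations/superordinate"
);

$found = [];
foreach ($superordinates as $item) {
    $found[] = $item->nodeValue;
}

print_r($found);
```

If $name can contain a single quote, the concatenated expression would break, so the value would need to be checked or escaped before it is used this way.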

How to use Simple HTML Dom to get an adjacent sibling element?

I am using the Simple HTML DOM parser to get an element from an HTML string by its class name, like:
foreach ($html->find('div[class=news-div]') as $news) {
// work with $news
}
But I also need to get two elements (one is a span and the other is an a) that occur just before $news. Their ids can't be predicted because they are calculated dynamically, and they don't have a unique class name.
How can I extract the two adjacent elements occurring before the news-div element?
Simple HTML DOM has prev_sibling() and next_sibling() methods:
$elems = $html->find('div[class=news-div]');
foreach ( $elems as $news ) {
$prev_span = $news->prev_sibling();
$prev_a = $prev_span->prev_sibling();
}

Check if XML element is existing in loop

For a website I'm making, I need to get data from an external XML file.
I load the data like this:
$doc = new DOMDocument();
$url = 'http://myurl/results/xml/12345';
if (!$doc->load($url))
{
echo json_encode(array('error'=> 'error'));
exit;
}
$xpath = new DOMXPath($doc);
$program_date = $xpath->query('//game/date');
Then I use a foreach loop to get all the data:
if($program_date){
foreach($program_date as $node){
$programArray['program_date'][] = $node->nodeValue;
}
}
The problem I'm having is that sometimes a game doesn't have a date.
When a game doesn't have a date, I want to store "-" instead of the date from the XML file. I don't know how to check whether a date is present in the data.
I tried many approaches (isset, !isset, else, !empty, empty) together with
$teamArray['program_kind'][] = "-";
but nothing works...
Can someone help me with this problem?
Thanks in advance
You need to iterate the game elements, use them as a context and fetch the data with additional XPath expressions.
But one thing first. Use DOMXPath::evaluate(). DOMXPath::query() only supports location paths. It can only return a node list. But XPath expressions can return scalar values, too.
$xpath = new DOMXPath($doc);
$games = $xpath->evaluate('//game');
The result of //game will always be a DOMNodeList object. It can be an empty list, but you can directly iterate it. A condition like if ($games) will always be true.
foreach ($games as $game) {
Now that you have the game element node, you can use it as a context to fetch other data.
$date = $xpath->evaluate('string(date)', $game);
string() casts the first node of the location path to a string. If the path matches no node, it returns an empty string. Use normalize-space() instead if you also want to strip leading and trailing whitespace and collapse internal runs of whitespace.
You can validate if the game element has a date node using count().
$hasDate = $xpath->evaluate('count(date) > 0', $game);
The result of this XPath expression is always a boolean.
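Put together, the fragments above might look like the following sketch. The XML is invented sample data (the second game has no date element), and the "-" placeholder follows the question; the real feed's structure is an assumption.

```php
<?php
// Invented sample data: the second game has no <date> element.
$xml = <<<XML
<results>
  <game><date>2015-03-01</date></game>
  <game><home>Team A</home></game>
  <game><date> 2015-03-08 </date></game>
</results>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

$programArray = ['program_date' => []];

foreach ($xpath->evaluate('//game') as $game) {
    // normalize-space() yields '' when there is no date child of this game,
    // and a trimmed string otherwise.
    $date = $xpath->evaluate('normalize-space(date)', $game);
    $programArray['program_date'][] = ($date !== '') ? $date : '-';
}

print_r($programArray['program_date']);
```

Because every game contributes exactly one entry, the date array stays aligned with any other per-game arrays built in the same loop.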

XPath query is sometimes not showing the right elements

I am using XPath, and this is my query:
$elements = $xpath->query('//div/div/div/div/div/div[@id="con1"]/table/tr/td');
And everything works fine.
Then I change the condition in the div, and the query is like this:
$elements = $xpath->query('//div/div/div/div/div/div[@id="con2"]/table/tr/td');
And I see what I expect to see.
But later, if I do this:
$elements = $xpath->query('//div/div/div/div/div/div[@id="con1" or @id="con2"]/table/tr/td');
I see again only the elements of con1. Why is that?
The full code is below:
$elements = $xpath->query('//div/div/div/div/div/div[@id="con1" or @id="con2"]/table/tr/td');
foreach ( $elements as $element ) {
$str1=$element->getAttribute('class');
$str2="first-td";
$str3="status";
if (strcmp($str1,$str2)==0) {
var_dump( $element->nodeValue);
}
if (strcmp($str1,$str3)==0) {
echo $element->childNodes->item(0)->getAttribute('class'). "<br />";
}
}
To sum up: If my condition is only con1, I see the correct results. If it's only con2, I see the correct results. The problem comes when I use the or. In that case, I see the results only from con1. It's as if it stops after fulfilling the first condition. Both divs are at the same level of the DOM tree.
What you are trying to do is to retrieve <div id="con1"> and <div id="con2"> in the same expression, but what you are actually doing is to retrieve a div which either has an attribute id="con1" or id="con2". The first expression of the condition returns true and then you get the <div id="con1"> node. It makes sense.
To get both nodes you need something like:
//div[@id="con1"]|//div[@id="con2"]
Note: //div[@id="con1"] finds any <div id="con1"> node anywhere in the tree, and an id has to be unique within a document, so it is not necessary to spell out the whole path down to it.
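As a rough illustration, the sketch below runs the union expression against a small invented document and, for comparison, the single-predicate `or` form. On this sample both return the cells of both containers, since a predicate is tested against each candidate div separately; if only con1 shows up on a real page, it may be worth checking whether the path leading to con2 really has the same nesting. The markup and cell values are assumptions.

```php
<?php
// Invented sample markup with two sibling containers.
$xml = <<<XML
<div>
  <div id="con1"><table><tr><td class="first-td">A</td></tr></table></div>
  <div id="con2"><table><tr><td class="first-td">B</td></tr></table></div>
</div>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Union of two location paths; results come back in document order.
$union = $xpath->query(
    '//div[@id="con1"]/table/tr/td | //div[@id="con2"]/table/tr/td'
);

// Single predicate with "or", evaluated separately for each div.
$either = $xpath->query('//div[@id="con1" or @id="con2"]/table/tr/td');

$values = [];
foreach ($union as $td) {
    $values[] = $td->nodeValue;
}

print_r($values);
echo $either->length, "\n";
```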

DOM removing selected child nodes

I have a DOM element whose HTML content contains some elements I'd like to remove, while keeping some tags that are OK.
I try to iterate through all child elements and delete those that need to be removed:
foreach ($node->getElementsByTagName('*') as $element)
if ($element->nodeName != 'br')
$node->removeChild($element);
But this throws a Not Found Error exception which, if not caught, causes a fatal error.
How would I solve this problem?
Use the following instead to remove the node:
$element->parentNode->removeChild($element);
getElementsByTagName('*') finds all descendant elements, not just child elements. So some of the $element nodes you want to remove are not children of $node, hence the failure.
I'm not 100% sure what your intention is here, but most likely you just want to remove certain immediate children. In this case, do the following:
$nodestoremove = array();
foreach ($node->childNodes as $n) {
if ($n->nodeType===XML_ELEMENT_NODE and $n->nodeName!=='br') {
$nodestoremove[] = $n;
}
}
foreach ($nodestoremove as $n) {
$node->removeChild($n);
}
unset($nodestoremove); // so nodes can be garbage-collected
echo $node->C14N(); // xml fragment after removal
Note that we make two passes: one to identify the nodes to delete, and a second to delete them. This is because childNodes is a live list, so we can't iterate through it forwards while deleting from it. (Although we could iterate through it backwards.)
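The backwards variant mentioned in passing could look like this minimal sketch. The sample markup is invented, and only immediate children other than br are removed, as in the two-pass version.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<p>Hello<b>bold</b><br/>world<i>italic</i><br/></p>');
$node = $doc->documentElement;

// Walk the live childNodes list from the end, so removing an item
// never shifts the index of a node that has not been visited yet.
for ($i = $node->childNodes->length - 1; $i >= 0; $i--) {
    $n = $node->childNodes->item($i);
    if ($n->nodeType === XML_ELEMENT_NODE && $n->nodeName !== 'br') {
        $node->removeChild($n);
    }
}

echo $doc->saveXML($node), "\n";
```

This avoids the temporary array at the cost of index-based access; which version reads better is largely a matter of taste.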
