Fatal error: Call to undefined method DOMDocument::getElementsById() - php

I'm parsing an external HTML page (http://www.amazon.com/Toshiba-Satellite-C55-A5245-15-6-Inch-Horizon/dp/B00D78PZE8/ref=lp_9277875011_1_1?s=pc&ie=UTF8&qid=1400886357&sr=1-1) that contains an element like this:
<span id="priceblock_ourprice" class="a-size-medium a-color-price">$429.99</span>
and a php with the following code:
$dom = new DOMDocument;
libxml_use_internal_errors(TRUE);
$dom->loadHTMLFile($url);
libxml_clear_errors();
$links = $dom->getElementsById('priceblock_ourprice');
foreach ($links as $link ) {
echo "- ".$link->nodeValue."<br>";
}
But I'm getting the following error:
Fatal error: Call to undefined method DOMDocument::getElementsById()
Could anyone tell me what I'm doing wrong?
Thanks!

getElementsById() is not a method of DOMDocument; use getElementById() instead. IDs are supposed to be unique within a document, so the method returns a single element (or null if nothing matches) rather than a collection, and there is nothing to loop over with foreach.
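A minimal sketch of that fix, assuming the page has already been fetched into a string. The $html value here is a hypothetical stand-in built from the span in the question, not the real Amazon markup:

```php
<?php
// Hypothetical stand-in for the fetched page; the real markup may differ.
$html = '<html><body><span id="priceblock_ourprice" class="a-size-medium a-color-price">$429.99</span></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// getElementById() returns one DOMElement, or null when no element has that id.
$price = $dom->getElementById('priceblock_ourprice');
$text = ($price !== null) ? $price->nodeValue : 'id not found';

echo "- " . $text . "<br>";
```

Checking for null before reading nodeValue avoids a second fatal error when the id is not present in the markup the server actually returns.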

OK, so I don't quite understand this. It seems that Firebug in Firefox was showing me the wrong ID. I used the following code to list the id of every span, and found the right one:
$dom = new DOMDocument();
libxml_use_internal_errors(TRUE);
$dom->loadHTMLFile($url);
libxml_clear_errors();
$nodes = $dom->getElementsByTagName('span');
foreach($nodes as $node) {
echo $node->getAttribute('id'). '->'.$node->textContent.'<br>';
}
and it returned a different id for the field I was looking for. I guess I made a mistake at some point; really sorry for wasting your time.

Related

How to get the href attribute

I have a url:
http://www.indeed.com/viewjob?jk=daddefef363643d7&qd=E8dXiB4h7yBMgEwoEDfyDF2ACaqK5NNcKe-lg0a0QeWlgGT7hwsgagao8YFkybxtaLZJqFprtIWhTxIjvWFBLUePVQb0Chqftd-uc7_Pfa4LB2pHYt-YP2NYagtBg9Lp&atk=1a4sk4spi1c0o5la&utm_source=publisher&utm_medium=organic_listings&utm_campaign=affiliate
I want to extract the href value of the anchor for "view and apply".
My code is:
$dom = new DOMDocument();
@$dom->loadHtml($html);
$xpath = new DOMXpath($dom);
$applylink = $xpath->query("//*[@class='job-footer-button-row']/a");
if(!is_null($applylink)){
$this->view->applylink = $applylink->item(0)->getAttribute('href');
}
But it always shows below error:
Fatal error: Call to a member function getAttribute() on a non-object
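This happens because DOMXPath::query does not return null when it finds no matches. It returns a DOMNodeList, which is simply empty when nothing matches, so the is_null() check passes and item(0) then returns null. Check the list's length (see the documentation for DOMXPath::query) before calling getAttribute(), and that should allow you to correct your code.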
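One way to guard against the empty result, as a sketch. The $html fragment is a hypothetical stand-in for the job page, and the no-such-class query is included only to show what a miss looks like:

```php
<?php
// Hypothetical fragment standing in for the fetched job page.
$html = '<div class="job-footer-button-row"><a href="/apply?jk=123">Apply</a></div>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$applylink = $xpath->query("//*[@class='job-footer-button-row']/a");

// query() yields a DOMNodeList (false for a malformed expression), never null,
// so test the length instead of using is_null().
$href = null;
if ($applylink !== false && $applylink->length > 0) {
    $href = $applylink->item(0)->getAttribute('href');
}

// A query with no matches still returns a list; it is just empty.
$missing = $xpath->query("//*[@class='no-such-class']/a");

echo $href, "\n";
echo $missing->length, "\n";
```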

PHP parse HTML empty input value

I know there are many questions on parsing HTML in PHP, but I can't seem to find the specific problem I'm experiencing. My code works on other elements in the page, and also iterates over the inputs returning the tag name. At the same time their value property is empty, when 2 of them have a value for sure. Here is my code
$html = file_get_contents('http://...sample website...html');
$doc = new DOMDocument;
libxml_use_internal_errors(true);
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$elements = $xpath->query("//*/input[@type='hidden']");
if(!is_null($elements)){
foreach ($elements as $element) {
echo "<br/>[". $element->nodeName. "]";
echo $element->nodeValue. "\n";
}
}
$xpath->query("//*/input[@type='hidden']/@value");
instead of
$xpath->query("//*/input[@type='hidden']");
also works well.
Same question, same answers
I figured it out myself. If anyone else has a similar problem: nodeValue returns the text content of an element (roughly its "innerHTML" without the tags), and an input has none. To read an attribute, use $element->getAttribute("value") (for the "value" attribute).
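A short sketch of the getAttribute() approach. The form markup and the field names (token, empty) are made up for illustration:

```php
<?php
// Hypothetical form; the real page's inputs will have other names and values.
$html = '<form><input type="hidden" name="token" value="abc123"><input type="hidden" name="empty"></form>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$values = [];
foreach ($xpath->query("//input[@type='hidden']") as $input) {
    // getAttribute() returns an empty string when the attribute is absent.
    $values[$input->getAttribute('name')] = $input->getAttribute('value');
}

print_r($values);
```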

PHP DOMDocument how to get that content of this tag?

I am using domDocument hoping to parse this little html code. I am looking for a specific span tag with a specific id.
<span id="CPHCenter_lblOperandName">Hello world</span>
My code:
$dom = new domDocument;
@$dom->loadHTML($html); // the @ is to silence errors and misconfigures of HTML
$dom->preserveWhiteSpace = false;
$nodes = $dom->getElementsByTagName('//span[@id="CPHCenter_lblOperandName"');
foreach($nodes as $node){
echo $node->nodeValue;
}
But for some reason I think something is wrong with either the code or the HTML (how can I tell?):
When I count nodes with echo count($nodes); the result is always 1
I get nothing outputted in the nodes loop
How can I learn the syntax of these complex queries?
What did I do wrong?
You can use simple getElementById:
$dom->getElementById('CPHCenter_lblOperandName')->nodeValue
or in selector way:
$selector = new DOMXPath($dom);
$list = $selector->query('/html/body//span[@id="CPHCenter_lblOperandName"]');
echo($list->item(0)->nodeValue);
//or
foreach($list as $span) {
$text = $span->nodeValue;
}
Your four part question gets an answer in three parts:
getElementsByTagName does not take an XPath expression, you need to give it a tag name;
Nothing is output because no tag would ever match the tagname you provided (see #1);
It looks like what you want is XPath, which means you need to create an XPath object - see the PHP docs for more;
Also, a better method of controlling the libxml errors is to use libxml_use_internal_errors(true) (rather than the '@' operator, which will also hide other, more legitimate errors). That would leave you with code that looks something like this:
<?php
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach($xpath->query("//span[@id='CPHCenter_lblOperandName']") as $node) {
echo $node->textContent;
}
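On the "how can I tell?" part of the question: with internal errors enabled, the parser's complaints are collected rather than printed, and libxml_get_errors() returns them for inspection. A rough sketch, assuming a deliberately odd fragment (the madeuptag element is invented here). Which messages appear, if any, depends on the libxml version installed:

```php
<?php
libxml_use_internal_errors(true);

$dom = new DOMDocument();
// Hypothetical input with an unknown tag, to give the parser something to report.
$dom->loadHTML('<span id="CPHCenter_lblOperandName">Hello world</span><madeuptag>');

// Each entry is a LibXMLError object with message, line, column and level properties.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    echo trim($error->message) . " (line {$error->line})\n";
}
libxml_clear_errors();

// The document is still usable even when the parser reported problems.
$text = $dom->getElementById('CPHCenter_lblOperandName')->textContent;
echo $text, "\n";
```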

DOMElement empty nodeValue

I have a project where I need to parse an XML page and pick out some data. The DOMDocument class seems perfect, and I tried a few basic tests to see if it would do what I wanted.
Here is my code for the moment:
$dom = new domDocument;
$html = file_get_contents('http://wadmag.com/feed.xml');
$previous_value = libxml_use_internal_errors(TRUE);
$dom->loadHTML("$html");
libxml_clear_errors(); //This here is to clear the errors caused by the page not
libxml_use_internal_errors($previous_value); // being proper html
$links = $dom->getElementsByTagName('item');
echo "Found : ".$links->length. " items";
foreach ($links as $link) {
echo $link->nodeValue."<br>";
}
Now the problem is that when I load the page, I get the message "Found : 21 items", meaning that getElementsByTagName returned a list, but when I try to display the contents of the list, nothing is displayed, as if the nodeValue were empty.
The even weirder thing is that if I replace "item" in the getElementsByTagName call with title or description, it displays everything as it should. I can't understand why; the only difference I can see is that <title> and <description> might be proper HTML whereas <item> is not.
If you are parsing XML, use $dom->loadXML($response) instead of $dom->loadHtml($response).
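As a sketch of the loadXML() route, assuming a feed shaped like a typical RSS document. The inline $xml string below is a made-up sample, not the contents of the feed in the question:

```php
<?php
// Hypothetical two-item feed; the real feed.xml will have more fields.
$xml = '<rss><channel>'
     . '<item><title>First</title><link>http://example.com/1</link></item>'
     . '<item><title>Second</title><link>http://example.com/2</link></item>'
     . '</channel></rss>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$titles = [];
foreach ($dom->getElementsByTagName('item') as $item) {
    // Look up the child element by name instead of relying on the item's nodeValue.
    $titles[] = $item->getElementsByTagName('title')->item(0)->nodeValue;
}

echo "Found : " . count($titles) . " items\n";
```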

Using a var as argument 2 in addChild method for writing XML

Here I parse some data from a webpage.
I want to write it to a file. It all works OK when I use some test strings in
$xml->addChild('alink', 'test');
But when I try to write in the data I actually need to use
$xml->addChild('alink', $value);
it doesn't work.
Message is :
Warning: SimpleXMLElement::addChild() [simplexmlelement.addchild]: unterminated entity reference .wvx= in C:\Documents and Settings\Owner\My Documents\Downloads\XAMPP_1.7.1\xampp\htdocs\PhpTest2\index.php on line 96
The complete code is below. Why does addChild not let me use a var as argument 2 in that method? And what is the workaround to get that working? I can find no explanation on php.net.
$dom = new DOMDocument();
@$dom->loadHtml($html);
$xpath = new DOMXPath($dom);
$articleList = $xpath->query("//body/div/div/div/table/tbody/tr/td/a");
$xml = new SimpleXmlElement('<links></links>');
$xml->addChild('dvd');
foreach ($articleList as $art)
{
$value = $art->getAttribute('href');
$xml->addChild('alink', $value);
}
$xml->asXML('/simplexml_create.xml');
Many Thanks,
-Code
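The "unterminated entity reference" warning usually points at a raw & in the value: addChild() does not escape ampersands in its second argument, so an href with a query string trips it while a plain test string does not. One possible workaround, sketched with a hypothetical URL (the real href values are not shown in the question), is to escape the value before passing it in:

```php
<?php
// Hypothetical href containing an ampersand, similar in shape to the failing value.
$value = 'http://example.com/play.wvx?id=1&fmt=2';

$xml = new SimpleXMLElement('<links></links>');

// Escape XML special characters so the & is stored as &amp; in the output.
$xml->addChild('alink', htmlspecialchars($value, ENT_XML1));

$out = $xml->asXML();
echo $out;

// Reading the element back yields the original, unescaped URL.
$roundTrip = (string) $xml->alink;
```

Whether this is the cause in the original script depends on what the scraped href values actually contain.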
