I use a DOMDocument object to get some data from this HTML:
<div class="prodImg">
<img src="some_image_src"/>
</div>
With this code:
libxml_use_internal_errors(true);
$homepage = file_get_contents('some_src');
$doc = new DomDocument;
@$doc->loadHtml($homepage);
$xpath = new DomXpath($doc);
$div = $xpath->query('//*[@class="prodImg"]')->item(0);
I get the whole div container but I want to get only the image src attribute.
It works for me:
$div = $xpath->query('//*[@class="prodImg"]')->item(0)->getElementsByTagName('img')->item(0)->getAttribute('src');
var_dump($div);
Yields:
string 'some_image_src' (length=14)
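The same value can also be read with a single XPath expression instead of chaining DOM calls. This is a minimal sketch, assuming the markup from the question is inlined as a string (the real code would load it with file_get_contents); the variable name $src is only illustrative.

```php
<?php
// Sample markup from the question, inlined so the sketch runs on its own.
$html = '<div class="prodImg"><img src="some_image_src"/></div>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// string() yields the value of the first matching attribute node,
// or an empty string when nothing matches.
$src = $xpath->evaluate('string(//div[@class="prodImg"]/img/@src)');

var_dump($src);
```

Because string() returns an empty string when nothing matches, this form avoids calling a method on null if the div or the image is missing.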
I have the following source code:
<?php
function getTerms()
{
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML('https://charitablebookings.com/terms'); // loads your HTML
$xpath = new DOMXPath($doc);
// returns a list of all links with rel=nofollow
$nodeList = $xpath->query("//div[@class='terms-conditions']");
$temp_dom = new DOMDocument();
$node = $nodeList->item(0);
$temp_dom = new DOMDocument();
foreach($nodeList as $n) $temp_dom->appendChild($temp_dom->importNode($n,true));
print_r($temp_dom->saveHTML());
}
getTerms();
?>
With it I'm trying to get text from a web page by selecting an element with a specific class. Nothing shows up in my browser when I print_r the $temp_dom, and $node is null. What am I doing wrong?
Thanks for your time
The first issue is that DOMDocument's loadHTML method expects HTML content as its first parameter, not a URL. Fetch the page first and pass the resulting string:
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$html = file_get_contents('https://charitablebookings.com/terms');
$doc->loadHTML($html);
The second problem is with your XPath expression, $xpath->query("//div[@class='terms-conditions']"): there is no div with a class of terms-conditions in the document the server returns (it is probably added later by a JavaScript loader).
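A quick way to confirm that the element is missing from the server-sent HTML is to check the length of the node list before using it. This is a rough sketch, assuming a small inline document as a stand-in for the downloaded page; the markup and the $message variable are hypothetical.

```php
<?php
// Hypothetical stand-in for the downloaded page: it contains no
// div with class "terms-conditions", like the page in the question.
$html = '<html><body><div id="app"></div></body></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$nodeList = $xpath->query("//div[@class='terms-conditions']");

if ($nodeList === false || $nodeList->length === 0) {
    // Nothing matched: the element is not in the HTML the server sent.
    $message = 'No matching element found';
} else {
    $message = $doc->saveHTML($nodeList->item(0));
}

echo $message, PHP_EOL;
```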
I need to find an element by ID using PHP and then append HTML content to it. It seems simple enough, but I'm new to PHP and can't find the right function for this.
$html = file_get_contents('http://example.com');
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
$descBox = $doc->getElementById('element1');
I just don't know how to do the next step. Any help would be appreciated.
As chris mentioned in his comment, try DOMNode::appendChild, which adds a child node to your selected element, together with DOMDocument::createElement, which creates the element to add:
$html = file_get_contents('http://example.com');
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
//get the element you want to append to
$descBox = $doc->getElementById('element1');
//create the element to append to #element1
$appended = $doc->createElement('div', 'This is a test element.');
//actually append the element
$descBox->appendChild($appended);
Alternatively, if you already have an HTML string you want to append, you can create a document fragment like so:
$html = file_get_contents('http://example.com');
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
//get the element you want to append to
$descBox = $doc->getElementById('element1');
//create the fragment
$fragment = $doc->createDocumentFragment();
//add content to fragment
$fragment->appendXML('<div>This is a test element.</div>');
//actually append the element
$descBox->appendChild($fragment);
Please note that any elements added with JavaScript will be inaccessible to PHP.
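One caveat with the fragment approach: appendXML parses its argument as XML, so the string has to be well-formed XML rather than just valid HTML. The sketch below illustrates this under the assumption of a small inline document; the sample strings and the $bad and $good variables are made up for the example.

```php
<?php
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div id="element1"></div></body></html>');
$descBox = $doc->getElementById('element1');

// An unclosed <br> is fine in HTML but is not well-formed XML,
// so appendXML is expected to report failure here.
$badFragment = $doc->createDocumentFragment();
$bad = $badFragment->appendXML('<p>line one<br>line two</p>');

// A self-closing <br/> is well-formed XML.
$goodFragment = $doc->createDocumentFragment();
$good = $goodFragment->appendXML('<p>line one<br/>line two</p>');

// Only append when parsing succeeded.
if ($good) {
    $descBox->appendChild($goodFragment);
}

echo $doc->saveHTML($descBox), PHP_EOL;
```

Checking the boolean return value before appending avoids inserting an empty fragment without noticing.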
You can also append this way:
$html = '
<html>
<body>
<ul id="one">
<li>hello</li>
<li>hello2</li>
<li>hello3</li>
<li>hello4</li>
</ul>
</body>
</html>';
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
//get the element you want to append to
$descBox = $doc->getElementById('one');
//create the element to append to #one
$appended = $doc->createElement('li', 'This is a test element.');
//actually append the element
$descBox->appendChild($appended);
echo $doc->saveHTML();
Don't forget to call saveHTML() on the last line to output the modified document.
If I have the following html:
<div id="thisID">100</div>
I can get the value, 100, like this:
$dom = new DOMDocument();
$dom->loadHTMLFile($url);
$data = $dom->getElementById("thisID");
$result = $data->nodeValue;
But what about this html?
<span class="foo" id="bar" itemprop="price">100</span>
Is there a way to get an element's content by an attribute name and value, in this case itemprop="price"?
a) Use DOMXPath:
<?php
$doc = new DOMDocument();
$doc->loadHTML('<span class="foo" id="bar" itemprop="price">100</span>');
$xpath = new DOMXPath($doc);
$result = $xpath->evaluate('number(//*[@itemprop="price"])');
b) Use a real microdata parser.
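The number() call in option a) converts only the first match. If a page carries several itemprop="price" elements, query() can collect them all. This is a minimal sketch, assuming hypothetical markup with two prices; the $prices array is an illustrative name.

```php
<?php
// Hypothetical markup with two priced items.
$html = '<div><span itemprop="price">100</span><span itemprop="price">250</span></div>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$prices = [];
foreach ($xpath->query('//*[@itemprop="price"]') as $node) {
    // textContent is the concatenated text inside the element.
    $prices[] = trim($node->textContent);
}

print_r($prices);
```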
I have a string of 'source html' and a string of 'replacement html'. In the 'source html' I want to look for a node with a specific class and replace its content with my 'replacement html'. I have tried using the replaceChild method, but this seems to require that I traverse a level up (parentNode).
This doesn't work
$dom = new DOMDocument;
$dom->loadXml($sourceHTML);
$replacement = $dom->createDocumentFragment();
$replacement->appendXML($replacementHTML);
$xpath = new DOMXPath($dom);
$oldNode = $xpath->query('//div[contains(@class,"arrangement--index__field-dato")]')->item(0);
$oldNode->replaceChild($replacement, $oldNode);
This works, but it's not the content which is being replaced
$dom = new DOMDocument;
$dom->loadXml($sourceHTML);
$replacement = $dom->createDocumentFragment();
$replacement->appendXML($replacementHTML);
$xpath = new DOMXPath($dom);
$oldNode = $xpath->query('//div[contains(@class,"arrangement--index__field-dato")]')->item(0);
$oldNode->parentNode->replaceChild($replacement, $oldNode);
How do I replace the content or the node I have queried for?
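Instead of replacing the node itself, remove its children and insert the new content as a child node. Remove them in a while loop rather than a foreach over childNodes, because that list is live and removing nodes while iterating over it skips some of them. Something like
while ($oldNode->firstChild)
$oldNode->removeChild($oldNode->firstChild);
$oldNode->appendChild($replacement);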
This will replace the contents (children) instead of the node itself.
This seems to work!
$dom = new DOMDocument;
$dom->loadXml($sourceHTML);
$replacement = $dom->createDocumentFragment();
$replacement->appendXML($replacementHTML);
$xpath = new DOMXPath($dom);
$oldNode = $xpath->query('//div[contains(@class,"arrangement--index__field-dato")]')->item(0);
$oldNode->removeChild($oldNode->firstChild);
$oldNode->appendChild($replacement);
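Note that removing only firstChild clears the node just when it has a single child. A fuller sketch that empties the node whatever it contains might look like this, assuming hypothetical values for $sourceHTML and $replacementHTML (both must be well-formed XML, since the code uses loadXML and appendXML).

```php
<?php
// Hypothetical inputs; the target div holds two children (an element and a text node).
$sourceHTML = '<div><div class="arrangement--index__field-dato"><b>old</b> text</div></div>';
$replacementHTML = '<span>new</span>';

$dom = new DOMDocument();
$dom->loadXML($sourceHTML);

$replacement = $dom->createDocumentFragment();
$replacement->appendXML($replacementHTML);

$xpath = new DOMXPath($dom);
$target = $xpath->query('//div[contains(@class,"arrangement--index__field-dato")]')->item(0);

// Remove every existing child; firstChild is null once the node is empty.
while ($target->firstChild !== null) {
    $target->removeChild($target->firstChild);
}
$target->appendChild($replacement);

// Serialize only the root element, without the XML declaration.
$result = $dom->saveXML($dom->documentElement);
echo $result, PHP_EOL;
```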
Exactly as described in the title. Currently my code is:
<?php
$url = "remotesite.com/page1.html";
$html = file_get_contents($url);
$doc = new DOMDocument(); // create DOMDocument
libxml_use_internal_errors(true);
$doc->loadHTML($html); // load HTML you can add $html
$elements = $doc->getElementsByTagName('div');
?>
My coding skills are very basic, so at this point I am lost and don't know how to display only the div that has id="mydiv".
If you have PHP 5.3.6 or higher you can do the following:
$url = "remotesite.com/page1.html";
$html = file_get_contents($url);
$doc = new DOMDocument(); // create DOMDocument
libxml_use_internal_errors(true);
$doc->loadHTML($html); // load HTML you can add $html
$testElement = $doc->getElementById('divIDName');
echo $doc->saveHTML($testElement);
http://php.net/manual/en/domdocument.getelementbyid.php
If you have a lower version, I believe you would need to copy the DOM node, once you have found it with getElementById, into a new DOMDocument object:
$elementDoc = new DOMDocument();
$cloned = $testElement->cloneNode(TRUE);
$elementDoc->appendChild($elementDoc->importNode($cloned,TRUE));
echo $elementDoc->saveHTML();
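Either variant fails with an error if no element has the requested id, because getElementById then returns null. A small sketch with a guard, assuming hypothetical inline markup in place of the remote page and the id mydiv from the question:

```php
<?php
// Hypothetical page content, inlined so the sketch runs without a network call.
$html = '<html><body><div id="other">Skip</div><div id="mydiv">Hello</div></body></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

$element = $doc->getElementById('mydiv');

// getElementById returns null when no element has that id.
$output = $element !== null ? $doc->saveHTML($element) : '';

echo $output, PHP_EOL;
```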