I'm using PHP/Zend to load html into a DOM, and then I get a specific div id that I want to modify.
$dom = new Zend_Dom_Query($html);
$element = $dom->query('div[id="someid"]');
How do I modify the text/content/html displayed inside that $element div, and then save the changes to the $dom or $html so I can print the modified html? Any idea how to do this?
Zend_Dom_Query is built only for querying a DOM, so it provides no interface of its own for altering the DOM and saving it. It does, however, expose the native PHP DOM objects, which let you do so. Something like this should work:
$dom = new Zend_Dom_Query($html);
$document = $dom->getDocument();
$elements = $dom->query('div[id="someid"]');
foreach ($elements as $element) {
    // $element is an instance of DOMElement (http://www.php.net/DOMElement)
    // You have to create new nodes off the document
    $node = $document->createElement("div", "contents of div");
    $element->appendChild($node);
}
// saveHTML() serializes the document as HTML; saveXML() would emit XML syntax instead
$newHtml = $document->saveHTML();
Take a look at the PHP Doc for DOMElement to get an idea of how you can alter the dom:
http://www.php.net/DOMElement
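If the goal is to replace what is inside the div rather than append to it, a minimal sketch could look like the following. It uses plain DOMDocument and DOMXPath instead of Zend_Dom_Query so that it runs on its own; the sample $html and the replacement text are assumptions made up for illustration. With Zend_Dom_Query, the document would come from getDocument() as in the answer above.

```php
<?php
// Hypothetical input; in the question this would be the loaded $html.
$html = '<html><body><div id="someid">old text</div><p>untouched</p></body></html>';

$document = new DOMDocument();
$document->loadHTML($html);

$xpath = new DOMXPath($document);
foreach ($xpath->query('//div[@id="someid"]') as $element) {
    // Remove every existing child so the div ends up empty.
    while ($element->firstChild) {
        $element->removeChild($element->firstChild);
    }
    // A text node is escaped on output, which suits plain text content.
    $element->appendChild($document->createTextNode('new text'));
}

$newHtml = $document->saveHTML();
echo $newHtml;
```

To insert markup rather than plain text, the new children would have to be created as elements off the document (createElement), as the answer above does.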
Related
I am using PHP Simple HTML Dom library. I can obtain the element that I want. But this element contains other elements that I want to remove from selection.
[elem]
include this data
[elem]exclude this data[elem]
[elem]
If it is possible please show an example.
xml
<elem>
include this data
<elem>exclude this data</elem>
</elem>
php -- Pure DOMDocument solution:
$dom = new DOMDocument;
$dom->load('xml.xml');
$node = $dom->getElementsByTagName('elem')->item(0);
$child = $node->getElementsByTagName('elem')->item(0);
$node->removeChild($child);
echo $dom->saveXml();
php -- SimpleXML with DOMDocument
$doc = simplexml_load_file('xml.xml');
$toremove = $doc->elem;
$dom = dom_import_simplexml($toremove);
$dom->parentNode->removeChild($dom);
echo $doc->asXml();
I'd like to search for nodes with the same node name in a SimpleXML Object no matter how deep they are nested and create an instance of them as an array.
In the HTML DOM I can do that with JavaScript by using getElementsByTagName(). Is there a way to do that in PHP as well?
Yes, use XPath:
$xml->xpath('//div');
Here $xml is your SimpleXML object.
In this example you will get an array of all 'div' elements, no matter how deeply they are nested.
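A small self-contained sketch of that call, assuming a made-up sample document (the element names and contents are illustrative only):

```php
<?php
// Hypothetical sample document with div elements at different depths.
$xml = simplexml_load_string(
    '<root><div>a</div><section><div>b</div></section></root>'
);

// xpath() returns a plain PHP array of SimpleXMLElement objects.
$divs = $xml->xpath('//div');

foreach ($divs as $div) {
    echo (string) $div, "\n";
}
```

The leading // in the expression is what makes the search ignore nesting depth.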
$fname = dirname(__FILE__) . '\\xml\\crRoll.xml';
$dom = new DOMDocument;
$dom->load($fname, LIBXML_DTDLOAD|LIBXML_DTDATTR);
$root = $dom->documentElement;
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('cr', "http://www.w3.org/1999/xhtml");
$candidateNodes = $xpath->query("//cr:break");
foreach ($candidateNodes as $child) {
$max = $child->getAttribute('tstamp');
}
This finds all the BREAK nodes (tstamp attr) using XPath ...
getElementsByTagName is only available in the DOM extension (DOMDocument::getElementsByTagName).
However, you can import/export SimpleXML to and from DOMDocument,
or simply use DOMDocument to parse the XML.
Another answer mentioned XPath.
Note that //div also matches nested nodes, so it returns both the outer and the inner div if you have something like:
<div><div>1</div></div>
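To see how nested matches behave, here is a rough sketch that runs both approaches on that markup. The wrapping root element is an assumption added so the snippet parses as XML, and the not(ancestor::div) filter is just one possible way to keep only the outermost matches.

```php
<?php
// Hypothetical wrapper around the nested markup shown above.
$xml = simplexml_load_string('<root><div><div>1</div></div></root>');

// Convert the SimpleXML element to its DOMElement counterpart.
$domRoot = dom_import_simplexml($xml);
$domCount = $domRoot->getElementsByTagName('div')->length;

// The XPath route on the same document.
$xpathCount = count($xml->xpath('//div'));

// One way to keep only divs that are not nested inside another div.
$outerOnly = $xml->xpath('//div[not(ancestor::div)]');

echo $domCount, ' ', $xpathCount, ' ', count($outerOnly), "\n";
```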
I'm just getting started with using php DOMDocument and am having a little trouble.
How would I select all link nodes under a specific node, let's say h5?
In jQuery I could simply do $('h5 > a')
and this would give me all the links under h5.
How would I do this in PHP using DOMDocument methods?
I tried using phpQuery, but for some reason it can't read the HTML page I'm trying to parse.
jQuery's selectors are based on CSS selectors rather than XPath, but a simple selector like this one translates directly to XPath.
h5 > a means: select any a node whose direct parent node is h5. This translates to the XPath query //h5/a.
So, using DOMDocument:
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//h5/a');
foreach ($nodes as $node) {
// do stuff
}
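As a runnable sketch of the same query, assuming a made-up $html string: only the link that is a direct child of h5 should be picked up, while the link wrapped in a span and the link outside the h5 are skipped.

```php
<?php
// Hypothetical markup: one direct child link, one nested link, one unrelated link.
$html = '<html><body>'
      . '<h5><a href="/one">One</a> <span><a href="/nested">Nested</a></span></h5>'
      . '<p><a href="/other">Other</a></p>'
      . '</body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$hrefs = array();
foreach ($xpath->query('//h5/a') as $node) {
    // Each $node is a DOMElement, so attributes are available directly.
    $hrefs[] = $node->getAttribute('href');
}

print_r($hrefs);
```

Using //h5//a instead would correspond to the jQuery descendant selector 'h5 a' and would also include the nested link.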
Retrieve the DOMElement whose children you are interested in and call DOMElement::getElementsByTagName on it.
Get all a tags from it, and loop through each one, checking whether its parent is an h5 tag.
// ...
$links = $document->getElementsByTagName('a');
$correct_tags = array();
foreach ($links as $link) {
    // nodeName is lowercase for documents parsed with loadHTML()
    if ($link->parentNode->nodeName === 'h5') {
        $correct_tags[] = $link;
    }
}
// do something with $correct_tags
I need to convert part of a DOM element to a string, with the HTML tags inside it.
I tried the following, but it prints just the text without the tags inside.
$dom = new DOMDocument();
$dom->loadHTMLFile('http://www.pixmania-pro.co.uk/gb/uk/08920684/art/packard-bell/easynote-tm89-gu-015uk.html');
$xpath = new DOMXPath($dom);
$elements = $xpath->query('//table');
foreach($elements as $element)
echo $element->nodeValue;
I want all the tags as they are, along with the content inside the tables. Can someone help me? It would be a great help.
Thanks.
Current solution:
foreach($elements as $element){
echo $dom->saveHTML($element);
}
Old answer (php < 5.3.6):
1. Create a new instance of DOMDocument.
2. Clone the node (with all its sub nodes) you wish to save as HTML.
3. Import the cloned node into the new instance of DOMDocument and append it as a child.
4. Save the new instance as HTML.
So something like this:
foreach($elements as $element){
$newdoc = new DOMDocument();
$cloned = $element->cloneNode(TRUE);
$newdoc->appendChild($newdoc->importNode($cloned,TRUE));
echo $newdoc->saveHTML();
}
With PHP 5.3.6 or higher you can pass a node to DOMDocument::saveHTML:
foreach($elements as $element){
echo $dom->saveHTML($element);
}
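To make the difference between nodeValue and saveHTML($node) concrete, here is a small sketch on a made-up table. The $html string is an assumption for illustration, and it relies on PHP 5.3.6 or later for the node argument.

```php
<?php
// Hypothetical input with markup inside a table cell.
$html = '<html><body><table><tr><td><b>bold</b> cell</td></tr></table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

foreach ($xpath->query('//table') as $element) {
    // nodeValue concatenates the text of all descendants and drops the tags.
    $text = $element->nodeValue;
    // saveHTML($node) serializes just this node, tags included.
    $markup = $dom->saveHTML($element);

    echo $text, "\n";
    echo $markup, "\n";
}
```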
Is there a way to remove a HTML element by using the DOMDocument class?
In addition to Dave Morgan's answer, you can use DOMNode::removeChild to remove a child from the list of children:
Removing a child by tag name
//The following example will delete the table element of an HTML content.
$dom = new DOMDocument();
//avoid the whitespace after removing the node
$dom->preserveWhiteSpace = false;
//parse html dom elements
$dom->loadHTML($html_contents);
//get the table from dom
if($table = $dom->getElementsByTagName('table')->item(0)) {
//remove the node by telling the parent node to remove the child
$table->parentNode->removeChild($table);
//save the new document
echo $dom->saveHTML();
}
Removing a child by class name
//same beginning
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html_contents);
//use DomXPath to find the table element with your class name
$xpath = new DomXPath($dom);
$classname='MyTableName';
$xpath_results = $xpath->query("//table[contains(@class, '$classname')]");
//get the first table from XPath results
if($table = $xpath_results->item(0)){
//remove the node the same way
$table->parentNode->removeChild($table);
echo $dom->saveHTML();
}
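The examples above remove only the first match. A possible sketch for removing every table with a given class follows; the sample $html_contents is made up for illustration. Two assumptions are worth flagging: the matches are copied into an array before any removal, so the loop does not depend on the node list while the tree changes, and the XPath uses the common concat/normalize-space idiom so that a class such as MyTableNameOther is not matched by accident, which a bare contains() on the class attribute would do.

```php
<?php
// Hypothetical input: two tables carry the class, one has a similar-looking class.
$html_contents = '<html><body>'
    . '<table class="MyTableName wide"><tr><td>1</td></tr></table>'
    . '<table class="MyTableNameOther"><tr><td>2</td></tr></table>'
    . '<table class="MyTableName"><tr><td>3</td></tr></table>'
    . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html_contents);
$xpath = new DOMXPath($dom);

$classname = 'MyTableName';
// Match the class as a whole token, not as a substring.
$query = "//table[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]";

// Copy the matches into an array first, then remove them one by one.
$matches = iterator_to_array($xpath->query($query));
foreach ($matches as $table) {
    $table->parentNode->removeChild($table);
}

$result = $dom->saveHTML();
echo $result;
```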
Resources
http://us2.php.net/manual/en/domnode.removechild.php
How to delete element with DOMDocument?
How to get full HTML from DOMXPath::query() method?
http://us2.php.net/manual/en/domnode.removechild.php
DOMDocument is a DOMNode. You can just call removeChild and you should be fine.
EDIT: Just noticed you were probably talking about the page you are currently working with. I don't know if DOMDocument would work. You may want to look into using JavaScript at that point (if it's already been served up to the client).