I'm trying to delete a child node within an XML document using DOM and PHP, but I can't quite figure out how to do it. I do not have access to SimpleXML.
XML Layout:
<list>
<as>
<a>
<a1>delete</a1>
</a>
<a>
<a1>keep</a1>
</a>
</as>
</list>
PHP Code:
$xml = "file.xml";
$dom = new DOMDocument();
$dom->load($xml);
$list = $dom->getElementsByTagName('as')->item(0);
//Cycle through <as> elements (there are multiple in the full file)
foreach($list->childNodes as $child) {
$subChild = substr($child->tagName, 0, -1);
$a = $dom->getElementsByTagName($subChild);
//Cycle through <a> elements
foreach($a as $node)
{
//Get status for status check
$check= $node->getElementsByTagName("a1")->item(0)->nodeValue;
if(strcmp($check,'delete')==0)
{
//code to delete here (I wish to delete the <a> that this triggers)
}
}
}
http://www.php.net/manual/en/class.domnode.php
http://www.php.net/manual/en/domnode.removechild.php
You need the parent of a node to remove it, and you've got it as a property of the node that you want to remove, so no biggie. The result would be:
$node->parentNode->removeChild($node);
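Applied to the loop from the question, this could look like the sketch below. It assumes the XML layout shown above, inlined as a string so the example is self-contained. It also copies the node list to an array first, since getElementsByTagName() returns a live list that changes while nodes are being removed.

```php
<?php
// Sample data mirroring the layout from the question (inlined for illustration).
$xml = '<list><as><a><a1>delete</a1></a><a><a1>keep</a1></a></as></list>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// Copy the live node list to a plain array so that removing nodes
// does not disturb the iteration.
foreach (iterator_to_array($dom->getElementsByTagName('a')) as $node) {
    $status = $node->getElementsByTagName('a1')->item(0);
    if ($status !== null && $status->nodeValue === 'delete') {
        // A node is always removed through its direct parent.
        $node->parentNode->removeChild($node);
    }
}

echo $dom->saveXML();
```

To write the result back to disk, $dom->save('file.xml') can be used instead of echoing the string.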
I want to remove first video element (video src=time.mp4) from this xml (filename.xml) and save the xml into filename4.smil :
<?xml version="1.0" encoding="utf-8"?>
<smil>
<stream name="mysq"/>
<playlist name="Default" playOnStream="mysq" repeat="true" scheduled="2010-01-01 01:01:00">
<video src="time.mp4" start="0" length="-1"> </video>
<video src="sample.mp4" start="0" length="-1"> </video>
</playlist>
</smil>
I am using this code, but it is not working:
<?php
$doc = new DOMDocument;
$doc->load("filename.xml");
$thedocument = $doc->documentElement;
//this gives you a list of the messages
$list0 = $thedocument->getElementsByTagName('playlist');
$list = $list0->item(0);
$nodeToRemove = null;
foreach ($list as $domElement){
$videos = $domElement->getElementsByTagName( 'video' );
$video = $videos->item(0);
$attrValue = $video->getAttribute('src');
if ($attrValue == 'time.mp4') {
$nodeToRemove = $videos; //will only remember last one- but this is just an example :)
}
}
//Now remove it.
if ($nodeToRemove != null)
$thedocument->removeChild($nodeToRemove);
$doc->save('filename4.smil');
?>
Assuming that there is only 1 playlist item and you want to remove the first video element from that, here are 2 methods.
This one uses getElementsByTagName() as you do in your code, but simply picks the first item from each list and then removes it (you have to use parentNode to remove the child node).
$playlist = $doc->getElementsByTagName('playlist')->item(0);
$video = $playlist->getElementsByTagName( 'video' )->item(0);
$video->parentNode->removeChild($video);
This version uses XPath, which is more flexible. It looks for playlist elements with a video element somewhere inside, again taking just the first match and removing it.
$xp = new DOMXPath($doc);
$video = $xp->query('//playlist//video')->item(0);
$video->parentNode->removeChild($video);
The problem with
$thedocument->removeChild($nodeToRemove);
is that you are trying to remove a child element from the document's root element. As this node is nested deeper in the hierarchy, the root cannot remove it; you need to remove it from its direct parent.
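A small sketch of the difference, using a trimmed-down version of the playlist XML as an assumption. It relies on removeChild() raising a DOMException when the given node is not a child of the node it is called on.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<smil><playlist><video src="time.mp4"/></playlist></smil>');

$video = $doc->getElementsByTagName('video')->item(0);

try {
    // <video> is a grandchild of <smil>, not a direct child, so this fails.
    $doc->documentElement->removeChild($video);
    $failed = false;
} catch (DOMException $e) {
    $failed = true;
    echo 'Could not remove: ', $e->getMessage(), "\n";
}

// Removing through the direct parent (<playlist>) works.
$video->parentNode->removeChild($video);
echo $doc->saveXML();
```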
Using Xpath expressions you can fetch video nodes with a specific src attribute, iterate them and remove them.
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
$expression = '/smil/playlist/video[@src="time.mp4"]';
foreach ($xpath->evaluate($expression) as $video) {
$video->parentNode->removeChild($video);
}
var_dump($document->saveXML());
It is possible to fetch nodes by position as well: /smil/playlist/video[1].
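For illustration, a positional version might look like the following sketch. The XML is the playlist from the question, shortened and inlined as an assumption so the snippet runs on its own.

```php
<?php
$xml = <<<'XML'
<smil>
  <playlist name="Default">
    <video src="time.mp4" start="0" length="-1"/>
    <video src="sample.mp4" start="0" length="-1"/>
  </playlist>
</smil>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// XPath positions are 1-based: video[1] is the first <video> of each <playlist>.
foreach ($xpath->evaluate('/smil/playlist/video[1]') as $video) {
    $video->parentNode->removeChild($video);
}

echo $document->saveXML();
```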
I have an array $arr=array("A-B-C-D","A-B-E","A-B-C-F") and my expected output of the XML should be
<root>
<A>
<B>
<C>
<D></D>
<F></F>
</C>
<E></E>
</B>
</A>
</root>
I have already done the code that creates a new node of the XML for the first array element i.e. A-B-C-D. But when I move to the second element I need to check how many nodes are already created (A-B) and then add the new node based on that in the proper position.
So how do I traverse the XML and find the exact position where the new node should be attached?
my current code looks like this
$arr=explode("-",$input);
$doc = new DomDocument();
$doc->formatOutput=true;
$doc->LoadXML('<root/>');
$root = $doc->documentElement;
$community = $doc->createElement('comm');
$root->appendChild($community);
foreach($arr as $a2) {
$newcomm = $doc->createElement($a2);
$community->appendChild($newcomm);
$community=$newcomm;
}
Should I use XPath, or would some other method be easier?
To stick with using DOMDocument, I've added an extra loop to allow you to add all of the original array items in. The main thing is before adding a new item in, check if it's already there...
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
$set=array("A-B-C-D","A-B-E","A-B-C-F-G", "A-B-G-Q");
$doc = new DomDocument();
$doc->formatOutput=true;
$doc->LoadXML('<root/>');
foreach ( $set as $input ) {
$arr=explode("-",$input);
$base = $doc->documentElement;
foreach($arr as $a2) {
$newcomm = null;
// Decide if the element already exists.
foreach ( $base->childNodes as $nextElement ) {
if ( $nextElement instanceof DOMElement
&& $nextElement->tagName == $a2 ) {
$newcomm = $nextElement;
}
}
if ( $newcomm == null ) {
$newcomm = $doc->createElement($a2);
$base->appendChild($newcomm);
}
$base=$newcomm;
}
}
echo $doc->saveXML();
As there is no quick way (as far as I know) to check for a child with a specific tag name, it just looks through all of the child nodes for a DOMElement with the same name.
I started out using getElementsByTagName, but this finds any descendant node with the name, not just those at the current level.
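As a possible alternative to the inner loop, an XPath query with a context node can restrict the lookup to direct children. This is only a sketch: findChildElement is a hypothetical helper name, and it assumes the names in the input are plain, valid element names.

```php
<?php
// Hypothetical helper: returns the direct child element named $name, or null.
function findChildElement(DOMXPath $xpath, DOMNode $parent, string $name): ?DOMElement
{
    // "./name" is evaluated relative to $parent and matches direct children only.
    // $name is assumed to be a plain, valid element name.
    $match = $xpath->query('./' . $name, $parent)->item(0);
    return $match instanceof DOMElement ? $match : null;
}

$doc = new DOMDocument();
$doc->loadXML('<root/>');
$xpath = new DOMXPath($doc);

foreach (['A-B-C-D', 'A-B-E', 'A-B-C-F'] as $input) {
    $base = $doc->documentElement;
    foreach (explode('-', $input) as $name) {
        $child = findChildElement($xpath, $base, $name);
        if ($child === null) {
            // Not there yet: create it under the current base element.
            $child = $base->appendChild($doc->createElement($name));
        }
        $base = $child;
    }
}

echo $doc->saveXML();
```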
The output from above is...
<?xml version="1.0"?>
<root>
<A>
<B>
<C>
<D/>
<F>
<G/>
</F>
</C>
<E/>
<G>
<Q/>
</G>
</B>
</A>
</root>
I added a few other items in to show that it adds things in at the right place.
I'd like to remove <font> tags from my html and am trying to use replaceChild to do so, but it doesn't seem to work properly. Can anyone catch what might be wrong?
$html = '<html><body><br><font class="heading2">Limited Size and Resources</font><p><br><strong>Q: When can a member use the limited size and resources exception?</strong></p></body></html>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$font_tags = $dom->GetElementsByTagName('font');
foreach($font_tags as $font_tag) {
foreach($font_tag as $child) {
$child->replaceChild($child->nodeValue, $font_tag);
}
}
echo $dom->saveHTML();
From what I understand, $font_tags is a DOMNodeList, so I need to iterate through it twice in order to use the DOMNode::replaceChild function. I then want to replace the current value with just the content inside of the tags. However, when I output the $html nothing changes. Any ideas what could be wrong?
I'll put my remarks inline
$html = '<html><body><br><font class="heading2">Limited Size and Resources</font><p><br><strong>Q: When can a member use the limited size and resources exception?</strong></p></body></html>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$font_tags = $dom->GetElementsByTagName('font');
/* You only need one loop, as it is iterating your collection
You would only need a second loop if each font tag had children of their own
*/
foreach($font_tags as $font_tag) {
/* replaceChild replaces children of the node being called
So, to replace the font tag, call the function on its parent
$prent will be that reference
*/
$prent = $font_tag->parentNode;
/* You can't insert arbitrary text, you have to create a textNode
That textNode must also be a member of your document
*/
$prent->replaceChild($dom->createTextNode($font_tag->nodeValue), $font_tag);
}
echo $dom->saveHTML();
Hopefully I understood your requirements correctly.
First, let us find out what wasn't working in your code.
1. foreach($font_tag as $child) does not iterate over any child nodes, because $font_tag is a single font element taken from the $font_tags node list, not a list itself.
2. $child->replaceChild($child->nodeValue, $font_tag); cannot work: replaceChild is a method of the parent node and replaces one of its children, so a child cannot use it to replace its own parent ($font_tag). It also expects a node as its first argument, not a string. For more details, see the PHP DOMNode::replaceChild documentation, or point 2 below my code.
3. echo $html outputs the original $html string, not the updated $dom object that we are modifying.
This would work -
$html = '<html><body><br><font class="heading2">Limited Size and Resources</font><p><br><strong>Q: When can a member use the limited size and resources exception?</strong></p></body></html>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$font_tags = $dom->GetElementsByTagName('font');
foreach($font_tags as $font_tag)
{
$new_node = $dom->createTextNode($font_tag->nodeValue);
$font_tag->parentNode->replaceChild($new_node, $font_tag);
}
echo $dom->saveHTML();
1. $new_node is created through $dom, so the node belongs to the DOMDocument rather than being a detached local value.
2. To replace the child $font_tag, we first go to its parent via the parentNode property, because replaceChild must be called on the parent.
3. Finally, we print the modified $dom using the saveHTML method, which converts the DOMDocument into an HTML string.
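One caveat worth sketching: getElementsByTagName() returns a live list, so when a document contains several font tags, replacing them inside a foreach over that list may skip some of them. The variant below uses an assumed sample string with two font tags and takes a snapshot of the list with iterator_to_array() before changing the tree.

```php
<?php
$html = '<html><body><p><font>One</font> and <font>Two</font></p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Snapshot the live list before modifying the tree.
foreach (iterator_to_array($dom->getElementsByTagName('font')) as $font_tag) {
    $text = $dom->createTextNode($font_tag->nodeValue);
    $font_tag->parentNode->replaceChild($text, $font_tag);
}

$output = $dom->saveHTML();
echo $output;
```

Note that nodeValue flattens any markup nested inside the font tag into plain text, which is fine for the example in the question.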
Remove a specific span tag from HTML while preserving/keeping the inside content using PHP and DOMDocument
<?php
$content = '<span style="font-family: helvetica; font-size: 12pt;"><div>asdf</div><span>TWO</span>Business owners are fearful of leading. They would rather follow the leader than embrace a bold move that challenges their confidence. </span>';
$dom = new DOMDocument();
// Use LIBXML for preventing output of doctype, <html>, and <body> tags
$dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//span[@style="font-family: helvetica; font-size: 12pt;"]') as $span) {
// Move all span tag content to its parent node just before it.
while ($span->hasChildNodes()) {
$child = $span->removeChild($span->firstChild);
$span->parentNode->insertBefore($child, $span);
}
// Remove the span tag.
$span->parentNode->removeChild($span);
}
// Get the final HTML with span tags stripped
$output = $dom->saveHTML();
print_r($output);
I have the following type of XML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<tag1>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tag1>
<tag2>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tag2>
...
<tagN>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tagN>
</root>
And I need to get the root with each child element separately, saved as HTML in an array:
array = [rootwithchild1,rootwithchild2...N];
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<tagN>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tagN>
</root>
For now I create two DOMs: in one I get all the children separately, and in the other I have deleted all the children and left only the root. At this step I wanted to add each child to the root, save as HTML, delete the child, and so on for each child, but this doesn't work.
$bodyNode = $copydoc->getElementsByTagName('root')->item(0);
foreach ($mini as $value) {
$bodyNode->appendChild($value);
$result[] = $copydoc->saveHTML();
$bodyNode->removeChild($value);
}
The error occurs on $bodyNode->appendChild($value);
$mini is an array of the detached child nodes.
Library: DOMDocument ($doc = new DOMDocument();)
Can anyone advise how to do this right? Maybe it would be better to use XPath or something else?
Thanks
I would simply create a new document that contains only the root element and a “fake” initial child:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<fakechild />
</root>
After that, loop over the child elements of the original document – and for each of those perform the following steps:
import the child node from the original document into the new document using DOMDocument::importNode
replace the current child node of the root element of the new document with the imported node using DOMNode::replaceChild with the firstChild of the root element as second parameter
save the new document
(Having the <fakechild /> in the root element to begin with is not technically necessary, a simple whitespace text node should do as well – but with an empty root element this would not work in such a straight fashion, because the firstChild would give you NULL in the first loop iteration, so you would not have a node to feed to DOMNode::replaceChild as second parameter. Of course you could do additional checks for that and use appendChild instead of replaceChild for the first item … but why complicate stuff more than necessary.)
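The steps above can be sketched roughly as follows. The element names in the sample source are placeholders (XML element names cannot start with a digit), the DOCTYPE is left out for brevity, and saveXML() is used here; saveHTML() could be swapped in if HTML serialization is really what is needed.

```php
<?php
// Sample source with valid element names (the names are placeholders).
$source = new DOMDocument();
$source->loadXML('<root><tag1><name>A</name></tag1><tag2><name>B</name></tag2></root>');

// New document containing only the root element and a placeholder child.
$target = new DOMDocument();
$target->loadXML('<root><fakechild/></root>');
$targetRoot = $target->documentElement;

$result = [];
foreach ($source->documentElement->childNodes as $child) {
    if ($child->nodeType !== XML_ELEMENT_NODE) {
        continue; // skip whitespace text nodes and comments
    }
    // 1. Import a deep copy of the child into the new document.
    $imported = $target->importNode($child, true);
    // 2. Swap it in for whatever is currently the root's first child.
    $targetRoot->replaceChild($imported, $targetRoot->firstChild);
    // 3. Save the new document.
    $result[] = $target->saveXML();
}

print_r($result);
```

The original document is never modified, so the loop over its child nodes is not affected by the replacements in the target document.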
DOMDocument::getElementsByTagName() returns a live result. So if you remove a node from the DOM, it is removed from the node list as well.
You can iterate the list backwards...
for ($i = $nodes->length - 1; $i >= 0; $i--) {
$node = $nodes->item($i);
...
}
... or copy it to an array:
foreach (iterator_to_array($nodes) as $node) {
...
}
Node lists from DOMXpath::evaluate() are not affected that way. XPath allows a more specific selection of nodes, too.
$xpath = new DOMXpath($domDocument);
$nodes = $xpath->evaluate('/root/*');
foreach (iterator_to_array($nodes) as $node) {
...
}
But I wonder why are you modifying (destroying) the original XML source?
I would create a new document to act as a template, never removing nodes, only creating new documents and importing nodes into them:
// load the original source
$source= new DOMDocument();
$source->loadXml($xml);
$xpath = new DOMXpath($source);
// create a template dom
$template = new DOMDocument();
$parent = $template;
// add a node and all its ancestors to the template
foreach ($xpath->evaluate('/root/part[1]/ancestor-or-self::*') as $node) {
$parent = $parent->appendChild($template->importNode($node, FALSE));
}
// for each of the child element nodes
foreach ($xpath->evaluate('/root/part/*') as $node) {
// create a new target
$target = new DOMDocument();
// import the nodes from the template
$target->appendChild($target->importNode($template->documentElement, TRUE));
// find the first element node that has no child element nodes
$targetXpath = new DOMXpath($target);
$targetNode = $targetXpath->evaluate('//*[count(*) = 0]')->item(0);
// append the child node from the original xml
$targetNode->appendChild($target->importNode($node, TRUE));
echo $target->saveXml(), "\n\n";
}
Demo: https://eval.in/191304
I am trying to update my XML file based on an HTML form processed by PHP but the new XML snippet I am trying to append to specific areas of my current XML just keeps getting added to the end of my document.
$specific_node = "0"; //this is normally set by a select input from the form.
$doc = new DOMDocument();
$doc->load( 'rfp_files.xml' );
$doc->formatOutput = true;
//Below is where I am having problems. The variable '$specific_node' can be one of three options (0, 1, 2), and what I am trying to do is find the matching child of content_sets, i.e. the first, second or third child element. That is where I want to add my new bit of XML.
$r = $doc->getElementsByTagname('content_sets')->item($specific_node);
//This is where I build out my new XML to append
$fileName = $doc->createElement("file_name");
$fileName->appendChild(
$doc->createTextNode( $Document_Array["url"] )
);
$b->appendChild( $fileName );
//this is where I add the new XML to the child node mentioned earlier in the script.
$r->appendChild( $b );
XML Example:
<?xml version="1.0" encoding="UTF-8"?>
<content_sets>
<doc_types>
<article>
<doc_name>Additional</doc_name>
<file_name>Additional.docx</file_name>
<doc_description>Test Word document. Please remove when live.</doc_description>
<doc_tags>word document,test,rfp template,template,rfp</doc_tags>
<last_update>01/26/2013 23:07</last_update>
</article>
</doc_types>
<video_types>
<article>
<doc_name>Test Video</doc_name>
<file_name>test_video.avi</file_name>
<doc_description>Test video. Please remove when live.</doc_description>
<doc_tags>test video,video, avi,xvid,svid avi</doc_tags>
<last_update>01/26/2013 23:07</last_update>
</article>
</video_types>
<image_types>
<article>
<doc_name>Test Image</doc_name>
<file_name>logo.png</file_name>
<doc_description>Logo transparent background. Please remove when live.</doc_description>
<doc_tags>png,logo,logo png,no background,graphic,hi res</doc_tags>
<last_update>01/26/2013 23:07</last_update>
</article>
</image_types>
</content_sets>
This is getting the root element:
$specific_node = "0";
$r = $doc->getElementsByTagname('content_sets')->item($specific_node);
So you are appending a child onto the root which is why you always see it added near the end of the document. You need to get the children of the root element like this:
$children = $doc->documentElement->childNodes;
This can return several types of node, but you are only interested in 'element' type nodes. It's not very elegant, but the only way I've found to get a child element by position is looping like this...
$j = 0;
foreach ($doc->documentElement->childNodes as $r) {
    // count element nodes only and stop at the requested position
    if ($r->nodeType === XML_ELEMENT_NODE && $j++ == $specific_node) {
        break;
    }
}
if ($j <= $specific_node) {
    // handle situation where $specific_node is more than number of elements
}
You could use getElementsByTagName() if you can pass the name of the node required instead of the ordinal position, or change the XML so that the child elements all have the same name and use an attribute to differentiate them.
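Another option, not part of the answer above but a common alternative, is an XPath positional query. The sketch below uses a trimmed-down version of the XML from the question; the article element and the 'example.docx' value are assumptions standing in for the code that builds the new snippet.

```php
<?php
$xml = '<content_sets><doc_types/><video_types/><image_types/></content_sets>';
$specific_node = 1; // zero-based index, as in the question

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// "*" matches element nodes only; XPath positions start at 1.
$r = $xpath->query('/content_sets/*[' . ((int) $specific_node + 1) . ']')->item(0);

if ($r === null) {
    echo "No child element at index $specific_node\n";
} else {
    // Placeholder content standing in for the real new XML.
    $article = $doc->createElement('article');
    $article->appendChild($doc->createElement('file_name', 'example.docx'));
    $r->appendChild($article);
}

echo $doc->saveXML();
```

Casting $specific_node to an integer keeps form input from being interpolated into the expression as arbitrary text.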