I want to remove the first video element (video src=time.mp4) from this XML (filename.xml) and save the result as filename4.smil:
<?xml version="1.0" encoding="utf-8"?>
<smil>
<stream name="mysq"/>
<playlist name="Default" playOnStream="mysq" repeat="true" scheduled="2010-01-01 01:01:00">
<video src="time.mp4" start="0" length="-1"> </video>
<video src="sample.mp4" start="0" length="-1"> </video>
</playlist>
</smil>
I am using this code, but it is not working:
<?php
$doc = new DOMDocument;
$doc->load("filename.xml");
$thedocument = $doc->documentElement;
//this gives you a list of the messages
$list0 = $thedocument->getElementsByTagName('playlist');
$list = $list0->item(0);
$nodeToRemove = null;
foreach ($list as $domElement){
$videos = $domElement->getElementsByTagName( 'video' );
$video = $videos->item(0);
$attrValue = $video->getAttribute('src');
if ($attrValue == 'time.mp4') {
$nodeToRemove = $videos; //will only remember last one- but this is just an example :)
}
}
//Now remove it.
if ($nodeToRemove != null)
$thedocument->removeChild($nodeToRemove);
$doc->save('filename4.smil');
?>
Assuming that there is only 1 playlist item and you want to remove the first video element from that, here are 2 methods.
This one uses getElementsByTagName() as you do in your code, but simply picks the first item from each list and then removes it (you have to use parentNode to remove the child node).
$playlist = $doc->getElementsByTagName('playlist')->item(0);
$video = $playlist->getElementsByTagName( 'video' )->item(0);
$video->parentNode->removeChild($video);
This version uses XPath, which is more flexible. It looks for video elements anywhere inside a playlist element. Again, it just takes the first one and removes it...
$xp = new DOMXPath($doc);
$video = $xp->query('//playlist//video')->item(0);
$video->parentNode->removeChild($video);
The problem with
$thedocument->removeChild($nodeToRemove);
is that you are trying to remove a child element from the base document element. As this node is nested deeper in the hierarchy, it can't be removed from there; you need to remove it from its direct parent.
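To make the point about the direct parent concrete, here is a minimal sketch. It assumes a trimmed inline copy of the playlist instead of filename.xml, and the variable names are only illustrative. Calling removeChild() on a node that is not the direct parent throws a DOMException, while going through parentNode works.

```php
<?php
// Assumption: a shortened inline version of the SMIL document from the question.
$doc = new DOMDocument();
$doc->loadXML('<smil><playlist><video src="time.mp4"/><video src="sample.mp4"/></playlist></smil>');

$video = $doc->getElementsByTagName('video')->item(0);

$errorCode = null;
try {
    // <video> is a grandchild of <smil>, not a direct child, so this fails.
    $doc->documentElement->removeChild($video);
} catch (DOMException $e) {
    $errorCode = $e->getCode(); // DOM_NOT_FOUND_ERR
}

// Removing through the direct parent works.
$video->parentNode->removeChild($video);
$remaining = $doc->getElementsByTagName('video')->length;

var_dump($errorCode === DOM_NOT_FOUND_ERR, $remaining);
```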
Using Xpath expressions you can fetch video nodes with a specific src attribute, iterate them and remove them.
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
$expression = '/smil/playlist/video[@src="time.mp4"]';
foreach ($xpath->evaluate($expression) as $video) {
$video->parentNode->removeChild($video);
}
var_dump($document->saveXML());
It is possible to fetch nodes by position as well: /smil/playlist/video[1].
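A small sketch of the position-based variant, assuming a shortened inline copy of the playlist (in the real script you would load filename.xml and finish with $document->save('filename4.smil')). Note that video[1] means "the first video child of each playlist", so with several playlists it would match one node per playlist.

```php
<?php
// Assumption: trimmed inline XML standing in for filename.xml.
$xml = '<smil><stream name="mysq"/><playlist name="Default">'
     . '<video src="time.mp4"/><video src="sample.mp4"/>'
     . '</playlist></smil>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// Remove the first <video> child of every <playlist>.
foreach ($xpath->evaluate('/smil/playlist/video[1]') as $video) {
    $video->parentNode->removeChild($video);
}

// Collect what is left to check the result.
$left = [];
foreach ($xpath->evaluate('/smil/playlist/video') as $video) {
    $left[] = $video->getAttribute('src');
}
print_r($left);
```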
Using PHP DOMDocument to import an XML file, I can't get the list of "tags".
I have tried multiple ways, but I can't get it to work.
XML document:
<resource>
<title>hello world</title>
<tags>
<resource>great</resource>
<resource>fun</resource>
<resource>omg</resource>
</tags>
</resource>
PHP:
<?php
$url='test.xml';
$doc = new DOMDocument();
$doc->load($url);
$feed = $doc->getElementsByTagName("resource");
foreach($feed as $entry) {
echo $entry->getElementsByTagName("username")->item(0)->nodeValue;
echo '<br>';
echo $entry->getElementsByTagName("tags")->item(0)->nodeValue;
echo '<br>';
}
I expect the output to be a list like this:
hello world
great
fun
omg
but the actual output is NOT a list; the result is a sentence without spaces:
hello world greatfunomg
DOMDocument::getElementsByTagName() returns all descendant element nodes with the specified name. DOMElement::$nodeValue will return the text content of an element node including all its descendants.
In your case, echo $entry->getElementsByTagName("tags")->item(0)->nodeValue fetches all tags elements, accesses the first node of that list, and outputs its text content, which is greatfunomg.
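To illustrate, here is a minimal sketch on an inline document without indentation (an assumption, so the whitespace does not get in the way). nodeValue on the tags element concatenates the text of all its descendants, whereas iterating its child elements gives you the individual values.

```php
<?php
// Assumption: the question's XML, inlined and without indentation.
$doc = new DOMDocument();
$doc->loadXML(
    '<resource><title>hello world</title><tags>'
    . '<resource>great</resource><resource>fun</resource><resource>omg</resource>'
    . '</tags></resource>'
);

$tags = $doc->getElementsByTagName('tags')->item(0);

// Text content of <tags> including all descendants, glued together.
$joined = $tags->nodeValue;

// One entry per child element instead.
$list = [];
foreach ($tags->childNodes as $child) {
    if ($child instanceof DOMElement) {
        $list[] = $child->nodeValue;
    }
}

echo $joined, "\n";
echo implode("\n", $list), "\n";
```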
Using the DOM methods to access nodes is verbose: it requires a lot of code and, if you want to do it right, a lot of conditions. It is a lot easier if you use XPath expressions. They allow you to fetch scalar values and lists of nodes from a DOM.
$xml = <<<'XML'
<_>
<resource>
<title>hello world</title>
<tags>
<resource>great</resource>
<resource>fun</resource>
<resource>omg</resource>
</tags>
</resource>
</_>
XML;
$document = new DOMDocument();
$document->loadXML($xml);
// create an Xpath instance for the document
$xpath = new DOMXpath($document);
// fetch resource nodes that are a direct children of the document element
$entries = $xpath->evaluate('/*/resource');
foreach($entries as $entry) {
// fetch the title node of the current entry as a string
echo $xpath->evaluate('string(title)', $entry), "\n";
// fetch resource nodes that are children of the tags node
// and map them into an array of strings
$tags = array_map(
function(\DOMElement $node) {
return $node->textContent;
},
iterator_to_array($xpath->evaluate('tags/resource', $entry))
);
echo implode(', ', $tags), "\n";
}
Output:
hello world
great, fun, omg
If you just need to output the first piece of text for each <resource> element, wherever it is, you can use XPath to select the elements and (making sure you ignore whitespace on load) output the node value of each element's first child.
Ignoring whitespace on load is important: otherwise the padding around each element creates whitespace text nodes, so the first child of each <resource> element may just be a newline or tab.
$xml = '<root>
<resource>
<title>hello world</title>
<tags>
<resource>great</resource>
<resource>fun</resource>
<resource>omg</resource>
</tags>
</resource>
</root>';
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);
// $doc->load($filename); // If loading from a file
$xpath = new DOMXpath($doc);
$resources = $xpath->query("//resource");
foreach ( $resources as $resource ){
echo $resource->firstChild->nodeValue.PHP_EOL;
}
The output of which is
hello world
great
fun
omg
Or without using XPath...
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);
//$doc->load($filename);
$resources = $doc->getElementsByTagName("resource");
foreach ( $resources as $resource ){
echo $resource->firstChild->nodeValue.PHP_EOL;
}
Here is the code snippet being used:
$urlContent = file_get_contents('http://www.techeblog.com/');
$dom = new DOMDocument();
@$dom->loadHTML($urlContent);
$domPath=new DOMXpath($dom);
$linkList = $domPath->evaluate("/html/body/a/img");
foreach ($linkList as $link)
{
echo $link->getAttribute("src")."<br />";
}
I need to extract all the links whose child node is an image tag.
Your XPath expression will only return image tags that are inside links that are direct children of the body tag. If you want all link tags that contain images anywhere in the document, use the expression //a[img]
That being said, you may want to be more specific about which images you pull. This expression will limit the results to links containing images that are inside the blog entries: //div[@class="entry"]//a[img].
Here is a great XPath cheat sheet.
<?php
$urlContent = file_get_contents('http://www.techeblog.com/');
$dom = new DOMDocument();
@$dom->loadHTML($urlContent);
$domPath=new DOMXpath($dom);
$linkList = $domPath->evaluate('//div[@class="entry"]//a[img]');
foreach ($linkList as $link)
{
echo $link->getAttribute("href").PHP_EOL;
}
Also, your echo is looking for an attribute called src, which will not be present in the links.
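The two expressions can be compared on a small inline HTML snippet. This is only a sketch: the markup and the hrefs are made up for illustration, and libxml_use_internal_errors() is used here instead of the @ operator to silence parser warnings.

```php
<?php
// Assumption: hypothetical markup, not the real page from the question.
$html = '<html><body>'
      . '<div class="entry"><a href="/post-1"><img src="one.png"></a>'
      . '<a href="/text-only">text</a></div>'
      . '<a href="/sidebar"><img src="ad.png"></a>'
      . '</body></html>';

$previous = libxml_use_internal_errors(true); // collect warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);

// Every link with an <img> child, anywhere in the document.
$all = [];
foreach ($xpath->evaluate('//a[img]') as $link) {
    $all[] = $link->getAttribute('href');
}

// Only such links inside <div class="entry">.
$inEntries = [];
foreach ($xpath->evaluate('//div[@class="entry"]//a[img]') as $link) {
    $inEntries[] = $link->getAttribute('href');
}

print_r($all);
print_r($inEntries);
```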
I used an XMLHttpRequest object to retrieve data from a PHP response.
Then, I created an XML file:
<?xml version="1.0" encoding="UTF-8"?>
<persons>
<person>
<name>Ce</name>
<gender>male</gender>
<age>24</age>
</person>
<person>
<name>Lin</name>
<gender>female</gender>
<age>25</age>
</person>
</persons>
In the PHP file, I load the XML file and try to echo tag values of "name."
$dom = new DOMDocument("1.0");
$dom -> load("test.xml");
$persons = $dom -> getElementsByTagName("person");
foreach($persons as $person){
echo $person -> childNodes -> item(0) -> nodeValue;
}
But the nodeValue returned is null. However, when I change to item(1), the name tag values can be displayed. Why?
Change code to
$dom = new DOMDocument("1.0");
$dom -> load("test.xml");
$persons = $dom -> getElementsByTagName("person");
foreach($persons as $person){
echo $person->childNodes[1]->nodeValue;
}
Anything in a DOM is a node, including text, even text that consists only of whitespace. So the first child of the person element node is a text node that contains the line break and indentation before the name element node.
Here is a property that removes any whitespace node at parse time:
$document = new DOMDocument("1.0");
// do not preserve whitespace only text nodes
$document->preserveWhiteSpace = FALSE;
$document->load("test.xml");
$persons = $document->getElementsByTagName("person");
foreach ($persons as $person) {
echo $person->firstChild->textContent;
}
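If you want to see the whitespace node for yourself, here is a minimal sketch that parses the same inline XML twice (the shortened XML string is an assumption). Note that preserveWhiteSpace has to be set before the document is loaded.

```php
<?php
// Assumption: a shortened, indented version of test.xml as a string.
$xml = "<persons>\n  <person>\n    <name>Ce</name>\n  </person>\n</persons>";

// Default: whitespace-only text nodes are kept.
$default = new DOMDocument();
$default->loadXML($xml);
$firstDefault = $default->getElementsByTagName('person')->item(0)->firstChild;

// Whitespace-only text nodes are dropped at parse time.
$stripped = new DOMDocument();
$stripped->preserveWhiteSpace = false; // must be set before loading
$stripped->loadXML($xml);
$firstStripped = $stripped->getElementsByTagName('person')->item(0)->firstChild;

var_dump(get_class($firstDefault));  // a text node
var_dump($firstStripped->nodeName);  // the <name> element
```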
However, a better way is typically to use XPath expressions.
$document = new DOMDocument("1.0");
$document->load("test.xml");
$xpath = new DOMXpath($document);
$persons = $xpath->evaluate("/persons/person");
foreach ($persons as $person) {
echo $xpath->evaluate("string(name)", $person);
}
string(name) fetches the child element node name (position is not relevant) and casts it into a string. If there is no name element, it will return an empty string.
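A short sketch of the empty-string behaviour, assuming an inline document in which the second person has no name element:

```php
<?php
// Assumption: inline XML where the second <person> lacks a <name>.
$document = new DOMDocument();
$document->loadXML(
    '<persons><person><name>Ce</name></person><person><age>25</age></person></persons>'
);
$xpath = new DOMXPath($document);

$names = [];
foreach ($xpath->evaluate('/persons/person') as $person) {
    // string() casts the first matching node to a string, or '' if nothing matches.
    $names[] = $xpath->evaluate('string(name)', $person);
}

var_dump($names);
```

This means you do not need a separate "does the element exist" check before reading the value.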
Using DOM, you need to get the right element to pick up the name, because child nodes include all sorts of things, including whitespace. Node 0, which you're trying to use, is a whitespace-only text node, which is why nothing useful is printed. So for DOM...
$dom = new DOMDocument("1.0");
$dom -> load("test.xml");
$persons = $dom -> getElementsByTagName("person");
foreach($persons as $person){
$name = $person->getElementsByTagName("name");
echo $name->item(0)->nodeValue.PHP_EOL;
}
If your requirements are as simple as this, you could alternatively use SimpleXML...
$sxml = simplexml_load_file("test.xml");
foreach ( $sxml->person as $person ) {
echo $person->name.PHP_EOL;
}
This allows you to access elements as though they are object properties and as you can see ->person equates to accessing <person>.
I have the following type of XML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<tag1>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tag1>
<tag2>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tag2>
...
<tagN>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tagN>
</root>
And I need to get the root with each child element separately, in an array, saved as HTML:
array = [rootwithchild1,rootwithchild2...N];
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<tagN>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tagN>
</root>
For now I create two DOMs: in one I get all the children separately, and in the other I have deleted all the children and left only the root. At this step I wanted to add each child to the root, save as HTML, delete the child, and so on for each child, but this doesn't work.
$bodyNode = $copydoc->getElementsByTagName('root')->item(0);
foreach ($mini as $value) {
$bodyNode->appendChild($value);
$result[] = $copydoc->saveHTML();
$bodyNode->removeChild($value);
}
The error occurs on $bodyNode->appendChild($value);
$mini is an array of the cut children.
Lib: $doc = new DOMDocument();
Can anyone advise how to do this right? Maybe it is better to use XPath or something else?
Thanks
I would simply create a new document that contains only the root element and a “fake” initial child:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<fakechild />
</root>
After that, loop over the child elements of the original document – and for each of those perform the following steps:
- import the child node from the original document into the new document using DOMDocument::importNode
- replace the current child node of the root element of the new document with the imported node using DOMNode::replaceChild with the firstChild of the root element as second parameter
- save the new document
(Having the <fakechild /> in the root element to begin with is not technically necessary, a simple whitespace text node should do as well – but with an empty root element this would not work in such a straight fashion, because the firstChild would give you NULL in the first loop iteration, so you would not have a node to feed to DOMNode::replaceChild as second parameter. Of course you could do additional checks for that and use appendChild instead of replaceChild for the first item … but why complicate stuff more than necessary.)
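The steps above could be sketched roughly like this. It is only an illustration under a few assumptions: the element names (tag1, tag2, a) are hypothetical stand-ins, since names like <1> from the question are not valid XML names, the DOCTYPE is left out, and the result is collected with saveXML() on the root element rather than written to files.

```php
<?php
// Assumption: simplified source document with hypothetical element names.
$source = new DOMDocument();
$source->loadXML('<root><tag1><a>1</a></tag1><tag2><a>2</a></tag2></root>');

// New document with only the root element and a placeholder child.
$target = new DOMDocument();
$target->loadXML('<root><fakechild/></root>');
$root = $target->documentElement;

$result = [];
foreach ($source->documentElement->childNodes as $child) {
    if (!$child instanceof DOMElement) {
        continue; // skip whitespace text nodes and comments
    }
    // Copy the node into the target document (deep copy, source stays untouched).
    $imported = $target->importNode($child, true);
    // Swap it in for whatever is currently the only child of the root.
    $root->replaceChild($imported, $root->firstChild);
    $result[] = $target->saveXML($root);
}

print_r($result);
```

Because importNode() copies the node, the original document is never modified, which also avoids the error from appending a node that belongs to another document.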
DOMDocument::getElementsByTagName() returns a live result. So if you remove the node from the DOM, it is removed from the node list as well.
You can iterate the list backwards...
for ($i = $nodes->length - 1; $i >= 0; $i--) {
$node = $nodes->item($i);
...
}
... or copy it to an array:
foreach (iterator_to_array($nodes) as $node) {
...
}
Node lists from DOMXpath::evaluate() are not affected that way. XPath allows a more specific selection of nodes, too.
$xpath = new DOMXpath($domDocument);
$nodes = $xpath->evaluate('/root/*');
foreach (iterator_to_array($nodes) as $node) {
...
}
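The difference between the live list and an XPath result can be checked with a small sketch (the inline document and the variable names are assumptions for illustration):

```php
<?php
// Assumption: a tiny inline document with three sibling elements.
$doc = new DOMDocument();
$doc->loadXML('<root><item/><item/><item/></root>');
$xpath = new DOMXPath($doc);

$live = $doc->getElementsByTagName('item');   // live list
$snapshot = $xpath->evaluate('/root/item');   // fixed result list

// Remove one node from the document.
$first = $live->item(0);
$first->parentNode->removeChild($first);

$liveLength = $live->length;         // shrinks with the document
$snapshotLength = $snapshot->length; // still the original count

// Copying the live list to an array makes it safe to remove while iterating.
foreach (iterator_to_array($live) as $node) {
    $node->parentNode->removeChild($node);
}
$afterAll = $live->length;

var_dump($liveLength, $snapshotLength, $afterAll);
```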
But I wonder why are you modifying (destroying) the original XML source?
I would create a new document to act as a template, never removing nodes, only creating new documents and importing nodes into them:
// load the original source
$source= new DOMDocument();
$source->loadXml($xml);
$xpath = new DOMXpath($source);
// create a template dom
$template = new DOMDocument();
$parent = $template;
// add a node and all its ancestors to the template
foreach ($xpath->evaluate('/root/part[1]/ancestor-or-self::*') as $node) {
$parent = $parent->appendChild($template->importNode($node, FALSE));
}
// for each of the child element nodes
foreach ($xpath->evaluate('/root/part/*') as $node) {
// create a new target
$target = new DOMDocument();
// import the nodes from the template
$target->appendChild($target->importNode($template->documentElement, TRUE));
// find the first element node that has no child element nodes
$targetXpath = new DOMXpath($target);
$targetNode = $targetXpath->evaluate('//*[count(*) = 0]')->item(0);
// append the child node from the original xml
$targetNode->appendChild($target->importNode($node, TRUE));
echo $target->saveXml(), "\n\n";
}
Demo: https://eval.in/191304
I'm trying to delete a child node within an XML document using DOM and PHP, but I can't quite figure out how to do it. I do not have access to SimpleXML.
XML Layout:
<list>
<as>
<a>
<a1>delete</a1>
</a>
<a>
<a1>keep</a1>
</a>
</as>
</list>
PHP Code:
$xml = "file.xml";
$dom = DOMDocument::load($xml);
$list = $dom->getElementsByTagName('as')->item(0);
//Cycle through <as> elements (there are multiple in the full file)
foreach($list->childNodes as $child) {
$subChild = substr($child->tagName, 0, -1);
$a = $dom->getElementsByTagName($subChild);
//Cycle through <a> elements
foreach($a as $node)
{
//Get status for status check
$check= $node->getElementsByTagName("a1")->item(0)->nodeValue;
if(strcmp($check,'delete')==0)
{
//code to delete here (I wish to delete the <a> that this triggers
}
}
}
http://www.php.net/manual/en/class.domnode.php
http://www.php.net/manual/en/domnode.removechild.php
You need the parent of a node to remove it, and you've got it as a property of the node that you want to remove, so no biggie. The result would be:
$node->parentNode->removeChild($node);
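Put together for the layout in the question, a minimal sketch could look like the following. It swaps the nested getElementsByTagName() loops for a single XPath expression, which is my substitution rather than something the question used, and it assumes an inline copy of the XML without indentation.

```php
<?php
// Assumption: the question's layout, inlined and without whitespace.
$dom = new DOMDocument();
$dom->loadXML('<list><as><a><a1>delete</a1></a><a><a1>keep</a1></a></as></list>');
$xpath = new DOMXPath($dom);

// Select every <a> that has an <a1> child with the text "delete".
foreach ($xpath->evaluate('//a[a1 = "delete"]') as $node) {
    $node->parentNode->removeChild($node);
}

$result = $dom->saveXML($dom->documentElement);
echo $result, "\n";
```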