How to remove XML elements and all children? - php

I need to read an XML file and delete all the elements named <image>, along with all of their children. I have found similar old questions, but their answers did not work for me. What am I doing wrong? Is there a better method?
XML:
<?xml version='1.0' encoding='UTF-8'?>
<settings>
<background_color>#000000</background_color>
<show_context_menu>yes</show_context_menu>
<image>
<thumb_path>210x245.png</thumb_path>
<big_image_path>620x930.png</big_image_path>
</image>
<image>
<thumb_path>200x295.png</thumb_path>
<big_image_path>643x950.png</big_image_path>
</image>
</settings>
PHP:
$dom = new DOMDocument();
$dom->load('test.xml');
$thedocument = $dom->documentElement;
$elements = $thedocument->getElementsByTagName('image');
foreach ($elements as $node) {
$node->parentNode->removeChild($node);
}
$save = $dom->saveXML();
file_put_contents('test.xml', $save);

I figured it out after a good night of sleep. It was quite simple actually.
$xml = simplexml_load_file('test.xml');
unset($xml->image);
$xml_file = $xml->asXML();
$xmlFile = 'test.xml';
$xmlHandle = fopen($xmlFile, 'w');
fwrite($xmlHandle, $xml_file);
fclose($xmlHandle);
Edit: You probably want to make it save directly:
$file = 'test.xml';
$xml = simplexml_load_file($file);
unset($xml->image);
$success = $xml->asXML($file);
See SimpleXMLElement::asXML() in the PHP documentation.

On the PHP manual page (where you should always look first :-) one contributor points out that:
You can't remove DOMNodes from a DOMNodeList as you're iterating over them in a foreach loop.
The contributor then goes on to offer a potential solution: collect the nodes first, then remove them in a second loop. Try something like this instead:
<?php
$domNodeList = $domDocument->getElementsByTagName('p');
$domElemsToRemove = array();
foreach ( $domNodeList as $domElement ) {
// ...do stuff with $domElement...
$domElemsToRemove[] = $domElement;
}
foreach( $domElemsToRemove as $domElement ){
$domElement->parentNode->removeChild($domElement);
}
?>
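Applied to the <image> case from the question, the same idea can be sketched as follows. This is a minimal sketch under assumptions, not the manual contributor's code: the inline XML is a shortened copy of the question's file, and iterator_to_array() is swapped in to take the snapshot of the live node list instead of building the array by hand.

```php
<?php
// Shortened copy of the question's XML (assumption: the real file is test.xml).
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<settings>
  <background_color>#000000</background_color>
  <image><thumb_path>210x245.png</thumb_path></image>
  <image><thumb_path>200x295.png</thumb_path></image>
</settings>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// getElementsByTagName() returns a live list, so copy it into a plain
// array before removing anything.
$images = iterator_to_array($dom->getElementsByTagName('image'));
foreach ($images as $node) {
    // Removing a node also removes all of its children.
    $node->parentNode->removeChild($node);
}

$remaining = $dom->getElementsByTagName('image')->length;
echo $dom->saveXML();
```

Removing nodes directly inside a foreach over the live list tends to skip every other element, which is why the original loop left some <image> nodes behind.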

First of all, your XML is broken: see <thumb>...</thumb_path> and the next line as well. Fix that first!
Then, real simple in 3 lines of code:
$xml = simplexml_load_string($x); // $x holds your xml
$count = $xml->image->count()-1;
for ($i = $count;$i >= 0;$i--) unset($xml->image[$i]);
See the live demo at http://codepad.viper-7.com/HkGy5o

Related

I want to remove a node in XML with PHP but it is not working [duplicate]

I am trying to develop a function that removes certain URL nodes from my sitemap file. Here is what I have so far.
$xpath = new DOMXpath($DOMfile);
$elements = $xpath->query("/urlset/url/loc[contains(.,'$pageUrl')]");
echo count($elements);
foreach($elements as $element){
//this is where I want to delete the URL
echo $element;
echo "here".$element->nodeValue;
}
Which outputs "111111". I don't know why I can't echo a string in a foreach loop if the $elements count is '1'.
Up until now, I've been doing
$urls = $dom->getElementsByTagName( "url" );
foreach( $urls as $url ){
$locs = $url->getElementsByTagName( "loc" );
$loc = $locs->item(0)->nodeValue;
echo $loc;
if($loc == $fullPageUrl){
$removeUrl = $dom->removeChild($url);
}
}
Which would work fine if my sitemap wasn't so big. It times out right now, so I'm hoping using xpath queries will be faster.
After Gordon's comment, I tried:
$xpath = new DOMXpath($DOMfile);
$query = sprintf('/urlset/url[./loc = "%d"]', $pageUrl);
foreach($xpath->query($query) as $element) {
//this is where I want to delete the URL
echo $element;
echo "here".$element->nodeValue;
}
And it's not returning anything.
I tried going a step further and used codepad, using what was used in the other post mentioned, and did this:
<?php
error_reporting(-1);
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8" ?>
<url>
<loc>professional_services</loc>
<loc>5professional_services</loc>
<loc>6professional_services</loc>
</url>
XML;
$id = '5professional_services';
$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$query = sprintf('/url/[loc = $id]');
foreach($xpath->query($query) as $record) {
$record->parentNode->removeChild($record);
}
echo $dom->saveXml();
and I'm getting a "Warning: DOMXPath::query(): Invalid expression" at the foreach loop line. Thanks for the other comment on the urlset; I tried including the double slashes in my code, and it returned nothing.
The XML of a sitemap should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc></loc>
...
</url>
<url>
<loc></loc>
...
</url>
...
</urlset>
Since it has a namespace, the query is a little more complicated than in my previous answer:
$xpath = new DOMXpath($DOMfile);
// Here register your namespace with a shortcut
$xpath->registerNamespace('sm', "http://www.sitemaps.org/schemas/sitemap/0.9");
// this request should work
$elements = $xpath->query('/sm:urlset/sm:url[sm:loc = "'.$pageUrl.'"]');
foreach($elements as $element){
// This is a hint from the manual comments
$element->parentNode->removeChild($element);
}
echo $DOMfile->saveXML();
I'm writing this from memory just before going to bed. If it doesn't work, I'll test it tomorrow morning. (And yes, I'm aware that this could bring some downvotes.)
If you don't have a namespace (you should, but it isn't mandatory, sigh):
$elements = $xpath->query('/urlset/url[loc = "'.$pageUrl.'"]');
There is a concrete working example here: http://codepad.org/vuGl1MAc
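For reference, the namespace-aware removal can be tried end to end with something like the following. It is a minimal sketch under assumptions: the two-entry sitemap and the value of $pageUrl are made up for illustration, and the query is built with %s (the %d used in the question casts the URL to an integer, so nothing can match). A URL containing a double quote would still break this expression.

```php
<?php
// Made-up sitemap for illustration only.
$sitemap = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/a</loc></url>
  <url><loc>http://example.com/b</loc></url>
</urlset>
XML;

$pageUrl = 'http://example.com/b'; // hypothetical URL to remove

$dom = new DOMDocument();
$dom->loadXML($sitemap);

$xpath = new DOMXPath($dom);
// The prefix "sm" is arbitrary; only the namespace URI has to match.
$xpath->registerNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');

// Select the <url> element itself (not its <loc>) so the whole entry goes.
$query = sprintf('/sm:urlset/sm:url[sm:loc = "%s"]', $pageUrl);
foreach (iterator_to_array($xpath->query($query)) as $url) {
    $url->parentNode->removeChild($url);
}

// Collect what is left, to check the result.
$remainingLocs = [];
foreach ($xpath->query('//sm:loc') as $loc) {
    $remainingLocs[] = $loc->nodeValue;
}
print_r($remainingLocs);
```

Note that the node is removed through its own parentNode; calling removeChild() on the document, as in the question's earlier loop, only works for direct children of the document.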

PHP Search between two points

I'm trying to deal with some XML in PHP.
I have code, such as this:
<?php
$stream = fopen("xml","r");
?>
Where "xml" contains something such as this:
<name>name1</name>
<key>key1</key>
<name>name2</name>
<key>key2</key>
etc.
I'd like to create an array out of the contents of the <key> tags, something like where
keys[0] = "key1"
and
keys[1] = "key2"
Any help is appreciated, thank you very much :)
Solution:
$xmlstr = fread($stream,filesize("xml-file"));
$sxe = new SimpleXMLElement($xmlstr);
echo $sxe->getName() . "\n";
foreach ($sxe->children() as $child) {
echo $child->children();
}
You should use DOM functions for this case. Let's suppose a well-formed XML document (xmltest.xml):
<?xml version="1.0" encoding="utf-8"?>
<root>
<name>name1</name>
<key>key1</key>
<name>name2</name>
<key>key2</key>
</root>
This code loads the XML file into a DOM document and gets all nodes with the tag name key:
<?php
$dom = new DOMDocument('1.0','utf-8');
$dom->load('xmltest.xml');
$keys = $dom->getElementsByTagName('key');
for ($i = 0; $i < $keys->length; $i++) {
echo $keys->item($i)->nodeValue . "<br>";
}
?>
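Since the question asked for an array rather than printed output, the same DOM approach could collect the values instead. This is a small sketch under assumptions: the XML is inlined here instead of being read from xmltest.xml, and $keys is the array name taken from the question.

```php
<?php
// Same document as xmltest.xml above, inlined so the example is self-contained.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<root>
<name>name1</name>
<key>key1</key>
<name>name2</name>
<key>key2</key>
</root>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Collect the text content of every <key> element, in document order.
$keys = [];
foreach ($dom->getElementsByTagName('key') as $node) {
    $keys[] = $node->nodeValue;
}

print_r($keys);
```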

xml parsing with php

I would like to create a new simplified xml based on an existing one:
(using "simpleXml")
<?xml version="1.0" encoding="UTF-8"?>
<xls:XLS>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>Start</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>End</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
</xls:XLS>
Because the element tags always contain colons, which confuses SimpleXML, I tried to use the following solution->link.
How can I create a new xml with this structure:
<main>
<instruction>Start</instruction>
<instruction>End</instruction>
</main>
The "instruction" element gets its content from the former "xls:Instruction" element.
Here is the updated code, but unfortunately it never enters the loop:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = @simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach($xml->children() as $child){
print_r("xml_has_childs");
$new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
echo $new_xml->asXML();
There is no error message if I leave out the "@"…
/* the @ is used to suppress warnings */
$xml = @simplexml_load_string($YOUR_RSS_XML);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->children() as $child)
{
$new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
/* to print */
echo $new_xml->asXML();
You could use xpath to simplify things. Without knowing the full details, I don't know if it will work in all cases:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = @simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start</instruction><instruction>End</instruction></main>
Edit: The file at http://www.gps.alaingroeneweg.com/route.xml is not the same as the XML you have in your question. You need to use a namespace like:
$xml = @simplexml_load_string(file_get_contents('http://www.gps.alaingroeneweg.com/route.xml'));
$xml->registerXPathNamespace('xls', 'http://www.opengis.net/xls'); // probably not needed
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//xls:Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start (Southeast) auf Sihlquai</instruction><instruction>Fahre rechts</instruction><instruction>Fahre halb links - Ziel erreicht!</instruction></main>
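To try the namespaced variant without fetching the remote file, something like the following may help. It is a sketch under assumptions: the inline document is the question's sample with an xmlns:xls declaration added (the URI is the one used in the answer above), which the sample needs in order to be namespace-well-formed. It also shows that prefixed children can be reached by passing the prefix to children().

```php
<?php
// The question's sample, plus an assumed xmlns:xls declaration.
$source = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<xls:XLS xmlns:xls="http://www.opengis.net/xls">
  <xls:RouteInstructionsList>
    <xls:RouteInstruction>
      <xls:Instruction>Start</xls:Instruction>
    </xls:RouteInstruction>
  </xls:RouteInstructionsList>
  <xls:RouteInstructionsList>
    <xls:RouteInstruction>
      <xls:Instruction>End</xls:Instruction>
    </xls:RouteInstruction>
  </xls:RouteInstructionsList>
</xls:XLS>
XML;

$xml = simplexml_load_string($source);
$xml->registerXPathNamespace('xls', 'http://www.opengis.net/xls');

// Build the simplified document from every xls:Instruction element.
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//xls:Instruction') as $instr) {
    $new_xml->addChild('instruction', (string) $instr);
}

// Read the result back as a plain array, to check it.
$instructions = [];
foreach ($new_xml->instruction as $instruction) {
    $instructions[] = (string) $instruction;
}

// Prefixed children are reachable when the prefix is passed explicitly.
$prefixedChildren = count($xml->children('xls', true));

echo $new_xml->asXML();
```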

How do I remove a specific node using its attribute value in PHP XML Dom?

My question is best phrased as:
Remove a child with a specific attribute, in SimpleXML for PHP
except I'm not using SimpleXML.
I'm new to XML in PHP, so I may not be doing this the best way.
I have an XML file created using $dom->save($xml) for each individual user (I'm not placing them all in one XML file, for undisclosed reasons).
It gives me the XML declaration <?xml version="1.0"?> (I have no idea how to change it, but hopefully that's not the point).
<?xml version="1.0"?>
<details>
<person>name</person>
<data1>some data</data1>
<data2>some data</data2>
<data3>some data</data3>
<category id="0">
<categoryName>Cat 1</categoryName>
<categorydata1>some data</categorydata1>
</category>
<category id="1">
<categoryName>Cat 2</categoryName>
<categorydata1>some data</categorydata1>
<categorydata2>some data</categorydata2>
<categorydata3>some data</categorydata3>
<categorydata4>some data</categorydata4>
</category>
</details>
And I want to remove the category whose id attribute has a specific value, using the DOM classes in PHP, when I run a function triggered by a remove button.
The following is a debug version of the function I'm trying to get to work. What am I doing wrong?
function CatRemove($myXML){
$xmlDoc = new DOMDocument();
$xmlDoc->load( $myXML );
$categoryArray = array();
$main = $xmlDoc->getElementsByTagName( "details" )->item(0);
$mainElement = $xmlDoc->getElementsByTagName( "details" );
foreach($mainElement as $details){
$currentCategory = $details->getElementsByTagName( "category" );
foreach($currentCategory as $category){
$categoryID = $category->getAttribute('id');
array_push($categoryArray, $categoryID);
if($categoryID == $_POST['categorytoremoveValue']) {
return $categoryArray;
}
}
}
$xmlDoc->save( $myXML );
}
Well, the above always prints an array of [0]->0 when I move the return outside the if.
Is there a better way? I've tried using getElementById as well, but I have no idea how to make that work.
I would prefer not to use an attribute, though, if that would make things easier.
Ok, let’s try this complete example of use:
function CatRemove($myXML, $id) {
$xmlDoc = new DOMDocument();
$xmlDoc->load($myXML);
$xpath = new DOMXpath($xmlDoc);
$nodeList = $xpath->query('//category[@id="'.(int)$id.'"]');
if ($nodeList->length) {
$node = $nodeList->item(0);
$node->parentNode->removeChild($node);
}
$xmlDoc->save($myXML);
}
// test data
$xml = <<<XML
<?xml version="1.0"?>
<details>
<person>name</person>
<data1>some data</data1>
<data2>some data</data2>
<data3>some data</data3>
<category id="0">
<categoryName>Cat 1</categoryName>
<categorydata1>some data</categorydata1>
</category>
<category id="1">
<categoryName>Cat 2</categoryName>
<categorydata1>some data</categorydata1>
<categorydata2>some data</categorydata2>
<categorydata3>some data</categorydata3>
<categorydata4>some data</categorydata4>
</category>
</details>
XML;
// write test data into file
file_put_contents('untitled.xml', $xml);
// remove category node with the id=1
CatRemove('untitled.xml', 1);
// dump file content
echo '<pre>', htmlspecialchars(file_get_contents('untitled.xml')), '</pre>';
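So you want to remove the category node with a specific id? With getElementById() (this only works if the id attribute is declared as an ID, via a DTD or DOMElement::setIdAttribute()):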
$node = $xmlDoc->getElementById("12345");
if ($node) {
$node->parentNode->removeChild($node);
}
You could also use XPath to get the node, for example:
$xpath = new DOMXpath($xmlDoc);
$nodeList = $xpath->query('//category[@id="12345"]');
if ($nodeList->length) {
$node = $nodeList->item(0);
$node->parentNode->removeChild($node);
}
I haven’t tested it but it should work.
Can you try with this modified version:
function CatRemove($myXML, $id){
$doc = new DOMDocument();
$doc->loadXML($myXML);
$xpath = new DOMXpath($doc);
$nodeList = $xpath->query("//category[@id='$id']");
foreach ($nodeList as $element) {
$element->parentNode->removeChild($element);
}
echo htmlentities($doc->saveXML());
}
It's working for me. Just adapt it to your needs. It's not intended to use as-is, but just a proof of concept.
You also have to remove the xml declaration from the string.
The above function, modified to remove an email address from a mailing list:
function CatRemove($myXML, $id) {
$xmlDoc = new DOMDocument();
$xmlDoc->load($myXML);
$xpath = new DOMXpath($xmlDoc);
$nodeList = $xpath->query('//subscriber[@email="'.$id.'"]');
if ($nodeList->length) {
$node = $nodeList->item(0);
$node->parentNode->removeChild($node);
}
$xmlDoc->save($myXML);
}
$xml = 'list.xml';
$to = $_POST['email']; // the user already submitted their email using a form
CatRemove($xml,$to);
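Because the email address here comes straight from a form, concatenating it into the XPath string means a value containing a quote can break or alter the query. One possible way around that is to select the candidates with a fixed expression and compare the attribute in PHP. This is only a sketch under assumptions: removeSubscriber() is a hypothetical helper, and the <list>/<subscriber email="..."> layout is guessed, since the structure of list.xml is not shown above.

```php
<?php
/**
 * Hypothetical helper: removes every <subscriber> whose email attribute
 * equals $email exactly, and returns how many nodes were removed.
 */
function removeSubscriber(DOMDocument $doc, string $email): int
{
    $xpath = new DOMXPath($doc);
    $removed = 0;

    // Fixed XPath expression: no user input is spliced into the query.
    $candidates = iterator_to_array($xpath->query('//subscriber[@email]'));
    foreach ($candidates as $node) {
        if ($node->getAttribute('email') === $email) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }

    return $removed;
}

// Assumed file layout, inlined for the example.
$doc = new DOMDocument();
$doc->loadXML(
    '<list>'
    . '<subscriber email="a@example.com"/>'
    . '<subscriber email="b@example.com"/>'
    . '</list>'
);

$removed = removeSubscriber($doc, 'b@example.com');
$left = $doc->getElementsByTagName('subscriber')->length;

echo $doc->saveXML();
```

For a file on disk, the same helper could be wrapped with $doc->load($file) before the call and $doc->save($file) after it, as in the functions above.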
