simplexml editing CDATA node - php

I have an XML file.
I want to open it, edit a certain CDATA node with values from $_POST input, and save it back to the same file.
I've read some online documentation and ended up here.
Could someone please suggest a nice way of doing this?
Regards

SimpleXML does not make CDATA sections accessible as separate nodes by default. You can either tell SimpleXML to skip them (the default) or to read them (see: read cdata from a rss feed). If you read them, they are standard text values, so they get merged with other text nodes.
More control is offered by the Document Object Model (DOM) extension, which provides DOMCdataSection, a class that extends DOMText, the standard text node model.
Even though this is a different PHP library (DOM vs. SimpleXML), the two are compatible with each other. For example, a SimpleXMLElement can be converted into a DOMElement by using the dom_import_simplexml function.
If you post the code you have so far, it should be easy to figure out how to access the CDATA sections you want to modify. Please also provide some demo XML data so the example is more meaningful.
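Since no sample XML was posted, here is a minimal sketch of the approach described above. The document structure, the element name body, and the $newText value (standing in for something read from $_POST) are assumptions made up for illustration.

```php
<?php
// Hypothetical sample document; the element names are assumptions.
$xml = simplexml_load_string('<page><body><![CDATA[old text]]></body></page>');

// Stand-in for a value that would come from $_POST in the question.
$newText = 'new <b>text</b>';

// Switch to the DOM view of the same node; both views share one tree.
$body = dom_import_simplexml($xml->body);

// Drop every existing child (text and CDATA) of <body>.
while ($body->firstChild !== null) {
    $body->removeChild($body->firstChild);
}

// Append the replacement as a CDATA section.
$body->appendChild($body->ownerDocument->createCDATASection($newText));

// asXML() with a file name argument would write the document to disk instead.
$result = $xml->asXML();
echo $result;
```

Because the change is made through the DOM node, it is visible through the SimpleXML object as well, so the file can be saved from either side.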

Since I had the same issue just recently, I wanted to let people see some code as well, because the linked examples only add new CDATA sections but do not remove the old ones. So "my" solution merges the mentioned code example with deleting the old CDATA node.
// get DOM node
$node = dom_import_simplexml($mySimpleXmlElement);
// remove existing CDATA; copy the live child list first, because removing
// nodes while iterating over $node->childNodes directly can skip entries
foreach (iterator_to_array($node->childNodes) as $child) {
    if ($child->nodeType == XML_CDATA_SECTION_NODE) {
        $node->removeChild($child);
    }
}
// add new CDATA
$no = $node->ownerDocument;
$node->appendChild($no->createCDATASection($myNewContent));
// print result ($xml is the SimpleXMLElement of the document the node belongs to)
echo $xml->asXML();

I suggest you use this http://www.php.net/manual/en/class.domdocument.php

You can extend the SimpleXMLElement class with a few simple functions to do this:
class ExSimpleXMLElement extends SimpleXMLElement {
/**
* Add CDATA text in a node
* @param string $cdata_text The CDATA value to add
*/
private function addCData($cdata_text) {
$node = dom_import_simplexml($this);
$no = $node->ownerDocument;
$node->appendChild($no->createCDATASection($cdata_text));
}
/**
* Create a child with CDATA value
* @param string $name The name of the child element to add.
* @param string $cdata_text The CDATA value of the child element.
*/
public function addChildCData($name, $cdata_text) {
$child = $this->addChild($name);
$child->addCData($cdata_text);
return $child;
}
/**
* Modify a value with CDATA value
* @param ExSimpleXMLElement $name The node element to modify.
* @param string $cdata_text The CDATA value of the node element.
*/
public function valueChildCData($name, $cdata_text) {
$name->addCData($cdata_text);
return $name;
}
}
usage:
$xml_string = <<<XML
<root>
<item id="foo"/>
</root>
XML;
$xml5 = simplexml_load_string($xml_string, 'ExSimpleXMLElement');
$xml5->valueChildCData($xml5->item, 'mysupertext');
echo $xml5->asXML();
$xml6 = simplexml_load_string($xml_string, 'ExSimpleXMLElement');
$xml6->item->addChildCData('mylittlechild', 'thepunishment');
echo $xml6->asXML();
result:
<?xml version="1.0"?>
<root>
<item id="foo"><![CDATA[mysupertext]]></item>
</root>
<?xml version="1.0"?>
<root>
<item id="foo">
<mylittlechild><![CDATA[thepunishment]]></mylittlechild>
</item>
</root>

Related

Xpath to remove empty nodes within XML with child nodes

I am using the following code in which I pass the dom and this returns the XML without the empty nodes
/**
* Remove Empty Tags
*
* @return void
* @author Fahad Sheikh
**/
public function remove_empty_tags($dom)
{
// Remove Empty Tags
$xpath = new DOMXPath($dom);
foreach( $xpath->query('//*[not(node())]') as $node ) {
$node->parentNode->removeChild($node);
}
}
But this returns the following XML where it for some reason combines the nested XML tag values and deletes their tag.
<formxml>
<type>Potential customer</type>
<origin>Content</origin>
<source>test source<medium>test medium</medium>
<campaign>test campaign</campaign>
<matchtype>terms</matchtype>
<test>valueonevaluetwo</test>
<keyword>test term</keyword>
<ad>test</ad>
</formxml>
Instead, this should be:
<formxml>
<type>Potential customer</type>
<origin>Content</origin>
<source>test source<medium>test medium</medium>
<campaign>test campaign</campaign>
<matchtype>terms</matchtype>
<test>
<testvarone>valueone</testvarone>
<testvartwo>valuetwo</testvartwo>
</test>
<keyword>test term</keyword>
<ad>test</ad>
</formxml>
This solution was mentioned here:
Remove empty tags from a XML with PHP
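As a small check of what the XPath expression itself does, the following sketch runs it against a shortened version of the expected XML. The extra empty element is an assumption added for illustration. It does not diagnose the flattening described above; it only suggests that //*[not(node())] leaves elements with children alone, so the merged values may come from a later step.

```php
<?php
$dom = new DOMDocument();
// Shortened sample; <empty/> is a made-up element that has no child nodes.
$dom->loadXML(
    '<formxml><type>Potential customer</type><empty/>'
    . '<test><testvarone>valueone</testvarone><testvartwo>valuetwo</testvartwo></test>'
    . '</formxml>'
);

$xpath = new DOMXPath($dom);

// Select every element that has no child node of any kind.
foreach ($xpath->query('//*[not(node())]') as $node) {
    $node->parentNode->removeChild($node);
}

// Serialize only the document element, without the XML declaration.
$result = $dom->saveXML($dom->documentElement);
echo $result, "\n";
```

Only the childless element is removed; the nested testvarone and testvartwo elements keep their tags.') !== false);
</test>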

Delete attribute in XML

I want to completely remove the size="id" attribute from every <door> element.
<?xml version="1.0" encoding="UTF-8"?>
<doors>
<door id="1" entry="3249" size="30"/>
<door id="1041" entry="6523" size="3094"/>
-- and 1000 more....
</doors>
The PHP code:
$xml = new SimpleXMLElement('http://mysite/doors.xml', NULL, TRUE);
$ids_to_delete = array( 1, 1506 );
foreach ($ids_to_delete as $id) {
$result = $xml->xpath( "//door[@size='$id']" );
foreach ( $result as $node ) {
$dom = dom_import_simplexml($node);
$dom->parentNode->removeChild($dom);
}
}
$xml->saveXml();
I get no errors but it does not delete the size attribute. Why?
There are multiple reasons why it does not delete the size attribute. The first one that came to mind is that attributes are not child nodes, so a method that removes a child is the wrong tool for removing an attribute.
Each element node has an associated set of attribute nodes; the element is the parent of each of these attribute nodes; however, an attribute node is not a child of its parent element.
From: Attribute Nodes - XML Path Language (XPath), bold by me.
However, you don't see an error here, because the $result you have is an empty array. You just don't select any nodes to remove - neither elements nor attributes - with your xpath. That is because there is no such element you look for:
//door[@size='1']
You're searching for the id in the size attribute: No match.
These are the reasons why you get no errors and it does not delete any size attribute: 1.) you don't delete attributes here, 2.) you don't query any elements to delete attributes from.
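Both points can be checked with a few lines. This is only an illustrative sketch using a one-door version of the sample XML; the variable names are made up.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<doors><door id="1" entry="3249" size="30"/></doors>');
$door = $dom->getElementsByTagName('door')->item(0);

// The element has three attributes but no child nodes at all.
$childCount = $door->childNodes->length;
$attributeCount = $door->attributes->length;

// The attribute still knows which element owns it.
$owner = $door->getAttributeNode('size')->ownerElement;

// The question's query compares an id from the list with the size value.
$xml = simplexml_import_dom($dom);
$matches = $xml->xpath("//door[@size='1']");

printf(
    "children: %d, attributes: %d, owner: %s, matches: %d\n",
    $childCount,
    $attributeCount,
    $owner->nodeName,
    count($matches)
);
```

With an empty result set, the inner loop of the original code never runs, which is why no error shows up.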
How to delete attributes in SimpleXML queried by Xpath?
You can remove the attribute nodes by selecting them with an Xpath query and then unset the SimpleXMLElement self-reference:
// all size attributes of all doors
$result = $xml->xpath("//door/@size");
foreach ($result as $node) {
unset($node[0]);
}
In this example, the XPath expression selects every size attribute of a door element (which is what you ask for in your question), and those attribute nodes are then removed from the XML.
//door/@size
(see Abbreviated Syntax)
Now here the full example:
<?php
/**
* @link https://eval.in/215817
*/
$buffer = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<doors>
<door id="1" entry="3249" size="30"/>
<door id="1041" entry="6523" size="3094"/>
-- and 1000 more....
</doors>
XML;
$xml = new SimpleXMLElement($buffer);
// all size attributes of all doors
$result = $xml->xpath("//door/@size");
foreach ($result as $node) {
unset($node[0]);
}
$xml->saveXml("php://output");
Output (Online Demo):
<?xml version="1.0" encoding="UTF-8"?>
<doors>
<door id="1" entry="3249"/>
<door id="1041" entry="6523"/>
-- and 1000 more....
</doors>
You can do your whole query in DOMDocument using DOMXPath, rather than switching between SimpleXML and DOM:
$dom = new DOMDocument;
$dom->load('my_xml_file.xml');
# initialise an XPath object to act on the $dom object
$xp = new DOMXPath( $dom );
# run the query
foreach ($xp->query( "//door[@size]" ) as $door) {
# remove the attribute
$door->removeAttribute('size');
}
print $dom->saveXML();
Output for the input you supplied:
<?xml version="1.0" encoding="UTF-8"?>
<doors>
<door id="1" entry="3249"/>
<door id="1041" entry="6523"/>
</doors>
If you do want only to remove the size attribute for the IDs in your list, you should use the code:
foreach ($ids_to_delete as $id) {
# searches for elements with a matching ID and a size attribute
foreach ($xp->query("//door[@id='$id' and @size]") as $door) {
$door->removeAttribute('size');
}
}
Your code wasn't working for several reasons:
it looks like your XPath was wrong, since your array is called $ids_to_delete and your XPATH is looking for door elements with the size attribute equal to the value from $ids_to_delete;
you're converting the nodes to DOMDocument objects ($dom = dom_import_simplexml($node);) to do the deletion, but $xml->saveXml();, which I presume you printed somehow, is a SimpleXML object;
you need to remove the element attribute; removeChild removes the whole element.
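For comparison, the per-ID removal can also stay entirely in SimpleXML by unsetting the attribute through array access on the element. This is a sketch under the assumption that the IDs in the list refer to the id attribute; the XML is an inline copy of the two sample doors.

```php
<?php
$xml = new SimpleXMLElement(
    '<doors>'
    . '<door id="1" entry="3249" size="30"/>'
    . '<door id="1041" entry="6523" size="3094"/>'
    . '</doors>'
);

$ids_to_delete = array(1, 1506);

foreach ($ids_to_delete as $id) {
    // Select doors by their id attribute, not by size.
    foreach ($xml->xpath("//door[@id='$id']") as $door) {
        // Array-style access addresses attributes; unset() removes one.
        unset($door['size']);
    }
}

$result = $xml->asXML();
echo $result;
```

Door 1 loses its size attribute, door 1041 keeps it, and the ID 1506 simply matches nothing.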

Altering value of DomDocument element/node

I'm building an XML like:
<?xml ... ?>
<root>
<elements>0</elements>
<list>
<element>test1</element>
<element>test1</element>
<element>test1</element>
</list>
</root>
After appending all <element>s, I want to replace <elements>0</elements> by <elements>3</elements> for example.
I tried DOMNode::replaceChild, but it has no effect.
$numberOfElements = $xml->createElement('numberOfElements', '0');
$root->appendChild($numberOfElements);
/* append elements and count them */
$root->replaceChild($numberOfElements,
$xml->createElement('numberOfElements', $countElements)
);
How to properly use replaceChild or is there a different way?
From the docs:
public DOMNode replaceChild ( DOMNode $newnode , DOMNode $oldnode )
This means that you must specify the new node first, then the node to be replaced. You have it the wrong way around.
EDIT: That said, why not do this?
$numberOfElements->nodeValue = $countElements;
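Both options can be sketched together as follows. The document is rebuilt here from the sample XML in the question, and the variable names are chosen for illustration only.

```php
<?php
$xml = new DOMDocument('1.0');
$root = $xml->appendChild($xml->createElement('root'));

// Placeholder counter that is corrected once the list is complete.
$counter = $root->appendChild($xml->createElement('elements', '0'));
$list = $root->appendChild($xml->createElement('list'));

$countElements = 0;
foreach (array('test1', 'test1', 'test1') as $value) {
    $list->appendChild($xml->createElement('element', $value));
    $countElements++;
}

// Option 1: replaceChild() takes the NEW node first, then the OLD node.
$replacement = $xml->createElement('elements', (string) $countElements);
$root->replaceChild($replacement, $counter);

// Option 2: keep the node and overwrite its value in place.
$replacement->nodeValue = (string) $countElements;

$result = $xml->saveXML($root);
echo $result, "\n";
```

Option 2 avoids creating a second element at all, which is usually the simpler choice when only the text changes.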

How to get values inside <![CDATA[values]] > using php DOM?

How can I get the values inside <![CDATA[values]] > using PHP DOM?
Here is some of my XML:
<Destinations>
<Destination>
<![CDATA[Aghia Paraskevi, Skiatos, Greece]]>
<CountryCode>GR</CountryCode>
</Destination>
<Destination>
<![CDATA[Amettla, Spain]]>
<CountryCode>ES</CountryCode>
</Destination>
<Destination>
<![CDATA[Amoliani, Greece]]>
<CountryCode>GR</CountryCode>
</Destination>
<Destination>
<![CDATA[Boblingen, Germany]]>
<CountryCode>DE</CountryCode>
</Destination>
</Destinations>
Working with PHP DOM is fairly straightforward, and is very similar to Javascript's DOM.
Here are the important classes:
DOMNode — The base class for anything that can be traversed inside an XML/HTML document, including text nodes, comment nodes, and CDATA nodes
DOMElement — The base class for tags.
DOMDocument — The base class for documents. Contains the methods to load/save XML, as well as normal DOM document methods (see below).
There are a few staple methods and properties:
DOMDocument->load() — After creating a new DOMDocument, use this method on that object to load from a file.
DOMDocument->getElementsByTagName() — this method returns a node list of all elements in the document with the given tag name. Then you can iterate (foreach) on this list.
DOMNode->childNodes — A node list of all children of a node. (Remember, a CDATA section is a node!)
DOMNode->nodeType — Get the type of a node. CDATA nodes have type XML_CDATA_SECTION_NODE, which is a constant with the value 4.
DOMNode->textContent — get the text content of any node.
Note: Your CDATA sections are malformed. I don't know why there is an extra ]] in the first one, or an unclosed CDATA section at the end of the line, but I think it should simply be:
<![CDATA[Aghia Paraskevi, Skiatos, Greece]]>
Putting this all together we:
Create a new document object and load the XML
Get all Destination elements by tag name and iterate over the list
Iterate over all child nodes of each Destination element
Check if the node type is XML_CDATA_SECTION_NODE
If it is, echo the textContent of that node.
Code:
$doc = new DOMDocument();
$doc->load('test.xml');
$destinations = $doc->getElementsByTagName("Destination");
foreach ($destinations as $destination) {
foreach($destination->childNodes as $child) {
if ($child->nodeType == XML_CDATA_SECTION_NODE) {
echo $child->textContent . "<br/>";
}
}
}
Result:
Aghia Paraskevi, Skiatos, Greece
Amettla, Spain
Amoliani, Greece
Boblingen, Germany
Use this:
$parseFile = simplexml_load_file($myXML, 'SimpleXMLElement', LIBXML_NOCDATA);
and then:
foreach ($parseFile->yourNode as $node) {
    // etc...
}
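Applied to the Destination data from the question, that could look like the sketch below. It loads from a string instead of a file to stay self-contained, and the compact two-entry sample is an abbreviation of the original XML.

```php
<?php
$xmlData = '<Destinations>'
    . '<Destination><![CDATA[Aghia Paraskevi, Skiatos, Greece]]><CountryCode>GR</CountryCode></Destination>'
    . '<Destination><![CDATA[Amettla, Spain]]><CountryCode>ES</CountryCode></Destination>'
    . '</Destinations>';

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes on load.
$parsed = simplexml_load_string($xmlData, 'SimpleXMLElement', LIBXML_NOCDATA);

$names = array();
foreach ($parsed->Destination as $destination) {
    // Casting the element to string yields its own text, not the text of
    // child elements such as <CountryCode>.
    $names[] = trim((string) $destination) . ' (' . $destination->CountryCode . ')';
}

print_r($names);
```

The trim() call matters for indented source files, where the text around the CDATA section includes line breaks.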
The best and easiest way:
$xml = simplexml_load_string($xmlData, 'SimpleXMLElement', LIBXML_NOCDATA);
$xmlJson = json_encode($xml);
$xmlArr = json_decode($xmlJson, 1); // Returns associative array
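Strip the CDATA markers before parsing with PHP DOM; after that you can get the innerXml or innerHtml. Note that once the markers are gone, any < or & inside the former CDATA content is parsed as markup.
$xml = str_replace(array('<![CDATA[', ']]>'), '', $xml);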
I use the following code.
It not only reads all XML data inside
<![CDATA[values]] >
but also converts the XML object to a PHP associative array, so we can loop over the data.
$xml_file_data = json_decode(json_encode(simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA)), true);
Hope this will work for you.
function inBetweenOf(string $here, string $there, string $content) : string {
$left_over = strlen(substr($content, strpos($content, $there)));
return substr($content, strpos($content, $here) + strlen($here), -$left_over);
}
Iterate over "Destination" tags and then call inBetweenOf on each iteration.
$doc = inBetweenOf('<![CDATA[', ']]>', $xml);

In SimpleXML, how can I add an existing SimpleXMLElement as a child element?

I have a SimpleXMLElement object $child, and a SimpleXMLElement object $parent.
How can I add $child as a child of $parent? Is there any way of doing this without converting to DOM and back?
The addChild() method only seems to allow me to create a new, empty element, but that doesn't help when the element I want to add $child also has children. I'm thinking I might need recursion here.
Unfortunately SimpleXMLElement does not offer anything to bring two elements together. As @nickf wrote, it's better suited to reading than to manipulation. However, the sister extension DOMDocument is meant for editing, and you can bring both together via dom_import_simplexml(). And @salathe shows in a related answer how this works for specific SimpleXMLElements.
The following shows how this works with input checking and some more options. I do it with two examples. The first example is a function to insert an XML string:
/**
* Insert XML into a SimpleXMLElement
*
* @param SimpleXMLElement $parent
* @param string $xml
* @param bool $before
* @return bool XML string added
*/
function simplexml_import_xml(SimpleXMLElement $parent, $xml, $before = false)
{
$xml = (string)$xml;
// check if there is something to add
if ($nodata = !strlen($xml) or $parent[0] == NULL) {
return $nodata;
}
// add the XML
$node = dom_import_simplexml($parent);
$fragment = $node->ownerDocument->createDocumentFragment();
$fragment->appendXML($xml);
if ($before) {
return (bool)$node->parentNode->insertBefore($fragment, $node);
}
return (bool)$node->appendChild($fragment);
}
This example function can append XML or insert it before a certain element, including the root element. After checking whether there is something to add, it uses DOMDocument functions and methods to insert the XML as a document fragment, as also outlined in How to import XML string in a PHP DOMDocument. The usage example:
$parent = new SimpleXMLElement('<parent/>');
// insert some XML
simplexml_import_xml($parent, "\n <test><this>now</this></test>\n");
// insert some XML before a certain element, here the first <test> element
// that was just added
simplexml_import_xml($parent->test, "<!-- leave a comment -->\n ", $before = true);
// you can place comments above the root element
simplexml_import_xml($parent, "<!-- this works, too -->", $before = true);
// but take care, you can produce invalid XML, too:
// simplexml_import_xml($parent, "<warn><but>take care!</but> you can produce invalid XML, too</warn>", $before = true);
echo $parent->asXML();
This gives the following output:
<?xml version="1.0"?>
<!-- this works, too -->
<parent>
<!-- leave a comment -->
<test><this>now</this></test>
</parent>
The second example is inserting a SimpleXMLElement. It makes use of the first function if needed. It basically checks if there is something to do at all and which kind of element is to be imported. If it is an attribute, it will just add it, if it is an element, it will be serialized into XML and then added to the parent element as XML:
/**
* Insert SimpleXMLElement into SimpleXMLElement
*
* @param SimpleXMLElement $parent
* @param SimpleXMLElement $child
* @param bool $before
* @return bool SimpleXMLElement added
*/
function simplexml_import_simplexml(SimpleXMLElement $parent, SimpleXMLElement $child, $before = false)
{
// check if there is something to add
if ($child[0] == NULL) {
return true;
}
// if it is a list of SimpleXMLElements default to the first one
$child = $child[0];
// insert attribute
if ($child->xpath('.') != array($child)) {
$parent[$child->getName()] = (string)$child;
return true;
}
$xml = $child->asXML();
// remove the XML declaration on document elements
if ($child->xpath('/*') == array($child)) {
$pos = strpos($xml, "\n");
$xml = substr($xml, $pos + 1);
}
return simplexml_import_xml($parent, $xml, $before);
}
This example function normalizes lists of elements and attributes, as is common in SimpleXML. You might want to change it to insert multiple SimpleXMLElements at once, but as the usage example below shows, my example does not support that (see the attributes example):
// append the element itself to itself
simplexml_import_simplexml($parent, $parent);
// insert <this> before the first child element (<test>)
simplexml_import_simplexml($parent->children(), $parent->test->this, true);
// add an attribute to the document element
$test = new SimpleXMLElement('<test attribute="value" />');
simplexml_import_simplexml($parent, $test->attributes());
echo $parent->asXML();
This is a continuation of the first usage-example. Therefore the output now is:
<?xml version="1.0"?>
<!-- this works, too -->
<parent attribute="value">
<!-- leave a comment -->
<this>now</this><test><this>now</this></test>
<!-- this works, too -->
<parent>
<!-- leave a comment -->
<test><this>now</this></test>
</parent>
</parent>
I hope this is helpful. You can find the code in a gist and as online demo / PHP version overview.
I know this isn't the most helpful answer, but especially since you're creating/modifying XML, I'd switch over to using the DOM functions. SimpleXML's good for accessing simple documents, but pretty poor at changing them.
If SimpleXML is treating you kindly in all other places and you want to stick with it, you still have the option of jumping over to the DOM functions temporarily to perform what you need to and then jump back again, using dom_import_simplexml() and simplexml_import_dom(). I'm not sure how efficient this is, but it might help you out.
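A minimal sketch of that temporary jump into DOM, assuming $parent and $child come from two separate documents (the element names are invented for the example). DOMDocument::importNode() with its second argument set to true makes a deep copy of the child for the parent's document.

```php
<?php
$parent = new SimpleXMLElement('<parent/>');
$child = new SimpleXMLElement('<child id="1"><grandchild>text</grandchild></child>');

// Get the DOM views of both SimpleXML elements.
$parentNode = dom_import_simplexml($parent);
$childNode = dom_import_simplexml($child);

// Copy the child, including its attributes and descendants, into the
// parent's document, then attach the copy.
$copy = $parentNode->ownerDocument->importNode($childNode, true);
$parentNode->appendChild($copy);

// Back in SimpleXML, the new subtree is available right away.
$grandchildText = (string) $parent->child->grandchild;
$result = $parent->asXML();
echo $result;
```

The original $child is left untouched, since only a copy is inserted.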
Actually, it's possible (dynamically) if you look carefully at how addChild() is defined. I used this technique to convert any array into XML using recursion and pass-by-reference.
addChild() returns the SimpleXMLElement of the added child.
To add a leaf node, use $xml->addChild($nodeName, $nodeValue).
To add a node that may have subnodes or a value, use $xml->addChild($nodeName) with no value passed to addChild(). This returns a subnode of type SimpleXMLElement, not a string!
target XML
<root>
<node>xyz</node>
<node>
<node>aaa</node>
<node>bbb</node>
</node>
</root>
Code:
$root = new SimpleXMLElement('<root />');
//add child with name and string value.
$root->addChild('node', 'xyz');
//adds child with name as root of new SimpleXMLElement
$sub = $root->addChild('node');
$sub->addChild('node', 'aaa');
$sub->addChild('node', 'bbb');
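The array-to-XML recursion mentioned above is not shown in the answer, so the following is only a guess at its shape. The function name array_to_xml and the sample array are hypothetical, and the sketch assumes string keys that are valid element names and values without special characters such as &.

```php
<?php
/**
 * Hypothetical helper: append the entries of a nested array to $xml.
 */
function array_to_xml(array $data, SimpleXMLElement $xml): void
{
    foreach ($data as $name => $value) {
        if (is_array($value)) {
            // No value passed: addChild() returns an element to recurse into.
            array_to_xml($value, $xml->addChild($name));
        } else {
            // Leaf node: name plus string value.
            $xml->addChild($name, (string) $value);
        }
    }
}

$root = new SimpleXMLElement('<root/>');
array_to_xml(
    array('name' => 'xyz', 'group' => array('first' => 'aaa', 'second' => 'bbb')),
    $root
);

$result = $root->asXML();
echo $result;
```

Because objects are passed by handle in PHP, the element returned by addChild() can be handed to the recursive call directly, and the changes end up in the same document.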
Leaving this here as I just stumbled upon this page and found that SimpleXML now supports this functionality through the ::addChild method.
You can use this method to add any cascading elements as well:
$xml->addChild('parent');
$xml->parent->addChild('child');
$xml->parent->child->addChild('child_id','12345');
