php export xml CDATA escaped - php

I am trying to export xml with CDATA tags. I use the following code:
$xml_product = $xml_products->addChild('product');
$xml_product->addChild('mychild', htmlentities("<![CDATA[" . $mytext . "]]>"));
The problem is that the < and > of the CDATA tags get escaped as &lt; and &gt;, like this:
<mychild>&lt;![CDATA[My some long long long text]]&gt;</mychild>
but I need:
<mychild><![CDATA[My some long long long text]]></mychild>
If I use htmlentities() I get lots of errors like "tag raquo is not defined", though there are no such tags in my text. Probably htmlentities() tries to parse the text inside the CDATA section and convert it, but I don't want that either.
Any ideas how to fix that? Thank you.
UPD_1 My function which saves xml to file:
public static function saveFormattedXmlFile($simpleXMLElement, $output_file) {
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadXML(urldecode($simpleXMLElement->asXML()));
$dom->save($output_file);
}

Here is a short example of how to add a CDATA section. Note that it switches to DOMDocument to do so. The code builds a <product> element, and a new <mychild> element is created in $xml_product. This new node ($newNode) is then imported as a DOMElement using dom_import_simplexml. The DOMDocument createCDATASection method then creates the CDATA section properly, and it is appended to the node.
$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><Products />');
$xml_product = $xml->addChild('product');
$newNode = $xml_product->addChild('mychild');
$mytext = "<html></html>";
$node = dom_import_simplexml($newNode);
$cdata = $node->ownerDocument->createCDATASection($mytext);
$node->appendChild($cdata);
echo $xml->asXML();
This example outputs...
<?xml version="1.0" encoding="UTF-8"?>
<Products><product><mychild><![CDATA[<html></html>]]></mychild></product></Products>
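The same steps can be wrapped in a small helper so that each CDATA child takes one call. This is only a sketch: the function name addCDataChild and the sample text are made up for illustration, and it relies on nothing beyond dom_import_simplexml and createCDATASection as used above.

```php
<?php
// Hypothetical helper (not part of SimpleXML): adds a child element whose
// content is a CDATA section and returns the new child.
function addCDataChild(SimpleXMLElement $parent, string $name, string $text): SimpleXMLElement
{
    $child = $parent->addChild($name);
    // Switch to the DOM view of the same node to create the CDATA section.
    $node = dom_import_simplexml($child);
    $node->appendChild($node->ownerDocument->createCDATASection($text));
    return $child;
}

$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><Products/>');
$product = $xml->addChild('product');
addCDataChild($product, 'mychild', 'Text with <b>markup</b> & an ampersand');

$output = $xml->asXML();
echo $output;
```

Because the text goes into a CDATA section, no htmlentities() call is needed and the markup characters are written as they are.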

Related

Replace element value in DOM

I want to save changed DOM tag values to an existing XML file. I found a replace function, but it is in JS and I need one in PHP.
I tried the save and saveXML functions, but that didn't work. I have tags in the XML with a colon, such as "iaiext:auction_title". I used getElementsByTagName and it works well, and cutting the title to 50 characters works too, but how can I replace the old title with the new one if I don't use a path like with simplexml_load_file? How do I specify this path in my script?
$dom = new DOMDocument;
$dom->load('p.xml');
$i = 0;
$tytuly = $dom->getElementsByTagName('auction_title');
foreach ($tytuly as $tytul){
$title = $tytul->nodeValue;
$end_title = doTitleCut($title);
//echo "<pre>";
//echo($end_title);
//echo "<pre>";
$i = $i+1;
}
In your loop, you can update a particular node's value the same way you fetch it: with nodeValue. So in your loop, just update it each time...
$tytul->nodeValue = doTitleCut($title);
Then after your loop, you can just echo the new XML out using
echo $dom->saveXML();
or save it using
$dom->save("3.xml");
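Since the question mentions prefixed tags such as iaiext:auction_title, here is a sketch of the same loop that selects the elements by namespace. The namespace URI http://example.com/iaiext, the sample document, and the cutTitle function (a stand-in for doTitleCut) are assumptions; the real URI is whatever the xmlns:iaiext attribute in p.xml declares. mb_substr needs the mbstring extension.

```php
<?php
// Sample input; the namespace URI is a placeholder for illustration only.
$xml = <<<'XML'
<offers xmlns:iaiext="http://example.com/iaiext">
  <iaiext:auction_title>An auction title that is considerably longer than fifty characters in total</iaiext:auction_title>
  <iaiext:auction_title>Short title</iaiext:auction_title>
</offers>
XML;

// Stand-in for doTitleCut(): keep at most $max characters.
function cutTitle(string $title, int $max = 50): string
{
    return mb_substr($title, 0, $max);
}

$dom = new DOMDocument();
$dom->loadXML($xml);

// Select by namespace URI and local name instead of the prefixed name.
$titles = $dom->getElementsByTagNameNS('http://example.com/iaiext', 'auction_title');
foreach ($titles as $titleNode) {
    // Writing textContent replaces the element's text and escapes it.
    $titleNode->textContent = cutTitle($titleNode->textContent);
}

$output = $dom->saveXML();
echo $output;
```

To write the result back to the file, $dom->save('p.xml') can be used in place of the echo.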
It is the same basic API in PHP. However, browsers implement more, or other, parts of the API. There are 5 revisions of the API (DOM Level 1 to 4 and DOM LS). DOM 3 added a property to read/write the text content of a node: https://www.w3.org/TR/DOM-Level-3-Core/core.html#Node3-textContent
The following example prefixes the titles:
$xml = <<<'XML'
<auctions>
<auction_title>World!</auction_title>
<auction_title>World &amp; Universe!</auction_title>
</auctions>
XML;
$document = new DOMDocument();
$document->loadXML($xml);
$titleNodes = $document->getElementsByTagName('auction_title');
foreach ($titleNodes as $titleNode) {
$title = $titleNode->textContent;
$titleNode->textContent = 'Hello '.$title;
}
echo $document->saveXML();
Output:
<?xml version="1.0"?>
<auctions>
<auction_title>Hello World!</auction_title>
<auction_title>Hello World &amp; Universe!</auction_title>
</auctions>
PHP's DOMNode::$nodeValue implementation does not match the W3C API definition. It behaves the same as DOMNode::$textContent for reads and does not fully escape on write.

How to append XML data without overwriting?

I'm in the process of writing an XML file:
<?php
$xml2 = "currenttest";
$xml = new DOMDocument("1.0");
$root = $xml->createElement ('tv');
$xml->appendChild($root);
$root->appendChild($xml->createTextNode("\n"));
$root->appendChild($xml->createTextNode($xml2));
$root->appendChild($xml->createTextNode("\n"));
$xml->save('epg.xml');
XML:
<?xml version="1.0"?>
<tv>
currenttest
</tv>
If I change the text and run the code again, the old content is deleted.
I want the old text to stay.
Let's say this:
<?xml version="1.0"?>
<tv>
currenttest...
newtest...
</tv>
My previous way was to write the XML with:
file_put_contents($file, $xml2, FILE_APPEND | LOCK_EX);
The FILE_APPEND | LOCK_EX flags made sure that the previous text was not erased.
I found a solution in another post:
$doc->loadXML(file_get_contents('epg.xml'));
foreach($doc->getElementsByTagName('***') as $node)
{
}
But how can it fit into my code?
There is nothing special to do: just reload your XML string and append a new text node to your root element:
// your previous code (I only changed the variable names and added a default encoding)
$text = "currenttest";
$dom = new DOMDocument("1.0", "UTF-8");
$root = $dom->createElement('tv');
$dom->appendChild($root);
$root->appendChild($dom->createTextNode("\n"));
$root->appendChild($dom->createTextNode($text));
$root->appendChild($dom->createTextNode("\n"));
$xml = $dom->saveXML();
// let's add a new element
$newtext = 'newtext';
$dom = new DOMDocument;
$dom->loadXML($xml);
$root = $dom->documentElement; // convenient way to target the root element
// but you can also write:
//$root = $dom->getElementsByTagName('tv')->item(0);
$root->appendChild($dom->createTextNode($newtext));
$newxml = $dom->saveXML();
echo $newxml;
About $doc->loadXML(file_get_contents('epg.xml'));, note that you don't need file_get_contents, since DOMDocument already has two methods:
DOMDocument::loadXML, which loads the XML content from a string.
DOMDocument::load, which loads the XML content directly from a file.
In addition to DOMNode::appendChild, which adds a node to an element after all of its child nodes, there is also DOMNode::insertBefore to add a node to an element before the child node of your choice.
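Put together with a real file, the load, append, and save cycle might look like the sketch below. The function name appendTextToXmlFile and the temporary file path are made up for illustration, and the <tv> root matches the question's document.

```php
<?php
// Hypothetical helper: appends one line of text to the <tv> root of an XML
// file, creating the file with an empty root on the first call.
function appendTextToXmlFile(string $file, string $text): void
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    if (file_exists($file)) {
        // Reload what was saved earlier so the old content is kept.
        $dom->load($file);
    } else {
        $root = $dom->createElement('tv');
        $root->appendChild($dom->createTextNode("\n"));
        $dom->appendChild($root);
    }
    $dom->documentElement->appendChild($dom->createTextNode($text . "\n"));
    $dom->save($file);
}

// Illustrative path; any writable location works.
$file = sys_get_temp_dir() . '/epg_' . uniqid() . '.xml';
appendTextToXmlFile($file, 'currenttest');
appendTextToXmlFile($file, 'newtest');

$check = new DOMDocument();
$check->load($file);
$content = $check->documentElement->textContent;
echo $check->saveXML();
unlink($file);
```

Unlike file_put_contents with FILE_APPEND, the new text lands inside the root element, so the file stays well-formed.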
I tried the code above because I was overwriting my data, but in my application it did not work at first: I was trying to add the new node to the loaded XML, and you have to create a root element to add data inside.
$xml = new DOMDocument("1.0", "UTF-8");
// a root tag must be the first thing to add
$root = $xml->createElement('root');
$xml->appendChild($root);
Then, just add the data when you need
$xml = new DOMDocument("1.0", "UTF-8");
$xml->load($sFilepath);
$root = $xml->getElementsByTagName('root')->item(0);
Your structure must look like this:
<?xml version="1.0" encoding="UTF-8"?>
<root>
</root>
The answer above is entirely correct. This answer is only meant to help anyone who has trouble understanding it.

PHP: Keeping HTML inside XML node without CDATA

I've got an xml like this:
<father>
<son>Text with <b>HTML</b>.</son>
</father>
I'm using simplexml_load_string to parse it into SimpleXmlElement. Then I get my node like this
$xml->father->son->__toString(); //output: "Text with .", but expected "Text with <b>HTML</b>."
I need to handle simple HTML such as:
<b>text</b> or <br/> inside the xml which is sent by many users.
My problem is that I can't just ask them to use CDATA, because they won't be able to handle it properly and they are already used to doing without it.
Also, if possible, I don't want the file to be edited, because the information needs to be exactly what the user sent.
The function simplexml_load_string simply erases the HTML node and anything inside it.
How can I keep the information?
SOLUTION
To handle the problem I used asXml, as explained by @ThW:
$tmp = $xml->father->son->asXml(); //<son>Text with <b>HTML</b>.</son>
I just added a preg_match to erase the node.
A CDATA section is a character node, just like a text node, but it does less encoding/decoding. This is mostly a downside, actually. On the upside, content in a CDATA section might be more readable for a human, and it allows for some backward compatibility in special cases (think HTML script tags).
For an XML API they are nearly the same. Here is a small DOM example (SimpleXML abstracts too much).
$document = new DOMDocument();
$father = $document->appendChild(
$document->createElement('father')
);
$son = $father->appendChild(
$document->createElement('son')
);
$son->appendChild(
$document->createTextNode('With <b>HTML</b><br>It\'s so nice.')
);
$son = $father->appendChild(
$document->createElement('son')
);
$son->appendChild(
$document->createCDataSection('With <b>HTML</b><br>It\'s so nice.')
);
$document->formatOutput = TRUE;
echo $document->saveXml();
Output:
<?xml version="1.0"?>
<father>
<son>With &lt;b&gt;HTML&lt;/b&gt;&lt;br&gt;It's so nice.</son>
<son><![CDATA[With <b>HTML</b><br>It's so nice.]]></son>
</father>
As you can see they are serialized very differently - but from the API view they are basically exchangeable. If you're using an XML parser the value you get back should be the same in both cases.
So the first possibility is just letting the HTML fragment be stored in a character node. It is just a string value for the outer XML document itself.
The other way would be using XHTML. XHTML is XML-compatible HTML. You can mix and match different XML formats, so you could add the XHTML fragment as part of the outer XML.
That seems to be what you're receiving. But SimpleXML has some problems with mixed nodes. So here is an example how you can read it in DOM.
$xml = <<<'XML'
<father>
<son>With <b>HTML</b><br/>It's so nice.</son>
</father>
XML;
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$result = '';
foreach ($xpath->evaluate('/father/son[1]/node()') as $child) {
$result .= $document->saveXml($child);
}
echo $result;
Output:
With <b>HTML</b><br/>It's so nice.
Basically you need to save each child of the son element as XML.
SimpleXML is based on the same DOM library internally. That allows you to convert a SimpleXMLElement into a DOM node. From there you can again save each child as XML.
$father = new SimpleXMLElement($xml);
$sonNode = dom_import_simplexml($father->son);
$document = $sonNode->ownerDocument;
$result = '';
foreach ($sonNode->childNodes as $child) {
$result .= $document->saveXml($child);
}
echo $result;
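The loop above can be packaged as a reusable function, which avoids stripping the outer tag from asXml() output with a regular expression. This is a sketch: the name innerXml and the sample document are made up for illustration.

```php
<?php
// Hypothetical helper: returns the serialized children of a SimpleXML
// element, i.e. its content without the element's own start and end tags.
function innerXml(SimpleXMLElement $element): string
{
    $node = dom_import_simplexml($element);
    $result = '';
    foreach ($node->childNodes as $child) {
        // Text nodes and child elements are each saved as XML in order.
        $result .= $node->ownerDocument->saveXML($child);
    }
    return $result;
}

$father = simplexml_load_string('<father><son>Text with <b>HTML</b>.<br/></son></father>');
$html = innerXml($father->son);
echo $html;
```

Note that $father is the root <father> element here, so the child is reached as $father->son.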

how to add xml element tag to correct place?

I have code which adds an item like the one below to an XML file:
<newWord>
<Heb>rer</Heb>
<Eng>twew</Eng>
</newWord>
The problem is that when I add an element it ends up in the wrong place: outside the xml tag, like below.
<?xml version="1.0" encoding="UTF-8"?>
<xml>
<newWord>
<Heb>hebword</Heb>
<Eng>banna</Eng>
</newWord>
</xml>
<newWord>
<Heb>rer</Heb>
<Eng>twew</Eng>
</newWord>
How can I add the new element in the correct place? Thanks.
My code:
<?php
$wordH=$_GET['varHeb'];
$wordE=$_GET['varEng'];
$domtree='';
if(!$domtree)
{
$domtree = new DOMDocument('1.0', 'UTF-8');
$domtree->formatOutput = true;
$domtree->load('Dictionary_user.xml');
}
$Dictionary_user = $domtree->documentElement;
$currentitem = $domtree->createElement("newWord");
$currentitem = $domtree->appendChild($currentitem);
$currentitem->appendChild($domtree->createElement('Heb', $wordH));
$currentitem->appendChild($domtree->createElement('Eng',$wordE));
$Dictionary_user->childNodes->item(0)->parentNode->insertBefore($currentitem,$Dictionary_user->childNodes->item(0));
header("Content-type: text/xml");
$domtree->save("Dictionary_user.xml");
/* get the xml printed */
echo $domtree->saveXML();
?>
Perhaps you want something like:
<?php
if(!empty($_GET['varHeb']) && !empty($_GET['varEng'])){
//Dont forget utf-8 if using Hebrew letters
header('Content-type: text/xml; charset=utf-8');
//Load It
$xml = simplexml_load_file('./Dictionary_user.xml');
//Add a new node & add the children
$node = $xml->addChild('newWord');
$node->addChild('Heb', trim($_GET['varHeb']));
$node->addChild('Eng', trim($_GET['varEng']));
//DOMDocument to format code output
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadXML($xml->asXML());
echo $dom->saveXML();
}else{
//params not set for $_GET
}
/**
* Result
<?xml version="1.0" encoding="UTF-8"?>
<xml>
<newWord>
<Heb>hebword</Heb>
<Eng>banna</Eng>
</newWord>
<newWord>
<Heb>אנחנו אוהבים גלישת מחסנית</Heb>
<Eng>We Love Stack overflow</Eng>
</newWord>
</xml>
*/
The issue is that you are doing:
$currentitem = $domtree->createElement("newWord");
$currentitem = $domtree->appendChild($currentitem);
Change it to
$currentitem = $domtree->createElement("newWord");
$currentitem = $domtree->documentElement->appendChild($currentitem);
This will then append it to the root node.
There are also a few other things to improve in your code:
$wordH=$_GET['varHeb'];
$wordE=$_GET['varEng'];
Make sure you sanitize/filter this content. DOM is quite good at making sure no malicious content will be inserted, but anything coming from the outside world should be sanity checked.
$domtree='';
if(!$domtree)
{
This is rather pointless to do. You made $domtree an empty string, so the if block will always get triggered. Remove that part and simply load the XML.
$domtree = new DOMDocument('1.0', 'UTF-8');
$domtree->formatOutput = true;
$domtree->load('Dictionary_user.xml');
}
When you load a string or file with DOM, it will discard any arguments you put into the constructor and use the arguments found in the xml prolog of the string or file instead. So you do not need to supply them when newing the DOMDocument.
$Dictionary_user = $domtree->documentElement;
This is the root node (<xml>) you want to append to.
$currentitem = $domtree->createElement("newWord");
$currentitem = $domtree->appendChild($currentitem);
This is where the error is happening. You create a new element and then append it to $domtree, which is the document as a whole. In an XPath query, $domtree is /, while you want to append to /xml, i.e. the root element.
$currentitem->appendChild($domtree->createElement('Heb', $wordH));
$currentitem->appendChild($domtree->createElement('Eng',$wordE));
This works as expected. You are adding to <newWord>
$Dictionary_user
->childNodes
->item(0)
->parentNode
->insertBefore($currentitem, $Dictionary_user->childNodes->item(0));
You are traversing from the root element (<xml>) to the first child element and then back up to the parent, which is the root element again obviously. So the entire traversal is not necessary. In fact, unless you want to move the appended <newWord> above the first child of the root element, the entire block is not necessary at all.
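Applying the points above, a corrected version might look like the following sketch. The XML is loaded from a string so the example is self-contained, the function name appendWord and the sample words are made up, and createTextNode is used so that a value containing & is escaped.

```php
<?php
// Stand-in for the contents of Dictionary_user.xml.
$existing = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<xml>
  <newWord>
    <Heb>hebword</Heb>
    <Eng>banna</Eng>
  </newWord>
</xml>
XML;

// Hypothetical helper: appends a <newWord> entry to the root element.
function appendWord(DOMDocument $dom, string $heb, string $eng): DOMElement
{
    $word = $dom->createElement('newWord');
    foreach (['Heb' => $heb, 'Eng' => $eng] as $name => $value) {
        $element = $dom->createElement($name);
        // A text node escapes special characters such as & and <.
        $element->appendChild($dom->createTextNode($value));
        $word->appendChild($element);
    }
    // Append to the root element, not to the document itself.
    $dom->documentElement->appendChild($word);
    return $word;
}

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // lets formatOutput re-indent loaded XML
$dom->formatOutput = true;
$dom->loadXML($existing);

appendWord($dom, 'מלח', 'salt & pepper');

$output = $dom->saveXML();
echo $output;
```

With a file, loadXML($existing) becomes load('Dictionary_user.xml') and the result is written back with save('Dictionary_user.xml').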

How to store special characters in an XML file as they are?

I am using my xml file to store special chars.
This is my original file
<root>
<popups>
<popup id="1">
<text1>
<![CDATA[dynamic text popup 2a]]>
</text1>
<text2>
<![CDATA[dynamic text popup 2b]]>
</text2>
</popup>
</popups>
</root>
Now when I use PHP to save special chars, it becomes like this:
<root>
<popups>
<popup id="1">
<text1>&lt;![CDATA[Hello world]]&gt;</text1>
<text2>&lt;![CDATA[asassa]]&gt;</text2>
</popup>
</popups>
</root>
I have used the following code :
$this->xmlDocument = simplexml_load_file("xml/conf.xml");
$pages_node = $this->xmlDocument->xpath("/root/popups/popup[@id=1]");
$name = $_POST['popup-name'];
$editor1 = trim(strip_tags($_POST['editor1']));
$editor2 = trim(strip_tags($_POST['editor2']));
if (!empty($name)){
if (!empty($editor1)){
$pages_node[0]->text1 = "<![CDATA[".$editor1."]]>";
}
if (!empty($editor2)){
$pages_node[0]->text2 = "<![CDATA[".$editor2."]]>" ;
}
$this->xmlDocument->asXml($this->basePath() . "conf/conf.xml");
}
How can I save the special chars as they are without needing to encode them?
SimpleXML is meant to be simple, so there is no such option. dom_import_simplexml can help you get a DOM node from a SimpleXML object.
You then create a CDATA section with the node's owner document and put it into the imported DOMElement node.
If you are using PHP's DOMDocument directly, you have to create a DOMCDATASection and append it to the text1/text2 nodes.
If you don't have text1 and text2 nodes, you have to create them first, then append a CDATA node to each and finally append them to popup:
// A node can have only one parent, so each element gets its own CDATA section.
$text1 = $dom->createElement('text1');
$text1->appendChild($dom->createCDATASection("test"));
$text2 = $dom->createElement('text2');
$text2->appendChild($dom->createCDATASection("test"));
$popupNode->appendChild($text1);
$popupNode->appendChild($text2);
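For the case in the question, where text1 and text2 already exist and were loaded with SimpleXML, a sketch of the replacement could look like this. The function name replaceWithCData and the sample values are made up, and the XML is loaded from a string rather than from xml/conf.xml.

```php
<?php
// Hypothetical helper: replaces the content of an existing element with a
// single CDATA section.
function replaceWithCData(SimpleXMLElement $element, string $text): void
{
    $node = dom_import_simplexml($element);
    // Drop the old text or CDATA children first.
    while ($node->firstChild !== null) {
        $node->removeChild($node->firstChild);
    }
    $node->appendChild($node->ownerDocument->createCDATASection($text));
}

$xml = simplexml_load_string(
    '<root><popups><popup id="1">'
    . '<text1><![CDATA[dynamic text popup 2a]]></text1>'
    . '<text2><![CDATA[dynamic text popup 2b]]></text2>'
    . '</popup></popups></root>'
);

$popups = $xml->xpath('/root/popups/popup[@id=1]');
replaceWithCData($popups[0]->text1, 'Hello <b>world</b>');
replaceWithCData($popups[0]->text2, 'asassa');

$output = $xml->asXML();
echo $output;
```

Assigning the string "<![CDATA[...]]>" to the SimpleXML property, as in the question, stores it as plain text, which is why the markers came out escaped.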
