How to decode HTML entities when saving an XML file? - php

I have the following code in my PHP script:
$str = '<item></item>';
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->formatOutput = true;
$xml->load('file.xml');
$items = $xml->getElementsByTagName('items')->item(0);
$items->nodeValue = $str;
$xml->save('file.xml');
In the saved file.xml I see the following:
&lt;item&gt;&lt;/item&gt;
How can I save it in the XML file without encoding HTML entities?

Use a DOMDocumentFragment:
<?php
$doc = new DOMDocument();
$doc->load('file.xml'); // '<doc><items/></doc>'
$item = $doc->createDocumentFragment();
$item->appendXML('<item id="1">item</item>');
$doc->getElementsByTagName('items')->item(0)->appendChild($item);
$doc->save('file.xml');
If you're appending to the root element, use $doc->documentElement->appendChild($item); instead of getElementsByTagName.
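The difference between the two approaches can be seen side by side. The following is a minimal sketch, assuming an in-memory document shaped like '<doc><items/></doc>' in place of file.xml; the variable names are illustrative only.

```php
<?php
// In-memory stand-in for file.xml; the <doc><items/></doc> shape is assumed.
$source = '<doc><items/></doc>';

// Case 1: a string assigned to nodeValue is stored as text and escaped on output.
$textDoc = new DOMDocument();
$textDoc->loadXML($source);
$textDoc->getElementsByTagName('items')->item(0)->nodeValue = '<item></item>';
$asText = $textDoc->saveXML($textDoc->documentElement);

// Case 2: appendXML() parses the string, so real element nodes are inserted.
$markupDoc = new DOMDocument();
$markupDoc->loadXML($source);
$fragment = $markupDoc->createDocumentFragment();
$fragment->appendXML('<item></item>');
$markupDoc->getElementsByTagName('items')->item(0)->appendChild($fragment);
$asMarkup = $markupDoc->saveXML($markupDoc->documentElement);

echo $asText, "\n", $asMarkup, "\n";
```

In the first case the document contains no item element at all, only text that looks like one; in the second case item is a real child of items.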

Related

PHP Domdocument use saveXML instead of save

I am creating an XML file in PHP like this...
$myXML = new DOMDocument();
$myXML ->formatOutput = true;
$data = $myXML ->createElement('data');
$data->nodeValue = 'mydata';
$final->appendChild($data);
$myXML ->save('/mypath/myfile.xml');
This works, but how can I convert this to use saveXML() instead? I have tried like this but I get nothing
$myXML->saveXML();
Where am I going wrong?
I see two problems:
$final is never declared. Append $data to $myXML instead.
saveXML() returns the XML as a string rather than writing it anywhere, so the result has to be assigned to a variable or printed.
Here goes the working code:
<?php
$myXML = new DOMDocument();
$myXML ->formatOutput = true;
$data = $myXML ->createElement('data');
$data->nodeValue = 'mydata';
$myXML->appendChild($data);
echo $myXML ->saveXML();
?>
Output:
<?xml version="1.0"?>
<data>mydata</data>

Create a XML document using the DOM object with white characters

I have a question: I am trying to create a XML file using DomDocument and I would like to have this output:
<?xml version="1.0" encoding="UTF-8"?>
<winstrom version="1.0">
<main_tag>
<child_tag>example</child_tag>
</main_tag>
</winstrom>
The problem is with the second row - if I write it as below then the output is "Invalid Character Error". I guess it is not allowed to have space characters there... However I need it like this, so what are the options?
$dom = new DomDocument('1.0', 'UTF-8');
$root = $dom->createElement('winstrom version=1.0');
$dom->appendChild($root);
$item = $dom->createElement('hlavni_tag');
$root2->appendChild($item);
$text = $dom->createTextNode('example');
$item->appendChild($text);
$dom->formatOutput = true;
echo $dom->saveXML();
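There seems to be a misunderstanding of what an XML element is and how it differs from an attribute. createElement() accepts only the element name, which cannot contain spaces; version is an attribute and has to be added with setAttribute().
Try this code: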
<?php
$dom = new DomDocument('1.0', 'UTF-8');
$root = $dom->createElement('winstrom');
$root->setAttribute("version","1.0");
$dom->appendChild($root);
$root2 = $dom->createElement("main_tag"); //You forgot this part
$root->appendChild($root2);
$item = $dom->createElement('hlavni_tag'); //Should it be "child_tag"?
$root2->appendChild($item);
$text = $dom->createTextNode('example');
$item->appendChild($text);
$dom->formatOutput = true;
echo $dom->saveXML();

domDocument's formatOutput property writes inline [duplicate]

Here is the code:
$doc = new DomDocument('1.0');
// create root node
$root = $doc->createElement('root');
$root = $doc->appendChild($root);
$signed_values = array('a' => 'eee', 'b' => 'sd', 'c' => 'df');
// process one row at a time
foreach ($signed_values as $key => $val) {
// add node for each row
$occ = $doc->createElement('error');
$occ = $root->appendChild($occ);
// add a child node for each field
foreach ($signed_values as $fieldname => $fieldvalue) {
$child = $doc->createElement($fieldname);
$child = $occ->appendChild($child);
$value = $doc->createTextNode($fieldvalue);
$value = $child->appendChild($value);
}
}
// get completed xml document
$xml_string = $doc->saveXML() ;
echo $xml_string;
If I print it in the browser I don't get nice XML structure like
<xml> \n tab <child> etc.
I just get
<xml><child>ee</child></xml>
And I want it to be UTF-8.
How is this all possible to do?
You can try to do this:
...
// get completed xml document
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
$xml_string = $doc->saveXML();
echo $xml_string;
You can also set these properties right after you've created the DOMDocument:
$doc = new DomDocument('1.0');
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
That's probably more concise. Output in both cases is:
<?xml version="1.0"?>
<root>
  <error>
    <a>eee</a>
    <b>sd</b>
    <c>df</c>
  </error>
  <error>
    <a>eee</a>
    <b>sd</b>
    <c>df</c>
  </error>
  <error>
    <a>eee</a>
    <b>sd</b>
    <c>df</c>
  </error>
</root>
I'm not aware of a way to change the indentation character(s) with DOMDocument. You could post-process the XML with a line-by-line regular expression replacement (e.g. with preg_replace) that turns each pair of leading spaces into a tab:
$xml_string = preg_replace('/(?:^|\G)  /um', "\t", $xml_string);
Alternatively, there is the tidy extension with tidy_repair_string which can pretty print XML data as well. It's possible to specify indentation levels with it, however tidy will never output tabs.
tidy_repair_string($xml_string, ['input-xml'=> 1, 'indent' => 1, 'wrap' => 0]);
With a SimpleXml object, you can simply
$domxml = new DOMDocument('1.0');
$domxml->preserveWhiteSpace = false;
$domxml->formatOutput = true;
/* @var $xml SimpleXMLElement */
$domxml->loadXML($xml->asXML());
$domxml->save($newfile);
Here $xml is your SimpleXML object.
Your SimpleXML data is then saved, formatted, to the new file specified by $newfile.
<?php
$xml = $argv[1];
$dom = new DOMDocument();
// Initial block (must come before loading the XML string)
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
// End initial block
$dom->loadXML($xml);
$out = $dom->saveXML();
print_r($out);
I tried all the answers but none worked. Maybe it's because I'm appending and removing children before saving the XML.
After a lot of googling I found this comment in the PHP documentation. I only had to reload the resulting XML to make it work.
$outXML = $xml->saveXML();
$xml = new DOMDocument();
$xml->preserveWhiteSpace = false;
$xml->formatOutput = true;
$xml->loadXML($outXML);
$outXML = $xml->saveXML();
// ##### IN SUMMARY #####
$xmlFilepath = 'test.xml';
echoFormattedXML($xmlFilepath);
/*
* echo xml in source format
*/
function echoFormattedXML($xmlFilepath) {
header('Content-Type: text/xml'); // to show source, not execute the xml
echo formatXML($xmlFilepath); // format the xml to make it readable
} // echoFormattedXML
/*
* format xml so it can be easily read but will use more disk space
*/
function formatXML($xmlFilepath) {
$loadxml = simplexml_load_file($xmlFilepath);
$dom = new DOMDocument('1.0');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadXML($loadxml->asXML());
$formatxml = new SimpleXMLElement($dom->saveXML());
//$formatxml->saveXML("testF.xml"); // save as file
return $formatxml->saveXML();
} // formatXML
Two different issues here:
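Set the formatOutput property to TRUE to generate formatted XML. If the document is loaded from existing XML rather than built from scratch, also set preserveWhiteSpace to FALSE before loading it:
$doc->preserveWhiteSpace = FALSE;
$doc->formatOutput = TRUE;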
Many web browsers (namely Internet Explorer and Firefox) format XML when they display it. Use either the View Source feature or a regular text editor to inspect the output.
See also xmlEncoding and encoding.
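For the UTF-8 part of the question, one option is to pass the encoding to the DOMDocument constructor, as other snippets on this page do. A minimal sketch, with made-up element names:

```php
<?php
// The second constructor argument sets the document encoding,
// which saveXML() then writes into the XML declaration.
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;

$root = $doc->appendChild($doc->createElement('root'));
$root->appendChild($doc->createElement('child', 'eee'));

$xmlString = $doc->saveXML();
echo $xmlString;
```

Without the second argument the declaration is written without an encoding attribute, as in the output shown earlier.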
This is a slight variation on the above theme, but I'm putting it here in case others hit this and cannot make sense of it, as I did.
When using saveXML(), preserveWhiteSpace in the target DOMDocument does not apply to imported nodes (as of PHP 5.6).
Consider the following code:
$dom = new DOMDocument(); //create a document
$dom->preserveWhiteSpace = false; //disable whitespace preservation
$dom->formatOutput = true; //pretty print output
$documentElement = $dom->createElement("Entry"); //create a node
$dom->appendChild ($documentElement); //append it
$message = new DOMDocument(); //create another document
$message->loadXML($messageXMLtext); //populate the new document from XML text
$node=$dom->importNode($message->documentElement,true); //import the new document content to a new node in the original document
$documentElement->appendChild($node); //append the new node to the document Element
echo $dom->saveXML($dom->documentElement); //print the original document
In this context, the $dom->saveXML(); statement will NOT pretty print the content imported from $message, but content originally in $dom will be pretty printed.
In order to achieve pretty printing for the entire $dom document, the line:
$message->preserveWhiteSpace = false;
must be included after the $message = new DOMDocument(); line, i.e. every document from which nodes are imported must also have preserveWhiteSpace = false.
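Put together, the corrected version might look like the sketch below. The content of $messageXMLtext is a hypothetical example; the spaces between its tags are what would otherwise be imported as whitespace-only text nodes.

```php
<?php
// Hypothetical message text; the spaces between tags would become
// whitespace-only text nodes if preserveWhiteSpace were left at its default.
$messageXMLtext = '<msg> <a>1</a> <b>2</b> </msg>';

$dom = new DOMDocument();
$dom->formatOutput = true;
$entry = $dom->appendChild($dom->createElement('Entry'));

$message = new DOMDocument();
$message->preserveWhiteSpace = false; // must be set before loadXML()
$message->loadXML($messageXMLtext);

// Import the parsed content into $dom and attach it under <Entry>.
$entry->appendChild($dom->importNode($message->documentElement, true));

$out = $dom->saveXML();
echo $out;
```

With the whitespace nodes dropped at load time, the imported msg element has only element children, so formatOutput can indent it along with the rest of $dom.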
Based on the answer by @heavenevil.
This function pretty prints the XML in the browser:
function prettyPrintXmlToBrowser(SimpleXMLElement $xml)
{
$domXml = new DOMDocument('1.0');
$domXml->preserveWhiteSpace = false;
$domXml->formatOutput = true;
$domXml->loadXML($xml->asXML());
$xmlString = $domXml->saveXML();
echo nl2br(str_replace(' ', '&nbsp;', htmlspecialchars($xmlString)));
}

Appended nodes not formatted

I made a PHP script that updates an existing XML file by adding new nodes. The problem is that the new nodes are not formatted. They are written in a single line. Here is my code :
$file = fopen('data.csv','r');
$xml = new DOMDocument('1.0', 'utf-8');
$xml->formatOutput = true;
$doc = new DOMDocument();
$doc->loadXML(file_get_contents('data.xml'));
$xpath = new DOMXPath($doc);
$root = $xpath->query('/my/node');
$root = $root->item(0);
$root = $xml->importNode($root,true);
// all the tags created in this loop are not formatted, and written in a single line
while($line=fgetcsv($file,1000,';')){
$tag = $xml->createElement('cart');
$tag->setAttribute('attr1',$line[0]);
$tag->setAttribute('attr2',$line[1]);
$root->appendChild($tag);
}
$xml->appendChild($root);
$xml->save('updated.xml');
How can I solve this?
Try setting preserveWhiteSpace = false on the DOMDocument object that loads the file, before the file is loaded.
$xml = new DOMDocument('1.0', 'utf-8');
$xml->formatOutput = true;
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML(file_get_contents('data.xml'));
$doc->formatOutput = true;
...
PHP.net - DOMDocument::preserveWhiteSpace

How to load a xml file in php so that i can use xpath on it?

I have a problem with PHP.
If I run the code below, nothing happens.
$filename = "/opt/olat/olatdata/bcroot/course/85235053647606/runstructure.xml";
if (file_exists($filename)) {
$xml = simplexml_load_file($filename, 'SimpleXMLElement', LIBXML_NOCDATA);
// $xpath = new DOMXPath($filename);
}
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXpath($doc);
$res = $xpath->query('/org.olat.course.Structure/rootNode/children/org.olat.course.nodes.STCourseNode/shortTitle');
foreach ($res as $entry) {
echo "{$entry->nodeValue}<br/>";
}
If I replace the value of $xml with the contents of the file as a literal string,
$xml = '<org.olat.course.Structure><rootNode class="org.olat.course.nodes.STCourseNode"> ... ';
then it works, so I think there is something wrong with the way the XML file is loaded.
I've also tried to load the XML file as a DOMDocument, but that doesn't work either.
In both cases, it does work if I read the XML data through $xml.
For example, this works:
echo $Course_name = $xml->rootNode->longTitle;
loadXML() takes a string as input, not the return value of simplexml_load_file(), which is a SimpleXMLElement object. Just use file_get_contents() to get the full contents of the file as a string.
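A minimal sketch of that fix follows. The temporary file and its one-element content are hypothetical stand-ins for runstructure.xml, and the XPath expression is shortened to match them; DOMDocument::load(), which reads a file directly, is noted as an alternative.

```php
<?php
// Hypothetical stand-in for runstructure.xml, written to a temp file
// so the sketch can run anywhere.
$filename = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents(
    $filename,
    '<org.olat.course.Structure><rootNode><shortTitle>Intro</shortTitle></rootNode></org.olat.course.Structure>'
);

$doc = new DOMDocument();
// Read the file into a string first, then parse the string.
$doc->loadXML(file_get_contents($filename));
// Alternative: let DOMDocument read the file itself.
// $doc->load($filename);

$xpath = new DOMXPath($doc);
$titles = [];
foreach ($xpath->query('/org.olat.course.Structure/rootNode/shortTitle') as $entry) {
    $titles[] = $entry->nodeValue;
}

unlink($filename); // clean up the temp file
echo implode("\n", $titles), "\n";
```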
