In a nutshell: creating a xml document with PHP DOMDocument(). Need to add a line to the document.
Successfully creating an XML document with:
$doc = new DOMDocument();
$doc->formatOutput = true;
$r = $doc->createElement( "products" );
$doc->appendChild( $r );
while ( $row = mysql_fetch_array($res) )
{
$b = $doc->createElement( "product" );
//adding child elements.....
}
echo $doc->saveXML();
$doc->save("write.xml")
However, at the top of the xml document I need to link to an xls.
<?xml-stylesheet type="text/xsl" href="my.xsl"?>
How do I add that line in the PHP, to the dynamically generated xml?
Thanks.
If you check the manual for DOMDocument, there is actually an example to do exactly what you need in the createProcessingInstruction method, # http://www.php.net/manual/en/domdocument.createprocessinginstruction.php
Related
I have wrote this code in PHP to compile an XML file with parameters that are in URL.
But when the XML file is already created instead of adding the new data at the bottom of file inside the root element, overwrite it and delete all old data.
Where is the problem?
I have seen some examples online but I can't figure out how fix it.
I need to verify if file already exist and then add the element?
Or I need to read it and then add again the old elements and new?
I don't know very well dom so I can't figure out
<?php
$FOL = $_GET["FOL"];
$NUM = $_GET["NUM"];
$DAT = $_GET["DAT"];
$ZON = $_GET["ZON"];
$TIP = $_GET["TIP"];
$COM = $_GET["COM"];
$dom = new DOMDocument();
$dom->encoding = 'utf-8';
$dom->xmlVersion = '1.0';
$dom->formatOutput = true;
$xml_file_name = "$NUM.xml";
$xmlString = file_get_contents($xml_file_name);
$dom->loadXML($xmlString);
$loaded_xml = $dom->getElementsByTagName('Territorio');
$territorio_node = $dom->createElement('Territorio');
$child_node_NOM = $dom->createElement('NOM', "$NOM");
$territorio_node->appendChild($child_node_NOM);
$child_node_NUM = $dom->createElement('NUM', "$NUM");
$territorio_node->appendChild($child_node_NUM);
$child_node_DAT = $dom->createElement('DAT', "$DAT");
$territorio_node->appendChild($child_node_DAT);
$child_node_ZON = $dom->createElement('ZON', "$ZON");
$territorio_node->appendChild($child_node_ZON);
$dom->appendChild($territorio_node);
$child_node_TIP = $dom->createElement('TIP', "$TIP");
$territorio_node->appendChild($child_node_TIP);
$child_node_COM = $dom->createElement('COM', "$COM");
$territorio_node->appendChild($child_node_COM);
$dom->appendChild($territorio_node);
$dom->save($FOL.'/'.$xml_file_name);
echo "$xml_file_name creato correttamente";
?>
as per the comment: You should check that the root node of the XML file exists before calling createElement to generate a new one. To do that you can call getElementsByClassName and test whether the first entry is empty
ie: empty( $dom->getElementsByTagName('Territorio')[0] ) sort of thing...
If the root node exists we use that, otherwise add a new root to the document and continue
// check that the querystring has all the required parameters
if( isset(
$_GET['FOL'],
$_GET['NUM'],
$_GET['DAT'],
$_GET['ZON'],
$_GET['TIP'],
$_GET['COM']
)){
// the filename is generated from one of the querystring parameters
// - here the directory used is the same as the script running
$file=sprintf( '%s/%s.xml', __DIR__, $_GET['NUM'] );
// create an empty file if it does not exist
if( !file_exists( $file )){
file_put_contents( $file, '' );
}
clearstatcache();
// create the DOMDocument and set various options
libxml_use_internal_errors( true );
$dom=new DOMDocument('1.0','utf-8');
$dom->strictErrorChecking=false;
$dom->preserveWhiteSpace=true;
$dom->formatOutput=true;
$dom->recover=true;
// load the XML file directly rather than loading a string read from the original file.
$dom->load( $file );
// The ROOT node of the document ... does it exist? if not, create it and add to the DOM.
$root=$dom->getElementsByTagName('Territorio')[0];
if( empty( $root ) ){
$root=$dom->createElement('Territorio');
$dom->appendChild( $root );
}
// I added this part so that you can distinguish easily new elements
$record=$dom->createElement('record');
$root->appendChild( $record );
// add all the querystring parameters within the new record.
$record->appendChild( $dom->createElement('NUM', $_GET["NUM"] ) );
$record->appendChild( $dom->createElement('DAT', $_GET["DAT"] ) );
$record->appendChild( $dom->createElement('ZON', $_GET["ZON"] ) );
$record->appendChild( $dom->createElement('TIP', $_GET["TIP"] ) );
$record->appendChild( $dom->createElement('COM', $_GET["COM"] ) );
$record->appendChild( $dom->createElement('FOL', $_GET["FOL"] ) );
$dom->save( $file );
}
An example of the XML generated:
<?xml version="1.0" encoding="utf-8"?>
<Territorio>
<record>
<NUM>wibble</NUM>
<DAT>25/10/2022</DAT>
<ZON>europe</ZON>
<TIP>total</TIP>
<COM>94</COM>
<FOL>234</FOL>
</record>
<record>
<NUM>wibble</NUM>
<DAT>26/10/2022</DAT>
<ZON>europe</ZON>
<TIP>total</TIP>
<COM>96</COM>
<FOL>238</FOL>
</record>
</Territorio>
In the original code the file is saved to a location defined by another parameter in the querystring ( only just noticed that afterwards ) so rather than
$file=sprintf( '%s/%s.xml', __DIR__, $_GET['NUM'] );
you would likely want to do:
$file=sprintf( '%s/%s.xml', $_GET['FOL'], $_GET['NUM'] );
The root node of an XML document is called document element and here is an property for it. So you can just check if it is undefined. However an document can have only a single document element, so will need to modify the structure of your XML - for example add a "Territori" document element.
Do not use the second argument of the createElement() method or the $nodeValue property. Their escaping is broken - try adding a value with an &. Use $textContent or add a text node.
In modern DOM you can even just append() a string.
$NUM = "NUM";
$DAT = "DAT";
$ZON = "ZON";
$document = new DOMDocument('1.0', 'UTF-8');
// let the parser ignore existing indents
$document->preserveWhiteSpace = false;
$document->loadXML(getXMLString());
// fetch or create document element
$territori = $document->documentElement
?? $document->appendChild($document->createElement('Territori'));
// create/append an item element
$territori
->appendChild(
$territorio = $document->createElement('Territorio')
);
// create/append an element and set its text content
$territorio
->appendChild($document->createElement('NUM'))
->textContent = $NUM;
// create/append an element with a text child node
$territorio
->appendChild($document->createElement('DAT'))
->appendChild($document->createTextNode($DAT));
// create/append an element and a string (DOM Level 3)
$territorio
->appendChild($document->createElement('ZON'))
->append((string)$ZON);
// enable output formatting
$document->formatOutput = true;
echo $document->saveXML();
function getXMLString() {
return <<<'XML'
<?xml version="1.0"?>
<Territori>
<Territorio>
<NUM>NUM</NUM>
<DAT>DAT</DAT>
<ZON>DAT</ZON>
</Territorio>
</Territori>
XML;
}
For a more flexible approach to fetch nodes use Xpath expressions. Here is an example that checks if an Territorio with a specific NUM value exists:
$document = new DOMDocument('1.0', 'UTF-8');
$document->loadXML(getXMLString());
$xpath = new DOMXpath($document);
if ($xpath->evaluate('count(//Territorio[NUM="NUM"]) > 0')) {
echo "Node exists";
}
I want to get some data from this website but as you can see in their html code there are some weird stuff going on as <TABLE BORDER=0 CELLSPACING=1 CELLPADDING=3 WIDTH=100%> without using "" and some other stuff, so I'm having errors when I try to parse the table using SimpleXmlElement which I have been using for a few time and works perfectly in some websites,
I'm doing something like:
$html = file_get_html('https://secure.tibia.com/community/?subtopic=killstatistics&world=Menera');
$table = $html->find('table', 4);
$xml = new SimpleXmlElement($table);
I get a bunch of erros and stuff, so is there a way of cleaning the code before sending to SimpleXmlElement or perhaps using another kind of DOM class?
What do you guys recommend?
The problem with your HTML code is that the tag attributes are not wrapped by quotes: unquoted attributes are allowed in HTML, but not in XML.
If you don't care about attributes, you can continue using Simple HTML Dom, otherwise you have to change HTML parser.
Cleaning attributes with Simple HTML DOM:
Start creating a function to clear all node attributes:
function clearAttributes( $node )
{
foreach( $node->getAllAttributes() as $key => $val )
{
$node->$key = Null;
}
}
Then apply the function to your <table>, <tr> and <td> nodes:
clearAttributes( $table );
foreach( $table->find('tr') as $tr )
{
clearAttributes( $tr );
foreach( $tr->find( 'td' ) as $td )
{
clearAttributes( $td );
}
}
Last but not least: site HTML contains a lot of encoded characters. If you don't want see a lot of <td>1 </td><td>0 </td> inside your XML, you have to prepend at your string a utf-8 declaration before importing it in a SimpleXml object:
$xml = '<?xml version="1.0" encoding="utf-8" ?>'.html_entity_decode( $table );
$xml = new SimpleXmlElement( $xml );
Preserving attributes with DOMDocument:
The built-in DOMDocument class is more powerful and less memory hungry than Simple HTML Dom. In this case, it will well-format original HTML for you. Despite appearances, its use is simple.
First, you have to init a DOMDocument object, setting libxml_use_internal_errors (to suppress a lot of warnings on malformed HTML) and load your url:
$dom = new DOMDocument();
libxml_use_internal_errors( 1 );
$dom->loadHTMLfile( 'https://secure.tibia.com/community/?subtopic=killstatistics&world=Menera' );
$dom->formatOutput = True;
Then, you retrieve desired <table>:
$table = $dom->getElementsByTagName( 'table' )->item(4);
And, like in Simple HTML Dom example, you have to prepend utf-8 declaration to avoid weird characters:
$xml = '<?xml version="1.0" encoding="utf-8" ?>'.$dom->saveHTML( $table );
$xml = new SimpleXmlElement( $xml );
As you can see, the DOMDocument syntax to retrieve a node as HTML is different than Simple HTML Dom: you need always to refer to main object and specify the node to print as argument:
echo $dom->saveHTML(); // print entire HTML document
echo $dom->saveHTML( $node ); // print node $node
Edit: removing with DOMDocument:
To remove unwanted from HTML, you can pre-load HTML and use str_replace.
Change this line:
$dom->loadHTMLfile( 'https://secure.tibia.com/community/?subtopic=killstatistics&world=Menera' );
with this:
$data = file_get_contents( 'https://secure.tibia.com/community/?subtopic=killstatistics&world=Menera' );
$data = str_replace( ' ', '', $data );
$dom->loadHTML( $data );
Here are the codes:
$doc = new DomDocument('1.0');
// create root node
$root = $doc->createElement('root');
$root = $doc->appendChild($root);
$signed_values = array('a' => 'eee', 'b' => 'sd', 'c' => 'df');
// process one row at a time
foreach ($signed_values as $key => $val) {
// add node for each row
$occ = $doc->createElement('error');
$occ = $root->appendChild($occ);
// add a child node for each field
foreach ($signed_values as $fieldname => $fieldvalue) {
$child = $doc->createElement($fieldname);
$child = $occ->appendChild($child);
$value = $doc->createTextNode($fieldvalue);
$value = $child->appendChild($value);
}
}
// get completed xml document
$xml_string = $doc->saveXML() ;
echo $xml_string;
If I print it in the browser I don't get nice XML structure like
<xml> \n tab <child> etc.
I just get
<xml><child>ee</child></xml>
And I want to be utf-8
How is this all possible to do?
You can try to do this:
...
// get completed xml document
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
$xml_string = $doc->saveXML();
echo $xml_string;
You can make set these parameter right after you've created the DOMDocument as well:
$doc = new DomDocument('1.0');
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
That's probably more concise. Output in both cases is (Demo):
<?xml version="1.0"?>
<root>
<error>
<a>eee</a>
<b>sd</b>
<c>df</c>
</error>
<error>
<a>eee</a>
<b>sd</b>
<c>df</c>
</error>
<error>
<a>eee</a>
<b>sd</b>
<c>df</c>
</error>
</root>
I'm not aware how to change the indentation character(s) with DOMDocument. You could post-process the XML with a line-by-line regular-expression based replacing (e.g. with preg_replace):
$xml_string = preg_replace('/(?:^|\G) /um', "\t", $xml_string);
Alternatively, there is the tidy extension with tidy_repair_string which can pretty print XML data as well. It's possible to specify indentation levels with it, however tidy will never output tabs.
tidy_repair_string($xml_string, ['input-xml'=> 1, 'indent' => 1, 'wrap' => 0]);
With a SimpleXml object, you can simply
$domxml = new DOMDocument('1.0');
$domxml->preserveWhiteSpace = false;
$domxml->formatOutput = true;
/* #var $xml SimpleXMLElement */
$domxml->loadXML($xml->asXML());
$domxml->save($newfile);
$xml is your simplexml object
So then you simpleXml can be saved as a new file specified by $newfile
<?php
$xml = $argv[1];
$dom = new DOMDocument();
// Initial block (must before load xml string)
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
// End initial block
$dom->loadXML($xml);
$out = $dom->saveXML();
print_R($out);
Tried all the answers but none worked. Maybe it's because I'm appending and removing childs before saving the XML.
After a lot of googling found this comment in the php documentation. I only had to reload the resulting XML to make it work.
$outXML = $xml->saveXML();
$xml = new DOMDocument();
$xml->preserveWhiteSpace = false;
$xml->formatOutput = true;
$xml->loadXML($outXML);
$outXML = $xml->saveXML();
// ##### IN SUMMARY #####
$xmlFilepath = 'test.xml';
echoFormattedXML($xmlFilepath);
/*
* echo xml in source format
*/
function echoFormattedXML($xmlFilepath) {
header('Content-Type: text/xml'); // to show source, not execute the xml
echo formatXML($xmlFilepath); // format the xml to make it readable
} // echoFormattedXML
/*
* format xml so it can be easily read but will use more disk space
*/
function formatXML($xmlFilepath) {
$loadxml = simplexml_load_file($xmlFilepath);
$dom = new DOMDocument('1.0');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadXML($loadxml->asXML());
$formatxml = new SimpleXMLElement($dom->saveXML());
//$formatxml->saveXML("testF.xml"); // save as file
return $formatxml->saveXML();
} // formatXML
Two different issues here:
Set the formatOutput and preserveWhiteSpace attributes to TRUE to generate formatted XML:
$doc->formatOutput = TRUE;
$doc->preserveWhiteSpace = TRUE;
Many web browsers (namely Internet Explorer and Firefox) format XML when they display it. Use either the View Source feature or a regular text editor to inspect the output.
See also xmlEncoding and encoding.
This is a slight variation of the above theme but I'm putting here in case others hit this and cannot make sense of it ...as I did.
When using saveXML(), preserveWhiteSpace in the target DOMdocument does not apply to imported nodes (as at PHP 5.6).
Consider the following code:
$dom = new DOMDocument(); //create a document
$dom->preserveWhiteSpace = false; //disable whitespace preservation
$dom->formatOutput = true; //pretty print output
$documentElement = $dom->createElement("Entry"); //create a node
$dom->appendChild ($documentElement); //append it
$message = new DOMDocument(); //create another document
$message->loadXML($messageXMLtext); //populate the new document from XML text
$node=$dom->importNode($message->documentElement,true); //import the new document content to a new node in the original document
$documentElement->appendChild($node); //append the new node to the document Element
$dom->saveXML($dom->documentElement); //print the original document
In this context, the $dom->saveXML(); statement will NOT pretty print the content imported from $message, but content originally in $dom will be pretty printed.
In order to achieve pretty printing for the entire $dom document, the line:
$message->preserveWhiteSpace = false;
must be included after the $message = new DOMDocument(); line - ie. the document/s from which the nodes are imported must also have preserveWhiteSpace = false.
based on the answer by #heavenevil
This function pretty prints using the browser
function prettyPrintXmlToBrowser(SimpleXMLElement $xml)
{
$domXml = new DOMDocument('1.0');
$domXml->preserveWhiteSpace = false;
$domXml->formatOutput = true;
$domXml->loadXML($xml->asXML());
$xmlString = $domXml->saveXML();
echo nl2br(str_replace(' ', ' ', htmlspecialchars($xmlString)));
}
i am trying to create a page of xml from php
$xml = new DOMDocument();
$xml->load("recv.xml") or exit("not loaded");
$xml_album = $xml->createElement("Album");
$xml_track = $xml->createElement("Track");
$xml_album->appendChild( $xml_track );
$xml->appendChild( $xml_album );
$doc->save('recv.xml');
this is my script but it is not working
Do i need to include to any file. Help me
Edit, error description according to comment;
DOMDocument::load() [function.DOMDocument-load]:
Start tag expected, '<' not found
You should probably create a fresh new document instead of loading existing not properly formed document.
$xml = new DOMDocument();
//$xml->load("recv.xml") or exit("not loaded"); - do not load the existing document!!!!
$xml_album = $xml->createElement("Album");
$xml_track = $xml->createElement("Track");
$xml_album->appendChild( $xml_track );
$xml->appendChild( $xml_album );
$xml->save('recv.xml');
XML is not allowed to have more than one root node, so if you are using <Album> as a root, opening existing document and adding one more root node will also produce error.
I have a little question ... this is my xml:
<?xml version="1.0" encoding="UTF-8"?>
<links>
<link>
<id>432423</id>
<href>http://www.google.ro</href>
</link>
<link>
<id>5432345</id>
<href>http://www.youtube.com</href>
</link>
<link>
<id>5443</id>
<href>http://www.yoursite.com</href>
</link>
</links>
How can i ad another
<link>
<id>5443</id>
<href>http://www.yoursite.com</href>
</link>
??
I managed only to add a record to ROOT/LINKS -> LINK using xpath, and here is the code
<?php
$doc = new DOMDocument();
$doc->load( 'links.xml' );
$links= $doc->getElementsByTagName("links");
$xpath = new DOMXPath($doc);
$hrefs = $xpath->evaluate("/links");
$href = $hrefs->item(0);
$item = $doc->createElement("item");
/*HERE IS THE ISSUE...*/
$link = $doc->createElement("id","298312800");
$href->appendChild($link);
$link = $doc->createElement("link","www.anysite.com");
$href->appendChild($link);
$href->appendChild($item);
print $doc->save('links.xml');
echo "the link has been added!";
?>
Any help would be appreciated :D
$doc = new DOMDocument();
// Setting formatOutput to true will turn on xml formating so it looks nicely
// however if you load an already made xml you need to strip blank nodes if you want this to work
$doc->load('links.xml', LIBXML_NOBLANKS);
$doc->formatOutput = true;
// Get the root element "links"
$root = $doc->documentElement;
// Create new link element
$link = $doc->createElement("link");
// Create and add id to new link element
$id = $doc->createElement("id","298312800");
$link->appendChild($id);
// Create and add href to new link element
$href = $doc->createElement("href","www.anysite.com");
$link->appendChild($href);
// Append new link to root element
$root->appendChild($link);
print $doc->save('links.xml');
echo "the link has been added!";
XPath is used to locate nodes in an XML document, not to manipulate the tree. Try $dom->appendChild($new_link).