store and retrieve illegal XML characters for XHTML output - php

I need to store content in an XML database. Some of the data looks like this:
<item>
<span class ="person">Henry 8<sup>th</sup></span>
</item>
<item>
<span class="company">Berkley & Jensen</span>
</item>
I need to load the data into a DOM object with loadXML() and then pass it to an XSL stylesheet, where it is further manipulated using XPath and CSS. Loading the data fails because of the '&'. I do not want to convert all entities, because I need to apply CSS to <sup> and XPath to the 'class' attribute, and I suspect that encoded entities will make both fail. How should I store and retrieve the illegal characters?
Because of the comments, I am providing a sample PHP script. If you add the PHP tags it should run. Thank you for the CDATA suggestion; I have used it to demonstrate the problem. If I use the 'block' tag as the target for the XPath it works fine, but if I use the 'span' tag it prints nothing.
$xsl = <<<XSL
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template name="doContent" match="/">
<div class="story">
<xsl:for-each select="//body/block"> <xsl:copy-of select="." />
</xsl:for-each>
</div>
</xsl:template>
</xsl:stylesheet>
XSL;
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<content id="test" >
<headline>test</headline>
<author>test</author>
<body>
<block id="1"><![CDATA[<span class="normal"><p>1</p></span>]]></block>
<block id="2"><![CDATA[<span class=""><p>2</p></span>]]></block>
<block id="3"><![CDATA[<span class ="person">Henry 8<sup>th</sup></span>]]></block>
<block id="4"><![CDATA[<span class="company">Berkley & Jensen</span>]]></block>
<block id="5"><![CDATA[<span class=""><p>5</p></span>]]></block>
<block id="6"><![CDATA[<span class=""><p>6</p></span>]]></block>
</body>
</content>
XML;
$xslDoc = new DOMDocument();
$xslDoc->loadXML($xsl);
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xml);
$proc = new XSLTProcessor();
$proc->importStylesheet($xslDoc);
echo $proc->transformToXML($xmlDoc);

Wrap it in a CDATA section (<![CDATA[ ... ]]>):
<item>
<![CDATA[<span class="company">Berkley & Jensen</span>]]>
</item>
More on CDATA: What does <![CDATA[]]> in XML mean?
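One caveat, which matches what the question reports about 'span' printing nothing: everything inside a CDATA section is plain text to the parser, so XPath cannot see the <span> as an element. A minimal sketch of the difference, assuming it is acceptable to escape only the ampersand as &amp; (the countSpans() helper is a hypothetical name used for illustration):

```php
<?php
// Variant A: the markup is wrapped in CDATA, so it is stored as plain text.
$cdata = '<item><![CDATA[<span class="company">Berkley & Jensen</span>]]></item>';
// Variant B: only the ampersand is escaped; <span> stays a real element.
$escaped = '<item><span class="company">Berkley &amp; Jensen</span></item>';

// Hypothetical helper: counts <span class="company"> elements visible to XPath.
function countSpans(string $xml): int
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);
    return $xpath->query('//span[@class="company"]')->length;
}

echo countSpans($cdata), "\n";   // 0: the span is only text inside CDATA
echo countSpans($escaped), "\n"; // 1: the span is an element node

// The parser decodes &amp; again, so the text value is unchanged.
$doc = new DOMDocument();
$doc->loadXML($escaped);
echo $doc->getElementsByTagName('span')->item(0)->textContent, "\n"; // Berkley & Jensen
```

Escaping the ampersand does not touch the element names or the class attribute, so CSS on <sup> and XPath on 'class' keep working.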

I was able to resolve my situation with a function that sanitises the unwanted characters. You can try it with the sample XML that I gave above. Notice that I use loadHTML, NOT loadXML!
function clean_invalid_nodes(&$node)
{
global $xpath, $xmlDoc;
$nodes = $xpath->query("child::node()",$node);
foreach ($nodes as $n)
{
if ($n->nodeType == XML_ELEMENT_NODE) clean_invalid_nodes($n);
elseif ($n->nodeType == XML_TEXT_NODE)
{
if(trim($n->nodeValue)!='')
{
$newnode = $xmlDoc->createTextNode(htmlentities($xmlDoc->saveXML($n), ENT_SUBSTITUTE, 'utf-8'));
$n->parentNode->replaceChild($newnode, $n);
}
}
}
}
$xmlDoc = new DOMDocument();
@$xmlDoc->loadHTML($xml);
$xpath = new DomXPath($xmlDoc);
$nodes = $xpath->query("//span");
foreach ($nodes as $node) clean_invalid_nodes($node);
$out = $xpath->query("//html/body")->item(0);
echo $xmlDoc ->saveXML($out);


(XML) PHP is not renaming all tags?

I have a problem: I want to rename the tags in some XML files. It works with this code:
$xml = file_get_contents('data/onlinekeystore.xml');
renameTags($xml, 'priceEUR', 'price', 'data/onlinekeystore.xml');
But if I want to rename a tag in another XML file, the SAME method doesn't work.
See the example below. I have no idea why.
Does anybody have an idea and can help me?
$xml = file_get_contents('data/g2a.xml');
renameTags($xml, 'name', 'title', 'data/g2a.xml');
Function Code:
function renameTags($xml, $old, $new, $path){
$dom = new DOMDocument();
$dom->loadXML($xml);
$nodes = $dom->getElementsByTagName($old);
$toRemove = array();
foreach ($nodes as $node) {
$newNode = $dom->createElement($new);
foreach ($node->attributes as $attribute) {
$newNode->setAttribute($attribute->name, $attribute->value);
}
foreach ($node->childNodes as $child) {
$newNode->appendChild($node->removeChild($child));
}
$node->parentNode->appendChild($newNode);
$toRemove[] = $node;
}
foreach ($toRemove as $node) {
$node->parentNode->removeChild($node);
}
$dom->saveXML();
$dom->save($path);
}
onlinekeystore.xml Input:
<product>
<priceEUR>5.95</priceEUR>
</product>
onlinekeystore.xml Output:
<product>
<price>5.95</price>
</product>
g2a.xml Input:
<products>
<name><![CDATA[1 Random STEAM PREMIUM CD-KEY]]></name>
</products>
g2a.xml Output:
<products>
<name><![CDATA[1 Random STEAM PREMIUM CD-KEY]]></name>
</products>
Greetings
Consider a dynamic XSLT that runs the identity transform and renames the matching nodes in a dedicated template, wherever they occur in the document. sprintf() fills the $old and $new values into the XSL string. This avoids nested loops over the whole tree, and the output is pretty-printed regardless of how the input was formatted.
function renameTags($xml, $old, $new, $path){
// LOAD XML
$dom = new DOMDocument();
$dom->loadXML($xml);
// LOAD XSL
$xslstr = '<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" method="xml" cdata-section-elements="%2$s"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="%1$s">
<xsl:element name="%2$s">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
</xsl:transform>';
$xsl = new DOMDocument();
$xsl->loadXML(sprintf($xslstr, $old, $new));
// INITIALIZE TRANSFORMER (REQUIRES php_xsl EXTENSION ENABLED IN .ini)
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// TRANSFORM XML AND SAVE OUTPUT
$newXML = $proc->transformToXML($dom);
file_put_contents($path, $newXML);
}
Output
renameTags('<product><priceEUR>5.95</priceEUR></product>', 'priceEUR', 'price', 'data/onlinekeystore.xml');
// <?xml version="1.0" encoding="UTF-8"?>
// <product>
// <price><![CDATA[5.95]]></price>
// </product>
renameTags('<products><name><![CDATA[1 Random STEAM PREMIUM CD-KEY]]></name></products>', 'name', 'title', 'data/g2a.xml');
// <?xml version="1.0" encoding="UTF-8"?>
// <products>
// <title><![CDATA[1 Random STEAM PREMIUM CD-KEY]]></title>
// </products>
Note: XSLT does not preserve <![CDATA[...]]> sections unless they are requested with cdata-section-elements in <xsl:output>. As written, the text of every renamed node is wrapped in CDATA (see the price example above). Remove that attribute and no CDATA sections are emitted. So you get CDATA for all renamed elements or for none.
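If the xsl extension is not available, a DOM-only variant is also possible. This is only a sketch, not the method from the answer above: renameTagsDom() is a hypothetical name, it returns the XML string instead of saving to a path, and it copies attributes by plain name, so namespaced attributes would need extra handling. It snapshots the live node list first and moves children one at a time, which sidesteps the usual pitfalls of changing a DOM list while iterating over it.

```php
<?php
// Hypothetical DOM-only rename; returns the resulting XML as a string.
function renameTagsDom(string $xml, string $old, string $new): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    // getElementsByTagName() returns a live list, so copy it to an array first.
    $nodes = iterator_to_array($dom->getElementsByTagName($old));

    foreach ($nodes as $node) {
        $newNode = $dom->createElement($new);

        // Copy the attributes (read-only iteration, so no snapshot is needed).
        foreach ($node->attributes as $attribute) {
            $newNode->setAttribute($attribute->name, $attribute->value);
        }

        // Move the children one by one; appendChild() detaches each child
        // from $node, so the loop ends when $node is empty.
        while ($node->firstChild !== null) {
            $newNode->appendChild($node->firstChild);
        }

        // Put the renamed element exactly where the old one was.
        $node->parentNode->replaceChild($newNode, $node);
    }

    return $dom->saveXML();
}

$g2a = '<products><name><![CDATA[1 Random STEAM PREMIUM CD-KEY]]></name></products>';
echo renameTagsDom($g2a, 'name', 'title');
```

Because the CDATA node itself is moved rather than re-serialized, it stays a CDATA section in the output, and replaceChild() keeps the renamed element in its original position instead of appending it at the end of the parent.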

php getimagesize inside xslt

I am quite a bit outside of my comfort zone, working with xslt.
I would like to get the positive ratio between the height and width of an image. But I am having trouble even getting the parameters.
I tried this one:
<xsl:value-of select="php:functionString('getimagesize', image)"/></xsl:element>
But that of course just outputs "Array".
Is there a way to "break" the array similar to $size[1]?
You can create a DOMDocument or document fragment in your PHP code which contains the data you want to return, then you can use XPath on the XSLT side to select the data, here is an example:
<?php
function getDims($url) {
$info = getimagesize($url);
$doc = new DOMDocument();
$root = $doc->appendChild($doc->createElement('dimensions'));
$width = $doc->createElement('width', $info[0]);
$root->appendChild($width);
$height = $doc->createElement('height', $info[1]);
$root->appendChild($height);
return $doc;
}
$xml = <<<'EOB'
<root>
<image>foo.gif</image>
</root>
EOB;
$doc = new DOMDocument();
$doc->loadXML($xml);
$xsl = <<<'EOB'
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
xmlns:php="http://php.net/xsl"
exclude-result-prefixes="exsl php">
<xsl:output method="html" encoding="utf-8" indent="yes"/>
<xsl:template match="image">
<xsl:variable name="dimensions" select="php:function('getDims', string(.))/*"/>
<img width="{$dimensions/width}" height="{$dimensions/height}" src="{.}"/>
</xsl:template>
</xsl:stylesheet>
EOB;
$xsldoc = new DOMDocument();
$xsldoc->loadXML($xsl);
$proc = new XSLTProcessor();
$proc->registerPHPFunctions();
$proc->importStyleSheet($xsldoc);
echo $proc->transformToXML($doc);
?>

Save and Display a XML file created by transforming values of another XML using Xsl file.

Hi, I have an XML file and I'm transforming its values with an XSL file in PHP. I want to display the new values as XML. I can echo the values in PHP, but when I send the header I get an error saying "junk after element", and I cannot call saveXML() on the returned string.
I also need to save the returned information in a new XML file, which I have no idea how to do. Please help me with this. Thank you in advance.
php file
<?php
//header('Content-Type: text/xml');
$xmlDoc = new DOMDocument('1.0');
$xmlDoc->formatOutput = true;
$xmlDoc->load("rental.xml");
$xslDoc = new DomDocument('1.0');
$xslDoc->load("apartment.xsl");
$proc = new XSLTProcessor;
$proc->importStyleSheet($xslDoc);
$strXml= $proc->transformToXML($xmlDoc);
//echo (toXml($strXml));
echo ($strXml);
//echo $strXml->saveXML();
?>
xsl file
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<xsl:template match="/">
<xsl:for-each select="//property">
<xsl:element name="rentalProperties">
<xsl:element name="description">
<xsl:value-of select="description"/>
</xsl:element>
</xsl:element>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This code will help you save the XML string to a file.
<?php
//header('Content-Type: text/xml');
$xmlDoc = new DOMDocument('1.0');
$xmlDoc->formatOutput = true;
$xmlDoc->load("rental.xml");
$xslDoc = new DomDocument('1.0');
$xslDoc->load("apartment.xsl");
$proc = new XSLTProcessor;
$proc->importStyleSheet($xslDoc);
$strXml= $proc->transformToXML($xmlDoc);
//echo (toXml($strXml));
echo $strXml; // reuse the result instead of running the transformation a second time
//$strXml->saveXML();
// Way to parse XML string and save to a file
$convertedXML = simplexml_load_string($strXml);
$convertedXML->saveXML("member.xml");
?>
Cheers.
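A possible cause of the "junk after element" error, stated as an assumption since rental.xml is not shown: the stylesheet emits one rentalProperties element per property with no common wrapper, and an XML document may have only one root element. If that is the case, parsing the result as XML fails as well. A minimal sketch of the difference (the sample data and the <rentals> wrapper name are hypothetical):

```php
<?php
// Collect parser errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Stand-in for a transformation result with two top-level elements.
$fragment = '<rentalProperties><description>A</description></rentalProperties>'
          . '<rentalProperties><description>B</description></rentalProperties>';

// Two root elements: not well-formed, so loadXML() returns false.
$bare = new DOMDocument();
$okBare = $bare->loadXML($fragment);

// The same content inside a single (hypothetical) root element parses fine.
$wrapped = new DOMDocument();
$okWrapped = $wrapped->loadXML('<rentals>' . $fragment . '</rentals>');

var_dump($okBare);    // bool(false)
var_dump($okWrapped); // bool(true)
libxml_clear_errors();
```

In the stylesheet, the equivalent change would be to put the xsl:for-each inside one literal wrapper element in the match="/" template.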

XInclude not evaluated in XSLT transformation using PHP

I am trying to include different source files (e.g. file1.xml and file2.xml) and have these includes resolved for an XSLT transformation using PHP's XSLTProcessor. This is my input:
source.xml
<?xml version="1.0" encoding="utf-8" ?>
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="file1.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="file2.xml" />
</root>
transform.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xi="http://www.w3.org/2001/XInclude">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
</xsl:transform>
transform.php
<?php
function transform($xml, $xsl) {
global $debug;
// XSLT Stylesheet laden
$xslDom = new DOMDocument("1.0", "utf-8");
$xslDom->load($xsl, LIBXML_XINCLUDE);
// XML laden
$xmlDom = new DOMDocument("1.0", "utf-8");
$xmlDom->loadHTML($xml); // loadHTML to handle possibly defective markup
$xsl = new XsltProcessor(); // create XSLT processor
$xsl->importStylesheet($xslDom); // load stylesheet
return $xsl->transformToXML($xmlDom); // transformation returns XML
}
exit(transform("source.xml", "transform.xsl"));
?>
My desired output is
<?xml version="1.0" encoding="utf-8" ?>
<root>
<!-- transformed contents of file1.xml -->
<!-- transformed contents of file2.xml -->
</root>
My current output is an exact copy of my source file:
<?xml version="1.0" encoding="utf-8" ?>
<root>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="file1.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="file2.xml" />
</root>
It turned out that I had forgotten one simple but important line in my PHP code: I have to call DOMDocument::xinclude() so that the includes are resolved before the transformation runs.
The full example:
<?php
function transform($xml, $xsl) {
global $debug;
// XSLT Stylesheet laden
$xslDom = new DOMDocument("1.0", "utf-8");
$xslDom->load($xsl, LIBXML_XINCLUDE);
// XML laden
$xmlDom = new DOMDocument("1.0", "utf-8");
$xmlDom->load($xml);
$xmlDom->xinclude(); // IMPORTANT!
$xsl = new XsltProcessor();
$xsl->importStylesheet($xslDom);
return $xsl->transformToXML($xmlDom);
}
exit(transform("source.xml", "transform.xsl"));
?>
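To see the effect of xinclude() in isolation, here is a small self-contained sketch. It assumes a POSIX-style temp directory and uses a hypothetical <chapter> element as the included content; it is only an illustration, not part of the original setup.

```php
<?php
// Write a small file to include (hypothetical content).
$part = tempnam(sys_get_temp_dir(), 'xi_');
file_put_contents($part, '<chapter>included text</chapter>');

// Source document that refers to the file by absolute path.
$source = '<root xmlns:xi="http://www.w3.org/2001/XInclude">'
        . '<xi:include href="' . $part . '"/>'
        . '</root>';

$doc = new DOMDocument();
$doc->loadXML($source);

// Before xinclude(): the xi:include element is still just an element.
$before = $doc->getElementsByTagName('chapter')->length;

// Resolve the includes in place.
$doc->xinclude();

// After xinclude(): the included content is part of the tree.
$after = $doc->getElementsByTagName('chapter')->length;

echo $before, ' -> ', $after, "\n";
echo $doc->saveXML();

unlink($part); // clean up the temporary file
```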

How to sort a xml file using DOM

I have an XML file structured like this:
<?xml version="1.0"?>
<library>
<book id="1003">
<title>Jquery MVC</title>
<author>Me</author>
<price>500</price>
</book>
<book id="1001">
<title>Php</title>
<author>Me</author>
<price>600</price>
</book>
<book id="1002">
<title>Where to use IFrame</title>
<author>Me</author>
<price>300</price>
</book>
</library>
To sort this XML by book id, I reviewed this method from Stack Overflow and wrote the following code:
$dom = new DOMDocument();
$dom->load('DOM.xml');
$library = $dom->documentElement;
$xpath = new DOMXPath($dom);
$result = $xpath->query('/library/book');
function sort_trees($t1,$t2){
return strcmp($t1['id'], $t2['id']);
}
usort($result, 'sort_trees');
print_r($result);
But it gives me an error
Warning: usort() expects parameter 1 to be array, object given in /var/www/html/testphp/phpxml/readxml.php on line 24
The answer you cite is for SimpleXML, but you are using DOMDocument.
If you want to continue to use DOMDocument you need to keep its API in mind.
$dom = new DOMDocument();
$dom->load('DOM.xml');
$xp = new DOMXPath($dom);
$booklist = $xp->query('/library/book');
// $booklist is a DOMNodeList, not an array.
// This is the reason for your usort() warning.
// iterator_to_array() copies the DOMNode elements of the DOMNodeList into an array.
$books = iterator_to_array($booklist);
// Second, your sorting function is using the wrong API
// $node['id'] is SimpleXML syntax for attribute access.
// DOMElement uses $node->getAttribute('id');
function sort_by_numeric_id_attr($a, $b)
{
return (int) $a->getAttribute('id') - (int) $b->getAttribute('id');
}
// Now usort()
usort($books, 'sort_by_numeric_id_attr');
// verify:
foreach ($books as $book) {
echo $book->C14N(), "\n";
}
If you need to create a new output document with the nodes sorted, create a new document, import the root element, then import the book nodes in sorted order and add to the document.
$newdoc = new DOMDocument('1.0', 'UTF-8');
$libraries = $newdoc->appendChild($newdoc->importNode($dom->documentElement));
foreach ($books as $book) {
$libraries->appendChild($newdoc->importNode($book, true));
}
echo $newdoc->saveXML();
However, a much better approach is to use XSLT:
<?xml version="1.0" encoding="UTF-8" ?>
<!-- file "sort_by_numeric_id.xsl" -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding="UTF-8" method="xml" />
<xsl:template match="node()|@*">
<xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
</xsl:template>
<xsl:template match="/*">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="*">
<xsl:sort select="@id" data-type="number"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Then use XSLTProcessor (or xsltproc from the command line):
$xsltdoc = new DOMDocument();
$xsltdoc->load('sort_by_numeric_id.xsl');
$xslt = new XSLTProcessor();
$xslt->importStyleSheet($xsltdoc);
// You can now use $xslt->transformTo*() methods over and over on whatever documents you want
$libraryfiles = array('library1.xml', 'library2.xml');
foreach ($libraryfiles as $lf) {
$doc = new DOMDocument();
$doc->load($lf);
// write the new document
$xslt->transformToUri($doc, 'file://'.preg_replace('/(\.[^.]+)?$/', '-sorted$0', $lf, 1));
unset($doc); // just to save memory
}
(copied over from your duplicate question) This works
$dom = new DOMDocument();
$dom->load('dom.xml');
$xp = new DOMXPath($dom);
$booklist = $xp->query('/library/book');
$books = iterator_to_array($booklist);
function sort_by_numeric_id_attr($a, $b)
{
return (int) $a->getAttribute('id') - (int) $b->getAttribute('id');
}
usort($books, 'sort_by_numeric_id_attr');
$newdom = new DOMDocument("1.0");
$newdom->formatOutput = true;
$root = $newdom->createElement("library");
$newdom->appendChild($root);
foreach ($books as $b) {
$node = $newdom->importNode($b, true);
$root->appendChild($node);
}
$newdom->save('DOM2.xml');
As you can see, you need to create a new DOMDocument, and add the sorted children to it (via importNode to copy the DOMNodes from one DOMDocument to another).
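If a second document is not required, an in-place variant is also possible. This is a sketch under the assumption that all <book> elements are direct children of the root, with the data shortened to the id attributes: appendChild() moves a node that is already in the tree, so re-appending the books in sorted order reorders them inside the original document.

```php
<?php
$xml = '<library><book id="1003"/><book id="1001"/><book id="1002"/></library>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// Copy the live node list into an array so it can be sorted.
$books = iterator_to_array($dom->getElementsByTagName('book'));

// Compare the id attributes numerically.
usort($books, function ($a, $b) {
    return (int) $a->getAttribute('id') <=> (int) $b->getAttribute('id');
});

// Re-appending an existing node moves it to the end of its parent,
// so doing this in sorted order leaves the children sorted.
foreach ($books as $book) {
    $dom->documentElement->appendChild($book);
}

$sorted = $dom->saveXML($dom->documentElement);
echo $sorted, "\n";
// <library><book id="1001"/><book id="1002"/><book id="1003"/></library>
```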
