PHP Move DOMDocument nodes to a new parent

I have an XML file from a client that is not quite in the form I need, so I have to rewrite it.
This is what I have:
<artikel>
<kop>
<titel>Artikel 2.</titel>
</kop>
<lid>
<lidnr>1</lidnr>
<al>content</al>
</lid>
<lid>
<lidnr>2</lidnr>
<al>content</al>
</lid>
</artikel>
And this is what I need:
<artikel>
<kop>
<titel>Artikel 2.</titel>
</kop>
<leden>
<lid>
<lidnr>1</lidnr>
<al>content</al>
</lid>
<lid>
<lidnr>2</lidnr>
<al>content</al>
</lid>
</leden>
</artikel>
I do not know XML very well, so I am stuck. I think this needs to be done:
1) create a new parent node "leden"
2) per "lid": add "lid" to the "leden" node and remove it from the "artikel" node
3) add the new "leden" node after the "kop" node
This is what I have so far:
$dom->load($publicatieurl_xml);
$artikels = $dom->getElementsByTagName('artikel');
foreach ($artikels as $key => $artikel) {
$lidNodes = $artikel->getElementsByTagName('lid');
if ( $lidNodes->length !== 0 ) {
$new_parent_node = $dom->createElement('leden');
foreach ( $lidNodes as $key => $lid ) {
$new_parent_node->appendChild( $lid );
}
echo ($new_parent_node->ownerDocument->saveXML($new_parent_node));
}
}
The line that does not work is $new_parent_node->appendChild( $lid );
because $lid is an object.
So what I need to know is:
1) How can I add the already existing XML element "$lid" to my "leden" node?
2) How do I remove the "lid" nodes? With yet another foreach loop? I cannot remove them in the loop where I append $lid, because that breaks the foreach iteration...

I would use XSLT for that. First create the stylesheet document:
translate.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<artikel>
<xsl:copy-of select="/artikel/kop" />
<leden>
<xsl:copy-of select="/artikel/lid" />
</leden>
</artikel>
</xsl:template>
</xsl:stylesheet>
Now comes the PHP code:
// Load input from customer. (Can be an http:// url if desired)
$input = new DOMDocument();
$input->load('input.xml');
// Load the stylesheet document
$xsl = new DOMDocument();
$xsl->load('translate.xsl');
$xsltproc = new XSLTProcessor();
$xsltproc->importStylesheet($xsl);
// transformToXML() returns the translated xml as a string
echo $xsltproc->transformToXML($input);
// ... or transformToDoc() can be used if you need to
// further process the translated xml.
$newdoc = $xsltproc->transformToDoc($input);
By the way, if you don't want to store the XSL in a separate file, you can use DOMDocument::loadXML() to load it:
$xsl = new DOMDocument();
$xsl->loadXML(<<<EOF
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<artikel>
<xsl:copy-of select="/artikel/kop" />
<leden>
<xsl:copy-of select="/artikel/lid" />
</leden>
</artikel>
</xsl:template>
</xsl:stylesheet>
EOF
);
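For a pure DOM route that follows the three steps listed in the question, here is a minimal sketch. It assumes the sample structure shown above (inline XML instead of the client file) and relies on two documented DOM behaviours: appendChild() moves a node that is already in the document, and the list returned by getElementsByTagName() is live, so it is copied into a plain array before any node is moved.

```php
<?php
// Sample input taken from the question; in practice this would come from $dom->load(...).
$xml = <<<'XML'
<artikel>
<kop><titel>Artikel 2.</titel></kop>
<lid><lidnr>1</lidnr><al>content</al></lid>
<lid><lidnr>2</lidnr><al>content</al></lid>
</artikel>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

foreach ($dom->getElementsByTagName('artikel') as $artikel) {
    // Snapshot the live node list; moving nodes while iterating it would skip entries.
    $lidNodes = iterator_to_array($artikel->getElementsByTagName('lid'));
    if (count($lidNodes) === 0) {
        continue;
    }

    // Step 1: create the new parent.
    $leden = $dom->createElement('leden');

    // Step 3: put <leden> where the first <lid> currently is (that is, after <kop>).
    $lidNodes[0]->parentNode->insertBefore($leden, $lidNodes[0]);

    // Step 2: appendChild() moves each existing <lid>, so no separate removal is needed.
    foreach ($lidNodes as $lid) {
        $leden->appendChild($lid);
    }
}

echo $dom->saveXML($dom->documentElement);
```

Because the nodes are moved rather than copied, the second question (how to remove the old "lid" nodes) does not arise.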

Related

How to merge two xml files by ID (in first as subnode value, in second as attribute)

I have two XML files with this structure:
first.xml
<items>
<item>
<id>foo</id>
<desc>lorem ipsum</desc>
</item>
<item>
<id>boo</id>
<desc>lorem ipsum</desc>
</item>
</items>
second.xml
<item_list>
<item id="foo">
<stock_quantity>20</stock_quantity>
</item>
<item id="boo">
<stock_quantity>11</stock_quantity>
</item>
</item_list>
and I need to combine them by the id so the output file would look like this:
output.xml
<items>
<item>
<id>foo</id>
<desc>lorem ipsum</desc>
<stock_quantity>20</stock_quantity>
</item>
<item>
<id>boo</id>
<desc>lorem ipsum</desc>
<stock_quantity>11</stock_quantity>
</item>
</items>
I need to use PHP and XML DOMDocument. Do you have any idea how to do this?
You can use the SimpleXML library to achieve that:
// loading xml to object from file
$xml1 = simplexml_load_file("first.xml") or die("Error: Cannot create object");
$xml2 = simplexml_load_file("second.xml") or die("Error: Cannot create object");
// iterate over the <item> children of first.xml
foreach ($xml1->children() as $items1) {
$id = trim($items1->id); // trim the id so it can be compared with the id attribute in second.xml
foreach ($xml2->children() as $items2) { // iterate over the <item> children of second.xml
if ($items2[0]['id'] == $id) { // compare the id attribute in second.xml with the <id> value from first.xml
foreach ($items2 as $key => $value) {
$items1->addChild($key, (string) ($value)); // add each child to the first.xml item
}
}
}
}
$xml1->asXml('output.xml'); // write output.xml; see https://www.php.net/manual/en/simplexmlelement.asxml.php
Using DOMDocument and its ability to copy nodes from one document to another allows you to insert the node from the stock file directly into the main XML.
Rather than looping to find the matching record, this uses XPath to search for it: the expression //item[@id='boo']/stock_quantity says find the <stock_quantity> element in the <item> element with an attribute of id='boo'.
$main = new DOMDocument();
$main->load("main.xml");
$add = new DOMDocument();
$add->load("stock.xml");
$searchAdd = new DOMXPath($add);
// Find the list of items
$items = $main->getElementsByTagName("item");
foreach ( $items as $item ) {
// Extract the value of the id node
$id = $item->getElementsByTagName("id")[0]->nodeValue;
// Find the corresponding node in the stock file
$stockQty = $searchAdd->evaluate("//item[@id='{$id}']/stock_quantity");
// Import the <stock_quantity> node (and all contents)
$copy = $main->importNode($stockQty[0], true);
// Add the imported node
$item->appendChild($copy);
}
echo $main->saveXML();
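The same importNode() idea can be tried without any files. The sketch below is an illustration only: it loads the two sample documents from inline strings, adds a hypothetical third item ("zoo") that has no stock entry, and skips items for which the XPath query finds nothing instead of passing a missing node to importNode(). It assumes ids contain no quote characters, since they are interpolated into the XPath expression.

```php
<?php
// Inline copies of first.xml and second.xml; the "zoo" item is a made-up extra case.
$first = '<items>'
    . '<item><id>foo</id><desc>lorem ipsum</desc></item>'
    . '<item><id>boo</id><desc>lorem ipsum</desc></item>'
    . '<item><id>zoo</id><desc>no stock entry</desc></item>'
    . '</items>';
$second = '<item_list>'
    . '<item id="foo"><stock_quantity>20</stock_quantity></item>'
    . '<item id="boo"><stock_quantity>11</stock_quantity></item>'
    . '</item_list>';

$main = new DOMDocument();
$main->loadXML($first);
$add = new DOMDocument();
$add->loadXML($second);
$searchAdd = new DOMXPath($add);

foreach ($main->getElementsByTagName('item') as $item) {
    $id = trim($item->getElementsByTagName('id')->item(0)->nodeValue);

    // Look up the matching <stock_quantity> in the second document.
    $matches = $searchAdd->query("//item[@id='{$id}']/stock_quantity");
    if ($matches->length === 0) {
        continue; // no stock record for this id, leave the item unchanged
    }

    // importNode() copies the node into $main's document before it is appended.
    $item->appendChild($main->importNode($matches->item(0), true));
}

echo $main->saveXML();
```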
Consider XSLT, a special-purpose language (like SQL) designed to transform XML files. Like many general-purpose languages, PHP can run XSLT 1.0 through the XSLTProcessor class of its xsl extension (the extension must be enabled in php.ini).
XSLT (save as an .xsl file, which is itself an XML file; the stylesheet below assumes the second XML is in the same directory)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- IDENTITY TRANSFORM -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- ADD NODE BY CORRESPONDING id VALUE -->
<xsl:template match="item">
<xsl:copy>
<xsl:variable name="curr_id" select="id"/>
<xsl:apply-templates select="@*|node()"/>
<xsl:copy-of select="document('second.xml')/item_list/item[@id = $curr_id]/*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
PHP (reference only first XML)
// Load the XML source and XSLT file
$xml = new DOMDocument;
$xml->load('first.xml');
$xsl = new DOMDocument;
$xsl->load('XSLTScript.xsl');
// Configure transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// Transform XML source
// transformToXML() returns the result as a string
$newXML = $proc->transformToXML($xml);
echo $newXML;
// Save output to file
$xmlfile = 'output.xml';
file_put_contents($xmlfile, $newXML);

Split XML files with PHP not outputting top level parent node

I'm trying to separate an XML file into two files, longrentals.xml and shortrentals.xml but have hit a last hurdle I'm stuck on. The following is what I would like to happen:
rentals.xml is parsed and for each instance of term = "short" the top parent "property" node of that entry is saved to shortrentals.xml.
Each instance is removed from the rentals.xml file (after extracting).
The shortrentals.xml file is saved.
The remaining entries in the original file are saved to longrentals.xml.
The XML structure is as follows:
<property>
...
<rent>
<term>short</term>
<freq>week</freq>
<price_peak>5845</price_peak>
<price_high>5845</price_high>
<price_medium>4270</price_medium>
<price_low>3150</price_low>
</rent>
...
</property>
The code I'm using is as follows:
$destination = new DOMDocument;
$destination->preserveWhiteSpace = true;
$destination->loadXML('<?xml version="1.0" encoding="utf-8"?><root></root>');
$source = new DOMDocument;
$source->load('file/rentals.xml');
$xp = new DOMXPath($source);
$destRoot = $destination->getElementsByTagName("root")->item(0);
foreach ($xp->query('/root/property/rent[term = "short"]') as $item) {
$newItem = $destination->importNode($item, true);
$destRoot->appendChild($newItem);
$item->parentNode->removeChild($item);
}
$source->save("file/longrentals.xml");
$destination->formatOutput = true;
$destination->save("file/shortrentals.xml");
This works except the output in shortrentals.xml only contains the rent node not the top level parent Property node. Also the removed entry from longrentals.xml only removes the Rent child node. So, how do I go up a level using my code please?
You can use the parentNode property of a DOMNode to go up a level in the structure (similar to how you do it in the removeChild code)...
foreach ($xp->query('/root/property/rent[term = "short"]') as $item) {
$property = $item->parentNode;
$newItem = $destination->importNode($property, true);
$destRoot->appendChild($newItem);
$property->parentNode->removeChild($property);
}
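A variant worth considering is to let the XPath expression select the property elements directly, using the rent/term test as a predicate. Each matching property is then returned once, even if it were to contain more than one short-term rent entry. The sketch below uses a tiny inline document; the ref elements are hypothetical placeholders for the parts of the structure elided in the question.

```php
<?php
// Minimal stand-in for file/rentals.xml; <ref> is an invented child for illustration.
$source = new DOMDocument();
$source->loadXML(
    '<root>'
    . '<property><ref>A</ref><rent><term>short</term></rent></property>'
    . '<property><ref>B</ref><rent><term>long</term></rent></property>'
    . '</root>'
);

$destination = new DOMDocument();
$destination->loadXML('<root/>');
$destRoot = $destination->documentElement;

$xp = new DOMXPath($source);

// Select the <property> itself; the predicate checks its rent/term child.
foreach ($xp->query('/root/property[rent/term = "short"]') as $property) {
    $destRoot->appendChild($destination->importNode($property, true));
    $property->parentNode->removeChild($property);
}

echo $destination->saveXML(); // short rentals
echo $source->saveXML();      // what remains: long rentals
```

DOMXPath::query() returns a static result list, so removing nodes inside this loop does not disturb the iteration.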
Alternatively, consider XSLT, the special-purpose XML transformation language, to create both XML files without foreach loops. Here, the XSLT is embedded as a string, but it can also be loaded from a file like any other XML document. Assumed XML structure: <root><property><rent>...
shortrentals.xml output
// Load XML and XSL sources
$xml = new DOMDocument;
$xml->load('file/rentals.xml');
$xslstr = '<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/root">
<xsl:copy>
<xsl:apply-templates select="property[rent/term=\'short\']"/>
</xsl:copy>
</xsl:template>
<xsl:template match="property">
<xsl:copy>
<xsl:copy-of select="*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>';
$xsl = new DOMDocument;
$xsl->loadXML($xslstr);
// Configure transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// Transform XML source
// transformToXML() returns the result as a string
$newXML = $proc->transformToXML($xml);
// Output file
file_put_contents('file/shortrentals.xml', $newXML);
longrentals.xml (Using Identity Transform and empty template to remove nodes)
// Load XML and XSL sources
$xml = new DOMDocument;
$xml->load('file/rentals.xml');
$xslstr = '<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Remove Short-Term Properties -->
<xsl:template match="property[rent/term=\'short\']"/>
</xsl:stylesheet>';
$xsl = new DOMDocument;
$xsl->loadXML($xslstr);
// Configure transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// Transform XML source
// transformToXML() returns the result as a string
$newXML = $proc->transformToXML($xml);
// Output file
file_put_contents('file/longrentals.xml', $newXML);

PHP: remove node from xml by attribute

suppose I have an xml like this:
<products>
<product id="1">
<name>aaa</name>
<producturl>aaa</producturl>
<bigimage>aaa</bigimage>
<description>aaa</description>
<price>aaa</price>
<categoryid1>aaa</categoryid1>
<instock>aaa</instock>
</product>
<product id="2">
<name>aaa</name>
<producturl>aaa</producturl>
<bigimage>aaa</bigimage>
<description>aaa</description>
<price>aaa</price>
<categoryid1>aaa</categoryid1>
<instock>aaa</instock>
</product>
</products>
and I need to delete certain nodes depending on the id attribute, based on whether this attribute is in an array.
I've tried different ways, but the XML is always output unchanged!
My code so far:
<?php header("Content-type: text/xml");
$url="http://www.aaa.it/aaa.xml";
$url=file_get_contents($url);
$array=array("1","4","5");
$doc=new SimpleXMLElement($url);
foreach($doc->product as $product){
if(!in_array($product['id'],$array)){
$dom=dom_import_simplexml($product);
$dom->parentNode->removeChild($dom);
// unset($doc->product->$product);
}
}
echo $doc->asXml(); ?>
Thanks a lot everyone.
Consider a combined XPath and XSLT solution; both belong to the Extensible Stylesheet Language (XSL) family. XPath is first used to retrieve all current product ids, which are then compared with the array of ids to keep using array_diff. An XSLT stylesheet is then built for each unmatched id to remove the corresponding node. Removing nodes in XSLT simply requires an empty template match.
// Load the XML source
header("Content-type: text/xml");
$url="http://www.aaa.it/aaa.xml";
$url=file_get_contents($url);
$doc=new SimpleXMLElement($url);
// Retrieve all XML product ids with XPath
$xpath = $doc->xpath("//product/@id");
$xmlids = [];
foreach($xpath as $item => $value){ $xmlids[] = (string)$value; }
// Compare difference with $array
$array = array("1","4","5");
$removeids = array_diff($xmlids, $array);
// Dynamically build XSLT string for each resulting id
foreach($removeids as $id){
$xslstr='<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="product[@id=\''.$id.'\']"/>
</xsl:transform>';
$xsl = new SimpleXMLElement($xslstr);
// Configure the transformer and run
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
$newXML = $proc->transformToXML($doc);
// Adjust $doc object with each loop
$doc = new SimpleXMLElement($newXML);
}
// Echo Output
echo $doc->asXML();
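If XSLT feels heavy for this, a plain DOM loop may be enough. The likely reason the loop in the question misbehaves is that nodes are removed from the collection while it is being iterated. The sketch below, which uses a shortened inline sample instead of the remote URL, copies the node list into an array first. Like the code in the question, it treats the array as the ids to keep and removes every product whose id is not in it.

```php
<?php
// Shortened sample; the real document would be fetched with file_get_contents().
$xml = '<products>'
    . '<product id="1"><name>aaa</name></product>'
    . '<product id="2"><name>bbb</name></product>'
    . '<product id="4"><name>ccc</name></product>'
    . '</products>';

$keep = array("1", "4", "5");

$doc = new DOMDocument();
$doc->loadXML($xml);

// Snapshot the live node list before removing anything from the document.
$products = iterator_to_array($doc->getElementsByTagName('product'));

foreach ($products as $product) {
    // getAttribute() returns a string, so a strict comparison against string ids works.
    if (!in_array($product->getAttribute('id'), $keep, true)) {
        $product->parentNode->removeChild($product);
    }
}

echo $doc->saveXML();
```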

How do I clone Distinct XML structures without data in PHP?

I have an XML document that looks like this:
<root>
<node/>
<node>
<sub>more</sub>
</node>
<node>
<sub>another</sub>
</node>
<node>value</node>
</root>
Here's my pseudo-code:
import xml.
create empty-xml.
foreach child of imported-xml-root-node,
recursively clone node structure without data.
if clone does not match one already in empty-xml,
then add clone to empty-xml.
I'm trying to get a result that looks like this:
<root>
<node/>
<node>
<sub/>
</node>
</root>
Note that my piddly example data is only 3 nodes deep. In production, there will be an unknown number of descendants, so an acceptable answer needs to handle variable node depths.
Failed Approaches
I have reviewed The DOMNode class which has a cloneNode method with a recursive option that I would like to use, although it would take some extra work to purge the data. But while the class contains a hasChildNodes function which returns a boolean, I can't find a way to actually return the collection of children.
$doc = new DOMDocument();
$doc->loadXML($xml);
$root_node = $doc->documentElement;
if ( $root_node->hasChildNodes() ) {
// looking for something like this:
// foreach ($root_node->children() as $child)
// $doppel = $child->cloneNode(true);
}
Secondly, I have tried my hand with the SimpleXMLElement class, which does have an awesome children method. Although it lacks a recursive option, I built a simple function to get around that. But the class is missing a clone/copyNode method, and my function is bloating into something nasty to compensate. Now I'm considering combining the two classes so I have access to both SimpleXMLElement::children and DOMNode::cloneNode, but I can tell this is not going cleanly, and surely this problem can be solved better.
$sxe = new SimpleXMLElement($xml);
$indentation = 0;
function getNamesRecursive( $xml, &$indentation )
{
$indentation++;
foreach($xml->children() as $child) {
for($i=0;$i<$indentation;$i++)
echo "\t";
echo $child->getName() . "\n";
getNamesRecursive($child,$indentation);
}
$indentation--;
}
getNamesRecursive($sxe,$indentation);
Consider XSLT, the special-purpose language designed to transform XML files; PHP ships with an XSLT 1.0 processor. For the sample data, you simply need to keep the first two elements among their siblings and copy only the element tags, not their text.
XSLT (save as .xsl file to use below in php)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes" />
<xsl:strip-space elements="*"/>
<!-- Identity Transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Remove any element whose position among its siblings is greater than 2 -->
<xsl:template match="*[position() > 2]"/>
<!-- Copy only tags -->
<xsl:template match="/*/*/*">
<xsl:copy/>
</xsl:template>
</xsl:transform>
PHP
// LOAD XML AND XSL FILES
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->load('Input.xml');
$xslfile = new DOMDocument('1.0', 'UTF-8');
$xslfile->load('Script.xsl');
// TRANSFORM XML with XSLT
$proc = new XSLTProcessor;
$proc->importStyleSheet($xslfile);
$newXml = $proc->transformToXML($xml);
// ECHO OUTPUT STRING
echo $newXml;
# <root>
# <node/>
# <node>
# <sub/>
# </node>
# </root>
// NEW DOM OBJECT
$final = new DOMDocument('1.0', 'UTF-8');
$final->loadXML($newXml);
Well, here's my stinky solution. Suggestions for improvements or completely new, better answers are still very welcome.
$xml = '
<root>
<node/>
<node>
<sub>more</sub>
</node>
<node>
<sub>another</sub>
</node>
<node>value</node>
</root>
';
$doc = new DOMDocument();
$doc->loadXML($xml);
// clone without data
$empty_xml = new DOMDocument();
$empty_xml->appendChild($empty_xml->importNode($doc->documentElement));
function clone_without_data(&$orig, &$clone, &$clonedoc){
foreach ($orig->childNodes as $child){
if(get_class($child) === "DOMElement")
$new_node = $clone->appendChild($clonedoc->importNode($child));
if($child->hasChildNodes())
clone_without_data($child,$new_node,$clonedoc);
}
}
clone_without_data($doc->documentElement, $empty_xml->documentElement, $empty_xml);
// remove all duplicates
$distinct_structure = new DOMDocument();
$distinct_structure->appendChild($distinct_structure->importNode($doc->documentElement));
foreach ($empty_xml->documentElement->childNodes as $child){
$match = false;
foreach ($distinct_structure->documentElement->childNodes as $i => $element){
if ($distinct_structure->saveXML($element) === $empty_xml->saveXML($child)) {
$match = true;
break;
}
}
if (!$match)
$distinct_structure->documentElement->appendChild($distinct_structure->importNode($child,true));
}
$distinct_structure->formatOutput = true;
echo $distinct_structure->saveXML();
Which results in this output:
<?xml version="1.0"?>
<root>
<node/>
<node>
<sub/>
</node>
</root>
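The two passes above (clone without data, then remove duplicates) could probably be folded into one recursive function. The following is only a sketch under some assumptions: the helper name cloneStructure is made up, attributes and namespaces are ignored, and two children count as duplicates when their serialized empty structures are identical. Duplicates are filtered at every depth, not just directly under the root.

```php
<?php
// Hypothetical helper: rebuilds $source as an element of $target, keeping only
// element names and dropping text, attributes, and repeated sibling structures.
function cloneStructure(DOMElement $source, DOMDocument $target): DOMElement
{
    $copy = $target->createElement($source->nodeName);
    $seen = [];

    foreach ($source->childNodes as $child) {
        if (!$child instanceof DOMElement) {
            continue; // skip text nodes, comments, etc.
        }
        $childCopy = cloneStructure($child, $target);

        // The serialized structure acts as the key for detecting duplicates.
        $key = $target->saveXML($childCopy);
        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $copy->appendChild($childCopy);
        }
    }

    return $copy;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<root><node/><node><sub>more</sub></node>'
    . '<node><sub>another</sub></node><node>value</node></root>'
);

$result = new DOMDocument();
$result->formatOutput = true;
$result->appendChild(cloneStructure($doc->documentElement, $result));

echo $result->saveXML();
```

Since the function recurses over childNodes, it should cope with an arbitrary nesting depth, which was the requirement stated in the question.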

xml_parser extraction of attributes

I wonder whether it is possible to convert this XML
<url name="profile_link">http://example.com/profile/2345/</url>
into this HTML
<a href="http://example.com/profile/2345/">http://example.com/profile/2345/</a>
with the PHP XML Parser.
I do not understand how to fill the href in my link. The URL (i.e. the data content) is accessible via the xml_set_character_data_handler(), but the start handler (exchanging the url with the anchor) was already called before that event is triggered.
Here are two approaches for this:
Replace the nodes using DOM
Replacing nodes requires less bootstrap. It is done completely in PHP.
$xml = <<<'XML'
<url name="profile_link">http://example.com/profile/2345/</url>
XML;
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
$nodes = $xpath->evaluate('//url');
foreach ($nodes as $node) {
$link = $dom->createElement('a');
$link->appendChild($dom->createTextNode($node->textContent));
$link->setAttribute('href', $node->textContent);
$node->parentNode->insertBefore($link, $node);
$node->parentNode->removeChild($node);
}
var_dump($dom->saveXml($dom->documentElement));
Transform the XML using XSLT
The second approach requires an XSLT template. XSLT is a language designed to transform XML, so the initial bootstrap is larger, but the actual transformation is easier to define. I would suggest this approach if you need to do other transformations, too.
$xml = <<<'XML'
<url name="profile_link">http://example.com/profile/2345/</url>
XML;
$xsl = <<<'XSL'
<?xml version="1.0"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="url">
<a href="{text()}">
<xsl:value-of select="text()"/>
</a>
</xsl:template>
<!-- pass through for unknown tags in the xml tree -->
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="node()"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
XSL;
$dom = new DOMDocument();
$dom->loadXml($xml);
$xslDom = new DOMDocument();
$xslDom->loadXml($xsl);
$xsltProc = new XsltProcessor();
$xsltProc->importStylesheet($xslDom);
$result = $xsltProc->transformToDoc($dom);
var_dump($result->saveXml($result->documentElement));
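If the event-based XML Parser from the question has to stay, one common way around the ordering problem is to emit nothing in the start handler, collect the character data in a buffer, and build the link in the end handler, once the full URL is known. The sketch below assumes this buffering approach; the variable names $buffer and $html are illustrative.

```php
<?php
$xml = '<url name="profile_link">http://example.com/profile/2345/</url>';

$buffer = ''; // collects the text content of the current <url> element
$html = '';   // accumulated output

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Start handler: only reset the buffer, output is deferred.
    function ($parser, $name, $attrs) use (&$buffer) {
        if ($name === 'url') {
            $buffer = '';
        }
    },
    // End handler: the complete URL is available now, so the anchor can be built.
    function ($parser, $name) use (&$buffer, &$html) {
        if ($name === 'url') {
            $href = htmlspecialchars($buffer, ENT_QUOTES);
            $html .= '<a href="' . $href . '">' . $href . '</a>';
        }
    }
);

// Character data may arrive in several chunks, so append rather than assign.
xml_set_character_data_handler($parser, function ($parser, $data) use (&$buffer) {
    $buffer .= $data;
});

xml_parse($parser, $xml, true);

echo $html;
```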
