Using XPath to extract XML in PHP

I have the following XML:
<root>
<level name="level1">
<!-- More children <level> -->
</level>
<level name="level2">
<!-- Some more children <level> -->
</level>
</root>
How can I extract a <level> directly under <root> so that I can run an XPath query such as $xml->xpath('//some-query') relative to the extracted <level>?

DOMXPath::query's second parameter is the context node. Just pass the DOMNode instance you have previously "found" and your query runs "relative" to that node. E.g.
<?php
$doc = new DOMDocument;
$doc->loadxml( data() );
$xpath = new DOMXPath($doc);
$nset = $xpath->query('/root/level[@name="level1"]');
if ( $nset->length < 1 ) {
die('....no such element');
}
else {
$elLevel = $nset->item(0);
foreach( $xpath->query('c', $elLevel) as $elC) {
echo $elC->nodeValue, "\r\n";
}
}
function data() {
return <<< eox
<root>
<level name="level1">
<c>C1</c>
<a>A</a>
<c>C2</c>
<b>B</b>
<c>C3</c>
</level>
<level name="level2">
<!-- Some more children <level> -->
</level>
</root>
eox;
}
But unless you have to perform multiple separate (possibly complex) subsequent queries, this is most likely not necessary:
<?php
$doc = new DOMDocument;
$doc->loadxml( data() );
$xpath = new DOMXPath($doc);
foreach( $xpath->query('/root/level[@name="level1"]/c') as $c ) {
echo $c->nodeValue, "\r\n";
}
function data() {
return <<< eox
<root>
<level name="level1">
<c>C1</c>
<a>A</a>
<c>C2</c>
<b>B</b>
<c>C3</c>
</level>
<level name="level2">
<c>Ahh</c>
<a>ouch</a>
<c>no</c>
<b>wrxl</b>
</level>
</root>
eox;
}
has the same output using just one query.
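One detail worth checking when you pass a context node: an expression that starts with // is still absolute, so it searches the whole document and the context node has no effect. A minimal sketch, assuming a cut-down version of the XML above (the variable names are just for illustration):

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<root><level name="level1"><c>C1</c></level>'
    . '<level name="level2"><c>C2</c></level></root>'
);
$xpath = new DOMXPath($doc);
$level1 = $xpath->query('/root/level[@name="level1"]')->item(0);

// Absolute path: starts at the document root, so both <c> elements match.
$absolute = $xpath->query('//c', $level1);
// Relative path: only descendants of the context node match.
$relative = $xpath->query('.//c', $level1);

echo $absolute->length, "\n"; // 2
echo $relative->length, "\n"; // 1
```

So for queries relative to an extracted <level>, start the expression with a step name or with ./ or .// rather than // on its own.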

DOMXpath::evaluate() allows you to fetch node lists and scalar values from a DOM.
So you can fetch a value directly using an Xpath expression:
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
var_dump(
$xpath->evaluate('string(/root/level[@name="level2"]/@name)')
);
Output:
string(6) "level2"
The Xpath expression
All level element nodes in root:
/root/level
That have a specific name attribute:
/root/level[@name="level2"]
The value you would like to fetch (the name attribute, for validation):
/root/level[@name="level2"]/@name
Cast into a string; if no node was found the result will be an empty string:
string(/root/level[@name="level2"]/@name)
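The empty-string behaviour for a missing node can be checked with a small sketch. The inline XML and the non-existent "level9" name are assumptions made up for the example:

```php
<?php
$document = new DOMDocument();
$document->loadXML('<root><level name="level1"/><level name="level2"/></root>');
$xpath = new DOMXPath($document);

// The attribute exists, so string() returns its value.
$found = $xpath->evaluate('string(/root/level[@name="level2"]/@name)');
// No node matches, so string() returns an empty string instead of failing.
$missing = $xpath->evaluate('string(/root/level[@name="level9"]/@name)');

var_dump($found);   // string(6) "level2"
var_dump($missing); // string(0) ""
```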
Loop over nodes, use them as context
If you need to execute several expressions for the node, it might be better to fetch it separately and use foreach(). The second argument for DOMXpath::evaluate() is the context node.
foreach ($xpath->evaluate('/root/level[@name="level2"]') as $level) {
var_dump(
$xpath->evaluate('string(@name)', $level)
);
}
Node list length
If you need to handle that no node was found you can check the DOMNodeList::$length property.
$levels = $xpath->evaluate('/root/level[@name="level2"]');
if ($levels->length > 0) {
$level = $levels->item(0);
var_dump(
$xpath->evaluate('string(@name)', $level)
);
} else {
// no level found
}
count() expression
You can also check whether there are matching elements with a count() expression.
var_dump(
$xpath->evaluate('count(/root/level[@name="level2"])')
);
Output:
float(1)
Boolean result
It is possible to make that a condition in Xpath and return the boolean value.
var_dump(
$xpath->evaluate('count(/root/level[@name="level2"]) > 0')
);
Output:
bool(true)

Using querypath for parsing XML/HTML makes this all super easy.
$qp = qp($xml) ;
$levels = $qp->find('root')->eq(0)->find('level') ;
foreach($levels as $level ){
//do whatever you want with it , get its xpath , html, attributes etc.
$level->xpath() ; //
}
Excellent beginner tutorial for Querypath

This should work:
$dom = new DOMDocument;
$dom->loadXML($xml);
$levels = $dom->getElementsByTagName('level');
foreach ($levels as $level) {
$levelname = $level->getAttribute('name');
if ($levelname == 'level1') {
//do stuff
}
}
I personally prefer the DOMNodeList class for parsing XML.
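Note that getElementsByTagName() returns every matching descendant, so with nested <level> elements (as in the question) it also returns the inner ones. A rough sketch of one way to keep only the levels directly under <root>, assuming a small sample document invented for the example:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<root><level name="level1"><level name="nested"/></level>'
    . '<level name="level2"/></root>'
);

// getElementsByTagName() walks the whole subtree, nested levels included.
$all = $dom->getElementsByTagName('level');

// Keep only the elements whose parent is the document element (<root>).
$direct = [];
foreach ($all as $level) {
    if ($level->parentNode->isSameNode($dom->documentElement)) {
        $direct[] = $level->getAttribute('name');
    }
}

echo $all->length, "\n";          // 3
echo implode(',', $direct), "\n"; // level1,level2
```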

Related

PHP XML Trying to add stock_quantity by item id into main feed

I would like to merge two feeds. The first one has all the product data, with a product identifier ITEM_ID in every <SHOPITEM>. The second XML feed has the same value in <item id="">, and each <item> contains a stock_quantity tag, but I can't figure out how to merge these values. The three dots in the XML content mean that there are more item tags.
The first feed (items.xml) looks like:
<SHOP>
<SHOPITEM>
<DESCRIPTION>
<![CDATA[ <p><span>Just an description. </span></p> ]]>
</DESCRIPTION>
<URL>https://www.korkmaz.cz/tombik-cajova-konvice-2l/</URL>
<IMGURL>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52_konvice-tombik-1l.jpg?5f4fcd7d</IMGURL>
<IMGURL_ALTERNATIVE>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52-1_bez-trouby.jpg?5f4fcd7d</IMGURL_ALTERNATIVE>
<PURCHASE_PRICE>487,99</PURCHASE_PRICE>
<PRICE_VAT>797,00</PRICE_VAT>
<VAT>21%</VAT>
<CATEGORYTEXT>KUCHYŇSKÉ DOPLŇKY | Příprava čaje a kávy</CATEGORYTEXT>
<DELIVERY_DATE>0</DELIVERY_DATE>
<ITEM_ID>A093</ITEM_ID>
...
</SHOPITEM>
</SHOP>
The second feed (stock.xml) looks like:
<item_list>
<item id="A093">
<delivery_time orderDeadline="2021-09-14 12:00">2021-09-16 12:00</delivery_time>
<stock_quantity>32</stock_quantity>
...
</item>
</item_list>
So I am trying something like this (a similar method to the one I used when $item->ITEM_ID was in a separate tag in stock.xml), but it doesn't work for me.
<?php
$catalog_name = 'items.xml';
$catalog_url = 'https://admin.srovnej-ceny.cz/export/ca1b20bb6415b2d93ff36c9e3df3f96c.xml';
file_put_contents($catalog_name, fopen($catalog_url, 'r'));
$stock_name = 'stock.xml';
$stock_url = 'https://www.korkmaz.cz/heureka/export/availability.xml';
file_put_contents($stock_name, fopen($stock_url, 'r'));
$stocks=simplexml_load_file("stock.xml") or die("Error: Cannot create object");
foreach($stocks->children() as $item) {
$_stocks["" . $item['id'] . ""] = $item->stock_quantity;
}
$xml=simplexml_load_file("items.xml") or die("Error: Cannot create object");
$dom = new DOMDocument();
$dom->encoding = 'utf-8';
$dom->xmlVersion = '1.0';
$dom->formatOutput = true;
$xml_file_name = 'products.xml';
$root = $dom->createElement('SHOP');
$i=0;
foreach($xml->children() as $item) {
$item_node = $dom->createElement('SHOPITEM');
//$track = $xml->addChild('item');
$item_node->appendChild($dom->createElement('ITEM_ID', $item->ITEM_ID ));
$item_node->appendChild($dom->createElement('PRODUCTNAME', htmlspecialchars($item->PRODUCTNAME) ));
$item_node->appendChild($dom->createElement('DESCRIPTION', htmlspecialchars($item->DESCRIPTION)));
$item_node->appendChild($dom->createElement('MANUFACTURER', $item->MANUFACTURER));
$item_node->appendChild($dom->createElement('EAN', strval($item->EAN) ));
$item_node->appendChild($dom->createElement('IMGURL', strval($item->IMGURL)));
$item_node->appendChild($dom->createElement('PRICE_VAT', strval($item->PRICE_VAT)));
$item_node->appendChild( $dom->createElement('STOCK', $_stocks["" . $item['id'] . ""] ) );
$root->appendChild($item_node);
$i++;
}
$dom->appendChild($root);
$dom->save($xml_file_name);
echo "$i items to $xml_file_name has been successfully created";
?>

Without simplexml you can quite easily "merge" the two documents using the standard DOMDocument and DOMXPath functions.
Given input files as follows:
items.xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<SHOP>
<SHOPITEM>
<DESCRIPTION>
<![CDATA[ <p><span>Just an description. </span></p> ]]>
</DESCRIPTION>
<URL>https://www.korkmaz.cz/tombik-cajova-konvice-2l/</URL>
<IMGURL>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52_konvice-tombik-1l.jpg?5f4fcd7d</IMGURL>
<IMGURL_ALTERNATIVE>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52-1_bez-trouby.jpg?5f4fcd7d</IMGURL_ALTERNATIVE>
<PURCHASE_PRICE>487,99</PURCHASE_PRICE>
<PRICE_VAT>797,00</PRICE_VAT>
<VAT>21%</VAT>
<CATEGORYTEXT>KUCHYŇSKÉ DOPLŇKY | Příprava čaje a kávy</CATEGORYTEXT>
<DELIVERY_DATE>0</DELIVERY_DATE>
<ITEM_ID>A093</ITEM_ID>
</SHOPITEM>
<SHOPITEM>
<DESCRIPTION>
<![CDATA[ <p><span>Just an description. </span></p> ]]>
</DESCRIPTION>
<URL>https://www.korkmaz.cz/tombik-cajova-konvice-2l/</URL>
<IMGURL>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52_konvice-tombik-1l.jpg?5f4fcd7d</IMGURL>
<IMGURL_ALTERNATIVE>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52-1_bez-trouby.jpg?5f4fcd7d</IMGURL_ALTERNATIVE>
<PURCHASE_PRICE>1850,99</PURCHASE_PRICE>
<PRICE_VAT>2598,00</PRICE_VAT>
<VAT>21%</VAT>
<CATEGORYTEXT>KUCHYŇSKÉ DOPLŇKY | Příprava čaje a kávy</CATEGORYTEXT>
<DELIVERY_DATE>0</DELIVERY_DATE>
<ITEM_ID>A094</ITEM_ID>
</SHOPITEM>
<SHOPITEM>
<DESCRIPTION>
<![CDATA[ <p><span>Just an description. </span></p> ]]>
</DESCRIPTION>
<URL>https://www.korkmaz.cz/tombik-cajova-konvice-2l/</URL>
<IMGURL>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52_konvice-tombik-1l.jpg?5f4fcd7d</IMGURL>
<IMGURL_ALTERNATIVE>https://cdn.myshoptet.com/usr/www.korkmaz.cz/user/shop/orig/52-1_bez-trouby.jpg?5f4fcd7d</IMGURL_ALTERNATIVE>
<PURCHASE_PRICE>200,99</PURCHASE_PRICE>
<PRICE_VAT>300,00</PRICE_VAT>
<VAT>21%</VAT>
<CATEGORYTEXT>KUCHYŇSKÉ DOPLŇKY | Příprava čaje a kávy</CATEGORYTEXT>
<DELIVERY_DATE>0</DELIVERY_DATE>
<ITEM_ID>A095</ITEM_ID>
</SHOPITEM>
</SHOP>
stock.xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<item_list>
<item id="A093">
<delivery_time orderDeadline="2021-09-14 12:00">2021-09-16 12:00</delivery_time>
<stock_quantity>32</stock_quantity>
</item>
<item id="A094">
<delivery_time orderDeadline="2021-09-14 12:00">2021-09-16 12:00</delivery_time>
<stock_quantity>8366</stock_quantity>
</item>
<item id="A095">
<delivery_time orderDeadline="2021-09-14 12:00">2021-09-16 12:00</delivery_time>
<stock_quantity>6732</stock_quantity>
</item>
</item_list>
To generate the combined XML output based on matching item IDs:
<?php
/*
#
# XML file merge
#
Read "stock.xml" and find matching elements in "items.xml"
- update "items.xml" with nodes cloned from "stock.xml"
*/
function getdom($file){
libxml_use_internal_errors( true );
$dom=new DOMDocument;
$dom->validateOnParse=true;
$dom->recover=true;
$dom->strictErrorChecking=true;
$dom->preserveWhiteSpace=true;
$dom->formatOutput=true;
$dom->load($file);
libxml_clear_errors();
return $dom;
}
$items=getdom('items.xml');
$xpi=new DOMXPath($items);
$stock=getdom('stock.xml');
$xps=new DOMXPath($stock);
/*
If ALL nodes from "stock.xml" are to be merged per ID then set `$merge_only_selected=false`
`$merge_nodes` is an array of nodes from "stock.xml" that will be merged if `$merge_only_selected` is true
*/
$merge_only_selected=true;
$merge_nodes=array('stock_quantity','stock_supplier');
#Find all items in the "stock.xml" file to get the item ID
$col=$xps->query( '//item[@id]' );
foreach( $col as $node ){
#The ID from the "item"
$id=$node->getAttribute('id');
# nodelist of items within "items.xml" that have the same ID.
$item=$xpi->query( sprintf( '//SHOPITEM/ITEM_ID[ text()="%s" ]', $id ) );
# only proceed if we have found a matching node in "items.xml"
if( $item && $item->length > 0 ){
# Find the matched element
$obj=$item->item(0);
# Find the children from the "item"
$children=$node->childNodes;
# for each child found, clone it and import to the "items.xml" file
foreach( $children as $child ){
if( $child->nodeType==XML_ELEMENT_NODE && $id==$obj->nodeValue ){
if( $merge_only_selected==true && !in_array( $child->tagName, $merge_nodes ) ){
continue;
}
$clone=$child->cloneNode(true);
$obj->parentNode->appendChild( $items->importNode( $clone, true ) );
}
}
}
}
#To actually save the modified "items.xml" file:
#$items->save('items.xml');
#To simply view the changes:
printf('<textarea cols=150 rows=50>%s</textarea>',$items->saveXML() );
?>

This is a lot easier with DOM+Xpath. You can use DOMXpath::evaluate() to fetch node lists and scalar values from the XML.
An Xpath expression like /SHOP/SHOPITEM returns a DOMNodeList, which implements Traversable to support foreach().
But Xpath expressions can return scalar values as well: a boolean if they are a condition, or a string/number if they contain a type cast or function call. string(/item_list/item[@id="A093"]/stock_quantity) will return the text content of the first matching node, or an empty string if nothing matches.
DOMDocument::importNode() allows you to copy a node from another document. But in this case I would suggest creating a new node with a name matching the existing elements.
// bootstrap the XML
$shopDocument = new DOMDocument();
// ignoring pure whitespace nodes (indentation)
$shopDocument->preserveWhiteSpace = FALSE;
$shopDocument->loadXML(getShopXML());
$shopXpath = new DOMXpath($shopDocument);
$stocksDocument = new DOMDocument();
$stocksDocument->loadXML(getStocksXML());
$stocksXpath = new DOMXpath($stocksDocument);
// iterate the shop items
foreach ($shopXpath->evaluate('/SHOP/SHOPITEM') as $shopItem) {
// get the item ID
$itemID = $shopXpath->evaluate('string(ITEM_ID)', $shopItem);
$stockQuantity = 0;
if ($itemID !== '') {
// fetch the stock quantity using the item id
$stockQuantity = (int)$stocksXpath->evaluate(
"string(/item_list/item[@id = '$itemID']/stock_quantity)"
);
// check if there is a "STOCK_QUANTITY" element in the item
if ($shopXpath->evaluate('count(STOCK_QUANTITY) > 0', $shopItem)) {
// update it
foreach ($shopXpath->evaluate('STOCK_QUANTITY', $shopItem) as $quantity) {
$quantity->textContent = (string)$stockQuantity;
}
} else {
// add one
$shopItem
->appendChild($shopDocument->createElement('STOCK_QUANTITY'))
->textContent = (string)$stockQuantity;
}
}
}
$shopDocument->formatOutput = TRUE;
echo $shopDocument->saveXML();
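If the stock feed is large, evaluating one Xpath expression per shop item can add up. A possible variation, sketched here with a trimmed-down inline copy of stock.xml (the inline data and the $quantities name are assumptions for illustration), reads the quantities into an array once and then looks them up by ID:

```php
<?php
$stocksXml = <<<'XML'
<item_list>
  <item id="A093"><stock_quantity>32</stock_quantity></item>
  <item id="A094"><stock_quantity>8366</stock_quantity></item>
</item_list>
XML;

$stocksDocument = new DOMDocument();
$stocksDocument->loadXML($stocksXml);
$stocksXpath = new DOMXPath($stocksDocument);

// Build the id => quantity map in a single pass over the stock feed.
$quantities = [];
foreach ($stocksXpath->evaluate('/item_list/item[@id]') as $item) {
    $quantities[$item->getAttribute('id')] =
        (int)$stocksXpath->evaluate('string(stock_quantity)', $item);
}

// Unknown IDs fall back to 0, like the (int) cast of an empty string above.
var_dump($quantities['A094'] ?? 0); // int(8366)
var_dump($quantities['A999'] ?? 0); // int(0)
```

This also avoids putting the item ID into the Xpath expression string, which would break if an ID ever contained a single quote.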

What is the difference in PHP between DOM nodes and XMLreader->expand() Nodes?

I've rewritten a script that used the PHP DOM functions to iterate through an XML file with a structure like this:
<file>
<record>
<Source>
<SourcePlace>
<Country>Germany</Country>
</SourcePlace>
</Source>
<Person>
<Name>
<firstname>John</firstname>
<lastname>Doe</lastname>
</Name>
</Person>
</record>
<record>
..
</record>
</file>
I've replaced it with a script that uses XMLreader to find each separate record and turn that into a DOMdocument after which it is iterated through. Iteration was done by checking if the node had a child:
function findLeaves($node) {
echo "nodeType: ".$node->nodeType.", nodeName:". $node->nodeName."\n";
if($node->hasChildNodes() ) {
foreach($node->childNodes as $element) {
findLeaves($element);
}
} else {
// do something with the leaf
}
}
The problem is that the behaviour of the findLeaves() function has changed between the two. Under DOM a node without a value (like Source) had no #text childnodes. Output of above would be:
nodeType:1, nodeName:Source
nodeType:1, nodeName:SourcePlace
nodeType:1, nodeName:Country
nodeType:3, nodeName:#text
Under XMLreader this becomes:
nodeType: 1, nodeName:Source
nodeType: 3, nodeName:#text
nodeType: 1, nodeName:SourcePlace
nodeType: 3, nodeName:#text
nodeType: 1, nodeName:Country
I've checked the saveXML() result of the data before entering this function but it seems identical, barring some extra spaces. What could be the reason for the difference?
Code loading the file before the findleaves() function under DOM:
$xmlDoc = new DOMDocument();
$xmlDoc->preserveWhiteSpace = false;
$xmlDoc->load($file);
$xpath = new DOMXPath($xmlDoc);
$records = $xpath->query('//record');
foreach($records as $record) {
foreach ($xpath->query('.//Source', $record) as $source_record) {
findleaves($source_record);
}
}
Code loading the file before the findleaves() function under XMLreader:
$xmlDoc = new XMLReader();
$xmlDoc->open($file);
while ($xmlDoc->read()) {
if ($xmlDoc->nodeType == XMLReader::ELEMENT && $xmlDoc->name == 'record') {
$record_node = $xmlDoc->expand();
$recordDOM = new DomDocument();
$n = $recordDOM->importNode($record_node,true);
$recordDOM->appendChild($n);
$recordDOM->preserveWhiteSpace = false;
$xpath = new DOMXPath($recordDOM);
$records = $xpath->query('//record');
foreach($records as $record) {
foreach ($xpath->query('.//Source', $record) as $source_record) {
findleaves($source_record);
}
}
}
}

The property DOMDocument::$preserveWhiteSpace affects the load/parse functions. So if you use XMLReader::expand(), the property of the document has no effect, because you do not load an XML string into it.
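That the property only matters at parse time can be seen in a small sketch. The XML string is a shortened, indented copy of the <Source> element from the question, and the variable names are assumptions:

```php
<?php
$xml = "<Source>\n  <SourcePlace>\n    <Country>Germany</Country>\n"
    . "  </SourcePlace>\n</Source>";

$stripped = new DOMDocument();
$stripped->preserveWhiteSpace = false; // set before parsing: takes effect
$stripped->loadXML($xml);

$kept = new DOMDocument();
$kept->loadXML($xml);
$kept->preserveWhiteSpace = false;     // set after parsing: too late

// Only <SourcePlace> remains as a child of <Source>.
echo $stripped->documentElement->childNodes->length, "\n"; // 1
// Whitespace #text, <SourcePlace>, whitespace #text.
echo $kept->documentElement->childNodes->length, "\n";     // 3
```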
You're using Xpath already. .//*[not(*) and normalize-space(.) != ""] will select element nodes that have no element children and whose text content is not empty (ignoring white space).
Here is an example (including other optimizations):
$xml = <<<'XML'
<file>
<record>
<Source>
<SourcePlace>
<Country>Germany</Country>
</SourcePlace>
</Source>
<Person>
<Name>
<firstname>John</firstname>
<lastname>Doe</lastname>
</Name>
</Person>
</record>
</file>
XML;
$reader = new XMLReader();
$reader->open('data://text/plain;base64,'.base64_encode($xml));
$document = new DOMDocument();
$xpath = new DOMXpath($document);
// find first record
while ($reader->read() && $reader->localName !== 'record') {
continue;
}
while ($reader->localName === 'record') {
// expand node into prepared document
$record = $reader->expand($document);
// match elements without child elements and with non-empty text content
// (text nodes containing only white space are ignored)
$expression = './Source//*[not(*) and normalize-space() != ""]';
foreach ($xpath->evaluate($expression, $record) as $leaf) {
var_dump($leaf->localName, $leaf->textContent);
}
// move to the next record sibling
$reader->next('record');
}
$reader->close();
Output:
string(7) "Country"
string(7) "Germany"

how to get the different tag values when receiving response from an xml file

I have an XML file and a PHP file. I get a result from the XML file, but I am not able to get the values of the individual tags. What I want is the data from each tag separately. Any idea how to do it?
Here is the xml file:
<?xml version="1.0" encoding="utf-8"?>
<users xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<user>
<id>1</id>
<username>neem99</username>
<password>dbhcasvc</password>
<email>vgwdevwe@hfvuejd.com</email>
</user>
</users>
Sample php file:
$xp = new DOMXPath( $dom );
echo var_dump($xp);
$col = $xp->query( $query );
echo var_dump($col);
$array = array();
if( $col->length > 0 ){
foreach( $col as $node) echo $node->nodeValue;
}
result : 1 neem99 dbhcasvc vgwdevwe@hfvuejd.com

DOMXpath::evaluate() allows you to use Xpath expressions that return scalar values. string() casts a list of nodes to a string by returning the text content of the first node.
Demo:
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
// get first user id
var_dump($xpath->evaluate('string(/users/user/id)'));
//iterate all user nodes
foreach ($xpath->evaluate('/users/user') as $user) {
// get its username
var_dump($xpath->evaluate('string(username)', $user));
}
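Building on the same idea, each user can be collected into an array with one entry per tag. This is only a sketch: the inline XML is a shortened copy of the file from the question, and the email address is a placeholder.

```php
<?php
$xml = <<<'XML'
<users>
  <user>
    <id>1</id>
    <username>neem99</username>
    <email>user@example.com</email>
  </user>
</users>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

$users = [];
foreach ($xpath->evaluate('/users/user') as $user) {
    // Each expression is evaluated relative to the current <user> node.
    $users[] = [
        'id'       => (int)$xpath->evaluate('number(id)', $user),
        'username' => $xpath->evaluate('string(username)', $user),
        'email'    => $xpath->evaluate('string(email)', $user),
    ];
}

print_r($users);
```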

I would do it like this:
$doc = new DOMDocument;
@$doc->load('yourFileName.xml');
$user = $doc->getElementsByTagName('user');
foreach($user as $u){
echo 'nodeName:'.$u->nodeName.'; nodeValue:'.$u->nodeValue.PHP_EOL;
}
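The nodeValue of a <user> element is the concatenated text of all its children, which is the combined string from the question. To get the tags one by one, you can loop over the element children instead. A minimal sketch, assuming a shortened inline copy of the XML:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<users><user><id>1</id><username>neem99</username></user></users>'
);

$fields = [];
foreach ($doc->getElementsByTagName('user') as $user) {
    foreach ($user->childNodes as $child) {
        // Skip text nodes (indentation); keep element children only.
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $fields[$child->nodeName] = $child->textContent;
        }
    }
}

print_r($fields);
```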

query multi namespace xml

xml:
<lev:Locatie axisLabels="x y" srsDimension="2" srsName="epsg:28992" uomLabels="m m">
<gml:exterior xmlns:gml="http://www.opengis.net/gml">
<gml:LinearRing>
<gml:posList>
222518.0 585787.0 222837.0 585875.0 223229.0 585969.0 223949.0 586123.0 223389.0 586579.0 223305.0 586564.0 222690.0 586464.0 222706.0 586319.0 222424.0 586272.0 222287.0 586313.0 222054.0 586517.0 221988.0 586446.0 222174.0 586305.0 222164.0 586292.0 222172.0 586202.0 222232.0 586143.0 222279.0 586149.0 222358.0 586076.0 222422.0 586018.0 222518.0 585787.0
</gml:posList>
</gml:LinearRing>
</gml:exterior>
</lev:Locatie>
I need to get to the gml:posList. I tried the following
SimpleXML:
$xmldata = new SimpleXMLElement($xmlstr);
$xmlns = $xmldata->getNamespaces(true);
$retval = array();
foreach( $xmldata as $attr => $child ) {
if ( (string)$child !== '' ) {
$retval[$attr] = (string)$child;
}
else {
$retval[$attr] = $child->children( $xmlns['gml'] );
}
}
var_export( $retval );
xpath:
$domdoc = new DOMDocument();
$domdoc->loadXML($xml );
$xpath = new DOMXpath($domdoc);
$xpath->registerNamespace('l', $xmlns['lev'] );
$xpath->registerNamespace('g', $xmlns['gml'] );
var_export( $xml->xpath('//g:posList') );
If I query the attributes for lev:Locatie, I can get them, however, I seem unable to retrieve the gml:posList's value or the attributes for e.g gml:exterior. I know I'm doing something wrong, I just don't see what ...

You're registering the namespaces on the DOMXpath instance, but use a SimpleXMLElement::xpath() call. That will not work. You can register them on the SimpleXMLElement using SimpleXMLElement::registerXpathNamespace() or you switch to DOM and use DOMXpath::evaluate(). The attributes do not have a prefix, so they are not in a namespace. gml:exterior does not have any attributes, only the namespace definition. It looks like an attribute but it is handled differently by the parser.
The nice thing about DOMXpath::evaluate() is that it can return a node list or a scalar, depending on the Xpath expression. So you can fetch a value directly.
For example the gml:posList:
$xmlString = <<<'XML'
<lev:Locatie axisLabels="x y" srsDimension="2" srsName="epsg:28992" uomLabels="m m" xmlns:lev="urn:lev">
<gml:exterior xmlns:gml="http://www.opengis.net/gml">
<gml:LinearRing>
<gml:posList>
222518.0 585787.0 222837.0
</gml:posList>
</gml:LinearRing>
</gml:exterior>
</lev:Locatie>
XML;
$document = new DOMDocument();
$document->loadXML($xmlString);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('g', 'http://www.opengis.net/gml');
var_export(
$xpath->evaluate('normalize-space(//g:posList)')
);
Output:
'222518.0 585787.0 222837.0'
normalize-space() is an Xpath function that replaces all sequences of whitespace with a single space and trims the result. Because it is a string function, it triggers an implicit cast of the first node from the location path.
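If you would rather stay with SimpleXML, the registerXPathNamespace() route mentioned above could look roughly like this. The sample is a shortened copy of the XML, and "urn:lev" is the same placeholder namespace URI used in the DOM example:

```php
<?php
$xmlString = <<<'XML'
<lev:Locatie srsName="epsg:28992" xmlns:lev="urn:lev">
  <gml:exterior xmlns:gml="http://www.opengis.net/gml">
    <gml:LinearRing>
      <gml:posList>222518.0 585787.0 222837.0</gml:posList>
    </gml:LinearRing>
  </gml:exterior>
</lev:Locatie>
XML;

$locatie = new SimpleXMLElement($xmlString);
// The prefix only has to match inside the expression, not the document.
$locatie->registerXPathNamespace('g', 'http://www.opengis.net/gml');

// xpath() returns an array of SimpleXMLElement objects.
$posLists = $locatie->xpath('//g:posList');
$posList = trim((string)$posLists[0]);

var_dump($posList); // string(26) "222518.0 585787.0 222837.0"
```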

How to sort content of an XML file loaded with SimpleXML?

There is an XML file with a content similar to the following:
<FMPDSORESULT xmlns="http://www.filemaker.com">
<ERRORCODE>0</ERRORCODE>
<DATABASE>My_Database</DATABASE>
<LAYOUT/>
<ROW MODID="1" RECORDID="1">
<Name>John</Name>
<Age>19</Age>
</ROW>
<ROW MODID="2" RECORDID="2">
<Name>Steve</Name>
<Age>25</Age>
</ROW>
<ROW MODID="3" RECORDID="3">
<Name>Adam</Name>
<Age>45</Age>
</ROW>
</FMPDSORESULT>
I tried to sort the ROW tags by the values of Name tags using array_multisort function:
$xml = simplexml_load_file( 'xml1.xml');
$xml2 = sort_xml( $xml );
print_r( $xml2 );
function sort_xml( $xml ) {
$sort_temp = array();
foreach ( $xml as $key => $node ) {
$sort_temp[ $key ] = (string) $node->Name;
}
array_multisort( $sort_temp, SORT_DESC, $xml );
return $xml;
}
But the code doesn't work as expected.

I would recommend using the DOM extension, as it is more flexible:
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->formatOutput = true;
$doc->load('xml1.xml');
// Get the root node
$root = $doc->getElementsByTagName('FMPDSORESULT');
if (!$root->length)
die('FMPDSORESULT node not found');
$root = $root[0];
// Pull the ROW tags from the document into an array.
$rows = [];
$nodes = $root->getElementsByTagName('ROW');
while ($row = $nodes->item(0)) {
$rows []= $root->removeChild($row);
}
// Sort the array of ROW tags
usort($rows, function ($a, $b) {
$a_name = $a->getElementsByTagName('Name');
$b_name = $b->getElementsByTagName('Name');
return ($a_name->length && $b_name->length) ?
strcmp(trim($a_name[0]->textContent), trim($b_name[0]->textContent)) : 0;
});
// Append ROW tags back into the document
foreach ($rows as $row) {
$root->appendChild($row);
}
// Output the result
echo $doc->saveXML();
Output
<?xml version="1.0"?>
<FMPDSORESULT xmlns="http://www.filemaker.com">
<ERRORCODE>0</ERRORCODE>
<DATABASE>My_Database</DATABASE>
<LAYOUT/>
<ROW MODID="3" RECORDID="3">
<Name>Adam</Name>
<Age>45</Age>
</ROW>
<ROW MODID="1" RECORDID="1">
<Name>John</Name>
<Age>19</Age>
</ROW>
<ROW MODID="2" RECORDID="2">
<Name>Steve</Name>
<Age>25</Age>
</ROW>
</FMPDSORESULT>
Regarding XPath
You can use DOMXPath for even more flexible traversing. However, in this specific problem the use of DOMXPath will not bring significant improvements, in my opinion. Anyway, I'll give examples for completeness.
Fetching the rows:
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('myns', 'http://www.filemaker.com');
$rows = [];
foreach ($xpath->query('//myns:ROW') as $row) {
$rows []= $row->parentNode->removeChild($row);
}
Appending the rows back into the document:
$root = $xpath->evaluate('/myns:FMPDSORESULT')[0];
foreach ($rows as $row) {
$root->appendChild($row);
}

Some SimpleXMLElement methods return arrays, but most return SimpleXMLElement objects, which implement Iterator. A var_dump() will only show part of the data in a simplified representation. However, it is an object structure, not a nested array.
If I understand you correctly you want to sort the ROW elements by the Name child. You can fetch them with the xpath() method, but you need to register a prefix for the namespace. It returns an array of SimpleXMLElement objects. The array can be sorted with usort.
$fResult = new SimpleXMLElement($xml);
$fResult->registerXpathNamespace('fm', 'http://www.filemaker.com');
$rows = $fResult->xpath('//fm:ROW');
usort(
$rows,
function(SimpleXMLElement $one, SimpleXMLElement $two) {
return strcasecmp($one->Name, $two->Name);
}
);
var_dump($rows);
In DOM that will not look much different, but DOMXpath::evaluate() returns a DOMNodeList. You can convert it into an array using iterator_to_array().
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('fm', 'http://www.filemaker.com');
$rows = iterator_to_array($xpath->evaluate('//fm:ROW'));
usort(
$rows,
function(DOMElement $one, DOMElement $two) use ($xpath) {
return strcasecmp(
// Name is in the default namespace, so it needs the registered prefix
$xpath->evaluate('normalize-space(fm:Name)', $one),
$xpath->evaluate('normalize-space(fm:Name)', $two)
);
}
);
var_dump($rows);
DOM has no magic methods to access children and values, but Xpath can be used to fetch them. The Xpath function string() converts the first node into a string. It returns an empty string if the node list is empty. normalize-space() does a little more: it replaces all groups of whitespace with a single space and strips whitespace from the start and end of the string.
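Sorting the array does not change the document by itself. One way to write the order back, sketched here with a shortened inline sample (the data is an assumption; note the fm: prefix on Name, because the elements are in the default namespace), is to append each row to its parent again. appendChild() moves a node that is already in the document:

```php
<?php
$xml = <<<'XML'
<FMPDSORESULT xmlns="http://www.filemaker.com">
<ROW RECORDID="1"><Name>John</Name></ROW>
<ROW RECORDID="2"><Name>Steve</Name></ROW>
<ROW RECORDID="3"><Name>Adam</Name></ROW>
</FMPDSORESULT>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
$xpath->registerNamespace('fm', 'http://www.filemaker.com');

$rows = iterator_to_array($xpath->evaluate('//fm:ROW'));
usort($rows, function (DOMElement $one, DOMElement $two) use ($xpath) {
    return strcasecmp(
        $xpath->evaluate('normalize-space(fm:Name)', $one),
        $xpath->evaluate('normalize-space(fm:Name)', $two)
    );
});

// Re-append in sorted order; each row is moved to the end of its parent.
foreach ($rows as $row) {
    $row->parentNode->appendChild($row);
}

// Read the names back in document order to check the result.
$names = [];
foreach ($xpath->evaluate('//fm:ROW/fm:Name') as $name) {
    $names[] = $name->textContent;
}
echo implode(',', $names), "\n"; // Adam,John,Steve
```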
