I need a recursive PHP function to loop through an XML file

I'm trying to loop through an XML file and save each node paired with its value into an array (key => value). I also want the key to keep track of the nodes it passed through (something like array("users_user_name" => "myName", "users_user_email" => "myEmail"), etc.).
I know how to do this, but there is a problem: any node could have children, and those children might have children of their own, so I need some sort of recursive function that keeps looping through the children until it reaches the last child.
So far I have this:
//loads the xml file and creates simpleXML object
$xml = simplexml_load_string($content);
// for each root value
foreach ($xml->children() as $children) {
// for each child of the root node
$node = $children;
while ($children->children()) {
foreach ($children as $child) {
if($child->children()){
break;
}
$children = $node->getName();
//Give key a name
$keyOfValue = $xml->getName() . "_" . $children . "_" . $child->getName();
// pass value from child to children
$children = $child;
// no children, fill array: key => value
if ($child->children() == false) {
$parent[$keyOfValue] = (string)$child;
}
}
}
$dataObject[] = $parent;
}
The "break;" is there to stop it from giving me wrong values when "child" is an object with children of its own rather than the last child.

Using recursion, you can write some 'complicated' processing, but the problem is losing your place.
The function I use here is passed a couple of things to keep track of the name and the current output, as well as the node it's currently working with. As you can see, the function checks whether there are any child nodes and, if so, calls itself again to process each one of them.
$content = <<< XML
<users>
<user>
<name>myName</name>
<email>myEmail</email>
<address><line1>address1</line1><line2>address2</line2></address>
</user>
</users>
XML;
function processNode ( $base, SimpleXMLElement $node, &$output ) {
$base[] = $node->getName();
$nodeName = implode("_", $base);
$childNodes = $node->children();
if ( count($childNodes) == 0 ) {
$output[ $nodeName ] = (string)$node;
}
else {
foreach ( $childNodes as $newNode ) {
processNode($base, $newNode, $output);
}
}
}
$xml = simplexml_load_string($content);
$output = [];
processNode([], $xml, $output);
print_r($output);
This prints out...
Array
(
[users_user_name] => myName
[users_user_email] => myEmail
[users_user_address_line1] => address1
[users_user_address_line2] => address2
)
With this implementation, there are limitations on the content: for example, repeated elements (say, multiple users) produce the same key, so only the last value is kept.
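One possible way around the repeated-content limitation, as a sketch only: number siblings that share a name so each one gets its own key. The function name flattenXml, the index suffix (user0, user1) and the sample XML are assumptions made for illustration, not part of the answer above.

```php
<?php
// Hypothetical variant of processNode(): repeated sibling names get a zero-based index suffix.
function flattenXml(SimpleXMLElement $node, array $path, array &$output): void
{
    $children = $node->children();

    // Leaf node: store its text under the accumulated path.
    if (count($children) === 0) {
        $output[implode("_", $path)] = (string) $node;
        return;
    }

    // Count how often each child name occurs, so only repeated names are numbered.
    $totals = [];
    foreach ($children as $child) {
        $name = $child->getName();
        $totals[$name] = ($totals[$name] ?? 0) + 1;
    }

    $seen = [];
    foreach ($children as $child) {
        $name = $child->getName();
        $segment = $name;
        if ($totals[$name] > 1) {
            $index = $seen[$name] ?? 0;
            $seen[$name] = $index + 1;
            $segment .= $index; // e.g. "user0", "user1"
        }
        flattenXml($child, array_merge($path, [$segment]), $output);
    }
}

$xml = simplexml_load_string(
    '<users><user><name>a</name></user><user><name>b</name></user></users>'
);
$output = [];
flattenXml($xml, [$xml->getName()], $output);
print_r($output);
```

With the two-user sample, the keys come out as users_user0_name and users_user1_name, so neither value overwrites the other.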

You'll want to use recursion!
Here's a simple example of recursion:
function doThing($param, $condition) {
// Do what you need to do
$param = alterParam($param);
// If there's more to do, do it again and pass the result back up
if ($param != $condition) {
return doThing($param, $condition);
}
// Otherwise, we are ready to return the result
return $param;
}
You can apply this thinking to your specific use case.

//Using SimpleXML library
// Parses XML but returns an Object for child nodes
public function getNodes($root)
{
$output = array();
if($root->children()) {
$children = $root->children();
foreach($children as $child) {
if(!($child->children())) {
$output[] = (array) $child;
}
else {
// recurse into the child itself; the method reads its children on entry
$output[] = self::getNodes($child);
}
}
}
else {
$output = (array) $root;
}
return $output;
}

I'll just add to this.
I've had some trouble when namespaces come into the mix, so I made the following recursive function to resolve a node.
This method descends to the deepest node and uses it as the value. In my case the top node's nodeValue contains all the values nested within it, so we have to dig down to the lowest level and use that as the true value.
// using XMLReader to read an XML file (in my case it was an 80 GB XML file, which is why I don't just load everything into memory)
$reader = new \XMLReader;
$reader->open($path); // where $path is the file path to the xml file
// using a dirty trick to skip most of the XML that is irrelevant, where $nodeName is the node I'm looking for
// then in the next while loop I skip to the next node
while ($reader->read() && $reader->name !== $nodeName);
while ($reader->name === $nodeName) {
$doc = new \DOMDocument;
$dom = $doc->importNode($reader->expand(), true);
$data = $this->processDom($dom);
$reader->next($dom->localName);
}
public function processDom(\DOMNode $node)
{
$data = [];
/** @var \DOMNode $childNode */
foreach ($node->childNodes as $childNode) {
// child nodes include a lot of #text nodes, which are irrelevant for me, so I just skip them
if ($childNode->nodeName === '#text') {
continue;
}
$childData = $this->processDom($childNode);
if ($childData === null || $childData === []) {
$data[$childNode->localName] = $childNode->nodeValue;
} else {
$data[$childNode->localName] = $childData;
}
}
return $data;
}

Related

When converting XML to JSON, several object names change to "@attributes"

I read this article.
https://qiita.com/yasumodev/items/74a73ed4b3f1dd45edb8
And I did the same thing.
// Fetch the XML (RSS, etc.)
$strXml = file_get_contents("./doc.xml");
// Convert XML to JSON
$strJson = xml_to_json($strXml);
// Output
echo $strJson;
//**********************************
// Function that converts XML to JSON
//**********************************
function xml_to_json($xml)
{
// Replace colons with underscores (namespace workaround)
$xml = preg_replace("/<([^>]+?):([^>]+?)>/", "<$1_$2>", $xml);
// Restore the ones that belong to a protocol (e.g. "http://")
$xml = preg_replace("/_\/\//", "://", $xml);
// Convert the XML string to an object (CDATA included)
$objXml = simplexml_load_string($xml, NULL, LIBXML_NOCDATA);
// Expand the attributes
xml_expand_attributes($objXml);
// Convert to a JSON string
$json = json_encode($objXml, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
// Replace "\/" with "/"
return preg_replace('/\\\\\//', '/', $json);
}
//**********************************
// Function that expands XML tag attributes
//**********************************
function xml_expand_attributes($node)
{
if($node->count() > 0) {
foreach($node->children() as $child)
{
foreach($child->attributes() as $key => $val) {
$node->addChild($child->getName()."@".$key, $val);
}
xml_expand_attributes($child); // recursive call
}
}
}
But this way, several object names change to "@attributes".
I want the original object names here (T_T)
Please help me.
When you json_encode() XML with attributes, it creates the "@attributes" elements you're getting. The only way around this is to remove the attributes as you expand them. I've changed the routine: first, I've moved the part that processes the attributes to the top, which ensures that the root node gets processed as well.
The main change is that it now uses XPath to retrieve the attributes. It still encodes them as you did, but it also lets you remove each attribute from the original node (using unset($attribute[0]);)...
function xml_expand_attributes($node)
{
foreach ($node->xpath("@*") as $attribute) {
$node->addChild($node->getName()."@".$attribute->getName(), (string)$attribute);
unset($attribute[0]);
}
if($node->count() > 0) {
foreach($node->children() as $child)
{
xml_expand_attributes($child); // recursive call
}
}
}
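To see where the "@attributes" key comes from, here is a minimal check of how json_encode() represents an element that carries an attribute. The sample XML is made up for illustration; it is not the asker's doc.xml.

```php
<?php
// Made-up sample: one attribute ("id") and one child element ("name").
$xml = simplexml_load_string('<item id="1"><name>x</name></item>');

// Round-trip through JSON to inspect the structure json_encode() produces.
$decoded = json_decode(json_encode($xml), true);
print_r($decoded);
```

The attribute ends up under an "@attributes" key next to the child elements, which is why the routine above has to remove each attribute from the node once it has been copied into a child.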

How to remove Child node with item(index)?

I'm writing a script that searches multiple XML files for a certain tag. If it finds a child named update in that tag, I need to delete that child and then add it again.
The problem is that I don't understand why it doesn't delete the nodes I want to delete.
OK, so my script (the important part I want to discuss) looks like this:
/*
// Pushing all offers from all files to $allOffers array
*/
foreach ($offerFiles as $file)
{
$file = $path . "\\" . $file;
$currentXML = new SimpleXMLElement($file, 0, true);
foreach($currentXML->offer as $offer)
{
if ($offer->number) {
if (!check_if_exists($compiledXML, $offer->number))
{
//array_push($allOffers, $offer);
}
if (check_if_exists($compiledXML, $offer->number) && $offer->action == "update")
{
update_existing_entry($compiledFile, $compiledXML, $offer);
// var_dump($allOffers);
}
}
}
}
/*
// Find and delete existing XML entry offer with update action
*/
function update_existing_entry ($compiledFile, $compiledXML, $parsedOffer) {
$index = 0;
$doc = new DOMDocument();
$doc->load($compiledFile);
$elem = $doc->documentElement;
foreach ($compiledXML->offer as $offer) {
if ((string)$parsedOffer->number === (string)$offer->number) {
$firstchild = $doc->getElementsByTagName('offer')->item($index);
// $firstchild->nodeValue = null;
$elem->removeChild($firstchild);
$doc->save($compiledFile);
//var_dump($parsedOffer->asXML());
}
$index++;
}
var_dump($deleteNodes);
}
Now if I have 2 XML files, one with the update action and the other without it, it works perfectly. The problems start when both files have the update action: then I always end up with only one deleted node and this error:
Fatal error: Uncaught TypeError: Argument 1 passed to
DOMNode::removeChild() must be an instance of DOMNode, null given
Why can't I delete the nodes at the selected index?
I don't know if it's the best approach, but I have fixed it this way:
function update_existing_entry ($compiledFile, $compiledXML, $parsedOffer) {
$doc = new DOMDocument();
$doc->load($compiledFile);
$node = $doc->documentElement;
foreach ($doc->getElementsByTagName('offer') as $child) {
if (strpos($child->nodeValue, (string)$parsedOffer->number) !== false) {
$node->removeChild($child);
}
}
$doc->save($compiledFile);
}
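A more defensive variant of the same idea, as a sketch under a few assumptions: each offer has a number child (as in the question's code), the helper name removeOffersByNumber is hypothetical, and the sample XML is invented. It copies the node list into an array before removing anything, because removing nodes while iterating a live DOMNodeList can skip elements, and it compares the number exactly, since strpos() on nodeValue would also match "12" when looking for "1".

```php
<?php
// Hypothetical helper: removes every <offer> whose <number> child equals $number exactly.
function removeOffersByNumber(DOMDocument $doc, string $number): int
{
    $removed = 0;

    // Snapshot the live node list so removals do not disturb the iteration.
    $offers = iterator_to_array($doc->getElementsByTagName('offer'));

    foreach ($offers as $offer) {
        $numberNode = $offer->getElementsByTagName('number')->item(0);
        if ($numberNode !== null && trim($numberNode->textContent) === $number) {
            // Remove via the offer's own parent, wherever it sits in the tree.
            $offer->parentNode->removeChild($offer);
            $removed++;
        }
    }

    return $removed;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<offers>'
    . '<offer><number>1</number></offer>'
    . '<offer><number>12</number></offer>'
    . '<offer><number>1</number></offer>'
    . '</offers>'
);

$removed = removeOffersByNumber($doc, '1');
$remaining = $doc->getElementsByTagName('offer')->length;
echo $removed, " removed, ", $remaining, " remaining\n";
```

In the question's version, the index is computed from $compiledXML, which is not reloaded after the file is saved, so after the first deletion the index can point past the end of the document's offer list. That would explain item($index) returning null.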

Build an array as a tree structure from an XML file

I am trying to read this xml file, but the code I am trying to make should work for any xml-file:
<?xml version="1.0"?>
<auto>
<binnenkant>
<voorkant>
<stuur/>
<gas/>
</voorkant>
<achterkant>
<stoel/>
<bagage/>
</achterkant>
</binnenkant>
<buitenkant>
<dak>
<dakkoffer>
<sky/>
<schoen/>
</dakkoffer>
</dak>
<trekhaak/>
<wiel/>
</buitenkant>
</auto>
I am using the two functions below to turn the XML-file into an array and turn that array into a tree.
I am trying to keep the parent-child relationship of the XML file. All I am getting back from the second function is an array with all the tags in the xml-file.
Can someone please help me?
function build_xml_tree(array $vals, $parent, $level) {
$branch = array();
foreach ($vals as $item) {
if (($item['type'] == "open") || $item['type'] == "complete") {
if ($branch && level == $item['level']) {
array_push($branch, ucfirst(strtolower($item['tag'])));
} else if ($parent == "" || $level < $item['level']) {
$branch = array(ucfirst(strtolower($item['tag'])) => build_xml_tree($vals, strtolower($item['tag']), $level));
}
}
}
return $branch;
}
function build_tree ($begin_tree, $content_naam) {
$xml = file_get_contents('xml_files/' . $content_naam . '.xml');
$p = xml_parser_create();
xml_parse_into_struct($p, $xml, $vals, $index);
?>
<pre>
<?php
print_r($vals);
?>
</pre>
<?php
$eindarray = array_merge($begin_tree, build_xml_tree($vals, "", 1));
return $eindarray;
}
There are many classes that can load an XML file. Many of them already represent the file as a tree structure: DOMDocument is one of them.
It seems a bit strange to build a tree as an array when you already have a tree in a DOMDocument object. Since you'll have to traverse the array tree in some way anyway, why not traverse the object tree directly, for printing for example?
Anyway, the following code should do what you're asking for: I've used a recursive function to which the array tree is passed by reference.
It should be trivial at this point to adapt my code to better suit your needs, e.g. by completing the switch statement with more case blocks.
the $tree array has a numeric key for each node level
the tag node names are string values
if an array follows a string value, it contains the node's children
any text node is treated as a child node
function buildTree(DOMNode $node, &$tree) {
foreach ($node->childNodes as $cnode) {
switch($cnode->nodeType) {
case XML_ELEMENT_NODE:
$tree[] = $cnode->nodeName;
break;
case XML_TEXT_NODE:
$tree[] = $cnode->nodeValue;
break;
}
if ($cnode->hasChildNodes())
buildTree($cnode, $tree[count($tree)]);
}
}
$source ='the string which contains the XML';
$doc = new DOMDocument();
$doc->preserveWhiteSpace = FALSE;
$doc->loadXML($source, LIBXML_NOWARNING);
$tree = array();
buildTree($doc, $tree);
var_dump($tree);
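If nested arrays keyed by tag name are easier to work with than the alternating name/children layout above, a different shape is possible. This is a sketch, not the answer's method: the function name nodeToArray is hypothetical, the XML is a shortened version of the question's file, and sibling elements that share a name would overwrite each other in this form.

```php
<?php
// Hypothetical alternative: returns nested arrays keyed by element name.
function nodeToArray(DOMNode $node): array
{
    $result = [];
    foreach ($node->childNodes as $child) {
        // Only element nodes become keys; text and comment nodes are ignored here.
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $result[$child->nodeName] = nodeToArray($child);
        }
    }
    return $result;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<auto>'
    . '<binnenkant><voorkant><stuur/><gas/></voorkant></binnenkant>'
    . '<buitenkant><trekhaak/><wiel/></buitenkant>'
    . '</auto>'
);

$tree = nodeToArray($doc);
print_r($tree);
```

Each key's value is the array of its children, and an empty array marks a leaf, so the parent-child relationship is preserved directly in the nesting.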

Why is this variable variable not acting properly? PHP

I have this foreach loop.
$i2 looks like this (every time):
$i2 = array(
'id' => "category['id']"
);
And here is a foreach loop.
foreach ($i2 as $o3 => $i3)
{
if (is_array($i3) !== true)
{
$new .= "<{$o3}>{$node->$i3}</{$o3}>";
} else {
$new .= "<{$o3}>";
$new .= "</{$o3}>";
}
}
}
$node is a new SimpleXMLElement($xml_reader->readOuterXML());, and that part is working properly.
The problem is here: if I use $node->$i3 to get the XML value of that field, it doesn't work. But if I substitute $node->category['id'] for it, it does. That seems weird, as $i3 contains category['id'] and I can check that with debug tools.
I used this in previous projects and variable variables worked fine. Now they don't. Why?
@edit
This is what happens before the code:
// i move the cursor to the first product tag
while ($xml_reader->read() and $xml_reader->name !== 'product');
// i iterate over it as long as I am inside of it
while ($xml_reader->name === 'product')
{
// i use SimpleXML inside XMLReader to work with nodes easily but without the need of loading the whole file to memory
$node = new SimpleXMLElement($xml_reader->readOuterXML());
foreach ($this->columns as $out => $in) // for each XML tag inside the product tag
{
// ... do stuff
Basically this is what happens before the code in question. The $columns is an array that enables the configuration of input XML file (keys are Prestashops XML tags and values are names of tags in the user's XML file).
For example:
<product>
<associations>
<categories>
<category>
<id></id>
</category>
</categories>
</associations>
</product>
And in input one:
<category id="1"></category>
So in $columns:
$columns = array('associations' => array(
'categories' => array(
'category' => "category['id']" // this is what $i3 later is
)
));
I can get to the given point of XML file easily. I get the category['id'] value and this is what $i3 is.
PHP treats the whole string stored in $i3 as a single property name, so it looks for a property on $node that is literally named category['id']. The object $node has no such property, so the lookup fails.
Are you sure you used the same syntax to reach the category['id'] property in your previous projects?
You can try this syntax:
foreach ($i2 as $o3 => $i3)
{
if (is_array($i3) !== true)
{
$new .= "<{$o3}>" . eval("return \$node->$i3;") ."</{$o3}>";
} else {
$new .= "<{$o3}>";
$new .= "</{$o3}>";
}
}
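Because eval() runs whatever string it is given, an eval-free lookup may be preferable when the column configuration comes from a user. The following is a sketch under assumptions: the helper name readPath is hypothetical, and it only understands the two shapes seen in the question, a bare element name and name['attribute'].

```php
<?php
// Hypothetical helper: resolves "name" or "name['attr']" against a SimpleXMLElement without eval().
function readPath(SimpleXMLElement $node, string $path): string
{
    // Split the path into the element name and an optional attribute name.
    if (preg_match('/^(\w+)(?:\[\'(\w+)\'\])?$/', $path, $m) !== 1) {
        throw new InvalidArgumentException("Unsupported path: $path");
    }

    // Dynamic property access works for the element name alone.
    $element = $node->{$m[1]};

    // Array access on the element reads an attribute; a plain cast reads its text.
    return isset($m[2]) ? (string) $element[$m[2]] : (string) $element;
}

$node = new SimpleXMLElement(
    '<product><category id="1"></category><name>Shoe</name></product>'
);

$categoryId = readPath($node, "category['id']");
$name = readPath($node, "name");
echo $categoryId, "\n", $name, "\n";
```

In the loop above, the eval() call could then be replaced by readPath($node, $i3).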

PHP loads not wanted elements from a XML file

I want to load some data from my XML file using this function:
public function getElements()
{
$elements = array();
$element = $this->documentElement->getElementsByTagName('elements')->item(0);
// checks if it has any immunities
if( isset($element) )
{
// read all immunities
foreach( $element->getElementsByTagName('element') as $v)
{
$v = $v->attributes->item(0);
// checks if immunity is set
if($v->nodeValue > 0)
{
$elements[$v->nodeName] = $v->nodeValue;
}
}
}
return $elements;
}
I want to load these elements from my XML file:
<elements>
<element physicalPercent="10"/>
<element icePercent="10"/>
<element holyPercent="-10"/>
</elements>
I want to load only the node name and node value of each element.
I have this code in my query loop:
$elements = $monster->getElements();
$elN = 0;
$elC = count($elements);
if(!empty($elements)) {
foreach($elements as $element => $value) {
$elN++;
$elements_string .= $element . ":".$value;
if($elC != $elN)
$elements_string .= ", ";
}
}
And finally - the output of $elements_string variable is wrong:
earthPercent:50, holyPercent:50, firePercent:15, energyPercent:5, physicalPercent:25, icePercent:30, deathPercent:30firePercent:20, earthPercent:75firePercent:20, earthPercent:75firePercent:20, earthPercent:75physicalPercent:70, holyPercent:20, deathPerce
It should rather return:
physicalPercent:10, icePercent:10, holyPercent:-10
Could you help me one more time?:)
Thank you in advance.
Well, the XML parser doesn't magically know which elements you want to load and which you don't; you have to filter them yourself. So you have to decide where to filter for the desired elements: in the getElements function you posted, or in your "query loop", as you call it.
Should getElements be a general function that returns all elements? Then you should change the check if($v->nodeValue > 0) to something like if(!empty($v->nodeValue)); otherwise you won't get the "holyPercent" value, since it is negative (and the old expression evaluates to false).
Then in your "query loop", just select your desired elements:
foreach($elements as $element => $value) {
if(in_array($element, array("physicalPercent", "icePercent", "holyPercent"))) {
$elN++;
$elements_string .= $element . ":".$value;
if($elC != $elN)
$elements_string .= ", ";
}
}
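The posted output also runs values together without a separator (for example deathPercent:30firePercent:20). One possible reading, which the question does not confirm, is that $elements_string is never reset between iterations of the outer query loop. A sketch that sidesteps both the manual counter and the shared variable, using a hypothetical helper named formatElements:

```php
<?php
// Hypothetical helper: builds the "name:value" list for one monster from scratch on every call.
function formatElements(array $elements): string
{
    $parts = [];
    foreach ($elements as $name => $value) {
        $parts[] = $name . ":" . $value;
    }
    // implode() places the separator only between items, so no counter is needed.
    return implode(", ", $parts);
}

// Sample data matching the XML shown in the question.
$elements_string = formatElements([
    'physicalPercent' => '10',
    'icePercent' => '10',
    'holyPercent' => '-10',
]);
echo $elements_string, "\n";
```

Inside the query loop this would be called once per monster with the result of $monster->getElements().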
Just:
$xml = new SimpleXMLElement($xmlfile);
And then:
for($i=0;$i<count($xml->elements->element);$i++)
echo $xml->elements->element[$i][0];
I haven't tried whether it works with [0]; I usually use:
echo $xml->elements->element[$i]['attributename'];
