Build an array as a tree structure from an XML file - php

I am trying to read this XML file, but the code I am writing should work for any XML file:
<?xml version="1.0"?>
<auto>
<binnenkant>
<voorkant>
<stuur/>
<gas/>
</voorkant>
<achterkant>
<stoel/>
<bagage/>
</achterkant>
</binnenkant>
<buitenkant>
<dak>
<dakkoffer>
<sky/>
<schoen/>
</dakkoffer>
</dak>
<trekhaak/>
<wiel/>
</buitenkant>
</auto>
I am using the two functions below to turn the XML-file into an array and turn that array into a tree.
I am trying to keep the parent-child relationship of the XML file. All I am getting back from the second function is an array with all the tags in the xml-file.
Can someone please help me?
function build_xml_tree(array $vals, $parent, $level) {
$branch = array();
foreach ($vals as $item) {
if (($item['type'] == "open") || $item['type'] == "complete") {
if ($branch && level == $item['level']) {
array_push($branch, ucfirst(strtolower($item['tag'])));
} else if ($parent == "" || $level < $item['level']) {
$branch = array(ucfirst(strtolower($item['tag'])) => build_xml_tree($vals, strtolower($item['tag']), $level));
}
}
}
return $branch;
}
function build_tree ($begin_tree, $content_naam) {
$xml = file_get_contents('xml_files/' . $content_naam . '.xml');
$p = xml_parser_create();
xml_parse_into_struct($p, $xml, $vals, $index);
?>
<pre>
<?php
print_r($vals);
?>
</pre>
<?php
$eindarray = array_merge($begin_tree, build_xml_tree($vals, "", 1));
return $eindarray;
}

There are many classes which can load an XML file. Many of them already represent the file in a tree structure: DOMDocument is one of them.
It seems a bit strange to build a tree as an array when you already have a tree in a DOMDocument object. Since you will have to traverse the array tree in some way anyway, why not traverse the object tree directly, for example for printing?
Anyway, the following code should do what you're asking for: I've used a recursive function to which the array tree is passed by reference.
It should be trivial at this point to adapt my code to better suit your needs, e.g. by completing the switch statement with more case blocks.
The $tree array has a numeric key for each node at a given level.
Tag names are stored as string values.
If an array follows a string value, it contains that node's children.
Any text node is treated as a child node.
function buildTree(DOMNode $node, &$tree) {
foreach ($node->childNodes as $cnode) {
switch($cnode->nodeType) {
case XML_ELEMENT_NODE:
$tree[] = $cnode->nodeName;
break;
case XML_TEXT_NODE:
$tree[] = $cnode->nodeValue;
break;
}
if ($cnode->hasChildNodes())
buildTree($cnode, $tree[count($tree)]);
}
}
$source ='the string which contains the XML';
$doc = new DOMDocument();
$doc->preserveWhiteSpace = FALSE;
$doc->loadXML($source, LIBXML_NOWARNING);
$tree = array();
buildTree($doc, $tree);
var_dump($tree);
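As a rough illustration of the point about traversing the DOMDocument directly, here is a minimal sketch that prints the element names as an indented outline without building an intermediate array. The function name renderOutline and the shortened sample XML are assumptions made for this example, not part of the original answer.

```php
<?php
// Hypothetical helper: walk the DOM recursively and return the element
// names as an indented outline (two spaces per nesting level).
function renderOutline(DOMNode $node, int $depth = 0): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip text nodes, comments, etc.
        }
        $out .= str_repeat('  ', $depth) . $child->nodeName . "\n";
        $out .= renderOutline($child, $depth + 1);
    }
    return $out;
}

// Shortened version of the XML from the question.
$doc = new DOMDocument();
$doc->loadXML('<auto><binnenkant><stuur/><gas/></binnenkant><wiel/></auto>');
echo renderOutline($doc);
```

The parent-child relationship is carried by the recursion depth, so nothing has to be stored between calls.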

When converting XML to JSON, several object names change to "@attributes"

I read this article.
https://qiita.com/yasumodev/items/74a73ed4b3f1dd45edb8
And I did the same thing.
// Fetch the XML (RSS, etc.)
$strXml = file_get_contents("./doc.xml");
// Convert XML to JSON
$strJson = xml_to_json($strXml);
// Output
echo $strJson;
//**********************************
// Function that converts XML to JSON
//**********************************
function xml_to_json($xml)
{
// Replace colons with underscores (namespace workaround)
$xml = preg_replace("/<([^>]+?):([^>]+?)>/", "<$1_$2>", $xml);
// Restore the "://" of protocols
$xml = preg_replace("/_\/\//", "://", $xml);
// Convert the XML string into an object (CDATA included)
$objXml = simplexml_load_string($xml, NULL, LIBXML_NOCDATA);
// Expand the attributes
xml_expand_attributes($objXml);
// Convert to a JSON string
$json = json_encode($objXml, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
// Replace "\/" with "/"
return preg_replace('/\\\\\//', '/', $json);
}
//**********************************
// Function that expands XML tag attributes
//**********************************
function xml_expand_attributes($node)
{
if($node->count() > 0) {
foreach($node->children() as $child)
{
foreach($child->attributes() as $key => $val) {
$node->addChild($child->getName()."#".$key, $val);
}
xml_expand_attributes($child); // recursive call
}
}
}
But done this way, several object names change to "@attributes".
I want the original object names here.
Please help me.
When you json_encode() XML that has attributes, the output contains the @attributes elements you're getting. The only way around this is to remove the attributes as you expand them. I've changed the routine: the part that processes the attributes now comes first, which ensures that the root node gets processed as well.
The main change is that it now uses XPath to retrieve the attributes. It still encodes them as you did, but this also allows you to remove each attribute from the original node (using unset($attribute[0]);).
function xml_expand_attributes($node)
{
foreach ($node->xpath("@*") as $attribute) {
$node->addChild($node->getName()."#".$attribute->getName(), (string)$attribute);
unset($attribute[0]);
}
if($node->count() > 0) {
foreach($node->children() as $child)
{
xml_expand_attributes($child); // recursive call
}
}
}
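To see where the key comes from, here is a minimal sketch that encodes a tiny element with one attribute. The sample `<item>` document is an assumption made up for this illustration.

```php
<?php
// A made-up element with one attribute and one child element.
$item = simplexml_load_string('<item id="7"><name>a</name></item>');

// json_encode() groups all attributes of an element under "@attributes".
$json = json_encode($item);
echo $json, "\n"; // {"@attributes":{"id":"7"},"name":"a"}

// Decoding shows the same grouping as a nested array.
$decoded = json_decode($json, true);
echo $decoded['@attributes']['id'], "\n"; // 7
```

This is why the attributes have to be removed from the node after they have been copied into child elements: as long as an element still carries attributes, the grouping key is emitted.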

I need a recursive php function to loop through a xml file

I'm trying to loop through an XML file and save each node paired with its value into an array (key => value). I also want it to keep track of the nodes it passed (something like array(users_user_name => "myName", users_user_email => "myEmail") etc.).
I know how to do this, but there is a problem. All the nodes could have children, and those children might also have children, etc., so I need some sort of recursive function to keep looping through the children until it reaches the last child.
So far I got this:
//loads the xml file and creates simpleXML object
$xml = simplexml_load_string($content);
// for each root value
foreach ($xml->children() as $children) {
// for each child of the root node
$node = $children;
while ($children->children()) {
foreach ($children as $child) {
if($child->children()){
break;
}
$children = $node->getName();
//Give key a name
$keyOfValue = $xml->getName() . "_" . $children . "_" . $child->getName();
// pass value from child to children
$children = $child;
// no children, fill array: key => value
if ($child->children() == false) {
$parent[$keyOfValue] = (string)$child;
}
}
}
$dataObject[] = $parent;
}
The "break;" is to prevent it from giving me the wrong values because "child" is an object and not the last child.
Using recursion, you can write some 'complicated' processing, but the problem is losing your place.
The function I use here passes in a couple of things to keep track of the name and the current output, as well as the node it's currently working with. As you can see, the method checks whether there are any child nodes and, if so, calls itself again to process each one of them.
$content = <<< XML
<users>
<user>
<name>myName</name>
<email>myEmail</email>
<address><line1>address1</line1><line2>address2</line2></address>
</user>
</users>
XML;
function processNode ( $base, SimpleXMLElement $node, &$output ) {
$base[] = $node->getName();
$nodeName = implode("_", $base);
$childNodes = $node->children();
if ( count($childNodes) == 0 ) {
$output[ $nodeName ] = (string)$node;
}
else {
foreach ( $childNodes as $newNode ) {
processNode($base, $newNode, $output);
}
}
}
$xml = simplexml_load_string($content);
$output = [];
processNode([], $xml, $output);
print_r($output);
This prints out...
Array
(
[users_user_name] => myName
[users_user_email] => myEmail
[users_user_address_line1] => address1
[users_user_address_line2] => address2
)
With this implementation, there are limitations to the content - so for example - repeating content will only keep the last value (say for example there were multiple users).
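One way around the limitation just mentioned, sketched here under the assumption that a list of values per key is acceptable, is to append each leaf value to an array instead of overwriting it. The function name collectNodes and the two-user sample document are made up for this illustration.

```php
<?php
// Variant of processNode() that keeps every value of a repeated element:
// each key maps to a list of values rather than to a single string.
function collectNodes(array $base, SimpleXMLElement $node, array &$output): void
{
    $base[] = $node->getName();
    $children = $node->children();
    if (count($children) === 0) {
        // Leaf node: append instead of overwrite.
        $output[implode('_', $base)][] = (string) $node;
        return;
    }
    foreach ($children as $child) {
        collectNodes($base, $child, $output);
    }
}

$xml = simplexml_load_string(
    '<users><user><name>a</name></user><user><name>b</name></user></users>'
);
$output = [];
collectNodes([], $xml, $output);
print_r($output);
```

The trade-off is that values belonging to the same user are no longer grouped together; if that matters, the index of the repeated element would have to become part of the key.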
You'll want to use recursion!
Here's a simple example of recursion:
function doThing($param, $condition) {
// Do what you need to do (alterParam() stands for your own logic)
$param = alterParam($param);
// If there's more to do, do it again and pass the result back up
if ($param != $condition) {
return doThing($param, $condition);
}
// Otherwise, we are ready to return the result
return $param;
}
You can apply this thinking to your specific use case.
//Using SimpleXML library
// Parses XML but returns an Object for child nodes
public function getNodes($root)
{
$output = array();
if($root->children()) {
$children = $root->children();
foreach($children as $child) {
if(!($child->children())) {
$output[] = (array) $child;
}
else {
$output[] = self::getNodes($child->children());
}
}
}
else {
$output = (array) $root;
}
return $output;
}
I'll just add to this.
I've had some trouble when namespaces come into the mix, so I wrote the following recursive function to resolve a node.
This method descends to the deepest node and uses it as the value. In my case the top node's nodeValue contains all the values nested within it, so we have to dig down to the lowest level and use that as the true value.
// using the XMLReader to read an xml file ( in my case it was a 80gig xml file which is why i don't just load everything into memory )
$reader = new \XMLReader;
$reader->open($path); // where $path is the file path to the xml file
// using a dirty trick to skip most of the xml that is irrelevant where $nodeName is the node im looking for
// then in the next while loop i skip to the next node
while ($reader->read() && $reader->name !== $nodeName);
while ($reader->name === $nodeName) {
$doc = new \DOMDocument;
$dom = $doc->importNode($reader->expand(), true);
$data = $this->processDom($dom);
$reader->next($dom->localName);
}
public function processDom(\DOMNode $node)
{
$data = [];
/** @var \DOMNode $childNode */
foreach ($node->childNodes as $childNode) {
// child nodes include a lot of #text nodes, which are irrelevant for me, so I just skip them
if ($childNode->nodeName === '#text') {
continue;
}
$childData = $this->processDom($childNode);
if ($childData === null || $childData === []) {
$data[$childNode->localName] = $childNode->nodeValue;
} else {
$data[$childNode->localName] = $childData;
}
}
return $data;
}
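The streaming loop above can be tried out on a small in-memory document. The following is only a sketch: it reads from a string via XMLReader::XML() instead of a file, uses readOuterXml() with SimpleXMLElement rather than expand() and importNode(), and the `<list>`/`<product>` sample is an assumption made up for this example.

```php
<?php
// Made-up sample; a real use case would call $reader->open($path) instead.
$xml = '<list><product><id>1</id></product><product><id>2</id></product></list>';

$reader = new XMLReader();
$reader->XML($xml);

// Skip ahead to the first <product> element.
while ($reader->read() && $reader->name !== 'product');

$ids = [];
while ($reader->name === 'product') {
    // Only the current <product> subtree is loaded into memory.
    $node = new SimpleXMLElement($reader->readOuterXml());
    $ids[] = (string) $node->id;
    // Jump to the next <product>, skipping the subtree just processed.
    $reader->next('product');
}
$reader->close();

print_r($ids);
```

Only one subtree is held in memory at a time, which is what makes this pattern workable for very large files.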

Why is this variable variable not acting properly? PHP

I have this foreach loop.
$i2 looks like this (everytime):
$i2 = array(
'id' => "category['id']"
);
And here is a foreach loop.
foreach ($i2 as $o3 => $i3)
{
if (is_array($i3) !== true)
{
$new .= "<{$o3}>{$node->$i3}</{$o3}>";
} else {
$new .= "<{$o3}>";
$new .= "</{$o3}>";
}
}
}
$node is a new SimpleXMLElement($xml_reader->readOuterXML()), and that part works properly.
The problem is this: if I use $node->$i3 to get the XML value of that field, it doesn't work, but if I substitute $node->category['id'] for it, it does. That seems weird, as $i3 contains category['id'] and I can check that with debug tools.
I used this in previous projects and the variable variable worked fine. Now it doesn't. Why?
Edit:
This is what happens before the code:
// i move the cursor to the first product tag
while ($xml_reader->read() and $xml_reader->name !== 'product');
// i iterate over it as long as I am inside of it
while ($xml_reader->name === 'product')
{
// i use SimpleXML inside XMLReader to work with nodes easily but without the need of loading the whole file to memory
$node = new SimpleXMLElement($xml_reader->readOuterXML());
foreach ($this->columns as $out => $in) // for each XML tag inside the product tag
{
// ... do stuff
Basically this is what happens before the code in question. $columns is an array that configures the mapping of the input XML file (keys are PrestaShop's XML tags and values are the names of the tags in the user's XML file).
For example:
<product>
<associations>
<categories>
<category>
<id></id>
</category>
</categories>
</associations>
</product>
And in the input file:
<category id="1"></category>
So in $columns:
$columns = array('associations' => array(
'categories' => array(
'category' => "category['id']" // this is what $i3 later is
)
));
I can get to the given point of XML file easily. I get the category['id'] value and this is what $i3 is.
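PHP looks for a property on $node whose name is the entire string held in $i3, that is, a property literally named category['id']. It does not parse the string into a property access followed by an array access, so no such property is found and the lookup fails.
Are you sure you used the same syntax to reach the category['id'] property in your previous projects?
You can try this syntax, keeping in mind that eval() executes whatever code the string contains, so $i3 must come from a trusted source: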
foreach ($i2 as $o3 => $i3)
{
if (is_array($i3) !== true)
{
$new .= "<{$o3}>" . eval("return \$node->$i3;") ."</{$o3}>";
} else {
$new .= "<{$o3}>";
$new .= "</{$o3}>";
}
}
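An alternative that avoids eval() is to parse the configured string yourself. The sketch below assumes the configuration only ever contains either a plain tag name or the form tag['attribute']; the function name resolvePath and the sample `<product>` element are hypothetical.

```php
<?php
// Hypothetical resolver: supports "tag" and "tag['attribute']" only.
function resolvePath(SimpleXMLElement $node, string $path): string
{
    if (preg_match('/^(\w+)\[\'(\w+)\'\]$/', $path, $m)) {
        // Child element $m[1], attribute $m[2].
        return (string) $node->{$m[1]}[$m[2]];
    }
    // Plain child element name.
    return (string) $node->{$path};
}

$node = new SimpleXMLElement(
    '<product><category id="1"></category><name>Shoe</name></product>'
);
echo resolvePath($node, "category['id']"), "\n"; // 1
echo resolvePath($node, 'name'), "\n";           // Shoe
```

In the loop, "{$node->$i3}" would then become a call such as resolvePath($node, $i3).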

SimpleXML load String putting the values into Array() in PHP

I have XML in the form of a string (after an XSL transform):
<course>
<topic>
<chapter>Some value</chapter>
<title>Some value</title>
<content>Some value</content>
</topic>
<topic>
<chapter>Some value</chapter>
<title>Some value</title>
<content>Some value</content>
</topic>
....
</course>
Then I'm pushing above mentioned XML into the Array():
$new_xml = $proc->transformToXML($xml);
$xml2 = simplexml_load_string($new_xml);
$root = $xml2->xpath("//topic");
$current = 0;
$topics_list = array();
// put the xml values into multidimensional array
foreach($root as $data) {
if ($data === 'chapter') {
$topics_list[$current]['chapter'] = $data->chapter;
}
if ($data === 'title') {
$topics_list[$current]['title'] = $data->title;
}
if ($data === 'content') {
$topics_list[$current]['content'] = $data->content;
}
$current++;
}
print_r($topics_list);
Problem: the result is an empty array. I've tried a string cast like:
$topics_list[$current]['chapter'] = (string) $data->chapter;
but the result is still empty. Can anyone explain where my mistake is? Thanks.
Because your topic elements have only simple child elements and no attributes, you can cast each one to an array and add it to the list:
$xml2 = new SimpleXMLElement($new_xml);
$topics_list = array();
foreach ($xml2->children() as $data) {
$topics_list[] = (array) $data;
}
The alternative method is to map get_object_vars over the topic elements:
$topics_list = array_map('get_object_vars', iterator_to_array($xml2->topic, false));
But that might become a bit hard to read/follow. Foreach is probably more appropriate.
And here is the first working version of my code:
$xml2 = new SimpleXMLElement($new_xml);
$current = 0;
$topics_list = array();
foreach($xml2->children() as $data) {
$topics_list[$current]['chapter'] = (string) $data->chapter;
$topics_list[$current]['title'] = (string) $data->title;
$topics_list[$current]['content'] = (string) $data->content;
$current++;
}
Thanks again to @Jack, @CoursesWeb and @fab for their investigation.
There are several mistakes.
1. return value of xpath()
$root = $xml2->xpath("//topic");
Here you assign $root to a list of all nodes retrieved by the XPath //topic. So, when you iterate over it with
foreach($root as $data)
$data refers to each of the <topic> elements, not the children of those.
2. comparison of SimpleXMLElements with strings
Let's assume, you loop over the right elements and $data refers to the <chapter> element: then the following expressions are true:
$data == 'Some value'
(string) $data === 'Some value'
But you cannot do a type safe comparison (===) between a SimpleXMLElement and a string, and the conversion to string does not result in the element name. What you want to do is:
if ($data->getName() === 'chapter')
3. how to get the text value
it should already be clear from the explanation above but you also will have to replace
$data->chapter
with
(string) $data
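Putting the three points together, a minimal sketch could look like the following. The inline sample document is an assumption; in the question the XML comes from the XSLT processor.

```php
<?php
// Made-up sample standing in for the transformed XML.
$course = simplexml_load_string(
    '<course>'
    . '<topic><chapter>One</chapter><title>Intro</title><content>Text</content></topic>'
    . '<topic><chapter>Two</chapter><title>Next</title><content>More</content></topic>'
    . '</course>'
);

$topics_list = [];
foreach ($course->topic as $topic) {        // each <topic> element
    $row = [];
    foreach ($topic->children() as $field) { // its <chapter>, <title>, <content>
        // getName() gives the tag name, the string cast gives the text value.
        $row[$field->getName()] = (string) $field;
    }
    $topics_list[] = $row;
}
print_r($topics_list);
```

Using getName() as the array key means no if statements are needed for the individual tag names.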
The problem is that you do not get the name of the XML element.
To get the name of an XML element, call $elm->getName() on it.
In your code it should be:
if ($data->getName() === 'chapter')
For more details about traversing and getting xml elements with Simplexml, see this tutorial: http://coursesweb.net/php-mysql/php-simplexml

PHP loads unwanted elements from an XML file

I want to load some data from my XML file using this function:
public function getElements()
{
$elements = array();
$element = $this->documentElement->getElementsByTagName('elements')->item(0);
// checks if it has any immunities
if( isset($element) )
{
// read all immunities
foreach( $element->getElementsByTagName('element') as $v)
{
$v = $v->attributes->item(0);
// checks if immunity is set
if($v->nodeValue > 0)
{
$elements[$v->nodeName] = $v->nodeValue;
}
}
}
return $elements;
}
I want to load these elements from my XML file:
<elements>
<element physicalPercent="10"/>
<element icePercent="10"/>
<element holyPercent="-10"/>
</elements>
I want to load only each element's node name and node value.
I have this code in my query loop:
$elements = $monster->getElements();
$elN = 0;
$elC = count($elements);
if(!empty($elements)) {
foreach($elements as $element => $value) {
$elN++;
$elements_string .= $element . ":".$value;
if($elC != $elN)
$elements_string .= ", ";
}
}
And finally - the output of $elements_string variable is wrong:
earthPercent:50, holyPercent:50, firePercent:15, energyPercent:5, physicalPercent:25, icePercent:30, deathPercent:30firePercent:20, earthPercent:75firePercent:20, earthPercent:75firePercent:20, earthPercent:75physicalPercent:70, holyPercent:20, deathPerce
It should rather return:
physicalPercent:10, icePercent:10, holyPercent:-10
Could you help me one more time?:)
Thank you in advance.
Well, the XML parser doesn't magically know which elements you want to load and which you don't; you have to filter them yourself. You then have to decide where to filter: in the getElements function you posted, or in your "query loop", as you call it.
Should getElements be a general function that returns all elements? Then you should change the check if($v->nodeValue > 0) to something like if(!empty($v->nodeValue)); otherwise you won't get the "holyPercent" value, since it is negative (and the old expression evaluates to false).
Then in your "query loop", just select the elements you want:
foreach($elements as $element => $value) {
if(in_array($element, array("physicalPercent", "icePercent", "holyPercent"))) {
$elN++;
$elements_string .= $element . ":".$value;
if($elC != $elN)
$elements_string .= ", ";
}
}
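The wrong output in the question also shows values from several monsters running together without a separator (for example "deathPercent:30firePercent:20"). One possible cause, which is an assumption because the surrounding loop is not shown, is that $elements_string is never reset between iterations. A sketch that builds the string from scratch on every call and needs no counter follows; the function name formatElements is hypothetical.

```php
<?php
// Hypothetical helper: turn the name => value array into "a:1, b:2".
function formatElements(array $elements): string
{
    $parts = [];
    foreach ($elements as $name => $value) {
        $parts[] = $name . ':' . $value;
    }
    // implode() places the separator only between items.
    return implode(', ', $parts);
}

$elements = ['physicalPercent' => '10', 'icePercent' => '10', 'holyPercent' => '-10'];
echo formatElements($elements), "\n"; // physicalPercent:10, icePercent:10, holyPercent:-10
```

Inside the query loop this would be used as $elements_string = formatElements($monster->getElements()); so each monster starts with a fresh string.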
Just:
$xml = new SimpleXMLElement($xmlfile);
And then:
for($i=1;$i<Count($xml->elements);$i++)
echo $xml->elements[$i][0];
I didn't try whether it works with [0]; usually I use:
echo $xml->elements[$i]['attributename'];
