I want to parse some DOM content into a multidimensional array. Let's assume I have this HTML content:
<for model="customer" id="0" processed="0">
<tag model="customer" value="name">name</tag>
<for model="accounts_receivable" processed="0">
<p>This is inside accounts_receivable</p>
</for>
</for>
I would like to parse this to:
array(
FOR => array (
ATTRIBUTES =>
SUBELEMENTS => array (
FOR => array (
ATTRIBUTES =>
SUBELEMENTS =>
)
)
)
)
I tried parsing the DOM in PHP with getElementsByTagName, but it returns both for tags in one flat list instead of a nested structure.
The key point is that the function should work whether there are 2 layers or 20.
Any good ideas?
Cheers,
Niklas
I wrote a function that does this for "for" tag nodes. It ignores all other nodes, but recursively walks the complete DOM to find every for tag node.
$doc = new DOMDocument();
$doc->loadHTML($this->template, LIBXML_NOWARNING | LIBXML_NOERROR);
$elements = $doc->getElementsByTagName('for');
$array = [];
foreach ($elements as $element) {
    // skip "for" nodes that a recursive call has already marked as processed
    if ($element->getAttribute("processed") != 1) {
        array_push($array, $this->parseDomChild($element));
    }
}
function parseDomChild($element) {
$array = [];
if(isset($element->tagName) && $element->tagName == 'for') {
$array['nodeSelf'] = $element;
$element->setAttribute("processed", 1);
}
if($element->hasChildNodes()) {
$array['nodesChild'] = [];
foreach($element->childNodes as $node) {
array_push($array['nodesChild'], $this->parseDomChild($node));
}
}
return $array;
}
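For comparison, here is a minimal sketch that builds the ATTRIBUTES/SUBELEMENTS layout from the question at any depth. It is an illustration under assumptions, not the code above: the helper name elementToArray and the TAG key are hypothetical, and the sample is loaded with loadXML only because this particular snippet happens to be well-formed XML.

```php
<?php
// Hypothetical helper: converts one element and all nested elements into
// the TAG / ATTRIBUTES / SUBELEMENTS layout sketched in the question.
function elementToArray(DOMElement $element): array
{
    $attributes = [];
    foreach ($element->attributes as $attribute) {
        $attributes[$attribute->nodeName] = $attribute->nodeValue;
    }

    $subElements = [];
    foreach ($element->childNodes as $child) {
        // Text and whitespace nodes are skipped; only elements recurse.
        if ($child instanceof DOMElement) {
            $subElements[] = elementToArray($child);
        }
    }

    return [
        'TAG'         => strtoupper($element->tagName),
        'ATTRIBUTES'  => $attributes,
        'SUBELEMENTS' => $subElements,
    ];
}

// Sample markup copied from the question.
$template = '<for model="customer" id="0" processed="0">'
    . '<tag model="customer" value="name">name</tag>'
    . '<for model="accounts_receivable" processed="0">'
    . '<p>This is inside accounts_receivable</p>'
    . '</for></for>';

$doc = new DOMDocument();
$doc->loadXML($template);
$result = elementToArray($doc->documentElement);
print_r($result);
```

Because each element only ever describes its own direct children, the nesting depth of the result follows the nesting depth of the document, whether that is 2 layers or 20.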
I have a question about the mtownsend/request-xml (XML to array) plugin.
The plugin converts an XML file to an array.
I use it in my Laravel projects, and for several reasons I need exactly this plugin, but there is one problem.
I have two simple XML files.
The first file, oneitem.xml, has one <flat> item inside <post>:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<post>
<flat>
<roms>4</roms>
<baths>2</baths>
</flat>
</post>
</data>
The second file, severalitems.xml, has several <flat> items inside <post>:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<post>
<flat>
<roms>4</roms>
<baths>2</baths>
</flat>
<flat>
<roms>5</roms>
<baths>1</baths>
</flat>
<flat>
<roms>7</roms>
<baths>3</baths>
</flat>
</post>
</data>
Then I use some simple code to build an array from each file and print the result:
$xmlone = XmlToArray::convert(file_get_contents('public/xml/test/oneitem.xml'));
$oneflat = $xmlone['post'];
print_r($oneflat);
$xmlseveral = XmlToArray::convert(file_get_contents('public/xml/test/severalitems.xml'));
$severalflats = $xmlseveral['post'];
print_r($severalflats);
If we build an array from the first file (with one flat) and print everything under post, we get this result:
Array ( [flat] => Array ( [roms] => 4 [baths] => 2 ) )
If we do the same with the second file (with several flat items), we get this result:
Array ( [flat] => Array ( [0] => Array ( [roms] => 4 [baths] => 2 ) [1] => Array ( [roms] => 5 [baths] => 1 ) [2] => Array ( [roms] => 7 [baths] => 3 ) ) )
So when there are several items, the plugin adds an extra level of arrays with the keys [0], [1], [2], and so on.
I need the same structure even when there is just one flat item in post, so that both results have the same format.
I know the plugin itself causes this: when it sees only one flat in post, it simplifies the resulting array.
The code of the plugin's main file is below, but I can't work out which lines do it.
Thanks for your help.
public static function convert($xml, $outputRoot = false)
{
$array = self::xmlStringToArray($xml);
if (!$outputRoot && array_key_exists('#root', $array)) {
unset($array['#root']);
}
return $array;
}
protected static function xmlStringToArray($xmlstr)
{
$doc = new DOMDocument();
$doc->loadXML($xmlstr);
$root = $doc->documentElement;
$output = self::domNodeToArray($root);
$output['#root'] = $root->tagName;
return $output;
}
protected static function domNodeToArray($node)
{
$output = [];
switch ($node->nodeType) {
case XML_CDATA_SECTION_NODE:
case XML_TEXT_NODE:
$output = trim($node->textContent);
break;
case XML_ELEMENT_NODE:
for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
$child = $node->childNodes->item($i);
$v = self::domNodeToArray($child);
if (isset($child->tagName)) {
$t = $child->tagName;
if (!isset($output[$t])) {
$output[$t] = [];
}
$output[$t][] = $v;
} elseif ($v || $v === '0') {
$output = (string) $v;
}
}
if ($node->attributes->length && !is_array($output)) { // Has attributes but isn't an array
$output = ['#content' => $output]; // Change output into an array.
}
if (is_array($output)) {
if ($node->attributes->length) {
$a = [];
foreach ($node->attributes as $attrName => $attrNode) {
$a[$attrName] = (string) $attrNode->value;
}
$output['#attributes'] = $a;
}
foreach ($output as $t => $v) {
if (is_array($v) && count($v) == 1 && $t != '#attributes') {
$output[$t] = $v[0];
}
}
}
break;
}
return $output;
}
}
Thanks for your help!
It looks like it's not possible. As I understand it, the standard PHP tools for XML-to-array conversion behave this way, and the plugin is based on them.
I think it could be solved by changing the plugin's source code, but in my case I solved it by working with the result of the XML-to-array conversion and checking whether the resulting array has one flat or several.
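The collapsing appears to come from the last foreach in domNodeToArray, which replaces any child array that holds exactly one entry with that entry. The post-processing described above can be sketched as a small helper instead of patching the plugin. The name asList is hypothetical, and treating a non-empty array with the keys 0..n-1 as "already a list" is an assumption about the data.

```php
<?php
// Hypothetical helper: returns $value unchanged when it already is a
// numerically indexed list, otherwise wraps it in a one-element list.
function asList($value): array
{
    $isList = is_array($value)
        && $value !== []
        && array_keys($value) === range(0, count($value) - 1);

    return $isList ? $value : [$value];
}

// Shapes as produced for oneitem.xml and severalitems.xml in the question.
$oneFlat = ['flat' => ['roms' => '4', 'baths' => '2']];
$severalFlats = ['flat' => [
    ['roms' => '4', 'baths' => '2'],
    ['roms' => '5', 'baths' => '1'],
]];

$one = asList($oneFlat['flat']);
$several = asList($severalFlats['flat']);

// Both results can now be consumed by the same loop.
foreach ($one as $flat) {
    echo $flat['roms'], ' rooms, ', $flat['baths'], " baths\n";
}
```

The caller has to know which keys (here flat) are meant to repeat, because the converted array alone no longer carries that information.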
EDITED
I'm trying to write my form inputs to an XML file.
Searching this site I found the following code, and I used it to convert the $_POST content.
After a few attempts I realized that "numeric tags" (resulting from non-associative arrays) could be the reason for my lack of success, so I modified the code as below:
function array_to_xml(array $arr, SimpleXMLElement $xml, $NumK = false)
{
foreach ($arr as $k => $v) {
if (is_array($v)){
preg_match('/^0|[1-9]\d*$/', implode(array_keys($v)))
? array_to_xml($v, $xml->addChild($k), true)
: array_to_xml($v, $xml->addChild($k));
}else{
$NumK
? $xml->addChild('_'.$k.'_', $v)
: $xml->addChild($k, $v);
}
}
return $xml;
}
Anyway, I'm still fighting with XPath because I can't find the grandparent of the nodes (coming from non-associative arrays) that I need to convert into repeated tags.
This is the logic I'm trying to follow:
1st - Find the nodes to reformat (the only ones with a numeric tag);
2nd - Find the grandparent (the tag I need to repeat);
3rd - Replace the grandparent (and its descendants) with one grandparent tag for each group of grandchildren (one for each child).
So far I'm still stuck on the 1st step because I don't understand XPath well enough.
Below are the XML I currently get and how I would like to transform it.
My array is something like:
$TestArr = Array
("First" => array
("Martha" => "Text01"
,
"Lucy" => "Text02"
,
"Bob" => array
("Jhon" => array
("01", "02")
),
"Frank" => "One"
,
"Jessy" => "Two"
)
,
"Second" => array
("Mary" => array
("Jhon" => array
("03", "04")
,
"Frank" => array
("Three", "Four")
,
"Jessy" => array
("J3", "J4")
)
)
);
Using the function array_to_xml($TestArr, new SimpleXMLElement('<root/>')) I get XML like:
<root>
<First>
<Martha>Text01</Martha>
<Lucy>Text02</Lucy>
<Bob>
<Jhon>
<_0_>01</_0_>
<_1_>02</_1_>
</Jhon>
</Bob>
<Frank>One</Frank>
<Jessy>Two</Jessy>
</First>
<Second>
<Mary>
<Jhon>
<_0_>03</_0_>
<_1_>04</_1_>
</Jhon>
<Frank>
<_0_>Three</_0_>
<_1_>Four</_1_>
</Frank>
<Jessy>
<_0_>J3</_0_>
<_1_>J4</_1_>
</Jessy>
</Mary>
</Second>
</root>
The result I need is something like:
<root>
<First>
<Martha>Text01</Martha>
<Lucy>Text02</Lucy>
<Bob>
<Jhon>01</Jhon>
</Bob>
<Bob>
<Jhon>02</Jhon>
</Bob>
<Frank>One</Frank>
<Jessy>Two</Jessy>
</First>
<Second>
<Mary>
<Jhon>03</Jhon>
<Frank>Three</Frank>
<Jessy>J3</Jessy>
</Mary>
<Mary>
<Jhon>04</Jhon>
<Frank>Four</Frank>
<Jessy>J4</Jessy>
</Mary>
</Second>
</root>
I've updated the code to try to get closer to what you were trying to achieve. To identify how the data is grouped, I add an 'id' attribute to each element created from a numerically indexed array. For convenience, I also set a 'max' counter on the parent element.
The first XPath expression (//*[@id]/..) fetches all the elements that need to be processed. The code then loops once for each sub-element index counted earlier. The XPath descendant::*[@id='{$i}'] picks out each set of elements (all the ones with id='0', then '1', and so on). This is the natural grouping of the data.
function array_to_xml(array $arr, SimpleXMLElement $xml, string $elementName = null)
{
foreach ($arr as $k => $v) {
if (is_array($v)){
if ( preg_match('/^0|[1-9]\d*$/', implode(array_keys($v)))) {
array_to_xml($v, $xml, $k);
}
else {
array_to_xml($v, $xml->addChild($k));
}
}
else {
if ( $elementName != null ) {
$newElement = $xml->addChild($elementName, $v);
$newElement["id"] = $k;
$xml["max"] = $k;
}
else {
$xml->addChild($k, $v);
}
}
}
//return $xml;
}
$xml = new SimpleXMLElement("<root />");
array_to_xml($TestArr, $xml);
$todoList = $xml->xpath("//*[@id]/..");
foreach ( $todoList as $todo ) {
$parent = $todo->xpath("..")[0];
for ( $i = 0; $i <= $todo['max']; $i++ ) {
$content = $todo->xpath("descendant::*[@id='{$i}']");
$newName = $todo->getName();
$new = $parent->addChild($newName);
foreach ( $content as $addIn ) {
$new->addChild($addIn->getName(), (string)$addIn);
}
}
unset ( $parent->$newName[0]);
}
print $xml->asXML();
Outputs...
<?xml version="1.0"?>
<root>
<First>
<Martha>Text01</Martha>
<Lucy>Text02</Lucy>
<Frank>One</Frank>
<Jessy>Two</Jessy>
<Bob>
<Jhon>01</Jhon>
</Bob>
<Bob>
<Jhon>02</Jhon>
</Bob>
</First>
<Second>
<Mary>
<Jhon>03</Jhon>
<Frank>Three</Frank>
<Jessy>J3</Jessy>
</Mary>
<Mary>
<Jhon>04</Jhon>
<Frank>Four</Frank>
<Jessy>J4</Jessy>
</Mary>
</Second>
</root>
Iterate over your array value and add an element with the same key for each item:
if (is_array($v)){
foreach($v as $arr_ele){
$xml->addChild($k, $arr_ele);
}
}
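A fuller sketch of that idea, assuming values never contain a bare "&" and that lists are not nested directly inside other lists. The function names arrayToXml and isListArray are hypothetical. Note that this repeats the innermost tag (<Jhon> twice inside one <Bob>), not the grandparent grouping asked for in the question.

```php
<?php
// Hypothetical helper: true for non-empty arrays with keys 0..n-1.
function isListArray(array $value): bool
{
    return $value !== [] && array_keys($value) === range(0, count($value) - 1);
}

// Hypothetical converter: list arrays become repeated sibling elements
// that all share the key of the list as their tag name.
function arrayToXml(array $data, SimpleXMLElement $xml): void
{
    foreach ($data as $key => $value) {
        // Wrap scalars and associative arrays so every case is a list of items.
        $items = (is_array($value) && isListArray($value)) ? $value : [$value];
        foreach ($items as $item) {
            if (is_array($item)) {
                arrayToXml($item, $xml->addChild($key));
            } else {
                // Assumption: no bare "&" in the value; escape first if there can be.
                $xml->addChild($key, (string) $item);
            }
        }
    }
}

$data = ['Bob' => ['Jhon' => ['01', '02']], 'Frank' => 'One'];
$xml = new SimpleXMLElement('<root/>');
arrayToXml($data, $xml);
echo $xml->asXML();
```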
Up until now, I've been using the snippet below to convert an XML tree to an array:
$a = json_decode(json_encode((array) simplexml_load_string($xml)),1);
However, I'm now working with XML that has duplicate keys, so the array breaks when it loops through the XML. For example:
<users>
<user>x</user>
<user>y</user>
<user>z</user>
</users>
Is there a better method that allows for duplicate keys, or perhaps a way to add an incremented value to each key when it builds the array, like this:
$array = array(
    'users' => array(
        'user_1' => 'x',
        'user_2' => 'y',
        'user_3' => 'z'
    )
);
I'm stumped, so any help would be very appreciated.
Here is a complete, universal recursive solution.
This class will parse any XML structure, with or without attributes, from the simplest to the most complex.
It retains all values and converts them to a suitable type (bool, text or number), generates suitable array keys for all element groups including attributes, keeps duplicate elements, and so on.
Please forgive the statics: it is part of a large set of XML tools I used before rewriting them all for HHVM or pthreads. I haven't had time to structure this one properly, but it works like a charm in straightforward PHP.
For attributes, the key used in this case is '#attr', but it can be whatever suits your needs.
$xml = "<body>
<users id='group 1'>
<user>x</user>
<user>y</user>
<user>z</user>
</users>
<users id='group 2'>
<user>x</user>
<user>y</user>
<user>z</user>
</users>
</body>";
$result = xml_utils::xml_to_array($xml);
result:
Array ( [users] => Array ( [0] => Array ( [user] => Array ( [0] => x [1] => y [2] => z ) [#attr] => Array ( [id] => group 1 ) ) [1] => Array ( [user] => Array ( [0] => x [1] => y [2] => z ) [#attr] => Array ( [id] => group 2 ) ) ) )
Class:
class xml_utils {
/*object to array mapper */
public static function objectToArray($object) {
if (!is_object($object) && !is_array($object)) {
return $object;
}
if (is_object($object)) {
$object = get_object_vars($object);
}
return array_map([self::class, 'objectToArray'], $object); // callback must name the static method, not a global function
}
/* xml DOM loader*/
public static function xml_to_array($xmlstr) {
$doc = new DOMDocument();
$doc->loadXML($xmlstr);
return xml_utils::dom_to_array($doc->documentElement);
}
/* recursive XMl to array parser */
public static function dom_to_array($node) {
$output = array();
switch ($node->nodeType) {
case XML_CDATA_SECTION_NODE:
case XML_TEXT_NODE:
$output = trim($node->textContent);
break;
case XML_ELEMENT_NODE:
for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
$child = $node->childNodes->item($i);
$v = xml_utils::dom_to_array($child);
if (isset($child->tagName)) {
$t = xml_utils::ConvertTypes($child->tagName);
if (!isset($output[$t])) {
$output[$t] = array();
}
$output[$t][] = $v;
} elseif ($v) {
$output = (string) $v;
}
}
if (is_array($output)) {
if ($node->attributes->length) {
$a = array();
foreach ($node->attributes as $attrName => $attrNode) {
$a[$attrName] = xml_utils::ConvertTypes($attrNode->value);
}
$output['#attr'] = $a;
}
foreach ($output as $t => $v) {
if (is_array($v) && count($v) == 1 && $t != '#attr') {
$output[$t] = $v[0];
}
}
}
break;
}
return $output;
}
/* elements converter */
public static function ConvertTypes($org) {
if (is_numeric($org)) {
$val = floatval($org);
} else {
if ($org === 'true') {
$val = true;
} else if ($org === 'false') {
$val = false;
} else {
if ($org === '') {
$val = null;
} else {
$val = $org;
}
}
}
return $val;
}
}
You can loop through each key in your result and if the value is an array (as it is for user that has 3 elements in your example) then you can add each individual value in that array to the parent array and unset the value:
foreach($a as $user_key => $user_values) {
if(!is_array($user_values))
continue; //not an array nothing to do
unset($a[$user_key]); //it's an array so remove it from parent array
$i = 1; //counter for new key
//add each value to the parent array with numbered keys
foreach($user_values as $user_value) {
$new_key = $user_key . '_' . $i++; //create new key i.e 'user_1'
$a[$new_key] = $user_value; //add it to the parent array
}
}
var_dump($a);
First of all this line of code contains a superfluous cast to array:
$a = json_decode(json_encode((array) simplexml_load_string($xml)),1);
^^^^^^^
When you JSON-encode a SimpleXMLElement (which is what simplexml_load_string returns when the parameter could be parsed as XML), it already behaves as if it had been cast to an array. So it's better to remove the cast:
$sxml = simplexml_load_string($xml);
$array = json_decode(json_encode($sxml), 1);
Even though the result is still the same, this now allows you to create a subtype of SimpleXMLElement that implements the JsonSerializable interface and changes the array creation to suit your needs.
The overall method (as well as the default behaviour) is outlined in a blog series of mine. I have also left some more examples on Stack Overflow:
PHP convert XML to JSON group when there is one child (Jun 2013)
Resolve namespaces with SimpleXML regardless of structure or namespace (Oct 2014)
XML to JSON conversion in PHP SimpleXML (Dec 2014)
Your case I think is similar to what has been asked in the first of those three links.
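As a rough sketch of the JsonSerializable subtype mentioned above, applied to the numbered keys asked for in the question. The class name NumberedKeysElement is hypothetical, attributes are ignored, and every child gets a numeric suffix even when its name is unique. Treat it as an illustration of the approach rather than the code from the linked posts.

```php
<?php
// Hypothetical subtype: json_encode() calls jsonSerialize() on every
// element, so each level can decide how its children are keyed.
class NumberedKeysElement extends SimpleXMLElement implements JsonSerializable
{
    #[\ReturnTypeWillChange]
    public function jsonSerialize()
    {
        $children = [];
        $seen = [];
        foreach ($this->children() as $name => $child) {
            // Count how often each element name has appeared so far.
            $seen[$name] = ($seen[$name] ?? 0) + 1;
            $children[$name . '_' . $seen[$name]] = $child;
        }

        // Leaf elements serialize as their trimmed text content.
        return $children === [] ? trim((string) $this) : $children;
    }
}

$xml = '<users><user>x</user><user>y</user><user>z</user></users>';
$sxml = simplexml_load_string($xml, NumberedKeysElement::class);
$array = json_decode(json_encode($sxml), true);
print_r($array);
```

With the sample document the three user elements should come back under the keys user_1, user_2 and user_3. As with the plain json_encode approach, the root element name (users) is not part of the resulting array.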
Here is an example of how my array should look:
$library = array(
'book' => array(
array(
'authorFirst' => 'Mark',
'authorLast' => 'Twain',
'title' => 'The Innocents Abroad'
),
array(
'authorFirst' => 'Charles',
'authorLast' => 'Dickens',
'title' => 'Oliver Twist'
)
)
);
I get the results from the Oracle database like this:
$row = oci_fetch_array($refcur, OCI_ASSOC+OCI_RETURN_NULLS);
But when I execute my code I only get one row.
For example: <books><book></book><name></name></books>
But I want all rows to be shown in the XML.
EDIT:
This is my class for converting an array to XML:
public static function toXml($data, $rootNodeName = 'data', &$xml=null)
{
// turn off compatibility mode as simple xml throws a wobbly if you don't.
if (ini_get('zend.ze1_compatibility_mode') == 1)
{
ini_set ('zend.ze1_compatibility_mode', 0);
}
if (is_null($xml))
{
$xml = simplexml_load_string("<".key($data)."/>");
}
// loop through the data passed in.
foreach($data as $key => $value)
{
// if numeric key, assume array of rootNodeName elements
if (is_numeric($key))
{
$key = $rootNodeName;
}
// delete any char not allowed in XML element names
$key = preg_replace('/[^a-z0-9\-\_\.\:]/i', '', $key);
// if another array is found, recursively call this function
if (is_array($value))
{
// create a new node unless this is an array of elements
$node = ArrayToXML::isAssoc($value) ? $xml->addChild($key) : $xml;
// recursive call - pass $key as the new rootNodeName
ArrayToXML::toXml($value, $key, $node);
}
else
{
// add single node.
$value = htmlentities($value);
$xml->addChild($key,$value);
}
}
// pass back as string. or simple xml object if you want!
return $xml->asXML();
}
// determine if a variable is an associative array
public static function isAssoc( $array ) {
return (is_array($array) && 0 !== count(array_diff_key($array, array_keys(array_keys($array)))));
}
}
?>
I have now tried the response below. The problem is that I get <book>...</book> tags after each row. Then I tried a 3-dimensional array, and now I get <book><book>...</book></book> in the proper place, but there are two of them.
This is the line that determines the root of the array, and that's why I get this output, but I don't know how to change it: $xml = simplexml_load_string("<".key($data)."/>");
Thank you.
oci_fetch_array() always returns a single row. To get all of them, you need to call it repeatedly until there are no more rows to fetch:
while ($row = oci_fetch_array($refcur, OCI_ASSOC+OCI_RETURN_NULLS))
{
$library['book'][] = $row;
}
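Once the rows are collected this way, each one can become its own <book> element. The sketch below uses hard-coded rows as a stand-in for the fetch loop so that it runs without a database. The function name libraryToXml is hypothetical, and the books root element is an assumption taken from the example output in the question.

```php
<?php
// Hypothetical helper: one <book> element per row, one child per column.
function libraryToXml(array $library): string
{
    $xml = new SimpleXMLElement('<books/>');
    foreach ($library['book'] as $row) {
        $book = $xml->addChild('book');
        foreach ($row as $column => $value) {
            // Escape &, < and > before handing the value to addChild();
            // the cast turns NULL columns into empty strings.
            $book->addChild($column, htmlspecialchars((string) $value, ENT_XML1));
        }
    }

    return $xml->asXML();
}

// Stand-in for: while ($row = oci_fetch_array(...)) { $library['book'][] = $row; }
$library = [
    'book' => [
        ['authorFirst' => 'Mark', 'authorLast' => 'Twain', 'title' => 'The Innocents Abroad'],
        ['authorFirst' => 'Charles', 'authorLast' => 'Dickens', 'title' => 'Oliver Twist'],
    ],
];

$xmlString = libraryToXml($library);
echo $xmlString;
```

Building the root element explicitly avoids deriving it from key($data), which is the line the question identifies as the source of the doubled <book> tags.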
I've been trying to figure this out for a short while now but am having no luck. For example, I have an external XML document like this:
<?xml version="1.0" ?>
<template>
<name>My Template Name</name>
<author>John Doe</author>
<positions>
<position>top-a</position>
<position>top-b</position>
<position>sidebar-a</position>
<position>footer-a</position>
</positions>
</template>
How can I process this document to create variables like this:
$top-a = top-a;
$top-b = top-b;
$sidebar-a = sidebar-a;
$footer-a = footer-a
If you can't make them into variables, how would you put them into an array?
Any help will be greatly appreciated.
From the PHP web site at http://www.php.net/manual/en/function.xml-parse.php:
Ashok dot 893 at gmail dot com 26-Apr-2010 05:52
This is a very simple way to convert all applicable objects into an associative array. It works not only with SimpleXML but with any kind of object. The input can be either an array or an object. The function also takes an optional parameter: an array of indices to be excluded from the returned array. Keep in mind that it returns only the non-static, accessible properties of the object, because it uses get_object_vars().
<?php
function objectsIntoArray($arrObjData, $arrSkipIndices = array())
{
$arrData = array();
// if input is object, convert into array
if (is_object($arrObjData)) {
$arrObjData = get_object_vars($arrObjData);
}
if (is_array($arrObjData)) {
foreach ($arrObjData as $index => $value) {
if (is_object($value) || is_array($value)) {
$value = objectsIntoArray($value, $arrSkipIndices); // recursive call
}
if (in_array($index, $arrSkipIndices)) {
continue;
}
$arrData[$index] = $value;
}
}
return $arrData;
}
?>
Usage:
<?php
$xmlUrl = "feed.xml"; // XML feed file/URL
$xmlStr = file_get_contents($xmlUrl);
$xmlObj = simplexml_load_string($xmlStr);
$arrXml = objectsIntoArray($xmlObj);
print_r($arrXml);
?>
This will give the following result:
Array
(
[name] => My Template Name
[author] => John Doe
[positions] => Array
(
[position] => Array
(
[0] => top-a
[1] => top-b
[2] => sidebar-a
[3] => footer-a
)
)
)
You want the built-in SimpleXML class.
Take a look at SimpleXML:
http://www.php.net/manual/en/simplexml.examples-basic.php
It parses XML into a "map-like" structure which you can then use to access your content. For your particular case:
$xml = new SimpleXMLElement($xmlstr);
// $xml is the <template> root element itself, so start from positions
$top_a = (string) $xml->positions->position[0];
The simplest method is to use SimpleXML:
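$xml = simplexml_load_string(... your xml here...);
$values = array();
// iterate the <position> children, not the single <positions> wrapper
foreach ($xml->positions->position as $pos) {
    $values[(string) $pos] = (string) $pos;
}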
You do not want to auto-create variables in the manner you suggest - it litters your variable name space with garbage. Consider what happens if someone sends over an XML snippet which has <position>_SERVER</position> and you create a variable of that name - there goes your $_SERVER superglobal.
Why not build the array directly?
var positions = document.getElementsByTagName("positions");
var positions_final_arr = [];
for (var i = 0; i < positions.length; i++) {
    positions_final_arr[i] = [];
    var inner_pos = positions[i].getElementsByTagName("position");
    for (var l = 0; l < inner_pos.length; l++) {
        // read the text of the l-th <position> inside the i-th <positions>
        positions_final_arr[i][l] = inner_pos[l].textContent;
    }
}
console.log(positions_final_arr);
$str = "your xml";
$xml = simplexml_load_string($str);
$result = array();
foreach ($xml->positions as $pos) {
foreach ($pos->position as $p) {
$element = (string)$p[0];
$result[$element] = $element;
}
}
var_dump($result);
Use SimpleXML to parse the file into an object/array structure, then simply use list:
$sxml = new SimpleXMLElement($xml);
$positions = (array)$sxml->positions->children();
list($top_a, $top_b, $sidebar_a, $footer_a) = $positions['position'];
$dom = new DOMDocument;
$dom->loadXML('<root><position>a</position></root>'); //your string here
//$dom->loadXML(file_get_contents($file_with_pxml)); - from file
$position = $dom->getElementsByTagName('position');
for ($i=0; $i<$position->length; $i++)
{
$item = $position->item($i);
${$item->nodeValue} = $item->nodeValue;//$$item->nodeValue = $item->nodeValue;
}
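As far as I know, though, you can't write a variable with a dash in its name directly in PHP: $top-a is parsed as $top minus the constant a, so such a variable is only reachable through the ${'top-a'} syntax.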
<?php
$xmlUrl = "feed.xml"; // XML feed file/URL
$xmlStr = file_get_contents($xmlUrl);
$xmlObj = simplexml_load_string($xmlStr);
$arrXml = json_decode(json_encode($xmlObj), true); # the magic!!!
print_r($arrXml);
?>
This will give the following result:
Array
(
[name] => My Template Name
[author] => John Doe
[positions] => Array
(
[position] => Array
(
[0] => top-a
[1] => top-b
[2] => sidebar-a
[3] => footer-a
)
)
)