PHP parse nested XML with unique attributes - php

I have a nested XML that I need to traverse and get not only the nodes, but also the attribute key and value which are each different.
I tried writing a recursive function in PHP to get what I was looking for. My XML looks like the following...
<document>
<character>
<literal>name</literal>
<codepoint>
<cp_value cp_type="ucs">4e9c</cp_value>
<cp_value cp_type="jis208">16-01</cp_value>
</codepoint>
<radical>
<rad_value rad_type="classical">7</rad_value>
<rad_value rad_type="nelson_c">1</rad_value>
</radical>
<meaning_group>
<meaning>this</meaning>
<meaning>that</meaning>
</meaning_group>
</character>
...
</document>
The problem is that not all [character] nodes have the exact same children.
I am trying to pull the attribute key and value to combine into one key, then associate the value as the value. If there is no attribute, I want to use the tag name as the key. Also, some children have the same name with no attribute. In this case, I want to just put them in one field separated by a line break. Thanks!!
["literal"] => "name",
["cp_type-ucs"] => "4e9c",
["cp_type-jis208"] => "16-01",
["rad_type-classical"] => "7",
["rad_type-nelson_c"] => "1",
["meaning"] => "this\nthat"
That's the array I want to output...
Any and all help would be greatly appreciated! Thanks!
EDIT: Added some code that I can use to traverse through the layers and get the tag names to echo, but for some reason, they won't populate the array. Just the "character" tag will go in the array.
function ripXML($file) {
$xml = simplexml_load_file ( $file );
return (peelTags ( $xml , array()) );
}
function peelTags($node, $tag) {
// find if there are children. (IF SO, there shouldn't be
$numChildren = @count ( $node->children () );
if ($numChildren != 0) {
foreach ( $node->children () as $child ) {
$tag [] = $child->getName ();
peelTags ( $child, $tag);
echo "<br />Name = " . $child->getName ();
}
}
return $tag;
}
$file = "dictionarytest.xml";
print_r ( ripXML ( $file ) );
EDIT 2 -
I figured it out finally. It might be a bit messy and not the best way to go about it, but it solved the problem that I was faced with. In case anyone else needed something similar, here it is!
$_SESSION ["a"] = array ();
$_SESSION ["c"] = 0;
function ripXML($file) {
$xml = simplexml_load_file ( $file );
return (peelTags ( $xml, array () ));
}
function peelTags($node, $tag) {
// find if there are children. (IF SO, there shouldn't be
$numChildren = @count ( $node->children () );
if ($numChildren != 0) {
foreach ( $node->children () as $child ) {
peelTags ( $child, $tag );
$tag = $child->getName ();
if ($tag == "literal") {
$_SESSION ["c"] ++;
}
$value = trim($child->__toString ());
if (isset ( $value ) && $value != "") {
if ($child->attributes ()) {
foreach ( $child->attributes () as $k => $v ) {
if (isset ( $v ) && $v != "") {
$_SESSION ["a"] [$_SESSION ["c"]] [$k . "_" . $v] = $value;
}
}
} else {
$_SESSION ["a"] [$_SESSION ["c"]] [$tag] = $value;
}
}
}
}
return 1;
}
$file = "dictionarytest.xml";
print_r ( ripXML ( $file ) );
print_r ( $_SESSION ["a"] );
I used global session variables to store the array and counter for the recursive algorithm. I don't know if anyone has a better suggestion. I would like to optimize this function if possible. I was testing it on an XML file of only 5 entries, but my real file will have over 4000.

This is a bit confusing. I did not syntax-check or test this, but I think it's something like this:
$domd = new DOMDocument();
$domd->loadXML($xml); // $xml holds the XML as a string
$interestingdomnode = $domd->getElementsByTagName("character")->item(0);
$parsed_info = array();
$parsed_info['literal'] = $interestingdomnode->getElementsByTagName("literal")->item(0)->textContent;
foreach ($interestingdomnode->getElementsByTagName("cp_value") as $cp) {
    // DOM attributes are read with getAttribute(), not as object properties
    $parsed_info["cp_type-" . $cp->getAttribute("cp_type")] = $cp->textContent;
}
// the elements are named rad_value; rad_type is their attribute
foreach ($interestingdomnode->getElementsByTagName("rad_value") as $rad) {
    $parsed_info["rad_type-" . $rad->getAttribute("rad_type")] = $rad->textContent;
}
$parsed_info['meaning'] = '';
foreach ($interestingdomnode->getElementsByTagName("meaning") as $meaning) {
    $parsed_info['meaning'] .= $meaning->textContent . PHP_EOL;
}
// drop the trailing line break left by the loop
$parsed_info['meaning'] = rtrim($parsed_info['meaning']);
var_dump($parsed_info);
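For the flat layout the question asks for, a recursive SimpleXML walk that returns its result would avoid both the session variables and the hard-coded tag names. The following is only a minimal sketch: the function names collectLeaves and flattenCharacter are made up for illustration, and it assumes every value sits in a leaf element, as in the sample XML.

```php
<?php
// Hypothetical helpers (names are not from the original posts).

// Walk $node recursively and add every non-empty leaf element to $out.
function collectLeaves(SimpleXMLElement $node, array &$out)
{
    foreach ($node->children() as $child) {
        if (count($child->children()) > 0) {
            // Container such as <codepoint> or <radical>: descend into it.
            collectLeaves($child, $out);
            continue;
        }
        $value = trim((string) $child);
        if ($value === '') {
            continue;
        }
        // One key per attribute ("cp_type-ucs"), or the tag name if there are none.
        $keys = array();
        foreach ($child->attributes() as $name => $attrValue) {
            $keys[] = $name . '-' . $attrValue;
        }
        if (count($keys) === 0) {
            $keys[] = $child->getName();
        }
        foreach ($keys as $key) {
            // Repeated keys (e.g. <meaning>) are joined with a line break.
            $out[$key] = isset($out[$key]) ? $out[$key] . "\n" . $value : $value;
        }
    }
}

// Return the flat key/value array for one <character> element.
function flattenCharacter(SimpleXMLElement $character)
{
    $out = array();
    collectLeaves($character, $out);
    return $out;
}

$xml = simplexml_load_string(
    '<document><character>'
    . '<literal>name</literal>'
    . '<codepoint><cp_value cp_type="ucs">4e9c</cp_value>'
    . '<cp_value cp_type="jis208">16-01</cp_value></codepoint>'
    . '<radical><rad_value rad_type="classical">7</rad_value>'
    . '<rad_value rad_type="nelson_c">1</rad_value></radical>'
    . '<meaning_group><meaning>this</meaning><meaning>that</meaning></meaning_group>'
    . '</character></document>'
);

$characters = array();
foreach ($xml->character as $character) {
    $characters[] = flattenCharacter($character);
}
print_r($characters);
```

Because each <character> is flattened on its own, the "literal" counter from EDIT 2 is no longer needed, and memory use stays proportional to one entry's worth of keys per array element.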

Related

Convert PHP array from XML that contains duplicate elements

Up until now, I've been using the snippet below to convert an XML tree to an array:
$a = json_decode(json_encode((array) simplexml_load_string($xml)),1);
..however, I'm now working with an XML that has duplicate key values, so the array is breaking when it loops through the XML. For example:
<users>
<user>x</user>
<user>y</user>
<user>z</user>
</users>
Is there a better method to do this that allows for duplicate Keys, or perhaps a way to add an incremented value to each key when it spits out the array, like this:
$array = array(
users => array(
user_1 => x,
user_2 => y,
user_3 => z
)
)
I'm stumped, so any help would be very appreciated.
Here is a complete, universal recursive solution.
This class parses XML of any structure, with or without attributes, from the simplest to the most complex.
It retains all values and converts them (bool, text, or number), generates suitable array keys for all element groups including attributes, keeps duplicate elements, and so on.
Please forgive the statics: it is part of a large XML tool set I used before rewriting everything for HHVM or pthreads. I haven't had time to construct this one properly, but it will work like a charm for straightforward PHP.
For attributes, the declared key is '#attr' in this case, but it can be whatever you need.
$xml = "<body>
<users id='group 1'>
<user>x</user>
<user>y</user>
<user>z</user>
</users>
<users id='group 2'>
<user>x</user>
<user>y</user>
<user>z</user>
</users>
</body>";
$result = xml_utils::xml_to_array($xml);
result:
Array ( [users] => Array ( [0] => Array ( [user] => Array ( [0] => x [1] => y [2] => z ) [#attr] => Array ( [id] => group 1 ) ) [1] => Array ( [user] => Array ( [0] => x [1] => y [2] => z ) [#attr] => Array ( [id] => group 2 ) ) ) )
Class:
class xml_utils {
/*object to array mapper */
public static function objectToArray($object) {
if (!is_object($object) && !is_array($object)) {
return $object;
}
if (is_object($object)) {
$object = get_object_vars($object);
}
return array_map(array('xml_utils', 'objectToArray'), $object); // static method needs a class-qualified callback
}
/* xml DOM loader*/
public static function xml_to_array($xmlstr) {
$doc = new DOMDocument();
$doc->loadXML($xmlstr);
return xml_utils::dom_to_array($doc->documentElement);
}
/* recursive XMl to array parser */
public static function dom_to_array($node) {
$output = array();
switch ($node->nodeType) {
case XML_CDATA_SECTION_NODE:
case XML_TEXT_NODE:
$output = trim($node->textContent);
break;
case XML_ELEMENT_NODE:
for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
$child = $node->childNodes->item($i);
$v = xml_utils::dom_to_array($child);
if (isset($child->tagName)) {
$t = xml_utils::ConvertTypes($child->tagName);
if (!isset($output[$t])) {
$output[$t] = array();
}
$output[$t][] = $v;
} elseif ($v) {
$output = (string) $v;
}
}
if (is_array($output)) {
if ($node->attributes->length) {
$a = array();
foreach ($node->attributes as $attrName => $attrNode) {
$a[$attrName] = xml_utils::ConvertTypes($attrNode->value);
}
$output['#attr'] = $a;
}
foreach ($output as $t => $v) {
if (is_array($v) && count($v) == 1 && $t != '#attr') {
$output[$t] = $v[0];
}
}
}
break;
}
return $output;
}
/* elements converter */
public static function ConvertTypes($org) {
if (is_numeric($org)) {
$val = floatval($org);
} else {
if ($org === 'true') {
$val = true;
} else if ($org === 'false') {
$val = false;
} else {
if ($org === '') {
$val = null;
} else {
$val = $org;
}
}
}
return $val;
}
}
You can loop through each key in your result and if the value is an array (as it is for user that has 3 elements in your example) then you can add each individual value in that array to the parent array and unset the value:
foreach($a as $user_key => $user_values) {
if(!is_array($user_values))
continue; //not an array nothing to do
unset($a[$user_key]); //it's an array so remove it from parent array
$i = 1; //counter for new key
//add each value to the parent array with numbered keys
foreach($user_values as $user_value) {
$new_key = $user_key . '_' . $i++; //create new key i.e 'user_1'
$a[$new_key] = $user_value; //add it to the parent array
}
}
var_dump($a);
First of all this line of code contains a superfluous cast to array:
$a = json_decode(json_encode((array) simplexml_load_string($xml)),1);
^^^^^^^
When you JSON-encode a SimpleXMLElement (which is what simplexml_load_string returns when the parameter can be parsed as XML), it already behaves as if it had been cast to an array. So it's better to remove the cast:
$sxml = simplexml_load_string($xml);
$array = json_decode(json_encode($sxml), 1);
Even though the result is still the same, this now allows you to create a subtype of SimpleXMLElement that implements the JsonSerializable interface and changes the array creation to suit your needs.
The overall method (as well as the default behaviour) is outlined in a blog-series of mine, on Stackoverflow I have left some more examples already as well:
PHP convert XML to JSON group when there is one child (Jun 2013)
Resolve namespaces with SimpleXML regardless of structure or namespace (Oct 2014)
XML to JSON conversion in PHP SimpleXML (Dec 2014)
Your case I think is similar to what has been asked in the first of those three links.
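As a rough illustration of that subtype idea applied to the users example, the sketch below numbers every child element so that duplicates get distinct keys. The class name NumberedXmlElement is made up, attributes are ignored, and every child receives a suffix, including elements that occur only once, so treat it as a starting point rather than a finished converter.

```php
<?php
// Hypothetical subclass; the name NumberedXmlElement is not from the original posts.
class NumberedXmlElement extends SimpleXMLElement implements JsonSerializable
{
    // The attribute below avoids the return-type deprecation notice on
    // PHP 8.1+; older PHP versions read that line as a comment.
    #[\ReturnTypeWillChange]
    public function jsonSerialize()
    {
        // Leaf element: serialize as its text content.
        if (count($this->children()) === 0) {
            return (string) $this;
        }
        $result = array();
        $seen = array();
        foreach ($this->children() as $child) {
            $name = $child->getName();
            $seen[$name] = isset($seen[$name]) ? $seen[$name] + 1 : 1;
            // Children are instances of this class too, so json_encode()
            // calls jsonSerialize() on them recursively.
            $result[$name . '_' . $seen[$name]] = $child;
        }
        return $result;
    }
}

$xml = '<users><user>x</user><user>y</user><user>z</user></users>';
$sxml = simplexml_load_string($xml, 'NumberedXmlElement');
$array = json_decode(json_encode($sxml), true);
print_r($array);
```

As with the default conversion, the root element name (users) does not appear in the resulting array.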

PHP parse assoc. array or XML

How do I access this associative array?
Array
(
[order-id] => Array
(
[0] => 1
[1] => 2
)
)
as a result of XML parsing
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE request SYSTEM "http://shits.com/wtf.dtd">
<request version="0.5">
<order-states-request>
<order-ids>
<order-id>1</order-id>
<order-id>2</order-id>
...
</order-ids>
</order-states-request>
</request>
$body = file_get_contents('php://input');
$xml = simplexml_load_string($body);
$src = $xml->{'order-states-request'}->{'order-ids'};
foreach ($src as $order) {
echo ' ID:'.$order->{'order-id'};
// dont work - echoes only ID:1, why?
}
// ok, lets try just another way...
$items = toArray($src); //googled function - see at the bottom
print_r($items);
// print result - see at the top assoc array
// and how to acces order ids in this (fck) assoc array???
//------------------------------------------
function toArray(SimpleXMLElement $xml) {
$array = (array)$xml;
foreach ( array_slice($array, 0) as $key => $value ) {
if ( $value instanceof SimpleXMLElement ) {
$array[$key] = empty($value) ? NULL : toArray($value);
}
}
return $array;
}
MANY THANKS FOR ANY HELP!
What you want is:
$body = file_get_contents('php://input');
$xml = simplexml_load_string($body);
$src = $xml->{'order-states-request'}->{'order-ids'}->{'order-id'};
foreach ($src as $id)
{
echo ' ID:', $id, "\n";
}
What happens with your code is that you're trying to loop:
$xml->{'order-states-request'}->{'order-ids'}
Which is not the array you want, order-id is, as you can see on your dump:
Array
(
[order-id] => Array

php objects, how to access a simplexmlelement

I have the following object:
object(SimpleXMLElement)#337 (1) { [0]=> string(4) "1001" }
But I can't seem to access it using [0] or even not using foreach($value as $obj=>$objvalue)
What am I doing wrong?
SimpleXMLElement implements Traversable, which means you could use foreach to loop it.
Try to use
$objectarray = get_object_vars($value); // $value is your SimpleXMLElement
By looking into the SimpleXMLElement manual I found this example (the example XML file is on the top of the page of the link):
$movies = new SimpleXMLElement($xmlstr);
/* For each <character> node, we echo a separate <name>. */
foreach ($movies->movie->characters->character as $character) {
echo $character->name, ' played by ', $character->actor, PHP_EOL;
}
And I found this function to transform the XML object to an array, maybe that's easier to use?:
function toArray($xml) { //$xml is of type SimpleXMLElement
$array = json_decode(json_encode($xml), TRUE);
foreach ( array_slice($array, 0) as $key => $value ) {
if ( empty($value) ) $array[$key] = NULL;
elseif ( is_array($value) ) $array[$key] = toArray($value);
}
return $array;
}
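For an element that holds nothing but text, like the one dumped in the question, a type cast is usually enough and no array conversion is needed. A minimal sketch, assuming the element looks like the stand-in below (the tag name id is invented, since the question does not show the XML):

```php
<?php
// Stand-in for the element from the question: it contains only the text "1001".
$value = simplexml_load_string('<id>1001</id>');

// Casting reads the text content of the element.
$asString = (string) $value;
$asInt = (int) $value;

var_dump($asString, $asInt);
```

The [0] shown by var_dump is how SimpleXML displays the text content in a dump, which is why indexing with [0] hands back another SimpleXMLElement rather than a plain string.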

Converting array into multidimensional array in PHP when result is sent from Oracle

Here is an example of how my array should look:
$library = array(
'book' => array(
array(
'authorFirst' => 'Mark',
'authorLast' => 'Twain',
'title' => 'The Innocents Abroad'
),
array(
'authorFirst' => 'Charles',
'authorLast' => 'Dickens',
'title' => 'Oliver Twist'
)
)
);
When I get results from oracle database:
$row = oci_fetch_array($refcur, OCI_ASSOC+OCI_RETURN_NULLS);
But when I execute my code I only get one row.
For example: <books><book></book><name></name></books>
But I want all rows to be shown in xml.
EDIT:
This is my class for converting an array to XML:
class ArrayToXML
{
public static function toXml($data, $rootNodeName = 'data', &$xml=null)
{
// turn off compatibility mode as simple xml throws a wobbly if you don't.
if (ini_get('zend.ze1_compatibility_mode') == 1)
{
ini_set ('zend.ze1_compatibility_mode', 0);
}
if (is_null($xml))
{
$xml = simplexml_load_string("<".key($data)."/>");
}
// loop through the data passed in.
foreach($data as $key => $value)
{
// if numeric key, assume array of rootNodeName elements
if (is_numeric($key))
{
$key = $rootNodeName;
}
// delete any char not allowed in XML element names
$key = preg_replace('/[^a-z0-9\-\_\.\:]/i', '', $key);
// if there is another array found recursively call this function
if (is_array($value))
{
// create a new node unless this is an array of elements
$node = ArrayToXML::isAssoc($value) ? $xml->addChild($key) : $xml;
// recursive call - pass $key as the new rootNodeName
ArrayToXML::toXml($value, $key, $node);
}
else
{
// add single node.
$value = htmlentities($value);
$xml->addChild($key,$value);
}
}
// pass back as string. or simple xml object if you want!
return $xml->asXML();
}
// determine if a variable is an associative array
public static function isAssoc( $array ) {
return (is_array($array) && 0 !== count(array_diff_key($array, array_keys(array_keys($array)))));
}
}
?>
I have now tried the answer below. The problem is that I get <book>...</book> tags after each row. Then I tried a three-dimensional array, and now I get <book><book>...</book></book> in the proper place, but there are two of them.
This is the line that determines the root element of the array, which is why I get this output, but I don't know how to change it: $xml = simplexml_load_string("<".key($data)."/>");
Thank you.
oci_fetch_array() will always return a single row, you need to call it until there are no more rows to fetch in order to get all of them:
while ($row = oci_fetch_array($refcur, OCI_ASSOC+OCI_RETURN_NULLS))
{
$library['book'][] = $row;
}
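Once every row is collected this way, one possible way to avoid the doubled <book> tags is to build the XML directly, with one <book> element per row. This is a sketch under assumptions: the rows are hard-coded in place of the oci_fetch_array() loop, and the root element name library is chosen only to match the $library variable.

```php
<?php
// Rows hard-coded here in place of the oci_fetch_array() loop.
$library = array('book' => array(
    array('authorFirst' => 'Mark', 'authorLast' => 'Twain', 'title' => 'The Innocents Abroad'),
    array('authorFirst' => 'Charles', 'authorLast' => 'Dickens', 'title' => 'Oliver Twist'),
));

// Assumed root element name; pick whatever the consumer of the XML expects.
$xml = new SimpleXMLElement('<library/>');
foreach ($library['book'] as $row) {
    // Exactly one <book> per fetched row.
    $book = $xml->addChild('book');
    foreach ($row as $column => $value) {
        // addChild() does not escape "&", so escape the text first.
        // NULL columns (OCI_RETURN_NULLS) become empty elements.
        $book->addChild($column, htmlspecialchars((string) $value, ENT_XML1));
    }
}
echo $xml->asXML();
```

With real Oracle results the element names follow the column names of the result set, so they may come back in upper case.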

PHP Dynamically adding dimensions to an array using for loop

Here is my dilemma and thank you in advance!
I am trying to create a variable variable or something of the sort for a dynamic associative array and having a hell of a time figuring out how to do this. I am creating a file explorer so I am using the directories as the keys in the array.
Example:
I need to get this so I can assign it values
$dir_list['root']['folder1']['folder2'] = value;
so I was thinking of doing something along these lines...
if ( $handle2 = @opendir( $theDir.'/'.$file ))
{
$tmp_dir_url = explode($theDir);
for ( $k = 1; $k < sizeof ( $tmp_dir_url ); $k++ )
{
$dir_list [ $dir_array [ sizeof ( $dir_array ) - 1 ] ][$tmp_dir_url[$k]]
}
This is where I get stuck: I need to dynamically append a new dimension to the array during each iteration of the for loop, but I have NO CLUE how.
I would use a recursive approach like this:
function read_dir_recursive( $dir ) {
    $results = array( 'subdirs' => array(), 'files' => array() );
    foreach( scandir( $dir ) as $item ) {
        // skip . & ..
        if ( preg_match( '/^\.\.?$/', $item ) )
            continue;
        $full = "$dir/$item";
        if ( is_dir( $full ) )
            // recurse under this function's own name
            $results['subdirs'][$item] = read_dir_recursive( $full );
        else
            $results['files'][] = $item;
    }
    // hand the collected tree back to the caller
    return $results;
}
The code is untested as I have no PHP here to try it out.
Cheers,haggi
You can freely put an array into an array cell, which effectively adds a dimension only for the directories that need it.
For example:
$a['x'] = 'text';
$a['y'] = array('q', 'w');
print($a['x']);
print($a['y'][0]);
How about this? This will stack array values into multiple dimensions.
$keys = array(
'year',
'make',
'model',
'submodel',
);
$array = array();
print_r(array_concatenate($array, $keys));
function array_concatenate($array, $keys){
if(count($keys) === 0){
return $array;
}
$key = array_shift($keys);
$array[$key] = array();
$array[$key] = array_concatenate($array[$key], $keys);
return $array;
}
In my case, I knew what i wanted $keys to contain. I used it to take the place of:
if(isset($array[$key0]) && isset($array[$key0][$key1]) && isset($array[$key0][$key1][$key2])){
// do this
}
Cheers.
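To assign a value at a path such as $dir_list['root']['folder1']['folder2'] when the keys are only known at run time, a reference can be moved one level deeper on each iteration. A minimal sketch, where setNested is a made-up helper name and the path is split on '/' purely for illustration:

```php
<?php
// Hypothetical helper (not from the original posts): set $value at the
// nested position described by $keys, creating levels as needed.
function setNested(array &$array, array $keys, $value)
{
    $ref = &$array;
    foreach ($keys as $key) {
        // Create the next dimension if it is missing or not an array.
        if (!isset($ref[$key]) || !is_array($ref[$key])) {
            $ref[$key] = array();
        }
        // Move the reference one level deeper.
        $ref = &$ref[$key];
    }
    $ref = $value;
}

$dir_list = array();
setNested($dir_list, explode('/', 'root/folder1/folder2'), 'value');
setNested($dir_list, explode('/', 'root/folder1/folder3'), 'other');
print_r($dir_list);
```

Existing levels are reused rather than overwritten, so sibling folders can be added one path at a time.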
