How to get ALL elements of simplexml object - php

OK, I'm totally stumped here. I've found similar questions, but the answers don't seem to work for my specific problem. I've been working on this on and off for days.
I have this here simplexml object (it's actually much, much, MUCH longer than this, but I'm cutting out all the extraneous stuff so you'll actually look at it):
SimpleXMLElement Object
(
[SubjectClassification] => Array
(
[0] => SimpleXMLElement Object
(
[#attributes] => Array
(
[Authority] => Category Code
[Value] => s
[Id] => s
)
)
[1] => SimpleXMLElement Object
(
[#attributes] => Array
(
[Authority] => Subject
[Value] => Sports
[Id] => 54df6c687df7100483dedf092526b43e
)
)
[2] => SimpleXMLElement Object
(
[#attributes] => Array
(
[Authority] => Subject
[Value] => Professional baseball
[Id] => 20dd2c287e4e100488e5d0913b2d075c
)
)
)
)
I got this block of code by doing a print_r on a variable containing the following:
$subjects->SubjectClassification->children();
Now, I want to get at all the elements of the subjectClassification array. ALL of them! But when I do this:
$subjects->SubjectClassification;
Or this:
$subjects->SubjectClassification->children();
OR if I try to get all the array elements via a loop, all I get is this:
SimpleXMLElement Object
(
[#attributes] => Array
(
[Authority] => Category Code
[Value] => s
[Id] => s
)
)
Why? How can I get everything?

You can use XPath to do this. I find it the easiest and most efficient way, and it cuts down on the number of loops needed to reach the items you want.
If your XML looks like this:
<Subjects>
<SubjectClassification>
</SubjectClassification>
<SubjectClassification>
</SubjectClassification>
<SubjectClassification>
</SubjectClassification>
</Subjects>
Then to get all subject classifications in an array you can do the following:
$subject_classifications = $xml->xpath("//SubjectClassification");
The xml variable refers to your main simplexml object i.e. the file you loaded using simplexml.
Then you can just iterate through the array using a foreach loop like this:
foreach ($subject_classifications as $subject_classification) {
    // Authority, Value and Id are attributes in the question's output, so use array syntax.
    echo (string) $subject_classification['Authority'];
    echo (string) $subject_classification['Value'];
    echo (string) $subject_classification['Id'];
}
Your structure may vary but you get the idea anyway. You can see a good article from IBM here "Using Xpath With PHP":

Because of the extent to which SimpleXML overloads PHP syntax, relying on print_r to figure out what's in a SimpleXML object, or what you can do with it, is not always helpful. (I've written a couple of debugging functions intended to be more comprehensive.) Ultimately, the reference should be to the XML structure itself, and knowledge of how SimpleXML works.
In this case, it looks from the output you provide that what you have is a list of elements all called SubjectClassification, and all siblings to each other. So you don't want to call $subjects->SubjectClassification->children(), because those nodes have no children.
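A minimal sketch of iterating such siblings, assuming an XML document shaped roughly like the print_r output in the question. The <Subjects> wrapper and the $xmlString variable are assumptions, since the original XML is not shown:

```php
<?php
// Hypothetical XML, reconstructed from the print_r output in the question.
$xmlString = <<<XML
<Subjects>
  <SubjectClassification Authority="Category Code" Value="s" Id="s"/>
  <SubjectClassification Authority="Subject" Value="Sports" Id="54df6c687df7100483dedf092526b43e"/>
  <SubjectClassification Authority="Subject" Value="Professional baseball" Id="20dd2c287e4e100488e5d0913b2d075c"/>
</Subjects>
XML;

$subjects = simplexml_load_string($xmlString);

// Iterating the property visits every sibling element with that name.
$values = [];
foreach ($subjects->SubjectClassification as $classification) {
    // Attributes are read with array syntax and cast to string.
    $values[] = (string) $classification['Value'];
}

print_r($values);
```

No call to children() is needed here, because the loop runs over the siblings themselves rather than over anything nested inside them.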
Without a better idea of the underlying XML structure, it's hard to say more, so I'll save this incomplete answer for now.

For all descendants (children, grandchildren, great-grandchildren, and so on) of the <SubjectClassification> elements ("all the elements [...] ALL of them!" as you put it), you can use XPath, which supports such advanced queries. (At least I assume that is what you're looking for; your question gives neither a detailed description nor an example of what you mean by "all".)
SimpleXML's xpath() can only return elements and attributes, but since you need elements only, that is no showstopper:
$allOfThem = $subjects->xpath('./SubjectClassification//*');
The key point here is the Xpath expression:
./SubjectClassification//*
The dot . at the beginning makes the expression relative to the context node, which is $subjects in your case. It then selects all elements that descend from the direct child elements named SubjectClassification. This works through // (unspecified depth) and * (any element; the star acts as a wildcard).
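To see what the expression selects, here is a small sketch. The XML is made up for illustration (the Topic, Note and Other element names are assumptions, not taken from the question):

```php
<?php
// Hypothetical XML with nested elements; Topic, Note and Other are invented names.
$xmlString = <<<XML
<Subjects>
  <SubjectClassification>
    <Topic><Note>a</Note></Topic>
  </SubjectClassification>
  <SubjectClassification>
    <Topic/>
  </SubjectClassification>
  <Other><Topic/></Other>
</Subjects>
XML;

$subjects = simplexml_load_string($xmlString);

// All element descendants of the direct SubjectClassification children.
$allOfThem = $subjects->xpath('./SubjectClassification//*');

$names = [];
foreach ($allOfThem as $element) {
    $names[] = $element->getName();
}
print_r($names);
```

The Topic element under Other is not part of the result, because it does not descend from a SubjectClassification element.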
So hopefully this answers your question months after. I just stumbled over it by cleaning up some XML questions and perhaps this is useful for future reference as well.

I have added this second answer in case what's actually throwing you is retrieving the attributes array as opposed to the nodes. This is how you could print out the attributes for each SubjectClassification in your main $xml object.
foreach ($xml->SubjectClassification as $subject_classification) {
    foreach ($subject_classification->attributes() as $key => $value) {
        echo $key . " : " . $value . "\n";
    }
}

I've found that count returns the proper number of elements, and you can then use a standard for loop to iterate over them:
$n = count($subjects->SubjectClassification);
for ($i = 0; $i < $n; $i++) {
var_dump($subjects->SubjectClassification[$i]);
}
I'm not sure why the foreach loop doesn't work, nor why dumping $subjects->SubjectClassification directly only shows the first node, but for any who stumble across this ancient question as I have, the above is one way to find more information without resorting to external libraries.

Related

foreach stdClass object converted to string

I am working with a response from an API which returns data in JSON in a fairly straightforward structure. The pseudo structure is best represented as:
services -> service -> options -> option -> suboptions -> option
As this is decoded JSON, all of these are stdClass objects. I have a number of foreach statements iterating through the various services and options, which all work without issue. However, when I use a foreach statement at the suboption level, the object is serialized into a string. For your information suboptions has a structure like this.
[suboptions] => stdClass Object
(
[option] => stdClass Object
(
[code] => SOME_CODE
[name] => Some Name
)
)
When using a foreach such as this (where $option is an option in services -> service -> options -> option):
foreach($option->suboptions->option as $suboption) {
print_r($suboption);
}
It outputs
SOME_CODESome Name
Not the expected
stdClass Object
(
[code] => SOME_CODE
[name] => Some Name
)
I am not aware of any reasons foreach would do this in terms of depth or other conditions.
I've tried everything I can think of and searched through SO and can't find any case of this happening. If anyone has any ideas I'd love to hear them! Cheers.
Apologies if it has been answered elsewhere or I am missing something obvious. If I have, it has escaped me thus far.
Edit: For those asking it is indeed the AusPost PAC API. Changing over from the DRC as it is being removed ~2014.
I can't comment, because I don't have that amount of reputation, but I'll try to answer...
I'm assuming you want to remove the stdClass from the PHP array?
If so, then make sure you are using true when decoding the JSON.
Example: $array = json_decode($json, true);
For more information on stdClass, see this SO post: What is stdClass in PHP?
Again, forgive me if I'm misinterpreting your question, I just wish I could comment...
I'm assuming this is from a response from the Australia Post postage calculation API. This API makes the horrible decision to treat collection instances with only one item as a single value. For example, service -> options -> option may be an array or it may be a single option.
The easiest way I've found to deal with this is cast the option to an array, eg
$options = $service->options->option;
if (!is_array($options)) {
$options = array($options);
}
// now you can loop over it safely
foreach ($options as $option) { ... }
You would do the same thing with the sub-options. If you're interested, I've got an entire library that deals with the AusPost API, using Guzzle to manage the HTTP side.
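A minimal sketch of that normalization, applied to the sub-options. The as_list() helper and both JSON payloads are hypothetical and only mimic the single-versus-many behaviour described above. Note that foreach over a stdClass walks its public properties, which is why the loop in the question yields the strings SOME_CODE and Some Name instead of the object:

```php
<?php
// Hypothetical helper: wrap a lone value so callers can always iterate.
function as_list($value): array
{
    if ($value === null) {
        return [];
    }
    return is_array($value) ? $value : [$value];
}

// Two made-up payloads: one with a single option, one with a list of options.
$single = json_decode('{"option": {"code": "SOME_CODE", "name": "Some Name"}}');
$many   = json_decode('{"option": [{"code": "A", "name": "First"}, {"code": "B", "name": "Second"}]}');

foreach ([$single, $many] as $suboptions) {
    foreach (as_list($suboptions->option) as $suboption) {
        // $suboption is always a stdClass here, never a bare string.
        echo $suboption->code, ' => ', $suboption->name, "\n";
    }
}
```

Keeping the check in one helper avoids repeating the is_array() test at every level of the response.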
The values of the last object are not arrays so you would need both the key and value.
foreach($option->suboptions->option as $key => $value) {
echo '<p>KEY:'.$key.' VALUE:'.$value.'</p>';
}

Getting the first XML element with SimpleXML

ok, this might be a stupid question, but how do I get one single element from an XML document?
I have this XML
$element = $response['linkedin'];
SimpleXMLElement Object
(
[id] => 575677478478
[first-name] => John
[last-name] => Doe
[email-address] => john#doe.com
[picture-url] => http://m3.licdn.com/mpr/mprx/123
[headline] => Headline goes here
[industry] => Internet
[num-connections] => 71
I just want to assign first-name as $firstName
I can loop over it using xPath, but that just seems like overkill.
ex:
$fName = $element->xpath('first-name');
foreach ($fName as $name)
{
$firstName = $name;
}
If you access a list of (one or more) element nodes in SimpleXML as a single element, it will return the first element. That is by default (and outlined as well in the SimpleXML Basic Usage):
$first = $element->{'first-name'};
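As a quick sketch of how that looks end to end, assuming a document shaped like the print_r output in the question (the <person> root element is an assumption):

```php
<?php
// Hypothetical XML shaped like the print_r output in the question.
$element = simplexml_load_string(
    '<person><id>575677478478</id><first-name>John</first-name><last-name>Doe</last-name></person>'
);

// Curly-brace property syntax is needed because of the hyphen in the name.
// The cast turns the SimpleXMLElement into a plain string.
$firstName = (string) $element->{'first-name'};

echo $firstName, "\n";
```

Without the (string) cast, $firstName would hold a SimpleXMLElement rather than a string.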
If there is more than one element, you can specify which one you mean by using its zero-based index, either in square (array-access) or curly (property-access) brackets:
$first = $element->{'first-name'}[0];
$first = $element->{'first-name'}{0};
This also allows you to create a so called SimpleXML self-reference to access the element itself, e.g. to remove it:
unset($first[0]); # removes the element node from the document.
unset($first); # unsets the variable $first
You might think your XPath is overkill, but it's not that expensive in SimpleXML, and sometimes XPath is the only way to access an element. It is therefore useful to know that you can easily pick the first element out of an xpath() result as well. For example, the parent element in SimpleXML:
list($parent) = $element->xpath('..'); # PHP < 5.4
$parent = $element->xpath('..')[0]; # PHP >= 5.4
As you can see, it is worth actually understanding how things work to get more out of SimpleXML. If you already know everything from the SimpleXML Basic Usage page, you might want to learn a bit more with:
SimpleXML Type Cheatsheet
How to tell apart SimpleXML objects representing element and attribute?
SimpleXMLElement implements JsonSerializable
Answer form per request. ^^
If that SimpleXMLElement is the only one contained within $response['linkedin'], you can change it with:
$response['linkedin']->{'first-name'} = $name;
That allows you direct access to the element without needing to do an xpath on it. ^^
You can use XPath to find the first instance of a matching element.
/root/firstname[1] would give you the first instance of firstname in your document.
$res = $response['linkedin']->xpath('first-name[1]'); // relative to the current element

Why does the sort order of multidimensional child arrays revert as soon as foreach loop used for sorting ends?

I have a very strange array sorting related problem in PHP that is driving me completely crazy. I have googled for hours, and still NOTHING indicates that other people have this problem, or that this should happen to begin with, so a solution to this mystery would be GREATLY appreciated!
To describe the problem/question in as few words as possible: When sorting an array based on values inside a multiple levels deeply nested array, using a foreach loop, the resulting array sort order reverts as soon as execution leaves the loop, even though it works fine inside the loop. Why is this, and how do I work around it?
Here is sample code for my problem, which should hopefully be a little more clear than the sentence above:
$top_level_array = array('key_1' => array('sub_array' => array('sub_sub_array_1' => array(1),
'sub_sub_array_2' => array(3),
'sub_sub_array_3' => array(2)
)
)
);
function mycmp($arr_1, $arr_2)
{
if ($arr_1[0] == $arr_2[0])
{
return 0;
}
return ($arr_1[0] < $arr_2[0]) ? -1 : 1;
}
foreach($top_level_array as $current_top_level_member)
{
//This loop will only have one iteration, but never mind that...
print("Inside loop before sort operation:\n\n");
print_r($current_top_level_member['sub_array']);
uasort($current_top_level_member['sub_array'], 'mycmp');
print("\nInside loop after sort operation:\n\n");
print_r($current_top_level_member['sub_array']);
}
print("\nOutside of loop (i.e. after all sort operations finished):\n\n");
print_r($top_level_array);
The output of this is as follows:
Inside loop before sort operation:
Array
(
[sub_sub_array_1] => Array
(
[0] => 1
)
[sub_sub_array_2] => Array
(
[0] => 3
)
[sub_sub_array_3] => Array
(
[0] => 2
)
)
Inside loop after sort operation:
Array
(
[sub_sub_array_1] => Array
(
[0] => 1
)
[sub_sub_array_3] => Array
(
[0] => 2
)
[sub_sub_array_2] => Array
(
[0] => 3
)
)
Outside of loop (i.e. after all sort operations finished):
Array
(
[key_1] => Array
(
[sub_array] => Array
(
[sub_sub_array_1] => Array
(
[0] => 1
)
[sub_sub_array_2] => Array
(
[0] => 3
)
[sub_sub_array_3] => Array
(
[0] => 2
)
)
)
)
As you can see, the sort order is "wrong" (i.e. not ordered by the desired value in the innermost array) before the sort operation inside the loop (as expected), then it becomes "correct" after the sort operation inside the loop (as expected).
So far so good.
But THEN, once we're outside the loop again, all of a sudden the order has reverted to its original state, as if the sort loop didn't execute at all?!?
How come this happens, and how will I ever be able to sort this array in the desired way then?
I was under the impression that neither foreach loops nor the uasort() function operated on separate instances of the items in question (but rather on references, i.e. in place), but the result above seems to indicate otherwise? And if so, how will I ever be able to perform the desired sort operation?
(and WHY doesn't anyone other than me on the entire internet seem to have this problem?)
PS.
Never mind the reason behind the design of the strange array to be sorted in this example, it is of course only a simplified PoC of a real problem in much more complex code.
Your problem is a misunderstanding of how PHP provides your "value" in the foreach construct.
foreach($top_level_array as $current_top_level_member)
The variable $current_top_level_member is a copy of the value in the array, not a reference to inside the $top_level_array. Therefore all your work happens on the copy and is discarded after the loop completes. (Actually it is in the $current_top_level_member variable, but $top_level_array never sees the changes.)
You want a reference instead:
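foreach ($top_level_array as $key => $value) {
    // Bind a reference to the element inside the original array.
    $current_top_level_member =& $top_level_array[$key];
    uasort($current_top_level_member['sub_array'], 'mycmp');
}
unset($current_top_level_member); // drop the reference after the loop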
EDIT:
You can also use the foreach by reference notation (hat tip to air4x) to avoid the extra assignment. Note that if you are working with an array of Objects, they are already passed by reference.
foreach($top_level_array as &$current_top_level_member)
To answer your question as to why PHP defaults to a copy instead of a reference, it's simply because of the rules of the language. Scalar values and arrays are assigned by value, unless the & prefix is used, and objects are always assigned by reference (as of PHP 5). That is likely due to a general consensus that it's better to work with copies of everything except objects. BUT it is not slow like you might expect. PHP uses a lazy copy called copy on write, where the copy is really a read-only reference. On the first write, the copy is made.
PHP uses a lazy-copy mechanism (also called copy-on-write) that does
not actually create a copy of a variable until it is modified.
Source: http://www.thedeveloperday.com/php-lazy-copy/
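The difference can be sketched with a cut-down version of the array from the question. The variable names and the closure are illustrative, and the <=> operator assumes PHP 7 or later:

```php
<?php
$data = [
    'key_1' => ['sub_array' => ['a' => [1], 'b' => [3], 'c' => [2]]],
];

// Comparator on the first item of each inner array (same idea as mycmp in the question).
$byFirstItem = function (array $x, array $y): int {
    return $x[0] <=> $y[0];
};

// By value: $member is a copy, so $data is left untouched.
foreach ($data as $member) {
    uasort($member['sub_array'], $byFirstItem);
}
$afterByValue = array_keys($data['key_1']['sub_array']);

// By reference: $member aliases the element, so the sort sticks.
foreach ($data as &$member) {
    uasort($member['sub_array'], $byFirstItem);
}
unset($member); // break the reference so later writes to $member do not touch $data

$afterByReference = array_keys($data['key_1']['sub_array']);

print_r($afterByValue);
print_r($afterByReference);
```

The unset() after the by-reference loop matters: without it, $member stays bound to the last array element and a later assignment to it would silently modify $data.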
You can add & before $current_top_level_member and use it as reference to the variable in the original array. Then you would be making changes to the original array.
foreach ($top_level_array as &$current_top_level_member) {

How to reference this variable (nested objects/arrays)?

I'm learning PHP and Drupal. I need to reference a variable contained in an array called $contexts.
So print_r($contexts) gives me this:
Array (
[context_view_1] => ctools_context Object (
[type] => view
[view] => view Object (
[db_table] => views_view
[result] => Array (
[0] => stdClass Object (
[nid] => 28
[node_data_field_display_field_display_value] => slideshow
)
)
eek confusing. I want to work with the node_data_field_display_field_display_value variable. I think my code needs to be like this, but I know this isn't right:
if ($contexts['context_view_1']['view']['result'][0]
['node_data_field_display_field_display_value'] == 'slideshow') then do whatever...
Thanks!
You suggested the following array reference to get to the variable you want:
$contexts['context_view_1']['view']['result'][0]['node_data_field_display_field_display_value']
The reason this doesn't work is because some of the structures in the chain are actually objects rather than arrays, so you need a different syntax for them to get at their properties.
So the first layer is correct, because $contexts is an array, so context_view_1 is an array element, so you'd get to it with $contexts['context_view_1'] as you did.
But the next level is an object, so to get to view, you need to reference it as an object property with -> syntax, like so: $contexts['context_view_1']->view
For each level down the tree, you need to determine whether it's an object or an array element, and use the correct syntax.
In this case, you'll end up with something that looks like this:
$contexts['context_view_1']->view->result[0]->node_data_field_display_field_display_value
That's a mess of a variable. The issue you're having is that you're using the bracketed notation, e.g. "['view']", for each "step" in the navigation through your variable. That would be fine if each child of the variable were an array, but not every one is.
You'll note, for example, that $contexts['context_view_1'] is actually an object, not an array (take note that it says "[context_view_1] => ctools_context Object"). Whereas you would use that bracketed notation to address the elements of an array, you use the arrow operator to address the properties of an object.
Thus, you would address the field you are trying to reach with the following expression:
$contexts['context_view_1']->view->result[0]->node_data_field_display_field_display_value
For properties listed as "Object", you need to use -> to get into it, and "Array", you need to use []. So:
$contexts['context_view_1']->view->result[0]->node_data_field_display_field_display_value
echo $contexts['context_view_1']->view->result[0]->node_data_field_display_field_display_value;
Do not confuse objects with arrays. A member of an array can be accessed with $array['member'], but fields of an object are accessed with $object->fieldname.
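A self-contained sketch of the mixed access, using hand-built stdClass objects as stand-ins for the Drupal ctools_context and view classes (which are not available outside Drupal, so this is an assumption about shape only):

```php
<?php
// Hand-built stand-in for the structure in the question.
$row = new stdClass();
$row->nid = 28;
$row->node_data_field_display_field_display_value = 'slideshow';

$view = new stdClass();
$view->result = [$row];                      // array holding objects

$context = new stdClass();
$context->type = 'view';
$context->view = $view;                      // object property

$contexts = ['context_view_1' => $context];  // array holding an object

// Use [] for array steps and -> for object steps.
$value = $contexts['context_view_1']->view->result[0]->node_data_field_display_field_display_value;

if ($value == 'slideshow') {
    echo "slideshow\n";
}
```

Reading the print_r output top to bottom gives the chain directly: each "Array" label means the next step uses [], each "Object" label means the next step uses ->.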

PHP SimpleXML::addChild with empty string - redundant node

Calling addChild with an empty string as the value (or even with whitespace) seems to cause a redundant SimpleXml node to be added inside the node instead of adding just the node with no value.
Here's a quick demo of what happens:
[description] => !4jh5jh1uio4jh5ij14j34io5j!
And here's with an empty string:
[description] => SimpleXMLElement Object ( [0] => )
The workaround I'm using at the moment is pretty horrible - I'm doing a str_replace on the final JSON to replace !4jh5jh1uio4jh5ij14j34io5j! with an empty string. Yuck. Perhaps the only answer at this point is 'submit a bug report to simplexml'...
Does anyone have a better solution?
I think I figured out what is going on. Given code like this:
$xml = new SimpleXMLElement('<xml></xml>');
$xml->addChild('node','value');
print_r($xml);
$xml = new SimpleXMLElement('<xml></xml>');
$xml->addChild('node','');
print_r($xml);
$xml = new SimpleXMLElement('<xml></xml>');
$xml->addChild('node');
print_r($xml);
The output is this:
SimpleXMLElement Object
(
[node] => value
)
SimpleXMLElement Object
(
[node] => SimpleXMLElement Object
(
[0] =>
)
)
SimpleXMLElement Object
(
[node] => SimpleXMLElement Object
(
)
)
So, to make it so that in case #2 the empty element isn't created (i.e. if you don't know if the second argument is going to be an empty string or not), you could just do something like this:
$mystery_string = '';
$xml = new SimpleXMLElement('<xml></xml>');
if (preg_match('#\S#', $mystery_string)) // Checks for non-whitespace character
$xml->addChild('node', $mystery_string);
else
$xml->addChild('node');
print_r($xml);
echo "\nOr in JSON:\n";
echo json_encode($xml);
To output:
SimpleXMLElement Object
(
[node] => SimpleXMLElement Object
(
)
)
Or in JSON:
{"node":{}}
Is that what you want?
Personally, I never use SimpleXML, and not only because of this sort of weird behavior -- it is still under major development and in PHP5 is missing like 2/3 of the methods you need to do DOM manipulation (like deleteChild, replaceChild etc).
I use DOMDocument (which is standardized, fast and feature-complete, since it's an interface to libxml2).
With SimpleXML, what you get if you use print_r(), or var_dump(), serialize(), or similar, does not correspond to what is stored internally in the object. It is a 'magical' object which overloads the way PHP iterates its contents.
You get the true representation of the element with asXML() only.
When something like print_r() iterates over a SimpleXML element or you access its properties using the -> operator, you get a munged version of the object. This munged version allows you to do things like "echo $xml->surname" or $xml->names[1] as if it really had these as properties, but is separate to the true XML contained within: in the munged representation elements are not necessarily in order, and elements whose names are PHP reserved words (like "var") aren't presented as properties, but can be accessed with code like $xml["var"] - as if the object is an associative array. Where multiple sibling elements have the same name they are presented like arrays. I guess an empty string is also presented like an array for some reason. However, when output using asXML() you get the real representation.
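One way to check this is to serialize both variants from the question and compare them with what a string cast returns. This is only a sketch, and the variable names are illustrative:

```php
<?php
$withEmptyString = new SimpleXMLElement('<xml></xml>');
$withEmptyString->addChild('node', '');

$withoutValue = new SimpleXMLElement('<xml></xml>');
$withoutValue->addChild('node');

// Serialize to see what each document actually contains.
echo $withEmptyString->asXML();
echo $withoutValue->asXML();

// Casting to string returns the text content of the node in each case.
var_dump((string) $withEmptyString->node);
var_dump((string) $withoutValue->node);
```

If the string cast is all the calling code needs, the difference that print_r() shows between the two cases may not matter in practice.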
Maybe I'm not understanding the question right, but it seems to me that when you use the addChild method, you're required to pass a string as the name of the node regardless of what content is in the node. The value (second argument) is optional and can be left out to add an empty node.
Let me know if that helps.
I've created an XML library which extends the SimpleXML object to include all of the functionality that is present in DOMDocument but has no interface in SimpleXML (the two interact with the same underlying libxml2 object, by reference). It also has niceties such as AsArray() or AsJson() to output your object in one of those formats.
I've just updated the library to work as you expect when outputting JSON. You can do the following:
$xml = new bXml('<xml></xml>');
$xml->addChild('node', '');
$json_w_root = $xml->asJson(); // is { 'xml': {'node':'' } }
$json = $xml->children()->asJson(); // is { 'node' : '' } as expected.
The library is hosted on google code at http://code.google.com/p/blibrary/
