Getting the first XML element with SimpleXML - php

ok, this might be a stupid question, but how do I get one single element from an XML document?
I have this XML
$element = $response['linkedin'];
SimpleXMLElement Object
(
[id] => 575677478478
[first-name] => John
[last-name] => Doe
[email-address] => john@doe.com
[picture-url] => http://m3.licdn.com/mpr/mprx/123
[headline] => Headline goes here
[industry] => Internet
[num-connections] => 71
I just want to assign first-name as $firstName
I can loop over it using xPath, but that just seems like overkill.
ex:
$fName = $element->xpath('first-name');
foreach ($fName as $name)
{
$firstName = $name;
}

If you access a list of (one or more) element nodes in SimpleXML as a single element, it returns the first element. This is the default behavior (it is also outlined in the SimpleXML Basic Usage page):
$first = $element->{'first-name'};
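If there is more than one element, you can specify which one you mean by using its zero-based index, either in square (array-access) or curly (property-access) brackets. Note that curly-brace offset syntax was deprecated in PHP 7.4 and removed in PHP 8.0, so only the square-bracket form works on current versions: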
$first = $element->{'first-name'}[0];
$first = $element->{'first-name'}{0};
This also allows you to create a so called SimpleXML self-reference to access the element itself, e.g. to remove it:
unset($first[0]); # removes the element node from the document.
unset($first); # unsets the variable $first
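As a minimal sketch of the access patterns above, assuming a small hand-written document (the <person> wrapper and the second <first-name> are made up for illustration and are not from the question):

```php
<?php
// Hypothetical sample document; only the element names mirror the question.
$element = new SimpleXMLElement(
    '<person>'
    . '<first-name>John</first-name>'
    . '<first-name>Johnny</first-name>'
    . '<last-name>Doe</last-name>'
    . '</person>'
);

// Accessing the list as a single element yields the first match.
// The cast turns the SimpleXMLElement into a plain string.
$firstName = (string) $element->{'first-name'};

// A zero-based index selects a specific sibling.
$secondName = (string) $element->{'first-name'}[1];

// Self-reference: unsetting offset 0 removes that node from the document.
$first = $element->{'first-name'};
unset($first[0]);
$remaining = count($element->{'first-name'});

echo $firstName, ' / ', $secondName, ' / ', $remaining, "\n";
```

The braces around 'first-name' are needed because a hyphen is not valid in a plain PHP property name.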
You might think XPath is overkill, but it is not that expensive in SimpleXML, and sometimes XPath is the only way to access an element. It is therefore useful to know that you can easily take the first element of an XPath result as well. For example, to get the parent element in SimpleXML:
list($parent) = $element->xpath('..'); # PHP < 5.4
$parent = $element->xpath('..')[0]; # PHP >= 5.4
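Since xpath() returns an array that may be empty, it can be safer to guard before taking the first match. A minimal sketch, assuming a made-up <root>/<child> document that is not part of the question:

```php
<?php
// Hypothetical sample document for illustration.
$xml = new SimpleXMLElement('<root><child><name>x</name></child></root>');
$element = $xml->child;

// xpath() returns an array of matches; it can be empty (or false on error),
// so check it before indexing.
$matches = $element->xpath('..');
$parent = $matches ? $matches[0] : null;

// getName() reports the tag name of the matched element.
$parentName = $parent !== null ? $parent->getName() : null;
echo $parentName, "\n";
```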
As you can see, it is worth understanding how things actually work to get more out of SimpleXML. If you already know everything on the SimpleXML Basic Usage page, you might want to learn a bit more with the following:
SimpleXML Type Cheatsheet
How to tell apart SimpleXML objects representing element and attribute?
SimpleXMLElement implements JsonSerializable

Answer form per request. ^^
If that SimpleXMLElement is the only one contained within $response['linkedin'], you can change it with:
$response['linkedin']->{'first-name'} = $name;
That allows you direct access to the element without needing to do an xpath on it. ^^

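You can use XPath to find the first instance of a matching element.
For example, /root/firstname[1] gives you the first firstname child of the root element. In your case, use a path relative to the element you already have; note that xpath() returns an array of matches:
$res = $response['linkedin']->xpath('first-name[1]');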

Related

Trouble with SimpleXMLElement Namespaces

I'm having trouble parsing XML with Namespaces using SimpleXMLElement.
I've tried using looping through the xml and also tried using xpath without success.
$data_url="http://isni.oclc.nl/sru/0000000123121970?query=pica.isn+%3D+%220000000123121970%22&version=1.1&operation=searchRetrieve&stylesheet=http%3A%2F%2Fisni.oclc.nl%2Fsru%2FDB%3D1.2%2F%3Fxsl%3DsearchRetrieveResponse&recordSchema=isni-b&maximumRecords=10&startRecord=1&recordPacking=xml&sortKeys=none&x-info-5-mg-requestGroupings=none";
$data = file_get_contents($data_url);
$xml = simplexml_load_string($data);
$org_names = $xml->children('srw', true)->records->children('srw', true)->record->children('srw', true)->recordData->responseRecord->isniassigned->isnimetadata->identity->organisation->organisationnamevariant->mainname;
foreach($org_names as $a)
{
echo "a: $a\n";
}
I'm expecting to get a list of organisationnamevariant->mainname items:
Academia lugduno-batava
Leiden university
Leidse universiteit
etc.
However, I'm getting this error: Trying to get property of non-object
Having such a deep hierarchy is difficult to navigate using the normal -> structure, but you also have to be careful when changing namespace. You only need to do the ->children('srw', true) once and then all of the child nodes will be for that namespace. BUT you also have to switch back at <responseRecord> by using ->children().
You also need to be careful that you use the proper case for each tag name...
$org_names = $xml->children('srw', true)->records->record->recordData->children()->
responseRecord->ISNIAssigned->ISNIMetadata->identity->organisation->
organisationNameVariant->mainName;
echo (string)$org_names;
An alternative is to use XPath (as xpath() returns a list of matches, I use [0] to only use the first one)...
$org_names = $xml->xpath("//organisationNameVariant/mainName");
echo (string)$org_names[0];
I know that echo casts the value to a string, but if you use this in any other scenario, you may end up with a SimpleXMLElement instead, so I tend to add the cast to string just to make the point.
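To get the full list of names the question expected rather than only the first one, the same namespace switching can be combined with a loop. This is only a sketch: the XML below is a trimmed, hand-written stand-in for the real SRU response, and the namespace URI and the shortened nesting are assumptions.

```php
<?php
// Trimmed, hand-written stand-in for the SRU response; the namespace URI and
// the shortened nesting are assumptions made for this sketch.
$data = <<<XML
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:records>
    <srw:record>
      <srw:recordData>
        <responseRecord>
          <ISNIAssigned>
            <organisation>
              <organisationNameVariant><mainName>Academia lugduno-batava</mainName></organisationNameVariant>
              <organisationNameVariant><mainName>Leiden university</mainName></organisationNameVariant>
            </organisation>
          </ISNIAssigned>
        </responseRecord>
      </srw:recordData>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
XML;

$xml = simplexml_load_string($data);

// Enter the "srw" namespace once, then switch back to un-namespaced children.
$organisation = $xml->children('srw', true)
    ->records->record->recordData
    ->children()
    ->responseRecord->ISNIAssigned->organisation;

// Iterate over every variant instead of reading only the first one.
$names = [];
foreach ($organisation->organisationNameVariant as $variant) {
    $names[] = (string) $variant->mainName;
}

print_r($names);
```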

Determine the order of XML elements in PHP

so I have this xml
<xmlTag>
<item>etc etc</item>
<section>etc etc</section>
<item>etc etc</item>
<item>etc etc</item>
<section>etc etc</section>
</xmlTag>
and I want to process this in order, ie process the first item tag then the second section tag then the third item tag...etc
however when I use simplexml_load_string() the resultant object becomes
$xmlTag = {SimpleXMLElement}[2]
item = {array} [3]
section = {array} [2]
Hence it separates out the item tags and the section tags and now I have no way to determine the orderings between the item tags and the section tags....
Anyone know of an alternative way to figure out the order of the xml elements in this scenario?
Whatever dump function you're using there is misleading you (anything not dedicated to the purpose will, due to the large amount of "magic" supported by SimpleXML). The nodes have not been permanently separated out, it's just showing you that they can be accessed that way.
If you use the children() method, you will get them in the order they are defined in the document, regardless of tag name:
foreach ( $xmlTag->children() as $child_name => $child ) {
echo $child_name, "\n";
}
Note that children() doesn't actually return an array, just an "iterable" object. So unlike a real array, the same "key" can occur multiple times when you loop over it.
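A small sketch of this, using the structure from the question with made-up text content so the order is visible:

```php
<?php
// Same element layout as in the question; the text values are placeholders.
$xmlTag = simplexml_load_string(
    '<xmlTag>'
    . '<item>a</item>'
    . '<section>b</section>'
    . '<item>c</item>'
    . '<item>d</item>'
    . '<section>e</section>'
    . '</xmlTag>'
);

// children() walks the child elements in document order; the key is the tag
// name, so the same key shows up more than once.
$order = [];
foreach ($xmlTag->children() as $name => $child) {
    $order[] = $name . ':' . $child;
}

print_r($order);
```

Because the keys repeat, the loop collects the values into a new list instead of relying on the keys being unique.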

How to get ALL elements of simplexml object

OK, I'm totally stumped here. I've found similar questions, but the answers don't seem to work for my specific problem. I've been working on this on and off for days.
I have this here simplexml object (it's actually much, much, MUCH longer than this, but I'm cutting out all the extraneous stuff so you'll actually look at it):
SimpleXMLElement Object
(
[SubjectClassification] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[Authority] => Category Code
[Value] => s
[Id] => s
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[Authority] => Subject
[Value] => Sports
[Id] => 54df6c687df7100483dedf092526b43e
)
)
[2] => SimpleXMLElement Object
(
[@attributes] => Array
(
[Authority] => Subject
[Value] => Professional baseball
[Id] => 20dd2c287e4e100488e5d0913b2d075c
)
)
)
)
I got this block of code by doing a print_r on a variable containing the following:
$subjects->SubjectClassification->children();
Now, I want to get at all the elements of the subjectClassification array. ALL of them! But when I do this:
$subjects->SubjectClassification;
Or this:
$subjects->SubjectClassification->children();
OR if I try to get all the array elements via a loop, all I get is this:
SimpleXMLElement Object
(
[@attributes] => Array
(
[Authority] => Category Code
[Value] => s
[Id] => s
)
)
Why? How can I get everything?
You can use XPath to do this. I find it the easiest and most efficient way, and it cuts down on the need for lots of for loops to get at items.
If your XML looks like this:
<Subjects>
<SubjectClassification>
</SubjectClassification>
<SubjectClassification>
</SubjectClassification>
<SubjectClassification>
</SubjectClassification>
</Subjects>
Then to get all subject classifications in an array you can do the following:
$subject_classifications = $xml->xpath("//SubjectClassification");
The xml variable refers to your main simplexml object i.e. the file you loaded using simplexml.
Then you can just iterate through the array using a foreach loop like this:
foreach($subject_classifications as $subject_classification){
// Authority, Value and Id are attributes in the question's data, so use array syntax.
echo (string) $subject_classification['Authority'];
echo (string) $subject_classification['Value'];
echo (string) $subject_classification['Id'];
}
Your structure may vary but you get the idea anyway. You can see a good article from IBM here "Using Xpath With PHP":
Because of the extent to which SimpleXML overloads PHP syntax, relying on print_r to figure out what's in a SimpleXML object, or what you can do with it, is not always helpful. (I've written a couple of debugging functions intended to be more comprehensive.) Ultimately, the reference should be to the XML structure itself, and knowledge of how SimpleXML works.
In this case, it looks from the output you provide that what you have is a list of elements all called SubjectClassification, and all siblings to each other. So you don't want to call $subjects->SubjectClassification->children(), because those nodes have no children.
Without a better idea of the underlying XML structure, it's hard to say more, so I'll save this incomplete answer for now.
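Assuming the elements really are siblings carrying only attributes, as the print_r output suggests, a plain foreach over the property visits each of them. The XML below is a guess reconstructed from that output, not the asker's real document:

```php
<?php
// Hypothetical XML reconstructed from the print_r output in the question.
$subjects = new SimpleXMLElement(
    '<Subjects>'
    . '<SubjectClassification Authority="Category Code" Value="s" Id="s"/>'
    . '<SubjectClassification Authority="Subject" Value="Sports" Id="54df6c687df7100483dedf092526b43e"/>'
    . '<SubjectClassification Authority="Subject" Value="Professional baseball" Id="20dd2c287e4e100488e5d0913b2d075c"/>'
    . '</Subjects>'
);

// Iterating the property (without calling children()) visits every sibling
// named SubjectClassification; attributes are read with array syntax.
$values = [];
foreach ($subjects->SubjectClassification as $classification) {
    $values[] = (string) $classification['Value'];
}

print_r($values);
```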
For all descendants (that is children, grand-children, grand-grand-children and so on) of the <SubjectClassification> elements ("all the elements [...] ALL of them!" as you put it), you can make use of XPath, which supports such more advanced queries (at least I assume that is what you're looking for; your question gives neither a detailed description nor an example of what you mean by "all" specifically).
With SimpleXML, XPath queries can only return elements and attributes, but as you need elements only, this is no show stopper:
$allOfThem = $subjects->xpath('./SubjectClassification//*');
The key point here is the Xpath expression:
./SubjectClassification//*
Because of the dot . at the beginning, it is relative to the context node, which is $subjects in your case. It then selects all elements that descend from the direct child element named SubjectClassification. This works via // (unspecified depth) and * (any element; the star acts as a wildcard).
So hopefully this answers your question months after. I just stumbled over it by cleaning up some XML questions and perhaps this is useful for future reference as well.
I have added this second answer in case what's actually throwing you is retrieving the attributes array as opposed to the nodes. This is how you could print out the attributes for each SubjectClassification in your main $xml object.
foreach ($xml->SubjectClassification as $subject_classification) {
foreach ($subject_classification->attributes() as $key => $value) {
echo $key . " : " . $value . "\n";
}
}
I've found that count returns the proper number of elements, and you can then use a standard for loop to iterate over them:
$n = count($subjects->SubjectClassification);
for ($i = 0; $i < $n; $i++) {
var_dump($subjects->SubjectClassification[$i]);
}
I'm not sure why the foreach loop doesn't work, nor why dumping $subjects->SubjectClassification directly only shows the first node, but for any who stumble across this ancient question as I have, the above is one way to find more information without resorting to external libraries.

How to retrieve data from @ xml attribute in PHP

Ok so I am stuck with this xml in PHP stuff. I have gotten pretty far considering its my first 3 hours into XML all together ever in my entire life.
I am having trouble pulling data from a XML thing that has @ in the name. See below (obviously im not going to post the whole XML thing but u can see how i got there below that).
SimpleXMLElement Object
(
[@attributes] => Array
(
[date] => 2010-09
[reserved] => 6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
)
)
How i got there:
echo $this->General_functions->naked($xml->property[0]->availability->month[0]);
General_functions->naked is just a fast function to wrap a print_r around the given attribute.
My question is, HOW do i get the values inside @attributes cause no matter what i try i cant figure it out. Ive searched the web for a good 45 mins with no real answer.
Thanks in advance.
David
You need to use the attributes() method to get the results as another class. So, for example, to get the date attribute:
$myElement->attributes()->date
Also note that it's not a string, it's a SimpleXML attribute. If you want to get its actual value, you need to cast it to string explicitly:
(string)$myElement->attributes()->date
Access attributes of an element just as you would elements of an array:
(string) $xml->property[0]->availability->month[0]['date']
Edited to add the cast.
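Both approaches can be sketched on a single element. The <month> element below is rebuilt by hand from the print_r output in the question, with a shortened reserved list:

```php
<?php
// Hypothetical element rebuilt from the question's output (reserved list shortened).
$month = simplexml_load_string('<month date="2010-09" reserved="6,7,8"/>');

// Array syntax reads an attribute; the cast yields a plain string.
$date = (string) $month['date'];

// attributes() gives the same value through property access.
$reservedRaw = (string) $month->attributes()->reserved;

// Split the comma-separated days into integers.
$reserved = array_map('intval', explode(',', $reservedRaw));

echo $date, "\n";
print_r($reserved);
```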

PHP SimpleXML::addChild with empty string - redundant node

Calling addChild with an empty string as the value (or even with whitespace) seems to cause a redundant SimpleXml node to be added inside the node instead of adding just the node with no value.
Here's a quick demo of what happens:
[description] => !4jh5jh1uio4jh5ij14j34io5j!
And here's with an empty string:
[description] => SimpleXMLElement Object ( [0] => )
The workaround I'm using at the moment is pretty horrible - I'm doing a str_replace on the final JSON to replace !4jh5jh1uio4jh5ij14j34io5j! with an empty string. Yuck. Perhaps the only answer at this point is 'submit a bug report to simplexml'...
Does anyone have a better solution?
I think I figured out what is going on. Given code like this:
$xml = new SimpleXMLElement('<xml></xml>');
$xml->addChild('node','value');
print_r($xml);
$xml = new SimpleXMLElement('<xml></xml>');
$xml->addChild('node','');
print_r($xml);
$xml = new SimpleXMLElement('<xml></xml>');
$xml->addChild('node');
print_r($xml);
The output is this:
SimpleXMLElement Object
(
[node] => value
)
SimpleXMLElement Object
(
[node] => SimpleXMLElement Object
(
[0] =>
)
)
SimpleXMLElement Object
(
[node] => SimpleXMLElement Object
(
)
)
So, to make it so that in case #2 the empty element isn't created (i.e. if you don't know if the second argument is going to be an empty string or not), you could just do something like this:
$mystery_string = '';
$xml = new SimpleXMLElement('<xml></xml>');
if (preg_match('#\S#', $mystery_string)) // Checks for non-whitespace character
$xml->addChild('node', $mystery_string);
else
$xml->addChild('node');
print_r($xml);
echo "\nOr in JSON:\n";
echo json_encode($xml);
To output:
SimpleXMLElement Object
(
[node] => SimpleXMLElement Object
(
)
)
Or in JSON:
{"node":{}}
Is that what you want?
Personally, I never use SimpleXML, and not only because of this sort of weird behavior -- it is still under major development and in PHP5 is missing like 2/3 of the methods you need to do DOM manipulation (like deleteChild, replaceChild etc).
I use DOMDocument (which is standardized, fast and feature-complete, since it's an interface to libxml2).
With SimpleXML, what you get if you use print_r(), or var_dump(), serialize(), or similar, does not correspond to what is stored internally in the object. It is a 'magical' object which overloads the way PHP iterates its contents.
You get the true representation of the element with asXML() only.
When something like print_r() iterates over a SimpleXML element, or you access its properties using the -> operator, you get a munged version of the object. This munged version allows you to do things like "echo $xml->surname" or $xml->names[1] as if it really had these as properties, but it is separate from the true XML contained within: in the munged representation elements are not necessarily in order, and elements whose names are not valid PHP identifiers (like "first-name") cannot be written as plain properties and need the brace syntax $xml->{'first-name'}, while array syntax like $xml["var"] reads an attribute of the element. Where multiple sibling elements have the same name they are presented like arrays. I guess an empty string is also presented like an array for some reason. However, when output using asXML() you get the real representation.
Maybe I'm not understanding the question right, but it seems to me that when you use the addChild method, you're required to pass a string as the argument for the name of the node, regardless of what content is in the node. The value (second argument) is optional and can be left out to add an empty node.
Let me know if that helps.
I've created an Xml library which extends the SimpleXml object to include all of the functionality that is present in DOMDocument but is missing an interface in SimpleXml (the two interact with the same underlying libxml2 object, by reference). It also has niceties such as AsArray() or AsJson() to output your object in one of those formats.
I've just updated the library to work as you expect when outputting JSON. You can do the following:
$xml = new bXml('<xml></xml>');
$xml->addChild('node', '');
$json_w_root = $xml->asJson(); // is { 'xml': {'node':'' } }
$json = $xml->children()->asJson(); // is { 'node' : '' } as expected.
The library is hosted on google code at http://code.google.com/p/blibrary/
