Test XML:
<?xml version="1.0" encoding="UTF-8"?>
<Transfer>
<ABR recordLastUpdatedDate="20180329" replaced="N">
<ABN status="ACT" ABNStatusFromDate="20000214">80007321682</ABN>
<EntityType>
<EntityTypeInd>PUB</EntityTypeInd>
<EntityTypeText>Australian Public Company</EntityTypeText>
</EntityType>
<MainEntity>
<NonIndividualName type="MN">
<NonIndividualNameText>BLACK CABS COMBINED PTY LTD</NonIndividualNameText>
</NonIndividualName>
<BusinessAddress>
<AddressDetails>
<State>VIC</State>
<Postcode>3166</Postcode>
</AddressDetails>
</BusinessAddress>
</MainEntity>
</ABR>
</Transfer>
PHP Script:
$f = 'test.xml';
$reader = new XMLReader();
$reader->open($f);
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'ABR') {
        $doc = new DOMDocument('1.0', 'UTF-8');
        $xml = simplexml_import_dom($doc->importNode($reader->expand(), true));
        print_r($xml);
    }
}
$reader->close();
PHP Output:
SimpleXMLElement Object
(
[@attributes] => Array
(
[recordLastUpdatedDate] => 20180329
[replaced] => N
)
[ABN] => 80007321682
[EntityType] => SimpleXMLElement Object
(
[EntityTypeInd] => PUB
[EntityTypeText] => Australian Public Company
)
[MainEntity] => SimpleXMLElement Object
(
[NonIndividualName] => SimpleXMLElement Object
(
[@attributes] => Array
(
[type] => MN
)
[NonIndividualNameText] => BLACK CABS COMBINED PTY LTD
)
[BusinessAddress] => SimpleXMLElement Object
(
[AddressDetails] => SimpleXMLElement Object
(
[State] => VIC
[Postcode] => 3166
)
)
)
)
The Problem:
The attributes for the ABN element (status and ABNStatusFromDate) are not in the output, even though other attributes are.
Please help me understand why those attributes in particular are missing.
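Answer: print_r() is not a reliable way to inspect a SimpleXML object. When an element such as ABN contains only text, print_r() shows it as a plain string and hides its attributes, but they are still there.
I can access the attribute directly via $xml->ABN['status'].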
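As a minimal sketch of that access pattern, assuming a trimmed-down copy of the test XML loaded from a string rather than through XMLReader, both attributes of ABN can be read even though print_r() does not list them. The variable names $status and $abnAttributes are only illustrative.

```php
<?php
// Trimmed-down copy of the test XML from the question (assumption: loaded
// from a string here instead of being expanded from an XMLReader node).
$xml = simplexml_load_string(
    '<ABR recordLastUpdatedDate="20180329" replaced="N">'
    . '<ABN status="ACT" ABNStatusFromDate="20000214">80007321682</ABN>'
    . '</ABR>'
);

// Array-style access reads one attribute; cast to string to get its value.
$status = (string) $xml->ABN['status'];

// attributes() lets you iterate over every attribute of the element.
$abnAttributes = [];
foreach ($xml->ABN->attributes() as $name => $value) {
    $abnAttributes[$name] = (string) $value;
}

echo $status, PHP_EOL;
print_r($abnAttributes);
```

The string casts matter: without them each value is still a SimpleXMLElement object rather than a plain string.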
I have searched for this, and the answers I find seem to say what I thought I understood. Obviously I am missing something. I am confused by the results of the XPath query. I have simplified my problem into a test case to post here.
My real xml has several dataset nodes at different depths. Ultimately, I want to get every dataset element with a given label and then loop over that and get the field values (at different locations (or depths) so I think I need xpath). I can use xpath to get the dataset elements that I want successfully. However, when I then run xpath on that result object, it gets me the fields I want and all the other fields too. I can't figure out why it isn't only returning field1, field2, and field3. When I print_r($value[0]), it shows only the fields I want. But, when I run xpath on $value[0], it returns all fields in the xml doc.
Sample XML
<myxml>
<dataset label="wanteddata" label2="anotherlabel">
<dataitem>
<textdata>
<field label="label1">field1</field>
<field label="label2">field2</field>
<field label="label3">field3</field>
</textdata>
</dataitem>
</dataset>
<dataset label="unwanteddata" label2="unwantedanotherlabel">
<dataitem>
<textdata>
<field label="label4">field4</field>
<field label="label5">field5</field>
<field label="label6">field6</field>
</textdata>
</dataitem>
</dataset>
</myxml>
Here is the test code.
$xmlstring = file_get_contents('simplexml_test.xml');
$xml = simplexml_load_string($xmlstring);
if ($xml === false) {
    throw new Exception("Failed to load");
}
$value = $xml->xpath('//dataset[@label="wanteddata"]');
print_r($value[0]->xpath('//field'));
Code Output:
Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label1
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label2
)
)
[2] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label3
)
)
[3] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label4
)
)
[4] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label5
)
)
[5] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label6
)
)
)
//field selects all <field> elements in the entire XML document, regardless of the context node from which you call that XPath. To make the XPath heed the context node, add a dot (.) at the beginning of the expression. In XPath, a dot references the current context node:
print_r($value[0]->xpath('.//field'));
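The difference can be checked with a small sketch, assuming a shortened version of the sample XML with one field per dataset. The variable names $wanted, $absolute and $relative are only illustrative.

```php
<?php
// Shortened version of the sample XML: one <field> per <dataset>.
$xml = simplexml_load_string(
    '<myxml>'
    . '<dataset label="wanteddata"><field label="label1">field1</field></dataset>'
    . '<dataset label="unwanteddata"><field label="label4">field4</field></dataset>'
    . '</myxml>'
);

// Select the dataset we care about; xpath() returns an array of matches.
$wanted = $xml->xpath('//dataset[@label="wanteddata"]')[0];

// A leading // starts at the document root, so both fields match.
$absolute = $wanted->xpath('//field');

// A leading .// starts at the context node, so only its own field matches.
$relative = $wanted->xpath('.//field');

echo count($absolute), ' ', count($relative), PHP_EOL; // prints "2 1"
```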
I am using an API to get a Block FIPS number but I have not been able to target that specific number within the XML file.
I did a print_r() on the xml output and here is what I get
SimpleXMLElement Object ( [@attributes] => Array ( [status] => OK [executionTime] => 6 ) [Block] => SimpleXMLElement Object ( [@attributes] => Array ( [FIPS] => 060730200252015 ) ) [County] => SimpleXMLElement Object ( [@attributes] => Array ( [FIPS] => 06073 [name] => San Diego ) ) [State] => SimpleXMLElement Object ( [@attributes] => Array ( [FIPS] => 06 [code] => CA [name] => California ) ) )
Here is the XML that is being generated
<Response xmlns="http://data.fcc.gov/api" status="OK" executionTime="10">
<Block FIPS="060730200252015"/>
<County FIPS="06073" name="San Diego"/>
<State FIPS="06" code="CA" name="California"/>
</Response>
I have been trying to get the Block FIPS Number like this:
$fccAPI = "http://data.fcc.gov/api/block/2010/find?latitude=$lat&longitude=$lng";
//echo $fccAPI;
$fccXML= simplexml_load_file($fccAPI);
print_r($fccXML);
//Echo FIPS Number
echo $fccXML->FIPS;
Please help me target the Block FIPS number.
You need to use the following:
echo $fccXML->Block[0]['FIPS'];
$fccXML is the root node, the <Response> element. ->Block[0] selects the first Block element, and to access an attribute, use the square brackets notation with the attribute name, i.e. ['FIPS'].
The SimpleXML documentation has numerous examples if you're having trouble with the syntax.
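A minimal sketch of this, assuming the response XML shown in the question is parsed from a string instead of being fetched from the API. The names $response, $blockFips and $countyName are only illustrative.

```php
<?php
// The response from the question, inlined so the sketch runs offline
// (assumption: the live API returns the same structure).
$response = simplexml_load_string(
    '<Response xmlns="http://data.fcc.gov/api" status="OK" executionTime="10">'
    . '<Block FIPS="060730200252015"/>'
    . '<County FIPS="06073" name="San Diego"/>'
    . '<State FIPS="06" code="CA" name="California"/>'
    . '</Response>'
);

// ->Block selects the child element, ['FIPS'] reads its attribute.
// Cast to string so the value is a plain string, not a SimpleXMLElement.
$blockFips = (string) $response->Block['FIPS'];
$countyName = (string) $response->County['name'];

echo $blockFips, PHP_EOL;
echo $countyName, PHP_EOL;
```

Attributes on the root element are read the same way, for example $response['status'].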
I retrieve the following XML data:
<?xml version="1.0" encoding="UTF-8"?>
<JMF xmlns="http://www.CIP4.org/JDFSchema_1_1" MaxVersion="1.4" SenderID="HP Hybrid Elk JMF" TimeStamp="2014-02-19T07:42:11+00:00" Version="1.4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="JMFRootMessage">
<!--Generated by the CIP4 Java open source JDF Library version : CIP4 JDF Writer Java 1.4a BLD 63-->
<Response ID="Rgdhhhdfhd" ReturnCode="0" Type="KnownDevices" refID="gdhhhdfhd" xsi:type="ResponseKnownDevices">
<DeviceList>
<DeviceInfo DeviceCondition="OK" DeviceID="HPSSPP-SM" DeviceStatus="Running" StatusDetails="Running"/>
<DeviceInfo CounterUnit="Impressions" DeviceCondition="OK" DeviceID="192.168.1.101" DeviceStatus="Running" ProductionCounter="12345678" StatusDetails="Indigo: Printing" xmlns:jdf="http://www.CIP4.org/JDFSchema_1_1">
<GeneralID IDUsage="hpCustomerImpressionCounter" IDValue="12345678.0"/>
</DeviceInfo>
<DeviceInfo CounterUnit="Impressions" DeviceCondition="OK" DeviceID="192.168.1.102" DeviceStatus="Running" ProductionCounter="23456789" StatusDetails="Indigo: Printing" xmlns:jdf="http://www.CIP4.org/JDFSchema_1_1">
<GeneralID IDUsage="hpCustomerImpressionCounter" IDValue="23456789.0"/>
</DeviceInfo>
</DeviceList>
</Response>
</JMF>
I load it into a SimpleXMLElement:
<?php
$jdf_response = new SimpleXMLElement($xml_response);
And I can then display it like so:
<pre>
<?php print_r($jdf_response->Response->DeviceList); ?>
</pre>
Which gives the following output:
SimpleXMLElement Object
(
[DeviceInfo] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[DeviceCondition] => OK
[DeviceID] => HPSSPP-SM
[DeviceStatus] => Running
[StatusDetails] => Running
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[CounterUnit] => Impressions
[DeviceCondition] => OK
[DeviceID] => 192.168.1.101
[DeviceStatus] => Running
[ProductionCounter] => 12345678
[StatusDetails] => Indigo: Printing
)
[GeneralID] => SimpleXMLElement Object
(
[@attributes] => Array
(
[IDUsage] => hpCustomerImpressionCounter
[IDValue] => 12345678.0
)
)
)
[2] => SimpleXMLElement Object
(
[@attributes] => Array
(
[CounterUnit] => Impressions
[DeviceCondition] => OK
[DeviceID] => 192.168.1.102
[DeviceStatus] => Running
[ProductionCounter] => 23456789
[StatusDetails] => Indigo: Printing
)
[GeneralID] => SimpleXMLElement Object
(
[@attributes] => Array
(
[IDUsage] => hpCustomerImpressionCounter
[IDValue] => 23456789.0
)
)
)
)
)
So far so good. But I need to get the data from the DeviceInfo array, so I modify the code:
<pre>
<?php print_r($jdf_response->Response->DeviceList->DeviceInfo); ?>
</pre>
But instead of three SimpleXMLElement objects, I get only the first.
SimpleXMLElement Object
(
[@attributes] => Array
(
[DeviceCondition] => OK
[DeviceID] => HPSSPP-SM
[DeviceStatus] => Running
[StatusDetails] => Running
)
)
What am I doing wrong?
Update:
The reason I was using print_r() in the first place is that I was getting no output from the following code:
<?php
$addresses = array();
foreach ($jdf_response->Response->DeviceList->DeviceInfo as $device) {
    $addresses[] = $device->{'@attributes'}['DeviceID'];
}
print_r($addresses);
From the SimpleXML manual, "Example #4 Accessing non-unique elements in SimpleXML":
When multiple instances of an element exist as children of a single parent element, normal iteration techniques apply.
The data is still there; you just have to iterate over it:
foreach($jdf_response->Response->DeviceList->DeviceInfo as $device)
{
print_r($device);
}
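To collect the DeviceID values the update was after, a sketch along these lines should work. It assumes a cut-down version of the JMF response with the namespace declarations and extra attributes removed, and reads the attribute with array syntax rather than through an '@attributes' property, which only exists in print_r() output.

```php
<?php
// Cut-down version of the JMF response (assumption: namespaces and the
// other attributes are left out to keep the sketch short).
$jdf = simplexml_load_string(
    '<JMF><Response><DeviceList>'
    . '<DeviceInfo DeviceID="HPSSPP-SM"/>'
    . '<DeviceInfo DeviceID="192.168.1.101"/>'
    . '<DeviceInfo DeviceID="192.168.1.102"/>'
    . '</DeviceList></Response></JMF>'
);

$addresses = [];
// Iterating over ->DeviceInfo visits every sibling with that name.
foreach ($jdf->Response->DeviceList->DeviceInfo as $device) {
    // Attributes are read with array syntax on the element itself.
    $addresses[] = (string) $device['DeviceID'];
}

print_r($addresses);
```

A single element can also be picked by index, for example $jdf->Response->DeviceList->DeviceInfo[1].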
As mentioned in the question title, I am trying the code below to reach the desired node in an XPath result.
<?php
$xpath = '//*[@id="topsection"]/div[3]/div[2]/div[1]/div/div[1]';
$html = new DOMDocument();
@$html->loadHTMLFile('http://www.flipkart.com/samsung-galaxy-ace-s5830/p/itmdfndpgz4nbuft');
$xml = simplexml_import_dom($html);
if (!$xml) {
echo 'Error while parsing the document';
exit;
}
$source = $xml->xpath($xpath);
echo "<pre>";
print_r($source);
?>
This is the source code I am using to scrape a price from an e-commerce site.
It works and gives the output below:
Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[class] => line
)
[div] => SimpleXMLElement Object
(
[@attributes] => Array
(
[class] => prices
[itemprop] => offers
[itemscope] =>
[itemtype] => http://schema.org/Offer
)
[span] => Rs. 10300
[div] => (Prices inclusive of taxes)
[meta] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[itemprop] => price
[content] => Rs. 10300
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[itemprop] => priceCurrency
[content] => INR
)
)
)
)
)
)
Now, how do I reach [content] => Rs. 10300 directly?
I tried:
echo $source[0]['div']['meta']['@attributes']['content']
but it doesn't work.
Try echo (string) $source[0]->div->meta[0]['content'];.
Basically, when print_r shows that an element is an object, you can't access it like an array; you need to use the object -> syntax for child elements and square brackets only for attributes.
The print_r of a SimpleXMLElement does not show the real object structure. So you need to have some knowledge:
$source[0]->div->meta['content']
| | | `- attribute access
| | `- element access, defaults to the first one
| `- element access, defaults to the first one
|
standard array access to get
the first SimpleXMLElement of xpath()
operation
Run against your address, that expression gives the following (print_r again):
SimpleXMLElement Object
(
[0] => Rs. 10300
)
Cast it to string in case you want the text-value:
$rs = (string) $source[0]->div->meta['content'];
However you can already directly access that node with the xpath expression (if that is a single case).
Learn more about accessing a SimpleXMLElement in the "Basic SimpleXML usage" examples in the PHP documentation.
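Both routes can be sketched as follows, assuming a small well-formed fragment that mirrors the print_r output in the question instead of the live page. The fragment and the variable names $viaProperties and $viaXPath are illustrative only.

```php
<?php
// Hypothetical fragment shaped like the node the question's XPath returned.
$node = simplexml_load_string(
    '<div class="line">'
    . '<div class="prices" itemprop="offers">'
    . '<span>Rs. 10300</span>'
    . '<meta itemprop="price" content="Rs. 10300"/>'
    . '<meta itemprop="priceCurrency" content="INR"/>'
    . '</div>'
    . '</div>'
);

// Route 1: walk the elements with ->, then read the attribute with [].
$viaProperties = (string) $node->div->meta[0]['content'];

// Route 2: let XPath select the attribute node directly.
$matches = $node->xpath('//meta[@itemprop="price"]/@content');
$viaXPath = $matches ? (string) $matches[0] : null;

echo $viaProperties, PHP_EOL;
echo $viaXPath, PHP_EOL;
```

Selecting by itemprop rather than by position is a design choice here: it keeps working if the order of the meta elements changes.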
I have loaded an XML file using
simplexml_load_file($filePath,'SimpleXMLElement', LIBXML_NOCDATA);
And for most of the XML provided it works fine. However, for some of the elements in the XML the attributes are not converted into an '@attributes' array, and are instead missing from the output. Here's a sample:
<UI_DEFINITION>
<EDIT_PERMISSION>testPermission</EDIT_PERMISSION>
<DEFAULT_VALUES>
<display>hidden</display>
<css_class>generic_css_class</css_class>
<title>{tag}</title>
<type>string</type>
<wrapper_format>{value}</wrapper_format>
<full_path>false</full_path>
<mandatory>false</mandatory>
<edit_permission>testPermission</edit_permission>
<max_length>0</max_length>
</DEFAULT_VALUES>
<LOOKUPS>
<DB_LOOKUP name="test3">
<VIEW>???</VIEW>
<ID_FIELD>???</ID_FIELD>
<DESCR_FIELD>???</DESCR_FIELD>
<ORDER>??? asc</ORDER>
</DB_LOOKUP>
<DB_LOOKUP name="test1">
<VIEW>???</VIEW>
<ID_FIELD>???</ID_FIELD>
<DESCR_FIELD>???</DESCR_FIELD>
<ORDER>??? asc</ORDER>
</DB_LOOKUP>
</LOOKUPS>
<AREA internal_name="main_details" title="" display="show">
<FIELD lookup="test1" title="Title">Title</FIELD>
<FIELD title="Name">Given_Name</FIELD>
<FIELD title="Mid. Name(s)">Middle_Names</FIELD>
<FIELD title="Family Name">Family_Name</FIELD>
<FIELD title="Gender">Gender</FIELD>
<FIELD title="Born" type="date">Date_of_Birth</FIELD>
<FIELD max_length="20" title="ID">Unique_Identifier</FIELD>
</AREA>
This gives the following output from print_r (I've added a line break at the bit that's the problem):
SimpleXMLElement Object ( [UI_DEFINITION] => SimpleXMLElement Object ( [EDIT_PERMISSION] => testPermission [DEFAULT_VALUES] => SimpleXMLElement Object ( [display] => hidden [css_class] => generic_css_class [title] => {tag} [type] => string [wrapper_format] => {value} [full_path] => false [mandatory] => false [edit_permission] => testPermission [max_length] => 0 ) [LOOKUPS] => SimpleXMLElement Object ( [DB_LOOKUP] => Array ( [0] => SimpleXMLElement Object ( [@attributes] => Array ( [name] => test3 ) [VIEW] => ??? [ID_FIELD] => ??? [DESCR_FIELD] => ??? [ORDER] => ??? asc ) [1] => SimpleXMLElement Object ( [@attributes] => Array ( [name] => test1 ) [VIEW] => ??? [ID_FIELD] => ??? [DESCR_FIELD] => ??? [ORDER] => ??? asc ) ) )
[AREA] => SimpleXMLElement Object ( [@attributes] => Array ( [internal_name] => main_details [title] => [display] => show ) [FIELD] => Array ( [0] => Title [1] => Given_Name [2] => Middle_Names [3] => Family_Name [4] => Gender [5] => Date_of_Birth [6] => Unique_Identifier ) ) ) )
As you can see, the attributes array is correctly added to most of the elements, but not to the FIELD elements. I've tried renaming them and it didn't seem to make a difference.
EDIT:
I should also add that I've tried surrounding the FIELD tags with a FIELDS tag, also to no avail.
EDIT:
I've simplified the XML hugely, and it still doesn't return any attributes:
<UI_DEFINITION>
<FIELD lookup="test1" title="Title">Title</FIELD>
</UI_DEFINITION>
produces:
SimpleXMLElement Object ( [UI_DEFINITION] => SimpleXMLElement Object ( [FIELD] => Title ) )
The attributes are accessible, for example:
$obj = simplexml_load_string($xml);
foreach($obj->AREA->FIELD as $field)
{
echo $field->attributes()->title . '<br />';
}
print_r() does not always show the full structure with SimpleXML, but the attributes are there for use.
Sorry it's taken so long to come back and answer this question!
As MrCode suggested, the attributes were accessible. The problem was in the serialisation of the SimpleXML object into another format. Using print_r or json_encode on the whole object resulted in the attributes not being available in the cases reported.
I didn't go far enough into this to find a code-based workaround for printing or converting these objects including the problematic cases, I simply worked around it as part of the XML data:
<UI_DEFINITION>
<FIELD lookup="test1" title="Title"><VALUEPATH>Title</VALUEPATH></FIELD>
</UI_DEFINITION>
Adding this extra level to the hierarchy resulted in the attributes being preserved at the top level, and the text value being available correctly at the sub-level.
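For anyone who would rather not change the XML, one possible code-based workaround is to build the array by hand so that attributes and text are both kept. The sketch below is only an illustration: elementToArray is a hypothetical helper, not part of SimpleXML, and it ignores namespaced attributes and children.

```php
<?php
// Hypothetical helper: converts an element into a plain array that keeps
// the name, attributes, direct text and child elements side by side.
function elementToArray(SimpleXMLElement $element): array
{
    $result = [
        'name' => $element->getName(),
        'attributes' => [],
        'text' => trim((string) $element),
        'children' => [],
    ];

    // Copy every (non-namespaced) attribute as a plain string.
    foreach ($element->attributes() as $name => $value) {
        $result['attributes'][$name] = (string) $value;
    }

    // Recurse into (non-namespaced) child elements.
    foreach ($element->children() as $child) {
        $result['children'][] = elementToArray($child);
    }

    return $result;
}

$xml = simplexml_load_string(
    '<UI_DEFINITION><FIELD lookup="test1" title="Title">Title</FIELD></UI_DEFINITION>'
);

$data = elementToArray($xml);
echo json_encode($data), PHP_EOL;
```

Because the result is an ordinary nested array, print_r and json_encode show the FIELD attributes next to the text value.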