simplexml stuck parsing xml - php

I am trying to parse an XML file. I can get the d3p1:key and d3p1:value elements all right.
foreach ($xml_contact->Attributes->KeyValuePairOfstringanyType as $node) {
$key = (string)$node->children('d3p1', TRUE)->key;
$value = (string)$node->children('d3p1', TRUE)->value;
do_stuff($key, $value);
}
But I also need to get this 9ccaa69b-fced-e411-80da-00155d0a0806
and I am struggling to figure out how to reference it.
I have tried various incarnations along these lines
$node->children('d3p1', TRUE)->value->Id
What am I doing wrong?
<KeyValuePairOfstringanyType>
<d3p1:key>birthdate</d3p1:key>
<d3p1:value xmlns:d5p1="http://www.w3.org/2001/XMLSchema" i:type="d5p1:dateTime">1940-12-10T11:00:00Z</d3p1:value>
</KeyValuePairOfstringanyType>
<KeyValuePairOfstringanyType>
<d3p1:key>parentcustomerid</d3p1:key>
<d3p1:value i:type="EntityReference">
<Id>9ccaa69b-fced-e411-80da-00155d0a0806</Id>
<KeyAttributes xmlns:d6p1="http://schemas.microsoft.com/xrm/7.1/Contracts"/>
<LogicalName>account</LogicalName>
<Name>Test ABC</Name>
<RowVersion i:nil="true"/>
</d3p1:value>
</KeyValuePairOfstringanyType>

The Id element has no namespace prefix, so is in the default namespace of the document, or in no namespace if the document has no default namespace. You need to call ->children() again to switch to the right namespace, as SimpleXML is currently looking for further nodes in the namespace with prefix d3p1.
If there is no default namespace, you just need to pass NULL:
$node->children('d3p1', TRUE)->value->children(NULL)->Id
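A minimal sketch of that second children() call, run against a cut-down version of the XML above. The namespace URIs bound to d3p1 and i, and the Entity root element, are placeholders (the question does not show them); the Attributes wrapper is taken from the loop in the question.

```php
<?php
// Cut-down document; the namespace URIs and root name are placeholders.
$xml = <<<'XML'
<Entity xmlns:d3p1="urn:example:collections" xmlns:i="urn:example:instance">
  <Attributes>
    <KeyValuePairOfstringanyType>
      <d3p1:key>parentcustomerid</d3p1:key>
      <d3p1:value i:type="EntityReference">
        <Id>9ccaa69b-fced-e411-80da-00155d0a0806</Id>
        <LogicalName>account</LogicalName>
      </d3p1:value>
    </KeyValuePairOfstringanyType>
  </Attributes>
</Entity>
XML;

$xml_contact = simplexml_load_string($xml);

$ids = [];
foreach ($xml_contact->Attributes->KeyValuePairOfstringanyType as $node) {
    // Children of the pair that live in the namespace aliased as d3p1.
    $pair = $node->children('d3p1', true);
    $key = (string) $pair->key;
    // Switch back to "no namespace" before asking for the unprefixed <Id>.
    $ids[$key] = (string) $pair->value->children(null)->Id;
}

print_r($ids);
```

If the real document declares a default namespace, the same call would presumably need that namespace URI in place of null.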

Related

PHP - Unable to parse attribute using SimpleXML

Given the following xml:
<data xmlns:ns2="...">
<versions>
<ns2:version type="HW">E</ns2:version>
<ns2:version type="FW">3160</ns2:version>
<ns2:version type="SW">3.4.1 (777)</ns2:version>
</versions>
...
</data>
I am trying to parse the third attribute (ns2:version type="SW"), but when running the following code I get nothing.
$s = simplexml_load_file('data.xml');
echo $s->versions[2]->{'ns2:version'};
Running this gives the following output:
$s = simplexml_load_file('data.xml');
var_dump($s->versions);
How can I properly get that attribute?
You've got some quite annoying XML to work with there, at least as far as SimpleXML is concerned.
Your version elements are in the ns2 namespace, so in order to loop over them, you need to do something like this:
$s = simplexml_load_string($xml);
foreach ($s->versions[0]->children('ns2', true)->version as $child) {
...
}
The children() method returns all children of the current tag, but only in the default namespace. If you want to access elements in other namespaces, you can pass the local alias and the second argument true.
The more complicated part is that the type attribute is not considered to be part of this same namespace. This means you can't use the standard $element['attribute'] form to access it, since your element and attribute are in different namespaces.
Fortunately, SimpleXML's attributes() method works in the same way as children(), and so to access the attributes in the global namespace, you can pass it an empty string:
$element->attributes('')->type
In full, this is:
$s = simplexml_load_string($xml);
foreach ($s->versions[0]->children('ns2', true)->version as $child) {
echo (string) $child->attributes()->type, PHP_EOL;
}
This will get you the output
HW
FW
SW
To get the third version element:
$s = simplexml_load_file('data.xml');
// simplexml_load_file() already returns a SimpleXMLElement, so iterate it directly
foreach ($s as $out_ns) {
$ns = $out_ns->getNamespaces(true);
$child = $out_ns->children($ns['ns2']);
}
echo $child[2];
Output:
3.4.1 (777)

Xpath in PHP with OTA standards

I have basic knowledge about the use of Xpath in PHP, but I'm having some troubles with a specific case and I think that the problem is in the standards.
This is the snippet of the XML and it's based on the OTA standards:
<SendHotelResResult xmlns:a="http://schemas/Models/OTA" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<a:RoomRates>
<a:RoomRate>
<a:EffectiveDate>2015-11-13T00:00:00</a:EffectiveDate>
<a:ExpireDate>2015-11-15T00:00:00</a:ExpireDate>
<a:RatePlanID>25</a:RatePlanID>
<a:RatesType>
<a:Rates>
<a:Rate>
<a:AgeQualifyingCode i:nil="true"/>
<a:EffectiveDate>2015-11-13T00:00:00</a:EffectiveDate>
<a:Total>
<a:AmountAfterTax>0</a:AmountAfterTax>
<a:AmountBeforeTax>260.00</a:AmountBeforeTax>
<a:CurrencyCode>EUR</a:CurrencyCode>
</a:Total>
</a:Rate>
<a:Rate>
<a:AgeQualifyingCode i:nil="true"/>
<a:EffectiveDate>2015-11-14T00:00:00</a:EffectiveDate>
<a:Total>
<a:AmountAfterTax>0</a:AmountAfterTax>
<a:AmountBeforeTax>260.00</a:AmountBeforeTax>
<a:CurrencyCode>EUR</a:CurrencyCode>
</a:Total>
</a:Rate>
</a:Rates>
</a:RatesType>
<a:RoomID>52</a:RoomID>
<a:Total>
<a:AmountAfterTax>546.00</a:AmountAfterTax>
<a:AmountBeforeTax>520.00</a:AmountBeforeTax>
<a:CurrencyCode>EUR</a:CurrencyCode>
</a:Total>
</a:RoomRate>
</a:RoomRates>
</SendHotelResResult>
What I want:
Get a specific <RoomRate> tag based on the element <RoomID>.
Get the global RoomRate <Total> tag. I don't want the <Total> tag that is inside the <Rate> tag. This is the reason why I'm using the xpath rather than a simple getElementsByTagName('Total'). I don't know if the OTA standards has some approach to differentiate the Total tags.
My attempts until now:
$dom = new DOMDocument();
$response = $dom->load($xmlSendHotelRes);
$roomID = '52';
$roomRatesTag = $response->getElementsByTagName('RoomRates')->item(0);
$prefix = $roomRatesTag->prefix;
$namespace = $roomRatesTag->lookupNamespaceURI($prefix);
$xpath = new DOMXpath($dom);
$xpath->registerNamespace($prefix, $namespace);
$roomRateTotal = $xpath->query("//RoomRate[RoomID=$roomID]/Total", $roomRatesTag, true);
I already tried with and without $roomRatesTag as context and also other expressions like:
./RoomRate[RoomID=$roomID]/Total, //RoomRate[RoomID=$roomID]/Total, //RoomRate/[RoomID=$roomID]/Total, //RoomRate[RoomID=$roomID]/Total and //RoomRate/RoomID[text() = $roomID]/../Total but none of them works.
Actually, even $roomRate = $xpath->query("//RoomRate"); returns an empty DOMNodeList, so I don't know what I'm doing wrong. I suspect the problem is in the standards, with 2 identical tags in different places, although this does not make much sense.
Are there some other expressions that I need to try?
You're fetching the namespace from the document.
$prefix = $roomRatesTag->prefix;
$namespace = $roomRatesTag->lookupNamespaceURI($prefix);
But this is not necessary or a good idea. You know that the document uses OTA, so you know the namespace is http://schemas/Models/OTA.
The prefix is just an alias for the actual namespace value. The following 3 XML examples all resolve to a node {http://schemas/Models/OTA}RoomRates:
<a:RoomRates xmlns:a="http://schemas/Models/OTA"/>
<ota:RoomRates xmlns:ota="http://schemas/Models/OTA"/>
<RoomRates xmlns="http://schemas/Models/OTA"/>
Your Api has to look for nodes inside the namespace.
One possibility is to use the *NS (namespace aware) methods.
$dom->getElementsByTagNameNS('http://schemas/Models/OTA', 'RoomRates')->item(0);
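As a small sketch of why the namespace-aware lookup is independent of the prefix, the three one-line documents shown above can each be loaded and searched with the same call. The variable names here are illustrative only.

```php
<?php
$ota = 'http://schemas/Models/OTA';

// The three spellings from the answer: two different prefixes and a default namespace.
$variants = [
    '<a:RoomRates xmlns:a="http://schemas/Models/OTA"/>',
    '<ota:RoomRates xmlns:ota="http://schemas/Models/OTA"/>',
    '<RoomRates xmlns="http://schemas/Models/OTA"/>',
];

$found = [];
foreach ($variants as $variant) {
    $document = new DOMDocument();
    $document->loadXML($variant);
    // Match on namespace URI + local name; the prefix plays no part.
    $node = $document->getElementsByTagNameNS($ota, 'RoomRates')->item(0);
    $found[] = $node->namespaceURI . ' ' . $node->localName;
}

// All three variants yield the same namespace/local-name pair.
var_dump(count(array_unique($found)) === 1); // bool(true)
```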
The other is to use Xpath and register prefixes for the namespaces. This can be the prefixes from the document, or different ones.
$roomID = '52';
$document = new DOMDocument();
$document->load($xmlSendHotelRes);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('ota', 'http://schemas/Models/OTA');
// Double quotes so that $roomID is interpolated into the expression
var_dump(
$xpath->evaluate(
"string(//ota:RoomRates/ota:RoomRate[ota:RoomID=$roomID]/ota:Total)"
)
);
For a location path, DOMXpath::evaluate() would return a DOMNodeList but with string() it casts the first found node into a string and returns it.
You need to use a prefix (that you registered) and I think you want to start your path with .// and not with // if you want to search relative to the context node, so try ".//a:RoomRate[a:RoomID=$roomID]/a:Total"
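A rough sketch of the context-relative query, using a trimmed copy of the XML from the question loaded from a string. The trimming and the variable names are assumptions made for the example.

```php
<?php
// Trimmed copy of the response: one nightly Total and one overall Total.
$xml = <<<'XML'
<SendHotelResResult xmlns:a="http://schemas/Models/OTA">
  <a:RoomRates>
    <a:RoomRate>
      <a:RatesType><a:Rates><a:Rate>
        <a:Total><a:AmountBeforeTax>260.00</a:AmountBeforeTax></a:Total>
      </a:Rate></a:Rates></a:RatesType>
      <a:RoomID>52</a:RoomID>
      <a:Total><a:AmountBeforeTax>520.00</a:AmountBeforeTax></a:Total>
    </a:RoomRate>
  </a:RoomRates>
</SendHotelResResult>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('a', 'http://schemas/Models/OTA');

$roomID = '52';
$roomRatesTag = $xpath->query('//a:RoomRates')->item(0);

// ".//" searches below the context node; "/a:Total" takes only the direct
// child of the matching RoomRate, so the per-night totals are skipped.
$totals = $xpath->query(".//a:RoomRate[a:RoomID=$roomID]/a:Total", $roomRatesTag);

// Read a value relative to the Total node that was found.
$amount = $xpath->evaluate('string(a:AmountBeforeTax)', $totals->item(0));
echo $totals->length, ' ', $amount, PHP_EOL; // 1 520.00
```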

How to get XML namespace attributes with PHP simplexml

I'm pretty new to this, and I've followed several tutorials (including other SO questions), but I can't seem to get this to work.
I'm working with a library's EAD file (Library of Congress XML standard for describing library collections, http://www.loc.gov/ead/index.html), and I'm having trouble with the namespaces.
A simplified example of the XML:
<?xml version="1.0"?>
<ead xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd" xmlns:ns2="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<c02 id="ref24" level="item">
<did>
<unittitle>"Lepidoptera and seas(on) of appearance"</unittitle>
<unitid>1</unitid>
<container id="cid71717" type="Box" label="Mixed materials">1</container>
<physdesc>
<extent>Pencil</extent>
</physdesc>
<unitdate>[1817]</unitdate>
</did>
<dao id="ref001" ns2:actuate="onRequest" ns2:show="embed" ns2:role="" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:92"/>
</c02>
<c02 id="ref25" level="item">
<did>
<unittitle>Argus carryntas (Butterfly)</unittitle>
<unitid>2</unitid>
<container id="cid71715" type="Box" label="Mixed materials">1</container>
<physdesc>
<extent>Watercolor</extent>
</physdesc>
<unitdate>[1817]</unitdate>
</did>
<dao ns2:actuate="onRequest" ns2:show="embed" ns2:role="" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:87"/>
</c02>
Following advice I found elsewhere, I was trying this (and variations on this theme):
<?php
$entries = simplexml_load_file('test.xml');
foreach ($entries->c02->children('http://www.w3.org/1999/xlink') as $entry) {
echo 'link: ', $entry->children('dao', true)->href, "\n";
}
?>
Which, of course, isn't working.
You have to understand the difference between a namespace and a namespace prefix. The namespace is the value inside the xmlns attributes. The xmlns attributes define the prefix, which is an alias for the actual namespace for that node and its descendants.
In your example there are three namespaces:
http://www.w3.org/1999/xlink with the alias "ns2"
urn:isbn:1-931666-22-9 without an alias
http://www.w3.org/2001/XMLSchema-instance with the alias "xsi"
So elements and attributes starting with "ns2:" are inside the xlink namespace, and elements and attributes starting with "xsi:" are in the XML schema instance namespace. All elements without a namespace prefix are in the isbn specific namespace. Attributes without a namespace prefix are always in NO namespace.
If you query the XML DOM, you need to define your own namespace prefixes. The namespace prefixes in XML documents can change, especially if they come from external resources.
I don't use "SimpleXML", so here is a DOM example:
<?php
$xml = <<<'XML'
<?xml version="1.0"?>
<ead
xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd"
xmlns:ns2="http://www.w3.org/1999/xlink"
xmlns="urn:isbn:1-931666-22-9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<c02 id="ref24" level="item">
<did>
<unittitle>"Lepidoptera and seas(on) of appearance"</unittitle>
</did>
</c02>
</ead>
XML;
// create dom and load the xml
$dom = new DOMDocument();
$dom->loadXml($xml);
// create an xpath object
$xpath = new DOMXpath($dom);
// register your own namespace prefix
$xpath->registerNamespace('isbn', 'urn:isbn:1-931666-22-9');
foreach ($xpath->evaluate('//isbn:unittitle', NULL, FALSE) as $node) {
var_dump($node->textContent);
}
Output:
string(40) ""Lepidoptera and seas(on) of appearance""
Xpath is quite powerful and the most comfortable way to extract data from XML.
The default namespace in your case is weird. It looks like it is dynamic, so you might need a way to read it. Here is the Xpath for that:
$defaultNamespace = $xpath->evaluate('string(/*/namespace::*[name() = ""])');
It reads the namespace without a prefix from the document element.
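For readers who do want to stay with SimpleXML, here is a hedged sketch of reading the xlink attributes from the question's dao elements. It passes the namespace URI to attributes() rather than relying on the ns2 prefix; the XML is a shortened copy of the question's sample with the closing ead tag added.

```php
<?php
$xml = <<<'XML'
<?xml version="1.0"?>
<ead xmlns:ns2="http://www.w3.org/1999/xlink" xmlns="urn:isbn:1-931666-22-9">
  <c02 id="ref24" level="item">
    <dao ns2:show="embed" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:92"/>
  </c02>
  <c02 id="ref25" level="item">
    <dao ns2:show="embed" ns2:href="http://diglib.amphilsoc.org/fedora/repository/graphics:87"/>
  </c02>
</ead>
XML;

$entries = simplexml_load_string($xml);

$links = [];
foreach ($entries->c02 as $entry) {
    // <dao> has no prefix, so plain property access finds it;
    // only its attributes are in the xlink namespace.
    $xlink = $entry->dao->attributes('http://www.w3.org/1999/xlink');
    $links[] = (string) $xlink->href;
}

print_r($links);
```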

PHP Handling Namespace with SimpleXML

I really need help with using namespaces. How do I get the following code to work properly?
<?php
$mytv = simplexml_load_string(
'<?xml version="1.0" encoding="utf-8"?>
<mytv>
<mytv:channelone>
<mytv:description>comedy that makes you laugh</mytv:description>
</mytv:channelone>
</mytv>'
);
foreach ($mytv as $mytv1)
{
echo 'description: ', $mytv1->children('mytv', true)->channelone->description;
}
?>
All I'm trying to do is get the content inside the name element.
Whenever you use namespaces in XML, you have to declare every namespace prefix you use. The code below shows how to declare the namespace you are using.
You want to display the description that belongs to a specific namespace, is that right? Correct me if I'm wrong, and please describe your goal more precisely so that I can understand the problem.
Use this code and see if it gives you some ideas:
$xml ='<mytv>
<mytv:channelone xmlns:mytv="http://mycompany/namespaces/mytvs">
<mytv:description >comedy that makes you laugh</mytv:description>
</mytv:channelone>
</mytv>';
$xml = simplexml_load_string($xml);
$doc = new DOMDocument();
$str = $xml->asXML();
$doc->loadXML($str);
$bar_count = $doc->getElementsByTagName("description");
foreach ($bar_count as $node)
{
echo $node->nodeName." - ".$node->nodeValue."-".$node->prefix. "<br>";
}
Here, $node->prefix is the namespace prefix of the tag containing "description".
getElementsByTagName("description") collects the elements in the XML whose tag is description. You can then compare $node->prefix with the prefix you need and print only the matching nodes.
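As an alternative sketch to comparing prefixes by hand, DOM can select by namespace URI directly with getElementsByTagNameNS(). The URI below is the one declared in the answer's sample XML.

```php
<?php
$xml = '<mytv>
<mytv:channelone xmlns:mytv="http://mycompany/namespaces/mytvs">
<mytv:description>comedy that makes you laugh</mytv:description>
</mytv:channelone>
</mytv>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$descriptions = [];
// Ask for "description" elements in one specific namespace only.
foreach ($doc->getElementsByTagNameNS('http://mycompany/namespaces/mytvs', 'description') as $node) {
    $descriptions[] = $node->nodeValue;
}

print_r($descriptions);
```

Selecting by URI keeps working if the document later switches to a different prefix for the same namespace.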

Remove namespace from XML using PHP

I have an XML document that looks like this:
<Data
xmlns="http://www.domain.com/schema/data"
xmlns:dmd="http://www.domain.com/schema/data-metadata"
>
<Something>...</Something>
</Data>
I am parsing the information using SimpleXML in PHP. I am dealing with arrays and I seem to be having a problem with the namespace.
My question is: How do I remove those namespaces? I read the data from an XML file.
Thank you!
I found the answer above to be helpful, but it didn't quite work for me.
This ended up working better:
// Gets rid of all namespace definitions
$xml_string = preg_replace('/xmlns[^=]*="[^"]*"/i', '', $xml_string);
// Gets rid of all namespace references
$xml_string = preg_replace('/[a-zA-Z]+:([a-zA-Z]+[=>])/', '$1', $xml_string);
If you're using XPath, this is a limitation of XPath and not of PHP; look at this explanation on XPath and default namespaces for more info.
More specifically, it's the default namespace declaration (the xmlns attribute) in the root node that is causing the problem. This means that you'll need to register the namespace and then use a QName to refer to elements.
$feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
$feed->registerXPathNamespace("a", "http://www.domain.com/schema/data");
$result = $feed->xpath("a:Data/a:Something/...");
Important: The URI used in the registerXPathNamespace call must be identical to the one that is used in the actual XML file.
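A small sketch of the effect, using an inline copy of the Data document from the question with two Something elements filled in as sample content (the values "first" and "second" are made up for the example).

```php
<?php
$xml = <<<'XML'
<Data xmlns="http://www.domain.com/schema/data"
      xmlns:dmd="http://www.domain.com/schema/data-metadata">
  <Something>first</Something>
  <Something>second</Something>
</Data>
XML;

$data = simplexml_load_string($xml);

// Without a registered prefix the query matches nothing, because an
// unprefixed name in XPath 1.0 means "element in no namespace".
$unprefixed = $data->xpath('//Something');

// Register any prefix you like for the document's default namespace URI.
$data->registerXPathNamespace('a', 'http://www.domain.com/schema/data');
$prefixed = $data->xpath('//a:Something');

echo count($unprefixed), ' ', count($prefixed), PHP_EOL;
```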
The following PHP code automatically detects the default namespace specified in the XML file and registers it under the alias "default". Now all XPath queries have to be updated to include the prefix default:
So if you want to read XML files whether or not they contain a default namespace definition, and you want to query all Something elements, you could use the following code:
$xml = simplexml_load_file($name);
$namespaces = $xml->getDocNamespaces();
if (isset($namespaces[''])) {
$defaultNamespaceUrl = $namespaces[''];
$xml->registerXPathNamespace('default', $defaultNamespaceUrl);
$nsprefix = 'default:';
} else {
$nsprefix = '';
}
$somethings = $xml->xpath('//'.$nsprefix.'Something');
echo count($somethings).' times found';
When you just want your xml, parsed to be used, and you don't care for any namespaces,
you just remove them. Regular expressions are good, and way faster than my method below.
But for a safer approach when removing namespaces, one could parse the xml with SimpleXML and ask for the namespaces it has, like below:
$xml = '...';
$namespaces = simplexml_load_string($xml)->getDocNamespaces(true);
//The line below fetches default namespace with empty key, like this: '' => 'url'
//So we remove any default namespace from the array
$namespaces = array_filter(array_keys($namespaces), function($k){return !empty($k);});
$namespaces = array_map(function($ns){return "$ns:";}, $namespaces);
$ns_clean_xml = str_replace("xmlns=", "ns=", $xml);
$ns_clean_xml = str_replace($namespaces, array_fill(0, count($namespaces), ''), $ns_clean_xml);
$xml_obj = simplexml_load_string($ns_clean_xml);
This way you replace only the namespace prefixes and avoid removing anything else the XML might contain.
Actually I am using it as a method:
function refined_simplexml_load_string($xml_string) {
if(false === ($x1 = simplexml_load_string($xml_string)) ) return false;
$namespaces = array_keys($x1->getDocNamespaces(true));
$namespaces = array_filter($namespaces, function($k){return !empty($k);});
$namespaces = array_map(function($ns){return "$ns:";}, $namespaces);
return simplexml_load_string($ns_clean_xml = str_replace(
array_merge(["xmlns="], $namespaces),
array_merge(["ns="], array_fill(0, count($namespaces), '')),
$xml_string
));
}
To remove the namespace completely, you'll need to use Regular Expressions (RegEx). For example:
$feed = file_get_contents("http://www.sitepoint.com/recent.rdf");
$feed = preg_replace('/\sxmlns\s*=\s*(["\'])[^"\']*\1/i', '', $feed); // This removes ALL default namespace declarations but keeps the rest of each tag.
$xml_feed = simplexml_load_string($feed);
Then you've stripped any default XML namespaces before you load the XML. Be careful with the regex, though, because if you have any fields with something like:
<![CDATA[ <Transfer xmlns="http://redeux.example.com">cool.</Transfer> ]]>
Then it will strip the xmlns from inside the CDATA which may lead to unexpected results.
