Remove namespace from XML using PHP - php

I have an XML document that looks like this:
<Data
xmlns="http://www.domain.com/schema/data"
xmlns:dmd="http://www.domain.com/schema/data-metadata"
>
<Something>...</Something>
</Data>
I am parsing the information using SimpleXML in PHP. I am dealing with arrays and I seem to be having a problem with the namespace.
My question is: How do I remove those namespaces? I read the data from an XML file.
Thank you!

I found the answer above to be helpful, but it didn't quite work for me.
This ended up working better:
// Gets rid of all namespace definitions
$xml_string = preg_replace('/xmlns[^=]*="[^"]*"/i', '', $xml_string);
// Gets rid of all namespace references
$xml_string = preg_replace('/[a-zA-Z]+:([a-zA-Z]+[=>])/', '$1', $xml_string);

If you're using XPath, this is a limitation of XPath rather than of PHP; look at this explanation on XPath and default namespaces for more info.
More specifically, it's the xmlns="" attribute in the root node that causes the problem. This means you need to register the namespace and then use a QName to refer to elements.
$feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
$feed->registerXPathNamespace("a", "http://www.domain.com/schema/data");
$result = $feed->xpath("a:Data/a:Something/...");
Important: The URI used in the registerXPathNamespace call must be identical to the one that is used in the actual XML file.
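As a self-contained illustration of the point above, here is a minimal sketch that inlines a document shaped like the one in the question. The element contents ("first", "second") and the prefix "a" are assumptions made up for the example; only the namespace URI comes from the question.

```php
<?php
// Sample document modeled on the question; the element contents are placeholders.
$xmlString = <<<XML
<Data xmlns="http://www.domain.com/schema/data"
      xmlns:dmd="http://www.domain.com/schema/data-metadata">
  <Something>first</Something>
  <Something>second</Something>
</Data>
XML;

$data = simplexml_load_string($xmlString);

// "a" is an arbitrary prefix; only the URI has to match the document.
$data->registerXPathNamespace('a', 'http://www.domain.com/schema/data');

// Without a prefix the query matches nothing, because the elements
// live in the default namespace.
$unprefixed = $data->xpath('//Something');

// With the registered prefix the elements are found.
$prefixed = $data->xpath('//a:Something');

$values = array_map('strval', $prefixed);
echo count($unprefixed), ' / ', count($prefixed), PHP_EOL;
echo implode(', ', $values), PHP_EOL;
```

The unprefixed query comes back empty while the prefixed one finds both elements, which is the behaviour described above.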

The following PHP code automatically detects the default namespace specified in the XML file and registers it under the alias "default". All XPath queries then have to include the prefix default: for documents that declare one.
So if you want to read XML files whether or not they contain a default namespace definition, and you want to query all Something elements, you could use the following code:
$xml = simplexml_load_file($name);
$namespaces = $xml->getDocNamespaces();
if (isset($namespaces[''])) {
$defaultNamespaceUrl = $namespaces[''];
$xml->registerXPathNamespace('default', $defaultNamespaceUrl);
$nsprefix = 'default:';
} else {
$nsprefix = '';
}
$somethings = $xml->xpath('//'.$nsprefix.'Something');
echo count($somethings).' times found';

When you just want your XML parsed and ready to use, and you don't care about any namespaces, you can simply remove them. Regular expressions work well and are much faster than my method below.
For a safer approach, though, you can parse the XML with SimpleXML and ask it which namespaces the document declares, like below:
$xml = '...';
$namespaces = simplexml_load_string($xml)->getDocNamespaces(true);
// getDocNamespaces() also returns the default namespace under an empty key, like this: '' => 'url'
// so the next line filters that entry out and keeps only the real prefixes
$namespaces = array_filter(array_keys($namespaces), function($k){return !empty($k);});
$namespaces = array_map(function($ns){return "$ns:";}, $namespaces);
$ns_clean_xml = str_replace("xmlns=", "ns=", $xml);
$ns_clean_xml = str_replace($namespaces, array_fill(0, count($namespaces), ''), $ns_clean_xml);
$xml_obj = simplexml_load_string($ns_clean_xml);
This way you only replace the namespace prefixes the document actually declares, and avoid removing anything else the XML might contain.
I actually use it as a function:
function refined_simplexml_load_string($xml_string) {
if(false === ($x1 = simplexml_load_string($xml_string)) ) return false;
$namespaces = array_keys($x1->getDocNamespaces(true));
$namespaces = array_filter($namespaces, function($k){return !empty($k);});
$namespaces = array_map(function($ns){return "$ns:";}, $namespaces);
return simplexml_load_string($ns_clean_xml = str_replace(
array_merge(["xmlns="], $namespaces),
array_merge(["ns="], array_fill(0, count($namespaces), '')),
$xml_string
));
}

To remove the namespace completely, you'll need to use Regular Expressions (RegEx). For example:
$feed = file_get_contents("http://www.sitepoint.com/recent.rdf");
// Removes every default namespace declaration (xmlns="..." or xmlns='...') and leaves the rest of the tag intact.
$feed = preg_replace('/\sxmlns\s*=\s*(["\'])[^"\']*\1/i', '', $feed);
$xml_feed = simplexml_load_string($feed);
This strips the default XML namespaces before you load the XML. Be careful with the regex, though: if you have any fields containing something like
<![CDATA[ <Transfer xmlns="http://redeux.example.com">cool.</Transfer> ]]>
then it will also strip the xmlns from inside the CDATA, which may lead to unexpected results.
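To make the CDATA caveat concrete, here is a small sketch. The feed string, the Note element and the simplified pattern are all assumptions for illustration; the pattern is not the exact one from the answer, it just strips xmlns="..." declarations in the same spirit.

```php
<?php
// Hypothetical feed: one real default namespace, and one that is only CDATA text.
$feed = '<Data xmlns="http://www.domain.com/schema/data">'
      . '<Note><![CDATA[ <Transfer xmlns="http://redeux.example.com">cool.</Transfer> ]]></Note>'
      . '</Data>';

// Illustrative pattern (an assumption): drop every xmlns="..." declaration.
$stripped = preg_replace('/\sxmlns\s*=\s*"[^"]*"/', '', $feed);

$xml = simplexml_load_string($stripped);

// The CDATA text was altered too: its xmlns declaration is gone.
$note = trim((string) $xml->Note);
echo $note, PHP_EOL;
```

The regex has no way of knowing that the second xmlns is plain text rather than markup, so the payload inside the CDATA section is silently changed.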

Related

PHP - Unable to parse attribute using SimpleXML

Given the following xml:
<data xmlns:ns2="...">
<versions>
<ns2:version type="HW">E</ns2:version>
<ns2:version type="FW">3160</ns2:version>
<ns2:version type="SW">3.4.1 (777)</ns2:version>
</versions>
...
</data>
I am trying to read the third element, <ns2:version type="SW">, but when I run the following code I get nothing:
$s = simplexml_load_file('data.xml');
echo $s->versions[2]->{'ns2:version'};
Running this gives the following output:
$s = simplexml_load_file('data.xml');
var_dump($s->versions);
How can I properly get that attribute?
You've got some quite annoying XML to work with there, at least as far as SimpleXML is concerned.
Your version elements are in the ns2 namespace, so in order to loop over them, you need to do something like this:
$s = simplexml_load_string($xml);
foreach ($s->versions[0]->children('ns2', true)->version as $child) {
...
}
The children() method returns all children of the current tag, but only in the default namespace. If you want to access elements in other namespaces, you can pass the local alias and the second argument true.
The more complicated part is that the type attribute is not considered to be part of this same namespace. This means you can't use the standard $element['attribute'] form to access it, since your element and attribute are in different namespaces.
Fortunately, SimpleXML's attributes() method works in the same way as children(), and so to access the attributes in the global namespace, you can pass it an empty string:
$element->attributes('')->type
In full, this is:
$s = simplexml_load_string($xml);
foreach ($s->versions[0]->children('ns2', true)->version as $child) {
echo (string) $child->attributes()->type, PHP_EOL;
}
This will get you the output
HW
FW
SW
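If you only want the SW entry rather than the whole list, the same loop can filter on the attribute. This is a minimal sketch; the namespace URI is elided in the question, so http://example.com/ns2 below is purely a placeholder.

```php
<?php
// The namespace URI is elided in the question; this one is a placeholder.
$xml = <<<XML
<data xmlns:ns2="http://example.com/ns2">
  <versions>
    <ns2:version type="HW">E</ns2:version>
    <ns2:version type="FW">3160</ns2:version>
    <ns2:version type="SW">3.4.1 (777)</ns2:version>
  </versions>
</data>
XML;

$s = simplexml_load_string($xml);

$software = null;
foreach ($s->versions[0]->children('ns2', true)->version as $version) {
    // The type attribute has no namespace, so read it via attributes().
    if ((string) $version->attributes()->type === 'SW') {
        $software = (string) $version;
        break;
    }
}

echo $software, PHP_EOL;
```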
To get the third version element:
// simplexml_load_file() already returns a SimpleXMLElement, so no extra wrapping is needed
$sxe = simplexml_load_file('data.xml');
foreach ($sxe as $out_ns) {
$ns = $out_ns->getNamespaces(true);
$child = $out_ns->children($ns['ns2']);
}
echo $child[2];
Output:
3.4.1 (777)

Xpath in PHP with OTA standards

I have basic knowledge about the use of Xpath in PHP, but I'm having some troubles with a specific case and I think that the problem is in the standards.
This is the snippet of the XML and it's based on the OTA standards:
<SendHotelResResult xmlns:a="http://schemas/Models/OTA" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<a:RoomRates>
<a:RoomRate>
<a:EffectiveDate>2015-11-13T00:00:00</a:EffectiveDate>
<a:ExpireDate>2015-11-15T00:00:00</a:ExpireDate>
<a:RatePlanID>25</a:RatePlanID>
<a:RatesType>
<a:Rates>
<a:Rate>
<a:AgeQualifyingCode i:nil="true"/>
<a:EffectiveDate>2015-11-13T00:00:00</a:EffectiveDate>
<a:Total>
<a:AmountAfterTax>0</a:AmountAfterTax>
<a:AmountBeforeTax>260.00</a:AmountBeforeTax>
<a:CurrencyCode>EUR</a:CurrencyCode>
</a:Total>
</a:Rate>
<a:Rate>
<a:AgeQualifyingCode i:nil="true"/>
<a:EffectiveDate>2015-11-14T00:00:00</a:EffectiveDate>
<a:Total>
<a:AmountAfterTax>0</a:AmountAfterTax>
<a:AmountBeforeTax>260.00</a:AmountBeforeTax>
<a:CurrencyCode>EUR</a:CurrencyCode>
</a:Total>
</a:Rate>
</a:Rates>
</a:RatesType>
<a:RoomID>52</a:RoomID>
<a:Total>
<a:AmountAfterTax>546.00</a:AmountAfterTax>
<a:AmountBeforeTax>520.00</a:AmountBeforeTax>
<a:CurrencyCode>EUR</a:CurrencyCode>
</a:Total>
</a:RoomRate>
</a:RoomRates>
</SendHotelResResult>
What I want:
Get a specific <RoomRate> tag based on the element <RoomID>.
Get the global RoomRate <Total> tag. I don't want the <Total> tag that is inside the <Rate> tag. This is the reason why I'm using XPath rather than a simple getElementsByTagName('Total'). I don't know if the OTA standards have some approach to differentiate the Total tags.
My attempts until now:
$dom = new DOMDocument();
$response = $dom->load($xmlSendHotelRes);
$roomID = '52';
$roomRatesTag = $response->getElementsByTagName('RoomRates')->item(0);
$prefix = $roomRatesTag->prefix;
$namespace = $roomRatesTag->lookupNamespaceURI($prefix);
$xpath = new DOMXpath($dom);
$xpath->registerNamespace($prefix, $namespace);
$roomRateTotal = $xpath->query("//RoomRate[RoomID=$roomID]/Total", $roomRatesTag, true);
I already tried with and without $roomRatesTag as context and also other expressions like:
./RoomRate[RoomID=$roomID]/Total, //RoomRate[RoomID=$roomID]/Total, //RoomRate/[RoomID=$roomID]/Total and //RoomRate/RoomID[text() = $roomID]/../Total, but none of them works.
Actually, even $roomRate = $xpath->query("//RoomRate"); returns an empty DOMNodeList, so I don't know what I'm doing wrong. I suspect the problem is in the standard, with two identical tags in different places, although that does not make much sense.
Are there some other expressions that I need to try?
You're fetching the namespace from the document.
$prefix = $roomRatesTag->prefix;
$namespace = $roomRatesTag->lookupNamespaceURI($prefix);
But this is neither necessary nor a good idea. You know that the document uses OTA, so you know the namespace is http://schemas/Models/OTA.
The prefix is just an alias for the actual namespace value. The following three XML examples all resolve to a node {http://schemas/Models/OTA}RoomRates:
<a:RoomRates xmlns:a="http://schemas/Models/OTA"/>
<ota:RoomRates xmlns:ota="http://schemas/Models/OTA"/>
<RoomRates xmlns="http://schemas/Models/OTA"/>
Your API calls have to look for nodes inside that namespace.
One possibility is to use the *NS (namespace-aware) methods.
$dom->getElementsByTagNameNS('http://schemas/Models/OTA', 'RoomRates')->item(0);
The other is to use Xpath and register prefixes for the namespaces. This can be the prefixes from the document, or different ones.
$document = new DOMDocument();
$document->load($xmlSendHotelRes);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('ota', 'http://schemas/Models/OTA');
$roomID = '52';
var_dump(
    $xpath->evaluate(
        // Double quotes so that PHP interpolates $roomID into the expression
        "string(//ota:RoomRates/ota:RoomRate[ota:RoomID=$roomID]/ota:Total)"
    )
);
For a location path, DOMXpath::evaluate() would return a DOMNodeList, but string() casts the first node found to a string and returns that instead.
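For completeness, here is a runnable sketch of the XPath route that picks out only the room-level Total. It inlines a trimmed copy of the question's document (an assumption, so that it runs on its own) and registers the prefix ota, which deliberately differs from the a prefix used in the document.

```php
<?php
// Trimmed copy of the question's document, inlined so the sketch runs on its own.
$xmlString = <<<XML
<SendHotelResResult xmlns:a="http://schemas/Models/OTA">
  <a:RoomRates>
    <a:RoomRate>
      <a:RatesType><a:Rates><a:Rate>
        <a:Total><a:AmountAfterTax>0</a:AmountAfterTax></a:Total>
      </a:Rate></a:Rates></a:RatesType>
      <a:RoomID>52</a:RoomID>
      <a:Total>
        <a:AmountAfterTax>546.00</a:AmountAfterTax>
        <a:AmountBeforeTax>520.00</a:AmountBeforeTax>
        <a:CurrencyCode>EUR</a:CurrencyCode>
      </a:Total>
    </a:RoomRate>
  </a:RoomRates>
</SendHotelResResult>
XML;

$document = new DOMDocument();
$document->loadXML($xmlString);

$xpath = new DOMXPath($document);
$xpath->registerNamespace('ota', 'http://schemas/Models/OTA');

$roomID = '52';
// The child step after the predicate only reaches the Total that is a
// direct child of RoomRate, not the ones nested inside Rate.
$totals = $xpath->query("//ota:RoomRate[ota:RoomID='$roomID']/ota:Total");
$total = $totals->item(0);

// Evaluate relative expressions with the Total node as context.
$afterTax = $xpath->evaluate('string(ota:AmountAfterTax)', $total);
$currency = $xpath->evaluate('string(ota:CurrencyCode)', $total);

echo $totals->length, ' total: ', $afterTax, ' ', $currency, PHP_EOL;
```

Because the path uses a single child step from RoomRate to Total, the per-night Total elements nested under Rate are never matched, so no extra filtering is needed.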
You need to use a prefix that you registered. I also think you want to start your path with .// rather than // if you want to search relative to the context node, so try ".//a:RoomRate[a:RoomID=$roomID]/a:Total".

DOMDocument simple GetElementsByTagName wont work?

$xml = '<?xml version="1.0" encoding="UTF-8"?>
<stw:ThumbnailResponse xmlns:stw="http://www.shrinktheweb.com/doc/stwresponse.xsd">
<stw:Response>
<stw:ThumbnailResult>
<stw:Thumbnail Exists="true">http://imagelink.com</stw:Thumbnail>
<stw:Thumbnail Verified="false">delivered</stw:Thumbnail>
</stw:ThumbnailResult>
<stw:ResponseStatus>
<stw:StatusCode>refresh</stw:StatusCode>
</stw:ResponseStatus>
<stw:ResponseTimestamp>
<stw:StatusCode>1413812009</stw:StatusCode>
</stw:ResponseTimestamp>
<stw:ResponseCode>
<stw:StatusCode>HTTP:200</stw:StatusCode>
</stw:ResponseCode>
<stw:CategoryCode>
<stw:StatusCode></stw:StatusCode>
</stw:CategoryCode>
<stw:Quota_Remaining>
<stw:StatusCode>132</stw:StatusCode>
</stw:Quota_Remaining>
<stw:Bandwidth_Remaining>
<stw:StatusCode>999791</stw:StatusCode>
</stw:Bandwidth_Remaining>
</stw:Response>
</stw:ThumbnailResponse>';
$dom = new DOMDocument;
$dom->loadXML($xml);
$result = $dom->getElementsByTagName('stw:Thumbnail')->item(0)->nodeValue;
$status = $dom->getElementsByTagName('stw:Thumbnail')->item(0)->nodeValue;
echo $result;
The code above should output http://imagelink.com, and $status should hold "delivered", but neither works. Instead I am left with this error notice:
Trying to get property of non-object
I have tried other XML parsing alternatives such as SimpleXML (which did not work when the tag names contain a colon), and I tried looping through each level of the XML (ThumbnailResponse, Response and then ThumbnailResult) without luck.
How can I get the values inside stw:Thumbnail?
You need to specify a namespace, and DOMDocument::getElementsByTagName can't handle that. From the manual:
The local name (without namespace) of the tag to match on.
You can use DOMDocument::getElementsByTagNameNS instead:
$dom = new DOMDocument;
$dom->loadXML($xml);
$namespaceURI = 'http://www.shrinktheweb.com/doc/stwresponse.xsd';
$result = $dom->getElementsByTagNameNS($namespaceURI, 'Thumbnail')->item(0)->nodeValue;
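Putting that together for both values the question asks about, a minimal sketch could look like this. The XML is a shortened copy of the question's response, inlined so the snippet runs on its own.

```php
<?php
// Shortened version of the question's response document.
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<stw:ThumbnailResponse xmlns:stw="http://www.shrinktheweb.com/doc/stwresponse.xsd">
  <stw:Response>
    <stw:ThumbnailResult>
      <stw:Thumbnail Exists="true">http://imagelink.com</stw:Thumbnail>
      <stw:Thumbnail Verified="false">delivered</stw:Thumbnail>
    </stw:ThumbnailResult>
  </stw:Response>
</stw:ThumbnailResponse>';

$dom = new DOMDocument;
$dom->loadXML($xml);

$namespaceURI = 'http://www.shrinktheweb.com/doc/stwresponse.xsd';
$thumbnails = $dom->getElementsByTagNameNS($namespaceURI, 'Thumbnail');

$result = $thumbnails->item(0)->nodeValue; // first Thumbnail: the image link
$status = $thumbnails->item(1)->nodeValue; // second Thumbnail: the status

echo $result, PHP_EOL, $status, PHP_EOL;
```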
Using SimpleXML, you could use the ->children() method for this one:
$xml = simplexml_load_string($xml_string);
// The second argument is a boolean: true means the first argument is a prefix, not a URI
$stw = $xml->children('stw', true);
echo '<pre>';
foreach($stw as $e) {
print_r($e);
// do what you have to do here
}
This code actually runs just fine for me.
Typically, that sort of error means you may have made a typo in your $dom object, so double-check it and try again.
Also note that you'll want to change item(0) to item(1) when setting your $status variable:
$result = $dom->getElementsByTagName('stw:Thumbnail')->item(0)->nodeValue;
$status = $dom->getElementsByTagName('stw:Thumbnail')->item(1)->nodeValue;

creating multiple xml nodes with same namespaces in php

I have the following code
$dom = new DOMDocument('1.0', 'utf-8');
$headerNS = $dom->createElementNS('http://somenamespace', 'ttauth:authHeader');
$accesuser = $dom->createElementNS('http://somenamespace', 'ttauth:Accessuser','aassdd');
$accesscode = $dom->createElementNS('http://somenamespace', 'ttauth:Accesscode','aassdd');
$headerNS->appendChild($accesuser);
$headerNS->appendChild($accesscode);
echo "<pre>";
echo ($dom->saveXML($headerNS));
echo "</pre>";
It will produce the following XML as output:
<?xml version="1.0" ?>
<ttauth:authHeader xmlns:ttauth="http://somenamespace">
<ttauth:Accessuser>
ApiUserFor136
</ttauth:Accessuser>
<ttauth:Accesscode>
test1234
</ttauth:Accesscode>
</ttauth:authHeader>
But I want the following output
<ttauth:authHeader xmlns:ttauth="http://somenamespace">
<ttauth:Accessuser xmlns:ttauth="http://somenamespace">
aassdd
</ttauth:Accessuser>
<ttauth:Accesscode xmlns:ttauth="somenamespace">
aassdd
</ttauth:Accesscode>
</ttauth:authHeader>
Note that the xmlns is not included in elements other than the root element, but I want the xmlns to be included in all elements. Is there anything I am doing wrong?
PHP probably does not repeat the declaration of the same namespace "http://somenamespace" with the same prefix "ttauth" because it is redundant. Both XML documents you show (the actual output and the expected one) are equivalent. If you want to be sure the namespace attributes appear the way you want, you should add them manually using createAttribute - http://www.php.net/manual/en/domdocument.createattribute.php. See the following code snippet:
$domAttribute = $domDocument->createAttribute('xmlns:ttauth');
$domAttribute->value = 'http://somenamespace';
$accessuser->appendChild($domAttribute);
Hope it helps
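To check the equivalence claim for yourself, here is a minimal sketch that builds the header as in the question, serializes it, parses it back, and inspects the child element. It only uses the names from the question; re-parsing the output is just a stand-in for whatever consumer reads the XML.

```php
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$header = $dom->createElementNS('http://somenamespace', 'ttauth:authHeader');
$user = $dom->createElementNS('http://somenamespace', 'ttauth:Accessuser', 'aassdd');
$header->appendChild($user);
$dom->appendChild($header);

// Serialize, then parse again as a consumer of the XML would.
$parsed = new DOMDocument();
$parsed->loadXML($dom->saveXML());

// The child carries no xmlns attribute of its own in the serialized XML,
// yet it still resolves to the namespace declared on the root.
$child = $parsed->documentElement->firstChild;
echo $child->nodeName, ' => ', $child->namespaceURI, PHP_EOL;
```

The child inherits the declaration from its parent, so a namespace-aware consumer sees the same namespace either way.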
Instead of using
$accesuser = $dom->createElementNS('http://somenamespace', 'ttauth:Accessuser','aassdd');
I used
$accesuser = $dom->createElement('ttauth:Accessuser', 'aassdd');
and then
$accesuser->setAttribute('xmlns:ttauth', 'http://somenamespace');
It works fine for any number of nodes.

PHP Handling Namespace with SimpleXML

I really need help with using namespaces. How do I get the following code to work properly?
<?php
$mytv = simplexml_load_string(
'<?xml version="1.0" encoding="utf-8"?>
<mytv>
<mytv:channelone>
<mytv:description>comedy that makes you laugh</mytv:description>
</mytv:channelone>
</mytv>'
);
foreach ($mytv as $mytv1)
{
echo 'description: ', $mytv1->children('mytv', true)->channelone->description;
}
?>
All I'm trying to do is get the content inside the name element.
Whenever you use namespaces in XML, you should declare every namespace you use. The code I posted below shows how to declare the namespace you are using.
You need to display the description specific to the namespace, don't you? Correct me if I'm wrong, and please describe your purpose more clearly so that I can understand your problem.
Use this code and see if it gives you some idea:
$xml ='<mytv>
<mytv:channelone xmlns:mytv="http://mycompany/namespaces/mytvs">
<mytv:description >comedy that makes you laugh</mytv:description>
</mytv:channelone>
</mytv>';
$xml = simplexml_load_string($xml);
$doc = new DOMDocument();
$str = $xml->asXML();
$doc->loadXML($str);
$bar_count = $doc->getElementsByTagName("description");
foreach ($bar_count as $node)
{
echo $node->nodeName." - ".$node->nodeValue."-".$node->prefix. "<br>";
}
Here, the value of $node->prefix is the namespace prefix of the description tag.
getElementsByTagName("description") gets all the elements in the XML whose tag is description. You can then compare $node->prefix with the specific namespace you need before printing.
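If you would rather stay in SimpleXML, the lookup also works once the prefix is declared. A minimal sketch, assuming the same illustrative namespace URI as above, declared on the root element here:

```php
<?php
// The namespace URI is the illustrative one from the answer above.
$xml = '<mytv xmlns:mytv="http://mycompany/namespaces/mytvs">
  <mytv:channelone>
    <mytv:description>comedy that makes you laugh</mytv:description>
  </mytv:channelone>
</mytv>';

$mytv = simplexml_load_string($xml);

// children('mytv', true) selects children by prefix; elements reached
// through that result are looked up in the same namespace.
$description = (string) $mytv->children('mytv', true)->channelone->description;

echo 'description: ', $description, PHP_EOL;
```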
