PHP SimpleXML: How to access nested namespaces? - php

Given this XML structure:
$xml = '<rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<item>
<title>Title</title>
<media:group>
<media:content url="url1" />
<media:content url="url2" />
</media:group>
</item>
<item>
<title>Title2</title>
<media:group>
<media:content url="url1" />
<media:content url="url2" />
</media:group>
</item>
</channel>
</rss>';
$xml_data = new SimpleXMLElement($xml);
How do I access the attributes of the media:content nodes? I tried
foreach ($xml_data->channel->item as $key => $data) {
$urls = $data->children('media', true)->children('media', true);
print_r($urls);
}
and
foreach ($xml_data->channel->item as $key => $data) {
$ns = $xml_data->getNamespaces(true);
$urls = $data->children('media', true)->children($ns['media']);
print_r($urls);
}
as per other answers, but they both return empty SimpleXMLElements.

When you inspect XML with SimpleXML, use asXML() to see the real content. print_r() builds its own representation and does not show everything, which is why the elements look empty.
foreach ($xml_data->channel->item as $key => $data) {
$urls = $data->children('media', true)->children('media', true);
echo $urls->asXML().PHP_EOL;
}
This outputs:
<media:content url="url1"/>
<media:content url="url1"/>
Only the first media:content of each group is printed. To go through all of the child nodes of each item, add another foreach:
foreach ($xml_data->channel->item as $key => $data) {
echo $data->title.PHP_EOL;
foreach ( $data->children('media', true)->children('media', true) as $content ) {
echo $content->asXML().PHP_EOL;
}
}
This outputs:
Title
<media:content url="url1"/>
<media:content url="url2"/>
Title2
<media:content url="url1"/>
<media:content url="url2"/>
To access a particular attribute (for example, the url attribute in the second code example), use the attributes() method:
echo $content->attributes()['url'];
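Putting the pieces together, here is a minimal sketch that collects every url attribute per item. It reuses the feed from the question; the variable name $urlsByTitle is hypothetical and only used for illustration.

```php
<?php
// Sample feed copied from the question.
$xml = '<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
<title>Title</title>
<media:group>
<media:content url="url1" />
<media:content url="url2" />
</media:group>
</item>
<item>
<title>Title2</title>
<media:group>
<media:content url="url1" />
<media:content url="url2" />
</media:group>
</item>
</channel>
</rss>';

$xml_data = new SimpleXMLElement($xml);

// $urlsByTitle is a hypothetical name used only in this sketch.
$urlsByTitle = [];
foreach ($xml_data->channel->item as $item) {
    $title = (string) $item->title;
    $urlsByTitle[$title] = [];
    // Step into media:group, then iterate its media:content children (both selected by prefix).
    foreach ($item->children('media', true)->group->children('media', true)->content as $content) {
        // attributes() without arguments returns the non-namespaced attributes such as url.
        $urlsByTitle[$title][] = (string) $content->attributes()->url;
    }
}

print_r($urlsByTitle);
```

Casting to string matters here: without it the array would hold SimpleXMLElement objects rather than plain values.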

Related

Remove all nodes from XML but specific ones in PHP

I have an XML from Google with a content like this:
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
<channel>
<title>E-commerce's products.</title>
<description><![CDATA[Clothing and accessories.]]></description>
<link>https://www.ourwebsite.com/</link>
<item>
<title><![CDATA[Product #1 title]]></title>
<g:brand><![CDATA[Product #1 brand]]></g:brand>
<g:mpn><![CDATA[5643785645]]></g:mpn>
<g:gender>Male</g:gender>
<g:age_group>Adult</g:age_group>
<g:size>Unica</g:size>
<g:condition>new</g:condition>
<g:id>fr_30763_06352</g:id>
<g:item_group_id>fr_30763</g:item_group_id>
<link><![CDATA[https://www.ourwebsite.com/product_1_url.htm?mid=62367]]></link>
<description><![CDATA[Product #1 description]]></description>
<g:image_link><![CDATA[https://data.ourwebsite.com/imgprodotto/product-1_big.jpg]]></g:image_link>
<g:sale_price>29.25 EUR</g:sale_price>
<g:price>65.00 EUR</g:price>
<g:shipping_weight>0.5 kg</g:shipping_weight>
<g:featured_product>y</g:featured_product>
<g:product_type><![CDATA[Product #1 category]]></g:product_type>
<g:availability>in stock</g:availability>
<g:availability_date>2022-08-10T00:00-0000</g:availability_date>
<qty>3</qty>
<g:payment_accepted>Visa</g:payment_accepted>
<g:payment_accepted>MasterCard</g:payment_accepted>
<g:payment_accepted>CartaSi</g:payment_accepted>
<g:payment_accepted>Aura</g:payment_accepted>
<g:payment_accepted>PayPal</g:payment_accepted>
</item>
<item>
<title><![CDATA[Product #2 title]]></title>
<g:brand><![CDATA[Product #2 brand]]></g:brand>
<g:mpn><![CDATA[573489547859]]></g:mpn>
<g:gender>Unisex</g:gender>
<g:age_group>Adult</g:age_group>
<g:size>Unica</g:size>
<g:condition>new</g:condition>
<g:id>fr_47362_382936</g:id>
<g:item_group_id>fr_47362</g:item_group_id>
<link><![CDATA[https://www.ourwebsite.com/product_2_url.htm?mid=168192]]></link>
<description><![CDATA[Product #2 description]]></description>
<g:image_link><![CDATA[https://data.ourwebsite.com/imgprodotto/product-2_big.jpg]]></g:image_link>
<g:sale_price>143.91 EUR</g:sale_price>
<g:price>159.90 EUR</g:price>
<g:shipping_weight>8.0 kg</g:shipping_weight>
<g:product_type><![CDATA[Product #2 category]]></g:product_type>
<g:availability>in stock</g:availability>
<g:availability_date>2022-08-10T00:00-0000</g:availability_date>
<qty>1</qty>
<g:payment_accepted>Visa</g:payment_accepted>
<g:payment_accepted>MasterCard</g:payment_accepted>
<g:payment_accepted>CartaSi</g:payment_accepted>
<g:payment_accepted>Aura</g:payment_accepted>
<g:payment_accepted>PayPal</g:payment_accepted>
</item>
...
</channel>
</rss>
I need to produce an XML file with all the tags inside <item> removed except for <g:mpn>, <link>, <g:sale_price> and <qty>.
In the example above, the result should be
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
<channel>
<title>E-commerce's products.</title>
<description><![CDATA[Clothing and accessories.]]></description>
<link>https://www.ourwebsite.com/</link>
<item>
<g:mpn><![CDATA[5643785645]]></g:mpn>
<link><![CDATA[https://www.ourwebsite.com/product_1_url.htm?mid=62367]]></link>
<g:sale_price>29.25 EUR</g:sale_price>
<qty>3</qty>
</item>
<item>
<g:mpn><![CDATA[573489547859]]></g:mpn>
<link><![CDATA[https://www.ourwebsite.com/product_2_url.htm?mid=168192]]></link>
<g:sale_price>143.91 EUR</g:sale_price>
<qty>1</qty>
</item>
...
</channel>
</rss>
I've looked at the SimpleXML, DOMDocument and XPath docs, but I couldn't find a way to exclude specific elements. I don't want to select the nodes to delete by name, because Google could add new nodes in the future and my script would not delete them.
I've also tried to loop through namespaced elements with SimpleXML and unset them if not matched with the nodes I have to keep:
$g = $element->children($namespaces['g']); //$element is the SimpleXMLElement of <item> tag
foreach ($g as $gchild) {
if ($gchild->getName() != "mpn") { //for example
unset($gchild);
}
}
but the code above doesn't remove all nodes except for <g:mpn>, for example.
PS: consider the fact that the XML contains both namespaced and not namespaced elements
Thank you in advance.
EDIT:
I've managed to do this with the following code:
$elementsToKeep = array("mpn", "link", "sale_price", "qty");
$domdoc = new DOMDocument();
$domdoc->preserveWhiteSpace = FALSE;
$domdoc->formatOutput = TRUE;
$domdoc->loadXML($myXMLDocument->asXML()); //$myXMLDocument is the SimpleXML document related to the original XML
$xpath = new DOMXPath($domdoc);
foreach ($element->children() as $child) {
$cname = $child->getName();
if (!in_array($cname, $elementsToKeep)) {
foreach($xpath->query('/rss/channel/item/'.$cname) as $node) {
$node->parentNode->removeChild($node);
}
}
}
$g = $element->children($namespaces['g']);
foreach ($g as $gchild) {
$gname = $gchild->getName();
if (!in_array($gname, $elementsToKeep)) {
foreach($xpath->query('/rss/channel/item/g:'.$gname) as $node) {
$node->parentNode->removeChild($node);
}
}
}
I used DOMDocument and DOMXPath with two loops, one over the non-namespaced tags and one over the namespaced tags, so that I could call the DOM removeChild() method.
Is there really no cleaner solution? Thanks again.
Somewhat simpler:
$items = $xpath->query('//item');
foreach($items as $item) {
$targets = $xpath->query('.//*',$item);
foreach($targets as $target) {
if (!in_array($target->localName, $elementsToKeep)) {
$target->parentNode->removeChild($target);
}
}
}
Use XPath to express all child elements you want to remove.
Then use the library of your choice to remove the elements.
SimpleXMLElement example:
$sxe = simplexml_load_string($xml);
foreach ($sxe->xpath('//item/*[
not(
name() = "g:mpn"
or name() = "link"
or name() = "g:sale_price"
or name() = "qty"
)
]') as $child) unset($child[0]);
echo $sxe->asXML(), "\n";
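The unset($child[0]) form in the example above is what makes the removal stick. The following small sketch contrasts it with unsetting the loop variable, as the question tried. The three-element document and the variable names are made up for illustration.

```php
<?php
// Tiny hypothetical document used only for this comparison.
$sxe = simplexml_load_string('<item><a>1</a><b>2</b><c>3</c></item>');

// Unsetting the loop variable only drops the local PHP variable;
// the underlying XML node stays in the document.
foreach ($sxe->children() as $child) {
    if ($child->getName() !== 'a') {
        unset($child);
    }
}
$countAfterPlainUnset = $sxe->count(); // still 3

// Collect the nodes with XPath first, then unset each element through its own [0] offset.
// This removes the node from the document.
foreach ($sxe->xpath('*[name() != "a"]') as $child) {
    unset($child[0]);
}
$countAfterSelfUnset = $sxe->count(); // 1

echo $sxe->asXML(), "\n";
```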
DOMDocument example:
This is mostly identical to the previous example, except that the XPath expression matches elements by namespace URI rather than by prefix. It therefore keeps working if the namespace prefix changes (the same expression also works in the SimpleXMLElement example):
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//item/*[
not(
(local-name() = "mpn" and namespace-uri() = "http://base.google.com/ns/1.0")
or (local-name() = "link" and namespace-uri() = "")
or (local-name() = "sale_price" and namespace-uri() = "http://base.google.com/ns/1.0")
or (local-name() = "qty" and namespace-uri() = "")
)
]') as $child) {
$child->parentNode->removeChild($child);
}
echo $doc->saveXML(), "\n";
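If the keep-list is likely to change, the predicate can be generated from an array instead of being written by hand. This is only a sketch under assumptions: the shortened feed, the $keep structure and the $remaining variable are hypothetical and not part of the original answers.

```php
<?php
// Shortened, hypothetical version of the feed from the question.
$xml = <<<'XML'
<rss version="2.0" xmlns:g="http://base.google.com/ns/1.0">
<channel>
<title>Shop</title>
<item>
<title>Product #1 title</title>
<g:mpn>5643785645</g:mpn>
<g:brand>Brand</g:brand>
<link>https://www.ourwebsite.com/product_1_url.htm</link>
<g:sale_price>29.25 EUR</g:sale_price>
<qty>3</qty>
</item>
</channel>
</rss>
XML;

// Keep-list of [local name, namespace URI]; '' means "no namespace".
$keep = [
    ['mpn', 'http://base.google.com/ns/1.0'],
    ['link', ''],
    ['sale_price', 'http://base.google.com/ns/1.0'],
    ['qty', ''],
];

// Turn each pair into one XPath condition and join the conditions with "or".
$conditions = array_map(
    function (array $pair): string {
        [$localName, $namespaceUri] = $pair;
        return sprintf('(local-name() = "%s" and namespace-uri() = "%s")', $localName, $namespaceUri);
    },
    $keep
);
$expression = '//item/*[not(' . implode(' or ', $conditions) . ')]';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Remove every child of <item> that is not on the keep-list.
foreach ($xpath->query($expression) as $child) {
    $child->parentNode->removeChild($child);
}

// Collect the names of the surviving children to check the result.
$remaining = [];
foreach ($xpath->query('//item/*') as $node) {
    $remaining[] = $node->nodeName;
}
print_r($remaining);
```

The values in $keep are interpolated straight into the expression, so this only suits a trusted, hard-coded list.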

Same variable from multiple XML

$var1 = new SimpleXMLElement('CSVXML/xvar.xml', null, true);
$var2 = new SimpleXMLElement('CSVXML/yvar.xml', null, true);
Let's say I get variables from two different XML files. The first XML file contains:
<Number>3698</Number>
<InternalNumber>1</InternalNumber>
<Name>Bob</Name>
<Number>3500</Number>
<InternalNumber>2</InternalNumber>
<Name>Mike</Name>
<Number>2775</Number>
<InternalNumber>3</InternalNumber>
<Name>Dan</Name>
The second XML file contains:
<player>3698</player>
<group>A</group>
I do this
$varID = $var1->Number;
$varnumber = $var2->player;
if ($varID == $varnumber ){
echo '$var1->InternalNumber';
}
Is this possible?
I simply want to output a value when A from the first XML equals B from the second XML. Is there any way to do that?
I tested the following and found it works fine:
<?php
$str = <<<XML
<items>
<item>
<Number>3698</Number>
<InternalNumber>1</InternalNumber>
<Name>Bob</Name>
</item>
<item>
<Number>3500</Number>
<InternalNumber>2</InternalNumber >
<Name>Mike</Name>
</item>
<item>
<Number>2775</Number>
<InternalNumber>3</InternalNumber>
<Name>Dan</Name>
</item>
</items>
XML;
$str2 = <<<XML
<item>
<player>3698</player>
<group>A</group>
</item>
XML;
$da = new SimpleXMLElement($str2);
$varnumber = $da->player;
$data = new SimpleXMLElement($str);
foreach ($data->item as $item)
{
$this_number = $item->Number;
//echo $this_number."-".$item->InternalNumber."-".$varnumber."\n";
if((int)$this_number == (int)$varnumber ){
$this_internalnumber = $item->InternalNumber;
echo $this_internalnumber."\n";
}
else{
echo "No Match found \n";
}
}
Hope this helps.
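As an alternative to looping over every item, the lookup can be expressed as a single XPath query. This is a sketch only; the variable names $players, $lookup and $internalNumber are hypothetical, and the data is a shortened copy of the sample above.

```php
<?php
// Shortened copies of the two sample documents.
$players = new SimpleXMLElement('<items>
<item><Number>3698</Number><InternalNumber>1</InternalNumber><Name>Bob</Name></item>
<item><Number>3500</Number><InternalNumber>2</InternalNumber><Name>Mike</Name></item>
</items>');
$lookup = new SimpleXMLElement('<item><player>3698</player><group>A</group></item>');

// Cast to int so only a number is ever placed into the XPath expression.
$wanted = (int) $lookup->player;

// Select the InternalNumber of the item whose Number equals the wanted value.
$matches = $players->xpath(sprintf('item[Number = %d]/InternalNumber', $wanted));

// xpath() returns an array; an empty array means no item matched.
$internalNumber = $matches ? (string) $matches[0] : null;

echo $internalNumber === null ? "No match found\n" : $internalNumber . "\n";
```

This prints one result for the whole lookup, instead of a "No Match found" line for every non-matching item.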

Parse feed media group with array of children

I have this XML feed:
<item>
<title>Title</title>
<media:group>
<media:content url="http://example.it/image.jpg" type="image/jpeg">
<media:thumbnail url="http://example.it/image.jpg" type="image/png"/>
<media:credit>Credit</media:credit>
</media:content>
<media:content url="http://example.it/image2.jpg" type="image/jpeg">
<media:thumbnail url="http://example.it/image2.jpg" type="image/png"/>
<media:credit>Credit2</media:credit>
</media:content>
</media:group>
</item>
This is my PHP code to read it:
$rss = new SimpleXMLElement($url);
foreach ($rss->channel->item as $item) {
$title = $item->title;
}
No problem reading "title" item, but how can I read "url", "thumbnail", "credit" for each media:content?
-------SOLVED-------
$rss = new SimpleXMLElement($url);
foreach ($rss->channel->item as $item) {
$title = $item->title;
$gallerie = $item->children('http://search.yahoo.com/mrss/')->group->content;
foreach($gallerie as $g) {
echo $g->attributes()['url'] ."<br/>";
}
}
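The solution above prints only the url of each media:content. A sketch along the same lines that also reads the nested thumbnail and credit could look like this. It assumes the element is spelled media:thumbnail (the Media RSS name), wraps the feed in a minimal rss/channel envelope, and uses the hypothetical names $mrss and $rows.

```php
<?php
// Feed from the question, wrapped in a minimal rss/channel envelope (an assumption).
$xml = '<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
<title>Title</title>
<media:group>
<media:content url="http://example.it/image.jpg" type="image/jpeg">
<media:thumbnail url="http://example.it/image.jpg" type="image/png"/>
<media:credit>Credit</media:credit>
</media:content>
<media:content url="http://example.it/image2.jpg" type="image/jpeg">
<media:thumbnail url="http://example.it/image2.jpg" type="image/png"/>
<media:credit>Credit2</media:credit>
</media:content>
</media:group>
</item>
</channel>
</rss>';

$rss = new SimpleXMLElement($xml);
$mrss = 'http://search.yahoo.com/mrss/'; // Media RSS namespace URI

$rows = [];
foreach ($rss->channel->item as $item) {
    foreach ($item->children($mrss)->group->children($mrss)->content as $content) {
        // Children of media:content live in the same namespace.
        $inner = $content->children($mrss);
        $rows[] = [
            'url'       => (string) $content->attributes()->url,
            // Assumes every media:content has a media:thumbnail child, as in the sample.
            'thumbnail' => (string) $inner->thumbnail->attributes()->url,
            'credit'    => (string) $inner->credit,
        ];
    }
}

print_r($rows);
```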

PHP - how to display soap response sub children list

I got a response in the XML format shown below.
How can I get all list->item->Value entries into one row or container?
<list>
<item>
<Key>3</Key>
<Value>3960</Value>
</item>
<item>
<Key>5</Key>
<Value>3967</Value>
</item>
<item>
<Key>6</Key>
<Value>3968</Value>
</item>
</list>
How can I display the values like this?
<table>
<tr>
<td>3960, 3967, 3968</td>
<td>3963, 3961, 3960</td>
</tr>
</table>
At the moment I am trying to use children() in a foreach, but it fails with the error: Call to a member function children() on null. Here is my PHP code:
foreach($items as $item){
echo '<td>';
$child_item = '';
foreach($item->list->children()->children() as $child)
{
$child_item .= $child .' ,';
}
echo rtrim($child_item,' ,');
echo '</td>';
}
Thanks experts!
This is what you need to achieve that:
<?php
$xmlstr = <<<XML
<root>
<list>
<item>
<Key>3</Key>
<Value>3960</Value>
</item>
<item>
<Key>5</Key>
<Value>3967</Value>
</item>
<item>
<Key>6</Key>
<Value>3968</Value>
</item>
</list>
<list>
<item>
<Key>3</Key>
<Value>3963</Value>
</item>
<item>
<Key>5</Key>
<Value>3961</Value>
</item>
<item>
<Key>6</Key>
<Value>3960</Value>
</item>
</list>
</root>
XML;
$items = new SimpleXMLElement($xmlstr);
echo "<table>\r\n";
echo "<tr>\r\n";
foreach($items as $list){
echo "<td>";
$itemsArr = array();
foreach($list as $item){
$itemsArr[] = $item->Value[0];
}
echo implode(", ", $itemsArr);
echo "</td>\r\n";
}
echo "</tr>\r\n";
echo "</table>";
?>

Problems on reading image url from a rss feed, using DOMDocument

I have a rss feed
<rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<item>
<title>VIDEO: Have you heard of Alibaba?</title>
<description>Alibaba is the world's biggest e-commerce firm but most people in the West haven't heard of it.</description>
<link>http://www.bbc.co.uk/news/business-29216696#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa</link>
<guid isPermaLink="false">http://www.bbc.co.uk/news/business-29216696</guid>
<pubDate>Tue, 16 Sep 2014 02:29:17 GMT</pubDate>
<media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/77609000/jpg/_77609399_73619721.jpg"/>
<media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/77609000/jpg/_77609400_73619721.jpg"/>
</item>
<item>
<title>VIDEO: Phones 4U shops closing for business</title>
<description>Retailer Phones 4U has gone into administration putting 5,596 jobs at risk.</description>
<link>http://www.bbc.co.uk/news/business-29202179#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa</link>
<guid isPermaLink="false">http://www.bbc.co.uk/news/business-29202179</guid>
<pubDate>Mon, 15 Sep 2014 22:15:50 GMT</pubDate>
<media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/77587000/jpg/_77587217_77587209.jpg"/>
<media:thumbnail width="144" height="81" url="http://news.bbcimg.co.uk/media/images/77587000/jpg/_77587218_77587209.jpg"/>
</item>
</rss>
I am able to read the title and description from this RSS feed using PHP's DOMDocument class.
The following is my code:
$xml = 'http://feeds.bbci.co.uk/news/video_and_audio/business/rss.xml' ;
$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);
$items=$xmlDoc->getElementsByTagName('item');
foreach($items as $item){
$item_title= $item->getElementsByTagName('title')->item(0)->childNodes->item(0)->nodeValue;
$item_link= $item->getElementsByTagName('link')->item(0)->childNodes->item(0)->nodeValue;
$item_desc= $item->getElementsByTagName('description')->item(0)->childNodes->item(0)->nodeValue;
}
But how can I read the url attribute of the 'media:thumbnail' tag of each item?
Since it has namespaces, use getElementsByTagNameNS() together with ->getAttribute() in this case. Example:
$xml = 'http://feeds.bbci.co.uk/news/video_and_audio/business/rss.xml' ;
$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);
$items = $xmlDoc->getElementsByTagName('item');
foreach($items as $key => $item) {
$item_title= $item->getElementsByTagName('title')->item(0)->childNodes->item(0)->nodeValue;
$item_link= $item->getElementsByTagName('link')->item(0)->childNodes->item(0)->nodeValue;
$item_desc= $item->getElementsByTagName('description')->item(0)->childNodes->item(0)->nodeValue;
$media = $item->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'thumbnail');
foreach($media as $thumb) {
echo $thumb->getAttribute('url') . '<br/>';
}
}
SimpleXMLElement Variant:
$xml = simplexml_load_file('http://feeds.bbci.co.uk/news/video_and_audio/business/rss.xml');
foreach($xml->channel->item as $item) {
$title = $item->title;
$description = $item->description;
$link = $item->link;
$media = $item->children('media', true); // second argument is a bool: true selects by prefix
foreach($media->thumbnail as $thumb) {
echo $thumb->attributes()->url . '<br/>';
}
}
Use XPath. It is part of the DOM extension and lets you use expressions to fetch nodes and values from a DOM. Like XML itself, XPath lets you define prefixes (aliases) for the namespaces.
$dom = new DOMDocument;
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('m', 'http://search.yahoo.com/mrss/');
$xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');
foreach ($xpath->evaluate('//item') as $itemNode) {
$item = [
'title' => $xpath->evaluate('string(title)', $itemNode),
'link' => $xpath->evaluate('string(link)', $itemNode),
'description' => $xpath->evaluate('string(description)', $itemNode),
];
foreach ($xpath->evaluate('m:thumbnail/@url', $itemNode) as $urlAttribute) {
$item['thumbnails'][] = $urlAttribute->value;
}
var_dump($item);
}
