How do I get the child nodes of this RSS feed? - php

How can I get the contest logo and start date from this RSS feed? I can get the dc:modified child for example but always get a blank for anything from dc:dataset.
My code:
$feed_url = 'https://www.website.com/?call_custom_simple_rss=1&csrp_post_type=contest&csrp_posts_per_page=2&csrp_show_meta=1';
$feed = file_get_contents($feed_url);
$rss = simplexml_load_string($feed);
foreach($rss->channel->item as $entry) {
echo $entry->children("dc", true)->modified . "<br>";
echo $entry->children("dc", true)->dataset->contest_logo . "<br>";
echo $entry->children("dc", true)->dataset->start_date . "<br>";
}
The RSS feed:
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:wp="http://wordpress.org/export/1.2/" xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/" version="2.0">
<channel>
<title>RSS Title</title>
<description>A website</description>
<lastBuildDate>Wed, 17 Feb 2021 15:03:03 +0000</lastBuildDate>
<item>
<title>
<![CDATA[ Photography Awards ]]>
</title>
<link>
<![CDATA[ /contests/photography-awards/ ]]>
</link>
<pubDate>Mon, 11 Jan 2021 13:52:27 -0600</pubDate>
<dc:identifier>619116</dc:identifier>
<dc:modified>2021-02-09 07:50:10</dc:modified>
<dc:created unix="1610373147">2021-01-11 13:52:27</dc:created>
<dc:dataset>
<contest_logo>
<![CDATA[ 619130 ]]>
</contest_logo>
<start_date>
<![CDATA[ 20210110 ]]>
</start_date>
</dc:dataset>
</item>
</channel>
</rss>

The contest_logo and start_date elements are in the empty namespace, so you have to switch back to it after selecting dc:dataset. Additionally, it is not good to rely on namespace prefixes defined in the document. Use the namespace URI instead (for example, defined in a mapping array in your code).
$rss = simplexml_load_string($feed);
$xmlns = [
'dc' => 'http://purl.org/dc/elements/1.1/'
];
foreach($rss->channel->item as $entry) {
echo $entry->children($xmlns['dc'])->modified . "<br>";
echo $entry->children($xmlns['dc'])->dataset->children('')->contest_logo . "<br>";
echo $entry->children($xmlns['dc'])->dataset->children('')->start_date . "<br>";
}
Output:
2021-02-09 07:50:10<br>
619130
<br>
20210110
<br>
In DOM you would register an alias for the namespace URI on the XPath processor and use that alias in the expressions. Here is a demo:
$document = new DOMDocument();
$document->loadXML($feed);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');
foreach ($xpath->evaluate('/rss/channel/item') as $entry) {
echo $xpath->evaluate('string(dc:modified)', $entry). "<br>";
echo $xpath->evaluate('string(dc:dataset/contest_logo)', $entry). "<br>";
echo $xpath->evaluate('string(dc:dataset/start_date)', $entry). "<br>";
}

Another alternative - use xpath:
echo $rss->xpath('//dc:dataset/contest_logo')[0] . "\r\n";
echo $rss->xpath('//dc:modified')[0] . "\r\n";
echo $rss->xpath('//start_date')[0] . "\r\n";
Output:
619130
2021-02-09 07:50:10
20210110
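The xpath() call above still depends on the dc prefix as declared in the feed. A minimal sketch that binds its own prefix to the namespace URI instead, assuming a shortened copy of the feed from the question; the helper name firstValue and the prefix d are made up for this illustration:

```php
<?php
// Shortened copy of the feed from the question (assumption: same structure).
$feed = <<<'XML'
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <item>
      <dc:modified>2021-02-09 07:50:10</dc:modified>
      <dc:dataset>
        <contest_logo><![CDATA[ 619130 ]]></contest_logo>
        <start_date><![CDATA[ 20210110 ]]></start_date>
      </dc:dataset>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($feed);

// Hypothetical helper: first match of an XPath expression evaluated relative
// to $context, returned as a trimmed string, or null if nothing matched.
function firstValue(SimpleXMLElement $context, string $expression): ?string
{
    // Bind our own prefix "d" to the Dublin Core URI so the expression does
    // not depend on whatever prefix the feed happens to declare.
    $context->registerXPathNamespace('d', 'http://purl.org/dc/elements/1.1/');
    $matches = $context->xpath($expression);
    return $matches ? trim((string) $matches[0]) : null;
}

foreach ($rss->channel->item as $entry) {
    echo firstValue($entry, 'd:modified'), "<br>";
    echo firstValue($entry, 'd:dataset/contest_logo'), "<br>";
    echo firstValue($entry, 'd:dataset/start_date'), "<br>";
}
```

Because the expressions are relative to each item, this also keeps the values of different items apart, which the //-style expressions above do not.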

Related

Get XML Attributes using PHP

I want to get the URL of the image in <media:content>. The XML document tree is as follows:
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<title>
<![CDATA[ The Star Online Business Highlights ]]>
</title>
<link>/TheStar/Website</link>
<description>...</description>
<image>...</image>
<language>en</language>
<item>
<guid isPermaLink="false">{F88B27DD-24FB-4807-941F-070D772B7586}</guid>
<link>
http://www.thestar.com.my/business/business-news/2017/10/24/top-glove-says-not-buying-adventa-nor-supermax/
</link>
<title>
<![CDATA[ Top Glove says not buying Adventa nor Supermax ]]>
</title>
<description>
<![CDATA[KUALA LUMPUR: Top Glove, which has allocated about RM1bil to expand via mergers, has denied news reports the target companies are Adventa Bhd and Supermax Corporation Bhd.]]>
</description>
<pubDate>Tue, 24 Oct 2017 13:17:18 +08:00</pubDate>
<enclosure url="http://www.thestar.com.my/~/media/online/2017/08/22/03/58/hartalega-glove3.ashx?crop=1&w=0&h=0&" length="" type="image/jpeg"/>
<media:content url="http://www.thestar.com.my/~/media/online/2017/08/22/03/58/hartalega-glove3.ashx?crop=1&w=0&h=0&" type="image/jpeg">
<media:description>
<![CDATA[ ]]>
</media:description>
</media:content>
<section>
<![CDATA[ Business ]]>
</section>
</item>
<item>...</item>
<item>...</item>
<item>...</item>
</channel>
As there are multiple items and I want to loop over them, I tried:
foreach($xml->channel->item as $news) {
$media = $news->media->children('http://search.yahoo.com/mrss/');
echo ($media->content);
}
and also
foreach($xml->channel->item as $news) {
$media = $news->children('http://search.yahoo.com/mrss/');
echo ($media->content);
}
but both seem to fail. What is the right method?
The $media variable is of type SimpleXMLElement.
What you could do is loop your $media variable in a foreach and then get your url from the attributes.
For example (using simplexml_load_string with additional libxml parameters to load your example XML):
$source = <<<SOURCE
//Your example xml here
SOURCE;
$xml = simplexml_load_string($source, "SimpleXMLElement", LIBXML_NOERROR|LIBXML_ERR_NONE|LIBXML_ERR_FATAL);
foreach($xml->channel->item as $news) {
$media = $news->children('http://search.yahoo.com/mrss/');
foreach($media as $child) {
echo $child->attributes()->url;
}
}
Will result in:
http://www.thestar.com.my/~/media/online/2017/08/22/03/58/hartalega-glove3.ashx?crop=1=0=0
$xml = new SimpleXMLElement($xml, LIBXML_NOERROR|LIBXML_ERR_NONE|LIBXML_ERR_FATAL);
foreach ($xml->xpath("//media:content") as $node)
{
var_dump ((string) $node["url"]);
}

get media:description and media:content url from xml

I have XML data in which some item tags have a media:content tag and some do not. How can I check whether media:content exists in an item, and how can I get the description under the media:content tag?
Here is XML data:
<rss xmlns:content="" xmlns:wfw="" xmlns:dc="" xmlns:atom="" xmlns:sy="" xmlns:slash="" version="2.0">
<channel>
<item>
<title>Title1</title>
<link>Link</link>
<pubDate>Date</pubDate>
<content:encoded>
<![CDATA[ This is description 1 ]]>
<![CDATA[ This is description 2 ]]>
</content:encoded>
<media:content url="URL" type="image/jpeg">
<media:description>
<![CDATA[ Text ]]>
</media:description>
</media:content>
</item>
<item> -- this item tag does not have media: content
<title>Title2</title>
<link>Link2</link>
<pubDate>Date2</pubDate>
<content:encoded>
<![CDATA[ This is description 3 ]]>
<![CDATA[ This is description 4 ]]>
</content:encoded>
</item>
<item>
<title>Title3</title>
<link>Link3</link>
<pubDate>Date3</pubDate>
<content:encoded>
<![CDATA[ This is description 5 ]]>
<![CDATA[ This is description 6 ]]>
</content:encoded>
<media:content url="UR1L" type="image/jpeg">
<media:description>
<![CDATA[ Text 2 ]]>
</media:description>
</media:content>
</item>
</channel>
</rss>
What I tried is:
<?php
function feeds()
{
$url = "http://localhost/xmldata/xmld.xml"; // xmld.xml contains above data
$feeds = file_get_contents($url);
$rss = simplexml_load_string($feeds);
foreach($rss->channel->item as $entry) {
if($entry->children('media', true)->content->attributes()) {
$md = $entry->children('media', true)->content->attributes();
print_r("$md->url");
}
}
}
?>
It returns an error like the one below:
Node no longer exists
I also have no idea how to get the media:description that is inside the media:content tag.
You can use isset to check if the 'media:content' property is set on the SimpleXMLElement.
I think it would help if you change these lines:
foreach($rss->channel->item as $entry) {
if($entry->children('media', true)->content->attributes()) {
$md = $entry->children('media', true)->content->attributes();
print_r("$md->url");
}
}
To these lines:
$rss = @simplexml_load_string($feeds);
foreach ($rss->channel->item as $entry) {
if (isset($entry->{'media:content'})) {
$url = (string)$entry->{'media:content'}->attributes()->url;
$description = (string)$entry->{'media:content'}->{'media:description'};
echo "$url<br>";
echo "$description<br>";
}
}
Will result in:
URL
Text
UR1L
Text 2
This works for me:
$namespaces = $entry->getNamespaces(true);
$media_url = trim((string)$entry->children($namespaces['media'])->content->attributes()->url);
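The isset check and the namespace lookup can be combined. A minimal sketch, assuming the feed declares the media namespace (the sample in the question does not); the constant MEDIA_NS and the helper mediaEntries are hypothetical names used only for this illustration:

```php
<?php
// Assumption: unlike the sample in the question, this feed declares xmlns:media.
$feeds = <<<'XML'
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <item>
      <title>Title1</title>
      <media:content url="URL" type="image/jpeg">
        <media:description><![CDATA[ Text ]]></media:description>
      </media:content>
    </item>
    <item>
      <title>Title2</title>
    </item>
  </channel>
</rss>
XML;

const MEDIA_NS = 'http://search.yahoo.com/mrss/';

// Hypothetical helper: collect url/description pairs for every item that
// actually contains a media:content element.
function mediaEntries(SimpleXMLElement $rss): array
{
    $result = [];
    foreach ($rss->channel->item as $entry) {
        $media = $entry->children(MEDIA_NS);
        if (!isset($media->content)) {
            continue; // this item has no media:content, skip it
        }
        $result[] = [
            // attributes() without arguments returns the un-namespaced attributes
            'url' => (string) $media->content->attributes()->url,
            'description' => trim((string) $media->content->children(MEDIA_NS)->description),
        ];
    }
    return $result;
}

foreach (mediaEntries(simplexml_load_string($feeds)) as $row) {
    echo $row['url'], "<br>", $row['description'], "<br>";
}
```

Skipping items via isset avoids touching a node that does not exist, which is what triggers the "Node no longer exists" warning.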

PHP Reading XML Issue

I need your help once again!
I need to read this xml file... but the problem is that it's not working!
This is the XML
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<item>
<title>Video</title>
<media:content url="http://videourl.com/etc/" type="video/x-flv" duration="5128"/>
</item>
</channel>
</rss>
And this is my code:
<?php
$xml=simplexml_load_file("http://videourl.com/etc/");
echo $xml->getName() . "<media:content url=";
foreach($xml->children() as $child)
{
echo $child->getName() . ": " . $child . "";
}
?>
And it's not working! It's not working because nothing gets echoed, or printed! Does anyone spot the error?
<?php
$xml = '<?xml version="1.0" encoding="UTF-8" ?>
<rss>
<channel>
<item>
<title><![CDATA[Tom & Jerry]]></title>
</item>
</channel>
</rss>';
$xml = simplexml_load_string($xml);
// echo casts the element to a string for you
echo $xml->channel->item->title;
// but var_dump (or print_r) does not; it shows the SimpleXMLElement object
var_dump($xml->channel->item->title);
// so cast the SimpleXMLElement to string explicitly to get the text
var_dump((string) $xml->channel->item->title);
?>
I have edited my code again; please try this version.
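For the media:content element from the question itself, a minimal sketch could look like this. It assumes the XML is available as a string (here an inline copy of the question's sample) and collects the values into a hypothetical $videos array:

```php
<?php
// Inline copy of the XML from the question.
$source = <<<'XML'
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <item>
      <title>Video</title>
      <media:content url="http://videourl.com/etc/" type="video/x-flv" duration="5128"/>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($source);

$videos = [];
foreach ($xml->channel->item as $item) {
    // Select the children in the media namespace by URI, then take <media:content>.
    $content = $item->children('http://search.yahoo.com/mrss/')->content;
    // The attributes carry no namespace, so attributes() without arguments returns them.
    $attributes = $content->attributes();
    $videos[] = [
        'title' => (string) $item->title,
        'url' => (string) $attributes->url,
        'type' => (string) $attributes->type,
        'duration' => (int) (string) $attributes->duration,
    ];
}

foreach ($videos as $video) {
    echo $video['title'], ": ", $video['url'], " (", $video['duration'], ")<br>";
}
```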

xml DOM : delete element with condition

Maybe this has already been answered in one way or another in other questions, but since I'm a newbie in XML, I can't figure it out for my project.
I have an RSS (XML) file with this structure:
<rss>
<channel>
<item>
<title>some title</title>
<description> some descrp </description>
...
</item>
</channel>
</rss>
How can I, in PHP, delete an item when its title is equal to some value? Thanks.
EDIT1: My XML file is stored on my web server.
$rss = "
<rss>
<channel>
<item>
<title>some title</title>
<description> some descrp </description>
</item>
<item>
<title>some other title</title>
<description> some descrp </description>
</item>
</channel>
</rss>
";
$doc = new DOMDocument();
$doc->loadXML($rss);
$xpath = new DOMXPath($doc);
$els = $xpath->query('//title[text()="some title"]');
foreach($els as $el)
{
$parent = $el->parentNode;
$parent->parentNode->removeChild($parent);
}
echo $doc->saveXML();
It searches for an exact match.
PS: another method, without XPath:
$doc = new DOMDocument();
$doc->loadXML($rss);
$els = $doc->getElementsByTagName('title');
for($i = $els->length-1; $i >= 0; $i--)
{
$el = $els->item($i);
if ($el->nodeValue == 'some title')
{
$parent = $el->parentNode;
$parent->parentNode->removeChild($parent);
}
}
echo $doc->saveXML();
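Since the question mentions that the XML lives in a file, here is a minimal sketch of the same idea working on a file instead of a string. The helper removeItemsByTitle and the temporary file are assumptions made for this illustration; on a real server you would pass the path of your own feed file:

```php
<?php
// Hypothetical helper: remove every <item> whose <title> equals $title and
// write the document back to the same file. Returns the number of removed items.
function removeItemsByTitle(string $path, string $title): int
{
    $doc = new DOMDocument();
    if (!$doc->load($path)) {
        return 0; // file missing or not well-formed
    }
    $removed = 0;
    // Copy the live node list into an array first, so removing nodes
    // does not disturb the iteration.
    $items = iterator_to_array($doc->getElementsByTagName('item'));
    foreach ($items as $item) {
        $titleNode = $item->getElementsByTagName('title')->item(0);
        if ($titleNode !== null && $titleNode->textContent === $title) {
            $item->parentNode->removeChild($item);
            $removed++;
        }
    }
    if ($removed > 0) {
        $doc->save($path);
    }
    return $removed;
}

// Demo on a temporary file (assumption: stands in for the feed on the server).
$path = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($path, '<rss><channel>'
    . '<item><title>some title</title><description> some descrp </description></item>'
    . '<item><title>some other title</title><description> some descrp </description></item>'
    . '</channel></rss>');

$removed = removeItemsByTitle($path, 'some title');
$result = file_get_contents($path);
echo $removed, " item(s) removed\n", $result;
```

Comparing the title in PHP rather than inside the XPath expression sidesteps quoting problems when the title itself contains quotes.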

How to get the value of an xml sub element using DOMDocument methods?

I am new to PHP and XML.
Can somebody tell me how I can get the values of a sub element or child node of an XML element?
index.php
$domdoc = new DOMDocument();
$domdoc->load('actionstars.xml');
foreach ($domdoc->getElementsByTagName("actionstar") as $star) {
echo $star->item(0)->nodeValue; // displays the <id> element
echo $star->item(1)->nodeValue; // displays the <name> element
echo "<br />";
}
actionstars.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<actionstars>
<actionstar>
<id>1</id>
<name>Jean Claude Van Damme</name>
</actionstar>
<actionstar>
<id>2</id>
<name>Scott Adkins</name>
</actionstar>
<actionstar>
<id>3</id>
<name>Dolph Ludgren</name>
</actionstar>
<actionstar>
<id>4</id>
<name>Michael Jai White</name>
</actionstar>
<actionstar>
<id>5</id>
<name>Michael Worth</name>
</actionstar>
</actionstars>
Please help.
If you can guarantee their order, you can use childNodes and the offset, otherwise...
$domdoc = new DOMDocument();
$domdoc->load('actionstars.xml');
foreach ($domdoc->getElementsByTagName("actionstar") as $star) {
echo $star->getElementsByTagName('id')->item(0)->nodeValue; // displays the <id> element
echo $star->getElementsByTagName('name')->item(0)->nodeValue; // displays the <name> element
echo "<br />";
}
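For the childNodes-and-offset variant mentioned above, a minimal sketch follows. It assumes the elements always appear in the order id, name and uses an inline, shortened copy of actionstars.xml. Note that childNodes also contains the whitespace text nodes between the elements, so the sketch filters for element nodes before applying the offset:

```php
<?php
// Shortened inline copy of actionstars.xml from the question.
$xml = <<<'XML'
<?xml version="1.0" encoding="ISO-8859-1"?>
<actionstars>
  <actionstar>
    <id>1</id>
    <name>Jean Claude Van Damme</name>
  </actionstar>
  <actionstar>
    <id>2</id>
    <name>Scott Adkins</name>
  </actionstar>
</actionstars>
XML;

$domdoc = new DOMDocument();
$domdoc->loadXML($xml);

$stars = [];
foreach ($domdoc->getElementsByTagName('actionstar') as $star) {
    // Keep only element children; the indentation between tags shows up
    // as text nodes in childNodes and would shift the offsets.
    $elements = [];
    foreach ($star->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $elements[] = $child;
        }
    }
    $stars[] = [
        'id' => $elements[0]->nodeValue,   // first element child: <id>
        'name' => $elements[1]->nodeValue, // second element child: <name>
    ];
}

foreach ($stars as $row) {
    echo $row['id'], " ", $row['name'], "<br />";
}
```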
