Trying to parse out lat/lon from a google maps rss feed:
$file = "http://maps.google.com/maps/ms?ie=UTF8&hl=en&vps=1&jsv=327b&msa=0&output=georss&msid=217909142388190116501.000473ca1b7eb5750ebfe";
$xml = simplexml_load_file($file);
$loc = $xml->channel->item;
echo $loc[0]->title;
echo $loc[0]->point;
The title shows up alright, but point gives me nothing. Each node looks like this:
<item>
<guid isPermaLink="false">0004740950fd067393eb4</guid>
<pubDate>Sun, 20 Sep 2009 21:47:49 +0000</pubDate>
<title>Big Wong King Restaurant</title>
<description><![CDATA[<div dir="ltr">$4.99 full meals!</div>]]></description>
<author>neufuture</author>
<georss:point>
40.716236 -73.998413
</georss:point>
<georss:elev>0.000000</georss:elev>
</item>
<?php
$file = "http://maps.google.com/maps/ms?ie=UTF8&hl=en&vps=1&jsv=327b&msa=0&output=georss&msid=217909142388190116501.000473ca1b7eb5750ebfe";
$xml = simplexml_load_file($file);
$loc = $xml->channel->item;
foreach ($loc->children('http://www.georss.org/georss') as $geo) {
echo $geo;
}
?>
Related
How can I get the contest logo and start date from this RSS feed? I can get the dc:modified child for example but always get a blank for anything from dc:dataset.
My code:
$feed_url = 'https://www.website.com/?call_custom_simple_rss=1&csrp_post_type=contest&csrp_posts_per_page=2&csrp_show_meta=1';
$feed = file_get_contents($feed_url);
$rss = simplexml_load_string($feed);
foreach($rss->channel->item as $entry) {
echo $entry->children("dc", true)->modified . "<br>";
echo $entry->children("dc", true)->dataset->contest_logo . "<br>";
echo $entry->children("dc", true)->dataset->start_date . "<br>";
}
The RSS feed:
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:wp="http://wordpress.org/export/1.2/" xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/" version="2.0">
<channel>
<title>RSS Title</title>
<description>A website</description>
<lastBuildDate>Wed, 17 Feb 2021 15:03:03 +0000</lastBuildDate>
<item>
<title>
<![CDATA[ Photography Awards ]]>
</title>
<link>
<![CDATA[ /contests/photography-awards/ ]]>
</link>
<pubDate>Mon, 11 Jan 2021 13:52:27 -0600</pubDate>
<dc:identifier>619116</dc:identifier>
<dc:modified>2021-02-09 07:50:10</dc:modified>
<dc:created unix="1610373147">2021-01-11 13:52:27</dc:created>
<dc:dataset>
<contest_logo>
<![CDATA[ 619130 ]]>
</contest_logo>
<start_date>
<![CDATA[ 20210110 ]]>
</start_date>
</dc:dataset>
</item>
</channel>
</rss>
The contest_logo and start_date are in the empty namespace. You have to switch back. Additionally it is not good to reply on namespace prefixes defined in the document. Use the namespace URI (for example defined as mapping array in your code).
$rss = simplexml_load_string($feed);
$xmlns = [
'dc' => 'http://purl.org/dc/elements/1.1/'
];
foreach($rss->channel->item as $entry) {
echo $entry->children($xmlns['dc'])->modified . "<br>";
echo $entry->children($xmlns['dc'])->dataset->children('')->contest_logo . "<br>";
echo $entry->children($xmlns['dc'])->dataset->children('')->start_date . "<br>";
}
Output:
2021-02-09 07:50:10<br>
619130
<br>
20210110
<br>
In DOM you would register an alias on the Xpath processor and use it in the expressions. Here is a demo:
$document = new DOMDocument();
$document->loadXML($feed);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');
foreach ($xpath->evaluate('/rss/channel/item') as $entry) {
echo $xpath->evaluate('string(dc:modified)', $entry). "<br>";
echo $xpath->evaluate('string(dc:dataset/contest_logo)', $entry). "<br>";
echo $xpath->evaluate('string(dc:dataset/start_date)', $entry). "<br>";
}
Another alternative - use xpath:
echo $rss->xpath('//dc:dataset/contest_logo')[0] . "\r\n";
echo $rss->xpath('//dc:modified')[0] . "\r\n";
echo $rss->xpath('//start_date')[0] . "\r\n";
Output:
619130
2021-02-09 07:50:10
20210110
0Here is the relevant part of my rss feed:
<channel>
<description></description>
<title>Untitled</title>
<generator>Tumblr (3.0; #xxx)</generator>
<link>http://xxx.tumblr.com/</link>
<item>
<title>Title</title>
<description><figure><img src="https://31.media.tumblr.com/c78c7t3abd23423549d3bb0f705/tumblr_inline_nkp9z234d0uj.jpg"/></figure></description>
<link>http://xxx.tumblr.com/post/99569244093</link>
<guid>http://xxx.tumblr.com/post/99569244093</guid>
<pubDate>Thu, 09 Oct 2014 11:19:33 -0400</pubDate>
</item>
</channel>
Using the answer from other questions on here I tried this:
$content = file_get_contents("http://xxx.tumblr.com/rss");
$feed = new SimpleXmlElement($content);
$imgs = $feed->channel->item[0]->description->xpath('//img');
foreach($imgs as $image) {
echo (string)$image['src'];
};
This is returning an empty array for $imgs
Does it have something to do with the tags being < > etc?
and if so what can I do?
You can get it from the description, which seems to include a HTML image tag for the image, by using a simple regular expression with preg_match:
$content = file_get_contents("http://xxx.tumblr.com/rss");
$feed = new SimpleXmlElement($content);
$img = (string)$feed->channel->item[0]->description;
if (preg_match('/src="(.*?)"/', $img, $matches)) {
$src = $matches[1];
echo "src = $src", PHP_EOL;
}
Output:
src = http://40.media.tumblr.com/58d24c3009638514325b113859ba369f/tumblr_nk0mwfhKXU1sl87kjo1_500.jpg
Before you can use xapth() on the description, you need to create a new XML document out of it:
$url = "http://xxx.tumblr.com/rss";
$desc = simplexml_load_file($url)->xpath('//item/description[1]')[0];
$src = simplexml_load_string("<x>$desc</x>")->xpath('//img/#src')[0];
echo $src;
Output:
http://40.media.tumblr.com/58d24c3009638514325b113859ba369f/tumblr_nk0mwfhKXU1sl87kjo1_500.jpg
I'm not sure if you can use this approach - as already mentioned by kjhughes as comment, your input XML does not contain any img element. But it's possible to retrieve the image source using XPath substring-functions:
substring-before(substring-after(substring-after(//item/description[contains(.,'img')],
'src='),'"'),'"')
Result:
https://31.media.tumblr.com/c78c7t3abd23423549d3bb0f705/tumblr_inline_nkp9z234d0uj.jpg
I know child elements have been discussed a lot, but I've gone through the helpful answers to related questions and can't seem to get it working (new to coding, so bear with me).
Here's what I'm working with:
rss xmlns:media="http://search.yahoo.com/mrss/" xmlns:bc="http://www.brightcove.tv/link" xmlns:dcterms="http://purl.org/dc/terms/" version="2.0">
<channel>
<title>Search Videos By Criteria</title>
<link>...</link>
<description/>
<copyright>Copyright 2014</copyright>
<lastBuildDate>Thu, 25 Sep 2014 13:29:49 -0700</lastBuildDate>
<generator>http://www.brightcove.com/?v=1.0</generator>
<item>
<title>5 best guards in Lakers history</title>
<link/>
<description>...</description>
<guid>video3805826070001</guid>
<pubDate>Thu, 25 Sep 2014 05:11:39 -0700</pubDate>
<media:content duration="121" medium="video" type="video/mp4" url="http://videos.usatoday.net/Brightcove2/29906170001/2014/09/29906170001_3805837947001_5-BEST-GUARDS-IN-LAKERS--HISTORY-final.mp4?videoId=3805826070001"/>
<media:group>...</media:group>
<media:keywords>jerry west,derek fisher,Gail Goodrich,losangeleslakers,SMGV,USA Today Sports,Kobe Bryant,video big board,sports,basketball,lakers,magic johnson,nba
</media:keywords>
<media:thumbnail height="90" url="http://videos.usatoday.net/Brightcove2/29906170001/2014/09/29906170001_3805822421001_Screen-Shot-2014-09-25-at-8-06-28-AM.jpg?pubId=29906170001" width="120"/>
<media:thumbnail height="360" url="http://videos.usatoday.net/Brightcove2/29906170001/2014/09/29906170001_3805709286001_Screen-Shot-2014-09-25-at-8-06-28-AM.jpg?pubId=29906170001" width="480"/>
<bc:titleid>3805826070001</bc:titleid>
<bc:duration>121</bc:duration>
<dcterms:valid/>
<bc:accountid>44854217001</bc:accountid>
</item>
I'm using the following SimpleXML_Parser script to pull most of the info out that I need:
<?php
$html = "";
$url = "http://api.brightcove.com/services/library?command=search_videos&any=tag:NBA&output=mrss&media_delivery=http&sort_by=CREATION_DATE:DESC&token=NU-nMdtzfF8z9NNinlAgM4c9S-9BBfKpm6gFISdwyk-AnQ84efFBbQ..";
$xml = simplexml_load_file($url);
for($i = 0; $i < 80; $i++){
$title = $xml->channel->item[$i]->video;
$link = $xml->channel->item[$i]->link;
$title = $xml->channel->item[$i]->title;
$pubDate = $xml->channel->item[$i]->pubDate;
$description = $xml->channel->item[$i]->description;/* The code below starting with $html is where you setup how the parsed data will look on the webpage */
$html .= "<div><h3>$title</h3><br/>$description<p><br/>$pubDate<p><br/>$link<p><br/>$titleid<p><br/></div><iframe width='580' height='360' src='http://link.brightcove.com/services/player/bcpid3742068445001?bckey=/*deleted API key&bctid=$titleid' frameborder='0'></iframe><hr/>";}
echo $html;/* tutorial for this script is here https://www.youtube.com/watch?v=4ZLZkdiKGE0 */?>
What I need to be able to parse out of the feed is the string of number assigned to "titleid"
I have tried adding in variations on approaches for pulling out child elements, such as:
$titleid = $xml->children(‘media’, true)->div->children(‘bc’, true)->div[$i]->titled;
But not having any luck. I'm sure it's something obvious to a seasoned developer, but again, I'm a newbie.
Any suggestions?
Thanks for any help!
To parse MRSS properly you need first to put the getNamespaces to true.
Then select the namespace $xml->channel->item[$i]->children($namespaces['bc']) finaly you can extract the wanted value from it in your case id
<?php
$html = "";
$url = "http://api.brightcove.com/services/library?command=search_videos&any=tag:NBA&output=mrss&media_delivery=http&sort_by=CREATION_DATE:DESC&token=NU-nMdtzfF8z9NNinlAgM4c9S-9BBfKpm6gFISdwyk-AnQ84efFBbQ..";
$xml = simplexml_load_file($url);
$namespaces = $xml->getNamespaces(true); // get namespaces
for($i = 0; $i < 80; $i++){
$title = $xml->channel->item[$i]->video;
$link = $xml->channel->item[$i]->link;
$title = $xml->channel->item[$i]->title;
$pubDate = $xml->channel->item[$i]->pubDate;
$description = $xml->channel->item[$i]->description;
$titleid = $xml->channel->item[$i]->children($namespaces['bc'])->titleid;
echo $title_group .'<br>';
}
Using simpleXML, how do you display the contents of a feed. I mean actually display the XML on a page so that I can see the schema?
In my example i will use XML file like:
<?xml version="1.0" encoding="UTF-8"?>
<module>
<name>Menus</name>
<folder>menus</folder>
<install>install.php</install>
</module>
You can use to load xml file in a variable:
$mod = simplexml_load_file($XMLfile);
and call fields:
echo $mod->module->name;
echo $mod->module->folder;
echo $mod->module->install;
Hope this helps.
$xml = simplexml_load_file("URL GOES HERE");
echo "<pre>";
print_r($xml);
echo "</pre>";
Will print a readable array of the object.
And a VALID RSS feed has this schema:
foreach($xml->channel->item as $items){
$title = $items->title;
$link = $items->link;
$description = $items->description;
$pubDate = strtotime($items->pubDate);
$pubDate = date('Y-m-d H:i:s', $pubDate);
}
Along with a few other attributes.
I'm struggling to parse an XML file in PHP:
Here's the XML
<rss xmlns:ac="http://palm.com/app.catalog.rss.extensions" version="2.0">
<channel>
<title>Device App Updates for US</title>
<link>http://www.palm.com</link>
<description>Updates</description>
<language>en-US</language>
<pubDate>Mon, 04 Jan 2010 18:23:51 -0800</pubDate>
<lastBuildDate>Mon, 04 Jan 2010 18:23:51 -0800</lastBuildDate>
<ac:distributionChannel>Device</ac:distributionChannel>
<ac:countryCode>US</ac:countryCode>
<item>
<title><![CDATA[My App]]></title>
<link>http://developer.palm.com/appredirect/?packageid=com.palm.myapp</link>
<description><![CDATA[My fun app.]]></description>
<pubDate>2009-12-21 21:00:58</pubDate>
<guid>334.232</guid>
<ac:total_downloads>1234</ac:total_downloads>
<ac:total_comments>12</ac:total_comments>
<ac:country>US</ac:country>
</item>
</channel>
</rss>
My Problem is; when I use:
$strURL = "http://developer.palm.com/rss/D/appcatalog.update.rss.xml";
$ch = curl_init($strURL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$data = curl_exec($ch);
curl_close($ch);
$doc = new SimpleXmlElement($data, LIBXML_NOCDATA);
print_r($doc);
I'm not able to display the values? The one I'm most interested in is <ac:total_downloads>1000</ac> but I don't seem to be able to parse it.
What am I doing wrong?
Many thanks
You don't need to use curl to retrieve that file, SimpleXML can fetch external resources.
Use children() to access namespaced nodes. Here's how to do it:
$rss = simplexml_load_file($strURL);
$ns = 'http://palm.com/app.catalog.rss.extensions';
foreach ($rss->channel->item as $item)
{
echo 'Title: ', $item->title, "\n";
echo 'Downloads: ', $item->children($ns)->total_downloads, "\n\n";
}