I am trying to utilize simplexml to convert an iTunes RSS Feed to JSON so I can better parse it. The issue I am having is that it is not coming back as correctly formatted JSON.
$feed_url = 'https://podcasts.subsplash.com/c2yjpyh/podcast.rss';
$feed_contents = file_get_contents($feed_url);
$xml = simplexml_load_string($feed_contents);
$podcasts = json_decode(json_encode($xml));
print_r($podcasts);
Is there a better way to be attempting this to get the correct result?
Thanks to IMSoP for pointing me in the right direction! This took a bit of studying but the solution ends up being very simple! Instead of trying to convert to a JSON format, just use SimpleXML. However, due to the namespaces, it does require an additional line to map the itunes: prefix.
So in my iTunes RSS feed, the following declaration exists: xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd". We just reference that URI to make accessing the values very easy. Here is a quick example:
$rss = simplexml_load_file('https://podcasts.example.com/podcast.rss');
foreach ($rss->channel->item as $item){
// Now we define the map for the itunes: namespace
$itunes = $item->children('http://www.itunes.com/dtds/podcast-1.0.dtd');
// This is a value WITHOUT the itunes: namespace
$title = $item->title;
// This is a value WITH the itunes: namespace
$author = $itunes->author;
echo $title . '<br>';
echo $author . '<br>';
}
The other little issue that I ran into is getting attributes such as the url for images and audio links. That is accomplished by using the attributes() function like so:
// Access attributes WITH itunes: namespace
$image = $itunes->image->attributes();
// Access attributes WITHOUT itunes: namespace
$audio = $item->enclosure->attributes();
// To echo these, we simply add the desired attribute name in `[]`:
echo $image['href'] . '<br>';
echo $audio['url'] . '<br>';
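If the end goal is still JSON, one option is to copy the values you need into a plain array and encode that, rather than encoding the SimpleXML object itself. The following is only a sketch under assumptions: the inline feed is made-up sample data (titles, author, and URLs are placeholders), and `podcastItems` is a hypothetical helper name, not part of SimpleXML.

```php
<?php
// Sample feed for illustration only; every value in it is invented.
$feedXml = <<<XML
<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <item>
      <title>Episode 1</title>
      <itunes:author>Jane Doe</itunes:author>
      <itunes:image href="https://example.com/ep1.jpg"/>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
  </channel>
</rss>
XML;

const ITUNES_NS = 'http://www.itunes.com/dtds/podcast-1.0.dtd';

// Hypothetical helper: flattens each <item> into a plain associative array.
// Assumes every item carries an itunes:image element, as the sample does.
function podcastItems(SimpleXMLElement $rss): array
{
    $items = [];
    foreach ($rss->channel->item as $item) {
        $itunes = $item->children(ITUNES_NS);
        $items[] = [
            // Cast to string so json_encode sees scalars, not objects.
            'title'  => (string) $item->title,
            'author' => (string) $itunes->author,
            'image'  => (string) $itunes->image->attributes()['href'],
            'audio'  => (string) $item->enclosure['url'],
        ];
    }
    return $items;
}

$items = podcastItems(simplexml_load_string($feedXml));
echo json_encode($items, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
```

Casting each value to a string up front keeps the JSON flat and predictable, which is usually what the `json_decode(json_encode($xml))` round trip fails to deliver for namespaced elements.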
Hey, I am trying to get the view count from an XML file. The problem is in the path, because other paths were working for me. I guess the problem is that the value is an attribute of an element? <media:statistics views="131"/>
$url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCH7Hj6l_xDmbyvjQOA5Du0g";
$xml = simplexml_load_file($url);
$views = $xml->entry[0]->children('media', true)->group[0]->children('media', true)->community[0]->children('media', true)->attributes('statistics');
echo $views;
Instead of using ->attributes('statistics'), access the statistics property first, and from that, get views from its attributes:
->statistics->attributes()->views;
The code could look like:
$url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCH7Hj6l_xDmbyvjQOA5Du0g";
$xml = simplexml_load_file($url);
$views = $xml->entry[0]->children('media', true)->group[0]->children('media', true)->community[0]->children('media', true)->statistics->attributes()->views;
echo $views;
Output
131
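As an alternative to the long chain of children() calls, XPath can jump straight to the element. This is a sketch rather than the answer's method: the inline document is a trimmed-down stand-in for the YouTube feed (only the views value is taken from the question), and it relies on registerXPathNamespace() to bind the media prefix to its URI.

```php
<?php
// Trimmed-down stand-in for the YouTube feed; the title is invented.
$feedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <title>Sample video</title>
    <media:group>
      <media:community>
        <media:statistics views="131"/>
      </media:community>
    </media:group>
  </entry>
</feed>
XML;

$xml = simplexml_load_string($feedXml);

// Bind a prefix for use in XPath expressions; the URI is what matters.
$xml->registerXPathNamespace('media', 'http://search.yahoo.com/mrss/');

// Find every media:statistics element anywhere in the document.
$stats = $xml->xpath('//media:statistics');

// Take the first match, if there is one, and read its views attribute.
$views = $stats ? (string) $stats[0]['views'] : null;
echo $views;
```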
I am using PHP and simpleXML to read the following rss feed:
http://feeds.bbci.co.uk/news/england/rss.xml
I can get most of the information I want like so:
$rss = simplexml_load_file('http://feeds.bbci.co.uk/news/england/rss.xml');
echo '<h1>'. $rss->channel->title . '</h1>';
foreach ($rss->channel->item as $item) {
echo '<h2>' . $item->title . "</h2>";
echo "<p>" . $item->pubDate . "</p>";
echo "<p>" . $item->description . "</p>";
}
But how would I output the thumbnail image that is in the following tag:
<media:thumbnail width="66" height="49" url="http://news.bbcimg.co.uk/media/images/51078000/jpg/_51078953_226alanpotbury.jpg"/>
As you already know, SimpleXML lets you select a node's child using the object property operator -> or a node's attribute using the array access ['name']. It's great, but the operation only works if what you select belongs to the same namespace.
If you want to "hop" from a namespace to another, you can use the children() or attributes() methods. In your case, this is made a bit trickier because you have <item/> in the global namespace, the node you're looking for is in the "media" namespace* and then the attributes are in the global namespace again (they are not prefixed.) So using the normal object/array notation you'll have to "hop" twice:
foreach ($rss->channel->item as $item)
{
// we load the attributes into $thumbAttr
// you can either use the namespace prefix
$thumbAttr = $item->children('media', true)->thumbnail->attributes();
// or preferably the namespace name, read note below for an explanation
$thumbAttr = $item->children('http://search.yahoo.com/mrss/')->thumbnail->attributes();
echo $thumbAttr['url'];
}
*Note
I refer to the namespace as the "media" namespace but that's not really correct. The namespace name is http://search.yahoo.com/mrss/, and "media" is just a prefix, some sort of alias if you will. What's important to keep in mind is that http://search.yahoo.com/mrss/ is the real name of the namespace. At some point, your RSS provider might decide to change the prefix to, say, "yahoo" and your script will stop working if your script refers to the "media" prefix. However, if you use the namespace name, it will keep working no matter the prefix.
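To make the note concrete, here is a small sketch showing that a lookup by namespace name gives the same result whichever prefix the feed uses. The feed template and URL are invented sample data, and `thumbnailUrls` is a hypothetical helper name.

```php
<?php
const MRSS_NS = 'http://search.yahoo.com/mrss/';

// Invented sample feed; PREFIX is swapped for a real prefix below.
$template = <<<'XML'
<rss version="2.0" xmlns:PREFIX="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Sample story</title>
      <PREFIX:thumbnail width="66" height="49" url="https://example.com/thumb.jpg"/>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: collects thumbnail URLs using the namespace name only.
function thumbnailUrls(string $feedXml): array
{
    $rss = simplexml_load_string($feedXml);
    $urls = [];
    foreach ($rss->channel->item as $item) {
        // Iterating handles items with zero or several thumbnails.
        foreach ($item->children(MRSS_NS)->thumbnail as $thumb) {
            $urls[] = (string) $thumb->attributes()['url'];
        }
    }
    return $urls;
}

$mediaFeed = str_replace('PREFIX', 'media', $template);
$yahooFeed = str_replace('PREFIX', 'yahoo', $template);

// Same namespace name, different prefixes, same result.
var_dump(thumbnailUrls($mediaFeed) === thumbnailUrls($yahooFeed));
```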
SimpleXML is pretty bad at handling namespaces. You have two choices: The simplest hack is to simply read the contents of the feed into a string and replace the namespaces;
$feed = file_get_contents('http://feeds.bbci.co.uk/news/england/rss.xml');
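// Strip the prefix from closing tags as well, or paired elements become mismatched
$feed = str_replace(array('<media:', '</media:'), array('<', '</'), $feed);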
$rss = simplexml_load_string($feed);
...
Now you can access the element thumbnail directly.
The more elegant (not really) method is to find out what URI the namespace uses. If you look at the source code for http://feeds.bbci.co.uk/news/england/rss.xml you see that it points to http://search.yahoo.com/mrss/.
Now you can use this URI in the children() method of a SimpleXMLElement to get the contents of the media:thumbnail element;
$rss = simplexml_load_file('http://feeds.bbci.co.uk/news/england/rss.xml');
foreach ($rss->channel->item as $item) {
$media = $item->children('http://search.yahoo.com/mrss/');
...
}
I'm trying to process an RSS feed using PHP and there are some tags such as 'itunes:image' which I need to process. The code I'm using is below and for some reason these elements are not returning any value. The output length is 0.
How can I read these tags and get their attributes?
$f = $_REQUEST['feed'];
$feed = new DOMDocument();
$feed->load($f);
$items = $feed->getElementsByTagName('channel')->item(0)->getElementsByTagName('item');
foreach($items as $key => $item)
{
$title = $item->getElementsByTagName('title')->item(0)->firstChild->nodeValue;
$pubDate = $item->getElementsByTagName('pubDate')->item(0)->firstChild->nodeValue;
$description = $item->getElementsByTagName('description')->item(0)->textContent; // textContent
$arrt = $item->getElementsByTagName('itunes:image');
print_r($arrt);
}
getElementsByTagName is specified by DOM, and PHP is just following that. It doesn't consider namespaces. Instead, use getElementsByTagNameNS, which requires the full namespace URI (not the prefix). This appears to be http://www.itunes.com/dtds/podcast-1.0.dtd*. So:
$img = $item->getElementsByTagNameNS('http://www.itunes.com/dtds/podcast-1.0.dtd', 'image');
// Set preemptive fallback, then set value if check passes
$urlImage = '';
// $img is a DOMNodeList, so test its length and read the first node
if ($img->length > 0) {
$urlImage = $img->item(0)->getAttribute('href');
}
Or put the namespace in a constant.
You might be able to get away with simply removing the prefix and getting all image tags of any namespace with getElementsByTagName.
Make sure to check whether a given item has an itunes:image element at all (example now given); in the example podcast, some don't, and I suspect that was also giving you trouble. (If there's no href attribute, getAttribute will return either null or an empty string per the DOM spec without erroring out.)
*In case you're wondering, there is no actual DTD file hosted at that location, and there hasn't been for about ten years.
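Putting the pieces together, a self-contained sketch of the DOM approach might look like the following. The inline feed is invented sample data, with one item that has an itunes:image and one that does not, and the namespace is kept in a constant as suggested above.

```php
<?php
const ITUNES_NS = 'http://www.itunes.com/dtds/podcast-1.0.dtd';

// Invented sample feed: one item with itunes:image, one without.
$feedXml = <<<XML
<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <item><title>With image</title><itunes:image href="https://example.com/a.jpg"/></item>
    <item><title>Without image</title></item>
  </channel>
</rss>
XML;

$feed = new DOMDocument();
$feed->loadXML($feedXml);

$images = [];
foreach ($feed->getElementsByTagName('item') as $item) {
    $title = $item->getElementsByTagName('title')->item(0)->textContent;

    // Look the element up by namespace URI and local name, not by prefix.
    $imgNodes = $item->getElementsByTagNameNS(ITUNES_NS, 'image');

    // Fall back to an empty string when the item has no itunes:image.
    $images[$title] = $imgNodes->length > 0
        ? $imgNodes->item(0)->getAttribute('href')
        : '';
}

print_r($images);
```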
<?php
$rss_feed = simplexml_load_file("url link");
if (!empty($rss_feed)) {
foreach ($rss_feed->channel->item as $feed_item) {
// Read itunes:image from the current item, not from the feed root
echo $feed_item->children('itunes', true)->image->attributes()->href;
}
}
?>
I am trying to get an xml feed from a url
http://api.eve-central.com/api/marketstat?typeid=1230&regionlimit=10000002
but seem to be failing miserably. I have tried
new SimpleXMLElement,
file_get_contents and
http_get
Yet none of these seem to echo a nice XML feed when I either echo or print_r. The end goal is to eventually parse this data but getting it into a variable would sure be a nice start.
I have attached my code below. This is contained within a loop and $typeID does in fact give the correct ID as seen above
$url = 'http://api.eve-central.com/api/marketstat?typeid='.$typeID.'&regionlimit=10000002';
echo $url."<br />";
$xml = new SimpleXMLElement($url);
print_r($xml);
I should state that the other strange thing I am seeing is that when I echo $url, i get
http://api.eve-central.com/api/marketstat?typeid=1230®ionlimit=10000002
the ® is the registered trademark symbol. I am unsure if this is "feature" in my browser, or a "feature" in my code
Try the following:
<?php
$typeID = 1230;
// set feed URL
$url = 'http://api.eve-central.com/api/marketstat?typeid='.$typeID.'&regionlimit=10000002';
echo $url."<br />";
// read feed into SimpleXML object
$sxml = simplexml_load_file($url);
// then you can do
var_dump($sxml);
// And now you'll be able to call `$sxml->marketstat->type->buy->volume` as well as other properties.
echo $sxml->marketstat->type->buy->volume;
// And if you want to fetch multiple IDs:
foreach($sxml->marketstat->type as $type){
echo $type->buy->volume . "<br>";
}
?>
You need to fetch the data from the URL in order to make an XML object.
$url = 'http://api.eve-central.com/api/marketstat?typeid='.$typeID.'&regionlimit=10000002';
$xml = new SimpleXMLElement(file_get_contents($url));
// pre tags to format nicely
echo '<pre>';
print_r($xml);
echo '</pre>';
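On the ® symbol from the question: the URL in the PHP variable is fine. The symbol appears only when the string is echoed into an HTML page, because browsers read `&reg` as the registered-trademark entity even without a trailing semicolon. A minimal sketch of one way to print the URL safely, assuming the output is HTML (the variable names `$query` and `$printable` are just for illustration):

```php
<?php
$typeID = 1230;

// Build the query string explicitly with "&" as the separator.
$query = http_build_query(['typeid' => $typeID, 'regionlimit' => 10000002], '', '&');
$url = 'http://api.eve-central.com/api/marketstat?' . $query;

// Escape only for display: "&" becomes "&amp;", so "&reg" is no longer an entity.
$printable = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
echo $printable, "<br />\n";

// Use the unescaped $url for the actual request, for example:
//   $xml = simplexml_load_file($url);
// or, with the constructor, pass true as the third argument (dataIsURL):
//   $xml = new SimpleXMLElement($url, 0, true);
```

Keeping the raw `$url` for requests and the escaped copy for display avoids sending `&amp;` to the server by accident.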
I have been trying to read an xml attribute that has a " : " in it, but I'm having trouble...specifically "yweather:condition"
This is my code:
if ($xml = simplexml_load_file("http://weather.yahooapis.com/forecastrss?p=LEXX0003&u=c")) {
$namespacesMeta = $xml->getNamespaces(true);
$yweather = $xml->children($namespacesMeta['yweather']);
$docMeta = $yweather->{'condition'};
var_dump($docMeta);
}
I got this far after reading another thread on Stack Overflow, but the result is not what I expected. I get the following:
object(SimpleXMLElement)[3]
You can check the above link to see the full xml,
I want to read the attributes in "yweather:condition"
I know how to access and read the other parts of the XML, but this one is being tricky...I also tried getAttributes() and it did not work
thanks
$docMetaAttributes = $docMeta->attributes();
or
$docMetaAttributes = $docMeta->attributes($namespacesMeta['yweather']);
for namespaced attributes in the yweather namespace
http://www.php.net/manual/en/simplexmlelement.attributes.php
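For a complete picture, here is a small sketch of reading the attributes of a namespaced element such as yweather:condition. It makes some assumptions: the inline XML is invented sample data (the text, code, and temp values are placeholders), and the element is assumed to sit under channel/item, so children() is called on the item rather than on the root.

```php
<?php
// Invented sample document; attribute values are placeholders.
$xmlString = <<<XML
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
  <channel>
    <item>
      <title>Conditions</title>
      <yweather:condition text="Fair" code="34" temp="21"/>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($xmlString);

// Map of prefix => namespace URI for every namespace used in the document.
$namespaces = $xml->getNamespaces(true);

// Hop into the yweather namespace at the element's parent, not at the root.
$condition = $xml->channel->item->children($namespaces['yweather'])->condition;

// The attributes themselves are unprefixed, so attributes() needs no argument.
$attrs = $condition->attributes();

echo (string) $attrs['text'], ', ', (string) $attrs['temp'], "\n";
```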