How to collect all info from a rss feed using PHP [duplicate] - php

This question already has answers here:
How to get attribute of node with namespace using SimpleXML? [closed]
(4 answers)
Closed 8 years ago.
I need to display the following information from an RSS feed using PHP:
title, link/URL, description and image.
I have written the code below, but I am unable to fetch the image from the feed.
I have checked many sites but still cannot solve this problem.
<?php
$ch = curl_init("http://economico.sapo.pt/rss/ultimas");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$data = curl_exec($ch);
curl_close($ch);
$doc = new SimpleXmlElement($data);
//print_r($doc);
if(isset($doc->channel))
{
parseRSS($doc);
}
if(isset($doc->entry))
{
parseAtom($doc);
}
function parseRSS($xml)
{
// echo "<strong>".$xml->channel->title."</strong>";
$cnt = count($xml->channel->item);
for($i=0; $i<$cnt; $i++)
{
$postdate = $xml->channel->item[$i]->pubDate;
//pubDate
$url = $xml->channel->item[$i]->link;
$title = $xml->channel->item[$i]->title;
$desc = $xml->channel->item[$i]->description;
echo $postdate."<br/>".''.$title.'<br/>'.$desc.'<br/>';
}
}
function parseAtom($xml)
{
echo "<strong>".$xml->author->name."</strong>";
$cnt = count($xml->entry);
for($i=0; $i<$cnt; $i++)
{
$entry = $xml->entry[$i]; // index the entry, not the link
$urlAtt = $entry->link->attributes();
$url = $urlAtt['href'];
$title = $entry->title;
$desc = strip_tags($entry->content);
echo ''.$title.''.$desc.'';
}
}
?>

You are already on the right track with ->attributes(). For elements that live in a namespace, use ->children() in the same way. A simple example:
$url = 'http://economico.sapo.pt/rss/ultimas';
$rss = simplexml_load_file($url, null, LIBXML_NOCDATA);
foreach($rss->channel->item as $item) {
$title = (string) $item->title;
$link = (string) $item->link;
$description = (string) $item->description;
$pubDate = (string) $item->pubDate;
$media_image_url = '';
$media_title = '';
$media = $item->children('media', true); // second argument true: 'media' is a prefix, not a namespace URI
if(isset($media->content)) {
$media_image_url = (string) $media->content->attributes()->url;
$media_title = (string) $media->content->title;
}
echo "
Title: $title <br/>
Link: $link <br/>
Description: $description <br/>
Pub Date: $pubDate <br/>
Image URL: $media_image_url <br/>
Media Title: $media_title <br/>
<hr/>
";
}
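To see the ->children() lookup in isolation, here is a minimal sketch that runs offline. The inline feed, its URLs and the $images array are made up for illustration, and it assumes the feed declares the Media RSS namespace http://search.yahoo.com/mrss/. It selects by namespace URI instead of by prefix, which should keep working if a feed picks a different prefix.

```php
<?php
// Hypothetical sample feed; a real feed may be laid out differently.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <media:content url="http://example.com/first.jpg" type="image/jpeg">
        <media:title>First image</media:title>
      </media:content>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);
$images = array();

foreach ($rss->channel->item as $item) {
    // Select the children by namespace URI; the prefix is only an alias
    // that each feed is free to choose.
    $media = $item->children('http://search.yahoo.com/mrss/');
    $image = '';
    if (isset($media->content)) {
        // The url attribute itself is not namespaced.
        $image = (string) $media->content->attributes()->url;
    }
    $images[(string) $item->title] = $image;
}

print_r($images);
```

Items without a media:content element simply end up with an empty string, so the loop does not need a separate error path.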

Related

Convert a RSS feed from php to xml

The code below is a PHP script that generates a simple web page showing just one item from my RSS URL. The feed at this URL has around 60 items.
I would like to generate an XML RSS feed that contains just one item, picked at random. Is this possible?
<?php
function load_xml_feed($feed){
global $RanVal;
$i= 1;
$FeedXml = simplexml_load_file($feed);
foreach ($FeedXml->channel->item as $topic) {
$title[$i] = (string)$topic->title;
$link[$i] = (string)$topic->link;
$description[$i] = (string)$topic->description;
$i++;
}
$randtopic = rand(2, $i - 1); // $i is one past the last stored index
$link = trim($link[$randtopic]);
$title = trim($title[$randtopic]);
$description = trim($description[$randtopic]);
$RanVal = array($title,$link,$description);
return $RanVal;
}
$rss = "http://syncds.boxip.com.br/feed?code=24C165DB04";
load_xml_feed($rss);
$link = $RanVal[1];
$title = $RanVal[0];
$description = $RanVal[2];
echo "<h1>".$title."</h1><h2>".$link."</h2><p>".$description."</p>";
?>
Thank you guys.
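One possible approach, as a rough sketch: pick the item with array_rand() and write the result with DOMDocument, which takes care of escaping. The inline $source feed and the function name random_item_feed() are hypothetical stand-ins so the example runs offline; with the real feed the string would come from the RSS URL instead.

```php
<?php
// Hypothetical stand-in for the remote feed, so the sketch runs offline.
$source = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Source feed</title>
    <link>http://example.com/</link>
    <description>Sample items</description>
    <item><title>One</title><link>http://example.com/1</link><description>First</description></item>
    <item><title>Two</title><link>http://example.com/2</link><description>Second</description></item>
    <item><title>Three</title><link>http://example.com/3</link><description>Third</description></item>
  </channel>
</rss>
XML;

// Hypothetical helper: returns an RSS document holding one random item.
function random_item_feed($xmlString)
{
    $in = simplexml_load_string($xmlString);

    // Collect the items into a plain, zero-based array.
    $items = array();
    foreach ($in->channel->item as $item) {
        $items[] = $item;
    }
    if (count($items) === 0) {
        return null;
    }
    $picked = $items[array_rand($items)];

    // Build the output document; text nodes are escaped automatically.
    $out = new DOMDocument('1.0', 'UTF-8');
    $out->formatOutput = true;
    $rss = $out->createElement('rss');
    $rss->setAttribute('version', '2.0');
    $out->appendChild($rss);
    $channel = $out->createElement('channel');
    $rss->appendChild($channel);

    $fields = array('title', 'link', 'description');

    // Copy the channel-level fields from the source feed.
    foreach ($fields as $name) {
        $node = $out->createElement($name);
        $node->appendChild($out->createTextNode(trim((string) $in->channel->$name)));
        $channel->appendChild($node);
    }

    // Copy the same fields from the randomly picked item.
    $itemNode = $out->createElement('item');
    foreach ($fields as $name) {
        $node = $out->createElement($name);
        $node->appendChild($out->createTextNode(trim((string) $picked->$name)));
        $itemNode->appendChild($node);
    }
    $channel->appendChild($itemNode);

    return $out->saveXML();
}

$result = random_item_feed($source);
echo $result;
```

When this is served over HTTP, sending an XML content type header before the echo would probably be needed so that feed readers accept the output.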

returning json array with PHP not working

I am trying to return a JSON array after I parse an RSS feed.
This is my code:
<?php
header('Content-Type: application/json');
$feed = new DOMDocument();
//http://www.espnfc.com/rss
//http://www.football365.com/topical-top-10/rss
$feed->load('http://www.espnfc.com/rss');
$json = array();
$items = $feed->getElementsByTagName('channel')->item(0)->getElementsByTagName('item');
$json['item'] = array();
$i = 0;
foreach($items as $item) {
$i = $i+1;
$title = $item->getElementsByTagName('title')->item(0)->firstChild->nodeValue;
$link = $item->getElementsByTagName('link')->item(0)->firstChild->nodeValue;
$img = $item->getElementsByTagName('enclosure')->item(0)->attributes->getNamedItem('url')->value;
//$img = $item;echo($url);
$json['item'][] = array("title"=>str_replace(array("\n", "\r", "\t","'"), ' ', $title),"link"=>str_replace(array("\n", "\r", "\t","'"), ' ', $link),"img"=>str_replace(array("\n", "\r", "\t","'"), ' ', $img));
}
print_r($json['item'][0]);
//echo json_encode($json['item']);
?>
After iterating over all the items, I would finally like to echo them as the result:
echo json_encode($json['item']);
The problem is that nothing shows up in the browser. When I move this line into the foreach block, it does show a result (with redundancy, of course).
Some of the items don't have an <enclosure> tag, so the script gets an error when it tries to access the url attribute. You need to check for this.
$enclosures = $item->getElementsByTagName('enclosure');
if ($enclosures->length) {
$img = $enclosures->item(0)->getAttribute('url');
} else {
$img = '';
}
Your code returns the status "500 Internal Server Error".
You can easily see this in the Network tab of your browser's developer tools.
This happens because the third item has no image.
<?php
// Json Header
header('Content-Type: application/json');
// Get Feed
$feed = new DOMDocument();
$feed->load('http://www.espnfc.com/rss');
// Get Items
$items = $feed->getElementsByTagName('channel')->item(0)->getElementsByTagName('item');
// My json object
$json = array();
$json['item'] = array();
// For each item
foreach($items as $item){
// Get title
$title = $item->getElementsByTagName('title')->item(0)->firstChild->nodeValue;
// Get link
$link = $item->getElementsByTagName('link')->item(0)->firstChild->nodeValue;
// Get image if it exist
$img = $item->getElementsByTagName('enclosure');
if($img->length>0){
$img = $img->item(0)->attributes->getNamedItem('url')->value;
} else {
$img = "";
}
array_push($json['item'], array(
"title" => preg_replace('/(\n|\r|\t|\')/', ' ', $title),
"link" => preg_replace('/(\n|\r|\t|\')/', ' ', $link),
"img" => preg_replace('/(\n|\r|\t|\')/', ' ', $img)
));
}
echo json_encode($json['item']);
?>
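The same guard can be tried without the network. This is only a sketch: the inline feed and the helper first_text() are hypothetical, and the second item deliberately has no <enclosure> so the fallback path is exercised.

```php
<?php
// Hypothetical sample feed: the second item has no <enclosure>.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample</title>
    <item>
      <title>With image</title>
      <link>http://example.com/a</link>
      <enclosure url="http://example.com/a.jpg" type="image/jpeg" length="0"/>
    </item>
    <item>
      <title>Without image</title>
      <link>http://example.com/b</link>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: text of the first matching descendant, or a default.
function first_text(DOMElement $parent, $tag, $default = '')
{
    $node = $parent->getElementsByTagName($tag)->item(0);
    return $node === null ? $default : trim($node->textContent);
}

$feed = new DOMDocument();
$feed->loadXML($xml);

$result = array();
foreach ($feed->getElementsByTagName('item') as $item) {
    // item(0) returns null when the list is empty, so test before using it.
    $enclosure = $item->getElementsByTagName('enclosure')->item(0);
    $result[] = array(
        'title' => first_text($item, 'title'),
        'link'  => first_text($item, 'link'),
        'img'   => $enclosure === null ? '' : $enclosure->getAttribute('url'),
    );
}

$json = json_encode($result);
echo $json;
```

Checking the node for null before touching its attributes is what keeps the loop from aborting on the first item that lacks an image.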

How to turn variable feed to array?

I want to turn the variable into an array so that I can store more than one feed. How can I do that?
<?php
error_reporting(0);
$feed_lifehacker_full = simplexml_load_file('http://feeds.gawker.com/lifehacker/full');
$xml = $feed_lifehacker_full;
//print_r($xml);
foreach ($xml->channel->item as $node){
$title = $node->title;
$link = $node->link;
$link = explode('/', $link);
$link = $link[8];
$url = $node->url;
$description = $node->description;
$pubDate = $node->pubDate;
preg_match_all('#(http://img[^\s]+(?=\.(jpe?g|png|gif)))#i', $description[0], $images);
$images = $images[0][1] . '.jpg';
if($images == '.jpg'){
//uncomment to show youtube articles
//$images = "http://placehold.it/640x360";
//echo "<a href='page2.php?a=$link' title='$title'><img src='$images' /></a><br>";
} else {
//article image
$images . '<br>';
echo "<a href='page2.php?a=$link' title='$title'><img src='$images' /></a><br>";
}
}
How can I change this part so that the feeds are loaded into an array?
$feed_lifehacker_full = simplexml_load_file('http://feeds.gawker.com/lifehacker/full');
$xml = $feed_lifehacker_full;
The script just gathers the images from an RSS feed and links each one to a page. If you see a more efficient way to do it, feel free to say so.
It is possible to encode the result as JSON; decoding it again with the associative flag set returns an array:
$xml = simplexml_load_string($xmlstring);
$json = json_encode($xml);
$array = json_decode($json, TRUE);
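For storing several feeds, a plain loop that casts the fields to strings may be easier to work with than the JSON round trip, which turns a feed with a single item into an associative array instead of a list. The sketch below uses two made-up inline feeds; the names $sources and $feeds are assumptions, and in practice each string would come from the feed URL.

```php
<?php
// Hypothetical inline feeds keyed by name; real code would load each
// one from its URL instead.
$sources = array(
    'first'  => '<rss version="2.0"><channel><title>First</title>'
              . '<item><title>A1</title><link>http://example.com/a1</link></item>'
              . '<item><title>A2</title><link>http://example.com/a2</link></item>'
              . '</channel></rss>',
    'second' => '<rss version="2.0"><channel><title>Second</title>'
              . '<item><title>B1</title><link>http://example.com/b1</link></item>'
              . '</channel></rss>',
);

$feeds = array();
foreach ($sources as $name => $xmlString) {
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        continue; // skip feeds that fail to parse
    }
    $items = array();
    foreach ($xml->channel->item as $node) {
        // Cast to string so the array holds plain values, not XML objects.
        $items[] = array(
            'title' => (string) $node->title,
            'link'  => (string) $node->link,
        );
    }
    $feeds[$name] = $items;
}

print_r($feeds);
```

Each feed then sits under its own key, so $feeds['first'][0]['title'] addresses one item of one feed.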

XML data extraction of src="xyz" from description

I am trying to incorporate my pin feed into my site. I have got it working, but I need to change what it shows because it is not quite working as intended.
What I need is to extract a certain piece of data from the description part of the data.
This is the code I use to grab my XML feed:
<?php
$ch = curl_init("http://pinterest.com/1234/feed.rss");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
$data = curl_exec($ch);
curl_close($ch);
$doc = new SimpleXmlElement($data, LIBXML_NOCDATA);
if(isset($doc->channel))
{
parseRSS($doc);
}
function parseRSS($xml)
{
$cnt = 9;
for($i=0; $i<$cnt; $i++)
{
$url = $xml->channel->item[$i]->link;
$img = $xml->channel->item[$i]->description;
$title = $xml->channel->item[$i]->title;
echo '<p>'.$img.'</p>';
}
}
?>
The problem is that description looks like below and all I want is the src value from it:
<description><p><a href="/pin/1785432765530/"><img src="http://media-cache-ec1.pinterest.com/upload/27099622222548513383_qJV62266Pf_b.jpg"></a></p><p>What it takes to Google’s.</p></description>
Is there a way to just get src="http://media-cache-ec1.pinterest.com/upload/270996666522513383_qJV6666Pf_b.jpg" out of the description and store it in $img or another variable?
html_entity_decode and Simple HTML DOM parser could solve your problem.
(http://stackoverflow.com/questions/138313/how-to-extract-img-src-title-and-alt-from-html-using-php)
A regular expression would help you (see the PHP Manual or Wikipedia),
e.g.: src="([^"]*)"
Thanks, all. I used:
$cnt = 9;
for($i=0; $i<$cnt; $i++)
{
$url = $xml->channel->item[$i]->link;
$img = $xml->channel->item[$i]->description;
$title = $xml->channel->item[$i]->title;
$pattern = '/src="([^"]*)"/';
// $matches[1] holds only the URL captured between the quotes
if (preg_match($pattern, $img, $matches)) {
$src = $matches[1];
echo '<p><img src="'.$src.'"></p>';
}
}
}
?>
thanks for the tips
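As an alternative to the regular expression, the HTML parser route suggested above can be sketched with PHP's built-in DOMDocument rather than the Simple HTML DOM library. The sample $description string and the helper name first_image_src() are hypothetical; the string assumes the description has already been entity-decoded, as in the question.

```php
<?php
// Hypothetical description value, already entity-decoded.
$description = '<p><a href="/pin/123/"><img src="http://example.com/pin.jpg"></a></p>'
             . '<p>Some caption.</p>';

// Hypothetical helper: returns the src of the first <img>, or null.
function first_image_src($html)
{
    if (trim($html) === '') {
        return null;
    }
    $doc = new DOMDocument();
    // Fragments are rarely well-formed, so collect parser warnings
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $img = $doc->getElementsByTagName('img')->item(0);
    return $img === null ? null : $img->getAttribute('src');
}

$src = first_image_src($description);
if ($src !== null) {
    echo '<p><img src="' . htmlspecialchars($src) . '"></p>';
}
```

A parser tends to cope better than a pattern when the attribute order or quoting in the description changes.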

Parsing XML in PHP DOM via cURL - can't get nodeValue if it is url address or date

I have a strange problem parsing an XML document in PHP that is loaded via cURL. I cannot get the nodeValue of nodes containing a URL address (I'm trying to implement a simple RSS reader in my CMS). The strange thing is that it works for every node except those containing URL addresses and dates (<link> and <pubDate>).
Here is the code (I know it is a clumsy solution, but I'm fairly new to working with the DOM and parsing XML documents).
function file_get_contents_curl($url) {
$ch = curl_init(); // initialize curl handle
curl_setopt($ch, CURLOPT_URL, $url); // set url to post to
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return into a variable
curl_setopt($ch, CURLOPT_TIMEOUT, 4); // times out after 4s
$result = curl_exec($ch); // run the whole process
return $result;
}
function vypis($adresa) {
$html = file_get_contents_curl($adresa);
$doc = new DOMDocument();
@$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');
$desc = $doc->getElementsByTagName('description');
$ctg = $doc->getElementsByTagName('category');
$pd = $doc->getElementsByTagName('pubDate');
$ab = $doc->getElementsByTagName('link');
$aut = $doc->getElementsByTagName('author');
for ($i = 1; $i < $desc->length; $i++) {
$dsc = $desc->item($i);
$titles = $nodes->item($i);
$categorys = $ctg->item($i);
$pubDates = $pd->item($i);
$links = $ab->item($i);
$autors = $aut->item($i);
$description = $dsc->nodeValue;
$title = $titles->nodeValue;
$category = $categorys->nodeValue;
$pubDate = $pubDates->nodeValue;
$link = $links->nodeValue;
$autor = $autors->nodeValue;
echo 'Title:' . $title . '<br/>';
echo 'Description:' . $description . '<br/>';
echo 'Category:' . $category . '<br/>';
echo 'Datum ' . gmdate("D, d M Y H:i:s",
strtotime($pubDate)) . " GMT" . '<br/>';
echo "Autor: $autor" . '<br/>';
echo 'Link: ' . $link . '<br/><br/>';
}
}
Can you please help me with this?
To read RSS you should use loadXML, not loadHTML. One reason your links don't show up is that the <link> tag in HTML is an empty element, so the parser ignores its contents. See also here: http://www.w3.org/TR/html401/struct/links.html#h-12.3
Also, I find it easier to just iterate over the <item> tags and then iterate over their children nodes. Like so:
$d = new DOMDocument;
// don't show xml warnings
libxml_use_internal_errors(true);
$d->loadXML($xml_contents);
// clear xml warnings buffer
libxml_clear_errors();
$items = array();
// iterate all item tags
foreach ($d->getElementsByTagName('item') as $item) {
$item_attributes = array();
// iterate over children
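foreach ($item->childNodes as $child) {
// skip the whitespace text nodes between child elements
if ($child->nodeType !== XML_ELEMENT_NODE) {
continue;
}
$item_attributes[$child->nodeName] = $child->nodeValue;
}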
$items[] = $item_attributes;
}
var_dump($items);
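The difference between the two loaders can be checked side by side. This is a small sketch with a made-up one-item feed; the variable names are assumptions. It parses the same string once as HTML and once as XML and reads the first <link> node from each document.

```php
<?php
// Hypothetical one-item feed used for both parsers.
$xml = '<rss version="2.0"><channel><item>'
     . '<title>Post</title><link>http://example.com/post</link>'
     . '</item></channel></rss>';

// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

// Parsed as HTML: <link> is treated as an empty element.
$asHtml = new DOMDocument();
$asHtml->loadHTML($xml);
$htmlLink = $asHtml->getElementsByTagName('link')->item(0);
$htmlValue = $htmlLink === null ? null : $htmlLink->nodeValue;

// Parsed as XML: <link> is an ordinary element with text content.
$asXml = new DOMDocument();
$asXml->loadXML($xml);
$xmlLink = $asXml->getElementsByTagName('link')->item(0);
$xmlValue = $xmlLink === null ? null : $xmlLink->nodeValue;

libxml_clear_errors();

var_dump($htmlValue);
var_dump($xmlValue);
```

Only the XML parse should keep the URL inside the <link> node, which matches the symptom described in the question.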
