Removing wrapping HTML elements inside an RSS XML node - php

I have a fetch function that injects RSS content into a page. It returns XML containing the usual RSS elements (title, link, description), but each description is a table with two tds: one holds an image and the other holds the text. How can I remove the table, img and td elements and keep only the text, using PHP rather than JavaScript?
Any help is much appreciated.
<?php
require_once('rss_fetch.inc');
$url = 'http://www.domain.com/rss.aspx?typeid=0&imagesize=120&topcount=20';
if ( $url ) {
$rss = fetch_rss( $url );
//echo "Channel: " . $rss->channel['title'] . "<p>";
echo "<ul>";
foreach ($rss->items as $item) {
$href = $item['link'];
$title = $item['title'];
$description = $item['description'];
$pubdate = date('F dS, Y', strtotime($item['pubdate']));
echo "<li><h3>$title<em>$pubdate</em></h3>$description <p><a href='$href' target='_blank'>Read more</a></p><br/></li>";
}
echo "</ul>";
}
?>

strip_tags() will do the job: pass $item['description'] through it before echoing, and only the text is left.
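As a minimal sketch of that suggestion, the helper below strips the markup from a description string. The function name plain_description() and the sample markup are assumptions made for illustration; only strip_tags() and trim() come from PHP itself.

```php
<?php
// Hypothetical helper: reduce an HTML description to its plain text.
function plain_description(string $html): string
{
    // strip_tags() removes every tag (table, tr, td, img, ...) and keeps the text;
    // trim() drops the whitespace left behind at both ends.
    return trim(strip_tags($html));
}

// Sample description shaped like the one in the question (made up for this sketch).
$description = '<table><tr><td><img src="thumb.jpg" alt=""></td>'
             . '<td>Only this text should remain.</td></tr></table>';

echo plain_description($description), PHP_EOL;
```

In the loop from the question this would amount to $description = plain_description($item['description']); before the echo.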

Related

PHP parses XML content without looping the hyperlink

I have an xml file structured like this:
<channel>
<title>abc</title>
<link>domain.com</link>
<description>Bla bla.</description>
<item>
<title>xyz </title>
<link>domain.com/</link>
<description>
<table border="1" width="100%"><tr><th colspan="2"></th><th>P</th><th>W</th><th>D</th><th>L</th><th>GF</th><th>GA</th><th>Dif</th><th>Pts</th></tr><tr><td width="7%">1</td><td width="27%"><a target="_blank" href="domain[dot]com/new-york/"/>New York</td><td width="7%"><center>12</center></td><td width="7%"><center>8</center></td><td width="7%"><center>2</center></td><td width="7%"><center>2</center></td><td width="7%"><center>17</center></td><td width="7%"><center>10</center></td><td width="7%"><center>+7</center></td><td width="7%"><center>26</center></td></tr><tr><td width="7%">2</td><td width="27%"><a target="_blank" href="domain[dot]com/lon-don/"/>London</td><td width="7%"><center>12</center></td><td width="7%"><center>6</center></td><td width="7%"><center>4</center></td><td width="7%"><center>2</center></td><td width="7%"><center>22</center></td><td width="7%"><center>12</center></td><td width="7%"><center>+10</center></td><td width="7%"><center>22</center></td></tr></table><br/>
</description>
I used this code to parse the table data in PHP, and it works:
$url = "link to the above xml file";
$xml = simplexml_load_file($url);
foreach($xml->channel->item as $item){
$desc = html_entity_decode((string)$item->description);
$descXML = simplexml_load_string('<desc>'.$desc.'</desc>');
$html = $descXML->table->asXML();
$html .= "<hr />";
echo $html;
}
However, the output also includes the hyperlinks from the table cells (domain[dot]com/new-york/ and domain[dot]com/lon-don/).
I would like to exclude the hyperlinks and output only the plain text, such as New York or London.
Thanks,
Because you output the entire table XML with
$html = $descXML->table->asXML();
you get all of the table's markup. If you only want some of the table data, process it further to extract that data:
$xml = simplexml_load_file($url);
foreach($xml->item as $item){
$desc = html_entity_decode((string)$item->description);
$descXML = simplexml_load_string('<desc>'.$desc.'</desc>');
// Loop over each row of the table
foreach ( $descXML->table->tr as $row ) {
// If there are td elements
if ( isset($row->td) ) {
// Extract the value from the second td element, convert to a string and trim the result
$html = trim((string)($row->td[1]));
$html .= "<hr />";
echo $html;
}
}
}
If you want all of the <tr> XML except the <a> tag, you can just unset it (assuming it will always be there)...
foreach ( $descXML->table->tr as $row ) {
// If there are td elements
if ( isset($row->td) ) {
unset($row->td[1]->a);
$html = $row->asXML(). "<hr />";
echo $html;
}
}
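Both variants can be tried without the remote feed. The sketch below runs them against a shortened, made-up copy of the table; the function names team_names() and rows_without_links() and the sample data are assumptions for illustration, not part of the original answer.

```php
<?php
// Shortened, hypothetical description content modelled on the question's table.
$desc = '<table><tr><th>Pos</th><th>Team</th><th>Pts</th></tr>'
      . '<tr><td>1</td><td><a target="_blank" href="domain.com/new-york/"/>New York</td><td>26</td></tr>'
      . '<tr><td>2</td><td><a target="_blank" href="domain.com/lon-don/"/>London</td><td>22</td></tr>'
      . '</table>';

// Variant 1: collect only the text of the second cell of each data row.
function team_names(string $desc): array
{
    $descXML = simplexml_load_string('<desc>' . $desc . '</desc>');
    if ($descXML === false) {
        return [];
    }
    $names = [];
    foreach ($descXML->table->tr as $row) {
        // Header rows contain th elements only, so td[1] is not set for them.
        if (isset($row->td[1])) {
            // Casting to string yields the cell's own text, not the <a> element.
            $names[] = trim((string) $row->td[1]);
        }
    }
    return $names;
}

// Variant 2: keep the row markup but drop the <a> element from the second cell.
function rows_without_links(string $desc): string
{
    $descXML = simplexml_load_string('<desc>' . $desc . '</desc>');
    if ($descXML === false) {
        return '';
    }
    $html = '';
    foreach ($descXML->table->tr as $row) {
        if (isset($row->td[1])) {
            unset($row->td[1]->a);
            $html .= $row->asXML();
        }
    }
    return $html;
}

print_r(team_names($desc));
echo rows_without_links($desc), PHP_EOL;
```

Note that both variants rely on the description being well-formed XML; HTML that is not well formed would need an HTML parser such as DOMDocument::loadHTML() instead.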

Handling different RSS Feed Formats

I am trying to create a personal job board using RSS feeds from Craigslist, Reddit, Kijiji, and Indeed.
I have found a method (using Magpie) to bring in the multiple feeds, but I am not able to parse any data from Indeed.ca. I tried echoing the results at different stages to make sure I was connected to the Indeed RSS feed, and I was able to get information, but it does not display in the finished product.
Here's my code to call the RSS Feeds (rss-urls.php):
$urls = array(
//Craigslist RSS Feeds
'http://toronto.en.craigslist.ca/med/index.rss',
//Reddit RSS Feeds
'http://www.reddit.com/r/forhire/new/.rss',
//Kijiji RSS Feeds
'http://www.kijiji.ca/rss-srp-graphic-web-design-jobs/owen-sound/c152l1700187',
//Indeed RSS Feed
'http://www.indeed.ca/rss?q=Graphic+Designer&l=Toronto%2C+Ontario');
foreach($urls as $url) {
$rss = fetch_rss($url);
foreach ($rss->items as $item ) {
$title = $item['title'];
$url = $item['link'];
$description = $item['description'];
$date = $item['dc']['date'];
//print_r($tot_array);
rsort($tot_array);
And here's the code that takes the feed's info and displays it:
foreach($tot_array as $tot) {
$all = explode(",",$tot);
$date = date("Y-m-d",strtotime($all[4]));
$now = date("Y-m-d");
$title = $all[1];
$url = $all[2];
$description = $all[3];
//echo $tot."";
//print $url;
if (false !== strpos($url,'indeed')) {
echo '<div id="linkCell" style="width: 100%;">';
echo '<div id="vAlign">';
echo '<p class="linkTitle">'.$title.'</p><br />';
//echo '<span class="date">Post is '. date_diff(date_create($date), date_create($now))->format('%a day(s) old') .'</span></p>';
echo '<p class="description">'.$description.'</p>';
echo '</div>';
echo '</div>';
echo '<span style="color:white;">'.$date."</span><br>";
}
}
If you view the result in a browser, you can see that the Indeed RSS feed contains a lot of empty lines.
I would trim those lines out before parsing it.
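One reading of that advice is to clean the raw feed text before it reaches a parser. The sketch below assumes the feed has already been fetched into a string; the function name remove_blank_lines() and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: drop empty and whitespace-only lines from raw feed text.
function remove_blank_lines(string $raw): string
{
    // \R matches any line-break sequence (\n, \r\n or \r).
    $lines = preg_split('/\R/', $raw);

    // Keep only the lines that still contain something after trimming.
    $kept = array_filter($lines, function ($line) {
        return trim($line) !== '';
    });

    return implode("\n", $kept);
}

// Made-up input with the kind of padding described above.
$raw = "\n\n<rss>\n\n   \n<channel/>\r\n\r\n</rss>\n";
echo remove_blank_lines($raw), PHP_EOL;
```

The cleaned string could then be passed to a parser that accepts a string, for example simplexml_load_string(). Be aware that this also removes blank lines inside CDATA sections, which may or may not matter for a given feed.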

Trying to read values from an XML feed and display them in PHP

I'm trying to display values from this xml feed:
http://www.scorespro.com/rss/live-soccer.xml
In my PHP code I have the following loop but it does not display the results on my page:
<?php
$xml = simplexml_load_file("http://www.scorespro.com/rss/live-soccer.xml");
echo $xml->getName() . "<br>";
foreach($xml->children() as $item)
{
echo $item->getName() . ": " . $item->name . "<br>";
}
?>
For some reason it only shows:
rss
channel:
I'm fairly new to how XML works so any help would be much appreciated.
The actual data is under $xml->channel->item, so loop over that instead:
$items = $xml->channel->item;
foreach($items as $item) {
$title = $item->title;
$link = $item->link;
$pubDate = $item->pubDate;
$description = $item->description;
}
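Alternatively, you can load the feed with DOMDocument and convert it to SimpleXML:
$dom = new DOMDocument;
// load() reads from a file path or URL; loadXML() expects the XML source itself as a string
if (!$dom->load($url)) {
echo 'Error while parsing the document';
exit;
}
$xml = simplexml_import_dom($dom);
$data = $xml->channel->item;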
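To see why the original loop printed only "channel:", it helps to run the item loop against a small feed. The sketch below uses a made-up feed string in place of the live URL; item_titles() and the sample data are assumptions for illustration.

```php
<?php
// Hypothetical feed standing in for the live URL.
$source = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Live scores</title>
    <item>
      <title>Team A 1-0 Team B</title>
      <link>http://example.com/match/1</link>
    </item>
    <item>
      <title>Team C 2-2 Team D</title>
      <link>http://example.com/match/2</link>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: return the title of every item in an RSS 2.0 document.
function item_titles(string $source): array
{
    $xml = simplexml_load_string($source);
    if ($xml === false) {
        return [];
    }
    $titles = [];
    // The root element is <rss>; its only child is <channel>,
    // and the items live one level further down.
    foreach ($xml->channel->item as $item) {
        $titles[] = (string) $item->title;
    }
    return $titles;
}

print_r(item_titles($source));
```

Calling $xml->children() on the root yields just the single channel element, which has no name child, so the loop in the question prints one line with an empty value.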

How to fetch content:encoded tags information from rss

I am fetching blog posts from WordPress into my website using the magpierss-0.72 RSS parser. Now I want to fetch the image from each post; the image is in a tag like
<content:encoded><img src="path" /></content:encoded>
Here is the code I have tried:
require_once('rss_fetch.inc');
$rss = fetch_rss($url);
foreach ($rss->items as $i => $item ) {
$title = strtoupper ($item['title']);
$url = $item['link'];
$date = $item['pubdate'];
$desc = $item['description'];
$content = $item['content:encoded'];
echo $title."<br />";
echo $url."<br />";
echo $date."<br />";
echo $desc."<br />";
echo $content."<br />";
}
But the contents of the content:encoded tag are not being fetched. Can anyone help, please?
Thank you in advance.
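Magpie parses namespaced elements into a nested array keyed by the namespace prefix, so the content should be available as $item['content']['encoded'] rather than $item['content:encoded']. For Atom feeds the entry content is stored under $item['atom_content'] instead.
For reference, see rss_parse.php, the MagpieRSS::parse method.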
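Once the encoded content is available, the image path still has to be pulled out of the HTML. The sketch below is one possible way, assuming the item is a plain array shaped as described above; first_image_src() is a hypothetical helper, and it checks both fields mentioned in the answer so that it works whichever one is populated.

```php
<?php
// Hypothetical helper: return the src of the first <img> in an item's content, or null.
function first_image_src(array $item): ?string
{
    // Use whichever of the two fields is populated for this feed.
    $html = $item['content']['encoded'] ?? $item['atom_content'] ?? '';
    if ($html === '') {
        return null;
    }

    // Feed content is often not valid HTML; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = $dom->getElementsByTagName('img');
    return $images->length > 0 ? $images->item(0)->getAttribute('src') : null;
}

// Made-up item for illustration.
$item = ['content' => ['encoded' => '<p>Post body</p><img src="http://example.com/a.jpg" />']];
echo first_image_src($item), PHP_EOL;
```

Inside the foreach from the question this would be called as first_image_src($item) for each entry.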

Inserting multiple RSS feeds into MySQL with SimpleXML

Below is my code to parse multiple RSS feeds into a MySQL database.
I think I am doing something wrong in the foreach part, since there is no output.
The database, however, gets filled. With a single feed, the script works fine.
Does anybody see what I am doing wrong? Many thanks in advance :)
$feeds = ('https://www.ictu.nl/rss.xml', 'http://www.vng.nl/smartsite.dws?id=97817');
$xml = simplexml_load_file($feeds);
foreach($xml->channel->item as $item)
{
$date_format = "j-n-Y"; // 7-7-2008
echo date($date_format,strtotime($item->pubDate));
echo ''.$item->title.'';
echo '<div>' . $item->description . '</div>';
mysql_query("INSERT INTO rss_feeds (id, title, description, link, pubdate)
VALUES (
'',
'".mysql_real_escape_string($item->title)."',
'".mysql_real_escape_string($item->description=htmlspecialchars(trim($item->description)))."',
'".mysql_real_escape_string($item->link)."',
'".mysql_real_escape_string($item->pubdate)."')");
}
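simplexml_load_file() accepts a single file name or URL, so put the feeds in an array and load each one inside a loop: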
<?php
$feeds = array('https://www.ictu.nl/rss.xml', 'http://www.vng.nl/smartsite.dws?id=97817');
foreach( $feeds as $feed ) {
$xml = simplexml_load_file($feed);
foreach($xml->channel->item as $item)
{
$date_format = "j-n-Y"; // 7-7-2008
echo date($date_format,strtotime($item->pubDate));
echo ''.$item->title.'';
echo '<div>' . $item->description . '</div>';
mysql_query("INSERT INTO rss_feeds (id, title, description, link, pubdate)
VALUES (
'',
'".mysql_real_escape_string($item->title)."',
'".mysql_real_escape_string(htmlspecialchars(trim($item->description)))."',
'".mysql_real_escape_string($item->link)."',
'".mysql_real_escape_string($item->pubDate)."')");
}
}
?>
Hope it helps.
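The nested loop can be checked in isolation from the database. The sketch below is a simplified variant that parses XML strings instead of URLs and collects the rows into an array rather than inserting them; collect_rows() and the two sample documents are assumptions for illustration.

```php
<?php
// Hypothetical helper: gather the items of several RSS documents into one list of rows.
function collect_rows(array $documents): array
{
    $rows = [];
    // Outer loop: one pass per feed. Inner loop: one pass per item in that feed.
    foreach ($documents as $document) {
        $xml = simplexml_load_string($document);
        if ($xml === false) {
            continue; // skip feeds that fail to parse
        }
        foreach ($xml->channel->item as $item) {
            $rows[] = [
                'title'       => (string) $item->title,
                'description' => trim((string) $item->description),
                'link'        => (string) $item->link,
                // Element names are case-sensitive: pubDate, not pubdate.
                'pubdate'     => (string) $item->pubDate,
            ];
        }
    }
    return $rows;
}

// Two made-up feeds standing in for the real URLs.
$feedA = '<rss><channel><item><title>First</title><description> One </description>'
       . '<link>http://example.com/1</link><pubDate>Mon, 07 Jul 2008 10:00:00 GMT</pubDate></item></channel></rss>';
$feedB = '<rss><channel><item><title>Second</title><description>Two</description>'
       . '<link>http://example.com/2</link><pubDate>Tue, 08 Jul 2008 10:00:00 GMT</pubDate></item></channel></rss>';

print_r(collect_rows([$feedA, $feedB]));
```

With live feeds, simplexml_load_file($feed) would take the place of simplexml_load_string($document), and each row could then be written to the database.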
