RSS to HTML with varying number of elements - php

I've adapted the code found here http://www.w3schools.com/php/php_ajax_rss_reader.asp to turn XML into HTML, and it works fine.
But what I'm stuck on is getting it to show all the items in a feed when the feed can have varying numbers of items. The feed is published daily and can have anywhere from 12-20 articles in it, and I want to show all of them.
In the For Loop for ($i=0; $i<=12; $i++) if I set the condition to be greater than the number of articles, I get an error PHP Fatal error: Call to a member function getElementsByTagName(), so I can't just set it to a big number.
I get the same error if I just remove the condition.
I can't figure out how to count the number of items, either; if I could do that the solution would be easy.
The feed is created in-house so I could ask my colleague to insert the number of items in the feed; is that the best way to go about it?
Thanks!

If you don't know the number of items in the feed, you can go through them all using a foreach loop. Here is an example using the RSS feed from the PHP tag on StackOverflow. Have a look at the rss format so you can see what each entry looks like, and compare it to the code below.
# start off like the w3schools code...
$xml=("https://stackoverflow.com/feeds/tag?tagnames=php&sort=newest");
$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);
# StackOverflow uses the <entry> element for each separate item.
# Find all the "entry" elements. This returns a DOMNodeList of matching <entry> elements.
$items = $xmlDoc->getElementsByTagName('entry');
# Go through the list of <entry> elements one at a time.
# $items is the list of <entry> elements
# $i is set to each <entry> in turn, starting from the first one in the feed
foreach ($items as $i) {
# some sample code to get the title, tags, and link
$title = $i->getElementsByTagName('title')->item(0)->nodeValue;
$href = $i->getElementsByTagName('link')->item(0)->getAttribute('href');
$tags = $i->getElementsByTagName('category');
$tag_arr = [];
foreach ($tags as $t) {
$tag_arr[] = $t->getAttribute('term');
}
echo "Title: $title; tags: " . implode(", ", $tag_arr) . ";\nhref: $href\n\n";
}
Using a foreach loop means you don't have to work out how many items the list holds, and you don't have to guess an upper bound for a counter loop such as for ($i = 0; $i < 500; $i++).
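The question also asked how to count the items. Here is a minimal sketch, assuming a small inline RSS 2.0 document (the $rss string and its three items are made up for illustration and stand in for the in-house feed). DOMNodeList exposes a length property, and item($n) returns null once $n runs past the end, which is what leads to the "Call to a member function" fatal error with a fixed loop bound.

```php
<?php
// Hypothetical inline feed standing in for the in-house RSS file.
$rss = <<<XML
<rss version="2.0"><channel>
  <item><title>First</title><link>http://example.com/1</link></item>
  <item><title>Second</title><link>http://example.com/2</link></item>
  <item><title>Third</title><link>http://example.com/3</link></item>
</channel></rss>
XML;

$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($rss);

$items = $xmlDoc->getElementsByTagName('item');

// DOMNodeList::$length holds the number of matched nodes.
$count = $items->length;
echo "Feed has $count items\n";

// A counted loop bounded by the real length never runs past the end.
$titles = [];
for ($i = 0; $i < $count; $i++) {
    $titles[] = $items->item($i)->getElementsByTagName('title')->item(0)->nodeValue;
}
echo implode(", ", $titles), "\n";
```

With the count available from the list itself, there should be no need to ask for an extra "number of items" element in the feed.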

Related

Matching results count when there's not a 1:1 ratio of results

I'm scraping the contents of a web page by extracting values from divs using XPath selectors. The number of parent divs ($listing) is always the same. Each div usually contains both a $titlehit with a title like 'Guitar' and a $pricehit with a price like '199'. The problem is that these don't come in a 1:1 ratio: there can be 31 results in $titlehit and 29 in $pricehit. This makes the scraped data incorrect once it hits a listing that has a title but no price. Is there some way to lock the count so that each title and its price are added under the same key in the array $list?
Below is my code:
$listing = $xpath->query("//div[@class='cell-group listing']");
$titlehit = $xpath->query("//span[@class='mp-listing-title']");
$pricehit = $xpath->query("//span[@class='price-new']");
$i = 0;
foreach ($listing as $listings[$i]) {
$titles = $titlehit[$i];
$prices = $pricehit[$i];
$list[]=['title'=>$titlehit[$i]->nodeValue,
'price'=>trim($pricehit[$i]->nodeValue)];
$i++;
}
I've tried various if statements to add only entries with both a title and a price to the array $list, without success, like this:
if (!empty($titlehit[$i]->nodeValue) && !empty($pricehit[$i]->nodeValue))
The problem appears to be related to the count.
Any help is much appreciated.
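One possible approach, sketched under assumptions: instead of running three document-wide queries and pairing the results by index, query the title and price relative to each listing div, so a listing without a price cannot shift the later ones. The $html sample below is invented to mirror the class names in the question; the real page markup may differ.

```php
<?php
// Hypothetical markup: the second listing has a title but no price.
$html = <<<HTML
<div class="cell-group listing"><span class="mp-listing-title">Guitar</span><span class="price-new"> 199 </span></div>
<div class="cell-group listing"><span class="mp-listing-title">Drums</span></div>
<div class="cell-group listing"><span class="mp-listing-title">Bass</span><span class="price-new">249</span></div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$list = [];
foreach ($xpath->query("//div[@class='cell-group listing']") as $listing) {
    // The second argument is the context node, so ".//" searches inside this listing only.
    // XPath string() yields an empty string when nothing matches.
    $title = trim($xpath->evaluate("string(.//span[@class='mp-listing-title'])", $listing));
    $price = trim($xpath->evaluate("string(.//span[@class='price-new'])", $listing));

    // Keep only listings that have both values.
    if ($title !== '' && $price !== '') {
        $list[] = ['title' => $title, 'price' => $price];
    }
}
print_r($list);
```

Because each title and price is read from the same parent node, the two can never come from different listings.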

Getting name spaces out of XML from simplexml_load_file in php

I am trying to parse this YouTube XML using simplexml_load_file in php.
The XML feed can be found here:
https://www.youtube.com/feeds/videos.xml?playlist_id=PL1mm1FfX5EHRjGyoBpEXBRIGAmCNt8pBT
Below in php I am trying to iterate through the media groups nested inside each entry node.
<?php
$xmlFeed=simplexml_load_file('https://www.youtube.com/feeds/videos.xml?playlist_id=PL1mm1FfX5EHRjGyoBpEXBRIGAmCNt8pBT')
or die("Cannot load YouTube video feed, please try again later.");
foreach ($xmlFeed->entry->children('media', true)->group as $video) {
echo $video->title;
echo $video->description;
echo $video->thumbnail->getNameSpaces(true);
}
?>
Title and description print just fine. But I'm trying to get at the thumbnail URL found in this namespace:
<media:thumbnail url="https://i1.ytimg.com/vi/HEYQXVGnwXc/hqdefault.jpg" width="480" height="360"/>
I've tried all 3 of the following:
echo $video->thumbnail->getNameSpaces(true);
echo $video->thumbnail->getNameSpaces(true)['url'];
echo $video->thumbnail->getNameSpaces(true)->url;
None return the url. The first returns Array and the last two are blank. What am I missing?
Several things. First, you have to use the attributes() function, since url is an attribute of thumbnail rather than a child element. Second, you don't need to call getNameSpaces(true), since the media namespace prefix is already selected in the foreach loop. Finally, you do not iterate across all media:group elements. Right now you will get only the values from the first <entry> node, not from each one. Therefore you need to add an outer loop, one that iterates over every <entry> node.
$attr = 'url';
for($i = 0; $i < sizeof($xmlFeed->entry); $i++) {
foreach ($xmlFeed->entry[$i]->children('media', true)->group as $video) {
echo $video->title."\n";
echo $video->description."\n";
echo $video->thumbnail->attributes()->$attr."\n";
}
}
XPATH Alternative
Even further, you could have handled your needs in XPath by simply registering the media namespace and querying to exact locations, iterating of course across each set:
$xmlFeed->registerXPathNamespace('media', 'http://search.yahoo.com/mrss/');
// ARRAYS TO HOLD XML VALUES
$videos = $xmlFeed->xpath('//media:group');
$title = $xmlFeed->xpath('//media:group/media:title');
$description = $xmlFeed->xpath('//media:group/media:description');
$url = $xmlFeed->xpath('//media:group/media:thumbnail/@url');
// ITERATING THROUGH EACH ARRAY
for($i = 0; $i < sizeof($videos); $i++) {
echo $title[$i]."\n";
echo $description[$i]."\n";
echo $url[$i]."\n";
}
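As a self-contained variation on the nested-loop answer, the outer loop over <entry> nodes can also be written as a foreach. This is only a sketch: the $feed string is a made-up two-entry stand-in for the YouTube feed, using the same Atom and media namespace URIs.

```php
<?php
// Hypothetical miniature feed with the same namespaces as the YouTube XML.
$feed = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <media:group>
      <media:title>Video one</media:title>
      <media:thumbnail url="https://example.com/one.jpg" width="480" height="360"/>
    </media:group>
  </entry>
  <entry>
    <media:group>
      <media:title>Video two</media:title>
      <media:thumbnail url="https://example.com/two.jpg" width="480" height="360"/>
    </media:group>
  </entry>
</feed>
XML;

$xmlFeed = simplexml_load_string($feed);

$thumbs = [];
foreach ($xmlFeed->entry as $entry) {
    // Select children by the "media" prefix (second argument true = treat as prefix).
    $group = $entry->children('media', true)->group;

    // The url attribute itself carries no prefix, so plain attributes() exposes it.
    $thumbs[(string) $group->title] = (string) $group->thumbnail->attributes()->url;
}
print_r($thumbs);
```

Casting to string turns each SimpleXMLElement into a plain value that is safe to use as an array key or to store.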

Pulling SimpleXML data Across a Page

I'm sure there's a pretty obvious solution to this problem...but it's eluding me.
I've got an XML feed that I want to pull information from, but only from items with a specific ID. Let's say we have the following XML:
<XML>
<item>
<name>John</name>
<p:id>1</id>
<p:eye>Blue</eye>
<p:hair>Black</hair>
</item>
<item>
<name>Jake</name>
<p:id>2</id>
<p:eye>Hazel</eye>
<p:hair>White</hair>
</item>
<item>
<name>Amy</name>
<p:id>3</id>
<p:eye>Brown</eye>
<p:hair>Yellow</hair>
</item>
<item>
<name>Tammy</name>
<p:id>4</id>
<p:eye>Blue</eye>
<p:hair>Black</hair>
</item>
<item>
<name>Blake</name>
<p:id>5</id>
<p:eye>Green</eye>
<p:hair>Red</hair>
</item>
</xml>
And I want to pull ONLY people with the ID 3 and 1 into specific spots on a page (there will be no double IDs; each item has a unique ID). Using SimpleXML and a for loop I can easily display each ITEM on a page using PHP, and with some "if ($item->{'id'} == #)" statements (where # is the ID I'm looking for) I can also display the info for each ID I'm looking for.
The problem I'm running into is how to distribute the information across the page. I'm trying to pull the information into specific spots on a page. My first attempt at distributing the specific fields across the page, shown below, isn't working:
<html>
<head><title>.</title></head>
<body>
<?php
(SimpleXML code / For Loop for each element here...)
?>
<H1>Staff Profiles</h1>
<h4>Maintenance</h4>
<p>Maintenance staff does a lot of work! Meet your super maintenance staff:</p>
<?php
if($ID == 1) {
echo "Name:".$name."<br/>";
echo "Eye Color:".$eye."<br/>";
echo "Hair Color:".$hair."<br/>";
?>
<h4>Receptionists</h4>
<p>Always a smiling face - meet them here:</p>
<?php
if($ID == 3) {
echo "Name:".$name."<br/>";
echo "Eye Color:".$eye."<br/>";
echo "Hair Color:".$hair."<br/>";
?>
<H4>The ENd</h4>
<?php (closing the for loop) ?>
</body>
</html>
But it's not working - it randomly starts repeating elements on my page (not even the XML elements). My method is probably pretty...rudimentary; so a point in the right direction is much appreciated. Any advice?
EDIT:
New (NEW) XPATH code:
$count = 0;
foreach ($sxe->xpath('//item') as $item) {
$item->registerXPathNamespace('p', 'http://www.example.com/this');
$id = $item->xpath('//p:id');
echo $id[$count] . "\n";
echo $item->name . "<br />";
$count++;
}
Use XPath to accomplish this, and write a small function to retrieve a person by ID.
function getPerson($id, $xml) {
return $xml->xpath("//item[id='$id']")[0]; // PHP >= 5.4 required
}
$xml = simplexml_load_string($x); // assume XML in $x
Now, you can (example 1):
echo getPerson(5, $xml)->name;
Output:
Blake
or (example 2):
$a = getPerson(2, $xml);
echo "$a->name has $a->eye eyes and $a->hair hair.";
Output:
Jake has Hazel eyes and White hair.
see it working: http://codepad.viper-7.com/SwLids
EDIT In your HTML, this would probably look like this:
...
<h1>Staff Profiles</h1>
<h4>Maintenance</h4>
<p>Maintenance staff does a lot of work! Meet your super maintenance staff:</p>
<?php
$p = getPerson(4, $xml);
echo "Name: $p->name <br />";
echo "Eye Color: $p->eye <br />";
echo "Hair Color: $p->hair <br />";
?>
no looping required, though.
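A slightly more defensive variant of the lookup can be sketched as follows. Assumptions: the element names carry no namespace prefix (unlike the p: prefix in the question's sample), the <staff> root and the two sample items are invented, and findPerson is a hypothetical name. It returns null when no item has the requested ID, rather than failing on a missing [0] index.

```php
<?php
// Hypothetical un-prefixed sample data.
$x = <<<XML
<staff>
  <item><name>John</name><id>1</id><eye>Blue</eye><hair>Black</hair></item>
  <item><name>Amy</name><id>3</id><eye>Brown</eye><hair>Yellow</hair></item>
</staff>
XML;

// Return the first <item> whose <id> child equals $id, or null if none matches.
function findPerson(SimpleXMLElement $xml, int $id): ?SimpleXMLElement
{
    $matches = $xml->xpath("//item[id='$id']");
    return $matches ? $matches[0] : null;
}

$xml = simplexml_load_string($x);

$amy = findPerson($xml, 3);
$missing = findPerson($xml, 42);

echo $amy === null ? "No such person\n" : "{$amy->name} has {$amy->eye} eyes\n";
echo $missing === null ? "No person with ID 42\n" : "{$missing->name}\n";
```

Checking for null at each spot on the page keeps one missing ID from breaking the rest of the output.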
First thing that popped into my mind is to use a numerical offset (which is zero-based in SimpleXML), as there is a strict correlation between the offset and the ID: the offset is always the ID minus one:
$items = $xml->item;
$id = 3;
$person = $items[$id - 1];
echo $person->id, "\n"; // prints "3"
But that would work only if - and only if - the first element has ID 1 and each following element has an ID one higher than its previous sibling.
Which we could just assume by the sample XML given, however, I somewhat guess this is not the case. So the next thing that can be done is to still use the offset but this time create a map between IDs and offsets:
$items = $xml->item;
$offset = 0;
$idMap = [];
foreach ($items as $item) {
$idMap[(string) $item->id] = $offset; // cast: a SimpleXMLElement cannot be used as an array key
$offset++;
}
With that new $idMap map, you then can get each item based on the ID:
$id = 3;
$person = $items[$idMap[$id]];
Such a map is useful in case you know that you need that more than once, because creating the map is somewhat extra work you need to do.
So let's see if there ain't something built-in that solves the issue already. Maybe there is some code out there that shows how to find an element in simplexml with a specific attribute value?
SimpleXML: Selecting Elements Which Have A Certain Attribute Value (Reference Question)
Read and take value of XML attributes - Especially because of the answer on how to add the functionality to SimpleXMLElement transparently.
Which leads to the point you could do it as outlined in that answer that shows how it works transparently like this:
$person = $items->attribute("id", $id);
I hope this is helpful.

How can I parse HTML in batches using xpath [PHP]?

I tried all sorts of things but couldn't find a solution.
I want to retrieve elements from html code using xpath in php.
Ex:
<div class='student'>
<div class='name'>Michael</div>
<div class='age'>26</div>
</div>
<div class='student'>
<div class='name'>Joseph</div>
<div class='age'>27</div>
</div>
I want to retrieve the information and put them in an array as follows:
$student[0]['name'] = 'Michael';
$student[0]['age'] = 26;
$student[1]['name'] = 'Joseph';
$student[1]['age'] = 27;
In other words, I want the matching ages to stay with the names.
I tried the following:
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpathDom = new DomXPath($dom);
$homepostcontentNodes = $xpathDom->query("//*[contains(@class, 'student')]//*[contains(@class, 'name')]");
However, this only grabs the 'name' nodes.
How can I get the matching age nodes?
Of course it is only grabbing the nodes name - you are telling it to!
What you will need to do is in two steps:
Pick out all the student nodes
For each student node, pick out the columns
This is a pretty standard step in linearization of data, and the XPath queries are simple:
Step 1
You pretty much have it:
$studentNodes = $xpathDom->query("//div[contains(@class, 'student')]");
This will return all your student nodes.
Step 2
This is where the magic happens. We have our nodes, and we can loop through them (DOMNodeList is Traversable, so we can foreach-loop through it). What we need to figure out is how to find each node's children...
...Oh wait. DOMNode implements a method called getNodePath which returns the full, direct XPath path to the node. This allows us to then simply append /div to get all the div elements that are direct children of the node!
Another quick foreach, and we get this code:
$studentNodes = $xpathDom->query("//div[contains(@class, 'student')]");
$result = array();
foreach ($studentNodes as $v) {
// Child nodes: student
$r = array();
$columns = $xpathDom->query($v->getNodePath()."/div");
foreach ($columns as $v2) {
// attributes lets me read the 'class' attribute of the node. Bit clunky; $v2->getAttribute("class") would also work
$r[$v2->attributes->getNamedItem("class")->textContent] = $v2->textContent;
}
$result[] = $r;
}
var_dump($result);
Full fiddle: http://codepad.viper-7.com/t868Wh
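As an alternative to rebuilding a path with getNodePath(), DOMXPath::query() accepts a context node as its second argument, so the inner query can be written relative to each student node. A minimal sketch using the HTML from the question (variable names here are illustrative):

```php
<?php
$html = <<<HTML
<div class='student'>
  <div class='name'>Michael</div>
  <div class='age'>26</div>
</div>
<div class='student'>
  <div class='name'>Joseph</div>
  <div class='age'>27</div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$students = [];
foreach ($xpath->query("//div[@class='student']") as $studentNode) {
    $row = [];
    // "div" with a context node means: direct child divs of this student only.
    foreach ($xpath->query("div", $studentNode) as $field) {
        // Use the class name ("name", "age") as the array key.
        $row[$field->getAttribute('class')] = $field->textContent;
    }
    $students[] = $row;
}
print_r($students);
```

Each name stays paired with its age because both are read from the same parent node.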

How can I skip first item from Twitter Status feed, display next 4 items with PHP?

I am setting up a series of Twitter feed displays on one page. One shows the MOST RECENT status, in a particular fashion. The other (I am hoping) will show the next 4 statuses, while NOT including the most recent status. Here is part of the code that I think needs attention in order for this idea to work out:
$rss = file_get_contents('https://api.twitter.com/1/statuses/user_timeline.rss?screen_name='.$twitter_user_id);
if($rss) {
// Parse the RSS feed to an XML object.
$xml = simplexml_load_string($rss);
if($xml !== false) {
// Error check: Make sure there is at least one item.
if (count($xml->channel->item)) {
$tweet_count = 0;
// Start output buffering.
ob_start();
// Open the twitter wrapping element.
$twitter_html = $twitter_wrap_open;
// Iterate over tweets.
foreach($xml->channel->item as $tweet) {
Here is the website which has lent me the code for this task:
< Pixel Acres - Display recent Twitter tweets using PHP >
Your foreach loop goes over each item in the feed. You want to skip certain elements based on their position in the feed. Note that the keys SimpleXML hands to foreach are element names ("item"), not numeric positions, so keep your own counter and add an if at the top of the loop body:
$i = -1;
foreach($xml->channel->item as $tweet) {
$i++;
if ($i == 0 || $i > 4)
continue;
I used an alternate method to solve the issue I was having. It included using a string replace on the latest tweet's URL to obtain the Tweet ID, which then allowed me to query tweets using (Tweet ID - 1) as the max_id term.
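Another way to select "everything after the first, up to four items" is an XPath position() filter, which avoids any counter. This is a swapped-in technique rather than what either answer above used. The feed below is generated inline with invented titles, because it only stands in for the Twitter RSS response.

```php
<?php
// Build a hypothetical feed with seven items, newest first.
$itemsXml = '';
for ($n = 1; $n <= 7; $n++) {
    $itemsXml .= "<item><title>tweet $n</title></item>";
}
$xml = simplexml_load_string("<rss version=\"2.0\"><channel>$itemsXml</channel></rss>");

// XPath positions are 1-based: skip item 1, keep items 2 through 5.
$next_four = $xml->xpath('/rss/channel/item[position() > 1 and position() <= 5]');

$shown = [];
foreach ($next_four as $tweet) {
    $shown[] = (string) $tweet->title;
}
echo implode("\n", $shown), "\n";
```

If the feed has fewer than five items, the filter simply returns whatever is available after the first.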
