Parsing nested XML using PHP

Here's the XML which I've been trying to parse for a while, but I'm stuck on the nested elements.
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:course="https://www.example.org/api/course/elements/1.0/" xmlns:staff="https://www.example.org/api/staff/elements/1.0/" version="2.0">
<channel>
<item>
<title>example.org course feed</title>
<link>https://www.example.org/api/v2/report/course-feed/rss</link>
<description>.org - course catalog feed</description>
<language>en</language>
<course:instructors>
<course:staff>
<staff:name>Mark Moran</staff:name>
</course:staff>
</course:instructors>
</item>
</channel>
</rss>
How do I parse course:instructors? My PHP code is:
$rss = simplexml_load_file('https://www.edx.org/api/v2/report/course-feed/rss');
$namespaces = $rss->getNamespaces(true);
foreach ($rss->channel->item as $item) {
$title = $item->title ;
}
EDIT:2
$rss = simplexml_load_file('https://www.example.org/api/v2/report/course-feed/rss');
$namespaces = $rss->getNamespaces(true);//Add this line
foreach ($rss->channel->item as $item) {
$course_title = $item->title ;
$course_description = $item->description;
$course_url = $item->link;
$course = $item->children($namespaces['course']);
$course_thumbnail_url = $course->{'image-thumbnail'};
$course_banner_url = $course->{'image-banner'};
$course_teaser_url = $course->{'video-youtube'};
$course_start_date = $course->start;
$course_duration = $course->length;
$instructors = $item->children('course',true)->instructors;
$staff = $instructors->children('course',true)->staff;
$instructor_name = $staff->children('staff',true)->name;
$instructor_image = $staff->children('staff',true)->image;
echo $instructor_name.' '.$instructor_image,"<br>";
$course_price = 0;
$course_provider_id = 3;
$course_affiliates = $course->school;
$categories = $course->subject;
$categories = explode(',', $categories);
$c = count($categories);
$i = 0;
while($i < $c)
{
$course_rating = mt_rand(35, 50) / 10; // mt_rand() takes integers, so scale to get 3.5 to 5.0
$course_category = $categories[$i];
if(mysqli_query($conn,"INSERT into course_catalog_table (course_title,course_description,course_url,course_thumbnail_url,course_banner_url,course_teaser_url,course_category,course_start_date,course_duration,course_rating,course_affiliates,course_instructor,course_instructor_image,course_price,course_provider_id) VALUES('$course_title','$course_description','$course_url','$course_thumbnail_url','$course_banner_url','$course_teaser_url','$course_category','$course_start_date','$course_duration','$course_rating','$course_affiliates','$instructor_name','$instructor_image','$course_price','$course_provider_id')"))
{
echo "successful\r\n";
}
$i++;
}
}
When I print $instructor_name and $instructor_image, it sometimes works, but sometimes it throws the warning "main(): Node no longer exists". How can I check whether a node is empty or not?

You can use the children() method to access child elements that belong to a namespace.
Do it like this:
$rss->channel->item->children('course',true)->instructors;
Read: http://php.net/manual/en/simplexmlelement.children.php
But since the XML has several levels of nesting, you need to chain several children() calls to reach the innermost element.
Here is the modified code to parse the XML you gave:
foreach ($rss->channel->item as $item)
{
$title = $item->title;
// access course:instructors nest
$instructors = $item->children('course',true)->instructors;
// then access the course:staff nest
$staff = $instructors->children('course',true)->staff;
// finally access the staff:name nest value
$name = $staff->children('staff',true)->name;
// print the value
echo "Staff Name: ". $name . "<br>";
}
Test run: https://eval.in/735810
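As for the "Node no longer exists" warning from the edit: it seems to show up for items that have no course:instructors or course:staff element, so one option is to check each level with isset() before descending. Below is a minimal sketch, assuming a feed shaped like the one in the question. The inline sample XML and the instructorName() helper are made up for illustration.

```php
<?php
// Sample feed (assumed shape): the second <item> has no <course:instructors>.
$xml = <<<'XML'
<rss xmlns:course="https://www.example.org/api/course/elements/1.0/" xmlns:staff="https://www.example.org/api/staff/elements/1.0/" version="2.0">
  <channel>
    <item>
      <title>With instructor</title>
      <course:instructors>
        <course:staff>
          <staff:name>Mark Moran</staff:name>
        </course:staff>
      </course:instructors>
    </item>
    <item>
      <title>Without instructor</title>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: returns the first instructor's name,
// or an empty string when any level of the nesting is missing.
function instructorName(SimpleXMLElement $item): string
{
    $course = $item->children('course', true);
    if (!isset($course->instructors)) {
        return '';
    }
    $inner = $course->instructors->children('course', true);
    if (!isset($inner->staff)) {
        return '';
    }
    $staff = $inner->staff->children('staff', true);
    return isset($staff->name) ? (string) $staff->name : '';
}

$rss = simplexml_load_string($xml);
foreach ($rss->channel->item as $item) {
    echo $item->title, ': ', instructorName($item) ?: '(none)', "\n";
}
```

Casting to string also helps later on, because the values are otherwise SimpleXMLElement objects rather than plain strings.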

Related

Displaying XML feed content

I am trying to get the contents from an xml feed and display them in a list. My test below is just to get the job_title for a vacancy.
$feed = file_get_contents('https://www.jobs.nhs.uk/search_xml?client_id=120650');
$xml = simplexml_load_string($feed);
$items = $xml->nhs_search->vacancy_details;
foreach($items as $item) {
$job_title = $item->job_title;
echo $job_title;
}
Here is a snippet of the xml feed
<nhs_search>
<vacancy_details>
<id>915854585</id>
<job_title>Band 5 Speech and Language Therapist</job_title>
</vacancy_details>
</nhs_search>
Nothing is displayed and there are no errors.
This works fine:
$feed = file_get_contents('https://www.jobs.nhs.uk/search_xml?client_id=120650');
$xml = simplexml_load_string($feed);
$items = $xml->vacancy_details;
foreach ($items as $item) {
$id = $item->id;
$job_title = $item->job_title;
echo $id; echo '<br />';
echo $job_title;
}
I changed $items = $xml->nhs_search->vacancy_details; to $items = $xml->vacancy_details; because $xml already represents the root <nhs_search> element.
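A quick way to see why the extra level has to go is to ask the loaded object for its own name. This is only a sketch that uses the snippet from the question as an inline string instead of the live feed:

```php
<?php
// Inline copy of the snippet from the question (not the live feed).
$feed = <<<'XML'
<nhs_search>
  <vacancy_details>
    <id>915854585</id>
    <job_title>Band 5 Speech and Language Therapist</job_title>
  </vacancy_details>
</nhs_search>
XML;

$xml = simplexml_load_string($feed);

// The object returned by simplexml_load_string() is the root element itself.
$rootName = $xml->getName();

// So the root has no child called nhs_search, but it does have vacancy_details.
$hasNestedRoot = isset($xml->nhs_search);
$hasVacancies  = isset($xml->vacancy_details);

$titles = [];
foreach ($xml->vacancy_details as $item) {
    $titles[] = (string) $item->job_title;
}

echo $rootName, "\n";
echo implode("\n", $titles), "\n";
```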

Get enclosure img url from rss feed

I have a problem with an RSS feed in PHP. I want to get the image URL from "enclosure", but it's not working.
My code so far:
$rss = simplexml_load_file($url);
$i = 0;
if($rss)
{
$items = $rss->channel->item;
foreach($items as $item)
{
$title = $item->title;
$link = $item->link;
$published_on = $item->pubDate;
$phpDate = strtotime($published_on);
$enclosure = $item['enclosure'][0]['url'];
From the RSS:
<enclosure url="http://www.svenskafans.com/image/7/141433/Snalla-Pelle-stanna-i-Gefle.jpg" lenght="51265" type="image/jpeg" />
It is important to note that sometimes there is no enclosure tag, so the code must work even if it is missing.
Thanks!
Best Regards
Charles
What about:
$rss=simplexml_load_file('http://www.svenskafans.com/rss/team/77.aspx');
foreach ($rss->channel->item as $item) {
if (isset($item->enclosure)) {
echo $item->enclosure['url'].'<br>';
}
}
Outputs:
http://www.svenskafans.com/image/7/393988/Bilder-fran-tifot-for-Hugo-och-Bernhard.jpg
http://www.svenskafans.com/image/7/141433/Snalla-Pelle-stanna-i-Gefle.jpg
http://www.svenskafans.com/image/7/392527/Efter-Gefle-Elfsborg-En-skitmatch-i-regnet-gav-5-insikter.jpg
http://www.svenskafans.com/image/7/363552/Infor-Gefle-IF-IF-Elfsborg.jpg
http://www.svenskafans.com/image/7/211783/Gefles-Silly-Season-2013-2014-Angekeepern-Lloyd-Saxton-provtranar-med-Gefle.jpg
http://www.svenskafans.com/image/7/363058/Gefle-Panelen-17-Pensionera-Hugos-och-Bernhards-trojnummer.jpg
http://www.svenskafans.com/image/7/328214/Kungsbacksv-24-17-Hoppas-Hugo-satter-en-straff-mot-Elfsborg-i-89e-minuten.jpg
http://www.svenskafans.com/image/7/192682/Intervju-med-Daniel-Bernhardsson-Gefle-har-en-ljus-framtid.jpg
http://www.svenskafans.com/image/7/74875/Besked-idag-Bade-Bernhard-och-Hugo-spelar-sin-sista-match-i-Gefle-IF-pa-sondag.jpg
http://www.svenskafans.com/image/7/343968/Overraskande-piggt-Gefle-nar-Oremo-och-Jawo-natade.jpg
http://www.svenskafans.com/image/7/330399/Tack-AIK-nu-klart-till-100-att-Gefle-spelar-i-Allsvenskan-2014.jpg
http://www.svenskafans.com/image/7/363552/Rosta-fram-Gefles-MVP-2013.jpg
http://www.svenskafans.com/image/7/220468/Par-Asp-berattar-om-tiden-i-Gefle-roligaste-matchen-och-om-att-spela-med-Guidetti.jpg
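If you need the URL as a plain value and want a clear result for items without an enclosure, the check can be wrapped in a small function. This is a sketch only: enclosureUrl() is a made-up helper name and the inline feed is a cut-down stand-in for the real one.

```php
<?php
// Cut-down sample feed: the second item has no <enclosure>.
$feed = <<<'XML'
<rss version="2.0">
  <channel>
    <item>
      <title>With image</title>
      <enclosure url="http://www.svenskafans.com/image/7/141433/Snalla-Pelle-stanna-i-Gefle.jpg" type="image/jpeg" />
    </item>
    <item>
      <title>Without image</title>
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: returns the enclosure URL as a string,
// or null when the element or its url attribute is missing.
function enclosureUrl(SimpleXMLElement $item): ?string
{
    if (!isset($item->enclosure) || !isset($item->enclosure['url'])) {
        return null;
    }
    // Attributes are read with array syntax on the element, then cast to string.
    return (string) $item->enclosure['url'];
}

$rss = simplexml_load_string($feed);
$urls = [];
foreach ($rss->channel->item as $item) {
    $urls[] = enclosureUrl($item);
}

var_dump($urls);
```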

Searching XML tags with regex - PHP XPath

I have an XML document:
<product>
<item>
<item00>
<name>DVD</name>
</item00>
</item>
</product>
<product>
<item>
<item11>
<name>CD</name>
</item11>
</item>
</product>
I would like to show the names of these products, but the element wrapping each name is called "item00" in one product and "item11" in another.
I tried using regular expressions in the XPath expression, but without success.
Is it possible to display the names of these products (DVD and CD) using XPath?
<?php
$xml = 'file.xml';
$content = '';
$f = fopen($xml, 'r');
while($data = fread($f, filesize($xml))) {
$content.= $data;
}
fclose($f);
preg_match_all('/\<product\>(.*?)\<\/product\>/s', $content, $product);
$product = $product[1];
$doc = new SimpleXMLElement($content);
for($i = 0; $i <= count($product) - 1; $i++) {
// So far, no problems. Seriously.
// The issue starts here.
$query = $doc->xpath('/product/item/???');
foreach($query as $item) {
echo $item->name . '<br>';
}
}
?>
Where "???" is the problem with "item00" and "item11".
If anyone knows and can help me, I'll be very grateful!
Here is the complete working code:
<?php
$xml = 'file.xml';
$content = '';
$f = fopen($xml, 'r');
while($data = fread($f, filesize($xml))) {
$content.= $data;
}
fclose($f);
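// Wrap everything in a single root element, because the file has several top-level <product> elements
$content = "<root>$content</root>";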
$doc = new SimpleXmlElement($content);
$query = $doc->xpath('//item/child::*');
foreach($query as $item) {
echo $item->name . '<br>';
}
I don't think you can use a regex in that context. That's the very reason to use attributes:
<item num="00">
Those 00 and 11 suffixes really should be attributes.
However, check the code above. I believe it is what you are looking for.
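XPath 1.0, which is what SimpleXML's xpath() supports, has no regex functions, but starts-with() combined with name() gets close when the element names share a prefix. A minimal sketch, assuming every wrapper element name begins with "item" as in the question; the XML is inlined here instead of being read from file.xml:

```php
<?php
// Inline stand-in for the contents of file.xml.
$content = <<<'XML'
<product><item><item00><name>DVD</name></item00></item></product>
<product><item><item11><name>CD</name></item11></item></product>
XML;

// Several top-level elements are not well-formed XML, so wrap them in one root.
$doc = new SimpleXMLElement("<root>$content</root>");

// Select every child of <item> whose element name starts with "item".
$names = [];
foreach ($doc->xpath("//item/*[starts-with(name(), 'item')]") as $node) {
    $names[] = (string) $node->name;
}

echo implode('<br>', $names);
```

Compared with //item/child::*, the predicate skips any child of item whose name does not follow the itemNN pattern.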

Manipulating nodes in DomDocument and transform it into an array

I have an XML document that looks like this:
<?xml version="1.0" ?>
<rss version="2.0">
<channel>
<title>get_news_category</title>
<item>
<id>10502</id>
<title>Cheesecake</title>
<summary>SummaryBlahblah</summary>
</item>
<item>
<id>13236</id>
<title>Moto</title>
<summary>summary blahblah</summary>
</item>
I want to put the items into a PHP array.
This is what I've done so far:
$nodes = $dom->getElementsByTagName('item')->item(0);
$values = $nodes->getElementsByTagName("*");
$articles = array();
foreach ($values as $node) {
$articles[$node->nodeName] = $node->nodeValue;
}
var_dump($articles);
This only returns the first <item> element in the array, which makes sense because I asked for ->item(0).
So how do I select all the items and put them all into an array?
Thanks.
Use $nodes->length:
$dom = new DOMDocument();
$dom->loadXML($xml); // $xml is assumed to hold the XML string from the question
$nodes = $dom->getElementsByTagName('item');
$articles = array(); // initialise once, outside the loop, so earlier items are kept
for($i=0; $i<$nodes->length; $i++){
$values = $nodes->item($i)->getElementsByTagName("*");
foreach ($values as $node) {
$articles[$i][$node->nodeName] = $node->nodeValue;
}
}
var_dump($articles);
You need to iterate the $nodes.
$nodes = $dom->getElementsByTagName('item');
for ($i = 0; $i < $nodes->length; $i++)
{
// Let's grab the node
$values = $nodes->item($i)->getElementsByTagName("*");
}
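Putting the pieces together, a complete version might look like the sketch below. It assumes the XML from the question, completed here with the closing tags the snippet omits, and it uses foreach on the DOMNodeList instead of an index loop.

```php
<?php
// The XML from the question, with the closing tags added.
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <title>get_news_category</title>
    <item>
      <id>10502</id>
      <title>Cheesecake</title>
      <summary>SummaryBlahblah</summary>
    </item>
    <item>
      <id>13236</id>
      <title>Moto</title>
      <summary>summary blahblah</summary>
    </item>
  </channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$articles = [];
foreach ($dom->getElementsByTagName('item') as $item) {
    $article = [];
    foreach ($item->childNodes as $child) {
        // Skip the whitespace text nodes between elements.
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $article[$child->nodeName] = $child->nodeValue;
        }
    }
    $articles[] = $article;
}

print_r($articles);
```

Each entry of $articles is then an associative array with the keys id, title and summary.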

PHP: how to count XML elements in the object returned by simplexml_load_file()

I have inherited some PHP code (but I've little PHP experience) and can't find how to count some elements in the object returned by simplexml_load_file()
The code is something like this
$xml = simplexml_load_file($feed);
for ($x=0; $x<6; $x++) {
$title = $xml->channel[0]->item[$x]->title[0];
echo "<li>" . $title . "</li>\n";
}
It assumes there will be at least 6 <item> elements but sometimes there are fewer so I get warning messages in the output on my development system (though not on live).
How do I extract a count of <item> elements in $xml->channel[0]?
Here are several options, from my most to least favourite (of the ones provided).
One option is to make use of the SimpleXMLIterator in conjunction with LimitIterator.
$xml = simplexml_load_file($feed, 'SimpleXMLIterator');
$items = new LimitIterator($xml->channel->item, 0, 6);
foreach ($items as $item) {
echo "<li>{$item->title}</li>\n";
}
If that looks too scary, or not scary enough, then another is to throw XPath into the mix.
$xml = simplexml_load_file($feed);
$items = $xml->xpath('/rss/channel/item[position() <= 6]');
foreach ($items as $item) {
echo "<li>{$item->title}</li>\n";
}
Finally, with little change to your existing code, there is also:
$xml = simplexml_load_file($feed);
for ($x=0; $x<6; $x++) {
// Break out of loop if no more items
if (!isset($xml->channel[0]->item[$x])) {
break;
}
$title = $xml->channel[0]->item[$x]->title[0];
echo "<li>" . $title . "</li>\n";
}
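The easiest way is to use SimpleXMLElement::count() on the <item> list:
$xml = simplexml_load_file($feed);
// Count only the <item> elements (channel[0]->count() would count every child of <channel>) and cap at 6
$num = min(6, $xml->channel[0]->item->count());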
for ($x=0; $x<$num; $x++) {
$title = $xml->channel[0]->item[$x]->title[0];
echo "<li>" . $title . "</li>\n";
}
Also note that $xml->channel[0]->item is a SimpleXMLElement object. This class implements the Traversable interface, so we can use it directly in a foreach loop:
$xml = simplexml_load_file($feed);
foreach ($xml->channel[0]->item as $item) {
$title = $item->title[0];
echo "<li>" . $title . "</li>\n";
}
You can get the number of child elements with count($xml).
I always do it like this:
$xml = simplexml_load_file($feed);
foreach($xml as $key => $one_row) {
echo $one_row->some_xml_child;
}
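To tie the counting approaches together, here is a small sketch that counts only the <item> elements and caps the loop at 6. The inline feed is a made-up sample with three items, standing in for simplexml_load_file($feed).

```php
<?php
// Made-up sample feed with fewer than 6 items.
$feed = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item><title>One</title></item>
    <item><title>Two</title></item>
    <item><title>Three</title></item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($feed);

// count() on the item list gives the number of <item> elements only.
$itemCount = count($xml->channel->item);

// Never loop past the items that actually exist.
$limit = min(6, $itemCount);

$titles = [];
for ($x = 0; $x < $limit; $x++) {
    $titles[] = (string) $xml->channel->item[$x]->title;
}

foreach ($titles as $title) {
    echo "<li>" . $title . "</li>\n";
}
```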
