Manipulating nodes in DOMDocument and transforming them into an array - php

I have an XML document that looks like this:
<?xml version="1.0" ?>
<rss version="2.0">
<channel>
<title>get_news_category</title>
<item>
<id>10502</id>
<title>Cheesecake</title>
<summary>SummaryBlahblah</summary>
</item>
<item>
<id>13236</id>
<title>Moto</title>
<summary>summary blahblah</summary>
</item>
I want to put the items into a PHP array.
So far I have:
$nodes = $dom->getElementsByTagName('item')->item(0);
$values = $nodes->getElementsByTagName("*");
$articles = array();
foreach ($values as $node) {
$articles[$node->nodeName] = $node->nodeValue;
}
var_dump($articles);
This returns only the first <item> element in the array, which makes sense because I asked for ->item(0).
So how do I select all the items and put every one of them into the array?
Thanks.

Use $nodes->length to loop over every <item>:
$dom = new DOMDocument();
$dom->loadXML($xml); // the input is XML, so loadXML() rather than loadHTML()
$nodes = $dom->getElementsByTagName('item');
$articles = array(); // initialise once, outside the loop, so earlier items are kept
for ($i = 0; $i < $nodes->length; $i++) {
    $values = $nodes->item($i)->getElementsByTagName("*");
    foreach ($values as $node) {
        $articles[$i][$node->nodeName] = $node->nodeValue;
    }
}
var_dump($articles);

You need to iterate the $nodes.
$nodes = $dom->getElementsByTagName('item');
for ($i = 0; $i < $nodes->length; $i++)
{
// Let's grab the node
$values = $nodes->item($i)->getElementsByTagName("*");
}
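Putting the two answers together, a minimal self-contained sketch could look like the following. The inline $xml string is a hypothetical copy of the feed from the question (with the closing tags added), and the nodeType check is my own addition to skip whitespace text nodes between elements.

```php
<?php
// Hypothetical sample data modelled on the feed in the question.
$xml = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>get_news_category</title>
    <item><id>10502</id><title>Cheesecake</title><summary>SummaryBlahblah</summary></item>
    <item><id>13236</id><title>Moto</title><summary>summary blahblah</summary></item>
  </channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$articles = [];
// A DOMNodeList can be iterated directly with foreach.
foreach ($dom->getElementsByTagName('item') as $item) {
    $row = [];
    foreach ($item->childNodes as $child) {
        // Keep element children only; skip whitespace text nodes.
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $row[$child->nodeName] = $child->nodeValue;
        }
    }
    $articles[] = $row;
}

print_r($articles);
```

Each entry of $articles is then one associative array per <item>, keyed by id, title and summary.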

Related

How to return full set of child nodes based on search of XML file

I am trying to search an XML file of the following structure:
<Root>
<Record>
<Filenumber>12314123</Filenumber>
<StatusEN>Closed</StatusEN>
<StatusDate>02 Nov 2019</StatusDate>
</Record>
<Record>
<Filenumber>0678672301</Filenumber>
<StatusEN>Closed</StatusEN>
<StatusDate>02 Nov 2019</StatusDate>
</Record>
</Root>
I want to search based on the filenumber, but return all 3 nodes and values for the match.
This is what I am trying:
$q = '12314123';
$file = "status.xml";
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->Load($file);
$xpath = new DOMXPath($doc);
$query = "/Root/Record/Filenumber[contains(text(), '$q')]";
$entries = $xpath->query($query);
foreach ($entries as $entry) {
echo $entry->parentNode->nodeValue ;
}
This seems to return all the values I want but in one single string. How can I return them as separate variables or even better, in an array or JSON?
Neither DOMNodeList nor DOMElement knows how to turn itself into an array, so we have to build the array by hand:
foreach ($entries as $entry) {
$result = [];
foreach ($entry->parentNode->childNodes as $node) {
$result[$node->nodeName] = $node->nodeValue;
}
var_dump($result);
}
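If JSON is the goal, the same loop can collect one array per matching record and hand the result to json_encode(). This is only a sketch: the inline $xml stands in for status.xml, and the predicate form of the XPath (selecting the Record rather than its Filenumber child) is my own variation, not part of the answer above.

```php
<?php
// Hypothetical inline copy of status.xml from the question.
$xml = <<<XML
<Root>
  <Record><Filenumber>12314123</Filenumber><StatusEN>Closed</StatusEN><StatusDate>02 Nov 2019</StatusDate></Record>
  <Record><Filenumber>0678672301</Filenumber><StatusEN>Closed</StatusEN><StatusDate>02 Nov 2019</StatusDate></Record>
</Root>
XML;

$q = '12314123';

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Select the matching <Record> elements themselves instead of walking up via parentNode.
// $q is interpolated as-is, so this assumes it contains no single quote.
$records = [];
foreach ($xpath->query("/Root/Record[Filenumber[contains(text(), '$q')]]") as $record) {
    $row = [];
    foreach ($record->childNodes as $node) {
        if ($node->nodeType === XML_ELEMENT_NODE) {
            $row[$node->nodeName] = $node->nodeValue;
        }
    }
    $records[] = $row;
}

$json = json_encode($records);
echo $json, "\n";
```

Because contains() is a substring test, a shorter search string may match more than one record, which is why the result is collected as a list.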

Finding the value of nodes using XML DOM in PHP

I need to extract information from an XML using XMLDom.
Below is myroot.xml
<?xml version='1.0' encoding='ISO-8859-1'?>
<myroot xml:lang='en'>
<delta>
<history>
<detail>
<id>one</id>
<degree>
<dname>alpha</dname>
<dates>
<StartDate>
<Year>1998</Year>
</StartDate>
<EndDate>
<Year>2002</Year>
</EndDate>
</dates>
</degree>
</detail>
<detail>
<id>two</id>
<degree>
<dname>beta</dname>
<dates>
<StartDate>
<Year>2006</Year>
</StartDate>
<EndDate>
<Year>2008</Year>
</EndDate>
</dates>
</degree>
</detail>
</history>
</delta>
Here is my code:
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$rootxmldoc = $doc->load('myroot.xml');
$xpath = new DOMXPath($rootxmldoc);
$items = $hrxml_obj->getElementsByTagName("detail");
$subitemarray = array();
$icounter = 0;
foreach ($items as $item) {
$query = "//dates/*/Year"; //xpath of all occurrence of Year
$entries = $xpath->query($query, $item);
foreach ($entries as $entry) {
$dates["startdate"] = "todo"; //extract StartDate
$dates["enddate"] = "todo"; //extract EndDate
}
$subitemarray[$icounter++] = dates;
}
var_dump($subitemarray);
Ideally I need to extract the dates using XPath, but I cannot get it to work. Any help is appreciated. The issue is the usage of XPath while looping.
With XPath, go directly to your dates tag, and then use DOMElement::getElementsByTagName() to get StartDate and EndDate (you can also reach the dates tag with DOMDocument::getElementsByTagName(), but XPath gives you more flexibility should you need it). This returns a DOMNodeList, but if the structure is constant you know that you only need the first element of the list. So:
// $xml omitted, saved in a variable for testing purposes
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$items = $doc->getElementsByTagName("detail");
$subitemarray = array();
$icounter = 0;
foreach ($items as $item) {
$query = ".//dates"; // leading dot: search only inside the current <detail>
$entries = $xpath->query($query, $item);
foreach ($entries as $entry) {
$startDate = $entry->getElementsByTagName("StartDate")[0]->nodeValue;
$endDate = $entry->getElementsByTagName("EndDate")[0]->nodeValue;
$dates["startdate"] = $startDate; //extract StartDate
$dates["enddate"] = $endDate; //extract EndDate
}
$subitemarray[$icounter++] = $dates;
}
var_dump($subitemarray);
Or only with XPath:
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$items = $doc->getElementsByTagName("detail");
$subitemarray = array();
$icounter = 0;
foreach ($items as $item) {
$queryStart = ".//dates/StartDate"; // leading dot keeps the query relative to $item
$entriesStart = $xpath->query($queryStart, $item);
$dates["startdate"] = $entriesStart[0]->nodeValue;
$queryEnd = ".//dates/EndDate";
$entriesEnd = $xpath->query($queryEnd, $item);
$dates["enddate"] = $entriesEnd[0]->nodeValue;
$subitemarray[$icounter++] = $dates;
}
var_dump($subitemarray);
And lastly, using only one XPath query:
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$items = $doc->getElementsByTagName("detail");
$subitemarray = array();
$icounter = 0;
foreach ($items as $item) {
$query = ".//dates/*[contains(local-name(), 'Date')]"; // relative to the current <detail>
$entries = $xpath->query($query, $item);
$dates["startdate"] = $entries[0]->nodeValue;
$dates["enddate"] = $entries[1]->nodeValue;
$subitemarray[$icounter++] = $dates;
}
var_dump($subitemarray);
The query simply gets any elements under dates, inside the current detail element, whose name contains the word "Date". Again, if the structure is constant, you can assume that the first result will be StartDate and the second result will be EndDate.
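One detail worth keeping in mind when running XPath inside a loop: an expression that starts with // searches the whole document even when a context node is passed, while a leading dot (.//) restricts it to the context node. The sketch below illustrates this with a trimmed, hypothetical copy of myroot.xml and uses evaluate() with string() so each result comes back as a plain PHP string.

```php
<?php
// Trimmed, hypothetical copy of myroot.xml from the question.
$xml = <<<XML
<myroot>
  <delta><history>
    <detail><id>one</id><degree><dname>alpha</dname><dates>
      <StartDate><Year>1998</Year></StartDate><EndDate><Year>2002</Year></EndDate>
    </dates></degree></detail>
    <detail><id>two</id><degree><dname>beta</dname><dates>
      <StartDate><Year>2006</Year></StartDate><EndDate><Year>2008</Year></EndDate>
    </dates></degree></detail>
  </history></delta>
</myroot>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

$subitemarray = [];
foreach ($doc->getElementsByTagName('detail') as $item) {
    $subitemarray[] = [
        // ".//" is evaluated relative to $item; string() returns the text of the first match.
        'startdate' => $xpath->evaluate('string(.//dates/StartDate/Year)', $item),
        'enddate'   => $xpath->evaluate('string(.//dates/EndDate/Year)', $item),
    ];
}

print_r($subitemarray);
```

With the relative form, each detail yields its own pair of years instead of repeating values found elsewhere in the document.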

Delete Node isn't working with Simple XML (PHP)

I want to delete a node if its title matches one of the entries in a filter array. I use unset(), and I have already tried it with both $node and $item, but neither deletes my node...
What is wrong with this code? I do enter the if condition, because I see "in if" in my console!
$dom = new DOMDocument('1.0', 'utf-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->load("shop1.xml");
$pathXML = "/products/product";
$titleArray = array("Test", "Battlefield 1");
$doc = simplexml_import_dom($dom);
$items = $doc->xpath($pathXML);
foreach ($items as $item) {
$node = dom_import_simplexml($item);
$title = $node->getElementsByTagName('title')->item(0)->textContent;
echo $title . "\n";
foreach ($titleArray as $titles) {
echo $titles . "\n";
if (mb_stripos($title, $titles) !== false) {
echo "in if\n\n";
unset($item);
}
}
}
$dom->saveXML();
$dom->save("shop1_2.xml");
XML File:
<products>
<product>
<title>Battlefield 1</title>
<url>https://www.google.de/</url>
<price>0.80</price>
</product>
<product>
<title>Battlefield 2</title>
<url>https://www.google.de/</url>
<price>180</price>
</product>
</products>
Greetings and Thank You!
All you're doing is unsetting a local variable. Instead you need to alter the DOM:
$dom = new DOMDocument('1.0', 'utf-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->load("shop1.xml");
$xpathQuery = "/products/product";
$titleArray = array("Test", "Battlefield 1");
$xp = new DomXpath($dom);
$items = $xp->query($xpathQuery);
foreach ($items as $item) {
$title = $item->getElementsByTagName('title')->item(0)->textContent;
echo "$title\n";
if (in_array($title, $titleArray)) {
$item->parentNode->removeChild($item);
}
}
$dom->saveXML();
$dom->save("shop1_2.xml");
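For reference, here is a self-contained sketch of the removal that keeps the substring matching from the question. The inline $xml stands in for shop1.xml, and I use plain stripos() instead of mb_stripos() only so the sketch does not depend on the mbstring extension. The node list returned by DOMXPath::query() is a static snapshot, so removing nodes while iterating over it is safe; the list returned by getElementsByTagName() is live and would shift under the loop.

```php
<?php
// Hypothetical inline copy of shop1.xml from the question.
$xml = <<<XML
<products>
  <product><title>Battlefield 1</title><url>https://www.google.de/</url><price>0.80</price></product>
  <product><title>Battlefield 2</title><url>https://www.google.de/</url><price>180</price></product>
</products>
XML;

$dom = new DOMDocument('1.0', 'utf-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->loadXML($xml);

$titleArray = ['Test', 'Battlefield 1'];
$xp = new DOMXPath($dom);

foreach ($xp->query('/products/product') as $product) {
    $title = $xp->evaluate('string(title)', $product);
    foreach ($titleArray as $filter) {
        // Case-insensitive substring match, as in the question.
        if (stripos($title, $filter) !== false) {
            $product->parentNode->removeChild($product);
            break; // node is gone, no need to test the remaining filters
        }
    }
}

// Collect the titles that survived the filter.
$remaining = [];
foreach ($dom->getElementsByTagName('title') as $t) {
    $remaining[] = $t->nodeValue;
}

echo $dom->saveXML();
```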

parsing nested xml using php

Here's the XML I have been trying to parse for a while, but I'm stuck on the nested elements.
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:course="https://www.example.org/api/course/elements/1.0/" xmlns:staff="https://www.example.org/api/staff/elements/1.0/" version="2.0">
<channel>
<item>
<title>example.org course feed</title>
<link>https://www.example.org/api/v2/report/course-feed/rss</link>
<description>.org - course catalog feed</description>
<language>en</language>
<course:instructors>
<course:staff>
<staff:name>Mark Moran</staff:name>
</course:staff>
</course:instructors>
</item>
</channel>
How do I parse course:instructors? My PHP code is:
$rss = simplexml_load_file('https://www.edx.org/api/v2/report/course-feed/rss');
$namespaces = $rss->getNamespaces(true);
foreach ($rss->channel->item as $item) {
$title = $item->title ;
}
EDIT:2
$rss = simplexml_load_file('https://www.example.org/api/v2/report/course-feed/rss');
$namespaces = $rss->getNamespaces(true);//Add this line
foreach ($rss->channel->item as $item) {
$course_title = $item->title ;
$course_description = $item->description;
$course_url = $item->link;
$course = $item->children($namespaces['course']);
$course_thumbnail_url = $course->{'image-thumbnail'};
$course_banner_url = $course->{'image-banner'};
$course_teaser_url = $course->{'video-youtube'};
$course_start_date = $course->start;
$course_duration = $course->length;
$instructors = $item->children('course',true)->instructors;
$staff = $instructors->children('course',true)->staff;
$instructor_name = $staff->children('staff',true)->name;
$instructor_image = $staff->children('staff',true)->image;
echo $instructor_name.' '.$instructor_image,"<br>";
$course_price = 0;
$course_provider_id = 3;
$course_affiliates = $course->school;
$categories = $course->subject;
$categories = explode(',', $categories);
$c = count($categories);
$i = 0;
while($i < $c)
{
$course_rating = mt_rand(3.5,5);
$course_category = $categories[$i];
if(mysqli_query($conn,"INSERT into course_catalog_table (course_title,course_description,course_url,course_thumbnail_url,course_banner_url,course_teaser_url,course_category,course_start_date,course_duration,course_rating,course_affiliates,course_instructor,course_instructor_image,course_price,course_provider_id) VALUES('$course_title','$course_description','$course_url','$course_thumbnail_url','$course_banner_url','$course_teaser_url','$course_category','$course_start_date','$course_duration','$course_rating','$course_affiliates','$instructor_name','$instructor_image','$course_price','$course_provider_id')"))
{
echo "successfull\r\n";
}
$i++;
}
}
When I print instructor_name and instructor_image, sometimes the values are printed, but sometimes I get the warning "main(): Node no longer exists". How can I check whether the node is empty or not?
You can use the children() function to access the child tree of the XML structure.
Do it like this:
$rss->channel->item->children('course',true)->instructors;
Read: http://php.net/manual/en/simplexmlelement.children.php
But since the XML has several levels of nesting, you need to call children() once per level to reach the deepest one.
Here is the modified code to parse the XML you gave:
foreach ($rss->channel->item as $item)
{
$title = $item->title;
// access course:instructors nest
$instructors = $item->children('course',true)->instructors;
// then access the course:staff nest
$staff = $instructors->children('course',true)->staff;
// finally access the staff:name nest value
$name = $staff->children('staff',true)->name;
// print the value
echo "Staff Name: ". $name . "<br>";
}
Test run: https://eval.in/735810
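On the "Node no longer exists" warning: it appears when children() is called on an element that is not there, for example an item without course:instructors. One possible way around it, sketched below as an alternative to chaining children(), is to ask for the name with a relative XPath, which simply returns an empty array when nothing matches. The sample feed and its second, staff-less item are hypothetical.

```php
<?php
// Hypothetical two-item feed: the second item has no course:instructors element.
$xml = <<<XML
<rss xmlns:course="https://www.example.org/api/course/elements/1.0/" xmlns:staff="https://www.example.org/api/staff/elements/1.0/" version="2.0">
<channel>
<item><title>With staff</title><course:instructors><course:staff><staff:name>Mark Moran</staff:name></course:staff></course:instructors></item>
<item><title>No staff</title></item>
</channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

$names = [];
foreach ($rss->channel->item as $item) {
    // Register the prefixes explicitly so the XPath below can use them.
    $item->registerXPathNamespace('course', 'https://www.example.org/api/course/elements/1.0/');
    $item->registerXPathNamespace('staff', 'https://www.example.org/api/staff/elements/1.0/');

    // Relative path from the current <item>; yields an empty array if any level is missing.
    $found = $item->xpath('course:instructors/course:staff/staff:name');
    $names[(string) $item->title] = $found ? (string) $found[0] : null;
}

print_r($names);
```

Items without an instructor end up with null, so the caller can decide whether to skip them or store an empty value.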

PHP SOAP XML DOM

I have a soap xml that contains a bunch of variables that I need to access. Here is the XML.
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<searchPersonsResponse xmlns="">
<searchPersonsReturn>
<attributes>
<attributes>
<name>ercreatedate</name>
<values>
<values>201104070130Z</values>
</values>
</attributes>
<attributes>
<name>status</name>
<values>
<values>Stuff1</values>
<values>Stuff2</values>
<values>Stuff3</values>
<values>Stuff4</values>
<values>Stuff5</values>
<values>Stuff6</values>
<values>Stuff7</values>
</values>
</attributes>
</attributes>
<itimDN>blah</itimDN>
<name>Smith, Bob</name>
<profileName>PER</profileName>
<select>false</select>
</searchPersonsReturn>
</searchPersonsResponse>
</soapenv:Body>
</soapenv:Envelope>
I'm trying to access the inner attributes nodes and pull the name and values out into a multidimensional array like this:
$array["status"][0]="stuff1";
$array["status"][1]="stuff2";
$array["status"][2]="stuff3";
$array["status"][3]="stuff4";
So far I have been able to access the nodes, but not to get them into the shape I want. Here is the code I have been playing around with:
$dom_document = new DOMDocument();
$dom_document->loadXML($thexml);
$tag_els_names = $dom_document->getElementsByTagname('name');
$tag_els_values = $dom_document->getElementsByTagname('values');
$data = array();
$data2 = array();
foreach($tag_els_names as $node){
$data[] = array($node->nodeName => $node->nodeValue);
//grabs all the <name> node values
}
$i=0;$j=0;
foreach($tag_els_values as $node){
$j=0;
foreach($node->childNodes as $child) {
$data2[$i][$j] = $child->nodeValue;
//grabs all the value node values
$j++;
}
$i++;
$j=0;
}
Does anyone know an easy way to do this? I think I have been looking at this for way too long.
How about something like:
$dom_document = new DOMDocument();
$dom_document->loadXML($thexml);
$xpath = new DOMXPath($dom_document);
// select the inner <attributes> elements (children of the outer <attributes>)
$attr = $xpath->evaluate("//attributes/attributes");
$names = array();
$values = array();
$i = 0;
foreach ($attr as $attr_node) {
    $values[$i] = array();
    foreach ($xpath->evaluate("name", $attr_node) as $name) {
        $names[] = $name->nodeValue;
    }
    // the individual values are nested as <values><values>...</values></values>
    foreach ($xpath->evaluate("values/values", $attr_node) as $value) {
        $values[$i][] = $value->nodeValue;
    }
    $i++;
}
This would, however, miss the <name> element that's outside of the <attributes> group. Did you mean to be including that?
I figured this out and thought it could help someone else:
$doc = new DOMDocument();
$values=array();
if ($doc->loadXML($temp)) {
$attributes = $doc->getElementsByTagName('attributes');
foreach($attributes as $attribute) {
if($attribute->childNodes->length) {
$previous_nodeValue="";
foreach($attribute->childNodes as $i) {
if($i->nodeValue=="status"){
$previous_nodeValue=$i->nodeValue;
}
if($i->nodeName=="values" && $previous_nodeValue== "status"){
foreach($i->childNodes as $j){
$values[]=$j->nodeValue;
}
}
}
$previous_nodeValue="";
}
}
}
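For the array shape asked for in the question ($array["status"][0], $array["status"][1], ...), a sketch along the following lines may be closer. It keys each inner attributes group by its name and is not limited to "status". The inline $thexml is a shortened, hypothetical copy of the SOAP response; the elements live in no namespace because of xmlns="", so unprefixed XPath names are enough.

```php
<?php
// Shortened, hypothetical copy of the SOAP response from the question.
$thexml = <<<XML
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<searchPersonsResponse xmlns="">
<searchPersonsReturn>
<attributes>
<attributes><name>ercreatedate</name><values><values>201104070130Z</values></values></attributes>
<attributes><name>status</name><values><values>Stuff1</values><values>Stuff2</values><values>Stuff3</values></values></attributes>
</attributes>
<name>Smith, Bob</name>
</searchPersonsReturn>
</searchPersonsResponse>
</soapenv:Body>
</soapenv:Envelope>
XML;

$doc = new DOMDocument();
$doc->loadXML($thexml);
$xpath = new DOMXPath($doc);

$array = [];
// Only the inner <attributes> groups; the outer <name> (Smith, Bob) is not touched.
foreach ($xpath->query('//attributes/attributes') as $attribute) {
    $name = $xpath->evaluate('string(name)', $attribute);
    $array[$name] = [];
    foreach ($xpath->query('values/values', $attribute) as $value) {
        $array[$name][] = $value->nodeValue;
    }
}

print_r($array);
```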
