How to create looped XML file from HTML in PHP? - php

I would like to be able to create an XML file from some of the content of a html page. I have tried intensively but seem to miss something.
I have created two arrays, I have setup a DOMdocument and I have prepared to save an XML file on the server... I have tried to make tons of different foreach loops all over the place - but it won't work.
Here is my code:
<?php
$page = file_get_contents('http://www.halfmen.dk/!hmhb8/score.php');
$doc = new DOMDocument();
$doc->loadHTML($page);
$score = $doc->getElementsByTagName('div');
$keyarray = array();
$teamarray = array();
foreach ($score as $value) {
if ($value->getAttribute('class') == 'xml') {
$keyarray[] = $value->firstChild->nodeValue;
$teamarray[] = $value->firstChild->nextSibling->nodeValue;
}
}
print_r($keyarray);
print_r($teamarray);
$doc = new DOMDocument('1.0','utf-8');
$doc->formatOutput = true;
$droot = $doc->createElement('ROOT');
$droot = $doc->appendChild($droot);
$dsection = $doc->createElement('SECTION');
$dsection = $droot->appendChild($dsection);
$dkey = $doc->createElement('KEY');
$dkey = $dsection->appendChild($dkey);
$dteam = $doc->createElement('TEAM');
$dteam = $dsection->appendChild($dteam);
$dkeytext = $doc->createTextNode($keyarray);
$dkeytext = $dkey->appendChild($dkeytext);
$dteamtext = $doc->createTextNode($teamarray);
$dteamtext = $dteam->appendChild($dteamtext);
echo $doc->save('xml/test.xml');
?>
I really like simplicity, thank you.

You need to add each item in one at a time rather than as an array, which is why I build the XML for each div tag rather than as a second pass. I've had to assume that your XML is structured the way I've done it, but this may help you.
$page = file_get_contents('http://www.halfmen.dk/!hmhb8/score.php');
$doc = new DOMDocument();
$doc->loadHTML($page);
$score = $doc->getElementsByTagName('div');
$doc = new DOMDocument('1.0','utf-8');
$doc->formatOutput = true;
$droot = $doc->createElement('ROOT');
$droot = $doc->appendChild($droot);
foreach ($score as $value) {
if ($value->getAttribute('class') == 'xml') {
$dsection = $doc->createElement('SECTION');
$dsection = $droot->appendChild($dsection);
$dkey = $doc->createElement('KEY', $value->firstChild->nodeValue);
$dkey = $dsection->appendChild($dkey);
$dteam = $doc->createElement('TEAM', $value->firstChild->nextSibling->nodeValue);
$dteam = $dsection->appendChild($dteam);
}
}

Related

How to get child nodes from an xml url?

I got this link https://www.ncbi.nlm.nih.gov/gene/7128?report=xml&format=text. I am trying to write a code that gets Interactions and GeneOntology within Gene-commentary_heading from the link. I only succeed using this code when there are the 2 or 3 nodes but in this case there are at least 6 nodes or more. Could someone help me?
Bellow is the example of the information I am looking for (it's to much to visualise so I just showed a part)
<Gene-commentary_heading>GeneOntology</Gene-commentary_heading>
<Gene-commentary_source>
<Other-source>
<Other-source_pre-text>Provided by</Other-source_pre-text>
<Other-source_anchor>GOA</Other-source_anchor>
<Other-source_url>http://www.ebi.ac.uk/GOA/</Other-source_url>
</Other-source>
</Gene-commentary_source>
<Gene-commentary_comment>
<Gene-commentary>
<Gene-commentary_type value="comment">254</Gene-commentary_type>
<Gene-commentary_label>Function</Gene-commentary_label>
<Gene-commentary_comment>
<Gene-commentary>
<Gene-commentary_type value="comment">254</Gene-commentary_type>
<Gene-commentary_source>
<Other-source>
<Other-source_src>
<Dbtag>
<Dbtag_db>GO</Dbtag_db>
<Dbtag_tag>
<Object-id>
<Object-id_id>3677</Object-id_id>
</Object-id>
</Dbtag_tag>
...
`$url = "https://www.ncbi.nlm.nih.gov/gene/7128?report=xml&format=text";
$document_xml = new DOMDocument();
$document_xml->loadXML($url);
$elements = $url->getElementsByTagName('Gene-commentary_heading');
echo $elements;
foreach($element as $node) {
$GO = $node -> getElementsByTagName('GeneOntology');
$Int = $node->getElementsByTagName('Interactions');
}
My answer
$esearch_test = "https://www.ncbi.nlm.nih.gov/gene/7128?report=xml&format=text";
$result = file_get_contents($esearch_test);
$xml = simplexml_load_string($result);
$doc = new DOMDocument();
$doc = DOMDocument::loadXML($xml);
$c = 1;
foreach($doc->getElementsByTagName('Gene-commentary_heading') as $node) {
echo "$c: ".$node->textContent."\n";
$c++;
}

Why the query doesn't match the DOM?

Here is my code:
$res = file_get_contents("http://www.lenzor.com/photo/search/index/type/user/%D8%B9%D9%84%DB%8C//text/%D9%81%D8%A7%D8%B7%D9%85%D9%87");
$doc = new \DOMDocument();
#$doc->loadHTMLFile($res);
$xpath = new \DOMXpath($doc);
$links = $xpath->query("//ul[#class='user_box']/li");
$result = array();
if (!is_null($links)) {
foreach ($links as $link) {
$href = $link->getAttribute('class');
$result[] = [$href];
}
}
print_r($result);
Here is the content I'm working on. I mean it's the result of echo $res.
Ok well, the result of my code is an empty array. So $links is empty and that foreach won't be executed. Why? Why //ul[#class='user_box']/li query doesn't match the DOM ?
Expected result is an array contains the class attribute of lis.
Try this, Hope this will be helpful. There are few mistakes in your code.
1. You should search like this '//ul[#class="user_box clearfix"]/li' because class="user_box clearfix" class attribute of that HTML source contains two classes.
2. You should use loadHTMLinstead of loadHTMLFile.
<?php
ini_set('display_errors', 1);
libxml_use_internal_errors(true);
$res = file_get_contents("http://www.lenzor.com/photo/search/index/type/user/%D8%B9%D9%84%DB%8C//text/%D9%81%D8%A7%D8%B7%D9%85%D9%87");
$doc = new \DOMDocument();
$doc->loadHTML($res);
$xpath = new \DOMXpath($doc);
$links = $xpath->query('//ul[#class="user_box clearfix"]/li');
$result = array();
if (!is_null($links)) {
foreach ($links as $link) {
$href = $link->getAttribute('class');
$result[] = [$href];
}
}
print_r($result);

Getting data from HTML using DOMDocument

I'm trying to get data from HTML using DOM. I can get some data, but can't figure out how to get the rest. Here is an image highlighting the data I want.
http://i.imgur.com/Es51s5s.png
here is the code itself
http://pastebin.com/Re8qEivv
and here my PHP code
$html = file_get_contents('result.html');
$dom = new DOMDocument;
$dom->loadHTML($html);
$tr = $dom->getElementsByTagName('tr');
foreach ($tr as $row){
$td = $row->getElementsByTagName('td');
$td1 = $td->item(1);
$td2 = $td->item(2);
foreach ($td1->childNodes as $node){
$title = $node->textContent;
}
foreach ($td2->childNodes as $node){
$type = $node->textContent;
}
}
Figured it out
$html = file_get_contents('result.html');
$dom = new DOMDocument;
$dom->loadHTML($html);
$tr = $dom->getElementsByTagName('tr');
foreach ($tr as $row){
$td = $row->getElementsByTagName('td');
$td1 = $td->item(1);
$td2 = $td->item(2);
$title = $td1->childNodes->item(0)->textContent;
$firstURL = $td1->getElementsByTagName('a')->item(0)->getAttribute('href');
$type = $td2->childNodes->item(0)->textContent;
$imageURL = $td2->getElementsByTagName('img')->item(0)->getAttribute('src');
}
I have used following class.
http://sourceforge.net/projects/simplehtmldom/
This is very simple and easy to use class.
You can use
$html->find('#RosterReport > tbody', 0);
to find specific table
$html->find('tr')
$html->find('td')
to find table rows or columns
Note $html is variable have full html dom content.

Filling XML with loops in PHP

I got the following code but nothing happens, I can't find a good tutorial on filling XML with multiple loops:
$xml = new DOMDocument();
$xml->formatOutput = true;
$xml_user = $xml->createElement("User");
$xml_user->setAttribute("Name", $username);
foreach ($alben as $album) {
$i = 1;
$xml_album = $xml->createElement("Album");
$pictures = $picture_object->get_all_pictures($album['id']);
$xml_user->appendChild($xml_album);
$xml_album->setAttribute("Name", $album['name']);
foreach ($pictures as $picture) {
$xml_picture = $xml->createElement("Picture");
$xml_album->appendChild($xml_picture);
$xml_picture->setAttribute("Name", $picture['name']);
$rating = $rating_object->get_picture_rating($picture['id']);
$xml_rating = $xml->createElement("Rating");
$xml_picture->appendChild($xml_rating);
$comments = $comment_object->get_all_comments($picture['id']);
foreach ($comments as $comment) {
$xml_comment = $xml->createElement("Kommentar");
$xml_picture->appendChild($xml_comment);
$xml_comment->setAttribute("Datum", $comment['date']);
$xml_comment->nodeValue($comment['text']);
}
}
}
$xml->save("myXML.xml");

How do I rename XML values using php?

How do I rename a value in xml using PHP? This is what I've got so far:
<?php
$q = $_GET["q"];
$q = stripslashes($q);
$q = explode('|^', $q);
$old = $q[0];
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->Load("test.xml");
$xpath = new DOMXPath($dom);
$query1 = 'channel/item[title="' . $old . '"]/title';
$entries = $xpath->query($query1);
foreach ($entries as $entry)
{
$oldchapter = $entry->parentNode->removeChild($entry);
$item = $dom->getElementsByTagName('item');
foreach ($item as $items)
{
$title = $dom->createElement('title', $q[1]);
$items->appendChild($title);
}
}
$dom->save("test.xml");
Basically, what it does is take two titles from a url, the old existing title, and the one the user wants to change it to (so like this oldtitle|^newtitle), and puts them into an array.
What I've tried doing is removing the existing old title, and then making a new title with, using the new title value from the url, but it doesn't seem to be working. Where am I going wrong, or is there an easier way of doing this?
The way to do this is with DOMNode::replaceChild(). The majority of your code is correct, you've just slightly over-complicated some of the DOM stuff.
Try this:
<?php
$q = $_GET["q"];
$q = stripslashes($q);
$q = explode('|^', $q);
$old = $q[0];
$dom = new DOMDocument;
// Do this *before* loading the document
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;
$dom->Load("test.xml");
$xpath = new DOMXPath($dom);
$query1 = 'channel/item[title="' . $old . '"]/title';
$entries = $xpath->query($query1);
// This is all you need to do in the loop
foreach ($entries as $oldTitle) {
$newTitle = $dom->createElement('title', $q[1]);
$entry->parentNode->replaceChild($newTitle, $oldTitle);
}
$dom->save("test.xml");

Categories