Read XML file and write to a php array/file

I need to update the country list on my website and I want to automate the process. The country list can be found here:
http://www.iso.org/iso/country_codes...code_lists.htm // Edit: I can't find the right link...
I tried it this way:
http://www.w3schools.com/php/php_xml_parser_expat.asp (PHP XML Expat Parser)
However, this didn't work well, as I was confused about where to actually 'get' the data and copy it into my own array for later use.
Now I want to try it using XML DOM.
I just want to check with everyone: suppose I have a simple XML file to read that contains a country code and country name, as follows:
<Entry>
<Country_name>AFGHANISTAN</Country_name>
<Code_element>AF</Code_element>
</Entry>
I want to read this file (DOM method), and then feed the data into a separate file/array of mine that will be accessed by my website. What PHP XML functions would YOU use/recommend for this simple task?
Any help in this regard is appreciated.

Use SimpleXML
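A minimal sketch of what the SimpleXML route could look like, assuming the entries sit under a single root element. The <Entry>, <Country_name> and <Code_element> names come from the question; the <Countries> root, the second entry, and the countries.php output file are assumptions made for illustration.

```php
<?php
// Sample data: the <Entry> structure is from the question; the <Countries>
// root element and the ALBANIA entry are assumptions for illustration.
$xml = <<<'XML'
<Countries>
  <Entry>
    <Country_name>AFGHANISTAN</Country_name>
    <Code_element>AF</Code_element>
  </Entry>
  <Entry>
    <Country_name>ALBANIA</Country_name>
    <Code_element>AL</Code_element>
  </Entry>
</Countries>
XML;

// For a file on disk, simplexml_load_file($path) could be used instead.
$root = simplexml_load_string($xml);
if ($root === false) {
    throw new RuntimeException('Could not parse the XML');
}

$countries = [];
foreach ($root->Entry as $entry) {
    // Cast to string: SimpleXML returns element objects, not plain strings.
    $countries[(string) $entry->Code_element] = (string) $entry->Country_name;
}

// Turn the array into PHP source that another script could include.
$export = "<?php\nreturn " . var_export($countries, true) . ";\n";
// file_put_contents('countries.php', $export); // hypothetical output file

print_r($countries);
```

Keying the array by country code is a design choice for quick lookups; a plain list of code/name pairs would work just as well.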

How about:
$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//Entry"); // one node per <Entry> element
$allres = array();
foreach ($res as $node) {
    $result = array();
    $result['country'] = $node->getElementsByTagName("Country_name")->item(0)->nodeValue;
    $result['code'] = $node->getElementsByTagName("Code_element")->item(0)->nodeValue;
    $allres[] = $result; // store this entry, not the whole node list
}
In the end, the $allres array will contain all your country codes and names.

Related

PHP - DOMXpath does not return any nodes when it should

I am pulling HTML from Selenium, and then extracting data from the HTML using Xpaths.
This is the Xpath:
/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a
This is my code:
$data = $webdriver->getPageSource();
d($data, $urltemplate);
$doc = new DOMDocument();
$doc->loadHTML($data);
$xp = "/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a";
$xpatho = new DOMXpath($doc);
$elementsn = $xpatho->query($xp);
d(get_class($elementsn),$elementsn->count(),$xp,$name);
// d() is a custom function like var_dump().
I always get $elementsn->count() = 0.
This is $data:
https://pastebin.com/ahuvkJfN
I am trying to extract strings like "NAD M10 BLUOS...", "NAD M12 DIRECT DIGITAL..." and so on.
I saved the HTML into a file and opened it in my browser. I am attaching a screenshot of the data I am trying to retrieve (highlighted in blue):
Basically, the HTML page is a product listing, and I want to extract all the product names. To confirm, I used Chrome Developer Tools and its "copy full XPath" function. I have the following XPaths for some of the product names:
/html/body/div[2]/div[1]/div/div/div/div/ul/li[1]/div[1]/h3/a
/html/body/div[2]/div[1]/div/div/div/div/ul/li[3]/div[1]/h3/a
I would guess that this would generalise to:
/html/body/div[2]/div[1]/div/div/div/div/ul/li/div[1]/h3/a
However, I keep on getting a DOMNodeList with count = 0. Why is this so, and how can I check what the error is, if any?
P.S.: This is the original webpage: http://lenbrook.com.sg/3-shop-by-brand#/page-4/price-49-8667

Try changing your $xp so that it matches on the link's class instead of an absolute path:
$xp = '//a[@class="product_link"]/text()';
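As a rough illustration of the attribute-based approach, and of one way to look for errors, here is a self-contained sketch. Only the product_link class is taken from the answer above; the markup and the shortened product names are made up. It relies on two documented behaviours: libxml_get_errors() exposes the parser's complaints about the HTML, and DOMXPath::query() returns false for a malformed expression but an empty list when nothing matches.

```php
<?php
// Hypothetical markup: only the "product_link" class comes from the answer.
$html = '<html><body><ul>'
      . '<li><div><h3><a class="product_link" href="#">NAD M10</a></h3></div></li>'
      . '<li><div><h3><a class="product_link" href="#">NAD M12</a></h3></div></li>'
      . '</ul></body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
$parseErrors = libxml_get_errors(); // inspect these if the markup is suspect
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//a[@class="product_link"]');

$names = [];
if ($nodes !== false) {             // false would mean the expression itself is invalid
    foreach ($nodes as $a) {
        $names[] = trim($a->textContent);
    }
}

print_r($names);
```

A count of 0 with no error therefore usually means the path simply does not match the tree that DOMDocument built, which can differ from the tree shown in the browser's developer tools.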

Scrape Text With PHP & Display On Website

I am a complete beginner with PHP. I understand the concepts but am struggling to find a tutorial I understand. My goal is this:
Use the xpath addons for Firefox to select which piece of text I would like to scrape from a site
Format the scraped text properly
Display the text on a website
Example)
// Get the HTML Source Code
$url='http://steamcommunity.com/profiles/76561197967713768';
$source = file_get_contents($url);
// DOM document Creation
$doc = new DOMDocument;
$doc->loadHTML($source);
// DOM XPath Creation
$xpath = new DOMXPath($doc);
// Get all events
$username = $xpath->query('//html/body/div[3]/div[1]/div/div/div/div[3]/div[1]');
echo $username;
?>
In this example, I would like to scrape the username (which at the time of writing is mopar410).
Thank you for your help - I am so lost :( Right now I managed to use xpath with importXML in Google doc spreadsheets and that works, but I would like to be able to do this on my own site with PHP to learn how.
This is code I found online and edited the URL and the variable - as I am not aware of how to write this myself.

They have a public API.
Simply use http://steamcommunity.com/profiles/STEAM_ID/?xml=1
<?php
$profile = simplexml_load_file('http://steamcommunity.com/profiles/76561197967713768/?xml=1', 'SimpleXMLElement', LIBXML_NOCDATA);
echo (string)$profile->steamID;
Outputs: mopar410 (at time of writing)
This also provides other information such as mostPlayedGame, hoursPlayed, etc (look for the xml node names).
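If you would still like to learn the DOMXPath route from the question, note that query() returns a DOMNodeList, which cannot be echoed directly; you have to pick a node and read its text. A minimal sketch, assuming a made-up page fragment (the persona_name class is hypothetical, not taken from the real profile page):

```php
<?php
// Hypothetical markup; the class name "persona_name" is an assumption.
$source = '<html><body><div class="header">'
        . '<span class="persona_name">mopar410</span>'
        . '</div></body></html>';

libxml_use_internal_errors(true);   // real-world HTML often triggers parser warnings
$doc = new DOMDocument();
$doc->loadHTML($source);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$result = $xpath->query('//span[@class="persona_name"]');

// query() returns a DOMNodeList; take the first node and read its text.
$username = ($result !== false && $result->length > 0)
    ? trim($result->item(0)->textContent)
    : null;

echo $username ?? 'not found', "\n";
```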

DOMDocument : access the next following tag in PHP

I have installed a JSON plugin and fetched the content of an HTML page. Now I want to parse it and find a particular table, which has only a class but no id. I parse it using the PHP class DOMDocument. My idea is to access the tag before the table and then somehow reach the next following tag (my table) using DOMDocument.
Example:
<a name="Telefonliste" id="Telefonliste"></a>
<table class="wikitable">
So, I first get the <a> and after that I get the <table>.
I have got all the tables using the following commands and especially getElementsByTagName(). After that I can access item(2) where my table is:
$dom = new DOMDocument();
//load html source
$html = $dom->loadHTML($myHtml);
//discard white space
$dom->preserveWhiteSpace = false;
//the table by its tag name
$table = $dom->getElementsByTagName('table');
$rows = $table->item(2)->getElementsByTagName('tr');
This way is OK, but I want to make it more general. Right now I know that the table is located in item(2), but its position can change, e.g. if a new table is added to the HTML page before my table. Then my table will not be in item(2), but in item(3). So I want to parse the page in a way that still reaches this table without changing anything in my code. Can I do this using DOMDocument as a DOM parser?

You can use DOMXPath, and make the expression as general as you need it.
For example:
$dom = new DOMDocument();
//discard white space
$dom->preserveWhiteSpace = false;
//load html source
$dom->loadHTML($myHtml);
$domxpath = new DOMXPath($dom);
//first table with class "wikitable" and no id (XPath positions start at 1, so no [0] predicate)
$table = $domxpath->query('//table[@class="wikitable" and not(@id)]')->item(0);
$elementBeforeTable = $table->previousSibling;
$rows = $table->getElementsByTagName('tr');
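Since the question wants to reach the table through the anchor in front of it, the XPath following-sibling axis is another option. The sketch below assumes the anchor and the table are siblings inside the same parent; the anchor and the wikitable class are from the question, while the wrapping div, the extra table, and the row contents are made up.

```php
<?php
// The anchor and table markup come from the question; the wrapping <div>,
// the first table, and the cell contents are assumptions for illustration.
$myHtml = '<html><body><div>'
        . '<table class="wikitable"><tr><td>other</td></tr></table>'
        . '<a name="Telefonliste" id="Telefonliste"></a>'
        . '<table class="wikitable"><tr><td>Alice</td></tr><tr><td>Bob</td></tr></table>'
        . '</div></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($myHtml);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// First <table> that follows the anchor on the same level.
$table = $xpath
    ->query('//a[@id="Telefonliste"]/following-sibling::table[1]')
    ->item(0);

$rows = $table !== null ? $table->getElementsByTagName('tr') : null;

if ($rows !== null) {
    foreach ($rows as $row) {
        echo trim($row->textContent), "\n";
    }
}
```

Because the lookup is tied to the anchor rather than to a position, adding another table earlier in the page should not change the result.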

I've started writing a simple extension of this for the purpose of web scraping. I'm not 100% on the direction I want to take with it yet, but you can see an example of how to get the original HTML back in the response of the search rather than just raw text.
https://github.com/WolfeDev/PageScraper
EDIT: I plan on implementing basic table parsing soon.

Getting node Value of last Child

This question may seem very stupid, but I am not able to find much help on how to find the node value of the last child using PHP, even though it's a piece of cake with JS.
This is what my XML currently looks like:
<?xml version="1.0"?>
<files>
<file>.DS_Store</file>
<file>ID2PDF_log_1.xml</file>
<file>ID2PDF_log_12.xml</file>
<file>ID2PDF_log_15.xml</file>
</files>
Here's the php code:
$filename = 'files.xml'; //my xml file name
$dom = new DomDocument();
$dom->load($filename);
$elements = $dom->getElementsByTagName('file');
echo $elements->lastChild(); // This is obviously not working
/*I get an error that I am trying to access an undefined method in DOMNodeList. Now, I know
that lastChild is a property of DOMNode. But I can't figure out how I can change my code to
get this to work.*/
I am trying to echo out
ID2PDF_log_15.xml
Can anyone show me how to get this done?
P.S.: I don't want to change the xml file structure because I am creating it through a script and I am a lazy programmer. But, I did do my research to get this. Didn't help.
I did try getting the number of elements in the node 'file' and then using item(#), but that didn't seem to work either.
Thanks
SOLUTION
$filename = 'files.xml';
$dom = new DomDocument();
$dom->load($filename);
$elements = $dom->getElementsByTagName('file')->length;
echo 'Total elements in the xml file:'.$elements."\n";
$file = file_get_contents('files.xml');
$xml = simplexml_load_string($file);
$result = $xml->xpath('file');
echo "Last element".$result[$elements-1]."\n";
I'll make this neater a little later, but I thought I should share the answer anyway for any new users in the future.

This should work:
$xpath = new DOMXPath($dom);
echo $xpath->query('/files/file[last()]')->item(0)->nodeValue;
Read up about xpath
Alternatively I would suggest counting the number of elements, and then targeting the last element using that count:
$file_count = $elements->length; // $elements is the DOMNodeList from the question
echo $elements->item($file_count - 1)->nodeValue; // items are indexed from 0

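I did it this way:
$dom->preserveWhiteSpace = false; // set this before $dom->load(), or lastChild may be a whitespace text node
$elements = $dom->getElementsByTagName('files')->item(0); // the <files> parent element
echo $elements->lastChild->nodeValue;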
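To put the suggestions above side by side, here is a small self-contained sketch using the XML from the question, loaded from a string instead of files.xml so that it runs on its own. It shows two ways to reach the last <file>: indexing the node list, and an XPath last() predicate.

```php
<?php
// XML copied from the question; loaded from a string here so the sketch
// runs without files.xml being present.
$xml = <<<'XML'
<?xml version="1.0"?>
<files>
<file>.DS_Store</file>
<file>ID2PDF_log_1.xml</file>
<file>ID2PDF_log_12.xml</file>
<file>ID2PDF_log_15.xml</file>
</files>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml); // $dom->load('files.xml') would read the file instead

// Option 1: index the node list; items run from 0 to length - 1.
$files = $dom->getElementsByTagName('file');
$lastByIndex = $files->item($files->length - 1)->nodeValue;

// Option 2: let XPath pick the last <file> child of <files>.
$xpath = new DOMXPath($dom);
$lastByXPath = $xpath->query('/files/file[last()]')->item(0)->nodeValue;

echo $lastByIndex, "\n";
echo $lastByXPath, "\n";
```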

Updating the XML file using PHP script

I'm making an interface website to update a concert list on a band website.
The list is stored as an XML file and has this structure:
I already wrote a script that lets me add a new gig to the list; this was relatively easy.
Now I want to write a script that lets me edit a certain gig in the list.
Every gig is unique because of its first attribute: "id".
I want to use this reference to edit the other attributes in that node.
My PHP is very poor, so I hope someone can point me in the right direction.
My PHP script :

Well, I don't know what your XML structure looks like, but assuming something like:
<gig id="someid">
<venue></venue>
<day></day>
<month></month>
<year></year>
</gig>
$xml = new SimpleXMLElement('gig.xml', 0, true); // third argument: treat the first as a file path
$matches = $xml->xpath('//gig[@id="' . $_POST['id'] . '"]'); // xpath() returns an array of matches
$gig = $matches[0];
$gig->venue = $_POST['venue'];
$gig->month = $_POST['month'];
// etc..
$xml->asXML('gig.xml'); // save back to file
Now, if all these data points are attributes instead, you can use $gig->attributes()->venue to access them.
There is no real need for a loop unless you are doing multiple updates with one post; you can get at any specific record via an XPath query. SimpleXML is also a lot lighter and a lot easier to use for this type of thing than DOMDocument, especially as you aren't using the extra features of DOMDocument.

You'll want to load the XML file into a DOMDocument:
<?php
$xml = new DOMDocument();
$xml->load("xmlfile.xml");
//find the tags that you want to update (tag names are case-sensitive)
$tags = $xml->getElementsByTagName("GIG");
//find the tag with the id you want to update; $id holds the id of the gig to edit
foreach ($tags as $tag) {
    if ($tag->getAttribute("id") == $id) { //found the tag, now update the attribute
        $tag->setAttribute("[attributeName]", "[attributeValue]");
    }
}
//save the xml back to the same file
$xml->save("xmlfile.xml");
?>
The code is untested, but it shows the general idea.
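Since the question's XML is not shown, the following is only a sketch under an assumed structure: a <gigs> root holding <gig> elements with an id attribute and child elements for the details. All sample values are made up. It shows the lookup by id, the in-place update, and the serialisation step, working on a string so that it runs on its own.

```php
<?php
// Assumed structure (the question's XML is not shown): child elements per gig.
$xmlString = '<gigs>'
    . '<gig id="1"><venue>Old Hall</venue><month>5</month></gig>'
    . '<gig id="2"><venue>Club X</venue><month>6</month></gig>'
    . '</gigs>';

$xml = new SimpleXMLElement($xmlString);
$id = '2'; // in the real script this would come from the submitted form

// Only accept numeric ids so user input cannot alter the XPath expression.
$matches = ctype_digit($id) ? $xml->xpath('//gig[@id="' . $id . '"]') : [];

if (!empty($matches)) {
    $gig = $matches[0];        // xpath() returns an array of matching elements
    $gig->venue = 'New Venue'; // assignments change the underlying document
    $gig->month = '7';
}

// asXML() with no argument returns the XML as a string;
// passing a filename would write it to disk instead.
echo $xml->asXML();
```

The numeric check is one simple way to keep form input out of the XPath expression; if your ids are not numeric, some other validation would be needed.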
