Most efficient method of parsing non-standard XML document in PHP

In PHP, what is the quickest, most efficient way to parse an XML document formatted like so:
<data>
<rowdata>
<fieldname>products_id</fieldname>
<value><![CDATA[1]]></value>
<fieldname>products_image</fieldname>
<value><![CDATA[image_one.jpg]]></value>
<fieldname>products_name</fieldname>
<value>Product One</value>
</rowdata>
<rowdata>
<fieldname>products_id</fieldname>
<value><![CDATA[2]]></value>
<fieldname>products_image</fieldname>
<value><![CDATA[image_two.jpg]]></value>
<fieldname>products_name</fieldname>
<value>Product Two</value>
</rowdata>
</data>
This is the format I've been given by a till system company to import products into a database. I have no idea why they decided to use <fieldname> and <value> tags instead of just <products_id>1</products_id>.
At the moment the only way I can think of doing is writing up some crude loop which sets a boolean each time a <value> tag is found and resets

Seems pretty straightforward. There is some logic: for each <fieldname> there is a <value>. I assume that each fieldname corresponds to a column in a table.
Using SimpleXML:
$xml='<data>
<rowdata>
<fieldname>products_id</fieldname>
<value><![CDATA[1]]></value>
<fieldname>products_image</fieldname>
<value><![CDATA[image_one.jpg]]></value>
<fieldname>products_name</fieldname>
<value>Product One</value>
</rowdata>
<rowdata>
<fieldname>products_id</fieldname>
<value><![CDATA[2]]></value>
<fieldname>products_image</fieldname>
<value><![CDATA[image_two.jpg]]></value>
<fieldname>products_name</fieldname>
<value>Product Two</value>
</rowdata>
</data>';
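$data = simplexml_load_string($xml);
$baseSQL='INSERT into TABLE set ';
// Iterating the root element yields each <rowdata> child in turn.
foreach($data as $rowdata) {
$SQL='';
$count=0;
// The n-th <fieldname> is paired with the n-th <value> of the same row.
foreach ($rowdata->fieldname as $fieldname) {
if ($SQL!='') $SQL.=',';
// Illustration only: escape or bind the values before running real queries.
$SQL.=$fieldname.'="'.$rowdata->value[$count].'"';
$count++;
}
echo $baseSQL.$SQL.'<br>';
}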
produces :
INSERT into TABLE set products_id="1",products_image="image_one.jpg",products_name="Product One"
INSERT into TABLE set products_id="2",products_image="image_two.jpg",products_name="Product Two"
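If the goal is a database import, it may be easier to first turn each <rowdata> into an associative array and build the queries from that (ideally with bound parameters). The following is a minimal sketch of that idea, not the answerer's code: rowsToArrays() is a hypothetical helper name, the feed is a shortened copy of the sample above, and it assumes every row has the same number of <fieldname> and <value> elements.

```php
<?php
// Shortened copy of the sample feed from the question.
$xml = <<<XML
<data>
<rowdata>
<fieldname>products_id</fieldname>
<value><![CDATA[1]]></value>
<fieldname>products_name</fieldname>
<value>Product One</value>
</rowdata>
<rowdata>
<fieldname>products_id</fieldname>
<value><![CDATA[2]]></value>
<fieldname>products_name</fieldname>
<value>Product Two</value>
</rowdata>
</data>
XML;

// rowsToArrays() is a hypothetical helper, not part of SimpleXML.
function rowsToArrays(string $xml): array
{
    $rows = [];
    foreach (simplexml_load_string($xml)->rowdata as $rowdata) {
        $names = [];
        $values = [];
        foreach ($rowdata->fieldname as $fieldname) {
            $names[] = (string) $fieldname;
        }
        foreach ($rowdata->value as $value) {
            // The string cast returns CDATA content as plain text.
            $values[] = (string) $value;
        }
        // Pair the n-th fieldname with the n-th value.
        // Assumes both lists have the same length.
        $rows[] = array_combine($names, $values);
    }
    return $rows;
}

$rows = rowsToArrays($xml);
print_r($rows);
```

Each entry of $rows can then be passed to a prepared statement instead of being concatenated into the SQL string.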

Related

XML Assign More Than One Result To PHP Variable

I have done a bit of searching on this, but am just not sure I am searching for the right thing. Examples and things I have found have just confused me and possibly sent me in the wrong direction.
I am trying to figure out a php while statement, or if statement to return the results of XML output. The thing is the row/section I need may not always be the same number of results returned. For example there are ShoutCast streams, some have 1 mount point, and some have 3 mount points. Each mount point can have a different amount of listeners tuned in to that particular mount.
My Goal: To get the integer from all mount points returned in the XML, add them together to make a grand total of listeners.
The XML
<centovacast version="3.1.2" host="host.net">
<response type="success">
<message>Complete</message>
<data>
<status>
<mount>/stream</mount>
<sid>1</sid>
<listenercount>31</listenercount>
<genre>Blues</genre>
<url>http://www.websiteurl.com</url>
<title>Streams Name</title>
<currentsong>Artist Name - Track Name</currentsong>
<bitrate>128</bitrate>
<sourceconnected>1</sourceconnected>
<codec>audio/mpeg</codec>
<displayname>/stream</displayname>
<serverstate>1</serverstate>
<appstate>
<sctrans2>1</sctrans2>
</appstate>
<sourcestate>1</sourcestate>
<reseller/>
<useserver>1</useserver>
<ipaddress>11.11.111.111</ipaddress>
<port>8031</port>
<proxy>0</proxy>
<servertype>ShoutCast2</servertype>
<sourcetype>sctrans2</sourcetype>
</status>
<mountpoints>
<row>
<mount>/stream</mount>
<sid>1</sid>
<listenercount>31</listenercount>
<genre>Blues</genre>
<url>http://www.websiteurl.com</url>
<title>Stream Title Name</title>
<currentsong>Artist Name - Track Name</currentsong>
<bitrate>128</bitrate>
<sourceconnected>1</sourceconnected>
<codec>audio/mpeg</codec>
<displayname>/stream</displayname>
</row>
<row>
<mount>/live</mount>
<sid>2</sid>
<listenercount>0</listenercount>
<genre/>
<url/>
<title/>
<currentsong/>
<bitrate>0</bitrate>
<sourceconnected>0</sourceconnected>
<codec/>
<displayname>/live</displayname>
</row>
</mountpoints>
</data>
</response>
</centovacast>
So on the above I know how to pull the listeners for each mount individually using the following code.
$countlisteners->response->data->mountpoints->row[0]->listenercount;
That gives me the result for the first mount, and switching the 0 to a 1 gives me the second mount, so on and so forth.
What I need is PHP code that counts how many of those mounts exist and assigns each result to a variable, so that I can add them together to get a grand total. Is there a way to do this?
What about doing something like this?
$countlisteners = simplexml_load_file('http://urltoxml.com');
foreach($countlisteners->response->data->mountpoints->row->listenercount as $result){
$total = $result;
echo $total;
}
You can use DOMDocument to extract all mountpoints tags:
<?php
$xml="Your xml document content here";
$dom = new DOMDocument;
$dom->loadXML($xml);
$mountpoints = $dom->getElementsByTagName('mountpoints');
foreach ($mountpoints as $mountpoint) {
echo $mountpoint->nodeValue;
//you can add your count variable here
//nodeValues can be assigned to variables
}
?>
I figured it out. So simple, yet hard to figure out.
$total = 0;
$items = 0; // number of mount points found
foreach($countlisteners->response->data->mountpoints->row as $result){
$total += (int) $result->listenercount;
$items++;
}
echo $total;
You normally do that with Xpath. It's a query language for XML documents.
You're interested in all listenercount elements, the Xpath expression for these elements could be as simple as:
//listenercount
When you now use SimpleXML to parse the document, the following line of code gives you three SimpleXMLElements inside an array that represent the three values you want to create the sum of:
$array = simplexml_load_string($buffer)->xpath('//listenercount');
As you need the sum of the integer values of these three elements, it can be easily processed with array_map and array_sum:
$sum = array_sum(array_map('intval', $array));
And this gives you in $sum what you're looking for:
var_dump($sum); # int(62)
I hope this sheds some light on why it's often better to get the information you're looking for with an xpath query instead of writing many lines of code to traverse the document "on your own".
The full example:
$buffer = <<<XML
<centovacast version="3.1.2" host="host.net">
<response type="success">
<message>Complete</message>
<data>
<status>
<mount>/stream</mount>
<sid>1</sid>
<listenercount>31</listenercount>
<genre>Blues</genre>
<url>http://www.websiteurl.com</url>
<title>Streams Name</title>
<currentsong>Artist Name - Track Name</currentsong>
<bitrate>128</bitrate>
<sourceconnected>1</sourceconnected>
<codec>audio/mpeg</codec>
<displayname>/stream</displayname>
<serverstate>1</serverstate>
<appstate>
<sctrans2>1</sctrans2>
</appstate>
<sourcestate>1</sourcestate>
<reseller/>
<useserver>1</useserver>
<ipaddress>11.11.111.111</ipaddress>
<port>8031</port>
<proxy>0</proxy>
<servertype>ShoutCast2</servertype>
<sourcetype>sctrans2</sourcetype>
</status>
<mountpoints>
<row>
<mount>/stream</mount>
<sid>1</sid>
<listenercount>31</listenercount>
<genre>Blues</genre>
<url>http://www.websiteurl.com</url>
<title>Stream Title Name</title>
<currentsong>Artist Name - Track Name</currentsong>
<bitrate>128</bitrate>
<sourceconnected>1</sourceconnected>
<codec>audio/mpeg</codec>
<displayname>/stream</displayname>
</row>
<row>
<mount>/live</mount>
<sid>2</sid>
<listenercount>0</listenercount>
<genre/>
<url/>
<title/>
<currentsong/>
<bitrate>0</bitrate>
<sourceconnected>0</sourceconnected>
<codec/>
<displayname>/live</displayname>
</row>
</mountpoints>
</data>
</response>
</centovacast>
XML;
$array = simplexml_load_string($buffer)->xpath('//listenercount');
$sum = array_sum(array_map('intval', $array));
var_dump($sum);
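Note that //listenercount also matches the <listenercount> inside <status>, which is why the sum above is 62 rather than 31. If only the mount points should be counted, the expression can be narrowed. The sketch below uses a trimmed copy of the response and is one possible variation on the answer, not part of it; the variable names are made up for illustration.

```php
<?php
// Trimmed copy of the response from the question.
$buffer = <<<XML
<centovacast version="3.1.2" host="host.net">
<response type="success">
<data>
<status>
<listenercount>31</listenercount>
</status>
<mountpoints>
<row>
<mount>/stream</mount>
<listenercount>31</listenercount>
</row>
<row>
<mount>/live</mount>
<listenercount>0</listenercount>
</row>
</mountpoints>
</data>
</response>
</centovacast>
XML;

$xml = simplexml_load_string($buffer);

// Convert a SimpleXMLElement to an integer via its text content.
$toInt = static function ($node) {
    return (int) (string) $node;
};

// Every <listenercount>, including the one under <status>.
$sumAll = array_sum(array_map($toInt, $xml->xpath('//listenercount')));

// Only the <listenercount> elements of the mount point rows.
$sumMounts = array_sum(array_map($toInt, $xml->xpath('//mountpoints/row/listenercount')));

echo "all: $sumAll, mount points only: $sumMounts\n";
```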

Xpath looping query

I have the following xml doc:
<shop id="123" name="xxx">
<product id="123456">
<name>Book</name>
<price>9.99</price
</product>
<product id="789012">
<name>Perfume</name>
<price>12.99</price
</product>
<product id="345678">
<name>T-Shirt</name>
<price>9.99</price
</product>
</shop>
<shop id="456" name="yyy">
<product id="123456">
<name>Book</name>
<price>9.99</price
</product>
</shop>
I have the following loop to gather the information for each product:
$data_feed = 'www.mydomain.com/xml/compression/gzip/';
$xml = simplexml_load_file("compress.zlib://$data_feed");
foreach ($xml->xpath('//product') as $row) {
$id = $row["id"]; // product id eg. "123456"
$name = $row->name;
$price = $row->price;
// update database etc.
}
HOWEVER, I also want to gather the information for each product's parent shop ("id" and "name").
I can easily change my xpath to start from shop as opposed to product, but I'm unsure of the most efficient way to then construct an additional loop within my foreach to loop each indented product
Make sense?
I'd go without xpath and just use two nested foreach-loops:
$xml = simplexml_load_string($x); // assume XML in $x
foreach ($xml->shop as $shop) {
echo "shop $shop[name], id $shop[id] <br />";
foreach ($shop->product as $product) {
echo "- $product->name (id $product[id]), $product->price <br />";
}
}
see it working: http://codepad.viper-7.com/vFmGvY
BTW: your XML is broken, probably a typo. Each closing </price> is missing its last >.
Sure, makes sense: you want a single iteration, not nested iterations (although that won't gain you much, as @michi already showed). That is possible as well:
foreach ($xml->xpath('//product') as $row)
{
$id = $row["id"]; // product id eg. "123456"
$name = $row->name;
$price = $row->price;
$shopId = $row->xpath('../@id')[0];
$shopName = $row->xpath('../@name')[0];
// update database etc.
}
As this example shows, you can run xpath() on each element node, and the context node is automatically set to the node itself. The relative path .. in xpath therefore accesses the parent element (see also: Access an element's parent with PHP's SimpleXML?). Both attributes of the parent are then read, and the first (and only) match is accessed via PHP 5.4 array dereferencing.
I hope this helps and sheds some light on how it works. Your question reminds me a bit of an earlier one where I suggested a generic solution to this kind of problem:
Answer to Combining two Xpaths into one loop?
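As a self-contained illustration of the single-loop approach, the sketch below fetches the parent <shop> once per product with xpath('..') and reads both attributes from it. The sample XML is a shortened, repaired copy of the question's feed wrapped in an assumed <shops> root element, and the array keys are made-up names.

```php
<?php
// Repaired sample, wrapped in an assumed <shops> root so that it is well-formed.
$x = <<<XML
<shops>
<shop id="123" name="xxx">
<product id="123456"><name>Book</name><price>9.99</price></product>
<product id="789012"><name>Perfume</name><price>12.99</price></product>
</shop>
<shop id="456" name="yyy">
<product id="123456"><name>Book</name><price>9.99</price></product>
</shop>
</shops>
XML;

$xml = simplexml_load_string($x);
$rows = [];
foreach ($xml->xpath('//product') as $product) {
    // ".." is evaluated relative to the current product, so it yields its shop.
    $shop = $product->xpath('..')[0];
    $rows[] = [
        'shop_id'    => (string) $shop['id'],
        'shop_name'  => (string) $shop['name'],
        'product_id' => (string) $product['id'],
        'name'       => (string) $product->name,
        'price'      => (string) $product->price,
    ];
}
print_r($rows);
```

Casting to string detaches the values from the SimpleXML objects, which is usually what a database layer expects.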

Show all items that Match in XML using PHP

This is my XML file named: full.xml
I need your help. I need a PHP script that open "full.xml"
and only display all values of the nodes that have .email
Example of the Output I want:
sales@company1.com
sales@company2.com
sales@company3.com
Thanks! I will thank you so much!
EDIT
$Connect = simplexml_load_file("full.xml");
return $Connect->table[0]->*.email;
The design of your XML is not very smart. With this xpath expression, you select all nodes with .email at the end of their name:
$xml = simplexml_load_string($x); // assume XML in $x
$results = $xml->xpath("//*[substring(name(),string-length(name())-" . (strlen('.email') - 1) . ") = '.email']");
--> result is an array with the selected nodes.
BTW: if you have any chance of CHANGING the structure of the XML, AVOID combining information within node names like <company1.email>, but do it like this:
...
<companies>
<company id="1">
<email>info@company1.com</email>
<tel>+498988123456</tel>
<name>somename</name>
</company>
<company id="2">
<email>info@company2.com</email>
<tel>+498988123457</tel>
<name>someothername</name>
</company>
</companies>
....
It will be much easier to read and parse.
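To see the suffix-matching xpath expression in action, here is a small sketch. The original full.xml is not shown, so the sample document is an assumption modelled on the <company1.email> naming mentioned above; the <root> and <table> wrappers and the variable names are likewise made up.

```php
<?php
// Assumed sample; the real full.xml from the question is not available.
$x = <<<XML
<root>
<table>
<company1.name>One</company1.name>
<company1.email>sales@company1.com</company1.email>
<company2.name>Two</company2.name>
<company2.email>sales@company2.com</company2.email>
</table>
</root>
XML;

$xml = simplexml_load_string($x);
$suffix = '.email';
// XPath 1.0 has no ends-with(), so compare the last strlen($suffix) characters.
$offset = strlen($suffix) - 1;
$nodes = $xml->xpath(
    "//*[substring(name(), string-length(name()) - $offset) = '$suffix']"
);

// Convert the matched elements to plain strings.
$emails = [];
foreach ($nodes as $node) {
    $emails[] = (string) $node;
}
echo implode("\n", $emails), "\n";
```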

Using SimpleXML to loop through multiple entries

Hi, I'm trying to parse an XML feed using SimpleXML in PHP.
The XML feed is laid out as follows:
<Member>
<MemberType>Full</MemberType>
<JoinDate>2010-06-12</JoinDate>
<DataType>A</DataType>
<Data>
<FirstName>Ted</FirstName>
<LasttName>Smith</LasttName>
<Data1>56</Data1>
<Data2>100</Data2>
<Data3>120</Data3>
</Data>
</Member>
<Member>
<MemberType>Full</MemberType>
<JoinDate>2010-06-12</JoinDate>
<DataType>B</DataType>
<Data>
<FirstName>Ted</FirstName>
<LasttName>Smith</LasttName>
<Data1>57</Data1>
<Data2>110</Data2>
<Data3>130</Data3>
</Data>
</Member>
<Member>
<MemberType>Full</MemberType>
<JoinDate>2010-06-12</JoinDate>
<DataType>C</DataType>
<Data>
<FirstName>Ted</FirstName>
<LasttName>Smith</LasttName>
<Data4>58</Data4>
<Data5>115</Data5>
<Data6>230</Data6>
</Data>
</Member>
The Member element repeats over and over in the XML doc, but the data inside it changes. What I'm trying to do is enter all the data for certain members into an SQL database, ideally as one row per member. The XML feed contains different member types, like 'Full' or 'Associate'.
At the moment I am trying to loop through all the full members and get all the data for each particular member. The data for each member is split into three parts, each in a separate Member tag, so above Ted Smith has data in three Member tags, where DataType is A, B and C.
$PLF = simplexml_load_file('../XML/Members.xml');
foreach ($PLF->Root->xpath('//Member') as $member) {
if ($member->MemberType == 'Full') {
echo $member->MemberType.'<br/>';
echo $member->JoinDate.'<br />';
echo $member->DataType.'<br/>';
echo $member->Data->FirstName.'<br/>';
echo $member->Data->LastName.'<br/>';
echo $member->Data->Data1.'<br/>';
echo $member->Data->Data2.'<br/>';
echo '<br />';
}}
The code I have at the moment can only pull data from the first type (type A) in each loop, and I really want to combine types A, B and C in the same loop, so that I can get all the data for each member like Ted Smith into one row in the DB.
I use SimpleXML to read some remote files and loop through them as well. This is my code:
$sUrl = "some url";
$sContent = file_get_contents($sUrl);
$oXml = new SimpleXMLElement($sContent);
$aReturn = array();
foreach ($oXml->children() as $oStation)
{
$iRackId = (int)$oStation->rack_id;
$dLong = (double)str_replace(",", ".", $oStation->longitute);
$dLati = (double)str_replace(",", ".", $oStation->latitude);
$sDescription = (string)$oStation->description;
$aRes = array();
$aRes['rack_id'] = $iRackId;
$aRes['longitute'] = $dLong;
$aRes['latitude'] = $dLati;
$aRes['description'] = utf8_decode($sDescription);
if ($dLong > 0 && $dLati > 0)
$aReturn[$iRackId] = $aRes;
}
What I do is put the result of the XML file into an array. Later in the code I save that data to the database as well.
Hope this helps. I don't use xpath... I had nothing but problems with it and didn't have the time to sort them out. This seems to work for me though.
Br,
Paul Peelen
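Applied to the Member feed from the question, the same collect-into-an-array idea could look like the sketch below. It is only one possible approach under stated assumptions: the feed is wrapped in an assumed <Root> element, the surname element is spelled <LastName>, members are identified by first and last name, and each field is prefixed with its DataType so that A and B values do not overwrite each other. The "Ann Jones" entry is invented sample data.

```php
<?php
// Shortened sample; <Root>, <LastName> and the Associate member are assumptions.
$x = <<<XML
<Root>
<Member>
<MemberType>Full</MemberType>
<DataType>A</DataType>
<Data><FirstName>Ted</FirstName><LastName>Smith</LastName><Data1>56</Data1><Data2>100</Data2></Data>
</Member>
<Member>
<MemberType>Full</MemberType>
<DataType>B</DataType>
<Data><FirstName>Ted</FirstName><LastName>Smith</LastName><Data1>57</Data1><Data2>110</Data2></Data>
</Member>
<Member>
<MemberType>Associate</MemberType>
<DataType>A</DataType>
<Data><FirstName>Ann</FirstName><LastName>Jones</LastName><Data1>1</Data1></Data>
</Member>
</Root>
XML;

$xml = simplexml_load_string($x);
$members = [];
foreach ($xml->Member as $member) {
    if ((string) $member->MemberType !== 'Full') {
        continue; // only full members are wanted
    }
    // Group all Member blocks of the same person under one key.
    $key = $member->Data->FirstName . ' ' . $member->Data->LastName;
    $type = (string) $member->DataType;
    foreach ($member->Data->children() as $field) {
        $name = $field->getName();
        if ($name === 'FirstName' || $name === 'LastName') {
            continue; // already part of the key
        }
        // Prefix with the DataType, e.g. "A_Data1", to keep A/B/C apart.
        $members[$key][$type . '_' . $name] = (string) $field;
    }
}
print_r($members);
```

Each entry of $members then holds everything needed for a single database row per member.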

Finding the value of a child in a specific attribute

<data>
<gig id="1">
<date>December 19th</date>
<venue>The Zanzibar</venue>
<area>Liverpool</area>
<telephone>Ticketline.co.uk</telephone>
<price>£6</price>
<time>Time TBA</time>
</gig>
<gig id="2">
<date>Sat. 16th Jan</date>
<venue>Celtic Connection, Classic Grand</venue>
<area>Glasgow</area>
<telephone>0141 353 8000</telephone>
<price>£17.50</price>
<time>7pm</time>
</gig>
Say I wanted to view the value of "date" from the gig element whose id attribute is 2. How could I do this using PHP?
Basically I want to delete, say, id 2 and then create it again, or just modify it.
Using SimpleXML, how can I delete just a certain part?
To find nodes, use XPath.
$data->xpath('//gig[@id="2"]');
It will return an array with all <gig/> nodes with an attribute id whose value is 2. Usually, it will contain 0 or 1 element. You can modify those directly. For example:
$data = simplexml_load_string(
'<data>
<gig id="1">
<date>December 19th</date>
<venue>The Zanzibar</venue>
<area>Liverpool</area>
<telephone>Ticketline.co.uk</telephone>
<price>£6</price>
<time>Time TBA</time>
</gig>
<gig id="2">
<date>Sat. 16th Jan</date>
<venue>Celtic Connection, Classic Grand</venue>
<area>Glasgow</area>
<telephone>0141 353 8000</telephone>
<price>£17.50</price>
<time>7pm</time>
</gig>
</data>'
);
$nodes = $data->xpath('//gig[@id="2"]');
if (empty($nodes))
{
// didn't find it, so there is nothing to modify
die('gig not found');
}
$gig = $nodes[0];
$gig->time = '6pm';
die($data->asXML());
Deleting arbitrary nodes is an order of magnitude more complicated, so it's much easier to modify the values rather than deleting/recreating the node.
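If a node really has to be removed, one option is to hand it over to the DOM extension, which has an explicit removeChild() method. The following is a minimal sketch of that route, assuming the DOM extension is available; it is not the answerer's code, and the sample data is shortened.

```php
<?php
// Shortened copy of the sample data from the question.
$data = simplexml_load_string(
'<data>
<gig id="1"><date>December 19th</date><venue>The Zanzibar</venue></gig>
<gig id="2"><date>Sat. 16th Jan</date><venue>Celtic Connection, Classic Grand</venue></gig>
</data>'
);

$nodes = $data->xpath('//gig[@id="2"]');
if (!empty($nodes)) {
    // Get the DOMElement that backs the SimpleXML node and detach it.
    $dom = dom_import_simplexml($nodes[0]);
    $dom->parentNode->removeChild($dom);
}

// SimpleXML and DOM share the same document, so $data reflects the removal.
$remaining = $data->xpath('//gig');
echo $data->asXML();
```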
