Simple XML to parse general recordset - php

I am trying to find a way to iterate through an XML recordset containing a namespace. However, I don't know the field names in advance. Sample XML is below.
<?xml version="1.0" encoding="utf-8"?>
<string xmlns="http://www.site.com/SMART/Updates">
<NewDataSet>
<record>
<FIELD1>data1</FIELD1>
<FIELD2>data2</FIELD2>
<FIELD3>data3</FIELD3>
</record>
<record>
<FIELD1>data1</FIELD1>
<FIELD2>data2</FIELD2>
<FIELD3>data3</FIELD3>
</record>
</NewDataSet>
Again, I won't know the field names in advance. I need to read the namespace, find the name of the root element ("NewDataSet", in this case), and then get the field names and values of the individual elements. I have tried to use $xml->getName() and $xml->xpath('\') to find the root element name, but have been unable to crack it.

(as discussed in Chat)
Plain DOM functions are the best way to process XML.
Demo or code:
<?php
header('Content-Type: text/plain');
$xml = <<<END
<?xml version="1.0" encoding="utf-8"?>
<string xmlns="http://www.site.com/SMART/Updates">
<NewDataSet>
<record>
<FIELD1>data1</FIELD1>
<FIELD2>data2</FIELD2>
<FIELD3>data3</FIELD3>
</record>
<record>
<FIELD1>data1</FIELD1>
<FIELD2>data2</FIELD2>
<FIELD3>data3</FIELD3>
</record>
</NewDataSet>
</string>
END;
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->normalize();
$dom->loadXML($xml);
echo 'Root element name: ' . $dom->firstChild->firstChild->tagName . PHP_EOL;
echo 'Number of child elements: ' . count($dom->firstChild->firstChild->childNodes) . PHP_EOL;
echo '=====' . PHP_EOL . PHP_EOL;
echo print_node($dom->firstChild->firstChild);
function print_node($node, $level = 0, $prev_level = 0) {
$result = '';
if($node->hasChildNodes()) {
foreach($node->childNodes as $subnode) {
$result .= str_repeat(' ', $level) . $node->tagName . ' =>' . PHP_EOL;
$result .= print_node($subnode, $level + 1, $level) . PHP_EOL;
}
} else {
if(trim($node->nodeValue) !== '') {
$result .= str_repeat(' ', $level) . '**Data: ' . trim($node->nodeValue) . PHP_EOL;
}
}
return $result;
}
?>
Output:
Root element name: NewDataSet
Number of child elements: 1
=====
NewDataSet =>
record =>
FIELD1 =>
**Data: data1
record =>
FIELD2 =>
**Data: data2
record =>
FIELD3 =>
**Data: data3
NewDataSet =>
record =>
FIELD1 =>
**Data: data1
record =>
FIELD2 =>
**Data: data2
record =>
FIELD3 =>
**Data: data3

Your XML is invalid, but assuming the string tag is closed after the </NewDataSet> tag:
You can get the namespaces declared in the document using getDocNamespaces().
$xml = simplexml_load_string($xmlfile);
$namespaces = $xml->getDocNamespaces(); //array of namespaces
$dataset = $xml->children(); //first child (NewDataSet)
echo $dataset->getName(); //NewDataSet
$records = $dataset->children();
$i = 0;
$result = array();
foreach ($records as $key => $value) {
foreach ($value as $fieldName => $fieldData) {
$result[$i][$fieldName] = (string)$fieldData;
}
$i++;
}
var_dump($result);
Now $result contains an array that is easier to read, with one entry per row:
array(2) {
[0]=> array(3) {
["FIELD1"]=> string(5) "data1"
["FIELD2"]=> string(5) "data2"
["FIELD3"]=> string(5) "data3"
}
[1]=> array(3) {
["FIELD1"]=> string(5) "data1"
["FIELD2"]=> string(5) "data2"
["FIELD3"]=> string(5) "data3"
}
}
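The same idea can be wrapped in a function that passes the namespace explicitly and reads field names with getName(), so nothing about the record layout is hard-coded. This is only a sketch: the function name recordsToArray is made up for illustration, and it assumes the layout from the question (wrapper element, then a data set element, then records).

```php
<?php
// Hypothetical helper: convert every record into an associative array
// without knowing the field names in advance.
function recordsToArray(string $xml): array
{
    $root = simplexml_load_string($xml);
    if ($root === false) {
        throw new InvalidArgumentException('Could not parse XML');
    }

    // The default namespace (xmlns="...") is listed under the empty prefix.
    $namespaces = $root->getDocNamespaces();
    $ns = $namespaces[''] ?? null;

    $rows = [];
    foreach ($root->children($ns) as $dataset) {          // e.g. NewDataSet
        foreach ($dataset->children($ns) as $record) {    // e.g. record
            $row = [];
            foreach ($record->children($ns) as $field) {  // FIELD1, FIELD2, ...
                $row[$field->getName()] = (string) $field;
            }
            $rows[] = $row;
        }
    }
    return $rows;
}

$sample = '<string xmlns="http://www.site.com/SMART/Updates"><NewDataSet>'
    . '<record><FIELD1>data1</FIELD1><FIELD2>data2</FIELD2></record>'
    . '<record><FIELD1>data3</FIELD1><FIELD2>data4</FIELD2></record>'
    . '</NewDataSet></string>';

print_r(recordsToArray($sample));
```

Passing the namespace URI to children() keeps the code working if the document later uses a prefix instead of a default namespace.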

Looking at the chat transcript posted in another answer, it looks like the element actually contains a string which is an escaped XML document. So there is exactly one element in the outer document, called <string>. It has no children, just content. (This looks remarkably like someone using ASP.net's service-builder.)
So the step you are missing is unescaping this inner XML to treat as a new XML document:
// Parse the outer XML, which is just one <string> node
$wrapper_sx = simplexml_load_string($wrapper_xml);
// Extract the actual XML inside it
$response_xml = (string)$wrapper_sx;
// Parse that
$response_sx = simplexml_load_string($response_xml);
// Now handle the XML
$tag_name = $response_sx->getName();
foreach ( $response_sx->children() as $child )
{
// Etc
}
// see http://github.com/IMSoP/simplexml_debug
simplexml_tree($response_sx, true);

I don't actually understand what your problem is. In the real-life XML you gave in PHP chat, there are no namespaces involved (and even if there were, it would not matter here).
Just read out the tag-name from the document element:
# echoes NewDataSet / string (depending on which XML input)
echo dom_import_simplexml($simplexml)->ownerDocument->documentElement->tagName;
If you actually have an XML document inside another XML document, you can do the following:
// load outer document
$docOuter = new DOMDocument();
$docOuter->loadXML($xmlString);
// load inner document
$doc = new DOMDocument();
$doc->loadXML($docOuter->documentElement->nodeValue);
echo "Root element is named: ", $doc->documentElement->tagName, "\n";
Or if you prefer SimpleXML:
echo "Root element is named: ",
simplexml_load_string(simplexml_load_string($xmlString))->getName()
;
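For a self-contained check of the "XML inside XML" case, here is a small sketch. The function name innerRootName and the sample wrapper are assumptions made for illustration; the real service response may look different.

```php
<?php
// Hypothetical helper: parse the wrapper, then parse its text content
// as a second XML document and report that document's root element name.
function innerRootName(string $wrapperXml): string
{
    $outer = new DOMDocument();
    if (!$outer->loadXML($wrapperXml)) {
        throw new InvalidArgumentException('Outer document is not well-formed');
    }

    // textContent returns the unescaped string stored in the wrapper element.
    $inner = new DOMDocument();
    if (!$inner->loadXML($outer->documentElement->textContent)) {
        throw new InvalidArgumentException('Inner document is not well-formed');
    }

    return $inner->documentElement->tagName;
}

// Build a wrapper whose content is an escaped XML document.
$innerXml = '<NewDataSet><record><FIELD1>data1</FIELD1></record></NewDataSet>';
$wrapped  = '<string xmlns="http://www.site.com/SMART/Updates">'
    . htmlspecialchars($innerXml, ENT_XML1)
    . '</string>';

echo innerRootName($wrapped), PHP_EOL; // NewDataSet
```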

Related

how to ignore <sub></sub> tag from xml file? please see example

Here is the XML file let's say $xml;
<?xml version="1.0"?>
<btps>
<u5_SO2>
<label>Sulphur Dioxide (SO<sub>2</sub>)</label>
<value>..</value>
<unit>mg/Nm<super>3</super></unit>
</u5_SO2>
<u5_NO2>
<label>Nitrogen Dioxide (NO<sub>2</sub>)</label>
<value>..</value>
<unit>mg/Nm<super>3</super></unit>
</u5_NO2>
</btps>
Here is the PHP script
$label = $xml->u5_SO2->label;
$value = $xml->u5_SO2->value;
$unit = $xml->u5_SO2->unit;
echo "<br>".$label;
echo "<br>".$value;
echo "<br>".$unit;
When I echo the $label variable it prints Sulphur Dioxide (SO), but what I expect is Sulphur Dioxide (SO2). Is it possible to print what I expect?
SimpleXML is not a good option if you have mixed type child nodes. With current DOM this is a lot easier because it allows for precise node manipulations. Use Xpath expressions to fetch nodes and DOM methods to manipulate them.
DOM nodes have a property textContent which allows you to read (and write) all descendant text nodes as a string.
Here is an example that replaces elements with text nodes (with unicode characters):
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
$replacements = [
'//sub' => ['2' => "\u{2082}", '3' => "\u{2083}" /*,...*/],
'//super' => ['2' => "\u{00B2}", '3' => "\u{00B3}" /*,...*/]
];
foreach ($replacements as $expression => $map) {
// fetch and iterate nodes
foreach ($xpath->evaluate($expression) as $sub) {
$content = $sub->textContent;
// check map
if (isset($map[$content])) {
// replace element with text node
$sub->parentNode->replaceChild(
$document->createTextNode($map[$content]),
$sub
);
}
}
}
echo $document->saveXML();
Output:
<?xml version="1.0"?>
<btps>
<u5_SO2>
<label>Sulphur Dioxide (SO₂)</label>
<value>..</value>
<unit>mg/Nm³</unit>
</u5_SO2>
<u5_NO2>
<label>Nitrogen Dioxide (NO₂)</label>
<value>..</value>
<unit>mg/Nm³</unit>
</u5_NO2>
</btps>
Reading the modified DOM with Xpath expressions:
foreach ($xpath->evaluate('/btps/*') as $element) {
var_dump(
[
'label' => $xpath->evaluate('string(label)', $element),
'value' => $xpath->evaluate('string(value)', $element),
'unit' => $xpath->evaluate('string(unit)', $element),
]
);
}
Output:
array(3) {
["label"]=>
string(23) "Sulphur Dioxide (SO₂)"
["value"]=>
string(2) ".."
["unit"]=>
string(7) "mg/Nm³"
}
array(3) {
["label"]=>
string(24) "Nitrogen Dioxide (NO₂)"
["value"]=>
string(2) ".."
["unit"]=>
string(7) "mg/Nm³"
}
I would argue in favor of using DOMDocument over SimpleXML, in favor of using xpath over dot notation and against regex in xml under any circumstances.
So with that said, after the usual DOMDocument boilerplate, I would use this:
$label = $xpath->evaluate('//u5_SO2/label')[0];
echo $label->textContent;
Output:
Sulphur Dioxide (SO2)
I assume you're using SimpleXML. I don't think you can easily do it with that extension alone, but you can use it together with DOMDocument (dom_import_simplexml() and DOMNode::nodeValue):
dom_import_simplexml($xml->u5_SO2->label)->nodeValue
Demo.
You can use a regex to find and remove the element, like this:
<?php
$xml = '<?xml version="1.0"?>
<btps>
<u5_SO2>
<label>Sulphur Dioxide (SO<sub>2</sub>)</label>
<value>..</value>
<unit>mg/Nm<super>3</super></unit>
</u5_SO2>
<u5_NO2>
<label>Nitrogen Dioxide (NO<sub>2</sub>)</label>
<value>..</value>
<unit>mg/Nm<super>3</super></unit>
</u5_NO2>
</btps>';
$xml = preg_replace('/(<sub>)(.*)?(<\/sub>)/', '$2', $xml);
$xml = simplexml_load_string($xml);
$label = $xml->u5_SO2->label;
$value = $xml->u5_SO2->value;
$unit = $xml->u5_SO2->unit;
echo "<br>".$label . PHP_EOL;
echo "<br>".$value . PHP_EOL;
echo "<br>".$unit . PHP_EOL;
Check in codepad
http://codepad.org/SbmbmGFl
Alternatively, you can replace the tags with placeholders and put them back after parsing the element, like this:
<?php
$xml = '<?xml version="1.0"?>
<btps>
<u5_SO2>
<label>Sulphur Dioxide (SO<sub>2</sub>)</label>
<value>..</value>
<unit>mg/Nm<super>3</super></unit>
</u5_SO2>
<u5_NO2>
<label>Nitrogen Dioxide (NO<sub>2</sub>)</label>
<value>..</value>
<unit>mg/Nm<super>3</super></unit>
</u5_NO2>
</btps>';
$xml = preg_replace('/(<sub>)(.*)?(<\/sub>)/', '___sub___$2___sub___', $xml);
$xml = simplexml_load_string($xml);
foreach($xml as $row) {
$row->label = preg_replace('/(___sub___)(.*)?(___sub___)/', '<sub>$2</sub>', $row->label);
}
$label = $xml->u5_SO2->label;
$value = $xml->u5_SO2->value;
$unit = $xml->u5_SO2->unit;
echo "<br>".$label . PHP_EOL;
echo "<br>".$value . PHP_EOL;
echo "<br>".$unit . PHP_EOL;
Check in codepad
http://codepad.org/mvnJYnGB
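If the goal is to keep the <sub> markup for HTML output without regex placeholders, one DOM-based option is to serialize the child nodes of the element. This is a sketch only; innerXml is a hypothetical helper name, not a built-in function.

```php
<?php
// Hypothetical helper: return the markup inside a node (its "inner XML")
// by serializing each child node in document order.
function innerXml(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveXML($child);
    }
    return $out;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<btps><u5_SO2><label>Sulphur Dioxide (SO<sub>2</sub>)</label></u5_SO2></btps>'
);
$label = $doc->getElementsByTagName('label')->item(0);

echo innerXml($label), PHP_EOL;    // Sulphur Dioxide (SO<sub>2</sub>)
echo $label->textContent, PHP_EOL; // Sulphur Dioxide (SO2)
```

innerXml() keeps the nested tags, while textContent flattens them to plain text, so you can pick whichever fits the output format.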

Get elements from a XML content by PHP

I am trying to get elements from this XML content, but my code returns nothing:
<results>
<error>
<string>i</string>
<description>Make I uppercase</description>
<precontext></precontext>
<suggestions>
<option>I</option>
</suggestions>
<type>grammar</type>
</error>
</results>
And this is my code to extract element type of grammar :
$dom = new DOMDocument();
$dom->loadXml($output);
$params = $dom->getElementsByTagName('error'); // Find Sections
$k=0;
foreach ($params as $param) //go to each section 1 by 1
{
if($param->type == "grammar"){
echo $param->description;
}else{
echo "other type";
}
Problem is the script returns empty.
You can use simplexml_load_string():
$output = '<results>
<error>
<string>i</string>
<description>Make I uppercase</description>
<precontext></precontext>
<suggestions>
<option>I</option>
</suggestions>
<type>grammar</type>
</error>
</results>';
$xml = simplexml_load_string($output);
foreach($xml->error as $item)
{
//echo (string)$item->type;
if($item->type == "grammar"){
echo $item->description;
}else{
echo "other type";
}
}
You apparently haven't configured PHP to report errors because your code triggers:
Notice: Undefined property: DOMElement::$type
You need to grab <type> the same way you grab <error>, using DOM methods like e.g. getElementsByTagName(). Same for node value:
if ($param->getElementsByTagName('type')->length && $param->getElementsByTagName('type')[0]->nodeValue === 'grammar') {
// Feel free to add additional checks here:
echo $param->getElementsByTagName('description')[0]->nodeValue;
}else{
echo "other type";
}
Demo
I think this is what you want.
<?php
$output = '<results>
<error>
<string>i</string>
<description>Make I uppercase</description>
<precontext></precontext>
<suggestions>
<option>I</option>
</suggestions>
<type>grammar</type>
</error>
</results>';
$dom = new DOMDocument();
$dom->loadXml($output);
$params = $dom->getElementsByTagName('error'); // Find Sections
$k=0;
foreach ($params as $param) //go to each section 1 by 1
{
$string = $param->getElementsByTagName( "string" )->item(0)->nodeValue;
$description = $param->getElementsByTagName( "description" )->item(0)->nodeValue;
$option = $param->getElementsByTagName( "option" )->item(0)->nodeValue;
$type = $param->getElementsByTagName( "type" )->item(0)->nodeValue;
echo $type;
if($type == "grammar"){
echo $description ;
}else{
echo "other type";
}
}
?>
You're mixing DOM with SimpleXML. This is possible, but you would need to convert the DOM element node into a SimpleXML instance with simplexml_import_dom().
Or you use Xpath. getElementsByTagName() is a low level DOM method. Using Xpath expressions allows for more specific access with a lot less code.
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
foreach ($xpath->evaluate('//error') as $error) {
var_dump(
[
'type' => $xpath->evaluate('string(type)', $error),
'description' => $xpath->evaluate('string(description)', $error)
]
);
}
Output:
array(2) {
["type"]=>
string(7) "grammar"
["description"]=>
string(16) "Make I uppercase"
}
Xpath expressions allow for conditions as well. For example, because type is a child element here, you could fetch all grammar errors using //error[type = "grammar"].
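A condition on a child element can be sketched like this. The function name grammarDescriptions and the second "spelling" error in the sample are made up for illustration.

```php
<?php
// Hypothetical helper: collect the description of every <error>
// whose <type> child element equals "grammar".
function grammarDescriptions(string $xml): array
{
    $document = new DOMDocument();
    if (!$document->loadXML($xml)) {
        throw new InvalidArgumentException('Could not parse XML');
    }
    $xpath = new DOMXPath($document);

    $descriptions = [];
    // The predicate [type = "grammar"] compares the text of the child element.
    foreach ($xpath->query('//error[type = "grammar"]/description') as $node) {
        $descriptions[] = $node->textContent;
    }
    return $descriptions;
}

$sample = '<results>'
    . '<error><description>Make I uppercase</description><type>grammar</type></error>'
    . '<error><description>Unknown word</description><type>spelling</type></error>'
    . '</results>';

print_r(grammarDescriptions($sample));
```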

How to get CDATA texts from XML by id in tags

I know how to access tags in XML using PHP, but this time I have to write a function getText($textId) that returns the text content of a tag by its id. I have tried so many things that I am desperate for help.
I tried this
$doc->load("localisations_EN.xml");
$texts = $doc->getElementsByTagName("txt");
$elem = $doc->getElementById("home");
$children = $elem->childNodes;
foreach ($children as $child) {
if ($child->nodeType == XML_CDATA_SECTION_NODE) {
echo $child->textContent . "<br/>";
}
}
print_r($texts);
print_r($doc->getElementById('home'));
foreach ($texts as $text)
{
foreach($text->childNodes as $child) {
if ($child->nodeType == XML_CDATA_SECTION_NODE) {
echo $child->textContent . "<br/>";
}
}
}
Then I tried this but I don't know how to access the string value
$xml=simplexml_load_file("localisations_EN.xml") or die("Error: Cannot create object");
print_r($xml);
$description = $xml->xpath("//txt[@id='home']");
var_dump($description);
And I got something like this
array(1) { [0]=> object(SimpleXMLElement)#2 (1) { ["@attributes"]=>
array(1) { ["id"]=> string(4) "home" } } }
This is the XML file I have to use
<?xml version="1.0" encoding="UTF-8" ?>
<localisation application="test1">
<part ID="menu">
<txt id="home"><![CDATA[Home]]></txt>
<txt id="news"><![CDATA[News]]></txt>
<txt id="settings"><![CDATA[Settings]]></txt>
</part>
<part ID="login">
<txt id="id"><![CDATA[Login]]></txt>
<txt id="password"><![CDATA[Password]]></txt>
<txt id="forgetPassword"><![CDATA[Forget password?]]></txt>
</part>
</localisation>
Thanks for your help.
A SimpleXML element has a __toString() magic method that returns the text content of the element (but not the text content of sub-elements),
so your SimpleXML code should be:
$xml=simplexml_load_file("localisations_EN.xml");
$description = (string) $xml->xpath("//txt[@id='home']")[0];
// ^-- triggers __toString() ^-- xpath returns array
Because xpath() returns an array of elements, you need to fetch one (or more) and cast it to a string to get the immediate contents of that element.
with DOMDocument:
I don't know why you go for the child nodes there; reading textContent on the element itself is enough. CDATA is just syntax to say "don't parse this, this is data".
$doc = new DOMDocument;
$doc->load("localisations_EN.xml");
$texts = $doc->getElementsByTagName('txt');
foreach($texts as $text) {
if($text->getAttribute('id') == 'home') {
// prepend hasAttribute('id') if needed to if clause above
$description = $text->textContent;
}
}
Also, $doc->getElementById() only works if a DTD declares some attribute as type ID (or the attribute is xml:id). Since your XML doesn't do that (it doesn't name a DTD), it doesn't work.
with DOMDocument and DOMXPath
// $doc as before
$xpath = new DOMXPath($doc);
$description = $xpath->evaluate('//txt[@id="home"]')[0]->textContent;
// as before, the expression returns a list of nodes, that's why ---^
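Putting this together, a getText() lookup could be sketched as below. The helper loadTexts and the extra $texts parameter on getText are assumptions for illustration (the question only names getText($textId)); the sample is loaded from a string so the example is self-contained, whereas the question loads localisations_EN.xml from disk.

```php
<?php
// Hypothetical helper: index every <txt> element by its id attribute.
function loadTexts(string $xml): array
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        throw new InvalidArgumentException('Could not parse XML');
    }

    $texts = [];
    foreach ((new DOMXPath($doc))->query('//txt[@id]') as $txt) {
        // textContent includes the content of CDATA sections.
        $texts[$txt->getAttribute('id')] = $txt->textContent;
    }
    return $texts;
}

// Returns the text for the given id, or null if the id is unknown.
function getText(array $texts, string $textId): ?string
{
    return $texts[$textId] ?? null;
}

$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8" ?>
<localisation application="test1">
<part ID="menu">
<txt id="home"><![CDATA[Home]]></txt>
<txt id="news"><![CDATA[News]]></txt>
</part>
<part ID="login">
<txt id="forgetPassword"><![CDATA[Forget password?]]></txt>
</part>
</localisation>
XML;

$texts = loadTexts($xml);
echo getText($texts, 'home'), PHP_EOL;           // Home
echo getText($texts, 'forgetPassword'), PHP_EOL; // Forget password?
```

Building the map once avoids re-running an XPath query for every lookup and sidesteps quoting problems when an id is interpolated into an expression.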

Parse XML to PHP using ID value

How can I echo XML values with PHP by referring to their "columnId" rather than their position in the array? (The array is really long.)
Here is a sample of the xml :
<Data>
<Value columnId="ITEMS_SOLD">68</Value>
<Value columnId="TOTAL_SALES">682</Value>
<Value columnId="SHIPPING_READY">29</Value>
...
</Data>
The following php gives me all of the values :
$url = 'XXX';
$xml = file_get_contents($url);
$feed = simplexml_load_string($xml) or die("Error: Cannot create object");
foreach($feed->Data->Value as $key => $value){
echo $value;
}
I would like to be able to use something like that in my document :
echo $feed->Data->Value['TOTAL_SALES'];
Thank you for your help.
echo $feed->Data->Value[1];
Here is another approach. You can convert the XML object into an array and use that for further processing. Try this code:
<?php
$url = 'XXX';
//Read xml data, If file exist...
if (file_exists($url)) {
//Load xml file...
$xml = simplexml_load_file($url);
$arrColumn = array();//Variable initialization...
$arrFromObj = (array) $xml;//Convert object to array...
$i = 0;//Variable initialization with value...
//Loop until data...
foreach($xml AS $arrKey => $arrData) {
$columnId = (string) $arrData['columnId'][0];//array object to string...
$arrColumn[$columnId] = $arrFromObj['Value'][$i];//assign data to array...
$i++;//Incremental variable...
}
} else {//Condition if file not exist and display message...
exit('Failed to open file');
}
?>
Above code will store result into array variable $arrColumn and result is:
Array
(
[ITEMS_SOLD] => 68
[TOTAL_SALES] => 682
[SHIPPING_READY] => 29
)
Hope this helps!
Use XPath. SimpleXML and DOM support it, but SimpleXML has some limits (It can only fetch node lists).
SimpleXML
$feed = simplexml_load_string($xml);
var_dump(
(string)$feed->xpath('//Value[@columnId = "TOTAL_SALES"]')[0]
);
Output:
string(3) "682"
DOM
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
var_dump(
$xpath->evaluate('string(//Value[@columnId = "TOTAL_SALES"])')
);
Output:
string(3) "682"
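If many values are needed, it may be simpler to build a lookup array keyed by columnId once. A minimal sketch, assuming the <Data> block sits somewhere inside the feed; the function name valuesByColumnId and the <Feed> wrapper in the sample are made up for illustration.

```php
<?php
// Hypothetical helper: map each columnId attribute to its element text.
function valuesByColumnId(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        throw new InvalidArgumentException('Could not parse XML');
    }

    $map = [];
    // Select only <Value> elements that actually carry a columnId attribute.
    foreach ($feed->xpath('//Value[@columnId]') as $value) {
        $map[(string) $value['columnId']] = (string) $value;
    }
    return $map;
}

$sample = '<Feed><Data>'
    . '<Value columnId="ITEMS_SOLD">68</Value>'
    . '<Value columnId="TOTAL_SALES">682</Value>'
    . '<Value columnId="SHIPPING_READY">29</Value>'
    . '</Data></Feed>';

$values = valuesByColumnId($sample);
echo $values['TOTAL_SALES'], PHP_EOL; // 682
```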

how to form a path to XML subnodes with a multidimensional array

Where do I begin... an XML file needs to go into a database. Therefore I want to make a config array containing the mapping between XML nodes and the columns of one table.
$maps = array(
// 'node-name'=>'column-name'
'prod_id'=>'supplier_product_id',
'description'=>'product_description',
);
$xml=simplexml_load_file($file);
//just a test
foreach ($maps as $node => $col){
echo 'node ' . $xml->$node . ' is mapped to: ' . $col; //this works
}
There is information I need to put in this (same) table, from a subnode. So I was thinking of putting subnodes in a nested array like this:
$maps = array(
// 'node-name'=>'column-name'
'prod_id'=>'supplier_product_id',
'description'=>'product_description',
// to access $xml->node->subnode;
'category'=>array(
'id'=>'category_id',
),
);
But now I am confused: how can I use the nested array to build a path to the node like this:
$xml->category->id
I am a newbie in PHP and hopefully some help will get me back on the road again.
All help is welcome, thank you in advance.
Try this:
<?php
$maps = array(
// 'node-name'=>'column-name'
'prod_id'=>'supplier_product_id',
'description'=>'product_description',
// to access $xml->node->subnode;
'category'=>array(
'id'=>'category_id'
)
);
function getDataMapping( $maps, $child="" ) {
global $file;
$xml=simplexml_load_file($file);
foreach ($maps as $node => $col) {
if( is_array( $col ) ) {
getDataMapping( $col, $node );
} else {
if( $child ) {
echo 'node ' . $xml->{$child}->$node . ' is mapped to: ' . $col; //this works
} else {
echo 'node ' . $xml->$node . ' is mapped to: ' . $col; //this works
}
}
}
}
getDataMapping( $maps );
?>
Obviously, if your nesting runs many levels deep (array within array and so on), you would need to extend the function to carry the full path, since it only remembers one parent level.
Hope this helps.
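For arbitrary nesting depth, one option is to pass the current element down the recursion instead of a parent name. This is a sketch under assumptions: the function name mapNodes, the column names in $maps, and the sample job document are illustrative only.

```php
<?php
// Hypothetical helper: walk the mapping array and the XML element in parallel.
// A nested array in $maps descends into the child element of the same name.
function mapNodes(SimpleXMLElement $node, array $maps): array
{
    $row = [];
    foreach ($maps as $name => $target) {
        if (is_array($target)) {
            // Descend one level: $node->category, $node->contact, ...
            $row += mapNodes($node->{$name}, $target);
        } else {
            // Leaf: store the node text under the configured column name.
            $row[$target] = (string) $node->{$name};
        }
    }
    return $row;
}

$maps = [
    'id'          => 'job_id',
    'description' => 'job_description',
    'contact'     => [
        'id'   => 'contact_id',
        'name' => 'contact_name',
    ],
];

$xml = simplexml_load_string(
    '<message><body><jobs><job><id>1</id>'
    . '<description>Nice job at the office</description>'
    . '<contact><id>SYL</id><name>Sylvia</name></contact>'
    . '</job></jobs></body></message>'
);

$rows = [];
foreach ($xml->xpath('//job') as $job) {
    $rows[] = mapNodes($job, $maps);
}
print_r($rows);
```

Each resulting row is keyed by column name, so it can be handed to a prepared INSERT statement. A missing node simply yields an empty string.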
Here's the XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<message>
<errorcode>100</errorcode>
<body>
<jobs>
<job>
<id>1</id>
<description>Nice job at the office</description>
<hours>40</hours>
<contact>
<id>SYL</id>
<name>Sylvia</name>
<email>sylvia@mail.com</email>
</contact>
</job>
<job>
<id>2</id>
<description>Construction work</description>
<hours>32</hours>
<contact>
<id>HEN</id>
<name>Hendrik</name>
<email>hendrik@mail.com</email>
</contact>
</job>
</jobs>
</body>
<attachements>
</attachements>
<filenames>
</filenames>
</message>
I just realized there is actually no need to read more than 2 levels. So your answer is enough, because it can access both $xml->node and $xml->child->node. With xpath('//job') I can set a base node as the entry point to all the nodes and iterate over them.
