I'm trying to convert XML to JSON and back to XML to test a service, and I'm having an issue with repeated keys being represented incorrectly.
The following valid XML is the starting point:
<foo>
<bars>
<bar>
<url>http://url</url>
</bar>
<bar>
<url>http://url</url>
</bar>
</bars>
</foo>
Which converts to json:
{"bars":{"bar":[{"url":"http:\/\/url"},{"url":"http:\/\/url"}]}}
Every solution I've seen to similar questions ends up rendering the resulting xml as something like:
<bars>
<bar>
<n0>
<url>http://url</url>
</n0>
<n1>
<url>http://url</url>
</n1>
</bar>
</bars>
Obviously, I need to get back to the original xml. And the structure is quite complex and variable, so I can't count on a particular structure.
Any ideas?
I've written two functions that encode and decode XML. The first takes an XML source as a SimpleXMLElement and converts it into an array. Note that it doesn't deal with attributes, but it seems to work for your test case and a few others I've tried (the example below has a slight modification to the XML to check this). The second takes the same array and converts it into a string with the XML reconstructed. There is a lot of recursion going on, but the routines are quite short, so hopefully easy(ish) to follow...
function xmlToArray ( $base, SimpleXMLElement $node ) {
$nodeName = $node->getName();
$childNodes = $node->children();
if ( count($childNodes) == 0 ) {
$base[ $nodeName ] = (string)$node;
}
else {
$new = [];
foreach ( $childNodes as $newNode ) {
$new[] = xmlToArray($base, $newNode);
}
$base[$nodeName] = count($new)>1?$new:$new[0];
}
return $base;
}
function arrayToXML ( $base ) {
// Start with an empty string and append, so several keys at one level are all kept
$xml = '';
foreach ( $base as $name => $node ) {
$xml .= "<{$name}>";
if ( $node instanceof stdClass ){
$xml .= arrayToXML($node);
}
elseif ( is_array($node) ) {
foreach ( $node as $ele ){
$xml .= arrayToXML($ele);
}
}
else {
$xml .= $node;
}
$xml .= "</{$name}>";
}
return $xml;
}
$xml_string = <<< XML
<foo>
<bars>
<bar>
<url>http://url1</url>
</bar>
<bar>
<url>http://url2</url>
</bar>
<url>http://url3</url>
</bars>
</foo>
XML;
$source = simplexml_load_string($xml_string);
$xml = xmlToArray([], $source);
$enc = json_encode($xml);
echo $enc.PHP_EOL;
$dec = json_decode($enc);
$target = arrayToXML ($dec);
echo $target;
This outputs the JSON and the XML at the end as...
{"foo":{"bars":[{"bar":{"url":"http:\/\/url1"}},{"bar":{"url":"http:\/\/url2"}},{"url":"http:\/\/url3"}]}}
<foo><bars><bar><url>http://url1</url></bar><bar><url>http://url2</url></bar><url>http://url3</url></bars></foo>
You could use PHP's file-handling functions to read the XML file line by line (or by a fixed number of characters, if the tag names have a fixed length) and, using simple if conditions, write the JSON string to a file.
This may work out.
There are many different ways of converting JSON to XML, or XML to JSON. They all work differently, and there is no single method that is always best. They all have to make some kind of compromise between usability and faithful round-tripping (for example, your library has dropped the outer "foo" element, which therefore can't be reconstituted on the reverse conversion).
You could devise a mapping of arbitrary XML to JSON that allows faithful round-tripping back to XML, but the JSON representation wouldn't be particularly user-friendly, especially for example if you need faithful round-tripping of namespaces.
XSLT 3.0 incidentally does the reverse: it has functions that will convert any JSON input losslessly to (a rather unfriendly vocabulary of) XML, and then convert the result faithfully back to the original JSON. You need the opposite of that.
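A minimal sketch of such a round-trippable mapping, using PHP's DOM extension. The helper names elementToArray and arrayToElement and the "name"/"attrs"/"children" keys are made up for this illustration, not part of any library. It keeps element order, attributes, and text, but ignores namespaces, comments, and processing instructions, and the resulting JSON is verbose, which is the trade-off described above.

```php
<?php
// Hypothetical helpers for illustration only.

// Convert a DOM element into an array holding its name, its attributes,
// and an ordered list of children (nested arrays for elements, strings for text).
function elementToArray(DOMElement $el): array
{
    $attrs = [];
    foreach ($el->attributes as $attr) {
        $attrs[$attr->nodeName] = $attr->nodeValue;
    }
    $children = [];
    foreach ($el->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $children[] = elementToArray($child);
        } elseif ($child instanceof DOMText) {
            $children[] = $child->nodeValue;
        }
    }
    return ['name' => $el->nodeName, 'attrs' => $attrs, 'children' => $children];
}

// Rebuild a DOM element from the array produced by elementToArray().
function arrayToElement(array $data, DOMDocument $doc): DOMElement
{
    $el = $doc->createElement($data['name']);
    foreach ($data['attrs'] as $name => $value) {
        $el->setAttribute($name, $value);
    }
    foreach ($data['children'] as $child) {
        $el->appendChild(
            is_array($child) ? arrayToElement($child, $doc) : $doc->createTextNode($child)
        );
    }
    return $el;
}

$xml = '<foo><bars><bar><url>http://url</url></bar><bar><url>http://url</url></bar></bars></foo>';

// XML -> JSON
$in = new DOMDocument();
$in->loadXML($xml);
$json = json_encode(elementToArray($in->documentElement));

// JSON -> XML
$out = new DOMDocument();
$out->appendChild(arrayToElement(json_decode($json, true), $out));
$roundTripped = $out->saveXML($out->documentElement);

echo $roundTripped, PHP_EOL;
```

Because repeated elements are stored as separate entries in an ordered list rather than merged under one key, the two bar elements come back as two bar elements.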
I am trying to read an XML file in PHP, edit some values and save it back.
I do it by opening the XML file in php. I then convert it using SimpleXML into an array. After doing the manipulation needed, I am struggling in returning that array into the XML file in the same format due to how my XML elements are converted into attributes. Hence when I go from array to XML, my elements (which are attributes now) are saved as attributes in the updated XML file. I would like to know if it's possible to preserve XML elements when converting back from php array to XML.
A random XML example with two elements; let's call it myFile.xml
<XML>
<Project Element1='some random value' Element2='Will be stored as attribute instead'/>
</XML>
The php code I would run to convert it into an array
<?php
$xml = simplexml_load_file("myFile.xml") or die("Error: Cannot create object");
$arrayXML = json_decode(json_encode((array)$xml), TRUE);
$arrayXML["Project"]["attributes"]["Element1"] = "updated value";
// I will then run some array to XML converter code here found online
// took it from here https://stackoverflow.com/questions/1397036/how-to-convert-array-to-simplexml
function array_to_xml( $data, &$xml_data ) {
foreach( $data as $key => $value ) {
if( is_array($value) ) {
if( is_numeric($key) ){
$key = 'item'.$key; //dealing with <0/>..<n/> issues
}
$subnode = $xml_data->addChild($key);
array_to_xml($value, $subnode);
} else {
$xml_data->addChild("$key",htmlspecialchars("$value"));
}
}
}
// SimpleXMLElement needs an initial XML string for the root element
$xml_data = new SimpleXMLElement('<XML/>');
array_to_xml($arrayXML, $xml_data);
// save the generated XML file
$result = $xml_data->asXML('myFile.xml');
?>
Something like this would then generate an XML file like this
<XML>
<Project>
<attribute>
<Element1>updated value</Element1>
<Element2>Will be stored as attribute instead</Element2>
</attribute>
</Project>
</XML>
When the result I would like to have would be
<XML>
<Project Element1='updated value' Element2='Will be stored as attribute instead'/>
</XML>
I could write my own XML converter but if there exist already methods out there, can someone show me the way?
Don't convert the XML. You will lose data unless you use a specific format like JsonML. It is much easier to use DOM: use XPath expressions to fetch the nodes and modify them.
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
// iterate the first 'Project' element
foreach($xpath->evaluate('(/XML/Project)[1]') as $project) {
// change the attribute value
$project->setAttribute('Element1', 'updated value');
}
echo $document->saveXML();
Output:
<?xml version="1.0"?>
<XML>
<Project Element1="updated value" Element2="Will be stored as attribute instead"/>
</XML>
XPath expressions:
'XML' document element: /XML
'Project' child elements: /XML/Project
Limit to first found node: (/XML/Project)[1]
This example uses the position in the result list as a condition, but if the project has an id attribute you could use this to find the element: /XML/Project[@id="example-id"].
I have an XML file that contains the following type of data
<definition name="/products/phone" path="/main/something.jsp" > </definition>
There are dozens of nodes in the xml file.
What I want to do is extract the url under the 'name' parameter so my end result will be:
http://www.mysite.com/products/phone.jsp
Can I do this with a so-called XML parser? I have no idea where to begin. Can someone point me in the right direction? What tools do I need to achieve something like that?
I am particularly interested in doing this with PHP.
It should be easy to append a path to an existing URL and expected resource type given the above basic XML.
If you are comfortable with C#, and you know there is one and only one "definition" element, here is a self contained little program that does what you require (and assumes you are loading the XML from a string):
using System;
using System.Xml;
public class parseXml
{
private const string myDomain = "http://www.mysite.com"; // no trailing slash: the name attribute already starts with one
private const string myExtension = ".jsp";
public static void Main()
{
string xmlString = "<definition name='/products/phone' path='/main/something.jsp'> </definition>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlString);
string fqdn = myDomain +
doc.DocumentElement.SelectSingleNode("//definition").Attributes["name"].Value +
myExtension;
Console.WriteLine("Original XML: {0}\nResultant FQDN: {1}", xmlString, fqdn);
}
}
You are going to need to be careful with SelectSingleNode above; the XPath expression assumes there is only one "definition" node and that you are searching from the document root.
Fundamentally, it's worthwhile to read a primer on XML. Xml is not difficult, it's a self describing hierarchical data format - lots of nested text, angle brackets, and quotation marks :).
A good primer would probably be that at the W3 Schools:
http://www.w3schools.com/xml/xml_whatis.asp
You may also want to read up on streaming (SAX/StreamReader) vs. loading (DOM/XmlDocument) Xml:
What is the difference between SAX and DOM?
I can provide a Java example too, if you feel that would be helpful.
Not sure if you solved your problem, so here is a PHP solution:
$xml = <<<DATA
<?xml version="1.0"?>
<root>
<definition name="/products/phone" path="/main/something.jsp"> </definition>
<definition name="/products/cell" path="/main/something.jsp"> </definition>
<definition name="/products/mobile" path="/main/something.jsp"> </definition>
</root>
DATA;
$arr = array();
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadXML($xml); // the input is XML, so use the XML loader rather than the HTML one
$xpath = new DOMXPath($dom);
$defs = $xpath->query('//definition');
foreach($defs as $def) {
$attr = $def->getAttribute('name');
if ($attr != "") {
array_push($arr, $attr);
}
}
print_r($arr);
See IDEONE demo
Result:
Array
(
[0] => /products/phone
[1] => /products/cell
[2] => /products/mobile
)
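To get from the name attributes to the full URLs asked for in the question, a small sketch along the same lines follows. The $baseUrl and $extension values are assumptions taken from the example URL in the question, and the XPath selects the attribute nodes directly instead of the elements.

```php
<?php
// Assumed values, based on the desired result shown in the question.
$baseUrl   = 'http://www.mysite.com';
$extension = '.jsp';

$xml = <<<DATA
<?xml version="1.0"?>
<root>
<definition name="/products/phone" path="/main/something.jsp"> </definition>
<definition name="/products/cell" path="/main/something.jsp"> </definition>
</root>
DATA;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$urls = [];
// '//definition/@name' returns the name attribute of every definition element.
foreach ($xpath->query('//definition/@name') as $attr) {
    $urls[] = $baseUrl . $attr->nodeValue . $extension;
}

print_r($urls);
```

Definitions without a name attribute simply produce no match, so no empty-string check is needed.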
I've come across a weird but apparently valid XML string that I'm being returned by an API. I've been parsing XML with SimpleXML because it's really easy to pass it to a function and convert it into a handy array.
The following is parsed incorrectly by SimpleXML:
<?xml version="1.0" standalone="yes"?>
<Response>
<CustomsID>010912-1
<IsApproved>NO</IsApproved>
<ErrorMsg>Electronic refunds...</ErrorMsg>
</CustomsID>
</Response>
Simple XML results in:
SimpleXMLElement Object ( [CustomsID] => 010912-1 )
Is there a way to parse this in XML? Or another XML library that returns an object that reflects the XML structure?
That is an odd response with the text along with other nodes. If you manually traverse it (not as an array, but as an object) you should be able to get inside:
<?php
$xml = '<?xml version="1.0" standalone="yes"?>
<Response>
<CustomsID>010912-1
<IsApproved>NO</IsApproved>
<ErrorMsg>Electronic refunds...</ErrorMsg>
</CustomsID>
</Response>';
$sObj = new SimpleXMLElement( $xml );
var_dump( $sObj->CustomsID );
exit;
?>
Results in second object:
object(SimpleXMLElement)#2 (2) {
["IsApproved"]=>
string(2) "NO"
["ErrorMsg"]=>
string(21) "Electronic refunds..."
}
You are already parsing the XML with SimpleXML. I guess you want to parse it into a handy array, which you do not define further.
The problem with the XML you have is that its structure is not very distinct. If it does not change much, you can convert it into an array using a SimpleXMLIterator instead of a SimpleXMLElement:
$it = new SimpleXMLIterator($xml);
$mode = RecursiveIteratorIterator::SELF_FIRST;
$rit = new RecursiveIteratorIterator($it, $mode);
$array = array_map('trim', iterator_to_array($rit));
print_r($array);
For the XML-string in question this gives:
Array
(
[CustomsID] => 010912-1
[IsApproved] => NO
[ErrorMsg] => Electronic refunds...
)
See as well the online demo and How to parse and process HTML/XML with PHP?.
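If the structure is known in advance, another option is to pick the values out by name. The sketch below relies on the documented behaviour that casting a SimpleXMLElement to string returns only the text directly inside that element, not the text of its children. The $result array layout is just an illustration.

```php
<?php
$xml = '<?xml version="1.0" standalone="yes"?>
<Response>
<CustomsID>010912-1
<IsApproved>NO</IsApproved>
<ErrorMsg>Electronic refunds...</ErrorMsg>
</CustomsID>
</Response>';

$response = new SimpleXMLElement($xml);
$customs  = $response->CustomsID;

$result = [
    // The cast yields the element's own text; trim() drops the surrounding line breaks.
    'CustomsID'  => trim((string) $customs),
    'IsApproved' => (string) $customs->IsApproved,
    'ErrorMsg'   => (string) $customs->ErrorMsg,
];

print_r($result);
```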
I have a PHP script that pulls an XML file from a remote server, and converts it to JSON using simplexml_load_string and json_encode. However, the simplexml_load_string seems to ignore inline attributes, like so:
<AxisFeedrate dataItemId="iid7" timestamp="2012-03-21T15:15:41-04:00" sequence="7" name="Yfrt" subType="ACTUAL" units="MILLIMETER/SECOND">UNAVAILABLE</AxisFeedrate>
In this case the JSON representation would be {AxisFeedrate: 'UNAVAILABLE'}
However, I need to have those attributes available. One idea I've been approaching is replacing strings to turn the attributes into text nodes like so:
<AxisFeedrate>
<dataItemId>iid7</dataItemId>
<timestamp>2012-03-21T15:15:41-04:00</timestamp>
<sequence>7</sequence>
<name>Yfrt</name>
<subType>ACTUAL</subType>
<units>MILLIMETER/SECOND</units>
<value>UNAVAILABLE</value>
</AxisFeedrate>
I can turn the attributes into their own tag elements with regular find/replace, but I'm having trouble wrapping the original text value in a Value tag, at least with find/replace.
What are some good approaches for doing this? The above chunk of XML is in the middle of many similar chunks on different data items, so I couldn't just start by replacing the first closing > with >...
You could use SimpleXML itself to read the attributes.
Example:
<?php
$xml=simplexml_load_string('<AxisFeedrate dataItemId="iid7" timestamp="2012-03-21T15:15:41-04:00" sequence="7" name="Yfrt" subType="ACTUAL" units="MILLIMETER/SECOND">UNAVAILABLE</AxisFeedrate>');
foreach($xml->attributes() as $k=>$v) {
echo $k." -> ".(string)$v."\n";
}
?>
Output:
dataItemId -> iid7
timestamp -> 2012-03-21T15:15:41-04:00
sequence -> 7
name -> Yfrt
subType -> ACTUAL
units -> MILLIMETER/SECOND
Try this regex: ([\w]*?)="(.*?)" with this replace <$1>$2</$1>\n
You should use SimpleXML. Be aware though, that you have to cast values to string type explicitly, or you'll get objects.
$xml_string = <<<XML
<AxisFeedrate
dataItemId="iid7"
timestamp="2012-03-21T15:15:41-04:00"
sequence="7"
name="Yfrt"
subType="ACTUAL"
units="MILLIMETER/SECOND"
>UNAVAILABLE</AxisFeedrate>
XML;
$xml = simplexml_load_string($xml_string);
$axis_info = array('value' => (string)$xml);
foreach($xml -> attributes() as $attr => $val) {
$axis_info[$attr] = (string) $val;
}
echo json_encode(array("AxisFeedrate" => $axis_info));
Update:
This will give you a more generic version, but notice that the attributes are cast as an array and that this only works on a single element:
$xml_string = <<<XML
<AxisFeedrate dataItemId="iid7" timestamp="2012-03-21T15:15:41-04:00" sequence="7" name="Yfrt" subType="ACTUAL" units="MILLIMETER/SECOND">UNAVAILABLE</AxisFeedrate>
XML;
$xml = simplexml_load_string($xml_string);
$obj_name = $xml -> getName();
$attributes = (array) $xml->attributes();
$axis_info[$obj_name] = $attributes["@attributes"];
$axis_info[$obj_name]['value'] = (string) $xml;
echo json_encode($axis_info);
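To go beyond a single element, the same idea can be applied recursively. This is only a sketch: nodeToArray is a hypothetical helper, the "@attributes" and "@value" keys are naming choices (picked because they cannot clash with element names), the Streams wrapper element and the second AxisFeedrate are invented sample data, and namespaced attributes are not handled.

```php
<?php
// Hypothetical helper: collects attributes, own text, and child elements recursively.
function nodeToArray(SimpleXMLElement $node): array
{
    $out = [];
    foreach ($node->attributes() as $name => $value) {
        $out['@attributes'][$name] = (string) $value;
    }
    $text = trim((string) $node);
    if ($text !== '') {
        $out['@value'] = $text;
    }
    foreach ($node->children() as $child) {
        // Always store children as a list so repeated tags are not overwritten.
        $out[$child->getName()][] = nodeToArray($child);
    }
    return $out;
}

// Invented sample input with two data items inside a wrapper element.
$xml = simplexml_load_string(
    '<Streams>'
    . '<AxisFeedrate dataItemId="iid7" units="MILLIMETER/SECOND">UNAVAILABLE</AxisFeedrate>'
    . '<AxisFeedrate dataItemId="iid8" units="MILLIMETER/SECOND">12</AxisFeedrate>'
    . '</Streams>'
);

$data = [$xml->getName() => nodeToArray($xml)];
echo json_encode($data, JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Storing every child under a list keeps the JSON shape the same whether a tag occurs once or many times, at the cost of an extra array level for single children.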
I need to return a SimpleXML object converted as a JSON object to work with it in JavaScript. The problem is that there are no attributes on any object with a value.
As an example:
<customer editable="true" maxChars="9" valueType="numeric">69236</customer>
becomes in the SimpleXML object:
"customer":"69236"
Where is the @attributes object?
This has driven me crazy on several occasions. When SimpleXML encounters a node that only has a text value, it drops all the attributes. My workaround has been to modify the XML prior to parsing with SimpleXML. With a bit of regular expressions, you can create a child node that contains the actual text value. For example, in your situation you can change the XML to:
<customer editable="true" maxChars="9" valueType="numeric"><value>69236</value></customer>
Some example code assuming that your XML string was in $str:
$str = preg_replace('/<customer ([^>]*)>([^<>]*)<\/customer>/i', '<customer $1><value>$2</value></customer>', $str);
$xml = @simplexml_load_string($str);
That would preserve the attributes and nest the text value in a child node.
I realize this is an old post, but in case it proves useful: the code below extends @ryanmcdonnell's solution to work on any tag instead of a hard-coded one. Hopefully it helps someone.
$str = preg_replace('/<([^ ]+) ([^>]*)>([^<>]*)<\/\\1>/i', '<$1 $2><value>$3</value></$1>', $result);
The main difference is that it replaces /<customer with /<([^ ]+), and then </customer> with </\\1>
which tells it to match that part of the search against the first element in the pattern.
Then it just adjusts the placeholders ($1,$2,$3) to account for the fact that there are three sub-matches now instead of two.
So it appears that this is a bug and is fixed in PHP 7.4.5.
It's an old question, but I found something that works neat - parse it into a DOMNode object.
// $customer contains the SimpleXMLElement
$customerDom = dom_import_simplexml($customer);
var_dump($customerDom->getAttribute('valueType'));
Will show:
string 'numeric'
Here's some code to iterate through attributes and construct JSON. It supports one or many customers.
If your XML looks like this (or has just one customer)
<xml>
<customer editable="true" maxChars="9" valueType="numeric">69236</customer>
<customer editable="true" maxChars="9" valueType="numeric">12345</customer>
<customer editable="true" maxChars="9" valueType="numeric">67890</customer>
</xml>
Iterate through it like this.
try {
$xml = simplexml_load_file( "customer.xml" );
// Find the customer
$result = $xml->xpath('/xml/customer');
$bFirstElement = true;
echo "var customers = {\r\n";
foreach($result as $node) { // each() was removed in PHP 8; foreach does the same job
if( $bFirstElement ) {
echo "'". $node."':{\r\n";
$bFirstElement = false;
} else {
echo ",\r\n'". $node."':{\r\n";
}
$bFirstAtt = true;
foreach($node->attributes() as $a => $b) {
if( $bFirstAtt ) {
echo "\t".$a.":'".$b."'";
$bFirstAtt = false;
} else {
echo ",\r\n\t".$a.":'".$b."'";
}
}
echo "}";
}
echo "\r\n};\r\n";
} catch( Exception $e ) {
echo "Exception on line ".$e->getLine()." of file ".$e->getFile()." : ".$e->getMessage()."<br/>";
}
To produce a JSON structure like this
var customers = {
'69236':{
editable:'true',
maxChars:'9',
valueType:'numeric'},
'12345':{
editable:'true',
maxChars:'9',
valueType:'numeric'},
'67890':{
editable:'true',
maxChars:'9',
valueType:'numeric'}
};
Finally, in your script, access the attribute like this
WScript.Echo( customers["12345"].editable );
Good luck