Prepending raw XML using PHP's SimpleXML - php

Given a base $xml and a file containing a <something> tag with attributes, children and children of its children, I would like to insert that tag, together with all of its children, as the first child of the root, as raw XML.
Original XML:
<root>
<people>
<person>
<name>John Doe</name>
<age>47</age>
</person>
<person>
<name>James Johnson</name>
<age>13</age>
</person>
</people>
</root>
XML in file:
<something someval="x" otherthing="y">
<child attr="val" ..> { some children and values ... }</child>
<child attr="val2" ..> { some children and values ... }</child>
...
</something>
Result XML:
<root>
<something someval="x" otherthing="y">
<child attr="val" ..> { some children and values ... }</child>
<child attr="val2" ..> { some children and values ... }</child>
...
</something>
<people>
<person>
<name>John Doe</name>
<age>47</age>
</person>
<person>
<name>James Johnson</name>
<age>13</age>
</person>
</people>
</root>
This tag would contain several children both direct and recursively, so it would not be practical to build the XML via the SimpleXML operations. Besides, keeping it in a file would result in lower maintenance costs.
Technically it would simply be prepending one child. The problem is that this child would have other children and so on.
On the PHP addChild page there's a comment that says:
$x = new SimpleXMLElement('<root name="toplevel"></root>');
$f1 = new SimpleXMLElement('<child pos="1">alpha</child>');
$x->{$f1->getName()} = $f1; // adds $f1 to $x
However, this does not seem to treat my XML as raw XML, so the tags come out escaped as &lt; and &gt;. Several warnings concerning namespaces appear as well.
I suppose I could do a quick replace of such tags but I am not sure whether it could cause future problems and it certainly does not feel right.
Manually hacking the XML is not an option and neither is adding children one by one. Choosing a different library could be.
Any clues on how to get this working?
Thanks!

I'm really not sure if that will work. Try this or downvote this, but I hope it helps. Using DOMDocument (Reference)
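<?php
$xml = new DOMDocument();
// the input is XML, so parse it with loadXML() rather than loadHTML()
$xml->loadXML($yourOriginalXML);
// createElement() only accepts a tag name; a document fragment can parse raw XML
// (the raw XML must not start with an <?xml ... ?> declaration)
$newNode = $xml->createDocumentFragment();
$newNode->appendXML($someXMLtoPrepend);
$nodeRoot = $xml->getElementsByTagName('root')->item(0);
$nodeOriginal = $xml->getElementsByTagName('people')->item(0);
// insert the fragment before <people>, i.e. as the first child of <root>
$nodeRoot->insertBefore($newNode, $nodeOriginal);
$finalXmlAsString = $xml->saveXML();
?>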
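Encoding can sometimes cause problems. loadXML() expects UTF-8 unless the XML declaration names another encoding, so if your strings are in a different encoding, convert them first (ISO-8859-1 below is only an example):
<?php
$xml = new DOMDocument();
// 'ISO-8859-1' is an assumed source encoding; use the one your data is really in
$xml->loadXML(mb_convert_encoding($yourOriginalXML, 'UTF-8', 'ISO-8859-1'));
$newNode = $xml->createDocumentFragment();
$newNode->appendXML(mb_convert_encoding($someXMLtoPrepend, 'UTF-8', 'ISO-8859-1'));
$nodeRoot = $xml->getElementsByTagName('root')->item(0);
$nodeOriginal = $xml->getElementsByTagName('people')->item(0);
$nodeRoot->insertBefore($newNode, $nodeOriginal);
$finalXmlAsString = $xml->saveXML();
?>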
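Since the question starts from a SimpleXMLElement, the same idea can also be sketched without leaving SimpleXML for long. This is only a sketch under assumptions: prependRawXml() is a made-up helper name, the sample strings are shortened stand-ins for the question's data, and the raw XML is assumed to carry no XML declaration. dom_import_simplexml() gives a DOM view of the same underlying document, so the change is visible through the original SimpleXML object.

```php
<?php
// Hypothetical helper: prepend a raw XML string as the first child of $parent.
function prependRawXml(SimpleXMLElement $parent, string $rawXml): void
{
    // DOM view of the same node; no copy is made
    $parentNode = dom_import_simplexml($parent);
    $fragment = $parentNode->ownerDocument->createDocumentFragment();
    // appendXML() parses the string as XML instead of escaping it
    if (!$fragment->appendXML($rawXml)) {
        throw new RuntimeException('Raw XML could not be parsed');
    }
    // insertBefore() with a null reference node behaves like appendChild()
    $parentNode->insertBefore($fragment, $parentNode->firstChild);
}

$xml = new SimpleXMLElement(
    '<root><people><person><name>John Doe</name><age>47</age></person></people></root>'
);
// In practice the string could come from file_get_contents() on the file
$raw = '<something someval="x" otherthing="y"><child attr="val">text</child></something>';

prependRawXml($xml, $raw);
echo $xml->asXML();
```

After the call, $xml->something can be read like any other SimpleXML child.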

Related

Merge people profiles based on email match

I have an existing directory (php with xml datasource) which contains people information such as this:
MainSource.xml
<people>
<person>
<id></id>
<last_name></last_name>
<first_name></first_name>
<email></email>
<phone></phone>
</person>
...
</people>
I need to add a new node to MainSource.xml from NewSource.xml, matching on email address, from the new datasource which contains people info like this:
NewSource.xml
<people>
<person>
<email></email>
<website_url></website_url>
</person>
...
</people>
I have tried a number of variations, but I think my hangup is properly comparing the two documents. Logically, it feels like I need to be iterating, as opposed to foreach? Or two foreach, one for each source? Here's a sample of what I'm thinking. Please offer any clarity or insight which can nudge me along in the right direction.
<?php
$doc1 = new DOMDocument();
$doc1->load('MainSource.xml');
$doc2 = new DOMDocument();
$doc2->load('NewSource.xml');
foreach ($doc1->person as $person) {
if ($person->email === $doc2->person->email) {
$node = $doc1->createElement("website_url", $valueFromDoc2);
$newnode = $doc1->appendChild($node);
}
}
$merged = $doc1->saveXML();
file_put_contents('MergedSource.xml', $merged)
?>
As mentioned by @waterloomatt, you need to use xpath to achieve that.
Assuming that MainSource.xml looks like this:
<people>
<person>
<id>1</id>
<last_name>smith</last_name>
<first_name>john</first_name>
<email>js@example.com</email>
<phone>555-123-1234</phone>
</person>
<person>
<id>2</id>
<last_name>doe</last_name>
<first_name>jane</first_name>
<email>jd@anotherexample.com</email>
<phone>666-234-2345</phone>
</person>
</people>
and NewSource.xml looks like this:
<people>
<person>
<email>js@example.com</email>
<website_url>js.example.com</website_url>
</person>
<person>
<email>jd@anotherexample.com</email>
<website_url>jd.anotherexample.com</website_url>
</person>
</people>
you can try this:
$doc1 = new DOMDocument();
$doc1->load('MainSource.xml');
$xpath1 = new DOMXPath($doc1);
# find each person's email address
$sources = $xpath1->query('//person//email');
$doc2 = new DOMDocument();
$doc2->load('NewSource.xml');
$xpath2 = new DOMXPath($doc2);
foreach ($sources as $source) {
    # for each email address, get the parent and use that as the destination
    # of the new web address element
    $destination = $xpath1->query('..', $source);
    # in the other doc, search for each person whose email address matches
    # that of the first doc and get the relevant web address
    $exp2 = "//person[email[text()='{$source->nodeValue}']]//website_url";
    $target = $xpath2->query($exp2);
    # skip people who have no match in the other doc
    if ($target->length === 0) {
        continue;
    }
    # import the result of the search as a node into the first doc
    $node = $doc1->importNode($target[0], true);
    # finally, append the imported node in the right location of the first doc
    $destination[0]->appendChild($node);
}
echo $doc1->saveXml();
Output:
<people>
<person>
<id>1</id>
<last_name>smith</last_name>
<first_name>john</first_name>
<email>js@example.com</email>
<phone>555-123-1234</phone>
<website_url>js.example.com</website_url></person>
<person>
<id>2</id>
<last_name>doe</last_name>
<first_name>jane</first_name>
<email>jd@anotherexample.com</email>
<phone>666-234-2345</phone>
<website_url>jd.anotherexample.com</website_url></person>
</people>
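If the files are large, running one XPath query per person may get slow. A possible variation, sketched below, indexes the second document by email once and then walks the first document. The inline sample data is invented for the sketch, and comparing emails case-insensitively after trimming is an assumption that the question does not state.

```php
<?php
// Invented sample data; in practice use ->load('MainSource.xml') etc.
$main = new DOMDocument();
$main->loadXML(
    '<people>'
    . '<person><id>1</id><email>a@example.com</email></person>'
    . '<person><id>2</id><email>nobody@example.com</email></person>'
    . '</people>'
);
$new = new DOMDocument();
$new->loadXML(
    '<people>'
    . '<person><email>A@example.com </email><website_url>a.example.com</website_url></person>'
    . '</people>'
);

// Build a lookup: normalized email => <website_url> element of the new source
$urlByEmail = [];
foreach ($new->getElementsByTagName('person') as $person) {
    $email = $person->getElementsByTagName('email')->item(0);
    $url = $person->getElementsByTagName('website_url')->item(0);
    if ($email !== null && $url !== null) {
        // Assumption: emails match case-insensitively, ignoring outer whitespace
        $urlByEmail[strtolower(trim($email->textContent))] = $url;
    }
}

// Walk the main document and append the matching <website_url>, if any
foreach ($main->getElementsByTagName('person') as $person) {
    $email = $person->getElementsByTagName('email')->item(0);
    if ($email === null) {
        continue;
    }
    $key = strtolower(trim($email->textContent));
    if (isset($urlByEmail[$key])) {
        // importNode() copies the node into $main before it can be appended
        $person->appendChild($main->importNode($urlByEmail[$key], true));
    }
}

echo $main->saveXML();
```

People without a match in the second file are simply left unchanged.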

PHP XML append to created file

I have the following XML document:
<list>
<person>
<name>Simple name</name>
</person>
</list>
I try to read it, and basically create another "person" element. The output I want to achieve is:
<list>
<person>
<name>Simple name</name>
</person>
<person>
<name>Simple name again</name>
</person>
</list>
Here is how I am doing it:
$xml = new DOMDocument();
$xml->load('../test.xml');
$list = $xml->getElementsByTagName('list') ;
if ($list->length > 0) {
$person = $xml->createElement("person");
$name = $xml->createElement("name");
$name->nodeValue = 'Simple name again';
$person->appendChild($name);
$list->appendChild($person);
}
$xml->save("../test.xml");
What am I missing here?
Edit: I have translated the tags, so that example would be clearer.
Currently, you're pointing/appending to the node list instead of that found parent node:
$list->appendChild($person);
// ^ DOMNodeList
You should point to the element:
$list->item(0)->appendChild($person);
Sidenote: The text can also be passed as the second argument of ->createElement():
$name = $xml->createElement("name", 'Simple name again');
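Putting the fix together, a minimal self-contained sketch could look like this. It loads the XML from a string instead of ../test.xml so that it runs on its own, and it uses documentElement as another way to reach the root <list> element; both choices are assumptions made for the sketch.

```php
<?php
$xml = new DOMDocument();
// Optional: drop whitespace-only nodes so formatOutput can re-indent
$xml->preserveWhiteSpace = false;
$xml->formatOutput = true;
$xml->loadXML('<list><person><name>Simple name</name></person></list>');

// documentElement is the root element (<list>), a DOMElement, not a DOMNodeList
$list = $xml->documentElement;

$person = $xml->createElement('person');
$person->appendChild($xml->createElement('name', 'Simple name again'));
$list->appendChild($person);

echo $xml->saveXML();
```

One caveat: the value passed as the second argument of createElement() is not escaped, so text that may contain & is safer added through createTextNode().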

Hide XML declaration in files generated using PHP

I was testing a simple example of how to display XML in the browser using PHP and found this example, which works well:
<?php
$xml = new DOMDocument("1.0");
$root = $xml->createElement("data");
$xml->appendChild($root);
$id = $xml->createElement("id");
$idText = $xml->createTextNode('1');
$id->appendChild($idText);
$title = $xml->createElement("title");
$titleText = $xml->createTextNode('Valid');
$title->appendChild($titleText);
$book = $xml->createElement("book");
$book->appendChild($id);
$book->appendChild($title);
$root->appendChild($book);
$xml->formatOutput = true;
echo "<xmp>". $xml->saveXML() ."</xmp>";
$xml->save("mybooks.xml") or die("Error");
?>
It produces the following output:
<?xml version="1.0"?>
<data>
<book>
<id>1</id>
<title>Valid</title>
</book>
</data>
Now I have two questions about how the output should look.
The first line in the XML file, '<?xml version="1.0"?>', should not be displayed; that is, it should be hidden.
How can I display the text node on the next line? In total I am expecting output like this:
<data>
<book>
<id>1</id>
<title>
Valid
</title>
</book>
</data>
Is it possible to get the desired output, and if so, how can I accomplish that?
Thanks
To skip the XML declaration you can use the result of saveXML on the root node:
$xml_content = $xml->saveXML($root);
file_put_contents("mybooks.xml", $xml_content) or die("cannot save XML");
Please note that saveXML($node) outputs only that node and its descendants, without the XML declaration, so its output differs from saveXML().
First question:
here is my post where all usable threads with answers are listed: How do you exclude the XML prolog from output?
Second question:
I don't know of any PHP function that outputs text nodes like that.
You could:
read xml using DomDocument and save each node as string
iterate through the nodes
detect text nodes and add new lines to xml string manually
At the end you would have the same XML with text node values in new line:
<node>
some text data
</node>
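A different route to the same result is to change the text nodes themselves before saving, rather than editing the output string. The sketch below is one way to do that under assumptions: only the text inside <title> is moved (as in the question's expected output), and two spaces per level is assumed as the indentation. Note that the added whitespace becomes part of the text value, so readers of the file would need to trim it.

```php
<?php
$xml = new DOMDocument('1.0');
$xml->formatOutput = true;
$root = $xml->appendChild($xml->createElement('data'));
$book = $root->appendChild($xml->createElement('book'));
$book->appendChild($xml->createElement('id', '1'));
$book->appendChild($xml->createElement('title', 'Valid'));

$xpath = new DOMXPath($xml);
// Assumption: only <title> text should be placed on its own line
foreach ($xpath->query('//title/text()') as $text) {
    // Count how many element ancestors the parent element has
    $depth = 0;
    for ($n = $text->parentNode; $n->parentNode instanceof DOMElement; $n = $n->parentNode) {
        $depth++;
    }
    $indent = str_repeat('  ', $depth);
    // Newline, text one level deeper than its element, newline, closing-tag indent
    $text->nodeValue = "\n" . $indent . '  ' . trim($text->nodeValue) . "\n" . $indent;
}

// Passing the root node skips the XML declaration
$out = $xml->saveXML($root);
echo $out;
```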

PHP parsing xml xpath

XML
<person>
<description>
<p>blah blah blah</p>
<p>kjdsfksdjf</p>
</description>
</person>
<person>
<description>
k kjsdf kk sak kfsdjk sadk
</description>
</person>
I'd like to parse the description so that it returns the html tags that are inside.
I've tried both of these, without success
$description = ereg_replace('<description>|</description>','',$person->description->asXML());
$description = $person->description;
Any suggestions?
EDIT
What I'm trying to accomplish is to import an xml file into a mysql db. Everything is working except what is mentioned above... the paragraph tags inside the description aren't showing up... and they need to be there. The mysql field "description" is set as a text field. If I was to parse the xml to output in the browser then $description = ereg_replace('<description>|</description>','',$person->description->asXML()); works fine... this isn't true though when I'm trying to import into mysql. Do I need to add something to the mysql INSERT? mysql_query("UPDATE table SET description = '$value' WHERE id = '$id'");
Please familiarize yourself with the SimpleXml API:
$xml = <<< XML
<person>
<description>
<p>blah blah blah</p>
<p>kjdsfksdjf</p>
</description>
</person>
XML;
$person = simplexml_load_string($xml);
foreach ($person->description->children() as $child) {
echo $child->asXml();
}
gives
<p>blah blah blah</p><p>kjdsfksdjf</p>
Note that SimpleXml isn't capable of doing the same for the second description element you show, because children() only returns elements and SimpleXml has no concept of text nodes, e.g.
$xml = <<< XML
<person>
<description>
k kjsdf kk sak kfsdjk sadk
</description>
</person>
XML;
$person = simplexml_load_string($xml);
foreach ($person->description->children() as $child) {
echo $child->asXml();
}
will return an empty string. If you want a unified API, use DOM:
$xml = <<< XML
<people>
<person>
<description>
<p>blah blah blah</p>
<p>kjdsfksdjf</p>
</description>
</person>
<person>
<description>
k kjsdf kk sak kfsdjk sadk
</description>
</person>
</people>
XML;
$dom = new DOMDocument;
$dom->loadXml($xml);
$xp = new DOMXPath($dom);
foreach ($xp->query('/people/person/description/node()') as $child) {
echo $dom->saveXml($child);
}
will give
<p>blah blah blah</p>
<p>kjdsfksdjf</p>
k kjsdf kk sak kfsdjk sadk
For importing XML into MySql, you can also use http://dev.mysql.com/doc/refman/5.5/en/load-xml.html
I'd like to parse the description so that it returns the html tags that are inside.
In XPath you would select the child nodes of the description elements.
Use:
"//person/description/*"
to get all child elements (html tags only) or
"//person/description/node()"
to get all child nodes (html tags and text nodes).
For instance, this php code:
<?php
$xml = simplexml_load_file("test.xml");
$result = $xml->xpath("//person/description/*");
print_r($result);
?>
Returns an array of SimpleXMLElements which are children of description. Each item is retrieved with all its descendant nodes.
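For the database import mentioned in the question's edit, it may help to collect each description's inner markup as one string first. The sketch below does that with DOM; innerXml() is a made-up helper name and the sample XML is shortened from the question.

```php
<?php
// Hypothetical helper: serialize all child nodes (elements and text) of $node.
function innerXml(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        // saveXML() with a node argument returns just that node's markup
        $out .= $node->ownerDocument->saveXML($child);
    }
    return trim($out);
}

$dom = new DOMDocument();
$dom->loadXML(
    '<people>'
    . '<person><description><p>blah blah blah</p><p>kjdsfksdjf</p></description></person>'
    . '<person><description> k kjsdf kk sak kfsdjk sadk </description></person>'
    . '</people>'
);

$descriptions = [];
foreach ($dom->getElementsByTagName('description') as $description) {
    $descriptions[] = innerXml($description);
}

print_r($descriptions);
```

Each string could then be handed to the database as a bound parameter of a prepared statement rather than interpolated into the SQL text, which avoids quoting problems with the markup.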

Retrieving a subset of XML nodes with PHP

Using PHP, how do I get an entire subset of nodes from an XML document? I can retrieve something like:
<?xml version="1.0" encoding="utf-8"?>
<people>
<certain>
<name>Jane Doe</name>
<age>21</age>
</certain>
<certain>
<name>John Smith</name>
<age>34</age>
</certain>
</people>
But what if I only want to return the child nodes of <people>, like this?
<certain>
<name>Jane Doe</name>
<age>21</age>
</certain>
<certain>
<name>John Smith</name>
<age>34</age>
</certain>
EDIT: I'm trying to get a subset of XML and pass that directly, not an object like simplexml would give me. I am basically trying to get PHP to do what .NET's OuterXml does... return literally the above subset of XML as is... no interpreting or converting or creating a new XML file or anything... just extract those nodes in situ and pass them on. Am I going to have to get the XML file, parse out what I need and then rebuild it as a new XML file? If so then I need to get rid of the <?xml version="1.0" encoding="utf-8"?> bit... ugh.
The answer would be to use XPath.
$people = simplexml_load_string(
'<?xml version="1.0" encoding="utf-8"?>
<people>
<certain>
<name>Jane Doe</name>
<age>21</age>
</certain>
<certain>
<name>John Smith</name>
<age>34</age>
</certain>
</people>'
);
// get all <certain/> nodes
$people->xpath('//certain');
// get all <certain/> nodes whose <name/> is "John Smith"
print_r($people->xpath('//certain[name = "John Smith"]'));
// get all <certain/> nodes whose <age/> child's value is greater than 21
print_r($people->xpath('//certain[age > 21]'));
Take 2
So apparently you want to copy some nodes from a document into another document? SimpleXML doesn't support that. DOM has methods for that but they're kind of annoying to use. Which one are you using? Here's what I use: SimpleDOM. In fact, it's really SimpleXML augmented with DOM's methods.
include 'SimpleDOM.php';
$results = simpledom_load_string('<results/>');
foreach ($people->xpath('//certain') as $certain)
{
$results->appendChild($certain);
}
That routine finds all <certain/> nodes via XPath, then appends them to the new document.
You could use DOMDocument::getElementsByTagName() or you could:
Use XPath?
<?php
$xml = simplexml_load_file("test.xml");
$result = $xml->xpath("//certain");
print_r($result);
?>
Use DOM and XPath. Xpath allows you to select nodes (and values) from an XML DOM.
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
$result = '';
foreach ($xpath->evaluate('/people/certain') as $node) {
$result .= $dom->saveXml($node);
}
echo $result;
Demo: https://eval.in/162149
DOMDocument::saveXml() has a context argument. If provided it saves that node as XML. Much like outerXml(). PHP is able to register your own classes for the DOM nodes, too. So it is even possible to add an outerXML() function to element nodes.
class MyDomElement extends DOMElement {
public function outerXml() {
return $this->ownerDocument->saveXml($this);
}
}
class MyDomDocument extends DOMDocument {
public function __construct($version = '1.0', $encoding = 'utf-8') {
parent::__construct($version, $encoding);
$this->registerNodeClass('DOMElement', 'MyDomElement');
}
}
$dom = new MyDomDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
$result = '';
foreach ($xpath->evaluate('/people/certain') as $node) {
$result .= $node->outerXml();
}
echo $result;
Demo: https://eval.in/162157
See http://www.php.net/manual/en/domdocument.getelementsbytagname.php
The answer turned out to be a combination of the xpath suggestion and outputting with asXML().
Using the example given by Josh Davis:
$people = simplexml_load_string(
'<?xml version="1.0" encoding="utf-8"?>
<people>
<certain>
<name>Jane Doe</name>
<age>21</age>
</certain>
<certain>
<name>John Smith</name>
<age>34</age>
</certain>
</people>'
);
// get all <certain/> nodes
$nodes = $people->xpath('/people/certain');
// start with an empty string so that .= does not hit an undefined variable
$result = '';
foreach ( $nodes as $node ) {
$result .= $node->asXML()."\n";
}
echo $result;
