Suppose I have the following XML:
<Book>
<bookname>thename</bookname>
<chapters>
<chapter>
<name>chapter1</name>
</chapter>
<chapter>
<name>chapter2</name>
</chapter>
</chapters>
</Book>
How can I get an XML as follows:
<chapters>
<chapter>
<name>chapter1</name>
</chapter>
<chapter>
<name>chapter2</name>
</chapter>
</chapters>
One way is to manually remove the unwanted elements, e.g.:
$resultXML = new SimpleXMLElement($inputXML);
unset($resultXML->bookname);
$resultXML = $resultXML->asXml();
echo format_result($resultXML,$format);
But if I have a large XML with many unwanted nodes, this is tedious. Any idea how to extract the required element using its name?
The sub-nodes of your XML are themselves SimpleXMLElements, so you can just call the asXML method on them as well.
Since you only have one chapters node in your XML, simply
$resultXML->chapters->asXml()
will do.
Here we go:
<?php
$inputXML ="
<Book>
<bookname>thename</bookname>
<chapters>
<chapter>
<name>chapter1</name>
</chapter>
<chapter>
<name>chapter2</name>
</chapter>
</chapters>
</Book>
";
/* Load from file:
$parseXML = simplexml_load_file( $fileURL );
*/
$parseXML = simplexml_load_string( $inputXML );
$chapters = $parseXML->chapters->asXML();
echo $chapters;
?>
Note: your XML must be well-formed; in particular, make sure <Book> has a closing tag.
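If the element name is only known at run time, the same idea can be wrapped in a small lookup. This is a minimal sketch, not part of the answer above: the helper name extractByName is made up for illustration, and it assumes the name contains no double quote (valid element names never do).

```php
<?php
// Hypothetical helper: return the XML of the first element named $name,
// wherever it sits in the document, or null when nothing matches.
function extractByName(string $inputXML, string $name): ?string
{
    $doc = simplexml_load_string($inputXML);
    if ($doc === false) {
        return null; // input was not well-formed
    }
    // local-name() compares the element name as a string value
    $matches = $doc->xpath('//*[local-name()="' . $name . '"]');
    return $matches ? $matches[0]->asXML() : null;
}

$inputXML = '<Book><bookname>thename</bookname><chapters><chapter><name>chapter1</name></chapter></chapters></Book>';
echo extractByName($inputXML, 'chapters');
```

Calling asXML() on a matched sub-node returns only that element and its children, without the XML declaration.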
I am curious whether I can search an XML file for a certain tag with regular expressions. I can search the file if I use fopen('foo.xml'), but it only lets me search the content between the tags, not the tags themselves. My objective is to create a function that deletes all the content between two tags, for example between the users tags in an XML file. The language that I am using is PHP.
Thanks in advance john.
You should use something like SimpleXML to handle/edit XML files.
If you really insist on treating the XML file as a string, you can do something like this (or you can use regex). But you should use an XML library.
// get your file as a string
$yourXML = file_get_contents($file) ;
$posStart = stripos($yourXML,'<users>') + strlen('<users>') ;
$posEnd = stripos($yourXML,'</users>') ;
$newXML = substr($yourXML,0,$posStart) . substr($yourXML,$posEnd) ;
// <users> is now empty
echo $newXML ;
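For comparison, here is roughly what the library route could look like with DOMDocument. It is only a sketch under assumptions: the helper name emptyElements and the sample document are invented for illustration, and the input is assumed to be well-formed.

```php
<?php
// Hypothetical helper: remove everything inside each element named $tagName.
function emptyElements(string $xmlString, string $tagName): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xmlString);
    // Copy the live node list to an array so removals cannot disturb the loop
    foreach (iterator_to_array($dom->getElementsByTagName($tagName)) as $element) {
        while ($element->firstChild !== null) {
            $element->removeChild($element->firstChild);
        }
    }
    // Serialize from the root element to skip the XML declaration
    return $dom->saveXML($dom->documentElement);
}

$sample = '<site><users><user>john</user><user>jane</user></users><title>demo</title></site>';
echo emptyElements($sample, 'users');
```

Unlike the string approach, this does not depend on how the tags are written in the file (attributes, whitespace, or several users elements).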
DOMDocument and XPath will make things very clean, direct and reliable.
You can use evaluate() or query(), as they provide the same result for these expressions.
The leading // will seek out the matching tags regardless of their location.
Be aware that my solution is case-sensitive.
Code:
$xml = <<<XML
<myXml>
<Person>
<firstName>pradeep</firstName>
<lastName>jain</lastName>
<address>
<doorNumber>287</doorNumber>
<street>2nd block</street>
<city>bangalore</city>
</address>
<phoneNums type="mobile">9980572765</phoneNums>
<phoneNums type="landline">080 42056434</phoneNums>
<phoneNums type="skype">123456</phoneNums>
</Person>
<Person>
<firstName>pradeep</firstName>
<lastName>jain</lastName>
<address>
<doorNumber>287</doorNumber>
<street>2nd block</street>
<city>bangalore</city>
</address>
<phoneNums type="mobile">1</phoneNums>
<phoneNums type="landline">2</phoneNums>
<phoneNums type="skype">3</phoneNums>
</Person>
</myXml>
XML;
$dom = new DOMDocument;
$dom->loadXML($xml); // <-- you'll need to import your file instead of a string as demo'ed here
$xpath = new DOMXPath($dom);
echo count($xpath->evaluate("//phoneNums")) , "\n"; // 6
echo count($xpath->evaluate("//street")) , "\n"; // 2
echo count($xpath->evaluate("//myXml")) , "\n"; // 1
echo count($xpath->evaluate("//Person")) , "\n"; // 2
echo count($xpath->evaluate("//person")) , "\n"; // 0 <-- case-sensitive
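The two methods only differ once the expression returns something other than a node set, and case sensitivity can be worked around inside the expression. The following is a small sketch on a cut-down document (the sample XML and variable names are illustrative only):

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<myXml><Person><phoneNums>1</phoneNums><phoneNums>2</phoneNums></Person></myXml>');
$xpath = new DOMXPath($dom);

// Node-set expressions: both methods return a DOMNodeList
$viaQuery    = $xpath->query('//phoneNums')->length;
$viaEvaluate = $xpath->evaluate('//phoneNums')->length;

// Typed expressions such as count(): evaluate() returns the number itself (a float)
$counted = $xpath->evaluate('count(//phoneNums)');

// Case-insensitive match on the element name, using XPath 1.0 translate()
$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';
$anyCase = $xpath->query("//*[translate(local-name(), '$upper', '$lower') = 'person']")->length;

echo $viaQuery, ' ', $viaEvaluate, ' ', $counted, ' ', $anyCase, "\n";
```

The translate() trick only covers the letters listed in the two alphabets, so it is an ASCII-only workaround.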
Here is a simple mock-up of the various parts needed to do this in SimpleXML; there are a few concepts you need to know to get it to work.
The main one is XPath, which is a sort of SQL for XML. Of course it has its own notation and can be a little pedantic at times, but you can experiment with it on sites like https://codebeautify.org/Xpath-Tester.
$data = '<?xml version="1.0" encoding="UTF-8"?>
<Users>
<User id="123">
<Name>fred</Name>
<Extension>1234</Extension>
</User>
<User id="124">
<Name>bert</Name>
<Extension>1235</Extension>
</User>
<User id="125">
<Name>foo</Name>
<Extension>1236</Extension>
</User>
</Users>';
$userID = "123";
$users = simplexml_load_string($data);
// Find the user with the id attribute (use [0] as the call to xpath
// returns a list of matches and you only want the first one)
$userMatch = $users->xpath("//User[@id='{$userID}']")[0];
// Just output user id attribute and name
echo "id=".$userMatch['id'].",name=".$userMatch->Name.PHP_EOL;
echo "Removing user...".PHP_EOL;
// Remove the user - note the [0] is required here
unset($userMatch[0]);
// Print out the resulting XML after the removal
echo $users->asXML();
I've put comments through the code to explain how it works. The output is...
id=123,name=fred
Removing user...
<?xml version="1.0" encoding="UTF-8"?>
<Users>
<User id="124">
<Name>bert</Name>
<Extension>1235</Extension>
</User>
<User id="125">
<Name>foo</Name>
<Extension>1236</Extension>
</User>
</Users>
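One thing the code above does not cover is an id that matches nothing, in which case indexing the xpath() result with [0] fails. A hedged sketch of a guarded version follows; the function name removeUserById is made up, and it assumes the id contains no single quote:

```php
<?php
// Hypothetical helper: remove the <User> with the given id, if there is one.
function removeUserById(SimpleXMLElement $users, string $userID): bool
{
    $matches = $users->xpath("//User[@id='{$userID}']");
    if (empty($matches)) {
        return false; // no such user, nothing removed
    }
    $node = $matches[0];
    unset($node[0]); // same self-reference removal as in the code above
    return true;
}

$users = simplexml_load_string(
    '<Users><User id="123"><Name>fred</Name></User><User id="124"><Name>bert</Name></User></Users>'
);
$removed = removeUserById($users, '123');
$missing = removeUserById($users, '999');
echo $users->asXML();
```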
Given a base $xml and a file containing a <something> tag with attributes, children and children of its children, I would like to append it as first child and all of its children as raw XML.
Original XML:
<root>
<people>
<person>
<name>John Doe</name>
<age>47</age>
</person>
<person>
<name>James Johnson</name>
<age>13</age>
</person>
</people>
</root>
XML in file:
<something someval="x" otherthing="y">
<child attr="val" ..> { some children and values ... }</child>
<child attr="val2" ..> { some children and values ... }</child>
...
</something>
Result XML:
<root>
<something someval="x" otherthing="y">
<child attr="val" ..> { some children and values ... }</child>
<child attr="val2" ..> { some children and values ... }</child>
...
</something>
<people>
<person>
<name>John Doe</name>
<age>47</age>
</person>
<person>
<name>James Johnson</name>
<age>13</age>
</person>
</people>
</root>
This tag would contain several children both direct and recursively, so it would not be practical to build the XML via the SimpleXML operations. Besides, keeping it in a file would result in lower maintenance costs.
Technically it would simply be prepending one child. The problem is that this child would have other children and so on.
On the PHP addChild page there's a comment that says:
$x = new SimpleXMLElement('<root name="toplevel"></root>');
$f1 = new SimpleXMLElement('<child pos="1">alpha</child>');
$x->{$f1->getName()} = $f1; // adds $f1 to $x
However, this does not seem to treat my XML as raw XML, so the tags come out escaped as &lt; and &gt;. Several warnings concerning namespaces appear as well.
I suppose I could do a quick replace of such tags but I am not sure whether it could cause future problems and it certainly does not feel right.
Manually hacking the XML is not an option and neither is adding children one by one. Choosing a different library could be.
Any clues on how to get this working?
Thanks!
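I'm really not sure if this will work. Try it or downvote it, but I hope it helps. Using DOMDocument:
<?php
$xml = new DOMDocument();
// loadXML() parses XML; loadHTML() would wrap the content in <html><body>
$xml->loadXML($yourOriginalXML);
// A document fragment accepts raw XML, including attributes and nested children
$newNode = $xml->createDocumentFragment();
$newNode->appendXML($someXMLtoPrepend);
$nodeRoot = $xml->getElementsByTagName('root')->item(0);
$nodeOriginal = $xml->getElementsByTagName('people')->item(0);
// Insert the fragment before <people> so it comes first inside <root>
$nodeRoot->insertBefore($newNode,$nodeOriginal);
$finalXmlAsString = $xml->saveXML();
?>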
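Sometimes UTF-8 can cause problems; in that case try this:
<?php
// Assumption: input that is not valid UTF-8 is ISO-8859-1; adjust to your real source encoding
$toUtf8 = function ($s) {
    return mb_check_encoding($s, 'UTF-8') ? $s : mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
};
$xml = new DOMDocument();
$xml->loadXML($toUtf8($yourOriginalXML));
// A document fragment accepts raw XML, including attributes and nested children
$newNode = $xml->createDocumentFragment();
$newNode->appendXML($toUtf8($someXMLtoPrepend));
$nodeRoot = $xml->getElementsByTagName('root')->item(0);
$nodeOriginal = $xml->getElementsByTagName('people')->item(0);
$nodeRoot->insertBefore($newNode,$nodeOriginal);
$finalXmlAsString = $xml->saveXML();
?>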
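Since the question starts from SimpleXML, it may be worth noting that a SimpleXMLElement can be handed to DOM and back without re-parsing. The sketch below relies on that; the helper name prependRawXml is invented, and it assumes the raw XML holds only the element itself, with no <?xml ...?> declaration.

```php
<?php
// Hypothetical helper: insert raw XML as the first child of a SimpleXML element.
function prependRawXml(SimpleXMLElement $parent, string $rawXml): void
{
    // Both objects share the same underlying document
    $parentDom = dom_import_simplexml($parent);
    $fragment = $parentDom->ownerDocument->createDocumentFragment();
    if (!$fragment->appendXML($rawXml)) {
        throw new InvalidArgumentException('Fragment is not well-formed XML');
    }
    // A null reference node (empty parent) makes insertBefore() append instead
    $parentDom->insertBefore($fragment, $parentDom->firstChild);
}

$xml = new SimpleXMLElement('<root><people><person><name>John Doe</name></person></people></root>');
prependRawXml($xml, '<something someval="x"><child attr="val">text</child></something>');
echo $xml->asXML();
```

In practice the second argument would come from something like file_get_contents() on the file that holds the <something> element.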
I am having some issues using xmldiff package. I'm using xmldiff package 0.9.2; PHP 5.4.17; Apache 2.2.25.
For example I have two xml files: "from.xml" & "to.xml".
File "from.xml" contains:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rott>
<NDC>321</NDC>
<NDC>123</NDC>
</rott>
</root>
File "to.xml" contains:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rott>
<NDC>123</NDC>
<NDC>321</NDC>
</rott>
</root>
I'm using code:
$zxo = new XMLDiff\File;
$dir1 = dirname(__FILE__) . "/upload/from.xml";
$dir2 = dirname(__FILE__) . "/upload/to.xml";
$diff = $zxo->diff($dir1, $dir2);
$file = 'differences.xml';
file_put_contents($file, $diff);
I get result in "differences.xml" file:
<?xml version="1.0"?>
<dm:diff xmlns:dm="http://www.locus.cz/diffmark">
<root>
<rott>
<dm:delete>
<NDC/>
</dm:delete>
<dm:copy count="1"/>
<dm:insert>
<NDC>321</NDC>
</dm:insert>
</rott>
</root>
</dm:diff>
Could you please explain where this:
<dm:delete>
<NDC/>
</dm:delete>
comes?
Also, is there a method that diffs two XML files regardless of the order of the XML nodes?
What you see is the diff in the libdiffmark format. Right from that page:
<copy/> is used in places where the input subtrees are the same
The documents from your snippet have partially identical sub trees. Effectively the instructions libdiffmark will execute are
delete the whole subtree
copy 1 node, meaning the node is the same in both documents, so don't touch it
insert 1 new node
The order of the nodes matters. Think about what a diff would look like if the node order were ignored. Say you had 42 nodes and some of those were the same: how would it apply the copy instruction with the count? It is much easier for a diff to use the exact node order of the two documents. One interesting read I found here explains why node order can be important.
Thanks.
If the document structure is known, I think you can simply sort the necessary parts. Here's a useful article about it. Based on it, I've tried some examples and could sort a document by node values (just as an example); please look here:
document library.xml
<?xml version="1.0"?>
<library>
<book id="1003">
<title>Jquery MVC</title>
<author>Me</author>
<price>500</price>
</book>
<book id="1001">
<title>Php</title>
<author>Me</author>
<price>600</price>
</book>
<book id="1002">
<title>Where to use IFrame</title>
<author>Me</author>
<price>300</price>
</book>
<book id="1002">
<title>American dream</title>
<author>Hello</author>
<price>300</price>
</book>
</library>
The PHP code, sorting by the <title>
<?php
$dom = new DOMDocument();
$dom->load('library.xml');
$xp = new DOMXPath($dom);
$booklist = $xp->query('/library/book');
$books = iterator_to_array($booklist);
function sort_by_title_node($a, $b)
{
$x = $a->getElementsByTagName('title')->item(0);
$y = $b->getElementsByTagName('title')->item(0);
return strcmp($x->nodeValue, $y->nodeValue); // usort() expects an int, not a bool
}
usort($books, 'sort_by_title_node');
$newdom = new DOMDocument("1.0");
$newdom->formatOutput = true;
$root = $newdom->createElement("library");
$newdom->appendChild($root);
foreach ($books as $b) {
$node = $newdom->importNode($b,true); // deep-copy the book into the new document
$root->appendChild($node);
}
echo $newdom->saveXML();
And here's the result:
<?xml version="1.0"?>
<library>
<book id="1002">
<title>American dream</title>
<author>Hello</author>
<price>300</price>
</book>
<book id="1003">
<title>Jquery MVC</title>
<author>Me</author>
<price>500</price>
</book>
<book id="1001">
<title>Php</title>
<author>Me</author>
<price>600</price>
</book>
<book id="1002">
<title>Where to use IFrame</title>
<author>Me</author>
<price>300</price>
</book>
</library>
This way you can sort the parts of the document before comparing. After that you can even use the DOM comparison directly. You could even reorder the nodes in place; it would be a similar approach.
I'm not sure it will be very useful if you have a variable number of nodes, say if the <NDC> tag were repeated some random number of times and its values were completely different.
And after all, I still think the simplest way would be to ask your supplier to create a more predictable document structure :)
Thanks
Anatol
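Applied to the from.xml / to.xml case from the question, the sort-then-compare idea could be sketched like this. The helper name canonicalSorted is made up, and the approach assumes you know which parent elements (here /root/rott) may have their children in any order:

```php
<?php
// Hypothetical helper: sort the children of every element matched by $parentPath,
// then return the canonical (C14N) form of the whole document.
function canonicalSorted(string $xmlString, string $parentPath): string
{
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = false; // drop indentation-only text nodes
    $dom->loadXML($xmlString);
    $xp = new DOMXPath($dom);
    foreach ($xp->query($parentPath) as $parent) {
        $children = iterator_to_array($parent->childNodes);
        // Order siblings by their serialized XML
        usort($children, function (DOMNode $a, DOMNode $b) use ($dom): int {
            return strcmp($dom->saveXML($a), $dom->saveXML($b));
        });
        foreach ($children as $child) {
            $parent->appendChild($child); // re-appending moves the node to the end
        }
    }
    return $dom->C14N();
}

$from = '<root><rott><NDC>321</NDC><NDC>123</NDC></rott></root>';
$to   = '<root><rott><NDC>123</NDC><NDC>321</NDC></rott></root>';
$same = canonicalSorted($from, '/root/rott') === canonicalSorted($to, '/root/rott');
var_dump($same);
```

The sorted strings could also be written to temporary files and passed to xmldiff if an actual diff, rather than a yes/no answer, is needed.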
I was testing a simple example of how to display XML in the browser using PHP and found this example, which works well:
<?php
$xml = new DOMDocument("1.0");
$root = $xml->createElement("data");
$xml->appendChild($root);
$id = $xml->createElement("id");
$idText = $xml->createTextNode('1');
$id->appendChild($idText);
$title = $xml->createElement("title");
$titleText = $xml->createTextNode('Valid');
$title->appendChild($titleText);
$book = $xml->createElement("book");
$book->appendChild($id);
$book->appendChild($title);
$root->appendChild($book);
$xml->formatOutput = true;
echo "<xmp>". $xml->saveXML() ."</xmp>";
$xml->save("mybooks.xml") or die("Error");
?>
It produces the following output:
<?xml version="1.0"?>
<data>
<book>
<id>1</id>
<title>Valid</title>
</book>
</data>
Now I have two questions regarding how the output should look.
The first line of the XML file, '<?xml version="1.0"?>', should not be displayed; that is, it should be hidden.
How can I display the text node on the next line? In total I am expecting output in this fashion:
<data>
<book>
<id>1</id>
<title>
Valid
</title>
</book>
</data>
Is it possible to get the desired output, and if so, how can I accomplish that?
Thanks
To skip the XML declaration you can use the result of saveXML on the root node:
$xml_content = $xml->saveXML($root);
file_put_contents("mybooks.xml", $xml_content) or die("cannot save XML");
Please note that saveXML(node) has a different output from saveXML().
First question:
here is my post where all usable threads with answers are listed: How do you exclude the XML prolog from output?
Second question:
I don't know of any PHP function that outputs text nodes like that.
You could:
read the XML using DOMDocument and save each node as a string
iterate through the nodes
detect text nodes and add new lines to the XML string manually
At the end you would have the same XML with text node values on a new line:
<node>
some text data
</node>
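The steps above can be sketched as a small recursive serializer. This is only one possible way to do it: the function name renderNode is invented, it handles elements, attributes and text only (no comments, CDATA or namespaces), and it puts every text value on its own line, including short ones such as the id.

```php
<?php
// Hypothetical helper: serialize a node with each text value on its own line.
function renderNode(DOMNode $node, int $depth = 0): string
{
    $pad = str_repeat('  ', $depth);
    if ($node instanceof DOMText) {
        $text = trim($node->nodeValue);
        // Skip whitespace-only text nodes, escape the rest for XML
        return $text === '' ? '' : $pad . htmlspecialchars($text, ENT_XML1 | ENT_NOQUOTES) . "\n";
    }
    if (!$node instanceof DOMElement) {
        return ''; // other node types are ignored in this sketch
    }
    $attrs = '';
    foreach ($node->attributes as $attr) {
        $attrs .= sprintf(' %s="%s"', $attr->nodeName, htmlspecialchars($attr->value, ENT_XML1 | ENT_COMPAT));
    }
    $out = "{$pad}<{$node->nodeName}{$attrs}>\n";
    foreach ($node->childNodes as $child) {
        $out .= renderNode($child, $depth + 1);
    }
    return $out . "{$pad}</{$node->nodeName}>\n";
}

$dom = new DOMDocument();
$dom->loadXML('<data><book><id>1</id><title>Valid</title></book></data>');
$rendered = renderNode($dom->documentElement);
echo $rendered;
```

Keep in mind that the added line breaks become part of the text content for any consumer that does not trim whitespace.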
Example of the xml:
<books>
<book>
<title>Hip Hop Hippo</title>
<released>31-12-9999</released>
</book>
<book>
<title>Bee In A Jar</title>
<released>01-01-0001</released>
</book>
</books>
I want to make a function that returns the release date of a book given its title.
Ex: I want to get the release date of the 'Hip Hop Hippo' book.
I know I can use SimpleXML and write ->book[0]->released. But that only works when I have a static XML and I know which ->book[$i]->title matches 'Hip Hop Hippo'. It does not work in the dynamic case: I can't predict every change, since the data comes from an API provider. It could be book[1], book[2], and so on.
What should I write in my function?
Check out the xpath functions http://php.net/manual/en/simplexmlelement.xpath.php
You will then be able to write a query like: /books/book[title="Hip Hop Hippo"]
$string = <<<XML
<books>
<book>
<title>Hip Hop Hippo</title>
<released>31-12-9999</released>
</book>
<book>
<title>Hip Hop Hippo</title>
<released>31-12-2000</released>
</book>
<book>
<title>Bee In A Jar</title>
<released>01-01-0001</released>
</book>
</books>
XML;
$xml = new SimpleXMLElement($string);
$result = $xml->xpath('/books/book[title="Hip Hop Hippo"]');
foreach($result as $key=>$node)
{
echo '<li>';
echo $node->title . ' / ' . $node->released;
echo '</li>';
}
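Wrapped up as the function the question asks for, the query could look like the sketch below. The names releasedDateOf and xpathLiteral are made up for illustration; the quoting helper is there because XPath 1.0 has no escape character, so a title containing quotes would otherwise break the expression.

```php
<?php
// Hypothetical helper: turn any string into a valid XPath 1.0 string literal.
function xpathLiteral(string $value): string
{
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    // Both quote types present: split on " and glue the pieces with concat()
    return 'concat("' . implode('", \'"\', "', explode('"', $value)) . '")';
}

// Hypothetical helper: release date of the first book with this title, or null.
function releasedDateOf(SimpleXMLElement $books, string $title): ?string
{
    $matches = $books->xpath('/books/book[title=' . xpathLiteral($title) . ']/released');
    return $matches ? (string) $matches[0] : null;
}

$books = new SimpleXMLElement(
    '<books><book><title>Hip Hop Hippo</title><released>31-12-9999</released></book>'
    . '<book><title>Bee In A Jar</title><released>01-01-0001</released></book></books>'
);
echo releasedDateOf($books, 'Hip Hop Hippo'), PHP_EOL;
```

When several books share a title, this returns the first one in document order; loop over $matches instead if all of them are needed.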