Get all children from certain xml child element using SimpleXMLElement and xpath - php

I have xml like:
<root xmlns="urn:test:apis:baseComponents">
<books>
<book>
<name>50 shades of grey</name>
</book>
</books>
<disks>
<disk>
<name>Britney Spears</name>
</disk>
</disks>
</root>
And such php code:
$xml = new SimpleXMLElement($xml);
$books = $xml->books;
$disks = $xml->disks;
$disks->registerXPathNamespace('x', 'urn:test:apis:baseComponents');
$books->registerXPathNamespace('x', 'urn:test:apis:baseComponents');
$b_names = $books->xpath('//x:name');
$b_names contains an array with 2 values instead of 1. The first holds books->book->name, the second holds disks->disk->name.
Can you please explain what I am doing wrong and how I can find the children of only one element?
The reason I am using xpath instead of reading the values manually through SimpleXMLElement is that I don't know in advance which value I want to search for.

Use $books->xpath('.//x:name') to search descendants of your $books variable and not descendants of the root node/document node (which the path //x:name does).
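To see the difference between the two paths, here is a minimal sketch using the XML from the question. The variable names $absolute and $relative are only illustrative, and the namespace is re-registered before each query purely as a precaution.

```php
<?php
$source = <<<'XML'
<root xmlns="urn:test:apis:baseComponents">
  <books><book><name>50 shades of grey</name></book></books>
  <disks><disk><name>Britney Spears</name></disk></disks>
</root>
XML;

$root  = new SimpleXMLElement($source);
$books = $root->books;

// '//x:name' starts at the document root, so it also finds the disk name.
$books->registerXPathNamespace('x', 'urn:test:apis:baseComponents');
$absolute = $books->xpath('//x:name');

// './/x:name' starts at the context node ($books), so only the book name matches.
$books->registerXPathNamespace('x', 'urn:test:apis:baseComponents');
$relative = $books->xpath('.//x:name');

echo count($absolute), "\n";
echo count($relative), ' ', $relative[0], "\n";
```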

Related

PHP Converting from XML to JSON with a SimpleXML object. Array with <items> tag causing issues

We are using SimpleXML to try to convert XML to JSON, and in turn convert that to a PHP object, so that we can compare our SOAP API with our REST API. We have a request that returns quite a lot of data, but the part in question is where we have a nested array.
The array is returned with the <item> tag in XML, however we do not want this translated into the JSON.
The XML that we get is as follows:
<apns>
<item>
<apn>apn</apn>
</item>
</apns>
So when it is translated into JSON it looks like this:
{"apns":{"item":{"apn":"apn"}}}
In reality, we want SimpleXML to convert to the same JSON as in our Rest API, which looks like the following:
{"apns":[{"apn":"apn"}]}
The array could contain more than one thing, for example:
<apns>
<item>
<apn>apn</apn>
</item>
<item>
<apn>apn2</apn>
</item>
</apns>
Which I'm assuming will just error in JSON or have the first one overwritten.
I'd expect SimpleXML to be able to handle this natively, but if not has anyone got a fix that doesn't involve janky string manipulation?
TIA :)
A generic conversion has no way of knowing that a single element should become an array in JSON.
SimpleXMLElement properties can be treated as an iterable to traverse siblings with the same name. They can be treated as a list or as a single value.
This allows you to build up your own array/object structure and serialize it to JSON.
$xml = <<<'XML'
<apns>
<item>
<apn>apn1</apn>
</item>
<item>
<apn>apn2</apn>
</item>
</apns>
XML;
$apns = new SimpleXMLElement($xml);
$json = [
'apns' => []
];
foreach ($apns->item as $item) {
$json['apns'][] = ['apn' => (string)$item->apn];
}
echo json_encode($json, JSON_PRETTY_PRINT);
This still allows you to read or convert other parts in a generic way. Take a closer look at the SimpleXMLElement class: it has methods to iterate over all children and to get the name of the current node.
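As a rough sketch of that generic approach, the function below walks the tree with children() and getName() and turns every wrapper element named in a list into a JSON array entry. The function name elementToArray and the $listWrappers parameter are made up for this illustration, and the sketch ignores attributes and namespaces.

```php
<?php
// Hypothetical helper: converts an element to a PHP array, treating every
// child whose name is in $listWrappers as one entry of a plain list.
function elementToArray(SimpleXMLElement $element, array $listWrappers = ['item'])
{
    $children = $element->children();
    if (count($children) === 0) {
        // Leaf node: use its text content.
        return (string) $element;
    }
    $result = [];
    foreach ($children as $child) {
        $value = elementToArray($child, $listWrappers);
        if (in_array($child->getName(), $listWrappers, true)) {
            $result[] = $value;                  // list entry, wrapper name dropped
        } else {
            $result[$child->getName()] = $value; // regular named property
        }
    }
    return $result;
}

$apns = new SimpleXMLElement(
    '<apns><item><apn>apn1</apn></item><item><apn>apn2</apn></item></apns>'
);
$json = json_encode(['apns' => elementToArray($apns)]);
echo $json, "\n";
```

With a single <item> the result is still a one-element list, which is the shape the REST API in the question uses.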
I hope this code is useful as a template for what you're after. The problem is that it's difficult to know whether this is the only instance of what you're trying to do...
It first uses XPath to look for any nodes that have an item/apn structure underneath (//*[item/apn] means any node, //*, with those nodes underneath).
Then it loops through these items and, for each <item>, adds a new <apn> node with its value underneath the start node (the <apns> node in this case): $list->addChild("apn", (string)$item->apn);.
Once the nodes are copied, it removes all of the <item> nodes (unset($list->item);).
$input = '<apns>
<item>
<apn>apn</apn>
</item>
<item>
<apn>apn2</apn>
</item>
</apns>';
$xml = simplexml_load_string($input);
$itemList = $xml->xpath("//*[item/apn]");
foreach ( $itemList as $list ) {
foreach ( $list->item as $item ) {
$list->addChild("apn", (string)$item->apn);
}
unset($list->item);
}
echo $xml->asXML();
gives...
<?xml version="1.0"?>
<apns>
<apn>apn</apn><apn>apn2</apn></apns>
and
echo json_encode($xml);
gives...
{"apn":["apn","apn2"]}
If you just want the last value, then you can just keep track of the last value and set the new element outside the inner loop...
$itemList = $xml->xpath("//*[item/apn]");
foreach ( $itemList as $list ) {
foreach ( $list->item as $item ) {
$apn = (string)$item->apn;
}
$list->addChild("apn", $apn);
unset($list->item);
}

php xpath query to get parent node based on value in repeating child nodes

I have an XML file structured as follows:
<pictures>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>United States</place>
</facts>
<people>
<person>John</person>
<person>Sue</person>
</people>
</picture>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>Canada</place>
</facts>
<people>
<person>Sue</person>
<person>Jane</person>
</people>
</picture>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>Canada</place>
</facts>
<people>
<person>John</person>
<person>Joe</person>
<person>Harry</person>
</people>
</picture>
</pictures>
In one case, I need to search for pictures where place="Canada". I have an XPath that does this fine, as such:
$place = "Canada";
$pics = ($pictures->xpath("//*[place='$place']"));
This pulls the entire "picture" node, so I am able to display title, description, etc.
I have another need to find all pictures where person = $person. I use the same type query as above:
$person = "John";
$pics = ($pictures->xpath("//*[person='$person']"));
In this case, the query apparently knows there are 2 pictures with John, but I don't get any of the values for the other nodes. I'm guessing it has something to do with the repeating child node, but can't figure out how to restructure the XPath to pull all of the picture node for each where I have a match on person. I tried using attributes instead of values (and modified the query accordingly), but got the same result.
Can anyone advise what I'm missing here?
Let's replace the variables first. That takes PHP out of the picture. The problem is just the proper XPath expression.
//*[place='Canada']
matches any element node that has a child element node place with the text content Canada.
This is the facts element node - not the picture.
Getting the picture node is slightly different:
//picture[facts/place='Canada']
This would select ANY picture node at ANY DEPTH that matches the condition.
picture[facts/place='Canada']
Would return the same result with the provided XML, but is more specific and matches only picture element nodes that are children of the document element.
Now validating the people node is about the same:
picture[people/person="John"]
You can even combine the two conditions:
picture[facts/place="Canada" and people/person="John"]
Here is a small demo:
$element = new SimpleXMLElement($xml);
$expressions = [
'//*[place="Canada"]',
'//picture[facts/place="Canada"]',
'picture[facts/place="Canada"]',
'picture[people/person="John"]',
'picture[facts/place="Canada" and people/person="John"]',
];
foreach ($expressions as $expression) {
echo $expression, "\n", str_repeat('-', 60), "\n";
foreach ($element->xpath($expression) as $index => $found) {
echo '#', $index, "\n", $found->asXml(), "\n";
}
echo "\n";
}
HINT: You're using dynamic values in your XPath expressions. String literals in XPath 1.0 do not support any kind of escaping, so a quote in the variable can break your expression. See this answer.
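One common workaround is to build the string literal with concat() whenever the value contains both kinds of quotes. The sketch below assumes a helper named xpathLiteral, which is a made-up name for this illustration and not part of SimpleXML or DOM.

```php
<?php
// Hypothetical helper: turns any PHP string into a valid XPath 1.0 string literal.
function xpathLiteral(string $value): string
{
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    // Both quote types occur: split on single quotes and glue the pieces
    // back together with concat(), writing each single quote as "'".
    $parts = explode("'", $value);
    return "concat('" . implode("', \"'\", '", $parts) . "')";
}

$pictures = new SimpleXMLElement(
    '<pictures><picture><title>Party</title><people><person>O\'Neil "Joe"</person></people></picture></pictures>'
);
$person = 'O\'Neil "Joe"';
$pics   = $pictures->xpath('picture[people/person=' . xpathLiteral($person) . ']');
echo count($pics), "\n";
```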

Using DOMXml and Xpath, to update XML entries

Hello, I know there are many questions here about combining these three topics to update XML entries, but it seems each one is very specific to a given problem.
I have been spending some time trying to understand XPath and its ways, but I still can't work out what I need to do.
Here we go.
I have this XML file:
<?xml version="1.0" encoding="UTF-8"?>
<storagehouse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="schema.xsd">
<item id="c7278e33ef0f4aff88da10dfeeaaae7a">
<name>HDMI Cable 3m</name>
<weight>0.5</weight>
<category>Cables</category>
<location>B3</location>
</item>
<item id="df799fb47bc1e13f3e1c8b04ebd16a96">
<name>Dell U2410</name>
<weight>2.5</weight>
<category>Monitors</category>
<location>C2</location>
</item>
</storagehouse>
What I would like to do is update/edit any of the nodes above when I need to. I will build an HTML form for that.
But my biggest concern is: how do I find the desired node and update it?
Here is some of what I am trying to do:
<?php
function fnDOMEditElementCond()
{
$dom = new DOMDocument();
$dom->load('storage.xml');
$library = $dom->documentElement;
$xpath = new DOMXPath($dom);
// I kind of understand this one here
$result = $xpath->query('/storagehouse/item[1]/name');
//This one not so much
$result->item(0)->nodeValue .= ' Series';
// This will remove the CDATA property of the element.
//To retain it, delete this element (see delete eg) & recreate it with CDATA (see create xml eg).
//2nd Way
//$result = $xpath->query('/library/book[author="J.R.R.Tolkein"]');
// $result->item(0)->getElementsByTagName('title')->item(0)->nodeValue .= ' Series';
header("Content-type: text/xml");
echo $dom->saveXML();
}
?>
Could someone maybe give me an example with attributes and so on, so that once a user decides to update a desired node, I could find that node with XPath and then update it?
The following example makes use of SimpleXML, which is a close friend of DOMDocument. The xpath shown is the same regardless of which one you use, and I use SimpleXML here to keep the code short. I'll show a more advanced DOMDocument example later on.
So about the xpath: How to find the node and update it. First of all how to find the node:
The node has the element/tagname item. You are looking for it inside the storagehouse element, which is the root element of your XML document. All item elements in your document are expressed like this in xpath:
/storagehouse/item
From the root, first storagehouse, then item, separated by /. You already know that, so the interesting part is how to take only those item elements that have a specific ID. For that, a predicate is used and added at the end:
/storagehouse/item[@id="id"]
This will return all item elements again, but this time only those that have the attribute id with the value id (string). For example in your case with the following XML:
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<storagehouse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="schema.xsd">
<item id="c7278e33ef0f4aff88da10dfeeaaae7a">
<name>HDMI Cable 3m</name>
<weight>0.5</weight>
<category>Cables</category>
<location>B3</location>
</item>
<item id="df799fb47bc1e13f3e1c8b04ebd16a96">
<name>Dell U2410</name>
<weight>2.5</weight>
<category>Monitors</category>
<location>C2</location>
</item>
</storagehouse>
XML;
that xpath:
/storagehouse/item[@id="df799fb47bc1e13f3e1c8b04ebd16a96"]
will return the computer monitor (because an item with that id exists). If there were multiple items with the same id value, all of them would be returned. If there were none, none would be returned. So let's wrap that into a code example:
$id = 'df799fb47bc1e13f3e1c8b04ebd16a96'; // the id of the item to update
$simplexml = simplexml_load_string($xml);
$result = $simplexml->xpath(sprintf('/storagehouse/item[@id="%s"]', $id));
if (!$result || count($result) !== 1) {
throw new Exception(sprintf('Item with id "%s" does not exist or is not unique.', $id));
}
list($item) = $result;
In this example, $item is the SimpleXMLElement object of that computer monitor XML element named item.
So now for the changes, which are extremely easy with SimpleXML in your case:
$item->category = 'LCD Monitor';
And to finally see the result:
echo $simplexml->asXML();
Yes that's all with SimpleXML in your case.
If you want to do this with DOMDocument, it works quite similarly. However, to update an element's value, you need to access the child element of that item as well. The following example first fetches the item again. If you compare it with the SimpleXML example above, you can see that things do not really differ:
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$result = $xpath->query(sprintf('/storagehouse/item[@id="%s"]', $id));
if (!$result || $result->length !== 1) {
throw new Exception(sprintf('Item with id "%s" does not exist or is not unique.', $id));
}
$item = $result->item(0);
Again, $item contains the item XML element of the computer monitor, but this time as a DOMElement. To modify the category element in there (or more precisely its nodeValue), that child needs to be obtained first. You can do this again with xpath, but this time with an expression relative to the $item element:
./category
Assuming that there always is a category child-element in the item element, this could be written as such:
$category = $xpath->query('./category', $item)->item(0);
$category now contains the first category child element of $item. What's left is updating its value:
$category->nodeValue = "LCD Monitor";
And to finally see the result:
echo $doc->saveXML();
And that's it. Whether you choose SimpleXML or DOMDocument depends on your needs. You can even switch between both. You might also want to map the items to objects and check for changes:
$repository = new Repository($xml);
$item = $repository->getItemByID($id);
$item->category = 'LCD Monitor';
$repository->saveChanges();
echo $repository->getXML();
Naturally this requires more code, which is too much for this answer.
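For orientation only, a stripped-down version of such a class could look like the sketch below. It reuses the Repository, getItemByID and getXML names from the snippet above, but the implementation is an assumption: it simply wraps SimpleXML, does no change tracking, and leaves out saveChanges() because edits made through SimpleXML apply to the tree directly.

```php
<?php
// Hypothetical sketch of the Repository idea; not an existing library class.
final class Repository
{
    private $root;

    public function __construct(string $xml)
    {
        $this->root = new SimpleXMLElement($xml);
    }

    public function getItemByID(string $id): SimpleXMLElement
    {
        // Assumes $id contains no double quote (true for hex ids like the ones above).
        $result = $this->root->xpath(sprintf('/storagehouse/item[@id="%s"]', $id));
        if (!$result || count($result) !== 1) {
            throw new RuntimeException(
                sprintf('Item with id "%s" does not exist or is not unique.', $id)
            );
        }
        return $result[0];
    }

    public function getXML(): string
    {
        return (string) $this->root->asXML();
    }
}

$repository = new Repository(
    '<storagehouse><item id="df799fb47bc1e13f3e1c8b04ebd16a96">'
    . '<name>Dell U2410</name><category>Monitors</category></item></storagehouse>'
);
$item = $repository->getItemByID('df799fb47bc1e13f3e1c8b04ebd16a96');
$item->category = 'LCD Monitor';
echo $repository->getXML();
```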

Hide XML declaration in files generated using PHP

I was testing a simple example of how to display XML in the browser using PHP and found this example, which works well:
<?php
$xml = new DOMDocument("1.0");
$root = $xml->createElement("data");
$xml->appendChild($root);
$id = $xml->createElement("id");
$idText = $xml->createTextNode('1');
$id->appendChild($idText);
$title = $xml->createElement("title");
$titleText = $xml->createTextNode('Valid');
$title->appendChild($titleText);
$book = $xml->createElement("book");
$book->appendChild($id);
$book->appendChild($title);
$root->appendChild($book);
$xml->formatOutput = true;
echo "<xmp>". $xml->saveXML() ."</xmp>";
$xml->save("mybooks.xml") or die("Error");
?>
It produces the following output:
<?xml version="1.0"?>
<data>
<book>
<id>1</id>
<title>Valid</title>
</book>
</data>
Now I have two questions regarding how the output should look.
1. The first line of the XML file, '<?xml version="1.0"?>', should not be displayed; that is, it should be hidden.
2. How can I display the text node on the next line? In total I am expecting output like this:
<data>
<book>
<id>1</id>
<title>
Valid
</title>
</book>
</data>
Is it possible to get the desired output, and if so, how can I accomplish that?
Thanks
To skip the XML declaration you can use the result of saveXML on the root node:
$xml_content = $xml->saveXML($root);
file_put_contents("mybooks.xml", $xml_content) or die("cannot save XML");
Please note that saveXML(node) has a different output from saveXML().
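A small sketch of that difference, assuming the same data/book structure as in the question (the variable names are illustrative):

```php
<?php
$doc = new DOMDocument('1.0');
$doc->formatOutput = true;

$root = $doc->appendChild($doc->createElement('data'));
$book = $root->appendChild($doc->createElement('book'));
$book->appendChild($doc->createElement('id', '1'));
$book->appendChild($doc->createElement('title', 'Valid'));

// Whole document: the output starts with the XML declaration.
$withDeclaration = $doc->saveXML();

// Single node: only that element and its descendants are serialized.
$withoutDeclaration = $doc->saveXML($doc->documentElement);

echo $withDeclaration, "\n", $withoutDeclaration, "\n";
```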
First question:
here is my post where all usable threads with answers are listed: How do you exclude the XML prolog from output?
Second question:
I don't know of any PHP function that outputs text nodes like that.
You could:
1. read the XML using DOMDocument and save each node as a string
2. iterate through the nodes
3. detect text nodes and add new lines to the XML string manually
At the end you would have the same XML with the text node values on a new line:
<node>
some text data
</node>
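A variation on this idea is to rewrite the text nodes inside the DOM tree before serializing, instead of patching the output string. The sketch below is only one possible approach: the function name putTextOnOwnLine is invented for this example, it assumes two spaces of indentation per level, and it only touches elements whose single child is a text node. Keep in mind that this changes the text content itself, since the added whitespace becomes part of the value.

```php
<?php
// Hypothetical helper: wraps the text of text-only elements in line breaks
// and indentation so that it is printed on its own line.
function putTextOnOwnLine(DOMNode $node, int $depth = 0): void
{
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMElement) {
            putTextOnOwnLine($child, $depth + 1);
        } elseif ($child instanceof DOMText
            && $node->childNodes->length === 1
            && trim($child->nodeValue) !== ''
        ) {
            $child->nodeValue = "\n" . str_repeat('  ', $depth + 1)
                . trim($child->nodeValue)
                . "\n" . str_repeat('  ', $depth);
        }
    }
}

$doc = new DOMDocument('1.0');
$doc->formatOutput = true;
$root = $doc->appendChild($doc->createElement('data'));
$book = $root->appendChild($doc->createElement('book'));
$book->appendChild($doc->createElement('id', '1'));
$book->appendChild($doc->createElement('title', 'Valid'));

putTextOnOwnLine($doc->documentElement);
$output = $doc->saveXML($doc->documentElement);
echo $output, "\n";
```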

Adding a block of XML as child of a SimpleXMLElement object

I have this SimpleXMLElement object with a XML setup similar to the following...
$xml = <<<EOX
<books>
<book>
<name>ABCD</name>
</book>
</books>
EOX;
$sx = new SimpleXMLElement( $xml );
Now I have a class named Book that contains info about each book. The same class can also output the book info in XML format akin to the above (the nested block). For example,
$book = new Book( 'EFGH' );
$book->genXML();
... will generate
<book>
<name>EFGH</name>
</book>
Now I'm trying to figure out a way to take this generated XML block and append it as a child of <books>, so that it looks like this, for example:
// Non-existent member method. For illustration purposes only.
$sx->addXMLChild( $book->genXML() );
...XML tree now looks like:
<books>
<book>
<name>ABCD</name>
</book>
<book>
<name>EFGH</name>
</book>
</books>
From what documentation I have read on SimpleXMLElement, addChild() won't get this done for you as it doesn't support XML data as tag value.
Two solutions. First, you do it with the help of libxml / DOMDocument / SimpleXML: you have to import your $sx object to DOM, create a DOMDocumentFragment and use DOMDocumentFragment::appendXML():
$doc = dom_import_simplexml($sx)->ownerDocument;
$fragment = $doc->createDocumentFragment();
$fragment->appendXML($book->genXML());
$doc->documentElement->appendChild($fragment);
// your original $sx is now already modified.
See the Online Demo.
You can also extend SimpleXMLElement and add a method that provides this. Using this specialized object would then allow you to write the following:
$sx = new MySimpleXMLElement($xml);
$sx->addXML($book->genXML());
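A minimal sketch of such a subclass, reusing the document fragment technique from above. The class and method names follow the snippet, but the body is an assumption about how it could be implemented, and the XML strings stand in for $xml and $book->genXML().

```php
<?php
// Hypothetical subclass: adds a method that appends raw XML as child nodes.
class MySimpleXMLElement extends SimpleXMLElement
{
    public function addXML(string $xml): void
    {
        $node     = dom_import_simplexml($this);
        $fragment = $node->ownerDocument->createDocumentFragment();
        if (!$fragment->appendXML($xml)) {
            throw new InvalidArgumentException('The given XML is not well-formed.');
        }
        // DOM and SimpleXML share the same tree, so $this is modified too.
        $node->appendChild($fragment);
    }
}

$sx = new MySimpleXMLElement('<books><book><name>ABCD</name></book></books>');
$sx->addXML('<book><name>EFGH</name></book>');
echo $sx->asXML();
```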
Another solution is to use an XML library that already has this feature built-in like SimpleDOM. You grab SimpleDOM and you use insertXML(), which works like the addXMLChild() method you were describing.
include 'SimpleDOM.php';
$books = simpledom_load_string(
'<books>
<book>
<name>ABCD</name>
</book>
</books>'
);
$books->insertXML(
'<book>
<name>EFGH</name>
</book>'
);
Have a look at my code:
$doc = new DOMDocument();
$doc->loadXML("<root/>");
$fragment = $doc->createDocumentFragment();
$fragment->appendXML("<foo>text</foo><bar>text2</bar>");
$doc->documentElement->appendChild($fragment);
echo $doc->saveXML();
This modifies the XML document by adding an XML fragment. Online Demo.
