get the last child node from xml using DOMDocument in php - php

Here is my xml:
<details>
<car>
<id>61XZB6</id>
<Jan-01-14>20</Jan-01-14>
<Jan-02-14>435</Jan-02-14>
<Jan-03-14>454</Jan-03-14>
<Jan-04-14>768</Jan-04-14>
<Jan-05-14>24</Jan-05-14>
<Jan-06-14>675</Jan-06-14>
<Jan-07-14>213</Jan-07-14>
<Jan-08-14>44</Jan-08-14>
<Jan-09-14>565</Jan-09-14>
<Jan-10-14>80</Jan-10-14>
<Jan-11-14>998</Jan-11-14>
<Jan-12-14>67</Jan-12-14>
<Jan-13-14>77</Jan-13-14>
<Jan-14-14>909</Jan-14-14>
<Jan-15-14>34</Jan-15-14>
<Jan-16-14>887</Jan-16-14>
<Jan-17-14>767</Jan-17-14>
<Jan-18-14>545</Jan-18-14>
<Jan-19-14>67</Jan-19-14>
<Jan-20-14>787</Jan-20-14>
<Jan-21-14>898</Jan-21-14>
<Jan-22-14>435</Jan-22-14>
<Jan-23-14>42</Jan-23-14>
<Jan-24-14>232</Jan-24-14>
<Jan-25-14>234</Jan-25-14>
<Jan-26-14>675</Jan-26-14>
<Jan-27-14>46</Jan-27-14>
<Jan-28-14>546</Jan-28-14>
<Jan-29-14>88</Jan-29-14>
<Jan-30-14>0</Jan-30-14>
<Jan-31-14>0</Jan-31-14>
</car>
</details>
My question is how to check the last node inside each tag before inserting a new node into that tag. Thanks in advance for any help.

If I understand you correctly, you would like to check for the last element in each car element node? XPath has two functions, position() and last(), that can be used in a condition.
Select the car nodes
/details/car
Select the child element nodes of the car nodes
/details/car/*
Add a condition to limit the selection to the last node
/details/car/*[last()]
Full example: https://eval.in/145531
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
foreach ($xpath->evaluate('/details/car/*[last()]') as $node) {
var_dump(
$node->nodeName,
$node->nodeValue
);
}
Output:
string(9) "Jan-31-14"
string(1) "0"
HINT!
Flexible element names are really bad style; you will not be able to define them in a schema. If possible, I suggest you change them to something like <amount date="Jan-31-14">0</amount>
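Coming back to the actual question (check the last node before inserting a new one), here is a minimal sketch. It uses a shortened copy of the XML; the new element name and value ($newName, $newValue) are made-up illustration values, and the rule "append only if the last element has a different name" is my assumption about what the check should do.

```php
<?php
// Shortened copy of the XML from the question.
$xml = <<<'XML'
<details>
  <car>
    <id>61XZB6</id>
    <Jan-30-14>0</Jan-30-14>
    <Jan-31-14>0</Jan-31-14>
  </car>
</details>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Hypothetical values for the node that should be added.
$newName = 'Feb-01-14';
$newValue = '12';

foreach ($xpath->evaluate('/details/car') as $car) {
    // Relative expression: the last element child of the current car node.
    $last = $xpath->evaluate('*[last()]', $car)->item(0);
    // Append only if the car is empty or its last element has another name.
    if ($last === null || $last->nodeName !== $newName) {
        $car->appendChild($dom->createElement($newName, $newValue));
    }
}

// Collect the name of the last element of each car after the update.
$lastNames = [];
foreach ($xpath->evaluate('/details/car/*[last()]') as $node) {
    $lastNames[] = $node->nodeName;
}

echo $dom->saveXML();
```

Running the loop a second time would not add a duplicate, because the last element then already carries the new name.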

Related

php xpath with dom how to write shorter querys

The xml file that I need to get out data is large and has lot of inner children in children. The XML on what I need to query looks like in the picture and it has more Ref children in company.
xml structure
I need to get the company node that has the correct Info->ID. That node has 3 Ref nodes and I need to get Date from the one that has the correct Who.
I got this working with this ugly code:
$query_pod = "//Seller/Company/Info/ID[ID = 'IV'] | //Seller/Company/Ref/Who[Who = 'VA'] | //Seller/Company/Ref/Date";
foreach ($xpath->query( $query_pod ) as $pod)
{
$pod_dates = $xpath->query( "//Seller/Company/Ref/Who[Who = 'VA'] | //Seller/Company/Ref/Date", $pod );
$pod_date = $pod_dates->item(0)->nodeValue;
}
I tried to make it shorter, but I can't get the selected element in [] like this:
$query_pod = "//Seller/Company/Info[ID = 'IV']";
foreach ($xpath->query( $query_pod ) as $pod)
{
$pod_dates = $xpath->query( "//Seller/Company/Ref/Date[Who = 'VA']", $pod );
$pod_date = $pod_dates->item(0)->nodeValue;
}
Can someone help? I'm new to xpath.
Your second expression (the one inside the loop) does not depend on the context. A slash (/) at the beginning of a location path means that it is absolute: it starts at the document node. [] are conditions that filter the result of the preceding location path. They are full XPath expressions themselves. So no, it is not always possible to shorten XPath expressions. But in your case you can fetch the company nodes as a list and then read the details directly using DOMXpath::evaluate(). Unlike DOMXpath::query(), DOMXpath::evaluate() can return node lists or scalar values, depending on the expression.
$xml = <<<'XML'
<Seller>
<Company>
<Info>
<ID>123</ID>
<Address>Some street, somewhere</Address>
</Info>
<Ref>
<Who>456</Who>
<Date>2017-01-05T01:59Z</Date>
</Ref>
</Company>
</Seller>
XML;
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
foreach ($xpath->evaluate('/Seller/Company[Info/ID = "123"]') as $company) {
var_dump(
$xpath->evaluate('string(Info/ID)', $company),
$xpath->evaluate('string(Ref/Date)', $company)
);
}
Output:
string(3) "123"
string(17) "2017-01-05T01:59Z"
If you only need a single value you can fetch it directly:
var_dump(
$xpath->evaluate('string(/Seller/Company[Info/ID = "123"]/Ref/Date)')
);
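Applied to the values from the question (ID 'IV', Who 'VA'), the same idea could look like the sketch below. The XML is only my guess at the structure, since the original was posted as a picture; the second condition sits on Ref, because Who is a child of Ref and not of Date.

```php
<?php
// Assumed structure, reconstructed from the description in the question.
$xml = <<<'XML'
<Seller>
  <Company>
    <Info><ID>IV</ID></Info>
    <Ref><Who>XX</Who><Date>2017-01-01T00:00Z</Date></Ref>
    <Ref><Who>VA</Who><Date>2017-01-05T01:59Z</Date></Ref>
  </Company>
</Seller>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// Filter the company by Info/ID, then the Ref by Who, then read Date.
$date = $xpath->evaluate(
    'string(/Seller/Company[Info/ID = "IV"]/Ref[Who = "VA"]/Date)'
);
var_dump($date);
```

If no node matches, string() returns an empty string, so there is no need to check item(0) for null.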

Parse XML with tags not grouped under one tag, but really should be

I am unfortunate enough to be working with an API that puts images on the same XML level as the other tags and makes the index (1, 2, 3, 4) part of the image tag name. The total number of images per vehicle varies.
<Vehicle>
<TITLE>Some car name i dont need</TITLE>
<DESCRIPTION>Some description i also dont need</DESCRIPTION>
<IMAGE_URL1>{imagelinkhere i want}</IMAGE_URL1>
<IMAGE_URL2>{imagelinkhere i want}</IMAGE_URL2>
<IMAGE_URL3>{imagelinkhere i want}</IMAGE_URL3>
<IMAGE_URL4>{imagelinkhere i want}</IMAGE_URL4>
</Vehicle>
I am using PHP's simplexml_load_file(xml_url) function to parse the entire XML into an object.
My question: Is there a way to get these images using the same function that is also efficient and clean?
EDIT:
I have just refined the XML to show that there are other tags there that I don't need and am already handling.
$xml = '<Vehicle>
<DESCRIPTION/>
<IMAGE_URL1>{imagelinkhere}</IMAGE_URL1>
<IMAGE_URL2>{imagelinkhere}</IMAGE_URL2>
<IMAGE_URL3>{imagelinkhere}</IMAGE_URL3>
<IMAGE_URL4>{imagelinkhere}</IMAGE_URL4>
</Vehicle>';
$parsed = simplexml_load_string($xml);
If you know that the image URL tags will always contain the name IMAGE_URL, you can check for it:
foreach ($parsed as $key => $image) {
if (strpos($key, 'IMAGE_URL') !== false) {
echo $image, '<br/>';
}
}
You can fetch the nodes with Xpath.
$xml = <<<'XML'
<Vehicle>
<TITLE>Some car name i dont need</TITLE>
<DESCRIPTION>Some description i also dont need</DESCRIPTION>
<IMAGE_URL1>image1</IMAGE_URL1>
<IMAGE_URL2>image2</IMAGE_URL2>
<IMAGE_URL3>image3</IMAGE_URL3>
<IMAGE_URL4>image4</IMAGE_URL4>
</Vehicle>
XML;
$vehicle = new SimpleXMLElement($xml);
foreach ($vehicle->xpath('*[starts-with(local-name(), "IMAGE_URL")]') as $imageUrl) {
var_dump((string)$imageUrl);
}
Output:
string(6) "image1"
string(6) "image2"
string(6) "image3"
string(6) "image4"
* selects all element child nodes. [] is a condition. In this case a validation that the local name (tag name without any namespace prefix) starts with a specific string.
This does not look much different in DOM, but there you start at the document context.
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
foreach ($xpath->evaluate('/Vehicle/*[starts-with(local-name(), "IMAGE_URL")]') as $imageUrl) {
var_dump($imageUrl->textContent);
}
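If the feed contains several vehicles, the same expression can be run relative to each one. A minimal sketch, assuming a wrapper element named Vehicles (the question only shows a single Vehicle) and using TITLE as the array key purely for illustration:

```php
<?php
// Assumed wrapper element "Vehicles"; the values are placeholders.
$xml = <<<'XML'
<Vehicles>
  <Vehicle>
    <TITLE>A</TITLE>
    <IMAGE_URL1>a1</IMAGE_URL1>
    <IMAGE_URL2>a2</IMAGE_URL2>
  </Vehicle>
  <Vehicle>
    <TITLE>B</TITLE>
    <IMAGE_URL1>b1</IMAGE_URL1>
  </Vehicle>
</Vehicles>
XML;

$vehicles = new SimpleXMLElement($xml);
$images = [];
foreach ($vehicles->Vehicle as $vehicle) {
    $urls = [];
    // The expression is relative, so it only matches children of this vehicle.
    foreach ($vehicle->xpath('*[starts-with(local-name(), "IMAGE_URL")]') as $imageUrl) {
        $urls[] = (string)$imageUrl;
    }
    $images[(string)$vehicle->TITLE] = $urls;
}
var_dump($images);
```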

how to differentiate these two xml tags with childnodes

i have two tags in my sample xml as below,
<EmailAddresses>2</EmailAddresses>
<EmailAddresses>
<string>Allen.Patterson01#fantasyisland.com</string>
<string>Allen.Patterson12#fantasyisland.com</string>
</EmailAddresses>
How can I differentiate these two XML tags based on their child nodes? That is, how do I check with PHP DOM that the first tag has no child nodes and the other one has?
I hope this meets your requirement. Just copy, paste, and run it, then change or add whatever logic you want.
<?php
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<email>
<EmailAddresses>2</EmailAddresses>
<EmailAddresses>
<string>Allen.Patterson01#fantasyisland.com</string>
<string>Allen.Patterson12#fantasyisland.com</string>
</EmailAddresses>
</email>
XML;
$email = new SimpleXMLElement($xmlstr);
foreach ($email as $key => $value) {
if (count($value) > 0) { // has child elements (works for a single <string>, too)
var_dump($value);
//write your logic to process email strings
} else {
var_dump($value);
// count of emails
}
}
?>
You can use ->getElementsByTagName( 'string' ):
foreach( $dom->getElementsByTagName( 'EmailAddresses' ) as $node )
{
if( $node->getElementsByTagName( 'string' )->length )
{
// Code for <EmailAddresses><string/></EmailAddresses>
}
else
{
// Code for <EmailAddresses>2</EmailAddresses>
}
}
The text 2 is itself a child node (a text node) of <EmailAddresses>, so in your XML ->hasChildNodes() always returns true.
You have this problem because of the unusual design of your XML structure.
If you don't have a particular reason to keep this XML syntax, I suggest you use only one tag:
<EmailAddresses count="2">
<string>Allen.Patterson01#fantasyisland.com</string>
<string>Allen.Patterson12#fantasyisland.com</string>
</EmailAddresses>
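To see why hasChildNodes() does not help here, compare it with a check that only counts element nodes. This is just a sketch; the wrapper element and the placeholder texts are made up.

```php
<?php
// Assumed wrapper element <xml>; "first"/"second" are placeholder values.
$xml = <<<'XML'
<xml>
  <EmailAddresses>2</EmailAddresses>
  <EmailAddresses>
    <string>first</string>
    <string>second</string>
  </EmailAddresses>
</xml>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$results = [];
foreach ($dom->getElementsByTagName('EmailAddresses') as $node) {
    // Look for at least one child that is an element, ignoring text nodes.
    $hasElementChild = false;
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $hasElementChild = true;
            break;
        }
    }
    // First value: any child node at all; second value: any element child.
    $results[] = [$node->hasChildNodes(), $hasElementChild];
}
var_dump($results);
```

Both nodes report child nodes, but only the second one has an element child.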
Xpath allows you to do that.
$xml = <<<'XML'
<xml>
<EmailAddresses>2</EmailAddresses>
<EmailAddresses>
<string>Allen.Patterson01#fantasyisland.com</string>
<string>Allen.Patterson12#fantasyisland.com</string>
</EmailAddresses>
</xml>
XML;
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
var_dump(
$xpath->evaluate('number(//EmailAddresses[not(*)])')
);
foreach ($xpath->evaluate('//EmailAddresses/string') as $address) {
var_dump($address->textContent);
}
Output:
float(2)
string(35) "Allen.Patterson01#fantasyisland.com"
string(35) "Allen.Patterson12#fantasyisland.com"
The Expressions
Fetch the first EmailAddresses node without any element node child as a number.
Select any EmailAddresses element node:
//EmailAddresses
That does not contain another element node as child node:
//EmailAddresses[not(*)]
Cast the first of the fetched EmailAddresses nodes into a number:
number(//EmailAddresses[not(*)])
Fetch the string child nodes of the EmailAddresses element nodes.
Select any EmailAddresses element node:
//EmailAddresses
Get their string child nodes:
//EmailAddresses/string
In your example the first EmailAddresses element seems to duplicate information and to store it in a weird way. XPath can count nodes, too. The expression count(//EmailAddresses/string) would return the number of nodes.
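A short sketch of the count() variant, again with made-up placeholder values instead of the real addresses:

```php
<?php
// Only the list element is kept; the separate counter element is not needed.
$xml = <<<'XML'
<xml>
  <EmailAddresses>
    <string>first</string>
    <string>second</string>
  </EmailAddresses>
</xml>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// count() is a scalar expression, so evaluate() returns a float.
$count = $xpath->evaluate('count(//EmailAddresses/string)');
var_dump($count);
```

Note that the result is a float; cast it with (int) if you need an integer.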

PHP DOM Cut xml in to pieces and save each child with parent separately

I have next type of XML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<tag1>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tag1>
<tag2>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tag2>
...
<tagN>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tagN>
</root>
And i need to get root with each child element separately in array saved as HTML:
array = [rootwithchild1,rootwithchild2...N];
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<tagN>
<1>Name</1>
<2>Num1</2>
<3>NumOrder</3>
<4>test</5>
<6>line</6>
<7>HTTP </7>
<8>1</8>
<9></9>
</tagN>
</root>
For now I build two DOMs: in one I get all the children separately; in the other I have deleted all children and left only the root. At this step I wanted to add each child to the root, save as HTML, delete the child, and so on for each child, but this doesn't work.
$bodyNode = $copydoc->getElementsByTagName('root')->item(0);
foreach ($mini as $value) {
$bodyNode->appendChild($value);
$result[] = $copydoc->saveHTML();
$bodyNode->removeChild($value);
}
Error on $bodyNode->appendChild($value);
$mini is the array of cut-out children.
Library: $doc = new DOMDocument();
Can anyone advise how to do this right? Maybe it is better to use XPath or something else?
Thanks
I would simply create a new document that contains only the root element and a “fake” initial child:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test SYSTEM "dtd">
<root>
<fakechild />
</root>
After that, loop over the child elements of the original document – and for each of those perform the following steps:
import the child node from the original document into the new document using DOMDocument::importNode
replace the current child node of the root element of the new document with the imported node using DOMNode::replaceChild with the firstChild of the root element as second parameter
save the new document
(Having the <fakechild /> in the root element to begin with is not technically necessary, a simple whitespace text node should do as well – but with an empty root element this would not work in such a straight fashion, because the firstChild would give you NULL in the first loop iteration, so you would not have a node to feed to DOMNode::replaceChild as second parameter. Of course you could do additional checks for that and use appendChild instead of replaceChild for the first item … but why complicate stuff more than necessary.)
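The steps above can be sketched roughly like this. I use valid element names (name instead of the numeric tags from the question) and leave out the DOCTYPE, so treat the XML as an assumption.

```php
<?php
// Original document with several children below the root.
$source = new DOMDocument();
$source->loadXML(
    '<root><tag1><name>A</name></tag1><tag2><name>B</name></tag2></root>'
);

// New document: only the root element and a placeholder child.
$target = new DOMDocument();
$target->loadXML('<root><fakechild/></root>');
$root = $target->documentElement;

$result = [];
foreach ($source->documentElement->childNodes as $child) {
    // Skip whitespace text nodes, comments and so on.
    if ($child->nodeType !== XML_ELEMENT_NODE) {
        continue;
    }
    // Copy the node (deep) into the new document ...
    $imported = $target->importNode($child, true);
    // ... and swap it in for whatever is currently the first child.
    $root->replaceChild($imported, $root->firstChild);
    $result[] = $target->saveXML();
}
print_r($result);
```

The source document is never modified, so iterating its child nodes directly is safe here.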
DOMDocument::getElementsByTagName() returns a live result. So if you remove a node from the DOM, it is removed from the node list as well.
You can iterate the list backwards...
for ($i = $nodes->length - 1; $i >= 0; $i--) {
$node = $nodes->item($i);
...
}
... or copy it to an array:
foreach (iterator_to_array($nodes) as $node) {
...
}
Node lists from DOMXpath::evaluate() are not affected that way. XPath allows a more specific selection of nodes, too.
$xpath = new DOMXpath($domDocument);
$nodes = $xpath->evaluate('/root/*');
foreach (iterator_to_array($nodes) as $node) {
...
}
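A small sketch to make the "live" behaviour visible; the item elements are made up for the demonstration. The list is copied to an array before anything is removed, and the live list is inspected again afterwards.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<root><item/><item/><item/><item/></root>');

// Live list: it always reflects the current state of the document.
$nodes = $doc->getElementsByTagName('item');
$before = $nodes->length;

// Iterate over a static copy, so removing nodes cannot disturb the loop.
foreach (iterator_to_array($nodes) as $node) {
    $node->parentNode->removeChild($node);
}

// The same list object is now empty, because the nodes left the document.
$after = $nodes->length;
var_dump($before, $after);
```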
But I wonder why you are modifying (destroying) the original XML source.
I would create a new document to act as a template, never removing nodes, only creating new documents and importing nodes into them:
// load the original source
$source= new DOMDocument();
$source->loadXml($xml);
$xpath = new DOMXpath($source);
// create a template dom
$template = new DOMDocument();
$parent = $template;
// add a node and all its ancestors to the template
foreach ($xpath->evaluate('/root/part[1]/ancestor-or-self::*') as $node) {
$parent = $parent->appendChild($template->importNode($node, FALSE));
}
// for each of the child element nodes
foreach ($xpath->evaluate('/root/part/*') as $node) {
// create a new target
$target = new DOMDocument();
// import the nodes from the template
$target->appendChild($target->importNode($template->documentElement, TRUE));
// find the first element node that has no child element nodes
$targetXpath = new DOMXpath($target);
$targetNode = $targetXpath->evaluate('//*[count(*) = 0]')->item(0);
// append the child node from the original xml
$targetNode->appendChild($target->importNode($node, TRUE));
echo $target->saveXml(), "\n\n";
}
Demo: https://eval.in/191304

Using DOMXml and Xpath, to update XML entries

Hello I know there is many questions here about those three topics combined together to update XML entries, but it seems everyone is very specific to a given problem.
I have been spending some time trying to understand XPath and its way, but I still can't get what I need to do.
Here we go
I have this XML file
<?xml version="1.0" encoding="UTF-8"?>
<storagehouse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="schema.xsd">
<item id="c7278e33ef0f4aff88da10dfeeaaae7a">
<name>HDMI Cable 3m</name>
<weight>0.5</weight>
<category>Cables</category>
<location>B3</location>
</item>
<item id="df799fb47bc1e13f3e1c8b04ebd16a96">
<name>Dell U2410</name>
<weight>2.5</weight>
<category>Monitors</category>
<location>C2</location>
</item>
</storagehouse>
What I would like to do is update/edit any of the nodes above when I need to. I will build an HTML form for that.
But my biggest concern is: how do I find the desired node and update it?
Here I have some of what I am trying to do
<?php
function fnDOMEditElementCond()
{
$dom = new DOMDocument();
$dom->load('storage.xml');
$library = $dom->documentElement;
$xpath = new DOMXPath($dom);
// I kind of understand this one here
$result = $xpath->query('/storagehouse/item[1]/name');
//This one not so much
$result->item(0)->nodeValue .= ' Series';
// This will remove the CDATA property of the element.
//To retain it, delete this element (see delete eg) & recreate it with CDATA (see create xml eg).
//2nd Way
//$result = $xpath->query('/library/book[author="J.R.R.Tolkein"]');
// $result->item(0)->getElementsByTagName('title')->item(0)->nodeValue .= ' Series';
header("Content-type: text/xml");
echo $dom->saveXML();
}
?>
Could someone give me an example with attributes and so on, so that once a user decides to update a node, I can find that node with XPath and then update it?
The following example is making use of simplexml which is a close friend of DOMDocument. The xpath shown is the same regardless which method you use, and I use simplexml here to keep the code low. I'll show a more advanced DOMDocument example later on.
So about the xpath: How to find the node and update it. First of all how to find the node:
The node has the element/tagname item. You are looking for it inside the storagehouse element, which is the root element of your XML document. All item elements in your document are expressed like this in xpath:
/storagehouse/item
From the root, first storagehouse, then item. Divided with /. You already know that, so the interesting part is how to only take those item elements that have the specific ID. For that the predicate is used and added at the end:
/storagehouse/item[@id="id"]
This will return all item elements again, but this time only those which have the attribute id with the value id (string). For example in your case with the following XML:
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<storagehouse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="schema.xsd">
<item id="c7278e33ef0f4aff88da10dfeeaaae7a">
<name>HDMI Cable 3m</name>
<weight>0.5</weight>
<category>Cables</category>
<location>B3</location>
</item>
<item id="df799fb47bc1e13f3e1c8b04ebd16a96">
<name>Dell U2410</name>
<weight>2.5</weight>
<category>Monitors</category>
<location>C2</location>
</item>
</storagehouse>
XML;
that xpath:
/storagehouse/item[@id="df799fb47bc1e13f3e1c8b04ebd16a96"]
will return the computer monitor (because such an item with that id exists). If there would be multiple items with the same id value, multiple would be returned. If there were none, none would be returned. So let's wrap that into a code-example:
$simplexml = simplexml_load_string($xml);
$result = $simplexml->xpath(sprintf('/storagehouse/item[@id="%s"]', $id));
if (!$result || count($result) !== 1) {
throw new Exception(sprintf('Item with id "%s" does not exist or is not unique.', $id));
}
list($item) = $result;
In this example, $item is the SimpleXMLElement object of that computer monitor XML element named item.
So now for the changes, which are extremely easy with SimpleXML in your case:
$item->category = 'LCD Monitor';
And to finally see the result:
echo $simplexml->asXML();
Yes that's all with SimpleXML in your case.
If you want to do this with DOMDocument, it works in much the same way. However, to update an element's value, you need to access the child element of that item as well. The following example first fetches the item. If you compare it with the SimpleXML example above, you can see that things do not really differ:
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$result = $xpath->query(sprintf('/storagehouse/item[@id="%s"]', $id));
if (!$result || $result->length !== 1) {
throw new Exception(sprintf('Item with id "%s" does not exist or is not unique.', $id));
}
$item = $result->item(0);
Again, $item contains the item XML element of the computer monitor, but this time as a DOMElement. To modify the category element in there (or more precisely its nodeValue), that child needs to be obtained first. You can do this again with XPath, but this time with an expression relative to the $item element:
./category
Assuming that there always is a category child-element in the item element, this could be written as such:
$category = $xpath->query('./category', $item)->item(0);
$category does now contain the first category child element of $item. What's left is updating the value of it:
$category->nodeValue = "LCD Monitor";
And to finally see the result:
echo $doc->saveXML();
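Since the question also asked about attributes, here is the DOMDocument variant as one piece, extended by an attribute update. It is only a sketch: the XML is shortened, and the status attribute is a made-up example that is not part of the original document.

```php
<?php
// Shortened copy of the XML from the question.
$xml = <<<'XML'
<storagehouse>
  <item id="df799fb47bc1e13f3e1c8b04ebd16a96">
    <name>Dell U2410</name>
    <category>Monitors</category>
  </item>
</storagehouse>
XML;
$id = 'df799fb47bc1e13f3e1c8b04ebd16a96';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Find the item by its id attribute (@id).
$item = $xpath->query(sprintf('/storagehouse/item[@id="%s"]', $id))->item(0);
if ($item instanceof DOMElement) {
    // Update the text of a child element, relative to the item.
    $xpath->query('./category', $item)->item(0)->nodeValue = 'LCD Monitor';
    // Add or overwrite an attribute ("status" is a hypothetical name).
    $item->setAttribute('status', 'updated');
}

echo $doc->saveXML();
```

Keep in mind that sprintf() puts $id into the expression unescaped, so an id containing a double quote would break the XPath; validate it first if it comes from a form.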
And that's it. Whether you choose SimpleXML or DOMDocument, that depends on your needs. You can even switch between both. You probably might want to map and check for changes:
$repository = new Repository($xml);
$item = $repository->getItemByID($id);
$item->category = 'LCD Monitor';
$repository->saveChanges();
echo $repository->getXML();
Naturally this requires more code, which is too much for this answer.
