XPath in DOMDocument on WSDL file

I have a problem querying with XPath. I am trying to load a WSDL file and then get some nodes using XPath.
$DOMDocument = new DOMDocument();
$DOMDocument->loadXML($wsdl);
$DOMXpath = new DOMXPath($DOMDocument);
$elements = $DOMXpath->query('//definitions//binding');
var_dump($elements);
Result is:
class DOMNodeList#15 (1) {
public $length =>
int(0)
}
Here is WSDL file: http://pastebin.com/YDRzbq3x
How do I write a correct XPath query to traverse the nodes?

Your XML has a default namespace (xmlns="http://schemas.xmlsoap.org/wsdl/"). In this case, you need to register a prefix that points to that default namespace URI, then use that prefix in your XPath query:
.......
$DOMXpath->registerNamespace('d', "http://schemas.xmlsoap.org/wsdl/");
$elements = $DOMXpath->query('//d:definitions//d:binding');
.......
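The fix above can be sketched end to end as follows. Because the original WSDL is only linked, this uses a hypothetical, trimmed-down WSDL string; the element and attribute values in it are assumptions made for illustration, and only the namespace URI is taken from the answer.

```php
<?php
// Hypothetical minimal WSDL; only the default namespace matters for this sketch.
$wsdl = <<<'XML'
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="Demo">
  <binding name="DemoBinding" type="DemoPortType"/>
  <binding name="OtherBinding" type="OtherPortType"/>
</definitions>
XML;

$doc = new DOMDocument();
$doc->loadXML($wsdl);
$xpath = new DOMXPath($doc);

// An unprefixed name in XPath 1.0 means "element in no namespace": nothing matches.
$unprefixed = $xpath->query('//definitions//binding');

// The prefix "d" is arbitrary; only the URI has to match the document.
$xpath->registerNamespace('d', 'http://schemas.xmlsoap.org/wsdl/');
$prefixed = $xpath->query('//d:definitions//d:binding');

// Collect the name attribute of every binding that was found.
$names = [];
foreach ($prefixed as $binding) {
    $names[] = $binding->getAttribute('name');
}

var_dump($unprefixed->length, $names);
```

The unprefixed query should come back empty, while the prefixed one should find both binding elements.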

Related

Accessing XML data within namespaces

So my XML Looks like this :-
<ns0:ASN xmlns:ns0="http://schemas.microsoft.com/dynamics/2008/01/documents/ASN" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ns0:CustPackingSlipJour class="entity">
<ns0:BON_FileNameSeqNum>40</ns0:BON_FileNameSeqNum>
<ns0:BON_TotalNetAmount>10.00</ns0:BON_TotalNetAmount>
<ns0:BON_TotalTaxAmount>.00</ns0:BON_TotalTaxAmount>
<ns0:InvoiceAccount>Acc</ns0:InvoiceAccount>
<ns0:LanguageId>EN</ns0:LanguageId>
<ns0:OrderAccount>I</ns0:OrderAccount>
<ns0:PurchaseOrder>74</ns0:PurchaseOrder>
<ns0:Qty>13.00</ns0:Qty>
<ns0:SalesId>00025873_054</ns0:SalesId>
<ns0:CustPackingSlipTrans class="entity">
<ns0:BON_LineNetAmount>19.00</ns0:BON_LineNetAmount>
<ns0:BON_SalesPrice>0.00</ns0:BON_SalesPrice>
<ns0:DeliveryDate>2016-11-30</ns0:DeliveryDate>
<ns0:ItemId>25712</ns0:ItemId>
<ns0:Ordered>1.00</ns0:Ordered>
<ns0:PackingSlipId>00339_061</ns0:PackingSlipId>
<ns0:Qty>1.00</ns0:Qty>
</ns0:CustPackingSlipTrans>
<ns0:CustPackingSlipTrans class="entity">
<ns0:BON_LineNetAmount>19.00</ns0:BON_LineNetAmount>
<ns0:BON_SalesPrice>0.00</ns0:BON_SalesPrice>
<ns0:DeliveryDate>2-11-30</ns0:DeliveryDate>
<ns0:ItemId>25823-35714</ns0:ItemId>
<ns0:Ordered>1.00</ns0:Ordered>
<ns0:PackingSlipId>00_061</ns0:PackingSlipId>
<ns0:Qty>1.00</ns0:Qty>
</ns0:CustPackingSlipTrans>
</ns0:CustPackingSlipJour>
</ns0:ASN>
How can I access the value of ItemId for all CustPackingSlipTrans ?
I have tried various ways of getting it, for instance registering the namespace for XPath and then trying to access it. However, it isn't working for me. What's the best way to get its value?

The solution using DOMXPath::query method:
// $xml contains your xml contents
$doc = new \DOMDocument();
$doc->loadXML($xml);
$xpath = new \DOMXPath($doc);
foreach ($xpath->query("ns0:CustPackingSlipJour/ns0:CustPackingSlipTrans/ns0:ItemId") as $node) {
var_dump($node->nodeValue);
}
The output:
string(5) "25712"
string(11) "25823-35714"

You need to register the namespace with the DomXPath:
$xp = new DomXPath ($doc);
$xp->registerNamespace ('pfx', 'http://pfxuri');
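Applied to the XML from the question, explicit registration might look like the sketch below. The XML is shortened to the elements that matter, and the prefix asn is an arbitrary choice for this example (it does not have to be ns0); the absolute path is also an assumption about how you want to address the nodes.

```php
<?php
// Shortened version of the question's XML (only the ItemId elements are kept).
$xml = <<<'XML'
<ns0:ASN xmlns:ns0="http://schemas.microsoft.com/dynamics/2008/01/documents/ASN">
  <ns0:CustPackingSlipJour class="entity">
    <ns0:CustPackingSlipTrans class="entity"><ns0:ItemId>25712</ns0:ItemId></ns0:CustPackingSlipTrans>
    <ns0:CustPackingSlipTrans class="entity"><ns0:ItemId>25823-35714</ns0:ItemId></ns0:CustPackingSlipTrans>
  </ns0:CustPackingSlipJour>
</ns0:ASN>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xp = new DOMXPath($doc);
// "asn" is our own prefix; it is bound to the same URI the document uses for ns0.
$xp->registerNamespace('asn', 'http://schemas.microsoft.com/dynamics/2008/01/documents/ASN');

$itemIds = [];
$path = '/asn:ASN/asn:CustPackingSlipJour/asn:CustPackingSlipTrans/asn:ItemId';
foreach ($xp->query($path) as $node) {
    $itemIds[] = $node->nodeValue;
}

var_dump($itemIds);
```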

Getting specific xml data on php with xpath

I have an XML response like this:
<n:Crev xmlns:soap="http://a.com"
xmlns:obj="http://b.com"
xmlns:n="http://c.com"
xmlns:msg="http://d.com"
xmlns="http://e.com"
xmlns:xsi="http://f.com"
xsi:schemaLocation="http://g.com">
<n:Header>
<msg:mydata>123123</msg:mydata>
</n:Header>
</n:Crev>
Now I want to get the 'msg:mydata' value.
I tried some XPath expressions but they didn't work. I also tried an online XPath generator, which gives something like:
'/n:Crev[@xmlns:soap="http://a.com"]/n:Header/msg:mydata/text()'
but that didn't work either. So how can I write an XPath expression for that?
Thanks

I succeeded with the following code:
<?php
$xmlStr = '<n:Crev xmlns:soap="http://a.com"
xmlns:obj="http://b.com"
xmlns:n="http://c.com"
xmlns:msg="http://d.com"
xmlns="http://e.com"
xmlns:xsi="http://f.com"
xsi:schemaLocation="http://g.com">
<n:Header>
<msg:mydata>123123</msg:mydata>
</n:Header>
</n:Crev>';
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xmlStr);
$xmlPath = new DOMXPath($xmlDoc);
var_dump($xmlPath->query('//n:Crev/n:Header/msg:mydata')->item(0)->textContent);
result:
string '123123' (length=6)

n or msg are namespace prefixes. The actual namespaces are the values of the xmlns attributes. The XML parser will resolve the namespaces.
Here is a small example:
$document = new DOMDocument();
$document->loadXml('<n:Crev xmlns:n="http://c.com"/>');
var_dump(
$document->documentElement->namespaceURI,
$document->documentElement->localName
);
Output:
string(12) "http://c.com"
string(4) "Crev"
The following XMLs all would have the same output:
<n:Crev xmlns:n="http://c.com"/>
<Crev xmlns="http://c.com"/>
<c:Crev xmlns:c="http://c.com"/>
You can read the node as {http://c.com}Crev.
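A quick way to check this claim is to parse the three variants and build the {namespace}localName form for each root element. This is only an illustrative sketch; the variable names are made up here.

```php
<?php
// The three serializations from above: different prefixes, same namespace URI.
$variants = [
    '<n:Crev xmlns:n="http://c.com"/>',
    '<Crev xmlns="http://c.com"/>',
    '<c:Crev xmlns:c="http://c.com"/>',
];

$resolved = [];
foreach ($variants as $xml) {
    $document = new DOMDocument();
    $document->loadXML($xml);
    $root = $document->documentElement;
    // Expanded name: independent of the prefix used in the source text.
    $resolved[] = '{' . $root->namespaceURI . '}' . $root->localName;
}

// All three entries should be identical, so one unique value remains.
var_dump(array_unique($resolved));
```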
To fetch nodes or scalar values from the DOM you can use DOMXpath::evaluate(). But to match namespaces you will have to register prefixes for the Xpath expressions. This allows the Xpath engine to resolve the namespaces and match them against the node properties. The prefixes do not have to be the same as in the document.
$xml = <<<'XML'
<n:Crev xmlns:n="http://c.com" xmlns:msg="http://d.com">
<n:Header>
<msg:mydata>123123</msg:mydata>
</n:Header>
</n:Crev>
XML;
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('c', 'http://c.com');
$xpath->registerNamespace('msg', 'http://d.com');
var_dump(
$xpath->evaluate('string(/c:Crev/c:Header/msg:mydata)')
);
Output:
string(6) "123123"
If the expression is a location path like /c:Crev/c:Header/msg:mydata, the result will be a DOMNodeList, but Xpath functions or operators can return scalar values.
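The difference in return types can be seen in a small sketch like the following. The expressions using count() and boolean(), and the element name msg:missing, are additions for illustration and not part of the original answer.

```php
<?php
$document = new DOMDocument();
$document->loadXML(
    '<n:Crev xmlns:n="http://c.com" xmlns:msg="http://d.com">'
    . '<n:Header><msg:mydata>123123</msg:mydata></n:Header></n:Crev>'
);

$xpath = new DOMXPath($document);
$xpath->registerNamespace('c', 'http://c.com');
$xpath->registerNamespace('msg', 'http://d.com');

// A location path yields a DOMNodeList.
$nodes = $xpath->evaluate('/c:Crev/c:Header/msg:mydata');

// XPath functions yield scalars: string, float (XPath number) and bool.
$text   = $xpath->evaluate('string(/c:Crev/c:Header/msg:mydata)');
$count  = $xpath->evaluate('count(//msg:mydata)');
$exists = $xpath->evaluate('boolean(//msg:missing)');

var_dump(get_class($nodes), $text, $count, $exists);
```

Note that XPath numbers arrive in PHP as floats, so a count is 1.0 rather than the integer 1.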

Weird SimpleXML issue - can't reference nodes by name?

I'm trying to parse a remote XML file, which is valid:
$xml = simplexml_load_file('http://feeds.feedburner.com/HammersInTheHeart?format=xml');
The root element is feed, and I'm trying to grab it via:
$nodes = $xml->xpath('/feed'); //also tried 'feed', without slash
Except it doesn't find any nodes.
print_r($nodes); //empty array
Or any nodes of any kind, so long as I search for them by tag name, in fact:
$nodes = $xml->xpath('//entry');
print_r($nodes); //empty array
It does find nodes, however, if I use wildcards, e.g.
$nodes = $xml->xpath('/*/*[4]');
print_r($nodes); //node found
What's going on?

Unlike DOM, SimpleXML has no concept of a document object, only elements. So if you load an XML you always get the document element.
$feed = simplexml_load_file($xmlFile);
var_dump($feed->getName());
Output:
string(4) "feed"
That means that all Xpath expressions have to be relative to this element or absolute. A simple feed will not work because the context already is the feed element.
But there is another reason. The URL is an Atom feed, so the XML elements are in the namespace http://www.w3.org/2005/Atom. SimpleXML's magic syntax recognizes a default namespace for some calls, but Xpath does not. There is no default namespace in Xpath. You will have to register the namespace with a prefix and use that prefix in your Xpath expressions.
$feed = simplexml_load_file($xmlFile);
$feed->registerXpathNamespace('a', 'http://www.w3.org/2005/Atom');
foreach ($feed->xpath('/a:feed/a:entry[position() < 3]') as $entry) {
var_dump((string)$entry->title);
}
Output:
string(24) "Sharing the goals around"
string(34) "Kouyate inspires Hammers' comeback"
However, in SimpleXML the registration has to be done for each object you call the xpath() method on.
Using Xpath with DOM is slightly different but a lot more powerful.
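The per-object registration can be sketched like this. It uses a small inline Atom document with made-up titles instead of the remote feed, so the data is an assumption for illustration only.

```php
<?php
// Hypothetical inline Atom feed standing in for the remote one.
$atom = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>First</title></entry>
  <entry><title>Second</title></entry>
</feed>
XML;

$feed = simplexml_load_string($atom);
$feed->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');

$titles = [];
foreach ($feed->xpath('/a:feed/a:entry') as $entry) {
    // $entry is a separate SimpleXMLElement: register the prefix again
    // before calling xpath() on it.
    $entry->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');
    $titles[] = (string) $entry->xpath('a:title')[0];
}

var_dump($titles);
```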
$document = new DOMDocument();
$document->load($xmlFile);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');
foreach ($xpath->evaluate('/a:feed/a:entry[position() < 3]') as $entry) {
var_dump($xpath->evaluate('string(a:title)', $entry));
}
Output:
string(24) "Sharing the goals around"
string(34) "Kouyate inspires Hammers' comeback"
Xpath expressions used with DOMXpath::evaluate() can return scalar values.

XML Xpath Failing on getElementsByTagName

<?xml version="1.0" encoding="UTF-8"?>
<AddProduct>
<auth><id>vendor123</id><auth_code>abc123</auth_code></auth>
</AddProduct>
What am I doing wrong to get "Fatal error: Call to undefined method DOMNodeList::getElementsByTagName()"?
$xml = $_GET['xmlRequest'];
$dom = new DOMDocument();
@$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$auth = $xpath->query('*/auth');
$id = $auth->getElementsByTagName('id')->item(0)->nodeValue;
$code = $auth->getElementsByTagName('auth_code')->item(0)->nodeValue;

You could retrieve the data (in the XML you posted) you want using XPath only:
$id = $xpath->query('//auth/id')->item(0)->nodeValue;
$code = $xpath->query('//auth/auth_code')->item(0)->nodeValue;
You are also calling getElementsByTagName() on $auth, which is a DOMNodeList (the result of DOMXPath::query()), as @Ohgodwhy pointed out in the comments; that is what causes the error. If you want to use it, call it on $dom or on a single element taken from the list.
Your XPath expression */auth is relative: it selects auth elements that are children of any child element of the current (context) node. Unless your XML file is different, it's clearer to use one of:
/*/auth # returns auth nodes two levels below the document root
/AddProduct/auth # returns auth nodes directly below /AddProduct
//auth # returns all auth nodes
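A minimal sketch of both options, using the XML from the question: take one element out of the node list before calling element methods on it, or pass that element as the context node of a relative expression. The variable names follow the question; the use of evaluate() with a context node is an addition for illustration.

```php
<?php
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<AddProduct>
<auth><id>vendor123</id><auth_code>abc123</auth_code></auth>
</AddProduct>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// query() returns a DOMNodeList; item(0) picks a single DOMElement from it.
$auth = $xpath->query('/AddProduct/auth')->item(0);

// getElementsByTagName() exists on DOMElement and DOMDocument, not on DOMNodeList.
$id = $auth->getElementsByTagName('id')->item(0)->nodeValue;

// Alternatively, evaluate a relative expression with $auth as the context node.
$code = $xpath->evaluate('string(auth_code)', $auth);

var_dump($id, $code);
```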

This is what I came up with after reviewing php's documentation (http://us1.php.net/manual/en/class.domdocument.php, http://us1.php.net/manual/en/domdocument.loadxml.php, http://us3.php.net/manual/en/domxpath.query.php, http://us3.php.net/domxpath)
$dom = new DOMDocument();
$dom->loadXML($xml);
$id = $dom->getElementsByTagName("id")->item(0)->nodeValue;
$code = $dom->getElementsByTagName("auth_code")->item(0)->nodeValue;
As helderdarocha and Ohgodwhy pointed out, getElementsByTagName is a DOMDocument method, not a method of the DOMNodeList that DOMXPath::query() returns. I like helderdarocha's solution that only uses XPath; the solution I posted accomplishes the same thing but only uses the DOMDocument.

In DomDocument, reuse of DOMXpath, it is stable?

I am using the function below, but I am not sure whether it is always stable/safe. Is it?
When, and in which cases, is it stable/safe to "reuse parts of the DOMXpath preparation procedure"?
To simplify the use of the XPath query() method, we can adopt a function that memorizes the last call with static variables:
function DOMXpath_reuser($file) {
    static $doc = NULL;
    static $docName = '';
    static $xp = NULL;
    if (!$doc)
        $doc = new DOMDocument();
    if ($file != $docName) {
        $doc->loadHTMLFile($file);
        $docName = $file; // remember the loaded file so the next call can reuse it
        $xp = NULL;
    }
    if (!$xp)
        $xp = new DOMXpath($doc);
    return $xp; // ??RETURNED VALUES ARE ALWAYS STABLE??
}
The present question is similar to this other one about XSLTProcessor reuse.
In both questions the problem can be generalized to any language or framework that uses LibXML2 as its DomDocument implementation.
There is another related question: How to "refresh" DOMDocument instances of LibXML2?
Illustrating
Reuse is very common (examples):
$f = "my_XML_file.xml";
$elements = DOMXpath_reuser($f)->query("//*[@id]");
// use elements to get information
$elements = DOMXpath_reuser($f)->query("/html/body/div[1]");
// use elements to get information
But if you do something like removeChild, replaceChild, etc. (example),
$div = DOMXpath_reuser($f)->query("/html/body/div[1]")->item(0); // STABLE
$div->parentNode->removeChild($div); // CHANGES DOM
$elements = DOMXpath_reuser($f)->query("//div[@id]"); // UNSTABLE!
strange things can occur, and the queries do not work as expected!
When? (Which DOMDocument methods affect XPath?)
Why can't we use something like normalizeDocument to "refresh" the DOM (does such a thing exist)?
Is only a "new DOMXpath($doc);" always safe? Do we need to reload $doc as well?

DOMXpath is affected by the load*() methods on DOMDocument. After loading a new xml or html, you need to recreate the DOMXpath instance:
$xml = '<xml/>';
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
var_dump($xpath->document === $dom); // bool(true)
$dom->loadXml($xml);
var_dump($xpath->document === $dom); // bool(false)
In DOMXpath_reuser() you store a static variable and recreate the xpath depending on the file name. If you want to reuse an Xpath object, I suggest extending DOMDocument. This way you only need to pass the $dom variable around. It would work with a stored xml file as well as with an xml string or a document you are creating.
The following class extends DOMDocument with a method xpath() that always returns a valid DOMXpath instance for it. It stores and registers the namespaces, too:
class MyDOMDocument
extends DOMDocument {
private $_xpath = NULL;
private $_namespaces = array();
public function xpath() {
// if the xpath instance is missing or not attached to the document
if (is_null($this->_xpath) || $this->_xpath->document != $this) {
// create a new one
$this->_xpath = new DOMXpath($this);
// and register the namespaces for it
foreach ($this->_namespaces as $prefix => $namespace) {
$this->_xpath->registerNamespace($prefix, $namespace);
}
}
return $this->_xpath;
}
public function registerNamespaces(array $namespaces) {
$this->_namespaces = array_merge($this->_namespaces, $namespaces);
if (isset($this->_xpath)) {
foreach ($namespaces as $prefix => $namespace) {
$this->_xpath->registerNamespace($prefix, $namespace);
}
}
}
}
$xml = <<<'ATOM'
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Test</title>
</feed>
ATOM;
$dom = new MyDOMDocument();
$dom->registerNamespaces(
array(
'atom' => 'http://www.w3.org/2005/Atom'
)
);
$dom->loadXml($xml);
// created, first access
var_dump($dom->xpath()->evaluate('string(/atom:feed/atom:title)', NULL, FALSE));
$dom->loadXml($xml);
// recreated, connection was lost
var_dump($dom->xpath()->evaluate('string(/atom:feed/atom:title)', NULL, FALSE));

The DOMXpath class (unlike XSLTProcessor in your other question) uses a reference to the DOMDocument object given in the constructor. DOMXpath creates a libxml context object based on the given DOMDocument and saves it in its internal class data. Besides the libxml context, it saves a reference to the original DOMDocument given in the constructor arguments.
What that means:
Part of the sample from ThomasWeinert's answer:
var_dump($xpath->document === $dom); // bool(true)
$dom->loadXml($xml);
var_dump($xpath->document === $dom); // bool(false)
gives false after the load because $dom already holds a pointer to the new libxml data, while DOMXpath holds the libxml context for $dom from before the load and a pointer to the real document after the load.
Now about how query works
If it should return XPATH_NODESET (as in your case), it makes a node copy, iterating node by node through the detected node set (\ext\dom\xpath.c from line 468). It is a copy, but with the original document node as parent. This means that you can modify the result, but doing so breaks the connection between your XPath and the DOMDocument.
XPath results provide a parentNode member that knows their origin:
for attribute values, parentNode returns the element that carries them. An example is //foo/@attribute, where the parent would be a foo Element.
for the text() function (as in //text()), it returns the element that contains the text or tail that was returned.
note that parentNode may not always return an element. For example, the XPath functions string() and concat() will construct strings that do not have an origin. For them, parentNode will return None.
So,
There is no reason to cache XPath. It does nothing besides calling xmlXPathNewContext (which just allocates a lightweight internal struct).
Each time you modify your DOMDocument (removeChild, replaceChild, etc.) you should recreate the XPath.
We cannot use something like normalizeDocument to "refresh" the DOM, because it changes the internal document structure and invalidates the xmlXPathNewContext created in the Xpath constructor.
Is only "new DOMXpath($doc);" always safe? Yes, if you do not change $doc between Xpath usages. Do you need to reload $doc as well? No, because reloading invalidates the previously created xmlXPathNewContext.
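The advice to recreate the XPath object after a modification can be sketched as below. This only illustrates the conservative pattern recommended in this answer, on a made-up document; it is not a demonstration that reusing the old DOMXpath instance would fail.

```php
<?php
// Hypothetical document with three div elements, two of them carrying an id.
$doc = new DOMDocument();
$doc->loadXML('<body><div id="a"/><div id="b"/><div/></body>');

$xp = new DOMXPath($doc);
$first = $xp->query('/body/div[1]')->item(0);
$first->parentNode->removeChild($first); // modifies the DOM

// Following the advice above: build a fresh DOMXPath after the DOM changed.
// Creating it is cheap, so there is little to gain from caching the old one.
$xp = new DOMXPath($doc);

$ids = [];
foreach ($xp->query('//div[@id]') as $div) {
    $ids[] = $div->getAttribute('id');
}

var_dump($ids);
```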

(This is not a real answer, but a consolidation of comments and answers posted here and in related questions.)
This new version of the question's DOMXpath_reuser function includes @ThomasWeinert's suggestion (to guard against DOM changes caused by an external reload) and an option $enforceRefresh to work around the instability problem (as the related question shows, the programmer must detect when).
function DOMXpath_reuser_v2($file, $enforceRefresh=0) { // changed here
    static $doc = NULL;
    static $docName = '';
    static $xp = NULL;
    if (!$doc)
        $doc = new DOMDocument();
    if ( $file != $docName || ($xp && $doc !== $xp->document) ) { // changed here
        $doc->load($file);
        $docName = $file; // remember the loaded file so the next call can reuse it
        $xp = NULL;
    } elseif ($enforceRefresh == 2) { // add this new refresh mode
        $doc->loadXML($doc->saveXML());
        $xp = NULL;
    }
    if (!$xp || $enforceRefresh == 1) // changed here
        $xp = new DOMXpath($doc);
    return $xp;
}
When must $enforceRefresh=1 be used?
... perhaps an open problem; only a few tips and clues ...
when the DOM was subjected to setAttribute, removeChild, replaceChild, etc.
...? more cases?
When must $enforceRefresh=2 be used?
... perhaps an open problem; only a few tips and clues ...
when the DOM was subject to index inconsistencies, etc. See this question/solution.
...? more cases?
