Trying to decide which is more appropriate for my use case...
After comparing the documentation for these methods, my vague understanding is evaluate returns a typed result but query doesn't. Furthermore, the query example includes looping through many results but the evaluate example assumes a single typed result.
Still not much the wiser! Could anyone explain (in as close as possible to layman's terms) when you would use one or the other - e.g. will the multiple/single results mentioned above always be the case?
DOMXPath::query() supports only expressions that return a node list. DOMXPath::evaluate() supports all valid expressions. The official method is named evaluate(), too: http://www.w3.org/TR/DOM-Level-3-XPath/xpath.html#XPathEvaluator
Select all p elements inside a div: //div//p
Select all href attributes of a elements in the current document: //a/@href
You can use the string() function to cast the first node of a node list to a string. This will not work with DOMXPath::query().
Select the title text of a document: string(/html/head/title)
There are other functions and operators that change the result type of an expression, but the type is always unambiguous: you will always know what type the result is.
query will return a DOMNodeList regardless of your actual XPath expression. This suggests that you don't know what the result may be. So you can iterate over the list and check the node type of the nodes and do something based on the type.
But query is not limited to this use case. You can still use this when you know what type you will get. It may be more readable in the future what you wanted to achieve and therefore easier to maintain.
evaluate on the other hand gives you exactly the type that you select. As the examples point out:
$xpath->evaluate("1 = 0"); // FALSE
$xpath->evaluate("string(1 = 0)"); // "false"
As it turns out, selecting attributes (//div/@id) or text nodes (//div/text()) still yields a DOMNodeList instead of strings, so the potential use cases are limited. To get a string you have to wrap the expression in string(): string(//div/@id) or string(//div/text()).
The main advantage of evaluate is that you can get strings out of your DOMDocument with fewer lines of code. Otherwise it will produce the same output as query.
ThW's answer is right that some expressions will not work with query:
$xpath->query("string(//div/@id)"); // DOMNodeList of length 0
$xpath->evaluate("string(//div/@id)"); // string with the found id
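To make the difference concrete, here is a minimal sketch that runs the same kinds of expressions through both methods. The sample markup and variable names are made up for illustration and are not from the answers above.

```php
<?php
$doc = new DOMDocument();
// Hypothetical sample markup, only for this demonstration.
$doc->loadXML(
    '<html><head><title>Demo</title></head>'
    . '<body><div id="main"><p>Hi</p></div></body></html>'
);
$xpath = new DOMXPath($doc);

// Node-set expression: both methods return a DOMNodeList.
$viaQuery    = $xpath->query('//div/@id');
$viaEvaluate = $xpath->evaluate('//div/@id');

// Scalar expressions: evaluate() returns the matching PHP type.
$id    = $xpath->evaluate('string(//div/@id)'); // string
$count = $xpath->evaluate('count(//p)');        // float (XPath numbers are doubles)
$has   = $xpath->evaluate('boolean(//title)');  // bool

// query() cannot represent a scalar result, so the list comes back empty.
$empty = $xpath->query('string(//div/@id)');

var_dump($id, $count, $has, $empty->length);
```

So a reasonable rule of thumb seems to be: use query() when you want to loop over nodes, and evaluate() when the expression ends in something like string(), count() or boolean().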
I have an application where the user writes XPath queries to use as source data from a given document. Sometimes they need just the contents of an element, sometimes they need the whole element itself. To my understanding they should be able to specify either text() or node() at the end of their query to choose which behavior.
But it seems like the way I get a string out of the SimpleXMLElement determines the behavior, regardless of the query.
When I cast the query to (string), it ALWAYS only returns inner XML.
(string) $xml->xpath('//document/head/Keywords')[0] ===
(string) $xml->xpath('//document/head/Keywords/node()')[0] ===
(string) $xml->xpath('//document/head/Keywords/text()')[0] ===
'17';
If I use ->saveXML(), it ALWAYS returns the entire tag.
$xml->xpath('//document/head/Keywords')[0]->asXML() ===
$xml->xpath('//document/head/Keywords/node()')[0]->asXML() ===
$xml->xpath('//document/head/Keywords/text()')[0]->asXML() ===
'<Keywords topic="611x27keqj">17</Keywords>';
Is there a single way that I can get a string, which allows my users to specify inner vs outer XML as a part of their XPath query?
The SimpleXML xpath() method always returns SimpleXMLElement objects representing either an element or an attribute, never text. The methods you show in the question are the correct way to use that object to get text content or full XML.
If you want richer (but less simple) XPath functionality, you will have to use the DOM, and specifically the DOMXPath class. Note that you can freely mix SimpleXML and DOM using simplexml_import_dom and dom_import_simplexml; the internal representation is the same, so you can switch between the two "wrappers" with minimal cost.
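As a rough sketch of that approach: the snippet below switches from SimpleXML to DOM and lets the XPath expression itself decide whether a string comes back. The wrapping document structure is assumed from the paths in the question; only the Keywords element is taken from it.

```php
<?php
// Sample document; the <document>/<head> wrappers are assumed from the question's paths.
$xml = simplexml_load_string(
    '<document><head><Keywords topic="611x27keqj">17</Keywords></head></document>'
);

// dom_import_simplexml() returns the DOMElement behind the SimpleXML node.
$root  = dom_import_simplexml($xml);
$xpath = new DOMXPath($root->ownerDocument);

// evaluate() with string() yields the text content as a plain PHP string.
$inner = $xpath->evaluate('string(//document/head/Keywords)');

// query() yields nodes; saveXML($node) serialises the whole element.
$node  = $xpath->query('//document/head/Keywords')->item(0);
$outer = $root->ownerDocument->saveXML($node);

echo $inner, "\n";
echo $outer, "\n";
```

With this, a user expression wrapped in string() gives the inner text, while a plain node expression can be serialised to get the outer XML.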
I have some objects which come from an XML file. I am trying to convert the XML file to SQL data structure. So far, I managed to retrieve the tables and columns and now I need to find out data types for each column.
gettype() didn't help since it always returns object.
Casting is not efficient, I tried to cast to every data type to see if it suits one but, for example, if I cast 5hi to integer, the result would be 5.
Here is part of the XML file:
<device>
    <manufacturer>SIEMENS</manufacturer>
    <model>SOMATOM Definition</model>
    <serial>60301</serial>
    <version>syngo CT 2010A</version>
</device>
So, as an example, manufacturer is string and serial is integer.
How can I cast an object to appropriate datatype based on its value?
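is_numeric() lets you check whether a value is numeric (integers, decimals and exponent notation all count).
If you need to make sure a value is an integer, use ctype_digit(). It takes the whole string and returns true only if every character is a decimal digit, so there is no need to loop over the characters yourself. Note that it rejects a leading sign.
You may also wish to use regular expressions to recognise more complex structures.
Example:
if (is_numeric($value)) {
    $value = doubleval($value); // cast to float (doubleval() is an alias of floatval())
}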
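Putting those checks together, a minimal sketch of a type guesser might look like this. The function name inferType() and the three type labels are hypothetical, chosen only for illustration; the XML is the fragment from the question.

```php
<?php
// Hypothetical helper: guesses a coarse column type from a text value.
function inferType(string $value): string
{
    $value = trim($value);
    // Optional sign followed by digits only: treat as integer.
    if (preg_match('/^[+-]?\d+$/', $value) === 1) {
        return 'integer';
    }
    // is_numeric() also accepts decimals and exponent notation.
    if (is_numeric($value)) {
        return 'float';
    }
    return 'string';
}

$device = simplexml_load_string(
    '<device><manufacturer>SIEMENS</manufacturer><model>SOMATOM Definition</model>'
    . '<serial>60301</serial><version>syngo CT 2010A</version></device>'
);

$types = [];
foreach ($device->children() as $name => $node) {
    // Casting a SimpleXMLElement to string gives its text content.
    $types[$name] = inferType((string) $node);
}
print_r($types);
```

Unlike a plain (int) cast, a value such as 5hi falls through to 'string' here because neither check accepts trailing letters.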
I'm using jQuery to post ajax requests, and PHP to construct XML responses. Everything works fine, but I wonder about the method I've used for data typing, and whether there's a more standard way, or a more correct way. My XML generally looks like this, with some attributes representing text and other attributes representing numeric data:
<UnitConversions>
    <UnitConversion basicUnitName="mile" conversionFactor="5280" conversionUnit="foot"/>
    <UnitConversion basicUnitName="mile" conversionFactor="1760" conversionUnit="yard"/>
</UnitConversions>
I have a lot of different objects, not just this one type, so in my constructors, rather than initializing every property explicitly, I just copy the attributes over from the XML node:
var UnitConverter = function(inUnitConversionNode) {
    var that = this;
    $.each(inUnitConversionNode.attributes, function(i, attribute) {
        that[attribute.name] = attribute.value;
    });
};
I had trouble early on when I checked for numeric values, as in if(someValueFromTheXML === 1) -- this would always evaluate to false because the value from the XML was a string, "1". So I added nodes in key places in the XML to tell my client-side code what to interpret as numeric and what to leave as text:
<UnitConversions>
    <NumericFields>
        <NumericField fieldName="conversionFactor"/>
    </NumericFields>
    <UnitConversion basicUnitName="mile" conversionFactor="5280" conversionUnit="foot"/>
    <UnitConversion basicUnitName="mile" conversionFactor="1760" conversionUnit="yard"/>
</UnitConversions>
So now I pass the NumericFields node into the constructor so it will know which fields to store as actual numbers.
This all works great, but it seems like a bit of a naive solution, maybe even a hack. Seems like there would be something more sophisticated out there. It seems like this issue is related to XML schemas, but my googling seems to suggest that schemas are more about validation, rather than typing, and they seem to be geared toward server-side processing anyway.
What's the standard/correct way for js to know which fields in the XML are numeric?
You can use isNaN() to detect whether the string is a number. For example, isNaN("5043") returns false, indicating that "5043" can be parsed as a number. Then use parseInt() to convert the value before comparing it. For example:
if (parseInt(someValueFromTheXML, 10) === 1) {
    ...
}
Another way is to use loose comparison with the == operator so that "1" == 1 evaluates to true. However, it would be better practice to use the first suggestion instead. There is really no other way to go about this since XML/HTML attributes are always strings.
I'm trying to select all DOM elements that have id="mydiv" but exclude the ones that also have class="exclass". Right now I'm doing the first part with //*[@id="mydiv"]. How do I add the class exclusion part?
P.S. In case you're wondering why I need to select multiple elements that have the same id, I'm just working on an existing DOM that I can't control.
You can use negation:
//*[@id="mydiv" and @class!="exclass"]
If the class attribute may not exist on all nodes, you need this:
//*[@id="mydiv" and (not(@class) or @class!="exclass")]
The last (somewhat) odd logic can be turned into what Michael proposed:
//*[@id="mydiv" and not(@class="exclass")]
Though, personally, the fact that XPath cannot make comparisons if the attribute is missing feels a bit like a shortcoming.
The answer to the question as written is
//*[@id="mydiv" and not(@class="exclass")]
The first half of the condition is true if there is an @id attribute and its value is mydiv. The second half is true if there is no @class attribute with the value exclass: that is, if there is no class attribute, or if there is a class attribute and its value is something other than "exclass".
Avoid using != in this situation: 90% of the time, if you think of writing A!=B, you probably wanted not(A=B). The meaning is different in the case where either A or B is not a singleton, that is, where it is either empty, or can contain multiple values.
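The difference is easy to see with a small test document. This is only a sketch using PHP's DOMXPath; the markup and the texts() helper are made up for the demonstration.

```php
<?php
$doc = new DOMDocument();
// Hypothetical markup: three elements share an id, as in the question.
$doc->loadXML(
    '<root>'
    . '<div id="mydiv" class="exclass">a</div>'
    . '<div id="mydiv" class="other">b</div>'
    . '<div id="mydiv">c</div>'
    . '</root>'
);
$xpath = new DOMXPath($doc);

// Hypothetical helper: collects the text of every node an expression matches.
function texts(DOMXPath $xpath, string $expr): array
{
    $out = [];
    foreach ($xpath->query($expr) as $node) {
        $out[] = $node->textContent;
    }
    return $out;
}

// != needs a class attribute to compare against, so "c" is skipped.
$withNotEquals = texts($xpath, '//*[@id="mydiv" and @class!="exclass"]');

// not(=) is also true when the attribute is missing, so "c" is kept.
$withNot = texts($xpath, '//*[@id="mydiv" and not(@class="exclass")]');

print_r($withNotEquals); // only "b"
print_r($withNot);       // "b" and "c"
```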
I would like to know if there is some improvement in MongoCollection::findOne or if it is just an "alias" or "shortcut" for MongoCollection::find with a limit of 1, for example.
Thank you
findOne() is an alias of find() with a limit(-1)
You can see this in the source code here. It does the equivalent to
find(...).limit(-1).getNext().
The -1 is actually relevant. Here's a snippet from the wire protocol docs:
If the number is negative, then the database will return that number
and close the cursor.
If you go to the shell and type > db.collection.findOne (no parens), you can see that the function is also just a helper in the shell.
So, "yes findOne() is just a helper".
From the mongo tutorials...
To show that the document we inserted in the previous step is there,
we can do a simple findOne() operation to get the first document in
the collection. This method returns a single document (rather than the
DBCursor that the find() operation returns), and it's useful for
things where there only is one document, or you are only interested in
the first. You don't have to deal with the cursor.
The MongoCollection::findOne method directly returns the result array, while MongoCollection::find returns a MongoCursor instance even when there is only a single result.
mongodb.org has a performance test report where they compared findOne and find. Based on the results it would seem that findOne is 35-45% faster.
A few data points from the report:
find_one (small, no index): 989 Ops/s
find (small, no index): 554 Ops/s
It is almost like an alias, but instead of returning a list, it returns a single object.
It depends on your search query. E.g., if you search by ID, since the ID is unique there is no need to limit the results because only one result would be found. If more than one record is found, the result is limited to 1. Another difference is that findOne returns an array, while find returns a MongoCursor.