Getting the value of XML field matching a property via XPATH - php

I've got a really odd XML schema that's causing me unnecessary grief and woe.
I need to get the value of the IMAGEFILENAME node whose pictype attribute is "hide".
The XML schema looks something like:
<PHOTOS>
<IMAGETHUMBFILENAME/>
<IMAGECAPTION>
This is a caption
</IMAGECAPTION>
<PRINTQUALITYIMAGE>
/mylocation/filename1.jpg
</PRINTQUALITYIMAGE>
<IMAGEFILENAME pictype="show">
/mylocation/filename2.jpg
</IMAGEFILENAME>
<IMAGETHUMBFILENAME/>
<IMAGECAPTION>This is another caption</IMAGECAPTION>
<PRINTQUALITYIMAGE>
/mylocation/filename3.jpg
</PRINTQUALITYIMAGE>
<IMAGEFILENAME pictype="hide">
/mylocation/filename4.jpg
</IMAGEFILENAME>
<IMAGETHUMBFILENAME/>
</PHOTOS>
And I've managed to come up with the following XPATH using PHP:
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');
var_dump($nodes);
When I do a dump of the $nodes var what I'd hope to see (and what I want) is to get the value /mylocation/filename4.jpg. Instead what I'm getting is:
array(1) {
[0]=>
object(SimpleXMLElement)#333 (1) {
["@attributes"]=>
array(1) {
["pictype"]=>
string(4) "hide"
}
}
}
I've tried various combinations of /parent, /text() and /node() but with no joy at all.
Please somebody tell me what a muppet I'm being and put me out of my misery. Or is the schema itself the problem?

So, you have an array of SimpleXMLElement objects. The dump shows only the attributes, but the text content is still there.
To get the string representation of a SimpleXMLElement, you can just echo it:
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');
echo $nodes[0]; // `[0]` takes the first element of the array
To use the string later in your code, convert it explicitly:
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');
$node_str = strval($nodes[0]); // again `[0]` for the first element
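To try this without the full feed, here is a minimal sketch using a trimmed-down copy of the XML from the question. The `trim()` call and the empty-result check are my additions, not part of the answer above: the file name in this document is wrapped in newlines, and `xpath()` gives an empty array when nothing matches.

```php
<?php
// Trimmed-down copy of the question's document.
$source = <<<XML
<PHOTOS>
<IMAGEFILENAME pictype="show">
/mylocation/filename2.jpg
</IMAGEFILENAME>
<IMAGEFILENAME pictype="hide">
/mylocation/filename4.jpg
</IMAGEFILENAME>
</PHOTOS>
XML;

$xml = simplexml_load_string($source);
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');

// An empty result is falsy, so check before indexing.
// trim() removes the newlines that surround the text in this document.
$filename = $nodes ? trim((string) $nodes[0]) : null;

var_dump($filename);
```

If the surrounding whitespace matters to you, drop the `trim()` and keep only the string cast.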


Importing external XML file with PHP, summing values

Like the title says, I'm importing an external XML file into a site. It's actually weather data from observation sites around the world. I can parse and display the data no problem. My problem is I'm trying to sum up a specific set of data.
Here's the basic code:
<?php
$xml = simplexml_load_file('https://aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&stationString=kbwi&hoursBeforeNow=65');
for ($i = 0; $i <= 60; $i++)
{
    $precip[$i] = $xml->data->METAR[$i]->precip_in;
    echo $precip[$i];
}
?>
This will echo all the values from 'precip_in' from the XML file, and it does work. But if I try to sum up the data in $precip or in 'precip_in' by using array_sum, I get a blank page. Using var_dump returns "NULL" a bunch of times.
Now, I could manually sum the values by doing something like:
$rainTotal = $precip[0]+$precip[1]+$precip[2];
But one thing I want to do with this is a 24 hour rainfall total. The observations aren't always updated at regular or hourly intervals; meaning that if I were to do something like this:
$rainTotal = $precip[0]+$precip[1]+$precip[2]...+$precip[23];
It would not necessarily give me the 24 hour rain total. So, I need a way to sum all the rainfall values contained within 'precip_in' or $precip.
Any ideas on how I should proceed?
EDIT: Some clarifications based on the comments below:
Echoing and var_dump-ing $precip[$i] work fine. But if I try the following:
<?php
$xml = simplexml_load_file('https://aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&stationString=kbwi&hoursBeforeNow=65');
for ($i = 0; $i <= 60; $i++)
{
    $precip[$i] = $xml->data->METAR[$i]->precip_in;
    echo array_sum($precip[$i]);
}
?>
I get a blank page. Doing a var_dump of array_sum($precip[$i]) results in "NULL" a bunch of times in a row.
I have tried casting the XML string as either a float or a string, but I get the same results.
The value of $xml->data->METAR[$i]->precip_in is an object ((SimpleXMLElement)#12 (1) { [0]=> string(5) "0.005" }). It is not a number, so it cannot be summed directly. You can, however, cast it to a float and get the numeric value you were expecting.
for ($i = 0; $i <= 60; $i++)
{
    $precip[$i] = (float)$xml->data->METAR[$i]->precip_in;
    echo $precip[$i];
}
These float values can be summed.
Alternatively, consider running array_sum over the array returned by an XPath query:
$xml = simplexml_load_file('https://aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&stationString=kbwi&hoursBeforeNow=65');
$result = array_sum(array_map("floatval", $xml->xpath('//METAR/precip_in')));
echo $result;
// 10.065
Alternatively you can use XPath's sum() which requires using DOMDocument and not SimpleXML, specifically DOMXPath::evaluate:
$dom = new DOMDocument;
$dom->load('https://aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&stationString=kbwi&hoursBeforeNow=65');
$xpath = new DOMXPath($dom);
$sum = (float)$xpath->evaluate('sum(//METAR/precip_in)');
echo $sum;
// 10.065
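Neither answer covers the 24-hour total from the question, so here is a rough sketch of one way to do it: keep a running total, but only for reports whose timestamp falls inside the window. The inline sample and the `observation_time` element name are assumptions about the shape of the feed, and the reference time is fixed so the result is repeatable.

```php
<?php
// Hypothetical sample shaped like the feed. Only METAR and precip_in come
// from the question; observation_time is an assumed element name.
$sample = <<<XML
<response>
  <data>
    <METAR><observation_time>2020-01-02T12:00:00Z</observation_time><precip_in>0.005</precip_in></METAR>
    <METAR><observation_time>2020-01-02T06:00:00Z</observation_time></METAR>
    <METAR><observation_time>2020-01-01T18:00:00Z</observation_time><precip_in>0.25</precip_in></METAR>
    <METAR><observation_time>2019-12-31T18:00:00Z</observation_time><precip_in>1.5</precip_in></METAR>
  </data>
</response>
XML;

$xml = simplexml_load_string($sample);

// Fixed reference time for the example; use time() with live data.
$now = strtotime('2020-01-02T12:30:00Z');
$cutoff = $now - 24 * 3600;

$rainTotal = 0.0;
foreach ($xml->data->METAR as $metar) {
    $observed = strtotime((string) $metar->observation_time);
    if ($observed !== false && $observed >= $cutoff) {
        // A missing precip_in casts to 0.0, so dry reports add nothing.
        $rainTotal += (float) $metar->precip_in;
    }
}

echo $rainTotal, "\n";
```

Iterating with foreach also avoids the hard-coded 60 in the original loop, so it does not matter how many reports the feed happens to contain.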

XML Node accessing attribute with namespace

This is my xml data.
<?xml version="1.0" encoding="UTF-8"?>
<ns1:catalog
xmlns:ns1="http://www.omnichannelintegrationlayer.com/xml/catalog/2016-01-01" catalog-id="at-master-catalog">
<ns1:product product-id="4132002004">
<ns1:min-order-quantity>1</ns1:min-order-quantity>
<ns1:step-quantity>1</ns1:step-quantity>
<ns1:short-description
xmlns:ns2="xml" ns2:lang="de-AT">Jogginghose Cacy jr
</ns1:short-description>
<ns1:short-description
xmlns:ns2="xml" ns2:lang="de-CH">Jogginghose Cacy jr
</ns1:short-description>
</ns1:product>
</ns1:catalog>
I'm trying to filter the short-description elements based on the ns2:lang attribute.
This is what I've done so far:
foreach ($xml->xpath("//ns1:product[@product-id='".$productid."']/ns1:short-description") as $short_description) {
    $namespaces = $short_description->getNameSpaces(true);
    $ns1 = $short_description->children($namespaces['ns1']);
    $ns2 = $short_description->children($namespaces['ns2']);
    var_dump($ns2);
    echo $ns2["lang"];
}
The output of var_dump looks okay:
object(SimpleXMLElement)#27 (1) { ["@attributes"]=> array(1) { ["lang"]=> string(5) "de-AT" } }
But I can't access the value: when I echo $ns2["lang"], I get NULL.
I've already tried different solutions, like declaring the namespace first, but no luck.
Thanks in advance.
The values you are looking for are in attributes, and those attributes use a namespace, which you can pass as a parameter to the attributes() method.
Each attribute is itself a SimpleXMLElement and has a __toString method that returns its text content.
You could, for example, use echo $short_description->attributes($namespaces['ns2'])->lang; or cast it to a string with (string).
You might update your code to:
$namespaces = $xml->getNamespaces(true);
foreach ($xml->xpath("//ns1:product[@product-id='".$productid."']/ns1:short-description") as $short_description) {
    $langAsString = (string)$short_description->attributes($namespaces['ns2'])->lang;
    echo $langAsString . "<br>";
}
That would give you:
de-AT
de-CH
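As a self-contained sketch of the same idea, the snippet below reads the namespaced attribute with attributes() and also shows filtering directly in the XPath predicate. The sample document, the prefixes `c` and `l`, and both example.com namespace URIs are hypothetical stand-ins, chosen so the snippet runs on its own.

```php
<?php
// Hypothetical sample; the prefixes and namespace URIs are placeholders.
$sample = <<<XML
<c:catalog xmlns:c="http://example.com/catalog" xmlns:l="http://example.com/lang">
  <c:product product-id="4132002004">
    <c:short-description l:lang="de-AT">Jogginghose AT</c:short-description>
    <c:short-description l:lang="de-CH">Jogginghose CH</c:short-description>
  </c:product>
</c:catalog>
XML;

$xml = simplexml_load_string($sample);
// Register both prefixes explicitly so the XPath expressions can use them.
$xml->registerXPathNamespace('c', 'http://example.com/catalog');
$xml->registerXPathNamespace('l', 'http://example.com/lang');

$productid = '4132002004';

// Variant 1: read the namespaced attribute from each node via attributes().
$byLang = [];
foreach ($xml->xpath("//c:product[@product-id='$productid']/c:short-description") as $node) {
    $lang = (string) $node->attributes('http://example.com/lang')->lang;
    $byLang[$lang] = trim((string) $node);
}

// Variant 2: let the XPath predicate do the filtering.
$matches = $xml->xpath("//c:short-description[@l:lang='de-CH']");
$swiss = $matches ? trim((string) $matches[0]) : null;

print_r($byLang);
echo $swiss, "\n";
```

The second variant is usually shorter when only one language is needed; the first is handy when you want all of them keyed by language.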

Xpath regex functionality [duplicate]

I am trying to filter HTML tables with a regex matching their id attribute. What am I doing wrong? The code I am trying to implement:
$this->xpath = new DOMXPath($this->dom);
$this->xpath->registerNamespace("php", "http://php.net/xpath");
$this->xpath->registerPHPFunctions();
foreach ($xpath->query("//table[php:function('preg_match', '/post\d+/', @id)]") as $key => $row)
{
}
The error I get: preg_match expects second param to be a string, array given.
An attribute is still a complex element according to DOM (has a namespace etc.). Use:
//table[php:function('preg_match', '/post\d+/', string(@id))]
Now, we need a boolean return, so:
function booleanPregMatch($match, $string) {
    return preg_match($match, $string) > 0;
}
$xpath->registerPHPFunctions();
foreach ($xpath->query("//table[@id and php:function('booleanPregMatch', '/post\d+/', string(@id))]") as $key => $row) {
    echo $row->ownerDocument->saveXML($row);
}
BTW: for more complex issues, you can of course sneakily check what's happening with this:
//table[php:function('var_dump',@id)]
It's a shame we don't have XPATH 2.0 functions available, but if you can handle this requirement with a more unreliable starts-with, I'd always prefer that over importing PHP functions.
What am I doing wrong?
The XPath expression @id (passed as the second parameter of preg_match) returns an array, but preg_match expects a string.
Convert it to a string first: string(@id).
In addition, you need to compare the result to 1, because preg_match returns 1 when a match is found:
foreach ($xpath->query("//table[@id and 1 = php:function('preg_match', '/post\d+/', string(@id))]") as $key => $row)
{
    var_dump($key, $row, $row->ownerDocument->saveXml($row));
}
Explanation/What happens here?:
An XPath expression returns a node-list by default (more precisely, a node-set). If you map a PHP function onto such an expression, the set is represented as an array. You can easily test that by using var_dump:
$xpath->query("php:function('var_dump', //table)");
array(1) {
[0]=>
object(DOMElement)#3 (0) {
}
}
The same goes for the XPath expression @id in the context of each table element:
$xpath->query("//table[php:function('var_dump', @id)]");
array(1) {
[0]=>
object(DOMAttr)#3 (0) {
}
}
You can change that into a string typed result by making use of the xpath string function:
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
$xpath->query("//table[php:function('var_dump', string(@id))]");
string(4) "test"
(the table has id="test")
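Putting the pieces together, here is a runnable sketch that compares the regex filter with the pure XPath 1.0 `starts-with` alternative mentioned above. The sample markup and the `matchedIds` helper are hypothetical, and the `^`/`$` anchors in the pattern are my addition so that only ids of the form post followed by digits match.

```php
<?php
// Hypothetical markup standing in for the page being filtered.
$dom = new DOMDocument;
$dom->loadXML(
    '<body><table id="post12"/><table id="poster"/><table id="other"/><table/></body>'
);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPHPFunctions('preg_match'); // expose only the function we need

// Hypothetical helper: collect the id of every node matched by an expression.
function matchedIds(DOMXPath $xpath, string $expression): array
{
    $ids = [];
    foreach ($xpath->query($expression) as $table) {
        $ids[] = $table->getAttribute('id');
    }
    return $ids;
}

// Single-quoted PHP string, so \d and $ reach preg_match unchanged.
$viaRegex = matchedIds(
    $xpath,
    '//table[@id and 1 = php:function("preg_match", "/^post\d+$/", string(@id))]'
);

// Pure XPath 1.0 alternative: looser, because "poster" also starts with "post".
$viaPrefix = matchedIds($xpath, '//table[starts-with(@id, "post")]');

print_r($viaRegex);
print_r($viaPrefix);
```

With this sample, the regex version keeps only `post12`, while `starts-with` also lets `poster` through, which is the unreliability the first answer alludes to.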

XPATH - get single value returned instead of array php

I am using Xpath in PHP - I know that my query will return either 0 or 1 results.
If 1 result is returned I do not want it as an array - which is what is returned right now. I simply want the value without having to access the [0] element of the result and cast to a string.
Is this possible?
If 1 result is returned I dont want it as an array - which is what is returned. I simply want the value without having to access the [0] element of the result and cast to a string.
That is possible with XPath's string function
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
and DOMXPath's evaluate method:
Returns a typed result if possible or a DOMNodeList containing all nodes matching the given XPath expression.
Example:
$dom = new DOMDocument;
$dom->loadXML('<root foo="bar"/>');
$xp = new DOMXPath($dom);
var_dump($xp->evaluate('string(/root/@foo)')); // string(3) "bar"
If there was a built in xpath way of grabbing the first and only the first node value then that would be much more preferable over writing a function to do it
You can use the position function:
The position function returns a number equal to the context position from the expression evaluation context.
Example:
$dom = new DOMDocument;
$dom->loadXML('<root><foo xml:id="f1"/><foo xml:id="f2"/></root>');
$xp = new DOMXPath($dom);
var_dump($xp->evaluate('string(/root/foo[position() = 1]/@xml:id)')); // string(2) "f1"
or the abbreviated syntax
$dom = new DOMDocument;
$dom->loadXML('<root><foo xml:id="f1"/><foo xml:id="f2"/></root>');
$xp = new DOMXPath($dom);
var_dump($xp->evaluate('string(/root/foo[1]/@xml:id)')); // string(2) "f1"
Note that when querying for descendants with //, using the position function might yield multiple results due to the way the expression is evaluated.
Using 'evaluate' instead of 'query', you can do things like casting.
DOMXPath::evaluate()
Also, if you're just annoyed with doing stuff a lot of times, just write a function that does it ... that is the whole idea behind functions, right?
Perhaps something like this?
if ($array[0]) {
    $string = $array[0];
}
If $array[0] is itself an array, you can rename $string to $new_array:
if ($array[0]) {
    $new_array = $array[0];
}
Your question suggests that you are using SimpleXML, because you talk about an array. However, a long time ago you accepted an answer that uses DOMDocument. In case other users come here looking for a SimpleXML solution, it works a little differently:
list($first) = $xml->xpath('//element') + array(NULL);
If $first is not NULL (NULL means no elements matched), it will still be of type SimpleXMLElement (either an element node or an attribute node, depending on the XPath query). However, you can simply cast it to a string, or use it in a string context, as with echo:
echo $first;
You can write it most simply like this:
$string = @$array[0];
The @ operator will suppress errors, making $string null if $array is empty.
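For SimpleXML users who would rather not suppress errors, a small helper built on the null coalescing operator (PHP 7.1 or later for the nullable return type) is one option. This is only a sketch; `firstXpathValue` is a hypothetical name, not part of any library.

```php
<?php
// Hypothetical helper: string value of the first match, or null if none.
function firstXpathValue(SimpleXMLElement $xml, string $expr): ?string
{
    $nodes = $xml->xpath($expr);
    // Guard against a non-array result before indexing; ?? covers the empty case.
    $first = is_array($nodes) ? ($nodes[0] ?? null) : null;
    return $first === null ? null : (string) $first;
}

$xml = simplexml_load_string('<root><foo>first</foo><foo>second</foo></root>');

var_dump(firstXpathValue($xml, '/root/foo'));     // string(5) "first"
var_dump(firstXpathValue($xml, '/root/missing')); // NULL
```

Unlike the `@` approach, this keeps unrelated warnings visible while still collapsing "no match" to null.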
