Xpath regex functionality [duplicate] - php

I am trying to filter HTML tables with a regex matching their id attribute. What am I doing wrong? The code I am trying to implement:
$this->xpath = new DOMXPath($this->dom);
$this->xpath->registerNamespace("php", "http://php.net/xpath");
$this->xpath->registerPHPFunctions();
foreach($this->xpath->query("//table[php:function('preg_match', '/post\d+/', @id)]") as $key => $row)
{
}
The error I get: preg_match expects second param to be a string, array given.

An attribute is still a node as far as the DOM is concerned (it has a namespace etc.), so PHP receives it as an array of nodes rather than a string. Use:
//table[php:function('preg_match', '/post\d+/', string(@id))]
Now, we need a boolean return, because XPath treats a numeric predicate result as a position test, so:
// Wrap preg_match so the predicate receives a real boolean instead of 0/1.
function booleanPregMatch($match, $string) {
    return preg_match($match, $string) > 0;
}
$xpath->registerPHPFunctions();
foreach ($xpath->query("//table[@id and php:function('booleanPregMatch', '/post\d+/', string(@id))]") as $key => $row) {
    echo $row->ownerDocument->saveXML($row);
}
BTW: for more complex issues, you can of course sneakily check what's happening with this:
//table[php:function('var_dump', @id)]
It's a shame we don't have XPath 2.0 functions available, but if you can handle this requirement with a less precise starts-with, I'd always prefer that over importing PHP functions.
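As a rough illustration of the starts-with route, here is a minimal sketch using only XPath 1.0 functions. The sample markup and ids are made up for the example. Note that the stricter query behaves like the anchored pattern ^post\d+$, which is narrower than the unanchored /post\d+/ from the question.

```php
<?php
// Hypothetical sample document; the ids are invented for illustration.
$html = '<html><body>'
      . '<table id="post12"><tr><td>a</td></tr></table>'
      . '<table id="poster"><tr><td>b</td></tr></table>'
      . '<table id="other"><tr><td>c</td></tr></table>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// starts-with() alone is loose: it also accepts "poster".
$loose = $xpath->query("//table[starts-with(@id, 'post')]");

// Stricter: the part after "post" must be non-empty and contain only digits.
// translate() deletes every digit; an empty remainder means "digits only".
$strict = $xpath->query(
    "//table[starts-with(@id, 'post')"
    . " and string-length(substring-after(@id, 'post')) > 0"
    . " and translate(substring-after(@id, 'post'), '0123456789', '') = '']"
);

$looseIds = [];
foreach ($loose as $table) {
    $looseIds[] = $table->getAttribute('id');
}

$strictIds = [];
foreach ($strict as $table) {
    $strictIds[] = $table->getAttribute('id');
}

print_r($looseIds);  // post12, poster
print_r($strictIds); // post12
```

No PHP functions need to be registered for this, which keeps the expression portable to other XPath 1.0 processors.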

What am I doing wrong?
The XPath expression @id (the third argument to php:function, which becomes the second parameter of preg_match) returns an array, but preg_match expects a string.
Convert it to a string first: string(@id).
In addition, you need to compare the result to 1, because preg_match returns 1 when a match is found:
foreach($xpath->query("//table[@id and 1 = php:function('preg_match', '/post\d+/', string(@id))]") as $key => $row)
{
    var_dump($key, $row, $row->ownerDocument->saveXml($row));
}
Explanation/What happens here?:
An XPath expression returns a node-list (more precisely, a node-set) by default. If you map a PHP function onto such an expression, the set is represented as an array. You can easily test that by using var_dump:
$xpath->query("php:function('var_dump', //table)");
array(1) {
[0]=>
object(DOMElement)#3 (0) {
}
}
Same for the XPath expression @id in the context of each table element:
$xpath->query("//table[php:function('var_dump', @id)]");
array(1) {
[0]=>
object(DOMAttr)#3 (0) {
}
}
You can change that into a string-typed result by making use of the XPath string function:
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
$xpath->query("//table[php:function('var_dump', string(@id))]");
string(4) "test"
(the table has id="test")
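Putting the pieces together, the following is a small self-contained sketch of the approach described above. The HTML is a made-up sample, and restricting registerPHPFunctions to preg_match is an optional choice made here, not something the question requires.

```php
<?php
// Hypothetical sample markup for illustration only.
$dom = new DOMDocument();
$dom->loadHTML(
    '<html><body>'
    . '<table id="post42"><tr><td>x</td></tr></table>'
    . '<table id="test"><tr><td>y</td></tr></table>'
    . '<table><tr><td>z</td></tr></table>'
    . '</body></html>'
);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPHPFunctions('preg_match'); // expose only preg_match to XPath

// string(@id) turns the attribute node-set into a string for preg_match;
// "1 = ..." turns the numeric result into a boolean predicate.
$query = "//table[@id and 1 = php:function('preg_match', '/post\\d+/', string(@id))]";

$matchedIds = [];
foreach ($xpath->query($query) as $table) {
    $matchedIds[] = $table->getAttribute('id');
}

print_r($matchedIds); // only post42
```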

Related

Getting the value of XML field matching a property via XPATH

I've got a really odd XML schema that's causing me unnecessary grief and woe.
I need to get the value of an IMAGEFILENAME node whose pictype attribute is "hide".
The XML schema looks something like:
<PHOTOS>
<IMAGETHUMBFILENAME/>
<IMAGECAPTION>
This is a caption
</IMAGECAPTION>
<PRINTQUALITYIMAGE>
/mylocation/filename1.jpg
</PRINTQUALITYIMAGE>
<IMAGEFILENAME pictype="show">
/mylocation/filename2.jpg
</IMAGEFILENAME>
<IMAGETHUMBFILENAME/>
<IMAGECAPTION>This is another caption</IMAGECAPTION>
<PRINTQUALITYIMAGE>
/mylocation/filename3.jpg
</PRINTQUALITYIMAGE>
<IMAGEFILENAME pictype="hide">
/mylocation/filename4.jpg
</IMAGEFILENAME>
<IMAGETHUMBFILENAME/>
</PHOTOS>
And I've managed to come up with the following XPATH using PHP:
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');
var_dump($nodes);
When I do a dump of the $nodes var what I'd hope to see (and what I want) is to get the value /mylocation/filename4.jpg. Instead what I'm getting is:
array(1) {
[0]=>
object(SimpleXMLElement)#333 (1) {
["@attributes"]=>
array(1) {
["pictype"]=>
string(4) "hide"
}
}
}
I've tried various combinations of /parent, /text() and /node() but with no joy at all.
Please somebody tell me what a muppet I'm being and put me out of my misery. Either that or is the schema being problematic?
So, you have an array of SimpleXMLElement objects.
To get the string representation of a SimpleXMLElement you can just echo it:
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');
echo $nodes[0]; // I used `[]` notation to get the first element of the array
To use the string representation later in your code, convert it to a string explicitly:
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');
$node_str = strval($nodes[0]); // still `[]` notation
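One detail worth noting with the XML from the question: the file name sits on its own line inside the element, so the string value includes the surrounding newlines. A minimal sketch, using a shortened copy of that XML and assuming you want the bare path, could guard against an empty result and trim the value:

```php
<?php
// Shortened version of the XML from the question.
$source = <<<XML
<PHOTOS>
<IMAGEFILENAME pictype="show">
/mylocation/filename2.jpg
</IMAGEFILENAME>
<IMAGEFILENAME pictype="hide">
/mylocation/filename4.jpg
</IMAGEFILENAME>
</PHOTOS>
XML;

$xml = simplexml_load_string($source);
$nodes = $xml->xpath('/PHOTOS/IMAGEFILENAME[@pictype="hide"]');

// xpath() may return an empty array, so check before indexing.
// trim() removes the newlines that surround the path in the source.
$filename = $nodes ? trim((string) $nodes[0]) : null;

var_dump($filename); // string(25) "/mylocation/filename4.jpg"
```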

PHP SimpleXMLElement object get data

I have the following XML:
<Root>
<personalData>
<userName>John Tom</userName>
<email>mail@example.com</email>
</personalData>
<profesionalData>
<job>engineer</job>
<jobId>16957</jobId>
</profesionalData>
</Root>
Doing in my debugger:
$myObject->xpath('//Root/profesionalData')
I have:
: array =
0: object(SimpleXMLElement) =
job: string = engineer
jobId: string = 16957
I cannot get hold of the jobId 16957.
What do I have to do?
$root = simplexml_load_file('file.xml');
$job_ids = $root->xpath('//profesionalData/jobId');
if (!$job_ids) {
die("Job IDs not found");
}
foreach ($job_ids as $id) {
// SimpleXMLElement implements the __toString method, so
// you can fetch the value by casting the object to string.
$id = (string)$id;
var_dump($id);
}
Sample Output
string(5) "16957"
Notes
You don't need to specify Root in the XPath expression, if you are going to fetch all profesionalData/jobId tags no matter where they are in the document, just use the double slash (//) expression. This approach may be convenient in cases, when you want to avoid registering the XML namespaces. Otherwise, you can use a strict expression like /Root/profesionalData/jobId (path from the root). By the way, your current expression (//Root/profesionalData/jobId) matches all occurrences of /Root/profesionalData/jobId in the document, e.g. /x/y/z/Root/profesionalData/jobId.
Since SimpleXmlElement::xpath function returns an array on success, or FALSE on failure, you should iterate the value with a loop, if it is a non-empty array.
SimpleXmlElement implements __toString method. The method is called when the object appears in a string context. In particular, you can cast the object to string in order to fetch string content of the node.
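To make the difference between the two path styles in the notes concrete, here is a small sketch. The extra archive element and the second jobId are invented for the example; they are not part of the question's XML.

```php
<?php
// Hypothetical document: the <archive> branch is added for illustration.
$xml = simplexml_load_string(
    '<Root>'
    . '<profesionalData><jobId>16957</jobId></profesionalData>'
    . '<archive><profesionalData><jobId>111</jobId></profesionalData></archive>'
    . '</Root>'
);

// "//" matches profesionalData/jobId at any depth.
$anywhere = array_map('strval', $xml->xpath('//profesionalData/jobId'));

// The absolute path matches only the direct child of Root.
$strict = array_map('strval', $xml->xpath('/Root/profesionalData/jobId'));

print_r($anywhere); // 16957, 111
print_r($strict);   // 16957
```

array_map with strval relies on the same __toString behaviour described in the last note.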

PHP: Illegal offset type while generating an array from xml

Why am I getting Illegal offset type error while trying to build an array?
function tassi_parser() {
$xml=simplexml_load_file('http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml');
foreach($xml->Cube->Cube->Cube as $tmp) {
$results[$tmp['currency']] = $tmp['rate'];
};
return $results;
};
$tmp['currency'] correctly contains a string that should be used as key so i can't understand what is the problem...
simplexml_load_file returns a SimpleXMLElement, and every element or attribute you read from it is an object too. Therefore the type of $tmp['currency'] in your foreach is "object" (not string), so you need to cast it to a string as follows:
(string)$tmp['currency']
You could use the gettype function to retrieve the type of something: http://php.net/manual/en/function.gettype.php
You have to cast it to string like this:
$results[(string)$tmp['currency']] = (string)$tmp['rate'];
Also the ; at the end of the foreach and the function isn't necessary!
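The cast can be tried without a network call. The sketch below uses a tiny inline document with invented currencies and rates in place of the ECB feed, so its structure is flatter than the real file:

```php
<?php
// Hypothetical stand-in for the ECB feed; the rates are made up.
$xml = simplexml_load_string(
    '<rates>'
    . '<Cube currency="USD" rate="1.0850"/>'
    . '<Cube currency="JPY" rate="161.20"/>'
    . '</rates>'
);

$results = [];
foreach ($xml->Cube as $cube) {
    // $cube['currency'] is a SimpleXMLElement; array keys must be
    // scalars, so cast before using it as an offset.
    $results[(string) $cube['currency']] = (string) $cube['rate'];
}

var_dump(gettype($xml->Cube[0]['currency'])); // string(6) "object"
print_r($results);
```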

XPATH - get single value returned instead of array php

I am using Xpath in PHP - I know that my query will return either 0 or 1 results.
If 1 result is returned I do not want it as an array - which is what is returned right now. I simply want the value without having to access the [0] element of the result and cast to a string.
Is this possible?
If 1 result is returned I don't want it as an array - which is what is returned. I simply want the value without having to access the [0] element of the result and cast to a string.
That is possible with XPath's string function
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
and DOMXPath's evaluate method:
Returns a typed result if possible or a DOMNodeList containing all nodes matching the given XPath expression.
Example:
$dom = new DOMDocument;
$dom->loadXML('<root foo="bar"/>');
$xp = new DOMXPath($dom);
var_dump($xp->evaluate('string(/root/@foo)')); // string(3) "bar"
If there was a built in xpath way of grabbing the first and only the first node value then that would be much more preferable over writing a function to do it
You can use the position function:
The position function returns a number equal to the context position from the expression evaluation context.
Example:
$dom = new DOMDocument;
$dom->loadXML('<root><foo xml:id="f1"/><foo xml:id="f2"/></root>');
$xp = new DOMXPath($dom);
var_dump($xp->evaluate('string(/root/foo[position() = 1]/@xml:id)')); // string(2) "f1"
or the abbreviated syntax
$dom = new DOMDocument;
$dom->loadXML('<root><foo xml:id="f1"/><foo xml:id="f2"/></root>');
$xp = new DOMXPath($dom);
var_dump($xp->evaluate('string(/root/foo[1]/@xml:id)')); // string(2) "f1"
Note that when querying for descendants with //, using the position function might yield multiple results due to the way the expression is evaluated.
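A short sketch of that caveat, with a made-up document: //foo[1] selects the first foo child of every parent, whereas wrapping the path in parentheses applies the position test to the whole result.

```php
<?php
// Hypothetical document with one <foo> under each of two parents.
$dom = new DOMDocument();
$dom->loadXML('<root><a><foo id="f1"/></a><b><foo id="f2"/></b></root>');
$xp = new DOMXPath($dom);

// First <foo> child of each parent: both elements match.
$perParent = $xp->query('//foo[1]')->length;

// First <foo> in the whole document: exactly one match.
$overall = $xp->query('(//foo)[1]')->length;

$firstId = $xp->evaluate('string((//foo)[1]/@id)');

var_dump($perParent, $overall, $firstId); // int(2) int(1) string(2) "f1"
```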
Using 'evaluate' instead of 'query', you can do things like casting.
DOMXPath::evaluate()
Also, if you're just annoyed with doing stuff a lot of times, just write a function that does it ... that is the whole idea behind functions, right?
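As a sketch of both suggestions, the helper below wraps evaluate in a small function. The name firstString and the sample XML are made up for this example. It also shows that evaluate can return numbers and booleans directly.

```php
<?php
// Hypothetical helper: returns the string value of the first matching
// node, or an empty string when nothing matches.
function firstString(DOMXPath $xp, string $expr): string
{
    return $xp->evaluate('string(' . $expr . ')');
}

$dom = new DOMDocument();
$dom->loadXML('<root foo="bar"><item>3</item><item>4</item></root>');
$xp = new DOMXPath($dom);

$foo  = firstString($xp, '/root/@foo');     // "bar"
$none = firstString($xp, '/root/@missing'); // ""

// evaluate() also returns other typed results.
$count = $xp->evaluate('count(/root/item)');   // float(2)
$sum   = $xp->evaluate('sum(/root/item)');     // float(7)
$has   = $xp->evaluate('boolean(/root/item)'); // bool(true)

var_dump($foo, $none, $count, $sum, $has);
```

Because an empty string is returned for both a missing node and an empty one, a separate boolean() check is needed when that distinction matters.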
probably
if ($array[0]){
$string = $array[0];
}
?
if $array[0] is an array, you can rename string to new_array
if ($array[0]){
$new_array = $array[0];
}
Your question suggests that you are using SimpleXML, because you talk about an array. However, a long time ago you accepted an answer that uses DOMDocument. In case other users come here looking for a solution in SimpleXML, it works a little differently:
list($first) = $xml->xpath('//element') + array(NULL);
The element in $first, if not NULL (for no elements), will still be of type SimpleXMLElement (either an element node or an attribute node, depending on the XPath query). However, you can just cast it to a string in PHP, or use it in a string context, like with echo:
echo $first;
You can write it most simply like this:
$string = @$array[0];
The @ operator will suppress errors, making $string null if $array is empty.
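On PHP 7.0 or later, the null coalescing operator gives the same fallback without suppressing errors. A minimal sketch, with made-up XML and element names:

```php
<?php
$xml = simplexml_load_string('<root><element>hello</element></root>');

// "??" yields null when index 0 does not exist, without raising a warning.
$first   = $xml->xpath('//element')[0] ?? null;
$missing = $xml->xpath('//nothing')[0] ?? null;

$value = $first !== null ? (string) $first : null;

var_dump($value);   // string(5) "hello"
var_dump($missing); // NULL
```

Unlike @, this only covers the missing-index case, so unrelated errors in the same expression stay visible.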
