Help parse XML to PHP by attribute value - php

Could someone kindly help me get this parsed? I have the following XML and need the value of the photo-url element whose max-width attribute is "75". How do I filter for that in PHP?
$xml->posts->post['photo-url']....?
<photo-url max-width="100">
image1.jpg
</photo-url>
<photo-url max-width="75">
image2.jpg
</photo-url>

Using PHP DOM
$dom = new DomDocument;
$dom->loadXml('
<root>
<photo-url max-width="100">image1.jpg</photo-url>
<photo-url max-width="75">image2.jpg</photo-url>
</root>
');
$xpath = new DomXpath($dom);
foreach ($xpath->query('//photo-url[@max-width="75"]') as $photoUrlNode) {
echo $photoUrlNode->nodeValue; // will be image2.jpg
}

Use SimpleXMLElement and an xpath query.
$xml = new SimpleXMLElement($your_xml_string);
$result = $xml->xpath('//photo-url[@max-width="75"]');
// Loop over all matching <photo-url> nodes and dump their contents
foreach ($result as $node) {
    print_r($node);
    // asXML() is a method; strip_tags() leaves only the text content
    $image = strip_tags($node->asXML());
}

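You can use XPath: //photo-url[@max-width = '75']. It selects every photo-url element that satisfies this condition. To select only the first match in the document, wrap the path in parentheses: (//photo-url[@max-width = '75'])[1]. Without the parentheses, //photo-url[@max-width = '75'][1] selects the first match within each parent element.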
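A minimal sketch of the "first match only" case with DOMXPath. The helper name photoUrlForWidth and the sample document are assumptions made for illustration; normalize-space() is used so that line breaks around the file name, as in the question's XML, are trimmed.

```php
<?php
// Hypothetical helper: returns the text of the first <photo-url> in document
// order whose max-width attribute equals $maxWidth, or '' if none matches.
function photoUrlForWidth(string $xml, int $maxWidth): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);
    // The parentheses make [1] apply to the whole result set, not per parent.
    $expression = sprintf(
        'normalize-space((//photo-url[@max-width="%d"])[1])',
        $maxWidth
    );
    return $xpath->evaluate($expression);
}

// Sample document (assumed), with whitespace around the second value
$xml = '<root>
<photo-url max-width="100">image1.jpg</photo-url>
<photo-url max-width="75">
image2.jpg
</photo-url>
</root>';

echo photoUrlForWidth($xml, 75), PHP_EOL; // image2.jpg
```

Because the expression is wrapped in a string function, evaluate() returns a plain PHP string, so no loop over a node list is needed.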

Related

How to query a xml file using xpath (php) ?

I am trying to query an XML file using XPath, but I get nothing back. I think I wrote the query incorrectly.
XML
<subject id="Tom">
<relation unit="ITSupport" role="ITSupporter" />
</subject>
PHP
$xpath = new DOMXpath($doc);
$role = 'ITSupporter';
$elements = $xpath->query("//subject/@id[../relation/@role='".$role."']");
foreach ($elements as $element) {
$name = $element -> nodeValue;
$arr[$i] = $name;
$i = $i + 1;
}
How can I get the id Tom? I want to save it to a variable, for example $var.
Building up the XPath expression:
Fetch any subject element ...
//subject
... with a child element relation ...
//subject[relation]
... that has a role attribute with the given text ...
//subject[relation/@role="ITSupporter"]
... and get the @id attribute of subject.
//subject[relation/@role="ITSupporter"]/@id
Additionally the source could be cleaned up. PHP arrays can use the $array[] syntax to push new elements into them.
Put together:
$xml = <<<'XML'
<subject id="Tom">
<relation unit="ITSupport" role="ITSupporter" />
</subject>
XML;
$role = 'ITSupporter';
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
$ids = [];
foreach ($xpath->evaluate("//subject[relation/@role='".$role."']/@id") as $idAttribute) {
$ids[] = $idAttribute->value;
}
var_dump($ids);
Output:
array(1) {
[0]=>
string(3) "Tom"
}
If you expect only a single result, you can cast it to a string in XPath:
$id = $xpath->evaluate(
"string(//subject[relation/@role='".$role."']/@id)"
);
var_dump($id);
Output:
string(3) "Tom"
XML Namespaces
Looking at the example posted in the comment your XML uses the namespace http://cpee.org/ns/organisation/1.0 without a prefix. The XML parser will resolve it so you can read the nodes as {http://cpee.org/ns/organisation/1.0}subject. Here are 3 examples that all resolve to this:
<subject xmlns="http://cpee.org/ns/organisation/1.0"/>
<cpee:subject xmlns:cpee="http://cpee.org/ns/organisation/1.0"/>
<c:subject xmlns:c="http://cpee.org/ns/organisation/1.0"/>
The same has to happen for the XPath expression. However, XPath does not have a default namespace. You need to register and use a prefix of your choosing. This allows the XPath engine to resolve something like //org:subject to //{http://cpee.org/ns/organisation/1.0}subject.
The PHP does not need to change much:
$xml = <<<'XML'
<subject id="Tom" xmlns="http://cpee.org/ns/organisation/1.0">
<relation unit="ITSupport" role="ITSupporter" />
</subject>
XML;
$role = 'ITSupporter';
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
// register a prefix for the namespace
$xpath->registerNamespace('org', 'http://cpee.org/ns/organisation/1.0');
$ids = [];
// address the elements using the registered prefix
$idAttributes = $xpath->evaluate("//org:subject[org:relation/@role='".$role."']/@id");
foreach ($idAttributes as $idAttribute) {
$ids[] = $idAttribute->value;
}
var_dump($ids);
Try this XPath
//subject[relation/@role='".$role."']/@id
You were applying the predicate on the id attribute and not on the subject element.
Getting an element by its id works the same way as matching on the contents of $role, like the following:
$xpath->query("//*[@id='$id']")->item(0);
In other words, @id belongs inside the square brackets of the predicate.
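As a rough sketch of that lookup, the following wraps the query in a small function. The name findElementById is hypothetical, and the code assumes the id value contains no single quote, since XPath 1.0 string literals have no escape sequence for it.

```php
<?php
// Hypothetical helper: returns the first element whose id attribute
// equals $id, or null when nothing matches.
function findElementById(DOMXPath $xpath, string $id): ?DOMElement
{
    // Assumption: $id contains no single quote
    $node = $xpath->query("//*[@id='" . $id . "']")->item(0);
    return $node instanceof DOMElement ? $node : null;
}

$document = new DOMDocument();
$document->loadXML(
    '<subject id="Tom"><relation unit="ITSupport" role="ITSupporter"/></subject>'
);
$xpath = new DOMXPath($document);

$subject = findElementById($xpath, 'Tom');
echo $subject !== null ? $subject->nodeName : 'not found', PHP_EOL; // subject
```

Checking for null before dereferencing avoids the "property of non-object" class of errors when the id does not exist.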

why I can't get first child value for a XML file in php

I used an XMLHttpRequest object to retrieve data from a PHP response.
Then, I created an XML file:
<?xml version="1.0" encoding="UTF-8"?>
<persons>
<person>
<name>Ce</name>
<gender>male</gender>
<age>24</age>
</person>
<person>
<name>Lin</name>
<gender>female</gender>
<age>25</age>
</person>
</persons>
In the PHP file, I load the XML file and try to echo tag values of "name."
$dom = new DOMDocument("1.0");
$dom -> load("test.xml");
$persons = $dom -> getElementsByTagName("person");
foreach($persons as $person){
echo $person -> childNodes -> item(0) -> nodeValue;
}
But the nodeValue returned is null. However, when I change to item(1), the name tag values can be displayed. Why?
Change the code to
$dom = new DOMDocument("1.0");
$dom->load("test.xml");
$persons = $dom->getElementsByTagName("person");
foreach ($persons as $person) {
    // item(0) is the whitespace text node before <name>; item(1) is <name> itself
    echo $person->childNodes->item(1)->nodeValue;
}
Everything in a DOM is a node, including text and even text that consists only of whitespace. So the first child of the person element node is a text node that contains the line break and indentation before the name element node.
DOMDocument has a property that drops whitespace-only text nodes at parse time:
$document = new DOMDocument("1.0");
// do not preserve whitespace only text nodes
$document->preserveWhiteSpace = FALSE;
$document->load("test.xml");
$persons = $document->getElementsByTagName("person");
foreach ($persons as $person) {
echo $person->firstChild->textContent;
}
However typically a better way is to use Xpath expressions.
$document = new DOMDocument("1.0");
$document->load("test.xml");
$xpath = new DOMXpath($document);
$persons = $xpath->evaluate("/persons/person");
foreach ($persons as $person) {
echo $xpath->evaluate("string(name)", $person);
}
string(name) fetches the child element node name (its position is not relevant) and casts it to a string. If there is no name element, it returns an empty string.
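To illustrate both points (position independence and the empty-string fallback), here is a small sketch. The helper personNames and the sample data are assumptions; the second person deliberately has no name element.

```php
<?php
// Hypothetical helper: collects the name of every <person>,
// using '' where the <name> element is missing.
function personNames(string $xml): array
{
    $document = new DOMDocument();
    $document->loadXML($xml);
    $xpath = new DOMXPath($document);

    $names = [];
    foreach ($xpath->evaluate('/persons/person') as $person) {
        // string(name) is evaluated relative to the current <person>
        $names[] = $xpath->evaluate('string(name)', $person);
    }
    return $names;
}

// Sample data (assumed): <name> is not the first child, and one is missing
$xml = '<persons>
  <person><gender>male</gender><name>Ce</name></person>
  <person><gender>female</gender></person>
</persons>';

var_dump(personNames($xml));
```

The whitespace text nodes between the elements do not matter here, because the expression addresses the name element by name rather than by child index.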
With DOM you have to select the right element to read the name, because child nodes include every kind of node, whitespace text included. Node 0, which you're trying to use, is such a whitespace-only text node, which is why you see no value. So for DOM...
$dom = new DOMDocument("1.0");
$dom -> load("test.xml");
$persons = $dom -> getElementsByTagName("person");
foreach($persons as $person){
$name = $person->getElementsByTagName("name");
echo $name->item(0)->nodeValue.PHP_EOL;
}
If your requirements are as simple as this, you could alternatively use SimpleXML...
$sxml = simplexml_load_file("test.xml");
foreach ( $sxml->person as $person ) {
echo $person->name.PHP_EOL;
}
This allows you to access elements as though they are object properties and as you can see ->person equates to accessing <person>.

With DOMDocument, is it possible to filter the output of getElementsByTagName based on an additional parameter?

Take the following example of this xml:
<xml>
<siblings>
<brother>Derek</brother>
<sister>Elaine</sister>
<sister>Flora</sister>
</siblings>
<siblings>
<brother>Gary</brother>
<sister>Hannah</sister>
</siblings>
</xml>
If I were to use the following code:
$xmlDoc=new DOMDocument();
$xmlDoc->load("Family.xml");
$siblings = $xmlDoc->getElementsByTagName('siblings');
$sister = $xmlDoc->getElementsByTagName('sister');
This would normally return all instances of the tag "sister", in this case "Elaine", "Flora" and "Hannah". Would it be possible to filter the tag names by the value of one of the other nodes? For instance, using the name "Derek" to reduce the output to "Elaine" and "Flora" only.
Xpath expressions allow you to use conditions to fetch nodes from a DOM.
$xml = <<<'XML'
<xml>
<siblings>
<brother>Derek</brother>
<sister>Elaine</sister>
<sister>Flora</sister>
</siblings>
<siblings>
<brother>Gary</brother>
<sister>Hannah</sister>
</siblings>
</xml>
XML;
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$expression = '/xml/siblings[brother = "Derek"]/*[not(self::brother = "Derek")]';
foreach ($xpath->evaluate($expression) as $sibling) {
echo $sibling->textContent, "\n";
}
Output:
Elaine
Flora
The Xpath Expression
Fetch the siblings elements ...
/xml/siblings
... if they have a child element brother with the value Derek ...
/xml/siblings[brother = "Derek"]
... and fetch their child elements...
/xml/siblings[brother = "Derek"]/*
... if they are not a brother element node with the value Derek.
/xml/siblings[brother = "Derek"]/*[not(self::brother = "Derek")]
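If only the sister elements are wanted, as in the question, the last step can name them directly instead of excluding the brother. The following is a sketch under that assumption; the helper sistersOf is hypothetical and expects a brother name without double quotes.

```php
<?php
// Hypothetical helper: names of all <sister> elements that share a
// <siblings> group with the given brother.
function sistersOf(DOMXPath $xpath, string $brother): array
{
    // Assumption: $brother contains no double quote
    $expression = '/xml/siblings[brother = "' . $brother . '"]/sister';
    $names = [];
    foreach ($xpath->evaluate($expression) as $sister) {
        $names[] = $sister->textContent;
    }
    return $names;
}

$document = new DOMDocument();
$document->loadXML(
    '<xml>
      <siblings><brother>Derek</brother><sister>Elaine</sister><sister>Flora</sister></siblings>
      <siblings><brother>Gary</brother><sister>Hannah</sister></siblings>
    </xml>'
);
$xpath = new DOMXPath($document);

echo implode(', ', sistersOf($xpath, 'Derek')), PHP_EOL; // Elaine, Flora
```

Naming the element keeps the result stable if other child elements are later added to a siblings group.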

XPath Substring-After Help / Query/Evaluate?

I'm building a php script to transfer selected contents of an xml file to an sql database..
One of the hardcoded XML contents is formatted like this:
<visualURL>
id=18144083|img=http://upload.wikimedia.org/wikipedia/en/8/86/Holyrollernovacaine.jpg
</visualURL>
And I'm looking for a way to just get the contents of the URL (all text after img=).
$Image = $xpath->query("substring-after(/Playlist/PlaylistEntry[1]/visualURL[1]/text(), 'img=')", $element)->item(0)->nodeValue;
This produces a "property of non-object" error in my PHP output.
There must be another way to just extract the URL contents using XPath that I want, no?
Any help would be greatly appreciated!
EDIT:
Here is the minimum code
<?php
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML('<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>');
$xpath = new DOMXpath($xmlDoc);
$elements = $xpath->query("/Playlist/PlaylistEntry[1]");
if (!is_null($elements))
foreach ($elements as $element)
$Image = $xpath->query("substring-after(/Playlist/PlaylistEntry[1]/visualURL[1]/text(), 'img=')", $element)->item(0)->nodeValue;
print "Finished Item: $Image";
?>
EDIT 2:
After some research I believe I must use
$xpath->evaluate
instead of my current use of
$xpath->query
see this link
Same XPath query is working with Google docs but not PHP
I'm not exactly sure how to do this yet, but I will investigate more in the morning. Again, any help would be appreciated.
You're heading in the right direction. Use DOMXPath::evaluate() for XPath expressions that do not return nodes, such as substring-after() (it returns a string, as documented in the linked page). The following code prints the expected output:
$xmlDoc = new DOMDocument();
$xml = <<<XML
<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>
XML;
$xmlDoc->loadXML($xml);
$xpath = new DOMXpath($xmlDoc);
$elements = $xpath->query("/Playlist/PlaylistEntry");
foreach ($elements as $element) {
$Image = $xpath->evaluate("substring-after(visualURL, 'img=')", $element);
print "Finished Item: $Image <br>";
}
output :
Finished Item: http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
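Since the visualURL text is surrounded by line breaks in the source XML, the extracted string can carry trailing whitespace. One possible refinement, sketched here with a hypothetical helper imageUrls, is to wrap the expression in normalize-space() before storing the value.

```php
<?php
// Hypothetical helper: extracts the trimmed image URL of every PlaylistEntry.
function imageUrls(string $xml): array
{
    $xmlDoc = new DOMDocument();
    $xmlDoc->loadXML($xml);
    $xpath = new DOMXPath($xmlDoc);

    $urls = [];
    foreach ($xpath->query('/Playlist/PlaylistEntry') as $entry) {
        // normalize-space() strips the line breaks around the value
        $urls[] = $xpath->evaluate(
            "normalize-space(substring-after(visualURL, 'img='))",
            $entry
        );
    }
    return $urls;
}

$xml = <<<'XML'
<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>
XML;

var_dump(imageUrls($xml));
```

A trimmed value is usually what you want before inserting it into a database column.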

how to select nodevalue based on another nodevalue via php ?

I have a xml file which contains this :
<ns1:Response xmlns:ns1="http://example.com/">
<ns1:return>
<ns1:mid>39824</ns1:mid>
<ns1:serverType>4</ns1:serverType>
<ns1:size>5</ns1:size>
</ns1:return>
<ns1:return>....
</ns1:return>
Now I want to get the node value of mid where the node value of size is 5. I tried the following code but got no results:
$doc = new DOMDocument();
$doc->load($file);
$xpath = new DOMXPath($doc);
$query = '//Response/return/size[.="5"]/mid';
$entries = $xpath->evaluate($query);
So how can I do that ?
thanks in advance
PHP automatically registers the namespaces of the current context node, but it is better not to depend on that. Prefixes can change from one document to the next, and a document can even use a default namespace with no prefix at all.
Best register your own prefix:
$xpath->registerNamespace('e', 'http://example.com/');
In XPath you define location paths with conditions:
Any return node inside a Response node:
//e:Response/e:return
If it has a child node size with the value 5
//e:Response/e:return[e:size = 5]
Get the mid node inside it
//e:Response/e:return[e:size = 5]/e:mid
Cast the first found mid node into a string
string(//e:Response/e:return[e:size = 5]/e:mid)
Complete example:
$xml = <<<'XML'
<ns1:Response xmlns:ns1="http://example.com/">
<ns1:return>
<ns1:mid>39824</ns1:mid>
<ns1:serverType>4</ns1:serverType>
<ns1:size>5</ns1:size>
</ns1:return>
<ns1:return></ns1:return>
</ns1:Response>
XML;
$doc = new DOMDocument();
$doc->loadXml($xml);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('e', 'http://example.com/');
$mid = $xpath->evaluate(
'string(//e:Response/e:return[e:size = 5]/e:mid)'
);
var_dump($mid);
Output:
string(5) "39824"
You can also use the following-sibling axis in this case: get the mid value whose following sibling is a size element with text equal to 5. Rough example:
$query = 'string(//ns1:Response/ns1:return/ns1:mid[following-sibling::ns1:size[text()="5"]])';
You're missing the namespace prefixes, and your query looks for a mid child of the size element whose content is 5, but mid is a sibling of size, not its child.
Try this:
$query = '//ns1:Response/ns1:return/ns1:mid[../ns1:size[text()="5"]]';
then, to see the result:
foreach ($entries as $entry) {
echo $entry->nodeValue . "<br />";
}
