I am new to processing and reading XML strings in PHP.
I have XML like this:
<sports-metadata>
<sports-content-codes>
<sports-content-code code-type="sport" code-key="15027000" code-name="Golf"/>
<sports-content-code code-type="league" code-key="l.pga.com" code-name="Professional Golf Association"/>
<sports-content-code code-type="season-type" code-key="regular"/>
<sports-content-code code-type="season" code-key="2015"/>
<sports-content-code code-type="priority" code-key="normal"/>
</sports-content-codes>
</sports-metadata>
I have read in the XML via $xml = simplexml_load_file().
I can get to this XML section via $xml->{'sports-content-codes'}->{'sports-content-code'}.
Among the sports-content-code elements, I want to retrieve the code-key value where code-type="season".
How can I do this in PHP?
Thank you all.
-- Ed
Usually you use the ->attributes() method to get those attributes:
foreach($xml->{'sports-content-codes'}->{'sports-content-code'} as $content_code) {
$attr = $content_code->attributes();
$code_type = (string) $attr->{'code-type'};
echo $code_type;
}
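To get the code-key for the season entry specifically, the same loop can compare code-type before reading the other attribute. This is a minimal sketch, assuming the sample XML from the question is held in a string; the variable names $xmlString and $seasonKey are made up for illustration.

```php
<?php
// Sample document copied from the question.
$xmlString = <<<'XML'
<sports-metadata>
  <sports-content-codes>
    <sports-content-code code-type="sport" code-key="15027000" code-name="Golf"/>
    <sports-content-code code-type="league" code-key="l.pga.com" code-name="Professional Golf Association"/>
    <sports-content-code code-type="season-type" code-key="regular"/>
    <sports-content-code code-type="season" code-key="2015"/>
    <sports-content-code code-type="priority" code-key="normal"/>
  </sports-content-codes>
</sports-metadata>
XML;

$xml = simplexml_load_string($xmlString);

$seasonKey = null; // stays null if no matching element exists
foreach ($xml->{'sports-content-codes'}->{'sports-content-code'} as $contentCode) {
    $attr = $contentCode->attributes();
    // Attribute values are SimpleXMLElement objects; cast before comparing.
    if ((string) $attr->{'code-type'} === 'season') {
        $seasonKey = (string) $attr->{'code-key'};
        break;
    }
}

var_dump($seasonKey); // string(4) "2015"
```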
Use Xpath ...
... with SimpleXml:
$element = simplexml_load_string($xml);
$array = $element->xpath(
'//sports-content-code[@code-type="season"]'
);
var_dump(
(string)$array[0]['code-key']
);
... or DOM:
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
var_dump(
$xpath->evaluate(
'string(//sports-content-code[@code-type="season"]/@code-key)'
)
);
Output (both):
string(4) "2015"
Xpath is an expression language for DOM (think SQL for a DBMS). SimpleXMLElement::xpath() supports only part of it: expressions that return element or attribute nodes. On success the result is an array of SimpleXMLElement objects. DOMXpath::evaluate() supports full Xpath 1.0. The result is a DOMNodeList or a scalar value, depending on the expression.
The expression:
Select the "sports-content-code" element nodes
//sports-content-code
that have a "code-type" attribute node with the value "season"
//sports-content-code[@code-type="season"]
get the "code-key" attribute nodes
//sports-content-code[@code-type="season"]/@code-key
cast the node list to a string, returning the text content of the first node
string(//sports-content-code[@code-type="season"]/@code-key)
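The four steps can be tried one at a time with DOMXpath::evaluate() to see what each one returns. The following is only a sketch on a shortened copy of the sample XML; the variable names are chosen for illustration.

```php
<?php
// Shortened sample: two of the elements from the question.
$xmlString = '<sports-content-codes>'
    . '<sports-content-code code-type="season-type" code-key="regular"/>'
    . '<sports-content-code code-type="season" code-key="2015"/>'
    . '</sports-content-codes>';

$dom = new DOMDocument();
$dom->loadXML($xmlString);
$xpath = new DOMXPath($dom);

// Step 1: every sports-content-code element (a DOMNodeList).
$all = $xpath->evaluate('//sports-content-code');

// Step 2: only the elements whose code-type attribute equals "season".
$season = $xpath->evaluate('//sports-content-code[@code-type="season"]');

// Step 3: the code-key attribute nodes of those elements.
$keys = $xpath->evaluate('//sports-content-code[@code-type="season"]/@code-key');

// Step 4: string() turns the first node of the list into a scalar string.
$value = $xpath->evaluate('string(//sports-content-code[@code-type="season"]/@code-key)');

var_dump($all->length);                // int(2)
var_dump($season->length);             // int(1)
var_dump($keys->item(0)->nodeValue);   // string(4) "2015"
var_dump($value);                      // string(4) "2015"
```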
Related
I'm attempting to extract the date saved in the <PersonDetails> tag for some XML I am working with, example:
<Record>
<PersonDetails RecordDate="2017-03-31T00:00:00">
<FirstName>Joe</FirstName>
<Surname>Blogs</Surname>
<Status>Active</Status>
</PersonDetails>
</Record>
Currently I have been trying the following:
if (isset($XML->Record->xpath("//PersonDetails[@RecordDate]")[0])) {
$theDate = $XML->Record->xpath("//PersonDetails[@RecordDate]")[0])->textContent;
} else {
$theDate = "no date";
}
My intention is to have $theDate = 2017-03-31T00:00:00
A valid XPath expression for selecting the attribute node looks like this:
$theDate = $XML->xpath("//Record/PersonDetails/#RecordDate")[0];
echo $theDate; // 2017-03-31T00:00:00
You're mixing SimpleXML and DOM here. Additionally, the expression fetches PersonDetails elements that have a RecordDate attribute, not the attribute itself: [] encloses a condition.
SimpleXML
To fetch the attribute node you need to use //PersonDetails/@RecordDate. In SimpleXML this creates a SimpleXMLElement for a non-existent element node that returns the attribute value when cast to a string. SimpleXMLElement::xpath() returns an array, so you need to cast the first element of that array to a string.
$theDate = (string)$XML->xpath("//PersonDetails/#RecordDate")[0];
DOM
textContent is a property of DOM nodes. It contains the text content of all descendant nodes. But you don't need it in this case. If you use DOMXpath::evaluate(), the Xpath expression can return the string value directly.
$document = new DOMDocument();
$document->loadXml($xmlString);
$xpath = new DOMXpath($document);
$theDate = $xpath->evaluate('string(//PersonDetails/#RecordDate)');
The string typecast is moved into the Xpath expression.
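The question also wanted a "no date" fallback. Since string() returns an empty string when nothing matches, one possible way to handle that is shown below. This is a sketch only; recordDate() is a hypothetical helper name, and note that an attribute that exists but is empty would also fall back to "no date".

```php
<?php
// Hypothetical helper: returns the RecordDate attribute or a fallback text.
function recordDate(string $xmlString): string
{
    $document = new DOMDocument();
    $document->loadXML($xmlString);
    $xpath = new DOMXPath($document);

    // string() yields '' when no attribute node matches the location path.
    $date = $xpath->evaluate('string(//PersonDetails/@RecordDate)');

    return $date !== '' ? $date : 'no date';
}

$withDate = '<Record><PersonDetails RecordDate="2017-03-31T00:00:00">'
    . '<FirstName>Joe</FirstName></PersonDetails></Record>';
$withoutDate = '<Record><PersonDetails><FirstName>Joe</FirstName></PersonDetails></Record>';

var_dump(recordDate($withDate));    // string(19) "2017-03-31T00:00:00"
var_dump(recordDate($withoutDate)); // string(7) "no date"
```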
I'm trying to parse a remote XML file, which is valid:
$xml = simplexml_load_file('http://feeds.feedburner.com/HammersInTheHeart?format=xml');
The root element is feed, and I'm trying to grab it via:
$nodes = $xml->xpath('/feed'); //also tried 'feed', without slash
Except it doesn't find any nodes.
print_r($nodes); //empty array
Or any nodes of any kind, so long as I search for them by tag name, in fact:
$nodes = $xml->xpath('//entry');
print_r($nodes); //empty array
It does find nodes, however, if I use wildcards, e.g.
$nodes = $xml->xpath('/*/*[4]');
print_r($nodes); //node found
What's going on?
Unlike DOM, SimpleXML has no concept of a document object, only elements. So if you load an XML document you always get the document element.
$feed = simplexml_load_file($xmlFile);
var_dump($feed->getName());
Output:
string(4) "feed"
That means that all Xpath expressions have to be either relative to this element or absolute. A plain feed will not work because the context already is the feed element.
But there is another reason. The URL is an Atom feed, so the XML elements are in the namespace http://www.w3.org/2005/Atom. SimpleXML's magic syntax recognizes a default namespace for some calls, but Xpath does not: there is no default namespace in Xpath. You will have to register the namespace with a prefix and use that prefix in your Xpath expressions.
$feed = simplexml_load_file($xmlFile);
$feed->registerXpathNamespace('a', 'http://www.w3.org/2005/Atom');
foreach ($feed->xpath('/a:feed/a:entry[position() < 3]') as $entry) {
var_dump((string)$entry->title);
}
Output:
string(24) "Sharing the goals around"
string(34) "Kouyate inspires Hammers' comeback"
However in SimpleXML the registration has to be done for each object you call the xpath() method on.
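A short sketch of what that means in practice: each entry returned by xpath() is a new SimpleXMLElement, so the prefix is registered again before querying from it. The small Atom document and the example.com URLs below are invented for illustration.

```php
<?php
// Invented miniature Atom feed; only the namespace URI is the real Atom one.
$xmlString = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>First</title><link href="http://example.com/1"/></entry>
  <entry><title>Second</title><link href="http://example.com/2"/></entry>
</feed>
XML;

$feed = simplexml_load_string($xmlString);
$feed->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');

$titles = [];
foreach ($feed->xpath('/a:feed/a:entry') as $entry) {
    // $entry is a different object than $feed: register the prefix again.
    $entry->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');
    foreach ($entry->xpath('a:title') as $title) {
        $titles[] = (string) $title;
    }
}

print_r($titles); // contains "First" and "Second"
```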
Using Xpath with DOM is slightly different but a lot more powerful.
$document = new DOMDocument();
$document->load($xmlFile);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');
foreach ($xpath->evaluate('/a:feed/a:entry[position() < 3]') as $entry) {
var_dump($xpath->evaluate('string(a:title)', $entry));
}
Output:
string(24) "Sharing the goals around"
string(34) "Kouyate inspires Hammers' comeback"
Xpath expressions used with DOMXpath::evaluate() can return scalar values.
It seems that PHP SimpleXML XPath doesn't allow to get results of XPath functions:
$s = new \SimpleXMLElement('<test><node>A</node><node>B</node></test>');
var_dump($s->xpath("count(node)"));
Returns an empty array:
array(0) {
}
Using DOM, on the other hand, returns the expected value 2:
$dom = new \DOMDocument();
$dom->loadXML('<test><node>A</node><node>B</node></test>');
$xpath = new \DOMXPath($dom);
var_dump($xpath->evaluate("count(node)"));
float(2)
Is there a way to do the same directly with SimpleXML?
PHP's SimpleXML only works on queries which return nodesets. count(...) returns a scalar value which is not supported. Use DOMXPath which is much more capable or count the objects in the result array:
var_dump(count($s->xpath("node")));
int(2)
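If the data already lives in a SimpleXMLElement, another option (a different technique from the one above, not pure SimpleXML) is to hand the node to DOM with dom_import_simplexml() and evaluate the scalar expression there. A minimal sketch, with variable names chosen for illustration:

```php
<?php
$s = new SimpleXMLElement('<test><node>A</node><node>B</node></test>');

// Get the DOMElement that backs the SimpleXMLElement (no re-parsing needed).
$node = dom_import_simplexml($s);
$xpath = new DOMXPath($node->ownerDocument);

// Pass the element as context node so the relative path starts at <test>.
$count = $xpath->evaluate('count(node)', $node);

var_dump($count); // float(2)
```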
I have following xml structure:
<stores>
<store>
<name></name>
<address></address>
<custom-attributes>
<custom-attribute attribute-id="country">Deutschland</custom-attribute>
<custom-attribute attribute-id="displayWeb">false</custom-attribute>
</custom-attributes>
</store>
</stores>
How can I get the value of "displayWeb"?
The best solution for this is to use PHP DOM. You may loop through all stores:
$dom = new DOMDocument();
$dom->loadXML( $yourXML);
// With use of child nodes (note: this list also contains whitespace text nodes):
$storeNodes = $dom->documentElement->childNodes;
// Or xpath (an absolute path, so it does not depend on the context node)
$xPath = new DOMXPath( $dom);
$storeNodes = $xPath->query( '/stores/store');
// Store nodes now contain DOMElements which are equivalent to this array:
// 0 => <store><name></name>....</store>
// 1 => <store><name>Another store not shown in your XML</name>....</store>
These use the DOMDocument property documentElement and the DOMElement property childNodes, or DOMXPath. Once you have all stores you may iterate through them with a foreach loop and either collect all custom attributes into an associative array with getElementsByTagName:
foreach( $storeNodes as $node){
// $node should be DOMElement
// of course you can use xPath instead of getElementsByTagName, but this is
// more efficient
$domAttrs = $node->getElementsByTagName( 'custom-attribute');
$attributes = array();
foreach( $domAttrs as $domAttr){
$attributes[ $domAttr->getAttribute( 'attribute-id')] = $domAttr->nodeValue;
}
// $attributes = array( 'country' => 'Deutschland', 'displayWeb' => 'false');
}
Or select the value directly with xPath:
// Inside foreach($storeNodes as $node) loop
$yourAttribute = $xPath->query( "custom-attributes/custom-attribute[@attribute-id='displayWeb']", $node)
->item(0)->nodeValue; // Warning: causes a fatal error when the desired tag is missing
Or when you need just one value from the whole document you could use (as Kirill Polishchuk suggested):
$yourAttribute = $xPath->query( "/stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']")
->item(0)->nodeValue; // Warning: causes a fatal error when the desired tag is missing
Study the manual carefully to understand which type each call returns and what each property contains.
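To avoid the fatal error mentioned in the warnings, the length of the node list can be checked before item(0) is used. A minimal sketch, assuming the sample XML from the question; $displayWeb and $missing are illustrative names and 'noSuchId' is a made-up attribute id.

```php
<?php
// Sample document copied from the question.
$yourXML = <<<'XML'
<stores>
  <store>
    <name></name>
    <address></address>
    <custom-attributes>
      <custom-attribute attribute-id="country">Deutschland</custom-attribute>
      <custom-attribute attribute-id="displayWeb">false</custom-attribute>
    </custom-attributes>
  </store>
</stores>
XML;

$dom = new DOMDocument();
$dom->loadXML($yourXML);
$xPath = new DOMXPath($dom);

// Absolute path, so the result does not depend on the default context node.
$nodes = $xPath->query("/stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']");
$displayWeb = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;

// Same lookup for an id that does not exist: yields null instead of an error.
$none = $xPath->query("/stores/store/custom-attributes/custom-attribute[@attribute-id='noSuchId']");
$missing = $none->length > 0 ? $none->item(0)->nodeValue : null;

var_dump($displayWeb); // string(5) "false"
var_dump($missing);    // NULL
```

Note that the value is the string "false", not a boolean; compare it with === 'true' or === 'false' if a boolean is needed.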
For example, you can parse the XML with DOM: http://php.net/manual/en/book.dom.php
You can use XPath:
stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']
I'd suggest PHP's SimpleXML. That web page has lots of user-supplied examples of use to extract values from the parsed data.
I am trying to extract all img tags from a string. I am using:
$domimg = new DOMDocument();
@$domimg->loadHTML($body);
$images_all = $domimg->getElementsByTagName('img');
foreach ($images_all as $image) {
// do something
}
I want to put the src= values or even the complete img tags into an array or string.
Use saveXML() or saveHTML() on each node to add it to an array:
$img_links = array();
$domimg = new DOMDocument();
$domimg->loadHTML($body);
$images_all = $domimg->getElementsByTagName('img');
foreach ($images_all as $image) {
// Append the XML or HTML of each to an array
$img_links[] = $domimg->saveXML($image);
}
print_r($img_links);
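If only the src values are wanted rather than the full tags, getAttribute() on each element is enough. The sketch below assumes a small HTML fragment in $body (invented for illustration) and uses libxml_use_internal_errors() instead of @ to keep parser warnings quiet.

```php
<?php
// Invented sample input; in the question this comes from elsewhere.
$body = '<p>Intro <img src="a.png" alt="A"> and <img src="b.jpg"></p>';

$domimg = new DOMDocument();
libxml_use_internal_errors(true);   // collect HTML parse warnings silently
$domimg->loadHTML($body);
libxml_clear_errors();

$sources = array();
foreach ($domimg->getElementsByTagName('img') as $image) {
    // getAttribute() returns '' when the attribute is absent.
    $sources[] = $image->getAttribute('src');
}

print_r($sources); // contains "a.png" and "b.jpg"
```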
You could try a DOM parser like simplexml_load_string. Take a look at a similar answer I posted here:
Needle in haystack with array in PHP