PHP: get attributes value of xml - php

I have the following XML structure:
<stores>
<store>
<name></name>
<address></address>
<custom-attributes>
<custom-attribute attribute-id="country">Deutschland</custom-attribute>
<custom-attribute attribute-id="displayWeb">false</custom-attribute>
</custom-attributes>
</store>
</stores>
How can I get the value of "displayWeb"?

The best solution for this is to use PHP's DOM extension. You can loop through all stores:
$dom = new DOMDocument();
$dom->loadXML( $yourXML);
// With use of child elements:
$storeNodes = $dom->documentElement->childNodes;
// Or xpath
$xPath = new DOMXPath( $dom);
$storeNodes = $xPath->query( '/stores/store');
// Store nodes now contain DOMElements which are equivalent to this array:
// 0 => <store><name></name>....</store>
// 1 => <store><name>Another store not shown in your XML</name>....</store>
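These use either the DOMDocument property documentElement together with the childNodes property, or DOMXPath. Note that childNodes also contains whitespace text nodes unless they are stripped, so the XPath variant is the safer of the two. Once you have all stores you can iterate through them with a foreach loop and collect the custom attributes into an associative array with getElementsByTagName: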
foreach( $storeNodes as $node){
// $node should be DOMElement
// of course you can use XPath instead of getElementsByTagName, but this is
// more efficient
$domAttrs = $node->getElementsByTagName( 'custom-attribute');
$attributes = array();
foreach( $domAttrs as $domAttr){
$attributes[ $domAttr->getAttribute( 'attribute-id')] = $domAttr->nodeValue;
}
// $attributes = array( 'country' => 'Deutschland', 'displayWeb' => 'false');
}
Or select the attribute directly with XPath:
// Inside foreach($storeNodes as $node) loop
$yourAttribute = $xPath->query( "custom-attributes/custom-attribute[@attribute-id='displayWeb']", $node)
->item(0)->nodeValue; // Warning: causes a fatal error if no matching element exists
Or, when you need just one value from the whole document, you could use (as Kirill Polishchuk suggested):
$yourAttribute = $xPath->query( "/stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']")
->item(0)->nodeValue; // Warning: causes a fatal error if no matching element exists
Study the manual carefully to understand which type each call returns and what each property contains.
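The fatal error mentioned in the comments above can be avoided by checking the result list before dereferencing it. A minimal sketch, assuming the XML from the question is available as a string (the variable names $nodes and $displayWeb are only illustrative):

```php
<?php
// Sample document copied from the question.
$xml = <<<'XML'
<stores>
<store>
<name></name>
<address></address>
<custom-attributes>
<custom-attribute attribute-id="country">Deutschland</custom-attribute>
<custom-attribute attribute-id="displayWeb">false</custom-attribute>
</custom-attributes>
</store>
</stores>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xPath = new DOMXPath($dom);

// query() returns a DOMNodeList, or false if the expression is malformed.
$nodes = $xPath->query(
    "/stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']"
);

// Only read nodeValue when at least one element matched; otherwise use null.
$displayWeb = ($nodes !== false && $nodes->length > 0)
    ? $nodes->item(0)->nodeValue
    : null;

var_dump($displayWeb); // string(5) "false"
```

The value is the string "false", not a boolean, so compare it with === 'false' rather than relying on truthiness.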

For example, you can parse the XML with DOM: http://php.net/manual/en/book.dom.php

You can use XPath:
stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']

I'd suggest PHP's SimpleXML. Its manual page has lots of user-supplied examples showing how to extract values from the parsed data.
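As a rough sketch of the SimpleXML route, assuming the same XML as in the question: element names containing a hyphen have to be accessed with the curly-brace syntax, and attributes are read with array notation. The $attributes array is just an illustrative name.

```php
<?php
// Sample document copied from the question.
$xml = <<<'XML'
<stores>
<store>
<name></name>
<address></address>
<custom-attributes>
<custom-attribute attribute-id="country">Deutschland</custom-attribute>
<custom-attribute attribute-id="displayWeb">false</custom-attribute>
</custom-attributes>
</store>
</stores>
XML;

$stores = simplexml_load_string($xml);

foreach ($stores->store as $store) {
    $attributes = array();
    // Hyphenated element names need the {'...'} property syntax.
    foreach ($store->{'custom-attributes'}->{'custom-attribute'} as $attr) {
        // Cast to string to get plain values instead of SimpleXMLElement objects.
        $attributes[(string)$attr['attribute-id']] = (string)$attr;
    }
    echo $attributes['displayWeb'], PHP_EOL; // false
}
```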

Related

PHP get nodes value with nested nodes XML

I have an XML file:
<Epo>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/00 <Cmt>(1585, 779)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/00 <Cmt>(420, 54%)</Cmt>;</Sen><Sen>B25G1/102 <Cmt>(60, 8%)</Cmt>;</Sen><Sen>A01B1/02 <Cmt>(47, 6%)</Cmt></Sen></Prg></Fld></Doc>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/02 <Cmt>(3847, 1718)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/02 <Cmt>(708, 41%)</Cmt>;</Sen><Sen>A01B1/022 <Cmt>(347, 20%)</Cmt>;</Sen><Sen>A01B1/028 <Cmt>(224, 13%)</Cmt></Sen></Prg></Fld></Doc>
</Epo>
I want to get the node values, for example: A01B1/00 (1585, 779) - A01B1/00 (420, 54%); B25G1/102 (60, 8%); A01B1/02 (47, 6%)
Then I want to format them into table columns. How can I do that?
My code:
<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->load('test.xml'); //IPCCPC-epoxif-201905
$xpath = new DOMXPath($doc);
$titles = $xpath->query('//Doc/Fld');
foreach ($titles as $title){
echo $title->nodeValue ."<hr>";
}
?>
I cannot separate every node. Please help me.
I've split it down to fetch all the various levels of content, but I think the main problem is getting the current node's text without the text content of its child elements. With DOMDocument, nodeValue is the same as textContent, which (from the manual) is:
textContent: The text content of this node and its descendants.
DOMDocument isn't the easiest API for walking a relatively simple hierarchy: you have to keep calling getElementsByTagName() (in this case) to fetch the enclosed elements. The following code shows how to get at each part of the document this way:
foreach ( $doc->getElementsByTagName("Doc") as $item ) {
echo "upd=".$item->getAttribute("upd").PHP_EOL;
foreach ( $item->getElementsByTagName("Fld") as $fld ) {
echo "name=".$fld->getAttribute("name").PHP_EOL;
foreach ( $fld->getElementsByTagName("Sen") as $sen ) {
echo trim($sen->firstChild->nodeValue) ." cmt = ".
$sen->getElementsByTagName("Cmt")[0]->firstChild->nodeValue.PHP_EOL;
}
}
}
Using the SimpleXML API can, however, give a simpler solution. Each level of the hierarchy is accessed using object notation, so ->Doc accesses the Doc elements off the root node, and the foreach() loops just work off that. You can also see that casting an element to a string (for example (string)$sen or (string)$sen->Cmt) gives you just the text directly inside that element and not the text of its descendants (you have to cast it to a string to get its value from the object):
$doc = simplexml_load_file("test.xml");
foreach ( $doc->Doc as $docElemnt ) {
echo "upd=".(string)$docElemnt['upd'].PHP_EOL;
foreach ( $docElemnt->Fld as $fld ) {
echo "name=".(string)$fld['name'].PHP_EOL;
foreach ( $fld->Prg->Sen as $sen ) {
echo trim((string)$sen)."=".trim((string)$sen->Cmt).PHP_EOL;
}
}
}
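To get from these loops to the one-line-per-field output the question asks for, the pieces can be collected into an array first. This is only a sketch under a few assumptions: the sample is shortened to the first Doc, the trailing ";" inside each Sen is treated as a separator and trimmed off, and $rows is an illustrative name.

```php
<?php
// First <Doc> from the question's file, inlined so the sketch is self-contained.
$xml = <<<'XML'
<Epo>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/00 <Cmt>(1585, 779)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/00 <Cmt>(420, 54%)</Cmt>;</Sen><Sen>B25G1/102 <Cmt>(60, 8%)</Cmt>;</Sen><Sen>A01B1/02 <Cmt>(47, 6%)</Cmt></Sen></Prg></Fld></Doc>
</Epo>
XML;

$doc = simplexml_load_string($xml);
$rows = array();

foreach ($doc->Doc as $docElement) {
    $row = array();
    foreach ($docElement->Fld as $fld) {
        $parts = array();
        foreach ($fld->Prg->Sen as $sen) {
            // (string)$sen is only the element's own text, e.g. "A01B1/00 ;",
            // so strip the whitespace and the trailing separator.
            $code = trim((string)$sen, " ;\t\r\n");
            $parts[] = $code . ' ' . trim((string)$sen->Cmt);
        }
        // One table cell per Fld, keyed by its name attribute (IC, CC).
        $row[(string)$fld['name']] = implode('; ', $parts);
    }
    $rows[] = $row;
}

foreach ($rows as $row) {
    echo $row['IC'], ' - ', $row['CC'], PHP_EOL;
}
```

Each entry of $rows can then be written out as one table row, with the IC and CC values as its columns (escape them with htmlspecialchars() when producing HTML).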

PHP XPath issue

Having a real bugger of an Xpath issue. I am trying to match the nodes with a certain value.
Here is an example XML fragment.
http://pastie.org/private/xrjb2ncya8rdm8rckrjqg
I am trying to match a given MatchNumber node value to see if there are two or more. Assuming that this is stored in a variable called $data, I am using the expression below. It's been a while since I've done much XPath, as most things seem to be JSON these days, so please excuse any rookie oversights.
$doc = new DOMDocument;
$doc->load($data);
$xpath = new DOMXPath($doc);
$result = $xpath->query("/CupRoundSpot/MatchNumber[.='1']");
I need to basically match any node that has a Match Number value of 1 and then determine if the result length is greater than 1 ( i.e. 2 or more have been found ).
Many thanks in advance for any help.
Your XML document has a default namespace: xmlns="http://www.fixtureslive.com/".
You have to register this namespace on the DOMXPath object and use the (registered) prefix in your query.
$xpath->registerNamespace ('fl' , 'http://www.fixtureslive.com/');
$result = $xpath->query("/fl:ArrayOfCupRoundSpot/fl:CupRoundSpot/fl:MatchNumber[.='1']");
foreach( $result as $e ) {
echo '.';
}
The following XPath:
/CupRoundSpot[MatchNumber = 1]
Returns all the CupRoundSpot nodes where MatchNumber equals 1. You could then use these nodes further in your PHP code.
Executing:
count(/CupRoundSpot[MatchNumber = 1])
Returns you the total CupRoundSpot nodes found where MatchNumber equals 1.
You have to register the namespace. After that you can use the Xpath count() function. An expression like that will only work with evaluate(), not with query(). query() can only return node lists, not scalar values.
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('fl', 'http://www.fixtureslive.com/');
var_dump(
$xpath->evaluate(
'count(/fl:ArrayOfCupRoundSpot/fl:CupRoundSpot[number(fl:MatchNumber) = 1])'
)
);
Output:
float(2)
DEMO: https://eval.in/130366
To iterate the CupRoundSpot nodes, just use foreach:
$nodes = $xpath->evaluate(
'/fl:ArrayOfCupRoundSpot/fl:CupRoundSpot[number(fl:MatchNumber) = 1]'
);
foreach ($nodes as $node) {
//...
}
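Putting the namespace registration and count() together for the original "two or more" check could look like the sketch below. The sample XML is an assumption, since the linked fragment is not reproduced here; only the root element name and the namespace URI are taken from the answers above.

```php
<?php
// Hypothetical sample data; the real document lives behind the pastie link.
$xml = <<<'XML'
<ArrayOfCupRoundSpot xmlns="http://www.fixtureslive.com/">
  <CupRoundSpot><MatchNumber>1</MatchNumber></CupRoundSpot>
  <CupRoundSpot><MatchNumber>1</MatchNumber></CupRoundSpot>
  <CupRoundSpot><MatchNumber>2</MatchNumber></CupRoundSpot>
</ArrayOfCupRoundSpot>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
// The prefix "fl" is arbitrary; only the URI has to match the document.
$xpath->registerNamespace('fl', 'http://www.fixtureslive.com/');

// count() yields a float, so cast it before comparing.
$count = (int) $xpath->evaluate(
    'count(/fl:ArrayOfCupRoundSpot/fl:CupRoundSpot[fl:MatchNumber = 1])'
);

$hasTwoOrMore = $count > 1;
var_dump($hasTwoOrMore); // bool(true)
```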

PHP Xpath: Get all href's that contain "letter"

Say I have an html file that I have loaded, I run this query:
$url = 'http://www.fangraphs.com/players.aspx';
$html = file_get_contents($url);
$myDom = new DOMDocument;
$myDom->formatOutput = true;
@$myDom->loadHTML($html);
$anchor = $xpath->query('//a[contains(@href,"letter")]');
That gives me a list of these anchors that look like the following:
<a href="players.aspx?letter=Aa">Aa</a>
But I need a way to only get "players.aspx?letter=Aa".
I thought I could try:
$anchor = $xpath->query('//a[contains(@href,"letter")]/@href');
But that gives me a php error saying I couldn't append node when I try the following:
$xpath = new DOMXPath($myDom);
$newDom = new DOMDocument;
$j = 0;
while( $myAnchor = $anchor->item($j++) ){
$node = $newDom->importNode( $myAnchor, true ); // import node
$newDom->appendChild($node);
}
Any idea how to obtain just the value of the href tags that the first query selects?? Thanks!
Use:
//a/@href[contains(., 'letter')]
This selects the href attribute of any a element where the attribute's string value contains the string "letter".
Your XPath query is returning attributes themselves (i.e., DOMAttr objects) rather than elements (i.e., DOMElement objects). That's fine, and that seems to be what you want, but appending them to the document is the problem. A DOMAttr is not a standalone node in the document tree; it's associated with a DOMElement but is not a child in the usual sense. Thus, directly appending a DOMAttr to the document is invalid.
From the W3C specs:
Attr objects inherit the Node interface, but since they are not actually child nodes of the element they describe, the DOM does not consider them part of the document tree. . . . The DOM takes the view that attributes are properties of elements rather than having a separate identity from the elements they are associated with
Either associate the DOMAttr with a DOMElement and append that element, or pull out the DOMAttr's value and use that as you wish.
To just append its plain text value, use its value in a DOMText node and append that. For example, change this line:
$newDom->appendChild($node);
to this:
$newDom->appendChild(new DOMText($node->value));
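If all that is needed is the list of URLs, the attribute values can also be collected into a plain array without building a second document. A small sketch, assuming a hypothetical HTML snippet in place of the real page:

```php
<?php
// Hypothetical markup standing in for the downloaded page.
$html = '<html><body>'
      . '<a href="players.aspx?letter=Aa">Aa</a>'
      . '<a href="about.aspx">About</a>'
      . '<a href="players.aspx?letter=Ab">Ab</a>'
      . '</body></html>';

// Collect parser warnings internally instead of emitting them;
// real-world HTML is rarely well formed.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$hrefs = array();
// Each result is a DOMAttr; its value property holds the attribute text.
foreach ($xpath->query('//a/@href[contains(., "letter")]') as $attr) {
    $hrefs[] = $attr->value;
}

print_r($hrefs);
```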
Try this:
$xml_string = 'your xml string';
$xml = simplexml_load_string($xml_string);
foreach($xml->a[0]->attributes() as $href => $value) {
$myAnchorsValues[] = $value;
}
var_dump($myAnchorsValues);

Get instance of nodes by their name in SimpleXML (PHP)

I'd like to search for nodes with the same node name in a SimpleXML Object no matter how deep they are nested and create an instance of them as an array.
In the HTML DOM I can do that with JavaScript by using getElementsByTagName(). Is there a way to do that in PHP as well?
Yes, use XPath:
$xml->xpath('//div');
Here $xml is your SimpleXML object.
In this example you will get array of all 'div' elements
$fname = dirname(__FILE__) . '\\xml\\crRoll.xml';
$dom = new DOMDocument;
$dom->load($fname, LIBXML_DTDLOAD|LIBXML_DTDATTR);
$root = $dom->documentElement;
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('cr', "http://www.w3.org/1999/xhtml");
$candidateNodes = $xpath->query("//cr:break");
foreach ($candidateNodes as $child) {
$max = $child->getAttribute('tstamp');
}
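This finds all the break nodes using XPath and reads their tstamp attribute.
getElementsByTagName() is only available in the DOM extension (DOMDocument::getElementsByTagName). However, you can convert between SimpleXML and DOM with dom_import_simplexml() and simplexml_import_dom(), or simply use DOMDocument to parse the XML.
Another answer mentioned XPath. Note that //div also matches nested elements, so the same content can show up more than once if you have something like: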
<div><div>1</div></div>
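A quick sketch of that nesting effect, assuming a hypothetical wrapper element named root: //div matches both the outer and the inner div, while adding not(ancestor::div) keeps only the outermost ones.

```php
<?php
// "root" is an assumed wrapper so the fragment forms a complete document.
$xml = simplexml_load_string('<root><div><div>1</div></div></root>');

// Matches every div at any depth, including the nested one.
$all = $xml->xpath('//div');

// Matches only divs that are not inside another div.
$outermost = $xml->xpath('//div[not(ancestor::div)]');

echo count($all), ' ', count($outermost), PHP_EOL; // 2 1
```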

PHP Dealing with missing XML data

If I have three sets of data, say:
<note><from>Me</from><to>someone</to><message>hello</message></note>
<note><from>Me</from><to></to><message>Need milk & eggs</message></note>
<note><from>Me</from><message>Need milk & eggs</message></note>
and I'm using SimpleXML, is there a way to have it check for an empty or absent tag automatically?
I would like the output to be:
FROM TO MESSAGE
Me someone hello
Me NULL Need milk & eggs
Me NULL Need milk & eggs
Right now I'm doing it manually and I quickly realised that it's going to take a very long time to do it for long xml files.
My current sample code:
$xml = simplexml_load_string($string);
if ($xml->from != "") {$out .= $xml->from."\t";} else {$out .= "NULL\t";}
//repeat for all children, checking by name
Sometimes the order is different as well; there might be an XML document with:
<note><message>pick up cd</message><from>me</from></note>
so iterating through the children and checking by index count doesn't work.
The actual xml files I'm working with are thousands of lines each, so I obviously can't just code in every tag.
It sounds like you need a DTD (Document Type Definition), which will define the required format of the XML file, and specify which elements are required, optional, what they can contain, etc.
DTDs can be used to validate an XML file before you do any processing with it.
Unfortunately, PHP's simplexml library doesn't do anything with DTD, but the DomDocument library does, so you may want to use that instead.
I'll leave it as a separate exercise for you to research how to create a DTD file. If you need more help with that, I'd suggest asking it as a separate question.
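As a rough illustration of DTD validation with DOMDocument, the sketch below embeds a small DTD in the document itself. The DTD is an assumption made up for this example (it requires from, to and message in that order); a real one would need to describe the actual files.

```php
<?php
// Hypothetical inline DTD: a note must contain from, to and message.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (from, to, message)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT message (#PCDATA)>
]>
<note><from>Me</from><message>hello</message></note>
XML;

// Collect validation problems instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadXML($xml);

// validate() checks the document against the DTD named in its DOCTYPE.
$valid = $dom->validate();
var_dump($valid); // bool(false), because <to> is missing

foreach (libxml_get_errors() as $error) {
    echo trim($error->message), PHP_EOL;
}
```

Validation only tells you that a document deviates from the expected structure; filling in NULL for missing fields still has to be done in your own code.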
You could use the DOMDocument instead. I have created a quick demo that splits the <note> elements into an array using the XML tag names as keys. You could then iterate the resultant array to create your output.
I corrected the invalid XML by replacing the ampersand with the equivalent entity (&amp;).
<?php
libxml_use_internal_errors(true);
$xml = <<<XML
<notes>
<note><from>Me</from><to>someone</to><message>hello</message></note>
<note><from>Me</from><to></to><message>Need milk &amp; eggs</message></note>
<note><from>Me</from><message>Need milk &amp; eggs</message></note>
<note><message>pick up cd</message><from>me</from></note>
</notes>
XML;
function getNotes($nodelist) {
$notes = array();
foreach ($nodelist as $node) {
$noteParts = array();
foreach ($node->childNodes as $child) {
$noteParts[$child->tagName] = $child->nodeValue;
}
$notes[] = $noteParts;
}
return $notes;
}
$dom = new DOMDocument();
$dom->recover = true;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$nodelist = $xpath->query("//note");
$notes = getNotes($nodelist);
print_r($notes);
?>
Edit: If you change $noteParts = array(); to $noteParts = array('from' => null, 'to' => null, 'message' => null); then it will always create the full set of keys.
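Since the real files contain too many tag names to list by hand, the set of keys could also be discovered from the data. The following is a sketch with SimpleXML, assuming the corrected sample from above; the two-pass approach and the names $columns and $lines are illustrative choices, not part of the original answer.

```php
<?php
$xml = <<<'XML'
<notes>
<note><from>Me</from><to>someone</to><message>hello</message></note>
<note><from>Me</from><to></to><message>Need milk &amp; eggs</message></note>
<note><from>Me</from><message>Need milk &amp; eggs</message></note>
<note><message>pick up cd</message><from>me</from></note>
</notes>
XML;

$notes = simplexml_load_string($xml);

// Pass 1: collect every tag name that occurs in any note, in first-seen order.
$columns = array();
foreach ($notes->note as $note) {
    foreach ($note->children() as $child) {
        $columns[$child->getName()] = true;
    }
}
$columns = array_keys($columns);

// Pass 2: one tab-separated line per note, NULL for empty or absent tags.
$lines = array(strtoupper(implode("\t", $columns)));
foreach ($notes->note as $note) {
    $cells = array();
    foreach ($columns as $column) {
        // A missing child casts to an empty string, just like an empty one.
        $value = trim((string)$note->$column);
        $cells[] = ($value === '') ? 'NULL' : $value;
    }
    $lines[] = implode("\t", $cells);
}

echo implode(PHP_EOL, $lines), PHP_EOL;
```

Because the columns are looked up by name, the order of the child elements inside each note does not matter.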
