I have an XML file:
<Epo>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/00 <Cmt>(1585, 779)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/00 <Cmt>(420, 54%)</Cmt>;</Sen><Sen>B25G1/102 <Cmt>(60, 8%)</Cmt>;</Sen><Sen>A01B1/02 <Cmt>(47, 6%)</Cmt></Sen></Prg></Fld></Doc>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/02 <Cmt>(3847, 1718)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/02 <Cmt>(708, 41%)</Cmt>;</Sen><Sen>A01B1/022 <Cmt>(347, 20%)</Cmt>;</Sen><Sen>A01B1/028 <Cmt>(224, 13%)</Cmt></Sen></Prg></Fld></Doc>
</Epo>
I want to get the node values, for example: A01B1/00 (1585, 779) - A01B1/00 (420, 54%); B25G1/102 (60, 8%); A01B1/02 (47, 6%)
Then I want to format them into table columns. How can I do that?
My code:
<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->load('test.xml'); //IPCCPC-epoxif-201905
$xpath = new DOMXPath($doc);
$titles = $xpath->query('//Doc/Fld');
foreach ($titles as $title){
echo $title->nodeValue ."<hr>";
}
?>
I cannot separate every node. Please help me.
I've broken the code down so that it fetches each level of content, but I think the main problem is getting the text of the current node without the text of its child elements. With DOMDocument, an element's nodeValue is the same as its textContent, which the manual describes as:
textContent The text content of this node and its descendants.
DOMDocument isn't the easiest API for walking a relatively simple hierarchy: you have to keep calling (in this case) getElementsByTagName() to fetch the enclosed elements. The following source shows how you can get at each part of the document this way:
foreach ( $doc->getElementsByTagName("Doc") as $item ) {
    echo "upd=".$item->getAttribute("upd").PHP_EOL;
    foreach ( $item->getElementsByTagName("Fld") as $fld ) {
        echo "name=".$fld->getAttribute("name").PHP_EOL;
        foreach ( $fld->getElementsByTagName("Sen") as $sen ) {
            // firstChild is the text node in front of <Cmt>
            echo trim($sen->firstChild->nodeValue) ." cmt = ".
                $sen->getElementsByTagName("Cmt")->item(0)->firstChild->nodeValue.PHP_EOL;
        }
    }
}
The SimpleXML API, however, gives a simpler solution. Each level of the hierarchy is accessed using object notation, so ->Doc accesses the Doc elements under the root node, and the foreach() loops just work off that. You can also see that casting an element such as $sen or $sen->Cmt to a string gives you just the text content of that node and not of its descendants (the cast is needed to get its value out of the object):
$doc = simplexml_load_file("test.xml");
foreach ( $doc->Doc as $docElement ) {
    echo "upd=".(string)$docElement['upd'].PHP_EOL;
    foreach ( $docElement->Fld as $fld ) {
        echo "name=".(string)$fld['name'].PHP_EOL;
        foreach ( $fld->Prg->Sen as $sen ) {
            echo trim((string)$sen)."=".trim((string)$sen->Cmt).PHP_EOL;
        }
    }
}
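The question also asks for the values as table columns. A minimal sketch of one way to do that with SimpleXML, assuming each Doc has one "IC" and one "CC" field as in the sample (the $rows array and the table markup are illustrative choices, and only the first Doc is included to keep the sample short):

```php
<?php
// Sample input copied from the question (first <Doc> only, for brevity).
$xml = <<<XML
<Epo>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/00 <Cmt>(1585, 779)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/00 <Cmt>(420, 54%)</Cmt>;</Sen><Sen>B25G1/102 <Cmt>(60, 8%)</Cmt>;</Sen><Sen>A01B1/02 <Cmt>(47, 6%)</Cmt></Sen></Prg></Fld></Doc>
</Epo>
XML;

$epo = simplexml_load_string($xml);
$rows = [];
foreach ($epo->Doc as $docElement) {
    $columns = [];
    foreach ($docElement->Fld as $fld) {
        $parts = [];
        foreach ($fld->Prg->Sen as $sen) {
            // (string)$sen is the Sen element's own text only, e.g. "A01B1/00 ;",
            // so strip the surrounding spaces and the trailing semicolon.
            $code = trim((string)$sen, " ;");
            $parts[] = $code . ' ' . trim((string)$sen->Cmt);
        }
        // One column per Fld, keyed by its name attribute ("IC" or "CC").
        $columns[(string)$fld['name']] = implode('; ', $parts);
    }
    $rows[] = $columns;
}

// Print one table row per Doc, one cell per field.
echo "<table>\n";
foreach ($rows as $row) {
    echo '<tr><td>' . htmlspecialchars($row['IC'] ?? '') . '</td><td>'
        . htmlspecialchars($row['CC'] ?? '') . "</td></tr>\n";
}
echo "</table>\n";
```

Collecting the strings in an array first keeps the parsing separate from the output, so the same $rows could feed a CSV export instead of an HTML table.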
Related
I am using PHP DOM to parse XML from another platform, extract certain data from it, and upload it to my own platform. However, I am stuck on extracting one node value only when another node value in the same 'row' element is greater than 0. In the example below, I would like to iterate over the XML and pull out the 'affcustomid' value only if the CPACommission node value is greater than 0. Does anyone have any ideas how I can do this? The code below is a shortened version; in reality, I would get back hundreds of rows in the same format.
<row>
<rowid>1</rowid>
<currencysymbol>€</currencysymbol>
<totalrecords>2145</totalrecords>
<affcustomid>11159_4498302</affcustomid>
<period>7/1/2014</period>
<impressions>0</impressions>
<clicks>1</clicks>
<clickthroughratio>0</clickthroughratio>
<downloads>1</downloads>
<downloadratio>1</downloadratio>
<newaccountratio>1</newaccountratio>
<newdepositingacc>1</newdepositingacc>
<newaccounts>1</newaccounts>
<firstdepositcount>1</firstdepositcount>
<activeaccounts>1</activeaccounts>
<activedays>1</activedays>
<newpurchases>12.4948</newpurchases>
<purchaccountcount>1</purchaccountcount>
<wageraccountcount>1</wageraccountcount>
<avgactivedays>1</avgactivedays>
<netrevenueplayer>11.8701</netrevenueplayer>
<Deposits>12.4948</Deposits>
<Bonus>0</Bonus>
<NetRevenue>11.8701</NetRevenue>
<TotalBetsHands>4</TotalBetsHands>
<Product1Bets>4</Product1Bets>
<Product1NetRevenue>11.8701</Product1NetRevenue>
<Product1Commission>30</Product1Commission>
<Commission>0</Commission>
<CPACommission>30</CPACommission>
</row>
Thanks in advance!
Mark
The easiest way to fetch data from an XML DOM is XPath:
$dom = new DOMDocument();
$dom->load('file.xml');
$xpath = new DOMXPath($dom);
var_dump(
    $xpath->evaluate('string(//row[CPACommission > 0]/affcustomid)')
);
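Note that string() converts only the first matching node, so the call above yields a single ID. To collect the ID of every matching row, the same expression can be run through query() and looped over. A small sketch, assuming a hypothetical <rows> wrapper element and a second made-up row (neither appears in the question):

```php
<?php
// Hypothetical sample data: only the first row has a positive CPACommission.
$xml = '<rows>
  <row><affcustomid>11159_4498302</affcustomid><CPACommission>30</CPACommission></row>
  <row><affcustomid>11159_0000000</affcustomid><CPACommission>0</CPACommission></row>
</rows>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$ids = [];
// The predicate filters the rows; the loop then visits every matching affcustomid.
foreach ($xpath->query('//row[CPACommission > 0]/affcustomid') as $node) {
    $ids[] = $node->textContent;
}
print_r($ids);
```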
It would be easier using SimpleXML:
$doc = simplexml_load_file('file.xml');
foreach ($doc->row as $row) {
    // Cast explicitly: $row->CPACommission is a SimpleXMLElement, not a number
    if ((float)$row->CPACommission > 0) {
        echo $row->affcustomid;
    }
}
But if you still need to use DOMDocument:
$doc = new DOMDocument();
$doc->load('file.xml');
foreach ($doc->getElementsByTagName('row') as $row) {
    // textContent is a string, so cast it before the numeric comparison
    if ((float)$row->getElementsByTagName('CPACommission')->item(0)->textContent > 0) {
        echo $row->getElementsByTagName('affcustomid')->item(0)->textContent;
    }
}
I'm using SimpleXML & PHP to parse an XML element in the following form:
<element>
random text with <inlinetag src="http://url.com/">inline</inlinetag> XML to parse
</element>
I know I can reach inlinetag using $element->inlinetag, but I don't know how to reach it in such a way that I can replace the inlinetag with a link built from its src attribute, without relying on its location in the text. The result would basically have to look like this:
here is a random text with inline XML
This may be a stupid question, I hope someone here can help! :)
I found a way to do this using DOMElement.
One way to replace the element is by cloning it with a different name and attributes. Here is a way to do this, using the accepted answer given on How do you rename a tag in SimpleXML through a DOM object?
function clonishNode(DOMNode $oldNode, $newName, $replaceAttrs = [])
{
    $newNode = $oldNode->ownerDocument->createElement($newName);
    foreach ($oldNode->attributes as $attr)
    {
        // Rename attributes listed in $replaceAttrs, copy all others unchanged
        if (isset($replaceAttrs[$attr->name]))
            $newNode->setAttribute($replaceAttrs[$attr->name], $attr->value);
        else
            $newNode->appendChild($attr->cloneNode());
    }
    foreach ($oldNode->childNodes as $child)
        $newNode->appendChild($child->cloneNode(true));
    $oldNode->parentNode->replaceChild($newNode, $oldNode);
}
Now we use this function to replace each inline element with a clone that has a new element name and a new attribute name. Here comes the tricky part: iterating over all the nodes with foreach will not work as expected. The node list returned by getElementsByTagName() is live, so its length shrinks each time an original node is replaced. Therefore, we keep taking the first element until there are no elements left to clone.
$xml = '<element>
random text with <inlinetag src="http://url.com/">inline</inlinetag> XML to parse
</element>';
$dom = new DOMDocument;
$dom->loadXML($xml);
$nodes = $dom->getElementsByTagName('inlinetag');
echo $dom->saveXML(); //<element>random text with <inlinetag src="http://url.com/">inline</inlinetag> XML to parse</element>
while ($nodes->length > 0) {
    clonishNode($nodes->item(0), 'a', ['src' => 'href']);
}
echo $dom->saveXML(); //<element>random text with <a href="http://url.com/">inline</a> XML to parse</element>
That's it! All that's left to do is to get the content of the element tag.
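One possible way to get that content as a string is to serialize each child of the root element with saveXML(). A minimal sketch, assuming the document already contains the <a> markup (it is hard-coded here so the snippet runs on its own; $inner is an illustrative name):

```php
<?php
// Assumed state of the document after the inline tags have been replaced.
$xml = '<element>random text with <a href="http://url.com/">inline</a> XML to parse</element>';
$dom = new DOMDocument;
$dom->loadXML($xml);

$inner = '';
foreach ($dom->documentElement->childNodes as $child) {
    // saveXML() with a node argument serializes just that node,
    // so text nodes and the <a> element are concatenated in order.
    $inner .= $dom->saveXML($child);
}
$inner = trim($inner);
echo $inner;
```

Serializing the children rather than the root leaves out the surrounding element tags and the XML declaration.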
Is this the result you want to achieve?
<?php
$data = '<element>
random text with
<inlinetag src="http://url.com/">inline
</inlinetag> XML to parse
</element>';
$xml = simplexml_load_string($data);
foreach ($xml->inlinetag as $resource) {
    // attributes() exposes the tag's attributes, here src
    echo 'Your SRC attribute = ' . $resource->attributes()->src;
}
?>
I have the following XML structure:
<stores>
<store>
<name></name>
<address></address>
<custom-attributes>
<custom-attribute attribute-id="country">Deutschland</custom-attribute>
<custom-attribute attribute-id="displayWeb">false</custom-attribute>
</custom-attributes>
</store>
</stores>
How can I get the value of "displayWeb"?
The best solution for this is to use PHP DOM. You may either loop through all stores:
$dom = new DOMDocument();
$dom->loadXML($yourXML);
// With child nodes (note: childNodes also contains whitespace text nodes):
$storeNodes = $dom->documentElement->childNodes;
// Or with XPath, which returns only the <store> elements:
$xPath = new DOMXPath($dom);
$storeNodes = $xPath->query('/stores/store');
// $storeNodes now contains DOMElements, equivalent to this array:
// 0 => <store><name></name>....</store>
// 1 => <store><name>Another store not shown in your XML</name>....</store>
These use the DOMDocument property documentElement and the DOMNode property childNodes, or DOMXPath. Once you have all stores, you can iterate through them with a foreach loop and collect all custom attributes into an associative array with getElementsByTagName:
foreach ($storeNodes as $node) {
    // $node should be a DOMElement
    // of course you can use XPath instead of getElementsByTagName, but this is
    // more efficient
    $domAttrs = $node->getElementsByTagName('custom-attribute');
    $attributes = array();
    foreach ($domAttrs as $domAttr) {
        $attributes[$domAttr->getAttribute('attribute-id')] = $domAttr->nodeValue;
    }
    // $attributes = array( 'country' => 'Deutschland', 'displayWeb' => 'false');
}
Or select the attribute directly with XPath:
// Inside foreach($storeNodes as $node) loop
$yourAttribute = $xPath->query("custom-attributes/custom-attribute[@attribute-id='displayWeb']", $node)
    ->item(0)->nodeValue; // Warning: causes a fatal error when the desired tag is missing
Or, when you need just one value from the whole document, you could use (as Kirill Polishchuk suggested):
$yourAttribute = $xPath->query("/stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']")
    ->item(0)->nodeValue; // Warning: causes a fatal error when the desired tag is missing
Study the manual carefully to understand which type each call returns and what each property contains.
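Putting the pieces together, here is a self-contained sketch that reads "displayWeb" for every store and guards against a missing attribute instead of calling item(0) blindly. The $values array and the null fallback are illustrative choices, not part of the original answer:

```php
<?php
// Sample document copied from the question.
$xml = '<stores>
  <store>
    <name></name>
    <address></address>
    <custom-attributes>
      <custom-attribute attribute-id="country">Deutschland</custom-attribute>
      <custom-attribute attribute-id="displayWeb">false</custom-attribute>
    </custom-attributes>
  </store>
</stores>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$values = [];
foreach ($xpath->query('/stores/store') as $store) {
    // The leading "." keeps the query relative to the current <store>.
    $nodes = $xpath->query(
        './custom-attributes/custom-attribute[@attribute-id="displayWeb"]',
        $store
    );
    // Fall back to null when the store has no displayWeb attribute.
    $values[] = $nodes->length > 0 ? $nodes->item(0)->nodeValue : null;
}
var_dump($values);
```

The value comes back as the string 'false', not a boolean, so compare it with === 'true' rather than relying on truthiness.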
For example, you can parse the XML with PHP DOM: http://php.net/manual/en/book.dom.php
You can use XPath:
stores/store/custom-attributes/custom-attribute[@attribute-id='displayWeb']
I'd suggest PHP's SimpleXML. That web page has lots of user-supplied examples of use to extract values from the parsed data.
I am trying to get the value (text) of a specific node from an xml document using php DOM classes but I cannot do it right because I get the text content of that node merged with its descendants.
Let's suppose that I need to get the trees from this document:
<?xml version="1.0"?>
<trees>
LarchRedwoodChestnutBirch
<trimmed>Larch</trimmed>
<trimmed>Redwood</trimmed>
</trees>
And I get:
LarchRedwoodChestnutBirchLarchRedwood
You can see that I cannot remove the substring LarchRedwood made by the trimmed trees from the whole text because I would get only ChestnutBirch and it is not what I need.
Any suggestions? (Thanks)
I got it. This works:
function specificNodeValue($node, $implode = true) {
    $value = array();
    if ($node->childNodes) {
        for ($i = 0; $i < $node->childNodes->length; $i++) {
            // Keep only children that are not elements (i.e. that have no tagName)
            if ($node->childNodes->item($i)->nodeType !== XML_ELEMENT_NODE) {
                $value[] = $node->childNodes->item($i)->nodeValue;
            }
        }
    }
    return (is_string($implode) ? implode($implode, $value) : ($implode === true ? implode('', $value) : $value));
}
The given node acts like a root: a child node without a tagName is not an element but a piece of the node's own text, so its value belongs to the node itself.
In a badly structured XML document, a node's own text can be split into several pieces; the function collects them all into an array to get the node's whole value.
Use the function above to get a node's value without the values of its subnodes merged in.
Parameters are:
$node (required) must be a DOMElement object
$implode (optional) whether to return a string (true, the default) or an array (false) made up of the pieces of value. (Pass a string instead of a boolean if you wish to implode the array using that string as the "glue".)
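A shorter alternative, swapping in a different technique from the function above, is to let XPath select only the text nodes that are direct children of the element. A sketch using the document from the question ($pieces is an illustrative name, and whitespace-only text nodes are skipped on purpose):

```php
<?php
$xml = '<?xml version="1.0"?>
<trees>
LarchRedwoodChestnutBirch
<trimmed>Larch</trimmed>
<trimmed>Redwood</trimmed>
</trees>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

$pieces = [];
// text() matches only the text nodes directly under <trees>,
// not the text inside the <trimmed> children.
foreach ($xpath->query('/trees/text()') as $textNode) {
    $piece = trim($textNode->nodeValue);
    if ($piece !== '') {
        $pieces[] = $piece; // ignore the whitespace between elements
    }
}
echo implode('', $pieces);
```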
You can try this to remove the trimmed node
$doc = new DOMDocument('1.0', 'utf-8');
$doc->loadXML($xml);
$xpath = new DOMXpath($doc);
$trees = $doc->getElementsByTagName('trees')->item(0);
foreach ($xpath->query('/trees/*') as $node) {
    $trees->removeChild($node);
}
echo $trees->textContent;
echo $trees->nodeValue;
For an element, $node->nodeValue and $node->textContent return the same thing: all text from the current node and all of its child nodes. To get only the node's own text, read the nodeValue of its child text nodes.
Ideally, the XML should be:
<?xml version="1.0"?>
<trees>
<tree>Larch</tree>
<tree>Redwood</tree>
<tree>Chestnut</tree>
<tree>Birch</tree>
</trees>
To split "LarchRedwoodChestnutBirch" into separate words (by capital letter), you'll need to use PHP's "PCRE" functions:
http://www.php.net/manual/en/book.pcre.php
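For instance, a split on capital letters could look like the sketch below. It assumes every tree name starts with exactly one uppercase ASCII letter, which holds for the sample text but is not guaranteed in general:

```php
<?php
$text = 'LarchRedwoodChestnutBirch';

// The lookahead (?=[A-Z]) matches the empty position before each capital
// letter, so the string is cut there without consuming any characters.
// PREG_SPLIT_NO_EMPTY drops the empty piece before the first capital.
$trees = preg_split('/(?=[A-Z])/', $text, -1, PREG_SPLIT_NO_EMPTY);
print_r($trees);
```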
Hope that helps!
I'm trying to get the contents of the XML:
$xmlstr = "<?xml version=\"1.0\" ?>
<article>
<Art>
<test>Hello</test>
</Art>
<Another>
<g>gooo</g>
</Another>
</article>";
$dom = domxml_open_mem($xmlstr);
$calcX = &$dom->xpath_new_context();
$cnt = $calcX->xpath_eval($querystring);
foreach ($cnt->nodeset as $node)
{
    print_r($node);
}
Is there a way that I can get the content when the querystring is //article/Art?
What I'm looking for is:
<test>Hello</test>
If I use $node->get_content(), then the result is Hello.
I'm working with PHP4, so I'm unable to use SimpleXML [which is PHP5]. $node is DOMElement.
$node->nodeValue causes:
Notice: Undefined property: nodeValue in .php on line 19
Is there a way that I can get the content when the querystring is //article/Art?
What I'm looking for is:
<test>Hello</test>
Use:
/article/Art/*
This means: Select all elements that are children of an Art element that is a child of the top element named article.
If you want all nodes below /article/Art, use:
/article/Art/node()
This selects all elements, text nodes, comment nodes and processing-instruction nodes that are children of an Art element that is a child of the top element named article.
It's been a while since I used the old PHP4 DOM extension, but try
DomNode->dump_node - Dumps a single node
Example:
foreach ($cnt->nodeset as $node) {
    // dump_node is a method, so it needs the call parentheses
    echo $node->dump_node();
}
If the above doesn't do what you are looking for, try the other dump_* methods in DomDocument.