PHP SimpleXML Get One Node with Children via Attribute

I have not found a direct solution and would prefer to do this with SimpleXML. I want to get a single node and its children via an attribute (id) taken from the requested URL.
An address is clicked:
www.website.com/index.php#929495820
The XML:
<archive>
<unit id="925535820">
<data>Blah</data>
<link>url</link>
</unit>
<unit id="929495820">
<data>Blah</data>
<link>url</link>
</unit>
<unit id="929495821">
<data>Blah</data>
<link>url</link>
</unit> ... and many more ...
</archive>
I have PHP that turns the entire XML into an array and then slices it to limit what is shown (see below), but what I want is for the URL to select a single entry from the array. Can it be done and, if so, with SimpleXML or DOM? Please say SimpleXML. I have the worst luck working with DOM.
$xml_get = simplexml_load_file('filename.xml');
$xml_array = json_decode(json_encode($xml_get), true);
$master = $xml_array['unit'];
// Show Last 200
$master = array_slice($master, 1, 200);
foreach(array_reverse($master) as $arc)
{
$last_id = $arc['@attributes']['id'];
$last_data = $arc['data'];
$last_link = $arc['link'];
// Do stuff with values...
}

If you want to use SimpleXML you could load the file with simplexml_load_file and check the attributes for your id.
Then use for example a foreach or an xpath expression:
$elm = simplexml_load_file("filename.xml");
foreach ($elm->unit as $item) {
if ((string)$item->attributes()->id === "929495820") {
echo $item->data;
echo "<br>";
echo $item->link;
}
}
// Or, with an XPath expression:
$result = $elm->xpath("/archive/unit[@id='929495820']");
echo $result[0]->data;
echo "<br>";
echo $result[0]->link;
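To tie this back to the id arriving with the request, here is a minimal sketch. The helper name findUnitById and the idea of reading the id from a query parameter are assumptions, not something from the question. The id is restricted to digits so it can be placed in the XPath expression safely.

```php
<?php
// Hypothetical helper: returns the <unit> whose id attribute matches, or null.
function findUnitById(SimpleXMLElement $archive, string $id): ?SimpleXMLElement
{
    // Accept digits only, so the value is safe to embed in the XPath string.
    if (!ctype_digit($id)) {
        return null;
    }
    $matches = $archive->xpath("/archive/unit[@id='" . $id . "']");
    // xpath() gives an empty array (or a falsy value) when nothing matches.
    return $matches ? $matches[0] : null;
}

// Inline sample standing in for filename.xml.
$archive = simplexml_load_string(
    '<archive>'
    . '<unit id="925535820"><data>Blah</data><link>url</link></unit>'
    . '<unit id="929495820"><data>Blah</data><link>url</link></unit>'
    . '</archive>'
);

// Assumption: in a real page the id might come from something like $_GET['id'].
$unit = findUnitById($archive, '929495820');
if ($unit !== null) {
    echo $unit->data, ' ', $unit->link, PHP_EOL;
}
```

Returning null for a missing or malformed id keeps the caller from indexing into an empty result.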

Related

How to extract the text in a SimpleXmlElement object? [duplicate]

Given the php code:
$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;
function traverse($xml) {
$result = "";
foreach($xml->children() as $x) {
if ($x->count()) {
$result .= traverse($x);
}
else {
$result .= $x;
}
}
return $result;
}
$parser = new SimpleXMLElement($xml);
traverse($parser);
I expected the function traverse() to return:
This is a link Title with some text following it.
However, it returns only:
Title
Is there a way to get the expected result using simpleXML (obviously for the purpose of consuming the data rather than just returning it as in this simple example)?
There might be ways to achieve what you want using only SimpleXML, but in this case, the simplest way to do it is to use DOM. The good news is if you're already using SimpleXML, you don't have to change anything as DOM and SimpleXML are basically interchangeable:
// either
$articles = simplexml_load_string($xml);
echo dom_import_simplexml($articles)->textContent;
// or
$dom = new DOMDocument;
$dom->loadXML($xml);
echo $dom->documentElement->textContent;
Assuming your task is to iterate over each <article/> and get its content, your code will look like
$articles = simplexml_load_string($xml);
foreach ($articles->article as $article)
{
$articleText = dom_import_simplexml($article)->textContent;
}
$node->asXML(); // I think this is the simple solution.
So, the simple answer to my question was: SimpleXML can't process this kind of XML. Use DOMDocument instead.
This example shows how to traverse the entire XML. It seems that DOMDocument will work with any XML, whereas SimpleXML requires the XML to be simple.
function attrs($list) {
$result = "";
foreach ($list as $attr) {
$result .= " $attr->name='$attr->value'";
}
return $result;
}
function parseTree($xml) {
$result = "";
foreach ($xml->childNodes AS $item) {
if ($item->nodeType == 1) {
$result .= "<$item->nodeName" . attrs($item->attributes) . ">" . parseTree($item) . "</$item->nodeName>";
}
else {
$result .= $item->nodeValue;
}
}
return $result;
}
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xml);
print parseTree($xmlDoc->documentElement);
You could also load the xml using simpleXML and then convert it to DOM using dom_import_simplexml() as Josh said. This would be useful, if you are using simpleXml to filter nodes for parsing, e.g. using XPath.
However, I don't actually use simpleXML, so for me that would be taking the long way around.
$simpleXml = new SimpleXMLElement($xml);
$xmlDom = dom_import_simplexml($simpleXml);
print parseTree($xmlDom);
Thank you for all the help!
You can get the text node of a DOM element with simplexml just by treating it like a string:
foreach($xml->children() as $x) {
$result .= "$x";
}
However, this prints out:
This is a link
with some text following it.
TitleTitle
..because the text node is treated as one block and there is no way to tell where the child fits in inside the text node. The child node is also added twice because of the other else {}, but you can just take that out.
Sorry if I didn't help much, but I don't think there's any way to find out where the child node fits in the text node unless the xml is consistent (but then, why not use tags). If you know what element you want to strip the text out of, strip_tags() will work great.
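For what it's worth, here is a small sketch of the difference, using the article from the question flattened onto one line. Casting the SimpleXML element to a string gives only its own text, while the DOM view of the same node lists text and element children in document order. The variable names are just for illustration.

```php
<?php
$xml = '<articles><article>This is a link <link>Title</link> with some text following it.</article></articles>';
$article = simplexml_load_string($xml)->article[0];

// SimpleXML string cast: the element's own text, without the <link> child's text.
$ownText = (string) $article;

// DOM view of the same node: text and element children in document order.
$pieces = [];
foreach (dom_import_simplexml($article)->childNodes as $child) {
    $pieces[] = [$child->nodeName, $child->textContent];
}

foreach ($pieces as [$name, $text]) {
    echo $name, ': ', $text, PHP_EOL;
}
```

Text nodes show up with the node name #text, so the position of the child element relative to the text is available from the DOM side.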
This has already been answered, but casting to string (i.e. $sString = (string) $oSimpleXMLNode->TagName) always worked for me.
Try this:
$parser = new SimpleXMLElement($xml);
echo html_entity_decode(strip_tags($parser->asXML()));
That's pretty much equivalent to:
$parser = simplexml_load_string($xml);
echo dom_import_simplexml($parser)->textContent;
Like @tandu said, it's not possible, but if you can modify your XML, this will work:
$xml = <<<EOF
<articles>
<article>
This is a link
</article>
<link>Title</link>
<article>
with some text following it.
</article>
</articles>
EOF;

PHP get nodes value with nested nodes XML

I have an XML file:
<Epo>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/00 <Cmt>(1585, 779)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/00 <Cmt>(420, 54%)</Cmt>;</Sen><Sen>B25G1/102 <Cmt>(60, 8%)</Cmt>;</Sen><Sen>A01B1/02 <Cmt>(47, 6%)</Cmt></Sen></Prg></Fld></Doc>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/02 <Cmt>(3847, 1718)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/02 <Cmt>(708, 41%)</Cmt>;</Sen><Sen>A01B1/022 <Cmt>(347, 20%)</Cmt>;</Sen><Sen>A01B1/028 <Cmt>(224, 13%)</Cmt></Sen></Prg></Fld></Doc>
</Epo>
I want to get the node values, for example: A01B1/00 (1585, 779) - A01B1/00 (420, 54%); B25G1/102 (60, 8%); A01B1/02 (47, 6%)
Then I want to format them into table columns. How can I do that?
My code:
<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->load('test.xml'); //IPCCPC-epoxif-201905
$xpath = new DOMXPath($doc);
$titles = $xpath->query('//Doc/Fld');
foreach ($titles as $title){
echo $title->nodeValue ."<hr>";
}
?>
I cannot separate every node. Please help me.
I've tried to split it down to fetch all the various levels of content, but I think the main problem was just getting the current node text without the child elements text content. Using DOMDocument, the nodeValue is the same as textContent which (from the manual)...
textContent: The text content of this node and its descendants.
DOMDocument isn't the easiest API for a relatively simple hierarchy, because you have to keep calling getElementsByTagName() (in this case) to fetch the enclosed elements. The following source shows how you can get at each part of the document using this method...
foreach ( $doc->getElementsByTagName("Doc") as $item ) {
echo "upd=".$item->getAttribute("upd").PHP_EOL;
foreach ( $item->getElementsByTagName("Fld") as $fld ) {
echo "name=".$fld->getAttribute("name").PHP_EOL;
foreach ( $fld->getElementsByTagName("Sen") as $sen ) {
echo trim($sen->firstChild->nodeValue) ." cmt = ".
$sen->getElementsByTagName("Cmt")[0]->firstChild->nodeValue.PHP_EOL;
}
}
}
Using the SimpleXML API can, however, give a simpler solution. Each level of the hierarchy is accessed using object notation, so ->Doc is used to access the Doc elements off the root node, and the foreach() loops just work off that. You can also see that using just the element name ($sen->Cmt) will give you just the text content of that node and not the descendants (although you have to cast it to a string to get its value from the object) ...
$doc = simplexml_load_file("test.xml");
foreach ( $doc->Doc as $docElemnt ) {
echo "upd=".(string)$docElemnt['upd'].PHP_EOL;
foreach ( $docElemnt->Fld as $fld ) {
echo "name=".(string)$fld['name'].PHP_EOL;
foreach ( $fld->Prg->Sen as $sen ) {
echo trim((string)$sen)."=".trim((string)$sen->Cmt).PHP_EOL;
}
}
}
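To get from there to table columns, one possible sketch is below. It assumes one column per Fld name (IC and CC) and uses an inline, shortened copy of the sample XML. The row layout and the HTML output are my own guess at what "table's column" means in the question.

```php
<?php
$xml = <<<XML
<Epo>
<Doc upd="add">
<Fld name="IC"><Prg><Sen>A01B1/00 <Cmt>(1585, 779)</Cmt></Sen></Prg></Fld>
<Fld name="CC"><Prg><Sen>A01B1/00 <Cmt>(420, 54%)</Cmt>;</Sen><Sen>B25G1/102 <Cmt>(60, 8%)</Cmt>;</Sen></Prg></Fld></Doc>
</Epo>
XML;

$rows = [];
foreach (simplexml_load_string($xml)->Doc as $doc) {
    $row = [];
    foreach ($doc->Fld as $fld) {
        $cells = [];
        foreach ($fld->Prg->Sen as $sen) {
            // textContent flattens the code and its <Cmt> in document order.
            $cells[] = trim(dom_import_simplexml($sen)->textContent);
        }
        // One cell per Fld, keyed by its name attribute.
        $row[(string) $fld['name']] = implode(' ', $cells);
    }
    $rows[] = $row;
}

foreach ($rows as $row) {
    echo '<tr><td>', htmlspecialchars($row['IC'] ?? ''), '</td><td>',
        htmlspecialchars($row['CC'] ?? ''), '</td></tr>', PHP_EOL;
}
```

Collecting the rows into an array first keeps the parsing separate from whatever markup the table ends up using.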

Setting Parent Node Variable in SimpleXML and XPath

I am working with PHP and SimpleXML/XPath, and I'm just wondering how to set a certain parent (with a certain attribute value) equal to a variable, which I could use in a 'foreach'?
I'm currently getting this error:
Notice: Array to string conversion
and this output
Array
Thanks for any leads.
Here is the php code:
<?php
$url = "test_b.xml";
$xml = simplexml_load_file($url);
$xml_report_abbrev_b = $xml->xpath('//poster[@name="U-Verify"]')[0];
if($xml_report_abbrev_b){
foreach($xml_report_abbrev_b as $node_a) {
echo '<h1>'.$node_a->xpath('/full_image/@url').'</h1>';
}
} else {
echo 'XPath query failed';
}
?>
Here's the xml:
<data>
<poster name="U-Verify" id="uverify">
<full_image url="u-verify.jpg"/>
<full_other url=""/>
</poster>
<poster name="Minimum" id="min">
<full_image url="min.jpg"/>
<full_other url="spa_min.jpg"/>
</poster>
</data>
Using SimpleXML's element access and attribute access directly instead of using the XPath query will make the code simpler and perform better.
Your code could be reduced to...
$xml_report_abbrev_b = $xml->xpath('//poster[@name="U-Verify"]');
if($xml_report_abbrev_b){
echo '<h1>'.$xml_report_abbrev_b[0]->full_image['url'].'</h1>';
} else {
echo 'XPath query failed';
}
Note how the echo line reads: starting from the <poster> element found by the XPath expression, take its <full_image> element and fetch the url attribute.
I also moved the [0] inside the if, because applying it directly to the xpath() result produces an error when the XPath matches nothing, as there is no element to take a value from.
This outputs...
<h1>u-verify.jpg</h1>
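If you do want to stay with XPath for the attribute as well, keep in mind that xpath() returns an array, which is where the "Array to string conversion" notice comes from. A minimal sketch, with the XML inlined rather than loaded from test_b.xml:

```php
<?php
$xml = simplexml_load_string(
    '<data>'
    . '<poster name="U-Verify" id="uverify"><full_image url="u-verify.jpg"/><full_other url=""/></poster>'
    . '<poster name="Minimum" id="min"><full_image url="min.jpg"/><full_other url="spa_min.jpg"/></poster>'
    . '</data>'
);

$posters = $xml->xpath('//poster[@name="U-Verify"]');

// A relative path (no leading slash) is evaluated from the matched <poster>.
$urls = $posters ? $posters[0]->xpath('full_image/@url') : [];

// Take one element out of the array before casting it to a string.
$url = $urls ? (string) $urls[0] : '';

echo '<h1>' . htmlspecialchars($url) . '</h1>', PHP_EOL;
```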

Disappearing attributes in PHP SimpleXML Object?

I need to return a SimpleXML object converted as a JSON object to work with it in JavaScript. The problem is that there are no attributes on any object with a value.
As an example:
<customer editable="true" maxChars="9" valueType="numeric">69236</customer>
becomes in the SimpleXML object:
"customer":"69236"
Where is the @attributes object?
This has driven me crazy on several occasions. When SimpleXML encounters a node that only has a text value, it drops all the attributes. My workaround has been to modify the XML prior to parsing with SimpleXML. With a bit of regular expressions, you can create a child node that contains the actual text value. For example, in your situation you can change the XML to:
<customer editable="true" maxChars="9" valueType="numeric"><value>69236</value></customer>
Some example code assuming that your XML string was in $str:
$str = preg_replace('/<customer ([^>]*)>([^<>]*)<\/customer>/i', '<customer $1><value>$2</value></customer>', $str);
$xml = @simplexml_load_string($str);
That would preserve the attributes and nest the text value in a child node.
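As a side note, the attributes can still be read from the SimpleXMLElement itself without rewriting the XML first. A small sketch, where the <order> wrapper element is made up for the example:

```php
<?php
// Assumption: <order> is a hypothetical parent element, not from the question.
$xml = simplexml_load_string(
    '<order><customer editable="true" maxChars="9" valueType="numeric">69236</customer></order>'
);

$customer = $xml->customer;

// The text value comes from a string cast, the attributes from array-style access.
$value    = (string) $customer;
$editable = (string) $customer['editable'];
$maxChars = (int) $customer['maxChars'];

echo $value, ' editable=', $editable, ' maxChars=', $maxChars, PHP_EOL;
```

Reading the attributes this way before building the JSON avoids depending on how the whole object is converted.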
I realize this is an old post, but in case it proves useful: the code below extends @ryanmcdonnell's solution to work on any tag instead of a hard-coded one. Hopefully it helps someone.
$str = preg_replace('/<([^ ]+) ([^>]*)>([^<>]*)<\/\\1>/i', '<$1 $2><value>$3</value></$1>', $result);
The main difference is that it replaces /<customer with /<([^ ]+), and </customer> with </\\1>, a backreference that requires the closing tag to match whatever the first group captured.
Then it just adjusts the placeholders ($1, $2, $3) to account for the fact that there are three sub-matches now instead of two.
So it appears that this is a bug and is fixed in PHP 7.4.5.
It's an old question, but I found something that works neat - parse it into a DOMNode object.
// $customer contains the SimpleXMLElement
$customerDom = dom_import_simplexml($customer);
var_dump($customerDom->getAttribute('valueType'));
Will show:
string 'numeric'
Here's some code to iterate through attributes and construct JSON. It supports one or many customers.
If your XML looks like this (or has just one customer)
<xml>
<customer editable="true" maxChars="9" valueType="numeric">69236</customer>
<customer editable="true" maxChars="9" valueType="numeric">12345</customer>
<customer editable="true" maxChars="9" valueType="numeric">67890</customer>
</xml>
Iterate through it like this.
try {
$xml = simplexml_load_file( "customer.xml" );
// Find the customer
$result = $xml->xpath('/xml/customer');
$bFirstElement = true;
echo "var customers = {\r\n";
foreach($result as $node) {
if( $bFirstElement ) {
echo "'". $node."':{\r\n";
$bFirstElement = false;
} else {
echo ",\r\n'". $node."':{\r\n";
}
$bFirstAtt = true;
foreach($node->attributes() as $a => $b) {
if( $bFirstAtt ) {
echo "\t".$a.":'".$b."'";
$bFirstAtt = false;
} else {
echo ",\r\n\t".$a.":'".$b."'";
}
}
echo "}";
}
echo "\r\n};\r\n";
} catch( Exception $e ) {
echo "Exception on line ".$e->getLine()." of file ".$e->getFile()." : ".$e->getMessage()."<br/>";
}
To produce a JSON structure like this
var customers = {
'69236':{
editable:'true',
maxChars:'9',
valueType:'numeric'},
'12345':{
editable:'true',
maxChars:'9',
valueType:'numeric'},
'67890':{
editable:'true',
maxChars:'9',
valueType:'numeric'}
};
Finally, in your script, access the attribute like this
WScript.Echo( customers["12345"].editable );
Good luck
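As an alternative to echoing the object literal by hand, the same map can be collected into an array and handed to json_encode, which takes care of quoting and commas. This is only a sketch, with the XML inlined instead of read from customer.xml:

```php
<?php
$xml = simplexml_load_string(
    '<xml>'
    . '<customer editable="true" maxChars="9" valueType="numeric">69236</customer>'
    . '<customer editable="true" maxChars="9" valueType="numeric">12345</customer>'
    . '</xml>'
);

$customers = [];
foreach ($xml->customer as $customer) {
    $attrs = [];
    foreach ($customer->attributes() as $name => $value) {
        $attrs[$name] = (string) $value;
    }
    // Key each attribute map by the customer's text value.
    $customers[(string) $customer] = $attrs;
}

echo 'var customers = ', json_encode($customers, JSON_PRETTY_PRINT), ';', PHP_EOL;
```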
