PHP simplexml_load_file and CDATA: data missing completely

I have the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<item name="general.global.Event"><![CDATA[EVENT!]]></item>
<item name="general.global.CompanyName"><![CDATA[some name]]></item>
<item name="general.global.CompanyImprint"><![CDATA[Legal information]]></item>
</data>
My code is as follows:
$xml = simplexml_load_file("general.xml") or die("Error: Cannot create object");
print_r($xml);
The output is missing the CDATA content. Why?
SimpleXMLElement Object
(
[item] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[name] => general.global.Event
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[name] => general.global.CompanyName
)
)
[2] => SimpleXMLElement Object
(
[@attributes] => Array
(
[name] => general.global.CompanyImprint
)
)
)
)

Text nodes are not exposed by print_r.
You can see that the data is there if you look at it explicitly:
print $xml->item[0];

The CDATA is being read. If you run print_r($xml->asXML()); you will see that the parser reproduces the CDATA content just fine.
PHP's var_dump and print_r don't give an accurate representation of SimpleXML objects. You can still access the data like this:
foreach ($xml->item as $item) {
if ('general.global.CompanyImprint' === (string)$item['name']) {
var_dump((string)$item);
}
}
// prints
string(17) "Legal information"
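
As a rough illustration of the point above, the sketch below reads the text of every item by casting it to a string. The XML is a shortened inline copy of the question's file, and the variable names ($source, $first, $texts) are made up for this example.

```php
<?php
// Hypothetical inline copy of part of the question's general.xml.
$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<data>
<item name="general.global.Event"><![CDATA[EVENT!]]></item>
<item name="general.global.CompanyName"><![CDATA[some name]]></item>
</data>
XML;

$xml = simplexml_load_string($source);

// Casting an element to string returns its text content, CDATA included.
$first = (string) $xml->item[0];
echo $first, PHP_EOL;

// Build a plain name => text array, which print_r can display faithfully.
$texts = [];
foreach ($xml->item as $item) {
    $texts[(string) $item['name']] = (string) $item;
}
print_r($texts);
```

Copying the values into an ordinary array is one way to get debug output that does not depend on how print_r treats SimpleXML objects.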

Related

SimpleXMLElement object to XML file format

Let's say the output below is my XML object:
SimpleXMLElement Object
(
[data] => SimpleXMLElement Object
(
[delivery_info] => SimpleXMLElement Object
(
[carriage_list] => SimpleXMLElement Object
(
[carriage] => SimpleXMLElement Object
(
[name] => aa
[price] => 0.00
)
)
Now I need to convert it to XML like the example below:
<?xml version='1.0'?>
<document>
<data>
<delivery_info>
<carriage_list>
<carriage>
<name>aa</name>
<price>0.00</price>
</carriage>
</carriage_list>
</delivery_info>
</data>
</document>
How do I do this? I have searched many times already and found no answer.
Please help, thanks.
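You can use the asXML($filename) method of SimpleXMLElement to save to a file, or call asXML() without an argument to get the XML as a string if you don't need a file. Note that casting the element to string returns only its text content, not the markup:
$xml = $element->asXML();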
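
A minimal sketch of both asXML() modes, assuming a small document shaped like the one in the question. The temporary file name carriage_example.xml is hypothetical, as are the variable names.

```php
<?php
// Build an element similar to the structure in the question.
$element = simplexml_load_string(
    '<document><data><delivery_info><carriage_list><carriage>'
    . '<name>aa</name><price>0.00</price>'
    . '</carriage></carriage_list></delivery_info></data></document>'
);

// With no argument, asXML() returns the serialized document as a string.
$markup = $element->asXML();
echo $markup;

// With a path, asXML() writes the document to that file and returns true on success.
$path = sys_get_temp_dir() . '/carriage_example.xml'; // hypothetical file name
$saved = $element->asXML($path);

// A (string) cast yields only the element's text content, not markup.
$price = (string) $element->data->delivery_info->carriage_list->carriage->price;
echo $price, PHP_EOL;
```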

PHP SimpleXML xpath query results

I have searched for this, and the answers I find seem to say what I thought I understood. Obviously I am missing something. I am confused by the results of an XPath query. I have simplified my problem into a test case to post here.
My real XML has several dataset nodes at different depths. Ultimately, I want to get every dataset element with a given label, then loop over those and get the field values (which sit at different locations or depths, so I think I need XPath). I can use XPath to get the dataset elements that I want. However, when I then run XPath on that result object, it returns the fields I want and all the other fields too. I can't figure out why it isn't returning only field1, field2, and field3. When I print_r($value[0]), it shows only the fields I want. But when I run XPath on $value[0], it returns all fields in the XML document.
Sample XML
<myxml>
<dataset label="wanteddata" label2="anotherlabel">
<dataitem>
<textdata>
<field label="label1">field1</field>
<field label="label2">field2</field>
<field label="label3">field3</field>
</textdata>
</dataitem>
</dataset>
<dataset label="unwanteddata" label2="unwantedanotherlabel">
<dataitem>
<textdata>
<field label="label4">field4</field>
<field label="label5">field5</field>
<field label="label6">field6</field>
</textdata>
</dataitem>
</dataset>
</myxml>
Here is the test code.
$xmlstring = file_get_contents('simplexml_test.xml');
$xml = simplexml_load_string($xmlstring);
if ($xml === false) {
throw new Exception("Failed to load");
}
$value = $xml->xpath('//dataset[@label="wanteddata"]');
print_r($value[0]->xpath('//field'));
Code Output:
Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label1
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label2
)
)
[2] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label3
)
)
[3] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label4
)
)
[4] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label5
)
)
[5] => SimpleXMLElement Object
(
[@attributes] => Array
(
[label] => label6
)
)
)
The expression //field selects all <field> elements in the entire XML document, regardless of the context node on which you call xpath(). To make the expression respect the context node, add a dot (.) at the beginning. In XPath, the dot refers to the current context node:
print_r($value[0]->xpath('.//field'));
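
To make the difference concrete, here is a small sketch that runs both expressions against a trimmed-down copy of the sample XML. The variable names ($datasets, $absolute, $relative, $values) are chosen for this example only.

```php
<?php
// Trimmed-down version of the sample document from the question.
$xml = simplexml_load_string(
    '<myxml>'
    . '<dataset label="wanteddata"><dataitem><textdata>'
    . '<field label="label1">field1</field>'
    . '<field label="label2">field2</field>'
    . '</textdata></dataitem></dataset>'
    . '<dataset label="unwanteddata"><dataitem><textdata>'
    . '<field label="label4">field4</field>'
    . '</textdata></dataitem></dataset>'
    . '</myxml>'
);

$datasets = $xml->xpath('//dataset[@label="wanteddata"]');

// Absolute path: searches from the document root, ignoring the context node.
$absolute = $datasets[0]->xpath('//field');

// Relative path: searches only the descendants of the context node.
$relative = $datasets[0]->xpath('.//field');

// Cast each match to string to read its text content.
$values = [];
foreach ($relative as $field) {
    $values[] = (string) $field;
}

echo count($absolute), ' vs ', count($relative), PHP_EOL;
print_r($values);
```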

Weird behaviour in SimpleXMLElement Object when printing the array

I'm struggling with an array in my SimpleXMLElement Object. Somehow I don't get the expected result when I print the array $node->reference.
print_r($node); shows:
SimpleXMLElement Object
(
[reference] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[resourceIdentifier] => 52chgb7f-1a00-4eaf-ac8a-5d4557f9796a
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[resourceIdentifier] => 52cbccc3-b754-4e88-9238-5d5257f9796a
)
)
)
)
But print_r($node->reference); and print_r($node->reference->children()); both show:
SimpleXMLElement Object
(
[@attributes] => Array
(
[resourceIdentifier] => 52chgb7f-1a00-4eaf-ac8a-5d4557f9796a
)
)
I expect to see:
Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[resourceIdentifier] => 52chgb7f-1a00-4eaf-ac8a-5d4557f9796a
)
)
[1] => SimpleXMLElement Object
(
[@attributes] => Array
(
[resourceIdentifier] => 52cbccc3-b754-4e88-9238-5d5257f9796a
)
)
)
Edit
Here is some code to reproduce:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8" ?>
<items>
<item>
<reference resourceIdentifier="52chgb7f-1a00-4eaf-ac8a-5d4557f9796a" />
<reference resourceIdentifier="52cbccc3-b754-4e88-9238-5d5257f9796a" />
</item>
<item>
<reference resourceIdentifier="52chgb7f-1a00-4eaf-ac8a-5d4557f9796a" />
</item>
<item>
<reference resourceIdentifier="52chgb7f-1a00-4eaf-ac8a-5d4557f9796a" />
<reference resourceIdentifier="52chgb7f-1a00-4eaf-ac8a-5d4557f9796a" />
<reference resourceIdentifier="52cbccc3-b754-4e88-9238-5d5257f9796a" />
</item>
</items>';
$items = new \SimpleXMLElement($xml);
foreach ($items as $item) {
echo '<h1>Item</h1>';
echo '<pre>';
print_r($item);
print_r($item->reference); // Returns always 1 SimpleXMLElement Object?
print_r($item->reference->children()); // Returns always 1 SimpleXMLElement Object?
echo '</pre>';
}
The simple answer is: don't rely on print_r. The thing with SimpleXML is that it uses a lot of, for want of a better word, "magic", and print_r (and var_dump, var_export, and pretty much any other generic debug or serialize function) doesn't show you how it will behave. Also, and this is really important, a SimpleXMLElement does not contain any arrays.
I wrote a dedicated debug function which, while not perfect, does a better job of recursing through SimpleXML objects than the native ones.
The reason for this specific behaviour is that you can use $node->reference to refer to either the list of all children called reference, or the first such child. The following are all equivalent:
// Access as iterable list
foreach ( $node->reference as $ref ) {
echo $ref['resourceIdentifier'];
// only loop once
break;
}
// Access as numerically indexed array
echo $node->reference[0]['resourceIdentifier'];
// Access first item by default
echo $node->reference['resourceIdentifier'];
This is extremely handy when you have a document that is "deep but narrow", e.g.
$xml = simplexml_load_string('<foo><bar><baz><quux hello="world" /></baz></bar></foo>');
echo $xml->bar->baz->quux['hello']; // world
Rather than you having to check whether a node is unique or multiple, SimpleXML just lets you write such an expression and ignore any multiples:
$xml = simplexml_load_string('<foo><bar><baz><quux hello="world" /><quux ignored="true" /></baz></bar><bar>ignored</bar></foo>');
echo $xml->bar->baz->quux['hello']; // world
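
When you do want every matching child rather than the first one, iterating or counting works as expected. The following is a minimal sketch using two of the reference elements from the question; $total and $ids are names invented for the example.

```php
<?php
// One <item> with two <reference> children, as in the question.
$node = simplexml_load_string(
    '<item>'
    . '<reference resourceIdentifier="52chgb7f-1a00-4eaf-ac8a-5d4557f9796a" />'
    . '<reference resourceIdentifier="52cbccc3-b754-4e88-9238-5d5257f9796a" />'
    . '</item>'
);

// count() reports how many <reference> children there are,
// even though print_r($node->reference) displays only the first.
$total = count($node->reference);

// foreach visits every matching child in document order.
$ids = [];
foreach ($node->reference as $ref) {
    $ids[] = (string) $ref['resourceIdentifier'];
}

echo $total, PHP_EOL;
print_r($ids);
```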

Could not read data from XML file because of the <![CDATA[]]> tag

test.xml
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0"
xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:wp="http://wordpress.org/export/1.2/"
>
<item>
<title>Hello world!</title>
<link>http://localhost/wordpress/?p=1</link>
<category><![CDATA[Uncategorized]]></category>
<HEADLINE><![CDATA[Does Sleep Apnea Offer Some Protection During Heart Attack?]]></HEADLINE>
</item>
</rss>
I used this code to read the XML file:
<?php
if (file_exists('test.xml')) {
$xml = simplexml_load_file('test.xml');
echo "<pre>";
print_r($xml);
echo "</pre>";
} else {
exit('Failed to open test.xml.');
}
?>
Output
SimpleXMLElement Object
(
[@attributes] => Array
(
[version] => 2.0
)
[item] => SimpleXMLElement Object
(
[title] => Hello world!
[link] => http://localhost/wordpress/?p=1
[category] => SimpleXMLElement Object
(
)
[HEADLINE] => SimpleXMLElement Object
(
)
)
)
But I have a problem with these tags in the XML:
<category><![CDATA[Uncategorized]]></category>
<HEADLINE><![CDATA[Does Sleep Apnea Offer Some Protection During Heart Attack?]]></HEADLINE>
I cannot read their data because of the <![CDATA[]]> in category and HEADLINE.
I need this output:
SimpleXMLElement Object
(
[@attributes] => Array
(
[version] => 2.0
)
[item] => SimpleXMLElement Object
(
[title] => Hello world!
[link] => http://localhost/wordpress/?p=1
[category] => Uncategorized
[HEADLINE] => Does Sleep Apnea Offer Some Protection During Heart Attack?
)
)
Try this when loading the file:
$xml = simplexml_load_file('test.xml', 'SimpleXMLElement', LIBXML_NOCDATA);
Or simply try this:
var_dump(
(string)$xml->item->category,
(string)$xml->item->HEADLINE
);
It's worth noting that SimpleXML builds things on the fly, so print_r() on its objects does not necessarily display useful information.
Try casting the
[category] => SimpleXMLElement Object
(
)
[HEADLINE] => SimpleXMLElement Object
(
)
elements to strings, and I think you will get what you need. Child elements are accessed with ->, while square brackets read attributes:
$category = (string) $item->category;
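
The two suggestions above can be compared side by side. This is a sketch under the assumption of a reduced inline version of test.xml; $plain, $merged, and $dump are illustrative names.

```php
<?php
// Reduced inline version of the question's test.xml.
$source = '<rss version="2.0"><item>'
    . '<title>Hello world!</title>'
    . '<category><![CDATA[Uncategorized]]></category>'
    . '</item></rss>';

// Option 1: load normally and cast the element to string.
$plain = simplexml_load_string($source);
$category = (string) $plain->item->category;
echo $category, PHP_EOL;

// Option 2: LIBXML_NOCDATA merges CDATA sections into ordinary text nodes,
// so debug output such as print_r can display the content.
$merged = simplexml_load_string($source, 'SimpleXMLElement', LIBXML_NOCDATA);
$dump = print_r($merged, true);
echo $dump;
```

The cast works regardless of the loader flags, so it is the safer choice when you only need the values; the flag mainly helps when inspecting the object during debugging.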

How to set attribute for nodes with text content?

I am trying to iterate over a set of nodes returned by XPath and set a certain attribute on each node. However, it works only for nodes without content or with empty (whitespace) content. I have tried two approaches with the same result (maybe they are the same on some deeper level, I don't know). The commented line is the second approach.
$temp = simplexml_load_string (
'<toolbox>
<hammer/>
<screwdriver> </screwdriver>
<knife>
sharp
</knife>
</toolbox>' );
echo "vanilla toolbox: ";
print_r($temp);
$nodes = $temp->xpath('//*[not(@id)]');
foreach($nodes as $obj) {
$tempdom = dom_import_simplexml($obj);
$tempdom->setAttributeNode(new DOMAttr('id', 5));
//$obj->addAttribute('bagr', 5);
}
echo "processed toolbox: ";
print_r($temp);
This is the output. The id attribute is missing from the knife node:
vanilla toolbox: SimpleXMLElement Object
(
[hammer] => SimpleXMLElement Object
(
)
[screwdriver] => SimpleXMLElement Object
(
[0] =>
)
[knife] =>
sharp
)
processed toolbox: SimpleXMLElement Object
(
[@attributes] => Array
(
[id] => 5
)
[hammer] => SimpleXMLElement Object
(
[@attributes] => Array
(
[id] => 5
)
)
[screwdriver] => SimpleXMLElement Object
(
[@attributes] => Array
(
[id] => 5
)
[0] =>
)
[knife] =>
sharp
I'm unable to reproduce what you describe. The changed XML is:
<?xml version="1.0"?>
<toolbox id="5">
<hammer id="5"/>
<screwdriver id="5"> </screwdriver>
<knife id="5">
sharp
</knife>
</toolbox>
It's exactly your code. Maybe you're using a different libxml version? See the LIBXML_VERSION constant (codepad viper has 20626 (2.6.26)).
More likely, though, it is just the print_r output for a SimpleXMLElement object.
print_r does not output the attributes of an element that has text content, even on a brand new object, but it's still possible to access the attribute.
You will see when you print_r($temp->knife['id']); that the attribute is set (as you can see in the earlier XML output).
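
A short sketch of that check, using the addAttribute() approach from the question's commented-out line rather than the DOM import, and a reduced toolbox document. $knifeId and $markup are names chosen for this example.

```php
<?php
// Reduced version of the question's toolbox document.
$temp = simplexml_load_string('<toolbox><hammer/><knife> sharp </knife></toolbox>');

// Add an id to every element that does not already have one.
foreach ($temp->xpath('//*[not(@id)]') as $node) {
    $node->addAttribute('id', '5');
}

// Reading the attribute directly shows that it was set on <knife>.
$knifeId = (string) $temp->knife['id'];
echo $knifeId, PHP_EOL;

// Serializing the document is a more reliable check than print_r.
$markup = $temp->asXML();
echo $markup;
```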
