PHP: How to handle <![CDATA[ with SimpleXMLElement?

I noticed that when using SimpleXMLElement on a document that contains those CDATA tags, the content is always NULL. How do I fix this?
Also, sorry for spamming about XML here. I have been trying to get an XML based script to work for several hours now...
<content><![CDATA[Hello, world!]]></content>
I tried the first hit on Google if you search for "SimpleXMLElement cdata", but that didn't work.

You're probably not accessing it correctly. You can output it directly or cast it to a string. (In this example the cast is superfluous, since echo converts its argument to a string anyway.)
$content = simplexml_load_string(
'<content><![CDATA[Hello, world!]]></content>'
);
echo (string) $content;
// or with parent element:
$foo = simplexml_load_string(
'<foo><content><![CDATA[Hello, world!]]></content></foo>'
);
echo (string) $foo->content;
You might have better luck with LIBXML_NOCDATA:
$content = simplexml_load_string(
'<content><![CDATA[Hello, world!]]></content>'
, null
, LIBXML_NOCDATA
);
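As a quick check of the first approach, here is a minimal sketch (variable names are illustrative, not from the question). It suggests why the content can look missing: $foo->content is still a SimpleXMLElement object, so a strict comparison against a string fails, while a string cast reads the CDATA text without any extra flag.

```php
<?php
$foo = simplexml_load_string(
    '<foo><content><![CDATA[Hello, world!]]></content></foo>'
);

// The child is an object, not a string, so a strict comparison fails.
$isObject = $foo->content instanceof SimpleXMLElement;
$strictMatch = ($foo->content === 'Hello, world!');

// Casting reads the text content, CDATA included.
$text = (string) $foo->content;

var_dump($isObject);    // bool(true)
var_dump($strictMatch); // bool(false)
echo $text, PHP_EOL;    // Hello, world!
```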

LIBXML_NOCDATA is a libxml flag that you pass in the optional third parameter ($options) of simplexml_load_file(). With it, CDATA sections are merged into ordinary text nodes, so their content shows up as plain strings in the returned object.
$xml = simplexml_load_file($this->filename, 'SimpleXMLElement', LIBXML_NOCDATA);
echo "<pre>";
print_r($xml);
echo "</pre>";
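For a version that can be run without an existing file, the sketch below writes a small sample document to a temporary file first. The sample XML and the temporary path are assumptions made for illustration; only the simplexml_load_file() call mirrors the answer above.

```php
<?php
// Write a small sample document to a temporary file (hypothetical data).
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, '<feed><title><![CDATA[Hello, world!]]></title></feed>');

// The flag goes in the third parameter; the second is the class to instantiate.
$xml = simplexml_load_file($path, 'SimpleXMLElement', LIBXML_NOCDATA);
unlink($path);

if ($xml === false) {
    throw new RuntimeException('Could not parse the XML file');
}

// With the flag set, the dump includes the CDATA content as a plain string.
$dump = print_r($xml, true);
echo $dump;
```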

This did the trick for me:
echo trim($entry->title);

This works perfectly for me.
$content = simplexml_load_string(
$raw_xml
, null
, LIBXML_NOCDATA
);

When to use LIBXML_NOCDATA?
I had the issue when transforming XML to JSON.
$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>");
echo json_encode($xml); // the second argument of json_encode() is a flags bitmask, not a boolean
/* prints
{
"content": {}
}
*/
When you access the SimpleXMLElement object directly, you get the CDATA content:
$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>");
echo $xml->content;
/* prints
Hello, world!
*/
It makes sense to use LIBXML_NOCDATA here because json_encode() doesn't access the SimpleXMLElement in a way that triggers its string casting (I'm guessing a __toString() equivalent).
$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>", null, LIBXML_NOCDATA);
echo json_encode($xml);
/*
{
"content": "Hello, world!"
}
*/
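The XML-to-JSON round trip can be wrapped in a small helper. This is only a sketch: xmlToArray() is a hypothetical name, not a built-in function, and the error handling is one possible choice.

```php
<?php
/**
 * Hypothetical helper: parse an XML string and return it as a nested array.
 */
function xmlToArray(string $rawXml): array
{
    $xml = simplexml_load_string($rawXml, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($xml === false) {
        throw new InvalidArgumentException('Invalid XML');
    }
    // json_encode() serializes the element; decoding with true yields arrays.
    return json_decode(json_encode($xml), true);
}

$result = xmlToArray('<foo><content><![CDATA[Hello, world!]]></content></foo>');
var_dump($result['content']); // string(13) "Hello, world!"
```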

When using the SimpleXMLElement class directly, pass the flag as the second constructor argument:
new SimpleXMLElement($rawXml, LIBXML_NOCDATA);

Related

Parsing XML with PHP with <![CDATA[

I'm parsing XML with PHP using simplexml_load_file, then I json_encode and json_decode in order to get all the info as arrays:
$xml = simplexml_load_file('/var/www/darkglass/wp-content/themes/dark2/assets/xml/artists.xml');
$musicos = json_encode($xml);
$musicos = json_decode($musicos, true);
My problem is that I want to put HTML code inside a tag, but it only works if I add a character before the <![CDATA[, as in the examples below.
This doesn't work:
<band><![CDATA[<a class="abandlink" href="#">Cannibal Corpse</a>]]></band>
This works:
<band>.<![CDATA[<a class="abandlink" href="#">Cannibal Corpse</a>]]></band>
Any idea why this is happening?
You should use the LIBXML_NOCDATA option:
$xml = simplexml_load_file('artists.xml', 'SimpleXMLElement', LIBXML_NOCDATA);
Well, I found a workaround for this problem.
I added substr to the variable, $band = substr($band, 1);, so that the first character is removed, and it works.
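To compare with the workaround, here is a minimal sketch of the LIBXML_NOCDATA route applied to the failing markup from the question. The <artists> root element and the inline string (instead of the artists.xml file) are assumptions made to keep the example self-contained.

```php
<?php
// Hypothetical stand-in for artists.xml; the <artists> root element is an assumption.
$raw = '<artists><band><![CDATA[<a class="abandlink" href="#">Cannibal Corpse</a>]]></band></artists>';

$xml = simplexml_load_string($raw, 'SimpleXMLElement', LIBXML_NOCDATA);
$musicos = json_decode(json_encode($xml), true);

// The markup arrives intact, with no leading character to strip.
echo $musicos['band'], PHP_EOL;
```

With the flag in place, the extra "." and the substr() call should not be needed.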

How do I parse an XML file with SimpleXMLElement and multiple namespaces?

I have an XML file that looks like the example on this site: http://msdn.microsoft.com/en-us/library/ee223815(v=sql.105).aspx
I am trying to parse the XML file using something like this:
$data = file_get_contents('http://mywebsite here');
$xml = new SimpleXMLElement($data);
$str = $xml->Author;
echo $str;
Unfortunately, this is not working, and I suspect it is due to the namespaces. I can dump the $xml using asXML() and it correctly shows the XML data.
I understand I need to insert namespaces somehow, but I'm not sure how. How do I parse this type of XML file?
All you need to do is register the namespace:
$sxe = new SimpleXMLElement($data);
$sxe->registerXPathNamespace("diffgr", "urn:schemas-microsoft-com:xml-diffgram-v1");
$data = $sxe->xpath("//diffgr:diffgram") ;
$data = $data[0];
echo "<pre>";
foreach($data->Results->RelevantResults as $result)
{
echo $result->Author , PHP_EOL ;
}
Output
Ms.Kim Abercrombie
Mr.GustavoAchong
Mr. Samuel N. Agcaoili
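Since the original data source is not available here, the following sketch applies the same steps to a trimmed-down, hypothetical document. The element layout and the author names are invented for illustration; only the namespace URI and the registerXPathNamespace() / xpath() calls come from the answer.

```php
<?php
// Trimmed-down, hypothetical document in the same shape as the DiffGram example.
$data = <<<XML
<root>
  <diffgr:diffgram xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <Results>
      <RelevantResults><Author>Author One</Author></RelevantResults>
      <RelevantResults><Author>Author Two</Author></RelevantResults>
    </Results>
  </diffgr:diffgram>
</root>
XML;

$sxe = new SimpleXMLElement($data);

// Bind a prefix to the namespace URI so XPath expressions can use it.
$sxe->registerXPathNamespace('diffgr', 'urn:schemas-microsoft-com:xml-diffgram-v1');
$matches = $sxe->xpath('//diffgr:diffgram');
if (!$matches) {
    throw new RuntimeException('No diffgram element found');
}

// The children of diffgram are not namespaced, so plain property access works.
$authors = [];
foreach ($matches[0]->Results->RelevantResults as $result) {
    $authors[] = (string) $result->Author;
}
print_r($authors);
```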

How to convert empty XML node into empty string instead of SimpleXMLElement

I have an XML string that sometimes has empty nodes. When parsing this with simplexml_load_string the parser interprets any empty nodes (example <node></node>) to be an empty SimpleXMLElement. I actually would prefer these come through as an empty string, or are just omitted entirely.
I've tried using LIBXML_NOBLANKS as shown below, but it seems to have no effect. Here's some code that demonstrates the situation. The node "p2" is empty:
$xml = "<xml><p1>1</p1><p2></p2><p3>3</p3></xml>";
$obj = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOBLANKS);
header("Content-type: text/plain");
echo "STRING\n-----\n" . $xml;
echo "\n\nOBJ\n---\n" . print_r($obj,1);
echo "\n\nJSON\n----\n" . json_encode($obj);
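Here is a working example that removes empty nodes:
$nodes = $rootNode->xpath("//*[not(node())]");
foreach ($nodes as $node) {
    unset($node[0]);
}
unset($node[0]) is a trick that destroys the node and removes it from its parent. The predicate not(node()) matches elements with no children at all; text()='' does not match <p2></p2>, because an empty element has no text node to compare against. The [0] form is used rather than ->{0}, which is reported to behave differently on PHP 7.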
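Applied to the XML from the question, the removal step might look like the sketch below. It selects elements with the XPath predicate not(node()) and removes them with unset($node[0]); both are choices made for this example rather than the only option, and note that an element carrying only attributes would also be removed.

```php
<?php
$xml = '<xml><p1>1</p1><p2></p2><p3>3</p3></xml>';
$obj = simplexml_load_string($xml);

// Select elements that have no child nodes at all (no text, no child elements).
foreach ($obj->xpath('//*[not(node())]') as $empty) {
    // Unsetting index 0 of an element removes that element from its parent.
    unset($empty[0]);
}

$json = json_encode($obj);
echo $json, PHP_EOL; // {"p1":"1","p3":"3"}
```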

php SimpleXMLElement set text

How do I set the text of a SimpleXMLElement in PHP?
Let's say $node is a SimpleXMLElement. This code:
$node->{0} = "some string";
will result in an extra child node in PHP 7, instead of setting the text content of $node. Use:
$node[0] = "some string";
instead.
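A short, self-contained sketch of the [0] assignment follows. The sample document and the XPath lookup are assumptions added so the snippet can run on its own.

```php
<?php
$xml = new SimpleXMLElement('<a><b><c></c></b></a>');

// xpath() returns an array of matching elements; take the first <c>.
$node = $xml->xpath('//c')[0];

// Assigning to index 0 of the element replaces its own text content.
$node[0] = 'some string';

echo $xml->b->c, PHP_EOL; // some string
```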
Did you look at the basic documentation examples?
From there:
include 'example.php';
$xml = new SimpleXMLElement($xmlstr);
$xml->movie[0]->characters->character[0]->name = 'Miss Coder';
echo $xml->asXML();
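$xml = new SimpleXMLElement('<a><b><c></c></b></a>');
$foundNodes = $xml->xpath('//c');
$foundNode = $foundNodes[0];
// Use [0] rather than ->{0}: on PHP 7 the latter adds a child node instead of setting the text.
$foundNode[0] = "This text will be put inside of c tag.";
echo $xml->asXML();
// outputs the XML declaration followed by <a><b><c>This text will be put inside of c tag.</c></b></a>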
More on my search for this answer here:
How can I set text value of SimpleXmlElement without using its parent?
