Parsing XML with PHP with <![CDATA[ - php

I'm parsing XML with PHP using simplexml_load_file, then I json_encode and json_decode in order to get all the info as arrays:
$xml = simplexml_load_file('/var/www/darkglass/wp-content/themes/dark2/assets/xml/artists.xml');
$musicos = json_encode($xml);
$musicos = json_decode($musicos, true);
My problem is that I want to put HTML code inside a tag, but it only works if I add a character before the <![CDATA[, as in the example below:
This doesn't work:
<band><![CDATA[<a class="abandlink" href="#">Cannibal Corpse</a>]]></band>
This works:
<band>.<![CDATA[<a class="abandlink" href="#">Cannibal Corpse</a>]]></band>
Any idea why this is happening?

You should use the LIBXML_NOCDATA option:
$xml = simplexml_load_file('artists.xml', 'SimpleXMLElement', LIBXML_NOCDATA);
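To see the effect of the flag on the json_encode/json_decode round trip from the question, here is a minimal sketch. It uses simplexml_load_string on an inline document instead of the real file, and the <artists> root element is an assumption, since the question does not show the file's structure.

```php
<?php
// Hypothetical inline stand-in for artists.xml; the <artists> root is assumed.
$source = '<artists><band><![CDATA[<a class="abandlink" href="#">Cannibal Corpse</a>]]></band></artists>';

// Without the flag, the CDATA-only element does not come through as a string.
$plain = json_decode(json_encode(simplexml_load_string($source)), true);

// With LIBXML_NOCDATA, CDATA sections are merged into ordinary text nodes.
$merged = json_decode(
    json_encode(simplexml_load_string($source, 'SimpleXMLElement', LIBXML_NOCDATA)),
    true
);

var_dump($plain['band']);
var_dump($merged['band']);
```

With the flag set, $merged['band'] holds the full HTML string, so no placeholder character or substr call is needed.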

Well, I found a workaround for this problem.
I kept the extra character before the <![CDATA[ and added $band = substr($band, 1); so that the first character of the variable is removed, and it works.

Related

<![CDATA[ ]]> returns empty description simplexml_load_file() php [duplicate]

I noticed that when using SimpleXMLElement on a document that contains those CDATA tags, the content is always NULL. How do I fix this?
Also, sorry for spamming about XML here. I have been trying to get an XML based script to work for several hours now...
<content><![CDATA[Hello, world!]]></content>
I tried the first hit on Google if you search for "SimpleXMLElement cdata", but that didn't work.
You're probably not accessing it correctly. You can output it directly or cast it as a string. (in this example, the casting is superfluous, as echo automatically does it anyway)
$content = simplexml_load_string(
'<content><![CDATA[Hello, world!]]></content>'
);
echo (string) $content;
// or with parent element:
$foo = simplexml_load_string(
'<foo><content><![CDATA[Hello, world!]]></content></foo>'
);
echo (string) $foo->content;
You might have better luck with LIBXML_NOCDATA:
$content = simplexml_load_string(
'<content><![CDATA[Hello, world!]]></content>'
, null
, LIBXML_NOCDATA
);
LIBXML_NOCDATA can be passed as the optional third parameter of simplexml_load_file(). It returns the XML object with all CDATA sections merged into ordinary text nodes, so their content shows up as strings.
$xml = simplexml_load_file($this->filename, 'SimpleXMLElement', LIBXML_NOCDATA);
echo "<pre>";
print_r($xml);
echo "</pre>";
Fix CDATA in SimpleXML
This did the trick for me:
echo trim($entry->title);
This works perfectly for me.
$content = simplexml_load_string(
$raw_xml
, null
, LIBXML_NOCDATA
);
When to use LIBXML_NOCDATA?
I had the issue when transforming XML to JSON.
$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>");
echo json_encode($xml);
/* prints
{
"content": {}
}
*/
When accessing the SimpleXMLElement object directly, it does return the CDATA:
$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>");
echo $xml->content;
/* prints
Hello, world!
*/
It makes sense to use LIBXML_NOCDATA here because json_encode doesn't access the SimpleXMLElement in a way that triggers its string casting (I'm guessing a __toString() equivalent).
$xml = simplexml_load_string("<foo><content><![CDATA[Hello, world!]]></content></foo>", null, LIBXML_NOCDATA);
echo json_encode($xml);
/*
{
"content": "Hello, world!"
}
*/
When using the SimpleXMLElement class directly:
new SimpleXMLElement($rawXml, LIBXML_NOCDATA);
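If you would rather not pass the flag, the string cast mentioned above can also be applied by hand before encoding. This is only a sketch for a flat document: the <note> element is an assumption added for illustration, and nested elements would need a recursive version.

```php
<?php
// Sample document; <note> is a hypothetical second element.
$xml = simplexml_load_string(
    '<foo><content><![CDATA[Hello, world!]]></content><note>plain</note></foo>'
);

$values = [];
foreach ($xml->children() as $name => $child) {
    // Casting each child to string reads text and CDATA content alike.
    $values[$name] = (string) $child;
}

echo json_encode($values), "\n";
```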

PHP: How to parse this xml with Simple XML

I need to get the value of url2 from the following xml:
<videoplayer>
<embed_code>aaa</embed_code>
<volume>bbb</volume>
<stats_pixel>
<secret>ccc</secret>
<url>ddd</url>
<url2>HOW TO GET THIS???</url2>
<video_plays>
<site_url>eee</site_url>
</video_plays>
</stats_pixel>
</videoplayer>
This didn't work:
$xml = simplexml_load_file($url);
$xml->videoplayer[0]->stats_pixel->url2;
videoplayer is the root element, which $xml already represents, so you shouldn't specify it. This should work:
echo $xml->stats_pixel->url2;
You might need to encode your URL:
$xml = simplexml_load_file(rawurlencode($url));
var_dump($xml); //make sure you get a SimpleXMLElement here before using it...
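A self-contained sketch of the access path described above, using simplexml_load_string on an inline copy of the document instead of a URL. The value zzz is a placeholder for the real content of url2.

```php
<?php
$source = <<<XML
<videoplayer>
  <embed_code>aaa</embed_code>
  <volume>bbb</volume>
  <stats_pixel>
    <secret>ccc</secret>
    <url>ddd</url>
    <url2>zzz</url2>
    <video_plays><site_url>eee</site_url></video_plays>
  </stats_pixel>
</videoplayer>
XML;

$xml = simplexml_load_string($source);
if ($xml === false) {
    throw new RuntimeException('Could not parse the XML');
}

// $xml already is the root element, so navigation starts at its children.
echo $xml->getName(), "\n";

// Cast to string to get the text rather than a SimpleXMLElement object.
$url2 = (string) $xml->stats_pixel->url2;
echo $url2, "\n";
```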

creating multiple xml nodes with same namespaces in php

I have the following code
$dom = new DOMDocument('1.0', 'utf-8');
$headerNS = $dom->createElementNS('http://somenamespace', 'ttauth:authHeader');
$accesuser = $dom->createElementNS('http://somenamespace', 'ttauth:Accessuser','aassdd');
$accesscode = $dom->createElementNS('http://somenamespace', 'ttauth:Accesscode','aassdd');
$headerNS->appendChild($accesuser);
$headerNS->appendChild($accesscode);
echo "<pre>";
echo ($dom->saveXML($headerNS));
echo "</pre>";
It will produce the following XML as output:
<?xml version="1.0" ?>
<ttauth:authHeader xmlns:ttauth="http://somenamespace">
<ttauth:Accessuser>
ApiUserFor136
</ttauth:Accessuser>
<ttauth:Accesscode>
test1234
</ttauth:Accesscode>
</ttauth:authHeader>
But I want the following output
<ttauth:authHeader xmlns:ttauth="http://somenamespace">
<ttauth:Accessuser xmlns:ttauth="http://somenamespace">
aassdd
</ttauth:Accessuser>
<ttauth:Accesscode xmlns:ttauth="somenamespace">
aassdd
</ttauth:Accesscode>
</ttauth:authHeader>
Notice that the xmlns declaration is not included in elements other than the root element, but I want it to be included in all elements. Is there anything I am doing wrong?
PHP's DOM extension probably does not re-declare the namespace "http://somenamespace" with the same prefix "ttauth" on the child elements because the declaration would be redundant. The two XML documents you show (the actual output and the expected one) are equivalent. If you want the namespace attributes to appear exactly as you wrote them, you can add them manually with createAttribute - http://www.php.net/manual/en/domdocument.createattribute.php. See the following code snippet:
$domAttribute = $domDocument->createAttribute('xmlns:ttauth');
$domAttribute->value = 'http://somenamespace';
$accessuser->appendChild($domAttribute);
Hope it helps
Instead of using
$accesuser = $dom->createElementNS('http://somenamespace', 'ttauth:Accessuser','aassdd');
I used
$accesuser = $dom->createElement('ttauth:Accessuser', 'aassdd');
and then
$accesuser->setAttribute('xmlns:ttauth', 'http://somenamespace');
It works fine for any number of nodes.
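To check the claim that the shorter output is equivalent, the following sketch builds the same tree as in the question and inspects the namespace of each child. The variable names are illustrative, and the header is appended to the document here, which the question's code does not do.

```php
<?php
$ns = 'http://somenamespace';

$dom = new DOMDocument('1.0', 'utf-8');
$header = $dom->createElementNS($ns, 'ttauth:authHeader');
$user = $dom->createElementNS($ns, 'ttauth:Accessuser', 'aassdd');
$code = $dom->createElementNS($ns, 'ttauth:Accesscode', 'aassdd');
$header->appendChild($user);
$header->appendChild($code);
$dom->appendChild($header);

// The declaration is serialized on the root element only.
$serialized = $dom->saveXML($header);
echo $serialized, "\n";

// The children still belong to the namespace, inherited from their parent.
var_dump($user->namespaceURI === $ns);
var_dump($code->lookupNamespaceURI('ttauth') === $ns);
```

A namespace-aware consumer should treat both serializations the same way, so repeating the declaration only matters if the receiving side compares the XML as plain text.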

remove xml version tag when a xml is created in php

I'm creating XML using this:
$customXML = new SimpleXMLElement('<abc></abc>');
After adding some attributes to it, when I try to print it, it appears like this:
<?xml version="1.0"?>
<abc id="332"><params><param name="aa">33</param></params></abc>
Is there a way to remove the xml version node ?
Thank you
In theory you can provide the LIBXML_NOXMLDECL option to drop the XML declaration when saving a document, but this is only available in Libxml >= 2.6.21 (and buggy). An alternative would be to use
$customXML = new SimpleXMLElement('<abc></abc>');
$dom = dom_import_simplexml($customXML);
echo $dom->ownerDocument->saveXML($dom->ownerDocument->documentElement);
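For comparison, here is a sketch that rebuilds the document from the question and prints it both ways. The addAttribute and addChild calls are an assumption about how the attributes were added, since the question does not show that part.

```php
<?php
$customXML = new SimpleXMLElement('<abc></abc>');
$customXML->addAttribute('id', '332');
$params = $customXML->addChild('params');
$param = $params->addChild('param', '33');
$param->addAttribute('name', 'aa');

// asXML() serializes the whole document, including the XML declaration.
$withDeclaration = $customXML->asXML();

// Saving only the root node through DOM leaves the declaration out.
$dom = dom_import_simplexml($customXML);
$withoutDeclaration = $dom->ownerDocument->saveXML($dom);

echo $withDeclaration, "\n";
echo $withoutDeclaration, "\n";
```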
I have a similar solution to the accepted answer:
If you have the XML already loaded in a variable:
$t_xml = new DOMDocument();
$t_xml->loadXML($xml_as_string);
$xml_out = $t_xml->saveXML($t_xml->documentElement);
For XML file from disk:
$t_xml = new DOMDocument();
$t_xml->load($file_path_to_xml);
$xml_out = $t_xml->saveXML($t_xml->documentElement);
This comment helped: http://www.php.net/manual/en/domdocument.savexml.php#88525
A practical solution: you know that the first occurrence of ?> in the result string is going to be the end of the XML declaration. So:
$customXML = new SimpleXMLElement('<abc></abc>');
$xmlString = $customXML->asXML(); // serialize first; casting the element to string would only give its text content
$xmlString = ltrim(substr($xmlString, strpos($xmlString, '?'.'>') + 2));
Note that ?> is split into two parts because otherwise some poor syntax highlighter may have problems parsing at this point.
As SimpleXMLElement always uses "\n" to separate the XML-Declaration from the rest of the document, it can be split at that position and the remainder taken:
explode("\n", $customXML->asXML(), 2)[1];
Example:
<?php
$customXML = new SimpleXMLElement('<!-- some comment -->
<abc>
</abc>');
echo explode("\n", $customXML->asXML(), 2)[1];
Output:
<!-- some comment -->
<abc>
</abc>
There's another way that doesn't involve stripping the XML header afterwards. I prefer this:
$xml = new xmlWriter();
$xml->openMemory();
$xml->startElement('abc');
$xml->writeAttribute('id', 332);
$xml->startElement('params');
$xml->startElement('param');
$xml->writeAttribute('name', 'aa');
$xml->text('33');
$xml->endElement(); // closes param
$xml->endElement(); // closes params
$xml->endElement(); // closes abc
echo $xml->outputMemory(true);
Gives output:
<abc id="332"><params><param name="aa">33</param></params></abc>
Try:
$xmlString = $doc->saveXML();
$xmlString = str_replace("<?xml version=\"1.0\"?>\n", '', $xmlString);
file_put_contents($filename, $xmlString);
echo preg_replace("/<\\?xml.*\\?>/",'',$doc->saveXML(),1);
If this is a problem, this should do it:
$xml = str_replace(' version="1.0"', '', $xml);
$customXML = new SimpleXMLElement('<source><abc>hello</abc></source>');
$result = $customXML->xpath('//abc');
$result = $result[0];
var_dump($result->asXML());
XML header tags and PHP short tags are incompatible, because the <? that opens <?xml is read as a PHP short open tag. So this may be an issue of using short tags in PHP (i.e. <? instead of <?php). Turning off short_open_tag in php.ini (and updating your code accordingly) may resolve this issue.
