Yet another question about CDATA content not being returned. I've seen many answers, but even after trying them all, all I get back is the literal string "content".
In more detail:
I have an XML file (containing many NewsItem elements):
<NewsML>
<NewsItem>
<NewsComponent>
<ContentItem>
<DataContent>
<body>
<body.content>
<![CDATA[<p>This is what I am trying to retrieve</p>]]>
</body.content>
</body>
</DataContent>
</ContentItem>
</NewsComponent>
</NewsItem>
I am trying to get the content of body.content.
Here is my code:
$xml = simplexml_load_file('path/to/my/xml.xml',null,LIBXML_NOCDATA);
if(count($xml->children()) > 0){
foreach($xml->children() as $element){
$description = (string)$element->NewsComponent->ContentItem->DataContent->body->body.content;
echo $description;
}
}
echo '<pre>';
print_r($xml);
echo '</pre>';
My echo returns:
content
even though I do see the content in the print_r of my xml, as we can see here:
SimpleXMLElement Object
(
[NewsItem] => Array
(
[0] => SimpleXMLElement Object
(
[NewsComponent] => SimpleXMLElement Object
(
[ContentItem] => Array
(
[0] => SimpleXMLElement Object
(
[DataContent] => SimpleXMLElement Object
(
[body] => SimpleXMLElement Object
(
[body.content] => This is what I am trying to retrieve
)
)
)
)
)
)
I tried using (string) or not on the element.
I also tried using
$xml = simplexml_load_file('path/to/my/xml.xml',null,LIBXML_NOCDATA);
vs
$xml = simplexml_load_file('path/to/my/xml.xml',"SimpleXMLElement",LIBXML_NOCDATA);
vs
$xml = simplexml_load_file('path/to/my/xml.xml');
For element names which cannot be PHP identifiers (like body.content), you must use an alternative PHP notation:
$element->NewsComponent->ContentItem->DataContent->body->{'body.content'};
I think your example returns 'content' because PHP parses the dot as the string concatenation operator: you are concatenating an element that does not exist
$element->NewsComponent->ContentItem->DataContent->body->body
with the string 'content'. PHP cannot find a constant named content, so in PHP 7 and earlier it raises a notice or warning and assumes you meant the string 'content'. In PHP 8 and later an undefined constant throws an Error instead.
So you need another way to select an element with a dot in its name.
(This problem does not appear to be related to CDATA.)
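To make the brace notation concrete, here is a minimal sketch. It assumes an inline copy of the XML (the string below is a hypothetical stand-in for the real file) and shows that a plain string cast returns the CDATA text without needing LIBXML_NOCDATA:

```php
<?php
// Hypothetical inline sample mirroring the structure of the question's file.
$xmlString = <<<'XML'
<NewsML>
  <NewsItem>
    <NewsComponent>
      <ContentItem>
        <DataContent>
          <body>
            <body.content><![CDATA[<p>This is what I am trying to retrieve</p>]]></body.content>
          </body>
        </DataContent>
      </ContentItem>
    </NewsComponent>
  </NewsItem>
</NewsML>
XML;

$xml = simplexml_load_string($xmlString);

$descriptions = [];
foreach ($xml->NewsItem as $item) {
    // Braces and quotes let the property name contain a dot.
    $node = $item->NewsComponent->ContentItem->DataContent->body->{'body.content'};
    // Casting to string returns the element's text, CDATA included.
    $descriptions[] = trim((string) $node);
}

echo $descriptions[0], "\n";
```

Without the braces, PHP reads `body.content` as `body` concatenated with a constant named `content`, which is where the stray "content" output comes from.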
Clearly I'm missing something simple.
I have the following xml string which I parse using simplexml_load_string:
$xmlString = '<root><Title>Heading Text</Title><Image><img src="https://image.com?id=123" alt="alt text" /></Image></root>';
$xml = simplexml_load_string($xmlString);
However, I cannot access the img tag inside Image.
I would think I would use $xml->Image[0]->img to get the element and
$xml->Image[0]->img['src'] to get the url of the image. But I keep getting the error:
Trying to get property 'img' of non-object
$xml->Image[0] tests out as type SimpleXMLElement, and when I print_r() I get:
SimpleXMLElement Object (
[img] => SimpleXMLElement Object (
[#attributes] => Array (
[src] => https://image.com?id=123
[alt] => alt text
)
)
)
Like I said, I know I'm missing something really obvious, so any help would be appreciated.
You are correct, $xml->Image[0]->img['src'] will give you the src attribute, but it will give it to you as an object.
If you run print_r($xml->Image[0]->img['src']); it will show you this:
SimpleXMLElement Object
(
[0] => https://image.com?id=123
)
But if you run echo $xml->Image[0]->img['src']; instead, it will give you this:
https://image.com?id=123
The reason is that the SimpleXMLElement class implements the magic overload method __toString (or its equivalent in internal C code), so that whenever you cast the object to string, it will give you the string contents. Since echo always needs a string, it does this implicitly, but you can do it explicitly with (string), e.g.:
$imageSrc = (string)$xml->Image[0]->img['src'];
var_dump($imageSrc);
(As a side-note, the [0] is always optional - if you don't give a number, SimpleXML assumes you want the only or first child with that name, so $xml->Image->img['src'], $xml->Image[0]->img['src'], $xml->Image->img[0]['src'] and $xml->Image[0]->img[0]['src'] will all give the same result.)
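As a quick check of the points above, the following sketch reuses the XML string from the question and compares the four spellings. The variable names `$variants` and `$raw` are only illustrative:

```php
<?php
$xmlString = '<root><Title>Heading Text</Title><Image><img src="https://image.com?id=123" alt="alt text" /></Image></root>';
$xml = simplexml_load_string($xmlString);

// All four spellings address the same attribute; the cast turns each into a string.
$variants = [
    (string) $xml->Image->img['src'],
    (string) $xml->Image[0]->img['src'],
    (string) $xml->Image->img[0]['src'],
    (string) $xml->Image[0]->img[0]['src'],
];

// Without the cast, the attribute is still a SimpleXMLElement object.
$raw = $xml->Image->img['src'];
$isObject = $raw instanceof SimpleXMLElement;

// After the cast, every variant is the same plain string.
$allSame = count(array_unique($variants)) === 1;

echo $variants[0], "\n";
```

Casting as soon as you read a value is usually the safer habit, since the object form can behave unexpectedly when stored in arrays, serialized, or compared strictly.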
I cannot get an XML node's contents and attributes at the same time with the SimpleXML library.
I have the following XML and want to get both the name attribute of content and the node's contents:
<page id="id1">
<content name="abc">def</content>
</page>
Method simplexml_load_string()
print_r(simplexml_load_string('<page id="id1"><content name="abc">def</content></page>'));
outputs this:
SimpleXMLElement Object
(
[#attributes] => Array
(
[id] => id1
)
[content] => def
)
As you can see, the contents of the content node are present, but the attributes are missing. How can I get both the contents and the attributes?
Thanks!
The attributes of content are present. This is just a trick of print_r() and how it works with XML objects in memory.
$x = simplexml_load_string('<page id="id1"><content name="abc">def</content></page>');
print_r($x->content);
print_r($x->content['name']);
SimpleXMLElement Object
(
[#attributes] => Array
(
[name] => abc
)
[0] => def
)
SimpleXMLElement Object
(
[0] => abc
)
In simplexml, accessing elements returns SimpleXMLElement objects. You can view the content of these objects using var_dump.
$book=simplexml_load_string('<page id="id1"><content name="abc">def</content></page>');
$content=$book->content;
var_dump($content);
You can iterate over these objects with a foreach loop.
foreach ($book->children() as $child) {
// Print each attribute of the child element as "name: value"
foreach ($child->attributes() as $name => $attrValue) {
print $name . ": " . $attrValue . "\n";
}
// Then print the element's text content
print $child . "\n";
}
You can not only retrieve contents (such as elements and attributes) but also add and remove them. You can also use XPath to navigate values in a complex XML tree. The methods of the SimpleXMLElement class are listed in the PHP manual.
$x = simplexml_load_string('<page id="id1"><content name="abc">def</content></page>');
To get the node's attributes:
$attributes = $x->content->attributes(); //where content is the name of the node
$name = $attributes['name'];
To get the content node's content:
$c = $x->content;
Interestingly, $c can be used both as a string and as an object, i.e.
echo $c; // prints the text content as a string
print_r($c); // prints it out as an object
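Putting the answers above together, a minimal sketch for reading the text and the attributes in one go might look like this. The variable names (`$name`, `$text`, `$pageId`, `$attrs`) are just illustrative:

```php
<?php
$x = simplexml_load_string('<page id="id1"><content name="abc">def</content></page>');

// Array-style access reads an attribute; property-style access reads a child element.
$name   = (string) $x->content['name']; // attribute of <content>
$text   = (string) $x->content;         // text inside <content>
$pageId = (string) $x['id'];            // attribute of the root <page>

// Collect every attribute of <content> into a plain PHP array of strings.
$attrs = [];
foreach ($x->content->attributes() as $key => $value) {
    $attrs[$key] = (string) $value;
}

echo "$name / $text / $pageId\n";
```

The explicit casts matter: without them each variable holds a SimpleXMLElement, which only looks like a string when echoed.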
I have an XML object result from my database containing settings.
I am trying to access the values for a particular settingName:
SimpleXMLElement Object
(
[settings] => Array
(
[0] => SimpleXMLElement Object
(
[settingName] => Test
[settingDescription] => Testing
[requireValue] => 1
[localeID] => 14
[status] => 1
[value] => 66
[settingID] => 5
)
[1] => SimpleXMLElement Object
(
[settingName] => Home Page Stats
[settingDescription] => Show the Top 5 Teammate / Teamleader stats?
[requireValue] => 0
[localeID] => 14
[status] => 0
[value] => SimpleXMLElement Object
(
)
[settingID] => 3
)
)
)
I tried using xPath and have this so far:
$value = $fetchSettings->xpath("//settingName[text()='Test']/../value");
which returns:
Array ( [0] => SimpleXMLElement Object ( [0] => 66 ) )
How can I get the actual value and not just another array/object?
The end result will just be 66 for the example above.
SimpleXMLElement::xpath() returns a plain PHP array of "search results"; the first result will always be index 0 if any results were found.
Each "search result" is a SimpleXMLElement object, which has a magic __toString() method for getting the direct text content of a node (including CDATA, but excluding text inside child nodes, etc). The simplest way to call it is with (string)$my_element; (int)$my_element will also invoke it, then convert the result to an integer.
So:
$xpath_results = $fetchSettings->xpath("//settingName[text()='Test']/../value");
if ( count($xpath_results) > 0 ) {
$value = (string)$xpath_results[0];
}
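Here is a self-contained sketch of the same idea. The XML is rebuilt by hand from the print_r output in the question, so the root element name is an assumption, and `settingValue()` is a hypothetical helper, not part of SimpleXML:

```php
<?php
// Hypothetical XML rebuilt from the print_r output; the <root> name is an assumption.
$fetchSettings = simplexml_load_string(
    '<root>'
    . '<settings><settingName>Test</settingName><value>66</value><settingID>5</settingID></settings>'
    . '<settings><settingName>Home Page Stats</settingName><value/><settingID>3</settingID></settings>'
    . '</root>'
);

// Hypothetical helper: the value of the named setting as a string, or null if not found.
function settingValue(SimpleXMLElement $xml, string $name): ?string
{
    // Assumes $name contains no single quotes, since it is embedded in the XPath string.
    $results = $xml->xpath("//settings[settingName='" . $name . "']/value");

    // empty() covers both "no matches" and a failed query.
    return empty($results) ? null : (string) $results[0];
}

$test    = settingValue($fetchSettings, 'Test');
$empty   = settingValue($fetchSettings, 'Home Page Stats');
$missing = settingValue($fetchSettings, 'No Such Setting');

echo $test, "\n";
```

Returning null for a missing setting and an empty string for an empty `<value/>` keeps the two cases distinguishable, which the print_r output (an empty SimpleXMLElement Object) does not make obvious.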
Alternatively, the DOMXPath class can return results other than element and attribute nodes, due to the DOM's richer object model. For instance, you can have an XPath expression ending //text() to refer to the text content of a node, rather than the node itself (SimpleXML will do the search, but give you an element object anyway).
The downside is it's rather more verbose to use, but luckily you can mix and match the two sets of functions (using dom_import_simplexml() and its counterpart) as they have the same underlying representation:
// WARNING: Untested code. Please comment or edit if you find a bug!
$fetchSettings_dom = dom_import_simplexml($fetchSettings);
$xpath = new DOMXPath($fetchSettings_dom->ownerDocument);
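// Wrapping the expression in string() makes evaluate() return a plain string
// rather than a DOMNodeList of text nodes.
$value = $xpath->evaluate(
    "string(//settingName[text()='Test']/../value/text())",
    $fetchSettings_dom
);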
Because any element in an XML file can appear multiple times, the parser always returns an array. If you are sure that there is only a single item, you can use current():
echo (string) current($value);
Note that I cast the SimpleXMLElement to a string (see http://php.net/manual/simplexmlelement.tostring.php ) to get the actual value.
Use the DOMXPath class instead.
http://php.net/manual/en/domxpath.evaluate.php
The sample from php.net is roughly equivalent to what you'd like to achieve:
<?php
$doc = new DOMDocument;
$doc->load('book.xml');
$xpath = new DOMXPath($doc);
$tbody = $doc->getElementsByTagName('tbody')->item(0);
// our query is relative to the tbody node
$query = 'count(row/entry[. = "en"])';
$entries = $xpath->evaluate($query, $tbody);
echo "There are $entries english books\n";
In this way, you can get values straight from the XML.
I am pretty new to PHP and XML and hope you can help me with this.
Searching the forum didn't help me yet to find an answer to my specific issue.
I have a PHP page with a simplexml array that looks like the following, just longer:
SimpleXMLElement Object
(
[textID] => Array
(
[0] => SimpleXMLElement Object
(
[textID] => 1
[content] => Text1
)
[1] => SimpleXMLElement Object
(
[textID] => 2
[content] => Text2
)
[2] => SimpleXMLElement Object
(
[textID] => 3
[content] => Text3
)
)
)
Now I am trying to echo out a specific value from this array by referring to its ID which is an integer.
The only way I get this working is the following but this just goes by the order within the array, not by the actual ID:
<?php echo $objTexts->textID[1]->content; ?>
Can someone tell me what I am missing here ?
Thanks, Tim
SimpleXML has no way of knowing that the textID identifies which node is which - it is just another element in the XML.
Based on your sample output, your XML is a little confusing as you have multiple elements called textID which each have a single child, also called textID, which has a different meaning. Nonetheless, what you want to do can be achieved either by looping through all the outer textID elements and testing the value of their inner textID element:
foreach ( $objTexts->textID as $item )
{
if ( $item->textID == '2' )
{
...
}
}
Or, you could use XPath, which is a fairly simple query language for XML, and is supported within SimpleXML in the form of the ->xpath() method. In your case, you want to find a textID node which contains a textID child with a particular value, so the code would look something like this:
// ->xpath always returns a plain PHP array - not a SimpleXML object
$xpath_results = $objTexts->xpath('//textID[textID=2]');
// If you're certain you only want the first result:
echo $xpath_results[0]->content;
// If you might want multiple matches
foreach ( $xpath_results as $item )
{
...
}
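Both approaches can be tried on a small hand-made document. This is only a sketch: the XML is reconstructed from the print_r output in the question, the root element name `texts` is an assumption, and `$byId` is an illustrative name for a lookup table built once and reused:

```php
<?php
// Hypothetical XML reconstructed from the question; the <texts> root name is an assumption.
$objTexts = simplexml_load_string(
    '<texts>'
    . '<textID><textID>1</textID><content>Text1</content></textID>'
    . '<textID><textID>2</textID><content>Text2</content></textID>'
    . '<textID><textID>3</textID><content>Text3</content></textID>'
    . '</texts>'
);

// Approach 1: loop once and index the content by the inner textID value.
$byId = [];
foreach ($objTexts->textID as $item) {
    $byId[(int) $item->textID] = (string) $item->content;
}

// Approach 2: let XPath select the outer textID whose inner textID equals 2.
$matches = $objTexts->xpath('//textID[textID=2]');
$viaXpath = empty($matches) ? null : (string) $matches[0]->content;

echo $byId[2], "\n";
echo $viaXpath, "\n";
```

The lookup table is worth building if you need several IDs on the same page; for a single lookup the XPath query is shorter.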
Further to my question here, I'm actually wondering why I'm not getting strings added to my array with the following code.
I get some HTML from an external source with this:
$doc = new DOMDocument();
@$doc->loadHTML($html);
$xml = @simplexml_import_dom($doc); // just to make xpath more simple
$images = $xml->xpath('//img');
$sources = array();
Here is the images array:
Array
(
[0] => SimpleXMLElement Object
(
[#attributes] => Array
(
[alt] => techcrunch logo
[src] => http://s2.wp.com/wp-content/themes/vip/tctechcrunch/images/logos_small/techcrunch2.png?m=1265111136g
)
)
...
)
Then I added the sources to my array with:
foreach ($images as $i) {
array_push($sources, $i['src']);
}
But when I print the results:
echo "<pre>";
print_r($sources);
die();
I get this:
Array
(
[0] => SimpleXMLElement Object
(
[0] => http://www.domain.com/someimages.jpg
)
...
)
Why isn't $i['src'] treated as a string? Isn't the original [src] element noted where I print $images a string inside there?
To put it another way, $images[0] is a SimpleXMLElement, I understand that. But why is the 'src' attribute of THAT object not being put into $sources as a string when I reference it as $i['src']?
Why isn't $i['src'] treated as a string?
Because it isn't one - it's a SimpleXMLElement object that gets cast to a string if used in a string context, but it still remains a SimpleXMLElement at heart.
To make it a real string, force cast it:
array_push($sources, (string) $i['src']);
Because SimpleXMLElement::xpath() (quoting):
Returns an array of SimpleXMLElement objects
and not an array of strings.
So, the items of your $images array are SimpleXMLElement objects, and not strings -- which is why you have to cast them to strings, if you want strings.
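A complete version of the loop with the cast in place could look like the sketch below. The `$html` string is a made-up stand-in for the external page, and `libxml_use_internal_errors()` is used here as an alternative to the `@` operator for silencing parser warnings:

```php
<?php
// Hypothetical HTML standing in for the externally fetched page.
$html = '<html><body><img src="a.png" alt="A"><p>text</p><img src="b.png" alt="B"></body></html>';

// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xml = simplexml_import_dom($doc);

$sources = [];
foreach ($xml->xpath('//img') as $img) {
    // $img['src'] is a SimpleXMLElement; the cast stores a real string.
    $sources[] = (string) $img['src'];
}

libxml_clear_errors();

print_r($sources);
```

With the cast, print_r() shows plain strings in `$sources` rather than nested SimpleXMLElement Object entries.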