How do I extract im:image elements? For instance, I can do this:
$feed=file_get_contents($url);
$xml = new SimpleXMLElement($feed);
$title = $xml->entry[0]->title;
$html = $xml->entry[0]->content;
But I can't get this:
$img = $xml->entry[0]->im;
How do I target those? I'm willing to use DOMDocument() as well.
EDIT:
<entry>
<im:image height="55">
http://foo.com/foo.jpg
</im:image>
</entry>
The im is just a namespace prefix. You want the 'image' element, not im. – Dmitri Snytkine 20 hours ago
Yes, that's true, and the clue I needed.
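Following that hint, here is a minimal sketch of reading the element with SimpleXML's children() method, selecting the namespace by its prefix. The inline feed and the namespace URI http://itunes.apple.com/rss are assumptions for illustration; use whatever URI your feed declares for xmlns:im.

```php
<?php
// Hypothetical sample feed; the xmlns:im URI is an assumption, check your feed's root element.
$feed = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:im="http://itunes.apple.com/rss">
  <entry>
    <title>Example</title>
    <im:image height="55">
      http://foo.com/foo.jpg
    </im:image>
  </entry>
</feed>
XML;

$xml = new SimpleXMLElement($feed);

// children('im', true) selects child elements by namespace prefix
// (the second argument tells SimpleXML that 'im' is a prefix, not a URI).
$image = $xml->entry[0]->children('im', true)->image;

// The element text includes the surrounding whitespace, so trim it.
$imgUrl = trim((string) $image);

// height carries no prefix, so read it via attributes() without a namespace argument.
$height = (string) $image->attributes()->height;

echo $imgUrl, "\n";
```

Passing the full namespace URI instead of the prefix (and omitting the second argument) should work the same way and does not depend on which prefix the feed happens to use.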
Related
I'm trying to read an RSS feed from Flickr but it has some nodes which are not readable by Simple XML (media:thumbnail, flickr:profile, and so on).
How do I get round this? My head hurts when I look at the documentation for the DOM. So I'd like to avoid it as I don't want to learn.
I'm trying to get the thumbnail by the way.
The solution is explained in this nice article. You need the children() method for accessing XML elements which contain a namespace. This code snippet is quoted from the article:
$feed = simplexml_load_file('http://www.sitepoint.com/recent.rdf');
foreach ($feed->item as $item) {
$ns_dc = $item->children('http://purl.org/dc/elements/1.1/');
echo $ns_dc->date;
}
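Applied to the thumbnail from the question, the same approach might look like the sketch below. The inline feed is a made-up stand-in for the Flickr RSS, and the Media RSS URI http://search.yahoo.com/mrss/ is assumed to be the one bound to the media prefix, so check the xmlns:media declaration in your own feed.

```php
<?php
// Made-up stand-in for a Flickr RSS feed (assumption: media is bound to the Media RSS URI).
$rss = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Sample photo</title>
      <media:thumbnail url="http://example.com/thumb.jpg" height="75" width="75"/>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($rss);

$thumbnails = [];
foreach ($feed->channel->item as $item) {
    // Select the children that live in the Media RSS namespace.
    $media = $item->children('http://search.yahoo.com/mrss/');
    // url is an un-prefixed attribute, so attributes() without arguments returns it.
    $thumbnails[] = (string) $media->thumbnail->attributes()->url;
}

print_r($thumbnails);
```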
With the latest version, you can now reference colon nodes with curly brackets.
$item->{'itunes:duration'}
You're dealing with a namespace? I think you need to use the ->children method.
$ns_dc = $item->children('http://namespace.org/');
Can you provide a snippet with the xml declaration?
An even simpler way to access namespaced XML nodes in PHP, without declaring a namespace, is to use DOMDocument.
To get the value of <su:authorEmail> from the following source:
<item>
<title>My important article</title>
<pubDate>Mon, 29 Feb 2017 00:00:00 +0000</pubDate>
<link>https://myxmlsource.com/32984</link>
<guid>https://myxmlsource.com/32984</guid>
<author>Blogs, Jo</author>
<su:departments>
<su:department>Human Affairs</su:department>
</su:departments>
<su:authorHash>4f329b923419b3cb2c654d615e22588c</su:authorHash>
<su:authorEmail>hIwW14tLc+4l/oo7agmRrcjwe531u+mO/3IG3xe5jMg=</su:authorEmail>
<dc:identifier>/32984/Download/0032984-11042.docx</dc:identifier>
<dc:format>Journal article</dc:format>
<dc:creator>Blogs, Jo</dc:creator>
<slash:comments>0</slash:comments>
</item>
Use the following code:
$rss = new DOMDocument();
$rss->load('https://myxmlsource.com/rss/xml');
$nodes = $rss->getElementsByTagName('item');
foreach ($nodes as $node) {
$title = $node->getElementsByTagName('title')->item(0)->nodeValue;
$author = $node->getElementsByTagName('author')->item(0)->nodeValue;
$authorHash = $node->getElementsByTagName('authorHash')->item(0)->nodeValue;
$department = $node->getElementsByTagName('department')->item(0)->nodeValue;
$email = decryptEmail($node->getElementsByTagName('authorEmail')->item(0)->nodeValue);
}
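A self-contained sketch of the same idea follows. It assumes the enclosing document declares the su prefix (the URI http://example.com/su is a placeholder), and it relies on the classic DOMDocument::getElementsByTagName() matching elements by local name regardless of prefix, which is how it behaves as far as I know. decryptEmail() is not defined in the answer, so it is left out here.

```php
<?php
// Placeholder document; the xmlns:su URI is an assumption for illustration.
$source = <<<XML
<rss version="2.0" xmlns:su="http://example.com/su">
  <channel>
    <item>
      <title>My important article</title>
      <author>Blogs, Jo</author>
      <su:departments>
        <su:department>Human Affairs</su:department>
      </su:departments>
      <su:authorHash>4f329b923419b3cb2c654d615e22588c</su:authorHash>
    </item>
  </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($source);

$rows = [];
foreach ($rss->getElementsByTagName('item') as $node) {
    $rows[] = [
        'title'      => $node->getElementsByTagName('title')->item(0)->nodeValue,
        // Prefixed elements are looked up by local name only.
        'department' => $node->getElementsByTagName('department')->item(0)->nodeValue,
        // Namespace-aware alternative that names the URI explicitly.
        'authorHash' => $node->getElementsByTagNameNS('http://example.com/su', 'authorHash')->item(0)->nodeValue,
    ];
}

print_r($rows);
```

The getElementsByTagNameNS() variant is the safer choice when two namespaces define elements with the same local name.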
I'm attempting to echo (or assign to a variable) the value of "code", which is inside status.
I can get request-id just fine...
Any ideas, people?
<?php
$responseXML = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<payment xmlns="http://www.example.com" self="http://www.example.com">
<merchant-account-id ref="http://www.example.com">0000</merchant-account-id>
<transaction-id>0000</transaction-id>
<request-id>0000</request-id>
<transaction-type>auth</transaction-type>
<transaction-state>success</transaction-state>
<completion-time-stamp>2015-12-28T17:39:25.000Z</completion-time-stamp>
<statuses>
<status code="201.0000" description="3d-acquirer:The resource was successfully created." severity="information"/>
</statuses>
<avs-code>P</avs-code>
<requested-amount currency="GBP">0.01</requested-amount>
<account-holder>
<first-name>test</first-name>
<last-name>test</last-name>
<email>test.test@hotmail.co.uk</email>
<phone>00000000000</phone>
<address>
<street1>test</street1>
<city>test test</city>
<state>test</state>
<country>GB</country>
</address>
</account-holder>
<card-token>
<token-id>000</token-id>
<masked-account-number>000000******0000</masked-account-number>
</card-token>
<ip-address>192.168.0.1</ip-address>
<descriptor></descriptor>
<authorization-code>000000</authorization-code>
<api-id>000-000</api-id>
</payment>';
$doc = new DOMDocument;
$doc->loadXML($responseXML);
echo $doc->getElementsByTagName('request-id')->item(0)->nodeValue;
echo $doc->getElementsByTagName('status code')->item(0)->nodeValue;
?>
I've tried simplexml_load_string() but I'm pulling my hair out with this one. Can anybody shed some light? Getting this information out quickly, in one pass, is quite important so as not to stress the web server.
Many thanks.
Using DOM is a good idea, but the API methods are a little cumbersome. Using Xpath makes it a lot easier.
Xpath allows you to use expressions to fetch node lists or scalar values from a DOM:
$document = new DOMDocument;
$document->loadXML($responseXML);
$xpath = new DOMXpath($document);
$xpath->registerNamespace('example', 'http://www.example.com');
echo $xpath->evaluate('string(//example:request-id)'), "\n";
echo $xpath->evaluate('string(//example:status/@code)');
Output:
0000
201.0000
XPath has no concept of a default namespace, so if your XML uses one (as in your example) you need to register a prefix for it and use that prefix in your expressions.
Since code is an attribute of the status element, calling
getElementsByTagName('status code')
is wrong: that method takes an element name only.
To read an attribute value, use the getAttribute() method:
echo $doc->getElementsByTagName('status')->item(0)->getAttribute('code');
Using XPath lets you address the status node very precisely.
DOMDocument + XPath:
$responseXML = '...';
$doc = new DOMDocument();
$doc->loadXML($responseXML);
$xp = new DOMXpath($doc);
$xp->registerNamespace('example', 'http://www.example.com');
// Every status node.
$statusNodes = $xp->query('//example:status');
// or a very specific one.
$statusNodes = $xp->query('/example:payment/example:statuses/example:status');
$statusNode = $statusNodes[0];
$code = $statusNode->getAttribute('code');
// $code is '201.0000'.
// To change the 'code' value.
$statusNode->setAttribute('code', '302.0000');
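Since the question mentions SimpleXML: as far as I can tell, SimpleXML resolves elements in the document's default namespace without any registration, so the attribute can also be read as in the sketch below. The XML is a trimmed copy of the response from the question.

```php
<?php
// Trimmed copy of the response from the question.
$responseXML = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<payment xmlns="http://www.example.com">
  <request-id>0000</request-id>
  <statuses>
    <status code="201.0000" description="3d-acquirer:The resource was successfully created." severity="information"/>
  </statuses>
</payment>';

$payment = simplexml_load_string($responseXML);

// Hyphenated element names need the curly-brace property syntax.
$requestId = (string) $payment->{'request-id'};

// Attributes are read with array syntax on the element.
$code = (string) $payment->statuses->status['code'];

echo $requestId, "\n", $code, "\n";
```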
I'm using DOMDocument to generate an XML document, and this XML has to contain an image tag.
Somehow, when I do (simplified)
$response = new DOMDocument();
$actions = $response->createElement('actions');
$response->appendChild($actions);
$imageElement = $response->createElement('image');
$actions->appendChild($imageElement);
$anotherNode = $response->createElement('nodexy');
$imageElement->appendChild($anotherNode);
it results in
<actions>
<img>
<node></node>
</actions>
If I change 'image' to 'images' or even 'img', it works. It also works when I switch from PHP 5.3.10 to 5.3.8.
Is this a bug or a feature? My guess is that DOMDocument assumes that I want to build an HTML img element ... Can I prevent this somehow?
Weird thing on top: I'm not able to reproduce the error in another script on the same server, and I can't see the pattern ...
Here's a complete pastebin of the class, that's causing the error: http://pastebin.com/KqidsssM
That cost me two hours.
DOMDocument renders the XML correctly. The XML is returned by an AJAX call, and somehow the browser/JavaScript changes it to img before displaying it ...
Is it possible that $imageAction->getAction() of line 44 returns 'img'? Have you var_dump()ed that? I don't see how DOM would convert "image" to "img" under any circumstances.
I think it is behaving as an HTML document.
Try adding a version number, "1.0".
Code:
<?php
$response = new DOMDocument('1.0','UTF-8');
$actions = $response->createElement('actions');
$response->appendChild($actions);
$imageElement = $response->createElement('image');
$actions->appendChild($imageElement);
$anotherNode = $response->createElement('nodexy');
$imageElement->appendChild($anotherNode);
echo $response->saveXML();
output:
<?xml version="1.0" encoding="UTF-8" ?>
<actions>
<image>
<nodexy />
</image>
</actions>
You can also use the SimpleXML classes.
Example:
<?php
$response = new SimpleXMLElement("<actions></actions>");
$imageElement = $response->addChild('image');
$imageElement->addChild("nodexy");
echo $response->asXML();
output :
<?xml version="1.0" ?>
<actions>
<image>
<nodexy />
</image>
</actions>
I am trying to parse a YouTube playlist feed.
The URL is: http://gdata.youtube.com/feeds/api/playlists/664AA68C6E6BA19B?v=2
I need: Title, Video ID, and Default thumbnail.
I can easily get the title but I'm a little lost when it comes to the nested elements
$data = new DOMDocument();
if($data->load("http://gdata.youtube.com/feeds/api/playlists/664AA68C6E6BA19B?v=2"))
{
foreach ($data->getElementsByTagName('entry') as $video)
{
$title = $video->getElementsByTagName('title')->item(0)->nodeValue;
$id = ??
$thumb = ??
}
}
Here is the XML (I have stripped out the elements that are irrelevant for this example)
<entry gd:etag="W/&quot;AkYGSXc9cSp7ImA9Wx9VGEk.&quot;">
<title>A GoPro Weekend On The Ice</title>
<media:group>
<media:thumbnail url="http://i.ytimg.com/vi/yk6wkfVNFQE/default.jpg" height="90" width="120" time="00:02:07" yt:name="default" />
<yt:videoid>yk6wkfVNFQE</yt:videoid>
</media:group>
</entry>
I need the "videoid" and the "url" from thumbnail-default
Thank you!
Similar to the getElementsByTagName() that you're already using, to access namespaced elements (recognisable by namespace:element-name) you can use the getElementsByTagNameNS() method.
The documentation (linked above) should give you the technical lowdown on how to use it; suffice it to say that it will be similar to the following (also using getAttribute()).
$yt = 'http://gdata.youtube.com/schemas/2007';
$media = 'http://search.yahoo.com/mrss/';
// Inside your loop
$id = $video->getElementsByTagNameNS($yt, 'videoid')->item(0)->nodeValue;
$thumb = $video->getElementsByTagNameNS($media, 'thumbnail')->item(0)->getAttribute('url');
Hopefully that should give you a spring-board to leap into accessing namespaced items within your XML documents.
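To pick the thumbnail named "default" rather than whichever happens to come first, the loop can check the yt:name attribute with getAttributeNS(). The sketch below uses a cut-down inline copy of the entry; the Atom wrapper and the second thumbnail are made up for illustration, and the namespace URIs are the ones assumed in the snippet above.

```php
<?php
$yt    = 'http://gdata.youtube.com/schemas/2007';
$media = 'http://search.yahoo.com/mrss/';

// Cut-down, partly made-up sample of the playlist feed.
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:media="http://search.yahoo.com/mrss/"
      xmlns:yt="http://gdata.youtube.com/schemas/2007">
  <entry>
    <title>A GoPro Weekend On The Ice</title>
    <media:group>
      <media:thumbnail url="http://i.ytimg.com/vi/yk6wkfVNFQE/mqdefault.jpg" yt:name="mqdefault"/>
      <media:thumbnail url="http://i.ytimg.com/vi/yk6wkfVNFQE/default.jpg" yt:name="default"/>
      <yt:videoid>yk6wkfVNFQE</yt:videoid>
    </media:group>
  </entry>
</feed>
XML;

$data = new DOMDocument();
$data->loadXML($xml);

$videos = [];
foreach ($data->getElementsByTagName('entry') as $video) {
    $thumb = null;
    foreach ($video->getElementsByTagNameNS($media, 'thumbnail') as $candidate) {
        // yt:name is a namespaced attribute, so use getAttributeNS().
        if ($candidate->getAttributeNS($yt, 'name') === 'default') {
            $thumb = $candidate->getAttribute('url');
            break;
        }
    }
    $videos[] = [
        'title' => $video->getElementsByTagName('title')->item(0)->nodeValue,
        'id'    => $video->getElementsByTagNameNS($yt, 'videoid')->item(0)->nodeValue,
        'thumb' => $thumb,
    ];
}

print_r($videos);
```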