PHP SimpleXML issue - php

I'm trying to fetch an XML stream using curl. I've received the string with curl, but I'm having trouble parsing it with SimpleXML. The URL I'm using is http://www.google.com/books/feeds/volumes/fR4vqfywNlgC
and SimpleXML seems to be ignoring the parts containing "dc". Why?

The Dublin Core data (at least, I'm assuming that's what the dc prefix means in this case) uses its own namespace. You need to refer to that namespace when retrieving these elements. This can be done using the children() method.
Example:
$sxml = simplexml_load_string($xml);
$dcData = $sxml->children('dc', TRUE);
echo (string)$dcData->creator;
An article/posting detailing the problem and solution can be found here.
http://blogs.sitepoint.com/simplexml-and-namespaces/
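To see the difference without hitting the live feed, here is a minimal sketch using an inline document. The XML is a made-up, trimmed-down stand-in for the Google Books entry, and the namespace URI bound to dc is an assumption; check the xmlns:dc declaration in the real feed.

```php
<?php
// Hypothetical stand-in for the feed; element values and the dc URI are assumptions.
$xml = <<<'XML'
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dc="http://purl.org/dc/terms">
  <title>Example title</title>
  <dc:creator>Jane Doe</dc:creator>
  <dc:creator>John Roe</dc:creator>
</entry>
XML;

$sxml = simplexml_load_string($xml);

// Plain property access does not see elements that carry a prefix.
$plainCount = count($sxml->creator);

// children('dc', true) switches to the namespace bound to the "dc" prefix.
$creators = [];
foreach ($sxml->children('dc', true)->creator as $creator) {
    $creators[] = (string) $creator;
}

echo $plainCount, "\n";
echo implode(', ', $creators), "\n";
```

Passing the full namespace URI instead of the prefix (and dropping the second argument) should work the same way and does not depend on which prefix the feed happens to use.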

Related

using simplexml_load_file() to get attributes from xml file for Amazon Affiliate API

I am using Amazon Affiliate API here. I am fairly new to all of this actually.
I have the PHP code to generate a working URL for products, however I am having issues extracting data from that url's generated xml. I am looking to return the FormattedPrice from the xml.
Here is an example of the xml file:
https://pastebin.com/HnttEVfv
Here is my code trying to pull the "FormattedPrice". Neither of the examples below works; both just return empty values.
Sidenote: $request_url is a full http:// valid url of the xml file.
$getxml = simplexml_load_file($request_url);
$price = $getxml->ItemLookupResponse->Items->Item->ItemAttributes->ListPrice['FormattedPrice'];
Nor does
$price = $getxml->ItemLookupResponse->Items->Item->ItemAttributes->ListPrice->FormatedPrice;
Every XML document has a single "root element". In your example, that top-level element is <ItemLookupResponse>.
When you load the document in SimpleXML, the first object it gives you is that root element. You then access children of that element with the ->ElementName notation.
So instead of $getxml->ItemLookupResponse->Items you should write $getxml->Items.
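A small sketch of that, using an inline document rather than the real response. The XML below is a hypothetical, heavily trimmed structure; in particular, treating FormattedPrice as a child element of ListPrice is an assumption, since the pastebin content is not reproduced here.

```php
<?php
// Hypothetical, trimmed response; the real one contains many more elements.
$xml = <<<'XML'
<ItemLookupResponse>
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>1999</Amount>
          <FormattedPrice>$19.99</FormattedPrice>
        </ListPrice>
      </ItemAttributes>
    </Item>
  </Items>
</ItemLookupResponse>
XML;

$getxml = simplexml_load_string($xml);

// The loaded object already is the root element.
$rootName = $getxml->getName();

// So there is no child called ItemLookupResponse to descend into.
$hasNestedRoot = isset($getxml->ItemLookupResponse);

// Start the path at the root's children instead.
$price = (string) $getxml->Items->Item->ItemAttributes->ListPrice->FormattedPrice;

echo $rootName, "\n";
echo $price, "\n";
```

If FormattedPrice turns out to be an attribute in your data instead, the ['FormattedPrice'] form from the question would be the one to use, still starting from $getxml->Items.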

Can't access XML node via xpath() (YT channel feed)

Very stumped by this one. In PHP, I'm fetching a YouTube user's vids feed and trying to access the nodes, like so:
$url = 'http://gdata.youtube.com/feeds/api/users/HCAFCOfficial/uploads';
$xml = simplexml_load_file($url);
So far, so fine. Really basic stuff. I can see the data comes back by running:
echo '<p>Found '.count($xml->xpath('*')).' nodes.</p>'; //41
echo '<textarea>';print_r($xml);echo '</textarea>';
Both print what I would expect, and the print_r replicates the XML structure.
However, I have no idea why this is returning zero:
echo '<p>Found '.count($xml->xpath('entry')).'"entry" nodes.</p>';
There blatantly are entry nodes in the XML. This is confirmed by running:
foreach($xml->xpath('*') as $node) echo '<p>['.$node->getName().']</p>';
...which duly outputs "[entry]" 25 times. So perhaps this is a bug in SimpleXML? This is part of a wider feed caching system and I'm not having any trouble with other, non-YT feeds, only YT ones.
[UPDATE]
This question shows that it works if you do
count($xml->entry)
But I'm curious as to why count($xml->xpath('entry')) doesn't also work...
[Update 2]
I can happily traverse YT's alternate feed format just fine:
http://gdata.youtube.com/feeds/base/users/{user id}/uploads?alt=rss&v=2
This is happening because the feed is an Atom document with a defined default namespace.
<feed xmlns="http://www.w3.org/2005/Atom" ...
Since a namespace is defined, you have to define it for your xpath call too. Doing something like this works:
$url = 'http://gdata.youtube.com/feeds/api/users/HCAFCOfficial/uploads';
$xml = simplexml_load_file($url);
$xml->registerXPathNamespace('ns', 'http://www.w3.org/2005/Atom');
$results = $xml->xpath('ns:entry');
echo count($results);
The main thing to know here is that SimpleXML respects any and all defined namespaces and you need to handle them accordingly, including the default namespace. You'll notice that the second feed you listed does not define a default namespace and so the xpath call works fine as is.
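The same behaviour can be reproduced offline. This is a minimal sketch with a made-up two-entry Atom feed; the prefix name atom is arbitrary and only has to match between registerXPathNamespace() and the query.

```php
<?php
// Hypothetical miniature Atom feed with a default namespace.
$xml = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example uploads</title>
  <entry><title>First</title></entry>
  <entry><title>Second</title></entry>
</feed>
XML;

$feed = simplexml_load_string($xml);

// In XPath 1.0 an unprefixed name means "in no namespace", so this finds nothing.
$unprefixed = $feed->xpath('entry');

// Bind a prefix of our own choosing to the Atom namespace, then use it in the query.
$feed->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
$prefixed = $feed->xpath('atom:entry');

// Property access follows the default namespace, which is why count($xml->entry) worked.
$viaProperty = count($feed->entry);

echo count($unprefixed), ' ', count($prefixed), ' ', $viaProperty, "\n";
```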

Parsing Wordpress XML file in PHP

I'm migrating a big WordPress site to a custom CMS. I need to extract information from a big (20MB+) XML file exported from WordPress.
I don't have any experience with XML in PHP and I don't know how to start reading the file.
The WordPress file contains structures like this:
<excerpt:encoded><![CDATA[Encoded text here]]></excerpt:encoded>
and I don't know how to handle this in PHP.
You are probably going to do fine with simplexml:
$xml = simplexml_load_file('big_xml_file.xml');
foreach ($xml->element as $el) {
    echo $el->name;
}
See php.net for more info
Unfortunately, your XML example didn't come through.
PHP 5 ships with several extensions for working with XML, including DOM and SimpleXML.
Generally speaking, I recommend looking into SimpleXML first since it's the more accessible of the two.
For starters, use simplexml_load_file() to read an XML file into an object for further processing.
You should also check out the SimpleXML basic examples page on php.net.
I don't have any experience in XML under PHP
Take a look at simplexml_load_file() or DOMDocument.
<excerpt:encoded><![CDATA[Encoded text here]]></excerpt:encoded>
This should not be a problem for the XML parser. However, you will have a problem with the content exported by WordPress. For example, it can contain WordPress shortcodes, which will come across in their raw format instead of expanded.
Better Approach
Determine whether the system you are migrating to supports importing from WordPress. Many other systems do: Drupal, Joomla, Octopress, etc.
Although Adam is absolutely right, his answer needed a bit more detail. Here's a simple script that should get you going.
$xmlfile = simplexml_load_file('yourxmlfile.xml');
foreach ($xmlfile->channel->item as $item) {
    var_dump($item->xpath('title'));
    var_dump($item->xpath('wp:post_type'));
}
simplexml_load_file() is the way to go for creating an object, but WordPress uses namespaces, and plain property access such as $item->post_type will not find prefixed elements. You will need to use xpath (or children() with the namespace) to reach them.
$xml = simplexml_load_file( $file );
$xml->xpath('/rss/channel/wp:category');
I would recommend looking at what WordPress uses for importing the files.
https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/class-wp-importer.php
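For the excerpt:encoded structure from the question, here is a minimal sketch that reads the prefixed elements with children() instead of xpath. The document below is a hypothetical one-item export, and the namespace URIs in it are assumptions (they vary between export versions), which is why the code selects by prefix.

```php
<?php
// Hypothetical one-item export; the namespace URIs are assumptions.
$xml = <<<'XML'
<rss version="2.0"
     xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello world</title>
      <excerpt:encoded><![CDATA[Encoded text here]]></excerpt:encoded>
      <wp:post_type>post</wp:post_type>
    </item>
  </channel>
</rss>
XML;

$export = simplexml_load_string($xml);

$posts = [];
foreach ($export->channel->item as $item) {
    $posts[] = [
        'title'   => (string) $item->title,
        // Casting to string returns the CDATA content as plain text.
        'excerpt' => (string) $item->children('excerpt', true)->encoded,
        'type'    => (string) $item->children('wp', true)->post_type,
    ];
}

print_r($posts);
```

With a real export you would swap simplexml_load_string($xml) for simplexml_load_file() on the export file. Note that print_r() on the SimpleXML object itself may appear to hide CDATA content even though the string cast returns it.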

Get XML Attribute with SimpleXML

I'm trying to get the $xml->entry->yt:statistics->attributes()->viewCount attribute, and I've tried some stuff with SimpleXML, and I can't really get it working!
Attempt #1
<?php
$xml = simplexml_load_file("http://gdata.youtube.com/feeds/api/videos?author=Google");
echo $xml->entry[0]->yt:statistics['viewCount'];
?>
Attempt #2
<?php
$xml = simplexml_load_file("http://gdata.youtube.com/feeds/api/videos?author=Google");
echo $xml->entry[0]->yt:statistics->attributes()->viewCount;
?>
Both return blank, even though SimpleXML itself is working: I tried to get the feed's title, and that worked!
Any ideas?
I've looked at loads of other examples on SO and other sites, but somehow this isn't working. Does PHP treat the ':' as a cut-off, or am I just doing something stupid?
Thank you, any responses greatly appreciated!
If you just want the view count of a YouTube video, you have to specify the video ID, which is found in each video URL. For example, in "http://www.youtube.com/watch?v=ccI-MugndOU" the ID is ccI-MugndOU. To get the view count, try the code below:
$sample_video_ID = "ccI-MugndOU";
$JSON = file_get_contents("http://gdata.youtube.com/feeds/api/videos?q={$sample_video_ID}&alt=json");
$JSON_Data = json_decode($JSON);
$views = $JSON_Data->{'feed'}->{'entry'}[0]->{'yt$statistics'}->{'viewCount'};
echo $views;
I would use the GData component from the Zend Framework. It is also available as a separate module, so you don't need to use the whole framework.
The yt: prefix marks that element as being in a different "XML namespace" from the rest of the document. You have to tell SimpleXML to switch to that namespace using the ->children() method.
The line you were attempting should actually look like this:
echo (string)$xml->entry[0]->children('yt', true)->statistics->attributes(NULL)->viewCount;
To break this down:
(string) - this is just a good habit: you want the string contents of the attribute, not a SimpleXML object representing it
$xml->entry[0] - as expected
->children('yt', true) - switch to the namespace with the local alias 'yt'
->statistics - as expected
->attributes(NULL) - technically, the attribute "viewCount" is back in the default namespace, because it is not prefixed with "yt:", so we have to switch back in order to see it
->viewCount - running ->attributes() gives us nothing but attributes, which are accessed with ->foo not ['foo']
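Here is the same chain run against an inline document, as a sketch you can try without the live feed. The XML is a made-up single-entry feed, and the URI bound to yt is an assumption; selecting by prefix with children('yt', true) avoids depending on it.

```php
<?php
// Hypothetical single-entry feed; attribute values and the yt URI are assumptions.
$xml = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:yt="http://gdata.youtube.com/schemas/2007">
  <entry>
    <title>Example video</title>
    <yt:statistics favoriteCount="3" viewCount="12345"/>
  </entry>
</feed>
XML;

$feed = simplexml_load_string($xml);

// Switch to the yt namespace to reach the prefixed element.
$stats = $feed->entry[0]->children('yt', true)->statistics;

// attributes() with no argument selects the un-prefixed attributes,
// which is what attributes(NULL) does in the line above.
$viewCount = (string) $stats->attributes()->viewCount;

echo $viewCount, "\n";
```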

Parsing XML (PHP)

I'm using SimpleXML. I want to get this node's text attribute.
<yweather:condition text="Mostly Cloudy" ......
I'm using this, but it's not working:
$xml->children("yweather", TRUE)->condition->attributes()->text;
Do a print_r() on $xml to see how the structure looks. From there you should be able to see how to access the information.
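It looks like you are trying to access an attribute. Attributes are available through attributes(), which can be read like an array. The condition element sits below channel and item and carries the yweather prefix, so:
$attributes = $xml->channel->item->children('yweather', TRUE)->condition->attributes();
$weather = (string)$attributes['text'];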
To deal with the namespace, you need to use children() to get the members of that namespace.
$weather_items = $xml->channel->item->children("http://xml.weather.yahoo.com/ns/rss/1.0");
It might help to mention that the string you showed is part of a feed, specifically the RSS formatted Yahoo Weather feed.
You would probably use $xml->condition but there may be branches before that.
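Putting those pieces together, here is a minimal sketch against an inline document. The XML is a hypothetical fragment shaped like the Yahoo Weather RSS feed, with condition placed under channel and item as the children() call above implies; the attribute values are made up.

```php
<?php
// Hypothetical fragment in the shape of the Yahoo Weather RSS feed.
$xml = <<<'XML'
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0">
  <channel>
    <item>
      <title>Conditions</title>
      <yweather:condition text="Mostly Cloudy" code="28" temp="50"/>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

// Walk down to the item first, then switch to the yweather namespace by URI.
$condition = $rss->channel->item
    ->children('http://xml.weather.yahoo.com/ns/rss/1.0')
    ->condition;

// The text attribute has no prefix, so a plain attributes() call exposes it.
$text = (string) $condition->attributes()->text;

echo $text, "\n";
```

The original attempt called children() on the root element, where there is no condition element to find; the namespace switch has to happen on the element that actually contains it.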
