Problem getting XPath working in PHP

I've been trying to access the NHS API using different methods to read in the XML.
Here is a snippet of the XML:
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
<title type="text">NHS Choices - GP Practices Near Postcode - W1T4LB - Within 5km</title>
<entry>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/27369</id>
<title type="text">Fitzrovia Medical Centre</title>
<updated>2011-08-20T22:47:39Z</updated>
<link rel="self" title="Fitzrovia Medical Centre" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/27369?apikey="/>
<link rel="alternate" title="Fitzrovia Medical Centre" href="http://www.nhs.uk/ServiceDirectories/Pages/GP.aspx?pid=303A92EF-EC8D-496B-B9CD-E6D836D13BA2"/>
<content type="application/xml">
<s:organisationSummary>
<s:name>Fitzrovia Medical Centre</s:name>
<s:address>
<s:addressLine>31 Fitzroy Square</s:addressLine>
<s:addressLine>London</s:addressLine>
<s:postcode>W1T6EU</s:postcode>
</s:address>
<s:contact type="General">
<s:telephone>020 7387 5798</s:telephone>
</s:contact>
<s:geographicCoordinates>
<s:northing>182000</s:northing>
<s:easting>529000</s:easting>
<s:longitude>-0.140267259415255</s:longitude>
<s:latitude>51.5224357586293</s:latitude>
</s:geographicCoordinates>
<s:Distance>0.360555127546399</s:Distance>
</s:organisationSummary>
</content>
</entry>
</feed>
I've been using this PHP to access it:
<?php
$feedURL = 'http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/W1T4LB.xml?apikey=&range=5';
$raw = file_get_contents($feedURL);
$dom = new DOMDocument();
$dom->loadXML($raw);
$xp = new DOMXPath($dom);
$result = $xp->query("//entry"); // select all entry nodes
print $result->item(0)->nodeValue;
?>
The problem is that I get no results: the $raw data is present, but $dom never seems to be filled with the XML string, so the XPath query returns nothing.
Also, for bonus points: how do I access the <s:name> tag using XPath in this instance?
Appreciate the help as always.
Edit:
Here is the resulting PHP that worked fine, thanks to @Andrej L
<?php
$feedURL = 'http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/W1T4LB.xml?apikey=&range=5';
$xml = simplexml_load_file($feedURL);
$xml->registerXPathNamespace('s', 'http://syndication.nhschoices.nhs.uk/services');
$result = $xml->xpath('//s:name');
foreach ($result as $title)
{
print $title . '<br />';
}
?>

I think it's better to use the SimpleXML extension; see http://www.php.net/manual/en/book.simplexml.php
Register the s namespace with registerXPathNamespace(): http://www.php.net/manual/en/simplexmlelement.registerxpathnamespace.php
Then query with the xpath() method: http://www.php.net/manual/en/simplexmlelement.xpath.php
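For what it's worth, the DOMXPath approach from the question can be made to work too. A likely reason //entry returned nothing is that the feed declares a default namespace (Atom), and in XPath 1.0 an unprefixed name only matches elements in no namespace. Below is a minimal sketch of that idea, assuming a trimmed inline copy of the feed instead of the live URL; the atom prefix is an arbitrary name chosen for this example, only the namespace URIs have to match the document.

```php
<?php
// Trimmed copy of the feed from the question, inlined so the sketch runs offline.
$raw = <<<XML
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title type="text">Fitzrovia Medical Centre</title>
    <content type="application/xml">
      <s:organisationSummary>
        <s:name>Fitzrovia Medical Centre</s:name>
      </s:organisationSummary>
    </content>
  </entry>
</feed>
XML;

$dom = new DOMDocument();
$dom->loadXML($raw);

$xp = new DOMXPath($dom);
// "atom" is a prefix picked for this sketch; it is bound to the feed's default namespace URI.
$xp->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xp->registerNamespace('s', 'http://syndication.nhschoices.nhs.uk/services');

$unprefixed = $xp->query('//entry');              // no match: entry lives in the Atom namespace
$entries    = $xp->query('//atom:entry');         // matches the entry element
$names      = $xp->query('//atom:entry//s:name'); // the "bonus points" element

echo $unprefixed->length, "\n";
echo $entries->length, "\n";
echo $names->item(0)->nodeValue, "\n";
```

If the first count is 0 and the second is not, the document loaded fine and only the namespace handling in the query was missing.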

Related

Unable to parse atom feed

I am implementing YouTube push notifications and have implemented a webhook. YouTube sends updates in the form of an Atom feed. My problem is that I can't parse that feed.
This is the XML:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:yt="http://www.youtube.com/xml/schemas/2015">
<link rel="hub" href="https://pubsubhubbub.appspot.com" />
<link rel="self" href="https://www.youtube.com/xml/feeds/videos.xml?channel_id=UCaNoTnXcQQt3ody_cLZSihw" />
<title>YouTube video feed</title>
<updated>2018-03-01T07:21:59.144766801+00:00</updated>
<entry>
<id>yt:video:vNQyYJqFopE</id>
<yt:videoId>vNQyYJqFopE</yt:videoId>
<yt:channelId>UCaNoTnXcQQt3ody_cLZSihw</yt:channelId>
<title>Test Video 4</title>
<link rel="alternate" href="https://www.youtube.com/watch?v=vNQyYJqFopE" />
<author>
<name>Testing</name>
<uri>https://www.youtube.com/channel/UCaNoTnXcQQt3ody_cLZSihw</uri>
</author>
<published>2018-03-01T07:21:48+00:00</published>
<updated>2018-03-01T07:21:59.144766801+00:00</updated>
</entry>
</feed>
<?php
$xml = '<?xml versio......';
$obj = simplexml_load_string($xml);
echo '<pre>';print_r($obj);echo '</pre>';
How do I get the value of the yt:videoId element? I am new to PHP, so if I did anything wrong please correct me.

The elements in the yt namespace (e.g. <yt:videoId>) are parsed by simplexml_load_string, but print_r does not show them: it only lists children in the default namespace. In your case the video ID is also present in the <id> element, so an easy workaround is to take the last segment of that value, or simply cut off the leading yt:video:.
Also it works if you use a direct XPath to the <yt:videoId> element like this:
echo $obj->xpath('//yt:videoId')[0];
// output: vNQyYJqFopE
XPath always returns an array so you need to get the first element with [0].
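Another option, sketched here under the assumption that the payload looks like the feed in the question (trimmed and inlined below), is to ask SimpleXML for the children in the yt namespace explicitly. The variable names are only illustrative.

```php
<?php
// Trimmed copy of the notification payload from the question.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:yt="http://www.youtube.com/xml/schemas/2015">
  <entry>
    <id>yt:video:vNQyYJqFopE</id>
    <yt:videoId>vNQyYJqFopE</yt:videoId>
    <yt:channelId>UCaNoTnXcQQt3ody_cLZSihw</yt:channelId>
    <title>Test Video 4</title>
  </entry>
</feed>
XML;

$obj = simplexml_load_string($xml);

// children() with a namespace URI returns only the child elements in that namespace.
$yt = $obj->entry->children('http://www.youtube.com/xml/schemas/2015');

// Cast to string to get plain values instead of SimpleXMLElement objects.
$videoId   = (string) $yt->videoId;
$channelId = (string) $yt->channelId;

echo $videoId, "\n";
echo $channelId, "\n";
```

Passing the namespace URI rather than the yt prefix keeps the code working even if the feed ever switches to a different prefix for the same namespace.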

Try this (updated)
$str = $obj->entry->id;
echo substr($str, strpos($str, "video:")+ 6);
Get the channel
$chan = $obj->entry->author->uri;
echo substr($chan , strpos($chan , "channel/")+ 8);

How to fetch exact parameter from XML string with PHP?

This is part of XML document:
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
My code:
$xml = simplexml_load_string($result);
foreach ($xml->entry as $pixinfo) {
echo $pixinfo->link[1]['href'];
}
The problem is that there can be one or more link elements, and I need only the one with the rel="enclosure" attribute.
What is the easiest way to get it without extra ifs and loops?
Thank you!

For that you can use DOMXPath, more specifically the query function. Let's say your $result variable contains the following:
<?xml version='1.0' encoding='UTF-8'?>
<entries>
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
</entries>
I know the entries are repeated, but it's only for demo purposes. The code to get only the enclosure links would be:
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($result);
$xpath = new DOMXpath($doc);
$entries = $xpath->query('//entries/entry');
foreach ($entries as $entry) {
$link = $xpath->query('link[@rel="enclosure"]', $entry)->item(0);
$href = $link->getAttribute('href');
echo "{$href}\n";
}

You are already using SimpleXML, so you can read the attributes with the attributes() method: http://php.net/manual/pt_BR/simplexmlelement.attributes.php
Or you can access directly:
foreach ($xml->entry as $pixinfo) {
if($pixinfo->link[1]['rel'] == 'enclosure') {
echo $pixinfo->link[1]['href'];
}
}

The solution is Xpath.
With SimpleXML you can fetch the attribute node and cast the generated SimpleXMLElement into a string. You should make sure that you got an element before you cast it. SimpleXMLElement::xpath() will always return an array of SimpleXMLElement objects.
$entry = new SimpleXMLElement($xml);
$enclosures = $entry->xpath('link[@rel="enclosure"]/@href');
if (count($enclosures) > 0) {
var_dump((string)$enclosures[0]);
}
Output:
string(63) "http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg"
With DOM the bootstrap is slightly larger, but you can fetch the href attribute directly as a string:
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
var_dump(
$xpath->evaluate('string(/entry/link[@rel="enclosure"]/@href)')
);
This will return an empty string if the expression does not match.

PHP outputting nested child nodes

I am trying to load an XML file into an array that is then output to the screen. I know the file loads, because I can output entry > id, but I cannot access its child nodes. I need the data located in:
content > s:organisationSummary
content > s:organisationSummary > s:address
content > s:organisationSummary > s:geographicCoordinates
How would I access the data located in s:organisationSummary, s:address and s:geographicCoordinates, so that I can call getElementsByTagName for each item in that child node?
$doc2 = new DOMDocument();
$url = 'http://v1.syndication.nhschoices.nhs.uk/organisations/'.$_POST['ServiceType'].'/postcode/'.$_POST['PostCode'].'.xml?apikey=??&range=50';
echo $url;
$doc2->load($url);
$arrFeeds = array();
foreach ($doc2->getElementsByTagName('entry') as $node)
{
echo $node->getElementsByTagName($content->'s:name');
$itemRSS = array (
'PracticeName' => $organisationSummary->getElementsByTagName('s:name')->item(0)->nodeValue
);
array_push($arrFeeds, $itemRSS);
}
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
<title type="text">NHS Choices - GP Practices Near Postcode - ls1- Within 50km</title>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/bd164jt?range=50</id>
<rights type="text">© Crown Copyright 2009</rights>
<updated>2012-07-06T10:24:46+01:00</updated>
<category term="Search"/>
<logo>http://www.nhs.uk/nhscwebservices/documents/logo1.jpg</logo>
<author>
<name>NHS Choices</name>
<uri>http://www.nhs.uk</uri>
<email>webservices@nhschoices.nhs.uk</email>
</author>
<link rel="self" type="application/xml" title="NHS Choices - GP Practices Near Postcode - ;ls1 - Within 50km" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/ls1?apikey=??&range=50"/>
<link rel="first" type="application/xml" title="first" length="1000" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/ls1?apikey=??&range=50&page=1"/>
<link rel="next" type="application/xml" title="next" length="1000" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/Ls1?apikey=??&range=50&page=2"/>
<link rel="last" type="application/xml" title="last" length="1000" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/LS1?apikey=??&range=50&page=10"/>
<link rel="alternate" title="NHS Choices - Find and choose services - GP Practices" href="http://www.nhs.uk/ServiceDirectories/pages/ServiceSearch.aspx?ServiceType=GP"/>
<s:SearchCoords>439300,411100</s:SearchCoords>
<entry>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/1</id>
<title type="text">Medical Practice</title>
<updated>2012-07-06T09:24:46Z</updated>
<link rel="self" title="Medical Practice" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/1?apikey=??"/>
<link rel="alternate" title="Medical Practice" href="http://www.nhs.uk/ServiceDirectories/Pages/GP.aspx?pid=1"/>
<content type="application/xml">
<s:organisationSummary>
<s:name>Medical Practice</s:name>
<s:address>
<s:addressLine>Health Care Centre</s:addressLine>
<s:addressLine>2</s:addressLine>
<s:addressLine>Town</s:addressLine>
<s:addressLine>Yorkshire</s:addressLine>
<s:postcode>?</s:postcode>
</s:address>
<s:contact type="General">
<s:telephone>5558383</s:telephone>
</s:contact>
<s:geographicCoordinates>
<s:northing>438880</s:northing>
<s:easting>411444</s:easting>
<s:longitude>-1.82821202227791</s:longitude>
<s:latitude>53.996218047559</s:latitude>
</s:geographicCoordinates>
<s:Distance>0.5</s:Distance>
</s:organisationSummary>
</content>
</entry>
<entry>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/2</id>
<title type="text">Surgery</title>
<updated>2012-07-06T09:24:46Z</updated>
<link rel="self" title="Surgery" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/1?apikey=??"/>
<link rel="alternate" title="Surgery" href="http://www.nhs.uk/ServiceDirectories/Pages/GP.aspx?pid=2"/>
<content type="application/xml">
<s:organisationSummary>
<s:name>Surgery</s:name>
<s:address>
<s:addressLine>Healthcare Centre</s:addressLine>
<s:addressLine>Kings</s:addressLine>
<s:addressLine>Town</s:addressLine>
<s:postcode>?</s:postcode>
</s:address>
<s:contact type="General">
<s:telephone>555555</s:telephone>
<s:email>Email</s:email>
</s:contact>
<s:geographicCoordinates>
<s:northing>78421</s:northing>
<s:easting>484100</s:easting>
<s:longitude>-1.828987402220691</s:longitude>
<s:latitude>53.987218047559</s:latitude>
</s:geographicCoordinates>
<s:Distance>0.5</s:Distance>
</s:organisationSummary>
</content>
</entry>
</feed>

This is a namespaced document, so you need to use the proper namespace methods, e.g. DOMDocument::getElementsByTagNameNS.
In addition, there is so much wrong with your loop that I suspect you're either not including all the code or you really misunderstand how DOMDocument works.
$NS = array(
's' => "http://syndication.nhschoices.nhs.uk/services",
'atom' => "http://www.w3.org/2005/Atom",
);
$entries = array();
foreach ($doc2->getElementsByTagNameNS($NS['s'], 'organisationSummary') as $node)
{
$entries[] = array(
'name' => trim($node->getElementsByTagNameNS($NS['s'], 'name')->item(0)->textContent),
'address' => keyByElementName($node->getElementsByTagNameNS($NS['s'], 'address')->item(0)),
'geographicCoordinates' => keyByElementName($node->getElementsByTagNameNS($NS['s'], 'geographicCoordinates')->item(0)),
);
}
function keyByElementName(DOMNode $node)
{
$elem = array();
foreach ($node->childNodes as $child) {
if ($child->nodeType===XML_ELEMENT_NODE) {
$elem[$child->localName] = trim($child->textContent);
}
}
return $elem;
}
However, consider using DOMXPath or SimpleXML, as these will be easier than DOM traversal.
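As a rough illustration of the DOMXPath route, here is a minimal sketch. It assumes a trimmed inline copy of the feed rather than the live URL, and the array keys are names made up for this example. One thing it handles differently from keyByElementName() above: repeated s:addressLine elements are collected into a list, whereas keying by element name keeps only the last one.

```php
<?php
// Trimmed copy of one entry from the feed in the question.
$raw = <<<XML
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <content type="application/xml">
      <s:organisationSummary>
        <s:name>Medical Practice</s:name>
        <s:address>
          <s:addressLine>Health Care Centre</s:addressLine>
          <s:addressLine>Town</s:addressLine>
          <s:postcode>?</s:postcode>
        </s:address>
        <s:geographicCoordinates>
          <s:longitude>-1.82821202227791</s:longitude>
          <s:latitude>53.996218047559</s:latitude>
        </s:geographicCoordinates>
      </s:organisationSummary>
    </content>
  </entry>
</feed>
XML;

$doc = new DOMDocument();
$doc->loadXML($raw);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('s', 'http://syndication.nhschoices.nhs.uk/services');

$practices = array();
foreach ($xpath->query('//s:organisationSummary') as $summary) {
    // Collect every address line; paths are relative to the current summary node.
    $lines = array();
    foreach ($xpath->query('s:address/s:addressLine', $summary) as $line) {
        $lines[] = trim($line->textContent);
    }

    // evaluate() with string(...) returns a plain string, or '' when nothing matches.
    $practices[] = array(
        'name'         => $xpath->evaluate('string(s:name)', $summary),
        'addressLines' => $lines,
        'postcode'     => $xpath->evaluate('string(s:address/s:postcode)', $summary),
        'latitude'     => $xpath->evaluate('string(s:geographicCoordinates/s:latitude)', $summary),
        'longitude'    => $xpath->evaluate('string(s:geographicCoordinates/s:longitude)', $summary),
    );
}

print_r($practices);
```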

Get XML values using PHP

I have a XML file. Here is a small version of that.
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="fr">
<title>Liste des ebooks</title>
<updated>2012-03-01T01:23:24Z</updated>
<author>
<name>Drown Del</name>
</author>
<opensearch:totalResults>2338</opensearch:totalResults>
<opensearch:itemsPerPage>100</opensearch:itemsPerPage>
<entry>
<category term="Romans" label="Romans"/>
<category term="Aventures" label="Aventures"/>
</entry>
</feed>
First, I would like to know what something like opensearch:totalResults is called in XML terms.
I also need your help obtaining the following value with PHP.
<opensearch:totalResults>2338</opensearch:totalResults> I need to get 2338 into a PHP variable.
Thank you.
Edit: thank you all for your answers.
I was able to fix it in the following way.
$xml = simplexml_load_string($xmltext);
$val = $xml->xpath('opensearch:totalResults');
echo $val[0];

You can parse all of this information in PHP using DOM. For example:
$doc = new DOMDocument;
$doc->loadXML($xml); //$xml is your xml string
echo $doc->getElementsByTagName("totalResults")->item(0)->nodeValue;

For your first question, opensearch:totalResults is the qualified name of an element. It is called a qualified name (you might come across this as QName) because it combines a namespace prefix (opensearch) with the element's local name (totalResults).
For your second question, you can easily parse your XML into a DOMDocument and then query it for the value of the relevant tag. There are lots of examples on SO and of course on Google; a basic one from PHP.net is here.
Important note: Your current XML document does not contain an XML namespace declaration for the opensearch namespace, and will not parse as a result. You need to add such a declaration by making a modification:
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="fr"
xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
If you need more powerful querying you can also use XPath. A minimal example would look like:
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//opensearch:totalResults');
foreach ($nodes as $node) {
echo $node->nodeValue;
}
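If you would rather not depend on the prefix used inside the document, you can bind your own prefix to the namespace URI with registerNamespace(). The following is only a sketch, assuming the feed already carries the xmlns:opensearch declaration described above; the os prefix and the $total variable are names chosen for this example.

```php
<?php
// The sample feed with the opensearch namespace declared on the root element.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xml:lang="fr">
  <title>Liste des ebooks</title>
  <opensearch:totalResults>2338</opensearch:totalResults>
  <opensearch:itemsPerPage>100</opensearch:itemsPerPage>
</feed>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
// "os" is our own prefix; XPath matches on the namespace URI, not on the document's prefix.
$xpath->registerNamespace('os', 'http://a9.com/-/spec/opensearch/1.1/');

// number(...) makes evaluate() return a float, which is cast to int here.
$total = (int) $xpath->evaluate('number(//os:totalResults)');

echo $total, "\n";
```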

opensearch is a namespace prefix, so you can try to access the element like this (the prefix is case-sensitive):
$yourXml->children('opensearch', true)->totalResults
Hope it helps.

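Here is a complete PHP example that prints the value. Note that the opensearch prefix has to be declared on <feed>, and the element is read through children() because it is not in the default namespace:
<?php
$xml ='<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xml:lang="fr">
<title>Liste des ebooks</title>
<updated>2012-03-01T01:23:24Z</updated>
<author>
<name>Drown Del</name>
</author>
<opensearch:totalResults>2338</opensearch:totalResults>
<opensearch:itemsPerPage>100</opensearch:itemsPerPage>
<entry>
<category term="Romans" label="Romans"/>
<category term="Aventures" label="Aventures"/>
</entry>
</feed>';
$dom = new DOMDocument();
$dom->loadXML($xml);
$xmlD = simplexml_import_dom($dom);
// totalResults is in the opensearch namespace, so select that namespace's children by prefix
echo $xmlD->children('opensearch', true)->totalResults;
?>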

Read your xml file with simplexml_load_file as an object
Then get your variable like this:
$object->{'opensearch:totalResults'};

Retrieve an XML pseudo class value in PHP?

I'm trying to develop a site with some youtube videos. After I retrieve the XML file from their API, I have the following.
<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns='http://www.w3.org/2005/Atom' xmlns:media='http://search.yahoo.com/mrss/' xmlns:gd='http://schemas.google.com/g/2005' xmlns:yt='http://gdata.youtube.com/schemas/2007'>
<id>http://gdata.youtube.com/feeds/api/videos/4ZsiqqOyWx8</id>
<published>2007-08-03T05:48:51.000Z</published>
[...]
<author>
<name>ak326</name>
<uri>http://gdata.youtube.com/feeds/api/users/ak326</uri>
</author>
<gd:comments>
<gd:feedLink href='http://gdata.youtube.com/feeds/api/videos/4ZsiqqOyWx8/comments' countHint='0'/>
</gd:comments>
<media:group>
[...]
<yt:duration seconds='222'/>
</media:group>
<gd:rating average='5.0' max='5' min='1' numRaters='4' rel='http://schemas.google.com/g/2005#overall'/>
<yt:statistics favoriteCount='8' viewCount='2674'/>
</entry>
I'm trying to retrieve the length of this video with PHP, using
echo $xml->media->yt
but it's not working. I think it has something to do with the pseudo class on media and yt, but I don't know how to select those.

Try DOMXPath
$xml = new DOMDocument();
$xml->load('path/to/file');
$xpath = new DOMXPath($xml);
$xpath->registerNamespace("atom", "http://www.w3.org/2005/Atom");
$xpath->registerNamespace("media", "http://search.yahoo.com/mrss/");
$xpath->registerNamespace("yt", "http://gdata.youtube.com/schemas/2007");
print $xpath->query("/atom:entry/media:group/yt:duration/@seconds")->item(0)->value;

Those XML elements are namespaced. You need to get the namespace information.
Example
// get nodes in media: namespace for media information
$media = $entry->children('http://search.yahoo.com/mrss/');
// get video player URL
$attrs = $media->group->player->attributes();

I'm assuming you're using SimpleXML here
$nsMedia = $xml->children('http://search.yahoo.com/mrss/');
$group = $nsMedia->group;
$nsYt = $group->children('http://gdata.youtube.com/schemas/2007');
$duration = $nsYt->duration;
echo $duration['seconds'];
