How to fetch exact parameter from XML string with PHP?

This is part of XML document:
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
My code:
$xml = simplexml_load_string($result);
foreach ($xml->entry as $pixinfo) {
echo $pixinfo->link[1]['href'];
}
The problem is that there can be one or more link elements, and I need only the one with the rel="enclosure" attribute.
What is the easiest way without extra IF and loops?
Thank you!

For that you can use DOMXPath, more specifically the query function. Let's say your $result variable contains the following:
<?xml version='1.0' encoding='UTF-8'?>
<entries>
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
<entry>
<author>
<name>Dunnock_D</name>
<uri>http://www.flickr.com/people/dunnock_d/</uri>
</author>
<link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
<link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
</entry>
</entries>
I know the entries are repeated, but it's only for demo purposes. The code to get only the enclosure links would be:
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($result);
$xpath = new DOMXPath($doc);
$entries = $xpath->query('//entries/entry');
foreach ($entries as $entry) {
// Select the link whose rel attribute is "enclosure"; item(0) is null when there is none.
$link = $xpath->query('link[@rel="enclosure"]', $entry)->item(0);
if ($link !== null) {
echo $link->getAttribute('href'), "\n";
}
}

You are already using SimpleXML, so you can just use the attributes() method: http://php.net/manual/pt_BR/simplexmlelement.attributes.php
Or you can access the attributes directly:
foreach ($xml->entry as $pixinfo) {
if($pixinfo->link[1]['rel'] == 'enclosure') {
echo $pixinfo->link[1]['href'];
}
}
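The snippet above only inspects the second link, so it prints nothing when the enclosure sits at another position. A minimal sketch of the attributes() approach that checks every link is shown here; findLinkHref() is a hypothetical helper name, and the sample XML is a shortened copy of the feed from the question.

```php
<?php
// Hypothetical helper: returns the href of the first <link> whose rel matches, or null.
function findLinkHref(SimpleXMLElement $entry, string $rel): ?string
{
    foreach ($entry->link as $link) {
        // attributes() exposes the element's attributes as a SimpleXMLElement.
        $attrs = $link->attributes();
        if ((string) $attrs->rel === $rel) {
            return (string) $attrs->href;
        }
    }
    return null;
}

// Shortened sample data based on the question.
$result = <<<XML
<entries>
  <entry>
    <link rel="license" type="text/html" href="https://creativecommons.org/licenses/by-nc/2.0/deed.en" />
    <link rel="enclosure" type="image/jpeg" href="http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg" />
  </entry>
</entries>
XML;

$xml = simplexml_load_string($result);
foreach ($xml->entry as $pixinfo) {
    echo findLinkHref($pixinfo, 'enclosure'), "\n";
}
```

This still uses a loop and an if, so it does not meet the "no extra IF and loops" wish; it only shows how attributes() can be used safely when the link order is unknown.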

The solution is Xpath.
With SimpleXML you can fetch the attribute node and cast the generated SimpleXMLElement into a string. You should make sure that you got an element before you cast it. SimpleXMLElement::xpath() will always return an array of SimpleXMLElement objects.
$entry = new SimpleXMLElement($xml);
$enclosures = $entry->xpath('link[@rel="enclosure"]/@href');
if (count($enclosures) > 0) {
var_dump((string)$enclosures[0]);
}
Output:
string(63) "http://farm8.staticflickr.com/7548/26820724620_1d221c3187_b.jpg"
With DOM the bootstrap is slightly larger, but you can fetch the href attribute directly as a string:
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
var_dump(
$xpath->evaluate('string(/entry/link[@rel="enclosure"]/@href)')
);
This will return an empty string if the expression does not match.

Related

PHP: How to extract <content type="application/xml"> nodes from an XML file?

I have a valid XML file (generated from SharePoint) which looks like this (in browser):
Sample XML File
<?xml version="1.0" encoding="utf-8"?>
<feed xml:base="https://www.example.com/_api/" xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns:georss="http://www.georss.org/georss" xmlns:gml="http://www.opengis.net/gml">
<id>9913f043-xxxx-xxxx-xxxx-xxxx-xxxx</id>
<title />
<updated>2017-05-23T06:08:01Z</updated>
<entry m:etag="&quot;23&quot;">
<id>Web/Lists(guid'13306095-xxxx-xxxx-xxxx-xxxx-xxxx-xxxx')/Items(1)</id>
<category term="SP.Data.XXXXXXXXXXXXXXXXXXXXX" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
<link rel="edit" href="Web/Lists(guid'13306095-xxxx-xxxx-xxxx-xxxx-xxxx')/Items(1)" />
<title />
<updated>2017-05-23T06:08:01Z</updated>
<author>
<name />
</author>
<content type="application/xml">
<m:properties>
<d:FileSystemObjectType m:type="Edm.Int32">0</d:FileSystemObjectType>
<d:Id m:type="Edm.Int32">1</d:Id>
<d:ContentTypeId>0x0100B6A3B67BE96F724682CCDC8FBE9D70C2</d:ContentTypeId>
<d:Title m:null="true" />
<d:Topic>How to google?</d:Topic>
<d:Cats m:type="Collection(Edm.Int32)">
<d:element>1</d:element>
<d:element>2</d:element>
<d:element>3</d:element>
<d:element>4</d:element>
<d:element>5</d:element>
<d:element>6</d:element>
<d:element>7</d:element>
</d:Cats>
</m:properties>
</content>
</entry>
<entry>
.
.
</entry>
<entry>
.
.
</entry>
</feed>
(Note: I cut off some repeated nodes here, because it is so long.)
Clearly, we have inner nodes <content type="application/xml"> which also contain data inside.
The Problem (When parsing with PHP)
In PHP, I used this code to parse it (trying to extract the data):
$xml = simplexml_load_file("data.xml");
foreach ($xml->entry as $item) {
echo $item->updated . PHP_EOL; // <--- This works!
print_r($item->content); // <--- This doesn't work as expected.
}
...and it gives me this:
2017-05-23T06:08:01Z
SimpleXMLElement Object
(
[@attributes] => Array
(
[type] => application/xml
)
)
2017-05-23T06:08:01Z
SimpleXMLElement Object
(
[@attributes] => Array
(
[type] => application/xml
)
)
.
.
Question (Help!)
How do I extract (get) the actual data inside those <content type="application/xml"> nodes?
Please help. Thank you in advance.
The elements below "content" have a namespace (d:...). I had the same problem a while ago. This should help:
$xml = simplexml_load_file("data.xml");
foreach ($xml->entry as $item) {
echo $item->updated . PHP_EOL;
$ns = $item->content->children('http://schemas.microsoft.com/ado/2007/08/dataservices/metadata');
print_r($ns->properties);
}
I updated the code. I'm sure print_r($ns->properties) doesn't show the complete sub-elements, because they are from another namespace. I guess you can then do this:
$nsd = $ns->properties->children("http://schemas.microsoft.com/ado/2007/08/dataservices");
and proceed with the result.
In your example namespaces can be found in the document element:
xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
(use the URL between the quotation marks)
d: and m: are used in the document to reference these namespaces.
EDIT: There is another namespace involved, which I didn't notice at first. The solution can be adapted; I changed the code a bit.
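Putting the two children() calls together, a minimal sketch of reading the d: fields of each entry could look like the following. entryProperties() is a hypothetical helper name, and the inline feed is a trimmed-down stand-in for data.xml that keeps only a few of the fields from the question.

```php
<?php
// Namespace URIs copied from the xmlns:m and xmlns:d declarations on <feed>.
const NS_M = 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata';
const NS_D = 'http://schemas.microsoft.com/ado/2007/08/dataservices';

// Hypothetical helper: collects the d:* fields of one Atom <entry> into an array.
function entryProperties(SimpleXMLElement $entry): array
{
    $fields = [];
    $properties = $entry->content->children(NS_M)->properties;
    foreach ($properties->children(NS_D) as $name => $value) {
        $elements = $value->children(NS_D);
        if (count($elements) > 0) {
            // Collection such as d:Cats: gather its d:element children.
            $list = [];
            foreach ($elements as $element) {
                $list[] = (string) $element;
            }
            $fields[$name] = $list;
        } else {
            $fields[$name] = (string) $value;
        }
    }
    return $fields;
}

// Trimmed-down stand-in for data.xml (assumption: same structure as the sample above).
$feed = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry>
    <updated>2017-05-23T06:08:01Z</updated>
    <content type="application/xml">
      <m:properties>
        <d:Id m:type="Edm.Int32">1</d:Id>
        <d:Topic>How to google?</d:Topic>
        <d:Cats m:type="Collection(Edm.Int32)">
          <d:element>1</d:element>
          <d:element>2</d:element>
        </d:Cats>
      </m:properties>
    </content>
  </entry>
</feed>
XML;

$xml = simplexml_load_string($feed);
foreach ($xml->entry as $item) {
    echo $item->updated . PHP_EOL;
    print_r(entryProperties($item));
}
```

Passing the full namespace URI to children() keeps the code working even if the feed later uses different prefixes for the same namespaces.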
I had a very similar issue. I was finally able to get my example working with this.
function pre($array){
echo "<pre>";
print_r($array);
echo "</pre>";
}
$record[$count]['id'] = $id->id;
$xmlData = utf8_encode(file_get_contents("https://ucf.uscourts.gov/odata.svc/Creditors(guid'81044f71-fb3c-11e5-ac5b-0050569d488e')"));
$xml = new SimpleXMLElement($xmlData);
$properties = $xml->content->children('http://schemas.microsoft.com/ado/2007/08/dataservices/metadata');
$fields = $properties->properties->children("http://schemas.microsoft.com/ado/2007/08/dataservices");
pre($fields);
$key = (string)$fields->Key;
$lastName = (string)$fields->LastName;
echo $key. "<br />";
echo $lastName. "<br />";
You would need to replace the URL in file_get_contents, and the Key and LastName properties, with the values you are looking for. I like to use a pre() function to make the output easier to read; you can remove that part. Hope this helps someone.

PHP outputting nested child nodes

I am trying to load an XML file into an array that is then output to the screen. I know the XML file loads, because I can output entry > id, but I cannot access its child nodes. I need the data located in:
content > s:organisationSummary
content > s:organisationSummary > s:address
content > s:organisationSummary > s:geographicCoordinates
How would I access the data located in s:organisationSummary, s:address and s:geographicCoordinates, so that I can call getElementsByTagName for each item in that child node?
$doc2 = new DOMDocument();
$url = 'http://v1.syndication.nhschoices.nhs.uk/organisations/'.$_POST['ServiceType'].'/postcode/'.$_POST['PostCode'].'.xml?apikey=??&range=50';
echo $url;
$doc2->load($url);
$arrFeeds = array();
foreach ($doc2->getElementsByTagName('entry') as $node)
{
echo $node->getElementsByTagName($content->'s:name');
$itemRSS = array (
'PracticeName' => $organisationSummary->getElementsByTagName('s:name')->item(0)->nodeValue
);
array_push($arrFeeds, $itemRSS);
}
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
<title type="text">NHS Choices - GP Practices Near Postcode - ls1- Within 50km</title>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/bd164jt?range=50</id>
<rights type="text">© Crown Copyright 2009</rights>
<updated>2012-07-06T10:24:46+01:00</updated>
<category term="Search"/>
<logo>http://www.nhs.uk/nhscwebservices/documents/logo1.jpg</logo>
<author>
<name>NHS Choices</name>
<uri>http://www.nhs.uk</uri>
<email>webservices@nhschoices.nhs.uk</email>
</author>
<link rel="self" type="application/xml" title="NHS Choices - GP Practices Near Postcode - ;ls1 - Within 50km" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/ls1?apikey=??&amp;range=50"/>
<link rel="first" type="application/xml" title="first" length="1000" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/ls1?apikey=??&amp;range=50&amp;page=1"/>
<link rel="next" type="application/xml" title="next" length="1000" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/Ls1?apikey=??&amp;range=50&amp;page=2"/>
<link rel="last" type="application/xml" title="last" length="1000" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/LS1?apikey=??&amp;range=50&amp;page=10"/>
<link rel="alternate" title="NHS Choices - Find and choose services - GP Practices" href="http://www.nhs.uk/ServiceDirectories/pages/ServiceSearch.aspx?ServiceType=GP"/>
<s:SearchCoords>439300,411100</s:SearchCoords>
<entry>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/1</id>
<title type="text">Medical Practice</title>
<updated>2012-07-06T09:24:46Z</updated>
<link rel="self" title="Medical Practice" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/1?apikey=??"/>
<link rel="alternate" title="Medical Practice" href="http://www.nhs.uk/ServiceDirectories/Pages/GP.aspx?pid=1"/>
<content type="application/xml">
<s:organisationSummary>
<s:name>Medical Practice</s:name>
<s:address>
<s:addressLine>Health Care Centre</s:addressLine>
<s:addressLine>2</s:addressLine>
<s:addressLine>Town</s:addressLine>
<s:addressLine>Yorkshire</s:addressLine>
<s:postcode>?</s:postcode>
</s:address>
<s:contact type="General">
<s:telephone>5558383</s:telephone>
</s:contact>
<s:geographicCoordinates>
<s:northing>438880</s:northing>
<s:easting>411444</s:easting>
<s:longitude>-1.82821202227791</s:longitude>
<s:latitude>53.996218047559</s:latitude>
</s:geographicCoordinates>
<s:Distance>0.5</s:Distance>
</s:organisationSummary>
</content>
</entry>
<entry>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/2</id>
<title type="text">Surgery</title>
<updated>2012-07-06T09:24:46Z</updated>
<link rel="self" title="Surgery" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/1?apikey=??"/>
<link rel="alternate" title="Surgery" href="http://www.nhs.uk/ServiceDirectories/Pages/GP.aspx?pid=2"/>
<content type="application/xml">
<s:organisationSummary>
<s:name>Surgery</s:name>
<s:address>
<s:addressLine>Healthcare Centre</s:addressLine>
<s:addressLine>Kings</s:addressLine>
<s:addressLine>Town</s:addressLine>
<s:postcode>?</s:postcode>
</s:address>
<s:contact type="General">
<s:telephone>555555</s:telephone>
<s:email>Email</s:email>
</s:contact>
<s:geographicCoordinates>
<s:northing>78421</s:northing>
<s:easting>484100</s:easting>
<s:longitude>-1.828987402220691</s:longitude>
<s:latitude>53.987218047559</s:latitude>
</s:geographicCoordinates>
<s:Distance>0.5</s:Distance>
</s:organisationSummary>
</content>
</entry>
</feed>
This is a namespaced document, so you need to use the proper namespace methods, e.g. DOMDocument::getElementsByTagNameNS.
In addition, there is so much wrong with your loop that I suspect you're either not including all the code or you really misunderstand how DOMDocument works.
$NS = array(
's' => "http://syndication.nhschoices.nhs.uk/services",
'atom' => "http://www.w3.org/2005/Atom",
);
$entries = array();
foreach ($doc2->getElementsByTagNameNS($NS['s'], 'organisationSummary') as $node)
{
$entries[] = array(
'name' => trim($node->getElementsByTagNameNS($NS['s'], 'name')->item(0)->textContent),
'address' => keyByElementName($node->getElementsByTagNameNS($NS['s'], 'address')->item(0)),
'geographicCoordinates' => keyByElementName($node->getElementsByTagNameNS($NS['s'], 'geographicCoordinates')->item(0)),
);
}
function keyByElementName(DOMNode $node)
{
$elem = array();
foreach ($node->childNodes as $child) {
if ($child->nodeType===XML_ELEMENT_NODE) {
$elem[$child->localName] = trim($child->textContent);
}
}
return $elem;
}
However, consider using DOMXPath or SimpleXML, as these will be easier than DOM traversal.
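For comparison, a DOMXPath version might look like the sketch below. summariseOrganisations() is a hypothetical helper name, the 'atom' and 's' prefixes are chosen locally for the queries, and the inline XML is a shortened copy of the first entry from the question rather than the live feed.

```php
<?php
// Hypothetical helper: maps each s:organisationSummary to a few of its fields.
function summariseOrganisations(DOMDocument $doc): array
{
    $xpath = new DOMXPath($doc);
    // Prefixes are local to these queries; only the URIs must match the feed.
    $xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
    $xpath->registerNamespace('s', 'http://syndication.nhschoices.nhs.uk/services');

    $rows = [];
    foreach ($xpath->query('//atom:entry/atom:content/s:organisationSummary') as $summary) {
        $rows[] = [
            // string() yields an empty string when the path matches nothing.
            'name'      => $xpath->evaluate('string(s:name)', $summary),
            'postcode'  => $xpath->evaluate('string(s:address/s:postcode)', $summary),
            'latitude'  => $xpath->evaluate('string(s:geographicCoordinates/s:latitude)', $summary),
            'longitude' => $xpath->evaluate('string(s:geographicCoordinates/s:longitude)', $summary),
        ];
    }
    return $rows;
}

// Shortened sample entry based on the feed in the question.
$sample = <<<XML
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <content type="application/xml">
      <s:organisationSummary>
        <s:name>Medical Practice</s:name>
        <s:address>
          <s:addressLine>Health Care Centre</s:addressLine>
          <s:postcode>?</s:postcode>
        </s:address>
        <s:geographicCoordinates>
          <s:longitude>-1.82821202227791</s:longitude>
          <s:latitude>53.996218047559</s:latitude>
        </s:geographicCoordinates>
      </s:organisationSummary>
    </content>
  </entry>
</feed>
XML;

$doc2 = new DOMDocument();
$doc2->loadXML($sample);
print_r(summariseOrganisations($doc2));
```

Relative expressions such as s:address/s:postcode are evaluated against the context node passed as the second argument, so each row only reads from its own organisationSummary.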

Problem getting XPath working in PHP

I've been trying to access the NHS API using different methods to read in the XML.
Here is a snippet of the XML:
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
<title type="text">NHS Choices - GP Practices Near Postcode - W1T4LB - Within 5km</title>
<entry>
<id>http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/27369</id>
<title type="text">Fitzrovia Medical Centre</title>
<updated>2011-08-20T22:47:39Z</updated>
<link rel="self" title="Fitzrovia Medical Centre" href="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/27369?apikey="/>
<link rel="alternate" title="Fitzrovia Medical Centre" href="http://www.nhs.uk/ServiceDirectories/Pages/GP.aspx?pid=303A92EF-EC8D-496B-B9CD-E6D836D13BA2"/>
<content type="application/xml">
<s:organisationSummary>
<s:name>Fitzrovia Medical Centre</s:name>
<s:address>
<s:addressLine>31 Fitzroy Square</s:addressLine>
<s:addressLine>London</s:addressLine>
<s:postcode>W1T6EU</s:postcode>
</s:address>
<s:contact type="General">
<s:telephone>020 7387 5798</s:telephone>
</s:contact>
<s:geographicCoordinates>
<s:northing>182000</s:northing>
<s:easting>529000</s:easting>
<s:longitude>-0.140267259415255</s:longitude>
<s:latitude>51.5224357586293</s:latitude>
</s:geographicCoordinates>
<s:Distance>0.360555127546399</s:Distance>
</s:organisationSummary>
</content>
</entry>
</feed>
I've been using this PHP to access it:
<?php
$feedURL = 'http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/W1T4LB.xml?apikey=&range=5';
$raw = file_get_contents($feedURL);
$dom = new DOMDocument();
$dom->loadXML($raw);
$xp = new DOMXPath($dom);
$result = $xp->query("//entry"); // select all entry nodes
print $result->item(0)->nodeValue;
?>
Problem is I have no results, the $raw data is present, but the $dom never gets filled with the string XML. This means the XPath won't work.
Also... for bonus points: how do I access the <s:Name> tag using XPath in this instance??
Appreciate the help as always.
Edit:
Here is the resulting PHP that worked fine, thanks to @Andrej L:
<?php
$feedURL = 'http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/postcode/W1T4LB.xml?apikey=&range=5';
$xml = simplexml_load_file($feedURL);
$xml->registerXPathNamespace('s', 'http://syndication.nhschoices.nhs.uk/services');
$result = $xml->xpath('//s:name');
foreach ($result as $title)
{
print $title . '<br />';
}
?>
I think it's better to use the SimpleXML library. See http://www.php.net/manual/en/book.simplexml.php
Register the namespace using http://www.php.net/manual/en/simplexmlelement.registerxpathnamespace.php
Then use the xpath method. See http://www.php.net/manual/en/simplexmlelement.xpath.php
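If you would rather stay with DOMXPath, note that the feed declares a default Atom namespace, and in XPath 1.0 an unprefixed name such as entry only matches elements in no namespace. That may be why //entry returned nothing in the original attempt. The sketch below registers prefixes for both namespaces; the prefix 'atom' is an arbitrary choice, and the inline XML is a shortened copy of the snippet from the question.

```php
<?php
// Shortened sample based on the feed in the question.
$raw = <<<XML
<feed xmlns:s="http://syndication.nhschoices.nhs.uk/services" xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title type="text">Fitzrovia Medical Centre</title>
    <content type="application/xml">
      <s:organisationSummary>
        <s:name>Fitzrovia Medical Centre</s:name>
      </s:organisationSummary>
    </content>
  </entry>
</feed>
XML;

$dom = new DOMDocument();
$dom->loadXML($raw);

$xp = new DOMXPath($dom);
// Bind prefixes to the namespace URIs declared on <feed>.
$xp->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xp->registerNamespace('s', 'http://syndication.nhschoices.nhs.uk/services');

// Unprefixed names select no-namespace elements only, so this finds nothing.
$unprefixed = $xp->query('//entry')->length;
// With the registered prefix the Atom entries are found.
$prefixed = $xp->query('//atom:entry')->length;

// Collect the text of every s:name below an entry.
$names = [];
foreach ($xp->query('//atom:entry//s:name') as $node) {
    $names[] = $node->textContent;
}

echo "unprefixed: $unprefixed, prefixed: $prefixed\n";
print_r($names);
```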

Traversing XML in PHP

I have the following XML code that I'm trying to parse, but I'm not sure how to traverse some of the data in PHP:
<entry>
<id>http://data.treasury.gov:8001/Feed.svc/DailyTreasuryYieldCurveRateData(5360)</id>
<title type="text"></title>
<updated>2011-06-09T20:15:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="DailyTreasuryYieldCurveRateDatum" href="DailyTreasuryYieldCurveRateData(5360)" />
<category term="TreasuryDataWarehouseModel.DailyTreasuryYieldCurveRateDatum" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
<content type="application/xml">
<m:properties>
<d:Id m:type="Edm.Int32">5360</d:Id>
<d:NEW_DATE m:type="Edm.DateTime">2011-06-01T00:00:00</d:NEW_DATE>
<d:BC_1MONTH m:type="Edm.Double">0.04</d:BC_1MONTH>
<d:BC_3MONTH m:type="Edm.Double">0.05</d:BC_3MONTH>
<d:BC_6MONTH m:type="Edm.Double">0.11</d:BC_6MONTH>
<d:BC_1YEAR m:type="Edm.Double">0.18</d:BC_1YEAR>
<d:BC_2YEAR m:type="Edm.Double">0.44</d:BC_2YEAR>
<d:BC_3YEAR m:type="Edm.Double">0.74</d:BC_3YEAR>
<d:BC_5YEAR m:type="Edm.Double">1.6</d:BC_5YEAR>
<d:BC_7YEAR m:type="Edm.Double">2.28</d:BC_7YEAR>
<d:BC_10YEAR m:type="Edm.Double">2.96</d:BC_10YEAR>
<d:BC_20YEAR m:type="Edm.Double">3.83</d:BC_20YEAR>
<d:BC_30YEAR m:type="Edm.Double">4.15</d:BC_30YEAR>
<d:BC_30YEARDISPLAY m:type="Edm.Double">4.15</d:BC_30YEARDISPLAY>
</m:properties>
</content>
</entry>
I can only get as far as
entry->content
As the following throws an error for having a colon:
entry->content->m:properties
How do I access what's inside content such as d:NEW_DATE?
In SimpleXML you can use the children('prefix', true) and attributes('prefix', true) functions to access namespaced content.
entry->content->children('m', true)->properties
or to access d:NEW_DATE
entry->content->children('m', true)->properties->children('d', true)->NEW_DATE
or one step further to access the m:type attribute
entry->content->children('m', true)->properties->children('d', true)->NEW_DATE->attributes('m', true)->type
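As a runnable sketch of the chains above: the snippet in the question does not show its xmlns declarations, so the namespace URIs below are an assumption (the usual OData ones), and the XML is shortened to three fields.

```php
<?php
// Assumption: the m: and d: prefixes are bound to the usual OData namespace URIs.
$xmlString = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
       xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <content type="application/xml">
    <m:properties>
      <d:Id m:type="Edm.Int32">5360</d:Id>
      <d:NEW_DATE m:type="Edm.DateTime">2011-06-01T00:00:00</d:NEW_DATE>
      <d:BC_10YEAR m:type="Edm.Double">2.96</d:BC_10YEAR>
    </m:properties>
  </content>
</entry>
XML;

$entry = simplexml_load_string($xmlString);

// The second argument "true" tells children() that 'm' and 'd' are prefixes, not URIs.
$fields = $entry->content->children('m', true)->properties->children('d', true);

// Cast to string to get the text content instead of a SimpleXMLElement.
$newDate = (string) $fields->NEW_DATE;
$dateType = (string) $fields->NEW_DATE->attributes('m', true)->type;
$tenYear = (float) (string) $fields->BC_10YEAR;

echo $newDate, ' (', $dateType, '), 10-year rate: ', $tenYear, "\n";
```

Prefix-based lookups depend on the prefixes the feed happens to use; passing the namespace URI to children() and attributes() instead is more robust if the producer ever changes them.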
You can use SimpleXML's functions, but my favourite class is DOMDocument.

XML with xpath and PHP: How to access the text value of an attribute of an entry

xml:
<entry>
<link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://picasaweb.google.com/data/feed/api/user/xy/albumid/531885671007533108" />
<link rel="alternate" type="text/html" href="http://picasaweb.google.com/xy/Cooking" />
<link rel="self" type="application/atom+xml" href="http://picasaweb.google.com/data/entry/api/user/xy/albumid/531885671007533108" />
</entry>
Here's what I've tried:
foreach($xml->entry as $feed) {
$album_url = $feed->xpath("./link[@rel='alternate']/@href");
echo $album_url;
}
I've tried all kinds of permutations, too but no luck.
Expected result would be http://picasaweb.google.com/xy/Cooking
The result I get is "". Can someone explain what I'm doing wrong?
Can someone please help me out? I've been at this for hours...
xpath() returns an array, so you have to select the first element of that array, at index 0. Note that if there's no match, it may return an empty array. Therefore, you should add an if (isset($xpath[0])) check, just in case.
foreach ($xml->entry as $entry)
{
$xpath = $entry->xpath('./link[@rel="alternate"]/@href');
if (isset($xpath[0]))
{
echo $xpath[0], "\n";
}
}
You were close:
./link[@rel='alternate']/@href
Should be the correct XPath to get those values.
