Identical nested XML elements with namespaces and PHP - php

Try as I may, I cannot seem to grab the value of the "Id" attribute in the nested apcm:Property element, where the "Name" attribute equals "sequenceNumber", on line 12. As you can see, the element of interest is buried in a nest of other elements with an identical name and namespace.
Using PHP, I'm having a difficult time wrapping my head around how to grab that Id value.
<?xml version="1.0" encoding="utf-8" ?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:apcm="http://ap.org/schemas/03/2005/apcm" xmlns:apnm="http://ap.org/schemas/03/2005/apnm" xmlns:georss="http://www.georss.org/georss">
<id>urn:publicid:ap.org:30085</id>
<title type="xhtml">
<apxh:div xmlns:apxh="http://www.w3.org/1999/xhtml">
<apxh:span>AP New York State News - No Weather</apxh:span>
</apxh:div>
</title>
<apcm:Property Name="FeedProperties">
<apcm:Property Name="Entitlement" Id="urn:publicid:ap.org:product:30085" Value="AP New York State News - No Weather" />
<apcm:Property Name="FeedSequencing">
<apcm:Property Name="sequenceNumber" Id="169310964" />
<apcm:Property Name="minDateTime" Value="2012-05-22T18:04:18.913Z" />
</apcm:Property>
</apcm:Property>
<updated>2012-05-22T18:04:18.913Z</updated>
<author>
<name>The Associated Press</name>
<uri>http://www.ap.org</uri>
</author>
<rights>Copyright 2012 The Associated Press. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.</rights>
<link rel="self" href="http://syndication.ap.org/AP.Distro.Feed/GetFeed.aspx?idList=30085&idListType=products&maxItems=20" />
<entry>
...
</entry>
</feed>

You have to register the namespaces and use [] predicates to identify which Property element you are interested in. It is safest NOT to use the double slash (//), i.e., to start the lookup from the document element.
<?php
$xml = <<<EOD
...
EOD;
$sxe = new SimpleXMLElement($xml);
$sxe->registerXPathNamespace('apcm', 'http://ap.org/schemas/03/2005/apcm');
$sxe->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
$result = $sxe->xpath('/atom:feed/apcm:Property[@Name=\'FeedProperties\']/apcm:Property[@Name=\'FeedSequencing\']/apcm:Property[@Name=\'sequenceNumber\']/@Id');
foreach ($result as $sequenceNumber) {
echo $sequenceNumber . "\n";
}
?>
Note that there may theoretically be multiple sibling Property elements with the same @Name, so this XPath may produce multiple nodes (@Id values).
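A minimal sketch of coping with zero, one, or several matches, assuming a trimmed-down copy of the feed from the question. The variable names $ids and $first are illustrative and not part of the original answer. xpath() returns an array of SimpleXMLElement objects, so each attribute node is cast to a string before use:

```php
<?php
// Trimmed copy of the feed from the question; only the nested Property elements are kept.
$xml = <<<EOD
<?xml version="1.0" encoding="utf-8" ?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:apcm="http://ap.org/schemas/03/2005/apcm">
  <apcm:Property Name="FeedProperties">
    <apcm:Property Name="FeedSequencing">
      <apcm:Property Name="sequenceNumber" Id="169310964" />
      <apcm:Property Name="minDateTime" Value="2012-05-22T18:04:18.913Z" />
    </apcm:Property>
  </apcm:Property>
</feed>
EOD;

$sxe = new SimpleXMLElement($xml);
$sxe->registerXPathNamespace('apcm', 'http://ap.org/schemas/03/2005/apcm');
$sxe->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');

$nodes = $sxe->xpath(
    '/atom:feed/apcm:Property[@Name="FeedProperties"]'
    . '/apcm:Property[@Name="FeedSequencing"]'
    . '/apcm:Property[@Name="sequenceNumber"]/@Id'
);

// Cast every matched attribute node to a plain string.
$ids = [];
foreach ($nodes ?: [] as $node) {
    $ids[] = (string) $node;
}

// Take the first match, or null when nothing matched.
$first = $ids[0] ?? null;

echo $first === null ? "no sequence number found\n" : $first . "\n";
```

Keeping all matches in $ids and picking $first explicitly makes it obvious what happens if the feed ever contains more than one sequenceNumber element.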

Related

Unable to parse atom feed

I am implementing YouTube push notifications and have set up a webhook. YouTube delivers updates in the form of an Atom feed. My problem is that I can't parse that feed.
This is the XML:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:yt="http://www.youtube.com/xml/schemas/2015">
<link rel="hub" href="https://pubsubhubbub.appspot.com" />
<link rel="self" href="https://www.youtube.com/xml/feeds/videos.xml?channel_id=UCaNoTnXcQQt3ody_cLZSihw" />
<title>YouTube video feed</title>
<updated>2018-03-01T07:21:59.144766801+00:00</updated>
<entry>
<id>yt:video:vNQyYJqFopE</id>
<yt:videoId>vNQyYJqFopE</yt:videoId>
<yt:channelId>UCaNoTnXcQQt3ody_cLZSihw</yt:channelId>
<title>Test Video 4</title>
<link rel="alternate" href="https://www.youtube.com/watch?v=vNQyYJqFopE" />
<author>
<name>Testing</name>
<uri>https://www.youtube.com/channel/UCaNoTnXcQQt3ody_cLZSihw</uri>
</author>
<published>2018-03-01T07:21:48+00:00</published>
<updated>2018-03-01T07:21:59.144766801+00:00</updated>
</entry>
<?php
$xml = '<?xml versio......';
$obj = simplexml_load_string($xml);
echo '<pre>';print_r($obj);echo '</pre>';
How do I get the value of the yt:videoId element? I am new to PHP, so if I did anything wrong please correct me.
It seems the XML elements with the yt namespace prefix (e.g. <yt:videoId>) do not show up in the output of simplexml_load_string. I don't know why, but in your case the video ID is also present in the <id> element, so you just need to extract the last segment or simply cut off the yt:video: in front of it. That is at least an easy workaround.
Also it works if you use a direct XPath to the <yt:videoId> element like this:
echo $obj->xpath('//yt:videoId')[0];
// output: vNQyYJqFopE
XPath always returns an array so you need to get the first element with [0].
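As far as I can tell, the yt: elements are parsed after all; print_r() just does not list children that carry a namespace prefix. A short sketch, assuming a trimmed copy of the feed from the question, that reads them through children() instead of XPath:

```php
<?php
// Trimmed copy of the feed from the question.
$xml = <<<EOD
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:yt="http://www.youtube.com/xml/schemas/2015">
  <entry>
    <id>yt:video:vNQyYJqFopE</id>
    <yt:videoId>vNQyYJqFopE</yt:videoId>
    <yt:channelId>UCaNoTnXcQQt3ody_cLZSihw</yt:channelId>
    <title>Test Video 4</title>
  </entry>
</feed>
EOD;

$obj = simplexml_load_string($xml);

// Select the children of <entry> that live in the YouTube namespace.
// Passing the namespace URI avoids depending on the "yt" prefix itself.
$yt = $obj->entry->children('http://www.youtube.com/xml/schemas/2015');

$videoId   = (string) $yt->videoId;
$channelId = (string) $yt->channelId;

echo $videoId, "\n", $channelId, "\n";
```

The casts to string matter: without them $videoId would still be a SimpleXMLElement object rather than a plain value.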
Try this (updated)
$str = $obj->entry->id;
echo substr($str, strpos($str, "video:")+ 6);
Get the channel
$chan = $obj->entry->author->uri;
echo substr($chan , strpos($chan , "channel/")+ 8);

Search with PHP in XML <B> Data via <C> tag

First, I know that there are already some questions about similar things (Searching an XML file using PHP, Load external xml file?), but I can't write any PHP myself, so the following (search.php) is what I copy-pasted:
<html>
<head>
<title>Search</title>
</head>
<body>
<?php
$search = $_POST["search"];
$xml=simplexml_load_file("inhaltsverzeichniss.xml") or die("Error: Cannot create object");
$result = $xml->xpath('//array/entry/tags[.="$search"]');
while(list( , $node) = each($result)) {
echo '/array/entry/tags: ',$node,"\n";
}
?>
</body>
Then this is my HTML code (in case someone needs it):
<form action="search.php" method="post" >
<input type="text" name="search" placeholder="search for stuff" >
<input type="submit" name="search_button" value="Suche" >
</form>
and this is the XML document I want to search in (inhaltsverzeichniss.xml):
<?xml version="1.0" encoding="UTF-8"?>
<array>
<entry>
<url>Zeitformen Spanisch.odp</url>
<name>Zeitformen Tabelle</name>
<tags>spanisch</tags>
</entry>
<entry>
<url>german-stuff.odt</url>
<name>etwas Deutsches</name>
<tags>german</tags>
</entry>
<entry>
<url>something_english.html</url>
<name>english things</name>
<tags>english</tags>
</entry>
<entry>
<url>other-data.fileextension</url>
<name>cool name</name>
<tags>etc.</tags>
</entry>
</array>
What I want to do is search for the tags entered in the search form and output the whole corresponding entry, like the following: <a href="$url">$name</a><br>Tags: $tags
EDIT: this is my current XML with entries with multiple tags (changed it to "tag"):
<array>
<entry>
<url>spanisch/Zeitformen_Spanisch.ods</url>
<name>Zeitformen Tabelle</name>
<tag>spanisch</tag>
</entry>
<entry>
<url>german-stuff.odt</url>
<name>etwas Deutsches</name>
<tag>deutsch</tag>
<tag>spanisch</tag>
<tag>englisch</tag>
</entry>
<entry>
<url>something_english.html</url>
<name>english things</name>
<tag>spanisch</tag>
<tag>englisch</tag>
</entry>
<entry>
<url>other-data.fileextension</url>
<name>cool name</name>
<tag>etc.</tag>
</entry>
</array>
Quoting the PHP Manual:
Note: Unlike the double-quoted and heredoc syntaxes, variables and escape sequences for special characters will not be expanded when they occur in single quoted strings.
You are using single quotes in
$result = $xml->xpath('//array/entry/tags[.="$search"]');
but single quotes will treat the string content as literal strings. It won't interpolate $search like you think it does.
Change your code to
$result = $xml->xpath("//array/entry/tags[.='$search']");
and it will use the value of $search instead of the literal string "$search".
However, since you want to print the full entry element, it would be better to use this XPath
$result = $xml->xpath("/array/entry[tags='$search']");
For readability I prefer
$result = $xml->xpath(sprintf('/array/entry[tags="%s"]', $search));
This will give you all the <entry> elements, instead of the <tags> elements in the $result. And while //array with a double slash does work, it's unneeded and performs somewhat worse than the direct path. That's because it will try to find the nodes regardless of position in the document.
You can then print it like you asked for like this:
foreach ($result as $entry) {
printf(
'<a href="%s">%s</a><br>Tags: %s%s',
urlencode($entry->url),
htmlspecialchars($entry->name),
htmlspecialchars($entry->tags),
PHP_EOL
);
}
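One caveat with putting $search straight into the XPath string: a search term containing a quote character breaks the expression, and XPath 1.0 has no escape sequence for quotes. A minimal sketch of one way around that follows. The helper name xpath_literal and the sample data are assumptions for illustration only, not part of SimpleXML or of the question:

```php
<?php
/**
 * Hypothetical helper: turn an arbitrary PHP string into an XPath 1.0 string literal.
 */
function xpath_literal(string $value): string
{
    // No single quote inside: wrap in single quotes.
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    // No double quote inside: wrap in double quotes.
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    // Both quote types present: split on single quotes and glue with concat().
    $parts = explode("'", $value);
    return "concat('" . implode("', \"'\", '", $parts) . "')";
}

// Made-up sample data with a tag that contains single quotes.
$xml = simplexml_load_string(<<<EOD
<array>
  <entry><url>a.odt</url><name>A</name><tags>rock'n'roll</tags></entry>
  <entry><url>b.odt</url><name>B</name><tags>german</tags></entry>
</array>
EOD
);

$search = "rock'n'roll"; // stand-in for $_POST["search"]
$result = $xml->xpath('/array/entry[tags=' . xpath_literal($search) . ']');

$names = [];
foreach ($result as $entry) {
    $names[] = (string) $entry->name;
}

echo implode(', ', $names), PHP_EOL;
```

The helper only guards the XPath expression; the htmlspecialchars() calls are still needed when the values are written into the page.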
Regarding your edit with multiple tag elements:
The XPath above will already find the correct entries. To display all tags, you can change the line
htmlspecialchars($entry->tags),
to (don't forget to adjust the XPath to read tag instead of tags):
implode_tags($entry->tag),
with the implode_tags function being defined as
function implode_tags(\SimpleXMLElement $tags) {
$allTags = [];
foreach ($tags as $tag) {
$allTags[] = htmlspecialchars($tag);
}
return implode(", ", $allTags);
}
This should then return something like "deutsch, spanisch, englisch"
See full demo at https://eval.in/885485

SimpleXML: Getting attributes into variables

I am building my website with a central XML-file and SimpleXML. The pages have some different features like the language. I would like to put these features into the XML-file with attributes of the parent node:
<content>
<item id="one" lang="en">
<title>Hello</title>
</item>
</content>
I call a certain item by the id-attribute and I know how to call subnodes like :
$xml = simplexml_load_file('file.xml');
$lang = $xml->xpath('/content/item[@id="one"]/title/text()');
$lang = $lang[0]; echo $lang;
But how do I get the attribute LANG of an item with the id="one" into a variable?
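The path to the attribute:
/content/item[@id="one"]/@lang
The value of the attribute (note that data() is an XPath 2.0 function, while PHP's XPath support is limited to XPath 1.0, where string() is the closest equivalent):
data(/content/item[@id="one"]/@lang)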
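To get that value into a PHP variable with SimpleXML, here is a minimal sketch. It assumes the XML from the question is inlined as a string rather than loaded from file.xml, and the variable names are illustrative:

```php
<?php
$xml = simplexml_load_string(
    '<content><item id="one" lang="en"><title>Hello</title></item></content>'
);

// Option 1: select the attribute node itself and cast it to a string.
$matches = $xml->xpath('/content/item[@id="one"]/@lang');
$lang = $matches ? (string) $matches[0] : null;

// Option 2: select the element, then read the attribute with array syntax.
$items = $xml->xpath('/content/item[@id="one"]');
$langFromItem = $items ? (string) $items[0]['lang'] : null;

echo $lang, "\n";
```

Option 2 is handy when several attributes of the same item are needed, since the element only has to be looked up once.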

parsing XML file using PHP (cs-s75: David Malan's)

I am making a PHP project for a Pizza Shop [This is project-0 in David Malan's course CS-S75 Building Dynamic Websites]. And the Code that I have to write must be eXtensible. That is, if the pizza shop's owner wants to add a new category, he should be able to do that pretty easily and my PHP code must accommodate those changes in the XML file without writing any new code.
For my code to be extensible though, I need some methods for filtering the XML data.
For instance inside the root node <menu>, I have child nodes item that have attributes like
<item name="Pizzas">
<category name="Onions">
</category>
</item>
<item name="Salads">
<category name = "Garden">
</category>
</item>
and there are ten item tags in total.
What I want to do is this: if the user wants to purchase the salads, I would want to filter the XML DOM tree the following way:
// $_POST['selected'] has a value of 'Salads' stored in it
$selected = $_POST['selected']
$dom = simple_xml_loadfile("menu.xml")
foreach ($dom -> xpath("menu/item[@name = $selected ]" as $item))
{
echo $item -> category['name'].'<br />';
}
And it should print Garden and any other item that is subsequently added to the Salads category. The problem occurs with menu/item[@name = $selected ], because this is probably not a proper way to compare the attribute (note that attribute comparison like this in XPath requires a single equals sign, not a double one). And obviously menu/item[@name = $_POST['selected']] doesn't work either.
What works is @name = "Salads", and of course this defeats the whole purpose of the extensibility of XML and the dynamism of PHP.
Please help!
Let's get all category nodes that belong to a parent node with a name attribute of your choosing (also note that the function name is simplexml_load_file and not simple_xml_loadfile):
foreach ($dom->xpath('item[@name="' . $selected . '"]/category') as $item)
{
echo $item->attributes()->{'name'}. PHP_EOL;
}
Also note the usage of single vs. double quotes to enclose the attribute value.
For reference, this is the xml structure I used for testing:
<menu>
<item name="Pizzas">
<category name="Onions"></category>
</item>
<item name="Salads">
<category name = "Garden"></category>
<category name = "Cesar"></category>
<category name = "Onion and Tomato"></category>
</item>
</menu>
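Put together as a self-contained sketch, assuming the test structure above is inlined as a string and a fixed value stands in for $_POST['selected'] (both are assumptions so the snippet can run on its own):

```php
<?php
$menuXml = <<<EOD
<menu>
  <item name="Pizzas">
    <category name="Onions"></category>
  </item>
  <item name="Salads">
    <category name="Garden"></category>
    <category name="Cesar"></category>
    <category name="Onion and Tomato"></category>
  </item>
</menu>
EOD;

// Stand-in for $_POST['selected'] so the sketch runs from the command line.
$selected = 'Salads';

$dom = simplexml_load_string($menuXml);

// The path is relative to the root <menu> element, so it starts at "item".
$categories = [];
foreach ($dom->xpath('item[@name="' . $selected . '"]/category') as $category) {
    $categories[] = (string) $category['name'];
}

echo implode(PHP_EOL, $categories), PHP_EOL;
```

Adding another category element under the Salads item in the XML changes the output without touching the PHP code.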

XML to PHP array to mysql

I'm trying to import XML data from a Google XML document using SimpleXML. Here is an example of the data:
<entry>
<id>
tag:google.com,2013:googlealerts/feed:11187837211342886856
</id>
<title type="html">
<b>London</b> Collections: Topman Design's retro mash-up
</title>
<link href="https://www.google.com/url?q=http://www.telegraph.co.uk/men/fashion-and-style/10901146/London-Collections-Topman-Designs-retro-mash-up.html&ct=ga&cd=CAIyAA&usg=AFQjCNEib0lLtkzUzFtR2Hk37wGefTVAZQ"/>
<published>2014-06-15T14:15:00Z</published>
<updated>2014-06-15T14:15:00Z</updated>
<content type="html">
Today is a very important day for England, and I'm not referring to the World Cup; it's the first day of <b>London</b> Collections: Men, a three day celebration ...
</content>
<author>
<name/>
</author>
</entry>
What would be the best way to do this? I'm confused about how to get each value into a variable to pass to MySQL.
This is exactly where I'm stuck:
$xml = simplexml_load_file("xml.xml");
$feed = simplexml_load_string($xml);
$ns=$feed->getNameSpaces(true);
foreach ($feed->entry as $entry) {
}
thank you all in advance
You can use XPath. It may be simpler than SimpleXML when you have namespaces. You will also have to register the namespace which is not present in the feed excerpt you included as an example.
I found an arbitrary feed here: http://www.google.com/alerts/feeds/01662123773360489091/16526224428036307178
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:idx="urn:atom-extension:indexing">
<id>
tag:google.com,2005:reader/user/01662123773360489091/state/com.google/alerts/16526224428036307178
</id>
<title>Google Alert - test</title>
<link href="http://www.google.com/alerts/feeds/01662123773360489091/16526224428036307178" rel="self"/>
<updated>2014-06-15T17:30:04Z</updated>
<entry>
<id>
tag:google.com,2013:googlealerts/feed:5957360885559055905
</id>
<title type="html">
Dad's <b>Test</b> Out Products Made For the Family
</title>
<link href="https://www.google.com/url?q=http://gma.yahoo.com/video/dads-test-products-made-family-141428658.html&ct=ga&cd=CAIyAA&usg=AFQjCNHHBPoS6Poz-Y5A3vFfbsGL3fkrBA"/>
<published>2014-06-15T17:30:04Z</published>
<updated>2014-06-15T17:30:04Z</updated>
<content type="html">
Watch the video Dad's <b>Test</b> Out Products Made For the Family on Yahoo Good Morning America . Becky Worley enlists a group of fathers to see if "As ...
</content>
<author>
<name/>
</author>
</entry>
<entry>
...
I will use it to provide your answer.
In the first line there is a default namespace declaration xmlns. You have to register that in PHP to use the namespace in XPath. You should map it to a prefix (could be any one) even if there is no prefix in the original file. So this is how you would initialize the parser.
These two lines initialize the DOM parser and parse the file, loading it from the Internet:
$document = new DOMDocument();
$document->load( "http://www.google.com/alerts/feeds/01662123773360489091/16526224428036307178" );
These two initialize the XPath environment, registering the default namespace of your file with a prefix (I chose atom):
$xpath = new DOMXpath($document);
$xpath->registerNamespace("atom", "http://www.w3.org/2005/Atom");
Once that is set up, you can select the nodes using the evaluate() method, whose expression can be absolute or relative. To get all entry nodes, you can use an absolute expression:
$entries = $xpath->evaluate("//atom:entry");
The XPath expression is //atom:entry. It returns a set of entry nodes from the "http://www.w3.org/2005/Atom" namespace, which is what you want.
To extract the nodes and the information in the context of each entry, you can use DOM methods and properties such as firstChild, nextSibling, etc., or you can perform additional XPath contextual searches. A contextual search passes the context node as a second parameter to the evaluate() method. Here is a loop that gets the data in each child node of <entry> and places it in an HTML sublist:
$entries = $xpath->evaluate("//atom:entry");
echo '<ul>'."\n";
foreach ($entries as $entry) {
echo '<li><b>Entry ID: '.$xpath->evaluate("atom:id/text()", $entry)->item(0)->nodeValue.'</b></li>'."\n";
echo '<ul>'."\n";
echo '<li>Title: '.$xpath->evaluate("atom:title/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Link: '.$xpath->evaluate("atom:link/@href", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Published: '.$xpath->evaluate("atom:published/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Updated: '.$xpath->evaluate("atom:updated/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Content: '.$xpath->evaluate("atom:content/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Author: '.$xpath->evaluate("atom:author/atom:name/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '</ul>'."\n";
}
echo '</ul>'."\n";
Note that the expressions are relative to entry (they don't start with /), the element selectors are also prefixed (they also belong to the atom namespace), and I used item(0) and nodeValue to extract the results. Since nodes may have many children, the evaluate() method as used above returns a node list. If there is only one text child, it's in item(0); nodeValue converts it to a string.
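Since the goal is to pass the values on to MySQL, a variation that may be easier to work with is to wrap each expression in the XPath string() function, so that evaluate() returns a PHP string directly and an empty element such as <name/> yields '' instead of a missing item(0). The sketch below assumes a small inline feed with made-up title and link values, and the $rows array layout is illustrative only:

```php
<?php
// Small inline feed; title and link values are placeholders.
$feed = <<<EOD
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:google.com,2013:googlealerts/feed:5957360885559055905</id>
    <title type="html">Sample title</title>
    <link href="https://www.example.com/article"/>
    <published>2014-06-15T17:30:04Z</published>
    <author><name/></author>
  </entry>
</feed>
EOD;

$document = new DOMDocument();
$document->loadXML($feed);

$xpath = new DOMXPath($document);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');

$rows = [];
foreach ($xpath->evaluate('//atom:entry') as $entry) {
    // string(...) converts the first matching node to a string,
    // or to '' when nothing matches.
    $rows[] = [
        'id'        => trim($xpath->evaluate('string(atom:id)', $entry)),
        'title'     => trim($xpath->evaluate('string(atom:title)', $entry)),
        'link'      => $xpath->evaluate('string(atom:link/@href)', $entry),
        'published' => $xpath->evaluate('string(atom:published)', $entry),
        'author'    => $xpath->evaluate('string(atom:author/atom:name)', $entry),
    ];
}

print_r($rows);
```

Each element of $rows is then a flat array of plain strings, which could be bound to a prepared INSERT statement (for example through PDO) rather than concatenated into SQL.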
The result of running the program above will be:
<ul>
<li><b>Entry ID: tag:google.com,2013:googlealerts/feed:5957360885559055905</b></li>
<ul>
<li>Title: Dad's <b>Test</b> Out Products Made For the Family</li>
<li>Link: https://www.google.com/url?q=http://gma.yahoo.com/video/dads-test-products-made-family-141428658.html&ct=ga&cd=CAIyAA&usg=AFQjCNHHBPoS6Poz-Y5A3vFfbsGL3fkrBA</li>
<li>Published: 2014-06-15T17:30:04Z</li>
<li>Updated: 2014-06-15T17:30:04Z</li>
<li>Content: Watch the video Dad's <b>Test</b> Out Products Made For the Family on Yahoo Good Morning America . Becky Worley enlists a group of fathers to see if "As ...</li>
<li>Author: </li>
</ul>
<li><b>Entry ID: tag:google.com,2013:googlealerts/feed:11008408359408830921</b></li>
<ul>
<li>Title: Germany faces major <b>test</b> of strength in its World Cup opener against Portugal</li>
<li>Link: https://www.google.com/url?q=http://www.foxnews.com/sports/2014/06/15/germany-faces-major-test-strength-in-its-world-cup-opener-against-portugal/&ct=ga&cd=CAIyAA&usg=AFQjCNHOU94QyciRpCEdJawOwl3diEEO0A</li>
<li>Published: 2014-06-15T16:18:45Z</li>
<li>Updated: 2014-06-15T16:18:45Z</li>
<li>Content: Cristiano Ronaldo stretches during a training session of Portugal in Campinas, Brazil, Saturday, June 14, 2014. Portugal plays in group G of the Brazil ...</li>
<li>Author: </li>
</ul>
<li><b>Entry ID: tag:google.com,2013:googlealerts/feed:8664961950651004785</b></li>
...
Now you can edit the code to adapt it to the data you wish to extract.
You can see a working example of this application in this PHP Fiddle
