Parse XML namespaces with php SimpleXML - php

I know this has been asked many times, but none of the suggestions I found here or elsewhere on the web work for my situation. I need to parse this XML, which uses the cap: namespace prefix, and I only need four entries from it.
<?xml version="1.0" encoding="UTF-8"?>
<entry>
<id>http://alerts.weather.gov/cap/wwacapget.php?x=TX124EFFB832F0.SpecialWeatherStatement.124EFFB84164TX.LUBSPSLUB.ac20a1425c958f66dc159baea2f9e672</id>
<updated>2013-05-06T20:08:00-05:00</updated>
<published>2013-05-06T20:08:00-05:00</published>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Special Weather Statement issued May 06 at 8:08PM CDT by NWS</title>
<link href="http://alerts.weather.gov/cap/wwacapget.php?x=TX124EFFB832F0.SpecialWeatherStatement.124EFFB84164TX.LUBSPSLUB.ac20a1425c958f66dc159baea2f9e672"/>
<summary>...SIGNIFICANT WEATHER ADVISORY FOR COCHRAN AND BAILEY COUNTIES... AT 808 PM CDT...NATIONAL WEATHER SERVICE DOPPLER RADAR INDICATED A STRONG THUNDERSTORM 30 MILES NORTHWEST OF MORTON...MOVING SOUTHEAST AT 25 MPH. NICKEL SIZE HAIL...WINDS SPEEDS UP TO 40 MPH...CONTINUOUS CLOUD TO GROUND LIGHTNING...AND BRIEF MODERATE DOWNPOURS ARE POSSIBLE WITH</summary>
<cap:event>Special Weather Statement</cap:event>
<cap:effective>2013-05-06T20:08:00-05:00</cap:effective>
<cap:expires>2013-05-06T20:45:00-05:00</cap:expires>
<cap:status>Actual</cap:status>
<cap:msgType>Alert</cap:msgType>
<cap:category>Met</cap:category>
<cap:urgency>Expected</cap:urgency>
<cap:severity>Minor</cap:severity>
<cap:certainty>Observed</cap:certainty>
<cap:areaDesc>Bailey; Cochran</cap:areaDesc>
<cap:polygon>34.19,-103.04 34.19,-103.03 33.98,-102.61 33.71,-102.61 33.63,-102.75 33.64,-103.05 34.19,-103.04</cap:polygon>
<cap:geocode>
<valueName>FIPS6</valueName>
<value>048017 048079</value>
<valueName>UGC</valueName>
<value>TXZ027 TXZ033</value>
</cap:geocode>
<cap:parameter>
<valueName>VTEC</valueName>
<value>
</value>
</cap:parameter>
</entry>
I am using SimpleXML, and I have a small test script that works great for parsing regular elements. For the life of me, I can't find a way to parse the elements with namespaces.
Here is the test script I am using; it works great for simple elements. How do I use it to parse namespaced elements? Nothing I've tried works. I need the values in variables so that I can embed them in HTML for styling.
<?php
$html = "";
// Get the XML Feed
$data = "http://alerts.weather.gov/cap/tx.php?x=1";
// load the xml into the object
$xml = simplexml_load_file($data);
for ($i = 0; $i < 10; $i++){
$title = $xml->entry[$i]->title;
$summary = $xml->entry[$i]->summary;
$html .= "<p><strong>$title</strong></p><p>$summary</p><hr/>";
}
echo $html;
?>
This works fine for parsing regular elements but what about the ones with the cap: namespace under the entry parent?
<?php
ini_set('display_errors','1');
$html = "";
$data = "http://alerts.weather.gov/cap/tx.php?x=1";
$entries = simplexml_load_file($data);
if(count($entries)):
//Registering NameSpace
$entries->registerXPathNamespace('prefix', 'http://www.w3.org/2005/Atom');
$result = $entries->xpath("//prefix:entry");
//echo count($asin);
//echo "<pre>";print_r($asin);
foreach ($result as $entry):
$title = $entry->title;
$summary = $entry->summary;
$html .= "<p><strong>$title</strong></p><p>$summary</p>$event<hr/>";
endforeach;
endif;
echo $html;
?>
Any help would be greatly appreciated.
-Thanks

I have given the same type of answer here - solution to your question
You just need to register the namespace; then you can work normally with SimpleXML and XPath.
<?php
$data = "http://alerts.weather.gov/cap/tx.php?x=1";
$entries = file_get_contents($data);
$entries = new SimpleXmlElement($entries);
if(count($entries)):
//echo "<pre>";print_r($entries);die;
//alternative to registering the namespace: match on the local name only
//$result = $entries->xpath("//*[local-name() = 'entry']");
$entries->registerXPathNamespace('prefix', 'http://www.w3.org/2005/Atom');
$result = $entries->xpath("//prefix:entry");
//echo count($asin);
//echo "<pre>";print_r($result);die;
foreach ($result as $entry):
//echo "<pre>";print_r($entry);die;
$dc = $entry->children('urn:oasis:names:tc:emergency:cap:1.1');
echo $dc->event."<br/>";
echo $dc->effective."<br/>";
echo "<hr>";
endforeach;
endif;
That's it.
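As a self-contained variation on the code above, here is a minimal sketch that registers both namespaces for XPath and selects the cap:event elements directly. The inline feed is a hypothetical, trimmed-down sample so the snippet runs without network access; the prefixes atom and cap are local aliases chosen for this example, and only the namespace URIs have to match the feed.

```php
<?php
// Hypothetical two-entry feed, trimmed from the structure shown in the question.
$feedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:cap="urn:oasis:names:tc:emergency:cap:1.1">
  <entry>
    <title>Special Weather Statement</title>
    <cap:event>Special Weather Statement</cap:event>
    <cap:severity>Minor</cap:severity>
  </entry>
  <entry>
    <title>Fire Weather Watch</title>
    <cap:event>Fire Weather Watch</cap:event>
    <cap:severity>Moderate</cap:severity>
  </entry>
</feed>
XML;

$feed = new SimpleXMLElement($feedXml);

// XPath needs a prefix for every namespace, including the default (Atom) one.
$feed->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
$feed->registerXPathNamespace('cap', 'urn:oasis:names:tc:emergency:cap:1.1');

// Collect the text of every cap:event that sits under an Atom entry.
$events = array();
foreach ($feed->xpath('//atom:entry/cap:event') as $eventNode) {
    $events[] = (string) $eventNode;
}

echo implode("\n", $events), "\n";
```

Casting each node to a string gives plain values that can be dropped into HTML afterwards.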

Here's an alternative solution:
<?php
$xml = <<<XML
<?xml version = '1.0' encoding = 'UTF-8' standalone = 'yes'?>
<?xml-stylesheet href='http://alerts.weather.gov/cap/capatom.xsl' type='text/xsl'?>
<!--
This atom/xml feed is an index to active advisories, watches and warnings
issued by the National Weather Service. This index file is not the complete
Common Alerting Protocol (CAP) alert message. To obtain the complete CAP
alert, please follow the links for each entry in this index. Also note the
CAP message uses a style sheet to convey the information in a human readable
format. Please view the source of the CAP message to see the complete data
set. Not all information in the CAP message is contained in this index of
active alerts.
-->
<feed
xmlns = 'http://www.w3.org/2005/Atom'
xmlns:cap = 'urn:oasis:names:tc:emergency:cap:1.1'
xmlns:ha = 'http://www.alerting.net/namespace/index_1.0'
>
<!-- http-date = Tue, 07 May 2013 04:14:00 GMT -->
<id>http://alerts.weather.gov/cap/tx.atom</id>
<logo>http://alerts.weather.gov/images/xml_logo.gif</logo>
<generator>NWS CAP Server</generator>
<updated>2013-05-06T23:14:00-05:00</updated>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Current Watches, Warnings and Advisories for Texas Issued by the National Weather Service</title>
<link href='http://alerts.weather.gov/cap/tx.atom'/>
<entry>
<id>http://alerts.weather.gov/cap/wwacapget.php?x=TX124EFFB8AA78.FireWeatherWatch.124EFFD70270TX.EPZRFWEPZ.1716207877d94d15d43d410892b9f175</id>
<updated>2013-05-06T23:14:00-05:00</updated>
<published>2013-05-06T23:14:00-05:00</published>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Fire Weather Watch issued May 06 at 11:14PM CDT until May 08 at 10:00PM CDT by NWS</title>
<link href="http://alerts.weather.gov/cap/wwacapget.php?x=TX124EFFB8AA78.FireWeatherWatch.124EFFD70270TX.EPZRFWEPZ.1716207877d94d15d43d410892b9f175"/>
<summary>...CRITICAL FIRE CONDITIONS EXPECTED WEDNESDAY ACROSS FAR WEST TEXAS AND THE SOUTHWEST NEW MEXICO LOWLANDS... .WINDS ALOFT WILL STRENGTHEN OVER THE REGION EARLY THIS WEEK...AHEAD OF AN UPPER LEVEL TROUGH FORECAST TO MOVE THROUGH NEW MEXICO AND TEXAS ON WEDNESDAY. SURFACE LOW PRESSURE WILL ALSO DEVELOP TO OUR EAST AS THE TROUGH APPROACHES. THIS COMBINATION WILL RESULT</summary>
<cap:event>Fire Weather Watch</cap:event>
<cap:effective>2013-05-06T23:14:00-05:00</cap:effective>
<cap:expires>2013-05-08T22:00:00-05:00</cap:expires>
<cap:status>Actual</cap:status>
<cap:msgType>Alert</cap:msgType>
<cap:category>Met</cap:category>
<cap:urgency>Future</cap:urgency>
<cap:severity>Moderate</cap:severity>
<cap:certainty>Possible</cap:certainty>
<cap:areaDesc>El Paso; Hudspeth</cap:areaDesc>
<cap:polygon></cap:polygon>
<cap:geocode>
<valueName>FIPS6</valueName>
<value>048141 048229</value>
<valueName>UGC</valueName>
<value>TXZ055 TXZ056</value>
</cap:geocode>
<cap:parameter>
<valueName>VTEC</valueName>
<value>/O.NEW.KEPZ.FW.A.0018.130508T1900Z-130509T0300Z/</value>
</cap:parameter>
</entry>
<entry>
<id>http://alerts.weather.gov/cap/wwacapget.php?x=TX124EFFABB2F0.AirQualityAlert.124EFFC750DCTX.HGXAQAHGX.7f2cf548a67d403f0541492b2804d621</id>
<updated>2013-05-06T14:16:00-05:00</updated>
<published>2013-05-06T14:16:00-05:00</published>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Air Quality Alert issued May 06 at 2:16PM CDT by NWS</title>
<link href="http://alerts.weather.gov/cap/wwacapget.php?x=TX124EFFABB2F0.AirQualityAlert.124EFFC750DCTX.HGXAQAHGX.7f2cf548a67d403f0541492b2804d621"/>
<summary>...OZONE ACTION DAY FOR TUESDAY... THE TEXAS COMMISSION ON ENVIRONMENTAL QUALITY (TCEQ)...HAS ISSUED AN OZONE ACTION DAY FOR THE HOUSTON...GALVESTON...AND BRAZORIA AREAS FOR TUESDAY...MAY 7 2013. ATMOSPHERIC CONDITIONS ARE EXPECTED TO BE FAVORABLE FOR PRODUCING HIGH LEVELS OF OZONE POLLUTION IN THE HOUSTON...GALVESTON AND</summary>
<cap:event>Air Quality Alert</cap:event>
<cap:effective>2013-05-06T14:16:00-05:00</cap:effective>
<cap:expires>2013-05-07T19:15:00-05:00</cap:expires>
<cap:status>Actual</cap:status>
<cap:msgType>Alert</cap:msgType>
<cap:category>Met</cap:category>
<cap:urgency>Unknown</cap:urgency>
<cap:severity>Unknown</cap:severity>
<cap:certainty>Unknown</cap:certainty>
<cap:areaDesc>Brazoria; Galveston; Harris</cap:areaDesc>
<cap:polygon></cap:polygon>
<cap:geocode>
<valueName>FIPS6</valueName>
<value>048039 048167 048201</value>
<valueName>UGC</valueName>
<value>TXZ213 TXZ237 TXZ238</value>
</cap:geocode>
<cap:parameter>
<valueName>VTEC</valueName>
<value></value>
</cap:parameter>
</entry>
</feed>
XML;
$sxe = new SimpleXMLElement($xml);
$capFields = $sxe->entry->children('cap', true);
echo "Event: " . (string) $capFields->event . "\n";
echo "Effective: " . (string) $capFields->effective . "\n";
echo "Expires: " . (string) $capFields->expires . "\n";
echo "Severity: " . (string) $capFields->severity . "\n";
Output:
Event: Fire Weather Watch
Effective: 2013-05-06T23:14:00-05:00
Expires: 2013-05-08T22:00:00-05:00
Severity: Moderate
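The snippet above only reads the first entry. A possible extension, sketched here under the assumption that every entry carries the same four cap fields, loops over all entries and builds the HTML string the question asks for. The inline feed is a shortened, hypothetical copy of the one above, and the markup in $html is only an illustration.

```php
<?php
// Shortened, hypothetical copy of the feed used in the answer above.
$feedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:cap="urn:oasis:names:tc:emergency:cap:1.1">
  <entry>
    <title>Fire Weather Watch issued May 06</title>
    <cap:event>Fire Weather Watch</cap:event>
    <cap:effective>2013-05-06T23:14:00-05:00</cap:effective>
    <cap:expires>2013-05-08T22:00:00-05:00</cap:expires>
    <cap:severity>Moderate</cap:severity>
  </entry>
  <entry>
    <title>Air Quality Alert issued May 06</title>
    <cap:event>Air Quality Alert</cap:event>
    <cap:effective>2013-05-06T14:16:00-05:00</cap:effective>
    <cap:expires>2013-05-07T19:15:00-05:00</cap:expires>
    <cap:severity>Unknown</cap:severity>
  </entry>
</feed>
XML;

$sxe = new SimpleXMLElement($feedXml);
$html = '';

foreach ($sxe->entry as $entry) {
    // Switch to the cap namespace by prefix (second argument true).
    $cap = $entry->children('cap', true);

    // Cast to string and escape before embedding in HTML.
    $event     = htmlspecialchars((string) $cap->event);
    $effective = htmlspecialchars((string) $cap->effective);
    $expires   = htmlspecialchars((string) $cap->expires);
    $severity  = htmlspecialchars((string) $cap->severity);

    $html .= "<p><strong>$event</strong> ($severity)</p>"
           . "<p>$effective to $expires</p><hr/>\n";
}

echo $html;
```

Using foreach instead of a fixed counter avoids reading past the last entry when the feed has fewer alerts than expected.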

Related

php explode on xml file

Here is the XML file I have. (I cannot edit the parser that creates this file, hence my question below.)
<?xml version="1.0" encoding="UTF-8"?>
<JobSearchResults LookID="arkansas">
<!-- Served from qs-b-02.oc.careercast.com -->
<QueryString>clientid=arkansas&stringVar=xmlString&pageSize=200&searchType=featured&outFormat=xml</QueryString>
<channel>
<title>JobsArkansas Listings</title>
<items></items>
</channel>
<item>
<JobID>73451732</JobID>
<Title>Radiology</Title>
<Employer>Baptist-Health </Employer>
<Location>LITTLE ROCK, AR</Location>
<Description><![CDATA[IMMEDIATE OPENINGS for:Diabetes Patient Educator, RN Community Education Nurse-RN Baptist Health Community Outreach•Diabetes Patient Educator, RN: Full-time: 8am-5pm Minimum Requirements:•Requires graduation from a state approved school/college of Nursing•Current licensure by theAR State Board of Nursing. •2+ years bedside experience preferred. •Certified Diabetes Educator certificate preferred. •Community Education Nurse - RNMinimum Requirements (PRN: Varies):•Current RN license & 2 years clinical experience. •Current CPR certification. Apply online at: baptist-health.com/jobs]]></Description>
<LookID>arkansas</LookID>
<Url>http://jobs.arkansasonline.com/careers/jobsearch/detail/jobId/73451732/viewType/featured</Url>
</item>
<item>
<JobID>66703190</JobID>
<Title>Telemarketing Agents</Title>
<Employer>Arkansas Democrat Gazette </Employer>
<Location>Bryant, AR</Location>
<Description><![CDATA[Telemarketing Agents Needed Position is part-time Starting at $9.00/hour Plus Bonus! Looking for dependable and professional applicants. We are a drug and smoke free company located in Bryant. Hours: Mon-Fri 4:30pm to 8:30pm and Sat. 9am to 6pm. Send resumes to: clewis@wehco.com or P.O. Box 384 Bryant, AR 72089 Arkansas Democrat Gazette Arkansas' Largest NewspaperCLICK THE IMAGE TO VIEW THE AD]]></Description>
<LookID>arkansas</LookID>
<Url>http://jobs.arkansasonline.com/careers/jobsearch/detail/jobId/66703190/viewType/featured</Url>
</item>
</JobSearchResults>
sas</LookID>
<Url>http://jobs.arkansasonline.com/careers/jobsearch/detail/jobId/73004973/viewType/featured</Url>
</item>
</JobSearchResults>
I am using the following PHP code to open the above XML file and take out the following:
sas</LookID>
<Url>http://jobs.arkansasonline.com/careers/jobsearch/detail/jobId/73004973/viewType/featured</Url>
</item>
</JobSearchResults>
However the php code below:
error_reporting(E_ALL | E_STRICT);
ini_set('display_errors', 1);
// Load File
// $today = date('Ymd');
$file = '/Users/jasenburkett/Sites/jobsark/feed' . '.xml';
$newfile = '/Users/jasenburkett/Sites/jobsark/feed' . '.xml';
$file_contents = file_get_contents($file);
$data = $file_contents;
$parts = explode("</JobSearchResults>", $data);
// Save File
file_put_contents($newfile, $data);
?>
This works, but it deletes everything after the first </JobSearchResults>, and I want to keep the very last one...
Any ideas where I am going wrong?
If what you are looking for is a way to clean up the corrupt XML file, you can just add back the closing tag that is lost when explode() runs. It is a bit hackish, but it works.
$file = '/Users/jasenburkett/Sites/jobsark/feed.xml';
$data = file_get_contents($file);
$split = "</JobSearchResults>"; // Split on this
$parts = explode($split, $data); // Make the split
$cleanedXml = $parts[0]; // Use first part
$cleanedXml .= $split; // Put back last ending tag
file_put_contents($file, $cleanedXml);
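The same cleanup can be sketched without explode(), by cutting the string right after the first closing tag. This version leaves the data untouched if the tag is missing, and it checks that the result parses. The $data string below is a hypothetical stand-in for the contents of feed.xml, so no file is read or written here.

```php
<?php
// Hypothetical corrupt input: a complete document followed by a stray tail.
$data = "<JobSearchResults><item><JobID>1</JobID></item></JobSearchResults>\n"
      . "sas</LookID>\n</item>\n</JobSearchResults>";

$closingTag = '</JobSearchResults>';
$position = strpos($data, $closingTag);

if ($position === false) {
    // No closing tag found: keep the data as it is rather than guessing.
    $cleanedXml = $data;
} else {
    // Keep everything up to and including the first closing tag.
    $cleanedXml = substr($data, 0, $position + strlen($closingTag));
}

// simplexml_load_string() returns false when the XML is not well-formed.
$isWellFormed = simplexml_load_string($cleanedXml) !== false;

echo $cleanedXml, "\n";
```

Checking $isWellFormed before calling file_put_contents() would avoid overwriting the feed with something that still does not parse.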

XML to PHP array to mysql

I'm trying to import XML data from a Google XML feed using SimpleXML. An example of the XML is here:
<entry>
<id>
tag:google.com,2013:googlealerts/feed:11187837211342886856
</id>
<title type="html">
<b>London</b> Collections: Topman Design's retro mash-up
</title>
<link href="https://www.google.com/url?q=http://www.telegraph.co.uk/men/fashion-and-style/10901146/London-Collections-Topman-Designs-retro-mash-up.html&ct=ga&cd=CAIyAA&usg=AFQjCNEib0lLtkzUzFtR2Hk37wGefTVAZQ"/>
<published>2014-06-15T14:15:00Z</published>
<updated>2014-06-15T14:15:00Z</updated>
<content type="html">
Today is a very important day for England, and I'm not referring to the World Cup; it's the first day of <b>London</b> Collections: Men, a three day celebration ...
</content>
<author>
<name/>
</author>
</entry>
What would be the best way to do this? I'm confused about how to get each value into a variable to pass to MySQL.
This is exactly where I'm stuck:
$xml = simplexml_load_file("xml.xml");
$feed = simplexml_load_string($xml);
$ns=$feed->getNameSpaces(true);
foreach ($feed->entry as $entry) {
}
thank you all in advance
You can use XPath. It may be simpler than SimpleXML when you have namespaces. You will also have to register the namespace which is not present in the feed excerpt you included as an example.
I found an arbitrary feed here: http://www.google.com/alerts/feeds/01662123773360489091/16526224428036307178
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:idx="urn:atom-extension:indexing">
<id>
tag:google.com,2005:reader/user/01662123773360489091/state/com.google/alerts/16526224428036307178
</id>
<title>Google Alert - test</title>
<link href="http://www.google.com/alerts/feeds/01662123773360489091/16526224428036307178" rel="self"/>
<updated>2014-06-15T17:30:04Z</updated>
<entry>
<id>
tag:google.com,2013:googlealerts/feed:5957360885559055905
</id>
<title type="html">
Dad's <b>Test</b> Out Products Made For the Family
</title>
<link href="https://www.google.com/url?q=http://gma.yahoo.com/video/dads-test-products-made-family-141428658.html&ct=ga&cd=CAIyAA&usg=AFQjCNHHBPoS6Poz-Y5A3vFfbsGL3fkrBA"/>
<published>2014-06-15T17:30:04Z</published>
<updated>2014-06-15T17:30:04Z</updated>
<content type="html">
Watch the video Dad's <b>Test</b> Out Products Made For the Family on Yahoo Good Morning America . Becky Worley enlists a group of fathers to see if "As ...
</content>
<author>
<name/>
</author>
</entry>
<entry>
...
I will use it to provide your answer.
In the first line there is a default namespace declaration (xmlns). To use that namespace in XPath you have to register it in PHP, mapping it to a prefix (any prefix will do) even though the original file uses none. So this is how you would initialize the parser.
These two lines initialize the DOM parser and parse the file, loading it from the Internet:
$document = new DOMDocument();
$document->load( "http://www.google.com/alerts/feeds/01662123773360489091/16526224428036307178" );
These two initialize the XPath environment, registering the default namespace of your file with a prefix (I chose atom):
$xpath = new DOMXpath($document);
$xpath->registerNamespace("atom", "http://www.w3.org/2005/Atom");
Once that is set up, you can select nodes with the evaluate() method, using an expression that can be absolute or relative. To get all entry nodes, you can use an absolute expression:
$entries = $xpath->evaluate("//atom:entry");
The XPath expression is //atom:entry. It returns a set of entry nodes from the "http://www.w3.org/2005/Atom" namespace, which is what you want.
To extract the nodes and the information in the context of each entry, you can use DOM methods and properties such as firstChild, nextSibling, etc., or you can perform additional XPath contextual searches. A contextual search passes the context node as the second parameter to evaluate(). Here is a loop that gets the data in each child node of <entry> and places it in an HTML sublist:
$entries = $xpath->evaluate("//atom:entry");
echo '<ul>'."\n";
foreach ($entries as $entry) {
echo '<li><b>Entry ID: '.$xpath->evaluate("atom:id/text()", $entry)->item(0)->nodeValue.'</b></li>'."\n";
echo '<ul>'."\n";
echo '<li>Title: '.$xpath->evaluate("atom:title/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Link: '.$xpath->evaluate("atom:link/@href", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Published: '.$xpath->evaluate("atom:published/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Updated: '.$xpath->evaluate("atom:updated/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Content: '.$xpath->evaluate("atom:content/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '<li>Author: '.$xpath->evaluate("atom:author/atom:name/text()", $entry)->item(0)->nodeValue.'</li>'."\n";
echo '</ul>'."\n";
}
echo '</ul>'."\n";
Note that the expressions are relative to entry (they don't start with /), the element selectors are also prefixed (they also belong to the atom namespace), and I used item(0) and nodeValue to extract the results. Since nodes may have many children, evaluate() as used above returns a node list. If there is only one text child, it's in item(0). nodeValue converts it to a string.
The result of running the program above will be:
<ul>
<li><b>Entry ID: tag:google.com,2013:googlealerts/feed:5957360885559055905</b></li>
<ul>
<li>Title: Dad's <b>Test</b> Out Products Made For the Family</li>
<li>Link: https://www.google.com/url?q=http://gma.yahoo.com/video/dads-test-products-made-family-141428658.html&ct=ga&cd=CAIyAA&usg=AFQjCNHHBPoS6Poz-Y5A3vFfbsGL3fkrBA</li>
<li>Published: 2014-06-15T17:30:04Z</li>
<li>Updated: 2014-06-15T17:30:04Z</li>
<li>Content: Watch the video Dad's <b>Test</b> Out Products Made For the Family on Yahoo Good Morning America . Becky Worley enlists a group of fathers to see if "As ...</li>
<li>Author: </li>
</ul>
<li><b>Entry ID: tag:google.com,2013:googlealerts/feed:11008408359408830921</b></li>
<ul>
<li>Title: Germany faces major <b>test</b> of strength in its World Cup opener against Portugal</li>
<li>Link: https://www.google.com/url?q=http://www.foxnews.com/sports/2014/06/15/germany-faces-major-test-strength-in-its-world-cup-opener-against-portugal/&ct=ga&cd=CAIyAA&usg=AFQjCNHOU94QyciRpCEdJawOwl3diEEO0A</li>
<li>Published: 2014-06-15T16:18:45Z</li>
<li>Updated: 2014-06-15T16:18:45Z</li>
<li>Content: Cristiano Ronaldo stretches during a training session of Portugal in Campinas, Brazil, Saturday, June 14, 2014. Portugal plays in group G of the Brazil ...</li>
<li>Author: </li>
</ul>
<li><b>Entry ID: tag:google.com,2013:googlealerts/feed:8664961950651004785</b></li>
...
Now you can edit the code to adapt it to the data you wish to extract.
You can see a working example of this application in this PHP Fiddle
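To get each field into a variable for MySQL, one option is to wrap the expressions in the XPath string() function, so that evaluate() returns a plain string (empty if nothing matches) instead of a node list. The sketch below collects one associative array per entry. The inline feed and the example.invalid values are hypothetical, and the array keys are names chosen only for this example.

```php
<?php
// Hypothetical one-entry feed so the sketch runs without network access.
$feedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:example.invalid,2014:feed:1</id>
    <title type="html">First title</title>
    <link href="https://example.invalid/first"/>
    <published>2014-06-15T17:30:04Z</published>
    <author><name/></author>
  </entry>
</feed>
XML;

$document = new DOMDocument();
$document->loadXML($feedXml);

$xpath = new DOMXPath($document);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');

$rows = array();
foreach ($xpath->evaluate('//atom:entry') as $entry) {
    // string(...) makes evaluate() return a string; a missing node gives ''.
    $rows[] = array(
        'id'        => trim($xpath->evaluate('string(atom:id)', $entry)),
        'title'     => trim($xpath->evaluate('string(atom:title)', $entry)),
        'link'      => $xpath->evaluate('string(atom:link/@href)', $entry),
        'published' => $xpath->evaluate('string(atom:published)', $entry),
        'author'    => $xpath->evaluate('string(atom:author/atom:name)', $entry),
    );
}

print_r($rows);
```

Each element of $rows could then be bound to a prepared INSERT statement (for example through PDO), which keeps the feed text out of the SQL string itself.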

Regular Expressions - PHP and XML

I'm in college and new to PHP regular expressions but I have somewhat of an idea what I need to do I think. Basically I need to create a PHP program to read XML source code containing several 'stories' and store their details in a mySQL database. I've managed to create an expression that selects each story but I need to break this expression down further in order to get each element within the story. Here's the XML:
XML
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<latestIssue>
<issue number="256" />
<date>
<day> 21 </day>
<month> 1 </month>
<year> 2011 </year>
</date>
<story>
<title> Is the earth flat? </title>
<author> A. N. Redneck </author>
<url> http://www.HotStuff.ie/stories/story123456.xml </url>
</story>
<story>
<title> What the actress said to the bishop </title>
<author> Brated Film Critic </author>
<url> http://www.HotStuff.ie/stories/story123457.xml </url>
</story>
<story>
<title> What the year has in store </title>
<author> Stargazer </author>
<url> http://www.HotStuff.ie/stories/story123458.xml </url>
</story>
</latestIssue>
So I need to get the title, author and url from each story and add them as a row in my database. Here's what I have so far:
PHP
<?php
$url = fopen("http://address/to/test.xml", "r");
$contents = fread($url,10000000);
$exp = preg_match_all("/<title>(.+?)<\/url>/s", $contents, $matches);
foreach($matches[1] as $match) {
// NO IDEA WHAT TO DO FROM HERE
// $exp2 = "/<title>(.+?)<\/title><author>(.+?)<\/author><url>(.+?)<\/url>/";
// This is what I had but I'm not sure if it's right or what to do after
}
?>
I'd really appreciate the help guys, I've been stuck on this all day and I can't wrap my head around regular expressions at all. Once I've managed to get each story's details I can easily update the database.
EDIT:
Thanks for replying but are you sure this can't be done with regular expressions? It's just the question says "Use regular expressions to analyse the XML and extract the relevant data that you need. Note that information about each story is spread across several lines of XML". Maybe he made a mistake but I don't see why he'd write it like that if it can't be done this way.
First of all, start using
file_get_contents("UrlHere");
to gather the content from a page.
Now, if you want to parse the XML, use an XML parser in PHP, for example.
You could also use third-party XML parsers.
Regular expressions are not the correct tool to use here. You want to use an XML parser. I like PHP's SimpleXML:
$sXML = new SimpleXMLElement('http://address/to/test.xml', 0, TRUE);
$stories = $sXML->story;
foreach($stories as $story){
$title = (string)$story->title;
$author = (string)$story->author;
$url = (string)$story->url;
}
You should never use a regexp to parse an XML document (OK, never is a big word; in some rare cases a regexp can be better, but not in your case).
Since you are only reading a document, I suggest using the SimpleXML class and XPath queries.
For example:
$ cat test.php
#!/usr/bin/php
<?php
function xpathValueToString(SimpleXMLElement $xml, $xpath){
$arrayXpath = $xml->xpath($xpath);
return ($arrayXpath) ? trim((string) $arrayXpath[0]) : null;
}
$xml = new SimpleXMLElement(file_get_contents("test.xml"));
$arrayXpathStories = $xml->xpath("/latestIssue/story");
foreach ($arrayXpathStories as $story){
echo "Title : " . xpathValueToString($story, 'title') . "\n";
echo "Author : " . xpathValueToString($story, 'author') . "\n";
echo "URL : " . xpathValueToString($story, 'url') . "\n\n";
}
?>
$ ./test.php
Title : Is the earth flat?
Author : A. N. Redneck
URL : http://www.HotStuff.ie/stories/story123456.xml
Title : What the actress said to the bishop
Author : Brated Film Critic
URL : http://www.HotStuff.ie/stories/story123457.xml
Title : What the year has in store
Author : Stargazer
URL : http://www.HotStuff.ie/stories/story123458.xml
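If the assignment really does insist on regular expressions, the extraction can be done for this particular, very regular layout. This is a rough sketch, not a general XML parser: it assumes each story contains title, author and url in exactly that order, with no attributes or nested tags. The /s modifier lets the pattern span several lines, and \s* absorbs the whitespace between tags, which is what the commented-out expression in the question was missing. The inline $contents is a shortened copy of the XML from the question.

```php
<?php
// Shortened copy of the XML from the question.
$contents = <<<XML
<latestIssue>
<story>
<title> Is the earth flat? </title>
<author> A. N. Redneck </author>
<url> http://www.HotStuff.ie/stories/story123456.xml </url>
</story>
<story>
<title> What the year has in store </title>
<author> Stargazer </author>
<url> http://www.HotStuff.ie/stories/story123458.xml </url>
</story>
</latestIssue>
XML;

// One capture group per field; .*? is non-greedy so each group stops at its own closing tag.
$pattern = '/<story>\s*<title>(.*?)<\/title>\s*<author>(.*?)<\/author>\s*<url>(.*?)<\/url>\s*<\/story>/s';

// PREG_SET_ORDER groups the captures story by story.
preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);

$stories = array();
foreach ($matches as $match) {
    $stories[] = array(
        'title'  => trim($match[1]),
        'author' => trim($match[2]),
        'url'    => trim($match[3]),
    );
}

print_r($stories);
```

Each element of $stories then holds one row for the database. Any change in element order or any added attribute would break the pattern, which is why the answers above recommend a parser.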

SimpleXML PHP Parsing [duplicate]

It IS a duplicate, just not of the question that it was closed for.
Finally found the answer, even here on SO.
Actual Duplicate: PHP SimpleXML Namespace Problem
EDIT: If you read the question closely, you will see it is NOT a duplicate of PHP namespace simplexml problems. The answer from the 'possible duplicate' is not the answer to my question.
Again:
I have no problem with $value = $record->children('cap', true)->$title; (which is all that the 'possible duplicate' answers cover).
I have a problem when there are other tags inside the tag with the colon.
<tag:something>hello</tag:something> //I parse out hello (this is the 'duplicate questions' answer that I don't need answered)
<tag:something>
<stuff>hello</stuff> //I cannot grab this. Explanation below.
</tag:something>
END of edit.
ORIGINAL question:
I cannot get the data inside the tag <value> in the XML located at http://alerts.weather.gov/cap/us.php?x=1 (sample of XML below).
The problem is at:
$array[] = $record->children($tag_cap, true)->$tag_geocode->$tag_value;
This is the only data I cannot grab, I have verified that all the other data other than $array[4] is grabbed.
There is just a problem getting data from tags when the parent tag is in the form <cap:something>. For example:
I can get 100 when it is like <cap:something>100</cap:something>. But I can't get 100 if it is like <cap:something><value>100</value></cap:something>.
Piece of the XML:
<?xml version = '1.0' encoding = 'UTF-8' standalone = 'yes'?>
<feed
xmlns = 'http://www.w3.org/2005/Atom'
xmlns:cap = 'urn:oasis:names:tc:emergency:cap:1.1'
xmlns:ha = 'http://www.alerting.net/namespace/index_1.0'
>
<!-- http-date = Tue, 30 Oct 2012 06:34:00 GMT -->
<id>http://alerts.weather.gov/cap/us.atom</id>
<logo>http://alerts.weather.gov/images/xml_logo.gif</logo>
<generator>NWS CAP Server</generator>
<updated>2012-10-30T14:34:00-04:00</updated>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Current Watches, Warnings and Advisories for the United States Issued by the National Weather Service</title>
<link href='http://alerts.weather.gov/cap/us.atom'/>
<entry>
<id>http://alerts.weather.gov/cap/wwacapget.php?x=AK124CCADA8120.BlizzardWarning.124CCAE7BFC0AK.AFGWSWNSB.d32adb45b5c82ec5e486c4cfb96d3fb6</id>
<updated>2012-10-30T05:20:00-08:00</updated>
<published>2012-10-30T05:20:00-08:00</published>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Blizzard Warning issued October 30 at 5:20AM AKDT until October 31 at 6:00AM AKDT by NWS</title>
<link href='http://alerts.weather.gov/cap/wwacapget.php?x=AK124CCADA8120.BlizzardWarning.124CCAE7BFC0AK.AFGWSWNSB.d32adb45b5c82ec5e486c4cfb96d3fb6'/>
<summary>...BLIZZARD WARNING IN EFFECT UNTIL 6 AM AKDT WEDNESDAY... THE NATIONAL WEATHER SERVICE IN FAIRBANKS HAS ISSUED A BLIZZARD WARNING...WHICH IS IN EFFECT UNTIL 6 AM AKDT WEDNESDAY. * VISIBILITY...NEAR ZERO IN SNOW AND BLOWING SNOW. * WINDS...WEST 35 MPH GUSTING TO 50 MPH. * SNOW...ACCUMULATION 3 INCHES THROUGH TONIGHT.</summary>
<cap:event>Blizzard Warning</cap:event>
<cap:effective>2012-10-30T05:20:00-08:00</cap:effective>
<cap:expires>2012-10-30T16:00:00-08:00</cap:expires>
<cap:status>Actual</cap:status>
<cap:msgType>Alert</cap:msgType>
<cap:category>Met</cap:category>
<cap:urgency>Expected</cap:urgency>
<cap:severity>Severe</cap:severity>
<cap:certainty>Likely</cap:certainty>
<cap:areaDesc>Eastern Beaufort Sea Coast</cap:areaDesc>
<cap:polygon></cap:polygon>
<cap:geocode>
<valueName>FIPS6</valueName>
<value>002185</value>
<valueName>UGC</valueName>
<value>AKZ204</value>
</cap:geocode>
<cap:parameter>
<valueName>VTEC</valueName>
<value>/X.NEW.PAFG.BZ.W.0013.121030T1320Z-121031T1400Z/</value>
</cap:parameter>
</entry>
...//rest of XML...
PHP Code:
ini_set('display_errors','1');
$alert_url = 'http://alerts.weather.gov/cap/us.php?x=1';
$alert_string_xml = file_get_contents($alert_url);
$alert_simple_xml_object = simplexml_load_string($alert_string_xml);
$count = 0;
$tag_entry = 'entry';
$tag_summary = 'summary';
$tag_cap = 'cap';
$tag_event = 'event';
$tag_certainty = 'certainty';
$tag_areaDesc = 'areaDesc';
$tag_geocode = 'geocode';
$tag_value = 'value';
foreach ($alert_simple_xml_object->$tag_entry as $record)
{
$count++;
$array = array();
$array[] = $record->$tag_summary;
$array[] = $record->children($tag_cap, true)->$tag_event;
$array[] = $record->children($tag_cap, true)->$tag_certainty;
$array[] = $record->children($tag_cap, true)->$tag_areaDesc;
$array[] = $record->children($tag_cap, true)->$tag_geocode->$tag_value;
//$array[] = $record->children($tag_cap, true)->$tag_geocode->$tag_value[0]; //doesnt work either
echo $array[4]; //nothing is echoed
}
MOST CURRENT ATTEMPT:
I read more on namespaces and understand them better. I even tried what I thought was a better solution:
//inside the above foreach loop
$namespaces = $record->getNameSpaces(true);
$caap = $record->children($namespaces['cap']);
echo $caap->event; //works (but the first way works too)
echo $caap->geocode->value; //(STILL does not work. Nothing is echoed)
I don't understand why I cannot grab any data from children tags that have a parent tag that includes a namespace.
cap:stuff is the root, so you would access the elements as:
$xml = simplexml_load_string($your_xml);
$value_name_0 = $xml->valueName[0];
$value_0 = $xml->value[0];
$value_name_1 = $xml->valueName[1];
$value_1 = $xml->value[1];
You are probably looking for this function. There are 2 examples, which should be enough to solve your problem
The problem you are facing is not that visible if you have errors and warnings disabled:
namespace error : Namespace prefix cap on stuff is not defined
If you had errors enabled, you would see that message. Because SimpleXML cannot resolve the namespace prefix cap, the prefix is dropped.
Therefore you access it directly:
$xml->stuff->value[1]
And similar. Consider the following code example (demo):
$xml = simplexml_load_string('<entry>
<cap:stuff>
<valueName>aaa</valueName>
<value>000</value>
<valueName>bbb</valueName>
<value>111</value>
</cap:stuff>
</entry>');
echo "\nResult:", $xml->stuff->value[1], "\n\n";
echo "XML:\n", $xml->asXML();
It demonstrates the error message as well as what is in $xml after loading the XML string, by outputting it:
Warning: simplexml_load_string(): namespace error : Namespace prefix cap on \
stuff is not defined on line 10
Warning: simplexml_load_string(): <cap:stuff> on line 10
Warning: simplexml_load_string(): ^ on line 10
Result:111
XML:
<?xml version="1.0"?>
<entry>
<stuff>
<valueName>aaa</valueName>
<value>000</value>
<valueName>bbb</valueName>
<value>111</value>
</stuff>
</entry>
If you feel that something should work but it doesn't, it is always worth looking closer. One option is to echo the string again as XML to see what SimpleXML has parsed; the other is to enable error reporting and look for warnings and errors, which often contain further information.
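For the full feed in the question, where the cap prefix is declared on <feed>, a likely explanation is different: <value> and <valueName> carry no prefix, so they belong to the default Atom namespace rather than to cap, while SimpleXML keeps looking in the cap namespace after children('cap', true). The sketch below switches back to the Atom namespace explicitly before reading them. The inline feed is a trimmed, hypothetical sample based on the XML in the question.

```php
<?php
// Trimmed sample based on the feed in the question.
$feedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:cap="urn:oasis:names:tc:emergency:cap:1.1">
  <entry>
    <cap:event>Blizzard Warning</cap:event>
    <cap:geocode>
      <valueName>FIPS6</valueName>
      <value>002185</value>
      <valueName>UGC</valueName>
      <value>AKZ204</value>
    </cap:geocode>
  </entry>
</feed>
XML;

$feed = simplexml_load_string($feedXml);
$entry = $feed->entry;

// Elements with the cap: prefix.
$cap = $entry->children('urn:oasis:names:tc:emergency:cap:1.1');

// Children of cap:geocode have no prefix, so they are in the default (Atom) namespace.
$geo = $cap->geocode->children('http://www.w3.org/2005/Atom');

// Pair each valueName with the value that follows it.
$codes = array();
$i = 0;
foreach ($geo->value as $value) {
    $codes[(string) $geo->valueName[$i]] = (string) $value;
    $i++;
}

print_r($codes);
```

The same switch applies to cap:parameter, whose valueName and value children are unprefixed as well.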

Trying to pull in elements from Yahoo Weather XML

Trying to pull a value from the sunrise element from [yweather:astronomy] in the Yahoo XML Doc.
Tried various combinations along the lines of:
echo $yweather->astronomy == 'sunrise';
Is pulling a value from the sunrise element the right terminology? Struggling to find much in the way of help using this terminology on the web.
The remainder of the code is functioning as I wish
Yahoo XML Doc - snippet
<rss xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" version="2.0">
<channel>
<title>Yahoo! Weather - New York, NY</title>
<link>
http://us.rd.yahoo.com/dailynews/rss/weather/New_York__NY/*http://weather.yahoo.com/forecast/USNY0996_f.html
</link>
<description>Yahoo! Weather for New York, NY</description>
<language>en-us</language>
<lastBuildDate>Mon, 12 Dec 2011 1:50 pm EST</lastBuildDate>
<ttl>60</ttl>
<yweather:location city="New York" region="NY" country="US"/>
<yweather:units temperature="F" distance="mi" pressure="in" speed="mph"/>
<yweather:wind chill="40" direction="0" speed="5"/>
<yweather:atmosphere humidity="37" visibility="10" pressure="30.54" rising="2"/>
<yweather:astronomy sunrise="7:09 am" sunset="4:26 pm"/>
rssweather.php
<?php
// Get XML data from source
$feed = file_get_contents("http://weather.yahooapis.com/forecastrss?p=USNY0996&u=f");
// Check to ensure the feed exists
if(!$feed){
die('Weather not found! Check feed URL');
}
$xml = new SimpleXmlElement($feed);
foreach ($xml->channel as $entry){
echo "<strong>Description</strong> ";
echo $entry->description;
echo "<br /><strong>Collected on</strong> ";
echo $entry->lastBuildDate;
//Use that namespace
$yweather = $entry->children("http://xml.weather.yahoo.com/ns/rss/1.0");
echo "<br /><strong>Sunrise</strong> ";
echo $yweather->astronomy == 'sunrise';
}
?>
My final solution
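// Get and return the sunrise time
function get_sunrise_time(SimpleXMLElement $xml) {
$weather = array();
// The snippet has a single <yweather:astronomy> element, so no index is needed
$weather['sunrise'] = (string) $xml->channel->children('yweather', TRUE)->astronomy->attributes()->sunrise;
return $weather;
}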
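For reference, here is a self-contained sketch of the same idea, reading both attributes of <yweather:astronomy>. The inline RSS is a shortened copy of the snippet in the question, so nothing is fetched from the network. The namespace is selected by its URI here, which keeps working even if a feed binds that URI to a different prefix.

```php
<?php
// Shortened copy of the RSS snippet from the question.
$rss = <<<XML
<rss xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0" version="2.0">
  <channel>
    <title>Yahoo! Weather - New York, NY</title>
    <yweather:astronomy sunrise="7:09 am" sunset="4:26 pm"/>
  </channel>
</rss>
XML;

$xml = new SimpleXMLElement($rss);

// Select the yweather elements under <channel> by namespace URI.
$yweather = $xml->channel->children('http://xml.weather.yahoo.com/ns/rss/1.0');

// sunrise and sunset are unprefixed attributes, so attributes() needs no argument.
$astronomy = $yweather->astronomy->attributes();
$sunrise = (string) $astronomy->sunrise;
$sunset  = (string) $astronomy->sunset;

echo "Sunrise: $sunrise\n";
echo "Sunset: $sunset\n";
```

The values live in attributes rather than in element text, which is why comparing $yweather->astronomy with 'sunrise' prints nothing useful.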
