I'm building a script that takes the contents of several (~13) news feeds and parses the XML data and inserts the records into a database. Since I don't have any control over the structure of the feeds, I need to tailor an object operator for each one to drill down into the structure in order to get the information I need.
The script works just fine if the target node is one step below the root, but if my string contains a second step, it fails ( 'foo' works, but 'foo->bar' fails). I've tried escaping characters and eval(), but I feel like I'm missing something glaringly obvious. Any help would be greatly appreciated.
// Roadmaps for xml navigation
$roadmap[1] = "deal"; // works
$roadmap[2] = "channel->item"; // fails
$roadmap[3] = "deals->deal";
$roadmap[4] = "resource";
$roadmap[5] = "object";
$roadmap[6] = "product";
$roadmap[8] = "channel->deal";
$roadmap[13] = "channel->item";
$roadmap[20] = "product";
$xmlSource = $xmlURL[$fID];
$xml=simplexml_load_file($xmlSource) or die(mysql_error());
if (!(empty($xml))) {
foreach($xml->$roadmap[$fID] as $div) {
include('./_'.$incName.'/feedVars.php');
include('./_includes/masterCategory.php.inc');
$test = sqlVendors($vendorName);
} // end foreach
echo $vUpdated." records updated.<br>";
echo $vInserted." records Inserted.<br><br>";
} else {
echo $xmlSource." returned an empty set!";
} // END IF empty $xml result
While Fosco's solution will work, it is indeed very dirty.
How about using xpath instead of object properties?
$xml->xpath('deals/deal');
PHP isn't going to magically turn your string which includes -> into a second level search.
Quick and dirty hack...
eval("\$node = \"\$xml->" . $roadmap[$fID] . "\";");
foreach($node as $div) {
Related
First, I am pretty clueless with PHP, so be kind! I am working on a site for an SPCA (I'm a Vet and a part time geek). The PHP accesses an xml file from a portal used to administer the shelter and store images, info. The file writes that xml data to JSON and then I use the JSON data in a handlebars template, etc. I am having a problem getting some data from the xml file to outprint to JSON.
The xml file is like this:
</DataFeedAnimal>
<AdditionalPhotoUrls>
<string>doc_73737.jpg</string>
<string>doc_74483.jpg</string>
<string>doc_74484.jpg</string>
</AdditionalPhotoUrls>
<PrimaryPhotoUrl>19427.jpg</PrimaryPhotoUrl>
<Sex>Male</Sex>
<Type>Cat</Type>
<YouTubeVideoUrls>
<string>http://www.youtube.com/watch?v=6EMT2s4n6Xc</string>
</YouTubeVideoUrls>
</DataFeedAnimal>
In the PHP file, written by a friend, the code is below, (just part of it), to access that XML data and write it to JSON:
<?php
$url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/AnimalsForAdoption.aspx";
if ($_GET["type"] == "found") {
$url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/foundanimals.aspx";
} else if ($_GET["type"] == "lost") {
$url = "http://eastbayspcapets.shelterbuddy.com/DataFeeds/lostanimals.aspx";
}
$response_xml_data = file_get_contents($url);
$xml = simplexml_load_string($response_xml_data);
$data = array();
foreach($xml->DataFeedAnimal as $animal)
{
$item = array();
$item['sex'] = (string)$animal->Sex;
$item['photo'] = (string)$animal->PrimaryPhotoUrl;
$item['videos'][] = (string)$animal->YouTubeVideoUrls;
$item['photos'][] = (string)$animal->PrimaryPhotoUrl;
foreach($animal->AdditionalPhotoUrls->string as $photo) {
$item['photos'][] = (string)$photo;
}
$item['videos'] = array();
$data[] = $item;
}
echo file_put_contents('../adopt.json', json_encode($data));
echo json_encode($data);
?>
The JSON output works well but I am unable to get 'videos' to write out to the JSON file as the 'photos' do. I just get '/n'!
Since the friend who helped with this is no longer around, I am stuck. I have tried similar code to the foreach statement for photos but am getting nowhere. Any help would be appreciated and the pets would appreciate it as well!
The trick with such implementations is to always look what you have got by dumping data structures to a log file or command line. Then to take a look at the documentation of the data you see. That way you know exactly what data you are working with and how to work with it ;-)
Here it turns out that the video URLs you are interested in are placed inside an object of type SimpleXMLElement with public properties, which is not really surprising if you look at the xml structure. The documentation of class SimpleXMLElement shows the method children() which iterates through all children. Just what we are looking for...
That means a clean implementation to access those sets should go along these lines:
foreach($animal->AdditionalPhotoUrls->children() as $photo) {
$item['photos'][] = (string)$photo;
}
foreach($animal->YouTubeVideoUrls->children() as $video) {
$item['videos'][] = (string)$video;
}
Take a look at this full and working example:
<?php
$response_xml_data = <<< EOT
<DataFeedAnimal>
<AdditionalPhotoUrls>
<string>doc_73737.jpg</string>
<string>doc_74483.jpg</string>
<string>doc_74484.jpg</string>
</AdditionalPhotoUrls>
<PrimaryPhotoUrl>19427.jpg</PrimaryPhotoUrl>
<Sex>Male</Sex>
<Type>Cat</Type>
<YouTubeVideoUrls>
<string>http://www.youtube.com/watch?v=6EMT2s4n6Xc</string>
<string>http://www.youtube.com/watch?v=hgfg83mKFnd</string>
</YouTubeVideoUrls>
</DataFeedAnimal>
EOT;
$animal = simplexml_load_string($response_xml_data);
$item = [];
$item['sex'] = (string)$animal->Sex;
$item['photo'] = (string)$animal->PrimaryPhotoUrl;
$item['photos'][] = (string)$animal->PrimaryPhotoUrl;
foreach($animal->AdditionalPhotoUrls->children() as $photo) {
$item['photos'][] = (string)$photo;
}
$item['videos'] = [];
foreach($animal->YouTubeVideoUrls->children() as $video) {
$item['videos'][] = (string)$video;
}
echo json_encode($item);
The obvious output of this is:
{
"sex":"Male",
"photo":"19427.jpg",
"photos" ["19427.jpg","doc_73737.jpg","doc_74483.jpg","doc_74484.jpg"],
"videos":["http:\/\/www.youtube.com\/watch?v=6EMT2s4n6Xc","http:\/\/www.youtube.com\/watch?v=hgfg83mKFnd"]
}
I would however like to add a short hint:
In m eyes it is questionable to convert such structured information into an associative array. Why? Why not a simple json_encode($animal)? The structure is perfectly fine and should be easy to work with! The output of that would be:
{
"AdditionalPhotoUrls":{
"string":[
"doc_73737.jpg",
"doc_74483.jpg",
"doc_74484.jpg"
]
},
"PrimaryPhotoUrl":"19427.jpg",
"Sex":"Male",
"Type":"Cat",
"YouTubeVideoUrls":{
"string":[
"http:\/\/www.youtube.com\/watch?v=6EMT2s4n6Xc",
"http:\/\/www.youtube.com\/watch?v=hgfg83mKFnd"
]
}
}
That structure describes objects (items with an inner structure, enclosed in json by {...}), not just arbitrary arrays (sets without a structure, enclosed in json by a [...]). Arrays are only used for the two unstructured sets of strings in there: photos and videos. This is much more logical, once you think about it...
Assuming the XML Data and the JSON data are intended to have the same structure. I would take a look at this: PHP convert XML to JSON
You may not need for loops at all.
I got a PHP array with a lot of XML users-file URL :
$tab_users[0]=john.xml
$tab_users[1]=chris.xml
$tab_users[n...]=phil.xml
For each user a <zoom> tag is filled or not, depending if user filled it up or not:
john.xml = <zoom>Some content here</zoom>
chris.xml = <zoom/>
phil.xml = <zoom/>
I'm trying to explore the users datas and display the first filled <zoom> tag, but randomized: each time you reload the page the <div id="zoom"> content is different.
$rand=rand(0,$n); // $n is the number of users
$datas_zoom=zoom($n,$rand);
My PHP function
function zoom($n,$rand) {
global $tab_users;
$datas_user=new SimpleXMLElement($tab_users[$rand],null,true);
$tag=$datas_user->xpath('/user');
//if zoom found
if($tag[0]->zoom !='') {
$txt_zoom=$tag[0]->zoom;
}
... some other taff here
// no "zoom" value found
if ($txt_zoom =='') {
echo 'RAND='.$rand.' XML='.$tab_users[$rand].'<br />';
$datas_zoom=zoom($r,$n,$rand); } // random zoom fct again and again till...
}
else {
echo 'ZOOM='.$txt_zoom.'<br />';
return $txt_zoom; // we got it!
}
}
echo '<br />Return='.$datas_zoom;
The prob is: when by chance the first XML explored contains a "zoom" information the function returns it, but if not nothing returns... An exemple of results when the first one is by chance the good one:
// for RAND=0, XML=john.xml
ZOOM=Anything here
Return=Some content here // we're lucky
Unlucky:
RAND=1 XML=chris.xml
RAND=2 XML=phil.xml
// the for RAND=0 and XML=john.xml
ZOOM=Anything here
// content founded but Return is empty
Return=
What's wrong?
I suggest importing the values into a database table, generating a single local file or something like that. So that you don't have to open and parse all the XML files for each request.
Reading multiple files is a lot slower then reading a single file. And using a database even the random logic can be moved to SQL.
You're are currently using SimpleXML, but fetching a single value from an XML document is actually easier with DOM. SimpleXMLElement::xpath() only supports Xpath expression that return a node list, but DOMXpath::evaluate() can return the scalar value directly:
$document = new DOMDocument();
$document->load($xmlFile);
$xpath = new DOMXpath($document);
$zoomValue = $xpath->evaluate('string(//zoom[1])');
//zoom[1] will fetch the first zoom element node in a node list. Casting the list into a string will return the text content of the first node or an empty string if the list was empty (no node found).
For the sake of this example assume that you generated an XML like this
<zooms>
<zoom user="u1">z1</zoom>
<zoom user="u2">z2</zoom>
</zooms>
In this case you can use Xpath to fetch all zoom nodes and get a random node from the list.
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$zooms = $xpath->evaluate('//zoom');
$zoom = $zooms->item(mt_rand(0, $zooms->length - 1));
var_dump(
[
'user' => $zoom->getAttribute('user'),
'zoom' => $zoom->textContent
]
);
Your main issue is that you are not returning any value when there is no zoom found.
$datas_zoom=zoom($r,$n,$rand); // no return keyword here!
When you're using recursion, you usually want to "chain" return values on and on, till you find the one you need. $datas_zoom is not a global variable and it will not "leak out" outside of your function. Please read the php's variable scope documentation for more info.
Then again, you're calling zoom function with three arguments ($r,$n,$rand) while the function can only handle two ($n and $rand). Also the $r is undiefined, $n is not used at all and you are most likely trying to use the same $rand value again and again, which obviously cannot work.
Also note that there are too many closing braces in your code.
I think the best approach for your problem will be to shuffle the array and then to use it like FIFO without recursion (which should be slightly faster):
function zoom($tab_users) {
// shuffle an array once
shuffle($tab_users);
// init variable
$txt_zoom = null;
// repeat until zoom is found or there
// are no more elements in array
do {
$rand = array_pop($tab_users);
$datas_user = new SimpleXMLElement($rand, null, true);
$tag=$datas_user->xpath('/user');
//if zoom found
if($tag[0]->zoom !='') {
$txt_zoom=$tag[0]->zoom;
}
} while(!$txt_zoom && !empty($tab_users));
return $txt_zoom;
}
$datas_zoom = zoom($tab_users); // your zoom is here!
Please read more about php scopes, php functions and recursion.
There's no reason for recursion. A simple loop would do.
$datas_user=new SimpleXMLElement($tab_users[$rand],null,true);
$tag=$datas_user->xpath('/user');
$max = $tag->length;
while(true) {
$test_index = rand(0, $max);
if ($tag[$test_index]->zoom != "") {
break;
}
}
Of course, you might want to add a bit more logic to handle the case where NO zooms have text set, in which case the above would be an infinite loop.
I am parsing through an XML document and getting the values of nested tags using asXML(). This works fine, but I would like to move this data into a MySQL database whose columns match the tags of the file. So essentially how do I get the tags that asXML() is pulling text from?
This way I can eventually do something like: INSERT INTO db.table (TheXMLTag) VALUES ('XMLTagText');
This is my code as of now:
$xml = simplexml_load_file($target_file) or die ("Error: Cannot create object");
foreach ($xml->Message->SettlementReport->SettlementData as $main ){
$value = $main->asXML();
echo '<pre>'; echo $value; echo '</pre>';
}
foreach ($xml->Message->SettlementReport->Order as $main ){
$value = $main->asXML();
echo '<pre>'; echo $value; echo '</pre>';
}
This is what my file looks like to give you an idea (So essentially how do I get the tags within [SettlementData], [0], [Fulfillment], [Item], etc. ?):
I would like to move this data into a MySQL database whose columns match the tags of the file.
Your problem is two folded.
The first part of the problem is to do the introspection on the database structure. That is, obtain all table names and obtain the column names of these. Most modern databases offer this functionality, so does MySQL. In MySQL those are the INFORMATION_SCHEMA Tables. You can query them as if those were normal database tables. I generally recommend PDO for that in PHP, mysqli is naturally doing the job perfectly as well.
The second part is parsing the XML data and mapping it's data onto the database tables (you use SimpleXMLElement for that in your question so I related to it specifically). For that you first of all need to find out how you would like to map the data from the XML onto the database. An XML file does not have a 2D structure like a relational database table, but it has a tree structure.
For example (if I read your question right) you identify Message->SettlementReport->SettlementData as the first "table". For that specific example it is easy as the <SettlementData> only has child-elements that could represent a column name (the element name) and value (the text-content). For that it is easy:
header('Content-Type: text/plain; charset=utf-8');
$table = $xml->Message->SettlementReport->SettlementData;
foreach ($table as $name => $value ) {
echo $name, ': ', $value, "\n";
}
As you can see, specifying the key assignment in the foreach clause will give you the element name with SimpleXMLElement. Alternatively, the SimpleXMLElement::getName() method does the same (just an example which does the same just with slightly different code):
header('Content-Type: text/plain; charset=utf-8');
$table = $xml->Message->SettlementReport->SettlementData;
foreach ($table as $value) {
$name = $value->getName();
echo $name, ': ', $value, "\n";
}
In this case you benefit from the fact that the Iterator provided in the foreach of the SimpleXMLElement you access via $xml->...->SettlementData traverses all child-elements.
A more generic concept would be Xpath here. So bear with me presenting you a third example which - again - does a similar output:
header('Content-Type: text/plain; charset=utf-8');
$rows = $xml->xpath('/*/Message/SettlementReport/SettlementData');
foreach ($rows as $row) {
foreach ($row as $column) {
$name = $column->getName();
$value = (string) $column;
echo $name, ': ', $value, "\n";
}
}
However, as mentioned earlier, mapping a tree-structure (N-Depth) onto a 2D-structure (a database table) might now always be that straight forward.
If you're looking what could be an outcome (there will most often be data-loss or data-duplication) a more complex PHP example is given in a previous Q&A:
How excel reads XML file?
PHP XML to dynamic table
Please note: As the matter of fact such mappings on it's own can be complex, the questions and answers inherit from that complexity. This first of all means those might not be easy to read but also - perhaps more prominently - might just not apply to your question. Those are merely to broaden your view and provide and some examples for certain scenarios.
I hope this is helpful, please provide any feedback in form of comments below. Your problem might or might not be less problematic, so this hopefully helps you to decide how/where to go on.
I tried with SimpleXML but it skips text data. However, using the Document Object Model extension works.
This returns an array where each element is an array with 2 keys: tag and text, returned in the order in which the tree is walked.
<?php
// recursive, pass by reference (spare memory ? meh...)
// can skip non tag elements (removes lots of empty elements)
function tagData(&$node, $skipNonTag=false) {
// get function name, allows to rename function without too much work
$self = __FUNCTION__;
// init
$out = array();
$innerXML = '';
// get document
$doc = $node->nodeName == '#document'
? $node
: $node->ownerDocument;
// current tag
// we use a reference to innerXML to fill it later to keep the tree order
// without ref, this would go after the loop, children would appear first
// not really important but we never know
if(!(mb_substr($node->nodeName,0,1) == '#' && $skipNonTag)) {
$out[] = array(
'tag' => $node->nodeName,
'text' => &$innerXML,
);
}
// build current innerXML and process children
// check for children
if($node->hasChildNodes()) {
// process children
foreach($node->childNodes as $child) {
// build current innerXML
$innerXML .= $doc->saveXML($child);
// repeat process with children
$out = array_merge($out, $self($child, $skipNonTag));
}
}
// return current + children
return $out;
}
$xml = new DOMDocument();
$xml->load($target_file) or die ("Error: Cannot load xml");
$tags = tagData($xml, true);
//print_r($tags);
?>
I have made a small script which uses the Twitch API. The API only allows a maximum of 100 results per query. I would like to have this query carry on until there are no more results.
My theory behind this, is to run a foreach or while loop and increment the offset by 1 each time.
My problem however, is that I cannot change the foreach parameters within itself.
Is there anyway of executing this efficiently without causing an infinite loop?
Here is my current code:
<?php
$newcurrentFollower = 0;
$offset=0;
$i = 100;
$json = json_decode(file_get_contents("https://api.twitch.tv/kraken/channels/greatbritishbg/follows?limit=25&offset=".$offset));
foreach ($json->follows as $follow)
{
echo $follow->user->name . ' (' . $newcurrentFollower . ')' . "<br>";
$newcurrentFollower++;
$offset++;
$json = json_decode(file_get_contents("https://api.twitch.tv/kraken/channels/greatbritishbg/follows?limit=25&offset=".$offset));
}
?>
Using a While loop:
while($i < $total)
{
$json = json_decode(file_get_contents("https://api.twitch.tv/kraken/channels/greatbritishbg/follows?limit=25&offset=".$offset));
echo $json->follows->user->name . ' (' . $newcurrentFollower . ')' . "<br>";
$newcurrentFollower++;
$offset++;
$i++;
}
Ends up echoing this (No names are successfully being grabbed):
Here is the API part for $json->follows:
https://github.com/justintv/Twitch-API/blob/master/v2_resources/channels.md#get-channelschannelfollows
You can use this:
$offset = 0;
$count = 1;
do {
$response = json_decode(file_get_contents(
'https://api.twitch.tv/kraken/channels/greatbritishbg/follows?limit=100&offset=' . $offset
));
foreach($response->follows as $follow) {
echo $follow->user->name . ' (' . ($count++) . ')' . "</br>";
}
$offset+=25;
} while (!empty($response->follows));
You want to use a while loop here, not just a foreach. Basically:
while (the HTTP request returns results)
{
foreach ($json->follows as $follow)
{
do stuff
}
increment offset so the next request returns the next one not already processed
}
The trickiest part is going to be getting the while condition right so that it returns false when the request gets no more results, and will depend on what the API actually returns if there are no more results.
Also important, the cleanest way would be to have the HTTP request occur as part of the while condition, but if you need to do some complicated computation of the JSON return to check the condition, you can put an initial HTTP request before the loop, and then do another request at the end of each while loop iteration.
The problem is you're only capturing the key not the value. Place it into a datastructure to access the information.
Honestly I find a recursive function much more effective than a iterative/loop approach then just update a datatable or list before the next call. It's simple, uses cursors, lightweight and does the job. Reusable if you use generics on it too.
This code will be in c#, however I know with minor changes you'll be able to get it working in php with ease.
query = //follower object get request//
private void doProcessFollowers(string query)
{
HTTPParse followerData = new HTTPParse(); //custom json wrapper. using the basic is fine. Careful with your cons though
var newRoot = followerData.createFollowersRoot(query); // generates a class populated by json
if (newRoot[0]._cursor != null)
{
populateUserDataTable(newRoot); //update dataset
doProcessFollowers(newRoot[0]._links.next); //recurse
}
}
Anyway - This just allows you to roll through the cursors without needing to worry about indexes - unless you specifically want them for whatever reason. If you're working with generics you can just reuse this code without issue. Find a generic example below. All you need to do to make it reuseable is pass the correct class within the <> of the method call. Can work for any custom class that you use to parse json data with. Which is basically what the 'createfollowerroot()' is in the above code, except that's hard typed.
Also I know it's in c# and the topic is php, with a few minor changes to syntax you'll get it working easily.
Anyway Hope this helped somebody
Generic example:
public static List<T> rootSerialize<T>(JsonTextReader reader)
{
List<T> outputData = new List<T>();
while (reader.Read())
{
JsonSerializer serializer = new JsonSerializer();
var tempData = serializer.Deserialize<T>(reader);
outputData.Add(tempData);
}
return outputData;
}
I'm having a some trouble accessing attributes in my XML. My code is below. Initially I had two loops and this was working with no problems.
I would first get the image names and then use the second loop to get the story heading and story details. Then insert everything into the database. I want to tidy up the code and use only one loop. My image name is store in the Href attribute. ()
Sample XML layout (http://pastie.org/1850682). The XML layout is a bit messy so that was the reason for using two loops.
$xml = new SimpleXMLElement('entertainment/Showbiz.xml', null, true);
// Get story images
//$i=0;
//$image = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsComponent/NewsComponent/ContentItem');
// foreach($image as $imageNode){
// $attributeArray = $imageNode->attributes();
// if ($attributeArray != ""){
// $imageArray[$i] = $attributeArray;
// $i++;
// }
//}
// Get story header & detail
$i=0;
$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent');
foreach($story as $contentItem){
//$dbImage = $imageArray[$i]['Href'];
foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
$strDetail = "";
foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.content/p') as $detail){
$strDetail .= '<p>'.$detail.'</p>';
foreach($contentItem->xpath('NewsComponent/NewsComponent/ContentItem') as $imageNode){
$dbImage = $imageNode->attributes();
}
}
$link = getUnique($headline);
$sql = "INSERT INTO tablename (headline, detail, image, link) VALUES ('".mysql_real_escape_string($headline)."', '".mysql_real_escape_string($strDetail)."', '".mysql_real_escape_string($dbImage)."', '".$link."')";
if (mysql_query($sql, $db) or die(mysql_error())){
echo "Loaded ";
}else{
echo "Not Loaded ";
}
}
$i++;
}
I think I'm close to getting it. I tried putting a few echo statements in the fourth nested foreach loop, but nothing was out. So its not executing that loop. I've been at this for a few hours and googled as well, just can't manage to get it.
If all else fails, I'll just go back to using two loops.
Regards,
Stephen
This was pretty difficult to follow. I've simplified the structure so we can see the parts of the hierarchy we care about.
It appears that the NewsComponent that has a Duid attribute is what defines/contains one complete news piece. Of its two children, the first child NewsComponent contains the summary and text, while the second child NewsComponent contains the image.
Your initial XPath query is for 'NewsItem/NewsComponent/NewsComponent/NewsComponent', which is the first NewsComponent child (the one with the body text). You can't find the image from that point because the image isn't within that NewsComponent; you've gone one level too deep. (I was tipped off by the fact I got a PHP Notice: Undefined variable: dbImage.) Thus, drop your initial XPath query back a level, and add that extra level to your subsequent XPath queries where needed.
From this:
$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent/NewsComponent');
foreach($story as $contentItem){
foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
foreach($contentItem->xpath('ContentItem/DataContent/nitf/body/body.content/p') as $detail){
foreach($contentItem->xpath('NewsComponent/NewsComponent/ContentItem') as $imageNode){ /* ... */ }}}}
to this:
$story = $xml->xpath('NewsItem/NewsComponent/NewsComponent');
foreach($story as $contentItem){
foreach($contentItem->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.head/hedline/hl1') as $headline){
foreach($contentItem->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.content/p') as $detail){
foreach($contentItem->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem') as $imageNode){ /* ... */ }}}}
However, the image still doesn't work after that. Because you're using loops (sometimes unnecessarily), $dbImage gets reassigned to an empty string. The first ContentItem has the Href attribute, which gets assigned to $dbImage. But then it loops to the next ContentItem, which has no attributes and therefore overwrites $dbImage with an empty value. I'd recommend modifying that XPath query to find only ContentItems that have an Href attribute, like this:
->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem[#Href]')
That should do it.
Other thoughts
Refactor to clean up this code, if/where possible.
As I mentioned, sometimes you are looping and nesting when you don't need to, and it just ends up being harder to follow and potentially introducing logical bugs (like the image one). It seems that the structure of this file will always be consistent. If so, you can forgo some looping and go straight for the pieces of data you're looking for. You could do something like this:
// Get story header & detail
$stories = $xml->xpath('/NewsML/NewsItem/NewsComponent/NewsComponent');
foreach ($stories as $story) {
$headlineItem = $story->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.head/hedline/hl1');
$headline = $headlineItem[0];
$detailItems = $story->xpath('NewsComponent/ContentItem/DataContent/nitf/body/body.content/p');
$strDetail = '<p>' . implode('</p><p>', $detailItems) . '</p>';
$imageItem = $story->xpath('NewsComponent/NewsComponent/NewsComponent/ContentItem[#Href]');
$imageAtts = $imageItem[0]->attributes();
$dbImage = $imageAtts['Href'];
$link = getUnique($headline);
$sql = "INSERT INTO tablename (headline, detail, image, link) VALUES ('".mysql_real_escape_string($headline)."', '".mysql_real_escape_string($strDetail)."', '".mysql_real_escape_string($dbImage)."', '".$link."')";
if (mysql_query($sql, $db) or die(mysql_error())) {
echo "Loaded ";
} else {
echo "Not Loaded ";
}
}