Parsing XML document with PHP using 'foreach' loop - php

I'm new to PHP, MySQL and XML... and have been trying to wrap my head around classes, objects, arrays and loops. I'm working on a parser that extracts data from an XML file, then stores it in a database. A fun and delightfully frustrating challenge to work on during the Christmas holiday.
Before posting this question I went over the PHP 5.x documentation and W3C, and also searched quite a bit around Stack Overflow.
Here's the code...
> XML:
<alliancedata>
<server>
<name>irrelevant</name>
</server>
<alliances>
<alliance>
<alliance id="101">Knock Out</alliance>
<roles>
<role>
<role id="1">irrelevant</role>
</role>
</roles>
<relationships>
<relationship>
<proposedbyalliance id="102" />
<acceptedbyalliance id="101" />
<relationshiptype id="4">NAP</relationshiptype>
<establishedsince>2014-12-27T18:01:34.130</establishedsince>
</relationship>
<relationship>
<proposedbyalliance id="101" />
<acceptedbyalliance id="103" />
<relationshiptype id="4">NAP</relationshiptype>
<establishedsince>2014-12-27T18:01:34.130</establishedsince>
</relationship>
<relationship>
<proposedbyalliance id="104" />
<acceptedbyalliance id="101" />
<relationshiptype id="4">NAP</relationshiptype>
<establishedsince>2014-12-27T18:01:34.130</establishedsince>
</relationship>
</relationships>
</alliance>
</alliancedata>
> PHP:
$xml = simplexml_load_file($alliances_xml); // $alliances_xml = path to file
// die(var_dump($xml));
// var_dump prints out the entire unparsed xml file.
foreach ($xml->alliances as $alliances) {
// Alliance info
$alliance_id = mysqli_real_escape_string($dbconnect, $alliances->alliance->alliance['id']);
$alliance_name = mysqli_real_escape_string($dbconnect,$alliances->alliance->alliance);
// Diplomacy info
$proposed_by_alliance_id = mysqli_real_escape_string($dbconnect,$alliances->alliance->relationships->relationship->proposedbyalliance['id']);
$accepted_by_alliance_id = mysqli_real_escape_string($dbconnect,$alliances->alliance->relationships->relationship->acceptedbyalliance['id']);
$relationship_type_id = mysqli_real_escape_string($dbconnect,$alliances->alliance->relationships->relationship->relationshiptype['id']);
$established_date = mysqli_real_escape_string($dbconnect,$alliances->alliance->relationships->relationship->establishedsince);
// this is my attempt to echo every result
echo "Alliance ID: <b>$alliance_id</b> <br/>";
echo "Alliance NAME: <b>$alliance_name</b> <br/>";
echo "Diplomacy Proposed: <b>$proposed_by_alliance_id</b> <br/>";
echo "Diplomacy Accepted: <b>$accepted_by_alliance_id</b> <br/>";
echo "Diplomacy Type: <b>$relationship_type_id</b> <br/>";
echo "Date Accepted: <b>$established_date</b> <br/>";
echo "<hr/>";
}
> interpreter output:
Alliance ID: 1
Alliance NAME: Knock Out
Diplomacy Proposed: 102
Diplomacy Accepted: 101
Diplomacy Type: 4
Date Accepted: 2011-10-24T05:08:35.830
I don't understand why the loop simply stops after parsing the first row of data. My best guess, is that my code is not telling PHP what to do after the first values are parsed.
Honestly I have no idea how to explain this in words, so here's a visual representation.
First row is interpreted as
--->$alliance_id
--->$alliance_name
--->$proposed_by_alliance_id
--->$accepted_by_alliance_id
--->$relationship_type_id
--->$established_date
then for the next <relationship> subnodes the following happens...
---> ?? _(no data)_
---> ?? _(no data)_
--->$proposed_by_alliance_id
--->$accepted_by_alliance_id
--->$relationship_type_id
--->$established_date
Since I'm not telling PHP to add $alliance_id and $alliance_name to every iteration of the <relationship> subnode, the interpreter simply decides to abort the foreach operation.
As I mentioned above, I'm new to both PHP and Stackoverflow and I really appreciate any help or wisdom you can share. Thank you in advance.

You write that you're having trouble debugging how you traverse an XML document with SimpleXML.
The first puzzle you run into is that your foreach iterates only once:
foreach ($xml->alliances as $alliances) {
That looks wrong to you. However, if we take the XML from your question and actually count how many <alliances> elements the document has, we can see that SimpleXML is doing the right thing here:
there is exactly one (1) <alliances> element inside the document element.
$xml->alliances has one (1) iteration.
$xml->alliances->count() gives int(1)
This agreement with the XML is easy to verify as well. The commented-out dead code in your question's example suggests that you were using var_dump to see whether the XML loads. You don't have to: if simplexml_load_file does not return false, the document was loaded (if you test for falsy instead: the document was either not loaded or empty).
So if you want to ensure the document has loaded, just check the return value and throw an exception in case there was a problem.
To check which XML a SimpleXMLElement contains, you shouldn't use var_dump either. Instead, output the XML. As the XML can be quite large at this point, take only the first 256 bytes, for example; that normally gives a good picture:
echo substr($xml->alliances->asXML(), 0, 256), "\n";
<alliances>
<alliance>
<alliance id="1">Harmless?</alliance>
<foundedbyplayerid id="10"/><alliancecapitaltownid id="14646"/>
<allianceticker>H?</allianceticker>
<foundeddatetime>2010-02-25T14:18:07.867</foundeddatetime>
<alliancecapitallastmoved>2012-01-19T17:42
^^^^^^^^^
This directly shows that you're iterating over the elements named alliances, of which there is only one in the document. That matches your observation that the foreach body runs only once.
With this really basic debugging you can draw the following conclusions:
It is observed that foreach iterates only once (1).
foreach has been told to iterate over elements named alliances.
As there is only one (1) iteration, there has to be only one (1) alliances element.
Counting the alliances elements, the result is one.
Therefore it is confirmed that there is only one (1) alliances element.
So obviously you're iterating over the wrong element(s).
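The counting argument above can be checked in a few lines. This is only a minimal sketch: the XML is a hypothetical, cut-down sample in the shape of the question's document (with the closing </alliances> tag added so it parses), and the variable names $wrapperCount and $allianceCount are made up for illustration.

```php
<?php
// Hypothetical, cut-down sample; not the real data file from the question.
$sample = <<<'XML'
<alliancedata>
  <alliances>
    <alliance><alliance id="101">Knock Out</alliance></alliance>
    <alliance><alliance id="102">Example Two</alliance></alliance>
  </alliances>
</alliancedata>
XML;

$xml = simplexml_load_string($sample);

// Number of <alliances> wrapper elements under the document element.
$wrapperCount = $xml->alliances->count();

// Number of <alliance> elements inside that wrapper.
$allianceCount = $xml->alliances->alliance->count();

echo "alliances: $wrapperCount, alliance: $allianceCount\n";
```

With this sample the wrapper is counted once and its children twice, which is the difference between the loop you wrote and the loop you meant.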
This outline of the error finding is rather extensive; it is meant to show you the many points at which you could have improved both your code and your error checking, and where you can start trouble-shooting. The question remains why you weren't able to spot this already. Another answer here had already pointed out that you were iterating over the wrong element(s). However, it did not spell that out and only hinted at it, a bit cryptically, in code:
[...] change your for loop from foreach ($xml->alliances->alliance as $alliance) { to foreach ($xml->alliance as $alliance) {
and that's all
Source
Admittedly that is weak, as it only gives code but doesn't answer any of your (programming) questions.
After finding the cause, let's cure this step by step
So after finding out that it's the wrong element, it's easy to fix that: iterate over the right elements.
This can be done by applying incremental changes to your code.
First of all the correct element needs to be chosen:
foreach ($xml->alliances->alliance as $alliances) {
This will immediately make your code spit out a lot of errors, several for each iteration. And there are many iterations. So with this little change you can already say that something moved in the right direction: instead of one iteration, there are now many more.
But before fixing the mess of newly introduced errors and warnings, first take care of the code just changed. The next thing is to rename the variable $alliances to $alliance (your editor should support you with that, either via search and replace (often CTRL+R) or via a refactoring command named "rename variable" (e.g. SHIFT+F6 in PhpStorm)). Afterwards that line looks like this (the following lines change as well, but I don't show them):
foreach ($xml->alliances->alliance as $alliance) {
And it's still not done. As $xml->alliances->alliance is a bit bulky, let's move it out into a variable with a more descriptive name, $alliances:
$alliances = $xml->alliances->alliance;
foreach ($alliances as $alliance) {
The next step is to correct an error you made. For a reason that is not clear to me, you pass all data through mysqli_real_escape_string(). Even if you intend to pass the data to a database later on, this is the wrong place to call that function. First extract the data; the escaping belongs later, in the preparation of the database insert operation, which is a different part of your application.
I just replaced all occurrences of "mysqli_real_escape_string($dbconnect," with "trim(" so that finally, after proper indentation, the code has changed to this:
$alliances = $xml->alliances->alliance;
foreach ($alliances as $alliance) {
// Alliance info
$alliance_id = trim($alliance->alliance->alliance['id']);
$alliance_name = trim($alliance->alliance->alliance);
// Diplomacy info
$proposed_by_alliance_id = trim($alliance->alliance->relationships->relationship->proposedbyalliance['id']);
$accepted_by_alliance_id = trim($alliance->alliance->relationships->relationship->acceptedbyalliance['id']);
$relationship_type_id = trim($alliance->alliance->relationships->relationship->relationshiptype['id']);
$established_date = trim($alliance->alliance->relationships->relationship->establishedsince);
Thanks to the better named variables it now is pretty visible where the many
Notice: Trying to get property of non-object
warnings come from: the many calls to $alliance->alliance-> are simply redundant. Remember that originally you iterated over the wrong elements; this is the counterpart. Because you iterated over the wrong elements, you had to repeat the mistake on every line, otherwise you could not have extracted any data at all. Think about this for a second. It also means that the earlier you verify that the code actually does what you intend, the fewer little problems get introduced.
Good thing here again is that this is easy to fix by replacing all "$alliance->alliance->" with "$alliance->":
$alliances = $xml->alliances->alliance;
foreach ($alliances as $alliance) {
// Alliance info
$alliance_id = trim($alliance->alliance['id']);
$alliance_name = trim($alliance->alliance);
// Diplomacy info
$proposed_by_alliance_id = trim($alliance->relationships->relationship->proposedbyalliance['id']);
$accepted_by_alliance_id = trim($alliance->relationships->relationship->acceptedbyalliance['id']);
$relationship_type_id = trim($alliance->relationships->relationship->relationshiptype['id']);
$established_date = trim($alliance->relationships->relationship->establishedsince);
Running the code again now shows that the iteration works and that obtaining the information from each alliance element works perfectly fine as well. There are still errors, because, as you already say in your question, you wonder not only about the iteration but also about further traversing the relationships:
Alliance ID ......: 1
Alliance NAME ....: Harmless?
Diplomacy Proposed: 454
Diplomacy Accepted: 1
Diplomacy Type ...: 4
Date Accepted ...: 2011-10-24T05:08:35.830
-------------------------------------------------
[4x Notice: Trying to get property of non-object]
Alliance ID ......: 2
Alliance NAME ....: Danger
Diplomacy Proposed:
Diplomacy Accepted:
Diplomacy Type ...:
Date Accepted ...:
-------------------------------------------------
...
The error messages correspond to the following four lines:
$proposed_by_alliance_id = trim($alliance->relationships->relationship->proposedbyalliance['id']);
$accepted_by_alliance_id = trim($alliance->relationships->relationship->acceptedbyalliance['id']);
$relationship_type_id = trim($alliance->relationships->relationship->relationshiptype['id']);
$established_date = trim($alliance->relationships->relationship->establishedsince);
Which means that, again, you need to apply the trouble-shooting steps outlined at the very beginning of my answer, this time to this section of your code.
Here is the code example so far:
$xml = simplexml_load_file($alliances_xml); // $alliances_xml = path to file
if (!$xml) {
    throw new UnexpectedValueException(
        sprintf("Unable to load XML or it was empty. Filename given was %s", var_export($alliances_xml, true))
    );
}
$alliances = $xml->alliances->alliance;
// limit to two iterations for debugging
$alliances = new LimitIterator(new IteratorIterator($alliances), 0, 2);
foreach ($alliances as $alliance) {
    // Alliance info
    $alliance_id = trim($alliance->alliance['id']);
    $alliance_name = trim($alliance->alliance);
    // Diplomacy info
    $proposed_by_alliance_id = trim($alliance->relationships->relationship->proposedbyalliance['id']);
    $accepted_by_alliance_id = trim($alliance->relationships->relationship->acceptedbyalliance['id']);
    $relationship_type_id = trim($alliance->relationships->relationship->relationshiptype['id']);
    $established_date = trim($alliance->relationships->relationship->establishedsince);
    // this is my attempt to echo every result
    echo "Alliance ID ......: $alliance_id\n";
    echo "Alliance NAME ....: $alliance_name\n";
    echo "Diplomacy Proposed: $proposed_by_alliance_id\n";
    echo "Diplomacy Accepted: $accepted_by_alliance_id\n";
    echo "Diplomacy Type ...: $relationship_type_id\n";
    echo "Date Accepted ...: $established_date\n";
    echo "-------------------------------------------------\n";
}
Please note that I'm using the command line to execute the PHP code, as it's much faster than going through a web server with the browser. I also don't need to write HTML just to get nicely formatted output.
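One possible way to take the remaining step, traversing every relationship of every alliance, is a second foreach nested inside the first. The following is a sketch under assumptions, not the finished parser: the XML is a hypothetical sample shaped like the question's data (closing tags added, one alliance without relationships added for illustration), and the $rows array with its key names is made up here.

```php
<?php
// Hypothetical sample in the shape of the question's XML.
$sample = <<<'XML'
<alliancedata>
  <alliances>
    <alliance>
      <alliance id="101">Knock Out</alliance>
      <relationships>
        <relationship>
          <proposedbyalliance id="102" />
          <acceptedbyalliance id="101" />
          <relationshiptype id="4">NAP</relationshiptype>
          <establishedsince>2014-12-27T18:01:34.130</establishedsince>
        </relationship>
        <relationship>
          <proposedbyalliance id="101" />
          <acceptedbyalliance id="103" />
          <relationshiptype id="4">NAP</relationshiptype>
          <establishedsince>2014-12-27T18:01:34.130</establishedsince>
        </relationship>
      </relationships>
    </alliance>
    <alliance>
      <alliance id="105">No Relationships</alliance>
    </alliance>
  </alliances>
</alliancedata>
XML;

$xml  = simplexml_load_string($sample);
$rows = [];

foreach ($xml->alliances->alliance as $alliance) {
    // Alliance info is read once per alliance ...
    $alliance_id   = trim((string) $alliance->alliance['id']);
    $alliance_name = trim((string) $alliance->alliance);

    // ... an alliance without <relationship> elements has nothing to loop over.
    if (!isset($alliance->relationships->relationship)) {
        continue;
    }

    // ... and is then repeated for each of its relationships.
    foreach ($alliance->relationships->relationship as $relationship) {
        $rows[] = [
            'alliance_id'   => $alliance_id,
            'alliance_name' => $alliance_name,
            'proposed_by'   => (string) $relationship->proposedbyalliance['id'],
            'accepted_by'   => (string) $relationship->acceptedbyalliance['id'],
            'type_id'       => (string) $relationship->relationshiptype['id'],
            'established'   => trim((string) $relationship->establishedsince),
        ];
    }
}

foreach ($rows as $row) {
    echo implode(' | ', $row), "\n";
}
```

Collecting plain strings into rows first keeps the extraction separate from whatever happens later (echoing, or the database insert).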

I made a phpfiddle of your code; tested, working.
http://phpfiddle.org/main/code/7agg-si3f
You need to remove
<server>
<name>Epic1</name>
</server>
and add </alliances> to the end, since it's reporting invalid xml
after that change your for loop from foreach ($xml->alliances->alliance as $alliance) {
to foreach ($xml->alliance as $alliance) {
and that's all


Getting an XML value from a named field

Sorry to be asking this, but it's driving me crazy.
I've been using PHP's SimpleXMLElement as my go-to XML parser. I've looked at many examples and have given up on this many times, but now I just need to have this working. There are many examples of how to get simple fields, but not so many with values in the fields...
I'm trying to get the "track_artist_name" value from this XML as a named variable in php.
<nowplaying-info-list>
<nowplaying-info >
<property name="track_title"><![CDATA[Song Title]]></property>
<property name="track_album_name"><![CDATA[Song Album]]></property>
<property name="track_artist_name"><![CDATA[Song Artist]]></property>
</nowplaying-info>
</nowplaying-info-list>
I've tried using xpath with:
$sxml->xpath("/nowplaying-info-list[0]/nowplaying-info/property[@name='track_artist_name']");
But, I know it's all mucked up and not working.
I originally tried something like this too, thinking it made sense - but no:
$attrs = $sxml->nowplaying_info[0]->property['@name']['track_artist_name'];
echo $attrs . "\n\n";
I know I can get the values with something such as this:
$sxml->nowplaying_info[0]->property[2];
Sometimes there are more lines in the XML results than at other times, and then the fixed index breaks things by returning the wrong data.
Can someone shed some light on my problem? I'm just trying to get the name of the artist into a variable. Many thanks.
*** WORKING UPDATE: **
I was unaware there were different XML interpreter methods, and was using the following XML interpreter version:
// read feed into SimpleXML object
$sxml = new SimpleXMLElement($json);
That didn't work, but have now updated to the following (for that section of code) thanks to the help here.
$sxml_new = simplexml_load_string($json_raw);
if ( $sxml_new->xpath("/nowplaying-info-list/nowplaying-info/property[@name='track_artist_name']") != null )
{
    $results = $sxml_new->xpath("/nowplaying-info-list/nowplaying-info/property[@name='track_artist_name']");
    //print_r($results);
    $artist = (string) $results[0];
    // var_dump($artist);
    echo "Artist: " . $artist . "\n";
}
Your xpath expression is pretty much right, but you don't need to specify an index for the <nowplaying-info-list> element - it'll deal with that itself. If you were to supply an index, it would need to start at 1, not 0.
Try
$results = $sxml->xpath("/nowplaying-info-list/nowplaying-info/property[@name='track_artist_name']");
echo (string) $results[0];
Song Artist
See https://3v4l.org/eH4Dr
Your second approach:
$sxml->nowplaying_info[0]->property['@name']['track_artist_name'];
Would be trying to access the attribute named @name of the first property element, rather than treating it as an xpath-style @ expression. To do this without using xpath, you'd need to loop over each of the <property> elements and test their name attribute.
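A minimal sketch of that loop, using the XML from the question. The variable name $artist is just an illustration; note that element names containing a hyphen have to be accessed with the {'...'} property syntax.

```php
<?php
$sample = <<<'XML'
<nowplaying-info-list>
  <nowplaying-info>
    <property name="track_title"><![CDATA[Song Title]]></property>
    <property name="track_album_name"><![CDATA[Song Album]]></property>
    <property name="track_artist_name"><![CDATA[Song Artist]]></property>
  </nowplaying-info>
</nowplaying-info-list>
XML;

$sxml = simplexml_load_string($sample);

$artist = null;
// $sxml is the <nowplaying-info-list> root, so start at its first child.
foreach ($sxml->{'nowplaying-info'}[0]->property as $property) {
    // Attributes are read with array syntax and cast to string to compare.
    if ((string) $property['name'] === 'track_artist_name') {
        $artist = (string) $property;
        break;
    }
}

echo "Artist: $artist\n";
```

Because the match is on the name attribute rather than on a position, extra <property> lines in the feed do not shift the result.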
In case the node you are looking for sits somewhere deeper in the document, you can add a double slash at the start.
$results = $sxml->xpath("//nowplaying-info-list/nowplaying-info/property[@name='track_artist_name']");
Also, in case you have multiple <nowplaying-info> elements, you can make use of the index for that (note the [1] index):
$results = $sxml->xpath("//nowplaying-info-list/nowplaying-info[1]/property[@name='track_artist_name']");

How to parse PCDATA and child element separately with PHP DOM?

I'm trying to parse an XML of a dtbook, which contains levels (1, 2 and 3) that later on contains p-tags. I'm doing this with PHP DOM. Link to XML
Inside some of these p-tags there are noteref-tags. I do get hold of those, but it seems that the only results I'm able to get are that the noteref appears either before the p-tag or after it. I need the noterefs to appear inside the p-tag; in other words, where they are actually supposed to be.
<p>Special education for the ..... <noteref class="endnote" idref="fn_5"
id="note5">5</noteref>. Interest ..... 19th century <noteref class="endnote"
idref="fn_6" id="note6">6</noteref>.</p>
This is the code I've got for the p-tag now. Before this, I'm looping through the dtbook to get to the p-tag. That works fine.
if($level1->tagName == "p") {
    echo "<p>".$level1->nodeValue;
    $noterefs = $level1->childNodes;
    foreach($noterefs as $noteref) {
        if($noteref->nodeType == XML_ELEMENT_NODE) {
            echo "<span><b>".$noteref->nodeValue."</b></span>";
        }
    }
    echo "</p><br>";
}
These are the results I get:
Special education for the ..... 5. Interest ..... 19th century 6.56
56Special education for the ..... 5. Interest ..... 19th century 6.
I also want the p-tag to not display what's inside the noteref-tag. That should be done by the noteref-tag (only).
So, does anybody know what could possibly be done to fix these things? It feels like I've both googled and tried almost everything.
DOMNode->nodeValue (which in PHP's DOMElement is the same as DOMNode->textContent) will contain the complete text content of the node itself and all its descendant nodes. Or, to put it a little more simply: it contains the complete content of the node, but with all tags removed.
What you probably want to try is something like the following (untested):
if($level1->tagName == "p") {
    echo "<p>";
    // loop through all childNodes, not just noteref elements
    foreach($level1->childNodes as $childNode) {
        // you could also use if() statements here, of course
        switch($childNode->nodeType) {
            // if it's just text
            case XML_TEXT_NODE:
                echo $childNode->nodeValue;
                break;
            // if it's an element
            case XML_ELEMENT_NODE:
                echo "<span><b>".$childNode->nodeValue."</b></span>";
                break;
        }
    }
    echo "</p><br>";
}
Be aware though that this is still rather flimsy. For instance: if any other elements, besides <noteref> elements, show up in the <p> elements, they will also be wrapped in <span><b> elements.
Hopefully I've at least given you a clue as to why your result <p> elements showed the contents of the child elements as well.
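One way to make this less flimsy is to test the element name as well, so that only <noteref> elements get the <span><b> wrapper. The sketch below is an illustration, not code from the dtbook in question: the <p> content is a shortened, hypothetical sample (the <em> element is added only to show the fallback), and the output is collected in a made-up $html variable instead of being echoed piece by piece.

```php
<?php
// Hypothetical, shortened <p> element; the <em> is added for illustration.
$doc = new DOMDocument();
$doc->loadXML(
    '<p>Special education <noteref class="endnote" idref="fn_5" id="note5">5</noteref>.'
    . ' Interest in the <em>19th</em> century'
    . ' <noteref class="endnote" idref="fn_6" id="note6">6</noteref>.</p>'
);

$p    = $doc->documentElement;
$html = '<p>';

foreach ($p->childNodes as $child) {
    if ($child->nodeType === XML_ELEMENT_NODE && $child->nodeName === 'noteref') {
        // Only noteref elements are wrapped.
        $html .= '<span><b>' . htmlspecialchars($child->textContent) . '</b></span>';
    } else {
        // Text nodes and any other elements: emit their text in place.
        $html .= htmlspecialchars($child->textContent);
    }
}

$html .= '</p>';
echo $html, "\n";
```

Walking the child nodes in document order is what keeps each note number at the position where it occurs in the paragraph.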
As a side note: if what you want to achieve is transform the contents of an XML document into HTML or perhaps some other XML structure, it might pay off to look into XSLT. Be aware though that the learning curve could be steep.

Simple use PHP to get data from XML

I am getting a variable that contains an XML document; I can't edit it or do anything with it.
So what I do:
$xml = $client->get_details('WF0GXXGBBG7P857BB');
$xml = simplexml_load_string($xml);
//print_r($xml);
$vin = $xml->vin;
print_r($vin);
If I uncomment print_r($xml) it prints out the whole XML and works fine (output is http://pastebin.com/w5VVysZU), but if I use the second part with print_r($vin) it displays just SimpleXMLElement Object ( ).
Any idea what I can do? How can I fix this? I've tried about 20 tutorials and always get either no output or an error about using a non-object.
EDIT 1:
I need it to display one specific thing from this XML, in this example the VIN: from this large document I want the script to find [vin] => WF0GXXGBBG7P857BB and echo WF0GXXGBBG7P857BB
EDIT 2:
My XML: http://pastebin.com/1KLB5Ba0
SimpleXML elements can, and have to, be cast to strings if you want to display them that way. Otherwise, as you've seen, it just tells you that it is an object.
$vin = (string) $xml->vin;
// OR
print_r((string) $vin);
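A small sketch of the cast, with one caveat: the real document from the pastebin links is not reproduced here, so the XML below is a hypothetical stand-in in which <vin> is nested one level deeper. In that situation $xml->vin does not match anything and the cast gives an empty string; an xpath lookup with // is one way to find the element regardless of depth.

```php
<?php
// Hypothetical stand-in for the real response; element names other than
// <vin> are assumptions.
$xml = simplexml_load_string(
    '<response><vehicle><vin>WF0GXXGBBG7P857BB</vin></vehicle></response>'
);

// <vin> is not a direct child of the root here, so this is an empty string.
$direct = (string) $xml->vin;

// Search the whole document instead and cast the first match.
$matches = $xml->xpath('//vin');
$vin = $matches ? (string) $matches[0] : '';

echo $vin, "\n";
```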

Create comma separated string via xml values

I'm working on some system for a few hours now and this little thing is too much for me to think logically about at the moment.
Normally I would wait a few hours but this is a last minute job and I need to finish this.
Here's my problem:
I have an XML file that gets posted to my PHP file, the PHP file inserts certain data into a DB, but some XML nodes have the same name:
<accessoires>
<accessoire>value1</accessoire>
<accessoire>value2</accessoire>
<accessoire>value3</accessoire>
</accessoires>
Now I want to get a var $acclist which contains all values separated by a comma:
value1,value2,value3,
I bet the solution to this is very easy but I'm at the known point where even the easiest piece of code becomes a hassle. And googling only comes up with nodes that in some way have their own identifiers.
Could someone help me out please?
You can try simplexml_load_string to parse the XML, then call implode on the node after casting it to an array.
NOTE This code was tested in php 5.4.6 and behaves as expected.
<?php
$xml = '<accessoires>
<accessoire>value1</accessoire>
<accessoire>value2</accessoire>
<accessoire>value3</accessoire>
</accessoires>';
$dat = simplexml_load_string($xml);
echo implode(",",(array)$dat->accessoire);
For 5.3.x I had to change to
$xml = '<accessoires>
<accessoire>value1</accessoire>
<accessoire>value2</accessoire>
<accessoire>value3</accessoire>
</accessoires>';
$dat = simplexml_load_string($xml);
$dat = (array)$dat;
echo implode(",",$dat["accessoire"]);
You do this by taking a library that is able to parse and process XML, for example with SimpleXML:
implode(',', iterator_to_array($accessoires->accessoire, FALSE));
The key part here is to use iterator_to_array() as SimpleXML offers the same-named child-elements here as an iterator. Otherwise $accessoires->accessoire gives you auto-magically only the first element (if any).
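Put together as a complete script, that could look like the following sketch. It uses the XML from the question; the explicit string cast and the $values helper variable are additions for illustration.

```php
<?php
$accessoires = simplexml_load_string(
    '<accessoires>
        <accessoire>value1</accessoire>
        <accessoire>value2</accessoire>
        <accessoire>value3</accessoire>
    </accessoires>'
);

// FALSE as second argument: do not use the (identical) element names as
// array keys, otherwise later elements would overwrite earlier ones.
$nodes = iterator_to_array($accessoires->accessoire, false);

// Turn each SimpleXMLElement into a plain string before joining.
$values = array_map(function ($node) {
    return (string) $node;
}, $nodes);

$acclist = implode(',', $values);
echo $acclist, "\n";
```

If the trailing comma from the question is really wanted, it can be appended to $acclist afterwards.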

How to order a XML return depending on its tags values?

Let's say I have a php page that dynamically loads the return of a method that returns a XML.
The XML is something like this:
<SYSTEM>
<GUY>
<ID>500</ID>
<NAME>Joseph</NAME>
<EMAIL>joseph@mark</EMAIL>
<ERROR />
</GUY>
<GUY>
<ID>510</ID>
<NAME>Richard</NAME>
<EMAIL>richard@gmail.com</EMAIL>
</GUY>
</SYSTEM>
Now my PHP file has a simple "if" that checks for the ERROR tag. If it's detected, then it prints an error.
The result right now is the error being printed BEFORE the correct print (Richard). Both should be printed, but I want to put the errors on the bottom, after the correct results. The error is printed first because it's the first result of the XML. How can I bypass that?
I think it may be simple, but I'm really not getting it.
My PHP verification is something like this (it runs based on the number of GUY tags, so it'll be twice according to my XML above):
$xmlresult = simplexml_load_string($xml);
$error = $xmlresult->xpath("//ERROR");
if($error==true){
    echo "error message here";
} else {
    echo "wee! no errors!";
}
The way I would approach this would be to temporarily store any error results in a list as I walk through the XML file, printing the good ones as I go. Then, once I reach the end of the file, I can walk through the list of errors and print them after all of the good ones.
This sort of simple algorithm should work for pretty much any method of going through the XML, both with a parsing library that gives back nice objects, as well as more brute-force string-based methods.
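As a rough sketch of that idea with SimpleXML, using the XML from the question: the message texts and the $good, $errors and $lines arrays are made up for illustration, and for simplicity both kinds of result are collected and printed at the end rather than printing the good ones immediately.

```php
<?php
$sample = <<<'XML'
<SYSTEM>
  <GUY>
    <ID>500</ID>
    <NAME>Joseph</NAME>
    <EMAIL>joseph@mark</EMAIL>
    <ERROR />
  </GUY>
  <GUY>
    <ID>510</ID>
    <NAME>Richard</NAME>
    <EMAIL>richard@gmail.com</EMAIL>
  </GUY>
</SYSTEM>
XML;

$xmlresult = simplexml_load_string($sample);

$good   = [];
$errors = [];

foreach ($xmlresult->GUY as $guy) {
    // isset() is true for an existing element, even an empty one like <ERROR />.
    if (isset($guy->ERROR)) {
        $errors[] = "error for ID {$guy->ID}";
    } else {
        $good[] = "ok: {$guy->NAME}";
    }
}

// Good results first, deferred errors at the bottom.
$lines = array_merge($good, $errors);
echo implode("\n", $lines), "\n";
```

Checking for ERROR per GUY element, instead of with //ERROR over the whole document, is what makes it possible to tell which entry failed.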
