How to snag highest numbered XML item with PHP? - php

I've basically got an XML file full of product information for use in an ecommerce system. I've been creating a script that converts these XML files into a .CSV with the data structured in a format the ecommerce system can handle (So I don't need to copy/paste columns over every time the vendor provides new XML files). The category of each product is defined like this:
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
web_category3 being the category of the item and 1 and 2 being the parent categories of the product's category. The thing is that some items are nested under 2 categories..or sometimes 5. So I need to figure out a way for PHP to grab the web_category with the highest number after it since that's always going to be the product's category.
Thanks!

@ben's answer is correct, but is a little intense for me. SimpleXMLElement objects are nice because you can easily cast them to an array. So a simpler solution is to cast the element to an array and use max to determine the highest value in the resulting array:
$str = '
<item>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</item>
';
$xml = new SimpleXMLElement($str);
echo max((array)$xml); // outputs: 6
UPDATE
Based on your comment below, let's assume you need to get the max of all the <item> elements that occur in an XML file and not just one (like above). To handle this you could use SimpleXMLElement::xpath to get an array of all the occurrences of <item>, then apply the same casting trick inside a loop over the xpath result:
$str = '
<xml>
<product1>
<item>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</item>
</product1>
<product2>
<item>
<web_category4>17</web_category4>
<web_category5>0</web_category5>
</item>
</product2>
</xml>
';
$xml = new SimpleXMLElement($str);
$allItems = array();
$items = $xml->xpath('//item');
foreach($items as $item) {
$allItems = array_merge($allItems, (array)$item);
}
echo max($allItems); // outputs: 17
UPDATE 2
Okay, last time. If this isn't exactly what you're trying to do, you should at least have enough examples to figure it out from here. Consider:
$str = '
<xml>
<product1>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</product1>
<product2>
<web_category4>17</web_category4>
<web_category5>0</web_category5>
</product2>
<product3>
<web_category6>17</web_category6>
<web_category7>21</web_category7>
</product3>
</xml>
';
$xml = new SimpleXMLElement($str);
// assumes that product node names start with "product"
$products = $xml->xpath("//*[starts-with(name(),'product')]");
foreach ($products as $p) {
$catNames = array_keys((array)$p);
$catNums = preg_replace("/[^\d]/", "", $catNames);
echo $p->getName() . ' - highest category: ' . max($catNums) . "\n";
}
The above code outputs the following:
product1 - highest category: 3
product2 - highest category: 5
product3 - highest category: 7
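If you need the value stored in the highest-numbered element rather than just its number, something along these lines might work. It is only a sketch: highestWebCategory() is a made-up helper name, the XML is a trimmed copy of the sample above, and the array destructuring and nullable return type assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not part of SimpleXML): returns [number, value] for the
// highest-numbered web_categoryN child, or null if the element has none.
function highestWebCategory(SimpleXMLElement $product): ?array
{
    $best = null;
    foreach ($product->children() as $name => $child) {
        // only look at children named web_category followed by digits
        if (preg_match('/^web_category(\d+)$/', $name, $m)) {
            $n = (int) $m[1];
            if ($best === null || $n > $best[0]) {
                $best = [$n, (string) $child];
            }
        }
    }
    return $best;
}

$xml = new SimpleXMLElement('<xml>
<product1><web_category1>3</web_category1><web_category2>1</web_category2><web_category3>6</web_category3></product1>
<product2><web_category4>17</web_category4><web_category5>0</web_category5></product2>
</xml>');

foreach ($xml->children() as $product) {
    [$number, $value] = highestWebCategory($product);
    echo $product->getName(), ": web_category$number = $value\n";
}
```

Comparing the suffix as an integer means web_category10 ranks above web_category9, which a plain string comparison would get wrong.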

assuming your XML is something like this:
<item>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</item>
And you have a SimpleXMLElement object for <item>, this should do it:
$highest_web_category_number = -1;
$value_of_highest_web_category_number = -1;
// children() yields each child element keyed by its tag name
foreach($item->children() as $name => $data) {
if(strpos($name, 'web_category') === 0) {
// the numeric suffix, e.g. 3 for web_category3
$web_category_number = (int) substr($name, strlen('web_category'));
if($web_category_number > $highest_web_category_number) {
$highest_web_category_number = $web_category_number;
$value_of_highest_web_category_number = (string) $data;
}
}
}

Related

Retrieve XML element data with PHP

I'm trying to allow the user to retrieve a specific XML node and print the information. My XML data is as follows:
<people>
<person id="1">
<forename>Jim</forename>
<surname>Morrison</surname>
</person>
<person id="2">
<forename>James</forename>
<surname>Frank</surname>
</person>
</people>
I want to be able to search my XML document with a name or ID, for instance I want to be able to say, 'check that james exists, and if james does, print out his information, otherwise print out an error message'.
I've figured that I first need to include the file with:
$xml=simplexml_load_file("credentials.xml") or die("Error: Cannot create object");
From this point onwards I'm unsure how to proceed. I've looked at SimpleXML and XPath, but I'm unsure how to use them to achieve this. Thanks if you can help.
Let's xpath():
$xml = simplexml_load_string($x); // assume XML in $x
$result = $xml->xpath("/people/person[forename = 'James']")[0];
The above xpath-expression will select any <person> having 'James' as a <forename>; xpath() returns the matches as an array.
With the [0] at the end of line 2, we select only the first entry in that array and store it in $result. This requires PHP >= 5.4.
We could also write to get the same result:
$result = $xml->xpath("/people/person[forename = 'James']");
$result = $result[0];
If there were more than one James in the XML, we would only get the first.
To get all Jameses, keep the whole array returned by xpath() instead of taking [0], as in the first of the two lines above.
Then, let's output the <person> selected by our xpath expression:
echo $result->forename . ' ' . $result->surname .' has id ' . $result['id'] . ".";
In case $result contains several Jameses, do:
foreach ($result as $person) {
echo $person->forename . ' ' . $person->surname .' has id ' . $person['id'] . "." . PHP_EOL;
}
see it working: https://eval.in/222195
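The question also asks for an error message when nobody matches, and it searches for lowercase 'james'. Here is a rough sketch of both, assuming ASCII names and PHP 7 or later. findByForename() is a hypothetical helper, and the translate() call is the usual XPath 1.0 workaround for case-insensitive comparison:

```php
<?php
$xml = simplexml_load_string('<people>
<person id="1"><forename>Jim</forename><surname>Morrison</surname></person>
<person id="2"><forename>James</forename><surname>Frank</surname></person>
</people>');

// Hypothetical helper: case-insensitive forename lookup (ASCII letters only).
// Assumes $name contains no double quote, since XPath 1.0 has no string escaping.
function findByForename(SimpleXMLElement $xml, string $name): array
{
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';
    $query = sprintf(
        '/people/person[translate(forename, "%s", "%s") = "%s"]',
        $upper,
        $lower,
        strtolower($name)
    );
    $result = $xml->xpath($query);
    // normalise "no match" and "query failed" to an empty array
    return $result ?: [];
}

$matches = findByForename($xml, 'james');
if (!$matches) {
    echo "Error: no such person\n";
}
foreach ($matches as $person) {
    echo $person->forename, ' ', $person->surname, ' has id ', $person['id'], "\n";
}
```

Looking someone up by ID works the same way with a predicate on the attribute, for example /people/person[@id="2"].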
You should use the manual.
simplexml_load_file returns a SimpleXMLElement object.
From here you can use the object's children() method to get the children you want, which are themselves SimpleXMLElement objects. Let's say you first want the children called people, and then person. Once you have the person objects, you can use the attributes() method to get the attribute names and values. Because SimpleXMLElement implements Traversable, you can use a foreach loop to iterate over the lists you get in return.
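As a minimal sketch of that description (the XML is inlined here instead of being loaded from credentials.xml, and the $people array is just an illustrative way to collect the data):

```php
<?php
$xml = simplexml_load_string('<people>
<person id="1"><forename>Jim</forename><surname>Morrison</surname></person>
<person id="2"><forename>James</forename><surname>Frank</surname></person>
</people>');

$people = [];
// children() of the root <people> element yields each <person>
foreach ($xml->children() as $person) {
    $row = [];
    // attributes() yields name => value pairs (here only "id")
    foreach ($person->attributes() as $attrName => $attrValue) {
        $row[$attrName] = (string) $attrValue;
    }
    // the nested children() call yields <forename> and <surname>
    foreach ($person->children() as $childName => $child) {
        $row[$childName] = (string) $child;
    }
    $people[] = $row;
}
print_r($people);
```

Casting to string matters here: without it the array would hold SimpleXMLElement objects rather than plain values.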
Figured out how to mostly solve the problem with this:
$xml=simplexml_load_file("credentials.xml") or die("Error: Cannot create object");
foreach($xml->children() as $people){
// compare case-insensitively: the XML has "James", not "james"
if (strcasecmp((string)$people->forename, "james") === 0){echo $people->forename;}
}
This may be inefficient so if anyone else has a better way please do specify :)

Xpath looping query

I have the following xml doc:
<shop id="123" name="xxx">
<product id="123456">
<name>Book</name>
<price>9.99</price
</product>
<product id="789012">
<name>Perfume</name>
<price>12.99</price
</product>
<product id="345678">
<name>T-Shirt</name>
<price>9.99</price
</product>
</shop>
<shop id="456" name="yyy">
<product id="123456">
<name>Book</name>
<price>9.99</price
</product>
</shop>
I have the following loop to gather the information for each product:
$data_feed = 'www.mydomain.com/xml/compression/gzip/';
$xml = simplexml_load_file("compress.zlib://$data_feed");
foreach ($xml->xpath('//product') as $row) {
$id = $row["id"]; // product id eg. "123456"
$name = $row->name;
$price = $row->price;
// update database etc.
}
HOWEVER, I also want to gather the information for each product's parent shop ("id" and "name").
I can easily change my xpath to start from shop as opposed to product, but I'm unsure of the most efficient way to then construct an additional loop within my foreach to loop each indented product
Make sense?
I'd go without xpath and just use two nested foreach-loops:
$xml = simplexml_load_string($x); // assume XML in $x
foreach ($xml->shop as $shop) {
echo "shop $shop[name], id $shop[id] <br />";
foreach ($shop->product as $product) {
echo "- $product->name (id $product[id]), $product->price <br />";
}
}
see it working: http://codepad.viper-7.com/vFmGvY
BTW: your XML is broken, probably a typo. Each closing </price> is missing its last >.
Sure, makes sense: you want one iteration, not nested iterations (although nesting won't cost you much, as @michi already showed). That is possible as well:
foreach ($xml->xpath('//product') as $row)
{
$id = $row["id"]; // product id eg. "123456"
$name = $row->name;
$price = $row->price;
$shopId = $row->xpath('../@id')[0];
$shopName = $row->xpath('../@name')[0];
// update database etc.
}
As this example shows, you can run xpath() on each element node; the context node is automatically set to the node itself, so the relative path .. in XPath reaches the parent element (see also: Access an element's parent with PHP's SimpleXML?). Both attributes are then read from that parent, and PHP 5.4 array dereferencing picks the first (and only) entry of each result.
I hope this helps and sheds some light on how it works. Your question reminds me a bit of an earlier one where I suggested a more generic solution to this kind of problem:
Answer to Combining two Xpaths into one loop?
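For anyone who wants to try the single-loop idea without the remote feed, here is a self-contained sketch. It assumes the two <shop> elements are wrapped in a made-up <shops> root (the snippet in the question has no single root element), uses repaired </price> tags, and collects the data in a $rows array purely for illustration:

```php
<?php
// Assumption: the <shop> elements sit inside a hypothetical <shops> root element.
$xml = simplexml_load_string('<shops>
<shop id="123" name="xxx">
<product id="123456"><name>Book</name><price>9.99</price></product>
<product id="789012"><name>Perfume</name><price>12.99</price></product>
</shop>
<shop id="456" name="yyy">
<product id="123456"><name>Book</name><price>9.99</price></product>
</shop>
</shops>');

$rows = [];
foreach ($xml->xpath('//product') as $product) {
    // ".." selects the parent <shop> of the current product
    $shop = $product->xpath('..')[0];
    $rows[] = [
        'shop_id'   => (string) $shop['id'],
        'shop_name' => (string) $shop['name'],
        'id'        => (string) $product['id'],
        'name'      => (string) $product->name,
        'price'     => (float) $product->price,
    ];
}
print_r($rows);
```

Fetching the parent element once and reading both attributes from it saves one xpath() call per product compared with querying each attribute separately.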

Two dimensional array

I've tried to figure this out by myself before asking, but can't really work it out.
What I have is a loop that reads XML data loaded with simplexml_load_file.
This XML file has data which I want to read and put into an array, a two-dimensional array actually.
Each Item in the XML file has a Tag and an Amount.
The amount is always different, but the Tag is usually the same, though it can change sometimes too.
What I am trying to do now is:
Example:
This is the XML example:
<?xml version="1.0"?>
<Data>
<Items>
<Item Amount="9,21" Tag="tag1"/>
<Item Amount="4,21" Tag="tag1"/>
<Item Amount="6,21" Tag="tag2"/>
<Item Amount="1,21" Tag="tag1"/>
<Item Amount="6,21" Tag="tag2"/>
</Items>
</Data>
Now I have a loop which reads this, sees what tag it is and adds up the amounts.
It works with two loops and two different arrays, and I would like to have it all in one array in a single loop.
I tried something like this:
$tags = array();
for($k = 0; $k < sizeof($tags); $k++)
{
if (strcmp($tags[$k], $child['Tag']) == 0)
{
$foundTAG = true;
break;
}
else
$foundTAG = false;
}
if (!$foundTAG)
{
$tags[] = $child['Tag'];
}
and then somewhere in the code I tried different variations of adding to the array ($counter is what adds the Amounts together):
$tags[$child['Tag']][$k] = $counter;
$tags[$child['Tag']][] = $counter;
$tags[][] = $counter;
I tried a few other combinations, which I already deleted since they didn't work.
OK, this might be a really noob question, but I started with PHP yesterday and have no idea how multidimensional arrays work :)
Thank you
this is how you can iterate over the returned object from simple xml:
$xml=simplexml_load_file("/home/chris/tmp/data.xml");
foreach($xml->Items->Item as $obj){
foreach($obj->Attributes() as $key=>$val){
// php will automatically cast each of these to a string for the echo
echo "$key = $val\n";
}
}
so, to build an array with totals for each tag:
$xml=simplexml_load_file("/home/chris/tmp/data.xml");
$tagarray=array();
// iterate over the xml object
foreach($xml->Items->Item as $obj){
// reset the attr vars.
$tag="";
$amount=0;
// iterate over the attributes setting
// the correct vars as you go
foreach($obj->attributes() as $key=>$val){
if($key=="Tag"){
// if you don't cast this to a
// string php (helpfully) gives you
// a pseudo simplexml_element object
$tag=(string)$val;
}
if($key=="Amount"){
// same as for the string above, but cast to a float;
// the amounts use a decimal comma ("9,21"), which
// (float) would cut off at 9, so swap it for a dot first
$amount=(float)str_replace(",", ".", (string)$val);
}
}
// once all attributes are read and we have both the tag
// and the amount, add the amount to that tag's total
if(strlen($tag) && $amount>0){
if(!isset($tagarray[$tag])){
$tagarray[$tag]=0;
}
$tagarray[$tag]+=$amount;
}
}
print_r($tagarray);
print "\n";
This will break horribly should the schema change or you decide to wear blue socks (xml is extremely colour sensitive). As you can see dealing with the problem child that is xml is tedious - yet another design decision taken in a committee room :-)
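Since the question asks for a two-dimensional array, here is a shorter sketch that reads the attributes directly and groups the amounts per tag. It assumes the XML is well-formed (with </Items> closed before </Data>) and that amounts use a decimal comma; $amountsByTag and $totals are illustrative names:

```php
<?php
$xml = simplexml_load_string('<Data><Items>
<Item Amount="9,21" Tag="tag1"/>
<Item Amount="4,21" Tag="tag1"/>
<Item Amount="6,21" Tag="tag2"/>
<Item Amount="1,21" Tag="tag1"/>
<Item Amount="6,21" Tag="tag2"/>
</Items></Data>');

// first dimension: tag name, second dimension: list of amounts for that tag
$amountsByTag = [];
foreach ($xml->Items->Item as $item) {
    $tag = (string) $item['Tag'];
    // convert the decimal comma to a dot before casting to float
    $amount = (float) str_replace(',', '.', (string) $item['Amount']);
    // PHP creates the inner array on first use of a new tag
    $amountsByTag[$tag][] = $amount;
}

// one total per tag, keyed by tag name
$totals = array_map('array_sum', $amountsByTag);

print_r($amountsByTag);
print_r($totals);
```

Appending with $amountsByTag[$tag][] is what makes a separate "have I seen this tag yet" loop unnecessary.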

How to iterate through an XML element node with dynamic children

I currently have the following XML structure:
<root>
<maininfo>
<node>
<tournament_id>3100423</tournament_id>
<games>
<a_0>
<id>23523636</id>
<type>
<choice_4>
<choice_id>345</choice_id>
<choice_4>
<choice_9>
<choice_id>345</choice_id>
<choice_9>
... etc
</type>
</a_0>
<a_1></a_1>
<a_2></a_2>
...etc
</games>
</info>
</node>
</root>
I can easily get the id of the first node element "a_0" by just doing:
maininfo[0]->a_3130432[0]->games[0]->a_1[0]->id;
My issue is:
How do I automatically iterate (with a foreach) through all a_0, a_1, a_2 and get the values of each of these node elements and all of their children like "345" in <choice_id>345</choice_id>?
The ending numbers of a_0, a_1 + the children of choice_4, choice_9, are dynamically created and there are no logic in the _[number] counting up with +1 for each next element.
As it has been outlined previously on Stackoverflow (for example in Read XML dynamic PHP) and as well generally in the PHP manual (for example in Basic SimpleXML usage), you can iterate over all child elements by using foreach.
For example to go over all a_* elements, it's just
foreach ($xml->maininfo->node->games[0] as $name => $a) {
echo $name, "\n";
}
Output:
a_0
a_1
a_2
You then want to iterate over their ->type children as well. This is possible in pure PHP by putting one foreach inside another:
foreach ($xml->maininfo->node->games[0] as $name => $a) {
echo $name, "\n";
if (!$a->type[0]) {
continue;
}
foreach ($a->type[0] as $name => $choice) {
echo ' +- ', $name, "\n";
}
}
This now outputs:
a_0
+- choice_4
+- choice_9
a_1
a_2
This starts to get a bit complicated. As you can imagine, since XML is famous for its tree structures, you're not the first one to run into this problem. That is why a query language for getting elements from an XML document was invented: XPath.
With XPath you can access XML data as if it were a file system. Since I know that each a_* element is a child of games and each choice_* element a child of type, it's pretty straightforward:
/*/maininfo/node/games/*/type/*
 ^                     ^      ^
 |                     |  choice_*
root                   |
                      a_*
In PHP Simplexml this looks like:
$choices = $xml->xpath('/*/maininfo/node/games/*/type/*');
foreach ($choices as $choice) {
echo $choice->getName(), ': ', $choice->choice_id, "\n";
}
Output:
choice_4: 345
choice_9: 345
As this example shows, the data is now retrieved with a single foreach.
If you also need access to the <a_*> elements, you need multiple foreach loops or your own iteration, but that is a more advanced topic that goes beyond the scope of your question.
I hope this is helpful so far. See also SimpleXMLElement::children(), which likewise gives all children (like ->games[0] in the first example). All example code is also available as a working, interactive online demo.
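One further option, not taken from the answer above, is to keep the single XPath loop and walk back up from each choice to its <a_*> element with a relative path. A rough sketch, using a repaired copy of the XML from the question and an illustrative $found array:

```php
<?php
$xml = simplexml_load_string('<root><maininfo><node>
<tournament_id>3100423</tournament_id>
<games>
<a_0><id>23523636</id><type>
<choice_4><choice_id>345</choice_id></choice_4>
<choice_9><choice_id>345</choice_id></choice_9>
</type></a_0>
<a_1></a_1>
<a_2></a_2>
</games>
</node></maininfo></root>');

$found = [];
foreach ($xml->xpath('/*/maininfo/node/games/*/type/*') as $choice) {
    // "../.." climbs from choice_* to <type> and then to the enclosing a_*
    $game = $choice->xpath('../..')[0];
    $found[] = [
        $game->getName(),
        (string) $game->id,
        $choice->getName(),
        (string) $choice->choice_id,
    ];
    echo $game->getName(), ' (id ', $game->id, ') ',
        $choice->getName(), ': ', $choice->choice_id, "\n";
}
```

Games without any choices (a_1 and a_2 here) never show up in this loop, so it only fits when empty games can be ignored.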
If I understand it well, you can do something like:
for($i = 0; $i < $max; ++$i){
$a = $parentNode->{'a_'.$i};
}
You can do this very easily using SimpleXML :
<?php
$xmlStr = "<?xml version='1.0' standalone='yes'?>
<root>
<maininfo>
<node>
<tournament_id>3100423</tournament_id>
<games>
<a_0>
<id>23523636</id>
<type>
<choice_4>
<choice_id>345</choice_id>
</choice_4>
<choice_9>
<choice_id>345</choice_id>
</choice_9>
</type>
</a_0>
<a_1></a_1>
<a_2></a_2>
</games>
</node>
</maininfo>
</root>";
$xmlRoot = new SimpleXMLElement($xmlStr);
$i = 0;
foreach($xmlRoot->maininfo[0]->node[0]->games[0] as $a_x)
{
echo $i++ . " - " . htmlentities($a_x->asXML()) . "<br/>";
}
?>
I have modified some parts of your XML string to make it syntactically correct. You can view the results at http://phpfiddle.org/main/code/56q-san

XPath not returning values with quotes

<html>
<body>
<channel>
<item>
<link>"http://www.example.com/"
</link>
<title>This is a title
</title>
</item>
<item>
<link>"http://www.example2.com/"
</link>
<title>This a 2nd title
</title>
</item>
</channel>
</body>
</html>
$query = '/html/body/channel/item/title';
$xpath->query($query);
$i = 0;
foreach ( $xpath->query($query) as $key )
{
echo '<p>'.$xpath->query($query) -> item($i) -> nodeValue . '</p><br />';
$i++;
}
I tried the following queries:
$query = '/html/body/channel/item/link';
and
$query = '/html/body/channel/item/link/text()';
I can return <item> and <title> just fine. Just not <link>. Is there something I'm missing?
Your code is broken and does not make sense
1 $query = '/html/body/channel/item/title';
2 $xpath->query($query);
3 $i = 0;
4 foreach ($xpath->query($query) as $key)
5 {
6 echo '<p>'.$xpath->query($query) -> item($i) -> nodeValue . '</p><br />';
7 $i++;
8 }
will query for title elements (2), but since the result isn't assigned, it is superfluous. Then you do a foreach and query again (4). This time you assign each title DOMElement to $key (which is bad naming, in my opinion). Inside the foreach, you run yet another query for title elements (6) and fetch each title element from it by your counter variable (3/6). That is superfluous as well, because you already have that element in $key (4). So you are doing three identical queries where you need just one, and you do a foreach without using its loop variable.
It should be
foreach ($xpath->query('/html/body/channel/item/title') as $titleElement) {
printf('<p>%s</p>', $titleElement->nodeValue);
}
Since you are already using DOM to work with the markup, you could also create the p element with it instead of using string concatenation, e.g.
foreach ($xpath->query('/html/body/channel/item/title') as $titleElement) {
echo $domDocument->saveXml(
$domDocument->createElement('p', $titleElement->nodeValue)
);
}
If you want the link elements, change the XPath accordingly to query for that instead of title. The quotes in the node value have nothing to do with it at all. They will show just fine.
Full working example showing how to combine <title> and <link> elements into <a> elements
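A possible sketch of such a combination follows. It assumes the markup is parsed with loadXML(); with loadHTML() the HTML parser treats <link> as an empty element, which may be why the link text could not be selected in the question. The $anchors array and the quote trimming are illustrative choices, not part of the original answer:

```php
<?php
$markup = '<html><body><channel>
<item><link>"http://www.example.com/"
</link><title>This is a title
</title></item>
<item><link>"http://www.example2.com/"
</link><title>This a 2nd title
</title></item>
</channel></body></html>';

$dom = new DOMDocument();
$dom->loadXML($markup); // assumption: parsed as XML, not with loadHTML()
$xpath = new DOMXPath($dom);

$anchors = [];
foreach ($xpath->query('/html/body/channel/item') as $item) {
    // evaluate() with string() returns the text of the first matching child,
    // relative to the current <item>; trim whitespace and the literal quotes
    $href  = trim($xpath->evaluate('string(link)', $item), " \n\r\t\"");
    $title = trim($xpath->evaluate('string(title)', $item));

    $a = $dom->createElement('a');
    $a->setAttribute('href', $href);
    // createTextNode() escapes special characters such as & for us
    $a->appendChild($dom->createTextNode($title));
    $anchors[] = $dom->saveXML($a);
}
echo implode("\n", $anchors), "\n";
```

Querying the <item> elements and then reading link and title relative to each one keeps the pairs together, which two separate queries would not guarantee.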
