Trying to get a specific value from XML attribute - php

Trying to scrape an IKEA page at the following link:
http://www.ikea.com/it/it/catalog/products/60255550/?type=xml&dataset=prices
I want to scrape the price of the item, but in the xml file the price appears once unformatted and once with the Euro sign next to it. I wish to scrape the priceNormal unformatted value specifically.
<prices>
<normal>
<priceNormal unformatted="44.99">€ 44,99</priceNormal>
<pricePrevious/>
<priceNormalPerUnit/>
<pricePreviousPerUnit/>
</normal>
My code below doesn't echo the price at all, not sure where I'm going wrong :(
$string = 'http://www.ikea.com/it/it/catalog/products/60255550/?type=xml&dataset=prices';
$xml=simplexml_load_file($string) or die("Error: Cannot create object");
//print_r($xml);
echo $xml->product->prices;

You should be able to get the price with
$xml->products->product->items->item->prices->normal->priceNormal
$xml->products->product->items->item->prices->normal->priceNormal->attributes()->unformatted
If, however, you need to iterate over a result set, add a loop at each level where you expect multiple elements:
foreach( $xml->products->product as $product )
{
echo $product->name;
foreach( $product->items->item as $item )
{
echo $item->name;
echo $item->prices->normal->priceNormal;
echo $item->prices->normal->priceNormal->attributes()->unformatted;
}
}
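As a minimal sketch of the attribute lookup above, here is a version that runs against an inline string instead of the live feed. The wrapper element name (ikea-rest) and the sample layout are assumptions based on the path used in this answer; the real document may differ. Array-style access on the element is an alternative to attributes()->unformatted.

```php
<?php
// Hypothetical sample modelled on the snippet in the question;
// the root element name and nesting are assumptions.
$sample = <<<'XML'
<ikea-rest>
  <products>
    <product>
      <items>
        <item>
          <prices>
            <normal>
              <priceNormal unformatted="44.99">€ 44,99</priceNormal>
            </normal>
          </prices>
        </item>
      </items>
    </product>
  </products>
</ikea-rest>
XML;

$xml = simplexml_load_string($sample);
if ($xml === false) {
    exit('Error: cannot parse XML');
}

$priceNode = $xml->products->product->items->item->prices->normal->priceNormal;

// SimpleXML returns objects, so cast before using the values.
$unformatted = (float) (string) $priceNode['unformatted']; // attribute value
$formatted   = (string) $priceNode;                        // element text

echo $unformatted, PHP_EOL;
echo $formatted, PHP_EOL;
```

Casting matters mostly when you compare or do arithmetic with the value; echo alone would convert it to a string for you.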

Try using var_dump() instead of print_r() to look at the value of $xml. It's a bit convoluted, but you'll find the data you're looking for at this location:
$xml->products[0]->product->items[0]->item->prices->normal->priceNormal[0];

Related

Using SimpleXML to extract data from XML page returning empty array

I'm trying to extract the values of all elements named wardtitle from this xml page:
https://democracy.ashfield-dc.gov.uk/mgWebService.asmx/GetCouncillorsByWard
Here's the code I'm currently trying:
$xml = simplexml_load_file("https://democracy.ashfield-dc.gov.uk/mgWebService.asmx/GetCouncillorsByWard");
var_dump($xml->children());
$ward = (string) $xml->wardtitle;
echo $ward;
print_r($xml->xpath("//email"));
The children dump seems to work fine, the $ward variable returns nothing, then the xpath attempt returns the correct number of results, but all empty...
Any help very much appreciated.
According to the XML structure, wardtitle is a child of each ward element, which in turn is a child of wards. So, to echo the wardtitle of the first ward:
echo $xml->wards->ward[0]->wardtitle;
As for emails - everything works fine:
foreach ($xml->xpath("//email") as $email) {
echo $email;
}
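To collect every wardtitle rather than just the first, you can loop over the ward elements. This is only a sketch: the root element name and the ward titles in the sample are made up, and only the wards/ward/wardtitle nesting comes from the answer above.

```php
<?php
// Hypothetical sample; the root name and values are invented for illustration.
$sample = <<<'XML'
<councillorsbyward>
  <wards>
    <ward><wardtitle>Ward A</wardtitle></ward>
    <ward><wardtitle>Ward B</wardtitle></ward>
  </wards>
</councillorsbyward>
XML;

$xml = simplexml_load_string($sample);

$titles = [];
foreach ($xml->wards->ward as $ward) {
    // Cast to string so the array holds plain text, not SimpleXMLElement objects.
    $titles[] = (string) $ward->wardtitle;
}

print_r($titles);
```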

Separating XML products by Category in PHP

I read an XML file in PHP by
$xml=simplexml_load_file("./files/downloaded.xml");
The file contains many products in different categories.
I want to separate them by category.
Here is how I view the loaded file:
print "<pre>";
print_r($xml);
print "</pre>";
I separated the products by the following code
$baby = array();
for($x=0;$x < count($xml->product); $x++)
{
if( preg_match("#Groceries > ([ a-zA-Z0-9]+) >#i",$xml->product[$x]->category,$match) )
{
$match[1] = str_replace(" ", "" , strtolower($match[1]) );
if($match[1] == "baby"){
$baby[] = $xml->product[$x];
}
}
}
The matching products end up in an array named $baby, which I view with the following code:
print "<pre>";
print_r($baby);
print "</pre>";
Now I want to save this as baby.xml and baby.json file but I don't know how to save this.
I tried this code to save these files
$baby_json = simplexml_load_string($baby);
$json = json_encode($baby_json);
file_put_contents("./files/foodcupobard.json",$json);
file_put_contents("./files/foodcupobard.xml",$baby);
But it is not working after separation.
Here is the code which works before separation
$xml=simplexml_load_file("./files/downloaded.xml");
$xml_json = simplexml_load_string($xml);
$json = json_encode($xml_json);
file_put_contents("./files/baby.json",$json);
file_put_contents("./files/baby.xml",$xml);
The reason it does not work after separation is that $baby is an array rather than a SimpleXMLElement object. Can anyone help me save these separated products to baby.xml and baby.json files, or suggest another way to separate the products with PHP?
Any Help would be much appreciated!
Thanks :)
Manipulate the original SimpleXml object instead of creating an array.
Then save as XML with $xml->asXML($filename);
Use xpath to select <product> nodes with a certain <category>. xpath is like SQL for XML:
/products/product[starts-with(category, 'Foo > Bar >')]
Comments:
The expression returns all <product> elements whose <category> starts with "Foo > Bar >".
The square brackets [] enclose a condition.
You could use the contains() function instead of starts-with().
code example:
$products = $xml->xpath("/products/product[starts-with(category, 'Foo > Bar >')]");
However, $products is an array of SimpleXML elements, not a single SimpleXML object, so asXML() cannot be called on it directly.
Solution 1:
select all <product> that are NOT in the desired category
delete those from $xml
save with asXML()
code example:
$products = $xml->xpath("/products/product[not(starts-with(category, 'Foo > Bar >'))]");
foreach ($products as $product)
unset($product[0]);
This is the self-reference-technique to delete a node with unset.
show the manipulated XML:
echo $xml->asXML();
see it working: https://eval.in/512140
Solution 2
Go with the original $products and build a new XML string from it.
$newxmlstr = ""; // initialise first to avoid an undefined-variable notice
foreach ($products as $product)
$newxmlstr .= $product->asXML();
$newxmlstr = "<products>" . $newxmlstr . "</products>";
see it working: https://eval.in/512153
I prefer solution 1: building XML with string functions carries a risk of errors. If the original XML is really large, solution 2 might be faster.
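Putting solution 1 together with the original goal of saving both files, a rough sketch could look like this. The sample data, the 'Groceries > Baby >' prefix, and the temp-directory file names are assumptions for illustration. Note that json_encode() on a SimpleXMLElement is only a rough conversion (attributes and repeated elements may not come out the way you expect), so check the result against your real data.

```php
<?php
// Hypothetical sample data; the category strings mirror the pattern in the question.
$sample = <<<'XML'
<products>
  <product><name>Wipes</name><category>Groceries > Baby > Wipes</category></product>
  <product><name>Tea</name><category>Groceries > Drinks > Tea</category></product>
</products>
XML;

$xml = simplexml_load_string($sample);

// Select every product outside the wanted category and remove it in place.
$unwanted = $xml->xpath("/products/product[not(starts-with(category, 'Groceries > Baby >'))]");
foreach ($unwanted as $product) {
    unset($product[0]); // self-reference technique from solution 1
}

// Placeholder output paths in the system temp directory.
$xmlFile  = sys_get_temp_dir() . '/baby.xml';
$jsonFile = sys_get_temp_dir() . '/baby.json';

// $xml is still a SimpleXMLElement, so asXML() can write it straight to a file.
$xml->asXML($xmlFile);
file_put_contents($jsonFile, json_encode($xml, JSON_PRETTY_PRINT));

echo count($xml->product), " product(s) kept", PHP_EOL;
```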

Splitting an external array to give different titles

I've been trying to figure out how to split the array and show a label next to each value it contains. However, the most I can manage is to add a comma between the numbers and words.
I would like output along the lines of selling "1st variable", price "2nd variable", and so on, but I don't know how to turn this confusing run of characters:
user name and notes 01001000013972583957ecCCany amount-w378- v west
into anything other than this:
0,100,10000,1397258395,7ec,CC,any amount-w378- v west
Also, this is what it looks like in its JSON form:
{"selling":"0","quantity":"100","price":"10000","date":"1397258395","rs_name":"7ec","contact":"CC","notes":"any amount-w378- v west"}
I just want all of that information displayed like that, but I'm not sure how to add the field names that are in the JSON data. I also don't have access to the external site to change anything.
A little background: what I am trying to achieve is a price look-up for a game on my website from an external site. I tried to use an iframe but it was terrible; I would rather just manually display it rather than showing their site from mine - their style and my style clash terribly.
$json = file_get_contents('http://forums.zybez.net/runescape-2007-prices/api/rune+axe');
$obj = json_decode($json,true);
$blah1 = implode( $obj[0]["offers"][1]);
print_r($blah1);
If you know where the value is, you should be able to just grab it and display it.
As a failsafe, you can check that it is present with the is_array() and isset() functions - see the php.net docs on them.
Your print_r should give you useful information. Wrap it in <pre></pre> tags for better readability, or view the page source - it will be easier!
<pre><?php print_r($obj) ?></pre>
This should be your starting point, and from here you will either take the first one of your items or loop through all with
foreach ($obj as $o) { //should be $objects, not $obj
//do whatever with $o, like echo $o['price']
}
Here each offer is rendered as its own table, with one row per field:
$item = json_decode(file_get_contents('http://forums.zybez.net/runescape-2007-prices/api/rune+axe'));
while ($offer = array_shift($item[0]->offers)) {
echo "<table>" . PHP_EOL;
foreach ($offer as $field => $value) {
echo "<tr><th>$field</th><td>$value</td></tr>" . PHP_EOL;
}
echo "</table>" . PHP_EOL;
}
http://codepad.org/C3PQJHqL
Tables in HTML:
http://jsfiddle.net/G5QqZ/
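If a table is more than you need, the same idea works for plain labelled lines. This sketch uses the single offer quoted in the question as an inline string rather than calling the external API; in practice the array would come from $obj[0]['offers'].

```php
<?php
// One offer copied from the question; decoded to an associative array.
$json  = '{"selling":"0","quantity":"100","price":"10000","date":"1397258395","rs_name":"7ec","contact":"CC","notes":"any amount-w378- v west"}';
$offer = json_decode($json, true);

$lines = [];
foreach ($offer as $field => $value) {
    // The key is the label; escape the value because it comes from an external site.
    $lines[] = $field . ': ' . htmlspecialchars($value);
}

echo implode('<br>' . PHP_EOL, $lines), PHP_EOL;
```

Because json_decode(..., true) keeps the field names as array keys, implode() alone loses them, which is why the question's output was just a run of values.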

how to display SimpleXMLElement with php

Hi, I have never used XML but need to now, so I am trying to learn quickly, but I think I'm struggling with the structure. This is just to display the weather at the top of someone's website.
I want to display Melbourne weather using this XML feed: ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml
Basically I am trying to get the Melbourne forecast for 3 days (or whatever works); there is a forecast-period array from [0] to [6].
I used this print_r to view the structure:
$url = "linkhere";
$xml = simplexml_load_file($url);
echo "<pre>";
print_r($xml);
and tried this just to get something:
$url = "linkhere";
$xml = simplexml_load_file($url);
$data = (string) $xml->forecast->area[52]->description;
echo $data;
Which gave me nothing (expected 'Melbourne'), obviously I need to learn and I am but if someone could help that would be great.
Because description is an attribute of <area>, you need to use
$data = (string) $xml->forecast->area[52]['description'];
I also wouldn't rely on Melbourne being the 52nd area node (though this is really up to the data maintainers). I'd go by its aac attribute as this appears to be unique, eg
$search = $xml->xpath('forecast/area[@aac="VIC_PT042"]');
if (count($search)) {
$melbourne = $search[0];
echo $melbourne['description'];
}
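For reference, XPath addresses attributes with an @ prefix. Here is a small self-contained sketch of the lookup; the sample XML (including the root element name and the second area) is invented, and only the aac and description attribute names come from this answer.

```php
<?php
// Hypothetical excerpt; only the attribute names are taken from the answer above.
$sample = <<<'XML'
<product>
  <forecast>
    <area aac="VIC_PT001" description="Elsewhere"/>
    <area aac="VIC_PT042" description="Melbourne"/>
  </forecast>
</product>
XML;

$xml = simplexml_load_string($sample);

// @aac selects the attribute; the path is relative to the root element.
$search = $xml->xpath('forecast/area[@aac="VIC_PT042"]');

// xpath() returns an array of matches, which is empty when nothing matched.
$description = $search ? (string) $search[0]['description'] : null;

echo $description, PHP_EOL;
```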
This is a working example for you:
<?php
$forecastdata = simplexml_load_file('ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml','SimpleXMLElement',LIBXML_NOCDATA);
foreach($forecastdata->forecast->area as $singleregion) {
$area = $singleregion['description'];
$weather = $singleregion->{'forecast-period'}->text;
echo $area.': '.$weather.'<hr />';
}
?>
You can edit the aforementioned example to extract the tags and attributes you want.
Always remember that a good way to understand the structure of your XML object is to print its content using, for instance, print_r.
In the specific case of the XML you proposed, cities are specified through attributes (description). For this reason you have to read also those attributes using ['attribute name'] (see here for more information).
Notice also that forecast-period is written as {'forecast-period'}, in quotes and curly brackets, because the name contains a hyphen; otherwise it would generate an error.
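To get several days for one area, you can loop over the hyphenated element the same way. This is a sketch against made-up data: the index attribute, the text child, and the forecast wording are assumptions about the feed's layout, so adjust them to what print_r shows you.

```php
<?php
// Hypothetical sample; the layout of forecast-period is assumed, not copied from the feed.
$sample = <<<'XML'
<product>
  <forecast>
    <area description="Melbourne">
      <forecast-period index="0"><text>Sunny</text></forecast-period>
      <forecast-period index="1"><text>Showers</text></forecast-period>
    </area>
  </forecast>
</product>
XML;

$xml  = simplexml_load_string($sample);
$area = $xml->forecast->area[0];

$forecasts = [];
// The curly-brace syntax lets the property name contain a hyphen.
foreach ($area->{'forecast-period'} as $period) {
    $forecasts[(int) (string) $period['index']] = (string) $period->text;
}

print_r($forecasts);
```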

Not finding elements using getElementsByTagName() using DOMDocument

I'm trying to loop through multiple <LineItemInfo> products contained within a <LineItems> within XML I'm parsing to pull product Ids out and send emails and do other actions for each product.
The problem is that it's not returning anything. I've verified that the XML data is valid and it does contain the necessary components.
$itemListObject = $orderXML->getElementsByTagName('LineItemInfo');
var_dump($itemListObject->length);
var_dump($itemListObject);
The output of the var_dump is:
int(0)
object(DOMNodeList)#22 (0) {
}
This is my first time messing with this and it's taken me a couple of hours but I can't figure it out. Any advice would be awesome.
EDIT:
My XML looks like this... except with a lot more tags than just ProductId
<LineItems>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
</LineItems>
Executing the following code does NOT get me the ProductId
$itemListObject = $orderXML->getElementsByTagName('LineItemInfo');
foreach ($itemListObject as $element) {
$product = $element->getElementsByTagName('ProductId');
$productId = $product->item(0)->nodeValue;
echo $productId.'-';
}
EDIT #2
As a side note, calling
$element->item(0)->nodeValue
on $element instead of $product caused my script to stop executing without any errors being logged by the server. It's a pain to debug when you have to run a credit card to find out whether it's working or not.
DOMDocument stuff can be tricky to get a handle on, because functions such as print_r() and var_dump() don't necessarily perform the same as they would on normal arrays and objects (see this comment in the manual).
You have to use various functions and properties of the document nodes to pull out the data. For instance, if you had the following XML:
<LineItemInfo attr1="hi">This is a line item.</LineItemInfo>
You could output various parts of that using:
$itemListObjects = $orderXML->getElementsByTagName('LineItemInfo');
foreach($itemListObjects as $node) {
echo $node->nodeValue; //echos "This is a line item."
echo $node->attributes->getNamedItem('attr1')->nodeValue; //echos "hi"
}
If you had a nested structure, you can follow basically the same procedure using the childNodes property. For example, if you had this:
<LineItemInfo attr1="hi">
<LineItem>Line 1</LineItem>
<LineItem>Line 2</LineItem>
</LineItemInfo>
You might do something like this:
$itemListObjects = $orderXML->getElementsByTagName('LineItemInfo');
foreach($itemListObjects as $node) {
if ($node->hasChildNodes()) {
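foreach($node->childNodes as $c) {
//skip the whitespace text nodes that sit between the elements
if ($c->nodeType === XML_ELEMENT_NODE) {
echo $c->nodeValue .",";
}
}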
}
}
//you'll get output of "Line 1,Line 2,"
Hope that helps.
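For the ProductId case specifically, a small sketch that reads both the text and the href attribute. It uses DOMElement::getAttribute() as a shorter alternative to attributes->getNamedItem(); the XML is a cut-down copy of the sample in the question, with its placeholder URL left as-is.

```php
<?php
$orderXML = new DOMDocument();
$orderXML->loadXML(
    "<LineItems>
        <LineItemInfo>
            <ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
        </LineItemInfo>
    </LineItems>"
);

$products = [];
// getElementsByTagName() searches all descendants, so the nesting level does not matter.
foreach ($orderXML->getElementsByTagName('ProductId') as $node) {
    $products[] = [
        'id'   => trim($node->nodeValue),      // element text
        'href' => $node->getAttribute('href'), // attribute value
    ];
}

print_r($products);
```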
EDIT for specific code and XML
I ran the following code in a test script, and it seemed to work for me. Can you be more specific about what's not working? I used your code exactly, apart from the first two statements, which create the document. Are you using loadXML() rather than loadHTML()? Are there any errors?
$orderXML = new DOMDocument();
$orderXML->loadXML("
<LineItems>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
<LineItemInfo>
<ProductId href='[URL_TO_PRODUCT_XML]'>149593</ProductId>
</LineItemInfo>
</LineItems>
");
$itemListObject = $orderXML->getElementsByTagName('LineItemInfo');
foreach ($itemListObject as $element) {
$product = $element->getElementsByTagName('ProductId');
$productId = $product->item(0)->nodeValue;
echo $productId.'-';
}
//outputs "149593-149593-"
XML tag names are case-sensitive, and some documents use lower camel case ("lineItemInfo") rather than "LineItemInfo", so check that the name you pass to getElementsByTagName() matches the casing in your XML exactly.
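A quick way to see the effect of casing is to count the matches for each spelling. This is a minimal sketch with an inline document, not your order XML:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<LineItems><LineItemInfo/><LineItemInfo/></LineItems>');

// Same name, different casing: only the exact spelling matches.
$exact = $doc->getElementsByTagName('LineItemInfo')->length;
$wrong = $doc->getElementsByTagName('lineItemInfo')->length;

echo "exact: $exact, wrong case: $wrong", PHP_EOL;
```

An empty DOMNodeList with length 0, as in the question, is what a casing mismatch looks like, though it is not the only possible cause.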
