PHP: How to store XML data in an array?

Below is the XML I am working with. There are more items; this is the first set. How can I get these elements into an array? I have been trying with PHP's SimpleXML etc., but I just can't do it.
<response xmlns:lf="http://api.lemonfree.com/ns/1.0">
<lf:request_type>listing</lf:request_type>
<lf:response_code>0</lf:response_code>
<lf:result type="listing" count="10">
<lf:item id="56832429">
<lf:attr name="title">Used 2005 Ford Mustang V6 Deluxe</lf:attr>
<lf:attr name="year">2005</lf:attr>
<lf:attr name="make">FORD</lf:attr>
<lf:attr name="model">MUSTANG</lf:attr>
<lf:attr name="vin">1ZVFT80N555169501</lf:attr>
<lf:attr name="price">12987</lf:attr>
<lf:attr name="mileage">42242</lf:attr>
<lf:attr name="auction">no</lf:attr>
<lf:attr name="city">Grand Rapids</lf:attr>
<lf:attr name="state">Michigan</lf:attr>
<lf:attr name="image">http://www.lemonfree.com/images/stock_images/thumbnails/2005_38_557_80.jpg</lf:attr>
<lf:attr name="link">http://www.lemonfree.com/56832429.html</lf:attr>
</lf:item>
<!-- more items -->
</lf:result>
</response>
Thanks guys
EDIT: I want the first item's data in easy-to-access variables. I've been struggling for a couple of days to get SimpleXML to work, as I am new to PHP, so I thought manipulating an array would be easier.

Why do you want them in an array? They are structured already, use them as XML directly.
There is SimpleXML and DOMDocument, now it depends on what you want to do with the data (you failed to mention that) which one serves you better. Expand your question to get code samples.
EDIT: Here is an example of how you could handle your document with SimpleXML:
$url = "http://api.lemonfree.com/listings?key=xxxx&make=ford&model=mustang";
$ns_lf = "http://api.lemonfree.com/ns/1.0";
$response = simplexml_load_file($url);
// children() fetches all nodes of a given namespace
$result = $response->children($ns_lf)->result;
// dump the entire <lf:result> to see what it looks like
print_r($result);
// once the namespace was handled, you can go on normally (-> syntax)
foreach ($result->item as $item) {
$title = $item->xpath("lf:attr[@name='title']");
$state = $item->xpath("lf:attr[@name='state']");
// xpath() always returns an array of matches, hence the [0]
echo( $title[0].", ".$state[0] );
}

Perhaps you should look at SimplePie; it parses XML feeds into an array or an object (well, one of the two :D). I think it works well for namespaces and attributes too.
Some benefits include its open-source license (it's free) and its support community.

SimpleXML is the best way to read/write XML files. Usually it's as easy as using arrays, except in your case, because XML namespaces are involved and they complicate things. Also, the format used to store attributes kind of sucks, so instead of being easy and obvious it's kind of complicated. Here's what you're looking for, so that you can move on to doing something more interesting:
$response = simplexml_load_file($url);
$items = array();
foreach ($response->xpath('//lf:item') as $item)
{
$id = (string) $item['id'];
foreach ($item->xpath('lf:attr') as $attr)
{
$name = (string) $attr['name'];
$items[$id][$name] = (string) $attr;
}
}
You'll have everything you need in the $items array, use print_r() to see what's inside. $url should be the URL of that lemonfree API thing. The code assumes there can't be multiple values for one attribute (e.g. multiple images.)
Good luck.
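To try the loop above without calling the API, it can be run against an inline copy of the XML. This is only a sketch: the $xml string is a trimmed, hand-copied version of the response from the question, and the explicit registerXPathNamespace() calls are a defensive assumption, not something the answer requires.

```php
<?php
// Trimmed copy of the response from the question (assumption: real responses have the same shape).
$xml = <<<XML
<response xmlns:lf="http://api.lemonfree.com/ns/1.0">
  <lf:result type="listing" count="1">
    <lf:item id="56832429">
      <lf:attr name="title">Used 2005 Ford Mustang V6 Deluxe</lf:attr>
      <lf:attr name="year">2005</lf:attr>
      <lf:attr name="state">Michigan</lf:attr>
    </lf:item>
  </lf:result>
</response>
XML;

$ns = 'http://api.lemonfree.com/ns/1.0';
$response = simplexml_load_string($xml);
$response->registerXPathNamespace('lf', $ns);

// Build $items[<item id>][<attr name>] = <attr text>.
$items = array();
foreach ($response->xpath('//lf:item') as $item) {
    $item->registerXPathNamespace('lf', $ns);
    $id = (string) $item['id'];
    foreach ($item->xpath('lf:attr') as $attr) {
        $items[$id][(string) $attr['name']] = (string) $attr;
    }
}

// The first item's data as a plain associative array.
$first = reset($items);
echo $first['title'], ' (', $first['year'], '), ', $first['state'], PHP_EOL;
```

Once the data is in $items, reset($items) gives the first item, and individual fields are read by name, e.g. $first['title'].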

This is the way I parsed XML returned from a client's email address book system into an array so I could use it on a page. It uses the XML Parser extension that ships with PHP.
Here is its documentation: http://www.php.net/manual/en/ref.xml.php
$user_info = YOUR_XML; // placeholder: the XML string to parse
// SETS $xml_array AS ARRAY
$xml_array = array();
// SETS UP XML PARSER AND PARSES $user_info INTO AN ARRAY
$xml_parser = xml_parser_create();
xml_parse_into_struct($xml_parser, $user_info, $values, $index);
xml_parser_free($xml_parser);
foreach($values as $key => $value)
{
// $value['level'] relates to the nesting level of a tag, x will need to be a number
if($value['level']==x)
{
$tag_name = $value['tag'];
// INSERTS DETAILS INTO ARRAY $contact_array SETTING KEY = $tag_name VALUE = value for that tag
$xml_array[strtolower($tag_name)] = $value['value'];
}
}
If you var_dump($values) you should see what level the data you want is on. I think it'll be 4 for the above XML, so you can filter out anything you don't want by changing $value['level']==x to the required level, i.e. $value['level']==4.
This returns $xml_array as an array keyed by tag name. In the XML above every level-4 tag is lf:attr, so to end up with $xml_array['title'] = 'Used 2005 Ford Mustang V6 Deluxe' etc. you would key on $value['attributes']['NAME'] instead of $value['tag'] (the parser upper-cases tag and attribute names by default).
Hope that helps.
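A minimal sketch of that approach applied to the lemonfree XML, assuming the level is 4 and keying on the name attribute. The inline $user_info string is a trimmed copy of the question's XML, and the 'complete' type check is an extra filter added here for illustration.

```php
<?php
// Trimmed copy of the XML from the question.
$user_info = <<<XML
<response xmlns:lf="http://api.lemonfree.com/ns/1.0">
<lf:result type="listing" count="1">
<lf:item id="56832429">
<lf:attr name="title">Used 2005 Ford Mustang V6 Deluxe</lf:attr>
<lf:attr name="make">FORD</lf:attr>
</lf:item>
</lf:result>
</response>
XML;

$xml_parser = xml_parser_create();
xml_parse_into_struct($xml_parser, $user_info, $values, $index);

$xml_array = array();
foreach ($values as $value) {
    // Level 4 is <lf:attr>; 'complete' marks an element with no child elements.
    if ($value['level'] == 4 && $value['type'] == 'complete'
        && isset($value['attributes']['NAME'])) {
        // Case folding is on by default, so the attribute key is NAME, not name.
        $xml_array[$value['attributes']['NAME']] = $value['value'];
    }
}

print_r($xml_array);
```

Note that this parser is not namespace-aware here: lf:attr is treated as one plain tag name, which is why the level and the name attribute do the filtering.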

Related

Searching an XML structure but modifying a node higher in the hierarchy

So as an example here is an MWE XML
<manifest xmlns="http://iuclid6.echa.europa.eu/namespaces/manifest/v1"
xmlns:xlink="http://www.w3.org/1999/xlink">
<general-information>
<title>IUCLID 6 container manifest file</title>
<created>Tue Nov 05 11:04:06 EET 2019</created>
<author>SuperUser</author>
</general-information>
<base-document-uuid>f53d48a9-17ef-48f0-8d0e-76d03007bdfe/f53d48a9-17ef-48f0-8d0e-76d03007bdfe</base-document-uuid>
<contained-documents>
<document id="f53d48a9-17ef-48f0-8d0e-76d03007bdfe/f53d48a9-17ef-48f0-8d0e-76d03007bdfe">
<type>DOSSIER</type>
<name xlink:type="simple"
xlink:href="f53d48a9-17ef-48f0-8d0e-76d03007bdfe_f53d48a9-17ef-48f0-8d0e-76d03007bdfe.i6d"
>Initial submission</name>
<first-modification-date>2019-03-27T06:46:39Z</first-modification-date>
<last-modification-date>2019-03-27T06:46:39Z</last-modification-date>
</document>
</contained-documents>
</manifest>
In this case I want to find an attribute xlink:href and replace the name tag with the contents of the file referred to by the xlink:href - in this case f53d48a9-17ef-48f0-8d0e-76d03007bdfe_f53d48a9-17ef-48f0-8d0e-76d03007bdfe.i6d (which is an XML format file as well).
At the moment I use simplexml to pull it into an object and then the xml2json library to convert it into a recursive array, but walking it using the normal methods doesn't give me a way to modify a parent node.
I'm not sure how to back up the hierarchy. Any suggestions?
So this is where I am right now: xmlToArray (https://github.com/tamlyn/xml2json) delivers an array of arrays with XML attributes brought out into the array too
<?php
include('./xml2json.php');
$arrayData = [];
$xmlOptions = array(
"namespaceRecursive" => "True"
);
function &i6cArray(& $array){
foreach ($array as $key => $value) {
if(is_array($value)){
//recurse the array of arrays
$value = &i6cArray($value);
$array[$key]=$value;
print_r($value);
} elseif ($key == '@xlink:href') {
// we want to replace the element here with the ref'd file contents
// So we should get name.content = file contents
$tempxml = simplexml_load_file($value);
$tempArrayData = xmlToArray($tempxml);
$array['content']=$tempArrayData;
} else {
//do nothing (at least for now)
}
}
return $array;
}
if (file_exists('manifest.xml')) {
$xml = simplexml_load_file('manifest.xml');
$arrayData = xmlToArray($xml,$xmlOptions);
// walk array - we know the initial thing is an array
$arrayData = &i6cArray($arrayData);
//output result
$jsonString = json_encode($arrayData, JSON_PRETTY_PRINT);
file_put_contents('dossier.json', $jsonString);
} else {
exit("Failed to open manifest.");
}
?>
I would have liked to remove the @xlink attributes, but that isn't essential, so for now I am going to insert a 'content' value which will be the referenced XML content.
I would still like to have replaced the entire 'name' key with something.
A few bits of background before we get into the specific solution:
The parts of names before a colon are local aliases for a particular namespace, identified by a URI in an xmlns attribute. They need slightly different handling than non-namespaced names; see this reference question for SimpleXML.
PHP's SimpleXML and DOM extensions both have support for a language called "XPath", which lets you search for elements and attributes based on their parents and/or content.
The DOM is a more complex API than SimpleXML, but has more powerful features, particularly for writing. You can switch between the two using the functions simplexml_import_dom() and dom_import_simplexml().
In this case, we want to find all xlink:href attributes. Looking at the xmlns attributes at the top of the file, we see these are in the http://www.w3.org/1999/xlink namespace. In XPath, you can say "has an attribute" with the syntax [@attributename], so we can use SimpleXML and XPath like this:
$simplexml->registerXPathNamespace('xl', 'http://www.w3.org/1999/xlink');
$elements_with_xlink_hrefs = $simplexml->xpath('//*[@xl:href]');
For each of those, we want the attribute value:
foreach ( $elements_with_xlink_hrefs as $simplexml_element ) {
$filename = (string)$simplexml_element->attributes('http://www.w3.org/1999/xlink')->href;
// ...
We then want to load that file, and inject it into the document; this is easier with the DOM, but there is a complexity of having to "import" the node so that it's "owned by" the right document.
// load the other file
$other_document = new DOMDocument;
$other_document->load($filename);
// switch to DOM and add it in place
$dom_element = dom_import_simplexml($simplexml_element);
$dom_element->appendChild(
$dom_element->ownerDocument->importNode(
$other_document->documentElement,
true // deep import: copy the element's children too, not just the empty element
)
);
We can now tidy up and delete the "xlink" attributes:
$dom_element->removeAttributeNs('http://www.w3.org/1999/xlink', 'href');
$dom_element->removeAttributeNs('http://www.w3.org/1999/xlink', 'type');
Once we're done, we can output the whole thing back as one combined XML document:
} // end of foreach loop
echo $simplexml->asXML();
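The question also asked about replacing the whole name element rather than appending inside it. A hedged sketch of that variant, using DOM and DOMXPath throughout: the $referenced array is a hypothetical in-memory stand-in for the .i6d files, and the manifest is a shortened copy of the one in the question.

```php
<?php
const XLINK_NS = 'http://www.w3.org/1999/xlink';

// Shortened manifest; the id and href values are placeholders.
$manifest = <<<XML
<manifest xmlns="http://iuclid6.echa.europa.eu/namespaces/manifest/v1"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <contained-documents>
    <document id="doc-1">
      <type>DOSSIER</type>
      <name xlink:type="simple" xlink:href="doc-1.i6d">Initial submission</name>
    </document>
  </contained-documents>
</manifest>
XML;

// Hypothetical stand-in for the referenced files (real code would load them from disk).
$referenced = array(
    'doc-1.i6d' => '<dossier><subject>Example</subject></dossier>',
);

$dom = new DOMDocument();
$dom->loadXML($manifest);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('xl', XLINK_NS);

// Copy the matches into an array first, because the tree is modified in the loop.
foreach (iterator_to_array($xpath->query('//*[@xl:href]')) as $element) {
    $href = $element->getAttributeNS(XLINK_NS, 'href');

    $other = new DOMDocument();
    $other->loadXML($referenced[$href]);

    // Deep-import the other document's root, then swap it in for the <name> element.
    $imported = $dom->importNode($other->documentElement, true);
    $element->parentNode->replaceChild($imported, $element);
}

$result = $dom->saveXML();
echo $result;
```

Because the element carrying the xlink attributes is replaced outright, there is no separate attribute clean-up step in this variant.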

Missing Attributes while parsing XML with simplexml_load_string

$xml = simplexml_load_string($value);
$json = json_encode($xml); // convert the XML string to JSON
$array = json_decode($json,TRUE);
Attributes are missing after converting into array.
The <SampleData> value, as you say, is encoded. The simplest way to get this back to 'normal' is to use htmlspecialchars_decode() to convert all the symbols before loading the string into SimpleXML. The code below does this and then outputs various parts of the data as an example of how to display information...
$source = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=biosample&id=367368";
$value = file_get_contents($source);
$value = htmlspecialchars_decode($value);
$xml = simplexml_load_string($value);
// Access the DbBuild value
echo "DbBuild=".(string)$xml->DocumentSummarySet->DbBuild.PHP_EOL;
// The BioSample publication date attribute
echo "BioSample publication date=".(string)$xml->DocumentSummarySet->DocumentSummary->SampleData->BioSample['publication_date'].PHP_EOL;
// List the attributes name and value
foreach ( $xml->DocumentSummarySet->DocumentSummary->SampleData->BioSample->Attributes->Attribute as $attribute ) {
echo (string)$attribute['attribute_name']."=".(string)$attribute.PHP_EOL;
}
Some of the XML access looks long-winded, but it's just a case of accessing the various levels of data in the document. $xml->DocumentSummarySet accesses the <DocumentSummarySet> element under the root element. BioSample['publication_date'] is the publication_date attribute in the <BioSample> element, and so on.
There is a really simple solution to this - delete these two lines of code:
$json = json_encode($xml); // convert the XML string to JSON
$array = json_decode($json,TRUE);
XML, JSON, and PHP arrays all have different rules about what kind of structures can be represented, so converting arbitrarily between them will always end up with edge cases where you're missing data. SimpleXML is, as the name says, designed to be simple to use, so you will be much better off actually using it:
$xml = simplexml_load_string($value);
// Now access your data from $xml; no further conversion is needed
Since you give no further information about what your XML looks like, I can't give any further information about how to process it, but there are extensive examples in the PHP manual, and refer here if there are namespaces (tags or attributes with : in their name).
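To make the data loss concrete, here is a small sketch with a made-up two-element document (not the asker's XML). An element that has both text and attributes comes out of the JSON round trip as a plain string, so its attributes are gone, while SimpleXML still exposes them.

```php
<?php
// Made-up document: <item> has an attribute and text content.
$value = '<root><item id="5">hello</item></root>';
$xml = simplexml_load_string($value);

// The round trip from the question: the id attribute does not survive it.
$array = json_decode(json_encode($xml), true);
var_dump($array);

// Reading straight from SimpleXML keeps both pieces of information.
$id = (string) $xml->item['id'];
$text = (string) $xml->item;
echo "id=$id text=$text", PHP_EOL;
```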

how to display SimpleXMLElement with php

Hi, I have never used XML but need to now, so I am trying to learn quickly, but I think I am struggling with the structure. This is just to display the weather at the top of someone's website.
I want to display Melbourne weather using this XML link: ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml
Basically I am trying to get the Melbourne forecast for 3 days (or whatever, just something that works). There is a forecast-period array, [0] to [6].
I used this print_r to view the structure:
$url = "linkhere";
$xml = simplexml_load_file($url);
echo "<pre>";
print_r($xml);
and tried this just to get something:
$url = "linkhere";
$xml = simplexml_load_file($url);
$data = (string) $xml->forecast->area[52]->description;
echo $data;
This gave me nothing (I expected 'Melbourne'). Obviously I need to learn, and I am, but if someone could help that would be great.
Because description is an attribute of <area>, you need to use
$data = (string) $xml->forecast->area[52]['description'];
I also wouldn't rely on Melbourne being the 52nd area node (though this is really up to the data maintainers). I'd go by its aac attribute as this appears to be unique, eg
$search = $xml->xpath('forecast/area[@aac="VIC_PT042"]');
if (count($search)) {
$melbourne = $search[0];
echo $melbourne['description'];
}
This is a working example for you:
<?php
$forecastdata = simplexml_load_file('ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml','SimpleXMLElement',LIBXML_NOCDATA);
foreach($forecastdata->forecast->area as $singleregion) {
$area = $singleregion['description'];
$weather = $singleregion->{'forecast-period'}->text;
echo $area.': '.$weather.'<hr />';
}
?>
You can edit the example above to extract the tags and attributes you want.
A good way to understand the structure of your XML object is to print out its content using, for instance, print_r.
In the specific case of the XML you proposed, cities are specified through attributes (description). For this reason you also have to read those attributes using ['attribute name'] (see here for more information).
Notice also that the tag {'forecast-period'} is wrapped in curly brackets because it contains a hyphen; otherwise it would generate an error.
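The two answers can be combined: look the area up by its aac attribute, then loop over its forecast-period children. The sketch below runs on a small invented sample; the element and attribute names forecast, area, forecast-period, text, aac and description come from the thread, while the index and type attributes and the forecast texts are assumptions for illustration.

```php
<?php
// Invented sample in the shape discussed above (not real forecast data).
$sample = <<<XML
<product>
  <forecast>
    <area aac="VIC_PT001" description="Aireys Inlet" type="location">
      <forecast-period index="0"><text type="precis">Cloudy.</text></forecast-period>
    </area>
    <area aac="VIC_PT042" description="Melbourne" type="location">
      <forecast-period index="0"><text type="precis">Sunny.</text></forecast-period>
      <forecast-period index="1"><text type="precis">Showers.</text></forecast-period>
    </area>
  </forecast>
</product>
XML;

$xml = simplexml_load_string($sample);

// Find the area by attribute instead of by position.
$search = $xml->xpath('forecast/area[@aac="VIC_PT042"]');

$forecasts = array();
if (count($search)) {
    $melbourne = $search[0];
    // Hyphenated element names need the {'...'} syntax.
    foreach ($melbourne->{'forecast-period'} as $period) {
        $forecasts[] = (string) $period->text;
    }
    echo $melbourne['description'], ': ', implode(' / ', $forecasts), PHP_EOL;
}
```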

PHP Parse XML with Attributes

I have an XML document that I am trying to get some of the values for and don't know how to get to the attributes. An example of the structure and values are below:
<vin_number value="3N1AB51D84L729887">
<common_data>
<engines>
</engines>
</common_data>
<available_vehicle_styles>
<vehicle_style name="SE-R 4dr Sedan" style_id="100285116" complete="Y">
<engines>
<engine brand="" name="ED 2L NA I 4 double overhead cam (DOHC) 16V"></engine>
</engines>
</vehicle_style>
</available_vehicle_styles>
</vin_number>
I am trying to get the engine["name"] attribute (NOT "ENGINES"). I thought the following would work but I get errors (I can't parse past "vehicle_style")
$xml = simplexml_load_file($fileVIN);
foreach($xml->vin_number->available_vehicle_styles->vehicle_style->engines->engine->attributes() as $a => $b) {
echo $b;
}
Assuming your XML is structured in the same way as this example XML, the following two snippets will get the engine name.
The property hierarchy way (split onto multiple lines so you can read it).
$name = (string) $xml->vin_number
->available_vehicle_styles
->vehicle_style
->engines
->engine['name'];
Or the more concise XPath way.
$engines = $xml->xpath('//engines/engine');
$name = (string) $engines[0]['name'];
Unless there are multiple engine names in your XML, there is no need to use a foreach loop at all.
(See both snippets running on a codepad.)
Use the SimpleXMLElement::attributes method to get the attributes:
foreach($xml->available_vehicle_styles->vehicle_style as $b) {
$attrs = $b->attributes();
echo "Name = $attrs->name";
}
Note: I slightly changed the "path" to the element starting from $xml because that's how it loaded the fragment for me.
By this layout, there could be more than one engine per engines block, so you have to explicitly pick the first one. (Assuming you know for sure there's only going to be one.)
$name = $xml->available_vehicle_styles->vehicle_style->engines->engine[0]->attributes()->name;
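The note about the changed path is worth spelling out: when the fragment from the question is loaded on its own, $xml already is the vin_number element, so the path starts at available_vehicle_styles. A small sketch on a trimmed copy of that fragment, collecting every engine name in case there is more than one:

```php
<?php
// Trimmed copy of the fragment from the question.
$fragment = <<<XML
<vin_number value="3N1AB51D84L729887">
  <available_vehicle_styles>
    <vehicle_style name="SE-R 4dr Sedan" style_id="100285116" complete="Y">
      <engines>
        <engine brand="" name="ED 2L NA I 4 double overhead cam (DOHC) 16V"></engine>
      </engines>
    </vehicle_style>
  </available_vehicle_styles>
</vin_number>
XML;

$xml = simplexml_load_string($fragment);

// $xml is the root element itself, so its attributes are read directly.
$root = $xml->getName();
$vin = (string) $xml['value'];

// Collect the name attribute of every engine under every vehicle style.
$names = array();
foreach ($xml->available_vehicle_styles->vehicle_style as $style) {
    foreach ($style->engines->engine as $engine) {
        $names[] = (string) $engine['name'];
    }
}

echo "$root $vin: ", implode(', ', $names), PHP_EOL;
```

If the real file wraps this fragment in another root element, the vin_number step from the first answer is needed again.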

Parsing XML with PHP (simplexml)

Firstly, may I point out that I am a newcomer to all things PHP, so apologies if anything here is unclear, and I'm afraid the more layman the response the better. I've been having real trouble parsing an XML file into PHP to then populate an HTML table for my website. At the moment, I have been able to get the full XML feed into a string which I can then echo and view, and all seems well. I then thought I would be able to use SimpleXML to pick out specific elements and print their content, but have been unable to do this.
The XML feed will be constantly changing (structure remaining the same) and is in compressed format. From various sources I've identified the following commands to get my feed into the right format within a string, although I am still unable to print specific elements. I've tried every combination without any luck and suspect I may be barking up the wrong tree. Could someone please point me in the right direction?!
$file = fopen("compress.zlib://$url", 'r');
$xmlstr = file_get_contents($url);
$xml = new SimpleXMLElement($url,null,true);
foreach($xml as $name) {
echo "{$name->awCat}\r\n";
}
Many, many thanks in advance,
Chris
PS The actual feed
Since no one followed my closevote, I think I can just as well put my own comments as an answer:
First of all, SimpleXml can load URIs directly and it can do so with stream wrappers, so your three calls in the beginning can be shortened to (note that you are not using $file at all)
$merchantProductFeed = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
To get the values you can either use the implicit SimpleXml API and drill down to the wanted elements (like shown multiple times elsewhere on the site):
foreach ($merchantProductFeed->merchant->prod as $prod) {
echo $prod->cat->awCat , PHP_EOL;
}
or you can use an XPath query to get at the wanted elements directly
$xml = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
foreach ($xml->xpath('/merchantProductFeed/merchant/prod/cat/awCat') as $awCat) {
echo $awCat, PHP_EOL;
}
Live Demo
Note that fetching all $awCat elements from the source XML is rather pointless though, because all of them have "Bodycare & Fitness" for value. Of course you can also mix XPath and the implicit API and just fetch the prod elements and then drill down to the various children of them.
Using XPath should be somewhat faster than iterating over the SimpleXmlElement object graph. Though it should be noted that the difference is negligible (read 0.000x vs 0.000y) for your feed. Still, if you plan to do more XML work, it pays off to familiarize yourself with XPath, because it's quite powerful. Think of it as SQL for XML.
For additional examples see
A simple program to CRUD node and node values of xml file and
PHP Manual - SimpleXml Basic Examples
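A self-contained sketch of the approaches described in this answer, run on a tiny invented feed so that no download is needed. The element names follow the ones used in this thread; the ids and the number of products are assumptions.

```php
<?php
// Invented two-product feed in the shape used in this thread.
$feed = <<<XML
<merchantProductFeed>
  <merchant id="2891">
    <prod id="1"><pId>1001</pId><cat><awCatId>100</awCatId><awCat>Bodycare &amp; Fitness</awCat></cat></prod>
    <prod id="2"><pId>1002</pId><cat><awCatId>100</awCatId><awCat>Bodycare &amp; Fitness</awCat></cat></prod>
  </merchant>
</merchantProductFeed>
XML;

$xml = new SimpleXMLElement($feed);

// 1. Implicit API: drill down property by property.
$viaProperties = array();
foreach ($xml->merchant->prod as $prod) {
    $viaProperties[] = (string) $prod->cat->awCat;
}

// 2. XPath: jump straight to the wanted elements.
$viaXPath = array_map('strval', $xml->xpath('/merchantProductFeed/merchant/prod/cat/awCat'));

// 3. Mixed: XPath to find the prod elements, then the implicit API for their children.
$productIds = array();
foreach ($xml->xpath('//prod') as $prod) {
    $productIds[] = (string) $prod->pId;
}

var_dump($viaProperties === $viaXPath);
print_r($productIds);
```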
Try this...
$url = "http://datafeed.api.productserve.com/datafeed/download/apikey/58bc4442611e03a13eca07d83607f851/cid/97,98,142,144,146,129,595,539,147,149,613,626,135,163,168,159,169,161,167,170,137,171,548,174,183,178,179,175,172,623,139,614,189,194,141,205,198,206,203,208,199,204,201,61,62,72,73,71,74,75,76,77,78,79,63,80,82,64,83,84,85,65,86,87,88,90,89,91,67,92,94,33,54,53,57,58,52,603,60,56,66,128,130,133,212,207,209,210,211,68,69,213,216,217,218,219,220,221,223,70,224,225,226,227,228,229,4,5,10,11,537,13,19,15,14,18,6,551,20,21,22,23,24,25,26,7,30,29,32,619,34,8,35,618,40,38,42,43,9,45,46,651,47,49,50,634,230,231,538,235,550,240,239,241,556,245,244,242,521,576,575,577,579,281,283,554,285,555,303,304,286,282,287,288,173,193,637,639,640,642,643,644,641,650,177,379,648,181,645,384,387,646,598,611,391,393,647,395,631,602,570,600,405,187,411,412,413,414,415,416,649,418,419,420,99,100,101,107,110,111,113,114,115,116,118,121,122,127,581,624,123,594,125,421,604,599,422,530,434,532,428,474,475,476,477,423,608,437,438,440,441,442,444,446,447,607,424,451,448,453,449,452,450,425,455,457,459,460,456,458,426,616,463,464,465,466,467,427,625,597,473,469,617,470,429,430,615,483,484,485,487,488,529,596,431,432,489,490,361,633,362,366,367,368,371,369,363,372,373,374,377,375,536,535,364,378,380,381,365,383,385,386,390,392,394,396,397,399,402,404,406,407,540,542,544,546,547,246,558,247,252,559,255,248,256,265,259,632,260,261,262,557,249,266,267,268,269,612,251,277,250,272,270,271,273,561,560,347,348,354,350,352,349,355,356,357,358,359,360,586,590,592,588,591,589,328,629,330,338,493,635,495,507,563,564,567,569,568/mid/2891/columns/merchant_id,merchant_name,aw_product_id,merchant_product_id,product_name,description,category_id,category_name,merchant_category,aw_deep_link,aw_image_url,search_price,delivery_cost,merchant_deep_link,merchant_image_url/format/xml/compression/gzip/";
$zd = gzopen($url, "r");
$data = gzread($zd, 1000000);
gzclose($zd);
if ($data !== false) {
$xml = simplexml_load_string($data);
foreach ($xml->merchant->prod as $pr) {
echo $pr->cat->awCat . "<br>";
}
}
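One caveat with the snippet above: gzread($zd, 1000000) returns at most 1,000,000 uncompressed bytes, so a larger feed would be cut off and fail to parse. A hedged sketch of reading until end of file instead; the temporary file is a hypothetical local stand-in for the remote gzip feed, and its content is invented.

```php
<?php
// Hypothetical local file standing in for the gzip-compressed feed.
$path = tempnam(sys_get_temp_dir(), 'feed');
$out = gzopen($path, 'wb');
gzwrite($out, '<merchantProductFeed><merchant><prod><cat><awCat>Bodycare &amp; Fitness</awCat></cat></prod></merchant></merchantProductFeed>');
gzclose($out);

// Read in chunks until the end of the stream, however large it is.
$zd = gzopen($path, 'rb');
$data = '';
while (!gzeof($zd)) {
    $data .= gzread($zd, 8192);
}
gzclose($zd);
unlink($path);

$xml = simplexml_load_string($data);
if ($xml !== false) {
    foreach ($xml->merchant->prod as $pr) {
        echo $pr->cat->awCat, PHP_EOL;
    }
}
```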
<?php
$xmlstr = file_get_contents("compress.zlib://$url");
$xml = simplexml_load_string($xmlstr);
// you can traverse the xml tree however you want
foreach ($xml->merchant->prod as $line) {
// $line->cat->awCat -> you can use this
}
more information here
Use print_r($xml) to see the structure of the parsed XML feed.
Then it becomes obvious how you would traverse it:
foreach ($xml->merchant->prod as $prod) {
print $prod->pId;
print $prod->text->name;
print $prod->cat->awCat; # <-- which is what you wanted
print $prod->price->buynow;
}
$url = 'your url here';
$f = gzopen ($url, 'r');
$xml = new SimpleXMLElement (fread ($f, 1000000));
foreach($xml->xpath ('//prod') as $name)
{
echo (string) $name->cat->awCatId, "\r\n";
}
