PHP Parse XML with Attributes - php

I have an XML document that I am trying to get some of the values from, and I don't know how to get to the attributes. An example of the structure and values is below:
<vin_number value="3N1AB51D84L729887">
<common_data>
<engines>
</engines>
</common_data>
<available_vehicle_styles>
<vehicle_style name="SE-R 4dr Sedan" style_id="100285116" complete="Y">
<engines>
<engine brand="" name="ED 2L NA I 4 double overhead cam (DOHC) 16V"></engine>
</engines>
</vehicle_style>
</available_vehicle_styles>
</vin_number>
I am trying to get the engine["name"] attribute (NOT "ENGINES"). I thought the following would work, but I get errors (I can't parse past "vehicle_style"):
$xml = simplexml_load_file($fileVIN);
foreach($xml->vin_number->available_vehicle_styles->vehicle_style->engines->engine->attributes() as $a => $b) {
echo $b;
}

Assuming your XML is structured in the same way as this example XML, the following two snippets will get the engine name.
The property hierarchy way (split onto multiple lines so you can read it).
$name = (string) $xml->vin_number
->available_vehicle_styles
->vehicle_style
->engines
->engine['name'];
Or the more concise XPath way.
$engines = $xml->xpath('//engines/engine');
$name = (string) $engines[0]['name'];
Unless there are multiple engine names in your XML, there is no need to use a foreach loop at all.
(See both snippets running on a codepad.)

Use the SimpleXMLElement::attributes method to get the attributes:
foreach($xml->available_vehicle_styles->vehicle_style as $b) {
$attrs = $b->attributes();
echo "Name = $attrs->name";
}
Note: I slightly changed the "path" to the element starting from $xml because that's how it loaded the fragment for me.

By this layout, there could be more than one engine per engines block, so you have to explicitly pick the first one. (Assuming you know for sure there's only going to be one.)
$name = $xml->available_vehicle_styles->vehicle_style->engines->engine[0]->attributes()->name;
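The point about where the path starts can be checked without a file. The following is a minimal sketch, assuming the sample XML from the question is loaded from a string (the variable names $source, $vin and $engineName are illustrative only). Because simplexml_load_string() returns the root element itself, $xml already stands for <vin_number>, and the path begins at its children.

```php
<?php
// Sample document from the question, trimmed to the relevant nodes.
$source = <<<XML
<vin_number value="3N1AB51D84L729887">
  <available_vehicle_styles>
    <vehicle_style name="SE-R 4dr Sedan" style_id="100285116" complete="Y">
      <engines>
        <engine brand="" name="ED 2L NA I 4 double overhead cam (DOHC) 16V"></engine>
      </engines>
    </vehicle_style>
  </available_vehicle_styles>
</vin_number>
XML;

$xml = simplexml_load_string($source);

// $xml is the <vin_number> root, so its own attributes are read directly.
$vin = (string) $xml['value'];

// Array-style access reads an attribute; [0] picks the first <engine>.
$engineName = (string) $xml->available_vehicle_styles
    ->vehicle_style
    ->engines
    ->engine[0]['name'];

echo $vin, ': ', $engineName, PHP_EOL;
```

If several vehicle_style or engine elements can occur, a foreach over the repeated element is the safer choice.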


Searching an XML structure but modifying a node higher in the hierarchy

So as an example here is an MWE XML
<manifest xmlns="http://iuclid6.echa.europa.eu/namespaces/manifest/v1"
xmlns:xlink="http://www.w3.org/1999/xlink">
<general-information>
<title>IUCLID 6 container manifest file</title>
<created>Tue Nov 05 11:04:06 EET 2019</created>
<author>SuperUser</author>
</general-information>
<base-document-uuid>f53d48a9-17ef-48f0-8d0e-76d03007bdfe/f53d48a9-17ef-48f0-8d0e-76d03007bdfe</base-document-uuid>
<contained-documents>
<document id="f53d48a9-17ef-48f0-8d0e-76d03007bdfe/f53d48a9-17ef-48f0-8d0e-76d03007bdfe">
<type>DOSSIER</type>
<name xlink:type="simple"
xlink:href="f53d48a9-17ef-48f0-8d0e-76d03007bdfe_f53d48a9-17ef-48f0-8d0e-76d03007bdfe.i6d"
>Initial submission</name>
<first-modification-date>2019-03-27T06:46:39Z</first-modification-date>
<last-modification-date>2019-03-27T06:46:39Z</last-modification-date>
</document>
</contained-documents>
</manifest>
In this case I want to find an attribute xlink:href and replace the name tag with the contents of the file referred to by the xlink:href - in this case f53d48a9-17ef-48f0-8d0e-76d03007bdfe_f53d48a9-17ef-48f0-8d0e-76d03007bdfe.i6d (which is an XML format file as well).
At the moment I use simplexml to pull it into an object and then the xml2json library to convert it into a recursive array, but walking it using the normal methods doesn't give me a way to modify a parent node.
I'm not sure how to back up the hierarchy - any suggestions??
So this is where I am right now - xml2array (https://github.com/tamlyn/xml2json) delivers an array of arrays with XML attributes brought out into the array too
<?php
include('./xml2json.php');
$arrayData = [];
$xmlOptions = array(
"namespaceRecursive" => "True"
);
function &i6cArray(& $array){
foreach ($array as $key => $value) {
if(is_array($value)){
//recurse the array of arrays
$value = &i6cArray($value);
$array[$key]=$value;
print_r($value);
} elseif ($key == '@xlink:href') {
// we want to replace the element here with the ref'd file contents
// So we should get name.content = file contents
$tempxml = simplexml_load_file($value);
$tempArrayData = xmlToArray($tempxml);
$array['content']=$tempArrayData;
} else {
//do nothing (at least for now)
}
}
return $array;
}
if (file_exists('manifest.xml')) {
$xml = simplexml_load_file('manifest.xml');
$arrayData = xmlToArray($xml,$xmlOptions);
// walk array - we know the initial thing is an array
$arrayData = &i6cArray($arrayData);
//output result
$jsonString = json_encode($arrayData, JSON_PRETTY_PRINT);
file_put_contents('dossier.json', $jsonString);
} else {
exit("Failed to open manifest.");
}
?>
I would have liked to remove the @xlink attributes, but since leaving them in does no harm, I am going to insert a 'content' value holding the referenced XML content instead.
I would still like to have replaced the entire 'name' key with something.
A few bits of background before we get into the specific solution:
The parts of names before a colon are local aliases for a particular namespace, identified by a URI in an xmlns attribute. They need slightly different handling than non-namespaced names; see this reference question for SimpleXML.
PHP's SimpleXML and DOM extensions both have support for a language called "XPath", which lets you search for elements and attributes based on their parents and/or content.
The DOM is a more complex API than SimpleXML, but has more powerful features, particularly for writing. You can switch between the two using the functions simplexml_import_dom() and dom_import_simplexml().
In this case, we want to find all xlink:href attributes. Looking at the xmlns attributes at the top of the file, we see these are in the http://www.w3.org/1999/xlink namespace. In XPath, you can say "has an attribute" with the syntax [@attributename], so we can use SimpleXML and XPath like this:
$simplexml->registerXPathNamespace('xl', 'http://www.w3.org/1999/xlink');
$elements_with_xlink_hrefs = $simplexml->xpath('//*[@xl:href]');
For each of those, we want the attribute value:
foreach ( $elements_with_xlink_hrefs as $simplexml_element ) {
$filename = (string)$simplexml_element->attributes('http://www.w3.org/1999/xlink')->href;
// ...
We then want to load that file, and inject it into the document; this is easier with the DOM, but there is a complexity of having to "import" the node so that it's "owned by" the right document.
// load the other file
$other_document = new DOMDocument;
$other_document->load($filename);
// switch to DOM and add it in place
$dom_element = dom_import_simplexml($simplexml_element);
$dom_element->appendChild(
$dom_element->ownerDocument->importNode(
$other_document->documentElement,
true // deep copy, so the element's children are imported too
)
);
We can now tidy up and delete the "xlink" attributes:
$dom_element->removeAttributeNS('http://www.w3.org/1999/xlink', 'href');
$dom_element->removeAttributeNS('http://www.w3.org/1999/xlink', 'type');
Once we're done, we can output the whole thing back as one combined XML document:
} // end of foreach loop
echo $simplexml->asXML();
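To see these steps working together without any files on disk, here is a minimal self-contained sketch. It assumes a trimmed copy of the manifest, and it replaces the referenced .i6d file with a small in-memory document (the <dossier> content and the doc-1 names are invented for illustration, not part of the real format).

```php
<?php
$xlinkNs = 'http://www.w3.org/1999/xlink';

// Trimmed manifest; "doc-1.i6d" is a made-up file name.
$manifest = <<<XML
<manifest xmlns="http://iuclid6.echa.europa.eu/namespaces/manifest/v1"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <contained-documents>
    <document id="doc-1">
      <type>DOSSIER</type>
      <name xlink:type="simple" xlink:href="doc-1.i6d">Initial submission</name>
    </document>
  </contained-documents>
</manifest>
XML;

$simplexml = simplexml_load_string($manifest);
$simplexml->registerXPathNamespace('xl', $xlinkNs);

$hrefs = [];
foreach ($simplexml->xpath('//*[@xl:href]') as $element) {
    // Namespaced attributes are read through attributes() with the namespace URI.
    $hrefs[] = (string) $element->attributes($xlinkNs)->href;

    // Stand-in for the referenced file; real code would call $other->load($filename).
    $other = new DOMDocument();
    $other->loadXML('<dossier><subject>demo</subject></dossier>');

    // Switch to DOM, import a deep copy of the other root, and append it.
    $domElement = dom_import_simplexml($element);
    $domElement->appendChild(
        $domElement->ownerDocument->importNode($other->documentElement, true)
    );

    // Tidy up the xlink attributes now that the content is inlined.
    $domElement->removeAttributeNS($xlinkNs, 'href');
    $domElement->removeAttributeNS($xlinkNs, 'type');
}

echo $simplexml->asXML();
```

Because SimpleXML and DOM objects share the same underlying document, the changes made through $domElement show up when $simplexml is serialised.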

how to display SimpleXMLElement with php

Hi, I have never used XML but need to now, so I am trying to learn quickly, but I think I am struggling with the structure. This is just to display the weather at the top of someone's website.
I want to display Melbourne weather using this xml link ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml
Basically I am trying to get the Melbourne forecast for 3 days (whatever, just something that works); there is a forecast-period array [0] to [6]
I used this print_r to view the structure:
$url = "linkhere";
$xml = simplexml_load_file($url);
echo "<pre>";
print_r($xml);
and tried this just to get something:
$url = "linkhere";
$xml = simplexml_load_file($url);
$data = (string) $xml->forecast->area[52]->description;
echo $data;
Which gave me nothing (expected 'Melbourne'), obviously I need to learn and I am but if someone could help that would be great.
Because description is an attribute of <area>, you need to use
$data = (string) $xml->forecast->area[52]['description'];
I also wouldn't rely on Melbourne being the 52nd area node (though this is really up to the data maintainers). I'd go by its aac attribute as this appears to be unique, eg
$search = $xml->xpath('forecast/area[@aac="VIC_PT042"]');
if (count($search)) {
$melbourne = $search[0];
echo $melbourne['description'];
}
This is a working example for you:
<?php
$forecastdata = simplexml_load_file('ftp://ftp2.bom.gov.au/anon/gen/fwo/IDV10753.xml','SimpleXMLElement',LIBXML_NOCDATA);
foreach($forecastdata->forecast->area as $singleregion) {
$area = $singleregion['description'];
$weather = $singleregion->{'forecast-period'}->text;
echo $area.': '.$weather.'<hr />';
}
?>
You can edit the aforementioned example to extract the tags and attributes you want.
Always remember that a good way to understand the structure of your XML object is to print out its content using, for instance, print_r.
In the specific case of the XML you proposed, cities are specified through attributes (description). For this reason you also have to read those attributes using ['attribute name'] (see here for more information).
Notice also that the tag {'forecast-period'} is wrapped in curly brackets because it contains a hyphen; otherwise it would generate an error.
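The three points above (attributes via array access, lookup by aac, curly brackets for hyphenated tags) can be tried offline. This is a sketch only: the sample XML is a hand-written fragment that assumes the feed's general shape, and the forecast texts are invented.

```php
<?php
// Hand-written stand-in for the BOM feed; forecast texts are made up.
$sample = <<<XML
<product>
  <forecast>
    <area aac="VIC_PT042" description="Melbourne" type="location">
      <forecast-period index="0">
        <text type="precis">Mostly sunny.</text>
      </forecast-period>
      <forecast-period index="1">
        <text type="precis">Showers.</text>
      </forecast-period>
    </area>
  </forecast>
</product>
XML;

$forecastdata = simplexml_load_string($sample);

// Select the area by its aac attribute rather than by position.
$search = $forecastdata->xpath('forecast/area[@aac="VIC_PT042"]');

$lines = [];
if (count($search) > 0) {
    $melbourne = $search[0];
    // Hyphenated element names need the {'...'} property syntax.
    foreach ($melbourne->{'forecast-period'} as $period) {
        $lines[] = $melbourne['description'] . ' day ' . $period['index']
            . ': ' . $period->text;
    }
}

echo implode(PHP_EOL, $lines), PHP_EOL;
```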

How can I sort on XML child node value with PHP SimpleDOM? (or any other method)

I need to sort the following XML (foreach ProgramList) based on the value of its child MajorDescription
<ArrayOfProgramList xmlns="http://schemas.datacontract.org/2004/07/Taca.Resources" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<ProgramList>
<AreaOfInterests xmlns:a="http://schemas.datacontract.org/2004/07/Taca">
<a:AreaOfInterest>
<a:Interest>ABORIGINAL STUDIES</a:Interest>
</a:AreaOfInterest>
</AreaOfInterests>
<Coop>true</Coop>
<MajorDescription>ABORIGINAL COMMUNITY AND SOCIAL DEVELOPMENT</MajorDescription>
<Program>ACSD</Program>
<ProgramLocations>
<ProgramLocation>
<Campus>Barrie</Campus>
</ProgramLocation>
</ProgramLocations>
<Term>201210</Term>
</ProgramList>
<ProgramList>
<AreaOfInterests xmlns:a="http://schemas.datacontract.org/2004/07/Taca">
<a:AreaOfInterest>
<a:Interest>GRADUATE CERTIFICATE STUDIES</a:Interest>
</a:AreaOfInterest>
<a:AreaOfInterest>
<a:Interest>HEALTH AND WELLNESS STUDIES</a:Interest>
</a:AreaOfInterest>
</AreaOfInterests>
<Coop>false</Coop>
<MajorDescription>ADVANCED CARE PARAMEDIC</MajorDescription>
<Program>PARM</Program>
<ProgramLocations>
<ProgramLocation>
<Campus>Barrie</Campus>
</ProgramLocation>
</ProgramLocations>
<Term>201210</Term>
</ProgramList>
</ArrayOfProgramList>
I'm trying to do it with SimpleDOM, as I've read on other SO questions that that's the easiest way to sort XML.
I've tried using:
foreach($listxml->sortedXPath('//ArrayOfProgramList/ProgramList','//ArrayOfProgramList/ProgramList/MajorDescription') as $program){ ... }
and various other similar 'sort' values such as '@MajorDescription', '/MajorDescription' and '.' as suggested here: How does one use SimpleDOM sortedXPath to sort on node value? But everything returns an empty array when I check it with var_dump().
I think the problem is that I need to sort on the value of a child node - is this possible? The foreach needs to be on ProgramList as I need to output the values of all the child nodes within ProgramList on each iteration.
Any suggestions? I don't have to use SimpleDOM, I'm open to any method that works - currently I'm iterating through an array of A-Z, and for each letter, iterating the ProgramList, comparing the first letter of MajorDescription to the current letter and outputting if it matches - this is obviously not ideal and only sorts the first letter...
You can try to put all the ProgramList elements into an array and sort it according to a custom function. The code should look like this:
function cmp($a, $b)
{
return strcmp((string) $a->MajorDescription[0], (string) $b->MajorDescription[0]);
}
$arr = $listxml->xpath("//ArrayOfProgramList/ProgramList");
usort($arr,"cmp");
There are two problems with your original code. The first is that your XML uses a default namespace, and by design, XPath doesn't support default namespaces, so you have to look for namespaced nodes (e.g. //foo:bar, not //bar) to find them. If you cannot register a prefix for this namespace (for example, if you cannot modify the source XML) you can match namespaced nodes using the wildcard //* combined with a predicate that matches the node's namespace and/or local name.
$nsPredicate = '[namespace-uri() = "http://schemas.datacontract.org/2004/07/Taca.Resources"]';
$query = '//*[local-name() = "ArrayOfProgramList"]' . $nsPredicate
. '/*[local-name() = "ProgramList"]' . $nsPredicate;
$orderBy = '*[local-name() = "MajorDescription"]' . $nsPredicate;
foreach ($listxml->sortedXPath($query, $orderBy) as $program)
{
echo $program->asXML(),"\n";
}
The other problem is with your sort criterion. It should be written from the target node's context.
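For readers who don't need SimpleDOM, the same result can be had with plain SimpleXML and usort(). This is a sketch under assumptions: the sample is a trimmed, deliberately unsorted copy of the question's XML, and the prefix 't' is an arbitrary local alias registered for the default namespace.

```php
<?php
// Trimmed sample, with the two programs in reverse alphabetical order.
$source = <<<XML
<ArrayOfProgramList xmlns="http://schemas.datacontract.org/2004/07/Taca.Resources">
  <ProgramList>
    <MajorDescription>ADVANCED CARE PARAMEDIC</MajorDescription>
    <Program>PARM</Program>
  </ProgramList>
  <ProgramList>
    <MajorDescription>ABORIGINAL COMMUNITY AND SOCIAL DEVELOPMENT</MajorDescription>
    <Program>ACSD</Program>
  </ProgramList>
</ArrayOfProgramList>
XML;

$listxml = simplexml_load_string($source);

// XPath needs an explicit prefix for the document's default namespace.
$listxml->registerXPathNamespace('t', 'http://schemas.datacontract.org/2004/07/Taca.Resources');
$programs = $listxml->xpath('/t:ArrayOfProgramList/t:ProgramList');

// xpath() returns a plain array, so usort() can reorder it by child value.
usort($programs, function ($a, $b) {
    return strcmp((string) $a->MajorDescription, (string) $b->MajorDescription);
});

$sorted = [];
foreach ($programs as $program) {
    // Every child of ProgramList is still available on each iteration.
    $sorted[] = (string) $program->Program;
}

echo implode(', ', $sorted), PHP_EOL;
```

Note that this sorts the array of element objects, not the document itself; the XML stays in its original order.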

Parsing XML with PHP (simplexml)

Firstly, may I point out that I am a newcomer to all things PHP so apologies if anything here is unclear and I'm afraid the more layman the response the better. I've been having real trouble parsing an xml file in to php to then populate an HTML table for my website. At the moment, I have been able to get the full xml feed in to a string which I can then echo and view and all seems well. I then thought I would be able to use simplexml to pick out specific elements and print their content but have been unable to do this.
The xml feed will be constantly changing (structure remaining the same) and is in compressed format. From various sources I've identified the following commands to get my feed in to the right format within a string although I am still unable to print specific elements. I've tried every combination without any luck and suspect I may be barking up the wrong tree. Could someone please point me in the right direction?!
$file = fopen("compress.zlib://$url", 'r');
$xmlstr = file_get_contents($url);
$xml = new SimpleXMLElement($url,null,true);
foreach($xml as $name) {
echo "{$name->awCat}\r\n";
}
Many, many thanks in advance,
Chris
PS The actual feed
Since no one followed my closevote, I think I can just as well put my own comments as an answer:
First of all, SimpleXml can load URIs directly and it can do so with stream wrappers, so your three calls in the beginning can be shortened to (note that you are not using $file at all)
$merchantProductFeed = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
To get the values you can either use the implicit SimpleXml API and drill down to the wanted elements (like shown multiple times elsewhere on the site):
foreach ($merchantProductFeed->merchant->prod as $prod) {
echo $prod->cat->awCat , PHP_EOL;
}
or you can use an XPath query to get at the wanted elements directly
$xml = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
foreach ($xml->xpath('/merchantProductFeed/merchant/prod/cat/awCat') as $awCat) {
echo $awCat, PHP_EOL;
}
Live Demo
Note that fetching all $awCat elements from the source XML is rather pointless though, because all of them have "Bodycare & Fitness" for value. Of course you can also mix XPath and the implicit API and just fetch the prod elements and then drill down to the various children of them.
Using XPath should be somewhat faster than iterating over the SimpleXmlElement object graph. Though it should be noted that the difference is negligible (read 0.000x vs 0.000y) for your feed. Still, if you plan to do more XML work, it pays off to familiarize yourself with XPath, because it's quite powerful. Think of it as SQL for XML.
For additional examples see
A simple program to CRUD node and node values of xml file and
PHP Manual - SimpleXml Basic Examples
Try this...
$url = "http://datafeed.api.productserve.com/datafeed/download/apikey/58bc4442611e03a13eca07d83607f851/cid/97,98,142,144,146,129,595,539,147,149,613,626,135,163,168,159,169,161,167,170,137,171,548,174,183,178,179,175,172,623,139,614,189,194,141,205,198,206,203,208,199,204,201,61,62,72,73,71,74,75,76,77,78,79,63,80,82,64,83,84,85,65,86,87,88,90,89,91,67,92,94,33,54,53,57,58,52,603,60,56,66,128,130,133,212,207,209,210,211,68,69,213,216,217,218,219,220,221,223,70,224,225,226,227,228,229,4,5,10,11,537,13,19,15,14,18,6,551,20,21,22,23,24,25,26,7,30,29,32,619,34,8,35,618,40,38,42,43,9,45,46,651,47,49,50,634,230,231,538,235,550,240,239,241,556,245,244,242,521,576,575,577,579,281,283,554,285,555,303,304,286,282,287,288,173,193,637,639,640,642,643,644,641,650,177,379,648,181,645,384,387,646,598,611,391,393,647,395,631,602,570,600,405,187,411,412,413,414,415,416,649,418,419,420,99,100,101,107,110,111,113,114,115,116,118,121,122,127,581,624,123,594,125,421,604,599,422,530,434,532,428,474,475,476,477,423,608,437,438,440,441,442,444,446,447,607,424,451,448,453,449,452,450,425,455,457,459,460,456,458,426,616,463,464,465,466,467,427,625,597,473,469,617,470,429,430,615,483,484,485,487,488,529,596,431,432,489,490,361,633,362,366,367,368,371,369,363,372,373,374,377,375,536,535,364,378,380,381,365,383,385,386,390,392,394,396,397,399,402,404,406,407,540,542,544,546,547,246,558,247,252,559,255,248,256,265,259,632,260,261,262,557,249,266,267,268,269,612,251,277,250,272,270,271,273,561,560,347,348,354,350,352,349,355,356,357,358,359,360,586,590,592,588,591,589,328,629,330,338,493,635,495,507,563,564,567,569,568/mid/2891/columns/merchant_id,merchant_name,aw_product_id,merchant_product_id,product_name,description,category_id,category_name,merchant_category,aw_deep_link,aw_image_url,search_price,delivery_cost,merchant_deep_link,merchant_image_url/format/xml/compression/gzip/";
$zd = gzopen($url, "r");
$data = gzread($zd, 1000000);
gzclose($zd);
if ($data !== false) {
$xml = simplexml_load_string($data);
foreach ($xml->merchant->prod as $pr) {
echo $pr->cat->awCat . "<br>";
}
}
<?php
$xmlstr = file_get_contents("compress.zlib://$url");
$xml = simplexml_load_string($xmlstr);
// you can traverse the xml tree however you want
foreach ($xml->merchant->prod as $line) {
// $line->cat->awCat -> you can use this
}
more information here
Use print_r($xml) to see the structure of the parsed XML feed.
Then it becomes obvious how you would traverse it:
foreach ($xml->merchant->prod as $prod) {
print $prod->pId;
print $prod->text->name;
print $prod->cat->awCat; # <-- which is what you wanted
print $prod->price->buynow;
}
$url = 'your URL here';
$f = gzopen ($url, 'r');
$xml = new SimpleXMLElement (fread ($f, 1000000));
foreach($xml->xpath ('//prod') as $name)
{
echo (string) $name->cat->awCatId, "\r\n";
}
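The compress.zlib:// approach from the answers above can be tried without network access. The sketch below is an assumption-laden stand-in: it writes a tiny gzip-compressed feed to a temporary file (the feed content is hand-written, following the merchantProductFeed/merchant/prod/cat/awCat layout mentioned above) and reads it back through the stream wrapper. It requires PHP's zlib extension.

```php
<?php
// Hand-written miniature feed; only the element names come from the thread.
$feed = <<<XML
<merchantProductFeed>
  <merchant>
    <prod><cat><awCat>Bodycare &amp; Fitness</awCat></cat></prod>
    <prod><cat><awCat>Bodycare &amp; Fitness</awCat></cat></prod>
  </merchant>
</merchantProductFeed>
XML;

// Simulate the remote gzip download with a local temporary file.
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($path, gzencode($feed));

// The compress.zlib:// wrapper decompresses transparently while reading.
$xmlstr = file_get_contents("compress.zlib://$path");
unlink($path);

$xml = simplexml_load_string($xmlstr);

$categories = [];
foreach ($xml->merchant->prod as $prod) {
    $categories[] = (string) $prod->cat->awCat;
}

echo implode(PHP_EOL, $categories), PHP_EOL;
```

Reading the whole stream this way avoids the fixed 1000000-byte limit used in the gzread() and fread() variants, which would truncate a larger feed.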

PHP: How to store XML data in an array?

Below is the XML I am working with - there are more items - this is the first set. How can I get these elements into an array? I have been trying with PHP's SimpleXML etc. but I just can't do it.
<response xmlns:lf="http://api.lemonfree.com/ns/1.0">
<lf:request_type>listing</lf:request_type>
<lf:response_code>0</lf:response_code>
<lf:result type="listing" count="10">
<lf:item id="56832429">
<lf:attr name="title">Used 2005 Ford Mustang V6 Deluxe</lf:attr>
<lf:attr name="year">2005</lf:attr>
<lf:attr name="make">FORD</lf:attr>
<lf:attr name="model">MUSTANG</lf:attr>
<lf:attr name="vin">1ZVFT80N555169501</lf:attr>
<lf:attr name="price">12987</lf:attr>
<lf:attr name="mileage">42242</lf:attr>
<lf:attr name="auction">no</lf:attr>
<lf:attr name="city">Grand Rapids</lf:attr>
<lf:attr name="state">Michigan</lf:attr>
<lf:attr name="image">http://www.lemonfree.com/images/stock_images/thumbnails/2005_38_557_80.jpg</lf:attr>
<lf:attr name="link">http://www.lemonfree.com/56832429.html</lf:attr>
</lf:item>
<!-- more items -->
</lf:result>
</response>
Thanks guys
EDIT: I want the first items data in easy to access variables, I've been struggling for a couple of days to get SimpleXML to work as I am new to PHP, so I thought manipulating an array is easier to do.
Why do you want them in an array? They are structured already, use them as XML directly.
There is SimpleXML and DOMDocument, now it depends on what you want to do with the data (you failed to mention that) which one serves you better. Expand your question to get code samples.
EDIT: Here is an example of how you could handle your document with SimpleXML:
$url = "http://api.lemonfree.com/listings?key=xxxx&make=ford&model=mustang";
$ns_lf = "http://api.lemonfree.com/ns/1.0";
$response = simplexml_load_file($url);
// children() fetches all nodes of a given namespace
$result = $response->children($ns_lf)->result;
// dump the entire <lf:result> to see what it looks like
print_r($result);
// once the namespace was handled, you can go on normally (-> syntax)
foreach ($result->item as $item) {
$title = $item->xpath("lf:attr[@name='title']");
$state = $item->xpath("lf:attr[@name='state']");
// xpath() always returns an array of matches, hence the [0]
echo( $title[0].", ".$state[0] );
}
Perhaps you should look at SimplePie, it parses XML feeds into an array or an object (well, one of the two :D). I think it works well for namespaces and attributes too.
Some benefits include its GPL license (it's free) and its support community.
SimpleXML is the best way to read/write XML files. Usually it's as easy as using arrays, except in your case, because there are XML namespaces involved and that complicates things. Also, the format used to store attributes kind of sucks, so instead of being easy to use and obvious it's kind of complicated. Here's what you're looking for, so that you can move on to doing something more interesting:
$response = simplexml_load_file($url);
$items = array();
foreach ($response->xpath('//lf:item') as $item)
{
$id = (string) $item['id'];
foreach ($item->xpath('lf:attr') as $attr)
{
$name = (string) $attr['name'];
$items[$id][$name] = (string) $attr;
}
}
You'll have everything you need in the $items array, use print_r() to see what's inside. $url should be the URL of that lemonfree API thing. The code assumes there can't be multiple values for one attribute (e.g. multiple images.)
Good luck.
This is how I parsed XML returned from a client's email address book system into an array so I could use it on a page. It uses an XML parser that is part of PHP, I think.
Here's its documentation: http://www.php.net/manual/en/ref.xml.php
// $user_info must hold your XML as a string
// SETS $xml_array AS ARRAY
$xml_array = array();
// SETS UP XML PARSER AND PARSES $user_info INTO AN ARRAY
$xml_parser = xml_parser_create();
xml_parse_into_struct($xml_parser, $user_info, $values, $index);
xml_parser_free($xml_parser);
foreach($values as $key => $value)
{
// $value['level'] relates to the nesting level of a tag, x will need to be a number
if($value['level']==x)
{
$tag_name = $value['tag'];
// INSERTS DETAILS INTO ARRAY $xml_array SETTING KEY = $tag_name VALUE = value for that tag
$xml_array[strtolower($tag_name)] = $value['value'];
}
}
If you var_dump($values) you should see what level your data is on; I think it'll be 4 for the above XML. You can then filter out anything you don't want by changing $value['level']==x to the required level, i.e. $value['level']==4.
This should return $xml_array as an array with $xml_array['title'] = 'Used 2005 Ford Mustang V6 Deluxe' etc.
Hope that helps some
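As a way to try the SimpleXML/XPath approach from this thread without calling the API, here is a minimal sketch that builds the array from a trimmed copy of the sample response held in a string. Registering the 'lf' prefix explicitly is a precaution on my part rather than something the answers above require.

```php
<?php
// Trimmed copy of the sample response from the question.
$source = <<<XML
<response xmlns:lf="http://api.lemonfree.com/ns/1.0">
  <lf:result type="listing" count="10">
    <lf:item id="56832429">
      <lf:attr name="title">Used 2005 Ford Mustang V6 Deluxe</lf:attr>
      <lf:attr name="year">2005</lf:attr>
      <lf:attr name="price">12987</lf:attr>
    </lf:item>
  </lf:result>
</response>
XML;

$ns = 'http://api.lemonfree.com/ns/1.0';
$response = simplexml_load_string($source);
$response->registerXPathNamespace('lf', $ns);

$items = [];
foreach ($response->xpath('//lf:item') as $item) {
    $id = (string) $item['id'];
    $item->registerXPathNamespace('lf', $ns);
    foreach ($item->xpath('lf:attr') as $attr) {
        // The name attribute becomes the key, the element text the value.
        $items[$id][(string) $attr['name']] = (string) $attr;
    }
}

print_r($items);
```

The first item's data is then available as plain strings, for example $items['56832429']['title'].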
