XML to MySQL when xml file has multiple matching fields - php

I've been doing some work importing XML into MySQL using LOAD XML. I have been successful with it in the past. The difference with the latest effort is that the XML contains multiple occurrences of the same field name. A sample of this is below:
<row>
<pictures>
<picture name="Photo 1">
<filename>image1.jpg</filename>
</picture>
<picture name="Photo 2">
<filename>image2.jpg</filename>
</picture>
<picture name="Photo 4">
<filename>image3.jpg</filename>
</picture>
<picture name="Photo 3">
<filename>image4.jpg</filename>
</picture>
<picture name="Photo 7">
<filename>image5.jpg</filename>
</picture>
<picture name="Photo 6">
<filename>image6.jpg</filename>
</picture>
<picture name="Photo 5">
<filename>image7.jpg</filename>
</picture>
<picture name="Photo 8">
<filename>image8.jpg</filename>
</picture>
<picture name="Photo 9">
<filename>image9.jpg</filename>
</picture>
</pictures>
</row>
I need to import this into a MySQL table with the fields:
picture1
picture2
picture3
picture4
picture5
picture6
picture7
picture8
picture9
As you can see, the 'name' attribute doesn't necessarily occur in the correct order, so I need the values to simply be inserted in document order: the first <filename> goes to picture1, the second <filename> to picture2, and so on.
What currently happens is that only the last <picture> entry in the list ends up in the table. I assume this is because the field is being overwritten each time.
Any ideas how to achieve this? I have found similar queries to this but no answers as yet and have been looking for a good while. The rest of the file is loading fine as they have unique field-names and can easily be mapped to a MySQL column, but I am struggling with this one.

As the XML does not match the format you aim for, you need to transform it first. Traditionally this is done with XSLT, but you can also do it with XMLReader and XMLWriter in PHP, which has the benefit that it does not require keeping the whole XML document in memory.
The XMLReaderIterator package has support for such operations, and an example already ships with the library.
Adapting that example to your case, with an example input file named pictures.xml and the output written to standard output for demonstration purposes, gives the following excerpt:
[... starts like examples/read-write.php]
/** @var $iterator XMLWritingIteration|XMLReaderNode[] */
$iterator = new XMLWritingIteration($writer, $reader);
$writer->startDocument();
$rename = ['row' => 'resultset', 'pictures' => 'row'];
$trimLevel = null;
$pictureCount = null;
foreach ($iterator as $node) {
$name = $node->name;
$isElement = $node->nodeType === XMLReader::ELEMENT;
$isEndElement = $node->nodeType === XMLReader::END_ELEMENT;
$isWhitespace = $node->nodeType === XMLReader::SIGNIFICANT_WHITESPACE;
if (($isElement || $isEndElement) && $name === 'filename') {
// drop <filename> opening and closing tags
} elseif ($isElement && $name === 'picture') {
$writer->startElement('field');
$writer->writeAttribute('name', sprintf('picture%d', ++$pictureCount));
$trimLevel = $node->depth;
} elseif ($trimLevel && $isWhitespace && $node->depth > $trimLevel) {
// drop (trim) SIGNIFICANT_WHITESPACE
} elseif ($isElement && isset($rename[$name])) {
$writer->startElement($rename[$name]);
if ($rename[$name] === 'row') {
$pictureCount = 0;
}
} else {
$iterator->write();
}
}
This is one XMLWritingIteration that is composed of an XMLReader and XMLWriter object. That iteration allows you to take over everything from the input document (via $iterator->write()) and do the needed changes only on occasions:
drop the <filename> and </filename> tags
create <field> elements with the correct name attributes to have the pictures in document order (MySQL XML nomenclature)
drop significant whitespace as <filename> tags are dropped as well
rename the document element from <row> to <resultset> (MySQL XML nomenclature)
rename the <pictures> element to <row> (again MySQL XML nomenclature)
reset the counter for the picture fields for each (output) row
everything else is kept as-is
Such a transformation results in the following example output with the XML presented in your question:
<?xml version="1.0"?>
<resultset>
<row>
<field name="picture1">image1.jpg</field>
<field name="picture2">image2.jpg</field>
<field name="picture3">image3.jpg</field>
<field name="picture4">image4.jpg</field>
<field name="picture5">image5.jpg</field>
<field name="picture6">image6.jpg</field>
<field name="picture7">image7.jpg</field>
<field name="picture8">image8.jpg</field>
<field name="picture9">image9.jpg</field>
</row>
</resultset>
For more information about the XML format used by MySQL, see the MySQL documentation for the --xml command-line switch, which describes the standard XML output format that LOAD XML can read.
For this small example you could just as well use XSLT, as doing the whole transformation in memory would be no problem. But if memory is a concern (which can happen when you deal with XML database dumps), XMLWritingIteration allows iteration-based XML transformation with an XML pull parser (XMLReader) and forward-only XML output via XMLWriter.
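If you would rather not pull in the XMLReaderIterator package, the same streaming idea can be sketched with PHP's built-in XMLReader and XMLWriter alone. This is only a minimal sketch, not the library's code: the inline sample string and the variable names are assumptions, and it only handles the <pictures>/<picture>/<filename> structure from the question (everything else in the input is ignored rather than copied through).

```php
<?php
// Shortened sample input in the shape shown in the question (assumption: inline string instead of a file).
$input = '<row><pictures>'
    . '<picture name="Photo 1"><filename>image1.jpg</filename></picture>'
    . '<picture name="Photo 4"><filename>image3.jpg</filename></picture>'
    . '</pictures></row>';

$reader = new XMLReader();
$reader->XML($input);

$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument('1.0');
$writer->startElement('resultset');

$pictureCount = 0;
while ($reader->read()) {
    $isElement = $reader->nodeType === XMLReader::ELEMENT;
    $isEndElement = $reader->nodeType === XMLReader::END_ELEMENT;

    if ($isElement && $reader->name === 'pictures') {
        // Each <pictures> block becomes one <row>; restart the counter per row.
        $writer->startElement('row');
        $pictureCount = 0;
    } elseif ($isEndElement && $reader->name === 'pictures') {
        $writer->endElement(); // </row>
    } elseif ($isElement && $reader->name === 'filename') {
        // Number the fields in document order, ignoring the name attribute.
        $writer->startElement('field');
        $writer->writeAttribute('name', 'picture' . ++$pictureCount);
        $writer->text($reader->readString());
        $writer->endElement(); // </field>
    }
}

$writer->endElement(); // </resultset>
$writer->endDocument();
$output = $writer->outputMemory();
echo $output;
```

For a real file you would open it with $reader->open() and write with $writer->openUri() so that neither the input nor the output has to fit in memory.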

And here is the XSLT solution. For background, XSLT is a declarative, special-purpose language for transforming, restyling, and restructuring XML documents into various formats. PHP ships with an XSLT processor. Be sure to uncomment extension=php_xsl.dll in php.ini.
XSLT (accommodates image numbers greater than two digits)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8"/>
<xsl:template name="picturesort" match="pictures" >
<row>
<pictures>
<xsl:for-each select="picture">
<xsl:variable name="numkey"
select="substring-after(substring-before(filename, '.'), 'e')"/>
<picture name="{../picture[substring-after(@name, ' ') = $numkey]/@name}">
<xsl:copy-of select="filename"/>
</picture>
</xsl:for-each>
</pictures>
</row>
</xsl:template>
</xsl:stylesheet>
XML OUTPUT
<?xml version="1.0" encoding="UTF-8"?>
<row>
<pictures>
<picture name="Photo 1">
<filename>image1.jpg</filename>
</picture>
<picture name="Photo 2">
<filename>image2.jpg</filename>
</picture>
<picture name="Photo 3">
<filename>image3.jpg</filename>
</picture>
<picture name="Photo 4">
<filename>image4.jpg</filename>
</picture>
<picture name="Photo 5">
<filename>image5.jpg</filename>
</picture>
<picture name="Photo 6">
<filename>image6.jpg</filename>
</picture>
<picture name="Photo 7">
<filename>image7.jpg</filename>
</picture>
<picture name="Photo 8">
<filename>image8.jpg</filename>
</picture>
<picture name="Photo 9">
<filename>image9.jpg</filename>
</picture>
</pictures>
</row>
PHP
<?php
// Load the XML source
$xml = new DOMDocument;
$xml->load('C:/Path/To/XMLfile.xml');
$xsl = new DOMDocument;
$xsl->load('C:/Path/To/XSLfile.xsl');
// Configure the transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
// Transform XML source
$newXml = $proc->transformToXML($xml);
echo $newXml;
// Save output to file
file_put_contents("C:/Path/To/NewXMLfile.xml", $newXml);
?>

Possible way:
iterate over all <picture> in a <row>
build an associative array with key = name and value = filename
sort array by keys
feed the array to your DB
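The steps above could look roughly like this with SimpleXML. It is a sketch under assumptions: the sample string is a shortened copy of the question's XML, and the final database insert is left out because the table and connection details are not given. Note that sorting by the name attribute follows this answer's suggestion; if you want plain document order, as the question asks, skip the ksort() call.

```php
<?php
// Shortened sample in the shape of the question's XML (assumption: inline string).
$xmlString = <<<XML
<row>
<pictures>
<picture name="Photo 1"><filename>image1.jpg</filename></picture>
<picture name="Photo 4"><filename>image3.jpg</filename></picture>
<picture name="Photo 2"><filename>image2.jpg</filename></picture>
</pictures>
</row>
XML;

$row = simplexml_load_string($xmlString);

// Steps 1 and 2: iterate all <picture> elements and build name => filename.
$byName = [];
foreach ($row->pictures->picture as $picture) {
    $byName[(string) $picture['name']] = (string) $picture->filename;
}

// Step 3: sort by key; natural ordering keeps "Photo 10" after "Photo 9".
ksort($byName, SORT_NATURAL);

// Step 4 (preparation): map the sorted filenames onto picture1..pictureN columns.
$columns = [];
$i = 0;
foreach ($byName as $filename) {
    $columns['picture' . ++$i] = $filename;
}

print_r($columns);
```

The $columns array can then be bound to a prepared INSERT statement for the target table.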

Related

XML PHP Get children value

XML:
<Result xmlns="" xmlns:xsi="" totalResultsAvailable="0" totalResultsReturned="0" schk="true" totalLooseOffers="0" xsi:schemaLocation="">
<details>
<ID></ID>
<applicationVersion>1.0</applicationVersion>
<applicationPath/>
<date>2016-05-23T12:17:16.369-03:00</date>
<elapsedTime>17</elapsedTime>
<status>success</status>
<message>success</message>
</details>
<category id="1">
<thumbnail url="http://image.google.com/test.jpg"/>
<links>
<link url="www.google.com" type="category"/>
<link url="www.google2.com" type="xml"/>
</links>
<name>Category</name>
<filters>
<filter id="1" name="Filter1">
<value id="1" value="Test1"/>
<value id="2" value="Test2"/>
<value id="3" value="Test3"/>
</filter>
<filter id="2" name="Filter2">
<value id="1" value="Test4"/>
<value id="2" value="Test5"/>
<value id="3" value="Test6"/>
</filter>
</filters>
</category>
</Result>
PHP:
$xml = simplexml_load_file("http://xml.com");
foreach($xml->category->filters as $filters){
foreach($filters->children() as $child){
echo $child['value'];
}
}
I'm trying to get the filter values, but nothing shows with the code I have. I saw something about xpath but don't know if it's applicable in this situation. Do you have any clue?
--
When the XML looks like this:
<Result xmlns="" xmlns:xsi="" totalResultsAvailable="0" totalResultsReturned="0" schk="true" totalLooseOffers="0" xsi:schemaLocation="">
<details>
<ID></ID>
<applicationVersion>1.0</applicationVersion>
<applicationPath/>
<date>2016-05-23T12:17:16.369-03:00</date>
<elapsedTime>17</elapsedTime>
<status>success</status>
<message>success</message>
</details>
<subCategory id="1">
<thumbnail url="http://image.google.com/test.jpg"/>
<name>Subcategory</name>
</subCategory>
<subCategory id="2">
<thumbnail url="http://image.google.com/test2.jpg"/>
<name>Subcategory2</name>
</subCategory>
</Result>
Then I am able to do this:
foreach($xml->subCategory as $subCategory){
$categoryId = $subCategory['id'];
$categoryName = $subCategory->name;
}
The elements you reference as $child in the inner loop actually point to the <filter> nodes, not the children <value> nodes you are attempting to target attributes for. So this really is just a matter of extending the outer foreach loop to iterate over $xml->category->filters->filter rather than its parent $xml->category->filters.
// Iterate the correct <filter> node, not its parent <filters>
foreach ($xml->category->filters->filter as $filter) {
foreach($filter->children() as $child){
echo $child['value'] . "\n";
}
}
Here it is in demonstration: https://3v4l.org/Rqc4Y
Using xpath, you can target the inner nodes directly.
$values = $xml->xpath('//category/filters/filter/value');
foreach ($values as $value) {
echo $value['value'];
}
https://3v4l.org/vPhKE
Both of these examples output
Test1
Test2
Test3
Test4
Test5
Test6
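If you also need to know which filter each value belongs to, a small variation of the first loop can group the values by the filter's name attribute. This is just a sketch: the inline XML is a trimmed copy of the question's document, and $valuesByFilter is a name made up for the example.

```php
<?php
// Trimmed copy of the question's XML (assumption: inline string instead of a URL).
$xml = simplexml_load_string(
    '<Result><category id="1"><filters>'
    . '<filter id="1" name="Filter1"><value id="1" value="Test1"/><value id="2" value="Test2"/></filter>'
    . '<filter id="2" name="Filter2"><value id="1" value="Test4"/></filter>'
    . '</filters></category></Result>'
);

$valuesByFilter = [];
foreach ($xml->category->filters->filter as $filter) {
    $name = (string) $filter['name'];
    foreach ($filter->value as $value) {
        // Cast to string; otherwise the array would hold SimpleXMLElement objects.
        $valuesByFilter[$name][] = (string) $value['value'];
    }
}

print_r($valuesByFilter);
```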

Xpath query in PHP to attribute from element with specific attribute value

I'm truly bending my head over something that should be way too simple. I have an XML feed with 25 entries in the root. I'm already iterating them as $entry in PHP.
Here is an example of one entry in the xml feed:
<entry>
<id>tag:blogger.com,1999:blog-7691515427771054332.post-4593968385603307594</id>
<published>2014-02-10T06:33:00.000-05:00</published>
<updated>2014-02-10T06:40:34.678-05:00</updated>
<category scheme="http://www.blogger.com/atom/ns#" term="Aurin" />
<category scheme="http://www.blogger.com/atom/ns#" term="fan art" />
<category scheme="http://www.blogger.com/atom/ns#" term="Fred-H" />
<category scheme="http://www.blogger.com/atom/ns#" term="spellslinger" />
<category scheme="http://www.blogger.com/atom/ns#" term="wildstar" />
<title type="text">Fan Art Showcase: She's gunnin' for trouble!</title>
<content type="html">Some random content</content>
<link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/7691515427771054332/posts/default/4593968385603307594" />
<link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/7691515427771054332/posts/default/4593968385603307594" />
<link rel="alternate" type="text/html" href="http://www.wildstarfans.net/2014/02/fan-art-showcase-shes-gunnin-for-trouble.html" title="Fan Art Showcase: She's gunnin' for trouble!" />
<author>
<name>Name Removed</name>
<uri>URL removed</uri>
<email>noreply@blogger.com</email>
<gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="32" height="32" src="//lh3.googleusercontent.com/-ow-dvUDbNxI/AAAAAAAAAAI/AAAAAAAABTY/MhrybgagMv0/s512-c/photo.jpg" />
</author>
<media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/-Ifp6awhDJuU/UWQEUl8nhUI/AAAAAAAABss/BSZ_YYM1U38/s72-c/fan-art-header.png" height="72" width="72" />
</entry>
I want to get the href of the third link with rel set to alternate. The alternate link isn't always the third one. I know how to do this through SimpleXML, but I want to get to know xpath for this, because through simpleXML it's more complicated and with this I hope I'm one step closer to understanding complex xpath queries.
The PHP I got that makes the most sense to me is:
$href = $entry->xpath('link[@rel="alternate"]/@href');
I tried multiple queries based on the information I found, but they all resulted in nothing. Here is a list of the queries I tried:
$href = $entry->xpath('link[@rel="alternate"]/@href/text()');
$href = $entry->xpath('link[@rel="alternate"]')->getAttributes()->href;
$href = $entry->xpath('*[@rel="alternate"]'); $href = $href['href'];
As it turns out from the chat conversation from my original question I had to register the namespace. In the end I used this website and the code turned out to be like this:
$feed = new DOMDocument();
$feed->load("http://www.wildstarfans.net/feeds/posts/default");
$xpath = new DOMXPath($feed);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
foreach ($xpath->evaluate('//atom:entry') as $entry) {
$href = $xpath->evaluate('string(atom:link[@rel="alternate"]/@href)', $entry);
}
Credits go to ThW and Wrikken. Wish I could give you guys SO points for this.
$href = $entry->xpath('link[@rel="alternate"]');
$href = (string) $href[0]->attributes()->href;
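For anyone who wants to stay with SimpleXML, the namespace can be registered there as well. The sketch below assumes the feed uses the default Atom namespace, as in the DOMXPath answer; the inline feed and its example.com URLs are placeholders, not the real feed. The prefix is registered again on each $entry before calling xpath() on it.

```php
<?php
// Minimal stand-in for the feed (assumption: placeholder URLs, default Atom namespace).
$feed = simplexml_load_string(
    '<feed xmlns="http://www.w3.org/2005/Atom"><entry>'
    . '<link rel="self" href="http://example.com/self"/>'
    . '<link rel="alternate" href="http://example.com/post.html"/>'
    . '</entry></feed>'
);

$atomNs = 'http://www.w3.org/2005/Atom';
$feed->registerXPathNamespace('atom', $atomNs);

$hrefs = [];
foreach ($feed->xpath('//atom:entry') as $entry) {
    // Register the prefix on the entry itself before querying relative to it.
    $entry->registerXPathNamespace('atom', $atomNs);
    $matches = $entry->xpath('atom:link[@rel="alternate"]/@href');
    if ($matches) {
        // xpath() returns an array; cast the attribute node to get its value.
        $hrefs[] = (string) $matches[0];
    }
}

print_r($hrefs);
```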

How to set xsl attribute using processing instruction php

I'm having trouble trying to set a value into the attribute using a PHP processing instruction:
XSLT
<li itemprop="startDate">
<xsl:attribute name="content">
<xsl:processing-instruction name="php">
echo "Monday";
?</xsl:processing-instruction>
</xsl:attribute>
Monday
</li>
The page renders fine but the attribute is always empty.
Output
<li itemprop="startDate" content="">Monday</li>
I'm expecting the PHP to echo out a value into the attribute
If you're using PHP to transform the XML through XSLT, you can set a parameter from PHP:
$proc->setParameter(null, 'day', 'Monday');
$proc->transformToXML($xml);
Then in your XSLT to use this variable:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:php="http://php.net/xsl"
exclude-result-prefixes="php"
xsl:extension-element-prefixes="php">
<xsl:param name="day"/> <!-- Set the parameter -->
<xsl:attribute name='content'>
<xsl:value-of select="$day"/>
</xsl:attribute>
All the best!
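Put together, a complete round trip might look like the sketch below. It assumes the xsl extension is enabled; the <event/> input document and the single match="/" template are made up for the example, since the question does not show the full stylesheet.

```php
<?php
// Hypothetical input document; the question does not show the real one.
$xml = new DOMDocument();
$xml->loadXML('<event/>');

// Stylesheet that reads the "day" parameter into the content attribute.
$xsl = new DOMDocument();
$xsl->loadXML(
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    . '<xsl:output method="html"/>'
    . '<xsl:param name="day"/>'
    . '<xsl:template match="/">'
    . '<li itemprop="startDate">'
    . '<xsl:attribute name="content"><xsl:value-of select="$day"/></xsl:attribute>'
    . '<xsl:value-of select="$day"/>'
    . '</li>'
    . '</xsl:template>'
    . '</xsl:stylesheet>'
);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
// An empty string is used for the namespace argument here.
$proc->setParameter('', 'day', 'Monday');

$html = $proc->transformToXml($xml);
echo $html;
```

The value is computed in PHP before the transformation runs, so no processing instruction is needed in the stylesheet.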
You did not say how you open the XML. But because of the echo I assume it could/should have php instruction include.
The xsl:processing-instruction does not make sense here. Try this:
<li itemprop="startDate">
<xsl:attribute name="content">
<?php
echo "Monday";
?>
</xsl:attribute>
Monday
</li>

Storing the content of an xml child node based on a parent node's attribute in php

I'm trying to display the biggest image url returned from an xml result. So far the largest returned is 400 high so I hardcoded 400 in. If possible I would like to select just the largest in case in the future I get results that don't have a 400 height image in them.
I've tried
$x = file_get_contents($url);
$xml = simplexml_load_string($x);
$imageURL=$xml->categories->category->items->product->images->image[@height='400']->sourceURL;
Which gives me "syntax error, unexpected '=', expecting ']'".
And I also tried:
$imageURL= $xml->xpath("/categories/category/items/producct/images/image[@height='400']/sourceURL");
But got a bad link.
Here is the XML:
<images>
<image available="true" height="100" width="100">
<sourceURL>
Someurl.com
</sourceURL>
</image>
<image available="true" height="200" width="200">
<sourceURL>
Someurl.com
</sourceURL>
</image>
<image available="true" height="300" width="300">
<sourceURL>
Someurl.com
</sourceURL>
</image>
<image available="true" height="400" width="400">
<sourceURL>
Someurl.com
</sourceURL>
</image>
<image available="true" height="399" width="400">
<sourceURL>
Someurl.com
</sourceURL>
</image>
</images>
Any ideas?
->image[@height='400'] is a direct PHP array reference. This'd be interpreted as suppressing errors (@) on a defined() constant (height), and trying to set its value via an assignment ='400'.
For your xpath version, remember that a DOMXPath query returns a DOMNodeList, not an actual DOMElement (and SimpleXMLElement::xpath() likewise returns an array of elements, not a single one). To get the URLs you need from the query results, you have to iterate over the node list:
$nodes = $xpath->query(...);
foreach($nodes as $node) {
$url = $node->nodeValue;
}
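To pick the largest image without hardcoding 400, one option is to loop over the matched <image> elements and keep the one with the greatest height. This is a sketch with assumptions: the example.com URLs are placeholders, and the trim() call is there because the sourceURL text in the question's XML is surrounded by whitespace, which may be one reason for a bad link.

```php
<?php
// Sample in the shape of the question's XML (assumption: placeholder URLs).
$xml = simplexml_load_string(
    '<images>'
    . '<image available="true" height="100" width="100"><sourceURL> http://example.com/100.jpg </sourceURL></image>'
    . '<image available="true" height="400" width="400"><sourceURL> http://example.com/400.jpg </sourceURL></image>'
    . '<image available="true" height="399" width="400"><sourceURL> http://example.com/399.jpg </sourceURL></image>'
    . '</images>'
);

$largest = null;
foreach ($xml->xpath('//image') as $image) {
    // Compare heights as integers; attribute values are strings by default.
    if ($largest === null || (int) $image['height'] > (int) $largest['height']) {
        $largest = $image;
    }
}

// Strip the surrounding whitespace from the element text.
$imageURL = $largest !== null ? trim((string) $largest->sourceURL) : null;
echo $imageURL;
```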
Below code might help...
$xmlSQLProcedures = new DOMXPath($xmlSQLProcedures);
$strProcedureName = $xmlSQLProcedures->query("//SQLProcedure[@ID='$sSQLProcedureID']")->item(0)->nodeValue;
$nodeParameters = $xmlSQLProcedures->query("//SQLProcedure[@ID='$sSQLProcedureID']/Parameters/Parameter");
$ParamCount = $nodeParameters->length-1;
for ($i=0;$i<=$ParamCount;$i++) {
echo $nodeParameters->item($i)->getAttribute("Name").'<br>';
}
<?xml version="1.0" encoding="UTF-8"?>
<SQLProcedures>
<!-- ********** FOR KEYWORD IN LOCAL LANGUAGE ************* -->
<SQLProcedure ID="001070001">
<Name>P_ManipulateKeywordsInLL</Name>
<Parameters>
<Parameter Name="LanguageId"/>
<Parameter Name="KeywordId"/>
<Parameter Name="KeywordInLL"/>
<Parameter Name="ActionFor"/>
<Parameter Name="KeywordInLLId"/>
<Parameter Name="Keyword"/>
<Parameter Name="KeywordList"/>
<Parameter Name="SessionId"/>
<Parameter Name="WarehouseId"/>
</Parameters>
</SQLProcedure>
</SQLProcedures>

How to move childnode to parentnode level in xml ?

Recently I have been working with XML and ran into a problem.
Given the xml like this:
<Colonel id="1">
<Lieutenant_Colonel id="2">
<Secretary id="6"/>
</Lieutenant_Colonel>
<Lieutenant_Colonel id="3">
<Secretary id="7"/>
</Lieutenant_Colonel>
<Secretary id="5"/>
</Colonel>
now the Colonel(id=1) has gone with the Secretary(id=5)
the xml we want is like
<!-- <Colonel id="1"> -->
<Lieutenant_Colonel id="2">
<Secretary id="6"/>
</Lieutenant_Colonel>
<Lieutenant_Colonel id="3">
<Secretary id="7"/>
</Lieutenant_Colonel>
<!--
<Secretary id="5"/>
</Colonel> -->
or
<Lieutenant_Colonel id="2">
<Secretary id="6"/>
</Lieutenant_Colonel>
<Lieutenant_Colonel id="3">
<Secretary id="7"/>
</Lieutenant_Colonel>
How to do this work?
please help me
This is a pretty long topic, so I'll have to refer to other questions. First, consider this function:
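function RemoveColonel($id, $secretaryId)
{
// Assumes $rootXml holds the SimpleXMLElement for the whole document
global $rootXml;
// XPath is case-sensitive, so the element name must match <Colonel> exactly
$colXpath = '//Colonel[@id="' . $id . '"]';
$secXpath = $colXpath . '/Secretary[@id="' . $secretaryId . '"]';
RemoveNode($secXpath);
// xpath() returns an array, so take the first match before calling children()
$colChildren = $rootXml->xpath($colXpath)[0]->children();
$children = array();
foreach ($colChildren as $child ){
$children[] = $child;
}
RemoveNode($colXpath);
return $children;
}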
Given the Id of a colonel and a secretary for that colonel, it deletes the secretary, then it saves the colonel's children to remove the colonel and return the children. After that, you can insert your elements in the root node again if you please. If you can't have the secretary Id at the moment of deleting the colonel, you can workaround the xpath to remove the secretary as child of the colonel by removing the id specification.
The RemoveNode code is a bit long, so I'll have to give you this question on how to remove nodes using SimpleXML in PHP.
I hope I can be of some help!
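As an alternative that does not depend on a separate RemoveNode helper, the same result can be sketched with DOMDocument. The <Army> wrapper element is an assumption added for this example: a well-formed XML document needs a single root, so the two <Lieutenant_Colonel> elements need some parent to move up into.

```php
<?php
// The question's XML wrapped in a hypothetical <Army> root element.
$doc = new DOMDocument();
$doc->loadXML(
    '<Army><Colonel id="1">'
    . '<Lieutenant_Colonel id="2"><Secretary id="6"/></Lieutenant_Colonel>'
    . '<Lieutenant_Colonel id="3"><Secretary id="7"/></Lieutenant_Colonel>'
    . '<Secretary id="5"/>'
    . '</Colonel></Army>'
);

$xpath = new DOMXPath($doc);
$colonel = $xpath->query('//Colonel[@id="1"]')->item(0);

// Remove the colonel's own secretary (direct children only).
foreach ($xpath->query('Secretary', $colonel) as $secretary) {
    $colonel->removeChild($secretary);
}

// Move every remaining child up to the colonel's parent, keeping the order.
while ($colonel->firstChild) {
    $colonel->parentNode->insertBefore($colonel->firstChild, $colonel);
}

// Finally drop the now-empty <Colonel> element.
$colonel->parentNode->removeChild($colonel);

$result = $doc->saveXML($doc->documentElement);
echo $result;
```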
