Merge selected nodes of the two XML files - php

Let's say I have two XML files which have the same structure. I need to create a new XML file with the same structure which contains selected nodes from the two initial XML files.
The example below illustrates this.
input1.xml :
<parent>
<item id="100">
...
</item>
<item id="101">
...
</item>
<item id="102">
...
</item>
<item id="103">
...
</item>
</parent>
input2.xml :
<parent>
<item id="200">
...
</item>
<item id="201">
...
</item>
<item id="202">
...
</item>
<item id="203">
...
</item>
</parent>
Now I need to select the nodes with ids 100 and 103 from input1.xml and 202 and 203 from input2.xml. They should appear in the order 203, 100, 103, 202, so the final result would look like this:
result.xml :
<parent>
<item id="203">
...
</item>
<item id="100">
...
</item>
<item id="103">
...
</item>
<item id="202">
...
</item>
</parent>
It is not necessary to create a new file; if I can edit input2.xml so that it looks like result.xml, that would be the ideal solution.
What I have done so far :
My approach is to first delete the unwanted nodes from input2.xml and then add the nodes from input1.xml to it.
I have the following function to delete nodes from the input2.xml file.
e.g. calling delete_record(200,'input2.xml','result.xml') deletes node 200, and I can repeat it in a similar manner.
function delete_record($id, $input, $output){
$xml = new DOMDocument();
$xml->load($input);
$deals = $xml->getElementsByTagName('item');
foreach ($deals as $deal) {
$deal_id = $deal->getElementsByTagName('id')->item(0)->nodeValue;
if ($deal_id == $id) {
$id_matched = true;
$deal->parentNode->removeChild($deal);
break;
}
}
if ($id_matched == true) {
if ($xml->save($output)) {
return true;
}
}
}
But I am still struggling to find a way to add nodes to the same result.xml file and to put them in the right order.
Any kind of help would be highly appreciated.

There is no need to delete anything. Pick the nodes you need by id from both files and append them to a new document in the required order:
// merge all nodes by Id
function getNodesById($id, ...$xpaths) {
$result = [];
foreach($xpaths as $xpath) {
foreach($xpath->query("//item[@id='$id']") as $node) {
$result[] = $node;
}
}
return $result;
}
// load source documents
$xml1 = new DOMDocument();
$xml1->load(....);
$xpath1 = new DomXpath($xml1);
$xml2 = new DOMDocument();
$xml2->load(....);
$xpath2 = new DomXpath($xml2);
// create result document
$result = new DOMDocument();
$parent = $result->createElement("parent");
$result->appendChild($parent);
// populate result document with nodes:
foreach([203, 100, 103, 202] as $id) {
$nodesToInsert = getNodesById($id, $xpath1, $xpath2);
if (count($nodesToInsert) !== 1) {
// resolve conflicts, if any
throw new Exception("Id $id is not found or not unique.");
}
$parent->appendChild($result->importNode($nodesToInsert[0], true));
}
// print the result (or save it to a file instead)
echo $result->saveXml();
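For reference, here is a minimal self-contained sketch of the same idea that runs without any input files. The helper name mergeItemsInOrder and the inline sample documents are assumptions made for illustration, not part of the answer above. Unlike the code above, it takes the first match for each id instead of rejecting duplicates.

```php
<?php
// Hypothetical helper: build a new <parent> document from items picked by id.
function mergeItemsInOrder(array $xmlStrings, array $ids): DOMDocument
{
    // One DOMXPath per source document.
    $xpaths = [];
    foreach ($xmlStrings as $xmlString) {
        $doc = new DOMDocument();
        $doc->loadXML($xmlString);
        $xpaths[] = new DOMXPath($doc);
    }

    $result = new DOMDocument('1.0', 'UTF-8');
    $parent = $result->appendChild($result->createElement('parent'));

    foreach ($ids as $id) {
        foreach ($xpaths as $xpath) {
            // %d forces an integer, so the id cannot break the XPath expression.
            $nodes = $xpath->query(sprintf('/parent/item[@id="%d"]', $id));
            if ($nodes->length > 0) {
                // importNode() copies the node (deep) into the result document.
                $parent->appendChild($result->importNode($nodes->item(0), true));
                continue 2; // first match wins; go on with the next id
            }
        }
        throw new RuntimeException("Id $id was not found in any source.");
    }

    return $result;
}

// Sample data standing in for input1.xml and input2.xml.
$input1 = '<parent><item id="100">a</item><item id="101">b</item><item id="103">c</item></parent>';
$input2 = '<parent><item id="202">d</item><item id="203">e</item></parent>';

$merged = mergeItemsInOrder([$input1, $input2], [203, 100, 103, 202]);
$merged->formatOutput = true;
echo $merged->saveXML();
```

To write a file instead of printing, call $merged->save() with the target path. Saving to the path of input2.xml would overwrite that file, which matches the "edit in place" wish from the question.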

Related

How to edit large XML files in PHP based on a record in the XML Node

I'm trying to modify a 130mb+ XML file via PHP so it only shows the results where a child node is a specific value. I'm trying to filter this because of limitations via the software we're using to import the XML into our website.
Example: (mockup data)
<Items>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>false</ShowOnWebsite>
</Item>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>true</ShowOnWebsite>
</Item>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>false</ShowOnWebsite>
</Item>
</Items>
Desired result:
I want to create a new XML file with only the records where the child "ShowOnWebsite" is true.
Problems I've run into
Because the XML is so large, simple solutions such as SimpleXML, or loading the XML into the body and editing the nodes there, don't work: they all read the entire file into memory, which is too slow and usually fails.
I've also looked at prewk/xml-string-streamer (https://github.com/prewk/xml-string-streamer) which is great for streaming large XML files because it doesn't place them in memory, although I can't find any way to modify the XML via that solution. (Other online posts say you need to have the nodes in memory to edit them).
Anyone got an idea on how to tackle this problem?
Goal
Desired result: I want to create a new XML file with only the records where the child "ShowOnWebsite" is true.
Given
test.xml
<Items>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>false</ShowOnWebsite>
</Item>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>true</ShowOnWebsite>
</Item>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>false</ShowOnWebsite>
</Item>
</Items>
Code
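This is the implementation I wrote. The getItems generator yields the child elements one at a time, without loading the whole XML file into memory. It assumes that each <Item> and </Item> tag sits on its own line, as in test.xml.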
function getItems($fileName) {
if ($file = fopen($fileName, "r")) {
$buffer = "";
$active = false;
while(!feof($file)) {
$line = fgets($file);
$line = trim(str_replace(["\r", "\n"], "", $line));
if($line == "<Item>") {
$buffer .= $line;
$active = true;
} elseif($line == "</Item>") {
$buffer .= $line;
$active = false;
yield new SimpleXMLElement($buffer);
$buffer = "";
} elseif($active == true) {
$buffer .= $line;
}
}
fclose($file);
}
}
$output = new SimpleXMLElement('<?xml version="1.0" encoding="utf-8"?><Items></Items>');
foreach(getItems("test.xml") as $element)
{
if($element->ShowOnWebsite == "true") {
$item = $output->addChild('Item');
$item->addChild('Barcode', (string) $element->Barcode);
$item->addChild('BrandCode', (string) $element->BrandCode);
$item->addChild('Title', (string) $element->Title);
$item->addChild('Content', (string) $element->Content);
$item->addChild('ShowOnWebsite', $element->ShowOnWebsite);
}
}
$fileName = __DIR__ . "/test_" . rand(100, 999999) . ".xml";
$output->asXML($fileName);
Output
<?xml version="1.0" encoding="utf-8"?>
<Items><Item><Barcode>...</Barcode><BrandCode>...</BrandCode><Title>...</Title><Content>...</Content><ShowOnWebsite>true</ShowOnWebsite></Item></Items>
XMLReader has an expand() method, but XMLWriter is missing the counterpart, so I added an XMLWriter::collapse() method in FluentDOM.
This makes it possible to read the XML with XMLReader, expand it to DOM, use DOM methods to filter or manipulate it, and write it back with XMLWriter:
require __DIR__.'/../../vendor/autoload.php';
// Create the target writer and add the root element
$writer = new \FluentDOM\XMLWriter();
$writer->openUri('php://stdout');
$writer->setIndent(2);
$writer->startDocument();
$writer->startElement('Items');
// load the source into a reader
$reader = new \FluentDOM\XMLReader();
$reader->open(getXMLAsURI());
// iterate the Item elements - the iterator expands them into a DOM node
foreach (new FluentDOM\XMLReader\SiblingIterator($reader, 'Item') as $item) {
/** @var \FluentDOM\DOM\Element $item */
// only "ShowOnWebsite = true"
if ($item('ShowOnWebsite = "true"')) {
// write expanded node to the output
$writer->collapse($item);
}
}
$writer->endElement();
$writer->endDocument();
function getXMLAsURI() {
$xml = <<<'XML'
<Items>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>false</ShowOnWebsite>
</Item>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>true</ShowOnWebsite>
</Item>
<Item>
<Barcode>...</Barcode>
<BrandCode>...</BrandCode>
<Title>...</Title>
<Content>...</Content>
<ShowOnWebsite>false</ShowOnWebsite>
</Item>
</Items>
XML;
return 'data://text/plain;base64,'.base64_encode($xml);
}
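If adding a library is not an option, roughly the same streaming filter can be sketched with only the built-in XMLReader and XMLWriter classes. This is an illustration under assumptions: the function name filterItems and the temporary sample file are made up here, and each Item is small enough to be parsed on its own with SimpleXML.

```php
<?php
// Hypothetical helper: copy only the <Item> elements whose <ShowOnWebsite> is "true".
function filterItems(string $sourceUri, string $targetUri): int
{
    $reader = new XMLReader();
    if (!$reader->open($sourceUri)) {
        throw new RuntimeException("Cannot open $sourceUri");
    }

    $writer = new XMLWriter();
    $writer->openUri($targetUri);
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('Items');

    $kept = 0;
    // Advance to the first <Item> element.
    while ($reader->read() && $reader->name !== 'Item') {
        // nothing to do, just moving the cursor
    }
    while ($reader->name === 'Item') {
        // Only the current <Item> is turned into a string and parsed.
        $outerXml = $reader->readOuterXml();
        $item = simplexml_load_string($outerXml);
        if ($item !== false && trim((string) $item->ShowOnWebsite) === 'true') {
            $writer->writeRaw($outerXml);
            $kept++;
            $writer->flush(); // push what was written so far to the target
        }
        // Jump to the next <Item> sibling, skipping the current subtree.
        $reader->next('Item');
    }

    $writer->endElement();
    $writer->endDocument();
    $writer->flush();
    $reader->close();

    return $kept;
}

// Small sample standing in for the large file.
$source = tempnam(sys_get_temp_dir(), 'src');
$target = tempnam(sys_get_temp_dir(), 'out');
file_put_contents($source, '<Items>'
    . '<Item><Barcode>A</Barcode><ShowOnWebsite>false</ShowOnWebsite></Item>'
    . '<Item><Barcode>B</Barcode><ShowOnWebsite>true</ShowOnWebsite></Item>'
    . '<Item><Barcode>C</Barcode><ShowOnWebsite>false</ShowOnWebsite></Item>'
    . '</Items>');

$kept = filterItems($source, $target);
echo file_get_contents($target);
```

Memory use stays bounded by the size of a single Item, because the reader never holds more than the current element and the writer is flushed as it goes.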

XML change value from parent to child

I have this XML:
<destinos>
<destino>
<location>Spain</location>
<programas>
<item></item>
<item></item>
</programas>
</destino>
<destino>
<location>France</location>
<programas>
<item></item>
<item></item>
</programas>
</destino>
</destinos>
I need to include or copy the value of "Location" within each "item" and I am not able to do so.
<destinos>
<destino>
<location>Spain</location>
<programas>
<item>
<location>Spain</location>
</item>
<item>
<location>Spain</location>
</item>
</programas>
</destino>
<destino>
<location>France</location>
<programas>
<item>
<location>France</location>
</item>
<item>
<location>France</location>
</item>
</programas>
</destino>
</destinos>
I have no knowledge of PHP and I have been reading but I can't find the solution.
If someone could help me and explain I would be very grateful.
My code:
$url = file_get_contents("archive.xml");
$xml = simplexml_load_string($url);
$changes = $xml->xpath("//*[starts-with(local-name(), 'item')]");
foreach ($changes as $change)
$change[0] = $xml->destinos->destino->location;
header('Content-Type: application/xml');
echo $xml->asXML();
One option could be to use xpath with addChild, passing the value of the location:
$url = file_get_contents("archive.xml");
$xml = simplexml_load_string($url);
$changes = $xml->xpath("/destinos/destino");
foreach ($changes as $change) {
$text = (string)$change->location;
foreach ($change->xpath("programas/item") as $i) {
$i->addChild("location", $text);
}
}
header('Content-Type: application/xml');
echo $xml->asXML();
Output
<destinos>
<destino>
<location>Spain</location>
<programas>
<item><location>Spain</location></item>
<item><location>Spain</location></item>
</programas>
</destino>
<destino>
<location>France</location>
<programas>
<item><location>France</location></item>
<item><location>France</location></item>
</programas>
</destino>
</destinos>
With SimpleXML you can use object notation to access the elements of the document. This removes the need for XPath and can make the code more readable:
$url = file_get_contents("archive.xml");
$xml = simplexml_load_string($url);
foreach ($xml->destino as $destino) {
// Process each item
foreach ( $destino->programas->item as $item ) {
// Set the location from the destino location value
$item->location = (string)$destino->location;
}
}
header('Content-Type: application/xml');
echo $xml->asXML();
One thing to note is that when using SimpleXML, the root node (<destinos> in this case) is the $xml object. This is why $xml->destino is accessing the <destino> elements.
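The point about the root node can be checked with a short sketch. The inline sample document is an assumption for illustration; the calls are standard SimpleXML.

```php
<?php
$xml = simplexml_load_string(
    '<destinos>'
    . '<destino><location>Spain</location></destino>'
    . '<destino><location>France</location></destino>'
    . '</destinos>'
);

// The object returned by simplexml_load_string() is the root element itself.
$rootName = $xml->getName();

// Children are therefore reached directly from $xml.
$destinoCount = count($xml->destino);

// There is no <destinos> child below the root, so this is false.
$hasNestedRoot = isset($xml->destinos);

echo $rootName, "\n";
echo $destinoCount, "\n";
var_dump($hasNestedRoot);
```

This is also why the path $xml->destinos->destino->location in the question's code does not reach the location element.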
With DOM you can append clones of the location nodes to the respective item elements.
$document = new DOMDocument();
$document->load("archive.xml");
$xpath = new DOMXpath($document);
// iterate the location child of the destino elements
foreach($xpath->evaluate('//destino/location') as $location) {
// iterate the item nodes inside the same parent node
foreach ($xpath->evaluate('parent::*/programas/item', $location) as $item) {
// append a copy of the location to the item
$item->appendChild($location->cloneNode(TRUE));
}
}
echo $document->saveXML();

xml DOM : delete element with condition

Maybe this question has already been answered in one way or another, but since I'm a newbie in XML, I can't figure it out in my project.
I have an RSS (XML) file with this structure:
<rss>
<channel>
<item>
<title>some title</title>
<description> some descrp </description>
...
</item>
</channel>
</rss>
How can I, in PHP, delete an item when its title is equal to some value? Thanks.
EDIT1 : I have my XML file stored at my web server.
$rss = "
<rss>
<channel>
<item>
<title>some title</title>
<description> some descrp </description>
</item>
<item>
<title>some other title</title>
<description> some descrp </description>
</item>
</channel>
</rss>
";
$doc = new DOMDocument();
$doc->loadXML($rss);
$xpath = new DOMXPath($doc);
$els = $xpath->query('//title[text()="some title"]');
foreach($els as $el)
{
$parent = $el->parentNode;
$parent->parentNode->removeChild($parent);
}
echo $doc->saveXML();
It searches for exact match.
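If a partial match is wanted instead, the XPath 1.0 function contains() can be used. The following is only a sketch: the helper name removeItemsWithTitleContaining is made up for illustration, and it assumes the search text contains no double quotes so that it can be placed inside the expression safely.

```php
<?php
// Hypothetical helper: remove every <item> whose <title> contains $needle.
function removeItemsWithTitleContaining(DOMDocument $doc, string $needle): int
{
    // Assumption: no double quotes in $needle, so it fits into a "..." XPath literal.
    if (strpos($needle, '"') !== false) {
        throw new InvalidArgumentException('Needle must not contain double quotes.');
    }

    $xpath = new DOMXPath($doc);
    $removed = 0;
    // The XPath result is a static list, so removing nodes while looping is safe.
    foreach ($xpath->query('//item[contains(title, "' . $needle . '")]') as $item) {
        $item->parentNode->removeChild($item);
        $removed++;
    }

    return $removed;
}

$doc = new DOMDocument();
$doc->loadXML('<rss><channel>'
    . '<item><title>some title</title></item>'
    . '<item><title>some other title</title></item>'
    . '<item><title>unrelated</title></item>'
    . '</channel></rss>');

$removed = removeItemsWithTitleContaining($doc, 'some');
echo $doc->saveXML();
```

The comparison is case-sensitive, as in the exact-match version.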
ps: another method, without xpath
$doc = new DOMDocument();
$doc->loadXML($rss);
$els = $doc->getElementsByTagName('title');
for($i = $els->length-1; $i >= 0; $i--)
{
$el = $els->item($i);
if ($el->nodeValue == 'some title')
{
$parent = $el->parentNode;
$parent->parentNode->removeChild($parent);
}
}
echo $doc->saveXML();

PHP DOMDocument getting element data with particular attribute name

The XML
<?xml version="1.0"?>
<items version="1.5">
<item name="device">iphone</item>
<item name="affinity">testing</item>
</items>
I need to get the value 'iphone' and value 'testing' from device and affinity, how do I select them?
I've tried:
$xml = new DOMDocument();
$xml->loadXML($request);
$itemList = $xml->getElementsByTagName('item');
foreach($itemList as $install)
{
echo $install->getAttribute('device')->item(0)->nodeValue;
}
But that doesn't seem to work.
Thank you!
Something like this may do:
$Device = null;
$Affinity = null;
foreach($itemList as $install)
{
if($install->getAttribute('name') =='device'){
$Device = $install->nodeValue;
}
if($install->getAttribute('name')=='affinity'){
$Affinity = $install->nodeValue;
}
}
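As an alternative sketch, the same values can be read without a loop by using DOMXPath and the XPath string() function. The variable names below are chosen for illustration; $request holds the XML from the question.

```php
<?php
$request = '<?xml version="1.0"?>'
    . '<items version="1.5">'
    . '<item name="device">iphone</item>'
    . '<item name="affinity">testing</item>'
    . '</items>';

$doc = new DOMDocument();
$doc->loadXML($request);
$xpath = new DOMXPath($doc);

// string(...) yields the text of the first matching node, or '' if none matches.
$device = $xpath->evaluate('string(/items/item[@name="device"])');
$affinity = $xpath->evaluate('string(/items/item[@name="affinity"])');

echo $device, "\n";
echo $affinity, "\n";
```

Note that name is the attribute here and device is its value, which is why getAttribute('device') in the question returns nothing useful.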

XML reforming with DOM

I am trying to restructure an XML document by adding an intermediate-level node.
Here is what I have as input:
<channel>
<item>
<title>Advanced PHP Book</title>
</item>
<item>
<title>MySQL primer</title>
</item>
<item>
<title>C++ for beginners</title>
</item>
</channel>
I need it to look like this in the end (a page node added between channel and item):
<channel>
<page>
<item>
<title>Advanced PHP Book</title>
</item>
<item>
<title>MySQL primer</title>
</item>
<item>
<title>C++ for beginners</title>
</item>
</page>
</channel>
Here is my testing code:
$sxe = simplexml_load_string($string);
$dom_sxe = dom_import_simplexml($sxe);
$dom = new DOMDocument('1.0');
$channel = $dom->appendChild($dom->createElement('channel'));
$page = $channel->appendChild($dom->createElement('page'));
$dom_sxe = $dom->importNode($dom_sxe, true);
$dom_sxe = $page->appendChild($dom_sxe);
$dom->formatOutput = true;
echo $dom->saveXML();
The problem I have is that the channel element is duplicated.
Please help.
I don't think this should be too hard: I think you're overcomplicating it by using the simplexml stuff.
$dom = new DOMDocument;
$dom->loadXML($string);
// create the <page> element
$page = $dom->createElement('page');
while ($dom->firstChild->firstChild) {
// move the items in <channel> to the <page> element
$page->appendChild($dom->firstChild->firstChild);
}
// insert the <page> element into <channel>
$dom->firstChild->appendChild($page);
echo $dom->saveXML();
$xml = '<channel> <item> <title>Advanced PHP Book</title> </item> <item> <title>MySQL primer</title> </item> <item> <title>C++ for beginners</title> </item> </channel>';
$dom = new DOMDocument;
$dom->loadXML($xml);
$page = $dom->createElement('page');
$items = $dom->getElementsByTagName('item');
while ($items->length) {
$page->appendChild($items->item(0));
}
$dom->getElementsByTagName('channel')->item(0)->appendChild($page);
echo $dom->saveXML();
Output
<?xml version="1.0"?>
<channel> <page><item> <title>Advanced PHP Book</title> </item><item> <title>MySQL primer</title> </item><item> <title>C++ for beginners</title> </item></page></channel>