php only appending 3 out of 5 child nodes - php

This code is only appending 3 of the 5 name nodes. Why is that?
Here is the original XML, which has 5 name nodes:
<?xml version='1.0'?>
<products>
<product>
<itemId>531670</itemId>
<modelNumber>METRA ELECTRONICS/MOBILE AUDIO</modelNumber>
<categoryPath>
<category><name>Buy</name></category>
<category><name>Car, Marine &amp; GPS</name></category>
<category><name>Car Installation Parts</name></category>
<category><name>Deck Installation Parts</name></category>
<category><name>Antennas &amp; Adapters</name></category>
</categoryPath>
</product>
</products>
Then I run this PHP code, which is supposed to append ALL name nodes to the product node.
<?php
// load up your XML
$xml = new DOMDocument;
$xml->load('book.xml');
// Find all elements you want to replace. Since your data is really simple,
// you can do this without much ado. Otherwise you could read up on XPath.
// See http://www.php.net/manual/en/class.domxpath.php
//$elements = $xml->getElementsByTagName('category');
// WARNING: $elements is a "live" list -- it's going to reflect the structure
// of the document even as we are modifying it! For this reason, it's
// important to write the loop in a way that makes it work correctly in the
// presence of such "live updates".
foreach ($xml->getElementsByTagName('product') as $product ) {
foreach($product->getElementsByTagName('name') as $name ) {
$product->appendChild($name );
}
$product->removeChild($xml->getElementsByTagName('categoryPath')->item(0));
}
// final result:
$result = $xml->saveXML();
echo $result;
?>
The end result is this and it only appends 3 of the name nodes:
<?xml version="1.0"?>
<products>
<product>
<itemId>531670</itemId>
<modelNumber>METRA ELECTRONICS/MOBILE AUDIO</modelNumber>
<name>Buy</name>
<name>Antennas &amp; Adapters</name>
<name>Car Installation Parts</name>
</product>
</products>
Why is it only appending 3 of the name nodes?

You're modifying the DOM while iterating over it. The node list returned by getElementsByTagName() is live, so it may change as you move nodes around (and indeed that appears to be what's happening). To avoid this, collect the name elements in an array first and append them afterwards.
<?php
// load up your XML
$xml = new DOMDocument;
$xml->load('book.xml');
foreach ($xml->getElementsByTagName('product') as $product ) {
// Array to store this product's name nodes (reset for every product)
$append = array();
foreach($product->getElementsByTagName('name') as $name ) {
// Stick $name onto the array
$append[] = $name;
}
// Now append all of them to product
foreach ($append as $a) {
$product->appendChild($a);
}
// Remove this product's own (now empty) categoryPath
$product->removeChild($product->getElementsByTagName('categoryPath')->item(0));
}
// final result:
$result = $xml->saveXML();
echo $result;
?>
Output, with all values appended:
<?xml version="1.0"?>
<products>
<product>
<itemId>531670</itemId>
<modelNumber>METRA ELECTRONICS/MOBILE AUDIO</modelNumber>
<name>Buy</name><name>Car, Marine &amp; GPS</name><name>Car Installation Parts</name><name>Deck Installation Parts</name><name>Antennas &amp; Adapters</name></product>
</products>

You're modifying the DOM tree as you're pulling results from it. Any modifications to the tree that cover the results of a previous query operation (your getElementsByTagName) invalidate those results, so you're getting undefined results. This is especially true of operations that add/remove nodes.

You're moving nodes as you're iterating through them so 2 are being skipped. I'm not a php guy so I can't give you the code to do this, but what you need to do is build a collection of the name nodes and iterate through that collection in reverse.
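For what it's worth, here is a minimal PHP sketch of the "build a collection first" idea. It assumes iterator_to_array() is an acceptable way to copy the live node list into a plain array, and it uses a trimmed-down copy of the XML from the question. With a static copy the iteration order no longer matters, so this sketch iterates forwards to keep the original order of the names.

```php
<?php
// Trimmed-down sample based on the XML in the question.
$xmlString = <<<'XML'
<products>
  <product>
    <itemId>531670</itemId>
    <categoryPath>
      <category><name>Buy</name></category>
      <category><name>Car Installation Parts</name></category>
      <category><name>Deck Installation Parts</name></category>
    </categoryPath>
  </product>
</products>
XML;

$document = new DOMDocument();
$document->loadXML($xmlString);

foreach ($document->getElementsByTagName('product') as $product) {
    // Snapshot the live list as a plain array before moving anything.
    $names = iterator_to_array($product->getElementsByTagName('name'));
    foreach ($names as $name) {
        $product->appendChild($name);
    }
    // Remove the now empty categoryPath, if this product has one.
    $categoryPath = $product->getElementsByTagName('categoryPath')->item(0);
    if ($categoryPath !== null) {
        $categoryPath->parentNode->removeChild($categoryPath);
    }
}

// Collect the moved names so the result is easy to check.
$movedNames = [];
foreach ($document->getElementsByTagName('name') as $name) {
    $movedNames[] = $name->textContent;
}

echo $document->saveXML();
```

Nothing is added to or removed from the list being iterated in the last loop, so reading it live there is safe.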

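A less complicated way to do it is to move the nodes with insertBefore. Each name node is moved to a position before categoryPath, so the document order of the name nodes does not change and the live list is not disturbed: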
foreach($xml->getElementsByTagName('name') as $node){
$gp = $node->parentNode->parentNode;
$ggp = $gp->parentNode;
// move the node above gp without removing gp or parent
$ggp->insertBefore($node,$gp);
}
// remove the empty categoryPath node
$ggp->removeChild($gp);

Related

read xml content and add some numbers with php

I have an XML file and, using PHP, I would like to make some changes to certain products only (each item has its own id).
To explain in more detail: for some products I would like to add the shipping cost to the price and change the use_grid element from 1 to 0.
<Products>
<Product>
<sku>35</sku>
<sku_manufacturer>test sku</sku_manufacturer>
<manufacturer>test manufacturer</manufacturer>
<ean>800000000000</ean>
<title><![CDATA[title test]]></title>
<description><![CDATA[description]]></description>
<product_price_vat_inc>8.08</product_price_vat_inc>
<shipping_price_vat_inc>4.99</shipping_price_vat_inc>
<quantity>2842</quantity>
<brand><![CDATA[Finder]]></brand>
<merchant_category><![CDATA[Home/test category]]></merchant_category>
<product_url><![CDATA[https://www.example.com]]></product_url>
<image_1><![CDATA[https://www.example.com]]></image_1>
<image_2><![CDATA[]]></image_2>
<image_3><![CDATA[]]></image_3>
<image_4><![CDATA[]]></image_4>
<image_5><![CDATA[]]></image_5>
<retail_price_vat_inc/>
<product_vat_rate>22</product_vat_rate>
<shipping_vat_rate>22</shipping_vat_rate>
<manufacturer_pdf/>
<ParentSKU/>
<parent_title/>
<Cross_Sell_Sku/>
<ManufacturerWarrantyTime/>
<use_grid>1</use_grid>
<carrier>DHL</carrier>
<shipping_time>2#3</shipping_time>
<carrier_grid_1>DHL</carrier_grid_1>
<shipping_time_carrier_grid_1>2#3</shipping_time_carrier_grid_1>
<carrier_grid_2/>
<shipping_time_carrier_grid_2/>
<carrier_grid_3/>
<shipping_time_carrier_grid_3/>
<carrier_grid_4/>
<shipping_time_carrier_grid_4/>
<carrier_grid_5/>
<shipping_time_carrier_grid_5/>
<DisplayWeight>0.050000</DisplayWeight>
<free_return/>
<min_quantity>1</min_quantity>
<increment>1</increment>
<sales>0</sales>
<eco_participation>0</eco_participation>
<shipping_price_supplement_vat_inc>0</shipping_price_supplement_vat_inc>
<Unit_count>-1.000000</Unit_count>
<Unit_count_type/>
</Product>
</Products>
Your XML is a little large, so let's strip it down for the example:
$xmlString = <<<'XML'
<Products>
<Product>
<sku>35</sku>
<title><![CDATA[title test]]></title>
<use_grid>1</use_grid>
</Product>
<Product>
<sku>42</sku>
<title><![CDATA[title test two]]></title>
<use_grid>1</use_grid>
</Product>
</Products>
XML;
DOM is a standard API for XML manipulation. PHP supports it and Xpath expressions for fetching nodes.
$document = new DOMDocument('1.0', "UTF-8");
// $document->load($xmlFile);
$document->loadXML($xmlString);
// $xpath for fetching node using expressions
$xpath = new DOMXpath($document);
// iterate "Product" nodes with a specific "sku" child
foreach ($xpath->evaluate('//Product[sku="35"]') as $product) {
// output sku and title for validation
var_dump(
$xpath->evaluate('string(sku)', $product),
$xpath->evaluate('string(title)', $product)
);
// iterate the "use_grid" child elements
foreach ($xpath->evaluate('./use_grid', $product) as $useGrid) {
// output current value
var_dump(
$useGrid->textContent
);
// change it
$useGrid->textContent = "0";
}
}
echo "\n\n", $document->saveXML();
Output:
string(2) "35"
string(10) "title test"
string(1) "1"
<?xml version="1.0"?>
<Products>
<Product>
<sku>35</sku>
<title><![CDATA[title test]]></title>
<use_grid>0</use_grid>
</Product>
<Product>
<sku>42</sku>
<title><![CDATA[title test two]]></title>
<use_grid>1</use_grid>
</Product>
</Products>
DOMXpath::evaluate()
DOMXpath::evaluate() fetches nodes using an Xpath expression. The result type depends on the expression. A location path like //Product[sku="35"] will return a list of nodes (DOMNodeList). However, Xpath functions inside the expression can return a scalar value: string(sku) will return the text content of the first sku child node as a string, or an empty string if there is none.
DOMNode::$textContent
Reading $node->textContent will return all the text inside a node - including inside descendant elements.
Writing it replaces the content while taking care of the escaping.
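The question also asks about adding the shipping cost to the price. A minimal sketch of that step, assuming the element names product_price_vat_inc and shipping_price_vat_inc from the question's XML and that the sum should be written back with two decimals (the rounding choice is an assumption, not something the question states):

```php
<?php
// Trimmed-down sample based on the XML in the question.
$xmlString = <<<'XML'
<Products>
  <Product>
    <sku>35</sku>
    <product_price_vat_inc>8.08</product_price_vat_inc>
    <shipping_price_vat_inc>4.99</shipping_price_vat_inc>
    <use_grid>1</use_grid>
  </Product>
</Products>
XML;

$document = new DOMDocument();
$document->loadXML($xmlString);
$xpath = new DOMXpath($document);

foreach ($xpath->evaluate('//Product[sku="35"]') as $product) {
    // number() returns a float, or NAN if the element is missing or not numeric
    $price = $xpath->evaluate('number(product_price_vat_inc)', $product);
    $shipping = $xpath->evaluate('number(shipping_price_vat_inc)', $product);
    if (is_nan($price) || is_nan($shipping)) {
        continue; // skip products without usable prices
    }
    foreach ($xpath->evaluate('product_price_vat_inc', $product) as $priceNode) {
        // assumption: keep two decimals and a dot as the decimal separator
        $priceNode->textContent = number_format($price + $shipping, 2, '.', '');
    }
}

echo $document->saveXML();
```

Float arithmetic is usually fine for a display price rounded like this; for accounting-grade sums an integer (cents) representation may be the safer choice.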

Check if child exists? - SimpleXML (PHP)

I have several different XML files in which I renamed all the individual tags, so that every XML file uses the same tag names. That was easy because the function was customized for each XML file.
But instead of writing 7 new functions, one for each XML file, I now want to check whether an XML file has a specified child or not. Because if I write:
foreach ($items as $item) {
$node = dom_import_simplexml($item);
$title = $node->getElementsByTagName('title')->item(0)->textContent;
$price = $node->getElementsByTagName('price')->item(0)->textContent;
$url = $node->getElementsByTagName('url')->item(0)->textContent;
$publisher = $node->getElementsByTagName('publisher')->item(0)->textContent;
$category = $node->getElementsByTagName('category')->item(0)->textContent;
$platform = $node->getElementsByTagName('platform')->item(0)->textContent;
}
I sometimes get: PHP Notice: Trying to get property of non-object in ...
For example, here are two different XML files. One contains publisher and category, the other only platform:
XML 1:
<products>
<product>
<desc>This is a Test</desc>
<price>11.69</price>
<price_base>12.99</price_base>
<publisher>Stackoverflow</publisher>
<category>PHP</category>
</packshot>
<title>Check if child exists? - SimpleXML (PHP)</title>
<url>http://stackoverflow.com/questions/ask</url>
</product>
</products>
XML 2:
<products>
<product>
<image></image>
<title>Questions</title>
<price>23,90</price>
<url>google.de</url>
<platform>Stackoverflow</platform>
</product>
</products>
As you can see, an XML file sometimes contains publisher, category and platform and sometimes does not. It could also be that not every product node in a file contains all of these elements, as in the first example.
So I need to check each product node individually to see whether it contains publisher, category and/or platform.
How can I do that with SimpleXML?
I thought about a switch statement, but first I need to find out which children each node contains.
EDIT:
Maybe I found a solution. Is that a solution or not?
if($node->getElementsByTagName('platform')->item(0)){
echo $node->getElementsByTagName('platform')->item(0)->textContent . "\n";
}
Greetings and Thank You!
Here is one way to do it (working example):
$xml = "<products>
<product>
<desc>This is a Test</desc>
<price>11.69</price>
<price_base>12.99</price_base>
<publisher>Stackoverflow</publisher>
<category>PHP</category>
<title>Check if child exists? - SimpleXML (PHP)</title>
<url>http://stackoverflow.com/questions/ask</url>
</product>
</products>";
$xml = simplexml_load_string($xml);
#set fields to look for
foreach(['desc','title','price','publisher','category','platform','image','whatever'] as $path){
#get the first node
$result = $xml->xpath("product/{$path}[1]");
#validate and set
$coll[$path] = $result?(string)$result[0]:null;
#if you need here a local variable do (2 x $)
${$path} = $coll[$path];
}
#here i do array_filter() to remove all NULL entries
print_r(array_filter($coll));
#if local variables needed do
extract($coll);#this creates $desc, $price
Note: </packshot> is an unmatched closing tag that makes the XML invalid, so it is removed here.
XPath syntax: https://www.w3schools.com/xmL/xpath_syntax.asp
Firstly, you're over-complicating your code by switching from SimpleXML to DOM with dom_import_simplexml. The things you're doing with DOM can be done in much shorter code with SimpleXML.
Instead of this:
$node = dom_import_simplexml($item);
$title = $node->getElementsByTagName('title')->item(0)->textContent;
you can just use:
$title = (string)$item->title[0];
or even just:
$title = (string)$item->title;
To understand why this works, take a look at the SimpleXML examples in the manual.
Armed with that knowledge, you'll be amazed at how simple it is to see if a child exists or not:
if ( isset($item->title) ) {
$title = (string)$item->title;
} else {
echo "There is no title!";
}
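Putting that together for the optional elements from the question, a minimal sketch (the sample XML is a shortened mix of the two files above, and the $optionalFields list and $rows array are illustrative names, not part of the original code):

```php
<?php
// Shortened sample: the two products carry different optional children.
$xmlString = <<<'XML'
<products>
  <product>
    <title>Questions</title>
    <price>23,90</price>
    <platform>Stackoverflow</platform>
  </product>
  <product>
    <title>Check if child exists?</title>
    <price>11.69</price>
    <publisher>Stackoverflow</publisher>
    <category>PHP</category>
  </product>
</products>
XML;

$xml = simplexml_load_string($xmlString);
$optionalFields = ['publisher', 'category', 'platform'];

$rows = [];
foreach ($xml->product as $item) {
    $row = ['title' => (string) $item->title];
    foreach ($optionalFields as $field) {
        // isset() is true only if the product has at least one such child
        $row[$field] = isset($item->{$field}) ? (string) $item->{$field} : null;
    }
    $rows[] = $row;
}

print_r($rows);
```

Missing children end up as null instead of triggering a notice, so the same loop can handle every file.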

XML: Delete parent with simplexml_import_dom (PHP)

I want to delete those entries where the title matches my $titleArray.
My XML files looks like:
<products>
<product>
<title>Battlefield 1</title>
<url>https://www.google.de/</url>
<price>0.80</price>
</product>
<product>
<title>Battlefield 2</title>
<url>https://www.google.de/</url>
<price>180</price>
</product>
</products>
Here is my code, but I don't think it is working, and my IDE flags the line $node->removeChild($product); with "Expected DOMNode, got DOMNodeList".
What is wrong and how can I fix that?
function removeProduct($dom, $productTag, $pathXML, $titleArray){
$doc = simplexml_import_dom($dom);
$items = $doc->xpath($pathXML);
foreach ($items as $item) {
$node = dom_import_simplexml($item);
foreach ($titleArray as $title) {
if (mb_stripos($node->textContent, $title) !== false) {
$product = $node->parentNode->getElementsByTagName($productTag);
$node->removeChild($product);
}
}
}
}
Thank you and Greetings!
Most DOM methods that fetch nodes return a list of nodes, because you can have several element nodes with the same name. So the result will be a list (an empty list if nothing is found). You can traverse the list and apply logic to each node in it.
There are two problems with the approach. Removing nodes modifies the document, so you have to be careful not to remove a node that you still use afterwards; that can lead to unexpected results. Also, DOMElement::getElementsByTagName() returns a node list that is a "live" result: if you remove the first node, the list itself changes, not just the XML document.
DOMXpath::evaluate() solves both problems at the same time. The result is not "live", so you can iterate it with foreach() and remove nodes. Xpath expressions allow for conditions, so you can filter and fetch specific nodes. Unfortunately Xpath 1.0 has no lower-case function, but you can call back into PHP for that.
function isTitleInArray($title) {
$titles = [
'battlefield 2'
];
return in_array(mb_strtolower($title, 'UTF-8'), $titles);
}
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPHPFunctions('isTitleInArray');
$expression = '//product[php:function("isTitleInArray", normalize-space(title))]';
foreach ($xpath->evaluate($expression) as $product) {
$product->parentNode->removeChild($product);
}
echo $document->saveXml();
Output:
<?xml version="1.0"?>
<products>
<product>
<title>Battlefield 1</title>
<url>https://www.google.de/</url>
<price>0.80</price>
</product>
</products>
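If calling back into PHP is not wanted, a possible alternative is the Xpath 1.0 translate() function, which can map upper-case letters to lower-case ones. This is only a sketch under the assumption that ASCII case folding is enough for the titles; the sample XML is shortened from the question.

```php
<?php
// Shortened sample based on the XML in the question.
$xml = <<<'XML'
<products>
  <product><title>Battlefield 1</title><price>0.80</price></product>
  <product><title>Battlefield 2</title><price>180</price></product>
</products>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);

// translate() replaces each character of the 2nd argument with the
// character at the same position in the 3rd argument (ASCII only here).
$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';
$expression = sprintf(
    '//product[translate(normalize-space(title), "%s", "%s") = "battlefield 2"]',
    $upper,
    $lower
);

foreach ($xpath->evaluate($expression) as $product) {
    $product->parentNode->removeChild($product);
}

// Collect what is left so the result is easy to check.
$remainingTitles = [];
foreach ($xpath->evaluate('//product/title') as $title) {
    $remainingTitles[] = $title->textContent;
}

echo $document->saveXML();
```

For titles with non-ASCII letters the PHP callback shown above stays the more robust option, because mb_strtolower() handles them.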

php xml remove elements not containing a specific word from large file

I am reading an xml file which looks like this but with a lot more products:
<?xml version="1.0" encoding="iso-8859-1"?>
<products>
<product>
<company>company.com</company>
<category>Category A</category>
<brand>Alle!rgica</brand>
<product_name>Name A</product_name>
<productid>6230</productid>
<description>A nice description</description>
<price>125.50</price>
</product>
<product>
<company>Team.com</company>
<category>Category B // something</category>
<brand>New Nordic > Healthcare</brand>
<product_name>Name B</product_name>
<productid>9489</productid>
<description>Active Legs? Buy it now for free</description>
<price>188.00</price>
</product>
</products>
I want to read it and then save it with only products containing the word "free" somewhere in the "product tag" and without the "products" tag and the xml header.
I know how to read the file and save it, but I can't figure out the best approach to remove everything but the products that contain "free".
I tried with a regex, but it didn't seem to be the best solution (mainly because the matching doesn't work properly):
preg_match_all('/<product>(.*?)(free|free-stuff)(.*?)<\/product>/is', $data, $result);
So in the case of the above the file should only contain:
<product>
<company>Team.com</company>
<category>Category B // something</category>
<brand>New Nordic > Healthcare æøå</brand>
<product_name>Name B</product_name>
<productid>9489</productid>
<description>Active Legs? Buy it now for free</description>
<price>188.00</price>
</product>
Use xpath():
$xml = simplexml_load_string($x); // assume XML in $x
$result = $xml->xpath("//product[contains(., 'free')]");
$result contains an array of the <product> nodes, as SimpleXML elements, that contain "free" somewhere in their text. To select the products that do not contain it, wrap the condition in not().
Output:
foreach ($result as $r)
echo $r->asXML();
See it working: https://eval.in/338884
Use this code:
$xml = simplexml_load_file($filename);
$products = array();
foreach($xml->product as $product) {
// reset the flag for every product
$found = false;
foreach($product->children() as $child) {
// look up the pattern in all nodes inside product
if (false !== strpos((string)$child, 'free')) {
// found, no need to keep searching
$found = true;
break;
}
}
// save the product if the pattern was found
if ($found) $products[] = $product;
}
print_r($products);
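To get the output the question asks for (only the matching products, without the XML header and without the products wrapper), the matches can be serialized one by one. A minimal sketch, assuming a shortened copy of the XML from the question; the file name in the commented-out line is hypothetical.

```php
<?php
// Shortened sample based on the XML in the question.
$xmlString = <<<'XML'
<?xml version="1.0" encoding="iso-8859-1"?>
<products>
  <product>
    <productid>6230</productid>
    <description>A nice description</description>
  </product>
  <product>
    <productid>9489</productid>
    <description>Active Legs? Buy it now for free</description>
  </product>
</products>
XML;

$xml = simplexml_load_string($xmlString);

$output = '';
foreach ($xml->xpath("//product[contains(., 'free')]") as $product) {
    // asXML() on a child element returns only that element's markup,
    // without the XML declaration
    $output .= $product->asXML() . "\n";
}

// file_put_contents('filtered.xml', $output); // hypothetical target file
echo $output;
```

Note that contains() is case-sensitive, so "Free" would not match. This also loads the whole document into memory; for very large files a streaming parser such as XMLReader may be a better fit.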

SimpleXML: trouble with parent with attributes

I need help updating some SimpleXML code I wrote a long time ago. The XML file I'm parsing is now formatted in a new way, and I can't figure out how to navigate it.
Example of old XML format:
<?xml version="1.0" encoding="UTF-8"?>
<pf version="1.0">
<pinfo>
<pid><![CDATA[test1 pid]]></pid>
<picture><![CDATA[http://test1.image]]></picture>
</pinfo>
<pinfo>
<pid><![CDATA[test2 pid]]></pid>
<picture><![CDATA[http://test2.image]]></picture>
</pinfo>
</pf>
and then the new XML format (note "category name" added):
<?xml version="1.0" encoding="UTF-8"?>
<pf version="1.2">
<category name="Cname1">
<pinfo>
<pid><![CDATA[test1 pid]]></pid>
<picture><![CDATA[http://test1.image]]></picture>
</pinfo>
</category>
<category name="Cname2">
<pinfo>
<pid><![CDATA[test2 pid]]></pid>
<picture><![CDATA[http://test2.image]]></picture>
</pinfo>
</category>
</pf>
And below the old code for parsing that doesn't work since the addition of "category name" in the XML:
$pinfo = new SimpleXMLElement($_SERVER['DOCUMENT_ROOT'].'/xml/file.xml', null, true);
foreach($pinfo as $resource)
{
$Profile_id = $resource->pid;
$Image_url = $resource->picture;
// and then some echo´ing of the collected data inside the loop
}
What do I need to add or do completely different? I tried with xpath,children and sorting by attributes but no luck - SimpleXML has always been a mystery to me :)
You were iterating over all <pinfo> elements located in the root element previously:
foreach ($pinfo as $resource)
Now all <pinfo> elements have moved from the root element into the <category> elements. You now need to query those elements first:
foreach ($pinfo->xpath('/*/category/pinfo') as $resource)
The variable name $pinfo is now misleading, because it holds the root element rather than a <pinfo> element, so it is better to rename things a little:
$xml = new SimpleXMLElement($_SERVER['DOCUMENT_ROOT'].'/xml/file.xml', null, true);
$pinfos = $xml->xpath('/*/category/pinfo');
foreach ($pinfos as $pinfo) {
$Profile_id = $pinfo->pid;
$Image_url = $pinfo->picture;
// ... and then some echo´ing of the collected data inside the loop
}
When you load the XML file, the root element's children are now the <category> elements, and the <pinfo> elements you are used to parsing are contained within each of them. All you need to do is wrap your current code in another foreach. Other than that there isn't much to change.
foreach($pinfo as $category)
{
foreach($category as $resource)
{
$Profile_id = $resource->pid;
$Image_url = $resource->picture;
// and then some echo´ing of the collected data inside the loop
}
}
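If the category name is needed as well, SimpleXML exposes attributes through array-style access. A minimal sketch, assuming the new XML format from the question loaded from a string instead of a file; the $profiles array is an illustrative name.

```php
<?php
// New XML format from the question, inlined for the example.
$xmlString = <<<'XML'
<pf version="1.2">
  <category name="Cname1">
    <pinfo>
      <pid><![CDATA[test1 pid]]></pid>
      <picture><![CDATA[http://test1.image]]></picture>
    </pinfo>
  </category>
  <category name="Cname2">
    <pinfo>
      <pid><![CDATA[test2 pid]]></pid>
      <picture><![CDATA[http://test2.image]]></picture>
    </pinfo>
  </category>
</pf>
XML;

$xml = new SimpleXMLElement($xmlString);

$profiles = [];
foreach ($xml->category as $category) {
    // attributes are read with array syntax; cast to get a plain string
    $categoryName = (string) $category['name'];
    foreach ($category->pinfo as $pinfo) {
        $profiles[] = [
            'category' => $categoryName,
            'pid'      => (string) $pinfo->pid,
            'picture'  => (string) $pinfo->picture,
        ];
    }
}

print_r($profiles);
```

Casting to string matters here: without it the array would hold SimpleXMLElement objects rather than the text values.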
