Search and replace in SimpleXMLElement - php

I loop through an XML file. Once I find a certain node, I take its value and change all its occurrences in the whole document.
XML
<?xml version="1.0" encoding="UTF-8"?>
<catalog catalog-id="my_catalog">
<product product-id="11111111">
......
<new-id>aaaaaaa</new-id>
</product>
<product product-id="2222222">
......
<new-id>bbbbbbb</new-id>
</product>
</catalog>
For each <product> when I find <new-id> I need to replace <product product-id=""> with it which is fine, but also I need to replace all occurrences of 11111111 with aaaaaaa in the whole XML document.
PHP
$catalog = new SimpleXMLElement($file);
header('Content-Type: text/xml');
foreach ($catalog as $product) {
    if ($product->getName() == 'product') {
        $product['product-id'] = $product->{'new-id'};
        // code here to replace all occurrences of the old product-id with $product->{'new-id'}
        ...
    }
}
echo $catalog->asXML();
Is there a str_replace type of way to replace all occurrences of a string value with a different string?
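One possible approach, sketched under assumptions rather than taken from an answer: collect the old-id to new-id pairs in a first pass, then run a string replacement over the serialized document. The sketch uses strtr() with an array instead of str_replace() so that already-replaced text is never replaced again. It assumes the ids are distinctive enough that a plain string replacement will not hit unrelated text. The <related> element and the variable names $map and $result are hypothetical, added only to show an id occurring elsewhere in the document.

```php
<?php
// Illustrative input; the real document would come from $file.
// <related> is a hypothetical element standing in for the elided content.
$file = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<catalog catalog-id="my_catalog">
  <product product-id="11111111">
    <related>11111111</related>
    <new-id>aaaaaaa</new-id>
  </product>
  <product product-id="2222222">
    <related>11111111</related>
    <new-id>bbbbbbb</new-id>
  </product>
</catalog>
XML;

$catalog = new SimpleXMLElement($file);

// First pass: map each old product-id to its new-id.
$map = array();
foreach ($catalog->product as $product) {
    if (isset($product->{'new-id'})) {
        $map[(string) $product['product-id']] = (string) $product->{'new-id'};
    }
}

// Second pass: replace every occurrence in the serialized XML.
// strtr() with an array tries the longest keys first and does not
// re-replace text it has already substituted.
$result = strtr($catalog->asXML(), $map);

echo $result;
```

Because this works on the raw string, it also rewrites the product-id attributes, so no separate assignment is needed.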

Related

read xml content and add some numbers with php

I have an XML file, and through PHP I would like to make changes for only some products (each item has its own id).
To explain: for some products I would like to add the shipping cost to the price and change the use_grid item from 1 to 0.
<Products>
<Product>
<sku>35</sku>
<sku_manufacturer>test sku</sku_manufacturer>
<manufacturer>test manufacturer</manufacturer>
<ean>800000000000</ean>
<title><![CDATA[title test]]></title>
<description><![CDATA[description]]></description>
<product_price_vat_inc>8.08</product_price_vat_inc>
<shipping_price_vat_inc>4.99</shipping_price_vat_inc>
<quantity>2842</quantity>
<brand><![CDATA[Finder]]></brand>
<merchant_category><![CDATA[Home/test category]]></merchant_category>
<product_url><![CDATA[https://www.example.com]]></product_url>
<image_1><![CDATA[https://www.example.com]]></image_1>
<image_2><![CDATA[]]></image_2>
<image_3><![CDATA[]]></image_3>
<image_4><![CDATA[]]></image_4>
<image_5><![CDATA[]]></image_5>
<retail_price_vat_inc/>
<product_vat_rate>22</product_vat_rate>
<shipping_vat_rate>22</shipping_vat_rate>
<manufacturer_pdf/>
<ParentSKU/>
<parent_title/>
<Cross_Sell_Sku/>
<ManufacturerWarrantyTime/>
<use_grid>1</use_grid>
<carrier>DHL</carrier>
<shipping_time>2#3</shipping_time>
<carrier_grid_1>DHL</carrier_grid_1>
<shipping_time_carrier_grid_1>2#3</shipping_time_carrier_grid_1>
<carrier_grid_2/>
<shipping_time_carrier_grid_2/>
<carrier_grid_3/>
<shipping_time_carrier_grid_3/>
<carrier_grid_4/>
<shipping_time_carrier_grid_4/>
<carrier_grid_5/>
<shipping_time_carrier_grid_5/>
<DisplayWeight>0.050000</DisplayWeight>
<free_return/>
<min_quantity>1</min_quantity>
<increment>1</increment>
<sales>0</sales>
<eco_participation>0</eco_participation>
<shipping_price_supplement_vat_inc>0</shipping_price_supplement_vat_inc>
<Unit_count>-1.000000</Unit_count>
<Unit_count_type/>
</Product>
</Products>
Your XML is a little large, so let's strip it down for the example:
$xmlString = <<<'XML'
<Products>
<Product>
<sku>35</sku>
<title><![CDATA[title test]]></title>
<use_grid>1</use_grid>
</Product>
<Product>
<sku>42</sku>
<title><![CDATA[title test two]]></title>
<use_grid>1</use_grid>
</Product>
</Products>
XML;
DOM is a standard API for XML manipulation. PHP supports it and Xpath expressions for fetching nodes.
$document = new DOMDocument('1.0', "UTF-8");
// $document->load($xmlFile);
$document->loadXML($xmlString);
// $xpath for fetching node using expressions
$xpath = new DOMXpath($document);
// iterate "Product" nodes with a specific "sku" child
foreach ($xpath->evaluate('//Product[sku="35"]') as $product) {
// output sku and title for validation
var_dump(
$xpath->evaluate('string(sku)', $product),
$xpath->evaluate('string(title)', $product)
);
// iterate the "use_grid" child elements
foreach ($xpath->evaluate('./use_grid', $product) as $useGrid) {
// output current value
var_dump(
$useGrid->textContent
);
// change it
$useGrid->textContent = "0";
}
}
echo "\n\n", $document->saveXML();
Output:
string(2) "35"
string(10) "title test"
string(1) "1"
<?xml version="1.0"?>
<Products>
<Product>
<sku>35</sku>
<title><![CDATA[title test]]></title>
<use_grid>0</use_grid>
</Product>
<Product>
<sku>42</sku>
<title><![CDATA[title test two]]></title>
<use_grid>1</use_grid>
</Product>
</Products>
DOMXPath::evaluate()
DOMXPath::evaluate() fetches nodes using an XPath expression. The result type depends on the expression. A location path like //Product[sku="35"] returns a list of nodes (DOMNodeList). However, XPath functions inside the expression can return a scalar value: string(sku) returns the text content of the first sku child node as a string, or an empty string if there is none.
DOMNode::$textContent
Reading $node->textContent will return all the text inside a node - including inside descendant elements.
Writing it replaces the content while taking care of the escaping.
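The other half of the question, adding the shipping cost to the price, can be handled the same way. The following is a minimal sketch, assuming the element names from the question (product_price_vat_inc, shipping_price_vat_inc) and a two-decimal price format. The second product and the variable $newPrice are made up for illustration.

```php
<?php
$xmlString = <<<'XML'
<Products>
  <Product>
    <sku>35</sku>
    <product_price_vat_inc>8.08</product_price_vat_inc>
    <shipping_price_vat_inc>4.99</shipping_price_vat_inc>
  </Product>
  <Product>
    <sku>42</sku>
    <product_price_vat_inc>10.00</product_price_vat_inc>
    <shipping_price_vat_inc>4.99</shipping_price_vat_inc>
  </Product>
</Products>
XML;

$document = new DOMDocument();
$document->loadXML($xmlString);
$xpath = new DOMXPath($document);

foreach ($xpath->evaluate('//Product[sku="35"]') as $product) {
    // number() converts the element text to a float (NAN if missing or not numeric)
    $price    = $xpath->evaluate('number(product_price_vat_inc)', $product);
    $shipping = $xpath->evaluate('number(shipping_price_vat_inc)', $product);
    if (is_nan($price) || is_nan($shipping)) {
        continue; // skip products without usable numbers
    }
    // write the sum back, formatted with two decimals (assumed format)
    foreach ($xpath->evaluate('product_price_vat_inc', $product) as $priceNode) {
        $priceNode->textContent = number_format($price + $shipping, 2, '.', '');
    }
}

// read the updated value back for checking
$newPrice = $xpath->evaluate('string(//Product[sku="35"]/product_price_vat_inc)');
echo $newPrice, "\n";
```

Whether shipping_price_vat_inc should also be reset after the addition depends on how the feed is consumed, so the sketch leaves it untouched.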

Parse through XML childs

I have the following XML:
<?xml version="1.0" standalone="yes"?>
<Products>
<Product>
<name>Milk</name>
<price>1.4</price>
<productinfos>
<category1 value="somecategory1"/>
<category2 value="somecategory2"/>
<category3 value="somecategory3"/>
</productinfos>
</Product>
</Products>
How can I make sure that the productinfos children category1, category2 and category3 exist and are not empty strings? And what does the loop look like if I want the following output:
//output
Cat1: somecategory1
Cat3: somecategory3
Cat2: somecategory2
Sometimes the XML I parse looks different:
<?xml version="1.0" standalone="yes"?>
<Products>
<Product>
<name>Milk</name>
<price>1.4</price>
<productinfos>
<category1 value=""/>
<category3 value="somecategory"/>
</productinfos>
</Product>
</Products>
In the above example, how can I check whether category2 exists?
Thanks in advance for your efforts!
You're looking for the SimpleXMLElement::children() method.
https://secure.php.net/manual/en/simplexmlelement.children.php
$xml = new SimpleXMLElement(<<<XML
<?xml version="1.0" standalone="yes"?>
<Products>
<Product>
<name>Milk</name>
<price>1.4</price>
<productinfos>
<category1 value="somecategory1"/>
<category2 value="somecategory2"/>
<category3 value="somecategory3"/>
</productinfos>
</Product>
</Products>
XML
);
// $xml is a SimpleXMLElement of <Products>
foreach ($xml->children() as $product) {
if ($product->getName() != 'Product') {
// ignore <Products><Cow> or whatever, if you care
continue;
}
// start out assuming that everything is missing
$missing_tags = array(
'category1' => true,
'category2' => true,
'category3' => true,
);
// iterate through child tags of <productinfos>
foreach ($product->productinfos->children() as $productinfo) {
// element name is accessed using the getName() method, and
// XML attributes can be accessed like an array
if (isset($missing_tags[$productinfo->getName()]) &&
!empty($productinfo['value'])) {
$missing_tags[$productinfo->getName()] = false;
echo $productinfo->getName() . ": " . $productinfo['value'] . "\n";
}
}
// array_filter with one argument drops entries whose value is false,
// leaving only the tags that are still marked as missing
$still_missing = array_keys(array_filter($missing_tags));
if ($still_missing) {
echo "Missing tags: " . implode(", ", $still_missing) . "\n";
}
}
The SimpleXML extension is rather less intuitive than the name would suggest, but it's about as simple as you can get with XML...
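For the narrower question of whether a single element such as category2 exists, isset() on the child property may be enough. The sketch below runs against the second XML sample from the question and separates "missing" from "present but empty". The $status array and its labels are illustrative names, not part of the answer above.

```php
<?php
$xml = new SimpleXMLElement(<<<'XML'
<?xml version="1.0" standalone="yes"?>
<Products>
  <Product>
    <name>Milk</name>
    <price>1.4</price>
    <productinfos>
      <category1 value=""/>
      <category3 value="somecategory"/>
    </productinfos>
  </Product>
</Products>
XML
);

$status = array();
foreach ($xml->Product as $product) {
    $infos = $product->productinfos;
    foreach (array('category1', 'category2', 'category3') as $name) {
        if (!isset($infos->{$name})) {
            // the element is not in the document at all
            $status[$name] = 'missing';
        } elseif ((string) $infos->{$name}['value'] === '') {
            // the element exists, but its value attribute is empty or absent
            $status[$name] = 'empty';
        } else {
            $status[$name] = (string) $infos->{$name}['value'];
        }
    }
}
print_r($status);
```

Casting the attribute to a string before comparing avoids relying on how SimpleXMLElement objects behave in loose comparisons.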

php xml remove elements not containing a specific word from large file

I am reading an xml file which looks like this but with a lot more products:
<?xml version="1.0" encoding="iso-8859-1"?>
<products>
<product>
<company>company.com</company>
<category>Category A</category>
<brand>Alle!rgica</brand>
<product_name>Name A</product_name>
<productid>6230</productid>
<description>A nice description</description>
<price>125.50</price>
</product>
<product>
<company>Team.com</company>
<category>Category B // something</category>
<brand>New Nordic > Healthcare</brand>
<product_name>Name B</product_name>
<productid>9489</productid>
<description>Active Legs? Buy it now for free</description>
<price>188.00</price>
</product>
</products>
I want to read it and then save it with only products containing the word "free" somewhere in the "product tag" and without the "products" tag and the xml header.
I know how to read the file and save it, but I can't figure out the best approach to remove everything but the products that contain "free".
I tried with a regex, but it didn't seem like the best solution (mainly because the matching doesn't work properly):
preg_match_all('/<product>(.*?)(free|free-stuff)(.*?)<\/product>/is', $data, $result);
So in the case of the above the file should only contain:
<product>
<company>Team.com</company>
<category>Category B // something</category>
<brand>New Nordic > Healthcare æøå</brand>
<product_name>Name B</product_name>
<productid>9489</productid>
<description>Active Legs? Buy it now for free</description>
<price>188.00</price>
</product>
Use xpath():
$xml = simplexml_load_string($x); // assume XML in $x
$result = $xml->xpath("//product[contains(., 'free')]");
$result contains an array of <product> nodes, as SimpleXML elements, that contain "free".
Output:
foreach ($result as $r)
echo $r->asXML();
See it working: https://eval.in/338884
Use this code:
$xml = simplexml_load_file($filename);
$products = array();
foreach ($xml->product as $product) {
    $found = false;
    // look for the word in every child node of the product
    foreach ($product->children() as $child) {
        if (false !== strpos((string) $child, 'free')) {
            // found - no need to keep searching
            $found = true;
            break;
        }
    }
    // keep the product if it matched
    if ($found) {
        $products[] = $product;
    }
}
print_r($products);
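To get from the matched elements to the file the question asks for (no XML header, no <products> wrapper), one option is to concatenate the asXML() output of each match. This is a rough sketch under assumptions: the sample data is shortened, contains() is case-sensitive, and the output file name in the comment is hypothetical.

```php
<?php
$data = <<<'XML'
<?xml version="1.0" encoding="iso-8859-1"?>
<products>
  <product>
    <company>company.com</company>
    <description>A nice description</description>
  </product>
  <product>
    <company>Team.com</company>
    <description>Active Legs? Buy it now for free</description>
  </product>
</products>
XML;

$xml = simplexml_load_string($data);

$output = '';
foreach ($xml->xpath("//product[contains(., 'free')]") as $product) {
    // asXML() on a child element returns only that element,
    // without the XML declaration or the surrounding root
    $output .= $product->asXML() . "\n";
}

echo $output;
// file_put_contents('free-products.xml', $output); // hypothetical target file
```

Note that the result is a sequence of <product> fragments rather than a well-formed XML document, which is what the question describes.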

How should I modify this for each statement to only display entries where all of the values are unique?

foreach($resultXML->products->children() as $product) {
echo "<p>".$product->{'advertiser-name'}." - ".$product->price."</p>
<p>".$product->{'description'}."</p>";
}
Suppose I wanted to screen out the ones that had the same title, and only display the first title that appears in the return results.
I'm not working with my own database, this is all about what's displayed.
I suppose the easiest way would be to keep track of the titles in an array and check it on each iteration.
$titles = array();
foreach ($resultXML->products->children() as $product) {
    // cast to string so in_array() compares text, not SimpleXMLElement objects
    $title = (string) $product->title;
    if (in_array($title, $titles)) continue;
    $titles[] = $title;
    echo "<p>".$product->{'advertiser-name'}." - ".$product->price."</p>
<p>".$product->{'description'}."</p>";
}
Assuming that the title is contained in $product->title. You could do something fancier through array functions, but I don't see a reason to make a simple problem complicated.
You have not provided any example XML, so let's assume the following:
<?xml version="1.0" encoding="UTF-8"?>
<example>
<products>
<product>
<title>First Product</title>
<advertiser-name>First Name</advertiser-name>
</product>
</products>
<products>
<product>
<title>Second Product</title>
<advertiser-name>First Name</advertiser-name>
</product>
</products>
<products>
<product>
<title>Third Product</title>
<advertiser-name>Second Name</advertiser-name>
</product>
</products>
</example>
You want to get all product elements whose advertiser-name does not match the advertiser-name of any preceding product element.
So for the XML above, that would be the 1st and 3rd product element.
You can write that down as an XPath expression:
/*/products/product[not(advertiser-name = preceding::product/advertiser-name)]
And as PHP code:
$xml = simplexml_load_string($buffer);
$expr = '/*/products/product[not(advertiser-name = preceding::product/advertiser-name)]';
foreach ($xml->xpath($expr) as $product) {
echo $product->asXML(), "\n";
}
This produces the following output:
<product>
<title>First Product</title>
<advertiser-name>First Name</advertiser-name>
</product>
<product>
<title>Third Product</title>
<advertiser-name>Second Name</advertiser-name>
</product>
So one answer to your question therefore is: Only query those elements from the document you're interested in. XPath can be used for that with SimpleXMLElement.
Related questions:
Implementing condition in XPath

PHP SimpleXMLElement not parsing <title> node

I have a simple well-formed XML doc that I'm writing to the page using PHP. For some reason the output never includes the title node, and after researching I can't figure this out. If I change the title node to 'heading' or some other name, it is included in the output, but when it's named 'title', the node is skipped.
Here's the XML doc code...
<?xml version="1.0" encoding="UTF-8"?>
<items>
<product>
<id>cd1</id>
<title>CD One</title>
<description>This is my first CD</description>
<img>/images/sample.jpg</img>
<price>14.99</price>
</product>
</items>
The PHP code looks like this...
<?php
$filename = '../catalog.xml';
$contents = file_get_contents($filename);
echo $contents;
?>
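The XML you posted is well-formed, and the encoding name is not the problem: XML encoding names are matched case-insensitively, so "UTF-8" and "utf-8" are equivalent. The script echoes the raw file, so if the result is viewed as an HTML page, the browser most likely treats <title> as the page title instead of displaying it. View the page source to confirm that the node is present.
For reference, the same document with a lowercase encoding name: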
<?xml version="1.0" encoding="utf-8"?>
<items>
<product>
<id>cd1</id>
<title>CD One</title>
<description>This is my first CD</description>
<img>/images/sample.jpg</img>
<price>14.99</price>
</product>
</items>
Validate here: http://validator.w3.org/check
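If the goal is to display the markup in a browser, a likely fix (an assumption about the setup, since the question does not show how the page is served) is to escape the content before echoing it, or to send it with an XML content type. The sketch uses an inline string in place of file_get_contents('../catalog.xml'), and $escaped is an illustrative name.

```php
<?php
// Stand-in for file_get_contents('../catalog.xml')
$contents = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<items>
  <product>
    <id>cd1</id>
    <title>CD One</title>
  </product>
</items>
XML;

// Option 1: show the markup as text inside an HTML page.
// htmlspecialchars() turns < and > into entities so the browser
// prints the tags instead of interpreting them.
$escaped = htmlspecialchars($contents, ENT_QUOTES, 'UTF-8');
echo '<pre>', $escaped, '</pre>';

// Option 2 (instead of option 1): serve the document as XML.
// header('Content-Type: text/xml; charset=utf-8');
// echo $contents;
```

With option 1 the title line appears in the page as text, which makes it easy to confirm that the file itself was never missing the node.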
