PHP String to Valid XML file

I have a problem converting my string which has a structure of XML into a proper XML file.
My String looks like:
<product>
<ID>12345</ID>
<NAME></NAME>
</product>
<product>
<ID>123</ID>
<NAME></NAME>
</product>
And so on. The problem is that I get an empty result when I use DOM:
$dom = new DomDocument('1.0', 'UTF-8');
$dom->loadXML($products);
$xml = $dom->saveXML($dom);
Output is:
string(39) "<?xml version="1.0" encoding="UTF-8"?>
"
How can I make this work? Or should I just add the html and root tags to the string and write it to the file?

Your XML is not well-formed: an XML document requires exactly one root element.
If you change your XML to something like this:
<products>
<product>
<ID>12345</ID>
<NAME></NAME>
</product>
<product>
<ID>123</ID>
<NAME></NAME>
</product>
</products>
It should work as expected.
Adjusted code:
<?php
$dom = new DomDocument('1.0', 'UTF-8');
$dom->loadXML('<products>
<product>
<ID>12345</ID>
<NAME></NAME>
</product>
<product>
<ID>123</ID>
<NAME></NAME>
</product>
</products>');
echo $dom->saveXML();
Outputs:
<?xml version="1.0"?>
<products>
<product>
<ID>12345</ID>
<NAME/>
</product>
<product>
<ID>123</ID>
<NAME/>
</product>
</products>
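If the fragment arrives as a string, as in the question, one option is to wrap it in a root element before parsing and then write the result to a file. This is only a minimal sketch: the root name products and the output path are assumptions chosen for illustration, and the fragment is assumed to be held in $products.

```php
<?php
// Fragment without a root element, as in the question.
$products = '<product><ID>12345</ID><NAME></NAME></product>'
          . '<product><ID>123</ID><NAME></NAME></product>';

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->preserveWhiteSpace = false;
$dom->formatOutput = true;

// Wrap the fragment in a single root element (name chosen for illustration).
if (!$dom->loadXML('<products>' . $products . '</products>')) {
    throw new RuntimeException('The wrapped string is still not well-formed XML.');
}

// loadXML() resets the document, so set the encoding again to keep it in the declaration.
$dom->encoding = 'UTF-8';

// DOMDocument::save() writes the document to a file and returns the byte count or false.
$outputFile = sys_get_temp_dir() . '/products.xml'; // hypothetical path
$bytesWritten = $dom->save($outputFile);
```

Checking the return values of loadXML() and save() is what tells you whether the string really was valid once wrapped.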

Related

I need to customize xml tags

I need to make some changes to the XML tags I use. I need to wrap some of these tags inside other tags.
For example, the tag
<image> image_url </image>
must become
<Images>
    <Image> image_url <image>
</Images>
I may also need to add some tags below or above an existing tag.
For example:
<productname> pr_name </productname>
<newproductname> pr_name </newproductname>
Test xml output:
<Root>
<Products>
<Product>
<productname> pr_name </productname>
<price> pr_price </price>
<sku> pr_sku </sku>
<Image> image_url <image>
<Product>
</Products>
</Root>
I want it this way :)
<Root>
<Products>
<Product>
<newproductname> pr_name </newproductname>
<productname> pr_name </productname>
<price> pr_price </price>
<sku> pr_sku </sku>
<Images>
    <Image> image_url <image>
</Images>
<newsku> pr_newsku </newsku>
<Product>
</Products>
</Root>
How do I make these changes to the structure with the code I currently use?
header('Content-Type: application/xml');
$xml = new DOMDocument();
$xml->preserveWhiteSpace = false;
$xml->formatOutput = true;
$xml->load('test.xml');
$xml_string = $xml->saveXML();
echo $xml_string;
Given the sample XML below, which is based on the one in the question but slightly expanded and with the badly formed tags corrected, you can achieve the desired structure with careful use of DOMXPath and insertBefore, with a little twist on the latter to actually perform insertAfter.
$xml='<?xml version="1.0" encoding="UTF-8"?>
<Root>
<Products>
<Product>
<productname> pr_name 1</productname>
<price> pr_price 1</price>
<sku> pr_sku 1</sku>
<Image> image_url 1.1</Image>
<Image> image_url 1.2</Image>
</Product>
<Product>
<productname> pr_name 2</productname>
<price> pr_price 2</price>
<sku> pr_sku 2</sku>
<Image> image_url 2</Image>
</Product>
<Product>
<productname> pr_name 3</productname>
<price> pr_price 3</price>
<sku> pr_sku 3</sku>
<Image> image_url 3.1</Image>
<Image> image_url 3.2</Image>
<Image> image_url 3.3</Image>
</Product>
<Product>
<productname> pr_name 4</productname>
<price> pr_price 4</price>
<sku> pr_sku 4</sku>
</Product>
</Products>
</Root>';
$dom=new DOMDocument('1.0','utf-8');
$dom->formatOutput=true;
$dom->validateOnParse=false;
$dom->recover=true;
$dom->strictErrorChecking=false;
$dom->preserveWhiteSpace=false;
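/* buffer parse errors internally, otherwise libxml_get_errors() returns nothing */
libxml_use_internal_errors( true );
$dom->loadXML( $xml );
$errors = libxml_get_errors();
libxml_clear_errors();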
$xp=new DOMXPath( $dom );
$col=$xp->query( '//Products/Product' );
if( $col->length > 0 ){
foreach( $col as $node ){
/* Get the `productname` and create a new element before with same value */
$productname=$xp->query( 'productname', $node )->item(0);
$newproductname=$dom->createElement( 'newproductname', $productname->textContent );
$node->insertBefore( $newproductname, $productname );
/* Find all the Image tags within parent */
$ref=$xp->query( 'Image', $node );
/* determine reference node to use for `insertBefore` */
if( $ref && $ref->length > 0 )$refNode=$ref->item(0);
else $refNode=$node->lastChild;
/* create a new `Images` node */
$oImages=$dom->createElement('Images');
$node->insertBefore( $oImages, $refNode );
/* using previously discovered `Image` nodes, add to the new `Images` element */
foreach( $ref as $img )$oImages->appendChild( $img );
}
}
/* for demo display */
printf('<pre>%s</pre>',print_r( htmlentities( $dom->saveXML() ), true ) );
/* For real output */
#header( 'Content-Type: application/xml' );
#exit( $dom->saveXML() );
The output from the debug printf above is as follows and matches, I believe, the desired format, apart from the new newsku tag that appears in your desired output but is not otherwise mentioned:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
<Products>
<Product>
<newproductname> pr_name 1</newproductname>
<productname> pr_name 1</productname>
<price> pr_price 1</price>
<sku> pr_sku 1</sku>
<Images>
<Image> image_url 1.1</Image>
<Image> image_url 1.2</Image>
</Images>
</Product>
<Product>
<newproductname> pr_name 2</newproductname>
<productname> pr_name 2</productname>
<price> pr_price 2</price>
<sku> pr_sku 2</sku>
<Images>
<Image> image_url 2</Image>
</Images>
</Product>
<Product>
<newproductname> pr_name 3</newproductname>
<productname> pr_name 3</productname>
<price> pr_price 3</price>
<sku> pr_sku 3</sku>
<Images>
<Image> image_url 3.1</Image>
<Image> image_url 3.2</Image>
<Image> image_url 3.3</Image>
</Images>
</Product>
<Product>
<newproductname> pr_name 4</newproductname>
<productname> pr_name 4</productname>
<price> pr_price 4</price>
<Images/>
<sku> pr_sku 4</sku>
</Product>
</Products>
</Root>
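The "insert after" twist mentioned above can be written as a small helper. DOM has no insertAfter method, but inserting before the reference node's nextSibling has the same effect, and when nextSibling is null insertBefore simply appends. The sketch below uses it to add the newsku element from the question after Images; the helper name insertAfter and the cut-down sample document are assumptions made for illustration.

```php
<?php
// Hypothetical helper: DOM only offers insertBefore(), so insert before the next sibling.
// When $ref is the last child, nextSibling is null and insertBefore() appends.
function insertAfter(DOMNode $new, DOMNode $ref): DOMNode
{
    return $ref->parentNode->insertBefore($new, $ref->nextSibling);
}

$dom = new DOMDocument();
$dom->loadXML('<Product><sku>pr_sku</sku><Images><Image>image_url</Image></Images></Product>');

$xp = new DOMXPath($dom);
$images = $xp->query('/Product/Images')->item(0);

// Placeholder value taken from the question's desired output.
insertAfter($dom->createElement('newsku', 'pr_newsku'), $images);

echo $dom->saveXML($dom->documentElement), "\n";
```

The same helper could be called inside the foreach loop of the answer, once the Images element has been created for each Product.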

Filtering XML elements by child node value

I have a feed with products, all the products have a child node called 'category' with a value. I can't find a way to return all products with a certain category value.
The XML looks something like this
<product>
<name>xxxx</name>
<category>Category A</category>
</product>
<product>
<name>xxxx</name>
<category>Category B</category>
</product>
<product>
<name>xxxx</name>
<category>Category A</category>
</product>
<product>
<name>xxxx</name>
<category>Category B</category>
</product>
<product>
<name>xxxx</name>
<category>Category B</category>
</product>
I've tried looping through the XML, using PHP code like this:
$xml = simplexml_load_file('file.xml');
foreach ($xml as $product) {
if ((string) $product['category'] == 'Category A') {
echo (string) $product['name'];
}
}
Expected outcome is to return/echo other child nodes for that product. What would be the best approach for this?
Your approach seems sound; I'm not familiar enough with SimpleXML to say why it's not working. But since you asked for the best approach, I'm partial to DomDocument and XPath myself:
$xml = <<< XML
<?xml version="1.0"?>
<products>
<product>
<name>xxxx</name>
<category>Category A</category>
</product>
<product>
<name>xxxx</name>
<category>Category B</category>
</product>
<product>
<name>xxxx</name>
<category>Category A</category>
</product>
<product>
<name>xxxx</name>
<category>Category B</category>
</product>
<product>
<name>xxxx</name>
<category>Category B</category>
</product>
</products>
XML;
$dom = new DomDocument;
$dom->loadXML($xml);
$xpath = new DomXPath($dom);
$search = "Category A";
$nodes = $xpath->query("//product[category='$search']/name");
foreach ($nodes as $node) {
printf("%s\n", $node->textContent);
}
For SimpleXML, after a little digging it looks like child elements must be accessed with object notation; array notation such as $product['category'] reads attributes instead. This worked for me:
$x = simplexml_load_string($xml);
foreach ($x->product as $product) {
if ((string) $product->category == 'Category A') {
echo (string) $product->name;
}
}
But I maintain that learning DOM and XPath methods will serve you better in the long run; they're both well established standards that are used in many languages. Knowledge about SimpleXML is not something you can transfer to another environment.
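To return the other child nodes of each matching product, as the question asks, the XPath expression can select the product elements themselves rather than only name. A minimal sketch using SimpleXMLElement::xpath(); the sample data and the price element are made up for illustration.

```php
<?php
$xml = <<<XML
<products>
  <product><name>First</name><category>Category A</category><price>10</price></product>
  <product><name>Second</name><category>Category B</category><price>20</price></product>
  <product><name>Third</name><category>Category A</category><price>30</price></product>
</products>
XML;

$feed = simplexml_load_string($xml);

$matches = [];
// xpath() returns an array of SimpleXMLElement objects, one per matching product.
foreach ($feed->xpath("//product[category='Category A']") as $product) {
    $row = [];
    // children() iterates every child element, whatever its name.
    foreach ($product->children() as $child) {
        $row[$child->getName()] = (string) $child;
    }
    $matches[] = $row;
}

print_r($matches);
```

Collecting the children into plain arrays keeps the filtering step separate from whatever is done with the data afterwards.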

Delete Parent node keeping all its child elements in nested XML with PHP

I have nested XML and want to remove only the parent node <items> from the document while keeping all its child nodes.
<root>
<items>
<Product>
<name> </name>
<size> </size>
<images>
<img1></img1>
<img2></img2>
</images>
</Product>
<Product>
<name> </name>
<size> </size>
<images>
<img1></img1>
<img2></img2>
</images>
</Product>
</items>
</root>
Expected Output -
<root>
<Product>
<name> </name>
<size> </size>
<images>
<img1></img1>
<img2></img2>
</images>
</Product>
<Product>
<name> </name>
<size> </size>
<images>
<img1></img1>
<img2></img2>
</images>
</Product>
</root>
I have researched and tried a lot, but when I remove the <items> node all its child nodes are deleted as well. Is there a way to do this with DOMDocument, or any other way in PHP?
Well, Geza Boems' answer is not exactly what I meant. Using XPath you can fetch the items nodes for iteration. The XPath result is a static list rather than a live one, so you can iterate over it while modifying the DOM.
$document = new DOMDocument();
$document->loadXML($input);
$xpath = new DOMXpath($document);
foreach ($xpath->evaluate('//items') as $itemsNode) {
// as long as there is any child inside it
while ($itemsNode->firstChild instanceof DOMNode) {
// move it before its parent
$itemsNode->parentNode->insertBefore($itemsNode->firstChild, $itemsNode);
}
// remove the empty items node
$itemsNode->parentNode->removeChild($itemsNode);
}
echo $document->saveXML();
As @ThW mentioned, you have to collect the child nodes in ITEMS, then insert them into ROOT, and finally delete ITEMS.
$input = "
<root>
<items>
<Product>
<name> </name>
<size> </size>
<images>
<img1></img1>
<img2></img2>
</images>
</Product>
<Product>
<name> </name>
<size> </size>
<images>
<img1></img1>
<img2></img2>
</images>
</Product>
</items>
</root>";
$doc = new DOMDocument();
$ret = $doc->loadXML($input);
$root = $doc->documentElement;
$nodes_to_insert = array();
$nodes_to_remove = array();
foreach($root->childNodes as $items) {
if($items->nodeName != "items") {
continue;
}
$nodes_to_remove[] = $items;
foreach($items->childNodes as $child) {
if($child->nodeType != XML_ELEMENT_NODE) {
continue;
}
$nodes_to_insert[] = $child;
}
}
foreach($nodes_to_insert as $node) {
$root->appendChild($node);
}
foreach($nodes_to_remove as $node) {
$root->removeChild($node);
}
var_dump($doc->saveXML());
This code finds every "items" element directly under the root, not just one. Inside each "items" it collects only normal nodes (ELEMENT type), skipping TEXT nodes and the like.
The last line dumps the result. Normally you will not see anything in a browser because of the XML header line, but the result is visible if you look at the page source.
PS: it is quite important not to modify the XML structure while you walk it. That's why I only collect the nodes first and then perform the insert and delete actions.

Parse XML Parent node of matching attribute

I have XML like the one below and am trying to run an XPath query and parse it with SimpleXML. The XML is a cURL response stored in a $response variable. I need to look up the Code attribute inside the <Item> and select the parent <Product> to parse it.
$response:
<Items>
<Product>
<Item Code="123">
</Item>
<Price>170
</Price>
</Product>
<Product>
<Item Code="456">
</Item>
<Price>150
</Price>
</Product>
</Items>
This is what I am doing:
$xml = simplexml_import_dom($response);
function loadNode($code){
global $xml;
$scode = $xml->xpath('//Item[contains(@Code,"' . $code . '")]/..');
echo $scode->Items->Product->Price;
}
loadNode("123");
This is the Notice I get:
Notice: Trying to get property of non-object
A couple of observations:
The xpath() method returns an array of SimpleXMLElement objects, not a single SimpleXMLElement. (Yes, even though an element can only have a single parent, you still have to take it as the first member of the array, [0].)
$scode->Items->Product->Price should be changed to just $scode->Price.
These modifications to your PHP code:
<?php
$response = <<<XML
<Items>
<Product>
<Item Code="123">
</Item>
<Price>170
</Price>
</Product>
<Product>
<Item Code="456">
</Item>
<Price>150
</Price>
</Product>
</Items>
XML;
$xml = simplexml_load_string($response);
function loadNode($code) {
global $xml;
$scode = $xml->xpath('//Item[contains(@Code,"' . $code . '")]/..')[0];
echo $scode->Price;
}
loadNode("123");
?>
When run will yield this output:
170
as expected.
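A variant that avoids the global and copes with a code that is not present: xpath() returns an empty array when nothing matches, so indexing [0] blindly raises a warning. The sketch below also uses an exact attribute match instead of contains(), so that "12" does not match "123". The function name findPrice is hypothetical.

```php
<?php
// Hypothetical helper: returns the trimmed price for an item code, or null if absent.
function findPrice(SimpleXMLElement $xml, string $code): ?string
{
    // Exact match on the Code attribute, then step up to the parent Product.
    // Assumes $code contains no double quotes (no escaping is attempted here).
    $products = $xml->xpath('//Item[@Code="' . $code . '"]/..');
    if (!$products) {
        return null;
    }
    return trim((string) $products[0]->Price);
}

$xml = simplexml_load_string(
    '<Items>'
    . '<Product><Item Code="123"></Item><Price>170</Price></Product>'
    . '<Product><Item Code="456"></Item><Price>150</Price></Product>'
    . '</Items>'
);

var_dump(findPrice($xml, '456'));
var_dump(findPrice($xml, '999'));
```

Passing the document as a parameter also makes the function usable with more than one response.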

PHP SimpleXMLElement not parsing <title> node

I have a simple well-formed XML doc that I'm writing to the page using PHP. For some reason the output never includes the title node, and after researching I can't figure this out. If I change the title node to 'heading' or some other name it is included in the output, but when it's named 'title', the node is skipped.
Here's the XML doc code...
<?xml version="1.0" encoding="UTF-8"?>
<items>
<product>
<id>cd1</id>
<title>CD One</title>
<description>This is my first CD</description>
<img>/images/sample.jpg</img>
<price>14.99</price>
</product>
</items>
The PHP code looks like this...
<?php
$filename = '../catalog.xml';
$contents = file_get_contents($filename);
echo $contents;
?>
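The XML you posted is well-formed as it stands: encoding names in the XML declaration are case-insensitive, so "UTF-8" and "utf-8" are both accepted. Since the file is echoed straight into the page, the more likely cause is that the browser parses the markup as HTML and treats <title> as the page title instead of displaying it; sending a Content-Type: application/xml header or escaping the output with htmlspecialchars() should make it visible.
To rule out the declaration anyway, try the lowercase form: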
<?xml version="1.0" encoding="utf-8"?>
<items>
<product>
<id>cd1</id>
<title>CD One</title>
<description>This is my first CD</description>
<img>/images/sample.jpg</img>
<price>14.99</price>
</product>
</items>
Validate here: http://validator.w3.org/check
