Generating a sitemap with DOMDocument: missing attribute node in the output (PHP)

I'm trying to generate a sitemap.xml. Here is a simplified version of my code:
$dom = new \DOMDocument();
$dom->encoding = 'utf-8';
$dom->xmlVersion = '1.0';
$dom->formatOutput = true;
$xml_file_name = './sitemap.xml';
$urlset = $dom->createElement('urlset');
$attr_ = new \DOMAttr('xmlns:xsi', "http://www.w3.org/2001/XMLSchema-instance");
$urlset->setAttributeNode($attr_);
$url_node = $dom->createElement('url');
$url_node_loc = $dom->createElement('loc', 'http://localhost' );
$url_node->appendChild($url_node_loc);
$url_node_lastmod = $dom->createElement('lastmod', '2021-08-03T22:17:47+04:30' );
$url_node->appendChild($url_node_lastmod);
$urlset->appendChild($url_node);
$dom->appendChild($urlset);
$dom->save($xml_file_name);
dd('done');
Here is the output in my sitemap.xml:
This XML file does not appear to have any style information associated with it. The document tree is shown below.
<urlset>
<url>
<loc>http://localhost</loc>
<lastmod>2021-08-03T22:17:47+04:30</lastmod>
</url>
</urlset>
I need to add some attributes to my urlset tag. Here is how I did it:
$attr_ = new \DOMAttr('xmlns:xsi', "http://www.w3.org/2001/XMLSchema-instance");
$urlset->setAttributeNode($attr_);
But for some reason the attribute doesn't show up in my sitemap file; urlset has no attributes.

Use setAttribute() instead of setAttributeNode().
$urlset->setAttribute('xmlns:xsi', 'http://www.w3.org/2001/XMLSchema-instance');

This will not be a valid sitemap: sitemaps use an XML namespace (https://www.sitemaps.org/protocol.html).
To create nodes with namespaces, use the namespace-aware DOM methods with the *NS suffix, such as createElementNS() and setAttributeNS(). They add namespace definitions as needed.
xmlns:xsi is a namespace definition. Namespace definitions can be treated as attribute nodes in the reserved namespace {http://www.w3.org/2000/xmlns/}.
$xmlns = [
'sitemap' => 'http://www.sitemaps.org/schemas/sitemap/0.9',
'xmlns' => 'http://www.w3.org/2000/xmlns/',
'xsi' => 'http://www.w3.org/2001/XMLSchema-instance',
];
$document = new \DOMDocument('1.0', 'utf-8');
$document->formatOutput = true;
$urlset = $document->appendChild(
$document->createElementNS($xmlns['sitemap'], 'urlset')
);
// explicit namespace definition
$urlset->setAttributeNS(
$xmlns['xmlns'], 'xmlns:xsi', $xmlns['xsi']
);
$url_node = $urlset->appendChild(
$document->createElementNS($xmlns['sitemap'], 'url')
);
$url_node
->appendChild($document->createElementNS($xmlns['sitemap'], 'loc'))
->textContent = 'http://localhost';
$url_node
->appendChild($document->createElementNS($xmlns['sitemap'], 'lastmod'))
->textContent = '2021-08-03T22:17:47+04:30';
echo $document->saveXML();
Output:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<url>
<loc>http://localhost</loc>
<lastmod>2021-08-03T22:17:47+04:30</lastmod>
</url>
</urlset>
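A common reason for declaring xmlns:xsi is to attach an xsi:schemaLocation attribute; the question does not say so, so treat that as an assumption. Here is a minimal sketch of how that could look. The variable names are mine, and the schema location value is the one listed on the sitemaps.org protocol page. Note that setAttributeNS() adds the xmlns:xsi definition by itself when the namespace is not yet in scope.

```php
<?php
// Namespace URIs (variable names are illustrative).
$sitemapNs = 'http://www.sitemaps.org/schemas/sitemap/0.9';
$xsiNs = 'http://www.w3.org/2001/XMLSchema-instance';

$document = new DOMDocument('1.0', 'utf-8');
$document->formatOutput = true;

// Document element in the sitemap namespace.
$urlset = $document->appendChild(
    $document->createElementNS($sitemapNs, 'urlset')
);

// A namespaced attribute; the xmlns:xsi definition is added as needed.
$urlset->setAttributeNS(
    $xsiNs,
    'xsi:schemaLocation',
    $sitemapNs . ' ' . $sitemapNs . '/sitemap.xsd'
);

$xml = $document->saveXML();
echo $xml;
```

The serialized urlset element should then carry xmlns, xmlns:xsi and xsi:schemaLocation.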

Related

XML PHP Format Setup

Is there a way to do this? I'm new to creating a DOMDocument. Thank you.
<!DOCTYPE Data SYSTEM "http://data.data.org/schemas/data/1.234.1/data.dtd"> <Data payloadID = "123123123131231232323" timestamp = "2015-06-10T12:59:09-07:00">
$aribaXML = new DOMImplementation;
$dtd = $aribaXML->createDocumentType('cXML', '', 'http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd');
$dom = $aribaXML->createDocument('', '', $dtd);
Here are two ways in DOM to create a new document in PHP. If you don't need the DTD you can directly create an instance of the DOMDocument class.
$document = new DOMDocument('1.0', "UTF-8");
$document->appendChild(
$cXML = $document->createElement('cXML')
);
echo $document->saveXML();
Output:
<?xml version="1.0" encoding="UTF-8"?>
<cXML/>
For a document with a DTD your approach was correct.
$implementation = new DOMImplementation;
$dtd = $implementation->createDocumentType(
'cXML', '', 'http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd'
);
$document = $implementation->createDocument("", "cXML", $dtd);
$document->encoding = 'UTF-8';
$cXML = $document->documentElement;
echo $document->saveXML();
Output:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd">
After the initial bootstrap you use methods of the $document to create nodes and append them.
// set an attribute on the document element
$cXML->setAttribute('version', '1.1.007');
// xml:lang is a namespaced attribute in a reserved namespace
$cXML->setAttributeNS('http://www.w3.org/XML/1998/namespace', 'xml:lang', 'en-US');
// nested elements
// create a node, store it for the next call and append it
$cXML->appendChild(
$header = $document->createElement('Header')
);
$header->appendChild(
$from = $document->createElement('From')
);
$from->appendChild(
$credential = $document->createElement('Credential')
);
// "Identity" has only a text node so we don't need to store it for later
$credential->appendChild(
$document->createElement('Identity')
)->textContent = '83528721';
// format serialized XML string
$document->formatOutput = TRUE;
echo $document->saveXML();
Output:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd">
<cXML version="1.1.007" xml:lang="en-US">
<Header>
<From>
<Credential>
<Identity>83528721</Identity>
</Credential>
</From>
</Header>
</cXML>
$xml = new DomDocument('1.0');
$xml->formatOutput = true;
$works = $xml->createElement("works");
$xml->appendChild($works);
$work = $xml->createElement("work");
$work->setAttribute("id",1);
$works->appendChild($work);
$xml->save("storage/document.xml") or die("Error, Unable to create XML File");

PHP: Problem when I convert to XML

I want the values from $arElement (gurl and gname) put inside an item element. Second problem: when I write g:url or g:name, I get an error (PHP 7.2). For example, right now I have this structure:
-rss
---title
---link
---description
---gurl
---gname
With several elements, I now have this structure:
-rss
---title
---link
---description
---gurl
---gname
---gurl
---gname
---gurl
---gname
I want this structure:
-rss
---title
---link
---description
---item
-----gurl
-----gname
---item
-----gurl
-----gname
---item
-----gurl
-----gname
---item
-----gurl
-----gname
header("Content-type: text/xml; charset=utf-8");
$dom = new DOMDocument('1.0','utf-8');
$root = $dom->createElement('rss');
$dom->appendChild($root);
$title = $dom->createElement('title', 'test');
$root->appendChild($title );
$link = $dom->createElement('link', 'test');
$root->appendChild($link );
$description = $dom->createElement('description', 'test');
$root->appendChild($description );
$root = $item->createElement('item');
while($arElement = $rsElements->GetNext())
{
$url = $dom->createElement("gurl", $surl.$arElement[DETAIL_PAGE_URL]);
$item->appendChild($url );
$name = $dom->createElement("gname", $arElement[NAME]);
$root->appendChild($name );
}
echo $dom->saveXML();
$dom->save($file_name); // save as file
There is a big difference between gurl and g:url. gurl is not a valid RSS tag as far as I know. g:url is a url element inside a defined namespace.
The g in g:url is a namespace prefix. It references a namespace definition. Look for an xmlns:g attribute in examples or for the namespace URI in the documentation of the format. The g is an alias for the value of that attribute. A parser resolves it to the URI internally. All of the following nodes can be read as {urn:example:namespace}url.
<g:url xmlns:g="urn:example:namespace"/>
<g2:url xmlns:g2="urn:example:namespace"/>
<url xmlns="urn:example:namespace"/>
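To check that claim yourself, here is a small sketch that parses the three variants and prints the resolved name of each document element. The array and variable names are only for illustration.

```php
<?php
// The three spellings from above; only the prefix differs.
$samples = [
    '<g:url xmlns:g="urn:example:namespace"/>',
    '<g2:url xmlns:g2="urn:example:namespace"/>',
    '<url xmlns="urn:example:namespace"/>',
];

$names = [];
foreach ($samples as $sample) {
    $document = new DOMDocument();
    $document->loadXML($sample);
    $node = $document->documentElement;
    // namespaceURI + localName identify the node, the prefix is just an alias
    $names[] = '{' . $node->namespaceURI . '}' . $node->localName;
}

print_r($names);
```

Each entry comes out as {urn:example:namespace}url, which is why code that reads the XML should compare namespace URIs and local names, not prefixes.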
RSS itself is just well-formed XML; it uses no namespace. But it can contain other XML formats that use namespaces (MediaRSS, ...).
To create an element with a namespace, use the method DOMDocument::createElementNS(). It automatically adds the namespace definition if needed. However, if the document element does not use that namespace, the definition is added to every element that does, so it appears multiple times. To avoid that, set the namespace definition on the document element as an attribute in the reserved XMLNS namespace.
$data = ['one', 'two'];
// the namespace for namespace definitions
const XMLNS_XMLNS = 'http://www.w3.org/2000/xmlns/';
// namespace referenced by prefix g?
const XMLNS_G = 'urn:example:namespace';
$document = new DOMDocument('1.0','utf-8');
$rss = $document->appendChild(
$document->createElement('rss')
);
// add the namespace definition to the document element
$rss->setAttributeNS(XMLNS_XMLNS, 'xmlns:g', XMLNS_G);
// create + append element node, set its text content
$rss->appendChild(
$document->createElement('title')
)->textContent = 'test';
foreach ($data as $value) {
$item = $rss->appendChild(
$document->createElement('item')
);
// create and append an element with the namespace
$item->appendChild(
$document->createElementNS(XMLNS_G, 'g:url')
)->textContent = 'http://example.com/page?'.$value;
}
$document->formatOutput = TRUE;
echo $document->saveXML();
Output:
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:g="urn:example:namespace">
<title>test</title>
<item>
<g:url>http://example.com/page?one</g:url>
</item>
<item>
<g:url>http://example.com/page?two</g:url>
</item>
</rss>
Hint 1: DOMNode::appendChild() returns the appended node, so the create call can be nested inside it.
Hint 2: DOMNode::$textContent lets you read and write the text content of a node and escapes special characters properly.
Applied to the loop from the question:
const XMLNS_XMLNS = 'http://www.w3.org/2000/xmlns/';
// namespace referenced by prefix g?
const XMLNS_G = 'urn:example:namespace';
$document = new DOMDocument('1.0','utf-8');
$rss = $document->appendChild(
$document->createElement('rss')
);
// add the namespace definition to the document element
$rss->setAttributeNS(XMLNS_XMLNS, 'xmlns:g', XMLNS_G);
// create + append element node, set its text content
$rss->appendChild(
$document->createElement('title')
)->textContent = 'test';
while($arElement = $rsElements->GetNext())
{
$item = $rss->appendChild(
$document->createElement('item')
);
$item->appendChild(
$document->createElementNS(XMLNS_G, 'g:url')
)->textContent = $surl.$arElement["DETAIL_PAGE_URL"];
$item->appendChild(
$document->createElementNS(XMLNS_G, 'g:name')
)->textContent = $arElement["NAME"];
}
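To illustrate Hint 2, here is a minimal sketch with a URL that contains an ampersand (the URL is a made-up example). Assigning through textContent stores the string as literal text, and the serializer escapes it. The PHP manual notes that the value argument of createElement() is not escaped, so a raw & passed there is a problem.

```php
<?php
$document = new DOMDocument('1.0', 'utf-8');

// Create the element without a value, then assign the text.
$loc = $document->appendChild($document->createElement('loc'));
$loc->textContent = 'http://www.example.com/catalog?item=12&desc=vacation_hawaii';

// Serialize only this node (no XML declaration).
$xml = $document->saveXML($loc);
echo $xml;
// <loc>http://www.example.com/catalog?item=12&amp;desc=vacation_hawaii</loc>
```

Reading $loc->textContent back returns the original string with the plain &.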

Duplicate xml namespace declarations php DOMDocument

I use PHP DOMDocument to generate XML. Sometimes namespaces are declared only on the root element, which is the intended behaviour, but sometimes they are not.
For example:
$xml = new DOMDocument('1.0', 'utf-8');
$ns = "http://ns.com";
$otherNs = "http://otherns.com";
$docs = $xml->createElementNS($ns, "ns:Documents");
$doc = $xml->createElementNS($otherNs, "ons:Document");
$innerElement = $xml->createElementNS($otherNs, "ons:innerElement", "someValue");
$doc->appendChild($innerElement);
$docs->appendChild($doc);
$xml->appendChild($docs);
$xml->formatOutput = true;
$xml->save("dom");
I expect:
<?xml version="1.0" encoding="UTF-8"?>
<ns:Documents xmlns:ns="http://ns.com" xmlns:ons="http://otherns.com">
<ons:Document>
<ons:innerElement>someValue</ons:innerElement>
</ons:Document>
</ns:Documents>
But got:
<?xml version="1.0" encoding="UTF-8"?>
<ns:Documents xmlns:ns="http://ns.com" xmlns:ons="http://otherns.com">
<ons:Document xmlns:ons="http://otherns.com">
<ons:innerElement>someValue</ons:innerElement>
</ons:Document>
</ns:Documents>
Why does the declaration xmlns:ons="http://otherns.com" appear on the Document element, but not on <innerElement>? And how can I prevent duplicates?
It's very easy. Add your nodes to the document tree as you create them.
Additionally, you can explicitly create the xmlns:XXX attribute on the root node.
See the example:
namespace test;
use DOMDocument;
$xml = new DOMDocument("1.0", "UTF-8");
$ns = "http://ns.com";
$otherNs = "http://otherns.com";
$docs = $xml->createElementNS($ns, "ns:Documents");
$xml->appendChild($docs);
$docs->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns:ons', $otherNs);
$doc = $xml->createElement("ons:Document");
$docs->appendChild($doc);
$innerElement = $xml->createElement("ons:innerElement", "someValue");
$doc->appendChild($innerElement);
$xml->formatOutput = true;
echo $xml->saveXML();
Result:
<?xml version="1.0" encoding="UTF-8"?>
<ns:Documents xmlns:ns="http://ns.com" xmlns:ons="http://otherns.com">
<ons:Document>
<ons:innerElement>someValue</ons:innerElement>
</ons:Document>
</ns:Documents>
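A variant of the same idea that keeps the child elements namespace-aware is sketched below. It assumes the behaviour seen in the RSS example above: when an element created with createElementNS() is appended below an ancestor that already declares the same prefix and URI, the redundant declaration is dropped. The variable names follow the question.

```php
<?php
$ns = 'http://ns.com';
$otherNs = 'http://otherns.com';

$xml = new DOMDocument('1.0', 'UTF-8');
$xml->formatOutput = true;

// Append the root first, then declare the second namespace on it.
$docs = $xml->appendChild($xml->createElementNS($ns, 'ns:Documents'));
$docs->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns:ons', $otherNs);

// Build top-down: each node is attached to the tree before it gets children.
$doc = $docs->appendChild($xml->createElementNS($otherNs, 'ons:Document'));
$inner = $doc->appendChild($xml->createElementNS($otherNs, 'ons:innerElement'));
$inner->textContent = 'someValue';

$output = $xml->saveXML();
echo $output;
```

Unlike createElement("ons:Document"), the nodes created this way report http://otherns.com as their namespaceURI while the document is still in memory.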

Creating attribute in domdocument

I need to generate this kind of XML:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://www.example.com/catalog?item=12&amp;desc=vacation_hawaii</loc>
<changefreq>weekly</changefreq>
</url>
</urlset>
For which I have written this code,
$dom = new domDocument('1.0', 'utf-8');
$dom->formatOutput = true;
$rootElement = $dom->createElementNS('http://www.sitemaps.org/schemas/sitemap/0.9', 'urlset');
$sxe = simplexml_import_dom( $dom );
$urlMain = $sxe->addChild("url");
$loc = $urlMain->addChild("loc","http://www.example.com");
$lastmod = $urlMain->addChild("lastmod","$date");
$changefreq = $urlMain->addChild("changefreq","daily");
$priority = $urlMain->addChild("priority","1");
Everything else works fine, but for some reason the xmlns attribute for urlset is not added. What might be wrong here?
Any suggestion would be helpful.
You need to append the root element to the document prior to conversion to simplexml:
$rootElement = $dom->createElementNS('http://www.sitemaps.org/schemas/sitemap/0.9', 'urlset');
$dom->appendChild($rootElement);
$sxe = simplexml_import_dom( $dom );
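For completeness, here is a sketch of the whole flow with that fix applied. The child values are placeholders. As far as I know, addChild() without a namespace argument puts the new element in the parent's namespace, so the children need no extra handling.

```php
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$dom->formatOutput = true;

// Create the root in the sitemap namespace and attach it to the document.
$rootElement = $dom->createElementNS('http://www.sitemaps.org/schemas/sitemap/0.9', 'urlset');
$dom->appendChild($rootElement);

// SimpleXML now wraps the same underlying tree, starting at urlset.
$sxe = simplexml_import_dom($dom);
$urlMain = $sxe->addChild('url');
$urlMain->addChild('loc', 'http://www.example.com/');
$urlMain->addChild('changefreq', 'daily');

// Serializing through DOM shows the nodes added via SimpleXML.
$xml = $dom->saveXML();
echo $xml;
```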

Using XPath for sitemap.xml

Here is the XML file content:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url id="first_url">
<loc>http://example.com</loc>
<lastmod>2014-05-21</lastmod>
</url>
</urlset>
And here goes the PHP code:
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$dom->Load('sitemap.xml');
$xpath = new DOMXPath($dom);
$tags = $xpath->query('//url[@id="first_url"]');
foreach($tags as $tag)
print $tag->getAttribute("id")."<br/>";
?>
This code does not work. But if I remove xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" from the file, it works. Why is that? Thanks!
It is because of the namespace. Here is a way to do it that ignores the namespace:
XPath 1.0:
//*[local-name()="url"][@id="first_url"]
XPath 2.0 (not available in PHP's DOMXPath, which implements XPath 1.0):
//*:url[@id="first_url"]
Register the namespace using DOMXPath::registerNamespace():
$xpath->registerNamespace("s",
"http://www.sitemaps.org/schemas/sitemap/0.9");
Then use it in your XPath:
$tags = $xpath->query('//s:url[@id="first_url"]');
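Put together, a self-contained sketch could look like this. It loads the XML from a string so it runs without a sitemap.xml file. The prefix s is arbitrary; it only has to match between registerNamespace() and the expression.

```php
<?php
$source = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url id="first_url">
    <loc>http://example.com</loc>
    <lastmod>2014-05-21</lastmod>
  </url>
</urlset>
XML;

$dom = new DOMDocument();
$dom->loadXML($source);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('s', 'http://www.sitemaps.org/schemas/sitemap/0.9');

// Without a prefix the expression looks for url in no namespace: no match.
$unprefixed = $xpath->query('//url[@id="first_url"]');
// With the registered prefix the element is found.
$prefixed = $xpath->query('//s:url[@id="first_url"]');

$ids = [];
foreach ($prefixed as $tag) {
    $ids[] = $tag->getAttribute('id');
}

echo $unprefixed->length, ' ', $prefixed->length, ' ', implode(',', $ids), "\n";
// 0 1 first_url
```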
