Using namespaces on XML - php

I need to work with namespaced XML tags in code and process them. For instance:
<system:include file="./test.php" cache="true" />
That would be the final output of the content, but the special tags (like system:include) must be processed before it is sent to the client.
So I will walk all elements of the final output looking for namespaced or otherwise special tags. The problem is that if I use DOMDocument and read the content as XML, I get errors about the namespace declaration (Namespace prefix system on include is not defined in Entity).
My test code is:
<?php
$document = new DOMDocument();
$document->loadXML('
<system:include file="./test.php" cache="true" />
');
foreach($document->childNodes as $node) {
var_dump($node->nodeName);
}
?>
I need this because I have to process some special tags and convert them to real HTML. For instance: convert <b> to <strong> (just an example!) or do something more useful, like including and caching a specific page using tags.
Another example:
<h7>Hello World!</h7>
Converts to:
<div class="h7">Hello World!</div>
Note: the output buffer (ob) contents will be sent to a specific method that searches for these special tags. So I don't know if I can add the namespace declarations beforehand (it would probably be hard and slow).
Bye!

I can get it to work if I specify a root element in the XML, and then declare the system namespace inside the root element. <root xmlns:system="system">...</root>
<?php
function dump($root) {
foreach($root->childNodes as $node) {
echo $node->nodeName;
echo "\n";
dump($node);
}
}
$doc = new DOMDocument();
$doc->loadXML('<root xmlns:system="system"><system:include file="./test.php" cache="true" /></root>');
dump($doc);
?>
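When the markup arrives as a fragment without a root element (as with the output buffer in the question), the same idea can be applied by wrapping the fragment before parsing. This is only a sketch: the wrapper element name, the namespace URI `urn:system`, and the variable names are assumptions chosen for illustration, and it presumes the fragment is otherwise well-formed XML.

```php
<?php
// Hypothetical fragment, as it might come from the output buffer.
$fragment = '<p>Hello</p><system:include file="./test.php" cache="true" />';

// Assumed namespace URI; any URI works as long as it is used consistently.
$systemNs = 'urn:system';

// Wrap the fragment in a root element that declares the prefix.
$doc = new DOMDocument();
$doc->loadXML('<root xmlns:system="' . $systemNs . '">' . $fragment . '</root>');

// Collect every element in the "system" namespace, whatever its local name.
$found = [];
foreach ($doc->getElementsByTagNameNS($systemNs, '*') as $node) {
    $found[] = $node->localName . ' file=' . $node->getAttribute('file');
}

print_r($found);
```

Each matched node could then be replaced with the real HTML it stands for before the wrapper's children are serialized again.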

Related

php xml DOMDocument close tag element

I am using PHP DOMDocument() to generate XML file with elements.
I am appending all the details to the components tag of a sample XML file, but the closing tag is not written. I want an explicit closing tag.
My code produces this:
<component expiresOn="2022-12-31" id="pam" />
I want the following instead:
<component expiresOn="2022-12-31" id="pam"></component>
My PHP code sample:
$dom = new DOMDocument();
$dom->load("Config.xml");
$components = $dom->getElementsByTagName('components')->item(0);
if(!empty($_POST["pam"])) {
$pam = $_POST["pam"];
$component = $dom->createElement('component');
$component->setAttribute('expiresOn', $expirydate);
$component->setAttribute('id', "pam");
$components->appendChild($component);
}
$dom->save("Config.xml");
I tested the following suggestion, from "Self-closing tags using createElement", and it's not working. The XML and PHP code there differ from mine.
$dom->saveXml($dom,LIBXML_NOEMPTYTAG);
You're trying to use DOMDocument::saveXML to save the new XML back into the original file, but all that function does is return the XML as a string. Since you aren't assigning the result to anything, nothing happens.
If you want to save the XML back to your file, as well as avoiding self-closing tags, you'll need to use the save method as you originally were, and also pass the option:
$dom->save('licenceConfig.xml', LIBXML_NOEMPTYTAG);
See https://3v4l.org/e6N5s for a demo
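The difference the option makes can be seen on an in-memory document. This is a minimal sketch, assuming a `<components>` root like the one in the question; it uses saveXML() only to compare the two serializations as strings, and the variable names are illustrative.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<components></components>');

$component = $dom->createElement('component');
$component->setAttribute('expiresOn', '2022-12-31');
$component->setAttribute('id', 'pam');
$dom->documentElement->appendChild($component);

// saveXML() only returns a string; the result has to be used or written somewhere.
$default  = $dom->saveXML();                         // empty elements are self-closed
$expanded = $dom->saveXML(null, LIBXML_NOEMPTYTAG);  // empty elements get a closing tag

echo $default, $expanded;
```

Passing the same flag as the second argument of save() writes the expanded form straight to the file.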

Load HTML containing namespaces with DOMDocument

I have a problem. I want to load an HTML snippet with namespaces in it with DOMDocument.
<div class="something-first">
<div class="something-child something-good another something-great">
<my:text value="huhu">
</div>
</div>
But I can't figure out how to preserve the namespaces. I tried loading it with loadHTML(), but HTML does not have namespaces, so they get stripped.
I tried loading it with loadXML(), but this doesn't work either, because <my:text value="huhu"> is not well-formed XML.
What I need is a loadHTML() method that doesn't strip namespaces, or a loadXML() method that does not validate the markup. In other words, a combination of these two methods.
My code so far:
$html = '<div class="something-first">
<div class="something-child something-good another something-great">
<my:text value="huhu">
</div>
</div>';
libxml_use_internal_errors(true);
$domDoc = new DOMDocument();
$domDoc->formatOutput = false;
$domDoc->resolveExternals = false;
$domDoc->substituteEntities = false;
$domDoc->strictErrorChecking = false;
$domDoc->validateOnParse = false;
$domDoc->loadHTML($html/*, LIBXML_NOERROR | LIBXML_NOWARNING*/);
$xpath = new DOMXPath($domDoc);
$xpath->registerNamespace ( 'my', 'http://www.example.com/' );
// -----> This results in zero nodes cause namespace gets stripped by loadHTML()
$nodes = $xpath->query('//my:*');
var_dump($nodes);
Is there a way to achieve what I want? I would be very grateful for any advice.
EDIT: I opened an enhancement request for libxml2 to provide an option to preserve namespaces in HTML: https://bugzilla.gnome.org/show_bug.cgi?id=711670
First, namespaces are allowed in XML (or XHTML) only. HTML does not support namespaces.
Given that it is XHTML and the xmlns declaration is present in the snippet, then you can access elements by namespace using DOMDocument::getElementsByTagNameNS():
$html = <<<EOF
<div xmlns:my="http://www.example.com/" class="something-first">
<div class="something-child something-good another something-great">
<my:text value="huhu" />
</div>
</div>
EOF;
$domDoc = new DOMDocument();
$domDoc->loadXML($html);
var_dump(
// it is possible to use wildcard `*` here
$domDoc->getElementsByTagNameNS('http://www.example.com/', '*')
);
However, since the namespace declaration is commonly placed on the root element <html> rather than on sub-nodes, a snippet cut from a larger document usually lacks it, so the code above will not work in most cases.
So part two of the solution would be to check whether the declaration is present and, if not, inject it (working on this).
As I said, the code above works for XML / XHTML only. It is still open how to do that with HTML. (check the discussion below)
Technically it's neither valid XML nor valid HTML (or XHTML): HTML does not allow namespaced elements, while well-formed XML requires that empty elements be self-closing and that the namespace prefix be declared. So you're basically asking "how can I have DOMDocument treat this invalid HTML as valid XML even though it's not valid XML either?", which is going to prove difficult, and one might ask why libxml should be updated to allow for this. If I update your snippet to:
$html = <<<XML
<div xmlns:my="http://www.example.com/" class="something-first">
<div class="something-child something-good another something-great">
<my:text value="huhu" />
</div>
</div>
XML;
adding the namespace declaration and closing the my:text element, it works just fine with:
$domDoc = new DOMDocument();
$domDoc->loadXML($html);
echo $domDoc->saveXML();
Notice that the namespace is not stripped out here. With loadHTML(), as I understand it, the namespace is stripped out because the markup is not valid XML or HTML. The XPath query can't match by namespace because the namespace was never declared via xmlns and was therefore dropped.
So I guess the question is: Why are you petitioning for invalid XML support rather than adding that closing slash? Is it because the data is from an external source or because in some context the empty non-closing tag is valid?
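Once the snippet is well-formed and declares the prefix, the XPath query from the question does return the namespaced element. A short sketch, reusing the question's namespace URI and a trimmed version of its markup:

```php
<?php
$xml = <<<XML
<div xmlns:my="http://www.example.com/" class="something-first">
  <div class="something-child">
    <my:text value="huhu" />
  </div>
</div>
XML;

$domDoc = new DOMDocument();
$domDoc->loadXML($xml);

$xpath = new DOMXPath($domDoc);
// The prefix registered here only has to map to the same URI as in the document.
$xpath->registerNamespace('my', 'http://www.example.com/');

$nodes = $xpath->query('//my:*');
foreach ($nodes as $node) {
    echo $node->nodeName, ' => ', $node->getAttribute('value'), "\n";
}
```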

PHP Modify an included file

I have a bunch of .html files that I am including on a page. Conditionally, I need to add classes to some of the components in these files, for example:
<div id='foo' class='bar'></div>
to
<div id='foo' class='bar bar2'></div>
I know I can do this with some inline PHP like this
<div id='foo' class="bar <?php echo " bar2"; ?>"></div>
However, having PHP in any of the files I'm including is not an option.
I also looked into including a file and then modifying afterward, but that doesn't seem possible. Then I was thinking I should read the files line-by-line, and add it in then.
Is there a nicer way I'm not thinking of?
Since having PHP is not an option, you could use PHP's DOM Parser with an XPath selector:
$dom = new DOMDocument();
$dom->loadHTMLFile($htmlFile);
$finder = new DOMXPath($dom);
// select every element whose class attribute contains 'bar'
$nodes = $finder->query("//*[contains(@class, 'bar')]");
// set the new class list using setAttribute
foreach ($nodes as $node) {
    $node->setAttribute('class', 'bar bar2');
}
// modified HTML source
$html = $dom->saveHTML();
That should get you started.
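One caveat with contains(@class, 'bar') is that it also matches classes such as "barista", and a fixed setAttribute() value overwrites any other classes already on the element. The following sketch shows one way around both, assuming the class tokens are whitespace-separated; the sample markup and variable names are made up for illustration.

```php
<?php
$html = "<div id='foo' class='bar'></div><div id='other' class='barista'></div>";

$dom = new DOMDocument();
$dom->loadHTML($html);
$finder = new DOMXPath($dom);

// Match "bar" as a whole class token, so "barista" is not selected.
$query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' bar ')]";

foreach ($finder->query($query) as $node) {
    // Append to the existing value rather than overwriting it.
    $node->setAttribute('class', $node->getAttribute('class') . ' bar2');
}

$foo   = $finder->query("//*[@id='foo']")->item(0);
$other = $finder->query("//*[@id='other']")->item(0);
echo $dom->saveHTML($foo), "\n", $dom->saveHTML($other), "\n";
```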
You can use the DOMDocument class in PHP to retrieve the information from the file and then add attributes and data.
I don't really remember the code for DOMDocument so I haven't included any code here (sorry), but here are some links:
Use this method to get the HTML from your file:
http://php.net/manual/en/domdocument.loadhtmlfile.php
Review the DOMDocument class:
http://php.net/manual/en/class.domdocument.php
You may need to use .php instead of .html.
So do it like this:
$variableClass="bar2";
include("htmlfilename.html");
where htmlfilename.html contains
<div id='foo' class="bar <?php echo $variableClass; ?>"></div>
Depends on what you actually want to achieve - but basically this tends to be better solved by jQuery on the client.
But anyway you might put your HTML fragments in a DOM object, analyze and modify it, and read the HTML back after the modifications, for example:
// including an HTML file writes to the output stream, so buffer this
ob_start();
include('myfile.html');
$html = ob_get_clean();
// make a DOMDocument
$doc = new DOMDocument();
$doc->loadHTML($html);
// make the changes you need to
$xpath = new DOMXPath($doc);
$nodelist = $xpath->query('//div[@id="foo"]');
// etc...
// get modified HTML
$html = $doc->saveHTML();
Hope this helps.

manipulate html navigation with php dom

I need to add classes to the navigation HTML being output from a function in a custom CMS.
The only way I can get the output I need is to parse the HTML with PHP.
I am using PHP's DOM methods to look through the HTML and add a class to any <li> element that contains a child <ul> (top level navigation items).
So far it's working, but I have 2 questions:
1. Is there a more efficient way for me to go through this DOM data? It seems cumbersome to me, but that could just be my lack of experience.
2. In some cases, my <li> elements may already have a class. How can I add to the existing class attribute without destroying what may or may not already be there?
<?php
$mcms_nav = getContent(
// call to cms that returns navigation html as a string
// ex. <ul id="pnav"><li>home</li>....</ul>
);
$dom = new DOMDocument();
$dom->preserveWhiteSpace = FALSE;
$dom->loadHTML($mcms_nav);
$x = new DOMXPath($dom);
foreach($x->query('//ul/li/ul') as $node)
{
$parent = $node->parentNode;
$parent_attr = $dom->createAttribute('class');
$parent_attr->value = 'has-flyout';
$parent->appendChild($parent_attr);
$flyout_attr = $dom->createAttribute('class');
$flyout_attr->value = 'flyout';
$node->appendChild($flyout_attr);
}
$mcms_nav = $dom->getElementByID('pnav');
echo $dom->saveHTML($mcms_nav);
?>
Not really. You could take the XML class from the CakePHP framework, turn this into an array, manipulate the array, and turn it back. Not sure if that's an option in your case. http://book.cakephp.org/2.0/en/core-utility-libraries/xml.html
You can use $element->hasAttribute() and $element->getAttribute() (both DOMElement methods) to get the existing attribute contents if they exist.
Also, a new job wouldn't hurt ;)
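As a rough illustration of the hasAttribute()/getAttribute() suggestion, the sketch below adds a class without discarding existing ones, and uses the XPath predicate `//li[ul]` to select the parent items directly. The helper name `addClass` and the sample navigation markup are hypothetical, not part of the CMS in the question.

```php
<?php
// Hypothetical helper: add a class token unless it is already present.
function addClass(DOMElement $el, string $class): void
{
    $current = $el->hasAttribute('class')
        ? preg_split('/\s+/', trim($el->getAttribute('class')), -1, PREG_SPLIT_NO_EMPTY)
        : [];
    if (!in_array($class, $current, true)) {
        $current[] = $class;
    }
    $el->setAttribute('class', implode(' ', $current));
}

$nav = '<ul id="pnav"><li class="first">home</li>'
     . '<li class="active">about<ul><li>team</li></ul></li></ul>';

$dom = new DOMDocument();
$dom->loadHTML($nav);
$x = new DOMXPath($dom);

// "//li[ul]" selects each <li> that has a <ul> child, so no parentNode walk is needed.
foreach ($x->query('//li[ul]') as $li) {
    addClass($li, 'has-flyout');
    foreach ($x->query('ul', $li) as $ul) {
        addClass($ul, 'flyout');
    }
}

$result = $dom->saveHTML($x->query('//ul[@id="pnav"]')->item(0));
echo $result, "\n";
```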

Getting and placing content within html tag by its class using php

Is it possible to get and place content within an html tag by its class name?
For Example:
<div class='edit'>
Wow! I'm the Content.
</div>
Is it possible to get that value, edit and place it back or a new value to that div etc? If it's possible... will it work if it has multiple classes? Like:
<div class='span-20 edit'>
Wow! I'm the Content.
</div>
If you can determine which specific HTML tag to manipulate, you have various tools at your disposal. You can use str_replace, preg_replace, DOMDocument, DOMXPath, and simplexml in this situation.
If in PHP, try this:
$xhtml = simplexml_load_string("<div class='edit'>Wow! I'm the Content.</div>");
// match <div> elements whose class attribute is exactly "edit"
$divs = $xhtml->xpath("//div[@class='edit']");
if (!empty($divs)) {
    foreach ($divs as $div) {
        $div['class'] .= ' span-20';
    }
}
return $xhtml->asXML();
With the jQuery JavaScript library, do this:
$('.edit').addClass('span-20');
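For the multiple-class case on the PHP side, an exact comparison such as @class='edit' would not match class='span-20 edit'. The sketch below uses DOMDocument with a token-aware XPath test to read the content and write a new value back; the uppercase transformation is only a stand-in for whatever edit is needed.

```php
<?php
$html = "<div class='span-20 edit'>Wow! I'm the Content.</div>";

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Token-aware match: true for class="edit" and for class="span-20 edit".
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' edit ')]";

$old = null;
foreach ($xpath->query($query) as $div) {
    $old = trim($div->textContent);       // read the current content
    $div->nodeValue = strtoupper($old);   // write a modified value back
}

$edited = $xpath->query($query)->item(0);
echo $old, "\n", $dom->saveHTML($edited), "\n";
```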
