PHP DOM return as html - php

I have an xml file slider.xml with html code inside:
<?xml version="1.0" encoding="UTF-8"?>
<content>
<title>Slider</title>
<head>
<script async="async" src='ws-custom/plugins/slider.js'></script>
<script async="defer" src='ws-custom/plugins/functions.js'></script>
</head>
<footer>
<script async="defer" src='ws-custom/plugins/jquery.js'></script>
</footer>
</content>
In PHP I would like to:
1. load it (using simplexml, dom or other better solution) and store in a variable $xml;
2. create an array $head with both $xml->head->children();
3. return the original html code for $head[0] and $head[1].
I have tried using this code:
$xml = simplexml_load_file('slider.xml');
$head = $xml->head->children();
foreach($head as $element){
echo $element->asXML();
}
but it returns self-closing tags:
<script async="async" src="ws-custom/plugins/slider.js"/>
<script async="defer" src="ws-custom/plugins/functions.js"/>
which is not valid HTML according to the W3C validator (http://validator.w3.org/nu/).
I would also like to be able to write only async, i.e. the bare attribute without a value,
because that is valid HTML, even though it is not valid XML and SimpleXML cannot load it.
Thank you very much.
Best regards.

I've edited the script, and now it works.
Please note this line inside the loop:
$element[] = null;
<?php
$xml = simplexml_load_file('slider.xml');
$head = $xml->head->children();
foreach ($head as $element) {
    // With this assignment, asXML() no longer prints a self-closing tag
    $element[] = null;
    echo $element->asXML() . PHP_EOL;
}

SimpleXML can't output the empty tags properly, so you should use DOMDocument instead (LIBXML_NOEMPTYTAG doesn't work in SimpleXML):
$xml = new DOMDocument('1.0');
$xml->load("slider.xml");
$head = $xml->getElementsByTagName("head");
$headScripts= $head[0]->getElementsByTagName("script");
foreach($headScripts as $element){
echo $xml->saveXML($element, LIBXML_NOEMPTYTAG).PHP_EOL;
}
This code gets a start point (the <head> tag). As you only want the first one, it uses [0] and then finds the <script> tags inside that start point.
With the test source, this gives:
<script async="async" src="ws-custom/plugins/slider.js"></script>
<script async="defer" src="ws-custom/plugins/functions.js"></script>
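The question also asked for an array $head holding the markup of each element. A minimal sketch of that, building on the DOMDocument answer above: headScripts() is a hypothetical helper name, and the XML is passed in as a string (instead of loading slider.xml) only so the example is self-contained.

```php
<?php
// Hypothetical helper: returns the markup of every <script> inside the first <head>.
function headScripts(string $xmlSource): array
{
    $doc = new DOMDocument('1.0');
    $doc->loadXML($xmlSource);

    $headNode = $doc->getElementsByTagName('head')->item(0);
    if ($headNode === null) {
        return []; // no <head> element in the document
    }

    $scripts = [];
    foreach ($headNode->getElementsByTagName('script') as $script) {
        // LIBXML_NOEMPTYTAG keeps the explicit closing tag
        $scripts[] = $doc->saveXML($script, LIBXML_NOEMPTYTAG);
    }
    return $scripts;
}

$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<content>
<head>
<script async="async" src='ws-custom/plugins/slider.js'></script>
<script async="defer" src='ws-custom/plugins/functions.js'></script>
</head>
</content>
XML;

$head = headScripts($source);
echo $head[0], PHP_EOL, $head[1], PHP_EOL;
```

Note that this still serializes as XML, so a bare async attribute in the source file would remain a parse error.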

Related

php getimagesize with persian file name

I'm trying to write a Joomla plugin that adds width and height attributes to each <img> in an HTML file.
Some image file names are Persian, and getimagesize fails on them.
The code is this:
$dom = new DOMDocument();
@$dom->loadHTML('<?xml version="1.0" encoding="UTF-8"?>' . "\n" . '
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<img src="images\banners\س.jpg" style="max-width: 90%;" >
</body>
</html>
');
$x = new DOMXPath($dom);
foreach($x->query("//img") as $node)
{
$imgtag = $node->getAttribute("src");
$imgtag = pathinfo($imgtag);
$imgtag = $imgtag['dirname'].'\\'.$imgtag['basename'];
$imgtag = getimagesize($imgtag);
$node->setAttribute("width",$imgtag[0]);
$node->setAttribute("height",$imgtag[1]);
}
$newHtml = urldecode($dom->saveHtml($dom->documentElement));
And when Persian characters exist in file name, getimagesize shows:
Warning: getimagesize(images\banners\س.jpg): failed to open stream: No such file or directory in C:\wamp64\www\plugin.php
How can I solve this?
Thanks to all.
I couldn't get this to work on the WAMP server (a local server on Windows),
but after I migrated to a Linux server, this code finally worked properly.
$html = $app->getBody();
setlocale(LC_ALL, '');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$x = new DOMXPath($dom);
foreach($x->query("//img") as $node)
{
$imgtag = $node->getAttribute("src");
if(strpos($imgtag,"data:image")===false)
{
$imgtag = getimagesize($imgtag);
$node->setAttribute("width",$imgtag[0]);
$node->setAttribute("height",$imgtag[1]);
}
}
$bodytag = $x->query("//body");
$node = $dom->createElement("script", ' /* java script which may be necessary on client */ ');
$bodytag[0]->appendChild($node);
$html = '<!DOCTYPE html>'."\n" . $dom->saveHtml($dom->documentElement);
Some hints:
The code shouldn't touch base64 image sources, so I added a condition for that.
If a script (or anything else: div, p, ...) should be added to the body tag, you can use the appendChild method.
<!DOCTYPE html> should be prepended to the final DOM output :)
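The hints above can be sketched as one function that also skips images whose file cannot be read, so getimagesize never raises the warning. This is only a sketch under assumptions: addImageDimensions() and $baseDir are hypothetical names, src values are assumed to be paths relative to $baseDir, and it does not by itself address the Windows-specific failure described in the question.

```php
<?php
// Hypothetical helper: adds width/height to every <img> whose file is readable.
function addImageDimensions(string $html, string $baseDir): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    // The XML encoding prefix is a common way to make loadHTML() read the input as UTF-8
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src === '' || strpos($src, 'data:') === 0) {
            continue; // leave base64 (data URI) images untouched
        }
        // Normalize Windows-style separators and decode %XX sequences
        $path = $baseDir . '/' . str_replace('\\', '/', rawurldecode($src));
        $size = is_file($path) ? getimagesize($path) : false;
        if ($size === false) {
            continue; // missing or unreadable image: skip it
        }
        $img->setAttribute('width', (string) $size[0]);
        $img->setAttribute('height', (string) $size[1]);
    }

    return '<!DOCTYPE html>' . "\n" . $dom->saveHTML($dom->documentElement);
}
```

Usage would look like `$html = addImageDimensions($app->getBody(), JPATH_ROOT);`, assuming the image paths are relative to the Joomla root.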

How can I exclude the specific html block within body tag by using DOMDocument?

I'm using DOMDocument to get the HTML from a website. I want the HTML within <body></body>, and I've got that working. But inside the body there is a <nav>...</nav> block. How can I exclude only the <nav></nav> block by using DOMDocument?
Here is my Code:
<!DOCTYPE html>
<html>
<head>
<title>Title Here</title>
</head>
<?php
$d = new DOMDocument;
$mock = new DOMDocument;
$internalErrors = libxml_use_internal_errors(true);
$d->loadHTML(file_get_contents('http://www.example.com'));
$body = $d->getElementsByTagName('body')->item(0);
foreach ($body->childNodes as $child){
$mock->appendChild($mock->importNode($child, true));
}
libxml_use_internal_errors($internalErrors);
echo $mock->saveHTML(); //<body>.....</body>
?>
</html>
Please look at the accepted answer to this question:
PHP DOM: Get NodeValue excluding the child nodes
You can remove the 'nav' node right after gathering all the child nodes of the body.
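A minimal sketch of that idea, removing the nav elements before serializing the children of the body. bodyWithoutNav() is a hypothetical helper name, and the HTML is passed as a string here instead of being fetched with file_get_contents().

```php
<?php
// Hypothetical helper: returns the inner HTML of <body> with every <nav> removed.
function bodyWithoutNav(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // HTML5 tags such as <nav> trigger parser warnings
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Copy the live node list to an array first: removing nodes while
    // iterating a live DOMNodeList would skip elements.
    foreach (iterator_to_array($body->getElementsByTagName('nav')) as $nav) {
        $nav->parentNode->removeChild($nav);
    }

    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo bodyWithoutNav('<html><body><nav><a href="/">Home</a></nav><p>Content</p></body></html>');
```

Serializing each child with saveHTML($child) also avoids the second "mock" document used in the question.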

Get body without tags using tidy

http://php.net/manual/en/tidy.body.php will return the body content wrapped with the <body> tag. How do I get the body content without the <body> tag? I've come up with a couple possible solutions, however, they are not very elegant.
$tidy = new tidy;
$tidy->parseString($html);
$tidy->cleanRepair();
$body_content=trim(ltrim(rtrim(trim($tidy->body()->value),'</body>'),'<body>'));
var_dump($body_content);
$body=$tidy->body()->value;
$body_content=substr($body,7,strlen($body)-16);
var_dump($body_content);
$tidy->body() returns a tidyNode instance representing the body. Each tidyNode contains a child property containing an array of tidyNode instances for each child element. You can loop over these children to rebuild the inner html of the body tag. For example:
<?php
$html = <<<'HTML'
<html>
<head><title>test</title></head>
<body>
<h1>Hello!</h1>
<p>Hello world!</p>
</body>
</html>
HTML;
$tidy = new tidy;
$tidy->parseString($html);
$tidy->cleanRepair();
$bodyInnerHtml = '';
foreach($tidy->body()->child as $child) {
$bodyInnerHtml .= (string)$child;
}
var_dump($bodyInnerHtml);
will result in:
string(36) "<h1>Hello!</h1>
<p>Hello world!</p>
"
More information about the tidyNode class can be found in the documentation.

how to access DOM in php that will echo out everything between <html></html> [duplicate]

Possible Duplicate:
Export particular element in DOMDocument to string
I know how to access different elements by id, but I don't know how to get everything between the html start tag and the html end tag. Can anyone please help me?
Thanks.
If you would like to parse an HTML page with PHP, you could use PHP's DOMDocument class, like this:
// a new DOM object
$dom = new DOMDocument;
// keep white space (set before loading so that it takes effect)
$dom->preserveWhiteSpace = true;
// nicely format output
$dom->formatOutput = true;
// load the HTML into the object
$dom->loadHTML($html);
// get the root element by tag name
$htmlRootElement = $dom->getElementsByTagName('html')->item(0);
// print the <html> element and everything inside it, escaped for display
echo htmlspecialchars($dom->saveHTML($htmlRootElement), ENT_QUOTES);
Or you could do this with JavaScript on the client side:
var htmlRootElement = document.getElementsByTagName("html")[0];
alert(htmlRootElement.innerHTML);
You can access each element in the <html> tag with the DOMDocument class.
Example
$htmlDoc = new DOMDocument;
$html = <<<HTML
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>My Site</title>
<meta name="description" content="DOM test">
</head>
<body>
<h1>Hello</h1>
<p>This is a DOM test</p>
</body>
</html>
HTML;
$htmlDoc->loadHTML($html);
$htmlElement = $htmlDoc->getElementsByTagName("html");
foreach ($htmlElement->item(0)->childNodes as $element) {
echo 'Element name: ' . $element->nodeName . PHP_EOL;
echo 'Element value: '. $element->nodeValue . PHP_EOL;
}
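To get the markup between the <html> tags as one string, rather than node names and values, the children of the root element can be serialized one by one. A minimal sketch, where innerHtml() is a hypothetical helper name:

```php
<?php
// Hypothetical helper: serializes the children of a node, i.e. its "inner HTML".
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML('<!doctype html><html><head><title>My Site</title></head><body><h1>Hello</h1></body></html>');
libxml_clear_errors();
libxml_use_internal_errors($previous);

$root = $doc->getElementsByTagName('html')->item(0);
$inner = innerHtml($root);
echo $inner;
```

Passing the root element itself to saveHTML() would include the <html> tags; looping over childNodes leaves them out.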

Passing PHP variables to XML in a .php file

<?php
include "../music/php/logic/core.php";
include "../music/php/logic/settings.php";
include "../music/php/logic/music.php";
$top = "At world's end";
// create doctype
$dom = new DOMDocument("1.0");
header("Content-Type: text/xml");
?>
<music>
<?php $_xml = "<title>".$top."</title>";
echo $_xml; ?>
</music>
I'm using this code to generate a dynamic XML document. The file is saved as PHP.
My problem is that I can't echo PHP variables into the XML, although I can echo literal text. I can't see anything wrong with my approach; it just doesn't work!
I'm pretty new to XML so I've probably missed something glaringly simple.
I've also tried lines like:
<title><?php echo $top; ?></title>
You don't use DOM this way. You use the DOM API to create the entire document:
$doc = new DOMDocument();
$books = $doc->createElement( "books" );
$doc->appendChild( $books );
// ...
See:
http://www.ibm.com/developerworks/library/os-xmldomphp/
http://www.tonymarston.net/php-mysql/dom.html
https://web.archive.org/web/1/http://articles.techrepublic%2ecom%2ecom/5100-10878_11-6141415.html
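Applied to the document from the question, building everything through the DOM API might look like the sketch below. buildMusicXml() is a hypothetical function name; the element names music and title come from the question.

```php
<?php
// Hypothetical helper: builds <music><title>...</title></music> entirely with DOM calls.
function buildMusicXml(string $top): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;

    $music = $doc->createElement('music');
    $doc->appendChild($music);

    $title = $doc->createElement('title');
    // createTextNode() escapes special characters such as & and < on output
    $title->appendChild($doc->createTextNode($top));
    $music->appendChild($title);

    return $doc->saveXML();
}

// In a web context, send the header before any output:
// header('Content-Type: text/xml; charset=UTF-8');
echo buildMusicXml("At world's end");
```

Because saveXML() emits the XML declaration itself, nothing else should be printed by the script before or after it.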
A more verbose example (generating XHTML with DOM)
// Create the document that will own the elements
$document = new DOMDocument('1.0', 'utf-8');
// Create head element
$head = $document->createElement('head');
$metahttp = $document->createElement('meta');
$metahttp->setAttribute('http-equiv', 'Content-Type');
$metahttp->setAttribute('content', 'text/html; charset=utf-8');
$head->appendChild($metahttp);
See this tutorial on how to use DOM for XHTML. For reuse of code, you can write your own classes extending DOM classes to get configurable components.
If you don't want to use DOM or want to use plain text for generating the XML, just approach it like any other template, e.g.
<root>
<albums>
<album id="<?php echo $albumId; ?>">
<title><?php echo $title; ?></title>
... other elements ...
</album>
</albums>
</root>
You can store your XML string in a .php file and then render it to get the final formatted XML string. In many cases this is simpler than working with XML writers.
File template.php
<root>
<albums>
<album id="<?php echo $data['albumId']; ?>">
<title><?php echo $data['title']; ?></title>
... other elements ...
</album>
</albums>
</root>
Render
function render($template, array $data)
{
ob_start();
include $template;
return ob_get_clean();
}
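One thing the template approach does not do on its own is escape the values, so a title containing & or < would produce invalid XML. A minimal sketch of the render function with escaping added: xmlEscape() is a hypothetical helper, and the template is written to a temporary file only to keep the example self-contained.

```php
<?php
function render(string $template, array $data): string
{
    ob_start();
    include $template; // the template sees $data
    return ob_get_clean();
}

// Hypothetical helper: escapes a value for use in XML text or attributes.
function xmlEscape(string $value): string
{
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Stand-in for template.php
$template = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents(
    $template,
    '<root><album id="<?php echo xmlEscape($data["albumId"]); ?>">'
    . '<title><?php echo xmlEscape($data["title"]); ?></title></album></root>'
);

$xml = render($template, ['albumId' => '7', 'title' => 'Rock & Roll']);
unlink($template);

echo $xml;
```

The same escaping helper can be called from a real template.php in place of the plain echo statements.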
I think it's echo($_xml);
