Change stylesheet attribute inside iframe using DOMDocument class - php

I'm using the DOMDocument class to change an attribute of a link tag when the page loads.
The tag that I want to manipulate is inside an iframe block, and my code has no effect in that specific case.
Here's the code:
$page_content = file_get_contents($page_link);
$dom = new DOMDocument();
$dom->loadHtml($page_content);
$links = $dom->getElementsByTagName('link');
foreach( $links as $k => $link ){
if( $link->getAttribute('rel') === 'stylesheet' ){
$link->setAttribute('rel', 'test'); // just for testing
}
}
$newHtml = $dom->saveHtml();
echo $newHtml;
But only the <link> in the main document is affected, while the <link> inside the iframe block is not:
<head>
<link rel="test" href="style.css"/>
</head>
<body>
.
.
.
.
<iframe>
<head>
<link rel="stylesheet" href="style.css"/><!--"rel" stays the same-->
</head>
</iframe>
.
.
.
</body>
<footer>
</footer>
Much appreciate your help!
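One possible way to approach this, sketched under the assumption that the framed page is a separate document that the browser loads through the iframe's src attribute (so it is not part of the string returned by file_get_contents() for the outer page). The helper names rewriteStylesheetRel() and iframeSources() are hypothetical and not part of the original code:

```php
<?php
// Hypothetical helper: change rel="stylesheet" on every <link> of ONE HTML document.
function rewriteStylesheetRel(string $html, string $newRel): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them for imperfect markup.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('link') as $link) {
        if ($link->getAttribute('rel') === 'stylesheet') {
            $link->setAttribute('rel', $newRel);
        }
    }
    return $dom->saveHTML();
}

// Hypothetical helper: list the src of every <iframe> so that each framed
// document can be fetched and rewritten separately (assumption: frames use src).
function iframeSources(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $sources = [];
    foreach ($dom->getElementsByTagName('iframe') as $frame) {
        if ($frame->hasAttribute('src')) {
            $sources[] = $frame->getAttribute('src');
        }
    }
    return $sources;
}
```

Each URL returned by iframeSources() could then be fetched on its own and passed through rewriteStylesheetRel(); serving the rewritten framed document back to the browser is a separate step that this sketch does not cover.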

Related

php getimagesize with persian file name

I'm trying to write a Joomla plugin that adds width and height attributes to each <img> in an HTML file.
Some image file names are Persian, and getimagesize fails on them.
This is the code:
$dom = new DOMDocument();
@$dom->loadHTML('<?xml version="1.0" encoding="UTF-8"?>' . "\n" . '
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<img src="images\banners\س.jpg" style="max-width: 90%;" >
</body>
</html>
');
$x = new DOMXPath($dom);
foreach($x->query("//img") as $node)
{
$imgtag = $node->getAttribute("src");
$imgtag = pathinfo($imgtag);
$imgtag = $imgtag['dirname'].'\\'.$imgtag['basename'];
$imgtag = getimagesize($imgtag);
$node->setAttribute("width",$imgtag[0]);
$node->setAttribute("height",$imgtag[1]);
}
$newHtml = urldecode($dom->saveHtml($dom->documentElement));
When the file name contains Persian characters, getimagesize shows:
Warning: getimagesize(images\banners\س.jpg): failed to open stream: No such file or directory in C:\wamp64\www\plugin.php
How can I solve this?
Thanks to all.
I couldn't get this to work on the WAMP server (a local server on Windows),
but after I migrated to a Linux server, this code finally worked properly.
$html = $app->getBody();
setlocale(LC_ALL, '');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$x = new DOMXPath($dom);
foreach($x->query("//img") as $node)
{
$imgtag = $node->getAttribute("src");
if(strpos($imgtag,"data:image")===false)
{
$imgtag = getimagesize($imgtag);
$node->setAttribute("width",$imgtag[0]);
$node->setAttribute("height",$imgtag[1]);
}
}
$bodytag = $x->query("//body");
$node = $dom->createElement("script", ' /* java script which may be necessary on client */ ');
$bodytag[0]->appendChild($node);
$html = '<!DOCTYPE html>'."\n" . $dom->saveHtml($dom->documentElement);
Some hints:
The code shouldn't touch base64 image sources, so I added a condition to skip them.
If a script (or anything else: div, p, ...) needs to be added to the body tag, you can use the appendChild method.
<!DOCTYPE html> should be prepended to the final DOM output :)
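The hints above can be put together in a small sketch. It is not the plugin's actual code: the function name addImageDimensions() is hypothetical, and it assumes every src is a path that getimagesize() can open from the current working directory. Unlike the code above, it also skips images that cannot be read, since getimagesize() returns false in that case:

```php
<?php
// Hypothetical helper: add width/height to every <img> whose file can be measured.
function addImageDimensions(string $html): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // Skip empty sources and inline base64 images (data: URIs).
        if ($src === '' || strpos($src, 'data:') === 0) {
            continue;
        }
        // getimagesize() returns false when the file cannot be opened or parsed.
        $size = @getimagesize($src);
        if ($size === false) {
            continue;
        }
        $img->setAttribute('width', (string) $size[0]);
        $img->setAttribute('height', (string) $size[1]);
    }
    // saveHTML() with a node argument omits the doctype, so prepend it by hand.
    return "<!DOCTYPE html>\n" . $dom->saveHTML($dom->documentElement);
}
```

Skipping unreadable files means a single bad path leaves that one image without dimensions instead of writing empty width and height attributes.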

Making An HTML Directory List

I have a form on my site where users can enter links to articles.
So far, when a link is submitted, I am able to get that link to post to a destination HTML page.
However, if another link is submitted, it replaces the first one.
I would like the links to 'stack' and form a list on the destination (directory) page (which is currently an HTML page).
I don't know how to achieve this. Any advice or examples would be greatly appreciated.
I have included a very stripped-down version of all three pages:
1.) The Form
<!DOCTYPE html>
<html>
<head>
<title>FORM</title>
<style>
body{margin-top:20px; margin-left:20px;}
.fieldHeader{font-family:Arial, Helvetica, sans-serif; font-size:12pt;}
.articleURL{margin-top:10px; width:700px; height:25px;}
.btnWrap{margin-top:20px;}
.postButton{cursor:pointer;}
</style>
</head>
<body>
<form action="urlUpload.php" method="post" enctype="multipart/form-data">
<div class="fieldHeader">Enter Article Link:</div>
<input class="articleURL" id="articleURL" name="articleURL" autocomplete="off">
<div class="btnWrap"><input class="postButton" type="submit" name="submit" value="POST"></div>
</form>
</body>
</html>
2.) The Upload PHP (buffer) Page
<?php ob_start(); ?>
<!DOCTYPE html>
<html>
<head>
<title>urlUpload</title>
<style>body{margin-top:20px; margin-left:20px;}</style>
</head>
<body>
<?php $articleURL = htmlspecialchars($_POST['articleURL']); echo $articleURL;?>
</body>
</html>
<?php echo ''; file_put_contents("urlDirectory.html", ob_get_contents()); ?>
3.) The Destination HTML 'Directory List' page
<!DOCTYPE html>
<html>
<head>
<title>urlDirectory</title>
<style>body{margin-top:20px; margin-left:20px;}</style>
</head>
<body>
Submitted URLs should be listed here:
</body>
</html>
PS: I may not even need the middle php 'buffer' page. My knowledge of this sort of thing is limited thus far. If I don't need that, and can skip that page to accomplish my needs, please advise as well.
You can do this by using PHP to write the file and using urlDirectory.html as a template. You will just need to change your php file:
urlUpload.php
<?php
function saveUrl($url, $template, $tag)
{
// If template is invalid, return
if (!file_exists($template)) {
return false;
}
// Remove whitespace from URL
$url = trim($url);
// Ignore invalid urls
if (!filter_var($url, FILTER_VALIDATE_URL)) {
return true;
}
// Read template into array
$html = file($template);
foreach ($html as &$line) {
// Look for the tag; we will add our new URL directly before it. Use
// preg_match in case the tag is preceded or followed by some other text
if (preg_match("/(.*)?(" . preg_quote($tag, '/') . ")(.*)?/", $line, $matches)) {
// Create line for URL, using the trimmed and validated $url argument
$urlLine = '<p>' . htmlspecialchars($url) . '</p>' . PHP_EOL;
// Handle lines that just contain body and lines that have text before body
$line = $matches[1] == $tag ? $urlLine . $matches[1] : $matches[1] . $urlLine . $matches[2];
// If we have text after body add that too
if (isset($matches[3])) {
$line .= $matches[3];
}
// Don't process any more lines
break;
}
}
// Save file
return file_put_contents($template, implode('', $html));
}
$template = 'urlDirectory.html';
$result = saveUrl($_POST['articleURL'], $template, '</body>');
// Output to browser
echo $result ? file_get_contents($template) : 'Template error';
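As an alternative to matching the closing tag line by line, the same "append before </body>" idea can be sketched with PHP's built-in DOMDocument. This swaps in a different technique from the answer above; the function name appendUrlToHtml() is hypothetical, and reading and writing urlDirectory.html is left to the caller:

```php
<?php
// Hypothetical helper: append a URL as a new <p> at the end of <body>.
// Works on an HTML string; the caller decides where the string comes from.
function appendUrlToHtml(string $html, string $url): string
{
    $url = trim($url);
    // Leave the page unchanged when the input is not a valid URL.
    if (!filter_var($url, FILTER_VALIDATE_URL)) {
        return $html;
    }

    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }

    $p = $dom->createElement('p');
    // A text node is escaped on output, so no manual htmlspecialchars() is needed.
    $p->appendChild($dom->createTextNode($url));
    $body->appendChild($p);

    return $dom->saveHTML();
}
```

A caller could do something like file_put_contents($template, appendUrlToHtml(file_get_contents($template), $_POST['articleURL'])), so each submission adds to the existing list instead of replacing it.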

Reading and encoding html

I am trying to read and display the content of the title (contained in an h1 tag) from many HTML files. These files are all in the same folder.
This is what the HTML files look like:
<!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01//EN'>
<html>
<head>
<title>A title</title>
<style type='text/css'>
... Styles here ...
</style>
</head>
<body>
<h1>Être aidant</h1>
<p>En général, les aidants doivent équilibrer...</p>
... more tags ...
</body>
I have tried to display the content of the h1 tag with this PHP script:
<?php
foreach (glob("test/*.html") as $file) {
$file_handle = fopen($file, "r");
$doc = new DOMDocument();
$doc->loadHTMLfile($file);
$title = $doc->getElementsByTagName('h1');
if ( $title && 0<$title->length ) {
$title = $title->item(0);
$content = $doc->savehtml($title);
echo $content;
}
fclose($file_handle);
}
?>
But the output contains wrong characters. For the example file, the output is:
Être aidant
How can I achieve this output?
Être aidant
You should state a charset in the <head> of your HTML document.
<meta charset="utf-8">
You need to use UTF-8 encoding.
Change echo $content to echo utf8_encode($content); (note that utf8_encode() is deprecated as of PHP 8.2).
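If the files themselves can't be edited, a commonly used workaround is to prepend a charset declaration to the markup before handing it to DOMDocument. The sketch below assumes the files are saved as UTF-8; the function name firstHeadingText() is hypothetical:

```php
<?php
// Hypothetical helper: return the text of the first <h1> in a UTF-8 HTML string.
function firstHeadingText(string $html): ?string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    // Assumption: the input is UTF-8. The prepended meta tag tells the
    // parser so, instead of letting it fall back to a single-byte encoding.
    $doc->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html);
    libxml_clear_errors();

    $h1 = $doc->getElementsByTagName('h1')->item(0);
    // textContent is returned as a UTF-8 string.
    return $h1 === null ? null : $h1->textContent;
}
```

Inside the original loop this could be called as firstHeadingText(file_get_contents($file)). The page that echoes the result still has to be served as UTF-8 for the browser to show the accents correctly.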

How to get the href content of a link tag

I have a link tag. I want to get the href so that I can get the external CSS code.
This is what I tried:
<link rel="stylesheet" href="CSS/main.css" type="text/css">
<?php
include('simple_html_dom.php');
$html = new simple_html_dom();
$html->load_file("test.txt");
$file = fopen("link.txt","w");
$link=$html->find("link");
foreach($link AS $lk){
$lk->href;
$line_string=file_get_contents($lk);
fwrite($file,($line_string. PHP_EOL));
}
fclose($file);
?>
You're not assigning the href value to anything:
$lk->href;
That returns the value of the href but doesn't store it, so file_get_contents() receives the element object instead of the URL. It should be more like:
$link=$html->find("link");
foreach($link AS $lk){
$hr=$lk->href;
$line_string=file_get_contents($hr);
fwrite($file,($line_string. PHP_EOL));
}
Your line "$lk->href" isn't doing anything. Try assigning it to a variable and writing that variable. For example:
foreach($link AS $lk){
$href = $lk->href;
$line_string=file_get_contents($href);
fwrite($file,($line_string. PHP_EOL));
}
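For comparison, the href values can also be collected without simple_html_dom. The sketch below swaps in PHP's built-in DOMDocument, which is not what the question uses; the function name stylesheetHrefs() is hypothetical:

```php
<?php
// Hypothetical helper: collect the href of every <link rel="stylesheet">.
function stylesheetHrefs(string $html): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $hrefs = [];
    foreach ($dom->getElementsByTagName('link') as $link) {
        $isStylesheet = strtolower($link->getAttribute('rel')) === 'stylesheet';
        if ($isStylesheet && $link->hasAttribute('href')) {
            // Store the attribute value itself, not the element object.
            $hrefs[] = $link->getAttribute('href');
        }
    }
    return $hrefs;
}
```

Each returned value is a plain string, so it can be passed to file_get_contents() directly. Relative paths such as CSS/main.css are resolved against the script's working directory, not against the location of the HTML file.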

how to access DOM in php that will echo out everything between <html></html> [duplicate]

Possible Duplicate:
Export particular element in DOMDocument to string
I know how to access different elements by id, but I don't know how to get everything between the opening html tag and the closing html tag. Can anyone please help me?
Thanks.
If you would like to parse an html page with PHP, you could use PHP's DOMDocument extension, as such:
// a new DOM object
$dom = new DOMDocument;
// keep white space (must be set before loading to take effect)
$dom->preserveWhiteSpace = true;
// nicely format output
$dom->formatOutput = true;
// load the HTML into the object
$dom->loadHTML($html);
// get the root <html> element
$htmlRootElement = $dom->getElementsByTagName('html')->item(0);
// print the <html> element and everything inside it, escaped for display
echo htmlspecialchars($dom->saveHTML($htmlRootElement), ENT_QUOTES);
Or you could do this with JavaScript on the client side:
var htmlRootElement = document.getElementsByTagName("html")[0];
alert(htmlRootElement.innerHTML);
You can access each element in the <html> tag with the DOMDocument class.
Example
$htmlDoc = new DOMDocument;
$html = <<<HTML
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>My Site</title>
<meta name="description" content="DOM test">
</head>
<body>
<h1>Hello</h1>
<p>This is a DOM test</p>
</body>
</html>
HTML;
$htmlDoc->loadHTML($html);
$htmlElement = $htmlDoc->getElementsByTagName("html");
foreach ($htmlElement->item(0)->childNodes as $element) {
echo 'Element name: ' . $element->nodeName . PHP_EOL;
echo 'Element value: '. $element->nodeValue . PHP_EOL;
}
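Building on the childNodes loop above, the markup between the opening and closing html tags (without the tags themselves) can be assembled by serializing each child node. This is a minimal sketch; the function name innerHtmlOfRoot() is hypothetical:

```php
<?php
// Hypothetical helper: return the markup inside <html>...</html>,
// excluding the <html> tags and the doctype.
function innerHtmlOfRoot(string $html): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $root = $doc->documentElement;
    if ($root === null) {
        return '';
    }

    $out = '';
    foreach ($root->childNodes as $child) {
        // saveHTML() with a node argument serializes only that subtree.
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Because the string goes through the parser, the result is the parser's normalized view of the markup, which may differ from the original source in whitespace and in how attributes are quoted.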
