how to save page source in a php variable?


I have an HTML email that is generated by several PHP functions, with content collected from several tables in the database and arranged in an HTML layout. Right now it all lives in a preview.php page for testing the layout, but when I send it to subscribers I need to get only the HTML generated by this page and send that in an email. By "page source" I mean what I see when I right-click the page and choose View Source. So how do I get this page source, or save it into a variable so I can use it?

Option 1:
Use file_get_contents(), which returns a file's contents as a string:
$html = file_get_contents('preview.php');
The contents of the file are now in the $html variable as a string. Note that a local path is read as-is, without executing any PHP, so this only works if preview.php is plain HTML.
Option 2:
If your preview.php contains some PHP processing, you can do this instead (so that the PHP code gets executed and you still get the resulting HTML):
ob_start();
include('preview.php');
$html = ob_get_contents();
ob_end_clean();
Again, your whole HTML is now saved in the $html variable as a string.
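Option 2 can be wrapped in a small helper. This is only a sketch: render_to_string() is a made-up name, and the temporary file below stands in for preview.php so the example runs on its own.

```php
<?php
// Hypothetical helper: runs a PHP template and returns its output as a string.
function render_to_string(string $path): string
{
    ob_start();
    try {
        include $path;
    } catch (Throwable $e) {
        // Discard the partial output so the buffer does not leak.
        ob_end_clean();
        throw $e;
    }
    return (string) ob_get_clean();
}

// Stand-in for preview.php: a tiny template written to a temporary file.
$tmp = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($tmp, '<p><?php echo 1 + 2; ?></p>');

$html = render_to_string($tmp);
unlink($tmp);

echo $html, PHP_EOL; // <p>3</p>
```

With the HTML in $html you can pass it to whatever mail function or library you use as the message body.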

You should generate your html with PHP and then save it in a session variable before echoing it.
Something like
$html = <<<HTML
<html>
<!-- Here you have the full html of the page -->
</html>
HTML;
session_start();
$_SESSION['html'] = $html;
echo $html;
Then when you want to send the email you simply do
$message = $_SESSION['html'];

What you want looks a bit weird to me (why not get it into a variable instead of echoing it?)
Anyway, have a look at the ob_start and ob_get_contents functions.

The right way is to have preview.php do this:
$html = '';
$html .= '<div>';
$html .= 'Text within div';
$html .= '</div>';
// etc
echo $html;
// Do other stuff with $html
But if you just want the lazy way, leaving preview.php doing echo statements, do this:
ob_start();
// Make the HTML using echo
$html = ob_get_contents();
ob_end_clean();

Related

how to parse this to get title in php

I want to parse the HTML code in $raw to get the title and save it to MySQL. I have tried to do it with PHP DOM and the Ganon HTML parser, but when I run it, it shows me an error 500. It would be great if you could solve this problem with Ganon.
function store($raw)
{
include_once('ganon.php');
$html = file_get_dom($raw);
echo $html('title', 0)->parent->getPlainText();
}
store ('<html> all html code </html>');

There are a few problems with your code.
Firstly, you use file_get_dom(), which expects to be passed a file name, so use str_get_dom() instead.
Secondly, the example HTML doesn't contain a title, so this won't work.
Then, when you find the title, you go to the parent element and output from there. You just need to use that node's content.
include_once('ganon.php');
function store($raw)
{
$html = str_get_dom($raw);
echo $html('title', 0)->getPlainText();
}
store ('<html><title>Title</title> all html code </html>');
outputs...
Title
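If Ganon keeps giving you trouble, the same thing can be done with PHP's built-in DOM extension. A minimal sketch, assuming the dom extension is enabled; extract_title() is just an illustrative name, not part of any library.

```php
<?php
// Hypothetical helper: returns the text of the first <title>, or null if there is none.
function extract_title(string $raw): ?string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($raw);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    return $titles->length > 0 ? trim($titles->item(0)->textContent) : null;
}

$title = extract_title('<html><head><title>Title of page</title></head><body>all html code</body></html>');
echo $title, PHP_EOL;
```

The null return lets the caller decide what to store in MySQL when a page has no title.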

php dom replacedChild, save as html and continue parsing

I created a PHP parser for editing the HTML that is created by a CMS. The first thing I do is parse a custom tag for adding modules.
After that, things like links and images are updated or changed where needed. This all works.
Now I noticed that when a custom tag is replaced with the HTML the module generated, this HTML is NOT processed by the rest of the actions.
For example: all links with an href of /pagelink-001 are replaced with the actual link of the current page. This works for the initially loaded HTML, but not for the replaced tag. Below I have a short version of the code. I tried saving it with saveHTML() and loading it again with loadHTML(), and things like that.
I'm guessing this is because $doc, with the loaded HTML, is not updated as such.
My code:
$html = '<a href="/pagelink-001">Link1</a><customtag></customtag>';
// Load the html (all other settings are not shown to keep it simple. Can be added if this is important)
$doc->loadHTML($html);
// Replace custom tag
foreach($xpath->query('//customtag') as $module)
{
// Create fragment
$return = $doc->createDocumentFragment();
// Check the kind of module
switch($module)
{
case 'news':
$html = $this->ZendActionHelperThatReturnsHtml;
// <div class="news"><a href="/pagelink-001">Link2</a></div>
break;
}
// Fill fragment
$return->appendXML($html);
// Replace tag with html
$module->parentNode->replaceChild($return, $module);
}
foreach($doc->getElementsByTagName('a') as $link)
{
// Replace the /pagelink with a correct link
}
In this example Link1 href is replaced with the correct value, however Link2 is not. Link2 does correctly appear as a link and all that works fine.
Any directions of how I can update the $doc with the new html or if that is indeed the problem would be awesome. Or please tell me if I'm completely wrong (and where to look)!
Thanks in advance!!

It turned out I was right: the returned value was being inserted as a plain string and not as HTML. I discovered in my code the innerHTML function from @Keyvan that I had implemented at some point. This resulted in my function being this:
// Start with the modules, so all that content can be fixed as well
foreach($xpath->query('//customtag') as $module)
{
// Create fragment
$fragment = $doc->createDocumentFragment();
// Check the kind of module
switch($module)
{
case 'news':
$html = htmlspecialchars_decode($this->ZendActionHelperThatReturnsHtml); // Note htmlspecialchars_decode!
break;
}
// Set contents as innerHtml instead of string
$module->innerHTML = $html;
// Append child
$fragment->appendChild($module->childNodes->item(0));
// Replace tag with html
$module->parentNode->replaceChild($fragment, $module);
}
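For readers without that innerHTML helper, here is one other way to get real nodes into $doc: parse the module's HTML in a second DOMDocument and import the nodes. This is a sketch and not the code from the answer above; html_to_fragment() is a made-up name, the module HTML is hard-coded, and /current-page stands in for the real link.

```php
<?php
// Hypothetical helper: parses an HTML snippet and returns it as a fragment owned by $doc.
function html_to_fragment(DOMDocument $doc, string $html): DOMDocumentFragment
{
    $tmp = new DOMDocument();
    // Wrap the snippet in a div so its nodes are easy to find again.
    $tmp->loadHTML('<div>' . $html . '</div>');
    $wrapper = $tmp->getElementsByTagName('div')->item(0);

    $fragment = $doc->createDocumentFragment();
    foreach (iterator_to_array($wrapper->childNodes) as $child) {
        // importNode() copies the node (and its children) into $doc.
        $fragment->appendChild($doc->importNode($child, true));
    }
    return $fragment;
}

libxml_use_internal_errors(true); // <customtag> is unknown to libxml; keep warnings quiet

$doc = new DOMDocument();
$doc->loadHTML('<a href="/pagelink-001">Link1</a><customtag></customtag>');
$xpath = new DOMXPath($doc);

// Replace each custom tag with parsed nodes instead of a string.
foreach (iterator_to_array($xpath->query('//customtag')) as $module) {
    $moduleHtml = '<div class="news"><a href="/pagelink-001">Link2</a></div>';
    $module->parentNode->replaceChild(html_to_fragment($doc, $moduleHtml), $module);
}

// The later pass now sees both links.
$rewritten = 0;
foreach ($doc->getElementsByTagName('a') as $link) {
    if ($link->getAttribute('href') === '/pagelink-001') {
        $link->setAttribute('href', '/current-page');
        $rewritten++;
    }
}
echo $rewritten, PHP_EOL; // 2
```

Because the imported nodes are ordinary elements of $doc, any processing that runs after the replacement treats them like the rest of the page.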

How to make a link in a mysql stored variable clickable when rendered on the page

I have a function that enables members on a site to message each other; the message is stored in mysql database.
My question now is this: what is the best way to allow members to include a link in the message so that, when rendered, it is rendered as a click-able link.
I've tried the following:
<a href="testpage.html"> click here</a>
but when I then render it on the page with the code below, it comes out as literal text instead of a clickable link:
$message = nl2br($this->escapeHtml(trim($this->theMessage[0]['message'])));
echo $message; // <a href="testpage.html"> click here</a> (shown as text)
The var_dump value of the message is:
string '<a href="testpage.html"> click here</a>'

HTML markup from users is tricky: if someone injects unsavory HTML and you display it to other users, you've got an XSS attack on your hands. Imagine an added onclick handler, etc. Any data from outside is dangerous.
markup language
This is one of the reasons why markup languages like BBCode and Markdown exist.
You don't want every piece of HTML markup, only the clean and safe stuff.
Basically, you want to work with a restricted set of "content".
One way of allowing data from outside is to use an "intermediate" markup language.
It is intermediate because it is a custom format which is later transformed into HTML.
This happens here on Stack Overflow, too:
[link](http://google.com) = link
Tell your users: "to insert a link, use this special syntax".
Save the content to the database.
The content you store in the database is something like:
The message text. And some markdown [link](http://google.com).
When you fetch the message from the database, you process the Markdown content:
$messageFromDb = 'The message. [google](http://google.com)';
$parsedown = new Parsedown();
$parsedown->setSafeMode(true); // escape raw HTML in the input (Parsedown 1.7+)
$html = $parsedown->text($messageFromDb);
echo $html; // ready to show
Result: <p>The message. <a href="http://google.com">google</a></p>
There are libraries out there ready for usage, like
http://parsedown.org/
https://github.com/egil/php-markdown-extra-extended
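To show the idea behind the intermediate format without pulling in a library, here is a toy converter that understands only [label](http://...) links. It is a sketch, not a Markdown parser; render_links() is a made-up name, and for real use one of the libraries above is the better choice.

```php
<?php
// Hypothetical helper: escapes all HTML, then turns [label](url) into links.
function render_links(string $text): string
{
    // Escape everything first, so no user-supplied HTML survives.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Only http:// and https:// URLs are accepted as link targets.
    return preg_replace_callback(
        '~\[([^\]]+)\]\((https?://[^\s)]+)\)~',
        static fn (array $m): string => '<a href="' . $m[2] . '">' . $m[1] . '</a>',
        $safe
    );
}

$rendered = render_links('The message. [google](http://google.com) <script>x</script>');
echo $rendered, PHP_EOL;
// The message. <a href="http://google.com">google</a> &lt;script&gt;x&lt;/script&gt;
```

The order matters: escaping comes first, and the only HTML in the result is what the converter itself writes.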
filter html
Another way is to allow HTML, but only a restricted set. You would have to filter the inserted HTML, to pick only the good content and drop the rest.
PHP Extension Tidy: http://php.net/manual/en/book.tidy.php
Libraries like http://htmlpurifier.org/
DOM based HTML filter
Instead of relying on a filter library, you could also come up with a "little" DOM based HTML filter.
The following example re-creates a clean link from a bad one.
You should also check the URL attributes to ensure they use known-good schemes like http:, and not troublesome ones like javascript:.
This lets you whitelist the combination of elements, and control the nesting and the content.
<?php
// content from form
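$html = 'Message <a href="http://example.com/" onclick="alert(1)"><img src="injectionHell.png" title="The Link" /> Link Text</a>'; // href and onclick are placeholder values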
$dom = new DOMDocument;
$dom->formatOutput = true;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD | LIBXML_NOXMLDECL);
// filter, then rebuild a clean link
foreach ($dom->getElementsByTagName('a') as $node)
{
// extract the values
$title = $node->nodeValue;
$href = $node->getAttribute('href');
// maybe add a href filter?
// to remove links to bad websites
// and to remove href="javascript:"
// oh boy ... simple questions, resulting in lots of work ;)
// create a new "clean" link element
$link = $dom->createElement('a', $title);
$link->setAttribute('href', $href);
// replace old node
$node->parentNode->replaceChild($link, $node);
}
$html = $dom->saveXML();
// drop html, body, get only html fragment
// http://stackoverflow.com/q/11216726/1163786
$html = preg_replace('~<(?:!DOCTYPE|/?(?:html|body|p))[^>]*>\s*~i', '', $dom->saveHTML());
var_dump($html);
Before
Message <a href="http://example.com/" onclick="alert(1)"><img src="injectionHell.png" title="The Link" /> Link Text</a>
After
Message <a href="http://example.com/"> Link Text</a>
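The href filter that the comments in the loop ask for could look roughly like this. It is a sketch under a strict assumption: only absolute http and https links are accepted, and everything else (javascript:, data:, relative URLs) is rejected. is_safe_href() is a made-up name.

```php
<?php
// Hypothetical helper: accepts only absolute http(s) URLs.
function is_safe_href(string $href): bool
{
    $scheme = parse_url(trim($href), PHP_URL_SCHEME);

    // parse_url() gives null when there is no scheme and false for malformed URLs.
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

var_dump(is_safe_href('http://google.com'));   // bool(true)
var_dump(is_safe_href('javascript:alert(1)')); // bool(false)
var_dump(is_safe_href('testpage.html'));       // bool(false)
```

Inside the loop you would call it before setAttribute('href', $href) and drop or unlink the node when it returns false. If your messages need relative links, the rule has to be loosened deliberately.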
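To store "HTML in database"
When storing: escape the value for the database. Prepared statements (PDO or mysqli) are the safer choice; addslashes() is not a reliable substitute.
When returning text from DB: apply stripslashes() before rendering only if the stored text actually contains the added slashes.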

A simple way to attain your goal is to save the message including the <a> tags.
You can use an HTML sanitizer so that you accept <a> link tags from your users while removing any potentially dangerous tags.
Then you wouldn't escape the saved text when you output it.
Have a look at HTML purifier.
Alternatively, you could use a Markdown parser to convert plain text to HTML.

Your code escapes the HTML tags, so they are shown as text. That is what this call does:
escapeHtml()
What you need is a function that removes all HTML tags except the one you want to keep, in this case the link tag:
<a>
Here is the function; you can add it to your class:
function stripme($msg){
$msg = strip_tags($msg,'<a>');
return $msg ;
}
and then call it for your message like this:
$message = nl2br($this->stripme($this->theMessage[0]['message']));
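One thing to keep in mind with this approach: strip_tags() removes tags that are not allowed, but it leaves the attributes of allowed tags untouched, and it keeps the text that was inside removed tags. A short demo with a made-up message:

```php
<?php
// Made-up input: an allowed <a> with an event handler, plus a <script> tag.
$input  = '<a href="testpage.html" onclick="evil()">click here</a><script>bad()</script>';
$output = strip_tags($input, '<a>');

echo $output, PHP_EOL;
// <a href="testpage.html" onclick="evil()">click here</a>bad()
```

So the onclick attribute survives. If the messages come from untrusted users, the attributes still need to be filtered, for example with HTML Purifier.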

setting data via Jquery/Javascript in a ob_get_contents

My question title might not be clear, but here is the code:
<?php
$filename = 'myfile.htm';
ob_start();
?>
<div id='test'>my original value</div>
<?php
$htmlcontent = ob_get_contents();
file_put_contents($filename, $htmlcontent);
ob_end_clean();
So this code will eventually create a new file containing the text 'my original value'.
Is it possible to alter the div's value through JavaScript/jQuery before it is written to the file?
Why am I doing this? Because I will eventually be adding a jQuery graph library and want to save its output to the file,
and later use wkhtmltopdf to generate a PDF version of that HTML page.

No. You'll have to display the page along with all of the JavaScript you want to use. Then you create a form to gather the contents of the page (after it's been manipulated by your graph library) and post it back to PHP, where it can be saved to a file.

Hmm, well, you can try one thing. I don't know what the content of myfile.htm looks like, but you can try to load this content with something like DOMDocument, using the loadHTMLFile method and getElementById.
So:
<div id="test1">value</div>
could be retrieved with
// pseudo
$dom = new DOMDocument();
$dom->loadHTMLFile('myfile.htm');
$node = $dom->getElementById('test1');
$dom->saveHTMLFile('etc ..
Then execute a $.post to 'manipulate' the existing myfile.htm and overwrite it.
cheers

How to use PHP to show HTML source

I created a few PHP files for users of a popular hardware site to use to "Metro" their news posts. It works fairly well, you add the title of the article, links etc. and then it spits it out in Metro tile format.
Take a look: http://briandempsey.org.uk/Newstool/index.php
When the user submits, it uses the information provided to create the post. Now, I need to somehow use PHP or some other language to display the code that it generated underneath it so users can just copy and paste it. I'm not sure how to do this.

header('Content-Type: text/plain');
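If you would rather keep the rendered tiles on the page and show the code underneath, another option is to escape the generated markup and print it in a textarea. A sketch; $generated is a stand-in for whatever HTML your form handler builds.

```php
<?php
// Stand-in for the markup built from the submitted form.
$generated = '<div class="tile"><a href="http://example.com/">Title</a></div>';

// Escaping makes the browser show the tags instead of rendering them.
$display = '<textarea rows="10" cols="80" readonly>'
    . htmlspecialchars($generated, ENT_QUOTES, 'UTF-8')
    . '</textarea>';

echo $generated; // the rendered tile
echo $display;   // the copy-and-paste source
```

The browser turns the entities back into plain characters inside the textarea, so users copy the original markup.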

Since you're passing your form data using the GET method, you could instead pass it to a page that builds a URL to pull the HTML from.
index.php will have the form as you've shown above and will submit to urlCreator.php.
form.php no longer needs to be the form's target; it is fetched by urlCreator.php, where the magic happens.
urlCreator.php (NEW) will have code in it like so:
<?php
// urlCreator.php will get variables passed to it from index.php
// Now create the url using the passed variables (urlencode keeps spaces and & in the values from breaking the url)
$url = "http://briandempsey.org.uk/Newstool/form.php?title=" . urlencode($_GET['title']) . "&color=" . urlencode($_GET['color']) . "&Articlecontent=" . urlencode($_GET['Articlecontent']); // and so on to build the whole url
// Now get the page's html
$html = file_get_contents($url);
?>
Now that you have the html in a variable you can clean it using str_replace or manipulate it however you'd like.
<?php
// Still in urlCreator.php
// With the html in a variable you could now edit out the divs and echo the results
$html = str_replace("<div style=\"background-color: #008299; height: auto; width: 100%; margin-left: 10px;\">", "", $html); //removes the intro div
$html = str_replace("</div>", "", $html); //removes the closing div
//Finally echo out the html to the page for copy+paste purposes
echo $html;
?>
