Cleanup malicious code while allowing some HTML (before persisting to DB) - php

I have a Symfony project where any user can register for an account and then create a page using a form that includes a content field. I want to allow users to insert some HTML (like bold text, numbered lists and some other elements), which I have done by using a WYSIWYG editor, CKEditor. I have configured a toolbar that only offers the elements I have chosen, so only those should end up in the database when the page is saved. I can show the content of this page by using:
{{ page.content | raw }}
This all works as expected. However, if a user were to copy the POST request, edit in some JS or other HTML and send it with cURL, they could insert (harmful) code. My question is: how do I prevent this from happening?
I have been reading about 'sanitization' or 'purification' to clean up user input. Something like HTML Purifier could clean up the output, which I also considered doing by creating a sort of 'whitelist Twig filter' for the elements I do allow. Preferably I would clean up the input before persisting it to the database. I imagine this is a common issue, but I only find solutions for cleaning up the output, usually by escaping all HTML, which is not a solution in my case because I do want to allow some HTML.

You could purify the content in your form type after the user submits the form, using the HTML Purifier library and Symfony form events:
use HTMLPurifier;
use HTMLPurifier_Config;
use Symfony\Component\Form\FormEvent;
use Symfony\Component\Form\FormEvents;
$builder->addEventListener(FormEvents::SUBMIT, function (FormEvent $event) {
    $object = $event->getData();

    // Only these elements survive; everything else is stripped.
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.AllowedElements', ['a', 'b', 'strong', 'ul', 'li', 'p', 'br']);
    // Allow target="_blank" on links.
    $config->set('Attr.AllowedFrameTargets', ['_blank']);

    $purifier = new HTMLPurifier($config);
    $content = $purifier->purify($object->getContent());
    $object->setContent($content);
});
In this example the user's content is cleaned up on submit. HTML.AllowedElements defines which elements are kept; every other element is removed. After that, the entity is ready to be persisted to your database without any unwanted HTML from the user.

The trick is not to manipulate user input. You should validate/reject user input (example: user uploads 10GB of data, or the user starts a div element, but doesn’t end it), but don’t change it. It’s not going anywhere or going to infect anyone by sitting in a database.
When you display the page to the user, that is when you manipulate the data. Like you said, escape your characters: &lt; for <, &amp; for &, and &quot; for ".
I was recently programming for this, and what I did was use an XML parser (luaexpat). In your case, you have PHP that has an XML parser library.
Run the user input HTML through the XML parser. If any unauthorized elements show up, you can either escape them (&lt;) on output or show an error instead of the content. It is also good to make sure that the content is valid XML, so a user can't mess up the rest of the page by not closing an element.
Another idea is to store "version identifiers" of post types. If you decide to add more features/attributes or switch to another encoding (like BBCode), write a note in the database so it will be easier to decode the posts. This is another reason why you should NOT change user input, but rather the output: you might start off by denying images and then decide to allow them later on.
Also whitelist attributes. Don't let someone put JavaScript in an attribute (such as <div onclick="MaliciousCode();">).
Be sure to look out for SQL injection attacks and HTML injection attacks.
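The whitelist check described in this answer could look roughly like the sketch below. It uses PHP's bundled DOMDocument HTML parser instead of a strict XML parser (a substitution, since user-entered HTML is rarely well-formed XML), and it only reports violations, so the caller can reject the input or escape it on output. The ALLOWED_MARKUP table, the function name findDisallowedMarkup() and the accepted URL prefixes are assumptions made for illustration.

```php
<?php
// Hypothetical whitelist: element name => attributes permitted on that element.
const ALLOWED_MARKUP = [
    'p' => [], 'br' => [], 'b' => [], 'strong' => [],
    'ul' => [], 'ol' => [], 'li' => [],
    'a' => ['href'],
];

/**
 * Returns a list of violations found in an HTML fragment.
 * An empty array means only whitelisted elements and attributes were used.
 */
function findDisallowedMarkup(string $html): array
{
    $violations = [];
    // The parser creates html/head/body itself, so users may not supply them.
    if (preg_match('#<\s*/?\s*(html|head|body)\b#i', $html)) {
        $violations[] = 'document-level tag (html, head or body)';
    }

    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc = new DOMDocument();
    // Wrap the fragment so it always sits inside one known container element.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $wrapper = $doc->getElementsByTagName('div')->item(0);

    foreach ($doc->getElementsByTagName('*') as $element) {
        $name = strtolower($element->nodeName);
        // Skip the scaffolding added by the parser and by the wrapper above.
        if (in_array($name, ['html', 'head', 'body'], true) || $element->isSameNode($wrapper)) {
            continue;
        }
        if (!isset(ALLOWED_MARKUP[$name])) {
            $violations[] = "element <$name>";
            continue;
        }
        foreach ($element->attributes as $attribute) {
            if (!in_array(strtolower($attribute->nodeName), ALLOWED_MARKUP[$name], true)) {
                $violations[] = "attribute {$attribute->nodeName} on <$name>";
            } elseif ($attribute->nodeName === 'href'
                && !preg_match('#^(https?://|mailto:|/)#i', $attribute->nodeValue)) {
                // Rejects javascript: and other unexpected URL schemes.
                $violations[] = "unsafe href on <$name>";
            }
        }
    }

    return $violations;
}
```

A caller would typically treat a non-empty result as a validation error. A hand-rolled check like this is easy to get subtly wrong, which is a reason to prefer a maintained library such as HTML Purifier when one is available.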

Related

How to make dynamic links in php without eval()

I am using wordpress for a web site. I am using snippets (my own custom php code) to fetch data from a database and echo that data onto my web site.
if($_GET['commentID'] && is_numeric($_GET['commentID'])){
$comment_id=$_GET['commentID'];
$sql="SELECT comments FROM database WHERE commentID=$comment_id";
$result=$database->get_results($sql);
echo "<dl><dt>Comments:</dt>";
foreach($result as $item):
echo "<dd>".$item->comment."</dd>";
endforeach;
echo "</dl>";
}
This specific page reads an ID from the URL and shows all comments related to that ID. In most cases, these comments are texts. But some comments should be able to point to other pages on my web site.
For example, I would like to be able to input into the comment-field in the database:
This is a magnificent comment. You should also check out this other section for more information
where getURLtoSectionPage() is a function I have declared in my functions.php to provide the static URLs to each section of my home page in order to prevent broken links if I change my URL pattern in the future.
I do not want to do this by using eval(), and I have not been able to accomplish this by using output buffers either. I would be grateful for any hints as to how I can get this working as safely and cleanly as possible. I do not wish to execute any custom php code, only make function calls to my already existing functions which validates input parameters.
Update:
Thanks for your replies. I have been thinking of this problem a lot, and spent the evening experimenting, and I have come up with the following solution.
My SQL "shortcode":
This is a magnificent comment. You should also check out this other section for more information
My php snippet in wordpress:
ob_start();
// All my code that echo content to my page comes here
// Retrieve ID from url
// Echo all page contents
// Finished generating page contents
$entire_page=ob_get_clean();
replaceInternalLinks($entire_page);
PHP function in my functions.php in wordpress
if(!function_exists("replaceInternalLinks")){
function replaceInternalLinks($reference){
mb_ereg_search_init($reference,"\[custom_func:([^\]]*):([^\]]*)\]");
if(mb_ereg_search()){
$matches = mb_ereg_search_getregs(); //get first result
do{
if($matches[1]=="getURLtoSectionPage" && is_numeric($matches[2])){
$reference=str_replace($matches[0],getURLtoSectionPage($matches[2]),$reference);
}else{
echo "Help! An invalid function has been inserted into my tables. Have I been hacked?";
}
$matches = mb_ereg_search_regs();//get next result
}while($matches);
}
echo $reference;
}
}
This way I can decide which functions it is possible to call via the shortcode format and can validate that only integer references can be used.
Am I safe now?
Don't store the code in the database, store the ID, then process it when you need to. BTW, I'm assuming you really need it to be dynamic, and you can't just store the final URL.
So, I'd change your example comment-field text to something like:
This is a magnificent comment. You should also check out this other section for more information
Then, when you need to display that text, do something like a regular expression search-replace on 'href="#comment-([0-9]+)"', calling your getURLtoSectionPage() function at that point.
Does that make sense?
I do not want to do this by using eval(), and I have not been able to accomplish this by using output buffers either. I would be grateful for any hints as to how I can get this working as safely and cleanly as possible. I do not wish to execute any custom php code, only make function calls to my already existing functions which validates input parameters.
Eval is a terrible approach, as is allowing people to submit raw PHP at all. It's highly error-prone and the results of an error could be catastrophic (and that's without even considering the possibility that code written by a malicious attacker gets submitted).
You need to use something custom. Possibly something inspired by BBCode.
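A minimal sketch of such a custom tag format, reusing the [custom_func:NAME:ID] pattern from the asker's regular expression but doing the replacement with preg_replace_callback() and a fixed map of allowed handlers. The getURLtoSectionPage() body and its URL pattern are stand-ins (the real function lives in the asker's functions.php), and this version returns the result instead of echoing it.

```php
<?php
// Stand-in for the asker's getURLtoSectionPage(); the URL pattern is made up.
function getURLtoSectionPage(int $id): string
{
    return '/sections/' . $id . '/';
}

/**
 * Replaces [custom_func:NAME:ID] tags using a fixed map of allowed handlers.
 * Unknown names are left untouched instead of being executed.
 */
function replaceInternalLinks(string $text): string
{
    // Whitelist: tag name => callable taking one integer argument.
    $handlers = [
        'getURLtoSectionPage' => 'getURLtoSectionPage',
    ];

    $result = preg_replace_callback(
        '/\[custom_func:(\w+):(\d+)\]/',
        static function (array $match) use ($handlers): string {
            [$tag, $name, $id] = $match;
            if (!isset($handlers[$name])) {
                return $tag; // not whitelisted: leave the text as it was
            }
            // Escape the result because it ends up inside an HTML attribute.
            return htmlspecialchars($handlers[$name]((int) $id), ENT_QUOTES, 'UTF-8');
        },
        $text
    );

    return $result ?? $text; // preg_replace_callback() returns null on error
}
```

Because the ID part only matches digits and the name must be a key in $handlers, nothing from the database is ever treated as code; adding another tag means adding one entry to the map.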

insert html into database with CodeIgniter

I want to enable users to edit pages with editor (CKEditor).
The problem is that I want to prevent XSS, so when I'm using:
$this->input->post('content', TRUE)
it also removes some HTML content. For example, the following code:
<script></script><p><span style="color:#800000;">text</span></p>
becomes:
[removed][removed]<p><span
So yes, it prevents XSS, but also removes some necessary html content.
What should I do to fix it?
Don't use their built in XSS functionality. Use HTML purifier to do it for you. That way you have more control over what is and isn't removed.
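Try this simple approach: replace $this->input->post('content', TRUE) with $_POST['content']. It works for me, because CodeIgniter only applies its XSS filtering when the value is read through $this->input. Note that this skips the XSS filtering entirely, so the content still has to be sanitized some other way.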
Instead of this you can use below code.
$content = htmlspecialchars($this->input->post('content'));
Then save it to the database, and at the time of retrieval you can use
htmlspecialchars_decode('your html code');
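One thing worth checking about the last suggestion: htmlspecialchars() followed by htmlspecialchars_decode() gives back the original markup, so the pair changes how the content is stored, not what eventually reaches the browser. A quick sketch with a made-up input string:

```php
<?php
// Made-up input containing both wanted markup and a script tag.
$input = '<script>alert(1)</script><p><span style="color:#800000;">text</span></p>';

$stored = htmlspecialchars($input);         // what would go into the database
$shown  = htmlspecialchars_decode($stored); // what would be sent to the browser

var_dump($stored === $input); // the stored form is escaped, so it differs
var_dump($shown === $input);  // the round trip restores the input, script included
```

Filtering with something like HTML Purifier would therefore still be needed before the decoded content is output.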

Get all content from a file, including PHP code

I'm making a small CMS for practice. I am using CKEditor and am trying to make it possible to write something like %contactform% in the text, which my PHP function then replaces with a contact form.
I've managed to replace the text with a form. But now I need the PHP code for the form to send a mail. I'm using file_get_contents(), but it's stripping the PHP code.
For now I've used include() to get the PHP code from another file, and that works. I would like to do it with one file, though.
So - can I get all content from a file INCLUDING the php-code?
Update:
I'll try to explain in another way.
I can create a page in my CMS where I can write a header and some content. In the content I am able to write %contactform%.
When I get the content from the database I am replacing %contactform% with the content from /inserts/contactform.php, using file_get_contents(); where I have the form in HTML and my php code:
<?php
if (isset($_POST['submit'])) {
    echo 'Now my form is submitted!';
}
?>
<form method="post">
    <input type="text" name="email">
    <input type="submit" name="submit">
</form>
I was expecting to get the form back with the PHP code active. But if I press the submit button in the form, the PHP code doesn't run.
I don't want to show the PHP code; I want to be able to use it.
I still have to guess, but from your update I think you ultimately end up with a variable that contains the content from the database with %contactform% replaced by file_get_contents('/inserts/contactform.php').
Something like:
$contentToOutput = str_replace(
'%contactform%',
file_get_contents('/inserts/contactform.php'),
$contentFromDatabase
);
If you echo that variable, it will just send its content as is. No PHP will get executed.
Though it's risky in many cases, if you know what you're doing you can use eval to parse the PHP code. With mixed code like this, you may want to do it like the following.
ob_start();
eval('; ?>' . $contentToOutput);
$parsedContent = ob_get_clean();
$parsedContent should now contain the results after executing the code. You can now send it to the user or handle it whatever way you want to.
Of course you'll have to make sure that whatever is in $contentToOutput is valid php code (or a valid mixture of php with php-tags and text).
Here is a link to the symfony Templating/PhpEngine class. Have a look at the evaluate method to see the above example in real code.
yes...
$content = file_get_contents( 'path to your file' );
for printing try
echo htmlspecialchars( $content );
From reading the revised question, I think the answer is "You can't get there from here." Let me try to explain what I think you will encounter.
First, consider the nature of HTTP and the client/server model. Clients make requests and servers make responses. Each request is atomic, complete and stateless, and each response is complete and usually instantaneous. And that is the end of it. The server disconnects and goes back to "sleep" until the client makes a new request.
Let's say I make a request for a web page. A PHP script runs and it prepares a response document (HTML, probably) and the server sends the document to my browser. If the document contains an HTML form, I can submit the form to the URL of the action= script. But when I submit the form, I am making a new request that goes back to the server.
As I understand your design, the plan is to put both the HTML form and the PHP action script into the textarea of the CKeditor at the location of the %contactform% string. This would be presented to the client who would submit the form back to your server, where it would run the PHP script. I just don't think that will work, and if you find a way to make it work, you're basically saying, "I will accept external input and run it in PHP." That would represent an unacceptable security exposure for me.
If you can step back from the technical details and just tell us in plain language what you're trying to achieve, we may be able to offer a suggestion about the design pattern.
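Since the asker reports that include() already works, one possible pattern is to keep the form and its PHP in a trusted file on the server and have the placeholder expand to that file's output. The sketch below captures the output of an included file with output buffering and only expands placeholders listed in a map. The function names renderInsert() and expandPlaceholders() and the $inserts map are assumptions for illustration, not part of the original code.

```php
<?php
/** Runs a trusted PHP file and returns everything it printed. */
function renderInsert(string $path): string
{
    ob_start();
    include $path; // PHP inside the file is executed, unlike file_get_contents()
    return (string) ob_get_clean();
}

/**
 * Replaces %name% placeholders with the rendered insert registered for them.
 * $inserts maps placeholder names to file paths; unknown names stay as text.
 */
function expandPlaceholders(string $content, array $inserts): string
{
    $result = preg_replace_callback(
        '/%([a-z]+)%/',
        static function (array $match) use ($inserts): string {
            return isset($inserts[$match[1]])
                ? renderInsert($inserts[$match[1]])
                : $match[0];
        },
        $content
    );

    return $result ?? $content; // preg_replace_callback() returns null on error
}
```

Usage might look like echo expandPlaceholders($contentFromDatabase, ['contactform' => '/inserts/contactform.php']);. Only files named in the map are ever executed, so nothing typed into the editor is run as PHP, and the $_POST check inside contactform.php runs on the request that submits the form.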

How do I turn a form into a attachable file for email in php

I have a form that sends out emails based on how it was filled in. I have to take some of the information entered in the form and put it into an HTML form that is sent in an email, which is done in PHP.
My question is: is there any way to turn that form into an attachable file that could simply be added to the mail function and sent instead?
Thank you for your time!
Yes, you can turn your HTML form into an HTML page (using placeholder tokens that you fill in manually, or maybe something like TCPDF for a PDF), save it to a temporary file on disk, and then attach it to the outgoing email. Of course, your mail function must support attachments! (If it doesn't, use a different mailer; there are many of them.)
You can do it quickly by preparing the form manually and replacing the text fields with something like {{ Name }}.
Then just run a str_replace on the file_get_contents() of your template and save the results in a temporary file:
$translate = array(
'{{ Name }}' => $Form['Name'],
....
);
// one could also use preg_replace_callback() and read the values directly from $Form
// (the old preg_replace 'e' modifier was removed in PHP 7)
$html = str_replace(array_keys($translate), array_values($translate),
file_get_contents("form-template.html"));
In just the same way you can prepare a Word document, save it as RTF (which is ASCII readable), and attach the file with .doc extension -- most word processors will open it automatically.
If you want to send a PDF file (using TCPDF, FPDF, etc.), it's a bit more complicated but not too much.
Swift Mailer integrates into any web app written in PHP 5, offering a flexible and elegant object-oriented approach to sending emails with a multitude of features including attachments.
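If adding a mailer library is not an option, the attachment can also be assembled by hand for PHP's built-in mail(). The sketch below fills a template and builds a multipart/mixed message with the resulting HTML as a base64 attachment. The function name buildMessageWithAttachment(), the inline template and the sample form data are assumptions for illustration; the mail() call is left commented out.

```php
<?php
/**
 * Builds the headers and body of a multipart/mixed message carrying one
 * HTML attachment. Returns [headers, body] for use with mail().
 */
function buildMessageWithAttachment(string $text, string $filename, string $html, string $boundary): array
{
    $headers = "MIME-Version: 1.0\r\n"
        . "Content-Type: multipart/mixed; boundary=\"{$boundary}\"";

    $body = "--{$boundary}\r\n"
        . "Content-Type: text/plain; charset=UTF-8\r\n\r\n"
        . $text . "\r\n"
        . "--{$boundary}\r\n"
        . "Content-Type: text/html; charset=UTF-8; name=\"{$filename}\"\r\n"
        . "Content-Transfer-Encoding: base64\r\n"
        . "Content-Disposition: attachment; filename=\"{$filename}\"\r\n\r\n"
        . chunk_split(base64_encode($html)) // base64 split into 76-character lines
        . "--{$boundary}--\r\n";

    return [$headers, $body];
}

// Fill the template; form values are escaped because they end up in HTML.
$form = ['Name' => 'Ada <admin>'];     // made-up form data
$template = '<p>Name: {{ Name }}</p>'; // stands in for form-template.html
$html = strtr($template, ['{{ Name }}' => htmlspecialchars($form['Name'])]);

[$headers, $body] = buildMessageWithAttachment(
    'The completed form is attached.',
    'form.html',
    $html,
    bin2hex(random_bytes(16)) // random boundary that will not occur in the content
);

// mail('someone@example.com', 'Form submission', $body, $headers);
```

Escaping the form values before inserting them keeps a submitted value from injecting markup into the attached page, which the plain str_replace() approach does not do on its own.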

How to check if the HTML code is valid via PHP or Javascript

I let my users create their own widgets on their profile pages. Widgets are built from HTML code that the user enters.
But sometimes, if a user enters invalid HTML, the widget area breaks. To avoid this, I must be sure that the HTML the user entered is valid.
My question is: how can I validate HTML code with PHP or JavaScript?
Thanks in advance.
You can use the tidy PHP extension.
There are a number of things you can do with it, but for your case I think you could just use the repairString method to get a clean version of the user input and compare it with the initial version.
Something like this:
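$tidy = new tidy();
// show-body-only keeps tidy from wrapping the fragment in a full HTML document
$clean = $tidy->repairString($user_input, ['show-body-only' => true]);
If repairString did not have to change anything, $user_input and $clean should be the same (apart from whitespace that tidy may normalize).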
Take a look at this SO question / answers. The DOMDocument-Class may validate your HTML with PHP.
