Detecting and preventing XSS, but allowing the html formatting - php

I'm trying to understand XSS attacks. I learnt that I should use htmlspecialchars() whenever outputting something to the browser that came from the user input. The code below works fine.
What I don't understand is whether there is a need to use htmlspecialchars() here for echoing the $enrollmentno or not?
<?php
$enrollmentno = (int)$_POST['enrollmentno'];
echo "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>$enrollmentno</b></h4></center></div>";
$clink = "http://xyz/$enrollmentno/2013";
echo"<iframe src='$clink' width='1500' height='900' frameBorder='0'></iframe>";
?>
If I do something like
$safe = "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>$enrollmentno</b></h4></center></div>";
echo htmlspecialchars($safe, ENT_QUOTES);
It doesn't show the correct HTML format.
I'm not sure if I have to use HTMLPurifier here.
Does HTMLPurifier retain the HTML formatting while preventing XSS?
Update
echo "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>".htmlspecialchars ($enrollmentno)."</b></h4></center></div>";
Does the trick!

Any time you use arbitrary data in the context of HTML, you should be using htmlspecialchars(). The reason for this is that it prevents your text content from being treated as HTML, which could potentially be malicious if coming from outside users. It also ensures you are generating valid HTML that browsers can handle consistently.
Suppose I want the text "8 > 3" to appear in HTML. To do this, my HTML code would be 8 &gt; 3. The > is encoded as &gt; so that it isn't misinterpreted as part of a tag.
Now, suppose I am making a web page about how to write HTML. I want the user to see the following:
<p>This is how to make a paragraph</p>
If I want <p> and </p> to be shown as text rather than interpreted as an actual paragraph, I need to encode them:
&lt;p&gt;This is how to make a paragraph&lt;/p&gt;
htmlspecialchars() does that. It allows you to insert arbitrary text into an HTML context in a safe way.
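As a minimal sketch of the two cases above, the snippet below passes both strings through htmlspecialchars() with its default flags. The variable names are only illustrative.

```php
<?php
// Encode plain text so it can be placed in an HTML context.
$comparison = htmlspecialchars('8 > 3');
$tutorial = htmlspecialchars('<p>This is how to make a paragraph</p>');

echo $comparison, "\n"; // 8 &gt; 3
echo $tutorial, "\n";   // &lt;p&gt;This is how to make a paragraph&lt;/p&gt;
```

The browser renders the encoded strings as the literal text, angle brackets included.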
Now, in your second example:
$safe = "<div style='border-radius:45px; border-width: 2px; border-style: dashed; border-color: black;'><center><h4><b>$enrollmentno</b></h4></center></div>";
echo htmlspecialchars($safe, ENT_QUOTES);
This does exactly what you asked it to do. You gave it some text, and it encoded that. If you wanted it as HTML, you should have just echoed it.
Now, if you need to display HTML as HTML and it comes from an untrusted source (i.e. not you), then you need tools like HTMLPurifier. You do not need this if you trust the source. Running all your output through htmlspecialchars() doesn't magically make things safe. You only need it when inserting arbitrary text data. Here's a good use case:
echo '<h1>Product Review from ', htmlspecialchars($username), '</h1>';
echo htmlspecialchars($reviewText);
In this case, both the username and review text can contain whatever that user typed in, and they will be encoded correctly for use in HTML.
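Applied to the code in the question, the same idea could look roughly like the sketch below. The helper names e() and render_enrollment() are hypothetical, the markup is shortened, and $rawInput stands in for $_POST['enrollmentno']. The markup itself is trusted because the developer wrote it, so only the inserted data is encoded.

```php
<?php
// Hypothetical helper: encode a value for HTML text or a quoted attribute.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical renderer; $rawInput stands in for $_POST['enrollmentno'].
function render_enrollment(string $rawInput): string
{
    $enrollmentno = (int) $rawInput; // the cast leaves nothing but an integer
    $clink = "http://xyz/$enrollmentno/2013";

    // Trusted markup is output as-is; only the data placed into it is encoded.
    return "<h4><b>" . e((string) $enrollmentno) . "</b></h4>"
         . "<iframe src='" . e($clink) . "' width='1500' height='900'></iframe>";
}

echo render_enrollment('42<script>alert(1)</script>');
```

Because of the (int) cast the encoding changes nothing in this particular case, but keeping it means the output stays safe if the cast is ever removed.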

Related

HTML <pre> word wrapping in PHP doesn't work

I've been working on a forum and got everything working as intended. Then I tested whether text wrapping works, and it doesn't. The relevant part of the code looks like this.
Before the PHP, I have some CSS:
<style>
pre {
white-space: pre-wrap;
white-space: -moz-pre-wrap;
white-space: -pre-wrap;
white-space: -o-pre-wrap;
word-wrap: break-word;
font-size: 20px;
margin: 0% 8% 0px 8%;
}
</style>
And then after a bunch of MySQL things
echo '<table style="height: 21px;" width="100%">';
while($forumcomrow=mysqli_fetch_array($forumcomres))
{
echo '<tr><td>some text</td><td><pre>Some Very Long Text From Mysql </pre></td></tr></br>';
}
echo '</table>';
I stripped the page down to just the style and the echo for the table. When I echo only the pre tag, the text wraps at the edge of the screen. As soon as I put it inside a table, it runs far off the screen before it starts wrapping, regardless of the table's size (even with the table width set to 5px).
This is what happens:
On the picture above you can see one part of it, but the text goes off screen about 5x the length you can see on there, and only then starts wrapping
I have figured out what triggers the problem: text with spaces ("aaaa aaa aaa aa aa") wraps, but a single long word ("aaaaaaaaa") does not. I don't know how to fix it; I've only found the cause.
It is because of the <br> tag at the end; I faced the same problem some time ago. The <br> is applied first each time, and only then is the table wrapped.
Remove the <br> at the end:
echo '<tr><td>some text</td><td><pre>Some Very Long Text From Mysql </pre></td></tr>';
and then it will work perfectly fine.

How to capture exact text input in textarea and display it in a table?

I am creating a programming forum and I am having trouble containing the text retrieved from the database. My textarea looks like this:
<textarea name = "PostText" style = "width: 90%; height:480px; resize:none; direction: ltr; unicode-bidi: bidi-override;text-align:left;"required>
</textarea>
unicode-bidi: bidi-override captures the code as it is entered, but I cannot make word-wrap work correctly when using the two together. So the first question is: how can I capture the exact text in the textarea and display it in a table?
My td tag looks like this
<td style = "margin:0 auto; text-align:left;"><pre>'.htmlspecialchars($post_).'</pre></td>
The pre tag displays the text exactly how it was entered but will not wrap correctly and overflows its container.
How to capture exact text input in textarea and display it in a table?
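I'd recommend storing the text as you do now (insert it into the table as is) and, when you display the text from the database, outputting it like this:
<td style = "margin:0 auto; text-align:left;">'.nl2br(htmlspecialchars($post_)).'</td>
The PHP nl2br() function inserts an HTML line break (<br />) before each newline stored in the DB. Keep the htmlspecialchars() call from your original code so that user-entered markup is displayed as text rather than interpreted.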
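A minimal sketch of this output step, encoding the text first as the question's code did and then applying nl2br(). The helper name format_post() and the sample text are assumptions for illustration.

```php
<?php
// Hypothetical helper; $post stands in for the text loaded from the database.
function format_post(string $post): string
{
    // Encode first so user-typed markup is shown as text,
    // then let nl2br() insert <br /> before each newline.
    return nl2br(htmlspecialchars($post, ENT_QUOTES, 'UTF-8'));
}

echo '<td style="margin:0 auto; text-align:left;">'
    . format_post("if (a < b) {\nrun();\n}")
    . '</td>';
```

The order matters: calling htmlspecialchars() after nl2br() would encode the inserted <br /> tags as well. Unlike <pre>, this approach does not preserve indentation, since runs of spaces collapse in normal HTML flow.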
Set a width on the pre and add overflow:
<style>
td pre{
width: 10%;
}
pre{
overflow: auto;
}
</style>

strip_tags or other solution

I have this text stored in MySQL (it is sometimes added directly). I don't want to lose the tags, only the styles and formatting they have:
<p style="margin-bottom: 20px; padding: 0px; border: 0px;"><span style="line-height: 1.428571429;">Allí, el club crema</span><p>
I used strip_tags, but it removes the entire tag:
strip_tags($data, "<p>");
I want it to look like this:
<p>Allí, el club crema<p>
I hope you can help. Thank you very much in advance for your answers.
Warning, anti-pattern: using REGEX on mark-up is generally a bad idea. However it's sometimes more convenient, so to hell with it:
$data = preg_replace('/(<\w+) [^>]+/', '$1', $data);
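As a rough sketch, the regex above can be combined with the strip_tags() call from the question: strip_tags() drops the <span> while keeping <p> (attributes included), and the regex then removes what follows the tag name. The intermediate variable names are only illustrative.

```php
<?php
$data = '<p style="margin-bottom: 20px; padding: 0px; border: 0px;"><span style="line-height: 1.428571429;">Allí, el club crema</span><p>';

// Keep only <p> tags; their attributes survive this step.
$onlyParagraphs = strip_tags($data, '<p>');

// Drop everything between the tag name and the closing ">".
$clean = preg_replace('/(<\w+) [^>]+/', '$1', $onlyParagraphs);

echo $clean; // <p>Allí, el club crema<p>
```

The pattern assumes that no attribute value contains a ">" character; markup that does will be cut in the wrong place, which is one reason regexes on HTML are fragile.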
There is no PHP function for that. strip_tags() removes a tag completely, and allowing a tag keeps it in place, attributes included. You'll need to load the HTML into an XML parser and reconstruct the output or, as I would advise, use a regex to strip out any HTML attributes (after you've stripped away the tags you don't need anyway).
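The parser route mentioned above might look roughly like this, using PHP's bundled DOM extension. The function name strip_all_attributes() is hypothetical, and the wrapper <div> plus the XML declaration prefix are common workarounds rather than anything the answer prescribes.

```php
<?php
// Hypothetical helper: remove every attribute from every element of an HTML fragment.
function strip_all_attributes(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy real-world markup
    // The wrapper <div> gives the fragment a single root; the XML declaration
    // is a widely used workaround to make libxml read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//*[@*]') as $element) {
        // The attribute map is live, so keep removing the first entry until it is empty.
        while ($element->attributes->length > 0) {
            $element->removeAttributeNode($element->attributes->item(0));
        }
    }

    // Serialize only the wrapper's children, not the <html>/<body> that libxml adds.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo strip_all_attributes('<p style="margin: 0"><span class="lead">club crema</span></p>');
```

This removes attributes from all tags but keeps the tags themselves, so it would still be combined with strip_tags() if some tags should disappear entirely.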

PHP: How can you remove/replace a specific href attribute value

I am parsing some articles from a database with PHP, and the articles contain links that I would like to overwrite. The links always start with "http://cdn.example.com/", and the final output step is htmlspecialchars_decode($item->parse_articles(), ENT_NOQUOTES).
So, before the articles are passed to the HTML DOM, I would like to replace all hrefs that contain example.com or, if it is possible and faster, remove the <a> completely.
How can this be done? And, if it is possible, is it faster than passing the article to the DOM first and manipulating it on the client side?
You could try something like the following in PHP:
$newtext = preg_replace('~"http://cdn\.example\.com/[^"]*"~', '"#" class="disabled-link"', $oldtext);
$oldtext being your input article as a string.
$newtext being the text to echo on the page.
Broken down:
Find text starting with "http://cdn.example.com/
Then match anything
Stop at "
Replace with "#" class="disabled-link"
This should let you remove the link and also I added the class part so that you can add some CSS to style the links as text.
Example:
.disabled-link{
color:#000;
pointer-events: none;
cursor: default;
text-decoration: none;
}
Combined, this gives users a link that looks like plain text and cannot be clicked; it is only recognizable as a link by inspecting the DOM or the source.
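A minimal sketch of this replacement applied to a sample article. The pattern here is written without ^ and $ anchors so that it matches quoted URLs anywhere inside a longer string; the sample text and the second domain are made up for illustration.

```php
<?php
// Sample article text; only the cdn.example.com link should be neutralized.
$oldtext = '<a href="http://cdn.example.com/img/1.png">pic</a> and '
         . '<a href="http://other.example.org/">ok</a>';

// Match a quoted URL that starts with the CDN prefix, up to the next double quote.
$newtext = preg_replace(
    '~"http://cdn\.example\.com/[^"]*"~',
    '"#" class="disabled-link"',
    $oldtext
);

echo $newtext;
```

Note that this replaces any double-quoted occurrence of the URL, not only href values, so quoted CDN URLs in src attributes or in article text would be rewritten too.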

preg_replace UNLESS string exists

I'm trying to add CSS styling to all hyperlinks unless they have a "donttouch" attribute.
E.g.
Style this: <a href="http://whatever.com">style me</a>
Don't style this: <a href="http://whatever.com" donttouch>don't style me</a>
Here's my preg_replace without the "donttouch" exclusion, which works fine.
preg_replace('/<a(.*?)href="([^"]*)"(.*?)>(.*?)<\/a>/','<a$1href="$2"$3><span style="color:%link_color%; text-decoration:underline;">$4</span></a>', $this->html)
I've looked all over the place, and would appreciate any help.
Find (works also in Notepad++)
(?s)(<a (?:(?!donttouch)[^>])+>)(.*?)</a>
Replace with (Replace all in Notepad++):
\1<span style="whatever">\2</span></a>
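In PHP, the same find/replace could be sketched with preg_replace() as below. The s modifier takes the place of the inline (?s), "~" is used as the delimiter so the slash in </a> needs no escaping, and the style string is the one from the question. The sample input is illustrative.

```php
<?php
$html = '<a href="http://whatever.com">style me</a> '
      . '<a href="http://whatever.com" donttouch>don\'t style me</a>';

// The negative lookahead rejects any opening tag that contains "donttouch".
$pattern = '~(<a (?:(?!donttouch)[^>])+>)(.*?)</a>~s';
$replacement = '$1<span style="color:%link_color%; text-decoration:underline;">$2</span></a>';

$styled = preg_replace($pattern, $replacement, $html);

echo $styled;
```

Only the first link is wrapped in the span; the second is left untouched because its opening tag contains the attribute.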
This can be accomplished without a regular expression. Instead, use a CSS attribute selector.
For example, use these rules:
a { font-weight: bold; color: green }
a[donttouch=''] { font-weight: normal; color: blue }
Technically, this styles the elements that do have the 'donttouch' attribute, but you can reset them to the default values there. This is more efficient than attempting to parse your HTML with a regular expression, which is usually a bad idea.
