Escaping output safely for both html and input fields - php

In my web app, users can input text data. This data can be shown to other users, and the original author can also go back and edit their data. I'm looking for the correct way to safely escape this data.
I'm only sql sanitizing on the way in, so everything is stored as it reads. Let's say I have "déjà vu" in the database. Or, to be more extreme, a <script> tag. It is possible that this may be valid, and not even maliciously intended, input.
I'm using htmlentities() on the way out to make sure everything is escaped. The problem is that html and input fields treat things differently. I want to make sure it's safe in HTML, but that the author when editing the text, sees exactly what they typed in the input fields. I'm also using jQuery to fill form fields with the data dynamically.
If I do this:
<p><?=htmlentities("déjà vu");?></p>
<input type=text value="<?=htmlentities("déjà vu");?>">
The page source puts d&eacute;j&agrave; vu in both places (I had to backtick that or you would see "déjà vu"!). The problem is that the output in the <p> is correct, but the input just shows the escaped text. If the user resubmits their form, they double-escape and ruin their input.
I know I still have to sanitize text that goes into the field, otherwise you can end the value quote and do bad things. The only solution I found is this. Again, I'm using jQuery.
var temp = $("<div></div>").html("<?=htmlentities("déjà vu");?>");
$("input").val(temp.html());
This works, as it causes the div to read the escaped text as encoded characters, and then the jquery copies those encoded characters to the input tag, properly preserved.
So my question: is this still safe, or is there a security hole somewhere? And more importantly, is this the only / correct way to do this? Am I missing something about how html and character encoding works that make this a trivial issue to solve?
EDIT
This is actually wrong, I oversimplified my example to the point of it not working. The problem is actually because I'm using jQuery's val() to insert the text into the field.
<input>
<script>$("input").val("<?=htmlentities("déjà vu");?>");</script>
The reason for this is that the form is dynamic - the user can add or remove fields at will and so they are generated after page load.
So it seems that jQuery is escaping the data to go into the input, but it's not quite good enough - if I don't do anything myself, a user can still put in a </script> tag, killing my code and inserting malicious code. But there's another argument to be made here. Since only the original author can see the text in an input box anyway, should I even bother? Basically, the only person they could execute an XSS attack against is themselves.

I'm sorry but I cannot reproduce the behaviour you describe. I've always used htmlspecialchars() (which does essentially the same task as htmlentities()) and it has never led to any sort of double-encoding. The page source shows d&eacute;j&agrave; vu in both places (of course! that's the point!) but the rendered page shows the appropriate values and that's what is sent back to the server.
Can you post a full self-contained code snippet that exhibits such behaviour?
Update: some testing code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head><title></title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<?php
$default_value = 'déjà vu <script> ¿foo?';
if( !isset($_GET['foo']) ){
$_GET['foo'] = $default_value;
}
?>
<form action="" method="get">
<p><?php echo htmlentities($_GET['foo']); ?></p>
<input type="text" name="foo" value="<?php echo htmlentities($_GET['foo']); ?>">
<input type="submit" value="Submit">
</form>
</body>
</html>
Answer to updated question
The htmlentities() function, as its name suggests, is used when generating HTML output. That's why it's of little use in your second example: JavaScript is not HTML. It's a language of its own with its own syntax.
Now, the problem you want to fix is how to generate output that follows these two rules:
1. It's a valid string in JavaScript.
2. It can be embedded safely in an HTML document.
The closest PHP function for #1 I'm aware of is json_encode(). Since JSON syntax is a subset of JavaScript, if you feed it with a PHP string it will output a JavaScript string.
As for #2, once the browser enters a JavaScript block it expects a </script> tag to leave it. By default, the json_encode() function takes care of this and escapes it properly (<\/script>).
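A minimal sketch of the json_encode() approach, assuming PHP 7.3 or later and UTF-8 input. The helper name js_string_literal() is hypothetical, and the JSON_HEX_* flags are optional hardening on top of the default slash escaping, so that <, >, & and quotes come out as \uXXXX sequences:

```php
<?php
// Hypothetical helper: turn a PHP string into a JavaScript string literal
// that can sit inside an inline <script> block.
function js_string_literal(string $value): string
{
    // The JSON_HEX_* flags emit <, >, &, ' and " as \uXXXX escapes.
    // JSON_THROW_ON_ERROR raises a JsonException (e.g. on invalid UTF-8)
    // instead of returning false.
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

// Assumed to be the raw text as stored, in UTF-8.
$stored = 'déjà vu </script> ¿foo?';

// The literal already includes its surrounding quotes, so none are added here.
echo '<script>$("input[type=text]").val(' . js_string_literal($stored) . ');</script>', "\n";
```

Because the result is an ordinary JavaScript string, val() receives the text exactly as the author typed it, with no HTML entities involved.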
My revised test code:
<?php
$default_value = 'déjà vu </script> ¿foo?';
if( !isset($_GET['foo']) ){
$_GET['foo'] = $default_value;
}
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head><title></title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript"><!--
$(function(){
$("input[type=text]").val(<?php echo json_encode(utf8_encode($_GET['foo'])); ?>);
});
//--></script>
</head>
<body>
<form action="" method="get">
<p><?php echo htmlentities($_GET['foo']); ?></p>
<input type="text" name="foo" value="(to be replaced)">
<input type="submit" value="Submit">
</form>
</body>
</html>
Note: utf8_encode() converts from ISO-8859-1 to UTF-8 and it isn't required if your data is already in UTF-8 (recommended).

If you just need to reverse the encode then you can use html_entity_decode - http://www.php.net/manual/en/function.html-entity-decode.php.
Another possibility is to only run htmlentities at the time the content will be displayed as part of a web page. Otherwise, keep the unencoded text, as submitted or loaded from your datastore.
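To illustrate why the text should stay unencoded until display, here is a small sketch. It assumes UTF-8 data and uses htmlspecialchars() rather than htmlentities(); the helper name e() is hypothetical:

```php
<?php
// Hypothetical helper: escape a raw string for HTML text or a quoted attribute.
function e(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Kept unencoded, exactly as the author typed it.
$stored = 'déjà vu <script>';

$once  = e($stored); // encoded once, at output time
$twice = e($once);   // what happens when already-encoded text is encoded again

echo $once, "\n";    // déjà vu &lt;script&gt;
echo $twice, "\n";   // déjà vu &amp;lt;script&amp;gt;

// Decoding the once-encoded text gives back the original.
var_dump(html_entity_decode($once, ENT_QUOTES, 'UTF-8') === $stored);
```

The second line of output is the double-escaping the question describes: it appears whenever encoded text is stored or resubmitted and then encoded a second time.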

I believe it is a problem with the way you are applying the value towards the input. It is being displayed as encoded, which makes sense because it is Javascript, not HTML. So, what I would propose is to write your encoded text as part of the markup so that it gets parsed naturally (as opposed to being injected with client script). Since your textboxes are not readily available when the server is responding, you can use a temporary hidden field...
<input type="hidden" id="hidEncoded" value="<?=htmlentities("déjà vu");?>" />
Then it will get parsed as good old HTML, and when you try to access the value with Javascript it should be decoded...
// Give your textbox an ID!
$("#txtInput").val($("#hidEncoded").val());

Related

XSS in text-fields - PHP example

Please consider this PHP page below, named xss1.php. You can upload it to any LAMP server or VM you have, to understand my conundrum.
<?php
ob_start();
session_start();
$searchValue = "";
if ($_SERVER["REQUEST_METHOD"] === "POST") {
$searchValue = trim($_POST["txtSearch"]);
}
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>XSS: Sample 1</title>
</head>
<body>
<form name="xssForm" method="POST" action="xss1.php">
<input type="text" id="txtSearch" name="txtSearch" maxlength="128" value="<?php print($searchValue); ?>"/>
<input type="Submit" id="btnSubmit" value="Submit"/>
</form>
</body>
</html>
<?php
ob_end_flush();
?>
I was under the impression, data in text-fields are displayed as is, and need minimal or no-XSS checking. In this text-field, If I were to stick in <script>alert(1);</script> and the form gets posted, the value gets displayed back in the text-field again, with no XSS execution or injection. I'm running Firefox 50.0.2. on my Mac OS X.
Now, if I stick in "><script>alert(1);</script>, there is XSS and I see a Javascript alert pop-out with 1 in it. The characters "/> come after the text-field, rendered as text on the page, not inside the text-field. What changed here? I'm a little perplexed and will perhaps spend the next hour trying to find the answer on XSS Filter Evasion Cheat Sheet, at https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
In Safari though, I don't see the Javascript alert pop-out, but "/> gets written outside the text-field, right after it on the page.
It's 2:01 am PT and I'm Sleepless in Seattle :)
I was under the impression, data in text-fields are displayed as is, and need minimal or no-XSS checking
Your impression was wrong. Any user input into an HTML document needs to be considered for XSS. It must be either:
Escaped
Passed through a really good white listing filter
From a trusted source (and that means trusted to not be malicious, and to write code without accidentally writing something dangerous, and to not copy/paste code they don't understand).
The characters "/> come after the text-field, rendered as text on the page, not inside the text-field. What changed here?
You added a " character. A " ends an attribute value.
Then you added a >. Inside a tag, but outside of an attribute value, a > ends the tag.
The "/> that were in the original document (i.e. the ones that are not part of the user input) no longer have an attribute and tag to close (because the "> from the user input did that) so are rendered as text.
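For the form in xss1.php, escaping would look roughly like the sketch below. It assumes the page is served as UTF-8; the function render_search_input() is hypothetical and only wraps the input tag from the question:

```php
<?php
// Hypothetical helper: render the search box from xss1.php with its value escaped.
function render_search_input(string $searchValue): string
{
    // ENT_QUOTES converts both " and ', so the value cannot close the attribute.
    $safe = htmlspecialchars($searchValue, ENT_QUOTES, 'UTF-8');

    return '<input type="text" id="txtSearch" name="txtSearch" maxlength="128" value="'
        . $safe . '"/>';
}

// The payload from the question now stays inside the value attribute.
echo render_search_input('"><script>alert(1);</script>'), "\n";
```

The browser decodes the entities when it reads the attribute, so the text field still shows the characters the user typed.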

Why doesn't PHP accept UTF-8 form data?

I'm using ajax. I can track the POST request and see that the data is there in the correct state. However, despite the fact that I have
header("Content-Type: text/html;charset=UTF-8");
mb_internal_encoding("UTF-8");
in the beginning of the script, I still get gibberish symbols instead of valid UTF-8 string. What could be the issue?
Here's a part of the html file:
<meta charset="UTF-8">
...
<div id="form-container" role="form" data-toggle="validator" accept-charset="UTF-8" onsubmit="return false">
Here is what my ajax post looks like:
Have you tried mb_detect_encoding(); instead of trying to force it to UTF-8?
So see if mb_internal_encoding(mb_detect_encoding($_POST['value'])); gives you any luck? Or just echo mb_detect_encoding($_POST['value']); to see what encoding it seems to think it is? Just a poke in the dark really.
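As a rough sketch of that check, assuming the mbstring extension is loaded: the helper name describe_encoding() and the candidate list are hypothetical, and the detection step is only a guess, not proof of the real encoding.

```php
<?php
// Hypothetical helper: report whether a submitted value is valid UTF-8,
// and otherwise make a best-effort guess from a short candidate list.
function describe_encoding(string $raw): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return 'UTF-8';
    }

    // Third argument enables strict mode: only encodings in which the
    // bytes are actually valid are considered.
    $guess = mb_detect_encoding($raw, ['Windows-1252', 'ISO-8859-1'], true);

    return $guess === false ? 'unknown' : $guess;
}

echo describe_encoding("d\xC3\xA9j\xC3\xA0 vu"), "\n"; // bytes of "déjà vu" in UTF-8
echo describe_encoding("d\xE9j\xE0 vu"), "\n";         // same text as single-byte Latin-1 bytes
```

In the real script the argument would be $_POST['value']; if the first branch is taken, the data arrived as UTF-8 and the gibberish is introduced somewhere later.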

How to output an HTML page based on user input

I want to make an HTML (or php maybe?) page that constructs a new HTML page based on input parameters the user gives to a drop-down box. I just don't know how you handle the input.
Here's my HTML:
<html>
<body>
<input type="number" min="1">
</body>
</html>
Yes I know it's not the full HTML page, but I just want to focus on the <input> tag. I know you probably have to set it equal to a PHP variable maybe?
I want it to generate a different HTML page that looks like this:
<html>
<body>
<p>You have chosen: $input </p>
</body>
</html>
I might be asking this all wrong, but I hope it makes sense what I'm looking for. I need to know how to handle the user input. I couldn't find a thread that discusses this. Do I need to generate a new HTML file? Or just override the current one and maybe have a reset button? I'm so confused.
In the simple case, you'll have two pages: your form and your result page. You can send data from the form page to the results page with one of two methods: GET or POST.
GET means that the data you're sending gets put in the page URL. This is useful because then you can link to a specific version of the results page, but potentially dangerous because you don't want to put sensitive data in the URL bar.
POST means that the data is sent with the HTTP request in the background. This is preferable for something like a password.
The GET and POST data can be read by nearly any server-side language and used to generate HTML on-the-fly. The example below uses PHP.
The form page doesn't necessarily need any server-side code, just basic HTML. Here's a simple example:
<!DOCTYPE html>
<html>
<form method="GET" action="my_result.php">
<input type="text" name="my_value">
<input type="submit">
</form>
</html>
Your second page (the results page) should bear the name that you specified in the form's action attribute. This is the page which will need server-side code. So here is an example my_result.php:
<!DOCTYPE html>
<html>
<p><?php echo htmlspecialchars($_GET['my_value'] ?? '', ENT_QUOTES, 'UTF-8'); ?></p>
</html>
Obviously, my_value can and should be replaced by whatever you want to call your data, as long as the name attribute of the input element matches the key in the PHP.
This example uses the GET method. You can use POST by changing the method attribute of the form and using $_POST instead of $_GET (if you are using PHP).
If you use $_REQUEST rather than $_GET or $_POST, it finds a value that was passed via either GET or POST. This is usually less safe than explicitly stating how your value was passed.
Addendum: Some servers are configured to disallow you from directly using the values of PHP superglobals such as $_GET, $_POST, and $_REQUEST for security purposes. That is because you really should always sanitize user input before using it in an application. The type of sanitization required depends on the type of input and how it is being used, and is well outside of the scope of this question. For this purpose, PHP provides the filter_input function.
The sanitization filter is an optional parameter for the filter_input function, so if you really want to use the data unfiltered, you can simply omit it (but know that this is dangerous). In this case, you can replace all instances of $_GET['my_value'] in the above code with filter_input(INPUT_GET, 'my_value').
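For the number field in the question, a validation filter may be more useful than a sanitization filter. This is a minimal sketch; the helper name parse_positive_int() is hypothetical, and filter_var() is used so the logic can run without a real request (filter_input() accepts the same filter and options):

```php
<?php
// Hypothetical helper: return the value as an int >= 1, or null if it is not one.
function parse_positive_int($raw): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);

    // filter_var() returns false when validation fails.
    return $value === false ? null : $value;
}

// In a real request the raw value would come from the query string, e.g.
// filter_input(INPUT_GET, 'my_value', FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]).
var_dump(parse_positive_int('3'));        // int(3)
var_dump(parse_positive_int('0'));        // NULL
var_dump(parse_positive_int('<script>')); // NULL
```

A validated integer needs no further escaping before it is printed, which is one reason to validate typed input rather than only sanitize it.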
This is not a tutorial, but I can point you to some important points:
You can get user input in HTML by using a form element. Read more about forms and the form methods (GET and POST).
Then, how can you print the user input once it is submitted? PHP supports both methods through $_GET and $_POST, with the input name as the key.
Dealing with user input needs extra care because of security. A user might submit malicious content that later attacks you or another user.
Try like below
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Document</title>
</head>
<body>
<?php
if ($_POST) {
// Escape the submitted value before printing it.
echo "<h3>You have selected ".htmlspecialchars($_POST['number'], ENT_QUOTES, 'UTF-8')."</h3>";
} else {
echo '
<form method="post" action="">
<select name="number" id="number">
<option value="1" >1</option>
<option value="2" >2</option>
<option value="3" >3</option>
</select>
<input type="submit" value="submit">
</form>
';
}
?>
</body>
</html>
To handle user input you have to use forms:
<form action="action_page.php">
<input type="number" min="1" name="my-number">
<input type="submit" value="Submit">
</form>
After the user sets the number and presses the submit button, you will get the value in action_page.php in $_REQUEST['my-number'].

Printing the current page to pdf using wkhtmltopdf

Recently installed wkhtmltopdf. Was trying to capture the entire page in its current state, however, the below method seems to navigate to the initial state of that page without all the input fields that the user has entered.
PHP
shell_exec('wkhtmltopdf http://localhost/www/bolt/invoice.php invoice.pdf');
I was wondering if someone knew of an implementation of wkhtmltopdf that captures the current state of the page including any text entered in the text fields??
I appreciate any suggestions.
Many thanks in advance!
wkhtmltopdf hits the page independently of your current browsing session. If you hit it like that, you're going to get what anyone would see when they first go to your page. Probably what you want to do is save the current page using an output buffer, and then run wkhtmltopdf on the saved page. Here's some sample code:
sub.php
<?php
$documentTemplate = file_get_contents ("template.html");
foreach ($_POST as $key => $postVar)
{
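// Replace each name="field" attribute with a value="..." attribute holding the
// HTML-escaped submitted value. str_replace avoids regex and backreference
// surprises if the key or value contains special characters.
$documentTemplate = str_replace ("name=\"$key\"",
"value=\"" . htmlspecialchars ($postVar, ENT_QUOTES, 'UTF-8') . "\"", $documentTemplate);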
}
file_put_contents ("out.html", $documentTemplate);
shell_exec ("wkhtmltopdf out.html test.pdf");
?>
template.html
<!DOCTYPE html>
<html>
<head>
<title></title>
<meta charset="utf-8" />
</head>
<body>
<h1>This is a page</h1>
<form action="sub.php" method="post" accept-charset="utf-8">
My Test Field:
<input type="text" name="test_field" value="">
<input type="submit" value = "submit">
</form>
</body>
</html>
Probably in the long run you should have some kind of base template that both pages would use, with markers like value='%valueOfThisVariable%' in your input fields that you can replace with blanks when you present the fields to the user, and fill with the user data when you create the page that you want to write to PDF. Right now it's just going through and replacing all the name='this_name' with value='this_name->value'.

Display error messages in html

I want to display warning messages in html. This code shows two text boxes named "company" and "name". con.php connects to the database and inserts the information. But if I enter nothing, then the values are still getting stored in the database as null. I want the user to know that they shouldn't leave the fields blank by displaying some messages, and a warning should also appear if the given company already exists in the database. How do I implement that?
<html>
<head>
<title>store in a database</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<h2>company Store</h2>
<form name="form1" method="post" action="con.php">
<p>company:<input type="text" name="company">
<br/>
<br/>
<br/>
Name: <input type="text" name="name" size="40">
<br/>
<br/>
<br/>
<input type="submit" value="Save">
<input type="button" onclick="window.close()" value="cancel">
</form>
</body>
</html>
While an alert message cannot be produced without JavaScript, you could take advantage of HTML5's placeholder attribute to inform the user of this message:
<input type="text" placeholder="You must enter something in this field!" name="whatever" id="whatever" />
And couple this with JavaScript:
var inputElem = document.getElementById('whatever');
var form = document.getElementsByTagName('form')[0];
form.onsubmit = function(){
if (inputElem.value === '' || inputElem.value.length < 1){
alert('You must enter some actual information');
return false;
}
};
However, JavaScript can be edited by users via Firebug, Web Inspector, Dragonfly... or bypassed by simply creating a new HTML file and submitting the form to the same URL given in the form element's action attribute. Therefore the submitted data must be sanitised and checked on the server as well as on the client; client-side checking is a convenience to the user (to prevent unnecessary page reloads, submissions and so on). It is not a security feature and should not be used, or mistaken, as such.
The best way is to use Ajax if you want to do it on the same page. You need to read some tutorials on it. It's not that easy to explain here.
If reloading or redirecting to another page is OK for you, compare the submitted form values with the values in the database in the PHP script that the form submits to (the action URL). If the values are not empty and don't match an existing record, store them in the database and redirect to a page such as the list of companies or a "company successfully created" message. If the values are empty or match an existing record, redirect back to the form page with a flag (something like form.php?error=1) and show the proper error message.
Also you can use JavaScript for immediate alerts. But you should always do the same checks at PHP side since JavaScript can be disabled in browsers.
In con.php you should do your data validation and return the markup (or redirect to page describing the error).
So, check for empty fields, and if any exist, redirect the user to a page saying the fields cannot be empty (and probably allow them to enter new values).
If the data entered is ok, check the database for duplicates and if they exist, redirect the user to a page saying that the company already exists (and again probably allow the user to correct the data).
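The checks described above could be sketched as follows. The function validate_company_form() is hypothetical, and $existingCompanies stands in for whatever a database lookup in con.php would return:

```php
<?php
// Hypothetical helper for con.php: returns a list of error messages
// (an empty list means the input is acceptable).
function validate_company_form(array $post, array $existingCompanies): array
{
    $errors  = [];
    $company = trim($post['company'] ?? '');
    $name    = trim($post['name'] ?? '');

    if ($company === '') {
        $errors[] = 'Company must not be empty.';
    }
    if ($name === '') {
        $errors[] = 'Name must not be empty.';
    }

    // Case-insensitive duplicate check against the names already stored.
    $known = array_map('strtolower', $existingCompanies);
    if ($company !== '' && in_array(strtolower($company), $known, true)) {
        $errors[] = 'That company already exists.';
    }

    return $errors;
}

// Example: a duplicate company with a blank name yields two messages.
$errors = validate_company_form(['company' => ' Acme ', 'name' => ''], ['acme']);
foreach ($errors as $message) {
    echo '<p class="error">' . htmlspecialchars($message, ENT_QUOTES, 'UTF-8') . "</p>\n";
}
```

Returning a list of messages keeps the validation separate from the decision to redirect or to re-display the form with the errors.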
You can not do it only with HTML.
You need to add form validation (to prevent empty strings). HTML5 form validation can do that for you (check http://www.broken-links.com/2011/03/28/html5-form-validation/), but not all browsers support it, so you will need to use JavaScript to validate the form.
There are JavaScript libraries that make an old browser behave like one that supports HTML5 (check http://www.matiasmancini.com.ar/jquery-plugin-ajax-form-validation-html5.html).
You will also need to retrieve the companies already in your database and check them against the user input and alert him if needed.
On top of that you will need to validate the data in your PHP before inserting it to the database (check for empty string for example).
