If you are not using a database with your application, but you echo or otherwise use a $_POST or $_GET variable in your code, do you need to escape it?
Like:
if(isset($_GET['test'])){
echo $_GET['test'];
}
or
function math(){
if(isset($_GET['number'],$_GET['numberr'])){
return $_GET['number']*$_GET['numberr'];
}
return null;
}
Whether or not you use a database, you need to escape or sanitize them before printing. Someone could sneak in stray HTML like <b> that will make your whole page bold, or <script>alert('hello');</script> that will run JavaScript.
echo htmlspecialchars($_GET['test']);
This will replace every < with &lt; and every > with &gt; so that the HTML will be treated as text rather than HTML and will not mess up your page.
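As a minimal sketch of what that replacement looks like in practice: the helper name escape_html is made up for this illustration, and passing ENT_QUOTES and 'UTF-8' explicitly is an assumption on my part so that the result does not depend on the defaults of the PHP version in use.

```php
<?php
// Hypothetical helper: escape a value for output as HTML text.
function escape_html(string $value): string
{
    // ENT_QUOTES also converts single quotes, which matters inside attributes.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$input = "<script>alert('hello');</script>";

// Prints the markup as harmless text:
// &lt;script&gt;alert(&#039;hello&#039;);&lt;/script&gt;
echo escape_html($input), "\n";
```

The browser then displays the tag literally instead of running it.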
You should escape them. You should also use regular expressions to limit the variable content and to prevent "unintended" characters.
EDIT: Sorry to post this as an answer; I am currently not allowed to comment on questions.
Related
I'm sanitizing all user inputs that are output on page, e.g.:
echo escape($user_input);
I have a question about user inputs that are not output to the page but are used inside statements: do they need to be escaped?
Is this OK:
if ($user_input == 'something') { echo escape($another_user_input); }
or should it be:
if (escape($user_input) == 'something') { echo escape($another_user_input); }
Same question for other logic (foreach loops etc.) which would add more faff than this simple example.
The only reason you ever need to escape something is when you're interpolating data into another text medium which gets re-interpreted according to some rules.
E.g.:
echo '<p>' . $somedata . '</p>';
This is programmatically generating HTML which will get interpreted by an HTML parser and will have specific behaviour depending on what's inside $somedata.
$query = 'SELECT foo FROM bar WHERE baz = ' . $somedata;
This is programmatically generating an SQL query which will get interpreted by an SQL parser and will have specific behaviour depending on what's inside $somedata.
If you want to ensure that HTML or that query behaves as you intended, you had better make sure you generate those textual commands in a way that does not allow anyone to inject unwanted commands.
if ($somedata == 'something')
Here you're not creating some new text which will be interpreted by something. There's no way for anyone to inject anything here. It's not as if PHP replaces $somedata with the contents of $somedata and then re-interprets that line with the interpolated data. So there's no need to escape this in any way.
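A short sketch of why escaping inside a comparison is not only unnecessary but can give wrong results. The sample value is invented for the illustration.

```php
<?php
$user_input = 'Tom & Jerry';

// Comparing the raw value works as expected.
var_dump($user_input == 'Tom & Jerry');                    // bool(true)

// Escaping first changes the string (& becomes &amp;), so the same
// comparison now fails even though the user typed the expected text.
var_dump(htmlspecialchars($user_input) == 'Tom & Jerry');  // bool(false)
```

Escaping belongs at the point where the value is written into HTML, not where it is compared.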
Also see The Great Escapism (Or: What You Need To Know To Work With Text Within Text).
I am a beginner in web development. I am developing a site that allows users to post discussions, and others to comment and reply to them. The problem I am facing is that a user can post almost anything, including code snippets and other content which might include single quotes, double quotes and even some HTML.
When such posts are submitted, they interfere with the MySQL insert query: the quotes end the string, and as a result the query fails. And when I display the string using PHP, it is interpreted as HTML by the browser, whereas I want it to be interpreted as text. Do I have to parse the input string manually and escape all the special characters, or is there another way?
You need to read up on a few things
SQL Injection - What is SQL Injection and how to prevent it
PHP PDO - Using PHP PDO reduces the risk of injections
htmlentities
The basic premise is this: sanitize all input that is coming in and encode everything that is going out. Don't trust any user input.
If possible, whitelist instead of blacklisting.
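A minimal sketch of the whitelist idea. The list of sort columns and the function name pick_sort_column are hypothetical, chosen only to show the pattern: accept a value only if it is one of a known set, and fall back to a safe default otherwise.

```php
<?php
// Hypothetical list of values the application is willing to accept.
const ALLOWED_SORT_COLUMNS = ['title', 'created', 'author'];

function pick_sort_column(string $requested, string $default = 'created'): string
{
    // The third argument enables strict comparison, avoiding loose type juggling.
    return in_array($requested, ALLOWED_SORT_COLUMNS, true) ? $requested : $default;
}

echo pick_sort_column('title'), "\n";                    // title
echo pick_sort_column('title; DROP TABLE posts'), "\n";  // created
```

A blacklist would have to anticipate every bad value; the whitelist only has to name the good ones.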
EDIT :
If you want to display HTML or other code content in there, users need to mark those areas with the <pre> tag. Or you could use something like a Markdown variation for formatting.
Use PDO, prepared statements and bound parameters to insert / update data, eg
$db = new PDO('mysql:host=hostname;dbname=dbname', 'user', 'pass');
$stmt = $db->prepare('INSERT INTO table (col1, col2) VALUES (?, ?)');
$stmt->execute(array('val1', 'val2'));
Edit: Please note, this is a very simplified example
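To see that bound parameters really do cope with quotes and HTML in a post, here is a runnable sketch. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for MySQL; the posts table and its columns are invented for the example.

```php
<?php
// Assumption: pdo_sqlite is installed; SQLite stands in for MySQL here
// so the example runs without a database server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE posts (author TEXT, body TEXT)');

// A post containing quotes, HTML and something that looks like SQL.
$body = "It's a \"quoted\" <b>post</b>'; DROP TABLE posts; --";

$stmt = $db->prepare('INSERT INTO posts (author, body) VALUES (?, ?)');
$stmt->execute(['alice', $body]);

// The value comes back unchanged; none of it was interpreted as SQL.
$stored = $db->query('SELECT body FROM posts')->fetchColumn();
var_dump($stored === $body);  // bool(true)
```

The data travels separately from the query text, so there is nothing to escape by hand.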
When displaying data, filter it through htmlspecialchars(), eg
<?php echo htmlspecialchars($row['something'], ENT_COMPAT, 'UTF-8') ?>
Update
As noted on your comment to another answer, if you want to maintain indentation and white-space when displaying information in HTML, wrap the content in <pre> tags, eg
<pre><?php echo htmlspecialchars($data, ENT_COMPAT, 'UTF-8') ?></pre>
Look at the mysql_real_escape_string and htmlentities functions in the PHP manual.
You can also read the Security chapter in the PHP manual.
To keep values from breaking your queries (a breaking query means the value is not being escaped, which leaves a big hole for SQL injection), call mysql_real_escape_string($string) on the value before putting it into the query string, and enclose it in quotes as well. Note that the mysql_* functions were removed in PHP 7; mysqli_real_escape_string or prepared statements are the current equivalents.
Ex. $value = mysql_real_escape_string($value); // be sure to have an open connection before using this function.
$query = "select * from `table` where value = '".$value."'";
As for displaying in html, you should at least echo htmlentities($string) before outputting it to the browser.
Like echo htmlentities($mystring, ENT_QUOTES);
Edit:
To preserve white space, you can use the nl2br function (which inserts the HTML equivalent <br /> at line breaks), or go a little deeper with $string = nl2br(str_replace(" ", "&nbsp;", $string));, but the HTML code would look a bit ugly, at least to me.
Reference: htmlentities, mysql_real_escape_string, nl2br
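A small sketch of combining escaping with nl2br. The sample text is invented; the point it illustrates is the order of the two calls.

```php
<?php
$text = "if (a < b) {\n    run();\n}";

// Escape first, then add <br /> tags. Doing it the other way round
// would escape the <br /> tags themselves.
$html = nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));

echo $html, "\n";
```

nl2br keeps the original newline and places the <br /> in front of it, so the generated source stays readable.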
Use mysql_real_escape_string. It is good practice to use this on all user inputs that go into a query, to prevent SQL injection attacks.
I have a form with 2 textareas; the first one allows the user to send HTML code, the second allows them to send CSS code. I have to verify with a PHP function whether the language is correct.
If the language is correct, for security, I have to check that there is not PHP code or SQL Injection or whatever.
What do you think ? Is there a way to do that ?
Where can I find this kind of function ?
Is "HTML Purifier" http://htmlpurifier.org/ a good solution ?
If you have to validate the data before inserting it into the database, then you just have to use the mysql_real_escape_string() function before inserting it into the DB.
//Safe database insertion
mysql_query("INSERT INTO table(column) VALUES('" . mysql_real_escape_string($_POST['field']) . "')");
If you want to output the data to the end user as plain text, then you have to escape all HTML-sensitive characters with htmlspecialchars(). If you want to output it as HTML, then you have to use the HTML Purifier tool.
//Safe plain text output
echo htmlspecialchars($data, ENT_QUOTES);
//Safe HTML output
$data = purifyHtml($data); //Or however it is specified in the purifier documentation
echo $data; //Safe html output
For something primitive you can use a regex, BUT note that using a parser to exhaust all possibilities is recommended.
/(<\?(?:php)?(.*)\?>)/i
Example: http://regexr.com?2t3e5 (change the &lt; in the expression back to a < and it will work; for some reason regexr changes it to HTML formatting)
EDIT
/(<\?(?:php)?(.*)(?:\?>|$))/i
That's probably better so they can't place php at the end of the document (as PHP doesn't actually require a terminating character)
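A hedged sketch of using that pattern from PHP. One detail is my own addition: the "s" modifier, because without it the dot does not match newlines and an unterminated multi-line block slips through. The function name looks_like_php is made up for the example, and this remains a primitive check rather than a real parser.

```php
<?php
// The pattern from above, plus the "s" modifier (an addition for this sketch).
$pattern = '/(<\?(?:php)?(.*)(?:\?>|$))/is';

// Hypothetical helper: true if the input appears to contain a PHP block.
function looks_like_php(string $input, string $pattern): bool
{
    return preg_match($pattern, $input) === 1;
}

var_dump(looks_like_php('<p>plain html</p>', $pattern));   // bool(false)
var_dump(looks_like_php("<?php\necho 'hi';", $pattern));   // bool(true)
```

Note that any "<?" sequence triggers the check, so an XML declaration would be flagged as well.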
The SHJS syntax highlighter for JavaScript ships files with regular expressions for the languages it highlights at http://shjs.sourceforge.net/lang/. You can check how SHJS parses code.
HTMLPurifier is the recommended tool for cleaning up HTML. And as luck has it, it also includes CSSTidy and can sanitize CSS as well.
... that there is not PHP code or SQL Injection or whatever.
You are basing your question on a wrong premise. While HTML can be cleaned, this is no safeguard against other exploits. PHP "tags" are most likely to be filtered out. But if you are doing something else weird with the content (include-ing or eval-ing it, even partially), that is no real help.
And SQL exploits can only be prevented by meticulously using the proper database escape functions. There is no magic solution to that.
Yes. htmlpurifier is a good tool to remove malicious scripts and validate your HTML. I originally thought it did not handle CSS, but apparently it works with CSS too. Thanks Briedis.
OK, thank you all.
Actually, I realize that I need human validation. Users can post HTML + CSS, and I can verify in PHP that the language and the syntax are correct, but that doesn't stop people from posting an iframe, an HTML redirect, or a big black div that covers the whole screen.
:-)
I made a GET form recently. But the problem is that it is highly vulnerable. You can inject a script as below.
http://mysite.com/processget.phtml?search=Hacked
I'm able to inject any kind of script into the URL above. I'm actually echoing my GET data with an echo in my BODY, so whenever I enter a malicious script it is executed inside my BODY tag. So how do I limit this http://mysite.com/processget.phtml?search= to just numbers, letters and the few symbols I want?
For example, the user should only be able to enter
http://mysite.com/processget.phtml?search=A123123+*$
So can any of you help me fix this bug? I'm kind of new to PHP, so please explain.
if (!empty($_GET['search'])) {
$search = htmlentities($_GET['search'],ENT_QUOTES,'UTF-8');
echo $search;
}
Now it's safe.
But if you want to limit to specific symbols, then you need to use regular expressions.
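A sketch of such a regular expression for the characters in the question (letters, digits and + * $). The function name is_valid_search is hypothetical. The space is allowed as well, because PHP decodes a "+" in the query string to a space before filling $_GET, so the example URL arrives as "A123123 *$".

```php
<?php
// Hypothetical validator: letters, digits, space, and the symbols + * $ only.
function is_valid_search(string $search): bool
{
    // \A and \z anchor the match to the whole string, so a single
    // disallowed character anywhere makes the check fail.
    return preg_match('/\A[A-Za-z0-9 +*$]+\z/', $search) === 1;
}

var_dump(is_valid_search('A123123 *$'));          // bool(true)
var_dump(is_valid_search('<script>x</script>'));  // bool(false)
```

Validation like this limits what is accepted; escaping with htmlentities on output is still worth keeping.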
You can let a user enter whatever they like; the key is to escape the output. Then the string is displayed as typed, rather than being interpreted as HTML.
Use a PHP function like htmlentities.
Strip the tags:
echo strip_tags($_GET['search']);
Actually, you may want htmlspecialchars instead, which escapes the tags instead of removing them so they display as intended:
echo htmlspecialchars($_GET['search']);
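The difference between the two, sketched on an invented sample value (a plain variable is used instead of $_GET so the snippet runs on its own):

```php
<?php
$search = 'a <b>bold</b> claim';

// strip_tags() removes the markup entirely.
echo strip_tags($search), "\n";                             // a bold claim

// htmlspecialchars() keeps it, but as visible text.
echo htmlspecialchars($search, ENT_QUOTES, 'UTF-8'), "\n";  // a &lt;b&gt;bold&lt;/b&gt; claim
```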
I have an HTML form POSTing to a PHP page.
I can read in the data using the $_POST variable on the PHP.
However, all the data seems to be escaped.
So, for example
a comma (,) = %2C
a colon (:) = %3a
a slash (/) = %2F
so something like a simple URL such as http://example.com gets POSTed as http%3A%2F%2Fexample.com
Any ideas as to what is happening?
Actually you want urldecode. %xx is a URL encoding, not an HTML encoding. The real question is why you are getting these codes. PHP usually decodes the values for you as it parses the request into the $_GET, $_POST and $_REQUEST variables, so POSTed values should not still be URL-encoded when you read them. Can you show us some of the code generating the form? Maybe your form is being encoded on the way out for some reason.
See the warning on this page: http://us2.php.net/manual/en/function.urldecode.php
Here is a simple PHP loop to decode all POST vars
foreach($_POST as $key=>$value) {
$_POST[$key] = urldecode($value);
}
You can then access them as normal, but properly decoded. I, however, would use a different array to store them, as I don't like to pollute the superglobals (I believe they should always contain the exact data as set by PHP).
This shouldn't be happening, and though you can fix it by manually urldecode()ing, you will probably be hiding a basic bug elsewhere that might come round to bite you later.
When you POST a form using the default content type 'application/x-www-form-urlencoded', the values inside it are URL-encoded (%xx), but PHP undoes that for you when it makes the values available in the $_POST array.
If you are still getting unwanted %xx sequences afterwards, there must be another layer of manual URL-encoding going on that shouldn't be there. You need to find where that is. If it's a hidden field, maybe the page that generates it is accidentally encoding it using urlencode() instead of htmlspecialchars(), or something? Putting some example code online might help us find out.
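A sketch of how an extra layer of encoding produces exactly the symptom described. It uses parse_str() as a stand-in for the decoding PHP performs on a form-encoded request body, which is an assumption made so the example runs without an actual POST; the field name site is invented.

```php
<?php
$url = 'http://example.com';

// urlencode() is meant for query strings, htmlspecialchars() for HTML attributes.
echo urlencode($url), "\n";         // http%3A%2F%2Fexample.com
echo htmlspecialchars($url), "\n";  // http://example.com

// Encoded once (what the browser does on submit): arrives decoded.
parse_str('site=' . urlencode($url), $once);
echo $once['site'], "\n";           // http://example.com

// Encoded twice (the suspected bug): one layer of %xx survives.
parse_str('site=' . urlencode(urlencode($url)), $twice);
echo $twice['site'], "\n";          // http%3A%2F%2Fexample.com
```

If the second case matches what you see in $_POST, the fix is to remove the extra urlencode() where the form is generated rather than to add urldecode() where it is read.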