On my site users can add content to the database via a form. I want the users to be able to type anything in the form and for it all to be added to the database how they have entered it. At the moment I'm getting problems with a number of characters, namely slashes, &, ? etc.
What is the best way to allow all characters to be added to the database correctly?
Also, do you have to decode them when displaying them for it to work correctly? If so, how do I do that?
When saving, use mysql_real_escape_string (or PDO) to protect against SQL injection attacks. This will make it possible to write quotes and backslashes without destroying the SQL query.
<?php
$text = mysql_real_escape_string($_POST['text']);
mysql_query('INSERT INTO table(text) VALUES("'.$text.'")');
?>
When printing the data to a browser (with echo), first run it through htmlspecialchars to disable HTML and solve your current problem:
<?php
// ...fetch $text from db here...
echo htmlspecialchars($text);
?>
htmlentities() encodes HTML special characters, and html_entity_decode() converts them back.
You can also use nl2br() to preserve line breaks from textarea elements.
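A minimal sketch of how these functions fit together, assuming a UTF-8 page. The sample string and variable names are made up for illustration:

```php
<?php
// Hypothetical user input from a <textarea>: markup, quotes and a line break.
$input = "Tom & Jerry <b>say</b> \"hi\"\nsecond line";

// Encode HTML special characters for output.
$encoded = htmlentities($input, ENT_QUOTES, 'UTF-8');

// html_entity_decode() reverses the encoding.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

// Escape first, then nl2br(), so the <br /> tags are the only real markup.
$html = nl2br(htmlspecialchars($input, ENT_QUOTES, 'UTF-8'));

echo $html, PHP_EOL;
```

The order matters in the last step: calling nl2br() before escaping would turn the inserted line-break tags into visible text.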
Also, you should use PDO (or mysqli) with prepared statements for your database needs: it is safer than escaping data by hand with mysql_real_escape_string(), and the old mysql_* functions were removed in PHP 7.
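As a rough sketch of the PDO approach, the snippet below stores a string containing the troublesome characters from the question and reads it back unchanged. It assumes an in-memory SQLite database (the pdo_sqlite extension) so it runs without a server; the table and column names are hypothetical, and for MySQL only the DSN and credentials would differ.

```php
<?php
// Assumption: in-memory SQLite so the sketch is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)');

// Stand-in for $_POST['text']: quotes, a backslash, & and ?.
$text = "It's \"quoted\" \\ slashed & questioned?";

// The named placeholder keeps the data separate from the SQL text.
$stmt = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
$stmt->execute([':body' => $text]);

// Read the value back and compare it with what was submitted.
$stored = $pdo->query('SELECT body FROM posts')->fetchColumn();
var_dump($stored === $text);
```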
Related
I have not followed this rule:
Best Practice: Always use htmlentities for displaying data in a browser. Do not use htmlentities for storing data.
Instead, I used htmlentities when storing data, so that I would not have to call it every time I display the data; I thought calling it on every display would be slow. But I have realised that if someone gains access to my db and writes a script tag into it, that script will run for every user who views the data.
So is there a way to solve this? I think the solution could be to apply htmlspecialchars only when the data is not already escaped. How can I do that? Or should I delete all the data and rewrite all my scripts so that htmlentities is used only when displaying in the browser?
Updated
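I found html_entity_decode() and wrote this:
function escape($string){
    // Decode first so data that is already encoded is not encoded twice.
    $string = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    return htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
}
Is this the best way to do it? If I do so, will I be able to reverse everything 100% of the time and escape whatever is not already escaped?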
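To make the trade-off concrete, here is a small sketch of what such a helper does with raw and already-encoded rows, and one case where it cannot recover the original. The sample strings are invented, and the explicit ENT_QUOTES flags are an assumption rather than something stated in the question.

```php
<?php
// Decode-then-encode helper, written with explicit flags.
function escape(string $string): string
{
    $decoded = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    return htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');
}

$raw     = '<script>alert(1)</script>';             // written straight into the db
$encoded = '&lt;script&gt;alert(1)&lt;/script&gt;'; // stored through htmlentities()

// Both forms come out escaped exactly once.
var_dump(escape($raw) === escape($encoded));

// Ambiguity: a row that was never encoded but contains entity-like text
// is decoded as well, so a literally typed "&lt;" would display as "<".
$typed = 'Use &lt; here';
var_dump(escape($typed) === htmlspecialchars($typed, ENT_QUOTES, 'UTF-8'));
```

So the helper is safe against script tags, but it is not a 100% reversal when encoded and unencoded rows are mixed. A one-time migration that decodes the rows known to be encoded, followed by escaping only on output, avoids the guesswork.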
In an article, http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html, it says the following:
There are numerous advantages to using prepared statements in your applications, both for security and performance reasons.
Prepared statements can help increase security by separating SQL logic from the data being supplied. This separation of logic and data can help prevent a very common type of vulnerability called an SQL injection attack.
Normally when you are dealing with an ad hoc query, you need to be very careful when handling the data that you received from the user. This entails using functions that escape all of the necessary trouble characters, such as the single quote, double quote, and backslash characters.
This is unnecessary when dealing with prepared statements. The separation of the data allows MySQL to automatically take into account these characters and they do not need to be escaped using any special function.
Does this mean I don't need htmlentities() or htmlspecialchars()?
But I assume I need to add strip_tags() to user input data?
Am I right?
htmlentities and htmlspecialchars are used to generate the HTML output that is sent to the browser.
Prepared statements are used to generate/send queries to the Database engine.
Both allow escaping of data; but they don't escape for the same usage.
So, no, prepared statements (for SQL queries) don't prevent you from properly using htmlspecialchars/htmlentities (for HTML generation).
About strip_tags: it will remove tags from a string, where htmlspecialchars will transform them to HTML entities.
Those two functions don't do the same thing; you should choose which one to use depending on your needs / what you want to get.
For instance, with this piece of code:
$str = 'this is a <strong>test</strong>';
var_dump(strip_tags($str));
var_dump(htmlspecialchars($str));
You'll get this kind of output:
string 'this is a test' (length=14)
string 'this is a &lt;strong&gt;test&lt;/strong&gt;' (length=43)
In the first case, no tag; in the second, properly escaped ones.
And, with an HTML output:
$str = 'this is a <strong>test</strong>';
echo strip_tags($str);
echo '<br />';
echo htmlspecialchars($str);
You'll get:
this is a test
this is a <strong>test</strong>
Which one of those do you want? That is the important question ;-)
Nothing changes for htmlspecialchars(), because that's for HTML, not SQL. You still need to escape HTML properly, and it's best to do it when you actually generate the HTML, rather than tying it to the database somehow.
If you use prepared statements, then you don't need mysql_[real_]escape_string() anymore (assuming you stick to prepared statements' placeholders and resist temptation to bypass it with string manipulation).
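A sketch of why the placeholder is enough: a classic injection string is stored as plain data and the table survives. This assumes an in-memory SQLite database through pdo_sqlite, and the table name is hypothetical.

```php
<?php
// Assumption: in-memory SQLite so the sketch runs without a server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (body TEXT)');

// A typical injection attempt, bound as data through the placeholder.
$payload = "x'); DROP TABLE comments; --";

$stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (?)');
$stmt->execute([$payload]);

// The table still exists and holds the payload verbatim.
$rows = $pdo->query('SELECT body FROM comments')->fetchAll(PDO::FETCH_COLUMN);
var_dump($rows);

// Anti-pattern (left commented out): concatenating the value into the SQL
// bypasses the placeholder and brings the injection risk back.
// $pdo->exec("INSERT INTO comments (body) VALUES ('$payload')");
```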
If you want to get rid of htmlspecialchars(), then there are HTML templating engines that work similarly to prepared statements in SQL and free you from escaping everything manually, for example PHPTAL.
You don't need htmlentities() or htmlspecialchars() when inserting stuff in the database. Nothing bad will happen: you will not be vulnerable to SQL injection if you're using prepared statements.
The good thing is you'll now store the pristine user input in your database.
You DO need to escape stuff on output, when you pull it out of the database and send it back to a client; otherwise you'll be vulnerable to cross-site scripting attacks and other bad things. You'll need to escape it for the output format you need, like HTML, so you'll still need htmlentities etc.
For that reason you could instead escape things as you put them into the database rather than when you output them. However, you'll lose the user's original formatting, and you'll have escaped the data for HTML, which might not pay off if you use the data in different output formats.
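The point about output formats can be sketched like this: the same unescaped value is prepared differently for HTML, JSON and plain text. The sample value and variable names are invented for illustration.

```php
<?php
// Hypothetical value exactly as stored in the database (unescaped).
$comment = 'Fish & chips <b>rock</b>';

// HTML context: escape the special characters.
$html = '<p>' . htmlspecialchars($comment, ENT_QUOTES, 'UTF-8') . '</p>';

// JSON context: json_encode() applies its own escaping rules.
$json = json_encode(['comment' => $comment]);

// Plain-text context (for example an email body): no HTML escaping at all.
$text = $comment;

echo $html, PHP_EOL, $json, PHP_EOL, $text, PHP_EOL;
```

Had the value been stored HTML-escaped, the JSON and plain-text outputs would contain entities such as &amp;amp; that their consumers never asked for.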
Use prepared statements against SQL injection.
Use htmlspecialchars against XSS (for example, a script that redirects to another link):
<?php
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo $str;
output: this is ... and redirect to google.com
Using htmlspecialchars:
$str = "this is <script> document.location.href='https://www.google.com';</script>";
echo htmlspecialchars($str);
Output 1 (as rendered in the browser): this is <script> document.location.href='https://www.google.com';</script>
Output 2 (in view source): this is &lt;script&gt; document.location.href='https://www.google.com';&lt;/script&gt;
(PHP 8.1 and later also encode the single quotes as &#039; by default.)
If a user submits a comment containing that script and the page later echoes every comment from the database unescaped, the script runs in each visitor's browser and redirects them to google.com.
So:
1. Use htmlspecialchars to neutralise malicious script tags on output.
2. Use prepared statements to protect the database.
htmlspecialchars
htmlspecialchars_decode
php validation
I would still be inclined to encode HTML. If you're building some form of CMS or web application, it's easier to store it as encoded HTML, and then re-encode it as required.
For example, when bringing information into a TextArea modified by TinyMCE, they recommend that the HTML should be encoded, since the HTML spec does not allow for HTML inside a text area.
I would also strip_tags() from anywhere you don't want HTML code.
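As a sketch of the textarea case, the stored HTML is encoded before it is placed between the textarea tags, so the editor receives it as text. The fragment and the field name are hypothetical.

```php
<?php
// Hypothetical HTML fragment loaded from the database for editing.
$stored = '<p>Hello <strong>world</strong></p></textarea><script>alert(1)</script>';

// Encoded so that the markup, including the stray </textarea>, becomes the
// textarea's text content instead of being parsed as part of the page.
$field = '<textarea name="content">'
       . htmlspecialchars($stored, ENT_QUOTES, 'UTF-8')
       . '</textarea>';

echo $field, PHP_EOL;
```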
I am using the TinyMCE editor to create an HTML page, which I then insert into MySQL.
I tried this:
$esdata = mysql_real_escape_string($data);
It works for all HTML except images. If I have a hyperlink like:
http://www.abc.com/pic.jpg
then it gets mangled and the image doesn't appear.
INPUT
<img src="../images/size-chart.jpg" alt="Beer" />
OUTPUT
<img src="\""images/size-chart.jpg\\"\"" alt="\"Beer" />
Try using urlencode and urldecode to escape the string.
As Christian said, escaping is not done for the sake of the DB but to keep the data as it is. So you can also use urlencode and urldecode.
For example:
//to encode
$output = urlencode($input);
//to decode
$input = urldecode($output);
You shouldn't over-escape code before you send it to DB.
When you escape it, it's done in a way that it is stored in the DB as it was originally. Escaping is not done for the sake of the DB, but for the sake of keeping the data as it was without allowing users to inject bad stuff in your sql statements (prior to sending the stuff in the DB).
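The effect of over-escaping can be sketched as follows. This is only an illustration of one possible cause (the data being escaped twice, for instance by magic quotes on an old PHP setup and then again by hand), not a diagnosis of the question's setup. addslashes() is used purely as a stand-in for the database escaping so the snippet runs without a connection.

```php
<?php
// The markup from the question.
$input = '<img src="../images/size-chart.jpg" alt="Beer" />';

// Escaped once (addslashes() stands in for the database escaping).
$once = addslashes($input);

// Escaped twice: the backslashes added by the first pass get escaped too.
$twice = addslashes($once);

echo $once, PHP_EOL;
echo $twice, PHP_EOL;

// Removing one level of escaping restores the input from $once,
// but leaves stray backslashes when the data was escaped twice.
var_dump(stripslashes($once) === $input);
var_dump(stripslashes($twice) === $input);
```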
You should use htmlspecialchars function to encode the string and htmlspecialchars_decode to display the string back to html
Should I use htmlentities with strip_tags?
I am currently using strip_tags when adding to database and thinking about removing htmlentities on output; I want to avoid unnecessary processing while generating HTML on the server.
Is it safe to use only strip_tags without allowed tags?
First: Use the escaping method only as soon as you need it. I.e. if you insert something into a database, only escape it for the database, i.e. apply mysql_real_escape_string (or PDO->quote or whatever database layer you are using). But don't yet apply any escaping for the output. No strip_tags or similar yet. This is because you may want to use the data stored in the database someplace else, where HTML escaping isn't necessary, but only makes the text ugly.
Second: You should not use strip_tags. It removes the tags altogether. I.e. the user doesn't get the same output as he typed in. Instead use htmlspecialchars. It will give the user the same output, but will make it harmless.
strip_tags will remove all HTML tags:
"<b>foo</b><i>bar</i>" --> "foobar"
htmlentities will encode characters which are special characters in HTML
"a & b" --> "a &amp; b"
"<b>foo</b>" --> "&lt;b&gt;foo&lt;/b&gt;"
If you use htmlentities, then when you output the string to the browser, the user should see the text as they entered it, not as HTML
echo htmlentities("<b>foo</b>");
Visually results in: <b>foo</b>
echo strip_tags("<b>foo</b>");
Results in: foo
I wouldn't use htmlentities as this will allow you to insert the string, as is, into the database. This is no good for account details or forums.
Use mysql_real_escape_string for inserting data into the database, and strip_tags for receiving data from the database and echoing out to the screen.
Try this one and see the differences:
<?php
$d= isset($argv[1]) ? $argv[1] : "empty argv[1]".PHP_EOL;
echo strip_tags(htmlentities($d)) . PHP_EOL;
echo htmlentities(strip_tags($d)) . PHP_EOL;
?>
Open up cmd or your terminal and type something like the following:
php your_script.php "<br>foo</br>"
This should get you what you want, safely!