I want to let a user enter data into a text field. That text will be stored in the database and later displayed on some pages, exactly as the user wrote it. Consider Stack Overflow as an example: I am allowed to enter any code or text, anything at all, and the server simply treats it as text rather than running it. How does this work?
My problem is that I can't trust users. A user can enter anything (maybe code such as SQL, or plain text). I planned to use mysql_real_escape_string(), but this function puts slashes into the malicious code. That's good, but I want to store the string the user entered in the database so that I can use it later (not the sanitized string). How can I do that?
I am developing a CMS that uses a database class ( this ). I have read about PDO, but adopting it might force me to change everything, so I would like an approach other than PDO. A parameterized approach would be preferable.
mysql_real_escape_string() does not sanitize or mess up your input in any way, it just prepares your text to be a valid part of a SQL insert statement.
If you get duplicate backslashes before an apostrophe, check if you maybe have "magic quotes" enabled.
Another option is to start using the mysqli driver, which lets you use prepared statements. Prepared statements offer better protection against SQL injection. See responses on this SO post: Does mysqli class in PHP protect 100% against sql injections?
When inserting user-provided content into the database, use query parameters or at least escaping to prevent SQL injection. See also my answer to What is SQL injection?
Even if you get strings of code inserted safely into the database, you have a second possible vulnerability:
When displaying content, be aware of risks of Cross-Site Scripting (XSS). When you display the content from the database in an HTML output, it could contain HTML tags or Javascript code that is executed as part of the web page instead of displaying the code.
To help prevent XSS, you must replace tag-open characters with the corresponding HTML entity; for instance, < should be output as &lt;. This makes sure it is shown as a literal '<' and not interpreted by the user's browser as the start of another tag.
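As a minimal sketch of that conversion, PHP's built-in htmlspecialchars() can be applied at output time. The input string below is made up for illustration, and ENT_QUOTES with an explicit UTF-8 charset is one reasonable choice of flags, not the only one.

```php
<?php
// Hypothetical user input containing markup.
$userInput = '<script>alert("hi")</script> 1 < 2';

// ENT_QUOTES also converts single quotes; the charset is stated explicitly.
$safe = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');

// The browser now shows the markup as text instead of running it.
echo $safe, PHP_EOL;
```

The value stored in the database stays untouched; only the copy sent to the browser is encoded.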
How about encoding the entire string and then inserting it? I use base64_encode to encode, and do the reverse when retrieving from the database. The encoded characters are alphanumerics plus +, / and = padding, none of which can break out of a quoted SQL string.
You can push the entire encoded string to the client-side and decode it with Javascript.
Here is an example
if (isset($_POST['userdata'])) {
    // Base64 output contains only A-Z, a-z, 0-9, +, / and =
    $safestring = base64_encode($_POST['userdata']);
    mysql_query("UPDATE table_name SET value_name = '$safestring'
                 WHERE some_username = 'username'");
}
Related
1) I have a textarea in my HTML. Inside the textarea I wrote: <i>ABC Enterprise</i>. When saving into the SQL database it was saved as &lt;i&gt;XYZ Enterprise&lt;/i&gt;
2) Does anyone know how to retain < and </> when saving into the database without converting? If this is not possible, does anyone know how to convert &lt;i&gt;XYZ Enterprise&lt;/i&gt; to <i>ABC Enterprise</i> in PHP? I need the string to keep the form <i>ABC Enterprise</i> in PHP, not HTML.
I have tried preg_replace("/&([a-z])[a-z]+;/i", "$1", htmlentities($company)), iconv('utf-8', 'ascii//TRANSLIT', $company), htmlspecialchars($company), and many other approaches I stumbled upon on Stack Overflow, but nothing seemed to work. Any help?
To specifically answer your question:
How to retain <> and </> when inserting into the DB? [paraphrased, emphasis added]
Simple: don't modify your data. As discussed below, however, be smart about it and insert the data using a prepared statement.
Why is your data being changed? Most likely because your code is doing some form of modification of the data before putting it in the database. In PHP, this generally means one of:
htmlentities
htmlspecialchars
The general advice for years was simply "escape all your data or suffer the XSS/CSRF/SQL injection/other attack consequences!" The problem is that there are nuances of when and how to escape, and in the zeal for security many websites overdo it. As you've described your situation, I would consider:
When inserting into the DB: use prepared statements, rather than manual escaping.
When pulling from the DB: be judicious when you apply escaping techniques.
A prepared statement is where you tell the database the format of what you're going to send, then send the data in a separate communication. If there's anything awry, the DB knows best how to find it. For example:
$pstmt = $dbh->prepare('INSERT INTO tab (html) VALUES (?)');
$pstmt->execute(array($_POST['my_textarea']));
Note the lack of any sanitization, using the $_POST variable directly. What the user sent to you is what you put in the DB, with zero modification. Because the DB server was sent a format first, it will not allow any ulterior SQL injection shenanigans.
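To see that round trip end to end, here is a small self-contained sketch. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for MySQL; the submitted string is invented for the example.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database stands in for MySQL.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE tab (html TEXT)');

// Hypothetical submission that would break a concatenated query.
$submitted = "<i>ABC Enterprise</i>'); DROP TABLE tab; --";

// The statement text and the data travel separately.
$pstmt = $dbh->prepare('INSERT INTO tab (html) VALUES (?)');
$pstmt->execute(array($submitted));

// What comes back is byte-for-byte what went in.
$stored = $dbh->query('SELECT html FROM tab')->fetchColumn();
var_dump($stored === $submitted);
```

The tags and the quote character are stored unchanged, and the attempted second statement is just part of the stored text.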
However, when pulling data out of the DB, you need to be careful of exactly what data goes where. For example, to allow < and > characters inside of the content might be foolhardy, depending on your context. I'll leave it to you to decide whether you want to escape the output inside of your <textarea>:
echo "<textarea>$textarea_content_as_retrieved_from_db</textarea>";
or
echo '<textarea>' . htmlentities( $textarea_content_as_retrieved_from_db ) . '</textarea>';
I know I've already asked a question about sanitizing and escaping, but I have a question which didn't get answered.
Okay, here it goes. If I have a PHP script that takes the user's input via GET and uses it in a SELECT against a MySQL database, would it matter, or be a security risk, if I didn't escape < and > with htmlspecialchars, htmlentities or strip_tags, and therefore allowed HTML tags to be searched for in the database? The input is already being sanitized with trim(), mysql_real_escape_string and addcslashes (\%_).
The problem with htmlspecialchars is that it escapes the ampersand (&), which the user input is supposed to allow (I guess the same goes for htmlentities?). With strip_tags, something like "<b>John</b>" results in the PHP script selecting and displaying results for John, which it isn't supposed to do.
Here is my PHP-code for sanitizing the input, before selecting from the database:
if (isset($_GET['search'])) {
    if (strlen(trim($_GET['search'])) >= 3) {
        // Escape LIKE wildcards first, then escape for the SQL string literal
        $search = mysql_real_escape_string(addcslashes(trim($_GET['search']), '\%_'));
        $sql = "SELECT name, age, address WHERE name LIKE '%".$search."%'";
        [...]
    }
}
And here is my output for displaying "x matched y results.":
echo htmlspecialchars(strip_tags($_GET['search']), ENT_QUOTES, 'UTF-8')." matched y results.";
A good way to go about this is to use MySQLi. It supports prepared statements, which send the data separately from the query text and so offer strong protection against SQL injection. Not escaping GET data is just as dangerous as not escaping any other input.
There are two different concerns here that you've identified.
User Data in SQL Statements
Whenever you're constructing a query, you need to be absolutely certain that no arbitrary user data will end up in it. These mistakes are called SQL injection bugs and are the result of failing to correctly escape your data. As a general rule, you should never, ever use string concatenation to compose a query. Whenever possible, use placeholders to ensure that your data is correctly escaped.
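For a LIKE search such as the one in the question, a placeholder handles the SQL side while the wildcard characters still need their own escaping. The following is a sketch only: it assumes pdo_sqlite is available, and the people table, its rows, and the helper names likePattern and searchNames are all made up for illustration.

```php
<?php
// Assumptions: pdo_sqlite is available; the table and its rows are invented.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE people (name TEXT)');
$dbh->exec("INSERT INTO people (name) VALUES ('John'), ('Joan'), ('50% off'), ('500 off')");

// Escape the LIKE wildcards so that % and _ in the term match literally.
function likePattern(string $term): string
{
    return '%' . addcslashes($term, '\\%_') . '%';
}

function searchNames(PDO $dbh, string $term): array
{
    // The placeholder carries the data; ESCAPE names the wildcard escape character.
    $stmt = $dbh->prepare("SELECT name FROM people WHERE name LIKE ? ESCAPE '\\' ORDER BY name");
    $stmt->execute(array(likePattern($term)));
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

$literalPercent   = searchNames($dbh, '50%');          // matches only the row containing "50%"
$injectionAttempt = searchNames($dbh, "' OR '1'='1");  // treated as plain text, matches nothing
```

No string concatenation touches the query, so the search term cannot change its structure whatever it contains.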
User Data in HTML Document
When you're rendering a page that contains user-submitted content, you need to escape it so that the user cannot introduce arbitrary HTML tags or scripting elements. This avoids XSS issues and means that characters like & and < do not get interpreted incorrectly. User data of "x < y" wouldn't end up breaking your page.
You'll always need to escape for whatever context you're rendering user data into. There are others, like inside a script tag or in a URL, but these are the two most common ones.
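A rough illustration of context-specific escaping, using only built-in PHP functions. The sample value and the surrounding markup are hypothetical, and the JSON flags shown are one cautious option for inline scripts rather than a complete defence.

```php
<?php
// Hypothetical user-supplied value that is awkward in every context.
$value = 'x < y & "z" </script>';

// HTML body or attribute context: convert markup characters to entities.
$forHtml = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

// URL component context: percent-encode everything outside the unreserved set.
$forUrl = rawurlencode($value);

// Inline script context: emit a JSON literal with <, >, &, ' and " hex-escaped.
$forScript = json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

echo '<p>' . $forHtml . '</p>', PHP_EOL;
echo '<a href="/search?q=' . $forUrl . '">search</a>', PHP_EOL;
echo '<script>var v = ' . $forScript . ';</script>', PHP_EOL;
```

Each function is correct only for its own context; swapping them around reintroduces the problem.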
In the interest of avoiding SQL injection attacks, I'm looking to cleanse all of the text (and most other data) entered by the user of my website before sending it into the database for storage.
I was under the impression that the mysqli_real_escape_string function inserted backslashes ( \ ) before all characters capable of being malicious ( \n , ' , " , etc ), and expected that the returned string would contain the newly added backslashes.
I performed a simple test on a made up string containing such potentially malicious characters and echo'd it to the document, seeing exactly what I expected: the string with backslashes escaping these characters.
So, I proceeded to add the cleansing function to the data before storing into the database. I inserted it (mysqli_real_escape_string( $link , $string)) into the query I build for data storage. Testing the script, I was surprised (a bit to my chagrin) to notice that the data stored in the database did not seem to contain the backslashes. I tested and tested and tested, but all to no avail, and I'm at a loss...
Any suggestions? Am I missing something? I was expecting to then have to remove the backslashes with the stripslashes($string) function, but there doesn't seem to be anything to strip...
When you view your data in the database after a successful insert, having escaped it with mysql_real_escape_string(), you will not see the backslashes in the database. This is because the escaping backslashes are only needed in the SQL query statement. mysql_real_escape_string() sanitizes it for insert (or update, or other query input) but doesn't result in a permanently modified version of the data when it is stored.
In general, you do not want to store modified or sanitized data in your database, but instead should be storing the data in its original version. For example, it is best practice to store complete HTML strings, rather than to store HTML that has been encoded with PHP's htmlspecialchars().
When you retrieve it back out from the database, there is no need for stripslashes() or other similar unescaping. There are some legacy (mis-)features of PHP like magic_quotes_gpc that had been designed to protect programmers from themselves by automatically adding backslashes to quoted strings, requiring stripslashes() to be used on output, but magic quotes were deprecated in PHP 5.3 and removed in PHP 5.4.
MySQL stores the data without the slashes (although it is passed to the RDBMS with the slashes). So you don't need to use stripslashes() later on.
You can be sure that the string was escaped, because otherwise a string containing quotes would have made the query fail.
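The same behaviour can be observed in a few lines. This sketch assumes pdo_sqlite is available and uses PDO::quote() on an in-memory SQLite database; SQLite escapes a quote by doubling it where MySQL's escaping functions add a backslash, but the principle is the same. The table name and sample string are invented.

```php
<?php
// Assumption: pdo_sqlite is available. SQLite doubles the quote character,
// whereas MySQL's escaping functions add a backslash.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE notes (body TEXT)');

$original = "It's a test";

// quote() escapes the value and wraps it in quotes for use inside the query text.
$quoted = $dbh->quote($original);
$dbh->exec("INSERT INTO notes (body) VALUES ($quoted)");

$stored = $dbh->query('SELECT body FROM notes')->fetchColumn();

echo $quoted, PHP_EOL;  // the escaped form that appears in the SQL statement
echo $stored, PHP_EOL;  // the value the database actually holds
```

The escape characters exist only in the statement text; the parser consumes them and the stored value is the original string.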
I'm looking to cleanse all of the text (and most other data) entered by the user of my website
This is what you are doing wrong.
mysqli_real_escape_string does not "cleanse" anything. There is no word "cleanse" in its name.
You should format, not "cleanse" your data. And different data require different formatting.
You should format ALL the data, not only "data entered by the user of my website".
In the current form you are leaving your site highly vulnerable to attacks and errors.
I was under the impression that the function inserted backslashes ( \ ) before all characters capable of being malicious ( \n , ' , " , etc ),
To let you know, there is nothing malicious in any character. There are some service characters that can be misinterpreted in some circumstances.
But adding backslashes doesn't make your data automatically "safe". Some injections don't require any special characters. So you need to format your data properly, not rely on some sort of magic that will make you magically safe.
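A short sketch of an injection that needs no special characters. addslashes() is used here purely as a stand-in for the escaping function, since mysqli_real_escape_string() requires a live connection; the query and the input value are hypothetical.

```php
<?php
// Hypothetical request value; it contains no quotes or backslashes.
$input = '1 OR 1=1';

// addslashes() stands in for the escaping function. Nothing changes,
// because there is nothing in the value to escape.
$escaped = addslashes($input);
$unsafeSql = 'SELECT * FROM users WHERE id = ' . $escaped;

// Formatting the value as what it is supposed to be (an integer) rejects it.
$id = filter_var($input, FILTER_VALIDATE_INT);

echo $unsafeSql, PHP_EOL;
var_dump($id);
```

Escaping is the right format for a quoted string literal only; a number has to be formatted as a number, or bound through a placeholder.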
I read in a PHP book that it is good practice to use htmlspecialchars and mysqli_real_escape_string when handling user-supplied data. What is the main difference between the two, and where is each appropriate? Please guide me.
htmlspecialchars: "<" to "&lt;"
(Replaces HTML-Code)
mysqli_real_escape_string: " to \"
(Replaces Code, that has a meaning in a mysql-query)
Both are used to stay safe against attacks such as SQL injection and XSS.
These two functions are used for completely different things.
htmlspecialchars() converts special HTML characters into entities so that they can be outputted without problems. mysql_real_escape_string() escapes sensitive SQL characters so dynamic queries can be performed without the risk of SQL injection.
You could just as easily say that htmlspecialchars handles sensitive OUTPUT, while mysql_real_escape_string handles sensitive INPUT.
The two functions are totally unrelated in purpose; the only attribute they share is that they are commonly used to provide safety to web applications.
mysqli_real_escape_string is meant to provide safety against SQL injection.
htmlspecialchars is meant to provide safety against cross-site scripting (XSS).
Also see What's the best method for sanitizing user input with PHP? and Do htmlspecialchars and mysql_real_escape_string keep my PHP code safe from injection?
htmlspecialchars turns HTML special characters, such as quotes (both single and double), ampersands, and less-than/greater-than signs, into HTML entities. This function is generally used to ensure that content users post on your website is not interpreted as HTML tags or XSS scripts.
mysql_real_escape_string escapes strings, meaning it adds a \ in front of backslashes, quotes (both single and double), and anything else that can mess up a MySQL query. This function helps ensure that no one is executing SQL commands on your server and getting information from the database.
When to use real_escape_string?
Short: Use when building queries which depend on user submitted data.
Long:
When saving user-submitted data to your database in a manner which does not use prepared statements (these are escaped by default). What it does is prevent situations such as the following
(DO NOT DO THIS):
$sql = "SELECT * FROM Users WHERE UserId = " . $_GET['userid'];
Passing real_escape_string($_GET['userid']) inside a quoted SQL string, instead of concatenating the raw parameter, prevents an attacker from fetching all users by sending a userid parameter such as '100 OR 1=1'. Concatenated raw, that value would yield the query:
SELECT * FROM Users WHERE UserId = 100 OR 1=1;
Which would return all users data in the database.
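Escaped and placed inside quotes (UserId = '100 OR 1=1'), the input is treated as a single string value instead of being interpreted as SQL, and thus would not yield all user data. Escaping alone does not help in an unquoted numeric context like the one above, because 100 OR 1=1 contains no characters to escape; there, cast the value to an integer or use a prepared statement.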
More on SQL injection
When to use htmlspecialchars?
Short: Use when echoing user submitted data to your page.
Long: If user manages to save a string like:
<script>alert("Stealing your cookies")</script>
to your database which is then presented to other users and you echo it without htmlspecialchars the javascript code in the script tag would execute on the users machine, which is just bad news, as now pretty much any data within the browser could be stolen (cookies/localstorage) or the user be redirected.
The resulting string of htmlspecialchars on the aforementioned script tag would be:
&lt;script&gt;alert(&quot;Stealing your cookies&quot;)&lt;/script&gt;
Which would be displayed on the page and not be interpreted as javascript code.
I usually escape user input by doing the following:
htmlspecialchars($str,ENT_QUOTES,"UTF-8");
as well as mysql_real_escape_string($str) whenever a mysql connection is available.
How can this be improved? I have not had any problems with this so far, but I am unsure about it.
Thank you.
Data should be escaped (sanitized) for storage and encoded for display. Data should never be encoded for storage. You want to store only the raw data. Note that escaping does not alter raw data at all as escape characters are not stored; they are only used to properly signal the difference between raw data and command syntax.
In short, you want to do the following:
$data = $_POST['raw data'];
//Shorthand used; you all know what a query looks like.
mysql_query("INSERT " . mysql_real_escape_string($data));
$show = mysql_query("SELECT ...");
echo htmlentities($show);
// Note that htmlentities() is usually overzealous.
// htmlspecialchars() is enough the majority of the time.
// You also don't have to use ENT_QUOTES unless you are using single
// quotes to delimit input (or someone please correct me on this).
You may also need to strip slashes from user input if magic quotes is enabled. stripslashes() is enough.
As for why you should not encode for storage, take the following example:
Say that you have a DB field that is char(5). The HTML input also has maxlength="5". If a user enters "&&&&&", which may be perfectly valid, and you encode it before storage, the encoded form "&amp;&amp;&amp;&amp;&amp;" is truncated and stored as "&amp;". When it's retrieved and displayed back to the user, if you do not encode, they will see "&", which is incorrect. If you do encode, they see "&amp;", which is also incorrect. You are not storing the data that the user intended to store. You need to store the raw data.
This also becomes an issue in a case where a user wants to store special characters. How do you handle the storage of these? You don't. Store it raw.
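The char(5) example can be reproduced without a database. In this sketch substr() simulates the column truncating the value to five characters, which is an assumption; a MySQL server in strict mode would reject the oversized value instead.

```php
<?php
$raw = '&&&&&';  // what the user typed (5 characters)

// Encoding before storage inflates the value to 25 characters...
$encoded = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// ...so a CHAR(5) column keeps only the first five (simulated with substr).
$storedEncoded = substr($encoded, 0, 5);
$storedRaw     = substr($raw, 0, 5);

// Encoding at display time only works when the raw value was stored.
$shownFromRaw = htmlspecialchars($storedRaw, ENT_QUOTES, 'UTF-8');

var_dump($storedEncoded);  // a single encoded ampersand: four characters are lost
var_dump($shownFromRaw);   // all five ampersands, encoded for display
```

Storing raw and encoding on output keeps the column length meaningful and leaves the data usable in non-HTML contexts.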
To defend against sql injection, at the very least escape input with mysql_real_escape_string, but it is recommended to use prepared statements with a DB wrapper like PDO. Figure out which one works best, or write your own (and test it thoroughly).
To defend against XSS (cross-site-scripting), encode user input before it is displayed back to them.
If you only use mysql_real_escape_string($str) to avoid sql injection, make sure you always add single quotes around it in your query.
htmlspecialchars is fine when printing untrusted data to the screen.
For the database switch to PDO.
It's much easier and does the escaping for you.
http://php.net/pdo