I read this tutorial about storing images in DB. In the tutorial, the author escapes special characters in the binary data before inserting: http://www.phpriot.com/articles/images-in-mysql/7 ( using addslashes although mysql_real_escape_string is preferable - but that is another issue ).
The point is, when displaying, he just displays the data as it is stored: http://www.phpriot.com/articles/images-in-mysql/8
My questions:
1) Do we need to escape special characters even for binary field type (blob)?
2) If so, then, do we not need to "unescape" the characters again in order to display the image correctly? (If so, what is the best way to do it. Any comments about efficiency? For large images: escaping and unescaping can be a big overhead?).
Or is it that my understanding about escaping is totally wrong (and escaping only affects the query and not the final data inserted/stored?).
thanks
JP
Your understanding of escaping is wrong. The data being inserted into the database is escaped, so that the query parser sees the information as intended.
Take the string "Jean-Luc 'Earl Grey' Picard".
Escaping results in: 'Jean-Luc \'Earl Grey\' Picard'
When MySQL receives this, it understands that the escaped quotes need to be taken literally, that is what escaping means, and will store them in the database. It will not store the escape-characters in the database. The \ indicates to MySQL that it should take the character following it literally.
When retrieving, the data is presented to your application without the escaping characters, as they are removed when parsing the query.
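To see the round trip in practice, here is a minimal sketch. It assumes PDO with an in-memory SQLite database as a stand-in for MySQL so it runs without a server; SQLite escapes a quote by doubling it ('') where MySQL would typically show \', but the principle is the same. The table and column names are made up for illustration.

```php
<?php
// Assumption: SQLite in-memory via PDO stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE captains (name TEXT)'); // hypothetical table

$name = "Jean-Luc 'Earl Grey' Picard";

// PDO::quote() escapes the value and wraps it in quotes for use in a query.
$literal = $pdo->quote($name);
$pdo->exec("INSERT INTO captains (name) VALUES ($literal)");

// What comes back is the original string; the escapes never reached the table.
$stored = $pdo->query('SELECT name FROM captains')->fetchColumn();

echo "In the query: $literal\n";
echo "In the table: $stored\n";
```

The escaped literal only exists inside the query text, so there is nothing to "unescape" when reading the value back.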
1) Do we need to escape special characters even for binary field type (blob)?
Yes. Binary data can contain any byte, including quotes and backslashes, so it must be escaped like any other value placed in a query. mysql_real_escape_string() (which is indeed the one to use) provides protection against SQL injection attacks, which could easily be inside an image file as well. Any arbitrary data you feed into a query must be escaped first.
I'm trying to sanitize a string to be saved in a db.
First step I took was to use addslashes(), but then I realized it didn't solve many security issues, so I added htmlspecialchars(), and now I have this line of code:
$val=htmlspecialchars(addslashes(trim($val)));
But then I was wondering if it makes any sense at all to use addslashes() on a string that will be processed by htmlspecialchars(), since the latter will "remove" any element that would cause problems, if I'm not mistaken.
In particular, I was wondering if that makes the server work twice without any real need.
You are wrong altogether. addslashes() is not a database escaping function; use the one that comes with your database access extension, like mysqli_real_escape_string().
htmlspecialchars() makes no sense here at all. Only use it when you want to place a string within HTML - that should be when you output stuff, not when storing it in the database.
I wouldn't use either of those when saving the string to the database.
addslashes() escapes only single quotes, double quotes, backslashes (\) and NUL bytes. It's not adequate for avoiding SQL injection, because the DBMS may use other special characters which would have to be escaped as well, and because it knows nothing about the connection's character set. The best way to avoid SQL injection is to use PHP Data Objects (PDO) and its support for bind parameters, which let you keep the parameter values out of the SQL string entirely. If PDO isn't an option for some reason, you should at least use a database-specific escaping function, e.g. mysqli_real_escape_string() if you're using MySQL, to ensure that all the necessary characters are escaped.
htmlspecialchars() is for use when incorporating a non-HTML string into an HTML page; it escapes characters that are significant to a web browser, such as angle brackets, and has nothing to do with databases. Assuming that you're not generating and storing complete HTML documents in your database, you shouldn't be calling this function on values before putting them into the database. Store what the user actually entered, and call htmlspecialchars() when you retrieve the value from the database and you're about to actually put it into some HTML output.
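The two steps described above (bind on the way in, encode on the way out) could look roughly like this. It is a sketch under assumptions: an in-memory SQLite database stands in for MySQL so the code runs anywhere, and the comments table is hypothetical.

```php
<?php
// Assumption: SQLite in-memory via PDO stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (body TEXT)'); // hypothetical table

$val = "  <b>It's</b> \"quoted\" & raw  ";

// Storing: a bound parameter keeps the value out of the SQL string entirely,
// so neither addslashes() nor htmlspecialchars() is involved.
$stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$stmt->execute([':body' => trim($val)]);

// Retrieving: the database hands back exactly what the user entered.
$raw = $pdo->query('SELECT body FROM comments')->fetchColumn();

// Output: encode only at the moment the value goes into HTML.
$html = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
echo $html, "\n";
```

Keeping the stored value raw means the same row can later be sent to a non-HTML consumer (JSON, CSV, email) without first undoing an HTML encoding.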
In the interest of avoiding SQL injection attacks, I'm looking to cleanse all of the text (and most other data) entered by the user of my website before sending it into the database for storage.
I was under the impression that mysqli_real_escape_string() inserted backslashes ( \ ) before all characters capable of being malicious ( \n , ' , " , etc ), and expected that the returned string would contain the newly added backslashes.
I performed a simple test on a made up string containing such potentially malicious characters and echo'd it to the document, seeing exactly what I expected: the string with backslashes escaping these characters.
So, I proceeded to add the cleansing function to the data before storing into the database. I inserted it (mysqli_real_escape_string( $link , $string)) into the query I build for data storage. Testing the script, I was surprised (a bit to my chagrin) to notice that the data stored in the database did not seem to contain the backslashes. I tested and tested and tested, but all to no avail, and I'm at a loss...
Any suggestions? Am I missing something? I was expecting to then have to remove the backslashes with the stripslashes($string) function, but there doesn't seem to be anything to strip...
When you view your data in the database after a successful insert, having escaped it with mysql_real_escape_string(), you will not see the backslashes in the database. This is because the escaping backslashes are only needed in the SQL query statement. mysql_real_escape_string() sanitizes it for insert (or update, or other query input) but doesn't result in a permanently modified version of the data when it is stored.
In general, you do not want to store modified or sanitized data in your database, but instead should be storing the data in its original version. For example, it is best practice to store complete HTML strings, rather than to store HTML that has been encoded with PHP's htmlspecialchars().
When you retrieve it back out from the database, there is no need for stripslashes() or other similar unescaping. There are some legacy (mis-)features of PHP like magic_quotes_gpc that had been designed to protect programmers from themselves by automatically adding backslashes to incoming request data, requiring stripslashes() to be used on it, but those features were deprecated in PHP 5.3 and removed in PHP 5.4.
MySQL stores the data without the slashes (although it is passed to the RDBMS with the slashes). So you don't need to use stripslashes() later on.
You can be sure that the string was escaped, because otherwise a value containing a quote would have broken the query.
I'm looking to cleanse all of the text (and most other data) entered by the user of my website
This is what you are doing wrong.
mysqli_real_escape_string does not "cleanse" anything. There is no word "cleanse" in its name.
You should format, not "cleanse" your data. And different data require different formatting.
You should format ALL the data, not only "data entered by the user of my website".
In the current form you are leaving your site highly vulnerable to attacks and errors.
I was under the impression that the function inserted backslashes ( \ ) before all characters capable of being malicious ( \n , ' , " , etc ),
To let you know, there is nothing malicious in any character. There are some service characters, that can be misinterpreted in some circumstances.
But adding backslashes doesn't make your data automatically "safe". Some injections don't require any special characters. So you need to properly format your data, not just rely on some sort of magic that will make you safe.
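As an illustration of an injection that needs no special characters, here is a hedged sketch. It assumes an in-memory SQLite database in place of MySQL, a made-up users table, and a hypothetical escape_only() helper that mimics what mysqli_real_escape_string() does (escape, but add no surrounding quotes).

```php
<?php
// Assumption: SQLite in-memory via PDO stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER, name TEXT)'); // hypothetical table
$pdo->exec("INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol')");

// Hypothetical helper: escape without adding quotes, like mysqli_real_escape_string().
function escape_only(PDO $pdo, string $value): string
{
    return substr($pdo->quote($value), 1, -1);
}

$id = '1 OR 1=1'; // hostile input containing no quotes or backslashes

// Escaping changes nothing here, and the unquoted value rewrites the WHERE clause.
$escaped = escape_only($pdo, $id);
$leaked = (int) $pdo->query("SELECT COUNT(*) FROM users WHERE id = $escaped")->fetchColumn();

// Formatting the value as what it is meant to be (an integer) closes the hole.
$stmt = $pdo->prepare('SELECT COUNT(*) FROM users WHERE id = :id');
$stmt->bindValue(':id', (int) $id, PDO::PARAM_INT);
$stmt->execute();
$matched = (int) $stmt->fetchColumn();

echo "escaped only: $leaked rows, bound as integer: $matched row\n";
```

The escaped-only query matches every row, while the integer-formatted one matches a single row, which is the sense in which different data require different formatting.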
Well, on my php pages I am escaping in this manner:
$title = "Jack's Long 'Shoes'";
$title = mysql_real_escape_string($title);
$go = mysql_query("INSERT INTO titles (title) VALUES ('$title')");
Then when I view this data via phpmyadmin, the data appears as it were before it was escaped, ie Jack's Long 'Shoes'
I was under the impression that it would look like:
Jack\'s Long \'Shoes\'
Are the slashes supposed to actually be printed inside the mysql database field?
No. The escapes vanish once they pass into the data tables. That's the whole point of escaping data - it's like stuffing a letter into an envelope. The letter stays in the envelope (escaped) during its journey through the postal system. Once it gets to its destination (the database storage medium), it's removed from the envelope and stored in its original form.
If the escaping (envelope) was stored along with the letter, you'd have to UNESCAPE (open the envelope) it each time you pulled the letter out of the database.
For databases, the escaping serves to "hide" SQL metacharacters from the query parser. Once the data has passed through the parser and been written into the DB, the escapes are no longer necessary. The DB's own internal handlers know what is data and what is SQL, so the artificial divisions created by the escapes are not needed at that point.
Certain characters must exist so that the statement can be understood.
For example, in PHP, when you have, $var = "The other day, someone said, \"Hello!\""; you don't expect the \ to exist in the output string. Escaping in SQL is the same concept. The escape characters are there to mark special characters as literal, not to actually show up.
I'm creating a page in which users enter comments, and those comments are inserted into a DB (MySQL). The comments can contain single quotes, double quotes, or any special characters. To escape these I used the following code:
$str = mysql_real_escape_string($str,$conn);
Here $conn is the active connection resource and $str is the string content from a textarea.
This works fine and returns a perfectly escaped string that I can insert into the DB. But if the user types the comment in a word processor such as OpenOffice Writer or MS Word and pastes the text from there, the insert fails with the following error:
Incorrect string value: '\x93testi...' for column 'commnets' at row 1
I think this happens because the single and double quotes in text coming from a word processor (OpenOffice, MS Word) are not escaped properly. How do I escape them so the text can be inserted into the DB? Please help me.
Thanks in advance.....
You aren't submitting a valid UTF-8 string to be saved in the DB. Instead it's probably in a Windows-specific character set such as Windows-1252, where byte 0x93 is a curly opening double quote.
Presumably your users are submitting the text through a web page - you need to make sure that you serve the page in UTF-8, so that the submitted form is also in UTF-8 (which it will be by default if the page is served in UTF-8).
You need to:
Make sure you're sending the UTF-8 charset in the headers.
header("Content-Type:text/html; charset=UTF-8");
And/or set the content type in the <head> section of your page with a <meta> tag that declares charset=UTF-8.
btw mysql_real_escape_string is not really anything to do with the problem here. That function is used to prevent strings containing normal quotes from being used to do SQL injection attacks, which is better solved by using prepared statements anyway.
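For illustration, the sketch below shows why a string starting with \x93 is rejected and one way to repair it. It assumes the pasted bytes really are Windows-1252 (iconv cannot detect that for you), and is_valid_utf8() is a hypothetical helper name. Serving the form as UTF-8 remains the actual fix; converting after the fact is only a fallback.

```php
<?php
// "\x93" and "\x94" are the curly double quotes in Windows-1252 (as pasted from Word).
$fromWord = "\x93testimonial\x94";

// Hypothetical helper: preg_match() with the /u flag fails on invalid UTF-8 input.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// A lone 0x93 byte is not a valid UTF-8 sequence, hence "Incorrect string value".
$before = is_valid_utf8($fromWord);

// Assumption: the input is known to be Windows-1252 (CP1252).
$utf8 = iconv('CP1252', 'UTF-8', $fromWord);
$after = is_valid_utf8($utf8);

echo 'valid before: ', var_export($before, true), ', valid after: ', var_export($after, true), "\n";
```

Note that no amount of quote escaping would have changed the outcome, since the offending bytes are not ASCII quotes at all.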
There is one way to sidestep all this real_escape malarkey and inject INTO sql what is actually supplied, and that is to use mysql's ability to interpret a hexadecimal number of arbitrary length as a string.
e.g.
$query=sprintf("update module set code=0x%s where id='%d'", bin2hex($code), $id);
This works even if code is a BLOB type binary field and $code is full binary data (e.g, an image file contents).
You will also sidestep any sql injection with this.
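I have found that using sprintf to format queries is extremely powerful. It is safe here because %d forces the id to an integer and PHP's bin2hex() emits only hex digits, so anything up to and including binary data gets into the database untainted. (A plain %s with unescaped input would not be safe.)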
Getting it out is somewhat another matter mind you..
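Here is a runnable sketch of the hex-literal idea. It assumes an in-memory SQLite database in place of MySQL, so the literal is written in the X'...' form that SQLite understands (MySQL accepts both 0x... and X'...'); the sample payload is made up.

```php
<?php
// Assumption: SQLite in-memory via PDO stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE module (id INTEGER, code BLOB)');
$pdo->exec('INSERT INTO module (id, code) VALUES (7, NULL)');

// Made-up binary payload with NUL bytes, quotes, a backslash and SQL-looking text.
$code = "\x00\xFF'\"\\ binary; DROP TABLE module; --";
$id = '7';

// bin2hex() yields only hex digits, and %d forces $id to an integer,
// so nothing in either value can terminate the literal or add SQL.
$query = sprintf("UPDATE module SET code = X'%s' WHERE id = %d", bin2hex($code), $id);
$pdo->exec($query);

// Reading it back needs no decoding: the column holds the raw bytes.
$stored = $pdo->query('SELECT code FROM module WHERE id = 7')->fetchColumn();

echo bin2hex($stored) === bin2hex($code) ? "round trip ok\n" : "mismatch\n";
```

One trade-off to keep in mind is that the hex form doubles the size of the query text, which may matter for large images.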
I usually escape user input by doing the following:
htmlspecialchars($str,ENT_QUOTES,"UTF-8");
as well as mysql_real_escape_string($str) whenever a mysql connection is available.
How can this be improved? I have not had any problems with this so far, but I am unsure about it.
Thank you.
Data should be escaped (sanitized) for storage and encoded for display. Data should never be encoded for storage. You want to store only the raw data. Note that escaping does not alter raw data at all as escape characters are not stored; they are only used to properly signal the difference between raw data and command syntax.
In short, you want to do the following:
$data = $_POST['raw data'];
//Shorthand used; you all know what a query looks like.
mysql_query("INSERT " . mysql_real_escape_string($data));
$show = mysql_query("SELECT ...");
echo htmlentities($show);
// Note that htmlentities() is usually overzealous.
// htmlspecialchars() is enough the majority of the time.
// You also don't have to use ENT_QUOTES unless you are using single
// quotes to delimit input (or someone please correct me on this).
You may also need to strip slashes from user input if magic quotes is enabled. stripslashes() is enough.
As for why you should not encode for storage, take the following example:
Say that you have a DB field that is char(5). The html input is also maxlength="5". If a user enters "&&&&&", which may be perfectly valid, and you encode it before storing, the encoded form "&amp;&amp;&amp;&amp;&amp;" is 25 characters long, so at most "&amp;" fits in the field. When it's retrieved and displayed back to the user, if you do not encode, they will see "&", which is incorrect. If you do encode, they see "&amp;", which is also incorrect. You are not storing the data that the user intended to store. You need to store the raw data.
This also becomes an issue in a case where a user wants to store special characters. How do you handle the storage of these? You don't. Store it raw.
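The char(5) problem can be reproduced without a database. In this sketch substr() merely simulates a five-character column cutting the value off; that truncation behaviour is an assumption, since a strict SQL mode would reject the insert instead.

```php
<?php
$input = '&&&&&'; // five characters, fits maxlength="5"

// Encoding before storage inflates the value to 25 characters.
$encoded = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Assumption: substr() simulates a CHAR(5) column truncating the value.
$storedEncoded = substr($encoded, 0, 5); // "&amp;"
$storedRaw     = substr($input, 0, 5);   // "&&&&&"

// Storing raw and encoding on output preserves what the user typed:
// a browser decodes $display back to the original five ampersands.
$display = htmlspecialchars($storedRaw, ENT_QUOTES, 'UTF-8');

echo strlen($encoded), " chars encoded, stored as: $storedEncoded\n";
echo "raw storage displays as: ", html_entity_decode($display, ENT_QUOTES, 'UTF-8'), "\n";
```

With encoded storage four of the five ampersands are lost before the value is ever displayed, which is the mismatch the example above describes.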
To defend against sql injection, at the very least escape input with mysql_real_escape_string, but it is recommended to use prepared statements with a DB wrapper like PDO. Figure out which one works best, or write your own (and test it thoroughly).
To defend against XSS (cross-site-scripting), encode user input before it is displayed back to them.
If you only use mysql_real_escape_string($str) to avoid sql injection, make sure you always add single quotes around it in your query.
The htmlspecialchars is fine when printing unsafe output to the screen.
For the database switch to PDO.
It's much easier and does the escaping for you.
http://php.net/pdo