PHP/MySQL: How to verify properly escaped data?

Well, on my php pages I am escaping in this manner:
$title = "Jack's Long 'Shoes'";
$title = mysql_real_escape_string($title);
$go = mysql_query("INSERT INTO titles (title) VALUES ('$title')");
Then when I view this data via phpMyAdmin, the data appears as it was before it was escaped, i.e. Jack's Long 'Shoes'
I was under the impression that it would look like:
Jack\'s Long \'Shoes\'
Are the slashes supposed to actually be printed inside the mysql database field?

No. The escapes vanish once they pass into the data tables. That's the whole point of escaping data: it's like stuffing a letter into an envelope. The letter stays in the envelope (escaped) during its journey through the postal system. Once it gets to its destination (the database storage medium), it's removed from the envelope and stored in its original form.
If the escaping (envelope) were stored along with the letter, you'd have to UNESCAPE it (open the envelope) each time you pulled the letter out of the database.
For databases, the escaping serves to "hide" SQL metacharacters from the query parser. Once the data has passed through the parser and has been written into the DB, the escapes are no longer necessary. The DB's own internal handlers know what is data and what is SQL, so the artificial divisions created by the escapes are not needed at that point.

Certain characters must exist so that the statement can be understood.
For example, in PHP, when you have, $var = "The other day, someone said, \"Hello!\""; you don't expect the \ to exist in the output string. Escaping in SQL is the same concept. The escape characters are there to mark special characters as literal, not to actually show up.
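The PHP analogy above is easy to check. A minimal sketch (nothing here beyond the string from the answer):

```php
<?php
// The backslashes exist only in the source code, where they tell the PHP
// parser that the inner double quotes belong to the string.
$var = "The other day, someone said, \"Hello!\"";

echo $var, PHP_EOL;             // prints: The other day, someone said, "Hello!"
var_dump(strpos($var, '\\'));   // bool(false): the value holds no backslash
```

SQL string literals behave the same way: the escape is consumed by the parser and never becomes part of the value.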

Related

PHP MySQL how to transition database input from stripslashes to parameters

I'm trying to update an old program that uses regular MySql queries, runs all the inputs through addslashes() prior to inserting them, and runs the retrieved data through stripslashes() before returning it. I'm trying to update the program to run with Mysqli and use prepared statements, but I'm not sure how to make the transition with the existing data in the database.
My first thought was to write a function to test if slashes have been added, but the general consensus in the answers to similar questions I found was that this isn't doable.
I'm trying to avoid doing a mass update to the database, because it's an open source program and this seems likely to potentially cause problems for existing users when they try to upgrade.
If I continue to use addslashes() prior to inserting using MySqli and prepared statements would this work? What would be the reasons for not doing it this way and instead going the full database upgrade route?
Edit:
Based on a comment that appears to be deleted now, I went and looked at the database directly. All I could find was an ' that had been converted to its HTML entity, and no slashes. When I pulled out the data it came out fine without running stripslashes(). So do the added slashes just tell mysql to escape the data when it's inserted? Will all the data come out fine if I remove the stripslashes()? If so what's the point of stripslashes()?
You have several questions:
If I continue to use addslashes() prior to inserting using mysqli_ and prepared statements would this work?
No, it would not: the added backslashes would then be stored in your database, which is not what you want. In prepared statements the provided arguments are taken literally as you pass them, as in that mechanism there is no more need to escape quotes.
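A minimal sketch of that difference. Assumptions: an in-memory SQLite database reached through PDO stands in for MySQL so the example runs without a server (it needs the pdo_sqlite extension), and the table name is made up. With mysqli you would use prepare() and bind_param() instead, but bound values are taken literally in the same way.

```php
<?php
// Assumption: SQLite in memory via PDO is a stand-in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE titles (id INTEGER PRIMARY KEY, title TEXT)');

$text = "My friend's wedding";
$stmt = $pdo->prepare('INSERT INTO titles (title) VALUES (?)');

$stmt->execute([$text]);               // bound as-is
$stmt->execute([addslashes($text)]);   // escaped first, then bound

$stored = $pdo->query('SELECT title FROM titles ORDER BY id')
              ->fetchAll(PDO::FETCH_COLUMN);

print_r($stored);
// $stored[0] is the original text; $stored[1] carries a literal backslash.
```

The second row is the corrupted one: nothing in the prepared-statement path removes the backslash that addslashes() added.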
What would be the reasons for not doing it this way and instead going the full database upgrade route?
See above.
So do the added slashes just tell MySql to escape the data when it's inserted?
The added slashes are interpreted when you embed a string literal in your SQL, as that necessarily is wrapped in quotes. For example:
$text = "My friend's wedding";
$sql = "INSERT INTO mytable VALUES ('$text')";
$result = mysqli_query($con, $sql);
The mysqli_query will fail, as the SQL statement that is passed to it looks like this:
INSERT INTO mytable VALUES ('My friend's wedding')
This is not valid SQL, as the middle quote ends the string literal, and so the s that follows is not considered part of the string any more. As the SQL engine does not understand the s (nor wedding and the dangling quote after it), it produces a syntax error.
So that is where addslashes() is (somewhat) useful:
$text = "My friend's wedding";
$text = addslashes($text);
$sql = "INSERT INTO mytable VALUES ('$text')";
$result = mysqli_query($con, $sql);
This will work because now the SQL statement sent to MySql looks like this:
INSERT INTO mytable VALUES ('My friend\'s wedding')
The backslash tells MySql that the quote that follows it has to be taken as a literal one, and not as the end of the string. MySql will not store the backslash itself, because it is only there to escape the quote (see list of escape sequences in the documentation).
So in your table you will get this content inserted:
My friend's wedding
Then:
Will all the data come out fine if I remove the stripslashes()?
Yes. In fact, it should never have been in your code in the first place, as it will negatively affect what you get when there truly are (intended!) backslashes in your data. You are maybe lucky that a backslash is a rare character in normal text. Apparently you have not found any backslash in your database data, so it probably did not badly affect you until now. It is an error to think that the addslashes() call during the saving of data had to be countered with a stripslashes() call while reading the data, because, again, MySql had already removed them while writing the data into the database.
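To see the damage on data that really does contain backslashes, here is a small sketch (the path is a made-up example of such a value):

```php
<?php
// A value that legitimately contains backslashes, as it would come back
// from the database.
$fromDb = 'C:\temp\new';

echo $fromDb, PHP_EOL;                 // prints: C:\temp\new
echo stripslashes($fromDb), PHP_EOL;   // prints: C:tempnew
```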
If so what's the point of stripslashes()?
There is no point in the context of what you are doing. It is only useful if you have somehow received a string in which certain characters are preceded by a backslash that is meant to escape the next character when the string is used in some context. But since you only briefly have such strings (after calling addslashes()), and MySql interprets these escapes by removing the backslashes in the written data, you never again see that string with the additional backslashes.
Situations where you would need stripslashes() are very rare and very specific. For your scenario you don't need it.

real_escape_string not cleaning up entered text

I thought the proper way to "sanitize" incoming data from an HTML form before entering it into a mySQL database was to use real_escape_string on it in the PHP script, like this:
$newsStoryHeadline = $_POST['newsStoryHeadline'];
$newsStoryHeadline = $mysqli->real_escape_string($newsStoryHeadline);
$storyDate = $_POST['storyDate'];
$storyDate = $mysqli->real_escape_string($storyDate);
$storySource = $_POST['storySource'];
$storySource = $mysqli->real_escape_string($storySource);
// etc.
And once that's done you could just insert the data to the DB like this:
$mysqli->query("INSERT INTO NewsStoriesTable (Headline, Date, DateAdded, Source, StoryCopy) VALUES ('".$newsStoryHeadline."', '".$storyDate."', '".$dateAdded."', '".$storySource."', '".$storyText."')");
So I thought doing this would take care of cleaning up all the invisible "junk" characters that may be coming in with your submitted text.
However, I just pasted some text I copied from a web page into my HTML form, clicked "submit" (which ran the above script and inserted that text into my DB), but when I read that text back from the DB, I discovered that this piece of text still had junk characters in it, such as â€“.
And those junk characters of course caused the PHP script I wrote that retrieves the information from the DB to crash.
So what am I doing wrong?
Is using real_escape_string not the way to go here? Or should I be using it in conjunction with something else?
OR, is there something I should be doing (like more escaping) when reading data back out from the mySQL database?
(I should mention that I'm an Objective-C developer, not a PHP/mySQL developer, but I've unfortunately been given this task to do some DB stuff - hence my question...)
thanks!
Your assumption is wrong. mysqli_real_escape_string’s only intention is to escape certain characters so that the resulting string can be safely used in a MySQL string literal. That’s it, nothing more, nothing less.
The result should be that exactly the passed data is retained, including ‘junk’. If you don’t want that ‘junk’ in your database, you need to detect, validate, or filter it before passing it to MySQL.
In your case, the ‘junk’ seems to be due to different character encodings: your input data seems to be encoded with UTF-8 while it’s later displayed using Windows-1250. In this scenario, the character – (U+2013) would be encoded as 0xE28093 in UTF-8, which would represent the three characters â, €, and “ in Windows-1250. Properly declaring the document’s encoding would probably fix this.
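The byte arithmetic can be reproduced in a few lines. A sketch, assuming the mbstring extension is loaded; Windows-1252 is used for the misreading only because mbstring is known to support it, and it maps these three bytes to the same characters as Windows-1250:

```php
<?php
// U+2013 (en dash) written with a Unicode escape; PHP stores it as UTF-8.
$dash = "\u{2013}";

echo bin2hex($dash), PHP_EOL;   // prints: e28093
echo strlen($dash), PHP_EOL;    // prints: 3 (bytes, not characters)

// Treat the three UTF-8 bytes as if each were one Windows-1252 character,
// which is what a page with the wrong declared encoding effectively does.
$misread = mb_convert_encoding($dash, 'UTF-8', 'Windows-1252');
echo $misread, PHP_EOL;         // shows the three "junk" characters
```

Nothing was corrupted in storage; the same bytes are simply being decoded with the wrong table.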
Sanitization is a tricky subject, because it means different things depending on the context. :)
real_escape_string just makes sure your data can be included in a query (inside quotes, of course) without being able to change the "meaning" of the query.
The manual page explains what the function really does: it escapes NUL characters, line feeds, carriage returns, backslashes, single quotes, double quotes, and "Control-Z" (the SUBSTITUTE character, 0x1A). So it just puts a backslash escape in place of those characters.
That's it. It "sanitizes" the string so it can be passed unchanged in a query. But it doesn't sanitize it from any other point of view: users can still pass, for instance, HTML markup or "strange" characters. You need to make rules depending on what your output format is (most of the time HTML, but HTTP isn't restricted to HTML documents), and what you want to let your users do.
If your code can't handle some characters, or if they have a special meaning in the output format, or if they cause your output to appear "corrupted" in some way, you need to escape or remove them yourself.
You will probably be interested in htmlspecialchars. Control characters generally aren't a problem with HTML. If your output encoding is the same as your input encoding, they won't be displayed and thus won't be an issue for your users (well, maybe for the W3C validator). If you think it is, make your own function to check and remove them.
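For the HTML case, a short sketch of escaping at output time (the headline value is hypothetical, standing in for a row read back from the database):

```php
<?php
// Hypothetical value read back from the database, stored exactly as entered.
$headline = "<b>Tom & 'Jerry'</b>";

// Escape for the HTML context when printing, not before storing.
$safe = htmlspecialchars($headline, ENT_QUOTES, 'UTF-8');

echo $safe, PHP_EOL;
// prints: &lt;b&gt;Tom &amp; &#039;Jerry&#039;&lt;/b&gt;
```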

clarification on mysqli_real_escape_string: storing in database

In the interest of avoiding SQL injection attacks, I'm looking to cleanse all of the text (and most other data) entered by the user of my website before sending it into the database for storage.
I was under the impression that the function inserted backslashes ( \ ) before all characters capable of being malicious ( \n , ' , " , etc ), and expected that the returned string would contain the newly added backslashes.
I performed a simple test on a made up string containing such potentially malicious characters and echo'd it to the document, seeing exactly what I expected: the string with backslashes escaping these characters.
So, I proceeded to add the cleansing function to the data before storing into the database. I inserted it (mysqli_real_escape_string( $link , $string)) into the query I build for data storage. Testing the script, I was surprised (a bit to my chagrin) to notice that the data stored in the database did not seem to contain the backslashes. I tested and tested and tested, but all to no avail, and I'm at a loss...
Any suggestions? Am I missing something? I was expecting to then have to remove the backslashes with the stripslashes($string) function, but there doesn't seem to be anything to strip...
When you view your data in the database after a successful insert, having escaped it with mysql_real_escape_string(), you will not see the backslashes in the database. This is because the escaping backslashes are only needed in the SQL query statement. mysql_real_escape_string() sanitizes it for insert (or update, or other query input) but doesn't result in a permanently modified version of the data when it is stored.
In general, you do not want to store modified or sanitized data in your database, but instead should be storing the data in its original version. For example, it is best practice to store complete HTML strings, rather than to store HTML that has been encoded with PHP's htmlspecialchars().
When you retrieve it back out from the database, there is no need for stripslashes() or other similar unescaping. There are some legacy (mis-)features of PHP, like magic_quotes_gpc, that were designed to protect programmers from themselves by automatically adding backslashes to quoted strings, requiring stripslashes() to be used on output, but those features were deprecated in PHP 5.3 and removed in PHP 5.4.
MySQL stores the data without the slashes (although it is passed to the RDBMS with the slashes). So you don't need to use stripslashes() later on.
If the string contained quotes, you can be sure it was escaped, because otherwise the query would have failed.
I'm looking to cleanse all of the text (and most other data) entered by the user of my website
This is what you are doing wrong.
mysqli_real_escape_string does not "cleanse" anything. There is no word "cleanse" in its name.
You should format, not "cleanse", your data. And different data require different formatting.
You should format ALL the data, not only "data entered by the user of my website".
In the current form you are leaving your site highly vulnerable to attacks and errors.
I was under the impression that the function inserted backslashes ( \ ) before all characters capable of being malicious ( \n , ' , " , etc ),
To let you know, there is nothing malicious in any character. There are some service characters, that can be misinterpreted in some circumstances.
But adding backslashes doesn't make your data automatically "safe". Some injections don't require any special characters. So you need to properly format your data, not just rely on some sort of magic that will make you safe.
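A sketch of an injection that needs no special characters. Assumptions: addslashes() stands in for mysqli_real_escape_string() (which needs a live connection), and the users table and id column are made up. Neither function alters a value that contains no quotes or backslashes.

```php
<?php
// Attacker-supplied "number" for a numeric, unquoted context.
$id = '1 OR 1=1';

// Escaping finds nothing to escape, so the payload passes through untouched.
$escaped = addslashes($id);
$unsafe  = "SELECT * FROM users WHERE id = $escaped";
echo $unsafe, PHP_EOL;   // prints: SELECT * FROM users WHERE id = 1 OR 1=1

// Formatting the value for its actual type (an integer) closes the hole.
$safe = 'SELECT * FROM users WHERE id = ' . (int) $id;
echo $safe, PHP_EOL;     // prints: SELECT * FROM users WHERE id = 1
```

Binding the value through a prepared statement would achieve the same without any manual formatting.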

Do values coming directly from the database need to be escaped?

Do I need to escape/filter data that is coming from the database? Even if said data has already been "escaped" once (at the point in time where it was inserted into the database).
For example, say I allow users to submit blog posts via a form that has a title input and a textarea input.
A malicious user submits the blog post
title: Attackposttitle');DROP TABLE posts;--
textarea: Hahaha nuked ur site noobzors!
Now as this is being inserted into my database, I am going to escape it with mysql_real_escape_string, but once it is in the database I will later reference this data in my php blog application with something like this:
$sql="SELECT posttitle FROM posts WHERE id=50";
$posttitlearray = mysql_fetch_array(mysql_query($sql));
This is where my concern is, what if I, for example, run the following query to get the post content:
$sql="SELECT postcontent FROM posts WHERE posttitle=$posttitlearray[posttitle]";
In theory am I not sql injecting myself? IE, am I not effectively running the query:
$sql="SELECT postcontent FROM posts WHERE posttitle=Attackposttitle');DROP TABLE posts;--";
Or does the "Attackposttitle');DROP TABLE posts;--" data continue to be escaped once it is in the database?
Do I need to continually escape it like so:
$sql="SELECT postcontent FROM posts WHERE posttitle=mysql_real_escape_string($posttitlearray[posttitle])";
Or is the data safe once it has been escaped initially upon first being inserted into the database?
Thanks Stack!
It does not continue to be escaped once it's put in the database. You'll have to escape it again.
$sql="SELECT postcontent FROM posts WHERE posttitle='".mysql_real_escape_string($posttitlearray['posttitle'])."'";
The value should be escaped every time, just before it is inserted into an SQL query. Not for magical security reasons, but simply to be sure that the syntax of the resulting query is OK.
Escaping a string sounds magical to many people, something like a shield against some mysterious danger, but in fact it is nothing magical. It is just the way to allow special characters to be processed by the query.
The best would be just to have a look what escaping really does. Say the input string is:
Attackposttitle');DROP TABLE posts;--
after escaping:
Attackposttitle\');DROP TABLE posts;--
In fact it escaped only the single quote. That's the only thing you need to ensure: that when you insert the string in the query, the syntax will be OK!
insert into posts set title = 'Attackposttitle\');DROP TABLE posts;--'
It's nothing magical like a danger shield or something; it just ensures that the resulting query has the right syntax! (Of course, if it doesn't, it can be exploited.)
The query parser then looks at the \' sequence and knows that it is still inside the value, not at its end. It removes the backslash, and the following is stored in the database:
Attackposttitle');DROP TABLE posts;--
which is exactly the same value as user entered. And which is exactly what you wanted to have in the database!!
So this means that if you fetch that string from the database and want to use it in a query again, you need to escape it again to be sure that the resulting query has the right syntax.
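The round trip can be sketched end to end. Note the swap: this uses bound parameters rather than manual escaping, since they make the same point with less room for error. Assumptions: in-memory SQLite through PDO stands in for MySQL (pdo_sqlite required), and the schema mirrors the question's posts table.

```php
<?php
// Assumption: SQLite in memory via PDO is a stand-in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, posttitle TEXT, postcontent TEXT)');

$title = "Attackposttitle');DROP TABLE posts;--";
$pdo->prepare('INSERT INTO posts (posttitle, postcontent) VALUES (?, ?)')
    ->execute([$title, 'Hahaha']);

// First query: the title comes back exactly as the user typed it.
$fetched = $pdo->query('SELECT posttitle FROM posts WHERE id = 1')->fetchColumn();

// Second query: the fetched value is still just text, so it is bound again.
$stmt = $pdo->prepare('SELECT postcontent FROM posts WHERE posttitle = ?');
$stmt->execute([$fetched]);
$content = $stmt->fetchColumn();

var_dump($fetched === $title);   // bool(true)
echo $content, PHP_EOL;          // prints: Hahaha
```

The stored value is never "pre-escaped"; every query that uses it has to treat it as raw data.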
But in your example, a very important thing to mention is the magic_quotes_gpc directive!
This feature escapes all user input automatically (gpc stands for _GET, _POST and _COOKIE). This is an evil feature made for people not aware of SQL injection. It is evil for two reasons. The first reason is that you then have to distinguish between your first and second query: in the first you don't escape and in the second you do. What most people do is either switch the "feature" off (I prefer this solution) or unescape the user input first and then escape it again when needed. The unescape code could look like:
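// Only relevant on PHP versions before 5.4, where magic quotes still exist.
function stripslashes_deep($value)
{
    // Recurse into arrays so nested form fields are unescaped too.
    return is_array($value) ?
        array_map('stripslashes_deep', $value) :
        stripslashes($value);
}
// get_magic_quotes_gpc() was removed in PHP 8, hence the existence check.
if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc()) {
    $_POST = stripslashes_deep($_POST);
    $_GET = stripslashes_deep($_GET);
    $_COOKIE = stripslashes_deep($_COOKIE);
}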
The second reason why this is evil is because there is nothing like "universal quoting".
When quoting, you always quote text for some particular output, like:
string value for mysql query
like expression for mysql query
html code
json
mysql regular expression
php regular expression
For each case, you need different quoting, because each usage is present within different syntax context. This also implies that the quoting shouldn't be made at the input into PHP, but at the particular output! Which is the reason why features like magic_quotes_gpc are broken (never forget to handle it, or better, assure it is switched off!!!).
So, what methods would one use for quoting in these particular cases? (Feel free to correct me, there might be more modern methods, but these are working for me)
string value for mysql query: mysql_real_escape_string($str)
like expression for mysql query: mysql_real_escape_string(addcslashes($str, "%_"))
html code: htmlspecialchars($str)
json: json_encode() - only for utf8! I use my function for iso-8859-2
mysql regular expression: mysql_real_escape_string(addcslashes($str, '^.[]$()|*+?{}')) - you cannot use preg_quote in this case because backslash would be escaped two times!
php regular expression: preg_quote()
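A small sketch of how the same text is quoted differently per output context. It leaves out the mysql_* calls, which need a connection; the sample strings are made up.

```php
<?php
$str = '50%_off "today"';

// LIKE pattern: neutralise the wildcards % and _ (the result still needs
// string-literal escaping or binding before it goes into a query).
echo addcslashes($str, '%_'), PHP_EOL;
// prints: 50\%\_off "today"

// HTML output.
echo htmlspecialchars($str, ENT_QUOTES, 'UTF-8'), PHP_EOL;
// prints: 50%_off &quot;today&quot;

// JSON output.
echo json_encode($str), PHP_EOL;
// prints: "50%_off \"today\""

// PHP regular expression, with / as the delimiter.
echo preg_quote('a.b*c', '/'), PHP_EOL;
// prints: a\.b\*c
```

Four contexts, four different results from one input, which is why quoting at input time cannot work.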
Try using bind variables, which will remove the need to escape your data completely.
http://php.net/manual/en/function.mssql-bind.php
The only downside is that you're restricted to using them with stored procedures in SQL Server; with other databases you can use them for everything.

PHP - Mysql: storing images in DB - escaping special characters

I read this tutorial about storing images in DB. In the tutorial, the author escapes special characters in the binary data before inserting: http://www.phpriot.com/articles/images-in-mysql/7 ( using addslashes although mysql_real_escape_string is preferable - but that is another issue ).
The point is, when displaying, he just displays the data as it is stored: http://www.phpriot.com/articles/images-in-mysql/8
My questions:
1) Do we need to escape special characters even for binary field type (blob)?
2) If so, then, do we not need to "unescape" the characters again in order to display the image correctly? (If so, what is the best way to do it. Any comments about efficiency? For large images: escaping and unescaping can be a big overhead?).
Or is it that my understanding about escaping is totally wrong (and escaping only affects the query and not the final data inserted/stored?).
thanks
JP
Your understanding of escaping is wrong. The data being inserted into the database is escaped, so that the query parser sees the information as intended.
Take the string "Jean-Luc 'Earl Grey' Picard".
Escaping results in: 'Jean-Luc \'Earl Grey\' Picard'
When MySQL receives this, it understands that the escaped quotes need to be taken literally, that is what escaping means, and will store them in the database. It will not store the escape-characters in the database. The \ indicates to MySQL that it should take the character following it literally.
When retrieving, the data is presented to your application without the escaping characters, as they are removed when parsing the query.
1) Do we need to escape special characters even for binary field type (blob)?
Yes, because mysql_real_escape_string() (which is indeed the one to use) provides protection against SQL injection attacks, which could easily be inside an image file as well. Any arbitrary data you feed into a database must be sanitized first.
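That binary data survives the trip unchanged can be checked with a short sketch. Note the swap: it binds the bytes through a prepared statement rather than escaping them by hand. Assumptions: in-memory SQLite through PDO stands in for MySQL (pdo_sqlite required), and the table name and byte string are invented.

```php
<?php
// Assumption: SQLite in memory via PDO is a stand-in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE images (id INTEGER PRIMARY KEY, data BLOB)');

// Bytes that would break a hand-built query: NUL, quotes, backslash.
$bytes = "\x89PNG\x00'\\\"\xFF";

$stmt = $pdo->prepare('INSERT INTO images (data) VALUES (?)');
$stmt->bindValue(1, $bytes, PDO::PARAM_LOB);
$stmt->execute();

$back = $pdo->query('SELECT data FROM images WHERE id = 1')->fetchColumn();

// True when the bytes come back exactly as they went in, with no
// unescaping step on the way out.
var_dump($back === $bytes);
```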
