I'm working on a PHP comment system and came across the problem that the commentator's quotation marks are delimited when written to the comment file, so the output would end up like this for example:
"That is your father\'s! It\'s special to him!" (random sentence). How do I disable this?
The backslashes are added to the database query to prevent SQL injection. You can use the stripslashes() function to remove them when you retrieve the comments from the database.
You should also take a look at magic quotes.
It depends on version of PHP and it's configuration. Older versions (older than 5.3) had this enabled by default. It adds the quotes when you post your comment (so it will be stored in the database with the quotes). You can disable this behavior:
http://cz.php.net/manual/en/function.get-magic-quotes-gpc.php
http://cz.php.net/manual/en/function.set-magic-quotes-runtime.php
http://cz.php.net/manual/en/info.configuration.php#ini.magic-quotes-gpc
For existing comments, you'll have to run some cleanup script that will fetch all rows, performs stripslashes() on it and save it back.
Escaping your queries should be done by mysql_real_escape() anyway, relying on magic quotes is suicide, so if you think about it, it's safer to turn them off completely and escape the queries manually.
Turn gpc_magic_quote 's off in your php.ini .
Those backslashes are there to escape the quotation marks from the SQL engine.
That style of programming is quite common with the mysql_* series of functions which take a string directly from the program and execute it directly in the database engine. This is notoriously prone to SQL injection attacks and, as you've discovered, corrupting your data. (When applied consistently, you can always de-corrupt the data on the way back to the user, with the stripslashes() function, but that also must be done consistently.)
The far better approach in my humble opinion is to use prepared statements and let the database libraries insert data directly into the database without any escaping or un-escaping involved. This also completely removes the risk of SQL injection attacks. (Though you're still free to write insecure code.)
Related
Right now I am using htmlspecialchars(mysql_escape_string($value)), but is there a way to sanitize it with one statement rather than a nested statement?
Well there's no one function that handles both of them.
You can use prepared statements and html puffier class, maybe then the "look and feel" will be little bit better :)
mysql_real_escape_string has actually fallen out of favor lately.
It is now preferred to use PDO or mysqli. They both come with PHP by default. They use something called parameterized queries to access the database, rather than having you write the SQL command yourself. This means that you don't need to worry about escaping anymore, since the query and the variables are passed into the function separately.
You can learn more about PDO here:
http://net.tutsplus.com/tutorials/php/why-you-should-be-using-phps-pdo-for-database-access/
On a related note, it is conventional to store user-supplied input into the database "as it was written", rather than using htmlspecialchars. You should then use htmlspecialchars to escape the data wherever the it appears on the site. This is a convention recommended by OWASP.
This is because you need to escape different things depending on context. This string:
' <script src="http://example.org/malice.js"></script> ]}\\\\--
...will need to be treated differently if it is used as a parameter in JSON (the quotes and backslashes and ] and } need to be escaped), HTML (the quotes and <s need to be escaped), or written as a URL (almost everything needs to be escaped). If you need to spend time instructing your JavaScript to un-encode the HTML, then your code is going to be confusing quickly.
This approach also makes fixing bugs simpler: if your site has a bug where content isn't escaped properly on a single page, then you can update the page and everything is fixed. If your site has a bug where the data is getting stored in the database incorrectly, then you need to fix everything in the database (which will take much longer and harm more users).
There is a (ugly)project I was asked to help with by doing a specific task.
The thing is, I don't have any control over the other files in the site much less over the server configuration. Some data I'm going to use comes from a query like this:
'SELECT * FROM table where value like "'.$unsafe.'"';
$unsafe is an unescaped value coming from $_POST or $_GET.
I checked the server, is PHP5.1.6 and has magic_quotes_gpc On so the data is being auto escaped.
Is this query breakable? Being $unsafe between colons gives me the impression It cant be broken but maybe I'm missing something.
I know magic_quotes_gpc is deprecated because of its insecurity so I'm concerned about it, not because of the application security which fails every where but for my own knowledge.
EDIT: I'm aware of the security implications of *magic_quotes_gpc* and I never use it in my own projects. I always use parameterized queries to avoid injection but this time I was asked to add a very specific pice of code in a friend/client project, so I cant change what is already done. I'd like to know if there is a specific value I can use to create an injection so I can illustrate my friend why he should change it.
if the DB is mysql use mysqli_real_escape_string() instead, if the PHP version is very old you can use the mysql_real_escape_string (not recommended at the moment).
even if the variable is between colons it can be injected, you just need to close the colons inside the value of the variable and then inject whatever you want afterwards.
With regard to your edit: You asked "I'd like to know if there is a specific value I can use to create an injection so I can illustrate my friend why he should change it."
According to the manual page for mysqli_real_escape_string(), the characters it escapes are as follows:
NUL (ASCII 0), \n, \r, \, ', ", and Control-Z.
The old mysql_real_escape_string() function also escapes the same characters.
This gives you a starting point as to which characters can be used to do injection attacks in MySQL. Magic quotes only escapes the quote characters and the slash character, which clearly leaves several gaping holes that can be exploited.
In an easy world, the above information would be enough for us to fix the escaping by doing a string replace on the remaining unescaped characters.
However, both the real_escape functions also require an active database connection for them to work, and this leads us to a further complication: character sets.
Further attacks are possible if the database has a different character set to PHP, particularly with variable-length character sets such as UTF-8 or UTF-16.
An attacker who knows (or can guess) the character set that PHP and the DB are using can send a crafted injection attack string that contains characters that PHP would not see as needing escaping, but which would still cause succeed in hacking MySQL. This is why the real_escape functions need to access the DB in order to know how to do the escaping.
Further resources:
Php & Sql Injection - UTF8 POC
SQL injection that gets around mysql_real_escape_string()
I hope that gives you a few pointers.
I am using mysql_real_escape_string to save content in my mySQL database. The content I save is HTML through a form. I delete and re-upload the PHP file that writes in DB when I need it.
To display correctly my HTML input I use stripslashes()
In other case, when I insert it without mysql_real_escape_string, I do not use stripslashes() on the output.
What is your opinion? Does stripslashes affect performance badly ?
Do not use stripslashes(). It is utterly useless in terms of security, and there's no added benefit. This practice came from the dark ages of "magic quotes", a thing of the past that has been eliminated in the next PHP version.
Instead, only filter input:
string: mysql_real_escape_string($data)
integers: (int)$data
floats: (float)$data
boolean: isset($data) && $data
The output is a different matter. If you are storing HTML, you need to filter HTML against javascript.
Edit: If you have to do stripslashes() for the output to look correctly, than most probably you have magic quotes turned on. Some CMS even made the grave mistake to do their own magic quotes (eg: Wordpress). Always filter as I advised above, turn off magic quotes, and you should be fine.
Do not think about performance, think about security. Use mysql_real_escape_string everytime you're inserting data into DB
No, don't escape it. Use prepared statements instead. Store your data in its raw format, and process it as necessary for display - for example, use a suitable method to prevent Javascript from executing when displaying user supplied HTML.
See Bill Karwin's Sql Injection Myths and Fallacies talk and slides for more information on this subject.
See HTML Purifier and htmlspecialchars for a couple of approaches to filter your HTML for output.
Check out a database abstraction library that does all this and more for you automatically, such as ADOdb at http://adodb.sourceforge.net/
It addresses a lot of the concerns others have brought up such as security / parameterization. I doubt any performance saved is worth the developer hassle to do all this manually every query, or the security practices sacrificed.
It is always best to scrub your data for potential malicious or overlooked special characters which might throw errors or corrupt your database.
Per PHP docs, it even says "If this function is not used to escape data, the query is vulnerable to SQL Injection Attacks."
magic_quotes_gpc
We all hate it and many servers still use this setting and knowingly enough some provides will argue it's safer but I must disagree.
The question I have is, what would a backslash be needed for?
I want to remove them completely which I can do but I am not sure if they are needed?
EDIT
Other then SQL injection.
magic_quotes_gpc() was provided based on the misguided notion that ALL data submitted to PHP from any external source would be immediately inserted into a database. If you wanted to send that data somewhere OTHER than a database, you had to remove the slashes that PHP just inserted, doubling the work required.
As well, not all databases use slashes for escaping metacharacters. \' is fine in MySQL, but in MS Access, escaping a single quote is actually '' - so not only was PHP doing unecessary work, in many situations, it was doing the work WRONG to begin with.
And then, on top of all that, addslashes (which is basically what magic_quotes_gpc() was calling internally) can't handle all forms of SQL injection attacks, particularly where Unicode is used. addslashes is a glorified form of str_replace("'", "\\'", $string), which works at the ASCII level - plenty of Unicode sequences can look like regular ascii, but get turned into SQL metacharacters after a simplistic addslashes() has wreaked its havoc.
They are for preventing SQL injection exploits, a very serious issue you should read up on if you're going to be coding for the web.
You should look into prepared queries, which is a much better way of avoiding SQL injection.
There is no good reason to have this feature in PHP.
That's why it's officially deprecated and will not exist in future versions.
If there were good reasons to keep it, the developer community would have done so.
They are designed to make us do extra work to remove them. For example, some code in Dokuwiki.
Don't forget about magic_quotes_runtime too.
One thing that's always confused me is input escaping and whether or not you're protected from attacks like SQL injection.
Say I have a form which sends data using HTTP POST to a PHP file. I type the following in an input field and submit the form:
"Hello", said Jimmy O'Toole.
If you print/echo the input on the PHP page that receives this POST data, it comes out as:
\"Hello\", said Jimmy O\'Toole.
This is the point where it gets confusing. If I put this input string into (My)SQL and execute it, it'll go into the database fine (since quotes are escaped), but would that stop SQL injection?
If I take the input string and call something like mysqli real_escape_string on it, it comes out like this:
\\"Hello\\", said Jimmy O\\'Toole.
So when it goes into the database via (My)SQL, it ends up as:
\"Hello\", said Jimmy O\'Toole.
This obviously has too many slashes.
So if the input comes through HTTP POST as escaped, do you have to escape it again to make it safe for (My)SQL? Or am I just not seeing something obvious here?
Thanks in advance for any help.
Ah, the wonders of magic quotes. It is making those unnecessary escapes from your POST forms. You should disable (or neutralize) them, and many of your headaches go away.
Here's an exemplary article of the subject: http://www.sitepoint.com/blogs/2005/03/02/magic-quotes-headaches/
Recap: disable magic quotes, use real_escape_string().
Instead of relying on escaping I would use parametrized SQL queries and let the mysql driver do whatever escaping it needs.
It looks like your PHP server has the Magic Quotes feature enabled - that's where your first set of slashes comes from. In theory, it should then be unnecessary to call the escape functions - but when the app runs on a server with magic quotes disabled, you're suddenly wide open to SQL injection while thinking you aren't.
As chakrit wrote, escaping is not the best way to protect yourself - It's much safer to user parameterized queries.
What's going on is that you have Magic Quotes turned on in your PHP configuration.
It's highly recommended that youturn magic quotes off - in fact, they've been removed from PHP 6 completely.
Once you disable magic quotes, you'll see the POSTed text coming back exactly as you typed it in to the form: "Hello", said Jimmy O'Toole.
It's now obvious that you need to use the mysql escaping functions or even better, prepared statements (with prepared statements you can't forget to escape a string as it's done for you).
Obvious is the keyword for a hacker.
I think escaping normally should be enough, but protecting against just the quotes might not be enough.
See this SQL Injection cheatsheet, it's a good list of test you can run and see if too many slahses is a good thing or not.
And don't forget to escape other kinds of values too, i.e. numeric fields and datetime fields can all be injected just as easily as strings.