A better way to clean data before writing to database - php

I'm using PHP and MySQL to power a basic forum. When users type an apostrophe (') or insert links into their post, the mysql_real_escape_string function adds a \ to the text. When the post is displayed, the links don't work, and every apostrophe has a \ before it.
Is the problem that I am not doing something before outputting the text or is the issue that I'm not cleaning the data properly before writing to MySQL?

Are magic quotes turned on? You can check quickly by creating a PHP page like so:
<?php var_dump(get_magic_quotes_gpc()); ?>
If the page says something like int(1), then the culprit isn't mysql_real_escape_string, but PHP itself. It was a security feature, but not very secure, and mostly just annoying. Before you sanitize each variable, you first need to undo the slashing with stripslashes.

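You can also turn off magic_quotes_runtime by using this:
if ( version_compare(PHP_VERSION, '5.3.0', '<') ) {
    set_magic_quotes_runtime(0);
}
This turns off magic_quotes_runtime when your server is running any version of PHP lower than 5.3.0. Note that this setting only affects data read from external sources such as databases and files; the slashes added to form input come from magic_quotes_gpc, which cannot be changed at runtime and has to be disabled in the server configuration (for example php.ini).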

Related

stripslashes issue in php

I am using stripslashes in PHP but I am not getting the result I expect. Below is what I use in my code.
For example, the table holds a value like suresh\'s kuma\"r
I try to display the value in the following three formats, but none of them gives the exact value:
1) value=<?=stripslashes($row[1])?> //output is suresh's
2) value='<?=stripslashes($row[1])?>' //output is suresh
3) value="<?=stripslashes($row[1])?>" //output is suresh's kuma
But the exact output I need is suresh's kuma"r
How can I resolve this issue?
The issue has nothing to do with stripslashes. If I guess correctly, the problem is that in your examples the quotes break the HTML attribute.
I'll show you by manually echoing out your $row content as per your description:
value=suresh's kuma"r --> the browser interprets this as value="suresh's" followed by kuma"r
value='suresh's kuma"r' --> same story: value='suresh' followed by s kuma"r'
value="suresh's kuma"r" --> you know the drill: value="suresh's kuma" followed by r"
Escaping the quotes won't affect the HTML, since backslashes have no meaning in HTML.
Both value="Suresh" and value="Suresh\" will work fine for the browser, but the rest of your name will always be interpreted by the browser as some unknown attribute, leaving only the first part inside the value.
What you might do instead is apply htmlentities($row[1], ENT_QUOTES) so that the quotes get converted to their equivalent entities (&quot;, for example) and do not break your value attribute. See the manual.
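A minimal sketch of that suggestion, assuming the stray backslashes have already been removed from the stored value. It uses htmlspecialchars rather than htmlentities (with ENT_QUOTES both escape quotes the same way), and the input field name is only illustrative:

```php
<?php
// The value the asker wants to display, with no backslashes in it.
$name = 'suresh\'s kuma"r';

// ENT_QUOTES converts both ' and ", so neither can close the attribute early.
$safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// Illustrative field; only the value attribute matters here.
echo '<input type="text" name="name" value="' . $safe . '">';
```

The browser decodes the entities again when it reads the attribute, so the field shows the original text with both quotes intact.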
Another issue is that you shouldn't have backslashes in your database in the first place; this might be due to magic_quotes being enabled by your provider, or to you manually calling addslashes() or some other wrong trickery. If you want to insert values containing quotes into a database, use the escaping mechanism provided by your database driver (mysql_real_escape_string() for MySQL, for example), or better tools (prepared statements with bound parameters).
You should first get rid of all the slashes by applying stripslashes and saving the content back; but slashes or not, the issue will appear again if you don't format the value appropriately for your HTML, as I showed above.
Are you sure you want stripslashes instead of addslashes? Is the purpose to quote the " characters?

PHP, why do you escape my quotes? [duplicate]

So, I have a file called Save.php.
It takes two things: a file, and the new contents.
You use it by sending a request like '/Resources/Save.php?file=/Resources/Data.json&contents={"Hey":"There"}'.
..but of course, encoding the url. :) I left it all unencoded for simplicity and readability.
The file works, but instead of the contents being..
{"Hey":"There"}
..I find..
{\"Hey\":\"There\"}
..which of course throws an error when trying to use JSON.parse when getting the JSON file later through XHR.
To save the contents, I just use..
file_put_contents($url, $contents);
What can I do to get rid of the backslashes?
Turn magic_quotes off in PHP.ini.
Looks like you have magic_quotes turned on.
If that is the case, either turn it off or undo its effect at runtime.
Try this:
file_put_contents($url, stripslashes($contents));
You probably have magic quotes enabled. There are only two things you can do: disable magic quotes in your php.ini, or call stripslashes() on the values in the $_GET and $_POST globals.
FYI, use $_GET['contents'] as opposed to $contents; newer versions of PHP will not create the $contents variable.
You should disable magic_quotes in your php.ini configuration file. However if this is not possible you can also use the stripslashes() function to get rid of the automatic escaping.
If you can not get magic quotes switched off for your server, then you need to check if it is switched on using get_magic_quotes_gpc() and if it is true, stripslashes().
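The check-then-strip approach from the answers above could look roughly like this. The helper name strip_magic_slashes is hypothetical, and the function_exists guard is an addition here, needed because get_magic_quotes_gpc() was removed in PHP 8:

```php
<?php
// Hypothetical helper: recursively remove the slashes that magic quotes added.
function strip_magic_slashes($value)
{
    return is_array($value)
        ? array_map('strip_magic_slashes', $value)
        : stripslashes($value);
}

// get_magic_quotes_gpc() no longer exists in PHP 8, so test for it first.
// The @ hides the deprecation notice PHP 7.4 emits for this call.
$magicQuotesOn = function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc();

if ($magicQuotesOn) {
    // Undo the automatic escaping once, before any other code reads the input.
    $_GET    = strip_magic_slashes($_GET);
    $_POST   = strip_magic_slashes($_POST);
    $_COOKIE = strip_magic_slashes($_COOKIE);
}
```

Stripping only when the setting is detected matters: calling stripslashes() unconditionally would eat legitimate backslashes on servers where magic quotes are off.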

mysql_real_escape_string() -> stripslashes() -> jquery.append()

I'm letting my users type in text, which I then send to server-side PHP for processing. If everything goes as it should, I just append the text with jQuery without the page having to reload.
This is the procedure:
$post_text = htmlspecialchars(mysql_real_escape_string($_POST['post_text']));
some logic...
everything ok!
stripslashes(str_replace("\\n", "", $post_text))
and then I send all the necessary data with JSON:
echo json_encode($return);
On the client side I append the HTML chunk saved in a variable from the server side.
This seems to work on localhost, where it removes all the slashes and so on, but online it just doesn't remove the slashes and they keep coming up. When I hit refresh they disappear, because then it's a
stripslashes($comment['statusmsg_text'])
written out with PHP straight from the database. Is it the JSON that adds some extra stuff? I don't get it, because it works perfectly on localhost.
Best regards,
alexander
The additional slashes might come from magic quotes. You shouldn't rely on them; disable them instead.
Additionally, mysql_real_escape_string should only be used to prepare strings that are going into a string literal in a MySQL statement. The same applies to htmlspecialchars, which should only be used for escaping data that is going into an HTML context.
It may be that the magic_quotes_gpc directive is set differently on your server and on your localhost, so your string is double-escaped on the server side.
Try it without stripslashes; json_encode should handle that. All you need to do is use mysql_real_escape_string once, right before your string touches your database.
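As a rough illustration of why no stripslashes call is needed on the JSON path, here is a sketch with a hard-coded string standing in for $_POST['post_text']. The 'html' key and the use of nl2br are assumptions made for the example:

```php
<?php
// Raw text as typed by the user (in real code this would be $_POST['post_text']).
$post_text = "It's a \"quoted\" line\nwith a second line";

$return = array(
    // Escape for HTML only, because the client appends this value as markup.
    'html' => '<p>' . nl2br(htmlspecialchars($post_text, ENT_QUOTES, 'UTF-8')) . '</p>',
);

// json_encode() adds whatever escaping JSON itself needs;
// the JSON parser on the client removes it again.
$json = json_encode($return);
echo $json;
```

The SQL escaping is kept out of this path entirely and would be applied separately, only to the copy of the text that goes into the query.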

A PHP Function that verify code language

I have a form with 2 textareas; the first one allows user to send HTML Code, the second allows to send CSS Code. I have to verify with a PHP function, if the language is correct.
If the language is correct, for security, I have to check that there is not PHP code or SQL Injection or whatever.
What do you think ? Is there a way to do that ?
Where can I find this kind of function ?
Is "HTML Purifier" http://htmlpurifier.org/ a good solution ?
If you have to make the data safe to insert into the database, then you just have to use the mysql_real_escape_string() function before inserting it.
//Safe database insertion
mysql_query("INSERT INTO table(column) VALUES('" . mysql_real_escape_string($_POST['field']) . "')");
If you want to output the data to the end user as plain text, then you have to escape all HTML-sensitive characters with htmlspecialchars(). If you want to output it as HTML, then you have to use the HTML Purifier tool.
//Safe plain text output
echo htmlspecialchars($data, ENT_QUOTES);
//Safe HTML output (assumes the HTML Purifier library has already been included)
$purifier = new HTMLPurifier(HTMLPurifier_Config::createDefault());
echo $purifier->purify($data); //Safe html output
For something primitive you can use a regex, BUT it should be noted that using a parser to fully exhaust all possibilities is recommended.
/(<\?(?:php)?(.*)\?>)/i
Example: http://regexr.com?2t3e5 (change the &lt; in the expression back to a < and it will work; for some reason RegExr changes it to HTML formatting)
EDIT
/(<\?(?:php)?(.*)(?:\?>|$))/i
That's probably better, so they can't place PHP at the end of the document (as PHP doesn't actually require a closing tag).
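A quick way to try the second pattern, sketched with made-up sample strings. One caveat worth checking: without the s modifier, . does not match newlines, so an unclosed PHP block that spans several lines slips through. The $dotall variant below is an addition here, not part of the original answer:

```php
<?php
$original = '/(<\?(?:php)?(.*)(?:\?>|$))/i';  // pattern from the answer above
$dotall   = '/(<\?(?:php)?(.*)(?:\?>|$))/is'; // same pattern plus the s modifier

$html      = '<p>Hello</p>';
$oneLine   = '<p>Hello</p><?php echo 1; ?>';
$multiLine = "<p>Hello</p><?php\necho 1;";     // no closing tag, spans two lines

var_dump(preg_match($original, $html));      // int(0): nothing to find
var_dump(preg_match($original, $oneLine));   // int(1): PHP block detected
var_dump(preg_match($original, $multiLine)); // int(0): missed, "." stops at the newline
var_dump(preg_match($dotall, $multiLine));   // int(1): detected once "." matches newlines
```

This is still only a primitive check, as the answer says; it flags anything starting with <? and is no substitute for a real sanitizer.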
The SHJS syntax highlighter for JavaScript ships files with regular expressions (http://shjs.sourceforge.net/lang/) for the languages it highlights. You can check how SHJS parses code.
HTMLPurifier is the recommended tool for cleaning up HTML. And as luck has it, it also includes CSSTidy and can sanitize CSS as well.
... that there is not PHP code or SQL Injection or whatever.
You are basing your question on a wrong premise. While HTML can be cleaned, this is no safeguard against other exploits. PHP "tags" are most likely to be filtered out, but if you are doing something else weird (include-ing or eval-ing the content partially), that's no real help.
And SQL exploits can only be prevented by meticulously using the proper database escape functions. There is no magic solution to that.
Yes, HTML Purifier is a good tool to remove malicious scripts and validate your HTML. I didn't think it does CSS, but apparently it works with CSS too. Thanks Briedis.
OK, thank you all.
Actually, I realize that I need human validation. Users can post HTML + CSS, and I can verify in PHP that the language and the syntax are correct, but that doesn't stop people from posting an iframe, an HTML redirect, or a big black div that takes up the whole screen.
:-)

Images not uploading when htmlentities has 'UTF-8' set

I have a form that, among other things, accepts an image for upload and sticks it in the database. Previously I had a function filtering the POSTed data that was basically:
function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES);
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
When, in an effort to fix some weird entities that weren't getting converted properly I changed the function to (all that has changed is I added that 'UTF-8' bit in htmlentities):
function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES, 'UTF-8'); //added UTF-8
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
And now images will not upload.
What would be causing this? Simply removing the 'UTF-8' bit allows images to upload properly but then some of the MS Word entities that users put into the system show up as gibberish. What is going on?
EDIT: Since I cannot do much to change the code on this beast, I was able to slap a bandaid on by using htmlspecialchars() rather than htmlentities(), and that seems to at least leave the image data untouched while converting things like quotes, angle brackets, etc.
bobince's advice is excellent but in this case I cannot now spend the time needed to fix the messy legacy code in this project. Most stuff I deal with is object oriented and framework based but now I see first hand what people mean when they talk about "spaghetti code" in PHP.
function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES);
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}
This function represents a basic misunderstanding of string processing, one common to PHP programmers.
SQL-escaping, HTML-escaping and input validation are three separate functions, to be used at different stages of your script. It makes no sense to try to do them all in one go; it will only result in characters that are ‘special’ to any one of the processes getting mangled when used in the other parts of the script. You can try to tinker with this function to try to fix mangling in one part of the app, but you'll break something else.
Why are images being mangled? Well, it's not immediately clear via what path image data is going from a $_FILES temporary upload file to the database. If this function is involved at any point though, it's going to completely ruin the binary content of an image file. Backslashes removed and HTML-escaped... no image could survive that.
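One plausible mechanism, sketched below with a few hand-picked bytes rather than a real upload: when htmlentities() is told the input is UTF-8 and the bytes are not valid UTF-8 (as is typical for binary image data), it returns an empty string unless ENT_SUBSTITUTE or ENT_IGNORE is set. Whether this is exactly what happens in the asker's code path is an assumption:

```php
<?php
// The PNG signature followed by a few bytes that HTML escaping cares about.
// "\x89" on its own is not a valid UTF-8 sequence.
$imageBytes = "\x89PNG\r\n\x1a\n<\xE9&";

$asUtf8   = htmlentities($imageBytes, ENT_QUOTES, 'UTF-8');
$asLatin1 = htmlentities($imageBytes, ENT_QUOTES, 'ISO-8859-1');

var_dump($asUtf8 === '');             // bool(true): invalid UTF-8 input yields ''
var_dump($asLatin1 === $imageBytes);  // bool(false): bytes such as < and & get rewritten
```

Either way the binary content does not survive, which is the point of the paragraph above: image data should never pass through text-escaping functions.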
mysql_real_escape_string is for escaping some text for inclusion in a MySQL string literal. It should be used always-and-only when making an SQL string literal with inserted text, and not globally applied to input. Because some things that come in in the input aren't going immediately or solely to the database. For example, if you echo one of the input values to the HTML page, you'll find you get a bunch of unwanted backslashes in it when it contains characters like '. This is how you end up with pages full of runaway backslashes.
(Even then, parameterised queries are generally preferable to manual string hacking and mysql_real_escape_string. They hide the details of string escaping from you so you don't get confused by them.)
htmlentities is for escaping text for inclusion in an HTML page. It should be used always-and-only in the output templating bit of your PHP. It is inappropriate to run it globally over all your input because not everything is going to end up in an HTML page or solely in an HTML page, and most probably it's going to go to the database first where you absolutely don't want a load of &lt; and &amp; rubbish making your text fail to search or substring reliably.
(Even then, htmlspecialchars is generally preferable to htmlentities as it only encodes the characters that really need it. htmlentities will add needless escaping, and unless you tell it the right encoding it'll also totally mess up all your non-ASCII characters. htmlentities should almost never be used.)
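A small comparison of the two functions on the same UTF-8 input; the sample string is made up, and the é is written as explicit bytes so the result does not depend on the encoding of the source file:

```php
<?php
// "Café <b>" with the é spelled out as its two UTF-8 bytes.
$text = "Caf\xC3\xA9 <b>";

$special  = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

var_dump($special === "Caf\xC3\xA9 &lt;b&gt;"); // bool(true): only the markup characters change
var_dump($entities === 'Caf&eacute; &lt;b&gt;'); // bool(true): the é is turned into an entity too
```

Both results render identically in a browser; the htmlspecialchars output simply leaves characters alone that never needed escaping.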
As for stripslashes... well, you sometimes need to apply that to input, but only when the idiotic magic_quotes_gpc option is turned on. You certainly shouldn't apply it all the time, only when you detect magic_quotes_gpc is on. It is long deprecated and thankfully dying out, so it's probably just as good to bomb out with an error message if you detect it being turned on. Then you could chuck the whole processInput thing away.
To summarise:
At start time, do no global input processing. You can do application-specific validation here if you want, like checking a phone number is just numbers, or removing control characters from text or something, but there should be no escaping happening here.
When making an SQL query with a string literal in it, use SQL-escaping on the value as it goes into the string: $query= "SELECT * FROM t WHERE name='".mysql_real_escape_string($name)."'";. You can define a function with a shorter name to do the escaping to save some typing. Or, more readably, parameterisation.
When making HTML output with strings from the input or the database or elsewhere, use HTML-escaping, eg.: <p>Hello, <?php echo htmlspecialchars($name); ?>!</p>. Again, you can define a function with a short name to do echo htmlspecialchars to save on typing.
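The three points above can be sketched end to end as follows. This is an illustration under assumptions, not the asker's setup: PDO with an in-memory SQLite database stands in for MySQL so the example runs wherever the pdo_sqlite extension is available, and the table name posts and the helper h() are hypothetical:

```php
<?php
// Hypothetical short helper for HTML output, as suggested above.
function h($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Assumption: in-memory SQLite instead of MySQL, purely to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)');

// 1. Raw input: no global escaping is applied to it.
$body = "It's <b>Bob's</b> post";

// 2. SQL stage: a parameterised query, so the driver deals with the quoting.
$insert = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
$insert->execute(array(':body' => $body));

// The database holds the text exactly as it was typed.
$stored = $pdo->query('SELECT body FROM posts')->fetchColumn();

// 3. Output stage: HTML-escape at the moment the value is written into the page.
echo '<p>' . h($stored) . '</p>';
```

Because each kind of escaping happens only at the boundary that needs it, the stored text contains neither backslashes nor entities and can be searched or reused in non-HTML contexts.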
