HTML forms, php and apostrophes - php

Doing a uni assignment with HTML, XML and php 5.3 (no SQL). Building a review website. I have a textarea in which the user can place their comments. If the user enters an apostrophe, eg World's Best Uni!, when I echo $_REQUEST['reviewtext'] I get World\'s Best Uni!
To massage the data for saving in the XML, I have the following code:
$cleantext1 = htmlspecialchars($_REQUEST['reviewtext']);
substr_replace($cleantext1,"\'","'");
$cleantext2 = strip_tags($cleantext1);
$cleantext3 = utf8_encode($cleantext2);
I have echo's at each step an the quote remains World\'s Best Uni! at each step.
I expected the one of the first two lines to replace the escaped apostrophe with an html code but it doesn't seem to work.
Interestingly, this problem doesn't happen on my local XAMPP server; only on my hosted website.
Any suggestions?
Thanks,
Sean

What you are experiencing is PHP's Magic Quotes feature which is automatically escaping input from GET, POST, COOKIE. It is not wise to rely on this feature, and is deprecated as of PHP 5.3, and tends to default to off on most configurations (but not in your Uni's config).
You can use get_magic_quotes_gpc() to determine if this is turned on, and if so, unescape the data.
if (get_magic_quotes_gpc()) {
$val = stripslashes($_POST['val']);
} else {
$val = $_POST['val'];
}
The magic quotes reference goes into more detail on the history, usage, and how to deal with magic quotes.
Also, just an aside, when you output data, always make sure you escape it (e.g. htmlspecialchars() and when you process input from any untrusted source, make sure to filter it (e.g. addslashes(), mysql_real_escape_string()).

Try switching off magic quotes (a PITA IMO!). (as posted above while I was typing my response drew's method would be the most flexible for portability.
By the way, no need to declare new variables if you aren't going to process variables different ways. so after you clean the text with htmlspecialchars, I would toss it into $cleanreview. Also you are not specifying a character encoding which can come back to bite you. I use UTF-8 since it seem like the most forward thinking encoding that's already widely supported.
http://www.php.net/manual/en/function.htmlspecialchars.php
BTEW, I'm a stickler for proper punctuation too so in my code I replace the html entities on output:
$syn = str_replace("'", "’", $syn);
$syn = str_replace("“", "“", $syn);
$syn = str_replace("”", "”", $syn);
$syn = str_replace(" -- ", "—", $syn);
But of course that's assuming UTF-8 being declared in your html (first item after the tag for speed).

This effect of automatical escaping of some characters is called "magic quotes" and can be turned on/off in the php.ini configuration file. Apparently, in the configuration of your local server it is turned off, while on the server it is on.
For more info, just consult the PHP reference for "magic quotes".

Related

Unescaping " In PHP Dynamically

There is a page that I'm currently working on (http://www.flcbranson.org/freedownloads-new.php) that loads data from an rss feed.
That rss feed contains descriptions, some of which contain quotation marks.
When the page is displayed (you can click on the Read Summary link for Filled With All The Fullness Of God to see what I'm talking about), it does \" for each quote.
I assume that it's because of php's escaping requirements.
Is there a way that I can remove the escape character (other than the obvious "remove the quotation marks")?
Sounds like you have magic quotes turned on. Read the PHP documentation for stripslashes() and pay special attention to the magic quotes stuff.
In a nutshell, if you know that your working with a string and not (say) an array, you can do the following:
if (get_magic_quotes_runtime()) {
$string = stripslashes($string);
}
If the data is coming from $_GET, $_POST, or $_COOKIE superglobals, use this instead:
if (get_magic_quotes_gpc()) {
$string = stripslashes($string);
}
If it's not a string you're dealing with, you may need to look at the stripslashes_deep() implementation in the PHP docs.
You need to remove the slashes by running data through:
stripslashes()
However, you still want to make your output (if you are doing something with this) HTML safe.
so run this function on the data after:
htmlspecialchars()
try using stripslashes()
http://www.php.net/manual/en/function.stripslashes.php
checkout stripslashes()

PHP submitting forms, escaped quotes?

If I have a form with a value of just "" and I submit it and echo it with PHP, I get \"\"
How can I get around this?
This is because magic_quotes_gpc is on. This is a bad 'feature', designed to automatically escape incoming data for those developers who can't learn to escape SQL input.
You should disable this as soon as possible.
ini_set('magic_quotes_gpc', 'off');
You should switch off magic_quotes_gpc, which is a broken feature (see Delan's answer, I completely agree).
But wait! You must sanitize the user input from $_REQUEST, $_POST and $_GET and $_COOKIE, if you want to use it for database or display at your page! Otherwise your code would be prone to various types of attacks!
There is nothing like "universal sanitization". Let's call it just quoting, because that's what its all about.
When quoting, you always quote text for some particular output, like:
string value for mysql query
like expression for mysql query
html code
json
mysql regular expression
php regular expression
For each case, you need different quoting, because each usage is present within different syntax context. This also implies that the quoting shouldn't be made at the input into PHP, but at the particular output! Which is the reason why features like magic_quotes_gpc are broken (always assure it is switched off!!!).
So, what methods would one use for quoting in these particular cases? (Feel free to correct me, there might be more modern methods, but these are working for me)
mysql_real_escape_string($str)
mysql_real_escape_string(addcslashes($str, "%_"))
htmlspecialchars($str)
json_encode() - only for utf8! I use my function for iso-8859-2
mysql_real_escape_string(addcslashes($str, '^.[]$()|*+?{}')) - you cannot use preg_quote in this case because backslash would be escaped two times!
preg_quote()
Try stripslashes().
stripslashes() is the opposite of addslashes(), and removes escape slashes from strings.
You can use stripslashes() function.
http://php.net/manual/en/function.stripslashes.php
This behavior is caused by the "Magic Quotes" PHP-Feature. http://php.net/manual/en/security.magicquotes.php
You can use something like this to make it work whether magic quotes are enabled or not:
if (get_magic_quotes_gpc()) {
$data = stripslashes($_POST['data']);
}
else {
$data = $_POST['data'];
}
I always use this method as it grabs the value as a string and therefore there will be no slashes:
$variable = mysql_escape_string($_REQUEST['name_input']);

How do I retrieve "escaped" strings from db?

I'm doing this to all strings before inserting them:
mysql_real_escape_string($_POST['position']);
How do I remove the: \ after retriving them?
So I don't end up with: \"Piza\"
Also is this enough security or should I do something else?
Thanks
I would suggest you call $_POST['position'] directly (don't call mysql_real_escape_string on it) to get the non-escaped version.
Incidentally your comment about security suggests a bit of trouble understanding things.
One way of handling strings is to handle the escaped versions, which leads to one kind of difficulty, while another is to handle another and escape strings just before embedding, which leads to another kind of difficulty. I much prefer the latter.
use stripslashes() to get rid of the escape character.
Escaping is great. In case the value is going to be integer , I would suggest you do it like:
$value = (int) $_POST['some_int_field'];
This would make sure you always end up with an integer value.
It could be because magic quotes are enabled, so to make it versatile, use this:
if (get_magic_quotes_gpc()) { // Check if magic quotes are enabled
$position = stripslashes($_POST['position']);
} else {
$position = mysql_real_escape_string($_POST['position']);
}
mysql_real_escape_string() does add \s in your SQL strings but they should not be making it into the database as they are only there for the purpose of string parsing.
If you are seeing \s in you database then something else is escaping your stings before you call mysql_real_escape_string(). Check to make sure that magic_quotes_gpc isn't turned on.

What do I need to do to santize data from textarea to be fed to mysql database?

Well, the title is my question. Can anybody give me a list of things to do to sanitize my data before entering to mysql database using php, especially if the data contains html tags?
It depends on a lot of things. If you don't want to accept any HTML, that makes it a whole lot easier, run it through strip_tags() first to remove all the HTML from it. After that it's much safer. If you do want to accept some HTML, you can selectively keep some tags from it with the same function, just add in the tags to keep after. eg: strip_tags($string_to_sanitize, '<p><div>'); // Keeps only <p> and <div> tags.
As for inserting into a database, it's always best to sanitize anything before inserting into the database; adopting a "don't trust anybody" mentality will save you a lot of trouble. Preventing against SQL injection is fairly straightforward, this is the method I use:
$q = sprintf("INSERT INTO table_name (string_field, int_field) VALUES ('%s', %d);",
mysql_real_escape_string($values['string']),
mysql_real_escape_string($values['number']));
$result = mysql_query($q, $connection)
Generally once you open the door for allowing HTML in, you'll have a whole deal of things to worry about (there are some great articles on defending from XSS out there). If you want to test for XSS vulnerabilities, try the examples on http://ha.ckers.org/xss.html. There are some they have there that you would probably never even consider, so give it a look!
Also, if you are accepting specific types of input (eg: numbers, emails, boolean values) try using the inbuilt filter_var() function in PHP. They have a bunch of inbuilt types to validate data against (http://www.php.net/manual/en/filter.filters.validate.php), as well as a number of filters to sanitize your data (http://www.php.net/manual/en/filter.filters.sanitize.php).
Generally, accepting any input is like opening a Pandora's Box, and while you'll probably never be able to block 100% of the weaknesses (people are always looking to find a way in), you can block the common ones to save you headaches.
Finally remember to sanitize ALL external data. Just because you make a dropdown input doesn't mean some shady person can't send their own data instead!
Use mysql_real_escape_string();
mysql_query("INSERT INTO table(col) VALUES('".mysql_real_escape_string($_POST['data']."')");
You should use prepared statements when inserting data into the database, not any sort of escaping. (PHP manual: prepared statements in pdo and mysqli.)
Sanitization for HTML output should, as mentioned by others, happen when you go to take data out of the database and merge it into a page, not before.
Turn off register_globals and magic_quotes, use mysql_real_escape_string on any string coming from the user before placing it into your query.
Of course mysql_real_escape_string
When dealing with any kind of input start from the I won't allow anything stand point and whitelist only that deemed to be acceptable.
On insert you need to make sure that the data is MySQL-escaped. For this, use mysql_real_escape_string.
Before showing the data you will need to strip out unsafe HTML and/or JavaScript code. Many people choose to store the sanitised version in the database. Other prefer to strip the ugly HTML from the string before rendering.
You do this in PHP with some filtering. an example is the Drupal filter_xss function:
function filter_xss($string, $allowed_tags = array('a', 'em', 'strong', 'cite', 'code', 'ul', 'ol', 'li', 'dl', 'dt', 'dd')) {
// Only operate on valid UTF-8 strings. This is necessary to prevent cross
// site scripting issues on Internet Explorer 6.
if (!drupal_validate_utf8($string)) {
return '';
}
// Store the input format
_filter_xss_split($allowed_tags, TRUE);
// Remove NUL characters (ignored by some browsers)
$string = str_replace(chr(0), '', $string);
// Remove Netscape 4 JS entities
$string = preg_replace('%&\s*\{[^}]*(\}\s*;?|$)%', '', $string);
// Defuse all HTML entities
$string = str_replace('&', '&', $string);
// Change back only well-formed entities in our whitelist
// Decimal numeric entities
$string = preg_replace('/&#([0-9]+;)/', '&#\1', $string);
// Hexadecimal numeric entities
$string = preg_replace('/&#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\1', $string);
// Named entities
$string = preg_replace('/&([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
return preg_replace_callback('%
(
<(?=[^a-zA-Z!/]) # a lone <
| # or
<!--.*?--> # a comment
| # or
<[^>]*(>|$) # a string that starts with a <, up until the > or the end of the string
| # or
> # just a >
)%x', '_filter_xss_split', $string);
}
well, there is not too much to do while we're talking of inserting data from textarea to mysql database.
For the strings placed into query, Mysql requirements are not so complicated.
Only 2 rules to follow:
inserted data should be surrounded by quotes.
some special character in the data should be escaped.
Note that this operation has nothing to do with security. It's syntax requirements.
Assuming you're adding quotes already, the only thing you have to add is escaping. Depends on your encoding, you can use addslashes or mysql_escape_string or mysql_real_escape_string functions.
However, other parts of query require more attention. If you're curious, refer to my earlier answer with complete guide: In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?
HTML tags has nothing to do with database and require no special attention.
However, for displaying data from untrusted source, some precautions should be taken. It was described in this topic already, only I have to add is you can't trust to strip_tags when used with second parameter.
You can use mysql_real_escape_string, you can also use htmlentities with addslashes... or you can use all 3 together also...

PHP 'addslashes' not behaving as expected

This has been driving be crazy, but I can't seem to find an answer. We run a technical knowledge base that will sometimes include Windows samba paths for mapping to network drives.
For example: \\servername\sharename
When we include paths that have two backslashes followed by each other, they are not escaped properly when running 'addslashes'. My expected results would be "\\\\servername\\sharename", however it returns "\\servername\\sharename". Obviously, when running 'stripslashes' later on, the double backslash prefix is only a single slash. I've also tried using a str_replace("\\", "\", $variable); however it returns "\servername\sharename" when I would expect "\\servername\sharename".
So with addslashes, it ignores the first set of double-backslashes and with str_replace it changes the double-backslashes into a single, encoded backslash.
We need to run addslashes and stripslashes for database insertion; using pg_escape_string won't work in our specific case.
This is running on PHP 5.3.1 on Apache.
EDIT: Example Code
$variable = 'In the box labeled Folder type: \\servername\sharename';
echo addslashes($variable);
This returns: In the box labeled Folder type: \\servername\\sharename
EDIT: Example Code #2
$variable = 'In the box labeled Folder type: \\servername\sharename';
echo str_replace('\\', '\', $variable);
This returns: In the box labeled Folder type: \servername\sharename
I'd also like to state that using a single quotes or double-quotes does not give me different results (as you would expect). Using either or both give me the same exact results.
Does anyone have any suggestions on what I can possibly do?
I think I know where is a problem. Just try to run this one:
echo addslashes('\\servername\sharename');
And this one
echo addslashes('\\\\servername\sharename');
PHP escapes double slashes even with single quotes, because it is used to escape single quote.
Ran a test on the problem you described, and the only way I could get the behavior you desired was to couple a conditional with a regex and anticipate the double slashes at the start.
$str = '\\servername\sharename';
if(substr($str,0,1) == '\\'){
//String starts with double backslashes, let's append an escape one.
//Exclaimation used for demonstration purposes.
$str = '\\'.$str;
echo addslashes(preg_replace('#\\\\\\\\#', '!',$str ));
}
This outputs:
!servername\\sharename
While this may not be an outright answer, it does work and illustrates a difference in how the escape character is treated by these two constructs. If used, the ! could easily be replaced with the desired characters using another regex.
This is not a problem with addslashes, it is a problem with the way you are assigning the string to your variable.
$variable = 'In the box labeled Folder type: \\servername\sharename';
echo $variable;
This returns: In the box labeled Folder type: \servername\sharename
This is because the double backslash is interpreted as an escaped backslash. Use this assignment instead.
$variable = 'In the box labeled Folder type: \\\\servername\\sharename';
I've determined, with more testing, that it indeed is with how PHP is handling hard-coded strings. Since hard-coded strings are not what I'm interested in (I was just using them for testing/this example), I created a form with a single text box and a submit button. addslashes would correctly escape the POST'ed data this way.
Doing even more research, I determined that the issue I was experiencing was with how PostgreSQL accepts escaped data. Upon inserting data into a PostgreSQL database, it will remove any escape characters it is given when it actually places the data in the table. Therefore, stripslashes is not required to remove escape characters when pulling the data back out.
This problem stemmed from code migration from PHP 4.1 (with Magic Quotes on) to PHP 5.3 (with Magic Quotes deprecated). In the existing system (PHP4), I don't think we were aware that Magic Quotes were on. Therefore, all POST data was being escaped already and then we were escaping that data again with addslashes before inserting. When it got inserted into PostgreSQL, it would strip one set of slashes and leave the other, therefore requiring us to stripslashes on the way out. Now, with Magic Quotes off, we escape with addslashes but are not required to use stripslashes on the way out.
It was very hard to organize and determine exactly where the problem lay, so I know this answer is a little off to my original question. I do, however, thank everyone who contributed. Having other people sound off on their ideas always helps to make you think on avenues you may not have on your own.

Categories