Can someone exploit my unserialize code below? - PHP

According to the PHP.net manual on unserialize ( http://php.net/manual/en/function.unserialize.php ) and a few Google searches, unserialize code can be exploited.
I don't know much about how attackers exploit unserialize. I am worried because I am unserializing data that originally comes from external user input.
Below is my code. I want to know if it can be exploited:
<?php
if(filter_var($_GET['url'], FILTER_VALIDATE_URL)) {
// $_GET['url'] = 'http://example.com/page/1.html'
$html = file_get_contents($_GET['url']);
$doc = new DOMDocument();
$encode = mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");
libxml_use_internal_errors(true);
$doc->loadHTML($encode);
libxml_clear_errors();
$nodes = $doc->getElementsByTagName('title');
$title = strtolower($nodes->item(0)->nodeValue);
// storing in mysqli database
// hiding mysql code..
$serialize = serialize(['icon' => 'check', 'data' => $title]);
// fetching from mysqli database.
// hiding other mysql code..
$row = $fetch->fetch_assoc();
$unserialize = unserialize($row['title']);
}
Can a hacker craft a "malicious title" tag and provide his URL to exploit my unserialize code?
Update: I am using PDO for MySQL, so that is not the problem. My concern is unserializing data that comes from the title tag of an external website, over which I have no control.

If an attacker is somehow able to get a specifically crafted string directly into your database, perhaps through a secondary, unrelated vulnerability, then unserializing that string when you fetch it from the database can lead to an object/code injection.
You can only trust the data you're unserialising if you're sure you have also serialised it yourself before. Now, how can you be sure that the data you fetch from the database has been serialised by a trusted party before? This depends on the entire rest of the system offering no exploitable way for an attacker to bypass serialisation. Essentially, you cannot be sure who has serialised the data that's in your database. Tiny missing sanity check over here plus harmless looking code over there can in combination lead to a vulnerability.
The better solution is to use an exploit-free data format like JSON. It purely describes data, and there's no chance to inject anything code-related into it.
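As a rough sketch of the JSON approach, the array from the question could be stored and read back as shown here. The helper names encode_row and decode_row are made up for illustration, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helpers; only json_encode() and json_decode() are real PHP functions.
function encode_row(string $title): string
{
    return json_encode(['icon' => 'check', 'data' => $title], JSON_THROW_ON_ERROR);
}

function decode_row(string $stored): array
{
    // Passing true as the second argument yields plain arrays,
    // so decoding never instantiates a class or runs magic methods.
    return json_decode($stored, true, 512, JSON_THROW_ON_ERROR);
}
```

Even if a title looks like a serialized PHP object, it comes back as an ordinary string.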

Lots of very good comments, I'm surprised nobody made an answer already.
Can a hacker craft a "malicious title" tag and provide his URL to exploit my unserialize code?
No.
I'm assuming you are referring to the PHP Object Injection vulnerability. This is about directly unserializing untrusted input. In this case you are unserializing something you serialized yourself, so you're good. Here is how it'll play out:
an attacker crafts a malicious title
you serialize it
you unserialize it - at this point it is still the same malicious title as a string. You would need to unserialize it a second time so it becomes an object and your code is at risk.
I would recommend using JSON anyway, as it is more standard (easier to use from another language for example).
Now the real question would be:
can a hacker craft a malicious title tag to exploit something (not necessarily unserialize)
The answer is yes and that is why you always need to protect against XSS and SQL Injection when using the title provided by the user.
That being said, the real vulnerability in my opinion is in the way you validate the URL. filter_var($_GET['url'], FILTER_VALIDATE_URL) will happily validate a URL such as file:///var/www/index.html. So your code allows the user to read any file on your server.
It's also true for other protocols such as ftp://, ssh://, and so on. It's completely open to abuse.
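One possible way to close that hole is to check the scheme explicitly after FILTER_VALIDATE_URL. This is only a sketch: the function name is_allowed_remote_url is hypothetical, and the list of allowed schemes is an assumption about what the script needs.

```php
<?php
// Hypothetical helper: accept only absolute http(s) URLs before fetching them.
function is_allowed_remote_url(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // parse_url() returns the scheme as a string, or null if there is none.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme) && in_array(strtolower($scheme), ['http', 'https'], true);
}
```

Note that this still lets the server request internal hosts over http (for example http://localhost/), so it narrows the problem rather than removing it.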

Related

function to provide some extra security in php with query_string

Some years ago I started including the following code at the top of my pages. I read that it was good practice and used it. But I am wondering, is it actually helpful?
$page = "index.php";
$cracktrack = $_SERVER['QUERY_STRING'];
$wormprotector = array('chr(', 'chr=', 'chr%20', '%20chr', 'wget%20', '%20wget', 'wget(',
'cmd=', '%20cmd', 'cmd%20', 'rush=', '%20rush', 'rush%20',
'union%20', '%20union', 'union(', 'union=', 'echr(', '%20echr', 'echr%20', 'echr=',
'esystem(', 'esystem%20', 'cp%20', '%20cp', 'cp(', 'mdir%20', '%20mdir', 'mdir(',
'mcd%20', 'mrd%20', 'rm%20', '%20mcd', '%20mrd', '%20rm',
'mcd(', 'mrd(', 'rm(', 'mcd=', 'mrd=', 'mv%20', 'rmdir%20', 'mv(', 'rmdir(',
'chmod(', 'chmod%20', '%20chmod', 'chmod(', 'chmod=', 'chown%20', 'chgrp%20', 'chown(', 'chgrp(',
'locate%20', 'grep%20', 'locate(', 'grep(', 'diff%20', 'kill%20', 'kill(', 'killall',
'passwd%20', '%20passwd', 'passwd(', 'telnet%20', 'vi(', 'vi%20',
'insert%20into', 'select%20', 'nigga(', '%20nigga', 'nigga%20', 'fopen', 'fwrite', '%20like', 'like%20',
'$_request', '$_get', '$request', '$get', '.system', 'HTTP_PHP', '&aim', '%20getenv', 'getenv%20',
'new_password', '&icq','/etc/password','/etc/shadow', '/etc/groups', '/etc/gshadow',
'HTTP_USER_AGENT', 'HTTP_HOST', '/bin/ps', 'wget%20', 'unamex20-a', '/usr/bin/id',
'/bin/echo', '/bin/kill', '/bin/', '/chgrp', '/chown', '/usr/bin', 'g++', 'bin/python',
'bin/tclsh', 'bin/nasm', 'perl%20', 'traceroute%20', 'ping%20', '.pl', '/usr/X11R6/bin/xterm', 'lsof%20',
'/bin/mail', '.conf', 'motd%20', 'HTTP/1.', '.inc.php', 'config.php', 'cgi-', '.eml',
'file://', 'window.open', '<SCRIPT>', 'javascript://','img src', 'img%20src','.jsp','ftp.exe',
'xp_enumdsn', 'xp_availablemedia', 'xp_filelist', 'xp_cmdshell', 'nc.exe', '.htpasswd',
'servlet', '/etc/passwd', 'wwwacl', '~root', '~ftp', '.js', '.jsp', 'admin_', '.history',
'bash_history', '.bash_history', '~nobody', 'server-info', 'server-status', 'reboot%20', 'halt%20',
'powerdown%20', '/home/ftp', '/home/www', 'secure_site, ok', 'chunked', 'org.apache', '/servlet/con',
'<script', '/robot.txt' ,'/perl' ,'mod_gzip_status', 'db_mysql.inc', '.inc', 'select%20from',
'select from', 'drop%20', '.system', 'getenv', 'http_', '_php', 'php_', 'phpinfo()', '<?php', '?>', 'sql=');
$checkworm = str_replace($wormprotector, '*', $cracktrack);
if ($cracktrack != $checkworm){
$cremotead = $_SERVER['REMOTE_ADDR'];
$cuseragent = $_SERVER['HTTP_USER_AGENT'];
header("location:$page");
die();
}
In general, I personally wouldn't use this strategy. I'd rather sanitize each and every input. If a user passes .bash_history in the URL I don't care because it's never going to do anything in my script.
I could maybe see something like this being useful if you had some third-party low reliability script that was available for anyone to hit. Even in that scenario though it seems like a semi-reliable band-aid at best.
For applications you write however, this should hopefully be unnecessary.
Although it's great that you're concerned about security, and you're following the principle of treating all input with suspicion, I don't think that list is terribly useful.
It's a rather arbitrary selection of potentially unwanted strings/commands/tags/folder names and other things. It's likely to get out of date over time, and probably is already. Having a generic list like this is never going to catch everything, and may also lend a false sense of security that your application is secure when really it's not.
As another answer has already mentioned, you want to be checking each input you get from your application (whether via query string variables, POST variables or wherever) and validating that it meets your expectations (e.g. if you're expecting a numeric value, is the value passed in numeric?).
Then if you plan to redisplay or re-use that data, you might want to sanitise it further, and strip out things that might potentially be dangerous in the context where it will be used. For example, you might strip out "script" tags if you're going to display the data on a web page.
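Checking one input against what it is expected to be might look roughly like this. The function name validated_page and the 1 to 1000 range are assumptions chosen for the example, not part of the original code.

```php
<?php
// Hypothetical helper: returns the page number as an int, or null if the input is not acceptable.
function validated_page(string $raw): ?int
{
    $page = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 1000],
    ]);
    // filter_var() returns false when validation fails.
    return $page === false ? null : $page;
}
```

With a check like this, a value such as .bash_history is rejected because it is not a number in range, without needing a list of bad strings.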
If you sanitize all user input properly, there's absolutely no need to use a script like this.
Besides that, it's also case sensitive (str_replace vs str_ireplace) which means that I can easily bypass it by making use of a mix of uppercase and lowercase letters. It also only checks the query string, useless against POST requests.
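The case-sensitivity problem can be seen with a cut-down version of the check. The three-entry list below is a small excerpt used only for illustration, and passes_blacklist is a hypothetical wrapper around the same str_replace comparison.

```php
<?php
// Excerpt of the blacklist from the question (illustrative subset only).
$blacklist = ['union%20', 'select%20', '<script'];

// Hypothetical wrapper: true means the query string was NOT flagged.
function passes_blacklist(string $query, array $blacklist): bool
{
    // Same logic as the original: if nothing was replaced, nothing matched.
    return str_replace($blacklist, '*', $query) === $query;
}
```

The lowercase payload is flagged, but the same payload in uppercase passes straight through.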

What output should be sanitized?

Security is a big concern for me. The last version of my site had little to no security and we experienced a lot of issues, so this time around I am looking to have things as secure as possible.
All user input goes through a function that strips tags (allowing <p> and <b>) and trims the string. It is then uploaded to the site using a PDO prepared statement.
I am now going back to output everything and I am wondering what exactly needs to be sanitized. I have a lot of queries that fetch integers (epoch times, IDs, etc.). Should they be sanitized to avoid any issues?
I also have specific sections that use <p> and <b> tags from user input, like "about me" on the user profile. Is my output_paragraph function incorrect?
// Output Sanitation
function output($input) {
$output = htmlspecialchars($input);
return $output;
}
// Paragraph Output Sanitation
function output_paragraph($input) {
$output = htmlspecialchars($input);
$output = htmlspecialchars_decode($output);
return $output;
}
Generally speaking, output should be sanitized when it can't be trusted. In 99% of all cases, this means user input. So raw echoing user input, for instance, is an invitation for someone to use your site to load, say, an attack file and trick some user into clicking a link to your site that loads said data.
htmlspecialchars is a good start, but you need to make sure you're not trusting user entered data. That means you define what you expect, and then match it to what you can output. Maybe you need strip_tags too. Maybe you need a regex. There's no real "one size fits all" answer to security.
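For the paragraph case in the question, one way to "define what you expect" is to escape everything and then re-enable only bare <p> and <b> tags. This is a sketch under that assumption; the name output_paragraph_sketch is made up, and it deliberately does not allow attributes on the tags.

```php
<?php
// Hypothetical replacement for output_paragraph(): escape first, then whitelist.
function output_paragraph_sketch(string $input): string
{
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
    // Turn only the escaped forms of <p>, </p>, <b>, </b> back into real tags.
    return preg_replace('#&lt;(/?)(p|b)&gt;#i', '<$1$2>', $escaped);
}
```

Anything else, including a tag such as <p onclick="...">, stays escaped and is shown as text.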

Securing against XSS - user provides part of url

I see why the code below is bad and why htmlspecialchars must be used to prevent some XSS vulnerabilities:
<?php $url = '<plaintext>';
echo $url;?>
like so:
<?php $url = '<plaintext>';
$url = htmlspecialchars($url, ENT_QUOTES, "UTF-8");
echo $url;?>
In my database I store only the filename, which is user-provided. (I'm sure this will change as I learn more about this subject.)
What I'm wondering, though, is whether the code below actually does anything to protect against XSS. Is it less of a vulnerability compared to the previous case?
I've tried injecting script tags with and without htmlspecialchars
and it seems to do nothing in either case. The script code won't execute.
Is it secure? Is htmlspecialchars the right tool for the job? How can I make it better?
$sql['image'] is fetched from my database, and below is the code that displays the image.
<?php $url = "/images/" . $sql['image'] . ".jpg";
$url = htmlspecialchars($url, ENT_QUOTES, "UTF-8");?>
<img src="<?php echo $url;?>">
outputs:
<img src="/images/test.jpg">
In principle you can't trust any user input, ever. If $sql['image'] will directly or indirectly be provided by users, then it won't matter if you add constants to the beginning and end of that string. Either way you'll have to rely on htmlspecialchars() not containing any bugs that would allow scripting to be injected.
To actually increase security in this case, you'd have to take somewhat more drastic measures. One popular method for that would be to simply assign file names yourself, for example by hashing on the contents of the file and using that hash instead of the original file name to store the file. md5() and sha1() tend to come in handy for that.
Also, assuming the users provide the images whose file names you're storing, you'd have to make sure those can't be used to get the job done, either. For example, the users might upload an SVG with an embedded script instead of a JPEG, thus potentially completely avoiding any validation or mangling on the file name itself.
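The hashing idea mentioned above could be sketched like this. The function name stored_image_name is hypothetical, and sha1() is used only because the answer names it; the point is that the stored name no longer contains anything the user typed.

```php
<?php
// Hypothetical helper: derive the stored file name from the file contents,
// ignoring the user-supplied name completely.
function stored_image_name(string $contents): string
{
    // sha1() returns 40 lowercase hex characters, so the result is always [0-9a-f]{40}.jpg
    return sha1($contents) . '.jpg';
}
```

This only takes care of the file name. Whether the uploaded bytes really are a JPEG still has to be checked separately.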

The URL Security

I have written the function below to secure URLs. I would like to know whether there is anything I need to reconsider or change in it. I wrote it after reading quite a few articles on security from various sources.
Here is the function:
// filters possibly malicious stuff from URLs
private function filter_url($url)
{
if (is_array($url))
{
foreach($url as $key => $value)
{
// recursion
$url[$key] = filter_url($value);
}
return $url;
}
else
{
// Allow only one ? in URLs
$total_question_marks = substr_count($url, '?');
if ($total_question_marks >= 2)
{
exit('You can not use 2 question marks (?) in URLs for security reasons!!');
}
// decode URLs
$url = rawurldecode($url);
$url = urldecode($url);
// remove bad stuff
$url = str_replace('../', '', $url);
$url = str_replace('..\\', '', $url);
$url = str_replace('..%5C', '', $url);
$url = str_replace('%00', '', $url);
$url = str_ireplace('http', '', $url);
$url = str_ireplace('https', '', $url);
$url = str_ireplace('ftp', '', $url);
$url = str_ireplace('smb', '', $url);
$url = str_replace('://', '', $url);
$url = str_replace(':\\\\', '', $url);
$url = str_replace(array('<', '>'), array('&lt;', '&gt;'), $url);
// Allow only a-zA-Z0-9_/.-?=&
$url = preg_replace("/[^a-zA-Z0-9_\-\/\.\?=&]+/", "", $url);
//print $url;
return $url;
}
}
I can use this function simply like this:
$_GET = filter_url($_GET);
Or even like this:
$_SERVER['QUERY_STRING'] = filter_url($_SERVER['QUERY_STRING']);
Any attempt to create some sort of catch-all filter like this will always fail. In addition, since you always "corrupt" the data, you will be in trouble when you really need to accept a piece of data with a "disallowed" character or character sequence.
You really need to read around the subject of web security a bit and fully understand common attacks such as (a minimum of) cross-site scripting, cross-site request forgery and sql injection.
You need to take a two-pronged approach to using user-provided data in a safe way.
Think of the process like this:
Validate and reject data on the way in.
Encode data on the way out.
Input Validation
Check each piece of input to ensure it only contains the right kind of data and fits within length and range boundaries, ACCORDING TO THE MEANING OF EACH INDIVIDUAL PIECE OF DATA. That is: ensure numbers only contain numeric digits, ensure years are within a sensible range, ensure strings are not overlong or blank, ensure filenames don't traverse directories, ensure IDs only contain legal characters, and so on. The most important thing here is, wherever possible, to state what is allowed; DO NOT STATE WHAT IS DISALLOWED. Testing for what is allowed and rejecting everything else is known as whitelisting and is a good thing, as you know you will get clean data (or as near to it as is sensible). Looking for bad patterns and rejecting them is known as blacklisting and is a less safe idea. For blacklisting to succeed, you need to ensure that your blacklist is complete, and often this is basically an impossible task. In some limited contexts a blacklist approach can be useful, but only when you are as certain as you can be that the list is exhaustive.
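A couple of whitelist-style checks might look something like this. The function names, the 32-character limit and the 1900 to 2100 range are all assumptions picked for the example; the real limits depend on what each value means in your application.

```php
<?php
// Hypothetical validators: each one states what IS allowed and rejects everything else.
function is_valid_id(string $id): bool
{
    // 1 to 32 characters from a fixed alphabet; \A and \z anchor the whole string.
    return preg_match('/\A[A-Za-z0-9_-]{1,32}\z/', $id) === 1;
}

function is_valid_year(string $year): bool
{
    // Digits only, then a range check on the numeric value.
    return ctype_digit($year) && (int) $year >= 1900 && (int) $year <= 2100;
}
```

Note that neither function modifies its input. The data is either accepted as it is or rejected.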
Storing the data
Once you have only accepted clean data, save it in your variable or session. Maybe take an approach to variable-naming that indicates this data is now clean. The most important thing here, is that when we save this data, we have not yet changed it. This means that the data can be used in any context without us having "thrown anything away"
Output encoding
When you send the data to an external system - such as a filename, cookie, web page, file or saving in your own database - you must encode the data to ensure you do not break the language syntax or file format used in the output. It is here where the data can be transformed.
The transformation you need to perform on the data will differ according to how and where you use the data. Transforming a string for use in a Windows filename is different from doing so for a Linux filename, and different again if it is being inserted into a PDF document, web page, database, etc. However, let's look at two examples in more detail:
Output to HTML
In the most common case of inserting a string into some HTML, you need to ensure that you do not allow the user to inject arbitrary content into the page. Not only does this allow them to edit the page that another user can see, but they can also inject code in the form of JavaScript that could do anything they want. This script will run as the user that views the page and could allow the attacker to steal their information and login credentials. This is called Cross-Site Scripting. The syntax rules of HTML and JavaScript mean that you need to encode differently depending on where in the HTML you insert the user data. There is a very valuable page at http://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet that explains how to transform the data for the 6 different categories of places where you can insert a string into an HTML page.
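To make "encode differently depending on where" a little more concrete, here is a sketch with one helper per context. The helper names are hypothetical and only three contexts are covered; the cheat sheet linked above remains the fuller reference.

```php
<?php
// Hypothetical helpers, one per output context.

// Text between tags or inside a quoted attribute value.
function html_text(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// A single path segment or query-string value inside a URL.
function url_component(string $s): string
{
    return rawurlencode($s);
}

// A string literal inside an inline <script> block; the JSON_HEX_* flags
// keep characters such as < and > out of the generated JavaScript.
function js_literal(string $s): string
{
    return json_encode($s, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR);
}
```

The same stored value goes through a different helper depending on where it ends up, which is why it should be stored unmodified.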
Output to database
If you save user-data to the database, you effectively have to include the string in an SQL statement. You must ensure that you do not allow the user to change the meaning of the SQL statement and only be able to change the data values. If they can change the meaning of the statement, this is called SQL injection.
This is a special case: although you can solve the problem using output encoding, you are better off using a technique called "bound parameters". This ensures that your data is always used as data and never as code when talking to the database. PHP supports bound parameters in a number of database libraries, including "PDO" (cross-database) and "Mysqli" (MySQL). It should be noted that the old "Mysql" library does not support bound parameters.
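A minimal sketch of bound parameters with PDO follows. The in-memory SQLite database, the pages table and the two function names are assumptions made so the example is self-contained (it needs the pdo_sqlite extension); with MySQL only the connection string would differ.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)');

// Hypothetical helper: the SQL text is fixed, the value travels separately.
function save_title(PDO $pdo, string $title): void
{
    $stmt = $pdo->prepare('INSERT INTO pages (title) VALUES (:title)');
    $stmt->execute([':title' => $title]);
}

// Hypothetical helper: look up rows by exact title, again via a bound value.
function find_titles(PDO $pdo, string $title): array
{
    $stmt = $pdo->prepare('SELECT title FROM pages WHERE title = :title');
    $stmt->execute([':title' => $title]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

A title containing quotes or SQL keywords is stored and returned as plain data, because it is never spliced into the statement text.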
There is much more information all over the net and on StackOverflow about cross-site scripting (XSS) and SQL injection (SQLi), and it is well worth reading around the subject. There are of course many other types of attack, but if you follow the process above you should minimise your risks. It is not unreasonable for data-validation and encoding routines to make up a significant part of a secure web application. But you have to build the security methodology into your standard working process. Adding it as an afterthought is much more difficult. From time to time, look back at your validation code for a particular function and think about whether you could add some more rules. There's always going to be something you miss first time round.
Filters relying on blacklists fail. Check PHPIDS if you are serious about detecting attack patterns.

What are the best practices for avoiding xss attacks in a PHP site [closed]

I have PHP configured so that magic quotes are on and register globals are off.
I do my best to always call htmlentities() for anything I am outputting that is derived from user input.
I also occasionally search my database for common things used in XSS attacks, such as...
<script
What else should I be doing, and how can I make sure that the things I am trying to do are always done?
Escaping input is not the best you can do for successful XSS prevention. Output must be escaped as well. If you use the Smarty template engine, you can use the |escape:'htmlall' modifier to convert all sensitive characters to HTML entities (I use my own |e modifier, which is an alias for the above).
My approach to input/output security is:
store user input not modified (no HTML escaping on input, only DB-aware escaping done via PDO prepared statements)
escape on output, depending on what output format you use (e.g. HTML and JSON need different escaping rules)
I'm of the opinion that one shouldn't escape anything during input, only on output, since (most of the time) you cannot assume that you know where that data is going. For example, if you have a form that takes data that later appears in an email that you send out, you need different escaping (otherwise a malicious user could rewrite your email headers).
In other words, you can only escape at the very last moment the data is "leaving" your application:
Write to XML file, escape for XML
Write to DB, escape (for that particular DBMS)
Write email, escape for emails
etc
In short:
You don't know where your data is going
Data might actually end up in more than one place, needing different escaping mechanisms, BUT NOT BOTH
Data escaped for the wrong target is really not nice. (E.g. get an email with the subject "Go to Tommy\'s bar".)
Esp #3 will occur if you escape data at the input layer (or you need to de-escape it again, etc).
PS: I'll second the advice for not using magic_quotes, those are pure evil!
There are a lot of ways to do XSS (See http://ha.ckers.org/xss.html) and it's very hard to catch.
I personally delegate this to the current framework I'm using (Code Igniter for example). While not perfect, it might catch more than my hand made routines ever do.
This is a great question.
First, don't escape text on input except to make it safe for storage (such as being put into a database). The reason for this is you want to keep what was input so you can contextually present it in different ways and places. Making changes here can compromise your later presentation.
When you go to present your data, filter out what shouldn't be there. For example, if there isn't a reason for JavaScript to be there, search for it and remove it. An easy way to do that is to use the strip_tags function and only present the HTML tags you are allowing.
Next, take what you have and pass it through htmlentities or htmlspecialchars to convert special characters to HTML entities. Do this based on context and what you want to get out.
I'd also suggest turning off Magic Quotes. It was removed in PHP 5.4 and using it is considered bad practice. Details at http://us3.php.net/magic_quotes
For more details check out http://ha.ckers.org/xss.html
This isn't a complete answer but, hopefully enough to help you get started.
rikh Writes:
I do my best to always call htmlentities() for anything I am outputting that is derived from user input.
See Joel's essay on Making Code Look Wrong for help with this
Template library. Or at least, that is what template libraries should do.
To prevent XSS all output should be encoded. This is not the task of the main application / control logic, it should solely be handled by the output methods.
If you sprinkle htmlentities() throughout your code, the overall design is wrong. And as you suggest, you might miss one or two spots.
That's why the only solution is rigorous html encoding -> when output vars get written into a html/xml stream.
Unfortunately, most php template libraries only add their own template syntax, but don't concern themselves with output encoding, or localization, or html validation, or anything important. Maybe someone else knows a proper template library for php?
I rely on PHPTAL for that.
Unlike Smarty and plain PHP, it escapes all output by default. This is a big win for security, because your site won't become vulnerable if you forget htmlspecialchars() or |escape somewhere.
XSS is HTML-specific attack, so HTML output is the right place to prevent it. You should not try pre-filtering data in the database, because you could need to output data to another medium which doesn't accept HTML, but has its own risks.
Escaping all user input is enough for most sites. Also make sure that session IDs don't end up in the URL so they can't be stolen from the Referer link to another site. Additionally, if you allow your users to submit links, make sure no javascript: protocol links are allowed; these would execute a script as soon as the user clicks on the link.
If you are concerned about XSS attacks, encoding your output strings to HTML is the solution. If you remember to encode every single output character to HTML format, there is no way to execute a successful XSS attack.
Read more:
Sanitizing user data: How and where to do it
Personally, I would disable magic_quotes. In PHP5+ it is disabled by default, and it is better to code as if it is not there at all, as it does not escape everything and it was removed entirely in PHP 5.4.
Next, what you do depends on the type of user data you are filtering. If it is just text, e.g. a name, then run it through strip_tags(trim(stripslashes($input))); to check for particular formats, use regular expressions.
If you expect a certain set of values, create an array of the valid values and only allow those values through (in_array($userData, array(...))).
If you are checking numbers, use is_numeric, or ctype_digit to enforce whole numbers, or cast to a specific type. That should prevent people trying to send strings instead.
If you have PHP5.2+ then consider looking at the filter extension (filter_var(), filter_input()), which can filter various data types including email addresses. Documentation is not particularly good, but is improving.
If you have to handle HTML then you should consider something like PHP Input Filter or HTML Purifier. HTML Purifier will also validate HTML for conformance. I am not sure if Input Filter is still being developed. Both will allow you to define a set of tags that can be used and what attributes are allowed.
Whatever you decide upon, always remember, never ever trust anything coming into your PHP script from a user (including yourself!).
All of these answers are great, but fundamentally, the solution to XSS will be to stop generating HTML documents by string manipulation.
Filtering input is always a good idea for any application.
Escaping your output using htmlentities() and friends should work as long as it's used properly, but this is the HTML equivalent of creating a SQL query by concatenating strings with mysql_real_escape_string($var) - it should work, but fewer things can validate your work, so to speak, compared to an approach like using parameterized queries.
The long-term solution should be for applications to construct the page internally, perhaps using a standard interface like the DOM, and then to use a library (like libxml) to handle the serialization to XHTML/HTML/etc. Of course, we're a long ways away from that being popular and fast enough, but in the meantime we have to build our HTML documents via string operations, and that's inherently more risky.
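PHP's bundled DOM extension already allows a small version of this. The sketch below builds one paragraph as a tree and lets the serializer do the escaping; the function name render_comment and the "comment" class are made up for the example.

```php
<?php
// Hypothetical helper: build the fragment as a tree; the serializer,
// not our own string handling, is responsible for escaping the text.
function render_comment(string $userText): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $p = $doc->createElement('p');
    $p->setAttribute('class', 'comment');
    // The user text becomes a text node, so it can never turn into markup.
    $p->appendChild($doc->createTextNode($userText));
    $doc->appendChild($p);
    return $doc->saveHTML($p);
}
```

Because no HTML is assembled by concatenation, there is no spot where an escaping call could be forgotten.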
“Magic quotes” is a palliative remedy for some of the worst XSS flaws which works by escaping everything on input, something that's wrong by design. The only case where one would want to use it is when you absolutely must use an existing PHP application known to be written carelessly with regard to XSS. (In this case you're in serious trouble even with “magic quotes”.) When developing your own application, you should disable “magic quotes” and follow XSS-safe practices instead.
XSS, a cross-site scripting vulnerability, occurs when an application includes strings from external sources (user input, fetched from other websites, etc) in its [X]HTML, CSS, ECMAscript or other browser-parsed output without proper escaping, hoping that special characters like less-than (in [X]HTML), single or double quotes (ECMAscript) will never appear. The proper solution to it is to always escape strings according to the rules of the output language: using entities in [X]HTML, backslashes in ECMAscript etc.
Because it can be hard to keep track of what is untrusted and has to be escaped, it's a good idea to always escape everything that is a “text string” as opposed to “text with markup” in a language like HTML. Some programming environments make it easier by introducing several incompatible string types: “string” (normal text), “HTML string” (HTML markup) and so on. That way, a direct implicit conversion from “string” to “HTML string” would be impossible, and the only way a string could become HTML markup is by passing it through an escaping function.
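PHP has no built-in "HTML string" type, but the idea can be approximated with a small class. Everything in this sketch (the HtmlString class, fromText and paragraph) is hypothetical, and typed properties assume PHP 7.4 or later.

```php
<?php
// Hypothetical value type: holding an HtmlString means "already safe to emit as HTML".
final class HtmlString
{
    private string $html;

    // Private constructor: callers cannot wrap raw markup directly.
    private function __construct(string $html)
    {
        $this->html = $html;
    }

    // The only public way in: plain text is escaped on conversion.
    public static function fromText(string $text): self
    {
        return new self(htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8'));
    }

    public function __toString(): string
    {
        return $this->html;
    }
}

// Output code accepts HtmlString only, so a raw string is rejected by the type check.
function paragraph(HtmlString $body): string
{
    return '<p>' . $body . '</p>';
}
```

Passing a plain string to paragraph() raises a TypeError, which is the "impossible implicit conversion" described above.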
“Register globals”, though disabling it is definitely a good idea, deals with a problem entirely different from XSS.
I find that using this function helps to strip out a lot of possible xss attacks:
<?php
function h($string, $esc_type = 'htmlall')
{
switch ($esc_type) {
case 'css':
$string = str_replace(array('<', '>', '\\'), array('&lt;', '&gt;', '&#47;'), $string);
// get rid of various versions of javascript
$string = preg_replace(
'/j\s*[\\\]*\s*a\s*[\\\]*\s*v\s*[\\\]*\s*a\s*[\\\]*\s*s\s*[\\\]*\s*c\s*[\\\]*\s*r\s*[\\\]*\s*i\s*[\\\]*\s*p\s*[\\\]*\s*t\s*[\\\]*\s*:/i',
'blocked', $string);
$string = preg_replace(
'/@\s*[\\\]*\s*i\s*[\\\]*\s*m\s*[\\\]*\s*p\s*[\\\]*\s*o\s*[\\\]*\s*r\s*[\\\]*\s*t/i',
'blocked', $string);
$string = preg_replace(
'/e\s*[\\\]*\s*x\s*[\\\]*\s*p\s*[\\\]*\s*r\s*[\\\]*\s*e\s*[\\\]*\s*s\s*[\\\]*\s*s\s*[\\\]*\s*i\s*[\\\]*\s*o\s*[\\\]*\s*n\s*[\\\]*\s*/i',
'blocked', $string);
$string = preg_replace('/b\s*[\\\]*\s*i\s*[\\\]*\s*n\s*[\\\]*\s*d\s*[\\\]*\s*i\s*[\\\]*\s*n\s*[\\\]*\s*g:/i', 'blocked', $string);
return $string;
case 'html':
//return htmlspecialchars($string, ENT_NOQUOTES);
return str_replace(array('<', '>'), array('&lt;', '&gt;'), $string);
case 'htmlall':
return htmlentities($string, ENT_QUOTES);
case 'url':
return rawurlencode($string);
case 'query':
return urlencode($string);
case 'quotes':
// escape unescaped single quotes
return preg_replace("%(?<!\\\\)'%", "\\'", $string);
case 'hex':
// escape every character into hex
$s_return = '';
for ($x=0; $x < strlen($string); $x++) {
$s_return .= '%' . bin2hex($string[$x]);
}
return $s_return;
case 'hexentity':
$s_return = '';
for ($x=0; $x < strlen($string); $x++) {
$s_return .= '&#x' . bin2hex($string[$x]) . ';';
}
return $s_return;
case 'decentity':
$s_return = '';
for ($x=0; $x < strlen($string); $x++) {
$s_return .= '&#' . ord($string[$x]) . ';';
}
return $s_return;
case 'javascript':
// escape quotes and backslashes, newlines, etc.
return strtr($string, array('\\'=>'\\\\',"'"=>"\\'",'"'=>'\\"',"\r"=>'\\r',"\n"=>'\\n','</'=>'<\/'));
case 'mail':
// safe way to display e-mail address on a web page
return str_replace(array('@', '.'),array(' [AT] ', ' [DOT] '), $string);
case 'nonstd':
// escape non-standard chars, such as ms document quotes
$_res = '';
for($_i = 0, $_len = strlen($string); $_i < $_len; $_i++) {
$_ord = ord($string[$_i]);
// non-standard char, escape it
if($_ord >= 126){
$_res .= '&#' . $_ord . ';';
} else {
$_res .= $string[$_i];
}
}
return $_res;
default:
return $string;
}
}
?>
Source
Make any session cookies (or all cookies) you use HttpOnly. Most browsers will hide the cookie value from JavaScript in that case. A user could still manually copy cookies, but this helps prevent direct script access. StackOverflow had this problem during beta.
This isn't a solution, just another brick in the wall
Don't trust user input
Escape all free-text output
Don't use magic_quotes; see if there's a DBMS-specific variant, or use PDO
Consider using HTTP-only cookies where possible to avoid any malicious script being able to hijack a session
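For PHP's own session cookie, HttpOnly can be switched on before the session starts. This is a sketch assuming PHP 7.3 or later (for the array form of session_set_cookie_params) and a site served over HTTPS; the other values are illustrative defaults.

```php
<?php
// Must run before session_start() and before any output is sent.
session_set_cookie_params([
    'lifetime' => 0,       // cookie lasts until the browser is closed
    'path'     => '/',
    'secure'   => true,    // assumption: the site is served over HTTPS
    'httponly' => true,    // hide the cookie from document.cookie
    'samesite' => 'Lax',
]);
```

Cookies set by hand with setcookie() need the same httponly option passed explicitly.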
You should at least validate all data going into the database. And try to validate all data leaving the database too.
mysql_real_escape_string is good to prevent SQL injection, but XSS is trickier.
You should use preg_match, strip_tags, or htmlentities where possible!
The best current method for preventing XSS in a PHP application is HTML Purifier (http://htmlpurifier.org/). One minor drawback is that it's a rather large library and is best used with an opcode cache like APC. You would use it in any place where untrusted content is being output to the screen. It is much more thorough than htmlentities, htmlspecialchars, filter_input, filter_var, strip_tags, etc.
Use an existing user-input sanitization library to clean all user-input. Unless you put a lot of effort into it, implementing it yourself will never work as well.
I find the best way is using a class that allows you to bind your code so you never have to worry about manually escaping your data.
It is difficult to implement a thorough sql injection/xss injection prevention on a site that doesn't cause false alarms. In a CMS the end user might want to use <script> or <object> that links to items from another site.
I recommend having all users install FireFox with NoScript ;-)
