Reduce chances of PHP plugins being malicious - php

I was wondering what steps you use to keep downloaded plugins from being malicious?
For example, what does wordpress do to ensure that the plugins you download do not simply execute unlink('/')?
I assume it depends partly on the downloader using his or her own discretion when installing plugins, but do plugin systems take measures to minimize the security risk of running third-party plugins?
Thanks!
Matt Mueller

Simple answer: you can't do this programmatically. Simply can't be done. Certainly Wordpress has a validator of some sort to determine whether the plugin is outright nasty, but there's no way to say for certain that it is safe.
I'm an intern at Mozilla this summer and I'm working on the validator that scans add-ons as they're submitted to addons.mozilla.org. I can only imagine that Wordpress has a very similar tool on their end. The idea is that the app outright rejects blatantly malicious code (eval("evil nasty code");), while the rest of it is analyzed with some simple heuristics. The algorithms in place mark down some potential red flags based on what it sees in the add-on package and submits those notes to the editors, who then review the code. It effectively ends up being a human-powered process, but the software helps to take care of a lot of the heavy lifting.
Some techniques that the Mozilla validator uses:
Syntax checking
Code and markup parsing (HTML/CSS) to find remote code vulnerabilities
Javascript parsing and analysis (parse the JS to an AST tree and analyze each statement, evaluating static expressions as deeply as possible)
Compatibility/deprecation testing
You can check out the code here:
http://github.com/mattbasta/amo-validator
Hope this helps!
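To make the "flag it for a human" idea concrete for PHP, here is a minimal sketch built on PHP's own tokenizer. It is not how the Mozilla validator or any Wordpress tool works; findRedFlags() and its watch list are hypothetical names chosen for illustration, and a real scanner would need far more than this.

```php
<?php
// Hypothetical helper: list red-flag constructs found in a PHP source string.
// The watch list below is an assumption for illustration, not a complete set.
function findRedFlags(string $source): array
{
    $riskyCalls = ['exec', 'system', 'shell_exec', 'passthru', 'base64_decode', 'unlink'];
    $flags = [];
    foreach (token_get_all($source) as $token) {
        if ($token === '`') {
            $flags[] = 'backtick operator';       // shell execution via backticks
            continue;
        }
        if (!is_array($token)) {
            continue;                             // other single-character tokens
        }
        [$id, $text, $line] = $token;
        if ($id === T_EVAL) {
            $flags[] = "eval on line $line";
        } elseif ($id === T_STRING && in_array(strtolower($text), $riskyCalls, true)) {
            $flags[] = strtolower($text) . " on line $line";
        }
    }
    // Opening and closing backticks are separate tokens, so drop duplicates.
    return array_values(array_unique($flags));
}
```

The output is only a list of notes for a reviewer. Obfuscated code (for example a function name assembled from strings at runtime) passes straight through a check like this, which is why the process still ends with a human reading the code.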

unlink('/') won't do any harm, since unlink() only deletes files; you would have to use rmdir(), or more precisely a recursive rmdir() implementation. I don't think there is any way to prevent malicious code from being executed, because there are many ways of being malicious. You can restrict certain functions from being called in php.ini, but that will only help you up to a certain point. For instance, str_repeat and unserialize are common functions, but if called with the right arguments they can exhaust all the memory allocated to your PHP scripts in no time. This is only an example; a more nefarious plugin could act as a backdoor or email all the logins to the developer. I guess in the end you'll have to trust the developer and the community if you don't want to audit the code yourself.

There are tools for PHP that do static source code analysis in order to find vulnerabilities. Open source analysis tools for PHP include RATS and PHP-SAT.
If you have ever used a static source code analysis tool, you know that these tools produce a TON of false positives and false negatives. No source code analysis tool can tell you with 100% certainty whether or not a program has a backdoor or is malicious. If it could, we wouldn't have so many problems with websites getting hacked. Wordpress itself is extremely insecure, and so are all of the plugins, and this is because of mistakes, not malice.
Malicious code can be obfuscated or hidden, and can take many, many forms. Finding an accidental vulnerability is a much easier problem than finding an intentional one. A backdoor in PHP can be as simple as adding or removing 2 bytes.
Removing 2 bytes:
$id=mysql_real_escape_string($id);
"select * from test where id=$id"
vs
"select * from test where id='$id'"
or adding 2 bytes:
`$_GET[b]`;
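The "removing 2 bytes" case can be checked without a database. In this sketch addslashes() is only a stand-in for a driver escape function such as mysql_real_escape_string (an assumption, so that it runs without a connection), and buildQueries() is a hypothetical helper. The point it illustrates is that escaping only protects a value that sits inside quotes.

```php
<?php
// addslashes() stands in for a real driver escape function here
// (an assumption, so the sketch runs without a database connection).
function buildQueries(string $id): array
{
    $escaped = addslashes($id);
    return [
        // Two quote characters removed: the value is spliced in as raw SQL.
        'unquoted' => "select * from test where id=$escaped",
        // Quoted: the value stays inside a string literal.
        'quoted'   => "select * from test where id='$escaped'",
    ];
}

// The payload contains no quote characters, so escaping leaves it untouched.
$queries = buildQueries('0 or 1=1');
```

In the unquoted query the payload becomes part of the WHERE clause, while in the quoted one it remains a single string value.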

Related

Using eval() to enhance security

I admit the title is mostly a catch 22, but it's entirely relevant, so please bear with me for a while...
Background
As some may know, I'm working on a PHP framework whose major selling point is that of bridging functionality between different CMSes/systems.
From a developer perspective, there's an extensive error handling and logging mechanism.
Right now, there are two settings, DEBUG_MODE and DEBUG_VERBOSE, which control debug output.
The mode describes the medium and verbose controls the amount of detail.
To make it short, there's a mode called "console" which basically dumps debug info into the javascript console (which is now available in a major web browser near you).
The Issue
This [debug system] works great for development servers, but you absolutely cannot use it on a production one since debug details (which include DB credentials etc) get published publicly. And in all honesty, who ever migrated from a dev. to a prod. server flawlessly each time?
Solutions
Therefore, I've been trying to figure out a way to fix this. Among my proposed solutions are:
Having a setting which tells the framework to enable logging only if the request comes from a certain IP. The security issues for this are quite obvious (IP spoofing among others).
Having a setting which contains a PHP expression (code) that gets eval'd, with its return value used as a yes/no. The best part is that the framework installer may suggest CMS-specific expressions, e.g.:
Wordpress: current_user_can('manage_options')
Joomla: $user=&JFactory::getUser() && ($user->usertype=='Super Administrator') || ($user->usertype=='Administrator')
Custom: $_SERVER['REMOTE_ADDR']=='123.124.125.126'
These are the two I have so far; I'm eager to hear more suggestions.
So, do you think eval() should be up to it? I'll ensure it still performs well by only doing this once per page load/request.
Clarification
if(DEBUG_MODE!='none')echo 'Debug'; // this is how it is now
if(DEBUG_MODE!='none' && $USER_CONDITION)echo 'Debug'; // this is how it should be
The $USER_CONDITION allows things such as running is_admin() to let all admins see debug info, or getUser()->id==45 to enable it for a specific user. Or by IP, or whatever.
Go ahead. It's evident that you understand the hypothetical security implications. In your case it's just important to tell the target user base about it.
As for the practicability of your approach, there's no discussion really. You need variable authentication logic and can't hardwire it to one specific environment/cms runtime.
The only concern you see is about performance. That's baloney. Not an issue. The presence of eval is what distinguishes scripting languages from compiled languages. If it's available, you can not only use it, but can be sure that it's not going to be slow in the way it would be if a compiler+linker run were required behind the scenes. PHP takes some time initializing its tokenizer and parser, but parsing itself is surprisingly quick.
And lastly, avoid such question titles on SO. ;} Or at the very least talk about create_function please.
IP spoofing long enough to actually get a response is unlikely to occur. If a user manages to establish a connection to your server while spoofing an internal or privileged developer IP, they control your router, so you've got other things to worry about.
Rather than running eval can't you just write an anonymous function/closure: http://php.net/manual/en/functions.anonymous.php
(putting it in a config file, rather than web screen, writing complicated PHP code on a web form seems sub-optimal anyways)
Allowing free-form input of PHP code that gets executed - be it through eval() or create_function() - is simply bad design, and opens a big potential vulnerability for no good reason. It also opens the possibility of crashing a page through syntax errors.
Even the argument that the administrator can install plugins anyway doesn't hold entirely, because XSRF attacks are conceivable that manage to get malicious stuff into a text field (one request), but can't trigger a plug-in installation.
So no, I wouldn't do it; I would implement each CMS bridge as an adapter instead, and let the user choose the adapter (and if necessary enter some custom, sanitizable settings) from a pre-defined list. (Something similar was also suggested by @Wrikken in the comments.)
It's your call. Chances are you will never have a problem from doing this the eval() way. And it can be argued that most of the CMSs you will be connecting with (Wordpress, Joomla) allow arbitrary execution of PHP code in the back-end anyway. But it's not good design.
Having a setting which contains a PHP expression (code) that gets eval'd, with its return value used as a yes/no. The best part is that the framework installer may suggest CMS-specific expressions, e.g.:
eval() may crash your page if any function doesn't exist or on any number of parse errors. And if bugs exist which allow user-supplied input (such as a uri requested) to even touch these evaled values, it will potentially open up your site to malicious or accidental destruction. Instead to identify the currently working framework, look for markers in the framework you're trying to bridge to, such as certain constants, functions, classes, etc. You can replace all your eval() functions with safe checks using function_exists(), defined(), etc.
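The adapter and closure suggestions above could be combined roughly as follows. This is a sketch under assumptions, not the framework's actual design: debugAdapters() and debugAllowed() are hypothetical names, current_user_can('manage_options') is the Wordpress check quoted in the question, and the IP address is the question's own placeholder.

```php
<?php
// Hypothetical adapter table: each entry pairs a detection check with a
// permission check, written as closures in a config file instead of eval'd text.
function debugAdapters(): array
{
    return [
        'wordpress' => [
            'detect' => fn(): bool => function_exists('current_user_can'),
            'allow'  => fn(): bool => current_user_can('manage_options'),
        ],
        'custom_ip' => [
            'detect' => fn(): bool => isset($_SERVER['REMOTE_ADDR']),
            'allow'  => fn(): bool => $_SERVER['REMOTE_ADDR'] === '123.124.125.126',
        ],
    ];
}

function debugAllowed(string $adapterName, array $adapters): bool
{
    if (!isset($adapters[$adapterName])) {
        return false;                     // unknown adapter: fail closed
    }
    $adapter = $adapters[$adapterName];
    // 'allow' only runs when 'detect' succeeds, so a missing CMS function
    // is never called.
    return $adapter['detect']() && $adapter['allow']();
}
```

The user then picks an adapter name from a list instead of typing code, and a missing CMS simply yields false instead of a fatal error.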

Challenge: maximize cost of obfuscation's reverse engineering

Disclaimer: Similar questions have been asked a number of times on SO; however, this question is much more specific, and has not been adequately addressed so far.
We're developing a new packaged software product, which, for business security reasons, must run on our customer's server, in PHP. The software is sold with a per-user end-license; the price range is $20-80 per user, and the target market is small (and web-savvy) consultancies and IT agencies.
To discourage piracy (eg. removing the user-license enforcement), we'd like to maximize the protection of the PHP code in any means technologically available, which does not inconvenience the user.
Let's break this down:
does not inconvenience the user: no additional server-side installs (no zend decoder, or other binaries). Has to run on a plain-vanilla shared PHP host out-of-the-box.
Maximize the protection: breaking the protection has to outweigh the cost of buying an additional license. That is, it has to take at least 3-5 working days for a professional hacker to remove the user license protection.
Any means technologically available: might call home, might use high-end crypto, might implement a c64 emulator.
To pro-actively address the so far highest-voted non-solutions:
NOT looking for perfect obfuscation, just extremely hard ones (defined as: have to take at least 3-5 working days to decrypt), OR other anti-piracy methods
NOT looking for "black-box" software packages whose workings I don't know, so that I can't determine whether they fit our purpose; looking for algorithmic and out-of-the-box ideas.
NOT looking for license/law-side protection, we already have that covered.
We DO know, that given enough time, and focus, all obfuscation will be hacked sooner or later; we merely want this not to be the economical solution.
Given the above constraints, what methods, or ideas would you use to maximize anti-piracy measures?
Bounty-hunt: point goes for the hardest algorithmic method to reverse-engineer the code, given the constraints above.
Update / Bounty-hunt: I've accepted Ira Baxter's answer, mostly because the rest failed to answer the core question, and attempted to question the underlying assumptions (business, closed source, yadda yadda). Thanks all!
I think what you want to do is to transform the code algorithmically, to obfuscate not only what is executed, but also the data structures. We assume we start with a clean version of the program, produced by the developer. He always works with the clean version. Obfuscation produces the to-ship version. Good obfuscation will produce a to-ship version with exactly the same functionality as the original, so no further testing is (arguably) needed.
For control flow scrambling, the idea is to take the nicely written code you have at the start, and push it through transformations that make static (and human) analysis of the decisions that control the flow difficult, by multiplying the set of assumptions that have to be analyzed. For instance, if you have two pointers, and store a value through one, can it affect the value seen by the other? Depending on whether the pointers are aliased or not, you can get two different answers. Now take N pointers, each of which may be aliased; you get 2^N possible aliasing relations. If the reader doesn't know the exact combination, he won't be able to determine if a decision might be true, false or conditional. Of course, the tool that generates this produces conditionals whose outcome it knows, because it designs (generates) the pointer rat's nest to produce a specific outcome.
See Code Obfuscation Literature Survey (not my paper), which discusses a variety of control flow and data flow obfuscation. This is likely not the most recent summary of what is possible, but it's pretty instructive. You should note that doing this kind of obfuscation has some impact on execution time.
What the papers on this topic make clear is that control and data flow obfuscated programs are extremely hard for static analyzers to "understand"; the papers provide/reference demonstrations of the algorithmic complexity of processing such obfuscated programs.
Now, you might argue that people aren't static analyzers and therefore don't suffer the same limitations. You might be right; Roger Penrose famously argues that people do not have the same constraints as Turing machines; the argument isn't settled by a long shot. But the entire foundation of encryption/hashing technology is built on essentially the same kind of computational complexity arguments. And to date, nobody has proven smart enough to crack these technologies in ways that can be used in daily life by thieves (good thing, or your bank accounts would be empty).
To do this to a PHP program, you need tools that can parse the PHP code, and carry out such transformations. Our DMS Software Reengineering Toolkit has robust PHP parsers, and can apply very complex transformations to code. To do this really well, you want to apply the transformations globally across all your code, not just on a file-by-file basis. We don't have this kind of obfuscation transformation implemented on PHP, but if you really wanted to do it, this would be the way. We have applied complex transformations to PHP programs for other commercial products that we sell.
When you are all done, ideally you'd compile this result to machine code, say using the HipHop compiler. (Just compiling would defeat some folks, but not the serious software engineers).
EDIT: Obfuscation != AntiPiracy is a theme in other answers. So how does obfuscation help?
First you need to deal with the anti-piracy issue. The obvious things to do are:
Add copyright comments to each file. These serve as warnings to thieves. Not good ones.
Add copyright strings in various places and print them out occasionally; these will end up in memory and play a role if a pirate steals the code; he stole this string, too.
Add a string to your application saying, "licensed to ". This makes your customer unenthusiastic about letting it be stolen.
Add a check to your application that it is running on the intended customer's machine. (Since your app is intended to be very cheap, you'll probably need to automate a registration process.)
Have the application phone home with its machine ID occasionally.
Now, these steps prevent someone (legally and technically) from stealing your code. If this is all you have, an unfazed pirate will simply remove the technical checks and it's stolen.
It is very hard to prevent somebody from copying the bit stream that makes up your product; computers are far too good at copying. So your goal is to arrange for it to be hard for him to derive value if he does, and that's where obfuscation comes in.
If the code is sufficiently obfuscated, he will have a difficult time locating the license check and phone-home mechanisms to disable them. (I suggest several checks, none of them always called, to make it hard for the thief to tell when he is successful.)
The obfuscation, well done, should protect the printing of the original owner's name, which means the original owner will have some interest in preventing it from being stolen, as you'll name him along with the pirate in any lawsuit.
If they defeat the licenses, copyright printing, and phone-home mechanisms, and simply want to run it in the back room without telling you, you might be stuck. (For $80.00, I can't imagine why they'd go to all this trouble just for this effect.)
But many thieves want to modify the software to "improve" it, especially if they want your market. Serious obfuscation will prevent them from doing this; it will even make it hard for them to add their own license controls. That limits the value pretty severely.
They may simply steal it and release it to the world for free; your hope here is that the application is hard to crack. If they succeed, your only good defense is a continuing stream of upgrades that licensed owners get.
Obfuscation is a key to successful piracy defense, IMHO.
Obfuscation != Anti-piracy For instance you could have a heavily obfuscated class, but I can use reflection to see all methods that this class implements. I can then extend this class and override any methods that I don't like. Are you storing a secret? Because any secret value can be pulled from memory using a debugger.
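The extend-and-override point takes only a few lines to demonstrate. LicensedApp and PatchedApp are hypothetical classes, not taken from any real product; the sketch only illustrates that an overridable check can be replaced without ever being understood.

```php
<?php
// Hypothetical classes for illustration only.
class LicensedApp
{
    protected function licenseIsValid(): bool
    {
        return false;               // stands in for an obfuscated license check
    }

    public function run(): string
    {
        return $this->licenseIsValid() ? 'running' : 'license required';
    }
}

// The attacker never reads the check; they just replace it.
class PatchedApp extends LicensedApp
{
    protected function licenseIsValid(): bool
    {
        return true;
    }
}
```

Marking the method private or final closes this particular route, but anyone who can edit the shipped files can undo that as well.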
3-5 days? Even with Zend-Guard it takes 3-5 seconds to break using some open source tool. Most obfuscation tools are very primitive and easy to break.
I'm sorry but I don't think there is a good solution for this.
The best anti piracy method is no method.
If you don't want to use tools such as zend, then you are better off doing absolutely nothing.
Take it from me: you can waste more time and lose sales trying to stop pirates. You will only hurt yourself. They don't care, and it's good fun for them; the harder you make it, the more satisfaction they get in doing it. And once it's done, it will be available to all via a torrent, so no one needs to repeat the effort.
Make a good application. Make it work well. Give fantastic service, and the customers you want will gladly pay. Those customers you don't want will NEVER pay, so don't waste time on them. And guess what, they actually become good advertising: people see your software on more sites and come looking for it.
So in effect you are getting free advertising.
So don't stress, don't waste your time, and don't blame pirates if your software fails. Blame yourself, because you got too distracted trying to do the impossible.
I wanted to add a little bit of my personal experience.
Back in the 90's I spent many months creating encryption techniques to reduce/prevent pirating of a heavily pirated piece of software, in the end I 'mostly' succeeded.
I used custom encryption, junk insertion, random number generators, cross module CRC checking, blah blah blah.
I used to hang out in the newsgroup devoted to hacking my software and others like it, and even struck up conversations. One polite fellow said, "Why are you wasting your time? We do this for fun." But I was hooked. It was a competition.
If I had spent the time and effort on improving the software instead, I would have earned 10x the amount I thought I had lost to piracy.
It was a fool's victory.
I thought about this a lot, and what you are asking is essentially impossible. You can obfuscate to no end and people will still steal your software. There is little you can do about it. If you write in code to call home, someone will strip it out and just put true in instead. Your best bet is to write quality software so people want to buy it. It's either that or use a commercial solution like ionCube or Zend.
Only a few things can really work. The most basic logic I can think of that would be effective (since this market sounds like it's fairly controlled, and finite) would be to use something similar to a licensing server, but with a two-way communication channel (that you can encrypt etc.. etc..).
Now, of course you can have someone disable that communication channel, but between the coding you will add to disable the software, and the fact that your company will be able to follow up with the client since you will know exactly who it is that is "down" that will help.
The third part of the logic is for each license that is given out to play a role in generating the "checks" that occur between the software and your licensing server. This means you generate, on-premise, unique hash codes that are used as part of the answer your software sends back to the server. That pretty much rules out the hacking, because the hacker would have to know what algorithm you are using to generate the licensing (since it is pre-generated, there is no logic to use to decipher it) and the hacker would have to feed you a licensing key.
The fourth step, optionally, would be to push updates to clients to refresh the security mechanisms you have in place and run "tamper" checks on your code, possibly periodically feed some sort of hash to be used in the logic your software uses to connect to the licensing server.
This still isn't perfect: someone "will" be able to clone a production machine, circumvent/redirect the licensing (and you won't know, since it will be a copy) and try to work away at the checks in your code which require a license (as someone above mentioned, set all the logic to "True"). But you could definitely spend the time putting checks and encryption on your licensing system and make it a time-consuming and "risky" process. Unless, as a final touch, you can have some deliverable from your product generated by your server (so none of that code is in what the client has) and pushed to the software that has this licensing mechanism in place. But I don't know how possible that is.
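One way to picture the per-license "checks" described above is an HMAC challenge-response. This is a minimal sketch with hypothetical function names; it assumes the vendor generates a per-license secret when the license is issued, which the answer does not spell out.

```php
<?php
// Hypothetical names throughout; the per-license secret is assumed to be
// generated by the vendor when the license is issued.
function makeChallenge(): string
{
    return bin2hex(random_bytes(16));           // fresh nonce for every check
}

// Runs inside the customer's installation.
function answerChallenge(string $challenge, string $licenseSecret): string
{
    return hash_hmac('sha256', $challenge, $licenseSecret);
}

// Runs on the licensing server, which knows the secret for each license.
function verifyAnswer(string $challenge, string $answer, string $licenseSecret): bool
{
    $expected = hash_hmac('sha256', $challenge, $licenseSecret);
    return hash_equals($expected, $answer);     // constant-time comparison
}
```

Because the nonce changes on every check, a recorded answer cannot be replayed. The secret still ships with the customer's copy, though, so this raises the effort needed and does not make removal of the check impossible.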
Artificial code bloat
By using post processors to automatically bloat the code and insert logic multipliers you make the code hard to modify
I use tags in the original source to indicate the type of code in each method and which code multiplier to use. Randomisers can help too, as each release looks very different
The code bloat is achieved by a variety of processes. e.g. repeating and random fiddling of variables before and after they are officially in scope. Lots of extra logic steps that will never get followed. Breaking single statements into many random small steps. Interlace these with as many other statements as possible as long as the final step is in the correct order. etc etc
The final and most important part of this process is to interlace key generation and call home processes through this mess, and to be part of this mess (remember the "random fiddling of variables before and after they are officially in scope") so that the time taken to remove the key generation and call home become unwieldy
The call home server has to act like a rolling code remote control so while the attacker might discover the call home functions, taking them out will result in incorrect initialisation values for general variables in general methods, and in as many cases as you can work with
Over time you can build the general purpose code re-parser, and a library of functions to mess the code up. Keep adding the code mess library to improve the obfuscation level
You need to have a well covering unit and integration test library to validate the code after being messed up
I have not done this with PHP, but with other languages with similar constraints as PHP
Note: This technique works fine for complex scientific software where there is large amounts of cryptic logic and maths anyway. It may not work so well for typical web sites like CMS's unless your code multipliers are very convincing
If I get this right, why not invest in a server to be delivered within the cost of the application, a server which can be placed at the customer's site with only one port opened for HTTP access? For $1000 you can get a machine that can work as a safe for your software. If anyone attempts to hack into it, you will know.
Another solution might be:
Currently I am working for a huge company that has approximately 350 selling points (shops) all over the country. As we cannot rely 100% on the internet connection, we have a server at each shop. This server handles the business required for actual selling and is linked to a local database. The rest of the stuff sits on the head-office server. Now, the clerks have computers in front of them, and all these computers work with the application hosted on the local server. The catch is that the local server has a registry which knows whether a certain service is placed locally (on the same machine) or remotely (at the head office) and executes the call as required (over HTTP to the remote location, or as a direct call to the local service). Services can be placed anywhere (local or remote), and all one needs to do is configure their location in the registry by entering one of the keywords: local, remote, application (the application keyword means that the service is first called remotely and, if that fails, is called locally). This way you can make an acceptable compromise. Highly necessary stuff can sit locally, and the rest of the business logic can reside on your server, where nobody can touch it.
The short answer is no, there is no way to obfuscate code in such a complex manner that it takes days to crack. The simple explanation: obfuscation is a two way process. It can be done and undone. If a computer can do it, a determined person can do it too.
Instead of wasting so much time on protecting your code, why not take a hint from the popular TV show 24 (side note: should never have been canceled!). To ensure scripts weren't stolen or revealed to the public, they watermarked each with a number specific to the cast member, director, producer, etc. You can do something similar with your scripts by "watermarking" each PHP file. This can be something as simple as changing the name of a variable to reflect a client ID, or something as complex as spreading identifying characters over multiple variable and function values/names. Try working this identifier and/or parts of it into as many inconspicuous places in your scripts as possible. Only you can know the exact combination that creates the identifying information. This way, if code is leaked, you can sue the responsible party.
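The simple end of that watermarking idea (a variable name that encodes the client) might look like the sketch below. Everything here is hypothetical: the $__WM__ placeholder, the function names, and the choice of a keyed hash so that the name does not reveal the client ID to anyone but the vendor.

```php
<?php
// Hypothetical sketch: derive a per-client variable name from a keyed hash.
function watermarkToken(string $clientId, string $vendorSecret): string
{
    return 'v' . substr(hash_hmac('sha256', $clientId, $vendorSecret), 0, 10);
}

// Replace the placeholder variable in the clean source before shipping.
function applyWatermark(string $source, string $clientId, string $vendorSecret): string
{
    return str_replace('$__WM__', '$' . watermarkToken($clientId, $vendorSecret), $source);
}

// Given a leaked file, find which known client it was built for.
function identifyClient(string $leakedSource, array $clientIds, string $vendorSecret): ?string
{
    foreach ($clientIds as $clientId) {
        if (strpos($leakedSource, '$' . watermarkToken($clientId, $vendorSecret)) !== false) {
            return $clientId;
        }
    }
    return null;
}
```

A single renamed variable is easy to strip once someone compares two customers' copies, which is why the answer suggests spreading the identifier over many inconspicuous places.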
Just a suggestion, you might just want to add needed lines of code that don't really do anything, except it looks like it.

What is your experience of PHP encrypters? Which one would you recommend?

We have an application that is written in PHP that we are going to license to a customer. Our company believes that the customer might intend to steal the source code and create their own fork of the software, therefore we want to encrypt the source code.
I have searched for PHP encrypters and found several that seem good, but since we have no previous experience with PHP encrypters, it is hard to say which one is the best. Which PHP encrypters have you used, and what is your experience?
So, First:
It is impossible to encrypt your entire code base because at some point there has to be an eval statement, and if the user changes the eval to an echo, they get all of your code in the browser.
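The eval-to-echo swap is easy to see on a toy payload. Here base64 is only a stand-in for whatever encoding a real encoder applies (an assumption for illustration); the structure of "decode, then eval" is the part that matters.

```php
<?php
// A toy "encrypted" payload: base64 stands in for a real encoder's scheme.
$packed = base64_encode('return 6 * 7;');

// What the shipped loader does: decode, then execute.
$result = eval(base64_decode($packed));

// What a curious user does instead: decode, then print (captured here
// in a variable instead of echoed).
$revealed = base64_decode($packed);
```

However elaborate the decoding step is, its output has to be plain PHP at the moment it reaches eval, and that is the point where it can be printed instead.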
And here is a bunch of people who agree with me.
Furthermore:
People will offer you obfuscators, but no amount of obfuscation can prevent someone from getting at your code. None. If your computer can run it, or in the case of movies and music if it can play it, the user can get at it. Even compiling it to machine code just makes the job a little more difficult. If you use an obfuscator, you are just fooling yourself. Worse, you're also disallowing your users from fixing bugs or making modifications. - Schwern
Now that's done:
Bytecompiling is something completely different than encrypting. It makes the PHP code into already interpreted bytes, similar to an exe file. You can include these files just like any other php file.
The byte code produced is able to be reverse engineered, but it would take lots of time and is not worth the company's time.
Check out the byte compiler PHP extension.
I'd also like to note that PHP comes with several ways of reverse engineering classes. Such as the Reflection Class. This basically allows people to see every method, variables, and constant in each of your classes without the need for your source code.
Frankly, once someone sees the functions you use, it is pretty easy to piece it together after that.
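As a rough illustration of how much Reflection gives away, the sketch below lists the members of a class by name only. Invoice is a hypothetical class standing in for code the user cannot read, and describeClass() is a hypothetical helper.

```php
<?php
// Hypothetical class standing in for code shipped without readable source.
class Invoice
{
    const TAX_RATE = 0.2;
    private $licenseKey = 'secret';

    public function total(float $net): float
    {
        return $net * (1 + self::TAX_RATE);
    }

    private function checkLicense(): bool
    {
        return true;
    }
}

// Collect the names of constants, properties and methods, private ones included.
function describeClass(string $className): array
{
    $class = new ReflectionClass($className);
    return [
        'constants'  => array_keys($class->getConstants()),
        'properties' => array_map(fn($p) => $p->getName(), $class->getProperties()),
        'methods'    => array_map(fn($m) => $m->getName(), $class->getMethods()),
    ];
}
```

Calling describeClass('Invoice') reports the private licenseKey property and the private checkLicense() method alongside the public API, which is usually enough to tell an attacker where to look.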
There are a lot of obfuscators out there masquerading as encrypters.
If you really must encrypt your code, use Zend.
IMHO shutting your customers out of your code is inherently evil; I would rather hide some symbology in the code and sell it under a no-modify/no-resell contract. Then sue the ass off them if they try to sell it on. You could argue that encrypting your code closes down a business opportunity ;) !
C.

Interpret text input as PHP

I want to let users test out a PHP class of mine, that among other things crops and resizes images.
I want them to write PHP code in a text field, send the form, and then their code will be run. How can I do this?
Or are there other safe ways to let users (anyone) demo a PHP class?
I would spawn the PHP process using a user account with next-to-no permissions. Give it read-write access to a single directory, but that's it.
You're still opening yourself to DoS attacks with infinite loops and such, but if you absolutely must do it, then run the code in this very-low-permissions sandbox like IE and Chrome do.
Using EVAL is probably the worst idea.
Yes. You can use the eval statement in php (linked earlier by John http://us2.php.net/manual/en/function.eval.php), however be very careful. You really don't want users to be able to run code on your machine freely.
My recommendation would be to come up with a different approach to demos, perhaps a few flexible examples, but don't let them run code directly.
You could use the eval() function, but don't do it. Seriously.
There's no "safe" way to let any old user run their own PHP on your server. You are exposing yourself to a potential world of hurt.
Instead, provide excellent documentation, code samples, and the source for your own demos, and encourage potential users try it out on their own test/development servers.
As it was already stated, you could use eval function, but it's very dangerous. If you want users to test the code, prepare demo pages presenting possible usage, and for instance possibility to add parameters by user via HTML forms.
You could possibly use eval http://us2.php.net/manual/en/function.eval.php
Don't think in terms of PHP or another general-purpose language, think of the minimal language that's sufficient to express the operations in your domain of image processing. Users submit expressions in this domain-specific language (DSL), these expressions are parsed on the server side and passed to your code.
The important thing initially is to think about the range of image-processing operations and how they can be combined. That will tell you how expressive the language has to be. Once you've worked that out, there are plenty of choices for how the language would look syntactically. The syntax of the language might depend on a tradeoff between ease of use and ease of parsing.
If you can write or find a parser for expressions of this kind, it might be the easiest for users. Actually, can anyone recommend an existing expression evaluator that would work in cases like this (for example, can Smarty safely run user-submitted expressions?), or indeed a parser generator for PHP?
resize(rotate("foo.png", 90), 50)
A language like this might be less easy for users, but it can be processed using a fairly simple stack machine:
"foo.png" 90 rotate 50 resize
Even easier, an XML-based language like this doesn't need its own parser:
<resize percent="50"><rotate degrees="90"><img src="foo.png"></rotate></resize>
Using a DSL doesn't protect you from domain-specific attacks, for example somebody might use the language above to resize an image to a zillion pixels and use up all the server memory. So there would have to be some sort of runtime environment for the DSL that puts limits on the amount of resources any user-submitted script can use.
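The stack-machine variant is small enough to sketch in full. The evaluator below is hypothetical: it only understands the two words from the example (rotate and resize), builds a plan of operations instead of touching any image, and applies two arbitrary limits (number of operations, size of numbers) as a stand-in for the resource limits mentioned above.

```php
<?php
// Hypothetical evaluator: it builds a nested plan and enforces limits;
// the real image work would happen afterwards in trusted code.
function evaluatePostfix(string $program, int $maxOps = 10, int $maxNumber = 1000): array
{
    $arity = ['rotate' => 2, 'resize' => 2];     // allowed words and operand counts
    preg_match_all('/"([^"]*)"|(\S+)/', $program, $matches, PREG_SET_ORDER);
    $stack = [];
    $ops = 0;
    foreach ($matches as $m) {
        if (!isset($m[2])) {
            $stack[] = ['image', $m[1]];         // quoted string: an image name
        } elseif (ctype_digit($m[2])) {
            if ((int) $m[2] > $maxNumber) {
                throw new InvalidArgumentException("number too large: {$m[2]}");
            }
            $stack[] = (int) $m[2];              // numeric literal
        } elseif (isset($arity[$m[2]])) {
            if (++$ops > $maxOps || count($stack) < $arity[$m[2]]) {
                throw new InvalidArgumentException("bad use of {$m[2]}");
            }
            $args = array_splice($stack, -$arity[$m[2]]);
            $stack[] = array_merge([$m[2]], $args);
        } else {
            throw new InvalidArgumentException("unknown word {$m[2]}");
        }
    }
    if (count($stack) !== 1) {
        throw new InvalidArgumentException('program must leave exactly one result');
    }
    return $stack[0];
}
```

Anything that is not a quoted name, a small number, or a whitelisted word is rejected, so a user cannot reach arbitrary PHP functions through this language.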
You can use eval, but not without some precautions.
If your security concerns are merely "cautious" as opposed to "paranoid", you do have a few options:
If you have a dedicated Apache/PHP instance just for this project of yours, set the disable_functions option in php.ini and turn off all file and network related functions. This will affect the entire PHP installation and will break some surprising things, like phpMyAdmin.
If you don't have a dedicated server, try 'runkit': http://php.net/manual/en/book.runkit.php to disable functions only within an already running script.
Perhaps more work? Set up a virtual machine (VirtualBox, VMware, etc.) which is aggressively firewalled from within the host OS, with a minimal allocation of memory and disk space, and run the untrusted code there.
If you are paranoid... set up an approval process for all uploaded code.

How can I scan/fuzz my code for vulnerabilities?

I'm looking for an automated way to fuzz my app or scan it for vulnerabilities. Please assume that my hacking knowledge is 0. Also the source is on my localhost so I need a way to fuzz it locally without relying on an internet connection. Can some security experts give me some hints or recommendations? I'm not sure what options are best.
Edit:
Thanks for the effort to answer, but none so far seems to get the point. I'd like to be more specific (because it helps the question) but without influencing opinions or sounding like I'm advertising a specific product. I'm looking for something like wapiti (sorry to mention names, but I had to, because answers so far like "learn about SQL injection, XSS, etc." are obviously not real "expert" answers to this question). I already know about these (seriously, does this question sound like it could be asked by someone who doesn't know salt about security?).
I'm not asking whether I should test, I'm asking how I should test. I already decided to incorporate automation (and there's no turning back in this decision unless someone gives me an expert answer that proves it useless), so please respect my decision that I'd like to automate. I don't want to go through every compiled xss, sql injection, etc. hack list and try it manually myself against my site (even hackers don't hack that way). Super extra points to anyone who gets the question.
Some people are asking why not just learn.
Best practices (which I know) are not the same as knowing hacking. Some people want to argue they're two sides of the same coin, but I definitely don't agree :) Hence I need a protection tool made by someone with the "hacker mentality". How is that going to hurt? In fact, you should try it too ;) Expert answers please, from those who know.
There are services that will do automated scans for vulnerabilities. They will not catch everything, but will help you identify problems. Your best bet is to use one of these services and LEARN SOME SECURITY best practices.
Start learning about SQL injection and cross-site scripting. These are the biggest and easiest to fix vulnerabilities.
Programming defensively is a skill that IMHO every programmer should learn.
There is no substitute for understanding these issues on your own.
To strictly answer your question, the way you should test is by using a tool. There are 2 main types of tools you can use: a security scanner, which actively probes a running website, or a static analysis tool, which runs on the source code you use to build your webapp.
The short answer is that you want a security scanning tool like wapiti or burp. Tools like these dynamically construct and execute security tests uniquely for your site. You could manually attempt to exploit your own site, but that would take lots of time and not provide any value. It would be useless for you to go through a list of known XSS or SQL injection issues because each issue is unique to the site it applies to. Furthermore, these tools can attack your site better than you can, giving you a more rigorous security stress test.
There are 2 main kinds of tools you can use: static analysis tools and dynamic analysis tools. Static analysis tools read in your source code, figure out how data flows through the app, and look for security issues. At their root, most security issues come from letting a user control some data that flows into an inappropriate part of an application. So even though the app isn't running and you rub up against the halting problem, the static analysis method of "guessing" and trying out each code path can yield good results. Static analysis tools are language dependent and most are expensive. Some free ones are FxCop (C#) and PMD and FindBugs (Java); see http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
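To give a feel for what "reading the source" means, here is a toy check built on Python's standard ast module. It flags calls to eval or exec whose arguments are not plain literals. This is a deliberately tiny sketch, nowhere near what the tools above do (they track data across functions and files), and DANGEROUS_CALLS and find_risky_calls are names invented for the example.

```python
import ast

# Assumed list of sinks for this toy example.
DANGEROUS_CALLS = {"eval", "exec"}


def find_risky_calls(source):
    """Return sorted (line, function) pairs for risky calls on non-literal data."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DANGEROUS_CALLS
            and not all(isinstance(arg, ast.Constant) for arg in node.args)
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = 'x = input()\neval(x)\neval("1 + 1")\n'
report = find_risky_calls(sample)
```

Only the call on line 2 is reported, because its argument comes from a variable rather than a constant. Note that the code under inspection is parsed, never run.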
Dynamic analysis tools (more commonly just called "security scanners") require you to set up your webapp so they can run tests against it; this sounds more like what you want. My favorite tool here is burp; free ones include wapiti, which is good as well. These tools will look at how your app handles data, look for inputs, and fill them with malicious data in an attempt to trigger vulnerabilities. An example is the test for reflected cross-site scripting: the scanner looks at a page, inserts JavaScript into every query string value, cookie value, form value, etc., and then renders the page to see if the malicious JavaScript was echoed back to the page.
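The reflected XSS test described above can be sketched in a few lines. This is not how burp or wapiti are implemented; it is a minimal illustration limited to query string values. PROBE, probe_urls, find_reflections and fake_app are hypothetical names, and fake_app stands in for a real HTTP request to your local server.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# A recognisable marker; real scanners try many payload variants.
PROBE = '<script>alert("xss-probe-1337")</script>'


def probe_urls(url):
    """Yield (parameter, url) pairs, each with one query value replaced by PROBE."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    for index, (name, _) in enumerate(params):
        mutated = list(params)
        mutated[index] = (name, PROBE)
        yield name, urlunsplit(parts._replace(query=urlencode(mutated)))


def find_reflections(url, fetch):
    """Return the parameters whose probe comes back unescaped in the body.

    fetch is any callable that takes a URL and returns the response text.
    """
    return [name for name, probe_url in probe_urls(url) if PROBE in fetch(probe_url)]


def fake_app(url):
    """Stand-in for a local server: echoes 'q' raw and 'name' escaped."""
    query = dict(parse_qsl(urlsplit(url).query))
    return (
        "<p>Results for " + query.get("q", "") + "</p>"
        "<p>Hello " + html.escape(query.get("name", "")) + "</p>"
    )


vulnerable = find_reflections("http://localhost/search?q=shoes&name=bob", fake_app)
```

Against a real local install you would pass a fetch function that performs the HTTP request and returns the body; the checking logic stays the same.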
You likely don't need or want a fuzzer. Fuzzing tools mostly help you when there is a lot of parsing code, so a fuzzer is not the best fit for a webapp, whereas it would be a good fit for a protocol you are making. There are limited fuzzing capabilities in the security scanner tools listed above and you probably don't need more than this. Fuzzers also take time to build. Fuzzers often find more stuff in C/C++ code because there are fewer built-in libraries already doing the right thing; in the webapp case there is less "room for fuzzers to play", so to speak.
Before you go crazy on automation (which will likely yield results you probably won't understand), I'd suggest that you read up on writing secure code instead and learn to identify the things you are doing wrong. Here are some tutorials to get you started:
http://php.net/manual/en/security.php
Failing that, I'd suggest outsourcing your code to a security firm if you can afford it.
Good luck!
Provided you know C, you can work with SPIKE. It's always good to do a manual check for overflows in anything that could conceivably be touched by an end user, to run the usual %x%x%x tests for format string attacks, and just to be diligent in your static analysis.
PeachFuzz and SPIKE are both well documented.
Failing that, writing your own is trivial.
Knowing what fuzzing is and how you may want to approach it does not necessarily give you the skills needed to thoroughly test and evaluate your software for vulnerabilities and flaws. You need to use automated testing, but in a tuned manner, where you modify the testing that the tool is doing as you find new input paths, interactions, and so on.
Basically, what I'm saying is that you need to know what you are doing if you want this to be a real value add. You cannot just pick a tool, run it, and expect to get good results. You need someone who does this type of testing to work either with or for you. Tools are useful, but can only produce useful results when used by someone skilled in this art.
I've used Paros - http://www.parosproxy.org/ - it's free, easy to use, and displays the cause of the error, a possible fix, and how to replicate it (usually a link).
It's easy to configure and spiders your entire site - it can also spider local installations.
It has a GUI as well.
It's old, but it's good and easy.
I tried to configure WAPITI but it was simply too hard for me.
I've been researching this topic for many years for my own application and recently found a fantastic tool which is based on Paros (see my other answer above).
It's ZAP from OWASP, and it's the duck's nuts.
https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project
One of the best things you can do is integrate ZAP into your project automation / build so whenever you do a build the test runs.
Even better, you can sit it next to your Selenium automated tests to 'collect' the pages you test, then... scan the hell out of them!
It's really well documented, but you'll need a fast PC as it runs hundreds of tests per page. If you're doing a whole site it can take some time.
There are some other tools you might want to consider:
http://sqlmap.org/
I found this tool... scarily easy to use and very, very comprehensive.
Whenever I got what I thought was a 'false positive' with ZAP, I'd scan the page with SQLmap (you gotta figure out how to use Python - it's easy, took a couple of hours) and SQLmap would either confirm the false positive or find the vulnerability.
