PHP sandbox/sanitize code passed to create_function

I am using create_function to run some user-code at server end. I am looking for any of these two:
Is there a way to sanitize the code passed to it to prevent something harmful from executing?
Alternately, is there a way to specify this code to be run in a sandboxed environment so that the user can't play around with anything else.
Thanks!

http://php.net/runkit

You could use the tokenizer to figure out what the code will do, then whitelist certain functions and operations. I think it would end up being very difficult (or impossible) to make it foolproof, especially given PHP's flexibility:
$f = "shell_exec";
$arg = 'rm -rf /';
$f($arg); // ouch
call_user_func($f, $arg); // ouch
eval("$f('$arg');"); // ouch
$newF = create_function('', "$f('$arg');");
$newF(); // ouch
The only kind of sandbox that will give you 100% security (well, 99.9%...) is a virtual machine you can just throw away afterwards.

We use the tokenizer to analyze code statically, and also to rewrite the code so that it performs runtime checks for certain things. All of this is built on the tokenizer and scripts based on it. Since it is the same tokenizer PHP itself uses, your odds are better than with a parser you write yourself.
I've seen people using regexes to try to analyze a language. This is a really bad idea.
But ...
Since PHP is a pretty stupid-easy grammar, and you have access to the tokenizer, you can actually stop most of the badness by disallowing variable functions, and only allowing a small number of whitelisted functions to be called. If you don't need OOP, even better.
However, we don't feel confident enough that we nailed 100% of the problems, and we use this to power a sandbox for the backend users who are paying customers, not every user on planet earth with a keyboard, and perhaps malice.
I too think the people here who poo-poo the idea 100% as "bad practice" need to get a clue. There are reasons to do this.
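A minimal sketch of the kind of static check described above, assuming PHP 7.1+ with the tokenizer extension. The function name find_violations and the ALLOWED_FUNCTIONS list are hypothetical, and this is not the checker we actually run. It rejects variable functions, OOP constructs, namespaced names, and any call that is not on the allowlist. It is deliberately conservative (it also rejects user-defined functions) and it is not a complete sandbox on its own.

```php
<?php
// Hypothetical allowlist. Keep callback-taking functions (array_map, usort, ...) off it.
const ALLOWED_FUNCTIONS = ['strlen', 'strtoupper', 'abs', 'max', 'min'];

/**
 * Returns a list of reasons to reject the snippet; an empty list means
 * nothing was flagged. This is a static pre-check only.
 */
function find_violations(string $code): array
{
    // token_get_all() only tokenizes PHP that follows an opening tag.
    $tokens = token_get_all('<?php ' . $code);
    $banned = [T_EVAL, T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE,
               T_EXIT, T_NEW, T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_NS_SEPARATOR];
    $skippable = [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT];
    $violations = [];

    foreach ($tokens as $i => $tok) {
        // Find the next token that is not whitespace or a comment.
        $j = $i + 1;
        while (isset($tokens[$j]) && is_array($tokens[$j])
               && in_array($tokens[$j][0], $skippable, true)) {
            $j++;
        }
        $isCall = isset($tokens[$j]) && $tokens[$j] === '(';

        if (!is_array($tok)) {
            // Single-character tokens arrive as plain strings.
            if ($tok === '`') {
                $violations[] = 'backtick shell execution';
            } elseif ($isCall && in_array($tok, [')', ']', '}'], true)) {
                $violations[] = 'call on a computed expression';
            }
            continue;
        }

        [$id, $text] = $tok;
        if (in_array($id, $banned, true) || strpos(token_name($id), 'T_NAME_') === 0) {
            // T_NAME_* tokens are how PHP 8 reports namespaced names.
            $violations[] = 'disallowed construct: ' . token_name($id);
        } elseif ($isCall && ($id === T_VARIABLE || $id === T_CONSTANT_ENCAPSED_STRING)) {
            $violations[] = "variable or string function call: $text";
        } elseif ($isCall && $id === T_STRING
                  && !in_array(strtolower($text), ALLOWED_FUNCTIONS, true)) {
            $violations[] = "function not on allowlist: $text";
        }
    }
    return $violations;
}
```

With this sketch, 'return strlen($s) + abs(-3);' comes back with no violations, while '$f = "shell_exec"; $f("ls");' is flagged as a variable function call. Rejecting when in doubt is the design choice here: a false positive costs a user an error message, a false negative costs you the server.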

You cannot reliably sanitize the user input - a determined hacker will find some obscure way to circumvent your sanitization code.
Sandboxing could be possible, but is equally crippling. If you really want to be safe, you should create a sandbox for each call. After all, someone could execute bogus code that is harmful to all other users of your sandbox.
I don't think you really want to allow that. Think of it this way: you are providing programmatic access to the server!

You could try using Quercus, a Java based PHP interpreter, to create a safe sandboxed PHP environment. You can do the same for JavaScript using Rhino, so I think it might be possible with Quercus.

You could consider creating a custom(ized) language that your users can make use of. Then it's up to you to create the library of supported functions that could very well be just wrappers of PHP's native functions. But even then, making it hack-proof or simply working is a tedious job at best. Perhaps you should re-evaluate why you want users to have code access in the first place? I'd love to help out if you need someone to discuss this with (or update your question, I guess? :)
Hope you can work it out!
-Dave

There is a class on GitHub that may help. It is in its early stages but looks promising.
https://github.com/fregster/PHPSandbox

Overall bad idea and too dangerous IMO, no matter what protections you put into place. Better create a pseudo-language limited to exactly what users are allowed to do.

Related

PHP Performance Question (readable code or not, web app)

I wondered if I should write my code clean and readable or rather small and unreadable... Or should I write it readable and then compress it afterwards when I'm publishing it on the web?
Ps. I'm building a web app,
the faster, the better!
Thanks_

I think you are greatly underestimating PHP's performance if you think this will affect it.
Write clean, readable code. In fact write code as if the next guy to maintain it is a sociopath that knows where you live.
Edit: In response to AESM's comment: not in any way that matters. Also, you can edit your question if you want to expand on it, instead of leaving a comment.

PHP parses the code before executing it. The first stage is tokenization, which throws out all comments and whitespace and converts all identifiers to tokens. This means that neither meaningful names nor sensible comments and clean formatting will have any effect at runtime. In fact, all the speed effects you seem to expect from compression are already lost during tokenization.
If clean coding gives you "bigger" source files, then tokenization will indeed take longer. However, this effect is barely measurable compared to actual parsing and execution.
If you feel you want to optimize at that point, please consider using eaccelerator, which makes an actual difference.
greetz
back2dos
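To see this for yourself, here is a small sketch using token_get_all() from the tokenizer extension. The helper name significant_tokens and the two sample sources are made up for illustration. It drops whitespace and comment tokens, which is roughly what the parser gets to see, and compares a readable version of a function with a "compressed" one.

```php
<?php
/**
 * Returns the text of every token except whitespace and comments.
 */
function significant_tokens(string $source): array
{
    $out = [];
    foreach (token_get_all($source) as $tok) {
        if (!is_array($tok)) {
            $out[] = $tok; // single-character tokens such as ";" or "{"
            continue;
        }
        if (in_array($tok[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            continue;
        }
        // The opening tag token carries one trailing whitespace character.
        $out[] = $tok[0] === T_OPEN_TAG ? trim($tok[1]) : $tok[1];
    }
    return $out;
}

$readable = <<<'PHP'
<?php
// Add two numbers and return the sum.
function add($a, $b)
{
    return $a + $b; // the sum
}
PHP;

$compact = '<?php function add($a,$b){return $a+$b;}';

var_dump(significant_tokens($readable) === significant_tokens($compact)); // bool(true)
```

Both versions reduce to the same token sequence, so anything that runs after tokenization cannot tell them apart.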

"Programs must be written for people to read, and only incidentally for machines to execute."

I'd say, write a clean/readable code and then eliminate the bottlenecks, if needed.

What is your experience of PHP encrypters? Which one would you recommend?

We have an application that is written in PHP that we are going to license to a customer. Our company believes that the customer might intend to steal the source code and create their own fork of the software, therefore we want to encrypt the source code.
I have searched for PHP encrypters and found several that seem good, but since we have no previous experience with them it is hard to say which one is the best. Which PHP encrypters have you used, and what is your experience?

So, First:
It is impossible to encrypt your entire code base because at some point there has to be an eval statement, and if the user changes the eval to an echo, they get all of your code in the browser.
And here is a bunch of people who agree with me.
Furthermore:
People will offer you obfuscators, but no amount of obfuscation can prevent someone from getting at your code. None. If your computer can run it, or in the case of movies and music if it can play it, the user can get at it. Even compiling it to machine code just makes the job a little more difficult. If you use an obfuscator, you are just fooling yourself. Worse, you're also disallowing your users from fixing bugs or making modifications. - Schwern
Now that's done:
Byte-compiling is something completely different from encrypting. It turns the PHP code into precompiled bytecode, loosely comparable to an exe file. You can include these files just like any other PHP file.
The bytecode produced can be reverse engineered, but it would take lots of time and is not worth the company's time.
Check out the byte compiler PHP extension.
I'd also like to note that PHP comes with several ways of reverse engineering classes, such as the ReflectionClass class. This basically allows people to see every method, property, and constant in each of your classes without needing your source code.
Frankly, once someone sees the functions you use, it is pretty easy to piece it together after that.
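As a rough illustration of the Reflection point, here is a short sketch. ImageLicenseChecker is a made-up class standing in for code whose source the customer does not have, and describe_class is a hypothetical helper. ReflectionClass, getConstants(), getProperties() and getMethods() are the standard Reflection API.

```php
<?php
// Hypothetical class standing in for shipped code without readable source.
class ImageLicenseChecker
{
    const MAX_SEATS = 5;

    private $secretKey = 'value-is-not-listed-here';

    public function validate(string $licenseKey): bool
    {
        return $licenseKey !== '';
    }

    private function phoneHome(): void
    {
    }
}

/**
 * Lists the names of a class's constants, properties and methods,
 * including private ones.
 */
function describe_class(string $className): array
{
    $ref = new ReflectionClass($className);

    return [
        'constants'  => array_keys($ref->getConstants()),
        'properties' => array_map(function (ReflectionProperty $p) {
            return $p->getName();
        }, $ref->getProperties()),
        'methods'    => array_map(function (ReflectionMethod $m) {
            return $m->getName();
        }, $ref->getMethods()),
    ];
}

print_r(describe_class('ImageLicenseChecker'));
```

The listing includes the private property and the private method by name, which is usually enough to guess what the class does.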

There are a lot of obfuscators out there masquerading as encrypters.
If you really must encrypt your code use Zend.
IMHO shutting your customers out of your code is inherently evil. I would rather hide some identifying markers in the code and sell it under a no-modify/no-resell contract, then sue the ass off them if they try to sell it on. You could argue that encrypting your code closes down a business opportunity ;) !
C.

Interpret text input as PHP

I want to let users test out a PHP class of mine, that among other things crops and resizes images.
I want them to write PHP code in a text field, send the form, and then their code will be run. How can I do this?
Or are there other safe ways to let users (anyone) demo a PHP class?

I would spawn the PHP process using a user account with next-to-no permissions. Give it read-write access to a single directory, but that's it.
You're still opening yourself to DoS attacks with infinite loops and such, but if you absolutely must do it, then run the code in this very-low-permissions sandbox like IE and Chrome do.
Using EVAL is probably the worst idea.
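A minimal sketch of the "separate process" part of this, assuming PHP 7.4+ on a POSIX system and that the calling script runs under the CLI (so that PHP_BINARY points at a php executable). The function name run_isolated and the specific limits are my own assumptions. The sketch does not switch user accounts; you would still have to arrange for the child to run as the low-permission user at the OS level. It only shows running the snippet outside your own process with a hard time cap against infinite loops.

```php
<?php
/**
 * Runs a PHP snippet in a child process and kills it after a timeout.
 * Returns the captured stdout and whether the time cap was hit.
 */
function run_isolated(string $code, int $timeoutSeconds = 2): array
{
    $file = tempnam(sys_get_temp_dir(), 'snip');
    file_put_contents($file, "<?php\n" . $code);

    $cmd = [
        PHP_BINARY,
        '-n',                                  // ignore php.ini
        '-d', 'memory_limit=16M',
        '-d', 'disable_functions=exec,shell_exec,system,passthru,popen,proc_open',
        $file,
    ];

    // Only stdout is captured; array commands need PHP 7.4+.
    $proc = proc_open($cmd, [1 => ['pipe', 'w']], $pipes);
    if (!is_resource($proc)) {
        unlink($file);
        throw new RuntimeException('could not start child process');
    }
    stream_set_blocking($pipes[1], false);

    $output = '';
    $timedOut = false;
    $deadline = microtime(true) + $timeoutSeconds;

    do {
        $output .= stream_get_contents($pipes[1]);
        $status = proc_get_status($proc);
        if ($status['running'] && microtime(true) > $deadline) {
            proc_terminate($proc, 9); // SIGKILL
            $timedOut = true;
            break;
        }
        usleep(10000);
    } while ($status['running']);

    $output .= stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($proc);
    unlink($file);

    return ['output' => $output, 'timed_out' => $timedOut];
}
```

For example, run_isolated('echo 1 + 1;') should come back with the snippet's output, and run_isolated('while (true) {}', 1) should report timed_out as true after about a second. The disable_functions list here is only a starting point, not a complete lockdown.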

Yes. You can use the eval statement in PHP (linked earlier by John: http://us2.php.net/manual/en/function.eval.php), but be very careful. You really don't want users to be able to run code on your machine freely.
My recommendation would be to come up with a different approach to demos, perhaps a few flexible examples, but don't let them run code directly.

You could use the eval() function, but don't do it. Seriously.
There's no "safe" way to let any old user run their own PHP on your server. You are exposing yourself to a potential world of hurt.
Instead, provide excellent documentation, code samples, and the source for your own demos, and encourage potential users to try it out on their own test/development servers.

As already stated, you could use the eval function, but it's very dangerous. If you want users to test the code, prepare demo pages that show possible usage and, for instance, let the user supply parameters via HTML forms.

You could possibly use eval http://us2.php.net/manual/en/function.eval.php

Don't think in terms of PHP or another general-purpose language, think of the minimal language that's sufficient to express the operations in your domain of image processing. Users submit expressions in this domain-specific language (DSL), these expressions are parsed on the server side and passed to your code.
The important thing initially is to think about the range of image-processing operations and how they can be combined. That will tell you how expressive the language has to be. Once you've worked that out, there are plenty of choices for how the language would look syntactically. The syntax of the language might depend on a tradeoff between ease of use and ease of parsing.
If you can write or find a parser for expressions of this kind, it might be the easiest for users. Actually, can anyone recommend an existing expression evaluator that would work in cases like this (for example, can Smarty safely run user-submitted expressions?), or indeed a parser generator for PHP?
resize(rotate("foo.png", 90), 50)
A language like this might be less easy for users, but it can be processed using a fairly simple stack machine:
"foo.png" 90 rotate 50 resize
Even easier, an XML-based language like this doesn't need its own parser:
<resize percent="50"><rotate degrees="90"><img src="foo.png"></rotate></resize>
Using a DSL doesn't protect you from domain-specific attacks, for example somebody might use the language above to resize an image to a zillion pixels and use up all the server memory. So there would have to be some sort of runtime environment for the DSL that puts limits on the amount of resources any user-submitted script can use.
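Here is a rough sketch of the stack-machine variant with a couple of limits built in. Everything in it is an assumption for illustration: the function name run_image_program, the two operations, the limits, and the rule that file names contain no whitespace or slashes. It does not touch any image. It only validates the program and returns a plan that your own image class could then carry out.

```php
<?php
// Hypothetical limits; pick values that fit your server.
const MAX_TOKENS = 32;
const MAX_RESIZE_PERCENT = 400;

/**
 * Evaluates a postfix program such as: "foo.png" 90 rotate 50 resize
 * and returns ['file' => ..., 'steps' => [[operation, argument], ...]].
 */
function run_image_program(string $program): array
{
    $tokens = preg_split('/\s+/', trim($program), -1, PREG_SPLIT_NO_EMPTY);
    if (count($tokens) > MAX_TOKENS) {
        throw new InvalidArgumentException('program too long');
    }

    $stack = [];
    foreach ($tokens as $token) {
        if (preg_match('/^"([A-Za-z0-9_.-]+)"$/', $token, $m)) {
            // A quoted file name pushes a fresh image description.
            $stack[] = ['file' => $m[1], 'steps' => []];
        } elseif (preg_match('/^\d{1,4}$/', $token)) {
            $stack[] = (int) $token;
        } elseif ($token === 'rotate' || $token === 'resize') {
            // Each operation pops its numeric argument, then the image.
            $arg = array_pop($stack);
            $image = array_pop($stack);
            if (!is_int($arg) || !is_array($image)) {
                throw new InvalidArgumentException("bad operands for $token");
            }
            if ($token === 'resize' && ($arg < 1 || $arg > MAX_RESIZE_PERCENT)) {
                throw new InvalidArgumentException('resize percentage out of range');
            }
            $image['steps'][] = [$token, $arg];
            $stack[] = $image;
        } else {
            throw new InvalidArgumentException("unknown token: $token");
        }
    }

    if (count($stack) !== 1 || !is_array($stack[0])) {
        throw new InvalidArgumentException('program must leave exactly one image');
    }
    return $stack[0];
}
```

The program "foo.png" 90 rotate 50 resize yields the file foo.png with the steps rotate 90 and resize 50, in that order, while "foo.png" 9999 resize is rejected before any image memory is allocated. Because the user never writes PHP, the only things they can ask for are the operations you chose to implement.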

You can use eval, but not without some precautions.
If your security concerns are merely "cautious" as opposed to "paranoid", you do have a few options:
If you have a dedicated Apache/PHP instance just for this project of yours, set the disable_functions option in php.ini and turn off all file and network related functions. This will affect the entire PHP installation and will break some surprising things, like phpmyadmin.
If you don't have a dedicated server, try 'runkit': http://php.net/manual/en/book.runkit.php to disable functions only within an already running script.
Perhaps more work? Set up a virtual machine (VirtualBox, VMware, etc.) which is aggressively firewalled from within the host OS, with a minimal allocation of memory and disk space, and run the untrusted code there.
If you are paranoid... set up an approval process for all uploaded code.

What are the security concerns of evaluating user code?

I am wondering what security concerns there are to implementing a PHP evaluator like this:
<?php eval($_POST['codeInput']); ?>
This is in the context of making a PHP sandbox so sanitising against DB input etc. isn't a massive issue.
Users destroying the server the file is hosted on is.
I've seen Ruby simulators so I was curious what's involved security wise (vague details at least).
Thanks all. I'm not even sure on which answer to accept because they are all useful.
Owen's answer summarises what I suspected (the server itself would be at risk).
arin's answer gives a great example of the potential problems.
Geoff's answer and randy's answer echo the general opinion that you would need to write your own evaluator to achieve simulation type capabilities.

Don't do that.
They basically have access to anything you can do in PHP (look around the file system, get/set any sort of variables, open connections to other machines to insert code to run, etc.).

The eval() function is hard to sanitize and even if you did there would surely be a way around it. Even if you filtered exec, all you need to do is to somehow glue the string exec into a variable, and then do $variable(). You'd need to really cripple the language to achieve at least some sort of imaginary security.
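To make the "glue the string into a variable" point concrete, here is a tiny sketch. The blocklist function looks_safe is a hypothetical stand-in for the kind of filter people tend to write, and the payload strings are only inspected, never executed.

```php
<?php
/**
 * A naive blocklist filter: rejects code that contains a forbidden word.
 */
function looks_safe(string $code): bool
{
    foreach (['exec', 'system', 'passthru', 'eval'] as $word) {
        if (stripos($code, $word) !== false) {
            return false;
        }
    }
    return true;
}

// The direct call is caught because the word "exec" appears literally.
$direct = 'exec("id");';

// The same call with the name glued together at runtime slips through.
$glued = '$f = "ex" . "ec"; $f("id");';

var_dump(looks_safe($direct)); // bool(false)
var_dump(looks_safe($glued));  // bool(true)
```

The filter only sees text, while PHP resolves the function name at runtime, so any string-based blocklist is working on the wrong level.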

You could potentially be in really big trouble if you eval()'d something like
<?php
eval("shell_exec(\"rm -rf {$_SERVER['DOCUMENT_ROOT']}\");");
?>
It's an extreme example, but in that case your site would just get deleted. Hopefully your permissions wouldn't allow it, but it helps illustrate the need for sanitization and checks.

There are a lot of things you could say.. The concerns are not specific to PHP.
Here's the simple answer:
Any input to your machine (or database) needs to be sanitized.
The code snippet you've posted pretty much lets a user run any code they want, so it's especially dangerous.
There is a pretty good introductory article on code injection here:
Wikipedia on Code Injection.

If you allow arbitrary code to be run on your server, it's not your server any more.

Dear god NO. I cringe even at the title. Allowing users to run any kind of arbitrary code is like handing the server over to them.
I know the people above me already said that. But believe me, you can never be told too many times to sanitize your input.
If you really, really want to allow users to run some kind of code, make a subset of the commands available to them by creating some sort of pseudo-language, à la the way BBCode or Markdown works.

If you are looking to build an online PHP interpreter, you will need to build an actual REPL interpreter and not use eval.
Otherwise, never ever execute arbitrary user code. Ever.

Do NOT allow unfiltered code to be executed on your server, period.
If you'd like to create a tool that allows for interactive demonstration of a language such as the tool seen here: http://tryruby.hobix.com/ I would work on coding a sub portion of the language yourself. Ideally, you'll be using it to demonstrate simple concepts to new programmers, so it's irrelevant if you properly implement all the features.
By doing this you can control the input via a white list of known acceptable input. If the input isn't on the white list it isn't executed.
Best of luck!

As already answered, you need to sanitize your inputs. I guess you could use regex filtering of some kind to remove unwanted commands such as "exec" and basically every malicious command PHP has to offer (or which could be exploited), and that's a lot.
