When (if ever) is eval NOT evil? - php

I've heard in many places that PHP's eval function is often not the answer. In light of PHP 5.3's LSB and closures, we're running out of reasons to depend on eval or create_function.
Are there any conceivable cases where eval is the best (only?) answer in PHP 5.3?
This question is not about whether eval is evil in general, as it obviously is not.
Summary of Answers:
Evaluating numerical expressions (or other "safe" subsets of PHP)
Unit testing
Interactive PHP "shell"
Deserialization of trusted var_export
Some template languages
Creating backdoors for administrators and/or hackers
Compatibility with < PHP 5.3
Checking syntax (possibly not safe)

If you're writing malware and you want to make life hard for the sysadmin who's trying to clean up after you. That seems to be the most common usage case in my experience.

Eric Lippert sums eval up over three blog posts. It's a very interesting read.
As far as I'm aware, the following are some of the only reasons eval is used.
For example, when you are building up complex mathematical expressions based on user input, or when you are serializing object state to a string so that it can be stored or transmitted, and reconstituted later.
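As a rough illustration of the "mathematical expressions based on user input" case, here is a minimal sketch that only hands a string to eval after checking it against a strict character whitelist. The function name evaluate_arithmetic is made up for this example, and it assumes PHP 7 or later (where a syntax error inside eval throws a ParseError). It is not a substitute for a real expression parser.

```php
<?php
/**
 * Hypothetical helper: evaluates plain arithmetic, or returns null.
 * Only digits, whitespace, parentheses, dots and + - * / are accepted,
 * so no identifiers, variables or function calls can reach eval.
 */
function evaluate_arithmetic(string $expr)
{
    if (!preg_match('~^[0-9+\-*/().\s]+$~', $expr)) {
        return null; // anything outside the whitelist is rejected outright
    }
    try {
        return eval('return (' . $expr . ');');
    } catch (Throwable $e) {
        // ParseError for malformed input, DivisionByZeroError on PHP 8, etc.
        return null;
    }
}

$ok  = evaluate_arithmetic('2 + 3 * 4');
$bad = evaluate_arithmetic('system("ls")');
```

The whitelist is the whole point here: the moment letters, quotes or dollar signs are allowed through, this stops being a "safe subset".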

The main problem with eval is it being a gateway for malicious code. Thus you should never use it in a context where it can be exploited from the outside, e.g. user provided input.
One valid UseCase would be in Mocking Frameworks.
Example from PHPUnit_Framework_TestCase::getMock()
// ... some code before
$mock = PHPUnit_Framework_MockObject_Generator::generate(
$originalClassName,
$methods,
$mockClassName,
$callOriginalClone,
$callAutoload
);
if (!class_exists($mock['mockClassName'], FALSE)) {
eval($mock['code']);
}
// ... some code after
There are actually a lot of things happening in the generate method. In layman's terms: PHPUnit takes the arguments to generate and creates a class template from them. It then evals that class template to make the class available for instantiation. The point of this, of course, is to have test doubles that mock dependencies in unit tests.
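The idea can be sketched in a few lines. This is not PHPUnit's generator, just a toy version of the same technique: build the source of a subclass as a string, then eval it so the class exists. The names make_stub_class, Clock and ClockStub are invented for the example.

```php
<?php
// Hypothetical helper, not PHPUnit's API: builds and evals a stub subclass
// whose listed methods return fixed values.
function make_stub_class(string $original, string $stubName, array $returns): string
{
    $identifier = '/^[A-Za-z_][A-Za-z0-9_]*$/';
    foreach (array_merge([$original, $stubName], array_keys($returns)) as $name) {
        if (!preg_match($identifier, (string) $name)) {
            throw new InvalidArgumentException("Not a plain identifier: $name");
        }
    }
    if (!class_exists($stubName, false)) {
        $methods = '';
        foreach ($returns as $method => $value) {
            // A variadic signature keeps the override compatible with a
            // parameterless parent method (and, on PHP 8.0+, with others).
            $methods .= sprintf(
                "public function %s(...\$args) { return %s; }\n",
                $method,
                var_export($value, true)
            );
        }
        eval("class $stubName extends $original {\n$methods}");
    }
    return $stubName;
}

// A dependency we would rather not call for real in a test.
class Clock
{
    public function now()
    {
        return time();
    }
}

$stubClass = make_stub_class('Clock', 'ClockStub', ['now' => 1234567890]);
$stub = new $stubClass();
```

Because the class and method names are only known at runtime, there is no way to write this class declaration ahead of time, which is why mocking frameworks are one of the commonly accepted uses of eval.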

If you are writing a site that interprets and executes PHP code, like an interactive shell would.
...
I'm a systems guy, that's all I got.

You can use eval to create ad-hoc classes:
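function myAutoLoad($sClassName){
    # classic part
    if (file_exists($sClassName.'.php')){
        require $sClassName.'.php';
    } else {
        # the dollar signs of the method parameters are escaped so that they
        # reach eval as variables instead of being interpolated here
        eval("
        class $sClassName{
            public function __call(\$sMethod,\$aArgs){
                return 'No such class: $sClassName';
            }
        }");
    }
}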
Although, of course, usage is quite limited (some API's or maybe DI containers, testing frameworks, ORMs which have to deal with databases with dynamic structure, code playgrounds)

eval is a construct that can be used to check for syntax errors.
Say you have these two PHP scripts:
script1.php
<?php
// This is a valid syntax
$a = 1;
script2.php
<?php
// This is an invalid syntax
$a = abcdef
You can check for syntax errors using eval:
$code1 = 'return true; ?>'.file_get_contents('script1.php');
$code2 = 'return true; ?>'.file_get_contents('script2.php');
echo eval($code1) ? 'script1 has valid syntax' : 'script1 has syntax errors';
echo eval($code2) ? 'script2 has valid syntax' : 'script2 has syntax errors';
Unlike php_check_syntax (which is deprecated and removed anyway), the code will not be executed.
EDIT:
The other (preferred) alternative being php -l. You can use the solution above if you don't have access to system() or shell execution commands.
This method can inject classes/functions in your code. Be sure to enforce a preg_replace call or a namespace before doing so, to prevent them from being executed in subsequent calls.
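One caveat with the snippet above: since PHP 7, a syntax error inside eval throws a ParseError instead of making eval return false. A minimal sketch of the same check for newer versions could look like this (the function name has_valid_syntax is made up for the example):

```php
<?php
/**
 * Hypothetical helper: returns true if $source (a full PHP file,
 * including its opening tag) parses, false otherwise. Assumes PHP 7+.
 */
function has_valid_syntax(string $source): bool
{
    try {
        // "return true;" stops execution before any of the file's
        // statements run; the closing tag hands control back to the
        // file's own opening tag.
        eval('return true; ?>' . $source);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

$valid   = has_valid_syntax("<?php\n// This is a valid syntax\n\$a = 1;");
$invalid = has_valid_syntax("<?php\n// This is an invalid syntax\n\$a = abcdef");
```

The warning about injected classes and functions still applies: unconditional top-level declarations in the checked code are bound when the code is compiled, even though none of its statements are executed.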
As for the OP topic: When (if ever) is eval NOT evil? eval is simply not evil. Programmers are evil for using eval for no reason. eval can shorten your code (mathematical expression evaluation, for example).

I've found that there are times when most features of a language are useful. After all, even GOTO has had its proponents. Eval is used in a number of frameworks and it is used well. For example, CodeIgniter uses eval to distinguish between class hierarchy of PHP 4 and PHP 5 implementations. Blog plugins which allow for execution of PHP code definitely need it (and that is a feature available in Expression Engine, Wordpress, and others). I've also used it for one website where a series of views are almost identical, but custom code was needed for each and creating some sort of insane rules engine was far more complicated and slower.
While I know that this isn't PHP, I found that Python's eval makes implementation of a basic calculator much simpler.
Basically, here's the question:
Does eval make it easier to read? One of our chief goals is communicating to other programmers what was going through our head when we wrote this. In the CodeIgniter example it is very clear what they were trying to accomplish.
Is there another way? Chances are, if you're using eval (or variable variables, or any other form of string look-up or reflection syntax), there is another way to do it. Have you exhausted your other options? Do you have a reasonably limited input set? Can a switch statement be used?
Other considerations:
Can it be made safe? Is there a way that a stray piece of code can work its way into the eval statement?
Can it be made consistent? Can you, given an input, always and consistently produce the same output?

An appropriate occasion (given the lack of easy alternatives) would be when trusted data was serialized with var_export and it's necessary to unserialize it. Of course, it should never have been serialized in that fashion, but sometimes the error is already done.
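For reference, the round trip looks like this. It is only a sketch and assumes the exported string really does come from a trusted place, since whatever is in it gets executed:

```php
<?php
$config = ['debug' => true, 'hosts' => ['a', 'b'], 'ratio' => 0.5];

// var_export with true as second argument returns PHP source code as a string.
$exported = var_export($config, true);

// The only way to turn that source back into a value is to run it.
$restored = eval('return ' . $exported . ';');
```

If the data is written to a file instead, the same effect is available without eval by storing `<?php return ...;` and reading it back with include.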

I suppose eval should be used where the code actually needs to be compiled. I mean cases like template file compilation (template language into PHP for the sake of performance), plugin hook compilation, compilation for performance reasons, etc.

You could use eval to create a setup for adding code after the system installed. Normally if you would want to change the code on the server you would have to add/change existing PHP files. An alternative to this would be to store the code in a database and use eval to execute it. You'd have to be sure that the code added is safe though.
Think of it like a plugin, just one that can do about anything...
You could think of a site that would allow people to contribute code snippets that the users could then dynamically add into their web pages - without them actually persisting code on the webservers filesystem. What you would need is an approval process though...

This eval debate is actually one big misunderstanding in the context of PHP. People are brainwashed about eval being evil, but usually they have no problem using include, although include is essentially the same thing. include foo is the same as eval file_get_contents foo, so every time you include something you commit the mortal sin of eval.
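To make the comparison concrete, here is a small sketch using a throwaway temp file. The one detail the slogan skips is that eval starts in PHP mode while include starts in HTML mode, so the eval version needs a leading closing tag:

```php
<?php
// Write a tiny PHP file to a temporary location (illustrative only).
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, "<?php\nreturn 6 * 7;\n");

$viaInclude = include $file;

// The leading closing tag switches eval out of PHP mode, so the file's
// own opening tag is handled the same way include would handle it.
$viaEval = eval('?>' . file_get_contents($file));

unlink($file);
```

The two are not identical in every respect (things like __FILE__ and opcode caching differ), but in terms of "running whatever is in that string", they are the same operation.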

Compatibility. It's quite frequent to provide PHP4 fallbacks. But likewise it's a possible desire to emulate PHP5.4 functionality in 5.3, for example SplString. While simply providing two include variants (include.php4 vs. include.php5) is common, it's sometimes more efficient or readable to resort to eval():
$IMPL_AA = PHP_VERSION >= 5 ? "implements ArrayAccess" : "";
eval(<<<END
class BaseFeature $IMPL_AA {
Where in this case the code would work on PHP4, but expose the nicer API/syntax only on PHP5. Note that the example is fictional.

I've used eval when I had a php-engined bot that communicated with me and I could tell it to do commands via EVAL: php commands here. Still evil, but if your code has no idea what to expect (in case you pull a chunk of PHP code from a database) eval is the only solution.

So, this should hold true for all languages with eval:
Basically, with few exceptions, if you are building the value passed to eval or getting it from an untrusted source, you are doing something wrong. The same holds true if you are calling eval on a static string.
Beyond the performance problems with initializing the parser at runtime, and the security issues, you generally mess with the type system.
More seriously, it's just been shown that in the vast majority of cases, there are much more elegant approaches to the solution. However, instead of banning the construct outright, it's nice to think of it as one might goto. There are legitimate uses for both, but it is a good red flag that should get you thinking about if you are approaching the problem the correct way.
In my experience, I've only found legitimate uses that fall in the categories of plugins and privileged user (for instance, the administrator of a website, not the user of such) extensions. Basically things that act as code coming from trusted sources.

Not direct use but the /e modifier to preg_replace utilizes eval and can be quite handy. See example #4 on http://php.net/preg_replace.
Whether or not it's evil/bad is subjective and depends entirely on what you consider "good" in a specific context. When dealing with untrusted inputs it is usually considered bad. However, in other situations it can be useful. Imagine writing a one-time data conversion script under extreme deadline pressure. In this situation, if eval works and makes things easier, I would have trouble calling it evil.
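Note that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0. The usual replacement is preg_replace_callback, which gets the same result without evaluating a string. A sketch along the lines of the manual's "uppercase the HTML tags" example (the pattern and input here are written for this illustration, not copied from the manual):

```php
<?php
$html = '<p>Hello <b>world</b></p>';

// Uppercase every tag name; the callback receives the captured groups
// instead of having them spliced into a string of PHP code.
$result = preg_replace_callback(
    '~(</?)(\w+)([^>]*>)~',
    function (array $m): string {
        return $m[1] . strtoupper($m[2]) . $m[3];
    },
    $html
);
```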

Related

PHP internal development question, want to extend the curly syntax with quoting functionality

I know this is a bit offtopic to SO given that I have no code and my question is abstract.
I want to add a functionality into the PHP language:
Existing functionality:
$var=123;
$string = "This is a {$var} variable";
Wanted new functionality for SQL variables:
$var="asd' union ()";
$string = "This is a {(QUOTE)$var} quoted variable";
Or:
$var="asd' union ()";
$string = "This is a {QUOTE($var)} quoted variable";
The idea is to extend the string curly syntax to support either a function or some hardcoded function to quote variables.
My question is:
Is there a way to write a php module/extension that will provide such functionality ?
If so, where do I have to look to get started quickly ?
Solution
function _quote($v) { return strtoupper($v); }
$_Q = '_quote';
$string = "This is an {$_Q($var)} integer";
echo $string;
I guess that's closest of what I want without having to hack php itself
Generally speaking your presented use case is the least useful, since you should be using parameterized queries anyway. You might be after solving the wrong problem. However, to extend the language itself one would have to modify the parser. That entails knowing YACC and C as well as some knowledge of PHP internals. This is by no means a trivial thing in your case particularly because you aren't just modifying how PHP parses the string, but also how it compiles the opcodes that entail function execution.
You can read a detailed blog post on how to approach your idea in Nikita's blog if you're curious. Though I'd posit there is a much easier solution to your immediate problem of escaping strings in SQL than trying to modify the entire PHP language.
Giving a more lengthy explanation of the innards of how the PHP parser works in this answer would be exhaustive. Which is why I'm linking to the most technically correct explanation of the problem that I'm aware of.
Important side note
Just to be clear this cannot be an extension. You aren't extending PHP. You are literally changing it. Because the PHP Parser has to be compiled from itself (yes, we compile the compiler), you can't simply provide a shared object to do this. You have to build the entire php-src tree from scratch including the parser itself (which is prepackaged with php normally).
The consequence of this is that you have now forked your own implementation of the PHP language that is no longer compatible with de facto PHP.
In other words... you own it for life.
Please please please use prepared statements for SQL. Also, read this: https://blog.codinghorror.com/give-me-parameterized-sql-or-give-me-death/
Disliking prepared statements is not a reason to not use them, instead maybe find a trusted library that handles them better, does it differently, or maybe even switch to an ORM.
To focus more on your question, it might be easier to accomplish what you want by using something like Mustache, forking it, and layering in your customization. https://github.com/bobthecow/mustache.php
I highly doubt you will be able to build a module to provide this sort of functionality. Most modules provide a very specific/isolated additional functionality. I can't think of a module that modifies how native language constructs behave. To do that, you'd need to modify PHP Core itself, which means writing some C, and then deploying a PHP version that isn't compatible with anything else out there.
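For completeness, this is roughly what the prepared-statement route looks like with PDO. The sketch uses an in-memory SQLite database and a made-up users table so that it runs on its own; it assumes the pdo_sqlite extension is available. With MySQL only the DSN changes.

```php
<?php
// In-memory database and table are assumptions made for the example.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice')");

// The value from the question: hostile-looking input, bound as data.
$var = "asd' union ()";

$stmt = $pdo->prepare('SELECT COUNT(*) FROM users WHERE name = :name');
$stmt->execute([':name' => $var]);
$count = (int) $stmt->fetchColumn();
```

The value never becomes part of the SQL text, so there is nothing to quote and no string syntax to extend.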

Does PHP read functions before they are called?

I declare 100 functions, but I don't actually call any of them. Will having so many functions defined affect loading time?
Does PHP process these functions before they are called?
Yes, PHP parses all functions when the script is loaded, checks them for syntax errors (though it does not execute them at that point), and registers their names as symbols. When you call one of the functions, PHP looks the name up in the function symbol table and then executes that function.
So it is better to define only the functions you actually need, since every definition increases the size of the symbol table.
Just to be clear, even having hundreds of unused classes and functions is not going to make much difference to the performance of your program. Some difference, yes maybe, but not much. Improving the code that is being run will make a bigger difference. Don't worry about optimising for language mechanics until you've got your own code perfect. The key to performance optimisation is to tackle the biggest problems first, and the biggest problems are very rarely caused by subtle language quirks.
If you do want to minimise the effect of loading too much code that isn't going to be used, the best way to do this is to use PHP's autoloading mechanism.
This probably means you should also write your code as classes rather than stand-alone functions, but that's a good thing to do anyway.
Using an autoloader means that you can let PHP do the work of loading the code it needs when it needs it. If you don't use a particular class, then it won't be loaded, but on the other hand it will be there when you need it without you having to do an include() or anything like that.
This setup is really powerful and eliminates any worries about having too much code loaded, even if you're using a massive framework library.
Autoloading is too big a topic for me to explain in enough detail in an answer here, but there are plenty of resources on the web to teach it. Alternatively, use an existing one -- pretty much all frameworks have an autoloader system built-in, so if you're using any kind of modern PHP framework, you should be able to use theirs.
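To give a feel for it, here is a bare-bones sketch using spl_autoload_register. The directory layout (one class per file, file named after the class) and the DemoGreeter class are assumptions for the example; real projects usually follow PSR-4 and let Composer generate the loader.

```php
<?php
// Set up a throwaway "library" directory with one class file in it.
$dir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    "$dir/DemoGreeter.php",
    "<?php\nclass DemoGreeter { public function hello() { return 'hi'; } }\n"
);

$loaded = [];
spl_autoload_register(function (string $class) use ($dir, &$loaded): void {
    $file = $dir . '/' . $class . '.php';
    if (is_file($file)) {
        $loaded[] = $class; // record what was loaded, for the demo
        require $file;
    }
});

// Nothing has been parsed yet: the class file is untouched.
$before = class_exists('DemoGreeter', false);

// First use triggers the autoloader, which requires the file.
$greeter = new DemoGreeter();
$greeting = $greeter->hello();

// Clean up the temporary files.
unlink("$dir/DemoGreeter.php");
rmdir($dir);
```

Classes that are never referenced never hit the callback, so their files are never read or parsed.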

Eval piece of code stored in DB as string

I'm moving part of my application from PHP to Go. I'm storing some pieces of code to eval in MySQL, for example: checkGeo('{geo:["DE","AU","NL"]}') && checkOs('{os:["android"]}'). In PHP it is easy, just eval($stringToEval), but how can this be done in Go?
In an interpreted language like PHP, implementing eval is fairly simple. But Go is a compiled language. To implement eval in Go would require writing an interpreter for Go. This would not be impossible, but it would be a big job.
-Edit
You can have a look at https://godoc.org/bitbucket.org/binet/go-eval/pkg/eval which might do what you want. If it doesn't you could then maybe expand on it a bit. It isn't a full interpreter though.
Based on the examples you gave it seems like it'd be trivial-ish to build a bit of go code that knows how to evaluate checkGeo() or checkOs() rules; I think that'd be the best approach.
But that's not what you asked...
Another option would be to write the rules in Lua and run them with https://github.com/aarzilli/golua or in Javascript and use https://github.com/robertkrimen/otto
You don't need to pull in a full-blown interpreter for this sort of thing: write a simple parser that would pull your script apart into the syntax tree, and then write code that would walk that tree and "evaluate" it. It's not really that hard for simplistic cases like yours. And of course, your syntax might be made way simpler than PHP's since you don't want the full power of PHP's evaluator.
One simple example is rpn, but you can go simpler and invent a way to store your queries in, say, JSON.
Also note that Go has a Go parser in the form of a Go package — go/parser so you can write your queries using (minimal) Go syntax, parse them with go/parser and only implement an evaluator which would walk the AST produced by the parser and calculate the result. But I think this would be overengineering, given the example you've provided.
And a minor nitpick: storing the code which is to be evaluated by a full-blown evaluator, like PHP, is dangerous: if someone somehow manages to inject a call to exec() or something like this in your table, the result will be suboptimal. So having a primitive parser/evaluator is an upside from the security standpoint as well.
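The "store the rules as JSON and write a tiny evaluator" idea can be sketched on the PHP side, where the rules currently live; the same shape carries over to Go with encoding/json and a map of check functions. The rule format, the check names and rule_matches are all invented for this example.

```php
<?php
// Whitelisted checks: the stored rule can only name one of these keys.
$checks = [
    'geo' => function (array $allowed, array $ctx): bool {
        return in_array($ctx['geo'] ?? null, $allowed, true);
    },
    'os' => function (array $allowed, array $ctx): bool {
        return in_array($ctx['os'] ?? null, $allowed, true);
    },
];

/**
 * Hypothetical evaluator: every key in the rule must be a known check,
 * and all checks must pass (an implicit AND).
 */
function rule_matches(string $json, array $ctx, array $checks): bool
{
    $rule = json_decode($json, true);
    if (!is_array($rule)) {
        return false; // not valid JSON, or not an object
    }
    foreach ($rule as $name => $allowed) {
        if (!isset($checks[$name]) || !is_array($allowed)) {
            return false; // unknown check or malformed value
        }
        if (!$checks[$name]($allowed, $ctx)) {
            return false;
        }
    }
    return true;
}

// What would be stored in MySQL instead of a code string.
$stored = '{"geo":["DE","AU","NL"],"os":["android"]}';

$hit  = rule_matches($stored, ['geo' => 'DE', 'os' => 'android'], $checks);
$miss = rule_matches($stored, ['geo' => 'DE', 'os' => 'ios'], $checks);
```

A row that somehow ends up containing exec(...) is simply an invalid rule here, not a command.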

Implementing a DSL in PHP

I have an idea. I want to give our client the ability to specify pricing based on a number of variables by writing some simple code like this:
if customer.zip is "37208"
return 39.99
else
return 59.99
And in my code, I'd do something like this:
try {
$variables = array('customer' => array('zip' => '63901'));
$code = DSL::parse(DSL::tokenize($userCode));
$returnValue = DSL::run($code, $variables);
} catch (SyntaxErrorException $e) {
...
}
I guess what I'm wanting is to create a simple DSL in PHP that allows our customer to have a great deal of flexibility in setting pricing without having to have us code each and every case.
Here's the basic idea:
I would provide an array of variables and the code that the customer wrote.
The parser would evaluate the code that the user wrote using the variables provided and return to me the value that our customer returned. It would throw exceptions for any syntax errors, etc.
I would then use the returned value in the normal logic of the application.
So do you know of any resources or frameworks for building a simple DSL in PHP? Any ideas where to begin?
Thanks!
Technical limitations aside, you might want to really think twice about giving this kind of programming power to (I presume) non-programmers. They will probably mess up in completely unpredictable ways and you'll be the one having to clean up the mess. At least guard it with lots of tests. And possibly legalese as well.
But you asked a question, so I'll try to answer that. There is a distinction to be made between internal-style DSLs (what most people mean when they use the word DSL) and external-style DSLs (which are more like a mini language). Ruby is famous for having a syntax that lends itself well to internal-style DSLs. PHP, on the other hand, is quite bad in that regard.
That said, you can still do some stuff in PHP - The simplest is perhaps to just write up a library of functions and then have your customers write code in plain PHP, using that library. You would have to audit the code of course, but it would give all the benefits of using an existing runtime.
If that's not fancy enough, you will have to dig into the heavy stuff. First you need a parser. If you know how, they can be hand-written fairly easily, but unless you were forced to write one in school or you have a strange hobby of writing that kind of stuff just for fun (I do), it's probably going to take you a bit of work. The basic components of a parser are a tokenizer and some kind of automaton (state machine) that arranges the tokens into a tree structure (an AST).
Once you have your parsed structure, you need to evaluate it. Since this is a DSL, the number of features are limited and performance is probably not your biggest concern, you could write some object oriented code around the AST and leave it at that. Otherwise you have options like writing some sort of interpreter or cross-compile it into another format (PHP would be an obvious choice).
The tricky part all the way through this is mostly in handling edge cases, such as syntax errors, and reporting something meaningful back to the user. Again, just giving them access to a subset of PHP will give you that for free, so consider that first.
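To show the tokenizer/evaluator split at its very smallest, here is a sketch that handles exactly one statement shape, the if/else from the question. It is deliberately rigid: there is no AST, no nesting, and the tokenizer silently skips characters it does not recognise, all of which a real implementation would need to address. The function names are made up.

```php
<?php
// Hypothetical mini-DSL. Supported shape only:
//   if <dotted.path> is "<string>" return <number> else return <number>

function dsl_tokenize(string $code): array
{
    // Quoted strings, numbers, and identifiers (dots allowed for paths).
    preg_match_all('/"[^"]*"|\d+(?:\.\d+)?|[A-Za-z_][A-Za-z0-9_.]*/', $code, $m);
    return $m[0];
}

function dsl_lookup(string $path, array $vars)
{
    // Walk "customer.zip" down through the nested variables array.
    foreach (explode('.', $path) as $key) {
        if (!is_array($vars) || !array_key_exists($key, $vars)) {
            throw new InvalidArgumentException("Unknown variable: $path");
        }
        $vars = $vars[$key];
    }
    return $vars;
}

function dsl_run(array $t, array $vars): float
{
    $wellFormed = count($t) === 9
        && $t[0] === 'if' && $t[2] === 'is' && $t[4] === 'return'
        && $t[6] === 'else' && $t[7] === 'return'
        && $t[3][0] === '"'
        && is_numeric($t[5]) && is_numeric($t[8]);
    if (!$wellFormed) {
        throw new InvalidArgumentException('Syntax error');
    }
    $expected = trim($t[3], '"');
    return dsl_lookup($t[1], $vars) === $expected ? (float) $t[5] : (float) $t[8];
}

$userCode = "if customer.zip is \"37208\"\nreturn 39.99\nelse\nreturn 59.99";
$tokens = dsl_tokenize($userCode);
$price = dsl_run($tokens, ['customer' => ['zip' => '63901']]);
```

Even at this size, most of the code is about rejecting bad input, which matches the point above about edge cases being the hard part.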
If anyone else is looking for another option - consider using Twig for creating the DSL/parsing (http://twig.sensiolabs.org/) which is integrated to the Pico CMS (http://pico.dev7studios.com/#).
The standard approach to building a DSL parser is to employ a parser generator aka a compiler-compiler to do the heavy lifting. This allows the developer to express the DSL in an abstract BNF-ish syntax, and not have to get into the nitty gritty of parsing and lexing.
Examples include Yacc in C, Regexp::Grammars in Perl, and ANTLR, which targets Java and several other languages, etc. The PHP option appears to be PHP-PEG.

Are global variables in PHP considered bad practice? If so, why? [duplicate]

This question already has answers here:
Stop using `global` in PHP
(6 answers)
Closed 4 months ago.
function foo () {
global $var;
// rest of code
}
In my small PHP projects I usually go the procedural way. I generally have a variable that contains the system configuration, and when I need to access this variable in a function, I do global $var;.
Is this bad practice?
When people talk about global variables in other languages it means something different to what it does in PHP. That's because variables aren't really global in PHP. The scope of a typical PHP program is one HTTP request. Session variables actually have a wider scope than PHP "global" variables because they typically encompass many HTTP requests.
Often (always?) you can call member functions in methods like preg_replace_callback() like this:
preg_replace_callback('!pattern!', array($obj, 'method'), $str);
See callbacks for more.
The point is that objects have been bolted onto PHP and in some ways lead to some awkwardness.
Don't concern yourself overly with applying standards or constructs from different languages to PHP. Another common pitfall is trying to turn PHP into a pure OOP language by sticking object models on top of everything.
Like anything else, use "global" variables, procedural code, a particular framework and OOP because it makes sense, solves a problem, reduces the amount of code you need to write or makes it more maintainable and easier to understand, not because you think you should.
Global variables if not used carefully can make problems harder to find. Let's say you request a php script and you get a warning saying you're trying to access an index of an array that does not exist in some function.
If the array you're trying to access is local to the function, you check the function to see if you have made a mistake there. It might be a problem with an input to the function so you check the places where the function is called.
But if that array is global, you need to check all the places where you use that global variable, and not only that, you have to figure out in what order those references to the global variable are accessed.
If you have a global variable in a piece of code it makes it difficult to isolate the functionality of that code. Why would you want to isolate functionality? So you can test it and reuse it elsewhere. If you have some code you don't need to test and won't need to reuse then using global variables is fine.
I agree with the accepted answer. I would add two things:
Use a prefix so you can immediately identify it as global (e.g. $g_)
Declare them in one spot, don't go sprinkling them all around the code.
Who can argue against experience, college degrees, and software engineering? Not me. I would only say that in developing object-oriented single page PHP applications, I have more fun when I know I can build the entire thing from scratch without worrying about namespace collisions. Building from scratch is something many people do not do anymore. They have a job, a deadline, a bonus, or a reputation to care about. These types tend to use so much pre-built code with high stakes, that they cannot risk using global variables at all.
It may be bad to use global variables, even if they are only used in the global area of a program, but let's not forget about those who just want to have fun and make something work.
If that means using a few variables (< 10) in the global namespace, that only get used in the global area of a program, so be it. Yes, yes, MVC, dependency injection, external code, blah, blah, blah, blah. But, if you have contained 99.99% of your code into namespaces and classes, and external code is sandboxed, the world will not end (I repeat, the world will not end) if you use a global variable.
Generally, I would not say using global variables is bad practice. I would say that using global variables (flags and such) outside of the global area of a program is asking for trouble and (in the long run) ill-advised because you can lose track of their states rather easily. Also, I would say that the more you learn, the less reliant you will be on global variables because you will have experienced the "joy" of tracking down bugs associated with their use. This alone will incentivize you to find another way to solve the same problem. Coincidentally, this tends to push PHP people in the direction of learning how to use namespaces and classes (static members, etc ...).
The field of computer science is vast. If we scare everyone away from doing something because we label it bad, then they lose out on the fun of truly understanding the reasoning behind the label.
Use global variables if you must, but then see if you can solve the problem without them. Collisions, testing, and debugging mean more when you understand intimately the true nature of the problem, not just a description of the problem.
Reposted from the ended SO Documentation Beta
We can illustrate this problem with the following pseudo-code
function foo() {
global $bob;
$bob->doSomething();
}
Your first question here is an obvious one
Where did $bob come from?
Are you confused? Good. You've just learned why globals are confusing and considered a bad practice. If this were a real program, your next bit of fun is to go track down all instances of $bob and hope you find the right one (this gets worse if $bob is used everywhere). Worse, if someone else goes and defines $bob (or you forgot and reused that variable) your code can break (in the above code example, having the wrong object, or no object at all, would cause a fatal error). Since virtually all PHP programs make use of code like include('file.php'); your job maintaining code like this becomes exponentially harder the more files you add.
How do we avoid Globals?
The best way to avoid globals is a philosophy called Dependency Injection. This is where we pass the tools we need into the function or class.
function foo(\Bar $bob) {
$bob->doSomething();
}
This is much easier to understand and maintain. There's no guessing where $bob was set up because the caller is responsible for knowing that (it's passing us what we need to know). Better still, we can use type declarations to restrict what's being passed. So we know that $bob is either an instance of the Bar class, or an instance of a child of Bar, meaning we know we can use the methods of that class. Combined with a standard autoloader (available since PHP 5.3), we can now go track down where Bar is defined. PHP 7.0 or later includes expanded type declarations, where you can also use scalar types (like int or string).
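A practical payoff of the injected version is that a test can hand in a stand-in object. A small sketch, where FakeBar and the return values are invented for the illustration (the original foo returns nothing; it returns a string here only so the effect is visible):

```php
<?php
class Bar
{
    public function doSomething(): string
    {
        return 'real work';
    }
}

// A test double: anything that is a Bar satisfies the type declaration.
class FakeBar extends Bar
{
    public function doSomething(): string
    {
        return 'fake work';
    }
}

function foo(Bar $bob): string
{
    return $bob->doSomething();
}

$real = foo(new Bar());
$fake = foo(new FakeBar());
```

With the global version there is no clean way to do this swap, because foo decides for itself where $bob comes from.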
As:
global $my_global;
$my_global = 'Transport me between functions';
is equivalent to using $GLOBALS['my_global'],
which is bad practice (like WordPress's $pagenow)... hmmm
Consider this:
$my-global = 'Transport me between functions';
is a PHP error. But:
$GLOBALS['my-global'] = 'Transport me between functions';
is NOT an error. Hyphens will not clash with "common" user-declared variables like $pagenow, and using UPPERCASE indicates a superglobal in use, which is easy to spot in code or track with find-in-files.
I use hyphens if I'm too lazy to build classes for everything in a single solution, like:
$GLOBALS['PREFIX-MY-GLOBAL'] = 'Transport me ... ';
But in cases of wider use, I use ONE global as an array:
$GLOBALS['PREFIX-MY-GLOBAL']['context-something'] = 'Transport me ... ';
$GLOBALS['PREFIX-MY-GLOBAL']['context-something-else']['numbers'][] = 'Transport me ... ';
The latter is, for me, good practice for "cola light" objectives or uses, instead of cluttering the code with singleton classes each time just to "cache" some data. Please leave a comment if I'm wrong or missing something stupid here...
