How to safely allow users to submit code to run periodically?

Basically, I need to allow users to submit code that will be run periodically on the server.
The users should submit simple scripts, and I'll run their code server side to determine who came up with the better solution. I created a simple submission form, and the code is stored in an SQL database.
I'm obviously worried about safety, but I also don't know which language to use. I need a scripting language with an easy syntax that lets me limit what users can do (I only need to let them define variables, create functions, use loops, and call some array and algebraic functions). Maybe I could even create a pseudolanguage with an easy syntax.
So basically:
What language could I use?
How do I run the users' code periodically? (I only know about cron jobs, and I don't know whether they allow long execution times.)
Would it be a good idea to create a pseudolanguage? If it is, please point me in the right direction.

What language: Well, you could use any language, just make sure it runs with minimal permissions. A scripting language like Ruby or Python would be easier, though.
If this task fell into my lap, I'd look into Python's virtualenv so that I have an isolated environment. Note that virtualenv isolates installed packages, not the process itself, so I'd also make really sure about the permissions of the script running the uploaded programs.
This also means that you could set up a Python environment for each user of the service.
How to run it periodically: Well yeah, cron works.
Pseudolanguage: It could be a good idea, but a good answer doesn't really fit in the scope here. Google "DSL" or "domain-specific language" and you're sure to find some tutorials.

If you're targeting PHP specifically, you can use the runkit extension, which includes a sandbox (Runkit_Sandbox) intended for running user-supplied PHP code:
http://www.php.net/manual/en/intro.runkit.php
There's also a newer runkit project available (though you'll have to compile it manually):
https://github.com/zenovich/runkit/

Q1. What language could I use?
A1. Pretty much any. Because compilers would add to the complexity of the system, an interpreted (or JIT-compiled) language would be preferable.
Q2. How do I run the users' code periodically? (I only know about cron jobs, and I don't know whether they allow long execution times.)
A2. Cron jobs are probably the way to go. Cron doesn't care about execution times. However, that means it is your job to make sure you only start a job again once the prior run has finished (assuming that is what you'd like it to do).
Q3. Would it be a good idea to create a pseudolanguage? If it is, please point me in the right direction.
A3. Reinventing the wheel is rarely a good idea. You could do this, but there is reasonable doubt that it is necessary or advisable.
My personal pointer would go towards JavaScript as the scripting language: since it is so widespread, there are tons of tools and documentation around. So you might want to look at Node.js and this sandboxing model to run it server-side.
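The point in A2 about not starting a job while the prior run is still going can be handled with a lock file. This is only a minimal sketch: the function name, the lock path and the job body are hypothetical placeholders, not anything from the question.

```php
<?php
// Sketch: skip this cron run if the previous one is still going.
// The lock path and the job callback are hypothetical placeholders.

/**
 * Runs $job only if no other process holds the lock.
 * Returns true if the job ran, false if it was skipped.
 */
function run_exclusively(string $lockPath, callable $job): bool
{
    $handle = fopen($lockPath, 'c'); // create if missing, never truncate
    if ($handle === false) {
        throw new RuntimeException("Cannot open lock file: $lockPath");
    }

    // LOCK_NB makes flock() return immediately instead of waiting.
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return false; // a previous run is still active
    }

    try {
        $job();
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
    return true;
}
```

The cron entry would then call a small runner script that wraps the real work in run_exclusively(). Because the lock belongs to the open file handle, it goes away by itself if the process dies, so a crashed run does not block the next one.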

Related

WordPress onstart(?): adding unexpected crash handling

I'm new to WordPress, and I'm building some backend logic for it.
I want the admin to have as smooth an experience with it as possible.
I want him to be able to run the website with "a click of a button".
I'm used to Java and Node.js environments, where I have a life cycle and can specify logic to run when the server starts, but I'm having trouble understanding how that's done in WordPress (or PHP, for that matter).
I want the website to check the database and see whether it has the tables it needs to function, and if not, to create them and fill them with relevant data. It should also check whether the database is up to date (in case of a long crash) and update it if necessary.
Right now I'm thinking about running a cron script to check this every few minutes, but that's heavy on resources. A better solution might be to run the check on the first interaction with a user, but that doesn't seem ideal, as it will slow him down.
Is there a life cycle in WordPress?
Should I be worried about it crashing during an important operation and then starting again on its own?
Can I specify logic for it to run on its boot/restart?

I am not completely sure what you have in mind when you say "fill them with relevant data", but I would recommend either leveraging WordPress' own logic or completing your task using other tools.
Like any PHP application, WordPress is stateless and has index.php as its starting point. Files are then loaded in order, and where you end up depends on the contents of your request and the environment.
This is just how PHP works. The key difference from a long-running Java or Node.js server is that PHP scripts are run from scratch for each request, produce some sort of output that is sent back to your browser, and then exit, just like when you call a REST API.
You might want to take a look at the following things:
wp-load.php: the file that looks for your constants defined in wp-config.php. When that file is not yet present, it redirects you to the "famous" WordPress site setup (after loading a bunch of stuff related to database connections and request data). You could follow the logic, but I would advise against that: the WordPress core is very old, so it shows what PHP applications from the early 2000s looked like, and it will most likely cause headaches.
Existing tools
Not only at the server level, but also things like wp-cli, or maybe a Composer-based solution like roots/bedrock or even roots/wordpress.
To answer your question about lifecycles directly
Yes, WordPress offers an old-timey hook system, but it only covers the request lifecycle of an active install, so it isn't exactly what you seem to be looking for.
Finally, it is good to have some understanding of the internal workings of WordPress, but the whole reason WordPress is easy to run and compatible with many setups is that they "strive for forever backwards compatibility" (which is also why they don't use semantic versioning). That in turn means that the core is very outdated and hard to read, so I wouldn't bother trying to figure it out yourself.
Even more so, I wouldn't want you to think that this is a fair representation of the PHP world. Since the initial release of WordPress the language has evolved completely, and back then most of the key components that make it a nice developer experience were still a long way off.
In short, I'd look for existing solutions that are built for your specific server setup, and if that is not possible for some reason, try to find some sort of CLI tool in PHP or another language.
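If you do end up using the hook system for the "is the database up to date?" check, one common shape is a version number stored as an option and compared on each request. What follows is only a sketch under assumptions: the myplugin_ names, the option key and the version constant are hypothetical, and the migration bodies are left out. get_option(), update_option() and add_action() are regular WordPress functions.

```php
<?php
// Sketch: run schema setup only when the stored version is behind the code.
// MYPLUGIN_SCHEMA_VERSION, the option name and the migration steps are
// hypothetical; get_option/update_option/add_action come from WordPress.

const MYPLUGIN_SCHEMA_VERSION = 3;

/** Returns the migration steps still to run, in ascending order. */
function myplugin_pending_steps(int $installed, int $target): array
{
    return $installed < $target ? range($installed + 1, $target) : [];
}

function myplugin_maybe_upgrade(): void
{
    $installed = (int) get_option('myplugin_schema_version', 0);
    foreach (myplugin_pending_steps($installed, MYPLUGIN_SCHEMA_VERSION) as $step) {
        // Hypothetical: create or alter the tables for this step here.
        update_option('myplugin_schema_version', $step);
    }
}

// Only wire into WordPress when it is actually loaded.
if (function_exists('add_action')) {
    add_action('plugins_loaded', 'myplugin_maybe_upgrade');
}
```

On a normal request this costs one option lookup and an integer comparison, so it avoids both the cron job and a slow first visit. Recording the version after every step means an interrupted upgrade resumes where it stopped.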

Is it safe to use online script obfuscation?

I'm thinking of protecting my script from the vast majority of users (those who are not web-dev savvy), and I came across an online service that encodes PHP scripts. I'm not sure about it, though.
Is it safe to encrypt a PHP script this way? What if the encoded code has something fishy in it?

If you intend to distribute the PHP file then I would suggest that you do not do this. It's only going to irritate those that want to tinker with it.
If for some reason you don't want them tinkering with it, then don't distribute the PHP file.
If you need to distribute the file AND you don't want them tinkering with it, then I would highly suggest you not do this in PHP and instead write the functionality using C as an extension to PHP.
You'll notice that at no point do I suggest you actually go ahead and "encode" the php file. That's not going to buy you anything.

If you are looking to obfuscate your server-side PHP, the best bet would be to use a commercial product such as Zend Guard (http://www.zend.com/en/products/guard/). Any home-brew encryption is not secure in the slightest - your code can be easily reverse-engineered with fairly trivial effort. The page you link to does not have any credibility, it is just someone's side project. They have no accountability or stake in protecting your information.
Even these commercial products (Zend Guard, ionCube, phpShield, SourceGuardian) can be decrypted if someone really, really wanted to. No tool or technique in any language can make absolutely secure obfuscation, there is no "unhackable" system. Everything boils down to effort over time.
If it isn't important enough to bother doing it right, then you're probably wasting your time on the issue. Further, if it is absolutely vital that some information or code remain private, you should simply not put it out into the public purview.

Ultimately, you need to trust the encrypting party. If you don't trust them (apparently you don't), then don't give them access to your server (through executing their decryption code/your obfuscated code, possibly with who-knows-what else inside). Simple as that, albeit possibly inconvenient.

PHP usually runs on a server where the users have no access to the code (neither the source nor any other representation) anyway. There is no reason to obfuscate it there.
Obfuscating PHP is only useful in the rare cases where you give the PHP code to clients, for example if you want clients to be able to run their own server without giving them full access to the code.

So, it looks like all it does is obfuscate the code so it's not human-readable. The only way this would really be useful is to prevent lazy people who have access to the code from reading it. However, it uses simple functions to encode/decode, so it would be trivially easy for someone to decode it if they have access.
Which brings me to my point... PHP security works by not allowing anyone to have access to the source file. If someone who shouldn't have access gets it, then this "encoding" thing isn't going to do you any good.

The OP mentioned an interest in protecting database connection details, and it should be kept in mind that no matter what protection system is used for the code itself, the fact that the PHP engine and component libraries are open source sets some absolute limits on what can be achieved. If MySQL connection details, for example, are hidden in a script, they could be trivially revealed without going near the PHP scripts themselves, simply by running the scripts with a PHP build that has slight modifications to the MySQL library or the associated PHP module wrapper. Even hiding the details in a C module as suggested by Chris L. would afford no extra protection in this case. Good protection can certainly be given to source code with compiled-code systems such as ionCube and Zend, but wherever data hits routines in the PHP core, it can be exposed.
Obviously, for any online service to which you may be sending sensitive details, you should use due diligence and make best efforts to ensure that it has a good pedigree. Apart from anything else, the lack of a working https URL for the site the OP asked about should immediately warn that it's a no-no, not just because of the missing connection encryption but because it shows that they do not consider their service to be a serious one.

How to identify whether my PHP script is hosted by others

I am selling a PHP script online at $35 per individual user.
Is there any way to identify whether my script is hosted by more than one user?
Should I put some logic in my script to find out who is hosting it?
Is there an easy way to find the pirate?
Please help me.
(Sorry for the grammatical mistakes.)

For example, somewhere in your script:
<?php
// Report the host name this copy is running on (urlencode() is the built-in; url_encode() does not exist).
file_get_contents('http://yourserver.com/tranck_script_users.php?site='.urlencode($_SERVER['HTTP_HOST']));
?>
This way you will see which hosts use your script. Of course, anyone can remove this line from your script; there is no way to know with 100% certainty.

If you can, make simple calls to a server of yours to track the script's usage; you should send the domain name and the IP. Use cURL for this. If your business logic permits it, you can go as far as disabling the script's functionality when tracking is not successful.
Because PHP is just plain text, anyone can remove your tracking code. Try to obfuscate the code.
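A rough sketch of such a tracking call with cURL might look like this. The endpoint URL, the parameter names and both function names are hypothetical; the short timeouts are my own assumption, so that an unreachable tracking server does not stall the customer's site.

```php
<?php
// Sketch of a call-home ping with short timeouts. The endpoint and the
// parameter names are hypothetical placeholders.

function tracking_url(string $endpoint, string $host, string $ip): string
{
    // http_build_query() URL-encodes each value.
    return $endpoint . '?' . http_build_query(['site' => $host, 'ip' => $ip], '', '&');
}

function send_tracking_ping(string $url): bool
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => 2,    // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 3,    // seconds allowed for the whole request
    ]);
    return curl_exec($ch) !== false;
}
```

In the script this would be called as send_tracking_ping(tracking_url('http://yourserver.com/track.php', $_SERVER['HTTP_HOST'], $_SERVER['SERVER_ADDR'])), where the path track.php is again just a placeholder.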

There is no reliable way in PHP to prevent someone else from using your script. Because PHP scripts are shipped as source code and only compiled when they run, the source can be read by anyone with access to the files. This means that any call-home logic you put into your script can easily be disabled. The best you can do is obfuscate it, but the code can still be edited by anyone with sufficient determination.
Your best solution is to use a good licence, or to develop in a language that can be distributed already compiled. With PHP, there is not a reliable way to prevent re-use of your source code.
I would urge you not to put any kind of call-home functionality into your script. First, it can be disabled, so is essentially useless. Second, it will cause significant delays even for legitimate users of your script. Finally, if you must put it in, it is vital that you tell your users that you are doing so.

There isn't much you can do to negate piracy with non-compiled scripts. Anybody can modify the source to remove whatever protections you have in place. You can, however, try to run the script through some sort of obfuscation tool, or otherwise try to manually "encode" the file, in much the same way a lot of PHP malware does. Obfuscation and this type of encoding can and will be beaten by somebody with enough time on their hands, though.
If you're willing to invest some money in the problem, you could check out IonCube Encoder or Zend Guard. Both will secure your script, and I know that at least Zend Guard allows for per-server licensing. These solutions require your end users to have either the IonCube or the Zend loader installed, though.

There is no way to do this without (IMO) impacting the security/privacy of your users.
The only "clean" way to do this is to encode your scripts with a tool like IonCube (there are many others, but I have never used them) and restrict execution to a specific domain. The downside (which you can also see as a plus, depending on your license scheme) is that the users can't see or modify your code.

Why shouldn't I use unix commands from php?

Why would you prefer to avoid using bash commands via exec() in PHP?
I am not considering portability (I definitely will not port this to run on Windows). It's just a matter of what counts as a good way of writing scripts.
On the one hand:
I need to write many more lines in PHP than in bash to accomplish the same task.
For example, when I need to filter some lines in a file, I just can't imagine using anything other than cat file | grep string > new_file. This would take much more time and effort to do in PHP.
I do not want to analyze every situation in which something might go wrong. I will just show the bash command's output to the user, so he knows exactly what happened.
I do not need to write yet another wrapper around the filesystem functions and use it. It is much more efficient to leverage the OS for file searching, manipulation, etc.
On the other hand:
Calling a Unix command with exec() might be inefficient in most cases. It is quite expensive to spawn a separate process, not to mention scripts running under Apache, where spawning is even less efficient than from command-line scripts.
Sometimes it turns into "black magic" and Perl-like scripting, though that can be avoided with detailed comments.
Maybe I'm just trying to use two different tools together when they are not supposed to be combined. Each tool has its own application, and they should not be mixed.
Even though I'm sure users will not try to run the script with malicious purposes, using exec() is a potential security threat. In most cases user data can be escaped with escapeshellarg(), but it is still an issue to take into account.

Another reason to avoid this is that it's much easier to create security holes this way.
For example, if a user manages to sneak
`rm -rf /`
(with backticks) into the input, your bash code might actually nuke the server (or nuke something at least).
Beyond that, it is mostly a religious thing: most developers try to write code that always works, and relying on external commands is a sure way to get your code to fail on some systems (even on the same OS).

What are you trying to achieve? PHP has regex-based functions to find what you need from a file. Yes, you would probably need about 5 lines of code to do it, but it would probably be no more or less efficient.
The main reason against using exec() in PHP is for security. If you're trusting your user to give you a command to exec() in bash, they could easily run malicious commands, such as installing and starting backdoor trojans, removing files, and the like.
As long as you're careful though (use the shell escaping commands to clean user input, restrict the Apache user permissions etc) it shouldn't be a problem. I'm just working on a complete platform at the moment, which relies on the front-end executing shell processes simply because C++ is much faster than PHP, so I've written a lot of the backend logic as a shell application and keep PHP for the front-end logic.
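To make the "shell escaping" part concrete, here is a small sketch that builds the grep command from the question out of untrusted input. The function name is made up for illustration, and the quoting described is what escapeshellarg() does on Unix-like systems.

```php
<?php
// Sketch: build a grep command from untrusted input without letting the
// shell interpret it. The function name is hypothetical.

function build_grep_command(string $needle, string $file): string
{
    // -F treats the needle as a fixed string; "--" ends option parsing so a
    // needle starting with "-" is not read as a flag.
    return 'grep -F -- ' . escapeshellarg($needle) . ' ' . escapeshellarg($file);
}
```

escapeshellarg() wraps each value in single quotes, and inside single quotes the shell does not expand backticks or $(...), so an input such as `rm -rf /` reaches grep as plain text to search for rather than as a command.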

Even though you say portability isn't an issue, you never know for certain what the future holds, so I'd encourage you to reconsider that position. For example, I was once asked to port an editor that was written (by someone else) from Unix to DOS. The original program wasn't expected to be ported and was written with Unix specific calls deeply embedded in the code. After reviewing the amount of work required, we abandoned the task as too time consuming.
I have used exec calls in PHP; however, I had no other way to accomplish what I needed (I had to call another program written in another language with no other bridge between the languages). However, IMO, exec calls which aren't necessary are ugly. As others have said, they can also create security risks and slow your program down.
As you said yourself, you need to document the exec calls well to be sure they'll be understood by programmers. Why create the extra work? Not just now but in the future, when any changes to the exec call will also need to be documented.
Finally, I suggest you learn PHP and its functions a bit better. I'm not that good with PHP, but in just a matter of minutes with Google and php.net, I think I accomplished the same thing you gave as an example with:
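// Keep only the lines of the file that contain $search_string.
$pattern = '/' . preg_quote($search_string, '/') . '/';
$search_results = preg_grep($pattern, file($file_name, FILE_IGNORE_NEW_LINES));
foreach ($search_results as $result) {
    echo $result . "\n";
}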
Yes, it's a bit more code, but not that much, and you can put it in a function if appropriate ... and I wouldn't be surprised if a PHP guru could shorten it.

IMHO, the main concern with using exec() to execute *nix commands via PHP is security, more than performance or even code style.
If you have very good input sanitization (and this is very hard to achieve), you may be able to avoid security holes altogether.

Personally, if portability isn't an issue, I would totally use *nix commands like grep, locate, etc. anyday over trying to duplicate that functionality in PHP.
It's about using the best tool for the job. In some cases, arguably more often than most people realize, it is much more efficient to leverage the OS for file searching, manipulation, etc. (amongst other things)

A lot of people would descend on you like a ton of bricks for even mentioning exec. Some people would consider it blasphemy, but not me. I can see nothing wrong with exec in some situations if your server has been properly configured. The disadvantage, though, is that you are spawning another process.

If you are running your PHP under a web server, the "user" that runs the script may not have permission to run certain shell commands. You said portability is not an issue, but I can guarantee you that it IS an issue (unless you are writing PHP scripts for fun). In the business world, where things and conditions change fast, you never know whether you might one day have to run your scripts on other platforms.

It is not secure unless you take extreme precautions to make sure it can't be abused by whoever supplies the input.

PHP is not a good executor. When PHP runs under Apache, the process it spawns is tied to Apache, and if that process hangs, your Apache server hangs too; if your site runs on the same Apache, it will fail.
You can expect silly issues like the following one as well. If it happens, you can't even restart Apache without manually killing the spawned process from the shell.
http://bugs.php.net/bug.php?id=38915
So, leaving security aside, running Linux commands from PHP fails more often than you'd think. The worst part of using exec() is that it's not always possible to get error messages back into PHP, or to write follow-up code that depends on what happened in exec().
Consider this pseudo example:
exec('bash myscript.sh', $x);
if ($myScriptWasOk) { /* do this */ }
Nothing sets that $myScriptWasOk variable for you. You just don't know anything about what happened; the output lines in $x will help you sometimes.
All this being said, if you need something simple, and you have tested your script and it works OK, just go for it.
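One thing that does help here: exec() accepts a third argument that receives the command's exit status. The sketch below wraps that in a helper whose name is made up for illustration. It only tells you something useful if the script itself exits non-zero on failure, which is an assumption about myscript.sh.

```php
<?php
// Sketch: exec()'s third argument receives the command's exit status.
// The helper name is hypothetical.

function command_succeeded(string $command, ?array &$output = null): bool
{
    $output = [];   // exec() appends to an existing array, so start clean
    $exitCode = 1;
    exec($command . ' 2>&1', $output, $exitCode); // fold stderr into $output
    return $exitCode === 0;
}
```

With that, the pseudo example becomes if (command_succeeded('bash myscript.sh', $x)) { ... }, and on failure $x holds whatever the script printed, including its error messages.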

If you are only aiming for Unix compatibility (which is perfectly fine), I can't see anything wrong with it. Virtually every server operating system available today is a Unix clone, except of course for Windows, which I think is ridiculous to use as a server platform in the first place (and I'm talking from experience here, this is not just Microsoft hatred). Unix compatibility is a perfectly legitimate requirement on any server, in my opinion.
The only real reason I can see to avoid it is performance. You will quickly find that executing external processes in general is extremely slow. It's slow in C, and it's slow in PHP. I would think that's the biggest real, non-religious concern.
EDIT:
Oh, and as for the security problem, that's a simple matter of making sure that you are in total control of the variables passed to the operating system. It's a concern you have to deal with whenever you communicate between processes and languages anyway, for example when you run SQL queries. In my opinion it's not a big enough reason to avoid doing something; it's just something that has to be taken into account in this case, like in every case.

If portability really isn't an issue, because you are building a company solution that is always going to run on your own, totally controlled servers, I say go for shell commands as much as you want to. There is no inherent security problem as long as you do proper basic sanitization using escapeshellarg() and related functions.
At the same time, in my projects portability mostly is an issue, and when it is, I try not to use shell commands at all - only when something can't be done in PHP at all (e.g. MP3 decoding/encoding, ImageMagick, Video operations) or not reasonably (i.e. a PHP based solution is way too slow) will I use external commands.

I need an algorithm to be very fast; should I write it as a PHP extension, or some other way?

Most of my application is written in PHP (front and back ends).
There is a part that works too slowly, and I will need to rewrite it, probably not in PHP.
What will give me the following?
1. Most speed
2. Fastest development
3. Easy maintenance
I have in mind to rewrite this piece of code in C++ as a PHP extension, but maybe I am locked on this solution and am missing some simpler or better options?
The algorithm is PorterStemmerAlgorithm, applied to several MB of data each time it is run.

The answer really depends on what kind of process it is.
If it is a long-running process (at least seconds), then an external program written in C++ would perhaps be the easiest route. It would not have the complexities of a PHP extension, and its stability would not affect PHP/Apache. You could communicate over pipes, shared memory, or the like.
If it is a short running process (measured in ms) then you will most likely need to write a PHP extension. That would allow it to be invoked VERY fast with almost no per-call overhead.
Another possibility is a custom server which listens on a Unix Domain Socket and will quickly respond to PHP when PHP asks for information. Then your per-call overhead is basically creating a socket (not bad). The server could be in any language (c, c++, python, erlang, etc...), and the client could be a 50 line PHP class that uses the socket_*() functions.
A lot of information needs to be evaluated before making this decision. PHP does not typically show slowdowns until you get into really tight loops or thousands of repeated function calls. In other words, the overhead of the HTTP request and network delays usually makes PHP delays insignificant (unless the above applies).
Perhaps there is a better way to write it in PHP?
Are you database bound?
Is it CPU bound, network bound, or IO bound?
Can the result be cached?
Does a library already exist that will do the heavy lifting?
By committing to a custom PHP extension, you add significantly to the base of knowledge required to maintain it (even above C++). But it is a great option when necessary.
Feel free to update your question with more details, and I'm sure Stack Overflow will be happy to help out.

Suggestion
The PorterStemmerAlgorithm has a C implementation available at http://tartarus.org/~martin/PorterStemmer/c.txt
It should be an easy matter to tie this C program into your data sources and make it a stand alone executable. Then you could simply invoke it from PHP with one of the proc functions, such as proc_open()
Unless you need to invoke this program many times PER php request, then this approach should save you the effort of building and integrating a PHP extension, not to mention that the hard work (in c) is already done.
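As a rough sketch of the proc_open() route: the helper below feeds a string to an external command and returns what the command writes to stdout. The helper name is made up, and the command you pass in (for instance a stemmer executable built from the C source above) is an assumption about your setup. The usage note uses tr only because it exists everywhere.

```php
<?php
// Sketch: feed input to an external filter and collect its stdout.
// The helper name is hypothetical; the command is whatever executable
// you built, which is not something the question specifies.

function run_filter(string $command, string $input): string
{
    // stdin comes from a temp file so that writing several MB cannot
    // deadlock against the child filling its stdout pipe.
    $tmp = tempnam(sys_get_temp_dir(), 'in_');
    file_put_contents($tmp, $input);

    $spec = [
        0 => ['file', $tmp, 'r'], // child's stdin
        1 => ['pipe', 'w'],       // child's stdout
    ];
    $process = proc_open($command, $spec, $pipes);
    if (!is_resource($process)) {
        unlink($tmp);
        throw new RuntimeException("Could not start: $command");
    }

    $output = stream_get_contents($pipes[1]); // read until the child closes stdout
    fclose($pipes[1]);
    $exitCode = proc_close($process);
    unlink($tmp);

    if ($exitCode !== 0) {
        throw new RuntimeException("Command failed with exit code $exitCode");
    }
    return $output;
}
```

For example, run_filter('tr a-z A-Z', $text) returns $text upper-cased. The process is spawned once per call, which is fine for one call per request over several MB but not for thousands of tiny calls.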

I'm not sure what the PorterStemmerAlgorithm is. However, if you could make your process run in parallel and collect the results together, you could look at parallel processing, which is easy to implement in Java. I'm not sure how you would call it from PHP, but it would definitely be maintainable.
You can have a look at this framework. It looks simple to implement:
https://computefarm.dev.java.net/
Regards,
Franklin.

If you absolutely need to rewrite in a different language for speed reasons, then I think gahooa's answer covers the options nicely. However, before you do, are you absolutely sure you've done everything you can to improve the performance of the PHP implementation?
Is caching the output viable in your situation? Could you get away with running the algorithm once and caching the output, rather than running it on every page load?
Have you tried profiling the code to ensure there's no unnecessary work being done (db queries in an inner loop and the like)? Xdebug can help here.
Are there other stemming algorithms available which might perform better on your dataset?
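To illustrate the caching question, here is a minimal file-based sketch. The function name, the cache directory and the $compute callback are hypothetical; it assumes the output depends only on the input text, so a hash of the input can serve as the cache key.

```php
<?php
// Sketch: cache an expensive result on disk, keyed by a hash of the input.
// The cache directory and the $compute callback are hypothetical.

function cached_result(string $cacheDir, string $input, callable $compute): string
{
    $path = $cacheDir . '/' . hash('sha256', $input) . '.cache';
    if (is_file($path)) {
        return file_get_contents($path); // cache hit: skip the computation
    }
    $result = $compute($input);
    file_put_contents($path, $result, LOCK_EX);
    return $result;
}
```

The slow stemming code would be passed in as $compute. If the same documents come up repeatedly, only the first request pays for the computation, which may be enough to make a rewrite unnecessary.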
