Regex for a malware scanner (PHP)

Good day all! Today's problem is simple.
We have many sites on shared servers, and sometimes they get hacked: malware starts sending email from them or tries to infect the neighboring sites.
After spending a lot of time cleaning up files on many servers, I started to think that a script that looks through the PHP files could be a real help.
Having a full antivirus for each website is a bit "too much". What I'd like is something I can manage remotely that gives me some clues about the scanned web server.
So I thought of a simple PHP script that scans every directory and every file for suspicious patterns. I currently use these two regexes:
/eval\((base64|eval|\$_|\$\$|\$[A-Za-z_0-9\{]*(\(|\{|\[))/i
/mail\(/i
I know the second one is very naive, but it does the job, because I want to know whether any PHP file contains a mail call.
I tried token_get_all, but the only useful token I found was T_EVAL, so I went back to regexes.
My problem is that these two regexes are:
un-optimized (I feel regexes are a bit too CPU/time consuming)
far from precise
What I'm asking is:
do you have a better idea for achieving the same result?
I would also like to match these patterns:
eval(gzinflate(base64_decode('...');
eval(gzuncompress(base64_decode('...');
eval(gzinflate(str_rot13(base64_decode('...');
I feel that eval|base64|gzinflate|gzuncompress|gzinflate should be the "initial" patterns to search for, but modifying the regex into:
/eval\((base64|eval|gzinflate|gzuncompress|gzinflate\$_|\$\$|\$[A-Za-z_0-9\{]*(\(|\{|\[))/i
doesn't give me the results I expected.
Of course, if you have better ideas, they are more than welcome.
NOTE
I understand that the question is very broad, but I would like ideas on this topic, since handling hundreds of websites without any kind of protection is very time consuming.
If this isn't the right place to post this question, I'll delete it; I don't want to sully the board.
ANOTHER NOTE
This isn't meant to be "the solution". I only want a structured tool to start with in case of problems, something I can run easily from home (or from cron) that just warns me about anything odd. There will certainly be tons of false positives.
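A possible starting point for the patterns listed above, as a minimal sketch rather than a vetted scanner. One thing that stands out in the modified regex is that there is no | between the last gzinflate and \$_, so the \$_ alternative only matches the literal text gzinflate$_ and calls like eval($_POST[...]) are no longer caught; that may be part of why the results differ from what was expected. The function name scan_source and the labels are hypothetical, and the pattern is only an assumption about what counts as suspicious.

```php
<?php
// Sketch only: scan_source() and the labels are hypothetical names.
// Returns the label of every pattern that matches somewhere in $source.
function scan_source(string $source): array
{
    $patterns = [
        // eval( followed by a decoder/decompressor call, a nested eval,
        // a superglobal, a variable variable, or a variable immediately
        // used with (, [ or {.
        'eval' => '/\beval\s*\(\s*(?:base64_decode|gzinflate|gzuncompress|str_rot13|eval|\$_|\$\$|\$\w*\s*[(\[{])/i',
        // Any direct mail( call. The \b means "email(" and "wp_mail(" are
        // NOT reported; drop it if those should be flagged as well.
        'mail' => '/\bmail\s*\(/i',
    ];

    $hits = [];
    foreach ($patterns as $label => $regex) {
        if (preg_match($regex, $source) === 1) {
            $hits[] = $label;
        }
    }
    return $hits;
}
```

Calling scan_source(file_get_contents($path)) for each PHP file gives a list of labels to report. Like the original patterns, this still matches inside comments and string literals, so false positives remain.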

My first thought is professional virus scan and malware removal software. If you want to try to do this yourself, you might consider making a database of md5() digests of your PHP scripts. Scan the directories, make the digests, store them in a database. On a re-scan, compare the digests. If there are any new digests (new PHP files appeared) or missing digests (PHP files went away) or changed digests (PHP files were modified) it would give you a quick heads-up that something was different, and you could investigate if the change was unexpected.
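The digest idea could look roughly like this. It is a minimal sketch: the function names build_digests and compare_digests are made up for illustration, and the baseline is kept as a plain PHP array instead of a real database table.

```php
<?php
// Sketch only: build_digests() and compare_digests() are hypothetical names.

// Walk $root recursively and return [path => md5 digest] for every .php file.
function build_digests(string $root): array
{
    $digests = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'php') {
            $digests[$file->getPathname()] = md5_file($file->getPathname());
        }
    }
    ksort($digests);
    return $digests;
}

// Compare a stored baseline ($old) with a fresh scan ($new).
function compare_digests(array $old, array $new): array
{
    return [
        'added'   => array_keys(array_diff_key($new, $old)),
        'removed' => array_keys(array_diff_key($old, $new)),
        'changed' => array_keys(array_filter(
            array_intersect_key($new, $old),
            fn ($hash, $path) => $old[$path] !== $hash,
            ARRAY_FILTER_USE_BOTH
        )),
    ];
}
```

The baseline should be stored somewhere the web server user cannot write to; otherwise whoever modifies the PHP files can update the digests too.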

Once your server gets hacked, there is no way to clean it.
The only safe way to proceed once an infection is detected is to build a new server and import the data into it (after checking that data).
If the server is infected, you can't trust it anymore, because it can lie to you about anything, including the content of files.
I know it sounds painful, but there is no hope of fully recovering from an infection and being sure you have recovered from it.

Related

Recover/Decode encoded php code

We had an ex-employee handling our servers, and he left due to some issues. He has now encoded all the PHP files on the server, and we are struggling to get them back.
Can anyone help decode the files, or let us know which encryption this is and how to recover the code?
We tried many online decoders and other techniques suggested on Stack Overflow, but ended up with buggy code.
I have pasted a sample at https://pastebin.com/4uwZLZVF
A sample first line:
<?php {"G\x4cOB\x41L\x53"}["gb\x73\x73\x69\x62"]="t\x65\x63hid";${"\x47\x4c\x4f\x42\x41\x4c\x53"}["\x66\x65ll\x77\x68j\x6c"]="co\x6et\x5f\x72e\x73\x32";

You could start by decoding the \xNN escape sequences in the string literals, which should at least give you some idea of what the code is doing. For example, the first line becomes:
<?php {"GLOBALS"}["gbssib"]="techid";${"GLOBALS"}["fellwhjl"]="cont_res2";
(And no, I have no idea why the first {"GLOBALS"} has no $ before it. Looks like a syntax error to me.)
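A small helper for that first step might look like the sketch below. The name decode_hex_escapes is made up, and it blindly rewrites every \xNN sequence in the text without checking that it sits inside a double-quoted string, so treat the output as something to read, not something to run.

```php
<?php
// Sketch only: decode_hex_escapes() is a hypothetical helper name.
// Replaces each \xNN escape (one or two hex digits) with the byte it encodes.
function decode_hex_escapes(string $code): string
{
    return preg_replace_callback(
        '/\\\\x([0-9A-Fa-f]{1,2})/',
        fn (array $m) => chr(hexdec($m[1])),
        $code
    );
}
```

Octal escapes such as \107 would need the same treatment if the obfuscator used them; the sample line shown here only uses the hex form.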
Anyway, if the ex-employee didn't originally write that code, you'd probably be best off restoring it from backups. (You do have backups, right?) Treat anything they did write as untrustworthy: given that they were willing to sabotage their employer to this extent, who knows what kind of other traps they may have buried in the code. Even if you manage to deobfuscate it, unless you're willing to carefully inspect every line of the code (which probably takes as much work as just reimplementing it) you can't be sure it doesn't contain some malware that compromises your server.
Oh, and call your lawyer. Given this kind of deliberate sabotage, there's got to be something you can sue your ex-employee for — probably breach of contract, at least. Assuming you can still track them down, that is. But you might, since they presumably had some motive for doing this to you (e.g. to extort extra money from you for the unobfuscated code), and unless it's pure revenge, they can't get what they want if they just walk away without any trace.
(Of course, that's assuming you didn't breach the contract you had with them first. If the reason they left you with obfuscated code is because you promised to pay them and didn't, then you probably won't have much luck with suing them, and should either pay up or give up. Consulting a lawyer might still be worthwhile, if you're not sure if you're in the right or not. If you do decide to pay the ex-employee for the unobfuscated code, you might still want to treat it as suspect — although, if it turns out that it still doesn't do what you want after you've fully paid for it, you're at least in a much stronger position legally. Oh, and if you didn't have a written contract before, make sure to insist on one now before paying anything. And have your lawyer read it before you sign it.)

What might be the best way to benchmark a user's PC, PHP or JS?

PHP - Apache with Codeigniter
JS - typical with jQuery and in house lib
The Problem: Determining (without forcing a download) a user's PC ability and/or virus issues
The Why: We put out software that is mostly used in clinics but can be used from home. However, we need to know, before users go to our main site, whether their PC can handle the demands of our web-based, browser-served software.
Progress: So far, we've come up with a decent way to test download speed, but that's about it.
What we've done: In PHP we create about a 2.5Gb array of data to send to the user in a view. The view then calculates the time it took to get the data and subtracts the PHP benchmark from that time to get a point of reference for upload/download time. This is not enough.
Some of our (local) users have been found to have "crappy" PCs or virus infections, and this can lead to two problems: (1) they crash in the middle of performing a task in our program, or (2) their viruses could try to inject into our JS, creating a bad experience that may make us look bad to the average user (who isn't educated on how this stuff works), thus hurting "our" integrity.
I've done some googling around, but most plug-ins or advice forums/blogs I've found simply give ways to benchmark the speed of your JS, and that is simply not enough. I need a simple bit of code with no visual interface included (another problem I found with one nice JS lib that did this; it would take days to remove all of the author's personal visual code) that will allow me to test the following 3 things:
The user's data transfer rate (I think we have this covered, but if a better method is presented I won't rule it out)
The user's processing speed: how fast the computer is in general
A possible test for infection via malware, adware, or whatever may be harmful to the user's experience
What we are not looking to do: repair their pc! We don't care if they have problems, we just don't want to lead them into our site if they have too many problems. If they can't do it from home, then they will be recommended to go to their nearest local office to use this software "in house" so to speak.
Further Explanation
We know you can't test the user-side stuff with PHP, we're not that stupid. PHP is mentioned because it can still be useful either in determining connection speed or in delivering a script that may do what we want. Also, this is not software for just anyone on the net to sign up for and use. If you find it online, unless you are affiliated with a specific clinic and have a login name and whatnot, you're not meant to use the site, and if you get in otherwise, it's illegal. I can't really reveal a whole lot of information yet, as the site is not live. What I can say is that it is mostly used by clinics/offices for customers to perform a certain task. If they don't have the time or transport, or otherwise need to do it from home, then the option is available. However, if their home PC is not "up to snuff", it will be nothing but a problem for them and make the 2-hour task they are meant to perform become a 4-6 hour nightmare. That's why I'm at one of my favorite question sites asking if anyone has had experience with this before and knows a good way to test the user's PC, so they can get the best possible outcome: either do it from home (as their PC is suitable) or be told they need to go to their local office. Hopefully this clears things up enough that we can refrain from the "sillier" answers. I need a REAL viable solution and/or suggestions, please.

PHP has (virtually) no access to information about the client's computer. Data transfer can just as easily be limited by network speed as computer speed. Though if you don't care which is the limiter, it might work.
JavaScript can reliably check how quickly a set of operations are run, and send them back to the server... but that's about it. It has no access to the file system, for security reasons.
EDIT: Okay, with that revision, I think I can offer a real suggestion - basically, compromise. You are not going to be able to gather enough information to absolutely guarantee one way or another that the user's computer and connection are adequate, but you can get a general idea.
As someone suggested, use a 10MB-20MB file and several smaller ones to test actual transfer rate; this will give you a reasonable estimate. Then, use JavaScript to test their system speed. But don't just stick with one test, because that can be heavily dependent on browser. Do the research on what tests will best give an accurate representation of capability across browsers; things like looping over arrays, manipulating (invisible) elements, and complex math. If there is a significant discrepancy between browsers, then use different thresholds; PHP does know what browser they're using, so you can give the system different "good enough" ratings depending on that. Limiting by version (like, completely rejecting IE6) may help in that.
Finally... inform the user. Gently. First let them know, "Hey, this is going to run a test to see if your network connection and computer are fast enough to use our system." And if it fails, tell them which part, and give them a warning. "Hey, this really isn't as fast as we recommend. You really ought to go down to the local clinic to perform this task; if you choose to proceed, it may take a lot longer than intended." Hopefully, at that point, the user will realize that any issues are on them, not on you.
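On the server side, the "which part failed" decision could be sketched like this. Everything here is an assumption: the function names, the browser family keys, and especially the threshold numbers, which are placeholders and not measured values. The JavaScript timing itself still has to run in the browser and be posted back.

```php
<?php
// Sketch only: names, browser keys and thresholds are hypothetical placeholders.

// Convert a timed download into megabits per second.
function transfer_rate_mbps(int $bytes, float $seconds): float
{
    if ($seconds <= 0) {
        throw new InvalidArgumentException('Elapsed time must be positive');
    }
    return ($bytes * 8) / ($seconds * 1000000);
}

// Return the parts of the test that failed: 'network', 'computer', both, or none.
// $benchmarkMs is the time the client reported for the JavaScript test.
function failed_checks(string $browser, float $mbps, float $benchmarkMs, array $thresholds): array
{
    $limits = $thresholds[$browser] ?? $thresholds['default'];
    $failed = [];
    if ($mbps < $limits['min_mbps']) {
        $failed[] = 'network';
    }
    if ($benchmarkMs > $limits['max_benchmark_ms']) {
        $failed[] = 'computer';
    }
    return $failed;
}
```

An empty result means the user can proceed; otherwise the returned labels say which warning to show.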

What you've heard is correct: there's no way to effectively benchmark a machine with JavaScript, especially because the JavaScript engine depends mostly on the actual browser the user is using, amongst numerous other variables (no file system permissions, etc.). A computer is hardly going to let a browser's sub-process stress it anyway; the browser would simply crash first. PHP is obviously out, as it's server-side.
Sites like System Requirements Lab have the user download a Java applet to run in its own scope.

Code obfuscation and runtime behaviour changes

Is there any way to build a three-piece puzzle out of your code? The first piece is an encrypted string (which, in its decrypted form, is actually code/instructions to execute). The second is a function built specifically for that string and only that one. The third and most important piece is the context that makes executing the function possible, and that environment state must be unforgeable. So the function is to be executed on the matching string only once, in only one environment state, and after that it expires. It must work like a perfect clock mechanism, where even one missing piece makes it "un...tickable" (unable to tick, as in "to function", of course, because every part is vital).
I am speaking about scripting languages here, where something like eval exists, but it could be applied as a general technique.
Using some kind of "domino" effect based on all kinds of memory states and variables, random or not, you could create something similar to a "leak effect", leaving bits of your code in a structured (but hidden) place, so that at the end a strange tool like a "code funnel" arranges it all into an execution flow.
As I do not know of anything like this, I have made up these terms to give the idea some kind of readable and understandable form.
My question is exactly about this: is it possible? Is there any help I can get? Are there any ideas out there? I cannot find anything right now that produces such an effect.
And of course you might see no reason for me to ask for such a thing; you could say that no one is interested in my code, or that I suffer from paranoia. But still, consider that there might be a reason. In fact, there surely is a reason for me wanting to know this.
And please don't rush and jump to conclusions like "it was specified clearly that this is not a discussion forum", as this is a problem I am facing with an urgent need for a solution. As I have little experience in a lot of things, including serious math and solid algorithmic thinking, I am crying out for help. So I thank those who will give this matter any consideration.

This is essentially impossible, at a very fundamental level.
You want to give someone some information (your code). You want them to be able to execute the code once, in one particular state (that you don't define very well). But you don't want them to be able to inspect that code, or execute it under other circumstances. Impossible.
A program on that person's machine (that will execute the code) must be able to inspect the code, in order to be able to execute it. So other programs on that machine will also be able to inspect the code.
You can encrypt your code, so that arbitrary programs can't inspect it, and you regain control. But the program intended to execute the code still needs to inspect it, so you'll have to somehow give it the information it needs to decrypt the code.
But now you're right back to your original problem: You have some information (the decryption key), which you want to give the user for one particular use (decrypting and running your code under only the circumstances you specify), and not for any others.
More interestingly, you could instead encrypt your code with a hash value derived from the parts of the environment you want to check (how exactly you derive the key is impossible to say without knowing what you want to check). Then, if the user (or a program executing your code on their behalf) wants to inspect the code, it will have to follow the same process of deriving a hash value from the environment and have the correct environment. But the user could arrange to collect the hash value from that environment and keep it, allowing them to inspect your code whenever they want.
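To make that concrete, here is a minimal sketch of the environment-derived key idea and of why it leaks. The function names and the fingerprint string are hypothetical, since what exactly to check in the environment is left open above, and the sketch assumes the openssl extension is available.

```php
<?php
// Sketch only: function names and the fingerprint contents are hypothetical.
// Requires the openssl extension.

// Derive a 32-byte AES-256 key from whatever describes the environment.
function derive_key(string $fingerprint): string
{
    return hash('sha256', $fingerprint, true);
}

// Encrypt $code so that only the matching fingerprint can unlock it.
function lock_code(string $code, string $fingerprint): array
{
    $iv = random_bytes(12);
    $tag = '';
    $ciphertext = openssl_encrypt(
        $code, 'aes-256-gcm', derive_key($fingerprint), OPENSSL_RAW_DATA, $iv, $tag
    );
    return ['iv' => $iv, 'tag' => $tag, 'ciphertext' => $ciphertext];
}

// Returns the plaintext, or false when the fingerprint does not match.
function unlock_code(array $locked, string $fingerprint)
{
    return openssl_decrypt(
        $locked['ciphertext'], 'aes-256-gcm', derive_key($fingerprint),
        OPENSSL_RAW_DATA, $locked['iv'], $locked['tag']
    );
}
```

The weakness is visible in the signature: unlock_code only needs the fingerprint string, not the live environment. Anyone who records that string once can call it again later from anywhere.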
Plus, I have no idea what you could possibly check that actually guarantees the correct environment. If the program is supposed to check that it is correctly running itself (with no modifications to spy on your code) before decrypting the secret part of itself and running that, then a spy can easily pull out the encrypted stuff, work out from the (unencrypted) checking code what computations are done to check the environment, and then do their own calculations based on the "correct" environment rather than the one that actually exists.
All you can do, in the end, is make the process of recovering your top secret code more annoying, so that it de-motivates people from bothering. Most people will never bother to look at your code anyway; they just don't care. Most of the remaining ones will give up at the slightest resistance (though it may put them off using your software). The ones who are still trying after that... there's always a chance (if your secret code is really that valuable/interesting) that there will be sufficiently motivated people who will get around whatever you do (unless you just never give anyone your code in the first place). So my inclination is to just put some small easy effort into obscuring your secret code, and don't worry about achieving perfect security, because it's impossible.

You could simply encrypt the code using a symmetric encryption. Encrypt it using a random key, store that key on the computer, and then include in the encrypted code a part that deletes that key from memory when the code is run. Of course it would be relatively easy for someone to copy the key before it was deleted and restore it to memory later, but you could put it somewhere obscure, and I think no matter what you do someone could restore the computer to an earlier state (using backup software).

@Ben, interesting presentation. I believe I have been thinking about this from the very moment I understood what can and cannot be done. I thought of every possible situation you mention, and even more, and indeed it seems there is no way. Maybe I should limit myself to making it as much of a headache as I can. Not only is it super-secret code, as you named it, but I have another intention in hiding it: pardon my lack of modesty, it really is interesting code, and I really want to filter who gets to see it. If it takes a genius to get from A to C without going through B, it should take another to figure out how. I thought it was my lack of experience that kept me stuck on this unsolvable problem. I am always put in front of such seemingly impossible situations, and I usually raise hell to find the solution, and I manage just fine. This time, though, I just cannot bring myself to surrender.
I might have known the answer from the beginning: no matter how much you try, it can't be done. And of course it is easy to see why, because it would not be normal or correct; the limit of what you create depends on your own knowledge and experience. With that in mind, it is obvious that someone better than you, and there surely will be someone better, can find a way to beat it. Even if that is, for me at least, subjectively speaking, troublesome. Well then, I guess I'll do my best to limit the number of people who get to know my super-secret code. After all, even God is a great puzzle maker.
Anyway, I highly appreciate your answer.
P.S. Sorry for the late answer, but as I could not answer my own question for 8 hours, I had to wait, and after that I couldn't find any available time to do it.

Using a single PHP script for an entire site

I had an idea today (that millions of others have probably already had) of putting all of the site's script into a single file, instead of having multiple, separate ones. When submitting a form, there would also be a hidden field called something like 'action', which would indicate which function in the file would handle it.
I know that things like CodeIgniter and CakePHP exist, which help separate/organise the code.
Is this a good or bad idea in terms of security, speed and maintenance?
Do things like this already exist that I am not aware of?

What's the point? It's just going to make maintenance more difficult. If you're having a hard time managing multiple files, you should invest the time into finding a better text editor / IDE and stop using Notepad or whatever is making it so difficult in the first place!

Many PHP frameworks rely on the Front Controller design: a single small PHP script serves as the landing point for all requests. Based on request arguments, the front controller invokes code in other PHP scripts.
But storing all code for your site in a single file is not practical, as other people have commented.
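A front controller in that spirit can be very small. The sketch below uses the 'action' field from the question; the dispatch function and the handler names are hypothetical, and in a real site each handler would usually live in its own file and be loaded on demand instead of being an inline closure.

```php
<?php
// Sketch only: dispatch() and the handler names are hypothetical.
// $handlers maps each allowed action name to a callable that takes the request.
function dispatch(array $request, array $handlers): string
{
    $action = $request['action'] ?? 'home';
    // Only run actions that are explicitly registered; never call a
    // function whose name comes straight from user input.
    if (!is_string($action) || !array_key_exists($action, $handlers)) {
        return 'Unknown action';
    }
    return $handlers[$action]($request);
}

$handlers = [
    'home'    => fn (array $req) => 'Welcome',
    'contact' => fn (array $req) => 'Thanks, ' . ($req['name'] ?? 'stranger'),
];
```

In index.php this would be called as echo dispatch($_POST + $_GET, $handlers); so every request lands in one place while the code itself stays split up.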

There are many forums that do this. Personally, I don't like it, mainly because if you make an error in the file, the entire site is broken until you fix it.
I like separation of each part, but I guess it has its plusses.
It's likely bad for maintenance, as you can't easily disable a section of your site for an update.
Speed: I'm not sure to be honest.
Security: You could accomplish the exact same security settings by just adding a security check to a file and then including that file in all your pages.

If you're not caching your scripts, everything in a single file means less disk I/O, and since generally, disk I/O is an expensive operation, this probably can be a significant benefit.
The thing is, by the time you're getting enough traffic for this to matter, you're probably better off going with caching anyway. I suppose it might make some limited sense, though, in special cases where you're stuck on a shared hosting environment where bandwidth isn't an issue.
Maintenance and security: composing software out of small integral pieces of code a programmer can fit inside their head (and a computer can manage neatly in memory) is almost always a better idea than a huge ol' file. Though if you wanted to make it hell for other devs to tinker with your code, the huge ol' file might serve well enough as part of an obfuscation scheme. ;)
If for some reason you were using the single-file approach to try to squeeze out extra disk I/O, then what you'd want to do is create a build process, where you do your actual development work in a series of broken-out discrete files and run a make- or ant-like command to generate your single file.

What is your experience of PHP encrypters? Which one would you recommend?

We have an application that is written in PHP that we are going to license to a customer. Our company believes that the customer might intend to steal the source code and create their own fork of the software, therefore we want to encrypt the source code.
I have searched for PHP encrypters and found several that seem good, but since we have no previous experience with PHP encrypters, it is hard to say which one is the best. Which PHP encrypters have you used, and what is your experience?

So, First:
It is impossible to encrypt your entire code base because at some point there has to be an eval statement, and if the user changes the eval to an echo, they get all of your code in the browser.
And here is a bunch of people who agree with me.
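To illustrate the eval-to-echo point, here is a minimal sketch of what a typical loader boils down to. The names pack_source and unpack_source are made up, the base64 plus deflate wrapping is just one common scheme and not what any particular product does, and the sketch assumes the zlib extension.

```php
<?php
// Sketch only: pack_source() and unpack_source() are hypothetical names
// standing in for what a typical "encrypter" loader does. Requires zlib.
function pack_source(string $php): string
{
    return base64_encode(gzdeflate($php));
}

function unpack_source(string $packed): string
{
    return gzinflate(base64_decode($packed));
}

// What gets shipped to the customer is only the packed string plus a loader.
$packed = pack_source('return 6 * 7;');

// What the loader effectively does at run time:
$result = eval(unpack_source($packed));

// What a curious customer does instead: same call, no eval.
$revealed = unpack_source($packed);
```

However many layers are stacked, the last step has to hand plain PHP to eval, and that is the step the customer can redirect.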
Furthermore:
People will offer you obfuscators, but no amount of obfuscation can prevent someone from getting at your code. None. If your computer can run it, or in the case of movies and music if it can play it, the user can get at it. Even compiling it to machine code just makes the job a little more difficult. If you use an obfuscator, you are just fooling yourself. Worse, you're also disallowing your users from fixing bugs or making modifications. - Schwern
Now that's done:
Byte-compiling is something completely different from encrypting. It turns the PHP code into precompiled bytecode, similar to an exe file. You can include these files just like any other PHP file.
The byte code produced is able to be reverse engineered, but it would take lots of time and is not worth the company's time.
Check out the byte compiler PHP extension.
I'd also like to note that PHP comes with several ways of reverse engineering classes, such as the Reflection classes. These basically let people see every method, variable, and constant in each of your classes without needing your source code.
Frankly, once someone sees the functions you use, it is pretty easy to piece it together after that.
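A minimal sketch of what Reflection gives away. The LicenseCheck class and the describe_class helper are made up for the example; the point is that names are listed for any loaded class, private members included, while method bodies are not.

```php
<?php
// Sketch only: LicenseCheck and describe_class() are hypothetical examples.
class LicenseCheck
{
    const SALT = 'example';
    private $serverUrl = 'https://licensing.example.invalid';

    public function verify(string $key): bool
    {
        return $this->checksum($key) !== '';
    }

    private function checksum(string $key): string
    {
        return md5(self::SALT . $key);
    }
}

// List the constant, property and method names of a loaded class.
function describe_class(string $class): array
{
    $ref = new ReflectionClass($class);
    return [
        'constants'  => array_keys($ref->getConstants()),
        'properties' => array_map(fn ($p) => $p->getName(), $ref->getProperties()),
        'methods'    => array_map(fn ($m) => $m->getName(), $ref->getMethods()),
    ];
}
```

print_r(describe_class('LicenseCheck')) shows the private checksum method and the serverUrl property alongside the public API.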

There are a lot of obfuscators out there masquerading as encrypters.
If you really must encrypt your code, use Zend.
IMHO shutting your customers out of your code is inherently evil; I would rather hide some symbology in the code and sell it under a no-modify/re-sell contract. Then sue the ass off them if they try to sell it on. You could argue that encrypting your code closes down a business opportunity ;) !
C.
