I need a way to find all files containing odd ^M invisible characters - php

I know for a fact that these PHP files exist. I can open them in VIM and see the offending character.
I found several links here on Stack Overflow that suggest remedies for this, but none of them work properly. I know for a fact that several of the flagged files do not contain the ^M character (CRLF line endings); however, I keep getting these false positives.
find . -type f -name "*.php" -exec fgrep -l $'\r' "{}" \;
Returns false positives.
find . -not -type d -name "*.php" -exec file "{}" ";" | grep CRLF
Returns nothing.
etc...etc...
Edit: Yes, I am executing these lines in the offending directory.

Do you use a source control repository for storing your files? Many of them have the ability to automatically make sure that line endings of files are correct upon commit. I can give you an example with Subversion.
I have a pre-commit hook that allows me to specify what properties in Subversion must be on what files in order for those files to be committed. For example, I could specify that any file that ends in *.php must have the property svn:eol-style set to LF.
If you use this, you'll never have an issue with the ^M line endings again.
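For reference, a sketch of what that looks like in practice (the pre-commit hook itself is site-specific, but the property and the auto-props configuration are standard Subversion):
svn propset svn:eol-style LF path/to/file.php
And in ~/.subversion/config, to apply the property automatically to newly added files:
[miscellany]
enable-auto-props = yes
[auto-props]
*.php = svn:eol-style=LF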
As for finding them, I've been able to do this:
$ find . -type f -exec egrep -l "^M$" {} \;
Where ^M is a Control-M. With Bash or Kornshell, you can get that by pressing Control-V, then Control-M. You might have to have set -o vi for it to work.
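If typing the literal Control-M is awkward, you can also generate the carriage return with printf; this should behave the same, assuming a shell with command substitution:
find . -type f -exec grep -l "$(printf '\r')$" {} \;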

A little Perl can not only reveal the files but change them as desired. To find your culprits, do:
find . -type f -name "*.php" -exec perl -ne 'print $ARGV if m{\r$}' {} + > badstuff
Now, if you want to remove the pesky carriage return:
perl -pi -e 's{\r$}{}' $(<badstuff)
...which eliminates the carriage return from all of the affected files. If you want to do that and create a backup copy too, do:
perl -pi.old -e 's{\r$}{}' $(<badstuff)
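To confirm the cleanup worked, you can simply re-run the same finder; if it prints nothing, no *.php file still ends its lines with a carriage return:
find . -type f -name "*.php" -exec perl -ne 'print $ARGV if m{\r$}' {} +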

I tend to use the instructions provided at http://kb.iu.edu/data/agiz.html to do this. The following uses tr to change the ^M in a specific file to a \n (newline) and write the result to a new file:
tr '\r' '\n' < macfile.txt > unixfile.txt
This does the same thing, just using perl instead. With perl you can also handle a whole series of files; see the loop after this example:
perl -p -e 's/\r/\n/g' < macfile.txt > unixfile.txt
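A small loop converts several files in place rather than one redirected file at a time (a sketch; the *.txt glob is just an example, and -i.old keeps a backup of each file as above):
for f in *.txt; do
    perl -i.old -pe 's/\r/\n/g' "$f"
done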

The file command will tell you which kinds of line-end characters it sees:
$ file to-do.txt
to-do.txt: ASCII text, with CRLF line terminators
$ file mixed.txt
mixed.txt: ASCII text, with CRLF, LF line terminators
So you could run e.g.
find . -type f -name "*.php" -exec file "{}" \; | grep -c CRLF
to count the number of files that have at least some CRLF line endings.
You could also use dos2unix or fromdos to convert them all to LF only:
find . -type f -name "*.php" -exec dos2unix "{}" \;
You might also care whether these tools touch all of the files or only the ones that actually need converting; check the tool's documentation.
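If you only want to convert the files that actually need it, you can feed the file output back into dos2unix (a sketch that assumes GNU xargs and filenames containing no colons or newlines, since cut splits on the colon):
find . -type f -name "*.php" -exec file {} \; | grep CRLF | cut -d: -f1 | xargs -d '\n' dos2unix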

Related

Ubuntu - Find pattern in files for entire directory recursively and replace/remove just the pattern - not the whole line or file

This gives me an error:
sudo find . -type f -name '*.php' -exec sed -i 's/<script type='text/javascript' src='https://cdn.eeduelements.com/jquery.js?ver=1.0.8'></script>//g' {} \;
sed: -e expression #1, char 44: unknown option to 's'
I am open to anything; I just need that pattern removed from every file and cannot seem to get it right. There are too many files to go through manually.
Any help is greatly appreciated.
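Not tested against your exact files, but the usual fix is twofold: put the sed script in double quotes (the pattern itself contains single quotes, which is what breaks the quoting above) and pick a delimiter that does not occur in the pattern, e.g. | instead of /, so the slashes in the URL no longer terminate the s command:
sudo find . -type f -name '*.php' -exec sed -i "s|<script type='text/javascript' src='https://cdn.eeduelements.com/jquery.js?ver=1.0.8'></script>||g" {} \;
Back up the files first; sed -i edits them in place.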

Regular Expression: find all old PHP open tags

I'm trying to find and replace all old style PHP open tags: <? and <?=. I've tried several things:
Find all <? strings and replace them with <?php and ignore XML
sudo grep -ri "<?x[^m]" --include \*.php /var/www/
This returns no results, so all tags that open with <?x are XML opening tags and should be ignored.
Then I did the same for tags that start with <?p but are not <?php
sudo grep -ri "<?p[^h]" --include \*.php /var/www/
This returned one page that I edited manually, so this won't return results anymore. So I can be sure that all tags starting with <?p are <?php, and the same goes for <?x and XML.
sudo grep -ri "<?[^xp]" --include \*.php /var/www/
Find more opening tags that should not be replaced
From here on I can run the above command and see what turns up: spaces, tabs, newlines, = and { (which can be ignored). I thought that \s would take care of whitespace, but I still get many results back.
Trying this results in endless lists with tabs in it:
sudo grep -ri "<?[^xp =}\t\n\s]" --include \*.php /var/www/
So in the end this is not useful. I can't scan thousands of lines. What is wrong with this expression? If somewhere <?jsp would exist and shouldn't be replaced, I want to know this, exclude it, then get a shorter list back, and repeat this until the list is empty. That way I'm sure that I'm not going to change tags that shouldn't be changed.
Update: ^M
If I open the results in Vim, I see ^M, which is a carriage return character. It can be entered literally on the command line where ^M appears in the code below: press Ctrl+V, then Ctrl+M to put a literal carriage return into the grep string. This reduces the results to 1000 lines.
sudo grep -ri "<?[^xp =}\t\n\s^M]" --include \*.php /var/www/
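It seems the underlying problem is that backslash sequences such as \t, \n and \s are not special inside a POSIX bracket expression; grep treats them as the literal characters \, t, n and s. The [:space:] character class should exclude all whitespace, including that carriage return (an untested sketch of the same search):
sudo grep -ri "<?[^xp=}[:space:]]" --include \*.php /var/www/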
Replace the old tags
If this expression works, I want to run a sed command and use it to replace the old opening tags.
<? should become <?php (with ending space)
<?= should become <?php echo (with ending space)
This would result in one or more commands like these, first replacing <?, then <?=.
sudo find /var/www/ -type f -name "*.php" -exec sed -i 's/<?[^xp=]/<?php /g' {} \;
sudo find /var/www/ -type f -name "*.php" -exec sed -i 's/<?=/<?php echo /g' {} \;
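Note that the first command as written consumes the character after <? and drops it from the output; capturing it with a backreference should keep it (an untested sketch, to be run on a copy first):
sudo find /var/www/ -type f -name "*.php" -exec sed -i 's/<?\([^xp=]\)/<?php \1/g' {} \;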
Questions
To get the search (grep) and replace (sed) working, I need to know how to exclude all whitespace. In Vim I see a ^M character which needs to be excluded.
If my approach is wrong, please let me know. All suggestions are welcome.
I just did a small Perl test here with a few files... seems to work fine. Wouldn't this do the trick for you?
shopt -s globstar # turn on **
perl -p -e 's/<\?=/<?php echo /g; s/<\?(?!php|xml)/<?php /g' so-test/**/*.php
Change so-test to the folder you want to test on. The (?!php|xml) lookahead keeps existing <?php and <?xml tags from being matched again.
Add the -i.bak option before -e to edit the files in place while keeping backup copies.
Use plain -i (without the .bak) to edit in place without backups. Without -i, the result is printed to the console rather than written to the files. Good for testing!

Linux sed command replace string not working

My WordPress site has been infected with the eval(gzinflate(base64_decode(' hack.
I would like to ssh into the server and find replace all of these lines within my php files with nothing.
I tried the following command:
find . -name "*.php" -print | xargs sed -i 'MY STRING HERE'
I think this is not working because the string has / characters within it which I think need to be escaped.
Can someone please let me know how to escape these characters?
Thanks in advance.
I haven't tried it, so BACK UP YOUR FILES FIRST! As mentioned in some of the comments, this is not the best idea; it might be better to try some other approaches. Anyhow, what about this?
find . -name "*.php" -type f -exec sed -i '/eval(gzinflate(base64_decode(/d' {} \;
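If you want a safety net here, GNU sed can keep a backup of each file it touches by giving -i a suffix:
find . -name "*.php" -type f -exec sed -i.bak '/eval(gzinflate(base64_decode(/d' {} \;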
If you have perl available:
perl -p -i'.bck' -e 's/oldstring/newstring/g' `find ./ -name '*.php'`
=> all files modified will have a backup (suffix '.bck')
sed accepts different delimiters for its substitute command, such as # % | ; : instead of /.
Hence, when the substitution involves one of these delimiters, any of the other delimiters can be used in the sed command so that there is no need to escape the delimiter involved.
Example:
When the replacement/substitution involves "/", the following can be used:
sed 's#/will/this/work#/this/is/working#g' file.txt
Coming to your question, your replacement/substitution involves "/", hence you can use any of the other delimiters.
find . -name "*.php" -print | xargs sed -i 's#/STRING/TO/BE/REPLACED#/REPLACEMENT/STRING#g'

How To Recursively Delete Lines Containing Specific Text Using PHP On Linux Server?

I want to search my entire Linux web server for ALL files and subdirectories containing a specific string.
If that string is found, then delete that line. I am doing this because a virus got put on my website somehow, and it adds a single line of code, so if I find that line, I can delete the virus easily.
Here is my code
$input = 'cd ../home/public_html
find ./ -type f -exec sed -i "/pantscow.ru/d" {} \ ';
echo $input;
$output = shell_exec($input);
echo "<pre>#$output</pre>";
Can you tell me why this doesn't work? It just returns "#".
I had it working a few months ago, but I forgot how to execute it properly.
Thanks.
You can use this single command to find all the PHP files which contain that string:
find . -name "*.php" -exec grep -H abc {} \;
where "abc" is the string you want to search for.

Recursive xgettext?

How can I compile a .po file using xgettext with PHP files with a single command recursively?
My PHP files exist in a hierarchy, and the straight xgettext command doesn't seem to dig down recursively.
Got it:
find . -iname "*.php" | xargs xgettext
I was trying to use -exec before, but that would only run on one file at a time. This runs on the whole bunch.
Yay Google!
For the WINDOWS command line a simple solution is:
@echo off
echo Generating file list..
dir html\wp-content\themes\wpt\*.php /L /B /S > %TEMP%\listfile.txt
echo Generating .POT file...
xgettext -k_e -k__ --from-code utf-8 -o html\wp-content\themes\wpt\lang\wpt.pot -L PHP --no-wrap -D html\wp-content\themes\wpt -f %TEMP%\listfile.txt
echo Done.
del %TEMP%\listfile.txt
You cannot achieve this with one single command. The xgettext option --files-from is your friend.
find . -name '*.php' >POTFILES
xgettext --files-from=POTFILES
If you are positive that you do not have too many source files you can also use find with xargs:
find . -name "*.php" -print0 | xargs -0 xgettext
However, if you have too many source files, xargs will invoke xgettext multiple times so that the maximum command-line length of your platform is not exceeded. To protect yourself against that case, you have to use the xgettext option -j, --join-existing, remove the stale messages file first, and start with an empty one so that xgettext does not bail out:
rm -f messages.po
echo >messages.po
find . -name "*.php" -print0 | xargs -0 xgettext --join-existing
Compare that with the simple solution given first with the list of source files in POTFILES!
Using find with -exec is very inefficient because it will invoke xgettext -j once for every source file to search for translatable strings. In the particular case of xgettext -j it is even more inefficient because xgettext has to read the ever-growing existing output file messages.po with every invocation (that is, with every input source file).
Here's a solution for Windows. At first, install gettext and find from the GnuWin32 tools collection.
http://gnuwin32.sourceforge.net/packages/gettext.htm
gnuwin32.sourceforge.net/packages/findutils.htm
You can run the following command afterwards:
find /source/directory -iname "*.php" -exec xgettext -j -o /output/directory/messages.pot {} ;
The output file has to exist prior to running the command, so the new definitions can be merged with it.
This is the solution I found for recursive search on Mac:
xgettext -o translations/messages.pot --keyword=gettext `find . -name "*.php"`
This generates entries for all uses of the gettext function in files whose extension is .php, including subfolders, and inserts them into translations/messages.pot.
