I'm trying to move a folder in PHP, but keep both files in the destination folder if a duplicate exists.
I tried to do that with recursion, but it's too complicated; too many things can go wrong, for example file permissions and duplicate files/folders.
I'm trying to work with the system() command, and I can't figure out how to move files but keep a backup of duplicates without destroying the extension.
$last_line = system('mv --backup=t websites/test/ websites/test2/', $retval);
gives the following if a file exists in both dirs:
ajax.html~
ajax.html~1
ajax.html~2
What I'm looking for is:
ajax~.html
ajax~1.html
ajax~2.html
or any other pattern like (1), (2)..., but without ruining the extension of the file.
Any ideas, please?
P.S. I must use the system() command.
For this problem, I use sed to find and swap those extensions after the fact in the function below (passing my target directory as the argument):
swap_file_extension_and_backup_number ()
{
    IFS=$'\n'
    for y in $(ls $1)
    do
        mv $1/`echo $y | sed 's/ /\\ /g'` $1/`echo "$y" | sed 's/\(\.[^~]\{3\}\)\(\.~[0-9]\{1,2\}~\)$/\2\1/g'`
    done
}
The function assumes that your file extensions are the usual three characters long, and it will find backups numbered up to two digits, i.e. .~99~.
Explanation:
The first part, $1/`echo $y | sed 's/ /\\ /g'`, is the first argument to mv (the original file); the sed protects you from space characters by escaping them.
The last part, $1/`echo "$y" | sed 's/\(\.[^~]\{3\}\)\(\.~[0-9]\{1,2\}~\)$/\2\1/g'`, is of course the target file, where the two parenthesized groups are swapped, i.e. /\2\1/.
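For example, a quick check of the swap on a single name (using a three-character extension such as .php, since that is what the regex expects):
echo 'ajax.php.~1~' | sed 's/\(\.[^~]\{3\}\)\(\.~[0-9]\{1,2\}~\)$/\2\1/g'
ajax.~1~.php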
If you want to keep the original files and just create a copy, then use cp instead of mv.
If you want to create a backup archive, then make a gzipped tar of the folder like this:
tar -pczf name_of_your_archive.tar.gz /path/to/directory/to/backup
Another option is rsync, which can move files while leaving ones that already exist in the destination untouched:
rsync --ignore-existing --remove-source-files /path/to/source /path/to/dest
Use rsync with the --backup and --backup-dir options, e.g.:
rsync -a --backup --backup-dir /usr/local/backup/2013/03/20/ /path/to/source /path/to/dest
Every time a file might be overwritten, it is copied to the given folder, plus the path to that item, e.g.: /path/to/dest/path/to/source/file.txt
From the looks of things, there doesn't seem to be any built-in method for backing up files while keeping the extension in the correct place. I could be wrong, but I was not able to find one that doesn't do what your original question already pointed out.
Since you said that it's complicated to copy the files over using PHP, perhaps you can keep doing it the way you are now, getting the files in the format
ajax.html~
ajax.html~1
ajax.html~2
Then use PHP to parse through the files and rename them to the format you want. This way you won't have to deal with permissions or duplicate files, the complications you mentioned; you just have to look for files in this format and rename them. A shell sketch of that renaming pass (which system() could run, per the question's requirement) is below.
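This is a minimal sketch, untested against your layout: it assumes the websites/test2 destination from the question and backup names of the form name.ext~ / name.ext~N as listed above.
for f in websites/test2/*~*; do
    base="${f%%~*}"       # part before the first "~", e.g. websites/test2/ajax.html
    num="${f##*~}"        # trailing backup number ("" for a plain "name.ext~")
    name="${base%.*}"     # e.g. websites/test2/ajax
    ext="${base##*.}"     # e.g. html
    mv -- "$f" "${name}~${num}.${ext}"
done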
I am not responding strictly to your question, but the case I am presenting here is very common and therefore valid!
Here's my hack!
TO USE WITH FILES:
#!/bin/bash
# It will find all the files according to the arguments in
# "<YOUR_ARGUMENT_TO_FIND_FILES>" ("find" command) and move them to the
# "<DEST_FOLDER>" folder. Files with the same name will follow the pattern:
# "same_name.ext", "same_name (1).ext", "same_name (2).ext",
# "same_name (3).ext"...
cd <YOUR_TARGET_FOLDER>
mkdir ./<DEST_FOLDER>
find ./ -iname "<YOUR_ARGUMENT_TO_FIND_FILES>" -type f -print0 | xargs -0 -I "{}" sh -c 'cp --backup=numbered "{}" "./<DEST_FOLDER>/" && rm -f "{}"'
cd ./<DEST_FOLDER>
for f_name in *.~*~; do
    f_bak_ext="${f_name##*.}"            # e.g. "~1~"
    f_bak_num="${f_bak_ext//[^0-9]/}"    # e.g. "1"
    f_orig_name="${f_name%.*}"           # e.g. "same_name.ext"
    f_only_name="${f_orig_name%.*}"      # e.g. "same_name"
    f_only_ext="${f_orig_name##*.}"      # e.g. "ext"
    mv "$f_name" "$f_only_name ($f_bak_num).$f_only_ext"
done
cd ..
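For example, with the placeholders filled in (the folder names and pattern here are made up for illustration):
cd /var/www/uploads
mkdir ./merged
find ./ -iname "*.pdf" -type f -print0 | xargs -0 -I "{}" sh -c 'cp --backup=numbered "{}" "./merged/" && rm -f "{}"'
The rename loop then runs inside ./merged exactly as written above.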
TO USE WITH FOLDERS:
#!/bin/bash
# It will find all the folders according to the arguments in
# "<YOUR_ARGUMENT_TO_FIND_FOLDERS>" ("find" command) and move them to the
# "<DEST_FOLDER>" folder. Folders with the same name will have their contents
# merged, however files with the same name WILL NOT HAVE DUPLICATES (example:
# "same_name.ext", "same_name (1).ext", "same_name (2).ext",
# "same_name (3).ext"...).
cd <YOUR_TARGET_FOLDER>
find ./ -path "./<DEST_FOLDER>" -prune -o -iname "<YOUR_ARGUMENT_TO_FIND_FOLDERS>" -type d -print0 | xargs -0 -I "{}" sh -c 'rsync -a "{}" "./<DEST_FOLDER>/" && rm -rf "{}"'
This solution might work in this case
cp --backup=simple src dst
Or
cp --backup=numbered src dst
You can also specify a suffix:
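For instance, --suffix overrides the default ~ used for simple backups (src and dst are placeholders):
cp --backup=simple --suffix=.bak src dst
An existing destination file is then kept as its name plus .bak before being replaced.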
Today, HostGator reported that I have 30K infected files on my shared hosting. Around 25K were the same file, repeated many times. I deleted those in one shot through the terminal with:
find . -name "0x1337.php" -delete
Reference:
Delete large batch of files with same file name in different directories - cpanel/LAMP
One such example path was:
/home1/..../public_html/.........com/wp-includes/js/tinymce/skins/lightgray/fonts/0x1337.php: SL-PHP-BACKDOOR-GENERIC-md5-cavg.UNOFFICIAL FOUND
Around 6K are still left:
/home1/.../public_html/......com/blog/wp-content/plugins/related-posts-slider/styles/.htaccess: SL-HTACCESS-GENERIC-xo.UNOFFICIAL FOUND
If I used this to delete them:
find . -name ".htaccess" -delete → all .htaccess files would be deleted, clean ones included.
Is there a way to refactor this so that only the infected files are deleted?
Can this be taken as a clue → -xo.UNOFFICIAL FOUND?
In an attempt to clarify the steps I mentioned in the comments, I'll write it all out here.
I'm going to be super verbose here to explain what's happening.
My assumption is that Hostgator has provided you with a file named malware.txt that contains entries that look like this:
/path/to/file.php: SOME-MALWARE FOUND
/path/to/another/file.php: SOME-OTHER-MALWARE FOUND
/path/to/clean/file.php: CLEAN
In other words: malware.txt contains a mix of files that were flagged as containing malware and files that were considered clean and you want to keep.
The first step is to turn this file into a new file that only contains the file paths. We'll use grep and cut for this.
grep 'SOME-MALWARE' malware.txt | cut -d ':' -f1 > filelist.txt
grep 'SOME-MALWARE' malware.txt will find all lines in malware.txt that contain the string SOME-MALWARE anywhere. This is on purpose so you can break the list down into smaller sets and target individual malwares at a time.
| will send each line to the next command
cut -d ':' -f1 will split each line by a delimiter (:) and keep only the given field (1 -- the first field, so everything from the beginning of the line up to the first : character)
> filelist.txt will send the results to filelist.txt.
Now, with the sample set, filelist.txt should contain the following:
/path/to/file.php
(to target /path/to/another/file.php change grep to look for SOME-OTHER-MALWARE)
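Before going further, a quick sanity check of the generated list costs nothing (plain commands, nothing specific to this setup):
wc -l filelist.txt     # how many files are about to be targeted
head filelist.txt      # eyeball the first few paths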
The next step is to delete those files. We will use cat and rm for that. Since we cannot pipe directly to rm using |, we will use the xargs command to do that.
cat filelist.txt | xargs -n1 -p rm
cat filelist.txt will read all lines in filelist.txt one by one
| will send each line to the next command
xargs -n1 -p rm will send each line as an argument to the rm command.
-n1 will ensure it processes one line at a time (without this, it will bunch lines together into a single rm command to delete multiple files at once)
-p will prompt you to confirm that you want to continue before actually performing the rm command
The reason for -p is so you can verify that this all works correctly. Once you've verified that, you can remove the -p argument if you want.
Note that this assumes all your filenames are pretty basic. If they contain special characters or spaces, these will need to be escaped or it will break.
For example, if you have a file named /path/to/some file.php, this code will try to remove the files /path/to/some and file.php.
If you notice trouble here, you can adjust the xargs command like this:
cat filelist.txt | xargs -I "{}" -n1 -p rm {}
The -I "{}" part will wrap the input argument (the filename from cat) in double quotes, the {} at the end will make sure rm uses the wrapped argument instead of the original.
If you run this command multiple times, you will probably see a lot of "file not found" errors. That is expected for files that were already deleted the last time you ran it so shouldn't be anything to worry about.
I hope this clarifies things, let me know if this is unclear in any way.
I'm trying to find and replace all old style PHP open tags: <? and <?=. I've tried several things:
Find all <? strings and replace them with <?php and ignore XML
sudo grep -ri "<?x[^m]" --include \*.php /var/www/
This returns no results, so all tags that open with <?x are XML opening tags and should be ignored.
Then I did the same for tags that start with <?p but are not <?php
sudo grep -ri "<?p[^h]" --include \*.php /var/www/
This returned one page that I edited manually - so this won't return results anymore. So I can be sure that tags that start with <?p all are <?php and the same goes for x and xml.
sudo grep -ri "<?[^xp]" --include \*.php /var/www/
Find more opening tags that should not be replaced
From here on I can run the above command and see what turns up: spaces, tabs, newlines, = and { (which can be ignored). I thought that \s would take care of whitespace, but I still get many results back.
Trying this results in endless lists with tabs in it:
sudo grep -ri "<?[^xp =}\t\n\s]" --include \*.php /var/www/
So in the end this is not useful; I can't scan thousands of lines. What is wrong with this expression? If, say, <?jsp existed somewhere and shouldn't be replaced, I'd want to know about it, exclude it, get a shorter list back, and repeat until the list is empty. That way I'm sure I'm not going to change tags that shouldn't be changed.
Update: ^M
If I open the results in Vim, I see ^M, which is a carriage-return character. It can be matched by putting a literal carriage return into the grep string where ^M appears in the code below: press Ctrl+V, Ctrl+M directly on the command line. This reduces the results to 1000 lines.
sudo grep -ri "<?[^xp =}\t\n\s^M]" --include \*.php /var/www/
Replace the old tags
If this expression works, I want to run a sed command and use it to replace the old opening tags.
<? should become <?php (with ending space)
<?= should become <?php echo (with ending space)
This would result in one or more commands like these, first replacing <?, then <?=.
sudo find /var/www/ -type f -name "*.php" -exec sed -i 's/<?[^xp=]/<?php /g' {} \;
sudo find /var/www/ -type f -name "*.php" -exec sed -i 's/<?=/<?php echo /g' {} \;
Questions
To get the search (grep) and replace (sed) working, I need to know how to exclude all whitespace. In Vim I see a ^M character which needs to be excluded.
If my approach is wrong, please let me know. All suggestions are welcome.
I just did a small Perl test here with a few files... seems to work fine. Wouldn't this do the trick for you?
shopt -s globstar # turn on **
perl -p -e 's/<\?=/<?php echo /g; s/<\?(?!php|xml)/<?php /g' so-test/**/*.php
Change so-test for the folder you want to test on.
Add the -i.bak option before -e to create backup files.
Use -i alone (without the .bak) to modify the files in place. Without -i, the result is printed to the console rather than written to the files, which is good for testing!
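If your shell has no globstar (it needs bash 4+), find can feed the same substitution recursively; an equivalent sketch:
find so-test -name '*.php' -exec perl -pi.bak -e 's/<\?=/<?php echo /g; s/<\?(?!php|xml)/<?php /g' {} +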
I am trying to create a tar archive on my server through PHP as follows:
exec('tar -cvf myfile.tar tmp_folder/innerfolder/');
It works fine, but the saved file preserves the full path, including tmp_folder/innerfolder/.
I am creating these on the fly for users, so it's a bit unusable for them to have this path when extracting. I have reviewed this topic - How to strip path while archiving with TAR - but the explanation there doesn't include an example, and I don't quite understand what to do.
Please, tell me with an example, how to add files to tar in a way that it does not preserve the 'tmp_folder/innerfolder/' part in archive?
Thanks in advance
Use the -C option to tar:
tar -C tmp_folder/innerfolder -cvf myfile.tar .
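To double-check what was stored, list the archive; the entries should now be relative (./file style, with no tmp_folder/innerfolder/ prefix):
tar -tf myfile.tar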
You can cheat:
exec('cd /path/to/tmp_folder/ && tar -cvf /path/to/myfile.tar innerfolder/');
This would give your users just innerfolder when they extract the tarball.
You can use --transform
tar -cf files.tar --transform='s,/your/path/,,' /your/path/file1 /your/path/file2
tar -tf files.tar
file1
file2
More info: http://www.gnu.org/software/tar/manual/html_section/transform.html
The --directory option (the long form of -C) does the same thing:
tar czf ~/backup.tgz --directory=/path filetotar
If you want to preserve the current directory name but not the full path to it, try something like this (executed from within the directory that you want to tar; assumes bash/zsh):
ORIGDIR=${PWD##*/}
tar -C `dirname $PWD` -cvf ../archive.tar "$ORIGDIR"
Here's some detail; first:
ORIGDIR=${PWD##*/}
.. stores the current directory name (i.e. the name of the directory you're in). Then, in the tar command:
-C `dirname $PWD`
.. switches tar's "working directory" from the standard root ("/") to the parent of the folder you want to archive. Strangely the -C switch only affects the path for building the archive, but not the location the archive itself will be stored in. Hence you'll still have to prefix the archive name with "../", or else tar will place it within the folder you started the command in. Finally, $ORIGDIR is relative to the parent directory, and so it and its contents are archived recursively into the tar (but without the path leading to it).
I want to rename all files in a folder and add a .xml extension. I am using Unix. How can I do that?
On the shell, you can do this:
for file in *; do
    if [ -f "${file}" ]; then
        mv "${file}" "${file}.xml"
    fi
done
Edit
To do this recursively on all subdirectories, you should use find:
for file in $(find . -type f); do
    mv "${file}" "${file}.xml"
done
On the other hand, if you're going to do anything more complex than this, you probably shouldn't use shell scripts.
Better still
Use the comment provided by Jonathan Leffler below:
find . -type f -exec mv {} {}.xml ';'
I don't know if this is standard, but my Perl package (Debian/Ubuntu) includes /usr/bin/prename (and a symlink to it named rename), which has no other purpose:
rename 's/$/.xml/' *
find . -type f \! -name '*.xml' -print0 | xargs -0 rename 's/$/.xml/'
In Python:
Use os.listdir to find names of all files in a directory. If you need to recursively find all files in sub-directories as well, use os.walk instead. Its API is more complex than os.listdir but it provides powerful ways to recursively walk directories.
Then use os.rename to rename the files.
How can I compile a .po file using xgettext with PHP files with a single command recursively?
My PHP files exist in a hierarchy, and the straight xgettext command doesn't seem to dig down recursively.
Got it:
find . -iname "*.php" | xargs xgettext
I was trying to use -exec before, but that would only run one file at a time. This runs them in one batch.
Yay Google!
For the WINDOWS command line, a simple solution is:
@echo off
echo Generating file list..
dir html\wp-content\themes\wpt\*.php /L /B /S > %TEMP%\listfile.txt
echo Generating .POT file...
xgettext -k_e -k__ --from-code utf-8 -o html\wp-content\themes\wpt\lang\wpt.pot -L PHP --no-wrap -D html\wp-content\themes\wpt -f %TEMP%\listfile.txt
echo Done.
del %TEMP%\listfile.txt
You cannot achieve this with one single command. The xgettext option --files-from is your friend.
find . -name '*.php' >POTFILES
xgettext --files-from=POTFILES
If you are positive that you do not have too many source files you can also use find with xargs:
find . -name "*.php" -print0 | xargs -0 xgettext
However, if you have too many source files, xargs will invoke xgettext multiple times so that the maximum command-line length of your platform is not exceeded. In order to protect yourself against that case you have to use the xgettext option -j, --join-existing, remove the stale messages file first, and start with an empty one so that xgettext does not bail out:
rm -f messages.po
echo >messages.po
find . -name "*.php" -print0 | xargs -0 xgettext --join-existing
Compare that with the simple solution given first with the list of source files in POTFILES!
Using find with -exec is very inefficient because it will invoke xgettext -j once for every source file to search for translatable strings. In the particular case of xgettext -j it is even more inefficient because xgettext has to read the ever-growing existing output file messages.po on every invocation (that is, for every input source file).
Here's a solution for Windows. At first, install gettext and find from the GnuWin32 tools collection.
http://gnuwin32.sourceforge.net/packages/gettext.htm
gnuwin32.sourceforge.net/packages/findutils.htm
You can run the following command afterwards:
find /source/directory -iname "*.php" -exec xgettext -j -o /output/directory/messages.pot {} ;
The output file has to exist prior to running the command, so the new definitions can be merged with it.
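For example, seeding it with an empty line first (cmd syntax; the path simply mirrors the example above):
echo. > \output\directory\messages.pot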
This is the solution I found for recursive search on Mac:
xgettext -o translations/messages.pot --keyword=gettext `find . -name "*.php"`
This generates entries for all uses of the gettext method in files whose extension is php, including subfolders, and inserts them into translations/messages.pot.