GREP function from Python to PHP - php

I have a python script I wrote that I need to port to php. It recursively searches a given directory and builds a string based on regex searches. The first function I am trying to port is below. It takes a regex and a base dir, recursively searches all files in that dir for the regex, and builds a list of the string matches.
import os
import re

def grep(regex, base_dir):
    matches = list()
    for path, dirs, files in os.walk(base_dir):
        for filename in files:
            fullpath = os.path.join(path, filename)
            with open(fullpath, 'r') as f:
                content = f.read()
            matches = matches + re.findall(regex, content)
    return matches
I never use PHP except for basic GET param manipulation. I grabbed some directory walking code from the web, and am struggling to make it work like the Python function above because I don't know the PHP API at all.
function findFiles($dir = '.', $pattern = '/./'){
$prefix = $dir . '/';
$dir = dir($dir);
while (false !== ($file = $dir->read())){
if ($file === '.' || $file === '..') continue;
$file = $prefix . $file;
if (is_dir($file)) findFiles($file, $pattern);
if (preg_match($pattern, $file)){
echo $file . "\n";
}
}
}

Here is my solution:
<?php
class FileGrep {
private $dirs; // Scanned directories list
private $files; // Found files list
private $matches; // Matches list
function __construct() {
$this->dirs = array();
$this->files = array();
$this->matches = array();
}
function findFiles($path, $recursive = TRUE) {
$this->dirs[] = realpath($path);
foreach (scandir($path) as $file) {
if (($file != '.') && ($file != '..')) {
$fullname = realpath("{$path}/{$file}");
if (is_dir($fullname) && !is_link($fullname) && $recursive) {
if (!in_array($fullname, $this->dirs)) {
$this->findFiles($fullname, $recursive);
}
} else if (is_file($fullname)){
$this->files[] = $fullname;
}
}
}
return($this->files);
}
function searchFiles($pattern) {
$this->matches = array();
foreach ($this->files as $file) {
if ($contents = file_get_contents($file)) {
if (preg_match_all($pattern, $contents, $matches) > 0) {
//echo $file."\n";
// $matches[0] lists every full match, mirroring Python's re.findall
$this->matches = array_merge($this->matches, $matches[0]);
}
}
}
return($this->matches);
}
}
// Usage example:
$fg = new FileGrep();
$files = $fg->findFiles('.'); // List all the files in current directory and its subdirectories
$matches = $fg->searchFiles('/open/'); // Search for the "open" string in all those files
?>
<html>
<body>
<pre><?php print_r($matches) ?></pre>
</body>
</html>
Be aware that:
It reads each file fully into memory to search for the pattern, so it may require a lot of memory (check the "memory_limit" setting in your php.ini file).
It doesn't treat the pattern or the file contents as Unicode by default. If you are working with UTF-8 files, add the "u" modifier to the pattern (for example '/open/u') or use the mb_ereg functions rather than the preg_* functions.
It doesn't follow symbolic links.
In conclusion, even if it's not the most efficient solution, it should work.
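For comparison, here is a minimal sketch of a more direct port of the Python function, assuming PHP 7 or later with the SPL iterators available. The name grepDir is only illustrative; it is not an existing PHP function.

```php
<?php
// Sketch of a PHP counterpart to the Python grep() in the question.
// grepDir is a hypothetical name chosen for this example.
function grepDir(string $regex, string $baseDir): array
{
    $matches = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($baseDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $fileInfo) {
        if (!$fileInfo->isFile()) {
            continue;
        }
        $content = file_get_contents($fileInfo->getPathname());
        if ($content === false) {
            continue; // unreadable file: skip it
        }
        // preg_match_all fills $found[0] with every full-pattern match
        if (preg_match_all($regex, $content, $found) > 0) {
            $matches = array_merge($matches, $found[0]);
        }
    }
    return $matches;
}
```

Called as grepDir('/open/', '.'), it returns a flat array of matched strings. One difference from Python: re.findall returns the captured groups when the pattern contains groups, while this sketch always returns the full matches.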

Related

Get directory structure from FTP using PHP

I have ftp server with a lot of subfolders and files in it. I need to retrieve the folder structure from ftp, which shows names of all folders and subfolders from a specified starting path. I'm not interested in files included in each folder, only the directory tree. I'm using PHP and my server does not support mlsd.
Thanks for help.
I implemented my own recursive function, which for some reason is not working.
function ftp_list_files_recursive($ftp_stream, $path) {
$lines = ftp_rawlist($ftp_stream, $path);
$result = [];
if (is_array($lines) || is_object($lines)) {
foreach ($lines as $line) {
$exp0 = explode('<', $line);
if (sizeof($exp0) > 1):
$exp1 = explode('>', $exp0[1]);
if ($exp1[0] == 'DIR') {
$file_path=$path . "/" . ltrim($exp1[1]);
$result = array_merge($result, ftp_list_files_recursive($ftp_stream, $file_path));
} else {
$result[] = $file_path;
}
endif;
}
}
return $result;
}
The ftp_rawlist returns directory info as: 01-18-20 01:00PM <DIR> DirName so first I explode on < and check whether it was successful. If yes, then it means a string had DIR in it and it can be further exploded on >. It could have been done with regular expression, but that works for me now. If I print $file_path variable I see that it changes, so I assume the recursion works. However, the $result array is always empty. Any thoughts on that?
Start here: PHP FTP recursive directory listing.
You just need to adjust the code to:
the DOS-style listing you have from your FTP server (IIS probably) and
to collect only the folders.
function ftp_list_dirs_recursive($ftp_stream, $directory)
{
$result = [];
$lines = ftp_rawlist($ftp_stream, $directory);
if ($lines === false)
{
die("Cannot list $directory");
}
foreach ($lines as $line)
{
// rather lame parsing as a quick example:
if (strpos($line, "<DIR>") !== false)
{
$dir_path = $directory . "/" . ltrim(substr($line, strpos($line, ">") + 1));
$subdirs = ftp_list_dirs_recursive($ftp_stream, $dir_path);
$result = array_merge($result, [$dir_path], $subdirs);
}
}
return $result;
}
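The "rather lame parsing" above can be tightened with a regular expression. This is a sketch only, assuming every directory line looks like the sample in the question (date, time, the literal <DIR> marker, then the name); parseDosDirLine is a hypothetical helper name.

```php
<?php
// Hypothetical helper: extracts the directory name from one DOS-style
// listing line such as "01-18-20 01:00PM <DIR> DirName".
// Returns null for lines that do not describe a directory.
function parseDosDirLine(string $line): ?string
{
    // date, time, the literal <DIR> marker, then the name (may contain spaces)
    if (preg_match('/^\S+\s+\S+\s+<DIR>\s+(.+)$/', rtrim($line), $m) === 1) {
        return $m[1];
    }
    return null;
}
```

Inside the loop, a non-null result would take the place of the strpos/substr pair, and a null result means the line is a file entry and can be skipped.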

List all .jpg files from all folders and subfolders

I would like to list all .jpg files from folders and subfolders.
I have that simple code:
<?php
// directory
$directory = "img/*/";
// file type
$images = glob("" . $directory . "*.jpg");
foreach ($images as $image) {
echo $image."<br>";
}
?>
But that lists .jpg files from img folder and one down.
How to scan all subfolders?
PHP comes with DirectoryIterator, which is very useful in this case.
Please note that this simple function can easily be improved by storing the whole path to each file instead of only the file name, and perhaps by returning the result instead of using a reference.
/*
* Find all files of the given type.
* @dir : A directory from which to start the search
* @ext : The extension. XXX : Don't call it with the "." separator
* @store : A REFERENCE to an array in which to store the elements found.
* */
function allFileOfType($dir, $ext, &$store) {
foreach(new DirectoryIterator($dir) as $subItem) {
if ($subItem->isFile() && $subItem->getExtension() == $ext)
array_push($store, $subItem->getFileName());
elseif(!$subItem->isDot() && $subItem->isDir())
allFileOfType($subItem->getPathName(), $ext, $store);
}
}
$jpgStore = array();
allFileOfType(__DIR__, "jpg", $jpgStore);
print_r($jpgStore);
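The two improvements mentioned above (full paths, no reference parameter) could look like the following sketch. It swaps the hand-written recursion for RecursiveDirectoryIterator, and findFilesByExtension is an illustrative name, not part of PHP.

```php
<?php
// Sketch: collect full paths of files with a given extension, returned
// as an array instead of filled through a reference parameter.
// findFilesByExtension is a hypothetical name.
function findFilesByExtension(string $dir, string $ext): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $fileInfo) {
        // compare case-insensitively so "JPG" and "jpg" both match
        if ($fileInfo->isFile() && strcasecmp($fileInfo->getExtension(), $ext) === 0) {
            $found[] = $fileInfo->getPathname();
        }
    }
    return $found;
}
```

For the question's layout this would be called as findFilesByExtension('img', 'jpg'), again without the "." separator.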
Since a directory can contain subdirectories, which in turn can contain more subdirectories, we should use a recursive function. glob() alone is not sufficient here. This might work for you:
<?php
function getDir4JpgR($directory) {
if ($handle = opendir($directory)) {
while (false !== ($entry = readdir($handle))) {
if($entry != "." && $entry != "..") {
$str1 = "$directory/$entry";
if(preg_match("/\.jpg$/i", $entry)) {
echo $str1 . "<br />\n";
} else {
if(is_dir($str1)) {
getDir4JpgR($str1);
}
}
}
}
closedir($handle);
}
}
//
// call the recursive function in the main block:
//
// directory
$directory = "img";
getDir4JpgR($directory);
?>
I put this into a file named listjpgr.php and opened it in Chrome. [Screenshot of the resulting output]

How to use php keep only specific file and remove others in directory?

How can I use PHP to keep only one specific file in a directory and remove the others?
Example:
1/1.png, 1/2.jpeg, 1/5.png ...
The file numbers and file types are random, like x.png or x.jpeg, but I have a string, 2.jpeg, naming the file I need to keep.
Any suggestions on how to do this?
Thanks for the replies. I'm now coding it as below, but the unlink function doesn't seem to delete anything. Do I need to change some setting? I'm using MAMP.
UPDATE
// explode string <img src="u_img_p/5/x.png">
$content_p_img_arr = explode('u_img_p/', $content_p_img);
$content_p_img_arr_1 = explode('"', $content_p_img_arr[1]); // get 5/2.png">
$content_p_img_arr_2 = explode('/', $content_p_img_arr_1[0]); // get 5/2.png
print $content_p_img_arr_2[1]; // get 2.png < the file need to keep
$dir = "u_img_p/".$id;
if ($opendir = opendir($dir)){
print $dir;
while (($file = readdir($opendir)) !== false) {
if($file!="." && $file!= ".." && $file!= $content_p_img_arr_2[1]){
unlink($file);
print "unlink";
print $file;
}
}
}
I changed the unlink path to include the folder, and then it works!
unlink("u_img_p/".$id.'/'.$file);
http://php.net/manual/en/function.scandir.php
This will get all files in a directory into an array, then you can run a foreach() on the array and look for patterns / matches on each file.
unlink() can be used to delete the file.
$dir = "/pathto/files";
$exclude[] = "2.jpeg";
foreach (scandir($dir) as $file) {
// is_file() skips ".", ".." and any subdirectories
if (is_file("$dir/$file") && !in_array($file, $exclude)) {
unlink("$dir/$file");
}
}
Simple and to the point. You can add multiple files to the $exclude array.
$dir = "your_folder_path";
if ($opendir = opendir($dir)){
//read directory
while(($file = readdir($opendir)) !== false){
if($file!="." && $file!= ".." && $file!= "2.jpg"){
// readdir() returns bare names, so prefix the directory
unlink($dir . "/" . $file);
}
}
closedir($opendir);
}
function remove_files( $folder_path , $aexcludefiles )
{
if (is_dir($folder_path))
{
if ($dh = opendir($folder_path))
{
while (($file = readdir($dh)) !== false)
{
if( $file == '.' || $file == '..' )
continue ;
if( in_array( $file , $aexcludefiles ) )
continue ;
$file_path = $folder_path."/".$file ;
if( is_link( $file_path ) )
continue ;
unlink( $file_path ) ;
}
closedir($dh);
}
}
}
$aexcludefiles = array( "2.jpeg" );
remove_files( "1" , $aexcludefiles ) ;
I'm surprised people don't use glob() more. Here is another idea:
$dir = '/absolute/path/to/u_img_p/5/';
$exclude[] = $dir . '2.jpg';
$filesToDelete = array_diff(glob($dir . '*.jpg'), $exclude);
array_map('unlink', $filesToDelete);
First, glob() returns an array of files based on the pattern provided to it. Next, array_diff() finds all the elements in the first array that aren't in the second. Finally, use array_map() with unlink() to delete all but the excluded file(s). Be sure to use absolute paths*.
You could even make it into a helper function. Here's a start:
<?php
/**
* @param string $path
* @param string $pattern
* @param array $exclude
* @return bool
*/
function deleteFiles($path, $pattern, $exclude = [])
{
$basePath = '/absolute/path/to/your/webroot/or/images/or/whatever/';
$path = $basePath . trim($path, '/');
if (is_dir($path)) {
array_map(
'unlink',
array_diff(glob($path . '/' . $pattern), $exclude)
);
return true;
}
return false;
}
*unlink() won't work unless the paths returned by glob() happen to be relative to where unlink() is called. Since glob() returns only what it matches, it's best to use the absolute path of the directory that contains the files to delete/exclude. See the docs and comments on how glob() matches, and give it a play to see how it works.

php glob - scan in subfolders for a file

I have a server with a lot of files inside various folders, sub-folders, and sub-sub-folders.
I'm trying to make a search.php page that would be used to search the whole server for a specific file. If the file is found, then return the location path to display a download link.
Here's what i have so far:
$root = $_SERVER['DOCUMENT_ROOT'];
$search = "test.zip";
$found_files = glob("$root/*/test.zip");
$downloadlink = str_replace("$root/", "", $found_files[0]);
if (!empty($downloadlink)) {
echo "$search";
}
The script works perfectly if the file is inside the root of my domain name... Now I'm trying to find a way to make it also scan sub-folders and sub-sub-folders, but I'm stuck here.
There are 2 ways.
Use glob to do recursive search:
<?php
// Does not support flag GLOB_BRACE
function rglob($pattern, $flags = 0) {
$files = glob($pattern, $flags);
foreach (glob(dirname($pattern).'/*', GLOB_ONLYDIR|GLOB_NOSORT) as $dir) {
$files = array_merge(
[],
...[$files, rglob($dir . "/" . basename($pattern), $flags)]
);
}
return $files;
}
// usage: to find the test.zip file recursively
$result = rglob($_SERVER['DOCUMENT_ROOT'] . '/test.zip');
var_dump($result);
// to find all files whose names end with test.zip
$result = rglob($_SERVER['DOCUMENT_ROOT'] . '/*test.zip');
?>
Use RecursiveDirectoryIterator
<?php
// $regPattern should be using regular expression
function rsearch($folder, $regPattern) {
$dir = new RecursiveDirectoryIterator($folder);
$ite = new RecursiveIteratorIterator($dir);
$files = new RegexIterator($ite, $regPattern, RegexIterator::GET_MATCH);
$fileList = array();
foreach($files as $file) {
$fileList = array_merge($fileList, $file);
}
return $fileList;
}
// usage: to find the test.zip file recursively
$result = rsearch($_SERVER['DOCUMENT_ROOT'], '/.*\/test\.zip/');
var_dump($result);
?>
RecursiveDirectoryIterator comes with PHP5 while glob is from PHP4. Both can do the job, it's up to you.
I want to provide another simple alternative for cases where you can predict a max depth. You can use a pattern with braces listing all possible subfolder depths.
This example allows 0-3 arbitrary subfolders:
glob("$root/{,*/,*/*/,*/*/*/}test_*.zip", GLOB_BRACE);
Of course the braced pattern could be procedurally generated.
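As a sketch of that procedural generation, the helper below builds the braced part for any maximum depth. braceDepthPattern is a hypothetical name, and the approach still assumes GLOB_BRACE is supported on your platform.

```php
<?php
// Sketch: build the "{,*/,*/*/,...}" part for a chosen maximum depth.
// braceDepthPattern is a hypothetical name.
function braceDepthPattern(int $maxDepth): string
{
    $alternatives = [];
    for ($depth = 0; $depth <= $maxDepth; $depth++) {
        // depth 0 is the empty alternative, depth 2 is "*/*/", and so on
        $alternatives[] = str_repeat('*/', $depth);
    }
    return '{' . implode(',', $alternatives) . '}';
}
```

With this, the example above becomes glob("$root/" . braceDepthPattern(3) . "test_*.zip", GLOB_BRACE).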
This returns the full path to the file:
function rsearch($folder, $pattern) {
$iti = new RecursiveDirectoryIterator($folder);
foreach(new RecursiveIteratorIterator($iti) as $file){
if(strpos($file , $pattern) !== false){
return $file;
}
}
return false;
}
call the function:
$filepath = rsearch('/home/directory/thisdir/', "/findthisfile.jpg");
This returns something like:
/home/directory/thisdir/subdir/findthisfile.jpg
You can extend this function to find several files, for example all JPEG files:
function rsearch($folder, $pattern_array) {
$return = array();
$iti = new RecursiveDirectoryIterator($folder);
foreach(new RecursiveIteratorIterator($iti) as $file){
if (in_array(strtolower($file->getExtension()), $pattern_array)){
$return[] = $file;
}
}
return $return;
}
This can be called as:
$filepaths = rsearch('/home/directory/thisdir/', array('jpeg', 'jpg') );
Ref: https://stackoverflow.com/a/1860417/219112
As a full solution for your problem (this was also my problem):
<?php
function rsearch($folder, $pattern) {
$dir = new RecursiveDirectoryIterator($folder);
$ite = new RecursiveIteratorIterator($dir);
$files = new RegexIterator($ite, $pattern, RegexIterator::MATCH);
foreach($files as $file) {
yield $file->getPathName();
}
}
This will get you the full path of the items that you wish to find.
Edit: Thanks to Rousseau Alexandre for pointing out that $pattern must be a regular expression.
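Because $pattern has to be a regular expression, a literal file name such as test.zip needs escaping first. Here is a small sketch using preg_quote; fileNamePattern is a hypothetical helper, and it assumes paths use forward slashes.

```php
<?php
// Sketch: turn a literal file name into a pattern usable with RegexIterator.
// fileNamePattern is a hypothetical name; "#" is used as the delimiter so
// that "/" needs no escaping.
function fileNamePattern(string $fileName): string
{
    // preg_quote escapes regex metacharacters (the "." in "test.zip")
    return '#(^|/)' . preg_quote($fileName, '#') . '$#';
}
```

The result can be passed straight to the generator, for example rsearch($folder, fileNamePattern('test.zip')). It matches only paths whose last component is exactly that name.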

Deleting all files from a folder using PHP?

For example, I have a folder called 'Temp' and I want to delete or flush all files from this folder using PHP. Can I do this?
$files = glob('path/to/temp/*'); // get all file names
foreach($files as $file){ // iterate files
if(is_file($file)) {
unlink($file); // delete file
}
}
If you want to remove 'hidden' files like .htaccess, you have to use
$files = glob('path/to/temp/{,.}*', GLOB_BRACE);
If you want a one-liner, use this combination of array_map, unlink and glob. Note that unlink() removes files only, so any subfolders are left in place (with a warning):
array_map( 'unlink', array_filter((array) glob("path/to/temp/*") ) );
This call can also handle empty directories (thanks for the tip, @mojuba!).
Here is a more modern approach using the Standard PHP Library (SPL).
$dir = "path/to/directory";
if(file_exists($dir)){
$di = new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS);
$ri = new RecursiveIteratorIterator($di, RecursiveIteratorIterator::CHILD_FIRST);
foreach ( $ri as $file ) {
$file->isDir() ? rmdir($file) : unlink($file);
}
}
foreach (new DirectoryIterator('/path/to/directory') as $fileInfo) {
if(!$fileInfo->isDot()) {
unlink($fileInfo->getPathname());
}
}
This code is from http://php.net/unlink:
/**
* Delete a file or recursively delete a directory
*
* @param string $str Path to file or directory
*/
function recursiveDelete($str) {
if (is_file($str)) {
return @unlink($str);
}
elseif (is_dir($str)) {
$scan = glob(rtrim($str,'/').'/*');
foreach($scan as $index=>$path) {
recursiveDelete($path);
}
return @rmdir($str);
}
}
$dir = 'your/directory/';
foreach(glob($dir.'*.*') as $v){
unlink($v);
}
If the folder contains A LOT of files, reading them all and then deleting them in two steps does not perform well.
I believe the fastest way to delete files is to just use a system command.
For example, on Linux I use:
exec('rm -f '. $absolutePathToFolder .'*');
Or this if you want recursive deletion without the need to write a recursive function
exec('rm -f -r '. $absolutePathToFolder .'*');
Equivalent commands exist for the other operating systems PHP supports.
Keep in mind this is the FAST way of deleting files. $absolutePathToFolder MUST be checked and secured before running this code, and the necessary permissions must be granted.
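What "checked and secured" means depends on your application. One possible check, sketched here under the assumption that all deletable folders live below a single known base directory, is to resolve the path and confirm it stays inside that base. isInsideBase is a hypothetical helper name.

```php
<?php
// Sketch: accept a folder only if it resolves to a real directory that is
// strictly inside an allowed base directory. isInsideBase is a
// hypothetical name; the base directory is whatever you consider safe.
function isInsideBase(string $folder, string $allowedBase): bool
{
    $realFolder = realpath($folder);
    $realBase = realpath($allowedBase);
    if ($realFolder === false || $realBase === false || !is_dir($realFolder)) {
        return false;
    }
    // require a separator after the base so "/tmp/app2" does not pass for "/tmp/app"
    $prefix = rtrim($realBase, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($realFolder . DIRECTORY_SEPARATOR, $prefix, strlen($prefix)) === 0
        && $realFolder !== $realBase;
}
```

Only when this returns true would the resolved path be wrapped in escapeshellarg() and concatenated into the rm command.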
See readdir and unlink.
<?php
if ($handle = opendir('/path/to/files'))
{
echo "Directory handle: $handle\n";
echo "Files:\n";
while (false !== ($file = readdir($handle)))
{
// readdir() returns bare names, so prepend the directory
$path = '/path/to/files/' . $file;
if( is_file($path) )
{
unlink($path);
}
}
closedir($handle);
}
?>
The simple and best way to delete all files from a folder in PHP
$files = glob('my_folder/*'); //get all file names
foreach($files as $file){
if(is_file($file))
unlink($file); //delete file
}
Got this source code from here - http://www.codexworld.com/delete-all-files-from-folder-using-php/
unlinkr function recursively deletes all the folders and files in given path by making sure it doesn't delete the script itself.
function unlinkr($dir, $pattern = "*") {
// find all files and folders matching pattern
$files = glob($dir . "/$pattern");
//iterate through the files and folders
foreach($files as $file){
//if it is a directory then re-call unlinkr function to delete files inside this directory
if (is_dir($file) and !in_array($file, array('..', '.'))) {
echo "<p>opening directory $file </p>";
unlinkr($file, $pattern);
//remove the directory itself
echo "<p> deleting directory $file </p>";
rmdir($file);
} else if(is_file($file) and ($file != __FILE__)) {
// make sure you don't delete the current script
echo "<p>deleting file $file </p>";
unlink($file);
}
}
}
If you want to delete all files and folders in the directory where you place this script, call it as follows:
//get current working directory
$dir = getcwd();
unlinkr($dir);
If you want to delete just the PHP files, call it as follows:
unlinkr($dir, "*.php");
You can use any other path to delete the files as well:
unlinkr("/home/user/temp");
This will delete all files in the /home/user/temp directory.
Another solution:
This class deletes all files, subdirectories, and the files in those subdirectories.
class Your_Class_Name {
/**
* @see http://php.net/manual/de/function.array-map.php
* @see http://www.php.net/manual/en/function.rmdir.php
* @see http://www.php.net/manual/en/function.glob.php
* @see http://php.net/manual/de/function.unlink.php
* @param string $path
*/
public function delete($path) {
if (is_dir($path)) {
array_map(function($value) {
$this->delete($value);
rmdir($value);
},glob($path . '/*', GLOB_ONLYDIR));
array_map('unlink', glob($path."/*"));
}
}
}
Posted a general purpose file and folder handling class for copy, move, delete, calculate size, etc., that can handle a single file or a set of folders.
https://gist.github.com/4689551
To use:
To copy (or move) a single file or a set of folders/files:
$files = new Files();
$results = $files->copyOrMove('source/folder/optional-file', 'target/path', 'target-file-name-for-single-file.only', 'copy');
Delete a single file or all files and folders in a path:
$files = new Files();
$results = $files->delete('source/folder/optional-file.name');
Calculate the size of a single file or a set of files in a set of folders:
$files = new Files();
$results = $files->calculateSize('source/folder/optional-file.name');
<?php
//delete all files from folder & sub folders
function listFolderFiles($dir)
{
$ffs = scandir($dir);
echo '<ol>';
foreach ($ffs as $ff) {
if ($ff != '.' && $ff != '..') {
if (is_file("$dir/$ff")) {
unlink("$dir/$ff");
}
echo '<li>' . $ff;
if (is_dir($dir . '/' . $ff)) {
listFolderFiles($dir . '/' . $ff);
}
echo '</li>';
}
}
echo '</ol>';
}
$arr = array(
"folder1",
"folder2"
);
for ($x = 0; $x < count($arr); $x++) {
$mm = $arr[$x];
listFolderFiles($mm);
}
//end
?>
For me, the solution with readdir was best and worked like a charm. With glob, the function was failing with some scenarios.
// Remove a directory recursively
function removeDirectory($dirPath) {
if (! is_dir($dirPath)) {
return false;
}
if (substr($dirPath, strlen($dirPath) - 1, 1) != '/') {
$dirPath .= '/';
}
if ($handle = opendir($dirPath)) {
while (false !== ($sub = readdir($handle))) {
if ($sub != "." && $sub != ".." && $sub != "Thumb.db") {
$file = $dirPath . $sub;
if (is_dir($file)) {
removeDirectory($file);
} else {
unlink($file);
}
}
}
closedir($handle);
}
rmdir($dirPath);
}
public static function recursiveDelete($dir)
{
foreach (new \DirectoryIterator($dir) as $fileInfo) {
if (!$fileInfo->isDot()) {
if ($fileInfo->isDir()) {
self::recursiveDelete($fileInfo->getPathname());
} else {
unlink($fileInfo->getPathname());
}
}
}
rmdir($dir);
}
I've built a really simple package called "Pusheh". Using it, you can clear a directory or remove a directory completely (Github link). It's available on Packagist, also.
For instance, if you want to clear Temp directory, you can do:
Pusheh::clearDir("Temp");
// Or you can remove the directory completely
Pusheh::removeDirRecursively("Temp");
If you're interested, see the wiki.
I updated the answer of @Stichoza to remove files through subfolders.
function glob_recursive($pattern, $flags = 0) {
$fileList = glob($pattern, $flags);
foreach (glob(dirname($pattern).'/*', GLOB_ONLYDIR|GLOB_NOSORT) as $dir) {
$subPattern = $dir.'/'.basename($pattern);
$subFileList = glob_recursive($subPattern, $flags);
$fileList = array_merge($fileList, $subFileList);
}
return $fileList;
}
function glob_recursive_unlink($pattern, $flags = 0) {
array_map('unlink', glob_recursive($pattern, $flags));
}
This is a simple and good solution. Try this code:
array_map('unlink', array_filter((array) array_merge(glob("folder_name/*"))));
