Recursively counting files with PHP

Simple question for a newb and my Google-Fu is failing me. Using PHP, how can you count the number of files in a given directory, including any sub-directories (and any sub-directories they might have, etc.)? e.g. if directory structure looks like this:
/Dir_A/
/Dir_A/File1.blah
/Dir_A/Dir_B/
/Dir_A/Dir_B/File2.blah
/Dir_A/Dir_B/File3.blah
/Dir_A/Dir_B/Dir_C/
/Dir_A/Dir_B/Dir_C/File4.blah
/Dir_A/Dir_D/
/Dir_A/Dir_D/File5.blah
The script should return '5' for "./Dir_A".
I've cobbled together the following but it's not quite returning the correct answer, and I'm not sure why:
function getFilecount( $path = '.', $filecount = 0, $total = 0 ){
$ignore = array( 'cgi-bin', '.', '..', '.DS_Store' );
$dh = @opendir( $path );
while( false !== ( $file = readdir( $dh ) ) ){
if( !in_array( $file, $ignore ) ){
if( is_dir( "$path/$file" ) ){
$filecount = count(glob( "$path/$file/" . "*"));
$total += $filecount;
echo $filecount; /* debugging */
echo " $total"; /* debugging */
echo " $path/$file"; /* debugging */
getFilecount( "$path/$file", $filecount, $total);
}
}
}
return $total;
}
I'd greatly appreciate any help.

This should do the trick:
function getFileCount($path) {
$size = 0;
$ignore = array('.','..','cgi-bin','.DS_Store');
$files = scandir($path);
foreach($files as $t) {
if(in_array($t, $ignore)) continue;
if (is_dir(rtrim($path, '/') . '/' . $t)) {
$size += getFileCount(rtrim($path, '/') . '/' . $t);
} else {
$size++;
}
}
return $size;
}

Use the SPL, then see if you still get an error.
RecursiveDirectoryIterator
Usage example:
<?php
$path = realpath('/etc');
$objects = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($path), RecursiveIteratorIterator::SELF_FIRST);
foreach($objects as $name => $object){
echo "$name\n";
}
?>
This prints a list of all files and directories under $path (including $path itself). If you want to omit directories, remove the RecursiveIteratorIterator::SELF_FIRST part.
Alternatively, keep it and call isDir() on each $object to tell directories from files.
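To turn the listing into the count the question asks for, something like the following might work. It is a minimal sketch: countFilesSpl() is a hypothetical helper name, and the SKIP_DOTS flag is an addition not present in the answer above, assumed here so that the "." and ".." entries are never visited.

```php
<?php
// Sketch: count regular files under $path using the SPL iterators.
// countFilesSpl() is a hypothetical name, not part of the original answer.
function countFilesSpl(string $path): int
{
    $iterator = new RecursiveIteratorIterator(
        // SKIP_DOTS drops the "." and ".." entries of every directory.
        new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS),
        // SELF_FIRST also yields directories, so isFile() filters them out below.
        RecursiveIteratorIterator::SELF_FIRST
    );

    $count = 0;
    foreach ($iterator as $item) {
        if ($item->isFile()) {
            $count++;
        }
    }
    return $count;
}

// Usage (the path is the one from the question):
// echo countFilesSpl('./Dir_A');
```

Unlike the scandir-based answers, this sketch has no ignore list, so entries such as .DS_Store would be counted unless they are filtered inside the loop.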

Based on Andrew's answer:
$path = realpath('my-big/directory');
$objects = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($path),
RecursiveIteratorIterator::SELF_FIRST
);
$count=iterator_count($objects);
echo number_format($count); //680,642 wooohaah!
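This way I can count (without listing) hundreds of thousands of entries: 680,642 in less than 4.6 seconds ;) Note that with SELF_FIRST and no SKIP_DOTS flag, the total also includes directories and their "." and ".." entries, not only regular files.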

Paolo Bergantino almost had it with his code, but the function would still count .DS_Store files because he misspelled the name. Corrected code below:
function getFileCount($path) {
$size = 0;
$ignore = array('.','..','cgi-bin','.DS_Store');
$files = scandir($path);
foreach($files as $t) {
if(in_array($t, $ignore)) continue;
if (is_dir(rtrim($path, '/') . '/' . $t)) {
$size += getFileCount(rtrim($path, '/') . '/' . $t);
} else {
$size++;
}
}
return $size;
}

Why are you passing $filecount? The [passed-in] value is not being used; the only usage is at "$total += $filecount" and you're overriding $filecount just before that.
You're missing the case when the function encounters a regular (non-dir) file.
Edit: I just noticed the call to glob(). It's not necessary. Your function is recursively touching every file in the whole directory tree, anyway. See @Paolo Bergantino's answer.
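Applied to the question's code, the three points above might look like the sketch below. It keeps the opendir()/readdir() approach and the original ignore list; treating an unreadable directory as empty is an assumption made here, not something the question specifies.

```php
<?php
// Sketch of the question's function with the points above applied:
// no $filecount/$total parameters, no glob(), and regular files are counted.
function getFilecount(string $path = '.'): int
{
    $ignore = array('cgi-bin', '.', '..', '.DS_Store');
    $total = 0;

    $dh = @opendir($path);
    if ($dh === false) {
        return 0; // unreadable directory: assumed here to count as empty
    }

    while (false !== ($file = readdir($dh))) {
        if (in_array($file, $ignore, true)) {
            continue;
        }
        if (is_dir("$path/$file")) {
            // Add whatever the subdirectory contains to the running total.
            $total += getFilecount("$path/$file");
        } else {
            // A regular file: the case the original function never counted.
            $total++;
        }
    }
    closedir($dh);

    return $total;
}
```

Returning the subtotal and adding it in the caller replaces the $total parameter, which was passed by value in the original and so never made it back up the call chain.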

Check the PHP manual on glob() function: http://php.net/glob
It has examples in comments as to how to make it recursive.
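A recursive glob() along those lines could be sketched as follows. countFilesGlob() is a hypothetical name and this is not one of the manual's examples. Two caveats apply: the "*" pattern does not match names starting with a dot, so hidden files are not counted, and directory names containing glob metacharacters such as "[" would need escaping.

```php
<?php
// Sketch: count files recursively with glob().
// countFilesGlob() is a hypothetical helper name.
function countFilesGlob(string $dir): int
{
    $count = 0;
    // glob() can return false on error, so fall back to an empty list.
    $entries = glob(rtrim($dir, '/') . '/*') ?: array();
    foreach ($entries as $entry) {
        if (is_dir($entry)) {
            $count += countFilesGlob($entry);
        } else {
            $count++;
        }
    }
    return $count;
}
```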

Related

PHP - Delete files older than 7 days for multiple folders

I would like to create a PHP script that deletes files from multiple folders/paths.
I have something working, but I would like to adapt this code to handle more specific folders.
This is the code:
<?php
function deleteOlderFiles($path,$days) {
if ($handle = opendir($path)) {
while (false !== ($file = readdir($handle))) {
// $path has no trailing slash, so a separator is needed before the file name.
$filelastmodified = filemtime($path . '/' . $file);
if((time() - $filelastmodified) > $days*24*3600)
{
if(is_file($path . '/' . $file)) {
unlink($path . '/' . $file);
}
}
}
closedir($handle);
}
}
$path = 'C:/Users/Legion/AppData/Local/Temp';
$days = 7;
deleteOlderFiles($path,$days);
?>
I would like to add more paths and have this function run for every path.
I tried adding multiple path locations, but it didn't work because it always uses the last $path value.
For example:
$path = 'C:/Users/Legion/AppData/Local/Temp';
$path = 'C:/Users/Legion/AppData/Local/Temp/bla';
$path = 'C:/Users/Legion/AppData/Local/Temp/blabla';
$path = 'C:/Users/Legion/AppData/Local/Temp/blalbalba';
$days = 7;
deleteOlderFiles($path,$days);
Thank you for your help!
The simple solution: call the function right after setting each path, rather than assigning all the paths to the same scalar variable first, where each assignment overwrites the previous one.
$days = 7;
$path = 'C:/Users/Legion/AppData/Local/Temp';
deleteOlderFiles($path,$days);
$path = 'C:/Users/Legion/AppData/Local/Temp/bla';
deleteOlderFiles($path,$days);
$path = 'C:/Users/Legion/AppData/Local/Temp/blabla';
deleteOlderFiles($path,$days);
$path = 'C:/Users/Legion/AppData/Local/Temp/blalbalba';
deleteOlderFiles($path,$days);
Alternatively, place the directories in an array and then call the function from within a foreach loop.
$paths = [];
$paths[] = 'C:/Users/Legion/AppData/Local/Temp';
$paths[] = 'C:/Users/Legion/AppData/Local/Temp/bla';
$paths[] = 'C:/Users/Legion/AppData/Local/Temp/blabla';
$paths[] = 'C:/Users/Legion/AppData/Local/Temp/blalbalba';
$days = 7;
foreach ( $paths as $path){
deleteOlderFiles($path,$days);
}
It seems that you need a recursive function, i.e. a function that calls itself. In this case it calls itself when it finds a subdirectory to scan/traverse.
function delete_files($current_path, $days) {
$files_in_current_path = scandir($current_path);
foreach($files_in_current_path as $file) {
if (!in_array($file, [".", ".."])) {
if (is_dir($current_path . "/" . $file)) {
// Scan found subdirectory
delete_files($current_path . "/" . $file, $days);
} else {
// Here you add your code for checking date and deletion of the $file
$filelastmodified = filemtime($current_path . "/" . $file);
if((time() - $filelastmodified) > $days*24*3600) {
if(is_file($current_path . "/" . $file)) {
unlink($current_path . "/". $file);
}
}
}
}
}
}
delete_files("your/startpath/here", 7);
This code starts in your specified start path and scans all entries in that directory. If a subdirectory is found, delete_files is called again with that subdirectory as the start path.
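Because deletion is hard to undo, it may help to list the candidates first. The sketch below swaps the hand-written recursion for the SPL iterators and only collects paths; findOlderFiles() is a hypothetical helper name, and the 24*3600 cutoff arithmetic is taken from the code above.

```php
<?php
// Sketch: collect files older than $days below $startPath without deleting them.
// findOlderFiles() is a hypothetical helper, not part of the answer above.
function findOlderFiles(string $startPath, int $days): array
{
    $cutoff = time() - $days * 24 * 3600;
    $old = array();

    // The default mode (LEAVES_ONLY) yields files but not directories.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($startPath, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && $file->getMTime() < $cutoff) {
            $old[] = $file->getPathname();
        }
    }
    return $old;
}

// Usage: review the list, then delete.
// foreach (findOlderFiles('your/startpath/here', 7) as $path) {
//     unlink($path);
// }
```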

Recursive function to get filesize in PHP

I am working on a PHP function that will scan a given folder and return the total size of all the files in the folder. My issue is that, even though it works for files stored in the root of that folder, it doesn't work for files in any subfolder. My code is:
function get_total_size($system)
{
$size = 0;
$path = scandir($system);
unset($path[0], $path[1]);
foreach($path as $file)
{
if(is_dir($file))
{
get_total_size("{$system}/{$file}");
}
else
{
$size = $size + filesize("{$system}/{$file}");
}
}
$size = $size / 1024;
return number_format($size, 2, ".", ",");
}
I'm unsetting the 0th and 1st elements of the array since these are the dot and the double dot to go up a directory. Any help would be greatly appreciated
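You may try the version below. When an entry is a directory, you have to add the size returned by the recursive call to the total. And when you call is_dir(), you have to prefix the entry with the root directory; otherwise is_dir() checks the name relative to the current working directory and returns false.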
function get_total_size($system)
{
$size = 0;
$path = scandir($system);
unset($path[0], $path[1]);
foreach($path as $file)
{
if(is_dir($system.'/'.$file))
{
$size+=get_total_size("{$system}/{$file}");
}
else
{
$size = $size + filesize("{$system}/{$file}");
}
}
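// Return raw bytes: a formatted string here would corrupt the sums in the recursive calls.
// Divide by 1024 and call number_format() once, on the final result.
return $size;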
}
I think it will work fine
Happy coding :)
You forgot to add the size of the subfolders: the value returned by the recursive call has to be added to the $size variable.
function get_total_size($system)
{
$size = 0;
$path = scandir($system);
unset($path[0], $path[1]);
foreach($path as $file)
{
if(is_dir("{$system}/{$file}"))
{
$size += get_total_size("{$system}/{$file}"); // <--- HERE
}
else
{
$size = $size + filesize("{$system}/{$file}");
}
}
return $size;
}
Note that the function now returns the raw size. Keeping number_format inside it would cause a problem, because the recursive calls would then add up formatted strings. I would apply the formatting after receiving the result of get_total_size.
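Splitting the two concerns could look like the sketch below. get_total_bytes() and format_kb() are hypothetical names; the number_format() arguments are the ones used in the question.

```php
<?php
// Sketch: keep the recursion in raw bytes and format only the final result.
// get_total_bytes() and format_kb() are hypothetical names.
function get_total_bytes(string $system): int
{
    $size = 0;
    foreach (scandir($system) as $file) {
        if ($file === '.' || $file === '..') {
            continue;
        }
        $full = "{$system}/{$file}";
        if (is_dir($full)) {
            $size += get_total_bytes($full); // subfolder total, still in bytes
        } else {
            $size += filesize($full);
        }
    }
    return $size;
}

// Formats a byte count as kilobytes with two decimals, as in the question.
function format_kb(int $bytes): string
{
    return number_format($bytes / 1024, 2, '.', ',');
}

// Usage (the folder name is a placeholder):
// echo format_kb(get_total_bytes('/directory/path'));
```

Skipping "." and ".." by name rather than with unset($path[0], $path[1]) avoids relying on those two entries always sorting first.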
You can use a recursive directory iterator for this. Have a look at the solution below:
<?php
$total_size = 0;
$di = new RecursiveDirectoryIterator('/directory/path');
foreach (new RecursiveIteratorIterator($di) as $filename => $file) {
if($file->isFile()) {
echo $filename . ' - ' . $file->getSize() . ' bytes <br/>';
$total_size += $file->getSize();
}
}
echo $total_size; //in bytes
?>
The recursiveIterator family of classes could be of use to you.
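// BR is not defined in the original snippet; an HTML line break is assumed here.
define( 'BR', '<br />' );
function filesize_callback( $iterator, &$total ){
foreach( $iterator as $file => $info ){
// $info is an SplFileInfo object; only regular files contribute to the total.
if( $info->isFile() ) {
echo 'path: '.$info->getPath().' filename: '.$info->getFilename().' filesize: '.$info->getSize().BR;
$total+=$info->getSize();
}
}
}
$total=0;
$folder='C:\temp';
$iterator=new RecursiveIteratorIterator( new RecursiveDirectoryIterator( $folder, RecursiveDirectoryIterator::KEY_AS_PATHNAME ), RecursiveIteratorIterator::CHILD_FIRST );
// The iterator already descends into subfolders, so one plain call is enough.
// $total is passed by reference through the function signature; call-time &$total is a fatal error since PHP 5.4.
filesize_callback( $iterator, $total );
echo BR.'Grand-Total: '.$total.BR;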

scandir() to sort by date modified

I'm trying to make the scandir() function go beyond its built-in limits: I need more than the alphabetical sorting it currently supports. I need the scandir() results sorted by modification date.
I've tried a few solutions I found here and some other solutions from different websites, but none worked for me, so I think it's reasonable for me to post here.
What I've tried so far is this:
function scan_dir($dir)
{
$files_array = scandir($dir);
$img_array = array();
$img_dsort = array();
$final_array = array();
foreach($files_array as $file)
{
if(($file != ".") && ($file != "..") && ($file != ".svn") && ($file != ".htaccess"))
{
$img_array[] = $file;
$img_dsort[] = filemtime($dir . '/' . $file);
}
}
$merge_arrays = array_combine($img_dsort, $img_array);
krsort($merge_arrays);
foreach($merge_arrays as $key => $value)
{
$final_array[] = $value;
}
return (is_array($final_array)) ? $final_array : false;
}
But this doesn't seem to work for me: it returns only 3 results, when it should return 16, because there are 16 images in the folder.
function scan_dir($dir) {
$ignored = array('.', '..', '.svn', '.htaccess');
$files = array();
foreach (scandir($dir) as $file) {
if (in_array($file, $ignored)) continue;
$files[$file] = filemtime($dir . '/' . $file);
}
arsort($files);
$files = array_keys($files);
return ($files) ? $files : false;
}
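A likely reason the question's version returned only 3 of 16 files is that array_combine() used the modification times as array keys, so files sharing a timestamp overwrote each other. Keying by file name, as the function above does, avoids that. The sketch below illustrates the difference with made-up names and timestamps.

```php
<?php
// Illustration with invented data: two of the three files share a timestamp.
$names  = array('a.png', 'b.png', 'c.png');
$mtimes = array(1700000000, 1700000000, 1700000500);

// Question's approach: timestamps as keys. Duplicate keys collapse into one entry.
$byTime = array_combine($mtimes, $names);

// Answer's approach: file names as keys, timestamps as values. Nothing is lost.
$byName = array_combine($names, $mtimes);
arsort($byName);                // newest first, keys preserved
$sorted = array_keys($byName);

echo count($byTime), ' entries when keyed by time, ', count($byName), " when keyed by name\n";
```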
This is a great question and Ryon Sherman’s answer provides a solid answer, but I needed a bit more flexibility for my needs so I created this newer function: better_scandir.
The goal is to allow having scandir sorting order flags work as expected; not just the reverse array sort method in Ryon’s answer. And also explicitly setting SORT_NUMERIC for the array sort since those time values are clearly numbers.
Usage is like this; just switch out SCANDIR_SORT_DESCENDING to SCANDIR_SORT_ASCENDING or even leave it empty for default:
better_scandir(<filepath goes here>, SCANDIR_SORT_DESCENDING);
And here is the function itself:
function better_scandir($dir, $sorting_order = SCANDIR_SORT_ASCENDING) {
/****************************************************************************/
// Roll through the scandir values.
$files = array();
foreach (scandir($dir, $sorting_order) as $file) {
if ($file[0] === '.') {
continue;
}
$files[$file] = filemtime($dir . '/' . $file);
} // foreach
/****************************************************************************/
// Sort the files array.
if ($sorting_order == SCANDIR_SORT_ASCENDING) {
asort($files, SORT_NUMERIC);
}
else {
arsort($files, SORT_NUMERIC);
}
/****************************************************************************/
// Set the final return value.
$ret = array_keys($files);
/****************************************************************************/
// Return the final value.
return ($ret) ? $ret : false;
} // better_scandir
Alternative example:
$dir = "/home/novayear/public_html/backups";
chdir($dir);
array_multisort(array_map('filemtime', ($files = glob("*.{sql,php,7z}", GLOB_BRACE))), SORT_DESC, $files);
foreach($files as $filename)
{
echo "<a>".substr($filename, 0, -4)."</a><br>";
}
Another scandir-based example that keeps only the latest 5 files:
public function checkmaxfiles()
{
$dir = APPLICATION_PATH . '\\modules\\yourmodulename\\public\\backup\\';
// '../notes/';
$ignored = array('.', '..', '.svn', '.htaccess');
$files = array();
foreach (scandir($dir) as $file) {
if (in_array($file, $ignored)) continue;
$files[$file] = filemtime($dir . '/' . $file);
}
arsort($files);
$files = array_keys($files);
$length = count($files);
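if($length <= 5){
return;
}
// $files is sorted newest first, so indexes 5 and up are the older files.
// Start at $length - 1: index $length does not exist.
for ($i = $length - 1; $i >= 5; $i--) {
echo "Erase : " .$dir.$files[$i];
unlink($dir.$files[$i]);
}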
}

PHP recursive folder readdir vs find performance

I came across a few articles about performance and readdir.
Here is the PHP script:
function getDirectory( $path = '.', $level = 0 ) {
$ignore = array( 'cgi-bin', '.', '..' );
$dh = @opendir( $path );
while( false !== ( $file = readdir( $dh ) ) ){
if( !in_array( $file, $ignore ) ){
$spaces = str_repeat( ' ', ( $level * 4 ) );
if( is_dir( "$path/$file" ) ){
echo "$spaces $file\n";
getDirectory( "$path/$file", ($level+1) );
} else {
echo "$spaces $file\n";
}
}
}
closedir( $dh );
}
getDirectory( "." );
This echoes the files/folders correctly.
Now I found this:
$t = system('find');
print_r($t);
which also finds all the folders and files, so I can then build an array like the first code does.
I think system('find') is faster than readdir, but I want to know if it's good practice.
Thank you very much.
Here's my benchmark using a simple for loop with 10 iterations on my server:
$path = '/home/clad/benchmark/';
// this folder has 10 main directories, each containing 220 files from 1 KB to 1 MB
// glob no_sort = 0.004 seconds but NO recursion
$files = glob($path . '/*', GLOB_NOSORT);
// 1.8 seconds - not recommended
exec('find ' . $path, $t);
unset($t);
// 0.003 seconds
if ($handle = opendir('.')) {
while (false !== ($file = readdir($handle))) {
if ($file != "." && $file != "..") {
// action
}
}
closedir($handle);
}
// 1.1 seconds to execute
$path = realpath($path);
$objects = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($path), RecursiveIteratorIterator::SELF_FIRST);
foreach($objects as $name => $object) {
// action
}
Clearly readdir is faster to use, especially if you have lots of traffic on your site (though the glob and readdir runs above do not recurse, while find and the iterator do).
'find' is not portable, it's a unix/linux command. readdir() is portable and will work on Windows or any other OS. Moreover, 'find' without any parameters is recursive, so if you're in a dir with lots of subdirs and files, you will get to see all of them, rather than only the contents of that $path.
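Since the timings above mix recursive and non-recursive runs, a like-for-like comparison might be sketched as below. All three function names are hypothetical, and the numbers will depend on the filesystem and its caches, so no expected timings are given.

```php
<?php
// Hypothetical helpers for a like-for-like timing; none of these names come from the thread.

// Recursive readdir() walk returning every file and directory path below $path.
function listWithReaddir(string $path): array
{
    $found = array();
    $dh = @opendir($path);
    if ($dh === false) {
        return $found;
    }
    while (false !== ($file = readdir($dh))) {
        if ($file === '.' || $file === '..') {
            continue;
        }
        $full = "$path/$file";
        $found[] = $full;
        if (is_dir($full)) {
            $found = array_merge($found, listWithReaddir($full));
        }
    }
    closedir($dh);
    return $found;
}

// The same walk using the SPL iterators (keys are the path names).
function listWithIterator(string $path): array
{
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );
    return array_keys(iterator_to_array($iterator));
}

// Runs $fn once and returns array(result, elapsed seconds).
function timeIt(callable $fn): array
{
    $start = microtime(true);
    $result = $fn();
    return array($result, microtime(true) - $start);
}

// Usage (the path is the one from the benchmark above):
// list($paths, $seconds) = timeIt(function () { return listWithReaddir('/home/clad/benchmark'); });
```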

How to get X newest files from a directory in PHP?

The code below is part of a function for grabbing 5 image files from a given directory.
At the moment readdir returns the images 'in the order in which they are stored by the filesystem' as per the spec.
My question is, how can I modify it to get the latest 5 images? Either based on the last_modified date or the filename (which look like 0000009-16-5-2009.png, 0000012-17-5-2009.png, etc.).
if ( $handle = opendir($absolute_dir) )
{
$i = 0;
$image_array = array();
while ( count($image_array) < 5 && ( ($file = readdir($handle)) !== false) )
{
if ( $file != "." && $file != ".." && $file != ".svn" && $file != 'img' )
{
$image_array[$i]['url'] = $relative_dir . $file;
$image_array[$i]['last_modified'] = date ("F d Y H:i:s", filemtime($absolute_dir . '/' . $file));
}
$i++;
}
closedir($handle);
}
If you want to do this entirely in PHP, you must find all the files and their last modification times:
$images = array();
foreach (scandir($folder) as $node) {
$nodePath = $folder . DIRECTORY_SEPARATOR . $node;
if (is_dir($nodePath)) continue;
$images[$nodePath] = filemtime($nodePath);
}
arsort($images);
$newest = array_slice($images, 0, 5);
If you are really only interested in pictures you could use glob() instead of soulmerge's scandir:
$images = array();
foreach (glob("*.{png,jpg,jpeg}", GLOB_BRACE) as $filename) {
$images[$filename] = filemtime($filename);
}
arsort($images);
$newest = array_slice($images, 0, 5);
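The question also allows picking by file name. Assuming the names start with a zero-padded counter of fixed width that grows over time (as 0000009-16-5-2009.png and 0000012-17-5-2009.png suggest), a plain reverse string sort would be enough. newestByName() is a hypothetical helper, and restricting it to .png files is an assumption.

```php
<?php
// Sketch: pick the newest files by name instead of by modification time.
// Assumes names begin with a fixed-width, zero-padded counter that increases over time.
// newestByName() is a hypothetical helper name.
function newestByName(string $dir, int $limit = 5): array
{
    // glob() can return false on error, so fall back to an empty list.
    $files = glob(rtrim($dir, '/') . '/*.png') ?: array();
    $names = array_map('basename', $files);

    // With zero padding, string order matches numeric order of the counter.
    rsort($names, SORT_STRING);

    return array_slice($names, 0, $limit);
}
```

This avoids one filemtime() call per file, but it breaks if the counter ever outgrows its padding width.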
Or you can create a function that returns the latest 5 files in a specified folder.
private function getlatestfivefiles() {
$files = array();
foreach (glob("application/reports/*.*", GLOB_BRACE) as $filename) {
$files[$filename] = filemtime($filename);
}
arsort($files);
$newest = array_slice($files, 0, 5);
return $newest;
}
By the way, I'm using the CI framework. Cheers!
