PHP - fastest way to find if directory has children?

I'm building a file browser, and I need to know if a directory has children (but not how many or what type).
What's the most efficient way to find if a directory has children? glob()? scandir() it? Check its tax records?
Edit
It seems I was misunderstood, although I thought I was pretty clear. I'll try to restate my question.
What is the most efficient way to know if a directory is not empty? I'm basically looking for a boolean answer - NOT EMPTY or EMPTY.
I don't need to know:
how many files are in the directory
what the files are
when they were modified
etc.
I do need to know:
does the directory have any files in it at all
efficiently.

I think this is very efficient:
function dir_contains_children($dir) {
$result = false;
if($dh = opendir($dir)) {
while(!$result && ($file = readdir($dh)) !== false) {
$result = $file !== "." && $file !== "..";
}
closedir($dh);
}
return $result;
}
It stops listing the directory's contents as soon as a file or directory is found (not counting . and ..).

You could use 'find' to list all empty directories in one step:
exec('find ' . escapeshellarg($dir) . ' -maxdepth 1 -empty -type d', $out, $ret);
print_r($out);
It's not "pure" PHP, but it's simple and fast.

This should do, easy, quick and effective.
<?php
function dir_is_empty($dir) {
// scandir() always lists "." and "..", so two entries means the directory is empty
return count(scandir($dir)) <= 2;
}
?>

Unfortunately, each solution so far has lacked the brevity and elegance necessary to shine above the rest.
So, I was forced to homebrew a solution myself, which I'll be implementing until something better pops up:
if (count(glob($dir."/*")) > 0) {
echo "NOT EMPTY";
}
Still not sure of the efficiency of this compared to other methods, which was the original question.
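For comparison, here is a short SPL-based sketch of the same boolean check. It is not from any of the answers above, and the helper name dir_has_children() is made up for illustration. It relies on FilesystemIterator skipping "." and ".." by default, so valid() is true as soon as one real entry exists. Unlike the glob() version it also sees hidden files, and it throws an UnexpectedValueException if the directory cannot be opened.

```php
<?php
// Hypothetical helper name; a sketch, not the accepted way to do this.
function dir_has_children(string $dir): bool
{
    // FilesystemIterator uses SKIP_DOTS by default, so "." and ".." never
    // count; valid() only tells us whether a first real entry exists.
    return (new FilesystemIterator($dir))->valid();
}

// Usage: check a scratch directory before and after adding a hidden file.
$tmp = sys_get_temp_dir() . '/dir_check_' . uniqid();
mkdir($tmp);
$before = dir_has_children($tmp);
touch($tmp . '/.hidden');
$after = dir_has_children($tmp);

// Clean up the scratch directory.
unlink($tmp . '/.hidden');
rmdir($tmp);

var_dump($before, $after);
```

Whether this is measurably faster than the opendir()/readdir() loop would have to be benchmarked on your own filesystem.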

I wanted to expand vstm's answer - Check only for child directories (and not files):
/**
* Check if directory contains child directories.
*/
function dir_contains_children_dirs($dir) {
$result = false;
if($dh = opendir($dir)) {
while (!$result && ($file = readdir($dh)) !== false) {
$result = $file !== "." && $file !== ".." && is_dir($dir.'/'.$file);
}
closedir($dh);
}
return $result;
}

Related

Is there an efficient one-liner to grab the first file in a directory?

I want to grab the first file in a directory, without touching/grabbing all the other files. The filename is unknown.
One very short way could be this, using glob:
$file = array_slice(glob('/directory/*.jpg'), 0, 1);
But if there are a lot of files in that directory, there will be some overhead.
Other ways are answers to this question - but all involve a loop and are also longer than the glob example:
PHP: How can I grab a single file from a directory without scanning entire directory?
Is there a very short and efficient way to solve this?
Probably not totally efficient, but if you only want the FIRST jpg that appears, then
$dh = opendir('directory/');
while (($filename = readdir($dh)) !== false) {
if (substr($filename, -4) == '.jpg') {
break;
}
}
closedir($dh);
// $filename is now the first .jpg found, or false if there is none
Well this is not totally a one-liner, but it is a way to go I believe:
$result = null;
foreach(new FilesystemIterator('directory/') as $file)
{
if($file->isFile() && $file->getExtension() == 'jpg') {
$result = $file->getPathname();
break;
}
}
But why don't you wrap it in a function and use it like get_first_file('directory/')? It will be nice and short!
This function will get the first filename of any type.
function get_first_filename ($dir) {
$d = dir($dir);
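// Compare against false so an entry named "0" does not end the loop early
while (false !== ($f = $d->read())) {
if (is_file($dir . '/' . $f)) {
$d->close();
return $f;
}
}
$d->close();
return null; // no regular file found
}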
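The "wrap it in a function" suggestion above could look roughly like this. It is a sketch: the name get_first_file() comes from that suggestion, while the optional $extension parameter is my own addition. Note that "first" means the first entry the filesystem hands back, which is not necessarily alphabetical order.

```php
<?php
// Hypothetical wrapper around the FilesystemIterator loop shown above.
// $extension is an assumed optional filter, e.g. 'jpg'; null accepts any file.
function get_first_file(string $dir, ?string $extension = null): ?string
{
    foreach (new FilesystemIterator($dir) as $file) {
        if (!$file->isFile()) {
            continue; // skip sub-directories and other non-files
        }
        if ($extension !== null
            && strtolower($file->getExtension()) !== strtolower($extension)) {
            continue; // wrong extension
        }
        return $file->getPathname(); // stop at the first match
    }
    return null; // nothing matched
}

// Usage: a scratch directory with one text file and one jpg.
$tmp = sys_get_temp_dir() . '/first_file_' . uniqid();
mkdir($tmp);
touch($tmp . '/a.txt');
touch($tmp . '/b.jpg');

$firstJpg = get_first_file($tmp, 'jpg');
$firstPng = get_first_file($tmp, 'png');

// Clean up the scratch directory.
unlink($tmp . '/a.txt');
unlink($tmp . '/b.jpg');
rmdir($tmp);

var_dump(basename($firstJpg), $firstPng);
```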

PHP Reading Directories

So I'm going through reading and writing to files in PHP via PHP Docs and there's an example I didn't quite understand:
http://php.net/manual/en/function.readdir.php
Toward the end it shows an example like this:
<?php
if ($handle = opendir('.')) {
while (false !== ($entry = readdir($handle))) {
if ($entry != "." && $entry != "..") {
echo "$entry\n";
}
}
closedir($handle);
}
?>
In what case would . or .. ever be read?
readdir() returns every entry in the directory, and that normally includes the special entries "." (the directory itself) and ".." (its parent). If you recurse into every entry, following "." gives you an endless loop, and following ".." takes you up and out of the directory you started in. Skipping both restricts the listing to the current directory and beneath.
Hope that helps.
If you want to read directories using PHP, I would recommend you use the scandir function. Below is a demonstration of scandir
$path = __DIR__.'/images';
$contents = scandir($path);
foreach($contents as $current){
if($current === '.' || $current === '..') continue ;
if(is_dir("$path/$current")){
echo 'I am a directory';
} elseif($current[0] == '.'){
echo "I am a file with a name starting with dot";
} else {
echo 'I am a file';
}
}
Because in a UNIX filesystem, . and .. are like signposts, as far as I know. Certainly to this PHP function, anyway.
Keep that check in there, otherwise you'll get some weird results (like endless loops, etc.)!
In *nix, . refers to the current directory and .. to its parent. In addition, any file or directory whose name starts with a '.' is considered hidden, so I prefer something like the following:
...
if ($entry[0] !== '.') {
echo "$entry\n";
}
...
This ensures that you don't parse "up" the directory tree, that you don't endlessly loop the present directory, and that any hidden files/folders are ignored.
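To see the difference between the two filters discussed in this thread, here is a small sketch on a scratch directory. The file names are invented for the demo. The first filter drops only "." and "..", the second drops everything that starts with a dot.

```php
<?php
// Scratch directory with one hidden and one visible file (names are made up).
$tmp = sys_get_temp_dir() . '/dots_demo_' . uniqid();
mkdir($tmp);
touch($tmp . '/.htaccess');
touch($tmp . '/index.php');

// scandir() returns the raw listing, sorted, normally including "." and "..".
$raw = scandir($tmp);

// Filter 1: remove only the two special entries.
$withoutDots = array_values(array_diff($raw, ['.', '..']));

// Filter 2: remove every entry whose name starts with a dot (hidden files too).
$visibleOnly = array_values(array_filter($raw, fn($e) => $e[0] !== '.'));

// Clean up the scratch directory.
unlink($tmp . '/.htaccess');
unlink($tmp . '/index.php');
rmdir($tmp);

print_r($withoutDots);
print_r($visibleOnly);
```

Which filter you want depends on whether hidden files such as .htaccess should show up in your listing.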

Best way to check if a dir is empty with php

Is there a better way to check if a dir is empty than parsing it?
Don't think so. Shortest/quickest way I can think of is the following, which should work as far as I can see.
function dir_is_empty($path)
{
$empty = true;
$dir = opendir($path);
while(($file = readdir($dir)) !== false)
{
if($file != '.' && $file != '..')
{
$empty = false;
break;
}
}
closedir($dir);
return $empty;
}
This should only go through a maximum of 3 files. The two . and .. and potentially whatever comes next. If something comes next, it's not empty, and if not, well then it's empty.
Not really, but you can try to delete it. If that fails, it's not empty (or you just can't delete it ;))
function dirIsEmpty ($dir) {
return rmdir($dir) && mkdir($dir);
}
Update:
It seems that the answer that takes the "without parsing" condition into account doesn't find many friends ;)
function dirIsEmpty ($dir) {
// PHP's glob() has no recursive "**", so match the direct children only
return count(glob("$dir/*")) === 0;
}
Note that this assumes the directory doesn't contain only hidden entries (names starting with a single .), because * does not match them.
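The hidden-file caveat is easy to reproduce. The sketch below (scratch directory and file name are invented) puts a single dotfile in a directory and compares a glob()-based check with a scandir()-based one.

```php
<?php
// Scratch directory containing only a hidden file (name is made up).
$tmp = sys_get_temp_dir() . '/glob_demo_' . uniqid();
mkdir($tmp);
touch($tmp . '/.gitkeep');

// glob() may return false on error, so fall back to an empty array.
// The * pattern does not match names that start with a dot.
$globSaysEmpty = count(glob($tmp . '/*') ?: []) === 0;

// scandir() lists everything; drop "." and ".." before counting.
$scandirSaysEmpty = count(array_diff(scandir($tmp), ['.', '..'])) === 0;

// Clean up the scratch directory.
unlink($tmp . '/.gitkeep');
rmdir($tmp);

var_dump($globSaysEmpty, $scandirSaysEmpty);
```

So the glob() check reports "empty" for a directory that rmdir() would refuse to delete.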

Optimize PHP function

I have a function that finds all files whose names start with a given string and returns an array of the matching files, but it is starting to get slow, because I have around 20000 files in a particular directory.
I need to optimize this function, but I just can't see how. This is the function:
function DetectPrefix ($filePath, $prefix)
{
$files = array(); // initialise so count() below is safe when nothing matches
$dh = opendir($filePath);
while (false !== ($filename = readdir($dh)))
{
$posIni = strpos( $filename, $prefix);
if ($posIni===0):
$files[] = $filename;
endif;
}
closedir($dh);
if (count($files)>0){
return $files;
} else {
return null;
}
}
What more can I do?
Thanks
http://php.net/glob
$files = glob('/file/path/prefix*');
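One thing to keep in mind: glob() returns full paths, while the original function returns bare file names. A sketch of a drop-in variant follows. The function name detect_prefix_glob() is made up, and it assumes that neither the directory nor the prefix contains glob metacharacters such as *, ? or [.

```php
<?php
// Hypothetical glob()-based variant of DetectPrefix().
// Assumption: $dir and $prefix contain no glob metacharacters (*, ?, [, ]).
function detect_prefix_glob(string $dir, string $prefix): ?array
{
    // glob() may return false on error; treat that like "no matches".
    $paths = glob(rtrim($dir, '/') . '/' . $prefix . '*') ?: [];

    // Strip the directory part so the result matches the original function.
    $files = array_map('basename', $paths);

    return count($files) > 0 ? $files : null;
}

// Usage on a scratch directory (file names are invented for the demo).
$tmp = sys_get_temp_dir() . '/prefix_demo_' . uniqid();
mkdir($tmp);
foreach (['report_1.txt', 'report_2.txt', 'other.txt'] as $name) {
    touch($tmp . '/' . $name);
}

$matches = detect_prefix_glob($tmp, 'report_');
$none = detect_prefix_glob($tmp, 'missing_');

// Clean up the scratch directory.
foreach (['report_1.txt', 'report_2.txt', 'other.txt'] as $name) {
    unlink($tmp . '/' . $name);
}
rmdir($tmp);

print_r($matches);
```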
Wikipedia breaks uploads up by the first couple letters of their filenames, so excelfile.xls would go in a directory like /uploads/e/x while textfile.txt would go in /uploads/t/e.
Not only does this reduce the number of files glob (or any other approach) has to sort through, but it avoids the maximum files in a directory issue others have mentioned.
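The bucketing scheme described above can be sketched in a few lines. This follows the description in this answer, not any real Wikipedia code. The function name shard_dir(), the /uploads root and the padding of very short names are all assumptions.

```php
<?php
// Hypothetical helper: map a file name to a two-level bucket directory
// built from its first two characters, e.g. excelfile.xls -> /uploads/e/x.
function shard_dir(string $filename, string $root = '/uploads'): string
{
    $name = strtolower(basename($filename));

    // Assumption: pad one-character names so they still get two levels.
    $name = str_pad($name, 2, '_');

    return $root . '/' . $name[0] . '/' . $name[1];
}

echo shard_dir('excelfile.xls'), "\n";
echo shard_dir('textfile.txt'), "\n";
```

A prefix search then only has to look inside one bucket, as long as the prefix is at least two characters long.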
You could use scandir() to list the files in the directory, instead of iterating through them one-by-one using readdir(). scandir() returns an array of the files.
However, it'd be better if you could change your file system organization - do you really need to store 20000+ files in a single directory?
As the other answers mention, I'd look at glob(), scandir(), and/or the DirectoryIterator class; there is no need to reinvent the wheel.
However, watch out! Check your operating system: there may be a limit on the maximum number of files in a single directory. If so, and you keep adding files to the same directory, you will have some downtime, and some problems, when you reach the limit. The error will probably appear as a permissions or write failure, not as an obvious "you can't write more files in a single directory" message.
I'm not sure, but DirectoryIterator is probably a bit faster. Also add caching so that the list is regenerated only when files are added or deleted.
You just need to compare the first strlen($prefix) characters of each file name. So try this:
function DetectPrefix($filePath, $prefix) {
$dh = opendir($filePath);
$len = strlen($prefix);
$files = array();
while (false !== ($filename = readdir($dh))) {
if (substr($filename, 0, $len) === $prefix) {
$files[] = $filename;
}
}
if (count($files)) {
return $files;
} else {
return null;
}
}

Is it possible to speed up a recursive file scan in PHP?

I've been trying to replicate GNU find ("find .") in PHP, but it seems impossible to get even close to its speed. The PHP implementations take at least twice as long as find. Are there faster ways of doing this with PHP?
EDIT: I added a code example using the SPL implementation -- its performance is equal to the iterative approach
EDIT2: When calling find from PHP it was actually slower than the native PHP implementation. I guess I should be satisfied with what I've got :)
// measured to 317% of gnu find's speed when run directly from a shell
function list_recursive($dir) {
if ($dh = opendir($dir)) {
while (false !== ($entry = readdir($dh))) {
if ($entry == '.' || $entry == '..') continue;
$path = "$dir/$entry";
echo "$path\n";
if (is_dir($path)) list_recursive($path);
}
closedir($dh);
}
}
// measured to 315% of gnu find's speed when run directly from a shell
function list_iterative($from) {
$dirs = array($from);
while (NULL !== ($dir = array_pop($dirs))) {
if ($dh = opendir($dir)) {
while (false !== ($entry = readdir($dh))) {
if ($entry == '.' || $entry == '..') continue;
$path = "$dir/$entry";
echo "$path\n";
if (is_dir($path)) $dirs[] = $path;
}
closedir($dh);
}
}
}
// measured to 315% of gnu find's speed when run directly from a shell
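function list_recursivedirectoryiterator($path) {
// SKIP_DOTS drops "." and ".."; RecursiveIteratorIterator performs the actual descent
$it = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS),
RecursiveIteratorIterator::SELF_FIRST
);
foreach ($it as $file) {
echo $file->getPathname(), "\n";
}
}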
// measured to 390% of gnu find's speed when run directly from a shell
function list_gnufind($dir) {
$dir = escapeshellarg($dir); // quote the path as a single shell argument
$h = popen("/usr/bin/find $dir", "r");
while ('' != ($s = fread($h, 2048))) {
echo $s;
}
pclose($h);
}
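I'm not sure if the performance is better, but you could use a recursive directory iterator to make your code simpler... See RecursiveDirectoryIterator, RecursiveIteratorIterator and SplFileInfo.
// SKIP_DOTS drops "." and ".."; RecursiveIteratorIterator performs the actual descent
$it = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($from, FilesystemIterator::SKIP_DOTS),
RecursiveIteratorIterator::SELF_FIRST
);
foreach ($it as $file)
{
echo $file->getPathname(), "\n";
}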
Before you start changing anything, profile your code.
Use something like Xdebug (plus kcachegrind for a pretty graph) to find out where the slow parts are. If you start changing things blindly, you won't get anywhere.
My only other advice is to use the SPL directory iterators as posted already. Letting the internal C code do the work is almost always faster.
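If setting up Xdebug is too much for a first look, a crude wall-clock comparison already shows which variant is worth profiling. The helper below is only a sketch. time_call() is a made-up name, it needs PHP 7.3+ for hrtime(), and it discards the output so terminal I/O does not dominate the measurement.

```php
<?php
// Hypothetical timing helper; a rough stand-in for a real profiler.
function time_call(callable $fn, ...$args): float
{
    $start = hrtime(true);   // monotonic clock in nanoseconds (PHP 7.3+)
    ob_start();              // capture echo output instead of printing it
    $fn(...$args);
    ob_end_clean();          // throw the captured output away
    return (hrtime(true) - $start) / 1e9; // elapsed seconds
}

// Usage: time a flat listing of the system temp directory.
$seconds = time_call(function (string $dir): void {
    foreach (scandir($dir) ?: [] as $entry) {
        echo $entry, "\n";
    }
}, sys_get_temp_dir());

printf("listing took %.6f s\n", $seconds);
```

Run each candidate several times and on a warm cache before comparing numbers, since the first pass over a directory tree is usually dominated by disk reads.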
PHP just cannot perform as fast as C, plain and simple.
Why would you expect the interpreted PHP code to be as fast as the compiled C version of find? Being only twice as slow is actually pretty good.
About the only advice I would add is to call ob_start() at the beginning and ob_get_contents() and ob_end_clean() at the end. That might speed things up.
You're keeping N directory streams open where N is the depth of the directory tree. Instead, try reading an entire directory's worth of entries at once, and then iterate over the entries. At the very least you'll maximize use of the disk I/O caches.
You might want to seriously consider just using GNU find. If it's available, and safe mode isn't turned on, you'll probably like the results just fine:
function list_recursive($dir) {
$dir = escapeshellarg($dir); // quote the path as a single shell argument
$h = popen("/usr/bin/find $dir -type f", "r");
while ($s = fgets($h,1024)) {
echo $s;
}
pclose($h);
}
However, there might be some directory that's so big you're not going to want to bother with this either. Consider amortizing the slowness in other ways. Your second try can be checkpointed (for example) by simply saving the directory stack in the session. If you're giving the user a list of files, simply collect a pageful, then save the rest of the state in the session for page 2.
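A rough sketch of that checkpointing idea, based on the iterative version from the question. The function name list_page() and the shape of the $state array are my own invention. The state is plain arrays, so it could be stored in $_SESSION between requests. It does not guard against symlink loops.

```php
<?php
// Hypothetical helper: return up to $limit file paths plus the state needed
// to resume. $state['dirs'] holds directories not yet scanned and
// $state['pending'] holds files found but not yet handed out.
function list_page(array $state, int $limit): array
{
    $files = [];
    while (count($files) < $limit && ($state['pending'] || $state['dirs'])) {
        if (!$state['pending']) {
            // Nothing queued: scan the next directory from the stack.
            $dir = array_pop($state['dirs']);
            foreach (array_diff(scandir($dir) ?: [], ['.', '..']) as $entry) {
                $path = "$dir/$entry";
                if (is_dir($path)) {
                    $state['dirs'][] = $path;      // visit later
                } else {
                    $state['pending'][] = $path;   // hand out later
                }
            }
            continue;
        }
        $files[] = array_shift($state['pending']);
    }
    return [$files, $state];
}

// Usage: three files in a small scratch tree, fetched two per page.
$tmp = sys_get_temp_dir() . '/page_demo_' . uniqid();
mkdir($tmp . '/sub', 0777, true);
touch($tmp . '/a.txt');
touch($tmp . '/sub/b.txt');
touch($tmp . '/sub/c.txt');

$state = ['dirs' => [$tmp], 'pending' => []];
[$page1, $state] = list_page($state, 2); // in a web app, save $state in the session here
[$page2, $state] = list_page($state, 2);

// Clean up the scratch tree.
unlink($tmp . '/a.txt');
unlink($tmp . '/sub/b.txt');
unlink($tmp . '/sub/c.txt');
rmdir($tmp . '/sub');
rmdir($tmp);

print_r($page1);
print_r($page2);
```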
Try using scandir() to read a whole directory at once, as Jason Cohen has suggested. I've based the following code on code from the PHP manual comments for scandir().
function scan( $dir ){
$dirs = array_diff( scandir( $dir ), Array( ".", ".." ));
$dir_array = Array();
foreach( $dirs as $d )
$dir_array[ $d ] = is_dir($dir."/".$d) ? scan( $dir."/".$d) : print $dir."/".$d."\n";
// return the nested listing so sub-directory results are not lost
return $dir_array;
}
