So I'm going through reading and writing to files in PHP via PHP Docs and there's an example I didn't quite understand:
http://php.net/manual/en/function.readdir.php
Toward the end, it shows an example like this:
<?php
if ($handle = opendir('.')) {
while (false !== ($entry = readdir($handle))) {
if ($entry != "." && $entry != "..") {
echo "$entry\n";
}
}
closedir($handle);
}
?>
In what case would . or .. ever be read?
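The readdir API call returns every entry in the directory, including the two special entries "." (the current directory) and ".." (the parent directory). A single loop like the one above will not run forever by itself, but if you recurse into every directory you find, descending into "." brings you straight back to the same directory and you get into an endless loop. Skipping ".." likewise restricts the list to the current directory and beneath.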
Hope that helps.
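To illustrate where an endless loop could actually come from, here is a minimal sketch of a recursive listing. The function name listTree is made up for this example and is not part of the manual; the point is only that the "." and ".." check is what stops the recursion from re-entering the same directory or climbing to the parent.

```php
<?php
// Hypothetical helper (not from the PHP manual): collect every path beneath $dir.
function listTree(string $dir): array
{
    $found = [];
    $handle = opendir($dir);
    if ($handle === false) {
        return $found; // directory could not be opened
    }
    while (false !== ($entry = readdir($handle))) {
        // Without this check the recursion below would enter "$dir/." forever
        // and would also climb into the parent through "$dir/..".
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $path = $dir . '/' . $entry;
        $found[] = $path;
        // Do not follow symbolic links, which could also create cycles.
        if (is_dir($path) && !is_link($path)) {
            $found = array_merge($found, listTree($path));
        }
    }
    closedir($handle);
    return $found;
}
```

Calling listTree('.') would return the paths of all files and folders under the current directory, in whatever order the filesystem reports them.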
If you want to read directories using PHP, I would recommend you use the scandir function. Below is a demonstration of scandir
$path = __DIR__.'/images';
$contents = scandir($path);
foreach($contents as $current){
if($current === '.' || $current === '..') continue ;
if(is_dir("$path/$current")){
echo 'I am a directory';
} elseif($current[0] === '.'){
echo "I am a file with a name starting with dot";
} else {
echo 'I am a file';
}
}
Because in a UNIX filesystem, . and .. are like signposts, as far as I know. Certainly to this PHP function, anyway.
Keep the checks for them in there; otherwise you'll get some weird results (like endless loops)!
In *nix, . is the present working directory and .. is its parent directory. Any file or directory whose name starts with a '.' is also considered hidden, so I prefer something like the following:
...
if ($entry[0] !== '.') {
echo "$entry\n";
}
...
This ensures that you don't parse "up" the directory tree, that you don't endlessly loop the present directory, and that any hidden files/folders are ignored.
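As a self-contained variant of the same idea, the sketch below wraps the first-character test in a function built on scandir. The name visibleEntries is an assumption made for this example, not an existing PHP function.

```php
<?php
// Hypothetical helper: return the names in $dir that do not start with a dot.
// This drops ".", ".." and hidden files/folders in one test.
function visibleEntries(string $dir): array
{
    $entries = scandir($dir); // sorted alphabetically by default
    if ($entries === false) {
        return []; // directory could not be read
    }
    $visible = array_filter($entries, function ($entry) {
        return $entry[0] !== '.';
    });
    // Re-index so the result is a plain list starting at 0.
    return array_values($visible);
}
```

Note that this also hides files such as .htaccess, which may or may not be what you want.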
I have these files in /public_html/ directory :
0832.php
1481.php
2853.php
3471.php
index.php
and I want to move all those XXXX.php files (always four digits) to the /tmp/ directory, except index.php. How can I do it with a regex and a loop?
Alternatively, how about moving all files (including index.php) to /tmp/ first and then putting only index.php back in /public_html/? Which one do you think is less CPU-consuming?
Last thing, I found this tutorial to move file using PHP: http://www.kavoir.com/2009/04/php-copying-renaming-and-moving-a-file.html
But how do I move ALL files in a directory?
You can use FilesystemIterator with RegexIterator
$source = "FULL PATH TO public_html";
$destination = "FULL PATH TO public_html/tmp";
$di = new FilesystemIterator($source, FilesystemIterator::SKIP_DOTS);
$regex = new RegexIterator($di, '/\d{4}\.php$/i');
foreach ( $regex as $file ) {
rename($file, $destination . DIRECTORY_SEPARATOR . $file->getFileName());
}
The best way would be to do it directly via the file system, but if you absolutely have to do it with PHP, something like this should do what you want. You'll have to change the paths so that they are correct, obviously. Note that this assumes that there could be other files in the public_html directory, so it only picks up filenames that consist of exactly 4 digits followed by .php.
$d = dir("public_html");
while (false !== ($entry = $d->read())) {
if($entry == '.' || $entry == '..') continue;
if(preg_match('#^\d{4}\.php$#', $entry)) {
// move the file
rename("public_html/".$entry, "/tmp/".$entry);
}
}
$d->close();
In fact, I went to the readdir manual page, and the first comment there reads:
Loop through folders and subfolders with an option to exclude specific files.
<?php
function listFolderFiles($dir,$exclude){
$ffs = scandir($dir);
echo '<ul class="ulli">';
foreach($ffs as $ff){
if(is_array($exclude) and !in_array($ff,$exclude)){
if($ff != '.' && $ff != '..'){
if(!is_dir($dir.'/'.$ff)){
echo '<li>'.$ff.'';
} else {
echo '<li>'.$ff;
}
if(is_dir($dir.'/'.$ff)) listFolderFiles($dir.'/'.$ff,$exclude);
echo '</li>';
}
}
}
echo '</ul>';
}
listFolderFiles('.',array('index.php','edit_page.php'));
?>
Regexes are in fact overkill for this, as we only need to do some simple string matching:
$dir = 'the_directory/';
$handle = opendir($dir) or die("Problem opening the directory");
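// The extra parentheses matter: without them $filename would receive
// the boolean result of the !== comparison instead of the file name
while (($filename = readdir($handle)) !== false)
{
// Skip the current (.) and parent (..) directory entries
if ($filename == '.' || $filename == '..') continue;
//if ($filename != 'index.php' && substr($filename, -4) == '.php')
// I originally thought you only wanted to move php files, but upon
// rereading I think it's not what you really want
// If you don't want to move non-php files, use the line above,
// otherwise the line below
if ($filename != 'index.php')
{
rename($dir . $filename, '/tmp/' . $filename);
}
}
closedir($handle);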
Then for the question:
alternatively, how about moving all files (including index.php) first to /tmp/ then later on put only index.php back to /public_html/, which one you think is less CPU consuming?
It could be done, and it would probably be slightly easier on your CPU. However, there are several reasons why this doesn't matter. First off, you're already doing this in a very inefficient way by doing it through PHP, so you shouldn't really be looking at the strain this puts on your CPU at this point unless you are willing to do it outside PHP. Secondly, that would cause more disk access (especially if the source and destination directory aren't on the same disk or partition) and disk access is much, much slower than your CPU.
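For completeness, here is a minimal sketch of the "simple string matching" route using glob() with a bracket pattern instead of a regex. The function name moveNumberedPhpFiles and its parameters are made up for this example, and it assumes the source path itself contains no glob wildcard characters.

```php
<?php
// Hypothetical helper: move every file named NNNN.php (exactly four digits)
// from $source to $destination and return the names that were moved.
function moveNumberedPhpFiles(string $source, string $destination): array
{
    $moved = [];
    // Each [0-9] matches exactly one digit, so index.php never matches.
    $pattern = rtrim($source, '/') . '/[0-9][0-9][0-9][0-9].php';
    $matches = glob($pattern) ?: []; // glob() returns false on error
    foreach ($matches as $path) {
        $target = rtrim($destination, '/') . '/' . basename($path);
        if (is_file($path) && rename($path, $target)) {
            $moved[] = basename($path);
        }
    }
    return $moved;
}
```

A call such as moveNumberedPhpFiles('/public_html', '/tmp') would leave index.php and any other non-matching file where it is.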
I have a directory with 1.3 Million files that I need to move into a database. I just need to grab a single filename from the directory WITHOUT scanning the whole directory. It does not matter which file I grab as I will delete it when I am done with it and then move on to the next. Is this possible? All the examples I can find seem to scan the whole directory listing into an array. I only need to grab one at a time for processing... not 1.3 Million every time.
This should do it:
<?php
$h = opendir('./'); //Open the current directory
while (false !== ($entry = readdir($h))) {
if($entry != '.' && $entry != '..') { //Skips over . and ..
echo $entry; //Do whatever you need to do with the file
break; //Exit the loop so no more files are read
}
}
?>
readdir
Returns the name of the next entry in the directory. The entries are returned in the order in which they are stored by the filesystem.
Just obtain the directories iterator and look for the first entry that is a file:
foreach(new DirectoryIterator('.') as $file)
{
if ($file->isFile()) {
echo $file, "\n";
break;
}
}
This also ensures that your code still works on file systems that behave differently from the one you expect.
See DirectoryIterator and SplFileInfo.
readdir will do the trick. Check the example on that page, but instead of doing the readdir call in a loop, just do it once. You'll get the first entry in the directory.
Note: you might get ".", "..", and other similar responses depending on the server, so you might want to at least loop until you get a valid file.
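A minimal sketch of "loop until you get a valid file" could look like the following. The function name firstFile is an assumption for this example. Because is_file() is false for "." and "..", no separate dot check is needed here.

```php
<?php
// Hypothetical helper: return the name of the first regular file in $dir,
// or null when the directory has none. Stops reading as soon as one is found.
function firstFile(string $dir): ?string
{
    $handle = opendir($dir);
    if ($handle === false) {
        return null; // directory could not be opened
    }
    $result = null;
    while (false !== ($entry = readdir($handle))) {
        // is_file() rejects ".", ".." and subdirectories in one test.
        if (is_file($dir . '/' . $entry)) {
            $result = $entry;
            break;
        }
    }
    closedir($handle);
    return $result;
}
```

Which file comes back first depends on the order in which the filesystem stores its entries.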
Do you want to return the first directory, the first file, or either? Use this:
Create a function "pickfirst" with two arguments (the address, and a mode selecting directory or file):
function pickfirst($address, $file) { // $file = false: pick first dir, $file = true: pick first file
$address = rtrim($address, '/') . '/'; // make sure the path ends with a slash
$h = opendir($address);
$found = null; // stays null when nothing matches
while (false !== ($entry = readdir($h))) {
if ($entry != '.' && $entry != '..' && is_file($address.$entry) == $file) {
$found = $entry;
break;
}
} // end while
closedir($h);
return $found;
} // end function
If you want to pick the first directory in your address, set $file to false; if you want to pick the first file, set $file to true.
good luck :)
I have the following which is fairly slow. How can I speed it up?
(it scans a directory and makes headers out of the foldernames and retrieves the pdf files from within and adds them to lists)
$directories= array_diff(scandir("../pdfArchive/subfolder", 0), array('..', '.'));
foreach ($directories as $v) {
echo "<h3>".$v."</h3>";
$current = array_diff(scandir("../pdfArchive/subfolder/".$v, 0), array('..', '.'));
echo "<ul style=\"list-style-image: url(/images/pdf.gif); margin-left: 20px;\">";
foreach ($current as $vone) {
echo "<li><a target=\"blank\" href=\"../pdfArchive/subfolder/".$vone."\">".str_replace(".pdf", "", $vone)."</a>";
echo "</li><br>";
}
echo "</ul>";
}
Don't use array_diff() to filter out the current and parent directories; use something like DirectoryIterator or glob() and then test whether an entry is . or .. via an if statement.
glob() has a flag (GLOB_ONLYDIR) that allows you to retrieve only directories for your loops.
Profile your code to see exactly what lines/functions are executing slowly.
I'm not sure how fast array_diff() is when the array is very large. Isn't it faster to simply add a separate check that the returned name is not '.' or '..'?
Other than that, I can't see there being anything really wrong.
What did you test to consider the current approach slow?
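The glob() suggestion above might be sketched as follows. The function name pdfsByFolder is made up for this example, and whether it is actually faster than the scandir/array_diff version is not something this sketch proves, so profile before and after.

```php
<?php
// Hypothetical helper: map each subfolder name of $root to the PDF file names
// it contains. glob('*') never returns "." or "..", so no filtering is needed.
function pdfsByFolder(string $root): array
{
    $result = [];
    // GLOB_ONLYDIR limits the matches to directories.
    $folders = glob(rtrim($root, '/') . '/*', GLOB_ONLYDIR) ?: [];
    foreach ($folders as $folder) {
        $pdfs = glob($folder . '/*.pdf') ?: [];
        // Keep only the file names, not the full paths.
        $result[basename($folder)] = array_map('basename', $pdfs);
    }
    return $result;
}
```

The returned array can then be turned into headings and lists in a separate loop, which keeps the filesystem access and the HTML output apart.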
Here is a snippet of code I use that I adapted from php.net. It is very basic and goes through a given directory and lists the files contained within.
// The @ suppresses any errors, $dir is the directory path
if (($handle = @opendir($dir)) !== FALSE) {
// Loop over directory contents
while (($file = readdir($handle)) !== FALSE) {
// We don't want the current directory (.) or parent (..)
if ($file != "." && $file != "..") {
var_dump($file);
if (!is_dir($dir . $file)) {
// $file is really a file
} else {
// $file is a directory
}
}
}
closedir($handle);
} else {
// Deal with it
}
You may adapt this further to recurse over subdirectories by using is_dir to identify folders as I have shown above.
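As an alternative to writing the recursion by hand, the SPL iterators can walk subdirectories for you. This is a minimal sketch under that assumption; the function name listFilesRecursively is invented for this example.

```php
<?php
// Hypothetical helper: return the paths of all regular files beneath $dir.
function listFilesRecursively(string $dir): array
{
    $files = [];
    // SKIP_DOTS makes the iterator leave out "." and ".." on every level.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // $file is an SplFileInfo object.
        if ($file->isFile()) {
            $files[] = $file->getPathname();
        }
    }
    sort($files); // iteration order is filesystem dependent, so sort for stable output
    return $files;
}
```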
Here is my code:
if ($handle = opendir('banner/')) {
while (false !== ($file = readdir($handle))) {
echo "$file";
}
closedir($handle);
}
When I run this code, unnecessary dots (.) appear in the output.
Output: image-3.jpgimage-4.jpgimage-1.jpgimage-2.jpgimage-5.jpg... Why are there 3 dots at the end?
Because . is the current directory and .. is the parent directory.
They always exist.
If you need to exclude them, just add
if ($file != '.' && $file != '..')
right before the echo.
It's because there are items in your directory which you don't see... one of them is . and represents the current directory, and the other is .. and represents the directory above the current one. You need to filter these out of any readdir results.
I'm coding a simple web report system for my company. I wrote a script for index.php that gets a list of files in the "reports" directory and creates a link to that report automatically. It works fine, but my problem here is that readdir( ) keeps returning the . and .. directory pointers in addition to the directory's contents. Is there any way to prevent this OTHER THAN looping through the returned array and stripping them manually?
Here is the relevant code for the curious:
//Open the "reports" directory
$reportDir = opendir('reports');
//Loop through each file
while (false !== ($report = readdir($reportDir)))
{
//Convert the filename to a proper title format
$reportTitle = str_replace(array('_', '.php'), array(' ', ''), $report);
$reportTitle = strtolower($reportTitle);
$reportTitle = ucwords($reportTitle);
//Output link
echo "$reportTitle<br />";
}
//Close the directory
closedir($reportDir);
In your above code, you could append as a first line in the while loop:
if ($report == '.' or $report == '..') continue;
array_diff(scandir('reports'), array('.', '..'))
or even better:
foreach(glob($dir.'*.php') as $file) {
# do your thing
}
No, those files belong to a directory and readdir should thus return them. I’d consider every other behaviour to be broken.
Anyway, just skip them:
while (false !== ($report = readdir($reportDir)))
{
if (($report == ".") || ($report == ".."))
{
continue;
}
...
}
I would not know another way, as "." and ".." are proper directories as well. As you're looping anyway to form the proper report URL, you might just put in a little if that ignores . and .. for further processing.
EDIT
Paul Lammertsma was a bit faster than me. That's the solution you want ;-)
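If the goal is mainly to avoid writing the dot check yourself, FilesystemIterator can leave those entries out through its SKIP_DOTS flag. The sketch below still loops, and the function name reportTitles and its return shape are assumptions made for this example; the title formatting mirrors the str_replace, strtolower and ucwords steps from the question.

```php
<?php
// Hypothetical helper: map each report file name in $dir to a display title.
function reportTitles(string $dir): array
{
    $titles = [];
    // SKIP_DOTS means "." and ".." are never yielded by the iterator.
    foreach (new FilesystemIterator($dir, FilesystemIterator::SKIP_DOTS) as $file) {
        if (!$file->isFile() || $file->getExtension() !== 'php') {
            continue; // ignore subdirectories and non-PHP files
        }
        // "monthly_SALES.php" becomes "Monthly Sales".
        $name = str_replace('_', ' ', $file->getBasename('.php'));
        $titles[$file->getFilename()] = ucwords(strtolower($name));
    }
    ksort($titles); // stable order regardless of the filesystem
    return $titles;
}
```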
I wanted to check for the "." and the ".." directories as well as any files that might not be valid based on what I was storing in the directory so I used:
while (false !== ($report = readdir($reportDir)))
{
if (strlen($report) < 8) continue;
// do processing
}