I made a tool to organize file content in a specific way. These files are located all over my pc which runs on Windows 7. The tool is made up of two parts: 1 the interface holding a form. 2 the script to do the work.
Instead of having to manually write the full path to a certain directory to the main script, I'd rather have the tool search for it and retrieve it seamlessly. I'm thinking of maybe adding a textfield and a button, in which I can enter the directory's name I'm looking for and after clicking the button, retrieve the directory's full pathname, print it to the same textfield and then pass it along to the program itself.
I've searched for several day for ways to have PHP interact with Windows (maybe with window's search object), but all I've found is very looong documentation on the COM and then on the NET. These however seem to strictly deal with accessing Office objects, since most of the available examples are about Excel or Word objects.
How can I accomplish the functionality I want to add to my interface?
To avoid further confusion, this is the image of the Window's object I'm referring to > Windows Starup Search Field
Use this handy function - just point it to the root of your filesystem and it will return an array with all the matching files - I mean, matched by the regular expression pattern you provide to the function.
// PREG_FIND_RECURSIVE - go into subdirectorys looking for more files
// PREG_FIND_DIRMATCH - return directorys that match the pattern also
// PREG_FIND_DIRONLY - return only directorys that match the pattern (no files)
// PREG_FIND_FULLPATH - search for the pattern in the full path (dir+file)
// PREG_FIND_NEGATE - return files that don't match the pattern
// PREG_FIND_RETURNASSOC - Instead of just returning a plain array of matches,
// return an associative array with file stats
// to use more than one simply seperate them with a | character
define('PREG_FIND_RECURSIVE', 1);
define('PREG_FIND_DIRMATCH', 2);
define('PREG_FIND_FULLPATH', 4);
define('PREG_FIND_NEGATE', 8);
define('PREG_FIND_DIRONLY', 16);
define('PREG_FIND_RETURNASSOC', 32);
function preg_find($pattern, $start_dir='.', $args=NULL)
{
$files_matched = array();
$fh = #opendir($start_dir);
if($fh)
{
while (($file = readdir($fh)) !== false)
{
if (strcmp($file, '.')==0 || strcmp($file, '..')==0) continue;
$filepath = $start_dir . '/' . $file;
if (preg_match($pattern, ($args & PREG_FIND_FULLPATH) ? $filepath : $file))
{
$doadd = is_file($filepath)
|| (is_dir($filepath) && ($args & PREG_FIND_DIRMATCH))
|| (is_dir($filepath) && ($args & PREG_FIND_DIRONLY));
if ($args & PREG_FIND_DIRONLY && $doadd && !is_dir($filepath)) $doadd = false;
if ($args & PREG_FIND_NEGATE) $doadd = !$doadd;
if ($doadd)
{
if ($args & PREG_FIND_RETURNASSOC) // return more than just the filenames
{
$fileres = array();
if (function_exists('stat'))
{
$fileres['stat'] = stat($filepath);
$fileres['du'] = $fileres['stat']['blocks'] * 512;
}
//if (function_exists('fileowner')) $fileres['uid'] = fileowner($filepath);
//if (function_exists('filegroup')) $fileres['gid'] = filegroup($filepath);
//if (function_exists('filetype')) $fileres['filetype'] = filetype($filepath);
//if (function_exists('mime_content_type')) $fileres['mimetype'] = mime_content_type($filepath);
if (function_exists('dirname')) $fileres['dirname'] = dirname($filepath);
if (function_exists('basename')) $fileres['basename'] = basename($filepath);
//if (isset($fileres['uid']) && function_exists('posix_getpwuid ')) $fileres['owner'] = posix_getpwuid ($fileres['uid']);
$files_matched[$filepath] = $fileres;
}
else array_push($files_matched, $filepath);
}
}
if ( is_dir($filepath) && ($args & PREG_FIND_RECURSIVE) ) $files_matched = array_merge($files_matched, preg_find($pattern, $filepath, $args));
}
closedir($fh);
}
return $files_matched;
}
Example usage:
$arr = preg_find('/./','z:\temp');
var_dump($arr);
Example output:
Another example:
$arr = preg_find('/\.tmp$/i','z:\temp',PREG_FIND_RECURSIVE | PREG_FIND_DIRMATCH);
var_dump($arr);
On the first place, thanks to #IVO GELOV for the handy script you generously shared with me.
In cases someone else needs this info.
After much searching, I found out that I only needed to use Tkinter to navigate through directories. It offers a Dialog Box with two methods: one to get the full path to a file and the other to a directory, which was what I needed. The path can be stored in a variable.
Thanks to all for the input.
Related
I'm attempting to add a feature to our intranet, which will allow users to log onto the intranet, and access documents stored within a Windows network SAN.
At the moment, I've successfully retrieved all the file and folder names within a specified users 'My Documents'.
I'm having difficulty removing hidden files and folders from the array.
At the moment, I can remove all folders and files starting with ..
However on Windows, they're being marked as 'hidden' in the properties. I've googled and found lots of resources about how to mark a file as hidden, and how to hide files that start with a ., but none on how to remove hidden windows files / folders. One post on stackoverflow mentions to use DirectoryIterator, but at the moment, but haven't explained at all how to use it to check if a files marked as hidden.
We have over 1000 users, with approximately 500MB - 1GB of documents, with multiple layers of directories, so It needs to be relatively fast.
For clarification:
During a recursive iteration on a Windows system, how can I find out whether a directory is hidden or not, without relying on a prepended . symbol?
Ok, so worked it out, with help from the exec() function, so use with care!
I'm using CodeIgniter, so I've modified the directory_helper.php function slightly, as its installed on a windows box, it'll always need to check for the hidden files, but it should also work for non-codeigniter sites:
function directory_map($source_dir, $directory_depth = 0, $hidden = FALSE)
{
if ($fp = #opendir($source_dir))
{
if(!$hidden)
{
$exclude = array();
exec('dir "' . $source_dir . '" /ah /B', $exclude);
}
$filedata = array();
$new_depth = $directory_depth - 1;
$source_dir = rtrim($source_dir, DIRECTORY_SEPARATOR).DIRECTORY_SEPARATOR;
while (FALSE !== ($file = readdir($fp)))
{
// Remove '.', '..', and hidden files [optional]
if ( ! trim($file, '.') OR ($hidden == FALSE && $file[0] == '.') OR ($hidden === FALSE && in_array($file, $exclude)))
{
continue;
}
if (($directory_depth < 1 OR $new_depth > 0) && #is_dir($source_dir.$file))
{
$filedata[$file] = directory_map($source_dir.$file.DIRECTORY_SEPARATOR, $new_depth, $hidden);
}
else
{
$filedata[] = $file;
}
}
closedir($fp);
return $filedata;
}
return FALSE;
}
This scanned 2207 files, and 446 folders in approx 11 seconds (Ages I know, but the best I could do). Tested it on 500 folders and 200 files, and did it in around 3 seconds.
Its a recursive function which will scan each non-hidden directory. The first thing it does is scan the current directory for all hidden files and folders using the exec('dir *directory* /ah /B') function.
It will then store the results in an array and make sure that the current file/directory being read isn't in that array.
Here´s my recent result for recursive listing of a user directory.
I use the results to build a filemanger (original screenshots).
(source: ddlab.de)
Sorry, the 654321.jpg is uploaded several times to different folders, thats why it looks a bit messy.
(source: ddlab.de)
Therefor I need two separate arrays, one for the directory tree, the other for the files.
Here only showing the php solution, as I am currently still working on javascript for usability. The array keys contain all currently needed infos. The key "tree" is used to get an ID for the folders as well as a CLASS for the files (using jquery, show files which are related to the active folder and hide which are not) a.s.o.
The folder list is an UL/LI, the files section is a sortable table which includes a "show all files"-function, where files are listed completely, sortable as well, with path info.
The function
function build_tree($dir,$deep=0,$tree='/',&$arr_folder=array(),&$arr_files=array()) {
$dir = rtrim($dir,'/').'/'; // not really necessary if 1st function call is clean
$handle = opendir($dir);
while ($file = readdir($handle))
{
if ($file != "." && $file != "..")
{
if (is_dir($dir.$file))
{
$deep++;
$tree_pre = $tree; // remember for reset
$tree = $tree.$file.'/'; // bulids something like "/","/sub1/","/sub1/sub2/"
$arr_folder[$tree] = array('tree'=>$tree,'deep'=>$deep,'file'=>$file);
build_tree($dir.$file,$deep,$tree,$arr_folder,$arr_files); // recursive function call
$tree = $tree_pre; // reset to go to upper levels
$deep--; // reset to go to upper levels
}
else
{
$arr_files[$file.'.'.$tree] = array('tree'=>$tree,'file'=>$file,'filesize'=>filesize($dir.$file),'filemtime'=>filemtime($dir.$file));
}
}
}
closedir($handle);
return array($arr_folder,$arr_files); //cannot return two separate arrays
}
Calling the function
$build_tree = build_tree($udir); // 1st function call, $udir is my user directory
Get the arrays separated
$arr_folder = $build_tree[0]; // separate the two arrays
$arr_files = $build_tree[1]; // separate the two arrays
see results
print_r($arr_folder);
print_r($arr_files);
It works like a charme,
Whoever might need something like this, be lucky with it.
I promise to post the entire code, when finished :-)
I am working on my 404 error doc, and I was thinking instead of just giving a sitemap, one could suggest to the user the website he might have looked for based on what actually exists on the server.
Example: if the person typed in "www.example.com/foldr/site.html", the 404 page could output:
Did you mean "www.example.com/folder/site.html"?
For this, I wrote the following code which works for me very well. My question now is: is it "safe" to use this? As basically someone could detect all files on the server by trying all kind of combinations. Or a hacker could even use a script that loops through and lists all types of valid URLs.
Should I limit the directories this script can detect and propose? With an array of "OK"-locations, or by file type?
Had anyone else already got an idea like this?
PHP:
// get incorrect URL that was entered
$script = explode("/",$_SERVER['SCRIPT_NAME']);
$query = $_SERVER['QUERY_STRING'];
// create vars
$match = array();
$matched = "../";
// loop through the given URL folder by folder to find the suggested location
foreach ($script as $dir) {
if (!$dir) {
continue;
}
if ($handle = opendir($matched)) {
while (false !== ($entry = readdir($handle))) {
if ($entry != "." && $entry != "..") {
similar_text($dir, $entry, $perc);
if ($perc > 80) {
$match[$entry] = $perc;
}
}
}
closedir($handle);
if ($match) {
arsort($match);
reset($match);
$matched .= key($match)."/";
} else {
$matched = false;
break;
}
$match = array();
}
}
// trim and echo the result that had the highest match
$matched = trim(ltrim(rtrim($matched,"/"),"."));
echo "proposed URL: ".$_SERVER["SERVER_NAME"].$matched;
Yup, you can see it as this:
Imagine a house with only glass walls on the outside, but it's night. You're a thief (hacker) and you want to check the house for worthfull loot (files with passwords, db connections etc).
If you don't protect (certain) files, you would be putting the lights on in every part of the house. The thief would look through the windows and see that you have loot - now the only the he would have to do is get in and take it.
If you do protect the files, the thief won't even be able to know that there was any loot in the house, and thus would the thief have a higher chance of moving on to the next house.
What exactly are the benefits of using a PHP 5 DirectoryIterator
$dir = new DirectoryIterator(dirname(__FILE__));
foreach ($dir as $fileinfo)
{
// handle what has been found
}
over a PHP 4 "opendir/readdir/closedir"
if($handle = opendir(dirname(__FILE__)))
{
while (false !== ($file = readdir($handle)))
{
// handle what has been found
}
closedir($handle);
}
besides the subclassing options that come with OOP?
To understand the difference between the two, let's write two functions that read contents of a directory into an array - one using the procedural method and the other object oriented:
Procedural, using opendir/readdir/closedir
function list_directory_p($dirpath) {
if (!is_dir($dirpath) || !is_readable($dirpath)) {
error_log(__FUNCTION__ . ": Argument should be a path to valid, readable directory (" . var_export($dirpath, true) . " provided)");
return null;
}
$paths = array();
$dir = realpath($dirpath);
$dh = opendir($dir);
while (false !== ($f = readdir($dh))) {
if ("$f" != '.' && "$f" != '..') {
$paths[] = "$dir" . DIRECTORY_SEPARATOR . "$f";
}
}
closedir($dh);
return $paths;
}
Object Oriented, using DirectoryIterator
function list_directory_oo($dirpath) {
if (!is_dir($dirpath) || !is_readable($dirpath)) {
error_log(__FUNCTION__ . ": Argument should be a path to valid, readable directory (" . var_export($dirpath, true) . " provided)");
return null;
}
$paths = array();
$dir = realpath($dirpath);
$di = new DirectoryIterator($dir);
foreach ($di as $fileinfo) {
if (!$fileinfo->isDot()) {
$paths[] = $fileinfo->getRealPath();
}
}
return $paths;
}
Performance
Let's assess their performance first:
$start_t = microtime(true);
for ($i = 0; $i < $num_iterations; $i++) {
$paths = list_directory_oo(".");
}
$end_t = microtime(true);
$time_diff_micro = (($end_t - $start_t) * 1000000) / $num_iterations;
echo "Time taken per call (list_directory_oo) = " . round($time_diff_micro / 1000, 2) . "ms (" . count($paths) . " files)\n";
$start_t = microtime(true);
for ($i = 0; $i < $num_iterations; $i++) {
$paths = list_directory_p(".");
}
$end_t = microtime(true);
$time_diff_micro = (($end_t - $start_t) * 1000000) / $num_iterations;
echo "Time taken per call (list_directory_p) = " . round($time_diff_micro / 1000, 2) . "ms (" . count($paths) . " files)\n";
On my laptop (Win 7 / NTFS), procedural method seems to be clear winner:
C:\code>"C:\Program Files (x86)\PHP\php.exe" list_directory.php
Time taken per call (list_directory_oo) = 4.46ms (161 files)
Time taken per call (list_directory_p) = 0.34ms (161 files)
On an entry-level AWS machine (CentOS):
[~]$ php list_directory.php
Time taken per call (list_directory_oo) = 0.84ms (203 files)
Time taken per call (list_directory_p) = 0.36ms (203 files)
Above are results on PHP 5.4. You'll see similar results using PHP 5.3 and 5.2. Results are similar when PHP is running on Apache or NGINX.
Code Readability
Although slower, code using DirectoryIterator is more readable.
File reading order
The order of directory contents read using either method are exact same. That is, if list_directory_oo returns array('h', 'a', 'g'), list_directory_p also returns array('h', 'a', 'g')
Extensibility
Above two functions demonstrated performance and readability. Note that, if your code needs to do further operations, code using DirectoryIterator is more extensible.
e.g. In function list_directory_oo above, the $fileinfo object provides you with a bunch of methods such as getMTime(), getOwner(), isReadable() etc (return values of most of which are cached and do not require system calls).
Therefore, depending on your use-case (that is, what you intend to do with each child element of the input directory), it's possible that code using DirectoryIterator performs as good or sometimes better than code using opendir.
You can modify the code of list_directory_oo and test it yourself.
Summary
Decision of which to use entirely depends on use-case.
If I were to write a cronjob in PHP which recursively scans a directory (and it's subdirectories) containing thousands of files and do certain operation on them, I would choose the procedural method.
But if my requirement is to write a sort of web-interface to display uploaded files (say in a CMS) and their metadata, I would choose DirectoryIterator.
You can choose based on your needs.
Benefit 1: You can hide away all the boring details.
When using iterators you generally define them somewhere else, so real-life code would look something more like:
// ImageFinder is an abstraction over an Iterator
$images = new ImageFinder($base_directory);
foreach ($images as $image) {
// application logic goes here.
}
The specifics of iterating through directories, sub-directories and filtering out unwanted items are all hidden from the application. That's probably not the interesting part of your application anyway, so it's nice to be able to hide those bits away somewhere else.
Benefit 2: What you do with the result is separated from obtaining the result.
In the above example, you could swap out that specific iterator for another iterator and you don't have to change what you do with the result at all. This makes the code a bit easier to maintain and add new features to later on.
A DirectoryIterator provides you with items that make sense in themselves. For example, DirectoryIterator::getPathname() will return all the information that you need to access the file contents.
The information that readdir() provides to you only make sense locally, namely in combination with the parameter that you passed to opendir().
The DirectoryIterator is implemented in terms of wrappers around the php_stream_* functions, so no fundamentally different performance characteristics are to be expected. Particularly, items from the directory are read only when they are requested. Details can be found in the file
ext/spl/spl_directory.c
of the PHP source code.
It's shorter, cleaner and easier to type and read.
Try re-read your examples. Just “for each in $dir” in first example.
What you want, that you write…
I cannot find a way to get the user's home directory (e.g. /home/jack; whatever ~ in bash points to) in PHP using CGI (suPHP). The $_ENV array is empty, and getenv('HOME') returns nothing.
The reason I want to do this is that in absense of configuration saying otherwise, I want to find variable files used by my application in /home/user/.myappnamehere, as most Linux applications do.
I've built something, but it's not the best; While it works, it assumes a lot about the system (e.g. the presence of /etc/passwd)
$usr = get_current_user();
$passwd = file('/etc/passwd');
$var = false;
foreach ($passwd as $line) {
if (strstr($line, $usr) !== false) {
$parts = explode(':', $line);
$var = realpath($parts[5].'/.report');
break;
}
}
I think you want the result of either:
http://us.php.net/manual/en/function.getmyuid.php or
http://us.php.net/manual/en/function.posix-getuid.php sent to
http://us.php.net/manual/en/function.posix-getpwuid.php
If safemode is disabled, try this one
$homedir = `cd ~ && pwd`;