PHP oldest path based on a dir name

On my Linux machine I have a folder tree with this structure:
/dir/yyyy/mm/dd/HH
e.g.:
/dir/2014/03/01/08
/dir/2014/03/20/09
/dir/2014/03/01/10
/dir/2014/08/01/10
/dir/2014/12/15/10
/dir/2015/01/01/14
In PHP, I'd like to find which path is the oldest and which is the newest, like this:
The oldest path is: 2014-03-01 08
The newest path is: 2015-01-01 14
How can this be done?

It could be written better, but it works:
$paths = array(
'/dir/2014/03/01/08',
'/dir/2014/03/20/09',
'/dir/2014/03/01/10',
'/dir/2014/08/01/10',
'/dir/2014/12/15/10',
'/dir/2015/01/01/14',
);
$dates = array();
foreach($paths as $path)
{
$matches = array();
preg_match('#([^\/]+?)\/([^\/]+?)\/([^\/]+?)\/([^\/]+?)\/([^\/]+)#', $path, $matches);
$dates[$path] = strtotime(sprintf("%s-%s-%s %s:00:00", $matches[2], $matches[3], $matches[4], $matches[5]));
}
asort($dates);
$dates = array_keys($dates);
$oldest = array_shift($dates);
$newest = array_pop($dates);
It converts the date found by the regex into a Unix timestamp, sorts the timestamps, and takes the first and last entries of the sorted array.
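Since every component of these paths is zero-padded, the paths also sort chronologically as plain strings. A minimal sketch under that assumption, skipping the regex and strtotime entirely (the helper name formatPath is made up for illustration):

```php
<?php
// Assumption: every path follows /dir/yyyy/mm/dd/HH with zero-padded parts,
// so lexicographic order equals chronological order.
$paths = array(
    '/dir/2014/03/01/08',
    '/dir/2014/03/20/09',
    '/dir/2014/03/01/10',
    '/dir/2014/08/01/10',
    '/dir/2014/12/15/10',
    '/dir/2015/01/01/14',
);

// Hypothetical helper: turn "/dir/2014/03/01/08" into "2014-03-01 08".
function formatPath($path)
{
    // Take the last four components: year, month, day, hour.
    list($y, $m, $d, $h) = array_slice(explode('/', $path), -4);
    return "$y-$m-$d $h";
}

// min()/max() compare these non-numeric strings character by character.
$oldest = formatPath(min($paths));
$newest = formatPath(max($paths));

echo "The oldest path is: $oldest\n";
echo "The newest path is: $newest\n";
```

If the directory names were not zero-padded (e.g. 3 instead of 03), string comparison would no longer match date order and the timestamp approach would be the safer choice.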

This one is a little bit Pascal-like in style:
<?php
$oldest = '';
$newest = '';
$iterator = new RecursiveIteratorIterator(new RecursiveDirectoryIterator('./dir/'));
foreach ($iterator as $file => $object) {
if ($iterator->getDepth() === 4) {
$name = $object->getPath();
if ($name > $newest) {
$newest = $name;
}
if (empty($oldest) or $name < $oldest) {
$oldest = $name;
}
}
}
var_export([$oldest, $newest]);
Result:
array (
0 => './dir/2014/03/01/08',
1 => './dir/2015/01/01/14',
)

All you have to do is loop through each folder level and find the directory with the lowest number. If you have the file paths stored in a database it can be easier, but from your question it seems like you want to search the folders.
<?php
$base = 'dir';
// Descend one level at a time: year, month, day, hour
$yearLowest = lowestDir($base);
$monthLowest = lowestDir($yearLowest);
$dayLowest = lowestDir($monthLowest);
$hourLowest = lowestDir($dayLowest);
echo $hourLowest;
function lowestDir($dir) {
    $lowest = null;
    $handle = opendir($dir);
    // Compare against false so that a directory named "0" does not end the loop
    while (($name = readdir($handle)) !== false) {
        if ($name == '.' || $name == '..') {
            continue;
        }
        if (is_dir($dir.'/'.$name) && ($lowest === null || $name < $lowest)) {
            $lowest = $name;
        }
    }
    closedir($handle);
    return $dir.'/'.$lowest;
}
?>
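The same level-by-level descent can be sketched with scandir(), which returns entries in ascending order by default, so the first real subdirectory is already the lowest one. The helper names lowestSubdir and oldestPath are hypothetical, and the demo builds a throwaway tree in the temp directory rather than using the real /dir:

```php
<?php
// Hypothetical helper: return the alphabetically lowest subdirectory of $dir,
// or null when $dir has no subdirectories.
function lowestSubdir($dir)
{
    // scandir() sorts ascending by default.
    foreach (scandir($dir) as $name) {
        if ($name !== '.' && $name !== '..' && is_dir("$dir/$name")) {
            return "$dir/$name";
        }
    }
    return null;
}

// Descend $levels levels (year, month, day, hour), always taking the lowest entry.
function oldestPath($base, $levels = 4)
{
    $path = $base;
    for ($i = 0; $i < $levels; $i++) {
        $path = lowestSubdir($path);
        if ($path === null) {
            return null; // tree is shallower than expected
        }
    }
    return $path;
}

// Demo on a throwaway tree (assumed layout: yyyy/mm/dd/HH).
$base = sys_get_temp_dir() . '/dir_demo_' . uniqid();
foreach (array('2014/03/01/08', '2014/03/20/09', '2015/01/01/14') as $sub) {
    mkdir("$base/$sub", 0777, true);
}
echo oldestPath($base), "\n"; // ends with /2014/03/01/08
```

For the newest path the same idea applies with the entries reversed, for example by passing SCANDIR_SORT_DESCENDING to scandir().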

Related

Scan files in a directory to get the number of methods and PHP classes in a directory

I'm trying to count the classes and methods in a specific directory, including its subfolders. So far I can only count the number of files.
$ite=new RecursiveDirectoryIterator("scanME");
//keyword search
$classWords = array('class');
$functionWords = array('function');
//Global Counts
$bytestotal=0;
$nbfiles=0;
$classCount = 0;
$methodCount = 0;
foreach (new RecursiveIteratorIterator($ite) as $filename=>$cur) {
$filesize=$cur->getSize();
$bytestotal+=$filesize;
if(is_file($cur))
{
$nbfiles++;
foreach ($classWords as $classWord) {
$fileContents = file_get_contents($cur);
$place = strpos($fileContents, $classWord);
if (!empty($place)) {
$classCount++;
}
}
foreach($functionWords as $functionWord) {
$fileContents = file_get_contents($cur);
$place = strpos($fileContents, $functionWord);
if (!empty($place)) {
$methodCount++;
}
}
}
}
EDIT: I managed to count the keywords class and function, but the problem is that each keyword is counted only once per file. E.g. if I have 2 classes in one file, it will count just 1. How do I count every occurrence of a keyword in a file?
The only time you define $classContents is at the top where you're attempting to get the contents of the directory:
$classContents = file_get_contents('scanMeDir');
You should be getting the contents of each file while looping through the iterator results. (Keep the RecursiveIteratorIterator wrapper: iterating the RecursiveDirectoryIterator directly only lists the top-level directory.)
foreach (new RecursiveIteratorIterator($ite) as $filename => $cur) {
$classContents = file_get_contents($filename);
...
}
Using tokens instead of keyword matching is the better solution for this:
$ite = new RecursiveDirectoryIterator("scanME");
$bytestotal = 0;
$nbfiles = 0;
$classCount = 0;
$methodCount = 0;
foreach (new RecursiveIteratorIterator($ite) as $filename => $cur) {
    if (!$cur->isFile()) {
        continue;
    }
    $bytestotal += $cur->getSize();
    $nbfiles++;
    // Only tokenize PHP source files
    if ($cur->getExtension() !== 'php') {
        continue;
    }
    $tokens = token_get_all(file_get_contents($filename));
    $tokenCount = count($tokens);
    for ($i = 2; $i < $tokenCount; $i++) {
        // A declaration is a keyword token, whitespace, then a name
        if ($tokens[$i-1][0] !== T_WHITESPACE || $tokens[$i][0] !== T_STRING) {
            continue;
        }
        if ($tokens[$i-2][0] === T_CLASS) {
            $classCount++;
        } elseif ($tokens[$i-2][0] === T_FUNCTION) {
            $methodCount++;
        }
    }
}
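To see why tokens are more reliable than keyword search, the counting step can be isolated into a function that works on a source string. This is only a sketch: countDeclarations is a made-up name, and it counts named classes and named functions or methods, not anonymous ones.

```php
<?php
// Hypothetical helper: count named classes and named functions/methods
// in a string of PHP source using the tokenizer.
function countDeclarations($source)
{
    $counts = array('classes' => 0, 'functions' => 0);
    $tokens = token_get_all($source);
    $n = count($tokens);
    for ($i = 0; $i < $n - 2; $i++) {
        // Single-character tokens are plain strings; only arrays carry an ID.
        if (!is_array($tokens[$i]) || !is_array($tokens[$i + 1]) || !is_array($tokens[$i + 2])) {
            continue;
        }
        // A declaration looks like: keyword, whitespace, name.
        if ($tokens[$i + 1][0] !== T_WHITESPACE || $tokens[$i + 2][0] !== T_STRING) {
            continue;
        }
        if ($tokens[$i][0] === T_CLASS) {
            $counts['classes']++;
        } elseif ($tokens[$i][0] === T_FUNCTION) {
            $counts['functions']++;
        }
    }
    return $counts;
}

// Sample source: two classes, three methods, one comment, one closure.
$source = <<<'PHP'
<?php
class A { function one() {} function two() {} }
class B { public function three() {} }
// class NotCounted inside a comment
$f = function () {};
PHP;

print_r(countDeclarations($source));
```

The word "class" inside the comment and the anonymous function are both ignored here, which a plain strpos or substr_count search would not manage.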

Retrieving files based on a particular pattern

I am using this function to retrieve files from a directory and its sub-directories.
How can I display only files with _lang.php within this directory and sub-directories?
function getDirContents($dir, &$results = array()){
$files = scandir($dir);
foreach($files as $key => $value){
$path = realpath($dir.DIRECTORY_SEPARATOR.$value);
if(!is_dir($path)) {
$results[] = $path;
} else if($value != "." && $value != "..") {
getDirContents($path, $results);
$results[] = $path;
}
}
return $results;
}
$dir = './test/';
var_dump(getDirContents($dir));
I answered a question like this earlier. Try using the iterator classes:
<?php
function getDirContents($directory, $pattern)
{
$result = array();
$objRecursiveDirectoryIterator = new RecursiveDirectoryIterator($directory, RecursiveDirectoryIterator::SKIP_DOTS);
$objRecursiveIteratorIterator = new RecursiveIteratorIterator($objRecursiveDirectoryIterator);
// use RegexIterator() to grab only files that match $pattern
$objRegexIterator = new RegexIterator($objRecursiveIteratorIterator, $pattern, RecursiveRegexIterator::GET_MATCH);
// iterate through all the results
foreach ($objRegexIterator as $arrMatches) {
$result[] = $arrMatches[0];
}
return $result;
}
$dir = './test/';
$arrDirContents = getDirContents($dir, "~^.+_lang\.php$~i");
var_dump($arrDirContents);
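When the pattern is just a fixed filename ending, a regex is not strictly needed. As an alternative sketch, CallbackFilterIterator can keep only files whose name ends with a given suffix. The function name findBySuffix is hypothetical, and unlike the case-insensitive regex above this comparison is case-sensitive:

```php
<?php
// Hypothetical helper: recursively collect files whose name ends with $suffix.
function findBySuffix($directory, $suffix)
{
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );
    // Keep only regular files whose basename ends with the suffix.
    $matches = new CallbackFilterIterator($files, function ($file) use ($suffix) {
        return $file->isFile()
            && substr($file->getFilename(), -strlen($suffix)) === $suffix;
    });
    $result = array();
    foreach ($matches as $file) {
        $result[] = $file->getPathname();
    }
    sort($result); // stable output order for display
    return $result;
}

// Demo on a throwaway directory.
$dir = sys_get_temp_dir() . '/lang_demo_' . uniqid();
mkdir("$dir/sub", 0777, true);
touch("$dir/en_lang.php");
touch("$dir/sub/fr_lang.php");
touch("$dir/sub/other.php");
var_dump(findBySuffix($dir, '_lang.php'));
```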

Multidimensional array directory map

I'm trying to get a directory structure in a multidimensional array.
I got this far:
function dirtree($dir, $regex = '', $ignoreEmpty = false)
{
if (!$dir instanceof DirectoryIterator) {
$dir = new DirectoryIterator((string) $dir);
}
$dirs = array();
$files = array();
foreach ($dir as $node) {
if ($node->isDir() && !$node->isDot()) {
$tree = dirtree($node->getPathname(), $regex, $ignoreEmpty);
if (!$ignoreEmpty || count($tree)) {
$dirs[$node->getFilename()] = $tree;
}
} elseif ($node->isFile()) {
$name = $node->getFilename();
if ('' == $regex || preg_match($regex, $name)) {
$files[] = $name;
}
}
}
asort($dirs);
sort($files);
return array_merge($dirs, $files);
}
But I am having trouble getting the folder name as the key instead of the index 0, 1, etc. This seems to be because my directories have numeric names?
Array
(
[0] => Array // 0 should be the folder name
(
[0] => m_109225488_1.jpg
[1] => t_109225488_1.jpg
)
[1] => Array
(
[0] => m_252543961_1.jpg
[1] => t_252543961_1.jpg
)
The solution was rather simple, thanks to: Merge array without loss key index
Instead of array_merge, simply do $dirs + $files
Potential solution (potential issue pointed out by Roger Gee):
function dirtree($dir, $regex = '', $ignoreEmpty = false)
{
if (!$dir instanceof DirectoryIterator) {
$dir = new DirectoryIterator((string) $dir);
}
$dirs = array();
$files = array();
foreach ($dir as $node) {
if ($node->isDir() && !$node->isDot()) {
$tree = dirtree($node->getPathname(), $regex, $ignoreEmpty);
if (!$ignoreEmpty || count($tree)) {
$dirs[$node->getFilename()] = $tree;
}
} elseif ($node->isFile()) {
$name = $node->getFilename();
if ('' == $regex || preg_match($regex, $name)) {
$files[] = $name;
}
}
}
return $dirs + $files;
}
Better solution?
function dirtree($dir, $regex = '', $ignoreEmpty = false)
{
if (!$dir instanceof DirectoryIterator) {
$dir = new DirectoryIterator((string) $dir);
}
$filedata = array();
foreach ($dir as $node) {
if ($node->isDir() && !$node->isDot()) {
$tree = dirtree($node->getPathname(), $regex, $ignoreEmpty);
if (!$ignoreEmpty || count($tree)) {
$filedata[$node->getFilename()] = $tree;
}
} elseif ($node->isFile()) {
$name = $node->getFilename();
if ('' == $regex || preg_match($regex, $name)) {
$filedata[] = $name;
}
}
}
return $filedata;
}
Using the array union operation is dangerous since you can potentially overwrite existing files. Consider the following directory structure:
a <-- directory
├── 0 <-- directory (empty)
├── b <-- regular file
└── c <-- directory
└── d <-- regular file
Now consider running the operation using the array union. I get the following result:
array(2) {
[0]=>
array(0) {
}
["c"]=>
array(1) {
[0]=>
string(1) "d"
}
}
Notice how regular file b is not present? This is because the array union operation prefers the existing 0 index over the 0 index from the right operand (which contains the regular files).
I would stick with the original implementation present in the question or use a special bucket for files that doesn't contain a valid filesystem name (e.g. :files:). Note that this may be platform-specific as to what you choose.
In the case of the original implementation, you can decide whether the index is a directory vs regular file by calling is_array or is_scalar on the value. Note that since the directories array is the first parameter to array_merge, you are guaranteed that no directory indexes get incremented and will always refer to the correct directory names.
Here's how you could determine just the directory names:
function getDirectoryNames($result) {
$ds = [];
foreach ($result as $key => $value) {
if (is_array($value)) {
$ds[] = $key;
}
}
return $ds;
}
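The key collision described above can be reproduced with plain arrays, without touching the filesystem. The sketch below compares the union operator, array_merge, and a separate files bucket; the ':files:' key is only an assumed example of a key that is unlikely to be a real name on your platform.

```php
<?php
// Directories keyed by name (PHP turns the numeric-string key "0" into int 0)
// and files in a plain list, as in the dirtree() variants above.
$dirs  = array('0' => array(), 'c' => array('d'));
$files = array('b');

// Union keeps the left operand's entry for key 0, so file "b" is dropped.
$union = $dirs + $files;

// array_merge() renumbers integer keys, so nothing is lost,
// but a numeric directory name is no longer guaranteed to survive as the key.
$merged = array_merge($dirs, $files);

// Assumed alternative: keep files in a bucket whose key cannot clash.
$bucketed = $dirs;
$bucketed[':files:'] = $files;

var_dump(in_array('b', $union, true));   // bool(false)
var_dump(in_array('b', $merged, true));  // bool(true)
var_dump(array_keys($bucketed));
```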
What you are looking for is ksort instead of asort.
<html>
<body>
<?php
function dirtree($dir, $regex = '', $ignoreEmpty = false)
{
if (!$dir instanceof DirectoryIterator) {
$dir = new DirectoryIterator((string) $dir);
}
$dirs = array();
$files = array();
foreach ($dir as $node) {
if ($node->isDir() && !$node->isDot()) {
$tree = dirtree($node->getPathname(), $regex, $ignoreEmpty);
if (!$ignoreEmpty || count($tree)) {
$dirs[$node->getFilename()] = $tree;
}
} elseif ($node->isFile()) {
$name = $node->getFilename();
if ('' == $regex || preg_match($regex, $name)) {
$files[] = $name;
}
}
}
ksort($dirs);
sort($files);
return array_merge($dirs, $files);
}
?>
<pre>
<?php var_dump(dirtree(getcwd())); ?>
</pre>
</body>
</html>
This will do the work for you.
But as mentioned, a better solution would be to separate directories and files like this:
<html>
<body>
<?php
class DirNode {
    public $name;
    public $dirs = [];
    public $files = [];
    // PHP 4-style constructors named after the class no longer work in PHP 8
    public function __construct($dirName) {
        $this->name = $dirName;
    }
    public function isEmpty() {
        return !$this->dirs && !$this->files;
    }
    public function printDir($prefix = "") {
        echo $prefix . $this->name . "/\n";
        foreach ($this->dirs as $subDir) {
            $subDir->printDir($prefix . "  ");
        }
        foreach ($this->files as $file) {
            echo $prefix . "  " . $file . "\n";
        }
    }
}
function dirtree($dir, $regex = '', $ignoreEmpty = false)
{
    if (!$dir instanceof DirectoryIterator) {
        $dir = new DirectoryIterator((string) $dir);
    }
    // Name the node after the directory itself, not the iterator object
    $directory = new DirNode(basename($dir->getPath()));
    foreach ($dir as $node) {
        if ($node->isDir() && !$node->isDot()) {
            $tree = dirtree($node->getPathname(), $regex, $ignoreEmpty);
            // $tree is a DirNode, so count() cannot be used on it
            if (!$ignoreEmpty || !$tree->isEmpty()) {
                $directory->dirs[$node->getFilename()] = $tree;
            }
        } elseif ($node->isFile()) {
            $name = $node->getFilename();
            if ('' == $regex || preg_match($regex, $name)) {
                $directory->files[] = $name;
            }
        }
    }
    ksort($directory->dirs);
    sort($directory->files);
    return $directory;
}
$dirfiles = dirtree(getcwd().'/..');
echo "<pre>";
$dirfiles->printDir();
echo "</pre>";
?>
</body>
</html>

PHP FTP recursive directory listing

I'm trying to make a recursive function to get all the directories and sub directories from my ftp server in an array.
I tried a lot of functions I've found on the web. The one that works best for me is this one:
public function getAllSubDirFiles() {
$dir = array(".");
$a = count($dir);
$i = 0;
$depth = 20;
$b = 0;
while (($a != $b) && ($i < $depth)) {
$i++;
$a = count($dir);
foreach ($dir as $d) {
$ftp_dir = $d . "/";
$newdir = ftp_nlist($this->connectionId, $ftp_dir);
foreach ($newdir as $key => $x) {
if ((strpos($x, ".")) || (strpos($x, ".") === 0)) {
unset($newdir[$key]);
} elseif (!in_array($x, $dir)) {
$dir[] = $x;
}
}
}
$b = count($dir);
}
return $dir ;
}
The problem with this function is that it won't allow a directory to have a "." in its name, and every file located in the root directory is treated as a directory as well. So I adjusted the function and got this:
public function getAllSubDirFiles($ip, $id, $pw) {
$dir = array(".");
$a = count($dir);
$i = 0;
$depth = 20;
$b =0;
while (($a != $b) && ($i < $depth)) {
$i++;
$a = count($dir);
foreach ($dir as $d) {
$ftp_dir = $d . "/";
$newdir = ftp_nlist($this->connectionId, $ftp_dir);
foreach ($newdir as $key => $x) {
if (!is_dir('ftp://'.$id.':'.$pw.'@'.$ip.'/'.$x)) {
unset($newdir[$key]);
} elseif (!in_array($x, $dir)) {
$dir[] = $x;
}
}
}
$b = count($dir);
}
return $dir ;
}
This works pretty well and gives the result I want, but it's so slow that it's unusable.
I also tried working with ftp_rawlist but it has the same drawback of being horribly slow.
public function getAllSubDirFiles() {
$dir = array(".");
$a = count($dir);
$i = 0;
$depth = 20;
$b = 0;
while (($a != $b) && ($i < $depth)) {
$i++;
$a = count($dir);
foreach ($dir as $d) {
$ftp_dir = $d . "/";
$newdir = $this->getFtp_rawlist('/' . $ftp_dir);
foreach ($newdir as $key => $x) {
$firstChar = substr($newdir[$key][0], 0, 1);
$a = 8;
while ($a < count($newdir[$key])) {
if ($a == 8) {
$fileName = $ftp_dir . '/' . $newdir[$key][$a];
} else {
$fileName = $fileName . ' ' . $newdir[$key][$a];
}
$a++;
}
if ($firstChar != 'd') {
unset($newdir[$key]);
} elseif (!in_array($fileName, $dir)) {
$dir[] = $fileName;
}
}
}
$b = count($dir);
}
return $dir;
}
public function getFtp_rawlist($dir) {
$newArr = array();
$arr = ftp_rawlist($this->connectionId, $dir);
foreach ($arr as $value) {
$stringArr = explode(" ", $value);
$newArr[] = array_values(array_filter($stringArr));
}
return $newArr;
}
I've been stuck on this problem for the last couple of days and I am getting desperate. If anyone has any suggestions, please let me know.
If your server supports the MLSD command and you have PHP 7.2 or newer, you can use the ftp_mlsd function:
function ftp_mlsd_recursive($ftp_stream, $directory)
{
$result = [];
$files = ftp_mlsd($ftp_stream, $directory);
if ($files === false)
{
die("Cannot list $directory");
}
foreach ($files as $file)
{
$name = $file["name"];
$filepath = $directory . "/" . $name;
if (($file["type"] == "cdir") || ($file["type"] == "pdir"))
{
// noop
}
else if ($file["type"] == "dir")
{
$result = array_merge($result, ftp_mlsd_recursive($ftp_stream, $filepath));
}
else
{
$result[] = $filepath;
}
}
return $result;
}
If you do not have PHP 7.2, you can try to implement the MLSD command on your own. For a start, see user comment of the ftp_rawlist command:
https://www.php.net/manual/en/function.ftp-rawlist.php#101071
If you cannot use MLSD, you will particularly have problems telling if an entry is a file or folder. While you can use the ftp_size trick, calling ftp_size for each entry can take ages.
But if you need to work against one specific FTP server only, you can use ftp_rawlist to retrieve a file listing in a platform-specific format and parse that.
The following code assumes a common *nix format.
function ftp_nlst_recursive($ftp_stream, $directory)
{
$result = [];
$lines = ftp_rawlist($ftp_stream, $directory);
if ($lines === false)
{
die("Cannot list $directory");
}
foreach ($lines as $line)
{
$tokens = preg_split("/\s+/", $line, 9);
$name = $tokens[8];
$type = $tokens[0][0];
$filepath = $directory . "/" . $name;
if ($type == 'd')
{
$result = array_merge($result, ftp_nlst_recursive($ftp_stream, $filepath));
}
else
{
$result[] = $filepath;
}
}
return $result;
}
For DOS format, see: Get directory structure from FTP using PHP.
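The parsing step for the *nix format can be tried out without an FTP connection. The sketch below pulls it into a standalone function and additionally skips non-entry lines and the "." and ".." entries, which some servers include in their listings. The name parseUnixListLine and the sample lines are made up for illustration; real servers may format listings differently.

```php
<?php
// Hypothetical helper: parse one line of a *nix-style LIST response.
// Returns null for lines that are not entries (e.g. "total 8") and for "." / "..".
function parseUnixListLine($line)
{
    // Nine fields: perms, links, owner, group, size, month, day, time/year, name.
    // Limiting the split to 9 keeps spaces inside the name intact.
    $tokens = preg_split('/\s+/', trim($line), 9);
    if (count($tokens) < 9) {
        return null;
    }
    $name = $tokens[8];
    if ($name === '.' || $name === '..') {
        return null;
    }
    return array(
        'name'  => $name,
        'isDir' => $tokens[0][0] === 'd',
        'size'  => (int) $tokens[4],
    );
}

// Sample lines in the assumed format (made-up data, not from a real server).
$lines = array(
    'total 8',
    'drwxr-xr-x   2 user group     4096 Mar  1 08:00 .',
    'drwxr-xr-x   2 user group     4096 Mar  1 08:00 my folder',
    '-rw-r--r--   1 user group      120 Mar  1 08:00 read me.txt',
);
foreach ($lines as $line) {
    $entry = parseUnixListLine($line);
    if ($entry !== null) {
        echo ($entry['isDir'] ? 'dir:  ' : 'file: ') . $entry['name'] . "\n";
    }
}
```

Skipping "." and ".." matters in a recursive listing, because descending into them would never terminate.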
I've built an OOP FTP client library that can help a lot with this. With just this code you can retrieve a list of only the directories, along with additional useful information (chmod, last modified time, size, ...).
The code:
// Connection
$connection = new FtpConnection("localhost", "foo", "12345");
$connection->open();
// FtpConfig
$config = new FtpConfig($connection);
$config->setPassive(true);
$client = new FtpClient($connection);
$allFolders =
// directory, recursive, filter
$client->listDirectoryDetails('/', true, FtpClient::DIR_TYPE);
// Do whatever you want with the folders
This code is a variation of Martin Prikryl's. It is slower but does not fail on names containing whitespace. Use it only if you have problems with the code above.
function ftp_list_files_recursive($ftp_stream, $path){
$lines = ftp_nlist($ftp_stream, $path);
$result = array();
foreach ($lines as $line) {
if (ftp_size($ftp_stream, $line) == -1) {
$result = array_merge($result, ftp_list_files_recursive($ftp_stream, $line));
}
else{
$result[] = $line;
}
}
return $result;
}

php multiple search values in file path

I am working on a php site that needs to search a set of files with any combination of search fields.
The possible search fields are
id, year, building, lastname, firstname, birthdate
The folder structure and file names are as such
/year/building/file.pdf
The filenames contain the data to search
id_lastname_firstname_MM_dd_yy.pdf
I have everything working on the site except this part. Originally I only had ID, year, and building, and I was able to use ifs to check every possible combination. Now there are far more combinations, so it is much more complex.
I was thinking of nested ifs with in_array or similar, but there has to be a better way. I am just learning my way around PHP.
I would like to be able to search with any combination of fields. I can change the filenames if it helps.
I started with something like this
function search($transcripts, $studentid=null, $year=null, $building=null, $last=null, $first=null, $birthdate=null){
$ext = '.pdf';
date_default_timezone_set('America/Los_Angeles');
$dir_iterator = new RecursiveDirectoryIterator("../transcripts");
$iterator = new RecursiveIteratorIterator($dir_iterator, RecursiveIteratorIterator::SELF_FIRST);
foreach ($iterator as $file) {
if ($file->isFile()){
$path = explode('\\',$file->getPath());
$fname = explode('_', $file->getBasename($ext));
if($path[1] == $year){
if($path[2] == $building){
if(in_array($last, $fname, true)){
if((in_array($first, $fname, true)){
if((in_array($birthdate
Originally I had separate functions depending on which fields were filled in.
function bldStuSearch($building, $studentid, $transcripts){
$ext = '.pdf';
date_default_timezone_set('America/Los_Angeles');
$dir_iterator = new RecursiveDirectoryIterator("../transcripts");
$iterator = new RecursiveIteratorIterator($dir_iterator, RecursiveIteratorIterator::SELF_FIRST);
foreach ($iterator as $file) {
$results = explode('\\',$file->getPath());
//var_dump($results);
if (($file->isFile()) && ($file->getBasename($ext)==$studentid) && ($results[2] == $building)){
//echo substr($file->getPathname(), 27) . ": " . $file->getSize() . " B; modified " . date("Y-m-d", $file->getMTime()) . "\n";
$results = explode('\\',$file->getPath());
//var_dump($results);
//$building = $results[2];
$year = $results[1];
//$studentid = $file->getBasename($ext);
array_push($transcripts, array($year, $building, $studentid));
//var_dump($transcripts);
//$size += $file->getSize();
//echo '<br>';
}
}
//echo "\nTotal file size: ", $size, " bytes\n";
if (empty($transcripts))
{
header('Location: index.php?error=2'); exit();
}
return $transcripts;
}
Now I am trying to have one search function that checks for any combination. Any ideas that would at least point me in the right direction?
Thanks.
So I had an idea about doing a scoring system but then dismissed it. I came back to it and found a way to make it work using a weighted scoring system.
This allows the search to be super flexible and maintain being portable, not requiring a database for the metadata and using the filename as the search data without having to search each PDF. I am using A-Pdf splitter to split the PDF into separate files and add the metadata to the filename.
I hope someone some day finds this useful for other searches. I am really happy the way this turned out.
I will post the entire code when I am done on http://github.com/friedcircuits
One thing I should change is to use named keys for the arrays.
Here is the resulting code. Right now the birthdate has to be entered as m-d-yyyy to match.
function search($transcripts, $studentid=null, $year=null, $building=null, $last=null, $first=null, $birthdate=null){
$ext = '.pdf';
$bldSearch = false;
date_default_timezone_set('America/Los_Angeles');
if (($building == null) AND ($year == null)){ $searchLocation = "../transcripts";}
elseif (($year != null) AND ($building != null)){$searchLocation = "../transcripts/".$year."/".$building;}
elseif ($year != null) {$searchLocation = "../transcripts/".$year;}
elseif ($building != null) {
$searchLocation = "../transcripts/";
$bldSearch = true;
}
else{$searchLocation = "../transcripts";}
$dir_iterator = new RecursiveDirectoryIterator($searchLocation);
$iterator = new RecursiveIteratorIterator($dir_iterator, RecursiveIteratorIterator::SELF_FIRST);
$score = 0;
foreach ($iterator as $file) {
if ($file->isFile()){
//Fix for slashes changing direction depending on search path
$path = str_replace('/','\\', $file->getPath());
$path = explode('\\',$path);
$fname = explode('_', $file->getBasename($ext));
//var_dump($path);
//echo "<br>";
//var_dump($fname);
//echo "<br>";
//fix for different search paths
if($path[1] == "transcripts"){
$pYear = $path[2];
$pbuilding = $path[3];
}
else{
$pYear = $path[1];
$pbuilding = $path[2];
}
if ($bldSearch == true){
if ($building != $pbuilding) {continue;}
}
//$fname[1] = @strtolower($fname[1]);
//$fname[2] = @strtolower($fname[2]);
if($fname[0] == $studentid){
$yearS = $pYear;
$buildingS = $pbuilding;
$studentidS = $fname[0];
$lastS = $fname[1];
$firstS = $fname[2];
$birthdateS = $fname[3];
array_push($transcripts, array($yearS, $buildingS, $studentidS, $lastS, $firstS, $birthdateS));
continue;
}
if($pYear == $year){
$score += 1;
}
if($pbuilding == $building){
$score += 1;
}
if($last != null && isset($fname[1]) && strpos(strtolower($fname[1]), strtolower($last)) !== false){
$score += 3;
}
if($first != null && isset($fname[2]) && strpos(strtolower($fname[2]), strtolower($first)) !== false){
$score += 3;
}
if($fname[3] == $birthdate){
$score += 3;
}
//echo $score." ";
if ($score > 2) {
$yearS = $pYear;
$buildingS = $pbuilding;
$studentidS = $fname[0];
$lastS = $fname[1];
$firstS = $fname[2];
$birthdateS = $fname[3];
array_push($transcripts, array($yearS, $buildingS, $studentidS, $lastS, $firstS, $birthdateS));
//var_dump($transcripts);
}
}
$score = 0;
}
if (empty($transcripts))
{
header('Location: index.php?error=2'); exit();
}
return $transcripts;}
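The scoring step on its own, using named keys as suggested above, could look roughly like this. It is a sketch, not the author's final code: scoreFile and parseFilename are hypothetical names, the filename layout id_lastname_firstname_birthdate.pdf is assumed, and the weights simply mirror the ones used above.

```php
<?php
// Hypothetical helper: score one file's metadata against the search criteria.
// Criteria left as null (or not set) are ignored. Weights mirror the ones above:
// 1 for year/building, 3 for name or birthdate matches.
function scoreFile(array $meta, array $criteria)
{
    $score = 0;
    foreach (array('year' => 1, 'building' => 1, 'birthdate' => 3) as $field => $weight) {
        if (isset($criteria[$field]) && $meta[$field] == $criteria[$field]) {
            $score += $weight;
        }
    }
    // Names match on a case-insensitive substring.
    foreach (array('last', 'first') as $field) {
        if (isset($criteria[$field]) && $criteria[$field] !== ''
            && stripos($meta[$field], $criteria[$field]) !== false) {
            $score += 3;
        }
    }
    return $score;
}

// Assumed filename layout: id_lastname_firstname_birthdate.pdf
function parseFilename($filename, $year, $building)
{
    $parts = explode('_', basename($filename, '.pdf'), 4);
    $parts = array_pad($parts, 4, ''); // tolerate short names
    return array(
        'year' => $year, 'building' => $building,
        'id' => $parts[0], 'last' => $parts[1],
        'first' => $parts[2], 'birthdate' => $parts[3],
    );
}

$meta = parseFilename('1234_Smith_John_3-1-2001.pdf', '2014', 'east');
echo scoreFile($meta, array('last' => 'smi', 'year' => '2014')), "\n"; // 4
echo scoreFile($meta, array('building' => 'west')), "\n";              // 0
```

With the threshold from the code above (a score greater than 2 counts as a hit), the first search would match this file and the second would not.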
