I wanted to play around with some of PHP's iterators and managed to get a solid (from my understanding) build going. My goal was to iterate inside of a parent folder and get 2 nodes down, building a hierarchical tree array during the process. Obviously, I could do this fairly easily using glob and a couple of nested loops, but I want to use SPL classes to accomplish this.
All that out of the way, I've played around with SplHeap and SplObjectStorage to build the hierarchy and failed. What's messing with my noodle is that my normal methods of recursion are failing (out-of-memory errors), and my one success is a recursive method that loops over each node, adding to an array. The problem with that is it ignores the setMaxDepth() method and goes through all of the children. I thought about incrementing a $var++ counter through the loop to limit the nodes, but I don't believe that's the "right way".
Anywho, code (sorry for any orphaned code if any - just ignore it)...
<?php
namespace Tree;
use RecursiveFilterIterator,
RecursiveDirectoryIterator,
RecursiveIteratorIterator;
class Filter extends RecursiveFilterIterator {
public static $FILTERS = array(
'.git', '.gitattributes', '.gitignore', 'index.php'
);
public function accept() {
if (!$this->isDot() && !in_array($this->current()->getFilename(), self::$FILTERS))
return TRUE;
return FALSE;
}
}
class DirTree {
const MAX_DEPTH = 2;
private static $iterator;
private static $objectStore;
public function __construct() {
error_reporting(8191);
$path = realpath('./');
try {
$dirItr = new RecursiveDirectoryIterator($path);
$filterItr = new Filter($dirItr);
$objects = new RecursiveIteratorIterator($filterItr, RecursiveIteratorIterator::SELF_FIRST);
$objects->setMaxDepth(self::MAX_DEPTH);
echo '<pre>';
print_r($this->build_hierarchy($objects));
} catch(Exception $e) {
die($e->getMessage());
}
}
public function build_hierarchy($iterator){
$array = array();
foreach ($iterator as $fileinfo) {
if ($fileinfo->isDir()) {
// Directories and files have labels
$current = array(
'label' => $fileinfo->getFilename()
);
// Only directories have children
if ($fileinfo->isDir()) {
$current['children'] = $this->build_hierarchy($iterator->getChildren());
}
// Append the current item to this level
$array[] = $current;
}
}
return $array;
}
}
$d = new DirTree;
A RecursiveIteratorIterator is designed primarily to give you an iterator that behaves like an iterator over a flat list, but the flat list is actually just the sequence of elements visited in a recursive traversal. It does this by internally managing a stack of RecursiveIterators, calling getChildren() on them as necessary. A client of RecursiveIteratorIterator is only really supposed to call the normal Iterator methods like current() and next(), with the exception of value-added methods like setMaxDepth().
Your problem is that you're trying to do the recursion yourself by calling getChildren(). If you want to manually manage the recursion, that's fine, but that makes RecursiveIteratorIterator redundant. In fact, I'm really surprised calling getChildren() on RecursiveIteratorIterator didn't cause a fatal error. That's a RecursiveIterator method. SPL is probably just forwarding the method call to the inner iterator (some SPL classes forward calls to undefined methods in order to make it easy to use the Decorator design pattern).
The right way:
$dirItr = new RecursiveDirectoryIterator($path);
$filterItr = new Filter($dirItr);
$objects = new RecursiveIteratorIterator($filterItr, RecursiveIteratorIterator::SELF_FIRST);
$objects->setMaxDepth(self::MAX_DEPTH);
echo '<pre>';
foreach ($objects as $splFileInfo) {
echo $splFileInfo;
echo "\n";
}
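To see setMaxDepth() being honoured when you only use the normal Iterator methods, here is a minimal sketch. It assumes a nested array as a stand-in for the directory tree (the keys play the role of file names), so it runs without touching the filesystem; the variable names are made up for illustration.

```php
<?php
// Nested array standing in for a directory tree (assumption, not real files).
$nested = array(
    'src'    => array('lib' => array('deep' => array('x.php' => 1))),
    'README' => 1,
);

$it = new RecursiveIteratorIterator(
    new RecursiveArrayIterator($nested),
    RecursiveIteratorIterator::SELF_FIRST
);
$it->setMaxDepth(1); // visit depth 0 and depth 1 only

$seen = array();
foreach ($it as $name => $value) {
    // getDepth() reports how deep the flattened traversal currently is.
    $seen[] = str_repeat('  ', $it->getDepth()) . $name;
}
// $seen is now: 'src', '  lib', 'README' ('deep' and 'x.php' are never visited)
```

The loop never calls getChildren() itself, which is why the depth limit applies.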
I'm not going to get into forming the hierarchical array in some particular structure for you, but maybe this related question will help you understand the difference between RecursiveIteratorIterator and RecursiveIterator:
How does RecursiveIteratorIterator work in PHP?
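If you would rather manage the recursion by hand, one possible shape is sketched below. It walks a plain RecursiveIterator (no RecursiveIteratorIterator involved) and carries its own depth counter. The function name buildTree() and the nested test array are assumptions for illustration; with a RecursiveDirectoryIterator you would probably take the label from $value->getFilename() and pass the SKIP_DOTS flag.

```php
<?php
// Hypothetical helper: builds a label/children tree down to $maxDepth levels.
function buildTree(RecursiveIterator $it, $maxDepth, $depth = 0)
{
    $tree = array();
    foreach ($it as $key => $value) {
        $node = array('label' => (string) $key);
        // Descend only while we are still above the depth limit.
        if ($it->hasChildren() && $depth < $maxDepth) {
            $node['children'] = buildTree($it->getChildren(), $maxDepth, $depth + 1);
        }
        $tree[] = $node;
    }
    return $tree;
}

// Nested array standing in for a directory tree (assumption).
$fake = new RecursiveArrayIterator(array(
    'src'    => array('lib' => array('deep' => array('x.php' => 1)), 'a.php' => 1),
    'README' => 1,
));

$tree = buildTree($fake, 1);
// 'src' gets children 'lib' and 'a.php'; 'lib' is listed but not descended into.
```

The depth parameter does the job that setMaxDepth() does for RecursiveIteratorIterator, which is the price of doing the recursion yourself.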
Well, if I do something like:
boot.php:
function boot($c) { require 'mods/'.$c.'.php'; }
spl_autoload_register('boot');
index.php
require 'boot.php';
class Father {
function __construct()
{
/* get all modules in database then loop it like: */
foreach($mods as $v) eval('$cmod = new '.$v.'()');
}
}
new Father();
Example of a class module:
class mod01 extends Father {
function __construct() { /* code */ }
}
I would like to know whether using eval here is good or bad practice. I'm using it because I don't know the names of the mods in advance; they come from the DB.
You don't need eval() (and if it's not necessary, simply don't use it).
foreach($mods as $v)
$cmod = new $v();
Works too.
You could do it either way and still have access to the newly created classes.
$classes = array() ;
foreach($mods as $v){
$classes[] = new $v(); // What's the point of overwriting $cmod?
}
Otherwise you just overwrite the reference to the object in each iteration, so store your references in an array.
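Since the names come from a database, it may be worth checking them before instantiating. The following is only a sketch under assumptions: the Module base class, the Mod01/Mod02 classes and the $names list are invented here to stand in for your Father class and the rows you load.

```php
<?php
// Hypothetical base class and modules, standing in for Father and its mods.
abstract class Module {}
class Mod01 extends Module {}
class Mod02 extends Module {}

// Stand-in for the class names loaded from the database.
$names = array('Mod01', 'Mod02', 'NoSuchModule');

$modules = array();
foreach ($names as $name) {
    // Instantiate only names that resolve to a known Module subclass.
    if (class_exists($name) && is_subclass_of($name, 'Module')) {
        $modules[$name] = new $name();
    }
}
// $modules now holds Mod01 and Mod02; the unknown name is skipped.
```

Keying the array by class name keeps every instance reachable instead of overwriting a single variable.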
The standard way to recursively scan directories via SPL iterators is:
$files = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($path),
RecursiveIteratorIterator::CHILD_FIRST
);
foreach ($files as $file) {
print $file->getPathname() . PHP_EOL;
}
I want a composable set of filters to apply to my recursive file search. I'm using a RecursiveDirectoryIterator to scan a directory structure.
I want to apply more than one filter to my directory structure.
My set up code:
$filters = new FilterRuleset(
new RecursiveDirectoryIterator($path)
);
$filters->addFilter(new FilterLapsedDirs);
$filters->addFilter(new IncludeExtension('wav'));
$files = new RecursiveIteratorIterator(
$filters, RecursiveIteratorIterator::CHILD_FIRST
);
I thought I could apply N filters by using rule set:
class FilterRuleset extends RecursiveFilterIterator {
private $filters = array();
public function addFilter($filter) {
$this->filters[] = $filter;
}
public function accept() {
$file = $this->current();
foreach ($this->filters as $filter) {
if (!$filter->accept($file)) {
return false;
}
}
return true;
}
}
The filtering I set up is not working as intended. When I check the filters in FilterRuleset they are populated on the first call, then blank on subsequent calls. It's as if internally RecursiveIteratorIterator is re-instantiating my FilterRuleset.
public function accept() {
print_r($this->filters);
$file = $this->current();
foreach ($this->filters as $filter) {
if (!$filter->accept($file)) {
return false;
}
}
return true;
}
Output:
Array
(
[0] => FilterLapsedDirs Object
(
)
[1] => IncludeExtension Object
(
[ext:private] => wav
)
)
Array
(
)
Array
(
)
Array
(
)
Array
(
)
Array
(
)
Array
(
)
I'm using PHP 5.1.6 but have tested it on 5.4.14 and there's no difference. Any ideas?
When I check the filters in FilterRuleset they are populated on the first call, then blank on subsequent calls. It's as if internally RecursiveIteratorIterator is re-instantiating my FilterRuleset.
Yes, this is exactly the case. Each time you go into a subdirectory the array is empty because per recursive iterator rules, the recursive filter iterator needs to provide the child iterators.
So you've got two options here:
1. Apply filters on the flattened iteration, that is after tree-traversal. It looks feasible in your case as long as you only need to filter each individual file - not children.
2. The standard way: Take care that getChildren() returns a configured FilterRuleset recursive-filter-iterator object with the filters set.
I start with the second one because it's quickly done and the normal way to do this.
You override the getChildren() parent method by adding it to your class. Then you take the result of the parent (which is the new FilterRuleset for the children) and set the private member. This is possible in PHP (in case you wonder why this works even though it's a private member) because both objects are of the same class. You then just return it and you're done:
class FilterRuleset extends RecursiveFilterIterator
{
private $filters = array();
...
public function getChildren() {
$children = parent::getChildren();
$children->filters = $this->filters;
return $children;
}
}
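Here is a self-contained sketch of that getChildren() override, so you can watch the filters survive the descent. Several things are assumptions for the sake of a runnable example: a nested array replaces the directory tree, the filters are plain closures rather than objects with an accept() method, containers are always accepted so the traversal can descend, and the return type declarations assume PHP 7.1 or later.

```php
<?php
class FilterRuleset extends RecursiveFilterIterator
{
    private $filters = array();

    public function addFilter($filter)
    {
        $this->filters[] = $filter;
    }

    public function accept(): bool
    {
        // Let containers through so the traversal can descend into them.
        if ($this->hasChildren()) {
            return true;
        }
        foreach ($this->filters as $filter) {
            if (!call_user_func($filter, $this->current())) {
                return false;
            }
        }
        return true;
    }

    public function getChildren(): ?RecursiveFilterIterator
    {
        // The parent creates a fresh FilterRuleset for the child level...
        $children = parent::getChildren();
        // ...so the rules have to be copied over by hand.
        $children->filters = $this->filters;
        return $children;
    }
}

$data  = array(1, 2, array(3, 4, array(5, 6)));
$rules = new FilterRuleset(new RecursiveArrayIterator($data));
$rules->addFilter(function ($n) { return $n % 2 === 0; });

$kept = iterator_to_array(new RecursiveIteratorIterator($rules), false);
// $kept is array(2, 4, 6): the even-number rule applied at every level
```

If you comment out the assignment in getChildren(), the nested levels have no filters at all and every nested leaf comes through, which matches the empty arrays in your print_r() output.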
The other (first) variant is that you basically "degrade" it to the "flat" filter, that is a standard FilterIterator. Therefore you first do the recursive iteration with a RecursiveIteratorIterator and then you wrap that into your filter-iterator. As the tree has been already traversed by the earlier iterator, all this recursive stuff is not needed any longer.
So first of all turn it into a FilterIterator:
class FilterRuleset extends FilterIterator
{
...
}
The only change is which class you extend. And then you instantiate in a slightly different order:
$path = __DIR__;
$files = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::SKIP_DOTS),
RecursiveIteratorIterator::CHILD_FIRST
);
$filtered = new FilterRuleset($files);
$filtered->addFilter(Accept::byCallback(function () {
return true;
}));
foreach ($filtered as $file) {
echo $file->getPathname(), PHP_EOL;
}
I hope these examples are clear. If you play around with these and run into a problem (or even if not), feedback is always welcome.
Ah and before I forget it: Here is the mock I've used to create filters in my example above:
class Accept
{
private $callback;
public function __construct($callback) {
$this->callback = $callback;
}
public function accept($subject) {
return call_user_func($this->callback, $subject);
}
public static function byCallback($callback) {
return new self($callback);
}
}
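Putting the flat variant and the Accept mock together, a runnable sketch could look like the following. The class is named FlatFilterRuleset here only to keep it apart from the recursive version, and an array of made-up file names replaces the directory scan; the bool return type assumes PHP 7.0 or later.

```php
<?php
class Accept
{
    private $callback;

    public function __construct($callback)
    {
        $this->callback = $callback;
    }

    public function accept($subject)
    {
        return call_user_func($this->callback, $subject);
    }

    public static function byCallback($callback)
    {
        return new self($callback);
    }
}

// Flat rule set: no getChildren() needed, the tree is already flattened.
class FlatFilterRuleset extends FilterIterator
{
    private $filters = array();

    public function addFilter($filter)
    {
        $this->filters[] = $filter;
    }

    public function accept(): bool
    {
        foreach ($this->filters as $filter) {
            if (!$filter->accept($this->current())) {
                return false;
            }
        }
        return true;
    }
}

// Made-up file names standing in for a directory scan (assumption).
$flat = new RecursiveIteratorIterator(
    new RecursiveArrayIterator(array('a.wav', 'b.txt', array('c.wav', 'd.mp3')))
);

$filtered = new FlatFilterRuleset($flat);
$filtered->addFilter(Accept::byCallback(function ($name) {
    return substr($name, -4) === '.wav';
}));

$names = iterator_to_array($filtered, false);
// $names is array('a.wav', 'c.wav')
```

Because only one FlatFilterRuleset object ever exists, the filters array cannot be lost between levels.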
Hi, I am trying to get the RecursiveDirectoryIterator class to work with an extension of FilterIterator, but for some reason it is iterating over the root directory only.
my code is this.
class fileTypeFilter extends FilterIterator
{
public function __construct($path)
{
parent::__construct(new RecursiveDirectoryIterator($path));
}
public function accept()
{
$file = $this->getInnerIterator()->current();
return preg_match('/\.php/i', $file->getFilename());
}
}
$it = new RecursiveDirectoryIterator('./');
$it = new fileTypeFilter($it);
foreach ($it as $file)
{
echo $file;
}
my directory structure is something like this.
-Dir1
--file1.php
--file2.php
-Dir2
--file1.php
etc etc
But as I said before, the class is not recursively iterating over the entire directory structure and is only looking at the root.
The question is: how do I use a basic RecursiveDirectoryIterator to display folders and then run the FilterIterator to only show the PHP files in those directories?
Cheers
The FilterIterator should accept another iterator through its constructor.
In order for the automatic recursion to happen, you need to use a RecursiveIteratorIterator to iterate over the RecursiveIterator. You don't have to, but if you don't, then the burden of calling hasChildren(), getChildren(), etc. is on you.
Here's a sample. I didn't bother to accept any arguments in the constructor for FileTypeFilterIterator, although that would be a nice addition if you wanted to be able to alter the regex. But, otherwise, you don't need to define a constructor.
$it = new FileTypeFilterIterator (
new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($path),
RecursiveIteratorIterator::SELF_FIRST
)
);
class FileTypeFilterIterator extends FilterIterator
{
public function __construct(Iterator $iter)
{
parent::__construct($iter);
}
public function accept()
{
$file = $this->getInnerIterator()->current();
return preg_match('/\.php/i', $file->getFilename());
}
}
foreach($it as $name => $object){
echo "$name\n";
}
Btw, in this case, you might as well just use RegexIterator instead of extending FilterIterator.
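A rough sketch of the RegexIterator route follows. The path strings are invented so the example runs anywhere; as far as I know, when the inner iterator yields SplFileInfo objects the pattern is matched against their string form, so the same wrapping should work around a RecursiveDirectoryIterator. Note the pattern is anchored with $ here, which is a small deviation from the regex above so that names like file.php.bak are not matched.

```php
<?php
// Invented paths standing in for a recursive directory listing (assumption).
$paths = new RecursiveIteratorIterator(
    new RecursiveArrayIterator(array(
        'Dir1/file1.php',
        'Dir1/notes.txt',
        array('Dir2/file1.php', 'Dir2/logo.png'),
    ))
);

// Default mode keeps every element whose value matches the pattern.
$phpOnly = new RegexIterator($paths, '/\.php$/i');

$result = iterator_to_array($phpOnly, false);
// $result is array('Dir1/file1.php', 'Dir2/file1.php')
```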
PLEASE CHECK ANSWERS by VolkerK too, he provided another solution, but I can't mark two posts as answer. :(
Good day!
I know that C# allows multiple iterators using yield, like described here:
Is Multiple Iterators is possible in c#?
In PHP there is an Iterator interface. Is it possible to implement more than one iteration scenario for a class?
More details (EDIT):
For example, I have a class TreeNode implementing a single tree node. The whole tree can be expressed using only this one class. I want to provide iterators for iterating over all direct and indirect children of the current node, for example in breadth-first or depth-first order.
I can implement these iterators as separate classes, but to do so the tree node would have to expose its children collection as public.
C# pseudocode:
public class TreeNode<T>
{
...
public IEnumerable<T> DepthFirstEnumerator
{
get
{
// Some tree traversal using 'yield return'
}
}
public IEnumerable<T> BreadthFirstEnumerator
{
get
{
// Some tree traversal using 'yield return'
}
}
}
Yes, you can.
foreach(new IteratorOne($obj) as $foo) ....
foreach(new IteratorTwo($obj) as $bar) .....
Actually, as long as your class implements Iterator, you can apply any arbitrary IteratorIterator to it. This is a Good Thing, because applied meta iterators are not required to know anything about the class in question.
Consider, for example, an iterable class like this
class JustList implements Iterator
{
    private $items; // declared explicitly instead of being created on the fly
    function __construct() { $this->items = func_get_args(); }
    function rewind() { return reset($this->items); }
    function current() { return current($this->items); }
    function key() { return key($this->items); }
    function next() { return next($this->items); }
    function valid() { return key($this->items) !== null; }
}
Let's define some meta iterators
class OddIterator extends FilterIterator {
function accept() { return parent::current() % 2; }
}
class EvenIterator extends FilterIterator {
function accept() { return parent::current() % 2 == 0; }
}
Now apply meta iterators to the base class:
$list = new JustList(1, 2, 3, 4, 5, 6, 7, 8, 9);
foreach(new OddIterator($list) as $p) echo $p; // prints 13579
foreach(new EvenIterator($list) as $p) echo $p; // prints 2468
UPDATE: php has no inner classes, so you're kinda out of luck here, without resorting to eval, at least. Your iterators need to be separate classes, which are aware of the baseclass structure. You can make it less harmful by providing methods in the base class that instantiate iterators behind the scenes:
class TreeDepthFirstIterator implements Iterator
{
function __construct($someTree).....
}
class Tree
{
function depthFirst() { return new TreeDepthFirstIterator($this); }
....
}
foreach($myTree->depthFirst() as $node).....
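To make the factory-method idea concrete, here is one way it could be filled in. Everything in it (the TreeNode shape, getChildren(), the stack-based traversal) is an assumption for illustration, not the only way to write it. The void/bool return types assume PHP 7.1 or later, and the #[\ReturnTypeWillChange] lines are read as comments before PHP 8.

```php
<?php
class TreeNode
{
    private $name;
    private $children;

    public function __construct($name, array $children = array())
    {
        $this->name = $name;
        $this->children = $children;
    }

    public function getName() { return $this->name; }

    // The iterator is a separate class, so it needs this accessor.
    public function getChildren() { return $this->children; }

    // Factory method: callers never name the iterator class themselves.
    public function depthFirst() { return new TreeDepthFirstIterator($this); }
}

class TreeDepthFirstIterator implements Iterator
{
    private $root;
    private $stack = array(); // nodes still to visit, next one on top
    private $position = 0;

    public function __construct(TreeNode $root) { $this->root = $root; }

    public function rewind(): void
    {
        $this->stack = array($this->root);
        $this->position = 0;
    }

    public function valid(): bool { return !empty($this->stack); }

    #[\ReturnTypeWillChange]
    public function current() { return end($this->stack); }

    #[\ReturnTypeWillChange]
    public function key() { return $this->position; }

    public function next(): void
    {
        $node = array_pop($this->stack);
        // Push children in reverse so the first child is visited first.
        foreach (array_reverse($node->getChildren()) as $child) {
            $this->stack[] = $child;
        }
        $this->position++;
    }
}

$tree = new TreeNode('root', array(
    new TreeNode('a', array(new TreeNode('c'), new TreeNode('d'))),
    new TreeNode('b'),
));

$order = array();
foreach ($tree->depthFirst() as $node) {
    $order[] = $node->getName();
}
// $order is array('root', 'a', 'c', 'd', 'b')
```

A breadthFirst() method would follow the same pattern with a queue (array_shift) instead of a stack.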
Another option is to use lambdas instead of foreach. This is nicer and more flexible, requires php5.3 though:
class Tree
{
function depthFirst($func) {
while($node = .....)
$func($node);
.....
$myTree->depthFirst(function($node) {
echo $node->name;
});
For your purpose it might be sufficient to have a "mode" flag in your class, so the user can choose whether to have a breadth-first or a depth-first iterator.
class Tree {
const TREE_DEPTH_FIRST = 0;
const TREE_BREADTH_FIRST = 1;
protected $mode;
protected $current;
public function __construct($mode=Tree::TREE_DEPTH_FIRST) {
$this->mode = $mode;
}
public function setMode($mode) {
...
}
public function next() {
$this->current = advance($this->current, $this->mode);
}
....
}
(and the short answer to your initial question: no, PHP before 5.5 doesn't have the syntactic sugar of yield return, and it doesn't have inner private classes, i.e. whatever the iterator you're returning would need to do with the "original" object has to be exposed to the outside world. So you'd probably end up "preparing" all elements for an iterator object like ArrayIterator, the very thing you avoid by using yield)
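For readers on PHP 5.5 or later: generators were added in that version, so something close to the C# pseudocode is possible. The sketch below is an assumption about how the TreeNode class might look, not code from the question; it keeps the children private because the generator methods live inside the class itself.

```php
<?php
class TreeNode
{
    private $value;
    private $children;

    public function __construct($value, array $children = array())
    {
        $this->value = $value;
        $this->children = $children;
    }

    // Pre-order traversal: the node itself, then each subtree in turn.
    public function depthFirst()
    {
        yield $this->value;
        foreach ($this->children as $child) {
            foreach ($child->depthFirst() as $value) {
                yield $value;
            }
        }
    }

    // Level-order traversal driven by a simple queue.
    public function breadthFirst()
    {
        $queue = array($this);
        while ($queue) {
            $node = array_shift($queue);
            // Private members of another TreeNode are accessible inside the class.
            yield $node->value;
            foreach ($node->children as $child) {
                $queue[] = $child;
            }
        }
    }
}

$tree = new TreeNode('root', array(
    new TreeNode('a', array(new TreeNode('c'))),
    new TreeNode('b'),
));

$dfs = iterator_to_array($tree->depthFirst(), false);   // root, a, c, b
$bfs = iterator_to_array($tree->breadthFirst(), false); // root, a, b, c
```

The second argument to iterator_to_array() is false because each nested generator restarts its keys at 0.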
This code shows you how to add multiple iterators in a class.
class TreeNode {
public function getOddIterator () {
return new OddIterator($this->nodes);
}
public function getEvenIterator () {
return new EvenIterator($this->nodes);
}
}
You can have multiple Iterators. The key idea in the Iterator is to take the responsibility for access and traversal out of the list object and put it into an iterator object. So if you want to have multiple Iterators with the same list or different lists; there's no problem.
You can find four different PHP examples here:
http://www.php5dp.com/category/design-patterns/iterator/
You can also use them with Linked Lists.
Cheers,
Bill
I'm trying to iterate over a directory which contains loads of PHP files, and detect what classes are defined in each file.
Consider the following:
$php_files_and_content = new PhpFileAndContentIterator($dir);
foreach($php_files_and_content as $filepath => $sourceCode) {
// echo $filepath, $sourceCode
}
The above $php_files_and_content variable represents an iterator where the key is the filepath, and the content is the source code of the file (as if that wasn't obvious from the example).
This is then supplied into another iterator which will match all the defined classes in the source code, ala:
class DefinedClassDetector extends FilterIterator implements RecursiveIterator {
public function accept() {
return $this->hasChildren();
}
public function hasChildren() {
$classes = getDefinedClasses($this->current());
return !empty($classes);
}
public function getChildren() {
return new RecursiveArrayIterator(getDefinedClasses($this->current()));
}
}
$defined_classes = new RecursiveIteratorIterator(new DefinedClassDetector($php_files_and_content));
foreach($defined_classes as $index => $class) {
// print "$index => $class"; outputs:
// 0 => Class A
// 1 => Class B
// 0 => Class C
}
The reason the $index isn't sequential numerically is because 'Class C' was defined in the second source code file, and thus the array returned starts from index 0 again. This is preserved in the RecursiveIteratorIterator because each set of results represents a separate Iterator (and thus key/value pairs).
Anyway, what I am trying to do now is find the best way to combine these, such that when I iterate over the new iterator, the key is the class name (from the $defined_classes iterator) and the value is the original file path, à la:
foreach($classes_and_paths as $class => $filepath) {
// print "$class => $filepath"; outputs
// Class A => file1.php
// Class B => file1.php
// Class C => file2.php
}
And that's where I'm stuck thus far.
At the moment, the only solution that is coming to mind is to create a new RecursiveIterator that overrides the current() method to return the outer iterator's key() (which would be the original filepath), and the key() method to return the inner iterator's current() value. But I'm not favouring this solution because:
It sounds complex (which means the code will look hideous and it won't be intuitive)
The business rules are hard-coded inside the class, whereas I would like to define some generic Iterators and be able to combine them in such a way to produce the required result.
Any ideas or suggestions gratefully received.
I also realise there are far faster, more efficient ways of doing this, but this is also an exercise in using Iterators for myself, and also an exercise in promoting code reuse, so any new Iterators that have to be written should be as minimal as possible and try to leverage existing functionality.
Thanks
OK, I think I finally got my head around this. Here's roughly what I did in pseudo-code:
Step 1
We need to list the directory contents, thus we can perform the following:
// Reads through the $dir directory
// traversing children, and returns all contents
$dirIterator = new RecursiveDirectoryIterator($dir);
// Flattens the recursive iterator into a single
// dimension, so it doesn't need recursive loops
$dirContents = new RecursiveIteratorIterator($dirIterator);
Step 2
We need to consider only the PHP files
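class PhpFileIteratorFilter extends FilterIterator {
    public function accept() {
        $current = $this->current();
        // pathinfo() avoids passing explode()'s result by reference to end()
        return $current instanceof SplFileInfo
            && $current->isFile()
            && pathinfo($current->getBasename(), PATHINFO_EXTENSION) == 'php';
    }
}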
// Extends FilterIterator, and accepts only .php files
$php_files = new PhpFileIteratorFilter($dirContents);
The PhpFileIteratorFilter isn't a great use of re-usable code. A better method would have been to be able to supply a file extension as part of the construction and get the filter to match on that. That said, I am trying to move away from constructor arguments where they are not required and rely more on composition, because that makes better use of the Strategy pattern. The PhpFileIteratorFilter could simply have used a generic FileExtensionIteratorFilter and set itself up internally.
Step 3
We must now read in the file contents
class SplFileInfoReader extends FilterIterator {
public function accept() {
// make sure we use parent, this one returns the contents
$current = parent::current();
return $current instanceof SplFileInfo
&& $current->isFile()
&& $current->isReadable();
}
public function key() {
return parent::current()->getRealpath();
}
public function current() {
return file_get_contents($this->key());
}
}
// Reads the file contents of the .php files
// the key is the file path, the value is the file contents
$files_and_content = new SplFileInfoReader($php_files);
Step 4
Now we want to apply our callback to each item (the file contents) and somehow retain the results. Again, trying to make use of the Strategy pattern, I've done away with unnecessary constructor arguments, e.g. $preserveKeys or similar.
/**
* Applies $callback to each element, and only accepts values that have children
*/
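class ArrayCallbackFilterIterator extends FilterIterator implements RecursiveIterator {
protected $callback; // callable applied to each element
protected $results; // last callback result, reused by getChildren()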
public function __construct(Iterator $it, $callback) {
if (!is_callable($callback)) {
throw new InvalidArgumentException('$callback is not callable');
}
$this->callback = $callback;
parent::__construct($it);
}
public function accept() {
return $this->hasChildren();
}
public function hasChildren() {
$this->results = call_user_func($this->callback, $this->current());
return is_array($this->results) && !empty($this->results);
}
public function getChildren() {
return new RecursiveArrayIterator($this->results);
}
}
/**
* Overrides ArrayCallbackFilterIterator to allow a fixed $key to be returned
*/
class FixedKeyArrayCallbackFilterIterator extends ArrayCallbackFilterIterator {
public function getChildren() {
return new RecursiveFixedKeyArrayIterator($this->key(), $this->results);
}
}
/**
* Extends RecursiveArrayIterator to allow a fixed $key to be set
*/
class RecursiveFixedKeyArrayIterator extends RecursiveArrayIterator {
public function __construct($key, $array) {
$this->key = $key;
parent::__construct($array);
}
public function key() {
return $this->key;
}
}
So, here I have my basic iterator which will return the results of the $callback I supplied through, but I've also extended it to create a version that will preserve the keys too, rather than using a constructor argument for it.
And thus we have this:
// Returns a RecursiveIterator
// key: file path
// value: class name
$class_filter = new FixedKeyArrayCallbackFilterIterator($files_and_content, 'getDefinedClasses');
Step 5
Now we need to format it into a suitable manner. I desire the file paths to be the value, and the keys to be the class name (i.e. to provide a direct mapping for a class to the file in which it can be found for the auto loader)
// Reduce the multi-dimensional iterator into a single dimension
$files_and_classes = new RecursiveIteratorIterator($class_filter);
// Flip it around, so the class names are keys
$classes_and_files = new FlipIterator($files_and_classes);
And voila, I can now iterate over $classes_and_files and get a list of all defined classes under $dir, along with the file they're defined in. And pretty much all of the code used to do this is re-usable in other contexts as well. I haven't hard-coded anything in the defined Iterators to achieve this task, nor have I done any extra processing outside the iterators.
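FlipIterator is not shown above, so here is a guess at what such a class might look like: an IteratorIterator that swaps key and value. The generator feeding it is an invented stand-in for $files_and_classes (it needs PHP 5.5 or later and lets the same file path appear as a key more than once); the #[\ReturnTypeWillChange] lines are read as comments before PHP 8.

```php
<?php
// Hypothetical FlipIterator: swaps key() and current() of any Traversable.
class FlipIterator extends IteratorIterator
{
    #[\ReturnTypeWillChange]
    public function key()
    {
        return parent::current();
    }

    #[\ReturnTypeWillChange]
    public function current()
    {
        return parent::key();
    }
}

// Invented stand-in for $files_and_classes: file path => class name.
function filesAndClasses()
{
    yield 'file1.php' => 'ClassA';
    yield 'file1.php' => 'ClassB';
    yield 'file2.php' => 'ClassC';
}

$classesAndFiles = iterator_to_array(new FlipIterator(filesAndClasses()));
// array('ClassA' => 'file1.php', 'ClassB' => 'file1.php', 'ClassC' => 'file2.php')
```

Since class names are unique per project (at least without namespaces), collecting the flipped pairs into an array loses nothing, whereas the unflipped pairs would overwrite each other.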
I think what you want to do, is more or less to reverse the keys and values returned from PhpFileAndContent. Said class returns a list of filepath => source, and you want to first reverse the mapping so it is source => filepath and then expand source for each class defined in source, so it will be class1 => filepath, class2 => filepath.
It should be easy as in your getChildren() you can simply access $this->key() to get the current file path for the source you are running getDefinedClasses() on. You can write getDefinedClasses as getDefinedClasses($path, $source) and instead of returning an indexed array of all the classes, it will return a dictionary where each value from the current indexed array is a key in the dictionary and the value is the filepath where that class was defined.
Then it will come out just as you want.
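As a rough idea of what the two-argument getDefinedClasses() could look like, here is a sketch built on token_get_all() from the tokenizer extension. The original getDefinedClasses() was never shown, so this body is entirely an assumption; it also ignores namespaces, interfaces and traits.

```php
<?php
// Hypothetical getDefinedClasses($path, $source): returns class name => file path.
function getDefinedClasses($path, $source)
{
    $map = array();
    $tokens = token_get_all($source);
    $count = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_CLASS) {
            continue;
        }
        // Skip the whitespace between the "class" keyword and the name.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        // Anonymous classes and Foo::class have no name token here and are skipped.
        if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
            $map[$tokens[$j][1]] = $path;
        }
    }
    return $map;
}

$source = '<?php class ClassA {} final class ClassB extends ClassA {} $name = ClassA::class;';
$map = getDefinedClasses('file1.php', $source);
// array('ClassA' => 'file1.php', 'ClassB' => 'file1.php')
```

Returned from getChildren() inside a RecursiveArrayIterator, such a dictionary would give class => filepath pairs directly.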
The other option is to drop your use of RecursiveArrayIterator and instead write your own iterator that is initialized (in getChildren) as
return new FilePathMapperIterator($this->key(), getDefinedClasses($this->current()));
and then FilePathMapperIterator will convert the class array from getDefinedClasses to the class => filepath mapping I described by simply iterating over the array and returning the current class in key() and always returning the specified filepath in current().
I think the latter is cooler, but it's definitely more code, so it's unlikely that I would have gone that way if I could adapt getDefinedClasses() for my needs.
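For completeness, a minimal sketch of the FilePathMapperIterator idea. It extends RecursiveArrayIterator so it could be returned from getChildren(); that base class, the property name and the sample data are my assumptions. The #[\ReturnTypeWillChange] lines are read as comments before PHP 8.

```php
<?php
// Hypothetical iterator: yields class name => file path for one source file.
class FilePathMapperIterator extends RecursiveArrayIterator
{
    private $path;

    public function __construct($path, array $classes)
    {
        parent::__construct($classes);
        $this->path = $path;
    }

    // The class name (the array value) becomes the key...
    #[\ReturnTypeWillChange]
    public function key()
    {
        return parent::current();
    }

    // ...and the fixed file path becomes the value.
    #[\ReturnTypeWillChange]
    public function current()
    {
        return $this->path;
    }
}

$map = iterator_to_array(
    new FilePathMapperIterator('file1.php', array('ClassA', 'ClassB'))
);
// array('ClassA' => 'file1.php', 'ClassB' => 'file1.php')
```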