I'm trying to iterate over a directory which contains loads of PHP files, and detect what classes are defined in each file.
Consider the following:
$php_files_and_content = new PhpFileAndContentIterator($dir);
foreach($php_files_and_content as $filepath => $sourceCode) {
// echo $filepath, $sourceCode
}
The above $php_files_and_content variable represents an iterator where the key is the filepath, and the content is the source code of the file (as if that wasn't obvious from the example).
This is then supplied into another iterator which will match all the defined classes in the source code, ala:
class DefinedClassDetector extends FilterIterator implements RecursiveIterator {
public function accept() {
return $this->hasChildren();
}
public function hasChildren() {
$classes = getDefinedClasses($this->current());
return !empty($classes);
}
public function getChildren() {
return new RecursiveArrayIterator(getDefinedClasses($this->current()));
}
}
$defined_classes = new RecursiveIteratorIterator(new DefinedClassDetector($php_files_and_content));
foreach($defined_classes as $index => $class) {
// print "$index => $class"; outputs:
// 0 => Class A
// 1 => Class B
// 0 => Class C
}
The reason the $index isn't sequential numerically is because 'Class C' was defined in the second source code file, and thus the array returned starts from index 0 again. This is preserved in the RecursiveIteratorIterator because each set of results represents a separate Iterator (and thus key/value pairs).
Anyway, what I am trying to do now is find the best way to combine these, so that when I iterate over the new iterator, the key is the class name (from the $defined_classes iterator) and the value is the original file path, ala:
foreach($classes_and_paths as $class => $filepath) {
// print "$class => $filepath"; outputs
// Class A => file1.php
// Class B => file1.php
// Class C => file2.php
}
And that's where I'm stuck thus far.
At the moment, the only solution that comes to mind is to create a new RecursiveIterator that overrides the current() method to return the outer iterator's key() (which would be the original filepath), and the key() method to return the current iterator's value. But I'm not favouring this solution because:
It sounds complex (which means the code will look hideous and it won't be intuitive).
The business rules are hard-coded inside the class, whereas I would like to define some generic Iterators and be able to combine them in such a way as to produce the required result.
Any ideas or suggestions gratefully received.
I also realise there are far faster, more efficient ways of doing this, but this is also an exercise in using Iterators for myself, and also an exercise in promoting code reuse, so any new Iterators that have to be written should be as minimal as possible and try to leverage existing functionality.
Thanks
OK, I think I finally got my head around this. Here's roughly what I did in pseudo-code:
Step 1
We need to list the directory contents, thus we can perform the following:
// Reads through the $dir directory
// traversing children, and returns all contents
$dirIterator = new RecursiveDirectoryIterator($dir);
// Flattens the recursive iterator into a single
// dimension, so it doesn't need recursive loops
$dirContents = new RecursiveIteratorIterator($dirIterator);
Step 2
We need to consider only the PHP files
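class PhpFileIteratorFilter extends FilterIterator {
public function accept() {
$current = $this->current();
// pathinfo() avoids passing explode()'s temporary result to end() by reference
return $current instanceof SplFileInfo
&& $current->isFile()
&& pathinfo($current->getBasename(), PATHINFO_EXTENSION) === 'php';
}
}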
// Extends FilterIterator, and accepts only .php files
$php_files = new PhpFileIteratorFilter($dirContents);
The PhpFileIteratorFilter isn't a great example of re-usable code. A better approach would have been to supply a file extension as part of the construction and get the filter to match on that. That said, I am trying to move away from constructor arguments where they are not required and rely more on composition, because that makes better use of the Strategy pattern. The PhpFileIteratorFilter could simply have used the generic FileExtensionIteratorFilter and set itself up internally.
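The generic filter mentioned above is not shown in the post, so here is a minimal sketch of what it might look like. The class name FileExtensionIteratorFilter comes from the paragraph above; the constructor signature and the case-insensitive match are my assumptions, and the return type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical generic filter: accepts only files with the given extension.
class FileExtensionIteratorFilter extends FilterIterator
{
    private $extension;

    public function __construct(Iterator $iterator, $extension)
    {
        parent::__construct($iterator);
        // Normalise so that 'php', '.php' and 'PHP' all behave the same.
        $this->extension = strtolower(ltrim($extension, '.'));
    }

    public function accept(): bool
    {
        $current = $this->current();
        return $current instanceof SplFileInfo
            && $current->isFile()
            && strtolower($current->getExtension()) === $this->extension;
    }
}

// The PHP-specific filter then only fixes the extension internally.
class PhpFileIteratorFilter extends FileExtensionIteratorFilter
{
    public function __construct(Iterator $iterator)
    {
        parent::__construct($iterator, 'php');
    }
}
```

With this in place, $php_files = new PhpFileIteratorFilter($dirContents); keeps working as in Step 2, and other extensions can reuse the generic class.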
Step 3
We must now read in the file contents
class SplFileInfoReader extends FilterIterator {
public function accept() {
// make sure we use parent, this one returns the contents
$current = parent::current();
return $current instanceof SplFileInfo
&& $current->isFile()
&& $current->isReadable();
}
public function key() {
return parent::current()->getRealpath();
}
public function current() {
return file_get_contents($this->key());
}
}
// Reads the file contents of the .php files
// the key is the file path, the value is the file contents
$files_and_content = new SplFileInfoReader($php_files);
Step 4
Now we want to apply our callback to each item (the file contents) and somehow retain the results. Again, trying to make use of the Strategy pattern, I've done away with unnecessary constructor arguments, e.g. $preserveKeys or similar.
/**
* Applies $callback to each element, and only accepts values that have children
*/
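class ArrayCallbackFilterIterator extends FilterIterator implements RecursiveIterator {
protected $callback; // callable applied to each element
protected $results; // last result returned by the callback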
public function __construct(Iterator $it, $callback) {
if (!is_callable($callback)) {
throw new InvalidArgumentException('$callback is not callable');
}
$this->callback = $callback;
parent::__construct($it);
}
public function accept() {
return $this->hasChildren();
}
public function hasChildren() {
$this->results = call_user_func($this->callback, $this->current());
return is_array($this->results) && !empty($this->results);
}
public function getChildren() {
return new RecursiveArrayIterator($this->results);
}
}
/**
* Overrides ArrayCallbackFilterIterator to allow a fixed $key to be returned
*/
class FixedKeyArrayCallbackFilterIterator extends ArrayCallbackFilterIterator {
public function getChildren() {
return new RecursiveFixedKeyArrayIterator($this->key(), $this->results);
}
}
/**
* Extends RecursiveArrayIterator to allow a fixed $key to be set
*/
class RecursiveFixedKeyArrayIterator extends RecursiveArrayIterator {
protected $key; // fixed key returned for every element
public function __construct($key, $array) {
$this->key = $key;
parent::__construct($array);
}
public function key() {
return $this->key;
}
}
So, here I have my basic iterator which will return the results of the $callback I supplied through, but I've also extended it to create a version that will preserve the keys too, rather than using a constructor argument for it.
And thus we have this:
// Returns a RecursiveIterator
// key: file path
// value: class name
$class_filter = new FixedKeyArrayCallbackFilterIterator($files_and_content, 'getDefinedClasses');
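getDefinedClasses() itself is never shown in the post. A rough sketch of such a helper, assuming it takes PHP source code as a string and returns a plain list of class names, could use the tokenizer extension. This is only an illustration: it ignores namespaces, interfaces and traits.

```php
<?php
// Hypothetical helper: returns the names of classes declared in $source.
function getDefinedClasses($source)
{
    $classes = array();
    $tokens = token_get_all($source);
    $count = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_CLASS) {
            continue;
        }
        // The class name is the first non-whitespace token after "class".
        for ($j = $i + 1; $j < $count; $j++) {
            if (is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                continue;
            }
            if (is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $classes[] = $tokens[$j][1];
            }
            break; // anything else (e.g. "{" of an anonymous class) is skipped
        }
    }

    return $classes;
}
```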
Step 5
Now we need to format it in a suitable manner. I want the file paths to be the values and the class names to be the keys (i.e. to provide a direct mapping from a class to the file in which it can be found, for the autoloader).
// Reduce the multi-dimensional iterator into a single dimension
$files_and_classes = new RecursiveIteratorIterator($class_filter);
// Flip it around, so the class names are keys
$classes_and_files = new FlipIterator($files_and_classes);
And voilà, I can now iterate over $classes_and_files and get a list of all defined classes under $dir, along with the file they're defined in. And pretty much all of the code used to do this is re-usable in other contexts as well. I haven't hard-coded anything in the defined Iterators to achieve this task, nor have I done any extra processing outside the iterators.
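FlipIterator is used in Step 5 but not defined in the post. A minimal sketch, assuming it simply swaps key and value of whatever iterator it wraps, might look like this (the #[\ReturnTypeWillChange] lines are attributes on PHP 8 and plain comments on older versions):

```php
<?php
// Hypothetical FlipIterator: yields value => key instead of key => value.
class FlipIterator extends IteratorIterator
{
    #[\ReturnTypeWillChange]
    public function key()
    {
        return parent::current();
    }

    #[\ReturnTypeWillChange]
    public function current()
    {
        return parent::key();
    }
}

$pairs = new ArrayIterator(array('file1.php' => 'ClassA', 'file2.php' => 'ClassC'));
foreach (new FlipIterator($pairs) as $class => $file) {
    echo "$class => $file\n";
}
// ClassA => file1.php
// ClassC => file2.php
```

Because it knows nothing about files or classes, the same wrapper can be reused for any key/value iterator.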
I think what you want to do is more or less to reverse the keys and values returned from PhpFileAndContent. That class returns a list of filepath => source, and you want to first reverse the mapping so it is source => filepath and then expand source for each class defined in it, so it becomes class1 => filepath, class2 => filepath.
It should be easy as in your getChildren() you can simply access $this->key() to get the current file path for the source you are running getDefinedClasses() on. You can write getDefinedClasses as getDefinedClasses($path, $source) and instead of returning an indexed array of all the classes, it will return a dictionary where each value from the current indexed array is a key in the dictionary and the value is the filepath where that class was defined.
Then it will come out just as you want.
The other option is to drop your use of RecursiveArrayIterator and instead write your own iterator that is initialized (in getChildren) as
return new FilePathMapperIterator($this->key(), getDefinedClasses($this->current()));
and then FilePathMapperIterator will convert the class array from getDefinedClasses to the class => filepath mapping I described by simply iterating over the array and returning the current class in key() and always returning the specified filepath in current().
I think the latter is cooler, but it's definitely more code, so it's unlikely that I would go that way if I could adapt getDefinedClasses() for my needs.
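For illustration, FilePathMapperIterator could be sketched roughly as follows. The constructor arguments match the call shown above; implementing RecursiveIterator (so it can be returned from getChildren() under a RecursiveIteratorIterator) and the PHP 7.1+ return types are my assumptions.

```php
<?php
// Hypothetical sketch: maps each class name to one fixed file path.
class FilePathMapperIterator implements RecursiveIterator
{
    private $filepath;
    private $classes;
    private $position = 0;

    public function __construct($filepath, array $classes)
    {
        $this->filepath = $filepath;
        $this->classes = array_values($classes);
    }

    public function rewind(): void { $this->position = 0; }
    public function next(): void { $this->position++; }
    public function valid(): bool { return isset($this->classes[$this->position]); }

    // The key is the class name ...
    #[\ReturnTypeWillChange]
    public function key() { return $this->classes[$this->position]; }

    // ... and the value is always the file the classes came from.
    #[\ReturnTypeWillChange]
    public function current() { return $this->filepath; }

    // Class names are leaves, so there is never anything to descend into.
    public function hasChildren(): bool { return false; }
    public function getChildren(): ?RecursiveIterator { return null; }
}
```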
I'm currently working on a project where I have to work with huge arrays. By huge, I mean 1k elements or more. Since there are a lot of these arrays and I sometimes mess things up, I decided to create a class with static functions that I can call, which would make the entire project easier to read. This is what I currently have:
ArrayAccess.class.php:
require "dep/arrays/elements.php";
class ArrayAccess {
public static function get_value_from_element($element) {
return $elements[$element];
}
}
elements.php:
<?php
$elements = array(
"sam" => 6, ... and so on ...
I simply want to be able to use ArrayAccess::get_value_from_element($element) in my project. It is so much easier to read than all these indexes everywhere. However, the array is defined in the elements.php file - I can't use that in the class.
So how can I access the array in my class? Please note, I cannot copy it into the class, the file would be larger than 400k lines, this is not an option.
You can return a value from an include (or require in this case) and store that to a static property the first time the function is called.
elements.php:
<?php
return array("sam" => 6, ...);
DataAccess.php:
class DataAccess {
private static $elements = array();
public static function get_value_from_element($element) {
if(self::$elements === array()) {
self::$elements = require "elements.php";
}
return self::$elements[$element];
}
}
You should also avoid naming your class ArrayAccess, since it already exists in PHP.
In elements.php
<?php
return array( // use return so you can retrieve these into a variable
"sam" => 6, ... and so on ...
Then in the class
<?php
class ArrayAccess {
public static $elements = null; // set up a static var to avoid loading this big array multiple times
public static function get_value_from_element($element) {
if(self::$elements === null) { // check if null to load it from the file
self::$elements = require('elements.php');
}
return self::$elements[$element]; // there you go
}
}
If you don't want to run the if statement in the getter every time, you should probably find somewhere else to load the file into the static variable before using the getter.
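One way to read that suggestion is an explicit load step called once at startup. The sketch below is only an illustration: the class name ElementStore and the load() method are made up, and it assumes elements.php returns the array as shown earlier.

```php
<?php
// Hypothetical variant: the array is loaded once, up front, not in the getter.
class ElementStore
{
    private static $elements = array();

    // Call once during bootstrap, before any lookups.
    public static function load($file)
    {
        self::$elements = require $file;
    }

    public static function get_value_from_element($element)
    {
        return self::$elements[$element];
    }
}

// Usage, assuming elements.php sits next to this file:
// ElementStore::load(__DIR__ . '/elements.php');
// echo ElementStore::get_value_from_element('sam');
```

The trade-off is that every entry point has to remember to call load() first, which the lazy check in the getter avoids.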
An alternative is to declare $elements as global in your class:
require "dep/arrays/elements.php";
class ArrayAccess {
public static function get_value_from_element($element) {
global $elements;
return $elements[$element];
}
}
I wanted to play around with some of PHP's iterators and managed to get a solid (from my understanding) build going. My goal was to iterate inside of a parent folder and get 2 nodes down, building a hierarchical tree array during the process. Obviously, I could do this fairly easily using glob and a couple of nested loops, but I want to use SPL classes to accomplish this.
All that out of the way, I've played around with SplHeap and SplObjectStorage to build the hierarchy and failed. What's messing with my noodle is that my normal methods of recursion are failing (out-of-memory errors) and my one success is a recursive method that loops over each node, adding to an array. The problem with that is it ignores the setMaxDepth() method and goes through all of the children. I thought about incrementing a $var++ counter through the loop to limit the nodes, but I don't believe that's the "right way".
Anywho, code (sorry for any orphaned code if any - just ignore it)...
<?php
namespace Tree;
use RecursiveFilterIterator,
RecursiveDirectoryIterator,
RecursiveIteratorIterator;
class Filter extends RecursiveFilterIterator {
public static $FILTERS = array(
'.git', '.gitattributes', '.gitignore', 'index.php'
);
public function accept() {
if (!$this->isDot() && !in_array($this->current()->getFilename(), self::$FILTERS))
return TRUE;
return FALSE;
}
}
class DirTree {
const MAX_DEPTH = 2;
private static $iterator;
private static $objectStore;
public function __construct() {
error_reporting(8191);
$path = realpath('./');
try {
$dirItr = new RecursiveDirectoryIterator($path);
$filterItr = new Filter($dirItr);
$objects = new RecursiveIteratorIterator($filterItr, RecursiveIteratorIterator::SELF_FIRST);
$objects->setMaxDepth(self::MAX_DEPTH);
echo '<pre>';
print_r($this->build_hierarchy($objects));
} catch(Exception $e) {
die($e->getMessage());
}
}
public function build_hierarchy($iterator){
$array = array();
foreach ($iterator as $fileinfo) {
if ($fileinfo->isDir()) {
// Directories and files have labels
$current = array(
'label' => $fileinfo->getFilename()
);
// Only directories have children
if ($fileinfo->isDir()) {
$current['children'] = $this->build_hierarchy($iterator->getChildren());
}
// Append the current item to this level
$array[] = $current;
}
}
return $array;
}
}
$d = new DirTree;
A RecursiveIteratorIterator is designed primarily to give you an iterator that behaves like an iterator over a flat list, but the flat list is actually just a sequence in the recursive traversal. It does this by internally managing a stack of RecursiveIterators, calling getChildren() on them as necessary. A client of RecursiveIteratorIterator is only really supposed to call the normal Iterator methods like current() and next(), with the exception of value-added methods like setMaxDepth().
Your problem is that you're trying to do the recursion yourself by calling getChildren(). If you want to manually manage the recursion, that's fine, but that makes RecursiveIteratorIterator redundant. In fact, I'm really surprised calling getChildren() on RecursiveIteratorIterator didn't cause a fatal error. That's a RecursiveIterator method. SPL is probably just forwarding the method call to the inner iterator (some SPL classes forward calls to undefined methods in order to make it easy to use the Decorator design pattern).
The right way:
$dirItr = new RecursiveDirectoryIterator($path);
$filterItr = new Filter($dirItr);
$objects = new RecursiveIteratorIterator($filterItr, RecursiveIteratorIterator::SELF_FIRST);
$objects->setMaxDepth(self::MAX_DEPTH);
echo '<pre>';
foreach ($objects as $splFileInfo) {
echo $splFileInfo;
echo "\n";
}
I'm not going to get into forming the hierarchical array in some particular structure for you, but maybe this related question will further help you understand the difference between RecursiveIteratorIterator and RecursiveIterator:
How does RecursiveIteratorIterator work in PHP?
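For anyone who does want the nested array, one possible approach (a sketch, not the only way) is to keep the flat foreach and use getDepth() to decide where each entry belongs. The function name buildHierarchy is made up; it assumes the iterator runs in SELF_FIRST mode and that dot entries are skipped or filtered out.

```php
<?php
// Builds a nested array from a RecursiveIteratorIterator in SELF_FIRST mode.
function buildHierarchy(RecursiveIteratorIterator $objects)
{
    $root = array();
    // $levels[$depth] refers to the list that receives nodes found at $depth.
    $levels = array(0 => &$root);

    foreach ($objects as $fileinfo) {
        $depth = $objects->getDepth();
        $node = array('label' => $fileinfo->getFilename());
        if ($fileinfo->isDir()) {
            $node['children'] = array();
        }
        $levels[$depth][] = $node;

        if ($fileinfo->isDir()) {
            // Entries one level deeper belong to the directory just appended.
            $last = count($levels[$depth]) - 1;
            $levels[$depth + 1] = &$levels[$depth][$last]['children'];
        }
    }

    return $root;
}

// Usage sketch:
// $objects = new RecursiveIteratorIterator(
//     new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS),
//     RecursiveIteratorIterator::SELF_FIRST
// );
// $objects->setMaxDepth(2);
// print_r(buildHierarchy($objects));
```

Because the recursion stays inside RecursiveIteratorIterator, setMaxDepth() is respected and no manual getChildren() calls are needed.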
Consider an object used to store a collection of items, but that collection may vary depending on predefined contexts.
class Container implements IteratorAggregate {
protected $contexts; // list of associated contexts, example: array(0=>1,1=>3)
protected $contents; // array
public function loadContents( $contextId ) { /* populates $this->contents*/ }
public function getContexts() { /* populates $this->contexts */ }
...
public function getIterator() { return new ArrayIterator($this->contents); }
public function getContextIterator() { return new contextIterator($this); }
}
The iterator looks like:
class contextIterator implements Iterator {
protected $container;
protected $contexts;
protected $currentContext;
public function __construct($container) {
$this->container = $container;
$this->contexts = $container->getContexts();
$this->currentContext = 0;
}
public function current() {
$this->container->loadContents( $this->key() );
return $this->contexts[ $this->key() ];
}
public function key() { return $this->currentContext; }
public function next() { $this->currentContext++; }
public function rewind() { $this->currentContext = 0; }
public function valid() { return isset( $this->contexts[ $this->key() ] ); }
}
For the few cases where each context needs to be examined iteratively, I do the following:
$myContainer = new Container();
foreach( $myContainer->getContextIterator() as $key => $value ) {
$myContainer->someMethod();
}
The above is nice and compact, but it feels dirty to me since I'm never actually using $key or $value. Is using the iterator overkill? Further, should an iterator ever change the state/contents of the object it is iterating?
The above is nice and compact, but it feels dirty to me since I'm never actually using $key or $value.
You have not shown the inners of getContextIterator() so it's hard to make concrete suggestions. Generally it's possible to create iterable objects in PHP by implementing the OuterIterator interface or by just implementing the Iterator interface. Both interfaces are predefined, and you can then use your object with next(), foreach etc.
I assume you've implemented something like OuterIterator. If you implement OuterIterator instead, you will get some speed benefit AFAIK.
Is using the iterator overkill?
No, I wouldn't say so. Iterators are very good for collections, and you said you have one. I would just change it into an SPL iterator though.
Further, should an iterator ever change the state/contents of the object it is iterating?
Well, actually every iterator does, at least for the internal pointer of the iteration. I think that was not your concern, but it might already put your mind at ease.
So even for "more" changes inside the object you're iterating over, it's perfectly okay that it changes as long as it's clear what it does. Counter-example: an iterator over an array that shuffled the elements each time the iteration advanced one step would not be useful.
But there are other cases where this is totally valid and useful. So decide on what's done, not with a general rule.
Well,
I have a problem (OK, no real problem, but I want to try out something new) with creating objects. I have some orders, which contain a list of order items.
These order items are used and thus spread throughout the whole application, and I need a way to create them. The main problem is that I want to be able to create these objects in many different ways.
Currently I do this in the class constructor and check which kind of argument was given.
(I'm using PHP, so there is no overloading support from the language, as you surely know :))
A simple and quick Example
class foo {
protected $_data=null;
public function __construct($bar){
if (is_array($bar)){
$this->_data=$bar;
}
else {
$dataFromDb=getDataFromDatabase($bar); // assumed call: the original omitted the arguments
$this->_data=$dataFromDb;
}
}
}
Anyway, if I want to create my object from another type of parameter, let's say an XML document encapsulated in a string, I need to put all this stuff in my constructor.
If the process for creating an object is more complicated, I eventually need to create a separate method for each type I want to initialise. But this method is only called when this special type is created. (I think you get the problem :))
Another problem comes to mind: if I need more parameters in the constructor to create a concrete object, I have to modify all my code, because the constructor changed. (OK, I can give it more and more parameters and work with default values, but that is not what I really want.)
So my question is, which pattern fits this problem of creating a concrete object? I thought about creating a factory for each way I want to create the concrete object. But I'm not sure if this is a common solution to such a problem.
If it's only the signature of the constructor changing, I would do it like so (a la the Zend Framework universal constructor):
class foo {
// params
public function __construct($options = null)
{
if(null !== $options)
{
$this->setOptions($options);
}
}
public function setOptions(array $options){
foreach ($options as $name => $value){
$method = 'set' . $name;
if(method_exists($this, $method))
{
$this->$method($value);
}
}
return $this;
}
}
And this essentially means all your constructor parameters are array elements with named keys, and for anything in this array that you want used during initialization you create a setter, which is then called automatically. The downside is the lack of effective hinting in IDEs.
On the other hand, if you want to have specific constructors then I might go with a factory but still use much the same approach:
class foo {
public static function create($class, $options)
{
if(class_exists($class))
{
return new $class($options);
}
return null; // unknown class: nothing to create
}
}
Of course you could alternatively use PHP's reflection to determine how to call the constructor instead of just injecting an arbitrary array argument.
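As a rough illustration of the reflection idea, the sketch below matches the keys of an options array against the constructor's parameter names. The function name createWithReflection and the ExampleOrderItem class are made up for the example.

```php
<?php
// Hypothetical factory: maps named options onto constructor parameters.
function createWithReflection($class, array $options)
{
    $reflection = new ReflectionClass($class);
    $constructor = $reflection->getConstructor();
    if ($constructor === null) {
        return $reflection->newInstance();
    }

    $args = array();
    foreach ($constructor->getParameters() as $param) {
        $name = $param->getName();
        if (array_key_exists($name, $options)) {
            $args[] = $options[$name];
        } elseif ($param->isDefaultValueAvailable()) {
            $args[] = $param->getDefaultValue();
        } else {
            throw new InvalidArgumentException("Missing constructor argument: $name");
        }
    }

    return $reflection->newInstanceArgs($args);
}

// Example class, purely for demonstration.
class ExampleOrderItem
{
    public $sku;
    public $qty;

    public function __construct($sku, $qty = 1)
    {
        $this->sku = $sku;
        $this->qty = $qty;
    }
}

$item = createWithReflection('ExampleOrderItem', array('sku' => 'A1'));
echo $item->sku, ' x ', $item->qty, "\n"; // A1 x 1
```

This keeps real constructor signatures (and IDE hinting) while still allowing callers to pass a named array.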
You could simply make it a factory with optional params :)
class Example_Factory
{
public static function factory($mandatoryParam, $optionalParam = null)
{
$instance = new self;
$instance->setMandatory($mandatoryParam);
if ($optionalParam !== null) {
$instance->setOptional($optionalParam);
}
return $instance;
}
public function setMandatory($in)
{
// do something....
}
public function setOptional($in)
{
// do some more...
}
}
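A related variation, closer to the "one factory per way of creating the object" idea from the question, is to give the class one static creation method per input type. This is only a sketch: OrderItem, fromArray() and fromXml() are illustrative names, and the XML layout is assumed.

```php
<?php
// Hypothetical order item with one named constructor per input format.
class OrderItem
{
    private $data;

    // Private: instances are created only through the named constructors.
    private function __construct(array $data)
    {
        $this->data = $data;
    }

    public static function fromArray(array $data)
    {
        return new self($data);
    }

    // Assumes a flat document such as <item><sku>A1</sku><qty>2</qty></item>.
    public static function fromXml($xml)
    {
        $element = new SimpleXMLElement($xml);
        $data = array();
        foreach ($element->children() as $child) {
            $data[$child->getName()] = (string) $child;
        }
        return new self($data);
    }

    public function getData()
    {
        return $this->data;
    }
}
```

Adding a new input format then means adding one method, and existing callers are not affected by a changed constructor signature.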
PLEASE CHECK ANSWERS by VolkerK too, he provided another solution, but I can't mark two posts as answer. :(
Good day!
I know that C# allows multiple iterators using yield, like described here:
Is Multiple Iterators is possible in c#?
In PHP there is an Iterator interface. Is it possible to implement more than one iteration scenario for a class?
More details (EDIT):
For example, I have a class TreeNode implementing a single tree node. The whole tree can be expressed using only this one class. I want to provide iterators for iterating over all direct and indirect children of the current node, for example in BreadthFirst or DepthFirst order.
I can implement these iterators as separate classes, but then the tree node has to expose its children collection as public.
C# pseudocode:
public class TreeNode<T>
{
...
public IEnumerable<T> DepthFirstEnumerator
{
get
{
// Some tree traversal using 'yield return'
}
}
public IEnumerable<T> BreadthFirstEnumerator
{
get
{
// Some tree traversal using 'yield return'
}
}
}
Yes, you can.
foreach(new IteratorOne($obj) as $foo) ....
foreach(new IteratorTwo($obj) as $bar) .....
Actually, as long as your class implements Iterator, you can apply any arbitrary IteratorIterator to it. This is a Good Thing, because applied meta iterators are not required to know anything about the class in question.
Consider, for example, an iterable class like this
class JustList implements Iterator
{
function __construct() { $this->items = func_get_args(); }
function rewind() { return reset($this->items); }
function current() { return current($this->items); }
function key() { return key($this->items); }
function next() { return next($this->items); }
function valid() { return key($this->items) !== null; }
}
Let's define some meta iterators
class OddIterator extends FilterIterator {
function accept() { return parent::current() % 2; }
}
class EvenIterator extends FilterIterator {
function accept() { return parent::current() % 2 == 0; }
}
Now apply meta iterators to the base class:
$list = new JustList(1, 2, 3, 4, 5, 6, 7, 8, 9);
foreach(new OddIterator($list) as $p) echo $p; // prints 13579
foreach(new EvenIterator($list) as $p) echo $p; // prints 2468
UPDATE: PHP has no inner classes, so you're kinda out of luck here, without resorting to eval, at least. Your iterators need to be separate classes, which are aware of the base class structure. You can make it less harmful by providing methods in the base class that instantiate iterators behind the scenes:
class TreeDepthFirstIterator implements Iterator
{
function __construct($someTree).....
}
class Tree
{
function depthFirst() { return new TreeDepthFirstIterator($this); }
....
}
foreach($myTree->depthFirst() as $node).....
Another option is to use lambdas instead of foreach. This is nicer and more flexible, requires php5.3 though:
class Tree
{
function depthFirst($func) {
while($node = .....)
$func($node);
.....
$myTree->depthFirst(function($node) {
echo $node->name;
});
For your purpose it might be sufficient to have a "mode" flag in your class, so the user can choose whether to have a breadth-first or a depth-first iterator.
class Tree {
const TREE_DEPTH_FIRST = 0;
const TREE_BREADTH_FIRST = 1;
protected $mode;
protected $current;
public function __construct($mode=Tree::TREE_DEPTH_FIRST) {
$this->mode = $mode;
}
public function setMode($mode) {
...
}
public function next() {
$this->current = advance($this->current, $this->mode);
}
....
}
(and the short answer to your initial question: no, PHP versions before 5.5 don't have the syntactic sugar of yield return, and PHP doesn't have inner private classes, i.e. whatever the iterator you're returning needs to do with the "original" object has to be exposed to the outside world. So you'd probably end up "preparing" all elements for an iterator object like ArrayIterator, the very thing you avoid by using yield)
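For readers on PHP 5.5 or later, generators do offer yield, which gets fairly close to the C# pseudocode in the question while keeping the children private. The following is a sketch under that assumption; the TreeNode layout (a value plus a private list of children) is invented for the example, and both traversals include the starting node itself.

```php
<?php
// Hypothetical tree node exposing two traversal orders as generators.
class TreeNode
{
    private $value;
    private $children = array();

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function addChild(TreeNode $child)
    {
        $this->children[] = $child;
        return $this;
    }

    // Visits this node, then each subtree in turn.
    public function depthFirst()
    {
        yield $this->value;
        foreach ($this->children as $child) {
            foreach ($child->depthFirst() as $value) {
                yield $value;
            }
        }
    }

    // Visits nodes level by level using a FIFO queue.
    public function breadthFirst()
    {
        $queue = array($this);
        while ($queue) {
            $node = array_shift($queue);
            yield $node->value;
            foreach ($node->children as $child) {
                $queue[] = $child;
            }
        }
    }
}

$root = new TreeNode(1);
$left = new TreeNode(2);
$left->addChild(new TreeNode(4));
$root->addChild($left)->addChild(new TreeNode(3));

echo implode(',', iterator_to_array($root->depthFirst(), false)), "\n";   // 1,2,4,3
echo implode(',', iterator_to_array($root->breadthFirst(), false)), "\n"; // 1,2,3,4
```

Since the generators live inside the class, the children collection does not need to be public.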
This code shows you how to add multiple iterators in a class.
class TreeNode {
public function getOddIterator () {
return new OddIterator($this->nodes);
}
public function getEvenIterator () {
return new EvenIterator($this->nodes);
}
}
You can have multiple Iterators. The key idea in the Iterator is to take the responsibility for access and traversal out of the list object and put it into an iterator object. So if you want to have multiple Iterators with the same list or different lists; there's no problem.
You can find four different PHP examples here:
http://www.php5dp.com/category/design-patterns/iterator/
You can also use them with Linked Lists.
Cheers,
Bill