Out of curiosity, what's the difference (if any, e.g. in performance) between creating instances in PHP in each of the following ways?
class MyClass { }
// Direct
$name = 'MyClass';
$instance = new $name;
// Using ReflectionClass
$reflector = new ReflectionClass('MyClass');
$instance = $reflector->newInstance();
// Really don't know if it's going to work
$instance = call_user_func(array('MyClass', '__construct'));
Direct is "the normal way"
Using ReflectionClass is what you would do if your program had to figure out classes etc. on the fly; there is no need to do this in most cases. It will typically be a bit more resource hungry and slower (perhaps not noticeably).
Not sure about the 3rd one - falls into the KISS principle - Since "Direct" works, I've never got into such a twisted situation as to even come up with that 3rd approach.
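As a rough illustration of the first two approaches, here is a minimal sketch. The Greeter class and its $name argument are made up for the example; ReflectionClass::newInstanceArgs() is the reflection counterpart for passing constructor arguments as an array. The third variant from the question is left out because invoking __construct through a callback is a method call rather than an instantiation, so it would not hand back a new object.

```php
<?php
// Hypothetical class, used only to have a constructor argument to pass
class Greeter
{
    public $name;

    public function __construct($name = 'world')
    {
        $this->name = $name;
    }
}

$class = 'Greeter';

// Direct: variable class name, constructor arguments passed as usual
$a = new $class('direct');

// Reflection: newInstanceArgs() takes the constructor arguments as an array
$reflector = new ReflectionClass($class);
$b = $reflector->newInstanceArgs(array('reflection'));

// Reflection without arguments falls back to the default parameter value
$c = $reflector->newInstance();

echo $a->name, ' ', $b->name, ' ', $c->name, PHP_EOL; // direct reflection world
```

Both routes end up with an ordinary Greeter instance; reflection mainly pays off when the class or its arguments are only known at runtime.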
Related
I was wondering if there's a good way to implement the registry pattern in PHP, let me be more clear:
I do know that a Registry is used when you need to keep track of the object you instantiate in order to reuse them and not re-instantiate them again from script to script, e.g. I have a Database class that I want to instantiate only once and then use for all my scripts and I do not want to re-instantiate it again and again. Another example could be a User class that represents an instance of the currently logged in user. I could not use a Singleton in this case, cause e.g. I need another User instance for example when I want to retrieve a friend of the currently logged in user etc.
So I came up with the idea that the Registry better suits this kind of needs in such cases.
I also know that there are two ways of implementing it, or better two ways in order to access the stored instances:
Explicitly or externally, meaning that the Registry should be called every time you need to recover an instance inside your scripts or you need to put an instance inside of it;
Implicitly or internally, meaning that you make kind of an abstract class with a getInstance() method that returns an instance with the get_called_class() late static binding feature, adds it to the registry and then return that instance from the registry itself taking care that if a $label parameter is passed to the getInstance() method, then that particular instance from the registry will be returned. This approach is kinda transparent to the consumer and in my opinion is cleaner and neater (I'll show both implementations, though).
Let's take a basic Registry (really simple implementation, just an example taken from a book):
class Registry {
static private $_store = array();
static public function set($object, $name = null)
{
// Use the class name if no name given, simulates singleton
$name = (!is_null($name)) ? $name: get_class($object);
$name = strtolower($name);
$return = null;
if (isset(self::$_store[$name])) {
// Store the old object for returning
$return = self::$_store[$name];
}
self::$_store[$name]= $object;
return $return;
}
static public function get($name)
{
if (!self::contains($name)) {
throw new Exception("Object does not exist in registry");
}
// set() stores keys lowercased, so look up with the same normalization
return self::$_store[strtolower($name)];
}
static public function contains($name)
{
return isset(self::$_store[strtolower($name)]);
}
static public function remove($name)
{
if (self::contains($name)) {
unset(self::$_store[strtolower($name)]);
}
}
}
I know, Registry could be a Singleton, so you never have two Registry at the same time (who needs them someone could think, but who knows).
Anyway, the external way of storing/accessing instances is like this:
$read = new DBReadConnection;
Registry::set($read);
$write = new DBWriteConnection;
Registry::set($write);
// To get the instances, anywhere in the code:
$read = Registry::get('DbReadConnection');
$write = Registry::get('DbWriteConnection');
And internally, inside the class (taken from the book) when getInstance is called:
abstract class DBConnection extends PDO {
static public function getInstance($name = null)
{
// Get the late-static-binding version of __CLASS__
$class = get_called_class();
// Allow passing in a name to get multiple instances
// If you do not pass a name, it functions as a singleton
$name = (!is_null($name)) ? $name : $class;
if (!Registry::contains($name)) {
$instance = new $class();
Registry::set($instance, $name);
}
return Registry::get($name);
}
}
class DBWriteConnection extends DBConnection {
public function __construct()
{
parent::__construct(APP_DB_WRITE_DSN, APP_DB_WRITE_USER, APP_DB_WRITE_PASSWORD);
}
}
class DBReadConnection extends DBConnection {
public function __construct()
{
parent::__construct(APP_DB_READ_DSN, APP_DB_READ_USER,APP_DB_READ_PASSWORD);
}
}
Referring to the registry indirectly (second case) seems more scalable to me, but what if some day I need to change the registry and use another implementation? Would I have to change the calls to Registry::get() and Registry::set() inside the getInstance() method to suit the changes, or is there a smarter way?
Has anyone come across this problem and found an easy way to interchange different registries depending on the type of application, its complexity, etc.?
Should a configuration class be the solution? Or is there a smarter way to achieve a scalable registry pattern, if that is possible?
Thanks for the attention! Hope for some help!
First of all, it's great that you spotted the problem with your approach by yourself. By using a registry you are tightly coupling your classes to the registry you pull your dependencies from. Not only that, but if your classes have to care about how they are stored in the registry and fetched from it (in your case every class would also implement a singleton), you also violate the Single Responsibility Principle.
As a rule of thumb keep in mind: Accessing objects globally from within a class from whatever storage will lead to tight coupling between the class and the storage.
Let's see what Martin Fowler has to say about this topic:
The key difference is that with a Service Locator every user of a service has a dependency to the locator. The locator can hide dependencies to other implementations, but you do need to see the locator. So the decision between locator and injector depends on whether that dependency is a problem.
and
With the service locator you have to search the source code for calls to the locator. Modern IDEs with a find references feature make this easier, but it's still not as easy as looking at the constructor or setting methods.
So you see, it depends on what you are building. If you have a small app with few dependencies, to hell with it, go on using a registry (but you absolutely should drop a class's behavior of storing itself in, or fetching itself from, the registry). If that's not the case and you are building complex services and want a clean and straightforward API, define your dependencies explicitly by using type hints and constructor injection.
<?php
class DbConsumer {
protected $dbReadConnection;
protected $dbWriteConnection;
public function __construct(DBReadConnection $dbReadConnection, DBWriteConnection $dbWriteConnection)
{
$this->dbReadConnection = $dbReadConnection;
$this->dbWriteConnection = $dbWriteConnection;
}
}
// You can still use service location for example to grab instances
// but you will not pollute your classes itself by making use of it
// directly. Instead we'll grab instances from it and pass them into
// the consuming class
// [...]
$read = $registry->get('dbReadConnection');
$write = $registry->get('dbWriteConnection');
$dbConsumer = new DbConsumer($read, $write);
Should a configuration class be the solution? Or is there a smarter way to achieve a scalable registry pattern, if that is possible?
That approach is encountered very often, and you may have heard of a DI container. Fabien Potencier writes the following:
A Dependency Injection Container is an object that knows how to instantiate and configure objects. And to be able to do its job, it needs to knows about the constructor arguments and the relationships between the objects.
The boundaries between a service locator and a DI-Container seem to be pretty blurry but I like the concept to think about it like that: A Service Locator hides the dependencies of a class while a DI-Container does not (which comes along with the benefit of easy unit testing).
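To make the container idea a little more concrete, here is a minimal sketch under stated assumptions. SimpleContainer, StubConnection and ReportService are hypothetical names invented for this example and do not come from any particular library. The container only knows how to build things; the consuming class still receives its dependency through the constructor.

```php
<?php
// Hypothetical minimal container: maps a service name to a factory closure
class SimpleContainer
{
    private $factories = array();
    private $instances = array();

    public function set($name, Closure $factory)
    {
        $this->factories[$name] = $factory;
    }

    public function get($name)
    {
        if (!isset($this->instances[$name])) {
            if (!isset($this->factories[$name])) {
                throw new InvalidArgumentException("Unknown service: $name");
            }
            // Build the service once, then reuse the shared instance
            $this->instances[$name] = call_user_func($this->factories[$name], $this);
        }
        return $this->instances[$name];
    }
}

// Stand-in for a real database connection
class StubConnection
{
    public $dsn;

    public function __construct($dsn)
    {
        $this->dsn = $dsn;
    }
}

// The consumer declares its dependency and never sees the container
class ReportService
{
    private $connection;

    public function __construct(StubConnection $connection)
    {
        $this->connection = $connection;
    }

    public function getConnection()
    {
        return $this->connection;
    }
}

$container = new SimpleContainer();
$container->set('db', function () {
    return new StubConnection('sqlite::memory:');
});
$container->set('report', function ($c) {
    return new ReportService($c->get('db'));
});

$report = $container->get('report');
var_dump($report->getConnection() === $container->get('db')); // bool(true)
```

Swapping the storage or construction strategy then only touches the wiring at the bottom, not ReportService.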
So you see, there is no final answer and it depends on what you are building. I suggest digging deeper into the topic, since how dependencies are managed is a core concern of every application.
Further Reading
Why Registry Pattern is antipattern. And what is alternative for it.
Service Locator is an Anti-Pattern
Do you need a Dependency Injection Container?
I was trying to find a way to execute some code to alter the results of an object's methods without actually touching the object's code. One way I came up with is using a decorator:
class Decorator {
private $object;
public function __construct($object) {
if (!is_object($object)) {
throw new Exception("Not an object");
}
$this->object = $object;
}
protected function doSomething(&$val) {
$val .= "!!";
}
public function __call($name, $arguments) {
$retVal = call_user_func_array(array($this->object, $name), $arguments);
$this->doSomething($retVal);
return $retVal;
}
}
class Test extends BaseTest {
public function run() {
return "Test->run()";
}
}
$o = new Decorator(new Test());
$o->run();
That way it will work properly but it has one disadvantage which makes it unusable for me right now - it would require replacing all lines with new Test() with new Decorator(new Test()) and this is exactly what I would like to avoid - lots of meddling with the existing code. Maybe something I could do in the base class?
One does not simply overload stuff in PHP, so what you want cannot be done. But the fact that you are in trouble now is a big tell that the design is flawed, whether it is your own design or that of code you have to work with (I feel your pain).
If you cannot do what you want to do, it is because the code is tightly coupled, i.e. it uses the new keyword inside classes instead of injecting the objects (dependency injection) into the classes / methods that need them.
Besides not being able to easily swap classes, you would also have a hard time testing your units because of the tight coupling.
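A small sketch of what the injected version could look like, assuming the consuming code can be changed to accept its collaborator. Runner, PlainTest, ShoutingRunner and Reporter are hypothetical names for this illustration only. Because the consumer depends on an interface and receives the object from outside, the decorator is swapped in at a single place.

```php
<?php
// Hypothetical interface shared by the real object and its decorator
interface Runner
{
    public function run();
}

class PlainTest implements Runner
{
    public function run()
    {
        return "Test->run()";
    }
}

// Decorator: same interface, alters the result of the wrapped object
class ShoutingRunner implements Runner
{
    private $inner;

    public function __construct(Runner $inner)
    {
        $this->inner = $inner;
    }

    public function run()
    {
        return $this->inner->run() . "!!";
    }
}

// The consumer never calls "new" itself, so it does not care what it gets
class Reporter
{
    private $runner;

    public function __construct(Runner $runner)
    {
        $this->runner = $runner;
    }

    public function report()
    {
        return "Result: " . $this->runner->run();
    }
}

$plain = new Reporter(new PlainTest());
$decorated = new Reporter(new ShoutingRunner(new PlainTest()));

echo $plain->report(), PHP_EOL;     // Result: Test->run()
echo $decorated->report(), PHP_EOL; // Result: Test->run()!!
```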
UPDATE
For completeness (for possible future readers): if the specific class had been namespaced and you were allowed to change the namespace, you could have thought about changing the namespace. However, this is not really good practice, because it may interfere with, for example, autoloaders that follow PSR-0. But considering you cannot do this either way, I don't see how what you want is possible. P.S. you should not really use this "solution".
UPDATE2
It looks like there has been some overload extension at some time (way way way back), but the only thing I have found about it is some bug report. And don't count on it still working now either way. ;-) There simply is no real overloading in PHP.
Found something (a dead project which doesn't work anymore that enables class overloading): http://pecl.php.net/package/runkit
Possibly another project (also dead of course): http://pecl.php.net/package/apd
I am not a PHP programmer, but I think that AOP is what you are looking for. You can try some frameworks, for example listed in this answer.
From the Wikipedia article on the decorator pattern:
Subclass the original "Component" class into a "Decorator" class
So I think you're supposed to keep the class to be decorated private and expose only the already-decorated class.
I'm using __get() to make some of my properties "dynamic" (initialize them only when requested). These "fake" properties are stored inside a private array property, which I'm checking inside __get.
Anyway, do you think it's a better idea to create methods for each of these properties instead of doing it in a switch statement?
Edit: Speed tests
I'm only concerned about performance; the other issues that @Gordon mentioned are not that important to me:
unneeded added complexity - it doesn't really increase my app complexity
fragile non-obvious API - I specifically want my API to be "isolated"; The documentation should tell others how to use it :P
So here are the tests that I made, which make me think that the performance hit argument is unjustified:
Results for 50.000 calls (on PHP 5.3.9):
(t1 = magic with switch, t2 = getter, t3 = magic with further getter call)
Not sure what the "Cum" thing means on t3. It can't be cumulative time because t2 should have 2K then...
The code:
class B{}
class A{
protected
$props = array(
'test_obj' => false,
);
// magic
function __get($name){
if(isset($this->props[$name])){
switch($name){
case 'test_obj':
if(!($this->props[$name] instanceof B))
$this->props[$name] = new B;
break;
}
return $this->props[$name];
}
trigger_error('property doesnt exist');
}
// standard getter
public function getTestObj(){
if(!($this->props['test_obj'] instanceof B))
$this->props['test_obj'] = new B;
return $this->props['test_obj'];
}
}
class AA extends A{
// magic
function __get($name){
$getter = "get".str_replace('_', '', $name); // give me a break, its just a test :P
if(method_exists($this, $getter))
return $this->$getter();
trigger_error('property doesnt exist');
}
}
function t1(){
$obj = new A;
for($i=1;$i<50000;$i++){
$a = $obj->test_obj;
}
echo 'done.';
}
function t2(){
$obj = new A;
for($i=1;$i<50000;$i++){
$a = $obj->getTestObj();
}
echo 'done.';
}
function t3(){
$obj = new AA;
for($i=1;$i<50000;$i++){
$a = $obj->test_obj;
}
echo 'done.';
}
t1();
t2();
t3();
ps: why do I want to use __get() over standard getter methods? the only reason is the api beauty; because i don't see any real disadvantages, I guess it's worth it :P
Edit: More Speed tests
This time I used microtime to measure some averages:
PHP 5.2.4 and 5.3.0 (similar results):
t1 - 0.12s
t2 - 0.08s
t3 - 0.24s
PHP 5.3.9 with xdebug active (which is why it's so slow):
t1 - 1.34s
t2 - 1.26s
t3 - 5.06s
PHP 5.3.9 with xdebug disabled:
t1 - 0.30s
t2 - 0.25s
t3 - 0.86s
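The post does not show how the averages were collected, so the following is only one plausible way to do it with microtime. averageRuntime() and the Sample class are hypothetical helpers written for this sketch, and the timings it prints will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: average wall-clock seconds of $callback over $runs runs
function averageRuntime($callback, $runs = 5)
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        call_user_func($callback);
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

// Minimal class offering both access styles for the same private value
class Sample
{
    private $value = 42;

    public function getValue()
    {
        return $this->value;
    }

    public function __get($name)
    {
        if ($name === 'value') {
            return $this->value;
        }
        trigger_error('property does not exist');
    }
}

$obj = new Sample();

$getter = averageRuntime(function () use ($obj) {
    for ($i = 0; $i < 50000; $i++) {
        $x = $obj->getValue();
    }
});

$magic = averageRuntime(function () use ($obj) {
    for ($i = 0; $i < 50000; $i++) {
        $x = $obj->value; // private from outside, so this goes through __get
    }
});

printf("getter: %.4fs, magic: %.4fs\n", $getter, $magic);
```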
Another method:
// magic
function __get($name){
$getter = "get".str_replace('_', '', $name);
if(method_exists($this, $getter)){
$this->$name = $this->$getter(); // <-- create it
return $this->$name;
}
trigger_error('property doesnt exist');
}
A public property with the requested name will be created dynamically after the first __get call. This solves the speed issue (I'm getting 0.1s in PHP 5.3, 12 times faster than the standard getter) and the extensibility issue raised by Gordon. You can simply override the getter in the child class.
The disadvantage is that the property becomes writable :(
Here are the results of your code as reported by Zend Debugger with PHP 5.3.6 on my Win7 machine:
As you can see, the calls to your __get methods are a good deal (3-4 times) slower than the regular calls. We are still dealing with less than 1s for 50k calls in total, so it is negligible when used on a small scale. However, if your intention is to build your entire code around magic methods, you will want to profile the final application to see if it's still negligible.
So much for the rather uninteresting performance aspect. Now let's take a look at what you consider "not that important". I'm going to stress that because it actually is much more important than the performance aspect.
Regarding Unneeded Added Complexity you write
it doesn't really increase my app complexity
Of course it does. You can easily spot it by looking at the nesting depth of your code. Good code stays to the left. Your if/switch/case/if is four levels deep. This means there are more possible execution paths, and that leads to a higher Cyclomatic Complexity, which means code that is harder to maintain and understand.
Here are the numbers for your class A (without the regular Getter; output is shortened from PHPLoc):
Lines of Code (LOC): 19
Cyclomatic Complexity / Lines of Code: 0.16
Average Method Length (NCLOC): 18
Cyclomatic Complexity / Number of Methods: 4.00
A value of 4.00 means this is already at the edge to moderate complexity. This number increases by 2 for every additional case you put into your switch. In addition, it will turn your code into a procedural mess because all the logic is inside the switch/case instead of dividing it into discrete units, e.g. single Getters.
A Getter, even a lazy loading one, does not need to be moderately complex. Consider the same class with a plain old PHP Getter:
class Foo
{
protected $bar;
public function getBar()
{
// Lazy Initialization
if ($this->bar === null) {
$this->bar = new Bar;
}
return $this->bar;
}
}
Running PHPLoc on this will give you a much better Cyclomatic Complexity
Lines of Code (LOC): 11
Cyclomatic Complexity / Lines of Code: 0.09
Cyclomatic Complexity / Number of Methods: 2.00
And this will stay at 2 for every additional plain old Getter you add.
Also, take into account that when you want to use subtypes of your variant, you will have to override __get and copy and paste the entire switch/case block to make changes, while with a plain old Getter you simply override the Getters you need to change.
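A short sketch of that last point, reusing the Foo/Bar shape from the example above. CachedBar and CachedFoo are hypothetical names introduced only for this illustration.

```php
<?php
class Bar {}
class CachedBar extends Bar {} // hypothetical subtype of the lazily created object

class Foo
{
    protected $bar;

    public function getBar()
    {
        // Lazy Initialization
        if ($this->bar === null) {
            $this->bar = new Bar;
        }
        return $this->bar;
    }
}

class CachedFoo extends Foo
{
    // Only this getter changes; any other getter would be inherited untouched
    public function getBar()
    {
        if ($this->bar === null) {
            $this->bar = new CachedBar;
        }
        return $this->bar;
    }
}

$foo = new CachedFoo();
echo get_class($foo->getBar()), PHP_EOL; // CachedBar
```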
Yes, it's more typing work to add all the Getters, but it is also much simpler and will eventually lead to more maintainable code and also has the benefit of providing you with an explicit API, which leads us to your other statement
I specifically want my API to be "isolated"; The documentation should tell others how to use it :P
I don't know what you mean by "isolated" but if your API cannot express what it does, it is poor code. If I have to read your documentation because your API does not tell me how I can interface with it by looking at it, you are doing it wrong. You are obfuscating the code. Declaring properties in an array instead of declaring them at the class level (where they belong) forces you to write documentation for it, which is additional and superfluous work. Good code is easy to read and self documenting. Consider buying Robert Martin's book "Clean Code".
With that said, when you say
the only reason is the api beauty;
then I say: then don't use __get because it will have the opposite effect. It will make the API ugly. Magic is complicated and non-obvious and that's exactly what leads to those WTF moments:
To come to an end now:
i don't see any real disadvantages, I guess it's worth it
You hopefully see them now. It's not worth it.
For additional approaches to Lazy Loading, see the various Lazy Loading patterns from Martin Fowler's PoEAA:
There are four main varieties of lazy load. Lazy Initialization uses a special marker value (usually null) to indicate a field isn't loaded. Every access to the field checks the field for the marker value and if unloaded, loads it. Virtual Proxy is an object with the same interface as the real object. The first time one of its methods are called it loads the real the object and then delegates. Value Holder is an object with a getValue method. Clients call getValue to get the real object, the first call triggers the load. A ghost is the real object without any data. The first time you call a method the ghost loads the full data into its fields.
These approaches vary somewhat subtly and have various trade-offs. You can also use combination approaches. The book contains the full discussion and examples.
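As one illustration of these varieties, here is a rough sketch of a Value Holder. It is not taken from the book; the ValueHolder class, its loader callback and the ArrayObject payload are assumptions made for this example.

```php
<?php
// Hypothetical Value Holder: wraps a loader callback and runs it at most once
class ValueHolder
{
    private $loader;
    private $value;
    private $loaded = false;

    public function __construct($loader)
    {
        $this->loader = $loader;
    }

    public function getValue()
    {
        if (!$this->loaded) {
            // The first call triggers the load; later calls reuse the result
            $this->value = call_user_func($this->loader);
            $this->loaded = true;
        }
        return $this->value;
    }
}

$calls = 0;
$holder = new ValueHolder(function () use (&$calls) {
    $calls++;
    return new ArrayObject(array('expensive' => true));
});

$first = $holder->getValue();
$second = $holder->getValue();

var_dump($first === $second); // bool(true)
echo $calls, PHP_EOL;         // 1
```

A separate $loaded flag is used instead of a null marker so that a loader which legitimately returns null is still only run once.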
If your capitalization of the class names and the key names in $props matched, you could do this:
class Dummy {
private $props = array(
'Someobject' => false,
//etc...
);
function __get($name){
if(isset($this->props[$name])){
if(!($this->props[$name] instanceof $name)) {
$this->props[$name] = new $name();
}
return $this->props[$name];
}
//trigger_error('property doesnt exist');
//Make exceptions, not war
throw new Exception('Property doesn\'t exist');
}
}
And even if the capitalization didn't match, as long as it followed the same pattern it could work. If the first letter was always capitalized you could use ucfirst() to get the class name.
EDIT
It's probably just better to use plain methods. Having a switch inside a getter, especially when the code executed for each thing you try to get is different, practically defeats the purpose of the getter, to save you from having to repeat code. Take the simple approach:
class Something {
private $props = array('Someobject' => false);
public function getSomeobject() {
if(!($this->props['Someobject'] instanceof Someobject)) {
//Instantiate and do extra stuff
}
return $this->props['Someobject'];
}
public function getSomeOtherObject() {
//etc..
}
}
I'm using __get() to make some of my properties "dynamic" (initialize them only when requested). These "fake" properties are stored inside a private array property, which I'm checking inside __get.
Anyway, do you think it's a better idea to create methods for each of these properties instead of doing it in a switch statement?
The way you ask your question I don't think it is actually about what anybody thinks. To talk about thoughts, first of all it must be clear which problem you want to solve here.
Both the magic __get and common getter methods help to provide the value. However, what you cannot do in PHP is create a read-only property.
If you need a read-only property, you can only do that with the magic __get function in PHP so far (the alternative is in an RFC).
If you are okay with accessor methods and your concern is the typing involved, use a better IDE that generates them for you.
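A minimal sketch of the __get-based read-only property described here. The Point class is hypothetical; newer PHP versions (8.1 and later) also provide a readonly modifier, which this sketch deliberately does not use.

```php
<?php
// Hypothetical value object: properties readable from outside, never writable
class Point
{
    private $data;

    public function __construct($x, $y)
    {
        $this->data = array('x' => $x, 'y' => $y);
    }

    public function __get($name)
    {
        if (array_key_exists($name, $this->data)) {
            return $this->data[$name];
        }
        throw new OutOfRangeException("No such property: $name");
    }

    public function __set($name, $value)
    {
        // Any write from outside ends up here because no public property exists
        throw new LogicException("Property $name is read-only");
    }
}

$p = new Point(1, 2);
echo $p->x + $p->y, PHP_EOL; // 3

try {
    $p->x = 10;
} catch (LogicException $e) {
    echo $e->getMessage(), PHP_EOL; // Property x is read-only
}
```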
If those properties just do not need to be concrete, you can keep them dynamic because a more concrete interface would be a useless detail and only make your code more complex than it needs to be and therefore violates common OO design principles.
However, dynamic or magic code can also be a sign that you are doing something wrong, and it is hard to debug. So you really should know what you are doing. That requires making the problem you would like to solve more concrete, because this heavily depends on the type of objects.
And speed is something you should not test isolated, it does not give you good suggestions. Speed in your question sounds more like a drug ;) but taking that drug won't give you the power to decide wisely.
Using __get() is said to be a performance hit. Therefore, if your list of parameters is static/fixed and not terribly long, it would be better performance-wise to make methods for each and skip __get(). For example:
public function someobject() {
if(!($this->props['someobject'] instanceof Someobject)) {
$this->props['someobject'] = new Someobject;
// do stuff to initialize someobject
}
if (count($argv = func_get_args())) {
// do stuff to SET someobject from $argv[0]
}
return $this->props['someobject'];
}
To avoid the magic methods, you'd have to alter the way you use it like this
$bar = $foo->someobject; // this won't work without __get()
$bar = $foo->someobject(); // use this instead
$foo->someobject($bar); // this is how you would set without __set()
EDIT
Edit, as Alex pointed out, the performance hit is millisecond small. You can try both ways and do some benchmarks, or just go with __get since it's not likely to have a significant impact on your application.
Recently, I saw a colleague of mine instantiate his classes in a constructor, so I started doing the same, like this:
class FooBar{
private $model1;
private $model2;
public function __construct() {
$this->model1=new Model1();
$this->model2=new Model2();
}
}
And now I'm starting to wonder, if maybe instantiating the models everywhere where they are needed may be better?
E.g., function foo() needs model1 and function bar() needs model2, but now both models are loaded.
So, the question: Is this the right way to instantiate other classes? Or should I just instantiate them when I need them in a function?
Well, as always there is no one size fits all answer.
Most of the time, class FooBar aggregates $model1 and $model2 because it needs them to fulfill its function. In this scenario there's not much that FooBar can do unless it has objects in these variables, so it's the right thing to do to create them in the constructor.
Sometimes an aggregate object is not needed to perform a large part of class FooBar's function, and the construction of that object is an expensive operation. In this case, it makes sense to only construct it on demand with code like the following:
class FooBar {
private $model1;
private $model2;
public function Frob() {
$model = $this->getModel1();
$model->frob();
}
private function getModel1() {
if ($this->model1 === null) {
$this->model1 = new Model1;
}
return $this->model1;
}
}
However, that's only sometimes. If class FooBar needs $model1 for half of its operations and $model2 for the other half, this may indicate that FooBar is suffering from a case of "let's throw everything inside one class" and should be split into two classes instead.
I would like to see these dependencies injected into the constructor as parameters.
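A minimal sketch of what that could look like for the FooBar class from the question. The Model1 and Model2 stubs and their describe() method are invented for the example.

```php
<?php
// Stub models, invented for this sketch
class Model1
{
    public function describe()
    {
        return 'model1';
    }
}

class Model2
{
    public function describe()
    {
        return 'model2';
    }
}

class FooBar
{
    private $model1;
    private $model2;

    // The caller decides which instances FooBar works with
    public function __construct(Model1 $model1, Model2 $model2)
    {
        $this->model1 = $model1;
        $this->model2 = $model2;
    }

    public function foo()
    {
        return 'foo uses ' . $this->model1->describe();
    }

    public function bar()
    {
        return 'bar uses ' . $this->model2->describe();
    }
}

$fooBar = new FooBar(new Model1(), new Model2());
echo $fooBar->foo(), PHP_EOL; // foo uses model1
echo $fooBar->bar(), PHP_EOL; // bar uses model2
```

In a test, subclasses or mocks of Model1 and Model2 can be passed in instead, which is not possible when the constructor calls new itself.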
You should actually be loading them when you need them otherwise a whole bunch of models that are not required (which may have their own constructors with more models loading!) will pop into memory every time you need a trivial operation done.
Don't create a new model unless you're sure you will be using it (e.g. models needed to localize and such).
It is not exact science, and you should follow your instincts in how to organize the code.
If this approach gets unmaintainable, or you want to unit test it, dependency injection might come to the rescue.
But if you're doing simple scripts and development time is an important factor, the way you're doing it now is sufficient.
I've always worried about calling methods by referencing them via strings.
Basically in my current scenario, I use static data mapper methods to create and return an array of data model objects (eg. SomeDataMapper::getAll(1234)). Models follow the Active Record design pattern. In some cases, there could be hundreds of records returned, and I don't want to put everything into memory all at once. So, I am using an Iterator to page through the records, as follows
$Iterator = new DataMapperIterator('SomeDataMapper', 'getAll', array(1234));
while ($Iterator->hasNext()) {
$data = $Iterator->next();
}
Is that a good way of doing this? Is it a bad idea to pass as strings the name of the mapper class and the method? I worry that this idea is not portable to other languages. Is this generally true for languages like Ruby and Python? If so, can anyone recommend a good alternative?
FYI, for future readers' reference, I call the method like this:
$method = new ReflectionMethod($className, $methodName);
$returnValue = $method->invokeArgs(null, $parameters);
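For comparison, a small self-contained sketch of that call next to call_user_func_array, which accepts the same class and method names as strings. The SomeDataMapper stub and its return values are made up for the example.

```php
<?php
// Stub mapper with a static finder, invented for this sketch
class SomeDataMapper
{
    public static function getAll($id)
    {
        return array("record-$id-a", "record-$id-b");
    }
}

$className = 'SomeDataMapper';
$methodName = 'getAll';
$parameters = array(1234);

// Reflection: passing null as the object invokes a static method
$method = new ReflectionMethod($className, $methodName);
$viaReflection = $method->invokeArgs(null, $parameters);

// Callback form: array(class name, method name) works for static methods too
$viaCallback = call_user_func_array(array($className, $methodName), $parameters);

var_dump($viaReflection === $viaCallback); // bool(true)
```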
This is essentially a version of the factory pattern: using strings to create an object instance.
However, I question the design idea of using an iterator to control the paging of data - that's not really the purpose of an iterator. Unless we just have name confusion, but I'd probably prefer to see something like this.
$pager = new DataMapperPager( 'SomeDataMapper', 'someMethod', array(1234) );
$pager->setPageNum( 1 );
$pager->setPageSize( 10 );
$rows = $pager->getResults();
foreach ( $rows as $row )
{
// whatever
}
Of course, DataMapperPager::getResults() could return an iterator or whatever you'd want.
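A rough sketch of a pager whose getResults() returns an iterator. To stay self-contained it pages over an in-memory array instead of calling a mapper; ArrayPager is a hypothetical name, while LimitIterator and ArrayIterator are part of PHP's SPL.

```php
<?php
// Hypothetical pager over an in-memory array; a real one would query the mapper
class ArrayPager
{
    private $rows;
    private $pageNum = 1;
    private $pageSize = 10;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function setPageNum($pageNum)
    {
        $this->pageNum = max(1, (int) $pageNum);
    }

    public function setPageSize($pageSize)
    {
        $this->pageSize = max(1, (int) $pageSize);
    }

    public function getResults()
    {
        // LimitIterator exposes only the slice belonging to the current page
        $offset = ($this->pageNum - 1) * $this->pageSize;
        return new LimitIterator(new ArrayIterator($this->rows), $offset, $this->pageSize);
    }
}

$pager = new ArrayPager(range(1, 25));
$pager->setPageNum(3);
$pager->setPageSize(10);

foreach ($pager->getResults() as $row) {
    echo $row, ' ';
}
echo PHP_EOL; // 21 22 23 24 25
```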
It is an acceptable way of doing it. Both Python and Ruby support it and thus should be portable. Python can do it as easily as PHP can, however Ruby has a little more to it. In Python at least, it is useful for when the particular class you're referencing has not yet been imported nor seen yet in the file (i.e. the class is found lower in the same file as where you're trying to reference it.)
Getting a class object from a string in Ruby: http://infovore.org/archives/2006/08/02/getting-a-class-object-in-ruby-from-a-string-containing-that-classes-name/
PHP doesn't really support the passing of functions any other way. All dynamic method invocation functions in PHP take what they call a "callback" - see http://us.php.net/manual/en/language.pseudo-types.php#language.types.callback for documentation on that. As you'll see, they're just strings or arrays of strings in different usage patterns, so you're not far off.
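The callback forms mentioned here can be sketched as follows. The shout() function and the Formatter class are hypothetical; the last line also shows an anonymous function, which PHP accepts as a callback from version 5.3 on.

```php
<?php
// All names below are hypothetical and exist only for illustration
function shout($text)
{
    return strtoupper($text);
}

class Formatter
{
    public static function bracket($text)
    {
        return "[$text]";
    }

    public function quote($text)
    {
        return "'$text'";
    }
}

// Function name as a string
echo call_user_func('shout', 'hi'), PHP_EOL;                         // HI
// Static method as array(class name, method name)
echo call_user_func(array('Formatter', 'bracket'), 'hi'), PHP_EOL;   // [hi]
// Static method as a single "Class::method" string
echo call_user_func('Formatter::bracket', 'hi'), PHP_EOL;            // [hi]
// Instance method as array(object, method name)
echo call_user_func(array(new Formatter(), 'quote'), 'hi'), PHP_EOL; // 'hi'
// Anonymous function
echo call_user_func(function ($text) {
    return strrev($text);
}, 'hi'), PHP_EOL;                                                   // ih
```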
There are however, design patterns that work around this. For instance, you could define a DataMapper interface that all of your mapper classes must implement. Then, instead of passing in the class and method as string, you could pass the mapper instance to your iterator and since it requires the interface it could call the interface methods directly.
pseudocode:
interface DataMapper
{
public function mapData($data);
}
class DataMapperIterator ...
{
public function __construct(DataMapper $mapper, ...)
{
...
}
...
public function next()
{
... now we can call the method explicitly because of interface ...
$this->mapper->mapData($data);
}
}
class DataMapperImplementation implements DataMapper
{
...
public function mapData($data)
{
...
}
...
}
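The pseudocode above could be fleshed out roughly like this. It is only a sketch: UppercaseMapper and the in-memory $rows array are assumptions that stand in for a real mapper and data source, and the iterator uses the hasNext()/next() shape from the question rather than PHP's Iterator interface.

```php
<?php
interface DataMapper
{
    public function mapData($data);
}

// Hypothetical mapper; a real one would build a model object from $data
class UppercaseMapper implements DataMapper
{
    public function mapData($data)
    {
        return strtoupper($data);
    }
}

class DataMapperIterator
{
    private $mapper;
    private $rows;
    private $position = 0;

    // The type hint guarantees that mapData() exists, so no strings are needed
    public function __construct(DataMapper $mapper, array $rows)
    {
        $this->mapper = $mapper;
        $this->rows = array_values($rows);
    }

    public function hasNext()
    {
        return $this->position < count($this->rows);
    }

    public function next()
    {
        // Map the current row, then advance to the following one
        return $this->mapper->mapData($this->rows[$this->position++]);
    }
}

$iterator = new DataMapperIterator(new UppercaseMapper(), array('a', 'b', 'c'));
$mapped = array();
while ($iterator->hasNext()) {
    $mapped[] = $iterator->next();
}
echo implode(',', $mapped), PHP_EOL; // A,B,C
```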
Calling methods by name with passed in strings isn't horrible, there's probably only a performance penalty in that the bytecode generated can't be as optimized - there will always be a symbol lookup - but I doubt you'll notice this much.