Check if an object has changed - php

Is there a more native way (e.x. a built-in function) with less userland code to check if an objects property values have changed instead of using one of those methods:
The serialize approach
$obj = new stdClass(); // May be an instance of any class
echo $hashOld = md5(serialize($obj)) . PHP_EOL;
$obj->change = true;
echo $hashNew = md5(serialize($obj)) . PHP_EOL;
echo 'Changed: '; var_dump($hashOld !== $hashNew);
Which results in:
f7827bf44040a444ac855cd67adfb502 (initial)
506d1a0d96af3b9920a31ecfaca7fd26 (changed)
Changed: bool(true)
The shadow copy approach
$obj = new stdClass();
$shadowObj = clone $obj;
$obj->change = true;
var_dump($shadowObj != $obj);
Which results in:
bool(true);
Both approaches work. But both have disadvantages compared to a non userland implementation. The first one needs CPU for serialization and hashing and the second one needs memory for storing clones. And some classes may not be cloned.
Doesn't PHP track changes at object properties? And does PHP not expose a method to make use of it?

What you are trying to do?
You are trying to compare object with itself, after some chain of "unknown" operations to check if the object has changed. If this is true, there are some logical points to observe. At first, if you want to compare object with itself, you've got only two options:
Remember the whole object state (for example hash, or just copy whole object)
Track changes over time
There is no other logical approach. Comparing memory allocations, real objects, copying objects, comparing hashes, is all in point one. Tracking changes, saving changes inside object, remembering meantime operations, inside point 2.
So in my opinion this question is sort of backing up data questions. In that case there are many, many solutions but none of them are hardcoded inside php as far as I'm concerned. Why?
The answer is simple. PHP guys have got the same problems you've got :). Because if this would be hardocded inside php, then php should run / use one of those mechanisms (1) or (2).
In that case every object that you create, and every operation you made should be written somewhere to remember every state / object / something and use them for comparison in the future.
While you need this solution, almost ~100% of websites don't. So hardcoding this inside php would made ~100% of websites work slower and your work faster ;).
PHP hypothetical solution?
The only solution (maybe built in php in the future) I can think of is making some kind of php config flag: track objects, and only if this flag is true, then run all the php mechanisms of tracking objects states. But this also mean a huge performance gap. As all the ifs (if tracking, if tracking, if tracking) are also procesor and memory time consuming.
There is also a problem, what to compare? You need to compare object with same object, but... Few minutes ago? Few operations ago? No... You must point exactly one place in code, and then point second place in code and compare object in those two places. So hypothetical auto tracking is... Kind of powerless, as there is no "key" in the object state ofer time array. I mean, even if you got magic_object_comparer function, what it should look like?
<?php
function magic_object_comparer() {} // Arguments??
function magic_object_comparer($object_before, $object_after) {} // you must save object_before somewhere...??
function magic_object_comparer($object, $miliseconds) {} // How many miliseconds?
function magic_object_comparer($object, $operations) {} // How many operations? Which operations?
magic_comparer_start($object);
// ... Few operations...
$boolean = magic_comparer_compare_from start($object);
// Same as own implementation...
?>
Sadly, you are left with own implementation...
After all, I would propose to implement some kind of own mechanism for that, and remember to use it only there, where you need it. As this mechanism will for sure be time and memory consuming. So think carefully:
Which objects you want to compare. Why?
When you want to compare them?
Does all changes need to be compared?
What is the easiest way of saving those states changes?
And after all of that, try to implement it. I see that you've got a huge php knowledge, so I'm pretty sure that you will figure out something. There are also many comments, and possible ideas in this question and discussion.
But after all maybe I explained a little why, there is no build in solution, and why there should not be one in the future... :).
UPDATE
Take a look here: http://www.fluffycat.com/PHP-Design-Patterns/. This is a great resource about php patterns. You should take a look at adapter, decorator and observer patterns, for possible elegant object oriented solutions.

While I too am looking for a very fast/faster approach, a variant of method 2 is effectively what I use. The advantage of this method is that it is (pretty) fast (in comparison to an isset()), depending on object size. And you don't have to remember to set a ->modified property each time you change the object.
global $shadowcopy; // just a single copy in this simple example.
$thiscopy = (array) $obj; // don't use clone.
if ($thiscopy !== $shadowcopy) {
// it has been modified
// if you want to know if a field has been added use array_diff_key($thiscopy,$shadowcopy);
}
$shadowcopy = $thiscopy; // if you don't modify thiscopy or shadowcopy, it will be a reference, so an array copy won't be triggered.
This is basically method 2, but without the clone. If your property value is another object (vobj), then clone may be necessary (otherwise both references will point to the same object), but then it is worth noting that it is that object vobj you want to see if has changed with the above code. The thing about clone is that it is constructing a second object (similar performance), but if you want to see what values changed, you don't care about the object itself, only the values. And array casting of an object is very fast (~2x the speed of a boolean cast of a bool) .. well, up until large objects. Also direct array comparison === is very fast, for arrays under say 100 vals.
I'm pretty sure an even faster method exists...

I can offer you another solution to the problem, In fact to detect "if an object has changed" we can use observer pattern design principles. May that way should be better for some people who want to get notify about changes in object.
Contracts/ISubject.php
<?php
namespace Contracts;
interface ISubject
{
public function attach($observer): void;
public function detach($observer): void;
public function notify(): void;
}
Contracts/IObserver.php
<?php
namespace Contracts;
interface IObserver
{
public function update($subject);
}
Subject.php
class Subject implements ISubject
{
public $state; // That is detector
private $observers;
public function __construct()
{
$this->observers = new \SplObjectStorage(); // That is php built in object for testing purpose I use SplObjectStorage() to store attach()'ed objects.
}
public function attach($observer): void
{
echo "Subject: Attached an observer.\n";
$this->observers->attach($observer);
}
public function detach($observer): void
{
$this->observers->detach($observer);
echo "Subject: Detached an observer.\n";
}
public function notify(): void
{
echo "Subject: Notifying observers...\n";
foreach ($this->observers as $observer) {
$observer->update($this);
}
}
public function someYourLogic()
{
$this->state = rand(0, 10);
echo "Subject: My state has just changed to: {$this->state}\n";
$this->notify();
}
}
Observer1.php | Plus you are able to have as many ConcreteObserver as you want
class Observer1 implements IObserver
{
public function update($subject): void
{
if ($subject->state < 5) {
echo "Observer1: Reacted to the event.\n";
}
}
}
Clinet.php
$subject = new Subject();
$o1 = new Observer1();
$subject->attach($o1);
$subject->someYourLogic();

There is no built-in method, I'm afraid. The shadow copy approach is the best way.
A simpler way, if you have control over the class, is to add a modified variable:
$this->modified = false;
When I modify the object in any way, I simply use
$obj->modified = true;
This way I can later check
if($obj->modified){ // Do Something
to check if it was modified. Just remember to unset($obj->modified) before saving content in a database.

We can implement it without observer.
For pure php, we can use $attributes & $original to check what has been modified check this explanation if needed.
$modifiedValues = [];
foreach($obj->attributes as $column=>$value) {
if(!array_key_exists($column, $obj->original) || $obj->original[$column] != $value) {
$modifiedValues[$column] = $value;
}
}
// then check $modifiedValues if it contains values
For Laravel user, we can use the isDirty() method. Its usage:
$user = App\User::first();
$user->isDirty(); //false
$user->name = "Peter";
$user->isDirty(); //true

Related

php pthreads and using a class to store data that is shared between threads

I'm fairly new to PHP, and now very new to pthreads.
I'm using the latest PHP7 RC6 build, with pthreads built from git/src to get the latest (and tried the 'official' v3.0.8 one), on Ubuntu 3.13.0-66-generic
I'm trying to write a threaded solution to read in data from a socket and process it. I'm using threading to try to maximize my performance, mainly due to the fact I'm doing operations like http requests (to AWS DynamoDB and other services) and such that are waiting for responses from external systems, and therefore I can benefit from threading.
The real code I have is more complicated than this is. This is a simple example to show my problem.
What I am trying to do is to 'cache' certain information in an 'array' that I get from a database (AWS DynamoDB) so that I can get better performance. I need each thread to be able to use/access and modify this 'global' cache, and with multiple 'records' in the cache.
I had great success with testing and simply storing a string in this way, but now I'm doing it for real, I need to store more complicated data, and I decided to use a little class (cacheRecord) for each record, instead of a simple string of data. But the problem is that when I try to assign a value back to a class member, it seems to not want to 'save', back to the array.
I managed to get it to work by copying the whole 'class' to a tmp variable, modifying that, and then saving back the whole class to the array, but that seems like an overhead of code, and also I would need to wrap it in a ->synchronized to keep integrity between threads.
Is this the only way to do it correctly, with copying it to a tmp and copying it back and using 'synchronized', or am I doing something else wrong/stupid?
Experimenting with it, I made the cacheRecord class 'extends Threaded'. This made the single assign of the member work fine, but this then made it immutable, and I couldn't unset/delete that record in the cache later.
Code to show what I mean:
<?php
class cacheRecord {
public $currentPos;
public $currentRoom;
public $someOtherData;
}
class cache extends Threaded {
public function run() {}
}
class socketThread extends Thread {
public function __construct($myCache) {
$this->cacheData = $myCache;
}
public function run() {
// This will be in a loop, waiting for sockets, and then responding to them, indefinitely.
// At some point, add a record to the cache
$c = new cacheRecord;
$c->currentPos = '1,2,4';
$c->currentRoom = '2';
$this->cacheData['record1'] = $c;
var_dump($this);
// Later on, update the cache record, but this doesnt work
$this->cacheData['record1']->currentRoom = '3';
var_dump($this);
// However this does work, but is this the correct way? Seems like more code to execute, than a simple assign, and obviously, I would need to use synchronized to keep integrity, which would further slow it down.
$tmp = $this->cacheData['record1'];
$tmp->currentRoom = '3';
$this->cacheData['record1'] = $tmp;
var_dump($this);
// Later on some more, remove the record
unset($this->cacheData['record1']);
var_dump($this);
// Also will be using ->synchronized to enforce integrity of certain other operations
// Just an example of how I might use it
/*
$this->cacheData->synchronized(function() {
if ($this->cacheData['record1']->currentRoom == '3') {
$this->cacheData['record1']->Pos = '0,0,0'; // Obviously this wont work as above.
$this->cacheData['record1']->currentRoom = '4';
}
});
*/
}
}
// Main
$myCache = new cache;
for ($th=0;$th<1;$th++) { // Just 1 thread for testing
$socketThreads[$th] = new socketThread($myCache);
$socketThreads[$th]->start();
}
extends \Threaded is the way to go.
However, "anything" in the cache should be extended from this, not only the cache itsef.
It is explained somewhere in the manuals (sorry dont remember exactly where) than only volatile (aka threaded) object will not me immutable.
So if your class cacheRecord is not extended from threaded, it will be immutable, even into another threaded structure.
threaded makes inner attributes array automatically volatile (so thread-usable), but not object if they are not extended from threaded.
Try extending cacheRecord from threaded and tell me if it works.
Phil+

Using __get() (magic) to emulate readonly properites and lazy-loading

I'm using __get() to make some of my properties "dynamic" (initialize them only when requested). These "fake" properties are stored inside a private array property, which I'm checking inside __get.
Anyway, do you think it's better idea to create methods for each of these proprties instead of doing it in a switch statement?
Edit: Speed tests
I'm only concerned about performance, other stuff that #Gordon mentioned are not that important to me:
unneeded added complexity - it doesn't really increase my app complexity
fragile non-obvious API - I specifically want my API to be "isolated"; The documentation should tell others how to use it :P
So here are the tests that I made, which make me think that the performance hit agument is unjustified:
Results for 50.000 calls (on PHP 5.3.9):
(t1 = magic with switch, t2 = getter, t3 = magic with further getter call)
Not sure what the "Cum" thing mean on t3. It cant be cumulative time because t2 should have 2K then...
The code:
class B{}
class A{
protected
$props = array(
'test_obj' => false,
);
// magic
function __get($name){
if(isset($this->props[$name])){
switch($name){
case 'test_obj':
if(!($this->props[$name] instanceof B))
$this->props[$name] = new B;
break;
}
return $this->props[$name];
}
trigger_error('property doesnt exist');
}
// standard getter
public function getTestObj(){
if(!($this->props['test_obj'] instanceof B))
$this->props['test_obj'] = new B;
return $this->props['test_obj'];
}
}
class AA extends A{
// magic
function __get($name){
$getter = "get".str_replace('_', '', $name); // give me a break, its just a test :P
if(method_exists($this, $getter))
return $this->$getter();
trigger_error('property doesnt exist');
}
}
function t1(){
$obj = new A;
for($i=1;$i<50000;$i++){
$a = $obj->test_obj;
}
echo 'done.';
}
function t2(){
$obj = new A;
for($i=1;$i<50000;$i++){
$a = $obj->getTestObj();
}
echo 'done.';
}
function t3(){
$obj = new AA;
for($i=1;$i<50000;$i++){
$a = $obj->test_obj;
}
echo 'done.';
}
t1();
t2();
t3();
ps: why do I want to use __get() over standard getter methods? the only reason is the api beauty; because i don't see any real disadvantages, I guess it's worth it :P
Edit: More Speed tests
This time I used microtime to measure some averages:
PHP 5.2.4 and 5.3.0 (similar results):
t1 - 0.12s
t2 - 0.08s
t3 - 0.24s
PHP 5.3.9, with xdebug active this is why it's so slow:
t1 - 1.34s
t2 - 1.26s
t3- 5.06s
PHP 5.3.9 with xdebug disabled:
t1 - 0.30
t2 - 0.25
t3 - 0.86
Another method:
// magic
function __get($name){
$getter = "get".str_replace('_', '', $name);
if(method_exists($this, $getter)){
$this->$name = $this->$getter(); // <-- create it
return $this->$name;
}
trigger_error('property doesnt exist');
}
A public property with the requested name will be created dynamically after the first __get call. This solves speed issues - getting 0.1s in PHP 5.3 (it's 12 times faster then standard getter), and the extensibility issue raised by Gordon. You can simply override the getter in the child class.
The disadvantage is that the property becomes writable :(
Here is the results of your code as reported by Zend Debugger with PHP 5.3.6 on my Win7 machine:
As you can see, the calls to your __get methods are a good deal (3-4 times) slower than the regular calls. We are still dealing with less than 1s for 50k calls in total, so it is negligible when used on a small scale. However, if your intention is to build your entire code around magic methods, you will want to profile the final application to see if it's still negligible.
So much for the rather uninteresting performance aspect. Now let's take a look at what you consider "not that important". I'm going to stress that because it actually is much more important than the performance aspect.
Regarding Uneeded Added Complexity you write
it doesn't really increase my app complexity
Of course it does. You can easily spot it by looking at the nesting depth of your code. Good code stays to the left. Your if/switch/case/if is four levels deep. This means there is more possible execution pathes and that will lead to a higher Cyclomatic Complexity, which means harder to maintain and understand.
Here is numbers for your class A (w\out the regular Getter. Output is shortened from PHPLoc):
Lines of Code (LOC): 19
Cyclomatic Complexity / Lines of Code: 0.16
Average Method Length (NCLOC): 18
Cyclomatic Complexity / Number of Methods: 4.00
A value of 4.00 means this is already at the edge to moderate complexity. This number increases by 2 for every additional case you put into your switch. In addition, it will turn your code into a procedural mess because all the logic is inside the switch/case instead of dividing it into discrete units, e.g. single Getters.
A Getter, even a lazy loading one, does not need to be moderately complex. Consider the same class with a plain old PHP Getter:
class Foo
{
protected $bar;
public function getBar()
{
// Lazy Initialization
if ($this->bar === null) {
$this->bar = new Bar;
}
return $this->bar;
}
}
Running PHPLoc on this will give you a much better Cyclomatic Complexity
Lines of Code (LOC): 11
Cyclomatic Complexity / Lines of Code: 0.09
Cyclomatic Complexity / Number of Methods: 2.00
And this will stay at 2 for every additional plain old Getter you add.
Also, take into account that when you want to use subtypes of your variant, you will have to overload __get and copy and paste the entire switch/case block to make changes, while with a plain old Getter you simply overload the Getters you need to change.
Yes, it's more typing work to add all the Getters, but it is also much simpler and will eventually lead to more maintainable code and also has the benefit of providing you with an explicit API, which leads us to your other statement
I specifically want my API to be "isolated"; The documentation should tell others how to use it :P
I don't know what you mean by "isolated" but if your API cannot express what it does, it is poor code. If I have to read your documentation because your API does not tell me how I can interface with it by looking at it, you are doing it wrong. You are obfuscating the code. Declaring properties in an array instead of declaring them at the class level (where they belong) forces you to write documentation for it, which is additional and superfluous work. Good code is easy to read and self documenting. Consider buying Robert Martin's book "Clean Code".
With that said, when you say
the only reason is the api beauty;
then I say: then don't use __get because it will have the opposite effect. It will make the API ugly. Magic is complicated and non-obvious and that's exactly what leads to those WTF moments:
To come to an end now:
i don't see any real disadvantages, I guess it's worth it
You hopefully see them now. It's not worth it.
For additional approaches to Lazy Loading, see the various Lazy Loading patterns from Martin Fowler's PoEAA:
There are four main varieties of lazy load. Lazy Initialization uses a special marker value (usually null) to indicate a field isn't loaded. Every access to the field checks the field for the marker value and if unloaded, loads it. Virtual Proxy is an object with the same interface as the real object. The first time one of its methods are called it loads the real the object and then delegates. Value Holder is an object with a getValue method. Clients call getValue to get the real object, the first call triggers the load. A ghost is the real object without any data. The first time you call a method the ghost loads the full data into its fields.
These approaches vary somewhat subtly and have various trade-offs. You can also use combination approaches. The book contains the full discussion and examples.
If your capitalization of the class names and the key names in $prop matched, you could do this:
class Dummy {
private $props = array(
'Someobject' => false,
//etc...
);
function __get($name){
if(isset($this->props[$name])){
if(!($this->props[$name] instanceof $name)) {
$this->props[$name] = new $name();
}
return $this->props[$name];
}
//trigger_error('property doesnt exist');
//Make exceptions, not war
throw new Exception('Property doesn\'t exist');
}
}
And even if the capitalization didn't match, as long as it followed the same pattern it could work. If the first letter was always capitalized you could use ucfirst() to get the class name.
EDIT
It's probably just better to use plain methods. Having a switch inside a getter, especially when the code executed for each thing you try to get is different, practically defeats the purpose of the getter, to save you from having to repeat code. Take the simple approach:
class Something {
private $props = array('Someobject' => false);
public getSomeobject() {
if(!($this->props['Someobject'] instanceof Someobject)) {
//Instantiate and do extra stuff
}
return $this->props['Someobject'];
}
public getSomeOtherObject() {
//etc..
}
}
I'm using __get() to make some of my properties "dynamic" (initialize them only when requested). These "fake" properties are stored inside a private array property, which I'm checking inside __get.
Anyway, do you think it's better idea to create methods for each of these proprties instead of doing it in a switch statement?
The way you ask your question I don't think it is actually about what anybody thinks. To talk about thoughts, first of all it must be clear which problem you want to solve here.
Both the magic _get as well as common getter methods help to provide the value. However, what you can not do in PHP is to create a read-only property.
If you need to have a read-only property, you can only do that with the magic _get function in PHP so far (the alternative is in a RFC).
If you are okay with accessor methods, and you are concerned about typing methods' code, use a better IDE that does that for you if you are really concerned about that writing aspect.
If those properties just do not need to be concrete, you can keep them dynamic because a more concrete interface would be a useless detail and only make your code more complex than it needs to be and therefore violates common OO design principles.
However, dynamic or magic can also be a sign that you do something wrong. And also hard to debug. So you really should know what you are doing. That needs that you make the problem you would like to solve more concrete because this heavily depends on the type of objects.
And speed is something you should not test isolated, it does not give you good suggestions. Speed in your question sounds more like a drug ;) but taking that drug won't give you the power to decide wisely.
Using __get() is said to be a performance hit. Therefore, if your list of parameters is static/fixed and not terribly long, it would be better performance-wise to make methods for each and skip __get(). For example:
public function someobject() {
if(!($this->props[$name] instanceof Someobject))
$this->props[$name] = new Someobject;
// do stuff to initialize someobject
}
if (count($argv = func_get_args())) {
// do stuff to SET someobject from $a[0]
}
return $this->props['someobject'];
}
To avoid the magic methods, you'd have to alter the way you use it like this
$bar = $foo->someobject; // this won't work without __get()
$bar = $foo->someobject(); // use this instead
$foo->someobject($bar); // this is how you would set without __set()
EDIT
Edit, as Alex pointed out, the performance hit is millisecond small. You can try both ways and do some benchmarks, or just go with __get since it's not likely to have a significant impact on your application.

Creating graph with php

I wanna create and store graph in php. I have bus schedule, so I decided to create 2 classes:
class Vertex
{
public $city_id;
public $time;
}
class Edge
{
public routeId;
public end_vertex;
}
after this I'm trying to fill my graph. It should be something like hashtable where key will be Vertex object and it'll have many edges.
prototype example:
foreach ($data as $route)
{
$v = new Vertex($route->startCity, $route->startTime)
if(!graph[$v]) {
graph[$v] = [];
}
graph[$v].add(new Edge($route->routeId, new Vertex($route->city_id, $route->startTime + $route->arrivalTime)));
}
but there is one really big problem, as I understand object cannot be used as array key! Maybe I'm in a wrong way? How to create graphs correctly in php? I'm a newbie in this.
In PHP, only simple types can be used as array indices. Complex types, like arrays, objects and resources do not work properly.
Edit: Oh, if memory serves me right, you should watch out for booleans as well, I seem to recollect an issue I had with them.
Edit2: In your case, the object graph should be pointing at the objects, not an array.
So, for example, your code would look like:
$v = new Vertex();
$v->add(new Edge());
$vertices[] = $v;
Edit3: I noticed some serious syntactic flaws in your code. I don't know the exact reason, but if you really can't get them straight, I would advice that you give the PHP manual a couple of looks.
Edit4: By the way, you are using an object as an array index, not a class. There is no PHP data type for classes, there is only class names, which are plain strings.
See my answer here PHP approach to python's magic __getattr__() and combine it with the __toString() method.
BUT I would off-load this kind of stuff to something like gearman, if it's something more complex.
AND there's a library too http://nodebox.net/code/index.php/Graph

Objects to strings, unique keys in PHP

I was reading around about the Observer pattern, and found a dated article. Having read through, I noticed an interesting mention in this paragraph:
The key methods to look at here are attach(), detach(), and notify(). attach() and detach() handle adding and removing observers. We use a little trick here. Objects quoted in string context resolve to a unique identifier (even if __toString() is defined). You can use this fact to build keys for an associative array. The notify() method cycles through all attached observers, calling update() on each. The UploadManager class calls notify() whenever it has something important to report on upload and on error, in this case.
Which references this example:
function attach(UploadObserver $obs) {
$this->observers["$obs"] = $obs;
}
Now as mentioned, this article is dated. Casting objects to strings of course no longer works in this manner (I run 5.3.6 on my dev box, and push it for all client projects) but I'd like to achieve similar functionality. I can only think of (something like) this:
function attach(Observer $observer){
$this->_observers[md5(serialize($observer))] = $observer;
}
function detach(Observer $observer){
unset($this->_observers[md5(serialize($observer))]);
}
I'm curious, are there any other efficient ways to achieve this; creating a unique key from the object itself.
Caveat: I don't want to get into defined keys, I use those often enough with other repositories and such, implementing __set($key, $value), etc.
Note: I understand MD5 isn't ideal.
Update: Just found spl_object_hash, and I assume this is likely my best choice, however feel free to share your thoughts.
You're right that does not work that way any longer. You might want to use some other function instead: spl_object_hash()
function attach(Observer $observer){
$this->_observers[spl_object_hash($observer)] = $observer;
}
function detach(Observer $observer){
unset($this->_observers[spl_object_hash($observer)]);
}
The serialization based approach has a design problem btw: I stops working when objects are identical by value or in other words if objects return the same serialized value, e.g. NULL. This is fully controllable by the objects themselves when they implement the Serializable interface.
Have you tried the SPL object hash function?
Alternatively you could use SplObjectStorage directly.
Like:
function __construct(...){
$this->_observers = new SplObjectStorage;
}
function attach(Observer $observer) {
$this->_observers[$observer] = $observer;
}
function detach(Observer $observer){
unset($this->_observers[$observer]);
}

Value objects vs associative arrays in PHP

(This question uses PHP as context but isn't restricted to PHP only. e.g. Any language with built in hash is also relevant)
Let's look at this example (PHP):
function makeAFredUsingAssoc()
{
return array(
'id'=>1337,
'height'=>137,
'name'=>"Green Fred");
}
Versus:
class Fred
{
public $id;
public $height;
public $name;
public function __construct($id, $height, $name)
{
$this->id = $id;
$this->height = $height;
$this->name = $name;
}
}
function makeAFredUsingValueObject()
{
return new Fred(1337, 137, "Green Fred");
}
Method #1 is of course terser, however it may easily lead to error such as
$myFred = makeAFredUsingAssoc();
return $myFred['naem']; // notice teh typo here
Of course, one might argue that $myFred->naem will equally lead to error, which is true. However having a formal class just feels more rigid to me, but I can't really justify it.
What would be the pros/cons to using each approach and when should people use which approach?
Under the surface, the two approaches are equivalent. However, you get most of the standard OO benefits when using a class: encapsulation, inheritance, etc.
Also, look at the following examples:
$arr['naem'] = 'John';
is perfectly valid and could be a difficult bug to find.
On the other hand,
$class->setNaem('John');
will never work.
A simple class like this one:
class PersonalData {
protected $firstname;
protected $lastname;
// Getters/setters here
}
Has few advantages over an array.
There is no possibility to make some typos. $data['firtsname'] = 'Chris'; will work while $data->setFirtsname('Chris'); will throw en error.
Type hinting: PHP arrays can contain everything (including nothing) while well defined class contains only specified data.
public function doSth(array $personalData) {
$this->doSthElse($personalData['firstname']); // What if "firstname" index doesn't exist?
}
public function doSth(PersonalData $personalData) {
// I am guaranteed that following method exists.
// In worst case it will return NULL or some default value
$this->doSthElse($personalData->getFirstname());
}
We can add some extra code before set/get operations, like validation or logging:
public function setFirstname($firstname) {
if (/* doesn't match "firstname" regular expression */) {
throw new InvalidArgumentException('blah blah blah');
}
if (/* in debbug mode */) {
log('Firstname set to: ' . $firstname);
}
$this->firstname = $firstname;
}
We can use all the benefits of OOP like inheritance, polymorphism, type hinting, encapsulation and so on...
As mentioned before all of our "structs" can inherit from some base class that provides implementation for Countable, Serializable or Iterator interfaces, so our structs could use foreach loops etc.
IDE support.
The only disadvantage seems to be speed. Creation of an array and operating on it is faster. However we all know that in many cases CPU time is much cheaper than programmer time. ;)
After thinking about it for some time, here's my own answer.
The main thing about preferring value objects over arrays is clarity.
Consider this function:
// Yes, you can specify parameter types in PHP
function MagicFunction(Fred $fred)
{
// ...
}
versus
function MagicFunction(array $fred)
{
}
The intent is clearer. The function author can enforce his requirement.
More importantly, as the user, I can easily look up what constitutes a valid Fred. I just need to open Fred.php and discover its internals.
There is a contract between the caller and the callee. Using value objects, this contract can be written as syntax-checked code:
class Fred
{
public $name;
// ...
}
If I used an array, I can only hope my user would read the comments or the documentation:
// IMPORTANT! You need to specify 'name' and 'age'
function MagicFunction(array $fred)
{
}
Depending on the UseCase I might use either or. The advantage of the class is that I can use it like a Type and use Type Hints on methods or any introspection methods. If I just want to pass around some random dataset from a query or something, I'd likely use the array. So I guess as long as Fred has special meaning in my model, I'd use a class.
On a sidenote:
ValueObjects are supposed to be immutable. At least if you are refering to Eric Evan's definition in Domain Driven Design. In Fowler's PoEA, ValueObjects do not necessarily have to be immutable (though it is suggested), but they should not have identity, which is clearly the case with Fred.
Let me pose this question to you:
What's so different about making a typo like $myFred['naem'] and making a typo like $myFred->naem? The same issue still exists in both cases and they both error.
I like to use KISS (keep it simple, stupid) when I program.
If you are simply returning a subset of a query from a method, simply return an array.
If you are storing the data as a public/private/static/protected variable in one of your classes, it would be best to store it as a stdClass.
If you are going to later pass this to another class method, you might prefer the strict typing of the Fred class, i.e. public function acceptsClass(Fred $fredObj)
You could have just as easily created a standard class as opposed to an array if it is to be used as a return value. In this case you could care less about strict typing.
$class = new stdClass();
$class->param = 'value';
$class->param2 = 'value2';
return $class;
A pro for the hash: It is able to handle name-value combinations which are unknown at design time.
When the return value represents an entity in your application, you should use an object, as this is the purpose of OOP. If you just want to return a group of unrelated values then it's not so clear cut. If it's part of a public API, though, then a declared class is still the best way to go.
Honestly, I like them both.
Hash arrays are way faster than making objects, and time is money!
But, JSON doesn't like hash arrays (which seems a bit like OOP OCD).
Maybe for projects with multiple people, a well-defined class would be better.
Hash arrays might take more CPU time and memory (an object has a predefined amount), though its hard to be sure for every scenario.
But what really sucks is thinking about which one to use too much. Like I said, JSON doesn't like hashes. Oops, I used an array. I got to change a few thousand lines of code now.
I don't like it, but it seems that classes are the safer way to go.
The benefit of a proper Value Object is that there's no way to actually make an invalid one and no way to change one that exists (integrity and "immutability"). With only getters and type hinting parameters, there's NO WAY to screw it up in compilable code, which you can obviously easily do with malleable arrays.
Alternatively you could validate in a public constructor and throw an exception, but this provides a gentler factory method.
class Color
{
public static function create($name, $rgb) {
// validate both
if ($bothValid) {
return new self($name, $rgb);
} else {
return false;
}
}
public function getName() { return $this->_name; }
public function getRgb() { return $this->_rgb; }
protected function __construct($name, $rgb)
{
$this->_name = $name;
$this->_rgb = $rgb;
}
protected $_name;
protected $_rgb;
}
I have worked with OOP Languages over 10 years.
If you understand the way objects work you will love it.
Inheritance, Polymorphism, Encapsulation, Overloading are the key advantage of OOP.
On the other hand when we talk about PHP we have to consider that PHP isn't a full featured Object Oriented language.
For example we cant use method overloading or constructor overloading (straightforward).
Associative arrays in PHP is a VERY nice feature but i think that harms php enterprise applications.
When you write code you want to get clean and maintainable application.
Another think that you loose with Associative arrays is that you can't use intellisense.
So i think if you want to write cleanner and more maintainable code you have to use the OOP features when it is provided.
I prefer to have hard-coded properties like in your second example. I feel like it more clearly defines the expected class structure (and all possible properties on the class). As opposed to the first example which boils down to just always remembering to use the same key names. With the second you can always go back and look at the class to get an idea of the properties just by looking at the top of the file.
You'll better know you're doing something wrong with the second one -- if you try to echo $this->doesntExist you'll get an error whereas if you try to echo array['doesntExist'] you won't.

Categories