Objects with recursive properties in PHP

I'm using a class, moduleSelectQuery, to generate SQL queries in PHP.
Basically, this class breaks up the individual components of a SQL SELECT query, such as the table name, the fields to select, the WHERE conditions, etc.
However, this quickly becomes complicated with nested queries, such as a WHERE table1.field1 IN (SELECT table2.field2 FROM table2 WHERE table2.field3 = criteria) clause.
Currently I have a property for moduleSelectQuery called $inWhereClause that is used to store the WHERE... IN(SELECT...) clause. Like the other properties (i.e. $tableName, $whereClause, $havingClause), this is parsed together by its own function based on user input.
However, this parsing function is fundamentally limited. Even if I put as much effort into it as I do into parsing the $whereClause property, it can't handle additional nested SELECT statements.
I think one way to do this would be to set $inWhereClause to be another moduleSelectQuery object. This would mean that the parent moduleSelectQuery would have a property which was itself a moduleSelectQuery, i.e. it would be a recursive object. Is this possible/good practice in PHP? Are there other drawbacks?

I could see this being a possible solution. Visualize having a Person class. A person might have a child, which is a person. Doesn't seem unreasonable.
I wouldn't necessarily call this a recursive object, as the object doesn't reference itself, but rather that an object can have a property that is an instance of the same class.
It would obviously need to be nested appropriately: only sub-queries in the WHERE clause would need to be moduleSelectQuery objects, while simple comparisons would remain strings or integers.
The only drawbacks that can arise from this approach are in the design of the class and its methods. I do not see any glaring performance or maintainability issues.
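A minimal sketch of the idea, assuming a much simplified stand-in for moduleSelectQuery (the class name SelectQuery and its methods are hypothetical, not the asker's actual API). A condition value is either a scalar or another query object, and rendering simply recurses into sub-queries:

```php
<?php
// Hypothetical, simplified stand-in for moduleSelectQuery.
class SelectQuery
{
    private $table;
    private $fields;
    private $conditions = array();

    public function __construct($table, array $fields)
    {
        $this->table = $table;
        $this->fields = $fields;
    }

    // Simple comparison against a scalar value.
    public function whereEquals($field, $value)
    {
        $this->conditions[] = array($field, '=', $value);
        return $this;
    }

    // WHERE ... IN (sub-query): the value is another SelectQuery.
    public function whereIn($field, SelectQuery $subQuery)
    {
        $this->conditions[] = array($field, 'IN', $subQuery);
        return $this;
    }

    // Renders the SQL text and collects bound values in order.
    public function toSql(array &$params)
    {
        $sql = 'SELECT ' . implode(', ', $this->fields) . ' FROM ' . $this->table;
        $parts = array();
        foreach ($this->conditions as $condition) {
            list($field, $operator, $value) = $condition;
            if ($value instanceof SelectQuery) {
                // Recursion: the nested query renders itself.
                $parts[] = $field . ' IN (' . $value->toSql($params) . ')';
            } else {
                $parts[] = $field . ' ' . $operator . ' ?';
                $params[] = $value;
            }
        }
        if ($parts) {
            $sql .= ' WHERE ' . implode(' AND ', $parts);
        }
        return $sql;
    }
}

$inner = new SelectQuery('table2', array('table2.field2'));
$inner->whereEquals('table2.field3', 'criteria');

$outer = new SelectQuery('table1', array('table1.field1'));
$outer->whereIn('table1.field1', $inner);

$params = array();
$sql = $outer->toSql($params);
```

Because the nested value is itself a SelectQuery, it can contain further sub-queries to any depth without extra parsing code.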

Related

Object Orientated Design with Databases and scalability/optimisation using PHP and mySQL

I'm currently at an impasse with regard to the structural design of my website. At the moment I'm using objects to simplify the structure of my site (I have a person object, a party object, a position object, etc.) and in theory each of these is a row from its respective table in the database.
Now from what I've learnt, OO Design is good for keeping things simple and easy to use/implement, which I agree with - it makes my code look so much cleaner and easier to maintain, but what I'm confused about is how I go about linking my objects to the database.
Let's say there is a person page. I create a person object, which equals one mysql query (which is reasonable), but then that person might have multiple positions which I need to fetch and display on a single page.
What I am currently doing is using a method called getPositions from the person object which gets the data from mysql and creates a separate position object for each row, passing in the data as an array. That keeps the queries down to a minimum (2 to a page) but it seems like a horrible implementation and to me, breaks the rules of object orientated design (should I want to change a mysql row, I'd need to change it in multiple places) but the alternative is worse.
In this case the alternative is just getting the IDs that I need and then creating separate positions, passing in the ID, which then goes on to fetch the row from the database in the constructor. If you have 20 positions per page, it can quickly add up, and I've read about how much WordPress is criticised for its high number of queries per page and its CPU usage. The other thing I'll need to consider in this case is sorting, and doing it this way means I'll need to sort the data using PHP, which surely can't be as efficient as doing it natively in MySQL.
Of course, pages will be (and can be) cached, but to me, this seems almost like cheating for poorly built applications. In this case, what is the correct solution?
The way you're doing it now is at least on the right track. Having an array in the parent object with references to the children is basically how the data is represented in the database.
I'm not completely sure from your question if you're storing the children as references in the parent's array, but you should be and that's how PHP should store them by default. If you also use a singleton pattern for your objects that are pulled from the database, you should never need to modify multiple objects to change one row as you suggest in your question.
You should probably also create multiple constructors for your objects (using static methods that return new instances) so you can create them from their ID and have them pull the data, or just create them from data you already have. The latter case would be used when you're creating children; you can have the parent pull all of the data for its children and create all of them using only one query. Getting a child from its ID will probably be used somewhere else, so it's good to have in case it's needed.
For sorting, you could create additional private (or public if you want) arrays that have the children sorted in a particular way with references to the same objects the main array references.
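The points above could look roughly like this. It is only a sketch: the names fromRow(), loadPositions() and getPositionsByTitle() are made up for illustration, and the array of rows stands in for the result of the single query the parent would run.

```php
<?php
// Hypothetical classes; the rows stand in for the result of one SELECT.
class Position
{
    public $id;
    public $title;

    private function __construct($id, $title)
    {
        $this->id = $id;
        $this->title = $title;
    }

    // Named constructor: build from data the caller already has.
    public static function fromRow(array $row)
    {
        return new self((int) $row['id'], $row['title']);
    }
}

class Person
{
    private $positions = array();

    // The parent builds all children from one result set.
    public function loadPositions(array $rows)
    {
        foreach ($rows as $row) {
            $this->positions[] = Position::fromRow($row);
        }
    }

    public function getPositions()
    {
        return $this->positions;
    }

    // A second array holding the same objects in a different order.
    public function getPositionsByTitle()
    {
        $sorted = $this->positions;
        usort($sorted, function (Position $a, Position $b) {
            return strcmp($a->title, $b->title);
        });
        return $sorted;
    }
}

$person = new Person();
$person->loadPositions(array(
    array('id' => 1, 'title' => 'Treasurer'),
    array('id' => 2, 'title' => 'Chair'),
));
$byTitle = $person->getPositionsByTitle();
$original = $person->getPositions();
```

The sorted array holds the very same Position objects as the main array, so a change made through one is visible through the other. A fromId() counterpart that runs its own query would sit next to fromRow().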

OOP : One class for database table, one class for database table row

Currently, I build two classes for each database table. For instance, if I have the table person, I will have the classes Person_List and Person.
Design-wise, is it better
for Person_List to output an array of Person; or
for it to output an array containing arrays of rows in the table.
Performance-wise, which is better?
I believe the best option, design-wise and taking performance into account, would be the following (if you insist on a Person_List class to represent the table and a Person class to represent a single record):
use Iterator interface for Person_List class, so you can iterate through the table without the need to pull all the records at once (it should be significant performance gain in some cases),
additionally use Countable interface for Person_List class, so you are able to count all the results if necessary by getting count directly from database,
This should give you flexibility and allow you to use Person_List class objects similarly as arrays.
If you still have problems employing these two interfaces, here is some explanation:
every time you do foreach ($table as $record) (where $table is an instance of Person_List), the current() method of the Person_List class is invoked (because it is part of the Iterator interface), and it should return an object of the Person class, e.g. by using mysql_fetch_object() (mysqli_fetch_object() in current PHP, since the mysql_* functions were removed in PHP 7);
when you call count($table) (where $table is an instance of Person_List), the count() method of the Person_List class is invoked, which in turn can use e.g. mysql_num_rows() to return the number of results instead of pulling them all from the database and counting them (again a significant performance gain).
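A minimal sketch of such a class, assuming PHP 7.1 or later. To stay self-contained it iterates over an in-memory array of rows; that array is only a stand-in for a database result, and the Person constructor shown is an assumption.

```php
<?php
class Person
{
    public $name;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Hypothetical sketch: $rows stands in for a database result set.
class Person_List implements Iterator, Countable
{
    private $rows;
    private $position = 0;

    public function __construct(array $rows)
    {
        $this->rows = array_values($rows);
    }

    // Called by foreach for each element; builds one Person at a time.
    public function current(): Person
    {
        return new Person($this->rows[$this->position]['name']);
    }

    public function key(): int
    {
        return $this->position;
    }

    public function next(): void
    {
        $this->position++;
    }

    public function rewind(): void
    {
        $this->position = 0;
    }

    public function valid(): bool
    {
        return isset($this->rows[$this->position]);
    }

    // Called by count($table); a database-backed version could ask the
    // database for the row count instead of loading the rows.
    public function count(): int
    {
        return count($this->rows);
    }
}

$table = new Person_List(array(array('name' => 'Ann'), array('name' => 'Bob')));
$names = array();
foreach ($table as $record) {
    $names[] = $record->name;
}
```

In a database-backed version, current() would fetch the next row from the result handle, so only one Person exists in memory at a time.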
It really depends on what you are doing with the records. Accessing columns through arrays shouldn't be much (if at all) faster than through objects, and certainly not enough to justify avoiding objects.
Arrays are lighter (smaller in memory, especially) than objects, but if you use Iterator like @Tadeck mentions, this shouldn't be an issue, as you'd only have one instance in memory at a time.
In summary, objects are almost always a better design (from an interface standpoint), however, if you are not sure from a performance standpoint, benchmark the candidate implementations. If the difference isn't noticeable enough, use objects.
You tagged this OOP so I guess you want to work with objects. In that case you'd want to have it return PersonRow objects, i.e. objects that represent a row in the db table. Have a look at
RowData Gateway Pattern
description of the Row Data Gateway Pattern
You should not worry about performance. Come up with a solid design that is readable and maintainable. Only bother about performance once you have put your design into action, profiled it, and found it doesn't meet performance requirements.

Why return object instead of array?

I do a lot of work in WordPress, and I've noticed that far more functions return objects than arrays. Database results are returned as objects unless you specifically ask for an array. Errors are returned as objects. Outside of WordPress, most APIs give you an object instead of an array.
My question is, why do they use objects instead of arrays? For the most part it doesn't matter too much, but in some cases I find objects harder to not only process but to wrap my head around. Is there a performance reason for using an object?
I'm a self-taught PHP programmer. I've got a liberal arts degree. So forgive me if I'm missing a fundamental aspect of computer science. ;)
These are the reasons why I prefer objects in general:
Objects not only contain data but also functionality.
Objects have (in most cases) a predefined structure. This is very useful for API design. Furthermore, you can set properties as public, protected, or private.
Objects better fit object oriented development.
In most IDEs auto-completion only works for objects.
Here is something to read:
Object Vs. Array in PHP
PHP stdClass: Storing Data in an Object Instead of an Array
When should I use stdClass and when should I use an array in php5 oo code
PHP Objects vs Arrays
Mysql results in PHP - arrays or objects?
PHP objects vs arrays performance myth
A Set of Objects in PHP: Arrays vs. SplObjectStorage
Better Object-Oriented Arrays
This probably isn't something you are going to deeply understand until you have worked on a large software project for several years. Many fresh computer science majors will give you an answer with all the right words (encapsulation, functionality with data, and maintainability) but few will really understand why all that stuff is good to have.
Let's run through a few examples.
If arrays were returned, then either all of the values need to be computed up front, or lots of little values need to be returned from which you can build the more complex values.
Think about an API method that returns a list of WordPress posts. These posts all have authors, authors have names, e-mail address, maybe even profiles with their biographies.
If you are returning all of the posts in an array, you'll either have to limit yourself to returning an array of post IDs:
[233, 41, 204, 111]
or returning a massive array that looks something like:
[
    ['title' => 'somePost', 'body' => 'blah blah', 'author' => ['name' => 'billy', 'email' => 'bill@bill.com', 'profile' => ['interests' => ['interest1', 'interest2', ...], 'bio' => 'info...']]],
    ['id' => '2', .....],
]
The first case of returning a list of IDs isn't very helpful to you because then you need to make an API call for each ID in order to get some information about that post.
The second case will pull way more information than you need 90% of the time and be doing way more work (especially if any of those fields is very complicated to build).
An object on the other hand can provide you with access to all the information you need, but not have actually pulled that information yet. Determining the values of fields can be done lazily (that is, when the value is needed and not beforehand) when using an object.
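To illustrate the lazy case, here is a small sketch. The Post class, its getAuthor() method and the loader callback are all hypothetical; the callback stands in for whatever database or API lookup would really fetch the author.

```php
<?php
// Hypothetical Post class; $authorLoader stands in for a database lookup.
class Post
{
    public $title;
    private $authorLoader;
    private $author = null;

    public function __construct($title, callable $authorLoader)
    {
        $this->title = $title;
        $this->authorLoader = $authorLoader;
    }

    // The author is fetched on first access, then cached.
    public function getAuthor()
    {
        if ($this->author === null) {
            $this->author = call_user_func($this->authorLoader);
        }
        return $this->author;
    }
}

$lookups = 0;
$post = new Post('somePost', function () use (&$lookups) {
    $lookups++; // counts how often the expensive lookup runs
    return array('name' => 'billy');
});

$before = $lookups;                 // nothing has been loaded yet
$name = $post->getAuthor()['name']; // first access triggers the lookup
$post->getAuthor();                 // second access uses the cached value
```

Code that only prints post titles never pays for the author lookup at all.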
Arrays expose more data and capabilities than intended
Go back to the example of the massive array being returned. Now someone may likely build an application that iterates over each value inside the post array and prints it. If the API is updated to add just one extra element to that post array then the application code is going to break since it will be printing some new field that it probably shouldn't. If the order of items in the post array returned by the API changes, that will break the application code as well. So returning an array creates all sorts of possible dependencies that an object would not create.
Functionality
An object can hold information inside of it that will allow it to provide useful functionality to you. A post object, for instance, could be smart enough to return the previous or next posts. An array couldn't ever do that for you.
Flexibility
All of the benefits of objects mentioned above help to create a more flexible system.
My question is, why do they use objects instead of arrays?
Probably a few reasons:
WordPress is quite old
arrays are faster and take less memory in most cases
easier to serialize
Is there a performance reason for using an object?
No. But a lot of good other reasons, for example:
you may store logic in the objects (methods, closures, etc.)
you may force object structure using an interface
better autocompletion in IDE
you don't get notices for undefined array keys
in the end, you may easily convert any object to array
OOP != AOP :)
(For example, in Ruby, everything is an object. PHP was procedural/scripting language previously.)
WordPress (and a fair amount of other PHP applications) use objects rather than arrays, for conceptual, rather than technical reasons.
An object (even if just an instance of stdClass) is a representation of one thing. In WordPress that might be a post, a comment, or a user. An array on the other hand is a collection of things. (For example, a list of posts.)
Historically, PHP hasn't had great object support so arrays became quite powerful early on. (For example, the ability to have arbitrary keys rather than just being zero-indexed.) With the object support available in PHP 5, developers now have a choice between using arrays or objects as key-value stores. Personally, I prefer the WordPress approach as I like the syntactic difference between 'entities' and 'collections' that objects and arrays provide.
My question is, why do they (Wordpress) use objects instead of arrays?
That's really a good question and not easy to answer. I can only assume that it's common in Wordpress to use stdClass objects because they're using a database class that by default returns records as a stdClass object. They got used to it (8 years and more) and that's it. I don't think there is much more thought behind the simple fact.
syntactic sugar for associative arrays
-- Zeev Suraski about the standard object since PHP 3
stdClass objects are not really better than arrays. They are pretty much the same. That's partly for historical reasons of the language, and partly because stdClass objects are really limited and are only value objects in a very basic sense.
stdClass objects store values for their members like an array does per entry. And that's it.
Only PHP freaks are able to create stdClass objects with private members. There is not much benefit - if any - doing so.
stdClass objects do not have any methods/functions. So no use of that in Wordpress.
Compared with arrays, there are far fewer helpful functions for dealing with a list or semi-structured data.
However, if you're used to arrays, just cast:
$array = (array) $object;
And you can access the data previously being an object, as an array. Or you like it the other way round:
$object = (object) $array;
Which will only drop invalid member names, like numbers. So take a little care. But I think you get the big picture: There is not much difference as long as it is about arrays and objects of stdClass.
Related:
Converting to object PHP Manual
Reserved Classes PHP Manual
What is stdClass in PHP?
The code looks cooler that way
Objects are passed by handle, so changes made inside a function are visible to the caller (much like passing by reference)
Objects are more strongly typed than arrays, hence less prone to errors (or they give you a meaningful error message when you try to use a non-existent member)
All the IDEs today have auto-complete, so when working with defined objects, the IDE does a lot for you and speeds things up
Easily encapsulate logic and data in the same box, whereas with arrays you store the data in the array and then use a set of different functions to process it.
Inheritance: if you had a similar array with almost but not quite the same functionality, you would have to duplicate more code than if you did it with objects
Probably some more reasons I haven't thought about
Objects are much more powerful than arrays can be.
Each object as an instance of a class can have functions attached.
If you have data that need processing then you need a function that does the processing.
With an array you would have to call that function on that array and therefore associate the logic yourself to the data.
With an object this association is already done and you don't have to care about it any more.
Also you should consider the OO principle of information hiding. Not everything that comes back from or goes to the database should be directly accessible.
There are several reasons to return objects:
Writing $myObject->property requires fewer "overhead" characters than $myArray['element']
Object can return data and functionality; arrays can contain only data.
Enable chaining: $myobject->getData()->parseData()->toXML();
Easier coding: IDE autocompletion can provide method and property hints for object.
In terms of performance, arrays are often faster than objects. In addition to performance, there are several reasons to use arrays:
The functionality provided by the array_*() family of functions can reduce the amount of coding necessary in some cases.
Operations such as count() and foreach() can be performed on arrays. Objects do not offer this (unless they implement Iterator or Countable).
It's usually not going to be because of performance reasons. Typically, objects cost more than arrays.
For a lot of APIs, it probably has to do with the objects providing other functionality besides being a storage mechanism. Otherwise, it's a matter of preference and there is really no benefit to returning an object vs an array.
An array is just an index of values, whereas an object contains methods which can generate the result for you. Sure, sometimes you can access an object's values directly, but the "right way to do it" is to access an object's methods (functions operating on the values of that object).
$obj = new MyObject;
$obj->getName(); // this calls a method (function), so it can decide what to return based on conditions or other criteria
$array['name']; // this is just the string "name". there is no logic to it.
Sometimes you access an object's variables directly. This is usually frowned upon, but it still happens quite often.
$obj->name; // accessing the string "name" ... not really different from an array in this case.
However, consider that the MyObject class doesn't have a variable called 'name', but instead has a first_name and last_name variable.
$obj->getName(); // this would return first_name and last_name joined.
$obj->name; // would fail...
$obj->first_name;
$obj->last_name; // would be accessing the variables of that object directly.
This is a very simple example, but you can see where this is going. A class provides a collection of variables and the functions which can operate on those variables all within a self-contained logical entity. An instance of that entity is called an object, and it introduces logic and dynamic results, which an array simply doesn't have.
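For completeness, a class matching that description might look like the sketch below. The constructor and the space used to join the two names are assumptions; the answer only says the names are "joined".

```php
<?php
// Hypothetical definition of the MyObject class used above.
class MyObject
{
    public $first_name;
    public $last_name;

    public function __construct($first_name, $last_name)
    {
        $this->first_name = $first_name;
        $this->last_name = $last_name;
    }

    // Callers ask for a name; how it is assembled stays internal.
    public function getName()
    {
        return $this->first_name . ' ' . $this->last_name;
    }
}

$obj = new MyObject('Ada', 'Lovelace');
$fullName = $obj->getName();
```

If the class later stores a single name field again, getName() can change internally and callers are unaffected.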
Most of the time objects are just as fast as arrays, if not faster; in PHP there isn't a noticeable difference. The main reason is that objects are more powerful than arrays. Object orientated programming allows you to create objects that store not only data but also functionality. For example, in PHP the MySQLi class gives you a database object that you can manipulate using a host of built-in methods, rather than the procedural approach.
So the main reason is that OOP is an excellent paradigm. I wrote an article about why using OOP is a good idea, and explaining the concept, you can take a look here: http://tomsbigbox.com/an-introduction-to-oop/
As a minor plus you also type less to get data from an object - $test->data is better than $test['data'].
I'm unfamiliar with WordPress. A lot of answers here suggest that a strength of objects is their ability to contain functional code. When returning an object from a function/API call, it shouldn't contain utility functions, just properties.
The strength in returning objects is that whatever lies behind the API can change without breaking your code.
Example: You get an array of data with key/value pairs, key representing the DB column. If the DB column gets renamed your code will break.
I ran the following test in PHP 5.3.10 (Windows):
for ($i = 0; $i < 1000000; $i++) {
    $x = array();
    $x['a'] = 'a';
    $x['b'] = 'b';
}
and
for ($i = 0; $i < 1000000; $i++) {
    $x = new stdClass;
    $x->a = 'a';
    $x->b = 'b';
}
Copied from http://atomized.org/2009/02/really-damn-slow-a-look-at-php-objects/comment-page-1/#comment-186961
Calling the function for 10 concurrent users, 10 times each (to obtain an average), gives:
Arrays: 100%
Objects: 214% – 216% (about 2 times slower).
In other words, objects are still painfully slow. OOP keeps things tidy, but it should be used carefully.
What does WordPress do? Both: it uses objects, arrays, and mixtures of objects and arrays. The wpdb class uses the latter (and it is the heart of WordPress).
It follows the boxing and unboxing principle of OOP. While languages such as Java and C# support this natively, PHP does not. However it can be accomplished, to some degree, in PHP, just not elegantly, as the language itself does not have constructs to support it. Having box types in PHP could help with chaining, keeping everything object oriented, and allows for type hinting in method signatures. The downside is overhead and the fact that you now have extra checking to do using the "instanceof" construct. Having a type system is also a plus when using development tools that have intellisense or code assist like PDT. Rather than having to google/bing/yahoo for the method, it exists on the object, and you can use the tool to provide a drop down.
Although the points made about objects being more than just data are valid, since they are usually data and behaviour, there is at least one pattern mentioned in Martin Fowler's "Patterns of Enterprise Application Architecture" that applies to this type of scenario, in which you're transferring data from one system (the application behind the API) to another (your application).
It's the Data Transfer Object: an object that carries data between processes in order to reduce the number of method calls.
So if the question is whether APIs should return a DTO or an array I would say that if the performance cost is negligible then you should choose the option that is more maintainable which I would argue is the DTO option... but of course you also have to consider the skills and culture of the team that is developing your system and the language or IDE support for each of the options.
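As a rough illustration of the DTO option, here is a sketch. The PostDto class is hypothetical, and the column names in fromRow() are assumed for the example rather than taken from any particular API.

```php
<?php
// Hypothetical DTO: a fixed, read-only shape for data crossing the API boundary.
final class PostDto
{
    private $id;
    private $title;

    public function __construct($id, $title)
    {
        $this->id = $id;
        $this->title = $title;
    }

    public function getId()
    {
        return $this->id;
    }

    public function getTitle()
    {
        return $this->title;
    }

    // The only place that knows the column names (assumed here).
    public static function fromRow(array $row)
    {
        return new self((int) $row['ID'], $row['post_title']);
    }
}

$dto = PostDto::fromRow(array('ID' => '7', 'post_title' => 'Hello'));
```

If a column is renamed behind the API, only fromRow() changes; code that calls getTitle() keeps working, which is the maintainability argument made above.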

What does a Data Mapper typically look like?

I have a table called Cat, and a PHP class called Cat. Now I want to make a CatDataMapper class, so that Cat extends CatDataMapper.
I want that Data Mapper class to provide basic functionality for doing ORM, and for creating, editing and deleting Cat.
For that purpose, maybe someone who knows this pattern very well could give me some helpful advice? I feel it would be a little bit too simple to just provide some functions like update(), delete(), save().
I realize a Data Mapper has this problem: First you create the instance of Cat, then initialize all the variables like name, furColor, eyeColor, purrSound, meowSound, attendants, etc.. and after everything is set up, you call the save() function which is inherited from CatDataMapper. This was simple ;)
But now, the real problem: You query the database for cats and get back a plain boring result set with lots of cats data.
PDO features some ORM capability to create Cat instances. Let's say I use that, or let's even say I have a mapDataset() function that takes an associative array. However, as soon as I get my Cat object from a data set, I have redundant data. At the same time, twenty users could pick up the same cat data from the database and edit the cat object, i.e. rename the cat, and save() it, while another user still thinks about setting another furColor. When all of them save their edits, everything is messed up.
Err... ok, to keep this question really short: What's good practice here?
From DataMapper in PoEA
The Data Mapper is a layer of software
that separates the in-memory objects
from the database. Its responsibility
is to transfer data between the two
and also to isolate them from each
other. With Data Mapper the in-memory
objects needn't know even that there's
a database present; they need no SQL
interface code, and certainly no
knowledge of the database schema. (The
database schema is always ignorant of
the objects that use it.) Since it's a
form of Mapper (473), Data Mapper
itself is even unknown to the domain
layer.
Thus, a Cat should not extend CatDataMapper because that would create an is-a relationship and tie the Cat to the Persistence layer. If you want to be able to handle persistence from your Cats in this way, look into ActiveRecord or any of the other Data Source Architectural Patterns.
You usually use a DataMapper when using a Domain Model. A simple DataMapper would just map a database table to an equivalent in-memory class on a field-to-field basis. However, when the need for a DataMapper arises, you usually won't have such simple relationships. Tables will not map 1:1 to your objects. Instead, multiple tables could form one Object Aggregate and vice versa. Consequently, implementing even just the CRUD methods can easily become quite a challenge.
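For the simple field-to-field case, the separation might be sketched like this. CatMapper and its methods are hypothetical names, and an in-memory array stands in for the Cat table so the example runs without a database.

```php
<?php
// Domain object: knows nothing about persistence.
class Cat
{
    private $name;
    private $furColor;

    public function __construct($name, $furColor)
    {
        $this->name = $name;
        $this->furColor = $furColor;
    }

    public function getName()
    {
        return $this->name;
    }

    public function getFurColor()
    {
        return $this->furColor;
    }
}

// Hypothetical mapper; the $rows array stands in for the Cat table.
class CatMapper
{
    private $rows = array();
    private $nextId = 1;

    // Object -> row. Returns the generated id.
    public function insert(Cat $cat)
    {
        $id = $this->nextId++;
        $this->rows[$id] = array(
            'name' => $cat->getName(),
            'fur_color' => $cat->getFurColor(),
        );
        return $id;
    }

    // Row -> object, or null when the id is unknown.
    public function find($id)
    {
        if (!isset($this->rows[$id])) {
            return null;
        }
        $row = $this->rows[$id];
        return new Cat($row['name'], $row['fur_color']);
    }
}

$mapper = new CatMapper();
$id = $mapper->insert(new Cat('Tom', 'grey'));
$found = $mapper->find($id);
```

Cat does not extend anything and has no save() method; swapping the array for SQL would only touch CatMapper.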
Apart from that, it is one of the more complicated patterns (covers 15 pages in PoEA), often used in combination with the Repository pattern among others.
As for your question about multiple users editing the same Cat, that's a common problem called Concurrency. One solution to that would be locking the row, while someone edits it. But like everything, this can lead to other issues.
If you rely on ORMs like Doctrine or Propel, the basic principle is to create a static class that would get the actual data from the database (for instance, Propel would create CatPeer), and the results retrieved by the Peer class would then be "hydrated" into Cat objects.
The hydration process is the process of converting a "plain boring" MySQL result set into nice objects having getters and setters.
So for a retrieve you'd use something like CatPeer::doSelect(). Then for a new object you'd first instantiate it (or retrieve an instance from the DB):
$cat = new Cat();
The insertion would be as simple as doing: $cat->save(); That'd be equivalent to an insert (or an update if the object already exists in the db). The ORM should know how to tell the difference between new and existing objects by using, for instance, the presence or absence of a primary key.
Implementing a Data Mapper is very hard in PHP < 5.3, since you cannot read/write protected/private fields. You have a few choices when loading and saving the objects:
Use some kind of workaround, like serializing the object, modifying its string representation, and bringing it back with unserialize
Make all the fields public
Keep them private/protected, and write mutators/accessors for each of them
The first method may break with a new release and is a very crude hack; the second one is considered a (very) bad practice.
The third option is also considered bad practice, since you should not provide getters/setters for all of your fields, only the ones that need it. Your model gets "damaged" from a pure DDD (domain driven design) perspective, since it contains methods that are only needed because of the persistence mechanism.
It also means that now you have to describe another mapping for the fields -> setter methods, next to the fields -> table columns.
PHP 5.3 introduces the ability to access/change all types of fields, by using reflection:
http://hu2.php.net/manual/en/reflectionproperty.setaccessible.php
With this, you can achieve a true data mapper, because the need to provide mutators for all of the fields has ceased.
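A minimal sketch of reflection-based hydration follows. The hydrateCat() helper and the describe() method are hypothetical; the version check is there because setAccessible() is only needed before PHP 8.1, where reflection can access all properties by default.

```php
<?php
class Cat
{
    private $name;
    private $furColor;

    public function describe()
    {
        return $this->name . ' (' . $this->furColor . ')';
    }
}

// Hypothetical helper: fills private fields from a row without any setters.
function hydrateCat(array $row)
{
    $cat = new Cat();
    foreach ($row as $field => $value) {
        $property = new ReflectionProperty('Cat', $field);
        if (PHP_VERSION_ID < 80100) {
            // Required before PHP 8.1; later versions allow access by default.
            $property->setAccessible(true);
        }
        $property->setValue($cat, $value);
    }
    return $cat;
}

$cat = hydrateCat(array('name' => 'Tom', 'furColor' => 'grey'));
$description = $cat->describe();
```

The Cat class stays free of persistence-only setters; a real mapper would also translate column names to property names.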
PDO features some ORM capability to
create Cat instances. Lets say I use
that, or lets even say I have a
mapDataset() function that takes an
associative array. However, as soon as
I got my Cat object from a data set, I
have redundant data. At the same time,
twenty users could pick up the same
cat data from the database and edit
the cat object, i.e. rename the cat,
and save() it, while another user
still thinks about setting another
furColor. When all of them save their
edits, everything is messed up.
In order to keep track of the state of data, typically an IdentityMap and/or a UnitOfWork would be used to keep track of all the different operations on mapped entities, and at the end of the request cycle all the operations would then be performed.
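The Identity Map half of that can be sketched as follows. CatIdentityMap is a hypothetical name and the loader callback stands in for the code that reads one row by id. Note that this only guarantees one instance per id within a single request; it does not by itself resolve edits made by different users in different requests.

```php
<?php
class Cat
{
    public $id;
    public $name;
}

// Hypothetical identity map: at most one Cat instance per id per request.
class CatIdentityMap
{
    private $loaded = array();
    private $loader;

    // $loader stands in for the code that reads one row by id.
    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function get($id)
    {
        if (!isset($this->loaded[$id])) {
            $this->loaded[$id] = call_user_func($this->loader, $id);
        }
        return $this->loaded[$id];
    }
}

$map = new CatIdentityMap(function ($id) {
    $cat = new Cat();
    $cat->id = $id;
    $cat->name = 'Tom';
    return $cat;
});

$first = $map->get(1);
$second = $map->get(1);
$first->name = 'Felix'; // visible through $second as well
```

A Unit of Work would sit on top of this, recording which of the loaded objects changed and writing them out once at the end of the request.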
To keep the answer short:
You have an instance of Cat. (Maybe it extends CatDbMapper, or Cat3rdpartycatstoreMapper)
You call:
$cats = $cat_model->getBlueEyedCats();
//then you get an array of Cat objects, in the $cats array
I don't know what you use; you might take a look at some PHP frameworks for a better understanding.

What do you think of returning select object instead of statement result?

I would like to know whether it is a good idea to return a select object from a method like '$selectObj = getSomethingByName($name)' and then pass it to another method like 'getResult($selectObj)', which will do the trick.
The idea is to be able to pass the select object to any useful function like 'setLimit(10)' or addCriteria('blabla'), depending on my model...
But is it a good idea to do this? It could be insecure, because the user would be able to modify the object himself, and I don't want that.
I used to write simple methods like the one above that returned the result as a row, but that is sometimes painful when you have complex statements depending on different tables.
The problem you are facing (complex statements depending on different tables) is an old and widespread problem with ORM frameworks in general. There are lots of things SQL can do that an ORM doesn't do very well. Inevitably, you have to make up the difference in complexity by writing lots of complicated code in your Controller or your View.
Instead, use a Domain Model pattern and encapsulate the complex multi-table database logic into one place, so your Controllers and Views don't have to know about all the sundry details. They just know about the interface of your Domain Model class, and that class has the sole responsibility to know how to fetch the information from the database.
Remember: a Model "HAS-A" table (or multiple tables) -- instead of Model "IS-A" table.
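A small sketch of the HAS-A arrangement. Everything here is hypothetical: TableGateway, OrderReport, and the orders/customers data are invented for illustration, with arrays standing in for tables.

```php
<?php
// Hypothetical gateway: the array stands in for a table.
class TableGateway
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function all()
    {
        return $this->rows;
    }
}

// The model HAS tables; it is not a table itself.
class OrderReport
{
    private $orders;
    private $customers;

    public function __construct(TableGateway $orders, TableGateway $customers)
    {
        $this->orders = $orders;
        $this->customers = $customers;
    }

    // Multi-table logic lives here, not in controllers or views.
    public function totalsByCustomerName()
    {
        $names = array();
        foreach ($this->customers->all() as $customer) {
            $names[$customer['id']] = $customer['name'];
        }
        $totals = array();
        foreach ($this->orders->all() as $order) {
            $name = $names[$order['customer_id']];
            $current = isset($totals[$name]) ? $totals[$name] : 0;
            $totals[$name] = $current + $order['amount'];
        }
        return $totals;
    }
}

$report = new OrderReport(
    new TableGateway(array(
        array('customer_id' => 1, 'amount' => 10),
        array('customer_id' => 1, 'amount' => 20),
        array('customer_id' => 2, 'amount' => 5),
    )),
    new TableGateway(array(
        array('id' => 1, 'name' => 'Ann'),
        array('id' => 2, 'name' => 'Bob'),
    ))
);
$totals = $report->totalsByCustomerName();
```

Controllers and views only call totalsByCustomerName(); no select object leaves the model, so callers cannot modify the query.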
