I am using Laravel with Eclipse as my IDE, and the laravel-ide-helper package for autocompletion.
I am calling methods from an eloquent model object.
When I type in
User::find
Eclipse provides me with:
find($id, $columns) : \Illuminate\Database\Eloquent\Model
which means the "find" method returns an \Illuminate\Database\Eloquent\Model instance.
However, when I type in
User::where
Eclipse provides me with the following:
where($column, $operator, $value, $boolean) : $this
which means the function "where" returns
$this
Now, I don't really know what $this means because as I understand it "where" should return a query builder instance. As far as I know, $this means the object caller of the method (in this context, the User model itself). But it clearly does not return the model. I suspect that I do not understand what $this means in this context.
What am I missing?
The find() and where() methods do not exist on the Model class, so calls to them end up falling through to the PHP magic methods __callStatic() and __call(), which Laravel has defined. Inside these magic methods, Laravel forwards the call to a new query builder object, which does have these methods.
The query builder class' find() method returns a Model, and its where() method returns a reference to itself ($this) so that you can fluently chain more method calls to the builder.
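The forwarding described above can be sketched roughly as follows. The Sketch* classes are hypothetical stand-ins written for illustration, not Laravel's actual code; they only show why where() hands back the builder ($this) while find() hands back a model.

```php
<?php
// Hypothetical stand-ins; these are NOT Laravel's real classes.
class SketchBuilder
{
    /** @var string */
    private $modelClass;
    /** @var array */
    private $wheres = [];

    public function __construct(string $modelClass)
    {
        $this->modelClass = $modelClass;
    }

    /** @return $this the same builder, so further calls can be chained */
    public function where(string $column, $value)
    {
        $this->wheres[] = [$column, $value];
        return $this;
    }

    /** @return SketchModel a model instance, not the builder */
    public function find(int $id)
    {
        $class = $this->modelClass;
        $model = new $class();
        $model->id = $id;
        return $model;
    }
}

class SketchModel
{
    public $id;

    // A static call such as SketchUser::where(...) lands here first...
    public static function __callStatic($method, $arguments)
    {
        return (new static())->$method(...$arguments);
    }

    // ...and then here, where it is forwarded to a fresh builder.
    public function __call($method, $arguments)
    {
        return (new SketchBuilder(static::class))->$method(...$arguments);
    }
}

class SketchUser extends SketchModel {}

$builder = SketchUser::where('name', 'bob');   // a SketchBuilder
$same    = $builder->where('active', true);    // the very same SketchBuilder
$user    = SketchUser::find(1);                // a SketchUser
```

Here "$this" in the where() docblock refers to the builder that actually runs the method, not to the model the call started from.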
All of this can make it hard for an IDE to provide hints (IntelliSense), which is where packages like laravel-ide-helper come in. They basically generate files full of interfaces that your IDE can use to understand what magic methods and properties exist for various classes, but in some cases these method signatures still fall short of what you might like to know about the code structure.
In this case the IntelliSense suggestions are apparently populating from the docblock for \Illuminate\Database\Eloquent\Builder::where():
/**
* Add a basic where clause to the query.
*
* @param string|array|\Closure $column
* @param mixed $operator
* @param mixed $value
* @param string $boolean
* @return $this
*/
public function where($column, $operator = null, $value = null, $boolean = 'and');
You can see that the return type is defined as $this. At this point, some IDEs may be smart enough to understand the meaning and provide suggestions for an instance of that class. However, this could become more complicated if the method definitions your IDE is parsing are being generated by packages like laravel-ide-helper. In that case it depends not only on the capabilities of your IDE, but also on the output of the helper package.
Eclipse builds its hints purely from the docblock comments in the source code. If you look at the source for Builder, which is the type returned by query(), find() is documented as:
/**
* Find a model by its primary key.
*
* @param mixed $id
* @param array $columns
* @return \Illuminate\Database\Eloquent\Model|\Illuminate\Database\Eloquent\Collection|static[]|static|null
*/
public function find($id, $columns = ['*'])
for where() it is...
/**
* Add a basic where clause to the query.
*
* @param string|\Closure $column
* @param string $operator
* @param mixed $value
* @param string $boolean
* @return $this
*/
public function where($column, $operator = null, $value = null, $boolean = 'and')
{
As Eclipse can only show one type per hint, it uses the first type listed for find(), which is \Illuminate\Database\Eloquent\Model. For where() the only option is $this.
Suppose I have the following PHP function:
/**
* @param string $className
* @param array $parameters
* @return mixed
*/
function getFirstObject($className, $parameters) {
// This uses a Doctrine DQL builder, but it could easily be replaced
// by something else. The point is, that this function can return
// instances of many different classes, that do not necessarily
// have common signatures.
$builder = createQueryBuilder()
->select('obj')
->from($className, 'obj');
addParamClausesToBuilder($builder, $parameters, 'obj');
$objects = $builder
->getQuery()
->getResult();
return empty($objects) ? null : array_pop($objects);
}
Basically, the function always returns either an instance of the class specified by the $className parameter, or null if something went wrong. The only catch is that I do not know, at compile time, the full list of classes this function can return.
Is it possible to get type hinting for the return type of this kind of function?
In Java, I would simply use generics to imply the return type:
static <T> T getOneObject(Class<? extends T> clazz, ParameterStorage parameters) {
...
}
I am aware of the manual type hinting, like
/** @var Foo $foo */
$foo = getOneObject('Foo', $params);
but I would like to have a solution that does not require this boilerplate line.
To elaborate: I am trying to write a wrapper around Doctrine, so that I can easily get the model entities that I want, while encapsulating all the specific usage of the ORM system. I am using PhpStorm.
** Edited the function to reflect my intended usage. I originally wanted to keep it free of any specific use case so as not to bloat the question. Also note that the actual wrapper is more complex, since I also incorporate model-specific implicit object relations, joins, etc.
I use the phpdoc @method tag for this purpose. For example, I create an AbstractRepository class which is extended by other Repository classes. Suppose we have an AbstractRepository::process(array $results) method whose return type changes according to the class that extends it.
So in sub class:
/**
* @method Car[] process(array $results)
*/
class CarRepo extends AbstractRepository {
//implementation of process() is in the parent class
}
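A fuller, runnable version of that idea might look like the sketch below. The Car class, the entityClass() helper and the row format are assumptions made up for the example; only the @method annotation on the subclass is the technique described above.

```php
<?php
// Hypothetical entity used only for this example.
class Car
{
    public $model;

    public function __construct(string $model)
    {
        $this->model = $model;
    }
}

abstract class AbstractRepository
{
    /** @return string class name of the entities this repository builds (hypothetical helper) */
    abstract protected function entityClass(): string;

    /**
     * @param array $results raw rows, each with a 'model' key (assumed format)
     * @return object[]
     */
    public function process(array $results): array
    {
        $class = $this->entityClass();
        return array_map(function ($row) use ($class) {
            return new $class($row['model']);
        }, $results);
    }
}

/**
 * The @method tag narrows the inherited return type for IDEs.
 *
 * @method Car[] process(array $results)
 */
class CarRepo extends AbstractRepository
{
    protected function entityClass(): string
    {
        return Car::class;
    }
}

$cars = (new CarRepo())->process([['model' => 'Mini'], ['model' => 'Golf']]);
echo $cars[1]->model, PHP_EOL;
```

The implementation of process() stays in the parent class; the annotation changes only what the IDE believes the method returns.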
Update 1:
You could also use the phpstan/phpstan library, a static analysis tool that lets you define generic return types:
/**
* @template T
* @param class-string<T> $className
* @param int $id
* @return T|null
*/
function findEntity(string $className, int $id)
{
// ...
}
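Applied to the shape of the function in the question, the same annotation could look like this sketch. The in-memory $pool parameter is a hypothetical stand-in for the Doctrine query result, and firstObjectOf is an invented name; whether an IDE honours the template depends on the tool and its version.

```php
<?php
// Hypothetical entity used only for this example.
final class Foo
{
    public $id;

    public function __construct(int $id)
    {
        $this->id = $id;
    }
}

/**
 * Return the first object in $pool that is an instance of $className.
 *
 * @template T of object
 * @param class-string<T> $className
 * @param object[] $pool hypothetical stand-in for the query result
 * @return T|null
 */
function firstObjectOf(string $className, array $pool)
{
    foreach ($pool as $candidate) {
        if ($candidate instanceof $className) {
            return $candidate;
        }
    }
    return null;
}

// A static analyser that understands @template infers Foo|null here,
// so no inline @var annotation is needed.
$foo = firstObjectOf(Foo::class, [new stdClass(), new Foo(7)]);
```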
This can now be achieved with the IntelliJ (IDEA/PhpStorm/WebStorm) plugin DynamicReturnTypePlugin:
https://github.com/pbyrne84/DynamicReturnTypePlugin
If you use PhpStorm or VS Code (with the PHP Intelephense extension by Ben Mewburn), there is a feature called metadata that lets you specify your own type hinting for code that does this kind of magic internally. The following should work (as it did in VS Code 1.71.2):
<?php
namespace PHPSTORM_META {
override(\getFirstObject(0), map(['' => '$0']));
}
After some fantastic suggestions, and a sleepless night of excitement due to the possibility of finally having a solution to my problem, I realize that I'm still not quite at a solution. So, I am here to outline my problem in much more detail in hope that someone knows of the best way to achieve this.
To recap (if you haven't read the previous post):
I am constructing a PHP OOP framework from scratch (I have no choice in this matter)
The framework is required to handle object-oriented data in the most efficient way possible. It doesn't need to be lightning quick, it just needs to be the best possible solution to the problem
Objects very closely resemble strictly written oop objects, in that they are an instance of a specific class, which contains a strict set of properties.
Object properties can be basic types (strings, numbers, bools) but can also be one object instance, or an array of objects (with the restriction that the array must be objects of the same type)
Ultimately, a storage engine that supports document-oriented storage (similar to XML or JSON) where the objects themselves have a strict structure.
Instead of outlining what I have tried so far (I discussed this briefly in my previous post) I am going to spend the rest of this post explaining, in detail, what it is that I am trying to do. This post is going to be long (sorry!).
To get started, I need to discuss a terminology that I had to introduce to solve one of the most crucial problems that came with the set of requirements. I've named this terminology "persistence". I understand that this term does have a different meaning when dealing with object databases, and for this reason I am open to suggestions on a different term. But for now, let's move on.
Persistence
Persistence refers to the independence of an object. I found the need to introduce this terminology when considering the data structure being generated from XML (which is something that I had to be able to do). In XML, we see objects that are completely dependent on their parent, while we also see objects that can be independent of a parent object.
The below example is an example of an XML document, that conforms to a certain structure (for example, a .wsdl file). Each object resembles a type with a strict structure. Every object has an "id" property
In the above example, we see two users. Both have their own Address objects under their "address" property. However if we look at their "favouriteBook" property, we can see that they both re-use the same Book object instance. Also note that the books use the same author.
So we have the Address object which is non-persistent because it is only related to its parent object (the User) meaning that its instance only needs to exist while the owning User object exists. Then the Book object which is persistent because it can be used in multiple locations and its instance remains persistent.
At first, I felt a bit crazy for coming up with a terminology like this; however, I found it remarkably simple to understand and use practically. It ultimately condenses the "many-to-many, one-to-many, one-to-one, many-to-one" formula into a simple idea that I felt worked much better with nested data.
I've made an image representation of the above data here:
With my definition of persistence, comes a set of rules to help in understanding it. These rules are as follows:
update/create
Persistent child objects of the base object being stored update the properties of the persistent object, ultimately updating its instance.
Non-persistent objects always create a new instance of the object to ensure that they always use a non-persistent instance (no two non-persistent instances exist in more than one place at any given time)
deleting
Persistent child objects of the base object do not get deleted recursively. This is because the persistent object may exist in other places. You would always delete a persistent object directly.
Non-persistent child objects of the base object are removed along with the base object. If they were not removed, they would be left stranded, as their design requires that they have a parent.
retrieving
Since persistence mostly defines how modifications work, retrieval doesn't involve persistence a great deal, aside from how you would expect persistence to affect how a model is stored and therefore how it would be retrieved (a persistent object remains the same instance wherever it is located, while non-persistent objects always have their own instance).
One final thing to note before we move on - the persistence of data models is defined by the model itself rather than the relationship. Initially, the persistence was part of the relationship but this was completely unnecessary when the system expects that you know the structure of your models, and how they are used. Ultimately, every model instance of a model is either persistent, or it is not.
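To make the deletion rule concrete, here is a minimal in-memory sketch. SketchStore and its "Type#id" keys are invented for illustration and say nothing about how the real storage layer works; the point is only that removal cascades into non-persistent children and stops at persistent ones.

```php
<?php
// Hypothetical in-memory store keyed by "Type#id".
final class SketchStore
{
    /** @var array */
    public $rows = [];

    public function put(string $type, int $id, bool $persistent, array $children = []): void
    {
        $this->rows["$type#$id"] = ['persistent' => $persistent, 'children' => $children];
    }

    /** Remove an object; cascade only into non-persistent children. */
    public function remove(string $key): void
    {
        if (!isset($this->rows[$key])) {
            return;
        }
        $children = $this->rows[$key]['children'];
        unset($this->rows[$key]);
        foreach ($children as $childKey) {
            if (isset($this->rows[$childKey]) && !$this->rows[$childKey]['persistent']) {
                $this->remove($childKey);
            }
        }
    }
}

$store = new SketchStore();
$store->put('Book', 1, true);       // persistent: may be shared
$store->put('Address', 1, false);   // non-persistent: owned by one user
$store->put('User', 1, true, ['Address#1', 'Book#1']);

$store->remove('User#1');
// Address#1 is gone with its owner; Book#1 survives.
```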
So taking a look at some code now, you might start to see the method behind the madness. Although it may seem that the reason for this solution is to be able to build a storage system around objective data conforming to a set of conditions, its design actually comes from wanting to be able to store class instances, and/or generate class instances from an objective data structure.
I have written some pseudo-classes as an example of the functionality that I am trying to produce. I have commented most methods, including type declarations.
So first, this would be the base class that all model classes would extend. The purpose of this class is to create a layer between the model class/object, and the database/storage engine:
<?php
/**
* This is the base class that all models would extend. It contains the functionalities that are useful among all model
* objects, such as crud actions, finding, and crud event management.
*
* @author Donny Sutherland <donny@pixelpug.co.uk>
* @package Main
* @subpackage Sub
*
* Class ORMModel
*/
class ORMModel {
/**
* In order to generate relationships between objects, every object MUST have an id. This functions as the object's
* unique identifier. Each object in its model type (collection) has its own id.
*
* @var int
*/
public $id;
/**
* Internal property assigned by the application. This is where the persistence of the model is defined.
*
* @var bool
*/
protected $internal_isPersistent = true;
/**
* Internal property assigned by the application. This is an array of the model's properties, and their PHP type.
*
* For example, a User model might use something like this:
* array(
* "id" => "integer",
* "username" => "string",
* "password" => "string",
* "address" => "object",
* "favouriteBook" => "object",
* "allBooks" => "array"
* )
*
* @var array
*/
protected $internal_propertyTypes = array();
/**
* Internal property assigned by the application. This is an array of the model's properties which are objects, and
* the MODEL CLASS type of the object.
*
* For example, the User model example for the property types might use this:
* array(
* "address" => "Address",
* "favouriteBook" => "Book",
* "allBooks" => "Book"
* )
*
* @var array
*/
protected $internal_objectTypes = array();
/**
* I am not 100% sure on the best way to use this yet, I have tried a few different ways and all seem to cause
* performance problems. But ultimately, before we attempt to update an object, we cache its currently stored
* instance to this property, allowing us to compare old vs new. I find this really useful for detecting whether a
* property has changed, I just need to work out the best way to do it.
*
* @var $this
*/
protected $internal_old;
/**
* The lazy way to construct an empty model object (all NULL values)
*
* @return $this
*/
final public static function constructEmpty() {
}
/**
* This method is used by the other constructFromXXX methods once the data has been converted to a PHP array.
* This method is what allows us to build a RESTful interface into the ORM system as it conforms to the following
* rules:
*
* - if the id is set (not null), first pull the object from storage.
* - For each key => value of the passed array, OVERWRITE the value
* - For properties that are model objects/arrays, if the property is assigned in the array:
* - if the array value is NULL, we are clearing the object relationship
* - if the array value is not null, construct recursively at this point
*
* Ultimately, if you assign a property in the array that you pass to this method, it will overwrite the value. If
* you do not, it will use the property value in storage.
*
* @param array $array
*
* @return $this
*/
final public static function constructFromArray(array $array) {
}
/**
* This method attempts to decode the value of $json into a PHP array. It then calls constructFromArray if the string
* could be decoded.
*
* @param $json
*
* @return $this
*/
final public static function constructFromJson($json) {
}
/**
* This method attempts to decode the value of $xml into a PHP array. It then calls constructFromArray if the xml
* could be decoded.
*
* @param $xml
*
* @return $this
*/
final public static function constructFromXml($xml) {
}
/**
* Find one object, based on a set of options.
*
* @param ORMCrudOptions $options
*
* @return $this
*/
final public static function findOne(ORMCrudOptions $options) {
}
/**
* Find all objects, (optionally) based on a set of options
*
* @param ORMCrudOptions $options
*
* @return $this[]
*/
final public static function findAll(ORMCrudOptions $options=null) {
}
/**
* Find the count of objects, based on a set of options
*
* @param ORMCrudOptions $options
*
* @return integer
*/
final public static function findCount(ORMCrudOptions $options) {
}
/**
* Find one object, based on its id, and (optionally) a set of options.
*
* @param int $id
* @param ORMCrudOptions $options
*
* @return $this
*/
final public static function findById($id,ORMCrudOptions $options=null) {
}
/**
* Push this object to storage. This creates/updates all of the contained objects, based on their ids and
* persistence.
*
* @param ORMCrudOptions $options
*
* @return bool
*/
final public function pushThis(ORMCrudOptions $options) {
}
/**
* Pull this object from storage. This retrieves all of the contained objects again, based on their ids and
* persistence.
*
* @param ORMCrudOptions $options
*
* @return bool
*/
final public function pullThis(ORMCrudOptions $options) {
}
/**
* Remove this object from storage. This conditionally removes the contained objects (based on persistence) based
* on their ids.
*
* @param ORMCrudOptions $options
*/
final public function removeThis(ORMCrudOptions $options) {
}
/**
* This is a crud event.
*/
public function beforeCreate() {
}
/**
* This is a crud event.
*/
public function afterCreate() {
}
/**
* This is a crud event.
*/
public function beforeUpdate() {
}
/**
* This is a crud event.
*/
public function afterUpdate() {
}
/**
* This is a crud event.
*/
public function beforeRemove() {
}
/**
* This is a crud event.
*/
public function afterRemove() {
}
/**
* This is a crud event.
*/
public function beforeRetrieve() {
}
/**
* This is a crud event.
*/
public function afterRetrieve() {
}
}
So ultimately, this class would be designed to provide the functionality to construct, find, save, retrieve and delete model objects. The internal properties are properties that exist only in the classes (not in storage). These properties get populated by the framework itself while you use an interface to create models, and add property/fields to the models.
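The overwrite rules documented on constructFromArray can be illustrated with a small sketch. mergeForConstruct and its three parameters are hypothetical names, and the recursive handling of nested model objects is deliberately left out; this only shows "explicit values win, everything else comes from storage, unknown keys are ignored".

```php
<?php
/**
 * Hypothetical helper illustrating the merge rules (no recursion into child objects).
 *
 * @param array    $stored   values currently in storage, keyed by property name
 * @param array    $incoming values passed by the caller
 * @param string[] $known    property names the model declares
 * @return array
 */
function mergeForConstruct(array $stored, array $incoming, array $known): array
{
    $result = [];
    foreach ($known as $property) {
        if (array_key_exists($property, $incoming)) {
            // An explicitly passed value always wins, even when it is NULL.
            $result[$property] = $incoming[$property];
        } else {
            // Otherwise fall back to what storage holds (or NULL for a new object).
            $result[$property] = $stored[$property] ?? null;
        }
    }
    return $result;
}

$stored = ['id' => 1, 'name' => 'bob', 'pass' => 'passbob'];
$merged = mergeForConstruct(
    $stored,
    ['id' => 1, 'name' => 'bob updated', 'nickname' => 'ignored'],
    ['id', 'name', 'pass']
);
// $merged keeps the stored pass, takes the new name, and drops the unknown nickname.
```

Using array_key_exists rather than isset matters here, because passing NULL is meant to clear a value rather than to fall back to storage.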
The idea is, the framework comes with an interface for managing data models. With this interface you create the models, and add property/fields to the models. In doing so, the system automatically creates the class files for you, updating those internal properties as you modify the persistence and property types.
To keep things developer friendly, the system creates two class files for each model. A base class (which extends ORMModel) and another class (which extends the base class). The base class is manipulated by the system and therefore modifying this file is not recommended. The other class is used by developers to add additional functionality to models and crud events.
So coming back to the example data, here is the User base class:
<?php
class User_Base extends ORMModel {
public $name;
public $pass;
/**
* @var Address
*/
public $address;
/**
* @var Book
*/
public $favouriteBook;
protected $internal_isPersistent = true;
protected $internal_propertyTypes = array(
"id" => "integer",
"name" => "string",
"pass" => "string",
"address" => "object",
"favouriteBook" => "object"
);
protected $internal_objectTypes = array(
"address" => "Address",
"favouriteBook" => "Book"
);
}
Pretty much self explanatory. Again note that the internal properties get generated by the system, so those arrays would be generated based on the property/fields that you specify when creating/modifying the User model in the model management interface. Also note the docblock on the address and favouriteBook property definitions. Those are also generated by the system making the classes very IDE friendly.
This would be the other class generated for the User model:
<?php
final class User extends User_Base {
public function beforeCreate() {
}
public function afterCreate() {
}
public function beforeUpdate() {
}
public function afterUpdate() {
}
public function beforeRemove() {
}
public function afterRemove() {
}
public function beforeRetrieve() {
}
public function afterRetrieve() {
}
}
Again, pretty self explanatory. We've extended the base class to create another class where developers would add additional methods, and add functionality to the crud events.
I'll not add in the other objects that make up the rest of the example data. Since the above should explain how they would look.
So you may or may not have noticed that in the ORMModel class, the CRUD methods require an instance of an ORMCrudOptions class. This class is pretty crucial to the whole system, so let's take a quick look at it:
<?php
/**
* Despite this object being somewhat of an aggregate, it is quite possibly the most important part of the ORM, in that it
* defines how CRUD actions are executed, and outlines how the querying is done.
*
* Class ORMCrudOptions
*/
final class ORMCrudOptions {
/**
* This ultimately makes up the "where" part of the sql query. However, because we want to be able to make querying
* possible at any depth within the hierarchy of a model, this gets quite complicated.
*
* Previously, I developed a system which allowed the user to do something like this:
*
* "this.customer.address.postcode LIKE ('%XXX%') OR this.customer.address.line1 LIKE ('%XXX%')"
*
* The "this" and the "." are my extension to basic sql. The "this" refers to the base model that you are finding,
* and each "." basically drills down into the hierarchy to make a comparison on a property somewhere within a
* contained model object.
*
* I will explain more how I did this in my post, I am most definitely looking at how I could better achieve this
* though.
*
* @var string
*/
private $query;
/**
* This allows you to build up a list of order by definitions.
*
* Using the orderBy method, you can chain up the order by statements like:
*
* ->orderBy("this.name","asc")->orderBy("this.customer.address.line1","desc")
*
* Which would be similar to doing:
*
* ORDER BY this_name ASC, this_customer_address.line1 DESC
*
* @var array
*/
private $orderBy;
/**
* This allows you to set the limit start and limit values by doing:
*
* ->limit(10,10)
*
* Which would be similar to doing:
*
* LIMIT 10, 10
*
* @var array
*/
private $limit;
/**
* Depth was added in my later endeavours to try and help with performance. It allows you to specify the depth at
* which to retrieve data. Although this helped with optimisation a lot, I really disliked having to
* implement this because it seems like a work-around. I would rather be able to increase performance elsewhere so
* that objects are always retrieved at their full depth
*
* @var integer
*/
private $depth;
/**
* This was another newly added feature. Whenever you execute a crud action on a model, the model instance is stored
* in a local cache if this is true, and/or retrieved from this cache if this value is true.
*
* I did find this to make a significant increase on performance, although it did bring in complications that make
* the system tricky to use at times. You really need to understand how and when to use the cache, otherwise it can
* be infuriatingly obtuse.
*
* @var bool
*/
private $useCache;
/**
* Built into the ORM system, and tied in with the application I set up a webhook system which fires out webhooks on
* crud events. I discovered the need to be able to disable webhooks at times (when doing large amounts of crud
* actions in one go) pretty early on. Setting this to false basically disables webhooks on the crud action
*
* @var bool
*/
private $fireWebhooks;
/**
* Also built into the application, and tied into the ORM system, is an access system. This works on a separate
* layer to the database, allowing me to use the same access system for defining crud action access as I use for
* everything else in the framework. However, in some instances I found it useful to disable access checks.
*
* This is always on by default. In the api system that I built to access the data models, you were not able to
* modify this property and therefore were always subject to access checks.
*
* @var bool
*/
private $ignoreAccessChecks;
/**
* The lazy way to create a new instance of options.
*
* @return ORMCrudOptions
*/
public static function n() {
return new ORMCrudOptions();
}
/**
* Set the query value
*
* @param $query
*
* @return $this
*/
public function query($query) {
$this->query = $query;
return $this;
}
/**
* Add an orderby field and direction
*
* @param $field
* @param string $direction
*
* @return $this
* @internal param array $orderBy
*
*/
public function orderBy($field,$direction="asc") {
$this->orderBy[] = array($field,$direction);
return $this;
}
/**
* Set the limit start and limit.
*
* @param $limitResults
* @param null $limitStart
*
* @return $this
*/
public function limit($limitResults,$limitStart=null) {
$this->limit = array($limitResults,$limitStart);
return $this;
}
/**
* Set the depth for retrieval
*
* @param $depth
*
* @return $this
*/
public function depth($depth) {
$this->depth = $depth;
return $this;
}
/**
* Set whether to use the model cache
*
* @param $useCache
*
* @return $this
*/
public function useCache($useCache) {
$this->useCache = $useCache;
return $this;
}
/**
* Set whether to fire webhooks on crud actions
*
* @param $fireWebhooks
*
* @return $this
*/
public function fireWebhooks($fireWebhooks) {
$this->fireWebhooks = $fireWebhooks;
return $this;
}
/**
* Set whether to ignore access checks
*
* @param $ignoreAccessChecks
*
* @return $this
*/
public function ignoreAccessChecks($ignoreAccessChecks) {
$this->ignoreAccessChecks = $ignoreAccessChecks;
return $this;
}
}
The idea behind this class is to remove the need for a large number of arguments in the crud methods, since the majority of those arguments can be re-used across all of them. Take note of the comments on the query property, as that one is important.
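As an illustration of the "this" and "." notation used by the query property, the sketch below resolves such a path against nested objects that are already in memory. resolvePath is a hypothetical helper; the system described here translates these paths into SQL instead, which this sketch does not attempt.

```php
<?php
/**
 * Resolve a path such as "this.address.zip" against nested objects.
 * Returns null as soon as any step of the path is missing.
 * (Hypothetical helper, in-memory only.)
 */
function resolvePath($base, string $path)
{
    $segments = explode('.', $path);
    if ($segments[0] === 'this') {
        array_shift($segments);   // "this" refers to the base object itself
    }
    $current = $base;
    foreach ($segments as $segment) {
        if (!is_object($current) || !isset($current->$segment)) {
            return null;
        }
        $current = $current->$segment;   // each "." drills one level down
    }
    return $current;
}

$user = new stdClass();
$user->name = 'bob';
$user->address = new stdClass();
$user->address->zip = '90051';

$zip     = resolvePath($user, 'this.address.zip');
$missing = resolvePath($user, 'this.address.line1');
```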
So, that pretty much covers the base pseudo-code and the ideas behind what I am trying to do. Finally, I'll show some user scenarios:
<?php
//the most simple way to store a user
$user = User::constructEmpty();
//we use auto incrementing on the id value at the database end. So by not specifying the id, we are not updating, and
//the id will be auto generated. After the push has been made, the system will assign the id for me
$user->name = "bob";
$user->pass = "bobpass";
//the system automatically constructs child objects for you if they are not yet constructed, because
//it knows what type should be constructed. So I don't need to construct the address object, manually!
$user->address->line1 = "awesome drive";
$user->address->zip = "90051";
//save to storage, but don't fire webhooks and ignore access checks. Note that the ORMCrudOptions object
//is passed to child objects too when recursion happens, meaning that the same options are inherited by child objects
$user->pushThis(ORMCrudOptions::n()->fireWebhooks(false)->ignoreAccessChecks(true));
echo $user->id; //this will display the auto generated id
echo $user->address->id; //this will be the auto generated id of the address object.
//next lets update something within the object
$user->name = "bob updated";
//because we know now that the object has an id value, it will update the existing object. Remembering that the User
//object is persistent!
$user->pushThis(ORMCrudOptions::n()->fireWebhooks(false)->ignoreAccessChecks(true));
echo $user->id; //this will be the exact same id as before
echo $user->address->id; //this will be a NEW ID! Remember, the address object is NOT persistent meaning that a new
//instance was created in order to ensure that it is in fact non-persistent. The system does handle cleaning up of loose
//objects although this is one of the main performance problems
//finding the above object by user->name
$user = User::findOne(ORMCrudOptions::n()->query("this.name = ('bob')"));
if($user) {
echo $user->name; //provided that a user with name "bob" exists, this would output "bob"
}
//finding the above user by address->zip
$user = User::findOne(ORMCrudOptions::n()->query("this.address.zip = ('90051')"));
if($user) {
echo $user->address->zip; //provided that the user with address->zip "90051" exists, this would output "90051"
}
//removing the above user
$user = User::findById(1); //assuming that the id of the user is 1
//add a favourite book to the user
$user->favouriteBook->name = "awesome book!";
//update
$user->pushThis(ORMCrudOptions::n()->ignoreAccessChecks(true));
//remove
$user->removeThis(ORMCrudOptions::n()->ignoreAccessChecks(true));
//with how persistence works, this will delete the user, and the user's address (because the address is non-persistent)
//but will leave the created book un-deleted, because books are persistent and may exist as child objects to other objects
//finally, constructing from document-oriented
$user = User::constructFromArray(array(
"name" => "bob",
"pass" => "passbob",
"address" => array(
"line1" => "awesome drive",
"zip" => "90051"
)
));
//this will only CONSTRUCT the object based on the internal properties defined property types and object types.
//properties that don't exist in the model's defined properties, but exist in the array will be ignored, so having more
//properties in the array than should be there doesn't matter
$user->pushThis(ORMCrudOptions::n()->ignoreAccessChecks(true));
//update only one property of a user object using arrays (this is ultimately how the api system of the ORM was built)
$user = User::constructFromArray(array(
"id" => 1,
"name" => "bob updated"
));
echo $user->pass; //this would output passbob, because the pass was not specified in the array, it was pulled from storage
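The automatic construction of child objects used in the scenarios above could be done in several ways. One possibility, shown purely as an assumption, is a __get hook that builds the child on first access. The Sketch* classes are invented for the example, and the child is held in a private array rather than a declared public property, because __get is only called for properties that are not accessible.

```php
<?php
// Hypothetical child model.
class SketchAddress
{
    public $line1;
    public $zip;
}

// Hypothetical parent model that builds children on first access.
class SketchLazyUser
{
    /** @var array property name => class to construct on first access */
    private $objectTypes = ['address' => SketchAddress::class];

    /** @var array already constructed children, keyed by property name */
    private $children = [];

    public function __get($name)
    {
        if (!isset($this->objectTypes[$name])) {
            throw new OutOfRangeException("Unknown property: $name");
        }
        if (!isset($this->children[$name])) {
            $class = $this->objectTypes[$name];
            $this->children[$name] = new $class();
        }
        return $this->children[$name];
    }
}

$lazyUser = new SketchLazyUser();
$lazyUser->address->zip = '90051';   // the address object is created on demand
```

The trade-off of this approach is that an IDE no longer sees a declared property, so a @property annotation on the class would be needed to keep auto-completion.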
It's not really possible to show here, but one of the things that makes this system a delight to use is how the generation of the class files makes them incredibly IDE friendly (in particular, for auto-completion). Yeah, some of the old-school developers will be against this new-modern-fangled-technology, but at the end of the day when you are dealing with crazily complex object-oriented data structures, having the IDE help you in spelling your property names correctly and getting the structure correct can be a life-saver!
If you are still with me, thank you for reading. You are probably wondering though, what is it you want again?.
In short, I don't have a huge amount of experience in document/object storage and already in the past few days I've been shown that there are technologies out there that could help me achieve what it is that I am trying to do. I'm just not 100% certain yet that I have found the right one. Do I create a new ORM, can I efficiently get this functionality out of an existing ORM, or do I use a dedicated object/graph database?
I very much welcome any and all suggestions!
It still feels like this is a nested set algorithm, because your data will always fit into a hierarchy. Simple types (strings, integers, etc) have a hierarchy of depth 1, and an object expression like customer.address.postcode (from your related post) will have a hierarchy level for each component (3 in this case, with the corresponding string value stored in the outermost node).
It seems that this hierarchy can store different types, so you'd need to make a small change to the nested set algorithm. Rather than each node carrying class-specific (Address, User, etc) columns, you have a string reference to the type and an integer primary key to reference it. This means that you can't use foreign key constraints for this part of your database, but that's a small price to pay. (The reason for this is a single column cannot obey one of several constraints, it would have to obey them all. That said, you could probably do something clever with a pre-insert/pre-update trigger though).
So, if you were to use a Doctrine or Propel NestedSet behaviour, you would define tables thus:
Node
[nested set columns, done for you in an ORM]
name (varchar, records the element name e.g. customer)
is_persistent (bool)
table_name (varchar)
primary_key (integer)
Address
(Your usual columns, ditto any other table)
Now, we have an interesting property emerging here: when creating a hierarchy, you'll see that the trivial values in the leaf nodes can be shared by virtue of our reference system. In fact, I am not entirely sure the is_persistent boolean is required: it is persistent (if I have understood your term correctly) by virtue of sharing external table rows, and non-persistent if it does not.
So, if customer1.address.postcode has a particular string value, you can get customer2.address.postcode to point to the same thing. When updating the version pointed to by the first expression, the second one will update "automatically" (because it resolves to the same table row).
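The sharing described above can be sketched with plain arrays standing in for the database. Everything here is an assumption for illustration: the `string_value` table, the row id, and the `resolveNode()` helper are made up, and a real implementation would go through your ORM rather than arrays.

```php
<?php
// Hypothetical in-memory stand-in for database tables:
// table name => primary key => row.
$tables = [
    'string_value' => [
        7 => ['value' => 'AB1 2CD'],
    ],
];

// Leaf nodes for customer1.address.postcode and customer2.address.postcode.
// Both carry the same (table_name, primary_key) reference.
$customer1Postcode = ['name' => 'postcode', 'table_name' => 'string_value', 'primary_key' => 7];
$customer2Postcode = ['name' => 'postcode', 'table_name' => 'string_value', 'primary_key' => 7];

// Follow a node's reference to the row it points at.
function resolveNode(array $node, array $tables): array
{
    return $tables[$node['table_name']][$node['primary_key']];
}

// Update the shared row through the first expression's reference...
$tables[$customer1Postcode['table_name']][$customer1Postcode['primary_key']]['value'] = 'ZZ9 9ZZ';

// ...and the second expression resolves to the updated row.
$seenByCustomer2 = resolveNode($customer2Postcode, $tables)['value'];
echo $seenByCustomer2, PHP_EOL;
```

Nothing is copied between the two nodes; the "automatic" update falls out of both references resolving to one row.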
The advantage here is that this will bolt onto Propel and Doctrine without much work, and without any core hacking at all. You'd need to do some work to convert an object/array to a hierarchy, but that's probably not much code.
Addendum: let me explain my thinking a bit more in relation to the storage of nested elements. You say that you believe that you need to share a hierarchy at different levels in different places, but I am not so sure (and presently I think you need some encouragement not to build an excessively complicated system!). Let us look at an example of a user having a favourite book.
To store it, we create these hierarchies:
user
    node level 1
    points to user record containing id=1, name=bob, pass=bobpass
    favouriteBook
        node level 2
        points to book record containing id=1, name=awesome book
        author
            node level 3
            points to author record containing id=3, name=peter, pass=peterpass
Now, let's say we have another user and want to share a different favourite book by the same author (i.e. we are sharing user.favouriteBook.author).
user
    node level 1
    points to different user record containing id=100, name=halfer, pass=halferpass
    favouriteBook
        node level 2
        points to different book record containing id=101, name=textbook
        author
            node level 3
            points to same author record (id = 3)
How about two users who share the same favourite book? No problem (we additionally share user.favouriteBook):
user
    node level 1
    points to different user record containing id=101, name=donny, pass=donnypass
    favouriteBook
        node level 2
        points to previous book record (id=1)
        author
            node level 3
            points to previous author record (id = 3)
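As a rough check on the three hierarchies above, the following sketch lists which records end up shared. The ids mirror the examples; the array layout and the `findSharedRecords()` helper are assumptions for illustration, not part of any ORM.

```php
<?php
// Each hierarchy maps an expression to the record its node references,
// written as [table_name, primary_key].
$hierarchies = [
    'bob' => [
        'user'                      => ['user', 1],
        'user.favouriteBook'        => ['book', 1],
        'user.favouriteBook.author' => ['author', 3],
    ],
    'halfer' => [
        'user'                      => ['user', 100],
        'user.favouriteBook'        => ['book', 101],
        'user.favouriteBook.author' => ['author', 3],
    ],
    'donny' => [
        'user'                      => ['user', 101],
        'user.favouriteBook'        => ['book', 1],
        'user.favouriteBook.author' => ['author', 3],
    ],
];

// Group hierarchy owners by the record each node references.
function findSharedRecords(array $hierarchies): array
{
    $owners = [];
    foreach ($hierarchies as $owner => $nodes) {
        foreach ($nodes as [$table, $primaryKey]) {
            $owners["$table#$primaryKey"][] = $owner;
        }
    }

    // Keep only records referenced from more than one hierarchy.
    return array_filter($owners, function (array $names): bool {
        return count($names) > 1;
    });
}

$shared = findSharedRecords($hierarchies);
print_r($shared);
```

Each hierarchy keeps its own three nodes; only the rows they point at are shared, which is the explicit approach argued for here.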
One critique that could be made of this method is that if you make user.favouriteBook "persistent" (i.e. shared) then it should share user.favouriteBook.author automatically. This is because if two or more people like the same book, it will be by the same author(s) for all of them.
However, I noted in the comments why I think my explicit approach is better: the alternative might be a nested set of a nested set, which might get too complicated, and as yet I don't think you've demonstrated you need that. The trade-off is that my approach needs a bit more storage, but I think that's fine. You also have some more setting-up of objects, but if you have a single factory for this, and solidly unit test it, I don't think you need to worry.
(I think my approach could be faster too, but it is harder to say without developing a prototype for both and measuring performance on real datasets).
Addendum 2, to clean up some of the comment discussions and preserve them as part of the answer, in the context of the question.
To determine whether the suggestion I outline here is feasible, you'll need to create a prototype. I would recommend using an existing nested set solution, such as Propel with the NestedSetBehaviour, though GitHub will have many other libraries you can try. Do not try to integrate this prototype into your own ORM at this stage, as the integration work will just be a distraction. At the moment you want to test the idea for feasibility, that's all.
I am constructing a PHP OOP framework from scratch (I have no choice
in this matter)
You always have a choice.
The framework is required to handle object-oriented data in the most
efficient way possible. It doesn't need to be lightning quick, it just
needs to be the best possible solution to the problem
I personally would go for serialized strings or ORM+MySQL (InnoDB)
Objects very closely resemble class instances, in that they are an
instance of a specific class, which contains a strict set of
properties.
Sounds like the definition of... objects. Because an object is an instance of a class, it has to follow the class structure. Also, a class instance and an object are the same thing, so you kinda said that objects ... resemble objects.
Object properties can be basic types (strings, numbers, bools) but can
also be one object instance, or an array of objects (with the
restriction that the array must be objects of the same type)
Yes, that's the purpose of object-oriented programming and one of its powerful features.
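As a rough sketch of what that looks like in plain PHP (7.4 or later, for typed properties): the class and property names below are made up for illustration, and the "array of one type" restriction is approximated with a typed variadic parameter, which is one possible way to enforce it rather than the only one.

```php
<?php
// Hypothetical classes for illustration only.
class Address
{
    public string $postcode;

    public function __construct(string $postcode)
    {
        $this->postcode = $postcode;
    }
}

class Customer
{
    public string $name;      // basic type
    public Address $address;  // single object instance

    /** @var Address[] array of objects, all of one type */
    private array $previousAddresses = [];

    public function __construct(string $name, Address $address)
    {
        $this->name = $name;
        $this->address = $address;
    }

    // The typed variadic makes PHP reject anything that is not an Address.
    public function addPreviousAddresses(Address ...$addresses): void
    {
        foreach ($addresses as $address) {
            $this->previousAddresses[] = $address;
        }
    }

    public function previousAddressCount(): int
    {
        return count($this->previousAddresses);
    }
}

$customer = new Customer('bob', new Address('AB1 2CD'));
$customer->addPreviousAddresses(new Address('ZZ9 9ZZ'), new Address('XY3 4ZW'));
echo $customer->address->postcode, ' / ', $customer->previousAddressCount(), PHP_EOL;
```

Passing anything other than an Address to addPreviousAddresses() raises a TypeError, so the same-type rule is checked by the language itself.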