How does PHP access properties internally?

Following up on this documentation: https://www.php.net/manual/en/language.oop5.references.php
One of the key-points of PHP OOP that is often mentioned is that "objects are passed by references by default". This is not completely true.
In PHP, an object variable doesn't contain the object itself as value. It only contains an object identifier which allows object accessors to find the actual object.
How does this actually work? In C++ for example it seems that the arrow operator implicitly dereferences the pointer and then accesses the properties like when accessing them on the object variable itself.
Here's what I mean:
obj->prop
(*obj).prop // equivalent to line above
This seems pretty clean. In both cases the property is located at the object's address plus the property's offset.
But how does that work in PHP?
The documentation suggests that the variable does not store a memory address but rather an "object identifier". Is accessing properties in PHP a highly abstracted process, or does PHP resolve the object identifier to a memory address and then access the property in a way similar to C++, Java, etc.?

It's a highly abstracted process. Similarity in syntax does not indicate that the code "falls through" to working like C/C++. You can dive into the PHP source code to see how it works under the covers.
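The behaviour the manual describes can at least be observed from userland without reading the engine source. The following is a minimal sketch, assuming PHP 7.2 or later for spl_object_id(); the Point class is hypothetical and exists only for illustration. It shows that assignment copies the identifier rather than the object, while clone creates a separate object.

```php
<?php
// Hypothetical class used only for illustration.
class Point
{
    public $x = 0;
}

$a = new Point();
$b = $a;          // copies the object identifier, not the object
$b->x = 5;        // the change is visible through $a as well

$c = clone $a;    // a genuinely separate object with its own identifier
$c->x = 9;

var_dump($a->x);                                   // int(5)
var_dump(spl_object_id($a) === spl_object_id($b)); // bool(true)
var_dump(spl_object_id($a) === spl_object_id($c)); // bool(false)
```

This says nothing about how the engine lays out properties in memory; it only demonstrates the identifier semantics that the question quotes from the manual.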

Related

Object Oriented PHP how does function __destruct come into play?

In PHP, when defining classes, there's often a __construct (constructor) and a __destruct (destructor), which are called when an object is created and 'destroyed'.
In PHP, an object is 'destroyed' when it stops being used.
Now, how is that helpful? How is it used exactly and in which cases will it become handy in a programming language such as PHP?

"When an object is no longer needed it has to be deleted. Objects created within functions as local variables. (...) Whenever an object is deleted its destructor member function is called. Its understandable why constructors are so important, objects must be properly initialised before they can be used, but is it really necessary to have a special member function that gets called when the object is about to disappear?
In many cases the answer is no, we could leave the compiler to invent a default no-op one. However suppose your object contained a list of detector hits from which it was built. Without going into detail, it's likely that this would be some kind of dynamic object owned by the object and accessed via a pointer. Now when it comes time to delete the object, we want this list to be deleted, but probably not the hits that it points to! The compiler cannot possibly know, when it comes across a pointer in an object, whether it points to something owned by the object and to be deleted as well, or simply something related to, but independent of, the object.
So the rule is: if an object, during its lifetime, creates other dynamic objects, it must have a destructor that deletes them afterwards. Failure to tidy up like this can lead to orphan objects that just clog up the memory, something that is called a memory leak. Even when a default is acceptable, it's a good idea to define a destructor..."
See more: OO Concept: Constructors & Destructors
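The quoted text is written with C++ in mind, so here is a minimal PHP sketch of when __destruct actually runs. The TempResource class and its static log are hypothetical, used only to make the order of calls visible. It assumes the usual reference-counting behaviour, where the destructor runs as soon as the last reference to the object goes away.

```php
<?php
// Hypothetical class: records events so the order of calls is visible.
class TempResource
{
    public static $log = [];
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
        self::$log[] = "open {$name}";
    }

    public function __destruct()
    {
        // A real class would release what it owns here (file handles, locks, ...).
        self::$log[] = "close {$this->name}";
    }
}

function work()
{
    $r = new TempResource('inside-function');
    // $r goes out of scope when the function returns.
}

work();
$r = new TempResource('global');
unset($r); // last reference removed, so the destructor runs here

print_r(TempResource::$log);
// open inside-function, close inside-function, open global, close global
```

In practice this is handy for things PHP cannot clean up on its own terms, such as flushing a buffer or releasing a lock held by the object.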

Comparing PHP's __get() with __get__() and __getattr__() in Python

What is the difference between __get__() and __getattr__() in Python? I come from a PHP background, where there is only __get(). When should I use which function?
I've been trying to figure this out for a while. I see plenty of questions like this one, asking about the difference between __getattr__() and __getattribute__(), though.

First and foremost, PHP does not have an equivalent to Python's __get__(), not even close! What you are looking for is most definitely __getattr__().
I come from a PHP background, where there is only __get__
PHP has a magic method called __get(), which is invoked whenever you try to read a property that does not exist (or is not accessible from the calling scope).
A short list of non-equivalents
First, let's clear some things up:
PHP does not have an equivalent to Python's __get__()
PHP does not have an equivalent to Python's __getattr__()
PHP does not have an equivalent to Python's __getattribute__()
Python does not have an equivalent to PHP's __get()
(And for all setter methods respectively.)
Contrary to Achim's assumption, __get() does not do the same as Python's __getattr__()!
Python does not distinguish between methods and properties, but PHP does, which is why PHP has a second method: __call().
__call() is executed whenever you try to invoke a method that does not exist on an object. Python does not have an equivalent for this, because a method is simply an object (attribute) that is callable.
An example in PHP:
<?php
$obj = new stdClass();
($obj->test)();
In Python, this would fail with an AttributeError. In PHP 5, however, this does not even compile (PHP 7 and later parse it and fail at runtime instead):
Parse error: syntax error, unexpected '(' on line 4
Compare this to Python:
obj.method()
# is equivalent to:
(obj.method)()
This is an important difference. We conclude that the ways PHP and Python think about calling methods are completely different.
PHP's "call awareness"
PHP's obj knows that you call a method
you get to handle calls to non-existing methods explicitly on the object you call on
this is because PHP's expression model is very inconsistent (PHP 5.4 is a small step forward though)
but Python's obj does not.
This makes it possible in PHP for $obj->does_not_exist to evaluate to 3 while $obj->does_not_exist() evaluates to 5.
To my knowledge, it's impossible to do so in Python. (This would allow us to describe PHP's inconsistency as a feature.)
Thus, we get to extend our "not equivalent"-list by one bullet point:
Python does not have an equivalent to PHP's __call()/__callStatic()
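To make the 3-versus-5 claim concrete, here is a minimal sketch. The Magic class is hypothetical; only __get() and __call() are real PHP magic methods.

```php
<?php
// Hypothetical class for illustration only.
class Magic
{
    public function __get($name)
    {
        return 3; // reached when an inaccessible property is read
    }

    public function __call($name, $arguments)
    {
        return 5; // reached when an inaccessible method is called
    }
}

$obj = new Magic();
var_dump($obj->does_not_exist);   // int(3)
var_dump($obj->does_not_exist()); // int(5)
```

The same name resolves through two different hooks depending on whether the call syntax is used, which is exactly the "call awareness" described above.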
Summing it up
PHP provides two separate mechanisms:
__get() for non-existing properties
__call() for calls to non-existing methods
Python has only one mechanism, because it does not distinguish between properties and methods, as far as it is concerned they are all attributes.
__getattr__() is invoked, when an attribute does not exist.
obj.non_existing() is not special "call syntax", it's an expression to which the call operator () is applied: (obj.__getattr__("non_existing"))()
Disclaimer: __getattr__() is not always called when an attribute does not exist. __getattribute__() takes the highest precedence in the lookup chain and may thus cause __getattr__() to be ignored.
Descriptors
__get__() in Python is something completely different from what has been addressed above.
The documentation states that "a descriptor is an object attribute with "binding behavior"". I can sort of form an intuitive understanding of what "binding behavior" is supposed to mean, but only because I already understand what descriptors do.
I would choose to describe descriptors as "self-aware attributes" that can be shared across multiple classes.
Let me explain, what I mean by "self-awareness". Such attributes:
know when they are being accessed or read from
know when they are being written to
know whom they are read/written via
know when they are deleted
These attributes are independent objects: the "descriptor" objects, i.e. objects that adhere to the so called "descriptor protocol", which defines a set of methods along with their corresponding signature that such objects can implement.
An object does not own a descriptor attribute. In fact, they belong to the object's corresponding class (or an ancestor thereof). However, the same descriptor object can "belong" to multiple classes.
Note: "whom they are read via", what is the proper way to refer to obj in obj.attr? I would say: attr is accessed "via" obj.

You will find detailed documentation for all those methods here.
Coming from PHP, you should first make yourself familiar with the Python object model. It's much richer than PHP, so you should not try to map your PHP knowledge 1:1 to Python. If you want to develop PHP, use PHP. If you want to develop in Python, learn Python.
Coming back to your original question: __getattr__ is probably the function which does the same as the __get function in PHP. __get__ in Python is used to implement descriptors. Details about descriptors can also be found in the documentation I mentioned above.

Why return object instead of array?

I do a lot of work in WordPress, and I've noticed that far more functions return objects than arrays. Database results are returned as objects unless you specifically ask for an array. Errors are returned as objects. Outside of WordPress, most APIs give you an object instead of an array.
My question is, why do they use objects instead of arrays? For the most part it doesn't matter too much, but in some cases I find objects harder to not only process but to wrap my head around. Is there a performance reason for using an object?
I'm a self-taught PHP programmer. I've got a liberal arts degree. So forgive me if I'm missing a fundamental aspect of computer science. ;)

These are the reasons why I prefer objects in general:
Objects not only contain data but also functionality.
Objects have (in most cases) a predefined structure. This is very useful for API design. Furthermore, you can set properties as public, protected, or private.
Objects are a better fit for object-oriented development.
In most IDEs, auto-completion only works for objects.
Here is something to read:
Object Vs. Array in PHP
PHP stdClass: Storing Data in an Object Instead of an Array
When should I use stdClass and when should I use an array in php5 oo code
PHP Objects vs Arrays
Mysql results in PHP - arrays or objects?
PHP objects vs arrays performance myth
A Set of Objects in PHP: Arrays vs. SplObjectStorage
Better Object-Oriented Arrays

This probably isn't something you are going to deeply understand until you have worked on a large software project for several years. Many fresh computer science majors will give you an answer with all the right words (encapsulation, functionality with data, and maintainability) but few will really understand why all that stuff is good to have.
Let's run through a few examples.
If arrays were returned, then either all of the values would need to be computed up front, or lots of little values would need to be returned from which you could build the more complex values.
Think about an API method that returns a list of WordPress posts. These posts all have authors, and authors have names, e-mail addresses, maybe even profiles with their biographies.
If you are returning all of the posts in an array, you'll either have to limit yourself to returning an array of post IDs:
[233, 41, 204, 111]
or returning a massive array that looks something like:
[
    ['title' => 'somePost', 'body' => 'blah blah', 'author' => ['name' => 'billy', 'email' => 'bill@bill.com', 'profile' => ['interests' => ['interest1', 'interest2', ...], 'bio' => 'info...']]],
    ['id' => '2', .....],
]
The first case of returning a list of IDs isn't very helpful to you because then you need to make an API call for each ID in order to get some information about that post.
The second case will pull way more information than you need 90% of the time and be doing way more work (especially if any of those fields is very complicated to build).
An object on the other hand can provide you with access to all the information you need, but not have actually pulled that information yet. Determining the values of fields can be done lazily (that is, when the value is needed and not beforehand) when using an object.
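Lazy evaluation of a field could look roughly like this. It is only a sketch: the Post class, its getAuthor() method and the loader callable are hypothetical, and the closure stands in for whatever expensive lookup a real API would perform.

```php
<?php
// Hypothetical Post class; the loader callable stands in for a database query.
class Post
{
    public $title;
    private $authorLoader;
    private $author = null;

    public function __construct($title, callable $authorLoader)
    {
        $this->title = $title;
        $this->authorLoader = $authorLoader;
    }

    public function getAuthor()
    {
        if ($this->author === null) {
            // Computed on first access only, then cached on the object.
            $this->author = call_user_func($this->authorLoader);
        }
        return $this->author;
    }
}

$queries = 0;
$post = new Post('somePost', function () use (&$queries) {
    $queries++; // pretend this is an expensive lookup
    return ['name' => 'billy'];
});

echo $post->title, "\n";               // no lookup has happened yet
echo $post->getAuthor()['name'], "\n"; // triggers the lookup
echo $post->getAuthor()['name'], "\n"; // served from the cached value
echo $queries, "\n";                   // 1
```

A caller that only needs titles never pays for the author lookup, which a pre-built array cannot offer.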
Arrays expose more data and capabilities than intended
Go back to the example of the massive array being returned. Now someone may likely build an application that iterates over each value inside the post array and prints it. If the API is updated to add just one extra element to that post array then the application code is going to break since it will be printing some new field that it probably shouldn't. If the order of items in the post array returned by the API changes, that will break the application code as well. So returning an array creates all sorts of possible dependencies that an object would not create.
Functionality
An object can hold information inside of it that will allow it to provide useful functionality to you. A post object, for instance, could be smart enough to return the previous or next posts. An array couldn't ever do that for you.
Flexibility
All of the benefits of objects mentioned above help to create a more flexible system.

My question is, why do they use objects instead of arrays?
Probably two reasons:
WordPress is quite old
arrays are faster and take less memory in most cases
easier to serialize
Is there a performance reason for using an object?
No. But a lot of good other reasons, for example:
you may store logic in the objects (methods, closures, etc.)
you may force object structure using an interface
better autocompletion in IDE
you don't get notices for undefined array keys
in the end, you may easily convert any object to array
OOP != AOP :)
(For example, in Ruby, everything is an object. PHP was procedural/scripting language previously.)

WordPress (and a fair amount of other PHP applications) use objects rather than arrays, for conceptual, rather than technical reasons.
An object (even if just an instance of stdClass) is a representation of one thing. In WordPress that might be a post, a comment, or a user. An array on the other hand is a collection of things. (For example, a list of posts.)
Historically, PHP hasn't had great object support so arrays became quite powerful early on. (For example, the ability to have arbitrary keys rather than just being zero-indexed.) With the object support available in PHP 5, developers now have a choice between using arrays or objects as key-value stores. Personally, I prefer the WordPress approach as I like the syntactic difference between 'entities' and 'collections' that objects and arrays provide.

My question is, why do they (Wordpress) use objects instead of arrays?
That's really a good question and not easy to answer. I can only assume that it's common in Wordpress to use stdClass objects because they're using a database class that by default returns records as a stdClass object. They got used to it (8 years and more) and that's it. I don't think there is much more thought behind the simple fact.
syntactic sugar for associative arrays
-- Zeev Suraski about the standard object since PHP 3
stdClass objects are not really better than arrays. They are pretty much the same. This is partly for historical reasons of the language, and partly because stdClass objects are really limited: they are only value objects in a very basic sense.
stdClass objects store values for their members like an array does per entry. And that's it.
Only PHP freaks are able to create stdClass objects with private members. There is not much benefit - if any - doing so.
stdClass objects do not have any methods/functions. So no use of that in Wordpress.
Compared with arrays, there are far fewer helpful functions for dealing with a list or semi-structured data.
However, if you're used to arrays, just cast:
$array = (array) $object;
And you can access the data that was previously an object as an array. Or, if you prefer it the other way round:
$object = (object) $array;
Which will only drop invalid member names, like numbers. So take a little care. But I think you get the big picture: There is not much difference as long as it is about arrays and objects of stdClass.
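A short round trip with both casts, as a sketch. It uses string keys only, because the handling of numeric keys in these casts has differed between PHP versions.

```php
<?php
$object = new stdClass();
$object->title = 'somePost';
$object->body = 'blah blah';

$array = (array) $object;            // properties become string keys
var_dump($array['title']);           // string(8) "somePost"

$back = (object) $array;             // string keys become properties
var_dump($back->body);               // string(9) "blah blah"
var_dump($back instanceof stdClass); // bool(true)
```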
Related:
Converting to object PHP Manual
Reserved Classes PHP Manual
What is stdClass in PHP?

The code looks cooler that way
Objects pass by reference
Objects are more strongly typed than arrays, hence less prone to errors (or they give you a meaningful error message when you try to use a non-existent member)
All the IDEs today have auto-complete, so when working with defined objects, the IDE does a lot for you and speeds things up
Easily encapsulate logic and data in the same box, whereas with arrays you store the data in the array and then use a set of different functions to process it.
Inheritance: if you had a similar array with almost but not quite the same functionality, you would have to duplicate more code than if you did it with objects
Probably some more reasons I haven't thought about

Objects are much more powerful than arrays can be.
Each object as an instance of a class can have functions attached.
If you have data that need processing then you need a function that does the processing.
With an array you would have to call that function on that array and therefore associate the logic yourself to the data.
With an object this association is already done and you don't have to care about it any more.
Also you should consider the OO principle of information hiding. Not everything that comes back from or goes to the database should be directly accessible.

There are several reasons to return objects:
Writing $myObject->property requires fewer "overhead" characters than $myArray['element']
Object can return data and functionality; arrays can contain only data.
Enable chaining: $myobject->getData()->parseData()->toXML();
Easier coding: IDE autocompletion can provide method and property hints for object.
In terms of performance, arrays are often faster than objects. In addition to performance, there are several reasons to use arrays:
The functionality provided by the array_*() family of functions can reduce the amount of coding necessary in some cases.
count() works on arrays out of the box. A plain object can be iterated with foreach (over its accessible properties), but count() and custom iteration require the class to implement Countable and Iterator or IteratorAggregate.
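For completeness, here is a sketch of an object that supports count() and foreach like an array does. PostList is a hypothetical class; Countable, IteratorAggregate and ArrayIterator are standard PHP interfaces and classes.

```php
<?php
// Hypothetical collection class for illustration.
class PostList implements Countable, IteratorAggregate
{
    private $posts;

    public function __construct(array $posts)
    {
        $this->posts = $posts;
    }

    // Called by count($list).
    public function count(): int
    {
        return count($this->posts);
    }

    // Called by foreach ($list as ...).
    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->posts);
    }
}

$list = new PostList(['first', 'second', 'third']);
echo count($list), "\n"; // 3
foreach ($list as $index => $title) {
    echo $index, ': ', $title, "\n";
}
```

The array_*() functions still expect real arrays, so a class like this usually also exposes a method that returns the underlying array.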

It's usually not going to be because of performance reasons. Typically, objects cost more than arrays.
For a lot of APIs, it probably has to do with the objects providing other functionality besides being a storage mechanism. Otherwise, it's a matter of preference and there is really no benefit to returning an object vs an array.

An array is just an index of values, whereas an object contains methods which can generate the result for you. Sure, sometimes you can access an object's values directly, but the "right way to do it" is to access an object's methods (functions operating on the values of that object).
$obj = new MyObject;
$obj->getName(); // this calls a method (function), so it can decide what to return based on conditions or other criteria
$array['name']; // this is just the value stored under the key "name". there is no logic to it.
Sometimes you are accessing an object's variables directly. This is usually frowned upon, but it still happens quite often.
$obj->name; // accessing the string "name" ... not really different from an array in this case.
However, consider that the MyObject class doesn't have a variable called 'name', but instead has a first_name and last_name variable.
$obj->getName(); // this would return first_name and last_name joined.
$obj->name; // would fail...
$obj->first_name;
$obj->last_name; // would be accessing the variables of that object directly.
This is a very simple example, but you can see where this is going. A class provides a collection of variables and the functions which can operate on those variables all within a self-contained logical entity. An instance of that entity is called an object, and it introduces logic and dynamic results, which an array simply doesn't have.
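A runnable version of the example might look like this. The constructor and the sample names are assumptions added for illustration; the answer only names MyObject, getName(), first_name and last_name.

```php
<?php
class MyObject
{
    public $first_name;
    public $last_name;

    // The constructor is an assumption; the answer does not show one.
    public function __construct($first_name, $last_name)
    {
        $this->first_name = $first_name;
        $this->last_name = $last_name;
    }

    // The method decides how the name is built from the stored values.
    public function getName()
    {
        return $this->first_name . ' ' . $this->last_name;
    }
}

$obj = new MyObject('Ada', 'Lovelace');
echo $obj->getName(), "\n";   // Ada Lovelace
echo $obj->first_name, "\n";  // Ada
echo $obj->last_name, "\n";   // Lovelace
```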

Most of the time objects are just as fast as arrays, if not faster; in PHP there isn't a noticeable difference. The main reason is that objects are more powerful than arrays. Object-oriented programming allows you to create objects and store not only data but also functionality in them. For example, in PHP the MySQLi class gives you a database object that you can manipulate using a host of built-in methods, rather than the procedural approach.
So the main reason is that OOP is an excellent paradigm. I wrote an article about why using OOP is a good idea, and explaining the concept, you can take a look here: http://tomsbigbox.com/an-introduction-to-oop/
As a minor plus you also type less to get data from an object - $test->data is better than $test['data'].

I'm unfamiliar with WordPress. A lot of answers here suggest that a strength of objects is their ability to contain functional code. When returning an object from a function/API call, it shouldn't contain utility functions. Just properties.
The strength in returning objects is that whatever lies behind the API can change without breaking your code.
Example: You get an array of data with key/value pairs, key representing the DB column. If the DB column gets renamed your code will break.

I ran the following test in PHP 5.3.10 (Windows):
for ($i = 0; $i < 1000000; $i++) {
$x = array();
$x['a'] = 'a';
$x['b'] = 'b';
}
and
for ($i = 0; $i < 1000000; $i++) {
$x = new stdClass;
$x->a = 'a';
$x->b = 'b';
}
Copied from http://atomized.org/2009/02/really-damn-slow-a-look-at-php-objects/comment-page-1/#comment-186961
Calling each function for 10 concurrent users, 10 times each (to obtain an average), gives:
Arrays : 100%
Object : 214% – 216% (about 2 times slower).
In other words, objects are still painfully slow. OOP keeps things tidy, but it should be used carefully.
What does WordPress use? Both: it uses objects, arrays, and objects combined with arrays. The wpdb class uses the latter (and it is the heart of WordPress).
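The figures above are from PHP 5.3 and will not necessarily hold on newer versions, so it is worth re-measuring. A minimal timing harness for the two loops could look like this; timeIt() is a hypothetical helper, and the sketch deliberately states no expected numbers because they depend on the PHP version and machine.

```php
<?php
// Hypothetical helper: runs $fn once and returns the elapsed time in seconds.
function timeIt(callable $fn)
{
    $start = microtime(true);
    $fn();
    return microtime(true) - $start;
}

$iterations = 1000000;

$arraySeconds = timeIt(function () use ($iterations) {
    for ($i = 0; $i < $iterations; $i++) {
        $x = array();
        $x['a'] = 'a';
        $x['b'] = 'b';
    }
});

$objectSeconds = timeIt(function () use ($iterations) {
    for ($i = 0; $i < $iterations; $i++) {
        $x = new stdClass;
        $x->a = 'a';
        $x->b = 'b';
    }
});

printf("array:  %.4f s\n", $arraySeconds);
printf("object: %.4f s\n", $objectSeconds);
printf("ratio:  %.2f\n", $objectSeconds / $arraySeconds);
```

A single run like this is noisy; repeating it several times and comparing averages gives a fairer picture.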

It follows the boxing and unboxing principle of OOP. While languages such as Java and C# support this natively, PHP does not. However it can be accomplished, to some degree in PHP, just not eloquently as the language itself does not have constructs to support it. Having box types in PHP could help with chaining, keeping everything object oriented and allows for type hinting in method signatures. The downside is overhead and the fact that you now have extra checking to do using the "instanceof" construct. Having a type system is also a plus when using development tools that have intellisense or code assist like PDT. Rather than having to google/bing/yahoo for the method, it exists on the object, and you can use the tool to provide a drop down.

Although the points made about objects being more than just data are valid (since they are usually data and behaviour), there is at least one pattern mentioned in Martin Fowler's "Patterns of Enterprise Application Architecture" that applies to this type of scenario, in which you're transferring data between one system (the application behind the API) and another (your application).
It's the Data Transfer Object: an object that carries data between processes in order to reduce the number of method calls.
So if the question is whether APIs should return a DTO or an array I would say that if the performance cost is negligible then you should choose the option that is more maintainable which I would argue is the DTO option... but of course you also have to consider the skills and culture of the team that is developing your system and the language or IDE support for each of the options.
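As a rough illustration of what a DTO could look like in PHP: the PostData class, its fields and the author_name column are all hypothetical names chosen for this sketch.

```php
<?php
// Hypothetical DTO: carries data only, no behaviour beyond conversion.
final class PostData
{
    public $id;
    public $title;
    public $authorName;

    public function __construct($id, $title, $authorName)
    {
        $this->id = $id;
        $this->title = $title;
        $this->authorName = $authorName;
    }

    // The mapping from storage column names lives in one place.
    public static function fromArray(array $row)
    {
        return new self($row['id'], $row['title'], $row['author_name']);
    }

    public function toArray()
    {
        return [
            'id' => $this->id,
            'title' => $this->title,
            'authorName' => $this->authorName,
        ];
    }
}

$dto = PostData::fromArray(['id' => 2, 'title' => 'somePost', 'author_name' => 'billy']);
echo $dto->authorName, "\n"; // billy
```

If the column is later renamed, only fromArray() changes, while callers keep using $dto->authorName.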

Functions by reference or by variable, which to use when?

Well, I read in my handy PHP book that it's very important to be able to distinguish between reference and variable parameters. The book says that the original value of parameterized variables are preserved when the variable is changed, and the original values of parameterized references change when the reference is changed. It says that's the key difference, if I am reading right.
Well, I'm wondering when each is more useful than the other. How do I know when to use variables and when to use references when I create my own functions?

It's pretty straightforward. Use references when you need to modify the value of the variable passed in to the function. Use variables when you don't need to or want to modify the value.
So, for example, if you're writing a function that takes an array and changes that array, you'd be better off using a reference for that array rather than returning a new array from the function.
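The difference between the two styles, as a small sketch. Both function names are hypothetical.

```php
<?php
// By value: the function works on its own copy and returns the result.
function addItemByValue(array $items, $item)
{
    $items[] = $item;
    return $items;
}

// By reference (note the &): the caller's array is modified in place.
function addItemByReference(array &$items, $item)
{
    $items[] = $item;
}

$list = ['a'];

$new = addItemByValue($list, 'b');
var_dump(count($list)); // int(1), the original is untouched
var_dump(count($new));  // int(2)

addItemByReference($list, 'c');
var_dump(count($list)); // int(2), the original now holds 'a' and 'c'
```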

"References" (variable aliases) make your code harder to understand and could be a source of hard to follow errors. There are no valid reasons to use references in php and to be on the safer side try to avoid them altogether.
And no, objects in php5 have nothing to do with "references".
"References" as implemented in php is a strange concept. Normally, in programming languages variables are independent of each other so that changing one variable doesn't affect others. Php "references" allow several variables to share the same value and to be dependent of each other. Basically, you change one variable, and suddenly another one, which you think is totally unrelated, is getting changed too. It's no good thing and often leads to much confusion.
Objects in php (do I need to add 'five'?) have nothing to do with "references" in the above sense. They behave much like C pointers (actually, this is what they are under the hood) - when you pass an object to a function, you actually pass a pointer, and the function can use this pointer to manipulate the object contents, but there's no way for the function to change the passed variable itself, for example, make it point to another object.
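The distinction drawn here can be checked directly. In this sketch the Box class and the three functions are hypothetical; the point is the difference between changing the object's contents and rebinding the caller's variable.

```php
<?php
// Hypothetical class for illustration.
class Box
{
    public $value = 1;
}

function mutate(Box $box)
{
    $box->value = 2;      // changes the object the caller also sees
}

function rebind(Box $box)
{
    $box = new Box();     // only the local variable points to the new object
    $box->value = 99;
}

function rebindByReference(Box &$box)
{
    $box = new Box();     // with &, the caller's variable itself is replaced
    $box->value = 99;
}

$b = new Box();

mutate($b);
var_dump($b->value); // int(2)

rebind($b);
var_dump($b->value); // int(2), the caller still holds the original object

rebindByReference($b);
var_dump($b->value); // int(99)
```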
This "objects are references" misunderstanding is probably because people confuse php "references" (ampersand syntax) with the generic CS term, which also applies to pointers, handles etc.

Are primitive data types in PHP passed by reference?

In PHP, I'm frequently doing lots of string manipulation. Is it all right to split my code into multiple functions? If primitive types like strings are passed by value, I would be significantly affecting performance.

Only objects are passed by reference.
That doesn't mean you'll get a performance boost by changing to references though - PHP uses copy-on-write, so a copy is only made if you modify the variable.
Splitting your code into functions won't slow it down from that point of view.
There is a small overhead for calling a function, but unless you're in a loop calling tens of thousands of them, it's probably not something you need to worry about.
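Copy-on-write can be made visible with memory_get_usage(). This is only a sketch: the function names are hypothetical, and the exact byte counts depend on the PHP version and memory allocator, so the code prints the deltas rather than promising specific numbers.

```php
<?php
// Both functions receive the string by value; only the second one writes to it.
function readOnly($text)
{
    $length = strlen($text);   // reading: no copy is needed
    return [memory_get_usage(), $length];
}

function modify($text)
{
    $text .= 'y';              // writing: PHP separates a private copy first
    return [memory_get_usage(), strlen($text)];
}

$big = str_repeat('x', 1000000);
$before = memory_get_usage();

list($afterRead) = readOnly($big);
list($afterWrite) = modify($big);

$readDelta = $afterRead - $before;
$writeDelta = $afterWrite - $before;

printf("extra memory while only reading: %d bytes\n", $readDelta);
printf("extra memory while modifying:    %d bytes\n", $writeDelta);
var_dump(strlen($big)); // int(1000000), the caller's string is unchanged
```

The read-only call should report little or no extra memory, while the modifying call should report roughly the size of the string, since that is when the copy is actually made.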

Objects are passed by reference. Everything else is passed by value unless you explicitly use pass-by-reference with the & operator.
That being said, PHP also uses copy-on-write to avoid unnecessary copying.

Yes, primitives are passed by value unless you explicitly define the function to pass by reference (by using an ampersand & in front of the parameter) or invoke the function with an ampersand in front of the argument. (The latter, call-time pass-by-reference, was deprecated in PHP 5.3 and removed in PHP 5.4.)
See this part of the documentation for more.
EDIT
Also, the statement that "objects are passed by reference" in PHP is a bit of a simplification, though it can often be thought of that way for most purposes. This chapter of the documentation explains the differences.

Passing by reference is actually slower than passing by value in PHP. I can't find the correct citation for this claim; it's somewhere in the "References" section of the PHP manual.

By default, everything is passed by value. If you want to pass something by reference you have to explicitly state it as so.
Here is the php documentation that explicitly states this behavior.
