Class-to-class passing of data, optimization - PHP

I have written a file-handler class that works like this:
__construct opens and exclusively locks (LOCK_EX) a file, reads its JSON content, and parses it into a PHP array, kept as a property of the class.
The file stays locked, in order to avoid race conditions.
Other 'worker classes' make changes to this array, in/from other scopes.
__destruct encodes the finished array, writes it to the file, and unlocks the file.
Everything works fine ...
QUESTION:
Is it sensible to keep the array as a property of the original class, or is it better to pass the array to the worker classes and let them return it at the end?
Perhaps there is a way to keep the array locally and pass it to the worker classes by reference, instead of as raw data?
I mean ... this is a question of avoiding duplicates and not wasting memory, a question of speed (not passing things unnecessarily), and a question of best practice (keeping things easy to understand).

Actually, by passing the array to another function, having that function modify the array, and then returning it to some other caller that may or may not also modify it, you are in fact copying that array multiple times (since this triggers PHP's copy-on-write semantics) and, by definition, wasting memory.
Whereas by keeping it as a property of the object instance, you would not trigger any copy-on-write at all, even if the caller is not the same instance: passing an object instance won't copy the array, nor will modifying the array through that instance.
Not to mention that it makes it easier to retain state within that object (assuming you care about validation).
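As a minimal sketch of the property-based approach (all class and method names here are hypothetical, and the file locking is omitted): workers receive the handler object and mutate its array in place, and no copy of the array is ever made:

```php
<?php

// Hypothetical file handler: holds the decoded JSON data as a property.
class JsonFileHandler
{
    public array $data;

    public function __construct(array $initial)
    {
        // In the real class this would flock() and json_decode() a file.
        $this->data = $initial;
    }
}

// A worker mutates the array *through the handler object*.
// Passing $handler copies only the object handle, never the array.
class Worker
{
    public function bumpCounter(JsonFileHandler $handler): void
    {
        $handler->data['counter'] = ($handler->data['counter'] ?? 0) + 1;
    }
}

$handler = new JsonFileHandler(['counter' => 0]);
(new Worker())->bumpCounter($handler);
echo $handler->data['counter']; // 1
```

Since the workers hold only the object handle, the change made inside bumpCounter is visible everywhere the handler is used, with no return value needed.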

Related

Object Oriented PHP how does function __destruct come into play?

In PHP, when defining classes, there is often a __construct (constructor) and a __destruct (destructor), which are called when an object is created and 'destroyed' respectively.
In PHP, an object is 'destroyed' when it stops being used.
Now, how is that helpful? How is it used exactly and in which cases will it become handy in a programming language such as PHP?
"When an object is no longer needed it has to be deleted. Objects created within functions as local variables. (...) Whenever an object is deleted its destructor member function is called. It's understandable why constructors are so important (objects must be properly initialised before they can be used), but is it really necessary to have a special member function that gets called when the object is about to disappear?
In many cases the answer is no, we could leave the compiler to invent a default no-op one. However, suppose your object contained a list of detector hits from which it was built. Without going into detail, it's likely that this would be some kind of dynamic object owned by the object and accessed via a pointer. Now, when it comes time to delete the object, we want this list to be deleted, but probably not the hits that it points to! The compiler cannot possibly know, when it comes across a pointer in an object, whether it points to something owned by the object and to be deleted as well, or simply something related to, but independent of, the object.
So the rule is: if an object, during its lifetime, creates other dynamic objects, it must have a destructor that deletes them afterwards. Failure to tidy up like this can lead to orphan objects that just clog up the memory, something that is called a memory leak. Even when a default is acceptable, it's a good idea to define a destructor..."
See more: OO Concept: Constructors & Destructors
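In PHP terms, the destructor is the natural place to release whatever resource the constructor acquired. A minimal sketch (the class and file names are hypothetical):

```php
<?php

// Hypothetical example: the destructor releases what the constructor acquired.
class TempLogger
{
    private $handle;
    private string $path;

    public function __construct(string $path)
    {
        $this->path = $path;
        $this->handle = fopen($path, 'w'); // acquire the resource
    }

    public function log(string $line): void
    {
        fwrite($this->handle, $line . "\n");
    }

    public function __destruct()
    {
        // Runs when the last reference to the object goes away.
        fclose($this->handle);
        unlink($this->path); // tidy up the temporary file
    }
}

$path = sys_get_temp_dir() . '/templogger_demo.log';
$logger = new TempLogger($path);
$logger->log('hello');
unset($logger); // __destruct runs here: the file is closed and removed
```

PHP also calls __destruct during script shutdown for any objects still alive, though relying on shutdown order for critical cleanup is risky.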

What is faster: an index-array or getters and variables? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I am trying to figure out what should be considered better for performance:
I have a bunch of objects that contain a lot of page-data.
A few examples of the data that an object can have:
filepath of PHP-file for includes
CSS filepath
JavaScript filepath
Meta data of the page
Each object is specific to a type of content. I have an interface that defines the render function, and each object implements this function differently.
Example:
class PhpFragment extends FragmentBase {
    public function render() {
        // ... render output for this type of data
    }
}
I am currently using a parent object that contains variables, each of which can hold multiple objects of the type mentioned above. The object looks something like this:
class pageData {
    protected $CSS;
    protected $PHP;
    protected $JS;
    protected $Meta;
    // etc...

    public function getCSS() {
        return $this->CSS;
    }

    public function getPHP() {
        return $this->PHP;
    }

    public function getJS() {
        return $this->JS;
    }
}
Whenever I load a page, I walk through a template and render the data of each object that matches a tag in the template.
For example: if a template has a line where CSS is needed, I call the getCSS function of pageData, which returns an array of objects. For each of these objects I call the render function and add the output to the page.
What do I want?
I want to get rid of these fixed variables in the pageData object, to make the design as dynamic as possible. I want the pageData object to disappear and just have an array of different fragment objects.
To achieve this, I need to replace the get functions in pageData with something clever.
My top priority is performance, so I thought I'd loop through all the objects once to collect the different types, put each type as a key in an array, and make each value a subarray holding the objects of that type.
What I was wondering, before I start changing the design entirely, is this faster?
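For reference, the grouping described in the question could be sketched like this (all class names are hypothetical; each fragment reports its own type and the fragments are grouped once, up front):

```php
<?php

// Hypothetical fragment classes; each reports its own type.
interface Fragment
{
    public function getType(): string;
    public function render(): string;
}

class CssFragment implements Fragment
{
    public function getType(): string { return 'css'; }
    public function render(): string { return '<link rel="stylesheet" href="main.css">'; }
}

class JsFragment implements Fragment
{
    public function getType(): string { return 'js'; }
    public function render(): string { return '<script src="main.js"></script>'; }
}

// One pass over all fragments: group them by type so that lookups
// while walking the template are simple array accesses.
function groupByType(array $fragments): array
{
    $byType = [];
    foreach ($fragments as $fragment) {
        $byType[$fragment->getType()][] = $fragment;
    }
    return $byType;
}

$grouped = groupByType([new CssFragment(), new JsFragment(), new CssFragment()]);
echo count($grouped['css']); // 2
```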
I don't know if this is the right place to ask this question (it's more of a code-review question, IMO). Anyway, here are a couple of thoughts I'd consider if I were you:
What are objects?
Objects are units of functionality, or entities that represent a specific set of values. DTOs (like your pageData class) serve one purpose: to group and represent a set of values that belong together. The fact that a class has a type (type hints) and an interface makes a code-base testable and easier to understand, maintain, debug, and document.
At first glance, a simple DTO isn't too different from a simple array, and yes, objects have a marginal performance cost.
The question you need to ask is whether or not you want to shave off those 1 or 2 ms per request at the cost of increased development time and less testable, more error-prone, harder-to-maintain code. I'd argue that for this reason alone, DTOs make more sense than arrays.
Pre-declared properties are fast
If you want an object that is as dynamic as possible, PHP offers you the possibility to add properties to instances on the fly:
class Foo {}

$x = new Foo;
$x->bar = 'new property';
echo $x->bar; // echoes "new property"
So in essence, objects are just as flexible as arrays. However, properties that weren't declared beforehand are (again, marginally) slower than pre-declared properties, and note that dynamic properties are deprecated as of PHP 8.2.
When a class definition declares 3 properties, these properties are stored in a hash table, and this hash table is checked first when accessing a member of an instance. Internally, these hash-table lookups are O(1). If no properties were declared, any "dynamic" property is stored in a second hash table, and lookups on this fallback hash table are O(n). Not terrible, but worse than they need to be.
In addition to being less performant, dynamic properties are also always public, so you have no control over their values (they can be reassigned elsewhere), and they are, of course, susceptible to human error (typos):
$x = new Foo;
$x->foo = 'Set the value of foo';
echo $x->fo; // typo: raises an "undefined property" warning instead of the value
Getters and setters are good
The methods you have now don't do anything, true enough, but consider this:
class User
{
    protected $email;

    public function setEmail($email)
    {
        if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
            throw new \InvalidArgumentException('Invalid email');
        }
        $this->email = $email;

        return $this;
    }
}
A setter like this not only allows me to control/check when and where a property is set, but also to validate the data that someone is trying to assign to it. You can ensure that, no matter what, if you receive an instance of User, the email will either be null or a valid email address.
There are many more reasons why objects make more sense than arrays, but these alone, to me at least, outweigh the benefits of 2ms/req performance gain.
If performance is such an issue, why not write in a faster language?
If all you're after is performance, you might want to look into languages that outperform PHP to begin with. Don't get me wrong: I honestly like PHP, but it's just a fact that, for example, Go can do the same thing, only faster.
Pass by value, copy-on-write, and (almost) pass by reference
Arrays are, essentially, scalar values: pass an array to a function, and any changes made to that array inside the function don't change the array you passed in. Objects are (sort of) passed by reference. That's to say: objects are passed by identifier.
Say you have an instance of Foo. The Zend engine will assign a unique ID to that instance (eg 123). When you call a function and pass that instance, internally, you'll pass the identifier of that object to the method not the object itself.
This has several implications: When changing the state of the instance, PHP doesn't have to make a copy of the object: it just uses the ID to get the zval (internal representation of a PHP variable), and operates on the same piece of memory. The net result: you're passing a simple value (an int), and whatever happens to the object, wherever it happens, the state is shared throughout.
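A short sketch of the object side (the class name is hypothetical): the callee receives the same instance, so state changes are visible to the caller without anything being returned:

```php
<?php

class Counter
{
    public int $value = 0;
}

// The object "handle" is passed by value, but it identifies the same
// instance, so mutations inside the function are visible outside it.
function increment(Counter $c): void
{
    $c->value++;
}

$counter = new Counter();
increment($counter);
increment($counter);
echo $counter->value; // 2
```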
Arrays are different: Passing an array is (sort-of) passing a copy of that value. In reality, PHP is clever enough to pass a reference to the existing array, but once you start reassigning values, PHP does have to create a copy. This is the copy-on-write mechanism. Put simply, the idea is: do not create needless copies of values, unless you have to:
function foo(array $data)
{
    $x = $data[0];            // read only: no copy of the argument is required
    $data[1] = $x * $data[3]; // altering the argument: now a copy is created
}

$data = [1, 2, 3, 4];
foo($data); // passes a reference internally
Depending on how you use the arrays or objects you pass to functions, one might perform better than the other. On the whole: passing an array that you'll only use to read values will most likely outperform passing an object. However, if you start operating on the array/object, an object might turn out to outperform arrays...
TL;DR
Yes, arrays are generally faster than objects. But they're less safe, pretty much impossible to test, harder to maintain, and non-communicative (public function doStuff(array $data) doesn't tell me as much as public function doStuff(User $data)).
Owing to the copy-on-write and the way instances are passed to functions, it's impossible to say which will be faster with absolute certainty. It really boils down to what you do: is the array fairly small, and are you only reading its values, then it's probably going to be faster than objects.
The moment you start operating on the data, it's entirely possible objects might prove to be faster.
I can't just leave it there without at least mentioning that old mantra:
Premature optimization is the root of all evil
Switching from objects to arrays for performance's sake does smell of micro-optimization. If you have in fact reached the point where there's nothing left to optimize but these kinds of trivial things, then either the project is a small one, or you're the first person ever to work on a big project and actually finish it. In all other cases, you shouldn't be wasting time on this kind of optimization.
Things that are far more important to profile, and then optimize are:
Caching (opcache, memcache, ...)
Disk IO (including files, autoloader mechanisms)
Resource management: open file pointers, DB connections (when to connect, when to close connections)
If you're using a traditional SQL DB: queries... The vast majority of PHP applications can benefit a lot by having a DBA look at the queries and actually optimize those
Server setup
...
Only if you've gone through this list, and more, could you perhaps consider thinking about this kind of micro-optimization. That is, of course, if by then you haven't encountered any bugs...

PHP: Is it faster to reuse object or copy into array

I have a basic question that I was hoping someone could answer please.
In PHP, is it quicker to refer back to properties of an object over and over again, or is it faster to copy those properties into an array if they're being used that much?
This assumes you have an object already instantiated and fully populated. When you want to constantly pass some of that object's properties into various functions, should I just reuse the object, or does that somehow create an overhead that an array wouldn't?
Example:
I have a Request object. This object has several search parameters. I want to keep on referring to these different search parameters, so currently I'm using:
$request->d->postcode
Someone suggested copying these search parameters into an array first, then re-using the array instead:
$searchParams = get_object_vars($request->d);
then I can simply use:
$searchParams['postcode']
Many thanks for any advice.
I think this is a matter of personal preference.
One of the great things about objects is that they reduce duplication in your code; copying the object into an array creates unnecessary duplication and can make your code overly complex (the same values under differently named variables).
I find it easier to keep all the data in my code in the same object(s). This makes it more readable for others as well.
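One thing worth noting (a small sketch, with a hypothetical class standing in for $request->d): get_object_vars() copies the properties into a new array, so it is a snapshot; later changes to the object are not reflected in it:

```php
<?php

// Hypothetical object with public search parameters.
class RequestData
{
    public string $postcode = 'AB1 2CD';
}

$request = new RequestData();

// get_object_vars() takes a snapshot of the (accessible) properties...
$searchParams = get_object_vars($request);

// ...so a later change to the object is not reflected in the array.
$request->postcode = 'XY9 8ZW';

echo $searchParams['postcode']; // still "AB1 2CD"
echo $request->postcode;        // "XY9 8ZW"
```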

Object caching - is it more efficient to store cloned objects or serialized data?

Suppose I'm loading a large number of objects from a database. These are normal, plain PHP objects, no inheritance from anything fancy. Suppose I might change a few of these objects and want to write them back to the database, but only use the fields that actually differ in the UPDATE ... SET ... query. Also suppose that I don't know in advance which objects are going to be changed.
I'm thinking that I need to make a copy of all the objects loaded and keep it around for reference and comparison, should I need to write objects back to the database.
I see two possible approaches:
I can either clone all the loaded objects and store in a separate list. When saving, look up the object in the list using an index, and compare the values.
Or, I can simply serialize everything loaded into a string, and keep around. When saving, find the serialized object in the string (somehow), unserialize it, compare the values, and there you go.
In terms of efficiency (mostly memory, but speed is also a consideration), which would be favorable?
Well, you actually need something against which to compare whether the state of an object has changed or not. And if you want to track not only which object has changed but also which member, you need a state per member.
Since you don't want to extend the original objects (e.g. they could otherwise carry a flag they invalidate when changed), you need to track the state from the outside. I'd say serializing is probably the best option then; cloning will take more memory.
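A sketch of the serialize-and-compare approach (class names are hypothetical; constructor property promotion requires PHP 8):

```php
<?php

// Hypothetical entity loaded from a database.
class Product
{
    public function __construct(
        public int $id,
        public string $name,
        public float $price
    ) {}
}

// Snapshot-based change tracking: keep the serialized form, compare later.
class ChangeTracker
{
    /** @var array<int, string> */
    private array $snapshots = [];

    public function track(Product $p): void
    {
        $this->snapshots[$p->id] = serialize($p);
    }

    public function isDirty(Product $p): bool
    {
        return $this->snapshots[$p->id] !== serialize($p);
    }
}

$tracker = new ChangeTracker();
$product = new Product(1, 'Widget', 9.99);
$tracker->track($product);

$product->price = 12.50;
var_dump($tracker->isDirty($product)); // bool(true)
```

Comparing serialized strings cheaply tells you whether anything changed; to find out which member changed, you would unserialize the snapshot and diff the two sets of properties.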

PHP Omit Variables from Serialization

Is it possible to omit certain variables from serialization? Say I have a temporary variable in a PHP object that I don't want serialized, as it is a waste of space. The only thing I can think of is making it static, but this is not ideal, as it is not really part of the object, of which there will be many instances.
This may not even be possible but would love to hear some ideas.
Take advantage of the __sleep method of your object.
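A minimal sketch (class and property names are hypothetical): __sleep() returns the names of the properties that should be serialized, so anything not listed is omitted from the output:

```php
<?php

class Report
{
    public string $title = 'Monthly report';
    public array $rows = [1, 2, 3];
    public string $tempBuffer = 'scratch data, not worth storing';

    public function __sleep(): array
    {
        // Only these two properties end up in the serialized string.
        return ['title', 'rows'];
    }
}

$serialized = serialize(new Report());
var_dump(str_contains($serialized, 'title'));      // bool(true)
var_dump(str_contains($serialized, 'tempBuffer')); // bool(false)
```

On PHP 7.4 and later, the __serialize()/__unserialize() pair is the more modern alternative to __sleep()/__wakeup().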
