Too big data objects - php

Recently, I have identified very strong code smell in one of the project, I'm working for.
Especially, in its cost counting feature. To count the total cost for some operation, we have to pass many kind of information to the "parser class".
For example:
phone numbers
selected campaigns
selected templates
selected contacts
selected groups
and about 2-4 information types more
Before refactoring there were all this parameters were passed through constructor to the Counter class(8 parameters, you can image that..).
To increase readability I have introduced a Data class, named CostCountingData with readable getters and setters to all this properties.
But I don't think, that that code became much readable after this refactoring:
$cost_data = new CostCountingData();
$cost_data->setNumbers($numbers);
$cost_data->setContacts($contacts);
$cost_data->setGroups($groups);
$cost_data->setCampaigns($campaigns);
$cost_data->setUser($user);
$cost_data->setText($text);
$cost_data->setTotalQuantity($total_quantity);
$CostCounter = new TemplateReccurentSendingCostCounter($cost_data);
$total_cost = $CostCounter->count();
Can you tell me whether there is some problem with readability of this code snippet?
Maybe you can see any code smells and can point me at them..
The only idea I have how to refactore this code, is to split this big data object to several, consisting related data types. But I'm not sure whether should I do this or not..
Whay do you think about it?

If what you want is named parameters (which it looks to me is what you are trying to achieve), would you ever consider just passing in an associative array, with the names as keys? There is also the Named Parameter Idiom, I can only find a good reference for this for C++ though, perhaps it's called something else by PHP programmers.
For that, you'd change your methods on CostCountingData so they all returned the original object. That way you could rewrite your top piece of code to this:
$cost_data = new CostCountingData();
$cost_data
->setNumbers($numbers)
->setContacts($contacts)
->setGroups($groups)
->setCampaigns($campaigns)
->setUser($user)
->setText($text)
->setTotalQuantity($total_quantity);
So there's two options. I would probably go for the named parameter idiom myself as it is self-documenting, but I don't think the difference is too great.

There are a few possibilities here:
If you only need a few of the parameters each time, the named parameter approach suggested elsewhere in this answer is a nice solution.
If you always need all of the parameters, I would suggest that the smell is coming from a broken semantic model of the data required to calculate a cost.
No matter how you dress up a call to a method that takes eight arguments, the fact that eight supposedly unrelated pieces of information are required to calculate something will always smell.
Take a step back:
Are the eight separate arguments really all unrelated, or can you compose intermediate objects that reflect the reality of the situation a little better? Products? invoice items? Something else?
Can the cost method be split into smaller methods that calculate part of the cost based on fewer arguments, and the total cost produced from adding up the component costs?

If I needed the data object (and classes called "data" are themselves a smell) I'd set its values in its constructor. But the interesting question is where are these values coming from? The source of the values should itself be an object of some sort, and its probably the one you are really interested in.
Edit: If the data is coming from POST, then I would make the class be wrapper around the POST data. I would not provide any setter functions, and would make the read accessors look like this (my PHP is more than a little rusty):
class CostStuff {
constructor CostStuff( $postdata ) {
$mypost = $postdata;
}
function User() {
return $mypost[ "user_name" ];
}
...
}

What do you think is wrong with your code snippet? If you have several variables that you've already used that you now need to get into $cost_data, then you'll need to set each one in turn. There's no way around this.
A question though, is why do you have these variables $numbers, $contracts, etc. that you're now deciding you need to move into $cost_data ?
Is the refactoring you're looking for possibly that the point in the code where you set $numbers should actually be the point you're setting $cost_data->numbers, and having the $numbers variable at all is superfluous?
E.g.
$numbers=getNumbersFromSomewhere() ;
// do stuff
$contracts=getContracstFromSomewhere() ;
// do stuff
$cost_data=new dataObject() ;
$cost_data->setNumbers($numbers);
$cost_data->setContracts($contracts) ;
$cost_data->someOperation() ;
becomes
$cost_data=new dataObject() ;
$cost_data->setNumbers(getNumbersFromSomewhere()) ;
// do stuff
$cost_data->setContracts(getContractsFromSomewhere()) ;
// do stuff
$cost_data->someOperation() ;

Related

How to pass a boolean argument to a function to act as a switch?

My apologies if the title isn't clear, but I find this difficult to describe. Basically, I have a function that looks for an instance of a class (the school kind) given a class ID number and a date. The function can also create a new class instance if desired.
function get_class_instance($class_id, $date, $create)
Inside the function is a database select on the class_instances table using the class_id and date as arguments. If a matching class_instance is found, its ID is returned. If none is found, there's a conditional for the create argument. If it's true, a new class_instance is created using a database insert and its ID is returned. If false, nothing is changed in the database and false is returned.
I'm still a beginner with PHP and coding generally, so I'm thinking there is probably a better way. The issue is that when calling the function, it might not be clear to someone why there is a boolean being passed.
$original_cinstance_id = get_class_instance($original_class_id, $original_date, 1);
Passing boolean flags to functions to make them do two different things instead of just one is considered a Code Smell in Robert Martin's Clean Code book. The suggested better option would be to have two functions get_whatever and create_whatever.
While a Code Smell depends on context, I think the boolean flag smell does apply in this case, because creating something is different from merely reading it. The former is a Command and the latter is a Query. So they should be separated. One changes state, the other doesn't. It makes for better semantics and separation of concerns to split them. Also, it will reduce the Cyclomatic Complexity of the function by one branch, so you will need one unit-test less to cover it.
Quoting https://martinfowler.com/bliki/FlagArgument.html
Boolean arguments loudly declare that the function does more than one thing. They are confusing and should be eliminated.
Quoting http://www.informit.com/articles/article.aspx?p=1392524
My reasoning here is that the separate methods communicate more clearly what my intention is when I make the call. Instead of having to remember the meaning of the flag variable when I see book(martin, false) I can easily read regularBook(martin).
Additional discussion and reading material:
https://softwareengineering.stackexchange.com/questions/147977/is-it-wrong-to-use-a-boolean-parameter-to-determine-behavior
https://medium.com/#amlcurran/clean-code-the-curse-of-a-boolean-parameter-c237a830b7a3
https://8thlight.com/blog/dariusz-pasciak/2015/05/28/alternatives-to-boolean-parameters.html
You can use a default value for create, so if you pass anything to it, it acts as a normal "get" operation for the database. Like this:
function get_class_instance($class_id, $date, $create = false);
You can query your ID's like this:
$class_id = get_class_instance(1, "18-10-2017");
You can then pass "true" to it whenever you need to create it on the database:
$class_id = get_class_instance(1, "18-10-2017", true);
Instead of passing a number value, you can pass a real boolean value like this:
$original_cinstance_id = get_class_instance($original_class_id, $original_date, true);
Also in the declaration of your function, you can specify a default value to avoid to pass the boolean each time:
function get_class_instance($class_id, $date, $create = false)
one option is to create an enumerator like so:
abstract class create_options {
const no_action = 0;
const create = 1;
}
so now your function call would look like this:
$original_cinstance_id = get_class_instance($original_class_id, $original_date, create_options::no_action);
in practice as long as your code is well commented then this isn't a major issue with boolean flags, but if you had a dozen possible options with different results then this could be useful.
As others have mentioned, in many languages you can also make an argument optional, and have a default behavior unless the caller specifically defines that argument.

Functional Programming - Return Transformed array and the count of the array without calculating twice

I'm trying to write more functional code in PHP without any helper libraries.
I need to return some JSON that includes the results of a transformed array AND the count of that array (for convenience on the data consumer end). Since you're not supposed to use variables in FP, I'm stumped on how to get the count of the array without recalculating/remapping the array.
Here's an example of what my code currently looks like:
$duplicates = array_filter( get_results(), 'find_duplicates' );
send_json( array(
"duplicates" => $duplicates,
"numDuplicates" => count( $duplicates )
) );
How can I do the same without storing the results of the filter in a temporary variable to avoid running array_filter() twice?
But first, acknowledge the following...
"Since you're not supposed to use variables in FP..." – that's a ludicrous understanding of functional programming. Variables are used constantly in functional programs. I'm guessing you saw point-free functional programs and then imagined that every program can be expressed in such a way...
the receiver of the JSON could easily get the number of duplicates using JSON.parse(json).duplicates.length because every Array in JavaScript has a length property – it's arguably silly to attach a numDuplicates in the first place. Anyway, let's assume your consumer has a specific API that requires the numDuplicates field...
functional programming is concerned with things like function purity – maybe you've simplified your code in your post (which is bad; don't do that) or that is in fact your actual code. In such a case, get_results() and send_json functions are impure; send_json has an obvious (but unknown) side effect (the return value is not used) — You ask for a functional solution but you have other outstanding non-functional code... so...
There's nothing wrong with the code you have. Sometimes removing a point (variable, or argument), it hurts the readability of the code. In your case, this code is perfectly legible. It is at this point that I feel you're only trying to shorten the code or make it more clever. Your intention is to improve it, but I think you'd actually harm it in this case.
What if I told you...
a variable assignment can be replaced with a lambda? 0_0
(function ($duplicates) {
send_json([
'duplicates' => $duplicates,
'numDuplicates' => count($duplicates)
});
}) (array_filter(get_results(), 'find_duplicates'));
But that made the code longer.. and there's added abstraction which hurts readability T_T In this case, using a normal variable assignment (as in your original code) would've been much better
Combinators
OK, so what if you had some combinators at your disposal to massage the data into the desired shape?
function apply (...$xs) {
return function ($f) use ($xs) {
return call_user_func($f, ...$xs);
};
}
function identity ($x) { return $x; }
// hey look, mom! no points!
send_json(
array_combine(
['duplicates', 'numDuplicates'],
array_map(
apply(
array_filter(get_results(), 'find_duplicates')),
['identity', 'count'])));
Did we achieve anything other than writing the weirdest PHP you or anyone else has probably seen? Not to mention, the input is strangely nested in the middle of the expression...
remarks
I'm nearly certain that you'll be disappointed with this answer (or disagree with me), but I'm also pretty confident that you're not sure what you're looking for. A guess: you saw functional programming that "doesn't use variables" and assumed that's how all programs can and should be written; but that's just not the case. Sometimes using a variable or two can dramatically improve the readability of a given expression.
Anyway, all of this is truly beside the point because attaching numDuplicates is arguably an anti-pattern in JSON anyway (point #2 above).

PHP Codeigniter, should i set array defaults or simply use if.. else statements repeatedly

So I am utilizing codeigniter.
To allow reusability of code i have a number of functions in my model which get different things.
e.g get_details, get_features, get_products
I then have a function called get_all which calls all these methods, so If i want i can get them all but otherwise i can use them individually.
So I have my data, and I pass it to my view. My view loops through each establishment and displays various data in a table row.
At present I use if.. else statements to discern if a value is empty for example.
So if an establishment has not had its features added yet I use:
if(!empty($features['feature1'])){//DO STUFF e.g output 'YES'}
Anyway, my views code is no getting rather long and complicated because essentially for each and every key of each array returned using get_all I am using an if..else statement to output a "-" if it is not set.
It works, it just seems repetitive.
The work around I thought of is to simply set a default array whereby everything is by default set to "-", then if the data does exist it is overwritten, but then I just have to write/initiate a large default array..
So my question is not a life threatening one, nor is it particularly hard.. I am simply curious as to how one achieves such functionality without ugly code.
Cheers
Maybe you can "adjust" the array in the controller, by setting its empty values to -:
$features = array_map(function($value) {
return empty($value) ? '-' : $value;
}, $features);
without you posting your actual code i can only provide some general advice. To usually handle this, and consolidate your code to remove all the conditionals put the keys in an array.
$keys_to_check = array('feature1', 'feature2', 'etc.....');
foreach ($keys_to_check as $key) {
if (!empty($features[$key])) {
// do something
}
}
This refactors out all those conditional statements into somethign that is more maintainable.
also in your model code when you are providing a general function get_all that calls 3 subfunctions it is extremely important to make sure that unecessary queries aren't being executed. it seems it would be good software design to not repeat yourself, and group 3 functions into 1 but if those three functions are each executing similar queries then it is horrible for performance.

Should my PHP functions accept an array of arguments or should I explicitly request arguments?

In a PHP web application I'm working on, I see functions defined in two possible ways.
Approach 1:
function myfunc($arg1, $arg2, $arg3)
Approach 2:
// where $array_params has the structure array('arg1'=>$val1, 'arg2'=>$val2, 'arg3'=>$val3)
function myfunc($array_params)
When should I use one approach over another? It seems that if system requirements keep changing, and therefore the number of arguments for myfunc keep changing, approach 1 may require a lot of maintenance.
If the system is changing so often that using an indexed array is the best solution, I'd say this is the least of your worries. :-)
In general functions/methods shouldn't take too many arguments (5 plus or minus 2 being the maximum) and I'd say that you should stick to using named (and ideally type hinted) arguments. (An indexed array of arguments only really makes sense if there's a large quantity of optional data - a good example being configuration information.)
As #Pekka says, passing an array of arguments is also liable to be a pain to document and therefore for other people/yourself in 'n' months to maintain.
Update-ette...
Incidentally, the oft mentioned book Code Complete examines such issues in quite a bit of detail - it's a great tome which I'd highly recommend.
Using a params array (a surrogate for what is called "named arguments" in other languages") is great - I like to use it myself - but it has a pretty big downside: Arguments are not documentable using standard phpDoc notation that way, and consequently, your IDE won't be able to give you hints when you type in the name of a function or method.
I find using an optional array of arguments to be useful when I want to override a set of defaults in the function. It might be useful when constructing an object that has a lot of different configuration options or is just a dumb container for information. This is something I picked up mostly from the Ruby world.
An example might be if I want to configure a container for a video in my web page:
function buildVideoPlayer($file, $options = array())
{
$defaults = array(
'showAds' => true,
'allowFullScreen' = true,
'showPlaybar' = true
);
$config = array_merge($defaults, $options);
if ($config['showAds']) { .. }
}
$this->buildVideoPlayer($url, array('showAds' => false));
Notice that the initial value of $options is an empty array, so providing it at all is optional.
Also, with this method we know that $options will always be an array, and we know those keys have defaults so we don't constantly need to check is_array() or isset() when referencing the argument.
with the first approach you are forcing the users of your function to provide all the parameters needed. the second way you cannot be sure that you got all you need. I would prefer the first approach.
If the parameters you're passing in can be grouped together logically you could think about using a parameter object (Refactoring, Martin Fowler, p295), that way if you need to add more parameters you can just add more fields to your parameter class and it won't break existing methods.
There are pros and cons to each way.
If it's a simple function that is unlikely to change, and only has a few arguments, then I would state them explicitly.
If the function has a large number of arguments, or is likely to change a lot in the future, then I would pass an array of arguments with keys, and use that. This also becomes helpful if you have function where you only need to pass some of the arguments often.
An example of when I would choose an array over arguments, for an example, would be if I had a function that created a form field. possible arguments I may want:
Type
Value
class
ID
style
options
is_required
and I may only need to pass a few of these. for example, if a field is type = text, I don't need options. I may not always need a class or a default value. This way It is easier to pass in several combinations of arguments, without having a function signature with a ton arguments and passing null all the time. Also, when HTML 5 becomes standard many many years from now, I may want to add additional possible arguments, such as turning auto-complete on or off.
I know this is an old post, but I want to share my approach. If the variables are logically connected in a type, you can group them in a class, and pass an object of them to the function. for example:
class user{
public $id;
public $name;
public $address;
...
}
and call:
$user = new user();
$user->id = ...
...
callFunctions($user);
If a new parameter is needed you can just add it in the class and the the function signature doesn't need to change.

Proper way to declare a function in PHP?

I am not really clear about declaring functions in php, so I will give this a try.
getselection();
function getselection($selection,$price)
{
global $getprice;
switch($selection)
{
case1: case 1:
echo "You chose lemondew <br />";
$price=$getprice['lemondew'].'<br>';
echo "The price:".$price;
break;
Please let me know if I am doing this wrong, I want to do this the correct way; in addition, php.net has examples but they are kind of complex for a newb, I guess when I become proficient I will start using their documentation, thank you for not flaming.
Please provide links that might also help me clear this up?
Your example seems valid enough to me.
foo('bar');
function foo($myVar)
{
echo $myVar
}
// Output: bar
See this link for more info on user-defined functions.
You got off to a reasonable start. Now all you need to do is remove the redundant case 1:, close your switch statement with a } and then close your function with another }. I assume the global array $getprice is defined in your code but not shown in the question.
it's good practice to declare functions before calling them. It'll prevent infrequent misbehavior from your code.
The sample is basically a valid function definition (meaning it runs, except for what Asaph mentions about closing braces), but doesn't follow best practices.
Naming conventions: When a name consists of two or more words, use camelCase or underscores_to_delineate_words. Which one you use isn't important, so long as you're consistent. See also Alex's question about PHP naming conventions.
Picking a good name: a "get" prefix denotes a "getter" or "accessor"; any method or function of the form "getThing" should return a thing and have no affects visible outside the function or object. The sample function might be better called "printSelection" or "printItem", since it prints the name and price of the item that was selected.
Globals: Generally speaking, globals cause problems. One alternative is to use classes or objects: make the variable a static member of a class or an instance member of an object. Another alternative is to pass the data as an additional parameter to the function, though a function with too many parameters isn't very readable.
Switches are very useful, but not always the best choice. In the sample, $selection could easily hold the name of an item rather than a number. This points to one alternative to using switches: use an index into an array (which, incidentally, is how it's done in Python). If the cases have the same code but vary in values used, arrays are the way to go. If you're using objects, then polymorphism is the way to go--but that's a topic unto itself.
The $price parameter appears to serve no purpose. If you want your function to return the price, use a return statement.
When you called the function, you neglected to pass any arguments. This will result in warnings and notices, but will run.

Categories