I just started using CodeIgniter, and have really been happy with the results.
The one thing I've noticed is that I seem to be selecting different parts of a row with different parts of a model.
An example would be where on a page I need to get the current user's username, then farther down I need their email address. These are separate functions in the model (and therefore, separate queries). It annoys me knowing I could merge them into one query (saving overhead), but if I did that then I would loose the modularization the MVC model gives me (on plenty of other pages I just need the username or email, not both). Any suggestions on how to get past this?
Yes, but I have a specific function returning part of a row, and another function, returning another section of a row. I need this because most pages only need one section of the row, but some pages need both, this causes my code to run two queries. This is just one example. There are many. I was curious if there is a design pattern for this.
Youre essentially talking about lazy loading. So in your model you use something to track the state of the persisted properties, i.e. whether they have been loaded, modified, or if the model is completely new and not yet in the db. Then your getters for each property first check state, and if something is not yet loaded and the model isnt "new" then they query for that property and set it on the model before returning the value.
So your magic get might look like (pseudo code - youll have to translate to CI):
public function __get($property){
if($this->isPersisted($property) && !$property->isLoaded() && !$this->isNew()){
$this->$property = $this->db
->select($this->getColumn($property))
->from($this->tableName)
->where('id = ?', $this->id)
->fetchColumn(0);
}
return $this->property;
}
Related
Please note I'm not looking for 'use a framework' answers. I'm trying to structurally improve the way I code websites and approach databases from PHP.
I'm building a web service from scratch, without any frameworks. I'm using a LAMP stack and am trying to learn a bit of PHP's OO functionality while I'm at it. I've previously only used OO to make mobile apps.
I've been at it for months now (as planned, no worries). Along the way I've bumped into a couple of structural problems, making me wonder what the best way would be to make the code object oriented.
Pretty much all of the problems involve the database in some way. Say we have a class DB and a class User. In most cases I only need to fetch a single user's information from the database. I thought a good way to handle it was to have a global $_db variable and have the User object query the database like so (oversimplified):
class User {
function __construct($id) {
global $_db;
$q = $_db->query("SELECT name, mail FROM user WHERE id = ?", $id);
$this->loadProperties($q);
}
}
Now say we have a page that shows a list of users. I still want to make User objects for each of them, but I don't want to query the database for each separate user.
So, I extend the User class to take an object as an argument:
class User {
function __construct($id) {
if(is_object($id))
$q = $id;
else {
global $_db;
$q = $_db->query("SELECT name, mail FROM user WHERE id = ?", $id);
}
$this->loadProperties($q);
}
}
Now I can create a list of, for example, the 100 most recently created and active accounts:
$user_list = [];
$q = $_db->query("SELECT name, mail FROM user WHERE banned = 0 ORDER BY date_created DESC LIMIT 100");
while($a = $_db->fetch($q))
$user_list[] = new User($a);
This all works great, except for one big downside: the database queries for table user are no longer in one place, which is kind of making spaghetti code. This is where I'm starting to wonder whether this can be done more efficiently.
So maybe I need to extend my DB object instead of my User object, for example:
class DB {
public function getUsers($where) {
$q = $this->query("SELECT name, mail FROM user WHERE ".$where);
$users = [];
while($a = $this->fetch($q))
$users[] = new User($a);
}
}
Now I would create the user list as follows:
$user_list = $_db->getUsers("banned = 0 ORDER BY date_created DESC LIMIT 100");
But now I'm calling the getUsers() method in various places using various SQL queries, solving nothing. I also don't want to load the same properties each time, so my getUsers() method will have to take entire SQL queries as an argument. Anyway, you get the point.
Speaking of loading different properties, there's another thing that has been bugging me writing OO in PHP. Let's assume our PHP object has at least every property our database row has. Say I have a method User::getName():
class User {
public function getName() {
return $this->name;
}
}
This function will assume the appropriate field has been loaded from the database. However it would be inefficient to preload all of the user's properties each time I make an object. Sometimes I'll only need the user's name. On the other hand it would also be inefficient to go into the database at this point to load this one property.
I have to make sure that for each method I use, the appropriate properties have already been loaded. This makes complete sense from a performance perspective, but from an OO perspective, it means you have to know beforehand which methods you're gonna use which makes it a lot less dynamic and, again, allows for spaghetti code.
The last thing I bumped into (for now at least), is how to separate actual new users from new User. I figured I'd use a separate class called Registration (again, oversimplified):
class Registration {
function createUser() {
$form = $this->getSubmittedForm();
global $_db;
$_db->query("INSERT INTO user (name, mail) VALUES (?, ?)", $form->name, $form->mail);
if($_db->hasError)
return FALSE;
return $_db->insertedID;
}
}
But this means I have to create two separate classes for each database table and again I have different classes accessing the same table. Not to mention there's a third class handling login sessions that's also accessing the user table.
In summary, I feel like all of the above can be done way more efficiently. Most importantly I want pretty code. I feel like I'm missing a way to approach the database from an OO perspective. But how can I do so without losing the dynamics and power of SQL queries?
I'm looking forward to reading your experiences and ideas in this field.
Update
Seems most of you condemn my use of global $_db. Though you've convinced me this isn't the best approach, for the scope of this question it's irrelevant whether I'm supplying the database through an argument, a global or a singleton. It's still a separate class DB that handles any interaction with the database.
It's a common thing to have a separate class to handle SQL queries and to keep the fetched data. In fact, it is the real application of the Single Responsibility Principle.
What I usually do is keep a class with all the information concerning the data, in your case the User class, with all the user information as fields.
Then comes the business layer, for instance UserDataManager (though the use of "Manager" as a suffix is not recommended and you'd better find a more suitable name in each scenario) which takes the pdo object in its constructor to avoid use of global variables and has all the SQL methods. You'd thus have methods registerNewUser, findUserById, unsuscribeUser and so on (the use of "User" in the method can be implied by the class name and be omitted).
Hope it helps.
I've liked to use the data mapper pattern (or at least I think that's how I'm doing it). I've done this for some sites built on Silex, though it's applicable to going without a framework since Silex is very lightweight and doesn't impose much on how you architect your code. In fact, I recommend you check out Symfony2/Silex just to get some ideas for ways to design your code.
Anyway, I've used classes like UserMapper. Since I was using the Doctrine DBAL library, I used Dependency injection to give each mapper a $db. But the DBAL is pretty much a wrapper on the PDO class as far as I understand, so you could inject that instead.
Now you have a UserMapper who is responsible for the CRUD operations. So I solve your first problem with methods like LoadUser($id) and LoadAllUsers(). Then I would set all the properties on the new User based on the data from the database. You can similarly have CreateUser(User $user). Note that in "create", I'm really passing a User object and mapping it to the database. You could call it PersistUser(User $user) to make this distinction more clear. Now all of the SQL queries are in one place and your User class is just a collection of data. The User doesn't need to come from the database, you could create test users or whatever else without any modification. All of the persistence of `User is encapsulated in another class.
I'm not sure that it's always bad to load all of the properties of a user, but if you want to fix that, it's not hard to make LoadUsername($id) or other methods. Or you could do LoadUser($id, array $properties) with a set of properties taht you want to load. If your naming is consistent, then it's easy to have set the properties like:
// in a foreach, $data is the associative array returned by your SQL
$setter = 'set'.$property;
$user->$setter($data[$property]);
Or (and?) you could solve this with Proxy objects. I haven't done this, but the idea is to return a bunch of UserProxy objects, which extend User. But they have the Mapper injected and they override the getters to call into the Mapper to select more. Perhaps when you access one property on a proxy, it will select everything via the mapper (a method called populateUser($id)?) and then subsequent getters can just access the properties in memory. This might make some sense if you, for example, select all users then need to access data on a subset. But I think in general it may be easier to select everything.
public function getX()
{
if (!isset($this->x)) {
$this->mapper->populateUser($this);
}
return $this->x;
}
For new users, I say just do $user = new User... and set everything up, then call into $mapper->persist($user). You can wrap that up in another class, like UserFactory->Create($data) and it can return the (persisted) User. Or that class can be called Registration if you'd like.
Did I mention you should use Dependency Injection to handle all of these services (like the Mappers and others like Factories maybe)? Maybe just grab the DIC from Silex, called Pimple. Or implement a lightweight one yourself (it's not hard).
I hope this helps. It's a high-level overview of some things I've picked up from writing a lot of PHP and using Syfmony2/Silex. Good luck and glad to see PHP programmers like yourself actually trying to "do things right"! Please comment if I can elaborate anywhere. Hope this answer helps you out.
You should first begin by writing a class as a wrapper to your Database object, which would be more clean that a global variable (read about the Singleton Pattern if you don't know it, and there is a lot of examples of Singleton Database Wrapper on the web). You'll then have a better view of the architecture you should implement.
Best is to separate datas from transactions with the database, meaning that you can for example have two classes for your User ; one that will only send queries and fetch responses, and the other that will manage datas thanks to object's attributes and methods. Sometimes, there can be also some action that doesn't require to interact with the database, and that would be implemented in these classes too.
Last but not least, it can be a good idea to look a MVC frameworks and how they work (even if you don't want to use it) ; that would give you a good idea of how can be structured a web application, and how to implement these pattern for you in some way.
I am using CodeIgniter but this question applies in a general sense too.
I have a table of transactions with columns
item_name | type | date | price | document
I want to do the following in two completely independent cases.
1) Get a list of transactions within a certain date range.
2) Get the total price of each transaction.type within a certain date range.
The former can be achieved by simply using a select statement with > datetimestamp
The latter can be achieved by selecting the SUM, and grouping by the type whilst like implementing any required where conditionals e.g with > datetimestamp
Although a simple case, to achieve this I need to have two methods however the bulk of both of these methods (namely the WHERE clauses) are duplicated across both methods.
In terms of speed etc it does not matter but it seems like pointless code reproduction.
A second example is as follows.
I previously had a method get_data($ID) which would get a row from a table based on the ID passed in.
As such in a separate method I would get my 100 items for example.. return an array, loop through them and call get_data for each.
This setup meant that many different methods could get different lists from different sources and then still use the same get_data function and a loop to get the required data.
This minimized code duplication but was incredibly ineffiecient as it meant looping through loads of data items and hundreds of db queries.
In my current setup i just join the data table in each of my methods - code duplication but clear improved efficiency.
A final example is as follows
In codeigniter I can have a function such as the following:
get_thing($ID)
{
$this->load->database();
$this->db->where('ID',$ID);
$this->db->get('table');
}
BUT in alternate situations i might want to only get items in a specific folder.. as such making the function more generic works better.. e.g.
get_thing($array)
{
$this->load->database();
$this->db->where($array);
$this->db->get('table');
}
but then I might want to use this function in two different contexts e.g a user page and an admin page whereby admins can see all items, even unverified ones. My code now becomes:
get_thing($array,$show_unverified = false)
{
$this->load->database();
$this->db->where($array);
if($show_unverified == false)
{
$this->db->where('verified','YES');
}
$this->db->get('table');
}
As you can probably see this can quickly get out of hand and methods can become overly complex, confusing and full of conditionals.
My question is as follows - What are best practices for minimizing code duplication, and how could they be applied to the above situations? I spent hours and hours trying to make my code more efficient yet I'm getting nowhere because I cant workout what I should really be trying to achieve.
Cheers
My idea on code duplication in database access functions is that it is often better to keep it separate.
My rule here is especially that a function should not return different kinds of data depending on the parameter, for example it should not return a single user sometimes and other times an array of users. It may return error codes (false) though.
It is ok though if the function implements different access levels, which are shared across several pages.
This basically always comes back to common sense. You should try to minimize duplicate code and try to reduce complexity within single function. Keep them small and simple.
So basically everytime you try to generalize a function like this you would have to ask if the problem of duplicate code is bigger than the problem of overly complex functions.
In this case i would stop at your second point and next you could create some wrappers for the most common tasks (but be carefull not to make a maze of wrappers)
//you generic function
function get_thing($array)
{
$this->load->database();
$this->db->where($array);
$this->db->get('table');
}
// a nice and friendly wrapper
function get_thing_by_id($id)
{
get_thing(array('id' => $id));
}
// this is just getting silly. don't go crazy with wrappers, only for very often used things.
// and yes the function name is purposely crazy ;)
function get_thing_verified_by_name_and_city_and_some_more($name, $city, $somethingElse)
{
get_thing(array('name' => $name, 'city' => $city, 'somethingelse' => $somethingElse));
}
This answers the first part of your question. Assuming you're using mysql_fetch_assoc or similar. As you're iterating over the result sets you could store count values in a variable in the loop for the total price of each transaction type.
The second part as long as you're not repeating code ad infinitum which would cause you issues down the line when maintaining the code base, it's OK. For your function you could always test the type of variable being passed to the function and set conditional behaviours accordingly.
Have a look at the factory pattern or strategy pattern relating to software design patterns for further insight.
This is a long running question that gets me every time I am developing.
I suppose it is not specific to CodeIgniter, but as I am using it, I will consider it in my example.
Firstly, which is better:
function add_entry($data_array)
{
//code to add entry
}
function edit_entry($data_array)
{
//code to update entry
}
OR
function save_changes($what,$data_array)
{
//if what == update update
//otherwise insert
}
Both produce the same action, but does it really matter which one you use?
Getting onto more complicated things.
I have a page where I need to get ONE entry from the database.
I also have a page where I need to get all the entries from the same database ordered by a user specified column.
My resultant method is a function similar to
function($data_array,$order_by='',$limit='')
{
//get where $data_array
//if order_by!='' add order by
//if limit !='' add limit
}
As I develop my application and realise new places where I need 'similar' database functionality I am what feels like hacking previous methods so they work with all my case scenarios. The methods end up containing lots of conditional statements, and getting quite complex with in some cases 4 or 5 input parameters.
Have I missed the point? I don't want duplicate code, and when for the most part the functions are very similar I feel like this 'hacking' methodology works best.
Could someone advise?
Finally my admin functionality is part of the same application in an admin controller. I have an admin model which contains specific methods for admin db interaction. I however use some model functionality from 'user' models.
FOr example if on an admin page I need to get details of a db entry I may load the user model to access this function. There is nothing wrong/insecure about this..? right?
In addition to that within my admin model itself I need to get data about a user database entry so I call my user model directly from my admin model. This is strictly OK, but why? If i need data and there is already a method in my user model which gets it.. it seems a little pointless to rewrite the code in the admin model BUT each time that function is called does it load the whole user model again?
Thanks a lot all.
In order, add edit in the model vs save. Personally I have a save built in MY_Model that chooses whether it is a save or an edit depending on the existence of a primary key in the data being passed, so obviously I prefer that method it means a lot less duplication of code since I can use the save for any table without having functions in the model at all.
As to the second question I think it depends on situation. I also have a number of functions that have a ton of conditionals on them depending on where they're used and for what. In some cases I'm finding this makes the legibility of the code a little rough. If you're running them all through if statements it also could be impacting performance. DRY is a concept, not a rule and like other design concepts there are times when they just don't make sense, it's like database normalization, it's my personal opinion it's VERY easy to over normalize a database and destroy performance and usability.
Finally, using user functions in the admin code. I don't see an issue here at all, the reverse probably isn't true, but rewriting a function just because it's an "admin" function, when it's identical to a user function is utterly pointless. So you're correct there, it's a waste of time and space.
Something that has always bothered me is doing more than one loop to manipulate an array.
What I mean is, in the controller the data is fetched from the DB via a model. Lets say we are showing a list of users, and each user has a status (1,2,3 equates to verified, unverified, banned respectively). Within each iteration of the loop the status would be checked and displayed via another Db query (forget mysql joins in this example).
Now, would you do that in the controller within a loop, and then perform another loop in the view with all the data already fetched and pre-formed ready for display (therefore resulting in 2 loops).
--OR--
Would you just do it in the view therefore resulting in the one loop but with model calls from the view. I understand that this is ok in the strict MVC pattern but its frowned upon generally.
It seems silly to loop twice but then its tidier as all the data manipulation is kept within the controller.
I would do that nor in the view or the controller but on the model.
I explain :
Your controller's job is to retrieve the expected user list, check ACL, etc...
Your view's job is to present this data in an elegant form
Your Model job's in to fetch/store data from Database and ensure integrity. Userstatus is a model too for me.
My configuration make this pretty easy, I use mustache (Php port) for view, which allow me to call methods from my models directly in view. I wrote my own ORM for my models, that way I have wrappers.
Such code would look like that for me :
// Controller
$template = new Template('pages/users.html');
$template->users = mUser::find(); // return array of mUsers instances
echo $template->render();
// View
{{#users}} <!-- For each user -->
{{getName}} has status {{#getStatus}}{{getStatusName}}{{/getStatus}}<br />
<!-- getStatus is a method from mUser model, that return a mUserStatus instance -->
{{/users}}
/* More explain on the view syntax
{{name}} = $user->getName() (return string)
{{getStatus}} = $user->getStatus() (return instance of mUserStatus);
{{statusName}} = $user->getStatus()->getStatusName();
*/
You may want to have request caching for each model instances in request level so that you never runs a request twice times if not needed.
That seems more natural to me than to delegate it to controller. I try to put business intelligence on controllers, there is no need for intelligence nor programmer intervention to retrieve a status name for each user.
I Hope it help.
In my opinion logic that manipulates the data you're returning, should be located in the controller. Logic that manipulates the representation of your data can be located in the view.
So I would go for the second option.
But, as you pointed out yourself this is a choice of implementation.
Also note that multiple round trips to your DB are bad for performance. Your example is a typical n+1 problem, meaning that you have 1 'top' select query and then N more queries for each row in your first result set. If you encounter such a problem always try to solve them on the DB level.
Another note I would like to add is that in your example you're storing status explanations in the DB. If you want to provide your applications in other languages, this might prove to be a problem. But this is beyond the scope of your question :)
Doing two loops is the clean way. That is what I would do for most cases, but I think there is no gerneral answer to this. Like if you have a lot a data and performance gets an issue it would be better to forgett about MVC and just use one loop.
A third way would be to use a helper function you can call from the view. Now that I think about it... that would probably be the best way.
I'm trying to replace a site written procedurally with a nice set of classes as a learning exercise.
So far, I've created a record class that basically holds one line in the database's main table.
I also created a loader class which can:
loadAllFromUser($username)
loadAllFromDate($date)
loadAllFromGame($game)
These methods grab all the valid rows from the database, pack each row into a record, and stick all the records into an array.
But what if I want to just work with one record? I took a stab at that and ended up with code that was nearly identical to my procedural original.
I also wasn't sure where that one record would go. Does my loader class have a protected record property?
I'm somewhat confused.
EDIT - also, where would I put something like the HTML template for outputting a record to the site? does that go in the record class, in the loader, or in a 3rd class?
I recommend looking into using something like Doctrine for abstracting your db-to-object stuff, other than for learning purposes.
That said, there are many ways to model this type of thing, but in general it seems like the libraries (home-grown or not) that handle it tend to move towards having, at a high level:
A class that represents an object that is mapped to the db
A class that represents the way in which that object is mapped to the db
A class that represents methods for retrieving objects from the db
Think about the different tasks that need done, and try to encapsulate them cleanly. The Law of Demeter is useful to keep in mind, but don't get too bogged down with trying to grok everything in object-oriented design theory right this moment -- it can be much more useful to think, design, code, and see where weaknesses in your designs lie yourself.
For your "work with one record, but without duplicating a bunch of code" problem, perhaps something like having your loadAllFromUser methods actually be methods that call a private method that takes (for instance) a parameter that is the number of records to be retrieved, where if that parameter is null it retrieves all the records.
You can take that a step further, and implement __call on your loader class. Assuming it can know or find out about the fields that you want to load by, you can construct the parameters to a function that does the loading programatically -- look at the common parts of your functions, see what differs, and see if you can find a way to make those different parts into function parameters, or something else that allows you to avoid repetition.
MVC is worth reading up on wrt your second question. At the least, I would probably want to have that in a separate class that expects to be passed a record to render. The record probably shouldn't care about how it's represented in html, the thing that makes markup for a record shouldn't care about how the record is gotten. In general, you probably want to try to make things as standalone as possible.
It's not an easy thing to get used to, and most of "getting good" at this sort of design is a matter of practice. For actual functionality, tests can help a lot -- say you're writing your loader class, and you know that if you call loadAllFromUser($me) that you should get an array of three specific records with your dataset (even if it's a dataset used for testing only), if you have something you can run which would call that on your loader and check for the right results, it can help you know that your code is at least right from the standpoint of behavior, if not from design -- and when you change the design you can ensure that it still behaves correctly. PHPUnit seems to be the most popular tool for this in php-land.
Hopefully this points you in a useful group of directions instead of just being confusing :) Good luck, and godspeed.
You can encapsulate the unique parts of loadAllFrom... and loadOneFrom... within utility methods:
private function loadAll($tableName) {
// fetch all records from tableName
}
private function loadOne($tableName) {
// fetch one record from tableName
}
and then you won't see so much duplication:
public function loadAllFromUser() {
return $this->loadAll("user");
}
public function loadOneFromUser() {
return $this->loadOne("user");
}
If you like, you can break it down further like so:
private function load($tableName, $all = true) {
// return all or one record from tableName
// default is all
}
you can then replace all of those methods with calls such as:
$allUsers = $loader->load("users");
$date = $loader->load("date", false);
You could check the arguments coming into your method and decide from there.
$args = func_get_args();
if(count($args) > 1)
{
//do something
}
else // do something else
Something simple liek this could work. Or you could make two seperate methods inside your class for handling each type of request much like #karim's example. Whichever works best for what you would like to do.
Hopefully I understand what you are asking though.
To answer your edit:
Typically you will want to create a view class. This will be responsible for handling the HTML output of the data. It is good practice to keep these separate. The best way to do this is by injecting your 'data class' object directly into the view class like such:
class HTMLview
{
private $data;
public function __construct(Loader $_data)
{
$this->data = $_data;
}
}
And then continue with the output now that this class holds your processed database information.
It's entirely possible and plausible that your record class can have a utility method attached to itself that knows how to load a single record, given that you provide it a piece of identifying information (such as its ID, for example).
The pattern I have been using is that an object can know how to load itself, and also provides static methods to perform "loadAll" actions, returning an array of those objects to the calling code.
So, I'm going through a lot of this myself with a small open source web app I develop as well, I wrote most of it in a crunch procedurally because it's how I knew to make a working (heh, yeah) application in the shortest amount of time - and now I'm going back through and implementing heavy OOP and MVC architecture.