I have a bunch of legacy code without a Model layer separating it from the database. SQL statements abound. I would like to normalize tables in the database, yet not have to rewrite all this embedded SQL (eventually I will). I've tried updatable views in MySQL, but for anything of any complexity, it will perform a full table scan.
Can anyone suggest a clever way to hide database schema changes from clients that heavily rely on those very schemas?
There's no clever way to do it all at once. It's bound to be a difficult, meticulous process. Here are some tips:
Refactor your application code so that you don't have SQL code littered around the whole codebase. Refactor so that all the literal SQL is consolidated in a few "data access" classes, with sensible APIs. This is something you can do gradually.
Develop a thorough suite of system tests that you can run against your application so as you change the database schema, you can re-run the tests to ensure you didn't break something. This is also something you can develop gradually.
Use read-only views where you can. You can't make views for complex SQL queries as updatable views, but at least you can use views for the reading queries. I'm not sure why that would cause a table-scan where the original query did not. Sometimes views cause a temporary table, which is also costly.
In some cases, it's easier to make a view for the new tables. In other cases, it's easier to make a view for the old tables. Keep your options open to do it either way, on a case-by-case basis.
In your data access classes, you may have to double the database work. For example, saving data changes to the old tables and the new refactored tables as part of the same operation. Once you have tested to make sure all the database queries are using your data access class, you can then take out the code that maintains the legacy tables.
Eventually, you will have to bite the bullet and make some schema changes that require a "big bang integration". That's the term I've heard for a change that causes a lot of changes across all your code. It's a lot of work to do all the testing necessary to ensure nothing breaks. Sorry, there's no magic to avoid that work. But you can prepare for it with your data access layer and lots of system tests.
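As a rough illustration of the "double the database work" tip, here is a minimal sketch of a data access class that writes to both the new tables and the legacy table in one transaction. The table names (customer_legacy, customer, address), the CustomerDao class, and the in-memory SQLite connection are all assumptions made up for the example; your schema and driver will differ.

```php
<?php
// Hypothetical schema: the legacy table keeps name and city in one row;
// the normalized design moves city into its own table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE customer_legacy (id INTEGER PRIMARY KEY, name TEXT, city TEXT)');
$pdo->exec('CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE address (customer_id INTEGER, city TEXT)');

class CustomerDao
{
    private PDO $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    // Writes the new tables and the legacy table in one transaction, so code
    // that still reads customer_legacy keeps working during the migration.
    public function add(int $id, string $name, string $city): void
    {
        $this->pdo->beginTransaction();
        try {
            $this->run('INSERT INTO customer (id, name) VALUES (?, ?)', [$id, $name]);
            $this->run('INSERT INTO address (customer_id, city) VALUES (?, ?)', [$id, $city]);
            // Remove this statement once nothing reads the legacy table.
            $this->run('INSERT INTO customer_legacy (id, name, city) VALUES (?, ?, ?)', [$id, $name, $city]);
            $this->pdo->commit();
        } catch (Throwable $e) {
            $this->pdo->rollBack();
            throw $e;
        }
    }

    private function run(string $sql, array $params): void
    {
        $this->pdo->prepare($sql)->execute($params);
    }
}

$dao = new CustomerDao($pdo);
$dao->add(1, 'Ada', 'London');
```

Because callers only see add(), the legacy INSERT can later be deleted in one place without touching the rest of the application.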
This is probably a question that the Stack Overflow community will close as "opinion-based".
Related
In the CodeIgniter PHP framework we can use normal SQL queries, and we can also use Active Record. As I understand it, Active Record syntax is shorter than normal SQL queries.
Can anyone tell me what the main advantage of Active Record in CodeIgniter is?
Thanks in advance
The Active Record design pattern provides an easy way to handle the data (create/load, modify, update, delete) without the need to take care of the technical details.
Its main drawback is that it makes unit testing impossible without a database (which greatly increases test execution time and breaks the isolation that unit testing needs).
Why did the creators of CodeIgniter choose Active Record as their main way to implement data persistence in the framework? I don't know.
Use it as it is or step forward and use another framework.
This is like comparing SQL with ORM features, because Active Record works as the ORM in CI.
Here is the list of benefits of ORM:
Productivity: The data access code is usually a significant portion
of a typical application, and the time needed to write that code can
be a significant portion of the overall development schedule. When
using an ORM tool, the amount of code is unlikely to be reduced—in
fact, it might even go up—but the ORM tool generates 100% of the
data access code automatically based on the data model you define,
in mere moments.
Application design: A good ORM tool designed by very experienced
software architects will implement effective design patterns that
almost force you to use good programming practices in an
application. This can help support a clean separation of concerns
and independent development that allows parallel, simultaneous
development of application layers.
Code Reuse: If you create a class library to generate a separate DLL
for the ORM-generated data access code, you can easily reuse the data
objects in a variety of applications. This way, each of the
applications that use the class library need have no data access code
at all.
Application Maintainability: All of the code generated by the ORM is
presumably well-tested, so you usually don’t need to worry about
testing it extensively. Obviously you need to make sure that the
code does what you need, but a widely used ORM is likely to have
code banged on by many developers at all skill levels. Over the long
term, you can refactor the database schema or the model definition
without affecting how the application uses the data objects.
You can change the backend database to anything, since you don't need to
worry about query syntax: you are working with OBJECTS instead of
queries.
It definitely makes your code cleaner to read, and the only time I don't use it is when I have to do some seriously complex SQL, like building multi-table search queries. I highly recommend it for its code cleanliness.
It helps to prevent SQL injection.
But for some complex queries you may prefer plain SQL.
I have built a lot of websites in the past using my own CMS/framework, and I have developed a simple way of executing queries. Recently I have started playing with other frameworks such as CodeIgniter. They offer raw query inputs such as…
$this->db->query("SELECT * FROM news WHERE newsId=1;");
But they also offer chaining of MySQL commands via PHP methods.
$this->db->select("*")->from("news")->where("newsId=?");
The question is: what are the main differences between, and benefits of, each option?
I know the latter option prevents SQL injection, but to be honest you can do exactly the same by using $this->db->escape().
So in the end from what I can see the latter option only serves to make you use more letters on your keyboard, this you would think would slow you down.
I think the implementation of activerecord in codeigniter is suitable for small and easy queries.
When you need complex queries with lots of joins, it is more clear to just write the query itself.
I don't think that an extra layer of abstraction will ever give you better performance, if you have a certain skill in SQL.
Most recent PHP framework developers use the AR (Active Record)/DAO (Data Access Object) pattern, because it is really faster than raw queries. Nowadays the AR technique is usually built on top of PDO (PHP Data Objects).
Why is Active Record really faster?
It's true that writing queries is the best habit for a developer. But some problems make it tough:
1. When we write a large INSERT or UPDATE query, it is sometimes hard to match every value to its column. AR makes it easy: you just build an array first and then execute.
2. It doesn't matter which DB you use.
3. Sometimes a query is really hard to read or write if it has many conditions. In AR you can chain many method calls for one query.
4. AR saves you time by not repeating statements.
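To make point 1 concrete, here is a minimal sketch of the "build an array, then execute" idea. The buildInsert() helper is hypothetical and is not CodeIgniter's implementation; it only shows how generating the column list and the placeholders from one associative array keeps columns and values from drifting apart.

```php
<?php
// Hypothetical helper, not part of any framework: turns an associative array
// into a parameterised INSERT statement plus its ordered list of values.
function buildInsert(string $table, array $data): array
{
    $columns = array_keys($data);
    $placeholders = array_fill(0, count($columns), '?');
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        implode(', ', $placeholders)
    );
    // Table and column names are assumed to be trusted; only values are bound.
    return [$sql, array_values($data)];
}

[$sql, $params] = buildInsert('news', ['title' => 'Hello', 'author' => 'sam']);
```

The resulting $sql and $params can be handed to any driver that supports positional placeholders, for example PDO's prepare() and execute().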
I can't speak for CodeIgniter (what I've seen of it seems rather slung-together, frankly), but there are a few reasons such systems may be used:
as part of an abstraction layer which supports different DBMS back-ends, where for instance ->offset(10)->limit(10) would automatically generate the correct variant of OFFSET, LIMIT, and similar clauses for MySQL vs PostgreSQL etc
as part of an "ORM" system, where the result of the query is automatically mapped into Model objects of an appropriate class based on the tables and columns being queried
to abstract away from the exact names of tables and columns for backwards-compatibility, or installation requirements (e.g. the table "news" might actually be called "app1_news" in a particular install to avoid colliding with another application)
to handle parameterised queries, as in your example; although largely unrelated to this kind of abstraction, they provide more than just escaping, as the DBMS (MySQL or whatever is in use) knows which parts of the query are fixed and which are variable, which can be useful for performance as well as security
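The last point, that parameterised queries are more than escaping, can be sketched with plain PDO. This assumes an in-memory SQLite database and a made-up news table purely so the example is self-contained; the same prepare/execute calls apply to other PDO drivers.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE news (newsId INTEGER PRIMARY KEY, title TEXT)');
$pdo->exec("INSERT INTO news (newsId, title) VALUES (1, 'First')");
$pdo->exec("INSERT INTO news (newsId, title) VALUES (2, 'Second')");

// The SQL text is fixed once; only the bound value varies between calls.
$stmt = $pdo->prepare('SELECT newsId FROM news WHERE title = ?');

$stmt->execute(['First']);
$found = (int) $stmt->fetchColumn();

// A hostile value is sent as data, never spliced into the SQL text,
// so it is compared literally against the title column.
$stmt->execute(["x' OR '1'='1"]);
$hostile = $stmt->fetchColumn();
```

The statement is prepared once and executed twice, which is the "fixed part versus variable part" split described above.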
I recently started working with Yii PHP MVC Framework. I'm looking for advice on how should I continue working with the database through the framework: should I use framework's base class CActiveRecord which deals with the DB, or should I go with the classic SQL query functions (in my case mssql)?
Obviously or not, for me it seems easier to deal with the DB through classic SQL queries, but, at some point, I imagine there has to be an advantage in using framework's way.
Some SQL queries will get pretty complex pretty often. I just can't comprehend how the framework could help me and not make things more complicated than they actually are.
A very general rule from my experience with Yii and massive databases:
Use Yii Active Record when:
You want to retrieve or write one to a few rows in the database (e.g. a user changing his/her settings, updating a user's balance, adding a vote, getting a count of users online, getting the number of posts under a topic, checking if a model exists)
You want to rapidly design a hierarchical model structure between your tables (e.g. $user->info->email, $user->settings->currency), allowing you to quickly adjust the displayed currency/settings per user.
Stay away from Yii Active Record when:
You want to update several hundred records at a time (too much overhead for the model).
Yii::app()->db->createCommand()
allows you to avoid the heavy objects and retrieves data in simple arrays.
You want to do advanced joins and queries that involve multiple tables.
Any batch job!! (e.g. checking a payments table to see which customers are overdue on their payments, updating database values etc.)
I love Yii Active Record, but I interchange between the Active Record Model and plain SQL (using Yii::app()->db) based on the requirement in the application.
In the end I have the option of updating a single user's currency
$user->info->currency = 'USD';
$user->info->save();
or, if I want to update all users' currencies:
Yii::app()->db->createCommand('UPDATE ..... SET Currency="USD" where ...')->execute();
In any language when dealing with the database a framework can help you by providing an abstraction over the database.
Here is a scenario I know I found myself in many times during my earlier development days:
I have an application that needs a database.
I write a ton of code.
I put the SQL statements in the code along with everything else.
The database changes somehow.
I'm stuck with having to go back and make 100 changes to all my SQL statements.
It's very frustrating.
Another scenario I found:
I write a ton of code against a database.
Bugs come in. Lots of bugs. I can't figure them all out.
I'm asked to write tests for my code.
This is impossible because all my code relies on a direct implementation of the database. How do you test SQL statements when they're with the actual code?
So my advice is to use the framework because it can provide an abstraction over the database. This gives you two really big advantages:
You can potentially swap out the database later and your code stays the same! If you're using interfaces/some framework, then most likely you're dealing with objects and not SQL statements directly. A given implementation might know how to write to MySQL or SQL Server, but in general your code just says "Write this object", "Read that list."
You can test your code! A good framework that deals with data will let you mock the database so you can test it easily.
Try to avoid writing SQL statements directly in the application. It'll save you pain later.
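As a rough sketch of the "you can test your code" point: if the application code depends on an interface rather than on SQL, a test can substitute an in-memory implementation. All names here (PostStore, TopicSummary, InMemoryPostStore) are hypothetical and are not part of Yii.

```php
<?php
// The abstraction the application code talks to; a real implementation
// would run SQL, but callers never see that.
interface PostStore
{
    public function countByTopic(int $topicId): int;
}

// Code under test: depends on the interface, not on a database.
class TopicSummary
{
    private PostStore $posts;

    public function __construct(PostStore $posts)
    {
        $this->posts = $posts;
    }

    public function label(int $topicId): string
    {
        $n = $this->posts->countByTopic($topicId);
        return $n === 1 ? '1 post' : $n . ' posts';
    }
}

// Test double: answers from an array, so no database is needed.
class InMemoryPostStore implements PostStore
{
    private array $counts;

    public function __construct(array $counts)
    {
        $this->counts = $counts;
    }

    public function countByTopic(int $topicId): int
    {
        return $this->counts[$topicId] ?? 0;
    }
}

$summary = new TopicSummary(new InMemoryPostStore([7 => 1, 8 => 3]));
```

A production implementation of PostStore would hold the SQL; swapping databases then means writing another implementation, not editing TopicSummary.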
I'm unfamiliar with the database system bundled with Yii, but would advise you to use it a little bit to start with. My experience is with Propel, a popular PHP ORM. In general, ORM systems have a class per table (Propel has three per table).
Now, there'll probably be a syntax to do lookups and joins etc, but the first thing to do is to work out how to use raw SQL in your queries (for any of the CRUD operations). Put methods to do these queries in your model classes, so at least you will be benefitting from centralisation of code.
Once you've got that working, you can migrate to the recommended approach at a later time, without getting overwhelmed with the amount of material you have to learn in one go. Learning Yii (especially how to share code amongst controllers, and to write maintainable view templates) takes a while, so it may be sensible not to over-complicate it with many other things as well.
Why use Yii:
Just imagine that you have many modules, and for each module you have to write pagination code; writing it the old-fashioned way will take a lot of time.
Why not use Yii's CListView widget? Oh, and this widget comes with a bonus: the data provider and automatic checking for the existence of the article that is about to be printed.
When using Yii's CListView with results from ... the Sphinx search engine, the widget will check whether the article really exists, because the search result may not be correct.
How long will it take you to write detection code for non-existent records?
And when you have different types of projects, will you adapt the methods?
NO! Yii does this for you.
How long would it take you to write the CRUD code: create, read, update, delete?
Are you going to adapt the old code from another project?
Yii has a miracle module, called Gii, that generates models, modules, forms, controllers, the CRUD ... and much more.
At first it might seem hard, but once you get experienced, it's easy.
I would suggest you use CActiveRecord. It gives you many advantages:
You can use many widgets within Yii directly, as mentioned above (for pagination, grids, etc.).
The queries generated by the Yii ORM are highly optimized.
You don't need to copy the results extracted from SQL into your VO objects.
If the tables are modified for some reason (adding/deleting a column, changing a data type), you just need to regenerate the models using the tool provided by Yii. Just try to avoid making code changes in the models generated by Yii; that will save you merging effort.
If you plan to change the DB from MySQL to another vendor in the future, it would just be a config change for you.
Also, you and your team would save precious development time.
I quite often see in PHP, WordPress plugins specifically, that people write SQL directly in their plugins... The way I learnt things, everything should be handled in layers... So that if one day the requirements of a given layer change, I only have to worry about changing the layer that everything interfaces.
Right now I'm writing a layer to interface the database so that, if something ever changes in the way I interact with databases, all I have to do is change one layer, not X number of plugins I've created.
I feel as though this is something that other people may have come across in the past, and that my approach may be inefficient.
I'm writing classes such as
Table
Column
Row
That allow me to create database tables, columns, and rows using given objects with specific methods to handle all their functions:
$column = new \Namespace\Data\Column ( /* name, type, null/non-null, etc... */ );
$myTable = new \Namespace\Data\Table( /* name, column objects, and indexes */ );
\Namespace\TableModel::create($myTable);
My questions are...
Has someone else already written something to provide some separation between different layers?
If not, is my approach going to help at all in the long run or am I wasting my time; should I break down and hard-code the sql like everyone else?
If it is going to help writing this myself, is there any approach I could take to handle it more efficiently?
You seem to be looking for an ORM.
Here is one : http://www.doctrine-project.org/docs/orm/2.0/en/tutorials/getting-started-xml-edition.html
To be honest, I'd just hard-code the SQL, because:
Everyone else does so too. Big parts of WordPress would need to be rewritten, if they would ever wish to change from MySQL to something else. It would just be a waste of time to write your perfect layer for your plugin, if the rest of the whole system still only works with hard-coded SQL.
We don't live in a perfect world. Too much abstraction will, sooner or later, end up causing performance and other issues that I can't even think of yet. Keep it simple. Also, using SQL you can benefit from some performance "hacks", which may not work with other systems.
SQL is a widely accepted standard and can already be seen as an abstraction layer. For example, there's even the possibility of accessing Facebook's Graph via SQL-like syntax (see FQL). If you want to change to another data source, you'll probably find some layer which supports SQL syntax anyway! In that sense, you could even say SQL already is some kind of abstraction layer.
But: if you decide to use SQL, be sure to use WordPress' $wpdb. Using that, you're on the safe side, as WordPress takes care of connecting to the database, forming the queries, etc. If, one day, WordPress will decide to change from databases to something else, they'll need to create a $wpdb-layer to that new source - for backwards compatibility. Also, many general requests already are in $wpdb as functions (such as $wpdb->insert()), so there's no direct need to hard-code SQL.
If however, you decide to use such an abstraction layer: Wikipedia has more information.
Update: I just found out that the CMS Drupal uses a database abstraction layer - but they still use SQL to form their queries, for all the different databases! I think that shows pretty clearly, how SQL can already be used as an abstraction layer.
I have a class that helps me to handle users.
For example:
$user = new User("login","passw");
$name = $user->getName();
$surname = $user->getSurname();
$table = $user->showStats();
All these methods have SQL queries inside. Some actions require only one sql queries, some - more than one. If database structure changes - it will be difficult to change all queries (class is long). So I thought to keep SQL queries away from this class. But how to do this?
After reading this question I learned about stored procedures. Does that mean that one action now requires only one SQL query (a call to a stored procedure)? But how do I organize the separation of SQL from PHP? Should I keep the SQL queries in an array? Or maybe there should be an SQL-queries class. If so, how should I organise this class (maybe there is a pattern I should learn)?
This is a surprisingly large topic, but I have a few suggestions to help you on your way:
You should look into object-relational mapping, in which an object automatically generates SQL queries. Have a look at the Object-Relational Mapping and Active Record articles for an overview. This will keep your database code minimal and make it easier if your table structure changes.
But there is no silver bullet here. If your schema changes you will have to change your queries to match. Some people prefer to deal with this by encapsulating their query logic within database views and stored procedures. This is also a good approach if you are consistent, but keep in mind that once you start writing stored procedures, they are going to be tied heavily to the particular database you are using. There is nothing wrong with using them, but they are going to make it much more difficult for you to switch databases down the road - usually not an issue, but an important aspect to keep in mind.
Anyway, whatever method you choose, I recommend that you store your database logic within several "Model" classes. It looks like you are doing something similar to this already. The basic idea is that each model encapsulates logic for a particular area of the database. Traditionally, each model would map to a single table in the DB - this is how the Ruby on Rails active record class works. It is a good strategy as it breaks down your database logic into simple little "chunks". If you keep all of the database query logic within a single file it can quickly grow out of control and become a maintenance nightmare - trust me, I've been there!
To get a better understanding of the "big picture", I recommend you spend some time reading up on web Model-View-Controller (MVC) architecture. You will also want to look at the established PHP MVC frameworks, such as CodeIgniter, Kohana, CakePHP, etc. Even if you do not use one (although I recommend you do), it would be helpful to see how these frameworks organize your code.
I would say you should look into implementing the "repository" design pattern in your code.
A good answer to how to implement this would be too long for this space, so I'll post a couple of PHP-oriented references:
Travis Swicegood -- Repository Pattern in PHP
Jon Lebensold -- A Repository Pattern in PHP
You are on the right lines: if you use separation of concerns to separate your business logic from your data access logic, you will be in a better place.
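For a feel of what that could look like, here is a minimal sketch of a repository that takes the SQL out of the User class from the question. The users table, its columns, the PdoUserRepository name, and the in-memory SQLite connection are assumptions for illustration only; the linked articles describe fuller implementations.

```php
<?php
// Plain data object: no SQL inside, just the values the application needs.
final class User
{
    public string $login;
    public string $name;
    public string $surname;

    public function __construct(string $login, string $name, string $surname)
    {
        $this->login = $login;
        $this->name = $name;
        $this->surname = $surname;
    }
}

// All SQL for users lives here; a schema change means editing this class only.
class PdoUserRepository
{
    private PDO $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function findByLogin(string $login): ?User
    {
        $stmt = $this->pdo->prepare('SELECT login, name, surname FROM users WHERE login = ?');
        $stmt->execute([$login]);
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        return $row === false ? null : new User($row['login'], $row['name'], $row['surname']);
    }
}

// Hypothetical table and sample row, created in memory for the sketch.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (login TEXT PRIMARY KEY, name TEXT, surname TEXT)');
$pdo->exec("INSERT INTO users (login, name, surname) VALUES ('jdoe', 'John', 'Doe')");

$repo = new PdoUserRepository($pdo);
$user = $repo->findByLogin('jdoe');
$missing = $repo->findByLogin('nobody');
```

The business logic then works with User objects and never needs to know which tables or queries produced them.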
Judging by your "there are already 2K lines of code" statement, you're either maintaining something, or midway through developing something.
Both Faust and Justin Ethier make good recommendations - "how should I separate my database access from my application code" is one of the oldest, and most-answered, questions in web development.
Personally, I like MVC - it's pretty much the default paradigm for web development, it balances maintainability with productivity, and there are a load of frameworks to support you while you're doing it.
You may, of course, decide that re-writing your app from scratch is too much effort - in which case the repository pattern is a good halfway house.
Either way, you need to read up on refactoring - getting from where you are to where you want to be is going to be tricky. I recommend the book by Fowler, as a starter.
Could you explain more about why your database schema may change? That's usually a sign of trouble ahead.....