I am refactoring a CodeIgniter project that uses the database extensively (right now that layer uses PDO and generated queries, but it has become unreadable and needs refactoring), and I am trying to figure out the best way to go. I am interested in ease of development but, more importantly, in performance, and I couldn't find any useful performance comparisons of:
CI's Active Record, NotORM and ORMs (currently I am looking at GAS and DataMapper, but I am open to other suggestions) that can be integrated with CI.
I started looking at DataMapper, but then found a post claiming it is twice as slow as CI's Active Record, and that seems like a deal-breaker to me. I am OK with a bit of overhead for extra flexibility, code reuse and readability, but I would rather live with really fast bad code than find out I have significantly slowed my page loading times.
I am looking for something like http://www.techempower.com/benchmarks/ , but for ORMs and other DB access layers and not PHP frameworks.
Actually there's no good answer here.
If you need extremely good performance, then use PDO. You write plain SQL queries, so you have 100% control.
If you want to introduce some tool to ease the way you write SQL, maybe you can have a look at any SQL fluent-API library, which can abstract the SQL "a bit":
select('X')->from('Y')->where('Z')->limit(10);
A bit clearer, maybe :). It will probably also generate queries compatible with many RDBMSs (MySQL, Oracle, PostgreSQL...).
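To make the fluent idea concrete, here is a minimal sketch of such a builder. It is written in Python only so that it is self-contained and runnable; the class name `Query` and its methods are invented for this illustration and are not the API of CodeIgniter or of any real library. It also emits a single dialect (`LIMIT n`), whereas a real library would handle the differences between RDBMSs.

```python
class Query:
    """Tiny fluent SELECT builder; every name here is illustrative."""

    def __init__(self):
        self._columns = []
        self._table = None
        self._conditions = []
        self._params = []
        self._limit = None

    def select(self, *columns):
        self._columns.extend(columns)
        return self  # returning self is what makes the calls chainable

    def from_(self, table):  # "from" is a reserved word in Python
        self._table = table
        return self

    def where(self, condition, *params):
        # Conditions use "?" placeholders; values are kept separately.
        self._conditions.append(condition)
        self._params.extend(params)
        return self

    def limit(self, count):
        self._limit = int(count)
        return self

    def build(self):
        """Return (sql, params), ready for a DB-API cursor.execute()."""
        sql = "SELECT " + (", ".join(self._columns) or "*")
        sql += " FROM " + self._table
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        if self._limit is not None:
            sql += " LIMIT " + str(self._limit)
        return sql, tuple(self._params)


sql, params = (
    Query().select("id", "name").from_("users").where("age > ?", 18).limit(10).build()
)
# sql    == "SELECT id, name FROM users WHERE age > ? LIMIT 10"
# params == (18,)
```

The builder only assembles a string and a parameter tuple, so the overhead is small compared with an ORM, but you still decide exactly which SQL is sent.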
None of the above alternatives is an ORM. If you need one, of course there's a performance penalty (and we can say it's always "big"). Good, modern ORMs usually also allow you to cache results, or even the generated SQL, to avoid part of the overhead.
Anyway, the performance is degraded, of course. For each query, the ORM has to transform the result set into your objects (and all their relations), which is (on the other hand) very cool :D. And you lose control over what the ORM is doing internally (sometimes, and if you don't know the ORM).
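The "transform the result set into objects" step can be sketched roughly as follows. This is a hand-written illustration in Python with an in-memory SQLite database; the `author`/`post` tables and the `hydrate_authors` function are made up for the example, and a real ORM does considerably more (identity maps, change tracking, lazy proxies).

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Post:
    id: int
    title: str


@dataclass
class Author:
    id: int
    name: str
    posts: list = field(default_factory=list)


def hydrate_authors(rows):
    """Turn flat (author_id, name, post_id, title) rows into Author objects."""
    authors = {}
    for author_id, name, post_id, title in rows:
        if author_id not in authors:
            authors[author_id] = Author(author_id, name)
        if post_id is not None:  # LEFT JOIN yields NULLs for authors without posts
            authors[author_id].posts.append(Post(post_id, title))
    return list(authors.values())


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO post VALUES (10, 1, 'First'), (11, 1, 'Second');
""")
rows = conn.execute("""
    SELECT a.id, a.name, p.id, p.title
    FROM author a LEFT JOIN post p ON p.author_id = a.id
    ORDER BY a.id, p.id
""").fetchall()
authors = hydrate_authors(rows)
```

Every row costs an object allocation and a dictionary lookup on top of the query itself, which is where part of the per-query overhead comes from.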
There's no good answer here, it depends on your use-case.
If you decide to use an ORM, have a look at Doctrine2. You can have a look at how to integrate it with CodeIgniter here: http://doctrine-orm.readthedocs.org/en/latest/cookbook/integrating-with-codeigniter.html
Related
I am in the process of picking a PHP framework for a web application I am starting. I have never really used a framework in the past but with this project there is a great need.
I have been debating between the usual suspects; CakePHP, Zend Framework and Symfony. I have been going back and forth about which framework will work best for me and this project. I am leaning towards CakePHP but I am still researching.
My question is not what framework is best. I know there is no real answer to that and there are tons of posts related to this subject. My question is related to the Model and ORM. I have read a lot about ORMs being slow and I am concerned about speed. I am very comfortable writing SQL and in the past have tried to keep all of my database interactions in stored procedures.
I am looking for some feedback about using CakePHP's ORM, or Doctrine with Zend or Symfony, as opposed to keeping everything in stored procedures. I know stored procedures are going to be faster, but what else will I lose if I do not use an ORM? I understand that an ORM will give me database abstraction, but in my mind that just helps people who do not write SQL. I also know that I do not know enough about ORMs.
I would appreciate any feedback about this, and about which framework might be best depending on whether or not I use an ORM.
Thanks for any help in advance.
0) Bang for effort
The key advantage of an ORM is that you don't spend as much time dealing with the persistence layer. A pre-written ORM (I have worked with EclipseLink) will provide a ton of things you probably won't get in custom-written stored procs. I think it's worth thinking about how much time you want to spend writing your persistence layer.
1) Caching
All the major ORMs provide multi-level distributed caches. Combined with Named/Predefined queries you can get SQL queries that don't actually have to go to the database. This can give you excellent performance.
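As a rough illustration of how a cached query avoids the database, here is a small sketch in Python. The class `CachingConnection` and its methods are hypothetical names for this example only; real ORM caches are more elaborate (multiple levels, expiry, distribution across servers).

```python
import sqlite3


class CachingConnection:
    """Caches SELECT results by (sql, params); clears the cache on any write."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = {}
        self.db_hits = 0  # how many reads actually reached the database

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self._cache:
            self.db_hits += 1
            self._cache[key] = self._conn.execute(sql, params).fetchall()
        return self._cache[key]

    def write(self, sql, params=()):
        self._conn.execute(sql, params)
        self._cache.clear()  # crude invalidation: drop everything on any write


raw = sqlite3.connect(":memory:")
raw.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
db = CachingConnection(raw)
db.write("INSERT INTO item (name) VALUES (?)", ("pen",))

first = db.query("SELECT name FROM item WHERE id = ?", (1,))
second = db.query("SELECT name FROM item WHERE id = ?", (1,))  # served from cache
```

The hard part in practice is invalidation; clearing the whole cache on every write, as done here, is only reasonable for a read-heavy workload.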
2) Abstraction
ORMs allow you to define your table layout in one location and then they manage all the painful mapping between columns/tables and objects. Some will allow you to remap column, table and schema names without changing any code at all. If you work with people who like to change things around this can really simplify things.
3) Speed
Some ORMs can have bad performance, but it really is based on how you use it. I find that you tend to end up over-querying for things. On the other hand, you get things like built-in query profilers. You can write custom SQL for queries if you find you aren't getting the performance you need.
Mark Robinson gives a great response. I'm just going to back up what he says by giving our experience with Doctrine2.
I chose to use Doctrine2 as our ORM with Zend Framework a little while back. Our project is still being developed, but choosing D2 has been a decision we've not regretted one bit. Whilst you still need to give a lot of thought to your data architecture, D2 gives you the flexibility to be able to modify that model at a later date if needs be. It allowed us to try things out quickly in the early stages and the room to grow and change later when we decided that things weren't quite right - it happens.
In relation to Mark's point about abstraction, one of the other things I love about D2 is that we're working with plain old PHP objects. Don't underestimate the power of being able to think in terms of objects, both for the people responsible for modelling the data and for the developers who work with the data. It'll make your life easier, trust me. Also, having inline documentation of the ORM mapping (if you choose the docblock approach) is nice.
Right, performance. As Mark says, there are ways and means to speed things up - but there's always going to be some overhead. Whenever you introduce another software layer, there'll be some performance hit, but it's a tradeoff. For us, the tradeoff - the advantages of using the ORM vs performance - is worth it. We'd spend more time debugging code and not getting things done without the ORM.
Anyway, D2 can help you with caching for queries, results and metadata. Whilst you probably just want an array cache during development, it's great that the facility for things like APC, memcache etc. is there when you go to test and deploy. You could even develop your own if you're brave.
http://www.doctrine-project.org/docs/orm/2.0/en/reference/caching.html
Hope that helps, I've probably missed stuff, but if you have any questions just fire them in and I'll do my best.
A framework mainly implements three kinds of features:
1) The flow between "getting a request" and "rendering a page". That's where you put things like MVC, the router, etc.
2) The way to manage your model and its persistence. That's where you see acronyms like ORM, DBAL, DAO.
3) Components. Features that often also work standalone, like XML parsing, i18n handling, PDF generation...
When you choose your framework, it in fact means that you choose 1). It's the thing you will certainly have to stick with; it's the flow of your application. 2) and 3)? You can integrate those you prefer. As an example, I'm on Zend Framework with most of its components, but use Doctrine ORM 2 and Symfony's Dependency Injection. A friend of mine is on Symfony 2 and uses Doctrine ORM too, but does his PDF generation and mail management with Zend's related components.
The other thing you need to know is that there is currently a "second generation" of PHP frameworks/ORMs, (re)written to take advantage of the new PHP 5.3 features and/or to solve the general performance/coupling issues they (nearly) all had. Some of them are production ready, some are still under development:
Doctrine ORM 2 (production ready)
Symfony 2 (production ready)
CakePHP 2 (in RC 2 currently, but by the time your project is ready it should be stable)
Zend Framework 2 (still under active development, but normally not for so long)
FLOW 3 (beta2, should be ready soon too)
For the ORM part, I'd recommend using one, especially Doctrine's. #Mark and #iainp999 explained perfectly why.
ORMs are for programmers who don't "grok" SQL!
Yes, ORMs make it easier (or at least require fewer lines of code) to write simple CRUD stuff, but when you get to more complex requirements it's like trying to write SQL with a piece of wet spaghetti from ten feet away.
So stick with SQL.
It's worth looking at something like "SQLMap", which is an ORM starting from the "R"elational side of the mapping (most try to map an "O"bject on to a table). This will allow you to write the SQL yourself and generate the appropriate "helper" classes to easily access the results in your program.
I have been using ezSQL for the last few years but feel it is outdated, though I like its simplicity and its file-based caching with JSON, for small result sets that is.
So, starting a new project, I was looking for suggestions on a better MySQL class for PHP. I know the DB will only be MySQL, so portability is not a requirement. I read about the mysqli extension, PDO etc. but just don't know which one would be best for my situation. The site does a lot more reads than writes, though there are times when the admin tool does a lot of writes to the DB. I looked at Doctrine but don't know if that is too "bloated" for what I need. Hopefully this isn't too vague. Any suggestions?
EDIT
The site isn't small; I would consider it a high-traffic site with a lot of DB queries.
What don't you like about ezSQL? I often wish there was something like it for other protocols/languages I encounter. Every syntax should be written like ezSQL, in my opinion. It describes the operation to be performed in as few words as possible, in the clearest and most logical order. Do you actually have performance problems, or are you just worried that something better has come along? I agree that ezSQL is rarely mentioned, but I have yet to find anything that matches its simplicity, conciseness, and function...
From what I know of ezSQL (via its WordPress counterpart), I would also consider Doctrine too much for the moment, because it is a complete data mapper for the database, whereas you are probably looking for a way to move away from ezSQL, which I think is a good idea.
Basically, you might be interested in a data-access abstraction layer. That could be PDO, as it's built into PHP. Even if you don't need to change the database server, it will give you defined interfaces for querying and accessing the data.
As you build the site from scratch, I can suggest you consider using some lightweight framework. A good introduction in my eyes is When Flat PHP meets Symfony which shows how a webapp can generally benefit from patterns and a flexible design.
From experience:
Doctrine - very easy to use, and I love the Doctrine Query Language. I never had to do the initial setup though, so I'm not sure how hard it is. It has a very good community and lots of tutorials.
Propel - used it for a bit. It does the job and is very similar to Doctrine. However, the documentation is poor and the community is slack. I found that when I didn't know something it was quite hard to find an answer, and often I had to post on the Google forums.
Note: If you are starting from scratch you might want to look at some of the frameworks; symfony + Doctrine is a good combination and makes development a lot easier.
Links:
- http://www.doctrine-project.org/
- http://www.propelorm.org/
I know there already are a lot of posts floating on the web regarding this topic.
However, many people tend to focus on different things when talking about it. My main goal is to create a scalable web application that is easy to maintain. Speed of development and maintenance matters far more TO ME than raw performance (or I could have used Java instead).
This is because I have noticed that when a project grows in code size, you must have maintainable code. When I first wrote my application the procedural way, without any framework, it became a nightmare after only one month. I was totally lost in a jungle of spaghetti code. I didn't have any structure at all, even though I fought hard to implement one.
Then I realized that I have to have structure and code the right way. I started to use CodeIgniter. That really gave me structure and maintainable code. A lot of users say that frameworks are slowing things down, but I think they missed the picture. The code must be maintainable and easy to understand.
Framework + OOP + MVC made my web application so structured that adding features was not a problem anymore.
When I create a model, I tend to think of it as representing a data object, maybe a form or even a table/database. So I thought about an ORM (Doctrine). Maybe this would be yet another great addition to my web application, giving it more structure so I could focus on the features and not repeat myself.
However, I have never used any ORM before and I have only learned the basics of it, why it's good to use and so on.
So now I'm asking all of you who, just like me, are striving for maintainable code and know how important that is: is an ORM (Doctrine) a must-have for maintainable code, just like framework + MVC + OOP?
I want real-life experience rather than "raw SQL is faster" advice, because if I only cared about raw performance, I would have dropped framework + MVC + OOP in the first place and kept living in a coding nightmare.
It feels like it fits so well into an MVC framework where the models are the tables.
Right now I've got about 150 SQL queries in one file doing easy things like getting an entry by id, getting an entry by name, getting an entry by email, getting an entry by X and so on. I thought that an ORM could reduce these lines, or else I'm pretty sure this will grow to 1000 lines of SQL in the future. And if I change one column, I have to change all of them! What a nightmare, just thinking about it. And maybe this could also give me nice models that fit the MVC pattern.
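Even without an ORM, the "get entry by id / name / email" family of queries can usually be collapsed into one parameterised helper. The sketch below uses Python and SQLite purely for illustration; the `entry` table, its columns and the function `find_entry_by` are hypothetical. The column name is checked against a whitelist because column names cannot be bound as query parameters.

```python
import sqlite3

# Hypothetical columns of a hypothetical "entry" table.
ENTRY_COLUMNS = ("id", "name", "email")


def find_entry_by(conn, column, value):
    """One helper instead of find_by_id, find_by_name, find_by_email, ..."""
    if column not in ENTRY_COLUMNS:
        raise ValueError("unknown column: " + column)
    # The column list lives in one place, so renaming a column is one change.
    sql = "SELECT {} FROM entry WHERE {} = ?".format(", ".join(ENTRY_COLUMNS), column)
    return conn.execute(sql, (value,)).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO entry (name, email) VALUES (?, ?)", ("ann", "ann@example.com"))

by_id = find_entry_by(conn, "id", 1)
by_email = find_entry_by(conn, "email", "ann@example.com")
```

An ORM gives you this kind of finder for every table without writing it, which is the reduction in lines the question is hoping for.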
Is ORM the right way to go for structure and maintainable code?
Ajsie,
My vote is for an ORM. I use NHibernate. It's not perfect and there is a sizable learning curve. But the code is much more maintainable, much more OOP. It's almost impossible to create an application using OOP without an ORM unless you like a lot of duplicate code. It will probably eliminate the vast majority of your SQL code.
And here's the other thing. If you are going to build an OOP system, you'll end up writing your own O/R mapper anyway. You'll need to call dynamic SQL or stored procs, get the data as a reader or dataset, convert that to an object, wire up relationships to other objects, turn object modifications into SQL inserts/updates, etc. What you write will be slower and more buggy than NHibernate or something else that's been on the market for a long while.
Your only other choice really is to build a very data-centric, procedural application. Yes, it may perform faster in some areas. I agree that performance IS important. But what matters is that it's FAST ENOUGH. If you save a few milliseconds here and there doing procedural code, your users will not notice the performance increase. But you'll notice the crappy code.
The biggest performance bottlenecks in an ORM come from how objects are pre-fetched and lazy-loaded. This gets into the N+1 query problem with ORMs. However, it is easily solved. You just have to performance-tune your object queries and limit the number of calls to the database, tell the ORM when to use joins, etc. NHibernate also supports a rich caching mechanism, so at times you don't hit the database at all.
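The lazy-load versus pre-fetch difference can be shown without any ORM at all. The following Python/SQLite sketch counts the statements sent in each case; the schema, the `CountingConnection` wrapper and both functions are invented for the illustration and stand in for what an ORM would do behind the scenes.

```python
import sqlite3


class CountingConnection:
    """Wraps a connection and counts statements, like an ORM's query log."""

    def __init__(self, conn):
        self._conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self._conn.execute(sql, params)


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO post VALUES (10, 1, 'First'), (11, 1, 'Second'), (12, 2, 'Hello');
    """)
    return CountingConnection(conn)


def posts_lazy(db):
    """Lazy loading: one query for the authors, then one more per author."""
    result = {}
    authors = db.execute("SELECT id, name FROM author ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = db.execute(
            "SELECT title FROM post WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def posts_joined(db):
    """Pre-fetching: a single JOIN brings back everything at once."""
    result = {}
    rows = db.execute("""
        SELECT a.name, p.title
        FROM author a LEFT JOIN post p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """).fetchall()
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result


lazy_db, joined_db = make_db(), make_db()
lazy_result = posts_lazy(lazy_db)        # 1 + 3 authors = 4 statements
joined_result = posts_joined(joined_db)  # 1 statement
```

Both functions return the same data; only the number of round trips differs, and that number grows with the number of authors in the lazy version.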
I also disagree with those that say performance is about users and maintenance is about coders. If your code is not easily maintained, it will be buggy and slow to add features. Your users will care about that.
I wont say every application should have an ORM, but I think most will benefit. Also don't be afraid to use native SQL or stored procedures with an ORM every now and then where necessary. If you have to do batch updates to millions of records or write a very complex report (hopefully against a separate, denormalized reporting database) then straight SQL is the way to go. Use ORMs for the OOP, transactional, business logic and C.R.U.D. stuff, and use SQL for the exceptions and edge cases.
I'd recommend reading Jeffrey Palermo's stuff on NHibernate and Onion Architecture. Also, take his agile boot camp or other classes to learn O/R Mapping, NHibernate and OOP. Thats what we use: NHibernate, MVC, TDD, Dependency Injection.
"A lot of users say that frameworks are slowing things down, but I think they missed the big picture. The code MUST BE MAINTAINABLE and EASY TO UNDERSTAND."
A well-structured, highly-maintainable system is worthless if its performance is Teh Suck!
Maintainability is something which benefits the coders who construct an application. Raw performance benefits the real people who use the app for their work (or whatever). So, whose concerns ought to be paramount: those who build the system or those who pay for it?
I know it's not as simple as that, because the customer will eventually pay for a poorly structured system - perhaps more bugs, certainly more time to fix them, more time to implement enhancements to the application. As is usually the case, everything is a trade-off.
I started developing like you, without ORM tools.
Then I worked for companies where software development was more industrialized, and they all used some kind of ORM tool (with more or fewer features). Development is far easier and faster, produces more maintainable code, etc.
But I've also seen the drawback of these tools: very slow performance. It was mostly due to misuse of the tool, though (Hibernate in that case).
ORM tools are very complex, so it is easy to misuse them, but if you have experience with them, you should be able to get nearly the same performance as with raw SQL. I have three pieces of advice for you:
If performance is not critical, use an ORM tool (choose a good one; I am not developing with PHP, so I can't give you a name).
For each feature you add, be sure to check the SQL that the ORM tool produces and sends to the database (using a logging facility, for example). Ask yourself whether it is the way you would have written the queries. Most of the inefficiencies of ORM tools come from unwanted data being fetched from the DB, a single request being split into multiple ones, etc. Slowness rarely comes from the tool itself.
Do not use the tool for everything. Choose wisely when not to use it (you reduce maintainability each time you do raw DB access), but sometimes it just isn't worth trying to make the ORM tool do something it was not developed for.
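The second piece of advice (look at the SQL that is actually sent) is usually a matter of switching on the tool's query log. As a language-neutral illustration, Python's built-in `sqlite3` module offers `Connection.set_trace_callback`, which hands every executed statement to a function of your choice; the table and data below are made up for the example, and your ORM will have its own equivalent setting.

```python
import sqlite3

sql_log = []  # every statement the connection runs is appended here

conn = sqlite3.connect(":memory:")
conn.set_trace_callback(sql_log.append)

conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO author (name) VALUES (?)", ("Ann",))
conn.execute("SELECT name FROM author WHERE id = ?", (1,)).fetchall()

# Review what was really sent, including statements you did not write yourself
# (for example an implicit BEGIN before the INSERT).
for statement in sql_log:
    print(statement)
```

Reading this log for each new feature is how you spot unexpected extra queries or columns you never use.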
Edit:
ORM tools are most useful with a very complex model: many relationships between entities. That is mostly encountered in the configuration part of an application, or in its complex business part.
So they are less useful if you have only a few entities, and if there is less chance they will be changed (refactored).
The limit between few entities and many is not clear. I would say more than 50 different types (SQL tables, not counting join tables) is many, and fewer than 10 is few.
I don't know what was used to build Stack Overflow, but it must have been very carefully performance-tested beforehand.
If you want to build a website that will get such a heavy load, and you don't have experience with that, try to get someone on your team who has already worked on such sites (performance testing with a real data set and a representative number of concurrent users is not an easy or quick task to implement). Having someone who has experience with it will greatly speed up the process.
It's very important to have high maintainability. I've developed large-scale web applications with low-level, super-high performance. The big disadvantage was maintaining the system, that is, developing new features. If you're too slow at developing, the customers will look for other systems/applications. It's a trade-off. Most ORMs have features for when you need to send optimized queries directly as SQL. The ORM itself isn't the bottleneck. I'd say it's more about good DB design.
I think you missed the picture. Performance is everyday for your users; they care not at all about maintainability. You are being ethnocentric: you are concerned only with your personal concerns and not those of the people who pay for the system. It isn't all about your convenience.
Perhaps you should sit down with the users and watch them use your system for a day or two. Then you should sit down at a PC that is the same power as the ones they use (not a dev machine) and spend an entire week doing nothing but using your system all day long. Then you might understand their point.
I'm starting to get to grips with CodeIgniter and came across its support for the Active Record pattern.
I like the fact that it generates the SQL code for you, so essentially you can retrieve, update and insert data into a database without tying your application to a specific database engine.
It makes simple queries very simple, but my concern is that it makes complex queries more complex, if not impossible (e.g. if there is a need for engine-specific functions).
My Questions
What is your opinion of this pattern especially regarding CodeIgniters implementation?
Are there any speed issues with wrapping the database in another layer?
Does it (logic) become messy when trying to build very complex queries?
Do the advantages outweigh the disadvantages?
OK, first of all, 99% of your queries will be simple select/insert/update/delete. For this, Active Record is great. It provides simple syntax that can be easily changed. For more complex queries you should just use the query method. That's what it's for.
Second, it provides escaping and security for those queries. Face it, your application will probably have hundreds if not thousands of places where queries take place. You're bound to screw up and forget to properly escape some of them. Active Record does not forget.
Third, performance in my experience is not dramatically affected. Of course it is affected, but it's probably around .00001 per query. I think that is perfectly acceptable for the added security and sanity checks it does for you.
Lastly, I think it's clear that I believe the advantages are far greater than the disadvantages. Having secure queries that even your most junior developer can understand and not screw up is a great thing.
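The escaping point is easiest to see side by side. The sketch below is Python with SQLite rather than CodeIgniter, and the `users` table and both function names are invented for the example; the idea carries over to any layer that binds values instead of pasting them into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (name) VALUES ('alice'), ('bob');
""")


def find_unsafe(conn, name):
    # Vulnerable: user input is pasted straight into the SQL text.
    return conn.execute(
        "SELECT name FROM users WHERE name = '" + name + "'"
    ).fetchall()


def find_safe(conn, name):
    # The value is bound as data, so it cannot change the shape of the query.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
leaked = find_unsafe(conn, malicious)  # the WHERE clause becomes always true
nothing = find_safe(conn, malicious)   # no user is literally named that
```

A query layer that always takes the second path is what "does not forget" means in practice.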
What is your opinion (sic) of this pattern especially regarding CodeIgniters implementation?
Can't say much about CI's implementation. Generally I avoid AR for anything but the simplest applications. If the table does not match 1:1 to my business objects, I don't use AR, as it will make modeling the application difficult. I also don't like the idea of coupling the persistence layer to my business objects. It's a violation of separation of concerns. Why should a Product know how to save itself? Further reading: http://kore-nordmann.de/blog/why_active_record_sucks.html
EDIT after the comment of #kemp, I looked at the CI User Guide to see how they implemented AR:
As you can see in PoEAA an AR is an object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data. This is not what CI does though. It just provides an API to build queries. I understood that there is a Model class which extends AR and which can be used to build business objects, but that would be more like a Row Data Gateway then. Check out PHPActiveRecord for an alternate implementation.
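For contrast with a pure query-building API, here is a minimal sketch of what an Active Record in the PoEAA sense looks like: one object wraps one row, knows how to load and save itself, and carries domain logic. It is written in Python with SQLite for self-containment; the `Product` class, its table and its methods are illustrative only and are not taken from CI or PHPActiveRecord.

```python
import sqlite3


class Product:
    """Active Record sketch: the object wraps a row and persists itself."""

    conn = None  # shared connection, set once at startup (an assumption of this sketch)

    def __init__(self, name, price_cents, id=None):
        self.id = id
        self.name = name
        self.price_cents = price_cents

    @classmethod
    def find(cls, product_id):
        row = cls.conn.execute(
            "SELECT name, price_cents, id FROM product WHERE id = ?", (product_id,)
        ).fetchone()
        return cls(*row) if row else None

    def save(self):
        if self.id is None:  # new row
            cur = self.conn.execute(
                "INSERT INTO product (name, price_cents) VALUES (?, ?)",
                (self.name, self.price_cents),
            )
            self.id = cur.lastrowid
        else:  # existing row
            self.conn.execute(
                "UPDATE product SET name = ?, price_cents = ? WHERE id = ?",
                (self.name, self.price_cents, self.id),
            )

    def apply_discount(self, percent):
        # Domain logic living on the same object as the persistence code.
        self.price_cents = self.price_cents * (100 - percent) // 100


Product.conn = sqlite3.connect(":memory:")
Product.conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price_cents INTEGER)"
)
pen = Product("pen", 1000)
pen.save()
pen.apply_discount(10)
pen.save()
reloaded = Product.find(pen.id)
```

The coupling criticised above is visible here: the same class holds business rules (`apply_discount`) and SQL (`save`), which is convenient for simple tables and awkward once objects stop matching tables 1:1.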
Are there any speed issues with wrapping the database in another layer?
Whenever you abstract or wrap something into something else, you can be sure this comes with a performance impact over doing it raw. The question is, is it acceptable for your application. The only way to find out is by benchmarking. Further Reading: https://stackoverflow.com/search?q=orm+slow
EDIT In the case of CI's simple query-building API, I'd assume the performance impact to be negligible. Assembling the queries will logically take some more time than just passing a raw SQL string to the DB adapter, but that should be microseconds only. And as far as I have seen in the User Guide, you can also cache query strings. But when in doubt, benchmark.
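"When in doubt, benchmark" can be as small as the following sketch, which times the same query sent as a fixed string and assembled from parts on every call. It uses Python's `timeit` and an in-memory SQLite database as stand-ins; `build_sql` is a hypothetical placeholder for whatever query builder you are evaluating, and the numbers will depend entirely on your machine and data, so none are quoted here.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO users (name, age) VALUES ('alice', 30), ('bob', 17);
""")

RAW_SQL = "SELECT id, name FROM users WHERE age > ? LIMIT 10"


def build_sql(columns, table, condition, limit):
    """Stand-in for a query builder: assembles the same SQL from parts."""
    return "SELECT {} FROM {} WHERE {} LIMIT {}".format(
        ", ".join(columns), table, condition, int(limit)
    )


def run_raw():
    return conn.execute(RAW_SQL, (18,)).fetchall()


def run_built():
    sql = build_sql(["id", "name"], "users", "age > ?", 10)
    return conn.execute(sql, (18,)).fetchall()


def seconds_per_query(func, runs=10_000):
    """Average wall-clock seconds for one call of func."""
    return timeit.timeit(func, number=runs) / runs


print("raw:  ", seconds_per_query(run_raw))
print("built:", seconds_per_query(run_built))
```

Run it against a realistic table and a realistic query mix before drawing conclusions; an in-memory toy table exaggerates the share of time spent building the string.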
Does it (logic) become messy when trying to build very complex queries?
Depends on your queries. I've seen pretty messy SQL queries. Those don't get prettier when expressed through an OO interface. Depending on the API, you might find queries you won't be able to express through it. But then again, that depends on your queries.
Do the advantages outweigh the disadvantages?
That only you can decide. If it makes your life as a programmer easy, sure, why not. If it fits your programming needs, yes. Ruby on Rails is built heavily on that (AR) concept, so it can't be all that bad (although we could argue about this, too :))
There are two things that seem to be popular nowadays and I was wondering what are the pros and cons of using something like this: http://codeigniter.com/user_guide/database/active_record.html ?
Another thing is ORM (Doctrine for instance). What are the benefits of using these?
ActiveRecord is a pattern common in ORMs. Doctrine 1.x is an ORM which uses an ActiveRecord-ish style (Doctrine 2 moved to the Data Mapper pattern).
Some benefits of using tools like Doctrine:
Database independence: The code should be easy to port to different DBs. For example, I often test using SQLite and use MySQL or PostgreSQL in production with no changes in code.
They reduce the amount of code you have to write: A large part of application code deals with communicating with the database. An ORM takes care of most of that, so you can concentrate on writing the actual app.
Of course, they don't come without disadvantages:
Doctrine is heavy so it is slower than using straight SQL
ORMs can be complex, adding some weight to what you have to learn, and they can sometimes be difficult to understand for inexperienced programmers
You can take a look at these questions though they're not exactly PHP specific:
Are there good reasons not to use an ORM?
Using an ORM or plain SQL?
I tried to keep it lightweight and understandable. It even comes with its own MooTools-based class generator :)
http://www.schizofreend.nl/Pork.dbObject/
check it out :)