Web CMS Performance: pages/second (Joomla, Drupal, Plone, WP) - php

Note: I am not into web programming, so forgive my ignorance in case the question is trivial. Also, please don't comment about "how flawed" the out-of-box comparison of these products is. The question is not about how they compete against each other, rather about the reason behind the incredible slowness of ALL of them.
Just read about a benchmark, where Joomla, Drupal, Wordpress, Plone3 & 4 had been tested. What shocked me is this: out of the box they performed around 4-14 pages/sec. How is this possible, why are they so damn slow? A CMS should just query a DB and churn out the data packed into nice templates. DBs are fast. Templates should be fast (text replacement, no big deal). Our machines are superfast and yet, these high profile CMSs could only produce a few pages/sec. How come?

A CMS should just query a DB and churn out the data packed into nice templates.
Not so much. Major, modern CMS systems are incredibly complex beasts. A typical page isn't merely body text and a title, but also dynamic category-based content queries that aggregate info across many site areas; not to mention security trimming and user-specific content zones. For example, loading http://www.volvogroup.com involves at least 7 of these queries, plus recursion through the site structure to generate navigation, and connecting to external systems to pull in news and investor relations data. Considering that, it shouldn't be such a surprise that it takes a beefy server farm to serve up several hundred hits per second.

Because it takes a lot of processing to do all of that. It's not a matter of "query, replace, render". All of these products are made to fit a wide range of use cases and to be extensible to some degree, so the 3 basic operations you are talking about are really split up into many, many operations, all of which consume time.
All things being equal, the more flexible the system, the slower it will be "out of the box".

They are slow for a few reasons:
1 - Most of them are very modular, which means more files, more code and more DB queries.
2 - They largely (not WordPress so much) try to do everything; designing a system for every possible situation makes it more complex and harder to tune.
3 - Most of them (currently) support both PHP4 and PHP5, which is again just extra work.
4 - They are allegedly made so non-technical users can use them, which means they often have to do things in a way that is not the most efficient. For example, Drupal's CCK / Views lets people who can't program create, in effect, database tables and SQL queries; the flaw is that these tables and queries are very general in design and rather inefficient compared with custom-coded efforts.
5 - They tend to use lots of DB queries. Drupal uses 40 or so for a very basic page, and if you search their forums you will see reports of people claiming some pages make hundreds or even over a thousand queries.
They do of course offer caching and Drupal can get quite good performance from things like its boost module, the flaw being one of Drupal's (and Joomla's) selling points is you can make a community site, forum, Digg like site in it, all sites where caching is of limited use...
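To put rough numbers on the points above, here is a back-of-envelope sketch in Python. Every figure in it (60 ms of bootstrap, 1 ms per query, 20 ms of templating) is a made-up assumption for illustration, not a measurement of any of these CMSs; only the "40 or so queries" comes from the answer above.

```python
def pages_per_second(bootstrap_ms, queries, ms_per_query, render_ms, workers=1):
    """Estimate throughput when each worker handles one request at a time."""
    per_request_ms = bootstrap_ms + queries * ms_per_query + render_ms
    return workers * 1000.0 / per_request_ms


# Hypothetical uncached request: load the code, run 40 queries, fill templates.
uncached = pages_per_second(bootstrap_ms=60, queries=40, ms_per_query=1.0, render_ms=20)

# Hypothetical cached request: tiny bootstrap, one lookup, almost no rendering.
cached = pages_per_second(bootstrap_ms=5, queries=1, ms_per_query=1.0, render_ms=1)

print("uncached: %.1f pages/sec" % uncached)  # about 8.3
print("cached:   %.1f pages/sec" % cached)    # about 142.9
```

With these assumed costs a single worker lands in the same single-digit range as the benchmark, and it is the sum of many small steps that gets it there, not any one slow component.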

They are relatively complex systems. They allow a lot of hooks for plugins, so there's a lot of steps in the workflow from request to response.
In the real world, however, caching (whether in-application or opcode caching) is a tremendous boost to performance.
I'm not familiar with Plone, but the PHP CMSs essentially have to load and interpret almost the entire system with every single request.
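A minimal sketch of what in-application page caching can look like, written in Python for brevity. The `PageCache` class and the `render` callback are hypothetical names for this example; real CMS cache modules are considerably more involved (invalidation on edit, per-user pages and so on).

```python
import time


class PageCache:
    """Keep rendered pages in memory for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the cache can be tested
        self._store = {}            # url -> (expires_at, html)

    def get_or_render(self, url, render):
        now = self.clock()
        hit = self._store.get(url)
        if hit is not None and hit[0] > now:
            return hit[1]           # still fresh: skip the expensive work
        html = render(url)          # the slow path: queries plus templating
        self._store[url] = (now + self.ttl, html)
        return html
```

Only the first request inside each time window pays for the full bootstrap, queries and templating; the rest are served from memory. That is also why this helps much less on pages that differ per user.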

Please don't take offense to this, but to preface your question by explaining your unfamiliarity with web programming, and then criticize the performance of what appears to you to be a 'simple' operation is a bit short-sighted.
I would encourage you to learn a bit more about the common problems a CMS solves, and the general theory and practice of how dynamic web pages and HTTP work. It's far from a simple I/O operation.
Also, for practical use, I would highly encourage anyone running a CMS to find a caching solution. Caching is intended to solve a lot of the 'speed' problems that arise in web technology. It should be part of any common web stack.

In my opinion:
Because CMSs and frameworks include everything you might need and could use, such as:
Filtering user input
Creating PDFs, AJAX output, templates and a lot more
It depends on what you really need.
I don't agree with what you wrote:
A CMS should just query a DB and churn out the data packed into nice templates.
A CMS does a lot more than that.
And last but not least, don't compare desktop software speed with a web application.
There is a big difference.


Related

Will an application on PHP Yii framework with MySQL database handle an ERP solution of 20K employees?

We have got a project to build an ERP system for one of the largest garment manufacturers in Bangladesh.
They have around 20,000 employees, and about 10% of them leave or join every month. We are a small company with 5 PHP developers and don't have much experience with such a large project. We have previously developed various small and medium-scale projects with CodeIgniter/Zend Framework and a MySQL database.
For this project we decided to go with the Yii framework and MySQL or PostgreSQL. There will be about 1 million database queries every day. Now my question is: can MySQL/PostgreSQL handle this load, or is there a better alternative? Is it OK to do it with the Yii framework, or is there a better PHP framework for this kind of application? We have only 5 months to build the payroll and employee management modules.
For one thing, consider using PostgreSQL rather than MySQL. You're going to be dealing with mission-critical data and, in general, you'll appreciate that:
You will have access to window functions (useful for reports), WITH queries (common table expressions), and a much more robust query planner.
You will have extra data types, namely geometry types which can be used to optimize date-range overlap related queries.
You will have access to full text search functionality without needing to use an engine (MyISAM) which is prone to data corruption.
You will have more options to implement DB replication (some of which are built-in).
With respect to scalability, be wary that scalability != performance. The latter is about making individual requests faster; the former is about being able to handle massive quantities of simultaneous requests, and often comes with a slight hit to the latter.
For the PHP framework, I've never used Yii personally, so I do not know how well it scales. But I'm quite certain that Symfony2 (or Symfony, if you're not into using beta software) will scale nicely: its key devs work in a web-agency whose main customers are mid- to large-sized organizations.
I think Yii will work fine with a (relatively) large amount of data. I'm using Yii to manage 1.3 million records, with some thousand updates and some thousand queries a day, on a small virtual host with amazing performance.
If your database can handle this data, your Yii application will handle it too.
Your choice of database will be an important point, and @Denis said some important things about it. If you use MySQL, you will probably have to work out the right storage engine for your needs.
But there are some points I noticed while building a growing project with Yii. You should think about these things:
-Yii is a young framework: new technologies (like AJAX) are supported, but in some special cases it is a bit immature. It is very easy to generate a basic application in a couple of hours; problems can occur with special situations and requirements.
Example: Yii has a nice validation mechanism for user input (HTML forms). But up to Yii 1.1.6 it did not work with HTML checkboxes; since Yii 1.1.7 checkboxes are supported by default, but groups of checkboxes are not. Another problem: Yii always uses a table alias, which is always "t". That can be a problem! Sometimes you can define the alias yourself, sometimes not (which is inconsistent). If you want to lock several tables in MySQL you run into trouble, because Yii refers to every table by the same alias "t": you cannot lock the tables by table name, and you cannot lock several tables that share one alias. Those are specific problems, and you can solve them by writing plain PHP (not using Yii functionality). What I'm trying to say: the framework will not be helpful in every case, but it will be in most.
-Yii is easy to extend. It's easy to add your own extensions or functionality, so lots of those "small problems" can be solved by writing your own extensions or widgets, or by overriding methods.
-Yii supports PHP 5.2. Yii is compatible with 5.3 (it runs on 5.3; I have been using it since yesterday and it works) but doesn't support the new features of 5.3 (maybe you need one?).
PHP 5.3 will (maybe) be supported by Yii 2.0, in the distant future (2012).
-Yii has a small (but very good) community.
-There is no professional support (you can report bugs in the hope that somebody fixes them, or you fix them yourself).
-Yii is OO PHP. Keep that in mind when handling data objects. It's possible to load large amounts of data into data objects, but make sure your application server has enough RAM (that's not a Yii-specific thing, though).
All in all: I like Yii, and if your application is not too complex you will have a lot of fun and a nice, powerful application at the end.
I think you might be asking the wrong question, though.
You have five months to build an ERP system. The primary concerns should be:
security. You're dealing with money and personal details.
reliability. Uptime is probably a big deal (at least during working hours)
consistency. You don't want to risk losing data or corrupting data
developer productivity. Five months is not much time to build what you describe
maintainability. Sounds like this is a core enterprise asset, with a lifetime of years - it's likely to require maintenance and extension in the future.
scalability. You need to support tens of thousands of workers, each with many time cards, pay roll runs etc.
performance. You want the application to be responsive.
I would query whether performance is an absolute priority - it shouldn't be slow, but many ERP systems are a bit sluggish. Performance optimizations often mean trading off other priorities - for instance, an ORM system improves developer productivity, but can be slower than hand-crafted SQL.
As for scalability - as long as you have a reasonably designed schema, I don't think 20K employees is much of a challenge to any modern RDBMS on decent hardware.
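To see why the load in the question is modest, it helps to convert "1 million queries a day" into a per-second rate. A quick sketch in Python; the 8-hour working day is my assumption, not something stated in the question.

```python
def per_second(events_per_day, active_hours=24):
    """Average rate if the daily total is spread evenly over the active hours."""
    return events_per_day / (active_hours * 3600)


around_the_clock = per_second(1_000_000)                # about 11.6 queries/sec
working_day = per_second(1_000_000, active_hours=8)     # about 34.7 queries/sec

print("%.1f queries/sec over 24 hours" % around_the_clock)
print("%.1f queries/sec over an 8-hour day" % working_day)
```

Real traffic is bursty, so peaks will be higher than these averages, but the order of magnitude is tens of queries per second, not thousands.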
So, if I were you, I'd probably go with PostgreSQL, for the reasons Denis mentions. Never used Yii, but it seems perfectly reasonable. I would use ORM until you find a situation where the performance really is unacceptable.
Critically, I would put together a testing framework which allows you to monitor performance and scalability during the development cycle (I use JMeter for this), and only make performance optimizations if you really have to. Sacrificing all the other things - especially productivity and maintainability - in the name of performance before you know you have a problem tends to create over-complex solutions, and they in turn tend to have more security issues and maintenance challenges.
Just to add,
Yii scales very nicely in both directions: it is easy to add functionality with new modules, and it is one of the fastest PHP frameworks when it comes to performance.
The only drawback I can see with Yii is that it has a smaller user base, and so a bit less support, than some other frameworks, but this is changing fast.
The best part of Yii is the gii based code generation which helps you get started really quickly once you get used to it.
Yii is very flexible, light and easy to learn PHP framework.

CMS design patterns and considerations

I am creating the cms for a relatively simple site - portfolio, some general content pages, custom blog etc.
What are some of the best patterns to consider before diving into the design.
I want the system to be as flexible as possible without being too complex.
I have looked for some good resources that discus cms and blog design but can't find anything too good.
My language is php but I suppose I am looking for more language independent advice.
Flexibility without complexity... nice program.
Maybe you're a genius and you will make something that fits your needs. But I think the biggest problems you will face are security and robustness. So really, take the other advice on this page and have a look at wordpress, drupal, joomla and ezpublish. A lot of security stuff is already done. And not only security...
So, study some of these tools, track their flaws, check their security policy. Study how they handle caching, sessions, bootstrap, absolute & relative URL management, documents (images, videos, etc), ajax, authentication, identification, ACL, user interfaces, rich-text editing, migrations, templating, page composition, content filtering (I have tried to leave out the things you won't need: plugins, database abstraction, fine caching, CSS and JS minification, all the extra-complex stuff not needed for a single-instance simple CMS). Soon you'll have a 'picture' of the stuff they've done.
By doing this work, you'll certainly notice some big differences, and mistakes. You'll start going on IRC and flaming developers, telling them that others have made better choices. You'll start forgetting to shave. You'll maybe make some contributions. Some will be accepted, others won't. Old core devs don't like it when someone explains why they made mistakes (and they make mistakes).
Now comes the day you have a beard. Some of your contributions will start looking like forks. You will have enemies, and friends, or followers. And you will start feeling the force.
And you will go on irc and tell god that the world is ugly and that you'll make the first CMS which will be flexible without being complex. And people will cry. And birds will run in circles. And you will be able to explain what are the design pattern of a CMS.
I am a user. I know what I want. Doing what I want will make user happy. I'm happy.
You shall not trust code from people with glasses
"MVC MVC MVC" : and the people responds 'that shall be done'
Seriously, there's still a place for a good CMS with disruptive innovation; the fork history started a long time ago with phpNuke (as far as I can remember). But some of the current products are really fine for most tasks.
I'm probably risking my reputation here, but my experience shows that building your own CMS can be a very justified decision, especially once you are familiar with the current open-source systems and understand exactly what they lack in terms of features, security or whatnot. Open source often means a lot of backward-compatibility concerns and bad architecture decisions that cannot easily be changed.
I strongly suggest that instead of just taking on MVC you take a look at ideas that make it attractive.
One main problem with CMSes is the range of technologies involved in driving dynamic web-sites: imperative php for logic, declarative SQL for data queries, markup HTML for interface, imperative/functional javascript for dynamic interface, JSON for ajax calls etc. To keep the system manageable you have to keep these technologies in a controlled and understandable environment, but yet allow for smooth integration. Knowledge and best practices are out there. MVC is but one approach to manage this problem.
My choice at the time was to use the following principles:
Object-oriented code with static calling (php is a one-run thing, many instances of code objects are rarely justified), nothing except for one line of init code in global context
100% code-design separation with the use of XSLT and custom content processor
Custom router that can take any http request and reroute it to registered methods
Custom content processor that can take arbitrary method output and convert it into any usable format such as xhtml, xml, json etc. based on the request parameters (i.e. http://local/class/method.xhtml, http://local/class/method.json)
One copy of code for as many virtual web servers as necessary
SQL query builder (chosen for flexibility over ORM) for all database queries
Mandatory filtering of method input with filter_* functions
I believe you can choose a few that you like :) And good luck!
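The router and content processor from the list above can be sketched roughly like this. It is a language-independent illustration in Python, not the author's actual code: `route`, `dispatch`, `ROUTES`, `FORMATS` and the `blog/latest` handler are all names invented for the example.

```python
import json
from xml.sax.saxutils import escape

ROUTES = {}  # (class name, method name) -> handler function


def route(cls_name, method_name):
    """Register a handler for /<cls_name>/<method_name>.<format>."""
    def register(func):
        ROUTES[(cls_name, method_name)] = func
        return func
    return register


@route("blog", "latest")
def latest_posts():
    # Handlers return plain data; they know nothing about the output format.
    return {"title": "Hello", "views": 3}


def to_xml(data):
    fields = "".join("<%s>%s</%s>" % (k, escape(str(v)), k) for k, v in data.items())
    return "<result>%s</result>" % fields


# The "content processor": one converter per output format.
FORMATS = {"json": lambda d: json.dumps(d, sort_keys=True), "xml": to_xml}


def dispatch(path):
    """Handle a path such as /blog/latest.json and return (status, body)."""
    target, _, fmt = path.strip("/").rpartition(".")
    cls_name, _, method_name = target.partition("/")
    handler = ROUTES.get((cls_name, method_name))
    if handler is None or fmt not in FORMATS:
        return 404, "not found"
    return 200, FORMATS[fmt](handler())
```

The point of the split is that adding an output format touches only `FORMATS`, and adding a page touches only the registered handlers.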
A good pattern to start with is the Model View Controller pattern, or MVC.
This pattern suggests separating your application's logic into the following layers: data logic (model), manipulation or business logic (controller) and display logic (view).
This is a good pattern to start with as you'll run into other problems (and thus patterns) along the way.
The following website explains the MVC concept quite well: MVC Principles
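For a feel of the three layers, here is a deliberately tiny sketch in Python. The class and function names are made up for the example and the "database" is just a dictionary; a real framework adds routing, templating and much more.

```python
from html import escape


class PostModel:
    """Model: owns the data and how it is fetched."""

    def __init__(self):
        self._posts = {1: {"title": "First post", "body": "Hello"}}

    def find(self, post_id):
        return self._posts.get(post_id)


def post_view(post):
    """View: turns data into HTML and does nothing else."""
    return "<h1>%s</h1><p>%s</p>" % (escape(post["title"]), escape(post["body"]))


def not_found_view():
    return "<h1>Not found</h1>"


class PostController:
    """Controller: decides which model call and which view a request needs."""

    def __init__(self, model):
        self.model = model

    def show(self, post_id):
        post = self.model.find(post_id)
        if post is None:
            return 404, not_found_view()
        return 200, post_view(post)
```

Swapping the dictionary for a real database changes only `PostModel`; changing the markup changes only the view functions.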
There is no point reinventing the wheel unless you are trying to improve it in some way.
There are a lot of CMSs available already. I have personally worked with ezpublish. There are other options such as drupal etc. This is the list of all open source CMSs available - Click here
If you are just trying to learn, you could pick any one of the popular open-source ones and work on it to understand its architecture and design.
Besides, I don't think anyone can give you a list of design patterns that would be best for a CMS tool, because each design pattern solves some particular problem. You just have to choose a design pattern depending on the specific problem you want to solve in your project.
These days, writing your own CMS is a horrible waste of time. The usual open source solutions -- these days Joomla, WordPress and Drupal are popular -- are written by thousands of people, and while you might lose a little flexibility by using one that's ready made, this is by far offset by not needing to redo everything from scratch. If you go with Drupal, you can also enjoy high quality, massively scalable etc code :)
If your requirement is only a portfolio, some general content pages and a custom blog, WordPress will be simpler and better.
There are many CMSs available in PHP; the most popular one is Joomla.

Recommended structure for high traffic website [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm rewriting a big website, that needs very solid architecture, here are my few questions, and pardon me for mixing apples and oranges and probably kiwi too:) I did a lot of research and ended up totally confused.
Main question: Which approach would you take in building a big website expected to grow in every way?
1. Single entry point, pages data in the database, pulled by associating GET variable with database entry (?pageid=whatever)
2. Single entry point, pages data in separate files, included based on GET variable (?pageid=whatever would include whatever.php)
3. MVC (Alright guys, I'm all for it, but can't grasp the concept besides checking all tutorials and frameworks out there, do they store "view" in database? Seems to me from examples that if you have 1000 pages of same kind they can be shaped by 1 model, but I'll still need to have 1000 "views" files?)
4. PAC - this sounds even more logical to me, but didn't find much resources - if this is a good way to go, can you recommend any books or links?
5. DAL/DAO/DDD - i learned about these terms by diligently reading through stack overflow before posting question. Not sure if it belongs to this list
6. Sit down and create my own architecture (likely to do if nobody enlightens me here:)
7. Something not mentioned...
Thanks.
Scalability/availability (iow. high-traffic) for websites is best addressed by none of the items you mention. Especially points 1 and 2; storing the page definitions in a database is an absolute no-no. MVC and other similar patterns are more for code clarity and maintenance, not for scalability.
An important piece of missing information is what kind of concurrent hits/sec are you expecting? Sometimes, people who haven't built high-traffic websites are surprised at the hit rates that actually constitute a "scalability nightmare".
There are books on how to design scalable architectures, so an SO post will not be able to do the topic justice, but some very top-level concepts, in no particular order, are:
Scalability is best handled first by looking at hardware-based solutions. A beefy server with an array of SSD disks can go a long way.
Make static anything that can be static. Serve as much as you can from the web server, not the DB. For example, a lot of pages on websites dynamically generate lists from data stores that very rarely or never really change.
Cache output that changes infrequently, and tune the cache refresh.
Build dynamic pages to be stateless or asynchronous. Look into CQRS and Event Sourcing for patterns that favor/facilitate scaling.
Tune your queries. The DB is usually the big bottleneck since it is a shared resource. Lots of web app builders use ORMs that create poor queries.
Tune your database engine. Backups, replication, sweeping, logging, all of these require just a little bit of resource from your engine. Tuning it can lead to a faster DB that buys you time from a scale-out.
Reduce the number of HTTP requests from clients. Each HTTP connect has overhead. Check your pages and see if you can increase the payload in each request so as to reduce the overall number of individual requests.
At this point, you've optimized the behavior on one server, and you have to "scale out". Now, things get very complicated very fast. Load-balancing scenarios of various types (sharding, DNS-driven, dumb balancing, etc), separating read data from write data on different DBs, going to a virtualization solution like Google Apps, offload static content to a big CDN service, use a language like Erlang or Scala and parallelize your app, etc...
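As one concrete case of "tune your queries" from the list above: a common source of poor ORM-generated SQL is the so-called N+1 pattern, where a list page runs one query for the list and then one more per row. The sketch below uses Python's built-in sqlite3 and an invented two-table schema purely to show the difference in query counts.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with a made-up schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO posts VALUES (1, 1, 'A'), (2, 2, 'B'), (3, 1, 'C');
    """)
    return db


def titles_n_plus_one(db):
    """One query for the posts, then one more per post: 1 + N round trips."""
    queries = 1
    rows = db.execute("SELECT author_id, title FROM posts ORDER BY id").fetchall()
    result = []
    for author_id, title in rows:
        name = db.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()[0]
        queries += 1
        result.append((title, name))
    return result, queries


def titles_joined(db):
    """The same data in a single round trip using a JOIN."""
    rows = db.execute(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()
    return rows, 1
```

Both functions return the same rows; with 3 posts the first needs 4 queries and the second needs 1, and the gap grows with every row on the page.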
Single entry point, pages data in the database, pulled by associating GET variable with database entry (?pageid=whatever)
Potential nightmare for maintenance. And also for development if you have team of more than 2-3 people. You would need to create a set of strict rules for everyone to adhere to - effort that would be much better spent if using MVC. Same goes for 2.
MVC (Alright guys, I'm all for it, but can't grasp the concept besides checking all tutorials and frameworks out there, do they store "view" in database? Seems to me from examples that if you have 1000 pages of same kind they can be shaped by 1 model, but I'll still need to have 1000 "views" files?)
It depends on how many page layouts there are. Most MVC frameworks allow you to work with structured views (i.e. main page views, sub-views). Think of a view as an HTML template for the web page. However many templates and sub-templates you need is exactly how many views you'll have. I believe most websites can get away with up to 50 main views and up to 100 subviews - but those are very large sites. Looking at some sites I run, it's more like 50 views in total.
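To make the "1000 pages, how many view files?" point concrete, here is a small sketch using Python's standard `string.Template`. The template names and the page data are invented for the example; the idea is only that one layout plus a couple of sub-views can render any number of pages of the same kind.

```python
from html import escape
from string import Template

# One main view (the layout) and two sub-views, shared by every page.
LAYOUT = Template("<html><body>$nav<main>$content</main></body></html>")
NAV = Template("<nav>$links</nav>")
ARTICLE = Template("<h1>$title</h1><p>$body</p>")


def render_page(page, menu):
    """Fill the sub-views with one page's data, then place them in the layout."""
    links = "".join(
        '<a href="%s">%s</a>' % (escape(url), escape(label)) for label, url in menu
    )
    content = ARTICLE.substitute(title=escape(page["title"]), body=escape(page["body"]))
    return LAYOUT.substitute(nav=NAV.substitute(links=links), content=content)


# 1000 pages of the same kind: the data differs, the three templates do not.
menu = [("Home", "/"), ("About", "/about")]
pages = [{"title": "Page %d" % n, "body": "Text of page %d" % n} for n in range(1000)]
rendered = [render_page(page, menu) for page in pages]
```

The page content lives in data (a database or files), while the number of views follows the number of distinct layouts, not the number of pages.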
DAL/DAO/DDD - i learned about these terms by diligently reading through stack overflow before posting question. Not sure if it belongs to this list
It does. DDD is great if you need meta-views or meta-models. Say, if all your models are quite similar in structure, but differ only in database tables used and your views almost map 1:1 to models. In that case, it is a good time for DDD. A good example is some ERP software where you don't need a separate design for all the database tables, you can use some uniform way to do all the CRUD operations. In this case you could probably get away with one model and a couple of views - all generated dynamically at run-time using meta-model that maps database columns, types and rules to logic of programming language. But, please note that it does take some time and effort to build a quality DDD engine so that your application doesn't look like hacked-up MS Access program.
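A rough sketch of the "one model driven by a meta-model" idea described above, using Python and sqlite3. The `META` dictionary, the table names and `GenericModel` are all hypothetical; a production-quality engine would also carry types, validation rules and view generation.

```python
import sqlite3

# Hypothetical meta-model: table name -> columns the generic code may touch.
META = {
    "employees": ["name", "department"],
    "products": ["sku", "price"],
}


def make_db():
    """Create an in-memory database matching the meta-model above."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT);
        CREATE TABLE products (id INTEGER PRIMARY KEY, sku TEXT, price REAL);
    """)
    return db


class GenericModel:
    """One model class for every table described in META."""

    def __init__(self, db, table):
        if table not in META:
            raise ValueError("unknown table: %s" % table)
        self.db, self.table, self.columns = db, table, META[table]

    def create(self, **values):
        unknown = set(values) - set(self.columns)
        if unknown:
            raise ValueError("unknown columns: %s" % sorted(unknown))
        cols = list(values)
        # Identifiers come only from META; the values are bound as parameters.
        sql = "INSERT INTO %s (%s) VALUES (%s)" % (
            self.table, ", ".join(cols), ", ".join("?" for _ in cols))
        return self.db.execute(sql, [values[c] for c in cols]).lastrowid

    def read(self, row_id):
        sql = "SELECT %s FROM %s WHERE id = ?" % (", ".join(self.columns), self.table)
        row = self.db.execute(sql, (row_id,)).fetchone()
        return dict(zip(self.columns, row)) if row else None
```

Adding a new table to the application then means adding one entry to the meta-model rather than writing another model class.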
Sit down and create my own architecture (likely to do if nobody enlightens me here:)
If you're building a public-facing website, you're most likely going to do it well with MVC. A very good starting point is to look at the CodeIgniter video tutorials. They helped me understand what MVC really is and how to use it way better than any HOWTO or manual I read. And they only take 29 minutes altogether:
http://codeigniter.com/tutorials/
Enjoy.
I'm a fan of MVC because I've found it easier to scale your team when everything has a place and is nice and compartmentalized. It takes some getting used to, but the easiest way to get a handle on it is to dive in.
That said, definitely check your local library to see if they have the O'Reilly book on scaling: http://oreilly.com/catalog/9780596102357 which is a good place to start.
If you're creating a "big" website and don't fully grasp MVC or a web framework then a CMS might be a better route since you can expand it with plugins as you see fit. With this route you can worry more about the content and page structure rather than the platform. As long as you pick the appropriate CMS.
I would suggest to create a mock app with some of the web mvc frameworks in the wild and pick one, with which your development was smooth enough. Establishing your code on a solid basis is fundamental, if you want to grasp concepts of mvc and be ready to add new functionality to your web easily.

Spaghetti php code performance and scalability compared to mvc/oop?

I have a php application that has about 50-55 code files. The file that has the maximum amount of code has about 1200 lines of code(this includes the spaces, tabs and multiple line breaks...) and rest of the code files are relatively smaller than this.
The application code in almost every file is a mix of html, sql and php(what you call spaghetti), except in a few files that are pure php include files....for example a file containing functions that are needed in many other places.
I have been considering whether it's a good idea to refactor this application to a mvc type architecture.
Now i know that a mvc application offers plenty of advantages like ease of maintenance, reuse and ease of further development etc but what about scalability and performance - specifically in this case ?
My assumption is that since this is a small application(i believe so, do you think it's small enough?), i don't envision having a hard time with maintenance or adding a few more features(at the max) which just may mean a few additions in existing files or maybe say addition of a maximum 5-10 new files.
So i am thinking i shouldn't be converting to mvc just for maintenance sake.
As far as i understand you may put each component of mvc on a separate server to spread the load, so as to have different servers serving html, database and logic, and do further optimization/caching as well to make an mvc application scale and perform.
I think even though in a small spaghetti application we cannot have different servers for html, database etc, we can easily scale without degrading the performance by having a load balancer in front of the web servers, database server etc. (supposing it comes to a situation where one server is not enough).
Even more so, on its own the spaghetti code should perform better than mvc, since it doesn't have any overheads like requiring includes or files or function calls from files placed under folders belonging to a different component of the mvc.
So, considering all these things do you really think it's useful to refactor a relatively small spaghetti application to mvc for scalability and performance?
Please don't tell me that refactoring will be useful in future (i know that will help and will consider if we really need to add much more code to the existing code base) but please provide me with a clear answer to
1)Do i really need to convert this application to a mvc architecture for scalability and performance ?
2)Can a spaghetti application like this scale and perform for at least 1 million requests a day, half of which occur during some peak time?
As far as i understand you may put each component of mvc on a separate server to spread the load
I've never heard that myself - but I come from a .Net world where you'd run all your managed code on the same server anyway (it's not like in the Java world where you often have a separate "App" server and "Web" server).
The main reason you'd probably move to MVC (just as you mentioned) is for benefits in managing the code: separation of concerns, re-use, etc; not performance.
In theory you could do this with object / component based technologies like Java or .Net where the components communicate with each other - but in procedural code? i don't think so!
So, considering all these things do you really think it's useful to refactor a relatively small spaghetti application to mvc for scalability and performance?
No - assuming by scalability and performance you're referring to runtime qualities of the system, which I believe is what you meant.
If you put scalability and performance purely into the context of development (people coding - how fast and easily they can work on the system, how easily you can add developers to the project) then the answer would be yes.
2)Can a spaghetti application like this scale and perform for at least 1 million requests a day, half of which occur during some peak time?
Nothing's impossible - I love Gordon's comments along those lines - but as I'm sure you'd agree, it's probably not the best footing you could be on.
No, you don't have to convert because with an infinite budget any application will scale infinitely. Just add more servers. The question is, do you have an infinite budget? If you don't have an infinite budget, find out what is cheaper: add more hardware or optimize code.
So the real answer is: maybe.
We cannot tell you what it takes for your application to reach your scalability goals. We don't know what it does, nor do you provide hard limits for the desired performance. For instance, how long may a request take until it is served? Run ab or Siege and measure your Status Quo. Run a profiler on your code and identify bottlenecks. Find out whether you are IO, CPU or RAM bound. Are you using an Opcode cache? Take all your findings and make an educated guess about cost. Then decide how to optimize.
There will be a point where the effort required to shed some microseconds is higher than simply adding better or more hardware. In practice, you will likely use a mixed strategy to find the most scalable and affordable solution for your needs.
Note that refactoring a big ball of mud into a pristine OOP application does not necessarily mean it will run faster afterwards. In fact, the more loosely coupled, the more indirection, the more layers, the slower the application will likely become. It's a tradeoff for better maintainability. Maintainability is a cost saver too. It will cut down on your time to deliver new features.
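As a stand-in for a real profiler, the "find out where the time goes" step can be sketched with a few lines of timing code. This is only an illustration in Python; `StageTimer` and the stage names are invented, and for PHP you would reach for an actual profiler as suggested above.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage of a request."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock                  # injectable for testing
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = self.clock()
        try:
            yield
        finally:
            self.totals[name] += self.clock() - start

    def slowest(self):
        """Name of the stage that has consumed the most time so far."""
        return max(self.totals, key=self.totals.get)


# Usage: wrap the parts of a request you suspect, then look at the totals.
timer = StageTimer()
with timer.stage("database"):
    sum(range(100000))      # placeholder for running the queries
with timer.stage("render"):
    "x" * 1000              # placeholder for building the HTML
```

Whichever stage dominates tells you whether tuning queries, adding a cache or buying hardware is the cheaper next step.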
1) Yes
2) No
50+ files is not small. Spaghetti code is unmaintainable and highly inefficient, so there is no performance advantage over a proper framework. Finally, a good framework offers well-designed, well-tested, and constantly updated plugins to achieve most common tasks, reducing the amount of code that you have to write and maintain yourself.
Back in 1995 high-traffic sites were running tangled messes of spaghetti-code. Today, you shouldn't even think about running a high-traffic site (or any kind of site!) on spaghetti code.

ORM or SQL in large, scalable and MAINTAINABLE web application?

I know there already are a lot of posts floating on the web regarding this topic.
However, many people tend to focus on different things when talking about it. My main goal is to create a scalable web application that is easy to maintain. I value speed of development and maintenance far more than raw performance (or I could have used Java instead).
This is because I have noticed that when a project grows in code size, you must have maintainable code. When I first wrote my application in the procedural way, without any framework, it became a nightmare after only one month. I was totally lost in a jungle of spaghetti code. I didn't have any structure at all, even though I fought hard to implement one.
Then I realized that I have to have structure and code the right way. I started to use CodeIgniter. That really gave me structure and maintainable code. A lot of users say that frameworks are slowing things down, but I think they missed the picture. The code must be maintainable and easy to understand.
Framework + OOP + MVC made my web application so structured that adding features was not a problem anymore.
When I create a model, I tend to think of it as representing a data object. Maybe a form or even a table/database. So I thought about an ORM (Doctrine). Maybe this would be yet another great addition to my web application, giving it more structure so I could focus on the features and not repeat myself.
However, I have never used any ORM before and I have only learned the basics of it, why it's good to use and so on.
So now I'm asking all of you who, like me, strive for maintainable code and know how important it is: is an ORM (Doctrine) a must-have for maintainable code, just like framework + MVC + OOP?
I want advice from real-life experience rather than "raw SQL is faster" advice, because if I cared only about raw performance, I should have dropped framework + MVC + OOP in the first place and kept living in a coding nightmare.
It feels like it fits so well into an MVC framework where the models are the tables.
Right now I've got about 150 SQL queries in one file doing simple things like getting an entry by id, getting an entry by name, getting an entry by email, getting an entry by X and so on. I thought that an ORM could reduce these lines; otherwise I'm pretty sure this will grow to 1000 lines of SQL in the future. And if I change one column, I have to change all of them! What a nightmare again, just thinking about it. And maybe this could also give me nice models that fit the MVC pattern.
Is ORM the right way to go for structure and maintainable code?
Ajsie,
My vote is for an ORM. I use NHibernate. It's not perfect and there is a sizable learning curve. But the code is much more maintainable, much more OOP. It's almost impossible to create an application using OOP without an ORM unless you like a lot of duplicate code. It will eliminate the vast majority of your SQL code.
And here's the other thing. If you're going to build an OOP system, you'll end up writing your own O/R mapper anyway. You'll need to call dynamic SQL or stored procs, get the data as a reader or dataset, convert that to an object, wire up relationships to other objects, turn object modifications into SQL inserts/updates, etc. What you write will be slower and more buggy than NHibernate or something that's been on the market for a long while.
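To make the "you'll end up writing your own O/R mapper" point concrete, here is a minimal sketch in Python with the standard sqlite3 module. The Entry class, the EntryMapper class and the entries table are all hypothetical names chosen for illustration; the sketch covers only the query-to-object and object-to-update steps, not relationships or caching.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical domain object backed by one row of 'entries'."""
    id: int
    name: str
    email: str


class EntryMapper:
    """Hypothetical hand-rolled mapper for a single table."""

    COLUMNS = ("id", "name", "email")

    def __init__(self, conn):
        self.conn = conn

    def find_by(self, column, value):
        # One generic finder replaces get-by-id, get-by-name, get-by-email.
        # Column names cannot be bound as parameters, so whitelist them.
        if column not in self.COLUMNS:
            raise ValueError(f"unknown column: {column}")
        sql = f"SELECT id, name, email FROM entries WHERE {column} = ?"
        rows = self.conn.execute(sql, (value,)).fetchall()
        return [Entry(*row) for row in rows]

    def save(self, entry):
        # Turn an object modification back into an UPDATE statement.
        self.conn.execute(
            "UPDATE entries SET name = ?, email = ? WHERE id = ?",
            (entry.name, entry.email, entry.id),
        )


def demo():
    """Load an entry by email, rename it, save it, and reload it by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("INSERT INTO entries VALUES (1, 'ann', 'ann@example.com')")
    mapper = EntryMapper(conn)
    entry = mapper.find_by("email", "ann@example.com")[0]
    entry.name = "anna"
    mapper.save(entry)
    return mapper.find_by("id", 1)[0]
```

Even this toy version has to whitelist column names and keep the column list in sync with the table by hand, which is the kind of work a mature ORM already does for you.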
Your only other choice really is to build a very data-centric, procedural application. Yes, it may perform faster in some areas. I agree that performance IS important. But what matters is that it's FAST ENOUGH. If you save a few milliseconds here and there doing procedural code, your users will not notice the performance increase. But you'll notice the crappy code.
The biggest performance bottlenecks in an ORM come from how you pre-fetch and lazy-load objects. This gets into the n-query (N+1) problem with ORMs. However, it is usually straightforward to solve. You just have to performance-tune your object queries, limit the number of calls to the database, tell the ORM when to use joins, etc. NHibernate also supports a rich caching mechanism, so at times you don't hit the database at all.
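The n-query problem mentioned above can be sketched without any ORM at all. The following Python example uses the standard sqlite3 module; the authors and posts tables and the CountingDB wrapper are hypothetical, invented only to count round trips. It compares lazy loading (one query per parent row) with a single join.

```python
import sqlite3


def make_db():
    """Build a small in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
        """
    )
    return conn


class CountingDB:
    """Thin wrapper that counts how many queries are sent."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def posts_lazy(db):
    """Lazy loading: 1 query for the authors, then 1 more per author."""
    result = {}
    for author_id, name in db.query("SELECT id, name FROM authors ORDER BY id"):
        rows = db.query(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def posts_joined(db):
    """Eager loading: a single LEFT JOIN fetches authors and posts together."""
    result = {}
    rows = db.query(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors without posts yield a NULL title
            result[name].append(title)
    return result
```

Both functions return the same mapping of author name to post titles, but with three authors the lazy version issues four queries and the joined version issues one. An ORM's fetch settings are, in effect, a way of choosing between these two shapes.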
I also disagree with those that say performance is about users and maintenance is about coders. If your code is not easily maintained, it will be buggy and slow to add features. Your users will care about that.
I won't say every application should have an ORM, but I think most will benefit. Also, don't be afraid to use native SQL or stored procedures with an ORM every now and then where necessary. If you have to do batch updates to millions of records or write a very complex report (hopefully against a separate, denormalized reporting database), then straight SQL is the way to go. Use ORMs for the OOP, transactional, business logic and CRUD stuff, and use SQL for the exceptions and edge cases.
I'd recommend reading Jeffrey Palermo's stuff on NHibernate and Onion Architecture. Also, take his agile boot camp or other classes to learn O/R mapping, NHibernate and OOP. That's what we use: NHibernate, MVC, TDD, Dependency Injection.
A lot of users say that frameworks are slowing things down, but I think they missed the big picture. The code MUST BE MAINTAINABLE and EASY TO UNDERSTAND.
A well-structured, highly-maintainable system is worthless if its performance is Teh Suck!
Maintainability is something which benefits the coders who construct an application. Raw performance benefits the real people who use the app for their work (or whatever). So, whose concerns ought to be paramount: those who build the system or those who pay for it?
I know it's not as simple as that, because the customer will eventually pay for a poorly structured system - perhaps more bugs, certainly more time to fix them, more time to implement enhancements to the application. As is usually the case, everything is a trade-off.
I started developing like you, without ORM tools.
Then I worked for companies where software development was more industrialized, and they all used some kind of ORM tool (with more or fewer features). Development is far easier and faster, produces more maintainable code, etc.
But I've also seen the drawback of these tools: very slow performance. That was mostly due to misuse of the tool (Hibernate in that case).
ORM tools are very complex, so it is easy to misuse them, but if you have experience with them, you should be able to get nearly the same performance as with raw SQL. I have three pieces of advice for you:
If performance is not critical, use an ORM tool (choose a good one; I am not developing with PHP, so I can't give you a name).
For each feature you add, be sure to check the SQL that the ORM tool produces and sends to the database (using a logging facility, for example). Ask yourself whether it is the way you would have written your queries. Most of the inefficiencies of ORM tools come from unwanted data being fetched from the DB, a single request being split into several, etc. Slowness rarely comes from the tool itself.
Do not use the tool for everything. Choose wisely when not to use it (you reduce maintainability each time you do raw DB access), but sometimes it just isn't worth trying to make the ORM tool do something it was not developed for.
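The second piece of advice, checking the SQL that actually reaches the database, can be sketched as follows. This is a Python example using the standard sqlite3 module and its set_trace_callback hook as a stand-in for whatever logging facility a given ORM provides; the users table and the load_names function are hypothetical.

```python
import sqlite3


def run_with_sql_log(work):
    """Run work(conn) and return (result, list of SQL statements executed).

    The database and its 'users' table are hypothetical and created
    here only so the example is self-contained.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'ann', 'bio text')")
    conn.commit()

    log = []
    # sqlite3 calls this hook with the text of each statement it runs.
    conn.set_trace_callback(log.append)
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)
    return result, log


def load_names(conn):
    """Stand-in for code that goes through a data-access layer."""
    return [row[0] for row in conn.execute("SELECT name FROM users")]
```

Running run_with_sql_log(load_names) returns the names together with the statements that were executed, so you can compare them with the query you would have written by hand, for example to spot a layer that selects every column when only the name is needed.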
Edit:
ORM tools are most useful with a very complex model: many relationships between entities. This is most often encountered in the configuration part of an application, or in its complex business part.
So they are less useful if you have only a few entities and there is little chance they will change (be refactored).
The boundary between few entities and many is not clear. I would say more than 50 different types (SQL tables, not counting join tables) is many, and fewer than 10 is few.
I don't know what was used to build stackoverflow but it must have been very carefully performance tested before.
If you want to build a web site that will get such a heavy load, and you don't have experience with that, try to get someone on your team who has already worked on such sites (performance testing with a real set of data and a representative number of concurrent users is not an easy or fast task to implement). Having someone who has experience with it will greatly speed up the process.
It's very important to have high maintainability. I've developed large-scale web applications with low-level, super-high performance. The big disadvantage was maintaining the system, that is, developing new features. If you're too slow at developing, the customers will look for other systems/applications. It's a trade-off. Most ORMs have features for when you need to write optimized queries directly in SQL. The ORM itself isn't the bottleneck. I'd say it's more about good DB design.
I think you missed the picture. Performance is an everyday matter for your users; they care not at all about maintainability. You are being ethnocentric: you are concerned only with your personal concerns and not those of the people who pay for the system. It isn't all about your convenience.
Perhaps you should sit down with the users and watch them use your system for a day or two. Then you should sit down at a PC that is the same power as the ones they use (not a dev machine) and spend an entire week doing nothing but using your system all day long. Then you might understand their point.
