This question stems from watching Rasmus Lerdorf's talk from Drupalcon. This question and his talk have nothing specifically to do with Drupal, by the way... it was just given at their con. My own question also has nothing specific to do with PHP. It is the single entry point in general that I am curious about.
These days it seems that most frameworks offer a single entry point for whatever you build with them. In his talk Rasmus mentions that he thinks this is bad. It seems to me that he would be correct in this thinking. If everyone hitting the site is coming in through the same entry point wouldn't things bog down after traffic reached a certain point? Wouldn't it be more efficient to allow people direct access to specific points in a site without having their request go through the same point? But perhaps the actual impact is not very bad? Maybe modern architecture can handle it? Maybe you have to be truly gigantic in scale before it becomes even worth considering? I'm curious as to what people on this site think about this issue.
In short, Rasmus or the interpretation is wrong.
This shows a clear lack of understanding of how computers work. The more often something is used, the more likely it is to sit in a cache close to the CPU, and therefore the faster it is. Mind you, a single point of entry != a single point of failure. But that's all beside the point: when people say single point of entry, we're talking about the app. It is a single point of entry for your logic.
Not to mention it's architecturally brain-dead not to have a central point of entry, or at least to reduce the number of entry points in general. As soon as you want to do one thing across your app at every entry point, guess how many places need to change? Having dealt with an app in which each page stood on its own, I can tell you it sucked to make that kind of change, and I assure you, we needed to.
The important thing is that you use a web framework that supports scalability through methods like load-balancing and declarative code.
No, a single-entry point does not in itself make a bottleneck. The front page of Google gets a lot of hits, but they have lots of servers.
So the answer is: It doesn't matter.
Like anything in software development, it depends. Rasmus's objection to the front-controller style frameworks is the performance hit you take from having to load so much code up-front on each request. This is 100% true. Even if you're using a smart resource-loading module/object/etc. of some kind, using a framework is a performance trade-off. You take the performance hit, but in return you get back:
Encouraged separation of "business logic" (whatever that is) and template/layout logic
Instant and (more importantly) unified access to the objects you'll use for database queries, service calls, your data model, etc.
To a guy like Rasmus, this isn't worth the performance hit. He's a C/C++ programmer. For him, if you want to separate out business logic in a highly performant way, you write a C/C++ Extension to PHP.
So if you have an environment and team where you can easily write C/C++ extensions to PHP and your time-to-market vs. performance ratio is acceptable, then yes, throw away your front-controller framework.
If that's not your environment, consider the productivity increases a front-controller framework can bring to your (likely) simple CRUD Application.
I think one of the biggest advantages of having only a single point of entry is security. Input is less likely to corrupt the system if all of it is checked and validated in a single place.
I think it's a big misunderstanding discussing this from the point of "one file" vs. "several files".
One tends to think that because the entry point is in a single file, then all we have to focus on is the code in that one file - that's wrong.
All of the popular frameworks have tons of files with entry manipulation code, interpretation code, and validation code for requests. The code is not located in one place; rather, it is spread around in a jungle of require/include statements calling different classes depending on what's being requested and how.
In both cases the request is really handled by different files.
Why then should I have a single entry point with some kind of _detect_uri() function that needs to call several other functions like strpos(), substr(), strncmp() to clean up, validate, and split up the request string when I can just use several entry points, eliminating that code altogether?
Take a look at CodeIgniter's _detect_uri() function in URI.php. Not to pick on CodeIgniter, it's just an example. The other frameworks do likewise.
You can achieve the goals of an MVC pattern with a single entry point as well as with several entry points.
This is what I thought at first, but it doesn't seem to have much impact. After all, your entry point is (usually) only doing a couple of things: setting some environment constants, including the bootstrap loader, optionally catching any exceptions thrown and dispatching the front controller. I think the reason this is not inefficient is that this file does not change depending on the controller, action or even user.
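As a rough illustration of how little such an entry point usually does, here is a minimal sketch of the steps listed above. The constant APP_ENV, the functions dispatch() and run(), and the two example paths are hypothetical names made up for this sketch; they are not taken from any particular framework.

```php
<?php
// Hypothetical index.php: every name below is illustrative only.

// 1. Set environment constants.
define('APP_ENV', 'development');

// 2. A real entry point would include the bootstrap loader here, e.g.
//    require __DIR__ . '/bootstrap.php';  (assumed file, not shown)

// Stand-in for the framework's front controller: maps a path to a handler.
function dispatch(string $path): string
{
    $handlers = [
        '/'      => function (): string { return 'home page'; },
        '/about' => function (): string { return 'about page'; },
    ];
    if (!isset($handlers[$path])) {
        throw new RuntimeException("No handler for $path");
    }
    return $handlers[$path]();
}

// 3. Dispatch, catching anything the application throws.
function run(string $path): string
{
    try {
        return dispatch($path);
    } catch (Throwable $e) {
        // Show details only in development.
        return APP_ENV === 'development'
            ? 'Error: ' . $e->getMessage()
            : 'Something went wrong';
    }
}

echo run('/about'), "\n";
```

Nothing in run() depends on which page was requested, which is the point made above: the entry file itself stays the same for every controller, action and user.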
I do feel this is odd, however. I'm building a small MVC framework myself at the moment, but it's slightly the reverse of most frameworks I've used. I put controller logic in the accessed file. For example, index.php would contain the IndexController and its actions. This approach is working well for me at least.
As most of the PHP MVC frameworks use some sort of URL rewriting, or at least parse anything after index.php on their own, a single entry point is needed.
Besides that, I like to provide entry points per context, say web (or SOAP), console, and so on.
Just to add, the thing people usually think is that, since there is one php page, it is a single page serving all requests. Sort of like queuing.
The important thing to note is that, each request creates an instance of the script and thus, the load is the same as if two different pages were being accessed at the same time. So, still the same load. Virtually.
However, some frameworks might have a whole lot of stuff you don't need going on in the entry script. Sort of like a catchall to satisfy all possible scenarios, and that might cause load.
Personally, I prefer multiple pages, the same way I mix OOP and procedural code. I like PHP the old-school way.
There are definitely disadvantages to using a front-controller file architecture. As Saem explains, accessing the same file is generally advantageous to performance, but as Alan Storm and Rasmus Lerdorf explain, trying to make code handle numerous situations (such as a front-controller tries to handle the request to every "page" on the site) leads to a bloated and complicated collection of code. Using a bloated and complicated collection of code for every page load can be considered bad practice.
Saem is arguing that a front-controller can save you from having to edit code in multiple locations, but I think that in some cases, that is better solved by separating out redundancies.
Having a directory structure that is actually used by the web server can make things simpler for the web server and more obvious for the developers. It is simpler for a web server to serve files by translating the url into the location as they would traditionally represent than to rewrite them into a new URL. A non-front-controller file architecture can be more obvious to the developer because instead of starting with a controller that is bloated to handle everything, they start with a file/controller that is more specific to what they need to develop.
Related
It has recently been highlighted (in my previous questions) that the way I have designed web applications is not ideal.
Consider the following. I am working on a multi-user website with lots of different sections including profiles and forums and support tickets. The structure is as follows:
A main page, which we'll call home.php, into which all the other pages are included or pulled in with require_once.
In home.php, one of the first things loaded is router.php, this handles every single $_GET and $_POST that the user could possibly produce, and every form and process is sorted via a main variable called $data_process. Router.php is essentially just one giant switch() statement for $data_process. This parses all the data and gives a result.
Next included is header.php, which will not only process the necessary variables for the page that will be loaded but also set up the header and decide exactly what is going to be shown there, e.g. menu, user info, and information about the page currently being viewed (i.e. Home > Support > View Ticket).
Then the page is loaded according to $page variable. A simple include.
Then footer.php, then close.
And so the dynamic website is created. I was told this is bad practice by a user named #HorusKol. I am very pleased with this website as it is the most streamlined and easy-to-write website I have ever worked on. If this is still bad code design, what is perfect code design?
PS - can anyone recommend any good easy to read books that explain PHP, MySQL and design structure for me?
It is bad design because you process a lot of data that is perhaps not necessary in the rest of the process. The router should only process the URL; processing of POST data should be handled somewhere else. Only include what you need, because including everything makes things slow.
A better way is to structure your app in more distinct parts: a router that processes the URL, a controller that runs an action based on a routed request, a view that handles all HTML and pages, and a model for accessing data. MVC is what comes to mind.
There is no such thing as the perfect code design.
There's no canonical definition of "good design" - the best you can hope for is that your design balances the various forces on your project in the optimum way. Forces on your project might be maintainability, performance, scalability, extensibility - classic non-functional requirements - but also things like search engine optimization, standards compliance and accessibility (things that apply to web projects in particular).
If all your URLs are of the form "www.mysite.com/home.php?action=getDetails&productID=123", your search engine friendliness is pretty low. It's far better to have semantic URLs - "www.mysite.com/products/DesktopPc/details.php". You can achieve this through cunning .htaccess trickery in your current design.
From a maintainability point of view, your design has some issues. If I've understood it correctly, adding a new page to the site requires you to modify the code in several different source files - router.php (your giant switch statement), the page itself, and probably the header.php as well. That indicates that the code is tightly coupled. Modifying the giant switch statement sounds like a likely source of entertaining bugs, and the combination of the router and the header, manipulating the variables, plus the actual page itself seems a little fragile. This is okay if you're the only person working on the project, and you're going to be around for the long term; if that's not the case, it's better to use an off-the-shelf framework (MVC is the current favourite; Zend, Symfony and Cake all do this well in PHP) because you can point new developers at the documentation and expect them to get up to speed.
One of the biggest enemies of maintainability is complexity - complex code is harder to work with, and harbours more bugs. There's a formal metric for complexity, and I'm pretty sure your switch statement scores very highly on that metric - in itself not necessarily a huge problem, but definitely something to keep an eye on. Lots of MVC frameworks avoid this by having the routing defined as data rather than code (i.e. have the routes in a configuration file), and/or by using convention over configuration (i.e. if the request is for page "productDetails", include the file "/inc/productDetails.inc").
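To make the "routing as data" idea a bit more concrete, here is a small sketch. The function resolve_page(), the $routes array and the file paths in it are hypothetical; only the "/inc/productDetails.inc" convention comes from the paragraph above.

```php
<?php
// Routes declared as data: adding a page means adding one array entry,
// not another case in a switch statement. All paths are hypothetical.
$routes = [
    'home'       => 'pages/home.php',
    'viewTicket' => 'pages/support/view_ticket.php',
];

/**
 * Resolve a page name to a file path.
 * Explicit routes win; otherwise fall back to a naming convention.
 */
function resolve_page(string $page, array $routes): ?string
{
    if (isset($routes[$page])) {
        return $routes[$page];
    }
    // Convention over configuration: only plain alphanumeric names
    // are mapped to files, so "../" style input never reaches include.
    if (preg_match('/^[A-Za-z][A-Za-z0-9]*$/', $page) === 1) {
        return "/inc/$page.inc";
    }
    return null; // unknown or unsafe page name
}

echo resolve_page('productDetails', $routes), "\n"; // /inc/productDetails.inc
```

With this shape the branching lives in one lookup instead of one case per page, which should keep the complexity of the router roughly constant as pages are added.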
Extensibility could be another concern - imagine having to expose your site content as JSON or XML as well as HTML; in your current design, that would require a lot of change, because every single item in your page processing pipeline cares and needs to know. The home.php needs to know not to send HTML, the header and footer need to know, the router needs to understand how to handle the additional data type, almost certainly making the switch statement even bigger. This again may not be a big deal.
Both extensibility and maintainability are helped by being able to unit test your code. Test Driven Development turns this into a whole routine in its own right; I'm guessing that testing your application is hard - but that's just a guess; it depends more on how you've factored the individual lumps of code than what you've described above. However, another benefit of MVC is that it makes it easy to write unit tests for key parts of your system.
So, if the forces on your project don't include an emphasis on maintainability or extensibility, and you can handle the SEO aspect, I don't think your design is necessarily "bad"; even if you do care about those things, there are other things you can do to accommodate those forces - write documentation, hire lots of cheap coders.
The best way to get up to speed with these design topics is not books on PHP or MySQL; I'd recommend "Refactoring" and "Patterns of Enterprise Application Architecture" by Martin Fowler, "Design Patterns" by Gamma et al. and "Code Complete" by McConnell (though that's a touch out of date by now).
I have been playing about with an Ajax/PHP site and I am wondering about the best practices concerning calls to multiple PHP files vs. a single PHP file with many functions in it.
All of these calls are simple database access calls - returning data from a query. It seems a sensible thing to have a single file that opens the database and contains multiple functions, one for each of the calls, however I do not know the best practices in this instance and I am unaware of any security concerns that there may be.
Does anyone know the best practice in this case?
Cheers
BK
It's a matter of opinion, really. If you haven't got much code, and don't intend to re-use any of it elsewhere, you may as well just have it all in one file. You don't want to fall into the trap of over-complicating what might be a simple setup.
If your code starts to build up and become difficult to follow, you will then be better off splitting it across a number of files based on the sort of task they perform. This is the concept frameworks are based on, where maintaining a more complex application structure is more beneficial to your productivity than fumbling with a monolithic index.php.
And finally: it is vital to take these considerations into account when expecting other people to work with your code.
This is all based on personal preference. Personally, I like to have multiple files as I feel as though it better organizes my code.
However, many people feel as though using only a single file makes locating code more feasible, and thus editing code more efficient.
As far as security is concerned, there is no difference between using one file and using multiple files - just make sure you are cautious when coding and don't leave anything open to injection etc.
This question can be applied to any programming language, but as I am thinking of PHP, I will phrase it accordingly...
I'm wondering if it is considered bad design/architecture if a web application uses action parameters, versus separate files for each action.
For example:
/index.php?action=edit
Versus
/edit.php or /index/edit.php
I know mod_rewrite can translate a pretty URL to a parametrized URL, but I try to avoid unneeded complexity.
Thanks.
Quite often, for big applications (especially with frameworks such as symfony, Zend Framework, ...), we tend to use one entry point: index.php.
That entry point will receive some information (like your action parameter) that will allow it to route the request to the correct controller (or any equivalent you might have).
So, to make things short: no, using action parameters is not bad design / architecture.
Of course, this depends on the kind of application -- but, generally speaking, having a unique entry point is quite a good idea.
Well, I don't think it is a bad design - of course there are other possibilities - but overall it comes down to what you and the other programmers agree on in-house. As much as you can, you should split the PHP and HTML code to make development easier further on.
I prefer the MVC-coding style, which splits the PHP and HTML from each other as much as you "want it to".
Hope this is helpful :)
I would call both your examples at least outdated or short of best practice.
/index.php?action=edit
It doesn't look good, so it is neither user-friendly nor search-engine friendly.
/edit.php
This means that there is indeed a single file for each action, which clearly is bad practice in the 21st century, where we have great MVC frameworks that enable us to get rid of this clutter and keep the concerns separated.
A good URL looks, for example, like this:
mysite.com/user/profile/edit
meaning we're in the user module, the user-profile controller and the edit action.
There is nothing actually bad in either; both can be used all right.
Separate files are considered better for a small application, to avoid unnecessary complexity, as you said.
The action way is considered better for complex applications featuring a single entry point that works as a bootstrap, initializing all the site features first and then calling the appropriate module.
I just have to warn you against using such an action parameter in a silly way, such as doing include $_GET['action'] in the middle of the main 'design' file. It's both insecure and unreliable.
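One common way around that problem is to look the action up in a fixed whitelist instead of passing it to include directly. The sketch below assumes hypothetical action names and file paths, and the helper action_file() is made up for illustration.

```php
<?php
// Never do this: the visitor controls which file gets executed.
// include $_GET['action'];

/**
 * Map a requested action to a file, using a fixed whitelist.
 * The action names and file paths are hypothetical.
 */
function action_file(string $action): string
{
    $allowed = [
        'view' => 'actions/view.php',
        'edit' => 'actions/edit.php',
    ];
    // Anything not listed falls back to a known-safe default.
    return $allowed[$action] ?? 'actions/not_found.php';
}

// In index.php this would be used roughly as:
//   $action = $_GET['action'] ?? 'view';
//   include action_file(is_string($action) ? $action : '');
echo action_file('edit'), "\n";
echo action_file('../../etc/passwd'), "\n";
```

The request value is only ever used as an array key, so it can select one of the listed files but can never name a file itself.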
It depends on your requirements and scale.
Using separate files is ok. However, you will probably find yourself copying code and not properly reusing code and techniques. It's easier to do this with small, ad hoc applications, but could very well turn into spaghetti code or a jungle nest over time.
If you use a single point of entry (class loading with url handlers), you do have to learn how that works (CodeIgniter and ExpressionEngine are good MVC systems to use if you're not real good at it yet), but it's more consistent in coding practice, and scales better than a bunch of separate pages or a switch() statement you pass everything through.
It's not a panacea, either way, but most professional operations use something like an entry point with a class loading system (such as MVC).
About using mod_rewrite: mod_rewrite was not created and should not be used as part of your PHP architecture.
About having one file per logical element: this is a very good and practical way of separating logical units in your app. It is way better than creating huge files with a lot of mixed logic inside, which will become unmaintainable as the app grows. This does not contradict having one point of entry and an MVC architecture.
Having action parameters is the most normal approach for CRUD controllers, for example, where it makes a lot of sense to group actions into a common controller:
// blog controller
-> create blog entry
-> edit
-> view
-> delete
-> list (this is a very common addition to CRUD)
All of these have a common architecture in the sense that almost all of them accept an id and do a related thing.
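The grouping sketched above could look roughly like this in PHP. The class name BlogController, the handle() method and the "Action" method suffix are assumptions for the sake of the example, not a specific framework's API.

```php
<?php
// Hypothetical blog controller: one class groups the related actions.
class BlogController
{
    // Only these action names may be reached from a request.
    private const ACTIONS = ['create', 'edit', 'view', 'delete', 'list'];

    public function handle(string $action, ?int $id = null): string
    {
        if (!in_array($action, self::ACTIONS, true)) {
            return 'unknown action';
        }
        $method = $action . 'Action'; // e.g. "edit" -> editAction()
        return $this->$method($id);
    }

    // Almost every action takes an id and does a related thing with it.
    private function createAction(?int $id): string { return 'new entry form'; }
    private function editAction(?int $id): string   { return "editing entry $id"; }
    private function viewAction(?int $id): string   { return "viewing entry $id"; }
    private function deleteAction(?int $id): string { return "deleted entry $id"; }
    private function listAction(?int $id): string   { return 'all entries'; }
}

$controller = new BlogController();
echo $controller->handle('edit', 7), "\n"; // editing entry 7
```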
If you are talking strictly about GET parameters, then you'll see that a medium or large application where you direct everything from one file, and the only thing changing is the GET parameters, will outgrow you very fast. Computer architecture is like real architecture: try to split things into small, simple (maybe reusable) units.
I'm currently starting to write my own CMS in PHP from the ground up using CakePHP (or should I use something else?) for my bachelor's degree, and I'm thinking about the various things that will need to be done.
One of the things I cannot figure out is whether I should use a single file (for example, index.php will handle everything, and will include everything) or whether I should break up my CMS into a few smaller files.
so my main questions are
is cakePHP a good choice?
use one file for everything or use multiple files?
do you have any good general advice on building more complex websites using php or any best-practices advice (i don't really understand why they don't teach us this in school)
Using a single entry point or multiple entry points becomes a moot point if you are using most frameworks. CakePHP for instance has an index.php file and all you end up doing is defining models, views, and controllers for different parts of your project. I would imagine that most frameworks these days work this way.
Alternatively, if you choose to roll your own framework and system for managing this, which given this is for a bachelor's degree may be (1) a lot of extra work but (2) more revealing and more instructive, I can speak from experience that I found having a single entry point to be useful.
It enables you to have a common code path for set-up stuff: things like enabling E_STRICT, E_NOTICE, etc. for debugging and reliability purposes. Things like sanitizing form inputs to work around the magic-quotes setting. Yes you can do that from an include 'globals.php' but:
Putting everything in one place also lets you come up with a standard file-naming convention and an __autoload handler that will help remove any include or require directives except for perhaps one. Means you can add classes and such without having to also remember to update a master file.
And this is entirely subjective, but I have found that it's easier to create simpler URLs using this. Instead of /volunteers/communities.php?id=Hedrick_Summit I can do /volunteers/communities/Hedrick_Summit which is more pleasing to me.
As for the choice of CakePHP, I have briefly toyed around with that framework. What I don't like about frameworks in general is they often have to be too general, to the point it results in extra cruft and slower page rendering. And the moment you have to do something that pushes the boundaries of the framework, and you will, you end up fighting the framework.
But to be fair, CakePHP seems to be adequate and generally well-designed. I personally took issue with the ORM layer but that was me striving for perfection and actually trying to do work in the SQL query. It has a reputation for being slow, but unless you're trying to build the next Facebook you should be fine.
Using a single file "entry point" gives you more flexibility when it comes to routing requests to various logic - you'll only ever have to worry about filtering one spot in a request chain.
These are really subjective questions.
I, once, wrote a CMS in php from ground up for my 3rd year project.
What I did was basically:
Checking how other people did it (Plume CMS and CMSmadesimple were a good start)
I didn't use any framework (that was a requirement)
and Yes, I used index.php with multiple params to handle different pages.
The answer is yes: use multiple files in multiple directories. It makes all the difference in the world when you need to debug or scale.
I would advise you to keep in mind the MVC (Model-View-Controller) pattern.
It is one of the most commonly used (and often misused) patterns in the CMS field.
Also, don't be afraid of looking at what other people are doing. Read the code of Joomla, Drupal and other open source CMSs. Have a look at languages other than PHP to get a comprehensive view of the possibilities.
Don't try to simply re-invent the wheel. Even if this is simply a Uni assignment, try to put something new on your CMS. Something that would push me to use yours instead of other CMS.
is cakePHP a good choice?
That's a highly subjective question and as such unanswerable. Though, if you want to experiment with architecture (eg. compare front controllers to page controllers), you probably should build more from scratch, as a lot of those decisions have already been made by the writers of said framework (And a lot of other frameworks, for the matter).
use one file for everything or use multiple files?
It's called a front controller (single entrypoint) or page controllers (multiple entry points). Get a copy of Patterns of Enterprise Application Architecture by M. Fowler.
do you have any good general advice on building more complex websites using php or any best-practices advice (i don't really understand why they don't teach us this in school)
There are billions of CMSs. Find some of them and analyse them to find out what they did and how they differ from each other. Trying to categorise the different approaches and compare their strengths/weaknesses could make for a good paper.
I'm building a PHP site, but for now the only PHP I'm using is a half-dozen or so includes on certain pages. (I will probably use some database queries eventually.)
Are simple include() statements a concern for speed or scaling, as opposed to static HTML? What kinds of things tend to cause a site to bog down?
Certainly include() is slower than static pages. However, with modern systems you're not likely to see this as a bottleneck for a long time - if ever. The benefits of using includes to keep common parts of your site up to date outweigh the tiny performance hit, in my opinion (having different navigation on one page because you forgot to update it leads to a bad user experience, and thus bad feelings about your site/company/whatever).
Using caching will really not help either - caching code is going to be slower than just an include(). The only time caching will benefit you is if you're doing computationally-intensive calculations (very rare, on web pages), or grabbing data from a database.
Sounds like you are participating in a bit of premature optimization. If the application is not built, while performance concerns are good to be aware of, your primary concern should be getting the app written.
Includes are a fact of life. Don't worry about number, worry about keeping your code well organized (PEAR folder structure is a lovely thing, if you don't know what I'm talking about look at the structure of the Zend Framework class files).
Focus on getting the application written with a reasonable amount of abstraction. Group all of your DB calls into a class (or classes) so that you minimize code duplication (KISS principles and all) and when it comes time to refactor and optimize your queries they are centrally located. Also get started on some unit testing to prevent regression.
Once the application is up and running, don't ask us what is faster or better, since what your bottleneck will be depends on each application. It may turn out that even though you have lots of includes, your loops are eating up your time, or whatever. Use XDebug and profile your code once it's up and running. Look for the segments of code that are eating up a disproportionate amount of time, then refactor. If you focus too much now on the performance hit between include and include_once you'll end up chasing a ghost when those curl requests running in sync are eating your breakfast.
In the meantime, though, the best suggestion is to look through the php.net manual and, if there's a built-in function that does what you are trying to do, use it! PHP's C-based extensions will almost always be faster than equivalent PHP code that you could write, and you'll be surprised how much of what you are trying to do is done already.
But again, I cannot stress this enough, premature optimization is BAD!!! Just get your application up off the ground with good levels of abstraction, profile it, then fix what actually is eating up your time rather than fixing what you think might eat up your time.
Strictly speaking, straight HTML will always serve faster than a server-side approach since the server doesn't have to do any interpretation of the code.
To answer the bigger question, there are a number of things that will cause your site to bog down; there's just no specific threshold for when your code is causing the problem vs. PHP. (keep in mind that many of Yahoo's sites are PHP-driven, so don't think that PHP can't scale).
One thing I've noticed is that the PHP-driven sites that are the slowest are the ones that include more than is necessary to display a specific page. OSCommerce (oscommerce.com) is one of the most popular PHP-driven shopping carts. It has a bad habit, however, of including all of its core functionality (just in case it's needed) on every single page. So even if you don't need to display an 'info box', the function is loaded.
On the other hand, there are many PHP frameworks out there (such as CakePHP, Symfony, and CodeIgniter) that take a 'load it as you need it' approach.
I would advise the following:
Don't include more functionality than you need for a specific page
Keep base functions separate (use an MVC approach when possible)
Use require_once instead of include if you think you'll have nested includes (e.g. page A includes file B which includes file C). This will avoid including the same file more than once. It will also stop the process if a file can't be found; thus helping your troubleshooting process ;)
Cache static pages as HTML if possible - to avoid having to reparse when things don't change
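The difference between include and require_once mentioned in the list above can be seen in a small experiment. This is only a sketch: it writes a throwaway file to the system temp directory, and the "loads" counter is a made-up name used to count how often that file is executed.

```php
<?php
// Create a throwaway include file that counts how often it is executed.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents(
    $file,
    '<?php $GLOBALS["loads"] = ($GLOBALS["loads"] ?? 0) + 1;'
);

require_once $file;  // first time: the file runs
require_once $file;  // already loaded in this request: skipped
$onceCount = $GLOBALS['loads'];     // 1

include $file;       // plain include runs the file every time
include $file;
$includeCount = $GLOBALS['loads'];  // 3

echo "$onceCount $includeCount\n";  // prints "1 3"
unlink($file);
```

With nested includes (page A includes B, which includes C, which A also includes), the require_once behaviour is what keeps function and class definitions from being loaded twice.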
Nah includes are fine, nothing to worry about there.
You might want to think about tweaking your caching headers a bit at some point, but unless you're getting significant hits it should be no problem. Assuming this is all static data, you could even consider converting the whole site to static HTML (easiest way: write a script that grabs every page via the webserver and dumps it out in a matching dir structure)
Most web applications are limited by the speed of their database (or whatever their external storage is, but 9/10 times that'll be a database), the application code is rarely cause for concern, and it doesn't sound like you're doing anything you need to worry about yet.
Before you make any long-lasting decisions about how to structure the code for your site, I would recommend that you do some reading on the Model-View-Controller design pattern. While there are others this one appears to be gaining a great deal of ground in web development circles and certainly will be around for a while. You might want to take a look at some of the other design patterns suggested by Martin Fowler in his Patterns of Enterprise Application Architecture before making any final decisions about what sort of design will best fit your needs.
Depending on the size and scope of your project, you may want to go with a ready-made framework for PHP like Zend Framework or PHP On Trax or you may decide to build your own solution.
Specifically regarding the rendering of HTML content I would strongly recommend that you use some form of templating in order to keep your business logic separate from your display logic. I've found that this one simple rule in my development has saved me hours of work when one or the other needed to be changed. I've used Smarty (http://www.smarty.net/) and I know that most of the frameworks out there either have a template system of their own or provide a plug-in architecture that allows you to use your own preferred method. As you look at possible solutions, I would recommend that you look for one that is capable of creating cached versions.
Lastly, if you're concerned about speed on the back-end then I would highly recommend that you look at ways to minimize your calls to your back-end data store (whether it be a database or just system files). Try to avoid loading and rendering too much content (say a large report stored in a table that contains hundreds of records) all at once. If possible look for ways to make the user interface load smaller bits of data at a time.
And if you're specifically concerned about the actual load time of your html content and its CSS, Javascript or other dependencies I would recommend that you review these suggestions from the guys at Yahoo!.
To add on what JayTee mentioned - loading functionality when you need it. If you're not using any of the frameworks that do this automatically, you might want to look into the __autoload() functionality that was introduced in PHP5 - basically, your own logic can be invoked when you instantiate a particular class if it's not already loaded. This gives you a chance to include() a file that defines that class on-demand.
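Here is a minimal sketch of that on-demand loading. Note that the magic __autoload() function itself was deprecated in PHP 7.2 and removed in PHP 8.0, so the sketch uses spl_autoload_register(), which serves the same purpose. The "one class per file named after the class" convention, the temp directory and the Greeter class are assumptions made up for the example; namespaced class names are not handled.

```php
<?php
// Assumed convention for this sketch: class Foo lives in <dir>/Foo.php.
$classDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/Greeter.php',
    '<?php class Greeter { public function hello(): string { return "hello"; } }'
);

// PHP calls this the first time an unknown class name is used.
spl_autoload_register(function (string $class) use ($classDir): void {
    $file = $classDir . '/' . $class . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// No explicit include of Greeter.php anywhere: it is loaded on demand.
$greeter = new Greeter();
echo $greeter->hello(), "\n";

// Clean up the demo files.
unlink($classDir . '/Greeter.php');
rmdir($classDir);
```

Classes that are never referenced during a request are never read from disk, which is the "load it when you need it" behaviour described above.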
The biggest thing you can do to speed up your application is to use an Opcode cache, like APC. There's an excellent list and description available on Wikipedia.
As far as simple includes are concerned, be careful not to include too many files on each request as the disk I/O can cause your application not to scale well. A few dozen includes should be fine, but it's generally a good idea to package your most commonly included files into a single script so you only have one include. The cost in memory of having a few classes here and there you don't need loaded will be better than the cost of disk I/O for including hundreds of smaller files.