Hi,
I am working on a website (a social network, in PHP), and I've decided to create only one PHP page (index.php). This page will contain PHP if conditions and statements based on the $_GET value, and will display the requested page (but within the same index.php).
This means that the code (JavaScript + XHTML + PHP) will be very large (nearly the whole project in one page).
I will also use .htaccess to rewrite the URLs of those pages to avoid any malicious requests (so it will look just like a normal website).
But before doing so, I just want to know the advantages and downsides of this technique, seen from all sides (security, server resources, etc.).
Thank you.
I think what you're trying to do is organize your code properly and effectively, which I commend.
However, if I understand correctly, you're going to put all of your JavaScript, HTML, and PHP in one file, which is really bad. You want your code to be modular, not lumped together in a single file.
I think you should look into using a framework (e.g. Zend) - PHP frameworks are specifically designed to help your code remain organized, modular, and secure. Your intent (organizing your code effectively) is great, but your idea of how to organize your code isn't very good. If you're absolutely adamant about not using a framework (for example, if this is a learning/school project), you should at least make sure you're following best practices.
This approach is not good because of server resource usage. In order to serve, say, jQuery.js, your web server is going to:
Determine that the request for jQuery.js actually passes through index.php
Pass index.php through the PHP parser
Wait for PHP to generate a response
Serve that response
Or, you could serve it like this:
Determine that jQuery.js exists at /var/www/mysite/jQuery.js
Serve it as the response
Likewise for anything that's "static", i.e. not generated by PHP directly. The bigger the number of ifs in the PHP script, the more tests will need to be done to find your file.
You do not need to pass your static content through some form of URL routing; only your dynamic content. For real speed, it's better to have responses generated ahead of time as well - this is called caching - particularly if the dynamic content is expensive to generate in terms of CPU cycles. Other caching techniques include keeping frequently accessed database data in memory, which is what memcached does.
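For instance, here is a minimal .htaccess sketch (assuming Apache with mod_rewrite) that routes only non-existent paths through index.php, so static files like jQuery.js are served directly:

RewriteEngine On
# If the request matches a real file or directory, serve it as-is
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else goes through the front controller
RewriteRule ^ index.php [QSA,L]

And a rough sketch of the memcached idea, using the pecl Memcached extension (the cache key and the render_homepage() helper are invented for illustration):

<?php
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

$html = $cache->get('homepage_html');          // hypothetical cache key
if ($html === false) {                         // cache miss: generate and store
    $html = render_homepage();                 // hypothetical expensive function
    $cache->set('homepage_html', $html, 300);  // keep it for 5 minutes
}
echo $html;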
If you're developing a social network, these things really do matter. Heck, Facebook wrote a PHP-to-C++ compiler to save clock cycles.
I second the framework recommendation because it really will make code organisation easier and might integrate with a caching-based solution.
In terms of PHP frameworks, there are many. Here's a list of web application frameworks in many languages, and, from the same page, the PHP ones. Take a look and decide which you like best. That's what I did, and I ended up learning Python to use Django.
It will be a hell of a mess.
You also won't be able to upgrade parts of the website, or work on them, without touching the whole thing.
You will not be able to apply an architectural pattern like MVC.
It could theoretically be faster, because only one file needs to be fetched from disk, but only under the assumption that all, or almost all, of the code is going to be executed.
So you will have to load and compile the whole file for every single request, including the parts that are not needed, which will slow you down.
What you CAN do, however, is have a single point of entry where all requests originate. That helps a lot with control, and it is called a bootstrap file.
But most importantly:
Why would you want that?
From what I know, most CMSes (and probably all modern ones) are made so that the requested page is always the same index.php, but that file is just a dispatcher to other sections. The code is written properly in different files that are pulled together with includes.
Edit: If you're afraid your included scripts are vulnerable, the solution is trivial: put them outside of the web root.
Simplistic example:
<?php
/* This folder shouldn't even be in the site root;
   it should be in a totally different place on the server,
   so there is no way someone could request something from it. */
$safeRoot = '/path/to/safe/folder/';

include $safeRoot . 'all_pages_need_this.php'; // aka The Bootstrap

// The isset() check avoids an "undefined index" notice when no page is given
switch (isset($_GET['page']) ? $_GET['page'] : '') {
    case 'home':
        include $safeRoot . 'home.module.php';
        break;
    case 'blog':
        include $safeRoot . 'blog.module.php';
        break;
    case 'store':
        include $safeRoot . 'store.module.php';
        break;
    default:
        include $safeRoot . '404.module.php';
}
This means that the code (JavaScript + XHTML + PHP) will be very large (nearly the whole project in one page).
Yes, and it'll be slow.
So you're not going to have any HTML caching?
It's all purely in one file, hard to update and slow to interpret? Geesh, good luck.
What you are referring to is called a single point of entry, and it is something many web applications (most notably the ones built following the MVC pattern) use.
The code of your point-of-entry file doesn't have to be huge, as you can simply include() other files as needed. For example:
<?php
$module = isset($_GET['module']) ? $_GET['module'] : ''; // guard against an undefined index
if ($module == 'messages') {
    include 'inbox.php';
} elseif ($module == 'profile') {
    include 'profile.php';
}
// ...and so on for the other modules
It has recently been highlighted (in my previous questions) that the way I have designed web applications is not ideal.
Consider the following. I am working on a multi-user website with lots of different sections including profiles and forums and support tickets. The structure is as follows:
A main page into which all the other pages are included or require_once'd; we'll call it home.php.
In home.php, one of the first things loaded is router.php; this handles every single $_GET and $_POST the user could possibly produce, and every form and process is sorted via a main variable called $data_process. router.php is essentially just one giant switch() statement on $data_process. This parses all the data and gives a result.
Next included is header.php, which will not only process the variables necessary for the page that will be loaded, but also set up the header and decide exactly what is going to be shown there, e.g. the menu, user info, and information about the page currently being viewed (i.e. Home > Support > View Ticket).
Then the page is loaded according to the $page variable. A simple include.
Then footer.php, then close.
And so the dynamic website is created. I was told this is bad practice by a user named #HorusKol. I am very pleased with this website, as it is the most streamlined and easy-to-write site I have ever worked on. Is this still bad code design? What is perfect code design?
PS - can anyone recommend any good, easy-to-read books that explain PHP, MySQL, and design structure?
It is bad design because you process a lot of data that is perhaps not necessary for the rest of the process. The router should only process the URL; processing of POST data is handled somewhere else. Only include what you need; including everything makes things slow.
A better way is to structure your app into different parts: a router that processes the URL, a controller that runs an action based on a routed request, a view that renders the HTML pages, and a model for accessing data. MVC is what comes to mind.
There is no such thing as perfect code design.
There's no canonical definition of "good design" - the best you can hope for is that your design balances the various forces on your project in the optimum way. Forces on your project might be maintainability, performance, scalability, extensibility - classic non-functional requirements - but also things like search engine optimization, standards compliance, and accessibility (things that apply to web projects in particular).
If all your URLs are of the form "www.mysite.com/home.php?action=getDetails&productID=123", your search-engine friendliness is pretty low. It's far better to have semantic URLs - "www.mysite.com/products/DesktopPc/details.php". You can achieve this through cunning .htaccess trickery in your current design.
From a maintainability point of view, your design has some issues. If I've understood it correctly, adding a new page to the site requires you to modify the code in several different source files - router.php (your giant switch statement), the page itself, and probably header.php as well. That indicates that the code is tightly coupled. Modifying the giant switch statement sounds like a likely source of entertaining bugs, and the combination of the router and the header manipulating the variables, plus the actual page itself, seems a little fragile. This is okay if you're the only person working on the project and you're going to be around for the long term; if that's not the case, it's better to use an off-the-shelf framework (MVC is the current favourite; Zend, Symfony and Cake all do this well in PHP), because you can point new developers at the documentation and expect them to get up to speed.
One of the biggest enemies of maintainability is complexity - complex code is harder to work with, and harbours more bugs. There's a formal metric for complexity (cyclomatic complexity), and I'm pretty sure your switch statement scores very highly on that metric - in itself not necessarily a huge problem, but definitely something to keep an eye on. Lots of MVC frameworks avoid this by having the routing defined as data rather than code (i.e. have the routes in a configuration file), and/or by using convention over configuration (i.e. if the request is for page "productDetails", include the file "/inc/productDetails.inc"), as in the sketch below.
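A minimal sketch of the routes-as-data idea (the route map and file names here are invented for illustration):

<?php
// Routes defined as data: adding a page means adding one array entry,
// not another case in a giant switch statement.
$routes = array(
    'home'           => 'inc/home.inc',
    'productDetails' => 'inc/productDetails.inc',
    'support'        => 'inc/support.inc',
);

$page = isset($_GET['page']) ? $_GET['page'] : 'home';
include isset($routes[$page]) ? $routes[$page] : 'inc/404.inc';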
Extensibility could be another concern - imagine having to expose your site content as JSON or XML as well as HTML; in your current design, that would require a lot of change, because every single item in your page-processing pipeline needs to know about it. home.php needs to know not to send HTML, the header and footer need to know, and the router needs to understand how to handle the additional data type, almost certainly making the switch statement even bigger. This again may not be a big deal.
Both extensibility and maintainability are helped by being able to unit test your code. Test-driven development turns this into a whole discipline in its own right; I'm guessing that testing your application is hard - but that's just a guess; it depends more on how you've factored the individual lumps of code than on what you've described above. However, another benefit of MVC is that it makes it easy to write unit tests for key parts of your system.
So, if the forces on your project don't include an emphasis on maintainability or extensibility, and you can handle the SEO aspect, I don't think your design is necessarily "bad"; even if you do care about those things, there are other things you can do to accommodate those forces - write documentation, hire lots of cheap coders.
The best way to get up to speed with these design topics is not books on PHP or MySQL; I'd recommend "Refactoring" and "Patterns of Enterprise Application Architecture" by Martin Fowler, "Design Patterns" by Gamma et al., and "Code Complete" by McConnell (though that's a touch out of date by now).
I see that many MVC implementations for websites have a single entry point, such as an index.php file, and then parse the URL to determine which controller to run. This seems rather odd to me, because it involves having to rewrite the URL using Apache rewrites, and with enough pages that single file will become bloated.
Why not instead just have the individual pages be the controllers? What I mean is: if you have a page on your site that lists all the registered members, then the members.php page users navigate to would be the controller for the members. This PHP file would query the members model for the list of members from the database and pass it to the members view.
I might be missing something, because I have only recently discovered MVC, but this one issue has been bugging me. Wouldn't this kind of design be preferable, since instead of having one bloated entry file that all pages unintuitively call, the models and views for a specific page are contained, encapsulated, and called from their respective page?
From my experience, having a single entry point has a couple of notable advantages:
It eases centralized tasks such as resource loading (connecting to the DB or to a memcache server, logging execution times, session handling, etc.). If you want to add or remove a centralized task, you just have to change a single file: the index.php.
Parsing the URL in PHP makes the "virtual URL" decoupled from the physical file layout on your web server. That means you can easily change your URL system (for example, for SEO purposes, or for site internationalization) without having to actually change the location of your scripts on the server.
However, sometimes having a single entry point can be a waste of server resources. That obviously applies to static content, but also when you have a set of requests that have a very specific purpose and need only a very small subset of your resources (maybe they don't need DB access, for instance). Then you should consider having more than one entry point. I have done that for the site I am working on: it has one entry point for all the "standard" dynamic content and another one for the calls to the public API, which need far fewer resources and have a completely different URL system.
And a final note: if the site is well implemented, your index.php doesn't necessarily have to become bloated :)
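As an illustration of the URL decoupling mentioned above, here is a minimal sketch of how an index.php might map a "virtual URL" onto a controller script; the directory layout and the 'home'/'404' defaults are assumptions:

<?php
// e.g. a request for /products/123 -> controller "products", argument "123"
$path     = (string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$segments = array_values(array_filter(explode('/', $path)));

$controller = isset($segments[0]) ? $segments[0] : 'home';
$argument   = isset($segments[1]) ? $segments[1] : null;

// basename() blocks path traversal; the controllers live outside the doc root
$file = '/path/to/controllers/' . basename($controller) . '.php';
include is_file($file) ? $file : '/path/to/controllers/404.php';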
It is all about being DRY: if you have many PHP files handling requests, you will have duplicated code. That just makes for a maintenance nightmare.
Have a look at the 'main' index page for CakePHP, https://github.com/cakephp/cakephp/blob/master/app/webroot/index.php
No matter how big the app gets, I have never needed to modify that. So how can it get bloated?
Deeplinking directly into the controllers of an MVC framework eliminates the possibility of implementing controller plugins or filters, depending on the framework you are using. Having a single point of entry standardizes the bootstrapping of the application and its modules, and executes the previously mentioned plugins before a controller is accessed.
Also, Zend Framework uses its own URL rewriting in the form of routing. The applications I work on that use Zend Framework have an .htaccess file of maybe six lines of rewrite rules and conditions.
A single entry point certainly has its advantages, but you can get pretty much the same benefit from a central required file at the top of every single page that handles database connections, sessions, etc. It's not bloated, it conforms to DRY principles (except for that one require line), it separates logic and presentation, and if you change file locations, a simple search and replace will fix it.
I've used both and I can't say one is drastically better or worse for my purposes.
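For what it's worth, a minimal sketch of that "central required file" approach (the file names and the get_all_members() helper are hypothetical):

<?php
// members.php - an individual page acting as its own entry point
require_once __DIR__ . '/common.php'; // hypothetical: DB connection, session, auth

$members = get_all_members($db);      // hypothetical helper defined in common.php
include __DIR__ . '/views/members.html.php';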
Software engineering principles are what drive the single point of entry paradigm. "Why not instead just have the individual pages be the controllers?"
Individual pages are already Controllers, in a sense.
In PHP, there is going to be some boilerplate code that runs for every HTTP request: the autoloader require statement (PSR-4), error-handler code, sessions, and, if you are wise, wrapping the core of your code in a try/catch with Throwable as the top exception to catch. By centralizing code, you only need to make changes in one place!
True, the centralized PHP will use at least one require statement (to load the autoloader code), but even if you have many require statements, they will all be in one file, the index.php (not spread out over a galaxy of files under the document root).
If you are writing code with security in mind, again, you may have certain encoding checks/sanitizing/validating that happen with every request. Using values in $_SERVER or filter_input_array()? Then you might as well centralize that.
The bottom line is this: the more you do on every page, the more reason you have to centralize that code.
Note that this way of thinking leads one down the path of looking at your website as a web application. From the web application perspective, a single point of entry is justifiable, because a problem solved once should only need to be modified in one place.
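Putting those pieces together, a hedged sketch of such a centralized index.php (the Composer-style autoloader path and the dispatch step are assumptions):

<?php
// index.php - the one place where per-request boilerplate lives
require __DIR__ . '/../vendor/autoload.php'; // PSR-4 autoloader (assumed layout)

session_start();

set_error_handler(function ($severity, $message, $file, $line) {
    // Convert PHP errors into exceptions so the catch block below sees them
    throw new ErrorException($message, 0, $severity, $file, $line);
});

try {
    // Centralized input validation (the 'page' key is illustrative)
    $input = filter_input_array(INPUT_GET, array(
        'page' => FILTER_SANITIZE_SPECIAL_CHARS,
    ));
    // ...dispatch to a controller based on $input['page']...
} catch (Throwable $e) {
    error_log($e->getMessage());
    http_response_code(500);
    echo 'Something went wrong.';
}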
I'm working on a project that allows multiple users to submit large data files and perform operations on them. The "backend", which performs these operations, is written in Perl, while the "frontend" uses PHP to load HTML template files and determine which content to deliver. Data is stored in a database (MySQL, SQLite, Oracle), and while there is data which has not yet been acted upon, Perl adds it to a running queue which delivers data to other threads based on system load. In addition, there may be pre- and post-processing of the data before and after the main Perl script operates (the specifications are unclear), so I may want to allow these processors to be user-selectable plugins. I had been writing this project in a more procedural fashion, but I am quickly realizing the benefit of separating concerns so as to limit the scope one change has on the rest of the project.
I'm quite inexperienced with design patterns and am curious what the best way to proceed is. I've heard MVC thrown around quite a bit, but I am unsure of how to apply it. Specifically, what are some good options for structuring this code (in terms of design patterns and folder hierarchy)? How can I achieve this with both PHP and Perl while minimizing duplicated code between languages? Should I keep my PHP files in the top level so I don't have ugly paths in the URL?
Also, if I want to provide interchangeable databases, does each table need its own DAO implementation?
This is really several questions, but here goes:
MVC
To implement MVC you'll need to have a clear idea of what the Model, View and Controller sections are responsible for. You can read about this for yourself, but in this case:
The model should only contain the code which performs operations on the data files, i.e. the back-end Perl scripts
The view will be the HTML templates only. No PHP logic should be embedded in them except what's necessary to display the page.
The controller will be the rest of the application: the parts that connect the PHP front-end to the Perl back-end, and possibly the Perl scripts that poll for new files.
For every PHP, HTML, or Perl file you create, it should be absolutely clear which section it belongs to, in its entirety. Never mix model, view, or controller code in the same file.
There should be no reason why you can't continue writing procedurally. You don't necessarily need a framework either; it may help you to slot things into place, but it may also take some time to learn.
MVC is more a separation of concerns that you need to keep in mind. A good way to think about it is: "Can each of these components work separately from the others?", e.g.:
can you write a "mocked-up" (example) data file and have the Perl scripts process it without running any of the PHP code?
can you request an operation from the front-end and have it deliver all the parameters to a single place, ready for a single Perl routine to pick up, without running any Perl code?
You shouldn't need to worry about code duplication if the PHP and Perl scripts are doing totally different things (PHP only sets up user parameters and input files; Perl only takes those parameters and files and outputs new files).
As for folder hierarchy, if you're not using any existing framework, the most important thing is just that it makes sense, is consistent, and that you document your decisions, e.g. in a readme file.
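For example, one hypothetical layout that keeps the MVC sections obvious (the names are only a suggestion):

/project
    /public         <- document root: PHP entry pages, css, js
    /controllers    <- PHP glue between the front-end and the Perl back-end
    /views          <- HTML templates only
    /model          <- Perl scripts that operate on the data files
    readme.txt      <- documents these decisions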
Ugly URLs
You don't need to put your PHP files in any particular place; use Apache rewrite rules to make the URLs pretty afterwards (rewrite rules generator - see the "301 Redirect File" section). But a good MVC framework will solve this for you.
User-selectable plugins
Be careful not to optimise too early. You may be able to develop the new pre/post-processing steps yourself, and just put them in a list for users to select from.
MVC frameworks are a great tool for any web developer, providing the well-defined separation that is indispensable for keeping your code organized.
For PHP, I recommend Zend Framework.
The database schema will change depending on what platform you use.
I am taking over an existing PHP project. I noticed that the previous developer uses one index.php page for the entire site, currently 10+ pages. This is the second project that I have seen done like this, and I don't see the advantage of this approach. In fact, it seems to overcomplicate everything, because now you can't just add a new page to the site and link to it. You also have to make sure you update the main index page with an if clause to check for that page type and then load the page. It seems that if they are just trying to reuse a template, it would be easier to use includes for the header and footer and then create each new page with those files referenced.
Can someone explain why this approach would be used? Is this some form of an MVC pattern that I am not familiar with? PHP is a second language for me, so I am not as familiar with best practices.
I have tried doing some searches on Google for "single index page with php" and the like, but I cannot find any good articles explaining why this approach is used. I really want to kick this old stuff to the curb and not continue down that path, but I want to have some sound reasoning before making the suggestion.
A front controller (index.php) ensures that everything that is common to the whole site (e.g. authentication) is always correctly handled, regardless of which page you request. If you have 50 different PHP files scattered all over the place, it's difficult to manage that. And what if you decide to change the order in which the common library files get loaded? If you have just one file, you can change it in one place. If you have 50 different entry points, you need to change all of them.
Someone might say that loading all the common stuff all the time is a waste of resources and you should only load the files that are needed for this particular page. True. But today's PHP frameworks make heavy use of OOP and autoloading, so this "waste" doesn't exist anymore.
A front controller also makes it very easy for you to have pretty URLs in your site, because you are absolutely free to use whatever URL you feel like and send it to whatever controller/method you need. Otherwise you're stuck with every URL ending in .php followed by an ugly list of query strings, and the only way to avoid this is to use even uglier rewrite rules in your .htaccess file. Even WordPress, which has dozens of different entry points (especially in the admin section), forces most common requests to go through index.php so that you can have a flexible permalink format.
Almost all web frameworks in other languages use single points of entry -- or more accurately, a single script is called to bootstrap a process which then communicates with the web server. Django works like that. CherryPy works like that. It's very natural to do it this way in Python. The only widely used language that allows web applications to be written any other way (except when used as an old-style CGI script) is PHP. In PHP, you can give any file a .php extension and it'll be executed by the web server. This is very powerful, and it makes PHP easy to learn. But once you go past a certain level of complexity, the single-point-of-entry approach begins to look a lot more attractive.
Having a single index.php file in the public directory can also protect you in the case of the PHP interpreter going down. A lot of frameworks use the index.php file to include the bootstrap file outside of the doc root. If this happens, the user will only be able to see the source code of this single file instead of the entire codebase.
Well, if the only thing that changes is the URL, it doesn't seem like it's done for any reason besides aesthetic purposes...
As for me, a single entry point can help you keep better control of your application: it makes it easier to handle errors, route requests, and debug the application.
A single index.php is an easy way to make sure all requests to your application flow through the same gate. This way, when you add a second page, you don't have to make sure bootstrapping, authentication, authorization, logging, etc. are all configured - you get them for free by merit of the framework.
In modern web frameworks this would be done using a front controller, but it is impossible to tell, since a lot of PHP code/developers suffer from NIH (not-invented-here) syndrome.
Typically such approaches are used when the contents of the pages are determined by database content; thus all the work gets done in a single file. This is often seen in CMS systems.
This question stems from watching Rasmus Lerdorf's talk from Drupalcon. This question and his talk have nothing specifically to do with Drupal, by the way... it was just given at their con. My own question also has nothing specific to do with PHP. It is the single entry point in general that I am curious about.
These days it seems that most frameworks offer a single entry point for whatever you build with them. In his talk, Rasmus mentions that he thinks this is bad, and it seems to me that he would be correct in this thinking. If everyone hitting the site comes in through the same entry point, wouldn't things bog down after traffic reached a certain point? Wouldn't it be more efficient to allow people direct access to specific points in a site without having their requests go through the same point? But perhaps the actual impact is not very bad? Maybe modern architecture can handle it? Maybe you have to be truly gigantic in scale before it even becomes worth considering? I'm curious what people on this site think about this issue.
In short, Rasmus (or this interpretation of him) is wrong.
This shows a clear lack of understanding of how computers work. The more something gets used, the more likely it is to sit closer to the CPU, and therefore to be faster. Mind you, a single point of entry != a single point of failure. But that's all beside the point: when people say single point of entry, we're talking about the app - it is a single point of entry for your logic.
Not to mention it's architecturally brain-dead not to have a central point of entry, or at least to reduce the number of entry points in general. As soon as you want to do one thing across your app at every entry point, guess how many places need to change? Having dealt with an app where each page stood on its own, it sucked having to make changes, and I assure you, we needed them.
The important thing is that you use a web framework that supports scalability through methods like load-balancing and declarative code.
No, a single entry point does not in itself create a bottleneck. The front page of Google gets a lot of hits, but they have lots of servers.
So the answer is: It doesn't matter.
Like anything in software development, it depends. Rasmus's objection to front-controller-style frameworks is the performance hit you take from having to load so much code up front on each request. This is 100% true. Even if you're using some kind of smart resource-loading module/object/etc., using a framework is a performance trade-off. You take the performance hit, but in return you get back:
Encouraged separation of "business logic" (whatever that is) and template/layout logic
Instant and (more importantly) unified access to the objects you'll use for database queries, service calls, your data model, etc.
To a guy like Rasmus, this isn't worth the performance hit. He's a C/C++ programmer. For him, if you want to separate out business logic in a highly performant way, you write a C/C++ extension to PHP.
So if you have an environment and team where you can easily write C/C++ extensions to PHP, and your time-to-market vs. performance ratio is acceptable, then yes, throw away your front-controller framework.
If that's not your environment, consider the productivity increases a front-controller framework can bring to your (likely) simple CRUD application.
I think one of the biggest advantages of having only a single point of entry is security. All of the input coming in is less likely to corrupt the system if it is checked and validated in a single place.
I think it's a big misunderstanding to discuss this as "one file" vs. "several files".
One tends to think that because the entry point is in a single file, all we have to focus on is the code in that one file - that's wrong.
All of the popular frameworks have tons of files with entry-manipulation code, interpretation code, and validation code for requests. The code is not located in one place; rather, it is spread around in a jungle of require/include statements calling different classes depending on what's being requested and how.
In both cases the request is really handled by different files.
Why then should I have a single entry point with some kind of _detect_uri() function that needs to call several other functions like strpos(), substr(), and strncmp() to clean up, validate, and split up the request string, when I could just use several entry points and eliminate that code altogether?
Take a look at CodeIgniter's _detect_uri() function in URI.php. Not to pick on CodeIgniter; it's just an example. The other frameworks do likewise.
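To make that concrete, the work such a function does boils down to something like this rough sketch (not CodeIgniter's actual code):

<?php
// Reduce the request URL to controller/method segments
$path = (string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
$path = trim(str_replace('/index.php', '', $path), '/');

$segments   = $path === '' ? array() : explode('/', $path);
$controller = isset($segments[0]) ? $segments[0] : 'welcome';
$method     = isset($segments[1]) ? $segments[1] : 'index';

With several entry points, each script already knows what it is, so none of this parsing is needed.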
You can achieve the goals of the MVC pattern with a single entry point as well as with several entry points.
This is what I thought at first, but it doesn't seem to have much impact. After all, your entry point is (usually) only doing a couple of things: setting some environment constants, including the bootstrap loader, optionally catching any exceptions thrown, and dispatching the front controller. I think the reason this is not inefficient is that this file does not change depending on the controller, action, or even the user.
I do feel this is odd, however. I'm building a small MVC framework myself at the moment, but it's slightly reversed from most frameworks I've used: I put the controller logic in the accessed file. For example, index.php would contain the IndexController and its actions. This approach is working well for me, at least.
As most PHP MVC frameworks use some sort of URL rewriting, or at least parse anything after index.php on their own, a single entry point is needed.
Besides that, I like to provide entry points per context, say web(/soap)/console/...
Just to add: what people usually think is that, since there is one PHP page, it is a single page serving all requests, sort of like queuing.
The important thing to note is that each request creates its own instance of the script, and thus the load is the same as if two different pages were being accessed at the same time. So, still the same load. Virtually.
However, some frameworks might have a whole lot of stuff you don't need going on in the entry script - sort of like a catch-all to satisfy all possible scenarios - and that might add load.
Personally, I prefer multiple pages, the same way I mix OOP and procedural code. I like PHP the old-school way.
There are definitely disadvantages to using a front-controller file architecture. As Saem explains, accessing the same file is generally advantageous for performance, but as Alan Storm and Rasmus Lerdorf explain, trying to make code handle numerous situations (such as a front controller trying to handle the request for every "page" on the site) leads to a bloated and complicated collection of code. Using a bloated and complicated collection of code for every page load can be considered bad practice.
Saem argues that a front controller can save you from having to edit code in multiple locations, but I think that in some cases this is better solved by separating out redundancies.
Having a directory structure that is actually used by the web server can make things simpler for the web server and more obvious for the developers. It is simpler for a web server to serve files by translating the URL into the location it traditionally represents than to rewrite it into a new URL. A non-front-controller file architecture can also be more obvious to the developer, because instead of starting with a controller that is bloated to handle everything, they start with a file/controller that is more specific to what they need to develop.