Zend / MVC without mod_rewrite

I've seen it mentioned in many blogs around the net, but I believe it should be discussed here.
What can we do when we have an MVC framework (I am interested in Zend) in PHP but our host does not provide mod_rewrite?
Are there any "short-cuts"? Can we transfer control in any way (so that a mapping may occur between pages)? Any ideas?
Thank you :-)

Zend Framework should work without mod_rewrite, as long as you can live with your URLs looking more like "/path/to/app/index.php/controller/action". With mod_rewrite you could do away with the "index.php" bit, but it should work without it too.
It's all a matter of setting up the routes to accept the index.php part.
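A minimal ZF1-style bootstrap sketch of that setup; the paths, the controller directory and the assumption that the Zend library is on the include_path are all illustrative, not taken from the question:
<?php
// index.php sketch, assuming the Zend library is on the include_path.
require_once 'Zend/Controller/Front.php';

$front = Zend_Controller_Front::getInstance();
$front->setControllerDirectory('/path/to/app/application/controllers');
// Tell the router that "index.php" is part of the base URL, so
// /path/to/app/index.php/controller/action maps to controller/action.
$front->setBaseUrl('/path/to/app/index.php');
$front->dispatch();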

OK, my verdict :-): I have successfully used Zend without mod_rewrite, and it is as you've all said: site/index.php/controller/action. I knew that before posting this. I've also found a technique around the net that "pushes" 404 pages to index.php, so anything that is not a real resource (e.g. CSS, an image, etc.) ends up there, with one exception: POST values are lost.
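For reference, the 404-push trick boils down to a single .htaccess line like the sketch below (the front-controller name is an assumption); remember that POST data does not survive the internal error redirect:
# .htaccess sketch: anything that does not map to a real file falls through to the front controller.
# Existing files (CSS, images) are served normally; POST bodies are lost on the 404 redirect.
ErrorDocument 404 /index.php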
So I decided that the next time an application has to be built on that specific server, I will politely ask for mod_rewrite. If the administrator cannot provide it, I'll talk with my boss, or, if the project is mine, switch providers.
Generally, it is a shame that the PHP hosting market is so fragmented (php4, php5, php6, mod_rewrite, mod_auth, mod_whatever), but that is another story...

mod_rewrite is almost essential in today's hosting environment... but unfortunately not everyone got the message.
Lots of large PHP programs (I'm thinking of Magento, but most can cope) have a pretty-URL fallback mode for when mod_rewrite isn't available.
URLs end up looking like www.site.com/index.php?load-this-page
They must be running some magic to grab the variable name from the $_GET array and use it as the selector for which module/feature to execute.
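A hedged sketch of how that trick can work; the pages directory and the character whitelist are assumptions, not something pulled from Magento:
<?php
// With a URL like /index.php?load-this-page, the identifier arrives as a $_GET key with an empty value.
$page = $_GET ? key($_GET) : 'home';                 // e.g. "load-this-page"
$page = preg_replace('/[^a-z0-9_-]/i', '', $page);   // whitelist characters before using it in a path
require __DIR__ . '/pages/' . $page . '.php';        // assumed layout: one script per page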
On a related note, I've seen lots of odd-looking URLs on the new Facebook site where it uses the #, so links look like www.new.facebook.com/home.php#/inbox/. Clearly we're not meant to see that; since the fragment after the # is never sent to the server, that part has to be handled client-side rather than by parsing $_SERVER['REQUEST_URI'].

If you can find a non-mod_rewrite way to redirect all requests to index.php (or wherever your init script is), you can, as mentioned above, use $_SERVER['REQUEST_URI'] to grab the portion of the address after the domain, parse it however you like, and make the request do what you want. This is how WordPress does it (granted, with mod_rewrite). As long as you can redirect requests to your index page while retaining the original URI, you can do whatever processing the request needs.
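A minimal front-controller sketch of that REQUEST_URI parsing; the page scripts and directory layout are assumptions for illustration:
<?php
// Strip the query string, split the path, and dispatch on the first segment.
$path     = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);   // e.g. "/blog/2009/05/hello"
$segments = array_values(array_filter(explode('/', $path)));    // ["blog", "2009", "05", "hello"]
$first    = isset($segments[0]) ? $segments[0] : '';

switch ($first) {
    case 'blog':
        require __DIR__ . '/pages/blog.php';   // assumed page script
        break;
    default:
        require __DIR__ . '/pages/home.php';   // assumed fallback
}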

Drupal's rewrite rules translate
http://example.com/path/goes/here
into
http://example.com/index.php?q=path/goes/here
...and Drupal has logic to decide which flavor of URLs to generate. If you can live with the ugly URLs, this lets you keep all the logic of a single front controller in place without relying on URL rewriting.
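The rewrite side of that translation looks roughly like the following sketch (simplified from memory, not copied from Drupal's shipped .htaccess):
RewriteEngine On
# Only rewrite requests that do not map to an existing file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]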

Related

Htaccess rewrite for user profiles and possible conflicts

I need to ask something about .htaccess redirection. I know there are lots of questions about .htaccess, rewriting and pretty profile URLs, but I've never found a real answer to my question, and I hope I can with your help.
A pretty URL rule, as you know, would change something like "mydomain.com/profile.php?username=myuser" into "mydomain.com/myuser".
But let's say I also have a rewrite rule for my login URL: www.mydomain.com/login
That means if a user tries to register the exact same username as "login", how do you handle that conflict in the rewrite?
A possible solution might be a minimum character limit, say 6 characters, but that doesn't look elegant, since you lose the option of using routes longer than 6 characters, like "/resetpassword".
Probably a "banned words" kind of array check when the user picks a username would be a solution, but then you need to foresee every name that shouldn't be allowed.
Many giant websites use this kind of rewrite method. Facebook in particular uses a "/username" style rule for both pages and users at the same time.
Anyway, if someone knows the magic behind that kind of URL redirection/rewrite rule, please help me out :)
Thanks
P.S.: I know there is another solution like "/user/username", but nowadays pointing directly off the base URL and shortening the full URL is getting more and more popular, and I just want to understand the possibilities there.
Why not just have a login subdirectory in the root of your site that contains the relevant files for logging a user in? That way, the rewrite rules in your .htaccess file only have to deal with the user-profile redirect stuff.
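If you do keep everything behind one catch-all rule instead, the usual complement is the "banned words" check the question already mentions, applied at registration time. A minimal sketch; the list and the function name are purely illustrative:
<?php
// Reject usernames that collide with routed URLs (the list here is an example, not exhaustive).
$reserved = array('login', 'logout', 'register', 'resetpassword', 'admin', 'profile');

function username_is_available($username, array $reserved)
{
    $username = strtolower(trim($username));
    if (in_array($username, $reserved, true)) {
        return false;   // collides with an application route such as /login
    }
    // ...also check the users table for an existing account here...
    return true;
}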
What you're looking for is something called "routes". They're typically implemented by MVC frameworks like Zend Framework, CakePHP or Symfony.
What they essentially do is forward every request to a single index.php, which in turn figures out from $_SERVER['REQUEST_URI'] which PHP code should handle the request.
I wouldn't recommend rolling your own rewrite rules and routing. Instead, try getting into a PHP framework. They do the heavy lifting for you.
Personally, I use Zend Framework. But I wouldn't recommend the new version 2 to beginners. Try ZF1. It's actually pretty easy to get into.

Why use a single index.php page for an entire site?

I am taking over an existing PHP project. I noticed that the previous developer uses one index.php page for the entire site, currently 10+ pages. This is the second project that I have seen done like this. I don't see the advantage of this approach. In fact, it seems to over-complicate everything, because now you can't just add a new page to the site and link to it; you also have to make sure you update the main index page with an if clause to check for that page type and then load the page. It seems that if they are just trying to reuse a template, it would be easier to use includes for the header and footer and then create each new page with those files referenced.
Can someone explain why this approach would be used? Is this some form of an MVC pattern that I am not familiar with? PHP is a second language so I am not as familiar with best practices.
I have tried doing some searches in Google for "single index page with php" and the like, but I cannot find any good articles explaining why this approach is used. I really want to kick this old stuff to the curb and not continue down that path, but I want to have some sound reasoning before making the suggestion.
A front controller (index.php) ensures that everything that is common to the whole site (e.g. authentication) is always correctly handled, regardless of which page you request. If you have 50 different PHP files scattered all over the place, it's difficult to manage that. And what if you decide to change the order in which the common library files get loaded? If you have just one file, you can change it in one place. If you have 50 different entry points, you need to change all of them.
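A minimal front-controller sketch of that idea; the library files and the pages directory are assumptions, just to show the shared bootstrap happening in one place:
<?php
// index.php: everything the whole site needs happens exactly once, here.
require __DIR__ . '/lib/config.php';
require __DIR__ . '/lib/database.php';
require __DIR__ . '/lib/auth.php';       // e.g. session check / authentication

// ...then the request is dispatched to the page-specific code.
$page = isset($_GET['page']) ? $_GET['page'] : 'home';
$page = preg_replace('/[^a-z0-9_-]/i', '', $page);   // whitelist before using it in a path
require __DIR__ . '/pages/' . $page . '.php';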
Someone might say that loading all the common stuff all the time is a waste of resources and you should only load the files that are needed for this particular page. True. But today's PHP frameworks make heavy use of OOP and autoloading, so this "waste" doesn't exist anymore.
A front controller also makes it very easy for you to have pretty URLs in your site, because you are absolutely free to use whatever URL you feel like and send it to whatever controller/method you need. Otherwise you're stuck with every URL ending in .php followed by an ugly list of query strings, and the only way to avoid this is to use even uglier rewrite rules in your .htaccess file. Even WordPress, which has dozens of different entry points (especially in the admin section), forces most common requests to go through index.php so that you can have a flexible permalink format.
Almost all web frameworks in other languages use single points of entry -- or more accurately, a single script is called to bootstrap a process which then communicates with the web server. Django works like that. CherryPy works like that. It's very natural to do it this way in Python. The only widely used language that allows web applications to be written any other way (except when used as an old-style CGI script) is PHP. In PHP, you can give any file a .php extension and it'll be executed by the web server. This is very powerful, and it makes PHP easy to learn. But once you go past a certain level of complexity, the single-point-of-entry approach begins to look a lot more attractive.
Having a single index.php file in the public directory can also offer some protection in case the PHP interpreter goes down. A lot of frameworks use the index.php file to include a bootstrap file outside the document root, so if this happens, the user will only be able to see the source code of this single file instead of the entire codebase.
Well, if the only thing that changes is the URL, it doesn't seem like it's done for any reason besides aesthetic purposes...
As for me, a single entry point gives you better control of your application: it makes it easier to handle errors, route requests, and debug the application.
A single "index.php" is an easy way to make sure all requests to your application flow through the same gate. This way when you add a second page you don't have to make sure bootstrapping, authentication, authorization, logging, etc are all configured--you get it for free by merit of the framework.
In modern web frameworks this would be done with a front controller, but it is impossible to tell from here, since a lot of PHP code and developers suffer from NIH (not-invented-here) syndrome.
Typically such approaches are used when the contents of the pages are determined by database contents. Thus all the work would get done in a single file. This is seen often in CMS systems.

PHP request handler script and SEO

Would using a central "page handler" affect SEO negatively?
e.g. a page request comes in for www.mysite.com/index.php, which mod_rewrite passes on as www.mysite.com/handler.php?page=index. handler.php gathers the page-specific includes, language files and templates, and outputs the resultant HTML.
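The rule I have in mind looks roughly like this simplified sketch (not the exact production rule):
RewriteEngine On
# Map /something.php onto the central handler, e.g. /index.php -> handler.php?page=index,
# but never rewrite the handler itself.
RewriteCond %{REQUEST_URI} !^/handler\.php
RewriteRule ^([a-z0-9_-]+)\.php$ handler.php?page=$1 [L,QSA]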
My understanding is that the page handler method won't be any different SEO-wise from serving index.php directly, as the content and the publicly visible URL remain the same regardless of the monkey business going on behind the scenes, but I've been wrong before... :)
Search engines can only see the end HTML result. They have no idea if you're using a central page handler - how would they without hacking into your site's FTP?
Also, as many frameworks and CMSes use this technique - Drupal and WordPress come to mind immediately - Google et al. would be lunatics to penalise it, even if they could detect it.
Because mod_rewrite happens within the server, the requester will only see that they requested index.php and got a response. Without a redirect, the requester will only know that index.php exists.
Many content management systems use this method. While in Drupal every page is actually served by the request /index.php?q=request/path through mod_rewrite, any links on the site will appear as /request/path, with the requester oblivious to the fact that they are all passed through one PHP script. There are also modules that redirect the ?q= path to the 'clean' path, telling the requester that the path with a query string is invalid or doesn't exist.
A well formed URI is a bonus when it comes to SEO. It helps indexing. Consider that there are sites like PRWeb.com that sell you URI space. Not subdomains, but URI keywords.
Also, while many customers merely want to mouse around, astute web users are impressed with an intuitive URI pattern. If you chop the filename off a path, you should get something logical, like a homepage or an index page, not an error screen.
If your application will eventually be statically cached, you want to be able to leverage the file system. So if you have content that will publish well in a static form, I wouldn't hide it behind a convoluted query string.
Also, when conducting web analytics, having an easily parsed URI certainly helps you craft your reports.
Your URI doesn't have to correspond to your filesystem. REST-style APIs make it quite common to use paths as a way to divide up areas of an API. Your application might leverage some pathing in the URI as a way to separate features. For access control, too: if you want to restrict Googlebot, for instance, it doesn't make a lot of sense to put ?action=blah in a robots.txt file, which expects paths and file globs.
Apache mod_rewrite is awesome. I love it, I live it. I'd rather design in mod_rewrite to proxy a consistent URI space to a changing application codebase early, rather than use mod_rewrite as a bandage on an aging file structure or application layout.

passing GET parameters that look like URI directories

I've seen a lot of URIs that look something like this:
www.fakesite.net/stories/1234/man_invents_fire
and I was wondering if the /1234/man_invents_fire part of the URI is actually a pair of directories or if they are GET parameters (or something else). I've noticed that a lot of the time the /man_invents_fire segment is unnecessary (it can be removed with no consequences), which led me to believe that /1234/ is the id number for the story in a database table (or something along those lines).
If those segments of the URI are GET parameters, is there an easy way of achieving this?
If they aren't, what is being done?
(also, I am aware that CodeIgniter gives this kind of functionality, but I was curious to find out if it could be easily achieved without CodeIgniter. I am, however, generally working in PHP, if that is relevant to an answer)
Thanks
The easiest thing to do is route everything into a main index.php file and figure out your routing from there by running $pieces = explode("/", $_SERVER['REQUEST_URI']);
After installing/enabling mod_rewrite, make sure AllowOverride is not set to None in your Apache config (so that .htaccess files are read), then put this in your docroot's .htaccess file:
<IfModule mod_rewrite.c>
RewriteEngine On
# Make sure the file doesn't actually exist (needed so things like /images/header.jpg are not re-routed)
RewriteCond %{REQUEST_FILENAME} !-s
# Re-route everything else into index.php
RewriteRule . /index.php [L,QSA]
</IfModule>
That is called URL rewriting; google it and you will find a lot of information about it.
Implementing this in PHP is typically done via an .htaccess file and Apache's mod_rewrite module.
They make the URL like that so that people can easily bookmark it, and so it shows up nicely in search results.
It depends on what language you're using to decode it. In this case, it appears "stories" is the main script, "1234" is the id, and "man_invents_fire" is the title.
If you're using php, you can use the $_SERVER['PHP_SELF'] or $_SERVER['REQUEST_URI'] variable to decode it.
If you're planning to make a website like that, certain safety issues must be kept in mind. Look them up on Google; the key one to watch out for is SQL injection.
Just like permalinks in WordPress, this is typically done via Apache's mod_rewrite (or an equivalent thereof if not using Apache); however, you can also use a 404 error page handler to achieve the same result (but this is not usually recommended).
Typically, all page requests are redirected to a gateway that parses the requested URI to determine if it fits the specified pattern (in your case likely to be /{category}/{article_id}/{article_title}). From there, the gateway can typically use just the article_id to retrieve the appropriate content.
Depending on the system, category and article_title can usually be thrown away/ignored and are typically for SEO value; however, in some cases category might be used to augment article_id in some way (e.g.: to determine what DB table to query, etc).
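A hedged sketch of the gateway parsing described above; the exact pattern and the fetch step are assumptions for illustration:
<?php
// Match URLs of the form /stories/1234/man_invents_fire and keep only the numeric id.
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (preg_match('#^/stories/(\d+)(?:/[a-z0-9_-]+)?/?$#i', $path, $m)) {
    $articleId = (int) $m[1];   // the title segment is ignored; it exists mainly for readability/SEO
    // ...fetch the article by $articleId and render it...
} else {
    header('HTTP/1.0 404 Not Found');
}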
MVC frameworks, like Zend, also use a similar technique to determine which controller and method therein to execute. An example format for this type of use is /{module}/{controller}/{method}; however, this is highly customizable.
Well, you are kind of right in assuming that 1234 and man_invents_fire are parameters. They are not truly GET parameters in the sense that the HTTP protocol describes them, but they accomplish the same task while keeping the URL "friendly". The technique is called URL rewriting, and the web is full of info on it these days.
Here's an article about friendly URLs in PHP but I'm sure googling for the topic will render more useful results.
As some background information in addition to the answers before me, a URL is just that - a 'Uniform Resource Locator'. Although in the old days it often mapped 1:1 to a file/directory structure, all that is required by the HTTP spec is that it identifies a certain resource. Basically that means that, given a certain string, it should indicate a certain 'resource' on the server. In practice, in an HTTP environment this is usually implemented with a rewriting mechanism such as mod_rewrite. RFCs such as this one, http://www.ietf.org/rfc/rfc1738.txt, give a nice, albeit abstract, overview. The concepts only come to life after designing and implementing some non-obvious uses, though.
If you are using Symfony, then you can use the routing feature to do this.

What are the pros for using extension-less URLs?

What are the pros for using extension-less URLs?
For example, why should I change...
http://yoursite.com/mypage.html
http://yoursite.com/mypage.php
http://yoursite.com/mypage.aspx
to...
http://yoursite.com/mypage
And is it possible to have extension-less URLs for every page?
Update:
Are extension-less URLs better for site security?
The reason for extension-less URLs is that they are technology independent. If you want to change how your content is rendered, you do not have to change the URL.
W3: Cool URIs don't change
File name extension
This is a very common one. "cgi", even ".html" is something which will change. You may not be using HTML for that page in 20 years time, but you might want today's links to it to still be valid. The canonical way of making links to the W3C site doesn't use the extension....
Conclusion Keeping URIs so that they will still be around in 2, 20 or 200 or even 2000 years is clearly not as simple as it sounds. However, all over the Web, webmasters are making decisions which will make it really difficult for themselves in the future. Often, this is because they are using tools whose task is seen as to present the best site in the moment, and no one has evaluated what will happen to the links when things change. The message here is, however, that many, many things can change and your URIs can and should stay the same. They only can if you think about how you design them.
It's mostly done for aesthetic purposes.
There is a very minor potential security benefit (a user doesn't immediately know what language the backend code is written in) but this is negligible.
A related blog post.
People claim it makes for better SEO, even if I am not personally convinced of that. Many clients request these extension-less URLs nowadays, so it's just as well that it can be easily achieved.
If you are running IIS 7, you can switch the AppPool to run on the Integrated Pipeline, thereby removing the need to have specific extensions mapped to the ASP.NET engine. Once that is done, you can instruct Sitecore to use extension-less URLs in the web.config setting (assuming Sitecore 6):
<linkManager defaultProvider="sitecore">
  <providers>
    <clear />
    <!-- addAspxExtension defaults to "true"; set it to "false" for extension-less URLs -->
    <add name="sitecore" type="Sitecore.Links.LinkProvider, Sitecore.Kernel"
         addAspxExtension="false"
         alwaysIncludeServerUrl="false"
         encodeNames="true"
         languageEmbedding="asNeeded"
         languageLocation="filePath"
         shortenUrls="true"
         useDisplayName="false" />
  </providers>
</linkManager>
And you're set.
Be aware that early versions of Sitecore 6 had a few issues when running Integrated Pipeline. More information can be found here.
As stated, one advantage is that you do not tie URLs to a specific technology or language. Another is that it allows you to manage the output format from within the application if you wish to do so.
But this is relevant only within a "routed" code framework, where you would basically attach URL routes to code.
For instance, in my code library, you can specify the allowed output format of a URL by:
1) Setting an Accept header in the HTTP header
2) Attaching a valid extension to the URL
So the code for /my/simple/url.html, /my/simple/url.xml and /my/simple/url.json is exactly the same. The output manager is responsible for outputting the content in the proper format.
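A rough sketch of that kind of output selection, assuming a simple extension check with an Accept-header fallback; the function name and the supported formats are made up for illustration:
<?php
// Pick an output format from the URL extension, falling back to the Accept header.
function pick_format($uri, $accept)
{
    if (preg_match('/\.(html|xml|json)$/', $uri, $m)) {
        return $m[1];                                   // an explicit extension wins
    }
    if (strpos($accept, 'application/json') !== false) {
        return 'json';
    }
    if (strpos($accept, 'application/xml') !== false || strpos($accept, 'text/xml') !== false) {
        return 'xml';
    }
    return 'html';                                      // default
}

$format = pick_format($_SERVER['REQUEST_URI'], isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '');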
So if you change the underlying technology, you are still able to keep the same URL pattern within the new version of your application.
From there, since you are parsing the URL within your own code to extract the data, it generally gives you the opportunity to make SEO-friendly URLs, i.e. URLs that are more meaningful in terms of search engine indexing. You can then define more meaningful URL patterns within your web application structure.
Because the user does not need to know the technology behind a page.
Example: domain.com/Programs/Notepad
The only thing I can think of is to make it easier for the end user to remember/type; other than that I don't see any reason. I also ran this by our admin, and he says some people claim SEO benefits, but if he were to use it, he would use it for a small layer of security.
