passing GET parameters that look like URI directories


I've seen a lot of URIs that look something like this:
www.fakesite.net/stories/1234/man_invents_fire
and I was wondering whether the /1234/man_invents_fire segments of the URI are actually directories or GET parameters (or something else). I've noticed that the /man_invents_fire segment is often unnecessary (it can be removed with no consequences), which led me to believe that /1234/ is the id number for the story in a database table (or something along those lines).
If those segments of the URI are GET parameters, is there an easy way of achieving this?
If they aren't, what is being done?
(Also, I am aware that CodeIgniter gives this kind of functionality, but I was curious to find out if it could be easily achieved without CodeIgniter. I am, however, generally using PHP, if that is relevant to an answer.)
Thanks

The easiest thing to do is route everything into a main index.php file and work out your routing from there by running $pieces = explode("/", $_SERVER['REQUEST_URI']);
After installing/enabling mod_rewrite, make sure AllowOverride is not set to None in your Apache config (so that .htaccess is read), then put this in your docroot's .htaccess file.
<IfModule mod_rewrite.c>
RewriteEngine On
# Skip files that actually exist and are non-empty (needed for not re-routing things like /images/header.jpg)
RewriteCond %{REQUEST_FILENAME} !-s
# Re-route everything else into index.php
RewriteRule . /index.php [L,QSA]
</IfModule>
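As a rough sketch of what index.php could do with the request once everything is routed into it: the helper name pathSegments is made up for illustration, and it also strips the query string and empty segments, which a bare explode() would leave in.

```php
<?php
// Hypothetical helper: split a request URI into its non-empty path segments.
function pathSegments(string $requestUri): array
{
    // Keep only the path, dropping any "?query=string" part.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    // explode() yields empty strings for leading/trailing slashes, so filter them out.
    $segments = array_filter(explode('/', $path), fn ($s) => $s !== '');
    // Re-index from 0 and decode percent-encoded characters in each segment.
    return array_values(array_map('rawurldecode', $segments));
}

// In index.php this would be called as pathSegments($_SERVER['REQUEST_URI']).
$pieces = pathSegments('/stories/1234/man_invents_fire?ref=home');
// $pieces[0] is the section, $pieces[1] the id, $pieces[2] the (optional) title.
```

From there you would typically switch on $pieces[0] to pick the script or function that handles the request.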

That is called URL rewriting. Search for that term and you will find a lot of information about it.

With PHP, this is typically done via an .htaccess file and Apache's mod_rewrite module.

They make the URL like that so that people can easily bookmark it and so that it shows up well in search results.
How you decode it depends on the language you're using. In this case, it appears "stories" is the main script, "1234" is the id, and "man_invents_fire" is the title.
If you're using PHP, you can use the $_SERVER['PHP_SELF'] or $_SERVER['REQUEST_URI'] variable to decode it.
If you're planning to make a website like that, keep security in mind. Look the details up on Google, but a key one to watch out for is SQL injection.

Just like permalinks in WordPress, this is typically done via Apache's mod_rewrite (or an equivalent if not using Apache); however, you can also use a 404 error page handler to achieve the same result (but this is not usually recommended).
Typically, all page requests are redirected to a gateway that parses the requested URI to determine if it fits the specified pattern (in your case likely to be /{category}/{article_id}/{article_title}). From there, the gateway can typically use just the article_id to retrieve the appropriate content.
Depending on the system, category and article_title can usually be thrown away/ignored and are typically for SEO value; however, in some cases category might be used to augment article_id in some way (e.g.: to determine what DB table to query, etc).
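A minimal sketch of such a gateway check, assuming the /{category}/{article_id}/{article_title} pattern above. The function name and the exact character classes are my own choices, not something any particular system uses.

```php
<?php
// Hypothetical gateway helper: pull the parts out of /{category}/{article_id}/{article_title}.
function matchArticleUri(string $path): ?array
{
    // The title segment is optional, since it is usually only there for SEO.
    $pattern = '#^/(?<category>[a-z0-9_-]+)/(?<article_id>\d+)(?:/(?<article_title>[^/]*))?/?$#i';
    if (!preg_match($pattern, $path, $m)) {
        return null; // not an article URL; the gateway would answer with a 404 here
    }
    $title = $m['article_title'] ?? '';
    return [
        'category'      => $m['category'],
        'article_id'    => (int) $m['article_id'],
        'article_title' => $title !== '' ? $title : null,
    ];
}
```

Only article_id would normally be used for the database lookup (ideally through a prepared statement); the other two values can be ignored or used to redirect to the canonical URL.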
MVC frameworks, like Zend Framework, also use a similar technique to determine which controller, and which method in it, to execute. An example format for this type of use is /{module}/{controller}/{method}; however, this is highly customizable.

Well, you are kind of right in assuming that 1234 and man_invents_fire are parameters. They are not truly GET parameters in the sense that the HTTP protocol describes them, but they accomplish the same task while keeping the URL "friendly". The technique is called URL rewriting, and the web is full of info on it these days.
Here's an article about friendly URLs in PHP but I'm sure googling for the topic will render more useful results.

As some background information in addition to the answers before me, a URL is just that: a 'Uniform Resource Locator'. Although in the old days it often mapped 1:1 to a file/directory structure, all the HTTP spec requires is that it identifies a certain resource. Basically that means that a given string should indicate a certain 'resource' on the server. In practice, in an HTTP environment this is usually implemented with a rewriting mechanism such as mod_rewrite. RFCs such as this one: http://www.ietf.org/rfc/rfc1738.txt give a nice, albeit abstract, overview. The concepts only come to life after designing and implementing some non-obvious uses, though.

If you are using Symfony, then you can use the routing feature to do this.

Related

Dynamically create URLs

Something that is really confusing me is how sites have urls such as:
http://example.com/shop/
-and-
http://example.com/shop/product-category/games/
Originally, I thought you could simply just create a php file and use URL rewrite in the .htaccess. For example:
http://example.com/shop.php
http://example.com/shop.php/product-category.php/games.php (It just doesn't make sense).
Clearly, this is not the case: not only is it inefficient as the number of path segments grows, but CMSs such as WordPress automatically generate the content for the 'dynamic' URLs and serve a 404 page if the URL is not recognised.
I want to implement this on my site, but I'm completely clueless about how to approach it. I am struggling to research the topic in depth because I don't know what this system is actually called. Obviously, I'm looking at this completely wrong, which is causing me to confuse myself.
Could someone explain what this system is called, and how I can do it? I'd prefer not to get spoon-fed code; I just need someone to explain it all to me.
EDIT: After some further researching, I have found a perfect example to make my question clearer. Notice this url:
https://gamurs.com/g/csgo/players
shares the start of its URL with:
https://gamurs.com/g/csgo
But shows different content, it's a completely new page, that is being 'dynamically' created.
Then more random URLs from the same site:
https://gamurs.com/articles/world-esports-association-announced
https://gamurs.com/coaches
Thanks,
Sutton

If you were to use a framework like Laravel, routing becomes very easy.
There is a "routes.php" file in every project that maps a URL to a controller method, which will generally return a view.
Creating something like this is as easy as adding a line like this
Route::get('home', 'PagesController@home');
to your routes file. When you go to example.com/home it will fire the controller called 'PagesController' and run its home method. Within this method you could return a view that will be the page you want to display.
This can be expanded in other ways.
You could have another route that is nearly the same but uses another method for when someone sends POST data to the same URL.
Route::post('home', 'PagesController@submittedhome');
So now you have another route that takes POST input to that page and fires a completely different method on the same controller. These kinds of controls, and many more, let you achieve what you want really easily and are part of the core fundamentals of Laravel.
Here is the Routing page in the Laravel documentation, which can illuminate you a little more.

Many frameworks use the term URL rewriting without referring to .htaccess. Frequently it's called routing. WordPress does this by using page slugs.
In WordPress, .htaccess lets requests for existing files and directories straight through, bypassing WP. Anything that doesn't match an existing file or directory must be a "virtual" page and is sent to index.php.
The URI is then parsed in one way or another, usually involving regular expressions. Each part of the URI path correlates to a "slug"; these are then used to build a database query that generates the relevant content.
There's a lot more going on in terms of selecting specific templates for certain slug types.

A low-tech approach to achieve the same outcome without routing is to create a folder with the desired name and put an index.php (or index.html) in it. When the URL is requested, the server's default is to open the index file within the folder, even if it is not specified in the URL.
therefore
http://example.com/shop.php/product-category.php/games/
would be the equivalent of calling
http://example.com/shop.php/product-category.php/games/index.php
Note that I am NOT advocating this (I think it would get messy very quickly and there are far more efficient solutions), nor am I suggesting that the given examples are doing it this way, but I wanted to post this because it is a viable method of producing URLs without file names or indexes in them. Just not a very good one IMO.

If someone could explain to what this system is called
It's called URL rewriting. In Apache it is done by mod_rewrite, which is switched on with the RewriteEngine directive. Other web servers have their own equivalents, but this is the most popular.
RewriteEngine on
RewriteRule ^shop/$ shop.php [L]
RewriteRule ^shop/products/(.*?)/$ products.php?type=$1 [L]
The 1st rule will display the content of shop.php when a user accesses www.site.com/shop/
The 2nd rule will send games, as the argument ($1), to products.php?type=$1 when a user accesses www.site.com/shop/products/games/
[L] is called a flag (L|last):
The [L] flag causes mod_rewrite to stop processing the rule set. In
most contexts, this means that if the rule matches, no further rules
will be processed. This corresponds to the last command in Perl, or
the break command in C. Use this flag to indicate that the current
rule should be applied immediately without considering further rules.
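^ anchors the start of the path being matched, which in .htaccess is relative to that directory (somesite.com/^)
$ anchors the end of the path (somesite.com/^somedir/test/$); the query string is not part of what the pattern is matched against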
Resources:
Learn more about mod_rewrite and rewrite Flags
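To see what those two patterns do and don't match, here is an illustration that mimics them with preg_match(). This is only a stand-in for experimenting: in reality mod_rewrite does the matching inside Apache, and simulateRewrite is a made-up name.

```php
<?php
// Illustration only: mimic the two RewriteRule patterns above in PHP.
// In .htaccess context the path is matched without its leading slash.
function simulateRewrite(string $path): ?string
{
    if (preg_match('#^shop/$#', $path)) {
        return 'shop.php';
    }
    if (preg_match('#^shop/products/(.*?)/$#', $path, $m)) {
        // $m[1] plays the role of $1 in the RewriteRule substitution.
        return 'products.php?type=' . $m[1];
    }
    return null; // no rule matched; Apache would leave the request unchanged
}
```

Note that both patterns require the trailing slash, so /shop/products/games (without it) would not be rewritten by these rules.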

Url representation in php

I am used to representing embedded url information like this:
http://test.com/reports/statement.php?company=ABC&q=1
how would I do it like this instead?
http://test.com/reports/ABC/Q1

You need to use Apache mod_rewrite to achieve this.
If your server has it enabled, you could do something like this in the .htaccess file of your document root:
RewriteEngine on
RewriteRule ^reports/([^/.]+)/Q([0-9]+)/?$ /reports/statement.php?company=$1&q=$2 [L]

You can use $_SERVER['PATH_INFO'] to access anything in the URL docpath after the address of your script.
e.g.
http://test.com/reports/statement.php/ABC/Q1
...then in statement.php you would have the string "/ABC/Q1" in $_SERVER['PATH_INFO']
Of course, you'll need to setup your webserver to match the URL and target the correct script based on the HTTP request.
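A small sketch of how statement.php might pick that string apart, assuming the second segment always has the form Q followed by digits. The function name is hypothetical.

```php
<?php
// Hypothetical helper for statement.php: turn PATH_INFO such as "/ABC/Q1" into parameters.
function parseStatementPath(string $pathInfo): ?array
{
    $parts = explode('/', trim($pathInfo, '/'));
    // Expect exactly two segments, the second being "Q" plus a number.
    if (count($parts) !== 2 || !preg_match('/^Q(\d+)$/', $parts[1], $m)) {
        return null;
    }
    return ['company' => $parts[0], 'q' => (int) $m[1]];
}

// In statement.php: $params = parseStatementPath($_SERVER['PATH_INFO'] ?? '');
```

A null result means the URL did not have the expected shape, in which case the script would normally respond with a 404.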

As stated by others, you have to use url rewriting.
A PHP application that makes use of it usually applies the pattern called Front Controller.
This means that almost every URL is rewritten to point to a single file, where $_SERVER['PATH_INFO'] is used to decide what to do, usually by matching it against patterns you define for your actions, or by returning a 404 error if the URL doesn't match any of the specified patterns.
This is called routing, and every modern PHP framework has a component that helps with this work.
A smart move would also be to provide a tool that generates URLs for your resources instead of handwriting them, so that if you change a URL pattern you do not have to rewrite it everywhere.
If you need a complex routing system, check out the routing component of some major frameworks out there, e.g. Symfony2, Zend Framework 2, Aura.
For the simplest PHP router out there, check out GluePHP instead. The codebase is tiny, so you can get an idea of how the stuff works, and even implement a small router that fits your needs if you want to.
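To give an idea of how little is needed, here is a minimal router sketch in that spirit. It is not GluePHP's actual code; the route patterns, handlers and the dispatch name are all invented for the example.

```php
<?php
// Minimal front-controller router sketch: regex pattern => handler.
function dispatch(array $routes, string $path): array
{
    foreach ($routes as $pattern => $handler) {
        if (preg_match($pattern, $path, $matches)) {
            // Pass only the captured groups (not the full match) to the handler.
            return [200, $handler(...array_slice($matches, 1))];
        }
    }
    // No pattern matched: this is where the 404 comes from.
    return [404, 'Not Found'];
}

// Made-up routes for illustration.
$routes = [
    '#^/$#'                    => fn () => 'home page',
    '#^/reports/(\w+)/Q(\d)$#' => fn ($company, $q) => "report for $company, quarter $q",
];

// In the front controller: [$status, $body] = dispatch($routes, $_SERVER['PATH_INFO'] ?? '/');
```

Real routing components add named routes, HTTP method checks and URL generation on top of this basic loop.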

Htaccess rewrite for user profile and possible conflicts

I need to ask something about htaccess redirection. I know there are lots of questions about htaccess, rewriting and pretty profile URLs, but I've never found a real answer to my question, and I hope I can find one with your help.
As you know, pretty URL rules work by changing "mydomain.com/profile.php?username=myuser" to "mydomain.com/myuser".
But let's say I have a rewrite rule for my login URL: www.mydomain.com/login
If a user tries to register the exact username "login", how do you handle that conflict in the rewrite rules?
One possible solution is a minimum length, e.g. usernames of at least 6 characters, but that isn't elegant, since your own routes then can't be 6 characters or longer, like "/resetpassword".
A "banned words" array that is checked when the user picks a username would probably be a solution, but then you need to foresee every name that shouldn't be used.
Many giant websites use these rewrite methods. Facebook in particular uses a "/username" kind of rule for pages and users at the same time.
Anyway, if someone knows the magic behind this kind of URL redirection/rewrite rule, please help me out :)
Thanks
P.S.: I know there is another solution like "/user/username", but nowadays pointing directly at the base URL and shortening the full URL is getting more and more popular, and I just need to understand the possibilities.

Why not just have a login sub directory in the root of your site that contains the relevant files for logging a user in? That way, the rewrite rules in your htaccess file only have to deal with the whole user redirect stuff.

What you're looking for is something called "routes". They're typically implemented by MVC frameworks like Zend Framework, CakePHP or Symfony.
What they essentially do is forward every request to some index.php, which in turn figures out from $_SERVER['REQUEST_URI'] which PHP files should handle the request.
I wouldn't recommend having PHP write rewrite rules into your .htaccess file. Instead, try getting into PHP frameworks. They do the heavy lifting for you.
Personally, I use Zend Framework. But I wouldn't recommend the new version 2 to beginners. Try ZF1. It's actually pretty easy to get into.
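Without a framework, the username conflict from the question can be sketched as two small checks in PHP: refuse reserved names at sign-up, and let fixed routes win when resolving a path. The list contents and function names below are assumptions for illustration only.

```php
<?php
// Hypothetical list of first-level paths the application itself uses.
const RESERVED_PATHS = ['login', 'logout', 'resetpassword', 'profile'];

// Used at sign-up: reject usernames that would collide with an application route.
function isUsernameAllowed(string $username): bool
{
    return !in_array(strtolower($username), RESERVED_PATHS, true);
}

// Used by the front controller: fixed routes win, anything else is treated as a username.
function resolve(string $path): array
{
    $segment = strtolower(trim($path, '/'));
    if (in_array($segment, RESERVED_PATHS, true)) {
        return ['page', $segment];
    }
    return ['profile', $segment];
}
```

Keeping both checks on the same list means a new route only has to be added in one place, although usernames registered before the route existed still need handling.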

Is it better to handle friendly/clean/pretty URLs with mod_rewrite or a language like PHP?

I'm developing my first decent-sized PHP site, and I'm a bit confused about what the "right way" (assuming there ever is such a thing) to handle clean/friendly/pretty URLs in the application.
The way I see it, there are two main options (I'll use a simplified social news site as an example):
1. Use mod_rewrite to handle all potential URLs. This would look similar, but not identical, to the following:
RewriteRule ^article/?([^/]*)/?([^/]*)/?([^/]*) /content/articles.php?articleid=$1&slug=$2
RewriteRule ^users/?([^/]*)/?([^/]*) /content/users.php?userid=$1&username=$2
RewriteRule ^search/?([^/]*)/? /content/search.php?query=$1
2. Pass everything to some handler script and let it worry about the details:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) handler.php?content=$1
Clearly this is all untested "air code," but you get the point.
Is one of these two ways going to be seriously slower than the other? Presumably the mod_rewrite is slower, since I'll be forced to use .htaccess files for this.
Are there serious disadvantages to either of these approaches?
Is there a "best practice" for this sort of thing, or is it something that each developer tends to decide for themselves? I know WordPress uses option two (though it was more trouble than it was worth when I investigated exactly how they did it).

Option 1 (.htaccess and several .php files) was often used "in the past"; now, I see option 2 (every request going through one .php file) used a lot more.
The main advantages I see with option 2 are:
- you can add / modify any kind of URL without having to change any physical file like .htaccess
  - which means the format of the URLs can be configured in the admin section of your application, for example
- you only have one entry point to your PHP code
  - which means everything goes through index.php: if you need some code executed for all requests, put it there, and you're sure it'll always be executed
  - that's used a lot with MVC frameworks, for instance
A couple of years ago, I would have gone with option 1; now that I use MVC and frameworks, I always go with option 2.
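For what it's worth, here is a rough sketch of what the single entry point for option 2 could look like, reusing the section and parameter names from the option 1 rules in the question. The function name and the mapping table are assumptions, not a recommended layout.

```php
<?php
// Sketch for handler.php: map the "content" value passed by the RewriteRule to a section and its parameters.
function routeContent(string $content): ?array
{
    // Parameter names per section, borrowed from the option 1 rewrite rules.
    $map = [
        'article' => ['articleid', 'slug'],
        'users'   => ['userid', 'username'],
        'search'  => ['query'],
    ];
    $parts = explode('/', trim($content, '/'));
    $section = array_shift($parts);
    if (!isset($map[$section])) {
        return null; // unknown section: respond with a 404
    }
    $params = [];
    foreach ($map[$section] as $i => $name) {
        $params[$name] = $parts[$i] ?? '';
    }
    return ['section' => $section, 'params' => $params];
}

// In handler.php: $route = routeContent($_GET['content'] ?? '');
```

Adding a new URL format then means editing the $map array (or loading it from configuration) rather than touching .htaccess.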

Really this is the "are frameworks worth using?" question in disguise.
Using mod_rewrite to define your URL routes is quick and easy (if you understand regular expressions...) but your application code is oblivious to the URLs unless you duplicate the information somewhere.
Usually, people duplicate this information many times without thinking about it, by hard-coding URLs in the links in their views, or in redirects. This is really messy, and will one day cause pain when you decide to change the URL structure of your site halfway through development. You're bound to miss one and end up with a 404 somewhere.
Using a routing component in your application (such as the one in Symfony) means you can attach names to your routes, allowing you to define your URLs once and re-use them many times:
# apps/frontend/config/routing.yml
homepage:
  url:   /
  param: { module: default, action: index }
This makes it really easy to link to pages of your site without repeating yourself:
<?php echo url_for('@homepage') ?>

Use option #2. Why? RewriteRules in .htaccess are a powerful tool, but they're rather static. I mean you cannot easily manage them using PHP (or whatever you're going to use). Also, .htaccess doesn't provide as much flexibility, though it has some advantages (for example, it's a bit faster).
Option #2 also needs .htaccess, as you noticed, but in most cases the RewriteRule takes the following form:
RewriteRule (.*) index.php
where index.php is your front controller.
The biggest advantage (IMO) of this solution is that each route is described in PHP (or whatever you use), so accessing and modifying these routes is much easier. Furthermore, these routes can then be used not only for turning a URL into a set of variables, but also the opposite way: to create a URL from a set of variables.
I think the following example (from the Symfony framework) will explain what I am talking about:
// apps/.../config/routing.yml - Describes routing rules
post:
  url:          /read/:id/:slug
  param:        { module: blog, action: index }
  requirements: { id: \d+, slug: \w+ }
// apps/.../modules/blog/templates/indexSuccess.php - template for index action
<?php echo link_to($post['title'], '@post?id=' . $post['id'] . '&slug=' . $post['slug']); ?>
//creates: My first blog post
Now whenever you edit your routing.yml file and change /read/:id/:slug into /:slug_:id.html, all the links in your application will turn into /my-first-blog-post_123.html.
Doing this and similar things is much easier when you use option #2.
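The "create a URL from a set of variables" direction can be sketched in plain PHP as well. This is not how Symfony implements it; urlFor and the route table are made-up names that only illustrate the idea behind the routing.yml example above.

```php
<?php
// Hypothetical named-route table, loosely modelled on the routing.yml example.
$routes = ['post' => '/read/:id/:slug'];

// Build a URL from a route name and its parameters (the reverse of matching).
function urlFor(array $routes, string $name, array $params): string
{
    return preg_replace_callback(
        '/:([a-z]+)/',                                    // each ":name" placeholder
        fn ($m) => rawurlencode((string) $params[$m[1]]), // is replaced by its value
        $routes[$name]
    );
}

// Templates only ever mention the route name, never the URL itself.
$link = urlFor($routes, 'post', ['id' => 123, 'slug' => 'my-first-blog-post']);
```

Changing the pattern stored under 'post' changes every generated link at once, which is the point being made above.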

As far as I can see, any possible performance differences between those methods are really minuscule and relevant only for really, really high-traffic sites.
I think there is no "best practice" as such; both methods are used equally often. If your project structure allows it, and you're more at home with parsing the URL in PHP (where the rest of your project is), put everything through one controller file, and let your application handle the rest.
If performance is really of the essence, I suspect that having Apache handle the addresses is faster, because there is no interpreted language in between. (I have no hard data for this, though.) But as I said, you're probably best off choosing whichever is going to be most maintainable for you in the long term.

Drupal, a popular PHP-based content management system, appears to provide clean URLs using a combination of mod_rewrite rules in .htaccess and plug-in Drupal modules such as path and pathauto.
Given the success and popularity of this tool, and its ability to run on the most modest of shared hosting, I think this would be your answer.

php Zend / MVC without mod_rewrite

I've seen it mentioned in many blogs around the net, but I believe it should be discussed here.
What can we do when we have an MVC framework (I am interested in ZEND) in PHP but our host does not provide mod_rewrite?
Are there any "short-cuts"? Can we transfer control in any way (so that a mapping may occur between pages)? Any ideas?
Thank you :-)

Zend Framework should work without mod_rewrite, if you can live with your URLs looking more like "/path/to/app/index.php/controller/action". If you had mod_rewrite you could do away with the "index.php" bit, but it should work without it too.
It's all a matter of setting up the routes to accept the index.php part.
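To illustrate what "accepting the index.php part" amounts to, here is a plain-PHP sketch that pulls controller and action out of such a URL. It is not Zend Framework's router; the function name is invented, and the fallback to "index" for missing parts is my assumption.

```php
<?php
// Sketch: extract controller and action from a URL that still contains index.php.
function controllerAction(string $requestUri, string $script = '/index.php'): array
{
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    $pos = strpos($path, $script);
    // Everything after ".../index.php" is the virtual path.
    $rest = $pos === false ? $path : substr($path, $pos + strlen($script));
    $parts = array_values(array_filter(explode('/', $rest), 'strlen'));
    // Assumed defaults when controller or action is missing.
    return [$parts[0] ?? 'index', $parts[1] ?? 'index'];
}
```

On servers that populate it, $_SERVER['PATH_INFO'] already holds the part after index.php, which makes the strpos() step unnecessary.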

OK, my verdict :-): I have successfully used Zend without mod_rewrite, and it's as you've all said: site/index.php/controller/action. I knew that before posting this. I've also found a technique around the net that "pushes" 404 pages to index.php, so that whatever is not a real resource (e.g. CSS, images, etc.) gets there, with one exception: POST values.
So I decided that the next time an application has to be built on that specific server, I will ask politely for mod_rewrite. If the administrator cannot provide it, I will talk with my boss or, if it is for me, switch provider.
Generally, it is a shame sometimes that the PHP market is so fragmented (php4, php5, php6, mod_rewrite, mod_auth, mod_whatever), but this is another story...

mod_rewrite is almost essential in today's hosting environment, but unfortunately not everyone got the message.
Lots of the large PHP programs (I'm thinking Magento, but most can cope) have a pretty-URL fallback mode for when mod_rewrite isn't available.
URLs end up looking like www.site.com/index.php?load-this-page
They must be running some magic to grab the variable name from the $_GET variable and use it as the selector for which module/feature to execute.
On a related note, I've seen lots of messed-up URLs on the new Facebook site where it's using the #, so links look like www.new.facebook.com/home.php#/inbox/. Clearly we're not meant to see that. Since browsers never send the part after # to the server, that routing is presumably done in JavaScript rather than by parsing $_SERVER['REQUEST_URI'].

If you can find a non-mod_rewrite way to redirect all requests to index.php (or wherever your init script is), you can, as mentioned above, use 'REQUEST_URI' to grab the portion of the address after the domain, parse it as you like and make the request do what you want it to. This is how WordPress does it (granted, with mod_rewrite). As long as you can redirect requests to your index page while retaining the same URI, you can do what you need to process the request.

Drupal's rewrite rules translate
http://example.com/path/goes/here
into
http://example.com/index.php?q=path/goes/here
...and has logic to decide which flavor of URLs to generate. If you can live with ugly URLs, this would let you keep all the logic of a single front controller in place w/o relying on URL rewriting.
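The "two flavors" idea can be sketched like this. It is not Drupal's code; the function names and the $cleanUrls flag are hypothetical stand-ins for whatever setting the application uses.

```php
<?php
// Generate either a clean URL or the ?q= fallback, depending on a hypothetical setting.
function buildUrl(string $path, bool $cleanUrls): string
{
    $path = ltrim($path, '/');
    return $cleanUrls ? '/' . $path : '/index.php?q=' . $path;
}

// Front controller side: the internal path arrives in the "q" parameter either way,
// because the rewrite rule turns the clean form into the ?q= form.
function internalPath(array $query): string
{
    return trim($query['q'] ?? '', '/');
}

// In index.php: $path = internalPath($_GET);
```

Because all links go through one generator function, switching between the two flavors is a configuration change rather than a code change.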
