mod_rewrite chaining? - php

I have a bootstrap php file that I am routing all requests through:
RewriteRule ^(.*)$ index.php?query=$1 [L]
Say I have a url like /books/moby-dick, but I need the URL to pass to the index file like /books/detail/moby-dick. Is there a way to "rewrite" /books/moby-dick to /books/detail/moby-dick before the last RewriteRule? I thought the Chain [C] flag would do it but I end up with "books/detail/moby-dick/moby-dick". Here's where I'm currently stuck:
RewriteRule ^books/([A-Za-z0-9\-]+)$ books/detail/$1 [C]
RewriteRule ^(.*)$ index.php?query=$1 [L]

Any rewrites that you perform will automatically flow down to subsequent rules in your rule set provided that you don't cause the process to end/restart with the L (which typically restarts when used in .htaccess) or N flag. You could remove the chaining and it would still work, although in that case you'd have to condition the second rule:
RewriteRule ^books/([A-Za-z0-9\-]+)$ books/detail/$1
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?query=$1 [L]
Note that if you do chain the rules together, if the request path does not match the first rule, the request won't be redirected to the bootstrap file.
None of that is the cause of the actual problem though. What happens is that Apache has decided that the request has path info (for reasons I'll have to look into), and after your rewrite it automatically appends that to the result. The supposed "path info" is /moby-dick, which is why it ends up appearing twice.
Luckily, since we didn't want it in the first place, we can discard it with the DPI flag. Keeping the above points in mind, the following will redirect a request to books/moby-dick to index.php?query=books/detail/moby-dick:
RewriteRule ^books/([A-Za-z0-9\-]+)$ books/detail/$1 [DPI]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?query=$1 [L]
(I made the assumption you wanted books/detail/name, although you also mentioned books/view/name)

In .htaccess rules (unlike the same rules in httpd.conf) the [L] flag starts all over at the top. You probably really want to use the [END] flag (but only later versions of Apache support it). I believe the reason you get your change repeated has nothing to do with the [C] flag, but rather is because that line is being executed twice. (In fact the only thing that saves you from an "infinite loop" is your "-f" test ultimately stops things).
Each modification in a .htaccess file is always "passed on" to the next line (there are no flags to either enable or disable this behavior). The little-used [C] flag seems to be useful mainly for nested conditionals and for very slight simplification of some awkward if-then-else structures, neither of which you're doing in the example. That's why I don't understand why you need the [C] flag at all.
The standard technique to avoid massive looping and repeating problems in older .htaccess files is to add a bit of boilerplate at the top, something like
RewriteCond %{ENV:REDIRECT_STATUS} !^[ /]*$
RewriteRule ^ - [L]
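If your Apache is new enough for [END] (it needs 2.3.9 or later), a rough sketch of the same routing without the boilerplate could look like the following. I have not tested it; it simply folds the books/detail rewrite into the bootstrap rule so that only one rule fires per request:

```apache
RewriteEngine On

# Special case: /books/<slug> is handed to the bootstrap as books/detail/<slug>.
# [END] stops rewriting completely, so the rule set is not run a second time.
RewriteRule ^books/([A-Za-z0-9\-]+)$ index.php?query=books/detail/$1 [END]

# Everything else that is not an existing file goes to the bootstrap unchanged.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?query=$1 [END]
```

Because each request is rewritten by a single rule and then leaves mod_rewrite, there is no second pass in which the slug could be appended again.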

How to write .htaccess file to get everything after slash as parameter?

I have a URL, i.e. "www.mysite.com". I want to send parameters via the URL in the following ways:
www.mysite.com/count
www.mysite.com/search_caption?query=huha
www.mysite.com/page=1
www.mysite.com/search_caption?query=huha&page=1
In each of these cases I want to load index.php page with parameters as follows for each case:
var_dump($_REQUEST) results into [count]
var_dump($_REQUEST) results into [query="huha"]
var_dump($_REQUEST) results into [page=1]
var_dump($_REQUEST) results into [query="huha",page=1]
How do I write .htaccess file to achieve this?
I am using this code but it is capturing only params after "?" and not everything after first slash
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#RewriteRule ^([^/]+)/?$ index.php?{REQUEST_FILENAME}=$1 [L,QSA]
RewriteRule .* /index.php [L]
Something like this should get close, though you really should rethink those strange URL patterns instead of trying to fix them afterwards with rewriting...
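RewriteEngine on
# Serve existing files and directories as they are
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule ^count/?$ index.php?count=1 [L]
RewriteRule ^page/(.*)$ index.php?page=$1 [L]
# Everything else goes to index.php with its query string intact
RewriteRule ^ index.php [L,QSA]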
Some notes:
the first three RewriteRules are exceptions necessary because your given requests do not follow a sane and common pattern. They appear somewhat chaotically chosen.
this certainly is not free of issues, I did not test it, only typed it down.
this assumes the "page" example is requested as /page/1, as discussed in the comments.
index.php actually has to exist as a file, otherwise this will result in a rewrite loop
Given all that these rewritings should happen:
www.mysite.com/count => index.php?count=1
www.mysite.com/search_caption?query=huha => index.php?query=huha
www.mysite.com/page/1 => index.php?page=1
www.mysite.com/search_caption?query=huha&page=1 => index.php?query=huha&page=1
Also note that the rules above are written for .htaccess style files. To be used as normal rules, that is inside the http server's host configuration, they would have to be written slightly differently. You should only use .htaccess style files if you really, really have to, meaning you have no access to the configuration files. You should always try to avoid those files if at all possible. They are notoriously error prone, hard to set up and debug, and they really slow the server down. So if you have access to the http server configuration, define such rules in there instead.
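For comparison, here is an untested sketch of how the same rules might look inside a virtual host. The ServerName and DocumentRoot values are placeholders I made up. The two main differences are that patterns in server context match the full URL path including the leading slash, and that REQUEST_FILENAME has not been mapped to the filesystem yet at that stage, so the file checks are built from DOCUMENT_ROOT and REQUEST_URI:

```apache
<VirtualHost *:80>
    # Placeholder values, adjust to your setup
    ServerName www.mysite.com
    DocumentRoot /var/www/mysite

    RewriteEngine on

    # Existing files and directories are served untouched
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
    RewriteRule ^ - [L]

    # Patterns and targets carry a leading slash in server context
    RewriteRule ^/count/?$ /index.php?count=1 [L]
    RewriteRule ^/page/(.*)$ /index.php?page=$1 [L]
    RewriteRule ^ /index.php [L,QSA]
</VirtualHost>
```

In this context [L] really ends processing for the request; the rule set is not restarted the way it is in a .htaccess file.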

Apache2 rewrite module, flag [L]

I have the following rewrite rules:
RewriteEngine ON
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z]+)/(js|css|img)/(.+\.jpg|gif|png|js|css)$ media/myfiles/$1/$2/$3 [L]
RewriteRule .* index.php
In my application I have a router class that can process URLs for my needs.
When I try to open a file whose extension matches the first rewrite rule, processing still moves on to the next rewrite rule, and my router class processes the URL...
Any ideas why Apache doesn't stop after the first rule matches?
P.S. The first rule works after disabling the second rule.
Take a look here: http://httpd.apache.org/docs/2.2/rewrite/flags.html
If you are using RewriteRule in either .htaccess files or in
<Directory> sections, it is important to have some understanding of
how the rules are processed. The simplified form of this is that once
the rules have been processed, the rewritten request is handed back to
the URL parsing engine to do what it may with it. It is possible that
as the rewritten request is handled, the .htaccess file or <Directory>
section may be encountered again, and thus the ruleset may be run
again from the start. Most commonly this will happen if one of the
rules causes a redirect - either internal or external - causing the
request process to start over.
(emph mine)
So what I think happens is that your first rule (the one flagged [L]) hits and rewrites the request internally. It doesn't call the bottom line. But then the rewritten request is handled like any other request: this time your regexp DOESN'T hit, and in this run the bottom line DOES come into play.
This is also why it works when you disable the bottom rule: the second time around there is nothing to do, so nothing happens.
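One way to deal with that second pass, as an untested sketch: give the catch-all rule a condition so that it skips URLs which have already been rewritten into media/. The RewriteCond on REQUEST_URI is my addition and not part of the original rules. I also grouped the extensions, on the assumption that the dot was meant to apply to all of them; as written, (.+\.jpg|gif|png|js|css) only requires it for jpg.

```apache
RewriteEngine On
RewriteBase /

# First pass: map asset URLs to their real location
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z]+)/(js|css|img)/(.+\.(?:jpg|gif|png|js|css))$ media/myfiles/$1/$2/$3 [L]

# Second pass: the URL now starts with /media/, so the router must not grab it
RewriteCond %{REQUEST_URI} !^/media/
RewriteRule .* index.php [L]
```

Note that the two !-f / !-d conditions only ever apply to the rule directly below them, which is why the catch-all needs a condition of its own.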

PHP & URL Rewriting

I need a little help figuring out what the following URL rewrite rule means. I can understand the first three lines, but I got stuck on the index.php/$1 part. What exactly does the / mean in this rule? The only thing I would expect to see after a file name is a query-string separator (?). This is the first time I have seen / used as a separator. What exactly does it mean?
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [PT,L]
</IfModule>
The <IfModule mod_rewrite.c>...</IfModule> block ensures that everything contained within that block is taken only into account if the mod_rewrite module is loaded. Otherwise you will either face a server error or all requests for URL rewriting will be ignored.
The following two lines are conditions for the RewriteRule line which follows them. It means that the RewriteRule will be evaluated only if these two conditions are met.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
These lines simply state that rewriting (RewriteRule line) will occur only if there are no existing files or folders on the server which match the URI. If they do exist then they will be served instead, unless there is some other directive that prevents it, otherwise rewriting will occur.
The last line will do the actual rewriting. It will take whatever is following the website domain name and append it to a rewritten request which will begin with index.php/.
Here is an example.
Let's say you make a request for example.com/example-page.html.
If there is no existing file or folder named example-page.html in the virtual host's root folder, the rewrite rule at the end will rewrite the request to look like example.com/index.php/example-page.html.
The main reason why applications rewrite requests like this is to ensure that they have a single point of entry, often called a bootstrap, which is considered to be a good practice from the security standpoint and also offers more granular control of requests.
Here is in my opinion a very good beginner friendly tutorial for mod_rewrite.
It's just rewriting the URL.
For example, this url:
http://www.example.com/something/else
Will be the same as:
http://www.example.com/index.php/something/else

.htaccess to capture the page name and send it as a GET parameter

I am trying to capture a url such as
http://www.mysite.com/somepage.php?sometext=somevalue
and redirect it to.
http://www.mysite.com/index.php?page=somepage.php&sometext=somevalue
I tried searching for such .htaccess online, but couldn't find it.
Can you please help me?
I'm quite sure this is a duplicate, but I'm having a bit of an issue finding it/them [Edit: I found one, though possibly not the best example].
Anyway, this is a fairly standard problem resolved with fairly standard code:
RewriteRule ^(.*)$ index.php?page=$1 [L,QSA]
The RewriteRule captures the entire request as $1, and passes it to index.php as the page GET parameter.
The [QSA] flag on the end says to take any existing GET parameters (sometext=somevalue in your example), and add them as additional GET parameters on the new request. (The [L] flag just says that this should be the last rule executed.)
Note that this will also redirect requests for things like images or CSS files, so it's good to add the following lines directly before this rule:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
These lines say "if the request is for a file or directory that actually exists, don't process the rule." That way, requests for real files will be served directly by Apache, rather than being handled (or more likely, mishandled) by your PHP script.
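Put together, a minimal .htaccess along these lines might look as follows. This is a sketch which assumes index.php sits in the same directory as the .htaccess file:

```apache
RewriteEngine On

# Skip anything that exists on disk (images, CSS, index.php itself)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Hand the requested path to index.php as "page" and keep the original query string
RewriteRule ^(.*)$ index.php?page=$1 [L,QSA]
```

One consequence of the two conditions: if somepage.php really exists as a file, it is served directly and never reaches index.php.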
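A RewriteRule pattern only sees the path, never the query string, so the query string has to be checked with a RewriteCond. Use the first pair for an internal rewrite or the second pair for an external redirect:
RewriteCond %{QUERY_STRING} ^sometext=
RewriteRule ^(.*)\.php$ index.php?page=$1.php [QSA,L]
RewriteCond %{QUERY_STRING} ^sometext=
RewriteRule ^(.*)\.php$ http://www.mysite.com/index.php?page=$1.php [QSA,R=301,L]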

Rewrite to index.php best practices

I notice that there are a few common ways to setup RewriteRules for MVC based PHP applications. Most of which contain:
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-d
Followed by a RewriteRule:
RewriteRule ^(.*)$ /index.php?$1 [L,QSA]
or
RewriteRule .* /index.php/$0 [PT,L]
I realize that L = LAST, QSA = query string appended, PT = pass through but as I don't have the real world experience of using these yet, could anyone inform me which flags and URI they would go with and why?
The latter rule contains a slash before the $0, I'm assuming because this forces it so the PATH CGI variable is populated, as often times I don't see it populated. Does the PT actually serve somewhat of the same purpose as the QSA, indirectly? Or how else would one use query strings? Basically, what are the pros and cons of these?
And just to confirm, if I wanted to add say an ErrorDocument directive would the L flag matter? Let's say a request to '/non-existing-link/' is made, my application cannot pick it up from the defined routes I have, nor is there an existing directory as such, would the L have any effect if I placed the ErrorDocument below the RewriteRule? Should I place it before the entire snippet? Same with 301s, 302s. And if I were to actually manually invoke 3xx/4xx codes, I would be using the header() function within my application, right? I kind of have a feeling this is quite dirty but is probably the most practical and only way of doing it hence it probably isn't dirty.
When the htaccess is read by the server, it goes line-by-line, trying to find a match. Without the L flag it will check every rule in the htaccess (though I'm not sure what happens if it finds multiple matches here).
If you include the L flag, when it gets to that rule, it will stop processing rules and serve the request. However, the gotcha here is that when it serves the request it will process the htaccess file from the beginning again with the new, rewritten URL. This page explains it well, with an example.
The ErrorDocument rule will be independent from the rewrite rules, so it doesn't matter where it comes (I usually put it at the top so it's obvious and not buried under a bunch of rewrites).
However, note that if a rewrite rule matches a valid file or script, the error document won't fire, even if the data/querystring is bogus. For example, if a URL gets rewritten to /index.php?page=NON_EXISTENT_PAGE then the server believes it has found the document. You will need to handle the parameter in the PHP script. Setting 404 headers in the PHP script won't automatically serve up the 404 document (but you can include it from the PHP script).
I have used the Zend Framework suggestion for MVC applications.
http://framework.zend.com/manual/en/zend.controller.html
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]
The ErrorDocument setting will have no effect. If files are not found by Apache, the request is handled by PHP (as defined by these rewrite rules). Once inside PHP, you have to stay inside. Setting the response code to an error value with header() will not invoke Apache's error handling. You have to make your own code to present a decent error page.
