help with setup of .htaccess file redirects - php

I need help configuring my .htaccess file to handle redirects properly.
Here’s what I need to have happen. Stackoverflow's spam filter wouldn't allow me to post the full domain, so where I say "DOMAIN" you can substitute "domain.com". (I also needed to add an extra t to the http.)
1. Requests for the DOMAIN/page version of the file should be redirected to www.DOMAIN/page.
2. Requests for the 'friendly' versions of the URLs should be allowed. So a file that is really at www.DOMAIN/index.php?q=37 should be viewable by going to www.DOMAIN/latest-news
3. I have a big list of 301 redirects. We recently changed the site from an .asp based CMS to one written in PHP.
Example:
redirect 301 /overview.asp http://www.DOMAIN/overview
Items 1 and 2 are working fine.
However for item 3, if I put in a browser request for "http://www.DOMAIN/overview.asp" instead of redirecting to the friendly name of the file ("http://www.DOMAIN/overview") it will redirect to http://www.DOMAIN/index.php?q=overview.asp. This is the problem.
What do I need to change to get this working right?
My configuration is below:
## Fix Apache internal dummy connections from breaking [(site_url)] cache
RewriteCond %{HTTP_USER_AGENT} ^.*internal\ dummy\ connection.*$ [NC]
RewriteRule .* - [F,L]
## Exclude /assets and /manager directories and images from rewrite rules
RewriteRule ^(manager|assets)/*$ - [L]
RewriteRule \.(jpg|jpeg|png|gif|ico)$ - [L]
## For Friendly URLs
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.DOMAIN/$1 [R=301,L]
redirect 301 /overview.asp http://www.DOMAIN/overview
redirect 301 /news.asp http://www.DOMAIN/news
# ETC....
thanks!

Mod_rewrite is doing exactly what you're asking it to do ... (yes :-), that's often the problem with computers).
On the /overview.asp http://www.DOMAIN/overview line you're telling the browser to send out a brand new request from scratch, which starts the whole cycle again from the top and gets caught by the ^(.*)$ index.php?q=$1 directive.
Right before this line you should put another RewriteCond to prevent the ^(.*)$ rule from applying if REQUEST_FILENAME is either overview or news. You might also simply rewrite /overview.asp to overview [L] instead of redirecting.
If you can, set the RewriteLog directive to its highest verbosity and look at the logfile - it usually gives very good insights into what's really going on...
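As a rough sketch of what that logging setup can look like: RewriteLog and RewriteLogLevel exist only in Apache 2.2 and earlier, while Apache 2.4 folds rewrite logging into LogLevel. Neither form is allowed in .htaccess, so this assumes access to the server or virtual host config. The log path is a made-up example.

```apache
# Apache 2.2 and earlier (server or virtual host config only).
# The path below is a hypothetical example; 9 is the most verbose level.
#RewriteLog "/var/log/apache2/rewrite.log"
#RewriteLogLevel 9

# Apache 2.4 and later: rewrite tracing goes to the normal error log.
# trace8 is the most verbose level; trace3 is usually enough to follow a request.
LogLevel warn rewrite:trace8
```

Use only the form that matches your Apache version, and switch the tracing off again afterwards, since it slows the server down considerably.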
EDIT - if I get it right you should be doing this:
RewriteCond %{REQUEST_FILENAME} ! \.asp$
RewriteCond %{REQUEST_FILENAME} ! ^overview$
RewriteCond %{REQUEST_FILENAME} ! ^news$
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
This would prevent any request already ending in .asp, plus those looking for overview and news, from being rewritten to index.php.
I suspect anyway that you got something backwards regarding that SEO stuff. You should indeed start from the structure of the query string that your scripts expect and use that as a base to build a sensible URL addressing schema.
EDIT #2:
There was one space too many between the bang mark and the regex. Unlike the previous snippet, the following code doesn't come from memory - I've tested it on my local Apache and it does what it's supposed to do (as long as I've understood correctly..)
RewriteCond %{REQUEST_FILENAME} !\.asp$
RewriteCond %{REQUEST_FILENAME} !overview$
RewriteCond %{REQUEST_FILENAME} !news$
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
Hope this helps
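For comparison, here is a minimal sketch of the other option mentioned above: handling the legacy .asp addresses with mod_rewrite itself, placed before the catch-all rule, instead of mixing in mod_alias redirect lines. It is untested against the asker's CMS, and domain.com is just the placeholder host from the question.

```apache
RewriteEngine On

# 1. Canonical host first (domain.com is the placeholder from the question)
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

# 3. Legacy .asp pages: external 301s issued before the catch-all can see them
RewriteRule ^overview\.asp$ /overview [R=301,L]
RewriteRule ^news\.asp$ /news [R=301,L]

# 2. Friendly URLs last: anything that is not a real file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
```

Because every step is a RewriteRule, the order in the file is the order of evaluation, which makes it easier to reason about than a mix of the two modules.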

Specific redirection in .htaccess does not work

I have a specific problem with my mod_rewrite configuration that I cannot resolve. I am no admin, so I'm kindly asking for some collective advice :) Please note - it's not a general question about redirection, but a very specific one.
Story
I have shared hosting with FTP access and the ability to create my own .htaccess files. This shared hosting had plenty of files and directories before I created the website, so the logical step for me was to place everything inside a new-site folder.
Then I had to create custom rewrite rules so that everything under example.com points to new-site.
CONFIG
So I came up with the following config.
# (...) other rules
# 1. Make sure that /new-site/ is not a duplicated content
RewriteCond %{REQUEST_URI} ^/new-site/
RewriteRule ^/new-site/(.*)$ /$1 [R=301,L]
# 2. Make sure that example.com is internally handled by files in '/new-site'
RewriteCond %{REQUEST_URI} !^/new-site/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /new-site/$1
RESULTS
The rule marked 2. works fine; my site is accessible as I want. However, I didn't want https://example.com/new-site/ to be found on the server by bots and treated as duplicate content, so I added rule 1.
This rule, however, doesn't seem to have any effect! I looked it up with CURL and request is handled immediately with a 200 status. I'm banging my head against the wall and experimenting with other variants of it, but everything fails.
What I'm after is pretty darn simple:
Make every request to the root domain be handled by website which is stored in /new-site/
Make sure that direct call to https://example.com/new-site/(.*) is redirected with 301 status back to the domain root.
What am I doing wrong?
EDIT
I've noticed that my setup seems to be doing far better if I remove a child .htaccess file under /new-site/ subfolder. I didn't mention it in my original question because there is nothing special about it (just some SEO rewrites).
RewriteEngine on
DirectoryIndex index.php
RewriteRule ^products$ products.php
# (...) similar rewrites
Old answer: in .htaccess, a RewriteRule pattern does not match with a leading slash. Try changing it to
RewriteRule ^new-site/(.*)$ /$1 [R=301,L]
Edit:
The version you provided will lead to a redirect loop. To avoid it, I think you can use an .htaccess like this:
RewriteEngine on
RewriteRule ^new-site/ - [R=404,L]
# 2. Make sure that example.com is internally handled by files in '/new-site'
RewriteCond %{REQUEST_URI} !^/new-site/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /new-site/$1 [L]
Direct requests for /new-site/* will receive a 404 error, while URLs like example.com/* will be rewritten internally to /new-site.
Also note that if there are files with the same name, for example /r.jpg and /new-site/r.jpg, the latter can never be reached.
Your first rule never matches because it must not begin with a leading slash.
With RewriteRule, you only need a leading slash if you're directly in httpd.conf (server or virtual host context), or before Apache v2.4, I think.
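To make the difference concrete, here is a small sketch of the same pattern in the two contexts. It only illustrates how the pattern is matched; on its own this redirect still has the loop problem described in this answer.

```apache
# In a per-directory .htaccess the directory prefix (including its slash)
# is stripped before matching, so the pattern has no leading slash:
RewriteRule ^new-site/(.*)$ /$1 [R=301,L]

# In server or <VirtualHost> context the pattern sees the full URL-path,
# so there the leading slash is required:
#RewriteRule ^/new-site/(.*)$ /$1 [R=301,L]
```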
While the idea is good, your first rule will cause an infinite redirect loop once it does match. You have to use THE_REQUEST so that only direct user requests match.
You can put this code in /.htaccess
# 1. Make sure that /new-site/ is not a duplicated content
RewriteCond %{THE_REQUEST} \s/new-site/([^\s]*)\s [NC]
RewriteRule ^ /%1 [R=301,L]
# 2. Make sure that example.com is internally handled by files in '/new-site'
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^((?!new-site/).*)$ /new-site/$1 [L]
Also, you'll have to add this line in /new-site/.htaccess (to avoid automatic override)
RewriteOptions InheritBefore

Remove .php extension after moving site to WordPress

I'm moving an old site from flat PHP files over to a new WordPress installation and want to make sure all the old URLs redirect properly. For example,
Old url: /va/apply.php
should now go to:
New url: /veterans-affairs/apply
I've got /va redirecting to /veterans-affairs properly, but cannot get the .php stripped from the URL.
I'm not sure if this all needs to be done in one step. I've tried everything I can find online and made as many tweaks as my limited knowledge of .htaccess has allowed.
This is also on WordPress, so there may be something I did that was conflicting with the pretty permalinks stuff there.
This is some of the code that I've tried among many others.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [NC,L]
This should redirect the user to the non-PHP location, but I keep getting a 404. This must be a combination of my code and WordPress' pretty permalinks.
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)\.php$ $1 [L,QSA]
I have just had a quick look at where you've got to, and the rules above might help. Add them to the WordPress .htaccess above all the existing entries so they take effect first... HTH
OK, I've finally got this working correctly. Again, what I'm trying to solve is to get this URL:
/va/apply.php
to correctly redirect to the new WordPress URL,
/veterans-affairs/apply
What worked for me was:
# This will remove the .php extension if it is not a directory, the file does not exist and it's not a WordPress specific admin page
RewriteCond %{REQUEST_URI} !/wp-(content|admin|includes)/ [NC]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php !-f
RewriteCond %{THE_REQUEST} ^(.+)\.php([#?][^\ ]*)?\ HTTP/
RewriteRule ^(.+)\.php$ $1 [R=301,L]
# The basic redirect for /va
Redirect /va /veterans-affairs
I think what was breaking it was this final line that you find in all the examples:
RewriteRule ^([^/.]+)$ $1.php [L]
I think this was trying to actually resolve the URL before WordPress could do what it needed to do.
I also found this page which proved insightful
Hide .php Extension, Set Directory Index, Eliminate Duplicate Content, etc.
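On the question of whether this can be done in one step: the following is a sketch only, not tested against a WordPress install. It assumes every old page under /va maps one-to-one to the same name under /veterans-affairs, and that the rules sit above the WordPress block in the root .htaccess, in place of the separate Redirect line.

```apache
RewriteEngine On

# /va/apply.php or /va/apply -> /veterans-affairs/apply in a single 301.
# The lazy (.+?) leaves an optional trailing ".php" to the non-capturing group.
RewriteRule ^va/(.+?)(?:\.php)?$ /veterans-affairs/$1 [R=301,L]

# Bare /va or /va/ -> /veterans-affairs/
RewriteRule ^va/?$ /veterans-affairs/ [R=301,L]
```

Doing both changes in one rule saves the browser a second redirect hop.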

htaccess redirect on mvc framework

I am working on the htaccess file for my mvc site. The software that the company purchased for the site works only without the www, so I fixed up the htaccess to allow www in the URL, since most of our affiliates are going to try to use it anyway. However, this renders siteurl.org/index.php/admin and siteurl.org/index.php/members unreachable. I'm trying to exclude these URLs from the www forward to non-www, but everything I know and can find seems to relate to non-MVC sites, and it seems that mvc sites are set up differently across the board, so the examples I'm finding aren't working for me.
Here's my current htaccess (I had to comment out the forwarding lines to allow admins to access the admin section and affiliates to access the member section)
<Files ~ "serial.txt$">
Order allow,deny
Deny from all
</Files>
RewriteEngine On
#RewriteCond %{HTTP_HOST} !^www\. [NC]
#RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
RewriteRule ^aff/(.*)$ /index.php/aff/?aff=$1 [R,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [R=301,NC]
If it could be addressed at the same time, I'd also love a pointer on how to clean up that URL so that we could type in siteurl.org/admin as opposed to siteurl.org/index.php/admin (same for members), and also to show the affiliate name in the URL (it's currently cleaning up the URL to remove the /aff/affiliateusername but affiliates would like to see their name in the URL). If anybody has a great link to specific resources on writing htaccess for MVC I would be eternally grateful. Thank you in advance for any assistance.
Let's be clear on three things before going into explanations:
Apache doesn't have a care in the world whether your site is built on the MVC approach/design pattern or not. It. Doesn't. Even. See. It. All it sees is htaccess mod_rewrite rules.
Whoever puts a serial code in serial.txt on the root is just begging for it to get nicked using a file include vulnerability in PHP.
This is suspiciously similar to CodeIgniter in rewrite rules.
Now. Your rules:
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
This rule will match only if the HTTP host does not start with www. . If this is true, then it'll redirect to the http://www. version.
Based on your description, you want the opposite: Your CMS does not work with www.. So, you will want this:
RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteRule ^(.*)$ http://mydomain/$1 [R=302,L]
Note that you'll need to hardcode your domain in there. Sucks.
Next set of rules:
RewriteRule ^aff/(.*)$ /index.php/aff/?aff=$1 [R,L]
This is bog-standard - redirects aff/whatever to /index.php/aff/?aff=$1
For the future, change it to this:
RewriteRule ^aff/(.*)$ /index.php?/aff/?aff=$1 [L]
This will clean up the URL and prevent an Apache rewrite cycle.
Next one:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [R=301,NC]
These will wildcard-match anything that does not exist. Same thing as before, change the last line to this:
RewriteRule ^(.*)$ /index.php?$1 [L]
This will, again, make the rewrite transparent.
P.S: get a real CMS developer. 301s are hardly useful.
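Putting the pieces together, a minimal sketch might look like the following. It is untested against the asker's software and assumes the application routes on the path after index.php, as the /index.php/admin URLs in the question suggest. One detail differs from the rules above: capturing the host in the RewriteCond and reusing it as %1 avoids hardcoding the domain.

```apache
RewriteEngine On

# www -> non-www; %1 is the host name captured without the "www." prefix.
# 302 while testing, switch to 301 once it behaves as intended.
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=302,L]

# Front controller: internal rewrite (no R flag), so /admin stays in the
# address bar while /index.php/admin is what actually runs.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]
```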

Zend Framework /index/ redirect throught htaccess to avoid content duplication

I'd like to know if it is possible to add a rule to the htaccess of my ZF app to redirect all the URLs that end with the segment /index/ (such as http://domain.ext/index/) to the same URL without the /index/ suffix.
I've tried with this simple rule:
RedirectMatch ^(.*)/(index(/)?)$ http://localhost$1
but it doesn't work as expected (with other frameworks such as FuelPHP it works like a charm).
I know that this can be done via PHP using a plugin but I'd like to make the redirect via Apache to improve the performance of the application.
I don't know why nobody jumped in here, it is not that complicated?
A config file is executed from top to bottom and certain rules cause an immediate exit. If the rule defines an external redirect the server will perform that redirect immediately and all following rules are therefore ignored. If the redirect is back to the same server and config file then it is just a new game with the rules! If the redirect rule does not apply anymore it is on to the next rule. If the rule would still apply you get a loop.
Similar thing with a RewriteRule that matches and has [L]. L means "Stop the rewriting process here and don't apply any more rewrite rules". This quote is straight from the manual
Now you simply have to decide in what order you want to apply certain rules. Your RedirectMatch for any /index/ path is certainly something you want very near the top of the config. If there is a match, processing of your config ends here and a redirect is performed! The browser will send a new request and we have a new game.
The RewriteRule to index.php is something we will add very late, at the bottom. It is our last resort, an "if all else fails" rule. It does not matter whether this is the Zend Framework or any other application you funnel through an index.php or other script.
The following rules should cover any variation with index, including .php, .htm and .html and finally trigger the index.php file for your ZF application.
RedirectMatch ^(/.*)/(index\.(php|html|htm)|index)/?$ $1
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]
When testing redirect rules, be careful with your browser and use one where you can totally reset all cache and history settings. All current browsers are notorious for "remembering" redirects. Once they have learned a redirect they will perform it internally, i.e. they don't go to the server to see what's new!
Here is your ruleset laid out readably
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RedirectMatch ^(.*)/(index(/)?)$ http://localhost$1
RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]
RedirectMatch is a mod_alias directive, so placing it there severs the conditions from their rule. It's also a lot less fraught not to mix mod_alias and mod_rewrite directives, so try:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule ^(.*?)/index/?$ $1 [R=301,L]
RewriteRule ^.*$ index.php [NC,L]
(updated following posters comments)
More footnotes
I tried this out on my VM, which mirrors my hosting service, but since I have root access there I can see the 'production' rewrite logs. This fails because the second rule still falls through to rule (3), which dispatches to index.php. This then returns the full content but with a 301 status and without issuing a new location. If I change the [R=301] to [R=301,L] then it works fine, as the server now issues a 301 with a Location header and the browser retries with the new location.
The documentation states:
You will almost always want to use [R] in conjunction with [L] (that is, use [R,L]) because on its own, the [R] flag prepends http://thishost[:thisport] to the URI, but then passes this on to the next rule in the ruleset, which can often result in 'Invalid URI in request' warnings.
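A small side-by-side sketch of the two variants discussed here. The pattern is the one from this answer; the leading slash on the target is an addition, made on the assumption that the .htaccess sits in the document root.

```apache
RewriteEngine On

# Without L the rewritten URL is handed on to the following rules, so the
# front-controller rule below can still rewrite it:
#RewriteRule ^(.*?)/index/?$ /$1 [R=301]

# With L, rewriting stops here and the 301 with its Location header is sent:
RewriteRule ^(.*?)/index/?$ /$1 [R=301,L]

RewriteRule ^.*$ index.php [NC,L]
```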
I resolved my problem with this (horrible) workaround:
- I renamed the "index" action of IndexController to "home"
- I setup a static route for home page (source)
- I changed the htaccess to:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^(.*)home(/)?$ $1 [R=301,L]
RewriteRule ^(.*)index/(.*)$ $1$2 [R=301,L]
RewriteRule ^.*$ index.php [QSA,NC,L]
So now the home page is not duplicated because http://localhost/home/ is redirected to the base domain and for other controllers the index action, if it is called specifying the action name (/controller/index/param/value) is redirected to the desired URL (/controller/param/value/)
RewriteRule ^(.*)/index(?|$)$ $1$2 [R=301,L]
this works with urls
/index => /
/index?page=2 => /?page=2
/index/index => /
/index/index?page=2 => /?page=2
You need to remove the trailing slash first for URLs like /index/ and /index/index/:
RewriteRule ^(.*)/$ /$1 [L,R=301]
url like /index/help will work without changes

Rewrite pagerequests to index.php, filerequests to app/webroot directory

Hey, I've been reading StackOverflow.com for a long time but decided to sign up to ask a question. I'm writing my own lightweight MVC framework that routes page requests in index.php.
Page requests look like /controller/action/arg1/arg2/arg3, and they should be rewritten to index.php?route=[request]. So, a [request] like site.com/user/profile/123 should be rewritten to index.php?route=user/profile/123
However, files aren't meant to rewrite to index.php. Assets such as images and stylesheets are in the /app/webroot/ folder, and don't need PHP to be executed. So, the mod_rewrite engine should rewrite any filerequests to /app/webroot/, and serve the configured 404 ErrorDocument when the file doesn't exist.
Directory structure
./index.php
./app/webroot/scripts/helpers/hamster.js
./app/webroot/images/logo.png
./app/webroot/style/main.css
Since you can tell the difference between a file request (/squirrel.png) and a page request (/user/profile/123) just by the existence of the file extension / dot, I was expecting that this would be really easy. But... I'm having a really hard time with it and I was hoping someone could help me out.
Something I've tried was...
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ app/webroot/$1 [L]
RewriteRule ^(.*)$ index.php?route=$1 [QSA,L]
... but it doesn't really work except for redirecting correctly to existing files. Pagerequests or nonexisting files result in HTTP 500 errors.
Any help is greatly appreciated! =)
See if this works out a little more like you expected:
RewriteEngine On
# These two lines are very specific to your current setup, to prevent
# mod_dir from doing what it does, but in a more controlled way
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/iceberg[^/]
RewriteRule .* http://localhost/iceberg/ [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^/app/webroot
RewriteCond %{REQUEST_URI} \.[a-z]+$ [NC]
RewriteRule ^.*$ app/webroot/$0 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^/app/webroot
RewriteRule ^.*$ index.php?route=$0 [QSA,L]
Also, to explain, the reason why you are getting the 500 error is likely because of your rule:
RewriteRule ^(.*)$ index.php?route=$1 [QSA,L]
Since it's unconditional, and the regular expression pattern will always match, your rewrite will be performed over and over (the L flag doesn't prevent this, because after you rewrite to index.php, an internal redirection is made inside of Apache, and the process loses its current state).
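One common way to break that loop is to let requests that have already been rewritten pass through untouched. The following is a minimal sketch under that approach, not a tested drop-in. It assumes the .htaccess sits next to index.php and that any URL ending in a dot plus an extension is an asset, so a page URL such as /user/john.doe would wrongly be treated as a file.

```apache
RewriteEngine On

# Already-rewritten requests pass through unchanged; this breaks the loop.
RewriteRule ^index\.php$ - [L]
RewriteRule ^app/webroot/ - [L]

# Anything ending in a file extension is served from app/webroot;
# a missing file there falls through to the normal 404 ErrorDocument.
RewriteRule ^(.+\.[a-z0-9]+)$ app/webroot/$1 [NC,L]

# Everything else is a page request for the front controller.
RewriteRule ^(.*)$ index.php?route=$1 [QSA,L]
```

With these rules /images/logo.png is served from app/webroot/images/logo.png, and /user/profile/123 reaches index.php with route=user/profile/123.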
