I am trying to trap old URLs of the form:
http://www.example.com/mpn_engine.php%3Ffamilyname%3Djiyalal+goswami%26menuopt%3D2%26submenuopt%3D1%26Search%3Dstuff
In my .htaccess file, with the help of various wise StackOverflowers as RegEx is alien to me, I have arranged to catch the PHP script 'mpn_engine.php' (both .php3 and newer .php copies) wherever it might be found (in any sub folder) and redirect visitors to the index page.
RewriteRule (^|/)mpn_engine\.php$ /index.html? [L,NC,R=301]
RewriteRule (^|/)mpn_engine\.php3$ /index.html? [L,NC,R=301]
The odd thing I am finding is that the above seems to work provided I request the PHP files exactly, or if I supply conventional parameters of the form:
http://www.example.com/lang/mpn_engine.php?x=fred
but as soon as I substitute a percent mark for the question mark, i.e. something like the following:
http://www.example.com/lang/mpn_engine.php%x=fred
The rewrite fails, and I get unpredictable results, usually a 404 but occasionally a 'Bad Gateway'.
How can I rewrite this ReWriteRule to catch this .php file in any folder it might be looked for and with any trailing characters, including a percent sign, and redirect it gracefully to the index page?
Thanks!
Your question has a number of sub-questions:
If you want to "catch this .php file in any folder it might be looked for" then as long as your .htaccess file is in the root folder of your website (and not in a subfolder), then you are covered.
If you want to cover ANY trailing character, then you can make one of two changes to your rewrite rule:
Remove the ending $:
RewriteRule (^|/)mpn_engine\.php /index.html? [L,NC,R=301]
or
Add a wildcard after "php":
RewriteRule (^|/)mpn_engine\.php(.*)$ /index.html? [L,NC,R=301]
In the first case, if the $ is present, this tells Apache to ONLY match if "php" is at the end of the URL. In the second case, this tells Apache to match if "php" is followed by zero or more of any other characters at the end of the URL. In either case, you do not need your second rewrite rule concerning "php3" -- either of these above will match for those instances as well.
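As a rough check of the variants above, the patterns can be exercised outside Apache. The sketch below uses Python's re module as a stand-in for mod_rewrite's regex engine (an assumption: for patterns this simple the two should behave alike), with re.IGNORECASE playing the role of the NC flag. The sample paths are made up for illustration.

```python
import re

# Patterns copied from the rules above; re.IGNORECASE stands in for [NC].
ANCHORED = re.compile(r"(^|/)mpn_engine\.php$", re.IGNORECASE)
UNANCHORED = re.compile(r"(^|/)mpn_engine\.php", re.IGNORECASE)
WILDCARD = re.compile(r"(^|/)mpn_engine\.php(.*)$", re.IGNORECASE)

# Hypothetical URL paths as a root-level .htaccess rule would see them
# (no leading slash, query string already split off).
SAMPLES = [
    "mpn_engine.php",
    "lang/mpn_engine.php3",
    "lang/MPN_Engine.php-old",
    "lang/other.php",
]


def matches(pattern, path):
    """Return True if the rule's pattern would match this path."""
    return pattern.search(path) is not None


for path in SAMPLES:
    print(path,
          matches(ANCHORED, path),
          matches(UNANCHORED, path),
          matches(WILDCARD, path))
```

Only the $-anchored pattern rejects the .php3 and trailing-text samples; the other two patterns accept the same set of paths as each other.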
The reason your first example with the "%" worked but subsequent attempts gave 404 errors is because the server translates "%3F" to "?", and "?" has a special meaning for web servers and is essentially ignored by your regex matcher -- thus the server acts as if "php" is the final part of the URL, and the rewrite succeeds.
I am working on a website for which I wrote some htaccess code, but it does not do what I want. I have a URL like this:
http://www.example.com/demo.php?id=234&title=ask%20me%20a%20question
I want to convert it to the URL below using htaccess:
http://www.example.com/234/ask%20me%a%question
htaccess code:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^([0-9]{4})/([a-z]+)/$ demo.php?url=$1&url2=$2
So the problem is that the converted URL looks for the related file in a subdirectory instead of in the server root, i.e. public_html. How can I solve this?
Please help me. Thanks.
The second parameter in your request requires that characters other than a-z be included, but you are limiting it to a-z.
In addition, you are requesting 234 in the URI, but checking for 4 numbers in the first parameter.
As such, change your rule to the following:
RewriteRule ^([0-9]{3,4})/([^/]+)/?$ demo.php?url=$1&url2=$2 [L]
Changes
Allow 3 or 4 numbers in the first parameter. If you want to be more flexible, you can change it to ([0-9]+).
Check for all characters other than / in the second parameter.
Make the trailing slash optional using /?.
Add the L flag to stop rewriting if the rule is matched (always good to have for when you add other rules).
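To see why the original pattern misses and the revised one hits, here is a small simulation. It assumes, as the mod_rewrite documentation describes, that the pattern is matched against the percent-decoded URL path, so %20 arrives as a space. Python's re module stands in for Apache's regex engine, the rewrite helper is hypothetical, and the sketch does not reproduce how Apache encodes the resulting query string.

```python
import re
from urllib.parse import unquote

# The asker's pattern and the revised pattern from the answer.
OLD_RULE = re.compile(r"^([0-9]{4})/([a-z]+)/$")
NEW_RULE = re.compile(r"^([0-9]{3,4})/([^/]+)/?$")


def rewrite(pattern, raw_path):
    """Return the substituted target, or None if the rule does not match."""
    decoded = unquote(raw_path)  # mod_rewrite sees the decoded path
    match = pattern.search(decoded)
    if match is None:
        return None
    return f"demo.php?url={match.group(1)}&url2={match.group(2)}"


print(rewrite(OLD_RULE, "234/ask%20me%20a%20question"))
print(rewrite(NEW_RULE, "234/ask%20me%20a%20question"))
```

The old rule fails on three counts for this path: the id has three digits rather than four, the title contains spaces, and there is no trailing slash.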
I have this .htaccess
RewriteEngine On
RewriteRule ^custom index.php?page=value [NC,L]
It works if index.php and .htaccess are located in the same folder.
I'd like to place the .htaccess file in the root and make the content placed in a newFolder react to the htaccess rules. Say:
.htaccess -> root level
index.php -> root/newFolder level
I did try:
RewriteEngine On
RewriteRule ^custom newFolder/index.php?page=value [NC,L]
No reaction at all (404) :(
I guess the solution must be simple, but...
Any help will be appreciated.
Let's make sure we're on the same page. ;) In my server root directory I have a .htaccess file and in that file I have:
RewriteEngine On
RewriteRule ^(.+/)?custom $1index.php?page=value [NC,L]
The RewriteRule has three components: a match pattern, a target, and a list of flags. The match pattern can contain plain text or a regular expression. The target can contain plain text or text with intermingled variables captured in the regular expression match pattern. Both match pattern and target are relative to the directory the .htaccess file is located in.
The match pattern is compared to the text in the URL that is after the current directory. If there's a successful match then the target is called from the directory the .htaccess file is in. If information is captured in the match pattern then the variables in the target are first interpolated in the target.
In this example the .htaccess file is in the server root therefore the match pattern is compared to the URL text after: http://localhost/
To test it out, here's the PHP code I put in an index.php file:
<?php
echo 'page value: ' .$_GET['page'] . '<br>';
echo 'script loaded: ' . $_SERVER['PHP_SELF'];
?>
When I navigate to http://localhost/custom in my browser: the regular expression match pattern is compared to the string "custom", there's a successful match so the target index.php?page=value is executed, and I get the index.php page in the root directory loaded. The displayed page looks like this:
page value: value
script loaded: /index.php
Next I create a new directory called "newFolder" in the server root and copy the index.php file to the newly created directory.
When I navigate to http://localhost/newFolder/custom in my browser: the regular expression match pattern is compared to the string "newFolder/custom", again there's a successful match, this time the target newFolder/index.php?page=value is executed, and I get the index.php page in the newFolder directory loaded. The displayed page now looks like this:
page value: value
script loaded: /newFolder/index.php
So, likewise, I can create any number of directories in the root directory and new index.php files in those directories, and when navigating to "custom" in those directories, this .htaccess file ought to load the corresponding index.php file in the current subdirectory.
~~~
I'm going to go out on a limb here and assume you want to load pages dynamically, therefore capture "custom" as a dynamic page name and dump it in the page field (replace "value") and send that to the index.php file in the current subdirectory.
RewriteEngine On
RewriteCond %{REQUEST_URI} !index\.php
RewriteRule ^(.+/)?(.+)$ $1index.php?page=$2 [NC,L]
As mentioned, the parentheses capture dynamic text and dump it in the target string where the dollar sign numbers are. So the text captured in the first set of parentheses replaces the "$1" and the text captured in the second set replaces "$2". I also added the RewriteCond statement to prevent this rule from double-dipping. :) If I don't, this rule gets executed for both the "custom" call AND the redirection to index.php.
So now when I navigate to http://localhost/newFolder/whoopdeedoo, the regex match pattern is compared to "newFolder/whoopdeedoo", there's a successful match, the target newFolder/index.php?page=whoopdeedoo is executed, and I get:
page value: whoopdeedoo
script loaded: /newFolder/index.php
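The capture-and-substitute behaviour of that rule can be traced with a few lines of Python. This is only a sketch: re stands in for Apache's regex engine, the rewrite function is a hypothetical helper, and the RewriteCond is approximated by a plain search of the path for index.php.

```python
import re

# Pattern and condition copied from the rule above; IGNORECASE stands in for [NC].
RULE = re.compile(r"^(.+/)?(.+)$", re.IGNORECASE)
COND = re.compile(r"index\.php")


def rewrite(path):
    """Return the rewritten target for a path relative to the server root."""
    # RewriteCond %{REQUEST_URI} !index\.php
    # Skip requests that already point at index.php.
    if COND.search("/" + path):
        return None
    match = RULE.search(path)
    if match is None:
        return None
    # An optional group that did not take part in the match expands to "".
    prefix = match.group(1) or ""
    return f"{prefix}index.php?page={match.group(2)}"


print(rewrite("newFolder/whoopdeedoo"))
print(rewrite("whoopdeedoo"))
print(rewrite("newFolder/index.php"))
```

Because (.+/) is greedy, a deeper path such as a/b/c puts "a/b/" in the first group and "c" in the second.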
~~~
At the risk of droning on, here's some background info. You mentioned this works in the root directory:
RewriteEngine On
RewriteRule ^custom index.php?page=value [NC,L]
But it does not work when navigating your browser to subdirectories. After reading the comments, it appears you discovered this next bit works for subdirectories, specifically "newFolder":
RewriteEngine On
RewriteRule ^newFolder/custom newFolder/index.php?page=value [NC,L]
But now it no longer works for the root directory. So you could simply include both:
RewriteEngine On
RewriteRule ^custom index.php?page=value [NC,L]
RewriteRule ^newFolder/custom newFolder/index.php?page=value [NC,L]
However there's a couple things here that jump out at me. One is that rather than hardcoding directory names in my root .htaccess I'd rather put .htaccess files in each subdirectory and take over the URL processing there.
But if for whatever reason you prefer to maintain one .htaccess file in the root directory, then, what jumps out at me is the duplication in the rules. So, alternatively, you can use regular expressions to make the subdirectory name a captured variable, like shown at the top of this answer:
RewriteEngine On
RewriteRule ^(.+/)?custom $1index.php?page=value [NC,L]
The regex capturing parentheses replace the text "newFolder/", and inside the parentheses we're saying, "capture one or more characters followed by a forward slash". Then, if there's a successful match, the value captured replaces the "$1" in the target. The reason this also works in the root is because the question mark following the parentheses makes the regex inside the parens optional. So if there's nothing before "custom" we're still good for a match. In that case the "$1" gets replaced with an empty string.
Try with this rewriterule:
# Activate RewriteEngine
RewriteEngine on
# Rewrite the URL requested by the user
# Entry: folder/clients/name/
# Output: folder/clients.php?id=name
RewriteRule ^folder/clients/(\w+)/?$ /folder/clients.php?id=$1
Explanation of this rewriterule:
First part of the rewriterule:
^ Start of the expression
folder/clients/ The requested URL string begins with folder/clients/
(\w+) Captures one or more word characters (letters, digits, underscore) and stores them in $1
/? Optional forward slash at the end of the URL
$ End of the expression
Second part of the rewriterule:
/folder/clients.php?id= Literal text
$1 The first capture from the first part
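The breakdown above can be checked against a few sample paths. The sketch below uses Python's re module in place of Apache's regex engine, with re.ASCII so that \w stays limited to ASCII letters, digits and underscore; the rewrite helper and the sample client names are hypothetical.

```python
import re

# Pattern copied from the rule above.
RULE = re.compile(r"^folder/clients/(\w+)/?$", re.ASCII)


def rewrite(path):
    """Return the rewritten target, or None if the rule does not apply."""
    match = RULE.search(path)
    if match is None:
        return None
    return f"/folder/clients.php?id={match.group(1)}"


print(rewrite("folder/clients/name/"))      # trailing slash present
print(rewrite("folder/clients/acme_42"))    # trailing slash omitted
print(rewrite("folder/clients/john-doe/"))  # hyphen is not matched by \w
```

A name containing a hyphen falls through, which is worth knowing if client identifiers are slugs rather than plain words.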
I know roughly how htaccess works, but I am always confused by the syntax, and I would appreciate it if anyone could help me solve the htaccess issue below.
I have a couple of pages that link to something like
http://mydomain.com.au/product-details.php/142/categoryAbstract
but due to a mistake by the previous developer, the images do not load unless the URL is
http://mydomain.com.au/product-details.html/142/categoryAbstract
He converted all the PHP pages to HTML (I really don't know what his intention was in doing that), but
now the URL should work even if it is http://mydomain.com.au/product-details.php/142/categoryAbstract
He used the htaccess rule below for this, but it's not working. If I manually change the URL from .php to .html, everything works fine.
RewriteRule ^product-details.html/(.*)/(.*)$ product-details.php?productid=$1&category=$2
I need a working line of code so that the URL http://mydomain.com.au/product-details.php/142/categoryAbstract works as well.
You just need an OR group (a|b) to account for both possibilities. The new group becomes $1, so the two existing captures shift to $2 and $3:
RewriteRule ^product-details\.(html|php)/(.*)/(.*)$ product-details.php?productid=$2&category=$3
#---------------------------^^^^^^^^^^^
That can be improved a little, though. The (.*) are greedy matches; you are better served by ([^/]+) as the first of them, to match everything up to the next /. I have also escaped the dot as \. so it is matched as a literal instead of any character.
RewriteRule ^product-details\.(html|php)/([^/]+)/(.*)$ product-details.php?productid=$2&category=$3
The .php extension is commonly hidden, either through rewriting or through actual file renaming plus server configuration to parse .html as .php, in order to keep end users from knowing what technologies the site runs on the back end. It is less common to actually rename files to .html than to use URL rewriting to hide the .php, however.
RewriteRule ^product-details.html/(.*)/(.*)$ product-details.php?productid=$1&category=$2
This rule takes everything after product-details.html/ and before the last / as the first capture, and everything after the last / to the end of the line as the second. It then puts those captures where $1 and $2 are.
To make it accept both .html and .php, you can change it to:
RewriteRule ^product-details(\.html|\.php)/(.*)/(.*)$ product-details.php?productid=$2&category=$3
The extension group is now $1, so the product id and category move to $2 and $3. Because the first value you are grabbing looks like a number, and (.*) is a greedy selector, it may be better to replace it with ([0-9]*), which only matches digits. That way, if you ever have slashes in your category, you'll be fine. This gives you:
RewriteRule ^product-details(\.html|\.php)/([0-9]*)/(.*)$ product-details.php?productid=$2&category=$3
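Group numbering is the easy thing to get wrong when an alternation is added in front of existing captures. The following sketch, with Python's re module standing in for Apache's regex engine and two hypothetical helper functions, prints which group holds what for a pattern of this shape.

```python
import re

# An extension alternation followed by a numeric id and a free-form category.
RULE = re.compile(r"^product-details\.(html|php)/([0-9]*)/(.*)$")


def captures(path):
    """Return the tuple of captured groups, or None if there is no match."""
    match = RULE.search(path)
    return None if match is None else match.groups()


def rewrite(path):
    """Build the target using groups 2 and 3; group 1 is the extension."""
    match = RULE.search(path)
    if match is None:
        return None
    return f"product-details.php?productid={match.group(2)}&category={match.group(3)}"


print(captures("product-details.php/142/categoryAbstract"))
print(rewrite("product-details.php/142/categoryAbstract"))
print(rewrite("product-details.html/142/category/Abstract"))
```

If the extension should not take up a backreference at all, a non-capturing group such as (?:html|php) keeps the id and category in $1 and $2.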
What I'm trying to do:
have pretty URLs in the format 'http://domain.tld/one/two/three', that get handled by a PHP script (index.php) by looking at the REQUEST_URI server variable.
In my example, the REQUEST_URI would be '/one/two/three'. (Btw., is this a good idea in general?)
I'm using Apache's mod_rewrite to achieve that.
Here's the RewriteRule I use in my .htaccess:
RewriteRule ^/?([a-zA-Z/]+)/?$ /index.php [NC,L]
This works really well thus far; it forwards every REQUEST_URI that consists of a-z, A-Z or a '/' to /index.php, where it is processed.
Only drawback: '?' (question marks) and '#' (hash keys) seem to still be allowed in the REQUEST_URI, maybe even more characters that I've yet to find.
Is it possible to restrict those via my .htaccess and an adequate addition to the RewriteRule?
Thanks!
The fragment identifier, e.g. #some-anchor, is controlled by the browser, not the server. JavaScript would be needed to redirect and remove this, although why you would want to do so I am not sure.
[SNIPPED after clarification]
To rewrite only when the query string is empty:
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^/?([a-zA-Z/]+)/?$ /index.php [NC,L]
In mod_rewrite and PHP, the variable REQUEST_URI refers to two different parts of the URI. In mod_rewrite, %{REQUEST_URI} contains the current URI path; in PHP, $_SERVER['REQUEST_URI'] contains the URI path and query. In both cases the URI fragment is missing, as this part of the URI is not transmitted to the server but only used by the client.
So, when /one/two/three?foo#bar is requested, mod_rewrite’s %{REQUEST_URI} contains /one/two/three and PHP’s $_SERVER['REQUEST_URI'] contains /one/two/three?foo.
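The same split can be illustrated with Python's standard urllib.parse. This is only a sketch of which URI parts each variable holds, not of how Apache or PHP compute them, and split_request is a hypothetical helper.

```python
from urllib.parse import urlsplit


def split_request(url):
    """Return (mod_rewrite-style path, PHP-style path plus query, fragment)."""
    parts = urlsplit(url)
    apache_request_uri = parts.path
    php_request_uri = parts.path + ("?" + parts.query if parts.query else "")
    # The fragment stays in the browser and is never sent to the server.
    return apache_request_uri, php_request_uri, parts.fragment


print(split_request("http://domain.tld/one/two/three?foo#bar"))
```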
The $_SERVER['REQUEST_URI'] variable will contain the original REQUEST_URI as received by the server, before you perform the rewrite. Therefore it's impossible (as far as I know this early in the morning) to remove the query string portion from the REQUEST_URI's attribute, but you naturally have the option of removing it when you process the $_SERVER['REQUEST_URI'] variable in your script.
If you want to only perform your RewriteRule when the query string is not specified, the following should work:
RewriteCond %{QUERY_STRING} !^.+$
RewriteRule ^/?([a-zA-Z/]+)/?$ /index.php [NC,L]
Note that this might be problematic though, since if there's accidentally a query string in a URL that someone uses to link to your site, your script wouldn't be handling it (since the rewrite never happens), so they'll get a 404 response (or whatever the case may be) that might not be as user-friendly as if you had just chosen to silently ignore the trailing information.
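The two conditions offered in these answers, ^$ and !^.+$, express the same test: the query string is empty. A quick sketch with Python's re module (a stand-in for Apache's regex engine; the function names are hypothetical) shows them agreeing on a few sample query strings.

```python
import re


def cond_empty(query_string):
    """RewriteCond %{QUERY_STRING} ^$  (true when the query string is empty)."""
    return re.search(r"^$", query_string) is not None


def cond_not_nonempty(query_string):
    """RewriteCond %{QUERY_STRING} !^.+$  (true when no non-empty match exists)."""
    return re.search(r"^.+$", query_string) is None


for qs in ["", "foo", "a=1&b=2"]:
    print(repr(qs), cond_empty(qs), cond_not_nonempty(qs))
```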
If I understand correctly, you want to forbid the use of ? and # on your site?
You shouldn't do that, because:
the hash (#) is used in Google's AJAX URL specification,
the question mark (?) is used, for example, by Google AdWords and Analytics or any affiliate program.
So if you force Apache to reject any request containing a question mark, people who click on your ad in AdWords will only see a 404 error page.
There is nothing wrong with letting people use both of them. What matters is protecting your site against XSS attacks.
Btw., there is another very important character: the percent sign (%), which is used to encode special characters (like Polish or German national letters).
Is it possible to use .htaccess to process all six digit URLs by sending them to a script, but handle every other invalid URL as an error 404?
For example:
http://mywebsite.com/132483
would be sent to:
http://mywebsite.com/scriptname.php?no=132483
but
http://mywebsite.com/132483a or
http://mywebsite.com/asdf
would be handled as a 404 error.
I presently have this working via a custom PHP 404 script but it's kind of kludgy. Seems to me that .htaccess might be a more elegant solution, but I haven't been able to figure out if it's even possible.
In your htaccess file, put the following
RewriteEngine On
RewriteRule ^([0-9]{6})$ /scriptname.php?no=$1 [L]
The first line turns the mod_rewrite engine on. The () brackets put the contents into $1 - successive () would populate $2, $3... and so on. The [0-9]{6} says look for a string precisely 6 characters long containing only characters 0-9.
The [L] at the end makes this the last rule - if it applies, rule processing will stop.
Oh, the ^ and $ mark the start and end of the incoming uri.
Hope that helps!
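To confirm that the pattern accepts exactly six digits and nothing else, it can be tried against the example URLs from the question. The sketch uses Python's re module in place of Apache's regex engine, and rewrite is a hypothetical helper; anything that returns None here would fall through to the normal 404 handling.

```python
import re

# Pattern copied from the rule above.
RULE = re.compile(r"^([0-9]{6})$")


def rewrite(path):
    """Return the script target for six-digit paths, otherwise None."""
    match = RULE.search(path)
    if match is None:
        return None
    return f"/scriptname.php?no={match.group(1)}"


for path in ["132483", "132483a", "asdf", "1234567"]:
    print(path, rewrite(path))
```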
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^([0-9]{6})$ scriptname.php?no=$1 [L]
</IfModule>
To preserve the clean URL
http://mywebsite.com/132483
while serving scriptname.php use only [L].
Using [R=301] will redirect you to your scriptname.php?no=xxx
You may find this useful http://www.addedbytes.com/download/mod_rewrite-cheat-sheet-v2/pdf/
Yes, it's possible with mod_rewrite. There are tons of good mod_rewrite tutorials online; a quick Google search should turn up your answer in no time.
Basically, you want the regular expression to look only for digits, with no other characters, and to require a length of exactly 6. Then you rewrite to scriptname.php?no= with the number you captured.
Hope this helps!