htaccess regular expression with a "/" - php

I'm having a brain cramp. I'm using htaccess to rewrite a page and sometimes the variable that gets passed through will have a / (forward slash) in the variable. Sometimes there will be a slash and sometimes there won't but it is super important that all of this is treated as one variable. I'd really rather not reprogram all my pages with a str_replace() to switch a - for a / and then make a call to a database. For example:
http://www.example.com/accounting/finance.htm
Accounting/Finance is one variable that I need.....it is not in an accounting directory and then there's a page called finance.htm in accounting. So far I've got something like
RewriteRule ^([A-Za-z]+.*[A-Za-z]*)\.htm$ mypage.php?page=$1 [L,NC]
But it doesn't like it.
Can someone help me out?
Thanks in advance.
REPLY TO COMMENTS/ANSWERS
The specific rule that I'm looking for is something like this.....
[start of string]...1 or more letters...[possibility of a / followed by 1 or more letters].htm[end of string]
The two answers given below aren't working...I'm pretty sure it keeps treating it as a directory and not an actual "filename". As soon as I remove the forward slash the page works just fine...

If I understand you correctly, you just need this:
([A-Za-z/]*)\.htm
It should work whether or not the name contains a /,
e.g.
accounting/finance.htm
test.htm

A slash is just another character. Apart from that, your regexp looks unnecessarily complex. For instance, .*[A-Za-z]* is not different from .* and also [A-Za-z] can be shortened to [a-z] if you use the [NC] flag.
Your precise rules are not entirely clear, but you probably want something on this line:
RewriteRule ^([a-z/]+)\.htm$ mypage.php?page=$1 [L,NC]
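To check that a slash inside a character class is matched like any other character, here is a small sketch in Python. Python's re module is only a stand-in for the regex engine mod_rewrite uses, and capture_page is a hypothetical helper name. The pattern is the [a-z/]+ one from the rule above, anchored at both ends, with re.IGNORECASE playing the role of [NC].

```python
import re

# [a-z/]+ treats "/" as an ordinary member of the character class.
# re.IGNORECASE stands in for the [NC] flag (assumption: same effect here).
PAGE_RE = re.compile(r"^([a-z/]+)\.htm$", re.IGNORECASE)

def capture_page(path):
    """Return what $1 would hold for the given path, or None if no match."""
    m = PAGE_RE.match(path)
    return m.group(1) if m else None

print(capture_page("accounting/finance.htm"))   # accounting/finance
print(capture_page("test.htm"))                 # test
print(capture_page("accounting/finance.html"))  # None
```

If the pattern matches in a check like this but the live page still fails, the problem is probably elsewhere in the server configuration rather than in the regex.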

Related

How to load a specific page for any given pathname URL

Let's say I have a web-page called www.mysite.com
How can I make it so whenever a page is loaded like www.mysite.com/58640 (or any random number) it redirects to www.mysite.com/myPHPpage.php?id=58640.
I'm very new to website development so I don't even really know if I asked this question right or what languages to tag in it...
If it helps I use a UNIX server for my web hosting with NetWorkSolutions
Add this to your .htaccess file in the main directory of your website.
RewriteEngine on
RewriteBase /
RewriteRule ^([0-9]+)$ myPHPpage.php?id=$1 [L]
Brief explanation of the pattern:
^ matches the start of the requested path
[0-9] matches a single digit
+ means one or more of the preceding item
$ matches the end of the requested path
The parentheses capture the part they enclose and store it, so it can be referenced as $1 in the new URL. With more than one parenthesized group you would use $2, $3 and so on.
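As a rough illustration of the capture-and-substitute step, the sketch below applies the same pattern in Python. This is not how Apache is implemented; rewrite is a hypothetical helper, and the path is assumed to arrive without its leading slash, as it does in a per-directory .htaccess file.

```python
import re

# Digits only, anchored at both ends, with one capturing group ($1).
ID_RE = re.compile(r"^([0-9]+)$")

def rewrite(path):
    """Return the rewritten target, or the path unchanged if the rule does not match."""
    m = ID_RE.match(path)
    if m is None:
        return path  # rule does not apply; the request is left alone
    return "myPHPpage.php?id=" + m.group(1)

print(rewrite("58640"))    # myPHPpage.php?id=58640
print(rewrite("about"))    # about
print(rewrite("58640/x"))  # 58640/x
```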
If you experience issues with the .htaccess file please refer to this as permissions can cause problems.
If you needed to capture something else such as alphanumeric characters you'd probably want to explore regex a bit. You can do things such as:
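RewriteRule ^(.+)$ myPHPpage.php?id=$1 [NC,L]
which matches anything, or you can get more specific with character classes such as [a-zA-Z0-9]. Be aware that (.+) also matches myPHPpage.php itself, so a catch-all like this usually needs a RewriteCond %{REQUEST_FILENAME} !-f guard in front of it.
Edit: and @Jonathon has a point. In your PHP file, wherever you handle $_GET['id'], be sure to sanitize it if it is used in anything resembling an SQL query or mail. Since you are using only numbers, that makes it easy: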
$id = (int)$_GET['id']; // cast as integer - any weird strings will give 0
Keep in mind that if you are not going to use only numbers, you will have to look for a sanitizing function (they abound on Google; search for 'php sanitize') to make sure you don't fall victim to an SQL injection attack.

Rewrite syntax in .htaccess

I know roughly how htaccess works, but I am always confused by the syntax, and I would appreciate help with the issue below.
I have a couple of pages linking to URLs like
http://mydomain.com.au/product-details.php/142/categoryAbstract
but due to mistakes by the previous developer the images do not load unless the URL is
http://mydomain.com.au/product-details.html/142/categoryAbstract
He converted all the PHP pages to HTML (I really don't know what his intention was), but
now the URL should also work in the form http://mydomain.com.au/product-details.php/142/categoryAbstract
He used the htaccess rule below for this, but it's not working. If I manually change the URL from .php to .html everything works fine.
RewriteRule ^product-details.html/(.*)/(.*)$ product-details.php?productid=$1&category=$2
I need a working rule so that the URL http://mydomain.com.au/product-details.php/142/categoryAbstract works as well.
You just need an alternation group (a|b) to account for both possibilities. Make it non-capturing with ?: so that $1 and $2 still refer to the product ID and the category:
RewriteRule ^product-details\.(?:html|php)/(.*)/(.*)$ product-details.php?productid=$1&category=$2
#-----------------------------^^^^^^^^^^^^
That can be improved a little, though. The (.*) groups are greedy. You are better served by ([^/]+) as the first group, which matches everything up to the next /. I have also escaped the dot as \. so that it matches a literal dot instead of any character.
RewriteRule ^product-details\.(?:html|php)/([^/]+)/(.*)$ product-details.php?productid=$1&category=$2
The .php extension is commonly hidden, either through URL rewriting or by actually renaming the files and configuring the server to parse .html as PHP, so that end users cannot tell what technology the site runs on the back end. It is less common to actually rename files to .html than to use URL rewriting to hide the .php, however.
RewriteRule ^product-details.html/(.*)/(.*)$ product-details.php?productid=$1&category=$2
This rule takes everything after product-details.html/ and before the last / as the first group, and everything after the last / to the end of the line as the second. It then puts those pieces where the $1 and $2 are.
To make it accept both .html and .php you can change it to
RewriteRule ^product-details(\.html|\.php)/(.*)/(.*)$ product-details.php?productid=$2&category=$3
Because the first piece you are grabbing looks like a number and (.*) is a greedy selector, it may be better to replace it with ([0-9]+), which only matches digits. That way, if you ever have a / in your category, you'll be fine, giving you:
RewriteRule ^product-details(\.html|\.php)/([0-9]+)/(.*)$ product-details.php?productid=$2&category=$3
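Both answers depend on how groups are numbered, so here is a short Python check. Python's re stands in for Apache's regex engine, and the second sample path, with an extra slash in the category, is made up for illustration. It shows that a capturing group for the extension shifts the backreferences by one, while a non-capturing (?:...) group does not, and how a greedy (.*) splits a category that contains a slash.

```python
import re

path = "product-details.php/142/categoryAbstract"

# Capturing extension group: the ID and category become groups 2 and 3.
capturing = re.match(r"^product-details\.(html|php)/([^/]+)/(.*)$", path)
# Non-capturing extension group: the ID and category stay groups 1 and 2.
non_capturing = re.match(r"^product-details\.(?:html|php)/([^/]+)/(.*)$", path)

print(capturing.groups())      # ('php', '142', 'categoryAbstract')
print(non_capturing.groups())  # ('142', 'categoryAbstract')

# Hypothetical path with a slash inside the category.
nested = "product-details.html/142/art/abstract"
greedy = re.match(r"^product-details\.(?:html|php)/(.*)/(.*)$", nested)
bounded = re.match(r"^product-details\.(?:html|php)/([^/]+)/(.*)$", nested)

print(greedy.groups())   # ('142/art', 'abstract')
print(bounded.groups())  # ('142', 'art/abstract')
```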

htaccess internal errors with using (escaped or unescaped) dots

Maybe I'm doing something stupid, but I can't get rid of an issue with htaccess.
I'm trying to match a function name in a documentation site and I'm getting errors I can't understand. I should point out that I (think I) know about regular expression escaping, and I know what dot and backslash-dot mean.
So: I want to allow all of these:
example.com/foofunction
example.com/foofunction.php
example.com/function.foofunction
example.com/function.foofunction.php
These are the lines I've tried. I don't understand the ones that cause errors, so many thanks to anyone who can explain any of them to me:
^function\.([A-Za-z0-9_-]+)(\.php)?$ -> works, but makes function. mandatory
^(function\.)?([A-Za-z0-9_-]+)(\.php)?$ -> internal error... OK, let's not escape the dot; in the end it will match any character and will work...
^(function.)?([A-Za-z0-9_-]+)(\.php)?$ -> internal error too! OK, just to try it: dot outside the optional group?
^(function)?\.([A-Za-z0-9_-]+)(\.php)?$ -> works, OK, but it makes the dot mandatory. By the way, more crazy things:
^(function)?.([A-Za-z0-9_-]+)(\.php)?$ -> if the dot isn't escaped (imagine I want to allow any character), internal error too. Now I'll try to make the dot optional separately
^(function)?(\.)?([A-Za-z0-9_-]+)(\.php)?$ -> internal error too, I'm going crazy...
These are my attempts so far. I'm going to try an optional lookbehind and update with the results... Anyway, I'd love to understand why those regexes cause an internal error.
And if anyone knows of an "htaccess special regex exceptions" reference or something like that which I should read, it would be very welcome.
Thanks in advance to all of you guys.
Use non capturing groups for everything apart from the actual function name:
^(?:function\.)?([A-Za-z0-9_\-]+)(?:\.php)?$
Let's break that down:
^ # assert start of string
(?:function\.)? # optionally allow the string "function."
([A-Za-z0-9_\-]+) # capture the function name - this could be shortened to ([-\w]+)
(?:\.php)? # optionally allow the string ".php"
$ # assert end of string
So your .htaccess would look (I guess) something like this:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(?:function\.)?([A-Za-z0-9_\-]+)(?:\.php)?$ doc.php?functionname=$1 [L,QSA]
IMPORTANT POINT and the actual solution in this case:
You must use a sensible combination of RewriteCond and (usually) the [L] flag to ensure that the rule matches only once.
mod_rewrite behaves in a slightly counter-intuitive way that is not always immediately apparent: it keeps running the rules over and over until there are no more matches. So, let's say I use the rule outlined above:
RewriteRule ^(?:function\.)?([A-Za-z0-9_\-]+)(?:\.php)?$ doc.php?functionname=$1
...and I supply to this rule the input function.myfunc.php. First, it will be rewritten to:
doc.php?functionname=myfunc
However, next time it will match again. And it will be rewritten to:
doc.php?functionname=doc
...and this will keep happening over and over until the internal redirect limit is reached (LimitInternalRecursion, 10 by default), at which point Apache throws an error, which you will see on the client side as a 500 response.
The solution to this depends on your exact use case, but a common solution (the one I used above) is to check whether the requested file exists before applying the rewrite rule. By doing this, on the second iteration the rule will not be applied, and the request will be allowed to fall through for further processing.
The [L] flag is also commonly (over)used - this causes the current iteration of the rewrite process to stop, and start again at the next iteration. It effectively does the same thing as continue does to a loop in PHP.
Since Apache 2.3.9, a much more useful flag for this situation is available: [END]. This gives the behaviour most people expect from [L]: it causes the rewrite process to halt immediately with no further iterations, like the break construct in PHP. Using it would mean that the aforementioned RewriteConds are no longer necessary. However, because it is only available in 2.3.9 and later, it can't be used safely unless you know for certain that it will be available in every environment you run on.
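The repeated matching described in this answer can be modelled roughly in Python. This is a simplified simulation, not Apache's actual algorithm: apply_rule and simulate are hypothetical helpers, and the existing set is an assumed stand-in for the RewriteCond %{REQUEST_FILENAME} !-f check.

```python
import re

RULE = re.compile(r"^(?:function\.)?([A-Za-z0-9_\-]+)(?:\.php)?$")

def apply_rule(path):
    """One pass of the rule: return (new_path, query), or None if it does not match."""
    m = RULE.match(path)
    if m is None:
        return None
    return "doc.php", "functionname=" + m.group(1)

def simulate(path, existing=(), max_passes=10):
    """Re-run the rule on its own output, recording each rewritten URL."""
    history = []
    for _ in range(max_passes):
        if path in existing:  # models the "!-f" guard: skip real files
            break
        result = apply_rule(path)
        if result is None:
            break
        path, query = result
        history.append(path + "?" + query)
    return history

# Without a guard the rule keeps matching its own output.
print(simulate("function.myfunc.php", max_passes=3))
# With doc.php treated as an existing file, the second pass is skipped.
print(simulate("function.myfunc.php", existing={"doc.php"}))
```

In this model the unguarded run rewrites to functionname=myfunc on the first pass and to functionname=doc on every pass after that, which is the behaviour the answer describes.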

mod_rewrite .htaccess redirect part of a query string

I'm struggling with a redirect problem. I need to redirect the following URL with mod_rewrite through .htaccess
http://www.mysite.com/somescript.php?&lang=php&var=1&var=2
to the following
http://www.mysite.com/somescript.php?lang=php&var=1&var=2
So, basically I just need to remove the & before lang=php. However, the order is important. Sometimes &lang=php appears after other variables in the query string; in this scenario I need the & to remain part of &lang=php.
Is this possible?
To summarise, if &lang=php appears at the beginning of the query string, remove the &. If &lang=php appears anywhere else in the query string, the & must remain.
Hope this is clear!
I would change the script myself but unfortunately I am not the developer, and he doesn't seem too helpful at the moment; this is a quick fix.
I would replace ?& with ?:
RewriteCond %{QUERY_STRING} ^\&(.*)$
RewriteRule ^somescript\.php$ /somescript.php?%1 [L,R=301]
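The RewriteCond pattern only looks at the query string, so its effect can be sketched in Python as follows. Python's re stands in for Apache's regex engine, and fix_query is a hypothetical helper whose return value corresponds to what %1 would carry into the redirect target.

```python
import re

# Same idea as ^\&(.*)$ : a leading "&" followed by the rest of the query.
LEADING_AMP = re.compile(r"^&(.*)$")

def fix_query(query):
    """Drop a single leading '&'; leave every other query string untouched."""
    m = LEADING_AMP.match(query)
    return m.group(1) if m else query

print(fix_query("&lang=php&var=1&var=2"))  # lang=php&var=1&var=2
print(fix_query("var=1&lang=php&var=2"))   # var=1&lang=php&var=2
```

Because the pattern is anchored with ^, an & in front of lang=php anywhere later in the query string is left in place.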
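Why don't you match "?&" and replace it with "?"?
Something like:
RewriteRule ^(.*)?&(.*) $1?$2 [L]
(not tested; note that a RewriteRule pattern is only matched against the URL path, never the query string, so as written this cannot see the "?&", and the query string has to be tested with RewriteCond %{QUERY_STRING} instead)
Because I think the combination "?&" is never valid...(?)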

Need help with regular expressions - URL redirection

I'm trying to redirect an easy to remember url to a php file but I'm having some trouble with the regex.
Here's what I have at the moment:
RewriteRule ^tcb/([a-zA-Z0-9]{1,})/([a-zA-Z0-9]{1,})/([a-zA-Z0-9]{1,}) /tcb/lerbd.php?autocarro=$1&tipo=$2&dsd=$3
It is working but only if I supply all 3 arguments. I want the last two arguments to be optional so it either works with only the first or all three. I'm hoping you can help me with this.
Thank you very much.
Adding a ? after something in a regex makes it optional, so something like:
RewriteRule ^tcb/([a-zA-Z0-9]{1,})/?(([a-zA-Z0-9]{1,})/([a-zA-Z0-9]{1,}))? /tcb/lerbd.php?autocarro=$1&tipo=$3&dsd=$4
Notice I introduced a new grouping around the 2nd and 3rd arguments, so the backreferences had to be shifted.
You might also want to put an optional / at the end, too, so it can be used just as if it pointed to a directory...
Here's how I solved the problem. It could be helpful for someone who might stumble upon this question:
RewriteRule ^tcb/([a-zA-Z0-9]{1,})/?(([a-zA-Z0-9]{1,})/([a-zA-Z0-9]{1,}))?$ /tcb/lerbd.php?autocarro=$1&tipo=$3&dsd=$4
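To see how the optional groups behave, here is a small Python sketch. Python's re stands in for Apache's regex engine, params and params_strict are hypothetical helpers, and unmatched groups are mapped to empty strings on the assumption that this is how the backreferences expand. It also brings out one edge case: because the separating slash is itself optional, a two-segment path can be split in an unexpected way. The stricter variant is only a suggestion, not part of the original answers, and it would use $2 and $3 instead of $3 and $4.

```python
import re

# Pattern from the rule above: groups 1, 3 and 4 carry the three values.
TCB_RE = re.compile(
    r"^tcb/([a-zA-Z0-9]{1,})/?(([a-zA-Z0-9]{1,})/([a-zA-Z0-9]{1,}))?$"
)
# Suggested stricter variant (assumption): the slash belongs to the optional part.
STRICT_RE = re.compile(
    r"^tcb/([a-zA-Z0-9]+)(?:/([a-zA-Z0-9]+)/([a-zA-Z0-9]+))?/?$"
)

def params(path):
    m = TCB_RE.match(path)
    if m is None:
        return None
    return (m.group(1), m.group(3) or "", m.group(4) or "")

def params_strict(path):
    m = STRICT_RE.match(path)
    if m is None:
        return None
    return (m.group(1), m.group(2) or "", m.group(3) or "")

print(params("tcb/12"))             # ('12', '', '')
print(params("tcb/12/a/b"))         # ('12', 'a', 'b')
print(params("tcb/12/a"))           # ('1', '2', 'a')  <- surprising split
print(params_strict("tcb/12/a"))    # None
print(params_strict("tcb/12/a/b"))  # ('12', 'a', 'b')
```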
