htaccess - Force a slash to the end of a dynamic URL - php

I have a single index.php file in a /slug subdirectory and would like to load dynamic content based on the file path. Regardless of what the url is, the content should reference that index.php.
In my code below, the slash is not being added at the end of the url. For example, example.com/slug/33 should be displayed in the address bar as example.com/slug/33/.
I have the following .htaccess in /slug:
Options -Indexes
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
# Dynamic url
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /slug/index.php/?path=$1 [NC,L,QSA]
I tried adding a / between index.php and ?path=$1 but I'm not getting the desired result. Is this even possible?

RewriteRule ^(.*)$ /slug/index.php/?path=$1 [NC,L,QSA]
Changing the substitution string here changes the target of your internal rewrite - it does nothing to change the visible URL. By adding a slash after index.php you are (unnecessarily) adding additional pathname information (path-info) to the resulting URL that your application receives.
To change the visible URL (to append the slash) you need to implement an external redirect. However, to confirm... you must already be linking to the correct canonical URL (ie. with a slash) in your internal links. Appending the slash in .htaccess is only necessary if you have changed the URL and search engines or 3rd parties are still using the old non-canonical URL (without a trailing slash).
Since the .htaccess file is in the /slug subdirectory and you are rewriting to index.php in that subdirectory then you don't need to prefix the rewritten URL with /slug/. By default, a relative URL-path is relative to the directory that contains the .htaccess file. However, you must also remove the RewriteBase directive (or set this "correctly" to RewriteBase /slug).
To redirect to append a trailing slash you can add the following before the current rewrite:
# Append trailing slash if omitted
RewriteRule ^(.*(?:^|/)[^/.]+)$ /slug/$1/ [R=301,L]
This requires the /slug/ prefix on the substitution string (unless RewriteBase /slug is set), otherwise the external redirect will attempt to redirect to a file-path, which will "break".
The RewriteRule pattern ^(.*(?:^|/)[^/.]+)$ captures URL-paths that do not already end in a slash and do not contain a dot in the last path segment. This is to avoid matching URLs that already contain (what looks like) a file extension, ie. your static resources (images, CSS, JS, etc.). This should avoid the need for a filesystem check (which is relatively expensive) to confirm that the request does not already map to a file. Although, if you are not referencing any static resources with the /slug/ prefix in the URL then this can be simplified.
NB: You should first test with a 302 (temporary) redirect to avoid potential caching issues.
In context (with the use of RewriteBase):
Options -Indexes
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /slug
# Append trailing slash if omitted
RewriteRule ^(.*(?:^|/)[^/.]+)$ $1/ [R=301,L]
# Dynamic url
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.+) index.php?path=$1 [QSA,L]
The use of RewriteBase avoids you having to specify /slug/ in the other directives.
In the regex ^(.*)$, the start-of-string (^) and end-of-string ($) anchors are superfluous. And you might as well change this to use the + quantifier, since you don't want to match the base directory anyway (saves two additional filesystem checks). The NC flag was also superfluous here.
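For comparison, here is a minimal sketch of the same rules without RewriteBase, as described earlier. It assumes /slug is an ordinary subdirectory under the document root (not mapped with Alias) and uses a 302 while testing:

```apache
Options -Indexes
RewriteEngine On
# No RewriteBase directive in this variant

# Append trailing slash if omitted.
# The external redirect needs the full URL-path, hence the /slug/ prefix.
RewriteRule ^(.*(?:^|/)[^/.]+)$ /slug/$1/ [R=302,L]

# Dynamic URL: the relative substitution is resolved against the
# directory that contains this .htaccess file (/slug).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.+) index.php?path=$1 [QSA,L]
```

Once the redirect behaves as expected, the 302 can be changed to 301.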

Related

Rewrite subdomain and URL-path to URL parameters but allow access to files

I'm struggling with my .htaccess file and setting it up the way I want it. The main function is a website that gets the language from the subdomain and the current page from the subfolders.
Requirements
There are three things I need my .htaccess file to do:
Wildcard subdomain redirected to lang variable
Subfolder(s) redirected to page variable
Local files respected (this is where I'm stuck)
(Bonus) Split up the page variable into segments for each slash; page, sub1, sub2, etc
Examples
en.example.com/hello -> /index.php?lang=en&page=hello
es.example.com/hola -> /index.php?lang=es&page=hola
(Bonus) en.example.com/hello/there/sir -> index.php?lang=en&page=hello&sub1=there&sub2=sir
My current .htaccess
This is my current setup which actually kinda works, if I don't need any local files (lol). This means local images aren't found when my .htaccess below is active. I tried adding RewriteCond %{REQUEST_FILENAME} !-f to respect local files but that breaks the whole file it seems - and I don't know why.
RewriteCond %{REQUEST_URI} ^/$
RewriteCond %{HTTP_HOST} ((?!www).+)\.example\.com [NC]
RewriteRule ^$ /index.php?lang=%1 [L]
RewriteCond %{HTTP_HOST} ((?!www).+)\.example\.com [NC]
RewriteRule ^(.+)$ /index.php?lang=%1&page=$1 [L]
RewriteRule ^index\.php$ - [L]
RewriteRule ^(.*)$ /index.php?page=$1 [L,QSA]
If your URLs don't contain dots then exclude dots from your regex - this naturally excludes real files (that contain a dot before the file extension). This avoids the need for a filesystem check.
Your script should handle /index.php?lang=%1 and /index.php?lang=%1&page= exactly the same, so the first rule is superfluous.
RewriteRule ^index\.php$ - [L]
This rule should be first, not embedded in the middle.
Try the following instead:
RewriteRule ^index\.php$ - [L]
RewriteCond %{HTTP_HOST} ^((?!www).+)\.example\.com [NC]
RewriteRule ^([^.]*)$ /index.php?lang=%1&page=$1 [QSA,L]
RewriteRule ^([^.]*)$ /index.php?page=$1 [QSA,L]
Your last rule that rewrites everything else to index.php, less the lang URL param is questionable. Why not just include this in the preceding rule and validate the language in your script? Which you need to do anyway.
Assuming there is always a subdomain, then your rules could then be reduced to:
RewriteRule ^index\.php$ - [L]
RewriteCond %{HTTP_HOST} ^(.+)\.example\.com [NC]
RewriteRule ^([^.]*)$ /index.php?lang=%1&page=$1 [QSA,L]
Requests for the www language are then validated by your script and defaulted accordingly, as if the lang param was not passed at all (which you need to be doing anyway).
If your subdomain is entirely optional and you are accessing the domain apex then make it optional (with a non-capturing group) in the regex:
RewriteCond %{HTTP_HOST} ^(?:(.+)\.)?example\.com [NC]
:
The lang param would then be empty if the domain apex was requested.
(Bonus) en.example.com/hello/there/sir -> index.php?lang=en&page=hello&sub1=there&sub2=sir
It would be preferable (more efficient, flexible, etc) to do this in your PHP script, not .htaccess.
But in .htaccess you could do something like this (instead of the existing rule):
:
RewriteRule ^([^/.]*)(?:/([^/.]+))?(?:/([^/.]+))?(?:/([^/.]+))?(?:/([^/.]+))?$ /index.php?lang=%1&page=$1&sub1=$2&sub2=$3&sub3=$4&sub4=$5 [QSA,L]
The URL params are empty when that path segment is not present.
It is assumed the URL-path does not end in a slash (the above will not match if it does, so a 404 will result). If a trailing slash needs to be permitted then this should be implemented as a canonical redirect to remove the trailing slash. Or reverse the logic to enforce a trailing slash.
This particular example allows up to 4 additional "sub" path segments, eg. hello/1/2/3/4. You can extend this method to allow up to 8 (since there is a limit of 9 backreferences in the Apache syntax) if required. Any more and you will need to use PHP. (You could potentially handle more using .htaccess, but it will get very messy as you will need to employ additional conditions to capture subsequent path segments.)
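If trailing slashes do turn up in requests, the canonical redirect mentioned above could be sketched like this, placed before the rewrite to index.php. The !-d condition is an added assumption: it skips physical directories, which mod_dir would otherwise redirect back to the slashed URL.

```apache
# Remove a trailing slash from non-directory URLs (302 while testing)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=302,L]
```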
I tried adding RewriteCond %{REQUEST_FILENAME} !-f to respect local files but that breaks the whole file it seems
That should also be sufficient (if dots are permitted in your URLs). But I wonder where you were putting it? It should not "break" anything - it simply prevents the rule from being processed if the request does map to a file - the rule is "ignored".
This is of course assuming you are correctly linking to your resources/static assets using root-relative (starting with a slash) or absolute (starting with scheme + hostname) URLs. If you are using relative URLs then they will probably result in 404s. If this is the case then see my answer to the following question on the Webmasters stack:
https://webmasters.stackexchange.com/questions/86450/htaccess-rewrite-url-leads-to-missing-css
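As a rough sketch of where the !-f condition could go if dots are permitted in your URLs: it belongs directly before the RewriteRule it protects, alongside the host condition. Listing the host condition last is a cautious choice here, so that %1 clearly refers to the hostname capture.

```apache
RewriteRule ^index\.php$ - [L]

# Skip requests that map to an existing file (images, CSS, JS, etc.)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{HTTP_HOST} ^(.+)\.example\.com [NC]
RewriteRule (.*) /index.php?lang=%1&page=$1 [QSA,L]
```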

.htaccess redirect /subpage/ to /subpage?/wp-login

I'm trying to use an .htaccess rule to do the wp-login JS check on first visit by appending ?/wp-login to the URL, since it's interfering with the Sucuri firewall when using password protection.
I've created a test subdomain to try to get the htaccess redirect to work before using it on the live site:
RewriteCond %{QUERY_STRING} ^protectedpage$
RewriteRule ^(.*)$ https://testing.no11.ee/protectedpage?/wp-login [R=302,L]
view here: testing.no11.ee/protectedpage
Unfortunately this does not add the query arg to the url. What am I doing wrong here?
Expected result when visiting page should be https://testing.no11.ee/protectedpage?/wp-login as the browser url.
Full htaccess:
# BEGIN WordPress
# The directives (lines) between "BEGIN WordPress" and "END WordPress" are
# dynamically generated, and should only be modified via WordPress filters.
# Any changes to the directives between these markers will be overwritten.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
RewriteCond %{QUERY_STRING} ^protectedpage$
RewriteRule ^(.*)$ https://testing.no11.ee/protectedpage?/wp-login [R=302,L]
</IfModule>
# END WordPress
RewriteCond %{QUERY_STRING} ^protectedpage$
RewriteRule ^(.*)$ https://testing.no11.ee/protectedpage?/wp-login [R=302,L]
This checks that the QUERY_STRING is set to protectedpage, but in your example it's the URL-path that is /protectedpage, not the query string.
You also need to first check that the query string is not already set to /wp-login, otherwise you'll get a redirect loop.
However, you've also put the code in the wrong place. Note the WordPress comment that precedes the code block - you should not manually edit this code. This directive also needs to go before the WordPress front-controller, otherwise, it's simply never going to get processed.
Try the following instead before the # BEGIN WordPress comment marker:
RewriteCond %{QUERY_STRING} !^/wp-login$
RewriteRule ^(protectedpage)/?$ /$1/?/wp-login [R=302,L]
This matches an optional trailing slash on the requested URL, but it redirects to include the trailing slash in the target URL.
(You do not need to repeat the RewriteEngine on directive.)
No need to include the scheme + hostname if you are redirecting to the same. The $1 backreference simply saves repetition and refers to the matched URL-path, ie. protectedpage (without the trailing slash) in this example.
However, this always redirects and appends ?/wp-login to this URL - not just on the "first visit" - is that really what you require? Otherwise, you need to somehow differentiate between "first" and "subsequent" visits (by detecting a cookie perhaps?)
UPDATE: Minor addition: how would one improve this to add ?/wp-login to all URLs that have the page /subpage/ as parent, i.e. /subpage/page-1 and /subpage/page-2 would result in /subpage/page-1?/wp-login etc? I tried using (.*) but this deletes the subpage from the URL...
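One way to approximate "first visit" detection is a cookie set by the redirect itself. This is only a sketch: the cookie name wp_login_checked is made up for illustration, and whether a cookie is an acceptable signal depends on how the firewall check works.

```apache
# Redirect only when the marker cookie is absent, and set it on the redirect
RewriteCond %{QUERY_STRING} !^/wp-login$
RewriteCond %{HTTP_COOKIE} !\bwp_login_checked=
RewriteRule ^(protectedpage)/?$ /$1/?/wp-login [R=302,L,CO=wp_login_checked:1:testing.no11.ee]
```

With no lifetime given in the CO flag, the cookie lasts for the browser session.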
You could do something like the following:
RewriteCond %{QUERY_STRING} !^/wp-login$
RewriteRule ^(subpage/[^/]+)/?$ /$1/?/wp-login [R=302,L]
The [^/]+ subpattern matches one or more characters except /, so it captures only the second path segment, excluding the optional trailing slash. Using .* instead would capture everything, including any trailing slash, which would result in a double slash in the redirected URL.

.htaccess rewrite rule breaks css and javascript when nonexistent subdirectory or trailing slash is in url

I noticed that adding a trailing / to index.php breaks css and javascript which was explained here- what-happens-when-i-put-a-slash-after-a-php-url
My rewrite rule takes whatever string is after the domain and a forward slash and puts it into the GET variable q. So foo.com/foo works fine and I can access /foo in the GET variable q. How do I get any non existent resource requested in url string to work similarly? Make foo.com/foo/ OR foo.com/foo/foo etc. redirect to index.php and not break css and javascript.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteRule ^(.*)$ ./index.php?q=$1 [NC,L,QSA]
</IfModule>
My rewrite rule works when a query string is appended to index.php or when /index.php is replaced with /foo but breaks js and css when additional directories are added like /foo/ or /foo/foo. How does the rule need to be written to prevent this?
It appears that you use relative links for including your javascript or css files.
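So, all you have to do is make your file addresses root-relative (starting with a slash), like /js/search.js.
Your rewrite rule is correct and it won't forward the actual files to your index.php file. But a relative reference such as src='js/search.js' is resolved by the browser against the current URL-path, so on /index.php/ it becomes /index.php/js/search.js (and on /foo/foo it becomes /foo/js/search.js), which is not an actual file address.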

Custom wordpress HTTP redirects to HTTPS breaks website

I have added extra rules in my .htaccess file.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule !^/articles/(.*)$ https://%{HTTP_HOST}/index.php?name=$1 [L,R=301]
RewriteRule ^(articles/)(.*)$ https://%{HTTP_HOST}/index.php?name=$2 [L,R=301]
</IfModule>
I have two conditions which seem to work:
http://example.com/this-is-old-url/ meets the first rewrite rule
http://example.com/articles/this-is-old-url/ meets the second rule
But the problem is that after these rules are added my website starts redirecting everywhere. For example, I can't access my domain https://example.com, which changes to https://example.com/index.php?name=. All my website links break.
What am I doing wrong?
RewriteCond %{HTTPS} off
RewriteRule !^/articles/(.*)$ https://%{HTTP_HOST}/index.php?name=$1 [L,R=301]
RewriteRule ^(articles/)(.*)$ https://%{HTTP_HOST}/index.php?name=$2 [L,R=301]
The first rule matches everything, since the URL-path that is matched by the RewriteRule pattern in an .htaccess context never starts with a slash (you appear to have recognised this in your second rule). You also can't have capturing groups in negated patterns - by definition, a negated pattern doesn't actually match anything.
And the second rule executes unconditionally, for both HTTP and HTTPS, since RewriteCond directives only apply to the first RewriteRule that follows. (But these two rules can be combined into one anyway.)
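To illustrate the point about RewriteCond scope, keeping the two rules separate would mean repeating the condition before each one. A sketch, using 302 while testing:

```apache
# The HTTPS condition must precede every rule it should apply to
RewriteCond %{HTTPS} off
RewriteRule ^articles/([\w-]+/)$ https://%{HTTP_HOST}/index.php?name=$1 [R=302,L]

RewriteCond %{HTTPS} off
RewriteRule ^([\w-]+/)$ https://%{HTTP_HOST}/index.php?name=$1 [R=302,L]
```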
To externally "redirect" (as per your example) a URL of the form http://example.com/this-is-old-url/ (or http://example.com/articles/this-is-old-url/) to https://example.com/index.php?name=this-is-old-url then you can do the following at the top of your .htaccess file:
RewriteCond %{HTTPS} off
RewriteRule ^(?:articles/)?([\w-]+/)$ https://%{HTTP_HOST}/index.php?name=$1 [R=301,L]
UPDATE: the regex now includes - (hyphens) and trailing slash. Note that the trailing slash is included as part of the capturing group. So, it redirects to /index.php?name=this-is-old-url/ (with the trailing slash).
The subpattern (?:articles/)? is optional and non-capturing. So, this one rule matches both your example URLs.
The <IfModule> wrapper is not required. And neither is the RewriteEngine directive, since you presumably already have that in the WordPress code block that follows.
You don't necessarily need to expose the index.php, you could redirect to /?name=... instead, assuming the DirectoryIndex is set correctly.
You will need to clear your browser cache before testing. (Preferably test first with 302 - temporary - redirects.)
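A variant that does not expose index.php might look like the following sketch, assuming DirectoryIndex resolves / to index.php:

```apache
RewriteCond %{HTTPS} off
RewriteRule ^(?:articles/)?([\w-]+/)$ https://%{HTTP_HOST}/?name=$1 [R=302,L]
```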
I can't access my domain https://example.com which changes to https://example.com/index.php?name=.
That couldn't have happened with your existing rules, only if you'd requested http://example.com (note HTTP, not HTTPS).

Removing slash in url (not trailing slash)

I have a folder named people on my server, and index.php in that folder
My url is like mydomain.com/people/?name=value1&age=value2
But I really want it to look like mydomain.com/people?name=value1&age=value2
Since "people" is a folder and your script is in that folder, the only way for this to work is if you turn off DirectorySlash, which automatically redirects the browser to include a trailing slash for any request that's for a folder.
Note, this is a trailing slash, the URI ends with /people/. The ? and everything after it is the query string.
Turning off DirectorySlash can be dangerous, because the trailing-slash redirect also helps prevent information disclosure. With it off (and with directory listings enabled via mod_autoindex), requesting a folder without the trailing slash can display the contents of that folder even if you have a directory index. In other words, index.php is ignored and instead you get a listing of all your folder's contents. So to prevent that from happening, you have to internally add the slash back.
So something like this in the htaccess file of your document root:
DirectorySlash Off
RewriteEngine On
RewriteCond %{THE_REQUEST} \ /+people/\?
RewriteRule ^ /people [L,R]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ /$1/ [L]
Using mod_rewrite you can do it like this:
RewriteEngine On
RewriteRule ^(.+)/$ $1 [L,R,QSA]
QSA (Query String Append) is not required here: when the substitution contains no query string of its own, the original query string is passed through unchanged by default.
