.htaccess redirect urls with spaces - php

I am trying to redirect all invalid urls to my index.php file via my .htaccess file.
Unfortunately I keep getting an Apache error.
My .htaccess file
RewriteEngine on
RewriteCond %{REQUEST_URI} !\.(?:css|js|jpe?g|gif|png)$ [NC]
RewriteRule ^([a-zA-Z0-9\-\_\/]*)$ index.php?p=$1
RewriteRule ^([A-Za-z0-9\s]+)$ index.php?p=$1 [L] 
This invalid URL should redirect to index.php:
/vacatures/jobapplication/facility-manager%20qsdf
Instead, Apache returns a 404 "Object not found" error.

The rule you have which allows spaces does not allow hyphens. The rule you have which allows hyphens does not allow spaces. So anything which includes both will not match either.
Your invalid URL facility-manager%20qsdf includes both.
My guess is that your RewriteCond is supposed to apply to both rules, but that is not what happens: a RewriteCond applies only to the first RewriteRule after it. You can solve all these problems by using a single RewriteRule and amending it to accept everything you want:
RewriteRule "^([A-Za-z0-9\-\_\/\s]+)$" index.php?p=$1 [L]
Note that this requires at least one of the characters in your character class, in other words it will not match your "home" location when there is no path ("http://somewhere.com/"). If you want to also match for that location, change the + to a *, to allow 0 or more character matches.
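As a rough check of the reasoning above, the three character classes can be tried in Python, whose re module handles these particular patterns the same way as the PCRE engine used by mod_rewrite. The test path is an assumption: it is the URL from the question without its leading slash and with %20 already decoded to a space.

```python
import re

# The two patterns from the question and the combined one from this answer.
HYPHEN_RULE = re.compile(r"^([a-zA-Z0-9\-\_\/]*)$")
SPACE_RULE = re.compile(r"^([A-Za-z0-9\s]+)$")
COMBINED_RULE = re.compile(r"^([A-Za-z0-9\-\_\/\s]+)$")

# Assumed input: URL-path without the leading slash, %20 decoded to a space.
path = "vacatures/jobapplication/facility-manager qsdf"

hyphen_hit = HYPHEN_RULE.match(path) is not None      # fails on the space
space_hit = SPACE_RULE.match(path) is not None        # fails on "/" and "-"
combined_hit = COMBINED_RULE.match(path) is not None  # accepts all of them

# With "+", the empty path (the home page) is rejected; "*" accepts it.
empty_with_plus = COMBINED_RULE.match("") is not None
empty_with_star = re.match(r"^([A-Za-z0-9\-\_\/\s]*)$", "") is not None

print(hyphen_hit, space_hit, combined_hit, empty_with_plus, empty_with_star)
```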

Your rewrite rules do not match the url you indicated. Your REQUEST_URI is
/vacatures/jobapplication/facility-manager%20qsdf
mod_rewrite matches the RewriteRule pattern against the %-decoded URL-path, so the %20 is seen as a literal space. Neither of your patterns accepts a path that contains both a hyphen and a space, so no rule matches. I'm not sure why you're using two RewriteRules. Why not do something like this?
RewriteEngine on
RewriteCond %{REQUEST_URI} !\.(?:css|js|jpe?g|gif|png)$ [NC]
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule ^(.*)$ index.php?p=$1 [L]
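A small sketch of the static-asset condition only, in Python. The helper name goes_to_index is made up for illustration, re.IGNORECASE stands in for the [NC] flag, and the index.php condition is deliberately left out.

```python
import re

# Same expression as the first RewriteCond; re.IGNORECASE plays the role of [NC].
STATIC_ASSET = re.compile(r"\.(?:css|js|jpe?g|gif|png)$", re.IGNORECASE)


def goes_to_index(request_uri: str) -> bool:
    """Return True when the request would be handed to index.php.

    Hypothetical helper: it models only the negated extension condition.
    """
    return STATIC_ASSET.search(request_uri) is None


print(goes_to_index("/vacatures/jobapplication/facility-manager qsdf"))
print(goes_to_index("/img/logo.PNG"))
```

Because the expression is anchored with $, a path such as /notes.json is not treated as a .js file and still goes to index.php.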

Related

Rewrite subdomain and URL-path to URL parameters but allow access to files

I'm struggling with my .htaccess file and setting it up the way I want it. The main function is a website that gets the language from the subdomain and the current page from the subfolders.
Requirements
I have three requirements that I need my .htaccess file to meet:
Wildcard subdomain redirected to lang variable
Subfolder(s) redirected to page variable
Local files respected (this is where I'm stuck)
(Bonus) Split up the page variable into segments for each slash; page, sub1, sub2, etc
Examples
en.example.com/hello -> /index.php?lang=en&page=hello
es.example.com/hola -> /index.php?lang=es&page=hola
(Bonus) en.example.com/hello/there/sir -> index.php?lang=en&page=hello&sub1=there&sub2=sir
My current .htaccess
This is my current setup which actually kinda works, if I don't need any local files (lol). This means local images aren't found when my .htaccess below is active. I tried adding RewriteCond %{REQUEST_FILENAME} !-f to respect local files but that breaks the whole file it seems - and I don't know why.
RewriteCond %{REQUEST_URI} ^/$
RewriteCond %{HTTP_HOST} ((?!www).+)\.example\.com [NC]
RewriteRule ^$ /index.php?lang=%1 [L]
RewriteCond %{HTTP_HOST} ((?!www).+)\.example\.com [NC]
RewriteRule ^(.+)$ /index.php?lang=%1&page=$1 [L]
RewriteRule ^index\.php$ - [L]
RewriteRule ^(.*)$ /index.php?page=$1 [L,QSA]
If your URLs don't contain dots then exclude dots from your regex - this naturally excludes real files (that contain a dot before the file extension). This avoids the need for a filesystem check.
Your script should handle /index.php?lang=%1 and /index.php?lang=%1&page= exactly the same, so the first rule is superfluous.
RewriteRule ^index\.php$ - [L]
This rule should be first, not embedded in the middle.
Try the following instead:
RewriteRule ^index\.php$ - [L]
RewriteCond %{HTTP_HOST} ^((?!www).+)\.example\.com [NC]
RewriteRule ^([^.]*)$ /index.php?lang=%1&page=$1 [QSA,L]
RewriteRule ^([^.]*)$ /index.php?page=$1 [QSA,L]
Your last rule, which rewrites everything else to index.php without the lang URL parameter, is questionable. Why not fold this into the preceding rule and validate the language in your script, which you need to do anyway?
Assuming there is always a subdomain, then your rules could then be reduced to:
RewriteRule ^index\.php$ - [L]
RewriteCond %{HTTP_HOST} ^(.+)\.example\.com [NC]
RewriteRule ^([^.]*)$ /index.php?lang=%1&page=$1 [QSA,L]
Requests for the www language are then validated by your script and defaulted accordingly, as if the lang param was not passed at all (which you need to be doing anyway).
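To see how the host and path expressions behave, here is a minimal Python sketch. The rewrite function is a hypothetical stand-in for the condition and rule pair, not Apache's actual processing, and it ignores QSA.

```python
import re

# Expressions copied from the rules above; re.IGNORECASE mirrors [NC].
HOST_RULE = re.compile(r"^(.+)\.example\.com", re.IGNORECASE)
HOST_RULE_NO_WWW = re.compile(r"^((?!www).+)\.example\.com", re.IGNORECASE)
PATH_RULE = re.compile(r"^([^.]*)$")


def rewrite(host, path):
    """Hypothetical model of the RewriteCond + RewriteRule pair.

    `path` is the URL-path without its leading slash, as seen in .htaccess.
    Returns the rewritten target, or None when the rule does not apply.
    """
    host_match = HOST_RULE.match(host)
    path_match = PATH_RULE.match(path)
    if host_match is None or path_match is None:
        return None
    return f"/index.php?lang={host_match.group(1)}&page={path_match.group(1)}"


print(rewrite("en.example.com", "hello"))
print(rewrite("en.example.com", "img/logo.png"))
```

The second call returns None because the path contains a dot, which is how real files are left alone without a filesystem check.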
If your subdomain is entirely optional and you are accessing the domain apex then make it optional (with a non-capturing group) in the regex:
RewriteCond %{HTTP_HOST} ^(?:(.+)\.)?example\.com [NC]
:
The lang param would then be empty if the domain apex was requested.
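The optional subdomain group can be checked the same way. In this sketch lang_from_host is an illustrative name, and the conversion of an unmatched group to an empty string imitates the empty lang parameter described above.

```python
import re

# Optional subdomain: the non-capturing group wraps the capture and the dot.
OPTIONAL_SUBDOMAIN = re.compile(r"^(?:(.+)\.)?example\.com", re.IGNORECASE)


def lang_from_host(host):
    """Return the captured subdomain, "" for the apex, None for other hosts."""
    match = OPTIONAL_SUBDOMAIN.match(host)
    if match is None:
        return None
    # Python reports an unmatched optional group as None; treat it as empty.
    return match.group(1) or ""


print(repr(lang_from_host("en.example.com")))
print(repr(lang_from_host("example.com")))
```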
(Bonus) en.domain.com/hello/there/sir -> index.php?lang=en&page=hello&sub1=there&sub2=sir
It would be preferable (more efficient, flexible, etc) to do this in your PHP script, not .htaccess.
But in .htaccess you could do something like this (instead of the existing rule):
:
RewriteRule ^([^/.]*)(?:/([^/.]+))?(?:/([^/.]+))?(?:/([^/.]+))?(?:/([^/.]+))?$ /index.php?lang=%1&page=$1&sub1=$2&sub2=$3&sub3=$4&sub4=$5 [QSA,L]
The URL params are empty when that path segment is not present.
It is assumed the URL-path does not end in a slash (the above will not match if it does, so a 404 will result). If a trailing slash needs to be permitted then this should be implemented as a canonical redirect to remove the trailing slash. Or reverse the logic to enforce a trailing slash.
This particular example allows up to 4 additional "sub" path segments, eg. hello/1/2/3/4. You can extend this method to allow up to 8 (since there is a limit of 9 backreferences in the Apache syntax) if required. Any more and you will need to use PHP. (You could potentially handle more using .htaccess, but it will get very messy as you will need to employ additional conditions to capture subsequent path segments.)
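The sketch below runs the same segment pattern in Python and, for comparison, shows what the script-side alternative could look like. Both function names are hypothetical, and the split version is only one possible way to do it in a script (the answer suggests PHP, Python is used here purely for illustration).

```python
import re

# Pattern copied from the RewriteRule above: page plus up to four sub segments.
SEGMENTS = re.compile(
    r"^([^/.]*)(?:/([^/.]+))?(?:/([^/.]+))?(?:/([^/.]+))?(?:/([^/.]+))?$"
)
NAMES = ("page", "sub1", "sub2", "sub3", "sub4")


def params_from_regex(path):
    """Model the rule: missing segments become empty parameters."""
    match = SEGMENTS.match(path)
    if match is None:
        return None
    return {name: value or "" for name, value in zip(NAMES, match.groups())}


def params_from_split(path):
    """Script-side alternative with no fixed upper limit on segments."""
    segments = path.split("/")
    params = {"page": segments[0]}
    for index, segment in enumerate(segments[1:], start=1):
        params[f"sub{index}"] = segment
    return params


print(params_from_regex("hello/there/sir"))
print(params_from_regex("hello/"))  # trailing slash: no match
print(params_from_split("a/b/c/d/e/f"))
```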
I tried adding RewriteCond %{REQUEST_FILENAME} !-f to respect local files but that breaks the whole file it seems
That should also be sufficient (if dots are permitted in your URLs). But I wonder where you were putting it? It should not "break" anything - it simply prevents the rule from being processed if the request does map to a file - the rule is "ignored".
This is of course assuming you are correctly linking to your resources/static assets using root-relative (starting with a slash) or absolute (starting with scheme + hostname) URLs. If you are using relative URLs then they will probably result in 404s. If this is the case then see my answer to the following question on the Webmasters stack:
https://webmasters.stackexchange.com/questions/86450/htaccess-rewrite-url-leads-to-missing-css

Set .htaccess rules to specify different folders

I want to set a rule in .htaccess if I enter in the url www.mydomain.com/compare.php set 'public_html' as root otherwise anything come in the url set root as 'public' folder.
RewriteRule ^(?!compare-source.php).*)$ public/$1 [L]
I want to achieve following result.
if url is www.mydomain.com/compare.php hit following file.
public_html/compare.php
if urls are www.mydomain.com/ OR www.mydomain.com/home etc hit following file.
public_html/public/index.php
I am weak in regex and in these apache rules always :-( can someone give me the solution with good description?
Your answers are welcome, please can you describe how this crazy things work in detail. Thanks.
Try:
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/public
RewriteRule ^((?!compare\.php).*)$ /public/$1 [L]
The RewriteEngine directive enables or disables the runtime rewriting engine.
The RewriteCond directive defines a rule condition. The RewriteRule that follows is used only if this condition is met; in our case, if REQUEST_URI (the path component of the requested URL) does not (because of !) begin (because of ^) with /public. We need this condition because we don't want to rewrite an already rewritten URL; that would cause a loop and a 500 Internal Server Error.
Finally, the RewriteRule matches the regex Pattern (^((?!compare\.php).*)$) against the part of the URL after the hostname and port, without the leading slash. If the pattern matches, the Substitution (/public/$1) replaces the original URL-path.
In plain language, if URL path does not begin with compare.php (because of ?!), pick everything (.*) between beginning (^) and end ($) and place it in variable $1. Then replace the original URL path with /public/$1.
@anubhava's answer is also correct; he just placed both conditions in the RewriteRule. It could also be written more readably as:
RewriteCond %{REQUEST_URI} !^/public
RewriteCond %{REQUEST_URI} !^/compare\.php
RewriteRule ^(.*)$ /public/$1 [L]
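To make the lookahead and the loop guard concrete, here is a small Python model of the first rule set. The rewrite function is hypothetical and assumes the request URI always starts with a single slash.

```python
import re

# Pattern from the RewriteRule: anything that does not start with compare.php.
LOOKAHEAD_RULE = re.compile(r"^((?!compare\.php).*)$")


def rewrite(request_uri):
    """Hypothetical model of the RewriteCond + RewriteRule pair."""
    # RewriteCond: leave anything already under /public alone (loop guard).
    if re.search(r"^/public", request_uri):
        return request_uri
    # In .htaccess the rule sees the path without its leading slash.
    match = LOOKAHEAD_RULE.match(request_uri[1:])
    if match is None:
        return request_uri
    return "/public/" + match.group(1)


print(rewrite("/compare.php"))
print(rewrite("/home"))
print(rewrite(rewrite("/home")))  # a second pass changes nothing
```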
You can use this .htaccess in site root:
RewriteEngine On
# route /home/ or /home to /
RewriteRule ^home/?$ / [L,NC]
# if not compare-source.php or public/* then route to /public/*
RewriteRule ^((?!public/|compare-source\.php$).*)$ public/$1 [L,NC]

URL Rewriting .php to .html

I'm converting my urls extension from .php to .html in my .htaccess:
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} (.).php
RewriteRule ^(.*).php $1.html [R=301,L]
RewriteRule ^(.*).html $1.php [L]
FallbackResource /index.php
The problem is that I have some section with the word "php" on it:
www.mywebsite.com/phpscripts.php
And when it is converted:
www.mywebsite.com/htmlscripts.html
^(.*).php
This regex says: anything, including nothing, followed by a single arbitrary character, followed by "php". For example, it'll match "blahphp.html", specifically it'll match the "blahphp" part and not care about the extension at all.
What you're looking for is this:
RewriteRule (.+)\.php$ $1.html [R=301,L]
RewriteRule (.+)\.html$ $1.php [L]
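(.+)\.php$ is something (at least one character) followed by a period followed by "php" at the end of the string. Be careful about dropping the RewriteCond, though: in .htaccess the rules run again after the internal rewrite to .php, so a condition on %{THE_REQUEST} (with the dot escaped, e.g. \.php) is what stops the first rule from redirecting that rewritten request back to .html.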
Also note that you should be changing all your HTML files to link to href="...html". Don't rely on these redirects alone to fix your problem; not only is it inefficient to redirect every single request, it'll also break POST requests. It's only acceptable to redirect clients which try to open the old URLs for whatever reason to the new canonical URLs.
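The difference between the unescaped and escaped dot is easy to check in Python, whose re module treats these two patterns the same way. The file names are the examples used above.

```python
import re

LOOSE = re.compile(r"^(.*).php")    # original: the dot matches any character
STRICT = re.compile(r"(.+)\.php$")  # corrected: literal dot, anchored at the end

loose_match = LOOSE.search("blahphp.html")
strict_miss = STRICT.search("blahphp.html")
strict_hit = STRICT.search("phpscripts.php")

print(loose_match.group(0))              # the part the loose pattern matched
print(strict_miss)                       # the strict pattern ignores this name
print(strict_hit.group(1) + ".html")     # redirect target for a real .php URL
```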

Rewrite automatically removes backslash if there's more than one?

I have a very simple url rewriting rules:
RewriteEngine on
RewriteCond %{HTTP_HOST} !script.php
RewriteRule ^test/(.*) script.php?q=$1
The idea is to have this kind of urls: http://mywebsite.com/test/http://example.com
and then send http://example.com to the script.php as a query parameter. The problem is that I'm receiving http:/example.com instead of http://example.com. Also, http:////example.com would be sent as http:/example.com. What causes this behavior ?
Apache's mod_rewrite engine collapses multiple slashes (///...) into a single / before matching the pattern in a RewriteRule directive. However, if you match it using a RewriteCond, you can match multiple slashes.
You can use rule like this:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} ^/+test/+(https?://.+)$ [NC]
RewriteRule ^ script.php?q=%1 [L,QSA]
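The condition's expression can be exercised on its own in Python. This sketch only checks the regular expression; it assumes, as the answer states, that the variable tested by the RewriteCond still contains the repeated slashes. target_for is an illustrative name.

```python
import re

# Expression from the RewriteCond; re.IGNORECASE mirrors [NC].
COND = re.compile(r"^/+test/+(https?://.+)$", re.IGNORECASE)


def target_for(request_uri):
    """Return the rewritten target, or None when the condition fails."""
    match = COND.match(request_uri)
    if match is None:
        return None
    return "script.php?q=" + match.group(1)


print(target_for("/test/http://example.com"))
print(target_for("/test/http:/example.com"))  # collapsed slashes: no match
```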
This behaviour comes from path normalisation. A sequence of / is contracted into a single /, because it is still essentially a path: ///// does not change the directory we are in, so a single / is equivalent.
You have two options:
Change your links to use a query string instead. If you rewrite test/?q=something to script.php?q=something everything works as expected. You would do the following:
RewriteRule ^test/?$ script.php [L]
Since you don't alter the query string, the original query string is automatically copied to the new query string.
Don't make an assumption on how many slashes you will encounter. The url might not look correctly in the url bar of the browser, but if it is just a redirect, it will only be visible for a very short period of time.
RewriteRule ^test/(http|https):/+(.*)$ script.php?q=$1://$2
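A short Python sketch of that last rule shows how the scheme and the rest of the URL are captured separately and glued back together with exactly two slashes. target_for is a hypothetical helper, and the input is the path without its leading slash.

```python
import re

# Pattern from the RewriteRule: scheme, one or more slashes, then the rest.
RULE = re.compile(r"^test/(http|https):/+(.*)$")


def target_for(path):
    """Rebuild the URL as $1://$2, however many slashes arrived."""
    match = RULE.match(path)
    if match is None:
        return None
    return f"script.php?q={match.group(1)}://{match.group(2)}"


print(target_for("test/http:/example.com"))
print(target_for("test/https:////example.com/a"))
```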

rewrite a folder so all files display under new rewrite rule

Is there a way that I can rewrite a folder so that all the files under that folder follow the same rule? For example:
if i have a folder with say 5 php files (a.php, b.php, c.php, d.php, index.php) in it and i use the following rule:
RewriteRule ^products/storage/?$ /content/products/storage/index.php [QSA,L]
is there a way that I can get all the files to show to be accessed like: site.com/products/apples/a.php, site.com/products/apples/b.php, etc. without having to write a rule for each one?
I tried the following but it didnt work.
RewriteRule ^products/storage/?$ /content/products/storage/ [QSA,L]
I also need it to NOT overwrite my other rules such as:
RewriteRule ^products/storage/?$ /content/products/storage/product-name1/ [QSA,L]
RewriteRule ^products/storage/?$ /content/products/storage/product-name2/ [QSA,L]
any ideas?
Your problem is the trailing $ on the end of the regex. This will only allow a match if the full URI matches products/storage (with optional trailing slash) exactly. Instead, try the following and note the absence of the trailing $ character:
RewriteRule ^products/storage/? /content/products/storage/ [QSA,L]
This will match anything that starts with products/storage (with optional trailing slash). Alternatively, if you wanted to capture and re-use everything in the URI that followed products/storage/ you could try:
RewriteRule ^products/storage(/?.+)?$ /content/products/storage$1 [QSA,L]
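The effect of the trailing $ and of the capturing group can be compared in Python. target_for is a made-up helper that mimics the substitution, with an unmatched group treated as an empty $1.

```python
import re

EXACT = re.compile(r"^products/storage/?$")         # rule from the question
PREFIX = re.compile(r"^products/storage(/?.+)?$")   # capturing variant


def target_for(path):
    """Mimic /content/products/storage$1 for the capturing rule."""
    match = PREFIX.match(path)
    if match is None:
        return None
    return "/content/products/storage" + (match.group(1) or "")


print(EXACT.match("products/storage/a.php"))   # the anchored rule misses files
print(target_for("products/storage/a.php"))
print(target_for("products/storage"))
```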
UPDATE
Should you need to preserve other RewriteRules as your updated question suggests, you should look to add a RewriteCond condition like so:
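RewriteCond %{REQUEST_URI} !^/products/storage/?$
RewriteRule ^products/storage(/?.+)?$ /content/products/storage$1 [QSA,L]
The RewriteCond lets the RewriteRule run only when the request is not exactly products/storage (with or without a trailing slash), so that URL is left for your other rules.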
