URL Rewriting .php to .html - php

I'm converting my URLs' extension from .php to .html in my .htaccess:
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} (.).php
RewriteRule ^(.*).php $1.html [R=301,L]
RewriteRule ^(.*).html $1.php [L]
FallbackResource /index.php
The problem is that I have some pages with the word "php" in their names:
www.mywebsite.com/phpscripts.php
And when it is converted:
www.mywebsite.com/htmlscripts.html

^(.*).php
This regex says: anything, including nothing, followed by a single arbitrary character, followed by "php". For example, it'll match "blahphp.html", specifically it'll match the "blahphp" part and not care about the extension at all.
What you're looking for is this:
RewriteRule (.+)\.php$ $1.html [R=301,L]
RewriteRule (.+)\.html$ $1.php [L]
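(.+)\.php$ is something (at least one character) followed by a period followed by "php" at the end of the string. Keep a condition on %{THE_REQUEST} in front of the first rule, though: in .htaccess the internally rewritten .php URL passes through the rules again, and that condition is what stops the first rule from redirecting it straight back to .html. Escape the dot there as well, e.g. RewriteCond %{THE_REQUEST} \.php[\s?]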
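To see the difference between the two patterns outside Apache, here is a minimal sketch in Python. It assumes Python's re behaves like Apache's PCRE for these simple patterns, and the helper name matched_text is made up for illustration. RewriteRule patterns are unanchored unless they say otherwise, so re.search is the closest analogue.

```python
import re

# Patterns transcribed from the rules discussed above.
BROKEN = re.compile(r"^(.*).php")   # unescaped dot, no end anchor
FIXED = re.compile(r"(.+)\.php$")   # literal dot, anchored at the end


def matched_text(pattern, path):
    """Return the substring the pattern matches in path, or None."""
    m = pattern.search(path)
    return m.group(0) if m else None


for path in ("phpscripts.php", "blahphp.html", "index.php"):
    print(path,
          "| broken:", matched_text(BROKEN, path),
          "| fixed:", matched_text(FIXED, path))
```

The broken pattern matches the "blahphp" part of "blahphp.html", while the fixed one only matches paths that really end in ".php".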
Also note that you should be changing all your HTML files to link to href="...html". Don't rely on these redirects alone to fix your problem; not only is it inefficient to redirect every single request, it'll also break POST requests. It's only acceptable to redirect clients which try to open the old URLs for whatever reason to the new canonical URLs.

Related

htaccess infinite loop issue

I have run into an issue with my .htaccess file.
The file changes the ugly URL such as http://localhost/news.php?article_slug=example-1 to http://localhost/news/example-1
This works perfectly, but when I go to http://localhost/news I get a 404 error.
Within news.php I have a redirect, so if there is no article slug in the URL it redirects to latest.php.
My PHP code in news.php:
$article_slug=$_GET['article_slug'];
if (empty($_GET)) {
header("Location: ../latest.php");
die();// no data passed by get
}
This is what I currently have in my .htaccess file:
Options -MultiViews
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
RewriteRule ^news/([\w\d-]+)$ /news.php?article_slug=$1[QSA,L]
RewriteCond %{QUERY_STRING} article_slug=([\w\d-]+)
RewriteRule ^news$ /news/%1 [R,L]
RewriteRule ^category/([\w-]+)$ /category.php?category_slug=$1&page=$2 [QSA]
When I try to debug this myself (with very little knowledge) and add the following line, it redirects to latest.php:
RewriteRule ^([^/]*)/?$ /news.php [L,QSA]
but on the redirected page I get the following error
The page isn’t redirecting properly
Firefox has detected that the server is redirecting the request for this
address in a way that will never complete.
This problem can sometimes be caused by disabling or refusing to accept cookies
When I use the developer tools in Firefox, as IMSoP commented, all I see is latest.php reloaded multiple times.
This is not isolated to latest.php; it happens with any file on the server that's not listed in the .htaccess file.
When I remove the line
RewriteRule ^([^/]*)/?$ /news.php [L,QSA]
I can load the PHP file, but news.php doesn't redirect: http://localhost/news is not found, while http://localhost/news/example-1 works.
"but when I go to http://localhost/news I get a 404 error."
None of your rules catch such a request, so no rewriting occurs and you get a 404.
RewriteRule ^([^/]*)/?$ /news.php [L,QSA]
This rewrites everything to /news.php, including /latest.php that you are redirecting to in your PHP script and the cycle repeats, resulting in a redirect loop to /latest.php.
However, this redirect in your PHP code also seems to assume there is a slash on the original request, i.e. it should be /news/ (with trailing slash), not /news (no trailing slash) as you state in the question.
It would be better to redirect to a root-relative (or absolute) URL in your PHP script, i.e. header('Location: /latest.php');
RewriteRule ^news/([\w\d-]+)$ /news.php?article_slug=$1[QSA,L]
RewriteCond %{QUERY_STRING} article_slug=([\w\d-]+)
RewriteRule ^news$ /news/%1 [R,L]
(Note you are missing a space before the "flags" argument in the first rule.)
The first rule can be modified to allow news/ and not just news/<something>. This is achieved by simply changing the quantifier from + (1 or more) to * (0 or more) on the capturing subpattern.
The second rule is not currently doing anything. You should probably be targeting news.php here. But the rules are also in the wrong order.
As noted in my answer to your earlier question, the \d shorthand character class is not necessary, since \w (word characters) already includes digits.
Try the following instead:
RewriteCond %{QUERY_STRING} (?:^|&)article_slug=([\w-]*)(?:$|&)
RewriteRule ^news\.php$ /news/%1 [R=301,L]
RewriteRule ^news/([\w-]*)$ /news.php?article_slug=$1 [QSA,L]
Note, the first rule should be a 301 (permanent) redirect, not a 302 (as it was initially). But always test first with a 302 to avoid potential caching issues.
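As a rough check of what the query-string condition above captures, here is a small Python sketch. It assumes Python's re with the re.ASCII flag is close enough to Apache's PCRE for this pattern; extract_slug is a hypothetical helper name, not anything from Apache.

```python
import re

# Same pattern as the RewriteCond; re.ASCII keeps \w to [A-Za-z0-9_].
SLUG = re.compile(r"(?:^|&)article_slug=([\w-]*)(?:$|&)", re.ASCII)


def extract_slug(query_string):
    """Return what %1 would hold for this query string, or None if no match."""
    m = SLUG.search(query_string)
    return m.group(1) if m else None


for qs in ("article_slug=example-1",
           "page=2&article_slug=abc&x=1",
           "my_article_slug=abc",
           "article_slug="):
    print(repr(qs), "->", repr(extract_slug(qs)))
```

The (?:^|&) and (?:$|&) parts are what stop a parameter such as my_article_slug from being picked up by accident.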
This allows requests to /news/, but not /news (no trailing slash). Only one of these can be canonical. If you need to handle /news as well then you should redirect to append the trailing slash (so /news/ is canonical, and the URL you should always link to.) For example, before the above two rules:
# Append trailing slash if omitted (canonical redirect)
RewriteRule ^news$ /$0/ [R=301,L]
Summary
Bringing the above points together:
Options -MultiViews
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
# Append trailing slash if omitted (canonical redirect)
RewriteRule ^news$ /$0/ [R=301,L]
RewriteCond %{QUERY_STRING} (?:^|&)article_slug=([\w-]*)(?:$|&)
RewriteRule ^news\.php$ /news/%1 [R=301,L]
RewriteRule ^news/([\w-]*)$ /news.php?article_slug=$1 [QSA,L]
RewriteRule ^category/([\w-]+)$ /category.php?category_slug=$1 [QSA,L]
However, the same now applies to your /category URL. This was also discussed in your earlier question. (I've removed the superfluous &page=$2 part and added the missing L flag.)
If you have many such URLs that follow a similar pattern (eg. news and category etc.) you don't necessarily need a separate rule for each. (An exercise for the reader.)
UPDATE:
$article_slug=$_GET['article_slug'];
if (empty($_GET)) {
header("Location: ../latest.php");
die();// no data passed by get
}
As discussed in comments, this should read:
$article_slug = $_GET['article_slug'] ?? null;
if (empty($article_slug)) {
header("Location: /latest.php");
die(); // no data passed by get
}
Rewrite rules are at heart very simple: they match the requested URL against a pattern, and then define what to do if it matches.
RewriteRule ^([^/]*)/?$ /news.php [L,QSA]
The pattern here translates as "must match right from the start; anything other than a slash, zero or more times; optional slash; must match right to the end". The action if it matches is to act as though the request was for "/news.php", adding on any query string parameters.
That's a very broad pattern; it will match "news" and "news/" but it will also match "hello-world", and "__foo--bar..baz/". The only thing that would stop it matching is other rules higher up your config file.
Meanwhile, every time this rule matches, your PHP code in news.php will run, and if there isn't anything on the query string, will tell the browser to request "latest.php".
But the rule will also match "latest.php". So when the browser requests "latest.php", the code in "news.php" gets run, and tells the browser to request "latest.php" again ... and we have an infinite loop.
The simplest fix is just to make your rule more specific, e.g. look specifically for the word "news":
RewriteRule ^news/?$ /news.php [L,QSA]
Another common technique is to add a condition to the rule that it only matches if the URL doesn't match a real filename, like this:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]*)/?$ /news.php [L,QSA]
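The loop can be reproduced outside Apache with a short Python sketch. It is a simplification under stated assumptions: every request that the pattern sends to news.php arrives without a query string, so the PHP code always answers with a redirect to latest.php. The follow function and the max_hops limit are made up for illustration.

```python
import re

BROAD = re.compile(r"^([^/]*)/?$")   # the rule that causes the loop
NARROW = re.compile(r"^news/?$")     # the more specific rule


def follow(pattern, path, max_hops=5):
    """Count redirects until a page is served, giving up after max_hops."""
    hops = 0
    while hops < max_hops:
        if not pattern.search(path):
            return hops          # not rewritten to news.php, served directly
        # news.php runs without a slug and redirects the browser to latest.php
        path = "latest.php"
        hops += 1
    return hops


print("broad rule, /news: ", follow(BROAD, "news"))    # hits the hop limit
print("narrow rule, /news:", follow(NARROW, "news"))   # one redirect, then served
```

With the broad pattern the request never settles, because "latest.php" itself matches the pattern. With the narrow one, latest.php is served after a single redirect.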

How to rewrite URLs with .htaccess (removing ?id=1)

I've looked all over and have yet to figure out how to make this work, I'm sure it's very simple for someone who knows what they're doing. For starters, I have made sure that mod_rewrite is enabled and removing .php extensions is working so I know that isn't the issue.
Currently I'm working on a forum, and what I'm trying to do is simply remove the ?id=1 aspect of the URL, so basically making the URL look like such:
http://url.com/Forum/Topic/1
Instead of
http://url.com/Forum/Topic?id=1
/Forum is a directory with a document named Topic.php
My current .htaccess looks like this:
Options -MultiViews
RewriteEngine on
RewriteRule ^(.+)\.php$ /$1 [R,L]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*?)/?$ /$1.php [NC,END]
Any help would be appreciated.
Assuming you've changed the URLs in your application to use http://example.com/Forum/Topic/1 then try the following:
# Remove .php file extension on requests
RewriteRule ^(.+)\.php$ /$1 [R,L]
# Specific rewrite for /Forum/Topic/N
RewriteRule ^(Forum/Topic)/(\d+)$ $1.php?id=$2 [END]
# Append .php extension for other requests
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*?)/?$ /$1.php [END]
Your original condition that checked %{REQUEST_FILENAME}.php isn't necessarily checking the same "URL" that you are rewriting to - depending on the requested URL.
UPDATE: however how would I go about adding another ID variable, as in making http://example.com/Forum/Topic/1?page=1 look like http://example.com/Forum/Topic/1/1
So, in other words /Forum/Topic/1/2 goes to /Forum/Topic.php?id=1&page=2. You could just add another rule. For example:
# Specific rewrite for /Forum/Topic/N
RewriteRule ^(Forum/Topic)/(\d+)$ $1.php?id=$2 [END]
# Specific rewrite for /Forum/Topic/N/N
RewriteRule ^(Forum/Topic)/(\d+)/(\d+)$ $1.php?id=$2&page=$3 [END]
Alternatively, you can combine them into one rule. However, this will mean you'll get an empty page= URL parameter when the 2nd parameter is omitted (although that shouldn't be a problem).
# Specific rewrite for both "/Forum/Topic/N" and "/Forum/Topic/N/N"
RewriteRule ^(Forum/Topic)/(\d+)(?:/(\d+))?$ $1.php?id=$2&page=$3 [END]
The (?:/(\d+))? part matches the optional 2nd parameter. The ?: inside the parentheses makes it a non-capturing group (otherwise we end up with an additional capturing subpattern that matches /2) and the trailing ? makes the whole group optional.
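The behaviour of the optional group can be checked with a few lines of Python. This is only a sketch: it assumes Python's re matches this pattern the same way Apache's PCRE does, and it mimics mod_rewrite's substitution of an empty string for a group that did not take part in the match. The rewrite function is a hypothetical stand-in for the RewriteRule.

```python
import re

TOPIC = re.compile(r"^(Forum/Topic)/(\d+)(?:/(\d+))?$", re.ASCII)


def rewrite(path):
    """Return the rewritten target for path, or None if the rule does not match."""
    m = TOPIC.search(path)
    if m is None:
        return None
    # An unmatched optional group becomes an empty string in the substitution.
    page = m.group(3) or ""
    return f"{m.group(1)}.php?id={m.group(2)}&page={page}"


for path in ("Forum/Topic/1", "Forum/Topic/1/2", "Forum/Topic/abc"):
    print(path, "->", rewrite(path))
```

Note that the non-capturing group keeps the page number in $3; the "/2" including its slash is never captured.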

.htaccess redirect urls with spaces

I am trying to redirect all invalid urls to my index.php file via my .htaccess file.
Unfortunately I keep getting an Apache error.
My .htaccess file
RewriteEngine on
RewriteCond %{REQUEST_URI} !\.(?:css|js|jpe?g|gif|png)$ [NC]
RewriteRule ^([a-zA-Z0-9\-\_\/]*)$ index.php?p=$1
RewriteRule ^([A-Za-z0-9\s]+)$ index.php?p=$1 [L] 
This invalid URL should redirect to index.php:
/vacatures/jobapplication/facility-manager%20qsdf
But it throws the object not found 404 Apache error.
The rule you have which allows spaces does not allow hyphens. The rule you have which allows hyphens does not allow spaces. So anything which includes both will not match either.
Your invalid URL facility-manager%20qsdf includes both.
My guess is that your RewriteCond is supposed to apply to both rules, but that is not what is happening now, it will apply only to the first RewriteRule after it. You can solve all these problems by including just 1 RewriteRule, and amending it to accept everything you want:
RewriteRule "^([A-Za-z0-9\-\_\/\s]+)$" index.php?p=$1 [L]
Note that this requires at least one of the characters in your character class, in other words it will not match your "home" location when there is no path ("http://somewhere.com/"). If you want to also match for that location, change the + to a *, to allow 0 or more character matches.
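A quick way to confirm which character classes accept the example URL is to try them in Python. This sketch assumes the pattern is matched against the decoded path (so %20 has already become a space) and without the leading slash, as in a .htaccess file; the constant names are made up.

```python
import re

# The two original patterns and the combined one, transcribed as-is.
HYPHENS_ONLY = re.compile(r"^([a-zA-Z0-9\-\_\/]*)$")
SPACES_ONLY = re.compile(r"^([A-Za-z0-9\s]+)$")
COMBINED = re.compile(r"^([A-Za-z0-9\-\_\/\s]+)$")

# Decoded form of vacatures/jobapplication/facility-manager%20qsdf
path = "vacatures/jobapplication/facility-manager qsdf"

for name, pattern in (("hyphens only", HYPHENS_ONLY),
                      ("spaces only", SPACES_ONLY),
                      ("combined", COMBINED)):
    print(name, "->", bool(pattern.search(path)))
```

Only the combined class matches, because the path contains a space, hyphens and slashes at the same time. The combined pattern still rejects an empty path until the + is changed to *.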
Your rewrite rules do not match the url you indicated. Your REQUEST_URI is
/vacatures/jobapplication/facility-manager%20qsdf
I suspect the URL decoding is not done before the RewriteRule matching and therefore it's trying to match literally %20, yet % sign is not included in your match. I'm not sure why you're using two RewriteRules - why not do something like this?
RewriteEngine on
RewriteCond %{REQUEST_URI} !\.(?:css|js|jpe?g|gif|png)$ [NC]
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule ^(.*)$ index.php?p=$1 [L]

Using .htaccess to clean GET URL

Before anyone comments, I know there are a lot of posts created on this topic, but none of them seem to solve my problem, that is why I have started this thread.
So, I have a page in my website called project.php which is used in GET query like so: project.php?id=12 I want to have a .htaccess file that converts the given URL into localhost/MyWeb/project/id/12/. I've literally followed every single post regarding that topic but none of them seem to work.
Also, along with that, I want all my .php and .html files to be shown just with their names, i.e localhost/MyWeb/index.php/ becomes localhost/MyWeb/index/ and localhost/MyWeb/sub1/sub2.php becomes localhost/MyWeb/sub1/sub2/.
EDIT:
The reason I did not add my work in the first place was that I didn't think it would be helpful. But here it is:
RewriteEngine On
RewriteRule ^([0-9]+)$ project.php?id=$1
RewriteRule ^([0-9]+)/$ project.php?page=$1
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.php [NC,L]
RewriteRule ^([^\.]+)$ $1.html [NC,L]
Firstly, you are operating out of a sub-directory (MyWeb), which means you need to set a RewriteBase. Also, you need to ensure that your .htaccess file is placed inside that sub-directory, and not in the localhost document root.
So, below RewriteEngine on, insert the following line:
RewriteBase /MyWeb/
Next, you stated that you want to convert project.php?id={id} to project/id/{id}, but your code omits the /id/ segment. I also noticed that you have two rules, and that the second one contradicts your question, so I am only going to show you the change you need to make for the first rule, until such time as you clarify what the second rule is for.
To make the project URI work, change the very first rule to:
RewriteRule ^project/id/([0-9]+)/?$ project.php?id=$1 [QSA,L]
This will match the URI you want, with an optional trailing slash. I've also added the QSA flag which appends any extra query string parameters to the rewritten URI, as well as the L flag which stops processing if the rule is matched.
Next, to omit the .php or .html from your URIs, change the last three lines to the following:
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^([^\.]+)$ $1.php [L]
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^([^\.]+)$ $1.html [L]
When you make a request to localhost/MyWeb/index, Apache will check to see if localhost/MyWeb/index.php or localhost/MyWeb/index.html exist, and will then serve whichever one it finds first.
If you have both the PHP and HTML files, then the PHP one will be served, and not the HTML one. If you prefer to serve HTML files, then swap the two blocks around.
Unfortunately, I don't know of a good way to force a trailing slash for these, specifically because of the condition that checks for their existence. In other words, it won't work if you request sub2/ with the trailing slash, because it would need to check if sub2/.php exists, which it does not.
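The lookup order and the trailing-slash problem can be illustrated with a small Python sketch. It is not how Apache is implemented, only a model of the two rule blocks under the assumption that each one appends an extension and checks that the resulting file exists. The resolve function and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve(docroot, request_path):
    """Return the file an extensionless request maps to, or None."""
    if "." in request_path:          # the rules only match paths without a dot
        return None
    for ext in (".php", ".html"):    # same order as the two rule blocks
        if (Path(docroot) / (request_path + ext)).is_file():
            return request_path + ext
    return None


with tempfile.TemporaryDirectory() as root:
    # Hypothetical document root with a few files in it.
    Path(root, "index.php").touch()
    Path(root, "index.html").touch()
    Path(root, "about.html").touch()
    results = {p: resolve(root, p)
               for p in ("index", "about", "missing", "index/")}

print(results)
```

The "index/" entry shows the trailing-slash case: the check looks for "index/.php", which does not exist, so nothing is served.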
Update: For added benefit, place these two blocks just below the new RewriteBase you set earlier to redirect the old URIs to the new ones whilst allowing the rewrites to the new URIs to still work:
RewriteCond %{THE_REQUEST} \/project\.php\?id=([0-9]+) [NC]
RewriteRule ^ project/id/%1/ [R=302,L,QSD]
RewriteCond %{THE_REQUEST} \/MyWeb/(.+)\.(php|html)
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ %1 [R=302,L]
For reference, here's the complete file: http://hastebin.com/gacapesoqe.rb

htaccess url rewrite in dutch language and english language

I am working with .htaccess on my subdomain.
My .htaccess rules are given below:
RewriteCond %{HTTP_HOST} ^demo\.example\.com/carrental$
RewriteRule (.*) carrental/([^/.]+)/([^/.]+)/([^/.]+) [R=301,L]
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ carrental/locateaddress.php?country=$1&city=$2&locate=$3 [QSA]
RewriteCond %{HTTP_HOST} ^demo\.example\.com/carrental$
RewriteRule (.*) carrental/([^/.]+)/([^/.]+) [R=301,L]
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ carrental/city.php?country=$1&city=$2 [QSA]
RewriteCond %{HTTP_HOST} ^demo\.example\.com/carrental$
RewriteRule (.*) carrental/([^/.]+) [R=301,L]
RewriteRule ^([a-zA-Z0-9_-]+)$ carrental/country.php?country=$1 [QSA]
I have two problems.
First, I want to allow special characters in my URLs.
I have already tried (.*), but it causes many problems. I want ([a-zA-Z0-9_-]+) to accept special characters as well.
Second, and more importantly, one URL breaks when my links are translated into Dutch.
URL: http://demo.osiztechnologies.com/carrental/Albanië
The problem is caused by Albanië: it shows a 404 error. If I change it to the English name it works fine.
How can I rewrite URLs with special characters?
The %{HTTP_HOST} variable is the HTTP request's "Host:" header. It is only a hostname, no path information is given in that field. Thus:
RewriteCond %{HTTP_HOST} ^demo\.example\.com/carrental$
will never match. Not sure why it's there, as the resulting rule that the condition gets applied to is wrong as well:
RewriteRule (.*) carrental/([^/.]+)/([^/.]+)/([^/.]+) [R=301,L]
Here, you are matching the entire URI (via the (.*)) and then redirecting the browser to:
/carrental/([^/.]+)/([^/.]+)/([^/.]+)
Note those ([^/.]+). They don't get replaced with anything, that's literally where you are sending the browser.
As for the special characters, Rob Quist is only half right. While they do get encoded by the browser into escape sequences like %C3%AB, the rewrite engine decodes them back into the Unicode characters before applying any rules.
So, say you want to include ë, then your rule will look like:
RewriteRule ^([a-zA-Z0-9_-]+)/([ëa-zA-Z0-9_-]+)$ carrental/locateaddress.php?country=$1&city=$2 [QSA]
You can put all the Unicode characters you expect to receive in between the square brackets, but it is easier to make everything match by using groupings similar to the ones in the broken rule:
RewriteRule ^([^/.]+)/([^/.]+)/([^/.]+)$ carrental/locateaddress.php?country=$1&city=$2&locate=$3 [L,QSA]
RewriteRule ^([^/.]+)/([^/.]+)$ carrental/city.php?country=$1&city=$2 [L,QSA]
RewriteRule ^([^/.]+)$ carrental/country.php?country=$1 [L,QSA]
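The effect of decoding can be tried out in Python. This sketch assumes, as described above, that the pattern is applied to the decoded path. Python compares code points while Apache's PCRE works on bytes, but for this example the outcome is the same. The constant names are made up.

```python
import re
from urllib.parse import unquote

ASCII_ONLY = re.compile(r"^([a-zA-Z0-9_-]+)$")   # original character class
ANY_SEGMENT = re.compile(r"^([^/.]+)$")          # anything except "/" and "."

raw = "Albani%C3%AB"        # what the browser actually sends
decoded = unquote(raw)      # what the pattern is matched against

print(decoded)
print("ascii only: ", bool(ASCII_ONLY.search(decoded)))
print("any segment:", bool(ANY_SEGMENT.search(decoded)))
```

The ASCII-only class rejects the decoded name because of the ë, while the negated class accepts it and still refuses anything containing a slash or a dot.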
That won't work. You'd need to replace them with special characters (%C3%AB in this case) in order to work.
The best solution here is to store two values for each record: a unique SEO-friendly URL slug and the real title.
Let the real title be "Albanië", and the SEO version "albanie". Just do it for compatibility's sake.
EDIT: Some browsers (such as Google Chrome) may translate %C3%AB in a URL to ë and back. Too bad the actual request sent to the server is the one with %C3%AB.
EG: Your browser shows: site.com/locations/Albanië
Server receives: GET locations/Albani%C3%AB
