htaccess url rewrite rule not working as expected - php

I have a complex problem that I have been unable to solve for days now. Maybe an expert with more knowledge of how .htaccess works will be able to help out.
I have two files placed in the root directory - test.php and files_include.php.
The URL that a user would normally see is:
www.example.com/test.php?cs1=A&cs2=B&cs3=C&cs4=D
Since this is an ugly URL, I would like to rewrite it to something better like:
www.example.com/search/A-B-C-D.html
Using a rule in .htaccess like this I can easily rewrite the URL:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^search/([^-]*)-([^-]*)-([^-]*)-([^-]*)\.html$ /test.php?cs1=$1&cs2=$2&cs3=$3&cs4=$4 [L]
In the file test.php I call for the website config files like this:
include('files_include.php');
Now the problem. As soon as I rewrite the URL to a location other than the root, I get a really strange issue. The page still renders correctly in the browser, but:
Problem 1. I have to replace src="images with src="../images if I want the images to display. This can easily be fixed by using absolute links, so it is the easier part.
But the question is: why is the relative path changing? Is .htaccess making the browser think we are in the /search/ folder? The answer to this question will help me identify the main issue, which is Problem 2.
Problem 2. Sitemap generators cannot follow the links on the page once the URL is rewritten, as if the page appears blank to them, even though everything looks fine in the browser.
Therefore I am guessing that by rewriting the URL to search/A-B-C-D.html I am breaking something with the inclusion of files_include.php.
Basically, I need a general idea of where to look and what to keep in mind when rewriting root/test.php to root/search/A-B-C-D.html.
Any suggestions?

Your browser is clueless about 'pretty' and 'ugly' urls. It just requests a folder or a file. If you request http://example.com/search/A-B-C-D.html, to the browser you are requesting a page A-B-C-D.html in the /search/ folder. If you have any relative urls on that page, it will request them relative to that /search/ folder. The browser has no clue, and should have no clue, what the internal representation of a request looks like. Heck, at your end of the line it might even be translated to instructions for a colony of hamsters, which will then send correct data through. The browser doesn't need to know how hamsters behave ;-)
The first problem is easily resolved by making your URLs absolute. I wouldn't recommend making them relative to the pretty URL. An alternative solution would be to add a <base> tag to the <head> element of your page. The href attribute of this tag is used as the base for any relative links on your page. See MDN for more information. You would then do:
<head>
<base href="/">
</head>
As for your second problem, the include itself is not the problem. include() first looks for the file in the include_path, and then in the calling script's directory and the current working directory. None of this changes when you create pretty URLs: Apache and PHP still know where the file being executed is actually located. A failed include also generates a warning, which is another way to tell whether the include is the problem. See the documentation.

But the question is: why is the relative path changing? Is .htaccess making the browser think we are in the /search/ folder? The answer to this question will help me identify the main issue, which is Problem 2.
It's changing because the browser is loading /search/something-something-something-something.html instead of /test.php. The first URL has a relative URI base of /search/ and the second URL has a base of /.
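To make the base difference concrete, here is a minimal PHP sketch of how a relative reference is resolved against the page path. The helper names base_dir() and resolve_simple() are hypothetical, and the sketch deliberately ignores ".." segments and query strings, which a real browser also handles.

```php
<?php
// Hypothetical helper: the base "directory" of a page path is everything
// up to and including the last slash.
function base_dir(string $pagePath): string
{
    $pos = strrpos($pagePath, '/');
    return $pos === false ? '/' : substr($pagePath, 0, $pos + 1);
}

// Hypothetical helper: resolves a reference the way a browser would in the
// simple cases (no ".." segments, no query string in $pagePath).
function resolve_simple(string $pagePath, string $relative): string
{
    // A reference starting with "/" ignores the base entirely.
    if ($relative !== '' && $relative[0] === '/') {
        return $relative;
    }
    return base_dir($pagePath) . $relative;
}

echo resolve_simple('/search/A-B-C-D.html', 'images/logo.png'), "\n";  // /search/images/logo.png
echo resolve_simple('/test.php', 'images/logo.png'), "\n";             // /images/logo.png
echo resolve_simple('/search/A-B-C-D.html', '/images/logo.png'), "\n"; // /images/logo.png
```

The last line shows why a leading slash fixes the image paths regardless of which pretty URL served the page.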
For the second problem, you could try an external redirect from the old URL to the pretty one, but I'm not sure whether that will help with the sitemap; it depends on the generator. Try adding this rule:
RewriteCond %{THE_REQUEST} \ /+test\.php\?cs1=([^&]*)&cs2=([^&]*)&cs3=([^&]*)&cs4=([^&\ ]*)
RewriteRule ^ /search/%1-%2-%3-%4.html [L,R]

Related

Subdomains and related links on a website

I'm working on the complete structure of a website, and I'm using directories in the URLs so that users can understand the site map, with categories and subcategories. For example, my homepage is www.mantarrayamx.com.
The page I am trying to load is www.mantarrayamx.com/services/seo, but for seo I am using the subdomain seo.mantarrayamx.com to access this directory directly.
I'm using third-party code, for example Font Awesome. Unfortunately, the page fails to load correctly because the links are relative. I tried editing the CSS and JS includes for the third-party code, and it still loads with errors. You can see the difference between loading by subdomain and loading by subdirectory here:
mantarrayamx.com/servicios/posicionamientoweb/
posicionamientoweb.mantarrayamx.com/
The question is:
What is the best way to use and manage subdomains and links (../img/)?
For example, how does Google do it in its applications:
drive.google.com
mail.google.com
If I have to modify the .htaccess file, please give me an example.
As far as I understand your question, you are accessing a subdirectory of your server through a subdomain. On this subdomain, your data is in the root directory. I guess you are using absolute links in your app, like:
/service/type/(index.php) or
/about/me/(index.php)
First of all: if you just want this for SEO-friendliness and nice-looking links, you should definitely use mod_rewrite or the equivalent nginx config. This saves you from having real subdirectories; you just "fake" them. The following code rewrites all requested URLs to index.php?r=theenteredurl. In PHP (or any other server-side language of your choice) you can sanitize the URL, analyse it and then serve the correct content.
mod_rewrite:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteBase /
RewriteRule ^(.*)$ index.php?r=$1 [L]
Nginx:
location / {
try_files $uri $uri/ /index.php?r=$request_uri;
}
The good thing about this solution is that the only file that really gets processed is your index.php, so your app or website stays tidy and in one place. But be aware: relative links in HTML, CSS and JS do NOT work as you might expect with this solution, since the browser does not see what PHP processes, only what is in the address bar. All relative links are relative to the fake subdirectory. To solve this, you can define a base URL in your HTML file. All other files loaded from this HTML file will then be resolved relative to that URL.
If I got you wrong and you really want real subdirectories on one domain and no subdirectories on the other, you could use the HTML base tag to define a different base URL depending on whether you are on the main domain or the subdomain. To find out which one you are on, use the PHP superglobal $_SERVER. Please note that HTML cannot access anything outside the public scope: if your resources are in a parent directory that is not publicly accessible on this subdomain, you have no chance of loading them from HTML files.
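A minimal sketch of choosing the base URL from the host name could look like this. The function name base_href_for_host() is hypothetical, the host names are taken from the question, and the paths in the map are assumptions made only for illustration.

```php
<?php
// Hypothetical mapping from host name to the base URL that relative links
// should resolve against. The paths are assumed, not taken from a real setup.
function base_href_for_host(string $host): string
{
    $map = [
        'seo.mantarrayamx.com' => '/',
        'www.mantarrayamx.com' => '/services/seo/',
    ];
    // Drop an optional port and normalise case before the lookup.
    $host = strtolower(preg_replace('/:\d+$/', '', $host));
    // Unknown hosts fall back to the document root.
    return $map[$host] ?? '/';
}

// HTTP_HOST is not set on the command line, hence the fallback value.
$host = $_SERVER['HTTP_HOST'] ?? 'www.mantarrayamx.com';
$base = htmlspecialchars(base_href_for_host($host), ENT_QUOTES, 'UTF-8');
echo '<base href="' . $base . '">', "\n";
```

Keeping the lookup in one function means the templates only ever print the result and never branch on the host themselves.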

Apache URL rewriting keeping old directory location?

So I'm trying to use URL rewriting to simplify URLs on a website.
Example usage:
www.example.com/test -> www.example.com/index.php?page=test
www.example.com/test/x -> www.example.com/index.php?page=test&value=x
I use this for the rewriting:
RewriteEngine on
#Simplify url
RewriteRule ^(\w+)/?$ index.php?page=$1
RewriteRule ^test/(\w+)/?$ index.php?page=test&value=$1
The issue is that this approach seems to preserve the location prior to the rewrite. So other files loaded on the website like CSS are relative to the original location rather than the location after being rewritten.
Example:
www.example.com/test/x rewrites to
www.example.com/index.php?page=test&value=x.
But CSS files loaded when a user enters www.example.com/test/x are loaded relative to the /test/ folder rather than the / folder. So they're not found.
Am I doing something incorrectly? I'd assumed that rewriting would literally redirect, so things like this wouldn't be an issue. I'd like to solve this issue rather than just using absolute URLs for everything, so I can still use it on my test server.
It's important to keep in mind that rewriting is not the same as redirecting. The browser doesn't know about the rewrite that is happening; it only sees the URL it requested and treats its path segments as folders.
Relative URLs for site resources are resolved by the browser. So, if you access www.example.com/test/x and the browser sees <link href="style.css">, it resolves this against the /test/ folder, requests www.example.com/test/style.css, and receives a 404.
One common solution is to always use absolute URLs like http://www.example.com/style.css. You would most likely store your site's URL as a constant and use <link href="<?php echo SITE_URL; ?>/style.css">.
1)
There is nothing wrong with using absolute URLs. You can use define().
For the development environment use:
define('BASE_URL', 'http://test.server.local/');
then for the production environment just change it to:
define('BASE_URL', 'http://www.example.com/');
On all the pages of your code, you can build URLs from the constant like this:
<a href="<?php echo BASE_URL; ?>x">x page</a>
So you don't need to change the code on every page that references BASE_URL.
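As a sketch of this approach, a small helper can join BASE_URL and a site-relative path so that templates never concatenate strings by hand. The helper name url() and the file name styles/main.css are hypothetical; the BASE_URL value is the development one from above.

```php
<?php
// Assumed value; in practice this would live in a per-environment config file.
define('BASE_URL', 'http://test.server.local/');

// Hypothetical helper: joins BASE_URL and a site-relative path without
// producing a double slash, whether or not $path starts with "/".
function url(string $path): string
{
    return rtrim(BASE_URL, '/') . '/' . ltrim($path, '/');
}

echo '<link rel="stylesheet" href="' . htmlspecialchars(url('styles/main.css')) . '">', "\n";
echo '<a href="' . htmlspecialchars(url('/test/x')) . '">x page</a>', "\n";
```

Switching between the test server and production then only requires changing the single define() line.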
2)
It is a good idea to place all your CSS in the styles/ directory; then, in the .htaccess file, you can exclude it from rewriting like this:
RewriteEngine on
# add this line:
RewriteRule ^/?styles/.+$ - [L]
#Simplify url
RewriteRule ^(\w+)/?$ index.php?page=$1
RewriteRule ^test/(\w+)/?$ index.php?page=test&value=$1
UPDATE (regarding your comment)
If you use the BASE_URL approach, the correct URL is built on the server and then passed to the browser. If you use <base>, you depend on the client side (the browser the user happens to use). It is good practice to use BASE_URL on the server side, so that you don't depend on the client's browser.
Check out this answer: Is it recommended to use the base html tag?
You can also include the PHP file that contains the define() call in all your pages, so there is no need to use <base> on every page. Here is a nice example of using this.

How to setup .htaccess to show 404 for unallowed urls?

I noticed that in Drupal, with clean URLs enabled, adding .php to the URL of any page gives you a 404 message. The page is obviously a PHP page, but the .htaccess prevents the user from tampering with extensions in the URL bar. How could you do this using .htaccess? I have file extensions omitted at the moment, but would also like to add that feature. Thank you.
Also, this question does not pertain to Drupal. I only mentioned Drupal as an example.
Just because a file contains PHP code it doesn't mean it has to have the .php extension; even more so when you're accessing a file over the internet.
When you request http://mysite.com/page and you're using an .htaccess like Drupal's, the request is forwarded to index.php?q=page, whereupon Drupal checks its database for a path matching page. If it finds one it displays the content for that page; if not, it (rightly) gives a 404.
If you want all of your pages to be accessible with a PHP extension you could add an extra rule in your .htaccess file to remove .php from any request where the PHP file doesn't physically exist:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)\.php$ $1 [NC]
Bear in mind though that this adds zero extra value for your site's visitors (in fact they have to remember a file extension as well as the path to the page), and it exposes exactly what server-side technology you're using so a potential attacker would have some of his work done for him.
Hope that helps.
Could you please explain that in more depth? How can it redirect content into an existing page? Is that common practice, the typical way of doing things?
Yes, it is a very common practice, used by most frameworks and CMSs.
The principle is simple: you set up your .htaccess so that every request which doesn't match a real file or directory is rewritten to a front controller, usually the index.php in the root directory of the application. That front controller handles the request by analyzing the URL and calling the necessary actions.
In this way you can reduce the rewrite rules to just one, and you can offer customized 404 pages.
I don't know Drupal, but in the usual PHP app every request is routed to the front controller, which performs some validation and returns a 404 on errors.
easy-peasy
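The front controller idea described above can be sketched in a few lines of PHP. This is only an illustration under assumptions: the route table, the page titles and the function name dispatch() are hypothetical, and a real application would call controllers or render templates instead of returning a string.

```php
<?php
// Hypothetical route table: URL path => page title.
$routes = [
    ''      => 'Home',
    'about' => 'About us',
];

// Returns [status code, body] for the requested URI.
function dispatch(string $requestUri, array $routes): array
{
    // Keep only the path part, then trim the surrounding slashes.
    $path = trim((string) parse_url($requestUri, PHP_URL_PATH), '/');

    // Anything that is not a known route, such as "about.php", is a 404.
    if (!array_key_exists($path, $routes)) {
        return [404, 'Page not found'];
    }
    return [200, $routes[$path]];
}

// REQUEST_URI is not set on the command line, hence the fallback value.
[$status, $body] = dispatch($_SERVER['REQUEST_URI'] ?? '/', $routes);
http_response_code($status);
echo $body, "\n";
```

Because only exact route names are accepted, appending .php to a clean URL falls through to the 404 branch without any extra rewrite rule.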

URL rewriting - Follow my rules and leave physical files alone

Noob in mod_rewrite here...I'm developing a new site and using mod_rewrite.
The problem is, when I activate my rules in .htaccess, my links to CSS files and images become unreadable.
For example, I had this:
http://www.dico2rue.com/dictionnaire.php?idW=675&word=Resto-basket
That I transformed to this:
http://www.dico2rue.com/dictionnaire/675/Resto-basket
I know it's probably because the browser is looking for the CSS file in
http://www.dico2rue.com/dictionnaire/675/css/general.css instead of the base directory, but I was hoping there was a way to leave physical files alone and only parse other URLs, in order to avoid full paths (which apparently slow down download speed...?).
thanks.
This problem doesn't have anything to do with mod_rewrite; you just need to provide a valid URL to your CSS file in the href attribute of your link tag. The URL you probably want to use is the root-relative "/css/general.css". See the relative URL RFC.
On another note, your thinking about mod_rewrite might be a little off. In your example you are actually providing a resource in the /dictionnaire/675/ path of your server. The fact that you are using mod_rewrite to do it instead of some other method makes no difference.
You need to add these lines right before the rewrite rules, so that they only apply when the request is not an existing file or directory:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

PHP: Best solution for links breaking in a mod_rewrite app

I'm using mod rewrite to redirect all requests targeting non-existent files/directories to index.php?url=*
This is surely the most common thing you do with mod_rewrite yet I have a problem:
Naturally, if the page url is "mydomain.com/blog/view/1", the browser will look for images, stylesheets and relative links in the "virtual" directory "mydomain.com/blog/view/".
Problem 1:
Is using the base tag the best solution? I see that none of the PHP frameworks out there use the base tag, though.
I'm currently having a regex replace all the relative links to point to the right path before output. Is that "okay"?
Problem 2:
It is possible that the server doesn't support mod_rewrite. However, all public files like images, stylesheets and the request collector index.php are located in the directory /myapp/public. Normally mod_rewrite points all requests to /public, so to all users it seems as if public was actually the root directory.
However, if there is no mod_rewrite, I have to point users to /public from the root directory with a header() call. That means that all links are broken again, because suddenly all images, etc. have to be requested via /public/myimage.jpg.
Additional info: When there is no mod_rewrite the above request would look like this: mydomain.com/public/index.php/blog/view/1
What would be the best solutions for both problems?
Edit/Additional question:
Is there a way to make /public/ the base dir using plain htaccess code?
Write the app in such a way that it doesn't need mod_rewrite to function (at the cost of having "ugly" urls). Progressively enhance it with mod_rewrite to achieve the desired result. This probably means that you'll need to store some base path config info in your app.
I don't understand these problems at all.
Yes, this is surely the most common thing you do with mod_rewrite, yet with 2 conditions:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
So your existing images are not affected by the rewrite.
Why not just use an absolute path, e.g. /myapp/public/myimage.jpg, so that no virtual directory can break the image path?
What about path info? You could use it without mod_rewrite:
/index.php/path/to/another/file.jpg
<?php
echo $_SERVER["PATH_INFO"]; // outputs /path/to/another/file.jpg
?>
Anyway, if you want to know whether mod_rewrite is supported by your server:
<?php
echo "mod_rewrite : ".(!empty($_SERVER["REDIRECT_URL"])?"supported":"not supported");
?>
Then you'll know whether mod_rewrite is the solution or whether path_info is better suited for you. You could also write support functions that look for both.
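A support function that looks for both could be sketched as follows. The function name current_route() is hypothetical; the url query parameter matches the index.php?url=* rewrite from the question, and PATH_INFO covers the index.php/blog/view/1 form used when mod_rewrite is unavailable.

```php
<?php
// Hypothetical helper: returns the requested route whether the app is
// reached through a rewrite (index.php?url=blog/view/1) or through
// PATH_INFO (index.php/blog/view/1).
function current_route(array $get, array $server): string
{
    // Rewritten request: the route arrives as the "url" query parameter.
    if (isset($get['url']) && is_string($get['url'])) {
        return trim($get['url'], '/');
    }
    // No rewrite: the route is whatever follows index.php in the URL.
    if (!empty($server['PATH_INFO'])) {
        return trim($server['PATH_INFO'], '/');
    }
    // Neither is present, so this is the front page.
    return '';
}

echo current_route($_GET, $_SERVER), "\n";
```

Passing the arrays in as parameters, instead of reading the superglobals inside the function, keeps the helper easy to check with fixed inputs.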
