htaccess RewriteEngine URL interpretation - php

I recently changed how my website works: instead of a physical page for every article, content is now loaded dynamically without a physical (sub)page. However, I realized I cannot simply upload the new site files, because that would break all the social-platform sharing links, counters and so on; there are literally thousands of subpages.
I know that with .htaccess and RewriteEngine (I have to use RewriteEngine, since all the existing code in my .htaccess relies on it) I can make the server internally load something completely different depending on the requested URL. For example, take the actual URL of one of my subpages:
http://sub.mypage.com/php/somearticle.php?j=en
...without changing the URL in the address bar, it should internally load my new site files, which work on a different principle, like this:
http://sub.mypage.com/?s=somearticle&j=en
The values of "j" and "somearticle" need to be dynamic, or rather copied exactly as they appear in the URL in the address bar ("somearticle" is the name of the original physical PHP file and "j" is just the language variable). They will be different every time, and I do not want thousands of lines in .htaccess, one for every concrete subpage. I need one universal rule that handles all the subpages, since the principle is the same for all of them: only the PHP file name changes and sometimes the language (variable "j").
Can anyone tell me the exact syntax/code to achieve this?
EDIT
I played with it a bit myself, and the following seems to work if I set the subpage manually. (NOTE: this is for my localhost:8081, which is why the first condition is there; for the server I have a different version with a different path to index.php.) I also slightly updated the variable j part thanks to Ben's post:
# LOAD PAGE DIFFERENT INTERNALLY - LOCALHOST
RewriteCond %{HTTP_HOST} ^localhost:8081 [NC]
RewriteCond %{REQUEST_URI} /php/(.*)\.php$ [NC]
RewriteRule ^php/(.*).php$ /WWW/_PHP_/lego/index.php?s=$1 [QSA,L]
Unfortunately, for some reason it affects every page on my site, not only the pages under /php/. When I go to my first page (the default/initial index.php), it breaks: that page holds a summary of all articles, and they are not loaded. Does anyone know why?
SOLVED
After a small change to Ben's code, this is the right solution, so I am accepting his answer. (His code would work as-is if the page were on the server; my test version is on localhost, where the path to the /php/ directory is different.)
# LOAD PAGE DIFFERENT INTERNALLY - localhost
RewriteCond %{HTTP_HOST} ^localhost:8081 [NC]
RewriteCond %{REQUEST_URI} /php/([a-zA-Z0-9\-]+).php [NC]
RewriteRule ^ index.php?s=%1 [QSA,L]

REQUEST_URI is the path component of the requested URI, such as /php/somearticle.php. It does not contain the query string, such as ?j=en.
In a RewriteCond pattern, the ! character (exclamation mark) negates the result of the condition. To match only at the start of the path, begin the pattern with the ^ character (caret), which matches the beginning of the string.
%1 is a RewriteCond backreference that provides access to the grouped parts (in parentheses) of the condition's pattern.
With the QSA flag, if the substitution contains a query string, the original query string (such as j=en) is appended to the rewritten URI instead of being discarded.
RewriteCond %{REQUEST_URI} ^/php/([a-zA-Z0-9\-]+)\.php [NC]
RewriteRule ^ index.php?s=%1 [QSA,L]
See also: Apache Module mod_rewrite, RewriteRule Flags
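The same mapping can probably be expressed as a single RewriteRule with no RewriteCond. This is only a sketch: it assumes the .htaccess file sits in the document root of sub.mypage.com and that the new index.php lives in the same directory. On a setup like the localhost one described in the question, the substitution path would have to be adjusted.

```apache
RewriteEngine On

# In .htaccess context the pattern is matched against the path without
# the leading slash, so "php/somearticle.php" is what gets tested here.
# $1 is the file name without ".php"; QSA keeps the original ?j=en.
RewriteRule ^php/([a-zA-Z0-9-]+)\.php$ index.php?s=$1 [QSA,L,NC]
```

Because the pattern is anchored with ^ and $ and the dot is escaped, requests outside /php/ (including index.php itself) are left alone.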

Related

mod_rewrite on apache - language parameter

I have searched for a solution to this problem all over the Internet and read various threads and articles about it, but did not come to a full solution for what I think is quite a generic mod_rewrite problem. Here is the issue:
I have developed a small webapp that lets users make calculations (splitting costs after holidays "who pays to whom"). Now I have come to the point where I want to make it (especially the static pages) language dependent. What seemed like no big deal by passing a get parameter ?lang= seems to be a problem for search engines according to my research - so apache mod_rewrite to the rescue to have beautiful URLs like
example.com/en/index => example.com/index.php?lang=en.
example.com/en/about => example.com/about.php?lang=en.
Moreover, users should be able to share their calculations with their friends. For this they are issued an ID after the calculation, so a rule like
example.com/c/9842398dfalkjdsf98sfdasf => example.com/c.php?id=9842398dfalkjdsf98sfdasf
is used to call up their previous calculations (language handling is done automatically in the script here, and these links do not need to be indexed by any search engine).
To achieve this I have come up with these rules:
RewriteEngine on
RewriteRule ^c/([^/]+) c.php?id=$1 [L,QSA]
RewriteRule ^c/([^/]+)/ c.php?id=$1 [L,QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/$ $2.php?lang=$1 [L,QSA]
RewriteRule ^([^/]+)/([^/]+)$ $2.php?lang=$1 [L,QSA]
So far these rules work - however:
Here are my questions:
1) Is this approach a good approach for language dependent sites - meaning will google index the "static" sites like "about" etc. correctly?
2) Can somebody come up with a RewriteRule to also handle requests like
example.com/about => example.com/about.php?lang=en.
(notice the missing language parameter in the first url)
to send them to the standard language
OR should I first get their accepted language and then redirect them to example.com/LANG/about
3) How should I design the language detection, especially on the homepage? Right now it works according to the rules above. However, I have seen solutions on the Internet that pass everything to an index.php first, which then calls the desired page, like
index.php?lang=en&page=about
When google "visits" it will usually not provide an HTTP_ACCEPT_LANGUAGE so will it even ever see the other language versions like example.com/it/about ?
4) It turns out that using RewriteRules kills the relative CSS, JS and picture links in your code (surprise!). However, I found a page on the Internet saying that this can also be handled with a RewriteRule instead of using absolute paths in the HTML (here). Unfortunately I was not able to implement it.
In a nutshell:
I am a little confused, hope somebody can help how to set up a simple SEO conform, language dependent site and that this will help others to see a best practice solution as a whole.
Thanks in advance!
Try this:
RewriteEngine On
# This is to prevent the rules from looping; they only work as one-shot
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
# If the url is blank, go to '/web/?lang=en'
RewriteRule ^/?$ /web/?lang=en [L,QSA]
# If the url starts with en,es,pt-br, remove it and add ?lang=$1 ,has /web
RewriteRule ^/?(en|es|pt-br)/web(/?.*)$ /web$2/?lang=$1 [L,QSA,R]
# If the url starts with en,es,pt-br, remove it and add ?lang=$1 ,has no /web
RewriteRule ^/?(en|es|pt-br)/?$ /web/?lang=$1 [L,QSA,R]
# If the url starts with en,es,pt-br, remove it and add ?lang=$1 ,everything else
RewriteRule ^/?(en|es|pt-br)/(.+?)/?$ /$2/?lang=$1 [L,QSA,R]
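For the setup described in the question (example.com/en/about served by about.php?lang=en), a purely internal variant without the R flag might look like the sketch below. Several things here are assumptions and not taken from the question: the language list (en|de|it), "en" as the default language, and the list of asset extensions. The c/ID rule from the question would still have to come before these rules.

```apache
RewriteEngine On

# Leave requests for real files and directories (CSS, JS, images) alone.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# /en/css/site.css -> css/site.css, so relative asset links keep working
# even though the browser thinks the page lives inside /en/.
RewriteRule ^(?:en|de|it)/(.+\.(?:css|js|png|jpe?g|gif))$ $1 [L]

# /en or /en/ -> index.php?lang=en
RewriteRule ^(en|de|it)/?$ index.php?lang=$1 [L,QSA]

# /en/about or /en/about/ -> about.php?lang=en
RewriteRule ^(en|de|it)/([^/.]+)/?$ $2.php?lang=$1 [L,QSA]

# /about -> about.php?lang=en (no language given: fall back to the default)
RewriteRule ^([^/.]+)/?$ $1.php?lang=en [L,QSA]
```

Since the page patterns exclude dots, a rewritten target such as about.php cannot match them again, which keeps the rules from looping in .htaccess context.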

Lang parameter redirects to subfolder, keep the lang folder name on the next link

The requirement was to have:
http://xxxx.com/it/
To redirect to
http://xxxx.com/index.php?act=setlang&val=it
And the same should happen for the rest of the links, e.g.
http://xxxx.com/it/test.php
to
http://xxxx.com/test.php?act=setlang&val=it
I have achieved that using the following:
RewriteRule ^([a-z]{2})$ index.php?act=setlang&val=$1 [L]
RewriteRule ^([a-z]{2})/([a-zA-Z0-9-]+)\.php$ $2.php?act=setlang&val=$1 [L]
The problem is that I would like the language to stay in the URL path and be retained when following links, unless the lang variable changes.
When I visit http://xxxx.com/it (the language will be set to Italian as expected), when I click on the next link e.g test.php the link will be http://xxxx.com/test.php not http://xxxx.com/it/test.php.
Is there a way to retain it using htaccess until it is manually changed by the user (the user selects another language)?
The reason I want to retain it is really SEO, though I am not sure it makes any difference. I presume that if Google had to index a single test.php for three languages (it, en, fr), it wouldn't be able to crawl the individual languages at all, as opposed to crawling en/test.php, fr/test.php and it/test.php.
Note: As you have guessed the sub-folders don't actually exist and in reality are virtual folders
To keep the language code sticky, you can create a cookie the first time an /it/ URL is loaded and then prefix every URI that doesn't have the language code.
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
# if lang cookie is set and URI doesn't start with 2 char lang code
RewriteCond %{HTTP_COOKIE} LANG=([^;]+) [NC]
RewriteRule ^((?![a-z]{2}/).+)$ /%1/$1 [R,L,NC]
RewriteRule ^([a-z]{2})/?$ /index.php?act=setlang&val=$1 [L,CO=LANG:$1:%{HTTP_HOST}]
RewriteRule ^([a-z]{2})/([a-zA-Z0-9-]+)\.php$ $2.php?act=setlang&val=$1 [L,CO=LANG:$1:%{HTTP_HOST}]
Is there a way to retain it using htaccess until manually changed by the user (user selects another language)
An .htaccess file cannot change the links inside your pages. That would take something like an HTML proxy, where all the content of your site is filtered through rules and links get things like /it/ prepended before they're returned to browsers.
What you should probably do is in your php files, dynamically add a URI base using the contents of the val parameter. So if the request is:
index.php?act=setlang&val=it
index.php will include a
<base href="/it/" />
in the page's <head>. That way, your relative links will resolve to:
http://xxxx.com/it/something.php

URL Routing using RewriteRule

I am trying to create my own PHP MVC framework for learning purposes. I have the following directory structure:
localhost/mvc:
    .htaccess
    index.php
    application/
        controller/
        model/
        view/
        config/
            routes.php
        error/
            error.php
Inside application/config/routes.php I have the following code:
$route['default_controller'] = "MyController";
What I am trying to achieve is this: when a user visits my root directory in a browser, I want to read the value of $route['default_controller'] from routes.php and load the PHP class in the controller folder that matches that value.
Also, if a user visits my application with a URL like localhost/mvc/cars, I want to look for a class named cars in my controller folder and load it. If there is no class called cars, I want to take the user to error/error.php.
I guess that to achieve this I have to work with the .htaccess file in the root directory. Could you please tell me what to put there? If there is another way to achieve this, please suggest it.
I have tried the .htaccess code from here, but it's not working for me.
It all sounds well and good from a buzzword standpoint, but to me this is all a little confusing because I see PHP's model as an MVC model already. It's providing the API for you to program with and deliver your content to your web server Apache and your database (something like MySQL). It translates the code(model) for you into HTML(view) ... provided that's what you intend, and you're supplying code as the user input (control). Getting too wrapped up in the terminologies gets a little distracting and can lead to chaos when you bring someone in to collaborate who isn't familiar with your conventions. (This should probably never be used in a production environment for a paying gig.)
I can tell you that the .htaccess file on the page you referenced needs a little work. The [L] flag tells mod_rewrite that this is the last rule to process when the rule matches. So you would either need to do this:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^(.*)$ public/$1 [L]
</IfModule>
Or the following... but he was using a passthru flag which means that he is implying there are other things that could be processed prior to the last rule (eg. might be rewrite_base or alias), but that's not actually the case with his .htaccess file since it's a little bare. So this code would work similar to the code above but not exactly the same. They can't be used together though, and really there would be no need to:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*) index.php?url=$1
</IfModule>
The difference is in the way it's processed. In the first .htaccess example every request is rewritten, regardless of whether the requested file exists. You can [accidentally] rewrite a path that has a real file so that the real file is never served. For example, a file called site.css in the root could not be reached, because the request for it is rewritten to public/site.css.
On the second ruleset he's at least checking to see if the server doesn't have a file or a directory by the name being requested, then they're forwarding it to index.php as a $_GET variable (which seems a little pointless).
The way I typically write these (since I know mod_rewrite is already loaded in the config) is to do this:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^mydomain\.com
RewriteRule (.*) http://www.mydomain.com/$1 [R=301,L]
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule .* index.php
In my PHP code I pull the $_SERVER['REQUEST_URI'] and match it against a list of URIs from the database. If there's a match then I know it's a real page (or at least a record existed at some point in time). If there's not a match, then I explode the request_uri and force it through the database using a FULLTEXT search to see what potentially might match on the site.
Note: if you blindly trust the request_uri and query the database directly without cleaning it you run the risk of SQL injection. You do not want to be pwnd.
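<?php
// Assumed to exist already: $mysqli (an open mysqli connection) and
// $uris_from_database (an array of known URIs loaded from the database).
$intended_path = $_SERVER['REQUEST_URI'];
if (in_array($intended_path, $uris_from_database)) {
    // show the page.
} else {
    // turn "/some/path" into " some path" for the full-text search
    $search_phrase = preg_replace('!/!', ' ', $intended_path);
    // mysqli_real_escape_string() needs the connection as its first argument
    $search_phrase = mysqli_real_escape_string($mysqli, $search_phrase);
    $sql = "SELECT * FROM pages WHERE MATCH (title,content) AGAINST ('$search_phrase')";
}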
Sorry if this sounds a bit pedantic, but I've had experience managing a couple of million dollar (scratch) website builds that have had their hurdles with people not sticking to a standard convention (or at least the agreed upon team consensus).
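As a possible alternative to the RewriteCond/RewriteRule pair for a front controller, Apache's mod_dir offers the FallbackResource directive (available since Apache 2.2.16). The sketch below assumes the application is reachable under /mvc/, as in the question, and that index.php reads $_SERVER['REQUEST_URI'] itself to decide which controller to load.

```apache
# Existing files and directories are served as usual; every other
# request under /mvc/ is handed to the front controller.
FallbackResource /mvc/index.php
```

Deciding whether a class such as cars exists, and showing error/error.php when it does not, would still have to happen in index.php; Apache only delivers the request to it.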

How to redirect to home page if no pattern match in .htaccess

Friends, I am a newbie to .htaccess and RewriteRule and very puzzled by these rules. I am trying to use the following lines for the index.php page.
When I put one, two, three or four query arguments in the URL, everything is OK: when any of these patterns matches, I collect the query and proceed further in PHP. But when there are more than four arguments (say someone intentionally enters wrong URLs), I want to redirect to the homepage. I didn't find a solution anywhere on the Internet. Can anybody help me with how to do that? I urgently need a solution for this project. I have read many articles but don't understand how to handle this problem. I'm running all of this with XAMPP on Win 7. The patterns I am using are:
RewriteRule ^([a-zA-Z0-9_]+)/?$ index.php?levelone=$1 [NC]
RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/?$ index.php?levelone=$1&leveltwo=$2 [NC]
RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/?$ index.php?levelone=$1&leveltwo=$2&levelthree=$3 [NC]
RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/?$ index.php?levelone=$1&leveltwo=$2&levelthree=$3&levelfour=$4 [NC]
URL example :
(say localhost/myproject/index.php is my homepage)
localhost/myproject/levelone/
localhost/myproject/levelone/leveltwo/
localhost/myproject/levelone/leveltwo/levelthree/
localhost/myproject/levelone/leveltwo/levelthree/levelfour
When I use these four URLs it's OK, but if I use
localhost/myproject/index.php/levelone/leveltwo/levelthree/levelfour/levelfive/levelsix/
then my index.php page returns some HTML without style and layout. I have tried many kinds of RewriteCond and other directives in .htaccess, but all in vain.
I want the number of arguments to be in the range from 1 to 4. If it exceeds that range (say, five arguments), the page must redirect to the homepage (/index.php); otherwise it should match one of the patterns.
It would also help me to know whether there is a way to combine these four long patterns into one short line that matches any of the four.
This rewrite rule will only kick in if there is not a file or directory that already exists.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . /path/to/index.php [L,QSA]
The !-f condition lets the rewrite kick in only if the requested file doesn't exist, and the !-d condition ensures the rewrite won't kick in if a directory with that name exists.
I suggest avoiding a redirect to the home page.
In index.php, display the home page content by default if none of the level variables are set.
Here's a one-line rule (maybe it can be simplified even more):
RewriteRule ^(?:([a-zA-Z0-9_]+)\/)?(?:([a-zA-Z0-9_]+)\/)?(?:([a-zA-Z0-9_]+)\/)?(?:([a-zA-Z0-9_]+)\/)?$ index.php?level1=$1&level2=$2&level3=$3&level4=$4 [L]
And the PHP:
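$levels = array();
foreach (range(1, 4) as $i) {
    // the rule always passes level1..level4, so unmatched levels arrive as empty strings
    if (isset($_GET['level'.$i]) && $_GET['level'.$i] !== '') {
        $levels[$i] = $_GET['level'.$i];
    }
}
if (!empty($levels)) {
    // show level content
} else {
    // show home page
}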
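If an explicit redirect is still wanted, as the question asks, it could be sketched roughly as follows. The assumptions: the .htaccess file sits in the myproject directory, the site is reached as localhost/myproject/, and no real files are nested five or more directories deep (they would be redirected too). The parameter names levelone to levelfour are the ones used in the question.

```apache
RewriteEngine On
# Needed so the relative redirect target below becomes /myproject/index.php
RewriteBase /myproject/

# Five or more path segments: send the visitor back to the home page.
RewriteRule ^(?:[^/]+/){4}[^/]+ index.php [R=302,L]

# One to four segments, with or without a trailing slash.
# Unmatched groups are passed to index.php as empty strings.
RewriteRule ^([a-zA-Z0-9_]+)(?:/([a-zA-Z0-9_]+))?(?:/([a-zA-Z0-9_]+))?(?:/([a-zA-Z0-9_]+))?/?$ index.php?levelone=$1&leveltwo=$2&levelthree=$3&levelfour=$4 [L]
```

The redirect target index.php contains a dot and only one segment, so it matches neither rule and the redirect does not loop.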

Help with .htaccess file. mod_rewrite for custom urls?

Situation:
I have a few hundred posts each belonging to a particular category.
A] When the user visits the home page, the posts are shown sorted by date, irrespective of category.
http://www.example.com
He can navigate through different pages like:
Type 1: http://www.example.com/3 which corresponds to http://www.example.com/index.php?page=3
I can probably do this in mod_rewrite
B] The user can then decide to view by category like:
Type 2: http://www.example.com/Football which will correspond to
http://www.example.com/index.php?page=1&category=Football
He can then navigate through pages like:
Type 3: http://www.example.com/Football/5 which =>
http://www.example.com/index.php?page=5&category=Football
C] I have a directory called View with index.php in it. It only shows individual posts like:
Type 4: http://www.example.com/View/1312 => http://www.example.com/View/index.php?id=1312
Here is the mod_rewrite I do:
RewriteEngine on
RewriteRule ^View/([^/.]+)/?$ View/index.php?id=$1 [L]
Now here are the problems I have
In point C]: http://www.example.com/View/1312 works fine but http://www.example.com/1312/
(notice the trailing slash) breaks apart & gives weird results.
Q1) So how do I maintain consistency here?
Q2) Ideally I would want http://www.example.com/View/1514 to show a 404 Error if there is no post with id 1514, but now I have to manually take care of that in PHP code.
What is the right way of dealing with such dynamic urls? especially if the url is wrong.
Q3) how do I ensure that http://www.example.com & http://www.example.com/ both redirect to http://www.example.com/index.php?page=1; (mod_rewrite code would be helpful)
Please Note that there are only two index.php files. One in the root directory which does everything apart from showing individual posts which is taken care by a index.php in View directory. Is this a logical way of developing a website?
Try these rules:
RewriteRule ^$ index.php?page=1 [L]
RewriteRule ^([0-9]+)/?$ index.php?page=$1 [L]
RewriteRule ^View/([0-9]+)/?$ View/index.php?id=$1 [L]
RewriteRule ^([A-Za-z]+)/?$ index.php?category=$1&page=1 [L]
RewriteRule ^([A-Za-z]+)/([0-9]+)/?$ index.php?category=$1&page=$2 [L]
As for the other question: Yes, since Apache can map these requests to existing files, it responds with a success status code. Now if your application decides that the requested resource does not exist, you need to handle that within your application and send an appropriate status code.
To fix the trailing slash, just put /? before the $ at the end of your pattern, so that both forms of the URL match.
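Another way to keep the URLs consistent is to redirect the trailing-slash form to the form without it, so every page has exactly one address. This is a sketch that assumes the site sits at the root of the domain, as in the example.com URLs above, and that it is placed before the other rules.

```apache
RewriteEngine On

# /Football/ -> /Football and /View/1312/ -> /View/1312.
# Real directories such as /View/ keep their slash.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

With this in place the later patterns would only ever see the slash-less form, although leaving the optional /? in them does no harm.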
