I am trying to write a .htaccess file for my website, which will prevent access to pages and images via direct URL input, but localhost requests will be granted. So far I've found this code after some googling:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://(www\.)?localhost [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain.com.*$ [NC]
RewriteRule \.(php|css|js|jpg)$ - [F]
The problem is my website images are protected all right, but when I want to access the index.php from a parent directory (the htaccess is in my subdirectory, not the parent), I am shown a 403 Forbidden error.
Now I am not really clear as to what these lines mean, or how to tweak them, so I can't tell right from wrong. Can someone help me out and tell what this actually does? Thanks!
Either your assets are accessible or they're not. You cannot serve assets to the public without serving them publicly. You probably think "from localhost" means if someone is "on your website" already; that's a wrong understanding of how the web works. Every asset is requested from the server via a URL, all requests come from clients. Requests do not come from "your local website".
If end users must be able to see your assets, they must be able to access them via a URL, which means they'll also be able to see them when "inputting the URL directly". There's no technical difference there.
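As for what the rules in the question actually do, here is an annotated reading of them. This is only a sketch: the rules are the ones quoted above, with the dots in mydomain.com escaped, and the comments are my interpretation of standard mod_rewrite behaviour.

```apache
RewriteEngine on

# Condition 1: the Referer header does NOT start with
# http://localhost or http://www.localhost (NC = ignore case).
RewriteCond %{HTTP_REFERER} !^http://(www\.)?localhost [NC]

# Condition 2 (conditions are ANDed by default): the Referer
# does NOT start with http://mydomain.com or http://www.mydomain.com.
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com [NC]

# If both conditions hold, answer 403 Forbidden ([F]) for any
# requested path ending in .php, .css, .js or .jpg.
# The "-" means "do not rewrite the URL".
RewriteRule \.(php|css|js|jpg)$ - [F]
```

A URL typed into the address bar is sent without a Referer header, so both conditions hold and any matching .php file covered by this .htaccess gets the 403. That would explain the error on index.php, assuming the rules apply to the directory it is served from.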
Related
I'm trying to use mod_rewrite to deny direct access to files on my web server, e.g. http://domain.tld/reports/imareport.pdf or http://domain.tld/img/img1.png, and I've used the answer on this question:
(htaccess) How to prevent a file from DIRECT URL ACCESS?
That page suggests using mod_rewrite like this:
RewriteEngine on
RewriteRule \.(png|pdf|htm)$ - [F]
Using mod_rewrite in this manner works fine for denying access to PDFs, but other files that are ordinarily included in a page, such as images and CSS, are not only blocked from direct access but also blocked when used on a webpage in a normal <img> tag or similar. This is contrary to the question and answer mentioned above.
So... my question is... is there a way to block direct access to files but still allow them in webpages?
Thanks Mark Phillips, I didn't fully appreciate what these two rewrite conditions were doing for me:
RewriteCond %{HTTP_REFERER} !^http://(www\.)?localhost [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?localhost.*$ [NC]
So I had managed to mess them up. Things worked as needed when I used the code just as it was.
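For anyone landing here with the same problem, a minimal sketch of the two-line rule with a Referer condition restored. It assumes the site is served from domain.tld as in the question; the https? alternative is my addition.

```apache
RewriteEngine on
# Only refuse the request when the Referer is NOT a page on this site.
# Pages on domain.tld that embed the files send a matching Referer,
# so <img> and stylesheet requests pass through.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.tld/ [NC]
RewriteRule \.(png|pdf|htm)$ - [F]
```

Without the RewriteCond the rule has nothing to distinguish an embedded request from a typed URL, which is why everything was blocked.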
I have a blog with images. I do not want the images to be directly accessible through their URLs (by visitors, Googlebot or other bots), for example mysite.com/assets/images/img1 and so on. So I thought I would password protect the images directory with .htaccess. That worked, except that on the front end all my images became links, and I had to provide my credentials to make them show. How can I make my images show on my pages yet NOT be directly accessible when someone types the corresponding URL, and keep the image URLs (or better yet the whole images directory) from being crawled and indexed by bots?
Don't go with password protection. The right way to do it would be to filter the requests based on the referer URL. If the request originates from your own site then it's ok. Otherwise the request is trying to get an image directly.
I've found this site with detailed instructions on how to do that: http://altlab.com/htaccess_tutorial.html
Taken from the mentioned site:
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?mysite\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://url_to_default_image.gif [L]
Note that you would have to enable mod_rewrite in your Apache server.
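One detail worth spelling out: the "!^$" condition in the tutorial's rule exempts requests with an empty Referer, and a typed URL is exactly such a request. A hedged variant that drops the exemption and answers 403 instead of substituting a default image might look like this (mysite.com is taken from the rule above; the https? alternative and the [F] flag are my changes):

```apache
RewriteEngine On
# Let the request through when the Referer is a page on mysite.com
# or one of its subdomains.
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mysite\.com/ [NC]
# No "!^$" line here, so requests with an empty Referer
# (typed URLs, many bots) are refused as well.
RewriteRule \.(jpe?g|gif|bmp|png)$ - [F,NC]
```

The trade-off is that visitors whose browser or proxy strips the Referer header will see broken images, and anyone who forges the header still gets through.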
Btw, just asking: why not just let people get the image directly if they want to?
I have a lot of uncleverly named folders on my server. They're called things like cdn-1 cdn3 img1 and so on. I have collected all of the files in these folders and put them in one folder, called cdn. Now, I don't want users to get a 404 when they try to access a file that is at cdn-3.website.com/file/img/1.jpg. Instead, is there a way to mod_rewrite this folder so that even if a user tries to access a file at, say, cdn-7382910731293.website.com/file/1.jpeg, it'll still work? This might be far fetched, but I have seen this be done before. The only working code I've found for a solution like this is for a single file, not a folder.
EDIT: This is what I've tried so far. It just won't work. What's the problem with it?
RewriteRule ^cdn([^/]*).1$ http://cdn.website.com/$1 [L]
The simplest thing would be to use .htaccess to redirect all 404 pages to a single PHP page, which would in turn check whether such a file/image exists in your new cdn folder and either serve it or display a valid 404 page.
Don't forget about headers when serving images etc., too ;-)
Maybe try making the sub-domains all point to the same place (wherever cdn.website.com points to) and then stick the following in the htaccess on that domain:
RewriteCond %{HTTP_HOST} ^(cdn-1|cdn1|img1|etc)\.website\.com
RewriteRule (.*) http://cdn.website.com/$1 [R=301,L]
That should work seamlessly, as long as the files would otherwise be in the same place. That is, only the host name is different and not also the file path after the host name. The pattern in a RewriteCond is a regular expression, so you should be able to write it to match whatever you need.
I have used this method successfully to redirect an old domain to a new one, and to redirect example.com to www.example.com.
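To cover arbitrary numbered hosts rather than a fixed list, the condition could be loosened along these lines. This is a sketch that assumes every such sub-domain resolves to this server and is handled by the same virtual host (for example through wildcard DNS and a ServerAlias), which the question does not confirm.

```apache
RewriteEngine on
# Match any numbered legacy host such as cdn-3, cdn3 or img1,
# with an optional port. At least one digit is required, so
# cdn.website.com itself never matches and the redirect cannot loop.
RewriteCond %{HTTP_HOST} ^(cdn-?|img)[0-9]+\.website\.com(:[0-9]+)?$ [NC]
RewriteRule ^(.*)$ http://cdn.website.com/$1 [R=301,L]
```

As with the rule above, only the host name changes; the path after it is carried over untouched, so files that moved to a different path would need their own rules.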
I have a basic MVC system that is sending POST data to URLs such as
admin/product/add/
But this is giving me an error
Forbidden
You don't have permission to access
/admin/product/add/ on this server.
Additionally, a 404 Not Found error was encountered while trying to
use an ErrorDocument to handle the request.
The RewriteRule is simply
RewriteRule ^(.*)/$ index.php?uri=$1
Last time I saw this on a server changing file/directory permissions to 755 seemed to fix it but not this time. I have never really understood the reason for the error so was hoping someone may be able to provide some more information?
You have 2 errors:
You don't have permission to access /admin/product/add/ on this server.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
The second one is almost certainly a consequence of the same bug. You may have something in your Apache configuration that takes 404 errors away from the default HTTP server handling and pushes them to your PHP application. If this PHP application were working we would get a nice 404, but...
The first one tells you your PHP application is not running at all.
So, this first error tells us that Apache tried to directly access the directory /path/to/documentroot/admin/product/add/ on your server and to produce a listing of it (a listing of the directory content would only be produced if Apache were authorized to do so). But of course this is not a real directory on your server. It is a virtual path in your application. So Apache ends up with a 404 (which leads to error 2).
The application handles a virtual path; Apache does not manage it. The RewriteRule's job is to catch the requested path before Apache tries to serve it and hand it to one single PHP file (index.php) as a query string argument.
So... this rewrite rule was not applied. Many things could prevent this rule from being applied:
mod_rewrite not activated: is the module present and enabled (RewriteEngine on)?
syntax error: mod rewrite syntax is quite hard to read, sometimes really complex. But here it seems quite simple.
The file the RewriteRule points to may not be a valid target for Apache. If the index.php file is not present in the DocumentRoot, or not readable by the Apache user, then Apache will fail. Warning: having a file readable by the Apache user means having read rights on the file but also execute rights on all parent directories for the Apache user. This is where your classic chmod/chown solutions fix the problem.
The rule must be in a valid configuration file. Is this rule in an Apache configuration file, inside a Location or Directory section? Or maybe in the global scope (this may alter the RewriteRule syntax)? Or is it in a .htaccess file? If it's a .htaccess file, does Apache read .htaccess files, and are mod_rewrite instructions allowed there (they are not with AllowOverride None)? Are there other .htaccess files taking precedence?
So to fix the problem:
If you have Apache 2.2.16 or later, you can replace the RewriteRule with FallbackResource /index.php to check whether the problem comes from mod_rewrite.
try to directly request index.php, so that at least a direct request to this file does work
try to directly access a valid resource in the DocumentRoot (a txt file, an image, something that should not be handled by the rewrite but served directly)
check that, if any of your virtual paths could map to real physical paths, Apache is not trying to serve the physical one (as happens when you write a RewriteCond %{REQUEST_FILENAME} -d) but really pushes the path to index.php
check the Apache error logs.
debug mod_rewrite with RewriteLog and RewriteLogLevel (Apache 2.2; on Apache 2.4 these were replaced by LogLevel with a rewrite:traceN level, for example LogLevel warn rewrite:trace3)
collect facts, settings and tests, and then push that to SO or Server Fault.
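The first check in the list could look like this. It is a sketch, assuming the rule lives in a .htaccess file in the DocumentRoot and that index.php sits next to it.

```apache
# Temporarily disable the rewrite rule for the test...
# RewriteRule ^(.*)/$ index.php?uri=$1

# ...and let mod_dir hand every request that matches no real file
# to index.php instead (Apache 2.2.16 or later).
FallbackResource /index.php
```

Note that index.php does not receive the uri query parameter this way; for the duration of the test it would have to read the requested path from $_SERVER['REQUEST_URI']. If the 403 disappears, the problem is on the mod_rewrite side; if it stays, look at permissions and AllowOverride.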
So the problem is quite simple: the PHP application is not receiving the request. But there is a very large number of ways to end up in this state. The message in itself is not very important. The only way to find the error is to check all the parameters (or to have years of bug-fixing experience and to have developed a pre-cognitive intuition organ for LAMP bugs, usually a beard, like admins do). And the only way for us to help you is to spot strange facts in a big list of configuration details. This is why good questions contain a lot of information, even if all of that information looks simply "classical" to you.
EDIT
To clarify the problem you should edit your question and track the POST requests with tools such as Chrome developer tools or Firebug (keep the network tracking in record mode to catch several POSTs), or try to replay the POST with the replay feature of Live HTTP Headers. You should try to isolate the problematic POST and give us details. Debugging is not magic.
Now, I do know of one seemingly magical, random POST failure: the empty URL bug. It could be that (or not). You may have an empty URL hidden somewhere (<IMG SRC="">, url() in CSS, or an empty LINK in the headers, for example). An empty URL resolves to the current page, so the browser treats it as "replay the request that produced the source page", and some browsers even replay the POST that produced the page if there was one. This could lead to broken hidden POSTs.
It could also be that the POST is not sent to the right server. Hard to say. So please collect the information from your comments, add some more network analysis and edit the question, which currently does not contain enough facts.
Try this:
RewriteCond %{REQUEST_METHOD} =POST
RewriteRule ^(.*)/$ index.php?uri=$1
Use this:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?uri=$1 [L]
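For readers unsure what the two conditions do, here is the same front-controller rule with comments. The QSA flag is my addition (an assumption that the application also wants any original query string); the rest is the rule as given.

```apache
RewriteEngine on
# Leave requests for files that really exist alone
# (images, CSS, index.php itself).
RewriteCond %{REQUEST_FILENAME} !-f
# Leave requests for directories that really exist alone.
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else is a virtual path: pass it to index.php as "uri".
# QSA appends the original query string instead of discarding it.
RewriteRule ^(.*)$ /index.php?uri=$1 [QSA,L]
```

Because this is an internal rewrite (no R flag), a POST to /admin/product/add/ reaches index.php with its method and body intact; an external redirect would usually make the browser follow up with a GET.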
Also, use only the www or the non-www domain, not both at the same time. Redirect users with htaccess to whichever one you prefer...
NonWWW to WWW:
RewriteCond %{HTTP_HOST} !^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
WWW to NonWWW:
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
I have two different domains that both point to my homepage in the same server.
I want to log every single access made to my homepage and log which domain the user used to access my homepage, how can I do this?
I tried mod_rewrite in Apache and logging to a MySQL database with PHP but all I could do was infinite loops.
Any ideas?
EDIT:
By your answers, I see you didn't get what I want...
As far as I know Google Analytics does not allow me to differentiate the domain being used if they both point to the same site and it also does not allow me to see that some files like images were accessed directly instead of through my webpages.
I also can't just use $_SERVER['HTTP_HOST'] because, like I just said, I want to log EVERYTHING: images and all other files, every single request, even if the file doesn't exist.
As for Webalizer, I never saw it differentiate between domains. It always assumes the default domain configured in the account and uses that as the root; it doesn't even display it. I'll have to check it again, but I'm not sure it will do what I want...
INFINITE LOOP:
The approach I tried involved rewriting the URLs in Apache with a simple rewrite rule pointing to a PHP script. The PHP script would log the entry into a MySQL database and then send the user back to the file with the header() function. Something like this:
.htaccess:
RewriteCond %{HTTP_HOST} ^(www\.)?domain1\.net [NC]
RewriteRule ^(.*)$ http://www.domain1.net/logscript?a=$1 [NC,L]
RewriteCond %{HTTP_HOST} ^(www\.)?domain2\.net [NC]
RewriteRule ^(.*)$ http://www.domain2.net/logscript?a=$1 [NC,L]
PHP Script:
$url = $_GET['a'];
$domain = $_SERVER['HTTP_HOST'];
// Code to log the entry into the MySQL database
header("Location: http://$domain/$url");
exit();
So, I access some file, point that file to the PHP script and the script will log and redirect to that file... However, when PHP redirects to that file, the htaccess rules will pick it up and redirect again to the PHP script, creating an infinite loop.
The best thing to do would be to parse the server logs. Those will show the domain and request. Even most shared hosting accounts provide access to the logs.
If you're going to go the rewrite route, you could use RewriteCond to check the HTTP_REFERER value to see if the referer was a local link or not.
RewriteCond %{HTTP_HOST} ^(www\.)?domain1\.net [NC]
RewriteCond %{HTTP_REFERER} !^(.*)domain1(.*)$ [NC]
RewriteRule ^(.*)$ http://www.domain1.net/logscript?a=$1 [NC,L]
RewriteCond %{HTTP_HOST} ^(.*)domain2\.net [NC]
RewriteCond %{HTTP_REFERER} !^(.*)domain2(.*)$ [NC]
RewriteRule ^(.*)$ http://www.domain2.net/logscript?a=$1 [NC,L]
You may also want to post in the mod_rewrite forum. They have a whole section about handling domains.
If Google Analytics is not your thing,
$_SERVER['HTTP_HOST']
holds the domain that is used, you can log that (along with time, browser, filepath etc). No need for mod_rewrite I think. Check print_r($_SERVER) to see other things that might be interesting to log.
Make sure to still escape all the log values (with mysql_real_escape_string(), or with prepared statements on PHP 7 and later, where the old mysql_* functions no longer exist); it's trivially easy to inject SQL via the browser's user-agent string, for example.
So, I access some file, point that file to the PHP script and the script will log and redirect to that file... However, when PHP redirects to that file, the htaccess rules will pick it up and redirect again to the PHP script, creating an infinite loop.
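Can you check for HTTP headers in the RewriteCond? If so, try setting an extra header alongside the redirect in PHP (by convention custom HTTP headers start with 'X-', so it could be header('X-stayhere: 1');), and if the X-stayhere header is present, the RewriteCond fails and it doesn't forward the browser to the PHP script. One caveat: a browser does not send a redirect's response headers back on the follow-up request, so in practice the marker would have to travel in the redirected URL itself (a query-string parameter, say) rather than in a header.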
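A rough sketch of a loop guard that carries the marker in the URL. It assumes the PHP script appends a hypothetical logged=1 parameter to the Location it redirects to; that parameter name is made up for illustration, and /logscript is the path from the question.

```apache
RewriteEngine on
# Never rewrite the logging script itself.
RewriteCond %{REQUEST_URI} !^/logscript
# Skip requests that already carry the marker, i.e. the ones the
# PHP script has just logged and redirected.
RewriteCond %{QUERY_STRING} !(^|&)logged=1(&|$)
# Internal rewrite, so HTTP_HOST still holds whichever domain was used
# and one rule serves both domains.
RewriteRule ^(.*)$ /logscript?a=$1 [L]
```

Anyone can add logged=1 by hand to skip the logging, so this is only suitable when the log does not need to be tamper-proof.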
If, however, you can cron a script to download the server logs and run them through some freeware logfile analyzer, I'd go with that instead. Having two redirects for every request is a fair bit of overhead.. (and if I was more awake I might be able to come up with different solutions)
Does Google Analytics not provide this option? Or could you not parse your server log files?
Why not use the access log facility built into Apache?
Apache has a "piped log" feature that allows you to redirect the access log to any program.
CustomLog "|/path/to/your/logger" common
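Since the stock common format does not record which domain was requested, the log format probably needs the Host header added. A minimal sketch, where hostcommon is a nickname I made up and the logger path is the placeholder from the line above:

```apache
# Server or virtual-host configuration (not .htaccess).
# "hostcommon" = the common log format with the request's
# Host header prepended as the first field.
LogFormat "%{Host}i %h %l %u %t \"%r\" %>s %b" hostcommon
CustomLog "|/path/to/your/logger" hostcommon
```

Every request is logged this way, including images and requests that end in a 404, and the logger program receives one line per request on its standard input.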