Efficient Method for Preventing Hotlinking via .htaccess

I need to confirm something before I go accuse someone of ... well I'd rather not say.
The problem:
We allow users to upload images and embed them within text on our site. In the past we allowed users to hotlink to our images as well, but due to server load we unfortunately had to stop this.
Current "solution":
The method the programmer used to solve our "too many connections" issue was to rename the file that receives and processes image requests (image_request.php) to image_request2.php, and replace the contents of the original with
<?php
header("HTTP/1.1 500 Internal Server Error") ;
?>
Obviously this has caused all images with their src attribute pointing to the original image_request.php to be broken, and is also the wrong code to be sending in this case.
Proposed solution:
I feel a more elegant solution would be:
In .htaccess
If the request is for image_request.php
Check referrer
If referrer is not our site, send the appropriate header
If referrer is our site, proceed to image_request.php and process image request
What I would like to know is:
Compared to simply returning a 500 for each request to image_request.php:
How much more load would be incurred if we were to use my proposed alternative solution outlined above?
Is there a better way to do this?
Our main concern is that the site stays up. I am not willing to agree that breaking all internally linked images is the best / only way to solve this. I refuse to tell our users that because of something WE changed they must now manually change the embed code in all their previously uploaded content.

OK, then you can use Apache's mod_rewrite to prevent hotlinking:
http://www.cyberciti.biz/faq/apache-mod_rewrite-hot-linking-images-leeching-howto/

Using mod_rewrite will probably give you less load than running a PHP script. I think your solution would be lighter.
Make sure that you only block access in step 3 if the referer header is not empty. Some browsers and firewalls block the referer header completely and you wouldn't want to block those.

I assume you store image paths in a database, keyed by image ID, right?
And then you query the database for the image path, given the image ID.
I suggest you install Memcached on the server and cache user requests. It's easy to do in PHP. After that you can look at the server load and decide whether you need to stop hotlinking at all.

Your increased load is equal to that of a string comparison in PHP (zilch).
The obfuscation solution doesn't even solve the problem to begin with, as it doesn't stop future hotlinking from happening. If you do check the referrer header, make absolutely certain that all major mainstream browsers will set the header as you expect. It's an optional header, and the behavior might vary from browser to browser for images embedded in an HTML document.
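To give a feel for how small that string comparison is, here is a minimal sketch of a referrer check that could sit at the top of image_request.php. The function name and the host allow-list are made up for illustration, and a missing or empty Referer is let through because the header is optional.

```php
<?php
// Hypothetical allow-list; replace with the site's real host names.
const ALLOWED_HOSTS = ['www.example.com', 'example.com'];

/**
 * Decide whether a request may be served, based on its Referer header.
 * An empty or missing Referer is allowed, because some browsers and
 * firewalls strip the header entirely.
 */
function isRefererAllowed(?string $referer, array $allowedHosts = ALLOWED_HOSTS): bool
{
    if ($referer === null || $referer === '') {
        return true;
    }
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // malformed Referer: treat it as a foreign site
    }
    return in_array(strtolower($host), $allowedHosts, true);
}

// Possible use at the top of image_request.php (left commented out here):
// if (!isRefererAllowed($_SERVER['HTTP_REFERER'] ?? null)) {
//     http_response_code(404);
//     exit;
// }
```

The cost per request is one parse_url call and one array lookup, which should be negligible next to the image processing itself.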
You likely have sessions enabled for all requests (whether they're authenticated or not) -- as a backup plan, you can also rename your session cookie name to something obscure (edit: obscurity here actually doesn't matter as long as the cookie is set for your host only (and it is)) and check that a cookie by that name is set in image_request.php (no cookie set would indicate that it's a first-request to your site). Only use that as a fallback or redundancy check. It's worse than checking the referrer.
If you were generating the IMG HTML on the fly from markdown or something else, you could use a private key hash strategy with a short-lived expiry time attached to the query string. Completely airtight, but it seems way over the top for what you're doing.
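A rough sketch of that signed-URL idea, assuming an HMAC-SHA256 signature over the image ID and an expiry timestamp. The function names, the parameter names (id, expires, sig), the 300-second lifetime and the key are all illustrative choices, not anything from the original setup.

```php
<?php
// Hypothetical secret; in practice load it from configuration.
const SIGNING_KEY = 'replace-with-a-long-random-secret';

/** Build a query string such as "id=42&expires=1700000300&sig=...". */
function signImageQuery(string $imageId, int $now, int $ttl = 300, string $key = SIGNING_KEY): string
{
    $expires = $now + $ttl;
    $sig = hash_hmac('sha256', $imageId . '|' . $expires, $key);
    return http_build_query(['id' => $imageId, 'expires' => $expires, 'sig' => $sig]);
}

/** Return true only if the signature matches and the link has not expired. */
function verifyImageQuery(array $query, int $now, string $key = SIGNING_KEY): bool
{
    foreach (['id', 'expires', 'sig'] as $name) {
        if (!isset($query[$name]) || !is_string($query[$name])) {
            return false; // missing or non-scalar parameter
        }
    }
    if ((int) $query['expires'] < $now) {
        return false; // link has expired
    }
    $expected = hash_hmac('sha256', $query['id'] . '|' . $query['expires'], $key);
    // hash_equals compares in constant time to avoid timing leaks.
    return hash_equals($expected, $query['sig']);
}
```

The page that renders the IMG tag would call signImageQuery(..., time()) and image_request.php would call verifyImageQuery($_GET, time()), so a copied link stops working once the expiry passes.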
Also, there is no "appropriate header" for lying to a client about the availability of a resource ;) Just send a 404.

Related

Convert external resource to https

My site is loading images from other sites, and this has been causing warnings since I switched from plain HTTP to HTTPS. I know why this is happening, but I'm wondering how to correct it.
Best solution I have seen is here, but I don't understand how that works.
The poster suggests prepending https://example.com/imageserver?url= to the image url. This doesn't work. So what am I missing? What is imageserver?
I hope this makes sense, I'm not sure if I'm not just missing something obvious here.

imageserver could be a PHP script that fetches the image and outputs its contents.
A very simple example (not very safe):
echo file_get_contents($_GET['url']);
The idea here is that the browser now gets the images from your secure server instead of the original non-https server.
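The one-liner is unsafe mainly because file_get_contents will happily read whatever it is given, including local paths such as /etc/passwd. A minimal sketch of a guard, assuming you know in advance which remote hosts you embed images from (the host list and function name are hypothetical):

```php
<?php
// Hypothetical allow-list of remote hosts the proxy may fetch from.
const PROXY_HOSTS = ['images.example.org'];

/** Accept only absolute http(s) URLs that point at an allowed host. */
function isProxyUrlAllowed(string $url, array $hosts = PROXY_HOSTS): bool
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false; // relative paths and local files have no scheme/host
    }
    if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return false; // rejects file://, php://, ftp:// and so on
    }
    return in_array(strtolower($parts['host']), $hosts, true);
}

// Possible use inside the imageserver script (left commented out here):
// $url = $_GET['url'] ?? '';
// if (!isProxyUrlAllowed($url)) {
//     http_response_code(400);
//     exit;
// }
// echo file_get_contents($url);
```

A real proxy would also need to send a suitable Content-Type header and probably cache the fetched images; both are left out of this sketch.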

Do browsers cache PHP generated CSS and Javascript files?

Simple question.
Do browsers cache PHP generated CSS and script files automatically, just like CSS/JS files?

Sure, barring explicit acts to prevent caching. The browser has no way of knowing if the file was a static or dynamically generated resource.

If the URL remains the same, and there aren't hints in the HTTP responses to tell the browser otherwise, they can be cached.
If the URL includes dynamic information, the browser probably won't be able to take advantage of caching.
Changing the URL by adding a timestamp as a dummy parameter (e.g. http://host/myfile.php?t=17279273) is one of the ways you can prevent caching since the browser sees the slight change as a new resource.

Jonathon's answer suggesting the addition of a timestamp to prevent caching is a good one.
A useful tip along these lines is to append the creation/last modified date of a file. Doing this means that while unchanged the browser will cache the file, but when you update the file those changes are forced to your users.
It's not always the best option, but worth noting.
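A small sketch of the last-modified variant, assuming the file is readable on disk from PHP. The function name and the "v" parameter name are arbitrary.

```php
<?php
/**
 * Append the file's last-modified time as a dummy query parameter,
 * so the URL changes only when the file itself changes.
 */
function versionedUrl(string $publicUrl, string $filePath): string
{
    $mtime = @filemtime($filePath); // false if the file is missing
    if ($mtime === false) {
        return $publicUrl; // fall back to the plain URL
    }
    $separator = (strpos($publicUrl, '?') === false) ? '?' : '&';
    return $publicUrl . $separator . 'v=' . $mtime;
}

// Possible use in a template (paths are illustrative):
// echo '<link rel="stylesheet" href="'
//     . htmlspecialchars(versionedUrl('/css/site.css', __DIR__ . '/css/site.css'))
//     . '">';
```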

How to properly serve CSS

Say I for some reason want to serve my CSS through PHP (because of pre-processing, merging, etc). What do I need to do in my PHP to make this work well? Other than the most obvious:
header('content-type: text/css; charset=utf-8');
What about headers related to caching, modification times, etags, etc? Which ones should I use, why and how? How would I parse incoming headers and respond appropriately (304 Not Modified for example)?
Note: I know this can be tricky and that it would be a lot easier to just do what I want to do with the CSS before I deploy it as a regular CSS file. If I wanted to do it that way, I wouldn't have asked this question. I'm curious as to how to do this properly and would like to know. What I do or could do beforehand with the CSS is irrelevant; I just want to know how to serve it properly :)
Note 2: I really would like to know how to do this properly. I feel most of the activity on this question has turned into me defending why I would want to do this, rather than getting answers on how to do this. Would very much appreciate it if someone could answer my question rather than just suggesting things like SASS. I'm sure it's awesome, and I might try it out sometime, but that's not what I'm asking about now. I want to know how to serve CSS through PHP and learn how to deal with the caching and things like that properly.

A commendable effort. Caching gets way too little good will. Please enjoy my short prose attempting to help you on your way.
The summary
Sending an ETag and a Last-Modified header will enable the browser to send an If-Modified-Since and an If-None-Match header back to your server on subsequent requests. You may then, when applicable, respond with a 304 Not Modified HTTP status code and an empty body, i.e. Content-Length: 0. Including an Expires header will help you to serve fresh content one day when the content has indeed changed.
The apprentice
Sounds simple enough, but it can be a bit tricky to get just right. Luckily for us all, there is really good guidance available.
Once you get it up and running, please turn to REDbot to help you smooth out any rough corners you may have left in.
The expert
For the value of the ETag, you will want to have something you can reproduce, but will still change whenever the content does. Otherwise you will not be able to tell whether the incoming value matches or not. A good candidate for a reproducible value which still changes when the content does, is an MD5 hash of the mtime of the file being served through the cache. In your case, it would probably be a sum for all the files being merged.
For Last-Modified the logical answer is the actual mtime of the file being served. Why neglect the obvious? For a group of files, as in your case, use the most recent mtime in the bunch.
For Expires, simply choose an appropriate TTL, or time-to-live, for the asset. Add this number to the asset's mtime, or the value you chose for Last-Modified, and you have your answer.
You may also want to include Cache-Control headers to let possible proxies on the way know how to properly serve their clients.
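Put together, the validator logic might look roughly like the sketch below. It assumes the merged CSS comes from a non-empty list of files whose mtimes you can read. The function names are made up, and the ETag comparison is simplified (it does not handle weak validators prefixed with W/).

```php
<?php
/**
 * Compute validators for a set of files merged into one response.
 * Returns [etag, lastModifiedTimestamp]; $mtimes must not be empty.
 */
function validatorsFor(array $mtimes): array
{
    // Reproducible, yet changes whenever any file's mtime changes.
    $etag = '"' . md5(implode(',', $mtimes)) . '"';
    return [$etag, max($mtimes)];
}

/**
 * Decide whether a 304 Not Modified response is appropriate.
 * If-None-Match takes precedence over If-Modified-Since.
 */
function isNotModified(string $etag, int $lastModified, ?string $ifNoneMatch, ?string $ifModifiedSince): bool
{
    if ($ifNoneMatch !== null) {
        $candidates = array_map('trim', explode(',', $ifNoneMatch));
        return in_array('*', $candidates, true) || in_array($etag, $candidates, true);
    }
    if ($ifModifiedSince !== null) {
        $since = strtotime($ifModifiedSince);
        return $since !== false && $lastModified <= $since;
    }
    return false;
}

// Possible use in the script that serves the CSS ($cssFiles is hypothetical):
// [$etag, $lm] = validatorsFor(array_map('filemtime', $cssFiles));
// header('ETag: ' . $etag);
// header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lm) . ' GMT');
// if (isNotModified($etag, $lm,
//         $_SERVER['HTTP_IF_NONE_MATCH'] ?? null,
//         $_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null)) {
//     http_response_code(304);
//     exit;
// }
```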
The scholar
For a more concrete response to your question, please refer to these questions predating yours:
What headers do I want to send together with a 304 response?
Get Browser to send both If-None-Match and If-Modified-Since
HTTP if-none-match and if-modified-since and 304 clarification in PHP
Is my implementation of HTTP Conditional Get answers in PHP is OK?

The easiest way to serve CSS (or JavaScript) through PHP would be to use Assetic, a super-useful PHP asset manager similar to Django's contrib.staticfiles or Ruby's Jammit. It handles caching and cache invalidation, dynamic minification, compression, and all the "tricky bits" that were mentioned in other answers.
To understand how to write your own asset server properly, I strongly recommend you read Assetic's source code. It's well commented and readable, and you'll learn a lot about best practices regarding caching, minification, and everything else that Assetic does so well.

One common pattern is to include a meaningless GET parameter. In fact, Stack Exchange sites do exactly this:
<link ... href="http://cdn.sstatic.net/stackoverflow/all.css?v=0285b0392b5c">
The v (version) is presumably a hash of some kind, probably of the css file itself. They do not store the old sheets, it's just a way to force the browser to download the new file and not use the cached one.
With this setup, it is safe to set Cache-Control:max-age to a large value.
An ETag lets the server reply 304 if the file is not modified; you might as well use the same hash of the file's contents:
header('ETag: "' . md5_file("path to css file") . '"');

I just finished explaining here why I don't think PHP-processed CSS is a good idea; I believe most people who implement it would be better served by another application structure. Take a look.
If you must do it, making caching work will require keeping track of each variant independently and having the client send a parameter which uniquely identifies that variant (so you can say "not modified").
The Content-Type header is a good start, but not the tricky bit...

Add a query string to the end of the CSS or JavaScript file's URL. It is a good way to tell the browser that the file is new; otherwise browsers assume it is the same file:
www.example.com/css/tooltip.css?version1.0
or
www.example.com/css/tooltip.css?12-01-2012
The browser then treats this as a new file and reloads it, and keeps it in cache until the next release. It is also easy to maintain if you use PHP to append the date to the query string automatically.

set browser URL in PHP header

I'll try to explain my query in the best of my ability. I would appreciate your help in this :)
There is a flash application (SWF) that I am outputting via a PHP file using header("Content-type: application/x-shockwave-flash");
Case A: When this flash file is loaded from http://www.a.com/flash.php?display=hello, it works the way it was intended to. Say for example it displays "hello".
Case B: When this flash file is loaded from http://www.b.com/flash.php?display=hello, it does not work. Say now it displays "Bye bye".
Important thing to note is that I did send the display=hello on the www.b.com but it somehow internally checks that it is being called from a website other than www.a.com and defaults to "bye bye".
The flash file is on the local web-server as flash.swf
How can I set the headers in that PHP file so that flash.swf thinks it is being called from www.a.com rather than www.b.com, even though it is being called from www.b.com?
I do not know how the flash.swf file is doing all these checks, and I don't think there is any way for me to find out (decompile swf, etc -- but I'd like to avoid that route).
Is there perhaps a header I can set in the PHP file or set an environment variable to fix this issue?

If a.com is not under your control, I don't think this is possible to circumvent, because Flash can check for the current movie's URL in the browser, which you won't be able to manipulate in PHP.
If a.com is under your control, you could use an iframe as @Khez points out.
Otherwise, you'll have to talk to the author of the original Flash file and ask for the check to be changed (which is probably what you want to circumvent in the first place).

This will not be possible for security reasons.
For example, if your bank was trying to run this file, being able to fake the source would mean someone could steal from you.
The browser will enforce this and will prevent you from faking the source of your request, so it will not be possible for you to change this.

verifying a domain using php

I have a member area where users can add their domains, which are then displayed on their profile page. Now I want to add a verification process, just like Google Webmasters does, where they need to upload a certain file and so on.
Please tell me, what's the best way to do this?
Thanks :)

Generate a token for each domain (sha-1 of domain or so), store it in your DB or what have you.
Generate a text-file containing the token on user request.
Ask the user to tell you when to poll, or poll every now and then, to check the URL. This can easily be done with file_get_contents in PHP if allow_url_fopen is enabled.
The token is obviously compared to the token in your DB to make sure it wasn't just a random file present at a random domain.
It could be a good idea to check at some time interval that the file is still there, to keep someone from selling the domain but remaining in control.
It's not really black art as we can assume the user has access to its domain once any specific request which proves access can be fulfilled by the user. There's no real way to fool the system except doing some DNS-magic, or gaining entry to the webserver running on the domain, which is out of your control anyway.

Not sure if that's the best way, but I think Google does something like this:
get user's domain name (e.g. "http://example.com")
generate unique code and store in db
tell user where to upload the code (e.g. something like "/verification.txt")
after confirmation, make an HTTP request for the code ("http://example.com/verification.txt") from your own server to the user's server
compare the code you received to the code in the db
You may want to generate consistently the same code for the same domain.
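One way to get the same code for the same domain without storing anything guessable is to derive it from the domain name and a server-side secret. This is only a sketch: the HMAC is my substitution for the plain hash suggested in the answers (a plain hash of the domain can be computed by anyone), and the function names and secret are hypothetical.

```php
<?php
// Hypothetical server-side secret; keep it out of source control.
const VERIFY_SECRET = 'replace-with-a-long-random-secret';

/** Deterministic token: the same domain always yields the same code. */
function verificationToken(string $domain, string $secret = VERIFY_SECRET): string
{
    return hash_hmac('sha256', strtolower(trim($domain)), $secret);
}

/** Compare the fetched file body with the token expected for the domain. */
function fileProvesOwnership(string $fileContents, string $domain, string $secret = VERIFY_SECRET): bool
{
    // The file must contain only the token (surrounding whitespace is ignored).
    return hash_equals(verificationToken($domain, $secret), trim($fileContents));
}

// Possible use (requires allow_url_fopen; the file name is illustrative):
// $body = @file_get_contents('http://' . $domain . '/verification.txt');
// $verified = $body !== false && fileProvesOwnership($body, $domain);
```

Requiring the file to contain nothing but the token also closes the loophole where the code merely appears somewhere on a page the user does not control.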

This question is convoluted. I think you need to spell out what you are looking for a little better.
EDIT #1:
Generate an md5 and give it to the user, tell them to put it on their domain and provide a URL to where it is. This could be in a txt file or anything.
Then read that file and check if the md5 string exists in there.
Actually I would come up with something slightly different than an md5. Maybe three of them, so that you reduce the chance they find it on some other domain and then give you that URL.
This can still be spoofed unless you nail down constraints, like it has to be a text file, the file must only contain the md5... etc.
Right now I can type in an md5 but it doesn't mean I control this website:
md5("i fooled you") = "0afb2d659b709f8ad499f4b87d9162f0"
But if I handed the URL to this answer, your system might accidentally think I have admin here.
I recommend creating a file and making them upload the file and give you the URL to it. But even that won't necessarily work because there are many sites where you can just upload something.
Maybe if it's a php encoded file that can execute? That's kind of a security flaw because I don't know if I would upload just anyone's PHP file. Typically if you don't have admin nobody is going to let you upload a php file that would work.
You might want to create a php call-home script but that's gonna be bad. People wouldn't use it.

Another way it could be done is:
Get the domain name
Generate a random code/string.
Store this in your database.
Make a meta tag with the random code as its content.
Use file_get_contents to fetch the index page of the website.
Then search the page for the meta tag with the code stored in the database.
Use an if statement to report success or failure.
The meta tag should look like this:
<meta name="site-verification" content="1010101010101010101010101010101010101010" />
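The search step could be sketched like this, assuming PHP's DOM extension is available and the page has already been fetched into a string. The function name is made up; the meta name matches the example above.

```php
<?php
/**
 * Return true if $html contains
 * <meta name="site-verification" content="$expected">.
 */
function hasVerificationMeta(string $html, string $expected): bool
{
    if ($html === '') {
        return false; // loadHTML rejects an empty string
    }
    $doc = new DOMDocument();
    // @ suppresses warnings from real-world, non-well-formed HTML.
    if (!@$doc->loadHTML($html)) {
        return false;
    }
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strtolower($meta->getAttribute('name')) === 'site-verification'
            && hash_equals($expected, $meta->getAttribute('content'))) {
            return true;
        }
    }
    return false;
}

// Possible use (requires allow_url_fopen; $domain and $code come from your DB):
// $html = @file_get_contents('http://' . $domain . '/');
// $verified = $html !== false && hasVerificationMeta($html, $code);
```

Parsing the markup rather than doing a plain substring search means the code only counts when it sits in a meta tag, not when it merely shows up in the page text.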

Actually, just creating an md5 string for the domain name and letting the site owner put that in a meta tag, so you can check for it, would already work fine ...
