One-Way Site-to-Site Authentication (PHP/Apache)

I have a site which offers help information for users of a much larger application. Until recently both my help site and the main application were behind a corporate firewall. Now that the main application has been moved outside the firewall, I'll have to move my help site as well.
My only security requirement is that users reaching my help site are only permitted to enter if they clicked the 'Help' link in the main application. (Obviously the company doesn't want them to have to enter their credentials again.) I don't need to exchange information back and forth between the sites.
I have looked at $_SERVER['HTTP_REFERER'] (not secure), OAuth and OpenID (which seem like overkill). I'm wondering if the answer lies in one-way SSL authentication (the main app has a cert), but I'm getting a little lost here.
So the question is: what is the simplest way to do this, and what would that look like in terms of Apache and PHP?
Thank you very much for any advice!

The link to help could contain a token string. When the user clicks the link, the help system sees that token and makes a web service call to your application asking if that token is valid. If the token is valid, then the web service responds in the affirmative and the help site lets the user in. You could have the token be valid only if that user is signed in. Additionally, you could encode the IP address of the client in the URL and verify that the person trying to enter the help system is from the same IP address. So like this:
You have a link to help with a token: http://help.yoursite.com/?token=<unique id>&client=md5(<client ip>)
The user clicks that link, which takes them to help.yoursite.com.
help.yoursite.com checks that the md5(<client ip>) matches the client parameter of the url. If so, it's probably the same person and not a spoof.
help.yoursite.com then makes a web service call to yoursite.com asking if the unique id for that client ip is valid.
yoursite.com checks if it's valid and returns yes or no, and possibly the username of the person logged in so that help.yoursite.com will have the username of the person logged in.
help.yoursite.com takes the response and either lets the user in or not.
That way you've made sure the client is the same one, and that they were logged in to the other site. Your communication between yoursite and help.yoursite will have to be secure. It's a lot simpler than OAuth, and even kinda follows a similar protocol, BUT it's not as secure overall. There are still ways around it, but it's all up to how much risk you're willing to accept.
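The flow above can be sketched in a few lines. This is only a minimal sketch, not a hardened implementation: it is written in Node.js rather than PHP, keeps both "sites" in one process, and swaps the md5 of the client IP for SHA-256. The names issueHelpLink, validateToken and helpSiteAdmit are made up for illustration, and the in-memory Map stands in for whatever the main application really uses to remember issued tokens.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory store on the main site: token -> { user, ipHash, expires }.
const issuedTokens = new Map();

// Hash the client IP so the raw address never appears in the URL.
function hashIp(ip) {
  return crypto.createHash("sha256").update(ip).digest("hex");
}

// Main site: called when rendering the Help link for a signed-in user.
function issueHelpLink(user, clientIp, ttlMs = 5 * 60 * 1000) {
  const token = crypto.randomBytes(16).toString("hex");
  const ipHash = hashIp(clientIp);
  issuedTokens.set(token, { user, ipHash, expires: Date.now() + ttlMs });
  return `https://help.yoursite.com/?token=${token}&client=${ipHash}`;
}

// Main site: stands in for the web service the help site calls.
function validateToken(token, ipHash, now = Date.now()) {
  const entry = issuedTokens.get(token);
  if (!entry || entry.expires < now || entry.ipHash !== ipHash) {
    return { valid: false };
  }
  return { valid: true, user: entry.user };
}

// Help site: check the client parameter locally, then ask the main site.
function helpSiteAdmit(url, requestIp) {
  const params = new URL(url).searchParams;
  const client = params.get("client");
  if (client !== hashIp(requestIp)) return { valid: false };
  return validateToken(params.get("token"), client);
}
```

In a real deployment validateToken would sit behind an HTTPS endpoint on yoursite.com, and the help site would call it over the network instead of in-process.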

Depends on just how independent the two sites are. If they share access to something private (a database or some file space) then you could consider passing a random value between them. Otherwise you can use a time + salt approach, which allows only those with the same salt/algorithm to generate valid URLs.
M = Main site
H = Help site
K = Key value passed between them (in plain text is fine)
A: Server time + Salt:
XMIT: M hashes time (rounded to -say- nearest ten minute marker) + some random value (salt) to make K. And appends K to the help URL (better yet the help request is a POST so K isn't visible to the user).
RCV: H completes the same hash using the same algorithm and if its hash matches the supplied K then access is granted. Otherwise H shows a blank page (perhaps for security they'd prefer details of the site are kept secret?) or an error message (more risk, but helpful to legitimate users).
REQ: Both sites on same server OR both sites on servers that are reasonably synchronized in time - no need for perfect sync because of the 10 minute quantization. It's important the salt value be identical on both servers and not publicly accessible (it could also be updated if there was ever a risk a third party had figured it out).
SECURITY: The salt is never passed plain text between the two servers, but because the key passed only works for a period of time even someone sniffing the value (or copying it out of source from M) can only get access temporarily. You need to round to the nearest n minute marker (a) to give reasonable time for a page visitor to request help, (b) because the request and check will be a small amount of time apart and (c) because if the sites are on different servers the time won't be identical. The safety comes from keeping the salt and the time calculation algorithm private.
NOTE: On H you may need to test two values of K to allow for cusp cases where rounding leads M and H to different times (because of different times or because of delay in processing)
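A rough sketch of option A, in Node.js for illustration. A few assumptions to flag: it uses an HMAC keyed with the salt rather than a plain hash of time + salt, it floors the time to ten-minute buckets instead of rounding to the nearest marker, and the function names makeKey and checkKey are invented here. H tests the current and the previous bucket, which is one way to handle the cusp cases from the NOTE.

```javascript
const crypto = require("crypto");

const BUCKET_MS = 10 * 60 * 1000; // ten-minute quantization, as in the text

// K = HMAC(salt, time bucket). The salt is the shared secret and is never sent.
function makeKey(salt, timeMs) {
  const bucket = Math.floor(timeMs / BUCKET_MS);
  return crypto.createHmac("sha256", salt).update(String(bucket)).digest("hex");
}

// H accepts K if it matches the current bucket or the one just before it.
function checkKey(salt, suppliedKey, nowMs = Date.now()) {
  const supplied = Buffer.from(String(suppliedKey), "hex");
  return [nowMs, nowMs - BUCKET_MS].some((t) => {
    const expected = Buffer.from(makeKey(salt, t), "hex");
    return expected.length === supplied.length &&
      crypto.timingSafeEqual(expected, supplied);
  });
}
```

M would append makeKey(salt, Date.now()) to the help request and H would call checkKey(salt, K). If H's clock can run behind M's, H may also need to accept the next bucket.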
B: Database/file key:
XMIT: M generates a random value for K and stores it and an expire time in a database table or file that H can access. Again M attaches K (not timeout) to the GET or POST request to H.
RCV: H checks the value against the list of stored values and if found and not timed out then access is granted.
REQ: Both sites have access to shared file storage or database. The database table or file will need to store multiple random values. Either M or H should clean up expired entries (either as part of their operation code, or a scheduled task [cron job] could be set up to complete this at regular intervals).
SECURITY: While K is out there in plain text, there's no advantage to knowing it - again, sniffing the value or copying it out of source from somewhere will only grant access to H temporarily.
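Option B might look something like the sketch below (Node.js, for illustration only). The Map named sharedStore is a stand-in for the shared database table or file, and issueKey, redeemKey and purgeExpired are hypothetical names, not part of any library.

```javascript
const crypto = require("crypto");

// Stand-in for the shared table or file that both M and H can reach.
const sharedStore = new Map(); // K -> expiry time in ms

// M: create a random K, store it with an expiry, and attach only K to the request.
function issueKey(ttlMs, nowMs = Date.now()) {
  const k = crypto.randomBytes(32).toString("hex");
  sharedStore.set(k, nowMs + ttlMs);
  return k;
}

// H: grant access only if K is present and has not timed out.
function redeemKey(k, nowMs = Date.now()) {
  const expires = sharedStore.get(k);
  return expires !== undefined && nowMs <= expires;
}

// Cleanup that either site (or a cron job) could run.
function purgeExpired(nowMs = Date.now()) {
  for (const [k, expires] of sharedStore) {
    if (expires < nowMs) sharedStore.delete(k);
  }
}
```

If each key should work only once, redeemKey could delete the entry on success; the answer above does not require that, so the sketch leaves keys valid until they expire.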
Overall
Depending on how tolerant you are of users hitting timeouts, or unauthorized sources using "found" keys, you can use AJAX to generate the value when the help button is clicked, and keep the timeout very low (17 seconds?)

Related

How to stop form abuse From Different website? [duplicate]

I don't understand how using a 'challenge token' would add any sort of prevention: what value should be compared with what?
From OWASP:
In general, developers need only generate this token once for the current session. After initial generation of this token, the value is stored in the session and is utilized for each subsequent request until the session expires.
If I understand the process correctly, this is what happens.
I log in at http://example.com and a session/cookie is created containing this random token. Then, every form includes a hidden input also containing this random value from the session which is compared with the session/cookie upon form submission.
But what does that accomplish? Aren't you just taking session data, putting it in the page, and then comparing it with the exact same session data? Seems like circular reasoning. These articles keep talking about following the "same-origin policy" but that makes no sense, because all CSRF attacks ARE of the same origin as the user, just tricking the user into doing actions he/she didn't intend.
Is there any alternative other than appending the token to every single URL as a query string? Seems very ugly and impractical, and makes bookmarking harder for the user.

The attacker has no way to get the token. Therefore the requests won't take any effect.
I recommend this post from Gnucitizen. It has a pretty decent CSRF explanation: http://www.gnucitizen.org/blog/csrf-demystified/
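To make the "what is compared with what" part concrete, here is a minimal sketch in Node.js. The session object is a plain object standing in for a real session layer, and the function and field names (ensureCsrfToken, hiddenField, isValidSubmission, csrf_token) are made up for this example. The point is that the session copy never leaves the server, while the form copy has to come back in the request body, which a page on another origin cannot read and so cannot supply.

```javascript
const crypto = require("crypto");

// Generate the token once per session and reuse it afterwards.
function ensureCsrfToken(session) {
  if (!session.csrfToken) {
    session.csrfToken = crypto.randomBytes(32).toString("hex");
  }
  return session.csrfToken;
}

// Render the hidden field that goes into every state-changing form.
function hiddenField(session) {
  return `<input type="hidden" name="csrf_token" value="${ensureCsrfToken(session)}">`;
}

// On submission: compare the posted value with the session copy.
function isValidSubmission(session, postedToken) {
  if (typeof postedToken !== "string" || !session.csrfToken) return false;
  const a = Buffer.from(postedToken);
  const b = Buffer.from(session.csrfToken);
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

A forged cross-site request carries the victim's session cookie automatically, but it arrives without the hidden field value, so isValidSubmission returns false.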

CSRF Explained with an analogy - Example:
You open the front door of your house with a key.
Before you go inside, you speak to your neighbour
While you are having this conversation, someone walks in, while the door is still unlocked.
They go inside, pretending to be you!
Nobody inside your house notices anything different - your wife is like, ‘oh crud*, he’s home’.
The impersonator helps himself to all of your money, and perhaps plays some Xbox on the way out....
Summary
CSRF basically relies on the fact that you opened the door to your house and then left it open, allowing someone else to simply walk in and pretend to be you.
What is the way to solve this problem?
When you first open the door to your house, you are given a paper with a long and very random number written on it by your door man:
"ASDFLJWERLI2343234"
Now, if you wanna get into your own house, you have to present that piece of paper to the door man to get in.
So now when the impersonator tries to get into your house, the door man asks:
"What is the random number written on the paper?"
If the impersonator doesn't have the correct number, then he won't get in. Either that or he must guess the random number correctly - which is a very difficult task. What's worse is that the random number is valid for only 20 minutes (e.g.). So now the impersonator must guess correctly, and not only that, he has only 20 minutes to get the right answer. That's way too much effort! So he gives up.
Granted, the analogy is a little strained, but I hope it is helpful to you.
*crud = (Create, Read, Update, Delete)

You need to keep researching this topic for yourself, but I guess that's why you are posting to SO :). CSRF is a very serious and widespread vulnerability type that all web app developers should be aware of.
First of all, there is more than one same origin policy. But the most important part is that a script being hosted from http://whatever.com cannot READ data from http://victom.com, but it can SEND data via POST and GET. If the request only contains information that is known to the attacker, then the attacker can forge a request on the victim's browser and send it anywhere. Here are 3 XSRF exploits that are building requests that do not contain a random token.
If the site contains a random token then you have to use XSS to bypass the protection that the Same Origin Policy provides. Using XSS you can force javascript to "originate" from another domain, then it can use XmlHttpRequest to read the token and forge the request. Here is an exploit I wrote that does just that.

Is there any alternative other than appending the token to every single URL as a query string? Seems very ugly and impractical, and makes bookmarking harder for the user.
There is no reason to append the token to every URL on your site, as long as you ensure that all GET requests on your site are read-only. If you are using a GET request to modify data on the server, you'd have to protect it using a CSRF token.
The funny part with CSRF is that while an attacker can make any http request to your site, he cannot read back the response.
If you have GET URLs without a random token, the attacker will be able to make a request, but he won't be able to read back the response. If that URL changed some state on the server, the attacker's job is done. But if it just generated some HTML, the attacker has gained nothing and you have lost nothing.

Is there any reliable way to identify the user machine in a unique way? [duplicate]

I need to figure out a way uniquely identify each computer which visits the web site I am creating. Does anybody have any advice on how to achieve this?
Because I want the solution to work on all machines and all browsers (within reason) I am trying to create a solution using JavaScript.
Cookies will not do.
I need the ability to basically create a GUID which is unique to a computer and repeatable, assuming no hardware changes have happened to the computer. Directions I am thinking of are getting the MAC of the network card and other information of this nature which will ID the machine visiting the web site.

Introduction
I don't know if there is or ever will be a way to uniquely identify machines using a browser alone. The main reasons are:
You will need to save data on the user's computer. This data can be deleted by the user any time. Unless you have a way to recreate this data which is unique for each and every machine then you're stuck.
Validation. You need to guard against spoofing, session hijacking, etc.
Even if there are ways to track a computer without using cookies there will always be a way to bypass it and software that will do this automatically. If you really need to track something based on a computer you will have to write a native application (Apple Store / Android Store / Windows Program / etc).
I might not be able to give you an answer to the question you asked but I can show you how to implement session tracking. With session tracking you try to track the browsing session instead of the computer visiting your site. By tracking the session, your database schema will look like this:
session:
    sessionID: string
    // Global session data goes here
    computers: [{
        BrowserID: string
        ComputerID: string
        FingerprintID: string
        userID: string
        authToken: string
        ipAddresses: ["203.525....", "203.525...", ...]
        // Computer session data goes here
    }, ...]
Advantages of session based tracking:
For logged in users, you can always generate the same session id from the user's username / password / email.
You can still track guest users using sessionID.
Even if several people use the same computer (ie cybercafe) you can track them separately if they log in.
Disadvantages of session based tracking:
Sessions are browser based and not computer based. If a user uses 2 different browsers it will result in 2 different sessions. If this is a problem you can stop reading here.
Sessions expire if user is not logged in. If a user is not logged in, then they will use a guest session which will be invalidated if user deletes cookies and browser cache.
Implementation
There are many ways of implementing this. I don't think I can cover them all, so I'll just list my favorite, which would make this an opinionated answer. Bear that in mind.
Basics
I will track the session by using what is known as a forever cookie. This is data which will automagically recreate itself even if the user deletes his cookies or updates his browser. It will not however survive the user deleting both their cookies and their browsing cache.
To implement this I will use the browsers caching mechanism (RFC), WebStorage API (MDN) and browser cookies (RFC, Google Analytics).
Legal
In order to utilize tracking ids you need to add them to both your privacy policy and your terms of use preferably under the sub-heading Tracking. We will use the following keys on both document.cookie and window.localStorage:
_ga: Google Analytics data
__utma: Google Analytics tracking cookie
sid: SessionID
Make sure you include links to your Privacy policy and terms of use on all pages that use tracking.
Where do I store my session data?
You can either store your session data in your website database or on the user's computer. Since I normally work on smaller sites (less than 10 thousand continuous connections) that use 3rd party applications (Google Analytics / Clicky / etc) it's best for me to store data on the client's computer. This has the following advantages:
No database lookup / overhead / load / latency / space / etc.
User can delete their data whenever they want without the need to write me annoying emails.
and disadvantages:
Data has to be encrypted / decrypted and signed / verified which creates cpu overhead on client (not so bad) and server (bah!).
Data is deleted when user deletes their cookies and cache. (this is what I want really)
Data is unavailable for analytics when users go off-line. (analytics for currently browsing users only)
UUIDS
BrowserID: Unique id generated from the browser's user agent string. Browser|BrowserVersion|OS|OSVersion|Processor|MozillaMajorVersion|GeckoMajorVersion
ComputerID: Generated from users IP Address and HTTPS session key.
getISP(requestIP)|getHTTPSClientKey()
FingerPrintID: JavaScript based fingerprinting based on a modified fingerprint.js. FingerPrint.get()
SessionID: Random key generated when user 1st visits site. BrowserID|ComputerID|randombytes(256)
GoogleID: Generated from __utma cookie. getCookie(__utma).uniqueid
Mechanism
The other day I was watching the Wendy Williams show with my girlfriend and was completely horrified when the host advised her viewers to delete their browser history at least once a month. Deleting browser history normally has the following effects:
Deletes history of visited websites.
Deletes cookies and window.localStorage (aww man).
Most modern browsers make this option readily available but fear not friends. For there is a solution. The browser has a caching mechanism to store scripts / images and other things. Usually even if we delete our history, this browser cache still remains. All we need is a way to store our data here. There are 2 methods of doing this. The better one is to use a SVG image and store our data inside its tags. This way data can still be extracted even if JavaScript is disabled using flash. However since that is a bit complicated I will demonstrate the other approach which uses JSONP (Wikipedia)
example.com/assets/js/tracking.js (actually tracking.php)
var now = new Date();
window.__sid = "SessionID"; // Server generated
// Expire in just under one year; setFullYear() returns the new timestamp in ms.
setCookie("sid", window.__sid, now.setFullYear(now.getFullYear() + 1, now.getMonth(), now.getDate() - 1));
if( "localStorage" in window ) {
    window.localStorage.setItem("sid", window.__sid);
}
Now we can get our session key any time:
window.__sid || window.localStorage.getItem("sid") || getCookie("sid") || ""
How do I make tracking.js stick in browser?
We can achieve this using Cache-Control, Last-Modified and ETag HTTP headers. We can use the SessionID as value for etag header:
setHeaders({
    "ETag": SessionID,
    "Last-Modified": new Date(0).toUTCString(),
    "Cache-Control": "private, max-age=31536000, s-maxage=31536000, must-revalidate"
})
Last-Modified header tells the browser that this file is basically never modified. Cache-Control tells proxies and gateways not to cache the document but tells the browser to cache it for 1 year.
The next time the browser requests the document, it will send If-Modified-Since and If-None-Match headers. We can use these to return a 304 Not Modified response.
example.com/assets/js/tracking.php
$sid = getHeader("If-None-Match") ?: getHeader("if-none-match") ?: getHeader("IF-NONE-MATCH") ?: "";
$ifModifiedSince = hasHeader("If-Modified-Since") ?: hasHeader("if-modified-since") ?: hasHeader("IF-MODIFIED-SINCE");
if( validateSession($sid) ) {
    if( sessionExists($sid) ) {
        continueSession($sid);
        send304();
    } else {
        startSession($sid);
        send304();
    }
} else if( $ifModifiedSince ) {
    send304();
} else {
    startSession();
    send200();
}
Now every time the browser requests tracking.js our server will respond with a 304 Not Modified result and force an execute of the local copy of tracking.js.
I still don't understand. Explain it to me
Let's suppose the user clears their browsing history and refreshes the page. The only thing left on the user's computer is a copy of tracking.js in the browser cache. When the browser requests tracking.js it receives a 304 Not Modified response which causes it to execute the 1st version of tracking.js it received. tracking.js executes and restores the SessionID that was deleted.
Validation
Suppose Haxor X steals our customers cookies while they are still logged in. How do we protect them? Cryptography and Browser fingerprinting to the rescue. Remember our original definition for SessionID was:
BrowserID|ComputerID|randomBytes(256)
We can change this to:
Timestamp|BrowserID|ComputerID|encrypt(randomBytes(256), hk)|sign(Timestamp|BrowserID|ComputerID|randomBytes(256), hk)
Where hk = sign(Timestamp|BrowserID|ComputerID, serverKey).
Now we can validate our SessionID using the following algorithm:
if( getTimestamp($sid) is older than 1 year ) return false;
if( getBrowserID($sid) !== createBrowserID($_Request, $_Server) ) return false;
if( getComputerID($sid) !== createComputerID($_Request, $_Server) ) return false;
$hk = sign(getTimestamp($sid) + getBrowserID($sid) + getComputerID($sid), $SERVER["key"]);
if( !verify(getTimestamp($sid) + getBrowserID($sid) + getComputerID($sid) + decrypt(getRandomBytes($sid), $hk), getSignature($sid), $hk) ) return false;
return true;
Now in order for Haxor's attack to work they must:
Have same ComputerID. That means they have to have the same ISP as the victim (Tricky). This will give our victim the opportunity to take legal action in their own country. Haxor must also obtain the HTTPS session key from the victim (Hard).
Have same BrowserID. Anyone can spoof the User-Agent string (Annoying).
Be able to create their own fake SessionID (Very Hard). Volume attacks won't work because we use a time-stamp to generate the encryption / signing key, so basically it's like generating a new key for each session. On top of that we encrypt random bytes so a simple dictionary attack is also out of the question.
We can improve validation by forwarding GoogleID and FingerprintID (via ajax or hidden fields) and matching against those.
if( GoogleID != getStoredGoogleID($sid) ) return false;
if( byte_difference(FingerPrintID, getStoredFingerprint($sid)) > 10% ) return false;
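For anyone who wants to experiment with the signed SessionID idea, here is a simplified sketch in Node.js. It is not the exact scheme above: it drops the encryption of the random bytes and the per-session derived key hk, signs the whole payload with a single server key using HMAC-SHA256, and assumes BrowserID and ComputerID are already hashed so they contain no "|" characters. The function names are hypothetical.

```javascript
const crypto = require("crypto");

const YEAR_MS = 365 * 24 * 60 * 60 * 1000;

function sign(data, key) {
  return crypto.createHmac("sha256", key).update(data).digest("hex");
}

// Build Timestamp|BrowserID|ComputerID|nonce|signature.
function createSessionId(browserId, computerId, serverKey, nowMs = Date.now()) {
  const nonce = crypto.randomBytes(32).toString("hex");
  const payload = [nowMs, browserId, computerId, nonce].join("|");
  return `${payload}|${sign(payload, serverKey)}`;
}

// Re-derive the IDs from the current request and pass them in for comparison.
function validateSessionId(sid, browserId, computerId, serverKey, nowMs = Date.now()) {
  const parts = String(sid).split("|");
  if (parts.length !== 5) return false;
  const [ts, bid, cid, nonce, sig] = parts;
  if (!(nowMs - Number(ts) <= YEAR_MS)) return false; // too old, or not a number
  if (bid !== browserId || cid !== computerId) return false;
  const expected = Buffer.from(sign([ts, bid, cid, nonce].join("|"), serverKey));
  const given = Buffer.from(sig);
  return expected.length === given.length && crypto.timingSafeEqual(expected, given);
}
```

A stolen SessionID replayed from a different browser or network fails the ID comparison, and an ID edited to match the attacker's browser fails the signature check.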

These people have developed a fingerprinting method for recognising a user with a high level of accuracy:
https://panopticlick.eff.org/static/browser-uniqueness.pdf
We investigate the degree to which modern web browsers are subject to “device fingerprinting” via the version and configuration information that they will transmit to websites upon request. We implemented one possible fingerprinting algorithm, and collected these fingerprints from a large sample of browsers that visited our test side, panopticlick.eff.org. We observe that the distribution of our fingerprint contains at least 18.1 bits of entropy, meaning that if we pick a browser at random, at best we expect that only one in 286,777 other browsers will share its fingerprint. Among browsers that support Flash or Java, the situation is worse, with the average browser carrying at least 18.8 bits of identifying information. 94.2% of browsers with Flash or Java were unique in our sample.
By observing returning visitors, we estimate how rapidly browser fingerprints might change over time. In our sample, fingerprints changed quite rapidly, but even a simple heuristic was usually able to guess when a fingerprint was an “upgraded” version of a previously observed browser’s fingerprint, with 99.1% of guesses correct and a false positive rate of only 0.86%.
We discuss what privacy threat browser fingerprinting poses in practice, and what countermeasures may be appropriate to prevent it. There is a tradeoff between protection against fingerprintability and certain kinds of debuggability, which in current browsers is weighted heavily against privacy. Paradoxically, anti-fingerprinting privacy technologies can be self-defeating if they are not used by a sufficient number of people; we show that some privacy measures currently fall victim to this paradox, but others do not.

It's not possible to identify the computers accessing a web site without the cooperation of their owners. If they let you, however, you can store a cookie to identify the machine when it visits your site again. The key is, the visitor is in control; they can remove the cookie and appear as a new visitor any time they wish.

A possibility is using flash cookies:
Ubiquitous availability (95 percent of visitors will probably have flash)
You can store more data per cookie (up to 100 KB)
Shared across browsers, so more likely to uniquely identify a machine
Clearing the browser cookies does not remove the flash cookies.
You'll need to build a small (hidden) flash movie to read and write them.
Whatever route you pick, make sure your users opt IN to being tracked, otherwise you're invading their privacy and become one of the bad guys.

There is a popular method called canvas fingerprinting, described in this scientific article: The Web Never Forgets: Persistent Tracking Mechanisms in the Wild. Once you start looking for it, you'll be surprised how frequently it is used. The method creates a unique fingerprint, which is consistent for each browser/hardware combination.
The article also reviews other persistent tracking methods, like evercookies, respawning http and Flash cookies, and cookie syncing.
More info about canvas fingerprinting here:
Pixel Perfect: Fingerprinting Canvas in HTML5
https://en.wikipedia.org/wiki/Canvas_fingerprinting

You may want to try setting a unique ID in an evercookie (it will work cross browser, see their FAQs):
http://samy.pl/evercookie/
There is also a company called ThreatMetrix that is used by a lot of big companies to solve this problem:
http://threatmetrix.com/our-solutions/solutions-by-product/trustdefender-id/
They are quite expensive and some of their other products aren't very good, but their device id works well.
Finally, there is this open source jquery implementation of the panopticlick idea:
https://github.com/carlo/jquery-browser-fingerprint
It looks pretty half baked right now but could be expanded upon.
Hope it helps!

There is only a small amount of information that you can get via an HTTP connection.
IP - But as others have said, this is not fixed for many, if not most Internet users due to their ISP's dynamic allocation policies.
Useragent String - Nearly all browsers send what kind of browser they are with every request. However, this can be set by the user in many browsers today.
Collection of request fields - There are other fields sent with each request, such as supported encodings, etc. These, if used in the aggregate can help to ID a user's machine, but again are browser dependent and can be changed.
Cookies - Setting a cookie is another way to identify a machine, or more specifically a browser on a machine, but as others have said, these can be deleted, or turned off by the users, and are only applicable on a browser, not a machine.
So, the correct response is that you cannot achieve what you would like via the HTTP over IP protocols alone. However, using a combination of cookies, as well as IP, and the fields in the HTTP request, you have a good chance at guessing, sort of, what machine it is. Users tend to use only one browser, and often from one machine, so this may be fairly reliable, but this will vary depending on the audience... techies are more likely to mess with this stuff, and use more machines/browsers. Additionally, this could even be coupled with some attempt to geo-locate the IP, and use that data as well. But in any case, there is no solution that will be correct all of the time.

There are flaws with both cookie and non-cookie approaches. But if you can forgive the shortcomings of the cookie approach, here's an idea.
If you're already using Google Analytics on your site, then you don't need to write code to track unique users yourself. Google Analytics does that for you via the __utma cookie value, as described in Google's documentation. And by reusing this value you're not creating additional cookie payload, which has efficiency benefits with page requests.
And you could write some code easily enough to access that value, or use this script's getUniqueId() function.

As with the previous solutions, cookies are a good method, but be aware that they identify browsers. If I visited a website in Firefox and then in Internet Explorer, cookies would be stored for both attempts separately. Some users also disable cookies (but more people disable JavaScript).
Another method to consider would be I.P. and hostname identification (be aware these can vary for dial-up/non-static IP users, AOL also uses blanket IPs). However since this only identifies networks this might not work as well as cookies.

The suggestions to use cookies aside, the only comprehensive set of identifying attributes available to interrogate are contained in the HTTP request header. So it is possible to use some subset of these to create a pseudo-unique identifier for a user agent (i.e., browser). Further, most of this information is possibly already being logged in the so-called "access log" of your web server software by default and, if not, can be easily configured to do so. Then, a utility could be developed that simply scans the content of this log, creating fingerprints of each request comprised of, say, the IP address and User Agent string, etc. The more data available, even including the contents of specific cookies, the better the uniqueness of this fingerprint. Though, as many others have stated already, the HTTP protocol doesn't make this 100% foolproof - at best it can only be a fairly good indicator.
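As a rough illustration of the idea, the Node.js sketch below hashes the IP address together with a handful of request headers. The header list and the function name requestFingerprint are my own choices for the example, and the result is only pseudo-unique: two machines behind the same NAT with identical browsers would collide.

```javascript
const crypto = require("crypto");

// Header names chosen for illustration; any stable subset could be used.
const FINGERPRINT_HEADERS = ["user-agent", "accept", "accept-language", "accept-encoding"];

// Combine the IP address with selected request headers into one hash.
function requestFingerprint(ip, headers) {
  const lower = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value; // header names are case-insensitive
  }
  const parts = [ip, ...FINGERPRINT_HEADERS.map((h) => lower[h] || "")];
  return crypto.createHash("sha256").update(parts.join("\n")).digest("hex");
}
```

The same function could be run over parsed access-log lines instead of live requests.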

When I use a machine which has never visited my online banking web site I get asked for additional authentication. Then, if I go back a second time to the online banking site I don't get asked for the additional authentication... I deleted all cookies in IE and relogged onto my online banking site fully expecting to be asked the authentication questions again. To my surprise I was not asked. Doesn't this lead one to believe the bank is doing some kind of PC tagging which doesn't involve cookies?
This is a pretty common type of authentication used by banks.
Say you're accessing your bank website via example-isp.com. The first time you're there, you'll be asked for your password, as well as additional authentication. Once you've passed, the bank knows that user "thatisvaliant" is authenticated to access the site via example-isp.com.
In the future, it won't ask for extra authentication (beyond your password) when you're accessing the site via example-isp.com. If you try to access the bank via another-isp.com, the bank will go through the same routine again.
So to summarize, what the bank's identifying is your ISP and/or netblock, based on your IP address. Obviously not every user at your ISP is you, which is why the bank still asks you for your password.
Have you ever had a credit card company call to verify that things are OK when you use a credit card in a different country? Same concept.

Really, what you want to do cannot be done because the protocols do not allow for this. If static IPs were universally used then you might be able to do it. They are not, so you cannot.
If you really want to identify people, have them log in.
Since they will probably be moving around to different pages on your web site, you need a way to keep track of them as they move about.
So long as they are logged in, and you are tracking their session within your site via cookies/link-parameters/beacons/whatever, you can be pretty sure that they are using the same computer during that time.
Ultimately, it is incorrect to say this tells you which computer they are using if your users are not using your own local network and do not have static IP addresses.
If what you want to do is being done with the cooperation of the users and there is only one user per cookie and they use a single web browser, just use a cookie.

You can use fingerprintjs2
new Fingerprint2().get(function(result, components) {
    console.log(result) // a hash, representing your device fingerprint
    console.log(components) // an array of FP components
    // submit hash and JSON object to the server
})
After that you can check all your users against existing and check JSON similarity, so even if their fingerprint mutates, you still can track them
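One possible way to do the similarity check on the server is sketched below. It assumes each fingerprint's components arrive as an array of { key, value } objects with unique keys, and the function name componentSimilarity and any matching threshold are made up for illustration.

```javascript
// Fraction of component keys whose values are identical in both fingerprints.
function componentSimilarity(a, b) {
  const valuesB = new Map(b.map((c) => [c.key, JSON.stringify(c.value)]));
  const allKeys = new Set([...a.map((c) => c.key), ...valuesB.keys()]);
  if (allKeys.size === 0) return 1;
  let same = 0;
  for (const c of a) {
    if (valuesB.has(c.key) && valuesB.get(c.key) === JSON.stringify(c.value)) same++;
  }
  return same / allKeys.size;
}
```

You could then treat a stored fingerprint as "the same device" when the similarity is above some cut-off that you tune against your own traffic.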

Because I want the solution to work on all machines and all browsers (within reason) I am trying to create a solution using JavaScript.
Isn't that a really good reason not to use javascript?
As others have said - cookies are probably your best option - just be aware of the limitations.

I guess the verdict is I cannot programmatically uniquely identify a computer which is visiting my web site.
I have the following question. When I use a machine which has never visited my online banking web site I get asked for additional authentication. Then, if I go back a second time to the online banking site I don't get asked for the additional authentication. Reading the answers to my question I decided it must be a cookie involved. Therefore, I deleted all cookies in IE and relogged onto my online banking site fully expecting to be asked the authentication questions again. To my surprise I was not asked. Doesn't this lead one to believe the bank is doing some kind of PC tagging which doesn't involve cookies?
Further, after much googling today I found the following company who claims to sell a solution which does uniquely identify machines which visit a web site: http://www.the41.com/products.asp.
I appreciate all the good information. If you could clarify further this conflicting information I found I would greatly appreciate it.

I would do this using a combination of cookies and flash cookies. Create a GUID and store it in a cookie. If the cookie doesn't exist, try to read it from the flash cookie. If it's still not found, create it and write it to the flash cookie. This way you can share the same GUID across browsers.

I think cookies might be what you are looking for; this is how most websites uniquely identify visitors.

Cookies won't be useful for determining unique visitors. A user could clear cookies and refresh the site - he then is classed as a new user again.
I think that the best way to go about doing this is to implement a server side solution (as you will need somewhere to store your data). Depending on the complexity of your needs for such data, you will need to determine what is classed as a unique visit. A sensible method would be to allow an IP address to return the following day and be given a unique visit. Several visits from one IP address in one day shouldn't be counted as uniques.
Using PHP, for example, it is trivial to get the IP address of a visitor, and store it in a text file (or a sql database).
A server side solution will work on all machines, because you are going to track the user when he first loads up your site. Don't use javascript, as that is meant for client side scripting, plus the user may have disabled it in any case.
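As a rough sketch of the "one unique visit per IP address per day" idea in PHP: the function name recordUniqueVisit and the one-line-per-visit text file format are assumptions made up for this example, not something the answer prescribes.

```php
<?php
// Hypothetical helper: counts at most one "unique visit" per IP address per day.
// $logFile is an assumed path to a writable text file with one "date ip" entry per line.
function recordUniqueVisit(string $logFile, string $ip, ?int $now = null): bool
{
    // gmdate() keeps the day boundary independent of the server's time zone setting.
    $entry = gmdate('Y-m-d', $now ?? time()) . ' ' . $ip;

    // Load the existing entries (an empty list if the file does not exist yet).
    $seen = is_file($logFile)
        ? file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
        : [];

    if (in_array($entry, $seen, true)) {
        return false; // this IP was already counted today
    }

    // Append the new entry; LOCK_EX avoids interleaved writes from parallel requests.
    file_put_contents($logFile, $entry . PHP_EOL, FILE_APPEND | LOCK_EX);
    return true;
}
```

On a real page the address would come from $_SERVER['REMOTE_ADDR'], e.g. recordUniqueVisit('/path/to/visits.log', $_SERVER['REMOTE_ADDR']). Keep in mind that several people behind one proxy or NAT share an IP address, so this counts addresses rather than people.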
Hope that helps.

I will give my ideas, starting from the simplest and moving to the more complex.
In all of the approaches below you can create sessions, and the problem essentially becomes matching a session with a request.
a) (difficulty: easy) Use the client's machine to explicitly store a session id or hash of some sort (there are quite a few privacy/security issues, so make sure you hash anything you store). Options include:
cookie storage
browser storage/WebDB (and more exotic browser solutions)
extensions with permission to store things in files.
These suffer from the fact that the user can simply clear his cache if he doesn't want to be tracked.
b) (difficulty: medium) Login-based authentication.
Most modern web frameworks provide such a solution. The core idea is that you let the user voluntarily identify himself. This is quite straightforward but adds complexity to the architecture.
This approach suffers from that additional complexity and from essentially making the content non-public.
c) (difficulty: hard, R&D) Identification based on metadata (IP address, language, browser and other privacy-invasive data, so make sure you let your users know or you may get sued).
This is an imperfect solution and can get more complicated (a user typing at a specific frequency or moving the mouse in specific patterns? You can even apply ML solutions).
The claimed solutions
The most powerful approach, since the user can be identified even without explicitly wanting to be. It is a straight invasion of privacy (see GDPR) and is not perfect, e.g. an IP address can change.

Assuming you don't want the user to be in control, you can't. The web doesn't work like that, the best you can hope for is some heuristics.
If it is an option to force your visitor to install some software and use TCPA you may be able to pull something off.

My post might not be a solution, but I can provide an example where this feature has been implemented.
If you visit the signup page of www.supertorrents.org for the first time from your computer, it's fine. But if you refresh the page or open it again, it detects that you've previously visited the page. The real beauty is that it identifies you even if you reinstall Windows or another OS.
I read somewhere that they store the CPU ID. I couldn't find out how they do it, and I seriously doubt that explanation; they might use the MAC address instead.
I'll definitely share if I find out how to do it.

A trick:
Create 2 registration pages:
First Registration Page: without any email or security check (just a username and password)
Second Registration Page: with a high security level (email verification request, security image, etc.)
For customer satisfaction and easy registration, the default registration page should be the First Registration Page, but it carries a hidden restriction: an IP restriction. If an IP tries to register a second time (for example, within less than 1 hour), then instead of showing a block page you can show the Second Registration Page automatically.
In the First Registration Page you can set a limit (for example: block the 2nd attempt from 1 IP for just 1 hour or 24 hours), and once that period (for example, 1 hour) has passed you can reopen access from that IP automatically.
Please note: the First Registration Page and the Second Registration Page should not be separate pages. Make just 1 page (for example: register.php) and make it smart enough to switch between the first style and the second style.
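A minimal sketch of how register.php might pick between the two styles. The function name chooseRegistrationForm, the 'simple'/'strict' labels and the array of last registration times are all assumptions for illustration; in practice the times would be stored in a database table.

```php
<?php
// Hypothetical helper for register.php: decides which form variant to show.
// $attempts maps an IP address to the Unix time of its last registration.
function chooseRegistrationForm(array $attempts, string $ip, int $now, int $window = 3600): string
{
    $last = $attempts[$ip] ?? null;

    // A recent registration from the same IP triggers the stricter form
    // (email verification, security image) instead of a block page.
    if ($last !== null && ($now - $last) < $window) {
        return 'strict';
    }

    // First visit, or the window has passed: show the easy form again.
    return 'simple';
}
```

The page would then render one form or the other depending on the returned label, so the visitor never sees a separate URL for the stricter variant.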

Primitive Diffie-Hellman cryptography for app-server exchange

Working on an app that lets a user call someone by clicking on them. After the call, a new activity is started, FeedbackActivity, where the user enters feedback regarding the person they called which is uploaded and the server crunches the numbers over time and produces a "rating."
However, the app does not have a traditional "log in and password" behavior... (and it is important that it does not have this) so there is nothing preventing a user from maliciously entering negative feedback over and over again... or worse, loading
http://www.example.com/feedback.php?personICalled=334875634&feedback=blahblahblah
into a browser and just reloading it over and over again.
So, we need to make sure that people can only give feedback on someone they have actually called.
I had the idea of having some sort of "token" be sent to the server when the user clicks "call." Then the server saves this token.
Then, when they subsequently upload feedback, it would look like:
http://www.example.com/feedback.php?personICalled=334875634&feedback=blahblahblah&token=[same token sent prior]
This way, the server checks that such a token was ever saved, and if so, then it saves the feedback, otherwise not.
Or, better yet, there could be a secret formula known only to the server (and the app), whereby [token checked upon feedback given] is a (complex mathematical) function of [token uploaded at phone call time].
But, obviously this wouldn't be that hard for someone to figure out by looking at app source code, or observing the y=f(x) relationship over time and figuring out the formula... and there has to be a better way to do this.
I read about the Diffie-Hellman key exchange... and it seems to me there must be a way of implementing this for this purpose... but I'm not a Harvard graduate and it's been a while since discrete math... and I'm not particularly knowledgeable about cryptography... and the wiki page makes my head hurt!
Take this diagram, for example
If anyone could tell me how the terms "Common paint," "Secret Colors," "Public Transport" and "Common Secret" translate to my scenario, I think I might just be able to figure this out.
I'm guessing that Public Transport = internet... I've got that far.

First thing, Diffie-Hellman is not going to solve your problem. There are a ton of things that can go wrong in crypto, so don't play with it unless you really know that you need it and know what you want it for.
What is your real requirement? A user should be able to enter feedback only one time per call. You do not need crypto to solve this.
When the user makes a call, generate a token. Send that token to the user and also store it in the database. When the call is finished, allow the user to "consume" the token by providing feedback associated with that token. The server verifies that the token exists in the database (and has not already been consumed). Assuming it is there, accept the feedback and then remove the token from the database (it has been consumed). If it is not there, do not accept the feedback.
You can improve things by also storing a time with the token (the time it was generated). Don't let them provide feedback if they try to consume it too soon. Expire the tokens if they are not consumed after some max life period. This prevents people from repeatedly calling a user or tokens living indefinitely in your database (DoS).
You might also restrict people by IP address. Allow a user to receive only one rating from an IP address in any reasonable time period (one day). The IP addresses can be stored along with the feedback in the database.
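The token scheme above could look roughly like this in PHP. This is only a sketch: the class name FeedbackTokens, the in-memory array (standing in for the database table) and the default limits of 10 seconds and one day are assumptions, not values from the answer.

```php
<?php
// Hypothetical in-memory token store; a real server would keep the tokens
// in a database table so they survive between requests.
final class FeedbackTokens
{
    private array $issued = []; // token => Unix time it was issued
    private int $minAge;        // assumed: reject feedback sent sooner than this
    private int $maxAge;        // assumed: tokens expire after this many seconds

    public function __construct(int $minAge = 10, int $maxAge = 86400)
    {
        $this->minAge = $minAge;
        $this->maxAge = $maxAge;
    }

    // Called when the user starts a call: create and remember a random token.
    public function issue(int $now): string
    {
        $token = bin2hex(random_bytes(16)); // 32 hex characters, not guessable
        $this->issued[$token] = $now;
        return $token;
    }

    // Called when feedback arrives: accept it at most once per token.
    public function consume(string $token, int $now): bool
    {
        if (!isset($this->issued[$token])) {
            return false; // unknown or already consumed
        }
        $age = $now - $this->issued[$token];
        if ($age < $this->minAge) {
            return false; // too soon; the token stays valid for a later attempt
        }
        unset($this->issued[$token]); // consumed or expired: remove it either way
        return $age <= $this->maxAge;
    }
}
```

The app would receive the result of issue() when the call starts and send it back with the feedback; feedback.php would save the feedback only when consume() returns true.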

Allow login to website only if request comes from another specific website

I have a PHP/MySQL website (website 1) that has a login system that only asks for a PIN code (just a long numeric string). A user has 2 ways of logging in with this code:
Going to the website 1 login page and entering the code in a typical login form
Clicking a link on website 2 that carries his PIN code as a GET value. The link has the format http://myURL.com/login.php?pin=123456789. That just calls a function that receives the PIN as a parameter and processes the login. Website 2 is located in a different domain/server than website 1.
Up to here everything works fine.
Now comes the question. I would like to know whether, when using the second method described above, it's possible to allow the login (assuming the PIN is correct) ONLY if that link has been clicked on a specific website.
The way it works now, anyone with the link could use it to log into website 1. I want to prevent that; I want to allow that to happen only if that link has been clicked in website 2.
The idea would be to "detect" the referring website in the login function, and only allow it if it matches the URL (or any other unique identifier) of website 2.
If using a "plain" link would not allow for this that wouldn't be a problem, I'm flexible as to what way I could use for this, but in the end it would need to be something that only meant a click for the users in website 2.
EDIT
I think it's good to add this since some of the comments/responses talk about the security of doing this (which is great of course). The main reason to do this is to "force" the users to visit website 2 before going to website 1. Basically, so they can't enter that URL in their browser and log into website 1, I want them to only be able to use that link if they're clicking it from website 2. I explain this because security is not a huge factor here; if a few savvy users can get around whatever method I implement it's not a big deal. It's more important that the method is as simple as possible to implement in website 2 (since I don't run that website and I will need to ask people there to do whatever is needed).

I think you're looking for a variation of Single Sign On. This is a technique in which an authentication in one site is recognised transparently in another. Here is how it works in your case.
Normally you would have a link in site2.com like this:
http://site1.com/login.php?pin=123456789
However, site1.com cannot tell from the referrer which site the request has really come from, since the referrer can be trivially faked. Of course, that may not matter for your use case, if you only want a simple level of security. But if you want something better, read on!
You can use a hashing system and a shared secret, to create something that can only have come from one source. Both sites have the same shared secret, stored in a file. We'll call that $sharedSecret. The algorithm goes like this:
$hash = hashFunction($pin . $sharedSecret);
Then you can do this in site2.com:
<a
href="http://site1.com/login.php?pin=<?php echo (int) $pin ?>&amp;hash=<?php echo htmlspecialchars($hash) ?>"
title="Authenticated link"
>Log in to site1.com</a>
When site1.com sees it, it can read the PIN straight away, repeat the algorithm, and check that the hash really did come from site2.com. If you have several referring sites, site1.com should store a separate secret for each of them; it can then use the referrer to decide which secret to check against.
The shared secret should be substantial enough that it cannot be guessed; I tend to go for around 40-60 characters.
However, the remaining flaw in this plan is that someone could visit site2.com and steal a link from them, and it would still work, providing they were also willing to fake the referrer every time they wanted access. It may therefore be useful to add a timestamp into the algorithm too:
// The time is rounded down to a multiple of 500 seconds, to account for
// out of sync clocks. Adjust this depending on how long you want links to
// remain active for
$time = floor(time() / 500) * 500;
$hash = hashFunction($pin . $sharedSecret . $time);
Then on site1.com you should compute two hashes:
One for floor(time() / 500) * 500
One for floor(time() / 500) * 500 - 500
If the supplied hash matches either, allow the link to unlock the content. This accounts for the possibility that the time went over a +/-500 boundary between one server and the next.
I haven't mentioned a specific hashing function here. SHA256 should be fine, but note I'm not a cryptographer. If you want more security again, it may be worth checking to ensure someone isn't brute-forcing the system by flooding your system with guesses - though over the internet it is hardly worth their trying.
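For what it's worth, here is a sketch of both sides with a concrete hash function filled in. Note the substitution: it uses PHP's hash_hmac('sha256', ...) keyed with the shared secret rather than hashing a plain concatenation, and compares with hash_equals. The names signPin, verifyPin and WINDOW are made up for this example.

```php
<?php
// Sketch only: hash_hmac('sha256', ...) stands in for the unspecified hashFunction().
const WINDOW = 500; // seconds per time bucket, as in the answer above

// Used by site2.com when building the link.
function signPin(string $pin, string $sharedSecret, int $now): string
{
    $bucket = intdiv($now, WINDOW) * WINDOW; // round down to the bucket start
    return hash_hmac('sha256', $pin . '|' . $bucket, $sharedSecret);
}

// Used by site1.com: accept the current bucket or the previous one.
function verifyPin(string $pin, string $hash, string $sharedSecret, int $now): bool
{
    foreach ([0, WINDOW] as $offset) {
        $expected = signPin($pin, $sharedSecret, $now - $offset);
        if (hash_equals($expected, $hash)) { // constant-time comparison
            return true;
        }
    }
    return false;
}
```

With this layout a link stays usable for somewhere between 500 and 1000 seconds, depending on where in a bucket it was generated, so WINDOW is the knob to adjust if that is too long or too short.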

The problem is multi-faceted. $_SERVER['HTTP_REFERER'] is available to PHP, but can be spoofed or omitted and is considered unreliable.
Cross-domain cookies are a bit of a challenge as well; I understand it's possible, but haven't yet found time to implement it (we have a use case at work). At any rate, cookies are also quite exploitable.
Possibly your best bet would be to have the link on "Site A" point to a resource also on "Site A" that stores a random key/token and a timestamp in a shared database and forwards the browser to "Site B" with that token. The receiving page on "Site B" would then read the key/token from the GET string, check that it exists in the database, possibly match the User-Agent and Referer data, and check that the request arrived within $smallnum seconds of the timestamp stored for that key/token.

I'm going to post this answer that I came up with thanks to the other answers/comments made by fellow SO users. I think it's a pretty simple method (simplicity is good in this case), and that it should work, but of course if it has any big flaw it'd be great to know :)
Like I say in the OP, security (in terms of some savvy user getting around this and using the link directly instead of from website 2) isn't a huge deal here, we can deal with a small number of exceptions.
Here's the idea:
Give the user in website 2 a link to a function in that same domain
That function grabs the PIN (e.g. 1234567890) that each user will have and the current timestamp (e.g. 1417812842), and creates a token from those 2 values using another function that both websites will know (some sort of hashing with salt, for example)
Generate a redirect from website 2 to website 1, with a URL that includes the PIN and the token, something like: http://myURL.com/process_login_request.php?pin=123456789&token=abc123def456
A function in process_login_request.php, using the same function that was used to generate the token in website 2, will generate a number of tokens for the past X seconds (let's say 10 seconds), using the PIN and the timestamps for the past 10 seconds.
If any of the generated tokens matches the received token, the request was OK and we allow the login
I think it's easier to implement than to explain though. The idea basically is that we use timestamps in a short period of time (the time between the user "clicks" the link that should take him to website 1 and the time he actually lands on website 1). I said 10 seconds, but we could increase that as necessary if 10 seconds is too short (which I think it shouldn't be).
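In PHP the idea might look something like the sketch below. The function names makeToken and isRecentToken are hypothetical, and hash_hmac with a shared salt is just one possible choice for the "function that both websites will know".

```php
<?php
// Shared by both websites: derive a token from the PIN and a Unix timestamp.
function makeToken(string $pin, int $timestamp, string $salt): string
{
    return hash_hmac('sha256', $pin . ':' . $timestamp, $salt);
}

// Website 1: try every timestamp from $now back to $now - $maxAge seconds
// and accept the request if any of them reproduces the received token.
function isRecentToken(string $pin, string $token, string $salt, int $now, int $maxAge = 10): bool
{
    for ($t = $now; $t >= $now - $maxAge; $t--) {
        if (hash_equals(makeToken($pin, $t, $salt), $token)) {
            return true;
        }
    }
    return false;
}
```

Website 2 would call makeToken($pin, time(), $salt) right before redirecting. If the two servers' clocks can drift apart, the loop would also need to look a few seconds into the future, which this sketch does not do.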

How would I go about preventing hotlinking of a PHP web service script that runs a query on a MySQL database and outputs it in JSON form?

Introduction and background
I have a MySQL database of Lottery Results which my Android application queries to get the results via a PHP web service script which connects to the database, does a query for the top 10 and returns JSON data for the Android client to parse and then display.
I have the server hosted and it uses cPanel (I cannot find the version number).
In terms of the Android app (Java program), the full URLs of the PHP scripts are stored in a String. I fear that if the program got decompiled, someone could get access to this string. I am using ProGuard to obfuscate the code, but this does not hide the actual values given to Strings or variables.
The actual Problem
How would I prevent others (if they got the actual URL of the PHP script) from leeching the results I provide just by running the script (this would cost me bandwidth)? I tested it, and I could actually get the JSON data output if I entered the full URL of the script.
What I have done so far
I'm new to this server hosting and administration. So far I have disabled indexing on the directory which contains the PHP scripts, just in case someone found them that way.
I was looking into setting permissions for the script file but ended up actually blocking legitimate use of it. At the moment they are 644. I cannot remember which ones I tried.
I found various hotlinking tutorials but these seem to be for images and multimedia but not specifically a script which outputs JSON data. Please help me.
What I am looking for
I don't have any code to show, but I am looking for advice from those who have been through the same problem and can point me in a direction from which I can research, investigate and build a solution.
Thank you for your time in reading

A quick but not super secure solution would be to generate a unique token for every request:
Given:
Secret Key: examplekey1234
Client:
Calculates Token: sha256(examplekey1234 + requestdata + date + ip ....)
Does request with token as additional request data
Server:
Calculates token the same way as the client.
Compares calculated with submitted token.
If both are equal, accept the request.
Since the secret key is known only to the client and the server, nobody else can calculate the token.
The data added to the calculation (request data, IP, date) ensures that the token can't be reused for other requests (different request data, other user, at a later time, etc.).
If you have some kind of session id, you could also add it to your token calculation. This makes the token a little bit more secure, since it's only usable for this session.
But: when somebody decompiles your application, he can obtain the secret key. This method mainly protects against someone sniffing the network traffic to get the URL.
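On the server, the PHP side of this could be sketched as follows. The function names, the exact field list and the '|' separator are assumptions; the only real requirement is that the Android client builds the string in exactly the same way.

```php
<?php
// Recompute the token from the secret key and the fields of the request.
function requestToken(string $secretKey, string $requestData, string $date, string $ip): string
{
    return hash('sha256', implode('|', [$secretKey, $requestData, $date, $ip]));
}

// Accept the request only if the submitted token matches the recomputed one.
function isValidRequest(string $submitted, string $secretKey, string $requestData, string $date, string $ip): bool
{
    // hash_equals() compares in constant time, so the check leaks no timing hints.
    return hash_equals(requestToken($secretKey, $requestData, $date, $ip), $submitted);
}
```

The web service script would call isValidRequest() before running the MySQL query and return an error instead of the JSON when it fails. Using the date at day granularity, for instance, would make a sniffed token useless from the next day on.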
