Strategy for unique user-voting such as Stackoverflow's? - php

I noticed that for voting SO implements an XHR method which POSTs to a posts controller and sends the post ID and vote type through the URL. In addition, an fkey parameter is sent, e.g.:
http://stackoverflow.com/posts/1/vote/2
I'm going to be implementing a similar technique, and I'm wondering what logic I could use to prevent duplicate voting by the same user and to prevent spamming, in addition to the overall logic for implementing this.
The schema for the table I'll be storing the votes in:
thread_id user_id vote_type
2334 1 2
So far I came up with these bullet points:
ensure the user is logged in
ensure that a valid post ID and valid vote type is sent
ensure that after POSTing, the user has not previously voted
the code that creates the hash can't contain dynamic information such as user agent, since a user could be on a different browser, different OS, right?
Update:
"SO is probably using the login cookie to identify the user." - Andrew
Could someone demonstrate how this would be done, or more specifically, provide an example of how the fkey, which is an alphanumeric 32-character string, is generated?
Question:
Since I'm not sending the actual user ID anywhere with my XHR code, does this mean I have to update my table schema so that I can store the fkey instead of, say, the user_id? The fkey will probably have to be unique to each user, so I can probably query whether there is a row in the voting table with a given fkey.
I would appreciate any tips or insight from anyone who's implemented a similar technique.

Create a UNIQUE index on the fields (thread_id, user_id) and the DB engine will protect you from multiple votes by the same user on one thread :)
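A minimal sketch of that idea, assuming PDO with an in-memory SQLite database standing in for the real one. The table name `votes`, the index name and the helper `castVote()` are made up for illustration; only the column names come from the question.

```php
<?php
// Sketch only: in-memory SQLite stands in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE votes (thread_id INTEGER NOT NULL, user_id INTEGER NOT NULL, vote_type INTEGER NOT NULL)');
$pdo->exec('CREATE UNIQUE INDEX votes_thread_user ON votes (thread_id, user_id)');

// Hypothetical helper: returns true if the vote was stored, false if it was a duplicate.
function castVote(PDO $pdo, int $threadId, int $userId, int $voteType): bool
{
    $stmt = $pdo->prepare('INSERT INTO votes (thread_id, user_id, vote_type) VALUES (?, ?, ?)');
    try {
        $stmt->execute([$threadId, $userId, $voteType]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint was violated.
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false;
        }
        throw $e;
    }
}

$first  = castVote($pdo, 2334, 1, 2); // first vote by user 1 on thread 2334
$second = castVote($pdo, 2334, 1, 2); // same user, same thread: rejected by the index
$other  = castVote($pdo, 2334, 2, 2); // a different user may still vote
```

Letting the database reject the duplicate avoids the race between a "has this user voted?" SELECT and the following INSERT.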

You can sign the URIs in order to prevent users from manipulating values. For instance, you could hash parts of the URI with a secret and append the hash to the URI. When users copy the URI and change values, the signature no longer matches and the request becomes invalid.
This is often done in RESTful APIs, and your current approach is similar.
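A rough sketch of signing a vote URI, assuming HMAC-SHA256 via PHP's `hash_hmac()` as the keyed hash. The constant `VOTE_SECRET`, the `sig` query parameter and both function names are hypothetical; the path shape is taken from the question.

```php
<?php
// Assumption: the secret lives in server-side config, never in client code.
const VOTE_SECRET = 'replace-with-a-long-random-secret';

function signVotePath(int $userId, int $postId, int $voteType): string
{
    $path = "/posts/$postId/vote/$voteType";
    // Binding the user ID into the MAC means one user's link cannot be reused by another.
    $sig = hash_hmac('sha256', $userId . '|' . $path, VOTE_SECRET);
    return $path . '?sig=' . $sig;
}

function verifyVotePath(int $userId, string $path, string $sig): bool
{
    $expected = hash_hmac('sha256', $userId . '|' . $path, VOTE_SECRET);
    return hash_equals($expected, $sig); // constant-time comparison
}

$signed = signVotePath(1, 2334, 2);
[$path, $query] = explode('?', $signed, 2);
parse_str($query, $params);

$valid     = verifyVotePath(1, $path, $params['sig']);
$tampered  = verifyVotePath(1, '/posts/2334/vote/3', $params['sig']); // vote type changed
$otherUser = verifyVotePath(2, $path, $params['sig']);                // link reused by user 2
```

A signature only proves the URI was not altered. It does not stop the same valid request from being sent twice, so a server-side duplicate check is still needed.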

I think it depends on how badly you want to keep people from re-submitting or fiddling with your data. Nothing will be 100% (unless your budget is through the roof), but you can do a good job of keeping most people from resubmitting by:
check their UID - or generated ID from the UID (I will explain)
record their IP address, and check against the DB for the IP and the submission ID (along with the generated UID)
The IP solution alone can be defeated by using a proxy, of course, or by a connection that changes IPs often, such as the DSL carrier in my city (but even then, it's every couple of days). I personally generate a unique key based on that person's UID, and pass that back and forth if necessary. A salted MD5 hash usually works fine, or even an AES implementation if MD5 is viewed as too weak. Combined together, you should have a good starting place.

Related

Random ID Number when user created

I want to use a user's unique ID to save a cookie, so that I know which user is logged in and can then change their content to suit.
I am currently just using the usual auto ID when a new record is created, but I have heard that for creating user accounts (specifically when you're going to use that ID to change content) you shouldn't have them one after another; e.g. not 378, 379, 380 and so on but more like 138462193, 109346286, 982638192, so it's kind of like a random unique identifier.
How would I achieve this?
Is this a best practice?
You protect your data against attacks by using ACLs to limit which user has access to what (and with what data), foreign key relations to establish ownership between user and data, session ID regeneration at login, CSRF tokens to prevent attacks via other sites, and so forth.
Not to mention logging, to be able to find out what went wrong when things do go wrong.
Only in very special cases do you ever need to worry about the ID of users being sequential. Most of the time this ID will be available to other users, via the web site itself, anyway. As a part of normal operations.
Thus adding a random element to the user ID won't bring anything but a false sense of security. Even if you keep the internal ID different from the "external" user-facing ID, as long as you're using the external ID to identify and change content it's basically the same as the internal ID. The only valid reason for using a dual ID system, in most cases, is human readability. If you're uncertain about whether your use case is one of the exceptions, it's not.
PS: I see in your comment that you say that the passwords are encrypted. Hopefully you mean "salted and hashed", more specifically by using password_hash() and its associated functions.
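For reference, a short sketch of what "salted and hashed" looks like with PHP's built-in password API. The example password is made up.

```php
<?php
// password_hash() generates a random salt and embeds it, together with the
// algorithm and cost, in the returned string.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// At login, compare the submitted password against the stored hash.
$ok  = password_verify('correct horse battery staple', $hash);
$bad = password_verify('wrong guess', $hash);

// After a successful login, check whether the stored hash should be upgraded
// because the default algorithm or cost has changed since it was created.
$needsRehash = password_needs_rehash($hash, PASSWORD_DEFAULT);
```

Only `$hash` is stored in the user table; the plain password is never saved.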

Minimum information to authenticate a user

I have a table whose PK is INT type (roughly 4 billion possible unsigned values; however, I will have at most 2 million). The table also has a CHAR(32) column which contains a random value (created using bin2hex(mcrypt_create_iv(16, MCRYPT_DEV_URANDOM))), as well as a FK column indicating the userID.
I will send out some emails which will contain links in them, and the links will contain the above table's PK and/or random value, as well as potentially the userID. The links will also contain an answer (yes, no, etc). For instance, a link might look like:
Yes
I have a PHP page which will accept a GET request (initiated from the above mentioned emails of course) and update the appropriate record in the database with the provided answer if authenticated.
Is solely confirming the 16 byte random value exists in the database enough to authenticate that it came from the user who received the email? If not, why not and what would you recommend?
I would also use the user id as it is faster.
Your url will become: ckr.php?userid=1&rsp=eae8a14011e82cbf385f69b431a17e49&ans=yes
Then, when you check the database, you first look up only the userid. If the user exists, then check the rsp code. If everything is OK, do your job.
Keep in mind that everything sent over e-mail is unencrypted and thus not suitable for authenticating a user.
Some things you could do to make it a bit more secure:
Make the key only available for a short timespan. Maybe a 24h window?
Ask for the users password when receiving the get request.
Invalidate the key when an answer is received.
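The time window and the invalidate-on-use points could be sketched roughly as follows. This assumes `random_bytes()` as the random source (`mcrypt_create_iv()` from the question was removed in PHP 7.2) and uses a hypothetical in-memory class `ResponseTokens` where a real application would use a database table.

```php
<?php
// Hypothetical in-memory store; a real application would use a database table.
final class ResponseTokens
{
    /** @var array<string, array{userId: int, expires: int}> */
    private array $tokens = [];

    public function issue(int $userId, int $now, int $ttl = 86400): string
    {
        $token = bin2hex(random_bytes(16)); // 32 hex characters from a CSPRNG
        // Store only a hash so a leaked table does not reveal usable tokens.
        $this->tokens[hash('sha256', $token)] = ['userId' => $userId, 'expires' => $now + $ttl];
        return $token;
    }

    /** Returns the user ID for a valid token and invalidates it, or null. */
    public function redeem(string $token, int $now): ?int
    {
        $key = hash('sha256', $token);
        if (!isset($this->tokens[$key])) {
            return null;
        }
        $entry = $this->tokens[$key];
        unset($this->tokens[$key]); // single use: removed on the first attempt
        return $entry['expires'] >= $now ? $entry['userId'] : null;
    }
}

$store = new ResponseTokens();
$t0 = 1700000000; // fixed timestamp so the example is deterministic

$token     = $store->issue(42, $t0);
$firstUse  = $store->redeem($token, $t0 + 60);   // within 24h: accepted
$secondUse = $store->redeem($token, $t0 + 120);  // already used: rejected
$late      = $store->redeem($store->issue(42, $t0), $t0 + 90000); // past 24h: rejected
```

The 24-hour default mirrors the window suggested above; the right value depends on how long recipients realistically take to answer.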

How to implement a secret url for delivering information after payment without login?

I would like to deliver some information to customers after a paypal payment, using the paypal return url, and without having the customer log in.
So I think I need a system to create URLs for each transaction, and to prevent the URL for another transaction from being guessed.
I have thought of something like:
http://www.domain.com/product/send.php?productID=12&transactionHash=[thisTransactionHash]
using a transactionHash that could be calculated based on the customer's email and the product unique id.
Does this method make sense? or what would be your recommendation delivering information without login, and avoiding customers guessing the url for other products?
Although there were several interesting answers about hashes, there is still one concern with the idea I mention above: PayPal needs to receive the return URL, so the information is passed before payment, and therefore the method does not protect against fraud.
The only secure way I see is the Paypal delivery system, which is why I accepted that answer.
If you target PayPal only, why don't you check Instant Payment Notification Guide?
https://www.x.com/sites/default/files/ipnguide.pdf
I didn't use Paypal before, but it seems this solves your problem.
Create table:
| product_id (unique ID of you product) | varchar transaction_hash |
In this sample code (PHP example):
https://www.x.com/developers/PayPal/documentation-tools/code-sample/216623
After validating that the payment is correct, insert the product ID and verify_sign (the value from the PayPal POST data) into the table, and give the user a URL with the product ID and verify_sign.
"using a transactionHash that could be calculated based on the customer's email and the product unique id."
As soon as the algorithm becomes known, your system will break down. My recommendation is a "secure", i.e. cryptographic, PRNG plus a lookup table.
You can create a random ID for a user at any time you want, maybe even using one of the truly random generators available on the web.
BUT what you should do is make it UNIQUE for a specific amount of time, perhaps with a simple database structure, or by storing the information in files on your server that are deleted by the same script as soon as they have been read once, depending on your needs.
So whenever a user generates such unique ID he can access that information for either a certain period of time, or exactly once.
Using say random.org's random byte function you can generate a string like:
6f0d47cf3432d4015e0e798641191bf0e8e0b90b00df23181bcb3401a0dad43d85be711343c3baa9
This is nearly impossible to guess, even if someone else knows a productID AND the email address of said customer.
Using a hash to access some stored information without the need of logging in isn't a bad idea. BUT that hash should not be generated based on already known data like IDs, email-address or similar data that could be known or guessed by any user.
Instead it is necessary to randomly generate a long enough value that can't be guessed or derived from any known data.
The already mentioned byte function from random.org could be a good choice for that.
Include a hash parameter whose value is calculated from several parameters. For example, for your URL I would calculate the hash like this:
$uniqueKeyString = "some random characters";
$timestamp = time();
$transactionHash = md5("domain.com" . $productId . $timestamp . $uniqueKeyString);
where $uniqueKeyString is a secret value (some random string) only you know.
Then, when a request comes to your servers, you can simply calculate the hash string yourself and compare it with the transactionHash of the request to check whether it is the same. Because time() returns a different value on every call, the timestamp used at generation has to be stored or sent along with the URL so that the same hash can be recomputed.

Basic about Verification

In a web application, of course, in many cases we will support CRUD (Create, Retrieve, Update, Delete).
A beginner programmer will do something like the following (without verification):
delete?room_id=12
update?room_id=13
The displayed room_id values are only for rooms that belong to the user/client.
At first, authentication only uses a username and password. Well, that's standard.
But I believe we should not trust the user. A malicious user may guess a room_id that does not belong to them, like delete?room_id=199
I asked my programmer friends, and they had never even thought about this issue.
So to prevent that, I have a basic solution: always pass the user_id for any related object, i.e. query before any action whether the room_id belongs to the user.
If this is the only solution, I must modify all of the queries I have already written.
The question is: is there a good or better solution for this basic problem?
Thanks
Your approach is a good one. And it really doesn't surprise me that your programmer friends haven't thought about it; unfortunately security seems to be the last thing on most programmers' minds.
In a good system, you will perform an authorization check on nearly every action performed to see if this particular user is authorized to perform that action. It's generally good practice to build this check in throughout your app, even for actions where you don't currently care about authorization: someday you might.
In your scenario, that action might be to retrieve the room, update the room or even delete the room.
To help things along I have a few recommendations:
Make the room_id non-guessable if possible. The easy way, presuming you are already using an int as your primary key, is to encrypt / decrypt it when passing it between the client browser and your application.
Make sure that on the browser side you aren't passing in the user's ID but rather pulling it from the session or through some other server-side mechanism. The point is you don't want to trust the user to pass the ID to you.
Any action that is not a GET, use an HTTP POST to perform. In other words you shouldn't be putting the ids in the query string at all but rather as post data.
Assuming you store the userid in the session when the user logs in (for example with $_SESSION['userid'] = $id; the old session_register("userid") was removed in PHP 5.4),
you can do this to check whether the logged-in user owns the room (as you said, you have a table which contains the roomid and userid together):
session_start();
// The mysql_* functions were removed in PHP 7; this uses PDO with a prepared statement instead.
$pdo = new PDO("mysql:host=$server;dbname=$databasename", $dbuser, $dbpassword);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$room = $_GET['room_id'] ?? null;
$user = $_SESSION['userid'] ?? null;
// Bound parameters are escaped by the driver, so no manual escaping is needed.
$select = $pdo->prepare("SELECT userid AS uid FROM table WHERE userid = ? AND room = ?");
$select->execute([$user, $room]);
$found = $select->fetchColumn(); // false when no matching row exists
if ($user !== null && $found !== false) {
// User owns this room, let them delete it
} else {
// FAIL: this user does not own this room
}
Change 'table' (the table in your database that contains both pieces of information), 'userid' (the user ID column name in that table) and 'room' (the room ID column name in that table).
EDIT:
Because the values are bound as parameters, this works whether your room_ids are numeric or contain other characters. If they are only numbers, you can additionally validate $room with intval() before running the query.
Don't ever trust anything your client passes to you. Every request has to be validated on server side and checked whether the logged user is allowed to do that action.
Every text input has to be sanitized either on insertion or on display. The latter is easier to forget, which opens the door to XSS. I recommend using implicit sanitisation in templates.
Also try to avoid using the GET method for critical actions. An attacker can always force a user to visit the URL (iframe, sending them a shortened link). You should also add a specially generated token to the deletion form to prevent CSRF attacks.
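A minimal sketch of such a token check. The array `$session` stands in for `$_SESSION` so the example runs without a web server, and the function names and the `csrf_token` field name are hypothetical.

```php
<?php
// $session stands in for $_SESSION so the sketch runs without a web server.
function csrfToken(array &$session): string
{
    if (empty($session['csrf_token'])) {
        $session['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $session['csrf_token'];
}

function csrfCheck(array $session, string $method, array $post): bool
{
    // State-changing actions must arrive via POST and carry the matching token.
    return $method === 'POST'
        && isset($session['csrf_token'], $post['csrf_token'])
        && is_string($post['csrf_token'])
        && hash_equals($session['csrf_token'], $post['csrf_token']);
}

$session = [];
$token = csrfToken($session);
// Hidden field to embed in the deletion form.
$formField = '<input type="hidden" name="csrf_token" value="'
    . htmlspecialchars($token, ENT_QUOTES) . '">';

$good   = csrfCheck($session, 'POST', ['room_id' => '12', 'csrf_token' => $token]);
$viaGet = csrfCheck($session, 'GET', ['csrf_token' => $token]);   // wrong method
$forged = csrfCheck($session, 'POST', ['room_id' => '12', 'csrf_token' => 'guess']);
```

The token check complements the ownership check; it does not replace it, since a logged-in attacker has a valid token of their own.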

Database/Server Design (Cookies and Indexes) - Ints or Text?

This is a design question relating to a website I'm building.
I have a 'Player' table that will store names, passwords, last IP, dates of birth, links to avatars, locations, etc.
When the player logs in, the database will be searched for the username they entered, and their password will be checked out as well (yes, of course the passwords are hashed). Then, they will be given some cookie that will keep them logged in.
Every time they visit a new page, the correct player will be looked up in the database using the information in the cookie. If their current IP matches the last IP they logged in with (probably 10 seconds ago), the page is outputted with their name on it and whatnot.
Here's my question: should I have the primary key for the Player table be the player's name (a text field that I know will be unique), or should I create some arbitrary auto-incremented index for that?
Keep in mind that this also has an effect on the information stored in the cookie - whether to store an int or the user's name in text. As well, I want to do some sort of hashing on that value (just for a little added security), so that the cookie doesn't just contain the int or the username.
So, in terms of both efficiency and design, which is the better choice?
EDIT Using VARCHAR for the database would also be ok, and probably faster, I imagine.
EDIT2 This primary key will also be referenced by other tables.
As Marc's comment indicates, int will be more efficient for both memory and performance.
I'd recommend against tying logins to IP addresses. Some users will have each request to the server come from a different IP address (onion routers, AOL, who knows what other kind of weird corporate NATs), and being logged out all the time will be super annoying.
You may also want to consider using sessions instead of setting a cookie saying who they are logged in as. Even though having a sig would make it a bit more secure, using sessions would be safer still, along with giving you more flexibility to track information about users before they log in (for example, what page they should be redirected to after a successful login).
Here's my question: should I have the primary key for the Player table be the player's name (a text field that I know will be unique), or should I create some arbitrary auto-incremented index for that?
The latter.
Names tend to change.
whether to store an int or the user's name in text.
As well, I want to do some sort of hashing on that value
Please make up your mind first, then ask:
is it going to be a hash or a plain value in the end? If the latter, what's the difference then?
So, in terms of both efficiency and design, which is the better choice?
Oh. This one absolutely doesn't matter.
