Database/Server Design (Cookies and Indexes) - Ints or Text?

This is a design question relating to a website I'm building.
I have a 'Player' table that will store names, passwords, last IP, dates of birth, links to avatars, locations, etc.
When the player logs in, the database will be searched for the username they entered, and their password will be checked as well (yes, of course the passwords are hashed). Then, they will be given a cookie that will keep them logged in.
Every time they visit a new page, the correct player will be looked up in the database using the information in the cookie. If their current IP matches the last IP they logged in with (probably 10 seconds ago), the page is output with their name on it and whatnot.
Here's my question: should I have the primary key for the Player table be the player's name (a text field that I know will be unique), or should I create some arbitrary auto-incremented index for that?
Keep in mind that this also has an effect on the information stored in the cookie - whether to store an int or the user's name in text. As well, I want to do some sort of hashing on that value (just for a little added security), so that the cookie doesn't just contain the int or the username.
So, in terms of both efficiency and design, which is the better choice?
EDIT Using VARCHAR for the database would also be ok, and probably faster, I imagine.
EDIT2 This primary key will also be referenced by other tables.

As Marc's comment indicates, int will be more efficient for both memory and performance.
I'd recommend against tying logins to IP addresses. Some users will have each request to the server come from a different IP address (onion routers, AOL, who knows what other kinds of weird corporate NATs), and being logged out all the time will be super annoying.
You may also want to consider using sessions instead of setting a cookie that says who they are logged in as. Even though signing the cookie value would make it a bit more secure, using sessions would be safer still, and would give you more flexibility to track information about users before they log in (for example, what page they should be redirected to after a successful login).
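To make the signing idea concrete, here is a minimal sketch of storing a user ID in a cookie together with an HMAC, so the server can detect tampering. The function names and the secret are hypothetical, not something from the question, and a server-side session (session_start() plus $_SESSION) remains the simpler option.

```php
<?php
// Hypothetical helpers: sign a user ID for storage in a cookie and verify it later.
// $secret is an assumed server-side key that never leaves the server.

function signUserId(int $userId, string $secret): string
{
    $mac = hash_hmac('sha256', (string) $userId, $secret);
    return $userId . '.' . $mac;
}

function verifySignedUserId(string $cookieValue, string $secret): ?int
{
    $parts = explode('.', $cookieValue, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return null;
    }
    [$id, $mac] = $parts;
    $expected = hash_hmac('sha256', $id, $secret);
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($expected, $mac) ? (int) $id : null;
}

$secret = 'change-me'; // assumption: loaded from configuration in practice
$cookie = signUserId(42, $secret);
var_dump(verifySignedUserId($cookie, $secret));                    // int(42)
var_dump(verifySignedUserId('43.' . substr($cookie, 3), $secret)); // NULL
```

The signature only proves the value was issued by the server. It does not hide the ID, and it does not expire unless a timestamp is signed along with it.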

"Here's my question: should I have the primary key for the Player table be the player's name (a text field that I know will be unique), or should I create some arbitrary auto-incremented index for that?"
The latter.
Names tend to change.
"whether to store an int or the user's name in text."
"As well, I want to do some sort of hashing on that value"
Please make up your mind first, then ask.
Is it going to be a hash or a plain value in the end? If the latter, what's the difference then?
"So, in terms of both efficiency and design, which is the better choice?"
Oh. This one absolutely doesn't matter.

Related

Random ID Number when user created

I want to use a user's unique ID to save a cookie, so that I know which user is logged in and can change their content to suit.
I am currently just using the usual auto-increment ID when a new record is created, but I have heard that for user accounts (specifically when you're going to use that ID to change content) the IDs shouldn't follow one after another; e.g. not 378, 379, 380 and so on, but more like 138462193, 109346286, 982638192, so it's kind of a random unique identifier.
How would I achieve this?
Is this a best practice?

You protect your data against attacks by using ACLs to limit which user has access to what (and with what data), foreign key relations to establish ownership between user and data, session ID regeneration at login, CSRF tokens to prevent attacks via other sites, and so forth.
Not to mention logging, to be able to find out what went wrong when things do go wrong.
Only in very special cases do you ever need to worry about the ID of users being sequential. Most of the time this ID will be available to other users anyway, via the web site itself, as a part of normal operations.
Thus adding a random element to the user ID won't bring anything but a false sense of security. Even if you keep the internal ID different from the "external" user-facing ID, as long as you're using the external ID to identify and change content it's basically the same as the internal ID. The only valid reason for using a dual ID system, in most cases, is human readability. If you're uncertain about whether your use case is one of the exceptions, it's not.
PS: I see in your comment that you say that the passwords are encrypted. Hopefully you mean "salted and hashed", more specifically by using password_hash() and its associated functions.
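For reference, a minimal sketch of the password_hash() family mentioned above. The password string is only a placeholder.

```php
<?php
// Hash once at registration; store only the resulting string (salt and cost are embedded in it).
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// At login, compare the submitted password against the stored hash.
var_dump(password_verify('correct horse battery staple', $hash)); // bool(true)
var_dump(password_verify('wrong guess', $hash));                  // bool(false)

// If a PHP upgrade changes the default algorithm or cost, rehash at the next successful login.
if (password_needs_rehash($hash, PASSWORD_DEFAULT)) {
    $hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);
}
```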

Is it ok to replace clients username id with their email?

I want to replace username id with their email. Is it ok to do that?
For example in MySQL table:
client_id
som665
som881
som876
som887
I want to replace them with emails for future clients e.g.:
client_id
som665
som881
som876
som887
xyz123@gmail.com
xyy333@gmail.com
xcv5557@yahoo.com
Question-
Will replacing a short client_id like som881 with an email address make my software (or queries) slow, for example when I run the query below with emails instead of short IDs like som881?
$sel_service = "select * from all_services where client_id='$client_id' order by sub_cat_name";
Right now my clients login with client id like som881 which is difficult for them to remember compared to an email address. I'm also unable to provide "forgot your password?" functionality.

The longer the string, the more time it takes to look up a user, because more characters have to be compared.
So if you want to use the email (and yes, it is easier for the user), you have to put a unique index on this field.
That way MySQL will index the emails and find the user record without scanning the entire table.
The drawback is that the index will take more space, but lookups will be very fast.
And if a user can have more than one record (more than one service), then you have to make it a plain index, not a unique one.

Overall your question is very broad without details of the application you are working on.
I'd say that in general, yes, it'd be a good idea to allow users to log in using their email address. If you are mainly concerned about the slowdown, this shouldn't be an issue, as it should only affect login, which doesn't happen all that often.
From a design perspective you should probably have two identifiers for the user:
User-visible token. This can be a username chosen by the user (user123) or an email address (user123@example.com). This token is only used when the user logs in to the application. You may even go as far as having two different columns to store the username and email address. In this case your login code needs to match the entered value against both database columns.
Internal identifier used in the application. An id INT PRIMARY KEY AUTO_INCREMENT column might do well, depending on your intended scale. This is what your application should use internally. You don't even need to show this to the user.
This would allow users to change their email address without having to recreate their account.
EDIT: based on your comment, you use client_id both as the key that references data between tables in the database and as the username the user logs in with. My suggestion is to normalize the data as listed above.
If you don't normalize the data, you may end up with oversized index columns in several tables. To store an email address you need a string (CHAR, VARCHAR) column of significant length. With normalized data you just need an INT/BIGINT in each table. In this scenario, allowing users to log in with an email address may slow things down, as it may require the client_id column to be resized.
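The two-identifier design could look roughly like the sketch below. It assumes the pdo_sqlite extension and uses an in-memory SQLite database so it runs standalone. The clients table and the sample rows are illustrative, and in MySQL the key column would be declared as id INT PRIMARY KEY AUTO_INCREMENT instead.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL (assumes pdo_sqlite is installed).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema: integer key used internally, unique email used only for login.
$pdo->exec('CREATE TABLE clients (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,
    email TEXT NOT NULL UNIQUE
)');
$pdo->exec('CREATE TABLE all_services (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    client_id    INTEGER NOT NULL REFERENCES clients(id),
    sub_cat_name TEXT NOT NULL
)');

$pdo->prepare('INSERT INTO clients (email) VALUES (?)')->execute(['user123@example.com']);
$clientId = (int) $pdo->lastInsertId();
$insert = $pdo->prepare('INSERT INTO all_services (client_id, sub_cat_name) VALUES (?, ?)');
$insert->execute([$clientId, 'hosting']);
$insert->execute([$clientId, 'backup']);

// Login: resolve the email to the integer ID once, using a bound parameter.
$find = $pdo->prepare('SELECT id FROM clients WHERE email = ?');
$find->execute(['user123@example.com']);
$id = (int) $find->fetchColumn();

// Every later query filters on the compact integer key.
$services = $pdo->prepare(
    'SELECT sub_cat_name FROM all_services WHERE client_id = ? ORDER BY sub_cat_name'
);
$services->execute([$id]);
$names = $services->fetchAll(PDO::FETCH_COLUMN);
print_r($names); // backup, then hosting
```

Bound parameters also remove the SQL injection risk that comes with interpolating $client_id directly into the query string.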

Well, this is a very good use of emails, because companies' mail servers don't allow two clients to have the same address, so you don't even need to set a primary key constraint on this ID field.
Nowadays databases pick the best algorithm for searching records on their own, so the only thing you should worry about is:
Is it worth providing your clients with convenience at the expense of complexity?
If yes, in my opinion there won't be any problem if the server responds a little slower.
I suggest using at least one service layer between the client and the database server, built from predefined classes, that accepts client connections and opens one-way connections to the database server to look up login information instead of issuing explicit SQL commands. This provides a higher level of security and blocks SQL injection.

Minimum information to authenticate a user

I have a table whose PK is of INT type (roughly 4 billion possible unsigned values; however, I will have at most 2 million). The table also has a CHAR(32) column which contains a random value (created using bin2hex(mcrypt_create_iv(16, MCRYPT_DEV_URANDOM))), as well as a FK column indicating the userID.
I will send out some emails which will contain links in them, and the links will contain the above table's PK and/or random value, as well as potentially the userID. The links will also contain an answer (yes, no, etc). For instance, a link might look like:
Yes
I have a PHP page which will accept a GET request (initiated from the above mentioned emails of course) and update the appropriate record in the database with the provided answer if authenticated.
Is solely confirming the 16 byte random value exists in the database enough to authenticate that it came from the user who received the email? If not, why not and what would you recommend?

I would also include the user ID, as the lookup is faster.
Your URL will become: ckr.php?userid=1&rsp=eae8a14011e82cbf385f69b431a17e49&ans=yes
Then, when you query the database, you first look up only the userid. If the user exists, you then check the rsp code. If everything is OK, do your job.

Keep in mind that everything sent over e-mail is unencrypted and thus not suitable for authenticating a user.
Some things you could do to make it a bit more secure:
Make the key valid only for a short timespan. Maybe a 24h window?
Ask for the user's password when receiving the GET request.
Invalidate the key when an answer is received.
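Two of the suggestions above, the expiry window and invalidation after use, can be sketched as follows. The array stands in for the database table, and the function names and field layout are hypothetical. random_bytes() is used here in place of the question's mcrypt_create_iv(), which is no longer bundled with recent PHP versions.

```php
<?php
// $store is an in-memory stand-in for the table; keys are record IDs (hypothetical layout).
function issueToken(array &$store, int $recordId, int $now, int $ttl = 86400): string
{
    $token = bin2hex(random_bytes(16)); // 32 hex characters from a CSPRNG
    $store[$recordId] = ['token' => $token, 'expires' => $now + $ttl, 'used' => false];
    return $token;
}

function redeemToken(array &$store, int $recordId, string $token, int $now): bool
{
    $row = $store[$recordId] ?? null;
    if ($row === null || $row['used'] || $now > $row['expires']) {
        return false;
    }
    if (!hash_equals($row['token'], $token)) {
        return false;
    }
    $store[$recordId]['used'] = true; // invalidate after the first answer
    return true;
}

$store = [];
$token = issueToken($store, 1, time());
var_dump(redeemToken($store, 1, $token, time())); // bool(true)
var_dump(redeemToken($store, 1, $token, time())); // bool(false): already used
```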

PHP Session ID uniqueness (for use in a cookie)

I'm writing a user system where users will log in using Twitter's API, then I'll store the information in a database along with a few extra pieces that I have the user put in. I want the user to be able to come back after logging in and not have to log in again. I decided that I'd get all the relevant information about the user, save it to the database, then save the session ID and user ID to another table. Finally, I'd set a cookie on the user's computer containing the same session ID so that throughout their browsing they would stay logged in. Then if they closed the browser and revisited the site later, I would read that cookie, compare it with the sessions table, get the user ID, and reconstruct the session (updating the sessions table with the new session ID).
My question is, how random is the session ID? Is there a possibility that a user might get the same session ID that a user that hasn't visited the site in a week (so the cookie would still be active) had assigned to them? If this happens, then the server might mistake the new user for the old one. I really would like to avoid using the IP address because people might visit the site from a mobile browser where the IP can change at any time.
Any ideas on this? I just want to ensure that user A and user B, separated by any amount of time, won't get the same session ID.

Append the current time in microseconds to the unique ID...
session_id() . microtime();
So not only would the session IDs have to be the same, they would also have to be generated in the same microsecond, making the vanishingly unlikely just about impossible. The only way to guarantee it 100% is to check this random value against all existing session IDs and re-roll it if it already exists.
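The check-and-re-roll idea can be sketched like this. The $exists callback is hypothetical and stands in for a lookup against the sessions table. The token comes from random_bytes() rather than session_id() so the example runs on its own.

```php
<?php
// $exists is a hypothetical callback that reports whether a token is already stored
// (for example, a SELECT against the sessions table).
function generateUniqueToken(callable $exists, int $maxAttempts = 5): string
{
    for ($i = 0; $i < $maxAttempts; $i++) {
        $token = bin2hex(random_bytes(16));
        if (!$exists($token)) {
            return $token;
        }
    }
    throw new RuntimeException('Could not generate a unique token');
}

$taken = [];
$token = generateUniqueToken(fn (string $t): bool => isset($taken[$t]));
$taken[$token] = true;
echo strlen($token), "\n"; // 32
```

A UNIQUE constraint on the token column is still worth having, since two requests could pass the check at the same moment.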

Although the probability of having two active sessions with identical identifiers at the same time is vanishingly low (depending on the hash function), you could add an additional (pseudo-) unique value to that session ID to get a value with both characteristics.
You could use uniqid, which fulfills the latter:
uniqid(session_id(), true)
uniqid's value is based on microtime, with an additional pseudo-random number from lcg_value as a source of more entropy; together these make duplicate values very unlikely, though uniqid does not strictly guarantee uniqueness.

By default (in PHP versions before 7.1), the PHP session ID is an MD5 hash, which makes it 128 bits in length. That's something like 340,000,000,000,000,000,000,000,000,000,000,000,000 different possibilities. The odds of two people getting the same one are pretty remote.
If you want to guarantee uniqueness, put something in their cookie based on sequential numbers.

Strategy for unique user-voting such as Stackoverflow's?

I noticed that for voting, SO implements an XHR method which POSTs to a posts controller and sends the post ID and vote type through the URL. In addition, an fkey parameter is sent, e.g.:
http://stackoverflow.com/posts/1/vote/2
I'm going to be implementing a similar technique, and I'm wondering what logic I could use to prevent duplicate voting by the same user and prevent spamming, as well as the overall logic for implementing this.
The schema for the table I'll be storing them:
thread_id  user_id  vote_type
2334       1        2
So far I came up with these bullet points:
ensure the user is logged in
ensure that a valid post ID and valid vote type is sent
ensure that after POSTing, the user has not previously voted
the code that creates the hash can't contain dynamic information such as user agent, since a user could be on a different browser, different OS, right?
Update:
"SO is probably using the login cookie to identify the user." - Andrew
Could someone demonstrate how this would be done, or more specifically provide an example of how the fkey, which is a 32-character alphanumeric string, is generated?
Question:
Since I'm not sending the actual user ID anywhere with my XHR code, does this mean I have to update my table schema so that I can store the fkey instead of, say, the user_id? The fkey will probably have to be unique to each user, so I can probably query whether there is a row in the voting table with a given fkey.
Would appreciate any tips or insight from anyone who's implemented a similar technique.

Create a UNIQUE index on the fields (thread_id, user_id) and the DB engine will protect you from multiple votes by one user on one thread :)
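A minimal sketch of that constraint in action, using an in-memory SQLite database (this assumes the pdo_sqlite extension). The castVote() helper is hypothetical; the column names follow the schema in the question.

```php
<?php
// Sketch only: in-memory SQLite stands in for the real database (assumes pdo_sqlite).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE votes (
    thread_id INTEGER NOT NULL,
    user_id   INTEGER NOT NULL,
    vote_type INTEGER NOT NULL,
    UNIQUE (thread_id, user_id)
)');

// Hypothetical helper: returns false when this user has already voted on the thread.
function castVote(PDO $pdo, int $threadId, int $userId, int $voteType): bool
{
    $stmt = $pdo->prepare(
        'INSERT INTO votes (thread_id, user_id, vote_type) VALUES (?, ?, ?)'
    );
    try {
        $stmt->execute([$threadId, $userId, $voteType]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint was violated (duplicate vote).
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false;
        }
        throw $e;
    }
}

var_dump(castVote($pdo, 2334, 1, 2)); // bool(true)
var_dump(castVote($pdo, 2334, 1, 2)); // bool(false)
```

Letting the insert fail is safer than a SELECT followed by an INSERT, because the database enforces the rule even when two requests arrive at the same time. The user_id would come from the server-side session, not from the request.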

You can sign the URIs in order to prevent users from manipulating values. For instance, you could hash parts of the URI with a secret and append the hash to the URI. When users copy the URI and change values, the signature no longer matches and the URI becomes invalid.
This is often done in RESTful APIs, and your current approach is similar.
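One way to sketch such a signature is an HMAC over the query parameters. The helper names, parameter names, and secret below are assumptions for illustration; this is not how SO's fkey is known to be generated.

```php
<?php
// $secret is an assumed server-side key; parameter names are illustrative.
function signQuery(array $params, string $secret): string
{
    ksort($params); // canonical order so the same parameters always produce the same signature
    $query = http_build_query($params);
    return $query . '&sig=' . hash_hmac('sha256', $query, $secret);
}

function verifyQuery(string $queryString, string $secret): bool
{
    parse_str($queryString, $params);
    $sig = $params['sig'] ?? '';
    unset($params['sig']);
    if (!is_string($sig)) {
        return false;
    }
    ksort($params);
    $expected = hash_hmac('sha256', http_build_query($params), $secret);
    return hash_equals($expected, $sig);
}

$secret = 'change-me'; // assumption: loaded from configuration in practice
$signed = signQuery(['post' => 1, 'vote' => 2], $secret);
var_dump(verifyQuery($signed, $secret));                                  // bool(true)
var_dump(verifyQuery(str_replace('vote=2', 'vote=3', $signed), $secret)); // bool(false)
```

A signature stops edited URLs but not a user replaying their own valid one, so a duplicate-vote check in the database is still needed.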

I think it depends on how badly you want to keep people from re-submitting or fiddling with your data. Nothing will be 100% (unless your budget is through the roof), but you can do a good job of keeping most people from resubmitting by:
checking their UID, or an ID generated from the UID (I will explain)
recording their IP address, and checking against the DB for the IP and the submission ID (along with the generated UID)
The IP solution alone can be defeated by using a proxy, of course, or a connection that changes IPs often, such as the DSL carrier in my city (but even then, it's every couple of days). I personally generate a unique key based on that person's UID, and pass that back and forth if necessary. A salted MD5 hash usually works fine, or even an AES implementation if MD5 is viewed as too weak. Combined, these should give you a good starting place.
