I am currently working with a medium-sized team developing a custom content management system for a large client. The CMS is written in PHP and follows a custom MVC pattern. It is a modular system, to which plugins can be added by us or by other developers at a later stage.
The system will contain user-based permissions, and a series of generic roles that have predefined permissions. It is required that a super-admin user can also modify permissions on a user basis (for example John Doe might be defined as a regular user, but has the possibility of modifying content).
Opinion is currently divided about the best way for us to store and handle these permissions. Half of the dev team are suggesting to add a new DB table that will store key/value pairs and user IDs for each user, with boolean values stored in each record. The table structure would be something like this:
user_ID: the ID of the user
perm_name: the name of the permission
perm_value: a boolean value dictating whether the user can carry out this action
The proposal is that if the value associated with a particular permission is set to 0, or does not exist in the table, the user does not have the required permission.
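For what it's worth, the table proposal and its "0 or missing means denied" rule can be sketched roughly like this. It uses Python with an in-memory SQLite database purely so the snippet is self-contained; the table name user_permissions and the helper has_permission are made up for illustration, and the same query would work from PHP.

```python
import sqlite3

# Hypothetical table name; the columns follow the proposal above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user_permissions (
        user_ID    INTEGER NOT NULL,
        perm_name  TEXT    NOT NULL,
        perm_value INTEGER NOT NULL DEFAULT 0,  -- 1 = allowed, 0 = denied
        PRIMARY KEY (user_ID, perm_name)
    )
    """
)
conn.executemany(
    "INSERT INTO user_permissions VALUES (?, ?, ?)",
    [(1, "modifyProducts", 1), (1, "addPages", 0)],
)


def has_permission(conn, user_id, perm_name):
    """Return True only if a row exists and its value is 1."""
    row = conn.execute(
        "SELECT perm_value FROM user_permissions"
        " WHERE user_ID = ? AND perm_name = ?",
        (user_id, perm_name),
    ).fetchone()
    return row is not None and row[0] == 1
```

With this layout, adding a new permission is just a new row, so no schema change is needed for it either.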
The other half of the dev team is favouring storing the permissions in a single field as a JSON-encoded string within the users table. So for example, we would store the following JSON for John Doe:
{
"modifyProducts": 1,
"addProducts": 1,
"addPages": 0
}
We would then be able to use json_decode() within the User class to extract the permissions, for example:
$this->permissions = json_decode($dbval);
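A rough sketch of the decode-and-check step, written in Python only to keep it runnable on its own (json.loads plays the role of json_decode() here). The names load_permissions and can are hypothetical, and treating a malformed string as "no permissions" is an assumption of this sketch, not something the proposal specifies.

```python
import json

# Stand-in for the value read from the users table.
dbval = '{"modifyProducts": 1, "addProducts": 1, "addPages": 0}'


def load_permissions(raw):
    """Decode the stored string; empty or malformed input yields no permissions."""
    try:
        decoded = json.loads(raw) if raw else {}
    except json.JSONDecodeError:
        return {}
    return decoded if isinstance(decoded, dict) else {}


def can(permissions, name):
    """A missing key and a value of 0 both mean 'denied'."""
    return permissions.get(name, 0) == 1


permissions = load_permissions(dbval)
```

Note that JSON requires double-quoted keys and strings; a single-quoted string fails to decode, which is why the sketch falls back to an empty permission set instead of raising.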
I am personally leaning towards the latter option for two main reasons:
It is scalable
It does not require us to modify the database if we need new permissions.
In short, what is the best approach for such an application?
I think the best solution in this case would be to use a NoSQL database, such as MongoDB - this way you can still keep the scalability and take advantage of the JSON structure.
On the other hand, if you are working with a normalized relational database, you could take advantage of typed, indexed columns to optimize your queries and reads.
I personally would store JSON within a relational DB only when I want to display the info directly and not use it for any querying. Just like you've said yourself - there's always the possibility of ending up with a huge and growing JSON string, and this would most probably cause trouble at some point.
Related
I need to implement access control in my application. I need to control which columns of which row can be edited by who.
Examples of a rule may be:
User can edit only her own rows (and also insert rows only for herself) of TableA, which is linked to her through TableB (TableB contains foreign keys of both User and TableA).
User can edit just some parts of her profile, while admin can update other parts.
Instead of asking the DB to give my application all the information needed to evaluate in application code whether a change is allowed, I am currently thinking about a trigger-based approach for inserts, updates and deletes. In this scenario, triggers that fail when the operation is not permitted would be attached to the controlled tables.
I will tell Postgres which user is working right now using local configuration.
Then I simply run the operation while expecting the possibility of failure.
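To make the idea concrete, here is a minimal sketch of "run the operation and let a trigger reject it". It swaps Postgres for SQLite (through Python) so that it runs without a server: in Postgres the trigger would typically be a PL/pgSQL function that raises an exception, and the current user would come from a session setting rather than from the one-row current_actor table, which is a stand-in invented for this sketch. The table and function names are likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE profiles (user_id INTEGER PRIMARY KEY, bio TEXT);

    -- One-row table standing in for a per-session "current user" setting.
    CREATE TABLE current_actor (user_id INTEGER NOT NULL);

    -- Reject any update of a row that does not belong to the current actor.
    CREATE TRIGGER profiles_owner_only
    BEFORE UPDATE ON profiles
    BEGIN
        SELECT RAISE(ABORT, 'permission denied')
        WHERE OLD.user_id <> (SELECT user_id FROM current_actor);
    END;

    INSERT INTO profiles VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO current_actor VALUES (1);
    """
)


def update_bio(conn, user_id, bio):
    """Run the update and report whether the trigger let it through."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE profiles SET bio = ? WHERE user_id = ?",
                (bio, user_id),
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

The application code only has to handle the failure case; it never asks beforehand whether the update is allowed.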
Assuming I am able to evaluate all the permissions inside the Postgres DB, I see a few positives:
I would not need to ask the DB before the operation if it can be done.
I would not litter the application with permissions evaluation.
I can run updates from different parts of application even without using proper models while maintaining the security.
My questions are:
Can this become a bottleneck when scaling up? (It is a pretty traditional web-based PHP app, so I am expecting about 100 times more selects than updates. But this would shift evaluation away from the web server, which can be replicated more easily than the DB server.)
Is there some better well-known design practice to implement the kind of permissions I have in mind?
There is a possibility that after scaling up your DB will be full of triggers that are difficult to edit or change. E.g. if you add an extra column, you have to edit all the procedures related to that table.
If I've understood your question, you're trying to control what web application users can do.
I think a better way is to implement the restrictions on the PHP side. For instance, create roles, each of which contains an array of the columns it is allowed to edit. If the role is 'admin', its list would contain all columns.
Then dynamically create the form according to role configuration, in order to let user see, what he/she can edit.
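Something along these lines, sketched in Python for brevity (in the PHP app this would be a plain array per role). The role names, column names and both helper functions are invented for the example.

```python
# Hypothetical role configuration: role name -> columns that role may edit.
ALL_COLUMNS = {"name", "email", "bio", "role", "is_active"}
EDITABLE_COLUMNS = {
    "user": {"name", "bio"},
    "admin": ALL_COLUMNS,
}


def editable_fields(role):
    """Columns to render as inputs when building the form for this role."""
    return sorted(EDITABLE_COLUMNS.get(role, set()))


def filter_update(role, submitted):
    """Drop every submitted column the role is not allowed to edit."""
    allowed = EDITABLE_COLUMNS.get(role, set())
    return {col: val for col, val in submitted.items() if col in allowed}
```

The same configuration drives both the form and the server-side check, so a hand-crafted request cannot touch columns that were never shown to the user.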
I have a web application where companies can register and use a set of features. However, let's say company 1 and company 2 have registered. They are still accessing the same website. Now each of these companies is 100% independent of the others when it comes to sharing information etc. The only thing they might share is the users/employees.
Now my question is really, what is the best practice if each of these companies is to insert, select, update and delete about 10K rows a day?
It can be everything from project handling, hourlists etc. All of which are split into different tables.
Would it be best practice to have independent databases, or use the same database for all the companies, and identify them by company_id?
Also keep in mind that the web application has to easily adapt to more than 10 companies.
You could go one of two ways:
Add a companyId column to your tables,
Create a separate database for each company.
Option 1:
This option is the most dynamic one. You can keep the data separated by adding the correct companyId identifier to the where clause of your query.
This method is good when:
You expect a large number of customers,
You expect your number of customers to increase and decrease on a regular basis,
You do not need to share your database access with your customers (they only access it through your API/GUI).
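As a rough illustration of option 1, every query goes through a helper that always adds the companyId condition. The sketch uses Python with in-memory SQLite just to be runnable; the projects table, its columns and the helper name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (
        id        INTEGER PRIMARY KEY,
        companyId INTEGER NOT NULL,
        title     TEXT    NOT NULL
    );
    -- Index the tenant column, since every query filters on it.
    CREATE INDEX projects_company ON projects (companyId);

    INSERT INTO projects (companyId, title) VALUES
        (1, 'Website'), (1, 'Intranet'), (2, 'Payroll');
    """
)


def projects_for_company(conn, company_id):
    """Return project titles, always scoped to a single company."""
    rows = conn.execute(
        "SELECT title FROM projects WHERE companyId = ? ORDER BY id",
        (company_id,),
    ).fetchall()
    return [title for (title,) in rows]
```

Keeping the companyId filter inside one small set of helpers reduces the chance of forgetting it in an ad hoc query.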
Option 2:
This option gives a better separation of data. You keep each customer's data in their own dedicated instance of the database schema. This option allows you to offload the access-control burden to the database server, instead of having to enforce it in your application logic (which is more error prone).
However, there are some downsides: whenever a new customer shows up, you need to create a new database instance for them, which implies having a user with create database and grant privileges, something not every system administrator would be overly happy about.
The other issue is that whenever something changes in the database structure, you need to apply the change to each instance of the database.
The good thing about this option is that you can give backup copies of your database to your customers, give them direct access to the database server, if needs be, or, in a more limited form, you could give them a copy of the database structure, without the need to filter out the customerId columns (as would be the case with option 1 above).
In summary:
There is no silver bullet, it all depends on your use-case. Option 1 is more flexible, Option 2 offers a better separation of data and easier access management.
[1] Keep separate databases, since there will be many DML operations on them.
[2] Keep a very good database maintenance plan for statistics management, index maintenance and backup/recovery; otherwise you will have performance issues, or more downtime in case of a database crash.
I've been researching on creating an Access Control List and there are a few things I've found. However, I'm not sure if one way is extremely overboard and another is far too simplistic.
So here it is:
Right now, how I have it set up is that in the users table I have a permissions field. This field contains JSON of all the permissions the user has. I was curious and wanted to know if there was a better way to do this, and I have found structured databases that use separate tables for roles and permissions.
e.g.
role
-------------
id | name
permissions
-----------------
id | role_id | name
user_role
---------------
user_id | role_id
That's very basic, but the general idea.
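For reference, this is roughly what the join looks like in practice, sketched in Python with in-memory SQLite so it can be run as-is. The table and column names follow the outline above; the sample data and the two helper functions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE permissions (
        id      INTEGER PRIMARY KEY,
        role_id INTEGER NOT NULL REFERENCES role (id),
        name    TEXT    NOT NULL
    );
    CREATE TABLE user_role (
        user_id INTEGER NOT NULL,
        role_id INTEGER NOT NULL REFERENCES role (id),
        PRIMARY KEY (user_id, role_id)
    );

    INSERT INTO role VALUES (1, 'editor'), (2, 'admin');
    INSERT INTO permissions (role_id, name) VALUES
        (1, 'edit_pages'), (2, 'edit_pages'), (2, 'manage_users');
    INSERT INTO user_role VALUES (10, 1), (11, 2);
    """
)


def permissions_for_user(conn, user_id):
    """One join from user_role to permissions yields the user's permission set."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.name
        FROM user_role AS ur
        JOIN permissions AS p ON p.role_id = ur.role_id
        WHERE ur.user_id = ?
        """,
        (user_id,),
    ).fetchall()
    return {name for (name,) in rows}


def users_with_permission(conn, perm_name):
    """The reverse question, which a JSON blob per user cannot answer cheaply."""
    rows = conn.execute(
        """
        SELECT DISTINCT ur.user_id
        FROM permissions AS p
        JOIN user_role AS ur ON ur.role_id = p.role_id
        WHERE p.name = ?
        ORDER BY ur.user_id
        """,
        (perm_name,),
    ).fetchall()
    return [user_id for (user_id,) in rows]
```

The result of permissions_for_user can still be cached per user, so the join cost is paid only when the cache is cold.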
My question is, which method is better? The tables approach seems a bit heavy with the joins and everything to get the permissions. However, when I look for the user, I can just pull out the JSON field and cache it... Or is there something fundamental that I'm missing?
You should be using native tables in place of non-native data structures when working with relational databases. Riding a bike down the interstate is an option but is it the best option?
Efficiency:
Modern database servers are able to cache the result of repeated queries, so why waste resources in your app when the caching is already done for you? Most, at the least, offer a time-out cache where, if the same query is issued moments apart, the same result is returned from memory (assuming the underlying values have not been altered) instead of requiring a DB read.
The larger your field-set becomes, the more space a JSON string will take up on the file system, the slower that data will be to parse, and the more memory the cached result will consume. Using tables, the field-set will be far less resource consuming on the file system, and you can request just the value(s) you need at that moment, already formatted in a way your application understands. Whereas with JSON, you retrieve a string that still needs more manipulation to be understood, containing not only the values you need at that moment but possibly quite a few values you do not need.
Scalability:
With a stored JSON string, if you wish to delete a no longer needed field or to add a required field, you will have to spend quite a bit of processing power to adjust each item's data set. Whereas with a table this can be done with a short query command. After that, each time the server encounters an item with a field that is supposed to be deleted, or that is missing a required field, that item's field will be adjusted in memory and the update will be scheduled for batch file system writing.
--
You state you are worried about joins being slow/costly, but assuming you are only requesting the data you need AT THAT MOMENT, from ONLY the tables you need those values from, the cost of the joins should be minimal compared to the alternatives.
Your JSON way looks like Wordpress serialization, WP uses serialization to store options, take a look at this post, for a quick example:
Working with serialized data in Wordpress
But it is used for options which IMHO are more or less irrelevant, options that have no need to be filtered or whatever, the only function is to provide the configuration of certain features. Let's say for unimportant things.
Permissions & roles are the kind of thing I would consider vital for any application, and the best way is to use the standard table approach. You could need new inserts and it's much easier, or to query who has certain permissions, and that's the magic of relational tables.
Assuming I have a valid session and an authenticated user, what are some ways to go about implementing user authorization in an application with a PHP/MySQL backend, and a heavy JavaScript front-end?
Most of the implementation examples I can find seem too focused on user authentication and the authorization just sort of happens. For instance, an if statement checking if the type of user is an admin. This seems way too implemented to me.
In an implementation like mine, there is no way of knowing what "page" the user was on when they initiated the request. So, a method of only serving certain content for certain users, determined by PHP, is too broad for what I need to do.
Ideally each entity has a sort of access control list based either on the user explicitly or what group or type the user is/in.
I went to a local bookstore and spent an afternoon looking through all they had on PHP, MySQL and JavaScript. Surprisingly, most of the books had virtually nothing on user authorization. That scares the hell out of me! This has to be solved by anyone building a large web application that uses AJAX, I just can't seem to find something to get me started.
I would appreciate any and all feedback, experiences, tips, etc. (Any books on this subject?)
PHP security seems stuck in the dark ages of single password gives a token for a single user for a class of particular pages. You seem to be wanting to get a lot more fine-grained in your app, maybe even allowing access to specific pieces of resources depending on that login token. Your thought of access control lists is absolutely correct, and yes, you've discovered the dark secret: no one really published how to design or write an ACL mechanism. That said, it has been done.
First, are you familiar with Unix file permissions? They're the -rwxr-xr-x things you see in an ls -l on the command line. Unix has chosen a very simplified approach to ACLs. Each person logged in has a User ID (UID) and one or more Group IDs (GID) (whoami, groups). The Unix file permissions allow three operations, Read, Write, and Execute, each of which can be on or off for three classes: the owner, the group, and everyone else. With 2^9 states, these permissions easily fit in an integer, and Unix can then attach that integer to the file directly in the file system. When a user attempts to access a file, the classes are checked from most specific to least specific and the first one that matches decides: the owner gets the first set of permissions, members of the file's group get the second, and anyone else gets the third. Thus, an executable is usually 755: only the owner can change it, but anyone can read and use it.
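As a small illustration of how those nine bits are packed and read, here is a sketch in Python using the standard stat module. It assumes the usual POSIX rule that the first matching class (owner, then group, then other) decides; the helper names are made up for the example.

```python
import stat


def describe(mode):
    """Render permission bits the way ls -l does, for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


def effective_bits(mode, is_owner, in_group):
    """Pick the single rwx triplet that applies to this user."""
    if is_owner:
        shift = 6  # owner triplet: highest three of the nine bits
    elif in_group:
        shift = 3  # group triplet: middle three bits
    else:
        shift = 0  # "other" triplet: lowest three bits
    return (mode >> shift) & 0b111


def may_write(mode, is_owner, in_group):
    """Write is the middle bit (value 2) of the selected triplet."""
    return bool(effective_bits(mode, is_owner, in_group) & 0b010)
```

For mode 0o755 the owner triplet is 7 (rwx) and the other two are 5 (r-x), which is why only the owner can change the file.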
Second, LDAP is the Lightweight Directory Access Protocol, a system designed to give multiple network users access to resources. OpenLDAP is a common Linux implementation, and Microsoft's Active Directory on Windows Server speaks LDAP (with a lot of extensions). LDAP has a much more robust system of ACLs. A general configuration is access to [resources] by [who] [type of access granted] [control] or access to dn="uid=matt,ou=Users,dc=example,dc=com" by * none to limit all access to Matt's user information. For a much more complete discussion, I would highly recommend Mastering LDAP, specifically chapter 4 on security. (This is where I get a bit out of my direct knowledge.) I am under the impression that LDAP stores this information in a separate database table, but I don't know that and can't find documentation one way or another. I am keeping an eye out for a possible schema for that.
Short stop to summarize: ACLs take a concept of a user token with possible groups above the user level, a collection of objects to secure in some way, and several consistent possible operations on those pieces- 3 dimensions of information. Unix stores two of those dimensions with the thing to be secured directly. OpenLDAP stores those three dimensions separately, in some way we don't quite know, but that I suspect is a linked tree structure.
Given that, let's take a look at how we could design an ACL system for a RESTful web application. For assumptions, we will break your application into discrete addressable units- each thing that needs to be secured will be accessible via a URI (http://example.com/users, http://example.com/page_pieces/ticker). Our users will be a simple UID/GIDs token- a user can be part of several groups. Finally, our available operations will be based on the HTTP requests- GET, POST, PUT, DELETE, etc. We now need a system that efficiently handles a 3-dimensional array of data. Our schema should be pretty obvious: (uri, userid, groupid, operations). We deliberately denormalize the operations column into a string list of GET,POST,... so we only need one table. There is no primary key, since we will never really be looking up by ID.
Queries will be done in up to three steps: SELECT * FROM acl WHERE uri=#uri AND userid=#userid which will return 0 or 1 rows. If it returns 1 row, we're done and can grep the permissions to see if the operation is in the list (use * to indicate all perms). If we got 0 rows, run a second query SELECT * FROM acl WHERE uri=#uri AND userid='*' AND groupid in (#groupid) which will again return 0 or some rows. If it returns some, loop through and look at perms. If it returns 0, do one last query SELECT * FROM acl WHERE uri=#uri AND userid='*' AND groupid='*' which will finally return 0 or 1 row. If it returns 1, look at perms. If it returns 0, take the default action.
We can set permissions in several ways:
INSERT INTO acl VALUES (#uri, #userid, '', 'GET,POST') allows a single user GET or POST access
INSERT INTO acl VALUES (#uri, '*', 'admin,contributors', 'GET,PUT,POST,DELETE')
INSERT INTO acl VALUES (#uri, '*', '*', '') denies all access.
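Putting the lookup and the inserts together, a minimal sketch might look like the following. It is written in Python against in-memory SQLite rather than PHP/MySQL so that it runs on its own, and it assumes one group per row (so that the IN clause of the second query can match), a default of "deny", and sample URIs and user names that are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE acl (
        uri        TEXT NOT NULL,
        userid     TEXT NOT NULL,
        groupid    TEXT NOT NULL,
        operations TEXT NOT NULL,  -- comma-separated list, '*' = everything
        UNIQUE (uri, userid, groupid)
    );
    INSERT INTO acl VALUES
        ('/users', 'matt', '',      'GET,POST'),
        ('/users', '*',    'admin', '*'),
        ('/users', '*',    '*',     'GET');
    """
)


def _allows(operations, method):
    """Check a comma-separated operations string for the method or '*'."""
    ops = [op.strip() for op in operations.split(",") if op.strip()]
    return "*" in ops or method in ops


def is_allowed(conn, uri, userid, groups, method, default=False):
    # Step 1: a row for this specific user wins outright.
    row = conn.execute(
        "SELECT operations FROM acl WHERE uri = ? AND userid = ?",
        (uri, userid),
    ).fetchone()
    if row:
        return _allows(row[0], method)

    # Step 2: rows for any of the user's groups.
    if groups:
        marks = ",".join("?" * len(groups))
        rows = conn.execute(
            "SELECT operations FROM acl"
            f" WHERE uri = ? AND userid = '*' AND groupid IN ({marks})",
            (uri, *groups),
        ).fetchall()
        if rows:
            return any(_allows(ops, method) for (ops,) in rows)

    # Step 3: the catch-all row for this URI, else the default action.
    row = conn.execute(
        "SELECT operations FROM acl"
        " WHERE uri = ? AND userid = '*' AND groupid = '*'",
        (uri,),
    ).fetchone()
    return _allows(row[0], method) if row else default
```

Note that a user-specific row overrides group rows here, so a user listed with GET,POST is denied DELETE even when one of their groups would allow it.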
A couple things to note:
All URIs must be expressed exactly; this solution has no way to set default permissions at a higher level and have them trickle down (left as an exercise to the Questioner).
Uniqueness of uri/uid/gid pairs should happen at some point. The app can handle it, or in MySQL you can do ALTER TABLE acl ADD UNIQUE INDEX (uri, userid, groupid) (look up documentation for similar constraints in other DBMSes).
It seems that you are looking for something called Access Control List aka ACL (which is dead according to Zed Shaw, great video).
It's pretty hard to give you a solution without knowing what kind of backend you have, but you might check out how others are doing it.
For something specific to the lithium framework (PHP), see: Lithium Access Control
This is what I understand:
You need to build an access control list for your users, don't you?
[correct me if I'm wrong]
I suggest you create a DB table in which you store the user ID (or username) and what kind of access that user has to your web application. Then you can check the table to know if the requested URL/resource is accessible to that user. That's all.
I have built a web application for one user, but now I would like to offer it to many users (it's an application for photographer(s)).
Multiple databases problems
I first did this by creating an application for each user, but this has many problems, like:
Giving access to a new user can't be automated (or is very difficult) since I have to create a subdomain, a database, initial tables, copy code to a new location, etc. This is tedious to do by hand!
I can't as easily create reports and statistics of usage, like how many projects do my users have, how many photos, etc.
Single database problems
But having just one database for all users creates its own problems in code:
Now I have to change the DB schema to accommodate extra users, like the projects table having a user_id column (the same goes for some other tables like settings, etc.).
I have to look at almost each line of code that accesses the database and edit the SQL for selecting and inserting, so that I save data for that specific user, at the same time doing joins so that I check permissions (select ... from projects inner join project_users ... where user_id = ?).
If I forget to do that at one spot in the code, it means a security breach or another unpleasant thing (consider showing a user's projects by just doing select * from projects like I used to do - it will show all users' projects).
Backup: backup is harder because there's more data for the whole database and if a user says: "hey, I made a mistake today, can you revert the DB to yesterday", I can't as easily do that.
A solution?
I have read multiple questions on stackoverflow and have decided that I should go the "single database" route. But I'd like to get rid of the problems, if it's possible.
So I was thinking if there was a way to segment my database somehow so that I don't get these nasty (sometimes invisible) bugs?
I can reprogram the DB access layer if needed, but I'm using SQLs and not OO getter and setter methods.
Any help would be greatly appreciated.
I don't think there's a silver bullet on this one - though there are some things you can do.
Firstly, you could have your new design use a different MySQL user, and deny that user "select" rights on tables that should only be accessed through joins with the "users" table. You can then create a view which joins the two tables together, and use that whenever you run "select" queries. This way, if you forget a query, it will fail spectacularly, instead of silently. You can of course also limit insert, update and delete in this way - though that's a lot harder with a view.
Edit
So, if your application currently connects as "web_user", you could revoke select access on the projects table from that user. Instead, you'd create a view "projects_for_users", and grant "select" permissions on that view to a new user - "photographer", perhaps. The new user should also not have select access to "projects".
You could then re-write the application's data access step by step, and you'd be sure that you'd caught every instance where your app selects projects, because it would explode when trying to retrieve data - neither of your users would have "select" permissions on the projects table.
As a little side bonus - the select permission is also required for updates with a where clause, so you'd also be able to find instances where the application updates the project table without having been rewritten.
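Here is a rough sketch of the view half of this idea, in Python with in-memory SQLite so it can be run directly. SQLite has no GRANT or REVOKE, so the privilege half (taking select rights on projects away from the application's database user) is not shown and would have to be done in MySQL itself. The column layout and sample data are assumptions; only the names projects, project_users and projects_for_users come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE project_users (
        project_id INTEGER NOT NULL REFERENCES projects (id),
        user_id    INTEGER NOT NULL,
        PRIMARY KEY (project_id, user_id)
    );

    -- The application reads from this view instead of from projects.
    CREATE VIEW projects_for_users AS
        SELECT pu.user_id, p.id AS project_id, p.title
        FROM projects AS p
        JOIN project_users AS pu ON pu.project_id = p.id;

    INSERT INTO projects VALUES
        (1, 'Wedding'), (2, 'Portraits'), (3, 'Landscapes');
    INSERT INTO project_users VALUES (1, 100), (2, 100), (3, 200);
    """
)


def project_titles(conn, user_id):
    """Every read goes through the view and names the user explicitly."""
    rows = conn.execute(
        "SELECT title FROM projects_for_users"
        " WHERE user_id = ? ORDER BY project_id",
        (user_id,),
    ).fetchall()
    return [title for (title,) in rows]
```

Once the base table is no longer readable by the application's user, any leftover select * from projects fails loudly instead of leaking other users' rows.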
Secondly, you want to think about the provisioning process - how will you grant access to the system to new users? Who does this? Again, by separating out the database user who can insert records into "users", you can avoid stupid bugs where a page in your system does more than you think it does. With this kind of system, there are usually several steps that make up the provisioning process. Make sure you separate out the privileges for those tasks from the regular user privileges.
Edit
Provisioning is the word for setting up a service for a new user (I think it comes from the telephony world, where phone companies will talk about provisioning a new service on an existing phone line). It usually includes a whole bunch of business processes - and each step in the process must succeed for the next one to start. So, in your app, you may need to set up a new user account, validate their email address, set up storage space etc. Each of those steps needs to be considered as a step in the process, not just a single task.
Finally, while you're doing this, you may as well think about different levels of privilege. Will your system merit different types of user? Photographers, who can upload work, reviewers who can't? If that's a possible feature extension, you may want to build support for that now, even if the only type of user you support on go-live is photographer.
Well, time to face some hard facts -- I think. The "single database problem" that you describe, is not a problem, but a normal (usual) design. Quite often, one is simply a special case of many.
For some reason you have designed a web-app for one user -- not many of those around.
So, time to re-design.