How to handle user assets after deletion


As always, I apologize if this question has been discussed somewhere.
I'm curious for some insight on how everyone is handling related tables and assets when a user is deleted. The project I'm currently working on is mostly a project management / issue tracking system.
When a user is deleted, what should happen to their issues, projects, files, etc.?
A few scenarios off the top of my head were to either:
Delete all their issues, files, projects, etc...
Reparent everything to the only (or one of many) admin user
Lock the user account but keep all their assets (but I may need to REALLY delete them at some point if server storage becomes an issue)
Assign a non-existent user_id (0) to related tables (this seems like it would be too problematic when trying to JOIN tables)
Any other possible considerations I'm missing? The third solution sounds the best to me right now. Those of you who've worked on large projects, how have you handled deleting users? On a side note, I'm developing the project in php (yii) and mysql.

You should delete all data that is not used by other users.
Where other users are affected, as with issues or tasks, you should just deactivate the user profile. The best way to do this is to add an 'active' column to the database and set it to 0 when the user gets deleted. In your system, you can check this column and display something like 'Deleted user'.
If no one else is assigned to a task, it would be best to leave it unassigned and set up something like a pool from which other users can choose tasks or be assigned to them.
The same way I'd do it with other data like projects.
As for files, it makes sense to delete them once nothing requires them any more.
Always remember, it's better to have some extra data on the server than to lose important information!

Actually deleting a user and related objects should be the last thing you do. Even a deleted user is historical and useful data. You can always archive it if it truly becomes burdensome.
As recommended by Lukas, deactivate the user instead. All of your queries that request user-related objects should then check the 'active' flag to include/exclude the user. When you have that complex join query to retrieve the data, it saves a lot of effort and processing with that flag.
When it comes to 'actual' assets, like a file, those can take up a lot of space, but if they are related to an inactive user, you should not delete those either. Archive them instead, if necessary. That may require adding an 'archived' flag to the entry pointing to the asset, so that you're not trying to open a file that no longer exists.
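The 'active' flag approach described in these answers could look roughly like the sketch below. It is written in Python with an in-memory SQLite database purely so it runs on its own; the question's stack is PHP and MySQL, and the table names, column names, and the helper functions deactivate_user and list_issues are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT,
                         active INTEGER NOT NULL DEFAULT 1);
    CREATE TABLE issues (id INTEGER PRIMARY KEY, title TEXT,
                         user_id INTEGER REFERENCES users(id));
    INSERT INTO users  (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO issues (title, user_id)
        VALUES ('Login fails', 1), ('Typo on home page', 2);
""")


def deactivate_user(conn, user_id):
    """Soft delete: flip the flag instead of removing the row."""
    conn.execute("UPDATE users SET active = 0 WHERE id = ?", (user_id,))


def list_issues(conn):
    """Return (title, author) pairs, masking deactivated authors."""
    rows = conn.execute("""
        SELECT i.title,
               CASE WHEN u.active = 1 THEN u.name ELSE 'Deleted user' END
        FROM issues AS i
        JOIN users AS u ON u.id = i.user_id
        ORDER BY i.id
    """)
    return rows.fetchall()


deactivate_user(conn, 2)
issues = list_issues(conn)
print(issues)
```

Because the user row is never removed, the JOIN keeps working and the issue history stays intact; only the displayed name changes.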

Related

User permissions in Postgres

I need to implement access control in my application. I need to control which columns of which row can be edited by who.
Examples of a rule may be:
User can edit only her own rows (and also insert rows only for herself) of TableA, which is linked to her through TableB (TableB contains foreign keys of both User and TableA).
User can edit just some parts of her profile, while admin can update other parts.
Instead of asking the DB to give my application all the information needed to evaluate in application code whether a change is allowed, I am currently thinking about a trigger-based approach for inserts, updates and deletes. In this scenario, triggers that fail when an operation is not permitted would be hooked to the controlled tables.
I will tell Postgres which user is working right now using local configuration.
Then I simply run the operation while expecting the possibility of failure.
Assuming I am able to evaluate all the permissions inside the Postgres DB, I see a few positives:
I would not need to ask the DB before the operation if it can be done.
I would not litter the application with permissions evaluation.
I can run updates from different parts of application even without using proper models while maintaining the security.
My questions are:
Can this become a bottleneck when scaling up? (It is a pretty traditional web-based PHP app, so I am expecting about 100 times more selects than updates. But this would shift evaluation from the web server, which could be more easily replicated than the DB server.)
Is there a better, well-known design practice for implementing the kind of permissions I have in mind?

There is a risk that after scaling up your DB will be full of triggers that are difficult to edit or change. E.g. if you add an extra column, you have to edit all the procedures related to that table.
If I've understood your question, you're trying to control web application users.
I think a better way is to implement the restrictions on the PHP side. For instance, create a role that contains an array of the columns it is allowed to edit. If the role is 'admin', its list would contain all columns.
Then dynamically create the form according to the role configuration, so that the user sees what he/she can edit.
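A minimal sketch of the role configuration this answer describes, written in Python for a self-contained example (the answer itself proposes doing this on the PHP side). The role names, column names, and the filter_update helper are all hypothetical. The same whitelist that drives the form should also be applied to the submitted data, since a form can be bypassed.

```python
# Hypothetical role configuration: which profile columns each role may edit.
EDITABLE_COLUMNS = {
    "user": {"display_name", "email"},
    "admin": {"display_name", "email", "role", "active"},
}


def filter_update(role, submitted):
    """Keep only the submitted fields that the role is allowed to edit."""
    allowed = EDITABLE_COLUMNS.get(role, set())
    return {column: value for column, value in submitted.items()
            if column in allowed}


# A regular user tries to promote herself by posting an extra field.
submitted = {"email": "new@example.com", "role": "admin"}
user_update = filter_update("user", submitted)
admin_update = filter_update("admin", submitted)
print(user_update)
print(admin_update)
```

An unknown role falls back to an empty set, so nothing is editable unless a role explicitly allows it.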

Most efficient way to track unread messages per user

I am working on a jQuery Mobile Web App where a company will have the ability to message certain groups of users (based on their profile preferences).
I am debating on what is the most efficient way to mark when each user has read the latest messages. I have considered using a session to try and keep track of the last time they opened the messages page, and comparing that to the post times of the messages. I have also considered a table with the message_id and the user_id, marking each one as read when they open the page.
I think both would work, but I am trying to balance the pros and cons. Keeping it in a database would allow me to keep a history (especially if I added a timestamp column to know when they read the message), but if it is going to hurt the app's performance due to the table size, then it may not be worth it. The app will potentially have tens of thousands of users.
One thing I should probably mention is that the users may use the app on multiple devices and the app will have very long session times, potentially allowing the user to stay logged in for months. I like the idea that if they read it on one device then it would mark it read on all devices, which may make sessions difficult to work with, right?

Ok, I'm gonna put everything I said in the comments into one solid answer.
Short Answer: You should be using the database to store 'read' notifications
Logic behind it:
It should be a negligible performance hit with decent servers and optimized code (a couple of ms max), even with hundreds of thousands of users
It is highly maintainable
You can track it and sync it across devices
Specifically why you shouldn't use sessions
Sessions were designed to store temporary user data (think ram), they're not supposed to log stuff.
You should not be keeping sessions for months. It is highly insecure as it opens up a much larger window for session hijacking. Rather you should be generating a new session each time the app is accessed, and using a different "remember me" cookie or something each time to authenticate them.
Even if you do make your session persist for months, after those months won't the user all of a sudden get a bajillion "unread" notifications?
How to store it in the database
This is a many-to-many relationship between messages and users, resolved by a junction table (each message, and each user, has a one-to-many relationship to the rows in that table)
Table 1: messages
ID, message, timestamp
Table 2: messages_users
ID, user_id, message_id, read
Table 3: users
(Do user business as usual)
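As a rough sketch of how the messages_users table above could be used, here is a small Python example against an in-memory SQLite database (a stand-in for the app's real database). The flag column is named is_read here rather than read to avoid clashing with SQL keywords, and the unread_count and mark_all_read helpers, the UNIQUE constraint, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT, timestamp TEXT);
    CREATE TABLE messages_users (
        id         INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL,
        message_id INTEGER NOT NULL REFERENCES messages(id),
        is_read    INTEGER NOT NULL DEFAULT 0,
        UNIQUE (user_id, message_id)  -- one row per user per message
    );
    INSERT INTO messages (id, message, timestamp) VALUES
        (1, 'Office closed Friday', '2024-01-01 09:00:00'),
        (2, 'New price list',       '2024-01-02 09:00:00');
    -- User 7 was sent both messages, user 8 only the first.
    INSERT INTO messages_users (user_id, message_id) VALUES (7, 1), (7, 2), (8, 1);
""")


def unread_count(conn, user_id):
    """Number of messages this user has not opened yet."""
    row = conn.execute(
        "SELECT COUNT(*) FROM messages_users WHERE user_id = ? AND is_read = 0",
        (user_id,),
    ).fetchone()
    return row[0]


def mark_all_read(conn, user_id):
    """Called when the user opens the messages page on any device."""
    conn.execute(
        "UPDATE messages_users SET is_read = 1 WHERE user_id = ? AND is_read = 0",
        (user_id,),
    )


before = unread_count(conn, 7)
mark_all_read(conn, 7)
after = unread_count(conn, 7)
print(before, after)
```

Since the state lives in the database rather than in a session, reading on one device marks the messages read everywhere.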

Another option, whether you have one user or hundreds: add a single column named readUnread to the message, large enough to hold more than 63,999 characters, and store every recipient in it with a 0 or 1 flag, like {jeff:0,kevin:1,Sal:0}. When a user reads the message, update that user's flag from 0 to 1; when you display the messages, split the value on its separator and look up the current user's flag. The idea behind this is to improve performance.

Logging user activities in applications

The problem I'm here to talk about (and ask about, of course) is not new. I searched the web and Stack Overflow and found ideas on many parts of this problem (pros and cons), but some parts are still missing for me. So I thought it would be a good idea to gather them in one place (of course it will be more complete with others' ideas) and ask about the rest.
The problem is clear: "We want to log every single action of a user". Once we solve the big problem, smaller ones (like logging only one action) should be a piece of cake.
First from what I read over the web and stack overflow:
Use DB instead of File: That's good advice, although it always depends on the situation. But because of the many benefits of a DB, in the long term and in general it's the better solution.
DB Layer or Application Layer: It depends. For example, if you really want to monitor everything (I mean every single row that changes in the database), it seems there is only one choice: database triggers. However, there are many discussions around MySQL saying that triggers slow down the DB and advising against them. So, depending on the level of detail you need, you can put your logging system in the DB layer or the application layer (for example, a common function call such as $logClass->logThis()).
Use Observers: Clean code is always better. If you are familiar with observers, you can use them to do the logging for you when an action happens, so you don't have to add $logClass->logThis() every time a CRUD operation happens in your application.
What To Log: Simple and short answer is: Based on your needs, but there are some common fields you will need:
user_id (if a unique user ID is available)
timestamp (unix maybe)
ip (not everyone knows how to fake it, so log it; even a faked address gives you some insight into user behavior)
action_id (should be predefined actions for better unifying in queries and reports)
object_id (the unique row ID of the record that was changed)
action (this is the part my question is about)
etc.
I would appreciate it if anyone corrected me where I've made a mistake or added other useful information to this post, so that it becomes a good reference for other users.
And now my question: how to store actions? For better understanding, consider the following scenario.
I have a table named "product" and a table named "companies". The business logic requires assigning products to companies, so we ended up with a table "company_product". Now when a user inserts a new product and simultaneously assigns its companies, two tables are affected (the same goes for delete and update), "product" and "company_product", and we want to know:
what's inserted?
what's deleted?
what's updated to what?
For performance reasons, and because I don't know enough about triggers, I want to do the logging in the application layer, so I came up with the idea of saving the action field in the database as an array or JSON structure. But as I developed my solution I ran into a problem: how do I make this log understandable for non-technical users? For example, I want to save something like this in the action field when deleting (or inserting) the product with id 20:
action : [{id: 20, product_id:2, company_id: 1},{id: 21, product_id:2, company_id: 2}]
And this is not easy for everyone to read and understand. I could make this JSON more readable, something like this:
action : ['Product A Deleted From Company X', 'Product A Deleted From Company Y']
and save the previous version in a technical_action field for later diagnosis. But that needs additional work and more queries for something (the log) that is not always looked at.
I would appreciate any additional information on this article (I'm definitely sure that there exist other criteria that can be discussed), and answer to my question.

What you are actually doing is gathering details for analytics.
It would be better to use flat tables rather than relational tables.
If you want to do more analysis later, relational tables will not be a good choice because querying them performs worse.
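One way to combine the ideas in this thread is a single flat log table that stores both a readable action and the raw technical_action. The sketch below is in Python with in-memory SQLite so it runs standalone; the activity_log table, the log_action helper, the action_id string, and the sample values are assumptions. It also assumes the company names are already in memory when the log entry is written, which is what lets the readable text be built without extra queries.

```python
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# One flat table, no joins needed to read the log back.
conn.execute("""
    CREATE TABLE activity_log (
        id INTEGER PRIMARY KEY,
        user_id INTEGER, timestamp INTEGER, ip TEXT,
        action_id TEXT, object_id INTEGER,
        action TEXT,            -- readable summary for non-technical users
        technical_action TEXT   -- raw affected rows as JSON, for diagnosis
    )
""")


def log_action(conn, user_id, ip, action_id, object_id, summary, rows):
    """Store one log entry with both the readable and the raw form."""
    conn.execute(
        "INSERT INTO activity_log (user_id, timestamp, ip, action_id,"
        " object_id, action, technical_action) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (user_id, int(time.time()), ip, action_id, object_id,
         json.dumps(summary), json.dumps(rows)),
    )


# Rows removed from company_product when product 2 is unassigned.
removed = [
    {"id": 20, "product_id": 2, "company_id": 1},
    {"id": 21, "product_id": 2, "company_id": 2},
]
# Assumed to be loaded already while handling the request.
company_names = {1: "Company X", 2: "Company Y"}
summary = [f"Product A Deleted From {company_names[row['company_id']]}"
           for row in removed]

log_action(conn, 5, "203.0.113.9", "company_product.delete", 2, summary, removed)

stored = conn.execute(
    "SELECT action, technical_action FROM activity_log").fetchone()
readable = json.loads(stored[0])
print(readable)
```

If the names are not at hand when the action happens, the readable text could instead be generated later from technical_action, at the cost of the extra lookups the question wants to avoid.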

MVC Multi User Authentication/Security

I've been working on a web application for a company that assists them with quoting, managing inventory, and running jobs. We believe the app will be useful to other companies in the industry, but there's no way I want to roll out separate instances of the app, so we're making it multi-user (or multi-company might be a better term, as each company has multiple users).
It's built in CodeIgniter (I wish I'd done it in Rails, too late now though), and I've tried to follow the skinny-controller, fat-model approach. I just want to make sure I do the authorisation side of things properly. When a user logs in I'd store the companyID along with the userID in the session. I'm thinking that every table the user interfaces with should have an additional companyID field (tables accessed indirectly via relationships probably wouldn't need to store the companyID too, tell me if I'm wrong though). Retrieving data seems pretty straightforward: just have an additional where clause in AR to add the company ID to the select, e.g. $this->db->where('companyID', $companyID). I'm ok with this.
However, what I'd like to know is how to ensure users can only modify data within their own company (in case they send say, a delete request to a random quoteID, using firebug or a similar tool). One way I thought of is to add the same where clause above to every update and delete method in the models as well. This would technically work, but I just wanted to know whether it's the correct way to go about doing it, or if anyone had any other ideas.
Another option would be to check to see if the user's company owned the record prior to modification, but that seems like a double-up on database requests, and I don't really know if there's any benefit to doing it this way.
I'm surprised I couldn't find an answer to this question, I must be searching for the wrong terms :p. But I would appreciate any answers on this topic.
Thanks in advance,
Christian

I'd say you're going about this the correct way. Keeping all of the items in the same tables will allow you to run global statistics as well as localized statistics - so I think this is the better way to go.
I would also say that it would be best to add the where clause you mention to each query (whether it's a get, update, or delete). However, I'm not sure you'd want to manually go in and do that for all of your queries. I would suggest you override those methods in your models to add the relevant where clauses. That way, when you call $this->model->get(), a where clause on $companyID (and $userID) is added to the query automatically.
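The idea of building the company filter into the model methods can be sketched as follows. This is Python with in-memory SQLite rather than CodeIgniter, so the CompanyScopedModel class, its get and delete methods, and the quotes table are hypothetical stand-ins, not CodeIgniter API.

```python
import sqlite3


class CompanyScopedModel:
    """Hypothetical base model: every query is restricted to one company."""

    def __init__(self, conn, table, company_id):
        self.conn = conn
        self.table = table            # hard-coded by the developer, never user input
        self.company_id = company_id  # taken from the logged-in session

    def get(self, record_id):
        return self.conn.execute(
            f"SELECT * FROM {self.table} WHERE id = ? AND companyID = ?",
            (record_id, self.company_id),
        ).fetchone()

    def delete(self, record_id):
        cursor = self.conn.execute(
            f"DELETE FROM {self.table} WHERE id = ? AND companyID = ?",
            (record_id, self.company_id),
        )
        # 0 rows means the record does not exist or belongs to another company.
        return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quotes (id INTEGER PRIMARY KEY, companyID INTEGER NOT NULL,
                         total REAL);
    INSERT INTO quotes (id, companyID, total) VALUES (1, 10, 250.0), (2, 20, 900.0);
""")

quotes = CompanyScopedModel(conn, "quotes", company_id=10)
foreign_delete = quotes.delete(2)  # quote 2 belongs to company 20
own_delete = quotes.delete(1)      # quote 1 belongs to company 10
remaining = conn.execute("SELECT id FROM quotes").fetchall()
print(foreign_delete, own_delete, remaining)
```

Putting the ownership condition in the DELETE itself avoids the separate "does this company own the record" query the question worries about; the affected-row count tells the caller whether anything was actually changed.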

From the looks of things it looks like this might be a more API type system (as otherwise this is simply a normal user authentication system).
Simple Authentication
Anyway, the best bet I can see for an API is to have two tables, companies and users.
In the companies table, have a companyID and a password. In the users table, link each user to a company.
Then when a user makes a request, have them send the companyID and password with every request.
OAuth
The next option is OAuth. It is slightly harder to implement, and it means the other end must also set up OAuth authentication.
But in my opinion it is much nicer to use overall and is a bit more secure.

One way to do it would be with table prefixes. However, if you have a lot of tables already, duplicating them will obviously grow the size of the db rapidly. If you don't have many tables, this should scale. You can set the prefix based on user credentials. See the prefixes section of this page: http://codeigniter.com/user_guide/database/queries.html for more on working with them.
Another option is to not roll out separate instances of the application, but use separate databases. Here is a post on CI forum discussing multiple db's: http://codeigniter.com/forums/viewthread/145901/ Here again you can select the proper db based on user credentials.
The only other option I see is the one you proposed where you add an identifier to the data designating ownership. This should work, but seems kinda scary.

How to segment a database for an application accessing it (a.k.a. single database for multiple users problem)?

I have built a web application for one user, but now I would like to offer it to many users (it's an application for photographer(s)).
Multiple databases problems
I first did this by creating an application for each user, but this has many problems, like:
Giving access to a new user can't be automated (or is very difficult) since I have to create a subdomain, a database, initial tables, copy code to a new location, etc. This is tedious to do by hand!
I can't as easily create reports and statistics of usage, like how many projects do my users have, how many photos, etc.
Single database problems
But having just one database for all users creates its own problems in code:
Now I have to change the DB schema to accommodate extra users, like the projects table having a user_id column (the same goes for some other tables like settings, etc.).
I have to look at almost every line of code that accesses the database and edit the SQL for selecting and inserting, so that I save data for that specific user, at the same time doing joins so that I check permissions (select ... from projects inner join project_users ... where user_id = ?).
If I forget to do that at one spot in the code it means security breach or another unpleasant thing (consider showing user's projects by just doing select * from projects like I used to do - it will show all users' projects).
Backup: backup is harder because there's more data for the whole database and if a user says: "hey, I made a mistake today, can you revert the DB to yesterday", I can't as easily do that.
A solution?
I have read multiple questions on stackoverflow and have decided that I should go the "single database" route. But I'd like to get rid of the problems, if it's possible.
So I was thinking if there was a way to segment my database somehow so that I don't get these nasty (sometimes invisible) bugs?
I can reprogram the DB access layer if needed, but I'm using SQLs and not OO getter and setter methods.
Any help would be greatly appreciated.

I don't think there's a silver bullet on this one - though there are some things you can do.
Firstly, you could have your new design use a different MySQL user, and deny that user "select" rights on tables that should only be accessed through joins with the "users" table. You can then create a view which joins the two tables together, and use that whenever you run "select" queries. This way, if you forget a query, it will fail spectacularly, instead of silently. You can of course also limit insert, update and delete in this way - though that's a lot harder with a view.
Edit
So, if your application currently connects as "web_user", you could revoke select access on the projects table from that user. Instead, you'd create a view "projects_for_users", and grant "select" permissions on that view to a new user - "photographer", perhaps. The new user should also not have select access to "projects".
You could then re-write the application's data access step by step, and you'd be sure that you'd caught every instance where your app selects projects, because it would explode when trying to retrieve data - neither of your users would have "select" permissions on the projects table.
As a little side bonus - the select permission is also required for updates with a where clause, so you'd also be able to find instances where the application updates the project table without having been rewritten.
Secondly, you want to think about the provisioning process - how will you grant access to the system to new users? Who does this? Again, by separating out the database user who can insert records into "users", you can avoid stupid bugs where a page in your system does more than you think it does. With this kind of system, there are usually several steps that make up the provisioning process. Make sure you separate out the privileges for those tasks from the regular user privileges.
Edit
Provisioning is the word for setting up a service for a new user (I think it comes from the telephony world, where phone companies will talk about provisioning a new service on an existing phone line). It usually includes a whole bunch of business processes - and each step in the process must succeed for the next one to start. So, in your app, you may need to set up a new user account, validate their email address, set up storage space etc. Each of those steps needs to be considered as a step in the process, not just a single task.
Finally, while you're doing this, you may as well think about different levels of privilege. Will your system merit different types of user? Photographers, who can upload work, reviewers who can't? If that's a possible feature extension, you may want to build support for that now, even if the only type of user you support on go-live is photographer.

Well, time to face some hard facts -- I think. The "single database problem" that you describe, is not a problem, but a normal (usual) design. Quite often, one is simply a special case of many.
For some reason you have designed a web-app for one user -- not many of those around.
So, time to re-design.
