Testing thousands of similar fields for differences - php

I have created a privilege system for my application which allows/disallows access to specific pages based on user input.
The table looks something like this:
page_id | client_id | sys_group_no | name | friendly_name | viewable |
1 | 4 | 1 | home | Home | true |
2 | 4 | 1 | admin| Admin Home | false |
So if a user under client_id 4 belongs to group 1, they are NOT allowed to view 'Admin Home'. It isn't actually quite this simple, but for the sake of this question we can pretend.
The problem is that, as maintenance goes on, this table gets out of date quickly, and when you have a few thousand rows, constantly checking the table against the actual page names (using scandir() and array_diff()) will be expensive. Is there a different paradigm for checking this kind of integrity other than direct comparison? For instance, would hashing my $page_array and comparing it be a better approach?
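The hashing idea can be sketched as follows. This is a minimal illustration in Python rather than PHP, and the function names (pages_fingerprint, find_drift) and sample page names are hypothetical. It assumes a fingerprint of the page list was stored the last time the table was synchronised. Note that the directory still has to be listed to compute the hash; what the fingerprint saves is the table read and the full diff when nothing has changed.

```python
import hashlib


def pages_fingerprint(page_names):
    """Return a digest that depends only on the set of names, not their order."""
    # NUL cannot appear in a file name, so it is a safe separator.
    canonical = "\0".join(sorted(set(page_names)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(pages_on_disk, pages_in_table, stored_fingerprint=None):
    """Return (missing_from_table, stale_in_table).

    If the fingerprint saved at the last sync still matches the pages on
    disk, nothing has changed and the diff is skipped entirely.
    """
    if stored_fingerprint is not None:
        if pages_fingerprint(pages_on_disk) == stored_fingerprint:
            return set(), set()
    on_disk, in_table = set(pages_on_disk), set(pages_in_table)
    return on_disk - in_table, in_table - on_disk


# Hypothetical data: one new page on disk, one stale row in the table.
on_disk = ["home", "admin", "reports"]
in_table = ["home", "admin", "legacy"]
missing, stale = find_drift(on_disk, in_table)
```

For a few thousand short strings, the set difference itself is usually cheap, so it may be worth measuring before adding the fingerprint.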

Related

Database Layout Privileges & Roles

I'm currently creating a role system for a web project (PHP/SQL).
To start off, I will have ~5 fixed roles which can be assigned to users.
Then I have a lot of different privileges.
Now I have to create a table to assign different privileges to different roles, and I wonder what the best way to do so is:
1) Table with columns like that:
id | privilege | sysadmin | editor | ... | guest
0 | db_view | True | False | ... | False
1 | page_edit | True | True | ... | False
...
This seems the best solution as long as there are only five fixed roles, but is it still practicable when I open the role system to user-defined roles in a future version?
2) Table with columns like that:
id | role | db_view* | page_edit | ... | usermanagement
0 | sysadmin | True | True | ... | True
1 | Editor | False | True | ... | False
...
In the end this table will have a large number of columns. Is that a good idea?
3) Table with columns like that:
id | role | privilege | Value
0 | sysadmin | page_edit | True
1 | Editor | page_edit | True
2 | sysadmin | dbview | True
3 | Editor | dbview | False
...
Here it should probably be enough to store rows only for the privileges that are true. But it would still require a big table.
Is there another way? What would be the clearest, most flexible way to create that table?
Which way do you learn when you study database design?
Thanks in advance for any suggestions or questions!
This seems like a classic case for a many-to-many relationship:
TblPrivileges
-------------
Privilege_Id (primary key)
Privilege_name
Other privileges related data
TblRoles
--------
Role_Id (primary key)
Role_Name
Other role related data
TblPrivilegesToRoles
--------------------
PTR_Privilege_Id (reference privilege id in tblPrivileges)
PTR_Role_Id (reference role id in tblRoles)
In table TblPrivilegesToRoles the primary key should be both columns.
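As a rough sketch of that layout, the following uses Python's sqlite3 module with an in-memory database as a stand-in for PHP/MySQL. The table and column names come from the answer above; the sample rows, the UNIQUE constraints, and the helper role_has_privilege are assumptions added for illustration. Only granted privileges are stored, so a missing row means "not allowed".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TblPrivileges (
    Privilege_Id   INTEGER PRIMARY KEY,
    Privilege_name TEXT NOT NULL UNIQUE
);
CREATE TABLE TblRoles (
    Role_Id   INTEGER PRIMARY KEY,
    Role_Name TEXT NOT NULL UNIQUE
);
-- Junction table: the composite primary key prevents duplicate grants.
CREATE TABLE TblPrivilegesToRoles (
    PTR_Privilege_Id INTEGER NOT NULL REFERENCES TblPrivileges(Privilege_Id),
    PTR_Role_Id      INTEGER NOT NULL REFERENCES TblRoles(Role_Id),
    PRIMARY KEY (PTR_Privilege_Id, PTR_Role_Id)
);
INSERT INTO TblPrivileges VALUES (1, 'db_view'), (2, 'page_edit');
INSERT INTO TblRoles VALUES (1, 'sysadmin'), (2, 'editor');
INSERT INTO TblPrivilegesToRoles VALUES (1, 1), (2, 1), (2, 2);
""")


def role_has_privilege(conn, role_name, privilege_name):
    """Return True if a grant row links the named role and privilege."""
    row = conn.execute(
        """
        SELECT 1
        FROM TblPrivilegesToRoles ptr
        JOIN TblRoles r      ON r.Role_Id = ptr.PTR_Role_Id
        JOIN TblPrivileges p ON p.Privilege_Id = ptr.PTR_Privilege_Id
        WHERE r.Role_Name = ? AND p.Privilege_name = ?
        """,
        (role_name, privilege_name),
    ).fetchone()
    return row is not None
```

Adding a user-defined role or a new privilege is then a plain INSERT, with no schema change.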

Database design with undetermined data

Recently I have been planning a system that allows a user to customize and add to a web interface. The app could be compared to a quiz creating system. The problem I'm having is how to design a schema that will allow for "variable" numbers of additions to be made to the application.
The first option that I looked into was just creating an object for the additions and then serializing it and putting it in its own column. The content wouldn't be edited often so writing would be minimal, reads however would be very often. (caching could be used to cut down)
The other option was using something other than MySQL or PostgreSQL, such as Cassandra. I've never used other databases before but would be interested in learning how to use them if they would improve the design of the system.
Any input on the subject would be appreciated.
Thank you.
*edit 29/3/14
Some information on the data being changed. For my idea above of using a serialized object, you could say that in the table I would store the name of the quiz, the number of points the quiz is worth and then a column called quiz data that would store the serialized object containing the information on the questions. So overall the object could look like this:
Questions(Array):{
[1](Object):Question{
Field-type(int):1
Field-title(string):"Whats your gender?"
Options(Array):{"Female", "Male"}
}
[2](Object):Question{
Field-type(int):2
Field-title(string):"Whats your name?"
}
}
The structure could vary of course, but generally I would be storing integers to determine the type of field in the quiz, a field to hold the label for the field, and the options (if there are any) for that field.
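For comparison, the serialized-column idea might look like the sketch below. It uses Python and JSON purely for illustration; the key names (field_type, field_title, options) are adapted from the structure above and are otherwise assumptions. The resulting string is what would be stored in the quiz data column.

```python
import json

# Hypothetical in-memory form of the quiz structure described above.
quiz_data = {
    "questions": [
        {
            "field_type": 1,
            "field_title": "Whats your gender?",
            "options": ["Female", "Male"],
        },
        {
            "field_type": 2,
            "field_title": "Whats your name?",
        },
    ]
}

# Serialize once on write; this string goes into the single quiz data column.
serialized = json.dumps(quiz_data)

# Deserialize on read; questions without options simply omit the key.
restored = json.loads(serialized)
```

The trade-off is that the database cannot easily query or constrain anything inside the serialized value.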
In this scenario I would advise looking at MongoDB.
However if you want to work with MySQL you can think about the entity-attribute-value model in your design. The EAV model allows you to design for entries that contain a variable number of attributes.
edit
Following your update on the datatypes you would like to store, you could map your design as follows:
+-------------------------------------+
| QuizQuestions |
+----+---------+----------------------+
| id | type_id | question_txt |
+----+---------+----------------------+
| 1 | 1 | What's your gender? |
| 2 | 2 | What's your name? |
+----+---------+----------------------+
+-----------------------------------+
| QuestionTypes |
+----+--------------+---------------+
| id | attribute_id | description |
+----+--------------+---------------+
| 1 | 1 | Single select |
| 2 | 2 | Free text |
+----+--------------+---------------+
+----------------------------+
| QuestionValues |
+----+--------------+--------+
| id | question_id | value |
+----+--------------+--------+
| 1 | 1 | Male |
| 2 | 1 | Female |
+----+--------------+--------+
+-------------------------------+
| QuestionResponses |
+----+--------------+-----------+
| id | question_id | response |
+----+--------------+-----------+
| 1 | 1 | 1 |
| 2 | 2 | Fred |
+----+--------------+-----------+
This would then allow you to dynamically add various different questions (QuizQuestions), of different types (QuestionTypes), and then restrict them with different options (QuestionValues) and store those responses (QuestionResponses).
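A minimal sketch of reading that design back, using Python's sqlite3 as a stand-in for MySQL. The table names and sample rows follow the diagrams above; the attribute_id column is left out because its meaning is not spelled out, and load_questions is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE QuestionTypes (
    id          INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE QuizQuestions (
    id           INTEGER PRIMARY KEY,
    type_id      INTEGER NOT NULL REFERENCES QuestionTypes(id),
    question_txt TEXT NOT NULL
);
CREATE TABLE QuestionValues (
    id          INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL REFERENCES QuizQuestions(id),
    value       TEXT NOT NULL
);
INSERT INTO QuestionTypes VALUES (1, 'Single select'), (2, 'Free text');
INSERT INTO QuizQuestions VALUES
    (1, 1, 'What''s your gender?'),
    (2, 2, 'What''s your name?');
INSERT INTO QuestionValues VALUES (1, 1, 'Male'), (2, 1, 'Female');
""")


def load_questions(conn):
    """Return {question_id: {"text", "type", "options"}} for rendering a quiz."""
    rows = conn.execute(
        """
        SELECT q.id, q.question_txt, t.description, v.value
        FROM QuizQuestions q
        JOIN QuestionTypes t       ON t.id = q.type_id
        LEFT JOIN QuestionValues v ON v.question_id = q.id
        ORDER BY q.id, v.id
        """
    )
    questions = {}
    for qid, text, kind, option in rows:
        entry = questions.setdefault(
            qid, {"text": text, "type": kind, "options": []}
        )
        # Free-text questions have no option rows, so the LEFT JOIN yields NULL.
        if option is not None:
            entry["options"].append(option)
    return questions
```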

Which is more efficient? Count() or a reference to a count in another table?

Say I wanted to add the functionality of logging user actions within a web application. My table schema would look similar to the following:
tbl_history:
+----+---------+--+-----------+
| id | user_id | | action_id |
+----+---------+--+-----------+
| 1 | 1 | | 1 |
| 2 | 1 | | 2 |
| 3 | 2 | | 2 |
+----+---------+--+-----------+
A user can generate many actions so I will need to paginate this history. In order to do this I will need to figure out the total amount of rows for the user then calculate how many pages of data there should be.
Which method would be the most efficient if I were to have hundreds of users generating thousands of rows of data each day?
A)
Using MySQL's COUNT() function to query the number of rows in the tbl_history table for a particular user.
B)
Having another table which would keep a count of history for the user within the tbl_history table.
+---------+--+---------------+
| user_id | | history_count |
+---------+--+---------------+
| 1 | | 2 |
| 2 | | 1 |
+---------+--+---------------+
This will allow me to instantly get the total count of rows with a simple query in less than 1ms.
The tradeoff is that I will need to perform more queries updating the count for each user and also again on page load.
Which method is more efficient to use? Or is there any other better method? Any technical explanation would be great.
Thanks in advance.
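Both options can be put side by side in a small sketch. It uses Python's sqlite3 in place of MySQL; the index name, the counter table name tbl_history_count, the trigger, and the helper functions are all assumptions for illustration. Option A relies on an index on user_id so the count does not need to read the whole table; option B keeps the counter up to date with a trigger instead of extra application queries. Which one is faster in practice depends on the data and should be measured.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_history (
    id        INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL,
    action_id INTEGER NOT NULL
);
-- Option A: an index on user_id keeps the per-user COUNT from scanning every row.
CREATE INDEX idx_history_user ON tbl_history (user_id);

-- Option B: a counter table maintained by a trigger on insert.
CREATE TABLE tbl_history_count (
    user_id       INTEGER PRIMARY KEY,
    history_count INTEGER NOT NULL
);
CREATE TRIGGER trg_history_insert AFTER INSERT ON tbl_history
BEGIN
    INSERT OR IGNORE INTO tbl_history_count (user_id, history_count)
    VALUES (NEW.user_id, 0);
    UPDATE tbl_history_count
    SET history_count = history_count + 1
    WHERE user_id = NEW.user_id;
END;
""")
conn.executemany(
    "INSERT INTO tbl_history (user_id, action_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 2)],
)


def count_live(conn, user_id):
    """Option A: count the user's rows at query time."""
    sql = "SELECT COUNT(*) FROM tbl_history WHERE user_id = ?"
    return conn.execute(sql, (user_id,)).fetchone()[0]


def count_cached(conn, user_id):
    """Option B: read the pre-computed counter (0 if the user has no rows)."""
    sql = "SELECT history_count FROM tbl_history_count WHERE user_id = ?"
    row = conn.execute(sql, (user_id,)).fetchone()
    return row[0] if row else 0


def page_count(total_rows, per_page):
    """Number of pages needed to show total_rows at per_page rows each."""
    return math.ceil(total_rows / per_page)
```

If rows can also be deleted, option B would need a matching delete trigger to keep the counter accurate.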

Like and Unlike System in PHP

I am developing a community site for high school students. I am trying to implement a like and unlike system using PHP. Here's what I have so far:
A table named likes in MySQL with 3 columns namely app_id VARCHAR(32), user VARCHAR(12), dormant VARCHAR(6).
UNIQUE(app_id,user)
When a person likes a page on my site, a row is either inserted or updated in the likes table with dormant = false.
When a person unlikes a page, the existing row is updated again with dormant = true. This is an alternative to deleting the row, which seemed a bit expensive when likes and unlikes need to be fast.
I want to know, if I should go for deleting the row instead of updating it, when someone unlikes the page.
Don't delete the row. Every piece of data you can gather is a valuable data point.
I would say you should also create a new record for every unlike.
This data will be useful to you in the future for figuring out user behaviour.
Some people might like something now, then unlike it, then like it again, and so on.
Maybe in the future you would like to see why so many people who liked an item suddenly unliked it and then liked it again.
So I say gather as much data as you can.
Sounds like premature optimization. Don't do that.
Design your application as you want to use it /as it should work. When it gets busy, find out the bottlenecks and fix them.
If you want to design your application for scalability to the millions, consider using a different database engine / programming platform altogether.
It looks like you aren't recording users and pages as separate entities. In this case, LIKES should be the "many" side of the relationship, and there should be another table called APPS (or any name you wish) to store the pages:
**USER**
+---------+-------+-----+
| user_id | name | ....|
+---------+-------+-----+
| 1 | ... | ... |
+---------+-------+-----+
| 2 | ... | ... |
+---------+-------+-----+
**APPS**
+---------+-------+-----+
| app_id | name | ....|
+---------+-------+-----+
| 1 | ... | ... |
+---------+-------+-----+
| 2 | ... | ... |
+---------+-------+-----+
**LIKES**
+---------+-------+----------+----------+
| like_id |user_id| app_id | is_liked |
+---------+-------+----------+----------+
| 1 | 1 | 2 | 1 |
+---------+-------+----------+----------+
| 2 | 1 | 3 | 0 |
+---------+-------+----------+----------+
Here you can toggle is_liked depending on whether the user clicks like (is_liked = 1) or unlike (is_liked = 0) on the page.
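The toggle could be sketched roughly like this, again with Python's sqlite3 standing in for PHP/MySQL. The lowercase table name likes, the UNIQUE constraint on (user_id, app_id), and the helpers set_like and like_count are assumptions for illustration. The row is updated if it exists and inserted otherwise, so each user has at most one row per page; in MySQL the same effect is commonly achieved with INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE likes (
    like_id  INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL,
    app_id   INTEGER NOT NULL,
    is_liked INTEGER NOT NULL,
    UNIQUE (user_id, app_id)
)
""")


def set_like(conn, user_id, app_id, liked):
    """Record a like (liked=True) or unlike (liked=False) without deleting rows."""
    cur = conn.execute(
        "UPDATE likes SET is_liked = ? WHERE user_id = ? AND app_id = ?",
        (int(liked), user_id, app_id),
    )
    if cur.rowcount == 0:
        # No existing row for this user and page, so create one.
        conn.execute(
            "INSERT INTO likes (user_id, app_id, is_liked) VALUES (?, ?, ?)",
            (user_id, app_id, int(liked)),
        )


def like_count(conn, app_id):
    """Number of users who currently like the page."""
    sql = "SELECT COUNT(*) FROM likes WHERE app_id = ? AND is_liked = 1"
    return conn.execute(sql, (app_id,)).fetchone()[0]


# User 1 likes page 2, then unlikes it; user 2 likes page 2.
set_like(conn, 1, 2, True)
set_like(conn, 1, 2, False)
set_like(conn, 2, 2, True)
```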

Constructing a simple recommendation engine

Both users and pages on my website have IDs. When a user goes on a certain page, their userID and the pageID will be written to a MySQL table as such:
userID | pageID
3 | 1
2 | 1
3 | 2
etc...
In this table, called user_pages, I would end up with a bunch of raw data that can be turned into a recommendation engine. What I mean by recommendation engine - I want to analyze historical data, and be able to predict, based on a set of viewed pages, the next pages that a user may like. Let's say there is a strong correlation between visiting page with ID 3 after going to pages with IDs 4, 9, 15. If a user goes on pages 4, 9, and 15, then the engine should recommend page 3.
I think I have all of the data input code necessary for creating this. How would I write something that analyzes the data for correlation of pages (i.e. almost everyone who visited page 5 visited page 1 also), and somehow use that to predict in the future the pages that a user may end up liking?
Recommendation systems are a big part of AI research. I believe you are interested in a collection of algorithms called collaborative filtering. Since the Netflix Prize in 2007 this field has developed greatly. I would recommend going here and having a read. It explains the basic concepts of recommender systems in a succinct and clear way and also provides a link to Java source code for an approach to the Netflix project, MemReader. You could examine this source code and extrapolate the basic algorithms for building a recommendation engine.
Alternatively if you want a more mathematical explanation of the algorithms employed go here.
It shouldn't take too long to implement at all.
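To make the idea concrete, here is a very small co-occurrence sketch in Python. It is not the algorithm from the linked material, only a simple count in the spirit of item-based collaborative filtering: pages are scored by how often they were visited by the same users as the pages already viewed. The function names and the sample rows are hypothetical; the rows have the same (userID, pageID) shape as the user_pages table.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_visit_counts(user_pages):
    """Map each page to a Counter of other pages visited by the same users."""
    pages_by_user = defaultdict(set)
    for user_id, page_id in user_pages:
        pages_by_user[user_id].add(page_id)

    counts = defaultdict(Counter)
    for pages in pages_by_user.values():
        # Every ordered pair of distinct pages seen by one user counts once.
        for a, b in permutations(pages, 2):
            counts[a][b] += 1
    return counts


def recommend_for(viewed, counts, limit=3):
    """Rank unseen pages by how often they co-occur with the viewed pages."""
    scores = Counter()
    for page in viewed:
        for other, n in counts.get(page, {}).items():
            if other not in viewed:
                scores[other] += n
    return [page for page, _ in scores.most_common(limit)]


# Hypothetical visits: users 1 and 2 saw pages 4, 9, 15 and 3; user 3 saw 4 and 7.
user_pages = [
    (1, 4), (1, 9), (1, 15), (1, 3),
    (2, 4), (2, 9), (2, 15), (2, 3),
    (3, 4), (3, 7),
]
counts = co_visit_counts(user_pages)
suggestions = recommend_for({4, 9, 15}, counts)
```

Raw counts favour popular pages, so a real system would usually normalise the scores in some way.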
This post posed a similar question: Advanced MySQL: Find correlations between poll responses
I think you would be able to generate a similar response if your primary data table had one additional field in it, specifically the id of the page the user last visited or visited immediately afterwards.
Something like this:
+------+----------+--------------+----------+
| id | page_id | next_page_id | user_id |
+------+----------+--------------+----------+
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 3 | 1 | 2 | 3 |
| 4 | 1 | 2 | 4 |
| 5 | 2 | 3 | 1 |
| 6 | 2 | 3 | 2 |
| 7 | 2 | 3 | 3 |
| 8 | 2 | 4 | 4 |
| 9 | 3 | 5 | 1 |
+------+----------+--------------+----------+
Then you should be able to use a modified version of one of the SQL queries suggested there to generate a list of high-correlation recommendations between the current page and the next page.
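As an illustration of what such a query would compute, the sketch below counts page-to-next-page transitions in Python instead of SQL. The rows are copied from the table above as (page_id, next_page_id, user_id); the function names are hypothetical, and skipping rows where a page points to itself is an assumption.

```python
from collections import Counter, defaultdict

# (page_id, next_page_id, user_id), taken from the table above.
visits = [
    (1, 1, 1), (1, 2, 2), (1, 2, 3), (1, 2, 4),
    (2, 3, 1), (2, 3, 2), (2, 3, 3), (2, 4, 4),
    (3, 5, 1),
]


def build_transitions(rows):
    """Count how often each page was followed by each other page."""
    transitions = defaultdict(Counter)
    for page_id, next_page_id, _user_id in rows:
        # A page followed by itself is not a useful recommendation.
        if page_id != next_page_id:
            transitions[page_id][next_page_id] += 1
    return transitions


def recommend_next(transitions, page_id, limit=3):
    """Return the most frequent follow-up pages for the current page."""
    followers = transitions.get(page_id, Counter())
    return [page for page, _ in followers.most_common(limit)]


transitions = build_transitions(visits)
```

With this data, page 2 is most often followed by page 3, so page 3 would be the first recommendation for a user currently on page 2.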
