Creating a recommendation algorithm based on user interests

I'm currently building an application that recommends websites based on their tags.
When a user registers on my website, they fill out their interests. Here is a sample:
football, model trains, hockey
The interests are separated by commas. When the user clicks register, they are saved in my database. This is the design of my table:
userID | name | interest
001 | John Doe | sports, model trains, hockey
On the other hand, I also have users on my site who upload website URLs and create tags related to them. This is my database design for that:
postID | title | tags
001 | techcrunch.com | technology,softwares,startups
002 | nba.com | basketball,sports,all-star
003 | tmz.com | gossip, showbiz
The logic is that I want to recommend NBA.com to user John Doe, since NBA.com has the tag sports and John Doe's interests include sports.
Do you have any idea how to do that? As a follow-up question: is the database design correct, or should I create a new table to store all the tags? Something like that (not sure though).
Your help would be greatly appreciated and rewarded! Thanks in advance! :)

I would normalize the database so that tags live in a separate table, with relationship tables connecting them to users and URLs. For example:
User table:
UserId | Name
001 | John Doe
TagUserRelation:
UserId | TagId
001 | 001
Tag table:
TagId | TagName
001 | Sports
TagUrlRelation:
TagId | Url
001 | nba.com
001 | nhl.com
To increase performance, I would then create indexed views with the necessary joins and implement stored procedures to work with them.
An alternative, as mentioned, is full-text search, but this will be much slower and is generally not considered good database design in this case.

This can be done using full-text search.

You should create two separate tables that hold a single tag per row, with several rows for each person or post.
You can create a multi-column primary key for each table if you wish.
userID | interest
001 | sports
001 | model trains
001 | hockey
...
and the same way for posts:
postID | tags
003 | gossip
003 | showbiz
...
This greatly enhances your chances to write efficient SQL.
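As a rough sketch of how existing comma-separated values could be turned into one row per tag, the snippet below splits a stored string and expands it into (id, tag) pairs. The function names splitTags and toTagRows are made up for illustration, and lowercasing the tags is an assumption about how you want them matched.

```php
<?php
// Hypothetical helper: turn one comma-separated column value into clean, unique tags.
function splitTags(string $csv): array
{
    $tags = array_map('trim', explode(',', $csv));
    $tags = array_map('strtolower', $tags); // assumption: tags match case-insensitively
    // array_filter drops empty strings left by stray commas; array_unique removes repeats.
    $tags = array_unique(array_filter($tags, 'strlen'));
    return array_values($tags);
}

// Hypothetical helper: expand id => csv pairs into one [id, tag] row per tag.
function toTagRows(array $csvById): array
{
    $rows = [];
    foreach ($csvById as $id => $csv) {
        foreach (splitTags($csv) as $tag) {
            $rows[] = [$id, $tag];
        }
    }
    return $rows;
}

$rows = toTagRows(['001' => 'sports, model trains, hockey']);
```

Each resulting pair can then be inserted into the one-row-per-tag table with a prepared statement.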

It would be much better to store the tags separately, so that you have a table for the tags and two more tables: one for the relationship between users and tags, and one for the relationship between posts and tags.
users
----------------------------------------
userId | name | password | ....
1 | John Doe | $p$fgA |
tags
--------------------
tagId | tagname
1 | basketball
2 | hockey
user_interests
----------------------------
id | user_id | tag_id
1 | 1 | 1
2 | 1 | 2
post_tags
--------------------------
id | post_id | tag_id
1 | 1 | 2
Then you use JOINs to get the required information.
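One way such a JOIN could look is sketched below. It is only an illustration: an in-memory SQLite database stands in for MySQL, the posts table and the sample rows are assumptions, and recommendPosts is a made-up function name. The query returns every post that shares at least one tag with the user, ordered by how many tags they share.

```php
<?php
// In-memory SQLite stands in for MySQL here (assumption); the JOIN itself is plain SQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE users (userId INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE posts (postId INTEGER PRIMARY KEY, title TEXT)'); // assumed table
$pdo->exec('CREATE TABLE tags (tagId INTEGER PRIMARY KEY, tagname TEXT UNIQUE)');
$pdo->exec('CREATE TABLE user_interests (id INTEGER PRIMARY KEY, user_id INTEGER, tag_id INTEGER)');
$pdo->exec('CREATE TABLE post_tags (id INTEGER PRIMARY KEY, post_id INTEGER, tag_id INTEGER)');

// Sample rows, invented for the sketch.
$pdo->exec("INSERT INTO users (userId, name) VALUES (1, 'John Doe')");
$pdo->exec("INSERT INTO posts (postId, title) VALUES (1, 'techcrunch.com'), (2, 'nba.com'), (3, 'tmz.com')");
$pdo->exec("INSERT INTO tags (tagId, tagname) VALUES (1, 'sports'), (2, 'hockey'), (3, 'technology'), (4, 'gossip')");
$pdo->exec('INSERT INTO user_interests (user_id, tag_id) VALUES (1, 1), (1, 2)');
$pdo->exec('INSERT INTO post_tags (post_id, tag_id) VALUES (1, 3), (2, 1), (3, 4)');

// Hypothetical helper: posts sharing at least one tag with the user, best match first.
function recommendPosts(PDO $pdo, int $userId): array
{
    $sql = 'SELECT p.title, COUNT(*) AS shared_tags
            FROM user_interests ui
            JOIN post_tags pt ON pt.tag_id = ui.tag_id
            JOIN posts p ON p.postId = pt.post_id
            WHERE ui.user_id = :userId
            GROUP BY p.postId, p.title
            ORDER BY shared_tags DESC, p.title';
    $stmt = $pdo->prepare($sql);
    $stmt->execute(['userId' => $userId]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$recommended = recommendPosts($pdo, 1);
```

With the sample rows above, only nba.com shares a tag (sports) with John Doe, so it is the only post returned.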

Related

Change architecture of User Table from old design to new

I am looking for some general information regarding user table design.
I have an old table design for 'users', which I need to update without breaking the entire site's structure.
Current Table Design
UserID | Email | FirstName | Last Name | ...
1 | a#a.com | John | Doe | ...
2 | b#b.com | Jane | Doe | ...
I need to be able to create "Primary" users, as well as "Assistant" users.
Now I believe I should have a few tables designed:
Users
Accounts
Users > Accounts - (Relationships & Permissions)
Example of Users > Accounts:
TableID | UserID | AccountID | PERM
1 | 1 | 1 | 001
So I guess my question is: is there a better way to do this? Specifically, is there an established design already in use?
Hope this makes sense. Any direction in this would be greatly appreciated.

Here's an example where you'd have a table for each group, plus a users table. You can filter the users by group using a JOIN. Personally I don't love this. If anyone else has a better suggestion, I'd like to hear it.
http://sqlfiddle.com/#!9/993dd/1

Dynamic Multiple dropdown select options from MySQL using PHP

I'm trying to come up with some sort of solution to a problem where I have to provide a user with dynamic dropdowns depending on the options they choose.
Currently I have 3 tables that are normalized as such.
Currently this works well with my HTML select elements, where if I select John Doe I would get Paul, Kevin and Dick as my second options and if I were to choose Kevin I would get Drake and Kanye as a third option.
My issue is that I do not want to keep creating tables since I would like to add more layers of staff_level in my application.
How would I approach this and have a fully dynamic table structure using PHP and MySQL?
Thank you for taking your time to read this.

You want an association table between the people. Put all of them in one table with unique IDs like so:
Table Staff
id | Name | <Other fields>
----+-------------+----------
1 | John Doe |
2 | Sam Smith |
3 | John Johns |
4 | Paul Pete |
5 | Kevin Mayor |
6 | Dick Ross |
...
Then add the association table, named whatever you like, for example StaffRelationships:
Table StaffRelationships
id | ManagerId | SubordinateId
---+-----------+--------------
* | Null | 1 # Has no manager
* | 2 | 6 # Dick Ross is subordinate to Sam Smith
This table should have an id field for unique keys, but you don't have to care about what it is (it's not used as a Foreign Key as the Staff.id field is), which is why I put * there - in reality it would be some integer id.
I haven't seen your PHP for pulling values out of the database, but it is basically the same - query the association table filtering for the id of the manager you are looking for and you will get the ids of the subordinates (which you can JOIN on the staff table to get the names).
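A minimal sketch of that lookup follows, assuming an in-memory SQLite database in place of MySQL. The rows placing Paul Pete and Kevin Mayor under John Doe are invented for illustration, and subordinatesOf is a hypothetical helper name. The same query serves every level of the dropdown, because each level is just "subordinates of the selected id".

```php
<?php
// In-memory SQLite stands in for MySQL here (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE Staff (id INTEGER PRIMARY KEY, Name TEXT)');
$pdo->exec('CREATE TABLE StaffRelationships (
    id INTEGER PRIMARY KEY,
    ManagerId INTEGER NULL REFERENCES Staff(id),
    SubordinateId INTEGER NOT NULL REFERENCES Staff(id))');

$pdo->exec("INSERT INTO Staff (id, Name) VALUES
    (1, 'John Doe'), (2, 'Sam Smith'), (3, 'John Johns'),
    (4, 'Paul Pete'), (5, 'Kevin Mayor'), (6, 'Dick Ross')");
// (NULL, 1) and (2, 6) come from the tables above; (1, 4) and (1, 5) are assumed sample rows.
$pdo->exec('INSERT INTO StaffRelationships (ManagerId, SubordinateId) VALUES
    (NULL, 1), (2, 6), (1, 4), (1, 5)');

// Hypothetical helper: returns id => Name for everyone directly under $managerId.
function subordinatesOf(PDO $pdo, int $managerId): array
{
    $stmt = $pdo->prepare(
        'SELECT s.id, s.Name
         FROM StaffRelationships r
         JOIN Staff s ON s.id = r.SubordinateId
         WHERE r.ManagerId = :managerId
         ORDER BY s.id'
    );
    $stmt->execute(['managerId' => $managerId]);
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR); // first column becomes the key
}

$options = subordinatesOf($pdo, 1);
```

The returned array can be looped over directly to print the option elements of the next select.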

PHP & MySQL Tagging system logic

I am a beginner developer and I would like to ask for some advice.
I am currently building a platform where people will be allowed to upload images and tag them.
I was reading through some articles that use the following structure to store tags.
Storing Logic 1
| photo_id | name | tags |
| 1 | some photo | flower, sun, island, beach |
| 2 | some photo2 | hawaii, travel, surf |
A lot of people said this is not such a good idea.
So here is my logic (logic 2).
I was reading about many-to-many relations and I came up with this logic
Tags table
| tag_id | name |
-----------------------
| 1 | flower |
| 2 | hawaii |
| 3 | surfing |
| 4 | island |
| 5 | travel |
Photos table
| photo_id | name |
---------------------------
| 1 | some photo |
| 2 | some photo2 |
Relation table
| tag_id | photo_id |
---------------------------
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
I have chosen to use the Laravel framework to make the development easier.
But my problem is with logic 2: I am afraid it will lead to long load times.
Because there will be no default tags, only user-created ones, I thought about the following logic.
The user uploads the image with tags. Before the image is saved, check whether each tag already exists and save it if not, then return the tag_id and save it to the relation table together with the photo_id.
So I have 2 questions:
Which logic is better and why?
If logic 2, is the way I thought it up good? And should I worry about the load time in the future, when there will be a lot of tags?
Thank you

I would go with the second one. I wouldn't worry about load times. You can easily get the tags with joins.
You can also add an id column on the relation table if you want a single-column key. Multiple images can already share a tag, because each (tag_id, photo_id) pair is its own row.

In your second example, your relation table should have indexes, so that when you look for all the tags based on a specific photo_id, the answer will be rapidly returned.
See also Foreign Keys
In your relation table, tag_id is a foreign key into your tag table and photo_id is a foreign key into the photo table. Tags may have a relationship to more than 1 photo and a photo may have a relationship to more than one tag.
Similarly the names of your tags (and photos) should also be indexed for rapid searching.
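The sketch below puts these pieces together: a unique index on the tag name, a composite primary key plus a second index on the relation table, and the "create the tag if it does not exist, then link it" step from the question. It is an illustration under assumptions: SQLite in memory stands in for MySQL (so INSERT OR IGNORE is used where MySQL would use INSERT IGNORE), the table name photo_tag is made up, and attachTags and tagsForPhoto are hypothetical helpers.

```php
<?php
// In-memory SQLite stands in for MySQL here (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// UNIQUE on name gives an index for fast lookups and blocks duplicate tags.
$pdo->exec('CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)');
$pdo->exec('CREATE TABLE photos (photo_id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
// The composite primary key indexes lookups by tag; the extra index covers lookups by photo.
$pdo->exec('CREATE TABLE photo_tag (
    tag_id INTEGER NOT NULL REFERENCES tags(tag_id),
    photo_id INTEGER NOT NULL REFERENCES photos(photo_id),
    PRIMARY KEY (tag_id, photo_id))');
$pdo->exec('CREATE INDEX photo_tag_photo ON photo_tag (photo_id)');

// Hypothetical helper: create any missing tags, then link them to the photo.
function attachTags(PDO $pdo, int $photoId, array $tagNames): void
{
    // SQLite syntax; the MySQL spelling would be INSERT IGNORE.
    $insertTag = $pdo->prepare('INSERT OR IGNORE INTO tags (name) VALUES (:name)');
    $findTag = $pdo->prepare('SELECT tag_id FROM tags WHERE name = :name');
    $link = $pdo->prepare('INSERT OR IGNORE INTO photo_tag (tag_id, photo_id) VALUES (:tag, :photo)');

    $pdo->beginTransaction();
    foreach ($tagNames as $name) {
        $name = strtolower(trim($name)); // assumption: tags are case-insensitive
        if ($name === '') {
            continue;
        }
        $insertTag->execute(['name' => $name]);
        $findTag->execute(['name' => $name]);
        $tagId = (int) $findTag->fetchColumn();
        $findTag->closeCursor();
        $link->execute(['tag' => $tagId, 'photo' => $photoId]);
    }
    $pdo->commit();
}

// Hypothetical helper: all tag names for one photo, alphabetically.
function tagsForPhoto(PDO $pdo, int $photoId): array
{
    $stmt = $pdo->prepare(
        'SELECT t.name FROM photo_tag pt
         JOIN tags t ON t.tag_id = pt.tag_id
         WHERE pt.photo_id = :photo ORDER BY t.name'
    );
    $stmt->execute(['photo' => $photoId]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

$pdo->exec("INSERT INTO photos (photo_id, name) VALUES (1, 'some photo'), (2, 'some photo2')");
attachTags($pdo, 1, ['flower', 'island']);
attachTags($pdo, 2, ['Island', 'travel']); // "Island" reuses the existing tag row
```

Because the tag name is unique, the second upload reuses the existing island row instead of creating a duplicate.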

Tagging hierarchy and search

I am trying to create a relatively simple hierarchical tagging system that can be searched. Here's how it works as of now, this is the MySQL table structure:
--------------------------------------------
id | tag | parentID | topParentID |
--------------------------------------------
1 | Boston | NULL | NULL |
--------------------------------------------
2 | Events | 1 | 1 |
--------------------------------------------
3 | June 30th | 2 | 1 |
--------------------------------------------
4 | NYC | NULL | NULL |
--------------------------------------------
5 | Attractions | 4 | 4 |
--------------------------------------------
So, if a user types Boston in the search bar, they will be delivered the suggestions "Boston Events" and "Boston Events June 30th". Similarly, if they type NYC in the search bar, they will be delivered "NYC Attractions" as a suggestion.
Also, if someone typed Events into the search bar, they would get the suggestion "Boston Events" or if they typed June 30th, they would get the suggestion "Boston Events June 30th"
I've messed around with code to do this, and I can definitely break the query string into keywords then search the tag table for each of the keywords and return matches, but I have not found the correct way to return the full tag strings in the format I mentioned above.

Well, you can join the same table twice. Suppose $id holds the id of the current tag:
SELECT
    tags.id,
    tags.tag,
    parent_tags.id AS parent_id,
    parent_tags.tag AS parent_tag,
    parent2_tags.id AS top_parent_id,
    parent2_tags.tag AS top_parent_tag
FROM
    tags
INNER JOIN
    tags AS parent_tags
ON
    tags.parentID = parent_tags.id
INNER JOIN
    tags AS parent2_tags
ON
    tags.topParentID = parent2_tags.id
WHERE
    tags.id = $id
But for second-level tags it will return the same row as both parent and top parent, because in your data parentID and topParentID are equal there (parent.id = parent2.id).
Actually, this is a very primitive solution that can display only 2 levels of hierarchy in 1 request. If you want to support any number of levels, read about nested sets on Stack Overflow. There is also a great book: "Trees and Hierarchies in SQL for Smarties" by Joe Celko.

I think that you could delete the topParentID column and add one called "level" (Boston would have level 0, Events level 1, June 30th level 2).
Then you could order by this level column and implode the values, so you would have something like what you want.
You can do that without the level column, but I think it will be a lot more work on the php side.
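For the variant without a level column, one possible PHP-side approach is to load the rows keyed by id and walk the parentID chain upward, then implode the collected names. This is only a sketch: the array mirrors the table from the question, and fullTagPath is a made-up function name.

```php
<?php
// Rows as they might come back from "SELECT id, tag, parentID FROM tags", keyed by id.
$tags = [
    1 => ['tag' => 'Boston', 'parentID' => null],
    2 => ['tag' => 'Events', 'parentID' => 1],
    3 => ['tag' => 'June 30th', 'parentID' => 2],
    4 => ['tag' => 'NYC', 'parentID' => null],
    5 => ['tag' => 'Attractions', 'parentID' => 4],
];

// Hypothetical helper: walk up the parentID chain and join the names, root first.
function fullTagPath(array $tags, int $id): string
{
    $parts = [];
    $seen = []; // guards against accidental cycles in the data
    while ($id !== null && isset($tags[$id]) && !isset($seen[$id])) {
        $seen[$id] = true;
        array_unshift($parts, $tags[$id]['tag']);
        $id = $tags[$id]['parentID'];
    }
    return implode(' ', $parts);
}

$suggestion = fullTagPath($tags, 3);
```

After matching a keyword against the tag column, calling this helper on each matching id gives the full string for that suggestion.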

How do I make replies to comments? (PHP)

I want to create something like reddit where they have comments, then replies to the comment, then reply to the reply.
What type of database structure do they use so:
1. they keep track of all the comments to a posting
2. a reply to a comment
3. a reply to a reply
All I have right now is just a posting and a bunch of comments relating to it, like:
POSTING TABLE
posting_id | title | author
COMMENTS TABLE
comment_id | posting_id | comment
REPLIES TABLE
????
How do I relate the comments to the replies?
What type of css do they use to give replies that indented space?
EDIT:
Thanks for the answers! Now my only question is: how do I indent the replies?
Such as..
you like food
yes I love italian
Yes i do like it too
chinese is best

You can add another column to your comments table specifying parent_comment_id where you populate it with the ID of the comment (or reply) the user is replying to. In the case where the comment is a direct reply to the post (not a reply to a comment) this column would be null.
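As a rough sketch of how rows with a parent_comment_id could be turned into a nested thread, the snippet below groups the flat rows by parent and then renders them recursively. The sample rows and their nesting are invented, groupByParent and renderThread are hypothetical helpers, and plain-text indentation is used only to keep the output easy to inspect.

```php
<?php
// Flat rows as they might come from the comments table (sample data and nesting are assumed).
$rows = [
    ['comment_id' => 1, 'parent_comment_id' => null, 'comment' => 'you like food'],
    ['comment_id' => 2, 'parent_comment_id' => 1, 'comment' => 'yes I love italian'],
    ['comment_id' => 3, 'parent_comment_id' => 2, 'comment' => 'Yes i do like it too'],
    ['comment_id' => 4, 'parent_comment_id' => 1, 'comment' => 'chinese is best'],
];

// Hypothetical helper: index the rows by the comment they reply to.
function groupByParent(array $rows): array
{
    $children = [];
    foreach ($rows as $row) {
        $key = $row['parent_comment_id'] ?? 0; // 0 stands for "reply to the post itself"
        $children[$key][] = $row;
    }
    return $children;
}

// Hypothetical helper: render a comment, then its replies one level deeper.
function renderThread(array $children, int $parentId = 0, int $depth = 0): string
{
    $out = '';
    foreach ($children[$parentId] ?? [] as $row) {
        $out .= str_repeat('  ', $depth) . $row['comment'] . "\n";
        $out .= renderThread($children, $row['comment_id'], $depth + 1);
    }
    return $out;
}

$thread = renderThread(groupByParent($rows));
```

Grouping first means the comments for a posting can be fetched with a single query, and the recursion then runs over the in-memory array rather than the database.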

To show replies inside replies, you'll have to make a recursive call to keep generating the sub-replies.
Something like
function get_comments($comment_id) {
    // comment_has_reply() and comment_comments() are placeholder helpers
    // that you would implement with queries against your comments table.
    print '<div class="comment_body">';
    // print the comment body here
    if (comment_has_reply($comment_id)) {
        foreach (comment_comments($comment_id) as $comment) {
            get_comments($comment->id); // recurse into each reply
        }
    }
    print '</div>';
}
To indent comments, however, use CSS.
<style type="text/css">
.comment_body {
    margin-left: 10px;
}
</style>
Because each reply's div is nested inside its parent's div, sub-replies are indented more than the parent, their sub-replies are indented even more, and so on.

I would do that by making a cross reference table.
Example:
Table: Posts
Columns: pstkey | userid | postMessage | etc...
pstkey is the key for the post body. userid is the person who created the post. postMessage is the actual post entry.
Table: Comments
Columns: comkey | pstkey | userid | commentMessage | etc...
comkey is the key for the comment, which is tied to the post through pstkey. userid is the person who made the comment, and commentMessage is the text body of the actual comment.
Table: xref_postComm
Columns: xrefkey | pstkey | comkey | comkey2 |
Now for the fun part. ALL posts go into post table. ALL comments go into comment table. The relationships are all defined in the Cross Reference Table.
I do all of my programming this way. I was privileged to work with one of the world's best database engineers, who was retired, and he taught me a few tricks.
How to use the Cross Reference table:
xrefkey | pstkey | comkey | comkey2
All that you look at is which fields are populated.
xrefkey (auto-incremented)
pstkey (contains the pstkey of the post)
comkey (contains the comkey of the comment on the post)
comkey2 (contains the comkey of the reply to that comment;
only populate comkey2 if comkey already has a value)
See, no reason for a third table!
With this method you can add as many relationships as you want, now or in the future.
comkey2 holds the reply: this single row contains the key of the post, the key of the comment, and the key of the reply to that comment. All done by populating the xref table.
EXAMPLE:
PAGES.... Page table
POSTS
pstkey | pageid | user | Post
-------------------------------------
1 | 1 | 45 | Went to the store the....
2 | 2 | 18 | Saw an apple on tv.....
COMMENTS
comkey | pstkey | user | Comment
-----------------------------------------------
1 | 1 | 9 | Wanted to say thanks...
2 | 1 | 7 | Cool I like tha.....
3 | 2 | 3 | Great seeing ya....
4 | 2 | 6 | Had a great....
5 | 2 | 2 | Don't sweat it man...
xref_PostCom
xrefkey | pageid | pstkey | comkey | comkey2 | Notes
----------------------------------------------
1 | 1 | 1 | NULL | NULL | Post1 on Page1
2 | 1 | 1 | 1 | NULL | Comment1 under Post1
3 | 1 | 1 | 2 | NULL | Comment2 under Post1
4 | 2 | 2 | NULL | NULL | Post2 on Page2
5 | 2 | 2 | 3 | NULL | Comment3 under Post2 on Page2
6 | 2 | 2 | 4 | NULL | Comment4 under Post2 on Page2 (a second comment)
7 | 2 | 2 | 4 | 5 | Explained below....
Comment key 5 is matched with comment key 4....under post2 on Page 2
If you know anything about JOIN, LEFT JOIN, RIGHT JOIN, and INNER/OUTER JOIN, creating SELECTs to get the data arrays using these relationships makes your job a whole lot easier.
I believe the engineers basically call this "the data map" of defined relationships. The trick is how you access the data using these relationships. It seems hard at first, but now that I know it, I refuse to do it any other way.
What happens in the end is that you write 1 script that says: go do everything and come back. You end up with 1 function call that asks for page 1. It returns page 1, post 1, comments 1, 2 and 3, and the replies to them in 1 array. Echo to output and done.
UPDATE FOR COMMENT
I said the exact same thing the first time it was shown to me. As a matter of fact, it really made me mad that the database programmer was forcing me to do it this way. But now I get it. The advantages are many.
1) 1 query can be written to pull it all out in 1 shot.
2) Answers from multiple queries can populate arrays in a structure such that, when printing the page, a loop within a loop can display it.
3) Upgrading the software that uses it can support any design change you can think of. Flawless expandability.
The guy who taught it to me was the hired gun who redesigned the Sears and JCPenney databases, back when they had 9 books going to the same house because of duplicate-record issues.
Cross reference tables prevent a lot of issues in design.
The heart of this theory is that a column can not only contain data but also serve as a true-or-false statement at the same time. That in itself saves space. Why search 20 tables when you can search one? 1 indexed cross reference table can tell you everything you need to know about the other 20 tables: their contents, what you need, what you don't need, and whether you even need to open the other table at all.
IN SHORT:
1 cross reference table containing nothing but INT(2/11) columns tells you everything you need to know before you ever open another table. It offers not only flawless expandability but also lightning-speed results, not to mention little possibility of duplicate records. To you and me duplicate records may not be an issue. But to Sears, with 4 billion records at $11 a book, mistakes add up.

Add another field to your comments table, called "reply_to" or some such, and store in it the id of the comment being replied to.

You could make the comments table generic, like so:
COMMENTS TABLE
comment_id | posting_type | posting_id | comment
where posting_type is some sort of discriminator, eg a string 'POST' or 'COMMENT', or an integer for more efficiency (1 = POST, 2 = COMMENT, etc).
edit : admittedly this is more complicated but it means you can use the same comment table for comments on anything, not just posts and other comments.

You don't need the replies table. As others already correctly have pointed out, recursion is the way to go with an RDBMS. You could always consider using a nosql style DBMS, to avoid having to deal with recursion.
