Database schema and queries for activity stream in social network


To preface, I'm no DBA or SQL expert. But I've taken on a personal project that requires me to wear all hats in making a social network. (No, I'm not trying to reinvent Facebook. I'm targeting a niche audience.) And yes, I've heard of frameworks such as http://activitystrea.ms/, but I feel like data serialization should be a last resort for my needs.
Anyway, the question "How to implement the activity stream in a social network" helped me get the ball rolling, but I have some unanswered questions.
Below is my database schema (some rows have been omitted for simplification):
Action table:
id name
-------------
1 post
2 like
3 follow
4 favorite
5 tag
6 share
Activity table:
id (int)
user_id (int)
action_id (tinyint)
target_id (int)
object_id (tinyint)
date_created (datetime)
The object_id refers to which object type the target_id is. The idea here is to represent (User + Action + Target Object)
User Post(s) Media
User Favorite(s) Scene
User Follow(s) User
Object (type) table:
id name
-------------
1 media
2 scene
3 brand
4 event
5 user
The problem here is that each object has its own separate table. For example:
Media table:
id (int)
type (int)
thumbnail (varchar)
source (varchar)
description (varchar)
Event table:
id (int)
user_id (int)
name (varchar)
city (int)
address (varchar)
starts (time)
ends (time)
about (varchar)
User table:
id (int)
username (varchar)
profile_picture (varchar)
location (int)
What, then, would be the best (i.e., most efficient) way of querying this database?
Obviously I could perform a SELECT statement on the activity table, and then – based on the object_id – use conditional logic in PHP to make a separate query to the appropriate object's table (e.g., media).
Or would it be smarter (and more efficient) to implement some sort of LEFT or INNER JOIN on all 5 object tables, as suggested in "MySQL if statement conditional join"? I'm not entirely familiar with how JOINs work, or whether SQL is smart enough to scan only the appropriate object table for each activity row rather than ALL the joined tables.
Of course the first solution means MANY more calls to the database, which is less desirable. However, I'm not sure how else I could retrieve all the relevant columns (e.g., media "source", event "address") in just one query without implementing some conditional logic.

Suppose you change your activity table a little bit:
Activity table:
id (int)
user_id (int)
action_id (tinyint)
object_id (tinyint)
date_created (datetime)
and your join table for every target type:
activity_id (int)
target_id (int)
and finally your target table (media)
id (int)
type (int)
thumbnail (varchar)
source (varchar)
description (varchar)
and target table (event)
id (int)
user_id (int)
name (varchar)
city (int)
address (varchar)
starts (time)
ends (time)
about (varchar)
now, you can select the data with
SELECT
activity.id,
activity.user_id,
activity.action_id,
action.name,
activity.object_id,
object.name,
media.id as media_id,
media.type,
media.thumbnail,
media.source,
media.description,
event.id as event_id,
event.name,
...
FROM
activity
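LEFT JOIN action ON (action.id = activity.action_id)
LEFT JOIN object ON (object.id = activity.object_id)
LEFT JOIN mediaToActivity ON (mediaToActivity.activity_id = activity.id)
LEFT JOIN media ON (media.id = mediaToActivity.target_id)
LEFT JOIN eventToActivity ON (eventToActivity.activity_id = activity.id)
LEFT JOIN event ON (event.id = eventToActivity.target_id)
With this query you should get all activities in one query; in each row, only the columns of the target that actually exists are filled with data, and the rest are NULL. The join tables are LEFT JOINed because with INNER JOINs an activity would have to appear in both mediaToActivity and eventToActivity to be returned at all.
Note: I haven't tested this yet.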

I pieced together from your discussion what your solution was. Fiddle
create table activity (
id int,
user_id int,
action_id int,
target_id int,
object_id int,
date_created datetime
);
create table action (
id int,
name varchar(80)
);
create table object (
id int,
name varchar(80)
);
create table media (
id int,
type int,
thumbnail varchar(255),
source varchar(255),
description varchar(255)
);
create table event (
id int,
user_id int,
name varchar(255),
city int,
address varchar(255),
starts time,
ends time,
about varchar(255)
);
-- setup
insert into action values (1, "post");
insert into object values (1, "media");
insert into object values (2, "event");
-- new event
insert into event values (1, null, "breakfast", null, "123 main st", null, null, "we will eat");
insert into activity values (1, null, 1, 1, 2, null);
-- new media
insert into media values (1, null, null, null, "new media");
insert into activity values (2, null, 1, 1, 1, null);
SELECT *
FROM
activity
left join event on (event.id = activity.target_id and activity.object_id = 2)
left join media on (media.id = activity.target_id and activity.object_id = 1);
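The conditional LEFT JOIN above can be tried without a MySQL server. The following is a minimal sketch in Python, assuming SQLite (via the standard-library sqlite3 module) as a stand-in for MySQL and a trimmed-down version of the tables; the reduced column lists and the aliases a, e and m are assumptions made for brevity.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (id INTEGER PRIMARY KEY, action_id INT, target_id INT, object_id INT);
    CREATE TABLE media (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    INSERT INTO event VALUES (1, 'breakfast', '123 main st');
    INSERT INTO media VALUES (1, 'new media');
    INSERT INTO activity VALUES (1, 1, 1, 2);  -- activity 1 targets event 1 (object_id 2)
    INSERT INTO activity VALUES (2, 1, 1, 1);  -- activity 2 targets media 1 (object_id 1)
""")

# Each LEFT JOIN only matches when object_id names that table's object type,
# so the other table's columns come back as NULL (None in Python).
rows = conn.execute("""
    SELECT a.id, e.name, e.address, m.description
    FROM activity a
    LEFT JOIN event e ON e.id = a.target_id AND a.object_id = 2
    LEFT JOIN media m ON m.id = a.target_id AND a.object_id = 1
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Both activities share target_id 1, yet each row picks up columns from only one target table, because the object_id condition is part of the join condition rather than a WHERE filter.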

Related

Multiple JOINs to the same subquery expression

I've recently been teaching myself SQL, and have been working on a toy project to do so. Here is a sample schema:
CREATE TABLE user (
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
name VARCHAR(50)
);
INSERT INTO user(name) VALUES
("User 1"),
("User 2"),
("User 3"),
("User 4"),
("User 5");
CREATE TABLE friendship (
uid_1 INT,
uid_2 INT,
accepted_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (uid_1, uid_2),
CONSTRAINT fk_uid_1 FOREIGN KEY (uid_1) REFERENCES user (id),
CONSTRAINT fk_uid_2 FOREIGN KEY (uid_2) REFERENCES user (id)
);
INSERT INTO friendship(uid_1, uid_2) VALUES
(1, 2),
(2, 1);
CREATE TABLE event (
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
name VARCHAR(50),
owner_id INT,
CONSTRAINT fk_owner_id FOREIGN KEY (owner_id) REFERENCES user (id)
);
INSERT INTO event (name, owner_id) VALUES
("Event 1", 1),
("Event 2", 2),
("Event 3", 3),
("Event 4", 4),
("Event 5", 5),
("Event 6", 1);
CREATE TABLE invite (
event_id INT NOT NULL,
sent_from_id INT NOT NULL,
sent_to_id INT NOT NULL,
PRIMARY KEY (event_id, sent_to_id),
CONSTRAINT fk_event_id FOREIGN KEY (event_id) REFERENCES event (id),
CONSTRAINT fk_sent_from_id FOREIGN KEY (sent_from_id) REFERENCES user (id),
CONSTRAINT fk_sent_to_id FOREIGN KEY (sent_to_id) REFERENCES user (id)
);
INSERT INTO invite(event_id, sent_from_id, sent_to_id) VALUES
(1, 2, 3);
As part of this project, I have a query that gets a list of users, with information populated relative to the currently authenticated user.
A simplified version of the query looks like this:
$select_users_query = "
SELECT
user.id AS id,
user.name AS name,
friendship.accepted_time AS friend_since
FROM user
LEFT JOIN friendship
ON friendship.uid_1 = user.id AND friendship.uid_2 = $relative_to_id
";
Then, at some endpoints, I want to return objects which have one or more users as sub-objects. In order to do this, I've been JOINing tables to the above query as a subquery, but when the returned object has multiple users (e.g., an invite to an event has a sending user, a receiving user, and a user that owns the event in question), the resulting query can end up pretty repetitive:
$select_invites_query = "
SELECT
event.id AS event_id,
event.name AS event_name,
owner.id AS owner_id,
owner.name AS owner_name,
owner.friend_since AS owner_friend_since,
sent_to.id AS sent_to_id,
sent_to.name AS sent_to_name,
sent_to.friend_since AS sent_to_friend_since,
sent_from.id AS sent_from_id,
sent_from.name AS sent_from_name,
sent_from.friend_since AS sent_from_friend_since
FROM invite
INNER JOIN event
ON event.id = invite.event_id
INNER JOIN ($select_users_query) owner
ON event.owner_id = owner.id
INNER JOIN ($select_users_query) sent_from
ON invite.sent_from_id = sent_from.id
INNER JOIN ($select_users_query) sent_to
ON invite.sent_to_id = sent_to.id
";
My questions are:
Is repeating a subquery like this a performance issue during execution, assuming that the INNER JOINs all match on just a single row?
If not, is the additional parsing required for $select_invites_query a significant concern at all (especially as $select_users_query grows big and complex)?
Would using a variable here be a good idea, or a bad idea? From my inspection of EXPLAIN it seems as though MySQL is able to handle these JOINs pretty efficiently, but would defining a variable force MySQL to pull the unfiltered result set into memory before JOINing?
See SQL Fiddle schema here.

Since you appear to need several joins to the same subquery, consider a Common Table Expression (CTE, available in MySQL 8.0+) together with PHP parameterization. The following demonstrates this with PHP's mysqli API in both object-oriented and procedural styles:
$select_invites_query = "
WITH sub AS (
SELECT u.id AS id, u.`name` AS `name`,
f.accepted_time AS friend_since
FROM user u
LEFT JOIN friendship f
ON f.uid_1 = u.id AND f.uid_2 = ?
)
SELECT event.id AS event_id, event.`name` AS event_name,
owner.id AS owner_id, owner.`name` AS owner_name,
owner.friend_since AS owner_friend_since,
sent_to.id AS sent_to_id, sent_to.`name` AS sent_to_name,
sent_to.friend_since AS sent_to_friend_since,
sent_from.id AS sent_from_id, sent_from.`name` AS sent_from_name,
sent_from.friend_since AS sent_from_friend_since
FROM invite
INNER JOIN event
ON event.id = invite.event_id
INNER JOIN sub owner
ON event.owner_id = owner.id
INNER JOIN sub sent_from
ON invite.sent_from_id = sent_from.id
INNER JOIN sub sent_to
ON invite.sent_to_id = sent_to.id";
// OBJECT-ORIENTED STYLE
$conn = new mysqli("my_host", "my_user", "my_pwd", "my_db");
$stmt = $conn->prepare($select_invites_query);
$stmt->bind_param("i", $relative_to_id);
$stmt->execute();
...
// PROCEDURAL STYLE
$conn = mysqli_connect("my_host", "my_user", "my_pwd", "my_db");
$stmt = mysqli_prepare($conn, $select_invites_query);
mysqli_stmt_bind_param($stmt, "i", $relative_to_id);
mysqli_stmt_execute($stmt);
...
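To see the single-definition, multiple-alias pattern run end to end, here is a minimal sketch in Python, assuming SQLite (standard-library sqlite3) in place of MySQL 8.0 and a reduced copy of the question's schema. The sample date and the trimmed column lists are assumptions; the CTE takes one bound parameter, as in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friendship (uid_1 INT, uid_2 INT, accepted_time TEXT,
                             PRIMARY KEY (uid_1, uid_2));
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT, owner_id INT);
    CREATE TABLE invite (event_id INT, sent_from_id INT, sent_to_id INT);
    INSERT INTO user (name) VALUES ('User 1'), ('User 2'), ('User 3');
    INSERT INTO friendship VALUES (1, 2, '2020-01-01'), (2, 1, '2020-01-01');
    INSERT INTO event (name, owner_id) VALUES ('Event 1', 1);
    INSERT INTO invite VALUES (1, 2, 3);
""")

# The user subquery is written once as a CTE and joined under three aliases.
SELECT_INVITES = """
    WITH sub AS (
        SELECT u.id AS id, u.name AS name, f.accepted_time AS friend_since
        FROM user u
        LEFT JOIN friendship f ON f.uid_1 = u.id AND f.uid_2 = ?
    )
    SELECT event.name,
           owner.name, owner.friend_since,
           sent_from.name, sent_from.friend_since,
           sent_to.name, sent_to.friend_since
    FROM invite
    INNER JOIN event ON event.id = invite.event_id
    INNER JOIN sub owner ON event.owner_id = owner.id
    INNER JOIN sub sent_from ON invite.sent_from_id = sent_from.id
    INNER JOIN sub sent_to ON invite.sent_to_id = sent_to.id
"""

relative_to_id = 1  # the currently authenticated user
invites = conn.execute(SELECT_INVITES, (relative_to_id,)).fetchall()
print(invites)
```

Because the placeholder appears only once inside the CTE, a single bound value serves all three joins; with the string-interpolated subquery from the question, the value would be repeated in three places.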

Insert data in 3rd table with the values inserted in 2 other table

I have 3 tables in a Postgres database, created with this code:
CREATE TABLE AUTHOR(
ID SERIAL PRIMARY KEY,
NAME TEXT
);
CREATE TABLE BOOK(
ID SERIAL PRIMARY KEY,
NAME TEXT
);
CREATE TABLE BOOK_AUTHOR(
BOOK_ID INTEGER REFERENCES BOOK(ID),
AUTHOR_ID INTEGER REFERENCES AUTHOR(ID)
);
A book can have multiple authors.
I want to insert multiple authors into the AUTHOR table,
a book into the BOOK table,
and the pairs into the BOOK_AUTHOR table.
For example, if BOOK X is written by Mr. A and Mr. B,
I want the table contents to look like this:
AUTHOR
ID-NAME
1, Mr. A
2, Mr. B
BOOK
ID-NAME
1, X
BOOK_AUTHOR
BOOK_ID-AUTHOR_ID
1,1
1,2
I am using Postgres with PHP.
I know I can insert the data into the author table, insert the data into the book table, query both to get the ids,
and then insert into the book_author table.
But is there any way to insert this data more efficiently?
What is the best possible way?

PostgreSQL has a very handy RETURNING clause that you can use here like this:
WITH authors AS (
INSERT INTO
author (name)
VALUES
('Mr. A'), ('Mr. B')
RETURNING
id
), books AS (
INSERT INTO
book (name)
VALUES
('X')
RETURNING
id
)
INSERT INTO
book_author (book_id, author_id)
SELECT
b.id
, a.id
FROM
books b
, authors a;
Just make a Cartesian product of the output and use it as input for the third insert.
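For comparison, the step-by-step approach described in the question (insert, collect the generated ids, then insert the pairs) can be wrapped in a single transaction. The sketch below is in Python and assumes SQLite (standard-library sqlite3) instead of PostgreSQL; it does not use a data-modifying CTE, and the helper name add_book is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for PostgreSQL (assumption)
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book_author (
        book_id INTEGER REFERENCES book(id),
        author_id INTEGER REFERENCES author(id)
    );
""")

def add_book(conn, title, author_names):
    """Hypothetical helper: insert one book, its authors, and the pairs atomically."""
    with conn:  # commits on success, rolls back if any statement raises
        book_id = conn.execute(
            "INSERT INTO book (name) VALUES (?)", (title,)
        ).lastrowid
        author_ids = [
            conn.execute("INSERT INTO author (name) VALUES (?)", (name,)).lastrowid
            for name in author_names
        ]
        conn.executemany(
            "INSERT INTO book_author (book_id, author_id) VALUES (?, ?)",
            [(book_id, author_id) for author_id in author_ids],
        )
    return book_id

add_book(conn, "X", ["Mr. A", "Mr. B"])
pairs = conn.execute(
    "SELECT book_id, author_id FROM book_author ORDER BY author_id"
).fetchall()
print(pairs)
```

This costs more round trips than the single WITH ... RETURNING statement above, but the transaction still guarantees that either all three tables are updated or none are.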

Is there a way to strictly search for strings in mysql query

Suppose I have a table named 'employee' as below:
| name | skills |
|------------------------|
| john | PHP, HTML |
| RIcky | HTML5, PHP |
| ROman | HTML5, HTML|
I want to search for HTML strictly. I tried:
select * from employee where skills like %HTML% // shows all rows, but it should only display the rows for 'john' and 'ROman'.
select * from employee where skills like %HTML5% // shows all rows, but it should only display the rows for 'RIcky' and 'ROman'.
How can I do this directly in a MySQL query?
Updated:
I could not normalize the table because I am working on an automatic form builder, which means the skills are not static. There may be any values: in place of skills, the user may create other fields with options, much like a survey form builder.

This would be proper to use.
SELECT * FROM employee WHERE FIND_IN_SET( 'HTML' , REPLACE(skills, SPACE(1), '') ) > 0;
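FIND_IN_SET is specific to MySQL, so the query above cannot be run elsewhere as written. The following Python sketch shows the same idea (match whole list items, not substrings) against SQLite, assuming a delimiter-wrapped LIKE as a substitute technique; the helper name with_skill is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE employee (name TEXT, skills TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("john", "PHP, HTML"), ("RIcky", "HTML5, PHP"), ("ROman", "HTML5, HTML")],
)

def with_skill(conn, skill):
    """Hypothetical helper: names of employees whose list contains exactly `skill`."""
    # Strip spaces and wrap the list in commas, so 'PHP, HTML' becomes ',PHP,HTML,';
    # the pattern '%,HTML,%' can then only match a complete list item.
    sql = """
        SELECT name FROM employee
        WHERE ',' || REPLACE(skills, ' ', '') || ',' LIKE '%,' || ? || ',%'
        ORDER BY rowid
    """
    return [name for (name,) in conn.execute(sql, (skill,))]

print(with_skill(conn, "HTML"))
print(with_skill(conn, "HTML5"))
```

Like the FIND_IN_SET version, this removes every space, so it would not suit skill names that themselves contain spaces, and a search term containing % or _ would act as a wildcard.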

Not exactly what you've asked for:
You could normalize your database table(s) and instead of storing complex datatypes (a list of strings) in one field create three tables:
1) for the properties (skills)
2) for the entities that "have" certain properties (employees having certain skills)
3) a junction table where you store the information (a reference) which entity has which properties
This way your relational database system has a much better chance of using indices to find the appropriate data (using LIKE, string functions et al in a WHERE/ON clause usually causes a full table scan and for that you hardly need a database - you can do that with a flat file almost as easily).
E.g. (I didn't pay attention to the indices though)
<?php
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'localonly', 'localonly', array(
PDO::ATTR_EMULATE_PREPARES=>false,
PDO::MYSQL_ATTR_DIRECT_QUERY=>false,
PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION
));
setup($pdo);
// Which employees have the skill 'HTML' ?
$query = "
SELECT
e.name
FROM
skills as s
JOIN
employee_skills as x
ON
s.id=x.id_skill
JOIN
employees as e
ON
x.id_employee=e.id
WHERE
s.name = 'HTML'
";
foreach( $pdo->query($query) as $row ) {
echo $row['name'], "\r\n";
}
/* creating temporary test tables
and inserting sample data
*/
function setup($pdo) {
$pdo->exec('
CREATE TEMPORARY TABLE employees (
id int auto_increment,
name varchar(32),
primary key(id)
)
');
$pdo->exec('
CREATE TEMPORARY TABLE skills (
id int auto_increment,
name varchar(32),
primary key(id)
)
');
$pdo->exec('
CREATE TEMPORARY TABLE employee_skills (
id_employee int,
id_skill int,
unique(id_employee,id_skill)
)
');
$pdo->exec("
INSERT INTO employees (id, name) VALUES
(1, 'John'), (2,'Ricky'), (3,'Roman')
");
$pdo->exec("
INSERT INTO skills (id, name) VALUES
(1, 'PHP'), (2,'HTML'), (3,'HTML5')
");
$pdo->exec("
INSERT INTO employee_skills (id_employee, id_skill) VALUES
(1, 1), (1,2),
(2, 3), (2,1),
(3, 3), (3,2)
");
}

Try surrounding the text you are searching for with speechmarks:
select * from employee where skills like "%HTML%"
select * from employee where skills like "%HTML5%"

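Try a REGEXP-based search (on MySQL 8.0.4 and later, which uses ICU regular expressions, the word boundaries are written \\b instead of [[:<:]] and [[:>:]]):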
SELECT * FROM employee WHERE skills REGEXP '[[:<:]]HTML[[:>:]]'

MySQL inclusion/exclusion of posts

This post is taking a substantial amount of time to type because I'm trying to be as clear as possible, so please bear with me if it is still unclear.
Basically, what I have is a table of posts in the database to which users can add privacy settings.
ID | owner_id | post | other_info | privacy_level (int value)
From there, users can add their privacy details, allowing a post to be viewable by all (privacy_level = 0), friends (privacy_level = 1), no one (privacy_level = 3), or specific people or filters (privacy_level = 4). For privacy levels specifying specific people (4), the query will reference the table "post_privacy_includes_for" in a subquery to see if the user (or a filter the user belongs to) exists in a row in the table.
ID | post_id | user_id | list_id
Also, the user has the ability to prevent some people within a larger group from viewing their post by excluding them (e.g., having it set for everyone to view but hiding it from a stalker user). For this, another reference table is added, "post_privacy_exclude_from", which has the same layout as "post_privacy_includes_for".
My problem is that this does not scale. At all. At the moment, there are about 1-2 million posts, the majority of them set to be viewable by everyone. For each post on the page it must check to see if there is a row that is excluding the post from being shown to the user - this moves really slow on a page that can be filled with 100-200 posts. It can take up to 2-4 seconds, especially when additional constraints are added to the query.
This also creates extremely large and complex queries that are just... awkward.
SELECT t.*
FROM posts t
WHERE ( (t.privacy_level = 3
AND t.owner_id = ?)
OR (t.privacy_level = 4
AND EXISTS
( SELECT i.id
FROM PostPrivacyIncludeFor i
WHERE i.user_id = ?
AND i.thought_id = t.id)
OR t.privacy_level = 4
AND t.owner_id = ?)
OR (t.privacy_level = 4
AND EXISTS
(SELECT i2.id
FROM PostPrivacyIncludeFor i2
WHERE i2.thought_id = t.id
AND EXISTS
(SELECT r.id
FROM FriendFilterIds r
WHERE r.list_id = i2.list_id
AND r.friend_id = ?))
OR t.privacy_level = 4
AND t.owner_id = ?)
OR (t.privacy_level = 1
AND EXISTS
(SELECT G.id
FROM Following G
WHERE follower_id = t.owner_id
AND following_id = ?
AND friend = 1)
OR t.privacy_level = 1
AND t.owner_id = ?)
OR (NOT EXISTS
(SELECT e.id
FROM PostPrivacyExcludeFrom e
WHERE e.thought_id = t.id
AND e.user_id = ?
AND NOT EXISTS
(SELECT e2.id
FROM PostPrivacyExcludeFrom e2
WHERE e2.thought_id = t.id
AND EXISTS
(SELECT l.id
FROM FriendFilterIds l
WHERE l.list_id = e2.list_id
AND l.friend_id = ?)))
AND t.privacy_level IN (0, 1, 4))
AND t.owner_id = ?
ORDER BY t.created_at LIMIT 100
(mock up query, similar to the query I use now in Doctrine ORM. It's a mess, but you get what I am saying.)
I guess my question is, how would you approach this situation to optimize it? Is there a better way to set up my database? I'm willing to completely scrap the method I have currently built up, but I wouldn't know what to move onto.
Thanks guys.
Updated: Fix the query to reflect the values I defined for privacy level above (I forgot to update it because I simplified the values)

Your query is too long to give a definitive solution for, but the approach I would follow is to simplify the data lookups by converting the sub-queries into joins, and then build the logic into the WHERE clause and column list of the SELECT statement:
select t.*, i.*, r.*, G.*, e.* from posts t
left join PostPrivacyIncludeFor i on i.user_id = ? and i.thought_id = t.id
left join FriendFilterIds r on r.list_id = i.list_id and r.friend_id = ?
left join Following G on G.follower_id = t.owner_id and G.following_id = ? and G.friend = 1
left join PostPrivacyExcludeFrom e on e.thought_id = t.id and e.user_id = ?
(This might need expanding: I couldn't follow the logic of the final clause.)
If you can get the simple select working fast AND including all the information needed, then all you need to do is build up the logic in the select list and where clause.

Had a quick stab at simplifying this without re-working your original design too much.
Using this solution your web page can now simply call the following stored procedure to get a list of filtered posts for a given user within a specified period.
call list_user_filtered_posts( <user_id>, <day_interval> );
The whole script can be found here : http://pastie.org/1212812
I haven't fully tested all of this and you may find this solution isn't performant enough for your needs but it may help you in fine tuning/modifying your existing design.
Tables
I dropped your post_privacy_exclude_from table and added a user_stalkers table, which works pretty much like the inverse of user_friends. I kept the original post_privacy_includes_for table as per your design, as this allows a user to restrict a specific post to a subset of people.
drop table if exists users;
create table users
(
user_id int unsigned not null auto_increment primary key,
username varbinary(32) unique not null
)
engine=innodb;
drop table if exists user_friends;
create table user_friends
(
user_id int unsigned not null,
friend_user_id int unsigned not null,
primary key (user_id, friend_user_id)
)
engine=innodb;
drop table if exists user_stalkers;
create table user_stalkers
(
user_id int unsigned not null,
stalker_user_id int unsigned not null,
primary key (user_id, stalker_user_id)
)
engine=innodb;
drop table if exists posts;
create table posts
(
post_id int unsigned not null auto_increment primary key,
user_id int unsigned not null,
privacy_level tinyint unsigned not null default 0,
post_date datetime not null,
key user_idx(user_id),
key post_date_user_idx(post_date, user_id)
)
engine=innodb;
drop table if exists post_privacy_includes_for;
create table post_privacy_includes_for
(
post_id int unsigned not null,
user_id int unsigned not null,
primary key (post_id, user_id)
)
engine=innodb;
Stored Procedures
The stored procedure is relatively simple - it initially selects ALL posts within the specified period and then filters out posts as per your original requirements. I have not performance tested this sproc with large volumes but as the initial selection is relatively small it should be performant enough as well as simplifying your application/middle tier code.
drop procedure if exists list_user_filtered_posts;
delimiter #
create procedure list_user_filtered_posts
(
in p_user_id int unsigned,
in p_day_interval tinyint unsigned
)
proc_main:begin
drop temporary table if exists tmp_posts;
drop temporary table if exists tmp_priv_posts;
-- select ALL posts in the required date range (or whatever selection criteria you require)
create temporary table tmp_posts engine=memory
select
p.post_id, p.user_id, p.privacy_level, 0 as deleted
from
posts p
where
p.post_date between now() - interval p_day_interval day and now()
order by
p.user_id;
-- purge stalker posts (0,1,3,4)
update tmp_posts
inner join user_stalkers us on us.user_id = tmp_posts.user_id and us.stalker_user_id = p_user_id
set
tmp_posts.deleted = 1
where
tmp_posts.user_id != p_user_id;
-- purge other users private posts (3)
update tmp_posts set deleted = 1 where user_id != p_user_id and privacy_level = 3;
-- purge friend only posts (1) i.e where p_user_id is not a friend of the poster
/*
requires another temp table due to mysql temp table problem/bug
http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html
*/
-- the private posts (1) this user can see
create temporary table tmp_priv_posts engine=memory
select
tp.post_id
from
tmp_posts tp
inner join user_friends uf on uf.user_id = tp.user_id and uf.friend_user_id = p_user_id
where
tp.user_id != p_user_id and tp.privacy_level = 1;
-- remove private posts this user cant see
update tmp_posts
left outer join tmp_priv_posts tpp on tmp_posts.post_id = tpp.post_id
set
tmp_posts.deleted = 1
where
tpp.post_id is null and tmp_posts.privacy_level = 1 and tmp_posts.user_id != p_user_id;
-- purge filtered (4)
truncate table tmp_priv_posts; -- reuse tmp table
insert into tmp_priv_posts
select
tp.post_id
from
tmp_posts tp
inner join post_privacy_includes_for ppif on tp.post_id = ppif.post_id and ppif.user_id = p_user_id
where
tp.user_id != p_user_id and tp.privacy_level = 4;
-- remove private posts this user cant see
update tmp_posts
left outer join tmp_priv_posts tpp on tmp_posts.post_id = tpp.post_id
set
tmp_posts.deleted = 1
where
tpp.post_id is null and tmp_posts.privacy_level = 4 and tmp_posts.user_id != p_user_id;
drop temporary table if exists tmp_priv_posts;
-- output filtered posts (display ALL of these on web page)
select
p.*
from
posts p
inner join tmp_posts tp on p.post_id = tp.post_id
where
tp.deleted = 0
order by
p.post_id desc;
-- clean up
drop temporary table if exists tmp_posts;
end proc_main #
delimiter ;
Test Data
Some basic test data.
insert into users (username) values ('f00'),('bar'),('alpha'),('beta'),('gamma'),('omega');
insert into user_friends values
(1,2),(1,3),(1,5),
(2,1),(2,3),(2,4),
(3,1),(3,2),
(4,5),
(5,1),(5,4);
insert into user_stalkers values (4,1);
insert into posts (user_id, privacy_level, post_date) values
-- public (0)
(1,0,now() - interval 8 day),
(1,0,now() - interval 8 day),
(2,0,now() - interval 7 day),
(2,0,now() - interval 7 day),
(3,0,now() - interval 6 day),
(4,0,now() - interval 6 day),
(5,0,now() - interval 5 day),
-- friends only (1)
(1,1,now() - interval 5 day),
(2,1,now() - interval 4 day),
(4,1,now() - interval 4 day),
(5,1,now() - interval 3 day),
-- private (3)
(1,3,now() - interval 3 day),
(2,3,now() - interval 2 day),
(4,3,now() - interval 2 day),
-- filtered (4)
(1,4,now() - interval 1 day),
(4,4,now() - interval 1 day),
(5,4,now());
insert into post_privacy_includes_for values (15,4), (16,1), (17,6);
Testing
As I mentioned before I've not fully tested this but on the surface it seems to be working.
select * from posts;
call list_user_filtered_posts(1,14);
call list_user_filtered_posts(6,14);
call list_user_filtered_posts(1,7);
call list_user_filtered_posts(6,7);
Hope you find some of this of use.
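The filtering rules of the stored procedure can also be written as one SELECT, which makes them easy to experiment with. Below is a minimal sketch in Python, assuming SQLite (standard-library sqlite3) in place of MySQL and the test data above without the date window. It also assumes that a viewer always sees their own posts; the names VISIBLE_SQL and visible_posts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE user_friends (user_id INT, friend_user_id INT);
    CREATE TABLE user_stalkers (user_id INT, stalker_user_id INT);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INT, privacy_level INT);
    CREATE TABLE post_privacy_includes_for (post_id INT, user_id INT);
    INSERT INTO user_friends VALUES
        (1,2),(1,3),(1,5),(2,1),(2,3),(2,4),(3,1),(3,2),(4,5),(5,1),(5,4);
    INSERT INTO user_stalkers VALUES (4,1);
    INSERT INTO posts (user_id, privacy_level) VALUES
        (1,0),(1,0),(2,0),(2,0),(3,0),(4,0),(5,0),  -- public (0)
        (1,1),(2,1),(4,1),(5,1),                    -- friends only (1)
        (1,3),(2,3),(4,3),                          -- private (3)
        (1,4),(4,4),(5,4);                          -- filtered (4)
    INSERT INTO post_privacy_includes_for VALUES (15,4),(16,1),(17,6);
""")

# Own posts are always visible (assumption). Other users' posts are visible only
# if the viewer is not listed as a stalker of the poster AND the privacy level allows it.
VISIBLE_SQL = """
    SELECT p.post_id
    FROM posts p
    WHERE p.user_id = :viewer
       OR (NOT EXISTS (SELECT 1 FROM user_stalkers s
                       WHERE s.user_id = p.user_id AND s.stalker_user_id = :viewer)
           AND (p.privacy_level = 0
                OR (p.privacy_level = 1
                    AND EXISTS (SELECT 1 FROM user_friends f
                                WHERE f.user_id = p.user_id
                                  AND f.friend_user_id = :viewer))
                OR (p.privacy_level = 4
                    AND EXISTS (SELECT 1 FROM post_privacy_includes_for i
                                WHERE i.post_id = p.post_id
                                  AND i.user_id = :viewer))))
    ORDER BY p.post_id
"""

def visible_posts(conn, viewer):
    """Hypothetical helper: ids of the posts the given viewer may see."""
    return [post_id for (post_id,) in conn.execute(VISIBLE_SQL, {"viewer": viewer})]

print(visible_posts(conn, 1))
print(visible_posts(conn, 6))
```

Whether a single query like this or the temporary-table procedure performs better on millions of rows would have to be measured; the sketch only pins down the intended visibility rules.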

PHP/Mysql: read data field value from lookup tables (split array)

I have 1 Mysql database with 2 tables:
DOCUMENTS
...
- staffID
.....
STAFF
- ID
- Name
The DOCUMENTS table assigns each document to a single or multiple users from the STAFF table therefore the staffID in the DOCUMENTS table consists of a comma separated array of staff ID's for example (2, 14).
I managed to split the array into individual values:
2
14
but rather than having the ID numbers I would like to have the actual names from the STAFF table - how can I achieve this. Any help would be greatly appreciated - please see my current code below.
$result = mysql_query("SELECT
organizations.orgName,
documents.docName,
documents.docEntry,
documents.staffID,
staff.Name,
staff.ID
FROM
documents
INNER JOIN organizations ON (documents.IDorg = organizations.IDorg)
INNER JOIN staff ON (documents.staffID = staff.ID)
")
or die(mysql_error());
while($row = mysql_fetch_array($result)){
$splitA = $row['staffID'];
$resultName = explode(',', $splitA );
$i=0;
for($i=0;$i<count($resultName);$i++)
{
echo "<a href='staffview.php?ID=".$row['docName'].
"'>". $resultName[$i]."</a><br>";
}
echo '<hr>';
}

It looks like your existing code might work where documents.staffID = staff.ID - that is where there is just a single staffID associated with the document?
You'd be better off adding a table to model the relationships between documents and staff separately from either, and removing or deprecating the staffID field in the documents table. You'd need something like
CREATE TABLE document_staff (
document_id <type>,
staff_id <type>
)
You can include compound indexes with ( document_id, staff_id ) and ( staff_id, document_id ) if you have lots of data and/or you want to traverse the relationship efficiently in both directions.
(You don't mention data types for your identity fields, but documents.staffID appears to be some sort of varchar based on what you say - perhaps you could use an integer type for these instead?)
But you can probably achieve what you want using the existing schema and the MySQL FIND_IN_SET function:
SELECT
organizations.orgName,
documents.docName,
documents.docEntry,
documents.staffID,
staff.Name,
staff.ID
FROM
documents
INNER JOIN organizations ON (documents.IDorg = organizations.IDorg)
INNER JOIN staff ON ( FIND_IN_SET( staff.ID, documents.staffID ) > 0 )
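MySQL SET columns have limitations (a maximum of 64 members, for example) but may be sufficient for your needs. FIND_IN_SET itself also works on a plain comma-separated string, provided there are no spaces after the commas.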
If it was me though, I'd change the model rather than use FIND_IN_SET.
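As an illustration of the remodelled schema, here is a minimal sketch in Python, assuming SQLite (standard-library sqlite3) in place of MySQL, integer keys, and cut-down documents and staff tables. The staff names and the document name are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE staff (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE documents (docEntry INTEGER PRIMARY KEY, docName TEXT);
    -- One row per (document, staff member) pair replaces the comma-separated staffID.
    CREATE TABLE document_staff (
        document_id INTEGER REFERENCES documents(docEntry),
        staff_id INTEGER REFERENCES staff(ID),
        PRIMARY KEY (document_id, staff_id)
    );
    INSERT INTO staff VALUES (2, 'Alice'), (14, 'Bob');      -- invented names
    INSERT INTO documents VALUES (1, 'report.pdf');          -- invented document
    INSERT INTO document_staff VALUES (1, 2), (1, 14);
""")

# A plain equality join now returns one row per assigned staff member,
# with the name available directly and nothing to split in application code.
rows = conn.execute("""
    SELECT d.docName, s.ID, s.Name
    FROM documents d
    JOIN document_staff ds ON ds.document_id = d.docEntry
    JOIN staff s ON s.ID = ds.staff_id
    ORDER BY d.docEntry, s.ID
""").fetchall()

for doc_name, staff_id, staff_name in rows:
    print(doc_name, staff_id, staff_name)
```

With this layout the display loop can print the staff name for each row as it arrives, and the join can use ordinary indexes on the junction table.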

Thank you so much for your answer - greatly appreciated!
My table setup is:
DOCUMENTS:
CREATE TABLE documents (
docID int NOT NULL,
docTitle mediumblob NOT NULL,
staffID varchar(120) NOT NULL,
Author2 int,
IDorg int,
docName varchar(150) NOT NULL,
docEntry int AUTO_INCREMENT NOT NULL,
/* Keys */
PRIMARY KEY (docEntry)
) ENGINE = MyISAM;
STAFF:
CREATE TABLE staff (
ID int AUTO_INCREMENT NOT NULL,
Name varchar(60) NOT NULL,
Organization varchar(20),
documents varchar(150),
Photo mediumblob,
/* Keys */
PRIMARY KEY (ID)
) ENGINE = MyISAM;
The DOCUMENTS table reads via a lookup table (dropdown) from the STAFF table so that I can assign multiple staff members to a document. So I can access the staffID array in the DOCUMENTS table and split that and I wonder if there is a way to then associate the staffID with the staff.Name and print out the staff Name rather than the ID in the results of the query. Thanks again!
