How do I track changes and store calculated content in Nermalization? - php

I'm trying to create a table like this:
lives_with_owner_no | from | until | under_the_name
1                   | 1998 | 2002  | 1
3                   | 2002 | NULL  | 1
2                   | 1997 | NULL  | 2
3                   | 1850 | NULL  | 3
3                   | 1999 | NULL  | 4
2                   | 2002 | 2002  | 4
3                   | 2002 | NULL  | 5
It's the Nermalization example, which I guess is pretty popular.
Anyway, I think I am just supposed to set up a dependency within MySQL so that the from column updates whenever the lives_with table or the cat_name table changes, and then set up a dependency between the until and from columns. I figure the owner might want to come and update the cat's info and override the from column, though, so do I have to use PHP? Is there any special way I should produce the timestamp for the override (for example, $date = date("Y-m-d H:i:s");)? And how do I set up the dependency within MySQL?
I also have a column that can be generated by adding other columns together. I guess using the cat example, it would look like:
combined_family_age | family_name
75                  | Alley
230                 | Koneko
132                 | Furrdenand
1,004               | Whiskers
Should I do the addition in PHP and then insert the values with a query, or should I let MySQL manage the addition? Should I use a special engine for this, like MemoryAll?

I disagree with the nermalization example on two counts.
There is no cat entity in the end. Instead, there is a relation (cat_name_no, cat_name), which in your example has the immediate consequence that you can't tell how many cats named Lara exist. This is an anomaly that can easily be avoided.
The table crams two relations, lives_with_owner and under_the_name, into one table. That's not a good idea, especially if the data is temporal, as it creates all kinds of nasty anomalies. Instead, you should use a table for each.
I would design this database as follows:
create table owner (id integer not null primary key, name varchar(255));
create table cat (id integer not null primary key, current_name varchar(255));
create table cat_lives_with (
cat_id integer references cat(id),
owner_id integer references owner(id),
valid_from date,
valid_to date);
create table cat_has_name (
cat_id integer references cat(id),
name varchar(255),
valid_from date,
valid_to date);
So you would have data like:
id | name
1 | Andrea
2 | Sarah
3 | Louise
id | current_name
1 | Ada
2 | Shelley
cat_id | owner_id | valid_from | valid_to
1 | 1 | 1998-02-15 | 2002-08-11
1 | 3 | 2002-08-12 | 9999-12-31
2 | 2 | 2002-01-08 | 2002-10-23
2 | 3 | 2002-10-24 | 9999-12-31
cat_id | name | valid_from | valid_to
1 | Ada | 1998-02-15 | 9999-12-31
2 | Shelley | 2002-01-08 | 2002-10-23
2 | Callisto | 2002-10-24 | 9999-12-31
I would use a finer-grained date type than just a year (in the Nermalization example, having 2002-2002 as a range really leads to messy query syntax), so that you can ask queries like select cat_id from cat_lives_with where '2000-06-02' between valid_from and valid_to.
As for the question of how to deal with temporal data in the general case: there's an excellent book on the subject, "Developing Time-Oriented Database Applications in SQL" by Richard Snodgrass, which is legally available as a free full-text PDF distributed by the author.
Your other question: you can compute the combined_family_age either in SQL at query time or, if that column is needed often, with a view. You shouldn't maintain the value manually, though; let the database calculate it for you.
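A minimal sketch of the view approach. Note the assumptions: the cat table defined above has no age or family_name columns, so both are hypothetical here and would need to exist in your actual schema:

```sql
-- Sketch only: assumes hypothetical `age` and `family_name` columns on `cat`,
-- which the schema above does not define.
create view family_age as
select family_name, sum(age) as combined_family_age
from cat
group by family_name;

-- The total is then always current, with no manually maintained column:
select combined_family_age from family_age where family_name = 'Koneko';
```

Because the view recomputes the sum on every read, there is no stored value that can drift out of sync with the underlying rows.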

Related

finding next ID in a CHAR field without AUTO_INCREMENT

I've got a table which stores objects. An object can be anything from a chair to an employee. Each object has an ObjectID, which is a 10-character Code 39 barcode label on the object.
Many objects already have a label, and thus an ObjectID assigned to them. Some have prefixes, e.g. "9000000345" might be a desk or "0000000895" might be a folder with invoices.
When people start a new folder, for example, they take pre-printed barcode labels and put them on it. The pre-printed barcode labels are generated by a printer which just increases a number by 1, zero-fills it to 10 digits, and prints it as Code 39.
Almost all of the objects are stored in Excel sheets. They should now be migrated into a MySQL database.
Now the system should also be able to create objects on its own. Objects created by the system have a leading "1", e.g. "1000000426".
The problem: how do I get the next ObjectID for auto-generated objects?
I can't really use AUTO_INCREMENT because there are also non-auto-generated rows in the table.
Another thing to mention is that the ObjectID field has to be CHAR(10) because on special occasions alphanumeric prefixes were used, like "T1" -> "T100003158".
My Table when using AUTO_INCREMENT:
| ID | Created | ObjectID | Parent | Title | Changed | Note |
|----|-------------|--------------|--------|-------------|-------------|------|
| 1 | <timestamp> | "1000000001" | NULL | "Shelf 203" | <timestamp> | NULL |
| 2 | <timestamp> | "9000000458" | NULL | "Lamp" | <timestamp> | NULL |
| 3 | <timestamp> | "1000000003" | NULL | "Shelf 204" | <timestamp> | NULL |
The ObjectID of the next auto-generated object should be "1000000002", not "1000000003".
I hope I explained the problem well enough.
Naive solution can be:
SELECT CAST(ObjectID AS UNSIGNED) + 1 FROM yourTable WHERE ObjectId LIKE "1%" ORDER BY ObjectID DESC LIMIT 1
Basically, search for all ObjectIDs starting with 1, sort them descending (because the IDs are zero-padded, string order matches numeric order), take the first one, cast it to an integer, and increment it.
It might be faster to cast to int first and then use BETWEEN; the rest would be the same.
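An equivalent sketch using MAX(), which avoids the sort entirely; the table name yourTable is taken from the query above, and LPAD turns the incremented number back into a zero-filled 10-character ObjectID:

```sql
select lpad(max(cast(ObjectID as unsigned)) + 1, 10, '0') as next_id
from yourTable
where ObjectID like '1%';
```

Either way, beware of the race condition: two concurrent inserts can compute the same next_id, so run the SELECT and the subsequent INSERT inside a transaction (or under a table lock, or with a UNIQUE index on ObjectID to catch collisions).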

SQL - trying to sum value from multiple reference

I have a problem calculating the sum of the amount column from some reference fields in 4 different tables. Here are my tables:
First Table (Master):
ID_1 | Name_1
1    | A
2    | B

Second Table (Master):
ID_2 | ID_1 | Name_2
1_1  | 1    | A1
1_2  | 1    | A2
2_1  | 2    | B1
2_2  | 2    | B2

Third Table (Transaction):
ID_trans | ID_2 | trans_name | amount | cpy_ID
trans1   | 1_1  | Rev        | 123    | 1400
trans2   | 2_1  | Dir        | 321    | 1400
trans3   | 2_1  | Ind        | 231    | 1400
trans4   | 1_2  | OTH        | 234    | 1400

Fourth Table (report template):
report_set_id | report_set_name | cpy_ID
set001        | Own Apps        | 1400
set002        | Third Party     | 1400
The main case is I have to create a report with the third table (transaction) as data reference. And the report template has been determined like this :
----------------------------------------------------
| 1 | 2 | TOTAL |------> (1 & 2 first table fields)
----------------------------------------------------
set001 | (data 1) | - | (horizontal sum)
set002 | - | (data 2) | (horizontal sum)
-----------------------------------------------------
TOTAL | (sum of 1)| (sum of 2) |
which is :
(data 1 & data 2) = the summed amounts from the transaction table with the same ID_2, placed in the column for the matching ID_1 (because ID_1 is a foreign key in the second table).
I know my explanation is complicated; it's hard to explain in words, but I hope you can see what I mean. :D
Can someone give me some advice on how to solve this? Thanks.
If this will only ever have 2 data columns (labelled "1" and "2" in your example) then it will be quite easy to write in SQL, and we could do that. But if there will be several data columns, or a variable number of them, then we are into the general "pivot table" question. You can find many discussions of this topic under the [pivot] tag on Stack Overflow. My own opinion is that anything but trivial formatting (including almost all pivot tables) is best done in the application. Use SQL to group and aggregate the data, then use a visualisation tool, or your application, to format it. I have written more about this on my website, where I explain a couple of common approaches and why I don't use them.
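For the fixed two-column case, a conditional-aggregation sketch. The question doesn't name its tables, so master2, trans and report are stand-ins for the second, third and fourth tables, and the rule linking each report set to particular transactions isn't stated, so only the grouping and pivoting are shown:

```sql
-- Assumed mapping: column "1" holds amounts where ID_1 = 1, column "2" where ID_1 = 2.
select r.report_set_name,
       sum(case when m2.ID_1 = 1 then t.amount else 0 end) as col_1,
       sum(case when m2.ID_1 = 2 then t.amount else 0 end) as col_2,
       sum(t.amount) as total
from report r
join trans t    on t.cpy_ID = r.cpy_ID   -- plus whatever rule ties a set to its transactions
join master2 m2 on m2.ID_2  = t.ID_2
group by r.report_set_name;
```

The CASE expressions inside SUM() are what turn rows into columns; adding the TOTAL row at the bottom is best left to the application, per the advice above.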

Possible to loop through tables in mySQL (using PHP) with table name containing a variable?

I'm pretty new to programming and trying to design a web application which provides a front-end to update data for an online course database (tables incl. users, assignments, questions, answers, etc). All data is coming into this database upon submission via course management system. Currently working with dummy data for development purposes.
The idea is to allow a user to update from the front end, rather than have updates occur automatically from the back end (as in using triggers). This is because we have a relatively small data set, and will just need the updated tables for users not familiar with mysql to export into data analysis programs.
Using multiple joins, I've created a list of assignments already taken by users, which looks like this:
+----+---------------+---------------------------------------+
| id | assignment_id | quiz_name |
+----+---------------+---------------------------------------+
| 1 | 2 | Guidance Counselors (Post-Assessment) |
| 2 | 3 | Guidance Counselors (Pre-Assessment) |
| 3 | 4 | Guidance Counselors (Pre-Assessment) |
+----+---------------+---------------------------------------+
In PHP, I've coded a basic front-end for displaying and updating these assignments in a multi-selection dropdown, which looks like this:
Assignment list table
Whenever a user wants to update the list, php runs this if statement (the same one used to generate the above table):
//update assignment list when button is clicked
if(isset($_POST['updatelist'])){
    //query to update the assignment list
    $sql_update  = "DELETE FROM assignment_list;";
    $sql_update .= "TRUNCATE assignment_list;";
    $sql_update .= "INSERT INTO assignment_list....
                    #values added from joined assignment, quizzes, and user_quizzes tables
                    SELECT...
                    INNER JOIN ...
                    INNER JOIN ...
                    ...
                    GROUP BY assignment_id";
    $update_result = mysqli_multi_query($conn, $sql_update);
    //success and error messages here
}
ETA: Which is then reflected in the dropdown menu. This menu then should allow users to select one or more assignments for which to update data, let's say if there are new user submissions and scores available for that assignment (i.e. the data contained within each assignment -- structurally they are all the same).
Updating the list seems to be working, but I am struggling to figure out the best way to update what we're calling "consolidated data" tables (appended with "_(assignment_id)" for each individual assignment). So when a user selects "update data", the data in the table for the selected assignment(s) above should update. As an example, consolidated_data_4 looks like this (some fields omitted for better readability):
+------------------+---------+---------------+---------+-----------------------------+----------------------------+--------------------------------------+-----------+
| consolidation_id | user_id | assignment_id | quiz_id | assignment_pass_score_point | assignment_pass_score_perc | quiz_name | answer_id |
+------------------+---------+---------------+---------+-----------------------------+----------------------------+--------------------------------------+-----------+
| 1 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 175973 |
| 2 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 175981 |
| 3 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 175985 |
| 4 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 175991 |
| 5 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 175995 |
| 6 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 175999 |
| 7 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 176002 |
| 8 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 176009 |
| 9 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 176015 |
| 10 | 34 | 4 | 50 | 5.00 | 50.00 | Guidance Counselors (Pre-Assessment) | 176021 |
+------------------+---------+---------------+---------+-----------------------------+----------------------------+--------------------------------------+-----------+
Each table is currently identified by its assignment id, e.g. "consolidated_data_4" is a table in which assignment_id = 4 for every record in that table.
I tried looping through each table (using a foreach loop) and performing a similar query as the one above for the assignments list, but I receive an error unless I separate the queries out like so (also abridged):
//for each assignment user selects, start with an empty table and join data from corresponding course database tables
foreach($_POST['selectassign'] as $assignment){
    $table = "quiz_data_update_application.consolidated_data_".$assignment;
    //query updates consolidated data table(s) for selected assignments
    $sql_del   = "DELETE FROM $table";
    $sql_trunc = "TRUNCATE $table";
    //records added from joining tables in course db
    $sql_ins = //sql omitted here for brevity
    ....
    $del_sel_res   = mysqli_query($conn, $sql_del);
    $trunc_sel_res = mysqli_query($conn, $sql_trunc);
    $ins_sel_res   = mysqli_query($conn, $sql_ins);
}
(The only difference between this and the code that doesn't work is that the queries are combined in the same fashion as the first PHP snippet above; this seems redundant to post here).
Using single queries seems inefficient and will slow down the application. I am wondering if there is a better way to approach this than using part of a table name as a variable in PHP loops. (I don't see many people asking about this, or about iterating through actual tables as opposed to fields.) It seems there's either some silly syntax mistake I'm missing, or mysqli_multi_query() can't be used in loops and is overall a poor approach?
Some of the higher-ups where I am working have suggested either:
(1) creating separate loops - one to temporarily create a new table based on a users' selection of assignments, and another to split that up into separate tables by assignment (this part they are saying is required for optimal data analysis. IMO this makes everything more challenging than having a large "master" table of data for ALL assignments -- which I did in fact have success doing -- but alas this is how they're requesting it be done). Those tables would then be deleted after an update completes;
(2) Using stored procedures. This I am not as familiar with, and not sure how that'd work, but if it's more feasible than (1), I could look into it more.
Any other alternative, more feasible suggestions would be appreciated as well. I've made a lot of progress with this, but have been stuck here the past few weeks and not finding much in the way of online resources.
Apologies for the length of the post. I thought it necessary to provide more context rather than less.
Single queries are the way to go. The perceived inefficiency isn't there; the table operations themselves cost far more than the mysqli_query() calls.
If you have thousands of tables to work with, then I question the wisdom of having multiple similar tables. Instead, have one table with an extra column to indicate which assignment is involved.
DELETE (all of a table) and TRUNCATE have virtually the same effect; there is no need to do both. TRUNCATE is faster.
And deleting one indexed row from a table is a lot faster than truncate + insert.
Furthermore, batching the deletes into a single SQL statement is even faster. So is batch inserting multiple rows at once. This is easily 10 times as fast.
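A sketch of the single-table approach suggested above. The table name consolidated_data and the join chain are assumptions following the per-assignment structure in the question; one assignment's rows are refreshed with a targeted DELETE plus a single batched INSERT ... SELECT, instead of truncating a per-assignment table:

```sql
-- One consolidated table for all assignments, distinguished by assignment_id:
delete from consolidated_data where assignment_id = 4;

insert into consolidated_data
    (user_id, assignment_id, quiz_id, quiz_name, answer_id)
select u.user_id, a.assignment_id, q.quiz_id, q.quiz_name, ans.answer_id
from assignments a
join quizzes q      on q.assignment_id = a.assignment_id
join user_quizzes u on u.quiz_id       = q.quiz_id
join answers ans    on ans.quiz_id     = q.quiz_id   -- join logic is illustrative
where a.assignment_id = 4;
```

Selecting several assignments then becomes a WHERE ... IN (...) on both statements, with no per-table loop in PHP at all.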
@Rick James here is SHOW CREATE TABLE for one of the tables, although the structure is the same for all of them:
CREATE TABLE `consolidated_data_2` (
  `consolidation_id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) NOT NULL,
  `assignment_id` int(11) NOT NULL,
  `quiz_id` int(11) NOT NULL,
  `assignment_pass_score_point` decimal(10,2) DEFAULT NULL,
  `assignment_pass_score_perc` decimal(10,2) DEFAULT NULL,
  `quiz_name` varchar(3800) NOT NULL,
  `question_id` int(11) NOT NULL,
  `question_text` varchar(3800) CHARACTER SET utf8 NOT NULL,
  `question_type_id` int(11) NOT NULL,
  `answer_id` int(11) DEFAULT NULL,
  `answer_textA` varchar(800) CHARACTER SET utf8 DEFAULT NULL,
  `answer_added_ts` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `group_id` int(11) DEFAULT NULL,
  `user_answer_text` varchar(3800) CHARACTER SET utf8 DEFAULT NULL,
  `answer_image` varchar(800) DEFAULT NULL,
  `is_correct` int(11) DEFAULT NULL,
  `correct_answer_text` varchar(3800) CHARACTER SET utf8 DEFAULT NULL,
  PRIMARY KEY (`consolidation_id`)
) ENGINE=InnoDB AUTO_INCREMENT=128 DEFAULT CHARSET=latin1
The reason for having multiple tables is so that there's one table unique to each assignment id, so each assignment table can be updated separately/selectively for analysis purposes.
It isn't necessarily just updating one row, since each assignment is typically 10 questions. So each new user submission would add 10 more rows in most cases.
I am attaching an ER diagram which shows all the tables that have source data for the consolidated data tables.
ER Diagram for consolidated_data tables
(Blue = fields used for joins; yellow = all other fields included in the table; data type info on some for reference)

Storing variable number of values of something in a database

I'm developing a QA web app in which a number of points will be evaluated, each assigned to one of the following categories:
Call management
Technical skills
Ticket management
As these categories aren't likely to change, it's not worth making them dynamic; the real problem is that the points are likely to change.
At first I had a 'quality' table with a column for each point, but then the requirements changed and I'm kind of stuck.
I have to store "evaluations" that hold values for all the points, but those points may change in the future.
I thought that in the quality table I could store some kind of string like this:
1=1|2=1|3=2
where each pair is a point ID and the punctuation (score) given for it.
Can someone point me to a better method?
As mentioned many times here on SO: NEVER PUT MORE THAN ONE VALUE INTO A DB FIELD IF YOU WANT TO ACCESS THEM SEPARATELY.
So I suggest to have 2 additional tables:
CREATE TABLE categories (id int AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50) NOT NULL);
INSERT INTO categories VALUES (1,"Call management"),(2,"Technical skills"),(3,"Ticket management");
and
CREATE TABLE qualities (id int AUTO_INCREMENT PRIMARY KEY, category int NOT NULL, punctuation int NOT NULL);
then store and query your data accordingly
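"Store and query your data accordingly" could look like this sketch, using only the two tables defined above (in practice you would likely also add an evaluation_id column to qualities to tie rows to one evaluation, as the normalized example below does):

```sql
-- Record one evaluation's scores as separate rows, one per category:
INSERT INTO qualities (category, punctuation) VALUES (1, 1), (2, 1), (3, 2);

-- Read them back with category names attached:
SELECT c.name, q.punctuation
FROM qualities q
JOIN categories c ON c.id = q.category;
```

Because each score is its own row, adding or removing points later is a data change, not a schema change.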
This table is not normalized. It violates 1st Normal Form (1NF):
Evaluation
----------------------------------------
EvaluationId | List Of point=punctuation
1 | 1=1|2=1|3=2
2 | 1=5|2=6|3=7
You can read more about Database Normalization basics.
The table could be normalized as:
Evaluation
-------------
EvaluationId
1
2
Quality
---------------------------------------
EvaluationId | Point | Punctuation
1 | 1 | 1
1 | 2 | 1
1 | 3 | 2
2 | 1 | 5
2 | 2 | 6
2 | 3 | 7

MySQL design with dynamic number of fields

My experience with MySQL is very basic. The simple stuff is easy enough, but I ran into something that is going to require a little more knowledge. I have a need for a table that stores a small list of words. The number of words stored could be anywhere between 1 to 15. Later, I plan on searching through the table by these words. I have thought about a few different methods:
A.) I could create the table with 15 fields and just fill them with null values whenever the data has fewer than 15 words. I don't really like this; it seems really inefficient.
B.) Another option is to use just a single field and store the data as a comma-separated list. Whenever I come back to search, I would just run a regular expression on the field. Again, this seems really inefficient.
I would hope there is a good alternative to those two options. Any advice would be very appreciated.
-Thanks
C) use a normal form; use multiple rows with appropriate keys. an example:
mysql> SELECT * FROM blah;
+----+-----+-----------+
| K | grp | name |
+----+-----+-----------+
| 1 | 1 | foo |
| 2 | 1 | bar |
| 3 | 2 | hydrogen |
| 4 | 4 | dasher |
| 5 | 2 | helium |
| 6 | 2 | lithium |
| 7 | 4 | dancer |
| 8 | 3 | winken |
| 9 | 4 | prancer |
| 10 | 2 | beryllium |
| 11 | 1 | baz |
| 12 | 3 | blinken |
| 13 | 4 | vixen |
| 14 | 1 | quux |
| 15 | 4 | comet |
| 16 | 2 | boron |
| 17 | 4 | cupid |
| 18 | 4 | donner |
| 19 | 4 | blitzen |
| 20 | 3 | nod |
| 21 | 4 | rudolph |
+----+-----+-----------+
21 rows in set (0.00 sec)
This is the table I posted in this other question about group_concat. You'll note that there is a unique key K for every row. There is another key grp which represents each category. The remaining field represents a category member, and there can be variable numbers of these per category.
What other data is associated with these words?
One typical way to handle this kind of problem is best described by example. Let's assume your table captures certain words found in certain documents. One typical way is to assign each document an identifier. Let's pretend, for the moment, that each document is a web URL, so you'd have a table something like this:
CREATE TABLE WebPage (
ID INTEGER NOT NULL,
URL VARCHAR(...) NOT NULL
)
Your Words table might look something like this:
CREATE TABLE Words (
Word VARCHAR(...) NOT NULL,
DocumentID INTEGER NOT NULL
)
Then, for each word, you create a new row in the table. To find all words in a particular document, select by the document's ID:
SELECT Words.Word FROM Words, WebPage
WHERE Words.DocumentID = WebPage.ID
AND WebPage.URL = 'http://whatever/web/page/'
To find all documents with a particular word, select by word:
SELECT WebPage.URL FROM WebPage, Words
WHERE Words.Word = 'hello' AND Words.DocumentID = WebPage.ID
Or some such.
Hurpe, is the scenario you are describing that you will have a database table with a column that can contain up to 15 keywords, and later you will use these keywords to search the table, which will presumably have other columns as well?
Then isn't the answer to have a separate table for the keywords? You will also need a many-to-many relationship between the keywords and the main table.
So using cars as an example, the WORD table that will store the 15 or so keywords would have the following structure:
ID int
Word varchar(100)
The CAR table would have a structure something like:
ID int
Name varchar(100)
Then finally you need a CAR_WORD table to hold the many-to-many relationships:
ID int
CAR_ID int
WORD_ID int
And sample data to go with this for the WORD table:
ID Word
001 Family
002 Sportscar
003 Sedan
004 Hatchback
005 Station-wagon
006 Two-door
007 Four-door
008 Diesel
009 Petrol
together with sample data for the CAR table
ID Name
001 Audi TT
002 Audi A3
003 Audi A4
then the intersection CAR_WORD table sample data could be:
ID CAR_ID WORD_ID
001 001 002
002 001 006
003 001 009
which give the Audi TT the correct characteristics.
and finally the SQL to search would be something like:
SELECT c.name
FROM CAR c
INNER JOIN CAR_WORD x
ON c.id = x.car_id
INNER JOIN WORD w
ON x.word_id = w.id
WHERE w.word IN ('Petrol', 'Two-door')
Phew! Didn't intend to set out to write quite so much, it looks complicated but it is where I always seem to end up however hard I try to simplify things.
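One caveat on the search query above: WHERE ... IN returns cars matching either keyword. A sketch of requiring all of them, using the same tables, is to group per car and count the distinct matches:

```sql
SELECT c.name
FROM CAR c
INNER JOIN CAR_WORD x ON c.id = x.car_id
INNER JOIN WORD w     ON x.word_id = w.id
WHERE w.word IN ('Petrol', 'Two-door')
GROUP BY c.name
HAVING COUNT(DISTINCT w.word) = 2;   -- must match both keywords
```

The HAVING count must equal the number of keywords in the IN list, so the application would supply both together.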
I would create a table with an ID and one field, then store your results as multiple records. This offers many benefits. For example, you can then programmatically enforce your 15-word limit instead of baking it into your design, so if you ever change your mind it should be rather easy. Your search queries will also run much faster; regular expressions take a lot of time to run, comparatively. Plus, using a varchar for the field will allow you to compress your table much better, and indexing the table should be much easier (more efficient) with this design.
Do the extra work and store the 15 words as 15 rows in the table, i.e. normalize the data. It may require you to re-think your strategy a bit, but trust me when the client comes along and says "Can you change that 15 limit to 20...", you'll be glad you did.
Depending on exactly what you want to accomplish:
Use a full-text index on your string table
Three tables: one for the original string, one for unique words (after word-rooting?), and a join table. This would also let you do more complicated searches, like "return all strings containing at least three of the following five words" or "return all strings where 'fox' occurs after 'dog'".
CREATE TABLE string (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
string TEXT NOT NULL
)
CREATE TABLE word (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
word VARCHAR(14) NOT NULL UNIQUE,
UNIQUE INDEX (word ASC)
)
CREATE TABLE word_string (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
string_id INT NOT NULL,
word_id INT NOT NULL,
word_order INT NOT NULL,
FOREIGN KEY (string_id) REFERENCES string(id),
FOREIGN KEY (word_id) REFERENCES word(id),
INDEX (word_id ASC)
)
// Sample data
INSERT INTO string (string) VALUES
('This is a test string'),
('The quick red fox jumped over the lazy brown dog')
INSERT INTO word (word) VALUES
('this'),
('test'),
('string'),
('quick'),
('red'),
('fox'),
('jump'),
('over'),
('lazy'),
('brown'),
('dog')
INSERT INTO word_string ( string_id, word_id, word_order ) VALUES
( 0, 0, 0 ),
( 0, 1, 3 ),
( 0, 2, 4 ),
( 1, 3, 1 ),
( 1, 4, 2 ),
( 1, 5, 3 ),
( 1, 6, 4 ),
( 1, 7, 5 ),
( 1, 8, 7 ),
( 1, 9, 8 ),
( 1, 10, 9 )
// Sample query - find all strings containing 'fox' and 'quick'
// (word_string must be joined once per required word, since each of its
// rows holds only a single word_id)
SELECT DISTINCT
    string.id, string.string
FROM
    string
    INNER JOIN word_string AS ws_fox ON string.id = ws_fox.string_id
    INNER JOIN word AS fox ON fox.word = 'fox' AND ws_fox.word_id = fox.id
    INNER JOIN word_string AS ws_quick ON string.id = ws_quick.string_id
    INNER JOIN word AS quick ON quick.word = 'quick' AND ws_quick.word_id = quick.id
You are correct that A is no good. B is also no good, as it fails to adhere to First Normal Form (each field must be atomic). There's nothing in your example that suggests you would gain by avoiding 1NF.
You want a table for your list of words with each word in its own row.
