I am extending a product sales plugin and am trying to understand how WordPress handles database relations. I am building tables on activation using dbDelta. An example of a table schema would be:
$table_schema = [
"CREATE TABLE IF NOT EXISTS `{$wpdb->prefix}plugin_orders` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`people_id` bigint(20) DEFAULT NULL,
`order_id` bigint(20) DEFAULT NULL,
`order_status` varchar(11) DEFAULT NULL,
`order_date` datetime DEFAULT NULL,
`order_total` decimal(13,2) DEFAULT NULL,
`accounting` tinyint(4) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `people_id` (`people_id`),
KEY `order_id` (`order_id`)
) $collate;",
"CREATE TABLE IF NOT EXISTS `{$wpdb->prefix}plugin_order_product` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`order_id` bigint(20) DEFAULT NULL,
`product_id` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `order_id` (`order_id`),
KEY `product_id` (`product_id`)
) $collate;"
];
I see that id in each table is the PRIMARY KEY, but what does declaring the other KEYs actually do? I have read that WordPress uses MyISAM, which doesn't actually enforce foreign key connections. While these tables may point to other tables that already exist, does declaring KEY order_id (order_id) in this example create a variable of sorts called order_id that any other table can reference? Is this code specifically connecting one table's attributes to another table's attributes (it doesn't appear to be)? After these tables are built, I can inspect them in phpMyAdmin and see that indexes are assigned but no foreign key constraints. How does this code create tables that point one table at another to build relations?
KEY `foo_bar` (`order_id`)
"KEY" is the same as "INDEX". It specifies that a separate data structure is maintained for the efficient access of the table via the column order_id.
foo_bar is the name of the index. It has no special meaning, and has very few uses. For example, ALTER TABLE ... DROP KEY foo_bar; is the way to get rid of the index.
In MyISAM, a "FOREIGN KEY" allowed, but ignored. In InnoDB, it does two things:
Create an index if one is not already provided
Provide a constraint. The default effectively "complain if the other table does not already have the value referenced".
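A minimal sketch of what that looks like, using hypothetical orders/order_items tables (it assumes an InnoDB orders table whose order_id column is BIGINT and indexed):
CREATE TABLE order_items (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    order_id BIGINT NOT NULL,
    PRIMARY KEY (id),
    -- creates an index on order_id (since none was declared) and
    -- rejects any row whose order_id does not already exist in orders
    FOREIGN KEY (order_id) REFERENCES orders(order_id)
) ENGINE=InnoDB;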
Having an index is important for performance. The index above makes this
SELECT ... WHERE order_id = 1234 ...
run in milliseconds, even if there are billions of rows in the table. Without the index, the query would take minutes or hours.
A PRIMARY KEY is a UNIQUE key, which is an INDEX.
UNIQUE(widget) says that only one row can have a particular value of widget in the table.
PRIMARY KEY(id) says that each row is uniquely identified by the column id. InnoDB really wants each table to have a PK.
"id" is a convention (not a requirement) for the name of the PK. It is also INT AUTO_INCREMENT by convention. You may or may not actually ever touch id.
Tables can be related to each other in 3 main ways:
1:1 -- They share the same unique key. This is rarely useful; you may as well have a single table.
1:many -- An "order" has several "items" in it (one-order : many-items). This is usually handled by order_id being a column in the items table.
many:many -- students_classes -- each student is in many classes; each class has many students. This is implemented via a mapping table that has (usually) only two columns: student_id and class_id (no id is needed) and PRIMARY KEY(student_id, class_id) and INDEX(class_id, student_id). Those two indexes make it efficient to go from a known student to their classes, and vice versa.
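A minimal sketch of such a mapping table (hypothetical names, InnoDB assumed):
CREATE TABLE students_classes (
    student_id INT UNSIGNED NOT NULL,
    class_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (student_id, class_id),   -- known student -> their classes
    INDEX (class_id, student_id)          -- known class -> its students
) ENGINE=InnoDB;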
Another convention for the PK of a table is to include the table name. (It is clutter to do that for other columns, such as order_status.) I was assuming this convention for student_id and class_id.
But now I am confused by your plugin_orders -- it has both id and order_id. If that table describes "orders", then I would expect order_id to be the PK instead of id.
And, if order_product is a list of all the "products" in each "order", then I would expect you to have the 1:many pattern.
What indexes to have?
PRIMARY KEY to uniquely identify each row -- either id or some column (or combination of columns) that are unique.
Other columns, as needed, for the SELECTs, UPDATEs, and DELETEs that you have. Do not blindly add indexes before having some clues of the queries that might need them.
Indexes sometimes help in sorting:
SELECT ... ORDER BY last_name, first_name;
together with
INDEX(last_name, first_name)
Indexes provide performance; FKs provide integrity checks. Neither is "required"; both are "desirable".
MyISAM is ancient; you should change to InnoDB.
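Converting the existing tables is one statement each (the wp_ prefix is shown literally here for illustration; in the plugin it comes from $wpdb->prefix):
ALTER TABLE wp_plugin_orders ENGINE=InnoDB;
ALTER TABLE wp_plugin_order_product ENGINE=InnoDB;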
Then do something like
SELECT ...
FROM plugin_orders AS o
JOIN plugin_order_product AS op
ON o.order_id = op.order_id
WHERE ...
In this example, the Optimizer will perform the query something like this:
Look at the WHERE to see which table is best filtered by the conditions there. Declare that to be the first table to work with.
Scan through the first table, using an index if practical.
For each row in the first table, reach into the second table.
Reaching into the second table would probably be done via INDEX(order_id) on the second table. This would make the JOIN fast and efficient.
Both tables have INDEX(order_id), but only the index on the second table is used for this join.
Next example:
SELECT ...
FROM plugin_orders AS o
JOIN plugin_order_product AS op
ON o.order_id = op.order_id
WHERE o.people_id = 123 -- note
Pick o as the first table due to filtering on people_id
use o's INDEX(people_id) to rapidly find the relevant o rows.
etc (op is the second table)
Next example:
SELECT ...
FROM plugin_orders AS o
JOIN plugin_order_product AS op
ON o.order_id = op.order_id
WHERE op.product_id = 9887 -- changed again
Pick op as the first table due to filtering on product_id
use op's INDEX(product_id) to rapidly find the relevant op rows.
etc (o is the second table this time)
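If you want to check which table the Optimizer actually picked first, prefix the query with EXPLAIN; a sketch using the columns above:
EXPLAIN
SELECT COUNT(*)
FROM plugin_orders AS o
JOIN plugin_order_product AS op ON o.order_id = op.order_id
WHERE op.product_id = 9887;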
Question to all Yii2 normalization geeks out there.
Where is the best place to set non-normalized columns in Yii2?
For example, I have models Customer, Branch, CashRegister, and Transaction.
In a perfect world, with a perfectly normalized database, our Transaction model would have only the cashregister_id, the CashRegister would store branch_id, and the Branch would store customer_id. However, due to performance issues, we sometimes find ourselves obliged to have a non-normalized Transaction model containing the following:
cashregister_id
branch_id
customer_id
When creating a transaction, I want to store all 3 values. Setting
$transaction->branch_id = $transaction->cashRegister->branch_id;
$transaction->customer_id = $transaction->cashRegister->branch->customer_id;
in the controller, however, does not feel correct.
One solution would be to do this in afterSave() in the Transaction model and make those columns read-only. That seems better, but still not perfect.
I wanted to know what the best practice is, or where the best place is, to set those duplicate columns so that data integrity is maintained.
The following is a DB-only solution.
I assume your relations are:
A customer has many branches
A branch has many cashregisters
A cashregister has many transactions
The corresponding schema could be:
create table customers (
customer_id int auto_increment,
customer_data text,
primary key (customer_id)
);
create table branches (
branch_id int auto_increment,
customer_id int not null,
branch_data text,
primary key (branch_id),
index (customer_id),
foreign key (customer_id) references customers(customer_id)
);
create table cashregisters (
cashregister_id int auto_increment,
branch_id int not null,
cashregister_data text,
primary key (cashregister_id),
index (branch_id),
foreign key (branch_id) references branches(branch_id)
);
create table transactions (
transaction_id int auto_increment,
cashregister_id int not null,
transaction_data text,
primary key (transaction_id),
index (cashregister_id),
foreign key (cashregister_id) references cashregisters(cashregister_id)
);
(Note: This should be part of your question - so we wouldn't need to guess.)
If you want to include redundant columns (branch_id and customer_id) in the transactions table, you should make them part of the foreign key. But first you will need to include a customer_id column in the cashregisters table and also make it part of the foreign key.
The extended schema would be:
create table customers (
customer_id int auto_increment,
customer_data text,
primary key (customer_id)
);
create table branches (
branch_id int auto_increment,
customer_id int not null,
branch_data text,
primary key (branch_id),
index (customer_id, branch_id),
foreign key (customer_id) references customers(customer_id)
);
create table cashregisters (
cashregister_id int auto_increment,
branch_id int not null,
customer_id int not null,
cashregister_data text,
primary key (cashregister_id),
index (customer_id, branch_id, cashregister_id),
foreign key (customer_id, branch_id)
references branches(customer_id, branch_id)
);
create table transactions (
transaction_id int auto_increment,
cashregister_id int not null,
branch_id int not null,
customer_id int not null,
transaction_data text,
primary key (transaction_id),
index (customer_id, branch_id, cashregister_id),
foreign key (customer_id, branch_id, cashregister_id)
references cashregisters(customer_id, branch_id, cashregister_id)
);
Notes:
Any foreign key constraint needs an index in the child (referencing) and the parent (referenced) table, which can support the constraint check. The given column order in the keys allows us to define the schema with only one index per table.
A foreign key should always reference a unique key in the parent table. However in this example the composition of referenced columns is (at least) implicitly unique, because it contains the primary key. In almost any other RDBMS you would need to define the indices in the "middle" tables (branches and cashregisters) as UNIQUE. This however is not necessary in MySQL.
The composite foreign keys will take care of the data integrity/consistency. Example: if you have a branch entry with branch_id = 2 and customer_id = 1, you won't be able to insert a cashregister with branch_id = 2 and customer_id = 3, because this would violate the foreign key constraint (see the snippet after these notes).
You will probably need more indices for your queries. Most probably you will need cashregisters(branch_id) and transactions(cashregister_id). With these indices you might not even need to change your ORM relation code. (though AFAIK Yii supports composite foreign keys.)
You can define relations like "customer has many transactions". Previously you would need to use "has many through", involving two middle/bridge tables. This will save you two joins in many cases.
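To make that integrity note concrete (values as in the example above, extended schema as defined, triggers not yet created):
-- branches holds (branch_id = 2, customer_id = 1), so this insert
-- fails with a foreign key constraint violation:
insert into cashregisters (branch_id, customer_id, cashregister_data)
values (2, 3, 'register attached to the wrong customer');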
If you want the redundant data to be maintained by the database, you can use the following triggers:
create trigger cashregisters_before_insert
before insert on cashregisters for each row
set new.customer_id = (
select b.customer_id
from branches b
where b.branch_id = new.branch_id
)
;
delimiter $$
create trigger transactions_before_insert
before insert on transactions for each row
begin
declare new_customer_id, new_branch_id int;
select c.customer_id, c.branch_id into new_customer_id, new_branch_id
from cashregisters c
where c.cashregister_id = new.cashregister_id;
set new.customer_id = new_customer_id;
set new.branch_id = new_branch_id;
end $$
delimiter ;
Now you can insert new entries without defining the redundant values:
insert into cashregisters (branch_id, cashregister_data) values
(2, 'cashregister 1'),
(1, 'cashregister 2');
insert into transactions (cashregister_id, transaction_data) values
(2, 'transaction 1'),
(1, 'transaction 2');
See demo: https://www.db-fiddle.com/f/fE7kVxiTcZBX3gfA81nJzE/0
If your business logic allows updating the relations, you should extend your foreign keys with ON UPDATE CASCADE. This will propagate changes through the relation chain down to the transactions table.
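A sketch of that change on the transactions table (the generated constraint name transactions_ibfk_1 is an assumption; check SHOW CREATE TABLE for the real one):
alter table transactions drop foreign key transactions_ibfk_1;
alter table transactions
    add foreign key (customer_id, branch_id, cashregister_id)
    references cashregisters (customer_id, branch_id, cashregister_id)
    on update cascade;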
I had a similar problem once, and using afterSave() or beforeSave() looked like a great solution at the beginning, but it finally resulted in hard-to-maintain spaghetti code. I ended up creating a separate component for managing such relations. Something like:
class TransactionsManager extends Component {
public function createTransaction(TransactionInfo $info, CashRegister $register) {
// magic
}
}
Then you're not creating or updating the Transaction model directly; you're always using this component, which encapsulates all the logic. ActiveRecord then works more like a data representation and does not contain any advanced business logic. In some cases it looks more complicated than $model->load($data) && $model->save(), but it is much easier to maintain when you have all the logic in one place and you don't need to debug chains of save() calls (one model runs save() of a different model in afterSave(), which runs save() of another model in afterSave()... and so on).
I have a MySQL (5.6.26) database with a large amount of data and I have a problem with a COUNT select on a table join.
This query takes about 23 seconds to execute:
SELECT COUNT(0) FROM user
LEFT JOIN blog_user ON blog_user.id_user = user.id
WHERE email IS NOT NULL
AND blog_user.id_blog = 1
Table user is MyISAM and contains user data like id, email, name, etc...
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(50) DEFAULT NULL,
`email` varchar(100) DEFAULT '',
`hash` varchar(100) DEFAULT NULL,
`last_login` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`) USING BTREE,
UNIQUE KEY `email` (`email`) USING BTREE,
UNIQUE KEY `hash` (`hash`) USING BTREE,
FULLTEXT KEY `email_full_text` (`email`)
) ENGINE=MyISAM AUTO_INCREMENT=5728203 DEFAULT CHARSET=utf8
Table blog_user is InnoDB and contains only id, id_user and id_blog (user can have access to more than one blog). id is PRIMARY KEY and there are indexes on id_blog, id_user and id_blog-id_user.
CREATE TABLE `blog_user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_blog` int(11) NOT NULL DEFAULT '0',
`id_user` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `id_blog_user` (`id_blog`,`id_user`) USING BTREE,
KEY `id_user` (`id_user`) USING BTREE,
KEY `id_blog` (`id_blog`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=5250695 DEFAULT CHARSET=utf8
I deleted all other tables and there is no other connection to MySQL server (testing environment).
What I've found so far:
When I delete some columns from user table, duration of query is shorter (like 2 seconds per deleted column)
When I delete all columns from user table (except id and email), duration of query is 0.6 seconds.
When I change blog_user table also to MyISAM, duration of query is 46 seconds.
When I change user table to InnoDB, duration of query is 0.1 seconds.
The question is why is MyISAM so slow executing the command?
First, some comments on your query (after fixing it up a bit):
SELECT COUNT(*)
FROM user u LEFT JOIN
blog_user bu
ON bu.id_user = u.id
WHERE u.email IS NOT NULL AND bu.id_blog = 1;
Table aliases make a query easier both to write and to read. More importantly, you have a LEFT JOIN, but your WHERE clause is turning it into an INNER JOIN. So, write it that way:
SELECT COUNT(*)
FROM user u INNER JOIN
blog_user bu
ON bu.id_user = u.id
WHERE u.email IS NOT NULL AND bu.id_blog = 1;
The difference is important because it affects choices that the optimizer can make.
Next, indexes will help this query. I am guessing that blog_user(id_blog, id_user) and user(id, email) are the best indexes.
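A sketch of adding them (the index name is arbitrary; the posted schema already has a unique (id_blog, id_user) key on blog_user, so only the user index may be missing):
ALTER TABLE user ADD INDEX id_email (id, email);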
The reason the number of columns affects your original query is that it is doing a lot of I/O. The fewer the columns, the fewer pages are needed to store the records -- and the faster the query runs. Proper indexes should work better and more consistently.
To answer the real question (why is MyISAM slower than InnoDB), I can't give an authoritative answer.
But it is certainly related to one of the more important differences between the two storage engines: InnoDB does support foreign keys, and MyISAM doesn't. Foreign keys are important for joining tables.
I don't know if defining a foreign key constraint will improve speed further, but for sure, it will guarantee data consistency.
Another note: you observe that the time decreases as you delete columns. This indicates that the query requires a full table scan. This can be avoided by creating an index on the email column. user.id and blog_user.id_user hopefully already have an index; if they don't, this is an error. Columns that participate in a foreign key, explicit or not, must always have an index.
This is a long time after the event to be of much use to the OP, and all the foregoing suggestions for speeding up the query are entirely appropriate, but I wonder why no one has remarked on the output of EXPLAIN. Specifically, why the index on email was chosen, and how that relates to the definition of the email column in the user table.
The optimizer has selected an index on email column, presumably because it's included in the where clause. key_len for this index is comparatively long and it's a reasonably large table given the auto_increment value so the memory requirements for this index would be considerably greater than if it had chosen the id column (4 bytes against 303 bytes). The email column is NULLABLE but has a default of the empty string so, unless the application explicitly sets a NULL, you are not going to find any NULLs in this column anyway. Neither will you find more than one record with the default given the UNIQUE constraint. The column DEFAULT and UNIQUE constraint appear to be completely at odds with each other.
Given the above, and the fact we only want the count in the query, I'd then wonder if the email part of the where clause serves any purpose other than slowing the query down as each value is compared to NULL. Without it the optimizer would probably pick the primary key and do a much better job. Better yet would be a query which ignored the user table entirely and took the count based on the covering index on blog_user that Gordon Linoff highlighted.
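A sketch of that shorter count, assuming the email filter really is redundant and every blog_user row points at an existing user:
SELECT COUNT(*)
FROM blog_user
WHERE id_blog = 1;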
There's another indexing issue here worth mentioning:
On the user table
UNIQUE KEY `id` (`id`) USING BTREE,
is redundant since id is the PRIMARY KEY and therefore UNIQUE by definition.
To answer your last question,
The question is why is MyISAM so slow executing the command?
MyISAM is dependent on the speed of your hard drive; InnoDB, once the data has been read, works at the speed of RAM. The first time the query runs it may be loading data from disk; the second and later runs avoid the hard drive until the data ages out of RAM.
I am working on a CMS system (largely as a learning exercise) for a private website. Atm I have three tables: one for articles, one for tags and a joining table so that each article can have multiple tags.
The table I am having issues with consists of three columns -
article_tags: id (auto_increment), article_id, tag_id
My problem stems from the fact that an article can appear any number of times, and a tag can also appear any number of times; however, a given combination of the two should only appear once - that is, each article should only have one reference to any single tag. Currently it is possible to INSERT "duplicate" rows where the id is different but the combination of article_id and tag_id is the same:
id   article_id   tag_id
1    1            1
2    1            2
3    2            1
4    1            1    <- this is wrong
I could check in PHP code for a record that contains this combination, but I'd prefer to do it in SQL if possible (if it is not, or it is undesirable, then I will do it using PHP). Due to the id being different and the inability to set unique columns, things like INSERT IGNORE and ON DUPLICATE do not work.
I'm quite new to mySQL so if I'm doing something silly please point me in the right direction.
Thanks
You should review your table definition.
You can (from best to worst):
Add a composite primary key on (article_id and tag_id) and remove auto_increment (previous primary key)
Add an index (UNIQUE) on (article_id and tag_id) and keep your auto_increment primary key
Select distinct in PHP: SELECT DISTINCT article_id, tag_id FROM ... without changing anything in your table
Right now, your table is defined as something like this:
CREATE TABLE IF NOT EXISTS `article_tags` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`article_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
The best solution (option 1) would be to remove your current (auto_increment) primary key and add a primary key (composite) on columns article_id and tag_id:
CREATE TABLE IF NOT EXISTS `article_tags` (
`article_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL,
PRIMARY KEY (`article_id`,`tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
But (option 2) if you absolutely want to keep your auto_increment primary key, add an index (unique) on your columns:
CREATE TABLE IF NOT EXISTS `article_tags` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`article_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `article_id` (`article_id`,`tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Anyway, if you don't want to change your table definition, you could always use DISTINCT in your PHP query:
SELECT DISTINCT article_id, tag_id FROM article_tags
Such many-to-many relationship tables, sometimes called join tables, often have just two columns, and have a primary key that's a composite of the two.
article_id
tag_id
pk = (article_id, tag_id)
If you change the definition of that table you will definitively solve that problem.
How should you order the columns in composite keys? It depends on how your application will look up items in the join table. If you'll always start with the article_id and look up the tag_id, then you put the article_id first in the key. The DBMS can random-access values for the first column in the key, but has to scan the index to find values in second (or subsequent) columns in the key.
You may want to create a second index on the table, (tag_id, article_id). This will allow fast lookups based on the tag_id. You may ask, "why bother to put both columns in the index?" That's to make the index into a covering index. In a covering index, the desired value can be retrieved directly from the index. For example, with a covering index,
SELECT article_id FROM article_tag WHERE tag_id = 12345
(or a JOIN that uses similar lookup logic) only needs to access the index on the disk drive to get the result. If you don't have a covering index, the query needs to jump from the index to the data table, which is an extra step.
Join tables typically have very short rows (a couple of integers) so the duplicated data for a couple of covering indexes (the primary key and the extra one) isn't a big disk-space hog.
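Putting the two ideas together, a sketch of the two-column join table with the composite primary key and the extra covering index (the index name is arbitrary):
CREATE TABLE IF NOT EXISTS `article_tags` (
  `article_id` int(11) NOT NULL,
  `tag_id` int(11) NOT NULL,
  PRIMARY KEY (`article_id`,`tag_id`),
  KEY `tag_article` (`tag_id`,`article_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;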
OK, so I'm a newbie here at SQL.
I'm setting up my tables, and I'm getting confused about indexes, keys, and foreign keys.
I have a users table, and a projects table.
I want to use the users (id) to attach a project to a user.
This is what I have so far:
DROP TABLE IF EXISTS projects;
CREATE TABLE projects (
id int(8) unsigned NOT NULL,
user_id int(8),
name varchar(120) NOT NULL,
description varchar(300),
created_at date,
updated_at date,
PRIMARY KEY (id),
KEY users_id (user_id)
) ENGINE=InnoDB;
ALTER TABLE projects (
ADD CONSTRAINT user_projects,
FOREIGN KEY (user_id) REFERENCES users(id),
ON DELETE CASCADE
)
So what I'm getting lost on is: what are the differences between a key, an index, a constraint and a foreign key?
I've been looking online and can't find a newbie explanation for it.
PS. I'm using phpactiverecord and have the relationships set up in the models
user-> has_many('projects');
projects -> belongs_to('user');
Not sure if that has anything to do with it, but thought I'd throw it in there.
Thanks.
EDIT:
I thought it could possibly be something to do with Navicat, so I went into WampServer -> phpMyAdmin and ran this...
DROP TABLE IF EXISTS projects;
CREATE TABLE projects (
id int(8) unsigned NOT NULL,
user_id int(8) NOT NULL,
name varchar(120) NOT NULL,
description varchar(300),
created_at date,
updated_at date,
PRIMARY KEY (id),
KEY users_id (user_id),
FOREIGN KEY (user_id) REFERENCES users(id)
) ENGINE=InnoDB;
Still nothing... :(
Expanding on Shamil's answers:
INDEX is similar to the index at the back of a book. It provides a simplified look-up for the data in that column so that searches on it are faster. Fun details: MyISAM uses a hashtable to store indexes, which keys the data, but is still linearly proportional in depth to the table size. InnoDB uses a B-tree structure for its indexes. A B-tree is similar to a nested set - it breaks down the data into logical child groups, meaning search depth is significantly smaller. As such, lookups by range are faster in InnoDB, whereas lookups of a single key are faster in MyISAM (think of the Big O of hashtables versus binary trees).
UNIQUE INDEX is an index in which each row in the database must have a unique value for that column or group of columns. This is useful for preventing duplication, e.g. for an email column in a users table where you want only one account per email address. An important note: in MySQL, an INSERT... ON DUPLICATE KEY UPDATE statement will execute the update if it finds a duplicate match on any unique index, even if it's not your primary key. This is a pitfall to be aware of when using INSERT... UPDATE statements on tables with uniques. You may wind up unintentionally overwriting records! Another note about uniques in MySQL - per the ANSI-92 standard, NULL values are not considered equal to each other, which means you can have multiple NULL values in a nullable unique-indexed column. Although it's a standard, some other RDBMSes differ in their implementation of this.
PRIMARY KEY is a UNIQUE INDEX that is the identifier for any given row in the table. As such, it must not be null, and is saved as a clustered index. Clustered means that the data is written to your filesystem in ascending order on the PK. This makes searches on primary key significantly faster than any other index type (as in MySQL, only the PK may be your clustered index). Note that clustering also causes concerns with INSERT statements if your data is not AUTO_INCREMENTed, as MySQL will have to shift data around on the filesystem if you insert a new row with a PK with a lower ordinal value. This could hamper your DB performance. So unless you're certain you know what you're doing, always use an auto-incremented value for your PK in MySQL.
FOREIGN KEY is a reference to a column in another table. It enforces Referential Integrity, which means that you cannot create an entry in a column which has a foreign key to another table if the entered value does not exist in the referenced table. In MySQL, a FOREIGN KEY does not improve search performance. It also requires that both tables in the key definition use the InnoDB engine, and have the same data type, character set, and collation.
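Since the users table was not posted, this is only a guess, but the "same data type" rule is a common reason FOREIGN KEY creation fails (the classic errno 150): if users.id is int(8) unsigned, then projects.user_id must be int(8) unsigned as well. A sketch with the types lined up:
CREATE TABLE projects (
    id int(8) unsigned NOT NULL,
    user_id int(8) unsigned NOT NULL,   -- assumed to match users.id exactly
    name varchar(120) NOT NULL,
    description varchar(300),
    created_at date,
    updated_at date,
    PRIMARY KEY (id),
    KEY users_id (user_id),
    CONSTRAINT user_projects FOREIGN KEY (user_id)
        REFERENCES users(id) ON DELETE CASCADE
) ENGINE=InnoDB;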
KEY is just another word for INDEX.
A UNIQUE index means that all values within that index must be unique, and not the same as any other within that index. An example would be an Id column in a table.
A PRIMARY KEY is a unique index where all key columns must be defined as NOT NULL, i.e, all values in the index must be set. Ideally, each table should have (and can have) one primary key only.
A FOREIGN KEY is a referential constraint between two tables. This column/index must have the same type and length as the referred column within the referred table. An example of a FOREIGN KEY is a userId, between a user-login table and a users table. Note that it usually points to a PRIMARY KEY in the referred table.
http://dev.mysql.com/doc/refman/5.1/en/create-table.html
I'm trying to create some tables in a mysql db to handle customers, assign them to groups and give customers within these groups unique promotion codes/coupons.
there are 3 parent(?) tables - customers, groups, promotions
then I have table - customerGroups to assign each customer_id to many group_id's
also I have - customerPromotions to assign each customer_id to many promotion_id's
I know I need to use cascade on delete and update so that when I delete a customer, promotion or group, the data is also removed from the child tables. I put together some PHP to create the tables easily: http://pastebin.com/gxhW1PGL
I've been trying to read up on cascade and foreign key references, but I think I learn better by trying to do things and then learning why they work. Can anyone please give me their input on what I should do to these tables to have them function correctly?
I would like to have the database and tables set up correctly before I start with queries or anything further so any advice would be great.
You seem to want just a little guidance. So I'll try to be brief.
$sql = "CREATE TABLE customerGroups (
customer_id int(11) NOT NULL,
group_id int(11) NOT NULL,
PRIMARY KEY (customer_id, group_id),
CONSTRAINT customers_customergroups_fk
FOREIGN KEY (customer_id)
REFERENCES customers (customer_id)
ON DELETE CASCADE,
CONSTRAINT groups_customergroups_fk
FOREIGN KEY (group_id)
REFERENCES groups (group_id)
ON DELETE CASCADE
)ENGINE = INNODB;";
You only need id numbers when identity is hard to nail down. When you're dealing with people, identity is hard to nail down. There are lots of people named "John Smith".
But you're dealing with two things that have already been identified. (And identified with id numbers, of all things.)
Cascading deletes make sense. It's relatively rare to cascade updates on id numbers; they're presumed to never change. (The main reason Oracle DBAs insist that primary keys must always be ID numbers, and that they must never change, is that Oracle can't cascade updates.) If, later, some id numbers need to change for whatever reason, you can alter the table to include ON UPDATE CASCADE.
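A sketch of adding it later to the customerGroups table (constraint name taken from the definition above):
ALTER TABLE customerGroups DROP FOREIGN KEY customers_customergroups_fk;
ALTER TABLE customerGroups
    ADD CONSTRAINT customers_customergroups_fk
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
    ON DELETE CASCADE
    ON UPDATE CASCADE;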
$sql = "CREATE TABLE groups
(
group_id int(11) NOT NULL AUTO_INCREMENT,
group_title varchar(50) NOT NULL UNIQUE,
group_desc varchar(140),
PRIMARY KEY (group_id)
)ENGINE = INNODB;";
Note the additional unique constraint on group_title. You don't want to allow anything like this (below) in your database.
group_id group_title
--
1 First group
2 First group
3 First group
...
9384 First group
You'll want to carry those kinds of changes through all your tables. (Except, perhaps, your table of customers.)