I'm trying to create a single table for private messaging on a website. I created the following table which I think is efficient but I would really appreciate some feedback.
CREATE TABLE IF NOT EXISTS `pm` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`to` int(11) NOT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`subject` varchar(255) DEFAULT NULL,
`message` text NOT NULL,
`read` tinyint(1) NOT NULL DEFAULT '0',
`deleted` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
FOREIGN KEY (`user_id`) REFERENCES `User`(`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
I have 2 columns that determine the status of the message: read and deleted
If read = 1, the message has been read by the receiver. If deleted = 1, either the sender or the receiver has deleted the message from their sent or received inbox. If deleted = 2, both users have deleted the message, therefore the row can be removed from the table.
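Under that scheme, the status updates might look like this (a sketch; the message id 42 is a placeholder, and the two-step deletion assumes `deleted` is incremented rather than set):

```sql
-- Mark a message as read by the receiver
UPDATE pm SET `read` = 1 WHERE id = 42;

-- One party deletes the message; once both have, remove the row
UPDATE pm SET deleted = deleted + 1 WHERE id = 42;
DELETE FROM pm WHERE id = 42 AND deleted >= 2;
```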
I see that you don't have any indexes explicitly stated. Having the appropriate indexes on your table could improve your performance significantly. I also believe that for your message column you may want to consider making it a varchar with a maximum size explicitly stated. Other than those two items, which you may have already taken care of, your table looks pretty good to me.
MySQL Table Performance Guidelines:
Add appropriate indexes to tables. Indexes aren't just for primary/unique keys; add them to frequently referenced columns.
Explicitly state maximum lengths. Fixed-length tables are faster than their variable-length counterparts.
Always have an id column.
Add NOT NULL wherever you can. NULLs still take up space.
Know your data types. Knowledge is power and can save on performance and space.
Interesting Articles:
VarChar/TEXT Benchmarks
Similar Question
Some Best Practices
Data Type Storage Requirements
The articles and some of the items I have listed may not be 100% correct or reliable so make sure you do a bit of your own research if you are interested in further tuning your performance.
A few comments:
Charset=latin1 is going to piss some people off; I'd suggest charset=utf8.
I'd suggest putting a foreign key check in not only on user_id, but on to as well.
Also I'd put an index on date, as you will be doing a lot of sorting on that field.
You need to split deleted in two fields, otherwise you will not know which user has deleted the message. (deleted_by_user, deleted_by_recipient)
Note that date is a reserved word and you'll need to change it into message_date or `backtick` it in your queries.
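Putting those suggestions together, a revised definition might look like this (a sketch; the renamed columns and index name are illustrative):

```sql
CREATE TABLE IF NOT EXISTS `pm` (
  `user_id` INT NOT NULL,
  `to_user_id` INT NOT NULL,
  `message_date` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `subject` VARCHAR(255) DEFAULT NULL,
  `message` TEXT NOT NULL,
  `read` TINYINT(1) NOT NULL DEFAULT 0,
  `deleted_by_user` TINYINT(1) NOT NULL DEFAULT 0,
  `deleted_by_recipient` TINYINT(1) NOT NULL DEFAULT 0,
  `id` INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `message_date_ix` (`message_date`),      -- for sorting by date
  FOREIGN KEY (`user_id`) REFERENCES `User`(`user_id`),
  FOREIGN KEY (`to_user_id`) REFERENCES `User`(`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```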
Some comments:
Not bad.
I would name the table something that other people might guess out of context, so maybe private_message instead of pm.
I would be explicit with the user column names, so maybe from_user_id and to_user_id instead of 'user_id' and 'to'.
I would consider pulling the status out into a new table with status, user_id, and date - this should give you a lot more flexibility in who is doing what to the message over time.
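A sketch of that status table (the table and column names here, including `private_message`, are illustrative):

```sql
-- One row per status event, so the full history is preserved
CREATE TABLE private_message_status (
  message_id  INT NOT NULL,
  user_id     INT NOT NULL,
  status      TINYINT NOT NULL,   -- e.g. 1 = read, 2 = deleted
  status_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (message_id, user_id, status),
  FOREIGN KEY (message_id) REFERENCES private_message(id)
) ENGINE=InnoDB;
```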
For displaying both the receiver's inbox and the sender's outbox (and being able to delete messages respectively), you will probably need more information than what you currently have encoded. I would suggest a "deleted" field for each party. (As long as this is limited to only 1 user on each end and no broadcast messages, this works. This does not scale to broadcast messages, however, which would require more than 1 table to do efficiently.)
You may also want to enforce key relationships with ON DELETE and ON UPDATE:
FOREIGN KEY (user_id) REFERENCES User(user_id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (`to`) REFERENCES User(user_id) ON DELETE CASCADE ON UPDATE CASCADE
The removal or modification of a user will propagate changes or deletions to the messages table.
I think you may need to add a column called parent_message_id which will hold the ID of the parent message, so that replies can also be included - if you think you might add replies to your private messages in the future.
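A sketch of such a self-referencing column (assuming the table keeps its original name):

```sql
ALTER TABLE pm
  ADD COLUMN parent_message_id INT NULL,
  ADD FOREIGN KEY (parent_message_id) REFERENCES pm(id);
```

Top-level messages would leave `parent_message_id` as NULL; replies point at the message they answer.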
Related
I have this chat table:
CREATE TABLE IF NOT EXISTS `support_chat` (
`id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`from` varchar(255) NOT NULL DEFAULT '',
`to` varchar(255) NOT NULL DEFAULT '',
`message` text NOT NULL,
`sent` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`seen` varchar(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `from` (`from`),
KEY `to` (`to`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 AUTO_INCREMENT=1 ;
basically I need to do a select all the time (3s per user) to check new messages:
select id, `from`, message, sent from support_chat where `to` = ? and seen = 0
I have 5 million rows and usually 100 users online at the same time. Can I change something to make this table faster? Are the keys on from and to a good option?
There isn't much you can do by way of indexes to speed up that particular query. You could have a composite index on the to and seen fields, but the improvement will be minimal, if any. Why? Because the seen field has very poor cardinality. You only seem to be storing 0 or 1 in it, and indexes on such columns are not very useful. Often it would be faster for the query optimizer to read the data directly.
But here's what you can do: partition the table. Partitioning:
... enables you to distribute portions of individual tables across a
file system according to rules which you can set largely as needed. In
effect, different portions of a table are stored as separate tables in
different locations. The user-selected rule by which the division of
data is accomplished is known as a partitioning function.
You can partition your data in such a way that very old data is separated from the new. This will probably give you a big boost. However, be aware that a query that fetches old data as well as new data will be a lot slower.
Here is another thing you can do: Add a limit clause.
You are probably only showing a limited number of messages at any given time. Adding a limit clause will help: then MySQL knows that it doesn't need to look any further after it has found N rows.
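For example (the limit of 50 and the ORDER BY are illustrative; the sort assumes you want the newest messages first):

```sql
SELECT id, `from`, message, sent
FROM support_chat
WHERE `to` = ? AND seen = 0
ORDER BY sent DESC
LIMIT 50;
```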
Add a multiple column index on to and seen columns in this particular order (to column should be the 1st column in the index). Then run explain select... on your query to see if the new index is used.
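A sketch of that index and the check (the index name is illustrative):

```sql
ALTER TABLE support_chat ADD INDEX to_seen_ix (`to`, seen);

EXPLAIN SELECT id, `from`, message, sent
FROM support_chat
WHERE `to` = ? AND seen = 0;
```

The EXPLAIN output's `key` column should show `to_seen_ix` if the optimizer picks the new index.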
Assuming that the seen column stores 2 values only ('0' and '1') and that to column stores the recipient of the chat message (email, username), so it can have many more values, I'd use a composite index with seen first and to second:
ALTER TABLE support_chat
ADD INDEX seen_to_ix
(seen, `to`) ;
A composite index with reversed order (`to`, seen) would be a good choice, too. It might even be better, depending on server load and how often the table is updated. An advantage, if you decide to use the second index, is that you can remove the existing (`to`) index.
Pick and add one of the two indexes and check the performance of your queries again.
Additional notes:
Using a varchar(1) for what is essentially a boolean value is not optimal. It's even worse that it is in the utf8mb4 charset: the column can take up to 5 bytes (1 for the length prefix plus up to 4 for the single character)!
I'd change the type of that column to tinyint (and store 0 and 1) or bit.
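A sketch of that change (assuming all existing values in the column are '0' or '1', which convert cleanly):

```sql
ALTER TABLE support_chat
  MODIFY COLUMN seen TINYINT(1) NOT NULL DEFAULT 0;
```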
Please avoid using reserved words (eg, from, to) for table and column names.
I have a MySQL (5.6.26) database with a large amount of data, and I have a problem with a COUNT select on a table join.
This query takes about 23 seconds to execute:
SELECT COUNT(0) FROM user
LEFT JOIN blog_user ON blog_user.id_user = user.id
WHERE email IS NOT NULL
AND blog_user.id_blog = 1
Table user is MyISAM and contains user data like id, email, name, etc...
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(50) DEFAULT NULL,
`email` varchar(100) DEFAULT '',
`hash` varchar(100) DEFAULT NULL,
`last_login` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`) USING BTREE,
UNIQUE KEY `email` (`email`) USING BTREE,
UNIQUE KEY `hash` (`hash`) USING BTREE,
FULLTEXT KEY `email_full_text` (`email`)
) ENGINE=MyISAM AUTO_INCREMENT=5728203 DEFAULT CHARSET=utf8
Table blog_user is InnoDB and contains only id, id_user and id_blog (user can have access to more than one blog). id is PRIMARY KEY and there are indexes on id_blog, id_user and id_blog-id_user.
CREATE TABLE `blog_user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_blog` int(11) NOT NULL DEFAULT '0',
`id_user` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `id_blog_user` (`id_blog`,`id_user`) USING BTREE,
KEY `id_user` (`id_user`) USING BTREE,
KEY `id_blog` (`id_blog`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=5250695 DEFAULT CHARSET=utf8
I deleted all other tables and there is no other connection to MySQL server (testing environment).
What I've found so far:
When I delete some columns from user table, duration of query is shorter (like 2 seconds per deleted column)
When I delete all columns from user table (except id and email), duration of query is 0.6 seconds.
When I change blog_user table also to MyISAM, duration of query is 46 seconds.
When I change user table to InnoDB, duration of query is 0.1 seconds.
The question is why is MyISAM so slow executing the command?
First, some comments on your query (after fixing it up a bit):
SELECT COUNT(*)
FROM user u LEFT JOIN
blog_user bu
ON bu.id_user = u.id
WHERE u.email IS NOT NULL AND bu.id_blog = 1;
Table aliases make a query easier both to write and to read. More importantly, you have a LEFT JOIN, but your WHERE clause is turning it into an INNER JOIN. So, write it that way:
SELECT COUNT(*)
FROM user u INNER JOIN
blog_user bu
ON bu.id_user = u.id
WHERE u.email IS NOT NULL AND bu.id_blog = 1;
The difference is important because it affects choices that the optimizer can make.
Next, indexes will help this query. I am guessing that blog_user(id_blog, id_user) and user(id, email) are the best indexes.
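Those indexes might be created like this (a sketch; note that blog_user already has a unique key on (id_blog, id_user), so only the second statement may actually be new, and the index names are illustrative):

```sql
ALTER TABLE blog_user ADD INDEX blog_user_ix (id_blog, id_user);
ALTER TABLE user ADD INDEX user_id_email_ix (id, email);
```

With user(id, email) as a covering index, the count can be answered from the index alone without touching the wide table rows.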
The reason why the number of columns affects your original query is because it is doing a lot of I/O. The fewer columns then the fewer pages needed to store the records -- and the faster the query runs. Proper indexes should work better and more consistently.
To answer the real question (why is MyISAM slower than InnoDB), I can't give an authoritative answer.
But it is certainly related to one of the more important differences between the two storage engines: InnoDB supports foreign keys and MyISAM doesn't. Foreign keys are important for joining tables.
I don't know if defining a foreign key constraint will improve speed further, but for sure, it will guarantee data consistency.
Another note: you observe that the time decreases as you delete columns. This indicates that the query requires a full table scan. This can be avoided by creating an index on the email column. user.id and blog_user.id_user hopefully already have an index; if they don't, this is an error. Columns that participate in a foreign key, explicit or not, must always have an index.
This is a long time after the event to be much use to the OP and all the foregoing suggestions for speeding up the query are entirely appropriate but I wonder why no one has remarked on the output of EXPLAIN. Specifically, why the index on email was chosen and how that relates to the definition for the email column in the user table.
The optimizer has selected an index on email column, presumably because it's included in the where clause. key_len for this index is comparatively long and it's a reasonably large table given the auto_increment value so the memory requirements for this index would be considerably greater than if it had chosen the id column (4 bytes against 303 bytes). The email column is NULLABLE but has a default of the empty string so, unless the application explicitly sets a NULL, you are not going to find any NULLs in this column anyway. Neither will you find more than one record with the default given the UNIQUE constraint. The column DEFAULT and UNIQUE constraint appear to be completely at odds with each other.
Given the above, and the fact we only want the count in the query, I'd then wonder if the email part of the where clause serves any purpose other than slowing the query down as each value is compared to NULL. Without it the optimizer would probably pick the primary key and do a much better job. Better yet would be a query which ignored the user table entirely and took the count based on the covering index on blog_user that Gordon Linoff highlighted.
There's another indexing issue here worth mentioning:
On the user table
UNIQUE KEY `id` (`id`) USING BTREE,
is redundant since id is the PRIMARY KEY and therefore UNIQUE by definition.
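Dropping the redundant index is straightforward (the UNIQUE KEY is named `id` in the table definition above):

```sql
ALTER TABLE user DROP INDEX id;
```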
To answer your last question,
The question is why is MyISAM so slow executing the command?
MyISAM is dependent on the speed of your hard drive; with InnoDB, once the data has been read, it works at the speed of RAM. The first time the query runs it may be loading data from disk; the second and later runs will avoid the hard drive until the data has aged out of RAM.
Hello friends, I am working on a school database system based on PHP and MySQL. The basic structure is as below:
Table Class - details of all classes. Primary key ClassID.
Table Student - details of all students. Primary key StudentID, foreign key ClassID.
Table Semester - details of all semesters. Primary key SemesterID.
Table ClassSemester - resolves the many-to-many relation between classes and semesters. Primary key is the combination of ClassID and SemesterID, both foreign keys.
Table Subject - details of all subjects. Primary key SubjectID.
Table ClassSubject - resolves the many-to-many relation between classes and subjects. Primary key is the combination of ClassID and SubjectID, both foreign keys.
Table Marks - consists of StudentID, SubjectID, SemesterID, and marks achieved. Foreign keys StudentID, SemesterID, SubjectID.
I have also applied foreign keys in all the tables, referring back to the parent tables. I am looking to enforce integrity in my database so that a student in a particular class will automatically be assigned the subjects of that particular class.
If we try to change the subjects of a student, the database should throw an error unless those subjects belong to the class the student is part of.
I am sure this can be done using foreign key constraints. However, I am a bit naive about how to do so. A working example would be highly appreciated.
ENGINE = InnoDB
AUTO_INCREMENT = 53
DEFAULT CHARACTER SET = utf8;
Ok, I'll try to help. :-) First make sure you know the syntax completely by using the MySQL Manual for creating tables.
MySQL 5.1: CREATE TABLE
Look for the sections that look like this.
reference_definition:
REFERENCES tbl_name (index_col_name,...)
[MATCH FULL | MATCH PARTIAL | MATCH SIMPLE]
[ON DELETE reference_option] <----
[ON UPDATE reference_option] <----
reference_option:
RESTRICT | CASCADE | SET NULL | NO ACTION
Here is an example (an attempt...) from a child table of contact statistics that links to a contacts (people) parent table.
CREATE TABLE IF NOT EXISTS contactStats_tbl(
id INT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Contact ID number.',
email VARCHAR(254) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL COMMENT 'E-mail address from contacts_tbl.',
subscribeTime TIMESTAMP DEFAULT '0000-00-00 00:00:00' COMMENT 'Time of subscription.',
unsubscribeTime TIMESTAMP DEFAULT '0000-00-00 00:00:00' COMMENT 'Time of unsubscription.',
totalMessages INT(4) NOT NULL COMMENT 'Number of messages sent.',
newsLetter ENUM('Y', 'N') CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT 'N' COMMENT 'Newsletter subscription.',
CONSTRAINT csconstr01 FOREIGN KEY (id, email) REFERENCES contacts_db.contacts_tbl(id, email) ON UPDATE CASCADE ON DELETE RESTRICT)
ENGINE=InnoDB DEFAULT CHARACTER SET = utf8 COMMENT 'Contact statistics table.';
Essentially, with table constraints you are focusing on a time when someone attempts to DELETE or UPDATE a record in a child table containing fields that point to a parent table (foreign keys, in this case). For all of your child tables, my advice would be to set the ON DELETE options to RESTRICT (the default). But, for ON UPDATE, child tables should probably CASCADE to keep them consistent with their parents (I have not researched referential integrity for a while, but I think that's how it goes! Dang that MS Access! Don't vote me down if I am wrong. Just comment and I'll fix my answer. :-)). The best thing to do would be to make sure you know how referential integrity applies to the situation at hand. Truthfully, I forget how the ON UPDATE bit works because I have not used it in a while. :-)
Now, as far as automatically inserting field values into a record (in a secondary table) based on actively inserting a record into some other table (primary table), make sure that you are not in need of a trigger.
MySQL 5.1: CREATE TRIGGER
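As a sketch of that trigger idea for the schema above (table and column names follow the question's description and are otherwise illustrative; note that SIGNAL requires MySQL 5.5 or later), a BEFORE INSERT trigger on Marks could reject a mark whose subject is not assigned to the student's class:

```sql
DELIMITER //
CREATE TRIGGER marks_check_subject
BEFORE INSERT ON Marks
FOR EACH ROW
BEGIN
  DECLARE cnt INT;
  -- Count class-subject rows linking the student's class to the new subject
  SELECT COUNT(*) INTO cnt
  FROM Student s
  JOIN ClassSubject cs ON cs.ClassID = s.ClassID
  WHERE s.StudentID = NEW.StudentID
    AND cs.SubjectID = NEW.SubjectID;
  IF cnt = 0 THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'Subject does not belong to the student''s class';
  END IF;
END//
DELIMITER ;
```

A matching BEFORE UPDATE trigger would be needed to cover changes to existing marks.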
This should get you going. I tried! :-)
Anthony
I am working on a project where I want to allow the end user to basically add an unlimited amount of resources when creating a hardware device listing.
In this scenario, they can store both the quantity and types of hard-drives. The hard-drive types are already stored in a MySQL Database table with all of the potential options, so they have the options to set quantity, choose the drive type (from dropdown box), and add more entries as needed.
As I don't want to create a DB with "drive1amount", "drive1typeid", "drive2amount", "drive2typeid", and so on, what would be the best way to do this?
I've seen similar questions answered with a many-to-many link table, but can't think of how I could pull this off with that.
Something like this?
CREATE TABLE `hardware` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(256) NOT NULL,
`quantity` int(11) NOT NULL,
`hardware_type_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `type_id` (`hardware_type_id`),
CONSTRAINT `hardware_ibfk_1` FOREIGN KEY (`hardware_type_id`) REFERENCES `hardware_type` (`id`)
) ENGINE=InnoDB
hardware_type_id is a foreign key to your existing table
This way the table doesn't care what kind of hardware it is.
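Usage might look like this (a sketch; the hardware_type rows and the example names are assumptions):

```sql
-- Add two drive entries for a listing
INSERT INTO hardware (name, quantity, hardware_type_id)
VALUES ('Primary storage', 2, 1),
       ('Backup storage', 1, 3);

-- List each entry's quantity with its type name
SELECT h.quantity, t.name AS drive_type
FROM hardware h
JOIN hardware_type t ON t.id = h.hardware_type_id;
```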
Your answer depends a bit on your long-term goals for this project. If you want a data repository that profiles all the different types of hardware devices with their specifications, I suggest you maintain a table for each different type of hardware. For example, you would have a harddisk table consisting of all the different models and types of hard disks out there. Then you can assign a record from this specific table to the host configuration table. You can build the dataset as you go from user input.
If this is not clear to you, let me know and I will create a diagram and upload it for you.
I've got a friendship table between users that looks like this.
CREATE TABLE user_relations (
pkUser1 INTEGER UNSIGNED NOT NULL,
pkUser2 INTEGER UNSIGNED NOT NULL,
pkRelationsType TINYINT UNSIGNED NOT NULL,
PRIMARY KEY(pkUser1,pkUser2),
FOREIGN KEY(pkuser1) references users(ID),
FOREIGN KEY(pkuser2) references users(ID),
FOREIGN KEY(pkRelationsType) references user_relations_type(ID)
);
pkRelationsType is a pointer to another table that defines the kind of relation the users have (friendship(1),pending(2) or blocked(3))
If user 1 is friend with user 2 I've got only one instance |1|2|1| and NOT also |2|1|1|.
The thing is, in order to block a user I have to keep in mind the relation can be already made (users can be already friends or even have the pending friendship petition) so I am trying to insert the data or update it if the relation does not exist already.
I have this for the friendship request send, but this just ignores the insert if the data already exists.
INSERT INTO
user_relations(pkUser1,pkUser2,pkRelationsType)
SELECT * FROM (SELECT :sender0,:target0,2) AS tmp
WHERE NOT EXISTS
(SELECT pkUser1 FROM user_relations
WHERE
(pkUser1=:sender1 AND pkUser2=:target1) OR (pkUser1=:sender2 AND pkUser2=:target2) LIMIT 1)
Due to the nature of the table I cannot use INSERT ... ON DUPLICATE KEY UPDATE.
I've been thinking about handling it with PHP, searching for the relation and its order if it exists and then doing one thing or another, but it seems like a waste of processing.
Please note that I'm not a MYSQL expert even though I've handled myself so far.
Hope I have explained myself well enough.
Thanks for the feedback.
From your description, it seems that you are only keeping the "latest" relationship. If this is the case, why don't you DELETE the relationship first, then INSERT the new one?
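A sketch of that approach, wrapped in a transaction so the two statements act as one (the placeholders follow the question's style, and 3 stands for the "blocked" relation type per the question):

```sql
START TRANSACTION;

-- Remove any existing relation, in either direction
DELETE FROM user_relations
WHERE (pkUser1 = :sender AND pkUser2 = :target)
   OR (pkUser1 = :target AND pkUser2 = :sender);

-- Insert the new relation (3 = blocked)
INSERT INTO user_relations (pkUser1, pkUser2, pkRelationsType)
VALUES (:sender, :target, 3);

COMMIT;
```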