In MySQL, I have a row for each user, with a column that contains friends' names separated by \n.
E.g.
Friend1
Friend2
Friend3
I'd like to be able to quickly find all the users whose Friends field contains Friend2.
I've found FIND_IN_SET, but that only works with comma separators, and the data can contain commas and foreign characters.
Obviously, searching with regular expressions and the like will be slow. I'm new to cross-referencing, so I'd love some help on the best way to structure the data so that it can be searched quickly.
Thanks in advance.
Edit: OK, I forgot to mention that the data is coming from a
game where friends' names are stored locally and there are no links to
another user's ID. Thus the strings. Every time they connect I am given
a dump of their friends' names, which I use in the background to help match games.
The most common structure for this kind of data is an extra table, i.e.
user
id,
name,
email,
etc.
user_friend
user_id
friend_id
Querying this is then a matter of filtering the user_friend table, e.g.
List the IDs of all of a user's friends:
SELECT friend_id
FROM user_friend
WHERE user_id = :theUser
Edit: Regarding the OP's edit: just storing the names is possible too. In this case the table structure would become:
user_friend
user_id
friend_name
and the query:
SELECT friend_name
FROM user_friend
WHERE user_id = :theUser
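The question asked for the reverse lookup: which users list a given friend name. A minimal sketch of that, assuming Python's sqlite3 module as a stand-in for MySQL; the index name idx_friend_name and the helper functions are hypothetical, only the user_friend table comes from the answer above.

```python
import sqlite3

def make_db():
    """Create an in-memory database with the user_friend table from the answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_friend ("
        " user_id INTEGER NOT NULL,"
        " friend_name TEXT NOT NULL,"
        " PRIMARY KEY (user_id, friend_name))"
    )
    # Hypothetical index name; it lets a lookup by friend_name avoid a full scan.
    conn.execute("CREATE INDEX idx_friend_name ON user_friend (friend_name)")
    return conn

def store_friend_dump(conn, user_id, dump):
    """Replace a user's friends with the names in a newline-separated dump."""
    names = {line.strip() for line in dump.split("\n") if line.strip()}
    conn.execute("DELETE FROM user_friend WHERE user_id = ?", (user_id,))
    conn.executemany(
        "INSERT INTO user_friend (user_id, friend_name) VALUES (?, ?)",
        [(user_id, name) for name in names],
    )

def users_with_friend(conn, friend_name):
    """Return the ids of all users whose friend list contains friend_name."""
    rows = conn.execute(
        "SELECT user_id FROM user_friend WHERE friend_name = ? ORDER BY user_id",
        (friend_name,),
    )
    return [row[0] for row in rows]

conn = make_db()
store_friend_dump(conn, 1, "Friend1\nFriend2\nFriend3")
store_friend_dump(conn, 2, "Friend2\nSmith, John")  # commas in names are fine
store_friend_dump(conn, 3, "Friend3")
print(users_with_friend(conn, "Friend2"))
```

Because each name is its own row, commas and foreign characters need no special handling, and the equality match can use the index instead of a regular expression.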
Why are you keeping friend names as text? This will be inefficient to edit if, say, a user removes a friend or changes their name. That's another thing: you should reference friends by an auto_increment id key in your database rather than by name. Searching for an integer is generally faster than searching for a string, especially in a very large database. You should set up a friends table like this:
Column 1: connectionid (auto_increment key)
Column 2: user1id (int)
Column 3: user2id (int)
Column 4: date added (date)
etc.
Then you can search the connection table above for all rows where the user is user1id or user2id and get a list of the other users from that.
My database hasn't been filled yet so I can easily change the format and structure in which the data will be stored.
Yes, you need to normalize your database a bit. With the current structure, your searches will be quite slow and the data will take up more space.
Check out this wiki for detailed help on normalization.
You can keep the friends table and the users table separate, link them with a foreign key constraint, and query them together with inner joins.
The structure would be:
Users table
id: AUTO_INCREMENT PK
name
other columns
Friends table
id: AUTO_INCREMENT (not required, but good for partitioning)
UserID
FriendsID
DateAdded
OtherInfo, if required
Okay, so I'm new to databases and have created a site with a users table. I also have a list table where users can insert list items. However, when they log in, everyone's list appears. How can I link the users table to the lists table? Is it done by creating the same field in each one and using a foreign key? Sorry, I am very new to this. I appreciate any help.
I think you can fix this by having a user_id column in both tables. Let me give an example:
TableA (user_id, username, password)
TableB (list_item_id, user_id, any_other_attribute)
When you design your tables like this, a simple SQL query will do what you need:
SELECT list_item_id, any_other_attribute FROM TableB WHERE user_id = $user_id
where $user_id is the user_id of the user who is logged in to your system.
Also, judging by your question, I suggest you read about 'sessions', 'SQL queries' and 'generating SQL query results' in your programming language of choice.
This is called a many-to-many relationship. There must be one table with the fields user_id and field_id that joins these two tables.
I have a question about regular expressions. I have designed a table that contains three columns; one column contains member IDs separated by commas. Here is my table structure:
send_id member_id
1 1211,23,34
2 1,23
I want to select only the send_id 2 row, which contains 1 as a member_id.
This is the query that I am using:
SELECT * FROM table WHERE column REGEXP '^[1]+$';
but this query is giving me both rows. Please help me.
With regards,
Rahul
Never store separate values in one column.
Normalize your structure like this:
send_id member_id
1 1211
1 23
1 34
2 1
2 23
If you still want your regex, then it will be
SELECT * FROM t WHERE column REGEXP '(^|[^0-9])1([^0-9]|$)'
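To see what that pattern accepts, here is a small sketch that applies the same expression with Python's re module; re.search stands in for REGEXP's unanchored matching, and the function name contains_member is made up for the illustration.

```python
import re

# Same pattern as in the query above: a 1 that is not adjacent to another digit.
PATTERN = re.compile(r"(^|[^0-9])1([^0-9]|$)")

def contains_member(member_ids):
    """Return True if the comma-separated list contains the id 1 as a whole value."""
    return PATTERN.search(member_ids) is not None

for value in ["1211,23,34", "1,23", "23,1", "21,31"]:
    print(value, contains_member(value))
```

Only the lists that hold 1 as a complete ID are accepted; 1211, 21 and 31 are rejected because the 1 sits next to another digit.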
First, you should be normalizing your data so you're not in this horrible mess in the first place. Here's a good resource explaining normalization.
Second, I believe your problem lies with your regular expression. Try this instead:
SELECT * FROM table WHERE column REGEXP '^[1]$';
The regular expression you're using contains [1]+. The + means [1] has to match one or more times, so the pattern accepts values such as 1, 11 or 111. Removing the + means it will match [1] exactly once.
However, that still won't fix your problem: because of the ^ and $ anchors, the pattern only matches a column whose entire value is 1, and your rows hold lists such as 1,23. This is why normalization is so important.
Having multiple values inside a column isn't a good practice for designing a DB.
You should normalize your data, i.e., put just one piece of atomic information inside each element of your table.
You can find more information about this on Wikipedia:
http://en.wikipedia.org/wiki/Database_normalization
As others have told you, the ideal solution would be to normalize your data; I think Alma Do Mundo's answer explains it quite well.
If you want to use REGEXP anyway, you have to take four cases into account: the id is the only one, the id is the first, the id is in the middle, and the id is at the end. I have used id=74 for the example:
SELECT * FROM table WHERE member_id REGEXP '(^74$|^74,|,74,|,74$)';
Depending on your requirements, you should normalize your data, i.e. make three tables: one with the send ID, one with the member ID, and one that combines the two. You can then link them with INNER JOINs.
However, if you are not going to normalize, you can use WHERE member_id LIKE '%1%' to pull in every candidate row. This also matches values such as 1211, so the application has to filter out the false matches.
In any case, if you're not going to normalize the data, you will have to use the front end to filter the results.
An example of the inner join syntax would look like this:
SELECT * FROM SendTable
JOIN Send_Member ON SendTable.send_id = Send_Member.send_id
JOIN Member ON Member.member_id = Send_Member.member_id
WHERE Member.member_id = 1;
where the schema looks like:
SendTable:
send_id (primary key)
...other fields
Send_Member:
send_id (primary key and foreign key to SendTable)
member_id (primary key and foreign key to Member)
...any fields you might want that are relevant to the particular SendTable and Member link
Member:
member_id (primary key)
...other fields
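The move from the comma-separated column to this schema can be sketched as follows. This is only an illustration under assumptions: sqlite3 stands in for MySQL, the legacy rows are the two from the question, and the helper sends_for_member is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SendTable (send_id INTEGER PRIMARY KEY);
    CREATE TABLE Member (member_id INTEGER PRIMARY KEY);
    CREATE TABLE Send_Member (
        send_id INTEGER NOT NULL REFERENCES SendTable (send_id),
        member_id INTEGER NOT NULL REFERENCES Member (member_id),
        PRIMARY KEY (send_id, member_id)
    );
""")

# The two rows from the question, still in comma-separated form.
legacy_rows = [(1, "1211,23,34"), (2, "1,23")]

for send_id, csv_ids in legacy_rows:
    conn.execute("INSERT INTO SendTable (send_id) VALUES (?)", (send_id,))
    for member_id in (int(part) for part in csv_ids.split(",")):
        # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
        conn.execute("INSERT OR IGNORE INTO Member (member_id) VALUES (?)", (member_id,))
        conn.execute(
            "INSERT INTO Send_Member (send_id, member_id) VALUES (?, ?)",
            (send_id, member_id),
        )

def sends_for_member(member_id):
    """Return the send_ids linked to one member, using the join from the answer."""
    rows = conn.execute(
        "SELECT SendTable.send_id FROM SendTable"
        " JOIN Send_Member ON SendTable.send_id = Send_Member.send_id"
        " JOIN Member ON Member.member_id = Send_Member.member_id"
        " WHERE Member.member_id = ? ORDER BY SendTable.send_id",
        (member_id,),
    )
    return [row[0] for row in rows]

print(sends_for_member(1))
print(sends_for_member(23))
```

After the split, member 1 matches only send_id 2, with no regular expression and no filtering in the application.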
We are creating a website where users can create a certain kind of profile. At the moment we already have about 662000 profiles (records in our database). The user can link certain keywords (divided into 5 categories) to their profile. They can link up to about 1250 keywords per category (no, this isn't nonsense; for certain profiles this would actually make sense). At the moment we save these keywords in an array and insert the serialized array into the profile's record in the database.
When a different user uses the search function and searches for one of the keywords, an SQL query is executed with WHERE keyword LIKE '%keyword%'. This means that it has to go through a pretty big number of records and scan the entire serialized array for each record. Adding an index to the keyword columns is pretty tricky, since they don't have a defined max length (this could be 22000+ chars!).
Is there any other more sensible and practical way to go about this?
Thanks!
Never, never, never store multiple values in one column!
Use a mapping table
user_keywords TABLE
--------------------
user_id INT
keyword_id INT
users TABLE
---------------------
id INT
name VARCHAR
...
keywords TABLE
---------------------
id INT
name VARCHAR
...
You could then return all users having a specific keyword in their profile like this
select u.*
from users u
inner join user_keywords uk on uk.user_id = u.id
inner join keywords k on uk.keyword_id = k.id
where k.name = 'keyword_name'
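A minimal sketch of how profiles could be written into and searched through this mapping, assuming sqlite3 as a stand-in for MySQL. The index name, the sample profiles and the helpers add_profile and users_with_keyword are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE keywords (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE user_keywords (
        user_id INTEGER NOT NULL REFERENCES users (id),
        keyword_id INTEGER NOT NULL REFERENCES keywords (id),
        PRIMARY KEY (user_id, keyword_id)
    );
    -- Lookups start from the keyword, so index that side of the mapping too.
    CREATE INDEX idx_user_keywords_keyword ON user_keywords (keyword_id);
""")

def add_profile(user_id, name, words):
    """Store one profile and link each of its keywords through user_keywords."""
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
    for word in set(words):
        # Create the keyword only if it does not exist yet (SQLite syntax).
        conn.execute("INSERT OR IGNORE INTO keywords (name) VALUES (?)", (word,))
        keyword_id = conn.execute(
            "SELECT id FROM keywords WHERE name = ?", (word,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO user_keywords (user_id, keyword_id) VALUES (?, ?)",
            (user_id, keyword_id),
        )

def users_with_keyword(word):
    """Return the names of users whose profile carries exactly this keyword."""
    rows = conn.execute(
        "SELECT u.name FROM users u"
        " INNER JOIN user_keywords uk ON uk.user_id = u.id"
        " INNER JOIN keywords k ON uk.keyword_id = k.id"
        " WHERE k.name = ? ORDER BY u.name",
        (word,),
    )
    return [row[0] for row in rows]

add_profile(1, "alice", ["php", "mysql"])
add_profile(2, "bob", ["mysql", "solr"])
print(users_with_keyword("mysql"))
```

Each keyword is stored once, so its name column is short enough to index, and the search becomes an equality match instead of a LIKE scan over serialized arrays.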
Since you are dealing with a large data set, you could consider NoSQL systems such as Hadoop/HBase, Cassandra, etc. You should also take a look at Lucene/Solr...
http://nosql-database.org/
I have a publications database and I need to fetch some information regarding the author. The authors have been lumped together in one field, e.g. if a book has two authors called Robert Ludlum and John Grisham, it is saved in the database as Ludlum, R.;Grisham,J.;
My application needs to spool information and retrieve data on books authored by a particular author if they click on their name. I am using this statement to retrieve the data
$select = "SELECT tblPublications.Title, tblPublications.Year FROM tblPublications WHERE tblPublications.Authors LIKE '%$sname%'";
$sname is a variable referring to the surname of the author. The problem arises if two authors share the same surname. A workaround I am trying to implement is to have the application take the surname, insert a comma, take the first letter of the author's first name, combine the result into a string, and match it against each delimited value in the author field. E.g. if it is Grisham's books I am looking for, I use *Grisham, J.* in my query.
Any idea how to do this in PHP and MySQL?
If it is possible to redesign the database, you should probably have an authors table and a book_authors table that relates books to authors, so that multiple authors can be associated with each book. Where is the Last Name coming from that the user clicks? Is it possible to have the link generated be LastName, First letter of first name? If so then you can probably change the link so it will include the first letter. But it is still possible to have two authors with the same last name and first letter of first name. So I think the best solution is to have an authors table and a Book_authors table and just store the author id as a hidden field and use that to retrieve the books by the selected author.
Your database design is incorrect, you have not normalized the data.
If you use LIKE to search with leading wildcards, you will kill any chance of using an index.
Your only option to fix this (if you want to keep the mistaken CSV data) is to put a FULLTEXT index on the authors field. In MySQL versions before 5.6 that means converting the table to MyISAM format; from 5.6 onward InnoDB supports FULLTEXT indexes as well.
You can then search for an author using
SELECT fielda, b,c FROM table1 WHERE MATCH(authors) against ('$lastname')
See: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
Of course a better option would be to normalize the database and create a separate table for authors with a link table.
TABLE books
------------
id primary key
other book data
TABLE authors
--------------
id primary key
lastname varchar (indexed)
other author data
TABLE author_book_link
----------------------
author_id
book_id
PRIMARY KEY ab (author_id, book_id)
Now you can query using very fast indexes using something like:
SELECT b.name, b.ISBN, a.lastname
FROM books b
INNER JOIN author_book_link ab ON (ab.book_id = b.id)
INNER JOIN authors a ON (a.id = ab.author_id)
WHERE a.lastname = '$lastname'
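The query interpolates $lastname straight into the SQL text. A hedged sketch of the same join with a bound parameter instead, assuming sqlite3 as a stand-in for MySQL; the title column, the index name, the sample rows and the helper titles_by_lastname are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, lastname TEXT NOT NULL);
    CREATE INDEX idx_authors_lastname ON authors (lastname);
    CREATE TABLE author_book_link (
        author_id INTEGER NOT NULL REFERENCES authors (id),
        book_id INTEGER NOT NULL REFERENCES books (id),
        PRIMARY KEY (author_id, book_id)
    );
    -- Placeholder data for the illustration only.
    INSERT INTO books VALUES (1, 'Book A'), (2, 'Book B');
    INSERT INTO authors VALUES (1, 'Grisham'), (2, 'O''Brien');
    INSERT INTO author_book_link VALUES (1, 1), (2, 1), (2, 2);
""")

def titles_by_lastname(lastname):
    """Return book titles for an author surname, bound as a query parameter."""
    rows = conn.execute(
        "SELECT b.title FROM books b"
        " INNER JOIN author_book_link ab ON ab.book_id = b.id"
        " INNER JOIN authors a ON a.id = ab.author_id"
        " WHERE a.lastname = ? ORDER BY b.title",
        (lastname,),
    )
    return [row[0] for row in rows]

print(titles_by_lastname("Grisham"))
print(titles_by_lastname("O'Brien"))  # the apostrophe needs no manual escaping
```

With binding, the surname never becomes part of the SQL text, so a quote in a name cannot break or alter the query. In PHP the equivalent would be a prepared statement in mysqli or PDO.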
It would entirely depend on what input you are getting from the user.
If the user just types a name, then there isn't much you can do (as there is no guarantee that they will enter it in a correct format for you to parse).
If you are getting them to type in a firstname and lastname however, something like this could be done:
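<?php
// Note: the mysql_* functions were removed in PHP 7; with mysqli or PDO,
// prefer a prepared statement over string interpolation.
$firstname = trim($_GET['fname']);
$surname = trim($_GET['sname']);
// Build "Surname, F" from the trimmed input, then escape the finished string once.
$firstletter = substr($firstname, 0, 1);
$sname = mysql_real_escape_string($surname . ', ' . $firstletter);
$select = "SELECT tblPublications.Title,
tblPublications.Year
FROM tblPublications
WHERE tblPublications.Authors LIKE '%$sname%'";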
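The stored example, Ludlum, R.;Grisham,J.;, has inconsistent spacing after the comma, so a LIKE on 'Grisham, J' would miss 'Grisham,J.'. One way around that, sketched here in Python purely as an illustration, is to normalise both sides before comparing whole entries. The function names are hypothetical.

```python
def author_key(surname, firstname):
    """Build a normalised "surname,f" key from a surname and a first name."""
    return f"{surname.strip().lower()},{firstname.strip()[:1].lower()}"

def stored_keys(authors_field):
    """Split a field such as 'Ludlum, R.;Grisham,J.;' into normalised keys."""
    keys = set()
    for entry in authors_field.split(";"):
        if "," not in entry:
            continue  # skips the empty piece after the trailing semicolon
        surname, initials = entry.split(",", 1)
        keys.add(f"{surname.strip().lower()},{initials.strip()[:1].lower()}")
    return keys

def has_author(authors_field, surname, firstname):
    """Return True if the field lists this author as a whole entry."""
    return author_key(surname, firstname) in stored_keys(authors_field)

field = "Ludlum, R.;Grisham,J.;"
print(has_author(field, "Grisham", "John"))
print(has_author(field, "Grisham", "Robert"))
```

This still cannot tell apart two authors who share a surname and a first initial, which is the case the normalized authors table handles.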
So I have a database with a few tables.
The first table contains the user ID, first name and last name.
The second table contains the user ID, interest ID, and interest rating.
There is another table that has all of the interest ID's.
For every interest ID (even when new ones are added), I need to make sure that each user has an entry for that interest ID (even if it's blank, or has defaults).
Will foreign keys help with this scenario? Or will I need to use PHP to update each and every record when I add a new key?
Foreign keys are a kind of constraint, so they can only fail when you attempt to add records.
You can accomplish what you are describing with a trigger. I don't know the MySQL syntax, but in SQL Server it would look something like this:
CREATE TRIGGER TR_ensure_user_interest ON interest FOR INSERT, UPDATE AS
BEGIN
INSERT user_interest (user_id, interest_id)
SELECT u.user_id, i.interest_id
FROM inserted i
CROSS JOIN [user] u
EXCEPT
SELECT user_id, interest_id
FROM user_interest
END
Note that this is a rather inefficient approach, but it should cover many of the cases you're concerned about.
UPDATE: I agree with the others who have observed the design "smell" here. If you can accomplish the required result using JOIN queries, that would be a much more efficient solution. However, I was trying to answer the question actually asked. (Plus, I have been in this situation, where physical records are helpful to other database users who are not adept at compound queries.)
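For comparison, the trigger idea can be tried out end to end in SQLite from Python. This is only a sketch under assumptions: SQLite's trigger syntax differs from both SQL Server and MySQL, and the table and column names here (users, interests, user_interests) are chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE interests (interest_id INTEGER PRIMARY KEY, interest TEXT);
    CREATE TABLE user_interests (
        user_id INTEGER NOT NULL,
        interest_id INTEGER NOT NULL,
        interest_level INTEGER,
        PRIMARY KEY (user_id, interest_id)
    );
    -- After each new interest, add a blank row for every existing user.
    CREATE TRIGGER tr_ensure_user_interest AFTER INSERT ON interests
    BEGIN
        INSERT OR IGNORE INTO user_interests (user_id, interest_id)
        SELECT user_id, NEW.interest_id FROM users;
    END;
""")

conn.execute("INSERT INTO users (user_id, name) VALUES (1, 'ann'), (2, 'ben')")
conn.execute("INSERT INTO interests (interest_id, interest) VALUES (10, 'chess')")

rows = conn.execute(
    "SELECT user_id, interest_id, interest_level FROM user_interests ORDER BY user_id"
).fetchall()
print(rows)
```

This trigger only covers new interests. A user added later would need a matching trigger on the users table, which is part of why the approach is costly to maintain.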
"For every interest ID (even when new ones are added), I need to make sure that each user has an entry for that interest ID (even if its blank, or has defaults)."
It sounds like you need an OUTER JOIN (either LEFT or RIGHT) in one of your queries instead.
For example, if you wanted to get the level of interest a particular person has for each interest:
Assuming your tables look like this:
users:
user_id PK
user
user_interests:
user_id PK FK
interest_id PK FK
interest_level
interests:
interest_id PK
interest
SELECT i.interest, ui.interest_level
FROM interests i
LEFT JOIN user_interests ui
ON ui.interest_id = i.interest_id
AND ui.user_id = ?
? is a placeholder.
Note that ui.interest_level will be null for interests with no data.
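A runnable sketch of the outer-join idea, assuming sqlite3 as a stand-in for MySQL. It starts from interests and LEFT JOINs user_interests with the user filter inside the ON clause, so unrated interests survive with a null level. The sample rows, the COALESCE default and the helper interest_levels are additions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user TEXT);
    CREATE TABLE interests (interest_id INTEGER PRIMARY KEY, interest TEXT);
    CREATE TABLE user_interests (
        user_id INTEGER NOT NULL REFERENCES users (user_id),
        interest_id INTEGER NOT NULL REFERENCES interests (interest_id),
        interest_level INTEGER,
        PRIMARY KEY (user_id, interest_id)
    );
    INSERT INTO users VALUES (1, 'ann');
    INSERT INTO interests VALUES (10, 'chess'), (20, 'golf');
    INSERT INTO user_interests VALUES (1, 10, 5);
""")

def interest_levels(user_id, default=None):
    """Return (interest, level) for every interest, rated by the user or not."""
    rows = conn.execute(
        "SELECT i.interest, COALESCE(ui.interest_level, ?)"
        " FROM interests i"
        " LEFT JOIN user_interests ui"
        "   ON ui.interest_id = i.interest_id AND ui.user_id = ?"
        " ORDER BY i.interest",
        (default, user_id),
    )
    return rows.fetchall()

print(interest_levels(1))
print(interest_levels(1, default=0))
```

Putting the user filter in the ON clause rather than in WHERE is what keeps the unrated interests in the result. No placeholder rows have to be stored, and a newly added interest shows up for every user automatically.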
It sounds like you are forcing your physical design to mirror your logical design too tightly.
Maybe it would be a good idea to rethink exactly why you need to insert a row for every user in the physical table. Couldn't you just write your queries to assume the default value for an interestID if there isn't an associated interestID for a given user?
"Will foreign keys help with this scenario?"
No.
Your constraint is a sort of "completeness" constraint. It implies that for each new Interest added, there must be as many rows added to the USER_INTEREST table as there are users.
No SQL system is able to enforce that for you. It's up to you to enforce it through code.