SQL query to search for matches within derived table - php

I have three tables. The first table ("MainTable") contains rows with data and a primary key (call it "MainKey"). The other two tables are child tables with two columns each.
The first child table, "StatesTable", contains a unique combination of an ID (matches "MainKey") and a column, "State", which is the abbreviated value of a state (e.g., WA, CA).
The second child table, "CategoriesTable", contains a unique combination of an ID (matches "MainKey") and a column, "Category", which holds one of various categories (e.g., "Lawyer", "Engineer", "Teacher").
What I'm trying to achieve is to get all matches in "MainTable" for various posted queries of the two child tables. For instance, if a user selects three states (WA, CA, MT) and two categories ("Lawyer", "Engineer") then the query should return all matches in "MainTable" where the primary key matches that of the child tables.
I have the queries working for the child tables to retrieve a distinct result but I'm not certain how to use these derived tables in the main query to search for all matches of the primary MainKey "WITHIN" the derived table results.
Here is my query for the "States" child table as an example. Any help is appreciated, thanks.
SELECT DISTINCT MainKey
FROM (
SELECT *
FROM `StatesTable`
WHERE State IN ('WA','CA','MT')
) t

Assuming you want only Lawyers or Engineers who are in WA, CA, or MT, this query using two IN expressions should give you the results you want:
SELECT *
FROM MainTable
WHERE MainKey IN (SELECT DISTINCT MainKey
FROM StatesTable
WHERE State IN ('WA', 'CA', 'MT'))
AND MainKey IN (SELECT DISTINCT MainKey
FROM CategoriesTable
WHERE Category IN ('Lawyer', 'Engineer'))
If you want people who are Lawyers or Engineers, OR who live in WA, CA or MT, just change the AND in the WHERE clause to OR.
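As a rough sketch of how the posted selections could be bound into that query, here is a small Python example using the standard-library sqlite3 module in place of MySQL. The Name column, the sample rows, and the find_matches helper are assumptions made up for illustration; only the table and key names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MainTable (MainKey INTEGER PRIMARY KEY, Name TEXT);  -- Name is a made-up column
CREATE TABLE StatesTable (MainKey INTEGER, State TEXT, UNIQUE (MainKey, State));
CREATE TABLE CategoriesTable (MainKey INTEGER, Category TEXT, UNIQUE (MainKey, Category));
INSERT INTO MainTable VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO StatesTable VALUES (1, 'WA'), (1, 'CA'), (2, 'NY'), (3, 'MT');
INSERT INTO CategoriesTable VALUES (1, 'Lawyer'), (2, 'Engineer'), (3, 'Teacher');
""")

def find_matches(states, categories, combine="AND"):
    """Return MainKey values that satisfy the state filter AND/OR the category filter."""
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    # One "?" placeholder per selected value, so user input is never spliced into the SQL.
    state_marks = ", ".join("?" * len(states))
    category_marks = ", ".join("?" * len(categories))
    sql = f"""
        SELECT MainKey FROM MainTable
        WHERE MainKey IN (SELECT MainKey FROM StatesTable WHERE State IN ({state_marks}))
        {combine} MainKey IN (SELECT MainKey FROM CategoriesTable WHERE Category IN ({category_marks}))
        ORDER BY MainKey
    """
    return [row[0] for row in conn.execute(sql, [*states, *categories])]

both = find_matches(["WA", "CA", "MT"], ["Lawyer", "Engineer"])
either = find_matches(["WA", "CA", "MT"], ["Lawyer", "Engineer"], combine="OR")
print(both, either)  # [1] [1, 2, 3]
```

Only the number of placeholders and the AND/OR keyword are built into the SQL text; the selected values themselves are passed as parameters.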

Related

Speed-up/Optimise MySQL statement - finding a new row that hasn't been selected before

First a bit of background about the tables & DB.
I have a MySQL db with a few tables in:
films:
Contains all film/series info with netflixid as a unique primary key.
users:
Contains user info; "ratingid" is a unique primary key
rating:
Contains ALL user rating info, netflixid and a unique primary key of a compound "netflixid-userid"
This statement works:
SELECT *
FROM films
WHERE
INSTR(countrylist, 'GB')
AND films.netflixid NOT IN (SELECT netflixid FROM rating WHERE rating.userid = 1)
LIMIT 1
but it takes longer and longer to retrieve a new film record that you haven't rated. (currently at 6.8 seconds for around 2400 user ratings on an 8000 row film table)
First I thought it was the INSTR(countrylist, 'GB'), so I split them out into their own tinyint columns - made no difference.
I have tried NOT EXISTS as well, but the times are similar.
Any thoughts/ideas on how to select a new "unrated" row from films quickly?
Thanks!
Try just joining?
SELECT *
FROM films
LEFT JOIN rating on rating.ratingid=CONCAT(films.netflixid,'-',1)
WHERE
INSTR(countrylist, 'GB')
AND rating.ratingid IS NULL
LIMIT 1
Or doing the equivalent NOT EXISTS.
I would recommend not exists:
select *
from films f
where
instr(countrylist, 'GB')
and not exists (
select 1 from rating r where r.userid = 1 and f.netflixid = r.netflixid
)
This should take advantage of the primary key index of the rating table, so the subquery executes quickly.
That said, the instr() function in the outer query also represents a bottleneck. The database cannot take advantage of an index here, because of the function call: basically it needs to apply the computation to the whole table before it is able to filter. To avoid this, you would probably need to review your design: that is, have a separate table to represent the relationship between movies and countries, with each tuple on a separate row; then, you could use another exists subquery to filter on the country.
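A minimal sketch of that redesign, using Python's standard-library sqlite3 module rather than MySQL. The film_country table, the title column, the composite key on rating, and the next_unrated helper are all assumptions for illustration; the question does not show the real column layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE films (netflixid INTEGER PRIMARY KEY, title TEXT);
-- Hypothetical bridge table: one row per (film, country) pair replaces countrylist.
CREATE TABLE film_country (
    netflixid INTEGER NOT NULL,
    country   TEXT NOT NULL,
    PRIMARY KEY (country, netflixid)
);
-- Assumed layout for rating: a composite key, so lookups by user and film are indexed.
CREATE TABLE rating (
    userid    INTEGER NOT NULL,
    netflixid INTEGER NOT NULL,
    PRIMARY KEY (userid, netflixid)
);
INSERT INTO films VALUES (10, 'A'), (20, 'B'), (30, 'C');
INSERT INTO film_country VALUES (10, 'GB'), (20, 'GB'), (20, 'US'), (30, 'US');
INSERT INTO rating VALUES (1, 10);
""")

def next_unrated(userid, country):
    """Return one film available in `country` that `userid` has not rated, or None."""
    return conn.execute(
        """
        SELECT f.netflixid, f.title
        FROM films f
        WHERE EXISTS (SELECT 1 FROM film_country fc
                      WHERE fc.netflixid = f.netflixid AND fc.country = ?)
          AND NOT EXISTS (SELECT 1 FROM rating r
                          WHERE r.userid = ? AND r.netflixid = f.netflixid)
        ORDER BY f.netflixid
        LIMIT 1
        """,
        (country, userid),
    ).fetchone()

print(next_unrated(1, "GB"))  # (20, 'B')
```

Both subqueries compare plain columns, so each one can be answered from an index instead of a function call on every row.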
The INSTR(countrylist, 'GB') could be changed to countrylist = 'GB', or to countrylist LIKE '%GB%' if countrylist contains more than one country.
Also, don't select every column with * if you only need a few of them. Depending on the number of columns, the query could be really slow.

MySQL Database: How far to Normalize / Queries VS Join / Unique Index

Lately I found myself designing a database. The database consists of several tables (InnoDB):
Table 1: Country (id , country_name)
Table 2: City (id, city_name , countryid)
Table 3: Users (id , cityid , A , B, C, D, E)
On the Users table, A, B, C, D and E are some characteristics of the user. Characteristic A combined with cityid must be unique, which is why I created a unique index on these two columns:
CREATE UNIQUE INDEX idx_user ON Users(cityid , A);
The remaining columns B, C, D and E are other user characteristics (for example hair color, height, weight, etc.) whose values, as you can imagine, will be repeated across rows (hair color = black, or weight = 75 kg).
At the same time, countryid and cityid are configured as foreign keys with ON UPDATE CASCADE and ON DELETE CASCADE.
Search will be based on cityid and A columns. A drop down menu to select the city (hence cityid) and a text box to insert the characteristic A and then hit SEARCH button.
My questions are:
On the Users table, I have repeating data in the same column (columns B, C, D and E). This is against 2NF. Do I have to create a separate table for each of these columns and then add a foreign key for each of these tables to the Users table in order to achieve 2NF?
Table B (id, Bchar)
Table C (id, Cchar)
Table D (id, Dchar)
Table E (id, Echar)
Users (id, cityid, A, Bid, Cid, Did, Eid)
For the time being I will not use columns B, C, D and E as search data; I will only display them after searching by cityid and A. If (in the future) I decide that I need to display all Users that live in a given cityid and have black hair, what do I have to keep in mind now while designing the database?
On the one hand we have DML (INSERT, UPDATE, DELETE) and on the other hand querying (SELECT). DML will work faster on normalized DBs and querying on denormalized DBs. Is there a middle solution?
Will the UNIQUE INDEX created above be enough to ensure uniqueness for the combination of the data in columns cityid and A? Do I need to further restrict it using JavaScript or, better, PHP?
Multiple Queries VS Joins:
Normalizing the database will require multiple queries or a single query with joins. In the case where "The user searches for a user from Madrid with characteristic A":
a) Multiple queries:
i) Go to City table and find the id of Madrid (for example, id = 2 )
ii) Given the Madrid id and the input for characteristic A, go to Users table and SELECT * FROM Users WHERE cityid="2" AND A="characteristic";
b) INNER JOIN:
i) SELECT City.city_name, Users.B, Users.C FROM City INNER JOIN Users ON Users.cityid = City.id;
Which one should I prefer?
Thanks in advance.
Your tables are already in 2NF. The condition for 2NF is that there should be no partial dependency. For example, take your Users table: the user id is the primary key, and (cityid, A) is another key, more appropriately called a candidate key, with which you can uniquely identify a row in the table. Your table would not be in 2NF if cityid or A alone were enough to uniquely retrieve B, C, D or E, but in your case one needs both (cityid, A) to retrieve a unique record, and hence it's already normalized.
Note:
Your tables are not in 3NF. The condition for 3NF is no transitive dependency. Take the Users table: the user id is the primary key, with it you can get a unique (cityid, A) pair, and in turn you can get a unique (B, C, D, E) record with the (cityid, A) pair obtained from the user id. In short, if A->B and B->C, then indirectly A->C, which is called a transitive dependency. It's present in your Users table, and hence the table is not in 3NF.
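On the separate question of whether the unique index is enough: the database itself rejects a second row with the same (cityid, A) pair, so application code only has to handle the resulting error. The sketch below shows this with Python's standard-library sqlite3 module instead of MySQL and PHP; the add_user helper and the sample values are made up for illustration. The two indexed columns are declared NOT NULL here because a unique index treats NULL values as distinct from each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id INTEGER PRIMARY KEY, cityid INTEGER NOT NULL, A TEXT NOT NULL, B TEXT);
CREATE UNIQUE INDEX idx_user ON Users(cityid, A);
""")

def add_user(cityid, a, b):
    """Insert a user; return False if the (cityid, A) pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Users (cityid, A, B) VALUES (?, ?, ?)", (cityid, a, b))
        return True
    except sqlite3.IntegrityError:
        # Raised by the database when the unique index would be violated.
        return False

first = add_user(2, "alpha", "black")
duplicate = add_user(2, "alpha", "brown")   # same (cityid, A): rejected
other_city = add_user(3, "alpha", "black")  # different cityid: accepted
print(first, duplicate, other_city)  # True False True
```

A check in JavaScript or PHP before the insert can give friendlier feedback, but it cannot replace the index, since two concurrent requests could both pass the check.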

Storing a random size array in a MySQL column

I have a table. It contains two columns, unique song and genres. Unique song stores a string, doesn't really matter what.
Genres contains an array of strings (genres that apply to the song), with the number of elements in the array being random (too big to ditch the array and just make additional columns).
I know that this setup does not work in MySQL as I set it up, but that is what I need.
One way to do it would be serialization, but I would very much like to be able to query out all rock songs without having to first querying them all, unserializing and then finding my match.
Since all of the array contents are of the same type, is there a column that would support such an input? (int is a limited array of ints in a way, no?)
You've got a many-to-many relationship - one song can have multiple genres, and a genre can be used by multiple songs.
Create a table called Song, that contains information about the song and some unique identifier. For the sake of argument, we'll just say it's the name of the song: s_name.
Create a table called Genre, that contains information about genres. Maybe you have the genre, and some information on what style of music it is.
Finally, create a table called SongAndGenre, that'll act as a bridge table. It'll have two columns - a song ID (in our case, s_name), and a genre ID (say, g_name). If a song S has multiple genres G1 and G2, you'll have two rows for that song - (S, G1) and (S, G2).
You now have a table, let's say, songs, containing a column genres.
To know the genres of song #123, you can now issue
SELECT genres FROM songs WHERE id = 123;
What you need to do is to create two additional tables:
CREATE TABLE genres (
genre_id integer not null primary key auto_increment,
genre_name varchar(75)
);
CREATE TABLE song_has_genre (
song_id integer not null,
genre_id integer not null
);
To store the fact that song 123 is in genres 'Folk', 'Pop', 'Jazz' and 'Whatever', you can run:
INSERT INTO song_has_genre
SELECT 123, genre_id FROM genres
WHERE genre_name IN ( 'Folk', 'Pop', 'Jazz', ... );
To query what songs are in genre Folk,
SELECT songs.*, genres.genre_name FROM songs
JOIN song_has_genre AS shg ON ( songs.id = shg.song_id )
JOIN genres ON (shg.genre_id = genres.genre_id)
WHERE genres.genre_name = 'Folk';
A bit more work is needed to avoid duplicates if you select two genres and one song is in both, or to retrieve all genres of some songs selected based on genre (i.e., you search 'Pop', and want to find 'Pop,Jazz,Folk', 'Pop,Techno', 'Pop', 'Pop,Whatever', but not 'Techno,Jazz,Folk,Anything except Pop'), but it's doable (e.g. using GROUP_CONCAT and/or GROUP BY, or in the code outside MySQL).
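One way the GROUP_CONCAT variant might look is sketched below, using Python's standard-library sqlite3 module (SQLite also has GROUP_CONCAT) in place of MySQL. The title column, the sample songs, the composite primary key on song_has_genre, and the tag_song and songs_in_genre helpers are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE genres (genre_id INTEGER PRIMARY KEY AUTOINCREMENT, genre_name VARCHAR(75) UNIQUE);
CREATE TABLE song_has_genre (
    song_id  INTEGER NOT NULL,
    genre_id INTEGER NOT NULL,
    PRIMARY KEY (song_id, genre_id)   -- assumed: prevents tagging a song twice with one genre
);
INSERT INTO songs VALUES (123, 'First'), (124, 'Second'), (125, 'Third');
INSERT INTO genres (genre_name) VALUES ('Folk'), ('Pop'), ('Jazz'), ('Techno');
""")

def tag_song(song_id, genre_names):
    """Link a song to each named genre, as in the INSERT ... SELECT above."""
    marks = ", ".join("?" * len(genre_names))
    with conn:
        conn.execute(
            f"INSERT INTO song_has_genre SELECT ?, genre_id FROM genres WHERE genre_name IN ({marks})",
            [song_id, *genre_names],
        )

tag_song(123, ["Folk", "Pop", "Jazz"])
tag_song(124, ["Pop", "Techno"])
tag_song(125, ["Techno", "Jazz"])

def songs_in_genre(genre_name):
    """Return (song id, sorted list of all its genres) for songs tagged with genre_name."""
    rows = conn.execute(
        """
        SELECT s.id, GROUP_CONCAT(g.genre_name) AS all_genres
        FROM songs s
        JOIN song_has_genre shg ON shg.song_id = s.id
        JOIN genres g ON g.genre_id = shg.genre_id
        WHERE s.id IN (SELECT shg2.song_id
                       FROM song_has_genre shg2
                       JOIN genres g2 ON g2.genre_id = shg2.genre_id
                       WHERE g2.genre_name = ?)
        GROUP BY s.id
        ORDER BY s.id
        """,
        (genre_name,),
    ).fetchall()
    # The order inside GROUP_CONCAT is not guaranteed here, so sort in Python.
    return [(song_id, sorted(names.split(","))) for song_id, names in rows]

pop_songs = songs_in_genre("Pop")
print(pop_songs)  # [(123, ['Folk', 'Jazz', 'Pop']), (124, ['Pop', 'Techno'])]
```

The genre filter sits in a subquery so that the outer GROUP BY still sees every genre of the matching songs, not just the one searched for.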

Need help in a sql query

I have a varchar field that contains values such as (88,90,100,200), and another field that contains the value 200. When I use the IN clause to check whether the second field's value is in the first field, it returns an empty result, but when comparing it with 88 it returns results.
So I was wondering what I'm doing wrong here and if there is a better way.
Here is the mysql code
select * from user inner join category where parent_id IN (categories)
parent_id is located in the category table and the categories in the user table
When you do a JOIN you need an ON clause to specify that you are looking for rows that somehow "match" across two tables. If you want all the rows from a single table where some column has a value that is "one of the values in this list..." you use IN. So these are both valid SQL:
SELECT * FROM user INNER JOIN category ON user.someColumn = category.someOtherColumn -- can add WHERE clause
or
SELECT * FROM user WHERE parent_id IN (88, 99, 100, 200)
Can't tell you exactly what query you should use unless you share your table structures, as the question is unclear.
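If the comma-separated column has to stay, IN will not help, because IN (categories) sees a single string value rather than a list. MySQL has a built-in function, FIND_IN_SET(str, strlist), for exactly this kind of lookup. The sketch below uses Python's standard-library sqlite3 module, which has no FIND_IN_SET, so it falls back on a delimiter-wrapped LIKE comparison instead. The table layout and sample rows are guesses, since the question does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, categories TEXT);   -- comma-separated ids
CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER);
INSERT INTO user VALUES (1, '88,90,100,200'), (2, '20,2000');
INSERT INTO category VALUES (1, 200), (2, 88), (3, 20);
""")

# Wrapping both sides in commas makes ",20," match only the whole id 20,
# never the "20" inside "200" or "2000".
rows = conn.execute(
    """
    SELECT u.id, c.parent_id
    FROM user u
    JOIN category c
      ON ',' || u.categories || ',' LIKE '%,' || c.parent_id || ',%'
    ORDER BY u.id, c.parent_id
    """
).fetchall()
print(rows)  # [(1, 88), (1, 200), (2, 20)]
```

Neither this comparison nor FIND_IN_SET can use an index on the list column, so a separate table with one row per (user, category) pair is usually the better long-term fix.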

Finding value in a comma-separated text field in MySQL?

I've got a database of games with a genre field that has unique ids in it that are separated by commas. It's a text field. For example: HALO 2 - cat_genre => '1,2' (Action,Sci-Fi)
I'm trying to make a function that calculates the total number of games in that genre. So it's matching 1 value to multiple values separated by commas in the db.
I was using SELECT * FROM gh_game WHERE cat_genre IN (1) which would find Action Games.
I'm trying to work this out, I thought I had nailed it once before but I just can't figure it out.
You need to create a many-to-many relation, like so:
CREATE TABLE gameGenreTable ( id int NOT NULL PRIMARY KEY, genreID int NOT NULL, gameID int NOT NULL)
EDIT: if you're using InnoDB you can also create foreign keys on genreID and gameID..
I would add a UNIQUE key on genreID, gameID
then you can do a query like this
SELECT genreID,count(genreID) as count from gameGenreTable GROUP BY genreID;
-- and join in your other table to get the genre name (or just use the ID).
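To make the migration concrete, here is a small sketch that splits an existing cat_genre string into bridge rows and then counts games per genre. It uses Python's standard-library sqlite3 module rather than MySQL; the genre lookup table, the game ids, and the split_genres helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genre (genreID INTEGER PRIMARY KEY, name TEXT);   -- hypothetical lookup table
CREATE TABLE gameGenreTable (
    id      INTEGER PRIMARY KEY,
    genreID INTEGER NOT NULL,
    gameID  INTEGER NOT NULL,
    UNIQUE (genreID, gameID)
);
INSERT INTO genre VALUES (1, 'Action'), (2, 'Sci-Fi');
""")

def split_genres(game_id, cat_genre):
    """Turn a legacy comma-separated cat_genre value into one bridge row per genre."""
    pairs = [(int(part), game_id) for part in cat_genre.split(",") if part.strip()]
    with conn:
        conn.executemany("INSERT INTO gameGenreTable (genreID, gameID) VALUES (?, ?)", pairs)

split_genres(100, "1,2")   # e.g. HALO 2, with a made-up game id
split_genres(101, "1")

# LEFT JOIN keeps genres that have no games yet, with a count of 0.
counts = conn.execute(
    """
    SELECT g.name, COUNT(gg.gameID) AS games
    FROM genre g
    LEFT JOIN gameGenreTable gg ON gg.genreID = g.genreID
    GROUP BY g.genreID, g.name
    ORDER BY g.genreID
    """
).fetchall()
print(counts)  # [('Action', 2), ('Sci-Fi', 1)]
```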
