Storing a random-size array in a MySQL column

I have a table. It contains two columns, unique song and genres. Unique song stores a string; it doesn't really matter what.
Genres contains an array of strings (the genres that apply to the song). The number of elements in the array is arbitrary (too large to ditch the array and just add more columns).
I know that this setup does not work in MySQL as I have set it up, but that is what I need.
One way to do it would be serialization, but I would very much like to be able to query all rock songs without first fetching every row, unserializing it and then looking for my match.
Since all of the array contents are of the same type, is there a column type that would support such input? (An int is a limited array of ints in a way, no?)

You've got a many-to-many relationship - one song can have multiple genres, and a genre can be used by multiple songs.
Create a table called Song, that contains information about the song and some unique identifier. For the sake of argument, we'll just say it's the name of the song: s_name.
Create a table called Genre, that contains information about genres. Maybe you have the genre, and some information on what style of music it is.
Finally, create a table called SongAndGenre that will act as a bridge table. It will have two columns: a song ID (in our case, s_name) and a genre ID (say, g_name). If a song S has multiple genres G1 and G2, you'll have two rows for that song: (S, G1) and (S, G2).

You now have a table, let's say, songs, containing a column genres.
To know the genres of song #123, you can now issue
SELECT genres FROM songs WHERE id = 123;
What you need to do is to create two additional tables:
CREATE TABLE genres (
genre_id integer not null primary key auto_increment,
genre_name varchar(75)
);
CREATE TABLE song_has_genre (
song_id integer not null,
genre_id integer not null
);
To store the fact that song 123 is in genres 'Folk', 'Pop', 'Jazz' and 'Whatever', you can run:
INSERT INTO song_has_genre
SELECT 123, genre_id FROM genres
WHERE genre_name IN ( 'Folk', 'Pop', 'Jazz', ... );
To query what songs are in genre Folk,
SELECT songs.*, genres.genre_name FROM songs
JOIN song_has_genre AS shg ON ( songs.id = shg.song_id )
JOIN genres ON (shg.genre_id = genres.genre_id)
WHERE genres.genre_name = 'Folk';
A bit more work is needed in two cases: avoiding duplicates when you select two genres and one song is in both, and retrieving all genres of the songs that match a genre. For example, you search 'Pop' and want to find 'Pop,Jazz,Folk', 'Pop,Techno', 'Pop' and 'Pop,Whatever', but not 'Techno,Jazz,Folk,Anything except Pop'. Both are doable, e.g. using GROUP_CONCAT and/or GROUP BY, or in the code outside MySQL.
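The "all genres of the songs that match a genre" case can be sketched as follows. This is only an illustration, not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL (both provide GROUP_CONCAT), and the songs.title column, the sample rows, the composite primary key and the helper names tag_song and songs_with_genre are assumptions made for the example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE genres (genre_id INTEGER PRIMARY KEY, genre_name TEXT UNIQUE);
    CREATE TABLE song_has_genre (
        song_id INTEGER NOT NULL,
        genre_id INTEGER NOT NULL,
        PRIMARY KEY (song_id, genre_id)  -- one row per (song, genre) pair
    );
    INSERT INTO songs VALUES (1, 'Song A'), (2, 'Song B'), (3, 'Song C');
    INSERT INTO genres (genre_name) VALUES ('Folk'), ('Jazz'), ('Pop'), ('Techno');
""")

def tag_song(song_id, genre_names):
    """Link a song to every listed genre (the INSERT ... SELECT from the answer)."""
    placeholders = ", ".join("?" for _ in genre_names)
    conn.execute(
        "INSERT INTO song_has_genre "
        "SELECT ?, genre_id FROM genres "
        f"WHERE genre_name IN ({placeholders})",
        [song_id, *genre_names],
    )

tag_song(1, ["Pop", "Jazz", "Folk"])
tag_song(2, ["Pop", "Techno"])
tag_song(3, ["Techno", "Jazz", "Folk"])

def songs_with_genre(genre_name):
    """Map each song tagged with genre_name to the sorted list of ALL its genres."""
    rows = conn.execute(
        """
        SELECT s.id, GROUP_CONCAT(g.genre_name)
        FROM songs AS s
        JOIN song_has_genre AS shg ON s.id = shg.song_id
        JOIN genres AS g ON g.genre_id = shg.genre_id
        WHERE s.id IN (
            SELECT shg2.song_id
            FROM song_has_genre AS shg2
            JOIN genres AS g2 ON g2.genre_id = shg2.genre_id
            WHERE g2.genre_name = ?
        )
        GROUP BY s.id
        """,
        (genre_name,),
    ).fetchall()
    return {song_id: sorted(names.split(",")) for song_id, names in rows}

print(songs_with_genre("Pop"))
```

The subquery picks the matching songs first, so the outer GROUP BY still sees every genre row of those songs rather than only the 'Pop' row.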

Related

SQL query to search for matches within derived table

I have three tables. The first table ("MainTable") contains rows with data and a primary key (call it "MainKey"). The other two tables contain two columns each.
The first child table, "StatesTable", contains a unique combination of an ID (matches "MainKey") and a column, "State", which is the abbreviated value of a state (e.g., WA, CA).
The second child table, "CategoriesTable", contains a unique combination of an ID (matches "MainKey") and a column, "Category", which holds various categories (e.g., "Lawyer", "Engineer", "Teacher").
What I'm trying to achieve is to get all matches in "MainTable" for various posted queries of the two child tables. For instance, if a user selects three states (WA, CA, MT) and two categories ("Lawyer", "Engineer") then the query should return all matches in "MainTable" where the primary key matches that of the child tables.
I have the queries working for the child tables to retrieve a distinct result but I'm not certain how to use these derived tables in the main query to search for all matches of the primary MainKey "WITHIN" the derived table results.
Here is my query for the "States" child table as an example. Any help is appreciated, thanks.
SELECT DISTINCT MainKey
FROM (
SELECT *
FROM `StatesTable`
WHERE State IN ('WA','CA','MT')
) t

Assuming you want only Lawyers or Engineers who are in WA, CA, or MT, this query using two IN expressions should give you the results you want:
SELECT *
FROM MainTable
WHERE MainKey IN (SELECT DISTINCT MainKey
FROM StatesTable
WHERE State IN ('WA', 'CA', 'MT'))
AND MainKey IN (SELECT DISTINCT MainKey
FROM CategoriesTable
WHERE Category IN ('Lawyer', 'Engineer'))
If you want people who are Lawyers or Engineers, OR who live in WA, CA or MT, just change the AND in the WHERE clause to OR.
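To see the difference between the AND and OR variants on concrete rows, here is a small sketch. It runs against in-memory SQLite through Python's sqlite3 module rather than MySQL, and the Name column, the sample rows and the helper find_keys are assumptions made for the illustration. It also assumes both lists are non-empty, since MySQL rejects an empty IN ().

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MainTable (MainKey INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE StatesTable (MainKey INTEGER, State TEXT, UNIQUE (MainKey, State));
    CREATE TABLE CategoriesTable (MainKey INTEGER, Category TEXT, UNIQUE (MainKey, Category));
    INSERT INTO MainTable VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO StatesTable VALUES (1, 'WA'), (2, 'NY'), (3, 'CA');
    INSERT INTO CategoriesTable VALUES (1, 'Lawyer'), (2, 'Engineer'), (3, 'Teacher');
""")

def find_keys(states, categories, combine="AND"):
    """Return MainKey values matching the state filter AND/OR the category filter."""
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    # One placeholder per value keeps the user-supplied lists out of the SQL text.
    state_marks = ", ".join("?" for _ in states)
    category_marks = ", ".join("?" for _ in categories)
    sql = f"""
        SELECT MainKey FROM MainTable
        WHERE MainKey IN (SELECT MainKey FROM StatesTable
                          WHERE State IN ({state_marks}))
        {combine} MainKey IN (SELECT MainKey FROM CategoriesTable
                              WHERE Category IN ({category_marks}))
        ORDER BY MainKey
    """
    return [key for (key,) in conn.execute(sql, [*states, *categories])]

both = find_keys(["WA", "CA", "MT"], ["Lawyer", "Engineer"], "AND")
either = find_keys(["WA", "CA", "MT"], ["Lawyer", "Engineer"], "OR")
print(both, either)
```

With this sample data only key 1 satisfies both filters, while every key satisfies at least one of them.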

Use prepared statements in PHP to insert into/display from multiple MySQL tables with foreign key

I have two tables: movies and directors. The directors attribute in the movie entity is multi-valued, which is why I created the directors table with movieId as a foreign key that references the id column of the movies table.
Now, if I want to insert movies into my movies table, how do I add all the information for the movies into the movies table, as well as the name of the directors into the directors table at the same time, possibly using transactions? Also, how do I display the information from the movies table with their corresponding directors from the directors table in a HTML5 table using a while loop? I am using PHP with prepared statements.
This is what I have so far, but it's not complete:
mysqli_autocommit($connect, FALSE);
$stmtMov = $connect->prepare("SELECT id, title, plot, rating, releaseDate, language, duration, country, posterUrl, trailerUrl, imdbUrl FROM movies");
$stmtMov->execute();
$resultMov = $stmtMov->get_result();
$rowMov = $resultMov->fetch_assoc();
$movieId = $rowMov['id'];
$stmtDir = $connect->prepare("SELECT movieId, name FROM directors WHERE movieId = ?");
$stmtDir->bind_param("i", $movieId);
$stmtDir->execute();
$resultDir = $stmtDir->get_result();
$rowDir = $resultDir->fetch_assoc();
Any help would be very much appreciated.

Since you haven't added anything about insert, I'll consider only your select part.
$resultMov holds the result set; each call to fetch_assoc() returns one row as an array containing an id value (your $rowMov is only the first row). Iterate over the result set and, for every movie, run a query against the directors table to get the data you want. Something like:
// Prepare once, then execute for each movie row.
$stmt = $connect->prepare("SELECT .... FROM directors WHERE movieId = ?");
while ($movie = $resultMov->fetch_assoc()) {
    $stmt->bind_param("i", $movie["id"]);
    $stmt->execute();
    // fetch this movie's directors, e.g. with $stmt->get_result()
}
With that done, you'll have an array of directors and an array of movies. If you want to simplify things in your view (assuming you're using an MVC pattern), I would associate both arrays by matching directors["movieId"] to movies["id"], finally creating a single array that holds both sets of information, like an object.

You've asked two questions:
1) How do I insert into two tables at the same time?
2) How do I display from two tables?
But before I go into that, a bit of database review is in order. I would think that each movie would have one director, while each director might have many movies. I suppose the possibility for co-directors exists, too.
So for the first case, each movie must have a director:
movies
------
movie_id
director_id
title
plot, etc...
In this case, you could simply put the director's name in the movies table, but in order to list movies by director you would have to search by the actual name (and misspellings would make things complicated). Directors and movies are also two different things, so it's better to have separate tables.
In the second case, you need a join table to have a many-to-many relationship.
movies      director_movie    directors
------      --------------    ---------
movie_id    movie_id          director_id
title       director_id       name
So to answer your questions,
How do you insert into two tables at the same time?
You don't. First insert into whichever table stands on its own: in the first case, directors. Then get the last_insert_id from that table (or, if the director already exists, look up the director_id). If last_insert_id is unreliable in your setup, you may have to search for the row you just inserted to get the id.
Then take that id value and insert it into the dependent table, movies, along with the rest of that table's fields.
For the many-to-many case, you would do it in similar steps: 1) insert into movies, 2) get the movie_id, 3) insert into directors, 4) get the director_id, 5) insert both ids into director_movie.
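The five steps for the many-to-many case, wrapped in a transaction, could look roughly like this. It is a sketch under assumptions, not the poster's PHP: it uses Python's sqlite3 module with an in-memory database in place of MySQL, cursor.lastrowid plays the role of last_insert_id, and the UNIQUE constraint on directors.name, the helper add_movie and the sample titles are invented for the example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema follows the answer's layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL);
    CREATE TABLE directors (director_id INTEGER PRIMARY KEY AUTOINCREMENT,
                            name TEXT NOT NULL UNIQUE);
    CREATE TABLE director_movie (
        movie_id INTEGER NOT NULL REFERENCES movies(movie_id),
        director_id INTEGER NOT NULL REFERENCES directors(director_id),
        PRIMARY KEY (movie_id, director_id)
    );
""")

def add_movie(title, director_names):
    """Insert a movie, its directors and the link rows as one unit of work."""
    with conn:  # commits on success, rolls back if any statement raises
        # Steps 1 and 2: insert the movie and read back its generated id.
        movie_id = conn.execute(
            "INSERT INTO movies (title) VALUES (?)", (title,)
        ).lastrowid
        for name in director_names:
            # Steps 3 and 4: reuse an existing director or insert a new one.
            row = conn.execute(
                "SELECT director_id FROM directors WHERE name = ?", (name,)
            ).fetchone()
            if row:
                director_id = row[0]
            else:
                director_id = conn.execute(
                    "INSERT INTO directors (name) VALUES (?)", (name,)
                ).lastrowid
            # Step 5: link the two ids in the join table.
            conn.execute(
                "INSERT INTO director_movie (movie_id, director_id) VALUES (?, ?)",
                (movie_id, director_id),
            )
    return movie_id

first_id = add_movie("Movie A", ["Director X", "Director Y"])
second_id = add_movie("Movie B", ["Director X"])
```

Because everything runs inside one transaction, a failure in any step leaves none of the three tables with a half-inserted movie.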
How do I display the results?
If there is only one director per movie, it's a simple SQL query:
SELECT movies.*, directors.name FROM movies, directors WHERE movies.director_id=directors.director_id AND movies.movie_id=?
If you have multiple directors per movie, you'll have to loop through results:
SELECT * FROM movies WHERE movie_id=?
then make another query to list the directors
SELECT d.* from directors AS d,director_movie AS dm WHERE dm.director_id=d.director_id AND dm.movie_id=?
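As an alternative to one extra query per movie, the rows of a single join can be grouped in application code before rendering the HTML table. The sketch below shows that variation, not the approach described above; it uses Python's sqlite3 module in place of PHP and MySQL, and the sample rows and the helper movies_with_directors are assumptions.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE directors (director_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE director_movie (movie_id INTEGER, director_id INTEGER);
    INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
    INSERT INTO directors VALUES (1, 'Director X'), (2, 'Director Y');
    INSERT INTO director_movie VALUES (1, 1), (1, 2), (2, 2);
""")

def movies_with_directors():
    """Return {movie_id: {"title": ..., "directors": [...]}} from one joined query."""
    rows = conn.execute("""
        SELECT m.movie_id, m.title, d.name
        FROM movies AS m
        LEFT JOIN director_movie AS dm ON dm.movie_id = m.movie_id
        LEFT JOIN directors AS d ON d.director_id = dm.director_id
        ORDER BY m.movie_id, d.name
    """)
    grouped = {}
    for movie_id, title, name in rows:
        entry = grouped.setdefault(movie_id, {"title": title, "directors": []})
        if name is not None:  # LEFT JOIN yields NULL for movies without directors
            entry["directors"].append(name)
    return grouped

listing = movies_with_directors()
for movie in listing.values():
    print(movie["title"], "-", ", ".join(movie["directors"]))
```

Each entry then maps to one table row, with the directors joined into a single cell.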

SQL return only not empty columns from row as new row

I'm in the situation where my client e-mails me an excel-file with 50 columns of data extremely un-normalized. I then export it to CSV and upload into MySQL -- single table. The columns are for different ingredients (10 columns of data for each ingredient -- title, category, etc) and then 40 different columns for characteristics on each ingredients. So each ingredient in the table has all of these 50 columns even though every column doesn't apply for that ingredient.
My question is if I can create a SQL that selects only filled in characteristics for one selected ingredient and leaves out all of the other columns?
(I know that another option is to build my own CSV-parser that created multiple tables and then write SQL for them instead, but I wanna investigate solving this as is first. If that's not possible then I just have to face that and build a parser ;P)
This is as far as I got, but it doesn't completely exclude columns that are not filled in (or that contain "nei").
SELECT
IF(`Heving-vanlig-gjaerbakst` <> '' AND `Heving-vanlig-gjaerbakst` <> 'nei', `Heving-vanlig-gjaerbakst`, 'random') AS `test1`,
IF(`Frys-kort` <> '' AND `Frys-kort` <> 'nei', `Frys-kort`, 'random') AS `test2`
... and so on for the 38 other columns ...
FROM x
WHERE id = 123
And I'd rather not solve this in the PHP-code by skipping empty rows =P
Example row (column names first):
g1 gruppe ug1 undergruppe artnr artikkel beskrivelse status enhet ansvar prisliste Heving-vanlig-gjaerbakst Heving-soete-deiger Deig-stabilitet Smaksgiver Saftighet Krumme-poring Skorpe Volum Konservering Skjaerbarhet Frys-lang Frys-kort Kjoel Holdbarhet E-fri Azo-fri Mandler Aprikoskjerner Helmiks Halvmiks Base Konsentrat Utstrykning Bakefasthet Frukt-Baerinnhold Slippegenskaper Hindre-koksing Palmefri Fritering Smidighet Baking Kreming Roere Fylning Dekor Prefert Viskositet Cacaoinnhold Fet-innhold
100150 Bakehjelpemidler 100150200 Fiber/potetprodukter 10085 Potetflakes sekk 15 kg Egnet til lomper, lefser, brød og annet bakverk. B... Handel Sekk Trond Olsen JA xxx xxx xxx
As you can see most columns are empty here. X, XX and XXX is a form of grade-system, but for some columns the content is instead "yes" or "no".
And as I said, the first 10 columns are information about that product, the other 40 is different characteristics (and it's those I wanna work with for one given product).

It sounds a bit as if you'd like to convert the table you have into two tables:
CREATE TABLE Ingredients
(
g1 ...,
gruppe ...,
ug1 ...,
undergruppe ...,
artnr ... PRIMARY KEY,
artikkel ...,
beskrivelse ...,
status ...,
enhet ...,
ansvar ...,
prisliste ...
);
I've opted to guess that the artnr is the primary key, but adapt what follows to the actual primary key. This table contains the eleven (though your question said ten) columns that are common to all ingredients. You then have another table which contains:
CREATE TABLE IngredientProperties
(
artnr ... NOT NULL REFERENCES Ingredients,
property VARCHAR(32) NOT NULL,
value VARCHAR(3) NOT NULL,
PRIMARY KEY(artnr, property)
);
You can then load the populated columns from your original table into these two. At worst, there'd be 40 entries in IngredientProperties for one entry in Ingredients. You might make 'property' into a foreign key reference to a defining list of possible ingredient properties (a third table that defines the possible values for the properties - basically, a record of the column names from your original table). If you add the third table, it might logically be called IngredientProperties (too), in which case the table I called IngredientProperties needs to be renamed.
You can then join Ingredients and IngredientProperties to get the information you want.
I'm not sure that I recommend this solution; it is basically a use of the 'Entity Attribute Value' approach to database design. However, for extremely sparse information like you seem to have, and when used with the constraint of the third table, it may be a reasonable choice.
What you can't sensibly do is handle all possible combinations of 40 columns as that number grows exponentially with the number of columns (and is pretty large with N = 40).
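Loading only the populated columns into IngredientProperties amounts to unpivoting each wide row into (artnr, property, value) triples. The sketch below shows that step in Python on one row represented as a dict. The helper name to_property_rows and the sample values are assumptions, and treating '' and 'nei' as "not filled in" follows the question's own query.

```python
# The eleven columns the answer assigns to the Ingredients table.
COMMON_COLUMNS = {
    "g1", "gruppe", "ug1", "undergruppe", "artnr", "artikkel",
    "beskrivelse", "status", "enhet", "ansvar", "prisliste",
}
# Values the question treats as "not filled in".
EMPTY_MARKERS = {"", "nei"}

def to_property_rows(row):
    """Turn one wide row (column name -> value) into (artnr, property, value) triples."""
    triples = []
    for column, value in row.items():
        if column in COMMON_COLUMNS:
            continue  # belongs in Ingredients, not IngredientProperties
        value = (value or "").strip()
        if value.lower() in EMPTY_MARKERS:
            continue  # skip empty characteristics
        triples.append((row["artnr"], column, value))
    return triples

# Hypothetical sample row: column names come from the question, values are made up.
sample = {
    "artnr": "10085",
    "artikkel": "Potetflakes sekk 15 kg",
    "Heving-vanlig-gjaerbakst": "",
    "Frys-kort": "nei",
    "Saftighet": "xxx",
    "Volum": "xx",
}
property_rows = to_property_rows(sample)
print(property_rows)
```

The resulting triples can be inserted into IngredientProperties as they are, and a query for one artnr then returns only the characteristics that were filled in.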

Best way to store "tags" for speed in enormous table

I'm developing a large content site with a table "contents" that holds more than 50 million records. Here's the table structure:
id (INT(11), INDEX),
name (VARCHAR(150), FULLTEXT),
description (TEXT, FULLTEXT),
date (INT(11), INDEX)
I want to add tags to these contents.
I'm considering two methods:
1) Add a VARCHAR(255) "tags" column with a FULLTEXT index to the contents table. Store all tags separated by commas and search row by row using MATCH ... AGAINST (which I think will be slow).
2) Make two tables: "tags" with columns id and tag (VARCHAR(30), INDEX or FULLTEXT?), and "contents_tags" with id, tag_id (INT(11), INDEX) and content_id (INT(11), INDEX). Search by joining the three tables (contents - contents_tags - tags) to retrieve all contents with the given tag(s). I think this would be slow and a memory killer because of the enormous JOIN of the 50M-row table * contents_tags * tags.
What is the best method to store tags to make it as efficient as possible? What is the fastest way to search by text (for example "movie 3d 2011") combined with a simple tag ("video") and locate the matching contents?
The table is about 5 GB now, without tags. It is MyISAM because I need FULLTEXT indexes on name and description for string search (users can already search by these fields), and I need the best possible speed when searching by tags.
Does anyone have experience with this?
Thanks!

FULLTEXT indexes are really not as fast as you may think they are.
Use a separate table to store your tags:
Table tags
----------
id integer PK
tag varchar(20)
Table tag_link
--------------
tag_id integer foreign key references tags(id)
content_id integer foreign key references content(id)
/* this table has a PK consisting of tag_id + content_id */
Table content
--------------
id integer PK
......
You SELECT all content with tag x by using:
SELECT c.* FROM tags t
INNER JOIN tag_link tl ON (t.id = tl.tag_id)
INNER JOIN content c ON (c.id = tl.content_id)
WHERE tag = 'test'
ORDER BY tl.content_id DESC /*latest content first*/
LIMIT 10;
Because of the foreign keys, all fields in tag_link are individually indexed.
The WHERE tag = 'test' selects 1 (!) record.
MySQL equi-joins this with, say, 10,000 tag_link rows.
It then equi-joins those with 1 content record each (each tag_link row only ever points to 1 content).
Because of the limit 10, MySQL will stop looking as soon as it has 10 items, so it really only looks at 10 tag_links records.
The content.id is autoincrementing, so higher numbers are very fast proxy for newer articles.
In this case you never need to look for anything other than equality and you start out with 1 tag that you equi-join using integer keys (the fastest join possible).
There are no if-thens-or-buts about it, this is the fastest way.
Note that because there are at most a few 1000 tags, any search will be much faster than delving in the full contents table.
Finally
CSV fields are a very bad idea; never use them in a database.

Finding value in a comma-separated text field in MySQL?

I've got a database of games with a genre field that has unique ids in it that are separated by commas. It's a text field. For example: HALO 2 - cat_genre => '1,2' (Action,Sci-Fi)
I'm trying to make a function that calculates the total number of games in that genre. So it's matching 1 value to multiple values separated by commas in the db.
I was using SELECT * FROM gh_game WHERE cat_genre IN (1) which would find Action Games.
I'm trying to work this out, I thought I had nailed it once before but I just can't figure it out.

You need to create a many-to-many relation, like so:
CREATE TABLE gameGenreTable (id INT NOT NULL PRIMARY KEY, genreID INT NOT NULL, gameID INT NOT NULL);
EDIT: if you're using InnoDB you can also create foreign keys on genreID and gameID.
I would add a UNIQUE key on (genreID, gameID).
Then you can run a query like this:
SELECT genreID, COUNT(genreID) AS count FROM gameGenreTable GROUP BY genreID;
-- and join in your other table to get the genre name (or just use the ID).
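Moving the existing comma-separated cat_genre values into such a table is a one-off migration. Here is a rough sketch of it using Python's sqlite3 module with an in-memory database instead of MySQL. The title column, the sample games and the helper migrate_genres are assumptions, the surrogate id column is left out for brevity, and INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the games are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gh_game (id INTEGER PRIMARY KEY, title TEXT, cat_genre TEXT);
    CREATE TABLE gameGenreTable (
        genreID INTEGER NOT NULL,
        gameID INTEGER NOT NULL,
        UNIQUE (genreID, gameID)
    );
    INSERT INTO gh_game VALUES (1, 'Game A', '1,2'), (2, 'Game B', '1'), (3, 'Game C', '2,3');
""")

def migrate_genres():
    """Split each comma-separated cat_genre value into one link row per genre id."""
    games = conn.execute("SELECT id, cat_genre FROM gh_game").fetchall()
    pairs = [
        (int(genre_id), game_id)
        for game_id, csv_value in games
        for genre_id in (csv_value or "").split(",")
        if genre_id.strip()
    ]
    with conn:  # one transaction for the whole migration
        # SQLite syntax; the UNIQUE key makes re-running the migration harmless.
        conn.executemany(
            "INSERT OR IGNORE INTO gameGenreTable (genreID, gameID) VALUES (?, ?)",
            pairs,
        )

migrate_genres()
counts = dict(conn.execute(
    "SELECT genreID, COUNT(*) FROM gameGenreTable GROUP BY genreID"
))
print(counts)
```

Once the link table is populated, counting games per genre is the plain GROUP BY shown above, with no string matching on the text field.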
