DB Design; or Conditional Selects with json data

I have a DB with several tables that contain basic, static ID-to-name data. Each of these reference tables has only 2 columns.
I then have another table that will receive data input by users. Each instance of user input will have its own row with a timestamp, but the important columns here will contain one or several of the IDs related to names in one of the other tables. For ease of submitting and retrieving this information, I opted to store it as text, in JSON format.
Everything was going great until I realized I'm going to need to join the big table with the little tables to map the IDs to their names. I need to return the IDs in the results as well.
An example of what a few rows in this table might look like:
Column 1 | Column 2 | Timestamp
["715835199","91158582","90516801"] | ["11987","11987","22474"] | 2012-08-28 21:18:48
["715835199"] | ["0"] | 2012-08-28 21:22:48
["91158582","90516801"] | ["11987"] | 2012-08-28 21:25:48
There WILL be repeats of the ID#'s input in this table, but not necessarily in the same groupings, hence why I put the ID to name pairings in a separate table.
Is it possible to do a WHERE name='any-of-these-json-values'? Am I best off doing a ghetto join in php after I query the main table to pull the IDs for the names I need to include? Or do I just need to redo the design of the data input table entirely?

First of all:
Never, ever put more than one piece of information into one field if you want to access the pieces separately. Never.
That said, I think you will need to create a full N:M relation, which includes a join table: One row in your example table will need to be replaced by 1-N rows in the join table.
A tricky join with string matching will perform acceptably only for a very small number of rows, and the WHERE name='any-of-these-json-values' is impossible in your construct: MySQL doesn't "understand" that this is a JSON array; it sees it as unstructured text. On a join table, this clause comes quite naturally as WHERE somecolumn IN (1234,5678,8012)
Edit
Assuming your Column 1 contains arrays of IDs in table1 and Column 2 carries arrays of IDs in table2 you would have to do something like
CREATE TABLE t1t2join (
t1id INT NOT NULL ,
t2id INT NOT NULL ,
`Timestamp` DATETIME NOT NULL ,
PRIMARY KEY (t1id,t2id,`Timestamp`) ,
KEY (t2id)
)
(you might want to sanity-check the keys)
And on an insert do the following (in pseudo-code)
Remember timestamp
Cycle through every combination of a Column 1 value with a Column 2 value given by the user
Create one row per combination
So for your third example row, the SQL would be:
SET @now = NOW();
INSERT INTO t1t2join VALUES
(91158582,11987,@now),
(90516801,11987,@now);
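
The insert procedure above can be sketched end to end. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the column name ts and the helper insert_submission are hypothetical, and duplicate IDs within one submission are collapsed so they don't collide with the primary key.

```python
import itertools
import json
import sqlite3

# SQLite stands in for MySQL here; table name follows the t1t2join sketch above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE t1t2join (
           t1id INTEGER NOT NULL,
           t2id INTEGER NOT NULL,
           ts   TEXT    NOT NULL,
           PRIMARY KEY (t1id, t2id, ts)
       )"""
)


def insert_submission(conn, column1_json, column2_json, ts):
    """Store one user submission as one row per (t1id, t2id) pair.

    Hypothetical helper: duplicates inside a submission are dropped,
    because (t1id, t2id, ts) is the primary key.
    """
    t1_ids = sorted({int(v) for v in json.loads(column1_json)})
    t2_ids = sorted({int(v) for v in json.loads(column2_json)})
    rows = [(a, b, ts) for a, b in itertools.product(t1_ids, t2_ids)]
    conn.executemany("INSERT INTO t1t2join VALUES (?, ?, ?)", rows)
    return len(rows)


# Third example row from the question: 2 IDs x 1 ID gives 2 rows.
inserted = insert_submission(
    conn, '["91158582","90516801"]', '["11987"]', "2012-08-28 21:25:48"
)

# The "any of these values" lookup becomes a plain IN clause.
matches = conn.execute(
    "SELECT t1id, t2id FROM t1t2join WHERE t1id IN (?, ?) ORDER BY t1id",
    (90516801, 715835199),
).fetchall()
```

Joining t1t2join to the two reference tables on t1id and t2id would then return the names alongside the IDs.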

Related

Relating MySQL-table content correctly

I have the following tables in a database:
products
assembly_steps
parts
warnings
I want to relate the content of these tables as follows:
A product consists of many assembly_steps. An assembly_step can have different parts and warnings. So I built the tables
assembly_steps_has_parts
assembly_steps_has_warnings
products_has_assembly_steps
to relate the data. The ...has...-tables are connected with their related partners by foreign keys. I modeled that with the MySQL-Workbench.
I am confused about the mechanism to relate the info. How do I program that in PHP?
I think first you add the content on the lowest level, that would be parts and warnings. Then you add the assembly step and relate the data. But I don't know how to do this.
Here you find an overview: Database-Model

Relational databases relate entities/values by recording them together in a row in a table.
To relate assembly_steps to parts, just insert a row into assembly_steps_has_parts, e.g. if you have the assembly_step_id in $assembly_step_id, and the part_id in $part_id, then:
INSERT INTO assembly_steps_has_parts (assembly_steps_id, parts_id)
VALUES ($assembly_step_id, $part_id)
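
A minimal sketch of that order of operations, assuming Python's built-in sqlite3 module as a stand-in for PHP and MySQL. The description and name columns and the sample values are hypothetical; the link-table column names follow the INSERT above. Values are bound as parameters instead of being interpolated into the SQL string, which is also the safer habit in PHP (prepared statements).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE assembly_steps (
        assembly_steps_id INTEGER PRIMARY KEY,
        description       TEXT
    );
    CREATE TABLE parts (
        parts_id INTEGER PRIMARY KEY,
        name     TEXT
    );
    CREATE TABLE assembly_steps_has_parts (
        assembly_steps_id INTEGER NOT NULL REFERENCES assembly_steps(assembly_steps_id),
        parts_id          INTEGER NOT NULL REFERENCES parts(parts_id),
        PRIMARY KEY (assembly_steps_id, parts_id)
    );
    """
)

# Insert the entities first and keep the ids the database generated.
step_id = conn.execute(
    "INSERT INTO assembly_steps (description) VALUES (?)", ("Attach legs",)
).lastrowid
part_ids = [
    conn.execute("INSERT INTO parts (name) VALUES (?)", (name,)).lastrowid
    for name in ("leg", "screw")
]

# Relating a step to a part is just one row in the link table.
conn.executemany(
    "INSERT INTO assembly_steps_has_parts (assembly_steps_id, parts_id) VALUES (?, ?)",
    [(step_id, part_id) for part_id in part_ids],
)

# Reading the relation back is a join through the link table.
parts_for_step = [
    row[0]
    for row in conn.execute(
        """SELECT p.name
           FROM assembly_steps_has_parts AS link
           JOIN parts AS p ON p.parts_id = link.parts_id
           WHERE link.assembly_steps_id = ?
           ORDER BY p.name""",
        (step_id,),
    )
]
```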

You wouldn't program this in PHP, you'd handle it fully with mysql. The way this would be structured in mysql would be something like this:
assembly_steps
assembly_id
assembly_description (or something like that)
assembly_id is the primary key
parts
part_id
part_name
part_id is the primary key
assembly_steps_has_parts
assembly_id
part_id
In this table, you'd have a dual (composite) primary key. assembly_id and part_id are each a foreign key to the primary key of their respective tables, and together they form the primary key of this table.
The way that a dual primary key works is that two columns together make up one primary key on one table. That means that instead of each column having to be unique on its own, only the combination of the two has to be unique.
For instance:
pk1 pk2
1 1
1 2
1 3
2 1
2 2
2 3
You could query them like this (this is a generic query, but the basic idea)
select a.assembly_description, p.part_name
from assembly_steps a
join assembly_steps_has_parts ats
on a.assembly_id = ats.assembly_id
join parts p
on ats.part_id = p.part_id
You'd do the same thing for the other tables. From that point you'd just call the results of your query in php the way you would handle any other query.

Mysql with regular expression

I have a question regarding regular expressions. I have designed a table which contains three columns; one column contains member ids, which are separated by commas. My table structure is shown below.
send_id member_id
1 1211,23,34
2 1,23
I want to select only the send_id 2 row, which contains 1 as a member_id.
This is the query that I am using:
SELECT * FROM table WHERE column REGEXP '^[1]+$';
but this query is giving me both rows. Please help me.
With regards,
Rahul

Never store separate values in one column
Normalize your structure like
send_id member_id
1 1211
1 23
1 34
2 1
2 23
If you still want your regex, then it will be
SELECT * FROM t WHERE column REGEXP '(^|[^0-9])1([^0-9]|$)'
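
To see how that pattern behaves on the two example rows, here is a small check using Python's re module rather than MySQL's REGEXP; this particular pattern only uses syntax both engines share. The helper name contains_member is hypothetical.

```python
import re


def contains_member(member_ids, wanted):
    """Return True if `wanted` appears as a whole id in a comma-separated string.

    The id must be preceded by the start of the string or a non-digit,
    and followed by a non-digit or the end of the string.
    """
    pattern = r"(^|[^0-9])" + re.escape(str(wanted)) + r"([^0-9]|$)"
    return re.search(pattern, member_ids) is not None


# The two rows from the question, keyed by send_id.
rows = {1: "1211,23,34", 2: "1,23"}

# Only send_id 2 holds the id 1; the 1s inside 1211 are not whole ids.
matching_send_ids = [
    send_id for send_id, ids in rows.items() if contains_member(ids, 1)
]
```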

First, you should be normalizing your data so you're not in this horrible mess in the first place. Here's a good resource explaining normalization.
Second, I believe your problem lies with your regular expression. Try this instead:
SELECT * FROM table WHERE column REGEXP '^[1]$';
The regular expression you're using contains [1]+. The + means it has to match [1] one or more times. Removing the + means it will match [1] exactly once, and because the pattern is anchored with ^ and $, it only matches a column whose entire value is 1.
However, that still won't fix your problem, as the 1 you are looking for sits inside a comma-separated list alongside other values. This is why normalization is so important.

Having multiple values inside a column isn't a good practice for designing a DB.
You should normalize your data, i.e., put just one piece of atomic information inside each element of your table.
You can find more information regarding to this in Wikipedia:
http://en.wikipedia.org/wiki/Database_normalization

As others have told you, the perfect solution would be to normalize your data; I think Alma Do Mundo's answer explains it quite well.
If you want to use REGEXP anyway, you have to take into account four cases: the id is the only one, the id is the first, the id is in the middle, and the id is at the end. I have used id=74 for the example:
SELECT * FROM table WHERE member_id REGEXP '(^74$|^74,|,74,|,74$)';

Depending on your requirements, you should normalize your data, i.e. make 3 tables (one with the send id, one with the member id, and one that combines the two), which you can then link up with INNER JOINs.
However, if you are not going to do it that way, you can use WHERE member_id LIKE '%1%' to pull in all the candidate rows. This also matches ids such as 1211, so you'll have to use the application to filter the relevant records.
In any case, if you're not going to normalize the data you will have to use the front end to filter out the results.
An example of the inner join syntax would look like this
SELECT * FROM SendTable
JOIN Send_Member ON SendTable.send_id = Send_Member.send_id
JOIN Member ON Member.member_id = Send_Member.member_id
WHERE Member.member_id = 1;
where the schema looks like:
Sendtable:
send_Id (primary key)
...other fields
Send_Member:
send_id (primary key and foreign key to SendTable)
member_id (primary key and foreign key to member)
...any fields you might want that are relevant to the particular send table and member table link
Member:
member_id (primarykey)
...other fields

Populating a single-dimensional array with multiple MySQL column values

I am quite new to PHP and MySQL, but have experience of VBA and C++. In short, I am trying to count the occurrences of a value (text string), which can appear in 11 columns in my table.
I think I will need to populate a single-dimensional array from this table, but the table has 14 columns (named 'player1' to 'player14'). I want each of these 'players' to be entered into the one-dimensional array (if not NULL), before proceeding to the next row.
I know there is the SELECT DISTINCT statement in MySQL, but can I use this to count distinct occurrences across 14 columns?
For background, I am building a football results database, where player1 to player14 are the starting 11 (and 3 subs), and my PHP code will count the number of times a player has made an appearance.
Thanks for all your help!
Matt.

Rethink your database schema. Try this:
Table players:
player_id
name
Table games:
game_id
Table appearances:
appearance_id
player_id
game_id
This reduces the amount of duplicate data. Read up on normalization. It allows you to do a simple select count(*) from appearances inner join players using (player_id) where name='Joe Schmoe'
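
A small runnable sketch of that schema and count, assuming Python's built-in sqlite3 module in place of MySQL. The sample players and games are made up for illustration, and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE players (player_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE games (game_id INTEGER PRIMARY KEY);
    CREATE TABLE appearances (
        appearance_id INTEGER PRIMARY KEY,
        player_id     INTEGER NOT NULL REFERENCES players(player_id),
        game_id       INTEGER NOT NULL REFERENCES games(game_id)
    );
    -- Made-up sample data.
    INSERT INTO players VALUES (1, 'Joe Schmoe'), (2, 'Jane Doe');
    INSERT INTO games VALUES (1), (2), (3);
    INSERT INTO appearances (player_id, game_id)
    VALUES (1, 1), (1, 2), (2, 2), (1, 3);
    """
)


def appearance_count(conn, name):
    """Count the appearances of a single player, looked up by name."""
    row = conn.execute(
        """SELECT COUNT(*)
           FROM appearances AS a
           JOIN players AS p ON p.player_id = a.player_id
           WHERE p.name = ?""",
        (name,),
    ).fetchone()
    return row[0]


def all_appearance_counts(conn):
    """Count appearances for every player in one grouped query."""
    return conn.execute(
        """SELECT p.name, COUNT(a.appearance_id)
           FROM players AS p
           LEFT JOIN appearances AS a ON a.player_id = p.player_id
           GROUP BY p.player_id, p.name
           ORDER BY p.name"""
    ).fetchall()


joe_count = appearance_count(conn, "Joe Schmoe")
counts = all_appearance_counts(conn)
```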

First of all, the database schema you're using is terrible, and you just found out a reason why.
That being said, I see no other way than to first get a list of all players by distinctly selecting the names of players into an array. Before each insertion, you would have to check if the name is already in the array (if it is already in, don't add it again).
Then, when you have the list of names, you would have to run an SQL statement for each player, adding up the number of occurrences, like so:
SELECT COUNT(*)
FROM <Table>
WHERE player1=? OR player2=? OR player3=? OR ... OR player14 = ?
That is all pretty complicated, and as I said, you should really change your database schema.
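
If the 14-column table has to stay as it is, one possible alternative to a query per player is to read every row once and tally the names in application code. The sketch below assumes Python with the built-in sqlite3 module instead of PHP and MySQL; the table name results, the helper names, and the sample data are hypothetical.

```python
import sqlite3
from collections import Counter

# Column names player1 .. player14, as described in the question.
PLAYER_COLUMNS = [f"player{i}" for i in range(1, 15)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    + ", ".join(f"{col} TEXT" for col in PLAYER_COLUMNS)
    + ")"
)


def add_match(conn, names):
    """Insert one match row; unused player slots are stored as NULL."""
    padded = list(names) + [None] * (len(PLAYER_COLUMNS) - len(names))
    placeholders = ", ".join("?" for _ in PLAYER_COLUMNS)
    conn.execute(f"INSERT INTO results VALUES ({placeholders})", padded)


# Made-up sample matches.
add_match(conn, ["Smith", "Jones", "Brown"])
add_match(conn, ["Jones", "Smith"])
add_match(conn, ["Jones"])


def count_appearances(conn):
    """Tally every non-NULL player value across all 14 columns."""
    counts = Counter()
    query = "SELECT " + ", ".join(PLAYER_COLUMNS) + " FROM results"
    for row in conn.execute(query):
        counts.update(name for name in row if name is not None)
    return counts


appearances = count_appearances(conn)
```

This reads the whole table on every count, so it only suits small data sets; the normalized schema lets the database do the counting.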

This sounds like a job for fetch_assoc (http://php.net/manual/de/mysqli-result.fetch-assoc.php).
If you use mysqli, you would get each row as an associative array.
On the other hand, the table design seems a bit flawed, as suggested before.
Suppose you had one table team with the team name and whatnot, and one table player with the player names:
TEAM
| id | name | founded | foo |
PLAYER
| id | team_id | name | bar |
With that structure you could add 14 players that point at the same team and, by joining the two tables, extract the players that match your search.

SQL return only not empty columns from row as new row

I'm in the situation where my client e-mails me an excel-file with 50 columns of data extremely un-normalized. I then export it to CSV and upload into MySQL -- single table. The columns are for different ingredients (10 columns of data for each ingredient -- title, category, etc) and then 40 different columns for characteristics on each ingredients. So each ingredient in the table has all of these 50 columns even though every column doesn't apply for that ingredient.
My question is if I can create a SQL that selects only filled in characteristics for one selected ingredient and leaves out all of the other columns?
(I know that another option is to build my own CSV-parser that created multiple tables and then write SQL for them instead, but I wanna investigate solving this as is first. If that's not possible then I just have to face that and build a parser ;P)
This is as far as I got, but it doesn't completely exclude columns that are not filled in (or that contain "nei").
SELECT
IF(`Heving-vanlig-gjaerbakst` <> '' AND `Heving-vanlig-gjaerbakst` <> 'nei', `Heving-vanlig-gjaerbakst`, 'random') AS `test1`,
IF(`Frys-kort` <> '' AND `Frys-kort` <> 'nei', `Frys-kort`, 'random') AS `test2`
... and for the 38 other rows ...
FROM x
WHERE id = 123
And I'd rather not solve this in the PHP-code by skipping empty rows =P
Example row (column names first):
g1 gruppe ug1 undergruppe artnr artikkel beskrivelse status enhet ansvar prisliste Heving-vanlig-gjaerbakst Heving-soete-deiger Deig-stabilitet Smaksgiver Saftighet Krumme-poring Skorpe Volum Konservering Skjaerbarhet Frys-lang Frys-kort Kjoel Holdbarhet E-fri Azo-fri Mandler Aprikoskjerner Helmiks Halvmiks Base Konsentrat Utstrykning Bakefasthet Frukt-Baerinnhold Slippegenskaper Hindre-koksing Palmefri Fritering Smidighet Baking Kreming Roere Fylning Dekor Prefert Viskositet Cacaoinnhold Fet-innhold
100150 Bakehjelpemidler 100150200 Fiber/potetprodukter 10085 Potetflakes sekk 15 kg Egnet til lomper, lefser, brød og annet bakverk. B... Handel Sekk Trond Olsen JA xxx xxx xxx
As you can see most columns are empty here. X, XX and XXX is a form of grade-system, but for some columns the content is instead "yes" or "no".
And as I said, the first 10 columns are information about that product, the other 40 is different characteristics (and it's those I wanna work with for one given product).

It sounds a bit as if you'd like to convert the table you have into two tables:
CREATE TABLE Ingredients
(
g1 ...,
gruppe ...,
ug1 ...,
undergruppe ...,
artnr ... PRIMARY KEY,
artikkel ...,
beskrivelse ...,
status ...,
enhet ...,
ansvar ...,
prisliste ...
);
I've opted to guess that the artnr is the primary key, but adapt what follows to the actual primary key. This table contains the eleven (though your question said ten) columns that are common to all ingredients. You then have another table which contains:
CREATE TABLE IngredientProperties
(
artnr ... NOT NULL REFERENCES Ingredients,
property VARCHAR(32) NOT NULL,
value VARCHAR(3) NOT NULL,
PRIMARY KEY(artnr, property)
);
You can then load the populated columns from your original table into these two. At worst, there'd be 40 entries in IngredientProperties for one entry in Ingredient. You might make 'property' into a foreign key reference to a defining list of possible ingredient properties (a third table that defines the possible values for the properties - basically, a record of the column names from your original table). If you add the third table, it might logically be called IngredientProperties (too), in which case the table I called IngredientProperties needs to be renamed.
You can then join Ingredients and IngredientProperties to get the information you want.
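
Loading the populated columns into IngredientProperties amounts to turning each wide row into (artnr, property, value) tuples and skipping the empty ones. A minimal sketch in Python, assuming each CSV row arrives as a dict keyed by column name; the helper name to_property_rows is hypothetical, and the characteristic values in the example row are invented, since the flattened sample in the question does not show which columns its values belong to.

```python
# Values that count as "not filled in", per the question.
EMPTY_MARKERS = {"", "nei"}

# The columns that describe the ingredient itself (the Ingredients table).
COMMON_COLUMNS = {
    "g1", "gruppe", "ug1", "undergruppe", "artnr", "artikkel",
    "beskrivelse", "status", "enhet", "ansvar", "prisliste",
}


def to_property_rows(wide_row):
    """Turn one wide row into (artnr, property, value) tuples.

    Characteristic columns that are empty or contain 'nei' are skipped.
    """
    artnr = wide_row["artnr"]
    return [
        (artnr, column, value.strip())
        for column, value in wide_row.items()
        if column not in COMMON_COLUMNS
        and value is not None
        and value.strip().lower() not in EMPTY_MARKERS
    ]


# Illustrative row: the characteristic values below are assumptions.
example = {
    "artnr": "10085",
    "artikkel": "Potetflakes sekk 15 kg",
    "Heving-vanlig-gjaerbakst": "",
    "Saftighet": "xxx",
    "Frys-kort": "nei",
    "E-fri": "JA",
}
property_rows = to_property_rows(example)
```

Selecting the filled-in characteristics for one ingredient is then a plain WHERE artnr = ... on IngredientProperties.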
I'm not sure that I recommend this solution; it is basically a use of the 'Entity Attribute Value' approach to database design. However, for extremely sparse information like you seem to have, and when used with the constraint of the third table, it may be a reasonable fit.
What you can't sensibly do is handle all possible combinations of 40 columns as that number grows exponentially with the number of columns (and is pretty large with N = 40).

I need some advice on storing data in MySQL, where one needs to store more than one value, let's say user ids, for a single post

In cases when someone needs to store more than one value in a cell, what approach is more desirable and advisable: storing it with delimiters or glue and exploding it into an array later for processing in the server-side language of choice, for example:
$returnedFromDB = "159|160|161|162|163|164|165";
$myIdArray = explode("|",$returnedFromDB);
or as a JSON or PHP serialized array, like this.
:6:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;i:4;i:5;i:5;i:6;}
then later unserialize it into an array and work with it,
OR
have a new row for every new entry like this
postid 12 | showto 2
postid 12 | showto 3
postid 12 | showto 5
postid 12 | showto 6
postid 12 | showto 8
instead of postid 12 | showto "2|3|4|6|8|5|".
OR postid 12 | showto ":6:{i:0;i:2;i:1;i:3;i:2;i:3;i:3;i:4;i:4;i:5;i:5;i:6;}".
Thanks, looking forward to your opinions :D

In cases when someone needs to store more than one value in a cell, what approach is more desirable and advisable: storing it with delimiters or glue and exploding it into an array later for processing in the server-side language of choice, for example.
Neither. Oh goodness, neither! Edgar F. Codd is rolling in his grave right now.
Storing delimited data in a text field is no better than storing it in a flat file. The data becomes unqueryable. Storing PHP serialized data in a text field is even worse because then only PHP can parse the data.
You want a nice, happy, normalized database.
The thing you're trying to describe is a many-to-many relationship. Each user can maintain one or more posts. Likewise, each post can be maintained by one or more users. Right? Then something like this will work.
CREATE TABLE users (
user_id INTEGER PRIMARY KEY,
...
);
CREATE TABLE posts (
post_id INTEGER PRIMARY KEY,
...
);
CREATE TABLE user_posts (
user_id INTEGER REFERENCES users(user_id),
post_id INTEGER REFERENCES posts(post_id),
UNIQUE KEY(user_id, post_id)
);
-- All posts made by user 22.
SELECT posts.*
FROM posts, user_posts
WHERE user_posts.user_id = 22
AND posts.post_id = user_posts.post_id;
-- All users that worked on post 47
SELECT users.*
FROM users, user_posts
WHERE user_posts.post_id = 47
AND users.user_id = user_posts.user_id;

Most of the time the recommendation is that many-to-many relationships (such as posts to users) should have a mapping table with 1 row for each post-user combination (in other words, your "new row for every new entry" version).
It works better for things like join queries, and lets you retrieve only the data you need.

You should only serialize data in the DB if the DB never needs to process that data. For example, you could serialize the user IDs in the user_id field if you never need to query on the user_id field, e.g. never selecting anything based on user.
If these are posts (blog/news/etc. posts?) then I'm pretty confident you'll need to be able to query them by user. Normalizing the user into another table would serve you:
CREATE TABLE posts (post_id, ....);
CREATE TABLE post_users (post_id, user_id, ...);
You can then get the users in a different query, or use group_concat: SELECT post_id, GROUP_CONCAT(user_id) FROM posts JOIN post_users USING (post_id) GROUP BY post_id. When you need to show user name, just join to the users table to get their name in the group concat.
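
A runnable sketch of the row-per-entry layout with a GROUP_CONCAT read-back, assuming Python's built-in sqlite3 module in place of MySQL (SQLite's group_concat takes the separator as a second argument instead of MySQL's SEPARATOR keyword). The title column, the helper names, and the sample ids are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_users (
        post_id INTEGER NOT NULL REFERENCES posts(post_id),
        user_id INTEGER NOT NULL,
        PRIMARY KEY (post_id, user_id)
    );
    -- Sample data: post 12 is shown to users 2, 3, 5, 6 and 8.
    INSERT INTO posts VALUES (12, 'example post');
    INSERT INTO post_users VALUES (12, 2), (12, 3), (12, 5), (12, 6), (12, 8);
    """
)


def users_for_post(conn, post_id):
    """Return the user ids for one post, via a delimited GROUP_CONCAT string."""
    row = conn.execute(
        "SELECT GROUP_CONCAT(user_id, '|') FROM post_users WHERE post_id = ?",
        (post_id,),
    ).fetchone()
    # The aggregate yields NULL when the post has no rows.
    ids = row[0].split("|") if row[0] else []
    # Concatenation order is not guaranteed, so sort in the application.
    return sorted(int(value) for value in ids)


def posts_for_user(conn, user_id):
    """The reverse lookup a delimited string cannot do efficiently."""
    return [
        row[0]
        for row in conn.execute(
            "SELECT post_id FROM post_users WHERE user_id = ? ORDER BY post_id",
            (user_id,),
        )
    ]


shown_to = users_for_post(conn, 12)
visible_posts = posts_for_user(conn, 5)
```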

From an RDBMS point of view, I would 'have a new row for every new entry'.
That's called an m:n relationship table.
You can then query the data however you like.
If you need postid 12 | showto ":6:{i:0;i:2;i:1;i:3;i:2;i:3;i:3;i:4;i:4;i:5;i:5;i:6;}". you can do
SELECT postid, CONCAT(':',count(showto),':{i:',GROUP_CONCAT(showto SEPARATOR ';i:'),';}') AS showto
FROM tablename
GROUP BY postid
However, if you only need the data in one form and won't run any other kind of queries on it, then you may as well store the string.
