Imagine a table for articles. In addition to the main query:
SELECT * From articles WHERE article_id='$id'
We also need several other queries to get the author and the tags:
SELECT * FROM users WHERE user_id='$author_id' -- Taken from main query
SELECT tags.tag
FROM tags
INNER JOIN tag_map
ON tags.tag_id=tag_map.tag_id
WHERE article_id='$id'
and several more queries for categories, similar articles, etc
Question 1: Is it best to run these queries separately from PHP and handle each result, or is there a way to combine them?
Question 2: In the absence of many-to-many relationships (e.g. one tag, category and author for every article, identified by tag_id, category_id, author_id), what is the best (fastest) way to retrieve data from the tables?
If all the relationships are one-to-many, you could quite easily retrieve all this data in one query such as:
SELECT
[fields required]
FROM
articles a
INNER JOIN
users u ON a.author_id=u.user_id
INNER JOIN
tag_map tm ON tm.article_id=a.article_id
INNER JOIN
tags t ON t.tag_id=tm.tag_id
WHERE
a.article_id='$id'
This is usually faster than running the three queries separately, as long as your tables are indexed correctly, because joins are what MySQL is built to do. It also saves two round trips to the database and the associated overhead.
You can merge in the user in the first query:
SELECT a.*, u.*
FROM articles a
JOIN users u ON u.user_id = a.author_id
WHERE a.article_id='$id';
You could do the same with the tags, but that would introduce some redundancy in the answer, because there are obviously multiple tags per article. May or may not be beneficial.
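The redundancy mentioned here can be seen, and collapsed in application code, with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for PHP/MySQL; the table contents and the `fetch_article` helper are made up for illustration, and the column names follow the queries in the question.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    CREATE TABLE tags     (tag_id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE tag_map  (article_id INTEGER, tag_id INTEGER);
    INSERT INTO users    VALUES (1, 'alice');
    INSERT INTO articles VALUES (10, 'Joins explained', 1);
    INSERT INTO tags     VALUES (1, 'sql'), (2, 'mysql');
    INSERT INTO tag_map  VALUES (10, 1), (10, 2);
""")

def fetch_article(conn, article_id):
    """Fetch article, author and tags in one round trip, then fold the rows."""
    rows = conn.execute(
        """
        SELECT a.title, u.name, t.tag
        FROM articles a
        JOIN users u          ON u.user_id = a.author_id
        LEFT JOIN tag_map tm  ON tm.article_id = a.article_id
        LEFT JOIN tags t      ON t.tag_id = tm.tag_id
        WHERE a.article_id = ?   -- bound parameter instead of '$id'
        ORDER BY t.tag
        """,
        (article_id,),
    ).fetchall()
    if not rows:
        return None
    # Title and author repeat on every row; only the tag differs.
    title, author, _ = rows[0]
    tags = [tag for _, _, tag in rows if tag is not None]
    return {"title": title, "author": author, "tags": tags, "row_count": len(rows)}

article = fetch_article(conn, 10)
print(article)
```

The article and author columns come back once per tag, so the saving in round trips is paid for with repeated data in the result set.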
In the absence of many-to-many relationships, this would do the job in one fell swoop and would be superior in any case:
SELECT *
FROM users u
JOIN articles a ON a.author_id = u.user_id
JOIN tags t USING (tag_id) -- I assume a column articles.tag_id in this case
WHERE a.article_id = '$id';
You may want to be more selective about which columns to return. If tags are not guaranteed to exist, make the second JOIN a LEFT JOIN.
You could add an appropriately denormalized view over your normalized tables where each record contains all the data you need. Or you could encapsulate the SQL calls in stored procedures and call these procs from your code, which may aid performance. Try both and get the hard figures; it is always better to make decisions based on evidence rather than ideas. :)
I have two tables, sections and articles. Originally it was a one-to-many relationship, therefore articles has a sectionID column. Then one day it was decided to allow each article to belong to two sections. Rather than making another 'articles-in-section' table, I opted to not deal with refactoring the data, and I just added an additionalSectionID column onto the articles table. Now I have a problem.
I want to show all the articles associated with a given section, whether as its primary or secondary section. Essentially, I'm looking for some sort of double join between two tables.
These two questions have answers to the same issue (1, 2), but for a different DB server. Is there a way to do this in PHP/MySQL?
Tables' structure is basically like this:
-- SECTIONS:
id title description moderatorID url
-- ARTICLES:
id title shortDesc fullText photo sectionID additionalSectionID
See below.
SELECT s.*, a.*
FROM sections s
LEFT JOIN articles a
ON s.id = a.sectionID
OR s.id = a.additionalSectionID
WHERE s.title = 'My Section';
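To check what the OR condition returns, here is a minimal sketch of the same join using Python's built-in sqlite3 module in place of PHP/MySQL. The sample rows and the `articles_in_section` helper are hypothetical; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sections (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT,
                           sectionID INTEGER, additionalSectionID INTEGER);
    INSERT INTO sections VALUES (1, 'My Section'), (2, 'Other'), (3, 'Empty');
    INSERT INTO articles VALUES
        (1, 'primary here',   1, NULL),
        (2, 'secondary here', 2, 1),
        (3, 'elsewhere',      2, NULL);
""")

def articles_in_section(conn, section_title):
    """Titles of articles whose primary OR secondary section matches."""
    rows = conn.execute(
        """
        SELECT a.title
        FROM sections s
        LEFT JOIN articles a
               ON s.id = a.sectionID
               OR s.id = a.additionalSectionID
        WHERE s.title = ?
        ORDER BY a.id
        """,
        (section_title,),
    ).fetchall()
    # The LEFT JOIN yields one all-NULL article row for an empty section.
    return [title for (title,) in rows if title is not None]

found = articles_in_section(conn, "My Section")
print(found)
```

Because it is a LEFT JOIN, a section without any articles still produces one row, with NULLs in the article columns; the helper filters that row out.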
You might try two joins on separate table aliases (a LEFT JOIN for the second one, since the additional section is optional), along the lines of:
SELECT
a.*,
s1.title AS sectionA,
s2.title AS sectionB
FROM articles a
INNER JOIN sections s1 ON a.sectionID = s1.id
LEFT JOIN sections s2 ON a.additionalSectionID = s2.id
I have a SQL SELECT statement in which I'm using 3 tables.
I'm using INNER JOINs to join the tables, however I've come across a bit of an issue because two of the columns that I'd like the join conditional to be based on are different data types;
One is an integer - the id of the products table and can be seen below as p.id.
The other is a comma-delimited string of these ids in the orders table. Customers can order more than one product at a time, so the product ids are stored as a comma-delimited list.
Here's how far I've gotten with the SQL:
"SELECT o.transaction_id, o.payment_status, o.payment_amount, o.product_id, o.currency, o.payment_method, o.payment_time, u.first_name, u.last_name, u.email, p.title, p.description, p.price
FROM orders AS o
INNER JOIN products AS p ON ( NEED HELP HERE--> p.id IN o.product_id comma delimited list)
INNER JOIN users AS u ON ( o.user_id = u.id )
WHERE user_id = '39'
ORDER BY payment_time DESC
LIMIT 1";
Perhaps I could use REGEX? Currently the comma-delimited list reads as '2,1,3', but the number of entries isn't limited, so I need a condition that checks whether my product id (p.id) is in the o.product_id list.
What you have is a perfect example of a many-to-many relationship: one order has several products attached to it, and the same product can appear in many orders. You should have a link table like
order_product, which makes the connection between an orderid and a productid, and where you can also put data specific to the relationship between the two (like when the item was added, quantity, etc.).
Then you make the join using this table and you have the same field types everywhere.
Simple example:
select
/* list of products */
from
orders o,
order_product op,
products p
where
o.id = 20
and o.id = op.orderid
and op.productid = p.id
This is one of those very common nightmares when working with legacy databases.
The rule is simple: never store multiple values in one table column. This requirement is part of first normal form.
But how to deal with that in existing DB?
The good thing™
If you have the opportunity to refactor your DB, extract the "comma separated values" to their own table. See http://sqlfiddle.com/#!2/0f547/1 for a basic example how to do that.
Then to query the tables you will have to use a JOIN as explained in elanoism's answer.
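As a rough illustration of that refactoring, the sketch below splits a legacy comma-separated column into link-table rows and then joins through them. It uses Python's built-in sqlite3 module rather than PHP/MySQL; the `migrate` helper, the sample rows and the exact `order_product` layout are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
    -- Legacy layout: product ids stored as a comma-separated string.
    CREATE TABLE orders   (id INTEGER PRIMARY KEY, product_id TEXT);
    -- Link table: one row per (order, product) pair.
    CREATE TABLE order_product (orderid INTEGER, productid INTEGER,
                                PRIMARY KEY (orderid, productid));
    INSERT INTO products VALUES (1, 'pen'), (2, 'ink'), (3, 'paper');
    INSERT INTO orders   VALUES (20, '2,1,3'), (21, '3');
""")

def migrate(conn):
    """Copy every id from the comma-separated column into order_product."""
    legacy = conn.execute("SELECT id, product_id FROM orders").fetchall()
    pairs = [
        (order_id, int(pid))
        for order_id, id_list in legacy
        for pid in id_list.split(",")
        if pid.strip()  # skip empty fragments such as a trailing comma
    ]
    conn.executemany("INSERT INTO order_product VALUES (?, ?)", pairs)
    return len(pairs)

migrated = migrate(conn)

# With the link table in place, the join compares integers with integers.
titles = [
    title
    for (title,) in conn.execute(
        """
        SELECT p.title
        FROM orders o
        JOIN order_product op ON op.orderid = o.id
        JOIN products p       ON p.id = op.productid
        WHERE o.id = ?
        ORDER BY p.id
        """,
        (20,),
    )
]
print(migrated, titles)
```

Once the data has been copied and verified, the old comma-separated column can be dropped.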
The bad thing™
If you can't or don't want to do that, you will probably have to rely on the FIND_IN_SET function.
SELECT * FROM bad WHERE FIND_IN_SET(target_value, comma_separated_values) > 0;
See http://sqlfiddle.com/#!2/29eba/2
BTW, why is this the bad thing™? Because, as you can see, it is not easy to write queries against multi-valued columns. More importantly, you cannot use an index on such a column and, as a consequence, cannot easily perform join operations or enforce referential integrity.
The so-so thing™
As a final note, if the set of possible values is small (fewer than 65), an alternative approach would be to change the column type to a SET().
I've been scratching my head at this problem all day and I simply can't work it out. This is the first time I've attempted to use SQL joins; while we do kinda get taught the basics, I'm more into pushing a little further into the advanced stuff.
Basically I'm making my own forum, and I have two tables. f_topics (The threads) and f_groups (The forums, or categories). There is a relationship between topicBase in f_topics and groupID in f_groups, this shows which group each topic belongs to. Each topic has a unique ID called topicID and same for the groups, called groupID.
Basically, I'm trying to get all these columns into a single SELECT statement - The title of the topic, the date the topic was posted, the ID of the group the topic belongs in, and the name of that group. This is what I was trying to use, but the group always comes back as 1, even if the topic is in groupID 2:
$query=mysqli_query($link, "
SELECT `topicName`, `topicDate`, `groupName`, `groupID`
FROM `f_topics`
NATURAL JOIN `f_groups`
WHERE `f_topics`.`topicID`='$tid';
") or die("Failed to get topic detail E: ".mysqli_error($link));
var_dump(mysqli_fetch_assoc($query));
Sorry if this doesn't make much sense, and if my entire logic is completely wrong, if so could you suggest an alternate method?
Thanks for reading!
To join tables, you need to map the foreign keys. Assuming your groups table has a groupID field and topicBase in f_topics refers to it, this is how you'd join them:
SELECT `topicName`, `topicDate`, `groupName`, `groupID`
FROM `f_topics`
LEFT JOIN `f_groups`
ON `f_topics`.`topicBase` = `f_groups`.`groupID`
WHERE `f_topics`.`topicID`='$tid';
So from what I gather there is a column in f_topics named "topicBase" which references the groupID column from the f_groups table.
Based on that assumption, you can perform either an INNER JOIN or a LEFT JOIN. INNER requires there be an entry in both tables while LEFT requires there only be data in f_topics.
SELECT
f_topics.topicName,
f_topics.topicDate,
f_groups.groupName,
f_groups.groupID
FROM
f_topics
INNER JOIN
f_groups
ON
f_topics.topicBase = f_groups.groupID
WHERE
f_topics.topicID = '$tid'
I recommend you avoid NATURAL JOIN.
Primarily because a working query can be broken by the addition of a new column in a referenced table, which matches a column name in the other referenced table.
Secondly, for any reader (reviewer) of the SQL, it is not clear which columns are being matched to which without a careful review of both tables. (And if someone has added a column that has broken the query, it becomes even more difficult to figure out what the JOIN criteria used to be before the column was added.)
Instead, I recommend you specify the column names in a predicate in the ON clause.
It's also good practice to qualify all column references by table name, or preferably, a shorter table alias.
For simpler statements, I agree that this may look like unnecessary overhead. But once statements become more complicated, this pattern VASTLY improves the readability of the statement.
Absent the definitions of the two tables, I'm going to have to make assumptions, and I "guess" that there is a groupID column in both of those tables, and that it is the only column that is named the same. But you specify that it's the topicBase column in f_topics that matches groupID in f_groups. (And the NATURAL JOIN won't get you that.)
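One possible explanation of the "group always comes back as 1" symptom is sketched below, under the assumption (different from the guess above) that the two tables share no column name at all. In that case NATURAL JOIN has nothing to match on and degenerates into a cross product, so every group is paired with the topic. The sketch uses Python's built-in sqlite3 module in place of MySQL, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption: no column name is shared between the two tables.
    CREATE TABLE f_groups (groupID INTEGER PRIMARY KEY, groupName TEXT);
    CREATE TABLE f_topics (topicID INTEGER PRIMARY KEY, topicName TEXT,
                           topicBase INTEGER);
    INSERT INTO f_groups VALUES (1, 'General'), (2, 'Help');
    INSERT INTO f_topics VALUES (7, 'How do joins work?', 2);
""")

# NATURAL JOIN with no common column: the topic is paired with every group.
natural = conn.execute(
    "SELECT topicName, groupID FROM f_topics NATURAL JOIN f_groups "
    "WHERE topicID = ?",
    (7,),
).fetchall()

# Explicit ON clause: only the group referenced by topicBase is returned.
explicit = conn.execute(
    "SELECT t.topicName, g.groupID FROM f_topics t "
    "JOIN f_groups g ON g.groupID = t.topicBase "
    "WHERE t.topicID = ?",
    (7,),
).fetchall()

print(len(natural), explicit)
```

If the application then fetches only the first row of the NATURAL JOIN result, it may well see group 1 regardless of the topic's real group.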
I think the resultset you want will be returned by this query:
SELECT t.`topicName`
, t.`topicDate`
, g.`groupName`
, g.`groupID`
FROM `f_topics` t
JOIN `f_groups` g
ON g.`groupID` = t.`topicBase`
WHERE t.`topicID`='$tid';
If it's possible for the topicBase column to be NULL or to contain a value that does not match any f_groups.groupID value, and you want that topic returned with the columns from f_groups returned as NULL (when there is no match), you can get that with an outer join.
To get that behavior, in the query above, add the LEFT keyword immediately before the JOIN keyword.
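The difference between the two join types can be sketched with a topic whose topicBase is NULL. This uses Python's built-in sqlite3 module as a stand-in for MySQL, with hypothetical sample rows; the join keyword is substituted from a fixed string, never from user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f_groups (groupID INTEGER PRIMARY KEY, groupName TEXT);
    CREATE TABLE f_topics (topicID INTEGER PRIMARY KEY, topicName TEXT,
                           topicBase INTEGER);
    INSERT INTO f_groups VALUES (1, 'General');
    INSERT INTO f_topics VALUES (7, 'Orphaned topic', NULL);
""")

# Same query twice; only the join keyword differs.
QUERY = """
    SELECT t.topicName, g.groupName
    FROM f_topics t
    {join} f_groups g ON g.groupID = t.topicBase
    WHERE t.topicID = ?
"""

inner_rows = conn.execute(QUERY.format(join="JOIN"), (7,)).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT JOIN"), (7,)).fetchall()

print(inner_rows)  # the inner join drops the topic without a matching group
print(left_rows)   # the outer join keeps it, with None for groupName
```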
MySQL - Workbench (PHP):
Tables:
TUsers (One to many relationship with TCompanies):
TUsers_CompanyID (FOREIGN KEY)
TUsers_UserName
TUsers_UserPassword
TUsers_ID (UNIQUE)
TCompanies:
TCompanies_CompanyName
TCompanies_CompanyContactNumber
TCompanies_CompanyAddress
TCompanies_ID (UNIQUE)
Is it possible to link multiple tables in a relational database without using the JOIN or INNER JOIN query commands, and without duplicating data in tables?
In other words, is there another way of creating a relationship that makes one table "point" to the other's data?
So that one can query the following and successfully retrieve all the data from both tables at once:
MySQL:SELECT * FROM TUsers;
See example above..
You can do it without "appearing" to use a join (ie, the word JOIN won't be in the query), but MySQL will still perform a JOIN...
SELECT *
FROM TUsers, TCompanies
WHERE TUsers_CompanyID=TCompanies_ID;
You won't be able to escape having to use a JOIN for how you want to display your data, but what you might want to do is create what's known as a VIEW, so that you don't actually have to type out the JOIN commands whenever you want to query for the user data:
CREATE VIEW UsersView AS
SELECT *
FROM TUsers a
INNER JOIN TCompanies b ON a.TUsers_CompanyID = b.TCompanies_ID
Then once the view is defined, you can just select from UsersView like so:
SELECT * FROM UsersView
...And it will return the users information as well as the joined company information. You can think of views as a way to simplify (or "compactify") more complex queries, because underneath the hood, it's actually the same thing as:
SELECT *
FROM
(
SELECT *
FROM TUsers a
INNER JOIN TCompanies b ON a.TUsers_CompanyID = b.TCompanies_ID
) UsersView
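A minimal runnable sketch of the view approach, using Python's built-in sqlite3 module instead of MySQL and made-up sample rows (only a subset of the question's columns is created):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TCompanies (TCompanies_ID INTEGER PRIMARY KEY,
                             TCompanies_CompanyName TEXT);
    CREATE TABLE TUsers (TUsers_ID INTEGER PRIMARY KEY,
                         TUsers_UserName TEXT,
                         TUsers_CompanyID INTEGER
                             REFERENCES TCompanies (TCompanies_ID));
    INSERT INTO TCompanies VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO TUsers VALUES (1, 'ann', 1), (2, 'bob', 2);

    -- The join is written once, inside the view definition.
    CREATE VIEW UsersView AS
    SELECT *
    FROM TUsers a
    INNER JOIN TCompanies b ON a.TUsers_CompanyID = b.TCompanies_ID;
""")

# Callers query the view like a table; no JOIN appears in their SQL.
rows = conn.execute(
    "SELECT TUsers_UserName, TCompanies_CompanyName "
    "FROM UsersView ORDER BY TUsers_ID"
).fetchall()
print(rows)
```

The view stores no data of its own, so nothing is duplicated; the join still runs each time the view is queried.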
I am trying to create a search functionality where users would type a word or key phrase and then information is displayed.
I was thinking of using LEFT JOIN to add all the tables I need to be searchable. Someone has told me about UNION, and I have a hunch that it may be slower than JOIN
so
$query = '
SELECT *
FROM t1
LEFT JOIN t2
ON t2.content = "blabla"
LEFT JOIN t3
ON t3.content = "blabla"
[...]
WHERE t1.content = "blabla"
';
Is the above good practice, or is there a better approach I should be looking into?
Send me down the right path for this :) Also explain why it's wrong and why you think your approach is better, so it will help me and others understand this.
In general, it's a bad idea to play hunches to "guess" what the performance of an SQL engine will be like. There is very sophisticated optimization happening in there which takes into account the size of the tables, the availability of indexes, the cardinality of indexes, and so on.
In this example, LEFT JOIN is wrong because you're producing a semi-cartesian JOIN. Basically, there will be a lot more rows in your result set than you think. That's because each matching row in t1 will be joined with each matching row in t2. If ten rows match in t1 and three in t2, you will not get ten results but thirty.
Even if only one row is guaranteed to match from each table (eliminating the cartesian join problem) it's clear that the LEFT JOIN solution will give you a dataset that's very hard to work with. That's because the content columns from each of the tables you JOIN will be separate columns in the result set. You'll have to examine each of the columns to figure out which table matched.
In this case, UNION is a better solution.
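A hedged sketch of what the UNION version could look like, using Python's built-in sqlite3 module in place of PHP/MySQL. The `source` label column, the `search` helper and the sample rows are assumptions added for illustration; t1, t2, t3 and content come from the question. UNION ALL is used because each branch is tagged with its table name, so duplicates across tables should not be merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE t3 (id INTEGER PRIMARY KEY, content TEXT);
    INSERT INTO t1 VALUES (1, 'blabla'), (2, 'other');
    INSERT INTO t2 VALUES (1, 'blabla'), (2, 'blabla');
    INSERT INTO t3 VALUES (1, 'nothing');
""")

def search(conn, term):
    """One result row per match, labelled with the table it came from."""
    return conn.execute(
        """
        SELECT 't1' AS source, id, content FROM t1 WHERE content = ?
        UNION ALL
        SELECT 't2', id, content FROM t2 WHERE content = ?
        UNION ALL
        SELECT 't3', id, content FROM t3 WHERE content = ?
        ORDER BY source, id
        """,
        (term, term, term),
    ).fetchall()

hits = search(conn, "blabla")
print(hits)
```

Here one match in t1 and two in t2 give three rows, one per match, rather than the 1 x 2 combinations a join would produce, and the `source` column says which table matched.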
Also, please note:
Use of "*" in SELECT is generally not a good idea. It reduces performance (because all columns must be assembled in the result set) and in a case like this you lose the opportunity to ALIAS each of the content columns, making the result set harder to work with.
This is a very novel use of LEFT JOIN. Normally, it's used to associate rows from two different tables. In this case you're using it to produce three separate result sets "side-by-side". Most SQL programmers will have to look at this statement cross-eyed for a while to figure out what your intent was.