I've been scratching my head at this problem all day and I simply can't work it out. This is the first time I've attempted to use SQL joins; while we do get taught the basics, I'm more into pushing a little further into the advanced stuff.
Basically I'm making my own forum, and I have two tables. f_topics (The threads) and f_groups (The forums, or categories). There is a relationship between topicBase in f_topics and groupID in f_groups, this shows which group each topic belongs to. Each topic has a unique ID called topicID and same for the groups, called groupID.
Basically, I'm trying to get all these columns into a single SELECT statement - The title of the topic, the date the topic was posted, the ID of the group the topic belongs in, and the name of that group. This is what I was trying to use, but the group always comes back as 1, even if the topic is in groupID 2:
$query=mysqli_query($link, "
SELECT `topicName`, `topicDate`, `groupName`, `groupID`
FROM `f_topics`
NATURAL JOIN `f_groups`
WHERE `f_topics`.`topicID`='$tid';
") or die("Failed to get topic detail E: ".mysqli_error($link));
var_dump(mysqli_fetch_assoc($query));
Sorry if this doesn't make much sense, and if my entire logic is completely wrong, if so could you suggest an alternate method?
Thanks for reading!
To join tables, you need to map the foreign keys. Assuming your groups table has a groupID field and topicBase in f_topics references it, this is how you'd join them:
SELECT `topicName`, `topicDate`, `groupName`, `groupID`
FROM `f_topics`
LEFT JOIN `f_groups`
ON `f_topics`.`topicBase` = `f_groups`.`groupID`
WHERE `f_topics`.`topicID`='$tid';
So from what I gather there is a column in f_topics named "topicBase" which references the groupID column from the f_groups table.
Based on that assumption, you can perform either an INNER JOIN or a LEFT JOIN. INNER requires there be an entry in both tables while LEFT requires there only be data in f_topics.
SELECT
f_topics.topicName,
f_topics.topicDate,
f_groups.groupName,
f_groups.groupID
FROM
f_topics
INNER JOIN
f_groups
ON
f_topics.topicBase = f_groups.groupID
WHERE
f_topics.topicID = '$tid'
I recommend you avoid NATURAL JOIN.
Primarily because a working query can be broken by the addition of a new column in a referenced table, which matches a column name in the other referenced table.
Secondly, for any reader (reviewer) of the SQL, it is not clear which columns are being matched to which without a careful review of both tables. (And if someone has added a column that has broken the query, it becomes even more difficult to figure out what the JOIN criteria used to be before the column was added.)
Instead, I recommend you specify the column names in a predicate in the ON clause.
It's also good practice to qualify all column references by table name, or preferably, a shorter table alias.
For simpler statements, I agree that this may look like unnecessary overhead. But once statements become more complicated, this pattern VASTLY improves the readability of the statement.
Absent the definitions of the two tables, I'm going to have to make assumptions, and I "guess" that there is a groupID column in both of those tables, and that it is the only column that is named the same. But you specify that it's the topicBase column in f_topics that matches groupID in f_groups. (And the NATURAL JOIN won't get you that.)
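To make the NATURAL JOIN problem concrete, here is a minimal sketch in Python with an in-memory SQLite database (not the asker's MySQL setup). The table definitions and sample rows are assumptions, since the question does not show them; in particular it assumes the two tables share no column name. In that case NATURAL JOIN has nothing to match on and pairs the topic with every group, which is one possible explanation for the group "always coming back as 1". I believe MySQL behaves the same way, but the sketch only demonstrates SQLite.

```python
import sqlite3

# Assumed schema and sample data: the original post shows neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f_groups (groupID INTEGER PRIMARY KEY, groupName TEXT);
    CREATE TABLE f_topics (topicID INTEGER PRIMARY KEY, topicName TEXT,
                           topicDate TEXT, topicBase INTEGER);
    INSERT INTO f_groups VALUES (1, 'General'), (2, 'Help');
    INSERT INTO f_topics VALUES (10, 'Join trouble', '2013-01-01', 2);
""")

# No column name is shared, so NATURAL JOIN degenerates to a cross join:
# the single topic is paired with every group.
natural_rows = conn.execute("""
    SELECT t.topicName, g.groupID
    FROM f_topics t NATURAL JOIN f_groups g
    WHERE t.topicID = ?
    ORDER BY g.groupID
""", (10,)).fetchall()

# An explicit ON clause states which columns relate the two tables.
explicit_rows = conn.execute("""
    SELECT t.topicName, g.groupID
    FROM f_topics t JOIN f_groups g ON g.groupID = t.topicBase
    WHERE t.topicID = ?
""", (10,)).fetchall()

print(natural_rows)
print(explicit_rows)
```

Fetching only the first row of the NATURAL JOIN result would report group 1 even though the topic belongs to group 2.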
I think the resultset you want will be returned by this query:
SELECT t.`topicName`
, t.`topicDate`
, g.`groupName`
, g.`groupID`
FROM `f_topics` t
JOIN `f_groups` g
ON g.`groupID` = t.`topicBase`
WHERE t.`topicID`='$tid';
If it's possible for the topicBase column to be NULL or to contain a value that does not match an f_groups.groupID value, and you want that topic returned, with the columns from f_groups returned as NULL (when there is no match), you can get that with an outer join.
To get that behavior, in the query above, add the LEFT keyword immediately before the JOIN keyword.
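The difference between the two join types can be sketched as follows, again using Python with in-memory SQLite rather than MySQL. The schema, the sample rows and the "Orphan topic" with a NULL topicBase are all assumptions made up for illustration.

```python
import sqlite3

# Assumed schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f_groups (groupID INTEGER PRIMARY KEY, groupName TEXT);
    CREATE TABLE f_topics (topicID INTEGER PRIMARY KEY, topicName TEXT,
                           topicDate TEXT, topicBase INTEGER);
    INSERT INTO f_groups VALUES (1, 'General'), (2, 'Help');
    INSERT INTO f_topics VALUES (10, 'Join trouble', '2013-01-01', 2),
                                (11, 'Orphan topic', '2013-01-02', NULL);
""")

# The same query, with only the join keyword swapped.
QUERY = """
    SELECT t.topicName, g.groupName
    FROM f_topics t
    {join} f_groups g ON g.groupID = t.topicBase
    ORDER BY t.topicID
"""

# Inner join: the topic without a matching group is dropped.
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()

# Left outer join: every topic is kept; missing group columns come back NULL.
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```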
Related
What I want to do is to query three separate tables into one row which is identified by a unique reference. I don't really have full understanding of the Join clause as it seems to require some sort of related data from each table.
I know I can go about this the long way round, but can not afford to lose even a little efficiency. Any help would be greatly appreciated.
Table Structure
package_id int(8),
client_id int(8),
unique reference varchar (40)
Each of the tables has essentially the same structure. I just need to know how to query all three, for 1 row.
If you have a few tables that share the same or a similar definition, you can use UNION or UNION ALL to treat them as one. This query will return the rows from each table that have the requested reference. I've included OriginTable info in case your code needs to refer back to the original table for an update or something else.
select 'TableA' OriginTable,
package_id,
client_id
from TableA
where reference = ?
union all
select 'TableB' OriginTable,
package_id,
client_id
from TableB
where reference = ?
union all
select 'TableC' OriginTable,
package_id,
client_id
from TableC
where reference = ?
You can extend the select list with other columns, provided that they have the same data type, or are implicitly convertible to the data type used in the first select.
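For a runnable version of the UNION ALL lookup, here is a small sketch in Python with in-memory SQLite. The helper name find_by_reference, the UNIQUE constraint and the sample rows are assumptions for illustration; the ? placeholders are bound once per branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Three tables with the same (assumed) structure as in the question.
for name in ("TableA", "TableB", "TableC"):
    conn.execute(
        f"CREATE TABLE {name} (package_id INTEGER, client_id INTEGER, "
        "reference TEXT UNIQUE)"
    )
conn.execute("INSERT INTO TableA VALUES (1, 10, 'ref-a')")
conn.execute("INSERT INTO TableB VALUES (2, 20, 'ref-b')")
conn.execute("INSERT INTO TableC VALUES (3, 30, 'ref-c')")


def find_by_reference(conn, reference):
    """Hypothetical helper: return matching rows tagged with their origin table."""
    sql = """
        SELECT 'TableA' AS OriginTable, package_id, client_id
        FROM TableA WHERE reference = ?
        UNION ALL
        SELECT 'TableB' AS OriginTable, package_id, client_id
        FROM TableB WHERE reference = ?
        UNION ALL
        SELECT 'TableC' AS OriginTable, package_id, client_id
        FROM TableC WHERE reference = ?
    """
    # One bound value per placeholder, so the reference is repeated three times.
    return conn.execute(sql, (reference, reference, reference)).fetchall()


print(find_by_reference(conn, "ref-b"))
```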
Let's say you have 3 tables :
table1, table2 and table3 with structure
package_id int(8),
client_id int(8),
unique reference varchar (40)
Let's assume that column reference is unique key.
Then you can use this:
SELECT t1.exists_row AS in_table1, t2.exists_row AS in_table2, t3.exists_row AS in_table3 FROM
(SELECT COUNT(1) AS exists_row FROM table1 WHERE
reference = #reference ) t1,
(SELECT COUNT(1) AS exists_row FROM table2 WHERE
reference = #reference ) t2,
(SELECT COUNT(1) AS exists_row FROM table3 WHERE
reference = #reference ) t3
;
Replace #reference with the actual value of the unique key.
Or, if you provide the output of
SHOW CREATE TABLE
I can rewrite the SQL as an actual query.
It is entirely possible to create a join between tables using a where clause. In fact this is often what I do as I find it leads to clearer information of what you are actually doing, and if you don't get the results you expect you can debug it bit by bit.
That said however a join is certainly a lot quicker to write!
Please bear in mind I'm a bit rusty on SQL so I may have misremembered, and I'm not going to include any code as you haven't said what DBMS you are using and they all have slightly different syntax.
The thing to remember is that the join functions on a column with the same data (and type) within it.
It is much easier if each table has the 'joining' field named the same, then it should be a matter of
join <otherTable> using (<nameOfField>)
However, if you wish to use fields that have different names in the different tables you will need to list the fully qualified names, i.e. tableName.FieldName
If you are having trouble with natural, inner and outer, left and right, think of a Venn diagram with each table being a single circle. An inner join (a natural join is an inner join on the identically named columns) returns only the point of commonality between the tables. A left or right outer join returns that overlap plus the rest of the left or right circle, with left and right being the order of the tables in the main part of your select (the first being the left and the second being the right).
When you add a third table this is where you can select any of the cross over section using these keywords.
Again however I have always found it easier to do a primary select and create a temp table, then perform my next join using this temp table (so effectively only need to use natural or left and right again). Again I find this easier to debug.
The best thing is to experiment and see what you get in return. Without a diagram of your tables this is the best I can offer.
in brief...
nested selects where field = (select from table where field = )
and temp tables
are (I think) easier to debug... but do take more writing!
David.
$array_of_tables = array('table1', 'table2', 'table3'); // name of each table (placeholders)
$result_row = array();
foreach ($array_of_tables as $val)
{
$query = "select * from `$val` where $condition"; // $condition, e.g. a match on the unique reference
$result = mysqli_query($connection, $query);
$result_row[] = mysqli_fetch_assoc($result); // if only one row is going to be returned from each table
// check the resulting array for your row
}
SELECT * FROM table1 t1 JOIN table2 t2 ON (t2.reference = t1.reference) JOIN table3 t3 ON (t3.reference = t1.reference) WHERE t1.reference = ?;
You could use a JOIN like this, assuming all three tables have the same unique column (reference here) and the value is present in every table.
I hate to submit a new question, but everyone else has some slight thing that is different enough to make this one seem necessary to ask.
Users are to type in a vendor name, and then see all the "kinds" of things they have bought from that company, in a list, sorted by the lowest-inventory-on-hand.
Summary:
I have three tables.
There are more fields than these, but these are the relevant ones (as far as I can tell).
stuff_table
stuff_vendor_name *(search this field with $user_input, but only one result per lookup_type)*
lookup_type
lookup_table
lookup_type
lookup_quantity (order by this)
category_type
category_table
category_type
category_location (check if this field == $this_location, which is already assigned)
Wordier Explanation:
The users are searching for a value that is contained only in the stuff_table -- distinct stuff_vendor_name values for each lookup_type. Each item can be bought from multiple sources, the idea is to see if any vendor has ever sold even one of any type of item before.
But the results need to be ORDER BY the lookup_quantity, in the lookup_table.
And importantly, I have to check to see if they are searching the correct location for these categories, located in the category_table in the category_location field.
How do I efficiently make this query?
Above, I mentioned the variables that I have:
$user_input (the value we are searching for distinct matches in the stuff_vendor_name field) and $this_location.
To understand the relationship of these tables, I will use an example.
The stuff_table would have dozens of entries with dozens of vendors, but have a lookup_type of, say, "watermelon," "apple," or "cherry."
The lookup_table would give the category_type of "Jellybean." One category type can have multiple lookup_types. But each lookup_type has exactly one category_type.
You are not sharing much about the relationships, but try this:
SELECT *
FROM stuff_table st
LEFT JOIN lookup_table lt
ON st.lookup_type = lt.lookup_type
LEFT JOIN category_table ct
ON lt.category_type = ct.category_type
AND ct.category_location = '$this_location'
WHERE st.stuff_vendor_name = '$user_input'
GROUP BY st.lookup_type
ORDER BY lt.lookup_quantity
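As a self-contained sketch of the same three-table lookup, here is a version in Python with in-memory SQLite. It is only one reading of the question: the column types, the sample rows and the location names are assumptions, and it uses inner joins plus DISTINCT so that each lookup_type appears once and only categories at the requested location are kept. The user input is bound as a parameter instead of being pasted into the SQL string.

```python
import sqlite3

# Assumed column types and sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stuff_table (stuff_vendor_name TEXT, lookup_type TEXT);
    CREATE TABLE lookup_table (lookup_type TEXT, lookup_quantity INTEGER,
                               category_type TEXT);
    CREATE TABLE category_table (category_type TEXT, category_location TEXT);

    INSERT INTO stuff_table VALUES
        ('Acme', 'watermelon'), ('Acme', 'watermelon'),
        ('Acme', 'cherry'), ('Acme', 'bolt'), ('Other', 'apple');
    INSERT INTO lookup_table VALUES
        ('watermelon', 5, 'Jellybean'), ('cherry', 2, 'Jellybean'),
        ('apple', 9, 'Jellybean'), ('bolt', 1, 'Hardware');
    INSERT INTO category_table VALUES
        ('Jellybean', 'Warehouse 1'), ('Hardware', 'Warehouse 2');
""")

user_input = "Acme"
this_location = "Warehouse 1"

# One row per lookup_type bought from the vendor, limited to categories
# at the current location, lowest inventory first.
rows = conn.execute("""
    SELECT DISTINCT lt.lookup_type, lt.lookup_quantity
    FROM stuff_table st
    JOIN lookup_table lt ON lt.lookup_type = st.lookup_type
    JOIN category_table ct ON ct.category_type = lt.category_type
    WHERE st.stuff_vendor_name = ?
      AND ct.category_location = ?
    ORDER BY lt.lookup_quantity
""", (user_input, this_location)).fetchall()

print(rows)
```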
At first glance, you could use foreign keys in your tables to link them, or use MySQL's LEFT JOIN to pull in the columns of each linked table.
The only example I can think of is on a Doctrine pattern, but I think you'll get what I'm saying:
$q = Doctrine_Query::create()
->from('Default_Model_DbTable_StuffTable s')
->leftJoin('s.LookupTable l')
->leftJoin('s.CategoryTable c')
->orderBy('l.lookup_quantity DESC');
$stuff= $q->execute(array(), Doctrine_Core::HYDRATE_ARRAY);
I made a nested query instead.
The final code looks like this:
$query_row=mysql_query(
"SELECT DISTINCT * FROM table_a WHERE
field_1 IN (SELECT field_1 FROM table_b WHERE field_2 = $field_2)
AND field_3 IN (SELECT field_3 FROM table_c WHERE field_4 = $field_4)
ORDER BY field_5 DESC
");
This was incredibly simple. I just didn't know you could do a nested query like that.
I read it was "bad form" because it makes some kind of search optimization not as good as it could be, so be careful using nested select statements.
However for me, it seemed to actually be significantly faster.
Can anyone tell me how I can speed up a MySQL GROUP BY clause? I've read the documentation but it doesn't give any good examples.
UPDATE SQL
SELECT
post.topic_id,
topic.topic_posts,
topic.topic_title,
topic.topic_poster_name,
topic.topic_last_post_id,
forum.forum_name AS group_name,
`group`.slug AS child_slug,
`parent`.slug AS parent_slug
FROM bb_posts post
LEFT JOIN bb_topics topic
ON topic.topic_id = post.topic_id
LEFT JOIN bb_forums forum
ON forum.forum_id = topic.forum_id
LEFT JOIN wp_bp_groups `group`
ON topic.forum_id = `group`.id
LEFT JOIN wp_bp_groups `parent`
ON `group`.parent_id = `parent`.id
WHERE (topic_title LIKE '%$search_terms%' || MATCH(post.post_text) AGAINST('$search_terms'))
&& topic_status = 0
GROUP BY topic_id
ORDER BY topic.topic_start_time DESC
LIMIT $offset,$num
http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html
Group by is fastest when you have an index on the column being grouped on, and:
The query is over a single table.
The GROUP BY names only columns that form a leftmost prefix of the index and no other columns. (If, instead of GROUP BY, the query has a DISTINCT clause, all distinct attributes refer to columns that form a leftmost prefix of the index.) For example, if a table t1 has an index on (c1,c2,c3), loose index scan is applicable if the query has GROUP BY c1, c2. It is not applicable if the query has GROUP BY c2, c3 (the columns are not a leftmost prefix) or GROUP BY c1, c2, c4 (c4 is not in the index).
The only aggregate functions used in the select list (if any) are MIN() and MAX(), and all of them refer to the same column. The column must be in the index and must follow the columns in the GROUP BY.
Any other parts of the index than those from the GROUP BY referenced in the query must be constants (that is, they must be referenced in equalities with constants), except for the argument of MIN() or MAX() functions.
For columns in the index, full column values must be indexed, not just a prefix. For example, with c1 VARCHAR(20), INDEX (c1(10)), the index cannot be used for loose index scan.
The general best practice would be to make sure the field you are grouping on has an index.
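The effect of an index on the grouped column can be seen in a query plan. The sketch below uses Python with in-memory SQLite, whose planner is not MySQL's, so treat it only as an illustration of the general idea; on MySQL you would run EXPLAIN on your own query instead. The table t1, its data and the index name idx_t1_c1 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (c1 INTEGER, c2 INTEGER, c3 INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(i % 3, i % 5, i) for i in range(30)],
)

GROUPED = "SELECT c1, COUNT(*) FROM t1 GROUP BY c1"


def plan(sql):
    """Join the human-readable detail column of each EXPLAIN QUERY PLAN row."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Without an index the engine has to sort or build a temporary structure
# to bring the rows of each group together.
plan_before = plan(GROUPED)
result_before = conn.execute(GROUPED).fetchall()

# With an index on the grouped column, rows can be read in group order.
conn.execute("CREATE INDEX idx_t1_c1 ON t1 (c1)")
plan_after = plan(GROUPED)
result_after = conn.execute(GROUPED).fetchall()

print(plan_before)
print(plan_after)
```

The index changes how the groups are found, not what the query returns.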
From the Reference Manual: Group by Optimization
The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any). In some cases, MySQL is able to do much better than that and to avoid creation of temporary tables by using index access.
Make sure that every foreign key has a corresponding index.
Create covering indexes on the fields you retrieve.
Creating an index on the field you are sorting by wouldn't hurt either.
Make your where clause part of the join conditions.
I've a ('courses') table that has a HABTM relationship with ('instructors') table through another table...
I want to get the data of an instructor with all related courses in one query..
Currently, I have the following SQL:
SELECT *
FROM `instructors` AS `instructor`
LEFT JOIN `courses` AS `course`
ON `course`.`id` IN (
SELECT `course_id`
FROM `course_instructors`
WHERE `course_instructors`.`instructor_id` = `instructor`.`id`
)
WHERE `instructor`.`id` = 1
This SQL does what it should be doing, the only "problem" I have is that I get multiple rows, one for each joined row.
My question is:
Can I get the result I want in one query? Or do I have to manipulate the data in PHP?
I'm using PHP and MySQL.
Each record of a query result set has the same format: same number of fields, same fields, same order of fields. You cannot change that.
SELECT *
FROM instructors AS instructor
LEFT JOIN
course_instructors
ON
instructor.id= course_instructors.instructor_id
LEFT JOIN
courses AS course
ON
course_instructors.course_id = course.id
WHERE instructor.id = 1
This assumes the PK of course_instructors is (instructor_id,course_id)
Explanation of query:
FROM + WHERE make sure you get the relevant instructor.
The first join matches ALL the entries from the course_instructors table that belong to this instructor. If none are found, it will return one row with NULL in all of its fields.
The last join matches all relevant courses for the entries found in course_instructors. If none are found, it will return one record with NULL in all of the course fields.
Again: important to use the right constraints to avoid duplicate data.
That's the nature of relational databases. You need to get the instructor first and then get the related courses. That's how I would do it and that's how I've been doing it. I'm not sure if there is a "hack" to it.
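If you do go the one-query route, the repeated instructor columns can be folded into a single structure in application code. Here is a minimal sketch in Python with in-memory SQLite instead of PHP and MySQL. The name and title columns and the sample rows are assumptions; only the table names and key columns come from the question.

```python
import sqlite3

# Assumed columns (name, title) and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE course_instructors (
        instructor_id INTEGER, course_id INTEGER,
        PRIMARY KEY (instructor_id, course_id)
    );
    INSERT INTO instructors VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO courses VALUES (1, 'Algebra'), (2, 'Biology');
    INSERT INTO course_instructors VALUES (1, 1), (1, 2), (2, 2);
""")

# One row per (instructor, course) pair; the instructor columns repeat.
rows = conn.execute("""
    SELECT i.id, i.name, c.title
    FROM instructors i
    LEFT JOIN course_instructors ci ON ci.instructor_id = i.id
    LEFT JOIN courses c ON c.id = ci.course_id
    WHERE i.id = ?
    ORDER BY c.id
""", (1,)).fetchall()

# Fold the repeated rows into one record with a list of courses.
instructor = None
for instructor_id, name, title in rows:
    if instructor is None:
        instructor = {"id": instructor_id, "name": name, "courses": []}
    if title is not None:  # NULL title means the instructor has no courses
        instructor["courses"].append(title)

print(instructor)
```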
I am trying to create a search functionality where users would type a word or key phrase and then information is displayed.
I was thinking of using LEFT JOIN to add all the tables I need to be searchable. Someone has told me about UNION, and I have a hunch that it may be slower than JOIN
so
$query = '
SELECT *
FROM t1
LEFT JOIN t2
ON t2.content = "blabla"
LEFT JOIN t3
ON t3.content = "blabla"
[...]
WHERE t1.content = "blabla"
';
Is the above good practice, or is there a better approach I should be looking into?
Send me on the right path for this :) Also explain why it's wrong and why you think your approach is better, so it will help me and others understand this:
In general, it's a bad idea to play hunches to "guess" what the performance of an SQL engine will be like. There is very sophisticated optimization happening in there which takes into account the size of the tables, the availability of indexes, the cardinality of indexes, and so on.
In this example, LEFT JOIN is wrong because you're producing a semi-cartesian JOIN. Basically, there will be a lot more rows in your result set than you think. That's because each matching row in t1 will be joined with each matching row in t2. If ten rows match in t1 and three in t2, you will not get ten results but thirty.
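The ten-times-three arithmetic can be checked with a small sketch in Python with in-memory SQLite. The single content column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (content TEXT)")
conn.execute("CREATE TABLE t2 (content TEXT)")

# Ten matching rows in t1 and three in t2, plus one non-matching row each.
conn.executemany("INSERT INTO t1 VALUES (?)", [("blabla",)] * 10 + [("other",)])
conn.executemany("INSERT INTO t2 VALUES (?)", [("blabla",)] * 3 + [("other",)])

# The ON clause does not relate t2 to t1, so every matching t1 row
# is paired with every matching t2 row.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM t1
    LEFT JOIN t2 ON t2.content = 'blabla'
    WHERE t1.content = 'blabla'
""").fetchone()[0]

print(joined_count)
```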
Even if only one row is guaranteed to match from each table (eliminating the cartesian join problem) it's clear that the LEFT JOIN solution will give you a dataset that's very hard to work with. That's because the content columns from each of the tables you JOIN will be separate columns in the result set. You'll have to examine each of the columns to figure out which table matched.
In this case, UNION is a better solution.
Also, please note:
Use of "*" in SELECT is generally not a good idea. It reduces performance (because all columns must be assembled in the result set) and in a case like this you lose the opportunity to ALIAS each of the content columns, making the result set harder to work with.
This is a very novel use of LEFT JOIN. Normally, it's used to associate rows from two different tables. In this case you're using it to produce three separate result sets "side-by-side". Most SQL programmers will have to look at this statement cross-eyed for a while to figure out what your intent was.
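A UNION-based search along the lines suggested above might look like the following sketch, written in Python with in-memory SQLite. The helper name search, the source label column and the sample rows are assumptions. Each branch names its columns explicitly and tags rows with the table they came from, so the result is one flat list rather than side-by-side column sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("t1", "t2", "t3"):
    conn.execute(f"CREATE TABLE {name} (content TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?)", [("blabla",), ("other",)])
conn.execute("INSERT INTO t2 VALUES ('blabla')")
conn.execute("INSERT INTO t3 VALUES ('nope')")


def search(conn, term):
    """Hypothetical helper: return (source table, content) for every match."""
    sql = """
        SELECT 't1' AS source, content FROM t1 WHERE content = ?
        UNION ALL
        SELECT 't2' AS source, content FROM t2 WHERE content = ?
        UNION ALL
        SELECT 't3' AS source, content FROM t3 WHERE content = ?
        ORDER BY source
    """
    # The search term is bound once per branch of the UNION ALL.
    return conn.execute(sql, (term, term, term)).fetchall()


print(search(conn, "blabla"))
```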