I am developing a personal project for academic books. I have some tables with more than 30,000 rows each for works, editions, authors and so on. All the information about the books (genres, subjects, authors, publishers, etc.) is spread over a lot of tables with different types of relations.
I have a query for the main page that works, but the site takes six seconds to load. A lot of time… I was wondering which would be the proper approach for obtaining all the data I need with temporary tables.
What I want to do now is join the temporary table _work with the related data from another table, say «genre». But the relationship between «work» and «genre» goes through the temporary table «work_has_genre». I know how to do that with normal tables in a single query:
SELECT *
FROM work a
LEFT JOIN (
SELECT GROUP_CONCAT(f_a.id SEPARATOR '|') AS genre_id, GROUP_CONCAT(f_a.genre SEPARATOR '|') AS genre_name, f_b.work_id AS _work_id
FROM genre f_a
INNER JOIN (
SELECT *
FROM work_has_genre f_b_a
) f_b
ON f_a.id=f_b.genre_id
GROUP BY f_b.work_id
) f
ON a.id=f._work_id
WHERE a.id=13
I suppose the idea would be to break these actions into parts, but I don't know how. Could someone help me with a bit of pseudocode? Or maybe this is not the best approach. Any idea will be very welcome!
A.
As I said in comments, I would first suggest reworking/flattening the subqueries as much as possible, but once you get to semi-independent aggregations, temp tables can be helpful.
Generally, the pattern is to put each such aggregation subquery's results into its own temp table (with an index on the field the subquery was joined to the main query on), even if that means adding tables (and the main query's WHERE) to the original subquery, and then to join to the temp table in the main query.
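A minimal sketch of that pattern, applied to the genre aggregation from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the temp table name tmp_work_genre, the title column and the sample rows are assumptions made for illustration. In MySQL the equivalent would be CREATE TEMPORARY TABLE, and GROUP_CONCAT would take SEPARATOR '|' instead of a second argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE genre (id INTEGER PRIMARY KEY, genre TEXT);
    CREATE TABLE work_has_genre (work_id INTEGER, genre_id INTEGER);
    INSERT INTO work VALUES (13, 'Sample work'), (14, 'Other work');
    INSERT INTO genre VALUES (1, 'Essay'), (2, 'History');
    INSERT INTO work_has_genre VALUES (13, 1), (13, 2), (14, 2);

    -- Hypothetical temp table: one row per work. The primary key doubles as
    -- the index on the column the main query joins on.
    CREATE TEMP TABLE tmp_work_genre (
        work_id    INTEGER PRIMARY KEY,
        genre_id   TEXT,
        genre_name TEXT
    );
""")

work_id = 13

# Step 1: fill the temp table with the aggregation, already restricted by
# the main query's WHERE so only the needed rows are aggregated.
conn.execute("""
    INSERT INTO tmp_work_genre (work_id, genre_id, genre_name)
    SELECT whg.work_id,
           GROUP_CONCAT(g.id, '|'),
           GROUP_CONCAT(g.genre, '|')
    FROM work_has_genre whg
    JOIN genre g ON g.id = whg.genre_id
    WHERE whg.work_id = ?
    GROUP BY whg.work_id
""", (work_id,))

# Step 2: the main query joins to the small, pre-aggregated temp table.
row = conn.execute("""
    SELECT w.id, w.title, t.genre_id, t.genre_name
    FROM work w
    LEFT JOIN tmp_work_genre t ON t.work_id = w.id
    WHERE w.id = ?
""", (work_id,)).fetchone()
print(row)
```

Each further aggregation (subjects, authors, publishers) would get its own temp table in the same way, and the main query would LEFT JOIN to all of them.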
Related
I've been scratching my head at this problem all day and I simply can't work it out. This is the first time I've attempted to use SQL joins; while we do kind of get taught the basics, I'm more into pushing a little further into the advanced stuff.
Basically I'm making my own forum, and I have two tables. f_topics (The threads) and f_groups (The forums, or categories). There is a relationship between topicBase in f_topics and groupID in f_groups, this shows which group each topic belongs to. Each topic has a unique ID called topicID and same for the groups, called groupID.
Basically, I'm trying to get all these columns into a single SELECT statement - The title of the topic, the date the topic was posted, the ID of the group the topic belongs in, and the name of that group. This is what I was trying to use, but the group always comes back as 1, even if the topic is in groupID 2:
$query=mysqli_query($link, "
SELECT `topicName`, `topicDate`, `groupName`, `groupID`
FROM `f_topics`
NATURAL JOIN `f_groups`
WHERE `f_topics`.`topicID`='$tid';
") or die("Failed to get topic detail E: ".mysqli_error($link));
var_dump(mysqli_fetch_assoc($query));
Sorry if this doesn't make much sense, and if my entire logic is completely wrong, if so could you suggest an alternate method?
Thanks for reading!
To join tables, you need to map the foreign keys. Assuming your groups table has a groupID field, this is how you'd join them:
SELECT `topicName`, `topicDate`, `groupName`, `f_groups`.`groupID`
FROM `f_topics`
LEFT JOIN `f_groups`
ON `f_topics`.`groupID` = `f_groups`.`groupID`
WHERE `f_topics`.`topicID`='$tid';
So from what I gather there is a column in f_topics named "topicBase" which references the groupID column from the f_groups table.
Based on that assumption, you can perform either an INNER JOIN or a LEFT JOIN. INNER requires there be an entry in both tables while LEFT requires there only be data in f_topics.
SELECT
f_topics.topicName,
f_topics.topicDate,
f_groups.groupName,
f_groups.groupID
FROM
f_topics
INNER JOIN
f_groups
ON
f_topics.topicBase = f_groups.groupID
WHERE
f_topics.topicID = '$tid'
I recommend you avoid NATURAL JOIN.
Primarily because a working query can be broken by the addition of a new column in a referenced table, which matches a column name in the other referenced table.
Secondly, for any reader (reviewer) of the SQL, which columns are being matched to which columns is not clear without a careful review of both tables. (And if someone has added a column that has broken the query, it becomes even more difficult to figure out what the JOIN criteria used to be before the column was added.)
Instead, I recommend you specify the column names in a predicate in the ON clause.
It's also good practice to qualify all column references by table name, or preferably, a shorter table alias.
For simpler statements, I agree that this may look like unnecessary overhead. But once statements become more complicated, this pattern VASTLY improves the readability of the statement.
Absent the definitions of the two tables, I'm going to have to make assumptions, and I "guess" that there is a groupID column in both of those tables, and that it is the only column that is named the same. But you specify that it's the topicBase column in f_topics that matches groupID in f_groups. (And the NATURAL JOIN won't get you that.)
I think the resultset you want will be returned by this query:
SELECT t.`topicName`
, t.`topicDate`
, g.`groupName`
, g.`groupID`
FROM `f_topics` t
JOIN `f_groups` g
ON g.`groupID` = t.`topicBase`
WHERE t.`topicID`='$tid';
If it's possible for the topicBase column to be NULL or to contain a value that does not match an f_groups.groupID value, and you want that topic returned with the columns from f_groups returned as NULL (when there is no match), you can get that with an outer join.
To get that behavior, in the query above, add the LEFT keyword immediately before the JOIN keyword.
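The difference between the three join forms can be checked with a small experiment. This is only a sketch: it uses Python's sqlite3 module instead of MySQL, it assumes the schema as the asker describes it (topicBase in f_topics, groupID in f_groups, no column name shared by both tables), and the sample rows are made up. With no shared column name, NATURAL JOIN has nothing to match on and pairs the topic with every group, which would explain why the first fetched row always shows group 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f_groups (groupID INTEGER PRIMARY KEY, groupName TEXT);
    CREATE TABLE f_topics (topicID INTEGER PRIMARY KEY, topicName TEXT,
                           topicDate TEXT, topicBase INTEGER);
    INSERT INTO f_groups VALUES (1, 'General'), (2, 'Support');
    INSERT INTO f_topics VALUES (10, 'Login help', '2013-01-05', 2),
                                (11, 'Orphan topic', '2013-01-06', NULL);
""")

# NATURAL JOIN: no common column names, so every group is paired with the topic.
natural_rows = conn.execute("""
    SELECT t.topicName, g.groupID
    FROM f_topics t
    NATURAL JOIN f_groups g
    WHERE t.topicID = ?
""", (10,)).fetchall()

# Explicit ON clause: only the group the topic really belongs to.
JOIN_SQL = """
    SELECT t.topicName, g.groupName, g.groupID
    FROM f_topics t
    {join} f_groups g ON g.groupID = t.topicBase
    WHERE t.topicID = ?
"""
explicit_rows = conn.execute(JOIN_SQL.format(join="JOIN"), (10,)).fetchall()

# Topic 11 has no group: the inner join drops it, the LEFT JOIN keeps it.
inner_orphan = conn.execute(JOIN_SQL.format(join="JOIN"), (11,)).fetchall()
left_orphan = conn.execute(JOIN_SQL.format(join="LEFT JOIN"), (11,)).fetchall()

print(len(natural_rows), explicit_rows, inner_orphan, left_orphan)
```

Note that the topic ID is passed as a bound parameter rather than interpolated into the SQL string, which also avoids the injection risk of embedding '$tid' directly.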
I am building a chorus management database and need to create a particular query. (MySQL programming in PHP.)
I have a table of singers and a table of events which have a many-to-many relationship managed by a roster table. Each roster record links to one SingerID and one EventID. I would like to create a browse table of the form:
The catch is that some singers may have no events linked.
Is there a way to do this in a single MySQL query, or will I need to write one query to list all my singers, and a second query to list all of the events for each singer, and then examine the second query to extract the first and last records (assuming I sort the last query by date)?
Use a LEFT JOIN; that way a matching roster or event row doesn't need to exist for the singer to be returned.
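I agree with Mike. The query should look like this (both joins need to be outer joins, otherwise singers without events are dropped again; I assume the date column lives in the event table):
SELECT * FROM singer
LEFT OUTER JOIN roster ON singer.id = roster.singerid
LEFT OUTER JOIN event ON event.id = roster.eventid
ORDER BY singer.name, event.date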
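The browse table itself is not shown in the question, so the following is only a sketch under the assumption that it should list each singer with the dates of their first and last event. It uses Python's sqlite3 module in place of MySQL; the column names (name, title, date, singerid, eventid) and the sample rows are hypothetical. Grouping by singer and taking MIN and MAX of the event date gives both values in a single query, and the LEFT JOINs keep singers who have no events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT, date TEXT);
    CREATE TABLE roster (singerid INTEGER, eventid INTEGER);
    INSERT INTO singer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO event VALUES (1, 'Spring concert', '2014-03-01'),
                             (2, 'Summer gala',    '2014-07-15'),
                             (3, 'Winter recital', '2014-12-20');
    -- Alice sings at all three events, Bob at none.
    INSERT INTO roster VALUES (1, 1), (1, 2), (1, 3);
""")

rows = conn.execute("""
    SELECT s.name,
           MIN(e.date) AS first_event,
           MAX(e.date) AS last_event
    FROM singer s
    LEFT JOIN roster r ON r.singerid = s.id
    LEFT JOIN event  e ON e.id = r.eventid
    GROUP BY s.id, s.name
    ORDER BY s.name
""").fetchall()

for name, first_event, last_event in rows:
    print(name, first_event, last_event)
```

Singers without events come back with NULL (None in Python) in both date columns, so the application can render them as empty cells.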
Imagine a table for articles. In addition to the main query:
SELECT * FROM articles WHERE article_id='$id'
We also need several other queries to get the author and the tags:
SELECT * FROM users WHERE user_id='$author_id' // Taken from main query
SELECT tags.tag
FROM tags
INNER JOIN tag_map
ON tags.tag_id=tag_map.tag_id
WHERE article_id='$id'
and several more queries for categories, similar articles, etc
Question 1: Is it best to perform these queries separately with PHP and handle the given results, or is there a way to combine them?
Question 2: In the absence of many-to-many relationships (e.g. one tag, category, author for every article, identified by tag_id, category_id, author_id), what is the best (fastest) way to retrieve data from the tables?
If all the relationships are one-to-many then you could quite easily retrieve all this data in one query such as
SELECT
[fields required]
FROM
articles a
INNER JOIN
users u ON a.author_id=u.user_id
INNER JOIN
tag_map tm ON tm.article_id=a.article_id
INNER JOIN
tags t ON t.tag_id=tm.tag_id
WHERE
a.article_id='$id'
This would usually be faster than the three queries run separately, as long as your tables are indexed correctly, because MySQL is built to do this. It would also save two round trips to the database and the associated overhead.
You can merge in the user in the first query:
SELECT a.*, u.*
FROM articles a
JOIN users u ON u.user_id = a.author_id
WHERE a.article_id='$id';
You could do the same with the tags, but that would introduce some redundancy in the answer, because there are obviously multiple tags per article. May or may not be beneficial.
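To make the redundancy concrete, here is a small sketch using Python's sqlite3 module in place of MySQL; the title and name columns and the sample rows are assumptions for illustration. The first query returns one row per tag, repeating the article and author columns each time. The second is one possible alternative, not something the answer above proposes: it collapses the tags into a single string with GROUP_CONCAT (MySQL would use SEPARATOR ',' rather than a second argument).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT,
                           author_id INTEGER);
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE tag_map (article_id INTEGER, tag_id INTEGER);
    INSERT INTO users VALUES (1, 'ann');
    INSERT INTO articles VALUES (7, 'Joins explained', 1);
    INSERT INTO tags VALUES (1, 'sql'), (2, 'mysql');
    INSERT INTO tag_map VALUES (7, 1), (7, 2);
""")
article_id = 7

# One row per tag: title and author name are repeated on every row.
per_tag_rows = conn.execute("""
    SELECT a.title, u.name, t.tag
    FROM articles a
    JOIN users u    ON u.user_id = a.author_id
    JOIN tag_map tm ON tm.article_id = a.article_id
    JOIN tags t     ON t.tag_id = tm.tag_id
    WHERE a.article_id = ?
""", (article_id,)).fetchall()

# One row per article: tags are collapsed into a delimited string.
collapsed = conn.execute("""
    SELECT a.title, u.name, GROUP_CONCAT(t.tag, ',') AS tags
    FROM articles a
    JOIN users u         ON u.user_id = a.author_id
    LEFT JOIN tag_map tm ON tm.article_id = a.article_id
    LEFT JOIN tags t     ON t.tag_id = tm.tag_id
    WHERE a.article_id = ?
    GROUP BY a.article_id, a.title, u.name
""", (article_id,)).fetchone()

print(per_tag_rows)
print(collapsed)
```

Which form is preferable depends on how the page consumes the data: the per-tag rows are easier to loop over, while the collapsed row avoids resending the article columns.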
In the absence of many-to-many relationships, this would do the job in one fell swoop and would be superior in any case:
SELECT *
FROM users u
JOIN articles a ON a.author_id = u.user_id
JOIN tags t USING (tag_id) -- I assume a column articles.tag_id in this case
WHERE a.article_id = '$id';
You may want to be more selective about which columns to return. If tags are not guaranteed to exist, make the second JOIN a LEFT JOIN.
You could add an appropriately denormalized view over your normalized tables where each record contains all the data you need. Or you could encapsulate the SQL calls in stored procedures and call these procs from your code, which may aid performance. Prove both out and get the hard figures; it is always better to make decisions based on evidence rather than ideas. :)
I am trying to create a search functionality where users would type a word or key phrase and then information is displayed.
I was thinking of using LEFT JOIN to add all the tables I need to be searchable. Someone has told me about UNION, and I have a hunch that it may be slower than JOIN,
so
$query = '
SELECT *
FROM t1
LEFT JOIN t2
ON t2.content = "blabla"
LEFT JOIN t3
ON t3.content = "blabla"
[...]
WHERE t1.content = "blabla"
';
Is the above good practice, or is there a better approach I should be looking into?
Send me on the right path for this :) Please also explain why it's wrong and why you think your approach is better, so it will help me and others understand this:
In general, it's a bad idea to play hunches to "guess" what the performance of an SQL engine will be like. There is very sophisticated optimization happening in there which takes into account the size of the tables, the availability of indexes, the cardinality of indexes, and so on.
In this example, LEFT JOIN is wrong because you're producing a semi-cartesian JOIN. Basically, there will be a lot more rows in your result set than you think. That's because each matching row in t1 will be joined with each matching row in t2. If ten rows match in t1 and three in t2, you will not get ten results but thirty.
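The multiplication is easy to verify. The sketch below uses Python's sqlite3 module as a stand-in for MySQL and fills two hypothetical tables with ten and three matching rows; the id column is an assumption, while the query is the one from the question, cut down to two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, content TEXT);
""")
# Ten matching rows in t1, three matching rows in t2.
conn.executemany("INSERT INTO t1 (content) VALUES (?)", [("blabla",)] * 10)
conn.executemany("INSERT INTO t2 (content) VALUES (?)", [("blabla",)] * 3)

# The ON clause does not relate t2 to t1 at all, so every matching t1 row
# is paired with every matching t2 row.
rows = conn.execute("""
    SELECT t1.id, t2.id
    FROM t1
    LEFT JOIN t2 ON t2.content = 'blabla'
    WHERE t1.content = 'blabla'
""").fetchall()

row_count = len(rows)
print(row_count)  # 30
```

Adding a third table with, say, four matches would multiply the result again, to 120 rows.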
Even if only one row is guaranteed to match from each table (eliminating the cartesian join problem) it's clear that the LEFT JOIN solution will give you a dataset that's very hard to work with. That's because the content columns from each of the tables you JOIN will be separate columns in the result set. You'll have to examine each of the columns to figure out which table matched.
In this case, UNION is a better solution.
Also, please note:
Use of "*" in SELECT is generally not a good idea. It reduces performance (because all columns must be assembled in the result set) and in a case like this you lose the opportunity to ALIAS each of the content columns, making the result set harder to work with.
This is a very novel use of LEFT JOIN. Normally, it's used to associate rows from two different tables. In this case you're using it to produce three separate result sets "side-by-side". Most SQL programmers will have to look at this statement cross-eyed for a while to figure out what your intent was.
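A minimal sketch of the UNION approach, run through Python's sqlite3 module in place of MySQL. The id column, the source label and the sample rows are assumptions for illustration. Each branch names its columns explicitly and tags its rows with the table they came from, so the result is one flat list with no need to inspect separate content columns. UNION ALL is used here instead of plain UNION because the source label already makes rows from different tables distinct, so the duplicate-elimination step would do nothing useful.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE t3 (id INTEGER PRIMARY KEY, content TEXT);
    INSERT INTO t1 VALUES (1, 'blabla'), (2, 'other');
    INSERT INTO t2 VALUES (1, 'blabla');
    INSERT INTO t3 VALUES (1, 'nothing');
""")

term = "blabla"

# One branch per searchable table; the literal in the first column says
# which table each hit came from.
hits = conn.execute("""
    SELECT 't1' AS source, id, content FROM t1 WHERE content = ?
    UNION ALL
    SELECT 't2' AS source, id, content FROM t2 WHERE content = ?
    UNION ALL
    SELECT 't3' AS source, id, content FROM t3 WHERE content = ?
    ORDER BY source, id
""", (term, term, term)).fetchall()

for source, row_id, content in hits:
    print(source, row_id, content)
```

The search term is bound as a parameter in every branch, which keeps user input out of the SQL text.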
I am trying to query 2 tables in a database, each query having nothing to do with the other, other than being on the same page.
Query 1 - The first query on the page will retrieve text and images that are found throughout the page from Table A.
Query 2 - The second query will retrieve several products with an image, description and title for each product from Table B.
I know that putting the second query inside the first query's while loop would work but of course is very inefficient.
How can I and what is the best way to retrieve all the data I need through 1 query?
Thanks,
Dane
So all you want to know is whether it's OK to have 2 queries on the same webpage? It's A-OK. Go right ahead. It's completely normal. No one expects a join between table news and table products. It's normal to use two queries to fetch data from two unrelated tables.
Use LEFT or INNER JOIN (depends on whether you want to display records from TableA that have no correspondent records in TableB)
SELECT a.*, b.*
FROM TableA a
[LEFT or INNER] JOIN TableB b ON (b.a_id = a.id)
If there's no way to relate the two tables to each other, then you can't use a JOIN to grab records from both. You COULD use a UNION query, but that presumes that you can match up fields from each table, as a UNION requires you to select the same number/type of fields from each table.
SELECT 'pageinfo' AS sourcetable, page.id, page.images, page.this, page.that
FROM page
WHERE page.id = $id
UNION
SELECT 'product' AS sourcetable, products.id, products.image, products.other, products.stuff
FROM products
But this is highly ugly. You're still forcing the DB server to do two queries in the background plus the extra work of combining them into a single result set, and then you have to do extra work in your code to disentangle the rows to boot.
It's MUCH easier, conceptually and maintenance-wise, to do two separate queries instead.