$PERSON = $DATABASE_LINK->query("SELECT * FROM `users`,`profiles` WHERE users.first_name = 'shane' && users.last_name = 'larson' && users.setup = '1' && profiles.zipcode = '53511' ORDER BY `full_name` ASC");
$PERSON = $PERSON->fetch_object();
var_dump($PERSON);
I want a query that looks up a user record by name and checks the zip code in the profiles table. Above is an example. It works, but I don't know exactly how joins work. Any explanation of how joins match two rows would be awesome :)
The result set you are receiving is a cross product of all rows in the users table that match the users criteria and all rows in the profiles table that match the profiles criteria.
It seems likely that this is not what you want.
Consider adding a condition like users.blammy = profiles.blammy to join the two tables, where blammy stands for whatever column links a profile to its user.
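To make the cross product concrete, here is a minimal sketch using Python's built-in sqlite3 module instead of PHP and MySQL. The profiles.user_id column is an assumption for illustration; the question does not say which column links the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
    -- user_id is a hypothetical linking column
    CREATE TABLE profiles (user_id INTEGER, zipcode TEXT);
    INSERT INTO users VALUES (1, 'shane'), (2, 'shane');
    INSERT INTO profiles VALUES (1, '53511'), (2, '53511'), (3, '53511');
""")

# No join condition: every matching user is paired with every matching profile.
cross_rows = conn.execute("""
    SELECT users.id, profiles.user_id
    FROM users, profiles
    WHERE users.first_name = 'shane' AND profiles.zipcode = '53511'
""").fetchall()

# With a join condition, each user is paired only with their own profile.
joined_rows = conn.execute("""
    SELECT users.id, profiles.user_id
    FROM users
    INNER JOIN profiles ON profiles.user_id = users.id
    WHERE users.first_name = 'shane' AND profiles.zipcode = '53511'
""").fetchall()

print(len(cross_rows))   # 6 (2 users x 3 profiles)
print(len(joined_rows))  # 2
```

The first query only looks right when exactly one row on each side matches, which is probably why it seemed to work.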
SELECT Cities.Name, Countries.Name, Countries.id
FROM Cities INNER JOIN Countries
ON Cities.CountryId = Countries.id
WHERE Cities.Name LIKE '%town%' LIMIT 10
Looking at that example:
The first line selects three columns from two different tables (Cities and Countries).
The second line specifies the two tables to join.
The third line specifies which columns the two tables are matched on.
The result of that query is a list of Cities.Name values that match the LIKE clause, each with its corresponding Countries.Name (found by matching Cities.CountryId with Countries.id).
Read more here: http://www.wellho.net/mouth/158_MySQL-LEFT-JOIN-and-RIGHT-JOIN-INNER-JOIN-and-OUTER-JOIN.html
SQL statements with "WHERE table1.id_x = table2.id_y" do work, but as far as I know they can be slower than joins (primarily with huge datasets), depending on the DBMS.
There are inner and outer joins; the outer joins are divided into left and right joins.
A short explanation of a left join:
SELECT Class.Id, Class.Name, Professor.Id, Professor.Name
FROM Professor LEFT OUTER JOIN Class
ON Professor.Id = Class.ProfessorId;
This statement selects all professors, whether or not a professor teaches a class; for a professor without a class, the Class columns are NULL. A class without a professor, however, is not selected. It is called a left join because every row of the left table (Professor) is kept even when nothing in the right table matches. The right join works the same way with the roles of the two tables reversed ;)
SELECT Class.Id, Class.Name, Professor.Id, Professor.Name
FROM Professor INNER JOIN Class
ON Professor.Id = Class.ProfessorId;
The inner join shows neither professors without classes nor classes without professors.
Please remember: not every DBMS supports the keywords 'INNER' and 'OUTER' ;)
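A small sketch of the difference, run against an in-memory SQLite database from Python rather than MySQL. The sample professors and classes are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Professor (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Class (Id INTEGER PRIMARY KEY, Name TEXT, ProfessorId INTEGER);
    -- made-up sample data: Bob teaches nothing, 'Orphaned' has no professor
    INSERT INTO Professor VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO Class VALUES (10, 'Algebra', 1), (11, 'Orphaned', NULL);
""")

inner = conn.execute("""
    SELECT Professor.Name, Class.Name
    FROM Professor INNER JOIN Class ON Professor.Id = Class.ProfessorId
    ORDER BY Professor.Id
""").fetchall()

left = conn.execute("""
    SELECT Professor.Name, Class.Name
    FROM Professor LEFT OUTER JOIN Class ON Professor.Id = Class.ProfessorId
    ORDER BY Professor.Id
""").fetchall()

print(inner)  # [('Ada', 'Algebra')]
print(left)   # [('Ada', 'Algebra'), ('Bob', None)]
```

Bob only shows up in the left join, and the class without a professor shows up in neither result.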
I'm facing a problem here:
I'm building a forum, this forum has several tables and I'm trying to fetch the comments and user info in a single query.
So far, it should be easy, the problem is that I can't change the structure and with the following query I get a perfect result IF there is a like to the answer. If no one likes the answer it fails.
Select
mfr.mfr_forum_answers.id,
mfr.mfr_forum_answers.date_created,
mfr.mfr_forum_answers.last_updated,
mfr.mfr_forum_answers.content,
mfr.mfr_forum_answers.accepted,
mfr.mfr_forum_answers.user_id,
mfr.mfr_users.level,
mfr.mfr_users.avatar,
mfr.mfr_forum_likes.subject_id,
mfr.wp_users.ID As ID1,
mfr.mfr_forum_topics.user_id As owner_id,
(SELECT count(mfr.mfr_forum_likes.id) FROM mfr.mfr_forum_likes WHERE mfr.mfr_forum_likes.subject_id = :id AND mfr.mfr_forum_likes.type = 'answer') as likes,
(SELECT count(mfr.mfr_forum_likes.id) FROM mfr.mfr_forum_likes WHERE mfr.mfr_forum_likes.subject_id = :id AND mfr.mfr_forum_likes.type = 'answer' AND mfr.mfr_forum_likes.user_id = :sessionId ) as i_like,
mfr.wp_users.user_nicename
From
mfr.mfr_forum_likes Inner Join
mfr.mfr_forum_answers
On mfr.mfr_forum_answers.topic_id =
mfr.mfr_forum_likes.subject_id Inner Join
mfr.mfr_users
On mfr.mfr_forum_answers.user_id = mfr.mfr_users.id
Inner Join
mfr.wp_users
On mfr.mfr_users.id = mfr.wp_users.ID Inner Join
mfr.mfr_forum_topics
On mfr.mfr_forum_answers.topic_id = mfr.mfr_forum_topics.id
Where
mfr.mfr_forum_answers.topic_id = :id
And
mfr.mfr_forum_likes.type = 'answer'
So far, as I said, it returns rows only if an answer has a like. I'm thinking of adding a like by default from the user who posts the answer, but I'm trying to improve my skills by solving new issues.
If someone has a suggestion for how the query could still return rows when a joined table has no matching entries, I'd be really thankful.
Thanks in advance-
Pihh
Yes. What you are looking for are called left and right joins. According to the documentation, with a LEFT JOIN you still join two tables as normal but
If there is no matching row for the right table in the ON or USING part in a LEFT JOIN, a row with all columns set to NULL is used for the right table.
This means that you can join two tables, and if a row in the first table has no match in the second, the row from the first table is still returned. The same is true for a RIGHT JOIN, only in the opposite direction: rows from the table being joined to are returned even when the original table has no match.
It looks like you have 3 tables for 3 relationships: there are answers, a user gives an answer, and an answer might or might not have a like. To grab this data, I would suggest starting from your answers table, performing an INNER JOIN on your users table (assuming there are always users), and a LEFT JOIN on your likes table. Here is a simple example:
SELECT *
FROM answers
INNER JOIN users ON users.id = answers.user_id
LEFT JOIN likes ON likes.answer_id = answers.id
AND likes.type = 'answer' -- filter likes in ON so answers without likes are kept
WHERE answers.id = :id
Of course, if for some unknown reason you need to start from your likes table, then you'd have to RIGHT JOIN the other tables. I hope that gives you a good idea of how you'd make your query.
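One thing to watch with a LEFT JOIN is where the filter on the right-hand table goes. Here is a rough sketch in Python with sqlite3, using the simplified answers and likes tables from the example above (the sample rows are invented). It compares the filter in WHERE with the filter in ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE answers (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE likes (id INTEGER PRIMARY KEY, answer_id INTEGER, type TEXT);
    INSERT INTO answers VALUES (1, 'liked answer'), (2, 'unliked answer');
    INSERT INTO likes VALUES (100, 1, 'answer');
""")

# Filter in WHERE: the NULL-filled row for answer 2 fails the test and is dropped.
filtered_in_where = conn.execute("""
    SELECT answers.id, likes.id
    FROM answers
    LEFT JOIN likes ON likes.answer_id = answers.id
    WHERE likes.type = 'answer'
    ORDER BY answers.id
""").fetchall()

# Filter in ON: it only decides which likes match, so answer 2 is kept.
filtered_in_on = conn.execute("""
    SELECT answers.id, likes.id
    FROM answers
    LEFT JOIN likes ON likes.answer_id = answers.id AND likes.type = 'answer'
    ORDER BY answers.id
""").fetchall()

print(filtered_in_where)  # [(1, 100)]
print(filtered_in_on)     # [(1, 100), (2, None)]
```

So if answers without likes still go missing after switching to LEFT JOIN, a WHERE condition on the likes table is the first thing to check.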
I have two tables: "users" and "posts." The posts table has a 'post' column and a 'poster_id' column. I'm working on a PHP page that shows the latest posts by everyone, like this:
SELECT * FROM posts WHERE id < '$whatever' LIMIT 10
This way, I can print each result like this:
id: 43, poster_id:'4', post: hello, world
id: 44, poster_id:'4', post: hello, ward
id: 45, poster_id:'5', post: oh hi!
etc...
Instead of the id, I would like to display the NAME of the poster (there's a column for it in the 'users' table)
I've tried the following:
SELECT *
FROM posts
WHERE id < '$whatever'
INNER JOIN users
ON posts.poster_id = users.id LIMIT 10
Is this the correct type of join for this task? Before learning about joins, I would query the users table for each post result. The result should end up looking similar to this:
id: 43, poster_id:'4', name:'foo', post: hello, world
id: 44, poster_id:'4', name:'foo', post: hello, ward
id: 45, poster_id:'5', name:'fee', post: oh hi!
etc...
Thanks for helping in advance.
The WHERE clause must come after the FROM clause and all of its JOINs.
SELECT posts.*, users.* -- select your desired columns
FROM posts
INNER JOIN users ON posts.poster_id = users.id
WHERE posts.id < '$whatever'
LIMIT 10
The logical order in which SQL evaluates the clauses is as follows:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
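The order above can be seen in a small sketch, run here with Python's sqlite3 module. The posts rows are invented, and the numbered comments follow the logical evaluation order, not the order in which the clauses are written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, poster_id INTEGER, post TEXT);
    INSERT INTO posts VALUES
        (43, 4, 'hello, world'), (44, 4, 'hello, ward'),
        (45, 5, 'oh hi!'), (46, 5, 'late post'), (47, 5, 'later post');
""")

rows = conn.execute("""
    SELECT poster_id, COUNT(*) AS post_count   -- 5. computed per remaining group
    FROM posts                                 -- 1. source rows
    WHERE id < 46                              -- 2. drops posts 46 and 47 before grouping
    GROUP BY poster_id                         -- 3. one group per poster
    HAVING COUNT(*) >= 2                       -- 4. filters whole groups
    ORDER BY poster_id                         -- 6. sorts the final rows
""").fetchall()

print(rows)  # [(4, 2)]
```

Poster 5 has three posts in the table, but only one of them survives the WHERE clause, so the group fails the HAVING test.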
UPDATE 1
For column names that exist in both tables, add an ALIAS to each so it can be uniquely identified. For example,
SELECT posts.colName AS PostCol,
users.colName AS UserCol, ....
FROM ....
In the example above, both tables have a column named colName. To get both values, alias them and then use PostCol and UserCol in your front end to read them.
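As an illustration with the tables from the question, here is a sketch using Python's sqlite3 module in place of PHP. The alias names post_id and user_id are my own choice; both tables have an id column, and the aliases keep them apart when the row is read by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name, like fetch_assoc in PHP
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, poster_id INTEGER, post TEXT);
    INSERT INTO users VALUES (4, 'foo');
    INSERT INTO posts VALUES (43, 4, 'hello, world');
""")

row = conn.execute("""
    SELECT posts.id AS post_id, users.id AS user_id, users.name, posts.post
    FROM posts
    INNER JOIN users ON posts.poster_id = users.id
""").fetchone()

print(row["post_id"], row["user_id"], row["name"])  # 43 4 foo
```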
Try:
SELECT *
FROM posts
INNER JOIN users ON posts.poster_id = users.id
WHERE posts.id < '$whatever'
LIMIT 10
You got the syntax slightly wrong.
It should be
SELECT * FROM posts
INNER JOIN users ON posts.poster_id = users.id
WHERE posts.id < '$whatever' LIMIT 10
The answers already given tell you the main reason for your query not working at all (ie the WHERE clause should come after the JOIN clauses), however, I'd like to make a couple of additional points:
I would suggest using an OUTER JOIN for this. It probably won't make much difference, but in the event of a post record having an invalid poster_id, an INNER JOIN will mean the record is dropped from the results, whereas an OUTER JOIN will mean that the record is included, but the values from the users table will be null. I imagine you don't want to ever have an invalid poster_id on the posts table, but broken data does happen even in the best regulated system, and it is helpful in these cases to still get the data from the query.
I would strongly suggest not doing SELECT *, and instead itemising the fields you want to get back from the query. SELECT * has a number of problems, but it's particularly bad when you have multiple tables in the query, because if you have fields with the same name on both tables, (eg id), then it becomes very hard to distinguish which one you're working with, as your PHP recordset won't include the table reference. Itemising the fields may make your query string longer, but it won't make it any slower - if anything it'll be quicker - and it will be easier to work with in the long run.
Neither of these points is essential; the query will work without them (as long as you move the WHERE clause to after the JOIN), but they may improve your query and hopefully also improve your understanding of SQL.
Can you let me know if my interpretation is correct (the last AND part)?
$q = "SELECT title,name,company,address1,address2
FROM registrations
WHERE title != 0 AND id IN (
SELECT registrar_id
FROM registrations_industry
WHERE industry_id = '$industryid'
)";
Below was really where I am not sure:
... AND id IN (select registrar_id from registrations_industry where industry_id='$industryid')
Interpretation: Get any match on id(registrations id field) equals registrar_id(field) from the join table registrations_industry where industry_id equals the set $industryid
Is this select statement considered a sub routine since it's a query within the main query?
So an example with the registrations table, searching for id 23, would look like:
registrations(table)
id=23,title=owner,name=mike,company=nono,address1=1234 s walker lane,address2
registrations_industry(table)
id=256, registrar_id=23, industry_id=400
id=159, registrar_id=23, industry_id=284
id=227, registrar_id=23, industry_id=357
I assume this would return 3 records with the same registrations table data and, of course, varying registrations_industry data.
For the given test data, your query will return one record. This one:
id=23,title=owner,name=mike,company=nono,address1=1234 s walker lane,address2
To get three records with the same registration table data and varying registrations_industry you need to use JOIN.
Something like this:
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations AS r
LEFT OUTER JOIN registrations_industry AS ri
ON ri.registrar_id=r.id
WHERE r.title != 0 AND ri.industry_id = '$industryid'
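Here is a quick sketch of the row-count difference, using Python's sqlite3 module and a cut-down version of the test data. The industry filter is left out so that all three link rows qualify, and the company and address columns are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (id INTEGER PRIMARY KEY, title TEXT, name TEXT);
    CREATE TABLE registrations_industry (
        id INTEGER PRIMARY KEY, registrar_id INTEGER, industry_id INTEGER);
    INSERT INTO registrations VALUES (23, 'owner', 'mike');
    INSERT INTO registrations_industry VALUES
        (256, 23, 400), (159, 23, 284), (227, 23, 357);
""")

# IN only asks "is there a match?", so the registration appears once.
in_rows = conn.execute("""
    SELECT id, name
    FROM registrations
    WHERE id IN (SELECT registrar_id FROM registrations_industry)
""").fetchall()

# A JOIN produces one output row per matching pair.
join_rows = conn.execute("""
    SELECT r.id, r.name, ri.industry_id
    FROM registrations AS r
    JOIN registrations_industry AS ri ON ri.registrar_id = r.id
    ORDER BY ri.industry_id
""").fetchall()

print(in_rows)    # [(23, 'mike')]
print(join_rows)  # [(23, 'mike', 284), (23, 'mike', 357), (23, 'mike', 400)]
```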
Sorry for the essay, I didn't realize how long it was until looking at it now. And although you've already accepted an answer, I hope you read this and gain some insight into why this solution is preferred and how it evolved out of your original query.
First things first
Your query
$q = "SELECT title,name,company,address1,address2
FROM registrations
WHERE title != 0 AND id IN (
SELECT registrar_id
FROM registrations_industry
WHERE industry_id = '$industryid'
)";
seems fine. The IN syntax is equivalent to a number of OR matches. For example
WHERE field_id IN (101,102,103,105)
is functionally equivalent to
WHERE (field_id = 101
OR field_id = 102
OR field_id = 103
OR field_id = 105)
You complicate it a bit by introducing a subquery, no problem. As long as your subquery returns one column (and yours does), passing it to IN will be fine.
In your case, you're comparing registrations.id to registrations_industry.registrar_id. (Note: This is just <table>.<field> syntax, nothing special, but helpful to disambiguate what tables your fields are in.)
This seems fine.
What happens
SQL would first run the subquery, generating a result set of registrar_ids where the industry_id was set as specified.
SQL would then run the outer query, replacing the subquery with its results and you would get rows from registrations where registrations.id matched one of the registrar_ids returned from the subquery.
Subqueries are helpful to debug your code, because you can pull out the subquery and run it separately, ensuring its output is as you expect.
Optimization
While subqueries are good for debugging, they're slow, or at least slower than optimized JOIN statements.
And in this case, you can convert your query to a single-level query (without subqueries) by using a JOIN.
First, you'd start with basically the exact same outer query:
SELECT title,name,company,address1,address2
FROM registrations
WHERE title != 0 AND ...
But you're also interested in data from the registrations_industry table, so you need to include that. Giving us
SELECT title,name,company,address1,address2
FROM registrations, registrations_industry
WHERE title != 0 AND ...
We need to fix the ... and now that we have the registrations_industry table we can:
SELECT title,name,company,address1,address2
FROM registrations, registrations_industry
WHERE title != 0
AND id = registrar_id
AND industry_id = '$industryid'
Now a problem might arise if both tables have an id column -- since just saying id is ambiguous. We can disambiguate this by using the <table>.<field> syntax. As in
SELECT registrations.title, registrations.name,
registrations.company, registrations.address1, registrations.address2
FROM registrations, registrations_industry
WHERE registrations.title != 0
AND registrations_industry.industry_id = '$industryid'
We didn't have to use this syntax for all the field references, but we chose to for clarity. The query now is unnecessarily complex because of all the table names. We can shorten them while still providing disambiguation and clarity. We do this by creating table aliases.
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r, registrations_industry ri
WHERE r.title != 0
AND ri.industry_id = '$industryid'
By placing r and ri after the two tables in the FROM clause, we're able to refer to them using these shortcuts. This cleans up the query but still gives us the ability to clearly specify which tables the fields are coming from.
Sidenote: We could be more explicit about the table aliases by including the optional AS, e.g. FROM registrations AS r rather than just FROM registrations r, but I typically reserve AS for field aliases.
If you run the query now you will get what is called a "Cartesian product" or in SQL lingo, a CROSS JOIN. This is because we didn't define any relationship between the two tables when, in fact, there is one. To fix this we need to reintroduce part of the original query that was lost: the relationship between the two tables
r.id = ri.registrar_id
so that our query now looks like
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r, registrations_industry ri
WHERE r.title != 0
AND r.id = ri.registrar_id
AND ri.industry_id = '$industryid'
And this should work perfectly.
Nitpicking -- implicit vs. explicit joins
But the nitpicker in me needs to point out that this is called an "implicit join". Basically you're joining tables but not using the JOIN syntax.
A simpler example of an implicit join is
SELECT *
FROM foo f, bar b
WHERE f.id = b.foo_id
The corresponding explicit syntax is
SELECT *
FROM foo f
JOIN bar b ON f.id = b.foo_id
The result will be identical, but it uses proper (and clearer) syntax. (It's clearer because it explicitly states that there is a relationship between the foo and bar tables and that it is defined by f.id = b.foo_id.)
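If you want to convince yourself that the two forms return the same rows, here is a throwaway check with Python's sqlite3 module. The foo and bar rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER);
    INSERT INTO foo VALUES (1, 'a'), (2, 'b');
    INSERT INTO bar VALUES (10, 1), (11, 1), (12, 2);
""")

# Implicit join: the relationship is hidden among the WHERE conditions.
implicit = conn.execute("""
    SELECT f.id, b.id
    FROM foo f, bar b
    WHERE f.id = b.foo_id
    ORDER BY b.id
""").fetchall()

# Explicit join: the relationship lives in the ON clause.
explicit = conn.execute("""
    SELECT f.id, b.id
    FROM foo f
    JOIN bar b ON f.id = b.foo_id
    ORDER BY b.id
""").fetchall()

print(implicit == explicit)  # True
print(explicit)              # [(1, 10), (1, 11), (2, 12)]
```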
We could similarly express your implicit query
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r, registrations_industry ri
WHERE r.title != 0
AND r.id = ri.registrar_id
AND ri.industry_id = '$industryid'
explicitly as follows
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r
JOIN registrations_industry ri ON r.id = ri.registrar_id
WHERE r.title != 0
AND ri.industry_id = '$industryid'
As you can see, the relationship between the tables is now in the JOIN clause, so that the WHERE and subsequent AND and OR clauses are free to express any restrictions. Another way to look at this is if you took out the WHERE + AND/OR clauses, the relationship between tables would still hold and the results would still "make sense" whereas if you used the implicit method and removed the WHERE + AND/OR clauses, your result set would contain rows that were misleading.
Lastly, the JOIN syntax by itself will cause rows that are in registrations, but do not have any corresponding rows in registrations_industry to not be returned.
Depending on your use case, you may want rows from registrations to appear in the results even if there are no corresponding entries in registrations_industry. To do this you would use what's called an OUTER JOIN. In this case, we want what is called a LEFT OUTER JOIN because we want all of the rows of the table on the left (registrations). We could have alternatively used RIGHT OUTER JOIN for the right table or FULL OUTER JOIN for both tables (MySQL does not support FULL OUTER JOIN). Note that the industry_id condition has to move into the ON clause: left in the WHERE clause, it would discard the NULL-filled rows again and the query would behave like a plain JOIN.
Therefore our query becomes
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r
LEFT OUTER JOIN registrations_industry ri ON r.id = ri.registrar_id
AND ri.industry_id = '$industryid'
WHERE r.title != 0
And we're done.
The end result is we have a query that is
faster in terms of runtime
more compact / concise
more explicit about what tables the fields are coming from
more explicit about the relationship between the tables
A simpler version of this query would be:
SELECT title, name, company, address1, address2
FROM registrations, registrations_industry
WHERE title != 0
AND id = registrar_id
AND industry_id = '$industryid'
Your version was a subquery, this version is a simple join. Your assumptions about your query are generally correct, but it is harder for SQL to optimize and a little harder to unravel to anyone trying to read the code. Also, you won't be able to extract the data from the registrations_industry table in that parent SELECT statement because it's not technically joining and the subtable is not a part of the parent query.
I am using the following statement to fill an Excel spreadsheet with student information, including "actualstudenthours."
The problem is, I want to show all students for which the tblstudentstatus.id = 3, but I also need to show actual student hours for those students. Unfortunately, not all of the students have a corresponding entry in "viewactualstudenthours." This statement completely leaves out those students for which there is no corresponding entry in "viewactualstudenthours."
How do I get all the students to show up where the tblstudentstatus.id = 3?
If there is no entry for them in viewactualstudenthours, it should not omit the student entirely...the student hours fields should just be blank. Your help would be greatly appreciated.
$result=mysqli_query($dbc,"SELECT tblstudent.first, tblstudent.last,
LEFT(viewactualstudenthours.ACTUAL_remain,5),
(SELECT first from tbladdress where tbladdress.id = tblstudent.contact2),
(SELECT last from tbladdress where tbladdress.id = tblstudent.contact2),
tbladdress.address1,tbladdress.city,tbladdress.state,tbladdress.zip1,
tbladdress.phone, tbladdress.cell, tbladdress.email
FROM tblstudent, tbladdress, tblstudentstatus, viewactualstudenthours
WHERE viewactualstudenthours.student_id = tblstudent.id
AND tblstudent.status = tblstudentstatus.id
AND tbladdress.id = tblstudent.contact1
AND tblstudentstatus.id = 3");
(Note: the editor made the SQL semi-legible - but probably broke every rule in the PHP code book.)
Learn to use the explicit join notation introduced in SQL-92 instead of the older comma-separated list of table names in the FROM clause.
You need to use a LEFT OUTER JOIN of the table tblstudent with the view viewactualstudenthours. Ignoring the quotes etc needed to make the code work in PHP, you need:
SELECT S.first, S.last,
H.ACTUAL_remain,
A2.first, A2.last,
A1.address1, A1.city, A1.state, A1.zip1,
A1.phone, A1.cell, A1.email
FROM tblstudent AS S
JOIN tbladdress AS A1 ON S.Contact1 = A1.ID
JOIN tbladdress AS A2 ON S.Contact2 = A2.ID
JOIN tblstudentstatus AS T ON S.Status = T.ID
LEFT JOIN viewactualstudenthours AS H ON S.ID = H.Student_ID
WHERE T.id = 3
Also learn to use table aliases (the AS clauses) - it simplifies and clarifies the SQL. And if the schema is up to you, don't prefix the table names with 'tbl' and the view names with 'view' - it is just so much clutter.
Note that I got rid of the sub-selects in the select-list by joining to the Address table twice, with two separate aliases. I removed the function LEFT(); you can reintroduce it if you need to.
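For illustration, here is a stripped-down sketch of the same idea in Python with sqlite3. The tables are simplified stand-ins: a plain table plays the role of the view, and contact_name is a hypothetical column replacing first and last in the address table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- simplified stand-ins for the real schema; contact_name is hypothetical
    CREATE TABLE tbladdress (id INTEGER PRIMARY KEY, contact_name TEXT, city TEXT);
    CREATE TABLE tblstudent (
        id INTEGER PRIMARY KEY, name TEXT, contact1 INTEGER, contact2 INTEGER);
    CREATE TABLE viewactualstudenthours (student_id INTEGER, ACTUAL_remain TEXT);
    INSERT INTO tbladdress VALUES (1, 'Pat', 'Beloit'), (2, 'Sam', 'Madison');
    INSERT INTO tblstudent VALUES (7, 'Ann', 1, 2), (8, 'Ben', 1, 2);
    INSERT INTO viewactualstudenthours VALUES (7, '12:30');
""")

rows = conn.execute("""
    SELECT S.name, H.ACTUAL_remain, A1.city, A2.contact_name
    FROM tblstudent AS S
    JOIN tbladdress AS A1 ON S.contact1 = A1.id   -- first contact
    JOIN tbladdress AS A2 ON S.contact2 = A2.id   -- second contact, same table
    LEFT JOIN viewactualstudenthours AS H ON S.id = H.student_id
    ORDER BY S.id
""").fetchall()

print(rows)  # [('Ann', '12:30', 'Beloit', 'Sam'), ('Ben', None, 'Beloit', 'Sam')]
```

Ben has no hours entry, so he still appears, just with an empty hours field.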
I have four tables I want to join and get data from. The tables look something like...
Employees (EmployeeID, GroupID[fk], EmployeeName, PhoneNum)
Positions (PositionID, PositionName)
EmployeePositions (EmployeePositionID, EmployeeID[fk], PositionID[fk])
EmployeeGroup (GroupID, GroupName)
[fk] = foreign key
I want to create a query that will return all the information about an employee (given by EmployeeID). I want a query that will return the given employee's name, position(s), and group in one row.
I think it needs to involve joins, but I am not sure how to format the queries. MYSQL's manual is technical beyond my comprehension. I would be very grateful for any help.
It seems you have trouble with SQL in general, rather than with MySQL in particular. The MySQL documentation provides details about the various SQL expressions, but generally assumes some familiarity with SQL. To get a quick start on SQL you may consider this W3schools.com primer.
The query you need is the following.
SELECT EmployeeName, PositionName, GroupName
FROM Employees E
LEFT JOIN EmployeePositions EP ON EP.EmployeeID = E.EmployeeID
LEFT JOIN Positions P ON P.PositionID = EP.PositionId
LEFT JOIN EmployeeGroup EG ON EG.GroupId = E.GroupId
WHERE E.EmployeeId = some_value
A few things to note:
The 'LEFT' in 'LEFT JOIN' will produce NULL in lieu of PositionName or GroupName when the corresponding tables do not have a value for the given FK. (This should only happen if the data is broken, say if some employees have GroupId 123 but this GroupId was somehow deleted from the EmployeeGroup table.)
The query returns one row per employee (1). You could use an alternative search criterion, for example WHERE EmployeeName = 'SMITH', and get a listing of all employees with that name. Indeed, without a WHERE clause, you'd get a list of all employees found in the Employees table.
(1) That is assuming that each employee can only have one position. If some employees have more than one position (i.e. multiple rows in EmployeePositions for a given EmployeeID), you'd get several rows per employee, with the name and group repeated and a distinct PositionName in each.
Edit:
If a given employee can have multiple positions, you can use the query suggested by Tor Valamo, which uses a GROUP BY construct, with GROUP_CONCAT() to pivot all the possible positions in one single field value in the returned row.
SELECT e.EmployeeID, e.EmployeeName, e.PhoneNum,
g.GroupName, GROUP_CONCAT(p.PositionName) AS Positions
FROM Employees e
LEFT JOIN EmployeeGroup g ON g.GroupID = e.GroupID
LEFT JOIN EmployeePositions ep ON ep.EmployeeID = e.EmployeeID
LEFT JOIN Positions p ON p.PositionID = ep.PositionID
WHERE e.EmployeeID = 1
GROUP BY e.EmployeeID
Returns positions in a comma separated string on one row.
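Here is a small runnable sketch of that query using Python's sqlite3 module. SQLite also has GROUP_CONCAT, though its default separator and ordering may differ from MySQL's, and the sample employee is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeGroup (GroupID INTEGER PRIMARY KEY, GroupName TEXT);
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY, GroupID INTEGER,
        EmployeeName TEXT, PhoneNum TEXT);
    CREATE TABLE Positions (PositionID INTEGER PRIMARY KEY, PositionName TEXT);
    CREATE TABLE EmployeePositions (
        EmployeePositionID INTEGER PRIMARY KEY, EmployeeID INTEGER, PositionID INTEGER);
    INSERT INTO EmployeeGroup VALUES (1, 'Engineering');
    INSERT INTO Employees VALUES (1, 1, 'Smith', '555-0100');
    INSERT INTO Positions VALUES (1, 'Developer'), (2, 'Reviewer');
    INSERT INTO EmployeePositions VALUES (1, 1, 1), (2, 1, 2);
""")

rows = conn.execute("""
    SELECT e.EmployeeName, g.GroupName, GROUP_CONCAT(p.PositionName) AS Positions
    FROM Employees e
    LEFT JOIN EmployeeGroup g ON g.GroupID = e.GroupID
    LEFT JOIN EmployeePositions ep ON ep.EmployeeID = e.EmployeeID
    LEFT JOIN Positions p ON p.PositionID = ep.PositionID
    WHERE e.EmployeeID = 1
    GROUP BY e.EmployeeID
""").fetchall()

name, group_name, positions = rows[0]
# The order inside the concatenated string is not guaranteed, so sort after splitting.
print(len(rows), name, group_name, sorted(positions.split(",")))
```

Both positions come back in a single row for the employee instead of one row per position.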