I have two tables: table A with two columns, IP and ID, and table B with columns ID and extra information. I want to extract the rows in table B for IPs that are not in table A. So if I have rows in table A with
id = 1
ip = 000.000.00
id = 2
ip = 111.111.11
and I have rows in table B
id = 1
id = 2
then, given ip = 111.111.11, how can I return row 1 in table B?
select b.id, b.*
from b
left join a on a.id = b.id
where a.id is null
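This'll pull all the rows in B that have no matching rows in A. If you want to try for just one IP, add it to the join condition (for example, AND a.ip = '111.111.11') rather than to the WHERE clause, because the columns from A are NULL in the rows this query keeps.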
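To make the single-IP case concrete, here is a minimal sketch using Python's built-in sqlite3 module with the sample rows from the question. The info column is an assumption standing in for the "extra information", and SQLite is used only so the example runs anywhere. The IP test lives in the ON clause, since a WHERE filter on a.ip can never be true for unmatched rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ip TEXT, id INTEGER);
    CREATE TABLE b (id INTEGER, info TEXT);  -- info is a placeholder column
    INSERT INTO a VALUES ('000.000.00', 1), ('111.111.11', 2);
    INSERT INTO b VALUES (1, 'row 1'), (2, 'row 2');
""")

# Only A rows with the given IP can match, so every B row whose ID is not
# tied to that IP comes back with NULLs from A and survives the WHERE.
rows_for_ip = conn.execute("""
    SELECT b.id, b.info
    FROM b
    LEFT JOIN a ON a.id = b.id AND a.ip = ?
    WHERE a.id IS NULL
    ORDER BY b.id
""", ("111.111.11",)).fetchall()

print(rows_for_ip)  # [(1, 'row 1')]
```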
The simplest and most easy-to-read way to spell what you're describing is:
SELECT * FROM `B` WHERE `ID` NOT IN (SELECT `ID` FROM `A`)
You should be aware, though, that using a subquery for something like this has historically been slower than doing the same thing with a join, because the latter is easier to optimise. The join version might look like this:
SELECT
`B`.*
FROM
`B`
LEFT JOIN
`A` ON `A`.`ID` = `B`.`ID`
WHERE
`A`.`ID` IS NULL
However, technology is improving all the time, and the extent to which this is true (or even whether this is true) depends on the database software you're using.
You should test both approaches then settle on the best balance of readability and performance for your use case.
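As a rough way to try both spellings side by side, the sketch below runs them against an in-memory SQLite database (an assumption made purely for convenience; your engine will plan them differently) and prints the plan for each. The info column and the third row in B are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (IP TEXT, ID INTEGER);
    CREATE TABLE B (ID INTEGER, info TEXT);  -- info is a placeholder column
    INSERT INTO A VALUES ('000.000.00', 1), ('111.111.11', 2);
    INSERT INTO B VALUES (1, 'one'), (2, 'two'), (3, 'three');
""")

not_in_sql = "SELECT * FROM B WHERE ID NOT IN (SELECT ID FROM A)"
left_join_sql = """
    SELECT B.* FROM B
    LEFT JOIN A ON A.ID = B.ID
    WHERE A.ID IS NULL
"""

not_in_rows = conn.execute(not_in_sql).fetchall()
left_join_rows = conn.execute(left_join_sql).fetchall()
print(not_in_rows, left_join_rows)  # both: [(3, 'three')]

# Ask the engine how it intends to run each spelling.
# The plan text is engine- and version-specific.
for sql in (not_in_sql, left_join_sql):
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print(step)
```

On a real database you would run your engine's own EXPLAIN and time both queries against realistic data volumes.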
table1 (id, name)
table2 (id, name)
Query:
SELECT name
FROM table2
-- that are not in table1 already
SELECT t1.name
FROM table1 t1
LEFT JOIN table2 t2 ON t2.name = t1.name
WHERE t2.name IS NULL
Q: What is happening here?
A: Conceptually, we select all rows from table1, and for each row we attempt to find a row in table2 with the same value in the name column. If there is no such row, we leave the table2 portion of our result empty (NULL) for that row. Then we constrain our selection by picking only those rows in the result where the matching row does not exist. Finally, we ignore all fields from our result except for the name column (the one we are sure exists, from table1).
While it may not be the most performant method possible in all cases, it should work in basically every database engine that attempts to implement ANSI 92 SQL.
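The two conceptual steps can be watched separately. This is a small sketch with invented names (alice, bob, carol), run through Python's sqlite3 module: first the raw LEFT JOIN, then the same join with the IS NULL filter applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO table2 VALUES (1, 'alice'), (2, 'carol');
""")

# Step 1: every table1 row, with the table2 side left NULL when nothing matches.
joined = conn.execute("""
    SELECT t1.name, t2.name
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.name = t1.name
    ORDER BY t1.name
""").fetchall()
print(joined)  # [('alice', 'alice'), ('bob', None), ('carol', 'carol')]

# Step 2: keep only the rows where the table2 side is missing.
unmatched = conn.execute("""
    SELECT t1.name
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.name = t1.name
    WHERE t2.name IS NULL
    ORDER BY t1.name
""").fetchall()
print(unmatched)  # [('bob',)]
```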
You can either do
SELECT name
FROM table2
WHERE name NOT IN
(SELECT name
FROM table1)
or
SELECT name
FROM table2
WHERE NOT EXISTS
(SELECT *
FROM table1
WHERE table1.name = table2.name)
See this question for 3 techniques to accomplish this
I don't have enough rep points to vote up froadie's answer. But I have to disagree with the comments on Kris's answer. The following answer:
SELECT name
FROM table2
WHERE name NOT IN
(SELECT name
FROM table1)
Is FAR more efficient in practice. I don't know why, but I'm running it against 800k+ records and the difference is tremendous with the advantage given to the 2nd answer posted above. Just my $0.02.
SELECT <column_list>
FROM TABLEA a
LEFT JOIN TABLEB b
ON a.Key = b.Key
WHERE b.Key IS NULL;
https://www.cloudways.com/blog/how-to-join-two-tables-mysql/
This is pure set theory which you can achieve with the minus operation.
select id, name from table1
minus
select id, name from table2
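MINUS is Oracle's keyword; many other engines spell the same set difference EXCEPT. The sketch below uses EXCEPT because it runs on SQLite through Python's sqlite3 module (both the engine and the sample rows are assumptions for illustration). Note that the comparison is on whole rows, so (3, 'c') survives even though id 3 also appears in table2 with a different name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (1, 'a'), (3, 'x');
""")

# Rows of table1 that do not appear, column for column, in table2.
difference = conn.execute("""
    SELECT id, name FROM table1
    EXCEPT
    SELECT id, name FROM table2
    ORDER BY id
""").fetchall()

print(difference)  # [(2, 'b'), (3, 'c')]
```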
Here's what worked best for me.
SELECT *
FROM #T1
EXCEPT
SELECT a.*
FROM #T1 a
JOIN #T2 b ON a.ID = b.ID
This was more than twice as fast as any other method I tried.
Watch out for pitfalls. If the name column in table1 contains NULLs, you are in for a surprise: NOT IN returns no rows at all when the subquery yields a NULL.
Better is:
SELECT name
FROM table2
WHERE name NOT IN
(SELECT ISNULL(name ,'')
FROM table1)
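Here is a small reproduction of the NULL pitfall, run on SQLite via Python's sqlite3 module. ISNULL(x, y) is the SQL Server spelling; the sketch uses the standard COALESCE instead, and also shows NOT EXISTS, which is not affected by NULLs. The sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT);
    CREATE TABLE table2 (name TEXT);
    INSERT INTO table1 VALUES ('a'), (NULL);
    INSERT INTO table2 VALUES ('a'), ('b');
""")

# 'b' NOT IN ('a', NULL) evaluates to NULL, not TRUE, so nothing is returned.
naive = conn.execute("""
    SELECT name FROM table2
    WHERE name NOT IN (SELECT name FROM table1)
""").fetchall()

# Replacing NULL with '' inside the subquery restores the expected row.
coalesced = conn.execute("""
    SELECT name FROM table2
    WHERE name NOT IN (SELECT COALESCE(name, '') FROM table1)
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs do no harm.
not_exists = conn.execute("""
    SELECT name FROM table2
    WHERE NOT EXISTS (SELECT 1 FROM table1 WHERE table1.name = table2.name)
""").fetchall()

print(naive, coalesced, not_exists)  # [] [('b',)] [('b',)]
```

One caveat with the replacement trick: mapping NULL to '' also excludes any genuinely empty name in table2, so filtering the subquery with WHERE name IS NOT NULL may be the safer variant.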
You can use EXCEPT in SQL Server or MINUS in Oracle; they are identical according to:
http://blog.sqlauthority.com/2008/08/07/sql-server-except-clause-in-sql-server-is-similar-to-minus-clause-in-oracle/
This worked well for me:
SELECT *
FROM [dbo].[table1] t1
LEFT JOIN [dbo].[table2] t2 ON t1.[t1_ID] = t2.[t2_ID]
WHERE t2.[t2_ID] IS NULL
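You can use the following query structure. Note that it only gives the expected result when table2 holds a single row, as in the example below; with more rows, a != join pairs each table1 row with every table2 row that does not match it: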
SELECT t1.name FROM table1 t1 JOIN table2 t2 ON t2.fk_id != t1.id;
table1:
id | name
1 | Amit
2 | Sagar
table2:
id | fk_id | email
1 | 1 | amit#ma.com
Output:
name
Sagar
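A quick check of how the != join behaves once table2 has more than one row, run on SQLite through Python's sqlite3 module. The second table2 row (with the email sagar#ma.com) is invented for the demonstration, and NOT EXISTS is shown as one alternative that keeps working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, fk_id INTEGER, email TEXT);
    INSERT INTO table1 VALUES (1, 'Amit'), (2, 'Sagar');
    INSERT INTO table2 VALUES (1, 1, 'amit#ma.com');
""")

neq_join = """
    SELECT t1.name FROM table1 t1
    JOIN table2 t2 ON t2.fk_id != t1.id
    ORDER BY t1.name
"""
not_exists = """
    SELECT t1.name FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.fk_id = t1.id)
    ORDER BY t1.name
"""

one_row_join = conn.execute(neq_join).fetchall()
one_row_exists = conn.execute(not_exists).fetchall()
print(one_row_join, one_row_exists)  # [('Sagar',)] [('Sagar',)]

# Hypothetical second row: now every table1 row has a match in table2.
conn.execute("INSERT INTO table2 VALUES (2, 2, 'sagar#ma.com')")

two_rows_join = conn.execute(neq_join).fetchall()
two_rows_exists = conn.execute(not_exists).fetchall()
print(two_rows_join)    # [('Amit',), ('Sagar',)]  (each pairs with the other's row)
print(two_rows_exists)  # []
```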
All the above queries are incredibly slow on big tables. A change of strategy is needed. Here is the code I used for one of my databases; you can adapt it by changing the field and table names.
This is the strategy: you create two implicit temporary tables and take their union.
The first temporary table selects all the rows of the first original table, the one whose field you want to check for values that are NOT present in the second original table.
The second implicit temporary table contains the rows where the two original tables match on identical values of the column you want to check.
The result of the union has more than one row per value of the checked column when that value matches in both original tables (one row coming from the first select, another from the second select), and just one row when the value from the first original table matches nothing in the second.
You group and count. When the count is 1 there is no match, so finally you select just the rows with a count equal to 1.
It may not look elegant, but it is orders of magnitude faster than all the above solutions.
IMPORTANT NOTE: create an INDEX on the columns to be checked.
SELECT name, source, id
FROM
(
SELECT name, "active_ingredients" as source, active_ingredients.id as id
FROM active_ingredients
UNION ALL
SELECT active_ingredients.name as name, "UNII_database" as source, temp_active_ingredients_aliases.id as id
FROM active_ingredients
INNER JOIN temp_active_ingredients_aliases ON temp_active_ingredients_aliases.alias_name = active_ingredients.name
) tbl
GROUP BY name
HAVING count(*) = 1
ORDER BY name
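A stripped-down sketch of the same union-and-count idea, run on SQLite through Python's sqlite3 module. The table names ingredients and aliases and their rows are hypothetical stand-ins for the tables above, and the sketch assumes name is unique within ingredients (a duplicated, unmatched name would count 2 and be dropped by mistake).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ingredients (id INTEGER, name TEXT);    -- hypothetical
    CREATE TABLE aliases (id INTEGER, alias_name TEXT);  -- hypothetical
    INSERT INTO ingredients VALUES (1, 'aspirin'), (2, 'caffeine'), (3, 'ibuprofen');
    INSERT INTO aliases VALUES (10, 'aspirin'), (11, 'ibuprofen');
    CREATE INDEX idx_ingredients_name ON ingredients (name);
    CREATE INDEX idx_aliases_name ON aliases (alias_name);
""")

# First branch: every ingredient once.
# Second branch: one extra row per ingredient that has a matching alias.
# A name that appears exactly once overall therefore has no match.
only_in_first = conn.execute("""
    SELECT name
    FROM (
        SELECT name FROM ingredients
        UNION ALL
        SELECT i.name
        FROM ingredients i
        INNER JOIN aliases a ON a.alias_name = i.name
    ) tbl
    GROUP BY name
    HAVING COUNT(*) = 1
    ORDER BY name
""").fetchall()

print(only_in_first)  # [('caffeine',)]
```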
See query:
SELECT * FROM Table1 WHERE
id NOT IN (SELECT
e.id
FROM
Table1 e
INNER JOIN
Table2 s ON e.id = s.id);
Conceptually: the subquery fetches the matching records, and the main query then fetches the records that are not in the subquery's result.
First, define aliases for the tables, such as t1 and t2.
Then select the records from the second table.
Finally, keep only the records with no match in the first table, using the WHERE condition:
SELECT name FROM table2 as t2
WHERE NOT EXISTS (SELECT * FROM table1 as t1 WHERE t1.name = t2.name)
I'm going to repost (since I'm not cool enough yet to comment) in the correct answer....in case anyone else thought it needed better explaining.
SELECT temp_table_1.name
FROM original_table_1 temp_table_1
LEFT JOIN original_table_2 temp_table_2 ON temp_table_2.name = temp_table_1.name
WHERE temp_table_2.name IS NULL
And I've seen syntax in FROM needing commas between table names in MySQL, but SQLite seemed to prefer the space.
The bottom line is when you use bad variable names it leaves questions. My variables should make more sense. And someone should explain why we need a comma or no comma.
I tried all solutions above but they did not work in my case. The following query worked for me.
SELECT NAME
FROM table_1
WHERE NAME NOT IN
(SELECT a.NAME
FROM table_1 AS a
LEFT JOIN table_2 AS b
ON a.NAME = b.NAME
WHERE any further condition);
MySQL Tables:
Table Name: basicdetails
id,firstname,lastname,hometown
1,bob,dylan,somewhere
2,judge,judy,somewhere
Table Name: fulldetails
id,firstname,lastname,age,gender,eyes,hometown
1,bob,dylan,51,m,blue,somewhere
2,bob,dylan,22,m,green,somewhereelse
3,judge,judy,19,f,blue,somewhere
4,judge,judy,62,f,blue,somewherenicer
5,bob,dylan,31,m,blue,somewhere
The intended result is a comparison that returns only the entries from fulldetails that aren't in basicdetails, based on their firstname, lastname, and hometown only.
In this case it would be:
bob,dylan,somewhereelse
judge,judy,somewherenicer
I am better at PHP than at writing MySQL queries, so all of my attempts have been about creating unique arrays and trying to sort through them. It's very complicated and very slow, so I was thinking maybe it was possible to get the entries that don't exist in both based on their (firstname,lastname,hometown) only. Is there a specific way to return the unique values that don't exist in both tables at the same time with MySQL (or MySQLi if that makes a difference)?
My apologies for the wording on this, I am having trouble wording it correctly.
An anti-join is a familiar pattern.
You already know how to find rows that have matches:
SELECT a.*
FROM a
JOIN b
ON a.firstname = b.firstname
AND a.lastname = b.lastname
AND a.hometown = b.hometown
To get the set of rows that don't match, we can use an OUTER join (so that all rows from a are returned), along with matching rows from b.
SELECT a.*
FROM a
LEFT
JOIN b
ON a.firstname = b.firstname
AND a.lastname = b.lastname
AND a.hometown = b.hometown
The "trick" now is to filter out all the rows that had matches. We can do this by adding a WHERE clause, a predicate that tests whether a match was found. A convenient way to do this, is to test whether a column from b is NULL, a column from b that we know would not be NULL if a match was found:
SELECT a.*
FROM a
LEFT
JOIN b
ON a.firstname = b.firstname
AND a.lastname = b.lastname
AND a.hometown = b.hometown
WHERE b.firstname IS NULL
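Applied to the tables from the question, with fulldetails playing the role of a and basicdetails the role of b, the anti-join can be sketched like this. It runs on SQLite through Python's sqlite3 module purely so it is self-contained; the SQL itself is the same shape as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE basicdetails (id INTEGER, firstname TEXT, lastname TEXT, hometown TEXT);
    CREATE TABLE fulldetails (id INTEGER, firstname TEXT, lastname TEXT,
                              age INTEGER, gender TEXT, eyes TEXT, hometown TEXT);
    INSERT INTO basicdetails VALUES
        (1, 'bob', 'dylan', 'somewhere'),
        (2, 'judge', 'judy', 'somewhere');
    INSERT INTO fulldetails VALUES
        (1, 'bob', 'dylan', 51, 'm', 'blue', 'somewhere'),
        (2, 'bob', 'dylan', 22, 'm', 'green', 'somewhereelse'),
        (3, 'judge', 'judy', 19, 'f', 'blue', 'somewhere'),
        (4, 'judge', 'judy', 62, 'f', 'blue', 'somewherenicer'),
        (5, 'bob', 'dylan', 31, 'm', 'blue', 'somewhere');
""")

# Keep the fulldetails rows for which no basicdetails row matches
# on all three of firstname, lastname and hometown.
missing = conn.execute("""
    SELECT f.firstname, f.lastname, f.hometown
    FROM fulldetails f
    LEFT JOIN basicdetails b
           ON f.firstname = b.firstname
          AND f.lastname = b.lastname
          AND f.hometown = b.hometown
    WHERE b.firstname IS NULL
    ORDER BY f.id
""").fetchall()

print(missing)
# [('bob', 'dylan', 'somewhereelse'), ('judge', 'judy', 'somewherenicer')]
```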
I am using mysql and this is a query that I quickly wrote, but I feel this can be optimized using JOINS. This is an example btw.
users table:
id user_name first_name last_name email password
1 bobyxl Bob Cox bob#gmail.com pass
player table
id role player_name user_id server_id racial_name
3 0 boby123 1 2 Klingon
1 1 example 2 2 Race
2 0 boby2 1 1 Klingon
SQL
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`,`users`
WHERE `users`.`id` = 1
and `users`.`id` = `player`.`user_id`
I know I can use a left join, but what are the benefits?
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`
LEFT JOIN `users`
ON `users`.`id` = `player`.`user_id`
WHERE `users`.`id` = 1
What are the benefits? I get the same results either way.
Your query has a JOIN in it. It is the same as writing:
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`
INNER JOIN `users` ON `users`.`id` = `player`.`user_id`
WHERE `users`.`id` = 1
The only reason for you to use left join is if you want to get data from player table even when you don't have matches in users table.
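To see where the two joins actually differ, here is a sketch with the sample rows from the question, trimmed to the columns that matter and run on SQLite via Python's sqlite3 module. Player 1 points at user_id 2, which has no row in users, so only the LEFT JOIN keeps it. Once the WHERE users.id = 1 filter is added, the NULL rows are thrown away again and both joins agree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, user_name TEXT);
    CREATE TABLE player (id INTEGER, player_name TEXT, user_id INTEGER);
    INSERT INTO users VALUES (1, 'bobyxl');
    INSERT INTO player VALUES (3, 'boby123', 1), (1, 'example', 2), (2, 'boby2', 1);
""")

inner_rows = conn.execute("""
    SELECT player.id, users.user_name
    FROM player
    INNER JOIN users ON users.id = player.user_id
    ORDER BY player.id
""").fetchall()

left_rows = conn.execute("""
    SELECT player.id, users.user_name
    FROM player
    LEFT JOIN users ON users.id = player.user_id
    ORDER BY player.id
""").fetchall()

# Filtering on a column of the right-hand table removes the NULL rows again.
left_filtered = conn.execute("""
    SELECT player.id, users.user_name
    FROM player
    LEFT JOIN users ON users.id = player.user_id
    WHERE users.id = 1
    ORDER BY player.id
""").fetchall()

print(inner_rows)     # [(2, 'bobyxl'), (3, 'bobyxl')]
print(left_rows)      # [(1, None), (2, 'bobyxl'), (3, 'bobyxl')]
print(left_filtered)  # [(2, 'bobyxl'), (3, 'bobyxl')]
```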
LEFT JOIN will get data from the left table even if there's no equal data from the right side table.
I guess that at some point the player table's data will not line up with the users table, especially if a user in the users table has no rows in the player table yet.
Your first query will return no rows in cases where the player table has no data corresponding to the user in the users table.
Also, IMHO, setting up another table for servers is a good idea in terms of complying with the normalization rules of database structure. After all, what server details does the server_id column on the player table point to?
Conceptually, the first solution builds a cross product (it connects everything with everything) and then drops the rows that fail the conditions. If the database really executes it that way and you have a lot of rows, this will be very slow!
The left join takes the left table first, then attaches only the matching rows from the right table (or NULL when there are none).
In your example you don't even need a join. :)
This will give you the same result, and it is good enough as long as you only filter on the user id:
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`
WHERE `player`.`user_id` = 1
Another solution if you want more conditions, without join could be something like this:
SELECT * FROM player WHERE player.user_id IN (SELECT id FROM users WHERE ...... )
I have the following database structure:
Sites table
id | name | other_fields
Backups table
id | site_id | initiated_on(unix timestamp) | size(float) | status
So the Backups table has a many-to-one relationship with the Sites table, connected via site_id.
And I would like to output the data in the following format
name | Latest initiated_on | status of the latest initiated_on row
And I have the following SQL query
SELECT *, `sites`.`id` as sid, SUM(`backups`.`size`) AS size
FROM (`sites`)
LEFT JOIN `backups` ON `sites`.`id` = `backups`.`site_id`
WHERE `sites`.`id` = '1'
GROUP BY `sites`.`id`
ORDER BY `backups`.`initiated_on` desc
The thing is, with the above query I can achieve what I am looking for, but the only problem is I don't get the latest initiated_on values.
So if I had 3 rows in backups with site_id=1, the query does not pick out the row with the highest value in initiated_on. It just picks out any row.
Please help, and
thanks in advance.
You should try:
SELECT sites.name, FROM_UNIXTIME(b.latest) as latest, b.size, b.status
FROM sites
LEFT JOIN
( SELECT bg.site_id, bg.latest, bg.sizesum AS size, bu.status
FROM
( SELECT site_id, MAX(initiated_on) as latest, SUM(size) as sizesum
FROM backups
GROUP BY site_id ) bg
JOIN backups bu
ON bu.initiated_on = bg.latest AND bu.site_id = bg.site_id
) b
ON sites.id = b.site_id
In the GROUP BY subquery - bg here, the only columns you can use for SELECT are columns that are either aggregated by a function or listed in the GROUP BY part.
http://dev.mysql.com/doc/refman/5.5/en/group-by-hidden-columns.html
Once you have all the aggregate values you need to join the result again to backups to find other values for the row with latest timestamp - b.
Finally join the result to the sites table to get names - or left join if you want to list all sites, even without a backup.
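The three steps can be tried end to end with a few made-up rows. This sketch runs on SQLite through Python's sqlite3 module, so FROM_UNIXTIME (a MySQL function) is left out and the raw timestamp is returned; the site names, timestamps, sizes and statuses are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (id INTEGER, name TEXT);
    CREATE TABLE backups (id INTEGER, site_id INTEGER, initiated_on INTEGER,
                          size REAL, status TEXT);
    INSERT INTO sites VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO backups VALUES
        (1, 1, 100, 1.5, 'ok'),
        (2, 1, 300, 2.0, 'failed'),
        (3, 1, 200, 1.0, 'ok');
""")

latest_per_site = conn.execute("""
    SELECT sites.name, b.latest, b.size, b.status
    FROM sites
    LEFT JOIN (
        -- bg: one row per site with the aggregates
        -- bu: joined back to pick up the status of the latest row
        SELECT bg.site_id, bg.latest, bg.sizesum AS size, bu.status
        FROM (
            SELECT site_id, MAX(initiated_on) AS latest, SUM(size) AS sizesum
            FROM backups
            GROUP BY site_id
        ) bg
        JOIN backups bu
          ON bu.initiated_on = bg.latest AND bu.site_id = bg.site_id
    ) b ON sites.id = b.site_id
    ORDER BY sites.id
""").fetchall()

print(latest_per_site)
# [('alpha', 300, 4.5, 'failed'), ('beta', None, None, None)]
```

The site without any backups still shows up, with NULLs, because the outermost join is a LEFT JOIN.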
Try with this:
select S.name, B.initiated_on, B.status
from sites as S left join backups as B on S.id = B.site_id
where B.initiated_on =
(select max(initiated_on)
from backups
where site_id = S.id)
To get the latest time, you need to make a subquery like this:
SELECT sites.id as sid,
SUM(backups.size) AS size,
latest.time AS latesttime
FROM sites AS sites
LEFT JOIN (SELECT site_id,
MAX(initiated_on) AS time
FROM backups
GROUP BY site_id) AS latest
ON latest.site_id = sites.id
LEFT JOIN backups
ON sites.id = backups.site_id
WHERE sites.id = 1
GROUP BY sites.id
ORDER BY backups.initiated_on desc
I have removed the SELECT * as this will only work using MySQL and is generally bad practice anyway. Non-MySQL RDBMSs will throw an error if you include the other fields, even individually, so you will need to turn this query into a subquery and then do an INNER JOIN to the sites table to get the rest of the fields. This is because they will try to add all of those fields to the GROUP BY clause, which fails (or is at least very slow) if you have long text fields.
I wanted to see if I can better organize / optimize my code, and so I've been reading more about joins and how you can query / select from two different tables where a certain column matches up in a single query. However, I could not find any documentation on what I would like to do.
Consider two tables (A , B).
Table A
user_id -- + -- course_id
1 -- + -- 1
Table B
course_id -- + -- project_id
1 -- + -- 2
My queries look something along the lines of the following:
$sql_course= mysql_query("SELECT course_id FROM A WHERE user_id = 1") or die(mysql_error());
while ($course_row = mysql_fetch_assoc($sql_course)) {
// Unique course ID
$courseID = $course_row['course_id'];
$sql_b= mysql_query("SELECT project_id FROM B WHERE course_id=$courseID") or die(mysql_error());
}
So, you see, this is not very easy to explain. I suppose what I'm looking to find out is whether or not there is a way to optimize this code, say, using one query?
Yes you can achieve this using joins in SQL.
Try something like this:
SELECT project_id FROM B JOIN A on B.course_id=A.course_id WHERE user_id=1
There are several types of joins, and depending on the results set you'll need to use different ones.
There is a good examination of what joins are doing here:
http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
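For comparison, here is a sketch of the query-inside-a-loop approach next to the single join. Python's sqlite3 module stands in for the PHP/MySQL setup from the question, and the extra rows in A and B are made up so that there is more than one course to loop over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (user_id INTEGER, course_id INTEGER);
    CREATE TABLE B (course_id INTEGER, project_id INTEGER);
    INSERT INTO A VALUES (1, 1), (1, 2), (2, 3);
    INSERT INTO B VALUES (1, 2), (2, 5), (3, 7);
""")

user_id = 1

# Loop version: one query for the courses, then one more query per course.
looped = []
courses = conn.execute(
    "SELECT course_id FROM A WHERE user_id = ? ORDER BY course_id", (user_id,)
).fetchall()
for (course_id,) in courses:
    projects = conn.execute(
        "SELECT project_id FROM B WHERE course_id = ? ORDER BY project_id",
        (course_id,),
    ).fetchall()
    looped.extend(project_id for (project_id,) in projects)

# Join version: the database matches the rows in a single query.
joined = [
    project_id
    for (project_id,) in conn.execute("""
        SELECT B.project_id
        FROM B
        JOIN A ON B.course_id = A.course_id
        WHERE A.user_id = ?
        ORDER BY A.course_id, B.project_id
    """, (user_id,))
]

print(looped, joined)  # [2, 5] [2, 5]
```

Passing user_id as a bound parameter rather than interpolating it into the SQL string also avoids injection problems.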
Try:
$sql_course = mysql_query("SELECT A.course_id, B.project_id FROM A INNER JOIN B ON B.course_id = A.course_id WHERE A.user_id = 1 ORDER BY A.course_id, B.project_id") or die(mysql_error());
You only need to include A.course_id in the select portion if you care about the value.
SELECT b.project_id
FROM table_b b
JOIN table_a a ON a.course_id = b.course_id
WHERE a.user_id = 1
Whether this will actually give you exactly what you want depends on what you are doing with the data, but hopefully it will give you a starting point.
You can use inner join to make it as single query. As I understood, you are trying to get project_id based on user_id and the user_id is 1.
There is a foreign key relation between Table A and B. So, you can write query as -
Select project_id from A, B where A.course_id = B.course_id and user_id = 1;