MySQL - structuring query to discard common results - php

That title is really not useful, but it's a complex question (in my head, maybe) ... anywho...
Say I have a MySQL table of Countries (A-Z all countries in the world) with id & name
Then I have a table where I am tracking which countries a user has been to: Like so:
Country Table
id name
1 india
2 luxembourg
3 usa
Visited Table
id user_id country_id
1 1 1
2 1 3
Now here's what I want to do, when I present the form to add to the list of visited countries I want country.id 1 & 3 to be excluded from the query result.
I know I can filter this using PHP ... which is something I have done in the past ... but surely there must be a way to structure a query in such a way that 1 & 3 are excluded from the returned results, like:
SELECT *
FROM `countries`
WHERE `id`!= "SELECT `country_id`
FROM `visited`
WHERE `user_id`='1'"
I suspect it has something to do with JOIN statements but I can't quite figure it out.
Bonus gratitude if someone can point me in the right direction with Laravel.
Thank you all :)

Is this what you want?
select c.*
from countries c left join
visited v
on c.id = v.country_id and v.user_id = 1
where v.country_id is null;
You can also express this as a not in or not exists, but the left join method typically has pretty good performance.
The left outer join keeps all records in the first table regardless of whether or not the on clause evaluates to true. If there are no matches in the second table, then the columns are populated with NULL values. The where clause simply chooses these records -- the ones that do not match.
Here is another way of expressing this that you might find easier to follow:
select c.*
from countries c
where not exists (select 1 from visited v where c.id = v.country_id and v.user_id = 1)
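To see both forms side by side, here is a minimal sketch using the sample rows from the question. It runs on Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the table and column names are the ones from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; schema and rows follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visited (id INTEGER PRIMARY KEY, user_id INTEGER, country_id INTEGER);
    INSERT INTO countries VALUES (1, 'india'), (2, 'luxembourg'), (3, 'usa');
    INSERT INTO visited VALUES (1, 1, 1), (2, 1, 3);
""")

# Anti-join: keep only countries with no matching visited row for this user.
left_join_sql = """
    SELECT c.id, c.name
    FROM countries c
    LEFT JOIN visited v ON c.id = v.country_id AND v.user_id = ?
    WHERE v.country_id IS NULL
"""

# Same question asked with NOT EXISTS.
not_exists_sql = """
    SELECT c.id, c.name
    FROM countries c
    WHERE NOT EXISTS (
        SELECT 1 FROM visited v WHERE v.country_id = c.id AND v.user_id = ?
    )
"""

unvisited_join = conn.execute(left_join_sql, (1,)).fetchall()
unvisited_exists = conn.execute(not_exists_sql, (1,)).fetchall()
print(unvisited_join)    # [(2, 'luxembourg')]
print(unvisited_exists)  # [(2, 'luxembourg')]
```

The user id is passed as a bound parameter rather than pasted into the SQL string, which is probably what you want when the value comes from a form.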

You can use your query like this.
SELECT *
FROM `countries` c LEFT JOIN `visited` v ON c.id = v.country_id AND v.`user_id` = 1
WHERE v.`country_id` IS NULL
This is a LEFT JOIN operation. It means I'm selecting all records from the countries table, whether or not they have a match in the visited table, based on the ID of the country.
So the join produces this set:
from country from visited
1 1
2 no record
3 3
The where condition (v.country_id is null) then says: I only want the rows that are in the countries table but not in the visited table, which leaves id 2. The user_id = 1 condition has to sit in the on clause rather than the where clause: in the unmatched rows every visited column is null, so a where test on v.user_id would discard them too.
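Where the user_id test goes makes a real difference. A small sketch, again with sqlite3 standing in for MySQL, and with one extra made-up row (user 2 visiting Luxembourg) that is not in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visited (id INTEGER PRIMARY KEY, user_id INTEGER, country_id INTEGER);
    INSERT INTO countries VALUES (1, 'india'), (2, 'luxembourg'), (3, 'usa');
    -- rows 1 and 2 are from the question; row 3 (user 2) is a hypothetical extra visitor
    INSERT INTO visited VALUES (1, 1, 1), (2, 1, 3), (3, 2, 2);
""")

# User filter inside ON: rows of other users simply do not match,
# so Luxembourg stays unmatched for user 1 and survives the IS NULL test.
filter_in_on = conn.execute("""
    SELECT c.name
    FROM countries c
    LEFT JOIN visited v ON c.id = v.country_id AND v.user_id = 1
    WHERE v.country_id IS NULL
""").fetchall()

# User filter inside WHERE: a row cannot have a NULL country_id and
# user_id = 1 at the same time, so nothing comes back.
filter_in_where = conn.execute("""
    SELECT c.name
    FROM countries c
    LEFT JOIN visited v ON c.id = v.country_id
    WHERE v.country_id IS NULL AND v.user_id = 1
""").fetchall()

print(filter_in_on)     # [('luxembourg',)]
print(filter_in_where)  # []
```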

SELECT * FROM countries
WHERE countries.id NOT IN ( SELECT country_id FROM visited WHERE user_id = 1 )
If I understand correctly, maybe you need something like this?

Related

SQL/PHP How can I filter this with the use of a LEFT JOIN?

This database is a very basic concept for a video rental store. The database is displayed with the use of PHP (and HTML, CSS, bootstrap, etc).
I have three tables:
tl_dvd: which holds the basic info about the dvd, duration date, name, language, etc.
tl_order: which holds the name of the person who rented a movie and the startdate.
tl_client: which isn't really relevant to my question/problem, but just holds the first and last name of the clients.
When I'm on the order page it displays all the orders which are currently running. I have one column called returned in tl_order with a tinyint where 0 stands for not returned and 1 stands for returned.
There's a button which says 'terug gebracht' (='received back' in English) which will set the selected order to 1. The tl_order only displays the orders which are not returned yet (returned set on 0). On the end of the SQL query I've set something like ... AND returned = 0';
Here is the part where I'm supposed to use the LEFT (OUTER?) JOIN. If an order is not back (so that means returned is still on 0 in tl_order) the dvd should not show up in my tl_dvd (think of it like we only have one dvd copy for every movie). I've tried these LEFT JOINs in tl_dvd:
'SELECT * FROM tl_dvd LEFT JOIN tl_order WHERE tl_order.returned = 1';
Breaks my page with 500 error.
'SELECT * FROM tl_dvd LEFT JOIN tl_order ON tl_order.returned = 1';
Doesn't give me an error but spams every title like 3 times.
Can someone explain to me how to tackle this issue or what I'm doing wrong?
The problem with your queries is that you're not specifying how the tables should be joined (which columns to join on), and so it's returning everything from both tables.
Try this:
select
*
from
tl_dvd left join tl_order on (tl_dvd.titel=tl_order.titel)
where
tl_order.returned = 1
You don't clearly explain or give an example of what you want in your query result. But from your examples you might want
SELECT * FROM tl_dvd
LEFT JOIN tl_order
ON tl_dvd.titel = tl_order.titel
WHERE tl_order.returned = 1
OUTER JOIN without ON is not standard SQL. MySQL treats an absent ON in an INNER JOIN like ON 1=1, which is equivalent to CROSS JOIN, but it requires a join condition for LEFT and RIGHT JOIN, which is why your first query fails.
But it doesn't seem like you really want LEFT JOIN rather than (INNER) JOIN. LEFT JOIN returns (INNER) JOIN rows plus unmatched left table rows extended by nulls. So here, besides a row for every dvd order, you get a row for every unordered dvd, with order info null. Since "If an order is not back [...] the dvd should not show up" seems to say you don't want those extra rows, you'd want the above without 'LEFT' or equivalently (but less clearly)
SELECT * FROM tl_dvd
CROSS JOIN tl_order
WHERE tl_dvd.titel = tl_order.titel
AND tl_order.returned = 1
You may want to rewrite the query to use an inner join like the following
SELECT * FROM tl_dvd INNER JOIN tl_order ON
tl_dvd.id=tl_order.id -- replace id with the columns you would like to join on
WHERE tl_order.returned = 1
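If the goal is "show every dvd that is not currently out", one possible reading is an anti-join against the open orders. This is only a sketch: it assumes the two tables are linked by a titel column (as the answers above guess, the question never says), it invents three sample titles, and it uses Python's sqlite3 in place of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- titel as the join column is an assumption taken from the answers
    CREATE TABLE tl_dvd (id INTEGER PRIMARY KEY, titel TEXT);
    CREATE TABLE tl_order (id INTEGER PRIMARY KEY, titel TEXT, returned INTEGER);
    INSERT INTO tl_dvd VALUES (1, 'Alien'), (2, 'Brazil'), (3, 'Casablanca');
    -- Alien is still out, Brazil came back, Casablanca was never rented
    INSERT INTO tl_order VALUES (1, 'Alien', 0), (2, 'Brazil', 1);
""")

# Join only against open orders (returned = 0) and keep the dvds without one.
available = conn.execute("""
    SELECT d.titel
    FROM tl_dvd d
    LEFT JOIN tl_order o ON o.titel = d.titel AND o.returned = 0
    WHERE o.id IS NULL
    ORDER BY d.titel
""").fetchall()

# For comparison: filtering on returned = 1 in WHERE drops never-rented dvds.
returned_only = conn.execute("""
    SELECT d.titel
    FROM tl_dvd d
    LEFT JOIN tl_order o ON o.titel = d.titel
    WHERE o.returned = 1
    ORDER BY d.titel
""").fetchall()

print(available)      # [('Brazil',), ('Casablanca',)]
print(returned_only)  # [('Brazil',)]
```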

MySQL LEFT OUTER JOIN not giving correct results

This is probably simple, but I can't see the issue...
I have a MySQL table which contains an index of mailing lists by title.
That table is LISTS.
Then there is a users table and a subscriptions table.
The subscriptions table contains simple entries that link users to their mailing lists.
The columns of concern are:
USERS u_id
LISTS l_id
SUBS s_id, u_id, l_id
So when user 23 joins mailing list 7, the entry created in SUBS is:
s_id: 1 --
u_id: 23 --
l_id: 7
Here's the problem:
When a user is looking at their subscriptions, I need to display a table that shows all mailing lists, each with a checkbox that is ticked if they are already subscribed to that list.
My SQL, which is wrong, but I'm not sure how, is here:
SELECT l.l_id, l.l_name, l.l_desc,
CASE
WHEN s.u_id = '23' THEN 1 ELSE 0
END
FROM lists as l
LEFT OUTER JOIN
subs as s
ON s.l_id = l.l_id
GROUP BY l.l_id ASC
This should be presenting 1s to tick relevant boxes and 0s to leave them empty.
But the behavior is odd, because when I get * from subs, I see all the expected entries.
The above SQL, however, returns several empty checkboxes where subscriptions exist. And even stranger, if all boxes are ticked, the SQL returns no ticks at all.
I've been fighting this thing for far too long. Could anyone offer a solution?
Thank you very much!
The problem isn't the LEFT JOIN, it is the aggregation. You need aggregation functions in the SELECT.
However, I don't think you need aggregation at all. Assuming that a user id only appears once in subs for a given list:
SELECT l.l_id, l.l_name, l.l_desc,
(s.l_id is not null) as flag
FROM lists l LEFT OUTER JOIN
subs s
ON s.l_id = l.l_id AND s.u_id = 23;
You can express the query your way (with aggregation). It would look like:
SELECT l.l_id, l.l_name, l.l_desc,
MAX(s.u_id = 23) as flag
FROM lists l LEFT OUTER JOIN
subs s
ON s.l_id = l.l_id
GROUP BY l.l_id, l.l_name, l.l_desc;
You can also use MAX(CASE WHEN s.u_id = 23 THEN 1 ELSE 0 END), but MySQL's boolean shorthand is more convenient and readable.
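A quick way to check both variants is to run them on a few made-up rows. The sketch below uses Python's sqlite3 instead of MySQL, and the list names and user 42 are invented. The COALESCE in the aggregate version is my addition: without it, a list that nobody has subscribed to gets NULL from MAX rather than 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lists (l_id INTEGER PRIMARY KEY, l_name TEXT);
    CREATE TABLE subs (s_id INTEGER PRIMARY KEY, u_id INTEGER, l_id INTEGER);
    -- hypothetical sample data; list 4 has no subscribers at all
    INSERT INTO lists VALUES (1, 'news'), (2, 'deals'), (3, 'events'), (4, 'archive');
    INSERT INTO subs VALUES (1, 23, 1), (2, 23, 3), (3, 42, 1), (4, 42, 2);
""")

# Variant 1: restrict the join to one user, so each list matches at most one row.
flags_join = conn.execute("""
    SELECT l.l_id, (s.l_id IS NOT NULL) AS flag
    FROM lists l
    LEFT JOIN subs s ON s.l_id = l.l_id AND s.u_id = ?
    ORDER BY l.l_id
""", (23,)).fetchall()

# Variant 2: join all subscriptions, then collapse per list with MAX.
flags_agg = conn.execute("""
    SELECT l.l_id, COALESCE(MAX(s.u_id = ?), 0) AS flag
    FROM lists l
    LEFT JOIN subs s ON s.l_id = l.l_id
    GROUP BY l.l_id
    ORDER BY l.l_id
""", (23,)).fetchall()

print(flags_join)  # [(1, 1), (2, 0), (3, 1), (4, 0)]
print(flags_agg)   # [(1, 1), (2, 0), (3, 1), (4, 0)]
```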

SQL Optimization WHERE vs JOIN

I am using mysql and this is a query that I quickly wrote, but I feel this can be optimized using JOINS. This is an example btw.
users table:
id user_name first_name last_name email password
1 bobyxl Bob Cox bob@gmail.com pass
player table
id role player_name user_id server_id racial_name
3 0 boby123 1 2 Klingon
1 1 example 2 2 Race
2 0 boby2 1 1 Klingon
SQL
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`,`users`
WHERE `users`.`id` = 1
and `users`.`id` = `player`.`user_id`
I know I can use a left join, but what are the benefits?
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`
LEFT JOIN `users`
ON `users`.`id` = `player`.`user_id`
WHERE `users`.`id` = 1
What are the benefits? I get the same results either way.
Your query has a JOIN in it. It is the same as writing:
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`
INNER JOIN `users` ON `users`.`id` = `player`.`user_id`
WHERE `users`.`id` = 1
The only reason for you to use left join is if you want to get data from player table even when you don't have matches in users table.
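Here is a small check of that on the sample rows from the question, using Python's sqlite3 as a stand-in for MySQL. Note that the player 'example' points at user 2, which is not in the users table shown, so it serves as the unmatched row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE player (id INTEGER PRIMARY KEY, player_name TEXT, user_id INTEGER);
    INSERT INTO users VALUES (1, 'bobyxl');
    INSERT INTO player VALUES (3, 'boby123', 1), (1, 'example', 2), (2, 'boby2', 1);
""")

def ids(sql):
    """Run a query and return its rows as a list of tuples."""
    return conn.execute(sql).fetchall()

implicit = ids("""SELECT p.id FROM player p, users u
                  WHERE u.id = 1 AND u.id = p.user_id ORDER BY p.id""")
inner = ids("""SELECT p.id FROM player p
               INNER JOIN users u ON u.id = p.user_id
               WHERE u.id = 1 ORDER BY p.id""")
# The WHERE on users.id throws away the NULL-extended rows again,
# so this LEFT JOIN behaves like the inner join.
left_filtered = ids("""SELECT p.id FROM player p
                       LEFT JOIN users u ON u.id = p.user_id
                       WHERE u.id = 1 ORDER BY p.id""")
# Without that filter the player without a user is kept, with NULL user columns.
left_all = ids("""SELECT p.id, u.user_name FROM player p
                  LEFT JOIN users u ON u.id = p.user_id ORDER BY p.id""")

print(implicit, inner, left_filtered)  # three times [(2,), (3,)]
print(left_all)  # [(1, None), (2, 'bobyxl'), (3, 'bobyxl')]
```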
LEFT JOIN will get data from the left table even if there's no equal data from the right side table.
I guess that at some point the player table's data will not correspond to the users table, especially if data in the users table has not been inserted into the player table.
Your first query returns no rows in cases where the player table has no data corresponding to the users table.
Also, IMHO, setting up another table for servers is a good idea in terms of complying to the normalization rules in database structure. After all, what details of the server_id is the column on player table pointing to.
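Conceptually, the first solution makes a direct product (connects everything with everything) and then drops the bad results. If it were really executed that way it would be very slow with a lot of rows, although in practice the optimizer normally treats the comma syntax with a matching WHERE condition as an inner join.
The left join first gets the left table and then adds only the matching rows from the right (or null).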
In your example you don't even need join. :)
This'll give you the same result, and it'll be fine as long as you only check the user id:
SELECT `player`.`server_id`,`player`.`id`,`player`.`player_name`,`player`.`racial_name`
FROM `player`
WHERE `player`.`user_id` = 1
Another solution if you want more conditions, without join could be something like this:
SELECT * FROM player WHERE player.user_id IN (SELECT id FROM users WHERE ...... )

MySQL Query AND IN (select.... - Need assistance in clarifying and is it a Sub Routine

Can you let me know if my interpretation is correct (the last AND part)?
$q = "SELECT title,name,company,address1,address2
FROM registrations
WHERE title != 0 AND id IN (
SELECT registrar_id
FROM registrations_industry
WHERE industry_id = '$industryid'
)";
Below was really where I am not sure:
... AND id IN (select registrar_id from registrations_industry where industry_id='$industryid')
Interpretation: Get any match on id(registrations id field) equals registrar_id(field) from the join table registrations_industry where industry_id equals the set $industryid
Is this select statement considered a sub routine since it's a query within the main query?
So an example would be with the register table id search to 23 would look like:
registrations(table)
id=23,title=owner,name=mike,company=nono,address1=1234 s walker lane,address2
registrations_industry(table)
id=256, registrar_id=23, industry_id=400
id=159, registrar_id=23, industry_id=284
id=227, registrar_id=23, industry_id=357
I assume this would return 3 records with the same registration table data And of course varying registrations_industry returns.
For a given test data set your query will return one record. This one:
id=23,title=owner,name=mike,company=nono,address1=1234 s walker lane,address2
To get three records with the same registration table data and varying registrations_industry you need to use JOIN.
Something like this:
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations AS r
LEFT OUTER JOIN registrations_industry AS ri
ON ri.registrar_id=r.id
WHERE r.title!=0 AND ri.industry_id='$industryid'
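The difference in row counts can be checked on the test data from the question. This sketch uses Python's sqlite3 in place of MySQL and leaves out the title != 0 filter to keep it short (both are simplifications of mine); the industry filter is also dropped so that all three link rows are visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE registrations_industry (
        id INTEGER PRIMARY KEY, registrar_id INTEGER, industry_id INTEGER);
    INSERT INTO registrations VALUES (23, 'mike');
    INSERT INTO registrations_industry VALUES (256, 23, 400), (159, 23, 284), (227, 23, 357);
""")

# IN only asks "is this id in the set?", so the registration appears once
# no matter how many link rows point at it.
in_rows = conn.execute("""
    SELECT r.id, r.name
    FROM registrations r
    WHERE r.id IN (SELECT registrar_id FROM registrations_industry)
""").fetchall()

# A JOIN produces one output row per matching pair of rows.
join_rows = conn.execute("""
    SELECT r.id, r.name, ri.industry_id
    FROM registrations r
    JOIN registrations_industry ri ON ri.registrar_id = r.id
    ORDER BY ri.industry_id
""").fetchall()

print(in_rows)    # [(23, 'mike')]
print(join_rows)  # [(23, 'mike', 284), (23, 'mike', 357), (23, 'mike', 400)]
```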
Sorry for the essay, I didn't realize it was as long as it is until looking at it now. And although you've checked an answer, I hope you read this and gain some insight into why this solution is preferred and how it evolved out of your original query.
First things first
Your query
$q = "SELECT title,name,company,address1,address2
FROM registrations
WHERE title != 0 AND id IN (
SELECT registrar_id
FROM registrations_industry
WHERE industry_id = '$industryid'
)";
seems fine. The IN syntax is equivalent to a number of OR matches. For example
WHERE field_id IN (101,102,103,105)
is functionally equivalent to
WHERE (field_id = 101
OR field_id = 102
OR field_id = 103
OR field_id = 105)
You complicate it a bit by introducing a subquery, no problem. As long as your subquery returns one column (and yours does), passing it to IN will be fine.
In your case, you're comparing registrations.id to registrations_industry.registrar_id. (Note: This is just <table>.<field> syntax, nothing special, but helpful to disambiguate what tables your fields are in.)
This seems fine.
What happens
SQL would first run the subquery, generating a result set of registrar_ids where the industry_id was set as specified.
SQL would then run the outer query, replacing the subquery with its results and you would get rows from registrations where registrations.id matched one of the registrar_ids returned from the subquery.
Subqueries are helpful to debug your code, because you can pull out the subquery and run it separately, ensuring its output is as you expect.
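As a rough illustration of that debugging approach, the sketch below runs the subquery on its own, feeds its result back in as a literal IN list and as a chain of ORs, and compares all three. It uses Python's sqlite3 rather than MySQL, and the extra registrations (24 and 25) are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE registrations_industry (
        id INTEGER PRIMARY KEY, registrar_id INTEGER, industry_id INTEGER);
    -- hypothetical sample rows
    INSERT INTO registrations VALUES (23, 'mike'), (24, 'ann'), (25, 'joe');
    INSERT INTO registrations_industry VALUES (1, 23, 400), (2, 24, 400), (3, 25, 284);
""")

# Step 1: the subquery on its own.
sub_ids = [row[0] for row in conn.execute(
    "SELECT registrar_id FROM registrations_industry "
    "WHERE industry_id = ? ORDER BY registrar_id", (400,))]

# Step 2: the outer query with the subquery's result as an explicit IN list.
placeholders = ", ".join("?" for _ in sub_ids)
via_list = conn.execute(
    f"SELECT id FROM registrations WHERE id IN ({placeholders}) ORDER BY id",
    sub_ids).fetchall()

# The same list written as a chain of ORs.
or_clause = " OR ".join("id = ?" for _ in sub_ids)
via_or = conn.execute(
    f"SELECT id FROM registrations WHERE ({or_clause}) ORDER BY id",
    sub_ids).fetchall()

# The original nested form.
via_subquery = conn.execute("""
    SELECT id FROM registrations
    WHERE id IN (SELECT registrar_id FROM registrations_industry WHERE industry_id = ?)
    ORDER BY id
""", (400,)).fetchall()

print(sub_ids)                         # [23, 24]
print(via_list == via_or == via_subquery)  # True
```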
Optimization
While subqueries are good for debugging, they're slow, at least slower than using optimized JOIN statements.
And in this case, you can convert your query to a single-level query (without subqueries) by using a JOIN.
First, you'd start with basically the exact same outer query:
SELECT title,name,company,address1,address2
FROM registrations
WHERE title != 0 AND ...
But you're also interested in data from the registrations_industry table, so you need to include that. Giving us
SELECT title,name,company,address1,address2
FROM registrations, registrations_industry
WHERE title != 0 AND ...
We need to fix the ... and now that we have the registrations_industry table we can:
SELECT title,name,company,address1,address2
FROM registrations, registrations_industry
WHERE title != 0
AND id = registrar_id
AND industry_id = '$industryid'
Now a problem might arise if both tables have an id column -- since just saying id is ambiguous. We can disambiguate this by using the <table>.<field> syntax. As in
SELECT registrations.title, registrations.name,
registrations.company, registrations.address1, registrations.address2
FROM registrations, registrations_industry
WHERE registrations.title != 0
AND registrations_industry.industry_id = '$industryid'
We didn't have to use this syntax for all the field references, but we chose to for clarity. The query now is unnecessarily complex because of all the table names. We can shorten them while still providing disambiguation and clarity. We do this by creating table aliases.
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r, registrations_industry ri
WHERE r.title != 0
AND ri.industry_id = '$industryid'
By placing r and ri after the two tables in the FROM clause, we're able to refer to them using these shortcuts. This cleans up the query but still gives us the ability to clearly specify which tables the fields are coming from.
Sidenote: We could be more explicit about the table aliases by including the optional AS, e.g. FROM registrations AS r rather than just FROM registrations r, but I typically reserve AS for field aliases.
If you run the query now you will get what is called a "Cartesian product" or in SQL lingo, a CROSS JOIN. This is because we didn't define any relationship between the two tables when, in fact, there is one. To fix this we need to reintroduce part of the original query that was lost: the relationship between the two tables
r.id = ri.registrar_id
so that our query now looks like
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r, registrations_industry ri
WHERE r.title != 0
AND r.id = ri.registrar_id
AND ri.industry_id = '$industryid'
And this should work perfectly.
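To make the Cartesian product visible, here is a sketch that counts the rows with and without the r.id = ri.registrar_id condition. It runs on Python's sqlite3 instead of MySQL, and the second registration (24) and its link row are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE registrations_industry (
        id INTEGER PRIMARY KEY, registrar_id INTEGER, industry_id INTEGER);
    -- registration 24 and link row 300 are hypothetical
    INSERT INTO registrations VALUES (23, 'mike'), (24, 'ann');
    INSERT INTO registrations_industry VALUES (256, 23, 400), (159, 23, 284), (300, 24, 400);
""")

# No relationship between the tables: every registration is paired with
# every link row for industry 400 (2 registrations x 2 link rows).
without_condition = conn.execute("""
    SELECT r.id, ri.registrar_id
    FROM registrations r, registrations_industry ri
    WHERE ri.industry_id = 400
""").fetchall()

# With the relationship restored, only the true pairs remain.
with_condition = conn.execute("""
    SELECT r.id, ri.registrar_id
    FROM registrations r, registrations_industry ri
    WHERE r.id = ri.registrar_id
      AND ri.industry_id = 400
    ORDER BY r.id
""").fetchall()

print(len(without_condition))  # 4
print(with_condition)          # [(23, 23), (24, 24)]
```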
Nitpicking -- implicit vs. explicit joins
But the nitpicker in me needs to point out that this is called an "implicit join". Basically you're joining tables but not using the JOIN syntax.
A simpler example of an implicit join is
SELECT *
FROM foo f, bar b
WHERE f.id = b.foo_id
The corresponding explicit syntax is
SELECT *
FROM foo f
JOIN bar b ON f.id = b.foo_id
The result will be identical but it is using proper (and clearer) syntax. (It's clearer because it explicitly states that there is a relationship between the foo and bar tables and it is defined by f.id = b.foo_id.)
We could similarly express your implicit query
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r, registrations_industry ri
WHERE r.title != 0
AND r.id = ri.registrar_id
AND ri.industry_id = '$industryid'
explicitly as follows
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r
JOIN registrations_industry ri ON r.id = ri.registrar_id
WHERE r.title != 0
AND ri.industry_id = '$industryid'
As you can see, the relationship between the tables is now in the JOIN clause, so that the WHERE and subsequent AND and OR clauses are free to express any restrictions. Another way to look at this is if you took out the WHERE + AND/OR clauses, the relationship between tables would still hold and the results would still "make sense" whereas if you used the implicit method and removed the WHERE + AND/OR clauses, your result set would contain rows that were misleading.
Lastly, the JOIN syntax by itself will cause rows that are in registrations, but do not have any corresponding rows in registrations_industry to not be returned.
Depending on your use case, you may want rows from registrations to appear in the results even if there are no corresponding entries in registrations_industry. To do this you would use what's called an OUTER JOIN. In this case, we want what is called a LEFT OUTER JOIN because we want all of the rows of the table on the left (registrations). We could have alternatively used RIGHT OUTER JOIN for the right table, or FULL OUTER JOIN for the outer join of both tables (standard SQL, but not supported by MySQL).
Therefore our query becomes
SELECT r.title, r.name, r.company, r.address1, r.address2
FROM registrations r
LEFT OUTER JOIN registrations_industry ri ON r.id = ri.registrar_id
WHERE r.title != 0
AND ri.industry_id = '$industryid'
And we're done.
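One thing worth checking before relying on the outer join version: a WHERE condition on a column of the right-hand table removes the NULL-extended rows again, so the query returns the same rows as the plain JOIN. If unmatched registrations really should appear, the industry test would have to move into the ON clause. A sketch with made-up rows, on Python's sqlite3 rather than MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE registrations_industry (
        id INTEGER PRIMARY KEY, registrar_id INTEGER, industry_id INTEGER);
    -- hypothetical rows: 23 is in industry 400, 24 in 284, 25 in none
    INSERT INTO registrations VALUES (23, 'mike'), (24, 'ann'), (25, 'joe');
    INSERT INTO registrations_industry VALUES (256, 23, 400), (300, 24, 284);
""")

inner = conn.execute("""
    SELECT r.id FROM registrations r
    JOIN registrations_industry ri ON r.id = ri.registrar_id
    WHERE ri.industry_id = 400 ORDER BY r.id
""").fetchall()

# Industry filter in WHERE: NULL = 400 is not true, so unmatched rows are dropped.
left_where = conn.execute("""
    SELECT r.id FROM registrations r
    LEFT OUTER JOIN registrations_industry ri ON r.id = ri.registrar_id
    WHERE ri.industry_id = 400 ORDER BY r.id
""").fetchall()

# Industry filter in ON: every registration is kept, matched or not.
left_on = conn.execute("""
    SELECT r.id, ri.industry_id FROM registrations r
    LEFT OUTER JOIN registrations_industry ri
        ON r.id = ri.registrar_id AND ri.industry_id = 400
    ORDER BY r.id
""").fetchall()

print(inner)       # [(23,)]
print(left_where)  # [(23,)]
print(left_on)     # [(23, 400), (24, None), (25, None)]
```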
The end result is we have a query that is
faster in terms of runtime
more compact / concise
more explicit about what tables the fields are coming from
more explicit about the relationship between the tables
A simpler version of this query would be:
SELECT title, name, company, address1, address2
FROM registrations, registrations_industry
WHERE title != 0
AND id = registrar_id
AND industry_id = '$industryid'
Your version was a subquery, this version is a simple join. Your assumptions about your query are generally correct, but it is harder for SQL to optimize and a little harder to unravel to anyone trying to read the code. Also, you won't be able to extract the data from the registrations_industry table in that parent SELECT statement because it's not technically joining and the subtable is not a part of the parent query.

Complex SQL query, need to sort via count based upon time constraints

Hi guys I have the following three tables here.
COUNTRIES
ID | Name | Details
Airports
ID | NAME | CountryID
Trips
ID | AirportID | Date
I have to retrieve a list showing the following:
AirportID | Airport Name | Country Name | Number of Trips Made Between Date1 and Date2
I need this to be really efficient, what kind of indexes do I need to set up and how would I formulate the SQL query here? I would be displaying this using Php. Note that I need to be able to sort based upon the number of trips made.
EDIT ==
Oops forgot to mention my sql:
I've tried the following:
SELECT `c`.*, `t`.`country` AS `country_name`, COUNT(f.`id`) AS `num_trips` FROM `airports` AS `c`
LEFT JOIN `countries` AS `t` ON t.`id` = c.`country_id`
LEFT JOIN `trips` AS `f` ON f.`airportid` = c.`id` GROUP BY `c`.`id` ORDER BY `num_trips` ASC LIMIT 10
It works but takes a really long time to execute. Also consider that my airports table has over 30'000 entries and the trips table is variable.
I'm just taking the name of the country from the countries table. Would it be better to leave the countries table out of the SQL and instead retrieve the country name from an array where the index is the ID and the values are the names of countries?
I'm not sure why you're using left joins. If every trip has an airport and every airport has a country, an inner join would give you accurate results.
I would do this:
select a.ID as AirportID, a.Name as AirportName, c.Name as CountryName, count(t.id) as NumTrips
from Trips t
inner join Airports a on t.AirportID = a.ID
inner join Countries c on a.CountryID = c.ID
where t.Date >= @StartDate
and t.Date <= @EndDate
group by AirportID, AirportName, CountryName
order by NumTrips
limit 10
Replace the @StartDate and @EndDate with your appropriate values.
Not sure what you're looking for in results, but I would expect you want the most trips. In that case you would want to do "order by NumTrips desc". This will show the highest values first, especially since you're limiting it to 10.
Also, I suggest you rename your "Date" column to something that won't collide with reserved SQL words. I usually use "DateCreated" or "DateOfTravel" or something like that.
If I made any poor assumptions let me know and I can re-write this.
Edit:
For indexes, create them on fields you will be looking up on. In other words, primary keys (which should always be indexed), foreign keys, and in this case it looks like the Date column would be the other important index. However, if you plan on searching by "Airport Name", then add an index there. I think you see where this is headed, etc.
Indexes on airports(countryid, id) and trips(airportid) would seem the most important.
Instead of count(f.id) try count(f.airportid), so MySQL doesn't have to check the trips.id column.
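Putting the suggestions together, a minimal sketch might look like the following. It runs on Python's sqlite3 as a stand-in for MySQL, so index behaviour and timings will differ. The snake_case column names, the trip_date column (renamed from Date as suggested above), the index names, the composite index on (trip_date, airport_id) and all sample rows are assumptions for illustration, not something taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE airports (id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER);
    CREATE TABLE trips (id INTEGER PRIMARY KEY, airport_id INTEGER, trip_date TEXT);
    -- hypothetical indexes: date range lookup on trips, foreign key on airports
    CREATE INDEX idx_trips_date_airport ON trips (trip_date, airport_id);
    CREATE INDEX idx_airports_country ON airports (country_id);
    INSERT INTO countries VALUES (1, 'France'), (2, 'Japan');
    INSERT INTO airports VALUES (1, 'CDG', 1), (2, 'NRT', 2), (3, 'ORY', 1);
    INSERT INTO trips VALUES
        (1, 1, '2024-01-05'), (2, 1, '2024-01-10'), (3, 1, '2024-03-01'),
        (4, 2, '2024-01-07'), (5, 3, '2023-12-31');
""")

def busiest_airports(start_date, end_date, limit=10):
    """Airports with the most trips in [start_date, end_date], busiest first."""
    return conn.execute("""
        SELECT a.id, a.name, c.name, COUNT(t.id) AS num_trips
        FROM trips t
        JOIN airports a ON a.id = t.airport_id
        JOIN countries c ON c.id = a.country_id
        WHERE t.trip_date BETWEEN ? AND ?
        GROUP BY a.id, a.name, c.name
        ORDER BY num_trips DESC, a.id
        LIMIT ?
    """, (start_date, end_date, limit)).fetchall()

january = busiest_airports("2024-01-01", "2024-01-31")
print(january)  # [(1, 'CDG', 'France', 2), (2, 'NRT', 'Japan', 1)]
```

Because the inner joins start from trips, airports with no trips in the range do not show up at all; if they should appear with a count of 0, the query would need to start from airports with a LEFT JOIN and the date test in the ON clause.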
