I have a table with the columns id, name, color, product_id.
The table has multiple rows with the same product_id.
Using an SQL query from a PHP file, I would like to select only one row: the oldest one (the first one that was added to the table).
What query or approach should I use?
Thank you!
Just making up a bit of mock data ... Note the comments I put in. I am assuming MySQL 8.0 or newer, as older versions did not support ROW_NUMBER() OVER().
Here goes:
WITH
-- input ... you *need* a timestamp to identify the oldest ---
indata(id, name, color, product_id,ts) AS (
SELECT 1,'Arthur','blue' ,42,TIMESTAMP'2021-01-31 17:45:00'
UNION ALL SELECT 1,'Arthur','blue' ,42,TIMESTAMP'2021-01-31 17:50:00'
UNION ALL SELECT 1,'Arthur','blue' ,42,TIMESTAMP'2021-01-31 17:55:00'
UNION ALL SELECT 1,'Arthur','blue' ,42,TIMESTAMP'2021-01-31 18:00:00'
UNION ALL SELECT 2,'Ford' ,'red' ,42,TIMESTAMP'2021-01-31 17:45:00'
UNION ALL SELECT 2,'Ford' ,'blue', 42,TIMESTAMP'2021-01-31 17:50:00'
UNION ALL SELECT 2,'Ford' ,'green',42,TIMESTAMP'2021-01-31 17:55:00'
UNION ALL SELECT 2,'Ford' ,'cyan' ,42,TIMESTAMP'2021-01-31 18:00:00'
)
,
-- select all, plus a rank, on which you will filter outside ..
with_rank AS (
SELECT
*
, ROW_NUMBER() OVER(PARTITION BY id ORDER BY ts) AS rnk
FROM indata
)
SELECT
id
, name
, color
, product_id
, ts
FROM with_rank
WHERE rnk = 1
id|name |color|product_id|ts
1|Arthur|blue |42 |2021-01-31 17:45:00
2|Ford |red |42 |2021-01-31 17:45:00
One method is a correlated subquery:
select t.*
from t
where t.id = (select min(t2.id)
from t t2
where t2.product_id = t.product_id
);
This assumes that id is incrementing with each insertion. If not, you have no way of knowing what the "oldest" row is. SQL tables represent unordered sets, so there is no "oldest" row unless a column contains that information.
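A minimal sketch of that correlated subquery, run through Python's built-in sqlite3 module so it can be tried without a MySQL server. The table name t comes from the answer; the sample rows are made up for illustration, and id is assumed to increase with each insertion.

```python
import sqlite3

# In-memory database with hypothetical sample rows; id is assumed to grow with each insert.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, color TEXT, product_id INTEGER)"
)
conn.executemany(
    "INSERT INTO t (id, name, color, product_id) VALUES (?, ?, ?, ?)",
    [
        (1, "Arthur", "blue", 42),
        (2, "Ford", "red", 42),
        (3, "Tricia", "green", 7),
        (4, "Zaphod", "cyan", 7),
    ],
)

# For each product_id, keep only the row whose id is the smallest in that group.
oldest = conn.execute(
    """
    SELECT t.id, t.name, t.color, t.product_id
    FROM t
    WHERE t.id = (SELECT MIN(t2.id)
                  FROM t AS t2
                  WHERE t2.product_id = t.product_id)
    ORDER BY t.id
    """
).fetchall()

print(oldest)
```

If id does not reflect insertion order, the same shape should work with MIN over a timestamp column instead.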
SELECT * FROM TableName WHERE product_id = ProductID ORDER BY id LIMIT 1;
I have this table :
id idm date_play
1 5 2017-08-23 12:12:12
2 5 2017-08-23 12:12:12
3 6 2017-08-23 12:14:13
I want to identify whether a user has more than one insert in the same second. In the case described, I want to get the user id, which is 5.
I tried like this :
SELECT `idm`, MAX(`s`) `conseq` FROM
(
SELECT
@s := IF(@u = `idm` AND (UNIX_TIMESTAMP(`date_play`) - @pt) BETWEEN 1 AND 100000, @s + 1, 0) s,
@u := `idm` `idm`,
@pt := UNIX_TIMESTAMP(`date_play`) pt
FROM table
WHERE date_play >= '2017-08-23 00:00:00'
AND date_play <= '2017-08-23 23:59:59'
ORDER BY `date_play`
) AS t
GROUP BY `idm`
Can you help me please ? Thx in advance and sorry for my english.
Assuming your dates are accurate down to the second level, you can do this with a single aggregation:
select idm
from t
group by idm
having count(*) > count(distinct date_play);
If date_play has fractional seconds, then you would need to remove those (say by converting to a string).
If you want the play dates where there are duplicates:
select idm, date_play
from t
group by idm, date_play
having count(*) >= 2;
Or, for just the idms, you could use select distinct with group by:
select distinct idm
from t
group by idm, date_play
having count(*) >= 2;
(I only mention this because this is the only type of problem that I know of where using select distinct with group by makes sense.)
If you want all the rows that are duplicated, I would go for exists instead:
select t.*
from t
where exists (select 1
from t t2
where t2.idm = t.idm and t2.date_play = t.date_play and
t2.id <> t.id
);
This should have reasonable performance with an index on (idm, date_play, id).
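As a rough check of the queries above, here is a small sketch using Python's built-in sqlite3 module with the three rows from the question. The table name t follows the answer; SQLite stands in for MySQL here, which is an assumption about portability rather than something tested on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, idm INTEGER, date_play TEXT)")
conn.executemany(
    "INSERT INTO t (id, idm, date_play) VALUES (?, ?, ?)",
    [
        (1, 5, "2017-08-23 12:12:12"),
        (2, 5, "2017-08-23 12:12:12"),
        (3, 6, "2017-08-23 12:14:13"),
    ],
)

# Users with more rows than distinct timestamps have at least one repeated second.
dup_users = [
    row[0]
    for row in conn.execute(
        "SELECT idm FROM t GROUP BY idm HAVING COUNT(*) > COUNT(DISTINCT date_play)"
    )
]

# Every row that shares (idm, date_play) with some other row.
dup_rows = conn.execute(
    """
    SELECT t.id, t.idm, t.date_play
    FROM t
    WHERE EXISTS (SELECT 1
                  FROM t AS t2
                  WHERE t2.idm = t.idm
                    AND t2.date_play = t.date_play
                    AND t2.id <> t.id)
    ORDER BY t.id
    """
).fetchall()

print(dup_users, dup_rows)
```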
If your table is called mytable, the following should work:
SELECT t.`idm`
FROM mytable t INNER JOIN mytable t2
ON t.`idm`=t2.`idm` AND t.`date_play`=t2.`date_play` AND t.`id`!=t2.`id`
GROUP BY t.`idm`
Basically we join the table with itself, pairing records that have the same idm and date_play, but not the same id. This will have the effect of matching up any two records with the same user and datetime. We then group results by user so you don't get the same user id listed multiple times.
Edit:
Gordon Linoff and tadman's suggestions led me to this probably much more efficient query (credit to them)
SELECT DISTINCT t.`idm`
FROM mytable t
GROUP BY t.`idm`, t.`date_play`
HAVING COUNT(t.`id`)>1
Two tables, with a left join. For ease, call them table 1 and table 2.
Table 1 contains a list of people and their current status; table 2 holds all of their "invites". All I'm trying to do as part of the join is list all the current "people" together with the LATEST invite status (from table 2), so it should return a single row from table 2 per person.
I have everything working... but it's duplicating: for example, if a person has had multiple invites, it will put them on the list twice. I just want to limit it to
$sql = "SELECT table1.fieldname as table1fieldname, table2.fieldname [more fields]
FROM xxx
LEFT JOIN xxx on table1.sharedid=table2.sharedid
WHERE XXX LIMIT 1 ";
Obviously the LIMIT 1 doesn't do what it's supposed to. I have tried adding additional select statements in brackets, but honestly it just breaks everything, and I'm not an expert at all.
I'm not an expert either, but I'll try. Have you tried using DISTINCT?
For example:
SELECT DISTINCT column_name1,column_name2
FROM table_name; [...]
It normally removes duplicate rows.
Here are the links:
http://www.w3schools.com/sql/sql_distinct.asp
https://www.techonthenet.com/oracle/distinct.php
Give example data. And use good table and column names. For example:
(this returns all rows that satisfy the join):
WITH people(ppl_id,ppl_name,status) AS (
SELECT 1,'Arthur','active'
UNION ALL SELECT 2,'Tricia','active'
), invites(ppl_id,inv_id,inv_date) AS (
SELECT 1,1, DATE '2017-01-01'
UNION ALL SELECT 1,2, DATE '2017-01-07'
UNION ALL SELECT 1,3, DATE '2017-01-08'
UNION ALL SELECT 2,1, DATE '2017-01-01'
UNION ALL SELECT 2,2, DATE '2017-01-08'
)
SELECT
*
FROM people
JOIN invites USING(ppl_id)
ORDER BY 1
;
ppl_id|ppl_name|status|inv_id|inv_date
1|Arthur |active| 1|2017-01-01
1|Arthur |active| 3|2017-01-08
1|Arthur |active| 2|2017-01-07
2|Tricia |active| 2|2017-01-08
2|Tricia |active| 1|2017-01-01
But we want only 'Arthur' with '2017-01-08' and 'Tricia' with '2017-01-08'.
With any database that supports common table expressions (the WITH clause from SQL:1999), you could try a CTE containing the newest invitation date per "people id", and join that CTE with the people table. We call it newest_invite_date, and, apparently, it does what we expect it to do:
WITH people(ppl_id,ppl_name,status) AS (
SELECT 1,'Arthur','active'
UNION ALL SELECT 2,'Tricia','active'
), invites(ppl_id,inv_id,inv_date) AS (
SELECT 1,1, DATE '2017-01-01'
UNION ALL SELECT 1,2, DATE '2017-01-07'
UNION ALL SELECT 1,3, DATE '2017-01-08'
UNION ALL SELECT 2,1, DATE '2017-01-01'
UNION ALL SELECT 2,2, DATE '2017-01-08'
), newest_invite_date(ppl_id,inv_date) AS (
SELECT ppl_id,MAX(inv_date)
FROM invites
GROUP BY ppl_id
)
SELECT
people.ppl_id
, people.ppl_name
, people.status
, newest_invite_date.inv_date
FROM people
JOIN newest_invite_date USING(ppl_id)
ORDER BY 1
;
ppl_id|ppl_name|status|inv_date
1|Arthur |active|2017-01-08
2|Tricia |active|2017-01-08
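Since the question uses a LEFT JOIN, here is a hedged sketch of the same idea that keeps people who have no invites at all. It uses Python's built-in sqlite3 module; the person 'Zaphod' (with no invites) is a hypothetical addition to the sample data, and dates are stored as ISO text so MAX compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (ppl_id INTEGER PRIMARY KEY, ppl_name TEXT, status TEXT);
    CREATE TABLE invites (ppl_id INTEGER, inv_id INTEGER, inv_date TEXT);
    INSERT INTO people VALUES (1, 'Arthur', 'active'),
                              (2, 'Tricia', 'active'),
                              (3, 'Zaphod', 'active');  -- hypothetical: no invites
    INSERT INTO invites VALUES (1, 1, '2017-01-01'),
                               (1, 2, '2017-01-07'),
                               (1, 3, '2017-01-08'),
                               (2, 1, '2017-01-01'),
                               (2, 2, '2017-01-08');
    """
)

# One row per person: the derived table collapses invites to the newest date per ppl_id,
# and the LEFT JOIN keeps people without any invite (their inv_date comes back as NULL).
rows = conn.execute(
    """
    SELECT p.ppl_id, p.ppl_name, p.status, n.inv_date
    FROM people AS p
    LEFT JOIN (SELECT ppl_id, MAX(inv_date) AS inv_date
               FROM invites
               GROUP BY ppl_id) AS n
           ON n.ppl_id = p.ppl_id
    ORDER BY p.ppl_id
    """
).fetchall()

print(rows)
```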
Is this what you were looking for?
Happy playing ...
Marco the Sane
How can I find the most frequent value in a given column in an SQL table?
For example, for this table it should return two since it is the most frequent value:
one
two
two
three
SELECT
<column_name>,
COUNT(<column_name>) AS `value_occurrence`
FROM
<my_table>
GROUP BY
<column_name>
ORDER BY
`value_occurrence` DESC
LIMIT 1;
Replace <column_name> and <my_table>. Increase 1 if you want to see the N most common values of the column.
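A quick way to try this on the example from the question, sketched with Python's built-in sqlite3 module. The names my_table, val and top_n are placeholders chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (val TEXT)")
conn.executemany(
    "INSERT INTO my_table (val) VALUES (?)",
    [("one",), ("two",), ("two",), ("three",)],
)

top_n = 1  # raise this to see the N most common values

# Count occurrences per value, largest count first, and keep the first top_n groups.
most_common = conn.execute(
    """
    SELECT val, COUNT(val) AS value_occurrence
    FROM my_table
    GROUP BY val
    ORDER BY value_occurrence DESC
    LIMIT ?
    """,
    (top_n,),
).fetchall()

print(most_common)
```

Note that with LIMIT 1 a tie for first place returns only one of the tied values, and which one is not guaranteed.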
Try something like:
SELECT `column`
FROM `your_table`
GROUP BY `column`
ORDER BY COUNT(*) DESC
LIMIT 1;
Let us consider table name as tblperson and column name as city. I want to retrieve the most repeated city from the city column:
select city,count(*) as nor from tblperson
group by city
having count(*) =(select max(nor) from
(select city,count(*) as nor from tblperson group by city) tblperson)
Here nor is an alias name.
The query below seems to work well for me in a SQL Server database:
select column, COUNT(column) AS MOST_FREQUENT
from TABLE_NAME
GROUP BY column
ORDER BY COUNT(column) DESC
Result:
column MOST_FREQUENT
item1 highest count
item2 second highest
item3 third highest
..
..
For use with SQL Server, as it has no support for the LIMIT clause.
You can use TOP 1 to find the most frequently occurring value in a particular column (here, value):
SELECT TOP 1
[value],
COUNT([value]) AS value_occurrence
FROM
my_table
GROUP BY
[value]
ORDER BY
value_occurrence DESC;
Assuming Table is 'SalesLT.Customer' and the Column you are trying to figure out is 'CompanyName' and AggCompanyName is an Alias.
Select CompanyName, Count(CompanyName) as AggCompanyName from SalesLT.Customer
group by CompanyName
Order By Count(CompanyName) Desc;
If LIMIT is not available in your database or query tool (for example, in Oracle), you can use ROWNUM instead, but you will need a subquery:
SELECT FIELD_1, ALIAS1
FROM(SELECT FIELD_1, COUNT(FIELD_1) ALIAS1
FROM TABLENAME
GROUP BY FIELD_1
ORDER BY COUNT(FIELD_1) DESC)
WHERE ROWNUM = 1
If you have an ID column and you want to find the most frequent category from another column for each ID, you can use the query below (it relies on QUALIFY, which is supported by, for example, Teradata and Snowflake):
Table:
Query:
SELECT ID, CATEGORY, COUNT(*) AS FREQ
FROM TABLE
GROUP BY 1,2
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY FREQ DESC) = 1;
Result:
Return all most frequent rows in case of tie
Find the most frequent value in mysql,display all in case of a tie gives two possible approaches:
Scalar subquery:
SELECT
"country",
COUNT(country) AS "cnt"
FROM "Sales"
GROUP BY "country"
HAVING
COUNT("country") = (
SELECT COUNT("country") AS "cnt"
FROM "Sales"
GROUP BY "country"
ORDER BY "cnt" DESC
LIMIT 1
)
ORDER BY "country" ASC
With the RANK window function, available since MySQL 8.0:
SELECT "country", "cnt"
FROM (
SELECT
"country",
COUNT("country") AS "cnt",
RANK() OVER (ORDER BY COUNT(*) DESC) "rnk"
FROM "Sales"
GROUP BY "country"
) AS "sub"
WHERE "rnk" = 1
ORDER BY "country" ASC
This method might save a second recount compared to the first one.
RANK works by ranking all rows, such that if two rows are at the top, both get rank 1. So it basically directly solves this type of use case.
RANK is also available in SQLite (3.25 and later) and PostgreSQL; it is part of the SQL standard, which introduced window functions in SQL:2003.
In the above queries I also sorted by country to have more deterministic results.
Tested on SQLite 3.34.0, PostgreSQL 14.3, GitHub upstream.
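To see the tie behaviour concretely, here is a small sketch of the scalar-subquery approach using Python's built-in sqlite3 module. The "Sales" rows are invented for illustration, with 'fr' and 'uk' tied for the top count; this version avoids window functions, so it should not depend on a recent SQLite build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Sales" ("country" TEXT)')
conn.executemany(
    'INSERT INTO "Sales" ("country") VALUES (?)',
    [("fr",), ("fr",), ("uk",), ("uk",), ("de",)],
)

# The inner query finds the highest count; the outer query keeps every
# country whose count equals it, so all tied countries are returned.
tied = conn.execute(
    """
    SELECT "country", COUNT("country") AS "cnt"
    FROM "Sales"
    GROUP BY "country"
    HAVING COUNT("country") = (
        SELECT COUNT("country") AS "cnt"
        FROM "Sales"
        GROUP BY "country"
        ORDER BY "cnt" DESC
        LIMIT 1
    )
    ORDER BY "country" ASC
    """
).fetchall()

print(tied)
```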
Most frequent for each GROUP BY group
MySQL: MySQL SELECT most frequent by group
PostgreSQL:
Get most common value for each value of another column in SQL
https://dba.stackexchange.com/questions/193307/find-most-frequent-values-for-a-given-column
SQLite: SQL query for finding the most frequent value of a grouped by value
SELECT TOP 20 WITH TIES COUNT(Counted_Column) AS Count, OtherColumn1,
OtherColumn2, OtherColumn3, OtherColumn4
FROM Table_or_View_Name
WHERE
(Date_Column >= '01/01/2023') AND
(Date_Column <= '03/01/2023') AND
(Counted_Column = 'Desired_Text')
GROUP BY OtherColumn1, OtherColumn2, OtherColumn3, OtherColumn4
ORDER BY COUNT(Counted_Column) DESC
20 can be changed to any desired number
WITH TIES allows all ties in the count to be displayed
Date range used if date/time column exists and can be modified to search a date range as desired
Counted_Column 'Desired_Text' can be modified to only count certain entries in that column
Works in INSQL for my instance
One way I like to use is:
select <given_column>,COUNT(<given_column>) as VAR1 from Table_Name
group by *<given_column>*
order by VAR1 desc
limit 1
I know that this must be a very basic question, but I've not found an answer.
As the title says, I would like to query the record that holds the max value of a specific column.
I use the following code to achieve that:
SELECT * FROM `table_name` ORDER BY `column_with_specific_max_value` DESC LIMIT 1
I would like to know if there is another (more parsimonious) way to achieve the same result. I know that SQL has a function MAX(column), but it's not working the way I want. I tried this:
SELECT * FROM `table_name` WHERE `column_with_specific_max_value`=MAX(`column_with_specific_max_value`)
and this:
SELECT *, MAX(`column_with_specific_max_value`) FROM `table_name`
What happens if column_with_specific_max_value has 2 rows with the same max value? Will it return both rows?
What about?
select * from table1
where score in (select max(score) from table1)
Or even without a max:
select * from table1
where score >= all (select score from table1)
Either of those WILL return all rows with the max value.
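A small sketch contrasting the subquery form with the ORDER BY ... LIMIT 1 form from the question, using Python's built-in sqlite3 module. The table1 rows are made up, with two rows sharing the maximum score. Only the IN form is shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO table1 (id, score) VALUES (?, ?)",
    [(1, 10), (2, 30), (3, 30)],
)

# Subquery form: every row whose score equals the maximum is returned.
all_max = conn.execute(
    """
    SELECT id, score
    FROM table1
    WHERE score IN (SELECT MAX(score) FROM table1)
    ORDER BY id
    """
).fetchall()

# ORDER BY ... LIMIT 1 form: exactly one row, even when the maximum is tied.
# The extra "id" sort key makes the choice between tied rows deterministic.
one_max = conn.execute(
    "SELECT id, score FROM table1 ORDER BY score DESC, id LIMIT 1"
).fetchall()

print(all_max, one_max)
```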
If your table has an auto-increment column to work with, you could do something like...
select
YT3.*
from
( select
MAX( YT2.AutoIncColumn ) as ReturnThisRecordID
from
( select max( YT1.WhatColumn ) as MaxColumnYouWant
from YourTable YT1 ) JustMax
Join YourTable YT2
on JustMax.MaxColumnYouWant = YT2.WhatColumn ) FinalRecord
JOIN YourTable YT3
on FinalRecord.ReturnThisRecordID = YT3.AutoIncColumn
I would also ensure this is a column that SHOULD have an index on it, otherwise, you'll always be doing a table scan.
I have two tables: incoming_tours(id, name) and incoming_tours_cities(id_parrent, id_city).
The id in the first table is unique, and for each row of the first table there is a list of id_city values in the second table (i.e. id_parrent in the second table is equal to id in the first table).
For example
incoming_tours
|--id--|------name-----|
|---1--|---first_tour--|
|---2--|--second_tour--|
|---3--|--third_tour---|
|---4--|--fourth_tour--|
incoming_tours_cities
|-id_parrent-|-id_city-|
|------1-----|---4-----|
|------1-----|---5-----|
|------1-----|---27----|
|------1-----|---74----|
|------2-----|---1-----|
|------2-----|---5-----|
........................
That means that first_tour has list of cities - ("4","5","27","74")
AND second_tour has list of cities - ("1","5")
Let's assume I have two values, 4 and 74.
Now I need to get all rows from the first table where both of my values are in the list of cities, i.e. it must return only first_tour (because 4 and 74 are both in its list of cities).
So I wrote the following query:
SELECT t.name
FROM `incoming_tours` t
JOIN `incoming_tours_cities` tc0 ON tc0.id_parrent = t.id
AND tc0.id_city = '4'
JOIN `incoming_tours_cities` tc1 ON tc1.id_parrent = t.id
AND tc1.id_city = '74'
And that works fine.
But I generate the query dynamically, and when the number of joins is large (about 15), the query slows down.
i.e. when I try to run
SELECT t.name
FROM `incoming_tours` t
JOIN `incoming_tours_cities` tc0 ON tc0.id_parrent = t.id
AND tc0.id_city = '4'
JOIN `incoming_tours_cities` tc1 ON tc1.id_parrent = t.id
AND tc1.id_city = '74'
.........................................................
JOIN `incoming_tours_cities` tc15 ON tc15.id_parrent = t.id
AND tc15.id_city = 'some_value'
the query runs in 45s (even though I have set indexes on the tables).
What can I do to optimize it?
Thanks much
SELECT t.name
FROM incoming_tours t INNER JOIN
( SELECT id_parrent
FROM incoming_tours_cities
WHERE id_city IN (4, 74)
GROUP BY id_parrent
HAVING count(id_city) = 2) resultset
ON resultset.id_parrent = t.id
But you need to change the count in the HAVING clause to match the number of cities you are searching for.
SELECT name
FROM (
SELECT incoming_tours.name AS name,
COUNT(DISTINCT incoming_tours_cities.id_city) AS c
FROM incoming_tours
JOIN incoming_tours_cities
ON incoming_tours.id=incoming_tours_cities.id_parrent
WHERE incoming_tours_cities.id_city IN(4,74)
GROUP BY incoming_tours.id, incoming_tours.name
HAVING c=2
) t1;
You will have to change c=2 to whatever the count of id_city you are searching is, but since you generate the query dynamically, that shouldn't be a problem.
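Since the query is generated dynamically, here is one possible sketch of building it from a list of city ids, using Python's built-in sqlite3 module. The helper name tours_with_all_cities is hypothetical; table and column names follow the question (including the id_parrent spelling). The required count is derived from the list itself, so it never has to be edited by hand.

```python
import sqlite3


def tours_with_all_cities(conn, city_ids):
    """Return names of tours whose city list contains every id in city_ids."""
    wanted = sorted(set(city_ids))  # de-duplicate so the count comparison is exact
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT t.name
        FROM incoming_tours AS t
        JOIN incoming_tours_cities AS tc ON tc.id_parrent = t.id
        WHERE tc.id_city IN ({placeholders})
        GROUP BY t.id, t.name
        HAVING COUNT(DISTINCT tc.id_city) = ?
        ORDER BY t.id
    """
    # City ids are bound as parameters; only the number of "?" marks is generated.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incoming_tours (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE incoming_tours_cities (id_parrent INTEGER, id_city INTEGER)")
conn.executemany(
    "INSERT INTO incoming_tours VALUES (?, ?)",
    [(1, "first_tour"), (2, "second_tour"), (3, "third_tour"), (4, "fourth_tour")],
)
conn.executemany(
    "INSERT INTO incoming_tours_cities VALUES (?, ?)",
    [(1, 4), (1, 5), (1, 27), (1, 74), (2, 1), (2, 5)],
)

print(tours_with_all_cities(conn, [4, 74]))
```

This uses a single join regardless of how many cities are requested, which is the main difference from the one-join-per-city query in the question.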
I'm pretty sure this works, but a lot less sure that it is optimal.
SELECT * FROM incoming_tours
WHERE
id IN (SELECT id_parrent FROM incoming_tours_cities WHERE id_city=4)
AND id IN (SELECT id_parrent FROM incoming_tours_cities WHERE id_city=74)
...
AND id IN (SELECT id_parrent FROM incoming_tours_cities WHERE id_city=some_value)
Just a hint.
If you use the IN operator in a WHERE clause, you can hope that short-circuit evaluation of the AND operator skips the remaining subqueries for tours that already fail one of the constraints.
Seems like an odd way to do that query, here
SELECT t.name FROM `incoming_tours` as t WHERE t.id IN (SELECT id_parrent FROM `incoming_tours_cities` as tc WHERE tc.id_city IN ('4','74'));
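I think that does it, but not tested... Note that IN matches tours containing any of the listed cities, so this does not require both 4 and 74 to be present.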
EDIT: Added table alias to sub-query
I've written this query using CTE's and it includes the test data in the query. You'll need to modify it so that it queries the real tables instead. Not sure how it performs on a large dataset...
Declare @numCities int = 2
;with incoming_tours(id, name) AS
(
select 1, 'first_tour' union all
select 2, 'second_tour' union all
select 3, 'third_tour' union all
select 4, 'fourth_tour'
)
, incoming_tours_cities(id_parent, id_city) AS
(
select 1, 4 union all
select 1, 5 union all
select 1, 27 union all
select 1, 74 union all
select 2, 1 union all
select 2, 5
)
, cityIds(id_city) AS
(
select 4
union all select 5
/* Add all city ids you need to check in this table */
)
, common_cities(id_city, tour_id, tour_name) AS
(
select c.id_city, it.id, it.name
from cityIds c, incoming_tours_cities tc, incoming_tours it
where c.id_city = tc.id_city
and tc.id_parent = it.id
)
, tours_with_all_cities(tour_id) As
(
select tour_id from common_cities
group by tour_id
having COUNT(id_city) = @numCities
)
select it.name from incoming_tours it, tours_with_all_cities tic
where it.id = tic.tour_id