I have two tables in my SQL database: one for the photos that users have sent, and another for the votes each photo has received in my application. I need to extract from the 'photos' table the 30 photos that received the most votes in the 'votes' table.
Is there a way to do it within a single query?
You should be able to use a query like this:
select
a.photoFileName
from
photos a
join votes b
on a.photoId=b.photoId
order by
b.voteCount desc
limit 30
Adjust the keys to your exact column names on the linked fields.
This assumes that the votes table has a numeric column (voteCount) that holds a tally of the votes for that image.
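For reference, the assumed shape of that votes table might look something like this (a sketch with illustrative column names, not taken from the question):
CREATE TABLE votes (
    photoId INT PRIMARY KEY,    -- one row per photo
    voteCount INT NOT NULL      -- running tally of votes for that photo
);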
Something like this (if each vote is stored as a single row), but make your own adjustments:
SELECT
p.id,
COUNT( v.id )
FROM
photos p
JOIN
votes v ON p.id = v.photo_id
GROUP BY
p.id
ORDER BY
COUNT( v.id ) DESC
LIMIT 30;
PS: I did not test the query, just gave you an example!
I’m really struggling with how to write a query that randomly selects 50 DISTINCT titles from one table in my MySQL database and then selects 1 random excerpt for each title from a separate table. The first table is titles and the second is excerpts.
I’ve tried two queries nested together but this either doesn’t work or returns duplicate titles despite supposedly being DISTINCT.
Could somebody please, PLEASE help me with where I’m going wrong?!
My existing PHP:
$distincttitlequery = "SELECT DISTINCT titleid FROM titles ORDER BY rand() LIMIT 50";
$distincttitleresult = mysql_query($cxn,$distincttitlequery);
while ($distinctqueryreturn = mysqli_fetch_assoc($distincttitlequery))
{
extract ($distinctqueryreturn);
$selectedtitle = $titleid;
$randomexcerptquery = "SELECT excerpts.titleid, excerpts.excerptid, excerpts.excerptsynopsis, title.titleid, title.title FROM excerpts INNER JOIN titles ON excerpts.titleid=title.titleid WHERE titleid = '$selectedtitle' ORDER BY rand() LIMIT 1";
$randomexcerptresults = mysql_query($cxn,$randomexcerptquery);
while ($randomexcerptreturn = mysqli_fetch_assoc($randomexcerptquery))
{
[ECHO RESULTS HERE]
}};
I’ve read in similar posts about GROUP BY but I need to create a query which deals with distinct, random and joined tables and I have absolutely no idea where to start!
My existing code uses DISTINCT on multiple columns and joins the tables but this leads to titles being repeated in returned results. I can LIVE with that but I’d love to perfect it!
Thank you in advance for your help with this.
In MySQL 8 you can use ROW_NUMBER() to get one random row per titleid:
SELECT
titleid,title,excerptid,excerptsynopsis
FROM (
SELECT
e.titleid, e.excerptid, e.excerptsynopsis
,ROW_NUMBER() OVER( PARTITION BY e.titleid ORDER BY rand()) rn
, t.title
FROM excerpts e
INNER JOIN (SELECT DISTINCT titleid, title FROM titles ORDER BY rand() LIMIT 50) t ON e.titleid=t.titleid
) t1
WHERE rn = 1
I have a MySQL statement that queries a database for the latest track. However, since the database is partially normalized, the IDs are in different tables. In the first query I get the artist IDs from the artists table and put them into a variable. The variable is then used in a query that looks at the tracks table to find the latest one, and this is where the problem lies. Since the $artist variable can hold lots of IDs, all those IDs are fed into the query, and the outcome is several URLs strung together even though I have put a LIMIT on the query.
Bear in mind that I cannot LIMIT the artist query, as I need to get all the artists from the table and find the latest track out of all of them.
How would I get just the latest URL from the query without limiting the artist query?
//Set up artist query so only NBS artists are chosen
$findartist = mysql_query("SELECT * FROM artists") or die(mysql_error());
while ($artist = mysql_fetch_array($findartist)){
$artist = $artist['ID'];
//get track url
$fetchurl = mysql_query("SELECT * FROM tracks WHERE id = '$artist' ORDER BY timestamp DESC LIMIT 1");
$url = mysql_fetch_array($fetchurl);
$track_ID = $url ['ID'];
$trackname = $url ['name'];
$trackurl = $url ['url'];
$artist_ID =$url['ID'];
}
ADDITION:
$findartist = mysql_query("SELECT A.*, T.*
FROM (
SELECT T.ARTIST_ID, MIN(T.TRACK_ID) TRACK_ID
FROM (
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
) L
JOIN TRACKS T ON ( L.ARTIST_ID = T.ARTIST_ID
AND L.`TIMESTAMP` = T.`TIMESTAMP`)
GROUP BY T.ARTIST_ID
) X
JOIN ARTISTS A ON X.ARTIST_ID = A.ARTIST_ID
JOIN TRACKS T ON (X.TRACK_ID = T.TRACK_ID AND X.ARTIST_ID = T.ARTIST_ID)
ORDER BY A.NAME");
while ($artist = mysql_fetch_array($findartist)){
$artist = $artist['ID'];
$trackurl = $artist['url'];
The relation between the artists table and the tracks table is one-to-many. So your tracks table should have an artist_id column and a foreign key constraint that references the id column in the artists table. Once that is in place, the query to get the latest tracks would look like:
SELECT id, name, url, MAX(timestamp) timestamp
FROM tracks
GROUP BY artist_id
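For completeness, the foreign key constraint described above might be declared like this (a sketch, assuming the column names artist_id in tracks and id in artists):
ALTER TABLE tracks
    ADD CONSTRAINT fk_tracks_artist
    FOREIGN KEY (artist_id) REFERENCES artists (id);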
If I understand you correctly, you want the latest (most recent timestamp) track from each artist in your artist table.
It would help if you had your table definitions displayed. I think you're confusing ARTIST_ID and TRACK_ID in your query from your tracks table. So I will use the column names ARTIST_ID and TRACK_ID throughout.
(TIMESTAMP is an unfortunate choice for a column name, because it's also a MySQL data type name, by the way. No matter.)
You can do this with one query. Let us construct that query. It's not super simple but it will work just fine.
First, let's get the timestamp of the latest track or tracks by each artist. This returns a virtual table with ARTIST_ID and the latest TIMESTAMP shown.
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
Now, let's nest that query into another query to come up with a particular track_id that is the latest track from each artist. It is necessary to disambiguate the situation where an artist has more than one track with precisely the same timestamp. In this case we'll grab the lowest numbered TRACK_ID.
I suppose that all the tracks on an album by an artist have the same timestamp, but they have ascending track IDs, so this picks the first track on the artist's latest album.
SELECT T.ARTIST_ID, MIN(T.TRACK_ID) TRACK_ID
FROM (
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
) L
JOIN TRACKS T ON ( L.ARTIST_ID = T.ARTIST_ID
AND L.`TIMESTAMP` = T.`TIMESTAMP`)
GROUP BY T.ARTIST_ID
See how this goes? The inner subquery finds the latest timestamp for each artist, and the outer query uses the subquery to find the lowest-numbered track ID for that artist and timestamp. So, now we have a virtual table that shows the latest track_id for each artist.
Finally, we need to query the joined-together artist and track information to get your list of artists and their latest tracks. We'll join the two physical tables with the virtual table we just figured out.
SELECT A.*, T.*
FROM (
SELECT T.ARTIST_ID, MIN(T.TRACK_ID) TRACK_ID
FROM (
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
) L
JOIN TRACKS T ON ( L.ARTIST_ID = T.ARTIST_ID
AND L.`TIMESTAMP` = T.`TIMESTAMP`)
GROUP BY T.ARTIST_ID
) X
JOIN ARTISTS A ON X.ARTIST_ID = A.ARTIST_ID
JOIN TRACKS T ON (X.TRACK_ID = T.TRACK_ID AND X.ARTIST_ID = T.ARTIST_ID)
ORDER BY A.NAME
Think of it this way: You have some physical tables with your data in them. You can also create virtual tables with subqueries and use them as if they were physical tables by including them, nested, in your queries. That nesting is one of the reasons it's called Structured Query Language.
You're going to need indexes on your TIMESTAMP, ARTIST_ID, and TRACK_ID columns for this to work efficiently.
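For example (a sketch; adjust to your real column names, and note that TRACK_ID may already be covered by the primary key):
ALTER TABLE TRACKS
    ADD INDEX idx_tracks_artist_timestamp (ARTIST_ID, `TIMESTAMP`),
    ADD INDEX idx_tracks_track_id (TRACK_ID);
A composite index on ARTIST_ID plus TIMESTAMP serves both the GROUP BY and the join on those two columns.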
Edit:
There really isn't sufficient information about your schema in your question to figure out how unambiguously to get the most recently uploaded track.
If the TRACK_ID is the autoincrementing primary key for the TRACKS table, it's easy. Get the highest numbered track ID left joined to the artist (left joined in case there's no corresponding row in the artist table).
SELECT T.*, A.*
FROM TRACKS T
LEFT JOIN ARTISTS A ON T.ARTIST_ID = A.ARTIST_ID
ORDER BY T.TRACK_ID DESC
LIMIT 1
If TRACK_ID isn't an autoincrementing primary key but you almost never have two timestamps the same, do this. If there happen to be two or more tracks with the same timestamp, it will arbitrarily select one of them.
SELECT T.*, A.*
FROM TRACKS T
LEFT JOIN ARTISTS A ON T.ARTIST_ID = A.ARTIST_ID
ORDER BY T.`TIMESTAMP` DESC
LIMIT 1
The trick to this data stuff is to be very careful to specify exactly what you want. It's pretty clear from your question that you're trying, in a loop, to get the most recent track for each artist in turn. My query did that without a loop in your program. But, you know what, I don't know the names of all your columns so my SQL might not be perfect.
Big thanks to @OllieJones and @hookman for helping me out on this. I have found the query I need, and I have done it all in one query without any PHP, so big thanks to them both.
Anyway here it is;
SELECT T.url, A.ID, T.ID
FROM tracks T
LEFT JOIN ARTISTS A ON T.ID = A.ID
WHERE T.ID = A.ID
ORDER BY T.timestamp DESC
LIMIT 1
I took much of @OllieJones' query and edited it a bit. I added the WHERE clause so that only artists are chosen and removed the * so that only the needed data is returned. I also took @hookman's advice and used a load of foreign keys. It's going to help a lot in the future.
I have two tables for a gallery system:
gallery_cat(
gallery_cat_id PK,
gallery_cat_name
)
gallery(
gallery_id PK,
gallery_cat_id FK,
gallery_name,
gallery_file_name,
gallery_date
)
I need to write a SQL query that returns one picture from the gallery table for each album; the purpose is that I need to list the albums with one picture each. The desired output looks like this:
gallery_name | gallery_cat_name| gallery_file_name
-------------+-----------------+------------------
pic1 | Album1 | pic1.jpg
This should do the trick:
SELECT g2.gallery_name, gc2.gallery_cat_name, g2.gallery_file_name
FROM gallery g2
INNER JOIN gallery_cat gc2 ON (g2.gallery_cat_id = gc2.gallery_cat_id)
WHERE g2.gallery_id IN (
SELECT g.gallery_id
FROM gallery g
GROUP BY g.gallery_cat_id)
Explanation:
At the end is a sub-select
IN (
SELECT g.gallery_id
FROM gallery g
GROUP BY g.gallery_cat_id) <<-- select 1 random g.id per gallery_cat.
Here I select all the gallery ids, but because of the GROUP BY clause the results are reduced to one row per grouped-by item, i.e. one row (chosen more or less at random) per g.gallery_cat_id.
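Note that letting GROUP BY pick the surviving gallery_id is not deterministic (and strict ONLY_FULL_GROUP_BY mode rejects it); a deterministic variant of this sub-select would aggregate explicitly, for example always taking the lowest gallery_id per category:
SELECT MIN(g.gallery_id)
FROM gallery g
GROUP BY g.gallery_cat_id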
Next I do a normal select with a join:
SELECT g2.gallery_name, gc2.gallery_cat_name, g2.gallery_file_name
FROM gallery g2
INNER JOIN gallery_cat gc2 ON (g2.gallery_cat_id = gc2.gallery_cat_id)
WHERE g2.gallery_id IN (
Because I refer to the same table twice in the same query, you have to use an alias (*).
I select all names and all catnames and all filenames.
However in the where clause I filter these so that only rows from the sub-select are shown.
I have to do it this way because the GROUP BY mixes rows into one messed-up row; if I select from that directly, I will get values from different rows mixed together, which is not a good thing.
By first selecting the ids I want and then matching full rows to those ids, I prevent this from happening.
*(in this case with this kind of subselect that's not really 100% true, but trust me on the point that it's always a good idea to alias your tables)
This attempts to select the most recent gallery_date for each category ID and join against gallery_cat
SELECT
c.gallery_cat_id,
c.gallery_cat_name,
i.lastimg
FROM
gallery_cat c
LEFT JOIN (
SELECT gallery_cat_id, gallery_file_name AS lastimg, MAX(gallery_date)
FROM gallery
GROUP BY gallery_cat_id, gallery_file_name
) i ON c.gallery_cat_id = i.gallery_cat_id
You can use SQL JOINs to do this; otherwise you would have to loop over all the albums and pick one random picture from each, which would be less efficient.
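For instance, a join against a per-category aggregate could look like this (a sketch assuming the gallery/gallery_cat schema above; it deterministically picks the lowest gallery_id per album rather than a random one):
SELECT gc.gallery_cat_name, g.gallery_name, g.gallery_file_name
FROM gallery_cat gc
INNER JOIN (
    SELECT gallery_cat_id, MIN(gallery_id) AS first_id
    FROM gallery
    GROUP BY gallery_cat_id
) firsts ON firsts.gallery_cat_id = gc.gallery_cat_id
INNER JOIN gallery g ON g.gallery_id = firsts.first_id;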
This is my query:
SELECT U.id AS user_id,C.name AS country,
CASE
WHEN U.facebook_id > 0 THEN CONCAT(F.first_name,' ',F.last_name)
WHEN U.twitter_id > 0 THEN T.name
WHEN U.regular_id > 0 THEN CONCAT(R.first,' ',R.last)
END AS name
FROM user U LEFT OUTER JOIN regular R
ON U.regular_id = R.id
LEFT OUTER JOIN twitter T
ON U.twitter_id = T.id
LEFT OUTER JOIN facebook F
ON U.facebook_id = F.id
LEFT OUTER JOIN country C
ON U.country_id = C.id
WHERE (CONCAT(F.first_name,' ',F.last_name) LIKE '%' OR T.name LIKE '%' OR CONCAT(R.first,' ',R.last) LIKE '%') AND U.active = 1
LIMIT 100
It's really fast, but EXPLAIN doesn't show that it uses indexes (there are indexes).
But when I add ORDER BY name before the LIMIT it takes a long time. Why? Is there a way to solve it?
Table sizes: users 150,000, regular 50,000, facebook 50,000, twitter 50,000, country 250, and growing!
It takes a long time because name is a computed column, not a table column. It is the result of a CASE expression, and unlike simple selects with multiple joins, MySQL has to use a different sorting algorithm for this kind of derived data.
I'm talking somewhat from ignorance here, but you could store the data in a temporary table and then sort it. It may go faster since you can create indexes for it, but it won't be as fast, because of the different storage type.
UPDATE 2011-01-26
CREATE TEMPORARY TABLE `short_select`
SELECT U.id AS user_id,C.name AS country,
CASE
WHEN U.facebook_id > 0 THEN CONCAT(F.first_name,' ',F.last_name)
WHEN U.twitter_id > 0 THEN T.name
WHEN U.regular_id > 0 THEN CONCAT(R.first,' ',R.last)
END AS name
FROM user U LEFT OUTER JOIN regular R
ON U.regular_id = R.id
LEFT OUTER JOIN twitter T
ON U.twitter_id = T.id
LEFT OUTER JOIN facebook F
ON U.facebook_id = F.id
LEFT OUTER JOIN country C
ON U.country_id = C.id
WHERE (CONCAT(F.first_name,' ',F.last_name) LIKE '%' OR T.name LIKE '%' OR CONCAT(R.first,' ',R.last) LIKE '%') AND U.active = 1
LIMIT 100;
ALTER TABLE `short_select` ADD INDEX(`name`); --add successive columns if you are going to order by them as well.
SELECT * FROM `short_select`
ORDER BY `name`; -- same as above
Remember temporary tables are dropped upon connection termination, so you don't have to clean them, but you should anyway.
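If you do want to clean up explicitly, a one-liner does it (assuming the table name used above):
DROP TEMPORARY TABLE IF EXISTS `short_select`;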
Without actually knowing your DB structure, and assuming you have all of the proper indexes on everything: an ORDER BY takes a variable amount of time to sort the elements being returned by a query (index or not). If it is only 10 rows, it will seem almost instant; if you get 2,000 rows, it will be a little slower; and if you are sorting 15k rows joined across multiple tables, it is going to take some time to sort the returned result. Also make sure you're adding indexes to the fields you're sorting by. You may want to take the desired result and store everything in a presorted stub table for faster querying later as well (if you query this sorted result set often).
You need to select the first 100 records from each name table separately, then union the results, join them with user and country, and order and limit the output:
SELECT u.id AS user_id, c.name AS country, n.name
FROM (
  (
  SELECT facebook_id AS id, CONCAT(F.first_name, ' ', F.last_name) AS name
  FROM facebook F
  ORDER BY
  first_name, last_name
  LIMIT 100
  )
  UNION ALL
  (
  SELECT twitter_id, name
  FROM twitter
  WHERE twitter_id NOT IN
  (
  SELECT facebook_id
  FROM facebook
  )
  ORDER BY
  name
  LIMIT 100
  )
  UNION ALL
  (
  SELECT regular_id, CONCAT(R.first, ' ', R.last)
  FROM regular R
  WHERE regular_id NOT IN
  (
  SELECT facebook_id
  FROM facebook
  )
  AND
  regular_id NOT IN
  (
  SELECT twitter_id
  FROM twitter
  )
  ORDER BY
  first, last
  LIMIT 100
  )
) n
JOIN user u
ON u.id = n.id
JOIN country c
ON c.id = u.country_id
ORDER BY n.name
LIMIT 100
Create the following indexes:
facebook (first_name, last_name)
twitter (name)
regular (first, last)
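A sketch of those index definitions as DDL (index names are illustrative):
CREATE INDEX idx_facebook_name ON facebook (first_name, last_name);
CREATE INDEX idx_twitter_name ON twitter (name);
CREATE INDEX idx_regular_name ON regular (first, last);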
Note that this query orders slightly differently from your original one: in this query, 'Ronnie James Dio' would be sorted after 'Ronnie Scott'.
The use of functions on the columns prevents indexes from being used.
CONCAT(F.first_name,' ',F.last_name)
The result of the function is not indexed, even though the individual columns may be. Either you have to rewrite the conditions to query the name columns individually, or you have to store and index the result of that function (such as a "full name" column).
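For example, a denormalized "full name" column kept alongside the parts can be indexed directly (a sketch against the facebook table; the same idea applies to the regular table):
ALTER TABLE facebook ADD COLUMN full_name VARCHAR(255);
UPDATE facebook SET full_name = CONCAT(first_name, ' ', last_name);
CREATE INDEX idx_facebook_full_name ON facebook (full_name);
-- keep full_name in sync on insert/update, e.g. in application code or a trigger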
The index on [user.active] is unlikely to help you if most of the users are active.
I don't know what your application is all about, but I wonder if it wouldn't have been easier to ditch the foreign keys in the User table and instead put the user ID as a foreign key in the other tables.
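A rough sketch of that inverted design (table and column names here are assumptions, since the question does not show full definitions):
ALTER TABLE facebook ADD COLUMN user_id INT,
    ADD CONSTRAINT fk_facebook_user FOREIGN KEY (user_id) REFERENCES `user` (id);
-- and similarly for the twitter and regular tables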
Dear PHP and MySQL experts,
I have two tables: a large one for post articles with 200,000 records (indexed column: sid), and a small table (indexed column: topicid) for topics with 20 records. They share the same topic id.
Currently I am using the following approach (it takes around 0.4s):
First, get the last 50 records from the stories table:
SELECT sid, aid, title, time, topic, informant, ihome, alanguage, counter, type, images, chainid FROM veryzoo_stories ORDER BY sid DESC LIMIT 0,50
Then run a while loop over each record to find the matching topic name for each post:
while ( .. ) {
SELECT topicname FROM veryzoo_topics WHERE topicid='$topic'"
....
}
Now I am going to use an INNER JOIN to speed up the process, but in my tests it takes much longer, from 1.5s up to 3.5s:
SELECT a.sid, a.aid, a.title, a.time, a.topic, a.informant, a.ihome, a.alanguage, a.counter, a.type, a.images, a.chainid, t.topicname FROM veryzoo_stories a INNER JOIN veryzoo_topics t ON a.topic = t.topicid ORDER BY sid DESC LIMIT 0,50
It looks like the inner join joins all 200k records from the two tables first and then limits the result to 50, which takes a long time.
Please help point me to the right way of doing this, e.g. take the last 50 records from table one, then join them to table two, etc.
Do not use inner join unless the two tables share the same primary key, or you'll get duplicate values (and of course a slower query).
Please try this :
SELECT *
FROM (
SELECT a.sid, a.aid, a.title, a.time, a.topic, a.informant, a.ihome, a.alanguage, a.counter, a.type, a.images, a.chainid
FROM veryzoo_stories a
ORDER BY sid DESC
LIMIT 0 , 50
) b
INNER JOIN veryzoo_topics t ON b.topic = t.topicid
I made a small test and it seems to be faster. It uses a subquery (nested query) to first select the 50 records and then join.
Also make sure that veryzoo_stories.sid, veryzoo_stories.topic and veryzoo_topics.topicid are indexes (and that the relation exists if you use InnoDB). It should improve the performance.
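If any of those indexes are missing, they could be added like this (a sketch; sid is probably already the primary key, in which case its line can be dropped):
ALTER TABLE veryzoo_stories ADD INDEX idx_stories_sid (sid), ADD INDEX idx_stories_topic (topic);
ALTER TABLE veryzoo_topics ADD INDEX idx_topics_topicid (topicid);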
Now it leaves the problem of the ORDER BY LIMIT. It is heavy because it orders the 200,000 records before selecting. I guess it's necessary. The indexes are very important when using ORDER BY.
Here is an article on the problem : ORDER BY … LIMIT Performance Optimization
I just gave the nested query + inner join a test and was surprised that performance increased a lot: it now takes only 0.22s. Here is my query:
SELECT a.*, t.topicname
FROM (SELECT sid, aid, title, TIME, topic, informant, ihome, alanguage, counter, TYPE, images, chainid
FROM veryzoo_stories
ORDER BY sid DESC
LIMIT 0, 50) a
INNER JOIN veryzoo_topics t ON a.topic = t.topicid
If no better solution comes up, I may use this one. Thanks to anyone who looks at this post.