Complicated MySQL Database Query - php

I have the following database structure:
Sites table
id | name | other_fields
Backups table
id | site_id | initiated_on(unix timestamp) | size(float) | status
So the Backups table has a many-to-one relationship with the Sites table, connected via site_id.
And I would like to output the data in the following format
name | Latest initiated_on | status of the latest initiated_on row
And I have the following SQL query
SELECT *, `sites`.`id` as sid, SUM(`backups`.`size`) AS size
FROM (`sites`)
LEFT JOIN `backups` ON `sites`.`id` = `backups`.`site_id`
WHERE `sites`.`id` = '1'
GROUP BY `sites`.`id`
ORDER BY `backups`.`initiated_on` desc
The thing is, with the above query I can achieve what I am looking for, but the only problem is I don't get the latest initiated_on values.
So if I had 3 rows in backups with site_id=1, the query does not pick out the row with the highest value in initiated_on. It just picks out any row.
Please help, and
thanks in advance.

You should try:
SELECT sites.name, FROM_UNIXTIME(b.latest) as latest, b.size, b.status
FROM sites
LEFT JOIN
( SELECT bg.site_id, bg.latest, bg.sizesum AS size, bu.status
FROM
( SELECT site_id, MAX(initiated_on) as latest, SUM(size) as sizesum
FROM backups
GROUP BY site_id ) bg
JOIN backups bu
ON bu.initiated_on = bg.latest AND bu.site_id = bg.site_id
) b
ON sites.id = b.site_id
In the GROUP BY subquery - bg here, the only columns you can use for SELECT are columns that are either aggregated by a function or listed in the GROUP BY part.
http://dev.mysql.com/doc/refman/5.5/en/group-by-hidden-columns.html
Once you have all the aggregate values you need to join the result again to backups to find other values for the row with latest timestamp - b.
Finally join the result to the sites table to get names - or left join if you want to list all sites, even without a backup.
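The three steps above can be checked on a small data set. This is only a minimal sketch, not the answerer's code: it assumes Python's built-in sqlite3 module as a stand-in for MySQL (so FROM_UNIXTIME is left out), and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sites (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE backups (id INTEGER PRIMARY KEY, site_id INTEGER,
                      initiated_on INTEGER, size REAL, status TEXT);
-- Invented sample data: site 1 has three backups, site 2 has none.
INSERT INTO sites VALUES (1, 'alpha'), (2, 'beta');
INSERT INTO backups VALUES
  (1, 1, 100, 1.5, 'failed'),
  (2, 1, 300, 2.0, 'ok'),
  (3, 1, 200, 0.5, 'failed');
""")

latest_backup_sql = """
SELECT sites.name, b.latest, b.size, b.status
FROM sites
LEFT JOIN (
    -- bg: one row per site with the aggregates
    -- bu: joined back to pick up the status of the latest row
    SELECT bg.site_id, bg.latest, bg.sizesum AS size, bu.status
    FROM (SELECT site_id, MAX(initiated_on) AS latest, SUM(size) AS sizesum
          FROM backups
          GROUP BY site_id) bg
    JOIN backups bu
      ON bu.initiated_on = bg.latest AND bu.site_id = bg.site_id
) b ON sites.id = b.site_id
ORDER BY sites.id
"""
rows = conn.execute(latest_backup_sql).fetchall()
for row in rows:
    print(row)
```

The site without backups still appears, with empty values, because of the outer LEFT JOIN.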

Try with this:
select S.name, B.initiated_on, B.status
from sites as S left join backups as B on S.id = B.site_id
where B.initiated_on =
(select max(initiated_on)
from backups
where site_id = S.id)

To get the latest time, you need to make a subquery like this:
SELECT sites.id as sid,
SUM(backups.size) AS size,
latest.time AS latesttime
FROM sites AS sites
LEFT JOIN (SELECT site_id,
MAX(initiated_on) AS time
FROM backups
GROUP BY site_id) AS latest
ON latest.site_id = sites.id
LEFT JOIN backups
ON sites.id = backups.site_id
WHERE sites.id = 1
GROUP BY sites.id
ORDER BY backups.initiated_on desc
I have removed the SELECT * because selecting non-aggregated columns alongside GROUP BY only works in MySQL and is generally bad practice anyway. Non-MySQL RDBMSs will throw an error if you include the other fields, even individually, so you will need to make this query itself into a subquery and then do an INNER JOIN to the sites table to get the rest of the fields. This is because those databases require every such field to be added to the GROUP BY clause, and this fails (or is at least very slow) if you have long text fields.

Related

mySQL INNER JOIN very large tables with same column names

I have some big tables which I need to combine into a single very large table, to form a single-page data export for a statistical package.
This is easy with INNER JOIN, but some of the tables have the same column names, and these overwrite each other when I fetch them as an array in PHP.
There are 4 tables being joined with 30-200 columns in each so there are far too many field names to manually include in the query with aliases, as would be the norm in this situation.
Here's the query:
SELECT * FROM logs
INNER JOIN logdetail ON logdetail.logID = logs.id
INNER JOIN clients ON clients.id = logs.clientID
INNER JOIN records ON records.id = logdetail.id
WHERE logs.userID=1
Is there any way around this? I don't actually mind what the column names are as long as I have the data so if I could prepend the table name to each field, that would do the trick.
I would create a view. The view would consist of your long query with aliases.
Here is an example taken from the manual
mysql> CREATE TABLE t (qty INT, price INT);
mysql> INSERT INTO t VALUES(3, 50);
mysql> CREATE VIEW v AS SELECT qty, price, qty*price AS value FROM t;
mysql> SELECT * FROM v;
+------+-------+-------+
| qty | price | value |
+------+-------+-------+
| 3 | 50 | 150 |
+------+-------+-------+
This has always worked for me, unless you have one to many or some other relationship among these tables, which will duplicate records.
SELECT * FROM logs l
INNER JOIN logdetail ld ON ld.logID = l.id
INNER JOIN clients c ON c.id = l.clientID
INNER JOIN records r ON r.id = ld.id
WHERE l.userID=1
As andrew says you can also use a View to get this thing working which is much cooler.
I found a solution for this. Simply, fetch each duplicate column a second time, this time using an alias. This way, the overwritten values are selected again and aliased:
SELECT *,
clients.name as clientName,
logs.name as logName,
etc...
FROM logs
INNER JOIN logdetail ON logdetail.logID = logs.id
INNER JOIN clients ON clients.id = logs.clientID
INNER JOIN records ON records.id = logdetail.id
WHERE logs.userID=1
Note: There is no need to do this for the final instance of the duplicate, because that column will not have been overwritten. So, in the example above, there is no need to include a line like records.name as recordName: since no columns after it share the same name, the records.name field was never overwritten and is already available in the name column.
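The question also asks whether the table name could be prepended to every field without typing hundreds of aliases. One way is to generate the aliased column list from the schema. The following is a hedged sketch only: it assumes Python's sqlite3 module and SQLite's PRAGMA table_info as stand-ins (in MySQL the column names could be read from information_schema.COLUMNS instead), and prefixed_columns is a hypothetical helper name with invented sample tables.

```python
import sqlite3

def prefixed_columns(conn, tables):
    """Return 'table.col AS table_col' for every column of every table."""
    parts = []
    for table in tables:
        # PRAGMA table_info is SQLite-specific; row[1] is the column name.
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            col = row[1]
            parts.append(f'"{table}"."{col}" AS "{table}_{col}"')
    return ", ".join(parts)

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.executescript("""
CREATE TABLE logs (id INTEGER, name TEXT, clientID INTEGER, userID INTEGER);
CREATE TABLE clients (id INTEGER, name TEXT);
INSERT INTO logs VALUES (10, 'login', 7, 1);
INSERT INTO clients VALUES (7, 'Acme');
""")

sql = (
    "SELECT " + prefixed_columns(conn, ["logs", "clients"]) +
    " FROM logs INNER JOIN clients ON clients.id = logs.clientID"
    " WHERE logs.userID = 1"
)
record = dict(conn.execute(sql).fetchone())
print(record)
```

Every key in the fetched row now carries its table name, so logs_name and clients_name no longer collide.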

MySQL join multiple rows of query into one column

Table structure
client_commands (the "main" table):
id | completed
command_countries:
id | command_id | country_code
command_os:
id | command_id |OS
command_id references the id column on client_commands.
Problem
I can add client commands with filters based on countries and operating systems. To try and normalise my DB structure, for each new command added:
Add a new row to client_commands
For each country, I add a new row to command_countries, each referencing client_command.id
For each OS, I add a new row to command_os, each referencing client_command.id
For one of the pages on my site, I need to display all client_commands (where completed = 0) as well as all the countries and operating systems for that command. My desired output would be something like:
id | countries | OS
1 | GB, US, FR| 2, 3
2 | ES, HU | 1, 3
I'm not sure how to go about doing this. The current query I'm using returns multiple rows:
SELECT a.id, b.country_code, c.OS
FROM client_commands a
LEFT JOIN command_countries b on b.command_id = a.id
LEFT JOIN command_os c on c.command_id = a.id
WHERE a.completed = 0
Any help?
Thanks!
EDIT: I forgot to mention (if you couldn't infer from above) - there can be a different number of operating systems and countries per command.
--
Also: I know I could do this by pulling all the commands, then looping through and running 2 additional queries for each result. But if I can, I'd like to do it as efficiently as possible with one query.
You can do this in one query by using GROUP_CONCAT
SELECT a.id,
GROUP_CONCAT(DISTINCT b.country_code SEPARATOR ', ') `countries`,
GROUP_CONCAT(DISTINCT c.OS SEPARATOR ', ') `os`
FROM client_commands a
LEFT JOIN command_countries b on b.command_id = a.id
LEFT JOIN command_os c on c.command_id = a.id
WHERE a.completed = 0
GROUP BY a.id
If you want the values within a row ordered, you can use ORDER BY inside GROUP_CONCAT, like
GROUP_CONCAT(b.country_code ORDER BY b.command_id DESC SEPARATOR ', ') `countries`
But be aware that GROUP_CONCAT truncates its result at 1024 characters by default; this limit can be increased by following the steps provided in the manual (the group_concat_max_len setting).
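The DISTINCT keyword matters here because joining two one-to-many tables multiplies the rows: a command with 3 countries and 2 operating systems yields 6 joined rows. A minimal sketch of that effect, assuming Python's sqlite3 module as a stand-in for MySQL (SQLite's GROUP_CONCAT uses a plain comma and has no SEPARATOR keyword) and sample rows taken from the desired output in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client_commands (id INTEGER PRIMARY KEY, completed INTEGER);
CREATE TABLE command_countries (id INTEGER PRIMARY KEY, command_id INTEGER,
                                country_code TEXT);
CREATE TABLE command_os (id INTEGER PRIMARY KEY, command_id INTEGER, OS INTEGER);
INSERT INTO client_commands VALUES (1, 0), (2, 0);
INSERT INTO command_countries (command_id, country_code) VALUES
  (1, 'GB'), (1, 'US'), (1, 'FR'), (2, 'ES'), (2, 'HU');
INSERT INTO command_os (command_id, OS) VALUES (1, 2), (1, 3), (2, 1), (2, 3);
""")

def grouped(distinct):
    """Run the grouping query with or without DISTINCT inside GROUP_CONCAT."""
    kw = "DISTINCT " if distinct else ""
    sql = f"""
    SELECT a.id,
           GROUP_CONCAT({kw}b.country_code) AS countries,
           GROUP_CONCAT({kw}c.OS) AS os
    FROM client_commands a
    LEFT JOIN command_countries b ON b.command_id = a.id
    LEFT JOIN command_os c ON c.command_id = a.id
    WHERE a.completed = 0
    GROUP BY a.id
    ORDER BY a.id
    """
    return conn.execute(sql).fetchall()

with_dups = grouped(distinct=False)  # every value repeated by the cross product
deduped = grouped(distinct=True)     # one entry per country / OS
print(with_dups[0])
print(deduped[0])
```

The order of values inside each concatenated string is not guaranteed unless an ORDER BY is given inside the aggregate.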

Mysql group by fetch last row

SELECT * FROM conversation_1
LEFT JOIN conversation_2
ON conversation_1.c_id = conversation_2.c_id
LEFT JOIN user
ON conversation_2.user_id = user.user_id
LEFT JOIN message
ON conversation_1.c_id = message.c_id
WHERE conversation_1.user_id=1
GROUP BY message.c_id
conversation_1        conversation_2
c_id  user_id         c_id  user_id
1     1               1     2
2     1               2     3
3     2
I have a message DB built in MySQL.
I made 4 tables: user, conversation_1, conversation_2, message.
When a user opens his message box, it fetches all of his conversations (conversation_1),
then joins to conversation_2 to find out which other user is in each conversation,
then joins to the message table.
c_id user_id user_name message
1 2 Alex Hi user_1, this is user_2
2 3 John hi user_3, user_2 don't talk to me
It works fine; however, I want to display the message from the last row of each GROUP BY group.
Currently it displays the first row in the group.
P.S. conversation_1.c_id is auto-increment, and that c_id is inserted into conversation_2 for each user who has joined the conversation.
select * from (SELECT * FROM conversation_1
LEFT JOIN conversation_2
ON conversation_1.c_id = conversation_2.c_id
LEFT JOIN user
ON conversation_2.user_id = user.user_id
LEFT JOIN message
ON conversation_1.c_id = message.c_id
WHERE conversation_1.user_id=1
order by conversation_1.c_id desc) finalData
GROUP BY message.c_id
Beware that, as documented under MySQL Extensions to GROUP BY:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values within each group the server chooses.
This is what is happening to select the message (and potentially other columns) in your existing query.
Instead, you want the groupwise maximum:
SELECT messages.* FROM messages NATURAL JOIN (
SELECT c_id, MAX(m_id) m_id FROM messages GROUP BY c_id
) t
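A small runnable illustration of the groupwise maximum follows. It is a sketch under assumptions: Python's sqlite3 module stands in for MySQL, the table is named message as in the question, and the m_id and body columns plus all sample rows are hypothetical, since the question does not show the message table's definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- m_id and body are assumed column names; the data is invented.
CREATE TABLE message (m_id INTEGER PRIMARY KEY, c_id INTEGER, body TEXT);
INSERT INTO message VALUES
  (1, 1, 'first in conversation 1'),
  (2, 2, 'first in conversation 2'),
  (3, 1, 'last in conversation 1'),
  (4, 2, 'last in conversation 2');
""")

# The derived table t holds one (c_id, m_id) pair per conversation;
# NATURAL JOIN matches on both shared column names, keeping only those rows.
last_messages = conn.execute("""
SELECT c_id, body
FROM message NATURAL JOIN (
    SELECT c_id, MAX(m_id) AS m_id FROM message GROUP BY c_id
) t
ORDER BY c_id
""").fetchall()
print(last_messages)
```

Unlike ordering inside a subquery before GROUP BY, this does not depend on which row the server happens to pick for a group.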

Limit the output of a mysql query when a variable is used in WHERE

I have a MySQL statement that queries a database for the latest track. However, since the database is partially normalized, the IDs are in different tables. In the queries I get the artist IDs from the artists table and put them into a variable. The variable is then passed into a query that looks at the tracks to find the latest one, and this is where the problem lies. Since the $artist variable can hold a large number of IDs, all those IDs are passed into the query and the outcome is several URLs put together, even though I have put a LIMIT on the query.
Bear in mind that I cannot LIMIT the artist query as I need to get all the artists from the table and find the latest track out of all the artists.
How would I get just the latest url from the query without limiting the artist query?
//Set up artist query so only NBS artists are chosen
$findartist = mysql_query("SELECT * FROM artists") or die(mysql_error());
while ($artist = mysql_fetch_array($findartist)){
$artist = $artist['ID'];
//get track url
$fetchurl = mysql_query("SELECT * FROM tracks WHERE id = '$artist' ORDER BY timestamp DESC LIMIT 1");
$url = mysql_fetch_array($fetchurl);
$track_ID = $url ['ID'];
$trackname = $url ['name'];
$trackurl = $url ['url'];
$artist_ID =$url['ID'];
}
ADDITION:
$findartist = mysql_query("SELECT A.*, T.*
FROM (
SELECT T.ARTIST_ID, MIN(T.TRACK_ID) TRACK_ID
FROM (
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
) L
JOIN TRACKS T ON ( L.ARTIST_ID = T.ARTIST_ID
AND L.`TIMESTAMP` = T.`TIMESTAMP`)
GROUP BY T.ARTIST_ID
) X
JOIN ARTISTS A ON X.ARTIST_ID = A.ARTIST_ID
JOIN TRACKS T ON (X.TRACK_ID = T.TRACK_ID AND X.ARTIST_ID = T.ARTIST_ID)
ORDER BY A.NAME");
while ($artist = mysql_fetch_array($findartist)){
$artist = $artist['ID'];
$trackurl = $artist['url'];
The relation between artists table and tracks table is one-to-many. So your tracks table should have a column artist_id and foreign key constraint which cross-references this column with id column in artists table. When this is done, the query to get latest tracks would look like:
SELECT id, name, url, MAX(timestamp) timestamp
FROM tracks
GROUP BY artist_id
If I understand you correctly, you want the latest (most recent timestamp) track from each artist in your artist table.
It would help if you had your table definitions displayed. I think you're confusing ARTIST_ID and TRACK_ID in your query from your tracks table. So I will use the column names ARTIST_ID and TRACK_ID throughout.
(TIMESTAMP is an unfortunate choice for a column name, because it's also a MySQL data type name, by the way. No matter.)
You can do this with one query. Let us construct that query. It's not super simple but it will work just fine.
First, let's get the timestamp of the latest track or tracks by each artist. This returns a virtual table with ARTIST_ID and the latest TIMESTAMP shown.
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
Now, let's nest that query into another query to come up with a particular track_id that is the latest track from each artist. It is necessary to disambiguate the situation where an artist has more than one track with precisely the same timestamp. In this case we'll grab the lowest numbered TRACK_ID.
I suppose that all the tracks on an album by an artist have the same timestamp, but they have ascending track IDs, so this picks the first track on the artist's latest album.
SELECT T.ARTIST_ID, MIN(T.TRACK_ID) TRACK_ID
FROM (
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
) L
JOIN TRACKS T ON ( L.ARTIST_ID = T.ARTIST_ID
AND L.`TIMESTAMP` = T.`TIMESTAMP`)
GROUP BY T.ARTIST_ID
See how this goes? The inner subquery finds the latest timestamp for each artist, and the outer query uses the subquery to find the lowest-numbered track ID for that artist and timestamp. So, now we have a virtual table that shows the latest track_id for each artist.
Finally, we need to query the joined-together artist and track information to get your list of artists and their latest tracks. We'll join the two physical tables with the virtual table we just figured out.
SELECT A.*, T.*
FROM (
SELECT T.ARTIST_ID, MIN(T.TRACK_ID) TRACK_ID
FROM (
SELECT ARTIST_ID, MAX(`TIMESTAMP`) `TIMESTAMP`
FROM TRACKS
GROUP BY ARTIST_ID
) L
JOIN TRACKS T ON ( L.ARTIST_ID = T.ARTIST_ID
AND L.`TIMESTAMP` = T.`TIMESTAMP`)
GROUP BY T.ARTIST_ID
) X
JOIN ARTISTS A ON X.ARTIST_ID = A.ARTIST_ID
JOIN TRACKS T ON (X.TRACK_ID = T.TRACK_ID AND X.ARTIST_ID = T.ARTIST_ID)
ORDER BY A.NAME
Think of it this way: You have some physical tables with your data in them. You can also create virtual tables with subqueries and use them as if they were physical tables by including them, nested, in your queries. That nesting is one of the reasons it's called Structured Query Language.
You're going to need indexes on your TIMESTAMP, ARTIST_ID, and TRACK_ID columns for this to work efficiently.
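To see the tie-breaking step at work, here is a minimal sketch that runs the final query against invented data in which one artist has two tracks with the same timestamp. It assumes Python's sqlite3 module as a stand-in for MySQL (so the TIMESTAMP column is quoted with double quotes rather than backticks), and the NAME and URL columns and all rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ARTISTS (ARTIST_ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE TRACKS (TRACK_ID INTEGER PRIMARY KEY, ARTIST_ID INTEGER,
                     URL TEXT, "TIMESTAMP" INTEGER);
INSERT INTO ARTISTS VALUES (1, 'Ann'), (2, 'Bob');
-- Tracks 11 and 12 share the latest timestamp for artist 1.
INSERT INTO TRACKS VALUES
  (10, 1, 'ann-old', 100),
  (11, 1, 'ann-new-a', 200),
  (12, 1, 'ann-new-b', 200),
  (20, 2, 'bob-only', 150);
""")

rows = conn.execute("""
SELECT A.NAME, T.TRACK_ID, T.URL
FROM (
    -- X: the lowest TRACK_ID among each artist's latest-timestamp tracks
    SELECT T.ARTIST_ID, MIN(T.TRACK_ID) AS TRACK_ID
    FROM (
        -- L: the latest timestamp per artist
        SELECT ARTIST_ID, MAX("TIMESTAMP") AS "TIMESTAMP"
        FROM TRACKS
        GROUP BY ARTIST_ID
    ) L
    JOIN TRACKS T ON (L.ARTIST_ID = T.ARTIST_ID
                  AND L."TIMESTAMP" = T."TIMESTAMP")
    GROUP BY T.ARTIST_ID
) X
JOIN ARTISTS A ON X.ARTIST_ID = A.ARTIST_ID
JOIN TRACKS T ON (X.TRACK_ID = T.TRACK_ID AND X.ARTIST_ID = T.ARTIST_ID)
ORDER BY A.NAME
""").fetchall()
print(rows)
```

Without the MIN(TRACK_ID) step, the artist with the tied timestamps would appear twice in the result.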
Edit:
There really isn't sufficient information about your schema in your question to figure out how unambiguously to get the most recently uploaded track.
If the TRACK_ID is the autoincrementing primary key for the TRACKS table, it's easy. Get the highest numbered track ID left joined to the artist (left joined in case there's no corresponding row in the artist table).
SELECT T.*, A.*
FROM TRACKS T
LEFT JOIN ARTISTS A ON T.ARTIST_ID = A.ARTIST_ID
ORDER BY T.TRACK_ID DESC
LIMIT 1
If TRACK_ID isn't an autoincrementing primary key but you almost never have two timestamps the same, do this. If there happen to be two or more tracks with the same timestamp, it will arbitrarily select one of them.
SELECT T.*, A.*
FROM TRACKS T
LEFT JOIN ARTISTS A ON T.ARTIST_ID = A.ARTIST_ID
ORDER BY T.`TIMESTAMP` DESC
LIMIT 1
The trick to this data stuff is to be very careful to specify exactly what you want. It's pretty clear from your question that you're trying, in a loop, to get the most recent track for each artist in turn. My query did that without a loop in your program. But, you know what, I don't know the names of all your columns so my SQL might not be perfect.
Big thanks to @OllieJones and @hookman for helping me out on this. I have found the query I need, and I have done it all in one query without any PHP.
Anyway here it is;
SELECT T.url, A.ID, T.ID
FROM tracks T
LEFT JOIN ARTISTS A ON T.ID = A.ID
WHERE T.ID = A.ID
ORDER BY T.timestamp DESC
LIMIT 1
I took much of @OllieJones's query and edited it a bit. I added the WHERE clause so that only artists are chosen and took away the * so only the needed data is returned. I also took @hookman's advice and used a load of foreign keys. It's going to help a lot in the future.

Fastest way to join to an IP2C table?

I have three tables overall, one with player names and their last login, and another table with the player name and their IP. These are from a game server, but it's two separate "plugins" of the server, so I cannot merge these into one table.
I successfully join these two on the playername column like so:
SELECT
u.`user` as `ign`,
lb.`lastlogin` as `date`,
lb.`ip`
FROM `mcmmo_users` u
LEFT JOIN `lb-players` lb
ON u.`user`=lb.`playername`
These produce the following array: Array(ign,date,ip);
However, I have an IP2C (IP-Country) table as well, and I would like to get these results at the same time. However, this table is extremely large, and would heavily slow down the query if I did a standard LEFT JOIN.
Is there a quicker way to join this? I would prefer to not query on every PHP loop of the data.
I am using MySQL and PHP
The IP2C database is laid out as follows:
begin_ip | end_ip | begin_ip_num | end_ip_num | country_code | country_name
And is queried as follows:
$IPNUM = sprintf("%u",ip2long($ip));
SELECT `country_code`
FROM `cpanel_ip2c`
WHERE $IPNUM BETWEEN `begin_ip_num` AND `end_ip_num`
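A BETWEEN condition across two columns is hard for a database to optimize. Instead, consider querying for the last IP block whose begin_ip_num is less than or equal to the user's IP, and then checking that the IP does not exceed that block's end_ip_num: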
select *
from mcmmo_users u
left join
`lb-players` lb
on u.user = lb.playername
left join
cpanel_ip2c ip
on ip.begin_ip_num =
(
select begin_ip_num
from cpanel_ip2c ip
where ip.begin_ip_num <= inet_aton(lb.ip)
order by
ip.begin_ip_num desc
limit 1
)
and inet_aton(lb.ip) <= ip.end_ip_num
With an index on cpanel_ip2c(begin_ip_num ), the country can be resolved with an index seek.
Here's an example on SQL Fiddle, with the mcmmo_users table omitted for simplicity.
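The lookup itself can be sketched on its own, outside the join. This is a hedged illustration rather than the answerer's code: it assumes Python's sqlite3 and ipaddress modules in place of MySQL and inet_aton, country_for is a hypothetical helper name, and the two IP ranges with their country codes are invented.

```python
import ipaddress
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The primary key on begin_ip_num provides the index the lookup relies on.
CREATE TABLE cpanel_ip2c (begin_ip_num INTEGER PRIMARY KEY,
                          end_ip_num INTEGER, country_code TEXT);
""")
# Invented ranges with a deliberate gap between them.
ranges = [("10.0.0.0", "10.0.0.255", "AA"), ("10.0.2.0", "10.0.2.255", "BB")]
conn.executemany(
    "INSERT INTO cpanel_ip2c VALUES (?, ?, ?)",
    [(int(ipaddress.IPv4Address(b)), int(ipaddress.IPv4Address(e)), c)
     for b, e, c in ranges],
)

def country_for(ip):
    """Return the country code for ip, or None if no block contains it."""
    ip_num = int(ipaddress.IPv4Address(ip))
    # Find the last block that starts at or before the address.
    row = conn.execute(
        """
        SELECT end_ip_num, country_code
        FROM cpanel_ip2c
        WHERE begin_ip_num <= ?
        ORDER BY begin_ip_num DESC
        LIMIT 1
        """,
        (ip_num,),
    ).fetchone()
    # The address may fall in a gap after that block, so check the end too.
    if row is None or ip_num > row[0]:
        return None
    return row[1]

print(country_for("10.0.0.17"), country_for("10.0.1.5"))
```

The end_ip_num check plays the same role as the final condition in the join above: it rejects addresses that fall between two blocks.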
