Regardless of the tags, I would like to solve this in the query itself, if possible.
I have this table
activity_type | value | date | company_id
network.new | 1 | 2011-10-08 | 1
members.count | 3 | 2011-10-08 | 1
network.new | 1 | 2011-10-10 | 2
network.new | 1 | 2011-10-11 | 3
members.count | 4 | 2011-10-11 | 2
That's basically an activity log.
A 'network.new' activity occurs only once per company_id.
A 'members.count' activity occurs only after 'network.new' has appeared for that company_id, and can appear at most once per day per company_id.
I need to make a line graph where the X axis is the date and the Y axis is the quantity of two things:
How many company_ids have members each day of activity for the first
time (That is the one that is giving me a hard time);
How many have the network.new activity and only that activity
for each given day.
All the queries I tried gave me false positives, mostly because they count company_ids that have the 'members.count' activity every day.
I would like, if possible, a query that gives me date, first_time_members and new_company columns, so I can create a view from it.
I hope my question is clear enough, and not silly; I couldn't find anything that looks close to my problem anywhere.
[EDIT]
Since my English is really poor and I couldn't make myself clear, I'm going to try to explain a little more:
My client has a network of companies and wants to learn how many companies join the network day by day, but there's a catch: when a company signs up for the network, the sign-up is only considered complete once it also has registered members. So he wants to know how many companies make an 'incomplete' sign-up and how many make a 'complete' sign-up.
Mr. Ollie Jones put me in the right direction. I think I can use what he taught me, but it is not quite there yet.
Thank you, Ollie Jones, for your answer, by the way. Answers like yours made me love this site.
I'm going to stick my neck out and guess what you want. You're asking for "How many company_ids have members each day of activity for the first time". With respect, this is a very hard statement to understand.
I think you mean this: for each day, how many company_id values appear for the very first time in a network.new activity type, and how many of those are accompanied by nonzero members.count item in that same day, and how many are not?
Here's what you do:
First think of a query that will give the very first date for each company appearing in your system. Try this.
SELECT MIN(date) networknewdate, company_id
FROM table
WHERE activity_type = 'network.new'
GROUP BY company_id
This yields a virtual table of networknewdate, company_id.
Next, you need a query that will give the first date a members.count item turns up for each company.
SELECT MIN(date) memberscountdate, company_id
FROM table
WHERE activity_type = 'members.count'
GROUP BY company_id
OK, now we have two nice virtual tables each with at most one row for each company_id value. Let's join them, driving the join off the first (network.new) value.
SELECT a.networknewdate,
a.company_id,
IFNULL(b.members_present, 'no') members
FROM (
SELECT MIN(date) networknewdate, company_id
FROM table
WHERE activity_type = 'network.new'
GROUP BY company_id
) a
LEFT JOIN (
SELECT MIN(date) memberscountdate, company_id, 'yes' members_present
FROM table
WHERE activity_type = 'members.count'
GROUP BY company_id
) b ON (a.networknewdate = b.memberscountdate and a.company_id = b.company_id)
This will return three columns: a date, a company_id, and 'yes' or 'no' saying whether there was a first members.count record on the same day as the first network.new record for each company_id.
Now you need to summarize this whole thing so you get one record per day, with the number of 'yes' and the number of 'no' items listed. Here we go.
The number of 'yes' records by day.
SELECT networknewdate, count(*) yesrecords
FROM (
SELECT a.networknewdate,
a.company_id,
IFNULL(b.members_present, 'no') members
FROM (
SELECT MIN(date) networknewdate, company_id
FROM table
WHERE activity_type = 'network.new'
GROUP BY company_id
) a
LEFT JOIN (
SELECT MIN(date) memberscountdate, company_id, 'yes' members_present
FROM table
WHERE activity_type = 'members.count'
GROUP BY company_id
) b ON (a.networknewdate = b.memberscountdate and a.company_id = b.company_id)
) r
WHERE r.members = 'yes'
GROUP BY networknewdate
The number of no records by date is a similar query. Then you need to left join those two queries together on the networknewdate so you get a table of date, yesrecords, norecords. I'm going to leave this as a cut 'n paste exercise for you. It's more than twice as long as the query I wrote ending in GROUP BY networknewdate.
Welcome to SQL that implements real world business logic! I think the take-home lesson on this question is that you're asking for a result that's actually quite hard to specify. Once you specify exactly what you want, writing a query to get it is tedious and repetitive but not hard.
Another little hint. It may make sense for you to create some views so your queries aren't so enormous.
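As a rough sketch of the final step, the 'yes' and 'no' counts can also be produced in one pass with conditional aggregation (SUM over a CASE) instead of joining two separate summary queries. This is a swapped-in technique, not the cut 'n paste approach described above. The example runs the SQL through Python's built-in sqlite3 module purely so it is self-contained; the table name `activity` is an assumption, and the rows are the sample data from the question.

```python
import sqlite3

# Sample rows copied from the question. The table name "activity" is assumed.
ROWS = [
    ("network.new", 1, "2011-10-08", 1),
    ("members.count", 3, "2011-10-08", 1),
    ("network.new", 1, "2011-10-10", 2),
    ("network.new", 1, "2011-10-11", 3),
    ("members.count", 4, "2011-10-11", 2),
]

QUERY = """
SELECT a.networknewdate,
       SUM(CASE WHEN b.company_id IS NOT NULL THEN 1 ELSE 0 END) AS yesrecords,
       SUM(CASE WHEN b.company_id IS NULL THEN 1 ELSE 0 END) AS norecords
FROM (
    SELECT MIN(date) AS networknewdate, company_id
    FROM activity
    WHERE activity_type = 'network.new'
    GROUP BY company_id
) a
LEFT JOIN (
    SELECT MIN(date) AS memberscountdate, company_id
    FROM activity
    WHERE activity_type = 'members.count'
    GROUP BY company_id
) b ON a.networknewdate = b.memberscountdate
   AND a.company_id = b.company_id
GROUP BY a.networknewdate
ORDER BY a.networknewdate
"""


def signups_per_day(rows):
    """Return (date, complete sign-ups, incomplete sign-ups) per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity ("
        "activity_type TEXT, value INTEGER, date TEXT, company_id INTEGER)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in signups_per_day(ROWS):
    print(row)
```

With the sample data, only company 1 has its first members.count on the same day as its network.new, so it is the only 'yes'; companies 2 and 3 each count as a 'no' on their sign-up day.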
So, using the same approach Ollie Jones showed me, I figured it out:
First I need a list of dates on which 'members_count' OR 'network.new' happens
SELECT date AS curr_date
FROM activity_log ld
WHERE `activity_type` in ('members_count', 'network.new')
GROUP BY date
ORDER BY date
Then I left join it with a list of the first date each company appears
SELECT MIN(date) AS new_date, company_id
FROM activity_log
WHERE activity_type = 'network.new'
GROUP BY company_id
ORDER BY date
I also left join the first time a company counts members
SELECT MIN(date) AS members_count_date, company_id
FROM `activity_log` WHERE `activity_type` = 'members_count'
GROUP BY company_id
ORDER BY date
Then I make a distinct count of new companies and of companies that count members for the first time, grouped by date. Then I have this:
SELECT DATE(FROM_UNIXTIME(ld.date)) as curr_date,
COUNT(DISTINCT(new_co)) as new_co,
COUNT(DISTINCT(complete_co)) as complete
FROM activity_log ld
LEFT JOIN (SELECT MIN(date) AS new_date, company_id AS new_co
FROM activity_log
WHERE activity_type = 'network.new'
GROUP BY company_id
ORDER BY date) nd ON (ld.date=nd.new_date)
LEFT JOIN (SELECT min(date) as members_count_date, company_id as complete_co
FROM `activity_log` WHERE `activity_type` = 'members_count'
GROUP BY company_id
ORDER BY date) mcd ON (mcd.members_count_date=ld.date)
WHERE `activity_type` in ('members_count', 'network.new')
GROUP BY DATE(FROM_UNIXTIME(ld.date))
ORDER BY ld.date
The DISTINCT keyword was crucial, because the counts were wrong without it. It is not perfect: the column I named 'new_co' should contain only incomplete registrations (that is, new registrations without members linked to the company), but the information can still be useful.
I'm making a timesheet submit/approve function and am currently working on the pending.php page, where the manager/admin can view the pending timesheets for review...
My code now is:
list($qh,$num) = dbQuery(
"SELECT start_time, end_time, uid, submitid, submitstatus, submitdate, submitapprover
FROM $TIMES_TABLE
WHERE submitstatus=1
ORDER BY submitid");
Right now it shows all the timesheet entries for that week:
example
What I really need is just one line for each week submitted. Basically, grab the first start_time and the last end_time and combine them (start - end):
(start_time - end_time | username | id | submitdate | submit status..etc)
Someone told me to use GROUP_CONCAT or something similar, but I'm unfamiliar with it.
From my pic I would want something like:
2012-12-30 - 2013-01-05 | admin | submitid#### | submitdate | status | approver
2013-01-06 - 2013-01-09 | admin | submitid#### | submitdate | status | approver
I'm pretty new to PHP/MySQL, so my apologies.
You may find that with all these columns it divides things up more than you want, for example if there are various approvers. In that case you may want to remove some of the columns from the query.
select
concat(min(start_time), ' - ', max(end_time)),
uid,
submitid,
submitstatus,
submitdate,
submitapprover
FROM
$TIMES_TABLE
WHERE
submitstatus=1
GROUP BY
uid,
submitid,
submitstatus,
submitdate,
submitapprover
ORDER BY
submitid
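To see the effect of MIN/MAX with GROUP BY on a small data set, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `times`, the reduced column list and the sample rows are assumptions made for illustration, and SQLite's `||` operator replaces MySQL's CONCAT().

```python
import sqlite3

# Hypothetical sample entries: (start_time, end_time, uid, submitid, submitstatus)
ENTRIES = [
    ("2012-12-30", "2012-12-30", "admin", 101, 1),
    ("2013-01-02", "2013-01-02", "admin", 101, 1),
    ("2013-01-05", "2013-01-05", "admin", 101, 1),
    ("2013-01-06", "2013-01-06", "admin", 102, 1),
    ("2013-01-09", "2013-01-09", "admin", 102, 1),
]


def submitted_periods(entries):
    """Collapse the entries of each submission into one 'start - end' row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE times ("
        "start_time TEXT, end_time TEXT, uid TEXT, "
        "submitid INTEGER, submitstatus INTEGER)"
    )
    conn.executemany("INSERT INTO times VALUES (?, ?, ?, ?, ?)", entries)
    rows = conn.execute(
        """
        SELECT MIN(start_time) || ' - ' || MAX(end_time) AS period,
               uid,
               submitid
        FROM times
        WHERE submitstatus = 1
        GROUP BY uid, submitid
        ORDER BY submitid
        """
    ).fetchall()
    conn.close()
    return rows


for row in submitted_periods(ENTRIES):
    print(row)
```

Each submitid ends up on a single line, which is the shape asked for in the question; every extra column added to the GROUP BY list can split those lines again if its value varies within a submission.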
I have built many timesheet applications in many languages. You can use GROUP_CONCAT, which joins strings together with commas, but I do not think that is what you need.
The catch is that you have to use it together with group by. From your db structure it might not work for you.
The rule with "group by" is that you have to select columns that are either used in the "group by" list of columns or you have to use an aggregate function on that column.
On a books table that has the following structure
books_sold (id, author_id, date, price)
you can do the following operations
select sum(price), author_id FROM books_sold group by author_id
to get the total price of the books sold per author.
select sum(price), date FROM books_sold group by date
to get the total price of the books sold per date.
select group_concat(id), date FROM books_sold group by date
to get the ids of all the books sold per date. The ids will be separated by commas.
But you cannot do a
select group_concat(id), date, author_id FROM books_sold group by date
because the author_id column is neither in the group by list nor wrapped in an aggregate function. MySQL may still run the query, but the value in the author_id column is not reliable.
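The books_sold examples above can be tried out with a small sketch like the following. It uses Python's built-in sqlite3 module rather than MySQL so that it is self-contained, and the sample rows are invented for illustration. SQLite also provides GROUP_CONCAT with a comma as the default separator, but the order of the ids inside each group is not guaranteed.

```python
import sqlite3

# Hypothetical sample rows: (id, author_id, date, price)
SALES = [
    (1, 10, "2020-01-01", 5),
    (2, 10, "2020-01-01", 7),
    (3, 11, "2020-01-02", 4),
]


def books_summaries(sales):
    """Return (total price per author, concatenated ids per date)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books_sold ("
        "id INTEGER, author_id INTEGER, date TEXT, price INTEGER)"
    )
    conn.executemany("INSERT INTO books_sold VALUES (?, ?, ?, ?)", sales)
    # author_id is in the GROUP BY list, price is wrapped in an aggregate.
    per_author = conn.execute(
        "SELECT author_id, SUM(price) FROM books_sold "
        "GROUP BY author_id ORDER BY author_id"
    ).fetchall()
    # date is in the GROUP BY list, id is wrapped in an aggregate.
    per_date = conn.execute(
        "SELECT date, GROUP_CONCAT(id) FROM books_sold "
        "GROUP BY date ORDER BY date"
    ).fetchall()
    conn.close()
    return per_author, per_date


per_author, per_date = books_summaries(SALES)
print(per_author)
print(per_date)
```

Every selected column is either grouped or aggregated, which is the rule described above.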
Now from your db structure, I do not think you can do a group_concat and still get the fields you desire. What happens if the approver is not the same guy in a week? Your approver column will not make sense. What happens if the submitid is not the same in an entire week? What happens if the submitdate is not the same on an entire week?
If those columns are always the same you can do a
select CONCAT(min(start_time), ' - ', max(end_time)), uid, submitid, submitstatus, submitdate, submitapprover FROM $TIMES_TABLE WHERE submitstatus=1 GROUP BY uid, submitid, submitstatus, submitdate, submitapprover ORDER BY submitid
I'm new to this forum and hopefully I can get some awesome help!
I got three tables
1 "companies"
ID
2 "log"
compid
datum (date)
3 "sales"
datumnow (datetime)
uppdaterad (datetime)
I want to compare log and sales, get the latest ("newest") entry, and display an ascending list of companies from table 1 with only one row per company. (That is, compare datum, datumnow and uppdaterad and display the highest date value on one row for each ID from companies.)
#RESULT
Rover - 2012-01-15
Daniel - 2012-02-01
Damien - 2012-03-05
I've struggled with this for a few days now and can't get hold of the solution.
I'd appreciate ANY help! Thanks.
You can use GREATEST() to return the most recent date from those three columns. This assumes you have another column in sales that relates to the other tables. From the structure you show above, the relationship is unclear.
SELECT
companies.ID,
GREATEST(log.datum, sales.datumnow, sales.uppdaterad) AS mostrecent
FROM
companies LEFT JOIN log ON companies.ID = log.compid
/* Assumes sales also has a compid column. Will edit if new info is posted */
LEFT JOIN sales ON companies.ID = sales.compid
WHERE log.userid='$userID' AND sales.seller='$userID'
For only one row with the max date per company, use a MAX() aggregate with a GROUP BY:
SELECT
companies.ID,
MAX(GREATEST(log.datum, sales.datumnow, sales.uppdaterad)) AS mostrecent
FROM
companies LEFT JOIN log ON companies.ID = log.compid
/* Assumes sales also has a compid column. Will edit if new info is posted */
LEFT JOIN sales ON companies.ID = sales.compid
WHERE log.userid='$userID' AND sales.seller='$userID'
GROUP BY companies.ID
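One detail worth checking with this approach: in MySQL, GREATEST() returns NULL as soon as any argument is NULL, which happens after a LEFT JOIN for companies that have rows in only one of the two tables. The sketch below shows one possible way around that by wrapping each column in COALESCE with a sentinel date. It runs on Python's built-in sqlite3 module, where the multi-argument scalar max() plays the role of GREATEST() and has the same NULL behaviour. The `compid` column on sales, the sentinel value and the sample rows are assumptions, and the userid/seller filter is left out.

```python
import sqlite3

# Assumed to be older than any real entry; stands in for "no date".
SENTINEL = "1000-01-01"

QUERY = """
SELECT c.ID,
       MAX(max(COALESCE(l.datum, :none),
               COALESCE(s.datumnow, :none),
               COALESCE(s.uppdaterad, :none))) AS mostrecent
FROM companies c
LEFT JOIN log l ON c.ID = l.compid
LEFT JOIN sales s ON c.ID = s.compid
GROUP BY c.ID
ORDER BY c.ID
"""


def most_recent_per_company(companies, log_rows, sales_rows):
    """Return (company ID, newest of datum/datumnow/uppdaterad) per company."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE companies (ID INTEGER)")
    conn.execute("CREATE TABLE log (compid INTEGER, datum TEXT)")
    conn.execute(
        "CREATE TABLE sales (compid INTEGER, datumnow TEXT, uppdaterad TEXT)"
    )
    conn.executemany("INSERT INTO companies VALUES (?)", companies)
    conn.executemany("INSERT INTO log VALUES (?, ?)", log_rows)
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales_rows)
    rows = conn.execute(QUERY, {"none": SENTINEL}).fetchall()
    conn.close()
    return rows


# Hypothetical data: company 2 has no sales row, company 3 has no log row.
COMPANIES = [(1,), (2,), (3,)]
LOG = [(1, "2012-01-10"), (1, "2012-01-15"), (2, "2012-02-01")]
SALES = [
    (1, "2012-01-12 09:00:00", "2012-01-14 09:00:00"),
    (3, "2012-03-01 08:00:00", "2012-03-05 12:00:00"),
]

for row in most_recent_per_company(COMPANIES, LOG, SALES):
    print(row)
```

A company with no rows in either table would come back with the sentinel date, so that case may need separate handling.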
I have two tables, one called episodes, and one called score. The episode table has the following columns:
id | number | title | description | type
The score table has the following columns:
id | userId | showId | score
The idea is that users will rate a show. Each time a user rates a show, a new row is created in the score table (or updated if it exists already). When I list the shows, I average all the scores for that show ID and display it next to the show name.
What I need to be able to do is sort the shows based on their average rating. I've looked at joining the tables, but haven't really figured it out.
Thanks
To order the results, use an ORDER BY clause. You can order by generated columns, such as the result of an aggregate function like AVG.
SELECT e.title, AVG(s.score) AS avg_score
FROM episodes AS e
LEFT JOIN score AS s ON e.id=s.showId
GROUP BY e.id
ORDER BY avg_score DESC;
You're right. You have to JOIN these tables, then use GROUP BY on the 'episodes' table's 'id' column. Then you'll be able to use the AVG() function on the 'score' table's 'score' column.
SELECT AVG(score.score) FROM episodes LEFT JOIN score ON score.showId = episodes.id GROUP BY episodes.id
SELECT episodes.*, AVG(score.score) as AverageRating FROM episodes
INNER JOIN score ON (episodes.id = score.showId)
GROUP BY episodes.id
ORDER BY AVG(score.score) DESC
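A quick way to check the difference between the LEFT JOIN and INNER JOIN variants is to run the query on a few rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL; the episodes table is reduced to id and title, and the sample rows are invented. With the LEFT JOIN, an episode nobody has rated still appears, with a NULL average.

```python
import sqlite3

QUERY = """
SELECT e.title, AVG(s.score) AS avg_score
FROM episodes AS e
LEFT JOIN score AS s ON e.id = s.showId
GROUP BY e.id
ORDER BY avg_score DESC
"""


def episodes_by_rating(episodes, scores):
    """Return (title, average score) sorted from highest to lowest average."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE episodes (id INTEGER, title TEXT)")
    conn.execute(
        "CREATE TABLE score ("
        "id INTEGER, userId INTEGER, showId INTEGER, score INTEGER)"
    )
    conn.executemany("INSERT INTO episodes VALUES (?, ?)", episodes)
    conn.executemany("INSERT INTO score VALUES (?, ?, ?, ?)", scores)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


# Hypothetical data: episode 3 has not been rated yet.
EPISODES = [(1, "Pilot"), (2, "Second"), (3, "Third")]
SCORES = [(1, 100, 1, 4), (2, 101, 1, 5), (3, 100, 2, 3)]

for row in episodes_by_rating(EPISODES, SCORES):
    print(row)
```

Switching to an INNER JOIN would drop the unrated episode from the list instead of showing it last.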
I am working on an auction web application. I have a table with bids, and from this table I want to select the last 10 bids per auction.
Now I know I can get the last bid by using something like:
SELECT bids.id FROM bids WHERE * GROUP BY bids.id ORDER BY bids.created
I have read that limiting the number of rows per GROUP BY group is not an easy thing to do. I have found no easy solution; if there is one, I would like to hear it.
I have come up with an alternative to tackle this problem, but I am not sure if I am doing this well.
Alternative
The first thing is creating a new table, calling this bids_history. In this table i store a string of the last items.
example:
bids_history
================================================================
auction_id bid_id bidders times
1 20,25,40 user1,user2,user1 time1,time2,time3
I have to store the names and the times too, because I have found no easy way of taking the string used in bid_id (20,25,40) and using it in a join.
This way I can just join on the auction id, and I have the latest result.
Now when a new bid is placed, these are the steps:
insert the bid into bids and get the last insert id
get the bids_history string for this auction product
explode the string
insert the new values
check if there are more than 3
implode the array, and insert the string again
This all seems to me not a very good solution.
I really don't know which way to go. Please keep in mind this is a website with a lot of bidding; it can go up to 15,000 bids per auction item. Maybe because of this amount, GROUPING and ORDERING is not a good way to go. Please correct me if I am wrong.
After the auction is over i do clean up the bids table, removing all the bids, and store them in a separate table.
Can someone please help me tackle this problem!
And if you have been, thanks for reading..
EDIT
The tables i use are:
bids
======================
id (prim_key)
aid (auction id)
uid (user id)
cbid (current bid)
created (time created)
======================
auction_products
====================
id (prim_key)
pid (product id)
closetime (time the auction closes)
What i want as the result of the query:
result
===============================================
auction_products.id bids.uid bids.created
2 6 time1
2 8 time2
2 10 time3
5 3 time1
5 4 time2
5 9 time3
7 3 time1
7 2 time2
7 1 time3
So that is per auction the latest bids, to choose by number, 3 or 10
Using user variables and control flow, I end up with this (just replace the <=3 with <=10 if you want the last ten bids):
SELECT a.*
FROM
(SELECT aid, uid, created FROM bids ORDER BY aid, created DESC) a,
(SELECT @prev:=-1, @count:=1) b
WHERE
CASE WHEN @prev<>a.aid THEN
CASE WHEN @prev:=a.aid THEN
@count:=1
END
ELSE
@count:=@count+1
END <= 3
Why do this in one query?
$sql = "SELECT id FROM auctions ORDER BY created DESC LIMIT 10";
$auctions = array();
$result = mysql_query($sql); // run the query once, not on every loop iteration
while($row = mysql_fetch_assoc($result))
$auctions[] = $row['id'];
$auctions = implode(', ', $auctions);
$sql = "SELECT id FROM bids WHERE auction_id IN ($auctions) ORDER BY created LIMIT 10";
// ...
You should obviously handle the case where, e.g. $auctions is empty, but I think this should work.
EDIT: This is wrong :-)
You will need to use a subquery:
SELECT latest.id
FROM ( SELECT bids1.*
FROM bids AS bids1 LEFT JOIN
bids AS bids2 ON bids1.created < bids2.created
AND bids1.AuctionId = bids2.AuctionId
WHERE bids2.id IS NULL) AS latest
ORDER BY latest.created DESC
LIMIT 10
So the subquery performs a left join from bids to itself, pairing each record with all records that have the same AuctionId and a created date that is after its own created date. For the most recent record there is no other record with a greater created date, so an inner join would drop it; because we use a left join, it is kept, with all the bids2 fields being null, hence the WHERE bids2.id IS NULL condition.
So the subquery has one row per auction, containing the data from the most recent bid. Then simply select the top ten using ORDER BY and LIMIT.
If your database engine doesn't support subqueries, you can use a view just as well.
Ok, this one should work:
SELECT bids1.id
FROM bids AS bids1 LEFT JOIN
bids AS bids2 ON bids1.created < bids2.created
AND bids1.AuctionId = bids2.AuctionId
GROUP BY bids1.auctionId, bids1.created
HAVING COUNT(bids2.created) < 10
So, like before, left join bids with itself so we can compare each bid with all the others. Then group first by auction (we want the last ten bids per auction) and then by created. Because the left join pairs each bid with all later bids in the same auction, counting bids2.created per group gives the number of bids placed after that bid. If this count is < 10 (the most recent bid has count == 0, so the counts run from 0 to 9), it is one of the ten most recent bids, and we want to select it.
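The same counting idea can be written as a correlated subquery, which makes the cut-off easy to parameterise. The following is only a sketch: it uses Python's built-in sqlite3 module instead of MySQL, the column names follow the bids table from the question (aid, uid, created), and the sample rows use small integers as stand-in timestamps.

```python
import sqlite3

QUERY = """
SELECT b.aid, b.uid, b.created
FROM bids AS b
WHERE (SELECT COUNT(*)
       FROM bids AS later
       WHERE later.aid = b.aid
         AND later.created > b.created) < ?
ORDER BY b.aid, b.created DESC
"""


def last_bids_per_auction(bids, n):
    """Return the n most recent bids of every auction, newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bids ("
        "id INTEGER PRIMARY KEY, aid INTEGER, uid INTEGER, "
        "cbid INTEGER, created INTEGER)"
    )
    conn.executemany("INSERT INTO bids VALUES (?, ?, ?, ?, ?)", bids)
    # A bid is kept when fewer than n bids in the same auction are newer.
    rows = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return rows


# Hypothetical bids: (id, aid, uid, cbid, created)
BIDS = [
    (1, 2, 5, 10, 1),
    (2, 2, 6, 11, 2),
    (3, 2, 8, 12, 3),
    (4, 2, 10, 13, 4),
    (5, 5, 3, 20, 1),
    (6, 5, 4, 21, 2),
]

for row in last_bids_per_auction(BIDS, 3):
    print(row)
```

If two bids in one auction share the same created value, this can return more than n rows for that auction, so a tie-breaker on id may be needed. With up to 15,000 bids per auction, an index on (aid, created) is probably worth testing.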
To select last 10 bids for a given auction, just create a normalized bids table (1 record per bid) and issue this query:
SELECT bids.id
FROM bids
WHERE auction = ?
ORDER BY
bids.created DESC
LIMIT 10
To select last 10 bids per multiple auctions, use this:
SELECT bo.*
FROM (
SELECT a.id,
COALESCE(
(
SELECT bi.created
FROM bids bi
WHERE bi.auction = a.id
ORDER BY
bi.auction DESC, bi.created DESC, bi.id DESC
LIMIT 1 OFFSET 9
), '01.01.1900'
) AS mcreated,
COALESCE(
(
SELECT bi.id
FROM bids bi
WHERE bi.auction = a.id
ORDER BY
bi.auction DESC, bi.created DESC, bi.id DESC
LIMIT 1 OFFSET 9
), 0)
AS mid
FROM auctions a
) q
JOIN bids bo
ON bo.auction >= q.id
AND bo.auction <= q.id
AND (bo.created, bo.id) >= (q.mcreated, q.mid)
Create a composite index on bids (auction, created, id) for this to work fast.
I have a MySQL table that holds price requests; it has date, first, last, and product_id fields. The product brand can be found in the product table via the product_id.
Given a date range, they want to know the total number of requests, by people and by brand, for each day in the date range, and the total for the date range. Here is the tricky part: if the same person makes a request more than once in a day, it only counts as 1 for people. But if a person makes requests for 2 different brands in 1 day, each brand gets 1 for that day. And if they make multiple requests for the same brand in a single day, that brand only gets counted once for that day.
For example, let's say on a given date John Doe made 3 price requests, for a Burberry product and 2 Swarovski products. That would only count 1 for people, 1 for Burberry, and 1 for Swarovski. But if another person made a request for Burberry, then there would be 2 Burberry for that day and 2 people for that day.
I hope this makes sense. Anyway, what is the best way to do this? I am using PHP4 and MySQL4.
Thanks!
Assuming the table definition of Machine, try
-- # Shows number of people doing a request per day
SELECT
Count(DISTINCT userId) AS NumberOfUsers,
date
FROM
priceRequestsProducts prp
GROUP BY
prp.date;
-- # Shows number of request for a given product per day
SELECT
Count(DISTINCT userId) AS NumberofUsers,
productId,
date
FROM
priceRequestsProducts prp
GROUP BY
prp.date, prp.productId;
It is not desirable to use one query. For people, you would expect one row per day; for products you would expect one row per pair (product, day). If you really want to do this, introduce a value not used as a productId (I would make productId INT UNSIGNED NOT NULL AUTO_INCREMENT) or use NULL, and use it to designate users:
SELECT
Count(DISTINCT userId) AS NumberOfUsers,
NULL AS productId,
date
FROM
priceRequestsProducts prp
GROUP BY
prp.date
UNION ALL
SELECT
Count(DISTINCT userId) AS NumberofUsers,
productId,
date
FROM
priceRequestsProducts prp
GROUP BY
prp.date, prp.productId;
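The John Doe example from the question can be checked with a small sketch along these lines. It uses Python's built-in sqlite3 module as a stand-in for MySQL. The table names `requests` and `products`, the columns `first_name` and `last_name` (renamed from first and last), the date and the second person are all assumptions, and people are identified by their concatenated name, which in MySQL would be written with CONCAT() instead of `||`.

```python
import sqlite3

PEOPLE_PER_DAY = """
SELECT date, COUNT(DISTINCT first_name || ' ' || last_name) AS people
FROM requests
GROUP BY date
ORDER BY date
"""

BRANDS_PER_DAY = """
SELECT r.date, p.brand,
       COUNT(DISTINCT r.first_name || ' ' || r.last_name) AS people
FROM requests r
JOIN products p ON p.id = r.product_id
GROUP BY r.date, p.brand
ORDER BY r.date, p.brand
"""


def daily_counts(products, requests):
    """Return (distinct people per day, distinct people per day and brand)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER, brand TEXT)")
    conn.execute(
        "CREATE TABLE requests ("
        "date TEXT, first_name TEXT, last_name TEXT, product_id INTEGER)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?)", products)
    conn.executemany("INSERT INTO requests VALUES (?, ?, ?, ?)", requests)
    people = conn.execute(PEOPLE_PER_DAY).fetchall()
    brands = conn.execute(BRANDS_PER_DAY).fetchall()
    conn.close()
    return people, brands


# One Burberry product and two Swarovski products (hypothetical ids).
PRODUCTS = [(1, "Burberry"), (2, "Swarovski"), (3, "Swarovski")]
# John Doe makes three requests; a second person requests Burberry.
REQUESTS = [
    ("2009-01-05", "John", "Doe", 1),
    ("2009-01-05", "John", "Doe", 2),
    ("2009-01-05", "John", "Doe", 3),
    ("2009-01-05", "Jane", "Roe", 1),
]

people, brands = daily_counts(PRODUCTS, REQUESTS)
print(people)
print(brands)
```

Counting distinct people within each (date, brand) group gives the "one per person per brand per day" rule; two different people with the same name would be merged, so a real user id is preferable where one exists.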
Note: I made this table, since you didn't provide your SQL DDL for your tables. Just give me some comments and I'll edit the answer until we get it right.
CREATE TABLE priceRequestsProducts (
userId INT NOT NULL,
productId INT NOT NULL,
date DATE NOT NULL,
INDEX(userId),
INDEX(productId),
PRIMARY KEY (userId, productId, date),
CONSTRAINT fk_prp_users FOREIGN KEY (userId) REFERENCES Users (id),
CONSTRAINT fk_prp_products FOREIGN KEY (productId) REFERENCES Products (id)
) ENGINE=InnoDB;
-- # Shows all dates each user has made a priceRequest for at least one product:
SELECT U.id, U.firstName, U.lastName, U.username, prp.date
FROM Users U
JOIN priceRequestsProducts AS prp ON U.id = prp.userId
GROUP BY prp.userId, prp.date;
-- # Shows number of price requests for a product on all dates
SELECT P.id, P.name, count(1) as numRequests, prp.date
FROM Products P
JOIN priceRequestsProducts prp ON prp.productId = P.id
GROUP BY prp.productId, prp.date;