I have a table of the words used in article titles. I want to find which words are used the least across the set of article titles.
Example:
Titles:
"Congressman Joey of Texas does not sign bill C1234."
"The pretty blue bird flies at night in Texas."
"Congressman Bob of Arizona is the signs bill C1234."
The table would contain the following.
Table WORDS_LIST
----------------------------------------------------
| INDEX ID | WORD | ARTICLE ID |
----------------------------------------------------
| 1 | CONGRESSMAN | 1234 |
| 2 | JOEY | 1234 |
| 3 | SIGN | 1234 |
| 4 | BILL | 1234 |
| 5 | C1234 | 1234 |
| 6 | TEXAS | 1234 |
| 7 | PRETTY | 1235 |
| 8 | BLUE | 1245 |
| 9 | BIRD | 1245 |
| 10 | FLIES | 1245 |
| 11 | NIGHT | 1245 |
| 12 | TEXAS | 1245 |
| 13 | CONGRESSMAN | 1246 |
| 14 | BOB | 1246 |
| 15 | ARIZONA | 1246 |
| 16 | SIGNS | 1246 |
| 17 | BILL | 1246 |
| 18 | C1234 | 1246 |
----------------------------------------------------
In this case, the words "pretty", "blue", "flies", and "night" would be among those used in the fewest articles.
I would appreciate any ideas on how to best create this query. So far below is what I started with. I can also write something in PHP but figured a query would be faster.
SELECT distinct a1.`word`, count(a1.`word`)
FROM mmdb.words_list a1
JOIN mmdb.words_list b1
ON a1.id = b1.id AND
upper(a1.word) = upper(b1.word)
where date(a1.`publish_date`) = '2017-06-09'
group by `word`
order by count(a1.`word`);
I don't see why a self-join is necessary. Just do something like this:
select wl.word, count(*)
from mmdb.words_list wl
where date(wl.`publish_date`) = '2017-06-09'
group by wl.word
order by count(*);
You can add a limit to get a fixed number of words. If publish_date is already a date, you should do the comparison as:
where publish_date = '2017-06-09'
If it has a time component:
where publish_date >= '2017-06-09' and publish_date < '2017-06-10'
This expression allows MySQL to use an index.
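As a rough illustration of the two points above (grouping without a self-join, and a half-open date range instead of date()), here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The publish_date values, the lowercase duplicate row, and the 'OLD' row are made up for the example, and COUNT(DISTINCT article_id) is an assumption about what "number of articles" should mean if a word can appear twice for the same article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE words_list ("
    "id INTEGER PRIMARY KEY, word TEXT, article_id INTEGER, publish_date TEXT)"
)
# A subset of the question's rows; the timestamps are hypothetical.
sample = [
    ("CONGRESSMAN", 1234, "2017-06-09 08:00:00"),
    ("TEXAS", 1234, "2017-06-09 08:00:00"),
    ("BILL", 1234, "2017-06-09 08:00:00"),
    ("BLUE", 1245, "2017-06-09 09:30:00"),
    ("TEXAS", 1245, "2017-06-09 09:30:00"),
    ("CONGRESSMAN", 1246, "2017-06-09 17:45:00"),
    ("BILL", 1246, "2017-06-09 17:45:00"),
    ("bill", 1246, "2017-06-09 17:45:00"),  # same word twice in one article
    ("OLD", 1100, "2017-06-08 23:59:59"),   # previous day, must be excluded
]
conn.executemany(
    "INSERT INTO words_list (word, article_id, publish_date) VALUES (?, ?, ?)",
    sample,
)

# Half-open range on the raw column, so an index on publish_date stays usable.
query = """
SELECT UPPER(word) AS word, COUNT(DISTINCT article_id) AS num_articles
FROM words_list
WHERE publish_date >= '2017-06-09' AND publish_date < '2017-06-10'
GROUP BY UPPER(word)
ORDER BY num_articles, word
"""
least_used = conn.execute(query).fetchall()
print(least_used)
```

If every (word, article) pair is stored only once, COUNT(*) as in the answer gives the same numbers.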
Try this. It's a bit simpler and should return the correct results:
SELECT `WORD`,
COUNT(*) as `num_articles`
FROM `WORDS_LIST`
WHERE date(`publish_date`) = '2017-06-09'
GROUP BY `WORD`
ORDER BY COUNT(*) ASC;
I have a main table (advices) and two reference tables (expert, friend)
advices
----------------------------------------
|id | advisor_id | advisor_type |
----------------------------------------
| 1 | 6 | expert |
| 2 | 6 | friend |
| 3 | 7 | expert |
| 4 | 8 | expert |
----------------------------------------
expert
----------------------------------
|id | lastname | firstname |
----------------------------------
| 6 | Polo | Marco |
| 7 | Wayne | John |
| 8 | Smith | Brad |
----------------------------------
friend
----------------------------------
|id | lastname | firstname |
----------------------------------
| 6 | Doe | John |
| 7 | Brown | Jerry |
| 8 | Goofy | Doofy |
----------------------------------
I would like to get all of the advices (some are from an expert, some are from a friend) and have the advisor's lastname and firstname included in the result set.
Each advice row is tied to one of the reference tables (expert or friend) via advisor_id and advisor_type.
So I would like to look up each row by id, with the type determining which table to query.
The result would look like this, combining lastname and firstname from the reference tables depending on whether the advisor is an expert or a friend.
advices (array)
----------------------------------------------------------------
|id | advisor_id | advisor_type | lastname | firstname |
-----------------------------------------------------------------
| 1 | 6 | expert | Polo | Marco |
| 2 | 6 | friend | Doe | John |
| 3 | 7 | expert | Wayne | John |
| 4 | 8 | expert | Smith | Brad |
-----------------------------------------------------------------
In plain, non-programming terms, I would like to write a query like this:
SELECT
advices.id, advices.advisor_id, advices.advisor_type
IF advices.advisor_type==expert THEN expert.lastname, expert.firstname
ELSE IF advices.advisor_type==friend THEN friend.lastname, friend.firstname
FROM advices, expert, friend
Obviously I know that the SELECT statement does not allow for this type of on the fly logic. But can this be done in another way?
This should work:
SELECT a.*, e.firstname, e.lastname
FROM advices AS a
INNER JOIN expert AS e ON a.advisor_id = e.id AND a.advisor_type = 'expert'
UNION
SELECT a.*, f.firstname, f.lastname
FROM advices AS a
INNER JOIN friend AS f ON a.advisor_id = f.id AND a.advisor_type = 'friend'
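One possible alternative to the UNION, sketched here with Python's built-in sqlite3 module rather than MySQL, is to LEFT JOIN both reference tables and pick whichever side matched with COALESCE. This is only an illustration built on the question's sample data. Unlike the INNER JOIN version, it also keeps advices rows whose advisor is missing, with NULL names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advices (id INTEGER PRIMARY KEY, advisor_id INTEGER, advisor_type TEXT);
CREATE TABLE expert (id INTEGER PRIMARY KEY, lastname TEXT, firstname TEXT);
CREATE TABLE friend (id INTEGER PRIMARY KEY, lastname TEXT, firstname TEXT);
INSERT INTO advices VALUES (1, 6, 'expert'), (2, 6, 'friend'),
                           (3, 7, 'expert'), (4, 8, 'expert');
INSERT INTO expert VALUES (6, 'Polo', 'Marco'), (7, 'Wayne', 'John'), (8, 'Smith', 'Brad');
INSERT INTO friend VALUES (6, 'Doe', 'John'), (7, 'Brown', 'Jerry'), (8, 'Goofy', 'Doofy');
""")

# The advisor_type test sits in each ON clause, so at most one of e / f
# matches a given advice row; COALESCE returns the side that matched.
query = """
SELECT a.id, a.advisor_id, a.advisor_type,
       COALESCE(e.lastname, f.lastname)   AS lastname,
       COALESCE(e.firstname, f.firstname) AS firstname
FROM advices AS a
LEFT JOIN expert AS e ON a.advisor_type = 'expert' AND e.id = a.advisor_id
LEFT JOIN friend AS f ON a.advisor_type = 'friend' AND f.id = a.advisor_id
ORDER BY a.id
"""
advice_rows = conn.execute(query).fetchall()
for row in advice_rows:
    print(row)
```

If the UNION form is kept instead, UNION ALL avoids the duplicate-elimination pass, since the two branches can never return the same advices row.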
I have a table whose values are concatenated for each group, and I am trying to retrieve a ranking that is computed separately for each row of the result.
UPDATE
The other tables have been added to the question.
NamesTable
NID | Name |
1 | Mu |
2 | Ni |
3 | ices |
GroupTable
GID | GName |
1 | GroupA |
2 | GroupB |
3 | GroupC |
MainTable
| NID | Ages | Group |
| 1 | 84 | 1 |
| 2 | 64 | 1 |
| 3 | 78 | 1 |
| 1 | 63 | 2 |
| 2 | 25 | 2 |
| 3 | 87 | 2 |
| 1 | 43 | 3 |
| 2 | 62 | 3 |
| 3 | 37 | 3 |
The first name corresponds to the first age in each row; I am able to pair them up using PHP and foreach statements. The problem is ranking the ages within each group: I want the names ranked separately for each row (group).
Expected results
| Names | Ages | Group | Ranking |
| Mu,Ni,ices | 84,64,78 | 1 | 1,3,2 |
| Mu,Ni,ices | 63,25,87 | 2 | 2,3,1 |
| Mu,Ni,ices | 43,62,37 | 3 | 2,1,3 |
While trying to solve this I have been using GROUP_CONCAT, and I have got as far as the query below.
SELECT
GROUP_CONCAT(Names) NAMES,
GROUP_CONCAT(Ages) AGES,
GROUP_CONCAT(Group) GROUPS,
GROUP_CONCAT( FIND_IN_SET(Ages, (
SELECT
GROUP_CONCAT( Age ORDER BY Age DESC)
FROM (
SELECT
GROUP_CONCAT(Ages ORDER BY Ages DESC ) Age
FROM
`MainTable` s
GROUP by `Group`
) s
)
)) rank
FROM
`MainTable` c
GROUP by `Group`
This actually gives me the below results.
| Names | Ages | Group | Ranking |
| 1,2,3 | 84,64,78 | 1 | 7,9,8 |
| 1,2,3 | 63,25,87 | 2 | 5,6,4 |
| 1,2,3 | 43,62,37 | 3 | 2,1,3 |
The only problem is that the ranking values run from 1 to 9 across the whole table instead of ranking each row separately. Is there any idea that could help me fix this? I would be grateful for your help. Thanks
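One way to get a ranking that restarts in every group is to count, for each row, how many rows in the same group have a larger age. The sketch below is an assumption-laden illustration rather than a drop-in fix: it uses Python's built-in sqlite3 module instead of MySQL, and it joins the ranks into a comma-separated string in Python instead of with GROUP_CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Group" is a reserved word, so it is quoted here (MySQL would use backticks).
conn.executescript("""
CREATE TABLE MainTable (NID INTEGER, Ages INTEGER, "Group" INTEGER);
INSERT INTO MainTable VALUES
    (1, 84, 1), (2, 64, 1), (3, 78, 1),
    (1, 63, 2), (2, 25, 2), (3, 87, 2),
    (1, 43, 3), (2, 62, 3), (3, 37, 3);
""")

# Rank = 1 + number of rows in the SAME group with a strictly larger age.
query = """
SELECT m."Group", m.NID, m.Ages,
       1 + (SELECT COUNT(*)
            FROM MainTable AS o
            WHERE o."Group" = m."Group" AND o.Ages > m.Ages) AS ranking
FROM MainTable AS m
ORDER BY m."Group", m.NID
"""
ranks_by_group = {}
for group, nid, age, ranking in conn.execute(query):
    ranks_by_group.setdefault(group, []).append(str(ranking))

rankings = {group: ",".join(ranks) for group, ranks in ranks_by_group.items()}
print(rankings)
```

The key difference from the query in the question is the `o."Group" = m."Group"` condition, which restricts the comparison to one group. On MySQL 8.0 and later, a window function such as RANK() OVER (PARTITION BY ... ORDER BY ... DESC) is another way to express the same idea.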
Can't seem to get this query right. Here's what I need to do.
Get by Age under 40, example return...
| NAME |--| AGE |
|------|--|-----|
| Amy | | 26 |
| John | | 22 |
| Dan | | 30 |
Find Names that are like the names returned from above and sort alphabetically...
| NAME |--| AGE |
|------|--|-----|
| Aaron| | 33 |
| Amy | | 26 |
| Jacob| | 25 |
| John | | 22 |
| Dan | | 30 |
Sort the alphabetical groups by the age values originally returned...
| NAME |--| AGE |
|------|--|-----|
| Jacob| | 25 |
| John | | 22 |-->was youngest from first query so his group goes first
| Aaron| | 33 |
| Amy | | 26 |
| Dan | | 30 |
This should let you order the results by age and then, within each group, by the names that share a first letter with that person. Replace OTHER p1 CRITERIA with whatever other criteria you are using to determine the first group:
SELECT p2.NAME, p2.AGE
FROM PEOPLE p1
JOIN PEOPLE p2
ON substring(p1.NAME,1,1) = substring(p2.NAME,1,1)
WHERE p1.AGE < 40
AND OTHER p1 CRITERIA
ORDER BY p1.AGE, p2.NAME
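To check the self-join against the sample data in the question, here is a small sketch using Python's built-in sqlite3 module rather than MySQL. The IN (...) list is a hypothetical stand-in for OTHER p1 CRITERIA, chosen so that the first group is Amy, John, and Dan as in the example. substr is used in place of substring.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PEOPLE (NAME TEXT, AGE INTEGER);
INSERT INTO PEOPLE VALUES
    ('Amy', 26), ('John', 22), ('Dan', 30), ('Aaron', 33), ('Jacob', 25);
""")

# p1 is the "seed" row that defines a group; p2 is every row sharing its
# first letter. Groups are ordered by the seed's age, members by name.
query = """
SELECT p2.NAME, p2.AGE
FROM PEOPLE p1
JOIN PEOPLE p2
  ON substr(p1.NAME, 1, 1) = substr(p2.NAME, 1, 1)
WHERE p1.AGE < 40
  AND p1.NAME IN ('Amy', 'John', 'Dan')  -- hypothetical "other p1 criteria"
ORDER BY p1.AGE, p2.NAME
"""
ordered = conn.execute(query).fetchall()
for name, age in ordered:
    print(name, age)
```

Note that if two seed rows share a first letter, every member of that group is returned once per seed, so the extra criteria need to pick one seed per letter.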
I have a database with entries similar to the following...
users:
+-----+------+----------+---------+-------+
| UID | Name | Addr | City | State |
+-----+------+----------+---------+-------+
| 1 | John | 101 Main | Austin | TX |
| 2 | John | 101 Main | Houston | TX |
| 3 | John | 101 Main | Del Rio | TX |
| 4 | John | 101 Main | Houston | TX |
+-----+------+----------+---------+-------+
verification:
+-----+---------------+--------------+
| UID | LicenseFirst3 | LicenseLast3 |
+-----+---------------+--------------+
| 1 | 554 | 122 |
| 2 | 556 | 345 |
| 3 | 555 | 382 |
| 4 | 555 | 108 |
+-----+---------------+--------------+
section_user_map:
+-----+-----------+---------------------+
| UID | SectionID | CompleteDate |
+-----+-----------+---------------------+
| 1 | 65 | 2012-05-12 05:05:15 |
| 2 | 72 | 2012-05-06 14:03:15 |
| 3 | 65 | 2012-05-09 16:13:15 |
| 4 | 72 | 2012-05-06 18:14:15 |
+-----+-----------+---------------------+
I need to be able to search for students who completed section 65 between noon on day X and noon on day Y. I also need to show the student's name, address, city, state and first and last three digits of their license number. I believe this will require both a left join and union command but it's getting a bit too complicated to formulate.
SELECT *
FROM section_user_map
JOIN users USING (UID)
JOIN verification USING (UID)
WHERE SectionID = 65
AND CompleteDate BETWEEN '2012-05-09 12:00:00'
AND '2012-05-11 12:00:00'
No UNION required. Outer join would only be required if you still want to return results for users who do not exist in one (or both) of the users or verification tables.
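The query can be tried against the question's sample rows with a sketch like the one below. It uses Python's built-in sqlite3 module rather than MySQL, stores the license fragments as text, and passes the two noon boundaries as parameters. The variable names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (UID INTEGER PRIMARY KEY, Name TEXT, Addr TEXT, City TEXT, State TEXT);
CREATE TABLE verification (UID INTEGER PRIMARY KEY, LicenseFirst3 TEXT, LicenseLast3 TEXT);
CREATE TABLE section_user_map (UID INTEGER, SectionID INTEGER, CompleteDate TEXT);
INSERT INTO users VALUES
    (1, 'John', '101 Main', 'Austin', 'TX'), (2, 'John', '101 Main', 'Houston', 'TX'),
    (3, 'John', '101 Main', 'Del Rio', 'TX'), (4, 'John', '101 Main', 'Houston', 'TX');
INSERT INTO verification VALUES
    (1, '554', '122'), (2, '556', '345'), (3, '555', '382'), (4, '555', '108');
INSERT INTO section_user_map VALUES
    (1, 65, '2012-05-12 05:05:15'), (2, 72, '2012-05-06 14:03:15'),
    (3, 65, '2012-05-09 16:13:15'), (4, 72, '2012-05-06 18:14:15');
""")

# Plain inner joins: only users present in all three tables are returned.
query = """
SELECT UID, Name, Addr, City, State, LicenseFirst3, LicenseLast3
FROM section_user_map
JOIN users USING (UID)
JOIN verification USING (UID)
WHERE SectionID = ?
  AND CompleteDate BETWEEN ? AND ?
"""
noon_x = "2012-05-09 12:00:00"  # noon on day X
noon_y = "2012-05-11 12:00:00"  # noon on day Y
students = conn.execute(query, (65, noon_x, noon_y)).fetchall()
print(students)
```

BETWEEN includes both endpoints, so a completion at exactly noon on day Y is counted. If that should be excluded, `CompleteDate >= ? AND CompleteDate < ?` expresses a half-open range.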
When I try to execute this query, my MySQL server's CPU usage goes to 100% and the page just stalls. I set up an index on (Client_Code, Date_Time, Time_Stamp, Activity_Code, Employee_Name, ID_Transaction), but it doesn't seem to help. What steps can I take next to fix this issue? Also, there is already one index on the database, if that matters. Thanks
Here is what this query does
Database info
ID_Transaction | Client_Code | Employee_Name | Date_Time | Time_Stamp | Activity_Code
1              | 00001       | Eric          | 11/15/10  | 7:30AM     | 00023
2              | 00001       | Jerry         | 11/15/10  | 8:30AM     | 00033
3              | 00002       | Amy           | 11/15/10  | 9:45AM     | 00034
4              | 00003       | Jim           | 11/15/10  | 10:30AM    | 00063
5              | 00003       | Ryan          | 11/15/10  | 12:00PM    | 00063
6              | 00003       | bill          | 11/14/10  | 1:00pm     | 00054
7              | 00004       | Jim           | 11/15/10  | 1:00pm     | 00045
8              | 00005       | Jim           | 11/15/10  | 10:00 AM   | 00045
The query takes the info above and, using the most recent entry for each Client_Code, counts the entries per employee. In this case, after processing in PHP, the result would look like this:
Jerry = 1
2 | 00001 | Jerry | 11/15/10| 8:30AM | 00033
Amy = 1
3 | 00002 | Amy | 11/15/10| 9:45AM | 00034
Ryan = 1
5 | 00003 | Ryan | 11/15/10 | 12:00PM | 00063
Jim = 2
7 | 00004 | Jim | 11/15/10 | 1:00pm | 00045
8 | 00005 | Jim | 11/15/10| 10:00 AM| 00045
$sql = "SELECT m.Employee_Name, count(m.ID_Transaction)
FROM ( SELECT DISTINCT Client_Code FROM Transaction)
md JOIN Transaction m ON
m.ID_Transaction = ( SELECT
ID_Transaction FROM Transaction mi
WHERE mi.Client_Code = md.Client_Code AND Date_Time=CURdate() AND Time_Stamp!='' AND
Activity_Code!='000001'
ORDER BY m.Employee_Name DESC, mi.Client_Code DESC, mi.Date_Time DESC,
mi.ID_Transaction DESC LIMIT 1 )
group by m.Employee_Name";
Is there a better way to write this query so it doesn't bog down my system? The query works fine with 10 database entries, but it locks my server up when the database has 300,000 entries.
Thanks
Eric
+----+--------------------+-------------+--------+------------------------+--------------+---------+----------------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------------+--------+------------------------+--------------+---------+----------------+------+----------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | [NULL] | [NULL] | [NULL] | [NULL] | 8 | 100.00 | Using temporary; Using filesort |
| 1 | PRIMARY | m | index | [NULL] | search index | 924 | [NULL] | 21 | 100.00 | Using where; Using index; Using join buffer |
| 3 | DEPENDENT SUBQUERY | mi | ref | search index,secondary | search index | 18 | md.Client_Code | 3 | 100.00 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | Transaction | index | [NULL] | secondary | 918 | [NULL] | 21 | 38.10 | Using index |
+----+--------------------+-------------+--------+------------------------+--------------+---------+----------------+------+----------+----------------------------------------------+
What about going with multiple GROUP BY columns instead of all the subqueries to simplify things? Something like:
SELECT * FROM Transaction WHERE Date_Time=CURdate() AND Time_Stamp!='' AND Activity_Code != '000001' GROUP BY Client_Code, Employee_Name
If I'm understanding your query correctly then something like this would solve the issues and prevent the need for sub queries.
You'll definitely want to do a join instead of a sub select.
Also, how many records are you viewing? Is pagination and using limit out of the question?
If you set up your initial query modified with inner/outer joins as a view and it doesn't crash, you'll be one step closer. Once the view is set up, you'll be able to use a much less complicated select statement - potentially paginated.
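As one possible reading of the "join instead of a sub select" advice, the sketch below joins the table to a derived table holding one row per client, so the inner query runs once instead of once per outer row. It is only an illustration, run with Python's built-in sqlite3 module rather than MySQL, and it rests on assumptions the question does not confirm: that the highest ID_Transaction for a client is its most recent entry, that dates are stored in ISO format, and that the date is passed in as a parameter instead of using CURDATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Transaction" is quoted because it is a keyword in SQLite.
conn.executescript("""
CREATE TABLE "Transaction" (
    ID_Transaction INTEGER PRIMARY KEY, Client_Code TEXT, Employee_Name TEXT,
    Date_Time TEXT, Time_Stamp TEXT, Activity_Code TEXT);
INSERT INTO "Transaction" VALUES
    (1, '00001', 'Eric',  '2010-11-15', '7:30AM',   '00023'),
    (2, '00001', 'Jerry', '2010-11-15', '8:30AM',   '00033'),
    (3, '00002', 'Amy',   '2010-11-15', '9:45AM',   '00034'),
    (4, '00003', 'Jim',   '2010-11-15', '10:30AM',  '00063'),
    (5, '00003', 'Ryan',  '2010-11-15', '12:00PM',  '00063'),
    (6, '00003', 'bill',  '2010-11-14', '1:00pm',   '00054'),
    (7, '00004', 'Jim',   '2010-11-15', '1:00pm',   '00045'),
    (8, '00005', 'Jim',   '2010-11-15', '10:00 AM', '00045');
""")

# Inner query: latest qualifying transaction id per client (assumes that
# larger ids are newer). Outer query: count those rows per employee.
query = """
SELECT t.Employee_Name, COUNT(*) AS clients
FROM "Transaction" AS t
JOIN (
    SELECT Client_Code, MAX(ID_Transaction) AS last_id
    FROM "Transaction"
    WHERE Date_Time = ? AND Time_Stamp != '' AND Activity_Code != '000001'
    GROUP BY Client_Code
) AS latest ON latest.last_id = t.ID_Transaction
GROUP BY t.Employee_Name
ORDER BY t.Employee_Name
"""
counts = dict(conn.execute(query, ("2010-11-15",)).fetchall())
print(counts)
```

On the sample rows this reproduces the counts listed in the question. Whether it is fast enough on 300,000 rows would still depend on the indexes available, which is worth confirming with EXPLAIN.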