Sphinx data:
+----------+-------------+-------------+
| id | car_id | filter_id |
+----------+-------------+-------------+
| 37280991 | 4261 | 46 |
| 37280992 | 4261 | 18 |
| 37281000 | 4261 | 1 |
| 37281002 | 4261 | 28 |
| 51056314 | 4277 | 18 |
| 51056320 | 4277 | 1 |
| 51056322 | 4277 | 28 |
+----------+-------------+-------------+
I have a page that shows cars and lets you apply filters. I'm trying to get Sphinx to return the cars that have both filter 1 and filter 46. If you look at the table above, you will see that just one car (4261) has both filters. The problem is that I don't know how to express this in Sphinx.
$this->cs->SetFilter('filter_id', array(1, 46)); // this doesn't work: it returns both cars (4261 and 4277), because it behaves like an "IN"
$this->cs->SetGroupBy('car_id', SPH_GROUPBY_ATTR);
$this->cs->SetFilter('filter_id', array(1));
$this->cs->SetFilter('filter_id', array(46));
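Both filters apply, and both need to match. In effect they are ANDed together.
It seems I misread the question: I missed the fact that you are using GROUP BY, and thought you were using an MVA. With one row per car/filter pair, no single row can have filter_id equal to both 1 and 46, so the two filters above would match nothing.
... so you have to be a bit more creative. Alas, you will probably need to use SphinxQL rather than the SphinxAPI, as SphinxQL has HAVING: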
SELECT id,car_id FROM index WHERE filter_id IN (1,46) GROUP BY car_id HAVING COUNT(*)>1
This only includes cars for which more than one document matches (i.e. one match for each value in the IN clause). If there can be duplicates (such as two rows with filter_id=1), you can perhaps use COUNT(DISTINCT filter_id) instead.
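The grouping logic above can be checked outside Sphinx. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the index (the table name `car_filters` and the helper `cars_with_all_filters` are made up for illustration, and SQLite is not SphinxQL, so treat it only as a demonstration of the HAVING idea):

```python
import sqlite3

# In-memory SQLite table standing in for the Sphinx index (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE car_filters (id INTEGER PRIMARY KEY, car_id INTEGER, filter_id INTEGER)"
)
conn.executemany(
    "INSERT INTO car_filters VALUES (?, ?, ?)",
    [
        (37280991, 4261, 46),
        (37280992, 4261, 18),
        (37281000, 4261, 1),
        (37281002, 4261, 28),
        (51056314, 4277, 18),
        (51056320, 4277, 1),
        (51056322, 4277, 28),
    ],
)


def cars_with_all_filters(conn, wanted):
    """Return car_ids that have every filter_id in `wanted` (hypothetical helper)."""
    wanted = sorted(set(wanted))
    placeholders = ",".join("?" * len(wanted))
    sql = (
        "SELECT car_id FROM car_filters "
        f"WHERE filter_id IN ({placeholders}) "
        "GROUP BY car_id "
        # DISTINCT guards against duplicate (car_id, filter_id) rows.
        "HAVING COUNT(DISTINCT filter_id) = ? "
        "ORDER BY car_id"
    )
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]


matching_cars = cars_with_all_filters(conn, [1, 46])
print(matching_cars)
```

Comparing the distinct count against the number of requested filters, rather than using `> 1`, keeps the query correct when more than two filters are selected.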
Hello :) I am fairly new to using INNER JOIN and still trying to comprehend its logic, which I think I am sort of beginning to understand. After reading a few different articles on the topic, I have come up with a query for finding duplicates in my table of phone numbers.
My table structure is as such:
+---------+-------+
| PhoneID | Phone |
+---------+-------+
Very simple. I created this query:
SELECT A.PhoneID, B.PhoneID FROM T_Phone A
INNER JOIN T_Phone B
ON A.Phone = B.Phone AND A.PhoneID < B.PhoneID
Which returns the ID of a phone that matches another one. I don't know how to word that properly so here is an example output:
+---------+---------+
| PhoneID | PhoneID |
+---------+---------+
| 17919 | 17969 |
| 17919 | 22206 |
| 17919 | 23837 |
| 17920 | 17970 |
| 17920 | 22203 |
| 17920 | 23834 |
| 17921 | 17971 |
| 17921 | 22225 |
| 17921 | 22465 |
| 17921 | 24011 |
| 17921 | 24047 |
| 17922 | 17972 |
| 17922 | 22198 |
| 17922 | 23879 |
| 17923 | 17973 |
| 17923 | 22199 |
| 17923 | 23880 |
+---------+---------+
You can see that IDs repeat on the left; the matching phone numbers are on the right (these are just the IDs of said numbers). What I am trying to accomplish is to update a join table based on the ID on the right. The join table structure is as such:
+----------+-----------+
| T_JoinID | T_PhoneID |
+----------+-----------+
Where T_JoinID is a larger object with a collection of those T_PhoneIDs, hence the join table. What I want to do is take a row from the original match query, and find the right side PhoneID in the join table, then update that item in the Join to be equal to the left side PhoneID. Repeating this for each row.
It's sort of a way to save space and get rid of matching numbers, I can just point the matching ones to the original and use that as a reference when I need to retrieve it.
After that I need to actually delete the original numbers that I reset the reference for but... This seems like a job for 2 or 3 different queries.
EDIT:
Sorry I know I didn't include enough detail. Here is some additional info:
My exact table structure is not the same as shown here, but I am only using the columns that I listed, so I didn't think any of the others would matter. Most of the tables have a unique ID that is auto-incremented. The phone table has carrier, type, etc. columns. I felt the additional columns were irrelevant to include, but if there is a solution that uses the auto-incremented ID of each table, let me know :) Anyway, I sort of found a solution using multiple queries, though I am still interested to learn and apply knowledge based on this question. So I have that join table that I mentioned. It might look something like this for the expected results. There is a before and an after table in one; sorry for the poor formatting.
+--------------------+---------+----------+---------+
| Join Table Results | | | |
+--------------------+---------+----------+---------+
| Before | | After | |
| Join | Table | Join | Table |
| PersonID | PhoneID | PersonID | PhoneID |
| 1 | 1 | 1 | 1 |
| 1 | 2 | 1 | 2 |
| 1 | 3 | 1 | 3 |
| 2 | 4 | 2 | 1 |
| 2 | 5 | 2 | 5 |
| 2 | 6 | 2 | 6 |
| 3 | 7 | 3 | 5 |
| 3 | 8 | 3 | 5 |
| 3 | 9 | 3 | 5 |
| 3 | 10 | 3 | 8 |
| 3 | 11 | 3 | 9 |
+--------------------+---------+----------+---------+
So you can see that in the before columns, 7, 8, and 9 would all be duplicate phone numbers in the PhoneID - PhoneID relationship table I posted originally. After the query I wanted to retrieve the duplicates using the PhoneID - PhoneID comparison and take the ones that match, to change the join table in a way that I have shown directly above. So 7, 8, 9 all turn to 5. Because 5 is the original number, and 7, 8, 9 coincidentally were duplicates of 5. So I am basically pointing all of them to 5, and then deleting what would have been 7, 8, 9 in my Phone table since they all have a new relationship to 5. Is this making sense? xD It sounds outrageous typing it out.
End Edit
How can I improve my query to accomplish this task? Is it possible using an UPDATE statement? I was also considering just looping through this output and updating each row individually but I had a hope to just use a single query to save time and code. Typing it out makes me feel a tad obnoxious but I had hope there was a solution out there!
Thank you to anyone in advance for taking your time to help me out :) I really appreciate it. If it sounds outlandish, let me know I will just use multiple queries.
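One possible two-statement approach, sketched here with Python's sqlite3 module rather than the asker's actual database: first repoint every join row to the lowest PhoneID that shares the same number, then delete the phone rows that are no longer the lowest ID for their number. The table name `T_Join`, its column names, and the sample phone numbers are assumptions made for illustration. On MySQL the same idea would more likely be written as a multi-table UPDATE ... JOIN, and a DELETE whose subquery reads the table being deleted from is rejected there unless the subquery is wrapped in a derived table, so this is a sketch of the logic and not a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T_Phone (PhoneID INTEGER PRIMARY KEY, Phone TEXT);
    CREATE TABLE T_Join (T_JoinID INTEGER, T_PhoneID INTEGER);
    -- Made-up sample data: phones 1, 3 and 5 share one number.
    INSERT INTO T_Phone VALUES
        (1, '555-0100'), (2, '555-0101'), (3, '555-0100'),
        (4, '555-0102'), (5, '555-0100');
    INSERT INTO T_Join VALUES (10, 1), (10, 2), (20, 3), (20, 4), (30, 5);
    """
)

with conn:  # both statements commit together, or not at all
    # Step 1: point each join row at the lowest PhoneID with the same number.
    conn.execute(
        """
        UPDATE T_Join
        SET T_PhoneID = (
            SELECT MIN(keep.PhoneID)
            FROM T_Phone AS dup
            JOIN T_Phone AS keep ON keep.Phone = dup.Phone
            WHERE dup.PhoneID = T_Join.T_PhoneID
        )
        WHERE T_PhoneID IN (SELECT PhoneID FROM T_Phone)
        """
    )
    # Step 2: remove phone rows that are not the lowest ID for their number.
    conn.execute(
        """
        DELETE FROM T_Phone
        WHERE PhoneID NOT IN (SELECT MIN(PhoneID) FROM T_Phone GROUP BY Phone)
        """
    )

join_rows = conn.execute(
    "SELECT T_JoinID, T_PhoneID FROM T_Join ORDER BY T_JoinID, T_PhoneID"
).fetchall()
phone_ids = [r[0] for r in conn.execute("SELECT PhoneID FROM T_Phone ORDER BY PhoneID")]
print(join_rows)
print(phone_ids)
```

Running the update before the delete matters: once the duplicates are gone from the phone table, there is nothing left to map the old IDs to.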
I have a table called facility.
Structure looks as follows:
id | name
---------
1 | Hotel
2 | Hospital
3 | medical shop
I have another table which takes data from the above table and keeps multiple values in one column. It looks like this:
id | facilities
---------------
1 | Hospital~~medical shop~~Hotel
2 | Hospital~~Hotel
3 | medical shop~~Hotel
If I want to join these two tables, what should the query look like?
I tried this, but it didn't work:
select overview.facilities as facility
from overview join facility on facility.id=overview.facilities;
you can do this with a bit of hackery
select o.facilities as facility
from overview o
join facility f on find_in_set(f.name, replace(o.facilities, '~~', ','));
I would highly recommend you change the way you are storing data. Currently it is unnormalized, and that quickly becomes a monster to deal with.
You should change your table structure to look something more like this:
+----------+--------------+
| facility |
+----------+--------------+
| id | name |
+----------+--------------+
| 1 | Hotel |
| 2 | Hospital |
| 3 | medical shop |
+----------+--------------+
+-----------+-------------+
| overview |
+-----------+-------------+
| id | facility_id |
+-----------+-------------+
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 3 |
| 7 | 1 |
+-----------+-------------+
Code Explanation:
Basically, you want to find the matching facilities in the overview. One handy function MySQL has is FIND_IN_SET(), which finds an item in a comma-separated string: find_in_set(25, '11,23,25,26') returns the position of the match (3 here, which is treated as true), so the matching row is returned, while a missing item returns 0.
You are separating your facilities with the delimiter ~~, which won't work with find_in_set, so I used REPLACE() to change the ~~ to a comma and then used that in the JOIN condition.
You can go from here in multiple ways. For instance, let's say you want the facility ids for the overview: just add GROUP_CONCAT(f.id) to the select and you have all of the ids. Note that if you do that, you need to add a GROUP BY at the end of your query to tell it how you want the results grouped.
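For the normalized layout recommended above, the packed column has to be split once into one row per overview/facility pair. Here is a minimal sketch of that migration using Python's sqlite3 module; the link table name `overview_facility` and its columns are assumptions, and it presumes every packed name exists in the facility table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE facility (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE overview (id INTEGER PRIMARY KEY, facilities TEXT);
    INSERT INTO facility VALUES (1, 'Hotel'), (2, 'Hospital'), (3, 'medical shop');
    INSERT INTO overview VALUES
        (1, 'Hospital~~medical shop~~Hotel'),
        (2, 'Hospital~~Hotel'),
        (3, 'medical shop~~Hotel');
    -- Hypothetical link table: one row per (overview, facility) pair.
    CREATE TABLE overview_facility (overview_id INTEGER, facility_id INTEGER);
    """
)

# Map facility names to ids so the packed strings can be translated.
name_to_id = {name: fid for fid, name in conn.execute("SELECT id, name FROM facility")}

pairs = []
for overview_id, packed in conn.execute("SELECT id, facilities FROM overview").fetchall():
    for name in packed.split("~~"):
        pairs.append((overview_id, name_to_id[name]))
conn.executemany("INSERT INTO overview_facility VALUES (?, ?)", pairs)

# With the link table in place, the join is an ordinary equality join.
joined = conn.execute(
    """
    SELECT l.overview_id, f.id, f.name
    FROM overview_facility AS l
    JOIN facility AS f ON f.id = l.facility_id
    ORDER BY l.overview_id, f.id
    """
).fetchall()
for row in joined:
    print(row)
```

After this one-off split, no string functions are needed in the join condition, and an index on facility_id can actually be used.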
I have the following table structure
+-------+------------+-----------+---------------+
| id |assigned_to | status | group_id |
+-------+------------+-----------+---------------+
| 1 | 1001 | 1 | 19 |
+-------+------------+-----------+---------------+
| 2 | 1001 | 2 | 19 |
+-------+------------+-----------+---------------+
| 3 | 1001 | 1 | 18 |
+-------+------------+-----------+---------------+
| 4 | 1002 | 2 | 19 |
+-------+------------+-----------+---------------+
| 5 | 1002 | 2 | 19 |
+-------+------------+-----------+---------------+
I would like to get the information in the following format
+-------+------------+-----------+
| | 1001 | 1002 |
+-------+------------+-----------+
| 1 | 1 | 0 |
+-------+------------+-----------+
| 2 | 1 | 2 |
+-------+------------+-----------+
So basically I am looking to use the assigned to field as the column names. Then the rows represent the status. So for example in the table we have two rows where user 1002 has a status of 2, therefore the sum is shown on that particular status row.
Please note that the group_id must be 19. Hence why I left out the row with id 3 on my table.
Can someone point me in the right direction? I'm sure there is a name for this type of query, but I can't for the life of me put it into words. I have tried various other queries, but none of them even come close to this.
Marc B is right: there is no way to pivot a table (i.e. convert the content of a field into columns) unless you make some assumptions, such as supposing that the values of assigned_to are more or less fixed.
On the other hand, this is the kind of problem that can be solved by a program. It is not an easy program, but it can do the job.
I recently wrote a program similar to this in Java; if you are interested, I can post the core of it here.
You might want to read this article: http://www.artfulsoftware.com/infotree/qrytip.php?id=523
It'd be something like
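SELECT
status,
-- COUNT ignores NULL, so leaving out ELSE counts only the matching rows
COUNT( CASE assigned_to WHEN 1001 THEN 1 END ) AS `1001`,
COUNT( CASE assigned_to WHEN 1002 THEN 1 END ) AS `1002`
FROM `table`
WHERE group_id = 19
-- one output row per status; WITH ROLLUP appends a totals row
GROUP BY status WITH ROLLUP;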
or something like that (I haven't tested this code).
In the article, he does it using SUM(); you'd have to do it with COUNT() and add a WHERE constraint for the group_id.
Hope this helps
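When the assigned_to values are not fixed in advance, the column list can be generated by a small program, as one of the answers above suggests. This is a minimal sketch in Python with sqlite3 standing in for MySQL; the table name `tasks` and the helper `pivot_by_assignee` are hypothetical, and the user ids are cast to int before being placed into the SQL text because column aliases cannot be bound as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, assigned_to INTEGER, "
    "status INTEGER, group_id INTEGER)"
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, ?)",
    [
        (1, 1001, 1, 19),
        (2, 1001, 2, 19),
        (3, 1001, 1, 18),
        (4, 1002, 2, 19),
        (5, 1002, 2, 19),
    ],
)


def pivot_by_assignee(conn, group_id):
    """Return (users, rows): one row per status, one count column per user."""
    users = [
        r[0]
        for r in conn.execute(
            "SELECT DISTINCT assigned_to FROM tasks WHERE group_id = ? "
            "ORDER BY assigned_to",
            (group_id,),
        )
    ]
    if not users:
        return [], []
    # One conditional COUNT per user; COUNT skips the NULL from non-matches.
    columns = ", ".join(
        f'COUNT(CASE WHEN assigned_to = {int(u)} THEN 1 END) AS "{int(u)}"'
        for u in users
    )
    sql = (
        f"SELECT status, {columns} FROM tasks "
        "WHERE group_id = ? GROUP BY status ORDER BY status"
    )
    return users, conn.execute(sql, (group_id,)).fetchall()


users, pivot = pivot_by_assignee(conn, 19)
print(users)
print(pivot)
```

The first query discovers the column headers and the second builds one conditional aggregate per header, which is the same pattern as the hand-written query but without hard-coding 1001 and 1002.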
I am trying to get a list of distinct values from the columns out of a table.
Each column can contain multiple comma delimited values. I just want to eliminate duplicate values and come up with a list of unique values.
I know how to do this with PHP by grabbing the entire table and then looping the rows and placing the unique values into a unique array.
But can the same thing be done with a MySQL query?
My table looks something like this:
| ID | VALUES |
---------------------------------------------------
| 1 | Acadian,Dart,Monarch |
| 2 | Cadillac,Dart,Lincoln,Uplander |
| 3 | Acadian,Freestar,Saturn |
| 4 | Cadillac,Uplander |
| 5 | Dart |
| 6 | Dart,Cadillac,Freestar,Lincoln,Uplander |
So my list of unique VALUES would then contain:
Acadian
Cadillac
Dart
Freestar
Lincoln
Monarch
Saturn
Uplander
Can this be done with a MySQL call alone, or is there a need for some PHP sorting as well?
Thanks
Why would you store your data like this in a database? You deliberately nullify all the extensive querying features you would want to use a database for in the first place. Instead, have a table like this:
| valueID | groupID | name |
----------------------------------
| 1 | 1 | Acadian |
| 2 | 1 | Dart |
| 3 | 1 | Monarch |
| 4 | 2 | Cadillac |
| 2 | 2 | Dart |
Notice the different valueID for Dart compared to Matthew's suggestion. That's to have same values have the same valueID (you may want to refer to these later on, and you don't want to make the same mistake of not thinking ahead again, do you?). Then make the primary key contain both the valueID and the groupID.
Then, to answer your actual question, you can retrieve all distinct values through this query:
SELECT name FROM mytable GROUP BY valueID
(GROUP BY should perform better here than a DISTINCT since it shouldn't have to do a table scan)
I would suggest selecting (and splitting) into a temp table and then making a call against that.
First, there is apparently no split function in MySQL http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/ (this is three years old so someone can comment if this has changed?)
Push all of it into a temp table and select from there.
Better would be if it is possible to break these out into a table with this structure:
| ID | VALUES |AttachedRecordID |
---------------------------------------------------------------------
| 1 | Acadian | 1 |
| 2 | Dart | 1 |
| 3 | Monarch | 1 |
| 4 | Cadillac | 2 |
| 5 | Dart | 2 |
etc.
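The "split into a temp table, then query that" suggestion can be sketched as follows. The split itself is done in Python here because the stand-in database (sqlite3) has no built-in split function either; the table names `models` and `split_values` and the column name `model_list` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (id INTEGER PRIMARY KEY, model_list TEXT)")
conn.executemany(
    "INSERT INTO models VALUES (?, ?)",
    [
        (1, "Acadian,Dart,Monarch"),
        (2, "Cadillac,Dart,Lincoln,Uplander"),
        (3, "Acadian,Freestar,Saturn"),
        (4, "Cadillac,Uplander"),
        (5, "Dart"),
        (6, "Dart,Cadillac,Freestar,Lincoln,Uplander"),
    ],
)

# Temp table holding one row per individual value.
conn.execute("CREATE TEMP TABLE split_values (record_id INTEGER, name TEXT)")
split_rows = [
    (record_id, part.strip())
    for record_id, packed in conn.execute("SELECT id, model_list FROM models").fetchall()
    for part in packed.split(",")
    if part.strip()
]
conn.executemany("INSERT INTO split_values VALUES (?, ?)", split_rows)

# Deduplication and sorting are now plain SQL.
unique_values = [
    r[0] for r in conn.execute("SELECT DISTINCT name FROM split_values ORDER BY name")
]
print(unique_values)
```

If the split rows are kept in a permanent table instead of a temp table, this doubles as the migration to the normalized structure shown above.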
In a MySQL query, when using the DISTINCT option, does ORDER BY apply after the duplicates are removed? If not, is there any way to make it do so? I think it's causing some issues with my code.
EDIT:
Here's some more information about what's causing my problem. I understand that, at first glance, this order would not be important, since I am dealing with duplicate rows. However, this is not entirely the case, since I am using an INNER JOIN to sort the rows.
Say I have a table of forum threads, containing this data:
+----+--------+-------------+
| id | userid | title |
+----+--------+-------------+
| 1 | 1 | Information |
| 2 | 1 | FAQ |
| 3 | 2 | Support |
+----+--------+-------------+
I also have a set of posts in another table like this:
+----+----------+--------+---------+
| id | threadid | userid | content |
+----+----------+--------+---------+
| 1 | 1 | 1 | Lorem |
| 2 | 1 | 2 | Ipsum |
| 3 | 2 | 2 | Test |
| 4 | 3 | 1 | Foo |
| 5 | 2 | 3 | Bar |
| 6 | 3 | 5 | Bob |
| 7 | 1 | 2 | Joe |
+----+----------+--------+---------+
I am using the following MySQL query to get all threads, then sort them based on the latest post (assuming that posts with higher ids are more recent):
SELECT t.*
FROM Threads t
INNER JOIN Posts p ON t.id = p.threadid
ORDER BY p.id DESC
This works, and generates something like this:
+----+--------+-------------+
| id | userid | title |
+----+--------+-------------+
| 1 | 1 | Information |
| 3 | 2 | Support |
| 2 | 1 | FAQ |
| 3 | 2 | Support |
| 2 | 1 | FAQ |
| 1 | 1 | Information |
| 1 | 1 | Information |
+----+--------+-------------+
However, as you can see, the information is correct, but there are duplicate rows. I'd like to remove such duplicates, so I used SELECT DISTINCT instead. However, this yielded the following:
+----+--------+-------------+
| id | userid | title |
+----+--------+-------------+
| 3 | 2 | Support |
| 2 | 1 | FAQ |
| 1 | 1 | Information |
+----+--------+-------------+
This is obviously wrong, since the "Information" thread should be on top. It would seem that using DISTINCT causes the duplicates to be removed from the top to the bottom, so only the final rows are left. This causes some issues in the sorting.
Is this the case, or am I analyzing things incorrectly?
Two things to understand:
1. Generally speaking, resultsets are unordered unless you specify an ORDER BY clause; to the extent that you specify a non-strict order (i.e. ORDER BY over non-unique columns), the order in which records that are equal under that ordering appear within the resultset is undefined.
I suspect you may be specifying such a non-strict order, which is the root of your problems: ensure that your ordering is strict by specifying ORDER BY over a set of columns that is sufficient to uniquely identify each record for which you care about its final position in the resultset.
2. DISTINCT may use GROUP BY, which causes the results to be ordered by the grouped columns; that is, SELECT DISTINCT a, b, c FROM t will produce a resultset that appears as though ORDER BY a, b, c has been applied. Again, specifying a sufficiently strict order to meet your needs will override this effect.
Following your update, bearing in mind my point #2 above, it is clear that the effect of grouping the results to achieve DISTINCT makes it impossible to then order by the non-grouped column p.id; instead, you want:
SELECT t.*
FROM Threads t INNER JOIN Posts p ON t.id = p.threadid
GROUP BY t.id
ORDER BY MAX(p.id) DESC
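To see the effect of grouping by thread and ordering by the newest post, here is a small sketch that loads the sample data from the question into an in-memory database using Python's sqlite3 module (a stand-in for MySQL; the extra `latest_post` column is added only to make the sort key visible):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Threads (id INTEGER PRIMARY KEY, userid INTEGER, title TEXT);
    CREATE TABLE Posts (id INTEGER PRIMARY KEY, threadid INTEGER,
                        userid INTEGER, content TEXT);
    INSERT INTO Threads VALUES (1, 1, 'Information'), (2, 1, 'FAQ'), (3, 2, 'Support');
    INSERT INTO Posts VALUES
        (1, 1, 1, 'Lorem'), (2, 1, 2, 'Ipsum'), (3, 2, 2, 'Test'),
        (4, 3, 1, 'Foo'),   (5, 2, 3, 'Bar'),   (6, 3, 5, 'Bob'),
        (7, 1, 2, 'Joe');
    """
)

# One row per thread, sorted by the id of its most recent post.
threads = conn.execute(
    """
    SELECT t.id, t.userid, t.title, MAX(p.id) AS latest_post
    FROM Threads t INNER JOIN Posts p ON t.id = p.threadid
    GROUP BY t.id
    ORDER BY latest_post DESC
    """
).fetchall()
for row in threads:
    print(row)
```

The "Information" thread comes first because its newest post has id 7, followed by "Support" (6) and "FAQ" (5), which is the order the question was after.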
DISTINCT tells MySQL how to build a rowset for you; ORDER BY gives a hint about how this rowset should be presented. So the answer is: DISTINCT first, ORDER BY last.
The order in which DISTINCT and ORDER BY are applied, in most cases, will not affect the final output.
However, if you also use GROUP BY, this will affect the final output. In this case, the ORDER BY is performed after the GROUP BY, which will return unexpected results (assuming you expect the sort to be performed before the grouping).