MySQL query order by "most completed fields" - php

I have a table which has 45 columns, but only a few of them are filled in so far. The table is continuously updated and added to. In my auto-complete function I want to select these records ordered by the number of completed fields, most complete first (I hope that makes sense).
One solution is to create another field (the "rank" field) and write a PHP function that selects all the records and assigns a rank to each one.
...but I was wondering whether there is a simpler way of doing this with only a single ORDER BY?

MySQL has no function to count the number of non-NULL fields on a row, as far as I know.
So the only way I can think of is to use an explicit condition:
SELECT * FROM mytable
ORDER BY (IF( column1 IS NULL, 0, 1)
+IF( column2 IS NULL, 0, 1)
...
+IF( column45 IS NULL, 0, 1)) DESC;
...it is ugly as sin, but should do the trick.
You could also devise a TRIGGER to increment an extra column "fields_filled". The trigger costs you on UPDATE, the 45 IFs hurt you on SELECT; you'll have to model what is more convenient.
Note that indexing all fields to speed up SELECT will cost you when updating (and 45 different indexes probably cost as much as a table scan on select, not to mention the extra cost when the indexed field is a VARCHAR). Run some tests, but I believe that the 45-IF solution is likely to be the best overall.
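As a rough illustration of the "count the filled columns in ORDER BY" idea, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), a hypothetical three-column table instead of 45 columns, and generates the expression from a list of column names so it does not have to be typed by hand. `(col IS NOT NULL)` evaluates to 0 or 1, which plays the same role as `IF(col IS NULL, 0, 1)`.

```python
import sqlite3

# Hypothetical stand-in for the 45-column table: three nullable columns.
COLUMNS = ["column1", "column2", "column3"]


def filled_count_expr(columns):
    """Build '(c1 IS NOT NULL) + (c2 IS NOT NULL) + ...' for use in ORDER BY.

    Column names come from a fixed list in code, never from user input,
    so interpolating them into the SQL text is safe here.
    """
    return " + ".join(f"({name} IS NOT NULL)" for name in columns)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT, column3 TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, "a", None, None),   # 1 field filled
        (2, "a", "b", "c"),     # 3 fields filled
        (3, None, "b", "c"),    # 2 fields filled
    ],
)

# Most complete rows first; id breaks ties so the order is stable.
query = f"SELECT id FROM mytable ORDER BY {filled_count_expr(COLUMNS)} DESC, id"
ordered_ids = [row[0] for row in conn.execute(query)]
print(ordered_ids)
```

Generating the expression keeps the query maintainable when columns are added, at the price of building SQL text in application code.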
UPDATE:
If you can rework your table structure to normalize it somewhat, you could put the fields in a my_values table. Then you would have a "header table" (maybe with only a unique ID) and a "data table". Empty fields would not exist at all, and then you could sort by how many filled fields there are by using an outer JOIN (a LEFT JOIN from the header table), counting the filled fields with COUNT(). This would also greatly speed up UPDATE operations, and would allow you to efficiently employ indexes.
EXAMPLE (from table setup to two normalized tables setup):
Let us say we have a set of Customer records. We will have a short subset of "mandatory" data such as ID, username, password, email, etc.; then we will have a maybe much larger subset of "optional" data such as nickname, avatar, date of birth, and so on. As a first step let us assume that all these data are varchar (this, at first sight, looks like a limitation when compared to the single table solution where each column may have its own datatype).
So we have a table like,
ID username ....
1 jdoe etc.
2 jqaverage etc.
3 jtkilroy etc.
Then we have the optional-data table. Here John Doe has filled all fields, Joe Q. Average only two, and Kilroy none (even if he was here).
id var val
1 name John
1 born Stratford-upon-Avon
1 when 11-07-1974
2 name Joe Quentin
2 when 09-04-1962
In order to reproduce the "single table" output in MySQL we have to create a quite complex VIEW with lots of LEFT JOINs. This view will nonetheless be very fast if we have an index based on (id, var) (even better if we use a numeric constant or a SET instead of a varchar for the datatype of var):
CREATE OR REPLACE VIEW usertable AS SELECT users.*,
names.val AS name -- (1)
FROM users
LEFT JOIN userdata AS names ON ( users.id = names.id AND names.var = 'name') -- (2)
;
Each field in our logical model, e.g., "name", will be contained in a tuple ( id, 'name', value ) in the optional data table.
And it will yield a line of the form <FIELDNAME>s.val AS <FIELDNAME> in the section (1) of the above query, referring to a line of the form LEFT JOIN userdata AS <FIELDNAME>s ON ( users.id = <FIELDNAME>s.id AND <FIELDNAME>s.var = '<FIELDNAME>') in section (2). So we can construct the query dynamically by concatenating the first textline of the above query with a dynamic Section 1, the text 'FROM users ' and a dynamically-built Section 2.
Once we do this, SELECTs on the view are exactly identical to before -- but now they fetch data from two normalized tables via JOINs.
EXPLAIN SELECT * FROM usertable;
will tell us that adding columns to this setup does not appreciably slow down operations, i.e., this solution scales reasonably well.
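The dynamic construction of the view described above can be sketched roughly as follows. This is only an illustration: the helper name `build_view_sql` and the field list are hypothetical, the field names are assumed to come from a trusted list in code (they are interpolated into the SQL text), and SQLite is used in place of MySQL just to check that the generated statement works. SQLite has no CREATE OR REPLACE VIEW, so that prefix is swapped for CREATE VIEW before running.

```python
import sqlite3


def build_view_sql(fields, view="usertable", header="users", data="userdata"):
    """Assemble the CREATE VIEW statement from a list of optional field names."""
    # Section (1): one "<field>s.val AS <field>" column per optional field.
    select_cols = [f"{header}.*"] + [f"{f}s.val AS {f}" for f in fields]
    # Section (2): one LEFT JOIN on the data table per optional field.
    joins = [
        f"LEFT JOIN {data} AS {f}s "
        f"ON ({header}.id = {f}s.id AND {f}s.var = '{f}')"
        for f in fields
    ]
    return (
        f"CREATE OR REPLACE VIEW {view} AS SELECT "
        + ", ".join(select_cols)
        + f" FROM {header} "
        + " ".join(joins)
    )


view_sql = build_view_sql(["name", "born"])

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE userdata (id INTEGER, var TEXT, val TEXT, UNIQUE (id, var));
INSERT INTO users VALUES (1, 'jdoe'), (2, 'jqaverage'), (3, 'jtkilroy');
INSERT INTO userdata VALUES
    (1, 'name', 'John'),
    (1, 'born', 'Stratford-upon-Avon'),
    (2, 'name', 'Joe Quentin');
""")
# SQLite-only adjustment: it does not understand OR REPLACE for views.
conn.execute(view_sql.replace("CREATE OR REPLACE VIEW", "CREATE VIEW"))

view_rows = conn.execute(
    "SELECT id, username, name, born FROM usertable ORDER BY id"
).fetchall()
for row in view_rows:
    print(row)
```

Users without a given optional field simply get NULL in that column, which is what makes the view look like the original wide table.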
INSERTs will have to be modified (we only insert mandatory data, and only in the first table) and UPDATEs as well: we either UPDATE the mandatory data table, or a single row of the optional data table. But if the target row isn't there, then it must be INSERTed.
So we have to replace
UPDATE usertable SET name = 'John Doe', born = 'New York' WHERE id = 1;
with an 'upsert', in this case
INSERT INTO userdata VALUES
( 1, 'name', 'John Doe' ),
( 1, 'born', 'New York' )
ON DUPLICATE KEY UPDATE val = VALUES(val);
(We need a UNIQUE INDEX on userdata(id, var) for ON DUPLICATE KEY to work).
Depending on row size and disk issues, this change might yield an appreciable performance gain.
Note that if this modification is not performed, the existing queries will not yield errors - they will silently fail.
Here for example we modify the names of two users; one does have a name on record, the other has NULL. The first is modified, the second is not.
mysql> SELECT * FROM usertable;
+------+-----------+-------------+------+------+
| id | username | name | born | age |
+------+-----------+-------------+------+------+
| 1 | jdoe | John Doe | NULL | NULL |
| 2 | jqaverage | NULL | NULL | NULL |
| 3 | jtkilroy | NULL | NULL | NULL |
+------+-----------+-------------+------+------+
3 rows in set (0.00 sec)
mysql> UPDATE usertable SET name = 'John Doe II' WHERE username = 'jdoe';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> UPDATE usertable SET name = 'James T. Kilroy' WHERE username = 'jtkilroy';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0 Changed: 0 Warnings: 0
mysql> select * from usertable;
+------+-----------+-------------+------+------+
| id | username | name | born | age |
+------+-----------+-------------+------+------+
| 1 | jdoe | John Doe II | NULL | NULL |
| 2 | jqaverage | NULL | NULL | NULL |
| 3 | jtkilroy | NULL | NULL | NULL |
+------+-----------+-------------+------+------+
3 rows in set (0.00 sec)
To know the rank of each row, for those users that do have a rank, we simply retrieve the count of userdata rows per id:
SELECT id, COUNT(*) AS `rank` FROM userdata GROUP BY id
Now to extract rows in "filled status" order, we do:
SELECT usertable.* FROM usertable
LEFT JOIN ( SELECT id, COUNT(*) AS `rank` FROM userdata GROUP BY id ) AS ranking
ON (usertable.id = ranking.id)
ORDER BY `rank` DESC, id;
The LEFT JOIN ensures that rankless individuals get retrieved too, and the additional ordering by id ensures that people with identical rank always come out in the same order.
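The ranking query above can be tried out with a small sketch like the one below. It is written against SQLite through Python's sqlite3 module rather than MySQL, joins the `users` header table directly instead of the view, and names the count column `filled` (a hypothetical name) with COALESCE so that users without any optional data get 0 instead of NULL. The sample rows mirror the example data used earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE userdata (id INTEGER, var TEXT, val TEXT, UNIQUE (id, var));
INSERT INTO users VALUES (1, 'jdoe'), (2, 'jqaverage'), (3, 'jtkilroy');
INSERT INTO userdata VALUES
    (1, 'name', 'John'),
    (1, 'born', 'Stratford-upon-Avon'),
    (1, 'when', '11-07-1974'),
    (2, 'name', 'Joe Quentin'),
    (2, 'when', '09-04-1962');
""")

# Count optional rows per user, then LEFT JOIN so users with no optional
# data are still returned (with a count of 0 after COALESCE).
ranking_sql = """
SELECT users.id, users.username, COALESCE(ranking.filled, 0) AS filled
FROM users
LEFT JOIN (SELECT id, COUNT(*) AS filled FROM userdata GROUP BY id) AS ranking
       ON users.id = ranking.id
ORDER BY filled DESC, users.id
"""
ranked = conn.execute(ranking_sql).fetchall()
for row in ranked:
    print(row)
```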

Related

Displaying MySQL rows on a JOIN query even when records do not exist in another table

I have an PHP/MySQL application which is connected to a database featuring 2 tables called 'displays' and 'display_substances'. The structure of these tables is as follows:
mysql> DESCRIBE displays;
+----------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+----------------------+------+-----+---------+----------------+
| id | smallint(5) unsigned | NO | PRI | NULL | auto_increment |
| label | varchar(255) | NO | | NULL | |
+----------+----------------------+------+-----+---------+----------------+
mysql> DESCRIBE display_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| display_id | smallint(5) unsigned | NO | MUL | NULL | |
| substance_id | mediumint(8) unsigned | NO | MUL | NULL | |
| value | text | NO | | NULL | |
+--------------+-----------------------+------+-----+---------+----------------+
There is also a 'substances' table and the foreign key display_substances.substance_id is associated with substances.id.
The 'displays' table contains exactly 400 rows and the 'display_substances' approx 1.5 million rows.
What I'm trying to do is output all 400 'displays.label' into a HTML table, and then in a second column show the 'display_substances.value'. Since the page only shows 1 substance at a time it also needs to be based on the 'display_substances.substance_id'.
The issue I'm having is that records only exist in 'display_substances' when we have data available for the appropriate 'displays'. However, the output has to show all records from 'displays' and then put the text "Not Listed" next to anything where there is no corresponding record in 'display_substances'.
I've done the following - which gives me the output I want - but is flawed (see "The Problem" section below).
1. Select all of the records in the "displays" table: SELECT label FROM displays ORDER BY label ASC
2. Select all display_substances.display_id for the substance currently being shown: SELECT display_id FROM display_substances WHERE substance_id = 1 (assuming 1 is the current substance ID). Store the IDs in an array called $display_substances
3. Loop through (1) and use in_array() to see if (2) exists:
foreach ($displays as $display) { // step (1)
$display_substances = // See step (2)
if (in_array($display['id'], $display_substances)) { // step (3)
$display_value = // See step (4)
} else {
$display_value = 'Not listed';
}
$output[] = ['display_label' => $display['label'], 'display_value' => $display_value]; // See step (5)
}
4. If the in_array() condition is true then I make a query to select the corresponding row of "display_substances": SELECT value FROM display_substances WHERE display_id = $display['id'] AND substance_id = 1
5. The $output variable buffers all the data and then it gets output into a HTML table later. The output I get is exactly as I want.
The problem
Although the output is correct I want to do this all as 1 query (if possible) because I need to add features to search by either displays.label or display_substances.value - or a combination of both. The first part of this is fairly trivial because I can amend the query in (1) to:
SELECT label FROM displays WHERE label LIKE '% Foo %' ORDER BY label ASC
However, this won't make display_substances.value searchable because by the time we get to step (3) we're dealing with a single row of display_substances not the whole table. I can't see how to write it differently though since we need to know which records exist in that table for the loaded substance.
I have also written the following query - but this will not work because it misses anything that's "Not Listed":
SELECT displays.label, display_substances.`value` FROM displays JOIN display_substances ON displays.id = display_substances.display_id WHERE display_substances.substance_id = 1
I have read How do I get the records from the left side of a join that do not exist in the joined table? but that didn't help.
For further clarification
Let's say there are 120 rows in display_substances that correspond to substance ID 1 (WHERE display_substances.substance_id = 1). The output of the query should always have 400 rows. In this example, 120 should have display_substances.value next to them, and 280 should have the text "Not Listed".
You need a left join & a group_concat to get all records on the left table along with group by.
But keep in mind that group_concat has a limit so you might not get all associated records, as it's usually used for small fields but since you have a 'text' field for your value there's a high probability you'd hit the limit
Anyway here's the query
SELECT d.*, GROUP_CONCAT(ds.value) `substances`
FROM displays `d`
LEFT JOIN display_substances `ds` ON `d`.`id` = `ds`.`display_id`
GROUP BY `d`.`id`
Something like this might work then if I understand correctly
SELECT d.*, IFNULL((SELECT GROUP_CONCAT(value) FROM display_substances `ds` WHERE `ds`.`display_id` = `d`.`id` GROUP BY `ds`.`display_id`), 'Not Listed') `substances`
FROM displays `d`
You can update the where & add AND substance_id = 1
My understanding is that you want the following:
Your page limits the results by substance ID
You want one substance per row
If there are displays with no substances, they should still show in the page with "Not Listed" as the substance value
I believe this should work for you:
SELECT
d.id AS display_id,
d.label AS display_label,
IFNULL(ds.value, 'Not Listed') AS substance_value
FROM displays AS d
LEFT JOIN display_substances AS ds ON (ds.display_id = d.id)
WHERE ds.substance_id = 1 OR ds.substance_id IS NULL;
I've realized that Ramy has about 98% of the solution.
FWIW this problem is just a variation on one that occurs all the time.
You will find other answers on SO when you search for 'left outer join with where clause' -- that address the problem. One example is this question.
Ultimately, you have a many to many resolution table (display_substances) that resolves the many to many relationship between substances and displays. You are just looking for an outer join from one of the 2 parent tables, but also requiring that you filter the results by a specific substance.
SELECT
d.id AS display_id,
d.label AS display_label,
IFNULL(ds.value, 'Not Listed') AS substance_value
FROM displays AS d
LEFT JOIN display_substances AS ds ON (ds.display_id = d.id AND ds.substance_id = 1);
Without the IFNULL() wrapper, this query returns NULL in the value column for those display rows where there is no corresponding display_substances row; IFNULL(), as demonstrated by ahmad, turns that NULL into 'Not Listed'. Since you are using PHP to go through the result set, you could just as easily handle that in the procedural loop you'll use to fetch the results.
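To see why the position of the substance filter matters, here is a small sketch comparing the two variants. It assumes a cut-down version of the schema (four displays instead of 400) and runs on SQLite via Python's sqlite3 module rather than MySQL. Display 2 only has a value for a different substance, which is the case where the two queries disagree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE displays (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE display_substances (
    id INTEGER PRIMARY KEY,
    display_id INTEGER NOT NULL,
    substance_id INTEGER NOT NULL,
    value TEXT NOT NULL
);
INSERT INTO displays VALUES
    (1, 'Display 1'), (2, 'Display 2'), (3, 'Display 3'), (4, 'Display 4');
INSERT INTO display_substances VALUES
    (10, 1, 1, 'Foo'),
    (11, 2, 2, 'Other substance'),  -- display 2 has data, but not for substance 1
    (12, 3, 1, 'Bar');
""")

# Filter inside ON: every display survives, unmatched ones get 'Not Listed'.
filter_in_on = conn.execute(
    """
    SELECT d.label, IFNULL(ds.value, 'Not Listed')
    FROM displays AS d
    LEFT JOIN display_substances AS ds
           ON ds.display_id = d.id AND ds.substance_id = ?
    ORDER BY d.id
    """,
    (1,),
).fetchall()

# Filter inside WHERE: display 2 joins to its substance-2 row, which then
# fails both conditions, so the whole display disappears from the output.
filter_in_where = conn.execute(
    """
    SELECT d.label, IFNULL(ds.value, 'Not Listed')
    FROM displays AS d
    LEFT JOIN display_substances AS ds ON ds.display_id = d.id
    WHERE ds.substance_id = ? OR ds.substance_id IS NULL
    ORDER BY d.id
    """,
    (1,),
).fetchall()

print(len(filter_in_on), len(filter_in_where))
```

A search on either column could then be added as an ordinary WHERE condition on top of the first query, since the join itself no longer discards rows.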
I'm posting what I've used as a solution. I don't think it's possible to do this in MySQL as one query. Several people answered but none of the answers worked.
For even further clarification - although obvious from the question - the output in the application is a table with 2 columns. The first of these columns should include all 400 rows from displays:
displays.label | display_substances.value
------------------|--------------------------
Display 1
------------------|--------------------------
...
------------------|--------------------------
Display 400
------------------|--------------------------
This is fairly trivial since at this point it's just SELECT * FROM displays.
The challenge begins when we want to populate the second column of the table with display_substances.value. The data for a given substance (assume substance ID is 1) might look like this:
id | display_id | substance_id | value
-----|----------------|-----------------|-------------
206 | 1 | 1 | Foo
-----|----------------|-----------------|-------------
361 | 3 | 1 | Bar
-----|----------------|-----------------|-------------
555 | 5 | 1 | Baz
-----|----------------|-----------------|-------------
The problem: In this case we only have 3 records for substance ID 1. So if we do a JOIN query, it will only return 3 rows. But the table we are displaying in the application needs to show all 400 rows from displays and put the text "Not listed" on any row where there is no corresponding row in display_substances. So in the example above, when it encounters display_id 2, 4, 6...400 it should say "Not listed" (because we only have data for three display_substances.display_id [1,3,5] for substance ID 1).
Both columns also need to be searchable.
My solution
I don't think it's possible to do this in MySQL so I resorted to using PHP. The logic is now as follows:
1. If the user is doing a search on column 2 (display_substances.value): SELECT display_id FROM display_substances WHERE value LIKE '% Search term column 2 %'. Store this as an array, $ds.
2. Select all 400 records from displays. If the user is performing a search on column 1 (displays.label) then that must form part of the query: SELECT * FROM displays WHERE label LIKE '% Search term column 1 %'. Critically - if the $ds array from step (1) is not empty then the following must become part of the query: WHERE displays.id IN (2, 4, 6...400). Store this as an array, $displays
3. Get all of the display_ids associated with the substance being viewed: SELECT display_id FROM display_substances WHERE substance_id = 1
4. Do a loop as per point (3) of the original question.
The result is that the page loads in <2 seconds, each column is searchable.
The SQL queries given in answers took - at best - around 15-20 seconds to execute and never gave all 400 rows.
If anyone can improve on this or has a pure SQL solution please post.

How to track a secondary index id

Have a table that will be shared by multiple users. The basic table structure will be:
unique_id | user_id | users_index_id | data_1 | data_2 etc etc
With the id fields being type int and unique_id being a primary key with auto increment.
The data will be something like:
unique_id | user_id | users_index_id
1 | 234 | 1
2 | 234 | 2
3 | 234 | 3
4 | 234 | 4
5 | 732 | 1
6 | 732 | 2
7 | 234 | 5
8 | 732 | 3
How do I keep track of 'users_index_id' so that it 'auto increments' specifically for a user_id ?
Any help would be greatly appreciated. As I've searched for an answer but am not sure I'm using the correct terminology to find what I need.
The only way to do this consistently is by using a "before insert" and "before update" trigger. MySQL does not directly support this syntax. You could wrap all changes to the table in a stored procedure and put the logic there, or use very careful logic when doing an insert:
insert into table(user_id, users_index_id)
select param_user_id, count(*) + 1
from table
where user_id = param_user_id;
However, this won't keep things in order if you delete rows or do certain updates.
You might find it more convenient to calculate the users_index_id when you query rather than in the database. You can do this using either subqueries (which are probably ok with the right indexes on the table) or using variables (which might be faster but can't be put into a view).
If you have an index on table(user_id, unique_id), then the following query should work pretty well:
select t.*,
(select count(*) from table t2 where t2.user_id = t.user_id and t2.unique_id <= t.unique_id
) as users_index_id
from table t;
You will need the index for non-abysmal performance.
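The query-time calculation can be checked with a small sketch such as the following. It assumes a hypothetical table name `entries` (the answer uses the placeholder `table`), loads the sample rows from the question, and runs on SQLite through Python's sqlite3 module instead of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (unique_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL)"
)
# The composite index lets the correlated subquery count within one user
# without scanning the whole table.
conn.execute("CREATE INDEX idx_user_unique ON entries (user_id, unique_id)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [(1, 234), (2, 234), (3, 234), (4, 234), (5, 732), (6, 732), (7, 234), (8, 732)],
)

# For each row, count how many rows of the same user have an id up to and
# including this one: that count is the per-user running index.
rows = conn.execute(
    """
    SELECT t.unique_id,
           t.user_id,
           (SELECT COUNT(*)
              FROM entries t2
             WHERE t2.user_id = t.user_id
               AND t2.unique_id <= t.unique_id) AS users_index_id
      FROM entries t
     ORDER BY t.unique_id
    """
).fetchall()
for row in rows:
    print(row)
```

Because the index is derived on every read, deleting a row renumbers the later rows of that user, which may or may not be what you want.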
You need to find the MAX(users_index_id) and increment it by one. To avoid having to manually lock the table to ensure a unique key you will want to perform the SELECT within your INSERT statement. However, MySQL does not allow you to reference the target table when performing an INSERT or UPDATE statement unless it's wrapped in a subquery:
INSERT INTO users (user_id, users_index_id) VALUES (234, (SELECT IFNULL(id, 0) + 1 FROM (SELECT MAX(users_index_id) id FROM users WHERE user_id = 234) dt))
Query without subselect (thanks Gordon Linoff):
INSERT INTO users (user_id, users_index_id) SELECT 234, IFNULL((SELECT MAX(users_index_id) id FROM users WHERE user_id = 234), 0) + 1;
http://sqlfiddle.com/#!2/eaea9a/1/0
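A minimal sketch of the MAX + 1 insert, wrapped in a helper so the user id is passed as a bound parameter instead of being written into the statement. The helper name `insert_for_user` is hypothetical, and the UNIQUE constraint on (user_id, users_index_id) is an added assumption: it is meant as a safety net so that two concurrent inserts computing the same number cannot both succeed. The example runs on SQLite via Python's sqlite3 module rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        unique_id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL,
        users_index_id INTEGER NOT NULL,
        UNIQUE (user_id, users_index_id)  -- assumption: guards against duplicates
    )
    """
)


def insert_for_user(conn, user_id):
    """Insert a row whose users_index_id is the user's current maximum plus one."""
    conn.execute(
        "INSERT INTO users (user_id, users_index_id) "
        "SELECT ?, IFNULL((SELECT MAX(users_index_id) FROM users "
        "WHERE user_id = ?), 0) + 1",
        (user_id, user_id),
    )


for uid in (234, 234, 732, 234, 732):
    insert_for_user(conn, uid)

inserted = conn.execute(
    "SELECT user_id, users_index_id FROM users ORDER BY unique_id"
).fetchall()
print(inserted)
```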

How to get values of other fields if a row is having duplicate entries in sql

I have a table like this:
ID build1 build2 test status
1 John ram test1 pass
2 john shyam test2 fail
3 tom ram test1 fail
The problem I am facing is that on one of my webpages, only the values from the column "build1" are available to me. In the table there are 2 entries corresponding to "John", so even if the user selects the other "John", the page shows the values from the same row. In the drop-down list on my webpage the user can see 2 "John" entries, but since the query is made using the "John" condition, on both occasions it shows the results from the first row only.
Try this:
SELECT t1.*
FROM Table1 t1
WHERE t1.build1 NOT IN(SELECT t2.build1
FROM table1 t2
GROUP BY t2.build1
HAVING COUNT(t2.build1) > 1);
SQL Fiddle Demo
This will give you only:
| ID | BUILD1 | BUILD2 | TEST | STATUS |
-----------------------------------------
| 3 | tom | ram | test1 | fail |
Since, it is the only row that has no duplicate build1.
If I'm understanding your question correctly, given a web page with 2 johns available to click on, how can you get each result accordingly? Unfortunately, there is no way of doing this with just SQL.
In your PHP code, if you can pass a parameter to your SQL code with either the ID or a counter/row number, then you could query the database to return a corresponding unique record.
Good luck.
Your build1 column is not a unique or primary key, so the query picks every row matching your condition. You should use a primary key or unique key to find the result. In your select drop-down, each option value should be the unique/primary key, so that when you select a particular "John" you get the result for that John.
select * from table_name where id=params[:id] ;
If you post some more information, it will be easier to write better code for you.
select * from yourtable where build1 = 'john' limit 1;

MySQL: Update row where the first available column is empty

I have a table that I use to keep track of some associations between users and various other aspects of their website.
I need to be able to get the first available row and update one or two of its columns ... the criterion is whether or not the user_id column has been used.
id | tag_id | user_id | product_id
If a row has a tag available where there is no user_id assigned, I want to be able to use and update that row for the latest purchased product.
1 | 100001 | 29 | 66
2 | 100002 | 0 | 0
3 | 100003 | 0 | 0
So as you can see, the second row would be the first eligible candidate.
I'm just not sure what the SQL needs to be in order to make that happen
UPDATE yourTablename SET user_id = 'your value for userid',
product_id = 'your value for productid'
WHERE id = (SELECT id FROM (SELECT MIN(id) AS id FROM yourTablename WHERE user_id = '0') AS first_free);
The alternative methods already posted are efficient too; if your table is sorted by id, this is enough:
UPDATE yourTablename SET user_id = 'your value for userid',
product_id = 'your value for productid' WHERE user_id = '0' LIMIT 1;
If I understand you correctly you want to update the first available empty (not NULL but empty) user_id row. How's this?
UPDATE users
SET user_id = 'user_value_here'
WHERE user_id=''
LIMIT 1
If your index is sorted ASC, the query below should find the first result in order.
See the fiddle.
UPDATE table SET user_id = 1 WHERE user_id IS NULL LIMIT 1
You can replace IS NULL with the condition for an empty user_id.
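The "claim the first free row" idea from these answers can be sketched as below. The table name `tags` and the helper `claim_first_free_tag` are hypothetical, a user_id of 0 is taken to mean "unassigned" as in the question's sample data, and the code runs on SQLite via Python's sqlite3 module. SQLite accepts the plain MIN(id) subquery on the table being updated; MySQL would need that subquery wrapped in a derived table, or the ORDER BY ... LIMIT 1 form instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags ("
    "id INTEGER PRIMARY KEY, tag_id INTEGER, user_id INTEGER, product_id INTEGER)"
)
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?, ?)",
    [(1, 100001, 29, 66), (2, 100002, 0, 0), (3, 100003, 0, 0)],
)


def claim_first_free_tag(conn, user_id, product_id):
    """Assign the lowest-id unassigned row to the user; return rows changed."""
    cur = conn.execute(
        "UPDATE tags SET user_id = ?, product_id = ? "
        "WHERE id = (SELECT MIN(id) FROM tags WHERE user_id = 0)",
        (user_id, product_id),
    )
    # rowcount is 0 when no free row exists, because MIN() then yields NULL
    # and 'id = NULL' matches nothing.
    return cur.rowcount


claimed = claim_first_free_tag(conn, 31, 70)
tags_after = conn.execute("SELECT * FROM tags ORDER BY id").fetchall()
print(claimed, tags_after)
```

Checking the returned row count lets the caller detect that no free tag was left instead of silently doing nothing.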

Retrieving "likes" tied to users from a database

I'm new to database structure. I'm trying to create an app that allows users to like certain entries, but I want to be able to tie likes to users so that I can change the visuals before/after the like action.
I think from research that I should have an 'entries' and 'users' table and then have a 'likes' table that ties the two to each other.
The only thing I'm unsure of is, when getting and displaying the contents... how would I write the queries? If I query for all the entries I need, do I then go back and individually query each to see if it has a like tied to it for the current user? That seems like it might be a costly operation. Is there a more efficient way?
Hope that makes sense,
Thanks.
I think you have the right database design in mind. As far as queries are concerned, assume tables as such:
Users
ID | Name
1 | Bob
2 | Sally
Entries
ID | Name
1 | Red
2 | Blue
3 | Yellow
Likes
UserID | EntryID
1 | 1
1 | 2
2 | 2
2 | 3
So we can say Bob likes Red and Blue while Sally likes Blue and Yellow. So a query to retrieve all entries, plus an indicator of what Bob likes would be:
SELECT
e.ID,
e.Name,
l.UserID
FROM Entries e LEFT JOIN Likes l ON l.EntryID = e.ID
AND l.UserID = 1 -- Bob's User ID
ORDER BY e.Name
This would return
ID | Name | UserID
2 | Blue | 1
1 | Red | 1
3 | Yellow | NULL
The UserID column indicates if Bob likes the entry or not - a NULL is No and a value is Yes.
Assuming you have a table Entries with a column entity_id (and whatever else you store about the entity) and a second table UserLikes that contains the columns user_id and entity_id, you would do the following:
SELECT Entries.col1, Entries.col2 . . ., UserLikes.user_id
FROM Entries LEFT OUTER JOIN UserLikes ON
Entries.entity_id = UserLikes.entity_id
AND UserLikes.user_id = :user_id
WHERE Entries.col_whatever = :whatever
In this example, Entries.col1, Entries.col2 . . . is the list of columns you want to get back about the Entries. The :user_id is a parameter that contains the id of the user you're currently trying to display Entries for. It belongs in the ON clause rather than the WHERE clause, so that entries the user has not liked are still returned. The last line stands in for whatever restrictions you want to put on the Entries that are returned.
This query will give you a row for each Entry you searched for. You can check the value of the returned column user_id. If it's NULL then the entry was not liked by the user; if it contains the user's id, it was liked by the user.
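Here is a minimal sketch of fetching all entries together with a per-user "liked" flag in one query. It reuses the Users/Entries/Likes sample data from the first answer, runs on SQLite via Python's sqlite3 module, and the helper name `entries_with_like_flag` is hypothetical. The user filter sits in the ON clause so that entries the user has not liked are still returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Entries (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Likes (
    UserID INTEGER NOT NULL,
    EntryID INTEGER NOT NULL,
    PRIMARY KEY (UserID, EntryID)  -- one like per user and entry
);
INSERT INTO Entries VALUES (1, 'Red'), (2, 'Blue'), (3, 'Yellow');
INSERT INTO Likes VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")


def entries_with_like_flag(conn, user_id):
    """Return (ID, Name, liked) for every entry; liked is True or False."""
    rows = conn.execute(
        """
        SELECT e.ID, e.Name, l.UserID IS NOT NULL AS liked
        FROM Entries e
        LEFT JOIN Likes l ON l.EntryID = e.ID AND l.UserID = ?
        ORDER BY e.Name
        """,
        (user_id,),
    )
    return [(entry_id, name, bool(liked)) for entry_id, name, liked in rows]


bob = entries_with_like_flag(conn, 1)
sally = entries_with_like_flag(conn, 2)
print(bob)
print(sally)
```

This answers the original worry about cost: one query per page load rather than one query per entry.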
I think you can retrieve the entries and query the likes table at the same time, to find out whether the current user likes each entry, by using a stored procedure. That way you can control the shape of the data returned by the query, for example returning one column for the entry text and one boolean column that indicates whether the current user likes it. You will need at least one parameter for the stored procedure to indicate who the current user is.
I hope this idea helps you...
