I have a table in which each row contains data like the following, and I need to compare the values with each other and show which one occurs most often. For example, my table holds the fruit names below, so I need to compare these fruits and show the fruit with the highest count first.
s.no | field1 |
1 |apple,orange,pineapple |
2 |apple,pineapple,strawberry,grapes|
3 |apple,grapes, |
4 |orange,mango |
That is, apple comes first, grapes second, pineapple third, and so on. The data is entered dynamically, so whatever values are entered need to be compared with each other to get the maximum count.
Great question.
This is a classic consequence of not having the data normalized.
I recommend that you read about Database Normalization, normalize your tables, and then see how easy it is to do this with simple SQL queries.
If you need to run queries on column field1, then why not consider normalization? Otherwise it might keep getting more complex and dirty in the future.
Your current table will look like this (for serial number 1 only). Pk can be an autoincrement primary key.
Pk | s.no |fruitId|
1 | 1 |1 |
2 | 1 |2 |
3 | 1 |3 |
Your New Table of Fruits
PK |fruitName |
1 |Apple |
2 |Orange |
3 |Pineapple |
This also helps you to avoid redundancy.
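To make the "simple SQL queries" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name row_fruits and the one-time splitting loop are assumptions for illustration, not part of the answer above; the data is taken from the question.

```python
import sqlite3

# Hypothetical normalized schema: one row per (s_no, fruit) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fruits (pk INTEGER PRIMARY KEY, fruitName TEXT UNIQUE);
    CREATE TABLE row_fruits (
        pk      INTEGER PRIMARY KEY,
        s_no    INTEGER NOT NULL,
        fruitId INTEGER NOT NULL REFERENCES fruits(pk)
    );
""")

# The comma-separated rows from the question, split once at load time.
source_rows = {
    1: "apple,orange,pineapple",
    2: "apple,pineapple,strawberry,grapes",
    3: "apple,grapes,",
    4: "orange,mango",
}
for s_no, field1 in source_rows.items():
    # Skip empty pieces such as the one left by the trailing comma in row 3.
    for name in filter(None, (part.strip() for part in field1.split(","))):
        conn.execute("INSERT OR IGNORE INTO fruits (fruitName) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO row_fruits (s_no, fruitId) "
            "SELECT ?, pk FROM fruits WHERE fruitName = ?",
            (s_no, name),
        )

# Count occurrences per fruit, most frequent first (ties broken by name).
fruit_counts = conn.execute("""
    SELECT f.fruitName, COUNT(*) AS cnt
    FROM row_fruits rf
    JOIN fruits f ON f.pk = rf.fruitId
    GROUP BY f.pk, f.fruitName
    ORDER BY cnt DESC, f.fruitName
""").fetchall()
print(fruit_counts)
```

Once the values live in their own rows, the ranking is a single GROUP BY, and newly entered fruits are picked up without any code changes.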
A quick solution would be to count the number of fruits when you insert/update the row and store it in a fruitCount column. You can then order by this column.
Zohaib has the correct solution though, if you have the time and the possibility to make such changes. And I definitely suggest you read Tudor's link!
When I started designing my application's database schema a few months ago, I was told not to store the same data or calculated data in more than one place in the database (normalization). If I do, I open the door to bugs where I update the data in one place and leave the other without updating. So I made an orders table and an orderDetail table, something like this:
-- orders table
+-----+---------+----------+
| ID | clintID | date |
+-----+---------+----------+
| 1 | 1 |2018-02-22|
| 2 | 1 |2018-02-23|
| 3 | 2 |2018-02-24|
+-----+---------+----------+
-- orderDetail table
+-----+---------+------------+----------+----------+
| ID | orderID | itemNumber | quantity | unitPrice|
+-----+---------+------------+----------+----------+
| 1 | 1 | 12345 | 3 | 100.75 |
| 2 | 1 | 12346 | 3 | 100.75 |
| 3 | 2 | 12347 | 3 | 100.75 |
| 4 | 2 | 12345 | 3 | 100.75 |
| 5 | 3 | 12347 | 3 | 100.75 |
| 6 | 3 | 12345 | 3 | 100.75 |
+-----+---------+------------+----------+----------+
And to make the queries easier for me I made a view "allOrdersSummary" like
-- allOrdersSummary
SELECT
orders.*, SUM(orderDetail.quantity * orderDetail.unitPrice) totalAmount
FROM orders INNER JOIN orderDetail ON orders.ID = orderDetail.orderID
GROUP BY orders.ID;
and I used this view later for my queries, but now I have started to get the MAX_JOIN_SIZE error.
So I thought of saving the calculated total amount in the orders table itself (ID, clintID, date, totalAmount) and updating the totalAmount column whenever I change something in the orderDetail table. I don't know if this is good or bad!
I run into this problem (I don't know whether it even counts as a problem) many times. For example, to know the unread messages of the client making the request I have to do sum(messages) unread from messages where to = ? and isRead = 0
A) Should I add a column for the calculated totalAmount to the orders table, or is it normal in databases to calculate totalAmount from the orderDetail table every time I need it?
B) If you recommend adding a column to the orders table, what is the best way to update it every time the orderDetail table changes? Should I update it in the PHP layer whenever I update the orderDetail table, or is this something that needs a stored procedure?
Yes, it is normal to store pre-calculated values, based on other data in the database, in a database. But not necessarily for the reason you mention. I never had a problem with MAX_JOIN_SIZE.
The main, and probably only, reason for storing calculated values is speed. So you do it for values that don't change that often and that may be used in queries that use a lot of data and may therefore be too slow if you didn't use them.
For instance: If you want to know the average value of all the orders in your database the query would be a lot faster if you already have the order totals.
Why, and how, you update the values is completely up to you. However you have got to be consistent about it. If you use the MVC pattern it would make sense to integrate it in the controller. Or in simple terms: Whenever a form is submitted that could change one of the values, out of which the pre-calculated value is computed, you need to recompute it.
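As a rough sketch of "recompute it whenever the inputs change", the following uses Python's sqlite3 module rather than PHP. The helper names recompute_total and add_detail are hypothetical; the point is only that the detail change and the cached total are written in the same transaction, so they cannot drift apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        ID INTEGER PRIMARY KEY, clintID INTEGER, date TEXT,
        totalAmount REAL NOT NULL DEFAULT 0   -- stored, pre-calculated value
    );
    CREATE TABLE orderDetail (
        ID INTEGER PRIMARY KEY, orderID INTEGER REFERENCES orders(ID),
        itemNumber INTEGER, quantity INTEGER, unitPrice REAL
    );
""")

def recompute_total(conn, order_id):
    """Hypothetical helper: rebuild the cached total from the detail rows."""
    conn.execute(
        """UPDATE orders
           SET totalAmount = (SELECT COALESCE(SUM(quantity * unitPrice), 0)
                              FROM orderDetail WHERE orderID = ?)
           WHERE ID = ?""",
        (order_id, order_id),
    )

def add_detail(conn, order_id, item_number, quantity, unit_price):
    """Insert a detail row and refresh the cached total in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO orderDetail (orderID, itemNumber, quantity, unitPrice) "
            "VALUES (?, ?, ?, ?)",
            (order_id, item_number, quantity, unit_price),
        )
        recompute_total(conn, order_id)

conn.execute("INSERT INTO orders (ID, clintID, date) VALUES (1, 1, '2018-02-22')")
add_detail(conn, 1, 12345, 3, 100.75)
add_detail(conn, 1, 12346, 3, 100.75)
order_total = conn.execute("SELECT totalAmount FROM orders WHERE ID = 1").fetchone()[0]
print(order_total)
```

Every code path that edits or deletes detail rows would need to call the same helper; that is the consistency requirement mentioned above.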
This is a clear demonstration where 'normalization' is not entirely maintained. It's not really pretty, but sometimes worth it. You could, of course, argue, that the calculated value represents 'new' information, and therefore does not offend against 'normalization'.
You have an "inflate-deflate" problem.
JOIN the two tables to make a much larger temporary table.
GROUP BY to shrink back to one row per row of the original (orders) table.
This avoids the problem:
SELECT *,
( SELECT SUM(quantity * unitPrice)
FROM orderDetail WHERE orderID = orders.ID
) AS totalAmount
FROM orders;
Please let me know how your experience is with this one. It is one of the simplest examples of the inflate-deflate problem.
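For a quick comparison, this sketch loads the sample rows from the question into an in-memory SQLite database (an assumption; the question appears to use MySQL) and runs both the JOIN + GROUP BY form and the correlated-subquery form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (ID INTEGER PRIMARY KEY, clintID INTEGER, date TEXT);
    CREATE TABLE orderDetail (
        ID INTEGER PRIMARY KEY, orderID INTEGER,
        itemNumber INTEGER, quantity INTEGER, unitPrice REAL
    );
    INSERT INTO orders VALUES
        (1, 1, '2018-02-22'), (2, 1, '2018-02-23'), (3, 2, '2018-02-24');
    INSERT INTO orderDetail VALUES
        (1, 1, 12345, 3, 100.75), (2, 1, 12346, 3, 100.75),
        (3, 2, 12347, 3, 100.75), (4, 2, 12345, 3, 100.75),
        (5, 3, 12347, 3, 100.75), (6, 3, 12345, 3, 100.75);
""")

# "Inflate-deflate": join first (one row per detail line), then group back down.
join_totals = conn.execute("""
    SELECT orders.ID, SUM(orderDetail.quantity * orderDetail.unitPrice) AS totalAmount
    FROM orders INNER JOIN orderDetail ON orders.ID = orderDetail.orderID
    GROUP BY orders.ID
    ORDER BY orders.ID
""").fetchall()

# Correlated subquery: one SUM per order row, no intermediate joined result.
subquery_totals = conn.execute("""
    SELECT ID,
           (SELECT SUM(quantity * unitPrice)
            FROM orderDetail WHERE orderID = orders.ID) AS totalAmount
    FROM orders
    ORDER BY ID
""").fetchall()

print(join_totals)
print(subquery_totals)
```

One behavioural difference to keep in mind: an order with no detail rows disappears from the INNER JOIN result, while the subquery form still returns it with a NULL total.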
I need to store and retrieve items of a course plan in sequence. I also need to be able to add or remove items at any point.
The data looks like this:
-- chapter 1
--- section 1
----- lesson a
----- lesson b
----- drill b
...
I need to be able to identify the sequence so that when the student completes lesson a, I know that he needs to move to lesson b. I also need to be able to insert items in the sequence, like say drill a, and of course now the student goes from lesson a to drill a instead of going to lesson b.
I understand relational databases are not intended for sequences. Originally, I thought about using a simple autoincrement column and use that to handle the sequence, but the insert requirement makes it unworkable.
I have seen this question and the first answer is interesting:
items table
item_id | item
1 | section 1
2 | lesson a
3 | lesson b
4 | drill a
sequence table
item_id | sequence
1 | 1
2 | 2
3 | 4
4 | 3
That way, I would keep adding items in the items table with whatever id and work out the sequence in the sequence table. The only problem with that system is that I need to change the sequence numbers for all items in the sequence table after an insertion. For instance, if I want to insert quiz a before drill a I need to update the sequence numbers.
Not a huge deal, but the solution seems a little overcomplicated. Is there an easier, smarter way to handle this?
Just relate records to the parent and use a sequence flag. You will still need to update all the records when you insert in the middle but I can't really think of a simple way around that without leaving yourself space to begin with.
items table:
id | name | parent_id | sequence
--------------------------------------
1 | chapter 1 | null | 1
2 | section 1 | 1 | 2
3 | lesson a | 2 | 3
4 | lesson b | 2 | 5
5 | drill a | 2 | 4
When you need to insert a record in the middle a query like this will work:
UPDATE items SET sequence=sequence+1 WHERE sequence > 3;
insert into items (name, parent_id, sequence) values('quiz a', 2, 4);
To select the data in order your query will look like:
select * from items order by sequence;
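The shift-then-insert step and the "what comes next" lookup can be sketched as follows with Python's sqlite3 module. The function names insert_after and next_item are hypothetical; wrapping both statements in one transaction is an addition so that a failure cannot leave the sequence half-shifted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY, name TEXT,
        parent_id INTEGER REFERENCES items(id), sequence INTEGER
    );
    INSERT INTO items VALUES
        (1, 'chapter 1', NULL, 1), (2, 'section 1', 1, 2), (3, 'lesson a', 2, 3),
        (4, 'lesson b', 2, 5), (5, 'drill a', 2, 4);
""")

def insert_after(conn, after_sequence, name, parent_id):
    """Make room after the given position, then insert the new item there."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE items SET sequence = sequence + 1 WHERE sequence > ?",
            (after_sequence,),
        )
        conn.execute(
            "INSERT INTO items (name, parent_id, sequence) VALUES (?, ?, ?)",
            (name, parent_id, after_sequence + 1),
        )

def next_item(conn, current_sequence):
    """Return the name of the item that follows the given position, if any."""
    row = conn.execute(
        "SELECT name FROM items WHERE sequence > ? ORDER BY sequence LIMIT 1",
        (current_sequence,),
    ).fetchone()
    return row[0] if row else None

insert_after(conn, 3, "quiz a", 2)
ordered_names = [r[0] for r in conn.execute("SELECT name FROM items ORDER BY sequence")]
print(ordered_names)
print(next_item(conn, 3))
```

Because next_item only looks for the smallest sequence greater than the current one, the numbers do not have to be contiguous, which is what makes "leaving yourself space" a workable alternative.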
I have a table called user with 3 columns: id, name and phone no.
I want to insert data as in the table below.
+----+---------+-----------+
| id | name    | phone no  |
+----+---------+-----------+
| 1  | mahadev | +91 XXXXX |
| 2  | swamy   | +91 YYYYY |
|    |         | +91 ZZZZZ |
| 3  | charlie | +91 AAAAA |
|    |         |           |
+----+---------+-----------+
The question is: how can I add more than one value (one by one) to the same row, as shown for id = 2 in the table above?
Could anyone please help me with this?
Thanks in advance.
You cannot do what you intended, how you intended. And for a reason.
One possible solution (bad), would be to make id non-unique and then insert two times id 2, name swamy, phone for two different phones.
Proper solution is to have two tables. One is your current user, which would have only id and name.
Second table is phone_numbers which would have user_id and phone_no. Primary key on that table would be composite of user_id and phone_no so it would prevent duplicates. Then in that table you can insert as many numbers as you need.
In your example you would have two rows with user_id=2, one for each phone number.
Then it is only a matter of a JOIN to combine the two tables and display your results.
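A minimal sketch of that two-table layout, using Python's sqlite3 module (the question is presumably about MySQL, so treat the exact DDL as an assumption). The helper add_phone is hypothetical and only demonstrates that the composite primary key rejects duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone_numbers (
        user_id  INTEGER NOT NULL REFERENCES user(id),
        phone_no TEXT    NOT NULL,
        PRIMARY KEY (user_id, phone_no)   -- composite key blocks duplicates
    );
    INSERT INTO user VALUES (1, 'mahadev'), (2, 'swamy'), (3, 'charlie');
    INSERT INTO phone_numbers VALUES
        (1, '+91 XXXXX'), (2, '+91 YYYYY'), (2, '+91 ZZZZZ'), (3, '+91 AAAAA');
""")

def add_phone(conn, user_id, phone_no):
    """Add one more number for a user; return False if it already exists."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO phone_numbers (user_id, phone_no) VALUES (?, ?)",
                (user_id, phone_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# One output row per (user, phone) pair; users without phones would still appear.
phones = conn.execute("""
    SELECT u.id, u.name, p.phone_no
    FROM user u
    LEFT JOIN phone_numbers p ON p.user_id = u.id
    ORDER BY u.id, p.phone_no
""").fetchall()
print(phones)
```

Adding numbers "one by one" then becomes one INSERT per number, with no need to touch the existing row for that user.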
SQL architecture doesn't allow such things. You need to use more than one row, or more than one table with foreign keys. Or you can serialize the phone numbers before you put them into MySQL.
One possible solution could be creating an array of that data and then storing it with serialize() function.
Small example:
$phones_array = array('phone_a' => '+91 YYYYY', 'phone_b' => '+91 ZZZZZ');
// serialize() returns the string; it does not modify $phones_array
$serialized = serialize($phones_array);
Now your data is serialized into a string; with var_dump($serialized) you should get:
string 'a:2:{s:7:"phone_a";s:9:"+91 YYYYY";s:7:"phone_b";s:9:"+91 ZZZZZ";}' (length=66)
You can now insert this value into your table.
After reading the string back from the table, you can restore the array with:
$phones_array = unserialize($serialized);
I need some help, please:
I have this table, which has these fields with their respective values:
agency_id | hostess_id
3 | 12-4-6
5 | 19-4-7
1 | 1
The hostess_id column stores the ids of all hostesses associated with that agency_id, separated with a "-".
Now, I log in as a hostess, and I have id=4.
I need to retrieve every agency_id whose list contains id=4. I can't do this with the LIKE operator. I tried saving the hostess_id value to an array and then imploding it, but I couldn't solve it that way.
Please, any ideas?
You should change your database design. What you are describing is a typical N:N relation.
Agencies:
agency_id | name
3 | Miami
5 | Annapolis
1 | New York
Hostesses:
Hostes_id | name
4 | Helen
12 | May
19 | June
AgencyHostes
Hostes_id | agency_id
4 | 1
4 | 3
4 | 5
12 | 1
12 | 3
19 | 1
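To show how such a link table answers the original question, here is a sketch in Python with sqlite3. It loads the dash-separated rows from the question (not the illustrative rows above) into a link table; the table name agency_hostess and the function agencies_for are assumptions.

```python
import sqlite3

# Rows exactly as given in the question: (agency_id, dash-separated hostess ids).
legacy_rows = [(3, "12-4-6"), (5, "19-4-7"), (1, "1")]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE agency_hostess (
        agency_id  INTEGER NOT NULL,
        hostess_id INTEGER NOT NULL,
        PRIMARY KEY (agency_id, hostess_id)
    )
""")

# One-time migration: split each packed string into one row per pair.
for agency_id, packed in legacy_rows:
    conn.executemany(
        "INSERT INTO agency_hostess (agency_id, hostess_id) VALUES (?, ?)",
        [(agency_id, int(h)) for h in packed.split("-")],
    )

def agencies_for(conn, hostess_id):
    """All agencies a given hostess belongs to: a plain equality lookup."""
    rows = conn.execute(
        "SELECT agency_id FROM agency_hostess WHERE hostess_id = ? ORDER BY agency_id",
        (hostess_id,),
    )
    return [r[0] for r in rows]

print(agencies_for(conn, 4))
```

The lookup is an exact match on an integer column, so it can use an ordinary index instead of pattern matching on a string.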
First, let me say that I absolutely agree with @JvdBerg that this is terrible database design that needs to be normalized.
Let's assume for a minute, though, that you have no way of changing the database layout and that you must solve this with SQL. An inefficient but working solution would be
select agency_id from tablename where
hostess_id = '4' OR
hostess_id LIKE '4-%' OR
hostess_id LIKE '%-4-%' OR
hostess_id LIKE '%-4'
if you were searching for all agencies with hostess id 4. I built this on sqlfiddle to illustrate more thoroughly http://sqlfiddle.com/#!2/09a52/1
Mind though, that this SQL statement is hard to optimize since an index structure for substring matching is rarely employed. For very short id lists it will work okay though. If you have ANY chance at changing the table structure, normalize your schema like @JvdBerg suggested and look up database design and normal forms on google.
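A common variant of the same workaround, not taken from the answer above, wraps the stored list in the delimiter on both sides so that a single pattern covers the first, middle, last, and only-element cases. The sketch below uses SQLite's || concatenation through Python's sqlite3 module; in MySQL the equivalent expression would use CONCAT('-', hostess_id, '-'). Agencies 8 and 9 are made-up rows added to exercise the edge cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (agency_id INTEGER, hostess_id TEXT);
    INSERT INTO tablename VALUES
        (3, '12-4-6'), (5, '19-4-7'), (1, '1'),
        (8, '4'),        -- hypothetical: list with a single id
        (9, '14-40');    -- hypothetical: ids that merely contain the digit 4
""")

def agencies_with(conn, hostess_id):
    """Match '-<id>-' inside '-<list>-' so partial numbers never match."""
    pattern = f"%-{int(hostess_id)}-%"
    rows = conn.execute(
        "SELECT agency_id FROM tablename "
        "WHERE '-' || hostess_id || '-' LIKE ? "
        "ORDER BY agency_id",
        (pattern,),
    )
    return [r[0] for r in rows]

print(agencies_with(conn, 4))
```

This is just as unindexable as the multi-LIKE version, so the advice to normalize still stands; it only shortens the query.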
I have one table GAMES and another PLAYERS. Currently each "game" has a column for players_in_game but I have nothing reciprocating in the PLAYERS table. Since this column is an array (Comma separated list of the player's ID #s) I'm thinking that it would probably be better to have each player's record also contain a list of the games they are a member of. On the other hand, duplicating the information in two separate tables might actually require more DB calls.
For perspective, there aren't likely to be more than a dozen players in a game (generally 4-6 is the norm) but there could potentially be a large number of games.
Is there a good way to figure out which would be more efficient?
Thanks.
Normalization is generally a good thing. Comma-delimited lists in a table are a sign that the table is in desperate need of a foreign key. If you're worried about extra queries, check out JOINs.
dbo.games
+----+----------+
| id | name |
+----+----------+
| 1 | war |
| 2 | invaders |
+----+----------+
dbo.players
+----+----------+---------+
| id | name | game_id |
+----+----------+---------+
| 1 | john | 1 |
| 2 | mike | 1 |
+----+----------+---------+
SELECT games.name, count(players.id) as total_players FROM games LEFT JOIN players ON games.id = players.game_id GROUP BY games.name;
Result:
+-----------+--------------+
| name |total_players |
+-----------+--------------+
| war | 2 |
| invaders | 0 |
+-----------+--------------+
Sidenote: Go Hokies :)
Oh god, please don't use CSVs!! I know it's tempting when you're new to SQL, but it becomes unqueryable...
You need 3 tables: games, players, and players_in_games. games and players should each have a primary auto-incrementing key like id, and then players_in_games needs just two fields, player_id and game_id. This is called a "many to many" relationship. A player can play many games, and a game can have many players.
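A minimal sketch of those three tables, using Python's sqlite3 module. The sample rows and the helper names players_in and games_of are made up for illustration; the same link table answers the question in both directions, so nothing has to be duplicated on the players side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games   (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE players (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE players_in_games (
        player_id INTEGER NOT NULL REFERENCES players(id),
        game_id   INTEGER NOT NULL REFERENCES games(id),
        PRIMARY KEY (player_id, game_id)
    );
    INSERT INTO games (name)   VALUES ('war'), ('invaders');
    INSERT INTO players (name) VALUES ('john'), ('mike');
    -- john plays both games, mike only plays 'war'
    INSERT INTO players_in_games VALUES (1, 1), (2, 1), (1, 2);
""")

def players_in(conn, game_name):
    """Names of all players in one game."""
    rows = conn.execute("""
        SELECT p.name
        FROM players p
        JOIN players_in_games pg ON pg.player_id = p.id
        JOIN games g ON g.id = pg.game_id
        WHERE g.name = ?
        ORDER BY p.name
    """, (game_name,))
    return [r[0] for r in rows]

def games_of(conn, player_name):
    """Names of all games one player is a member of."""
    rows = conn.execute("""
        SELECT g.name
        FROM games g
        JOIN players_in_games pg ON pg.game_id = g.id
        JOIN players p ON p.id = pg.player_id
        WHERE p.name = ?
        ORDER BY g.name
    """, (player_name,))
    return [r[0] for r in rows]

print(players_in(conn, "war"))
print(games_of(conn, "john"))
```

Each lookup is a single query, which addresses the worry about extra database calls.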
The right answer is a table called PlayersInGames that has a player id and a game id per row.
I would create a third table that links the players and games. Your comma-delimited list is effectively a third table, but parsing your list is almost certainly going to be less efficient than letting the database do it for you.
Ask yourself what happens if you remove a row from the GAME table. Now you'll have to loop over all the PLAYER rows, parse the list, figure out which ones contain a reference to the removed GAME, and then update all the lists.
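With a link table, that cleanup can be delegated to the database. The sketch below assumes an ON DELETE CASCADE foreign key, which is a design choice not mentioned in the answers; it uses Python's sqlite3 module, where foreign-key enforcement has to be switched on per connection with PRAGMA foreign_keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE games   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE players_in_games (
        player_id INTEGER NOT NULL REFERENCES players(id) ON DELETE CASCADE,
        game_id   INTEGER NOT NULL REFERENCES games(id)   ON DELETE CASCADE,
        PRIMARY KEY (player_id, game_id)
    );
    INSERT INTO games   VALUES (1, 'war'), (2, 'invaders');
    INSERT INTO players VALUES (1, 'john'), (2, 'mike');
    INSERT INTO players_in_games VALUES (1, 1), (2, 1), (1, 2);
""")

# Removing a game also removes its membership rows; no list parsing needed.
with conn:
    conn.execute("DELETE FROM games WHERE id = 1")

remaining_links = conn.execute(
    "SELECT player_id, game_id FROM players_in_games ORDER BY player_id, game_id"
).fetchall()
print(remaining_links)
```

The players themselves are untouched; only the rows that referenced the deleted game go away.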
Bad design. Let SQL do what it was born for. The query will be fast enough if you index it properly. Micro-optimizations like this are the wrong approach.