Many database rows vs one comma separated values row - php

I'm creating a table to allow website users to become friends. I'm trying to determine which table design is best for storing and returning a user's friends. The goal is to have fast queries and not use up a lot of DB space.
I have two options:
Have individual rows for each friendship.
+----+---------+-----------+
| ID | User_ID | Friend_ID |
+----+---------+-----------+
|  1 |     102 |       213 |
|  2 |      64 |        23 |
|  3 |       4 |       344 |
|  4 |     102 |         2 |
|  5 |     102 |        90 |
|  6 |      64 |        88 |
+----+---------+-----------+
Or store all friends in one row as CSV
+----+---------+----------------+
| ID | User_ID | Friend_ID      |
+----+---------+----------------+
|  1 |     102 | 213,44,34,67,8 |
|  2 |      64 | 23,33,45,105   |
+----+---------+----------------+
When retrieving friends I can create an array using explode(); however, deleting a user would be trickier.
Edit: for the second method I would split the IDs into a PHP array for things like counting and other operations.
Which method do you think is better?

The first method is definitely better. It's what makes relational databases great :)
It will allow you to search and group by much more specific criteria than the second method.
Say you wanted to write a query so users could see who has them as a friend. With the second method you would have to match inside the CSV string (e.g. with FIND_IN_SET()), which cannot use an index and would be much slower than a simple JOIN.
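A rough sketch of that lookup, assuming the two designs live in hypothetical tables named friendships and friendships_csv:

-- Who has user 213 as a friend? With one row per friendship this is an indexed lookup:
SELECT User_ID FROM friendships WHERE Friend_ID = 213;

-- With the CSV column you are stuck matching inside the string, which cannot use an index:
SELECT User_ID FROM friendships_csv WHERE FIND_IN_SET('213', Friend_ID);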

The first method is better in just about every way. Not only will you be able to use your DB's indexes to find records faster, it will also make modifications far, far easier.

Breaking first normal form is usually not desirable because:
It is easy to end up with orphaned IDs
It is easy to insert invalid data types
Updates can require full table scans
It increases concurrency issues
There is no way to create the key (user_id, friend_id) (see the sketch below)
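For illustration, a minimal sketch of the normalized table with that composite key (the table name friendships is an assumption):

CREATE TABLE friendships (
    user_id   INT UNSIGNED NOT NULL,
    friend_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (user_id, friend_id),  -- duplicate friendships become impossible
    KEY idx_friend (friend_id)         -- fast reverse lookups ("who has me as a friend?")
);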

Use the power of the relational database. Definitely go with the first approach. MySQL is faster than you think, and it regularly deals with VERY large datasets.

Related

Best practices linking data in MySQL tables

For an online game, I have a table that contains all the plays, and some information on those plays, like the difficulty setting etc.:
+---------+---------+------------+------------+
| play-id | user-id | difficulty | timestamp  |
+---------+---------+------------+------------+
|       1 | abc     | easy       | 1335939007 |
|       2 | def     | medium     | 1354833214 |
|       3 | abc     | easy       | 1354833875 |
|       4 | abc     | medium     | 1354833937 |
+---------+---------+------------+------------+
In another table, after the game has finished, I store some stats related to that specific game, like the score etc:
+---------+----------------+--------+
| play-id | type           | value  |
+---------+----------------+--------+
|       1 | score          | 201487 |
|       1 | enemies_killed |     17 |
|       1 | gems_found     |      4 |
|       2 | score          | 110248 |
|       2 | enemies_killed |     12 |
|       2 | gems_found     |      7 |
+---------+----------------+--------+
Now, I want to make a distribution graph so users can see in what score percentile they are. So I basically want the boundaries of the percentiles.
If it were at the level of individual scores, I could rank the scores and start from there, but it needs to be at the level of each user's high score. So mathematically, I would need to sort all users' high scores and then find the percentiles.
I'm in doubt about the best approach here.
On the one hand, constructing an array that holds all the high scores seems performance-heavy, because it has to cycle through both tables and match the scores to the users (the first table holds around 10M rows).
On the other hand, keeping a separate table with each user's high score would make things easier, but it feels like it goes against the rule of avoiding data redundancy.
Another approach that came to mind was doing the heavy computation once a week and keeping the result in a separate table, or running it on only a (statistically relevant) subset of the data.
Or maybe I'm completely missing the point here and should use a completely different database setup?
What's the best practice here?
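For what it's worth, a hedged sketch of the per-user high-score aggregation described above, assuming the two tables are named plays and play_stats (both hypothetical names) and that a higher score is better:

-- One row per user with their best score across all plays:
SELECT p.`user-id`, MAX(s.value) AS highscore
FROM plays AS p
JOIN play_stats AS s ON s.`play-id` = p.`play-id`
WHERE s.type = 'score'
GROUP BY p.`user-id`;

Materializing that result into a summary table on a schedule (the once-a-week idea) is a common compromise between freshness and query cost.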

Storing tag values for displaying content

I have 150+ tags (more in the future), 18,000+ pieces of content to display, and 380+ users (more in the future).
What will be the best way to display content to users according to their tag values?
I thought of storing all tag activity in the database like this:
+-----+---------+--------+------+
| Sr. | User_Id | Tag_Id | Int. |
+-----+---------+--------+------+
|   1 |     152 |     18 |   15 |
|   2 |     152 |     24 |    8 |
|   3 |      18 |    127 |    4 |
+-----+---------+--------+------+
In the table, Int. means how many times the user has shown interest in posts having that Tag_Id.
When the user clicks "Interested?", the Int. column is incremented by 1 for that user and that tag.
If I store the values like this, the table will grow large, see heavy traffic, and need a lot of storage too (108K values already just for now; imagine the number of rows after 2 years).
Are there any better alternatives?
I am using PHP & MySQL.
For the best relational model, you would have to give some more information about the complete design, but as I see it now, this is already the best solution. You can drop the Sr. column and make the primary key a composite of User_Id and Tag_Id (see the sketch below).
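A minimal sketch of that layout (the table name user_tag_interest is an assumption; the upsert keeps the "Interested?" click to a single statement):

CREATE TABLE user_tag_interest (
    user_id  INT UNSIGNED NOT NULL,
    tag_id   INT UNSIGNED NOT NULL,
    interest INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, tag_id)
);

-- The first click inserts the row, every later click just increments the counter:
INSERT INTO user_tag_interest (user_id, tag_id, interest)
VALUES (152, 18, 1)
ON DUPLICATE KEY UPDATE interest = interest + 1;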

Which is more efficient? Count() or a reference to a count in another table?

Say I wanted to add the functionality of logging user actions within a web application. My table schema would look similar to the following:
tbl_history:
+----+---------+-----------+
| id | user_id | action_id |
+----+---------+-----------+
|  1 |       1 |         1 |
|  2 |       1 |         2 |
|  3 |       2 |         2 |
+----+---------+-----------+
A user can generate many actions, so I will need to paginate this history. In order to do this I will need to figure out the total number of rows for the user and then calculate how many pages of data there should be.
Which method would be the most efficient if I had hundreds of users generating thousands of rows of data each day?
A)
Using MySQL's COUNT() function to query the number of rows in the tbl_history table for a particular user.
B)
Keeping another table that stores a count of each user's rows in tbl_history.
+---------+---------------+
| user_id | history_count |
+---------+---------------+
|       1 |             2 |
|       2 |             1 |
+---------+---------------+
This will allow me to instantly get the total count of rows with a simple query in less than 1ms.
The tradeoff is that I will need to perform extra queries to keep the count updated for each user, and another query again on page load.
Which method is more efficient to use, or is there a better method? Any technical explanation would be great.
Thanks in advance.
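For concreteness, a hedged sketch of both options against the schema above (the index and counter-table names are assumptions):

-- Option A: COUNT() per user; with an index on user_id this is a fast index range scan.
ALTER TABLE tbl_history ADD INDEX idx_user (user_id);
SELECT COUNT(*) FROM tbl_history WHERE user_id = 1;

-- Option B: a counter table maintained alongside every insert into tbl_history.
CREATE TABLE tbl_history_count (
    user_id       INT UNSIGNED NOT NULL PRIMARY KEY,
    history_count INT UNSIGNED NOT NULL DEFAULT 0
);
INSERT INTO tbl_history_count (user_id, history_count)
VALUES (1, 1)
ON DUPLICATE KEY UPDATE history_count = history_count + 1;

Option A stays correct by construction; option B trades that for an O(1) read at the cost of keeping the counter in sync.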

mysql insert into two tables using auto_increment value from the first table as a foreign key in the second

I want to insert into the cart table
**orderId** | cartId | cartDate | cartStatus
____________________________________________
          1 |      1 | 20120102 | complete
          2 |      2 | 20120102 | complete
          3 |      3 | 20120102 | complete
          4 |      4 | 20120102 | complete
using the auto increment value orderId from the order table
**orderId** | orderStatus | secret    | sauce
_____________________________________________
          1 |           7 | 020200202 | bbq
          2 |           6 | 020200202 | bbq
          3 |           6 | 020200202 | t
          4 |           4 | 020200202 | m
INSERT INTO ordertable VALUES (NULL, 7, '020200202', 'bbq');
but then using the orderId (which will now be 5)
INSERT INTO carttable VALUES (orderId, NULL, '20120102', 'complete');
However, this insert must be done as part of the same query. If I use mysql_insert_id() (PHP) there is an opportunity for someone else to insert into the database before my cart insert is executed, or the connection might time out. The database is MyISAM (and I cannot change this; it's a 3rd-party solution).
Thank you,
J
I think your concern about using mysql_insert_id() is unfounded: it returns the last id for the current connection, not the last id globally across all connections.
So unless you have multiple threads sharing the same database connection, or you perform another identity insert on the same connection before calling mysql_insert_id(), you have nothing to worry about.
ETA: You could do this by sending multiple queries at once, like this:
INSERT INTO ordertable VALUES (NULL, 7, '020200202', 'bbq');
INSERT INTO carttable VALUES (LAST_INSERT_ID(), NULL, '20120102', 'complete');
But if you are using mysql_query it usually won't let you send multiple queries in the same call (mostly as a security measure to try to prevent SQL injection).
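If the driver only allows one statement per call, the same per-connection guarantee still holds when the statements are sent one after another on that connection; a sketch using a user variable (column layouts as in the question, values illustrative):

INSERT INTO ordertable VALUES (NULL, 7, '020200202', 'bbq');
SET @new_order_id = LAST_INSERT_ID();  -- LAST_INSERT_ID() and @new_order_id are both scoped to this connection
INSERT INTO carttable VALUES (@new_order_id, NULL, '20120102', 'complete');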

MYSQL Comma Delimited list, possible to add and remove values?

I have a comma-delimited list that I'm storing in a VARCHAR field in a MySQL table.
Is it possible to add and remove values from the list directly using SQL queries, or do I have to take the data out of the table, manipulate it in PHP, and put it back into MySQL?
There is no clean way to do it with the InnoDB and MyISAM engines in MySQL; the closest you can get in plain SQL is fragile string manipulation (see the sketch below). Other engines might offer something (check the CSV engine).
You could also wrap that logic in a stored procedure, but it is not recommended.
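For completeness, a sketch of that fragile in-place manipulation (table T1 and the value jj come from the example below; 'kk' is a made-up value to append; this breaks on edge cases such as duplicate entries or stray whitespace):

-- Append a value to the list:
UPDATE T1
SET data = IF(data IS NULL OR data = '', 'kk', CONCAT(data, ',', 'kk'))
WHERE id = 1;

-- Remove a value from the list:
UPDATE T1
SET data = TRIM(BOTH ',' FROM REPLACE(CONCAT(',', data, ','), ',jj,', ','))
WHERE id = 1;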
What you should do to solve such an issue is refactor your code and normalize your DB:
original table
T1: id | data | some_other_data
1 | gg,jj,ss,ee,tt,hh | abanibi
To become:
T1: id | some_other_data
1 | abanibi
T2: id | t1_id | data_piece
1 | 1 | gg
2 | 1 | jj
3 | 1 | ss
4 | 1 | ee
5 | 1 | tt
6 | 1 | hh
And if data_piece is a value that is reused a lot in the system, you should add a lookup table for it too.
I know it looks like more work, but it will save you from issues like the one you have now, which take much more time to solve.
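A minimal sketch of the normalized version in SQL, using the names from the answer above (column types are assumptions):

CREATE TABLE T2 (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    t1_id      INT UNSIGNED NOT NULL,
    data_piece VARCHAR(64)  NOT NULL,
    KEY idx_t1 (t1_id)
);

-- Adding and removing values becomes a plain INSERT or DELETE:
INSERT INTO T2 (t1_id, data_piece) VALUES (1, 'kk');
DELETE FROM T2 WHERE t1_id = 1 AND data_piece = 'jj';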
