Best approach for mysql table with large number of columns [duplicate] - php

This question already has answers here:
mysql table with 40+ columns
(4 answers)
Closed 5 years ago.
MySQL tables:
phones
| ID | Name | ...                                         // 5-6 columns
specifications
| ID | phone_id | ram | camera | price | network | ...    // 50 columns approx
My specifications table has about 50 columns so far, so I need a suggestion on how to handle this situation: do I need to split the specifications table into other tables, or can I continue with it as it is?
I am looking for the approach that gives the best speed and performance.

Dealing with 50+ columns is not the best approach:
Database operations like INSERT, UPDATE, and SELECT can take more time.
It becomes a difficult task to handle the data when you are dealing with 50+ columns in a table.
So I suggest you do not continue adding columns to the table. mysql-table-with-40-columns could be useful for solving your problem.

Perhaps you can have one main table with id, name, date and some more post-specific fields.
Then you could create one meta table which contains id, main_id (linking to the row in your main table), meta_name (e.g. price, color or dimensions) and meta_value (e.g. 125.95, black or 150 x 50 x 8).
This way you can store as many column-like values as you want in one table, linked to one main table by its id.
This is the same concept WordPress uses.
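A minimal sketch of that layout (the table and column names here are illustrative, not taken from the question):

CREATE TABLE phones (
    id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE phone_meta (
    id         INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    main_id    INT UNSIGNED NOT NULL,       -- links to phones.id
    meta_name  VARCHAR(64)  NOT NULL,       -- e.g. 'price', 'color', 'dimensions'
    meta_value VARCHAR(255) NOT NULL,       -- e.g. '125.95', 'black', '150 x 50 x 8'
    KEY idx_main_meta (main_id, meta_name)
);

-- All specifications for one phone:
SELECT meta_name, meta_value FROM phone_meta WHERE main_id = 1;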
Otherwise you could also store what is called serialized data/arrays in your main table: a single column holding an array's keys and their values.

Related

PHP: Best Way To Determine Position In Array [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
What I Wish To Implement
My site does a nightly API data fetch, inserting 100,000+ new entries each night. To save space, each field name is stored in a separate table with an allocated ID, saving around 1,027 bytes per data set, approximately 2.5675 MB per night, and just under a gigabyte over the course of a year; however, this is set to increase.
For each user, a JSON file is requested containing the 112 entries to be added. Instead of checking my table for each name's ID, I feel that to save time it would be best to create an array where the position in the array is the ID. So let's use some random vegetable names;
Random List Of Vegetables
"Broccoli", "Brussels sprouts", "Cabbage", "Calabrese", "Carrots", "Cauliflower", "Celery", "Chard", "Collard greens", "Corn salad", "Endive", "Fiddleheads (young coiled fern leaves)", "Frisee", "Fennel"
When I create the insert via my PHP classes, I use the following;
$database->bind(':veg_name', VALUE);
Question
What would be the best method to quickly check what position $x is within the array?
As an alternative solution to matching the entries in PHP (which might at some point run into time and/or memory problems):
The general idea is to let the database do the work. It is already optimized (index structures) to match entries to one another.
So following your example, the database probably already has a dimension table for the field names, called fields:
ID | Name
---------------------------------
0 | "Broccoli"
1 | "Brussels sprouts"
2 | "Cabbage"
Then there is the "final" table facts, which has a structure like this:
User_ID | Field_ID | Timestamp
Now a new batch of entries should be inserted. For this, we first create a temporary table temp with the following format and insert all raw entries. The last column Field_ID will stay empty for now.
User_ID | Field_Name | Timestamp | Field_ID
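A sketch of that temporary table in DDL (the column types here are assumptions):

CREATE TEMPORARY TABLE temp (
    User_ID    INT          NOT NULL,
    Field_Name VARCHAR(100) NOT NULL,
    Timestamp  DATETIME     NOT NULL,
    Field_ID   INT          NULL          -- filled in by the matching step below
);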
In a next step we match each field name with its ID using a simple SQL query:
UPDATE `temp` t
SET Field_ID=(SELECT Field_ID FROM fields f WHERE f.Name=t.Field_Name)
So now the database has done our required mapping and we can issue another query to insert the rows into our fact table:
INSERT INTO facts
SELECT User_ID, Field_ID, Timestamp FROM temp WHERE Field_ID IS NOT NULL
A small side-effect here: All rows in our temp table, that could not be matched (we didn't have the field name in our fields table), are still available there. So we could write some logic to send an error report somewhere and have someone add the field names or otherwise fix the issue.
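For example, to pull out the unmatched rows for review:

SELECT User_ID, Field_Name, Timestamp FROM temp WHERE Field_ID IS NULL;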
After we are done, we should drop or at least truncate the temp table to be ready for the next night's iteration.
Small remark: The queries here are just examples. You could do the mapping and insertion into your facts table in one query, but then you'd lose the "unmatched" entries or have to redo the work.
Redoing the work might not be an issue now, but you said the number of entries will increase in the future, so this might become an issue.
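For reference, a sketch of that single-query variant (it drops the unmatched rows silently):

INSERT INTO facts (User_ID, Field_ID, Timestamp)
SELECT t.User_ID, f.Field_ID, t.Timestamp
FROM temp t
INNER JOIN fields f ON f.Name = t.Field_Name;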
If you're only doing 2.5 megs/night, that's almost nothing. If you gzipped that before dragging it across, it would reduce it a lot more.
Using array positions could get tricky if you're trying to use that to match something in some other table.
That being said, every array has a numeric index as well, so you can find out what that is at any point.
Try this and you'll see:
$array = array("Broccoli", "Brussels sprouts", "Cabbage", "Calabrese", "Carrots", "Cauliflower", "Celery", "Chard", "Collard greens", "Corn salad", "Endive", "Fiddleheads (young coiled fern leaves)", "Frisee", "Fennel");
var_dump(array_keys($array));
On the array, you can also do this:
$currentKey = array_search("Carrots", $array);
That will return the key for a given value; note that array_search() returns false when the value is not found, so compare the result with the !== operator. So if you're looping through an array, you can output the key (index) and go do something else with it.
Also, gzip is a form of compression that makes your data much smaller.
If you have a list of items, e.g. an array containing only strings that represent your values, you can use foreach with key => value syntax ($users as $index => $name) instead of just $users as $user, like the following:
$users = ["Broccoli", "Brussels sprouts", "Cabbage", "Calabrese", "Carrots", "Cauliflower", "Celery", "Chard", "Collard greens", "Corn salad", "Endive", "Fiddleheads (young coiled fern leaves)", "Frisee", "Fennel"];
foreach( $users as $index => $name ) {
echo "about to insert $name which is the #$index..." . PHP_EOL;
}
Which will echo :
about to insert Broccoli which is the #0...
about to insert Brussels sprouts which is the #1...
about to insert Cabbage which is the #2...
about to insert Calabrese which is the #3...
about to insert Carrots which is the #4...
about to insert Cauliflower which is the #5...
about to insert Celery which is the #6...
about to insert Chard which is the #7...
about to insert Collard greens which is the #8...
about to insert Corn salad which is the #9...
about to insert Endive which is the #10...
about to insert Fiddleheads (young coiled fern leaves) which is the #11...
about to insert Frisee which is the #12...
about to insert Fennel which is the #13...
Live-example available here : https://repl.it/Jpwk
Like #m13r asked, how would an index be useful in your case ?

MYSQL output multiple rows with just a single row in mysql database

I have data that should produce one output row for each social media interaction the user made.
There are 4 interactions: fblike_point, fbshare_point, tweet_point, and follow_point.
So let's say, judging from my data, I've interacted with fblike_point and tweet_point.
What I want is output printed 2 times, since I've interacted with fblike_point and tweet_point.
Output:
2013-05-14 | fblike_point
2013-05-14 | tweet_point
If I interacted 4 times, it should output 4 rows with the corresponding social media interactions that were made.
Well, I can manage to do this, but it feels redundant; for example, I'm using these MySQL queries in PHP for selecting the data:
SELECT date_participated, fblike_point FROM table WHERE fblike_point = 1
SELECT date_participated, fbshare_point FROM table WHERE fbshare_point = 1
SELECT date_participated, tweet_point FROM table WHERE tweet_point = 1
SELECT date_participated, follow_point FROM table WHERE follow_point = 1
So, is there a shorter way of doing this?
If I interacted 4 times, it should output 4 times
With your data schema, you'd either need the four distinct queries you quoted, or a UNION over these.
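A sketch of the UNION variant (points is a placeholder table name here, since table itself is a reserved word):

SELECT date_participated, 'fblike_point' AS interaction FROM points WHERE fblike_point = 1
UNION ALL
SELECT date_participated, 'fbshare_point' FROM points WHERE fbshare_point = 1
UNION ALL
SELECT date_participated, 'tweet_point' FROM points WHERE tweet_point = 1
UNION ALL
SELECT date_participated, 'follow_point' FROM points WHERE follow_point = 1;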
it was like redundancy
This is redundant because of the way your schema is organized. If you want to be able to treat these different interactions alike (which makes a lot of sense), then you'd want an extra table for them, with one column identifying the row of your original table that it refers to, and a second column (probably of an ENUM type) identifying the social media type. Together they would form the primary key of that table.
You can then create a VIEW from the actual tables which looks just like your table does now. That way you can maintain compatibility to existing queries and still provide more flexible queries for those cases where you need them.
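A sketch of that normalized layout (the table names entries and interactions are assumptions):

CREATE TABLE interactions (
    entry_id    INT NOT NULL,    -- row id of the original table
    interaction ENUM('fblike_point', 'fbshare_point', 'tweet_point', 'follow_point') NOT NULL,
    PRIMARY KEY (entry_id, interaction)
);

-- The desired output then becomes a single query:
SELECT e.date_participated, i.interaction
FROM entries e
INNER JOIN interactions i ON i.entry_id = e.id;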

One ID for every database column, how to do?

I am working on a food database; every food has a list of properties (fats, energy, vitamins, etc.).
These properties are spread over 50 different columns for proteins, fat, carbohydrates, vitamins, elements, etc. (there are a lot).
The number of columns could increase in the future, but not by much; 80 in an extreme case.
Each column needs an individual reference to one bibliography out of a whole list in another table (needed to check whether the value is reliable or not).
Each of these ids should contain a number, a NULL value, or 0 for one specific exception reference (pointing to another table).
I've thought of some solutions, but they are very different from each other, and as a rookie with databases I have no idea which is best.
Consider value_1 as proteins, value_2 as carbohydrates, etc.
The best (I hope) 2 alternatives I came up with are:
(1) create one varchar(255?) column, with all 50 ids, so something like this:
column energy (7.00)
column carbohydrates (89.95)
column fats (63.12)
column value_bil_ids (165862,14861,816486) ## as a varchar
etc...
In this case, I can split it on "," into an array and check the ids, but I'm still worried about coding practicality... this could save a lot of columns, but I don't know how practical it is with regard to scalability.
Mainly, I thought of this option for query optimization (I hope!).
(2) Simply using an additional id column for every value, so:
column energy (7.00)
column energy_bibl_id (165862)
column carbohydrates (89.95)
column carbohydrates_bibl_id (14861)
column fats (63.12)
column fats_bibl_id (816486)
etc...
It seems to be a hefty number of columns, but it is much clearer than the first, especially regarding the relation between each value column and its ID.
(3) Create a relational table between values and bibliographies, so
table values
energy
carbohydrates
fats
value_id --> point to table values_and_bibliographies val_bib_id
table values_and_bibliographies
val_bib_id
energy_id --> point to table bibliographies biblio_id
carbohydrates_id --> point to table bibliographies biblio_id
fats_id --> point to table bibliographies biblio_id
table bibliographies
biblio_id
biblio_name
biblio_year
I don't know if these are the best solutions, and I would be grateful if someone could help shed some light on this!
You need to normalize that table. What you are doing is madness and will cause you to lose hair. They are called relational databases so you can do what you want without adding columns; you want to structure it so that you add rows instead.
Please use real names and we can whip a schema out.
edit Good edit. #3 is getting close to a sane design, but you are still very unclear about what a bibliography is doing in a food schema! I think this is what you want: you can have a food and its components linked to a bibliography. I assume a bibliography is something like a recipe?
FOODS
id name
1 broccoli
2 chicken
COMPONENTS
id name
1 carbs
2 fat
3 energy
BIBLIOGRAPHIES
id name year
1 chicken soup 1995
FOOD_COMPONENTS links foods to their components
id food_id component_id bib_id value
1 1 1 1 25 grams
2 1 2 1 13 ounces
So to get data you use a join.
SELECT * FROM FOOD_COMPONENTS fc
INNER JOIN COMPONENTS c ON fc.component_id = c.id
INNER JOIN FOODS f ON fc.food_id = f.id
INNER JOIN BIBLIOGRAPHIES b ON fc.bib_id = b.id
WHERE
b.name = 'Chicken Soup'
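A sketch of the linking table's DDL with explicit foreign keys (the column types here are assumptions):

CREATE TABLE FOOD_COMPONENTS (
    id           INT AUTO_INCREMENT PRIMARY KEY,
    food_id      INT NOT NULL,
    component_id INT NOT NULL,
    bib_id       INT NOT NULL,
    value        VARCHAR(32) NOT NULL,    -- e.g. '25 grams'
    FOREIGN KEY (food_id)      REFERENCES FOODS(id),
    FOREIGN KEY (component_id) REFERENCES COMPONENTS(id),
    FOREIGN KEY (bib_id)       REFERENCES BIBLIOGRAPHIES(id)
);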
You seriously need to consider redesigning your database structure - it isn't recommended to keep adding columns to a table when you want to store additional data that relates to it.
In a relational database you can relate tables to one another through the use of foreign keys. Since you want to store a bunch of values that relate to your data, create a new table (called values or whatever), and then use the id from your original table as a foreign key in your new table.
The design you have proposed will make writing queries a major headache, not to mention the abundance of NULL values you will have in your table, assuming you don't need to fill every column.
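For instance, a minimal sketch of such a related table (every name here is a placeholder):

CREATE TABLE food_values (
    id      INT AUTO_INCREMENT PRIMARY KEY,
    food_id INT NOT NULL,              -- foreign key to the original foods table
    name    VARCHAR(64) NOT NULL,      -- e.g. 'energy', 'carbohydrates'
    value   DECIMAL(10,2) NOT NULL,
    FOREIGN KEY (food_id) REFERENCES foods(id)
);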
Here's one approach you could take to allow you to add attributes all day long without changing your schema:
Table: Food - each row is a food you're describing
Id
Name
Description
...
Table: Attribute - each row is a numerical attribute that a food can have
Id
Name
MinValue
MaxValue
Unit (probably a 'repeating group', so should technically be in its own table)
Table: Bibliography - I don't know what this is, but you do
Id
...
Table: FoodAttribute - one record for each instance of a food having an attribute
Food
Attribute
Bibliography
Value
So you might have the following records
Food #1 = Cheeseburger
Attribute #1 = Fat (Unit = Grams)
Bibliography #1 = whatever relates to cheeseburgers and fat
Then, if a cheeseburger has 30 grams of fat, there would be an entry in the FoodAttribute table with 1 in the Food column, 1 in the Attribute column, a 1 in the Bibliography column, and 30 in the Value column.
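That entry, written as a literal INSERT (a sketch; the column names are taken from the layout above):

INSERT INTO FoodAttribute (Food, Attribute, Bibliography, Value)
VALUES (1, 1, 1, 30);    -- cheeseburger (food 1), fat (attribute 1), bibliography 1, 30 grams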
(Note, you may need some other mechanisms to deal with non-numeric attributes.)
Read about Data Modeling and Database Normalization for more info on how to approach these types of problems...
Appending more columns to a table isn't recommended nor popular in the DB world, except in a NoSQL system.
Please elaborate on your intentions :)
Why, for the love of $deity, are you doing this by columns? That way lies madness!
Decompose this table into rows, one row per former column. Without knowing more about what this is for and why it is like it is, it's hard to say more.
I re-read your question a number of times and I believe you are in fact attempting a relational schema, and your concern is the number of columns (you mention possibly 80) associated with a table. I assure you that 80 columns on a table is fine from a computational perspective; your database can handle it. From a coding perspective, it may be on the high side.
Proposed (1) will fail when you want to add a column. You're effectively storing all your columns in a single comma-delimited column. Bad.
I don't understand (2). It sounds the same as (3).
(3) is correct in spirit, but your example is muddled and unclear. Whittle your problem down to a simple case with five columns or something and edit your question or post again.
In short, don't worry about the number of columns right now. It is low on the priority list.
If you have no need to form queries based on arbitrary key/value pairs you'd like to add to every record, you could in a pinch serialize()/unserialize() an associative array and put that into a single field

Get details from another mysql table

I have a table that contains information about a certain month, and one column in each row holds row IDs for another table, used to grab multiple pieces of information.
Is there a more efficient way to get the information than exploding the IDs and doing separate SQL queries for each? Here is an example:
Row ID | Name | Other Sources
1      | Test | 1,2,7
The Other Sources column holds the IDs of the rows from the other table, which look like so:
Row ID | Name  | Information  | Link
1      | John  | No info yet? | http://blah.com
2      | Liam  | No info yet? | http://blah.com
7      | Steve | No info yet? | http://blah.com
and overall the information returned would be like the below
Hi this page is called test... here is a list of our sources
- John (No info yet?) find it here at http://blah.com
- Liam (No info yet?) find it here at http://blah.com
- Steve (No info yet?) find it here at http://blah.com
What I would do is explode the Other Sources column on "," and then do a separate SQL query for each ID; I am sure there must be a better way?
Looks like a classic many-to-many relationship. You have pages and sources - each page can have many sources and each source could be the source for many pages?
Fortunately this is very much a solved problem in relational database design. You would use a 3rd table to relate the two together:
Pages (PageID, Name)
Sources (SourceID, Name, Information, Link)
PageSources (PageID, SourceID)
The key for the "PageSources" table would be both PageID and SourceID.
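In DDL form, that key arrangement looks something like this (a sketch; the column types are assumptions):

CREATE TABLE PageSources (
    PageID   INT NOT NULL,
    SourceID INT NOT NULL,
    PRIMARY KEY (PageID, SourceID),
    FOREIGN KEY (PageID)   REFERENCES Pages(PageID),
    FOREIGN KEY (SourceID) REFERENCES Sources(SourceID)
);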
Then, to get all the sources for a page, for example, you would use this SQL:
SELECT s.*
FROM Sources s INNER JOIN PageSources ps ON s.SourceID = ps.SourceID
AND ps.PageID = 1;
Not easily with your table structure. If you had another table like:
ID Source
1 1
1 2
1 7
Then join is your friend. With things the way they are, you'll have to do some nasty splitting on comma-separated values in the "Other Sources" field.
Maybe I'm missing something obvious (been known to), but why are you using a single field in your first table with a comma-delimited set of values rather than a simple join table? The solution, if you do that, is trivial.
The problem with these tables is that a multi-valued column doesn't work well with SQL. Tables in this format are not normalized, as multi-valued columns are forbidden in First Normal Form and above.
First Normal Form means...
There's no top-to-bottom ordering to the rows.
There's no left-to-right ordering to the columns.
There are no duplicate rows.
Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
All columns are regular [i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps].
—Chris Date, "What First Normal Form Really Means", pp. 127-8
Anyway, the best way to do it is to have a many-to-many relationship. This is done by putting a third table in the middle, like Dominic Rodger does in his answer.

Questions about Php and Mysql Hash Table

I am a new PHP and MySQL programmer. I am handling quite a large amount of data, and in the future it will grow slowly, thus I am using a hash table. I have a couple of questions:
Does MySQL have a built-in hash table function? If yes, how do I use it?
After a couple of days researching hash tables, I briefly know what a hash table is, but I just cannot understand how to start creating one. I have seen a lot of hash table code on the internet. Most of it starts by creating a hashtable class. Does that mean they store the hash table values in a temporary table instead of inserting them into the MySQL database?
For questions 3,4 & 5, example scenario:
User can collect items in the website. I would like to use hash table to insert and retrieve the items that the user collected.
[Important] What could the MySQL database structure look like?
e.g. create items and users tables:
the items table has: item_id, item_name, and item_hash_value
the users table has: user_id, username, item_name, item_hash_value
I am not sure if the users table is correct?
[Important] What are the steps to create a hash table in PHP and MySQL?
(A sample code snippet would be great :))
[Important] How do I insert into and retrieve data from the hash table? I am talking about PHP and MySQL, so I hope the answers can be like: "you can use a MySQL query, i.e. SELECT * FROM blabla..."
You don't need to worry about using a hashtable with MySQL. If you intend to have a large number of items in memory while you operate on them a hashtable is a good data structure to use since it can find things much faster than a simple list.
But at the database level, you don't need to worry about the hashtable. Figuring out how to best hold and access records is MySQL's job, so as long as you give it the correct information it will be happy.
Database Structure
items table would be: item_id, item_name
Primary key is item_id
users table would be: user_id, username
Primary key is user_id
user_items table would be: user_id, item_id
Primary key is the combination of user_id and item_id
Index on item_id
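That structure in DDL would be something like the following (a sketch; the column types are assumptions):

CREATE TABLE users (
    user_id  INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(64) NOT NULL
);

CREATE TABLE items (
    item_id   INT AUTO_INCREMENT PRIMARY KEY,
    item_name VARCHAR(64) NOT NULL
);

CREATE TABLE user_items (
    user_id INT NOT NULL,
    item_id INT NOT NULL,
    PRIMARY KEY (user_id, item_id),
    KEY idx_item (item_id)
);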
Each item gets one (and only one) entry in the items table. Each user gets one (and only one) entry in the users table. When a user selects an item, it goes in the user items table. Example:
Users:
1 | Bob
2 | Alice
3 | Robert
Items
1 | Headphones
2 | Computer
3 | Beanie Baby
So if Bob has selected the headphones and Robert has selected the computer and beanie baby, the user_items table would look like this:
User_items (user_id, item_id)
1 | 1 (This shows Bob (user 1) selected headphones (item 1))
3 | 2 (This shows Robert (user 3) selected a computer (item 2))
3 | 3 (This shows Robert (user 3) selected a beanie baby (item 3))
Since the user_id and item_id on the users and items tables are primary keys, MySQL will let you access them very fast, just like a hashmap. On the user_items table having both the user_id and item_id in the primary key means you won't have duplicates and you should be able to get fast access (an index on item_id wouldn't hurt).
Example Queries
With this setup, it's really easy to find out what you want to know. Here are some examples:
Who has selected item 2?
SELECT users.user_id, users.username FROM users, user_items
WHERE users.user_id = user_items.user_id AND user_items.item_id = 2
How many things has Robert selected?
SELECT COUNT(user_items.item_id) FROM user_items, users
WHERE users.user_id = user_items.user_id AND users.username = 'Robert'
I want a list of each user and what they've selected, ordered by the user name
SELECT users.username, items.item_name FROM users, items, user_items
WHERE users.user_id = user_items.user_id AND items.item_id = user_items.item_id
ORDER BY username, item_name
There are many guides to SQL on the internet, such as the W3C's tutorial.
1) Hash tables do exist in MySQL, but they are used internally to keep track of keys on tables.
2) Hash tables work by hashing a data value to produce a key, and the data is partitioned by these keys; the hash table is used to find which key brings up the correct, smaller list to search through.
For example: you have 100 items, and searching 100 items in a row takes 10 seconds. If you know they can be separated by type of item, you can break them up into 25 rows of t-shirts, 25 rows of clocks, 25 rows of watches, and 25 rows of shoes. Then when you need to find a t-shirt, you only have to search through the 25 t-shirt rows, which takes 2.5 seconds.
3) Not sure what this question means; a MySQL database is a binary file that contains all the rows in the database.
4) As in #2, you would need to decide what you want your key to be.
5) As in #2, you need to know what your key is.
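As a side note: if you ever do want an explicit hash index, MySQL's MEMORY storage engine supports one. A minimal sketch (the table and column names are made up):

CREATE TABLE item_lookup (
    item_id   INT NOT NULL,
    item_name VARCHAR(64) NOT NULL,
    KEY idx_item_hash (item_id) USING HASH
) ENGINE = MEMORY;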
If you think a hash table is the right way to store your data, you may want to use a key-value database like CouchDB instead of MySQL. They show you how to get started with PHP.
I am a new php and mysql programmer. I am handling quite large amount of data, and in future it will grow slowly, thus I am using hash table.
Looking at your original purpose, use "memcache" instead; it is the most scalable solution while requiring minimal changes in your code. You can scale up the memcache servers as your data grows larger and larger.
