MySQL: insert multiple values into a single column or multiple rows - PHP

Just want to ask for an opinion regarding MySQL.
Which one is the better solution?
case1:
store in 1 row:-
product_id:1
attribute_id:1,2,3
When I retrieve the data, I split the string by ','.
I have seen some databases store the data this way: each record is a product, and one column holds the serialized product attributes:
a:3:{s:4:"spec";a:2:{i:1;s:6:"black";i:3;s:2:"37";}s:21:"spec_private_value_id";a:2:{i:1;s:11:"12367591683";i:3;s:11:"12367591764";}s:13:"spec_value_id";a:2:{i:1;s:1:"5";i:3;s:2:"29";}}
or
case2:
store in 3 row:-
product_id:1
attribute_id:1
product_id:1
attribute_id:2
product_id:1
attribute_id:3
This is what I normally do: store 3 rows, one per attribute, for a record.
In terms of performance and space, can anyone tell me which one is better? From what I see, case 1 saves space but needs the data to be processed in PHP (or another server-side scripting language).
Case 2 is more straightforward, but uses more space.

Save space? Seriously? You're talking about saving bytes when a one terabyte disk goes for 70 dollars?
And maybe you're not even saving bytes. If you store attributes as "12234,23342,243234", that's 18 bytes for 3 attributes. If you stored them as 2-byte SMALLINTs (which fit IDs up to 65,535 when unsigned), they'd take up 6 bytes; even 4-byte INTs would need only 12.

It depends on whether the attributes matter for searching later, for example.
Keeping the attributes as a serialized array in a single field may be fine if you don't actually care about them in queries, for example if you will never need to list all products that have a given attribute.
However, finding all products that have a given attribute is "lousy" at best if the attributes are comma-separated (you need to use LIKE), and if you store them as serialized arrays they are completely unusable for any kind of sorting or grouping in SQL queries.
Using a separate table for the many-to-many relation between products and attributes is far better if the attributes are of any importance for selecting, grouping or sorting other data.
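To make the LIKE problem concrete, here is a minimal sketch of both layouts. It uses Python with an in-memory SQLite database standing in for MySQL so that it runs anywhere; the table and column names (product_csv, product_attribute, attribute_ids) are illustrative assumptions, not names from the question.

```python
import sqlite3

# SQLite stands in for MySQL here; all table/column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Case 1: comma-separated attribute IDs in one column
    CREATE TABLE product_csv (product_id INTEGER PRIMARY KEY, attribute_ids TEXT);
    INSERT INTO product_csv VALUES (1, '1,2,3'), (2, '12,13'), (3, '2,21');

    -- Case 2: one row per (product, attribute) pair
    CREATE TABLE product_attribute (
        product_id   INTEGER NOT NULL,
        attribute_id INTEGER NOT NULL,
        PRIMARY KEY (product_id, attribute_id)
    );
    CREATE INDEX idx_attribute ON product_attribute (attribute_id);
    INSERT INTO product_attribute VALUES
        (1, 1), (1, 2), (1, 3), (2, 12), (2, 13), (3, 2), (3, 21);
""")

# A naive LIKE search for attribute 1 also matches 12, 13 and 21.
naive = [row[0] for row in conn.execute(
    "SELECT product_id FROM product_csv "
    "WHERE attribute_ids LIKE '%1%' ORDER BY product_id")]

# The separate table answers the same question with an indexable equality test.
exact = [row[0] for row in conn.execute(
    "SELECT product_id FROM product_attribute "
    "WHERE attribute_id = 1 ORDER BY product_id")]

print(naive)  # [1, 2, 3]
print(exact)  # [1]
```

The LIKE pattern can be tightened to avoid the false matches, but it still has to scan every row, whereas the equality test on the separate table can use the index on attribute_id.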

In case 1, although you may save space, time is spent splitting the string.
You must also take care of the size of the field: if you have 50 products with 2 attributes and one with 100 attributes, you must make the field about VARCHAR(200) to fit the largest one... You will not save space at all.
I think case 2 is the best and recommended solution.

You need to consider the SELECT statements that will use these values. If you want to search for records that have certain attributes, it is much more efficient to store the attributes separately and index them. Otherwise you end up with LIKE queries, which take much longer to process.

Related

How to insert data in the nested set model (MySQL)

In the nested set model we have LEFT and RIGHT columns.
The first time, when the table is empty, what do I need to insert into the RIGHT column if I don't know how many children I will have?
LEFT 1 - forever
RIGHT ? - what value goes here??
How can I make it dynamic, not static?
PS: using PHP
I'm assuming from your tags and title that you are looking for a solution that works with MySQL.
Yes, you are right that unless you know the number of elements in advance the value for right needs to be calculated dynamically. There are two approaches you can use:
You could start with the least value that works (2 in this case) and increase it later as needed.
You could just make a guess like 10000000 and hope that's enough, but you need to be prepared for the possibility that it wasn't enough and may need adjusting again later.
In both cases you need to handle the fact that the left and right values of multiple rows may need adjusting when new rows are inserted, but in the second case you only actually perform the updates if your guess was wrong. So the second solution is more complex, but can give better performance.
Note that of the four common ways to store hierarchical data, the nested sets approach is the hardest for inserts and updates. See slide 69 of Bill Karwin's Models for Hierarchical Data.
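The first approach (start with the least value that works and grow it as needed) can be sketched as follows. This is only an illustration in Python with an in-memory SQLite database standing in for MySQL; the table name tree, the columns lft and rgt, and the helper functions are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for MySQL; table, column and function names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)")

def add_root(name):
    # In an empty table the root is a leaf, so left = 1 and right = 2.
    conn.execute("INSERT INTO tree (name, lft, rgt) VALUES (?, 1, 2)", (name,))

def add_child(parent_name, name):
    # Insert the new node as the last child of the given parent.
    (parent_rgt,) = conn.execute(
        "SELECT rgt FROM tree WHERE name = ?", (parent_name,)).fetchone()
    # Shift every boundary at or after the parent's right edge by 2 to open a gap.
    conn.execute("UPDATE tree SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,))
    conn.execute("UPDATE tree SET lft = lft + 2 WHERE lft > ?", (parent_rgt,))
    # The new leaf occupies the gap.
    conn.execute("INSERT INTO tree (name, lft, rgt) VALUES (?, ?, ?)",
                 (name, parent_rgt, parent_rgt + 1))

add_root("root")
add_child("root", "A")
add_child("root", "B")
add_child("A", "C")

rows = conn.execute("SELECT name, lft, rgt FROM tree ORDER BY lft").fetchall()
print(rows)  # [('root', 1, 8), ('A', 2, 5), ('C', 3, 4), ('B', 6, 7)]
```

On a real MySQL server the two UPDATE statements and the INSERT would need to run inside one transaction (or under a table lock) so that concurrent inserts cannot interleave and corrupt the numbering.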

How to store searchable arrays in MySQL

So I've got this form with an array of checkboxes to search for an event. When you create an event, you choose one or more of the checkboxes and then the event gets created with these "attributes". What is the best way to store it in a MySQL database if I want to filter results when searching for these events? Would creating several columns with boolean values be the best way? Or possibly a new table with the checkbox values only?
I'm pretty sure serializing is out of the question, because I wouldn't be able to query the serialized string for whether a checkbox was ticked or not, right?
Thanks
You can use the set datatype or a separate table that you join. Either will work.
I would not do a bunch of columns though.
You can search the set easily using FIND_IN_SET(), but it's not indexed, so it depends on how many rows you expect (up to a few thousand is probably OK - it's a very fast search).
The normal solution is a separate table with one column being the ID of the event, and the second column being the attribute using the enum datatype (don't use text, it's slower).
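As a rough sketch of the separate-table approach, the following finds events that have every ticked checkbox. It is written in Python against an in-memory SQLite database rather than MySQL, so the attribute column is plain TEXT instead of ENUM; the table name event_attribute, the sample attribute values and the helper events_with_all are all made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; names and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_attribute (
        event_id  INTEGER NOT NULL,
        attribute TEXT    NOT NULL,   -- would be an ENUM column in MySQL
        PRIMARY KEY (event_id, attribute)
    );
    INSERT INTO event_attribute VALUES
        (1, 'outdoor'), (1, 'free'),
        (2, 'outdoor'),
        (3, 'free'), (3, 'family'), (3, 'outdoor');
""")

def events_with_all(attributes):
    """Return IDs of events that have every attribute in the (non-empty,
    duplicate-free) list."""
    placeholders = ",".join("?" * len(attributes))
    sql = f"""
        SELECT event_id
        FROM event_attribute
        WHERE attribute IN ({placeholders})
        GROUP BY event_id
        HAVING COUNT(*) = ?
        ORDER BY event_id
    """
    return [row[0] for row in conn.execute(sql, (*attributes, len(attributes)))]

print(events_with_all(["outdoor", "free"]))  # [1, 3]
print(events_with_all(["family"]))           # [3]
```

The primary key on (event_id, attribute) guarantees that each pair appears once, which is what makes comparing COUNT(*) with the number of requested attributes a valid "has all of them" test.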
Create separate columns, or store them all in one column using a bitmask.
One way would be to create a new table with a column for each checkbox, as already described by others. I'll not add to that.
However, another way is to use a bitmask. You have just one column, myCheckboxes, and store the values as an int. Then in the code you have constants, or another appropriate way, to store the correlation between each checkbox and its bit. For example:
CHECKBOX_ONE 1
CHECKBOX_TWO 2
CHECKBOX_THREE 4
CHECKBOX_FOUR 8
...
CHECKBOX_NINE 256
Remember to always use the next power of two for new values, otherwise you'll get values that overlap.
So, if the first two checkboxes have been checked you should have 3 as the value of myCheckboxes for that row. If you have ONE and FOUR checked you'd have 9 as the values of myCheckboxes, etc. When you want to see which rows have say checkboxes ONE, THREE and NINE checked your query would be like:
SELECT * FROM myTable where myCheckboxes & 1 AND myCheckboxes & 4 AND myCheckboxes & 256;
This query will return only rows that have all of these checkboxes marked as checked.
You should also use bitwise operations when storing and reading the data.
This is a very efficient approach when it comes to speed. You have just a single column, probably just a SMALLINT, and your searches are pretty fast. This can make a big difference if you have several different collections of checkboxes that you want to store and search through. However, it makes the values harder to understand. If you see the value 261 in the DB, it is not easy for a human to see immediately that this means checkboxes ONE, THREE and NINE are checked, whereas separate columns for each checkbox are much easier to read. This normally is not an issue, because humans don't need to poke the database manually, but it's worth mentioning.
From the coding perspective it's not much of a difference, but you'll have to be careful not to corrupt the values: it is far easier to mess up a single int than data stored in separate columns. So test carefully when adding new stuff. All that said, the speed and low memory benefits can be very big if you have a ton of different collections.
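The bitmask scheme above might look like this in application code. The sketch is in Python with SQLite standing in for MySQL (the & operator behaves the same way for this purpose); the Checkbox class is a hypothetical stand-in for the constants, and only the bits listed in the answer are defined.

```python
import sqlite3
from enum import IntFlag

class Checkbox(IntFlag):
    # Each checkbox gets its own power of two (only some bits shown here).
    ONE = 1
    TWO = 2
    THREE = 4
    FOUR = 8
    NINE = 256

# Combine flags with bitwise OR: ONE, THREE and NINE give 1 + 4 + 256 = 261.
wanted = int(Checkbox.ONE | Checkbox.THREE | Checkbox.NINE)

# Decode a stored value back into readable names.
names = [flag.name for flag in Checkbox if flag in Checkbox(261)]

# Tick FOUR and untick ONE on a stored value of 3 (ONE and TWO checked).
updated = (3 | int(Checkbox.FOUR)) & ~int(Checkbox.ONE)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, myCheckboxes INTEGER NOT NULL)")
conn.executemany("INSERT INTO myTable VALUES (?, ?)",
                 [(1, 3), (2, 9), (3, 261), (4, 263), (5, 257)])

# Rows that have all wanted bits set: masking leaves exactly the wanted bits.
matches = [row[0] for row in conn.execute(
    "SELECT id FROM myTable WHERE (myCheckboxes & ?) = ? ORDER BY id",
    (wanted, wanted))]

print(wanted)   # 261
print(names)    # ['ONE', 'THREE', 'NINE']
print(updated)  # 10
print(matches)  # [3, 4]
```

Comparing the masked value with the mask itself is equivalent to chaining one `myCheckboxes & n` test per bit with AND, but needs only one expression regardless of how many checkboxes are selected.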

best way to store options in a db

I have a table, and one of the columns is co_com.
This holds communication preferences.
There are three options (and only ever will be).
I don't want to have a separate column for each of these values,
so I was thinking of storing them as
sms/email/fax
sms = yes
email = no
fax = yes
which would be stored as: 101
but
I'm thinking that's not the best way.
What other ways can you see?
Yes, I am aware that this is a subjective question,
but I'm not sure how else to ask.
You're correct. That is in fact not the best way.
You say you don't want to have separate columns for these values, but that's exactly what you should be doing.
Storing combinations of logical values as coded binary is... so last century. Seriously, how much does disk space cost these days, and how much do you save by cramming three bits of information into a single number rather than three bytes or characters?
Go on, create three columns with sensible names, and store either 0s and 1s in them or, if your DB is weird that way, store 'Y' and 'N'. But don't do this binary cleverness stuff. It will bite you eventually when you try to write sensible queries.
In my mind, columns are the best way to go, for ease of use if nothing else. The columns are straightforward and won't be confusing in the future. BUT I wouldn't say storing them in a single column as 3 digits is necessarily bad, just confusing. Save yourself the headaches later and use 3 columns.
Another option would be to have a second table, called com_options for example. Give it an ID field and an options field, and store each combination of communication options in the options field with a unique ID in the ID field. In your co_com table, add an opt_id field referencing the ID in the com_options table. Then use an INNER JOIN to join the two tables together.
If your DB is MySQL, then you can use SET datatype.
It's OK, don't worry -- sometimes we should denormalize tables :)
If your DB isn't MySQL, you can still use this method, but the implementation won't be native to your DB. Also, bitwise logic on a large amount of data performs very well compared with the default normalized one-to-many relation, because it's more computer-oriented.

Count line breaks in a field and order by

I have a field in a table recipes that has been inserted using mysql_real_escape_string. I want to count the number of line breaks in that field and order the records by this number.
P.S. The field is called Ingredients.
Thanks everyone
This would do it:
SELECT *, LENGTH(Ingredients) - LENGTH(REPLACE(Ingredients, '\n', '')) as Count
FROM Recipes
ORDER BY Count DESC
The way I am getting the number of line breaks is a bit of a hack, however, and I don't think there's a better way. I would recommend keeping a column that holds the number of line breaks if performance is a huge issue. For medium-sized data sets, though, I think the above should be fine.
If you wanted to have a cache column as described above, you would do:
UPDATE
Recipes
SET
IngredientAmount = LENGTH(Ingredients) - LENGTH(REPLACE(Ingredients, '\n', ''))
After that, whenever you are updating/inserting a new row, you could calculate the amounts (probably with PHP) and fill in this column before-hand. Or, if you're into that sort of thing, try out triggers.
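The counting trick and the one-off backfill of the cache column can be tried out as follows. This is a sketch in Python against an in-memory SQLite database rather than MySQL, with made-up sample recipes. One difference worth noting: SQLite does not interpret '\n' inside a string literal, so the newline is passed in as a bound parameter instead.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Recipes ("
    "id INTEGER PRIMARY KEY, Ingredients TEXT, IngredientAmount INTEGER)")
conn.executemany("INSERT INTO Recipes (id, Ingredients) VALUES (?, ?)", [
    (1, "flour\r\nsugar\r\neggs"),
    (2, "salt"),
    (3, "rice\r\nwater"),
])

# Backfill the cached count for every existing row in a single statement:
# removing each newline shortens the string by one character per line break.
conn.execute(
    "UPDATE Recipes SET IngredientAmount = "
    "LENGTH(Ingredients) - LENGTH(REPLACE(Ingredients, ?, ''))",
    ("\n",))

ordered = conn.execute(
    "SELECT id, IngredientAmount FROM Recipes "
    "ORDER BY IngredientAmount DESC").fetchall()
print(ordered)  # [(1, 2), (3, 1), (2, 0)]
```

Counting only the "\n" character still gives the right number of breaks when the text uses "\r\n" line endings, because each break contains exactly one "\n".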
I'm assuming a lot here, but from what I'm reading in your post, you could change your database structure a little bit, and both solve this problem and open your dataset up to more interesting uses.
If you separate ingredients into their own table, and use a linking table to record which ingredients occur in which recipes, it'll be much easier to be creative with data manipulation. It becomes easier to count ingredients per recipe, to find similarities between recipes, to search for recipes containing sets of ingredients, etc. Your data would also be more normalized and smaller (storing one global list of all ingredients vs. storing a set for each recipe).
If you're using a single text entry field to enter ingredients for a recipe now, you could break up that input by lines and use each line as an ingredient when saving to the database. You can use something like PHP's built-in levenshtein() or similar_text() functions to deal with misspelled ingredient names and keep the data as normalized as possible without having to hand-groom your users' data entry too much.
This is just a suggestion, take it as you like.
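To illustrate the idea of splitting the input by lines and matching each line against known ingredient names, here is a small sketch. It uses Python's difflib.get_close_matches as a stand-in for PHP's levenshtein() and similar_text(); the list of known ingredients, the normalize helper and the 0.8 similarity cutoff are assumptions chosen for the example.

```python
from difflib import get_close_matches

# Hypothetical global list of ingredients already stored in the database.
known = ["tomato", "potato", "flour", "sugar"]

def normalize(line, known, cutoff=0.8):
    """Map one input line to a known ingredient if it is similar enough,
    otherwise keep it as a new ingredient name."""
    name = line.strip().lower()
    match = get_close_matches(name, known, n=1, cutoff=cutoff)
    return match[0] if match else name

# One text field as submitted by a user, one ingredient per line.
raw = "Tomatoe\r\nsugarr\r\nquinoa"
cleaned = [normalize(line, known) for line in raw.splitlines()]
print(cleaned)  # ['tomato', 'sugar', 'quinoa']
```

A cutoff that is too low would merge genuinely different ingredients, so the threshold is something to tune against real data.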
You're going a bit beyond the capabilities and intent of SQL here. You could write a stored procedure to scan the string and return the number and then use this in your query.
However, I think you should revisit the design of whatever is inserting the Ingredients so that you avoid searching the strings of every row whenever you run this query. Add a 'num_linebreaks' column, calculate the number of line breaks, and set this column when you're adding the Ingredients.
If you've no control over the app that's doing the insertion, then you could use a stored procedure to update num_linebreaks based on a trigger.
Got it, thanks. The PHP code looks like:
$check = explode("\r\n", $_POST['ingredients']);
$lines = count($check);
So how could I update all the previous records in the table in one fell swoop, setting Ingred_count based on the Ingredients field?

Get the greatest value from serialized data in a MySQL column with PHP

What is the way to get the greatest value out of serialized data? For example, I have this in my column 'rating':
a:3:{s:12:"total_rating";i:18;s:6:"rating";i:3;s:13:"total_ratings";i:6;}
How can I select the 3 greatest 'rating' values with a query?
Thanks a lot
You're probably looking at a pile of SUBSTRING_INDEX(field,':',#offset) calls if you want to do it in SQL. It would be very grisly. Storing a serialized version of an object in the DB is a convenience for persistence, but it should not be considered a permanent storage method. If you insist on using the serialized string for queries, you've lost all the power of a relational DB and you might as well store the strings in a text file.
The best option is to use the serialized string only for persistence purposes (like remembering what the user was doing last time they visited), and store the data you need for calculations in properly normalized fields and tables. Then you can easily query what you need to know.
The other option is to select all the 'rating' strings from rows whose fields meet certain other criteria (e.g. the date_added field is within the last week), reinstantiate all the objects in your application layer and compare them there.
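The application-layer option could look roughly like this. The sketch is in Python rather than PHP, so instead of calling unserialize() it pulls the integer 'rating' entry out of the string with a regular expression; that shortcut only works for this exact key and value type and is not a general parser for PHP's serialize format. The first sample row is the string from the question, the others are invented.

```python
import heapq
import re

# Matches the integer stored under the 6-character key "rating" only;
# "total_rating" and "total_ratings" have different s:N: length prefixes.
RATING_RE = re.compile(r's:6:"rating";i:(-?\d+);')

def rating_of(serialized):
    """Extract the integer 'rating' value, or None if it is missing."""
    match = RATING_RE.search(serialized)
    return int(match.group(1)) if match else None

# (id, serialized rating column) pairs as they might come back from a SELECT.
rows = [
    (1, 'a:3:{s:12:"total_rating";i:18;s:6:"rating";i:3;s:13:"total_ratings";i:6;}'),
    (2, 'a:3:{s:12:"total_rating";i:40;s:6:"rating";i:5;s:13:"total_ratings";i:8;}'),
    (3, 'a:3:{s:12:"total_rating";i:4;s:6:"rating";i:1;s:13:"total_ratings";i:4;}'),
    (4, 'a:3:{s:12:"total_rating";i:36;s:6:"rating";i:4;s:13:"total_ratings";i:9;}'),
]

scored = [(row_id, rating_of(data)) for row_id, data in rows]
top3 = heapq.nlargest(3, scored, key=lambda pair: pair[1])
print(top3)  # [(2, 5), (4, 4), (1, 3)]
```

Every candidate row has to be fetched and decoded before it can be compared, which is the cost the answer warns about; a plain integer rating column would let the database do this with ORDER BY rating DESC LIMIT 3.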
