DB Design - Suggestions

I'm really confused about what to do and need your suggestions for my DB design.
First of all,
As you can see in the table, I have id, name, eventCategory, totalEvents and date. The table is named InitPlayer. InitPlayer is an eventAction, and I have 3 more eventActions.
As you can see in the table, the eventCategory items are always repeated because the dates change.
At first I thought I would keep eventCategory as a separate table and retrieve items according to it.
What are your DB design suggestions according to this picture?
Thank you

I think you should normalize the crap out of your database.
If you're continually reusing the same eventCategory items, you should make a small eventCategories table with eventCategory_id and eventCategory_title fields (or something similar) and store their names there. Then either reference the ID of the eventCategory in the initPlayer table, or create a table that references both initPlayer IDs and eventCategories IDs.
Normalizing and separating will help you to maintain order within your database and help keep you sane. You'll have a little more work with your queries, but it's worth it if you want to scale, or say, change the name of a specific eventCategory.
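Here is a minimal sketch of that layout, written in Python with an in-memory SQLite database standing in for MySQL so it can be run anywhere. The column names follow the question and the answer above; the sample rows and category titles are invented for illustration.

```python
import sqlite3

# SQLite in memory stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eventCategories (
    eventCategory_id    INTEGER PRIMARY KEY,
    eventCategory_title TEXT NOT NULL UNIQUE
);
CREATE TABLE initPlayer (
    id               INTEGER PRIMARY KEY,
    name             TEXT NOT NULL,
    eventCategory_id INTEGER NOT NULL
                     REFERENCES eventCategories (eventCategory_id),
    totalEvents      INTEGER NOT NULL,
    date             TEXT NOT NULL
);
""")

conn.execute("INSERT INTO eventCategories VALUES (1, 'Trailer')")
conn.executemany(
    "INSERT INTO initPlayer (name, eventCategory_id, totalEvents, date) "
    "VALUES (?, ?, ?, ?)",
    [("InitPlayer", 1, 10, "2013-01-01"), ("InitPlayer", 1, 7, "2013-01-02")],
)

# Renaming a category touches exactly one row, however many dates refer to it.
conn.execute(
    "UPDATE eventCategories SET eventCategory_title = 'Teaser' "
    "WHERE eventCategory_id = 1"
)

rows = conn.execute("""
    SELECT c.eventCategory_title, p.totalEvents, p.date
    FROM initPlayer AS p
    JOIN eventCategories AS c ON c.eventCategory_id = p.eventCategory_id
    ORDER BY p.date
""").fetchall()
print(rows)
```

The extra JOIN is the "little more work" mentioned above; the payoff is that a rename is a single-row UPDATE.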

"At first I thought I would keep eventCategory as a separate table and retrieve items according to it."
Hold that thought, and do it. Then create a UserInitEventCategory table, where you link userInit.ID to eventCategory.ID.

Find the underlying field from a query/view field

Why?
I am trying to dynamically find where foreign keys point. For this I search information_schema.KEY_COLUMN_USAGE. It works fine for tables, but not for views.
Views are listed in information_schema.VIEWS, where the view_definition field exposes the query.
I think this is the only place where I will find information about where view fields come from, right?
Then, I would search for my field name between the SELECT and the FROM. If it is an alias, get the table field name and the table name (and resolve the table if it is an alias).
One last complication: a view can refer to another view, so the code will have to be recursive.
Let's take an example (view name is vw_mandates_articles):
select ma.*, a.id_articles_unit, a.id_articles_category from mandates_articles ma
left join articles a on ma.id_article = a.id
The way it is stored in the VIEWS table is:
select `ma`.`id` AS `id`,
`ma`.`id_mandate` AS `id_mandate`,
`ma`.`id_article` AS `id_article`,
`ma`.`unit_price` AS `unit_price`,
`ma`.`description` AS `description`,
`a`.`id_articles_unit` AS `id_articles_unit`,
`a`.`id_articles_category` AS `id_articles_category`
from (`ste`.`mandates_articles` `ma`
left join `ste`.`articles` `a` on((`ma`.`id_article` = `a`.`id`)))
My inputs are:
  the view name (vw_mandates_articles)
  the field name (id_articles_category)
The expected output:
  the field table (ste.articles)
  the field name (id_articles_category), which could be the same as the input but not necessarily
I am not asking someone to write it for me; I just want to validate the approach before digging in.
Any thoughts? Good/bad approach, alternatives?
Thanks in advance for your insights.

Yes. For views, the only place the underlying fields are recorded is the query stored in the information_schema.VIEWS table.
No, there is no better way than splitting up and parsing that query.
I wouldn't recommend views that reference other views. They are likely to be slow: MySQL may have to store the temporary result(s) on the hard disk, which does not help performance.
Even if it isn't best practice, I'd tend to accept more redundancy and fetch the data with one single query (with at most one subselect).
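For what it's worth, the parsing approach described in the question can be sketched roughly as follows. This is Python rather than PHP, and it only handles the normalized form shown above (backquoted `alias`.`column` AS `name` pairs and `schema`.`table` `alias` in the FROM clause). The function name resolve_view_field is made up, and expressions, subqueries and views built on other views are deliberately not handled.

```python
import re

# The view definition exactly as quoted from information_schema.VIEWS above.
VIEW_DEFINITION = """select `ma`.`id` AS `id`,
`ma`.`id_mandate` AS `id_mandate`,
`ma`.`id_article` AS `id_article`,
`ma`.`unit_price` AS `unit_price`,
`ma`.`description` AS `description`,
`a`.`id_articles_unit` AS `id_articles_unit`,
`a`.`id_articles_category` AS `id_articles_category`
from (`ste`.`mandates_articles` `ma`
left join `ste`.`articles` `a` on((`ma`.`id_article` = `a`.`id`)))"""


def resolve_view_field(view_definition, field_name):
    """Return (schema.table, column) behind a view field, or None."""
    # Split the statement into the select list and everything after FROM.
    parts = re.search(r"^\s*select\s+(.*?)\s+from\s+(.*)$",
                      view_definition, re.IGNORECASE | re.DOTALL)
    if parts is None:
        return None
    select_list, from_clause = parts.groups()

    # Map each table alias to its schema-qualified table name.
    aliases = {
        alias: f"{schema}.{table}"
        for schema, table, alias in re.findall(
            r"`(\w+)`\.`(\w+)`\s+`(\w+)`", from_clause)
    }

    # Find the select-list entry whose output name matches the field.
    for alias, column, output in re.findall(
            r"`(\w+)`\.`(\w+)`\s+AS\s+`(\w+)`", select_list, re.IGNORECASE):
        if output == field_name and alias in aliases:
            return aliases[alias], column
    return None


print(resolve_view_field(VIEW_DEFINITION, "id_articles_category"))
```

If the resolved table turns out to be a view itself, the same function would have to be called again on that view's definition, which is the recursive case mentioned in the question.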

MySQL database optimization for 20.000 users or more

I have been looking for optimization tips, since I'm building an RPG modification that uses PHP to store data in MySQL.
I'm using a single table to store all user information in columns, keyed by the user's unique ID, and I have to store a lot of data for each user: weapons and other information.
I'm using explode and implode to store the weapons, for example, in one column of type TEXT. I don't know if that's good practice, and I don't know if I will have performance problems once thousands of players are issuing tons of UPDATE, SELECT, etc. requests.
I read that a junction table may be better for storing the weapons and all that information, but I don't know if fetching the data that way is better than using the explode method.
I mean, I would store all the weapons in a different table, each weapon in its own row with its information (each weapon has several pieces of information, like different columns; right now I use multiple explode calls for that inside the main explode) plus the user who owns the weapon, rather than having them all in one column.
There can be 100 items or more to store per user. I don't know if it's better to create 100 records per user in a different table and fetch all of them every time, rather than just fetching the column and using explode.
I also want to improve my skills and knowledge so I can build the best-performing MySQL database I can.
I hope somebody can give me some advice.
Thanks, and sorry for my English grammar.

It is almost always best practice to normalize your table data. There are some exceptions to this rule (especially in very high volume databases), but you probably do not need to worry about those exceptions until you get to the point of first understanding how to properly normalize and index your tables.
Typically, try to arrange your tables in a way that mimics real-world objects and their relations to each other.
So, in your case you have users - that is one table. Each user might have multiple weapons, so you now have a weapons table. Since multiple different users might have the same weapon and each user might have multiple weapons, you have a many-to-many relationship between them, so you should have a table "users_weapons" or similar that does nothing but relate user IDs to weapon IDs.
Now say the users can all have armor. So now you add an armor table and a users_armor table (as this is likely many-to-many as well).
Just think through the different aspects of your game and try to understand the relationships between them. Make sure you can model these relationships in database tables before you even bother writing any code to actually implement the functionality.
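As a rough sketch of the users / weapons / users_weapons layout described above (Python with in-memory SQLite standing in for MySQL; the column names and sample rows are assumptions, only the three table names come from the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL
);
CREATE TABLE weapons (
    weapon_id   INTEGER PRIMARY KEY,
    weapon_name TEXT NOT NULL
);
-- Junction table: one row per (user, weapon) pair and nothing else.
CREATE TABLE users_weapons (
    user_id   INTEGER NOT NULL REFERENCES users (user_id),
    weapon_id INTEGER NOT NULL REFERENCES weapons (weapon_id),
    PRIMARY KEY (user_id, weapon_id)
);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO weapons VALUES (1, 'sword'), (2, 'bow');
INSERT INTO users_weapons VALUES (1, 1), (1, 2), (2, 1);
""")


def weapons_of(user_id):
    """Return the names of all weapons owned by one user."""
    rows = conn.execute("""
        SELECT w.weapon_name
        FROM users_weapons AS uw
        JOIN weapons AS w ON w.weapon_id = uw.weapon_id
        WHERE uw.user_id = ?
        ORDER BY w.weapon_name
    """, (user_id,)).fetchall()
    return [name for (name,) in rows]


print(weapons_of(1))
print(weapons_of(2))
```

The composite primary key on users_weapons also acts as the index for the "all weapons of user X" lookup, so fetching 100 rows per user is one indexed range read rather than 100 separate queries.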

Yes, it is better to use several tables instead of one. It's better for DB performance, easier to understand, easier to maintain and simpler to use as well.
Let's suppose that one user has several weapons, each with multiple features (which are not unique among all weapons), and that in one place in your game you just need to know the value of one specific feature.
Doing it your way, you'll need to find the user's row in the users table, fetch one column and explode it several times before you have your value, and it gets even more complicated if you want to change the value and save it again.
A better way is to have one table for user details (login, password, email, etc.), another table that keeps the user's weapons (name of the weapon, maybe an image) and a table that holds all the features and special powers of the weapons. You could keep all possible features of all weapons in an extra table as well. This way, if you already know the user ID from the users table, you only have to join 2 tables in your SQL query to get the value of a feature of a specific weapon of that user.
Example pseudo schema of tables:
users
    user_id
    user_name
    password
    email
weapons
    weapon_id
    user_id
    weapon_name
    image
weapons_features
    feature_id
    weapon_id
    feature_name
    feature_value
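To make the "join 2 tables" point concrete, here is a small sketch of the pseudo schema above with one lookup query. It uses Python with in-memory SQLite in place of MySQL; the column types, the helper name feature_value and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT, password TEXT, email TEXT
);
CREATE TABLE weapons (
    weapon_id   INTEGER PRIMARY KEY,
    user_id     INTEGER REFERENCES users (user_id),
    weapon_name TEXT, image TEXT
);
CREATE TABLE weapons_features (
    feature_id    INTEGER PRIMARY KEY,
    weapon_id     INTEGER REFERENCES weapons (weapon_id),
    feature_name  TEXT,
    feature_value TEXT
);
INSERT INTO users VALUES (1, 'alice', 'hash', 'alice@example.com');
INSERT INTO weapons VALUES (1, 1, 'sword', NULL), (2, 1, 'bow', NULL);
INSERT INTO weapons_features VALUES
    (1, 1, 'damage', '12'), (2, 1, 'range', '1'), (3, 2, 'damage', '7');
""")


def feature_value(user_id, weapon_name, feature_name):
    """Look up one feature of one weapon of a known user with a single join."""
    row = conn.execute("""
        SELECT f.feature_value
        FROM weapons AS w
        JOIN weapons_features AS f ON f.weapon_id = w.weapon_id
        WHERE w.user_id = ? AND w.weapon_name = ? AND f.feature_name = ?
    """, (user_id, weapon_name, feature_name)).fetchone()
    return row[0] if row else None


print(feature_value(1, "sword", "damage"))
```

Changing that value later is a single UPDATE on weapons_features, with no exploding and re-imploding of a text column.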
And if you really want to store some ordered data in a text field in the database, encode it as JSON or serialize it. This way you don't have to explode and implode it!
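A tiny sketch of the JSON idea, shown in Python (in PHP the corresponding built-ins are json_encode() and json_decode()). The weapon fields are invented for illustration.

```python
import json

# Nested data that would otherwise need several levels of explode/implode.
weapons = [
    {"name": "sword", "damage": 12},
    {"name": "bow", "damage": 7},
]

stored = json.dumps(weapons)    # the string that goes into the TEXT column
restored = json.loads(stored)   # what you get back after reading the column

print(stored)
print(restored[1]["name"])
```

This keeps the structure intact, but the values are still opaque to SQL, so searching or updating a single weapon remains awkward compared with separate tables.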

As everyone said, you should typically start from a normalized database structure.
If performance is OK, then great, nothing to do.
If not, you can try many different things:
Find and optimize the queries that run slowly.
Denormalize - sometimes joins kill performance.
Change the data access pattern used in the application.
Store data in the file system or use a NoSQL/polyglot persistence solution.

How to store multi-valued profile details?

I have many fields which are multi-valued and I am not sure how to store them. If I go to 3NF then there are many tables. For example: nationality.
A person can have single or dual nationality. If dual, this means it is a 1-to-many relationship. So I create a user table and a user_nationality table (there is already a nationality lookup table). Or I could put both nationalities into the same row, like "American, German", and then unserialize it at run time. But then I don't know if I can search this. If I search for only German people, will it show up?
This is an example; I have over 30 fields which are multi-valued, so I assume I will not be creating 61 tables for this? 1 user table, 30 lookup tables to hold each multi-valued item's lookups, and 30 tables to hold the user_ values for the multi-valued items?
You must also keep in mind that some multi-valued fields group together. For example, "colleges I have studied at" has a group of fields such as college name, degree type, time line, etc., and a user can have 1 to many of these. So I assume I can create a separate table for this, like user_education, with these fields. But let's assume one of these fields is also a fixed-list multi-valued field, like "college campuses I visited"; then we end up in a never-ending chain of FK tables, which isn't a good design for social networks, as the goal is to put as much data into as few tables as possible for performance.

If you need to keep using SQL, you will need to create these tables. You will need to decide how far you are willing to go, and impose limitations on the system (such as only being able to specify one campus).
As far as nationality goes, if you will only require two nationalities (worst-case scenario), you could consider a second nationality field (Nationality and Nationality2) to account for this. Of course this only applies to fields with a small maximum number of different values.
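On the "will German people show up" part of the question: with a user_nationality table the search is a plain join. A minimal sketch, in Python with in-memory SQLite standing in for MySQL; the table is called users here, and the column names and sample people are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE nationality (
    nationality_id INTEGER PRIMARY KEY,
    title          TEXT NOT NULL UNIQUE
);
CREATE TABLE user_nationality (
    user_id        INTEGER NOT NULL REFERENCES users (user_id),
    nationality_id INTEGER NOT NULL REFERENCES nationality (nationality_id),
    PRIMARY KEY (user_id, nationality_id)
);
INSERT INTO users VALUES (1, 'Anna'), (2, 'Ben'), (3, 'Cara');
INSERT INTO nationality VALUES (1, 'American'), (2, 'German');
-- Anna has dual nationality, Ben and Cara have one each.
INSERT INTO user_nationality VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")


def users_with_nationality(title):
    """Names of all users holding the given nationality, dual or not."""
    rows = conn.execute("""
        SELECT u.name
        FROM users AS u
        JOIN user_nationality AS un ON un.user_id = u.user_id
        JOIN nationality AS n ON n.nationality_id = un.nationality_id
        WHERE n.title = ?
        ORDER BY u.name
    """, (title,)).fetchall()
    return [name for (name,) in rows]


print(users_with_nationality("German"))
```

With a serialized "American, German" string the same search would need a LIKE scan over every row, which cannot use an ordinary index.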

If your user table has a lot of related attributes, then one possibility is to create one attributes table with rows like (user_id, attribute_name, attribute_value). You can store all your attributes in that one table, use it to fetch the attributes of given users, and also search by attribute names and values.
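A small sketch of such an attributes table, in Python with in-memory SQLite standing in for MySQL. The table name user_attributes, the helper names and the sample rows are assumptions; only the (user_id, attribute_name, attribute_value) row shape comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_attributes (
    user_id         INTEGER NOT NULL,
    attribute_name  TEXT NOT NULL,
    attribute_value TEXT NOT NULL,
    PRIMARY KEY (user_id, attribute_name, attribute_value)
);
INSERT INTO user_attributes VALUES
    (1, 'nationality', 'American'),
    (1, 'nationality', 'German'),
    (1, 'language', 'English'),
    (2, 'nationality', 'German');
""")


def attributes_of(user_id):
    """Collect every attribute of one user as {name: [values...]}."""
    result = {}
    for name, value in conn.execute(
            "SELECT attribute_name, attribute_value FROM user_attributes "
            "WHERE user_id = ? ORDER BY attribute_name, attribute_value",
            (user_id,)):
        result.setdefault(name, []).append(value)
    return result


def users_with(name, value):
    """IDs of all users that have the given attribute value."""
    rows = conn.execute(
        "SELECT user_id FROM user_attributes "
        "WHERE attribute_name = ? AND attribute_value = ? ORDER BY user_id",
        (name, value)).fetchall()
    return [user_id for (user_id,) in rows]


print(attributes_of(1))
print(users_with("nationality", "German"))
```

The trade-off is that every value is stored as text and the database can no longer enforce per-attribute types or lookup lists.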

The simple solution is to stop using a SQL table. This is what NoSQL is designed for. Check out CouchDB or Mongo. There, each value can be stored as a full structure, so this whole problem could be reduced to a single (not-really-)table.
The downside of pretty much any SQL-based solution is that it will be slow: either slow when fetching a single user (a massive JOIN statement won't execute quickly) or slow when searching (if you decide to store these values serialized).

You might also want to look at an ORM, which will map your objects to a database automatically.
http://en.wikipedia.org/wiki/List_of_object-relational_mapping_software#PHP

"This is an example; I have over 30 fields which are multi-valued, so I assume I will not be creating 61 tables for this?"
You're right that 61 is the maximum number of tables, but in reality it'll likely be fewer. Take your own example:
"colleges I have studied at"
"college campuses I visited"
In this case you'll probably only have one "college" table, so there would be four tables in this layout, not five.
I'd say don't be afraid of using lots of tables if the data set you're modelling is large - just make sure you keep an up-to-date ERD so you don't get lost! Also, don't get too caught up in the "link table" paradigm - "link tables" can be "entities" in their own right. For example, you could think of the "colleges I have studied at" link table as a "college enrolments" table instead, give it its own primary key, and store each of the times you pay your course fees as rows in a (linked) "college enrolment payments" table.

Hiding model data based on an ID's existence in another table

I've got a somewhat complicated question for you cakephp experts.
Basically, I have created a DB table called "locations". Every month a client will send me this table in CSV format. Unfortunately, instead of updating the table, I will have to empty it and reimport all of the records, and I cannot alter the table at all.
Functionality-wise, users will be able to look at a display of these records and choose to hide certain ones. This "hidden" attribute must be persistent and survive the month-to-month purging of all records.
I had all of this working yesterday. What I did was create a separate table called location_properties (columns: id (int), location_id (foreign key), is_hidden (boolean)). When showing these records, it would simply check whether is_hidden == true.
This was all well and good (AND WORKING!), but then my boss kind of gummed up the works. He told me to delete the "is_hidden" column from the table because that would be more efficient, and that I should be able to simply check for the existence of the location_id to hide or show a record.
It doesn't appear to be quite that simple. Anyone know how I can pull this off? I've tried everything I can think of.

Your boss is wrong.
It's more efficient to add your column than it is to delete and re-import the locations every month.
Did he just say it was less efficient, or did you do an actual benchmark to see if it harms performance too much?

At first glance I see 2 solutions:
1) add a condition array('Location.id IS NOT NULL')
2) change join type to right join
I hope this helps
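Outside of CakePHP, the "check for the existence of the location_id" idea boils down to an anti-join: show only the locations that have no row in location_properties. Below is a sketch in Python with in-memory SQLite standing in for MySQL. The helper names and sample rows are invented, and it assumes the location IDs stay the same from one monthly import to the next.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- No is_hidden column: a row existing here means "hidden".
CREATE TABLE location_properties (
    id          INTEGER PRIMARY KEY,
    location_id INTEGER NOT NULL UNIQUE
);
""")


def import_locations(rows):
    """Simulate the monthly purge and re-import of the client CSV."""
    conn.execute("DELETE FROM locations")
    conn.executemany("INSERT INTO locations (id, name) VALUES (?, ?)", rows)


def hide(location_id):
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute(
        "INSERT OR IGNORE INTO location_properties (location_id) VALUES (?)",
        (location_id,))


def visible_locations():
    """Locations with no matching location_properties row."""
    rows = conn.execute("""
        SELECT l.name
        FROM locations AS l
        LEFT JOIN location_properties AS lp ON lp.location_id = l.id
        WHERE lp.location_id IS NULL
        ORDER BY l.id
    """).fetchall()
    return [name for (name,) in rows]


import_locations([(1, "Depot"), (2, "Office"), (3, "Shop")])
hide(2)
# Next month's import: the hidden flag survives because it lives elsewhere.
import_locations([(1, "Depot"), (2, "Office"), (3, "Shop"), (4, "Lab")])
print(visible_locations())
```

Un-hiding a location is then a DELETE from location_properties rather than an UPDATE of a flag.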

Count line breaks in a field and order by

I have a field in a table recipes whose values were inserted using mysql_real_escape_string. I want to count the number of line breaks in that field and order the records by this number.
P.S. The field is called Ingredients.
Thanks everyone

This would do it:
SELECT *, LENGTH(Ingredients) - LENGTH(REPLACE(Ingredients, '\n', '')) as Count
FROM Recipes
ORDER BY Count DESC
The way I am getting the number of line breaks is a bit of a hack, but I don't think there's a better way. I would recommend keeping a column that holds the number of line breaks if performance is a huge issue. For medium-sized data sets, though, I think the above should be fine.
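To see why the LENGTH/REPLACE trick works: removing every newline shortens the string by exactly one character per line break, so the difference in length is the count. A runnable sketch in Python with in-memory SQLite standing in for MySQL (the Name column and the sample recipes are invented). One difference to be aware of: MySQL by default treats '\n' inside a string literal as a newline, while SQLite does not, so here the newline is passed in as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Recipes (id INTEGER PRIMARY KEY, Name TEXT, Ingredients TEXT)"
)
conn.executemany(
    "INSERT INTO Recipes (Name, Ingredients) VALUES (?, ?)",
    [
        ("toast", "bread\nbutter"),
        ("soup", "water\nsalt\nonion\ncarrot"),
        ("tea", "tea"),
    ],
)

# Length with newlines minus length without them = number of line breaks.
rows = conn.execute(
    "SELECT Name, "
    "LENGTH(Ingredients) - LENGTH(REPLACE(Ingredients, ?, '')) AS LineBreaks "
    "FROM Recipes ORDER BY LineBreaks DESC",
    ("\n",),
).fetchall()
print(rows)
```

The expression has to be evaluated for every row on every query, which is why a cached column is suggested for larger tables.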
If you wanted to have a cache column as described above, you would do:
UPDATE
Recipes
SET
IngredientAmount = LENGTH(Ingredients) - LENGTH(REPLACE(Ingredients, '\n', ''))
After that, whenever you are updating or inserting a row, you could calculate the amount (probably with PHP) and fill in this column beforehand. Or, if you're into that sort of thing, try out triggers.

I'm assuming a lot here, but from what I'm reading in your post, you could change your database structure a little bit, and both solve this problem and open your dataset up to more interesting uses.
If you separate ingredients into their own table, and use a linking table to record which ingredients occur in which recipes, it'll be much easier to be creative with data manipulation. It becomes easier to count ingredients per recipe, to find similarities between recipes, to search for recipes containing sets of ingredients, etc. Also, your data would be more normalized and smaller (storing one global list of all ingredients vs. storing a set for each recipe).
If you're using a single text entry field to enter ingredients for a recipe now, you could do something like break up that input by lines and use each line as an ingredient when saving to the database. You can use something like PHP's built-in levenshtein() or similar_text() functions to deal with misspelled ingredient names and keep the data as normalized as possible without having to hand-groom your users' data entry too much.
This is just a suggestion, take it as you like.

You're going a bit beyond the capabilities and intent of SQL here. You could write a stored procedure to scan the string and return the number and then use this in your query.
However, I think you should revisit the design of whatever is inserting the Ingredients so that you avoid searching the strings in every row whenever you run this query. Add a 'num_linebreaks' column, calculate the number of line breaks and set this column when you're adding the Ingredients.
If you've no control over the app that's doing the insertion, then you could use a stored procedure to update num_linebreaks based on a trigger.

Got it, thanks. The PHP code looks like:
$check = explode("\r\n", $_POST['ingredients']);
$lines = count($check);
So how could I update all the previous records in the table in one fell swoop, so that Ingred_count is set based on the Ingredients field?
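One possible way to backfill the existing rows, sketched in Python with in-memory SQLite standing in for MySQL. count(explode("\r\n", ...)) returns the number of lines, which is the number of "\r\n" separators plus one, so a single UPDATE can compute the same value. The sample rows are invented. Note that in MySQL the `/` operator returns a decimal, so the integer-division operator DIV would be the closer equivalent there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Recipes ("
    "id INTEGER PRIMARY KEY, Ingredients TEXT NOT NULL, Ingred_count INTEGER)"
)
samples = ["flour\r\neggs\r\nmilk", "tea", "bread\r\nbutter"]
conn.executemany(
    "INSERT INTO Recipes (Ingredients) VALUES (?)", [(s,) for s in samples]
)

# One statement for all rows: (characters removed / separator length) + 1.
conn.execute(
    "UPDATE Recipes SET Ingred_count = "
    "(LENGTH(Ingredients) - LENGTH(REPLACE(Ingredients, :sep, ''))) "
    "/ LENGTH(:sep) + 1",
    {"sep": "\r\n"},
)

counts = [c for (c,) in conn.execute(
    "SELECT Ingred_count FROM Recipes ORDER BY id")]
print(counts)
```

If some older rows were saved with bare "\n" line breaks instead of "\r\n", they would be counted as a single line by this statement, just as they would by the explode call above.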
