Database design (normalization?)

In the following situation, what would the database design look like?
This is for some sort of inventory system.
I want users to be able to create an item type (let's say, a Laptop). I also want users to be able to say that the serial number and MAC address must be unique. What confuses me is where to check for unique values, since I have no idea how to make a single table that holds all items and also enforces unique values.
Now suppose a user creates another item type that has no serial number or any other unique field. This means I can't build my DB with fixed property1 through property10 columns.
I also don't want to build a table for every item type, since that would require overly complex table management in PHP.
Any suggestions on how to build this DB?

Just to confirm that I understand your requirements correctly: you want a table where the unique rule applies only to a subset of the rows instead of the entire table?
If so, I think there are two options:
Have two tables, one with unique rules and one without, OR
Enforce the unique rules at the application level as business rules instead of at the database level.
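
The second option could look roughly like the sketch below. It is only an illustration, not the asker's schema: the table names (item_type, property, item, item_value), the must_be_unique flag and the add_item helper are all assumptions, and Python with SQLite is used only so the example is self-contained. The same queries could be issued from PHP against MySQL.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per item, one row per property value.
    conn.executescript("""
        CREATE TABLE item_type (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE property (
            id             INTEGER PRIMARY KEY,
            item_type_id   INTEGER NOT NULL REFERENCES item_type(id),
            name           TEXT NOT NULL,
            must_be_unique INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE item (
            id           INTEGER PRIMARY KEY,
            item_type_id INTEGER NOT NULL REFERENCES item_type(id)
        );
        CREATE TABLE item_value (
            item_id     INTEGER NOT NULL REFERENCES item(id),
            property_id INTEGER NOT NULL REFERENCES property(id),
            value       TEXT,
            PRIMARY KEY (item_id, property_id)
        );
    """)


def add_item(conn, item_type_id, values):
    """Insert an item; `values` maps property_id -> value.

    Raises ValueError if a property flagged must_be_unique already
    holds the same value for another item.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        for property_id, value in values.items():
            flag = conn.execute(
                "SELECT must_be_unique FROM property WHERE id = ?",
                (property_id,),
            ).fetchone()
            if flag is not None and flag[0]:
                clash = conn.execute(
                    "SELECT 1 FROM item_value WHERE property_id = ? AND value = ?",
                    (property_id, value),
                ).fetchone()
                if clash:
                    raise ValueError(f"duplicate value for property {property_id}")
        item_id = conn.execute(
            "INSERT INTO item (item_type_id) VALUES (?)", (item_type_id,)
        ).lastrowid
        conn.executemany(
            "INSERT INTO item_value (item_id, property_id, value) VALUES (?, ?, ?)",
            [(item_id, p, v) for p, v in values.items()],
        )
    return item_id
```

Note that a check-then-insert like this can still let duplicates through when two requests run at the same time, so on a busy system the check and the insert would need to be serialized in some way.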

If I understood your question correctly, I would just do something simple and maintainable, such as:

You can achieve it as follows.
Assume the table name is item, with the item id as primary key, which makes it easy to pick out the item you want. Then, when inserting a serial number and MAC address, check in PHP that the data is not duplicated.
Did I understand you correctly?

Related

Using Json format to decrease JOINs in MySQL

I have many tables in my database, an example is the table fs_user, the following is an extract of the table columns (dealing with privacy settings):
4 Columns from the table fs_user:
show_email_to
show_address_to
show_gender_to
show_interested_in_to
Like many social networks, I need to specify not only which data is private and which is public, but also which data is available to chosen users and which is not.
As I have about 30 settings like the 4 above, I think it would be bad to create one table for every setting and make a many-to-many relation with the table fs_user.
This is why I got the idea of saving this data as JSON in every column (of type TEXT), for example:
show_email_to => {1:'ALL',2:'BUT',3:'3'}
This data means, show email to all users, except the user whose id=3.
Another example:
show_email_to => {1:'NONE',2:'BUT',3:'3',4:'80',5:'10'}
This means, no user will see the email except the users id=3,id=80 and id=10.
Of course, the MySQL query will select this data, and PHP/JS will extract the data I need from the JSON.
Another point is that sometimes a user wants to show data only to his friends, except for 3 of them.
This will do:
show_email_to => {1:'FRIENDS',2:'BUT',3:'3'}
This means that the email will be shown to all his friends, except the user with id=3.
My question is: how performant and flexible (for other uses) will this system be compared to the 'many to many' solution (which requires storing a lot of data in many tables)?
Note: I already know that saving many elements in one column is bad practice, but here I think the JSON element can be considered a single object.

This is a good question. What you propose is, with respect, a very bad idea indeed if you're using any flavor of SQL. You are proposing to denormalize your tables in a way that will defeat every attempt to speed up searching or querying in the future.
What should you do instead? You could take a look at using an XML-centric dbms like MarkLogic. It's capable of creating indexes that accelerate various Xpath-style queries, so you would be able to search on relationships. If you do that, I hope you have a big budget.
Or, you could use normalized permission tables.
item_to_show (item id)
order (an integer specifying rule ordering, needed for this)
recipient (user id)
isdenied (0 means recipient is allowed, 1 means she is denied)
In this table, the primary key is a compound key constructed of the first two columns.
I'm aware that you have many types of items. You assert that it's bad to have an extra table for each item type in your system. I don't agree that it's inherently bad. I believe your proposed solution is far worse.
You could arrange to give each item a unique id number to allow you to use a single permission table. See "Fastest way to generate 11,000,000 unique ids" for an example of how to do that.
Or you could have a single permission table with a type id.
item_to_show (item id)
item_type_to_show (item type id)
order (an integer specifying rule ordering, needed for this)
recipient (user id)
isdenied (0 means recipient is allowed, 1 means she is denied)
In this case the primary key is the first three columns.
Or, you can do what you don't want to do and have a separate permission table for each item type.
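
A minimal sketch of the single permission table with a type id might look like this. Several details are assumptions rather than part of the answer: the column is called rule_order because ORDER is a reserved word in SQL, a NULL recipient is treated as "any user", the rule with the highest rule_order wins, and no matching rule means the item is hidden. SQLite via Python is used only to keep the example self-contained.

```python
import sqlite3

# Hypothetical DDL based on the columns listed above.
SCHEMA = """
CREATE TABLE permission (
    item_to_show      INTEGER NOT NULL,
    item_type_to_show INTEGER NOT NULL,
    rule_order        INTEGER NOT NULL,   -- 'order' is reserved, so renamed
    recipient         INTEGER,            -- NULL = any user (assumption)
    isdenied          INTEGER NOT NULL,   -- 0 = allowed, 1 = denied
    PRIMARY KEY (item_to_show, item_type_to_show, rule_order)
);
"""


def can_see(conn, item_id, item_type_id, viewer_id):
    """Return True if the last matching rule allows viewer_id."""
    row = conn.execute(
        """
        SELECT isdenied
        FROM permission
        WHERE item_to_show = ?
          AND item_type_to_show = ?
          AND (recipient = ? OR recipient IS NULL)
        ORDER BY rule_order DESC
        LIMIT 1
        """,
        (item_id, item_type_id, viewer_id),
    ).fetchone()
    # No matching rule at all is treated as "hidden" in this sketch.
    return row is not None and row[0] == 0
```

Under these assumptions, the asker's "ALL BUT 3" example becomes two rows: rule 1 allows everyone (recipient NULL, isdenied 0) and rule 2 denies recipient 3.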

You say, "As I have about 30 data like the 4 data above, I think it will be bad to create one table for every data, and make a many to many relation with the table fs_user"
I agree with the first part of your statement only. You only need one table. For the sake of a name, I'll call it ShowableItems. Fields would be ShowableItemId (PK) and Item. Some of these items would be email, gender, address, etc.
Then you need a many to many table that shows what items can be shown to whom. Your three fields would be, the id of the person who owns the item, the showable item id, and the id of the person who can see it.
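
As a rough illustration of this two-table layout: ShowableItems and its two fields come from the answer, while the ItemVisibility table name, its column names and the visible_items helper are assumptions made for the example. SQLite via Python stands in for MySQL and PHP here.

```python
import sqlite3


def create_visibility_schema(conn):
    conn.executescript("""
        CREATE TABLE ShowableItems (
            ShowableItemId INTEGER PRIMARY KEY,
            Item           TEXT NOT NULL UNIQUE
        );
        -- Hypothetical many-to-many table: who may see which item of whom.
        CREATE TABLE ItemVisibility (
            OwnerId        INTEGER NOT NULL,
            ShowableItemId INTEGER NOT NULL
                           REFERENCES ShowableItems(ShowableItemId),
            ViewerId       INTEGER NOT NULL,
            PRIMARY KEY (OwnerId, ShowableItemId, ViewerId)
        );
    """)


def visible_items(conn, owner_id, viewer_id):
    """List the item names of owner_id that viewer_id is allowed to see."""
    rows = conn.execute(
        """
        SELECT s.Item
        FROM ItemVisibility v
        JOIN ShowableItems s ON s.ShowableItemId = v.ShowableItemId
        WHERE v.OwnerId = ? AND v.ViewerId = ?
        ORDER BY s.Item
        """,
        (owner_id, viewer_id),
    )
    return [r[0] for r in rows]
```

Because both lookup columns lead the composite primary key's neighbours, a query like this can be served by ordinary indexes, which is the main thing the JSON-in-a-column approach gives up.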

Dynamically change mySql db table

I am working on a web application that manages the clients of the company. Details such as phone, address, email and name are saved for each client and there are corresponding fields in the database table where I save these details.
The user of the application has to be able to change the different details. For instance, he might decide that we need an extra field to save the fax number of the client or he may decide that the address field is no longer needed and delete it.
Using NoSQL is not an option. I have to use PHP and MySQL.
I have been considering using a JSON string to save database table fields but I have not come up with a solution yet.
Is altering the structure of my db table the only solution to my problem? I would like to avoid dynamically altering the structure of the db table, if possible.
Would it be a good idea to implement dynamic views? However, I guess that this would not address the need to insert new fields.
Thank you in advance.

Wouldn't it make more sense to have another table, let's call it 'information' which has the user_id as a foreign key?
So you have:
CREATE TABLE user (
user_id ...
/* necessary information */
);
CREATE TABLE information (
user_id ...
information_type /* maybe enum, maybe just string, maybe int, depending how you want to do that */
information_blob
);
You then retrieve the information with JOIN, and do not have to alter the table every time somebody wants to add another bit of info.
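
A filled-in version of that idea might look like the sketch below. The column types, the name column and the composite primary key are assumptions, since the answer leaves them open, and SQLite via Python is used only so the example runs on its own.

```python
import sqlite3


def create_tables(conn):
    conn.executescript("""
        CREATE TABLE user (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL          -- the "necessary information"
        );
        CREATE TABLE information (
            user_id          INTEGER NOT NULL REFERENCES user(user_id),
            information_type TEXT NOT NULL, -- e.g. 'fax', 'address'
            information_blob TEXT,
            PRIMARY KEY (user_id, information_type)
        );
    """)


def user_with_information(conn, user_id):
    """Return the user's name plus all optional details, or None."""
    rows = conn.execute(
        """
        SELECT u.name, i.information_type, i.information_blob
        FROM user u
        LEFT JOIN information i ON i.user_id = u.user_id
        WHERE u.user_id = ?
        """,
        (user_id,),
    ).fetchall()
    if not rows:
        return None
    # LEFT JOIN yields one row of NULLs when the user has no details yet.
    details = {t: b for _, t, b in rows if t is not None}
    return {"name": rows[0][0], "information": details}
```

Adding a fax number is then an INSERT into information, and dropping the address field is a DELETE, with no change to the table structure.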

What you need is a key-value pair system for MySQL. The idea of NoSQL databases is that you can create your own schema based on key/values, using essentially anything for the value.
Create a table special_fields with a field_name column, or something named more specifically to field names. Use this table to define the available field names, and another table to store the client_id and special_field_id and then a value.
So client #1 would have an address (special_field record #1) value of "123 x street"
The only other way I can think of is to actually change the schema of a table to add/remove columns. Don't do that.
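
The two tables described above could be sketched as follows. Only special_fields, field_name, client_id and special_field_id come from the answer. The name client_special_field_values and the helper functions are hypothetical, and SQLite via Python is used purely to make the example self-contained.

```python
import sqlite3


def create_special_field_tables(conn):
    conn.executescript("""
        CREATE TABLE special_fields (
            special_field_id INTEGER PRIMARY KEY,
            field_name       TEXT NOT NULL UNIQUE
        );
        CREATE TABLE client_special_field_values (
            client_id        INTEGER NOT NULL,
            special_field_id INTEGER NOT NULL
                             REFERENCES special_fields(special_field_id),
            value            TEXT,
            PRIMARY KEY (client_id, special_field_id)
        );
    """)


def define_field(conn, field_name):
    """Make a new field available; this replaces ALTER TABLE ... ADD COLUMN."""
    cur = conn.execute(
        "INSERT INTO special_fields (field_name) VALUES (?)", (field_name,)
    )
    return cur.lastrowid


def set_value(conn, client_id, field_name, value):
    """Store or overwrite one value; does nothing if the field is undefined."""
    conn.execute(
        """
        INSERT OR REPLACE INTO client_special_field_values
            (client_id, special_field_id, value)
        SELECT ?, special_field_id, ?
        FROM special_fields
        WHERE field_name = ?
        """,
        (client_id, value, field_name),
    )


def get_values(conn, client_id):
    """Return {field_name: value} for one client."""
    rows = conn.execute(
        """
        SELECT f.field_name, v.value
        FROM client_special_field_values v
        JOIN special_fields f ON f.special_field_id = v.special_field_id
        WHERE v.client_id = ?
        """,
        (client_id,),
    )
    return dict(rows.fetchall())
```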

Build PHP function to retrieve a variety of mySQL database queries and correctly traverse through multiple tables via their foreign key relationship

I am trying to build a robust PHP function that allows me to traverse over my normalized database. My MySQL database has 6 tables with the following column names (I am only including the primary and foreign keys, as well as some limited table columns for simplicity) so that you can see how they are related.
tableA: partID (primary key)
tableABJunction: itemID (foreign key), partID (foreign key)
tableB: itemID (primary key), itemName
sales: customerID (foreign key), itemID (foreign key)
partDate: itemID (foreign key)
customer: customerID (primary key), nameFirst, nameLast
When I need to generate a query, such as "What are the names of the customers that ordered itemID = 12?", I first have to query the sales table for all customerIDs where itemID=12 and then query the customer table to find their first and last names. Sometimes I may need to perform a query that returns data from all 6 tables, for example one asking for all information pertaining to the customer whose name is John Smith. Is there an easy way to build a function to handle this variety of queries, without having to build a query for every possible type of search?
Currently, my approach is to pass the following to php via AJAX:
web_conditionArray (contains the column name and value of the data provided. Such as nameFirst => 'John', nameLast => 'Smith'); web_resultArray (contains the table name and the columns that I am requesting: sales => 'itemID, itemName').
The issue that I am having with this approach is a way to store the relationships between all of the mySQL datatables with their foreign keys so that my php program knows how to link all the tables together to run the correct query to get from the data provided from one table to the data requested in another table. Any suggestions or a better way to solve this? I was initially thinking of a doubly linked list but the flow from table to table is not linear given that there is a fork where the tableB links to the sales and partDate tables.
I tried to be as specific as I could in describing this situation without writing a novel; however, please let me know if you need any additional information to refine my question further.

Looking at your table structure, I imagine it would be possible to construct logic to calculate the relationships between tables, and dynamically construct queries, but it seems to me that that would be far more work than manually constructing queries for your particular database. I'm assuming that your tables have many more fields in them, but that you've only included the most important, and have definitely included all primary and foreign keys.
Based on that, you have only three information objects in your database: Parts, Items and Customers. You should, therefore, not need more than 12 manually constructed queries to make your system work. You just need to ensure that you simplify your queries to work with whole information objects, and use the PHP layer to filter them later.
So, you reduce your query logic to:
"Fetch me all [Parts, Items or Customers] (and possibly also all [Parts, Items or Customers]) related to [Part, Item or Customer] (and possibly [Part, Item or Customer])"
This results in the following queries:
All Customers for a Part
All Customers for an Item
All Customers for a Part and an Item
All Items for a Part
All Items for a Customer
All Items for a Part and a Customer
All Parts for an Item
All Parts for a Customer
All Parts for a Customer and an Item
All Parts and Customers for an Item
All Customers and Items for a Part
All Items and Parts for a Customer
(This is the full list of logical relationships - some may not make any sense practically, which makes your life easier)
So, your PHP script needs to perform the following tasks:
Identify which object(s) are required for the criteria of the query. This is based on the fields supplied.
Construct a WHERE clause for your query which identifies the primary key for the criteria objects from the fields passed.
Identify which object(s) are required for the result of the query, based on the fields requested.
Select the query based on the criteria and return objects, and insert the constructed WHERE clause.
Perform the query, extracting all information available about the requested objects
Filter the results, extracting only the required information
Return the final results.
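
The selection steps above can be sketched as a small lookup. Everything here is an assumption made for illustration: the FIELD_OWNER, ALIAS and QUERIES names, the table aliases, and the two hand-written queries shown (the other ten would be added the same way). Python is used for the sketch, but the same structure maps directly onto PHP arrays. The final filtering step is left out.

```python
# Which information object each known field belongs to.
FIELD_OWNER = {
    "partID": "Part",
    "itemID": "Item",
    "itemName": "Item",
    "customerID": "Customer",
    "nameFirst": "Customer",
    "nameLast": "Customer",
}

# Table alias used for each object inside the hand-written queries.
ALIAS = {"Part": "a", "Item": "b", "Customer": "c"}

# Hand-written queries keyed by (result object, criteria objects).
QUERIES = {
    ("Customer", frozenset({"Item"})): (
        "SELECT c.* FROM customer c"
        " JOIN sales s ON s.customerID = c.customerID"
        " JOIN tableB b ON b.itemID = s.itemID"
    ),
    ("Item", frozenset({"Customer"})): (
        "SELECT b.* FROM tableB b"
        " JOIN sales s ON s.itemID = b.itemID"
        " JOIN customer c ON c.customerID = s.customerID"
    ),
}


def build_query(conditions, result_object):
    """Return (sql, params) for the given field conditions.

    Field names are looked up in FIELD_OWNER, so an unknown field raises
    KeyError instead of being pasted into the SQL text.
    """
    criteria = frozenset(FIELD_OWNER[field] for field in conditions)
    base = QUERIES[(result_object, criteria)]
    clauses = [f"{ALIAS[FIELD_OWNER[field]]}.{field} = ?" for field in conditions]
    return base + " WHERE " + " AND ".join(clauses), list(conditions.values())
```

For example, build_query({"nameFirst": "John", "nameLast": "Smith"}, "Item") picks the "All Items for a Customer" query and appends a WHERE clause on the two name columns, with the values passed as bound parameters.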

First, know that my answer will most likely be downvoted to hell (as this methodology is constantly downvoted despite its correctness). DBAs want you to believe that just because a complex query can be done with a SQL statement, it should be (like how server-siders think all client-side work should be done server-side, or how client-siders think layouts should be done client-side instead of with CSS). No. Complex queries are for people sitting at command lines who need to grab data on demand for specific, non-routine reasons. For processing speed, SELECTing, UPDATEing, and DELETEing should always be done off the PK server-side.
It sounds like you have a set of legitimately large tables.
Assuming it's large and speed is the primary concern (and not development time), use only a primary key and no other indexes because the more indexes you have, the more those indexes need to be reindexed by the database when really the comparisons that DBAs would have you do are faster server-side.
The primary key will take some finagling, but it's the most important thing past data types and lengths. For instance, the non-FK, independent tables like tableA, tableB, and customer should probably have an ai INT PK (Generally, remember that computers think in terms of integers), but the ones with multiple FKs should probably have no ai INT but instead a composite PK with the less variant SELECTed FK first. For example, with my site, I store vote totals on links by userID and linkID. If a user's logged in, they'll need to know how many votes they've placed on a link, so the userID is the one less likely to change, so that's first in my PK on that table. Counting this on demand database side or server-side was a performance nightmare.
For just a few lines of code, you will GREATLY improve speed. Sorting on the PK via php will cut latency by 50%. Absorbing JOINs into php will decrease the rate of latency spikes. Having no on demand MySQL calculations will keep your site from becoming paralyzed.
If you step away from the dogma that just because a SQL statement can get you the results that you should use a SQL statement instead of a server-side language (C++ being the fastest), you'll see performance skyrocket.
If you can be more specific with the tables you're trying to obfuscate, I can get more specific, but you probably get the idea.
AJAX has changed the game and forced refocus. CSS for layouts; js for client-side programming; server-side for...server-side processing; database for storing everything that lasts longer than a moment.
Bring on the downvotes! LOL

Generate user friendly id's in MongoDb

There's this project I am working on. It is like a social network where we can have users, posts, pictures, etc., and then this problem came up. We are used to MySQL and the "almost magical" auto-increment field, and now we cannot count on it anymore. I know the _id object in Mongo gives an easy way of identifying a document, as it guarantees uniqueness. But the key is not user friendly, and that's what we need, so we can make URLs like:
http://website.com/posts/{post_id}
http://website.com/{user_id}
I developed a solution but I don't think this is the best way of doing this. I first create a MySQL table with only one column. This column stores the user_id and is an auto-increment field. For every new record in Mongo I insert a new row in this MySQL table and get the user_id with the LAST_INSERT_ID function; then I can insert my data in my Mongo collection with a numeric ID. Another benefit is that I can erase my MySQL table after, let's say, a million rows, because the ids are already stored in Mongo.
Am I doing it wrong?

Why not use slugs for posts and usernames for users? That should be human readable.

First, I don't see any benefit to using an arbitrary auto-incrementing number over the generated id Mongo provides. Not only is it again just an arbitrary id, but you also have to maintain the sequence.
That said, why not let mongo manage the id, and use another unique identifier for your URLs. If your users have a 'username', I'm assuming you've already made sure that's unique across the collection. Just query by that unique property, instead of finding by id.
That also allows the user to change their unique identifier, without you having to remap associations in the database.
And for the post, just generate a unique slug from the title.
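
One way to generate such a slug is sketched below. The slugify and unique_slug names are made up for this example, and the set of existing slugs is a stand-in for whatever uniqueness check the real collection provides (for instance a unique index on the slug field).

```python
import re
import unicodedata


def slugify(title):
    """Turn a post title into a lowercase, hyphen-separated ASCII slug."""
    # Split accented characters into base letter + accent, then drop accents.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return slug or "post"  # fallback for titles with no usable characters


def unique_slug(title, existing):
    """Append -2, -3, ... until the slug is not in `existing`."""
    base = slugify(title)
    slug, counter = base, 2
    while slug in existing:
        slug = f"{base}-{counter}"
        counter += 1
    return slug
```

With this, a post titled "Hello, World!" would live at /posts/hello-world, and a second post with the same title would get hello-world-2.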

You can also create the id's in Mongo instead of MySQL, ...here's some documentation and articles on how to achieve it
http://www.mongodb.org/display/DOCS/How+to+Make+an+Auto+Incrementing+Field
http://shiflett.org/blog/2010/jul/auto-increment-with-mongodb

How to design a generic database whose layout may change over time?

Here's a tricky one - how do I programmatically create and interrogate a database whose contents I can't really foresee?
I am implementing a generic input form system. The user can create PHP forms with a WYSIWYG layout and use them for any purpose he wishes. He can also query the input.
So, we have three stages:
a form is designed and generated. This is a one-off procedure, although the form can be edited later. This designs the database.
someone or several people make use of the form - say for daily sales reports, stock keeping, payroll, etc. Their input to the forms is written to the database.
others, maybe management, can query the database and generate reports.
Since these forms are generic, I can't predict the database structure - other than to say that it will reflect HTML form fields and consist of the data input from a collection of edit boxes, memos, radio buttons and the like.
Questions and remarks:
A) how can I best structure the database, in terms of tables and columns? What about primary keys? My first thought was to use the control name to identify each column, then I realized that the user can edit the form and rename, so that maybe "name" becomes "employee" or "wages" becomes ":salary". I am leaning towards a unique number for each.
B) how best to key the rows? I was thinking of a timestamp to allow me to query and a column for the row Id from A)
C) I have to handle column rename/insert/delete. For deletion, I am unsure whether to delete the data from the database. Even if the user is not inputting it from the form any more, he may wish to query what was previously entered. Or there may be some legal requirements to retain the data. Any gotchas in column rename/insert/delete?
D) For the querying, I can have my PHP interrogate the database to get column names and generate a form with a list where each entry has a database column name, a checkbox to say if it should be used in the query and, based on column type, some selection criteria. That ought to be enough to build searches like "position = 'senior salesman' and salary > 50k".
E) I probably have to generate some fancy charts - graphs, histograms, pie charts, etc for query results of numerical data over time. I need to find some good FOSS PHP for this.
F) What else have I forgotten?
This all seems very tricky to me, but I am database n00b - maybe it is simple to you gurus?
Edit: please don't tell me not to do it. I don't have any choice :-(
Edit: in real life I don't expect column rename/insert/delete to be frequent. However it is possible that after running for a few months a change to the database might be required. I am sure this happens regularly. I fear that I have worded this question badly and that people think that changes will be made willy-nilly every 10 minutes or so.
Realistically, my users will define a database when they lay out the form. They might get it right first time and never change it - especially if they are converting from paper forms. Even if they do decide to change, this might only happen once or twice ever, after months or years - and that can happen in any database.
I don't think that I have a special case here, nor that we should be concentrating on change. Perhaps better to concentrate on linkage - what's a good primary key scheme? Say, perhaps, for one text input, one numerical and a memo?

"This all seems very tricky to me, but
I am database n00b - maybe it is
simple to you gurus?"
Nope, it really is tricky. Fundamentally what you're describing is not a database application, it is a database application builder. In fact, it sounds as if you want to code something like Google App Engine or a web version of MS Access. Writing such a tool will take a lot of time and expertise.
Google has implemented flexible schemas by using its BigTable platform. It allows you to flex the schema pretty much at will. The catch is, this flexibility makes it very hard to write queries like "position = 'senior salesman' and salary > 50k".
So I don't think the NoSQL approach is what you need. You want to build an application which generates and maintains RDBMS schemas. This means you need to design a metadata repository from which you can generate dynamic SQL to build and change the users' schemas and also generate the front end.
Things your metadata schema needs to store
For schema generation:
foreign key relationships (an EMPLOYEE works in a DEPARTMENT)
unique business keys (there can be only one DEPARTMENT called "Sales")
reference data (permitted values of EMPLOYEE.POSITION)
column data type, size, etc
whether column is optional (i.e NULL or NOT NULL)
complex business rules (employee bonuses cannot exceed 15% of their salary)
default value for columns
For front-end generation
display names or labels ("Wages", "Salary")
widget (drop down list, pop-up calendar)
hidden fields
derived fields
help text, tips
client-side validation (associated JavaScript, etc)
That last points to the potential complexity in your proposal: a regular form designer like Joe Soap is not going to be able to formulate the JS to (say) validate that an input value is between X and Y, so you're going to have to derive it using templated rules.
These are by no means exhaustive lists, it's just off the top of my head.
For primary keys I suggest you use a column of GUID datatype. Timestamps aren't guaranteed to be unique, although if you run your database on an OS whose timestamps go to six decimal places (i.e. not Windows) it's unlikely you'll get clashes.
last word
'My first thought was to use the
control name to identify each column,
then I realized that the user can edit
the form and rename, so that maybe
"name" becomes "employee" or "wages"
becomes ":salary". I am leaning
towards a unique number for each.'
I have built database schema generators before. They are hard going. One thing which can be tough is debugging the dynamic SQL. So make it easier on yourself: use real names for tables and columns. Just because the app user now wants to see a form titled HEADCOUNT it doesn't mean you have to rename the EMPLOYEES table. Hence the need to separate the displayed label from the schema object name. Otherwise you'll find yourself trying to figure out why this generated SQL statement failed:
update table_11123
set col_55542 = 'HERRING'
where col_55569 = 'Bootle'
/
That way madness lies.

In essence, you are asking how to build an application without specifications. Relational databases were not designed so that you can do this effectively. The common approach to this problem is an Entity-Attribute-Value design and for the type of system in which you want to use it, the odds of failure are nearly 100%.
It makes no sense for example, that the column called "Name" could become "Salary". How would a report where you want the total salary work if the salary values could have "Fred", "Bob", 100K, 1000, "a lot"? Databases were not designed to let anyone put anything anywhere. Successful database schemas require structure which means effort with respect to specifications on what needs to be stored and why.
Therefore, to answer your question, I would rethink the problem. The entire approach of trying to make an app that can store anything in the universe is not a recipe for success.

Like Thomas said, a relational database is not good at your problem. However, you may want to take a look at NoSQL dbs like MongoDB.

See this article:
http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/
for someone else's experience of your problem.

This is for A) & B), and is not something I have done but thought it was an interesting idea that Reddit put to use, see this link (look at Lesson 3):
http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html

Not sure about the database, but for charts, instead of using PHP, I recommend looking into JavaScript (http://www.reynoldsftw.com/2009/02/6-jquery-chart-plugins-reviewed/). The advantages are that some of the processing for chart display is offloaded to the client side and that the charts can be interactive.

The other respondents are correct that you should be very cautious with this approach because it is more complex and less performant than the traditional relational model - but I've done this type of thing to accommodate departmental differences at work, and it worked fine for the amount of use it got.
Basically I set it up like this, first - a table to store some information about the Form the user wants to create (obviously, adjust as you need):
--************************************************************************
-- Create the User_forms table
--************************************************************************
create table User_forms
(
form_id integer identity,
name varchar(200),
status varchar(1),
author varchar(50),
last_modifiedby varchar(50),
create_date datetime,
modified_date datetime
)
Then a table to define the fields to be presented on the form including any limits
and the order and page they are to be presented (my app presented the fields as a
multi-page wizard type of flow).
--************************************************************************
-- Create the field configuration table to hold the entry field configuration
--************************************************************************
create table field_configuration
(
field_id integer identity,
form_id SMALLINT,
status varchar(1),
fieldgroup varchar(20),
fieldpage integer,
fieldseq integer,
fieldname varchar(40),
fieldwidth integer,
description varchar(50),
minlength integer,
maxlength integer,
maxval varchar(13),
minval varchar(13),
valid_varchars varchar(20),
empty_ok varchar(1),
all_caps varchar(1),
value_list varchar(200),
ddl_queryfile varchar(100),
allownewentry varchar(1),
query_params varchar(50),
value_default varchar(20)
);
Then my Perl code would loop through the fields in order for page 1 and put them on the "wizard form" ... and the "next" button would present the page 2 fields in order, etc.
I had JavaScript functions to enforce the limits specified for each field as well ...
Then a table to hold the values entered by the users:
--************************************************************************
-- Field to contain the values
--************************************************************************
create table form_field_values
(
session_Id integer identity,
form_id integer,
field_id integer,
value varchar(MAX)
);
That would be a good starting point for what you want to do, but keep an eye on performance as it can really slow down any reports if they add 1000 custom fields. :-)
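
To give a feel for what reporting against such a value table involves, here is a rough sketch of the asker's "position = 'senior salesman' and salary > 50k" search. It makes several assumptions that are not in the answer: the tables are trimmed to the columns needed, session_id is taken to identify one form submission (so it is shared by all values of that submission), values are stored as text and cast for numeric comparison, and SQLite via Python replaces the original database and Perl code.

```python
import sqlite3


def create_form_tables(conn):
    # Trimmed-down versions of the tables above (assumed column subset).
    conn.executescript("""
        CREATE TABLE field_configuration (
            field_id  INTEGER PRIMARY KEY,
            form_id   INTEGER NOT NULL,
            fieldname TEXT NOT NULL
        );
        CREATE TABLE form_field_values (
            session_id INTEGER NOT NULL,  -- one submission of the form
            form_id    INTEGER NOT NULL,
            field_id   INTEGER NOT NULL
                       REFERENCES field_configuration(field_id),
            value      TEXT,
            PRIMARY KEY (session_id, field_id)
        );
    """)


def find_sessions(conn, position, min_salary):
    """Submissions whose 'position' matches and whose 'salary' exceeds min_salary."""
    rows = conn.execute(
        """
        SELECT p.session_id
        FROM form_field_values p
        JOIN field_configuration pf ON pf.field_id = p.field_id
        JOIN form_field_values s    ON s.session_id = p.session_id
        JOIN field_configuration sf ON sf.field_id = s.field_id
        WHERE pf.fieldname = 'position' AND p.value = ?
          AND sf.fieldname = 'salary'   AND CAST(s.value AS INTEGER) > ?
        ORDER BY p.session_id
        """,
        (position, min_salary),
    )
    return [r[0] for r in rows]
```

Each extra search criterion adds another self-join on the value table, which is one reason reports slow down as the number of custom fields grows.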
