PHP: MySQL database design and workflow, need more creativity!

I was wondering if anyone can help me with my PHP/MySQL design.
My current app is more or less a survey app. It lets users store questions that target specific features of other products, which are saved in another table in the database.
For example, a user can post a car and then ask other users for their opinion on the safety elements of that car.
car table: Id, brand, safety
brand = Fast
safety = ABS=ABS (Anti lock braking System),DriverAirBag=Air bags
questions table: ID, Question, Answer, Target, type
Example data:
Question: safety options you like
Answer: ABS=ABS (Anti lock braking System),DriverAirBag=Air bags
Target: safety
type: checkbox
The problem is that to display the stored questions, I have to:
1) loop through all questions, echo Question, and echo Target in a hidden input,
2) explode the Answer field twice (first with "," to get each answer, then with "=" to separate the value stored in the database [0] from the user-friendly text [1]),
3) check type to choose the display type (3 options: checkbox, select, text),
4) render that display type with [0] as the value and show [1] to the user (stupid, I know :( ).
e.g.:
<input type="checkbox" value="$explode[0]"> $explode[1]
All these steps make it very hard to maintain and not flexible by any means, because the display logic is embedded in the code :(
Any ideas? :)

I would separate the tables into a one-to-many type design like:
CarTable
    ID
    Brand
    Model
CarInfo
    CarID       # Foreign key to CarTable
    Category    # Optional: Safety, Performance, Looks, etc.
    Value       # Specific info value: ABS, Air Bags, etc.
In this design you can have zero to many CarInfo records for each car, which makes it easier to add or remove info records without having to parse a potentially complex field as in your original design.
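A minimal sketch of that one-to-many layout, assuming PDO with an in-memory SQLite database as a stand-in for MySQL. The model name 'GT' and the sample rows are made up for illustration; the table and column names follow the outline above.

```php
<?php
// Sketch only: in-memory SQLite stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE CarTable (ID INTEGER PRIMARY KEY, Brand TEXT, Model TEXT)');
$pdo->exec('CREATE TABLE CarInfo (
    CarID INTEGER NOT NULL REFERENCES CarTable(ID),
    Category TEXT,
    Value TEXT NOT NULL
)');

// Hypothetical sample data: one car with two safety features.
$pdo->exec("INSERT INTO CarTable (ID, Brand, Model) VALUES (1, 'Fast', 'GT')");
$insert = $pdo->prepare('INSERT INTO CarInfo (CarID, Category, Value) VALUES (?, ?, ?)');
foreach (['ABS', 'Air Bags'] as $feature) {
    $insert->execute([1, 'Safety', $feature]);
}

// One row per feature, so no explode() is needed when reading them back.
$select = $pdo->prepare(
    'SELECT Value FROM CarInfo WHERE CarID = ? AND Category = ? ORDER BY Value'
);
$select->execute([1, 'Safety']);
$safetyFeatures = $select->fetchAll(PDO::FETCH_COLUMN);
```

Adding or removing a feature is then a single INSERT or DELETE on CarInfo rather than a rewrite of a delimited string.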
Your question table design could be similar depending on what your ultimate goal is:
Question
    ID
    Description
QuestionInfo
    QuestionID
    Category
    Value
Some other things you should be considering and questions you should be asking yourself:
How are you handling custom user inputs? If user1 enters "Air Bags" and user2 requests "Driver Side AirBag" how are you going to match the two?
Make sure you understand the problem before you attempt to solve it. It was not clear to me from your question what you are trying to do (which could be just me or limited size of the question here).
Be careful when outputting raw database values (like the type field in your question table). This is fine as long as the database values cannot be input by the user or are properly sanitized. Search for "SQL injection" and "cross-site scripting (XSS)" if you are not familiar with them.
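As a rough illustration of the last point, here is one way to treat values read from the database before they reach the page. The $row array and its keys are hypothetical; the allowed types are the three display types mentioned in the question.

```php
<?php
// Hypothetical row as it might come back from the questions table.
$row = ['type' => 'checkbox', 'label' => 'ABS <Anti-lock Braking System>'];

// Only accept display types the code knows how to render; fall back to text.
$allowedTypes = ['checkbox', 'select', 'text'];
$type = in_array($row['type'], $allowedTypes, true) ? $row['type'] : 'text';

// Escape anything that is printed into HTML.
$label = htmlspecialchars($row['label'], ENT_QUOTES, 'UTF-8');
```

Values going the other way, from the user into a query, are best passed through prepared statements instead of being concatenated into the SQL string.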

If you want a PHP survey application, then to be clear, I assume you need something where:
one user can add a subject (a car in your example);
that user can attach an arbitrary number of questions to the subject;
each question can accept one of several types of answers: yes/no (checkbox input), a number (text input, or, say, 10 radio buttons with values 1 to 10), single or multiple choice (select with or without the multiple attribute), or arbitrary data (textarea). Moreover, some questions may accept a comment or an "other, please explain" field;
any other user can answer all the questions, and all of the answers are stored.
A more sophisticated version would require different sets of questions depending on previous replies, but that is out of the scope of this question, I hope.
For that I think you need several tables:
Subjects
    id int pri_key, plus anything else that comes to mind: brand, type, etc.
Questions
    id int pri_key, text varchar, subject int f_key, type int/enum/?
QuestionOptions
    id int pri_key, question int f_key, option varchar
Users
    id int pri_key + whatever your authentication structure is
UserReplies
    user int f_key, question int f_key, answer varchar, comments varchar
The user-creator sets up a subject and attaches several questions to it. Each question knows which type it is: the 'type' field may be an integer or an enum value. I prefer storing such data as an integer and defining constants in PHP, such as QTYPE_MULTISELECT or QTYPE_BOOLEAN, for readability.
For single-select and multiselect questions the user-creator also populates the QuestionOptions table, which stores the options for the select tag.
To display all the questions there'll be something like
SELECT Questions.id, Questions.text, Questions.type,
       GROUP_CONCAT(CONCAT(QuestionOptions.id, ':', QuestionOptions.`option`) SEPARATOR ';') AS options
FROM Questions
LEFT JOIN QuestionOptions ON (Questions.type = $select AND Questions.id = QuestionOptions.question)
WHERE Questions.subject = $subject
GROUP BY Questions.id
With the ':' and ';' separators, the options column comes back as something like 5:Option_One;6:Option_Two, so exploding the data won't be much hassle. (OPTION is a reserved word in MySQL, hence the backticks.)
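For what it's worth, unpacking a string in the 5:Option_One;6:Option_Two format could look roughly like this. The function name parseOptions is made up, and the sketch assumes option labels never contain a ";".

```php
<?php
/**
 * Turn "5:Option_One;6:Option_Two" into [5 => 'Option_One', 6 => 'Option_Two'].
 * Assumes every pair contains a ":" and that labels never contain ";".
 */
function parseOptions(?string $packed): array
{
    $options = [];
    if ($packed === null || $packed === '') {
        return $options; // the LEFT JOIN found no options for this question
    }
    foreach (explode(';', $packed) as $pair) {
        // Limit of 2 so a label that itself contains ":" stays intact.
        [$id, $label] = explode(':', $pair, 2);
        $options[(int) $id] = $label;
    }
    return $options;
}

$options = parseOptions('5:Option_One;6:Option_Two');
```

The resulting array maps option ids to labels, which is the shape a select or checkbox renderer usually wants.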
I realize this is not the cleanest approach in terms of performance and optimization, but it should do for a project that is not large-scale.
There is also a drawback in the above design: the answers to a multiple-answer question are stored imploded in the "answer" field of the UserReplies table. It is better to add another table in which every record holds one option the user selected for a given question. That way there is no unnecessary denormalization in the database, and statistics queries become much easier (e.g. finding which options were most popular for a single question).
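A sketch of that extra table, again assuming PDO with in-memory SQLite in place of MySQL. The table name UserReplyOptions, the column names, and the sample rows are assumptions, and the option text column is called label here to sidestep the reserved word.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE QuestionOptions (
    id INTEGER PRIMARY KEY,
    question INTEGER NOT NULL,
    label TEXT NOT NULL
)');
// One row per option a user ticked (hypothetical table name).
$pdo->exec('CREATE TABLE UserReplyOptions (
    user INTEGER NOT NULL,
    question INTEGER NOT NULL,
    option_id INTEGER NOT NULL REFERENCES QuestionOptions(id),
    PRIMARY KEY (user, question, option_id)
)');

$pdo->exec("INSERT INTO QuestionOptions (id, question, label)
            VALUES (5, 1, 'ABS'), (6, 1, 'Air bags')");
// User 1 picked ABS; user 2 picked both options.
$pdo->exec('INSERT INTO UserReplyOptions (user, question, option_id)
            VALUES (1, 1, 5), (2, 1, 5), (2, 1, 6)');

// Most popular options for one question.
$stmt = $pdo->prepare(
    'SELECT o.label, COUNT(*) AS votes
     FROM UserReplyOptions r
     JOIN QuestionOptions o ON o.id = r.option_id
     WHERE r.question = ?
     GROUP BY o.id, o.label
     ORDER BY votes DESC, o.label'
);
$stmt->execute([1]);
$popularity = $stmt->fetchAll(PDO::FETCH_KEY_PAIR); // label => votes
```

The popularity count becomes a plain GROUP BY, with no string splitting in PHP.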

Related

inserting form data into mysql

Please advise how to do this PHP/MySQL form and data insert. I already searched this site and couldn't find any question about it.
I have a form that collects student information: student_info (fields: id, name, sex, dob). I can insert this into a table. Now I would like to create two other tables like this:
male_students (id, student_info_id, male_names)
female_students (id, student_info_id, female_names)
My idea behind these two separate tables is that I can easily show the lists of male and female students with a SELECT query.
I thought I could do it as follows, but I am not sure how, or whether this is even the right approach. For example:
1. I have a script called form_submit.php, which contains the form.
2. Filling in and submitting the form inserts data into the student_info table.
3. While doing step 2, I would like to check if ($sex == male) or if ($sex == female) and insert into male_students or female_students respectively.
But I am stuck:
Should I just write three individual queries inside form_submit.php?
How do I get the student_info_id for these two tables? I thought of LAST_INSERT_ID, but I am confused about what will happen if two users fill out the form at the same time. How should I approach this?
If this is not the right approach at all, how should I populate the data for those two tables?
Please advise.
Regards

There is absolutely no reason to split "males" and "females" into their own tables in this scenario. (And I'm at a loss to imagine any scenario where it would make sense.)
The entity you're storing is, for lack of a better term, a Person. (User, Individual, etc. could be used in this context as well. Stick with whatever language is appropriate for the domain.) So a Person is a record in a table. Gender is an attribute of a Person, so it's a data element on that table. A highly simplified structure to convey this might be:
Person
----------
ID (integer)
GivenName (string)
FamilyName (string)
Gender (enumeration)
The Gender value would simply be a selected value from whichever possible options are available. Such options might include:
Male
Female
Unknown
Undisclosed
There are medical cases where there may be even more options, and psychological cases may indeed further add to the set. But for most domains that might be covered by "Unknown" or "Undisclosed" (or perhaps "Other" as an option, though that might look strange on the form to the vast majority of users).
To select this information, you'd simply add a WHERE clause to your query. Something like this:
SELECT * FROM Person WHERE Gender=1
If 1 maps to, for example, Male then this would select all Persons who have a Gender attribute of Male.
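To make that concrete, here is a small sketch of the single-table approach, assuming PDO with in-memory SQLite in place of MySQL. The numeric gender codes and the sample people are assumptions for illustration only.

```php
<?php
// Numeric codes for the Gender column; the values are an assumption of this sketch.
const GENDER_MALE = 1;
const GENDER_FEMALE = 2;
const GENDER_UNDISCLOSED = 3;

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE Person (
    ID INTEGER PRIMARY KEY,
    GivenName TEXT NOT NULL,
    FamilyName TEXT NOT NULL,
    Gender INTEGER NOT NULL
)');

// Hypothetical sample rows.
$insert = $pdo->prepare(
    'INSERT INTO Person (GivenName, FamilyName, Gender) VALUES (?, ?, ?)'
);
$insert->execute(['Bob', 'Jones', GENDER_MALE]);
$insert->execute(['Alice', 'Smith', GENDER_FEMALE]);
$insert->execute(['Dan', 'Brown', GENDER_MALE]);

// Listing one gender is just a WHERE clause on the same table.
$select = $pdo->prepare(
    'SELECT GivenName FROM Person WHERE Gender = ? ORDER BY GivenName'
);
$select->execute([GENDER_MALE]);
$maleNames = $select->fetchAll(PDO::FETCH_COLUMN);
```

With this layout the form handler needs one INSERT per submission, so there is no second table to keep in sync.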

mysql storing right and wrong answer strings

I have a game. In the game, people make many choices, each between 2 options.
Each choice can be either right or wrong, and I am storing the result of a player's run through the game (which can be very long) as a string with 1 for a right answer and 0 for a wrong answer.
So, for example, player 128937 will have the string 00010101010010001010111 stored in his run column, which is a varchar(5000).
Is there a better way to store this information in MySQL? (I am using PHP too, if that helps.)

I would create a new table (say it's called 'answers') with three columns:
question_id, user_id and answer (which will hold 0 or 1).
Every time the player answers a question, you INSERT a new row into this table.
This way it will be easier to keep track of the totals of right and wrong answers.
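A possible shape for that table and the totals query, sketched with PDO and in-memory SQLite instead of MySQL. The composite primary key and the sample run are assumptions.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE answers (
    question_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    answer INTEGER NOT NULL, -- 1 = right, 0 = wrong
    PRIMARY KEY (question_id, user_id)
)');

// Hypothetical run for player 128937: right, wrong, right, right.
$insert = $pdo->prepare(
    'INSERT INTO answers (question_id, user_id, answer) VALUES (?, ?, ?)'
);
foreach ([1, 0, 1, 1] as $index => $answer) {
    $insert->execute([$index + 1, 128937, $answer]);
}

// Right and wrong totals for one player.
$stmt = $pdo->prepare(
    'SELECT SUM(answer) AS right_count, COUNT(*) - SUM(answer) AS wrong_count
     FROM answers
     WHERE user_id = ?'
);
$stmt->execute([128937]);
$totals = $stmt->fetch(PDO::FETCH_ASSOC);
```

Because every answer is its own row, per-question statistics (how many players got question 3 right, say) are equally simple to query.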

Why not use a tinyint(1) for each option rather than using strings?

I would make multiple tables:
choices
    id
    scenario (or other title)
options
    id
    choice_id
    title (example: "go left" or "turn around and go home")
    correct (0 or 1)
user_choices
    user_id
    option_id
    choice_id (optional, since choice_id is already in the options table)
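Here is a rough sketch of those three tables and a score query, assuming PDO with in-memory SQLite in place of MySQL. The scenarios, the extra option titles, and the player id are made-up sample data, and the optional choice_id on user_choices is left out.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE choices (id INTEGER PRIMARY KEY, scenario TEXT NOT NULL)');
$pdo->exec('CREATE TABLE options (
    id INTEGER PRIMARY KEY,
    choice_id INTEGER NOT NULL REFERENCES choices(id),
    title TEXT NOT NULL,
    correct INTEGER NOT NULL -- 0 or 1
)');
$pdo->exec('CREATE TABLE user_choices (
    user_id INTEGER NOT NULL,
    option_id INTEGER NOT NULL REFERENCES options(id)
)');

// Hypothetical game content: two choices with two options each.
$pdo->exec("INSERT INTO choices (id, scenario)
            VALUES (1, 'Fork in the road'), (2, 'Dark cave')");
$pdo->exec("INSERT INTO options (id, choice_id, title, correct) VALUES
    (1, 1, 'go left', 1),
    (2, 1, 'turn around and go home', 0),
    (3, 2, 'light a torch', 1),
    (4, 2, 'walk in blind', 0)");
// The player picked one right option and one wrong option.
$pdo->exec('INSERT INTO user_choices (user_id, option_id)
            VALUES (128937, 1), (128937, 4)');

$stmt = $pdo->prepare(
    'SELECT COUNT(*) AS answered, SUM(o.correct) AS right_count
     FROM user_choices uc
     JOIN options o ON o.id = uc.option_id
     WHERE uc.user_id = ?'
);
$stmt->execute([128937]);
$score = $stmt->fetch(PDO::FETCH_ASSOC);
```

Whether an answer was right is derived from the options table at query time, so it is stored in exactly one place.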

PHP/MySQL: Handling Questionnaire Input

I have a questionnaire for users to be matched by similar interests: 40 categories, each with 3 to 10 subcategories. Each of the subcategories has a 0 - 5 value related to how interested they are in that subcategory (0 being not even remotely interested, 5 being a die-hard fan). Let's take an example for a category, sports:
<label><input type="radio" name="int_sports_football" value="0"> 0</label>
<label><input type="radio" name="int_sports_football" value="1"> 1</label>
<label><input type="radio" name="int_sports_football" value="2"> 2</label>
<label><input type="radio" name="int_sports_football" value="3"> 3</label>
<label><input type="radio" name="int_sports_football" value="4"> 4</label>
<label><input type="radio" name="int_sports_football" value="5"> 5</label>
With so many of these, I have a table for the interest categories, but due to the size I have been using CSV format for the subcategory values (bad practice for numerous reasons, I know).
Right now, I don't have the resources to create an entire database devoted to interests, and having 40 tables of data in the profiles database is messy. I've been pulling the CSV out (which looks like 0,2,4,1,5,1), exploding it, and using the numbers as needed, which seems really inefficient.
If it were simply yes/no, I could see doing bit masking (which I do in another spot; maybe there's a way to make this work with 6-ary values?). Is there another way to store this sort of categorized data efficiently?

You do not do this by adding an extra field per question to the user table, but rather you create a table of answers where each answer record stores a unique identifier for the user record. You can then query the two tables together using joins in order to isolate only those answers for a specific user. In addition, you want to create a questions table so you can link the answer to a specific question.
table 1) user: (uniqueID, identifying info)
table 2) answers: (uniqueID, userID, questionID, text) links to a unique userID and a unique questionID
table 3) question: (uniqueID, subcategoryID, text) links to the uniqueID of a subcategory (e.g. football)
table 4) subcategories: (uniqueID, maincategoryID, text) links to the uniqueID of a main category (e.g. sports)
table 5) maincategories: (uniqueID, text)
An individual user has one user record, but MANY answer records. As the user answers a question, a new record is created in the answers table, storing the uniqueID of the user, the uniqueID of the question, and the value of their answer.
An answer record is linked to a single user record (by referencing the user's uniqueID field) and a single question record (via uniqueID of question).
A question record is linked to a single subcategory record.
A subcategory record is linked to a single category record.
Note this scheme only handles two levels of categories: sports->football. If you have 3 levels, then add another level in the same manner. If your levels are arbitrary, there may be some other scheme more suited.
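As a sketch of how the five tables join together, assuming PDO with in-memory SQLite in place of MySQL. The column names are simplified (id instead of uniqueID, an integer value instead of the answer text), and the sample user, questions, and ratings are invented.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE maincategories (id INTEGER PRIMARY KEY, text TEXT NOT NULL)');
$pdo->exec('CREATE TABLE subcategories (
    id INTEGER PRIMARY KEY,
    maincategory_id INTEGER NOT NULL REFERENCES maincategories(id),
    text TEXT NOT NULL
)');
$pdo->exec('CREATE TABLE question (
    id INTEGER PRIMARY KEY,
    subcategory_id INTEGER NOT NULL REFERENCES subcategories(id),
    text TEXT NOT NULL
)');
$pdo->exec('CREATE TABLE answers (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(id),
    question_id INTEGER NOT NULL REFERENCES question(id),
    value INTEGER NOT NULL -- 0 to 5 interest rating
)');

// Hypothetical sample data.
$pdo->exec("INSERT INTO user (id, name) VALUES (1, 'alice')");
$pdo->exec("INSERT INTO maincategories (id, text) VALUES (1, 'sports')");
$pdo->exec("INSERT INTO subcategories (id, maincategory_id, text)
            VALUES (1, 1, 'football'), (2, 1, 'hockey')");
$pdo->exec("INSERT INTO question (id, subcategory_id, text) VALUES
    (1, 1, 'How interested are you in football?'),
    (2, 2, 'How interested are you in hockey?')");
$pdo->exec('INSERT INTO answers (user_id, question_id, value)
            VALUES (1, 1, 4), (1, 2, 0)');

// All ratings for one user, labelled with category and subcategory.
$stmt = $pdo->prepare(
    'SELECT m.text AS category, s.text AS subcategory, a.value
     FROM answers a
     JOIN question q ON q.id = a.question_id
     JOIN subcategories s ON s.id = q.subcategory_id
     JOIN maincategories m ON m.id = s.maincategory_id
     WHERE a.user_id = ?
     ORDER BY m.text, s.text'
);
$stmt->execute([1]);

$interests = [];
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $interests[$row['category']][$row['subcategory']] = (int) $row['value'];
}
```

All five tables live in the same database, so this does not require a separate database per category.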

Okay, so given that you have 40 categories and, let's assume, 10 subcategories each, that leaves us with 400 question-answer pairs per user.
Now, in order to design the best intermediary data storage, I would suggest starting out with a few questions:
1) What type of analysis will I need?
2) What resources do I have?
3) Is this a one-time solution, or should it be reusable in the future?
If I were you, I would stick to a very simple database structure, e.g.:
question_id | user_id | answer
If I foresaw more polls of this kind with the same questions and probably the same respondents, I would further extend the structure with a "campaign_id". This would work as raw data storage that allows quick and easy statistics of any kind.
Now, you said a database is not an option. Well, you can mimic this very same structure using arrays and create your own statistical interface on top of the array storage, BUT you would save their time and yours if you could get SQL. As others suggest, there is always SQLite (a file-based database engine), which is easy to set up and use.
If all that does not make you happy, there is another interesting approach. If the data set is fixed, meaning there are pretty much no conditional questions, and you can create a question index, you could create a funny 400-byte answer chunk in which each byte holds the answer to one question. You then write your statistical methods so that, based on the question id, they can easily operate on the $answer[$user][$nth] byte (or $answer[$nth][$user], again depending on the type of statistics you need).
This should help you get your mind set on the goal you want to achieve.
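The byte-chunk idea could be sketched like this. The function names and the fixed count of 400 questions are assumptions; each answer (0 to 5) occupies one byte at the position given by the question index.

```php
<?php
const QUESTION_COUNT = 400; // 40 categories x 10 subcategories, assumed fixed

/**
 * Pack answers (question index => value 0..5) into a 400-byte string.
 * Unanswered questions stay at byte value 0.
 */
function packAnswers(array $answers): string
{
    $chunk = str_repeat("\0", QUESTION_COUNT);
    foreach ($answers as $index => $value) {
        if ($index < 0 || $index >= QUESTION_COUNT || $value < 0 || $value > 5) {
            throw new InvalidArgumentException("Bad answer at index $index");
        }
        $chunk[$index] = chr($value);
    }
    return $chunk;
}

/** Read the answer for one question index back out of the chunk. */
function answerAt(string $chunk, int $index): int
{
    return ord($chunk[$index]);
}

$chunk = packAnswers([0 => 5, 1 => 2, 399 => 4]);
```

Such a chunk would need a binary-safe column (for example a BINARY or BLOB type) or a file, and it only works while the question index never changes.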

I know you said you don't have the resources to create a database, but I disagree. Using SQL seems like your best bet and PHP includes SQLite (http://us2.php.net/manual/en/book.sqlite.php) which means you wouldn't need to set up a MySQL database if that were a problem.
There are also tools for both MySQL and SQLite which would allow you to create tables and import your data from the CSV files without any effort.

Maybe I am confused, but it seems like you need a well-designed relational database.
For example:
tblCategories (pkCategoryID, fldCategoryName)
tblSubCategory (pkSubCategoryID, fldSubCategoryName)
tblCategorySubCategory (fkCategoryID, fkSubCategoryID)
Then use inner joins to populate the pages. Hopefully this helps you :)

I consider a NoSQL-style architecture a solution for scaling a MySQL field in agile projects.
To get it done ASAP, I'd create a class for the "interest" category that constructs sub-category instances extending the category parent class. These carry the answers as properties, which would be stored as a JSON object in that field. In the example below, the top-level keys are categories, the next level holds sub-categories, and the numbers are the interest answers:
{
    "music": {
        "instruments": {
            "guitar": 5,
            "piano": 2,
            "violin": 0,
            "drums": 4
        },
        "fav artist": {
            "lady gaga": 1,
            "kate perry": 2,
            "Joe satriani": 5
        }
    },
    "sports": {
        "fav sport": {
            "soccer": 5,
            "hockey": 2
        },
        "fav player": {
            "messi": 5,
            "Jordan": 5
        }
    }
}
Note that the "category" class should be abstract to keep the object architecture right.
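Reading and updating such a JSON field from PHP might look roughly like this. The sketch uses a shortened version of the document above and plain arrays rather than the category classes; how the string gets in and out of the database column is left out.

```php
<?php
// Shortened version of the JSON document, as it would be read from the column.
$json = '{
    "music":  {"instruments": {"guitar": 5, "piano": 2}},
    "sports": {"fav sport": {"soccer": 5, "hockey": 2}}
}';

// Decode into nested associative arrays; throws JsonException on invalid JSON.
$interests = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// Read one answer, defaulting to 0 when the key is missing.
$guitar = $interests['music']['instruments']['guitar'] ?? 0;
$tennis = $interests['sports']['fav sport']['tennis'] ?? 0;

// Change an answer and serialize the whole structure for storage again.
$interests['sports']['fav sport']['hockey'] = 3;
$stored = json_encode($interests, JSON_THROW_ON_ERROR);
```

The trade-off is that the database cannot easily query or index individual answers inside the stored string.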

How to handle Tree structures returned from SQL query using PHP?

This is a "theoretical" question.
I'm having trouble defining the question so please bear with me.
Say you have several related tables in a database, for example a table that holds "users" and a table that holds "phones".
Both "phones" and "users" have a column called "user_id".
select users.user_id, name, phone from users left outer join phones on phones.user_id = users.user_id;
The query returns rows for all the users, whether they have a phone or not.
If a user has several phones, his name is returned in several rows, as expected.
columns => | user_id | name | phone |
row0    => | 0       | fred | NULL  |
row1    => | 1       | paul | tlf1  |
row2    => | 1       | paul | tlf2  |
The name "paul" in the case above is a necessary duplicate, which in the RDBMS's eyes is not a duplicate at all!
The result will then be handled by some server-side scripting language, for example PHP.
How are these "necessary duplicates" actually handled in real websites or applications?
That is, how are the rows "mapped" into some usable object model?
P.S. If you decide to post examples, post them for PHP, MySQL, SQLite if possible.
Edit:
Thank you for providing answers. Each answer has interpreted the question differently, and as such each is different and correct in its own way.
I have come to the conclusion that if round trips are expensive, the single join query will be the best way, along with Jakob Nilsson-Ehle's solution, which was fitting for the theoretical question.
If round trips are cheap, I will do separate selects for phones and users as 9000 suggests. If I need to show a single phone for every user, I will add a primary column to the phones table and join on it in the user select, as Ollie Jones correctly suggests.
Even though I'm using 9000's answer for real-life applications, I think that for this unrealistic question Jakob Nilsson-Ehle's solution is the most appropriate.

What I would probably do in this case in PHP is use the user ID as the key of a PHP array and keep updating that user's entry as rows come in.
A very simple example would be
// The mysql_* functions were removed in PHP 7, so PDO is used here instead.
// Assumes $pdo is an existing PDO connection to the database.
$result = $pdo->query('SELECT users.user_id, name, phone FROM users LEFT OUTER JOIN phones ON phones.user_id = users.user_id');
$users = [];
foreach ($result as $row) {
    $uid = $row['user_id'];
    if (!array_key_exists($uid, $users)) {
        // First row for this user: create the entry with an empty phone list.
        $users[$uid] = ['name' => $row['name'], 'phones' => []];
    }
    if ($row['phone'] !== null) {
        // Users without a phone come back with phone = NULL; skip those.
        $users[$uid]['phones'][] = $row['phone'];
    }
}
Of course, depending on your programming style and the complexity of the user data, you might define a User class or something similar and populate it with the data, but that is fundamentally how I would do it.

Your data model inherently allows a user to have 0, 1, or more phones.
You could get your database to return either 0 or 1 phone items for each user by employing a nasty hack, like choosing the numerically smallest phone number. (MIN(phone) ... GROUP BY user). But numerically comparing phone numbers makes very little sense.
Your problem of ambiguity (which of several phone numbers) points to a problem in your data model design. Take a look, if you will, at some common telephone-directory apps. A speed-dial app on a mobile phone is a good example. Mostly they offer ways to put in multiple phone numbers, but they always have the concept of a primary phone number.
If you add a column to your phone table indicating number priority, and make it part of your primary (unique) key, and declare that priority=1 means the user's primary number, your app will not have this ambiguous issue any more.
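A sketch of the priority column, assuming PDO with in-memory SQLite in place of MySQL. The column name priority follows the text; the sample rows reuse the users and phones from the question.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE phones (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    priority INTEGER NOT NULL, -- 1 = primary number
    phone TEXT NOT NULL,
    PRIMARY KEY (user_id, priority)
)');

$pdo->exec("INSERT INTO users (user_id, name) VALUES (0, 'fred'), (1, 'paul')");
$pdo->exec("INSERT INTO phones (user_id, priority, phone)
            VALUES (1, 1, 'tlf1'), (1, 2, 'tlf2')");

// Exactly one row per user: the primary phone, or NULL if there is none.
$rows = $pdo->query(
    'SELECT u.user_id, u.name, p.phone
     FROM users u
     LEFT JOIN phones p ON p.user_id = u.user_id AND p.priority = 1
     ORDER BY u.user_id'
)->fetchAll(PDO::FETCH_ASSOC);
```

Putting the priority test in the ON clause rather than in WHERE keeps users without any phone in the result.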

You can't easily get a tree structure from an RDBMS, only a table structure. And you want a tree: [(user1, (phone1, phone2)), (user2, (phone2, phone3))...]. You can optimize towards different goals, though.
If round-trips are more expensive than sending extra info: go with your current solution. It fetches the username multiple times, but you only have one round-trip per entire phone book. This may make sense if your overburdened MySQL host is 1000 miles away.
If sending extra info is more expensive than round-trips, or you want more clarity: as #martinho-fernandes suggests, only fetch user IDs with phones, then fetch user details in another query. I'd stick with this approach unless your entire user details consist of a short username. With SQLite I'd stick with it at all times, just for the sake of clarity.
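The two-query variant might look roughly like this, assuming PDO with in-memory SQLite and the users and phones tables from the question. It is one possible reading of the approach: users first, then phones, stitched together in PHP.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_ASSOC);

$pdo->exec('CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE phones (user_id INTEGER NOT NULL, phone TEXT NOT NULL)');
$pdo->exec("INSERT INTO users (user_id, name) VALUES (0, 'fred'), (1, 'paul')");
$pdo->exec("INSERT INTO phones (user_id, phone) VALUES (1, 'tlf1'), (1, 'tlf2')");

// Query 1: every user exactly once, each starting with an empty phone list.
$users = [];
foreach ($pdo->query('SELECT user_id, name FROM users ORDER BY user_id') as $row) {
    $users[$row['user_id']] = ['name' => $row['name'], 'phones' => []];
}

// Query 2: phones only, attached to the user entries built above.
foreach ($pdo->query('SELECT user_id, phone FROM phones ORDER BY user_id, phone') as $row) {
    $users[$row['user_id']]['phones'][] = $row['phone'];
}
```

No user name is transferred twice, at the cost of a second round-trip.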

Sounds like you're confusing the object data model with the relational data model. Understanding how they differ, both in general and in the specifics of your application, is essential to writing OO code on top of a relational database.
Trivial ORM is not the solution.
There are ORM technologies such as Hibernate; however, these do not scale well. In my experience, the best solution is to use a factory pattern to manage the mapping properly.

Storing and Displaying Questionnaire Data, Easier Solution?

Basically this is a questionnaire, but one that asks more than just Yes/No questions. Some questions are asked in the form of a table. Here is an example of one of the step pages for the questionnaire:
The questionnaire allows a client to save what they have entered, log out, come back later to continue filling out the rest, and then submit. An admin will also review the questionnaire and give the vendor access to only some of the questions in case any corrections are required.
The following text describes my solution for storing the data, but I was wondering if there is a simpler way of doing this. Here is also the database design for the Questions and Answers tables, located on the right side of the image.
The column names of a table question should be stored as separate questions as well, so that, for instance, question 4.a will have:
4.a.1, "Standard"
4.a.2, "Certifying Organization"
4.a.3, "Date of Last Certification"
Displaying this would be pretty simple. If we give such a question a new type, for instance TABLECOL, we would know to create a table and a table column. Also, since the data is going to be pulled out in ascending order, creating the HTML should not be a problem. Right now I think we will be fine with all the cells as text inputs. (In the future, one of the columns in a table might not be a text input; it could, for instance, be a drop-down, so we would then need a way to describe which input to use.)
When displaying the table in HTML, question 4.a has three rows by default. Other questions have a different number. I was also thinking about validation for table columns. For all of this, I was thinking about creating a new table called QuestionAttributes, which would store any number of attributes for a question id. For example, question 4.a is a TABLE and should display 3 rows, so the attributes table would have an entry such as:
id of 4.a, "MINROWS", 3
To store the data, we would have to add a new field to the answers table that makes each answer unique and sortable. Instead of an auto-increment value, I would store a UTC timestamp, which would also record when the answer was given, if need be. Sorting on it lets us display the data in the correct order in the table on the web page. Basically, every answer in the answers table should have a different integer value.
The query to retrieve the answers should sort on the Questions table's sort_order and the Answers table's utc_timestamp. The result of this query will look something like:
4.a.1, "Answer1", 9878921
4.a.2, "Answer2", 9878923
4.a.3, "Answer3", 9878925
4.a.1, "Answer1", 9878926
4.a.2, "Answer2", 9878928
4.a.3, "Answer3", 9878929
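For illustration, turning a result like the one above into HTML table rows could be sketched as follows. It assumes the rows arrive in the order shown and that every table row has an answer for all three columns; the $answers array is a hard-coded stand-in for the query result.

```php
<?php
// Stand-in for the query result: [question number, answer, utc_timestamp],
// in the same order as the listing above.
$answers = [
    ['4.a.1', 'Answer1', 9878921],
    ['4.a.2', 'Answer2', 9878923],
    ['4.a.3', 'Answer3', 9878925],
    ['4.a.1', 'Answer1', 9878926],
    ['4.a.2', 'Answer2', 9878928],
    ['4.a.3', 'Answer3', 9878929],
];
$columnCount = 3; // question 4.a has three columns: 4.a.1 to 4.a.3

// Every $columnCount consecutive answers form one table row.
$tableRows = array_chunk(array_column($answers, 1), $columnCount);

$html = '';
foreach ($tableRows as $cells) {
    $html .= '<tr>';
    foreach ($cells as $cell) {
        $html .= '<td>' . htmlspecialchars($cell, ENT_QUOTES, 'UTF-8') . '</td>';
    }
    $html .= "</tr>\n";
}
```

If cells can be left empty, chunking by position no longer works, and an explicit row number stored with each answer would be a more robust way to group them.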
Any help would be greatly appreciated.

You're probably going to disagree, but I think the design is way overengineered, especially for a first version. I'd go with as simple a design as possible:
QuestionnaireId
Status          e.g. "Pre-Approval"
StatusDate
Answerer        e.g. "Mike Mayhem"
Question1       e.g. "Yes"
Question2       e.g. "Option6"
Question3       e.g. "Blah Blah Blah"
...
Then you can have a log table that records when someone approved an item, answered a question, and so on:
LogId
QuestionnaireId
LogDate
LogEntry        e.g. "Questionnaire approved by Bill"
For new iterations past the first version, add one-to-many or many-to-many relations only when they add huge business value. Relations are expensive in terms of complexity, and keeping complexity to a minimum is the essence of a good design.
