SELECT DISTINCT or separate normalized table? - php

We're making the plans now, so before I start work I want to make sure I'm handling things in the best way.
We have a products table to which we're adding a new field called 'format', which is going to be the structure of the product (bag, box, etc). There are no set values for this; users can enter whatever they like into that field. However, we want to show a drop-down list of all formats that the user has already entered.
There are two ways I can think of to do that: either a basic SELECT DISTINCT on the products table to get all formats the user already filled in; or a separate table that stores the formats and is linked to by the product.
Instinctively I'd like to use SELECT DISTINCT, since it would make my life easier. However, assuming a table of a billion products, which would be the best way to go?

I think I would opt for the second option (an additional table, plus a foreign key if you want to add a constraint), both because of the volume and because it lets you add management features, for example merging similar product formats.

If you decide to keep everything in one table, then build an index on the column. This should speed the processing for creating the list in the user application.
I'm somewhat agnostic about which is the best approach. Often, when designing user interfaces, you want to try out different things. Having to make database changes impedes the creative process of building the application.
On the other hand, generally when users pick things from a drop down box in the application, these "things" are excellent examples of "entities" -- and that is what tables are intended to store.
In the end, I would say do what is most convenient while developing the application. As you get closer to finalizing it, consider whether it would be better to store these things in a separate table. One of the big questions is whether you want to know all formats that have ever been used, even if no user currently has them defined.
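To make the single-table option concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table layout, index name and helper name are assumptions made up for illustration. The point is only that an index on the format column lets the database build the drop-down list from the index rather than from every product row.

```python
import sqlite3

def distinct_formats(conn):
    """Return every non-empty format currently stored on a product, sorted."""
    rows = conn.execute(
        "SELECT DISTINCT format FROM products "
        "WHERE format IS NOT NULL AND format <> '' "
        "ORDER BY format"
    )
    return [row[0] for row in rows]

# Hypothetical schema: only the 'format' column comes from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, format TEXT)"
)
# The index is what keeps the DISTINCT query cheap as the table grows.
conn.execute("CREATE INDEX idx_products_format ON products (format)")
conn.executemany(
    "INSERT INTO products (name, format) VALUES (?, ?)",
    [("Rice", "bag"), ("Cereal", "box"), ("Flour", "bag"), ("Misc", None)],
)

print(distinct_formats(conn))
```

Note that this list only contains formats that some product uses right now; a format disappears from it as soon as the last product using it is changed.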

Since you are letting users enter whatever they want I would go with the 2nd option.
Create a new table and insert in there all the new 'formats' and link to the product table.
When you write the code that adds the format the user typed in, be sure to check whether an equal value already exists in the database, so you won't need to use DISTINCT on that table either.
Also, keep the values consistent, for example by having only the first letter of each word in uppercase.
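A rough sketch of that "check before insert" step, again with Python's sqlite3 standing in for MySQL. The table names, the normalization rule and the helper names are assumptions for illustration; a UNIQUE constraint does the duplicate check so the application cannot race itself.

```python
import sqlite3

def normalize_format(raw):
    """Collapse whitespace and capitalize the first letter of each word."""
    return " ".join(word.capitalize() for word in raw.split())

def get_or_create_format(conn, raw):
    """Return the id of the format matching raw, inserting it if needed."""
    name = normalize_format(raw)
    # UNIQUE on formats.name turns this into a no-op for existing values.
    conn.execute("INSERT OR IGNORE INTO formats (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT id FROM formats WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

# Hypothetical schema for the separate-table option.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE formats (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE products (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        format_id INTEGER REFERENCES formats (id)
    );
""")

first = get_or_create_format(conn, "gift box")
second = get_or_create_format(conn, "  GIFT   Box ")
print(first == second)  # both spellings map to the same row
```

The drop-down list is then a plain SELECT on the small formats table, with no DISTINCT needed. In MySQL the closest equivalent of INSERT OR IGNORE is INSERT IGNORE.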

Related

MySQL but don't know the column names before hand

I am building a PHP/MySQL app and I am allowing users to create their own custom profile data, as many fields as they want (i.e. they can add any amount of info to their profile with additional textboxes, but there is a "CORE" set of user profile fields).
For example, they can create a new textbox on the form and call it "my pet" and/or "my favorite color". We need to store this data in a database and obviously cannot create columns for each of their choices, since we don't know what their additional info is beforehand.
One way we think we could store all the "additional info" they provide is to encode it as JSON and store it in a MySQL text field ( I love MySQL :) )
I've seen WordPress form builder plugins where you can create your own fields, so I'm thinking they must store the data in MySQL somehow, as NoSQL solutions are beyond the scope of these plugins.
I would love to stick with MySQL, but do you guys think NoSQL solutions like MongoDB/Redis would be a better fit for this?
Thanks
One way to approach this is to use a single table using the EAV paradigm, or Entity-Attribute-Value. See the Wikipedia article. That would be far tidier in most respects than letting users choose a database schema.
You could create a table of key value pairs where anything not in core would be stored. The table would look like: user_id, name_of_user_specified_field, user_specified_value;
Any name_of_user_specified_field that starts showing up a lot you could then add to the core table. This is referred to as Entity-Attribute-Value. Please note, some people consider this an anti-pattern.
If you do this, please add controls to limit the number of new entries a user can create or you might find someone stuffing your db with lots of fields :)
MySQL can handle this just fine. If the additional data is always going to be pulled out all together (i.e. you will never need to get just the pet field without any other additional fields) then you can store it serialized in a column on the users table. However, if you want a more relational model, you can store the extra data in a separate table linked by the user ID. The additional table would have columns for the user ID, the additional field name, the additional field value, and whatever else you might want with it. Then you just run a JOIN query when getting the profile to get all of the extra fields.
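For what it's worth, the separate-table (key/value) layout described in these answers could look roughly like this. It is a sketch using Python's sqlite3 in place of MySQL; every table, column and function name here is an assumption for illustration.

```python
import sqlite3

# Hypothetical schema: core fields on users, everything else as key/value rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE TABLE user_extra_fields (
        user_id     INTEGER NOT NULL REFERENCES users (id),
        field_name  TEXT NOT NULL,
        field_value TEXT,
        PRIMARY KEY (user_id, field_name)
    );
""")

def set_extra_field(conn, user_id, name, value):
    """Insert or overwrite one user-defined field."""
    conn.execute(
        "INSERT OR REPLACE INTO user_extra_fields "
        "(user_id, field_name, field_value) VALUES (?, ?, ?)",
        (user_id, name, value),
    )

def load_profile(conn, user_id):
    """Return the core email plus a dict of extra fields, or None."""
    rows = conn.execute(
        "SELECT u.email, f.field_name, f.field_value "
        "FROM users u "
        "LEFT JOIN user_extra_fields f ON f.user_id = u.id "
        "WHERE u.id = ?",
        (user_id,),
    ).fetchall()
    if not rows:
        return None
    # A user with no extra fields yields one row with NULL name/value.
    extras = {name: value for _, name, value in rows if name is not None}
    return {"email": rows[0][0], "extra": extras}
```

The composite primary key keeps each field name unique per user. Limiting how many fields one user may create, as suggested above, would be a count check before calling set_extra_field.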

Is it okay to dynamically create a MySQL table?

I'm building an AWeber-like list management system (for phone numbers, not emails).
There are campaigns. A phone number is associated with each campaign. Users can text to a number after which they will be subscribed.
I'm building "Create a New Campaign" page.
My current strategy is to create a separate table for each campaign (campaign_1,campaign_2,...,campaign_n) and store the subscriber data in it.
It's also possible to just create a single table and add a campaign_id column to it.
Each campaign is supposed to have 5k to 25k users.
Which is a better option? #1 or #2?
Option 2 makes more sense and is a widely used approach.
I suppose it really depends on the number of campaigns you're going to have. Let's give you some pros/cons:
Pros for campaign_n:
Faster queries
You can have each instance run with its own code and own database
Cons for campaign_n:
Database modifications are harder (you need to sync all tables)
You get a lot of tables
Personally I'd go for option 2 (campaign_id field), unless you have a really good reason not to.
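Here is a small sketch of option 2, using Python's sqlite3 as a stand-in for MySQL. The column names and helpers are assumptions for illustration. The UNIQUE constraint on (campaign_id, phone) both prevents double subscriptions and gives the database an index that starts with campaign_id, so per-campaign lookups do not scan other campaigns' rows.

```python
import sqlite3

# Hypothetical schema: one subscribers table shared by all campaigns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (
        id           INTEGER PRIMARY KEY,
        phone_number TEXT NOT NULL
    );
    CREATE TABLE subscribers (
        id          INTEGER PRIMARY KEY,
        campaign_id INTEGER NOT NULL REFERENCES campaigns (id),
        phone       TEXT NOT NULL,
        UNIQUE (campaign_id, phone)
    );
""")

def subscribe(conn, campaign_id, phone):
    """Subscribe a phone number to a campaign; repeat texts are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO subscribers (campaign_id, phone) VALUES (?, ?)",
        (campaign_id, phone),
    )

def subscribers_for(conn, campaign_id):
    """Return the phone numbers subscribed to one campaign, sorted."""
    rows = conn.execute(
        "SELECT phone FROM subscribers WHERE campaign_id = ? ORDER BY phone",
        (campaign_id,),
    )
    return [row[0] for row in rows]
```

Creating a new campaign is then just an INSERT into campaigns, with no schema change at runtime.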

Allowing users to build views from my database and editing those fields

I'm building a site that contains "panels" which are used as containers for various information. I have set it up so panels are editable, which is simple for panels that just contain text. For that I just grab the content from the database and wrap it in a textarea rather than a <p> tag. For panels that contain table views however this is proving to be a more difficult task.
First off I'm having trouble allowing the admin of the site to pick what information is in a given table (for example, if the admin wanted to add a panel view that showed each member's first name, last name, and picture, they could pick from those columns in my database). I've come up with a few ways to do this, but each has its own set of problems.
I tried using the INFORMATION_SCHEMA table to generate a table containing the possible tables and columns that the user can choose from. But when it comes to building the query with PDO I have problems. For instance with prepared statements you can't use a variable for the schema.
I also thought of using MySQL views but I can't seem to figure out how to do it that way either.
My second problem is allowing the admin to add rows to the tables directly. Right now all the add row template does is create a row with a text field in each column. This is good for purely text options (like first name) but for things like pictures obviously a text field won't work. Should I create a table that contains this metadata or perform the check in PHP? If it's the latter, how would I know what input type the column needs?
I think my main problem is I'm trying to solve too many things with only one design change (or not focusing on one problem at a time). It's resulting in me becoming very flustered and confused. Help is greatly appreciated, and if you need any more information, like how my database tables are currently set up, I'll provide an ERD.
Edit: I just wanted to make it clear that I don't want to allow the user to actually manipulate the tables in the database, but rather select what information from the existing tables is shown on a given panel.
Coding the ability for users to freely query a database has a lot of problems (including security) and is way more complicated than predefined information queries that simply return a defined set of information.
It also places the burden of defining which info might be useful onto the user. It places the burden of deciding whether a certain information should be accessible to a particular user onto the query logic and database access rules.
Effectively you are trying to copy phpMyAdmin with a different design and only your defined database as a target.
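If the admin only needs to choose among known columns, one common middle ground is an allow-list: table and column names cannot be bound as prepared-statement parameters, so they are checked against a fixed set before being placed in the SQL text. The sketch below is in Python rather than PHP, and the table name, column names and function name are all assumptions for illustration.

```python
# Hypothetical allow-list: table name -> columns an admin may show on a panel.
ALLOWED_COLUMNS = {
    "members": {"first_name", "last_name", "picture_url"},
}

def build_panel_query(table, columns):
    """Build a SELECT for a panel, refusing anything outside the allow-list."""
    allowed = ALLOWED_COLUMNS.get(table)
    if allowed is None:
        raise ValueError(f"table not allowed: {table!r}")
    if not columns:
        raise ValueError("at least one column is required")
    rejected = [c for c in columns if c not in allowed]
    if rejected:
        raise ValueError(f"columns not allowed: {rejected!r}")
    # Safe to interpolate: every identifier was matched against the allow-list.
    column_list = ", ".join(f'"{c}"' for c in columns)
    return f'SELECT {column_list} FROM "{table}"'

print(build_panel_query("members", ["first_name", "last_name"]))
```

The same allow-list could also carry the metadata asked about in the question, for example the input type each column needs, instead of inferring it from INFORMATION_SCHEMA. MySQL quotes identifiers with backticks rather than double quotes by default.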

Tracking data changes

I work on a market research database centric website, developed in PHP and MySQL.
It consists of two big parts – one in which users insert and update their own data (let's say one table T with a user_id field) and another in which a website administrator can insert new or update existing records (same table).
Obviously, in some cases end users will have their data overridden by the administrator while in other cases, administrator entered data is updated by end users (it is fine both ways).
The requirement is to highlight the view/edit forms with (let’s say) blue if end user was the last to update a certain field or red if the administrator is to “blame”.
I am looking into an efficient and consistent method to implement this.
So far, I have the following options:
For each field in table T, add a companion column ( char(1) ) in which to write ‘U’ if the end user inserted/updated the field or ‘A’ if the administrator did so. When the view/edit form is rendered, use this information to highlight each field accordingly.
Create a new table H storing an edit history containing something like user_id, field_name, last_update_user_id. Keep table H up-to-date when fields are updated in main table T. When the view/edit form is rendered, use this information to highlight each form field accordingly.
What are the pros/cons of these options; can you suggest others?
I suppose it just depends how forward-looking you want to be.
Your first approach has the advantage of being very simple to implement, is very straightforward to update and utilize, and also will only increase your storage requirements very slightly, but it's also the extreme minimum in terms of the amount of information you're storing.
If you go with the second approach and store a more complete history, if you need to add an "edit history" in the future, you'll already have things set up for that, and a lot of data waiting around. But if you end up never needing this data, it's a bit of a waste.
Or if you want the best of both worlds, you could combine them. Keep a full edit history but also update the single-character flag in the main record. That way you don't have to do any processing of the history to find the most recent edit, just look at the flag. But if you ever do need the full history, it's available.
Personally, I prefer keeping more information than I think I'll need at the time. Storage space is very cheap, and you never know when it's going to come in handy. I'd probably go even further than what you proposed, and also make it so the edit history keeps track of what they changed, and the before/after values. That can be very handy for debugging, and could be useful in the future depending on the project's exact needs.
Yes, implement an audit table that holds copies of the historical data, who changed it, and so on. I currently work on a system that keeps it simple and writes the value changes as simple name-value string pairs along with the date and the person who made the change. It requires mandatory master record adjustment, but works well for tracking. You could implement this easily with a trigger.
The best way to audit data changes is through a trigger on the database table. In your case you may want to just update the last person to make the change. Or you may want a full auditing solution where you store the previous values making it easy to restore them if they were made in error. But the key to this is to do this on the database and not through the application. Database changes are often made through sources other than the application and you will want to know if this happened as well. Suppose someone hacked into the database and updated the data, wouldn't you like to be able to find the old data easily or know who did it even if he or she did it through a query window and not through the application? You might also need to know if the data was changed through a data import if you ever have to get large amounts of data at one time.
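As a rough illustration of the trigger approach, the sketch below uses Python's sqlite3 in place of MySQL (trigger syntax differs somewhat between the two). The tracked column `company`, the `updated_by` flag, the history table and the helper are assumptions for illustration, not the asker's actual schema. Every writer sets `updated_by` to 'U' or 'A', and the trigger copies the old and new values into the history table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical main table T with one tracked field, 'company'.
    CREATE TABLE t (
        id         INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL,
        company    TEXT,
        updated_by TEXT NOT NULL CHECK (updated_by IN ('U', 'A'))
    );
    CREATE TABLE t_history (
        id         INTEGER PRIMARY KEY,
        record_id  INTEGER NOT NULL,
        field_name TEXT NOT NULL,
        old_value  TEXT,
        new_value  TEXT,
        changed_by TEXT NOT NULL,
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    -- Log a history row only when the value really changes (NULL-safe).
    CREATE TRIGGER t_company_audit
    AFTER UPDATE OF company ON t
    WHEN OLD.company IS NOT NEW.company
    BEGIN
        INSERT INTO t_history
            (record_id, field_name, old_value, new_value, changed_by)
        VALUES
            (OLD.id, 'company', OLD.company, NEW.company, NEW.updated_by);
    END;
""")

def last_editor(conn, record_id, field_name):
    """Return 'U' or 'A' for the latest logged change, or None if none."""
    row = conn.execute(
        "SELECT changed_by FROM t_history "
        "WHERE record_id = ? AND field_name = ? "
        "ORDER BY id DESC LIMIT 1",
        (record_id, field_name),
    ).fetchone()
    return row[0] if row else None
```

The form can call last_editor per field to choose blue or red. This sketch logs updates only; the initial insert would need its own trigger if it should count as an edit, and one trigger (or one branch) is needed per tracked column.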

Best way to display this in a form?

On my website, I have two tables which are linked using a pivot table. What I am trying to do is let a user update the relationships between the two tables (inserting and removing records from the pivot table). I have no problem doing this in PHP, but what I am concerned about is the way the form is displayed in the user's web browser.
The way I am doing it now, is to have a table full of checkboxes, with each checkbox corresponding to a relationship between the column header and the row header (which represent the database tables). The user can check the checkbox to tell the PHP that a record should be present for that relationship (an unchecked box means there is no relationship). However this method can get quite ugly (columns stretching outside page bounds) if there are quite a few columns and quite a few rows, and is a bit tedious to use.
What would be a good way to display this form to the user?
Maybe use a data grid? These are quite powerful:
jQuery TableFilter (click "Go")
ExtJS Grid Filter (click a small down arrow ▼ that appears near the column name)
It may be a time consuming task to make it work through Ajax, though.
As this is more about the UI of the application than anything else, I don't think there is going to be a single right answer, as it will come down to a combination of what works (which is difficult without being able to see / play with things) and your personal preferences.
A few progressions I would run through:
Visual feedback
Make your table more interactive by providing visual feedback to the user. At the most basic level, try adding some colour to the cells - a colour for those that are checked. This will allow the user to quickly see which options are "in play". It may be that the reverse works better (highlighting unchecked cells) - but this all depends on what the form is doing / intending to indicate - i.e. if it's more important to make clear that the unchecked state is bad, you may want these to be red.
The next level up is to add some dynamic highlighting. If the table is huge, you may want to highlight the row and column header cells that correspond to the cell under the cursor. You could also consider highlighting the whole row / column (cross-hair style) to allow the user to examine 'companion' cells.
Dynamic table
Slightly more involved would be to add some spice to your table. Instead of showing rows and columns of check-boxes, use graphical icons / images. They are a lot easier on the eye, and will probably allow you to have tighter control over the dimensions of the table. The entire UI could then be done via JavaScript and on-click handlers - which is pretty easy these days if you employ something like jQuery.
Split the interface
This is based on the assumption that all combinations of Table A & Table B aren't set up in the pivot table to begin with - only when a user tries to relate A.item with B.item.
Instead of showing all possible combinations, show only those which are active (have an entry in the pivot table). Then provide the user with a second form (probably of two drop-downs) that allows them to relate a record from the first table to the second.
Filter the interface
Provide the user with the ability to filter the interface - to show only the relationships between a single record from one of the tables. This would have the effect of restricting your table to a single column, making it a bit easier to accommodate in the design.
However, I would still allow the user to get to the "big view" of all records, as, depending on what you are doing, such a view can be very useful to quickly cross-reference lots of records.
