I'm working on a PHP app which requires various settings to be stored in a database. The client often asks if certain things can be added or changed/removed, which has been causing problems with the table design. Basically, I had a lot of boolean fields which simply indicated if various settings were enabled for a particular record.
In order to avoid messing around with the table any more, I'm considering storing the data as a serialized array. I have read that this is considered bad practice, but I think this is a justified case for using such an approach.
Is there any real reason to avoid doing this?
Any advice appreciated.
Thanks.
The real reason is normalisation: you will break the first normal form by doing it.
However, there are many cases in which a breach of the normal forms could be considered. How many fields are you dealing with and are they all booleans?
Storing an array serialized as a string in your database will have the following disadvantages (among others):
When you need to update your settings you must first extract the current settings from the database, unserialize the array, change the array, serialize the array and update the data in the table.
When searching, you will not be able to ask the database whether a given user (or a set of users) has a given setting enabled or disabled, so you lose the ability to search on settings.
Instead, you should really consider creating another table that holds the settings you need in a one-to-many relation to your other table. That way you won't have 30 empty fields; you just have a row for each option that deviates from the default (note that this approach has some disadvantages as well, for example if you change the default).
In sum: I think you should avoid serializing arrays and putting them into the database, at least if you care even a tiny bit about the aforementioned disadvantages.
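To make the two disadvantages concrete, here is a minimal sketch that contrasts the serialized-blob layout with the one-row-per-setting layout. It is only an illustration under assumptions: it uses an in-memory SQLite database through PDO so that it runs without a server, and the table and setting names (user_blob, user_setting, dark_mode, newsletter) are made up.

```php
<?php
// Assumption: in-memory SQLite via PDO, so the sketch runs without a server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Variant 1: one serialized blob per user (hypothetical table name).
$db->exec('CREATE TABLE user_blob (user_id INTEGER PRIMARY KEY, settings TEXT NOT NULL)');
$db->prepare('INSERT INTO user_blob VALUES (?, ?)')
   ->execute([1, serialize(['newsletter' => true, 'dark_mode' => false])]);

// Changing one flag means: read, unserialize, modify, serialize, write back.
$stmt = $db->prepare('SELECT settings FROM user_blob WHERE user_id = ?');
$stmt->execute([1]);
$settings = unserialize($stmt->fetchColumn());
$settings['dark_mode'] = true;
$db->prepare('UPDATE user_blob SET settings = ? WHERE user_id = ?')
   ->execute([serialize($settings), 1]);

// Variant 2: one row per setting that deviates from the default.
$db->exec('CREATE TABLE user_setting (
    user_id INTEGER NOT NULL,
    name    TEXT    NOT NULL,
    value   INTEGER NOT NULL,
    PRIMARY KEY (user_id, name)
)');
$insert = $db->prepare('INSERT INTO user_setting VALUES (?, ?, ?)');
$insert->execute([1, 'dark_mode', 1]);
$insert->execute([2, 'dark_mode', 1]);
$insert->execute([2, 'newsletter', 0]);

// The database can now answer "who has dark_mode enabled?" directly.
$stmt = $db->prepare(
    'SELECT user_id FROM user_setting WHERE name = ? AND value = 1 ORDER BY user_id'
);
$stmt->execute(['dark_mode']);
$darkModeUsers = array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
```

With the blob layout, the same question would require loading and unserializing every user's settings in PHP.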
The proper way (which isn't always the best way)
CREATE TABLE mytable (
myid INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
mytitle VARCHAR(100) NOT NULL
);
CREATE TABLE myarrayelements (
myarrayid INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
myid INT UNSIGNED NOT NULL,
mykey VARCHAR(100) NOT NULL,
myval VARCHAR(100) NOT NULL,
INDEX(myid)
);
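$myarray = array();
$stmt = $db->prepare("SELECT mykey, myval FROM myarrayelements WHERE myid = ?"); // $db: an open mysqli connection (assumed); mysql_* was removed in PHP 7
$stmt->bind_param("i", $myid);
$stmt->execute();
$stmt->bind_result($k, $v);
while ($stmt->fetch()) $myarray[$k] = $v;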
Although sometimes it's more convenient to store a comma separated list.
One thing is that extensibility is limited. The database should not be tied to the programming environment. With plain columns, changing values in the database and debugging are also much easier, and either the database or the CGI language can be swapped for another one, such as Perl.
One of the reasons to use a relational database is to help maintain data integrity. If you just have a serialized array dumped into a blob in a table there is no way for the database to do any checking that what you have in that blob makes any sense.
Any reason you can't store your settings in a configuration file on the server? For example, I save website or application settings in a config.php rather than a database.
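A config file of that kind can be as small as a PHP file that returns an array. The sketch below is only an illustration: the keys are hypothetical, and the file is written to a temporary location purely so the example runs on its own (normally config.php would be a hand-edited file on the server).

```php
<?php
// Assumption: keys and values are hypothetical. The file is created in a temp
// directory only to keep the sketch self-contained.
$path = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($path, <<<'PHP'
<?php
return [
    'site_title'     => 'Example',
    'send_email'     => false,
    'items_per_page' => 20,
];
PHP
);

// Loading the settings is a single require that returns the array.
$config = require $path;
unlink($path);

// Read a setting with a fallback for keys that are not defined.
$perPage = $config['items_per_page'] ?? 10;
$theme   = $config['theme'] ?? 'default';
```

Adding a setting is then a one-line change to the file, with no schema change at all.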
Related
I am working on a marketplace and was wondering what the best way is to handle website settings such as title, URL, whether it uses HTTPS, contact email, version, etc.
I am trying to structure the table so that it is easy to update and so that more settings can be added and fetched over time.
I came up with two structures. The first keeps everything in one row, with the column names as the setting names and the column values as the setting values, and just echoes the values from that single row after a mysql_fetch_assoc.
The second has a new auto-increment row for every setting, which I would turn into an array after fetching so that each setting name maps to its value.
What would be your way of handling this efficiently? Thank you.
A row for each distinct option setting, using name/value pairs one per row, is probably the best way to go. It's more flexible than lots of columns; if you add an option setting you won't have to run any kind of ALTER TABLE operation.
The WordPress wp_options table works this way. See http://codex.wordpress.org/Options_API
If you had a "compound" option, you could serialize a php array to store it in a single row of the table.
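As a rough sketch of the name/value approach (not WordPress's actual API): the function names setting_get and setting_set and the settings table are hypothetical, the database is an in-memory SQLite instance accessed through PDO so that the code runs on its own, and values are stored with json_encode rather than serialize() so that scalars and "compound" options share one column.

```php
<?php
// Assumptions: in-memory SQLite via PDO; table and function names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT NOT NULL)');

// Store a setting; every value is JSON-encoded so scalars and arrays share one column.
function setting_set(PDO $db, string $name, $value): void
{
    // INSERT OR REPLACE is SQLite syntax; MySQL offers REPLACE INTO or
    // INSERT ... ON DUPLICATE KEY UPDATE for the same purpose.
    $stmt = $db->prepare('INSERT OR REPLACE INTO settings (name, value) VALUES (?, ?)');
    $stmt->execute([$name, json_encode($value)]);
}

// Fetch a setting, falling back to $default when no row exists.
function setting_get(PDO $db, string $name, $default = null)
{
    $stmt = $db->prepare('SELECT value FROM settings WHERE name = ?');
    $stmt->execute([$name]);
    $raw = $stmt->fetchColumn();
    return $raw === false ? $default : json_decode($raw, true);
}

setting_set($db, 'site_title', 'My Marketplace');
setting_set($db, 'smtp', ['host' => 'localhost', 'port' => 25]); // a "compound" option
setting_set($db, 'site_title', 'Renamed Marketplace');           // overwrite, no ALTER TABLE
```

Adding a new setting is just another setting_set() call; the table structure never changes.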
First of all, I would consider one more option: a configuration file.
Then you should ask yourself what you need for your project.
Start by weighing a config file against the database:
The big advantage of database options over a config file is scalability. If you have many applications or sites requiring those configurations, go for the database, as it saves you from copying the same config file several times, with all the versioning problems of modifying the file on all those different "sites".
Otherwise I would stick to the config file: access is faster for an application, and the file is still available during an SQL server outage, in which case some of the config may still be relevant. The config file can also be included in your version control. There are security reasons as well: imagine your DB is shared among many applications...
Then, if you stick with the database, I would recommend one row per setting (one label, one value). I consider it easier to manage records than table structure, especially over time as your software evolves. If other developers join your project, your table structure may quickly become a big mess :]
The final argument is security: a good practice is to have the software connect as a DB user that doesn't have rights to modify the DB structure, only the right to read, modify and delete records ;)
Both ways work fine. I would say that if you want to administer those settings in an admin panel, one setting per row is better, because you can add new settings on the fly with a simple INSERT query from your admin, which is better (more secure) than an ALTER TABLE query.
It depends on the technology you are using. For example, in a PHP Symfony project, settings are mainly stored in flat files (JSON, XML...).
I have worked on many big web applications for clients. A key/value table is commonly used to store simple settings. If you need to store more than one value under a key you have to serialize them, so it's a little bit tricky.
Remember to hash sensitive data such as passwords (SHA-256 + salt; in PHP, password_hash() generates the salt for you).
The best way is to create two tables.
A table to store settings as key/value :
CREATE TABLE Settings (
Id INT NOT NULL PRIMARY KEY,
`Key` NVARCHAR(255) NOT NULL, -- KEY is a reserved word, hence the backticks; the length is an assumed value
Value NVARCHAR(255) NULL,
EnvId INT NOT NULL
);
Then you need an Environment table.
CREATE TABLE Environment (
Id INT NOT NULL PRIMARY KEY,
`Key` NVARCHAR(255) NOT NULL
);
Don't forget the foreign key constraint.
Moreover, you should create these tables in a separate schema. You will then be able to apply a security policy by filtering access.
This way you can work with many environments (dev, test, production, ...); you just need to activate one Environment. For example, you can configure the development environment not to send email while the production environment does.
You then perform a join to get the settings for a specified environment. You can add a Boolean column to switch easily between environments.
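The join and the Boolean switch could look roughly like the sketch below. It is an illustration under assumptions: the Active flag, the sample rows and the function name load_active_settings are hypothetical, the key columns are called Name here to avoid the reserved word KEY, and an in-memory SQLite database is used through PDO so the example is self-contained.

```php
<?php
// Assumptions: in-memory SQLite via PDO; the Active flag and sample rows are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE Environment (
    Id     INTEGER PRIMARY KEY,
    Name   TEXT    NOT NULL,
    Active INTEGER NOT NULL DEFAULT 0
)');
$db->exec('CREATE TABLE Settings (
    Id    INTEGER PRIMARY KEY,
    Name  TEXT    NOT NULL,
    Value TEXT,
    EnvId INTEGER NOT NULL REFERENCES Environment (Id)
)');
$db->exec("INSERT INTO Environment (Id, Name, Active) VALUES (1, 'dev', 1), (2, 'prod', 0)");
$db->exec("INSERT INTO Settings (Name, Value, EnvId) VALUES
    ('email', NULL, 1),
    ('email', 'contact@example.com', 2)");

// Load the settings of whichever environment is flagged active.
function load_active_settings(PDO $db): array
{
    $sql = 'SELECT s.Name, s.Value
            FROM Settings s
            JOIN Environment e ON e.Id = s.EnvId
            WHERE e.Active = 1';
    return $db->query($sql)->fetchAll(PDO::FETCH_KEY_PAIR);
}

$devSettings = load_active_settings($db);

// Switching environments is a data change, not a schema change.
$db->exec('UPDATE Environment SET Active = (Id = 2)');
$prodSettings = load_active_settings($db);
```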
If you use a file (it doesn't need a DB connection) you can get something like this (shown here in YAML form):
Env:
  Dev:
    Email: ~
  Prod:
    Email: contact#superwebsite.com
First of all, it totally depends on your business requirements, but for now the best approach is to create a settings table. The schema is below.
CREATE TABLE `settings` (
`id` int(11) NOT NULL,
`name` varchar(255) NOT NULL,
`value` text NOT NULL,
`type` enum('general','advanced') NOT NULL DEFAULT 'general'
);
Example
site_title = "example"
site_logo = "something.jpg"
site_url = "https://www.example.com"
address = "#795 Folsom Ave, Suite 600 San Francisco"
email = "something#example.com"
mobile = "9898xxxxxx"
This is the best approach because you never know when a new key will be introduced.
Here name is the key and value is the value.
The value column's data type should be TEXT to allow long descriptions.
I have added one more column, type, with data type ENUM, to distinguish kinds of settings. You can customize type to fit your business logic.
I have a MySQL/PHP performance related question.
I need to store an index list associated with each record in a table. Each list contains 1000 indices. I need to be able to quickly access any index value in the list associated to a given record. I am not sure about the best way to go. I've thought of the following ways and would like your input on them:
Store the list in a string as a comma separated value list or using JSON. Probably terrible performance, since I need to extract the whole list out of the DB to PHP only to retrieve a single value. Parsing the string won't exactly be fast either... I can store a number of expanded lists in a Least Recently Used cache on the PHP side to reduce load.
Make a list table with 1001 columns that will store the list and its primary key. I'm not sure how costly this is regarding storage. This also feels like abusing the system. And then, what if I need to store 100000 indices?
Only store with SQL the name of the binary file containing my indices and perform a fopen(); fseek(); fread(); fclose() cycle for each access? Not sure how the filesystem cache will react to that. If it goes badly then there are many solutions available to address the issues... but that sounds a bit overkill, no?
What do you think of that?
What about a good old one-to-many relationship?
records
-------
id int
record ...
indices
-------
record_id int
index varchar
Then:
SELECT *
FROM records
LEFT JOIN indices
ON records.id = indices.record_id
WHERE indices.`index` = 'foo'
The standard solution is to create another table, with one row per (record, index), and add a MySQL index to allow fast searches.
CREATE TABLE IF NOT EXISTS `table_list` (
`IDrecord` int(11) NOT NULL,
`item` int(11) NOT NULL,
KEY `IDrecord` (`IDrecord`)
)
Change the item's type according to your needs - I used int in my example.
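To fetch a single element of a list quickly, one option is to also store each item's position. The sketch below assumes an extra pos column that is not part of the table above, a hypothetical helper list_item, made-up values, and an in-memory SQLite database accessed through PDO so that it runs on its own.

```php
<?php
// Assumptions: in-memory SQLite via PDO; the pos column is a hypothetical addition
// that records each item's position inside its list.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE table_list (
    IDrecord INTEGER NOT NULL,
    pos      INTEGER NOT NULL,
    item     INTEGER NOT NULL,
    PRIMARY KEY (IDrecord, pos)
)');

// Store a 1000-element list for record 7, inside one transaction for speed.
$db->beginTransaction();
$insert = $db->prepare('INSERT INTO table_list (IDrecord, pos, item) VALUES (?, ?, ?)');
for ($pos = 0; $pos < 1000; $pos++) {
    $insert->execute([7, $pos, $pos * 3]); // made-up values
}
$db->commit();

// Fetch a single element without loading the other 999.
function list_item(PDO $db, int $record, int $pos): ?int
{
    $stmt = $db->prepare('SELECT item FROM table_list WHERE IDrecord = ? AND pos = ?');
    $stmt->execute([$record, $pos]);
    $value = $stmt->fetchColumn();
    return $value === false ? null : (int) $value;
}

$middle = list_item($db, 7, 500);
```

The composite primary key on (IDrecord, pos) doubles as the index that makes the lookup fast, and the same layout works unchanged for 100000 items per record.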
The most logical solution would be to put each value in its own tuple. Adding a MySQL index on the lookup columns will let the DBMS locate the value quickly, and should improve performance.
The reasons we're not going with your other answers are as follows:
Option 1
Storing multiple values in one MySQL cell is a violation of the first normal form of database normalisation. You can read up on it here.
Option 3
This has heavy reliance on other files. You want to localize your data storage as much as possible, to make it easier to maintain in the future.
While designing a MySQL database for a dating website, I have started to wonder how to store the reference data. Currently the database has 33 tables and there are nearly 32 different fields which need to be referenced. We also have to consider that many of these elements need to be translated.
After reading several opinions, I have almost ruled out using enum like:
CREATE TABLE profile (
`user_id` INT NOT NULL,
...
`relationship_status` ENUM('Single','Married') NOT NULL,
...
);
And normally I would be using a reference table like:
CREATE TABLE profile (
`user_id` INT NOT NULL,
...
`relationship_status_id` INT NOT NULL,
...
);
CREATE TABLE relationship_status (
`id` INT NOT NULL,
`name` VARCHAR(45) NOT NULL,
PRIMARY KEY (`id`)
);
But it might be overkill to create 32 tables, so I'm considering coding it in PHP like this:
class RelationshipStatusLookUp{
const SINGLE = 1;
const MARRIED = 2;
public static function getLabel($status){
if($status == self::SINGLE)
return 'Single';
if($status == self::MARRIED)
return 'Married';
return false;
}
}
What do you think? I guess it could improve the performance of the queries and also make the development of the whole site easier.
Thanks.
Definitely a good idea to steer clear of ENUM IMHO: why ENUM is evil. Technically a lookup table would be the preferred solution, although for simple values a PHP class would work. You do need to be careful with this for the same reasons as ENUM; if the values in your set grow it could become difficult to maintain. (What about "co-habiting", "divorced", "civil partnership", "widowed" etc.?) It is also not trivial to query for lists of values using PHP classes; it's possible using reflection but not as easy as a simple MySQL SELECT. This is probably one of those cases where I wouldn't worry about performance until it becomes a problem. Use the best solution for your code/application first, then optimise if you need to.
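For reference, listing the values of such a class through reflection might look like the sketch below. The helper name status_options and the way labels are derived from constant names are assumptions made for illustration; the class is the one from the question, reduced to its constants.

```php
<?php
// Same class as in the question, reduced to its constants.
class RelationshipStatusLookUp
{
    const SINGLE = 1;
    const MARRIED = 2;
}

// Build an id => label map from the class constants via reflection.
// (Hypothetical helper; labels are derived from the constant names.)
function status_options(): array
{
    $constants = (new ReflectionClass('RelationshipStatusLookUp'))->getConstants();
    $options = [];
    foreach ($constants as $name => $id) {
        // SINGLE -> Single; a real site would look up a translated label here.
        $options[$id] = ucfirst(strtolower($name));
    }
    return $options;
}

$options = status_options();
```

It works, but every new status still needs a code change and a release, which is the maintenance concern raised above.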
enum fields present some issues:
Once they're set, they can't easily be changed
'relationship_status' ENUM('Single','Married') NOT NULL,
would need 'Civil Partnership' adding in this country nowadays
You can't easily create a dropdown list of options from the enum lists
However, data in the database can be subjected to referential integrity constraints, so using a foreign key link against a reference table gives you that degree of validation without the constraints of an enum.
Maintaining the options in a class requires a code change for any new options that have to be added to the data, which may increase the work involved depending on your release procedures, and doesn't prevent bad data being inserted into the database.
Personally, I'd go for a reference table.
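A minimal sketch of what the reference table buys you, under assumptions: it runs against an in-memory SQLite database through PDO so it needs no server (SQLite needs foreign keys switched on explicitly, whereas InnoDB enforces them by default), and the sample rows are made up.

```php
<?php
// Assumption: in-memory SQLite via PDO, used only to keep the sketch self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('PRAGMA foreign_keys = ON'); // SQLite-specific switch
$db->exec('CREATE TABLE relationship_status (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$db->exec('CREATE TABLE profile (
    user_id INTEGER PRIMARY KEY,
    relationship_status_id INTEGER NOT NULL REFERENCES relationship_status (id)
)');
$db->exec("INSERT INTO relationship_status (id, name) VALUES (1, 'Single'), (2, 'Married')");

// Adding an option is an INSERT, with no schema or code change.
$db->exec("INSERT INTO relationship_status (id, name) VALUES (3, 'Civil Partnership')");

// A dropdown list comes from a plain SELECT.
$options = $db->query('SELECT id, name FROM relationship_status ORDER BY id')
              ->fetchAll(PDO::FETCH_KEY_PAIR);

// The foreign key rejects a status that does not exist.
$rejected = false;
try {
    $db->exec('INSERT INTO profile (user_id, relationship_status_id) VALUES (1, 99)');
} catch (PDOException $e) {
    $rejected = true;
}
```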
First off, you wouldn't need id and relationship_status_id in the Relationship_status table.
Personally, I would use an enum unless you need to associate more data than just the name of the person's relationship status (or if you foresee needing to expand on this in the future). It will be much easier when you're looking at the database to see what's what if it is in an easily readable language versus having to query against a second table.
When you are considering performance, sure it's faster to query a table by a unique ID but you have to track that relationship and you will always be joining multiple tables to get the same data. If the enum solution ends up being slower, I don't think it will be enough that the human brain will be able to perceive the difference even with large data sets.
I just came across the idea of writing a special database which will fit for exactly one purpose. I have looked into several other database-systems and came to the conclusion that I need a custom type. However my question is not about if it is a good idea, but how to implement this best.
The application itself is written in php and needs to write to a custom database system.
Because there can be simultaneous read/write operations I can forget the idea of implementing the database directly into my application. (correct me please if I'm wrong).
That means I have to create 2 scripts:
The database-server-script
The application.
This means that the application has to communicate with the server. My idea was to use PHP in CLI mode for the database server. The question is whether this is efficient, or whether I should look into a programming language like C++ to develop the server application. The second question is about the communication. When using PHP in CLI mode I thought about passing a serialized array query as a parameter. When using C++, should I still serialize it, or use JSON, or something else?
I should note that a database to search through can consist of several thousand entries, so I don't know whether PHP is really the right choice.
I should also note that queries aren't strings which have to be parsed, but an array giving a key/value filter or dataset. The only more complex thing the database server has to be able to do is compare strings like the MySQL LIKE '%VALUE%', which could be slow with several thousand entries.
Thanks for the Help.
writing a special database which will fit for exactly one purpose
I presume you mean a custom database management system,
I'm having a lot of trouble understanding why this would ever be necessary.
Databases and tables like usual databases have. But I don't have columns. Each entry can have its own columns, except for the id
That's not a very good reason for putting yourself (and your users) through a great deal of pain and effort.
I could use MySQL with id | serialized data... but then have fun searching on a specific parameter in an entry
So what's wrong with a fully polymorphic model implemented on top of a relational database:
CREATE TABLE relation (
id INTEGER NOT NULL auto_increment,
....
PRIMARY KEY (id)
);
CREATE TABLE col_string (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
val_string VARCHAR(40),
PRIMARY KEY (relation_id, name)
);
CREATE TABLE col_integer (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
val_integer INTEGER,
PRIMARY KEY (relation_id, name)
);
CREATE TABLE col_float (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
val_float FLOAT,
PRIMARY KEY (relation_id, name)
);
... and tables for BLOBs, DATEs, etc
Or if scalability is not a big problem....
CREATE TABLE all_cols (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
ctype ENUM('string','integer','float',...),
val_string VARCHAR(40),
val_integer INTEGER,
val_float FLOAT,
...
PRIMARY KEY (relation_id, name)
);
Yes, inserts and selecting 'rows' is more complicated than for a normal relational table - but a lot simpler than writing your own DBMS from scratch. And you can wrap most of the functionality in stored procedures. The method described would also map easily to a NoSQL db.
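To give an idea of how much extra work "more complicated" means, here is a rough sketch of writing, reading and searching one entry in the single-table variant. Everything beyond the table layout is an assumption: the helper names entry_set, entry_get and entry_search are hypothetical, only three value types are handled, and an in-memory SQLite database accessed through PDO stands in for MySQL (so a CHECK constraint replaces the ENUM).

```php
<?php
// Assumptions: in-memory SQLite via PDO; helper names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE all_cols (
    relation_id INTEGER NOT NULL,
    name        TEXT    NOT NULL,
    ctype       TEXT    NOT NULL CHECK (ctype IN ('string', 'integer', 'float')),
    val_string  TEXT,
    val_integer INTEGER,
    val_float   REAL,
    PRIMARY KEY (relation_id, name)
)");

// Store one "column" of an entry in the value column matching its PHP type.
function entry_set(PDO $db, int $id, string $name, $value): void
{
    $ctype = is_int($value) ? 'integer' : (is_float($value) ? 'float' : 'string');
    $stmt = $db->prepare(
        'INSERT OR REPLACE INTO all_cols
            (relation_id, name, ctype, val_string, val_integer, val_float)
         VALUES (?, ?, ?, ?, ?, ?)'
    );
    $stmt->execute([
        $id, $name, $ctype,
        $ctype === 'string'  ? (string) $value : null,
        $ctype === 'integer' ? $value : null,
        $ctype === 'float'   ? $value : null,
    ]);
}

// Rebuild an entry as a name => value array with PHP types restored.
function entry_get(PDO $db, int $id): array
{
    $stmt = $db->prepare('SELECT * FROM all_cols WHERE relation_id = ?');
    $stmt->execute([$id]);
    $entry = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $raw = $row['val_' . $row['ctype']];
        if ($row['ctype'] === 'integer') {
            $raw = (int) $raw;
        } elseif ($row['ctype'] === 'float') {
            $raw = (float) $raw;
        }
        $entry[$row['name']] = $raw;
    }
    return $entry;
}

// LIKE '%VALUE%' style search on one named string column.
function entry_search(PDO $db, string $name, string $needle): array
{
    $stmt = $db->prepare(
        'SELECT relation_id FROM all_cols
         WHERE name = ? AND val_string LIKE ? ORDER BY relation_id'
    );
    $stmt->execute([$name, '%' . $needle . '%']);
    return array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
}

entry_set($db, 1, 'title', 'red hat');
entry_set($db, 1, 'price', 72.5);
entry_set($db, 1, 'stock', 3);
entry_set($db, 2, 'title', 'blue scarf');
$entry = entry_get($db, 1);
$hits  = entry_search($db, 'title', 'hat');
```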
I'm designing a website using PHP and MySQL currently and as the site proceeds I find myself adding more and more columns to the users table to store various variables.
Which got me thinking, is there a better way to store this information? Just to clarify, the information is global, can be affected by other users so cookies won't work, also I'd lose the information if they clear their cookies.
The second part of my question is, if it does turn out that storing it in a database is the best way, would it be less expensive to have a large number of columns or rather to combine related columns into delimited varchar columns and then explode them in PHP?
Thanks!
In my experience, I'd rather get the database right than start adding comma separated fields holding multiple items. Having to sift through multiple comma separated fields is only going to hurt your program's efficiency and the readability of your code.
Also, if your table is growing too much, then perhaps you need to look into splitting it into multiple tables joined by foreign keys?
I'd create a user_meta table, with three columns: user_id, key, value.
I wouldn't go for the option of grouping columns together and exploding them. It's untidy work and very unmanageable. Instead, maybe try spreading those columns over a few tables and using InnoDB's transaction feature.
If you still dislike the idea of frequently updating the database, and if this method complies with what you're trying to achieve, you can use APC's caching function to store (cache) information "globally" on the server.
MongoDB (and its NoSQL cousins) are great for stuff like this.
The database is a perfectly fine place to store such data, as long as they're variables and not, say, huge image files. The database has all the optimizations and specifications for storing and retrieving large amounts of data. Anything you set up at file system level will always be beaten by what the database already has in terms of speed and functionality.
would it be less expensive to have a large number of columns or rather to combine related columns into delimited varchar columns and then explode them in PHP?
It's not so much a performance question as a maintenance question, IMO: it's not fun to manage hundreds of columns. Storing such data, perhaps as serialized objects, in a TEXT field is a viable option, as long as you are 100% sure you will never have to run any queries on that data.
But why not use a normalized user_variables table like so:
id | user_id | variable_name | variable_value
?
It is a bit more complex to query, but provides for a very clean table structure all round. You can easily add arbitrary user variables that way.
If you are doing a lot of queries like SELECT * FROM users WHERE variable257 = 'green', you may have to stick with specific columns.
The database is definitely the best place to store the data. (I'm assuming you were thinking of storing it in flat files otherwise) You'd definitely get better performance and security from using a DB over storing in files.
With regards to the storing your data in multiple columns or delimiting them... It's a personal choice but you should consider a few things
If you're going to delimit the items, you need to think of what you're going to delimit them with (something that's not likely to crop up within the text you're delimiting)
I often find that it helps to try and visualise whether another programmer of your level would be able to understand what you've done with little help.
Yes, as Pekka said, if you want to perform queries on the stored data you should stick with separate columns
You may also get a slight performance boost from not retrieving and parsing ALL your data every time if you just want a couple of fields of information
I'd suggest going with separate columns, as that offers you much greater flexibility in the future. And there's nothing worse than having to drastically change your data structure and migrate information down the track!
I would recommend setting up a memcached server (see http://memcached.org/). It has proven to be viable with lots of the big sites. PHP has two extensions that integrate a client into your runtime (see http://php.net/manual/en/book.memcached.php).
Give it a try, you won't regret it.
EDIT
Sure, this will only be an option for data that's frequently used and would otherwise have to be loaded from your database again and again. Keep in mind though that you will still have to save your data to some kind of persistent storage.
A document-oriented database might be what you need.
If you want to stick to a relational database, don't take the naïve approach of just creating a table with oh so many fields:
CREATE TABLE SomeEntity (
ENTITY_ID CHAR(10) NOT NULL,
PROPERTY_1 VARCHAR(50),
PROPERTY_2 VARCHAR(50),
PROPERTY_3 VARCHAR(50),
...
PROPERTY_915 VARCHAR(50),
PRIMARY KEY (ENTITY_ID)
);
Instead define an Attribute table:
CREATE TABLE Attribute (
ATTRIBUTE_ID CHAR(10) NOT NULL,
DESCRIPTION VARCHAR(30),
/* optionally */
DEFAULT_VALUE /* whatever type you want */,
/* end_optionally */
PRIMARY KEY (ATTRIBUTE_ID)
);
Then define your SomeEntity table, which only includes the essential attributes (for example, required fields in a registration form):
CREATE TABLE SomeEntity (
ENTITY_ID CHAR(10) NOT NULL,
ESSENTIAL_1 VARCHAR(30),
ESSENTIAL_2 VARCHAR(30),
ESSENTIAL_3 VARCHAR(30),
PRIMARY KEY (ENTITY_ID)
);
And then define a table for those attributes that you might or might not want to store.
CREATE TABLE EntityAttribute (
ATTRIBUTE_ID CHAR(10) NOT NULL,
ENTITY_ID CHAR(10) NOT NULL,
ATTRIBUTE_VALUE /* the same type as Attribute.DEFAULT_VALUE;
if you didn't create that field, then any type */,
PRIMARY KEY (ATTRIBUTE_ID, ENTITY_ID)
);
Evidently, in your case, that SomeEntity is the user.
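Reading an entity's attributes back, with the defaults filled in for anything the entity has not overridden, could look like the sketch below. It is an illustration under assumptions: all values are stored as text, the sample attributes and the helper name entity_attributes are made up, and an in-memory SQLite database accessed through PDO is used so that the code runs on its own.

```php
<?php
// Assumptions: in-memory SQLite via PDO; sample rows and helper name are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE Attribute (
    ATTRIBUTE_ID  TEXT PRIMARY KEY,
    DEFAULT_VALUE TEXT
)');
$db->exec('CREATE TABLE EntityAttribute (
    ATTRIBUTE_ID    TEXT NOT NULL,
    ENTITY_ID       TEXT NOT NULL,
    ATTRIBUTE_VALUE TEXT,
    PRIMARY KEY (ATTRIBUTE_ID, ENTITY_ID)
)');
$db->exec("INSERT INTO Attribute VALUES ('theme', 'light'), ('language', 'en')");
$db->exec("INSERT INTO EntityAttribute VALUES ('theme', 'user42', 'dark')");

// Return every attribute for one entity, using the default where no override exists.
function entity_attributes(PDO $db, string $entityId): array
{
    $stmt = $db->prepare(
        'SELECT a.ATTRIBUTE_ID,
                COALESCE(ea.ATTRIBUTE_VALUE, a.DEFAULT_VALUE) AS value
         FROM Attribute a
         LEFT JOIN EntityAttribute ea
                ON ea.ATTRIBUTE_ID = a.ATTRIBUTE_ID AND ea.ENTITY_ID = ?
         ORDER BY a.ATTRIBUTE_ID'
    );
    $stmt->execute([$entityId]);
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

$forUser42 = entity_attributes($db, 'user42');
$forNewbie = entity_attributes($db, 'someone_new');
```

Because only deviations are stored, changing a default in Attribute immediately applies to every entity that has no override.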
Instead of MySQL you might consider using a triplestore or a key-value store.
That way you get the benefits of having all the multithreading, multi-user, performance and caching voodoo figured out, without the trouble of working out ahead of time what kind of values you really want to store.
Downsides: it's a bit more costly to figure out the average salary of all the people in Idaho who also own hats.
It depends on what kind of user info you are storing. If it's session-pertinent data, use PHP sessions together with session event handlers to store your session data in a single data field in the DB.