I'm just trying out foreign keys for the first time and I'm worried I'm getting a little carried away.
For several of my class variables and their corresponding database records, I've got sets of constants which limit the values that can be chosen. These are currently set and validated using PHP.
What I'm wondering is, should I make tables of these constants in MySQL and lock them into foreign keys? Is this worth the trouble, or are the PHP definitions enough?
For example, say I've got a table transactions, with PHP constants defined for 'credit' and 'debit'.
transactions has a type field which indicates whether the transaction was credit or debit.
Should I create another table (transactions_constants or something) that defines the constants used in transactions(type) and foreign-key them together?
Yes, you should. With FK:
Your database has a more specific interface.
The DB has a clear structure (a DB should be readable regardless of code).
You can avoid accidents with inserts/updates by other DB clients (like PMA).
You can also keep a copy of the constants table as constants in PHP (I use code generation for this).
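A minimal sketch of the lookup-table approach, written in Python with the standard-library sqlite3 module standing in for MySQL. The table name transaction_types, the amount column, and the helper try_insert are assumptions for illustration, not names from the question. The point is only that the foreign key rejects any type that is not in the constants table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE transaction_types (
        type TEXT PRIMARY KEY
    );
    INSERT INTO transaction_types (type) VALUES ('credit'), ('debit');

    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        amount INTEGER NOT NULL,
        type TEXT NOT NULL REFERENCES transaction_types (type)
    );
""")

def try_insert(amount, type_):
    """Return True if the row was accepted, False if the FK rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO transactions (amount, type) VALUES (?, ?)",
                (amount, type_),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(100, "credit"))   # a known type
print(try_insert(100, "credit1"))  # a typo, not in transaction_types
```

In MySQL the same idea needs InnoDB tables and a FOREIGN KEY clause on transactions(type).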
Yes, you could use enum. But read 8 Reasons Why MySQL's ENUM Data Type Is Evil first.
In your case, creating a separate table for the transaction type is probably overkill. You can make your type column an enum to help with limiting the allowed values on the database side.
If transaction types had more data associated with them that you wanted to store in the database then a separate table and a foreign key relationship would be the way to go.
If your set of values changes rarely, you should use the MySQL ENUM datatype and turn on MySQL strict mode. For example:
If your table looks like this:
CREATE TABLE `table` (
id int unsigned primary key auto_increment,
type enum('credit','debit') not null
);
the strict mode will help you when you execute a query like
update `table` set `type` = 'credit1 ' where id = 2
# strict mode on: an error will be raised
# strict mode off: `type` will be '' (empty string)
A query like the one above would never be issued if your code were perfect. It's up to you.
I am working on a marketplace and was wondering what the best way is to handle website settings such as title, URL, whether it uses HTTPS, contact email, version, etc.
I am trying to structure the table so that it is easy to update and so that more settings can be added and fetched over time.
I came up with two structures. The first keeps everything in one row, with the column name as the setting name and the column value as the setting value, and simply reads that first row with mysql_fetch_assoc.
The second uses a new auto-increment row for every setting, which I would turn into an array keyed by setting name after fetching it from the database.
What would be your way of handling this efficiently? Thank you.
A row for each distinct option setting, using name/value pairs one per row, is probably the best way to go. It's more flexible than lots of columns; if you add an option setting you won't have to run any kind of ALTER TABLE operation.
The WordPress wp_options table works this way. See here. http://codex.wordpress.org/Options_API
If you had a "compound" option, you could serialize a php array to store it in a single row of the table.
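A rough sketch of the name/value layout, in Python with sqlite3 standing in for MySQL. The table name options and the helpers set_option and get_option are made up for this example (they only loosely mirror the WordPress idea), and JSON is used here in place of PHP's serialize() for the compound value:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE options (name TEXT PRIMARY KEY, value TEXT NOT NULL)"
)

def set_option(name, value):
    """Insert or overwrite one option; the value is stored as JSON text."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO options (name, value) VALUES (?, ?)",
            (name, json.dumps(value)),
        )

def get_option(name, default=None):
    """Fetch one option and decode it, or return default if it is missing."""
    row = conn.execute(
        "SELECT value FROM options WHERE name = ?", (name,)
    ).fetchone()
    return default if row is None else json.loads(row[0])

# Adding a new setting is just another row, no ALTER TABLE needed.
set_option("site_title", "My Marketplace")
# A "compound" option is serialized into a single row.
set_option("smtp", {"host": "mail.example.com", "port": 587})
```

The MySQL counterpart of INSERT OR REPLACE would be REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.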
First of all, I would consider one more option: a configuration file.
Then you should ask yourself what your project needs.
Config file vs. database:
The big advantage of database options over a config file is scalability. If you have many applications or sites requiring the same configuration, go for the database, as it saves you from copying the same config file several times, with all the versioning and synchronization problems that brings across those different "sites".
Otherwise I would stick to the config file: access is faster for the application, the file is still available during a SQL server outage (when some of the configuration may still be relevant), and the file can be tracked by your version control software. There are security reasons as well: imagine your DB is shared among many applications.
If you do stick with the database, I would recommend one row per setting (one label, one value). I consider it easier to manage records than table structure, especially as your software evolves over time. If other developers join your project, a column-per-setting table structure may quickly become a big mess.
The final argument is security: a good practice is to connect the application as a DB user that has no rights to modify the DB structure, only the rights to read, modify, and delete records.
Both ways work fine. I would say that if you want to administer those settings in an admin panel, one setting per row is better, because you can add new settings on the fly with a simple INSERT query from your admin panel, which is better (more secure) than an ALTER TABLE query.
It depends on the technology you are using. For example, in a PHP Symfony project, settings are mainly stored in flat files (JSON, XML...).
I have worked on many big web applications for clients. A key/value table is commonly used to store simple settings. If you need to store more than one value under a key you have to serialize them, so it's a little bit tricky.
Remember to protect sensitive data such as passwords by hashing them (SHA-256 + salt).
The best way is to create two tables.
A table to store settings as key/value :
CREATE TABLE Settings (
Id INT NOT NULL PRIMARY KEY,
`Key` NVARCHAR(255) NOT NULL, -- KEY is a reserved word; the length is an assumption
Value NVARCHAR(255) NULL,
EnvId INT NOT NULL
);
Then you need a Environment table.
CREATE TABLE Environment (
Id INT NOT NULL PRIMARY KEY,
`Key` NVARCHAR(255) NOT NULL -- the length is an assumption
);
Don't forget the foreign key constraint.
Moreover, you should create these tables in a separate schema, so that you can apply a security policy by restricting access to it.
This lets you work with many environments (dev, test, production, ...); you just need to activate one Environment. For example, you can configure the application not to send email in the development environment but to send it in production.
So you perform a join to get settings for a specified environment. You can add a Boolean to easily switch between environments.
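To make the join concrete, here is a small sketch in Python with sqlite3 standing in for the real database. The is_active Boolean, the send_email setting, and the helpers activate and active_settings are assumptions added for the example; only the two-table shape comes from the schema above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE environment (
        id INTEGER PRIMARY KEY,
        key TEXT NOT NULL UNIQUE,
        is_active INTEGER NOT NULL DEFAULT 0   -- the Boolean switch
    );
    CREATE TABLE settings (
        id INTEGER PRIMARY KEY,
        key TEXT NOT NULL,
        value TEXT,
        env_id INTEGER NOT NULL REFERENCES environment (id),
        UNIQUE (key, env_id)
    );
    INSERT INTO environment (id, key, is_active)
        VALUES (1, 'dev', 1), (2, 'prod', 0);
    INSERT INTO settings (key, value, env_id) VALUES
        ('send_email', '0', 1),
        ('send_email', '1', 2);
""")

def activate(env_key):
    """Mark exactly one environment as active."""
    with conn:
        conn.execute(
            "UPDATE environment SET is_active = (key = ?)", (env_key,)
        )

def active_settings():
    """Return the settings of the active environment as a dict."""
    rows = conn.execute("""
        SELECT s.key, s.value
        FROM settings AS s
        JOIN environment AS e ON e.id = s.env_id
        WHERE e.is_active = 1
    """)
    return dict(rows.fetchall())

print(active_settings())  # settings for 'dev', the initially active environment
```

Switching environments is then a single call such as activate("prod"), with no change to the settings rows themselves.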
If you use a file (which doesn't need a DB connection) you can have something like this (YAML):
Env:
  Dev:
    Email: ~
  Prod:
    Email: contact#superwebsite.com
First of all, it depends entirely on your business requirements, but the best general approach is to create a settings table. The schema is below.
CREATE TABLE `settings` (
`id` int(11) NOT NULL,
`name` varchar(255) NOT NULL,
`value` text NOT NULL,
`type` enum('general','advanced') NOT NULL DEFAULT 'general'
);
Example
site_title = "example"
site_logo = "something.jpg"
site_url = "https://www.example.com"
address = "#795 Folsom Ave, Suite 600 San Francisco"
email = "something#example.com"
mobile = "9898xxxxxx"
This is the best approach because you never know when a new key will be introduced.
Here name is the key and value is its value.
The value column's data type should be TEXT to allow for long descriptions.
I have added one more column named type, with data type ENUM, to distinguish categories of settings. You can customize type as your business logic requires.
If we use the "DESCRIBE table" syntax in MySQL it returns information about the table including the fields and their default value.
http://dev.mysql.com/doc/refman/5.0/en/describe.html
However, how do we tell the difference between a field having a default value of an empty string versus not having a default value at all?
It seems that in both cases it returns an empty value for the "Default" column in the output of the DESCRIBE table statement.
I would need to be able to parse the data using PHP to easily detect differences between an old table format and new table format.
Try the following:
Show create table tablename
If you need easily queryable schema information available to an application, I would suggest using the MySQL INFORMATION_SCHEMA database. It provides queryable metadata tables that should meet your needs. In your case, you are probably interested in the COLUMNS table. You might query it like:
SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE `TABLE_NAME` = 'your_table' AND `TABLE_SCHEMA` = 'your_database'
Of course, you need to consider limiting the access privileges of the database user associated with your application, as you may not want it to see the entire INFORMATION_SCHEMA if there are other applications running (this user would be able to see information in the COLUMNS table about any other databases it has access to).
It seems that MySQL reports the default as "NULL" if no default value exists. This means that if the field is declared "NOT NULL", we can assume a reported default of NULL means that no explicit default was set, so the value is either empty or the implicit MySQL default for that field type.
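The rule above can be written down as a small classifier. This is only a sketch in Python (the same logic ports directly to PHP), and it assumes, as observed above, that the COLUMN_DEFAULT column of INFORMATION_SCHEMA.COLUMNS comes back as SQL NULL when no default exists and as an empty string when the default is ''. The function name describe_default is made up for the example:

```python
def describe_default(column_default, is_nullable):
    """Classify one row of INFORMATION_SCHEMA.COLUMNS.

    column_default: the COLUMN_DEFAULT value (None when the driver
                    returns SQL NULL)
    is_nullable:    the IS_NULLABLE value, 'YES' or 'NO'
    """
    if column_default is None:
        if is_nullable == "NO":
            # A NOT NULL column cannot have DEFAULT NULL, so nothing was set.
            return "no explicit default"
        # For a nullable column the two cases look the same.
        return "default NULL or no explicit default"
    if column_default == "":
        return "empty-string default"
    return "default " + repr(column_default)

print(describe_default(None, "NO"))
print(describe_default("", "NO"))
```

Comparing these labels for each column of the old and new table formats gives a simple way to spot differences in defaults.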
While designing a MySQL database for a dating website, I have started to wonder how to store reference data. Currently the database has 33 tables, and there are nearly 32 different fields that need to be referenced. We also have to consider that many of these elements need to be translated.
After reading several opinions, I have almost ruled out using enum like:
CREATE TABLE profile (
`user_id` INT NOT NULL,
...
`relationship_status` ENUM('Single','Married') NOT NULL,
...
);
And normally I would be using a reference table like:
CREATE TABLE profile (
`user_id` INT NOT NULL,
...
`relationship_status_id` INT NOT NULL,
...
);
CREATE TABLE relationship_status (
`id` INT NOT NULL,
`name` VARCHAR(45) NOT NULL,
PRIMARY KEY (`id`)
);
But it might be overkill to create 32 tables, so I'm considering coding it in PHP like this:
class RelationshipStatusLookUp{
const SINGLE = 1;
const MARRIED = 2;
public static function getLabel($status){
if($status == self::SINGLE)
return 'Single';
if($status == self::MARRIED)
return 'Married';
return false;
}
}
What do you think? I guess it could improve query performance and also make development of the whole site easier.
Thanks.
Definitely a good idea to steer clear of ENUM IMHO: why ENUM is evil. Technically a lookup table would be the preferred solution, although for simple values a PHP class would work. You do need to be careful of this for the same reasons as ENUM; if the values in your set grow it could become difficult to maintain. (What about "co-habiting", "divorced", "civil partnership", "widowed", etc.?) It is also not trivial to query for lists of values using PHP classes; it's possible using reflection but not as easy as a simple MySQL SELECT. This is probably one of those cases where I wouldn't worry about performance until it becomes a problem. Use the best solution for your code/application first, then optimise if you need to.
enum fields present some issues:
Once they're set, they can't easily be changed
`relationship_status` ENUM('Single','Married') NOT NULL,
would need 'Civil Partnership' adding in this country nowadays
You can't easily create a dropdown list of options from the enum lists
However, data on the database can be subjected to referential integrity constraints, so using a foreign key link against a reference table gives you that degree of validation without the constraints of an enum.
Maintaining the options in a class requires a code change for any new options that have to be added to the data, which may increase the work involved depending on your release procedures, and doesn't prevent bad data being inserted into the database.
Personally, I'd go for a reference table
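As a rough illustration of the reference-table option, here is a sketch in Python with sqlite3 standing in for MySQL. The helper dropdown_options is a made-up name; the tables follow the ones in the question. It shows the two things an enum or a PHP class makes awkward: listing the options with a plain SELECT, and adding an option as data rather than as a schema or code change:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE relationship_status (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE profile (
        user_id INTEGER PRIMARY KEY,
        relationship_status_id INTEGER NOT NULL
            REFERENCES relationship_status (id)
    );
    INSERT INTO relationship_status (id, name)
        VALUES (1, 'Single'), (2, 'Married');
""")

def dropdown_options():
    """Return (id, name) pairs, ready to render as <option> elements."""
    return conn.execute(
        "SELECT id, name FROM relationship_status ORDER BY id"
    ).fetchall()

# A new option is an INSERT, not an ALTER TABLE or a code release.
with conn:
    conn.execute(
        "INSERT INTO relationship_status (id, name) "
        "VALUES (3, 'Civil Partnership')"
    )

for option_id, name in dropdown_options():
    print(option_id, name)
```

A profile row pointing at an id that is not in relationship_status is rejected by the foreign key, which is the validation the PHP class alone cannot give you.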
First off, you wouldn't need id and relationship_status_id in the Relationship_status table.
Personally, I would use an enum unless you need to associate more data than just the name of the person's relationship status (or if you foresee needing to expand on this in the future). It will be much easier when you're looking at the database to see what's what if it is in an easily readable language versus having to query against a second table.
When you are considering performance, sure it's faster to query a table by a unique ID but you have to track that relationship and you will always be joining multiple tables to get the same data. If the enum solution ends up being slower, I don't think it will be enough that the human brain will be able to perceive the difference even with large data sets.
I decided back when I was coding to have different tables for each type of content. Now I am stuck solving this. Basically my notification system ranks the newest content by its timestamp currently. This is inaccurate however because there is a small chance that someone would submit content at the same time as another person, and incorrect ranking would occur.
Now if I had all my content in a single table, I would simply rank it by an auto-incrementing variable. Is there a way to implement this auto-increment integer across multiple tables (e.g. when something is inserted into table1, id=0; when something is inserted into table2, id=1)? Or do I have to recode everything into a single table?
NOTE:
The reason I have content in multiple tables is that it's organized and it would reduce load stress. I don't really care about the organization anymore, because I can just access the data through a GUI I coded; I'm just wondering about the load stress.
EDIT:
I'm using PHP 5 with MySQL.
Your question, particularly the need for an ID spanning multiple tables, clearly signals that your database design needs to change. You should make one table for all content types (as a generalization), with an auto-incrementing ID. Then, for each particular content type, you can define another table (the equivalent of inheritance in OOP) with extra fields and a foreign key pointing to the base table.
In other words, you need something like inheritance in SQL.
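A minimal sketch of that "inheritance" layout, in Python with sqlite3 standing in for MySQL. The table names content, posts and photos and the helper functions are assumptions for illustration. Every item gets its id from the base table, so ordering by that id works across all content types:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE content (               -- base table, one row per item
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        content_type TEXT NOT NULL
    );
    CREATE TABLE posts (                 -- extra fields for one content type
        content_id INTEGER PRIMARY KEY REFERENCES content (id),
        body TEXT NOT NULL
    );
    CREATE TABLE photos (
        content_id INTEGER PRIMARY KEY REFERENCES content (id),
        url TEXT NOT NULL
    );
""")

def add_post(body):
    """Insert the base row first, then the post row with the same id."""
    with conn:  # both inserts commit together or not at all
        content_id = conn.execute(
            "INSERT INTO content (content_type) VALUES ('post')"
        ).lastrowid
        conn.execute(
            "INSERT INTO posts (content_id, body) VALUES (?, ?)",
            (content_id, body),
        )
    return content_id

def add_photo(url):
    """Same pattern for the photo subtype."""
    with conn:
        content_id = conn.execute(
            "INSERT INTO content (content_type) VALUES ('photo')"
        ).lastrowid
        conn.execute(
            "INSERT INTO photos (content_id, url) VALUES (?, ?)",
            (content_id, url),
        )
    return content_id

def newest(limit=10):
    """Newest items across all content types, ranked by the shared id."""
    return conn.execute(
        "SELECT id, content_type FROM content ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

The notification ranking then reads from content alone and joins to the subtype tables only when it needs the type-specific fields.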
You can create a table with auto increment id just to keep track of ids. Your program would do an insert on that table, get the id, use it as necessary.
Something along the lines of:
function getNextId() {
$res = mysql_query("INSERT INTO seq_table(id) VALUES (NULL)");
$id = mysql_insert_id();
if ($id % 10 == 0) {
mysql_query("DELETE FROM seq_table");
}
return $id;
}
Where seq_table is a table that you've to create just to get the ids. Make it a function so it can be used whenever you need. Every 10 ids generated I delete all generated ids, anyway you don't need them there. I don't delete every time since it would slow down. If another insert happen in the meantime and I delete 11 or more records, it doesn't affect the behaviour of this procedure. It's safe for the purpose it has to reach.
Even if the table is empty new ids will just keep on growing since you've declared id as auto-increment.
UPDATE: I want to clarify why the ID generation is not wrapped in a transaction and why it shouldn't be.
If you generate an auto id and you rollback the transaction, the next auto id, will be incremented anyway. Excerpt from a MySQL bug report:
[...] this is not a bug but expected behavior that happens in every RDBMS we know. Generated values are not a part of transaction and they don't care about other statements.
Getting the ID with this procedure is perfectly thread safe. Your logic after the ID is obtained should be wrapped in a transaction, especially if you deal with multiple tables.
Getting a sequence in this way isn't a new concept, for instance, the code of metabase_mysql.php which is a stable DB access library has a method called GetSequenceNextValue() which is quite similar.
In a single table, you could have a field for the content type and clustered index that includes the content type field. This effectively keeps all of one content type in one place on the disc, and another content type in another place, etc. (It's actually organised into pages, but this physical organisation is still true.)
Assuming that each content type has the same fields, this would likely meet your needs and behave similarly to multiple tables. In some cases you may even find that, with appropriate indexes, a single-table solution is faster, more convenient, and more maintainable, for example when you need to create globally unique identifiers across all content types.
If you're unable to merge these back into a single table, you could create a central link table...
CREATE TABLE content_link (
id INT IDENTITY(1,1), -- MS SQL SERVER syntax
content_type INT,
content_id INT -- The id from the real table
)
As you insert into the content tables, also insert into the link table to create your globally unique id.
More simply, but even more manually, just hold a single value somewhere in the database. Whenever you need a new id, use that centrally stored value and increment it by one. Be sure to wrap the increment and collection in a single transaction to stop race conditions. (This can be done in a number of ways, depending on your flavor of SQL.)
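A sketch of that centrally stored value, in Python with sqlite3 standing in for MySQL. The table id_counter and the function next_id are names invented for the example. The increment and the read happen inside one transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE id_counter (
        name TEXT PRIMARY KEY,
        value INTEGER NOT NULL
    );
    INSERT INTO id_counter (name, value) VALUES ('content', 0);
""")

def next_id(name="content"):
    """Increment the stored counter and return the new value."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "UPDATE id_counter SET value = value + 1 WHERE name = ?",
            (name,),
        )
        (value,) = conn.execute(
            "SELECT value FROM id_counter WHERE name = ?", (name,)
        ).fetchone()
    return value
```

The UPDATE takes the write lock before the SELECT runs, so two callers cannot read the same value. In MySQL the equivalent would rely on InnoDB row locking inside START TRANSACTION ... COMMIT.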
EDIT
A couple of MySQL example lines of code from the web...
START TRANSACTION;
INSERT INTO foo (auto,text)
VALUES(NULL,'text'); # generate ID by inserting NULL
INSERT INTO foo2 (id,text)
VALUES(LAST_INSERT_ID(),'text'); # use ID in second table
COMMIT;
Personally, I'd actually store the value in a variable, commit the transaction, and then continue with my business logic. This would keep the locks on the tables to a minimum.
You could have a separate ID table, insert into that, and use the newly-inserted ID.
e.g.
CREATE TABLE ids (id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, timeadded DATETIME);
In the script:
<?php
$r = mysql_query('INSERT INTO ids (timeadded) VALUES (NOW())');
$id = mysql_insert_id();
mysql_query("INSERT INTO someOtherTable (id, data) VALUES ('$id', '$data')"); // $data must already be escaped
Add error checking etc. to taste.
The MySQL manual states:
The ID that was generated is maintained in the server on a
per-connection basis. This means that the value returned by the
function to a given client is the first AUTO_INCREMENT value generated
for most recent statement affecting an AUTO_INCREMENT column by that
client. This value cannot be affected by other clients, even if they
generate AUTO_INCREMENT values of their own. This behavior ensures
that each client can retrieve its own ID without concern for the
activity of other clients, and without the need for locks or
transactions.
(Source) So I don't think concerns about ACID compliance are a problem.
I just came across the idea of writing a special database which will fit for exactly one purpose. I have looked into several other database-systems and came to the conclusion that I need a custom type. However my question is not about if it is a good idea, but how to implement this best.
The application itself is written in php and needs to write to a custom database system.
Because there can be simultaneous read/write operations I can forget the idea of implementing the database directly into my application. (correct me please if I'm wrong).
That means I have to create 2 scripts:
The database-server-script
The application.
This means that the application has to communicate with the server. My idea was to use PHP in CLI mode for the database server. The question is whether this is efficient, or whether I should look into a language like C++ to develop the server application. The second question is about the communication. When using PHP in CLI mode I thought about passing a serialized array query as a parameter. When using C++, should I still serialize it, or use JSON or something else?
I should note that a database to search through can consist of several thousand entries, so I don't know whether PHP is really the right choice.
I should also note that queries aren't strings which have to be parsed, but arrays giving a key/value filter or dataset. The only more complex thing the database server has to be able to do is compare strings like MySQL's LIKE '%VALUE%', which could be slow with several thousand entries.
Thanks for the Help.
writing a special database which will fit for exactly one purpose
I presume you mean a custom database management system.
I'm having a lot of trouble understanding why this would ever be necessary.
Databases and tables like usual databases have. But I don't have columns. Each entry can have its own columns, except for the id
That's not a very good reason for putting yourself (and your users) through a great deal of pain and effort.
I could use MySQL id | serialized data... but then much fun searching over a specific parameter in an entry
So what's wrong with a fully polymorphic model implemented on top of a relational database:
CREATE TABLE relation (
id INTEGER NOT NULL auto_increment,
....
PRIMARY KEY (id)
);
CREATE TABLE col_string (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
val_string VARCHAR(40),
PRIMARY KEY (relation_id, name)
);
CREATE TABLE col_integer (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
val_integer INTEGER,
PRIMARY KEY (relation_id, name)
);
CREATE TABLE col_float (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
val_float FLOAT,
PRIMARY KEY (relation_id, name)
);
... and tables for BLOBs, DATEs, etc
Or if scalability is not a big problem....
CREATE TABLE all_cols (
relation_id INTEGER NOT NULL, /* references relation.id */
name VARCHAR(20),
ctype ENUM('string','integer','float',...),
val_string VARCHAR(40),
val_integer INTEGER,
val_float FLOAT,
...
PRIMARY KEY (relation_id, name)
);
Yes, inserts and selecting 'rows' is more complicated than for a normal relational table - but a lot simpler than writing your own DBMS from scratch. And you can wrap most of the functionality in stored procedures. The method described would also map easily to a NoSQL db.
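To give a feel for how inserting and selecting "rows" works on the single all_cols variant, here is a sketch in Python with sqlite3 standing in for MySQL. The helpers insert_entry, load_entry and find_by_substring are invented for the example, a CHECK constraint replaces the ENUM, and only the three value types shown are handled:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE relation (
        id INTEGER PRIMARY KEY AUTOINCREMENT
    );
    CREATE TABLE all_cols (
        relation_id INTEGER NOT NULL REFERENCES relation (id),
        name TEXT NOT NULL,
        ctype TEXT NOT NULL CHECK (ctype IN ('string', 'integer', 'float')),
        val_string TEXT,
        val_integer INTEGER,
        val_float REAL,
        PRIMARY KEY (relation_id, name)
    );
""")

# Python type -> (ctype, value column); other types raise KeyError.
COLUMN_FOR_TYPE = {
    str: ("string", "val_string"),
    int: ("integer", "val_integer"),
    float: ("float", "val_float"),
}

def insert_entry(fields):
    """Store one entry whose 'columns' are the keys of the dict."""
    with conn:
        relation_id = conn.execute(
            "INSERT INTO relation DEFAULT VALUES"
        ).lastrowid
        for name, value in fields.items():
            ctype, column = COLUMN_FOR_TYPE[type(value)]
            # column comes from the fixed mapping above, never from user input
            conn.execute(
                f"INSERT INTO all_cols (relation_id, name, ctype, {column}) "
                "VALUES (?, ?, ?, ?)",
                (relation_id, name, ctype, value),
            )
    return relation_id

def load_entry(relation_id):
    """Reassemble one entry into a dict."""
    rows = conn.execute(
        "SELECT name, ctype, val_string, val_integer, val_float "
        "FROM all_cols WHERE relation_id = ?",
        (relation_id,),
    )
    pick = {"string": 2, "integer": 3, "float": 4}
    return {row[0]: row[pick[row[1]]] for row in rows}

def find_by_substring(name, needle):
    """Rough equivalent of LIKE '%needle%' on one named field."""
    rows = conn.execute(
        "SELECT relation_id FROM all_cols "
        "WHERE name = ? AND ctype = 'string' AND val_string LIKE ?",
        (name, f"%{needle}%"),
    )
    return [row[0] for row in rows]
```

Searching on a specific parameter stays an ordinary indexed query on (name, value), which is the part that a serialized blob column cannot offer.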