MySQL datatypes? - php

I'm designing a database in MySQL and PHP for a basic CMS. The CMS will have a front end which will allow sorting and searching. The backend will allow authorized users to upload files.
I'm using PHPMyAdmin and I'd like help setting up my database.
I am also open to answers explaining the various MySQL datatypes and what they are good for. Please use common database fields as examples.
Below is a list of what I'd like. What's missing and what datatypes do I need to use?
Resources (For my files)
file_id
filename (Files are presorted and display names and paths are derived
from here.)
file_type (PDF | AUDIO | VIDEO | PHOTO [Also used to generate file
urls])
upload date (timestamp in PHP or MySQL)
uploaded_by (User ID from Users table)
event (event_id from Events table, optional)
Users (User accounts - for admin access and maybe a notification list)
user_id
first_name
last_name
email
password
phone_number (optional)
permissions_level (read only, upload)
creation_date
Events
event_id
event_name
event_location
event_date
event_description
entry_date

What I would take:
Resources (For my files)
file_id INT (or SMALLINT depending on the number of expected entries)
filename VARCHAR (or text if longer than 255 chars)
file_type ENUM (if only those you mentioned or VARCHAR if dynamic types can be added)
upload_date DATETIME (or DATE if you don't need the time)
uploaded_by INT (or SMALLINT but the same as in the user table)
event INT (or SMALLINT but the same as in the event table)
Users (User accounts - for admin access and maybe a notification list)
user_id INT (or SMALLINT depending on the number of expected entries)
first_name VARCHAR
last_name VARCHAR
email VARCHAR
password CHAR(40) (for a SHA1 hash)
phone_number VARCHAR (as it might contain something like -, / or +)
permissions_level TINYINT (if only number values and at most 127 values)
creation_date DATETIME (or DATE if you don't need the time)
Events
event_id INT (or SMALLINT depending on the number of expected entries)
event_name VARCHAR
event_location VARCHAR
event_date DATETIME (or DATE if you don't need the time)
event_description TEXT (as the 255 characters of a VARCHAR might be too short)
entry_date DATETIME (or DATE if you don't need the time)
Once you have set up your database and entered some dummy data, you can run a simple statement through phpMyAdmin that tells you which types MySQL would choose for that exact data:
SELECT * FROM events PROCEDURE ANALYSE()
The Optimal_fieldtype column shows what MySQL suggests, but you should not adopt that exact field type blindly. It will very often suggest an ENUM, yet most of the time the column will hold arbitrary data, so you need a VARCHAR. In those cases the Max_length column gives you a hint about how long it should be. On all VARCHAR fields you should add additional space depending on how long you expect the values to be. Take into consideration that even a name can be longer than 50 characters. Note that PROCEDURE ANALYSE() was deprecated in MySQL 5.7 and removed in MySQL 8.0, so this only works on older servers.

Your user table doesn't have a column for the password hash. Not sure if you intended for it to have one or not. I can't see how we can answer which datatypes should be used, since it completely depends on how you plan on using the columns. For dates, I prefer DATETIME over TIMESTAMP, but that's just a personal preference, as I like to insert the dates manually in the queries.

What is missing, well that is sort of for you to decide / what is required. You should read up on Designing databases.
As far as the datatypes for what you have
The id fields should be a INT, or BIGINT (depends on how big your application may become) and set as the PRIMARY KEY.
The names should be VARCHARs; the length depends on your requirements. Most first / last names are generally 25-30 characters at most. Event names could run up to 250, depending on your requirements.
The location will be similar to the name as a VARCHAR somewhere around 50-150, depending on your requirements.
The date should be a DATETIME field.
The description should be either a VARCHAR(250) or a TEXT field.
The permissions really depend on how you want to handle them. This could be an INT or a VARCHAR (in case you want to serialize an array).
Phone number could be an integer type (if you want to strip all non-numeric characters and format it your own way; note that a 10-digit number can overflow INT, so BIGINT is safer, and leading zeros are lost) or a VARCHAR(15) if you do not want to strip the characters.
Email should be a VARCHAR(250).
Hopefully this helps, again it really depends on your requirements for the application and what you have envisioned. But the initial types can always be changed as your requirements change.
EDIT
And if you want to know full information about the different MySQL Data Types, read the manual: http://dev.mysql.com/doc/refman/5.0/en/data-types.html

Related

How to convert a string to a unique number?

I have a table like this:
// viewed
+----+------------------+
| id | username_or_ip |
+----+------------------+
As you can see, the username_or_ip column holds either a username or an IP. Its type is INT(11) UNSIGNED. I store an IP like this:
INSERT INTO viewed (username_or_ip) VALUES (INET_ATON('192.168.0.1'));
-- It will be saved like this: username_or_ip = 3232235521
Well, I want to know, is there any approach for converting a string like Sajad to a unique number? (because as I said, username_or_ip just accepts digit values)
int(11) is a 32-bit data type. As such it's just enough to hold an ipv4 address. Your question points that out.
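To make the 32-bit point concrete, here is a small sketch in Python (used here only as a stand-in for MySQL's INET_ATON and INET_NTOA) showing that an IPv4 address is exactly four bytes packed into one integer. The helper names ip_to_int and int_to_ip are made up for this illustration.

```python
import ipaddress

def ip_to_int(ip: str) -> int:
    """Pack a dotted-quad IPv4 address into one integer (like INET_ATON)."""
    return int(ipaddress.IPv4Address(ip))

def int_to_ip(number: int) -> str:
    """Unpack the integer back into dotted-quad form (like INET_NTOA)."""
    return str(ipaddress.IPv4Address(number))

packed = ip_to_int("192.168.0.1")
print(packed)             # 3232235521, i.e. 192*2**24 + 168*2**16 + 0*2**8 + 1
print(int_to_ip(packed))  # 192.168.0.1

# The largest IPv4 address uses all 32 bits, so nothing is left over
# for encoding arbitrary strings in the same column.
print(ip_to_int("255.255.255.255") == 2**32 - 1)  # True
```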
Reversibly converting an arbitrary string to a 32-bit data type is difficult: it simply lacks the information storage capacity.
You could use a lookup table for the purpose. Many languages, including PHP 5.4+, support that using a process called "interning." https://en.wikipedia.org/wiki/String_interning
Or you could build yourself a lookup table in a MySQL table. Its columns would be an id column and a value column. You'd intern each new text string by creating a row for it with a unique id value, then use that value.
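A minimal sketch of such a lookup table follows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own; the table name interned and the helper functions are hypothetical. In MySQL the same idea would use an AUTO_INCREMENT id, a UNIQUE index on the value column, and INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interned ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " value TEXT NOT NULL UNIQUE)"  # UNIQUE guarantees one id per string
)

def intern_string(conn, value):
    """Return the id for value, creating a row the first time it is seen."""
    conn.execute("INSERT OR IGNORE INTO interned (value) VALUES (?)", (value,))
    row = conn.execute(
        "SELECT id FROM interned WHERE value = ?", (value,)
    ).fetchone()
    return row[0]

def lookup_string(conn, string_id):
    """Reverse direction: recover the original string from its id."""
    row = conn.execute(
        "SELECT value FROM interned WHERE id = ?", (string_id,)
    ).fetchone()
    return row[0] if row else None

first = intern_string(conn, "Sajad")
again = intern_string(conn, "Sajad")   # same string, same id
other = intern_string(conn, "Another user")
print(first == again, first != other)  # True True
print(lookup_string(conn, first))      # Sajad
```

The id is only unique within this table, so the numeric column in the question would still need some way to tell an interned username from an IP address.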
Your intuition about the slowness of looking up varchar(255) or similar values in MySQL is reasonable. But, with respect, it is not correct. Properly indexed, tables with that kind of data in them are very fast to search.

What are the options, with +ves and -ves, for storage and retrieval of 400-500k fields of user data?

Context
I'm implementing a website to help people learn a foreign language.
I'm working in PHP and PDO. My database backend is MySQL. (For those who are interested, the front end is all done in HTML5, CSS and Javascript.)
The essence of this question is how best to plan and structure the backend of a web app that requires storing lots of individual items of data for many users.
What I Have Already, and What I Want to Do
I have four database tables:
1. Contains every word of a corpus of texts in the language, with lemma and morphological tagging. (350,000+ rows)
2. Contains a dictionary of words, with lemma numbers that match table 1. (6-7,000 rows)
3. Contains a list of grammar morphemes that need to be learnt. (500-1,000 rows)
4. Contains the list of users.
I want users to have a score for how well they know every word in the corpus. For each word:
Score for recognition of lexeme meaning.
x3 different scores for different aspects of grammar parsing relevant to this
particular language.
I also want users to have a score for how well they know the different grammar morphemes. In other words, for each user, I want to store and retrieve up to 400-500k fields.
What I Would Like to Know
I'm pretty sure that I can't store all this data for each user in a database table, because the number of columns required far exceeds the maximum allowed in SQL (from my research: 1k, or maybe 4k on some systems).
At present, the only options I know about are storing the data in an xml file for each user, or in a csv file for each user.
What are my options? What are the +ves and -ves of these options? Thanks for your time and help.
I strongly recommend using (a) join table(s):
Word ID
User ID
Lexeme Score
x3 grammar Score
With a PK of (UserID, WordID) (and maybe a secondary key on WordID) you get a table, that has a max of 350k*Usercount rows, accessed only (or mostly) via PK, with close-to-perfect index locality, which seems quite manageable.
Edit
Assuming the word and user tables each have an integer PK called id and the score is a non-negative int, to create your join table you would need
CREATE TABLE scores (
wordID INT NOT NULL,
userID INT NOT NULL,
lexscore INT UNSIGNED DEFAULT NULL,
gramscoreA INT UNSIGNED DEFAULT NULL,
gramscoreB INT UNSIGNED DEFAULT NULL,
gramscoreC INT UNSIGNED DEFAULT NULL,
PRIMARY KEY(userID, wordID)
)
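As a rough sketch of how such a join table might be used, the following runs against Python's built-in sqlite3 module as a stand-in for MySQL (so the column types are simplified to INTEGER). The helpers set_lexscore and scores_for_user are hypothetical names. A row is only created once a user has a score for a word, so the table stays sparse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE scores (
        wordID INTEGER NOT NULL,
        userID INTEGER NOT NULL,
        lexscore INTEGER DEFAULT NULL,
        gramscoreA INTEGER DEFAULT NULL,
        gramscoreB INTEGER DEFAULT NULL,
        gramscoreC INTEGER DEFAULT NULL,
        PRIMARY KEY (userID, wordID)
    )
""")

def set_lexscore(conn, user_id, word_id, score):
    """Update the row for (user, word) if it exists, otherwise insert it."""
    cur = conn.execute(
        "UPDATE scores SET lexscore = ? WHERE userID = ? AND wordID = ?",
        (score, user_id, word_id),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO scores (userID, wordID, lexscore) VALUES (?, ?, ?)",
            (user_id, word_id, score),
        )

def scores_for_user(conn, user_id):
    """All lexeme scores for one user, served by the (userID, wordID) key."""
    return conn.execute(
        "SELECT wordID, lexscore FROM scores WHERE userID = ? ORDER BY wordID",
        (user_id,),
    ).fetchall()

set_lexscore(conn, 1, 10, 3)
set_lexscore(conn, 1, 10, 5)   # second call overwrites the first score
set_lexscore(conn, 1, 11, 2)
set_lexscore(conn, 2, 10, 9)   # a different user, same word
print(scores_for_user(conn, 1))  # [(10, 5), (11, 2)]
```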

MySQL Database I18N, a JSON approach?

UPDATE: I've come across this question I did after some years: now I know this is a very bad approach. Please don't use this. You can always use additional tables for i18n (for example products and products_lang), with separate entries for every locale: better for indexes, better for search, etc.
I'm trying to implement i18n in a MySQL/PHP site.
I've read answers stating that "i18n is not part of database normally", which I think is a somewhat narrow-minded approach.
What about product names or, as in my case, a menu structure and contents stored in the db?
I would like to know what you think of my approach, taking into account that the languages should be extensible, so I'm trying to avoid the "one column for each language" solution.
One solution would be to use a reference (id) for the string to translate and for every translatable column have a table with primary key, string id, language id and translation.
Another solution I thought was to use JSON. So a menu entry in my db would look like:
idmenu label
------ -------------------------------------------
5 {"en":"Homepage", "it":"pagina principale"}
What do you think of this approach?
"One solution would be to use a reference (id) for the string to translate and for every translatable column have a table with primary key, string id, language id and translation."
I implemented it once. I took the existing database schema, looked for all tables with translatable text columns, and for each such table created a separate table containing only those text columns, plus a language id and an id to tie it to the "data" row in the original table. So if I had:
create table product (
id int not null primary key
, sku varchar(12) not null
, price decimal(8,2) not null
, name varchar(64) not null
, description text
)
I would create:
create table product_text (
product_id int not null
, language_id int not null
, name varchar(64) not null
, description text
, primary key (product_id, language_id)
, foreign key (product_id) references product(id)
, foreign key (language_id) references language(id)
)
And I would query like so:
SELECT product.id
, COALESCE(product_text.name, product.name) name
, COALESCE(product_text.description, product.description) description
FROM product
LEFT JOIN product_text
ON product.id = product_text.product_id
AND 10 = product_text.language_id
(10 would happen to be the language id which you're interested in right now.)
As you can see the original table retains the text columns - these serve as default in case no translation is available for the current language.
So no need to create a separate table for each text column, just one table for all text columns (per original table)
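The fallback behaviour of that query can be tried out with a small sketch. It uses Python's built-in sqlite3 module in place of MySQL, trims the product table down to the text columns, and invents two sample rows plus the helper name localized_products purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        description TEXT
    );
    CREATE TABLE product_text (
        product_id INTEGER NOT NULL,
        language_id INTEGER NOT NULL,
        name TEXT NOT NULL,
        description TEXT,
        PRIMARY KEY (product_id, language_id)
    );
    INSERT INTO product VALUES (1, 'Homepage', 'Start page');
    INSERT INTO product VALUES (2, 'Contact', 'Get in touch');
    -- Only product 1 is translated, and only its name.
    INSERT INTO product_text VALUES (1, 10, 'Pagina principale', NULL);
""")

def localized_products(conn, language_id):
    """Translated text where available, default-language text otherwise."""
    return conn.execute(
        """
        SELECT product.id,
               COALESCE(product_text.name, product.name) AS name,
               COALESCE(product_text.description, product.description)
                   AS description
        FROM product
        LEFT JOIN product_text
               ON product.id = product_text.product_id
              AND product_text.language_id = ?
        ORDER BY product.id
        """,
        (language_id,),
    ).fetchall()

for row in localized_products(conn, 10):
    print(row)
# (1, 'Pagina principale', 'Start page')
# (2, 'Contact', 'Get in touch')
```

Because the language filter sits in the ON clause rather than in WHERE, untranslated products still appear in the result with their default text.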
Like others pointed out, the JSON idea has the problem that it will be pretty impossible to query it, which in turn means being unable to extract only the translation you need at a particular time.
This is not extensibility. You lose all the advantages of using a relational database. With an approach like yours you might as well use serialize(), which decodes faster, or even store the data in files. There is no particular point in using SQL with such structures.
I see no problem with using a column for each language. That is even easier when programming a CMS. A relational database is not only for storing data. It is for working rationally with data (e.g. using powerful built-in mechanisms) and for controlling the structure and integrity of the data.
First thought: this would obviously break exact searching in SQL, e.g. WHERE label='Homepage'.
Second: while searching, a user could see unwanted results (e.g. when their query matches a string in another language).
I would recommend keeping a single primary language in the database and using an extra sub-system to maintain the translations. This is the standard approach for web applications like Drupal. Most likely, in the domain of your software/application there will be a single translation for each primary language string, so you don't have to worry about context or ambiguity. (In fact, for the best user experience you should strive to have unique labels for unique functionality anyway.)
If you want to roll your own table, you could have something like:
create table translations (
id int not null primary key
, source varchar(255) not null -- the text in the primary language
, lang varchar(5) not null -- the language of the translation
, translation varchar(255) not null -- the text of the translation
)
You probably want more than 2 characters for the language, since you'll likely want en_US, en_GB, etc.

How to allow a column variable to be changed only once with PHP/MySQL?

I have a table in my MySQL DB with information from each user that they submitted during registration.
I would now like to allow users to change one of those columns (Column X), but only once.
What is the best way to do that?
The only way I can think of is to add an additional column (Column Z) to the table with a binary value that defaults to 0, and changes to 1 when Column X is updated by the user. If Column Z is 0, the site allows the change, otherwise, it does not allow it.
Is there a better way? Should Column Z be in the same table as Column X? Any other relevant points/issues I should consider?
Thanks.
You could give column X a default value that is set when the row is inserted into the table. Then, when the user wants to update their row, check this value: if it has not changed since insertion, the user is allowed to update; otherwise the update is rejected.
CREATE TABLE example_table (
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
data VARCHAR(100),
value_to_be_updated_once VARCHAR(100) DEFAULT NULL
);
In your code you can check whether the column (value_to_be_updated_once) is NULL; if it is, the user should be allowed to edit.
You must make sure that the user cannot set this value to NULL, unless that is something you want to allow (maybe the user changed his/her mind and will edit later).
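One way to sketch this check is to put the NULL test into the UPDATE itself, so that checking and writing happen in a single statement. The example below uses Python's built-in sqlite3 module as a stand-in for PHP/MySQL; the helper name update_once and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example_table ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " data TEXT,"
    " value_to_be_updated_once TEXT DEFAULT NULL)"
)
conn.execute("INSERT INTO example_table (data) VALUES ('registration data')")

def update_once(conn, row_id, new_value):
    """Set the column only if it is still NULL; report whether it changed."""
    if new_value is None:
        # Writing NULL would silently re-open the column for editing.
        raise ValueError("new_value must not be None")
    cur = conn.execute(
        "UPDATE example_table SET value_to_be_updated_once = ?"
        " WHERE id = ? AND value_to_be_updated_once IS NULL",
        (new_value, row_id),
    )
    return cur.rowcount == 1

print(update_once(conn, 1, "first choice"))   # True: the column was still NULL
print(update_once(conn, 1, "second choice"))  # False: already set, row untouched
```

Keeping the condition in the WHERE clause avoids a gap between a separate SELECT and the UPDATE in which two requests could both pass the check.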
You may, at some point, decide to allow the user to change columnX 3 times. A boolean will not allow for this. And what if you decide to allow the user to change columnY too? What then?
If you are absolutely positive that you will never need to change your rules, a binary flag will work fine. But if you will potentially be allowing and limiting changes on multiple columns with possibly different limits, you might consider a User_Edits table. It might look something like this:
CREATE TABLE User_Edits (
tablename varchar(30) not null,
columnname varchar(30) not null,
user_id int unsigned not null,
changetime timestamp default current_timestamp,
oldvalue text
);
To find out how many edits a given user has made to a column:
SELECT COUNT(*) FROM User_Edits
WHERE user_id = :user_id AND tablename='Mytable' AND columnname='ColumnX';
And, as an added bonus, you'll have an audit trail that allows you to undo changes.
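A rough sketch of enforcing a configurable limit with such a table is shown below, using Python's built-in sqlite3 module as a stand-in for PHP/MySQL. The functions edit_count and record_edit_if_allowed and the max_edits parameter are hypothetical names introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE User_Edits (
        tablename TEXT NOT NULL,
        columnname TEXT NOT NULL,
        user_id INTEGER NOT NULL,
        changetime TEXT DEFAULT CURRENT_TIMESTAMP,
        oldvalue TEXT
    )
""")

def edit_count(conn, user_id, tablename, columnname):
    """How many times this user has already changed this column."""
    return conn.execute(
        "SELECT COUNT(*) FROM User_Edits"
        " WHERE user_id = ? AND tablename = ? AND columnname = ?",
        (user_id, tablename, columnname),
    ).fetchone()[0]

def record_edit_if_allowed(conn, user_id, tablename, columnname,
                           oldvalue, max_edits=1):
    """Log the edit and return True, or return False once the limit is hit."""
    if edit_count(conn, user_id, tablename, columnname) >= max_edits:
        return False
    conn.execute(
        "INSERT INTO User_Edits (tablename, columnname, user_id, oldvalue)"
        " VALUES (?, ?, ?, ?)",
        (tablename, columnname, user_id, oldvalue),
    )
    return True

print(record_edit_if_allowed(conn, 7, "Mytable", "ColumnX", "old A"))  # True
print(record_edit_if_allowed(conn, 7, "Mytable", "ColumnX", "old B"))  # False
# Raising the limit later needs no schema change:
print(record_edit_if_allowed(conn, 7, "Mytable", "ColumnX", "old B",
                             max_edits=3))                             # True
```

The caller would perform the actual UPDATE of the user's row only when the function returns True, ideally inside the same transaction.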

How to store an article in MySQL?

I would like to store some articles (blog posts) in a mysql table, these posts will be made from more parts (e.g.: part1, part2 ... part x)
I don't know how to store them...
Could I store each part in a text file, or how could I store it in a mysql database ?
What field can support data of this size ?
And how should I design the table to store each part of the post ?
Would it be good to store all the parts in the same cell, separated by a word (), and then split them with PHP?
Thanks!
A common design method is to create a "Parts" table, ex:
CREATE TABLE parts (page_id INTEGER, part_name VARCHAR(255), body TEXT);
which will work fine at lower traffic. (page_id in this case is the foreign key to the page which "owns" this part - you'd get all parts for a given page by saying, natch SELECT * FROM parts WHERE page_id = :some_page_id)
As your traffic rises, the cost of pulling in and assembling the pages may become egregious, in which case the splitting apart the body contents from a larger text field (as you suggested) would not be a terrible idea. At this level, the speed gains from doing direct serialization of a hash into the database column and making the app server's CPU bear the brunt of the work (as opposed to the DB server) may be worth it.
The column types you'd be interested in are enumerated here under "Storage Requirements for String Types": http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html
Summarized: TEXT (64 KB) should be large enough to hold most basic data. Use MEDIUMTEXT (16 MB) or LONGTEXT (4096 MB) if your data is noticeably large or you foresee it growing. Use BLOB, MEDIUMBLOB or LONGBLOB (same sizes as the *TEXT types) if you intend to do any PHP variable deserialization from DB columns.
I'd suggest one table "Post" and second table "Post_part" with FK to "Post". In the "Post_part" table you could store the text in column of TEXT type.
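A minimal sketch of that two-table layout follows, again using Python's built-in sqlite3 module as a stand-in for MySQL. The position column (used to keep the parts in order), the title column, and the helper assemble_post are assumptions added for this example and are not part of the suggestion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE post_part (
        post_id INTEGER NOT NULL REFERENCES post(id),
        position INTEGER NOT NULL,   -- hypothetical ordering column
        body TEXT NOT NULL,
        PRIMARY KEY (post_id, position)
    );
    INSERT INTO post VALUES (1, 'My first article');
    INSERT INTO post_part VALUES (1, 2, 'Second part.');
    INSERT INTO post_part VALUES (1, 1, 'First part.');
""")

def assemble_post(conn, post_id, separator="\n\n"):
    """Fetch all parts of one post in order and join them into one string."""
    rows = conn.execute(
        "SELECT body FROM post_part WHERE post_id = ? ORDER BY position",
        (post_id,),
    ).fetchall()
    return separator.join(body for (body,) in rows)

print(assemble_post(conn, 1))
# First part.
#
# Second part.
```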
