Database design for an S3-based file storage application

I am working on a file storage application using PHP and MySQL. My use case: a user can share multiple files with multiple users through the app. The sender sends URLs via email, and the receiver has to click the link and log in to see the file. So when a user shares a file, I want to add an entry to the database.
id|filename|shared_with|shared_by|shared_on|shared_url|url_expiration
The above is the database structure I currently have in mind. But with it I would have to store multiple values in shared_with whenever the same file is shared with multiple users, which I believe is not a good approach. Storing comma-separated values is not a good idea either.
I considered a document database such as MongoDB (only because, as far as I know, Dropbox uses it and it handles key-value data well). But since MySQL can handle a decent number of records, and NoSQL is mostly pitched as a solution for big data, I am not sure which is the right choice for this use case.
I would like the experts to shed some light on this. I am using Amazon S3 to store the files.

Here's a very basic design to get you started...
You need a table to store your file information:
files
    id        unsigned int(P)
    owner_id  unsigned int(F users.id)
    name      varchar(255)
+----+----------+----------+
| id | owner_id | name     |
+----+----------+----------+
|  1 |        1 | File A   |
|  2 |        1 | File B   |
|  3 |        1 | File C   |
|  4 |        2 | File 123 |
| .. | ........ | ........ |
+----+----------+----------+
You need a table to store information about which files were shared with whom. In my example data you see bob shared File A with mary and jim, then he shared File B with mary.
shares
    id           unsigned int(P)
    file_id      unsigned int(F files.id)
    shared_with  unsigned int(F users.id)
    shared       datetime
    url          varchar(255)
    url_expires  datetime
+----+---------+-------------+---------------------+-------+---------------------+
| id | file_id | shared_with | shared              | url   | url_expires         |
+----+---------+-------------+---------------------+-------+---------------------+
|  1 |       1 |           2 | 2014-01-06 08:00:00 | <url> | 2014-01-07 08:00:00 |
|  2 |       1 |           3 | 2014-01-06 08:00:00 | <url> | 2014-01-07 08:00:00 |
|  3 |       2 |           2 | 2014-01-06 08:15:32 | <url> | 2014-01-07 08:15:32 |
| .. | ....... | ........... | ................... | ..... | ................... |
+----+---------+-------------+---------------------+-------+---------------------+
And finally you need a table to store user information.
users
    id        unsigned int(P)
    username  varchar(32)
    password  varbinary(255)
    ...
+----+----------+----------+-----+
| id | username | password | ... |
+----+----------+----------+-----+
|  1 | bob      | ******** | ... |
|  2 | mary     | ******** | ... |
|  3 | jim      | ******** | ... |
| .. | ........ | ........ | ... |
+----+----------+----------+-----+
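To see how the three tables work together, here is a minimal sketch that builds them and lists the files shared with a given user. It is written in Python against an in-memory SQLite database purely so it can be run as-is; the column types, the omitted password column, and the helper name files_shared_with are assumptions for illustration, not part of the answer above. The same JOINs should carry over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE      -- password column omitted in this sketch
);
CREATE TABLE files (
    id       INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES users(id),
    name     TEXT NOT NULL
);
CREATE TABLE shares (
    id          INTEGER PRIMARY KEY,
    file_id     INTEGER NOT NULL REFERENCES files(id),
    shared_with INTEGER NOT NULL REFERENCES users(id),
    shared      TEXT NOT NULL,          -- 'YYYY-MM-DD HH:MM:SS'
    url         TEXT NOT NULL,
    url_expires TEXT NOT NULL
);
""")

# Sample data taken from the tables above.
conn.executemany("INSERT INTO users (id, username) VALUES (?, ?)",
                 [(1, "bob"), (2, "mary"), (3, "jim")])
conn.executemany("INSERT INTO files (id, owner_id, name) VALUES (?, ?, ?)",
                 [(1, 1, "File A"), (2, 1, "File B"), (3, 1, "File C"), (4, 2, "File 123")])
conn.executemany(
    "INSERT INTO shares (file_id, shared_with, shared, url, url_expires) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1, 2, "2014-01-06 08:00:00", "<url>", "2014-01-07 08:00:00"),
     (1, 3, "2014-01-06 08:00:00", "<url>", "2014-01-07 08:00:00"),
     (2, 2, "2014-01-06 08:15:32", "<url>", "2014-01-07 08:15:32")])


def files_shared_with(conn, username, now):
    """Return (file name, owner username) for shares that have not expired at `now`."""
    return conn.execute(
        """
        SELECT f.name, o.username
        FROM shares s
        JOIN files f ON f.id = s.file_id
        JOIN users o ON o.id = f.owner_id
        JOIN users u ON u.id = s.shared_with
        WHERE u.username = ? AND s.url_expires > ?
        ORDER BY s.shared, f.name
        """,
        (username, now),
    ).fetchall()


print(files_shared_with(conn, "mary", "2014-01-06 12:00:00"))
```

One row per (file, recipient) pair in shares is what removes the need for comma-separated values: sharing a file with one more user is just one more row.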

Related

Delete values from second column alongside the values from first column in the same row between two database tables

I am trying to delete the rows with the same user_id from two tables, and also delete the stored image files, whose names I get from the avatar value.
The procedure, for example:
user_id (table 1) -> user_id (table 2) -> avatar -> file names -> delete the images -> delete all data for that user_id from both tables
I have two tables:
*user* table
| user_id | Full_name |
| -------- | -------------- |
| 1 | Steve Jobs |
| 2 | Bill Gates |
| 3 | Elon Musk |
*users_option* table (with both columns as foreign keys)
| user_id | user_option | user_value |
| ------------ | -------------- | -------------- |
| 1 | email | abc@abc.com |
| 1 | avatar | 'big.jpg','sml.png' |
| 1 | username | stevejobs |
| 2 | email | def@def.com |
| 2 | avatar | 'big.png','sml.jpeg' |
| 2 | username | billgates |
I am having a hard time writing this code with the ORM (MySQL):
ORM::table($config['db']['pre'].'user')
ORM::table($config['db']['pre'].'user_option')
Simply put, I just want to completely delete all the data for the same user_id.
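The procedure described above could be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not the ORM used in the question; the table name user_option (taken from the ORM call), the helper name delete_user, and the way the avatar value is split into file names are all assumptions. The function returns the avatar file names so the caller can remove them from storage, and deletes the rows from both tables inside one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    user_id   INTEGER PRIMARY KEY,
    Full_name TEXT NOT NULL
);
CREATE TABLE user_option (
    user_id     INTEGER NOT NULL REFERENCES user(user_id),
    user_option TEXT NOT NULL,
    user_value  TEXT NOT NULL
);
""")
conn.executemany("INSERT INTO user VALUES (?, ?)",
                 [(1, "Steve Jobs"), (2, "Bill Gates"), (3, "Elon Musk")])
conn.executemany("INSERT INTO user_option VALUES (?, ?, ?)", [
    (1, "avatar", "'big.jpg','sml.png'"),
    (1, "username", "stevejobs"),
    (2, "avatar", "'big.png','sml.jpeg'"),
    (2, "username", "billgates"),
])


def delete_user(conn, user_id):
    """Delete one user from both tables; return the avatar file names to remove."""
    row = conn.execute(
        "SELECT user_value FROM user_option "
        "WHERE user_id = ? AND user_option = 'avatar'",
        (user_id,),
    ).fetchone()
    # Assumed value format: 'big.jpg','sml.png' (quoted names separated by commas).
    avatars = [part.strip().strip("'") for part in row[0].split(",")] if row else []
    with conn:  # one transaction: both deletes are committed or neither is
        conn.execute("DELETE FROM user_option WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM user WHERE user_id = ?", (user_id,))
    return avatars


files_to_remove = delete_user(conn, 1)
print(files_to_remove)  # the caller would now delete these files from storage
```

Deleting the option rows before the user row keeps the order safe if a foreign key constraint is enforced; with ON DELETE CASCADE on the foreign key, the first DELETE would not be needed.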

database: best practice countries, country codes, country phone codes

I am looking for a "best practice" for storing country codes in a database, but couldn't find a definitive "this is the right way". I want to store the two-character country code and also the country phone codes (e.g. Germany would be "DE" and "+49").
My current plan is to create one table countries and one table country_codes, something like this:
TABLE: countries
id INT(11)
code CHAR(2)
TABLE: country_codes
id INT(11)
country_id INT(11) FOREIGN KEY (countries -> id)
phone_code VARCHAR(6)
I think I need to split them because some countries have more than one phone code. This way a country can have multiple phone codes.
But my question is: is that the "best practice"? Not only in the sense of "that will work", but also with a view to rolling out my application in "all" countries or translating the app into multiple languages (in that case I would also use the countries table for the different languages).
How do you handle this if you want to be able to translate your app into any language without re-coding things, and you also need a list of all countries in your app?
In case it matters: I am planning to use Laravel for this app.

Country codes are standardized to two letters by ISO 3166-1 alpha-2, so storing them that way will work. It's often helpful to include a country name in the table, so a user can choose the right country without having to know all the codes.
Telephone numbers are far less standardized. The ITU offers recommendation E.164 for representing actual telephone numbers (called "directory numbers" in telephony jargon). Country codes are defined as one to three digits. North America (including the USA, Canada and many Caribbean nations) is part of the North American Numbering Plan (NANP) and shares the country code 1.
Directory numbers are typically preceded by + and punctuated by dots. So, for example, the published New York City directory assistance number is (or was when they still had such a service) +1.212.555.1212. If you called that number from someplace in Europe, you would see the + and substitute your local international prefix. In NANP, multiple nationalities have the same country code.
But, UK is strange. Calling from outside the country, it's +44.exchange.number. But calling long distance from within the country it's (0) exchange.number.
My point: it's hard to get it right if you try to compose directory numbers with a country code in your software. You're probably better off asking users to provide their telephone numbers with the international prefix.
You should definitely not tie E.164 country codes to ISO 3166 two-letter country codes by putting them as different columns on the same row of a table. You need two separate tables to be future proof. The standardization organizations are different and do their own things, so your data model should reflect that.
Read this: Falsehoods Programmers Believe About Telephone Numbers.

My DB looks like this:
> id int(11) Auto Increment (Just an ID (primary key))
> iso char(2) (2-letters ISO code)
> name varchar(80) (normalized name (all uppercase))
> nicename varchar(80) (Nicely formatted name)
> iso3 char(3) NULL (3-letters ISO code)
> numcode smallint(6) NULL (numeric ISO code)
> phonecode int(5) (phone code like '1' for USA, without '+')
It should be more than enough. You take the user's phone number, remove any non-numerical characters, remove the zeroes at the beginning, add the country code from the DB, and you are good to go!
Example:
1) User input (045) 111-22-33, Germany
2) You convert it to 451112233
3) Add code of Germany (49) from DB. You get 49451112233. Add '+' if you wish.
4) Now you can make a call or send SMS with Twilio or any other service.
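The four steps above can be sketched as a small function. This is a naive illustration in Python of exactly the recipe in this answer, not a general-purpose phone number parser; the function name to_international is made up for the example, and the phone code is assumed to come from the phonecode column.

```python
import re


def to_international(raw_number, phone_code):
    """Apply the steps from the answer: digits only, no leading zeroes, prefix the code."""
    digits = re.sub(r"\D", "", raw_number)  # drop brackets, spaces, dashes, ...
    digits = digits.lstrip("0")             # drop the leading zero(es)
    return "+" + str(phone_code) + digits


print(to_international("(045) 111-22-33", 49))
```

Note that the non-digit characters have to be stripped before the leading zeroes, otherwise the opening bracket in "(045)" would hide the zero.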
If you want to "easily" translate the site to other languages, store all of your text in database and pull the right version depending on user's language preferences.

Based on the answers I would do the following:
DB Tables:
+------------------------------------------------------------+
| Table: countries |
+--------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| id | int(11) | NO | | NULL | |
| iso_code2 | char(2) | NO | | NULL | |
| iso_code3 | char(3) | NO | | NULL | |
| num_code | int(3) | NO | | NULL | |
| name | varchar(48) | NO | | NULL | |
| nicename | varchar(48) | NO | | NULL | |
+--------------+--------------+------+-----+---------+-------+
// will store all countries available
+------------------------------------------------------------+
| Table: country_phonecodes |
+--------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| id | int(11) | NO | | NULL | |
| country_id | int(11) | NO | | NULL | |
| phonce_code | int(6) | NO | | NULL | |
+--------------+--------------+------+-----+---------+-------+
// based on this page: https://countrycode.org/ there are
// countries with more than one code
// and also codes can be 6 chars long
+------------------------------------------------------------+
| Table: languages |
+--------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| id | int(11) | NO | | NULL | |
| code | char(2) | NO | | NULL | |
| locale | char(5) | NO | | NULL | |
| name | varchar(50) | NO | | NULL | |
| native_name | varchar(50) | NO | | NULL | |
| flag | varchar(10) | NO | | NULL | |
+--------------+--------------+------+-----+---------+-------+
// table for available translations of the app
+------------------------------------------------------------+
| Table: country_languages |
+--------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| id | int(11) | NO | | NULL | |
| country_id | int(11) | NO | | NULL | |
| language_id | int(11) | NO | | NULL | |
+--------------+--------------+------+-----+---------+-------+
// table for language suggestions for a given country
And some example inserts:
+---------------------------------------------------------------------------------------+
| Inserts: countries |
+-----+------------+------------+-----------+---------------------+---------------------+
| id | iso_code2 | iso_code3 | num_code | name | nicename |
+-----+------------+------------+-----------+---------------------+---------------------+
| 1 | de | deu | 276 | GERMANY | Germany |
| 2 | do | dom | 214 | DOMINICAN REPUBLIC | Dominican Republic |
| 3 | be | bel | 056 | BELGIUM | Belgium |
+-----+------------+------------+-----------+---------------------+---------------------+
+----------------------------------+
| Inserts: country_phonecodes |
+-----+-------------+--------------+
| id | country_id | phonce_code |
+-----+-------------+--------------+
| 1 | 1 | 49 |
| 2 | 2 | 1809 |
| 3 | 2 | 1829 |
| 4 | 2 | 1849 |
| 5 | 3 | 32 |
+-----+-------------+--------------+
+----------------------------------------------------------+
| Inserts: languages |
+-----+-------+---------+---------+--------------+---------+
| id | code | locale | name | native_name | flag |
+-----+-------+---------+---------+--------------+---------+
| 1 | de | de_DE | German | Deutsch | de.svg |
| 2 | do | es_DO | Spanish | Español | es.png |
| 3 | be | fr_BE | French | Français | fr.jpg |
| 4 | be | nl_BE | Dutch | Nederlands | nl.png |
| 5 | be | de_BE | German | Deutsch | de.svg |
+-----+-------+---------+---------+--------------+---------+
+----------------------------------+
| Inserts: country_languages |
+-----+-------------+--------------+
| id | country_id | language_id |
+-----+-------------+--------------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 3 | 4 |
| 5 | 3 | 5 |
+-----+-------------+--------------+
I think this should work and be usable for any project where a country list and/or i18n is needed.
If a user comes from Belgium, they can choose from the list of available languages/translations. They will get a suggestion for FR, NL and DE but will still be able to choose es_DO as their preferred language.
I think this should cover all needs, but if anyone sees a problem with it or has ideas/comments, I would be happy to improve this solution :)
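As a rough check of the design, the language suggestion for a country can be expressed as one JOIN over country_languages. The sketch below uses Python with an in-memory SQLite database and only a subset of the columns; the helper names suggested_locales and all_locales are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (
    id        INTEGER PRIMARY KEY,
    iso_code2 TEXT NOT NULL UNIQUE,
    nicename  TEXT NOT NULL
);
CREATE TABLE languages (
    id     INTEGER PRIMARY KEY,
    locale TEXT NOT NULL UNIQUE,
    name   TEXT NOT NULL
);
CREATE TABLE country_languages (
    country_id  INTEGER NOT NULL REFERENCES countries(id),
    language_id INTEGER NOT NULL REFERENCES languages(id),
    PRIMARY KEY (country_id, language_id)
);
INSERT INTO countries VALUES
    (1, 'de', 'Germany'), (2, 'do', 'Dominican Republic'), (3, 'be', 'Belgium');
INSERT INTO languages VALUES
    (1, 'de_DE', 'German'), (2, 'es_DO', 'Spanish'), (3, 'fr_BE', 'French'),
    (4, 'nl_BE', 'Dutch'), (5, 'de_BE', 'German');
INSERT INTO country_languages VALUES (1, 1), (2, 2), (3, 3), (3, 4), (3, 5);
""")


def suggested_locales(conn, iso_code2):
    """Locales suggested for a country, looked up by its two-letter code."""
    rows = conn.execute(
        """
        SELECT l.locale
        FROM countries c
        JOIN country_languages cl ON cl.country_id = c.id
        JOIN languages l ON l.id = cl.language_id
        WHERE c.iso_code2 = ?
        ORDER BY l.id
        """,
        (iso_code2.lower(),),
    ).fetchall()
    return [locale for (locale,) in rows]


def all_locales(conn):
    """Every available translation, regardless of country."""
    return [locale for (locale,) in conn.execute("SELECT locale FROM languages ORDER BY id")]


print(suggested_locales(conn, "be"))
print(all_locales(conn))
```

A user from Belgium gets the three Belgian locales as suggestions, while the full list from all_locales remains available for a free choice.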

How can I implement model revisions in Laravel?

This question is for my pastebin app written in PHP.
I did a bit of a research, although I wasn't able to find a solution that matches my needs. I have a table with this structure:
+-----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------+------+-----+---------+----------------+
| id | int(12) unsigned | NO | PRI | NULL | auto_increment |
| author | varchar(50) | YES | | | |
| authorid | int(12) unsigned | YES | | NULL | |
| project | varchar(50) | YES | | | |
| timestamp | int(11) unsigned | NO | | NULL | |
| expire | int(11) unsigned | NO | | NULL | |
| title | varchar(25) | YES | | | |
| data | longtext | NO | | NULL | |
| language | varchar(50) | NO | | php | |
| password | varchar(60) | NO | | NULL | |
| salt | varchar(5) | NO | | NULL | |
| private | tinyint(1) | NO | | 0 | |
| hash | varchar(12) | NO | | NULL | |
| ip | varchar(50) | NO | | NULL | |
| urlkey | varchar(8) | YES | MUL | | |
| hits | int(11) | NO | | 0 | |
+-----------+------------------+------+-----+---------+----------------+
This is for a pastebin application. I basically want paste revisions so that if you open paste #1234, it shows all past revisions of that paste.
I thought of three ways:
Method 1
Have a revisions table with id and old_id or something and for each ID, I would insert all old revisions, so if my structure looks like this:
rev3: 1234
rev2: 1233
rev1: 1232
The table will contain this data:
+-------+----------+
| id | old_id |
+-------+----------+
| 1234 | 1233 |
| 1234 | 1232 |
| 1233 | 1232 |
+-------+----------+
The problem I have with this is that it introduces a lot of duplicate data. As the number of revisions grows, not only does the table hold more data, but I also need to do N inserts into the revisions table for each new paste, which is not great for large N.
Method 2
I can add a child_id to the paste table at the top and just update that. And then, when fetching the paste, I will keep querying the db for each child_id and their child_id and so on... But the problem is, that will introduce too many DB reads each time a paste with many revisions is opened.
Method 3
Also involves a separate revisions table, but for the same scenario as method 1, it will store the data like this:
+-------+-----------------+
| id | old_id |
+-------+-----------------+
| 1234 | 1233,1232 |
| 1233 | 1232 |
+-------+-----------------+
And when someone opens paste 1234, I'll use an IN clause to fetch all child paste data there.
Which is the best approach? Or is there a better approach? I am using Laravel 4 framework that has Eloquent ORM.
EDIT: Can I do method 1 with a oneToMany relationship? I understand that I can use Eager Loading to fetch all the revisions, but how can I insert them without having to do a dirty hack?
EDIT: I figured out how to handle the above. I'll add an answer to close this question.

If you are on Laravel 4, give Revisionable a try. It might suit your needs.

So here is what I am doing:
Say this is the revision flow:
1232 -> 1233 -> 1234
1232 -> 1235
So here is what my revision table will look like:
+----+--------+--------+
| id | new_id | old_id |
+----+--------+--------+
| 1 | 1233 | 1232 |
| 2 | 1234 | 1233 |
| 3 | 1234 | 1232 |
| 4 | 1235 | 1232 |
+----+--------+--------+
IDs 2 and 3 show that when I open 1234, it should show both 1233 and 1232 as revisions on the list.
Now the implementation bit: I will have the Paste model have a one to many relationship with the Revision model.
When I create a new revision for an existing paste, I will run a batch insert to add not only the current new_id and old_id pair, but pair the current new_id with all revisions that were associated with old_id.
When I open a paste - which I will do by querying new_id, I will essentially get all associated rows in the revisions table (using a function in the Paste model that defines hasMany('Revision', 'new_id')) and will display to the user.
I am also thinking about displaying the author of each revision in the "Revision history" section on the "view paste" page, so I think I'll also add an author column to the revision table so that I don't need to go back and query the main paste table to get the author.
So that's about it!
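The batch insert described above can be sketched like this. It is a minimal illustration in Python with an in-memory SQLite table rather than Eloquent; the function names add_revision and revisions_of are assumptions, and the author column mentioned above is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE revisions (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    new_id INTEGER NOT NULL,
    old_id INTEGER NOT NULL
)
""")


def add_revision(conn, new_id, old_id):
    """Pair new_id with old_id and with every revision already linked to old_id."""
    ancestors = [row[0] for row in conn.execute(
        "SELECT old_id FROM revisions WHERE new_id = ? ORDER BY old_id DESC",
        (old_id,),
    )]
    pairs = [(new_id, old_id)] + [(new_id, ancestor) for ancestor in ancestors]
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO revisions (new_id, old_id) VALUES (?, ?)", pairs)


def revisions_of(conn, paste_id):
    """All earlier revisions of a paste, newest first (assumes ids grow over time)."""
    return [row[0] for row in conn.execute(
        "SELECT old_id FROM revisions WHERE new_id = ? ORDER BY old_id DESC",
        (paste_id,),
    )]


# The revision flow from the answer: 1232 -> 1233 -> 1234 and 1232 -> 1235.
add_revision(conn, 1233, 1232)
add_revision(conn, 1234, 1233)
add_revision(conn, 1235, 1232)

print(conn.execute("SELECT id, new_id, old_id FROM revisions ORDER BY id").fetchall())
print(revisions_of(conn, 1234))
```

Reading the history of a paste is then a single query on new_id, at the cost of writing one row per ancestor whenever a revision is created.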

There are some great packages to help you keep model revisions:
If you only want to keep model revisions, you can use:
Revisionable
If you also want to log any other actions, whenever you want, with custom data, you can use:
Laravel Activity Logger
Honorable mentions:
Activity Log. It also has a lot of options.

MySQL: Select row from table which name is specified in a field of another table

I have a feed table with, say, these fields:
id - unique feed id
created - the date the feed was created
table - the name of the table the rest of the feed info resides
Then I have say 2 tables: feed_image and feed_text. Now these 2 tables contain different information about a feed, different fields.
How is it possible (in MySQL) to extract the information for a feed from the appropriate table, whose name is specified in feed.table?
Here is what my schema looks like:
                               +------------------+
                               | table_a          |
+---------------------+        |------------------|
| feed                |        | id               |
|---------------------|   +----+ feed_id          |
| id <--------------------+    | field_in_a       |
| created             |   |    | ...              |
| table               |   |    +------------------+
+---------------------+   |
                          |    +------------------+
                          |    | table_b          |
                          |    |------------------|
                          |    | id               |
                          +----+ feed_id          |
                               | field_in_b       |
                               | ...              |
                               +------------------+
Each feed exists either in table_a or table_b or table_c or ... (I have like 30 of them).
How can I specify which table to extract the info from (each table has a different structure).
Or, if I add indexes on each table_*.feed_id and map it to feed.id, would InnoDB do some magic, so when I JOIN them all it would look in just one of them, not all 30?
My latest idea is to have just one table feed with a field feed.content where I would store a serialized PHP object of a different PHP class representing the different feed type and its individual contents.
What is the best way to go regarding performance?
P.S.: No records would need to be selected / searched / ordered by individual parameters, just by created. The idea should be able to work well with 1 000 000+ records.
UPDATE:
To clarify about the 30+ table_a/b/c..
Each feed can be one of many different types (new ones will also be added over time):
An image feed would have VARCHAR(255) url field
A text feed would have LONGTEXT text field
A youtube.com feed would have VARCHAR(255) title, VARCHAR(255) video_id fields
A *.com feed would have * x1, * x2, * x3 ... fields
Each of these feeds will be then displayed with PHP according to type:
An image will be displayed as an image from the given URL
A text will be displayed as a pure text
A youtube.com feed would display a video player with the given title from the given video id
A *.com feed would display... :)

I would use a LEFT JOIN, aliasing my columns in the SELECT and my tables in the join, which lets you return any and all information you need.
Then, in whatever language you're pulling the results with, you can group and apply logic as necessary.
UPDATE:
Why do you have 30 tables exactly? Maybe use one "meta" table with the feed creation date, the URL it came from, etc., and another table that contains a unique record id, feed id, content, and content type.
That way you can join on one table where the feed ids match, as well as group or filter by content type.
Visualization: Feed table
---------------------------------------------------
| feed_id | feed_name | feed_created | feed_url   |
---------------------------------------------------
| 1       | Feed 1    | 03/28/2012   | www.go.com |
---------------------------------------------------
| 2       | Feed 1    | 03/28/2012   | www.be.com |
---------------------------------------------------
| 3       | Feed 2    | 03/28/2012   | www.hi.com |
---------------------------------------------------
| 4       | Feed 3    | 03/28/2012   | www.ex.com |
---------------------------------------------------
Visualization: Feed Resources table
-----------------------------------------------------------------------------------
| rec_id | feed_id | content                                               | type |
-----------------------------------------------------------------------------------
| 1      | 1       | 'hello world!                                         | text |
-----------------------------------------------------------------------------------
| 2      | 3       | 'http://me.com/my-image                               | img  |
-----------------------------------------------------------------------------------
| 3      | 2       | {\'title\':\'VIDEO\',\'url\':\'http://me.com/1.mov\'} | vid  |
-----------------------------------------------------------------------------------
| 4      | 1       | 'Wow that was easy!'                                  | text |
-----------------------------------------------------------------------------------
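A minimal sketch of this two-table layout and the LEFT JOIN over it, using the sample rows above. It is written in Python with an in-memory SQLite database so it can be run directly; the table name feed_resources, the index name, the ISO date format, and the helper feed_items are assumptions for illustration. The video row stores its fields as JSON in the content column.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feed (
    feed_id      INTEGER PRIMARY KEY,
    feed_name    TEXT NOT NULL,
    feed_created TEXT NOT NULL,   -- stored as 'YYYY-MM-DD' so it sorts correctly
    feed_url     TEXT NOT NULL
);
CREATE TABLE feed_resources (
    rec_id  INTEGER PRIMARY KEY,
    feed_id INTEGER NOT NULL REFERENCES feed(feed_id),
    content TEXT NOT NULL,
    type    TEXT NOT NULL
);
CREATE INDEX idx_feed_resources_feed_id ON feed_resources(feed_id);
""")
conn.executemany("INSERT INTO feed VALUES (?, ?, ?, ?)", [
    (1, "Feed 1", "2012-03-28", "www.go.com"),
    (2, "Feed 1", "2012-03-28", "www.be.com"),
    (3, "Feed 2", "2012-03-28", "www.hi.com"),
    (4, "Feed 3", "2012-03-28", "www.ex.com"),
])
conn.executemany("INSERT INTO feed_resources VALUES (?, ?, ?, ?)", [
    (1, 1, "hello world!", "text"),
    (2, 3, "http://me.com/my-image", "img"),
    (3, 2, json.dumps({"title": "VIDEO", "url": "http://me.com/1.mov"}), "vid"),
    (4, 1, "Wow that was easy!", "text"),
])


def feed_items(conn):
    """Every feed with its resources; feeds without resources still appear once."""
    return conn.execute(
        """
        SELECT f.feed_id, r.type, r.content
        FROM feed AS f
        LEFT JOIN feed_resources AS r ON r.feed_id = f.feed_id
        ORDER BY f.feed_created, f.feed_id, r.rec_id
        """
    ).fetchall()


for feed_id, kind, content in feed_items(conn):
    # Dispatch on the type column in application code.
    value = json.loads(content) if kind == "vid" else content
    print(feed_id, kind, value)
```

Because every feed type lives in the same resources table, adding a new type means adding a new value for the type column rather than a new table and a new JOIN.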

Can't you do something like this:
                               +------------------+
                               | table_a          |
+---------------------+        |------------------|
| feed                |        | id               |
|---------------------|   +----+ feed_id          |
| id <--------------------+    | field_in_a       |
| created             |   |    | ...              |
+---------------------+   |    +------------------+
                          |
                          |    +------------------+
                          |    | table_b          |
                          |    |------------------|
                          |    | id               |
                          +----+ feed_id          |
                               | field_in_b       |
                               | ...              |
                               +------------------+
And then join the records from table_a and table_b? MySQL is pretty efficient at that.

You should create a normalized layout as d_inevitable suggested.
You haven't told us exactly how you're displaying this data. But you can get a list of ALL feeds with select * from feed;
Then you can get additional data for the feed by searching the other tables. For your example of URLs, if table_a = URLs and field_in_a = URL
Whichever feed you're on, you'd search for URLs with the ID for that feed.
select * from URLs where feed_id = "id"
This would allow each feed to have 1 to many URLs associated with it.
You could do this for each type of data you'd have associated with a feed. The "feed_id" is your Foreign Key that you use to reference which feed it is.
The key is going to come down to how you're displaying this.
You're going to need to loop through all the Feeds, and then build a table (?) appropriately.
If a feed has two URLs, how do you want it to look?
Should it display
-----------------------------------------
| Feed Name | Feed Created | URL        |
-----------------------------------------
| Feed 1    | 03/28/2012   | www.go.com |
-----------------------------------------
| Feed 1    | 03/28/2012   | www.be.com |
-----------------------------------------
| Feed 2    | 03/28/2012   | www.hi.com |
-----------------------------------------
| Feed 3    | 03/28/2012   |            |
-----------------------------------------
or
-----------------------------------------
| Feed Name | Feed Created | URL        |
-----------------------------------------
| Feed 1    | 03/28/2012   | www.go.com |
|           |              | www.be.com |
-----------------------------------------
| Feed 2    | 03/28/2012   | www.hi.com |
-----------------------------------------
| Feed 3    | 03/28/2012   |            |
-----------------------------------------
I think the data layout should be as d_inevitable suggested, and then you need to determine how you're going to display the data, and that will determine how you query it.

Uploading a csv using php into MySQL and update as well

OK, so I have a database table called requests with this structure:
mysql> desc requests;
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| artist | varchar(255) | YES | | NULL | |
| song | varchar(255) | YES | | NULL | |
| showdate | date | YES | | NULL | |
| amount | float | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
Here is some example data
+----+-----------+-------------------------+------------+--------+
| id | artist | song | showdate | amount |
+----+-----------+-------------------------+------------+--------+
| 6 | Metallica | Hello Cruel World | 2010-09-15 | 10.00 |
| 7 | someone | Some some | 2010-09-18 | 15.00 |
| 8 | another | Some other song | 2010-11-10 | 45.09 |
+----+-----------+-------------------------+------------+--------+
I need a way to let a user upload a CSV with the same structure and have it update or insert rows based on what's in the CSV. I have found many scripts online, but most use a hard-coded CSV, which is not what I need. I need the user to be able to upload the CSV. Is that easy with PHP?
Here is an example csv
id  artist         song         showdate    amount
11  NewBee         great stuff  2010-09-15  12.99
10  anotherNewbee  even better  2010-09-16  34.00
6   NewArtist      New song     2010-09-25  78.99
As you can see, id 6 is already in the database and needs to be updated. The other two will be inserted.
I am not asking for someone to write the whole script, but I would appreciate some direction on the upload and where to go from there. Thanks.

Create a stored procedure like the one below and test it; it works. (Note that this is SQL Server syntax, not MySQL.)
CREATE proc csv
(
@id int,
@artist varchar(50),
@songs varchar(100),
@showdate datetime,
@amount float
)
as
set nocount on
if exists (select id from dummy1 where id=@id) -- dummy1 is my table
begin
-- the row already exists: update all columns in one statement
update dummy1 set artist=@artist, songs=@songs, showdate=@showdate, amount=@amount where id=@id
end
else
-- no row with this id yet: insert a new one
insert into dummy1 (artist,songs,showdate,amount) values (@artist,@songs,@showdate,@amount)
Go

upload the file to a directory using move_uploaded_file
use fgetcsv to read the uploaded csv and process each row as you like.
delete the csv file
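The "process each row" step could look roughly like the sketch below: update the row if its id already exists, otherwise insert it. It is shown in Python (csv module and an in-memory SQLite table) only so it can be run as-is; in PHP the reading loop would use fgetcsv instead. The function name import_requests and the comma-separated input format are assumptions. On MySQL the same effect can be had in a single statement with INSERT ... ON DUPLICATE KEY UPDATE.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE requests (
    id       INTEGER PRIMARY KEY,
    artist   TEXT,
    song     TEXT,
    showdate TEXT,
    amount   REAL
)
""")
conn.executemany("INSERT INTO requests VALUES (?, ?, ?, ?, ?)", [
    (6, "Metallica", "Hello Cruel World", "2010-09-15", 10.00),
    (7, "someone", "Some some", "2010-09-18", 15.00),
    (8, "another", "Some other song", "2010-11-10", 45.09),
])


def import_requests(conn, csv_file):
    """Update rows whose id exists, insert the rest; return (inserted, updated)."""
    inserted = updated = 0
    with conn:  # all rows are committed together, or none on error
        for row in csv.DictReader(csv_file):
            params = (row["artist"], row["song"], row["showdate"],
                      float(row["amount"]), int(row["id"]))
            cur = conn.execute(
                "UPDATE requests SET artist = ?, song = ?, showdate = ?, amount = ? "
                "WHERE id = ?",
                params,
            )
            if cur.rowcount == 0:  # nothing updated, so the id is new
                conn.execute(
                    "INSERT INTO requests (artist, song, showdate, amount, id) "
                    "VALUES (?, ?, ?, ?, ?)",
                    params,
                )
                inserted += 1
            else:
                updated += 1
    return inserted, updated


# Stand-in for the uploaded file, assuming comma-separated values with a header row.
uploaded = io.StringIO(
    "id,artist,song,showdate,amount\n"
    "11,NewBee,great stuff,2010-09-15,12.99\n"
    "10,anotherNewbee,even better,2010-09-16,34.00\n"
    "6,NewArtist,New song,2010-09-25,78.99\n"
)
result = import_requests(conn, uploaded)
print(result)
```

Values are always passed as query parameters, never concatenated into the SQL string, since the CSV content comes from the user.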
