Best approach to user activity wall - php

I have different post types, like status updates, projects, donations, etc. Each post type has one or more tables of its own in the database. A user can create posts of every type. Each user has a wall, like Facebook's, where he can see the different posts he created in reverse chronological order (whichever post was created last should be at the top of the wall).
What would be the most appropriate approach?
1. Fetch the data from the database with separate queries, store it in an array, and then manipulate the array?
2. Write a single complex query that fetches the data from the different tables in chronological order?
3. Make a separate table for user activity and store a row whenever the user performs any activity?
4. A different approach from the above?

Option 1 is simple to set up but doesn't perform very well (it has a very bad worst case).
Option 2 is the simplest. You say complex, but you can do this fairly easily with a UNION + ORDER BY construction. Performance will be pretty good.
Option 3 will perform the best, I think, but there will be some duplication and things might get a little complex. Relational databases are not very good at polymorphism.
What's important to realize is that it's relatively easy to switch between these solutions if you have a service-oriented architecture (or just good design in general). So I wouldn't be too worried about which approach you pick. If in the future your chosen approach doesn't seem to work too well, you can switch to another.
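To give an idea of what the UNION + ORDER BY construction could look like, here is a minimal sketch. The table names (status_updates, projects, donations) and the columns (id, user_id, created_at) are assumptions, not taken from the question; the function only builds the SQL string and leaves executing it to whatever database layer you use.

```php
<?php
// Builds one query that reads several post tables in a single pass.
// Table and column names are hypothetical; they must come from trusted
// code, never from user input, because they are interpolated directly.
function buildWallQuery(array $sources, int $limit = 20): string
{
    $selects = [];
    foreach ($sources as $type => $table) {
        // Every branch exposes the same three columns so the UNION lines up.
        $selects[] = sprintf(
            "SELECT '%s' AS post_type, id, created_at FROM %s WHERE user_id = ?",
            $type,
            $table
        );
    }

    // A trailing ORDER BY / LIMIT applies to the combined result.
    return implode("\nUNION ALL\n", $selects)
        . "\nORDER BY created_at DESC\nLIMIT " . $limit;
}

$sql = buildWallQuery([
    'status'   => 'status_updates',
    'project'  => 'projects',
    'donation' => 'donations',
]);
echo $sql, "\n";
```

The user id is bound once per branch (three times here). The post_type column tells the PHP side which table to go back to if it needs the full details of a row.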

Related

Inserting dynamic form data to database. PHP

I have a problem: I am creating quite a complex form, and some parts of it are created dynamically. Let's say that if you select a certain option from a drop-down, extra fields get injected into the form.
What approach would be best to store that data? I would like to try to get away without using multiple tables, because that makes the whole application so much more complex.
I was thinking of initializing all possible values as "0" in my model, then overwriting them with the POST data and just storing the whole array in the table. Does anyone see any problems with this approach?
The necessity of using multiple tables in your model doesn't depend on how much data (how many fields) you have to store - it depends on the logic of your model. So if there is a logical reason to use relationships in your model (e.g. 1:n, n:m), JUST DO IT!!!
If you don't follow the basic rules when creating your model and try, for example, to store all the data in one table although it should be divided into many tables, you will very soon regret it. Any change in your code in the future will cost you much more work, and at some point you will not understand your own code and will have to write it again, this time following the rules ;)
And don't worry if developing the right model costs a lot of work (lately I invested over two weeks in developing my model) - it really makes sense, because afterwards you can work much faster and more effectively with a well developed and planned model.
On the other hand, there are situations where storing 100 or more fields in one table makes sense - it depends on the logic. So if you provide an example, maybe someone can say whether you should work with one table or more.
A lot depends on what you want to do with the form data later, and how often.
Serialized Single Field
In the simplest use cases you could base64_encode(serialize($data)) all the data and put that into a single column in the database.
Simple
Fast to insert
Easy to add/change input fields
Difficult AND Slow to search for values (particularly at scale)
Difficult to programmatically update should you need to make systematic changes to the data
Perfect if you always pull all of the data out of the db and never narrow your sql queries by data in the serialized string.
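A minimal sketch of the round trip for the serialized approach. The form fields are made up for illustration; passing allowed_classes => false to unserialize() is my addition, so that a tampered column value cannot instantiate objects.

```php
<?php
// Hypothetical form payload; the field names are invented for this example.
$formData = [
    'name'       => 'Alice',
    'newsletter' => true,
    'extras'     => ['color' => 'red'],
];

// Pack: serialize to a string, then base64 so it fits any text column.
$packed = base64_encode(serialize($formData));

// Unpack: strict base64 decode, then unserialize with object creation disabled.
$decoded  = base64_decode($packed, true);
$restored = unserialize($decoded, ['allowed_classes' => false]);

var_dump($restored === $formData);
```

$packed is the value you would store in the single column. Note that nothing inside it can be filtered with a WHERE clause, which is the trade-off listed above.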
Metadata Table
Adding a second metadata table could offer a little more flexibility. The second table would have a foreign key referencing the main form submission, a metadata name, and the value. This gives you a very flexible many-to-one relationship that you can easily store, search, and manipulate. You can see examples of this in WordPress.
2 tables, but still simple
Easy to add/change input fields
Much better searching via sql
Much easier to systematically update
Perfect if you don't always get all the data or have to narrow searches by the form data
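As a rough sketch of the metadata approach, the function below turns one submission into name/value rows. The table layout form_meta(submission_id, meta_key, meta_value) is an assumption for illustration; inserting the rows is left to your database layer.

```php
<?php
// Turns one submission's fields into rows for a hypothetical
// form_meta(submission_id, meta_key, meta_value) table.
function toMetaRows(int $submissionId, array $fields): array
{
    $rows = [];
    foreach ($fields as $key => $value) {
        if ($value === null || $value === '') {
            continue; // empty or never-injected fields are simply not stored
        }
        $rows[] = [
            'submission_id' => $submissionId,
            'meta_key'      => (string) $key,
            // Scalars are stored as text; nested values fall back to JSON.
            'meta_value'    => is_scalar($value) ? (string) $value : json_encode($value),
        ];
    }
    return $rows;
}

$rows = toMetaRows(42, ['city' => 'Oslo', 'pets' => 2, 'fax' => '']);
print_r($rows);
```

Because every field is its own row, a dynamically injected field needs no schema change, and a search such as "all submissions where city is Oslo" becomes an ordinary WHERE on meta_key and meta_value.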
And a different direction - You may also consider looking at Document based databases like MongoDB or CouchDB if you find yourself dealing with a lot of this type of data.

social network - User profile design schema question

I am creating user profiles on my site and lost on how to design this: There are many fields, some are 1:1 like city of residence, birthday, etc. But there are over 50 fields which are 1:many (or many to many?) like favorite movies, sport teams, dating preference, screen names, phone numbers, email addresses etc. It gets more complex when we have previous companies worked at, previous schools, etc. A person can belong to many companies and there are many fields in this group like Date worked at, department, company name, industry name, etc.
So the question is how to store all this. If we normalize all these profile fields, there will be many, many tables to join. As far as I have read, people recommend a denormalized approach for social networks. Either way, I am currently storing all user details and profile details in the main user table, so each row is a unique user. If I have to store all these multi-valued preferences there (favorite movies alone can run into the hundreds, and past companies have a whole group of fields of their own), there will be lots of duplication in the user table.
What approach do social networks take for this?
Social network data storage questions are really no different from data storage questions in general... normalized and related data is the best way to 'store' this data efficiently. The RDBMS is made to handle these relationships - the PK-FK relationships and JOINs are the MAIN point of relational DBs... so even though YOU 'see' join join join etc., the DB is (should be) efficient in handling these joins.
From a USAGE standpoint of getting to the pertinent data - make sure your indexes are accurate and optimized - and make use of VIEWS to 'flatten' the data you need for display purposes...
So whatever application server you are using to get the data will call the VIEW - that will 'appear' to you, the developer, as a 'flatter' representation of the data, making UI and app server interaction cleaner and more efficient (both in resources and in coding).
As a general guideline, flattening of data is generally considered 'acceptable' in a data warehousing environment... of course I don't want to open up the monstrous debate of "just how normalized is 'normalized'" (first to sixth normal form...)
I guess you could think of a SN as more of an OLAP, than the OLTP. In which case 'some' de-normalized data storage is common - and acceptable - really, YOU get to decide just how de-normalized you want things... For instance - in your examples, of employment history and movies, sports. I'd think that a simple 1:many allowing duplicate entries on such items would be fine, and probably easier to maintain...
Hope that was helpful,
You have to stick with normalization when creating your schema. The queries might be a pain, and you should handle them with care, especially when dealing with joins. If you are a .NET developer, I guess LINQ will handle the pain for you. I believe your RDBMS is smart enough to handle your queries with good performance. One thing to watch is your query structure: write performance-minded queries. As I said, LINQ should do this best. Cheers.

How do I write object classes effectively when dealing with table joins?

I should start by saying I'm not now, nor do I have any delusions I'll ever be a professional programmer so most of my skills have been learned from experience very much as a hobby.
I learned PHP as it seemed a good simple introduction in certain areas and it allowed me to design simple web applications.
When I learned about objects, classes etc., the tutor's basic examples covered the idea that, as a rule of thumb, each database table should have its own class. While that worked well for the photo gallery project we wrote, as it had very simple MySQL queries, it's not working so well now that my projects are getting more complex. If I require data from two separate tables, which needs a table join, I've been ignoring the class altogether and handling it on a case-by-case basis, OR, even worse, putting some of the data into the class and the rest into a separate entity and doing two queries, which to me seems inefficient.
As an example, when viewing content on a forum I wrote, if you view a thread, I retrieve data from the threads table, the posts table and the user table. The queries from the user and posts table are retrieved via a join and not instantiated as an object, whereas the thread data is called using my Threads class.
So how do I get from my current state of affairs to something a little less 'stupid', for want of a better word? Right now I have a DB class that deals with the connection, escaping values etc., a parent DB query class that deals with the common queries and methods, and all of the other classes (Thread, Upload, Session, Photo, and ones that aren't used, such as Post and User) are children of that.
Do I make a big posts class that has the relevant extra attributes that I retrieve from the users (and potentially threads) table?
Do I have separate classes that populate each of their relevant attributes with a single query? If so how do I do that?
Because of the way my classes are written, based on what I was taught, my DB update-row and insert methods both just take the attributes as an array and write all of them. If I have extra attributes from other DB tables in each class, how do I rewrite those methods? Obviously, updating automatically like that would result in errors.
In short I think my understanding is limited right now and I'd like some pointers when it comes to the fundamentals of how to write more complex classes.
Edit:
Thanks for the answers so far they've given me lots of pointers and thoughts and a lot of reading material. What I would like though is maybe an idea of how different people have decided to handle a simple table join with any amount of classes? Did you add attributes to the classes? Query from outside the class then pass the results into each class? Something else?
Entire books have been written about how to design a set of classes to fit a database schema.
Long story short: there is no one-size-fits-all way to do it, you have to make a lot of design decisions about the trade offs you want to make on an application-by-application basis.
You can find a library or framework to help, keywords: ActiveRecord, ORM (Object Relational Mapper)
P.S. You have no idea the potential for soul-killing analysis paralysis and over designing you can get into. Do the simplest thing that can possibly work for your app.
Code sample for my (below) comment:
$post = new PublishedPost($data);
$edit = $post->setTitle($newTitle);
$edit->save();
This is too broad to be answered without going into epic length.
Basically, there are four prominent Data Source Architectural Patterns from Patterns of Enterprise Application Architecture (PoEAA): Table Data Gateway, Row Data Gateway, Active Record and Data Mapper. These can be found implemented in the common PHP frameworks in some variation. They are easy to grasp and implement.
Where it gets difficult is when you start to tackle the impedance mismatch between the database and the business objects in your application. To do so, there are a number of Object-Relational Behavioral, Structural and Metadata Mapping Patterns, like Identity Maps, Lazy Loading, Query Objects, Repositories, etc. Explaining these is beyond scope. They cover almost 200 pages in PoEAA.
What you can look at is Doctrine or Propel - the two most well-known PHP ORMs - which implement most of these patterns and which you could use in your application to replace your current database access handling.
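To make the Data Mapper idea a bit more concrete for the join case in the question, here is a minimal hand-rolled sketch. It is not how Doctrine or Propel do it; the class names and the column aliases (post_id, post_body, user_id, user_name) are assumptions, and the row is hard-coded where a real joined SELECT would supply it.

```php
<?php
// Plain objects that know nothing about the database (hypothetical classes).
class User
{
    public $id;
    public $name;

    public function __construct(int $id, string $name)
    {
        $this->id   = $id;
        $this->name = $name;
    }
}

class Post
{
    public $id;
    public $body;
    public $author;

    public function __construct(int $id, string $body, User $author)
    {
        $this->id     = $id;
        $this->body   = $body;
        $this->author = $author;
    }
}

// The mapper is the only place that knows how a joined row is laid out.
class PostMapper
{
    public function fromJoinedRow(array $row): Post
    {
        $author = new User((int) $row['user_id'], $row['user_name']);
        return new Post((int) $row['post_id'], $row['post_body'], $author);
    }
}

// A row shaped like what a posts JOIN users query might return.
$row  = ['post_id' => '7', 'post_body' => 'Hello', 'user_id' => '3', 'user_name' => 'sam'];
$post = (new PostMapper())->fromJoinedRow($row);
echo $post->author->name, "\n";
```

With this split, one query fills both objects, and a save method for Post only has to write the Post columns, so the extra user attributes never end up in an automatic update.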
Many of your worries can be answered by inspecting the existing solutions found in well-tested frameworks such as CakePHP, symfony and Zend Framework. Examining their approaches and peeking under the hood should shed light on your questions. Who knows? You may even decide to write future projects using them!
They've spent years putting their heads together to tackle these problems. Take advantage!
Check out Doctrine:
Here is an example of a forum application using Doctrine.
http://www.doctrine-project.org/documentation/manual/1_2/en/real-world-examples#forum-application

Understanding large mysql data relations

I am trying to teach myself how to use SQL, namely mysql.
What I am trying to understand is how to deal with many different types of data within the same table. Say I am building a web application and I have many different content types (blog items, comment items, files, pages, forms), each of which needs different data fields. Would I create a new table for each content type, since each has its own unique field requirements, or is there a better way to do this? It seems a little much to create a new table for each content type. If I had 30 types of content in my web app, that would be 30 tables just for the types. And if I added a new content type, I would have to create a new table containing all the fields I need for that type.
Is there a better way to do something like this, when I have many different types of content that each requires different fields of data that needs to go into the database? Can I somehow check to see what type the content is, then select another table that holds all the different field types?
A little confused about what to do.
Just to give an example:
Stack Overflow itself uses the same database table (called Posts) for questions and answers. Even though these two types of data are not identical, the site creators considered them similar enough to put them into one table. There's a PostTypeId field that says whether this post is a question or an answer. On answers, the Title field would be NULL, on questions, other columns might be ignored.
Comments, on the other hand, are in a different table. Of course you could theoretically put them into the same Posts table and have a PostTypeId for comments. But the overhead this would create (because of the lightweightness of comments) justifies creating a new table.
I know this isn't really an answer, and other developers might even have decided to put questions and answers into different tables; but it gives some perspective. Long story short: It depends :)
Sketch interactions
First, try not to think about database design, but about how entities should interact with each other. Think of each entity as having its own class, which represents the required data.
It's always a good start to take pencil and paper and sketch the interactions (or relations) between these entities that you are trying to accomplish. See: Learning the Database design process
Extendability and reuse
For example, you want to have a User who can post BlogPosts; each BlogPost can have a set of Tags and a relevant set of Comments. Attachments can be injected into a BlogPost and also into a Comment.
Reusability and extendability are the key. When sketching your interactions, try to isolate dependencies. Think of it in an OO manner. Let's explore the Attachment a little more. You can create an Attachment table and then extend Attachment by creating BlogPostAttachment and CommentAttachment, where you can easily create relations between these dependent entities. This gives you an easily extendable content type which you can further reuse in, e.g., UserDetailsAttachment.
ORMs to the rescue
By studying example code for object-relational mappers like Doctrine or Propel, you can pick up some ideas for table extendability. Practical examples are always the best ones.
Related SO questions, which you may be interested in
Good Resources for Relational Database Design
Good PHP ORM Library?
How should a programmer learn great database design?
I know, it's a long way to go, but considering what is involved in creating large-scale DB applications with many relations and entity types, it's best to use the help of an ORM in the long run.
You needn't be afraid of using many many tables - the database will happily deal with lots of them without complaining. If you let each content type have its own table, you get certain advantages:
Simplicity: Each table can be fairly simple, and the constraints are straightforward. For example if ContentType1 has a field with a relation to another table, you can make that a foreign key in the database design and the RDBMS will take care of data integrity for you.
Indexing efficiency: if ContentType2 needs to be indexed by date but ContentType3 needs to be indexed by name (to take a simple example), having them in two separate tables means each index is there for exactly the data it needs and nothing else. Combining them in one table means you need both indexes covering the combined dataset, which is messier and uses up more disk space.
If you need to output a list combining two content types, a UNION of the two tables is easy; and if you need to do that often with large amounts of data, an indexed view can make it cheap.
On the other hand, if you have two content types which are very similar (as in the StackOverflow case above for example), you can get some advantages from combining them into one table:
Simplicity: You only need to code the table once - if done right (i.e. the two content types are really very similar), this can make your codebase smaller and simpler.
Extensibility: if a third content type crops up which is again similar to the first two, and similar in the same way that the first two match each other, the table can straightforwardly be extended to store all three content types.
Indexing for performance. If the most common way of getting at the data is to combine the two content types and order them by date (say), a field which is common to both content types, then it can be inefficient to have two separate tables which must repeatedly be UNIONed and then sorted. Combining the two content types in one table lets you put a single index on the date field, allowing faster querying (though remember you can get a similar benefit from indexed views).
If you normalize rigorously, you will have a database where every entity type has its own table. However, denormalization in various ways (such as combining two entity types in one table) can have benefits which might (depending on the size and shape of your data) outweigh the costs. I'd advise a strategy of keeping all content types separate, at least at first, and considering combining them as a tactical denormalization if it turns out to be necessary.
You need to read a book about building websites with PHP and MySQL. It's a good habit to google first, because some programmers consider questions like this lazy. I suggest reading "Learning PHP MySQL and JavaScript".
Anyway, before you start coding your site, you need to plan what kind of information you will store, and then design your database. Say a registration form contains First_Name, Second_Name, DateOfBirth, Country, Gender and Email. You create a table named, say, "USER_INFO" and assign each column a datatype matching the data you would like to store (a number, text, a date, and so on). Then, via PHP, you connect to MySQL and store or retrieve the data you want. You really need to read a book or a tutorial to get a full answer, AND GOOGLE :P

Multiple application instances on the same database

I'm writing an application that I'm going to provide both as a service and as a standalone application.
It's written in Zend Framework and uses MySQL.
When providing it as a service I want users to register on my site and have subdomains like customer1.mysite.com, customer2.mysite.com.
I want to have everything in one database, not creating new database for each user.
But now I wonder how to do it better.
I came up with two solutions:
1. Have user id in each table and just add it to WHERE clause on each database request.
2. Recreate tables with unique prefix like 'customer1_tablename', 'customer2_tablename'.
Which approach is better? Pros and cons?
Is there another way to separate users on the same database?
Leonti
I would stick to keeping all the tables together, otherwise there's barely any point to using a single database. It also means that you could feasibly allow some sort of cross-site interaction down the track. Just make sure you put indexes on the differentiating field (customer_number or whatever), and you should be ok.
If the tables are getting really large and slow, look at table partitioning.
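A minimal sketch of the "customer id in every table" option, so that the filter is added in one place and cannot be forgotten. The customer_id column, the invoices table and the helper name are assumptions; table and column names must come from your own code, not from user input, since they are interpolated into the SQL.

```php
<?php
// Builds a SELECT that is always scoped to one customer.
// Returns the SQL and the values to bind when executing it.
function tenantSelect(string $table, int $customerId, array $where = []): array
{
    // The tenant condition comes first and is not optional.
    $conditions = ['customer_id = :customer_id'];
    $params     = [':customer_id' => $customerId];

    foreach ($where as $column => $value) {
        $conditions[]       = "$column = :$column";
        $params[":$column"] = $value;
    }

    $sql = "SELECT * FROM $table WHERE " . implode(' AND ', $conditions);
    return [$sql, $params];
}

[$sql, $params] = tenantSelect('invoices', 12, ['status' => 'open']);
echo $sql, "\n";
```

The returned pair can be passed to a prepared statement. An index that starts with customer_id, as suggested above, keeps these queries fast as the tables grow.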
It depends on what you intend to do with the data. If the clients don't share data, segmenting by customer might be better; also, you may get better performance.
On the other hand, having many tables with an identical structure can be a nightmare when you want to alter the structure.
I'd recommend using separate databases for each user. This makes your application easier to code for, and makes MySQL maintenance easier (migration of a single account, account removal, and so on).
The only exception to this rule would be if you need to access data across accounts or share data.
This is called a multi-tenant application, and lots of people run them; see the multi-tenant tag for other people's questions.
