I'm looking for the best possible way of sharing model data between two MVC-driven web sites (I'm using Symfony).
Background information
We have two web sites A and B. The same software is used for both sites, but there are different customers and data. Customers are allowed to release content. Now we're going to introduce a new payment option with the advantage that the user's content is released on both web sites automatically.
Implementation?!
I have three ideas for the implementation:
Using the same database for both applications. Then I would have to extend some tables by one column which indicates the appropriate target web site (A/B).
I think that this would be bad design. A lot of code would have to be rewritten in order to exclude records from query result sets that do not belong to the respective web site.
Using two databases.
In my opinion, this would decrease performance significantly and would be very hard to implement. Data would always have to be requested twice. Also, in the future there may be web sites C, D, E...
Synchronizing two databases via web-service.
Some data would be stored twice. Therefore, all operations on such a piece of data have to be performed twice (create, read, update, destroy).
Now I'm stuck, because each solution has serious disadvantages.
Do you have any ideas? If not, which of mine do you think is the best?
I think your first option is the best. You're going to reduce duplicate data as much as possible and you should have the best performance. You will have to add an extra check to exclude the records not belonging to each particular website, but all solutions will require work.
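For illustration, here is a minimal sketch of that extra check using PDO; the content table, the site column, and the credentials are hypothetical names, not part of the original schema:

    <?php
    // Hypothetical schema change: ALTER TABLE content ADD COLUMN site ENUM('A', 'B') NOT NULL;
    // Each deployment is configured with its own site identifier, and every query
    // filters on it so one site never sees the other's records.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'dbuser', 'dbpass');

    $currentSite = 'A'; // set per deployment, e.g. in the site's config file

    $stmt = $pdo->prepare('SELECT id, title, body FROM content WHERE site = :site');
    $stmt->execute([':site' => $currentSite]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

Content that the new payment option releases on both sites could then either be stored as one row per site or via a separate join table; which is better depends on how much of the record is shared.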
I don't know if this question belongs here or not; someone please move it to an appropriate place if needed.
We are working on a web application using PHP and MySQL. The software is of the sort that provides a lot of pre-fed data to its users. For example, a list of questions and answers like a knowledge base. Now every user who registers with the system would have the liberty to add/update/delete this knowledge base, without affecting the data of other users.
Now I understand that we would need to keep a master copy of this pre-fed data and make a copy of it available to each user.
I was wondering how to implement this in the system without affecting the performance.
Would we have to create separate databases for each user?
Any pointers?
Thanks!
I see three approaches to this; which one fits depends on your domain requirements.
You're 'seeding' configurations and basic data, for which it does make sense (to me) to localize the settings per user. I guess most apps follow this.
If it's domain data, when you say knowledge base (which I take to be very large), it'd make more sense to save the per-user edits and merge the master data with a user's personalized data (a rough sketch of this follows below). This is very abstract and I wouldn't know its implementation unless I actually saw the data modeling, but it looks like a viable approach!
Save edits from all the users separately in one location (per collection or however you wish) if you want collaboration and the like. With this, I think, it'd be easier to grow your knowledge base, although you can do the same with the previous approach with a little help from a DBA!
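To make the second approach a bit more concrete, here is a rough sketch assuming a hypothetical kb_master table and a kb_user_edits overrides table; the names and columns are made up for illustration:

    <?php
    // Master rows are shared by everyone; a user's edits (or deletions) live in
    // kb_user_edits and are merged over the master data at read time.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'dbuser', 'dbpass');

    $sql = 'SELECT m.id,
                   COALESCE(e.question, m.question) AS question,
                   COALESCE(e.answer,   m.answer)   AS answer
            FROM kb_master m
            LEFT JOIN kb_user_edits e
                   ON e.master_id = m.id AND e.user_id = :user_id
            WHERE COALESCE(e.deleted, 0) = 0';

    $stmt = $pdo->prepare($sql);
    $stmt->execute([':user_id' => 42]);
    $knowledgeBase = $stmt->fetchAll(PDO::FETCH_ASSOC);

Entries a user adds from scratch (with no master row) would need a UNION or a second query; this only sketches the override/merge part.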
I'm looking for the best way (or easiest way) to manage multiple instances of a PHP web application that share the same code base.
Let me break it down for you:
Our domain hosts multiple instances of the application, each with its own settings file and database.
http://mydomain.com
|
|------/customer1/
|
|------/customer2/
|
|------/customer3/ + custom features
Let's say that customers 1 & 2 purchased the application (which we host for them), and they have the base model of that application (i.e. not customized).
However, customer 3 wants feature X or Y, so we code that feature for them and add it to the application.
But whenever there is an update to the code base (ie. a security fix in the core classes of the framework) all three customers should get an update of the base code!
What would be the best way of managing this sort of setup? Manually uploading all files using FTP is a pain, and it's not possible to merge code.
Using Git is perhaps a solution, but how would I go about doing it? Create separate repositories per customer? What if we grow to over one hundred customers?
Any insights are welcome, including why we should or should not use such a setup. (But remember that we'll be the ones hosting the application for our customers.)
I remember doing this years ago, so bear in mind that I'm now a little rusty at this.
I built a standalone framework which combined all includes into ONE .php file. Any application that used it would make a pull request to the central server, and if the md5 of its framework matched the framework on the central server then no update was needed. Otherwise it would download the new framework over HTTPS and replace its own copy. This created an automatic update system that was PULLED to all the other apps that used it.
A major problem with this is that if you introduce, say, a syntax error and upload it to the central server, it will get pulled to all the others and break them! It's best to make the pull request from a cron job that does NOT itself use the framework, so a broken framework can't stop it from pulling the fix for the syntax error. That at least gives the system the ability to fix itself automatically once you fix the syntax error on the central server. However, having a staging server to test each update is really very important in this case.
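As a rough illustration of that pull mechanism (the file names, URLs and paths below are placeholders, not the original implementation):

    <?php
    // update_framework.php - run from cron, deliberately standalone so that a
    // broken framework file can never stop it from pulling the fix.
    $localFile   = '/var/www/app/framework.php';
    $checksumUrl = 'https://central.example.com/framework.md5';
    $downloadUrl = 'https://central.example.com/framework.php';

    $localMd5  = file_exists($localFile) ? md5_file($localFile) : '';
    $remoteMd5 = trim((string) file_get_contents($checksumUrl));

    if ($remoteMd5 !== '' && $localMd5 !== $remoteMd5) {
        $new = file_get_contents($downloadUrl);
        // Only replace the local copy if the download matches the advertised checksum.
        if ($new !== false && md5($new) === $remoteMd5) {
            file_put_contents($localFile, $new, LOCK_EX);
        }
    }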
Those are only the basics, of course: if the framework uses, say, images, they will also need to be pulled over, as well as any SQL updates and so forth.
You must rigorously test this before uploading to the central server in order to prevent mass errors! Not ideal! Unit testing, a staging server, and small, simple updates made more often (large updates have more potential to go wrong, and more to undo if they do go wrong) will all help mitigate the risk.
You will also have to structure the framework VERY well from the beginning to make it as flexible as possible when planning on having many different sites use it. If you design it wrong in the beginning, it may be next to impossible to redesign further down the road. For example, it may be wise to use PDO for database access, allowing all the applications to use different databases while your classes etc. still know how to interact with the database (regardless of whether it's MySQL or Oracle); though I would advise at least sticking to one engine if you can.
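A minimal sketch of what that looks like in practice, assuming each hosted instance has its own settings file (the paths and config keys below are invented for the example):

    <?php
    // Each customer's settings file returns its own connection details, so the
    // shared framework code only ever talks to PDO and doesn't care which engine
    // sits behind the DSN.
    $config = require '/var/www/customer1/config.php';
    // e.g. return ['dsn' => 'mysql:host=localhost;dbname=customer1', 'user' => '...', 'pass' => '...'];

    $db = new PDO($config['dsn'], $config['user'], $config['pass'], [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);

    $stmt = $db->prepare('SELECT * FROM pages WHERE slug = :slug');
    $stmt->execute([':slug' => 'home']);
    $page = $stmt->fetch(PDO::FETCH_ASSOC);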
Design-wise, you are best off looking at frameworks in other languages and seeing how they do what they do. You must stick to good design principles, use design patterns only where applicable, and take note of MVC!
Further Reading...
http://en.wikipedia.org/wiki/Software_design_pattern
http://www.ipipan.gda.pl/~marek/objects/TOA/oobasics/oobasics.html
http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller
http://www.phpframeworks.com/
This is no easy task, so be warned.
You've mixed two separate tasks in one question:
Development and support of diverged code
Deployment of code from (any) SCM to live systems
The answer to the first question (for any modern SCM) is branching and merging: each customer has their own branch, into which you merge the needed parts from your single development branch, or (better) with "branch-per-task" you merge the task branch into all needed targets, avoiding cherry-picking.
The answer to the second question is build tools that can interact with your SCM (you'll have to provide more details later for a more detailed answer).
Make your custom features modular. Use an architecture similar to WordPress/Joomla, which have plugins or extensions. This allows your customers to easily have separate feature sets while all sharing the same base code.
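A very small sketch of that plugin idea (the Plugin interface, the directory layout and the loader below are hypothetical, just to show the shape of it):

    <?php
    // The shared core scans a per-customer plugins directory and loads whatever
    // it finds, so customer 3's feature X lives in customer3/plugins/ without
    // touching the base code that customers 1 and 2 run.
    interface Plugin
    {
        public function register(): void;
    }

    function loadPlugins(string $pluginDir): array
    {
        $plugins = [];
        foreach (glob($pluginDir . '/*.php') ?: [] as $file) {
            $plugin = require $file; // each plugin file returns a Plugin instance
            if ($plugin instanceof Plugin) {
                $plugin->register();
                $plugins[] = $plugin;
            }
        }
        return $plugins;
    }

    $plugins = loadPlugins('/var/www/customer3/plugins');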
Let's say I have a MySQL DB, and all the tables in the DB are related to one another: primary keys, foreign keys, etc. are all set. Now, is it possible to predict, just from the database design, what queries will be used in the application? Since the database dictates the application's capabilities, we should therefore be able to predict from the design what queries the application will use, right?
If it is possible, is there a strategy or automated way to generate the possible queries?
I have written a book on the subject of analyzing data using SQL and Excel, and have spent many years working with databases.
Yes, from a database structure, you can figure out how tables are going to be joined together. You are not going to figure out the harder -- and generally more business relevant -- things that users need. Here are some examples:
You can have a database where the primary table is telephone calls, with the associated information. From this database, you may need to know the maximum number of active calls at one time. Or you may need to know how many different people someone calls in a month.
You can have a database of subscriber records. You may need to figure out the probability that someone will stop after a given amount of time.
You can have a database of products and purchases. You may need to figure out the most common combinations of three products that occur together.
You can have a database of credit card purchases. You may need to figure out who spends more than $200 in a restaurant more than 50 miles from their billing address.
The point is: a database does not represent "application capabilities". A database represents entities and the relationships between them, presumably in the real world. It is hubris to think that you can look at a database and know what the business questions are.
Instead, the purpose of a database is to support data, which in turn, supports applications. The needs of applications will change over time. The beauty of databases, as opposed to many other data storage technologies, is that the technology scales as the data increases, supports changes to the structure, and allows new entities and relationships to be added into the system, without completely rewriting it.
Over time, and with experience, you might develop intuition on what's important. Even if you do, you will be constantly surprised at the varied needs of your users.
I am sincerely not trying to be smart here, but the answer is: yes and no.
Yes, because a 3NF design usually outlines the business rules behind it pretty well, so you can tell to a degree what the business logic is; you can create an object or graph model from it and get a good idea of what kinds of questions can be asked, based on the connections/relations and accessible properties.
No, because combinatorially you might have an intractable number of possible questions from such a graph. Hence, you can't really tell what question someone might ask in a reasonable, non-exponential amount of time.
In general, if the design is good and the tables are meaningfully named, you can get a pretty good idea of what is going on.
Theoretically it's possible, but due to the combinatorial explosion of N rows by X columns by Z tables by W possible functions by Q possible values in each column/row, the number of possible queries is amazingly large.
The issue here is that you need to take the data into account too. Some queries only make sense when particular data is present and others don't, so you are essentially considering a massively large hypercube.
I work with multidimensional databases (denormalised cubes), which are essentially denormalised databases. Have a read up on OLAP theory and you'll see why.
So, in short, no: it's practically impossible.
Now is it possible to predict, just from the database design, what the queries will be used for the application?
You can, at least in principle, predict which queries can be answered efficiently. Which queries the applications will actually try to execute is another matter.
In an ideal world, the database model would take into account all the querying needs of all the applications, now and in the future. We don't live in that world yet ;)
If it is possible, is there a strategy or automated way to generate the possible queries?
No, that requires human understanding of what the model actually means. Unfortunately, there is no good way to teach a tool to have that level of understanding.
A good model will immediately make sense to a person experienced in database modeling and the domain being modeled. Such person will typically be able to predict a fair portion of queries actually being used, but rarely all of them, so the documentation beside the database model itself is desirable. And of course, not all models are good...
I'm rewriting a big website that needs a very solid architecture. Here are my questions, and pardon me for mixing apples and oranges and probably kiwi too :) I did a lot of research and ended up totally confused.
Main question: Which approach would you take in building a big website expected to grow in every way?
Single entry point, pages data in the database, pulled by associating GET variable with database entry (?pageid=whatever)
Single entry point, pages data in separate files, included based on GET variable (?pageid=whatever would include whatever.php)
MVC (Alright guys, I'm all for it, but I can't grasp the concept despite checking all the tutorials and frameworks out there. Do they store the "view" in the database? It seems to me from examples that if you have 1000 pages of the same kind they can be shaped by one model, but will I still need 1000 "view" files?)
PAC - this sounds even more logical to me, but I didn't find many resources. If this is a good way to go, can you recommend any books or links?
DAL/DAO/DDD - I learned about these terms by diligently reading through Stack Overflow before posting this question. Not sure if they belong on this list.
Sit down and create my own architecture (likely what I'll do if nobody enlightens me here :)
Something not mentioned...
Thanks.
Scalability/availability (i.e. high traffic) for websites is best addressed by none of the items you mention - especially points 1 and 2; storing the page definitions in a database is an absolute no-no. MVC and other similar patterns are more for code clarity and maintenance, not for scalability.
An important piece of missing information is what kind of concurrent hits/sec you are expecting. Sometimes, people who haven't built high-traffic websites are surprised at the hit rates that actually constitute a "scalability nightmare".
There are books on how to design scalable architectures, so an SO post will not be able to do the topic justice, but some very top-level concepts, in no particular order, are:
Scalability is best handled first by looking at hardware-based solutions. A beefy server with an array of SSD disks can go a long way.
Make static anything that can be static. Serve as much as you can from the web server, not the DB. For example, a lot of pages on websites dynamically generate data lists from data stores that very rarely or never really change.
Cache output that changes infrequently, and tune the cache refresh (see the sketch at the end of this answer).
Build dynamic pages to be stateless or asynchronous. Look into CQRS and Event Sourcing for patterns that favor/facilitate scaling.
Tune your queries. The DB is usually the big bottleneck since it is a shared resource. Lots of web app builders use ORMs that create poor queries.
Tune your database engine. Backups, replication, sweeping, logging, all of these require just a little bit of resource from your engine. Tuning it can lead to a faster DB that buys you time from a scale-out.
Reduce the number of HTTP requests from clients. Each HTTP connection has overhead. Check your pages and see if you can increase the payload in each request so as to reduce the overall number of individual requests.
At this point, you've optimized the behavior on one server, and you have to "scale out". Now things get very complicated very fast: load-balancing scenarios of various types (sharding, DNS-driven, dumb balancing, etc.), separating read data from write data on different DBs, going to a virtualization solution like Google Apps, offloading static content to a big CDN service, using a language like Erlang or Scala and parallelizing your app, etc...
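As an illustration of the caching point above, here is a minimal file-based output cache; the cache path, the 10-minute TTL and the render_homepage_list() function are all made-up placeholders:

    <?php
    // Serve a cached copy of an infrequently-changing fragment and only hit the
    // database when the cached file is older than the TTL.
    $cacheFile = '/tmp/cache/homepage_list.html';
    $ttl = 600; // seconds; tune this to how often the underlying data changes

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        readfile($cacheFile); // cached copy, no DB work at all
    } else {
        ob_start();
        render_homepage_list(); // hypothetical function doing the expensive DB queries
        $html = ob_get_clean();
        file_put_contents($cacheFile, $html, LOCK_EX);
        echo $html;
    }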
Single entry point, pages data in the database, pulled by associating GET variable with database entry (?pageid=whatever)
Potential nightmare for maintenance, and also for development if you have a team of more than 2-3 people. You would need to create a set of strict rules for everyone to adhere to - effort that would be much better spent using MVC. Same goes for 2.
MVC (Alright guys, I'm all for it, but I can't grasp the concept despite checking all the tutorials and frameworks out there. Do they store the "view" in the database? It seems to me from examples that if you have 1000 pages of the same kind they can be shaped by one model, but will I still need 1000 "view" files?)
It depends on how many page layouts there are. Most MVC frameworks allow you to work with structured views (i.e. main page views and sub-views). Think of a view as the HTML template for a web page. However many templates and sub-templates you need is exactly how many views you'll have. I believe most websites can get away with up to 50 main views and up to 100 sub-views - but those are very large sites. Looking at some sites I run, it's more like 50 views in total.
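As a rough sketch of how one main view plus a handful of sub-views can serve many pages (the file names and variables here are invented):

    <?php /* views/layout.php - the one main view shared by every page */ ?>
    <html>
      <body>
        <?php include 'views/header.php'; ?>  <!-- sub-view shared by every page -->
        <?php include $contentView; ?>        <!-- e.g. 'views/article.php', chosen by the controller -->
        <?php include 'views/footer.php'; ?>
      </body>
    </html>

    <?php /* views/article.php - one template renders every one of the 1000 article pages */ ?>
    <h1><?= htmlspecialchars($article['title']) ?></h1>
    <div><?= $article['body'] ?></div>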
DAL/DAO/DDD - I learned about these terms by diligently reading through Stack Overflow before posting this question. Not sure if they belong on this list.
It does. DDD is great if you need meta-views or meta-models - say, if all your models are quite similar in structure but differ only in the database tables used, and your views map almost 1:1 to models. In that case it is a good time for DDD. A good example is ERP software where you don't need a separate design for every database table; you can use a uniform way to do all the CRUD operations. In this case you could probably get away with one model and a couple of views - all generated dynamically at run-time using a meta-model that maps database columns, types and rules to the logic of the programming language. But please note that it takes some time and effort to build a quality DDD engine so that your application doesn't look like a hacked-up MS Access program.
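A very rough sketch of that meta-model idea, where one generic class handles CRUD for any table based on a column mapping (the class, table and column names below are purely illustrative):

    <?php
    class GenericModel
    {
        public function __construct(
            private PDO $db,
            private string $table,     // e.g. 'products'
            private array $columns     // e.g. ['name', 'price', 'stock']
        ) {}

        public function find(int $id): ?array
        {
            $stmt = $this->db->prepare("SELECT * FROM {$this->table} WHERE id = :id");
            $stmt->execute([':id' => $id]);
            return $stmt->fetch(PDO::FETCH_ASSOC) ?: null;
        }

        public function insert(array $data): void
        {
            $cols   = implode(', ', $this->columns);
            $params = implode(', ', array_map(fn ($c) => ":$c", $this->columns));
            $stmt = $this->db->prepare("INSERT INTO {$this->table} ($cols) VALUES ($params)");
            $stmt->execute(array_intersect_key($data, array_flip($this->columns)));
        }
    }

A generic view could be generated the same way from the column list. In practice the table and column names must come from a trusted mapping, never from user input, since they are interpolated into the SQL.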
Sit down and create my own architecture (likely what I'll do if nobody enlightens me here :)
If you're building a public-facing website, you're most likely going to do well with MVC. A very good starting point is to look at the CodeIgniter video tutorials. They helped me understand what MVC really is and how to use it far better than any HOWTO or manual I read. And they only take 29 minutes altogether:
http://codeigniter.com/tutorials/
Enjoy.
I'm a fan of MVC because I've found it easier to scale your team when everything has a place and is nice and compartmentalized. It takes some getting used to, but the easiest way to get a handle on it is to dive in.
That said, definitely check your local library to see if they have the O'Reilly book on scaling: http://oreilly.com/catalog/9780596102357 which is a good place to start.
If you're creating a "big" website and don't fully grasp MVC or a web framework then a CMS might be a better route since you can expand it with plugins as you see fit. With this route you can worry more about the content and page structure rather than the platform. As long as you pick the appropriate CMS.
I would suggest creating a mock app with a few of the web MVC frameworks in the wild and picking the one with which your development feels smoothest. Establishing your code on a solid basis is fundamental if you want to grasp the concepts of MVC and be ready to add new functionality to your site easily.
I'm writing an application that I'm going to provide as a service and also as a standalone application.
It's written in Zend Framework and uses MySQL.
When providing it as a service I want users to register on my site and have subdomains like customer1.mysite.com, customer2.mysite.com.
I want to have everything in one database, not create a new database for each user.
But now I wonder how to do it better.
I came up with two solutions:
1. Have a user id in each table and just add it to the WHERE clause of each database request.
2. Recreate the tables for each customer with a unique prefix, like 'customer1_tablename', 'customer2_tablename'.
Which approach is better? Pros and cons?
Is there another way to separate users on the same database?
Leonti
I would stick to keeping all the tables together, otherwise there's barely any point to using a single database. It also means that you could feasibly allow some sort of cross-site interaction down the track. Just make sure you put indexes on the differentiating field (customer_number or whatever), and you should be ok.
If the tables are getting really large and slow, look at table partitioning.
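A small sketch of the first option with the suggested index; the orders table, the customer_id column and the resolveCustomerIdFromHost() helper are hypothetical:

    <?php
    // One-off schema work: index the differentiating field.
    //   CREATE INDEX idx_orders_customer ON orders (customer_id);

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'dbuser', 'dbpass');

    // Resolved once per request, e.g. from the subdomain customer1.mysite.com.
    $customerId = resolveCustomerIdFromHost($_SERVER['HTTP_HOST']); // hypothetical helper

    $stmt = $pdo->prepare(
        'SELECT * FROM orders WHERE customer_id = :customer_id AND status = :status'
    );
    $stmt->execute([':customer_id' => $customerId, ':status' => 'open']);
    $orders = $stmt->fetchAll(PDO::FETCH_ASSOC);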
It depends on what you intend to do with the data. If the clients don't share data, segmenting by customer might be better; also, you may get better performance.
On the other hand, having many tables with an identical structure can be a nightmare when you want to alter the structure.
I'd recommend using separate databases for each user. This makes your application easier to code for, and makes MySQL maintenance easier (migration of a single account, account removal, and so on).
The only exception to this rule would be if you need to access data across accounts or share data.
This is called a multi-tenant application and lots of people run them; see the multi-tenant tag for some other people's questions.