How to version-control different features within one web application? - php

We have some web applications, and these websites are now being upgraded, not for the first time, but it is becoming very difficult to manage versions for both users and developers.
We have many clients, some of whom run the same application, but they need to pay for upgrades. Not all clients pay for upgrades, so some clients run one version while others run another.
We have tried two approaches, and we are looking for a third:
Put the version in the path, like this: www\project\version\system-files
But this approach became confusing for some users, because to them the URL looked like www.website.com/app-version, and when the system was upgraded, the URL changed.
Put the version in the function, like this: function V1_functionX()
When the function needs to be upgraded, we create a new function called V2_functionX. But this creates a "fat" website, and the team made mistakes during development, because we don't have "one development version" but "many versions to develop", and some functions are used in more than one website.
The very first approach was abandoned a long time ago: we developed the web application and "closed the version"; all change requests were included in the upgraded version, which, when finished, was "closed" too. But this was too slow for making corrections and deploying small upgrades.
We have asked how other companies handle this: they "shut down" the website to upgrade the system. This will probably be our approach.
But if anyone has another idea for upgrading the application without shutting down the website, we would be glad to hear it.
Note: this is not about SVN.

You say you have different versions of your applications that must be maintained for different clients. I expect you don't need me to tell you this adds significantly to the complexity of your overall system, and thus your first priority is to reduce the number of versions you are maintaining in parallel.
API services have the same problem: a new version with more features is offered, but the old one needs to be maintained to give the new version time to stabilise and to give users sufficient time to upgrade their code. Your difficulty is similar. The first question I would therefore ask is whether it is possible to maintain only two versions.
If that is not possible, try at least minimising the number of concurrent versions: where a new version must be created, you need to encourage users to migrate from one version to another. (You've said you cannot unify your users onto one version, but without further information about your exact use-case, it is not possible to offer an independent view on that). So, perhaps one approach is to never maintain more than, say, five versions.
There are a number of strategies you can take to mitigate the complexity of the system you now have. Firstly, consider separating your code into a "core" of features that all versions absolutely must have. This will be common to all versions, so that if you fix a bug here, all clients benefit from the fix. This might be a visible feature (e.g. a product editing screen) or a framework feature (e.g. force SSL during checkout).
Your core libraries and client-specific functions could then reside in a set of libraries like so:
/project/core
/project/versions/1/Class.php
/project/versions/1.1/Class.php
/project/versions/2/Class.php
/project/versions/2.1.1/Class.php
/project/versions/...
(Class.php is of course an example - in practice there would be many class files here, each named appropriately.)
In this way, you do not need to call functions with a V1_ prefix, since that will require the replication of your version choosing code in a lot of places. It is much better to just load the library pertaining to the correct version, and as long as the function names are the same across all versions, you can just use the function name and your library loader will take care of the rest.
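A minimal sketch of such a loader (the function name and the fallback-to-core behaviour are illustrative assumptions, not a specific library's API): it prefers the client's versioned class file and falls back to the shared core.

<?php
function loadVersionedClass(string $class, string $version): void
{
    $versioned = __DIR__ . "/project/versions/{$version}/{$class}.php";
    $core      = __DIR__ . "/project/core/{$class}.php";

    if (is_file($versioned)) {
        require_once $versioned;   // client-specific override
    } elseif (is_file($core)) {
        require_once $core;        // shared core implementation
    } else {
        throw new RuntimeException("No {$class} found for version {$version}");
    }
}

// Calling code uses the plain class name; no V1_/V2_ prefixes needed.
loadVersionedClass('Invoice', '2.1.1');
$invoice = new Invoice();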
Another approach is to use plugins, like WordPress does. Where a plugin is added, it modifies some core functionality by adding new or different behaviour. The "middleware" design pattern may be useful here - the Slim framework (and undoubtedly others) uses this approach to add pre- or post-call hooks to an existing route handler, and thus offers a clean mechanism to edit existing functionality in a variety of combinations.
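A minimal sketch of that wrapping idea in plain PHP (illustrative only, not Slim's actual API): each layer wraps the core handler and can act before or after it.

<?php
// Core handler: shared across all versions.
$coreHandler = function (array $request): array {
    return ['status' => 200, 'body' => 'core response'];
};

// Hypothetical client-specific middleware: wraps the handler and adds a
// post-call hook without touching the core code.
$clientExtras = function (callable $next): callable {
    return function (array $request) use ($next): array {
        $response = $next($request);              // pre-call work could go here
        $response['body'] .= ' [client extras]';  // post-call modification
        return $response;
    };
};

$handler = $clientExtras($coreHandler);
print_r($handler(['path' => '/products']));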
To summarise, your current situation is not just a management problem, but will cost you in slow development time and additional debugging. Whilst the above approaches will still be necessary to reduce some of the complexity, consider also:
forcing laggard clients to upgrade to one of your currently supported versions
giving laggard clients a free upgrade to the oldest possible supported version
Some additional thoughts based on new information. I had pondered whether splitting the code into separate repositories would help, one for each client. However, there is no guarantee that it would; even if you pull core features in using Composer or a Git submodule, there is still the possibility of divergence between your latest core and your earliest client code. At some point your worst laggard client is going to hold back development on the core.
You can always leave this client on an abandoned version, but if they spot a bug, it is not worth back-porting a fix from your latest core, since that will cause you all the compatibility headaches you've been trying to avoid. Either they upgrade to a minimum client version that works with the latest core (and pay to do so if necessary) or they tolerate the bug indefinitely.
You've mentioned that each client gets his or her own database. That is helpful, up to a point, since it means that client versions are not entirely constrained by database schema decisions that have been forced by the core. However, this will still have a knock-on effect on how much code you can move to the core.
For example, let us assume that you have seven clients, and six of them have a User entity with an email address, used to handle password change requests (one client has a User entity without this field). This means that, if the awkward schema cannot change, the core cannot assume that an email address is available. (In this trivial case it might be cheaper to upgrade the odd one out for free, so that more code can go in the core, rather than maintaining such a standard thing as a version enhancement.)
Given the level of complexity, and since it sounds like you are maintaining this for the long term, I think you should set up some unit and functional tests. You'll need to split these into "core" and "per version" as well. If you find a bug, regardless of whether it is caused by feature versioning or not, write a failing test, and then fix it. You'll then have - at least in theory - a way to check if a change will impact on a particular client's version in a way you did not envisage.
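For instance, a per-version regression test might look like this (a sketch using PHPUnit; the DiscountCalculator class and its expected behaviour are assumptions for illustration):

<?php
use PHPUnit\Framework\TestCase;

// Hypothetical guard for a bug fixed in version 2.1: written as a failing
// test first, then kept so later changes cannot silently reintroduce it.
class DiscountCalculatorVersion21Test extends TestCase
{
    public function testBulkDiscountIsNotAppliedTwice(): void
    {
        $calc = new DiscountCalculator('2.1'); // assumed versioned class
        $this->assertSame(90.0, $calc->priceFor(100.0, 10));
    }
}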

We have this setup at my work:
Local dev website (SVN)
Dev server where all developers test
Preprod where everything is OK
Prod (rsync from preprod)
The rsync between the two servers is super fast; when we do a major update it takes less than 5 seconds.

Related

multiple databases looking at single php codebase issues?

I am making a solution in PHP and am looking at having multiple installs for different clients. If I had 20 databases, is there anything wrong with pointing them to the same PHP codebase, e.g. speed issues, or is it bad practice?
Thanks in advance for your thoughts and expertise :-)
It's possible and won't affect your performance but it has a downside.
If you have one codebase, it means you have to recheck all clients for errors when you update it.
Therefore it's better for your clients to have the codebase separated, so you can update the codebase for a specific client and prevent errors for the other clients.
If everything seems to work for one client, you can consider the same update for another client.
This is the case if you have custom code for every client; if it's one application, like a mail-management system, it wouldn't be bad practice at all.
It would be nice, though, to assign a version number to each client.
This makes it possible to test a new version on a single client and slowly migrate other clients to the newer version.
I would say the opposite of Visser. If there is a bug in one installation, then the same bug is going to exist in all installations unless you provide customized versions of your software. Sure, different customers might use the application in different ways, so some might never experience a defect that brings the business of another customer to a grinding halt. I also disagree with Visser that it won't affect performance: differences in usage could lead to very marked differences in overall performance for a particular customer, but tuning at the PHP tier will benefit all customers. Tuning at the database tier is a slightly different story - some customers might benefit from an index that would slow down other customers.
If you do provide per-customer variations in the behaviour of your code, then how you do this depends on the complexity of those variations. Ideally the differences should be described in the database, and the same PHP code would then produce different results - e.g. one customer wants a standalone user-management and authentication system, another customer wants to use LDAP authentication, a third OpenID - in which case your code should implement all three, with the method chosen at runtime based on the data.
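A minimal sketch of that data-driven approach (the class names and the 'auth_method' value are illustrative assumptions, not a prescribed design):

<?php
interface Authenticator
{
    public function authenticate(string $user, string $secret): bool;
}

class StandaloneAuthenticator implements Authenticator
{
    public function authenticate(string $user, string $secret): bool
    {
        // check against the application's own users table
        return false; // stub for illustration
    }
}

class LdapAuthenticator implements Authenticator
{
    public function authenticate(string $user, string $secret): bool
    {
        // bind against the customer's LDAP directory
        return false; // stub for illustration
    }
}

// An OpenID implementation would be added the same way.
function makeAuthenticator(string $method): Authenticator
{
    // $method would come from the customer's row in the database
    return $method === 'ldap' ? new LdapAuthenticator()
                              : new StandaloneAuthenticator();
}

$auth = makeAuthenticator('ldap');
$ok   = $auth->authenticate('alice', 'secret');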
Sometimes (but rarely) it's not practical to implement this approach and using different application logic for different installations is the solution. In this case, the right approach is to maintain a fork in your version control system. An example would be different branding on the site - unless you're developing on top of a content management system, then it's probably simpler to use different CSS files (I'm struggling to think of an example where different PHP code is justified).

Beta testing new website features with live data and real customers

The main goal of this question is to determine the pitfalls of deploying a slightly modified version of a website alongside a live website.
This secondary website would be pulling from the same database as the live site but would have modified features for beta testers.
The end goal is to allow certain customers test our new features with their data.
So:
They don't have to do things twice by going to a copied version of the site.
They are using familiar data sets
Another possibility would be setting a flag per user account to allow them to see certain features (a minimal sketch follows), but this would require a lot of extra work. Also, once a feature is ready for release, we would have to remove all the extra checks.
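A minimal sketch of such a per-user flag check (the table and column names are illustrative assumptions, not our actual schema):

<?php
function userHasFeature(PDO $db, int $userId, string $feature): bool
{
    $stmt = $db->prepare(
        'SELECT 1 FROM user_features WHERE user_id = ? AND feature = ?'
    );
    $stmt->execute([$userId, $feature]);
    return (bool) $stmt->fetchColumn();
}

$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

if (userHasFeature($db, 42, 'new_dashboard')) {
    // render the beta version of the feature
} else {
    // render the current stable version
}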
I am having a hard time seeing the disadvantages of this, but I know there has to be some glaring at me. Thank you for any assistance.
Our stack: Git version control, Capistrano deployment workflow, CakePHP framework, MySQL.
We currently have local and testing servers that are separate from our production servers.
EDIT 12-20-2012 10:30am EST
Based on some comments and one answer I have an update based on feedback.
Meticulous internal testing should be done before beta/user-feedback testing (which we already do).
If we take these precautions and the code base seems solid, the risk in deploying alongside the production server could be manageable. We are working within a framework here, so the likelihood of mass deletion and bad SQL is relatively low.
All that being said, I would rather not take this approach because it still has inherent risk. Does anyone do beta testing with live server data another way?
It depends...
If this is a beta to get customer feedback, on a product that has been fully tested and is known to be stable, the risks are relatively manageable (though see points below). This is the way Google defines "beta".
If "beta" means code complete, and sorta-kinda tested, but who knows what bugs are in there, you risk corrupting your live database. No matter how clever your backup strategy, if something goes wrong, the best case scenario is that the beta users face data loss or corruption; the worst case is that all your users lose data (I've seen broken "where" clauses in delete or update statements do all kinds of entertaining damage).
Another issue to consider is whether the database is backward and forward compatible between versions - can you migrate your beta users back to the mainstream version if they don't like the upgrade, or if something goes wrong? This is a far bigger deal if "beta" means "untested", of course.
In general, it's a lot easier to deal with one-way compatibility - allowing users to upgrade, but not downgrade - another strong argument for "beta" to mean "user feedback"...

How else can I check the user for a program?

Currently, my friend has a program that checks the user's Windows CD key and then runs it through a one-way encryption. He then adds that newly generated number to the program for checking purposes, compiles it, and sends it off to the client. Is there a better way to keep the program from being shared, utilizing PHP somehow instead of his current method, while not using a login system of any kind?
Fortunately, I've done extensive research in this area. A more affordable, and some say safer, alternative to Zend Guard is SourceGuardian. It allows binding to IP addresses, MAC addresses, domains, and time. They're also working on a version that will support a physical dongle attached to the computer. They also release often and have pretty good support.
Another affordable and secure option is NuCoder, they have similar options to SourceGuardian, but also allow the option to bind to a uniquely generated hardware id.
SourceGuardian and NuCoder are the best out there, in my opinion anyway; however, NuCoder has fallen behind in supporting the latest PHP releases. Currently it supports up to 5.2, while SourceGuardian supports the very latest, including 5.3.
Furthermore, since your code is converted to protected bytecode, you also gain speed benefits, as PHP doesn't need to take the extra step of converting your code into bytecode. As the previous commenter noted, this will require your users to install the necessary loaders; this usually entails a simple line addition to php.ini, or, in the case of PHP > 5.2.6, no additions are usually necessary.
In short, any program using a key can be forged one way or another, especially if the sources are available (which is the case with most PHP projects; you might want to look into Zend Guard if you really want something professional). But most security systems are a pain for the clients, in my opinion.
A good system I came across once was a compiled C library that had many redundant code checks (spaghetti-like calling trees) and would validate an encrypted serial number. Since the application was custom and did not have many releases, there was no "crack" available for it, and the client was in deep water when the reseller went into bankruptcy. Eventually, that code was cracked anyway.
In my opinion, the only truly secure way would be to host your application and not release any of your source code, then have the client pay for a license and send them only an API key that they must include with each request.
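A minimal sketch of that hosted model on the server side (the key, the header name and the lookup mechanism are illustrative assumptions):

<?php
// The client never receives source code, only an API key that is
// validated on every request before any licensed functionality runs.
$validKeys = ['demo-key-123' => 'client-acme']; // in practice, a database lookup

$key = $_SERVER['HTTP_X_API_KEY'] ?? '';

if (!isset($validKeys[$key])) {
    http_response_code(401);
    echo json_encode(['error' => 'invalid API key']);
    exit;
}

echo json_encode(['result' => 'ok', 'client' => $validKeys[$key]]);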

What are best practices for self-updating PHP+MySQL applications?

It is pretty standard practice now for desktop applications to be self-updating. On the Mac, every non-Apple program that uses Sparkle is, in my book, an instant win. For Windows developers, this has already been discussed at length. I have not yet found information on self-updating web applications, and I hope you can help.
I am building a web application that is meant to be installed like WordPress or Drupal - unzip it in a directory, hit some install page, and it's ready to go. In order to have broad server compatibility, I've been asked to use PHP and MySQL -- is that **MP? In any event, it has to be broadly cross-platform. For context, this is basically a unified web messaging application for small businesses. It's not another CMS platform; think webmail.
I want to know about self-updating web applications. First of all, (1) is this a bad idea? As of Wordpress 2.7 the automatic update is a single button, which seems easy, and yet I can imagine so many ways this could go terribly, terribly wrong. Also, isn't the idea that the web files are writable by the web process a security hole?
(2) Is it worth the development time? There are probably millions of WP installs in the world, so it's probably worth the time it took the WP team to make it easy, saving millions of man hours worldwide. I can only imagine a few thousand installs of my software -- is building self-upgrade worth the time investment, or can I assume that users sophisticated enough to download and install web software in the first place could go through an upgrade checklist?
If it's not a security disaster or waste of time, then (3) I'm looking for suggestions from anyone who has done it before. Do you keep a version table in your database? How do you manage DB upgrades? What method do you use for rolling back a partial upgrade in the context of a self-updating web application? Did using an ORM layer make it easier or harder? Do you keep a delta of version changes or do you just blow out the whole thing every time?
I appreciate your thoughts on this.
Frankly, it really does depend on your userbase. There are tons of PHP applications that don't automatically upgrade themselves. Their users are either technical enough to handle the upgrade process, or just don't upgrade.
I propose two steps:
1) Seriously ask yourself what your users are likely to really need. Will self-updating provide enough of a boost to adoption to justify the additional work? If you're confident the answer is yes, just do it.
Since you're asking here, I'd guess that you don't know yet. In that case, I propose step 2:
2) Release version 1.0 without the feature. Wait for user feedback. Your users may immediately cry for a simpler upgrade process, in which case you should prioritize it. Alternately, you may find that your users are much more concerned with some other feature.
Guessing at what your users want without asking them is a good way to waste a lot of development time on things people don't actually need.
I've been thinking about this lately in regards to database schema changes. At the moment I'm digging into WordPress to see how they've handled database changes between revisions. Here's what I've found so far:
$wp_db_version is loaded from wp-includes/version.php. This variable corresponds to a Subversion revision number, and is updated when wp-admin/includes/schema.php is changed. (Possibly through a hook? I'm not sure.) When wp-admin/admin.php is loaded, the WordPress option named db_version is read from the database. If this number is not equal to $wp_db_version, wp-admin/upgrade.php is loaded.
wp-admin/includes/upgrade.php includes a function called dbDelta(). dbDelta() scans $wp_queries (a string of SQL queries that will create the most recent database schema from scratch) and compares it to the schema in the database, altering the tables as necessary so that the schema is brought up-to-date.
upgrade.php then runs a function called upgrade_all() which runs specific upgrade_NNN() functions if $wp_db_version is less than target values. (ie. upgrade_250(), the WordPress 2.5.0 upgrade, will be run if the database version is less than 7499.) Each of these functions run their own data migration and population procedures, some of which are called during the initial database setup script. Nicely cuts down on duplicate code.
So, that's one way to do it.
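A minimal sketch of the same pattern outside WordPress (the table, column and version numbers are illustrative assumptions): compare the code's schema version against the one stored in the database and run the numbered migrations needed to close the gap.

<?php
const CODE_DB_VERSION = 7499; // bumped whenever the schema changes

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$dbVersion = (int) $pdo->query(
    "SELECT value FROM options WHERE name = 'db_version'"
)->fetchColumn();

// Each entry upgrades the schema to the given version number.
$migrations = [
    7400 => 'ALTER TABLE users ADD COLUMN email VARCHAR(255)',
    7499 => 'CREATE INDEX idx_users_email ON users (email)',
];

foreach ($migrations as $target => $sql) {
    if ($dbVersion < $target) {
        $pdo->exec($sql);
        $dbVersion = $target; // record progress as each step succeeds
    }
}

$pdo->prepare("UPDATE options SET value = ? WHERE name = 'db_version'")
    ->execute([$dbVersion]);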
Yes, it would be a security hole if PHP went and overwrote its files from some place on the internet with no warning. There's no guarantee that the server is connecting correctly to your update server - it might download code crafted by someone else if DNS poisoning occurred - giving someone else access to your clients' data. Therefore digital signing would be important.
The user could control updates by setting permissions on the web directory so that PHP only has read access to the files - this procedure could simply be documented with your program.
One question remains (I really don't know the answer to): can PHP overwrite files if it's currently using them (e.g. if the update.php file itself needed to be updated)? Worth testing.
I suppose you've already ruled this out, but you could host it as a service. (Think wordpress.com)
I'd suggest that you package your application with pear and set up a channel. Your users can then upgrade the application through a standard interface (pear). It's not entirely automatic (unless the users have some kind of automation running on top of pear), but it's standard, so any sysadmin can maintain it.
I think your best option is an update checking mechanism that will alert the administrator when there are update(s).
As you mention, there are a number of potential security problems. Due to those alone, I would suggest not doing this. Instead, try creating a fairly smart upgrading script.
Just my 2 cents: I'd consider an automatically self-updating application within my CMS a security hole, so if you decide to code this feature, you should consider implementing different levels of this behaviour (a minimal sketch follows the list):
Automatically update
Check for updates and notify
Disable
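A minimal sketch of those three levels (the constant names and the notification mechanism are illustrative assumptions):

<?php
const UPDATE_AUTO    = 'auto';
const UPDATE_NOTIFY  = 'notify';
const UPDATE_DISABLE = 'disable';

function handleUpdateCheck(string $policy, bool $updateAvailable): void
{
    if (!$updateAvailable || $policy === UPDATE_DISABLE) {
        return;
    }
    if ($policy === UPDATE_NOTIFY) {
        error_log('Update available: notify the administrator.');
        return;
    }
    // UPDATE_AUTO: only reached when the admin has explicitly opted in.
    // runSelfUpdate(); // hypothetical updater entry point
}

handleUpdateCheck(UPDATE_NOTIFY, true);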

Is it necessary in any circumstance to modify Wordpress other than writing plugins and themes?

I recently had to work on a project where the previous developer modified the wp-admin directory. It seems like a bad idea to me, since WordPress is constantly updated. Am I just not at that level of expertise with modifying WordPress?
Being open source, I think it's a common thing for software like WordPress to be modified and extended at any point.
To modify or not to modify is a choice between trade-offs. New features can be encapsulated as modules, which may, perhaps, cause their functionality to be less integrated than desired. However, fully integrating changes may hinder easily updating the software as new versions are released.
It does require that someone be very familiar with the software to modify the software directly, but this isn't necessarily a bad idea.
On a side note, I think modifying WordPress is almost a necessity, especially if you want it to have a decent architecture or to actually be secure (ok, that was a jab, sue me).
Well, it is a bad idea only in that it means you are now responsible for maintaining an internal de facto fork ... every time WordPress releases an update, you have to do a three-way diff to merge your changes into the new "real" WordPress. (A three-way diff means you do a diff between your fork of the old version and the standard old version to build a patch set, then apply that patch set to the new version.) You should also be using a VCS yourself to keep yourself sane.
If you aren't up to this then you aren't up to it, there's nothing wrong with following the KISS principle and not mucking up the application code.
If you can write a plugin that does the same thing and does it just as efficiently, then you should do that so you don't have to maintain your own fork.
However, there are a lot of things WordPress is terrible at (efficiency, security) that you can ameliorate (sometimes without much work, just by disabling code you don't need) only by hacking the application code. WordPress is dirty legacy spaghetti code originally written by people with virtually zero knowledge of software or database design, and it does a lot of tremendously stupid things, like querying the database on every request to see what its own siteurl is, when this never changes - there's nothing wrong with taking 5 minutes to change 2 lines of code so it doesn't do this any more.
I worked as tech lead on a then-top-20 Technorati-ranked blog and did a lot of work to scale WordPress on a single server and then onto a cluster (with separate servers for admin vs. public access). We had upstream reverse proxies (i.e. Varnish or Squid) acting as HTTP accelerators and an internal object/page fragment cache system that plugged into memcached with failover to filesystem caching using PEAR::Cache_Lite. We had to modify WordPress to do things like send sane, cache-friendly HTTP headers, to disable a lot of unnecessary SQL and processing.
I modified WP to run using MySQL's memory-only NDB cluster storage engine, which meant specifying indexes in a lot of queries (in the end we opted for a replicated cluster instead, however). In modifying it to run with separate servers for admin vs. public access, we locked down the public-side version so it ran with much reduced MySQL privileges allowing only reads (a third MySQL user got commenting privileges).
If you have a serious comment spam problem (i.e. 10K/hour), then you have to do something beyond plugins. Spam will DoS you, because just initializing the WordPress core takes something like half a second on a standalone P4 with no concurrency, and since WP is a code hairball there's no way to do anything without initializing the core first.
"WP-Cron" is braindead and should be disabled if you have access to an actual crontab to perform these functions. Not hard to do.
In short, I could go on forever listing reasons why you might want to make modifications.
Throughout this it was of course a goal for maintainability reasons to keep these modifications to a minimum and document them as clearly as possible, and we implemented many as plugins when it made sense.
On one blog/forum combination, we hacked together the signup procedure so that people filled in one form to sign up to both WordPress and phpBB at the same time. I'm sure there's a better way to do that with plugins, but it did have one unexpected benefit - it really confuses the spambots. Despite having several of them register each day, we've had about two spam posts in the life of the forum.
Not something I'd recommend, of course - it stops us from upgrading either software.
I tend to strongly advocate against modifying core code if at all possible, especially in a project that updates like WordPress does. If WordPress can't be made to do what you need it to with plugins and the like, you're probably better off with a more extensible/generic system like Drupal. Hacking a blogging-oriented CMS into something else might not be worth it.
In the older versions of WordPress (1.0 and even the early 2.0s), I wouldn't bat an eye to modifying WordPress itself.
However, WordPress' architecture has matured. Sidebars no longer need to be manually coded. Instead, you can port your theme to use widgets and just create widgets (what a godsend!). Don't like how something is displayed - just modify the theme! Don't like how WordPress handles something? Create a plug-in. I'm hard pressed to think of a reason to modify the WordPress code itself that cannot be handled via WordPress' contemporary modular components (widgets, plug-ins, themes) instead.
I'm the type of person to always get "under the hood" in open source apps like WordPress. However, nowadays, there's really no good reason to modify the core WordPress code.
