Beta testing new website features with live data and real customers - php

The main goal of this question is to determine the pitfalls of deploying a slightly modified version of a website alongside a live website.
This secondary website would pull from the same database as the live site but would have modified features for beta testers.
The end goal is to allow certain customers to test our new features with their own data.
So:
They don't have to do things twice by entering data into a separate, copied version of the site.
They are using familiar data sets.
Another possibility would be setting a flag per user account to allow them to see certain features, but this would require a lot of extra work. Also, once a feature is ready for release, we would have to remove all the extra checks.
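For what it's worth, a minimal, framework-neutral sketch of such a per-account flag (the is_beta_tester field and function name are invented for illustration):
// Hypothetical per-user feature flag check; the is_beta_tester field is invented.
function show_beta_checkout(array $user): bool {
    return !empty($user['is_beta_tester']);
}
// $user would normally come from the session / Auth component.
$user = ['id' => 42, 'is_beta_tester' => true];
if (show_beta_checkout($user)) {
    echo 'render the beta checkout';      // new feature, visible to flagged accounts only
} else {
    echo 'render the released checkout';  // current behaviour for everyone else
}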
I am having a hard time seeing the disadvantages of the parallel-deployment approach, but I know there must be some staring me in the face. Thank you for any assistance.
Git version control, Capistrano deployment workflow, CakePHP framework, MySQL.
We currently have local and testing servers that are separate from our production servers.
EDIT 12-20-2012 10:30am EST
Based on some comments and one answer, here is an update based on that feedback.
Meticulous internal testing should be done before 'beta'/user-feedback testing (which we already do).
If we take these precautions and the code base seems solid, the risk of deploying alongside the production server could be manageable. We are working within a framework here, so the likelihood of mass deletion and bad SQL is relatively low.
All that being said, I would rather not take this approach because it still has inherent risk. Does anyone do beta testing with live server data in another way?

It depends...
If this is a beta to get customer feedback, on a product that has been fully tested and is known to be stable, the risks are relatively manageable (though see points below). This is the way Google defines "beta".
If "beta" means code complete, and sorta-kinda tested, but who knows what bugs are in there, you risk corrupting your live database. No matter how clever your backup strategy, if something goes wrong, the best case scenario is that the beta users face data loss or corruption; the worst case is that all your users lose data (I've seen broken "where" clauses in delete or update statements do all kinds of entertaining damage).
Another issue to consider is whether the database is backward and forward compatible between versions - can you migrate your beta users back to the mainstream version if they don't like the upgrade, or if something goes wrong? This is a far bigger deal if "beta" means "untested", of course.
In general, it's a lot easier to deal with one-way compatibility - allowing users to upgrade, but not downgrade - another strong argument for "beta" to mean "user feedback"...
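As a hedged illustration of one-way compatibility at the schema level (table and column names are invented): additions that older code simply ignores are safe to share between versions, while renames and drops are not.
// Hypothetical additive migration: the live (old) code never reads beta_notes,
// so both versions can safely share the same table while the beta runs.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'db_user', 'db_password');
$pdo->exec("ALTER TABLE orders ADD COLUMN beta_notes TEXT NULL");
// By contrast, a rename would break the production code immediately, e.g.:
// $pdo->exec("ALTER TABLE orders CHANGE status state VARCHAR(20) NOT NULL");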

Related

eval Remote Code From My Server

I am building a site platform similar to Wordpress that allows my users to download a .zip file, upload it onto their server, and be good to go.
I know everyone says eval() is evil - but the code will not include any user or variable input.
The benefit here is that updates will occur automatically. I can just change the code being grabbed on my server.
My clients using the code will have pretty low traffic sites - so I'm not worried about overloading their server. Most of the heavy lifting will be done by us.
Here's the basic code concept:
$code=file_get_contents("http://myserver.com/code.txt");
eval($code);
Is this a realistic option? What security holes do I need to worry about?
It's "realistic" in the sense that it will work, but at the same time it sounds like a sysadmin's nightmare. If you mean to have a client download and execute remote code every time a request is made, your clients are at the mercy of your master server: if it goes down or is unreachable at any point, their sites break with it. It becomes a mission-critical service you'll have to keep running for as long as your clients need it.
You list automatic updates as a benefit, but is it? In nearly every software platform, the features users depend on change over time; function signatures can change, or functionality may be dropped entirely in favour of a more refined alternative. Since it sounds like you're writing some form of framework, can you guarantee that future versions will always be backwards-compatible? Not everyone runs the cutting-edge version of a piece of software in production, for a reason: they want what they are using to be stable. If an upgraded version of your platform rolls out overnight and breaks some custom code written by a client (at least one of them will try writing custom code, even if you don't want them to), or even old, standard functionality that was deprecated but still worked with the previous release, how are they going to roll back to a version that works?
It just sounds like something that will eventually incur a ton of technical debt.

How to version-control different features within one web application?

We have some web applications, and these websites are being upgraded, not for the first time, but it is becoming very difficult to manage versions for the users and for the developers.
We have many clients, some of them running the same application, but they need to pay for upgrades. Not all clients pay for upgrades, and because of this some clients are running one version while other clients are running another.
We have tried two approaches, and we are looking for a third:
Put the version in the path, like this: www\project\version\system-files
But this became confusing for some users, because for them the URL becomes www.website.com/app-version, and when the system is upgraded, the URL changes.
Put the version in the function, like this: function V1_functionX()
When a function needs to be upgraded, we create a new function called V2_functionX(). But this creates a "fat" website, and the team made some mistakes during development, because we don't have "one development version" but "many versions to develop", and some functions are used in more than one website.
The very first approach was abandoned a long time ago: we developed the web application and "closed the version", all new requests were rolled into the next upgraded version, and that version, when finished, was "closed" too. But this was too slow for making corrections and deploying small upgrades.
We have discussed how other companies handle this: they shut down the website to upgrade the system. This will probably be our way too.
But if anyone has another idea for upgrading the application without shutting down the website, we will be glad to listen.
Note: this is not about SVN.
You say you have different versions of your applications that must be maintained for different clients. I expect you don't need me to tell you this adds significantly to the complexity of your overall system, and thus your first priority is to reduce the number of versions you are maintaining in parallel.
API services have the same problem: a new version with more features is offered, but the old one needs to be maintained to give the new version time to stabilise and to give users sufficient time to upgrade their code. Your difficulty is similar. The first question I would therefore ask is whether it is possible to maintain only two versions.
If that is not possible, try at least minimising the number of concurrent versions: where a new version must be created, you need to encourage users to migrate from one version to another. (You've said you cannot unify your users onto one version, but without further information about your exact use-case, it is not possible to offer an independent view on that). So, perhaps one approach is to never maintain more than, say, five versions.
There are a number of strategies you can take to mitigate the complexity of the system you now have. Firstly, consider separating your code into a "core" of features that all versions absolutely must have. This will be common to all versions, so that if you fix a bug here, all clients benefit from the fix. This might be a visible feature (e.g. a product editing screen) or a framework feature (e.g. force SSL during checkout).
Your core libraries and client-specific functions could then reside in a set of libraries like so:
/project/core
/project/versions/1/Class.php
/project/versions/1.1/Class.php
/project/versions/2/Class.php
/project/versions/2.1.1/Class.php
/project/versions/...
(Class.php is of course an example; in practice there would be many class files here, each named appropriately.)
In this way, you do not need to call functions with a V1_ prefix, which would require replicating your version-choosing code in a lot of places. It is much better to just load the library pertaining to the correct version; as long as the function names are the same across all versions, you can simply use the function name and your library loader will take care of the rest.
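A minimal sketch of such a loader, assuming every version directory defines classes with identical names (the paths and the ProductEditor class are illustrative):
// Hypothetical version-aware loader: pulls in the class file for the version
// assigned to this client, so calling code never needs a V1_/V2_ prefix.
function load_versioned_class(string $class, string $version): void {
    $file = __DIR__ . "/project/versions/$version/$class.php";
    if (!is_file($file)) {
        throw new RuntimeException("Class $class not found for version $version");
    }
    require_once $file;
}
// Usage: the client's assigned version comes from configuration or the database.
load_versioned_class('ProductEditor', '2.1.1');
$editor = new ProductEditor();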
Another approach is to use plugins, like WordPress does. Where a plugin is added, it modifies some core functionality by adding new or different behaviour. The "middleware" design pattern may be useful here - the Slim framework (and undoubtedly others) uses this approach to add pre- or post-call hooks to an existing route handler, and thus offers a clean mechanism to edit existing functionality in a variety of combinations.
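For illustration only (this is plain PHP, not Slim's actual API), a pre-/post-hook wrapper can be as simple as a closure around the core handler:
// Hypothetical middleware-style wrapper: the core handler stays untouched,
// and client-specific behaviour is layered around it.
$coreHandler = function (array $request) {
    return "core response for " . $request['path'];
};
$withAuditLog = function (callable $next) {
    return function (array $request) use ($next) {
        error_log("request to " . $request['path']);   // pre-hook
        $response = $next($request);
        return $response . " (audited)";               // post-hook
    };
};
$handler = $withAuditLog($coreHandler);
echo $handler(['path' => '/products']);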
To summarise, your current situation is not just a management problem; it will cost you in slower development and additional debugging. Whilst the above approaches will still be necessary to reduce some of the complexity, consider also:
forcing laggard clients to upgrade to one of your currently supported versions
giving laggard clients a free upgrade to the oldest supported version
Some additional thoughts based on the new information. I had pondered whether splitting the code into separate repositories, one for each client, would help. However, there is no guarantee that it would; even if you pull core features in using Composer or a Git submodule, there is still the possibility of divergence between your latest core and your earliest client code. At some point your worst laggard client is going to hold back development on the core.
You can always leave this client on an abandoned version, but if they spot a bug, it is not worth back-porting a fix from your latest core, since that will cause you all the compatibility headaches you've been trying to avoid. Either they upgrade to a minimum client version that works with the latest core (and pay to do so if necessary) or they tolerate the bug indefinitely.
You've mentioned that each client gets his or her own database. That is helpful, up to a point, since it means that client versions are not entirely constrained by database schema decisions forced by the core. However, this will still have a knock-on effect on how much code you can move to the core.
For example, let us assume that you have seven clients, and six of them have a User entity with an email address to handle password-change requests (one client has a User entity without this field). This means that, if the awkward schema cannot change, the core cannot assume that an email address is available. (In this trivial case it might be cheaper to upgrade the odd one out for free, so that more code can go in the core, rather than maintaining such a standard thing as a version enhancement.)
Given the level of complexity, and since it sounds like you are maintaining this for the long term, I think you should set up some unit and functional tests. You'll need to split these into "core" and "per version" as well. If you find a bug, regardless of whether it is caused by feature versioning or not, write a failing test, and then fix it. You'll then have - at least in theory - a way to check if a change will impact on a particular client's version in a way you did not envisage.
We have this at my work:
Local dev website (SVN)
Dev server where all developers test
Preprod, where everything is OK
Prod (rsync from preprod)
The rsync between the two servers is super fast; when we do a major update it takes less than 5 seconds.

Planning Ahead For Website Upgrades

I've noticed while developing my first site, that the smallest changes in database columns, connection options, and various other components cause the website to fail until I correct the problem which may or may not require a lot of time (woe is me). I'm wondering what steps I can take now in order to prevent these headaches, as I've delayed the website launch in order to keep upgrading. Once I'm done implementing the changes I would like to see, I know I won't truly be done, but at some point I have to move on to the next stage.
Yes, I know there is probably no one good solution, and ultimately a self-correcting design is more trouble than it's worth at this point. But if any greybeards have tips they could offer based on their own experience with web development, particularly with LAMP stacks, I would greatly appreciate it.
Specifically, I would like to know what to look out for when modifying databases and website code after customer information is in active use, in order to prevent errors, and how to roll out the changes.
EDIT 1:
Yes, so the answer seems to be that I need to copy the live site to my testing environment. I'm looking into some of the development solutions already suggested. Regular backups are crucial, but I can just see inserting new columns and modifying queries becoming a cause of mis-ordered tables and such. "That's where being a good programmer and testing diligently comes in handy," someone in the corner said. As I look into the proposed solutions, I welcome all others in the meantime. A real-time copy of the live site that I could create on the fly during testing would be nice.
The above answers are all very valid and in the end, they represent your target solution.
In the meantime, you may already do a lot for your website, even with a gradual migration to those practices.
In order to do so, I suggest you install PHPUnit (or whatever unit-testing framework comes with the web languages you use). There are also "graphical" versions of it, like VisualPHPUnit, if that's more to your taste.
These tools are not the permanent solution. You should actually aim to add them to your permanent solution, that is, setting up a development server etc.
However, even as interim solution they help you reach a fairly stable degree of quality for your software components and avoid 80-90% of the surprises that come with coding on a live server.
You can develop your code in a separate directory and test it before you move it into production. You can create mock objects which your code under test may freely interact with, without fear of repercussions. Your tests may load their own alternate configuration so they work on a second, copy database.
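A minimal sketch of what that looks like with PHPUnit (the UserRepository class and the app_test copy database are assumptions for the example):
// Hypothetical UserRepositoryTest.php
use PHPUnit\Framework\TestCase;

class UserRepositoryTest extends TestCase
{
    private $pdo;

    protected function setUp(): void
    {
        // Alternate configuration: the tests talk to a copy database, never to production.
        $this->pdo = new PDO('mysql:host=localhost;dbname=app_test', 'test_user', 'secret');
    }

    public function testFindByEmailReturnsTheUser(): void
    {
        $repo = new UserRepository($this->pdo);
        $user = $repo->findByEmail('alice@example.com');
        $this->assertSame('alice@example.com', $user['email']);
    }
}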
Moving even further, you may include your website into tests itself. There are several applications like Selenium that allow you both to automate and test your production website, so that you can reliably know that your latest changes did not negatively affect your website semantics.
In short, while you should certainly aim at getting a proper development environment running, you can do something very good even today, with a few hours of study.
Start using some (maybe simplified) sort of release management:
Maintain a development environment, either locally or in a second .htaccess-protected folder online. Make it use its own DB.
Test every change in this dev env. Once you are satisfied, move it to the production env.
Use git or svn (svn might be simpler to learn, but opinions vary; check out "Tortoise") to save a snapshot ("commit") of every change you make. That way you can diff through your latest commits if anything goes wrong.
Maintain regular backups.

multiple databases looking at single php codebase issues?

I am building a solution in PHP and am looking at having multiple installs for different clients. If I had 20 databases, is there anything wrong with pointing them all at the same PHP codebase? E.g. speed issues, or is it bad practice?
Thanks in advance for your thoughts and expertise :-)
It's possible and won't affect your performance but it has a downside.
If you have one codebase, it means you have to recheck all clients for errors when you update it.
Therefore it's better for your clients to have the codebase separated, so you can update the codebase for a specific client and prevent errors for other clients.
If everything seems to work for one client, you can consider the same update for another client.
That is the case if you have custom code for every client; if it's a single application, like a mail-management system, it wouldn't be a bad practice at all.
It would be nice, though, to assign a version number to each client.
This makes it possible to test a new version on a single client and slowly migrate other clients to the newer version.
I would say the opposite of Visser. If there is a bug in one installation, then the same bug is going to exist in all installations unless you provide customized versions of your software. Sure, different customers might use the application in different ways, so some might never experience a defect that brings the business of another customer to a grinding halt. I also disagree with Visser that it won't affect performance: differences in usage could lead to very marked differences in overall performance for a particular customer, but tuning at the PHP tier will benefit all customers. Tuning at the database tier is a slightly different story; some customers might benefit from an index that would slow down other customers.
If you do provide per-customer variations in the behaviour of your code, then how you do this depends on the complexity of those variations. Ideally the differences should be described in the database, and the same PHP code would then produce different results - e.g. one customer wants a standalone user-management and authentication system, another customer wants to use LDAP authentication, a third OpenID - in which case your code should implement all three, with the method chosen at runtime based on the data.
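A rough sketch of that idea (the interface and class names are invented): the customer's configuration row decides which authenticator the shared code instantiates.
// Hypothetical: one codebase, per-customer behaviour driven by data.
interface Authenticator {
    public function authenticate(string $username, string $secret): bool;
}

class StandaloneAuthenticator implements Authenticator {
    public function authenticate(string $username, string $secret): bool {
        return false;   // would check the local users table
    }
}

class LdapAuthenticator implements Authenticator {
    public function authenticate(string $username, string $secret): bool {
        return false;   // would bind against the customer's LDAP server
    }
}

function authenticator_for(string $method): Authenticator {
    // $method comes from the customer's row in the database, e.g. 'ldap' or 'standalone'
    switch ($method) {
        case 'ldap': return new LdapAuthenticator();
        default:     return new StandaloneAuthenticator();   // an OpenID class would slot in the same way
    }
}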
Sometimes (but rarely) it's not practical to implement this approach and using different application logic for different installations is the solution. In this case, the right approach is to maintain a fork in your version control system. An example would be different branding on the site - unless you're developing on top of a content management system, then it's probably simpler to use different CSS files (I'm struggling to think of an example where different PHP code is justified).

What are best practices for self-updating PHP+MySQL applications?

It is pretty standard practice now for desktop applications to be self-updating. On the Mac, every non-Apple program that uses Sparkle in my book is an instant win. For Windows developers, this has already been discussed at length. I have not yet found information on self-updating web applications, and I hope you can help.
I am building a web application that is meant to be installed like Wordpress or Drupal - unzip it in a directory, hit some install page, and it's ready to go. In order to have broad server compatibility, I've been asked to use PHP and MySQL -- is that **MP? In any event, it has to be broadly cross-platform. For context, this is basically a unified web messaging application for small businesses. It's not another CMS platform, think webmail.
I want to know about self-updating web applications. First of all, (1) is this a bad idea? As of Wordpress 2.7 the automatic update is a single button, which seems easy, and yet I can imagine so many ways this could go terribly, terribly wrong. Also, isn't the idea that the web files are writable by the web process a security hole?
(2) Is it worth the development time? There are probably millions of WP installs in the world, so it's probably worth the time it took the WP team to make it easy, saving millions of man hours worldwide. I can only imagine a few thousand installs of my software -- is building self-upgrade worth the time investment, or can I assume that users sophisticated enough to download and install web software in the first place could go through an upgrade checklist?
If it's not a security disaster or waste of time, then (3) I'm looking for suggestions from anyone who has done it before. Do you keep a version table in your database? How do you manage DB upgrades? What method do you use for rolling back a partial upgrade in the context of a self-updating web application? Did using an ORM layer make it easier or harder? Do you keep a delta of version changes or do you just blow out the whole thing every time?
I appreciate your thoughts on this.
Frankly, it really does depend on your userbase. There are tons of PHP applications that don't automatically upgrade themselves. Their users are either technical enough to handle the upgrade process, or just don't upgrade.
I propose two steps:
1) Seriously ask yourself what your users are likely to really need. Will self-updating provide enough of a boost to adoption to justify the additional work? If you're confident the answer is yes, just do it.
Since you're asking here, I'd guess that you don't know yet. In that case, I propose step 2:
2) Release version 1.0 without the feature. Wait for user feedback. Your users may immediately cry for a simpler upgrade process, in which case you should prioritize it. Alternately, you may find that your users are much more concerned with some other feature.
Guessing at what your users want without asking them is a good way to waste a lot of development time on things people don't actually need.
I've been thinking about this lately in regards to database schema changes. At the moment I'm digging into WordPress to see how they've handled database changes between revisions. Here's what I've found so far:
$wp_db_version is loaded from wp-includes/version.php. This variable corresponds to a Subversion revision number, and is updated when wp-admin/includes/schema.php is changed. (Possibly through a hook? I'm not sure.) When wp-admin/admin.php is loaded, the WordPress option named db_version is read from the database. If this number is not equal to $wp_db_version, wp-admin/upgrade.php is loaded.
wp-admin/includes/upgrade.php includes a function called dbDelta(). dbDelta() scans $wp_queries (a string of SQL queries that will create the most recent database schema from scratch) and compares it to the schema in the database, altering the tables as necessary so that the schema is brought up-to-date.
upgrade.php then runs a function called upgrade_all(), which runs specific upgrade_NNN() functions if $wp_db_version is less than the target values (i.e. upgrade_250(), the WordPress 2.5.0 upgrade, will run if the database version is less than 7499). Each of these functions runs its own data migration and population procedures, some of which are also called by the initial database setup script. This nicely cuts down on duplicate code.
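A stripped-down sketch of that pattern (not WordPress's actual code; the options table, version numbers and upgrade function are illustrative):
// Hypothetical upgrade runner modelled on the approach described above.
$code_db_version = 7499;   // the schema version this release of the code expects

$pdo = new PDO('mysql:host=localhost;dbname=app', 'db_user', 'db_password');
$stored_db_version = (int) $pdo->query(
    "SELECT value FROM options WHERE name = 'db_version'"
)->fetchColumn();

if ($stored_db_version < $code_db_version) {
    if ($stored_db_version < 7499) {   // one upgrade_NNN() per milestone, run in order
        upgrade_250($pdo);
    }
    $pdo->prepare("UPDATE options SET value = ? WHERE name = 'db_version'")
        ->execute([$code_db_version]);
}

function upgrade_250(PDO $pdo): void {
    // additive change, so code from the previous version keeps working mid-upgrade
    $pdo->exec("ALTER TABLE posts ADD COLUMN excerpt TEXT NULL");
}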
So, that's one way to do it.
Yes, it would be a security hole if PHP went and overwrote its own files from some place on the internet with no warning. There's no guarantee that the server is connecting to the right update server; it might download code crafted by someone else if DNS poisoning occurred, giving that someone access to your clients' data. Therefore digital signing would be important.
The user could control updates by setting permissions on the web directory so that PHP only has read access to the files - this procedure could simply be documented with your program.
One question remains (to which I really don't know the answer): can PHP overwrite files it is currently using (e.g. if the update.php file itself needed to be updated)? Worth testing.
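To illustrate the signing point, a hedged sketch of verifying an update package before applying it (URLs, file names and key handling are simplified assumptions):
// Hypothetical: verify a downloaded update against a public key that ships
// with the application, and refuse to apply it if the signature check fails.
$package   = file_get_contents('https://updates.example.com/app-1.2.zip');
$signature = file_get_contents('https://updates.example.com/app-1.2.zip.sig');
$publicKey = file_get_contents(__DIR__ . '/update-public-key.pem');

if (openssl_verify($package, $signature, $publicKey, OPENSSL_ALGO_SHA256) === 1) {
    file_put_contents('/tmp/app-1.2.zip', $package);   // safe to unpack and install
} else {
    error_log('Update rejected: bad or missing signature');
}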
I suppose you've already ruled this out, but you could host it as a service. (Think wordpress.com)
I'd suggest that you package your application with pear and set up a channel. Your users can then upgrade the application through a standard interface (pear). It's not entirely automatic (unless the users have some kind of automation running on top of pear), but it's standard, so any sysadmin can maintain it.
I think your best option is an update-checking mechanism that will alert the administrator when there are updates.
As you mention, there are a number of potential security problems. Due to those alone, I would suggest not doing this. Instead, try creating a fairly smart upgrading script.
Just my 2 cents: I'd consider an automatically self-updating application within my CMS a security hole, so if you decide to code this feature, you should consider implementing different levels of this behavior:
Automatically update
Check for updates and notify
Disable
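For the "check for updates and notify" level, a minimal sketch (the endpoint URL and version handling are assumptions):
// Hypothetical check-and-notify: compare the installed version against the
// latest version string published by the vendor, and only surface a notice.
$installed = '1.4.2';   // e.g. read from a version.php shipped with the application

$response = @file_get_contents('https://updates.example.com/latest-version.txt');
$latest = ($response === false) ? '' : trim($response);

$updateNotice = null;
if ($latest !== '' && version_compare($latest, $installed, '>')) {
    // no automatic overwrite of files: just tell the administrator in the dashboard
    $updateNotice = "Version $latest is available (you are running $installed).";
}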
