(Lightweight) backup strategies for a LAMP application stack?

(Lightweight) backup strategies for a LAMP application stack? - php

I'm researching some lightweight tools to backup a LAMP stack. The two most imporatant pieces are the
php codebase and
the mysql database.
I can tar/bz2 the code and a mysqldump and restore it on a new server (if the old one crashes) and this is more or less fine.
Anyway, are there more complete solutions to this?
e.g. track and re-install additionally installed pear-packages;
track other packages related to the LAMP stack installed via linux package managers, e.g. APC;
keep mysql and php configurations alongside backups and being able to restore them automatically ...
possibly complete server images, which can be restored without the need to reinstall everything ..
I'm curious about hints, tips, experiences, solutions ..

The PHP code base should be managed in a version control system such as SVN, Git, etc. Just creating a tar doesn't give you many capabilities that a proper version control system gives you.
The trouble with mysqldump is that you have to lock the tables you are dumping to ensure a consistent snapshot. If this takes a long time, other DB operations might timeout while waiting. We use a wonderful script for snapshotting the running database without excessive locks. It was designed for the Amazon/EC2 environment but the principals apply to any Linux system with the xfs file system.

Here is a great guide for imaging an Ubuntu machine (obviously you can use on other distros):
http://ubuntuforums.org/showthread.php?t=35087
In a nutshell (from the article)
tar cvpzf backup.tgz --exclude=/proc --exclude=/lost+found --exclude=/backup.tgz --exclude=/mnt --exclude=/sys /
To back up the system, then ftp it to another server.

I can answer a few points. I know it is not a popular package, but I've always revisioned schema with RCS at the server. It doesn't have to be RCS, but no reason not to dump the CVS/RCS repository with the backup.
For "complete server images," instead of autonomously installing application-requirements (PHP packages &c) we deploy our own bin/ src/ usr/ var/ lib/ structure as per each application which simplifies the backup and system req's perspective.
Hope that helps.
I've also seen mysqldumps RCS'd to save only changes. I'm sure that would be somewhat non-trivial in terms of change management though.

Related

What ways are there to work on a project in a testing environment where the Git commit needs to be different to the PHP code used for testing and dev?

In my project the deployable version needs to have a copy of each of the external libs, a different config file and install and setup files, for security concerns, the main project is set to refuse to run if they are present. Thus the upstream copies of the other projects need to be committed to repo. How can I work on code running on localhost where the file layout and sometimes file contents from dev and testing are different to what I need to commit?
Background
I am working on a project on hosted on github and my main IDE is netbeans which has imperfect git support (good enough for >99% of my needs). The project is in PHP and uses several other projects as libraries.
As Netbeans does not have the best support for sub-repos I have chosen to keep each additional project in a separate project. This is fine as the central project looks at the config data for where to find these outside libs.
Half an answer
My instinct is to suppose that there will need to be some "build stage" prior to committing to the github repo but how on earth do I go about setting all that up?
I could write some sort of homebrew thing but then when I pull other people's contributions I would need to reverse the process unless we had a branch for builds and a branch for working copies which seems needlessly complex and could leave the dev(s) config data on public display (not to mention updates being a mess).
I have seen that others have wrestled with somewhat similar problems to no conclusion (at time of asking) (How to push and pull from github without sharing sensitive information? Smudge & clean?) so I am looking for anything that might help me come up with a solution

my main IDE is netbeans which has imperfect git support
Most devs just use the command line. I switch to the NetBeans conflict resolver occasionally, which is very good, but for normal stuff the console is usually faster.
My instinct is to suppose that there will need to be some "build stage" prior to committing to the github repo
... unless we had a branch for builds and a branch for working copies
No, there is only ever one repository. It is better to think of your repo as your code history, rather than your deployment state. Branches should just be for features or large changes, which merge into your mainline/master.
There are a good deal of options available to you when deploying. The first is Composer, which Mark points out: when deploying you issue an install or update command, which fetches the dependencies that satisfy your library requirements recursively. You can use Bower to do the same thing for your JavaScript dependencies.
Some deployment strategies prefer to build locally and then scp/rsync to a remote server. Composer and Bower are still probably a good idea, but you write a build script (using Ant or Phing, for example) to create a build copy in a local temporary folder, and then send it to the server. It is common here also to push it to a new release folder on the server, and then swap a symlink or Apache config file when it's ready to go live.
the deployable version needs to have a copy of each of the external libs, a different config file and install and setup files, for security concerns
Assuming this is a web project, have you tried adding your sensitive environment data to your Apache configuration file? This can be trivially read in PHP, and of course PHP does not care that this information is different according to whether you are developing, testing, demoing a branch or operating live.
Further reading: an excellent PHP deployment book, free of charge, that suggests Phing and Capistrano.

Development environment - VCS from development to staging server to production

I've read a number of topics in the same sort of ballpark as this one, but in all honesty I'm still not exactly sure on the best approach (as a starting point). I am a solo developer in a small office and I have around 30 websites which are hosted on a linux VPS. I want to start using using version control (probably SVN) and also set up a staging server. At the moment, I do development either locally on my machine before using FTP to upload to the live server, or ocassionally for small changes I edit the remote files directly, which is not an ideal approach.
I'm looking for some guidance on how to improve my development environment. I imagine I should be installing SVN on the web server, which would then allow me to check out versions to my local machine (which would also require SVN i think). Also, if I want to set up a staging server, should I just set up subdomains for each of the live websites, then use these subdomains for showing clients changes to the site before making them live?
Hope this makes sense!

This is what we do at work:
We have a staging server running Apache and a Subversion server. We have a post commit hook that updates a working copy in the htdocs directory, that way, when a developer commits something it automatically gets updated on the staging server, so everyone can see the latest code.
On the client's production servers (the ones we can control) we have the Subversion client installed and the website is a working copy. When we need to update the live site we login to a shell and run svn up. If you do something like this, make sure to limit access to the .svn directories, either with .htaccess files or from the main Apache config.
We have a custom app that manages the projects, but that is only because we're lazy and don't want to setup each project by hand, the app creates the necessary directories and working copies. You could write a quick script to do this.
We never, ever, edit files via FTP on the live site. All in all we have been using this setup for almost 2 years and aside from the occasional conflict on the staging server, we never have had any problems.

You can actually install the SVN server on your local machine, which I would recommend in lieu of installing it on the web server (assuming you make backups). The easiest thing to do, since it’s only you using it, would be to use the file:// protocol, but using svnserve is a little more robust, and the preferred method if you want to take the time to do it.

#Michael, I disagree - I would say it's better to install on the linux vps, especially if you are already paying for the hosting service. I find it very helpful to be able to browse and download stuff from my svn repo wherever I am, from whatever computer I'm on.
#nicky, I started with svn (and version control) several years ago and I took baby steps which made it easier to tackle.
If I had to do it over again, I'd read the svn book to start with. The book is very well laid out and didn't take more than 1-2 days to plow thru.
While you're reading, install svn on your linux vps with an apache front end.
Once you have that up, pick one of your websites and import it into svn. This is how I structure my svn repo. For example, say my repo is hosted at http://mysvn.mydomain.com/svn/:
mywebsite1
- trunk
- tags
- branches
mywebsite2
- trunk
- tags
- branches
Don't worry about creating the perfect structure. It's pretty easy to re-organize especially when you're starting out. After you import a few projects into svn, you'll start to get a feel for which projects should have their own "trunk/tags/branches" dir structure and which can be combined.
For creating test environments, I do exactly what you describe. I use build scripts to checkout from svn and download files into dirs that are mapped to subdomains like "test.clientsite.com" (I work primarily in java and use ant and maven, but I think you can use whatever scripting language you're familiar with).
Once you get used to version control, you'll never go back, good luck!

What's the best process / app for automated deployment of PHP apps?

There's another post on SO relating to .NET -- not us. Pure PHP. Trying to find the best way/process to deploy stable version of our PHP app. I've seen an article on Capistrano, but am curious what else is out there. Aside from the obvious reasons, I'm also looking to add some scripting so that the SVN rev number gets added in there as well.
Much thanks.

I've used a home-grown script for quite some time. It will (based on an application configuration file):
Run svn export on the repository based on a tag.
Package the export into a tar or zip file, which includes the tag in the name.
Use scp to copy the package to the appropriate server (QA or release).
Connect to the server with ssh to install the package and run post-install scripts.
The application configuration file is part of the project. It can tell the script (at step 2) to strip paths and otherwise process specified files. It also specifies server names and how to handle externals.
I've recently migrated the script to support Git as well as Subversion. I'm also probably going to migrate it to PHP since we're now running in a mixed (Linux and Windows) set up, with Linux now in the minority.
I have plans to automatically call the script with post-commit hooks, but haven't had the need to implement that just yet.

Coincidentally, I was just reading about an Apache Ant/gnu make like build tool called Phing. What I like about it is the ability to write custom extensions in PHP!

I don't know if it works for deploying an app live, but phpUnderControl is a continuous integration suite (which I'm just now starting to look into). If it doesn't support doing deployments natively, it can probably be extended to do them.

Chris Hartjes has a nice view on this: Deployment is not a 4 letter word

We're using Webistrano, which is a web frontend for Capistrano, to deploy a few dozen projects. It's built as a Ruby on Rails app, and provides a nice, centralized and consistent user interface for Capistrano deployments.
Instead of having cap recipes in every project, and running command-line tools, Webistrano stores the recipes in its database, and allows you to attach the recipes to multiple projects and stages. This reduces duplication of scripts.
Also nice is that all deployment logs are stored so there's an auditing trail. Who deployed which revision to the live server, that sort of thing.
As you requested, the Revision number is stored in the deployed project as well.
All in all, we're very pleased with it.

What's the best way to use SVN to version control a PHP site?

I've always just FTPed files down from sites, edited them and put them back up when creating sites, but feel it's worth learning to do things properly.
I've just commited everything to a SVN repo, and have tried sshing into the server and checking out a tagged build, as well as updating that build using switch.
All good, but it's a lot lot slower than my current process.
What's the best way to set something like this up? Most of my time is just bug fixes or small changes rather than large rewrites, so I'm frequently updating things.

You don't necessarily need to use SVN to deploy the files to the server. Keep using FTP for that and just use SVN for revision history.

You should look at installing rsync to upload changes to your server.
Rsync is great because it compares your local copy of the repo to the copy that's currently on the server and then only sends files that have changed.
This saves you having to remember every file that you changed and selecting them manually to FTP, or having to upload your whole local copy to the server again (and leaving FTP to do the comparisons).
Rsync also lets you exclude files/folder (i.e. .svn/ folders) when syncing between your servers.

I'd recommend you keep using Subversion to track all changes, even bug fixes. When you wish to deploy to your production server, you should use SSH and call svn update. This process can be automated using Capistrano, meaning that you can sit at your local box and call cap deploy -- Capistrano will SSH into your server and perform the Subversion update. Saves a lot of tedious manual labor.

For quick updates I just run svn update from the server.
Sometimes for really really quick updates I edit the files using vim and commit them from the server.
It's not very proper, but quick and quite reliable.

If you want to do this properly, you should definitely look into setting up a local SVN repository. I would also highly recommend setting up a continuous integration (CI) server such as cruise control, which would automatically run any tests against your PHP code when ever you check in to svn. Your CI server could also be used to publish your files via FTP to your host at the click of a button, once it has passed the tests.
Although this sounds like a lot of work, it really isn't and the benefits of a smooth deployment process will more than pay for itself in the long run.

For my projects, I usually have a repo. On my laptop is a working copy, and the live website is a working copy. I make my changes on the local copy, using my local webserver. When everything is tested and ready to go, I commit the changes, then I ssh into the remote server and svn update.
I also keep a folder in this repository which contains sql files of any changes I've made to the database structure, labelled according to their revision number. For instance, when I commit Revision 74 and it has a couple extra columns in one of the tables, included in the commit will be dbupdates/rev74.sql. That way, after I do my svn update, all I just have to run my sql file (mysql db_name -p -u username < dbupdates/rev74.sql) and I'm good to go.

If you want to get real funky with it, you could use a build script to get the current version from SVN, then compile your PHP code, then on a successful build, automatically push the changes to your server.
This will help in debugging and may make your code run faster. Also, getting into the build habit has really improved my coding over just pushing the PHP straight to the server and debugging via Firefox.

The benefits of source control reveal themselves as the complexity of the project and number of developers increase. If you are working directly on a remote server, and are only making quick patches most of the time, source control might not be worth the effort to you.
Preferably, you should be working from a local working copy of the repository (meaning you should also set up a local server). Working against a remote server using SVN as the only means to update it would slow you down quite considerably.
Having said that, working with SVN (or any other source control) will yield many benefits in the long run - you have a complete history of changes, you can always be sure the server is up-to-date (if you ran update) and if you add more developers to the project you can avoid costly source overwrites from each other.

What I do at work, is use FTP to upload changes to a test server. Then when I am finished with the section of the site that I was working on, I commit the changes and update both. Sometimes, if I am working on something and I change a lot of files in different directories, I commit it and update the test server. But I don't update the production server. But I am the only programmer here, I wouldn't recommend committing possibally buggy code if there is more than one programmer.

I use ZendStudio for Eclipse (currently version 6.1). And I use SVN to keep my source codes available. Initially I thought the process was somewhat slow due to commit process (and entering commit comment) and wait until it stops.
However after learning that Ctrl+Alt+C to Commit and check 'Always run in Background', the process doesn't slow at all.
Plus, I do run everything locally, then only SSH after a while.

I did a post-commit hook to automatically update my web. It´s fast but you can make mistakes.

IF on a *nix server AND you have the appropriate SSH access AND you have space to keep multiple copies of the website, THEN the single most useful versioning technique I have found is to use a symbolic link to point to the "current" version of the website. (You can still use SVN to version source code -- this is a way to easily/instantly switch between versions of the website on the server.)
Set up the webserver to point to /whatever.com as the root of the website.
Have a folder like /website/r1v00 to which you FTP the website files, then create a symlink called "whatever.com" that points to /website/r1v00
When you have an updated version of the website, create another folder called /website/r1v001, FTP all the files for the updated site, then change the symlink for "whatever.com" to now point to /website/r1v01. If there are any problems with the new site, you can back it out instantly by simply pointing the "whatever.com" symlink back to /website/r1v00
Of course, you can/should set up scripts to automate the creation and switching of the symlink. In my case, I have an "admin" page written in PHP that lists all the available versions, and allows me to switch to any of them. This technique has saved my bacon several times...!
Obviously this does not address any issues with versioning database schemas or database content.

Deploying to multiple servers

I have to deploy my php/html/css/etc code to multiple servers and i am looking at my options for software that allows easy and secure deployment to multiple servers.
Also helps if it could be tied into my SVN.
Any suggestions?

Capistrano is pretty handy for that. There's a few people using it (1, 2, 3) for deploying PHP code as evidenced by doing a quick search.

Setting up password-less publickey authentication with ssh would allow you to scp your files to any of your servers very quickly (or be automated by a shell script).
Here's a simple tutorial: http://rcsg-gsir.imsb-dsgi.nrc-cnrc.gc.ca/documents/internet/node31.html

If you're running on Redhat or Debian, consider packaging up your code into RPM's or Debs. Then build a yum or dpkg repository and put your packages there. You can then use your system's package management to do upgrades/rollbacks, etc. You can even use puppet to automate the process.
If you want to tie it into subversion, you can create a branch for each new version. Use the commit scripts to build the RPM's when a new branch shows up in a directory.

I'll second Capistrano. It's incredibly powerful and flexible. Our current project uses Capistrano for deploying to different servers as well as multiple servers. We pass two arguments to the cap command:
1) the name of the set of machine specific config options to run and
2) the name of the action to run
ends up looking like this:
cap -f deploy.rb live deploy
or
cap -f deploy.rb dev deploy
Of course the default use case - deploy to lots of machines at once - is a doddle with Capistrano AND you don't need to have Capistrano on the machines you are deploying to. All in all, tasty technology.

I've used Automated Build Studio before for a similar task. It gives you a lot of flexibility in what you can do.

I concur -- set your svn tree up, and use rsync over ssh to copy the tree out to the remote locations. rsync will make it fast and efficient, only copying changes rather than full files.
You want to export your svn tree to some directory, then rsync from there to the remote host's directory tree.

I also forgot to mention that if you use rsync, you can set up rsync to use ssh, so you will only transfer the files that have changed, saving on time and bandwidth.

You can also use kwateeSDCM which is free and allows remote installation via ssh. It also enables you to manage server-specific configuration from a central location and make upgrades seemless.

I had marked a post on how to deploy your websites using Subversion : http://blog.lavablast.com/post/2008/02/I2c-for-one2c-welcome-our-new-revision-control-overlords!.aspx

I found capistrano to be very easy to use once it's setup. The configuration file can be a bit confusing at first for more complicated environments but it soon becomes worthwhile. I deploy to 14 servers on production. I also use multiple environments for deployment to a staging server. One quirk, there's a bug in Ruby that breaks parallel deployment but serially isn't too bad with svn exports.

Capistrano setup is just too complicated. We found that KwateeSDCM was very straightforward to use with a simple web interface and no scripting. We've got our deployment configuration done in no time for Dev and QA configuration on windows and linux servers.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.