A product is a CMS under constant development, and new feature patches are always being rolled out. The product is hosted on multiple servers; for some of them I only have FTP credentials and database access, while for others I have full root access. Right now, as soon as a new feature is released and tested, I have to manually FTP the files and run SQL queries on every server. This is time consuming, error prone and inefficient. How can I make the process more robust and foolproof, or automate parts of it? The CMS is based on PHP, MySQL, JS, HTML and CSS. The admin part is common to all installations; skins and some custom modules are developed per client, and the only part we update is the admin.
Update
For managing code we use Git. SQL is not part of this Git structure, and I will be talking to the product manager/owners about putting it under version control as well.
This is one of the big questions of unpackaged code.
Personally, I have a PHAR which, when executed, extracts the code to the right folder and runs the needed queries.
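For the curious, building such a PHAR takes only a few lines; here is a minimal sketch (paths, folder names, and credentials are illustrative, and creating PHARs requires phar.readonly=0 in php.ini):

<?php
// build_deploy_phar.php - run on the build machine
$phar = new Phar('deploy.phar');
$phar->buildFromDirectory('build/'); // the files to ship
$phar->setStub(<<<'STUB'
<?php
// Runs on the target server: extract the files, then apply the queries.
$self = new Phar(__FILE__);
$self->extractTo('/var/www/admin', null, true); // overwrite existing files
$pdo = new PDO('mysql:host=localhost;dbname=cms', 'user', 'pass');
foreach (glob('/var/www/admin/sql/*.sql') as $file) {
    $pdo->exec(file_get_contents($file)); // assumes one statement per file
}
__HALT_COMPILER();
STUB
);

Running php deploy.phar on the server then performs both the extraction and the queries in one step.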
I've run web deployments across dozens of servers handling hundreds of millions of visitors/month.
SQL change management is always going to be a beast. Your only hope is either rolling your own in house (what I did) or using something like EMS DB Comparer.
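For reference, the roll-your-own route can be as small as a version table plus numbered change scripts. A minimal sketch, with illustrative table and file names:

<?php
// apply_changes.php - applies sql/0001.sql, sql/0002.sql, ... exactly once each
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->exec('CREATE TABLE IF NOT EXISTS schema_version (version INT NOT NULL)');
$current = (int) $pdo->query(
    'SELECT COALESCE(MAX(version), 0) FROM schema_version'
)->fetchColumn();
foreach (glob('sql/*.sql') as $file) {
    $version = (int) basename($file, '.sql');
    if ($version <= $current) {
        continue; // already applied on this server
    }
    $pdo->exec(file_get_contents($file)); // assumes one statement per file
    $pdo->exec("INSERT INTO schema_version (version) VALUES ($version)");
}

Run it on each server after the files are synced; each server records how far it has gotten, so re-running is harmless.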
To handle the file syncing, you're going to need a number of tools, all expertly crafted to work together, including:
Source code version control (bzr, svn, etc.) that is properly branched (stable, dev, and testing branches are required),
A Continuous Integration server,
SFTP support on every server,
Hopefully unit and integrative tests to determine build quality,
Rsync on every server,
Build scripts (I do these in Phing),
Deployment scripts (in Phing as well; a plain-PHP sketch follows this list),
Know-how.
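To make the rsync piece concrete, here is a minimal plain-PHP deployment sketch (host names and paths are illustrative; a real Phing build wraps this in targets, properties, and proper failure handling):

<?php
// deploy.php - push the built tree to each server over SSH
$servers = ['web1.example.com', 'web2.example.com']; // illustrative hosts
foreach ($servers as $server) {
    // -a preserves permissions/times, -z compresses, --delete removes stale files
    passthru("rsync -az --delete build/ deploy@$server:/var/www/admin/", $status);
    if ($status !== 0) {
        fwrite(STDERR, "Deployment to $server failed; aborting.\n");
        exit(1);
    }
}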
The general process takes approximately 20 hours to research thoroughly and about 40 hours to set up. When I drew up my documentation, there were over 40 distinct steps involved. Creating the SQL change management utility took another 20-30 hours. Add in another 10 hours of testing, and you're looking at a 100-120 hour project, but one which will save you considerably in botched deployments in the future as well as reduce deployment time to the time it takes to click a button.
That's why I do consultations on setting up this entire process, and it usually takes ~5 hours to set up on a client's network.
Related
I've checked other threads about this subject, but nothing really answers my exact question. I'm currently developing a site that updates data on a live web server (usually photos) multiple times daily using a cron and some external data feeds. Additionally, we're about to allow user-submitted data. Now, I want to start using svn because the project is getting big enough where it's starting to get relatively complicated and people are making edits to our code on the live server. I've installed svn for apache and all of the subdomains mentioned below are set up on one server using vhosts. We haven't started using this workflow yet, so if I could get some advice before we start it would be greatly appreciated.
Okay, so I've set up a repository under a subdomain (http://svn.website.com), and checked it out to a development subdomain (http://dev.website.com). I also plan to check out a copy to the live environment (http://www.website.com). The idea is to first develop on the dev subdomain, commit when it's ready to go live, and then svn up to the live environment. I've read the pros and cons between exporting and updating, and updating seems best for this project because we often commit small tweaks/code changes, and exporting seems like overkill. I have also set up the appropriate directives to avoid .svn directory access. Does it sound like I'm on the right track with this? Anything I should be aware of?
If the above is a good way to take care of everything, then we can move on to the second question. The user-submitted data (mainly photos) are uploaded to the live environment. I have to somehow manage this between the live and dev environments so we can develop with up-to-date user-submitted files. I don't really know the best way to do this... I feel like there's a serious flaw if I were to svn add and svn ci every time a file is submitted. What is the best way to accomplish this?
Sounds like you're growing!
First, I'd question whether SVN is really the best choice. My guess is that you know it already, have worked with it, as have others on your team, and you want to stick with it. That's fine, but just as an FYI: I've used both SVN and Git, and I found Git generally easier to use, and I would argue that it is becoming more and more commonly used by application developers.
As for your workflow, I used to go through an almost identical process with a large distributed team and it worked fine. Depending on the amount of traffic you get, I might suggest a different production deployment strategy. When you're ready for a new production push, clone into a new directory, labeled perhaps by date or milestone names, and create a new virtual host for that directory. That way, you get a couple benefits:
You don't run the risk of having someone access your application and request a file that is in the process of being updated
You can easily revert back to an earlier milestone by simply changing the active production virtual host. Bugs and problems make it out of development regardless of how thoroughly you test, and sometimes they're serious enough to warrant just reverting back to a previous codebase. Of course, schema changes would add complexity to this, but if we're just talking about code this should avoid frantic commit reverts at 2am.
This was suggested by Rasmus Lerdorf at a conference I attended when he was posed nearly the same question you're asking.
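A minimal sketch of that strategy, using a symlink swap so the virtual host itself never changes (its DocumentRoot points at /var/www/current; paths and the repository URL are illustrative):

<?php
// cut_release.php - check out a fresh copy and atomically switch to it
$release = '/var/www/releases/' . date('Y-m-d-His');
passthru('git clone --depth 1 file:///repos/site.git ' . escapeshellarg($release), $status);
if ($status !== 0) {
    exit(1);
}
// swap the symlink; rolling back is just pointing it at an older release
$tmp = $release . '.tmp-link';
symlink($release, $tmp);
rename($tmp, '/var/www/current'); // atomic replace on the same filesystem

Rolling back after a bad release is then a one-line symlink change rather than a frantic revert.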
As for ensuring developers are working with the most recent fileset, do you really want to do this? If you're processing a serious quantity of images, that's going to take up a lot of hard disk space. Sure, with a CDN or other mass storage it's no big deal, but do you want your developers to have to download 1000 new photos every time they svn up?
If you're dead set on doing this, I'd suggest adding some code within your app to automatically do an svn add and svn ci whenever svn status has a certain result (i.e. a script you could call with cron, or run every time new media gets added). You could also check into using svn hooks (Git has these as well) to do things like notify you that the repo has been updated.
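A minimal sketch of such a cron script, assuming the uploads directory is an SVN working copy and stored credentials allow non-interactive commits (paths are illustrative):

<?php
// commit_uploads.php - run from cron; adds and commits new user uploads
$wc = '/var/www/live/uploads';
$status = shell_exec('svn status ' . escapeshellarg($wc)) ?? '';
foreach (explode("\n", $status) as $line) {
    if (strncmp($line, '?', 1) === 0) { // '?' marks unversioned files
        $path = trim(substr($line, 1));
        shell_exec('svn add ' . escapeshellarg($path));
    }
}
shell_exec('svn commit -m "Auto-commit new uploads" ' . escapeshellarg($wc));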
Hope this helps. I'd strongly discourage versioning all of the files users upload unless you have a really, really strong reason for doing so. The complexity that's going to add is going to be a pain = )
Good luck and welcome to StackOverflow!
I've been reading this site here and there, and it appears you guys have a wonderful community.
As for my background, I am a sophomore at university familiar with SQL, C++, Visual Basic, and some PHP. One of my school projects for the summer term involves building a web application that allows users to log in and schedule specific timeslots over the internet. Typically, I have been the only person working on a project, but in this case I will be part of a group. Since we're all relatively new to working as a team, I would like to set up source control for my group so we're not all working off a shared drive somewhere. Additionally, I would like to make sure that all of us are able to test our changes in some sort of development server that hosts an instance of our website.
My actual question is in regards to the toolset that we should use to achieve this. As a group, we are most familiar with PHP and MySQL so we'll end up using that for the code and database. I have used SVN in the past for my own personal use, but my group members aren't very familiar with source control. We'll probably stick with something simple like Excel for the project management and bug tracking side of things. Ideally, we would like the tools to be free and open source.
How as a group should we manage the construction of the actual application? Are there methods out there that I can use that will allow any one of us to move the files to our development machine and keep track of who did it so we don't end up overwriting each other's changes? If this is not possible, one of us will write some scripts to handle it - but I would like to avoid building basically a separate software application that will only be used to manage our project. Another issue I foresee will be updating the database running on the development machine. Are there any standardised methods that we can use to manage our SQL scripts among the four of us?
I do not expect a really long-winded answer here (after all, this is our project!), but any helpful tips would be greatly appreciated. Once I return from holiday I am looking forward to getting started! Thanks!
I recommend your group use source control to synchronize your code. You can either set up your own server or just use a free provider such as GitHub, Google Code, or Bitbucket.
If you do decide to use one of these sites, a nice feature is that they provide free issue tracking as well, so you can use that instead of Excel.
The best way to manage the SQL scripts is to break them out into separate files and place them under source control as well. You can either create .sql files or use a tool to manage these changes - for example, have a look at Ruby on Rails' Migrations. This may take some effort to set up, but you'll thank yourself later if you are working on a project of any size...
1) Draw up a plan for how you would do it if it were just you.
2) Split the plan up into tasks that take around 3-4 hours to complete. Make sure each task has a measurable objective.
3) Divvy out the tasks. Try to sort them if possible to maximize developer efficiency.
4) Teach them to use source control. Explain to them that they will use this (maybe not svn, but SOMETHING) in a few years, so they might as well learn how now. Additionally, this will help in every group project they do down the road.
5) Make a script for building and running your tests. Also script your deployment. This will ensure you have the same mechanism going to live as you do going to test, which increases the number of defects found in testing (as opposed to letting them exist but go unfound until live).
6) You mentioned updating the development database. It would be entirely reasonable to dump the development database often with a refresh from live. You may want to make three environments: development, staging, and production. The development database would contain fabricated test data. The staging database would have a copy of live (recent to within a few days, maybe), and of course live is live. (A refresh sketch follows this list.)
7) Excel works fine as a "bug database." Consider putting it in source control that you manipulate and commit. This will give you a good idea of what happened over time, and you can correct mistakes quicker.
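A minimal sketch of the staging refresh from point 6, with placeholder hosts and credentials (it assumes mysqldump and mysql are on the PATH):

<?php
// refresh_staging.php - run from cron every few days
$dump = '/tmp/live_snapshot.sql';
passthru('mysqldump -h live-db.example.com -u readonly -pSECRET app > ' . $dump, $status);
if ($status !== 0) {
    exit(1);
}
passthru('mysql -h staging-db.example.com -u staging -pSECRET app_staging < ' . $dump);
// Consider scrubbing personal data here before developers get the copy.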
As far as source/version control goes, I would recommend Subversion. There are some GUI tools they might use, or even WebDAV to access the SVN repository. This will allow users to edit files collaboratively and also give you details as to who edited what, when, and why... SVN will also do a pretty good job of merging files that happen to be saved at the same time.
It's not the easiest concept to wrap your head around, but it's not very complicated once you get running.
I suggest having everyone read the first chapter from: http://svnbook.red-bean.com/en/1.5/
and they should have a good idea of what's happening.
I am also curious to see what people have to say about the database
How as a group should we manage the construction of the actual application? Are there methods out there that I can use that will allow any one of us to move the files to our development machine and keep track of who did it so we don't end up overwriting each other's changes?
It sounds like you're looking for build management. In the case of PHP, a true "build" is as simple as a collection of source files because the language is interpreted; there is no compilation.
It just so happens that I am one of the developers for BuildMaster, a tool which basically solves every problem you have listed in your question... and it also sounds like it would be free in your case under the Community Edition license. I'll try to address some of your individual pain points and how BuildMaster could be used as a solution.
Source Control
As suggested by others, you must use it. The trick when it comes to deployment is to set up some form of continuous integration so that every time someone checks in, a new "build" is created. In BuildMaster, you can set this up for any source control provider you want.
Issue/Bug Tracking
Excel will work, but it's not an optimal solution. There are plenty of free issue tracking tools you can use to manage your bugs and features. With BuildMaster, you can link your bugs and features list with the application by their release number so you could view them within the tool at any time. It can also modify issue statuses and add descriptions automatically if you want.
Deployments
Using BuildMaster, you can create automated deployment plans for your development environment, e.g.:
Get Latest Source Code
Create Artifact
Copy Files To Development Machine
Deploy Configuration Files
Update Database
The best part is, once you set these up for other environments (glowcoder's point #6), pushing all of your code and database updates is as simple as clicking a button.
Another issue I foresee will be updating the database running on the development machine. Are there any standardised methods that we can use to manage our SQL scripts among the four of us?
Database Updates
Not surprisingly, BuildMaster handles these as well by using the change scripts module. When a member of your team creates a script (e.g. ALTER TABLE [Blah] ADD [NewColumn] INT NOT NULL), he can upload it into BuildMaster, then run it on any environment you have created.
The best part is that you can add a step in your automated deployment and never worry about it again. As Justin mentions, you can use .sql files for your object code (stored procedures, views, triggers, etc.) and have those executed on every build since they are essentially code anyway. You can keep those in source control.
Configuration Files
One aspect of all this you may have neglected (but will inevitably run into) is dealing with configuration files. With PHP, you may have an .htaccess file, a php.ini file, a prepend.php, or roll your own custom config file. Since by definition configuration files need to change between your personal machine and the development machine, grabbing them from source control wouldn't necessarily work without a bit of hacking a la:
// assuming DEV and PROD are boolean constants defined elsewhere
if (DEV) {
    // do one thing (e.g. point at the local database)
} elseif (PROD) {
    // do another (e.g. point at the production database)
}
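A common low-tech alternative to that branching, independent of any particular tool, is one settings file per environment, selected at runtime. A minimal sketch with illustrative file names:

<?php
// config.php - picks the right settings file for the machine it runs on
$env = getenv('APP_ENV') ?: 'dev'; // e.g. set APP_ENV=prod on the live box
require __DIR__ . "/config.$env.php"; // config.dev.php, config.prod.php, ...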
With BuildMaster, you can templatize your configuration files and associate them with an environment so they can be deployed automatically. It will also maintain a history of changes for you.
Automated Testing
If you want the full ALM effect, you can automatically unit test your code during an automated build and be notified if anything fails, so you know as soon as possible that something is broken.
Apologies for the "long-winded" response, but I feel like you're already ahead of the game by observing the problems you might run into in the future, and I really believe BuildMaster will make all of this deployment stuff simple for your team so you can focus on the fun part: coding!
I would like some input on how a professional development setup with the following requirements might look.
several PHP developers
each developer belongs to one group
each group has one team-leader who delegates tasks
each developer works on one Windows 7 machine
and develops with either NetBeans or Eclipse
each developer 'owns' one virtual test-server where he can run the code
the VCS in use is SVN
there is a staging server where the product is ultimately tested before it gets released/deployed
I mentioned specific technologies so as not to be too abstract, and because I would also be interested in concrete suggestions for plug-ins etc.
There are several questions coming to my mind in that setup.
1) So every developer will work on a personal branch.
2) This branch is checked out in a working copy.
Now ... this working copy is edited locally on the PC with the dev's IDE and executed/tested on the server.
What would be in that case the best/usual way to do that? I mean - how do you get your edited code on the server without causing too much overhead?
Would the dev have the code on his local disk at all? Or would it be better to have the IDE write to the remote virtual server through a tunnel or via a specific protocol?
3) Every day a dev will commit his work into his personal branch which resides in a central repository.
Is there a best practice on where the repository is supposed to be located? A separate server?
4) Then, after a dev finishes his task, either he or the team-leader merges the new code into the respective main branch or trunk.
The most confusing part is what I wrote between 2) and 3), because so far I have only worked with a local server - for example, a VM running a server whose code is located in a shared folder, so I am able to edit it directly. I'm not sure how to bridge the gap efficiently when the server is actually remote. Efficiently means not having to upload manually via FTP, for example.
Also external sources or book recommendations are welcome.
edit
My questions are aimed at a quasi-standard / best practice. I think this is a pretty standard development scenario, so there must be a 'usual' solution.
edit 2
Okay ... so let's try with a picture:
V is the virtual test-server for one or more developers D. C and C' are the two code-versions. They should be kept as identical as possible.
Two solutions come to my mind:
1 : Edit C, then upload it to C', then execute C', then commit C.
2 : There is no C; just C', which is edited through some tunnel technology, executed, and committed.
My gut tells me both solutions are semi-optimal. So what would be the "professional" / most efficient / fastest / most convenient / most friction-less / least error-prone / best-practice / industry-standard way?
Any questions?
Maybe it's not of great help, but Git sounds like a perfect fit for your problems; I recommend taking a look at the Git feature set. And if you have time, check out Linus Torvalds himself talking about Git: http://www.youtube.com/watch?v=4XpnKHJAok8
The standard procedure is more or less as you describe. I also use this approach with my team. It can also be called staged application development.
Here is how I am doing it: I use a remote SVN host (e.g. assembla.com, unfuddle.com) to store all my code, and my team members commit their work to these remote SVN servers. You can also buy a VPS, set up SVN there, and use the same approach.
Best practice is to test locally and commit as often as you can, but every commit must solve a problem or add a significant piece of a new feature.
Once everyone has committed, the lead developer can log in to the staging server via SSH using a tool like PuTTY. The first time, the lead developer has to check out the code into the folder where the code is to be located. During this phase, file conflicts may arise if multiple developers have edited the same segment of a file; the lead developer should resolve those first and then proceed with the checkout. From then on, the lead developer only needs to run svn update on the staging server to bring the code up to date.
The basic idea is to get the code working on the local setup, then commit and update staging to test the application in a simulated scenario, and then push it to the live site.
There are a lot of ifs and buts here that would need a whole chapter to cover :) but in short, that is the gist.
Tools (you can use to work under this setup):
- TortoiseSVN
- PuTTY
- NetBeans
hope it helps :)
I don't like working with personal branches. I worked with ClearCase for almost 15 years, and even though ClearCase probably handles personal branching better than most, it was still a big pain. Even worse, personal branches encourage people not to commit their work until the last minute -- usually a day or two before a major release.
For that reason, and to force developers to stay on track with each other, I highly recommend everyone working together on a single branch (or on the trunk) as much as possible. I keep telling developers to take small bites when they make changes.
What you sound like you need is a way to automate the deployment. That is, I make changes on my local machine, and with a single command, I make sure that the server has a duplicate copy of the code. You also want the deployment to be efficient: if you change a single 2-kilobyte file in a 2-gigabyte, 10,000-file deployment, you only want to copy over that one file, not the whole 2 gigabytes. For that, I would recommend you write a deployment script in Ant.
Your developers can modify files, then deploy those files via an Ant script. The developers don't have to remember what files they had updated because Ant will automatically handle that. In fact, Ant can even modify files to make sure they contain the right environment information as they get copied over. And, of course, Ant can rearrange the files if the setup on the server is different from the setup in the source repository. And both Netbeans and Eclipse can execute Ant scripts right in the IDE.
So:
Have your developers modify code on their local machine.
Run an Ant script to make sure the server and the local machine are in sync.
Test on the server.
Then, check in their changes once they're happy with the results on the server.
Someone mentioned a continuous build system like Jenkins. That actually would be a good idea anyway, even though it doesn't solve this particular issue. Jenkins could have its own server and database. Then, when you commit your code, Jenkins would update the server and run automated tests. Jenkins can then create a report. It all gets displayed on the Jenkins web page. Plus, you can archive your deployments on Jenkins, so if you tell someone to test "Build #20", they can simply pull it off of Jenkins, where it's easy to find.
I'm sure everyone has different ways of doing things but here are my thoughts.
"Best Practice" is probably "Continous Integration" ie each developer doesn't have their own branch but checks in to a common development branch. This forces them to handle conflicts and coordinate with each other early and often to avoid the lead developer from managing a huge train wreck merge later down the road. Take a look at cruisecontrol if you really want to go that route.
The best way is if they have a local Apache web server and a full PHP stack. You can use the Zend Server Community Edition to get up and running on Windows fast. Most standard PHP code will run just fine on both Windows and Linux, but if you are doing lots of file manipulation, cron jobs, CLI stuff, or need memcache etc., you'll run into incompatibilities. If that's the case, and the Linux-only stuff is going to bite you, use VMware or VirtualBox to run local Linux instances, install the IDE inside those, and just make sure they have gobs of RAM to deal with it.
Each developer needs to run a synchronize inside Eclipse (basically an svn update), deal with any conflicts with the other developers right then and there, do local testing, and commit their changes.
I set up a post-commit hook on the SVN server that calls /autobuild.php on my web server. autobuild.php runs svn update to get the latest code changes, does any chown or chmod file-permission fixes, and resets any server-specific config files such as config.php. It's a little tricky to set it up so that the apache user can run svn update, but once you do, your beta/testing server always has the latest committed code. CruiseControl and several other tools can also help you do this sort of thing and add unit testing, etc.
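For reference, the autobuild script can stay very small. A minimal sketch, assuming the post-commit hook requests this URL with wget or curl, and with paths adjusted to your layout:

<?php
// autobuild.php - hit by the SVN post-commit hook; refreshes the beta server
$root = '/var/www/beta';
echo '<pre>' . shell_exec("svn update --non-interactive $root 2>&1") . '</pre>';
// restore the server-specific config the update may have clobbered
copy("$root/config.server.php", "$root/config.php");
// re-apply the permissions the web server needs
shell_exec("chmod -R u+rwX,go+rX $root/cache $root/uploads");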
Now your lead developer still has a job to do in merging the development branch into the production one, testing on the dev server, reviewing the commits of the others, and deciding how and when to push out a release, but you're not putting the burden on him of resolving every conflict and merging every change.
Your developers are not FTPing files or SSHing into servers; they just work locally in their IDE and interact with each other through svn (and email, phone, chat, etc.), updating to get new code and committing as they finish things.
I don't see any good coming out of having a separate branch for each developer in SVN. Merging those branches might work in Git, but with SVN your lead developer will be hating life very quickly with that type of setup.
I am a lone developer most of my time, working on a number of big, mainly PHP-based projects. I want to professionalize and automate how changes to the code base are handled, and create a Continuous Integration process that makes the transition to work in a team possible without having to make fundamental changes.
What I am doing right now: I have a local test environment for every project; I use SVN for each project; changes are tested locally and then transferred to the online version, usually via FTP. API documentation is generated manually from the source code; unit tests are something I am getting into slowly, and they're not yet part of my daily routine.
The "build cycle" I am envisioning would do the following:
A changeset gets checked into SVN after having been tested locally.
I start the build process. The SVN HEAD revision gets checked out, modified if necessary, and made ready for upload.
API Documentation gets generated automatically - if I haven't set it up in detail yet, using a default template, scanning the whole code base.
The new revision is deployed to the remote location via FTP (including some directory renaming, chmodding, importing databases, and the like). This is something I already like phing for very much, but I'm open to alternatives, of course.
Unit tests residing in a predefined location are run. I am informed of their failure or success by e-mail, RSS, or (preferably) HTML output that I can grab and put onto a web page.
(optionally) an end-user "changelog" text file in a pre-defined location gets updated with a pre-defined part of the commit message (e.g. "It is now possible to filter for both 'foo' and 'bar' at the same time."). This message is not necessarily identical to the SVN commit message, which probably contains much more internal information.
Stuff like code metrics, code style checking and so on are not my primary focus right now, but in the long run they certainly will be. Solutions that bring this out of the box are very kindly looked upon.
I am looking for
Feedback and experiences from people who are or were in a similar situation, and have successfully implemented a solution for this
Especially, good step-by-step tutorials and walkthroughs on how to set this up
Solutions that provide as much automation as possible, for example by creating a skeleton API, test cases and so on for each new project.
and also
Product recommendations. What I know so far is phing/ant for building, and phpUnderControl or Hudson for the reporting part. I like them all as far as I can see, but I have of course no detailed experience with them.
I am swamped with work, so I have a strong inclination towards simple solutions. On the other hand, if a feature is missing, I'll cry about it being too limited. :) Point-and-click solutions are welcome, too. I am also open to commercial product recommendations that can work with PHP projects.
My setup
I am working on Windows locally (7, to be exact) and most client projects are run on a LAMP stack, often on shared hosting (= no remote SSH).
I am looking for solutions that I can run in my own environment. I am ready to set up a Linux VM for this, no problem. Hosted solutions are interesting for me only if they provide all of the aspects described, or are flexible enough to interact with the other parts of the process.
Bounty
I am accepting the answer that I feel will give me the most mileage. There is a lot of excellent input here, I wish I could accept more than one answer. Thanks everyone!
I've been through buildbot, CruiseControl.net, CruiseControl, and Hudson. Although I really liked CruiseControl*, it was just too much of a hassle with really complex dependency cases. buildbot is not easy to set up, but it's got a nice aura (I just like Python, that's all). But Hudson won over the former three because:
It's just easy to set up
It's easy to customize
It looks good and got nice overview functionality
It has point-and-click updates, for itself and all installed plugins. This is a really nice feature that I appreciate more and more
Caveat: I have only ever used Linux as the base for the above-mentioned build servers (CC.net ran on Mono), but they should all - according to the docs - run cross-platform.
Setting up a hudson server
Prerequisites:
Java (1.5 will serve you just fine)
Read access to the subversion server (I have a separate account for the hudson user)
From here, it's just:
java -jar hudson.war
This will run a small server instance right off your console, and if everything went well in the 'installation' process, you should be able to browse the installation at http://localhost:8080, provided you don't have anything else running on that port (you can specify another port by passing the --httpPort=ANOTHER_HTTP_PORT option to the above command).
If you go to the available plugins page (http://localhost:8080/pluginManager/available), you'll find plugins supporting the tasks you mentioned above (Subversion support is installed by default).
If that has whetted your appetite, you should install a Java application server, such as Tomcat or Jetty. Installation instructions are available for all major application servers
Update: Kohsuke Kawaguchi has constructed a windows service installer for hudson
Setting up a project in hudson
The links in the following walk-through assumes a running instance of hudson located at http://localhost:8080
Select new Job (http://localhost:8080/view/All/newJob) from the menu on the left
Give the job a name and tick Build a free-style software project on the list
Pressing 'ok' will take you to the configuration page of the job. All the options have a little question mark beside them; pressing it brings up a help text regarding that option.
Under the option group 'Source Code Management' you would be using Subversion. Hudson accepts both url access as well as local module access
Under the option group 'Build Triggers', you would use 'Poll SCM'. The syntax used here is that of cron, so polling the subversion repository every 5 minutes would be */5 * * * *
The process of building the project is specified under the option group 'Build'. If you already have an ant build file with all the targets you need, you're in luck. Just choose 'Invoke ant' and write the name of the target. The option group supports maven and shell commands as well out of the box, but there is also a plugin available for phing.
Tick off additional build actions in 'Post Build Actions', such as e-mail notifications or archiving of build artefacts.
For setting up processes for which Hudson has no plugins, you can either call them directly through a shell script from within the build setup, or you could write your own plugin
Pitfalls:
If you have it produce build artefacts, remember to have Hudson clean up after itself at regular intervals.
If you have more than 20 projects set up, consider not displaying their build status as the default main page on hudson
Good luck!
The term you are looking for is "continuous integration."
Here is an example of someone who uses GIT + phpundercontrol: http://maff.ailoo.net/2009/09/continuous-integration-phpundercontrol-git/
CruiseControl (which is a CI server) can use hosted SVN/Git as a source, so you can even use it with GitHub, Beanstalk, or something else.
Then you can integrate that with the following kind of software:
PHPUnit
php-codesniffer
phpdocumentor
PHP Gcov
PHPXref
Yasca
etc.
You could also try this hosted CI: http://www.php-ci.net/hosting/create-project
Keep in mind though, that those tools need custom support if you integrate them yourself.
Have you also thought about project management and patch management?
You can use Redmine for project management. It has integrated continuous integration support, but only on the client side (not as a CI server).
Try using a hosted SVN/GIT/etc. solution, because they will cover your backups and keep their servers running, so you can focus on development.
For a tutorial on how to setup Hudson, see: http://toptopic.wordpress.com/2009/02/26/php-and-hudson/
I use Atlassian's Bamboo continuous integration server for my main PHP project (along with their other products, such as FishEye (repository browsing), JIRA (issue tracking) and Clover (code coverage)).
It supports SVN and now Git, and it has a great user interface. It is available for Linux, Windows and Mac, and can run standalone on its own Tomcat server, which is great for people (like me) who do not like to take days to set up their tools. Although it may look expensive, being a lone developer myself I purchased the starter kit license for $10 ($10 per product). This is great for small teams and it is worth a look.
PHPTesting's PHPCI is a nice continuous integration server built in PHP.
Plus, it's free and open source. :)
PHPCI includes integration plugins for:
Atoum
Behat
Campfire
Codeception
Composer
Email
Grunt
IRC
PHP Lint
MySQL
PDepend
PostgreSQL
PHP Code Sniffer
PHP Copy/Paste Detector
PHP Spec
PHP Unit
Shell Commands
Tar / Zip
I am mostly a sysadmin, but sometimes I code PHP as well. As a side project I created some scripts that make it simple and painless to set up a full-blown PHP CI environment using Jenkins. It also runs a sample project for you so you can see how each build step is configured.
If you want to try it out all you need is a Debian/Ubuntu box and shell access.
http://yauh.de/articles/379/setting-up-a-ci-environment-for-php-projects-using-jenkins-ci
Update: To add some content to my answer:
You can simply set up Jenkins CI for PHP using Ansible. Since v1.4, Ansible supports roles, which you can download from its galaxy.ansibleworks.com community site, and one of them will do the heavy lifting for you. It is called jenkins-php.
I would suggest using Jenkins (http://jenkins-ci.org/); it's free and it's open source.
It's pretty straightforward to set up, works on multiple platforms, and integrates well with other continuous integration tools like SonarQube (+ SQUALE) to measure technical debt and Thucydides for test automation.
I would highly suggest using Git or GitHub for version control instead of SVN. From my point of view it's just a better version control system that will help you scale your development efforts later.
Since you're working mostly with PHP projects, there are some other tools you can use:
PHPUnit - For unit testing
PHP CodeSniffer - Check for coding standards
PHP Depend - Shows your PHP code dependencies
Xdebug - For debugging and performance profiling
All of these tools can be triggered by a Jenkins job, and they help with the quality and performance of your code.
Good luck and Enjoy!
I do not use many of the products, or even types of products that you use, but I will give you my experience.
I run a TEST environment in parallel with my PROD environment. I have no local testing per se. If it is too hard to get something up into a real TEST environment, then I fix my build process. I don't see the point in testing locally, as the environments are different. UPDATE: The only thing I do locally is run "php -l" before I upload anything. Stops the stupid mistakes.
The build process works with whatever is in the current workspace, which includes uncommitted code. This is not everyone's cup of tea, but I am going to TEST very often. Everything gets committed before going to PROD.
Part of my build process (similar to yours) creates two META files. One contains the last (typically) 100 changes and also gives me the current changelist number; this shows me what changes are installed. The other contains the CLIENTSPEC (in Perforce terms), which shows me exactly what branches were used in this build. Together these give me reproducible builds.
I do not build straight to the target environment, but to a staging area on the server. I use SSH, so this makes sense. This gives me a few advantages. Most importantly, it avoids dying halfway through a large upload. It also gives me a place to store META files, and all the build files are automatically archived (so I can go straight back to any build). The script also logs the update (so there is an entry in the log stream and I can see pre- and post-) and kicks all daemons (I use daemontools, so "svc -t"). All of these are better off on the target machine.
One other issue is DB changes. I keep a master script of the DB schema, which I update every time the schema changes. Each change also goes into a changes.sql script, which is uploaded with the build to the staging area and run as part of the install script.
I've recently begun the same kind of process, and am using Beanstalk for svn hosting.
There are two nifty features in the paid accounts (which start at $15pm, I think):
deployment allows the user to create FTP targets for staging and production servers, which can be deployed to at the click of a button (including specifying a revision and branch)
webhooks allow the user to set up a URL that is called on each commit/deploy, passing across things like revision number, description, and user. This could be used to update docs, run unit tests, and update changelogs (a receiver sketch follows this list).
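A minimal sketch of such a receiver (the POST field name and the JSON keys vary by provider, so treat these as assumptions and check Beanstalk's documentation for the exact payload shape):

<?php
// webhook.php - the URL called after each commit/deploy
$payload  = json_decode($_POST['payload'] ?? '{}', true);
$revision = $payload['revision'] ?? 'unknown';
$author   = $payload['author'] ?? 'unknown';
$message  = $payload['message'] ?? '';
// append to a changelog; kicking off tests or doc builds would go here too
file_put_contents('/var/www/changelog.txt',
    date('c') . " r$revision by $author: $message\n", FILE_APPEND);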
I'm sure there are other hosted or self-hosted SVN servers with these two features, but Beanstalk is the one I have experience of, and it's working very, very well
There's also an API, which I imagine could be used to integrate deployment further in to your process.
Consider fazend.com, a free hosted CI platform, which automates configuration and installation procedures. You don't need to set up version control, bug tracking, a CI server, a test environment, etc. Everything is done on demand.
It is pretty standard practice now for desktop applications to be self-updating. On the Mac, every non-Apple program that uses Sparkle is, in my book, an instant win. For Windows developers, this has already been discussed at length. I have not yet found information on self-updating web applications, and I hope you can help.
I am building a web application that is meant to be installed like Wordpress or Drupal - unzip it in a directory, hit some install page, and it's ready to go. In order to have broad server compatibility, I've been asked to use PHP and MySQL -- is that **MP? In any event, it has to be broadly cross-platform. For context, this is basically a unified web messaging application for small businesses. It's not another CMS platform, think webmail.
I want to know about self-updating web applications. First of all, (1) is this a bad idea? As of Wordpress 2.7 the automatic update is a single button, which seems easy, and yet I can imagine so many ways this could go terribly, terribly wrong. Also, isn't the idea that the web files are writable by the web process a security hole?
(2) Is it worth the development time? There are probably millions of WP installs in the world, so it's probably worth the time it took the WP team to make it easy, saving millions of man hours worldwide. I can only imagine a few thousand installs of my software -- is building self-upgrade worth the time investment, or can I assume that users sophisticated enough to download and install web software in the first place could go through an upgrade checklist?
If it's not a security disaster or waste of time, then (3) I'm looking for suggestions from anyone who has done it before. Do you keep a version table in your database? How do you manage DB upgrades? What method do you use for rolling back a partial upgrade in the context of a self-updating web application? Did using an ORM layer make it easier or harder? Do you keep a delta of version changes or do you just blow out the whole thing every time?
I appreciate your thoughts on this.
Frankly, it really does depend on your userbase. There are tons of PHP applications that don't automatically upgrade themselves. Their users are either technical enough to handle the upgrade process, or just don't upgrade.
I propose two steps:
1) Seriously ask yourself what your users are likely to really need. Will self-updating provide enough of a boost to adoption to justify the additional work? If you're confident the answer is yes, just do it.
Since you're asking here, I'd guess that you don't know yet. In that case, I propose step 2:
2) Release version 1.0 without the feature. Wait for user feedback. Your users may immediately cry for a simpler upgrade process, in which case you should prioritize it. Alternately, you may find that your users are much more concerned with some other feature.
Guessing at what your users want without asking them is a good way to waste a lot of development time on things people don't actually need.
I've been thinking about this lately in regards to database schema changes. At the moment I'm digging into WordPress to see how they've handled database changes between revisions. Here's what I've found so far:
$wp_db_version is loaded from wp-includes/version.php. This variable corresponds to a Subversion revision number, and is updated when wp-admin/includes/schema.php is changed. (Possibly through a hook? I'm not sure.) When wp-admin/admin.php is loaded, the WordPress option named db_version is read from the database. If this number is not equal to $wp_db_version, wp-admin/upgrade.php is loaded.
wp-admin/includes/upgrade.php includes a function called dbDelta(). dbDelta() scans $wp_queries (a string of SQL queries that will create the most recent database schema from scratch) and compares it to the schema in the database, altering the tables as necessary so that the schema is brought up-to-date.
upgrade.php then runs a function called upgrade_all(), which runs specific upgrade_NNN() functions if $wp_db_version is less than the target values. (E.g. upgrade_250(), the WordPress 2.5.0 upgrade, will be run if the database version is less than 7499.) Each of these functions runs its own data migration and population procedures, some of which are also called during the initial database setup script. This nicely cuts down on duplicate code.
So, that's one way to do it.
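Distilled into a generic sketch (the identifiers below are illustrative, not WordPress's actual names), the version-gated upgrade pattern looks like this:

<?php
// upgrade.php - run targeted upgrade steps until the stored version catches up
const APP_DB_VERSION = 7499; // bump whenever the schema changes

function upgrade_7499(PDO $db): void {
    // illustrative migration step
    $db->exec('ALTER TABLE posts ADD archived TINYINT NOT NULL DEFAULT 0');
}

function maybe_upgrade(PDO $db): void {
    // assumes the options row was created during initial install
    $stored = (int) $db->query(
        "SELECT value FROM options WHERE name = 'db_version'"
    )->fetchColumn();
    if ($stored >= APP_DB_VERSION) {
        return; // already up to date
    }
    if ($stored < 7499) {
        upgrade_7499($db);
    }
    $db->prepare("UPDATE options SET value = ? WHERE name = 'db_version'")
       ->execute([APP_DB_VERSION]);
}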
Yes, it would be a security issue if PHP went and overwrote its files from some place on the internet with no warning. There's no guarantee that the server is connecting correctly to your update server (it might download code crafted by someone else if DNS poisoning occurred), giving someone else access to your clients' data. Therefore digital signing would be important.
The user could control updates by setting permissions on the web directory so that PHP only has read access to the files - this procedure could simply be documented with your program.
One question remains (I really don't know the answer to): can PHP overwrite files if it's currently using them (e.g. if the update.php file itself needed to be updated)? Worth testing.
I suppose you've already ruled this out, but you could host it as a service. (Think wordpress.com)
I'd suggest that you package your application with pear and set up a channel. Your users can then upgrade the application through a standard interface (pear). It's not entirely automatic (unless the users have some kind of automation running on top of pear), but it's standard, so any sysadmin can maintain it.
I think your best option is an update-checking mechanism that alerts the administrator when there are updates.
As you mention, there are a number of potential security problems. Due to those alone, I would suggest not doing this. Instead, try creating a fairly smart upgrading script.
Just my 2 cents: I'd consider an automatically self-updating application within my CMS a security hole, so if you decide to code this feature, you should consider implementing different levels of this behavior:
Automatically update
Check for updates and notify
Disable