Help me improve my continuous deployment workflow - php

I've been developing a workflow for practicing a mostly automated continuous deployment cycle for a PHP project. I'd like some feedback on possible process or technical bottlenecks in this workflow, suggestions for improvement, and ideas for how to better automate and increase the ease-of-use for my team.
Core components:
Hudson CI server
Git and GitHub
PHPUnit unit tests
Selenium RC
Sauce OnDemand for automated, cross-browser, cloud testing with Selenium RC
Puppet for automating test server deployments
Gerrit for Git code review
Gerrit Trigger for Hudson
EDIT: I've changed the workflow graphic to take ircmaxwell's contributions into account by: removing PHPUnit's extension for Selenium RC and running those tests only as part of the QC stage; adding a QC stage; moving UI testing after code review but before merges; moving merges after the QC stage; moving deployment after the merge.
This workflow graphic describes the process. My questions / thoughts / concerns follow.
My concerns / thoughts / questions:
Overall difficulty using this system.
Time involvement.
Difficulty employing Gerrit.
Difficulty employing Puppet.
We'll be deploying on Amazon EC2 instances later. If we're going about setting up Debian packages with Puppet and deploying to Linode slices now, is there a potential for a working deployment on Linode to break on EC2? Should we instead be doing our builds and deployments on EC2 from the get-go?
Another question re: EC2 and Puppet. We're also considering Scalr as a solution. Would it make sense to avoid the overhead of Puppet for this alone and invest in Scalr instead? I have a secondary (ha!) concern here about cost; the Selenium tests shouldn't run so often that EC2 build instances would be running 24/7, but for something like a five-minute build, paying for an hour of EC2 usage seems a bit much.
Possible process bottlenecks on merges.
Could "A" be moved?
Credits: Portions of this workflow are inspired by Digg's awesome post on continuous deployment. The workflow graphic above is inspired by the Android OS Project.

How many people are working on it? If you only have maybe 10 or 20 developers, I'm not sure it will make sense to put such an elaborate workflow into place. If you're managing 500, sure...
My personal feeling is KISS: Keep It Simple, Stupid... You want a process that's both efficient and, more importantly, simple. If it's complicated, either nobody is going to do it right, or over time parts will slip. If you make it simple, it will become second nature, and after a few weeks nobody will question the process (well, the semantics of it anyway)...
And the other personal feeling is always run all of your UNIT tests. That way, you can skip a whole decision tree in your flow chart. After all, what's more expensive, a few minutes of CPU time, or the brain cycles to understand the difference between the partial test passing and the massive test failing. Remember, a fail is a fail, and there's no practical reason that code should ever be shown to a reviewer that has the potential to fail the build.
Now, Selenium tests are typically quite expensive, so I might agree to push those off until after the reviewer approves. But you'll need to think about that one...
Oh, and if I was implementing this, I would put a formal QC stage in there. I want human testers to look at any changes that are being made. Yes, Selenium can verify the things you know about, but only a human can find things you didn't think of. Feed back their findings into new Selenium and Integration tests to prevent regressions...

Important: make your tests extremely fast, i.e. no I/O, and with the ability to run tests in parallel and distributed. I don't know how applicable this is to PHP, but if you can test units of code with an in-memory DB and mock the environment, you'll be better off.
If you have QA/QC or any human in the path between a commit and production, you will have a problem getting to full continuous deployment. The key is to trust your testing, monitoring, and automated response (immune system) enough to eliminate error-prone processes involving humans from your system.

All handovers between functions have the effect of slowing things down, and with that, an increase of the amount of change (and hence risk) that goes in to a deployment.
Manual quality gates are by definition an acceptance that quality has not been built in from the start. The only reason code needs to be reviewed later is because there is some belief that the quality is not good enough already.
I'm currently trying to remove formal code review from our pipelines for exactly this reason. It causes feedback delays, and quoting Martin Fowler:
"The whole point of Continuous Integration is to provide rapid feedback. Nothing sucks the blood of a CI activity more than a build that takes a long time. "
Instead I'd like to make code review something that submitters request if required, or otherwise is done at the time of coding by team members, perhaps a la XP pair programming.
I think it should be your goal that once the code is merged to source control, that there is absolutely no more manual intervention.

I don't know whether this is relevant to PHP, but you can replace at least some of the code review stage with static analysis.
The quality of code reviews depends on the quality of the reviewers, while static analysis relies on best practices and patterns, and is fully automatic. I'm not saying that code reviews should be abandoned; I simply think much of the work can be done offline.
See
http://en.wikipedia.org/wiki/Static_code_analysis
http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
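For PHP specifically, wiring static analysis into a CI build step might look like the following. This is a sketch, not from the answer: PHP_CodeSniffer and PHPMD are my choice of tools, and the `src/` path and ruleset names are illustrative.

```shell
# Possible static-analysis build step for a PHP project.
# Assumes PHP_CodeSniffer (phpcs) and PHPMD (phpmd) are installed.
phpcs --standard=Zend src/                    # coding-standard violations
phpmd src/ text codesize,unusedcode,design    # complexity and design smells
```

Both tools exit non-zero on violations, so a CI server such as Hudson can fail the build automatically when the analysis finds problems.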

Related questions

Planning Ahead For Website Upgrades

I've noticed while developing my first site, that the smallest changes in database columns, connection options, and various other components cause the website to fail until I correct the problem which may or may not require a lot of time (woe is me). I'm wondering what steps I can take now in order to prevent these headaches, as I've delayed the website launch in order to keep upgrading. Once I'm done implementing the changes I would like to see, I know I won't truly be done, but at some point I have to move on to the next stage.
Yes, I know there is probably no one good solution, and ultimately a self-correcting design is more trouble than it's worth at this point. But if any greybeards have tips they could offer based on their own experiences with web development, particularly with LAMP stacks, I would greatly appreciate it.
Specifically, I would like to know what to look out for when modifying databases and website code after customer information is in active use, in order to prevent errors, and how to roll out the changes.
EDIT 1:
Yes, so the answer seems to be that I need to copy the live site to my testing environment. I'm looking into some of the already-suggested development solutions. Regular backups are crucial, but I can just see inserting new columns and modifying queries becoming a cause of mis-ordered tables and such. "That's where being a good programmer and testing diligently comes in handy," someone in the corner said. As I look into the proposed solutions, I welcome all others in the meantime. A real-time copy of the live site would be nice to create on the fly during testing.
The above answers are all very valid and in the end, they represent your target solution.
In the meantime, you may already do a lot for your website, even with a gradual migration to those practices.
In order to do so, I suggest you install PHPUnit (or whichever xUnit tool comes with the languages you use). There are also "graphical" versions of it, like VisualPHPUnit, if that's more to your taste.
These tools are not the permanent solution. You should actually aim at adding them to your permanent solution, that is, a proper development server setup, etc.
However, even as an interim solution they help you reach a fairly stable degree of quality for your software components and avoid 80-90% of the surprises that come with coding on a live server.
You can develop your code in a separate directory and test it before you move it into production. You can create mock objects which your code under test may freely interact with, without fear of repercussions. Your tests may load their own alternate configuration so they work on a second, throwaway copy of the database.
Moving even further, you may include the website itself in your tests. There are several applications, like Selenium, that allow you both to automate and to test your production website, so you can reliably know that your latest changes did not negatively affect your website's behavior.
In short, while you should certainly aim at getting a proper development environment running, you can do something very good even today, with a few hours of study.
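The "copy database" idea above can be sketched in plain PHP with SQLite in memory via PDO. The function name and schema below are placeholders of my own; PHPUnit would formalize this into a proper test case.

```php
<?php
// Sketch: point the code under test at an in-memory copy database
// instead of the live one. getActiveUsers() and the schema are
// illustrative placeholders, not from the original post.

function getActiveUsers(PDO $db): array {
    return $db->query("SELECT name FROM users WHERE active = 1")
              ->fetchAll(PDO::FETCH_COLUMN);
}

// The "test" builds its own throwaway database, so nothing live is touched.
$db = new PDO('sqlite::memory:');
$db->exec("CREATE TABLE users (name TEXT, active INTEGER)");
$db->exec("INSERT INTO users VALUES ('alice', 1), ('bob', 0)");

assert(getActiveUsers($db) === ['alice']);
echo "ok\n";
```

The same function later runs unchanged against the real MySQL connection; only the PDO handle passed in differs.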
Start using some (maybe simplified) sort of release management:
Maintain a development environment, either locally or in a second .htaccess-protected folder online. Make it use its own DB.
Test every change in this dev env. Once you are satisfied, move it to the production env.
Use git or svn (svn might be simpler to learn, but opinions vary; check out "TortoiseSVN") to save a snapshot ("commit") of every change you make. That way you can diff through your latest commits if anything goes wrong.
Maintain regular backups.
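The snapshot-per-change workflow above might look like this with git (the repository location, file, and commit messages are placeholders):

```shell
# Sketch of a minimal snapshot-per-change workflow. Assumes git is installed.
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init -q .
git config user.email dev@example.com   # placeholder identity
git config user.name dev

echo "<?php echo 'v1';" > index.php
git add index.php
git commit -qm "initial version"        # snapshot 1

echo "<?php echo 'v2';" > index.php
git commit -qam "change greeting"       # snapshot 2

# When something goes wrong, diff through the latest commits:
git log --oneline
git diff HEAD~1 -- index.php
```

Each commit is a restore point, so a bad change on the dev environment never has to reach production.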

Why is Symfony2 performing so badly in benchmarks, and does it matter?

My colleagues and I are in the process of choosing a web framework to develop a high traffic web site. We are really good with node.js + express and php + symfony2. Both are great frameworks but we are a bit concerned about Symfony2 because it seems to be outperformed by most web frameworks out there.
Here are the benchmarks that show it:
http://www.techempower.com/benchmarks/
For this reason we will probably use node.js + express, but I still wonder why Symfony2 performs so badly in benchmarks.
In the end it all comes down to correct cache handling...
Symfony, or PHP in general, IS slower than many other languages and frameworks, but in exchange it provides you with the tools to create rich, secure, and testable web applications really fast.
If you use a reverse proxy like Varnish with ESI (edge side includes), and end up serving through Symfony only the parts of your templates that really need updating, you will have a blazingly fast experience.
Furthermore if you use an opcode cache like APC and an optimized database a human user won't actually notice the difference of a few ms in a real world application.
As requested, I will dive a bit deeper and give you a few more things to think about.
Caching & Performance
With cloud services (S3, EC2, GAE, ...) available at almost no cost, paired with load balancers, easy provisioning (Chef, Puppet, ...) and all this funky stuff, it has become easy and affordable even for small companies to run and manage large-data and/or high-traffic applications.
More storage means more space for cache - more computing power means quicker cache warming.
Things you will often hear when people talk about PHP or framework performance:
facebook runs with php
youp**n was developed with symfony
...
So why do these sites not break down completely? Because their caching routines are clever.
facebook
Did you know for instance what facebook does if you write a status update?
It does not save it straight into a database table with all your status updates, so that when a friend visits his stream all the statuses from all his friends have to be fetched from the database before being served.
Instead, facebook writes your status into each of your friends' news streams and starts warming their caches. All the streams are then prepared for serving, and whenever one of your friends visits his stream he is served a cached version; instantly, with almost no code execution involved. The stream will only show your newly created status once the cache warming has finished. We are talking about ms here ...
What does this tell us? In modern highly-frequented applications almost everything is being served from cache and the user will not notice if the actual computing of the page took 1ms or 5 seconds.
In a "real-world" scenario the end user will notice no difference in req/sec between frameworks. Even with simple stuff like micro-caching you can keep your VPS-hosted blog from going down the instant it lands on Hacker News's front page.
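Micro-caching, as mentioned above, fits in a few lines of PHP. This is only a sketch: the fragment builder, file location, and 5-second TTL are illustrative, not from the answer.

```php
<?php
// Micro-cache sketch: serve an expensive page fragment from a short-lived
// file cache instead of rebuilding it on every request.

function buildFragment(): string {
    usleep(50000);                       // stand-in for slow rendering / queries
    return "<p>rendered at " . time() . "</p>";
}

function cachedFragment(string $file, int $ttl): string {
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        return file_get_contents($file); // cache hit: no rendering at all
    }
    $html = buildFragment();             // cache miss: render once, store
    file_put_contents($file, $html);
    return $html;
}

$file = sys_get_temp_dir() . '/fragment.cache';
@unlink($file);
$first  = cachedFragment($file, 5);      // miss: renders the fragment
$second = cachedFragment($file, 5);      // hit: served from the file
assert($first === $second);
```

Even a TTL of a few seconds means a traffic spike hits the rendering code once per TTL window rather than once per request.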
In the end the more important thing is ... does my framework provide the tools, the documentation, the tutorials and examples ... to get this whole thing up and running quickly and easily? Symfony does for me!
If you're stuck ... how many people are there willing and able to answer your performance-related questions?
How many real-world applications have already been or will in the near future be created with this framework?
You choose a community by choosing a framework!
... okay, that's it for the "does it matter" part ... now back to those benchmarks :)
Benchmarks & Setups
Under all these shiny colors and fancy graphs in the benchmark you easily miss the fact that only one setup (web server, database, ...) is tested with each of these frameworks, while you can have a wide variety of configurations for each of them.
Example: instead of using Symfony2 + Doctrine ORM + MySQL you could also use Symfony2 + Doctrine ODM + MongoDB.
MySQL ... MongoDB ... Relational Databases ... NoSQL Databases ... ORM ... micro ORMs ... raw SQL ... all mixed up in these configurations ------> apples and oranges.
Benchmarks & Optimization
A common problem with almost all benchmarks found around the web - even those comparing only PHP frameworks - and also with these "TechEmpower Web Framework Benchmarks", is unequal optimization.
These benchmarks don't make use of the possible (and, among experienced developers, well-known) optimizations for those frameworks ... at least for Symfony2 and their tests this is a fact.
A few examples regarding the Symfony2 setup used in their latest tests:
"composer install" is not called with the -o flag to dump an optimized classmap autoloader (code)
Symfony2 will not use the APC cache for Doctrine metadata annotations without apc.enable_cli = 1 (issue)
the whole DI container is injected into the controller instead of only the few necessary services
moreover, setter injection is used -> the object is created and then setContainer() is called, instead of injecting the container directly into the constructor (see: BenchController extends Controller extends ContainerAware)
an alias ($this->get('service_name')) is used to retrieve services from the container instead of accessing it directly ($this->container->get('service_name')) (code)
...
the list continues ... but I guess you see where this is leading. 90 open issues by now ... an endless story.
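For reference, the first two items on that list translate to something like this in a build script (the ini path varies by distribution and is only an example):

```shell
# Dump an optimized classmap autoloader instead of the default one:
composer install -o            # or, on an existing install: composer dump-autoload -o

# Let Doctrine's annotation reader use APC during CLI cache warming
# (example path for a Debian-style PHP setup; adjust to your system):
echo "apc.enable_cli = 1" >> /etc/php5/cli/conf.d/apc.ini
```

Both are one-line changes, which is exactly why unoptimized benchmark setups are so misleading.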
Development & Resources
Resources like servers and storage are cheap. Really cheap ... compared to development time.
I am a freelancer charging fairly typical rates. You can either get 2-3 days of my time ... or a sh**load of computing power and storage!
When choosing a framework, you are also choosing a toolkit for rapid development - a weapon for your fight against the never completely satisfied, feature-creeping customer ... who will pay you well for his wishes.
As an agency (or a freelancer) you want to build feature-rich applications in short time. You will face points where you are stuck with something ... maybe a performance related issue. But you are facing development costs and time as well.
What will be more expensive? An additional server or an additional developer?
This blog answers the second part of your question:
http://symfony.com/blog/is-symfony-too-slow-for-real-world-usage
Dismissing symfony because the speed of a "hello, world" test is not
as good as with FooBar framework is a mistake. Raw speed is not the
key factor for professionals. Cost is the key factor. And the cost of
development, hosting and maintenance of an application with symfony is
less than what it is for other solutions.
When choosing a framework, one should consider the total costs of development. That means looking at the code quality of the framework (unit tests, documentation, etc.), the performance (and hosting costs), the quantity and quality of features it has out of the box, the size of the community, usage by organizations like yours, scalability, etc.
As a Symfony developer, I passionately hate WordPress from a technical point of view. But I'll still recommend (and even use!) it for a simple website. Not just because of its popularity, but because of the size of its community: it's very easy to hire a WordPress designer/developer. Looking at a performance comparison between WordPress and Symfony wouldn't make any sense in this case.

Unit Test code generation

We have a project that has been developed over 2 years with a poorly designed architecture. As of today there are no unit tests at all.
The current version of the system works satisfactorily, but we badly need to refactor the core modules.
The budget is also limited, so we cannot hire a sufficient number of developers to write unit tests.
Is it a feasible strategy to automatically generate unit test code that covers, for example, interaction with data, on the assumption that the system currently works fine and that its current output can be converted into XML fixtures for unit testing?
This approach would let us quickly start refactoring the existing code and receive immediate feedback if some core functionality is broken by the changes.
I would be wary of any tools that claim to be able to automatically determine and encode an arbitrary application's requirements into nice unit tests.
Instead, I would spend a little time setting up at least some high-level functional tests. These might be in code, using the full stack to load a predefined set of inputs and checking against known results, for instance. Or perhaps even higher-level with an automation tool like Selenium or FitNesse (depending on what type of app you're building). Focus on testing the most important pieces of your system first, since time is always limited.
Moving forward, I'd recommend getting a copy of Michael Feathers' Working Effectively with Legacy Code, which deals with exactly the problem you face: needing to make updates to a large, untested codebase while making sure you don't break existing functionality in the process.
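The high-level approach from the answer above is often called characterization or "golden master" testing: record what the code does today, then refactor against that record. A minimal sketch in PHP, where legacyPriceWithTax() is a placeholder for any core routine you need to change:

```php
<?php
// Characterization-test sketch: pin down current behavior before refactoring.
// legacyPriceWithTax() stands in for an untested legacy routine.

function legacyPriceWithTax(float $net): string {
    return number_format($net * 1.19, 2);  // current, trusted-by-use behavior
}

// 1. Run representative inputs through the code AS IT IS and store the results
//    (in a real project you would persist these, e.g. as XML fixtures).
$inputs = [0.0, 9.99, 100.0, 1234.56];
$golden = array_map('legacyPriceWithTax', $inputs);

// 2. After each refactoring step, replay the inputs and compare. Here nothing
//    has changed yet, so every assertion passes; a behavior change would fail.
foreach ($inputs as $i => $net) {
    assert(legacyPriceWithTax($net) === $golden[$i]);
}
echo "legacy behavior preserved\n";
```

This doesn't prove the legacy behavior is correct, only that refactoring hasn't changed it, which is exactly the guarantee the question is after.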

Performance logging for a live test?

We are completing our application, written in PHP/MySQL (+ memcached), and next weekend we are going to run a live test for one night (it's a sort of "social" application).
We will, of course, monitor error log files to make sure everything goes fine. But we would also like to keep logs of the application's performance: for example, to determine whether a script ran too slowly, and in more detail how long functions/methods took to run and MySQL queries took to execute, and to compare that with the data obtained (and "un-jsoned") from memcached.
This is the first time I'm doing something like this; however, I believe it's fundamental, because we need to make sure the application will scale correctly when our customers start using it in 10-15 days. (Up)scaling will not be a big issue, since we are using cloud servers (we will start with a single instance with 256 MB of RAM, provided by a well-known company), but we would like to be sure resources are used efficiently.
Do you have any suggestion for this monitoring? Good practices? Good articles to read?
Also, when the test is over, should we keep monitoring performance on the live environment? Maybe not on all requests, but just on a sample?
PS: Is it a good idea to log on a file all MySQL queries and the time they took to run?
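The per-query logging from the PS could be as simple as a wrapper around each query call. A sketch (the SQLite connection stands in for the real MySQL one, and the log path is a placeholder):

```php
<?php
// Sketch: wrap each query and append its wall-clock time to a log file.
// The PDO connection, query, and log path below are placeholders.

function timedQuery(PDO $pdo, string $sql, string $logFile): PDOStatement {
    $start = microtime(true);
    $stmt  = $pdo->query($sql);
    $ms    = (microtime(true) - $start) * 1000;
    // One line per query; in production you would rotate or sample this.
    file_put_contents($logFile, sprintf("%.2fms  %s\n", $ms, $sql), FILE_APPEND);
    return $stmt;
}

$pdo = new PDO('sqlite::memory:');       // stand-in for the MySQL connection
$pdo->exec("CREATE TABLE t (x INTEGER)");
$log = sys_get_temp_dir() . '/queries.log';
@unlink($log);
timedQuery($pdo, "SELECT * FROM t", $log);
assert(strpos(file_get_contents($log), 'SELECT * FROM t') !== false);
```

Logging every query on every request adds I/O of its own, which is one argument for sampling rather than logging everything once the test night is over.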
I usually unit test my work so I can make sure it's running at a satisfactory speed, giving the right results, etc.
Once I've finished unit testing, I stress test it: I run the most commonly used functions through an insane loop, both from my local machine and from an instance I have set up for brutally testing my applications. Since my tests are written in PHP, I can log anything I like,
such as performance. But I would never run these tests against the live site; I'd just make sure I'm confident in what I have written.
[EDIT]
Do you use Zend Studio? (That's the PHP IDE I use.) It has built-in unit testing. You can get a free trial, which remains quite functional when it ends; I have the paid version, so I'm not sure whether unit testing is still available after the trial, but it's well worth its buck!
Here's a link that introduces unit testing, which is really great, and you can grab the trial of Zend Studio here.
Here are a few more links on unit testing, pulled from Google:
lastcraft.com
List of unit testing
Another Stack Overflow post on unit testing PHP
Google results page
You could look at installing something like this on the machine(s):
http://en.wikipedia.org/wiki/Cacti_%28software%29
It's always handy to have current and historical information about the performance of your system (CPU / memory / bandwidth).

Introducing Test Driven Development in PHP

My workplace consists of a lot of cowboy coders. Many of them are junior, which, not coincidentally, contributes to a lot of code quality issues.
I'm looking for suggestions on how best to wean my team onto TDD (we can start with unit tests, move on to regression tests, and later more automated testing).
Ultimately, I want us to learn more quickly from our mistakes, and produce better code and breed better developers.
I'm hoping there are some practical suggestions for how to introduce TDD to a team. Specifically, what tools are best to choose in the LAMP (php) stack.
Sorry if this question is too open-ended.
After going through this process four times now, I've found that any introduction of TDD will fail without some level of enforcement. Programmers do not want to switch style and will not write their first unit test and suddenly see the light.
You can enforce on a management level, but this is time-consuming for all involved. Some level of this is necessary at the beginning anyway, but ultimately you will need automatic enforcement. The answer to this is to introduce Continuous Integration.
I've found that a CI Server is the ultimate keystone to any TDD environment. Unless developers know that something bad will happen if they don't write the tests, you'll always have cowboys who feel it's beneath them.
Make writing tests easy and the results visible.
Use a test framework with good documentation, like SimpleTest.
If tests depend on database contents, create a reference database that is dropped and re-created at the beginning of the test script.
Make a script that runs all tests and shows the results on a standalone monitor, or somewhere else that makes them visible / easily accessible. (Running a command prompt is not an option.)
I personally don't write tests for every chunk of code in the application.
Focus on the domain objects in the application. In my case these are "price calculation" and "inventory changes".
Remind them that they are probably already writing tests, but that they throw their work away just after creation.
Example: during development of a function you'll have a page/test script in which you echo or var_dump() the result. After manually validating the result, you modify the function's parameters and check again.
With some extra effort these tests could be automated as a unit test. And which programmer doesn't like to automate stuff?
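That conversion from throwaway echo checks to a kept test is tiny. A sketch, where discountedPrice() is an illustrative function of my own, not from the answer:

```php
<?php
// The manual workflow described above, made permanent: instead of echoing
// the result and eyeballing it, assert it.

function discountedPrice(float $price, float $percent): float {
    return round($price * (1 - $percent / 100), 2);
}

// Manual version (thrown away after development):
//   var_dump(discountedPrice(100.0, 15.0));   // "looks right" -> deleted

// Automated version (kept, runs on every build):
assert(discountedPrice(100.0, 15.0) === 85.0);
assert(abs(discountedPrice(19.99, 0.0) - 19.99) < 0.001);
echo "all checks passed\n";
```

The eyeball check the developer was already doing becomes the expected value of the assertion; nothing else about the workflow changes.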
As for the team question, as well as universal ideas about software development and testing, I would suggest Joel Spolsky's website and books: http://joelonsoftware.com/ - I got many insights from him.
SimpleTest - excellent documentation and explanations of testing for PHP
Another way to start with TDD is to use a PHP framework. Without a framework, it's hard to implement unit testing effectively.
