Unit Test code generation - php

We have a project developed by 2 years with poorly designed architecture. Now a days there are no any unit tests at all.
Current version of system works satisfyingly but we vitally need refactoring of core modules.
The budget is also limited so we can not hire suffisient number of developers to write unit tests.
Is it the possible strategy to generate code automatically for unit tests which covers, for example, interaction with data, in assumpion that now system works fine and current system's output can be converted in XML-fixtures for unit testing?
This approach gives us a possibility to quickly start refactoring of existing code and receieve immediate feedback if some core functionality is corrupted because of changes.

I would be wary of any tools that claim to be able to automatically determine and encode an arbitrary application's requirements into nice unit tests.
Instead, I would spend a little time setting up at least some high-level functional tests. These might be in code, using the full stack to load a predefined set of inputs and checking against known results, for instance. Or perhaps even higher-level with an automation tool like Selenium or FitNesse (depending on what type of app you're building). Focus on testing the most important pieces of your system first, since time is always limited.
Moving forward, I'd recommend getting a copy of Michael Feathers' Working Effectively with Legacy Code, which deals with exactly the problem you face: needing to make updates to a large, untested codebase while making sure you don't break existing functionality in the process.

Related

Help me improve my continuous deployment workflow

I've been developing a workflow for practicing a mostly automated continuous deployment cycle for a PHP project. I'd like some feedback on possible process or technical bottlenecks in this workflow, suggestions for improvement, and ideas for how to better automate and increase the ease-of-use for my team.
Core components:
Hudson CI server
Git and GitHub
PHPUnit unit tests
Selenium RC
Sauce OnDemand for automated, cross-browser, cloud testing with Selenium RC
Puppet for automating test server deployments
Gerrit for Git code review
Gerrit Trigger for Hudson
EDIT: I've changed the workflow graphic to take ircmaxwell's contributions into account by: removing PHPUnit's extension for Selenium RC and running those tests only as part of the QC stage; adding a QC stage; moving UI testing after code review but before merges; moving merges after the QC stage; moving deployment after the merge.
This workflow graphic describes the process. My questions / thoughts / concerns follow.
My concerns / thoughts / questions:
Overall difficulty using this system.
Time involvement.
Difficulty employing Gerrit.
Difficulty employing Puppet.
We'll be deploying on Amazon EC2 instances later. If we're going about setting up Debian packages with Puppet and deploying to Linode slices now, is there a potential for a working deployment on Linode to break on EC2? Should we instead be doing our builds and deployments on EC2 from the get-go?
Another question re: EC2 and Puppet. We're also considering using Scalr as a solution. Would it make as much sense to avoid the overhead of Puppet for this alone and invest in Scalr instead? I have a secondary (ha!) concern here about cost; the Selenium tests shouldn't be running that often that EC2 build instances will be running 24/7, but for something like a five-minute build, paying for an hour of EC2 usage seems a bit much.
Possible process bottlenecks on merges.
Could "A" be moved?
Credits: Portions of this workflow are inspired by Digg's awesome post on continuous deployment. The workflow graphic above is inspired by the Android OS Project.
How many people are working on it? If you only have maybe 10 or 20 developers, I'm not sure it will make sense to put such an elaborate workflow into place. If you're managing 500, sure...
My personal feeling is KISS. Keep It Simple, Stupid... You want a process that's both efficient, and more important: simple. If it's complicated, either nobody is going to do it right, or after time parts will slip. If you make it simple, it will become second nature and after a few weeks nobody would question the process (Well, the semantics of it anyway)...
And the other personal feeling is always run all of your UNIT tests. That way, you can skip a whole decision tree in your flow chart. After all, what's more expensive, a few minutes of CPU time, or the brain cycles to understand the difference between the partial test passing and the massive test failing. Remember, a fail is a fail, and there's no practical reason that code should ever be shown to a reviewer that has the potential to fail the build.
Now, Selenium tests are typically quite expensive, so I might agree to push those off until after the reviewer approves. But you'll need to think about that one...
Oh, and if I was implementing this, I would put a formal QC stage in there. I want human testers to look at any changes that are being made. Yes, Selenium can verify the things you know about, but only a human can find things you didn't think of. Feed back their findings into new Selenium and Integration tests to prevent regressions...
Importent to make your tests extremely fast, i.e. no IO and ability to run parallel and distributed tests. Don't know how applicable it is with php, but if you can test units of code with in memory db and mock the environment you'll be better off.
If you have a QA/QC or any human in the way between the commit and production you would have a problem getting to a full continuous deployment. The key is your trust your testing, monitoring and auto response (immune system) enough to eliminate error prone process evolving humans from your system.
All handovers between functions have the effect of slowing things down, and with that, an increase of the amount of change (and hence risk) that goes in to a deployment.
Manual quality gates are by definition an acceptance that quality has not been built in from the start. The only reason code needs to be reviewed later is because there is some belief that the quality is not good enough already.
I'm currently trying to remove formal code review from our pipelines for exactly this reason. It causes feedback delays, and quoting Martin Fowler:
"The whole point of Continuous Integration is to provide rapid feedback. Nothing sucks the blood of a CI activity more than a build that takes a long time. "
Instead I'd like to make code review something that submitters request if required, or otherwise is done at the time of coding by team members, perhaps a la XP pair programming.
I think it should be your goal that once the code is merged to source control, that there is absolutely no more manual intervention.
I don't know whether this is relevant to PHP, but you can replace at least at least some of the code review stage with static analysis.
The quality of code reviews is relying on the quality of the reviewers, while static analysis relies on best practices and patterns, and is fully automatic. I'm not saying that code reviews should be abandoned, I simply think it can be done off-line.
See
http://en.wikipedia.org/wiki/Static_code_analysis
http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis

PHP: How to begin testing large, existing codebase, and test for regression on production site?

I'm in charge of at least one large body of existing PHP code, that desperately needs tests, and as well I need some method of checking the production site for errors.
I've been working with PHP for many years, but am unfortunately new to testing. (Sorry!).
While writing tests for code that has predictable outcomes seems easy enough, I'm having trouble wrapping my head around just how I can test the live site, to ensure proper output.
I know that in a test environment, I could set up the database in a known state... but are there proper methods or techniques for testing a live site? Where should I begin?
[I am aware of PHPUnit and SimpleTest, but haven't chosen one over the other yet]
Unit testing frameworks like PHPUnit are built more for testing the functionality of separate, logical units (i.e. classes), not so much the behaviour of entire live sites. One household name for that is Selenium, a web application testing system. It is my understanding that Selenium tests can in turn be integrated with PHPUnit.
As for the choice between Simpletest and PHPUnit, check out my recent question about PHPUnit vs. SimpleTest - the answers to that question, notably #Gordon's, managed to convince me of PHPUnit.
I second Pekka's suggestion. Also, I strongly suggest using PHPUnit, as it is the de-facto Standard in UnitTesting frameworks in the PHP World.
Another thing you can do is head over to phpqatools.org (edit: this website is no longer active) and use the given tools to analyze your codebase, find dead code, copy and paste, code violations, etc.
Also profile your code with XDebug or Zend Debugger to find out what it is actually running how often. This way you will not only get an idea of which code you should test first (those that runs most often), but also how it performs, which is a good starting point when you wil optimize it after you have written the Unit-Tests.
In addition, check out:
Legacy Code Nightmare and
Theory and practice – migrating your legacy code
PHPUnit Manual

Basics of automated tests?

Till now I've been testing my web (usually written in PHP) as well as desktop applications (normally Java or C#) manually. Now I read somewhere on the net about automated tests. I tried searching to know about it in details but almost all searches end up at things like PHPUnit. Could someone please put some light on the theory behind automated tests? How a software can be tested automatically? Any limitations etc? Or may be you can tell me a place where I can read about this.
Regards
For testing the correctness of code you can use unit testing. This was first explained to me from Dive Into Python: Unit Testing and will do much more justice to the topic then I could here. Now that you know about the term unit testing, you shouldn't be far from other existing explanations that will make sense to you, be it that one doesn't.
You may find test-driven development of interest too.
Your code is not magically tested for you, as you may had the idea of. The code to test your application will be written by you. What packages like PHPUnit offer you is a framework in which you can implement your tests. These packages will provide many conveniences for defining your tests, running them together as a suite, and generating a report. These are the only automated aspects.
These Test Tools are used in the following manner as, If suppose we have to test a webform we will pas the input data into the fields they have provided us with the tool, and the number of users needs to filling the form (their user-name), In this way when this is executed then the form gets filled with provided data with individual usernames, This execution may provide u test data as performance, time of execution and load etc...
WebLoad Testing Tutorial
see the above link for load testing...
See these links as well
Functional testing
Security Testing
Link and HTML tool Testing
Performance Testing

Performance implications with 'clean code'

At my workplace we're planning a major refactor on our core product, a web application with several 'modules'. I quoted that because that's one of our main concerns: modules are not really modules, the whole thing is monolithic. The application is written in PHP with smarty templating and using Pear for accessing a MySQL database. We're not really concerned with database independence, although it would be nice if that wouldn't take months to implement.
Our main concerns are that development time/cost is increasing exponentially because of bugs popping up in unrelated places and not having a sound common architecture to rely on to get the most common functionality (each module is basically copy/paste from the previous one, then adapt).
I've got some experience with the web MVC principle, mainly in ASP.NET MVC. I like the clean separation it offers and the testability. However, when trying this on a local machine the app is simply a lot slower than it should be.
Alright, enough introduction, off to the questions:
- Should I rely on caching modules? Does this remove most of the overhead using a good architecture provide? Something like APC.
The application is mainly read. Writing is mainly single values (change a single field on a record). Any OR/M for PHP that are good at this?
Also looking for a flexible MVC framework. I know Zend, CakePHP, maybe Symfony?
The tricky part is that we don't have the luxury of being able to do a full rewrite. We'll have to incrementally improve a currently very messy codebase. This has to be done while writing new code, or fixing bugs. One thing I'd really, REALLY like to be able to do is write a regression test for a new bug before fixing it, to prevent it from popping up again later (this happens, occasionally).
The stack I'm currently considering contains:
MVC framework of choice
Logging (log4php?)
an OR/M of choice (doesn't have to be dynamic, code generation is fine too)
IoC container of choice
Smarty Templating, perhaps abstracted so we can switch it out if we need to.
Opcode cache of choice (we're using one now, forgot which one, have to ask sysadmin)
The main point that worries me is the performance implications of creating clean code in PHP. Seeing it's a parsed language opposed to something like the .NET/Java web stack, creating abstractions for otherwise in-line code (with obligatory separation in different files) might create new problems on another level.
Note: Retag if you come up with more appropriate tags, I'm not sure on the current ones.
Having a clean setup isn't a performance issue, usually. Most performance is spent with databases or other external systems you're talking to.
Except for these there are usually one or two hotspots which might be worth optimizing but for that you should start with a clean design, then use a profiler (like XDebug or ZendDebugger) to identify the bottlenecks.
A clean software design is way more important than the 0.01% performance gain by a "optimized" design. Usuallyit's even cheaper to buy and run more hardware than worry about an "optimized" codebase which is unmaintainable.
I'd stress budgeting time to build tests, with the following arguments to management:
When developers fix a bug, allow them to write a test for the bug. Bugs reoccur much more often than they probably should, and this is a cheap and effective way to stop that completely.
When developers are building new functionality, allow them to write tests under it. Since they're completely familiar with the functionality at that point, this is the least expensive time to build the "safety net" of automated testing.
Don't candy coat how long testing will take you; whether it's 1% of your time or 50%, give that to the manager straight, but stress that building automated testing as a safety net will stop users from hitting as many bugs, and will save developer time for new development instead of bugfixing.
As far as managing an MVC component with a spaghetti code component, we had a similar issue with a large project. What worked well was just taking a directory and making that the new docroot for MVC app (Zend Framework in our case) such that:
old part:
http://site.com/data.php
http://site.com/other.php
new part:
http://site.com/app/controller/action/...
Re authentication, you have a couple of choices. Probably the most logical is to redirect your login.php script to the MVC login and then pass it back to the original page that you want to go with necessary info passed as a GET parameter. This will allow legacy and new systems to exist simultaneously and transparently.
Re slowness, before I would pull out XDebug, I would try to isolate a problematic part and just output times it takes. Faster IMHO.
There isn't any good reason that well structured object oriented code should perform significantly worse than sapghetti php code in a database driven web application. You need to do some profiling to find where your bottlenecks are and optimize accordingly.
You do have a tough (but not uncommon) situation.
As far as organizing the code to minimize bugs, all I can give is a tip of the cap to DRY.
For performance issues, those are easy to find, because their very slowness shows them to you, by this technique.

Introducing Test Driven Development in PHP

My workplace consists of a lot of cowboy coders. Many of them are junior. Which coincidentally contributes to a lot of code quality issues.
I'm looking for suggestions on how to best wane my team into using TDD (we can start with Unit tests, and move into regression tests, and later more automated testing).
Ultimately, I want us to learn more quickly from our mistakes, and produce better code and breed better developers.
I'm hoping there are some practical suggestions for how to introduce TDD to a team. Specifically, what tools are best to choose in the LAMP (php) stack.
Sorry if this question is too open-ended.
After going through this process four times now, I've found that any introduction of TDD will fail without some level of enforcement. Programmers do not want to switch style and will not write their first unit test and suddenly see the light.
You can enforce on a management level, but this is time-consuming for all involved. Some level of this is necessary at the beginning anyway, but ultimately you will need automatic enforcement. The answer to this is to introduce Continuous Integration.
I've found that a CI Server is the ultimate keystone to any TDD environment. Unless developers know that something bad will happen if they don't write the tests, you'll always have cowboys who feel it's beneath them.
Make writing tests easy and the results visible.
Use a TestFramework with good documentation. like SimpleTest
If test depend on database contents, create a reference database that will be dropped and created at the beginning of a script.
Make a script that runs all test and shows the results on a standalone monitor or something that will make the test visible / easily accessable. (Running a command prompt is not an option)
I personally don't write test for every chunk of code in the application.
Focus on the domain objects in the application. In my case these are "price-calculation" and "inventory-changes"
Remind them that they are probably already writing tests, but that they delete their work just after creation.
Example: During the development of a function you'll have a page/testscript in with an echo or var_dump() the result. After a manual validation of the result you'll modify the parameters of the function and check again.
With some extra effort these tests could be automated in a UnitTest. And which programmer doesn't like to automate stuff?
As for the team question as well as universal ideas about software development and testing, I would suggest Joel Spolski's website and books: http://joelonsoftware.com/ I got many insights from him.
SimpleTest - excellent documentation and explanations of testing for php
Another way to start TDD is try to use PHP framework. Without framework, it's hard to implement unit test effectively.

Categories