01 Jan, 2009
The curse of performance test automation?
Posted by: j pimmel In: agile|performance testing
In our last three projects we automated our performance testing extensively. This involved significant effort, which paid off for the two projects already delivered: meeting performance expectations owed a great deal to our performance improvements.
Interestingly, after the bulk of those improvements had been delivered and the product had gone live, the performance test automation slowly fell into disuse for lack of maintenance. Why? Perhaps because, once the big performance wins had been made, the remaining value to be gained seemed far smaller than the effort required to maintain the automation.
However, there are inherent risks to discontinuing performance test automation:
- Developers could introduce poorly performing code into the system at any time
- New code, while itself not patently poor in performance, tends to gradually erode performance over time
- Unpredictable user behaviour generating unexpected load
- External forces, e.g.:
  - Unforeseen marketing/ad campaigns
  - Unanticipated user volumes
Without the performance test automation we once had, aren’t we flying blind, implicitly turning production into our performance test environment?
What we’ve always done
It's worth a quick digression to discuss how we've tested performance to date. We've generally tended towards playback-driven load generation against a performance test environment that is as close to live as possible. This is usually termed record/playback, although our 'recordings' are typically simulated usage journeys described in a test, rather than recorded UI interactions for a tool to play back.
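To make the idea of a journey "described in a test" concrete, here is a minimal sketch rather than the tooling we actually built. The journey `browse_and_buy`, its step names and the helpers `play_journey` and `generate_load` are all hypothetical, and the `time.sleep` calls merely stand in for requests to the system under test.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def play_journey(steps):
    """Run each (name, action) step in order and return [(name, seconds), ...]."""
    timings = []
    for name, action in steps:
        start = time.perf_counter()
        action()
        timings.append((name, time.perf_counter() - start))
    return timings


def generate_load(steps, concurrent_users):
    """Play the same journey for several simulated users at once."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [pool.submit(play_journey, steps) for _ in range(concurrent_users)]
        return [future.result() for future in futures]


# Hypothetical journey: the sleeps stand in for requests to the system under test.
browse_and_buy = [
    ("view_home_page", lambda: time.sleep(0.01)),
    ("search_product", lambda: time.sleep(0.02)),
    ("checkout", lambda: time.sleep(0.01)),
]

if __name__ == "__main__":
    for user_timings in generate_load(browse_and_buy, concurrent_users=4):
        print(user_timings)
```

Each simulated user yields a list of per-step timings, which is the raw material a real load generation tool would aggregate and report on.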

Even though this has delivered value in our previous performance improvement cycles, there are nevertheless issues worth mentioning:
- High upfront costs
- Load generation tools must be designed and built
- The entire testing environment must be complete and in place
- When load is generated against the monolith, the sum of all moving parts – when subject to concurrent load – creates the greatest possible variability in results
- Gradually introduced problems are hard to pinpoint; only major performance issues still show up as spikes
- This increased variability prevents us from triggering failures on performance thresholds
- Without failure triggers, developers must be responsible for routinely checking the generated results for changes to performance
- Record/playback performance tests are complex to implement and heavyweight to execute because they are usage simulations
Agile performance testing
I think we could improve on our performance testing by addressing some other issues inherent to the development process. Unit performance testing should be integral to the development process:
- Ensure that performance acceptance criteria are defined as part of the story card where this need is identified
- Develop unit performance tests to supplement the more heavyweight record/playback tests when performance criteria exist
- Baseline both unit performance and record/playback tests as early as possible
To quote Alexander Podelko, “During unit testing different variables such as load, the amount of data, security, etc. can be reviewed to determine their impact on performance. In most cases, test cases are simpler and tests are shorter in unit performance testing. There are typically fewer tests with limited scope; e.g., fewer number of variable combinations than we have in a full stress and performance test.”
Since unit performance tests are easier to write and more focused, they can initially be developed in the absence of the full performance testing infrastructure or load generation tool. Furthermore, as the data they generate will be considerably less variable, we could trigger threshold-based test failures prior to check-in.
Any such failure would mandate that developers review their changes. If the change justifies more intensive resource utilisation, the existing performance criteria defined for that function are modified, in agreement with the product owner and QA where appropriate. If the change exposes poorly implemented code, the respective performance improvements would be made.
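A threshold-based unit performance test of this kind could look roughly like the sketch below. It is an illustration under assumptions, not something taken from our projects: `build_lookup_table` is a hypothetical unit under test, the 0.5 second budget is an invented figure standing in for a criterion from a story card, and taking the median of several runs is just one simple way to damp timing noise.

```python
import statistics
import time


def measure_median_seconds(func, repeats=15):
    """Run func several times and return the median wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def check_performance_budget(func, budget_seconds, repeats=15):
    """Return (passed, median); passed is False when the median exceeds the budget."""
    median = measure_median_seconds(func, repeats)
    return median <= budget_seconds, median


# Hypothetical unit under test: builds a lookup table of 10,000 entries.
def build_lookup_table():
    return {n: n * n for n in range(10_000)}


if __name__ == "__main__":
    # Hypothetical budget; in practice it would come from the story card.
    passed, median = check_performance_budget(build_lookup_table, budget_seconds=0.5)
    print(f"median={median:.6f}s passed={passed}")
```

A pre-check-in hook or build job could turn a `passed` value of `False` into a failed build, which is the trigger for the review described above.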

An added benefit of automating both unit performance tests and record/playback tests is traceability. A record/playback test can be correlated to numerous smaller performance unit tests and vice versa – their combined value in the rapid diagnosis of performance problems before production is greater than the sum of the parts.
Another aspect of traceability comes from having the earliest possible baselines set by writing unit performance tests in parallel with code. This extensive history could provide an accurate and detailed picture of the performance of functions through the course of their development.
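One way such a history might be kept is sketched below, assuming a plain JSON file as storage. The helper names (`load_history`, `record_run`, `regression_ratio`, `save_history`) and the test name `parse_order` are hypothetical. The first recorded run of each test is treated as its baseline, and the latest run is compared against that baseline rather than against the previous run, so that gradual erosion accumulates in the ratio instead of hiding in small steps.

```python
import json
from pathlib import Path


def load_history(path):
    """Return the recorded history as {test_name: [seconds, ...]}, or {} if none exists."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def record_run(history, test_name, duration_seconds):
    """Append one measurement; the first entry for a test acts as its baseline."""
    history.setdefault(test_name, []).append(duration_seconds)
    return history


def regression_ratio(history, test_name):
    """Return latest / baseline for a test, or None with fewer than two runs."""
    runs = history.get(test_name, [])
    if len(runs) < 2:
        return None
    return runs[-1] / runs[0]


def save_history(history, path):
    """Write the history back to disk as JSON."""
    Path(path).write_text(json.dumps(history, indent=2))


if __name__ == "__main__":
    history = {}
    # Hypothetical measurements for a hypothetical test.
    for seconds in (0.010, 0.011, 0.015):
        record_run(history, "parse_order", seconds)
    print(regression_ratio(history, "parse_order"))
```

A ratio well above 1.0 for a function whose behaviour has not changed would be a prompt to look back through the history and find where the slowdown crept in.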
This article in two parts by Scott Barber (from which the above diagram is taken) provides a good overview of how to think about implementing an Agile performance testing strategy.
In my next article I will explore ideas and techniques to help developers design and write meaningful unit performance tests by decomposing a seemingly monolithic system into smaller testable parts.
