Research note: testing - Some questions you should be asking your testing partners

I recently blogged about the Qualitest Group and its third-party testing services. I like the idea of using an external testing organisation because I believe that testers need a different mindset to developers – a delight in breaking things and finding defects perhaps – and, in most organisations, such a mindset is career limiting. It’s a question for an IT group to ask itself – do we like having people who regularly find our mistakes and publicise them to those around us? If not, perhaps it should be considering a third-party testing organisation that fully understands testing in all its aspects and employs people with the “testing mindset”.

That choice raises further questions, however, which really concern the governance of the development process and its quality assurance. Does your external testing partner allow developers to unit-test their own code, for example? With old-style development practices, people probably shouldn’t test their own code (the developers often have the wrong mindset and their test cases can embody the same misconception of the requirement as the code does); but developers unit-testing their own work is pretty fundamental to eXtreme Programming and Agile development. Agile is also becoming generally accepted as the way to go, both for productivity and quality. It’s a question to ask your testing partners: “do you support Agile development effectively without ‘spoiling’ the Agile culture we’re trying to promote?”

Another thing I like about Qualitest is its results-based testing approach. However, this rather assumes that you have something to compare your results against and that the results you want are feasible. You can never claim 100% confidence that there are no bugs in a system, even a safety-critical system; and Qualitest would say that you can never say that “testing is finished”.

Nevertheless, I would suggest that that’s actually a matter of semantics, to a large extent. Should you discuss the semantics of a results-based testing SLA saying something like “Find at least 95% of the bugs” with your testing partner? There are published ways of estimating the total number of bugs in a piece of code, but does the SLA refer to these estimates or merely to finding 95% of the bugs actually reported by users? Does a design flaw count as a bug? What about the possibility of a systematic testing bias that puts the most business-critical bugs in the 5% that aren’t found? And what about latent bugs which haven’t been found and perhaps can’t ever be reached with current workloads and data patterns? Are they worth wasting time on? Perhaps not; but latent bugs can represent a potential production disaster waiting to happen when workloads change or new data enters the system (perhaps you gain a significant Far East customer for the first time and its data looks different to what you’ve been processing before). So perhaps latent bugs are important (which is partly why static code analysis can be important).
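
To make the “95% of what?” question concrete, here is a minimal sketch of one well-known estimation technique, capture-recapture (the Lincoln-Petersen estimator), applied to defects. It is not necessarily the method any particular testing partner uses: the function names, the defect IDs and the assumption that two teams test the same build independently are all illustrative.

```python
def estimate_total_defects(found_by_a, found_by_b):
    """Lincoln-Petersen estimate of total defects from two independent test passes.

    Assumes both passes test the same build independently and that every
    defect is equally likely to be found, which is rarely true in practice.
    """
    overlap = len(found_by_a & found_by_b)
    if overlap == 0:
        raise ValueError("no defect was found by both passes; estimate is undefined")
    return len(found_by_a) * len(found_by_b) / overlap


def fraction_found(found_by_a, found_by_b):
    """Share of the estimated total that the two passes found between them."""
    distinct_found = len(found_by_a | found_by_b)
    return distinct_found / estimate_total_defects(found_by_a, found_by_b)


# Hypothetical defect IDs logged by two teams (not real data).
team_a = {f"D{i}" for i in range(1, 21)}   # D1..D20, 20 defects
team_b = {f"D{i}" for i in range(11, 26)}  # D11..D25, 15 defects, 10 shared with team A

if __name__ == "__main__":
    print(estimate_total_defects(team_a, team_b))
    print(fraction_found(team_a, team_b))
```

With these invented figures the estimate is 30 defects in total, of which 25 distinct defects (about 83%) have been found, so a “95%” SLA measured against this estimate would not yet be met. Measured only against bugs later reported by users, the same testing might look much better or much worse, which is exactly the semantic point.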

I like results-based testing because it promises to give you a fair and equitable contract with your external testing partner. It can also, perhaps, give you a handle on the “is testing finished” issue – something else to question your testing partner about, I think.

“Is testing finished” is really another question of semantics. If you can place confidence limits on the number of bugs found relative to the number of bugs expected; if you can put numbers on the risk associated with “going live”; and if you can estimate, with confidence, the cost associated with the risk of going live against the cost to the business of withholding the new automated service; then you have, in a real and practical (although limited) sense, “finished testing”, even if running some more tests (perhaps tests which you haven’t thought of and which aren’t in your test pack) might find some more defects.

Part of the value of employing an organisation like Qualitest is that it is a testing specialist and understands the testing process and its semantics, probably better than most developers do. However, although management can outsource responsibility for the execution of testing and quality assurance, it can’t outsource responsibility for Quality. If, for example, one of the 5% of defects Qualitest hasn’t found (while satisfying its results-based testing SLA) results in confidential customer credit card details held by a company being splashed over the Internet, it’ll be (potentially) the company’s directors in the dock facing gaol, not Qualitest’s directors.

So, the semantics of testing is probably important to the managers employing a firm like Qualitest. For example:

Is a “bug” in an automated system a coding error; an error in automating business logic; an error in the business logic being automated; or a fundamental misunderstanding of the business operation and its commercial context by business management? Even if you replace “bug” with “defect”, depending on where I am in the organisation and how technical I am, I might reasonably expect one, some or any of these to be addressed by a quality assurance or testing team that promises to help me control the “quality” of my automated business systems. And I’ve been, sloppily, mixing up “bug” and “defect” throughout this piece; this is common (although I do hope that many readers noticed), but is it acceptable?
Is “defect free” software possible? Altran Praxis (formerly Praxis High Integrity Systems) promises to deliver “zero defect” software and this claim has been validated by the NSA. This isn’t trivial and Praxis achieves zero defects by using mathematical proof where it is cost-effective (but only where it is cost-effective, not everywhere) and by developing in a restrictive subset of Ada that doesn’t support constructs which facilitate coding errors; but what it means by “zero defect” is that the code complies 100% with the spec. Is this the same as what you mean by “defect free”? Mind you, just 100% compliance with spec would be a useful step forward for many systems.
If I design my system to store credit card details in a database that is accessible via SQL queries embedded in orders sent over the web, is this a “bug”, a “system defect” or a “design fault” and would you expect your testing team, or Qualitest, to find this? Or, does this, perhaps, depend on what you ask (and pay) your testers, or Qualitest, to do? Perhaps you think this is something your security team should be testing; but perhaps they think it’s a development issue and, in practice, nobody takes ownership of such issues.
If I say that you have bugs in your systems because you tell your developers that you want bugs in your systems, are you shocked and in immediate denial? But if you, perchance, reward developers for delivering ahead of schedule and reward them again for coming in out-of-hours to fix production bugs, aren’t you, in effect, telling your developers that you are happy to live with bugs in the interests of immediate delivery and conspicuous “company loyalty” in your developers? Especially if you try to reduce what you spend on quality assurance as much as possible. I would see this as a “cultural defect” or “organisational defect” – a failure of “good governance”, perhaps – which will impact the business; but, in semantic terms, does this count as a “system defect” which you could expect your quality assurance partners to help you eliminate?
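
As an illustration of the credit card question above, here is a minimal sketch of the kind of check somebody would have to own. It uses Python’s standard sqlite3 module with an in-memory database; the table, the customer names, the card numbers and both lookup functions are hypothetical and stand in for whatever the real order-processing code does.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two made-up card records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cards (customer TEXT, card_number TEXT)")
    conn.executemany(
        "INSERT INTO cards VALUES (?, ?)",
        [("alice", "4111-0000-0000-0001"), ("bob", "4111-0000-0000-0002")],
    )
    return conn


def lookup_unsafe(conn, customer):
    """The design under question: user input is pasted into the SQL text."""
    query = f"SELECT card_number FROM cards WHERE customer = '{customer}'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, customer):
    """Same lookup with a bound parameter, so input is treated as data only."""
    return conn.execute(
        "SELECT card_number FROM cards WHERE customer = ?", (customer,)
    ).fetchall()


# A classic injection string supplied where a customer name is expected.
malicious_input = "x' OR '1'='1"

if __name__ == "__main__":
    conn = make_db()
    print(len(lookup_unsafe(conn, malicious_input)))  # every card row comes back
    print(len(lookup_safe(conn, malicious_input)))    # no rows match
```

A functional test pack that only checks that valid customers get their own card back would pass against both versions. Whether the unsafe version is reported as a bug, a design fault or nothing at all depends on whether anybody was asked, and paid, to try the malicious input.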

I just raise the questions and they don’t invalidate my view that external testing by an organisation such as Qualitest may bring significant improvements to system quality. But the outsourcing of testing has consequences and raises governance issues which are often cultural and semantic as much as technical – but no less important for all that.