Accelerating Software Quality: Machine Learning and Artificial Intelligence in the Age of DevOps - a book by Eran Kinsbruner et al


You may think that writing code with modern tools is getting ever easier. Possibly this is true, but I think that the systems it is being written for are getting ever more complex. A “legacy” transaction-based database application, even a very large one, was, logically, fairly simple. You took some data, applied an algorithm to it, nothing else was allowed to touch the data until you’d checked all the changes and, when you were happy, you committed. If anything went wrong, you told the DBMS (Database Management System) and it backed out the transaction as if it had never been. You had to verify your logic, of course, but given a bit of good practice (people not testing their own code; and starting testing early, by questioning the completeness and lack of ambiguity of the spec), even manual testing was feasible, if a bit of a waste of resources.
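To make the contrast concrete, here is a minimal sketch of that classic pattern, using Python’s standard sqlite3 module; the table, the names and the amounts are purely illustrative:

```python
import sqlite3

# The classic transactional pattern: change, verify, commit -- or roll back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    # Apply the algorithm: move 30 from alice to bob.
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
    conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
    # Check all the changes before committing.
    (total,) = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()
    if total != 100:
        raise ValueError("invariant violated: money created or destroyed")
    conn.commit()      # happy: the transaction becomes permanent
except Exception:
    conn.rollback()    # anything wrong: as if it had never been
    raise
```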

The past is a foreign country; things are done differently now. Your data is spread across a dozen different servers, possibly in cache (which you have to trust is kept coherent). It is processed asynchronously, in real time, by thousands of VMs or web browsers, and if something goes wrong part of the way through a complex process, something else has probably already processed the intermediate results and you can’t just back out the relevant “transaction” (in fact, you need a lot of extra remediation code to undo the partially completed logic reliably). That gets complicated when you need to test that it all works as you expect it to.
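For illustration, here is a hedged sketch of that remediation problem, often addressed with the “saga” pattern: each step in a distributed flow needs a hand-written compensating action, and every compensation is extra code that itself needs testing. All the step names below are hypothetical:

```python
# A minimal saga runner: run (action, compensation) pairs; on failure,
# undo the completed steps in reverse order. You cannot simply roll back,
# so every compensation is hand-written -- and rarely exercised.

def run_saga(steps):
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        for compensate in reversed(done):
            compensate()   # the remediation path, which also needs testing
        raise

run_saga([
    (lambda: print("reserve stock"),  lambda: print("release stock")),
    (lambda: print("charge card"),    lambda: print("refund card")),
    (lambda: print("dispatch order"), lambda: print("recall order")),
])
```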

Worse, modern development techniques encourage developers to test (unit test, anyway) their own code, so there is a risk that their tests mirror misconceptions embodied in their code; and that developers focus mostly on demonstrating that their lovely code works rather than on torturing it until it breaks (and remember that the purpose of most testing is to find defects, so they can be removed, not to give developers warm feelings and a sense of achievement). And no-one tests remediation code because, well, users never do stupid things or make mistakes, do they?
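A contrived illustration of that first risk: below, the developer believes a discount starts above ten items, the code and the unit test embody the same off-by-one misconception, and so the test passes while the (assumed) spec, a discount at ten or more, is violated:

```python
def bulk_discount(quantity, unit_cents):
    # Bug: the spec says the 10% discount applies at 10 or more items,
    # but the developer wrote "more than 10".
    total = quantity * unit_cents
    if quantity > 10:
        total = total * 9 // 10
    return total

def test_bulk_discount():
    # The test mirrors the same misconception: it never probes quantity == 10.
    assert bulk_discount(11, 100) == 990
    assert bulk_discount(5, 100) == 500

test_bulk_discount()  # passes, yet an order of exactly 10 gets no discount
```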

It is fast becoming accepted that the only way to deal with this situation is to automate testing, with the kind of tools one would give programmers, and to build continuous testing into the DevOps CI/CD process. DevOps is not just about delivering software fast, it is about delivering the right, high-quality, reliable software – with no surprises.

Well, test automation is good. But I’m not convinced that conventional test automation is enough. Software is complicated: even if you just look at the “light path”, where the system is being used properly and should work, the number of paths through it increases exponentially as you add conditional logic. Add in the “dark path” of remediation logic after abuse or error and you have, possibly, an order of magnitude more code to test. Unless, of course, you don’t bother to test the “dark path” – which is why, if I wanted to break into or corrupt a system, I’d be ferreting around the “dark path” for code with defects I could exploit.
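A quick worked illustration of the explosion: a function containing n independent boolean conditionals has 2^n execution paths, so even a modest function outruns exhaustive testing long before any “dark path” logic is added:

```python
from itertools import product

# n independent boolean conditionals give 2**n execution paths.
for n in (10, 20, 30):
    print(f"{n} conditionals -> {2 ** n:,} paths")
# 10 conditionals -> 1,024 paths
# 20 conditionals -> 1,048,576 paths
# 30 conditionals -> 1,073,741,824 paths

# Enumerating the condition combinations for even 20 conditionals is already
# a million test cases, before remediation logic multiplies the total:
all_combinations = product([False, True], repeat=20)  # 1,048,576 tuples, lazily
```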

Even back in the last century, when a project manager once told me that just by manually typing stuff in at the keyboard he could infallibly find all the bugs in his code, his record of production failures suggested that he was wrong. Any idea of trying to manually test a large, modern, 21st-century system adequately is laughable. In an era, however, in which we all have to do “more with less”, I simply don’t believe that even automated testing is being given adequate resources. “Completely” (that is, to some high level of confidence) testing a modern asynchronous, distributed system is a (different kind of) programming project at least as big as the project being tested.

Luckily, there is hope – Machine-Augmented Intelligence (MAI) and Machine Learning (ML). Instead of having humans write regression tests, for example, a computer could randomly generate a vast amount of test data, run it through a programming system before and after a code change, and validate that the outcomes haven’t changed (except in anticipated ways). MAI would let the testing home in on more effective tests, and ML would let it remember past problems and use them to drive its future activity. This would be more than static code analysis (which is both excellent and necessary); it would be looking at dynamic code behaviours and directing resources to ever more effective defect identification. It is, however, necessary that the MAI is sophisticated enough to explain to the developers what its output means and where to find the defects – a simplistic “test failed” isn’t much help.
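As a rough sketch of the idea (nothing from the book; the function names are my own), a differential regression test might look like this, with old_version and new_version standing in for the two builds; a real pipeline would use ML to rank and evolve the failing inputs rather than sample blindly:

```python
import random

def old_version(data):                 # the build before the change
    return sorted(data)

def new_version(data):                 # the build after the change
    return sorted(data, reverse=False)

def differential_test(runs=10_000, seed=42):
    """Feed the same random inputs to both builds and collect divergences."""
    rng = random.Random(seed)
    divergences = []
    for _ in range(runs):
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
        before, after = old_version(data), new_version(data)
        if before != after:            # an outcome changed in an unanticipated way
            divergences.append((data, before, after))
    return divergences

print(len(differential_test()), "divergent inputs found")
```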

Nevertheless, I don’t see why a computer shouldn’t take a reasonably formal spec, identify ambiguities and incompleteness in the underlying logic (these are probably what would lead to the most expensive defects in production), and then generate tests exploring boundaries (edge conditions), unanticipated inputs and so on. Most importantly, it wouldn’t be biased by the programmers’ egos and misconceptions, and it could get better at finding defects, based on feedback from a conventional testing team guiding the testing process.
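Something like this already exists in embryonic form in property-based testing. The sketch below uses the Hypothesis library for Python, which generates inputs from a declarative description and deliberately probes boundaries; the “spec” (percentages lie in 0–100) and the buggy function are my own illustrations:

```python
from hypothesis import given, strategies as st

def clamp_percentage(value):
    """Illustrative spec: clamp any integer into the range 0..100."""
    if value < 0:
        return 0
    if value > 100:
        return 99          # deliberate boundary bug: should be 100
    return value

# Run with pytest. The second property fails, and Hypothesis shrinks the
# failure to the minimal counterexample, value == 101 (the boundary),
# rather than reporting a bare "test failed".
@given(st.integers())
def test_result_is_a_valid_percentage(value):
    assert 0 <= clamp_percentage(value) <= 100

@given(st.integers(min_value=101))
def test_overflow_clamps_to_exactly_100(value):
    assert clamp_percentage(value) == 100
```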

Science fiction? I hope not, because I don’t see any real alternative to taking test automation a stage further. And now I’ve seen a book that seems to indicate that we are well into this journey: “Accelerating Software Quality: Machine Learning and Artificial Intelligence in the Age of DevOps”. Its lead author is Eran Kinsbruner (Chief Evangelist at Perfecto, part of Perforce Software), and it has chapters both from his Perfecto colleagues and from external experts from other organisations:

  • Chapters written by Perforce group subject matter experts, from Perforce, Perfecto, Klocwork, OpenLogic and TestCraft.
  • Chapters written by external subject matter experts, from Chailatte Consulting, Digital Assured, Aista, RedGreen Refactor, Logz.io, test.ai, AI Appsore, Mesmer, TwentyOne and Sealight.

The book aims to show how what I call MAI, together with ML, helps us to make more data-driven decisions, automate more processes and deliver higher-quality software, at scale, faster. The main audience, I think, comprises software developers, testers and managers, and anyone on a journey towards DevOps maturity. Non-technical managers might find quite a lot of it interesting, but might find it hard to follow in detail. The book contains proven best practices and real-life examples, across open-source and commercial software development (see the chapter titles, below).

I know the book is written by the employees of software vendors, but it does largely resist the temptation to market product, I think. Don’t overlook the expertise available in IT software vendors. At the same time, remember that each vendor has, at best, its own point of view and, at worst, its own agenda, so relying on a group of vendors, rather than just one, is a lot safer. I’ve liked Perforce, the main driver for this book, ever since I was the first journalist its founder, Chris Seiwald, was unleashed on, back in the last century; I have been following its growth since with interest. It may not be as fashionable as Git, but it now has a lot more capabilities than just Configuration Management – including, as here, test automation tooling.

The range of topics covered looks good and includes chapters titled:

  • How do AI and ML Testing Tools Fit and Scale in the DevOps Pipeline
  • Impact of AI on Humans and Technology
  • The New Categories of Software Defects in the Era of AI and ML
  • Codeless Testing 101 for Web and Mobile Test Automation
  • AI Data Usage
  • Analyzing the User Experience
  • Introduction to Robotic Process Automation
  • API Testing with AI and Machine Learning
  • Cognitive Engineering – Shifting Right
  • Automated Code Reviews with AI and ML
  • Conversational AI Applications
  • Moving to Modern DevOps with Fuzzing and ML
  • Maximizing Code Observability Within DevOps using AI and ML
  • Using Machine Learning to Improve Static Code Analysis Results
  • How does AIOps Benefit DevOps Pipeline and Software Quality
  • What’s Next for AI/ML Testing Tools?

So, how does this book live up to its promise? Well, as I’ve implied, it is quite heavy going, because the subject itself is rather heavy. It is a serious book for advanced practitioners and perhaps one to dip into rather than read cover to cover. I did feel that some of the authors were a bit close to their subject and that this made it more difficult for a layperson to read – the content is good but perhaps a professional communicator should be presenting some of it. On the other hand, there is a lot of detailed information and practical advice, taken from hands-on experience.

The quality of the illustrations is a bit variable. Some are quite hard to read, although this isn’t a major problem in practice. What I do find unforgivable, in the paper book format, is the lack of a proper index; possibly there are full-text search options on the Kindle version (I haven’t installed this, but I don’t find that technical books are particularly suited to the Kindle experience). I’d also like to see the names of the various authors included against their chapters in the chapter list at the front.

This is, however, definitely a book which stimulates thought, with a range of viewpoints from a wide range of authors, all telling a reasonably coherent story. For instance, in Chapter 9: The New Categories of Software Defects in the Era of AI and ML, we are looking not just at the use of AI and ML to help automate testing, but also at the new kinds of defects in MAI-based systems that will have to be addressed by testers.

Testing ethical defects, for example? Imagine a known criminal trying to drive an autonomous car, to take a seriously ill person to hospital. The behaviour of the car will be embodied in software, and the person responsible for that software will have to test whether the behaviour of the car matches one of a range of possible behaviours, possibly regulated by a legal framework (a test sketch follows the list below). Does the Intelligent Car:

  • Obey the driver without question;
  • Immobilise itself, if the car determines that the driver is not the owner of (or is not authorised to drive) the car, regardless of the risk to the sick passenger;
  • Lock the doors and drive itself to the nearest police station – regardless of the risk to the passenger – or perhaps this is kidnap;
  • Allow itself to be driven to an official destination such as a hospital or police station, but nowhere else;
  • Contact a human authority for instructions;
  • Use its awesome AI to make the correct ethical decision, in the circumstances, whatever that is, and carry it out (yes, I am joking);
  • Do something else.
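As a thought experiment (everything here is hypothetical, not from the book), testing such behaviour might mean writing the agreed policy down as an explicit table and checking the car’s actual decisions against it:

```python
from enum import Enum, auto

class Behaviour(Enum):
    OBEY_DRIVER = auto()
    IMMOBILISE = auto()
    DRIVE_TO_POLICE = auto()
    OFFICIAL_DESTINATIONS_ONLY = auto()
    CONTACT_AUTHORITY = auto()

# The design decision, made by humans (and possibly regulators) in advance,
# not left to the AI at runtime. Scenarios and outcomes are invented here.
AGREED_POLICY = {
    ("unauthorised_driver", "medical_emergency"): Behaviour.OFFICIAL_DESTINATIONS_ONLY,
    ("unauthorised_driver", "no_emergency"):      Behaviour.IMMOBILISE,
    ("authorised_driver",   "medical_emergency"): Behaviour.OBEY_DRIVER,
}

def test_car_follows_agreed_policy(car):
    # car.decide() is a hypothetical interface to the vehicle's software.
    for (driver, situation), expected in AGREED_POLICY.items():
        actual = car.decide(driver, situation)
        assert actual == expected, (
            f"{driver}/{situation}: agreed {expected.name}, car chose {actual.name}"
        )
```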

Two things are clear to me. First, that in the brave new world of MAI these sorts of behaviours would be part of the design, which means that someone will have to confirm (test) that the actual behaviour of the car matches the agreed (and possibly regulated) designed behaviour; and, second, that the job of the tester has just got a lot more intellectually interesting – and, probably, far better paid. I don’t see MAI and ML systems replacing conventional testers (well, not any time soon) but complementing them – augmenting their testing capacity.

The last chapter of this book looks towards “What’s Next for AI/ML Testing Tools” and the possibility of truly autonomous testing (and its implications). I could see this tying in with the use of “digital twins” in support of building business automation (see the latter part of this article). I think that “Accelerating Software Quality: Machine Learning and Artificial Intelligence in the Age of DevOps” is important reading for anyone involved in “building software without surprises” in today’s, and future, environments.

Comments
  1. The premise of this post is incorrect. Testing isn’t a dry checking whether something worked. It is much more nuanced. You are asking the question, ‘Will this ever fail?’ – No matter who is the customer, no matter what they do, no matter which device they use, no matter how many updates they have, no matter if they are targeted by a hacker, no matter what their level of technical sophistication, will this ever fail? That question cannot be answered by test automation.

    There are others who have answered this question:
    kaner.com
    developsense.com
    satisfice.com
    geraldmweinberg.com

    1. Did you actually read my post? I said “developers focus mostly on demonstrating that their lovely code works rather than on torturing it until it breaks (and remember that the purpose of most testing is to find defects, so they can be removed”. So I agree that: “Testing isn’t a dry checking whether something worked”. And I talk about augmentation of testing with automation, not about replacing people entirely.

      You point out that the testing issue is BIGGER than most people think it is, dealing with more stakeholders. I agree – but I don’t see why automation can’t help. Automate the routine (static code checking is a good start), use people where they add value. I see lots of applications with defects and I don’t see testing currently as being over-resourced so I suspect that your questions aren’t being fully answered by manual testing 🙂
