DevOps at the Oval - Development vs. Operations

CloudBees and friends held a great “Oval Table” discussion about DevOps (the merging of the Dev silo, meaning development or programming, with the Ops silo, meaning automated systems operations) a week or so ago at the KIA Oval. Attending were Ben Williams (Product Director, Temenos), Steven Armstrong (Principal Automation Engineer, Paddy Power betfair), Vilian Atmadzhov (Automation Engineer, Paddy Power betfair), Gary Gruver (Author and President, Gruver Consulting), Brian Fox (VP Product Management, Sonatype), Jayne Groll (Co-Founder, DevOps Institute), Paul Hinz (Sr. Director Middleware, Red Hat), Sanjeev Sharma (CTO, DevOps Evangelist, IBM), and Sacha Labourey (CEO, CloudBees).

Armstrong and Atmadzhov, the impressive DevOps users from Paddy Power betfair, are making DevOps work in the real world. That is partly because they have top management buy-in and partly because they are doing it right: they provide services that the business can use in small chunks, so the impact of any failure is kept low, as is “Mean Time to Recovery” (often considered so much more important in DevOps than “Mean Time Between Failures”).
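To make the recovery-versus-failure trade-off concrete, here is a minimal sketch using the standard steady-state availability formula, MTBF / (MTBF + MTTR). The figures are invented for illustration and do not come from the Oval Table discussion.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the service is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical baseline: a failure every 100 hours, one hour to recover.
baseline = availability(mtbf_hours=100, mttr_hours=1)

# Option A: fail just as often, but recover twice as fast.
faster_recovery = availability(mtbf_hours=100, mttr_hours=0.5)

# Option B: recover just as slowly, but fail half as often.
rarer_failures = availability(mtbf_hours=200, mttr_hours=1)

print(f"baseline:        {baseline:.4%}")
print(f"faster recovery: {faster_recovery:.4%}")
print(f"rarer failures:  {rarer_failures:.4%}")
```

Under these assumptions, halving recovery time buys the same availability as doubling the time between failures, which is one way to see why a team shipping small, contained services might put its effort into fast recovery.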

However, I guess I have a “tester’s mindset” because, as soon as I meet a basically good idea, I’m always thinking “how could this go wrong, even with the best of intentions?”. In the case of DevOps, I think Jayne Groll (co-founder and board member of the DevOps Institute) hit the nail on the head when she suggested to the Oval Table that people implementing DevOps often neglect the Ops part. It really isn’t “no-Ops” and shouldn’t be – perhaps “new-Ops” catches the idea better, she suggests. Interestingly, when I was talking to BMC a few days later, the attitude was rather opposed to any “no-Ops” ideas; if anything it leaned towards “no-Dev”, although not overwhelmingly so.

Things go wrong with DevOps, I think, when either Dev or Ops gets to rule the roost. The Dev culture can be very strong (especially, perhaps, in the open source world) and can overwhelm the Ops culture if you aren’t careful. Yet Dev is often optional in practice (you can usually buy, rent or orchestrate packages and services to do most things) and Ops mostly isn’t: somebody needs to orchestrate the services and manage capacity, resilience, security and so on in the longer term, at least in the current state of AI.

This is important because, whatever the developers think, the business doesn’t really want continuous and rapid delivery of more software code; it wants continuous and rapid delivery of working business services. These, as Williams pointed out, include governance, security, compliance and all that stuff, and if they aren’t included in the DevOps feedback loop, the continuous delivery pipeline stalls while they are sorted out. This then makes DevOps almost pointless: delivery is only as fast as the slowest bottleneck, and bolting security on afterwards, never a very good idea, can be excruciatingly slow.
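The bottleneck point can be sketched in a few lines. The stage names and capacities below are hypothetical, chosen only to show that stages run in series deliver at the rate of the slowest one.

```python
# Hypothetical pipeline stages and how many changes each can handle per day.
# These numbers are illustrative assumptions, not measurements.
stage_capacity = {
    "build": 40,
    "automated_tests": 25,
    "security_review": 2,  # bolted on at the end and done by hand
    "deploy": 30,
}

def pipeline_throughput(capacities: dict[str, int]) -> tuple[str, int]:
    """Return the bottleneck stage and the sustained rate of a serial pipeline."""
    bottleneck = min(capacities, key=capacities.get)
    return bottleneck, capacities[bottleneck]

bottleneck, rate = pipeline_throughput(stage_capacity)
print(f"Sustained delivery: {rate} changes/day, limited by {bottleneck}")

# Doubling build capacity changes nothing while the review stage is the limit.
faster_build = {**stage_capacity, "build": 80}
print(pipeline_throughput(faster_build))
```

In this sketch, however much the build and test stages are automated, the business still receives two changes a day until the security step is brought inside the loop.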

The business can usually cope with a “mean time to recovery” culture, if small things with restricted scope fail and are fixed fast. Some failures, however, simply can’t be tolerated, no matter how fast the error can be fixed. If you post unencrypted personal customer information on the Internet, fines under the EU’s GDPR can be up to 4% of global annual turnover; and if you allow personal banking services to read about impending investment banking trades in advance, the regulators could close you down (because of the insider trading opportunities).

Such potential issues must be identified early and designed in from the start for any system where they matter. This is Ops culture (concerned with maintaining service levels and not breaking regulations) rather than Dev culture (concerned with exciting innovation and “try it and see what happens, back it off if it doesn’t work”).

Despite the ordering in the name DevOps, Ops is actually more important than Dev. Ops delivers working systems to the business over many years, systems that (hopefully) help make money for the business and don’t give the business unpleasant surprises. Dev merely delivers the potential for making money and is active for only a small part of the system lifecycle. The ROI from a clever development is only realised in operational use of the system and is zero on the day of delivery.

So, what to do? Well, you should pay attention to culture and people. If the Dev side of the culture is particularly strong and respected in the company, or if certain people (“hero programmers”) are particularly powerful, you risk building an unbalanced DevOps continuous delivery pipeline, one that delivers code fast rather than one that delivers business benefit. And perhaps you should promote a services culture, one that thinks in terms of delivering workable and manageable business services as a whole, rather than just code.