DevOps: Who We Are and Where We Are Going
by Dan Razzell

Dan Razzell, May 21, 2013

Everything old is new again, including perennial questions about the human condition. Here at ActiveState, we got to talking the other day about a recent DevOps paper by Dave Zwieback, The Human Side of Postmortems, published by O'Reilly in 2013. This paper discusses human performance under stress, citing a much earlier study by Yerkes and Dodson which examines the effects of stress on cognitive ability in mice. Although the study was published over a hundred years ago, largely forgotten for the first half of the century, and largely misrepresented through the second, its insights resurface from time to time and have recently caught the attention of the DevOps community.

The representation of the Yerkes-Dodson empirical law from Daniel Kahneman (Attention and Effort, Prentice-Hall, 1973) shows that, as problems engage us and even stress us, our performance is enhanced to a point, but soon begins to degrade. Anyone who has been involved in troubleshooting during a critical operations outage will recognize this effect at once. As stress and fatigue accumulate, we find that we can still perform simple, routine tasks with some ability, but complex tasks become disproportionately more difficult.

The Yerkes-Dodson law has important implications for DevOps, not merely because it validates our firsthand experience but also because of how it challenges us, as designers, to bridge the gap between development and operations.  We're always on the lookout for ways to reduce complexity.  Here we see that the greatest leverage comes from reducing complexity in stressful situations.

Thinking About How We Got Here

We know that there's a natural tendency to build systems whose accumulated complexity pushes the limit of what the brightest people in the room can handle. From an operational (and staffing) perspective, this wonderful achievement is exactly what we don't want to end up with, because (a) people who are groggy and stressed in the middle of a long outage are easily overwhelmed by complexity, and (b) people who happen to be present during an outage are, on average, going to be average people. To mitigate these effects, it's simply not practical to raise the bar by requiring senior expertise for routine operations and outage management. Most of the time, we need to apply that precious expertise elsewhere.

What DevOps really wants to deliver are systems whose design and operation are as simple as possible to reason about. Taken by itself, this is nothing new. In our field, Tony Hoare (1), Fred Brooks (2), Carver Mead (3), and Bruce Schneier (4) have all stressed the importance of simplicity in system design, taking their cue from Aristotle and Plato. For several decades, they've made a philosophical call in response to a practical problem. Wise men don't advocate such attention to fundamentals unless they foresee that there is no other way forward.

And this history provides a clue to what I think is really going on now in the DevOps movement. As the interval shortens between development and deployment, we must come to treat these two activities as different aspects of the same process. That, combined with our concern for simplicity, in turn means that we have to grudgingly unwind the particular reductive analysis that has served us so well, and try to break down the problem along different lines.

A Different Revolution

It's as if we had to imagine ourselves in a different timeline in which the industrial revolution never took place. As always, we're driven to look for economies of scale, but suppose that, for whatever reason, the idea of manufacture based on subassemblies just never occurs to us. Instead, after trying to scale up traditional cottage methods that involve painstaking craftsmanship, someone accidentally discovers a way to perfectly replicate an object at zero cost.

What would it look like to build an economy based on perfect replication? Suppose that, no matter how complex the artifact, we could spin up as many instances as the economy needed, and we would have a clear conscience about throwing them away once they're no longer required? Clearly, we would bypass certain traditional problems, but many others would still be with us. The "paradox of choice," as discussed by Barry Schwartz, is likely to become more pernicious. An incorrect design replicated at massive scale is still a design problem. But is it still the same design problem at scale as it was for a single instance? Or will there be emergent, combinatorial problems to cope with? And is it the same design problem if the solution doesn't presuppose delegating it in pieces to different silos of expertise?

Life in the Cloud

Complexity in this hypothetical "cloud" world remains a prominent challenge, and it brings us around full circle. Reasoning about configuration is a complex cognitive task, one which, as we have seen, can be readily overwhelmed by stress. But this time around, without the distractions of manufacturing lying smack in the middle (what ActiveState's developer evangelist John Wetherill recently called "fiddling around with the plumbing"), we can now afford to engage with -- and expose -- design and operations as a unified whole. And that's where we go looking for deep simplicity.

It's an easy prediction that we will increasingly see design and operation as common aspects of the same artifact. DevOps will continue to draw upon the governing principles of design -- simplicity, modularity, composability  -- even from within the crucible of real-time operations. Most importantly, we will empower our customers to do so, and here I speak for DevOps and Private PaaS organizations everywhere.

Indeed, I believe that private PaaS is where the deeper story arc is resolving. Among the cloud service layers, PaaS provides the greatest expressive range, and private PaaS is its most empowering form, where design and operations come together like nowhere else. These are excellent virtues to work with. Our particular and somewhat humbling task is to make them "as simple as possible, but no simpler."

(1) "There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies."
C.A.R. (Tony) Hoare

(2) "Software entities are more complex for their size than perhaps any other human construct because no two parts are alike. If they are, we make the two similar parts into a subroutine. In this respect, software systems differ profoundly from computers, buildings, or automobiles, where repeated elements abound."
Fred Brooks, Jr.

(3) "It's easy to have a complicated idea. It's very, very hard to have a simple idea."
Carver Mead

(4) "The worst enemy of security is complexity."
Bruce Schneier

About the Author

Dan has a background in language design, system administration, development, and network operations, with information security and identity management experience as well. He's worked with everything from supercomputers to embedded control systems. His past involvements include the Laboratory for Computational Intelligence, Parasun, Sun Microsystems and WestGrid.