Change-Able Operations or "Could DevOps Save Spiderman?"

[This article first appeared in the SAGE-AU newsletter.]

Have you heard about the New York City Broadway show Spider-Man Turn Off the Dark? It should have been a big success. The music was written by Bono and the Edge from U2. It was directed by Julie Taymor, who had previously created many successful shows including The Lion King. Sadly, before it opened, the show was already making headlines because six actors had been seriously injured, among other issues.

The show opened late, but it did finally open. It ran from June 2011 to January 2014.

When the show closed, Taymor said that one of the biggest problems with bringing the show to production was that they were using new technology that was difficult to work with. Many of the scenes involved choreography that was highly dependent on pre-programmed robotics. Any change that involved the robotics required a 5-hour wait.

A 5-hour wait?

Normally directors and choreographers can attempt dozens of changes in a day of rehearsal to get a scene or dance number "just right." Imagine finding yourself in a situation where you can only attempt a new change once or twice a day.

The ability to confidently make changes at will is key to being able to innovate. Innovation means trying new things and keeping what works, throwing away what doesn't. If you can't make changes, then you can't innovate.

Consider the opposite of innovation. We've all been at a company that resists change or has calcified to the point where they are unable to make change. Nothing can get better if you can't change policies, procedures, or technology. Since entropy means things slowly get worse over time, an organization that is unable to change, by definition, is an organization that is spiraling towards doom.

I was reminded of this recently by the Heartbleed security issue. For me, as for most system administrators, the Heartbleed bug meant spending a lot of time upgrading the software and firmware of nearly every system in the organization. For many of us it meant discovering systems that hadn't been upgraded in so long that the implications of upgrading them were unknown. As sysadmins we wanted to protect ourselves against this security flaw, but we also had to face our own fear of change.

We need to create a world where we are able to change, or "change-able".

There are many factors that enable us to be "change-able". One factor is frequency: we can make change, one after the next, in rapid succession.

Software upgrades: Every 1-3 years there is a new Microsoft Windows operating system, and upgrading requires much care and planning. Systems are wiped and reloaded because we are often re-inventing the world from scratch with each upgrade. On the other hand, software that is upgraded frequently requires less testing each time because the change is less of a "quantum leap". In addition, we get better at the process because we do it often. We automate the testing and the upgrade process itself, and we design systems that are field-upgradable or hot-upgradable because we have to... otherwise these frequent upgrades would be impossible.

Procedures: Someone recently told me he doesn't document his process for doing something because it only happens once a year and by then the process has changed. Since he has to reinvent the procedure each time, the best he can do is keep notes about how the process worked last time. Contrast this with a procedure that is done weekly or daily. You can probably document it well enough that, barring major changes, you can delegate the process to a more junior person.

Software releases: If you collaborate with developers who put out releases infrequently, each release contains thousands of changes. A bug could be in any of those changes. Continuous Delivery systems compile and test the software after every source code change. Any new bugs discovered are likely to be found in the very small change that was recently checked in.
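To make the contrast concrete, here is a minimal Python sketch. It is not any particular Continuous Delivery product; the function name first_breaking_change and the build_passes callback are hypothetical stand-ins for "compile and run the tests". It only shows why testing after every change narrows the search to a single change.

```python
from typing import Callable, List, Optional


def first_breaking_change(
    changes: List[str],
    build_passes: Callable[[List[str]], bool],
) -> Optional[str]:
    """Apply changes one at a time and test after each one.

    Returns the first change whose test run fails, or None if every
    run passes. `build_passes` is a hypothetical stand-in for a real
    compile-and-test step.
    """
    applied: List[str] = []
    for change in changes:
        applied.append(change)
        if not build_passes(applied):
            # The failure appeared right after this change was added,
            # so this is the one change to inspect first.
            return change
    return None


if __name__ == "__main__":
    # Assumed example data: four changes, the third introduces a bug.
    changes = ["fix-typo", "add-cache", "rewrite-parser", "bump-version"]

    def build_passes(applied: List[str]) -> bool:
        return "rewrite-parser" not in applied

    print(first_breaking_change(changes, build_passes))
```

With an infrequent release, the same failing test would only tell you that the bug is somewhere among all four changes (or all thousand).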

Another factor in being "change-able" is how difficult it is to make a change.

I've been at companies where making a DNS change required editing 5 different files on two different systems, manually running a series of tests and then saying a prayer. I've been at others where I typed a command to insert or delete the record, and the rest just happened for me.
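As an illustration of the second kind of company, here is a rough Python sketch of what "the rest just happened" might look like. Everything in it is an assumption made for the example: the Zone class, the change_record function, and the in-memory record table are hypothetical, and a real tool would also write zone files and reload the name server. The point is that one call makes the edit, bumps the serial number, and validates the result, so no step is left to memory or prayer.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Zone:
    """Hypothetical in-memory stand-in for a DNS zone."""
    serial: int = 1
    records: Dict[str, str] = field(default_factory=dict)  # hostname -> IP


def _validate(zone: Zone) -> None:
    """Stand-in for the manual tests: every address must parse as an IP."""
    for address in zone.records.values():
        ipaddress.ip_address(address)  # raises ValueError if malformed


def change_record(zone: Zone, action: str, name: str,
                  address: Optional[str] = None) -> Zone:
    """Insert or delete one record and return the updated zone.

    The original zone is never modified, so a failed validation
    leaves the existing data untouched.
    """
    records = dict(zone.records)
    if action == "insert":
        if address is None:
            raise ValueError("insert requires an address")
        records[name] = address
    elif action == "delete":
        if name not in records:
            raise KeyError(name)
        del records[name]
    else:
        raise ValueError(f"unknown action: {action}")

    candidate = Zone(serial=zone.serial + 1, records=records)
    _validate(candidate)  # the tests run every time, automatically
    return candidate


if __name__ == "__main__":
    zone = Zone()
    zone = change_record(zone, "insert", "www.example.com", "192.0.2.10")
    print(zone.serial, zone.records)
```

Because the function returns a new zone only after validation succeeds, a bad change is rejected before it can replace the working data.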

When changes are difficult to make, we make them less often. We are tempted to avoid any action that requires that kind of change. This has a domino effect that slows and delays other projects. Or, it means we make the decision to live with a bad situation rather than fix it. We settle for less.

When we make changes less frequently, we get worse at doing them. Therefore they become more risky to do. Because they are more risky, we do them even less. It becomes a downward spiral.

DevOps is, if anything, about making operations more "change-able". Everyone has their own definition of DevOps, but what they all have in common is that DevOps makes operations better able to change: change more frequently and change more easily. The result is confidence in our ability to make changes. In that way, confidence is a precondition to being able to innovate.

Which brings us back to Spider-Man Turn Off the Dark. How much innovation could really happen if each change took 5 hours? Imagine paying a hundred dancers, actors, and technicians to do nothing for 5 hours waiting for the next iteration. You can't send them home. You can't tell them "come in for a few minutes every 5 hours". You would, instead, avoid change and settle for less. You would settle for what you have instead of fixing the problems.

Would DevOps have saved Spiderman? Would a more change-able world make me less fearful of the next Heartbleed?


Posted by Tom Limoncelli in DevOps, Writing
