Migrating the Runbook - a Journey from Legacy to DevOps

October 18, 2016 by Mark Phillips


"Just type this invoice up for me will you please?" asked a sheepish looking Malcolm.

"I do have better things to do you know" I replied.

"Yes, yes, I know. But who else is going to do it?"

"Give it here then!"

In the beginning, there was a problem

That was a fairly common interaction for me as a young lad. I was fresh out of school and working my summer in the sales department of a local car dealership. My job was mostly admin-related tasks, which up until that point hadn't included doing all the sales guys' typing. Our secretary had recently departed the company, and the sales guys all figured I could happily take over the typing jobs. The duty had fallen to me because a) I had the stereotypical 1980s glasses of a nerdy computer kid and b) they all knew I actually was a nerdy computer kid. So fair play to them for assuming I could type; I could.

The thing was, I really did have better things to do, and these daily interruptions were eating into my productive time. I wanted that time back, so I got a book on programming VBA (everyone remember that? yes? jolly good) and started to write some macros in Word to generate invoices. After a short while I had a simple dialog box that even the oldest curmudgeon in the office could work, which filled in a neat-looking invoice and printed it out. The interruptions stopped, the sales folk could all help themselves to get the job done, and I got countless hours of my life back.

That was the start of my IT career – automating. Solving problems for people using technology.

The runbook

Over the course of the last twenty-odd years I've worked in a lot of companies, and yet even today I still see places doing things manually. As recently as last year I saw a 'runbook' in action – basically a list of manual steps somebody had to do to deploy code into production. As I'm fond of saying, "if the solution is delivering a business goal then it's not 'wrong' per se, but sometimes things can be done better". Following a runbook by hand leaves so much to chance that one can never really be sure what the result will be.

Image source: wikipedia.org/wiki/Decrepit_car

Automation is about trust. We can have faith in task consistency with automation. Things turn out more the way we expect them to, time after time.
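As a minimal sketch of that idea, a runbook's manual steps can be written down as an ordered list of functions and run the same way every time. Everything named here (`run_runbook`, `RUNBOOK`, and the three step functions) is hypothetical and only stands in for real deployment tooling; the point is the shape, not the contents of each step.

```python
from typing import Callable, List, Tuple

# Each runbook step is a (description, action) pair. The actions below are
# hypothetical stand-ins; a real deployment would call actual tooling.
Step = Tuple[str, Callable[[], None]]


def run_runbook(steps: List[Step]) -> List[str]:
    """Run every step in order, stopping at the first failure.

    Returns a log of what happened, so every run leaves the same kind of record.
    """
    log: List[str] = []
    for number, (description, action) in enumerate(steps, start=1):
        try:
            action()
        except Exception as error:
            log.append(f"{number}. FAILED {description}: {error}")
            break  # never carry on past a broken step
        log.append(f"{number}. ok {description}")
    return log


# Hypothetical steps, named only for illustration. They do nothing here.
def stop_service() -> None:
    pass


def copy_release() -> None:
    pass


def start_service() -> None:
    pass


RUNBOOK: List[Step] = [
    ("stop the service", stop_service),
    ("copy the new release", copy_release),
    ("start the service", start_service),
]

for line in run_runbook(RUNBOOK):
    print(line)
```

Because the steps live in one list, the order cannot drift between runs, and a failure stops the run at a known point rather than leaving someone to guess how far they got.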

Image source: porsche-leipzig.com

I've visited customers and seen plenty who still aren't trusting the computer to do the heavy lifting. Pondering why this might be, I've drawn four conclusions:

  1. Speed of entry – "oh it's quicker if I just type this straight into the server now"
  2. It's too hard
  3. I only need to do this once
  4. But if I automate this, won't my job go away?

Number two is a tooling problem – if it's too hard, then the tool is wrong for the job (ever tried putting a screw into a wall with a hammer?). Number three always makes me smile, because who ever heard of a temporary solution in IT? We all know that "bits of sticking plaster" end up running production systems. So I always think: automate it from the start, and it's repeatable and documented.

Number one is actually fed by numbers two and three. Number four is a bit of a fallacy too – automation isn't always about replacing people, it's about augmenting them. Get some of your own time back as an engineer and you are free to find more ways to be productive (I'm sure we've all heard of Google's 20% rule and the products that originated from that free thinking). Every business needs to grow, and if you find time to investigate new ways to enable that growth then all the better.

Augmenting DevOps

But how does automation feature in the DevOps story? Well, "DevOps" as a word is a contraction of two words - Development and Operations. It was coined, supposedly, around the late 2000s by a developer looking to get his code into production a lot faster. The two departments have opposing goals - development want to get their shiny new code out in front of people as quickly as they write it, yet operations want a stable environment for customers to visit (and let's face it, as an operations person, who wants that dreaded call at 3am when something goes terribly wrong?). The defining ethos of the word was, in effect, "we should communicate". The proposal was that Dev and Ops should stop working as discrete silos, where one simply throws their work over the fence to the other, and should start talking to one another to build "pipelines" – the software equivalent of assembly lines.

Some might say, therefore, that DevOps is common sense. To an extent, it is. When we're a team, and every business is, we should remember we're all in the same boat, and rowing together in the same direction is going to get us places a lot faster. But often that's hard to do in a large organisation because of many factors – geography, legislation, process, etc.

Have you heard of Dunbar's Number? To quote Wikipedia, it "...is a suggested cognitive limit to the number of people with whom one can maintain stable social relationships. These are relationships in which an individual knows who each person is and how each person relates to every other person...with a commonly used value of 150".

Getting things done as a team is so much easier when everybody knows each other that little bit better. This is why so many corporates force the often cringey team-building days upon us :) But they have merit. Learn a little about the social side of your colleagues and, sure enough, you'll work together more easily. The problem with this magic 150, and with the ethos of DevOps, becomes clear when you work in a massive, global corporate, though. How do you achieve the sort of relationships a close-knit community enjoys? The DevOps ethos was formed by people working in small, often startup-sized, companies.

I watched an interesting TED talk a while ago by a chap called Yves Morieux, titled "As work gets more complex, 6 rules to simplify". I highly recommend watching it. His first rule is this:

Understand what your colleagues actually do

Coming back to the core tenet of DevOps, communication, one can see how this rule plays a key part. Once developers and operations understand each other's working lives, it becomes easier to have the conversation about implementing a trusted, automated pipeline. Sidenote: on the subject of working together, I read an excellent book many years ago called "Getting to Yes" by Roger Fisher and William Ury. It talks about negotiating from objective criteria rather than digging into positions. It's well worth a read.

Looking at this in the context of a globally dispersed corporate organisation, it is possible to encourage, or bolster, the ability for colleagues to understand one another's jobs using technology.

One of my favourite quotes is attributed to Einstein: "make things as simple as possible, but not simpler". Technology, over the twenty-odd years that I've been working with it, hasn't always got simpler. On the contrary, a lot of the time I find things have got much more complicated, and often harder as a result. Using technology to help understand what your colleagues do needs to be easy, because then it becomes far more natural to do so. The world now moves at a rapid pace of change and attention spans are short (thanks Twitter!), so information needs to be quick to access, easy to keep up to date, and most importantly easy to search for relevance. Wikis are a great documentation tool; just look at the success of Wikipedia.

Later in his talk, Yves Morieux goes on to speak about feedback loops. Reinforcing his first rule is tremendously beneficial to this communications story, and I've seen companies do this very well with technology. I once visited a customer who showed me their excellent Ansible development workflow. Anybody could write playbooks and contribute to an internal repository via a "pull request". Essentially they had mimicked, internally, a typical open source software development model you would see on GitHub. There was a feedback loop – people wrote code, and submitted it for others to review and bring into the bigger picture. Dispersed teams, geographically in this particular case, working together.

I've seen similar arrangements taken a step further, with commit messages and PRs automatically sending a message into a chat system. Within the Ansible team we're big users of Slack. For a time, when we were just a small startup, it was my personal saviour. Although I was over 3,000 miles away from HQ, I always felt very much connected to the organisation. Slack was magical at helping me understand what my colleagues did. (In true BBC style, other chat systems are available. I say this with some jest; as a veteran Internet user I've been using IRC for over 20 years, plus countless other chat systems along the way. Slack's differentiator is that they've made using it easy and relatively simple.)
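To make the commit-to-chat idea concrete, here is a rough sketch. It assumes a chat system with an incoming webhook that accepts a JSON body containing a `text` field, which is how Slack's incoming webhooks work. The function names and the example values are made up for illustration, and nothing is sent unless `post_to_chat` is called with a real webhook URL.

```python
import json
import urllib.request
from typing import Dict


def build_commit_message(author: str, repo: str, summary: str) -> Dict[str, str]:
    """Build the JSON body for a chat notification about one commit."""
    return {"text": f"{author} pushed to {repo}: {summary}"}


def post_to_chat(webhook_url: str, payload: Dict[str, str]) -> int:
    """POST the payload as JSON to the webhook and return the HTTP status.

    The webhook URL is an assumption: it must come from your own chat system.
    """
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical commit details, used only to show the message shape.
payload = build_commit_message("mark", "playbooks", "Add nginx role")
print(json.dumps(payload))
```

In practice a repository hook or CI job would call something like `post_to_chat` after each push, so the whole team sees the change without anybody having to announce it.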

Chat systems can take feedback loops and automation to the next level too, with ChatBots. I'm sure we've all read about bots and how they're revolutionising our interactions with companies, but it's always impressive to see a company implement a ChatBot that can be asked to spin up 50 new web servers on demand.
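As a sketch of what sits behind such a bot, the core is usually small: recognise a command, check it against a limit, and hand it to whatever does the provisioning. The command wording, the `MAX_SERVERS` limit, and the `provision` function below are all assumptions for illustration. Here `provision` only returns the names it would create and does not touch any real infrastructure.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical command grammar: "spin up <count> [new] <role> servers".
COMMAND = re.compile(r"^spin up (\d+)(?: new)? (\w+) servers?$", re.IGNORECASE)

MAX_SERVERS = 100  # an assumed safety limit, not taken from any real system


def parse_command(message: str) -> Optional[Tuple[int, str]]:
    """Return (count, role) if the message is a recognised command, else None."""
    match = COMMAND.match(message.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2).lower()


def provision(count: int, role: str) -> List[str]:
    """Stand-in for real provisioning: only returns the names it would create."""
    return [f"{role}-{index:02d}" for index in range(1, count + 1)]


def handle_message(message: str) -> str:
    """Turn one chat message into the bot's reply."""
    parsed = parse_command(message)
    if parsed is None:
        return "Sorry, I didn't understand that."
    count, role = parsed
    if not 1 <= count <= MAX_SERVERS:
        return f"I can only create between 1 and {MAX_SERVERS} servers at a time."
    names = provision(count, role)
    return f"Creating {count} {role} servers: {names[0]} to {names[-1]}"


print(handle_message("spin up 50 new web servers"))
```

Keeping the limit check in the bot, rather than trusting the person typing, is part of what lets a team hand this sort of power to a chat window.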

The journey from legacy to DevOps

The migration of old methods to the modern DevOps mindset can be taken in small steps by introducing simple, easy-to-use technology that puts people to the fore. When a problem is solved using a piece of technology that's quick and easy to get started with, the solution is far more likely to achieve widespread adoption. See how easy it is to edit a wiki page, put some code into a GitHub repository, create a Slack team or, and it would be remiss of me not to say, write an Ansible Playbook, and you'll witness how small seeds grow into big trees. When technology isn't restricted to one particular group, it's easy to introduce to many teams at the same time. It helps them work to their own cadence in parallel and also to come together in one coherent whole.

I'll leave you with one last anecdote from my past. Many years ago I fought hard to get a wiki introduced into a large corporate I was working in. The documentation practice at the time was weighty, and often unread, Word documents. These documents quickly went out of date, and people couldn't easily find information contained within them from a central location. After the wiki was implemented, more documentation was completed in the following six months than had been written in the past three years.

Rapid progress brought about by ease of use and simplicity.




Mark Phillips

Mark Phillips is a UK-based Product Manager. With almost a quarter of a century of industry experience, he has designed and engineered automated infrastructures at every level - from a handful of hosts in startups, to the tens of thousands in investment banks. You can follow him on Twitter at @thismarkp.
