12 days of DevOps: Expert tips to help teams succeed

Is your holiday wish to begin or improve a DevOps effort? Read on for 12 ways to help your teams shine

Enterprise technology leaders in any firm and any vertical will tell you there is no prescriptive way to ensure that a DevOps transformation journey will be successful. Every organization is architected differently, and each faces unique pressures from different outside forces. There is no magic 8-ball to tell you where to begin the journey. However, there are several expert DevOps tips and tricks that every IT leader should keep in mind.

[ Read our related story:  DevOps culture: 3 ways to strengthen yours in 2019. ]

To get into the holiday spirit, and offer help to those in need, I reconnected with Gary Gruver, author of Starting and Scaling DevOps in the Enterprise, to discuss some tried and true methods to transform a DevOps culture, increase IT efficiency, and ultimately, deliver more business value.

We narrowed the list to 12 ideas for presents that any enterprise IT leader may want to add to their wish list this year while pursuing their own DevOps journey.

It’s never easy to know where to begin; you just need to start! So…

On the first day of DevOps, select a deployment pipeline as your starting spot

A deployment pipeline is code that must be developed, qualified, and released as a system because it’s coupled together. Ideally, if you’ve got loosely coupled systems, a pipeline might be owned by a team of five, and those teams can move pretty quickly. But in many large organizations, that’s a much larger group of people with exponential layers of complexity.


It is easy to start small and show early success, but starting with the largest deployment pipeline can typically help expose where the most waste and inefficiency resides within an organization. For example, trying to coordinate code development across 500 people is no easy task, but large, tightly coupled systems often prove to be the biggest opportunity for improvement.

We’re not saying you should try to boil the ocean by attempting to solve everything at once. But picking a pipeline that is a bit complicated, that touches on more than one technology stack and requires coordination between different teams, will help show that DevOps practices really do make a difference.

On the second day of DevOps, set up a continuous improvement culture

In many cases, it’s important to pick leaders who are going to help coordinate this improvement across the organization. This could be the head of Dev, QA, Ops, security, or another function.

If you get the executives on board, you’ll likely fly downwind. If you don’t, you’ll probably fight a headwind. You’ll see that you can make some progress from the bottom up. But if you don’t get executives' engagement, alignment, and commitment to make improvements, you won’t be able to drive the cultural changes. You’ll struggle to find the funding that you need, and you’ll tend to lose momentum on your transformation.

It’s important to understand that the quality of the daily work matters, but the improvements in the quality of the daily work are what really matter.

[ Trouble getting your managers on board? Read our related article: DevOps: What’s in it for managers? ]

On the third day of DevOps, target an environment on the deployment pipeline

The deployment pipeline is where the code flows from a business idea all the way through to production. Picture it with developers on the left side and production on the right. What environments does the code flow through on the way? Do you start with a pre-prod environment? Do you have a staging environment? A user acceptance testing environment? A QA environment? Do you have more than one Dev environment?

In general, you want to pick an environment to start understanding where the issues and the challenges are in the organization. One of the best ways to do that is to look at the leaders who have agreed to lead a continuous improvement effort and go as far right on that deployment pipeline as possible.


Production environments are typically off limits, and generally, nobody who’s anywhere near the left side of the process gets to touch them. So you need to prove that what you’re trying to do is going to work in the non-production environments first. The goal is to understand how those environments are configured and what needs to change over time, and ultimately to get to the state of having immutable environments that we deploy to, use, and then throw away.
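
To make that create-use-throw-away end state concrete, here is a minimal Python sketch of an ephemeral environment. The create_env.sh, destroy_env.sh, and deploy.sh scripts, and the TARGET_ENV variable the tests read, are hypothetical placeholders standing in for whatever provisioning and deployment tooling you actually use.

```python
#!/usr/bin/env python3
"""Ephemeral, immutable environments: create one, use it, throw it away.
Sketch only: create_env.sh, destroy_env.sh, and deploy.sh are placeholders."""
import contextlib
import os
import subprocess
import uuid

@contextlib.contextmanager
def ephemeral_environment():
    name = f"test-env-{uuid.uuid4().hex[:8]}"
    # Hypothetical provisioning command; swap in your IaC tooling.
    subprocess.run(["./create_env.sh", name], check=True)
    try:
        yield name
    finally:
        # Never patch or reuse it: tear it down and rebuild fresh next time.
        subprocess.run(["./destroy_env.sh", name], check=True)

with ephemeral_environment() as env:
    subprocess.run(["./deploy.sh", "--env", env], check=True)
    # Point the test run at the throwaway environment via an environment variable.
    subprocess.run(["pytest", "-q", "tests/smoke"], check=True,
                   env={**os.environ, "TARGET_ENV": env})
```

The context manager guarantees the teardown runs even when the tests fail, which is what keeps the environment disposable rather than hand-patched.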

On the fourth day of DevOps, find stable, automated tests

With the environment you’ve targeted, take your automated tests and run all of them 20 times in a row. See if you get the same answer. If you have a bunch that always fail, set them aside until the defect is fixed and they can pass. If you have a set that toggles between pass and fail, you’re not in a very good situation. You will end up frustrating developers if you ask them to keep those builds green when the failures are outside of their control.

Find a set of automated tests that pass and that you can run over and over on an ongoing basis. If you can’t count on that as a stable signal, you can’t use it to change how you’re doing development. Instead of inspecting quality in with manual testing and entering defects, the goal is to get the organization to keep these automated tests running and passing, fixing defects without the overhead of logging them. They’re building quality in early.

Scott Prugh from CSG gave a great talk a few years ago about how they introduced DevOps into their mainframe operation. One of the very first things they did was write five unit tests. That’s all. And that immediately started paying off. Scott and his team were able to stop things in their tracks when there were problems.

If you’re trying to change your processes and get a software pipeline going and your tests are flaky, you can fall into the signal-to-noise ratio trap. The team may be thinking that your environment is broken when in fact it’s your tests that are broken. Flaky tests are the devil. They waste a huge amount of time in organizations. If a test is intermittently failing, it may well indicate a problem in the product, but most often it’s just a poorly written test. Disable flaky tests entirely: log the bug, disable the test, and move on, because they’re not helping you; they’re hurting you.
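
To make the “run them 20 times in a row” check concrete, here is a minimal Python sketch that re-runs each test target and sorts it into stable, always-failing, or flaky buckets. The pytest command and the test paths are placeholder assumptions; substitute whatever runner and suite you actually have.

```python
#!/usr/bin/env python3
"""Run test targets repeatedly and classify each as stable, always failing,
or flaky. Sketch only: swap in your own test runner and targets."""
import subprocess
from collections import defaultdict

TEST_TARGETS = ["tests/test_checkout.py", "tests/test_inventory.py"]  # hypothetical
RUNS = 20  # the "20 times in a row" check

def passes(target: str) -> bool:
    """Return True if the target passes (runner exits with code 0)."""
    return subprocess.run(["pytest", "-q", target], capture_output=True).returncode == 0

results = defaultdict(list)
for _ in range(RUNS):
    for target in TEST_TARGETS:
        results[target].append(passes(target))

for target, outcomes in results.items():
    if all(outcomes):
        verdict = "stable: keep it in the pipeline signal"
    elif not any(outcomes):
        verdict = "always failing: set aside until the defect is fixed"
    else:
        verdict = "flaky: log a bug, disable it, and move on"
    print(f"{target}: {sum(outcomes)}/{RUNS} passed -> {verdict}")
```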

On the fifth day of DevOps, ensure that your environment is stable

Even when you think your tests are good, you still need to run them over and over to make sure your environment is stable. I have seen many different organizations get stuck because there was an F5 load balancer in the system that was intermittently timing out.

When you’ve been doing stuff manually, an environment of a certain size may work fine. When you really start to load it up with automated tests, the whole environment structure around it may not be stable enough to handle them. Once you have tests you think are good, really load the environment up and see if you always get the same answer out of those tests. Use the same code and the same environment. Don’t change anything; just stress it as much as you can, and then go through and fix any issues with environmental stability.
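
One hedged way to run that load check is sketched below: it fires an already-stable suite at the same environment many times in parallel and reports any run that comes back with a different answer. The suite command, run count, and concurrency level are assumptions to tune for your own setup.

```python
#!/usr/bin/env python3
"""Stress a target environment by running an already-stable suite many times
in parallel and checking that the answer never changes. Sketch only."""
import subprocess
from concurrent.futures import ThreadPoolExecutor

SUITE = ["pytest", "-q", "tests/smoke"]  # hypothetical known-good suite
RUNS = 50          # total runs against the unchanged code and environment
CONCURRENCY = 10   # how hard to load the environment at once

def run_suite(_: int) -> int:
    return subprocess.run(SUITE, capture_output=True).returncode

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    codes = list(pool.map(run_suite, range(RUNS)))

failures = sum(1 for code in codes if code != 0)
if failures:
    print(f"{failures}/{RUNS} runs failed under load: investigate environment stability")
else:
    print(f"All {RUNS} runs passed: the environment held up under load")
```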

If you don’t have a stable signal, you really can’t do anything to start changing how the organization develops quality and how you start building in quality instead of inspecting it.

A good practice is not to configure and maintain environments manually. Use a technology platform that sets up and configures these environments for you automatically, so that provisioning is push-button and self-service to the greatest extent possible. If you’re following manual scripts to configure environments, stop doing that. Make sure that all environment provisioning is on-demand and automated. Otherwise, best practices will be ignored in all of the other environments as well.
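
As a rough illustration of push-button, self-service provisioning, the sketch below reads a version-controlled spec and hands it to a hypothetical provision.sh wrapper; the spec fields and the wrapper script are assumptions standing in for your real infrastructure-as-code tool.

```python
#!/usr/bin/env python3
"""Self-service, on-demand environment provisioning from a declarative spec.
Sketch only: provision.sh stands in for Terraform, Ansible, a cloud API, etc."""
import json
import subprocess
import sys

# Declarative description of each environment; keep this in version control.
ENV_SPECS = {
    "qa":      {"size": "small",  "app_version": "latest", "seed_data": True},
    "staging": {"size": "medium", "app_version": "release-candidate", "seed_data": False},
}

def provision(name: str) -> None:
    spec = ENV_SPECS[name]
    # Hypothetical wrapper around your actual provisioning tool.
    # The point is the same either way: no manual steps, no hand-edited configs.
    subprocess.run(["./provision.sh", name, json.dumps(spec)], check=True)
    print(f"Environment '{name}' provisioned from spec: {spec}")

if __name__ == "__main__":
    provision(sys.argv[1] if len(sys.argv) > 1 else "qa")
```

Because the spec lives in version control, any drift in an environment shows up as a visible change rather than a silent one.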

[ Want more on this topic? Read also: 7 DevOps lessons learned in 2018. ]

On the sixth day of DevOps, ensure that your deployment is repeatable

If you’re trying to get a stable signal, repeat the process over and over to confirm stability. We started by getting down to a stable set of tests. Then we ran them on the same environment over and over again. The idea is to change one variable at a time and make sure that we’re getting stability in the system. We want to answer the question: “Can we take the code perceived to be stable, deploy it the way we deploy it on an ongoing basis, and get the same answer?”

If you can’t get the same answer each and every time you deploy, you probably need to drive consistency into your deployment process. This is a problem that must be resolved before you can go further in making improvements. When you’re doing manual deployments, for example, there’s a real opportunity for issues and challenges to creep in. It’s also a good opportunity to automate the deployments.
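
One simple way to check that consistency is to deploy the same build several times and confirm the result never changes. The sketch below assumes a hypothetical deploy.sh entry point, a smoke-test suite, and a deployed manifest file whose checksum serves as the fingerprint; swap in whatever your pipeline actually produces.

```python
#!/usr/bin/env python3
"""Check that a deployment is repeatable: deploy the same build several times
and confirm the deployed result and smoke tests never change. Sketch only."""
import hashlib
import pathlib
import subprocess

BUILD = "app-1.4.2.tar.gz"                                  # the code perceived to be stable
MANIFEST = pathlib.Path("/opt/app/deployed-manifest.json")  # hypothetical deployment record
RUNS = 3

def deploy_and_fingerprint() -> str:
    # Hypothetical deployment entry point; replace with your real one.
    subprocess.run(["./deploy.sh", BUILD, "--env", "staging"], check=True)
    subprocess.run(["pytest", "-q", "tests/smoke"], check=True)  # same answer every time?
    return hashlib.sha256(MANIFEST.read_bytes()).hexdigest()

fingerprints = {deploy_and_fingerprint() for _ in range(RUNS)}
if len(fingerprints) == 1:
    print(f"Deployment is repeatable: identical result across {RUNS} runs")
else:
    print("Deployments diverged: drive consistency in before going further")
```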


The vast majority of delays and errors are introduced during handoffs and during manual processes. This is where teams need to think hard about taking all the manual phases and manual steps out of this process. That can be scary in the beginning, but computers tend to be pretty good at repeating the same tasks over and over. Humans, not so much. We forget things, we get distracted, we skip a step inadvertently, or we run the same thing twice.

The point is, the more you automate things, the more you’ll have a record of what went wrong and ensure that a process is repeatable.

Anders Wallgren is chief technology officer at Electric Cloud. Anders brings with him over 25 years of in-depth experience designing and building commercial software. Prior to joining Electric Cloud, Anders held executive positions at Aceva, Archistra, and Impresse.