How to get a job as a site reliability engineer (SRE)

Site reliability engineers are the go-to professionals when production outages and other issues arise. Here's how to prove you're up for the challenge and get an SRE job.

Editor’s note: In this ongoing series for IT job hunters, we'll explore in-demand roles, necessary skills, and how to stand out in an interview. Here, Rob Hernandez, CTO of Nebulaworks, shares insights on getting a job as a site reliability engineer.

Site reliability engineer (SRE) salary range

$82,000 - $150,000 per year. Source: Glassdoor.

In a nutshell: What is a site reliability engineer (SRE)?

The site reliability engineer is the good systems administrator of the recent past. These individuals have expertise in systems engineering, networking, and development. They leverage all three to develop automated solutions to the routine, manual problems that less capable administrators perpetually revisit. Those in this role are often on call and need to be able to act quickly to resolve production outages.

What skills are needed for an SRE job?

As an SRE, you will develop your automation the same way a developer builds an application. The only difference is that this application is used to run other applications. A strong grasp of at least one programming language and of Git source control management is a must. This is how all of your work will be done, and it needs to be second nature for you to commit your changes and submit a pull request for others to validate your solution.

The ability to quickly understand and work with an existing codebase will be invaluable here. There will often be a codebase that you'll need to familiarize yourself with, and the more comfortable you are reading and reviewing code that you didn't write, the better. Many people want to "nuke and pave" and just start fresh, but the initial solutions exist in a codebase for a reason, and it's often too risky to throw it all out.

Underlying the entire architecture of all distributed systems is the network. Knowing how to break down each layer of the Open Systems Interconnection (OSI) model and pinpoint where to spend time troubleshooting is immensely valuable to an engineer who is woken up by an alert at 3:00 a.m.

How to stand out in an SRE interview

Contribute to open source. It's critical to learn how projects work with a large number of contributors and a codebase that's not your own. Open source allows you to find something that you're passionate about and get involved. Oftentimes you'll learn more than you are exposed to at your current job. The good news is that this work will complement your current skillset in ways you never thought possible. Highlight all of this on your resume.

Ask the hiring manager ahead of the interview if you can get some examples of the existing problems that plague your potential future team. If they are able to elaborate and share examples, come to the interview with ideas about how you might approach and try to solve the problems. Obviously you won't have all the details, but having this high-level discussion during an interview, along with some potential solutions, has netted me an offer on a few occasions. This shows your potential employer that you're willing and able to think critically about solutions even before joining the team.

Bonus: Sample SRE job interview questions

Question: What is the difference between a hub and a layer 2 switch?

Answer: A hub repeats every frame it receives out of all of its other ports, while a switch keeps a table of the MAC addresses it has learned and forwards each frame only to the port where its destination is connected.

Question: What is the "PATH" environment variable used for in a Linux shell? How would one append to it?

Answer: $PATH is a colon-separated list of directories that the shell searches for executables, so the operator can run a command by its basename instead of its fully qualified path. To append a directory such as "/opt/bin" to the PATH, you can run something like export PATH=$PATH:/opt/bin. Note that this command appends the directory even if it is already listed.
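
For candidates who want to go a step further in an interview, here is a minimal sketch of appending a directory only when it is not already in the PATH. The function name path_append is a hypothetical helper chosen for illustration, not a standard command, and this is just one of several reasonable ways to write it.

```shell
# path_append is a hypothetical helper, not a built-in shell command.
# It appends its first argument to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*)
            # Directory already present: leave PATH unchanged.
            ;;
        *)
            # Add the directory at the end; skip the colon if PATH is empty.
            PATH="${PATH:+$PATH:}$1"
            ;;
    esac
    export PATH
}

path_append /opt/bin
```

Wrapping both PATH and the candidate directory in colons lets a single pattern match an entry at the start, middle, or end of the list, and calling the function twice leaves only one copy of the directory.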

Rob Hernandez is the CTO at Nebulaworks, a consulting and SI firm that was built for engineers by engineers. He is responsible for defining the development principles and processes to deliver unbiased strategic solutions to client business outcomes.