How to explain service mesh in plain English

As people continue to scale and automate containers and microservices, you’ll hear a lot about service mesh. What exactly are the benefits? Here’s how to break it down, even to non-techies.

Technology adoption trends tend to produce additional technology adoption trends. That’s not necessarily cutting-edge analysis, but it offers a simple explanation for why you’re hearing more – or will soon – about service meshes. Increasing interest in this emerging technology follows the growth of containers and microservices, as organizations look for the best tools to scale their applications and environments in production in as automated a fashion as possible.

In short, service mesh tools like Istio can reduce the operational burden of managing microservices-based applications, and in particular traffic between services, which could otherwise involve significant and often unsustainable manual work.

As Red Hat CTO Chris Wright wrote earlier this year, “Alongside serverless, we see the service mesh concept taking off. A service mesh is essentially platform-level automation for creating the network connectivity required by microservices-based software architectures.”

That’s a good, concise definition. We asked a variety of other IT leaders and practitioners to share their own clear-cut definitions to help boost your service mesh IQ, in part because it’s likely to come up in more discussions around containers, microservices, hybrid cloud, and other topics.

Let’s start with some quick definitions:

What is service mesh?

Todd Loeppke, lead CTO architect at Sungard Availability Services, shares some of the services that a service mesh can automate: “A service mesh is a key component of container and microservice architectures. At a high level, a service mesh ensures communication between containerized application infrastructures. It provides features such as traffic routing, load balancing, service discovery, encryption, authentication, and authorization.”

Prasad Dronmaraju, solution architect at OpsRamp, notes a key “why” behind service mesh, namely that manual configuration is a no-go: “Service mesh is a pool of pre-configured application services that allow services to talk to each other, sharing data and consistency across an application life cycle. They are exclusively used to manage microservices, using the thin, writeable layer of a container that’s built to be easily set up and destroyed. Because the life of these containers can be as fast as one minute, getting the required application services configured manually is not possible.”

Brian “Redbeard” Harrington, principal product manager at Red Hat, describes service mesh as a mash-up of several better-known technologies: “A service mesh is a set of software components which act as the ‘glue’ for a set of independent applications. The goal of the mesh is to guarantee secure communications between each application and be able to redirect traffic in the event of failures. Often the features of a service mesh look like a mash-up between a load balancer, a web application firewall, and an API gateway.”

Enlin Xu, director of advanced engineering at Turbonomic, notes the growing importance of service meshes for modern apps: “Service mesh provides the network functionality to deliver service communication through API. It becomes essential for the modern microservices-based application. A service mesh is usually composed of a control plane, [which] defines load balancing rules, routing policy, circuit breaking, etc., and a data plane that is responsible for routing, service discovery, authentication, visibility, etc.”
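To make the control plane and data plane split a little more concrete, here is a minimal Python sketch. It is an illustration only, not how Istio or any other mesh is implemented: the class names `ControlPlane` and `DataPlaneProxy`, the service name, and the addresses are all hypothetical. The control plane only stores routing policy, while the proxy (the data plane) consults that policy on every request and does simple round-robin load balancing.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    """Hypothetical control plane: stores policy, never handles requests."""

    routes: dict[str, list[str]] = field(default_factory=dict)

    def set_route(self, service: str, backends: list[str]) -> None:
        # Replace the list of backend addresses registered for a service.
        self.routes[service] = list(backends)


@dataclass
class DataPlaneProxy:
    """Hypothetical data-plane proxy: applies the policy to each request."""

    control_plane: ControlPlane
    counters: dict[str, int] = field(default_factory=dict)

    def pick_backend(self, service: str) -> str:
        # Service discovery: look up the current backends for this service.
        backends = self.control_plane.routes.get(service)
        if not backends:
            raise LookupError(f"no backends registered for {service!r}")
        # Round-robin load balancing across those backends.
        index = self.counters.get(service, 0)
        self.counters[service] = index + 1
        return backends[index % len(backends)]


# Assumed example data: one service with two backend instances.
control_plane = ControlPlane()
control_plane.set_route("orders", ["10.0.0.1:8080", "10.0.0.2:8080"])
proxy = DataPlaneProxy(control_plane)
picks = [proxy.pick_backend("orders") for _ in range(3)]
```

Because the proxy reads the policy on each call, a later `set_route` on the control plane changes where traffic goes without touching the application itself, which is the kind of automation the definitions above describe.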

Manish Chugtu, CTO of Cloud Infrastructure and Microservices at Avi Networks, explains service mesh in the context of the added complexity that comes with modern applications and environments: “A service mesh is software that helps services – especially microservices – communicate. It makes the communication between them resilient, observable, and secure while working with any application language, architecture, and infrastructure. Microservices increase developers’ speed and flexibility, but they also increase the complexity developers need to grapple with. In contrast to a monolithic application, microservices require network connectivity for the independent services to communicate with each other and other third-party services. Before service mesh, this additional complexity made it difficult to ensure each service had access to and availability on the network, and even more difficult to troubleshoot and secure them.”

Daniel Bryant, product architect at Ambassador, points out that service mesh applies to what is typically referred to as “east-west” traffic, which is also an area of increasing attention in container security: “A service mesh is an infrastructure layer that sits on top of the application deployment fabric (e.g., Kubernetes, VMs, bare metal) and is responsible for managing all service-to-service communication – commonly referred to as ‘east-west’ traffic – within a system. A service mesh provides functionality such as traffic routing, shifting, and shaping (for example, service discovery, canarying, and shadowing); resilience, through the implementation of retries, time-outs, and circuit breaking; security, such as mTLS; and observability, via the collection of metrics and distributed traces.”
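The resilience features Bryant mentions, retries and circuit breaking, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any mesh's real implementation: `CircuitBreaker` and `CircuitOpenError` are hypothetical names, and the sketch leaves out time-outs and the timed "half-open" recovery that production circuit breakers usually include.

```python
class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call a failing service."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    @property
    def is_open(self):
        return self.consecutive_failures >= self.failure_threshold

    def call(self, func, retries=2):
        # Fail fast instead of piling more load onto a struggling service.
        if self.is_open:
            raise CircuitOpenError("circuit open; failing fast")
        last_error = None
        # One initial attempt plus up to `retries` further attempts.
        for _ in range(retries + 1):
            try:
                result = func()
            except Exception as error:
                last_error = error
                self.consecutive_failures += 1
                if self.is_open:
                    break  # stop retrying once the threshold is reached
            else:
                self.consecutive_failures = 0  # a success resets the count
                return result
        raise last_error
```

In a real mesh this logic typically lives in the proxy next to each service rather than in application code, so every service gets the same behavior without each team reimplementing it.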

Muddu Sudhakar, CEO of AISERA, notes the importance of speed and reliability to service-to-service communication: “Service mesh is a configurable service infrastructure layer that handles high-volume, low-latency communications between applications and services using APIs. This has to be fast and reliable, with high throughput and secure communications while maintaining redundancy. Other capabilities related to service mesh include service discovery and service mapping.”

How does service mesh work? Two analogies

You may also need to be able to explain what a service mesh is to non-technical people in the organization, which can be tricky with an inherently technical topic. You may be best served by focusing on the why – “a service mesh helps us automate a lot of the management of our applications” or a similar statement – rather than digging into the “what” or “how.”

You can also try to help people’s understanding of the concept by comparing a service mesh to another system they already know. Here are two examples:

How service mesh is like a street traffic app 

Mark Runyon, principal consultant at Improving, offers a comparison point that some of your colleagues probably have on their smartphone:

“I like to think of a service mesh similar to Waze,” Runyon says. “You know where you are starting and where you are going, but not necessarily the most effective way to get there. There are lots of events – wrecks, road work, traffic-light outages – which can render your preferred route undesirable. Waze tracks hundreds of thousands of data points to chart out a custom route for each driver on the road. At a very high level, service mesh works in a similar fashion.”

It may be a slightly overused word in the tech world, but it’s all about scale. If you’re just tinkering in a dev environment, that might not apply. But running containerized microservices in production can require scale in a hurry.

“Like Waze, service mesh keeps track of routing rules and dynamically shifts traffic to deliver the point of lowest latency and highest reliability,” Runyon says. “This is normally done with microservices that support massive traffic and must scale effectively. Think Netflix on a Saturday night, or Amazon on Cyber Monday. This complexity necessitates a dedicated service-to-service infrastructure layer like service mesh.”
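For technical colleagues, the Waze comparison can be turned into a small Python sketch. It assumes one simple policy, sending each request to the healthy backend with the lowest smoothed latency; real meshes offer several load-balancing strategies, and the `LatencyAwareRouter` class, its method names, and the smoothing factor are hypothetical choices made for illustration.

```python
class LatencyAwareRouter:
    """Hypothetical router that prefers the fastest healthy backend."""

    def __init__(self, backends, smoothing=0.5):
        self.smoothing = smoothing
        # Smoothed latency per backend; 0.0 means "not measured yet",
        # so untried backends are tried first.
        self.latency_ms = {backend: 0.0 for backend in backends}
        self.healthy = set(backends)

    def record(self, backend, observed_ms):
        # Exponentially weighted moving average of observed latency.
        previous = self.latency_ms[backend]
        self.latency_ms[backend] = (
            self.smoothing * observed_ms + (1 - self.smoothing) * previous
        )

    def mark_down(self, backend):
        # The "road closed" case: stop routing to this backend.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        self.healthy.add(backend)

    def choose(self):
        candidates = [b for b in self.latency_ms if b in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends available")
        return min(candidates, key=self.latency_ms.__getitem__)
```

As with a traffic app, the route is recomputed as new observations arrive: a backend that slows down or fails stops receiving traffic until conditions improve.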

How service mesh is like the postal service

Here’s how Bryant from Ambassador compares service mesh to a more ubiquitous system: the postal service.

“The closest analogy in the real world is probably that of a national postal service,” Bryant says. “You can attempt to deliver a package any number of ways, but you are then responsible for translating an address into a building location (and figuring out any redirects if the recipient has moved), and coordinating retries if the recipient is not home or a road to the building is flooded. Also, how do you ensure the security of the package and get reports on its current location and status? The postal service provides a standard interface for all of this (e.g., the post office, use of ZIP codes, tracking websites), and they offer certain guarantees around quality of service and privacy.”

Kevin Casey writes about technology and business for a variety of publications. He won an Azbee Award, given by the American Society of Business Publication Editors, for his story, "Are You Too Old For IT?" He's a former community choice honoree in the Small Business Influencer Awards.