Lyft's Envoy: Embracing a Service Mesh | Software Development Conference QCon New York

What You’ll Learn

Hear from the creator of Envoy about the sorts of problems Lyft was facing that ultimately led to its creation.
Understand how Lyft used Envoy to focus on producing more business-logic-oriented code and less infrastructure-oriented code.
Learn more about Envoy and why so many companies are making it part of their infrastructure when deploying Microservices.

Abstract

Over the past several years, facing considerable operational difficulties with its initial microservice deployment primarily rooted in networking and observability, Lyft migrated to a sophisticated service mesh powered by Envoy (https://www.envoyproxy.io/), a high-performance distributed proxy that aims to make the network transparent to applications. Envoy’s out-of-process architecture allows it to be used alongside any language or runtime.

At its core, Envoy is an L4 proxy with a pluggable filter chain model. It includes a full HTTP stack with a parallel pluggable L7 filter chain. This programming model allows Envoy to be used for a variety of different scenarios, including HTTP/2 gRPC, MongoDB, Redis, rate limiting, etc. Envoy provides advanced load balancing support, including eventually consistent service discovery, circuit breakers, retries, and zone-aware load balancing. Envoy also has best-in-class observability, using statistics, logging, and distributed tracing.
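
To make the "pluggable filter chain" idea concrete, here is a minimal sketch in Python. It is not Envoy's API or implementation (Envoy is a separate out-of-process proxy with its own filter interfaces). Every name below (Request, Response, FilterChain, make_rate_limiter, add_trace_header, echo_upstream, and the header value) is a hypothetical chosen for illustration. The sketch only shows the general pattern: each filter sees the request in order, and any filter may end processing early by returning a response.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str = ""


# A filter inspects (and may modify) the request.
# Returning a Response stops the chain; returning None continues it.
Filter = Callable[[Request], Optional[Response]]


class FilterChain:
    """Hypothetical chain: runs filters in order, then calls the upstream."""

    def __init__(self, filters: List[Filter],
                 upstream: Callable[[Request], Response]) -> None:
        self.filters = filters
        self.upstream = upstream

    def handle(self, request: Request) -> Response:
        for f in self.filters:
            early = f(request)
            if early is not None:
                return early  # a filter short-circuited the request
        return self.upstream(request)


def make_rate_limiter(max_requests: int) -> Filter:
    """Illustrative filter: rejects everything after max_requests calls."""
    count = 0

    def rate_limit(request: Request) -> Optional[Response]:
        nonlocal count
        count += 1
        if count > max_requests:
            return Response(429, "rate limited")
        return None

    return rate_limit


def add_trace_header(request: Request) -> Optional[Response]:
    """Illustrative filter: tags the request with a made-up ID for tracing."""
    request.headers.setdefault("x-request-id", "hypothetical-id-1")
    return None


def echo_upstream(request: Request) -> Response:
    """Stand-in for the real service behind the proxy."""
    return Response(200, f"ok {request.path}")


chain = FilterChain([add_trace_header, make_rate_limiter(2)], echo_upstream)
```

Because the filters are plain entries in a list, behavior such as rate limiting or tracing can be added, removed, or reordered without touching the upstream service, which is the property the abstract attributes to the filter chain model.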

Matt Klein explains why Lyft developed Envoy, focusing primarily on the operational agility that the burgeoning service mesh paradigm provides, with a particular focus on microservice networking observability.

Interview

Note: This is a portion of a full podcast with Matt Klein. You can read the full show notes and listen to the podcast on InfoQ.com. You can also subscribe to all future podcasts by following us on our RSS feeds on InfoQ, SoundCloud, or on iTunes.

Question:

QCon: You created Envoy. How did you come up with the idea for Envoy?

Answer:

Matt: I've been working on Internet-scale networking for the last 10 years at places like Amazon, Twitter, and Lyft.

The migration of technology stacks from a single language stack to a more polyglot stack over the last five to seven years has made it clear that people are embracing more Microservices architectures. Embracing a stack that has many different languages brings with it a lot of different problems. For example, you have hugely heterogeneous environments across different types of architectures and even across different on-prem and cloud providers. We realized that networking and unobservable behavior are quickly becoming the largest impediments to scale. These are things like advanced load balancing, timeouts, retries, circuit breakers, tracing, and logging.

Looking around the ecosystem you see a lot of great tooling around the JVM (things like Finagle or Hystrix from Netflix). But when you start looking in the polyglot environment, there really did not exist any cohesive set of technologies that allow people to deploy distributed system best practices (particularly, across networking and observability).

So when I came into Lyft, the company had a monolith environment with mostly PHP. They had some services in Python and were looking to add more services in Go. We were facing a lot of the same problems any company would face around this type of architecture. When it came to choosing how to solve these problems, it became clear that, rather than writing yet another library, if we could solve them with an out-of-process proxy that was extensible and high performance, with best-in-class load balancing and observability, it would be something compelling and help improve Lyft's architecture. In addition, if you could use the same proxy for internal services and for traffic at the edge, that's a pretty great thing from an operational perspective. So we felt there was a really great opportunity to help Lyft scale. That solution became Envoy.

Question:

QCon: What's the focus of the QConNYC talk?

Answer:

Matt: What we're going to do is dig deep into what Lyft's problems were prior to Envoy existing. So we'll try to set the stage for why Lyft was rolling out a microservice architecture. I'll discuss what we were hoping to gain from it and what operational problems we were actually having, and then we're going to dig into the main design points of Envoy and how it helped fix those problems. We'll probably spend a considerable amount of time actually talking about the operational aspects of those problems. I'll show a lot of the internal dashboarding that we use. I'll talk a little bit about the alarming, the tracing, the logging, and try to give people a good understanding of how (from an operational perspective) Envoy and the service mesh actually help people scale their Microservices architectures.

Question:

QCon: Why do you think this is an important story today?

Answer:

Matt: I think deploying microservice architectures is obviously all the rage right now. I think that there are very good reasons for organizations to do that, but I think, at the same time, the current state of the industry is such that organizations undertake microservice migrations without fully understanding all the operational complexity.

I think many organizations get stuck, and I think Lyft was in that position. We wanted to unlock the people agility around microservices but faced major operational concerns, particularly around networking and observability. That's where Envoy comes in and helps bridge that gap. How do you allow people to come in and build microservice architectures and scale them in such a way that they don't spend all their time debugging?

Question:

QCon: Who are you talking to in this talk?

Answer:

Matt: First off, I think I'm talking to two different types of people. The first are people who are building infrastructure. So people who are building the foundational systems that the application developers are going to run their business logic on. The second set of people are application developers. I think a lot of application developers spend a lot of time dealing with infrastructure problems and not focusing on business logic. For that audience, my goal is to try to help them understand that there is a better way. If the infrastructure is mature enough (and provides enough abstractions), they can spend more time focusing on business logic than on dealing with debugging random problems.

Speaker: Matt Klein

Creator of Envoy & Software Engineer @Lyft

Matt Klein is a software engineer at Lyft and the creator of Envoy (www.envoyproxy.io). Matt has been working on operating systems, virtualization, distributed systems, networking, and making systems easy to operate for more than 15 years across a variety of companies. Some highlights include leading the development of Twitter’s L7 edge proxy and working on high-performance computing and networking in Amazon’s EC2.

Find Matt Klein at

Speaker page

@mattklein123

Similar Talks

Programming for Hostile Environments

SVP, Engineering @packethost

Nathan Goulding

Platforms at Twilio: Unlocking Developer Effectiveness

Senior Director Platform Engineering @twilio

Justin Kitagawa

Help! I Accidentally Distributed My System!

Software Engineer & Engineering Manager @Honeycombio

Emily Nakashima

Help! I Accidentally Distributed My System!

Developer Programs Engineer @Google

Rachel Myers

Heretical Resilience: To Repair is Human

Staff Infrastructure Engineer @travisci

Ryn Daniels

AutoCAD & WebAssembly: Moving a 30 Year Code Base to the Web

Software Architect @autodesk

Kevin Cheung

ML Data Pipelines for Real-Time Fraud Prevention @PayPal

Lead Data Architect, Risk and Compliance Management Platform @PayPal

Mikhail Kourjanski

How Machines Help Humans Root Cause Issues @Netflix

Senior Software Engineer, Operational Insights @Netflix

Seth Katz

Rethinking HCI With Neural Interfaces @CTRLlabsco

Director of R&D @CTRLlabsCo

Adam Berenzweig

Tracks

Microservices: Patterns & Practices

Evolving, observing, persisting, and building modern microservices

Developer Experience: Level up Your Engineering Effectiveness

Improving the end to end developer experience - design, dev, test, deploy, operate/understand. Tools, techniques, and trends.

Modern Java Reloaded

Modern, Modular, fast, and effective Java. Pushing the boundaries of JDK 9 and beyond.

Modern User Interfaces: Screens and Beyond

Zero UI, voice, mobile: Interfaces pushing the boundary of what we consider to be the interface

Practical Machine Learning

Applied machine learning lessons for SWEs, including tech around TensorFlow, TPUs, Keras, Caffe, & more

Ethics in Computing

Inclusive technology, Ethics and politics of technology. Considering bias. Societal relationship with tech. Also the privacy problems we have today (e.g., GDPR, right to be forgotten)

Architectures You've Always Wondered About

Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, Goldman Sachs

Modern CS in the Real World

Thoughts pushing software forward, including consensus, CRDTs, formal methods, & probabilistic programming

Container and Orchestration Platforms in Action

Runtime containers, libraries, and services that power microservices

Finding the Serverless Sweetspot

Stories about the pains and gains from migrating to Serverless.

Chaos, Complexity, and Resilience

Lessons building resilient systems and the war stories that drove their adoption

Real World Security

Practical lessons building, maintaining, and deploying secure systems

Blockchain Enabled

Exploring Smart contracts, oracles, sidechains, and what can/cannot be done with blockchain today.

21st Century Languages

Lessons learned from languages like Rust, Go-lang, Swift, Kotlin, and more.

Empowered Teams

Safely running inclusive teams that are autonomous and self-correcting

Schedule

Track: Architectures You've Always Wondered About

Location: Broadway Ballroom North Center, 6th fl.

Duration: 10:35am - 11:25am

Day of week: Wednesday

Level: Intermediate

Persona: Architect, Developer, DevOps Engineer
