Global Capacity Management through Strategic Demand Allocation

Meta currently operates in more than 15 data center regions around the world. This rapidly expanding global data center footprint poses new challenges for service owners as well as for our infrastructure management systems. In this talk, we will present the challenges of managing a global-scale infrastructure and our approach to global service and capacity management. In particular, we’ll focus on the following:

  • Abstractions and guarantees we present to service owners with global capacity

  • Our current design and implementation for managing workloads across tens of regions

  • How we categorize and model different kinds of demand

  • How we achieve global capacity management by shifting demand across regions

We’ll also present our future plans as we build towards our longer-term vision of transparent, automated global capacity management.


Speaker

Ranjith Kumar S

Software Engineer @Meta

Hello World! I build systems that enable fungible capacity management. I'm passionate about distributed systems @scale and have previously worked on autoscaling, capacity management, and cluster management. I'm currently focused on enabling Meta's global capacity management across geo-distributed data centers.


Date

Tuesday Jun 13 / 01:40PM EDT (50 minutes)

Location

Salon A-C

Topics

Architecture Infrastructure Resource Management

From the same track

Session Architecture

Using Traffic Modeling to Load-Balance Netflix Traffic at Global Scale

Tuesday Jun 13 / 10:35AM EDT

Netflix infrastructure supports personalized UI and streaming experiences for 230M+ members around the world.

Niosha Behnam

Staff Software Engineer @Netflix

Sergey Fedorov

Director of Engineering @Netflix

Session Architecture

From Open Source to SaaS: The Journey of ClickHouse

Tuesday Jun 13 / 05:25PM EDT

Have you ever wondered what it takes to go from an open-source project to a fully fledged SaaS product? How about doing that in only one year? If the answer is yes, then this talk is for you. You’ll hear straight from the experts who worked on the design and execution of this huge project.

Sichen Zhao

Senior Software Engineer @ClickHouse

Shane Andrade

Principal Software Engineer @ClickHouse

Session

Several Components are Rendering: Client Performance at Slack-Scale

Tuesday Jun 13 / 02:55PM EDT

Our users expect the interactions in our applications and websites to be fast, no matter how complicated they are under the hood. In this talk, we’ll explore some frontend performance issues encountered by Slack as they continue to grow and evolve the desktop app.

Jenna Zeigen

Staff Engineer @Slack

Session Platform

Building Sub-Second Latency Video Infrastructure at Cloudflare

Tuesday Jun 13 / 04:10PM EDT

Cloudflare has deployed a sub-second latency live streaming system at scale over the last few years. In this talk, we’ll provide insight into how this works under the covers, focusing specifically on the protocols that Cloudflare Stream uses: HLS, DASH, RTMPS, SRT, and WebRTC.

Renan Dincer

Systems Engineer @Cloudflare

Session Architecture

Unconference: Architectures You've Always Wondered About

Tuesday Jun 13 / 11:50AM EDT

What is an unconference? An unconference is a participant-driven meeting. Attendees come together, bringing their challenges and relying on the experience and know-how of their peers for solutions.

Ben Linders

Independent Consultant in Agile, Lean, Quality and Continuous Improvement