Meta currently operates in more than 15 data center regions around the world. This rapidly expanding global data center footprint poses new challenges for service owners as well as for our infrastructure management systems. In this talk, we will present the challenges of managing a global-scale infrastructure and our approach to global service and capacity management. In particular, we’ll focus on the following:
The abstractions and guarantees we offer to service owners who use global capacity
Our current design and implementation for managing workloads across tens of regions
How we categorize and model different kinds of demand
How we achieve global capacity management by shifting demand across regions
We’ll also present our future plans as we build towards our longer-term vision of transparent, automated global capacity management.
Ranjith Kumar S
Software Engineer @Meta
Hello World! I work on building systems that enable fungible capacity management. I'm passionate about distributed systems at scale, and in the past I have worked on autoscaling, capacity management, cluster management, and more. Currently, I'm focusing on enabling Meta's global capacity management across geo-distributed data centers.