New Yorkers
Presentations about New Yorkers

Debugging Microservices: How Google SREs Resolve Outages

Defense in Depth: In Depth

Engineering Secure Products at Facebook

Canopy: Scalable Distributed Tracing & Analysis @Facebook

Java 11 - Keeping the Java Release Train on the Right Track

Design Microservice Architectures the Right Way

Digital Publishing for Scale: The Economist and Go

Organizing for Your Ethical Principles

Rethinking HCI With Neural Interfaces @CTRLlabsco

Software Is Eating the World, ML Is Going to Eat Software

"Yo... Ask Me Anything" - Panel of NY Senior Java Developers

Why Bother With Kotlin - Not Just Another Language Tour
Debugging Microservices: How Google SREs Resolve Outages
What is the work that you do today as a Google SRE?
Adam: I work for a Google DevOps team that takes care of Monarch. Monarch is a very large time series database used for querying and metrics collection. Monarch is roughly the internal equivalent of combining Prometheus, Grafana, and Graphite from the open source world. Monarch also adds to that stack all of Stackdriver and provides the backend for a lot of our cloud signals product. My role is an SRE-SWE, which means I'm involved in the software engineering side as well. So a lot of my time is spent taking apart Monarch and putting it back together more durably and more reliably. Durability is especially important because Monarch is a globally distributed system (it runs in every single availability zone).
Can you give me an idea of the scope and size we’re talking about with Monarch?
Adam: I can’t be specific, but it’s very large in terms of both QPS and resources. The quantity of data per stream is extremely variable in size, from periodically receiving one byte to receiving a constant stream of high-cardinality data. The same applies to the query side, where some queries need only fetch a single stream, and some need to fetch and aggregate a lot of them. Some consumers are doing ad hoc queries, and other teams are doing a tremendous number of queries per second to inform their actual customer-facing products. Without Monarch, we have no monitoring or alerting, so it’s a critical system.