Location:

Salon D

Day of week:

Tuesday

“Antifragility is beyond resilience or robustness. The resilient resists shocks and stays the same; the antifragile gets better.”

Failure and change are constants in Internet-scale companies. Uptime is a battle, with tales of glory and heartache. How can we go beyond merely withstanding failure and improve with each step? Learn from industry leaders how they proactively prepare for the inevitable. Find out how these techniques help them weather production storms and have confidence in the behavior of their complex systems.

Track Host:

Kolton Andrus

Founder of Gremlin and previously a Chaos Engineer at Netflix

Kolton is the founder of Gremlin Inc, which helps companies build more robust services. He was a Chaos Engineer at Netflix, focused on the resilience of the Edge services. He designed and built FIT, Netflix’s failure injection service. Before that, he improved the performance and reliability of the Amazon Retail website. At both companies he has served as a ‘Call Leader’, managing the resolution of company-wide incidents. Kolton is passionate about building resilient systems, primarily as it lets him break things for fun and profit.

10:35am - 11:25am

by Theo Schlossnagle
Founder and CEO @Circonus, Editorial board of ACM's ‘Queue’

Adaptive Availability for Quality of Service

In this presentation, I'll talk about lessons learned in building an always-on distributed time-series database with aggressive quality of service guarantees. As any distributed systems engineer knows, coping with a failed machine is an easy problem compared to coping with an underperforming one. When SLAs are tight, underperforming is effectively byzantine behavior. I will talk about both macro and micro techniques used in our system to cope with bad machines, bad actors...

11:50am - 12:40pm

by Luke Kosewski
Founding Member of Netflix Chaos and Traffic Team

Chaos Kong - Endowing Netflix with Antifragility

The Netflix control plane handles a third of peak Internet traffic. That's an awful lot of customers we need to keep safe from any service outages. Netflix developed "Flow" to wage war against these outages. Flow coordinates recovery from localized disruptions and enables periodic verification through production experimentation called “Chaos Kong.”

Flow endows all services within Netflix with the capability to withstand regional...

1:40pm - 2:30pm

by Michalis Zervos
Service Resilience Software Engineer @Microsoft

Improving Resilience by Creating Storms in the Cloud

For any company to run on the cloud, it needs assurances that its workloads, services, and data will always be available and secure. To provide such guarantees, application developers and cloud providers need to perform extensive verification across a number of distributed services. Traditional testing tools were not designed to verify the resiliency of such systems.

At Microsoft, we actively develop and use fault...

2:55pm - 3:45pm

by Richard Kasperowski
Author of The Core Protocols: A Guide to Greatness

Open Space

Antifragile Open Space

4:10pm - 5:00pm

by Abel Mathew
Co-founder & CEO of Backtrace I/O

Scalable Post-Mortem Analysis

Resilience for many of us comes from our ability to restart applications in the face of failure. We as debuggers and operators are often forced to go back and analyze clues left behind to tease out the root cause from assets like logs, heap dumps, or even core dumps. As our systems grow and become more distributed, these one-off investigations become less tenable, and a scalable way to analyze incidents after the fact is needed. In this talk, we'll explore examples...

5:25pm - 6:15pm

by Thomissa Comellas
Technical Project Manager @Dropbox

by Tammy Butow
SRE Manager @Dropbox

0 to 100 days - Running DRTs at Dropbox

Thomissa joined the Dropbox Infrastructure team 100 days ago. This presentation will share her experiences developing and rolling out new Disaster Recovery Testing techniques at Dropbox. Tammy will join Thomissa to share how her team runs DRTs and has implemented the techniques Thomissa has evangelized.

Dropbox was founded by engineers, and the ethos of technical innovation is fundamental to our culture. We’ve grown enormously...

Tracks

Monday, 13 June

Architectures You've Always Wondered About

Case studies from: Google, LinkedIn, Alibaba, Twitter, and more...
Stream Processing @ Scale

Technologies and techniques to handle ever increasing data streams
Culture As Differentiator

Stories of companies and teams for whom engineering culture is a differentiator - in delivering faster, in attracting better talent, and in making their businesses more successful.
Practical DevOps for Cloud Architectures

Real-world lessons and practices that enable the DevOps nirvana of operating what you build
Incredible Power of an Open-Sourced .NET

.NET is more than you may think. From Rx to C# 7 designed in the open, learn more about the power of open source .NET
Sponsored Solutions Track 1

Tuesday, 14 June

Better than Resilient: Antifragile

Failure is a constant in production systems. Learn how to wield it to your advantage to build more robust systems.
Innovations in Java and the Java Ecosystem

Cutting Edge Java Innovations for the Real World
Modern CS in the Real World

Real-world Industry adoption of modern CS ideas
Containers: From Dev to Prod

Beyond the buzz and into the how and why of running containers in production
Security War Stories

Expert-level security track led by well known and respected leaders in the field
Sponsored Solutions Track 2

Wednesday, 15 June

Microservices and Monoliths

Practical lessons on services. Asks when, and when NOT, to go with microservices.
Modern API Architecture - Tools, Methods, Tactics

API-based application development, and the tooling and techniques to support effectively working with APIs in the small or at scale. Using internal and external APIs
Commoditized Machine Learning

Barriers to entry for applied ML are lower than ever before. Jumpstart your journey.
Full Stack JavaScript

Browser, server, devices - JavaScript is everywhere
Optimizing Yourself

Keeping life in balance is always a challenge. Learning lifehacks
Sponsored Solutions Track 3

Conference for Professional Software Developers

Track: Better than Resilient: Antifragile