Presentation: Using Chaos To Build Resilient Systems
What You’ll Learn
- Learn how chaos engineering can help an organization build more resilient systems.
- Understand strategies for getting a chaos engineering program started and what the appropriate first steps are.
- Hear first-hand from a senior principal SRE how chaos engineering experiments have affected her systems.
Abstract
There are those of us who are motivated to build resilient systems, improve uptime, move fast and keep systems reliable. Then there are those of us who feel overwhelmed by our to-do lists and the features or projects we feel we need to get out the door.
The world needs more resilient systems because the world needs engineers in this for the long haul. We can create a better future for ourselves, those who come after us, our customers and our wider teams by focusing on building resilient systems. How do we make it easier for everyone to build resilient systems?
It is not easy to build resilient systems, but that doesn’t mean we shouldn’t try. Engineers love a technical challenge. In this talk I will explain how focusing on the detection, mitigation, resolution and prevention of incidents is a great place to start. I will share my experiences using chaos engineering to build resilient systems... even when you can’t build your systems from scratch.
What do you want someone to leave your talk with?
Everyone who comes along to this talk will leave with an understanding of how they can start seeing massive benefits from practicing Chaos Engineering within 3 months. To me, Chaos Engineering is the fastest, most efficient way to take a giant leap forward in the resilience of your systems and team.
Can you give me an example of a time Chaos Engineering really saved you?
Through practicing Chaos Engineering I have personally achieved a 10x reduction in incidents and the complete elimination of high severity (SEV 0) incidents for 12+ months. This giant leap was achieved within a 3-month window. That means less downtime and less pager pain for everyone.
What is the level of experience someone attending this talk should have?
To get the most value from this talk you have ideally been on-call and felt the pain of keeping the lights on.
Tracks
- Microservices: Patterns & Practices
  Evolving, observing, persisting, and building modern microservices
- Developer Experience: Level up Your Engineering Effectiveness
  Improving the end-to-end developer experience: design, dev, test, deploy, operate/understand. Tools, techniques, and trends.
- Modern Java Reloaded
  Modern, modular, fast, and effective Java. Pushing the boundaries of JDK 9 and beyond.
- Modern User Interfaces: Screens and Beyond
  Zero UI, voice, mobile: interfaces pushing the boundary of what we consider to be the interface
- Practical Machine Learning
  Applied machine learning lessons for SWEs, including tech around TensorFlow, TPUs, Keras, Caffe, & more
- Ethics in Computing
  Inclusive technology, ethics and politics of technology. Considering bias. Societal relationship with tech. Also the privacy problems we have today (e.g., GDPR, right to be forgotten)
- Architectures You've Always Wondered About
  Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, Goldman Sachs
- Modern CS in the Real World
  Thoughts pushing software forward, including consensus, CRDTs, formal methods, & probabilistic programming
- Container and Orchestration Platforms in Action
  Runtime containers, libraries, and services that power microservices
- Finding the Serverless Sweetspot
  Stories about the pains and gains from migrating to Serverless.
- Chaos, Complexity, and Resilience
  Lessons building resilient systems and the war stories that drove their adoption
- Real World Security
  Practical lessons building, maintaining, and deploying secure systems
- Blockchain Enabled
  Exploring smart contracts, oracles, sidechains, and what can/cannot be done with blockchain today.
- 21st Century Languages
  Lessons learned from languages like Rust, Go, Swift, Kotlin, and more.
- Empowered Teams
  Safely running inclusive teams that are autonomous and self-correcting