Conference: Jun 13-15, 2016
Tutorials: Jun 16-17, 2016
Track: Architecting for Failure
Location:
- Salon D
Day of week:
- Thursday
Complex systems fail in spectacular ways. Failure isn’t a question of if, but when. Resilient systems recover from failure; robust systems resist failure. In this track we’ll hear from experts who have designed systems that shifted from fragility to resilience and robustness in the face of failure. Attendees will learn which architectural patterns and approaches worked and which didn’t, with takeaways they can apply to their own systems.
by Jon Moore
Senior Fellow, Comcast Cable
Comcast’s TV products serve tens of millions of customers and are powered by a suite of dozens of services that are continuously developed and operated by hundreds of technical staff. While we have enjoyed many of the touted benefits of a microservice architecture--looser coupling between teams, independent deployments--we have also encountered the corresponding reliability challenges. Delivering business value in this environment can seem like hacking your way through the wilderness at...
by Nori Heikkinen
Google Site Reliability Engineering Expert
Failure is a fact of life, so we design our system to be fault-tolerant at all levels. In practice, however, some components almost never fail. As the product grows, these components are increasingly stressed in new and different ways; when they ultimately do fail they create outages for which we are unprepared. We thought we were designing for failure, but the design didn't include failures at this level. At Google, some of our most exciting production snafus involve large and unpredictable...
by Kolton Andrus
Chaos Engineer at Netflix
Netflix’s 57M members watch over 2 billion hours of content per month, and their streaming accounts for one third of Internet traffic in some parts of the world. The Edge platform, which thousands of devices rely on to access the streaming experience, guards the front door to Netflix, where any major issue results in a Twitter storm.
In order to harden our systems, we designed “Failure as a Service” to allow anyone to test and validate how our systems handle failure. Purposefully injecting...
by Tom Limoncelli
Author, SRE @ Stack Exchange
Distributed or "cloud" computing involves many moving parts, any of which can break or fail. Succeeding in this environment requires embracing failure, not running or hiding from it. Doing so means challenging our instincts with radical ideas. Tom will highlight some of the most radical advice from the new book “The Practice of Cloud System Administration”.
Topics will include: create resiliency at the most economic level, do risky procedures often, and create a blameless culture...
by Joe Stein
Founder, Principal Consultant at Big Data Open Source Security LLC
Building and deploying elastic, distributed, data-centric systems that can fail without losing data and without sacrificing elasticity has traditionally been challenging. With Apache Mesos, an open source project that acts as the kernel for your data center, we can now create fully elastic end-to-end compute environments. With Mesos, distributed, persistent data services can run durably and elastically. Kafka, HDFS, Cassandra, MySQL, and other data-centric systems run on Mesos.
We will talk...
Tracks
Wednesday Jun 10
-
Applied Data Science and Machine Learning
Putting your data to use. The latest production methods for deriving novel insights
-
Engineer Your Culture
Building and scaling a compelling engineering culture
-
Modern Advances in Java Technology
Tips, techniques and technologies at the cutting edge of modern Java
-
Monoliths to Microservices
How to evolve beyond a monolithic system -- successful migration and implementation stories
-
The Art of Software Design
Software architecture as a craft: scenario-based examples and general guidance
-
Sponsored Solutions Track I
Thursday Jun 11
-
Emerging Technologies in Front-end Development
The state of the art in client-side web development
-
Fraud Detection and Hack Prevention
Businesses are built around trust in systems and data. This track covers securing those systems and fighting fraud in the data they hold.
-
Reactive Architecture Tactics
The how of the Reactive movement: Release It! techniques, Rx, Failure Concepts, Throughput, Availability
-
Architecting for Failure
War stories and lessons learned from building highly robust and resilient systems
-
High Performance Streaming Data
Scalable architectures and high-performance frameworks for immediate data over persistent connections
-
Sponsored Solutions Track II
Friday Jun 12
-
Architectures You've Always Wondered about
Learn from the architectures powering some of the most popular applications and sites
-
Continuously Deploying Containers in Production
Production-ready patterns for growing containerization in your environment
-
Mobile and IoT at Scale
Users, Usage and Microservices
-
Modern Computer Science in the Real World
How modern CS tackles problems in the real world
-
Optimizing Yourself
Maximizing your impact as an engineer, as a leader, and as a person
-
Sponsored Solutions Track III