Conference: Jun 26-28, 2017
Workshops: Jun 29-30, 2017
Presentation: Papers We Love, QCon Edition
Location:
- Salon D
Duration
Day of week:
- Tuesday
Level:
- Advanced
Persona:
- General Software
Abstract
The Papers We Love Meetup (New York City) put together a series of 4 PWL "mini" (15-20 minute) presentations by 4 wonderful speakers (all of whom are also speaking at QCon New York)! So, we're extremely happy to welcome Evelina Gabasova (@evelgab), Eric Brewer (@eric_brewer), Ines Sombra (@randommood), and Caitie McCaffrey (@caitie) to PWLNYC!
Visit the Papers We Love, QCon Edition meetup to learn more about the NYC chapter of the Papers We Love Meetup and the Tuesday evening event.
PWL "Mini" Speakers
Evelina Gabasova presenting Mastering the Game of Go with Deep Neural Networks and Tree Search:
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence due to its enormous search space and the difficulty of evaluating board positions and moves. We introduce a new approach to computer Go that uses value networks to evaluate board positions and policy networks to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte-Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte-Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
Eric Brewer presenting Experience with Processes and Monitors in Mesa:
The use of monitors for describing concurrency has been much discussed in the literature. When monitors are used in real systems of any size, however, a number of problems arise which have not been adequately dealt with: the semantics of nested monitor calls; the various ways of defining the meaning of WAIT; priority scheduling; handling of timeouts, aborts and other exceptional conditions; interactions with process creation and destruction; monitoring large numbers of small objects. These problems are addressed by the facilities described here for concurrent programming in Mesa. Experience with several substantial applications gives us some confidence in the validity of our solutions.
Ines Sombra presenting IronFleet: Proving Practical Distributed Systems Correct:
Distributed systems are notorious for harboring subtle bugs. Verification can, in principle, eliminate these bugs a priori, but verification has historically been difficult to apply at full program scale, much less distributed-system scale. We describe a methodology for building practical and provably correct distributed systems based on a unique blend of TLA-style state-machine refinement and Hoare-logic verification. We demonstrate the methodology on a complex implementation of a Paxos-based replicated state machine library and a lease-based sharded key-value store. We prove that each obeys a concise safety specification, as well as desirable liveness requirements. Each implementation achieves performance competitive with a reference system. With our methodology and lessons learned, we aim to raise the standard for distributed systems from “tested” to “correct.”
Caitie McCaffrey presenting Simple Testing Can Prevent Most Critical Failures...:
Large, production quality distributed systems still fail periodically, and do so sometimes catastrophically, where most or all users experience an outage or data loss. We present the result of a comprehensive study investigating 198 randomly selected, user-reported failures that occurred on Cassandra, HBase, Hadoop Distributed File System (HDFS), Hadoop MapReduce, and Redis, with the goal of understanding how one or multiple faults eventually evolve into a user-visible failure. We found that from a testing point of view, almost all failures require only 3 or fewer nodes to reproduce, which is good news considering that these services typically run on a very large number of nodes. However, multiple inputs are needed to trigger the failures with the order between them being important. Finally, we found the error logs of these systems typically contain sufficient data on both the errors and the input events that triggered the failure, enabling the diagnosis and the reproduction of the production failures.
We found the majority of catastrophic failures could easily have been prevented by performing simple testing on error handling code – the last line of defense – even without an understanding of the software design. We extracted three simple rules from the bugs that have led to some of the catastrophic failures, and developed a static checker, Aspirator, capable of locating these bugs. Over 30% of the catastrophic failures would have been prevented had Aspirator been used and the identified bugs fixed. Running Aspirator on the code of 9 distributed systems located 143 bugs and bad practices that have been fixed or confirmed by the developers.
Tracks
Monday, 13 June
-
Architectures You've Always Wondered About
Case studies from: Google, LinkedIn, Alibaba, Twitter, and more...
-
Stream Processing @ Scale
Technologies and techniques to handle ever increasing data streams
-
Culture As Differentiator
Stories of companies and teams for whom engineering culture is a differentiator - in delivering faster, in attracting better talent, and in making their businesses more successful.
-
Practical DevOps for Cloud Architectures
Real-world lessons and practices that enable the DevOps nirvana of operating what you build
-
Incredible Power of an Open-Sourced .NET
.NET is more than you may think. From Rx to C# 7 designed in the open, learn more about the power of open source .NET
-
Sponsored Solutions Track 1
Tuesday, 14 June
-
Better than Resilient: Antifragile
Failure is a constant in production systems; learn how to wield it to your advantage to build more robust systems.
-
Innovations in Java and the Java Ecosystem
Cutting Edge Java Innovations for the Real World
-
Modern CS in the Real World
Real-world Industry adoption of modern CS ideas
-
Containers: From Dev to Prod
Beyond the buzz and into the how and why of running containers in production
-
Security War Stories
Expert-level security track led by well known and respected leaders in the field
-
Sponsored Solutions Track 2
Wednesday, 15 June
-
Microservices and Monoliths
Practical lessons on services. Asks when, and when NOT, to go with microservices.
-
Modern API Architecture - Tools, Methods, Tactics
API-based application development, and the tooling and techniques to support effectively working with APIs in the small or at scale. Using internal and external APIs
-
Commoditized Machine Learning
Barriers to entry for applied ML are lower than ever before; jumpstart your journey
-
Full Stack JavaScript
Browser, server, devices - JavaScript is everywhere
-
Optimizing Yourself
Keeping life in balance is always a challenge. Learning lifehacks
-
Sponsored Solutions Track 3