Track: Stream Processing at Large

The software industry has learned that the world’s data can be represented as unbounded queues of changes. It can be sliced into sliding windows. It can be aggregated, rolled up, and analyzed. We can choose among a number of ways to do this work, such as Kafka Streams or Spark Streaming. We can opt for Apache Beam, Storm, Samza, Flume, or Flink. We have a large pool of options on which we can build powerful systems, but there is accidental complexity lurking in any of the choices:

  • What if I need to rebuild all the data?
  • How do I know when my system is not healthy?
  • How do I reason about time in this system?
  • What if things arrive out of order?
  • How do I know things have arrived?
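The questions about time, ordering, and arrival can be made concrete with a small sketch. The Python below is an illustration only, not the API of any of the frameworks named above: the function name `tumbling_window_counts`, its parameters, and the watermark heuristic (largest event time seen so far minus an allowed lateness) are all assumptions chosen for the example. It counts events in fixed-size event-time windows, treats a window as complete once the watermark passes its end, and sets aside anything that arrives after that point.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size, allowed_lateness):
    """Count events per event-time window (hypothetical helper, for illustration).

    events: iterable of (event_time, payload) pairs in ARRIVAL order,
            which may differ from event-time order.
    Returns (closed, late): closed maps window_start -> count, and late
    lists the events that arrived after their window had already closed.
    """
    open_windows = defaultdict(int)  # window_start -> running count
    closed = {}                      # windows considered complete
    late = []                        # events that missed their window
    watermark = float("-inf")        # "no event older than this is expected"

    for event_time, payload in events:
        start = (event_time // window_size) * window_size
        if start + window_size <= watermark:
            # The window for this event was already declared complete.
            late.append((event_time, payload))
        else:
            open_windows[start] += 1

        # Assumed heuristic: the watermark trails the largest event time seen.
        watermark = max(watermark, event_time - allowed_lateness)

        # Close every open window whose end is at or behind the watermark.
        for s in sorted(w for w in open_windows if w + window_size <= watermark):
            closed[s] = open_windows.pop(s)

    # End of input: flush whatever is still open.
    for s in sorted(open_windows):
        closed[s] = open_windows.pop(s)
    return closed, late


# Arrival order differs from event-time order: 8 arrives after 12, 9 after 25.
arrivals = [(1, "a"), (3, "b"), (12, "c"), (8, "d"), (25, "e"), (9, "f")]
closed, late = tumbling_window_counts(arrivals, window_size=10, allowed_lateness=5)
print(closed)  # {0: 3, 10: 1, 20: 1}
print(late)    # [(9, 'f')]
```

In this run the event at time 8 arrives out of order but is still counted, because the watermark has not yet passed the end of its window. The event at time 9 arrives after the watermark has moved to 20, so it is reported as late rather than silently changing a result that was already emitted. How much lateness to tolerate is a trade-off between result latency and completeness.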

This track walks through uses of streaming technologies at large, the problems encountered, and how teams are coping with the state of this new world. As streaming systems approach maturity, the companies using them are growing ecosystems and best practices around building and operating them. They are discovering new ways to reason about monitoring, testing, performance, and failure. This track is an opportunity to learn from their experiences.

Track Host:
Michelle Brush
Engineering Director @Cerner
Michelle Brush is a math geek turned computer geek with 15 years of software development experience. She has developed algorithms and data structures for pathfinding, search, compression, and data mining in embedded as well as distributed systems. In her current role as an Engineering Director for Cerner Corporation, she is responsible for the data ingestion and processing platform for Cerner’s Population Health solutions. She also leads several engineering education programs and culture initiatives including Cerner’s software architect development program and internal developer conference. Outside of Cerner, she is the chapter leader for the Kansas City chapter of Girl Develop It and one of the conference organizers for

by Shriya Arora
Senior Data Engineer @Netflix

Streaming applications have historically been complex to design and implement because of the significant infrastructure investment they require. However, recent developments in various streaming platforms ease the transition to stream processing and enable analytics applications and experiments to consume near real-time data without massive development cycles.

This talk will cover the experiences Netflix’s Personalization Data team had in stream processing unbounded datasets. The...

by Sean Cribbs
Software Engineer @Comcast

In the midst of building a multi-datacenter, multi-tenant instrumentation and visibility system, we arrived at stream processing as an alternative to storing, forwarding, and post-processing metrics as traditional systems do. However, the streaming paradigm is alien to many engineers and sysadmins who are used to working with "wall-of-graphs" dashboards, predefined aggregates, and point-and-click alert configuration.

Taking inspiration from REPLs, literate programming, and DevOps...


Monday, 26 June

Tuesday, 27 June

Wednesday, 28 June