Track: Stream Processing at Large

The software industry has learned that the world’s data can be represented as unbounded queues of changes. It can be sliced into sliding windows. It can be aggregated, rolled up, and analyzed. We can choose from a number of ways to do this work, such as Kafka Streams or Spark Streaming. We can opt for Apache Beam, Storm, Samza, Flume, or Flink. We have a large pool of options on which to build powerful systems, but accidental complexity lurks in every one of the choices:

  • What if I need to rebuild all the data?
  • How do I know when my system is not healthy?
  • How do I reason about time in this system?
  • What if things arrive out of order?
  • How do I know things have arrived?

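To make those questions concrete, here is a minimal sketch of such a pipeline in Kafka Streams (one of the options above). It assumes a recent 3.x API and made-up topic names, key formats, and window sizes: it treats a topic of page-view events as an unbounded stream, slices it into overlapping five-minute windows, and rolls up a count per page, with a grace period for events that arrive out of order.

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class PageViewCounts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "page-view-counts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // Treat the input topic as an unbounded stream of (userId, pageId) view events.
        builder.stream("page-views", Consumed.with(Serdes.String(), Serdes.String()))
               // Re-key by page so the roll-up is per page.
               .groupBy((userId, pageId) -> pageId,
                        Grouped.with(Serdes.String(), Serdes.String()))
               // Slice the stream into five-minute windows that slide forward every minute,
               // allowing one minute of grace for late, out-of-order events.
               .windowedBy(TimeWindows.ofSizeAndGrace(Duration.ofMinutes(5), Duration.ofMinutes(1))
                                      .advanceBy(Duration.ofMinutes(1)))
               // Aggregate: a running count per page per window.
               .count(Materialized.as("page-view-counts-store"))
               // Flatten the windowed key to "pageId@windowStart" and write the counts out.
               .toStream((window, count) -> window.key() + "@" + window.window().start())
               .to("page-view-counts-by-window", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Even a toy pipeline like this runs straight into the questions above: the grace period is a guess about how late events can arrive, and rebuilding the counts means replaying the input topic from the beginning.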
This track walks through uses of streaming technologies at large, the problems encountered, and how teams are coping with this new world. As streaming systems approach maturity, the companies using them are growing ecosystems and best practices around building and operating them. They are discovering new ways to reason about monitoring, testing, performance, and failure. This track is an opportunity to learn from their experiences.

Track Host:
Michelle Brush
Engineering Director @Cerner
Michelle Brush is a math geek turned computer geek with 15 years of software development experience. She has developed algorithms and data structures for pathfinding, search, compression, and data mining in embedded as well as distributed systems. In her current role as an Engineering Director at Cerner Corporation, she is responsible for the data ingestion and processing platform for Cerner’s Population Health solutions. She also leads several engineering education programs and culture initiatives, including Cerner’s software architect development program and internal developer conference. Outside of Cerner, she is the chapter leader for the Kansas City chapter of Girl Develop It and one of the conference organizers for Midwest.io.
