Presentation: When Streams Fail: Kafka Off the Shore

Track: Stream Processing at Large

Location: Liberty, 8th fl.

Duration: 10:35am - 11:25am

Day of week: Tuesday

Level: Intermediate

Persona: Data Scientist

Abstract

How good is your streaming framework at failure? Does it die gracefully, telling you exactly at which point it died? Does it tell you why it died? Does it pick up where it left off afterwards? Can it easily skip the "erroneous" portions of the stream? Do you always know what was processed and what wasn't? Does it even have to die when a process, host, or data center fails?

In this talk we focus on "what if" scenarios and on how to evaluate and architect a streaming platform that has a high level of resilience. We'll look at Kafka and Spark Streaming as specific examples and share our experience of using these frameworks to process financial transactions, answering the questions above along the way. We'll also show examples of tools that we built along our streaming journey, which we found invaluable during failure scenarios.

Speaker: Anton Gorshkov

Managing Director @GoldmanSachs

Anton Gorshkov is a Managing Director at Goldman Sachs Asset Management where he runs a global Core Platform team, focusing on GSAM’s data strategy and real-time services. Anton started at Goldman 15 years ago and worked with numerous groups throughout his career, mostly focusing on data-oriented concerns, ranging from data warehouses to in-memory key-value stores to building a custom language and framework used to generate investment signals.

Tracks

Monday, 26 June

Tuesday, 27 June

Wednesday, 28 June