Presentation: Ingest & Stream Processing - What will you choose?



11:50am - 12:40pm





A lot has changed and a lot has stayed the same with Ingest and Stream Processing over the years. Today there are more options than ever for Ingest and Stream Processing, so many that one may wonder why to pick one solution over another. The problem is that in this space one size does not fit all, which makes the choice all the more confusing. This talk aims to give the audience direction in choosing among Ingest and Stream Processing solutions.

We aim to help the audience understand the solutions available to them and make the best choice for their use case. In this talk, we will go over current and emerging technologies in the marketplace. We will evaluate each one and examine how it helps solve problems related to large-scale data processing, including joining and combining streams. We will also look at the various ways of achieving "at least once" and "exactly once" processing. Finally, we will discuss how each of these technologies can be scaled and how to make sure that data is processed in a timely fashion.

Speaker: Pat Patterson

Community Champion @StreamSets & Lecturer California State University, Monterey Bay

Pat Patterson has been working with Internet technologies since 1997, building software and working with communities at Sun Microsystems, Huawei, Salesforce and StreamSets. At Sun, Pat was the community lead for the OpenSSO open source project, while at Huawei he developed cloud storage infrastructure software. As part of the developer evangelism team at Salesforce, Pat focused on identity, integration and the Internet of Things. Now community champion at StreamSets, Pat is responsible for the care and feeding of the StreamSets open source community.


Speaker: Ted Malaska

Committer to Flume, Avro, Pig, YARN & Architect @Cloudera

Ted Malaska is a solutions architect at Cloudera and has worked on close to 100 clusters for two to three dozen clients, covering hundreds of use cases. Ted has 18 years of professional experience working for startups, the US government, a number of the world’s largest banks, commercial firms, bio firms, retail firms, hardware appliance firms, and the largest nonprofit financial regulator in the US. He has architecture experience across topics such as Hadoop, Web 2.0, mobile, SOA (ESB, BPM), and big data. Ted is a regular contributor to the Hadoop, HBase, and Spark projects, a regular committer to Flume, Avro, Pig, and YARN, and the coauthor of O’Reilly Media’s Hadoop Application Architectures.


