Presentation: "MapReduce and Its Discontents"

Time: Tuesday 10:50 - 11:50

Location: Salon A-B

Abstract:
Apache Hadoop is the current darling of the "Big Data" world. At its core is the MapReduce computing model for decomposing large data-analysis jobs into smaller tasks and distributing those tasks around a cluster. MapReduce itself was pioneered at Google for indexing the Web and other computations over massive data sets.
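
To give a flavor of the model, here is a minimal, in-memory word-count sketch in Scala; plain Scala collections stand in for the map, shuffle, and reduce phases that Hadoop distributes across a cluster, and the names (WordCountSketch, mapTask, reduceTask) are illustrative rather than Hadoop's actual API.

    object WordCountSketch {
      // Map phase: each task turns one line of input into (word, 1) pairs.
      def mapTask(line: String): Seq[(String, Int)] =
        line.toLowerCase.split("""\W+""").filter(_.nonEmpty).map(word => (word, 1)).toSeq

      // Reduce phase: sum the counts collected for a single word.
      def reduceTask(word: String, counts: Seq[Int]): (String, Int) =
        (word, counts.sum)

      def main(args: Array[String]): Unit = {
        val input = Seq("the quick brown fox", "the lazy dog jumps", "the fox")
        val intermediate = input.flatMap(mapTask)           // map
        val grouped = intermediate.groupBy(_._1)            // "shuffle": group pairs by key
        val results = grouped.map { case (word, pairs) =>   // reduce
          reduceTask(word, pairs.map(_._2))
        }
        results.toSeq.sortBy(_._1).foreach { case (w, n) => println(s"$w\t$n") }
      }
    }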
 
I'll describe MapReduce and discuss strengths, such as cost-effective scalability, as well as weaknesses, such as its limits for real-time event stream processing and the relative difficulty of writing MapReduce programs. I'll briefly show you how higher-level languages ease the development burden and provide useful abstractions for the developer.
 
Then I'll discuss emerging alternatives, such as Google's Pregel system for graph processing and event stream processing systems like Storm, as well as the role of higher-level languages in improving developer productivity. Finally, I'll speculate about the future of Big Data technology.

Dean Wampler, author of "Programming Scala" and "Functional Programming for Java Developers"

Dean Wampler

Dean Wampler is the author of "Functional Programming for Java Developers", the co-author of "Programming Scala", and a co-author of the forthcoming "Programming Hive" (all from O'Reilly). He is a Principal Consultant at Think Big Analytics, a firm that specializes in "Big Data" application development, primarily using Hadoop-related technologies. Dean is the founder of the Chicago-Area Scala Enthusiasts (meetup.com/chicagoscala/) and of the programming website polyglotprogramming.com. He is also a contributor to several open-source projects. You can follow Dean on Twitter at @deanwampler.