Leveraging your Hadoop cluster better - running performant code at scale
Somebody once said that Hadoop is a way of running highly unperformant code at scale. In this talk I want to show how we can change that and make MapReduce jobs more performant. I will show how to analyze jobs at scale and how to optimize the jobs themselves, instead of just tinkering with Hadoop options. The result is a much better utilized cluster and jobs that run in a fraction of their original time: performant code at scale!

Most of the time when people speak about Hadoop they only consider scale; on closer inspection, however, Hadoop clusters very often run highly unperformant jobs. By actually examining the performance characteristics of the jobs themselves and optimizing and tuning them, far better results can be achieved. Examples include small changes that cut a job from 15 hours down to 2 hours without adding any hardware. The concepts and techniques explained in the talk are applicable regardless of which tool is used to identify the performance characteristics. What matters is that by applying the performance analysis and optimization techniques we have long used on other applications, we can make Hadoop jobs much more effective and performant. Attendees will come away able to understand these techniques and apply them to their own MapReduce, Pig, Hive, or other MapReduce-based jobs.
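To give a flavor of the kind of small, job-level change the talk covers (this sketch is illustrative and not taken from the talk itself), consider the canonical Hadoop word-count job. Two classic job-side optimizations are shown: reusing Writable instances in the mapper to avoid allocating new objects for every record, and registering a Combiner so map output is pre-aggregated locally before the shuffle.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    // Reuse Writable instances instead of allocating one per record:
    // the map loop runs millions of times, so per-record allocation
    // puts needless pressure on the garbage collector.
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values,
        Context context) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    // The combiner pre-aggregates map output on each node, so far
    // less data is shuffled across the network - often the dominant
    // cost in a MapReduce job.
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

How much a given job benefits depends on its shuffle volume and allocation rate, but the point stands: these changes target the job's own code rather than Hadoop configuration options, which is exactly the approach the talk advocates.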