Presentation: Machine-Learned Indexes - Research from Google

Track: Architectures You've Always Wondered About

Location: Broadway Ballroom North Center, 6th fl.

Duration: 11:50am - 12:40pm

Day of week: Monday

Abstract

Modern data processing systems are designed to be general purpose: they handle a wide variety of schemas, data types, and data distributions, and aim to provide efficient access and computation over this data. This “one-size-fits-all” design means that such systems do not take advantage of the unique characteristics of each application, each user’s data, or each workload. What the design of these systems overlooks is that machine learning excels at understanding and adapting to particular datasets. We present a vision, with supporting evidence, for the future of data processing systems: by learning models of the application, data, and workload, we can redesign and customize nearly every component of a data processing system. We will take a deep dive into how traditional index structures can be reframed as machine learning problems. By doing so, and through careful model design and code synthesis, we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order of magnitude in memory on several real-world data sets. Building on the same modeling techniques, we find that we can achieve improvements in sorting, multi-dimensional indexing, and query optimization, all areas that have historically been the domain of traditional discrete algorithms and complex systems engineering.
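
To make the idea of an index as a machine learning problem concrete, here is a minimal sketch, not the speakers' implementation. It assumes the simplest possible model: a single least-squares line that maps a key to its approximate position in a sorted array, plus the worst-case prediction error, which bounds a final binary search. The class name `LinearLearnedIndex` and its methods are hypothetical and chosen only for illustration.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index (illustrative assumption, not the talk's system).

    A straight line fitted by least squares predicts where a key sits in a
    sorted array; the largest observed prediction error bounds the search.
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        if n == 0:
            raise ValueError("need at least one key")
        # Fit position ~ slope * key + intercept with ordinary least squares.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos)
                  for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Worst-case distance between predicted and true position.
        self.max_err = max(abs(i - self._predict(k))
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        # Round the model output and clamp it to a valid array position.
        pos = int(round(self.slope * key + self.intercept))
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # Binary search only inside the error window around the guess.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 108, 1000])
position = index.lookup(23)
```

The better the model fits the key distribution, the smaller `max_err` becomes and the less searching remains. A single line fits skewed data poorly, which is one reason a real design would likely need more expressive models than this sketch uses.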

Speaker: Alex Beutel

Senior Research Scientist @Google

Alex Beutel is a Senior Research Scientist in the Google Brain SIR team working on neural recommendation, fairness in machine learning, and ML for Systems. He received his Ph.D. in 2016 from Carnegie Mellon University’s Computer Science Department, and previously received his B.S. in computer science and physics from Duke University. His Ph.D. thesis on large-scale user behavior modeling, covering recommender systems, fraud detection, and scalable machine learning, was runner-up for the SIGKDD 2017 Doctoral Dissertation Award. He received the Best Paper Award at KDD 2016 and ACM GIS 2010, was a best paper finalist at KDD 2014 and ASONAM 2012, and was awarded the Facebook Fellowship in 2013 and the NSF Graduate Research Fellowship in 2011. More details can be found at alexbeutel.com.

Speaker: Tim Kraska

Associate Professor @MIT & Visiting Researcher @Google
