Workshop: Introduction to Machine Learning with Redis-ML and Apache Spark

Level: Beginner

Time: 9:00am - 4:00pm

Date: Fri, 30 Jun

Key takeaways

Basic understanding of Machine Learning

Understanding of the role of tools like Spark and Redis

Ability to build a basic machine learning pipeline to solve real-world problems

Hands-on experience implementing solutions to classic ML problems

Prerequisites

Attendees should bring a laptop with the following software pre-installed:

  • Web browser (Chrome is ideal)
  • IPython 5.2.2 or later
  • Text editor for development (anything the attendee is comfortable with)

This tutorial introduces attendees to two introductory machine learning topics, decision tree classification and handwritten digit recognition, and shows how to train models for both using open source software. From there, attendees will learn how these classic techniques apply to modern business and software problems.

To set the context for the session, we start with a quick introduction to Redis, Spark, and Python. We provide attendees with a basic understanding of the technologies, including: how to obtain the software, how the software is used by businesses to build ML systems, and where to get more information to build on the tutorial.

The first part of the tutorial introduces decision trees for classifying users based on a set of features derived from each user. Attendees will learn the conceptual model behind decision trees and the problem domains the technique suits, then use Spark to train a decision tree model on sample data. From there, we will discuss how decision trees can be applied to common business problems. We close out the first part by looking at how techniques like random forests can improve the performance of a decision tree model and how Redis can improve the performance of a data analysis pipeline.
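To make the conceptual model concrete, here is a minimal sketch of the core step in growing a decision tree: choosing the feature and threshold that best separate the classes, measured by Gini impurity. It uses plain Python rather than Spark, and the user features (age, visits per week), the labels, and the function names are all invented for illustration; it is not the workshop's actual material.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best binary split.

    A row goes left when row[feature_index] <= threshold, right otherwise.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue  # a split must send rows to both sides
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or weighted < best[2]:
                best = (f, threshold, weighted)
    return best

# Hypothetical users: (age, visits_per_week), labelled by usage type.
users = [(25, 1), (32, 2), (47, 1), (51, 7), (23, 9), (36, 8)]
labels = ["casual", "casual", "casual", "power", "power", "power"]

print(best_split(users, labels))  # (1, 2, 0.0): split on visits_per_week <= 2
```

A full tree repeats this search recursively on each side of the split until the groups are pure or a depth limit is reached. A random forest trains many such trees on resampled data and combines their votes.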

The second half of the tutorial will demonstrate how two different classification techniques (neural networks and support vector machines) can be used to build classifiers that recognize handwritten digits. Again, attendees will use Spark and Redis to build a working handwritten digit classifier from sample data provided for the tutorial, with Python driving the classifier pipeline.
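As a rough illustration of what training a digit classifier involves, the sketch below trains a single-neuron perceptron, the simplest building block of a neural network, to tell a "1" from a "0". It is plain Python with no Spark or Redis, and the 3x3 pixel patterns and function names are invented for the example. A real digit classifier would use far larger images and a multi-layer network or a support vector machine.

```python
# Toy 3x3 "images", flattened row by row (1 = ink, 0 = blank).
# These four samples are invented for illustration only.
ONE_A  = [0, 1, 0,  0, 1, 0,  0, 1, 0]
ONE_B  = [0, 1, 0,  0, 1, 0,  0, 1, 1]
ZERO_A = [1, 1, 1,  1, 0, 1,  1, 1, 1]
ZERO_B = [1, 1, 1,  1, 0, 1,  1, 1, 0]

samples = [ONE_A, ZERO_A, ONE_B, ZERO_B]
targets = [+1, -1, +1, -1]  # +1 means digit "1", -1 means digit "0"

def score(weights, bias, pixels):
    """Weighted sum of the pixels plus a bias term."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias

def train_perceptron(samples, targets, max_epochs=100):
    """Classic perceptron rule: nudge the weights toward each misclassified sample."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for pixels, target in zip(samples, targets):
            if target * score(weights, bias, pixels) <= 0:
                weights = [w + target * p for w, p in zip(weights, pixels)]
                bias += target
                mistakes += 1
        if mistakes == 0:
            break  # every training sample is classified correctly
    return weights, bias

def predict_digit(weights, bias, pixels):
    """Return 1 or 0, the digit the trained neuron assigns to the image."""
    return 1 if score(weights, bias, pixels) > 0 else 0

weights, bias = train_perceptron(samples, targets)
print([predict_digit(weights, bias, s) for s in samples])  # [1, 0, 1, 0]
```

Because the toy patterns are linearly separable, the perceptron rule is guaranteed to stop making mistakes on them. Real handwriting is not that clean, which is one reason for richer models such as multi-layer networks and support vector machines.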

At the end of the session, attendees should understand several classic decision-making models and know when to apply them and when not to. Attendees will feel comfortable building a prototype ML system to solve real-world business problems and will have a solid understanding of three popular technologies used in ML systems: Redis, Spark, and Python.

This tutorial will approach machine learning from a practical standpoint, focusing primarily on solving common, well-understood problems that developers face. Some theoretical ideas will be touched on, primarily to motivate why certain models are better for certain problems, but the mathematical underpinnings of ML will not be covered. The material should be accessible to any engineer with a CS background.

Speaker: Tague Griffith

Head of Developer Advocacy @RedisLabs

Tague Griffith is the Head of Developer Advocacy at Redis Labs, where he focuses on developer outreach and developer education. Prior to joining Redis Labs, he worked in infrastructure engineering, building several high-performance Redis systems. He holds degrees in Computer Science with a specialization in Databases from Stanford University.
