Workshop: Intro to Machine Learning with Redis-ML and Spark
Location:
- Manhattan Ballroom, 8th fl.
Date:
Fri, 30 Jun
Key takeaways
Basic understanding of Machine Learning
Understanding of the role of tools like Spark and Redis
Ability to build a basic machine learning pipeline to solve real-world problems
Hands-on experience implementing solutions to classic ML problems
Prerequisites
Attendees should bring a laptop with the following software pre-installed:
- Web browser (Chrome is ideal)
- iPython 5.2.2 or later
- Text editor for development (anything attendee is comfortable with)
This tutorial introduces attendees to two introductory machine learning topics, decision trees and handwritten digit recognition, and shows how to train models with open source software to solve classic ML problems. From there, attendees will learn how these classic techniques apply to modern business and software problems.
To set the context for the session, we start with a quick introduction to Redis, Spark, and Python. We provide attendees with a basic understanding of the technologies, including: how to obtain the software, how the software is used by businesses to build ML systems, and where to get more information to build on the tutorial.
The first section of the tutorial introduces decision trees for classifying users based on a set of features derived from each user. Attendees will learn the conceptual model of decision trees as well as the domains where the technique is appropriate, and will use Spark to train a decision tree model on sample data. From there, we will discuss how decision trees can be applied to common business problems. We close out the first section by looking at how techniques like random forests can improve the performance of a decision model and how Redis can improve the performance of a data analysis pipeline.
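To make the conceptual model concrete, the following is a minimal sketch of how a decision tree picks its splits. It is plain Python rather than the Spark code used in the workshop, and everything in it is an assumption made for illustration: the user features (age, visits per week), the "churn"/"stay" labels, the sample rows, and the helper names `gini`, `best_split`, `build_tree`, and `predict`.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all labels agree)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build a tree of nested dicts; leaves hold the majority label."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or gini(labels) == 0.0:
        return {"leaf": majority}
    split = best_split(rows, labels)
    if split is None:
        return {"leaf": majority}
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return that leaf's label."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]

# Invented sample data: (age, visits_per_week) -> label.
rows = [(22, 1), (25, 2), (47, 8), (52, 9), (46, 1), (56, 2), (23, 9), (30, 7)]
labels = ["churn", "churn", "stay", "stay", "churn", "churn", "stay", "stay"]

tree = build_tree(rows, labels)
print(tree)
print(predict(tree, (40, 1)), predict(tree, (40, 8)))
```

In this toy data set the visit count alone separates the two classes, so the tree needs only one split. A random forest, mentioned above, would train many such trees on resampled rows and feature subsets and combine their votes.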
The second half of the tutorial will demonstrate how two different classification techniques (neural networks and support vector machines) can be used to build classifiers that recognize handwritten digits. Again, attendees will use Spark and Redis to build a working handwritten digit classifier based on sample data provided for the tutorial. Python will be used to drive the classifier pipeline.
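As a rough illustration of the idea behind digit classification, here is a minimal sketch that trains a perceptron (a single-neuron linear classifier, a much simpler relative of the neural networks and support vector machines named above) to tell a "0" from a "1". It is plain Python, not the workshop's Spark and Redis pipeline. The 3x5 bitmaps, the helper names `parse`, `train_perceptron`, and `classify`, and the two-digit scope are all assumptions made for this example.

```python
def parse(bitmap):
    """Flatten a whitespace-separated '#'/'.' bitmap into a list of 0/1 pixels."""
    return [1 if ch == "#" else 0 for row in bitmap.split() for ch in row]

# Hand-drawn 3x5 training bitmaps (invented for this sketch).
ZERO = parse("### #.# #.# #.# ###")
ONE = parse(".#. .#. .#. .#. .#.")
ZERO_B = parse("### #.# #.# #.# .##")   # a zero with one corner pixel missing
ONE_B = parse("##. .#. .#. .#. ###")    # a one drawn with a flag and a base

# Label +1 means digit 0, label -1 means digit 1.
samples = [(ZERO, +1), (ONE, -1), (ZERO_B, +1), (ONE_B, -1)]

def train_perceptron(samples, max_epochs=100):
    """Classic perceptron rule: nudge the weights toward each misclassified sample."""
    w = [0] * len(samples[0][0])
    b = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # wrong side of (or exactly on) the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return w, b

def classify(model, x):
    """Return the predicted digit (0 or 1) for a flattened bitmap."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 0 if score > 0 else 1

model = train_perceptron(samples)

# Two bitmaps that were not in the training set.
unseen_zero = parse("##. #.# #.# #.# ###")
unseen_one = parse(".#. ##. .#. .#. .#.")
print(classify(model, unseen_zero), classify(model, unseen_one))
```

Real handwritten digit data has far more pixels, ten classes, and much more variation, which is why larger models and a distributed training framework become useful.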
At the end of the session, attendees should have an understanding of several classic decision making models and when and when not to apply them. Attendees will feel comfortable building a prototype ML system to solve real-world business problems and will have a solid understanding of three popular ML technologies.
This tutorial approaches machine learning from a practical standpoint, focusing primarily on solving well-understood, common problems that developers face. Some theoretical ideas will be touched on, primarily to motivate why certain models suit certain problems better, but the mathematical underpinnings of ML will not be covered. The material should be accessible to any engineer with a CS background.