Track: Machine Learning for Developers

Location: Soho Complex, 7th fl.

Day of week: Monday

Machine learning is more approachable than ever. Learn techniques and use cases with PyTorch, Keras, and TensorFlow that will become foundational for modern application developers.

Track Host: Hien Luu

Engineering Leader @LinkedIn - AI & Big Data Enthusiast

Hien Luu is an engineering manager at LinkedIn and an AI & big data enthusiast. He is particularly passionate about the intersection between big data and artificial intelligence. Teaching is one of his passions, and he currently teaches an Apache Spark course at the UCSC Silicon Valley Extension school. He is the author of the book "Beginning Apache Spark 2", published in 2018. He has given presentations at various conferences, including QCon (SF, London, Shanghai), Hadoop Summit, JavaOne, and Seattle Data Day.

10:35am - 11:25am

Getting Started in Deep Learning with TensorFlow 2.0

The introduction of deep learning into the data science toolkit has allowed for significant improvements on many important problems in data science. Many advancements in fields such as natural language processing, computer vision and generative modeling can be attributed to advancements in deep learning. In this talk, we will explain what deep learning is, why you may (or may not!) want to use it over traditional machine learning methods, as well as how to get started building deep learning models yourself using TensorFlow 2.0.  

Why highlight TensorFlow 2.0 specifically? The release of TensorFlow 2.0 comes with a significant number of improvements over the 1.x line, all focused on ease of use and a better developer experience. We will give an overview of what TensorFlow 2.0 is and discuss how to get started building models from scratch using its high-level API, Keras. We will walk step-by-step through an example in Python of how to build an image classifier. We will then showcase how to leverage a technique called transfer learning to make building a model even easier! With transfer learning, we can reuse models pretrained on large datasets such as ImageNet to drastically speed up the training of our own model. TensorFlow 2.0 makes this incredibly simple to do.
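As a rough sketch of the kind of from-scratch model the talk describes (the input shape, layer sizes, and class count here are illustrative choices, not taken from the talk), a minimal Keras image classifier in TensorFlow 2.0 might look like this:

```python
# A minimal from-scratch image classifier using the Keras Sequential API.
# The 28x28 grayscale input and 10 classes are illustrative assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),  # one unit per class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(images, labels, epochs=5)  # train on your own labeled images
```

For the transfer-learning variant, the convolutional layers would typically be replaced by a pretrained base such as one from `tf.keras.applications`, frozen, and topped with a fresh classification head.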

The TensorFlow ecosystem is rich with other offerings, and we would be remiss not to mention them. We will conclude by briefly discussing what these are, including Swift for TensorFlow, TensorFlow.js and TensorFlow Extended!

Brad Miro, Machine Learning Engineer @Google

11:50am - 12:40pm

Panel: ML for Developers/SWEs

Throughout the day, our speakers will cover how they've applied machine learning to software engineering. The day wraps with a panel discussion on taking an applied, pragmatic approach to adding ML to your systems and on the challenges the speakers solved along the way. Eager to deploy ML and have questions? This is a forum to discuss, learn, and help crystallize your roadmap. Join us to discuss first principles for adding ML to your systems.

Hien Luu, Engineering Leader @LinkedIn - AI & Big Data Enthusiast
Jeff Smith, Engineering Manager @Facebook AI
Brad Miro, Machine Learning Engineer @Google
Ashi Krishnan, Building the Next Generation of Developer Tools @Github

1:40pm - 2:30pm

From Research to Production With PyTorch

PyTorch is a powerful, flexible deep learning platform that enables engineers and researchers to move quickly from research to production. Since the 1.0 release a few months ago, researchers and engineers have already seen success in using its new capabilities to take deep learning models from research into production. With the 1.1 release, yet more features have shipped and new ecosystem projects have launched.

This talk will cover some of the latest features from PyTorch including the TorchScript JIT compiler, distributed data parallel training, TensorBoard integration, new APIs, and more. We’ll also discuss some of the most exciting projects coming out of the PyTorch ecosystem like BoTorch, Ax, and PyTorch BigGraph. Finally, we’ll dig into some of the use cases and industries where people are successfully taking PyTorch models to production, from cars to cancer treatments.
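To give a flavor of one of these features, here is a minimal sketch of the TorchScript JIT compiler mentioned above (the tiny module is an illustrative example, not a model from the talk):

```python
# Sketch: compiling a model with the TorchScript JIT compiler.
import torch

class TinyNet(torch.nn.Module):
    """An illustrative two-output module standing in for a real model."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.linear(x))

# torch.jit.script compiles the module into a serializable graph that no
# longer depends on the Python interpreter, so it can be served from C++.
scripted = torch.jit.script(TinyNet())
# scripted.save("tiny_net.pt")  # load later with torch.jit.load for serving
```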

Jeff Smith, Engineering Manager @Facebook AI

2:55pm - 3:45pm

MLflow: An Open Platform to Simplify the Machine Learning Lifecycle

Developing applications that successfully leverage machine learning is difficult. Building and deploying a machine learning model is challenging to do once. Enabling other data scientists (or even yourself, one month later) to reproduce your pipeline, compare the results of different versions, track what’s running where, and redeploy and rollback updated models is much harder.

Corey Zumar offers an overview of MLflow, a new open source project from Databricks that simplifies this process. MLflow provides APIs for tracking experiment runs between multiple users within a reproducible environment and for managing the deployment of models to production. Moreover, MLflow is designed to be an open, modular platform—you can use it with any existing ML library and incorporate it incrementally into an existing ML development process.

Corey Zumar, Software Engineer @Databricks

4:10pm - 5:00pm

Hands-On Feature Engineering for Natural Language Processing

Think of Grammarly, autotext, and Alexa: many applications in software engineering are full of natural language, and the opportunities are endless. The latest advances in NLP, such as word2vec, GloVe, ELMo, and BERT, are easily accessible through open source Python libraries. There is no better time for software engineers to develop NLP applications.

Feature engineering is the secret sauce of robust NLP models: features are the inputs from which NLP algorithms generate their output.

The aim of this talk is to share a range of NLP feature engineering techniques, from bag-of-words to TF-IDF to word embeddings, covering feature engineering for classical ML models as well as for emerging deep learning approaches.

The talk will cover the end-to-end details, including contextual and linguistic feature extraction, vectorization, n-grams, topic modeling, and named entity resolution, which are grounded in concepts from mathematics, information retrieval, and natural language processing. We will also dive into more advanced feature engineering strategies, such as word2vec, GloVe, and fastText, that leverage deep learning models.
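To make two of the foundational techniques concrete, here is a from-scratch sketch of bag-of-words counts and TF-IDF weights on a toy corpus (the talk itself uses libraries such as spaCy and Gensim; this pure-Python version only illustrates the idea):

```python
# Bag-of-words and TF-IDF computed by hand on a toy three-document corpus.
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
tokenized = [d.split() for d in docs]

# Bag-of-words: raw term counts per document.
bows = [Counter(tokens) for tokens in tokenized]

# Document frequency: how many documents each term appears in.
df = Counter(term for tokens in tokenized for term in set(tokens))
n_docs = len(docs)

def tfidf(term, tokens):
    """Term frequency weighted by inverse document frequency."""
    tf = tokens.count(term) / len(tokens)
    idf = math.log(n_docs / df[term])
    return tf * idf

# "cat" is rarer across the corpus than "the", so despite appearing only
# once in the first document it receives a higher TF-IDF weight there.
```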

In addition, attendees will learn how to combine NLP features with numeric and categorical features and analyze the feature importance from the resulting models.

The following Python libraries will be used to demonstrate the aforementioned feature engineering techniques: spaCy, Gensim, fastText, and Keras.

Susan Li, Sr Data Scientist at Kognitiv Corporation

5:25pm - 6:15pm

Time Predictions in Uber Eats

Uber Eats has been one of the fastest-growing meal delivery services since its initial launch in Toronto in December 2015. Currently, it’s available in over 40 countries and 400 cities. The ability to accurately predict delivery times is paramount to customer satisfaction and retention. Additionally, estimates are important on the supply side as they inform when to dispatch couriers.  

This talk will cover how Uber Eats has leveraged machine learning to address these challenges. We’ll briefly talk about the implementation of the intelligent dispatch system and compare the versions before and after introducing time predictions powered by machine learning. Then we’ll use food preparation time prediction as an example to show, step by step, how ML is applied in our engineering work. Finally, we’ll quickly go over predictions of estimated delivery time and estimated travel time.

Zi Wang, Leading the Machine Learning Engineering Work for Time Predictions @UberEats

Monday, 24 June