Workshop: Python-Based AI Workflows - From Notebook to Production Scale

Location: Marquis C, 9th fl.

Duration: 9:00am - 4:00pm

Day of week: Monday

Level: Intermediate

Prerequisites

  • A laptop with the ability to ssh into a remote machine
  • Experience with Python
  • Some familiarity with the command line

We all love the notebook environment for exploring data, developing our models, and visualizing results, and we love Python for its huge ecosystem of AI/ML tooling and its ease of use. However, if all of our work stayed in local notebooks analyzing small, local data, we wouldn't be creating real value for a business at production scale. We need to understand which Python tools to use as we scale our workflows beyond the notebook, and we need to understand how to manage and distribute our work on large data.

In this workshop, we will start with a set of Jupyter notebooks implementing an example ML/AI workflow in Python. We will then modify this code to get it ready for deployment as a set of scalable data pipeline stages. Along the way, we will learn about various packages, tools, and frameworks in the Python ML/AI ecosystem (even touching on things like PyTorch) that are enabling data scientists to run AI workflows and transform data at scale. We will also learn how our Python processing can be deployed on infrastructure beyond our laptops with tools like Docker and Kubernetes, which power the largest technology companies on the planet. Each participant will deploy their own Python-based workflow in the cloud and will complete a number of related, hands-on exercises.
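To give a flavor of the kind of refactoring involved, here is a minimal, hypothetical sketch (not the workshop's actual code) of a notebook step rewritten as a standalone pipeline stage. The file layout, argument names, and transform logic are illustrative assumptions; the point is that a stage which takes its input and output locations from the command line, rather than from hard-coded notebook paths, can run unchanged on a laptop, inside a Docker container, or as a step scheduled on Kubernetes.

```python
# pipeline_stage.py -- hypothetical sketch of a notebook cell refactored
# into a standalone, containerizable pipeline stage. Paths, file names,
# and the transform itself are illustrative placeholders.
import argparse
import json
from pathlib import Path


def transform(record):
    """Placeholder for the model/feature logic developed in the notebook."""
    record["score"] = len(record.get("text", ""))  # stand-in computation
    return record


def main():
    # Taking input/output locations as arguments (instead of hard-coding
    # them) is what lets the same script move from notebook to cluster.
    parser = argparse.ArgumentParser(description="Example pipeline stage")
    parser.add_argument("--input-dir", type=Path, required=True)
    parser.add_argument("--output-dir", type=Path, required=True)
    args = parser.parse_args()

    args.output_dir.mkdir(parents=True, exist_ok=True)
    for in_path in sorted(args.input_dir.glob("*.json")):
        with in_path.open() as f:
            record = json.load(f)
        result = transform(record)
        with (args.output_dir / in_path.name).open("w") as f:
            json.dump(result, f)


if __name__ == "__main__":
    main()
```

Run locally as, for example, `python pipeline_stage.py --input-dir data/in --output-dir data/out`; the same script, packaged into a Docker image, could then be scheduled as one stage of a larger data pipeline.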

Speaker: Daniel Whitenack

Data Scientist, Lead Developer Advocate @pachydermIO

Daniel is a Ph.D.-trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines that include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science/engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and is actively helping to organize contributions to various open source data science projects.
