Workshop: Python-Based AI Workflows - From Notebook to Production Scale

Location: Marquis C, 9th fl.

Duration: 9:00am - 4:00pm

Day of week: Monday

Level: Intermediate

Prerequisites

  • A laptop with the ability to ssh into a remote machine
  • Experience with Python
  • Some familiarity with the command line

We all love the notebook environment for exploring data, developing our models, and visualizing results, and we love Python for its huge ecosystem of AI/ML tooling and ease of use. However, if all of our work stayed in local notebooks analyzing small local data, we wouldn't be creating real value for a business at production scale. We need to understand which Python tools to use as we scale our workflows beyond the notebook, and we need to understand how to manage and distribute our work on large data.

In this workshop, we will start with a set of Jupyter notebooks implementing an example ML/AI workflow in Python. We will then modify this code to get it ready for deployment as a set of scalable data pipeline stages. In that process, we will learn about various packages, tools, and frameworks in the Python ML/AI ecosystem (even touching on things like PyTorch). These tools enable data scientists to run AI workflows and transform data at scale. We will also learn how our Python processing can be deployed on infrastructure beyond our laptops with tools like Docker and Kubernetes, which power the largest technology companies on the planet. Each participant will deploy their own Python-based workflow in the cloud and complete a number of related hands-on exercises.

Speaker: Daniel Whitenack

Data Scientist @SILintl

Daniel Whitenack (@dwhitena) is a PhD-trained data scientist with over ten years of experience developing data-intensive applications in industry and academia. He is currently working as a data scientist with SIL International, a non-profit working towards sustainable language development worldwide. Daniel also co-hosts the Practical AI podcast (@PracticalAIFM), teaches data science/engineering at Ardan Labs (@ardanlabs) and Purdue University (@LifeAtPurdue), and has spoken at conferences around the world (ODSC, PyCon, QCon AI, DataEngConf, QCon, GopherCon, Spark Summit, Applied ML Days, and more).
