Immersive Teaching and Research in Data Sciences via Cloud Computing

Legends Ballroom - Robinson-Whitman

Many talk about cloud computing, some try it, yet only a few succeed, because cloud computing follows a new paradigm that has to be learned and understood. This talk is about what cloud computing means to academic teachers and researchers in data sciences and how to take advantage of it. With public cloud computing, a new era for research and higher education begins. Scientists, educators and students can now work on advanced, high-capacity technological infrastructures without having to build them or to comply with rigid and limiting access protocols. Thanks to the cloud's pay-per-use and virtual machine models, they rent the resources and the software they need for the time they want, get the keys to full ownership, and work and share with little limitation. In addition, the centralized nature of the cloud and the users' ubiquitous access to its capabilities should make it straightforward for every user to share reusable artifacts with others. This is a new ecosystem for open science, open education and open innovation. What is missing is bridging software. We propose such software to help data scientists, educators and students take advantage of this new ecosystem: R, Python, Mathematica, spreadsheets, etc. are made accessible as articulated, programmable and collaborative components within a virtual research and education environment (VRE). The result is astonishing and requires some adaptation in the way we think: teachers can easily prepare interactive learning environments and share them like documents in Google Docs; students can share their sessions to solve problems in collaboration. Costs may be hidden from students by letting them temporarily access shared institution-owned resources or by using tokens that a teacher can generate from institutional cloud accounts. This includes online courses. The talk includes examples using Amazon EC2 and Microsoft Azure, such as:

1. Constructing a collaborative environment articulated around an R session to teach statistics
2. Creating enhanced spreadsheets to demonstrate variations of chemical molecules
3. Creating and sharing interactive dashboards for financial analysis
4. Reproducible research via VREs

Many things that used to be in the hands of large organizations or corporations, such as science gateways and big data processing, are now within the reach of any talented analyst, teacher or researcher. Come and see.

Karim Chine
Karim has been designing and building large-scale distributed software for over a decade. Born in Tunisia, he moved to Paris in 1997, graduated from Ecole Polytechnique and Telecom ParisTech, and worked for several years in the R&D departments of Schlumberger and IBM. In 2006, Karim moved to Cambridge and worked at EBI and at Imperial College. In 2008, he created "Cloud Era Ltd" and started the design of a ubiquitous and collaborative platform for data science in the cloud. Karim is a member of the European Commission's expert group for e-Infrastructure and cloud computing projects, and he is the author of Elastic-R.