A guide to Python frameworks for Hadoop
Track: Applied Data Science
Location: Grand Ballroom - Salon A/B
Abstract:
Distributed computing frameworks like Hadoop have revolutionized our ability to process large amounts of data. Using these tools typically requires writing complex programs in lower-level languages like Java; however, data scientists and analysts prefer to spend their time in higher-level languages such as Python. To address this gap, multiple open-source Python frameworks have been built to enable simple, user-friendly access to Hadoop's underlying systems. This talk will review the available frameworks, comparing their performance, ease of use and installation, differences in implementation, and other features.