Trading based on Text Analytics of Online News and Social Media

Legends Ballroom - Robinson-Whitman

With advances in automated text analysis, unstructured data sources such as financial news and social media are becoming an effective source of trading signals to supplement existing structured data feeds. An important component of such analysis is characterizing the sentiment expressed in online text about specific publicly traded companies and commodities. Sentiment Analysis focuses on this task of automatically identifying whether a piece of text expresses a positive or negative opinion about its subject matter. Such expression of sentiment tends to be very domain-specific, and we present recent advances in Machine Learning that help us adapt sentiment analysis to a target domain, such as Finance. We describe an approach to trading on sentiment expressed in text, and demonstrate its effectiveness with back-testing results over several years of financial news. We discuss alternative sources of user-generated content, such as blogs and Twitter, which can provide additional trading signals. In addition to sentiment analysis, we provide an overview of other aspects of text and social network mining that become relevant when incorporating such social media sources.

Prem Melville received a Ph.D. in Computer Science from the University of Texas at Austin, and is currently a Research Scientist in the Machine Learning Group at IBM Research. His primary research interests lie in Machine Learning and Data Mining. He has over 50 refereed publications in these fields, on a wide range of topics, including social media analytics, sentiment analysis, emerging topic detection, assessing influence in networks, active learning, recommender systems, text classification, and applications of data mining to social media, business analytics and e-commerce. Prem has served on the organizing committees of the premier conferences in the field: KDD, ICML, CIKM and WSDM. He also serves on the Editorial Board of Data Mining and Knowledge Discovery. Prem co-organized the first workshop on Social Media Analytics (SOMA 2010), the first workshop on Budgeted Learning, and the workshop on Mining and Learning with Graphs (MLG 2011). Together with his colleagues, he received the Best Application Paper Award at KDD 2010, and has won several data mining challenges: KDD Cup 2009, KDD Cup 2008 and the INFORMS Data Mining Contest 2008.