Scaling Chartbeat from 8 Million Open Browsers to Realtime Analytics and Optimization
Chartbeat provides a real-time analytics platform used by 80% of top US news organizations to show their newsrooms how visitors are engaging with content. We exceed 200,000 requests per second on an average day, and each piece of real-time data reaches our dashboard within a fraction of a second. In addition to processing large amounts of data in real time, we run Hadoop and Redshift jobs to produce reports and optimization tools for our customers.
I’ll discuss the overall architecture of the system and how we handle these requests cost-efficiently using a custom-written analytics engine in C and Lua. I’ll also talk about some interesting numerical problems that arise from scale, and some algorithmic challenges and solutions we’ve encountered while measuring tens of billions of engaged minutes every month.