Presentation: Managing Millions of Data Services @Heroku
What You’ll Learn
- Learn about the evolution of Heroku servers and services.
- Hear approaches to reducing late-night calls and pager churn.
- Understand new ways of thinking about fleet orchestration, immutable infrastructure, and managing cloud resources.
Abstract
Over the years, Heroku Data's offerings have continued to grow and meet ever-higher demands with Postgres, Kafka and Redis. Performing repairs and maintenance, applying patches and auditing a fleet of millions creates some serious time constraints. We'll walk through the evolution of fleet orchestration, immutable infrastructure, security auditing and more to see how the data services for many Salesforce customers, start-ups and hobby developers alike are managed with as little human interaction as possible.
Interview
Gabriel: My main focus is running our fleet efficiently, securely and performantly. I want to make sure our services are highly available and provide the most bang for the buck compared with what companies have had to build in-house, and to remove the kludge for other engineering organizations so they can get back to analyzing and solving problems.
Gabriel: I’ve done cloud computing and DevOps for the last 4 years, and honestly, I hear the same complaints all the time about how ragged engineers are run by on-call and rolling out code. There’s so much to improve in running databases, app servers, and monitoring. This talk is about empathizing with my fellow on-call engineers and hopefully providing a new idea or way of thinking about the problems of managing large fleets.
Gabriel: I’d say it’s a medium-level talk. We’ll get into recent, real scenarios that all web-based companies using cloud technologies face, and ways to keep services alive during seriously impactful events. I’ll have examples of code, architecture, and a bit of theory sprinkled in as well.
Gabriel: One example I'm going to address is the Amazon Web Services S3 incident that happened in February, because it practically brought down one third of the Internet. Frankly, we weren't unaffected. I think we were affected as much as everyone else was, but what made it different for us is that we had enough stability in place to keep things up and running while the S3 incident was being worked on.
Tracks
Monday, 26 June
- Microservices: Patterns & Practices
  Practical experiences and lessons with Microservices.
- Java - Propelling the Ecosystem Forward
  Lessons from Java 8, prepping for Java 9, and looking ahead at Java 10. Innovators in Java.
- High Velocity Dev Teams
  Working Smarter as a team. Improving value delivery of engineers. Lean and Agile principles.
- Modern Browser-Based Apps
  Reactive, cross platform, progressive - webapp tech today.
- Innovations in Fintech
  Technology, tools and techniques supporting modern financial services.
Tuesday, 27 June
- Architectures You've Always Wondered About
  Case studies from the most relevant names in software.
- Developer Experience: Level up Your Engineering Effectiveness
  Trends, tools and projects that we're using to maximally empower your developers.
- Chaos & Resilience
  Failures, edge cases and how we're embracing them.
- Stream Processing at Large
  Rapidly moving data at scale.
- Building Security Infrastructure
  How our industry is being attacked and what you can do about it.
Wednesday, 28 June
- Next Gen APIs: Designs, Protocols, and Evolution
  Practical deep-dives into public and internal API design, tooling and techniques for evolving them, and binary and graph-based protocols.
- Immutable Infrastructures: Orchestration, Serverless, and More
  What's next in infrastructure. How cloud functions like Lambda are making their way into production.
- Machine Learning 2.0
  Machine Learning 2.0, Deep Learning & Deep Learning Datasets.
- Modern CS in the Real World
  Applied, practical, & real-world dive into industry adoption of modern CS.
- Optimizing Yourself
  Maximizing your impact as an engineer, as a leader, and as a person.
- Ask Me Anything (AMA)