Keynote: Data as DNA: Building a Company on Data
Location:
- Broadway Ballroom, 6th fl.
Day of the Week:
- Wednesday
What You’ll Learn
- Discover how important data can be for your company.
- Learn from Stitch Fix: what data issues they had to deal with, and how they ended up building a data-first company.
- Hear how data and science can go together to achieve great results.
Abstract
Creating a data-driven culture is no simple task. Many companies say they are data-centric, but few take full advantage of the data they have, and many spend massive amounts of time aggregating it. What does it take to create a data-driven culture? And can you take it too far? With stories from Stitch Fix, Salesforce, and Amazon, Cathy will talk about how giving people access to the data they need and aligning everyone to the right metrics can transform an organization.
Cathy will discuss the early evolution of A/B testing at Amazon and share how Salesforce built a telemetry system for real-time monitoring and decision making in its large-scale search infrastructure. In addition, she will discuss the data-rich culture at Stitch Fix and share how the company’s 70-person data science team drives every decision and function inside the online personalized styling service. From styling recommendations and inventory allocation to logistics and demand modeling, data is the DNA of the company. Cathy will share the highlights and pitfalls to avoid in the journey to becoming a data-centric company and she will provide suggestions on how other organizations can harness their data to do the same.
Interview
Cathy: The title is "Data as DNA: Building a Company on Data." Every company is becoming a data company, and not everybody understands how to create a data-driven culture. There's a lot of data, but people are not taking advantage of it in the right ways, and sometimes it's not easy to do. I’ve been thinking about ways that I've seen data being transformative in different companies.
I started my career at Amazon and I have a lot of interesting stories to share about how tests were created in the early days of the Internet. Sometimes, people did not understand what their data really meant or how to build a good experiment around the questions they had. When I came to work at Salesforce, the team was building a brand new large-scale search infrastructure. One of the team's struggles was how to deploy it without interrupting any of the customers. We likened it to ripping out an engine in mid-flight because everything was so dependent on search. When we built a full-scale telemetry system and measured how the new system was performing compared to the legacy one, we got the confidence to switch over. So I’ve seen data used everywhere from an Internet company to a large-scale infrastructure project.
Then it was Stitch Fix, where I started six months ago. Stitch Fix was created as a data company from scratch. Our clients fill out a detailed style profile about their price, style and size preferences, and then a stylist selects clothing and accessories that best meet the needs of each client. It's the perfect balance of art and science: we use data science to deliver personalization at scale, combined with an expert human stylist who deeply understands each client’s unique needs and is able to connect on a personal level. A machine helps customers find out what they love and refines the selection of merchandise for each stylist, so our stylists can focus on building important client relationships.
One of our first hires was to lead data science, and data is essential for everything: picking the right merchandise, making recommendations for stylists, matching the right stylist to the right clients, and helping with customer support requests. I love this idea of how to be a data company, how to use data in the right way and how to avoid some of the pitfalls of conducting the wrong experiments. We've even seen things that take too long sometimes. I think that's a really interesting aspect: knowing when to use data and when it is better to move ahead for the sake of expediency.
Cathy: I think the art and the science machine is an interesting topic right now. It took a long time for any machine to beat a human at chess. Then, with Deep Blue, machines evolved. Many master chess players have used computer-aided chess to beat the machines. So the machines by themselves are better than humans, but the blend of humans and machines together is better than machines solo.
Cathy: I want them to understand how important it is to use data to solve business problems, and that there’s an abundance of data in the world. Also, it's not just about being a data science company; it's about how you can answer questions and improve your business by leveraging data in the right way. I want them to learn how to build data into the DNA of their company and how to create the right culture to solve problems.
Cathy: In the beginning, we used percentile data for performance metrics and we were missing so much of what was happening on the fringes until we started looking at the 98th percentile. Then, we were able to dig into some of the issues that were going on.
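To illustrate the point about the fringes, here is a minimal Python sketch of how a high percentile can surface slow requests that a median hides. The latency values are invented for illustration, and the nearest-rank `percentile` helper is a hypothetical function, not the telemetry approach actually used at Salesforce.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so that very small pct values still work.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 97 fast requests and 3 slow ones.
latencies_ms = [20] * 97 + [900, 1200, 1500]

p50 = percentile(latencies_ms, 50)  # the typical request
p98 = percentile(latencies_ms, 98)  # the tail of the distribution

print(f"p50 = {p50} ms, p98 = {p98} ms")
```

In this made-up sample the median reports 20 ms while the 98th percentile reports 900 ms, so only the tail metric shows that a small share of requests is far slower than the rest.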
Tracks
Monday, 26 June
-
Microservices: Patterns & Practices
Practical experiences and lessons with Microservices.
-
Java - Propelling the Ecosystem Forward
Lessons from Java 8, prepping for Java 9, and looking ahead at Java 10. Innovators in Java.
-
High Velocity Dev Teams
Working Smarter as a team. Improving value delivery of engineers. Lean and Agile principles.
-
Modern Browser-Based Apps
Reactive, cross platform, progressive - webapp tech today.
-
Innovations in Fintech
Technology, tools and techniques supporting modern financial services.
Tuesday, 27 June
-
Architectures You've Always Wondered About
Case studies from the most relevant names in software.
-
Developer Experience: Level up Your Engineering Effectiveness
Trends, tools and projects that we're using to maximally empower your developers.
-
Chaos & Resilience
Failures, edge cases and how we're embracing them.
-
Stream Processing at Large
Rapidly moving data at scale.
-
Building Security Infrastructure
How our industry is being attacked and what you can do about it.
Wednesday, 28 June
-
Next Gen APIs: Designs, Protocols, and Evolution
Practical deep-dives into public and internal API design, tooling and techniques for evolving them, and binary and graph-based protocols.
-
Immutable Infrastructures: Orchestration, Serverless, and More
What's next in infrastructure. How cloud functions like Lambda are making their way into production.
-
Machine Learning 2.0
Machine Learning 2.0, Deep Learning & Deep Learning Datasets.
-
Modern CS in the Real World
Applied, practical, & real-world dive into industry adoption of modern CS.
-
Optimizing Yourself
Maximizing your impact as an engineer, as a leader, and as a person.
-
Ask Me Anything (AMA)