Presentation: Reaching Production Faster with Containers in Testing

Duration: 2:55pm - 3:45pm

Key Takeaways

  • Learn why integration testing services inside containers can increase service robustness and prevent many types of bugs.
  • Hear techniques used by Spotify for testing not only service containers but also Docker orchestration tools.
  • Learn how container integration tests allow developers to test real dependencies easily and ensure tests are reproducible and isolated.

Abstract

Spotify adopted container technology early on and built its own OSS framework for container orchestration called Helios. Not only do containers run many critical systems at Spotify, they also improve and accelerate development. We run containerized integration tests close to 400 times a day.

This talk covers how our Helios testing framework drives integration tests and spins up entirely self-contained environments during test runs. Developers can test services locally in an environment closely resembling the production stack; spin up dependent services like Cassandra, memcached, or even other containerized Spotify services; and even test their deployment and service discovery configurations.

Learn how this style of integration testing has increased our code quality and successful deployments.

Interview

Question: 
What is your role today?
Answer: 
I am a software engineer on one of the Spotify tech infrastructure teams. My team builds tools like Helios that help other backend engineers deploy and run their containers.
Question: 
Can you describe Helios and where it fits in the container ecosystem?
Answer: 
Helios is an open-source Docker orchestration framework. When Docker first became popular, there weren’t any open-source orchestration tools yet. So Spotify created its own.
A Helios “job” is a Docker image along with configuration like exposed ports or mounted volumes. Users tell Helios to deploy a job to a collection of hosts. On a high level, Helios’ design is similar to other popular orchestration tools. One or more masters receive requests via an HTTP API. The masters write a desired state into a highly available datastore (ZooKeeper in our case), and agents running on each host read the datastore and tell Docker what containers to run.
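The master/datastore/agent flow described above can be illustrated with a minimal sketch. This is not Helios code: the class and method names are hypothetical, and an in-memory map stands in for ZooKeeper. It only shows the general idea of agents comparing a recorded desired state against what is actually running.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from Helios itself.
final class ReconcileSketch {

  // Stand-in for the highly available datastore: host -> desired job names.
  private final Map<String, Set<String>> desiredState = new TreeMap<>();

  // "Master" side: record that a job should run on a host.
  void deploy(String host, String job) {
    desiredState.computeIfAbsent(host, h -> new TreeSet<>()).add(job);
  }

  // "Master" side: record that a job should no longer run on a host.
  void undeploy(String host, String job) {
    Set<String> jobs = desiredState.get(host);
    if (jobs != null) {
      jobs.remove(job);
    }
  }

  // "Agent" side: jobs that are desired on this host but not yet running.
  Set<String> toStart(String host, Set<String> running) {
    Set<String> result = new TreeSet<>(desiredState.getOrDefault(host, new TreeSet<>()));
    result.removeAll(running);
    return result;
  }

  // "Agent" side: jobs that are running on this host but no longer desired.
  Set<String> toStop(String host, Set<String> running) {
    Set<String> result = new TreeSet<>(running);
    result.removeAll(desiredState.getOrDefault(host, new TreeSet<>()));
    return result;
  }

  public static void main(String[] args) {
    ReconcileSketch cluster = new ReconcileSketch();
    cluster.deploy("host-a", "web:v2");
    cluster.deploy("host-a", "cache:v1");

    // What the agent on host-a currently sees running.
    Set<String> running = new TreeSet<>();
    running.add("web:v1");
    running.add("cache:v1");

    System.out.println("start: " + cluster.toStart("host-a", running)); // start: [web:v2]
    System.out.println("stop:  " + cluster.toStop("host-a", running));  // stop:  [web:v1]
  }
}
```

In a real system the agent would translate the two sets into calls to the container runtime; here they are simply printed.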
Question: 
What’s the motivation for your talk?
Answer: 
We want to share how testing not only services themselves but services as they’re running inside containers can be valuable. Container-based tests prevent configuration mistakes, allow developers to easily test against real dependencies, and ensure tests are reproducible and isolated.
With Helios, users can write a JUnit test to create a Helios job and deploy it in an environment closely resembling production. In this way, developers are testing not only the service but also the container configuration and the deployment infrastructure.
Using the same job configuration for testing and production deployments means less chance of misconfigured containers at runtime. No more forgetting to expose a port or mount a crucial file.
Starting a real PostgreSQL or Cassandra instance as a test dependency is easier when they’re running as containers. Developers can also start a container running another team’s service they depend on.
Tests are more isolated, especially when a datastore container is started as a dependency. Developers can reset the datastore by simply stopping the container and starting a new one.
For these reasons, we believe container-based tests can be a powerful tool in one’s testing arsenal.
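The point about sharing one job configuration between tests and production can be sketched roughly as follows. Everything here is an assumption for illustration: the class, the image name, the port numbers, and the file path are hypothetical and do not reflect the Helios API. The idea is only that a single job description is the source of truth, so a check run during testing exercises the same settings production will use.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: these names are illustrative and are not the Helios API.
final class JobConfigSketch {
  final String image;
  final Set<Integer> exposedPorts;
  final Set<String> mountedVolumes;

  JobConfigSketch(String image, Set<Integer> exposedPorts, Set<String> mountedVolumes) {
    this.image = image;
    this.exposedPorts = new TreeSet<>(exposedPorts);
    this.mountedVolumes = new TreeSet<>(mountedVolumes);
  }

  // Ports the service needs that the job description does not expose.
  Set<Integer> missingPorts(Set<Integer> requiredPorts) {
    Set<Integer> missing = new TreeSet<>(requiredPorts);
    missing.removeAll(exposedPorts);
    return missing;
  }

  // Single source of truth: tests and production would both call this method.
  static JobConfigSketch serviceJob() {
    return new JobConfigSketch(
        "example/service:1.0",
        new TreeSet<>(Arrays.asList(8080)),
        new TreeSet<>(Arrays.asList("/etc/service.conf")));
  }

  public static void main(String[] args) {
    JobConfigSketch job = serviceJob();
    // Suppose the service listens on 8080 and 8081, but only 8080 is exposed.
    Set<Integer> missing = job.missingPorts(new TreeSet<>(Arrays.asList(8080, 8081)));
    System.out.println("missing ports: " + missing); // missing ports: [8081]
  }
}
```

A container-based test goes further than this static check, since it actually starts the container and talks to it, but the benefit is the same: the mistake surfaces in the test run rather than at runtime.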
Question: 
So you can write some tests, kick off a unit test, pull something out of the registry, spin up the container, and then run some tests against it? You can do all this locally?
Answer: 
Yes, users have the option to run it all locally.
Originally, our test framework could only deploy to a long-running, remote Helios cluster dedicated to continuous integration. Spotify’s backend developers needed to be on the internal VPN to run their tests. Today, we are experimenting with running everything locally by running the Helios cluster itself as a local container.
Question: 
What about a really complex service? Can the framework you use test one? I mean something that is designed to tolerate instance failures and so may run across many nodes of a cluster. Or does it only test a single container instance?
Answer: 
You can definitely do that. We’ve seen some of our more advanced users write crazy integration tests where they spin up a small Hadoop cluster plus a CMS and test various failure scenarios. We never thought people would take it so far, but you can definitely write advanced tests.
Question: 
What about dependencies between containers? Do you have some facility to be able to orchestrate multiple ones?
Answer: 
Yes. The helios-testing framework supports service discovery. Helios operators need to run SkyDNS or another service registration system alongside the Helios cluster. Helios registers the service with that system so other containers can look it up.
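As a rough illustration of the register-then-look-up pattern mentioned above, here is a minimal in-memory sketch. It is not SkyDNS or Helios code: the class, method names, service name, host, and port are all hypothetical. It assumes each service name maps to a single "host:port" endpoint.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a service registry; not the SkyDNS or Helios API.
final class RegistrySketch {

  // Service name -> "host:port" endpoint.
  private final Map<String, String> endpoints = new HashMap<>();

  // Called on behalf of a container when it is deployed.
  void register(String serviceName, String host, int port) {
    endpoints.put(serviceName, host + ":" + port);
  }

  // Called when the container is stopped.
  void deregister(String serviceName) {
    endpoints.remove(serviceName);
  }

  // Called by a dependent container to find the service.
  Optional<String> lookup(String serviceName) {
    return Optional.ofNullable(endpoints.get(serviceName));
  }

  public static void main(String[] args) {
    RegistrySketch registry = new RegistrySketch();
    registry.register("cassandra", "10.0.0.5", 9042);

    System.out.println(registry.lookup("cassandra").orElse("not found")); // 10.0.0.5:9042
    registry.deregister("cassandra");
    System.out.println(registry.lookup("cassandra").orElse("not found")); // not found
  }
}
```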
Question: 
What are your key takeaways for this talk?
Answer: 
If you deploy services inside containers, consider having container-based tests. They can be valuable in preventing configuration mistakes, allowing developers to easily test against real dependencies, and ensuring tests are reproducible and isolated.

Speaker: David Xia

Software Engineer @Spotify

David Xia is a software engineer at Spotify who works on deployment infrastructure. His team focuses on building tools and platforms on top of Docker like Helios, docker-client, and Spotify’s internally hosted Docker registry.
