Speed Up Your Software Testing Life Cycle With Containers

Dror Arazi | May 22, 2017

When it comes to testing, it is always about how to release faster without compromising quality. Normally our blog posts deal with the second part of the equation: how to improve quality despite increasing release speeds. This time, however, we'll focus on how to use containers to speed up your software testing life cycle.

We’ll walk you through the problems we had at SeaLights that led us to implement containers in testing, the issues we ran into during implementation, and what we got out of it. TL;DR: buckle up your Docker/Kubernetes, because it’s worth the ride.

Shifting from linear processes to parallel processes

Traditionally, the way testing works is develop -> test -> deploy, repeated as needed. Today, most companies have adopted shift left or shift right testing, which means develop + test -> deploy or develop -> deploy + test. As SeaLights has no tolerance for bugs or for testing in production, we took the shift left approach, moving testing as far left, and as early into the development pipeline, as possible.

The improvement here is that you no longer need to wait until the entire feature set is complete in order to begin the software testing life cycle. The main difference is that tests now run at the PR level instead of waiting for the merge to master, which creates a much faster feedback loop.
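
To make the PR-level trigger concrete, here is a minimal sketch (not SeaLights’ actual setup) of a webhook handler that kicks off a test run whenever a pull request is opened or updated. The GitHub-style payload fields and the trigger_tests() helper are assumptions for illustration; in practice most CI servers provide this trigger out of the box.

```python
# Minimal sketch of a PR-level test trigger, assuming a GitHub-style
# "pull_request" webhook payload. trigger_tests() is a hypothetical
# stand-in for whatever actually starts your CI pipeline.
from flask import Flask, request, jsonify

app = Flask(__name__)

def trigger_tests(repo, branch, sha):
    # Hypothetical: call your CI server / container orchestrator here.
    print(f"Starting test run for {repo}@{branch} ({sha[:7]})")

@app.route("/webhook", methods=["POST"])
def on_pull_request():
    event = request.headers.get("X-GitHub-Event", "")
    payload = request.get_json(silent=True) or {}
    # Run tests when a PR is opened or receives new commits ("synchronize").
    if event == "pull_request" and payload.get("action") in ("opened", "synchronize"):
        pr = payload["pull_request"]
        trigger_tests(
            repo=pr["base"]["repo"]["full_name"],
            branch=pr["head"]["ref"],
            sha=pr["head"]["sha"],
        )
    return jsonify(status="ok")

if __name__ == "__main__":
    app.run(port=8080)
```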

Shift left testing was a great first step, but it was nowhere near enough to bring our software testing life cycle up to the speed we aspired to. Despite all our improvements, we were still experiencing significant developer downtime because both our tests and our environments were still linear. So we set out to achieve parallel testing, which meant addressing and solving two separate aspects:

  1. Parallelism: with regard to tests. Meaning reducing total test duration from the sum of all test durations to the duration of the lengthiest test.

  2. Multitenancy: with regard to developers/testers. Meaning reducing the downtime spent waiting for QA environments in CI with differing setups.

Parallelism

What you’ll need: 1 Developer

How long will it take: 2-3 days (worst case scenario)

Payoff: Test duration improved by 1,614% (in other words, reduced to 5.83% of the previous test duration)

Parallelism is relatively easy and straightforward to solve, and here the expectations and results matched up perfectly. Instead of waiting for all tests to execute in succession, tests now execute in parallel, meaning that test duration is equal to the duration of the lengthiest test instead of the sum of all test durations. For us, that meant going from a total test duration of 120 minutes to 7 minutes at maximum parallelism, overall a 1,614% improvement in test duration. Note that this time depends only on the maximum parallelism, not on the number of tests.
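
As a minimal illustration of that arithmetic, the sketch below (with placeholder suite commands, not our actual harness) fans test suites out across a thread pool; the total wall-clock time ends up close to the duration of the slowest suite rather than the sum of all of them.

```python
# Minimal sketch of parallel test execution: total wall-clock time is
# roughly the duration of the slowest suite, not the sum of all suites.
# The suite commands below are placeholders.
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

TEST_SUITES = [
    ["pytest", "tests/api"],
    ["pytest", "tests/integration"],
    ["pytest", "tests/ui"],
]

def run_suite(cmd):
    started = time.monotonic()
    result = subprocess.run(cmd, capture_output=True, text=True)
    return " ".join(cmd), result.returncode, time.monotonic() - started

if __name__ == "__main__":
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=len(TEST_SUITES)) as pool:
        results = list(pool.map(run_suite, TEST_SUITES))
    for name, code, duration in results:
        print(f"{name}: exit={code} took={duration:.1f}s")
    print(f"total wall-clock: {time.monotonic() - start:.1f}s")
```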

The improvement that test parallelism brings is, for the most part, stable. Meaning that after the initial drop in test duration you shouldn’t see fluctuations in your release cycle unless you are:

  1. Skipping tests
  2. Optimizing tests
  3. Suffering from environment provisioning bottlenecks (see multitenancy)

*Minor Issue: CI Performance

A side effect of test parallelism can be build server overflow, which occurs when a large volume of tests runs in parallel on the build server and exhausts its VMs. Companies with a very high volume of tests (over 1,000) can compensate for this in advance by running tests in their CI container orchestration environment rather than on the build server.
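
One way to offload that work, sketched here with the official Kubernetes Python client, is to create one Kubernetes Job per test suite inside the CI cluster instead of running everything on the build server. The image name, namespace, and suite names below are assumptions for illustration.

```python
# Sketch: offload test execution from the build server by creating one
# Kubernetes Job per test suite in the CI cluster. Image name, namespace,
# and suite names are assumptions.
from kubernetes import client, config

def launch_test_job(suite, image, namespace="ci"):
    container = client.V1Container(
        name=f"tests-{suite}",
        image=image,
        command=["pytest", f"tests/{suite}"],
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"suite": suite}),
        spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
    )
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name=f"tests-{suite}"),
        spec=client.V1JobSpec(template=template, backoff_limit=0),
    )
    client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)

if __name__ == "__main__":
    config.load_kube_config()  # use config.load_incluster_config() inside the cluster
    for suite in ("api", "integration", "ui"):
        launch_test_job(suite, image="registry.example.com/myapp-tests:pr-123")
```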

Multitenancy

What you’ll need: 1 Developer + 1 DevOps Engineer

How long will it take: 2 weeks

Payoff: Provisioning improved by 333.33% (in other words, reduced to 30% of the previous provisioning time)

Multitenancy was a bit more difficult to solve, as it required DevOps assistance. Here we needed to solve environment provisioning: the average amount of time that a developer/tester had to wait until their environment was spun up. Our starting average (T1) was ~10 minutes.

By implementing multitenancy, we expected to achieve fast environment provisioning, reducing downtime from 10 minutes to 3 minutes. In reality, our first implementation got us to 5 minutes of provisioning and our final implementation to 3.5 minutes. This was after final optimizations that included initiating namespaces via PR forecasting and retaining reserve compute power to support peak concurrent-testing outliers.
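
For illustration only, here is a rough sketch of per-PR environment provisioning using the Kubernetes Python client. The pr-&lt;number&gt; naming scheme and labels are assumptions, and our actual PR-forecasting and reserve-capacity logic is not shown.

```python
# Sketch: provision an isolated namespace per pull request so each
# developer/tester gets their own environment. Naming and labels are
# assumptions; the PR-forecasting logic mentioned above is not shown.
from kubernetes import client, config

def ensure_pr_namespace(pr_number):
    name = f"pr-{pr_number}"
    core = client.CoreV1Api()
    existing = {ns.metadata.name for ns in core.list_namespace().items}
    if name not in existing:
        core.create_namespace(
            client.V1Namespace(
                metadata=client.V1ObjectMeta(
                    name=name,
                    labels={"purpose": "ci", "pr": str(pr_number)},
                )
            )
        )
    return name

if __name__ == "__main__":
    config.load_kube_config()
    print("environment namespace:", ensure_pr_namespace(123))
```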

An improvement from 10 to 3 minutes might not sound like a lot, but when you take into consideration how many developers you have and how many times they commit code a day, it adds up. For us, on a calm day, we’re talking about 15 developers committing code twice daily; at ~10 minutes per wait, that’s 300 minutes of idle time a day. Now imagine a super stressed day.

*Minor Issue: Private build images

Multitenancy at such an early stage, which is de facto what shift left testing requires, means maintaining private images at the PR level. We now need to manage not only master images but also branch-, tag-, and fork-level images. This is much more difficult to manage and maintain:

  1. Your build servers need to be able to support automated image building at all of these levels.
  2. Storage cost: your image storage cost has now multiplied by a factor of 4.

Neither of these is a dealbreaker for your container registry. What is a dealbreaker is storage thresholds: now that you are working at a larger scale, it is only a matter of time until you hit your limit and break your pipeline. All you need to do to combat this in advance is set up alerts and delete obsolete images, as sketched below.
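
As a rough sketch of that cleanup (assuming PR images are tagged pr-&lt;number&gt; and the registry speaks the Docker Registry v2 HTTP API with deletion enabled), a small script can drop manifests for tags whose pull requests are no longer open. The registry URL and repository name are placeholders, and a registry garbage-collection run is still needed afterwards to reclaim the space.

```python
# Sketch: prune obsolete PR image tags from a Docker Registry v2 to stay
# under storage thresholds. Registry URL, repository name, and the "pr-"
# tag convention are assumptions; deletion must be enabled on the registry
# and followed by a garbage-collection run.
import requests

REGISTRY = "https://registry.example.com"
MANIFEST_V2 = "application/vnd.docker.distribution.manifest.v2+json"

def prune_pr_tags(repository, keep):
    tags = requests.get(f"{REGISTRY}/v2/{repository}/tags/list").json().get("tags") or []
    for tag in tags:
        if not tag.startswith("pr-") or tag in keep:
            continue
        # Resolve the tag to its content digest, then delete the manifest.
        head = requests.head(
            f"{REGISTRY}/v2/{repository}/manifests/{tag}",
            headers={"Accept": MANIFEST_V2},
        )
        digest = head.headers.get("Docker-Content-Digest")
        if digest:
            requests.delete(f"{REGISTRY}/v2/{repository}/manifests/{digest}")

if __name__ == "__main__":
    # Keep only tags whose pull requests are still open (hypothetical list).
    prune_pr_tags("myapp-tests", keep={"pr-123", "pr-456"})
```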

Friends with Benefits

We set out to reduce our release cycle, but there are a few more things we achieved along the way that are worth noting:

  1. Safe environment replay in production: Before we set out, our environment provisioning was based on VMs. There is nothing wrong with VMs, but they are not built for the workload we deal with at SeaLights. As a result of containerization, our CI now runs on container orchestration, so we enjoy all of the out-of-the-box advantages that orchestration provides. Because we already implement solutions for common application issues in the testing stage, rolling the same solution out to production is straightforward and relatively safe thanks to compatible environment configurations. Read more about investing in containers.
  2. Consistent environments: One of the more frustrating issues, as any developer/tester can attest, is inconsistency between environments. There is nothing more annoying than having a build consistently fail only to discover that the test environment is missing some of the production environment’s configuration. With containers, maintaining all the different environment configurations and enforcing consistency is easy work.

Not only is it easy work, but it also killed some of our false negatives and bugs. Once we implemented this, we realized that 17% of our “fails” were false negatives caused by environment inconsistencies, and 9% of our production bugs stemmed from the same cause. Today we no longer have to deal with these issues. Woot!

Summary

Payoffs in terms of operation

We reduced release time without compromising quality. Now we enjoy a near real-time feedback loop that enables a shorter time to market.

Payoffs in terms of team

Less frustration on the part of developers. Now they no longer need to constantly deal with tedious, repetitive tasks but can focus more on the R in R&D.

Culture issues that should be addressed ahead of time

Shift left testing in essence shifts quality responsibility to developers. Therefore, gone is the handover-to-QA culture. You can’t get away with sayings like “But it worked on my machine” when adopting shift left. And when you pile containerized testing on top of that, things get pushed much, much faster.

You must adapt your team culture to support distributed responsibility and ownership ahead of a move like this; otherwise, all the bottlenecks you set out to solve will simply get clogged by cultural friction instead. Once you have laid out these cultural guidelines, you are on track for highly flexible, self-sufficient teams from design to production.

Suggested Reading

What is Shift Left Testing?

So what is it? Shift Left, Shift Right, or Both?

Parallel Testing