The SeaLights test metrics guide
for better and faster CI/CD
Code Coverage Metrics
How does code coverage work, and is it really important to measure it?
In this article, we’ll explain the inner workings of this popular testing metric. While high code coverage often goes together with high-quality software, it does not provide a complete picture of quality, and relying on it alone can lead to incorrect decisions or poor testing practices.
What is Code Coverage?
Code coverage measures the proportion of a program’s source code lines that are executed when a given test suite runs. Tools that measure code coverage normally express this metric as a percentage.
Many use the terms “code coverage” and “test coverage” interchangeably. However, they are two different things. Test coverage is a measurement of the degree to which a test or testing suite actually checks the full extent of a program’s functionality.
Benefits of Measuring Code Coverage
- Software with high code coverage is less likely to contain undetected bugs stemming from coding errors, non-adherence to good coding practices, or overly complex code.
- High percentages can imply code that is more maintainable and readable. However, it is possible to achieve high percentages (“for the sake of coverage”) without improving maintainability.
- It provides a measurable value to stakeholders on software quality. Such stakeholders are often not involved in day-to-day software development, and they need a measurable standard to gauge software quality.
- Levels between 70 and 90 percent suggest reliable software, according to a review of academic studies that examined the correlation between software quality and code coverage.
- The larger a project team is, the more room there is for ambiguity when it comes to defining well-tested code. This measurement can act as an approximation metric that consolidates the team’s definition of well-tested code and leads to consistent testing practices.
Many experts believe that while this metric is valuable, it should not be used as a target for testing or development teams. Targeting a specific percentage or range does not necessarily increase software quality and can lead to problematic testing practices, something we’ll discuss further in this article.
There are several different ways to measure code coverage. It is better understood as a family of related metrics than as a single formula.
Method Coverage (Function Coverage)
Method or function coverage measures how many of a program’s functions are called at least once by a test suite.
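As a minimal sketch of the idea, the following Python snippet assumes a hypothetical `tracked` decorator and three illustrative functions (none of these names come from a real coverage tool). Each function registers itself and records when it is called, and the ratio of called to registered functions gives the function coverage.

```python
import functools

registered = set()   # every function we want to measure
called = set()       # functions invoked at least once


def tracked(func):
    """Hypothetical decorator: register func and record each call."""
    registered.add(func.__name__)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        called.add(func.__name__)
        return func(*args, **kwargs)

    return wrapper


@tracked
def add(a, b):
    return a + b


@tracked
def subtract(a, b):
    return a - b


@tracked
def divide(a, b):
    return a / b


def function_coverage():
    """Percentage of registered functions called so far."""
    return 100.0 * len(called) / len(registered)


# A tiny "test suite" that never calls divide().
assert add(2, 3) == 5
assert subtract(5, 3) == 2

print(f"function coverage: {function_coverage():.1f}%")
```

Two of the three functions are called here, so the reported value is two thirds. Real tools gather the same information through instrumentation rather than decorators.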
Statement Coverage
Statement coverage measures the percentage of code statements executed during a test suite. Statements are instructions in the code expressing some action that the program should carry out. Therefore, statement coverage gives an accurate measure of the quantity of written code that tests actually execute.
However, statement coverage is only useful as a measure of physical code — it says nothing about the quality of tests used to execute the code.
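The limitation can be illustrated with a small Python sketch. The function `apply_discount` and both tests are hypothetical and exist only for this example. Both tests execute every statement of the function, so both reach full statement coverage, yet only one of them would notice the deliberate bug.

```python
def apply_discount(price, percent):
    """Intended to reduce price by percent; contains a deliberate bug."""
    discount = price * percent / 100
    return price + discount  # bug: should be price - discount


def weak_test():
    # Executes every statement in apply_discount, but checks nothing.
    apply_discount(100, 10)
    return True


def strong_test():
    # Executes the same statements and also verifies the result.
    return apply_discount(100, 10) == 90


print("weak test passes:", weak_test())
print("strong test passes:", strong_test())
```

The weak test passes even though the function returns the wrong value, which is why a coverage percentage on its own says little about test quality.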
Branch Coverage
Branch coverage measures whether a test suite executes every branch leading out of the decision points written into the code. Such decision points arise from if and case statements. A simple if statement has two possible outcomes, true and false, while a case statement has one outcome per case. Consider the following code:
IF X > Y
    PRINT "X is greater than Y"
There are two outcomes for this if statement: true and false. Branch coverage needs to consider what happens both when X is greater than Y and when it is not (X is less than or equal to Y), which is the FALSE outcome for this statement. Two tests, one for each outcome, can ensure full branch coverage in this code:
TEST CASE 2: X=2, Y=10
The aim of measuring branch coverage is to check whether tests execute all reachable branch points (true and false) across a comprehensive set of inputs. This is a good measure of logic coverage, which relates to the quantity of possible code paths tested.
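The IF X > Y example can be sketched in Python as follows. The `branches_taken` set and the `branch_coverage` helper are hypothetical bookkeeping added for illustration, and the input `(10, 2)` is an assumed value chosen to trigger the true outcome.

```python
def compare(x, y, branches_taken):
    """Python version of the IF X > Y example; records which branch ran."""
    if x > y:
        branches_taken.add("true")
        return "X is greater than Y"
    branches_taken.add("false")
    return None


def branch_coverage(test_inputs):
    """Percentage of the two branch outcomes reached by the inputs."""
    taken = set()
    for x, y in test_inputs:
        compare(x, y, taken)
    return 100.0 * len(taken) / 2  # one decision, two outcomes


only_false = [(2, 10)]
both = [(10, 2), (2, 10)]  # (10, 2) is an assumed input for the true outcome

print(branch_coverage(only_false))
print(branch_coverage(both))
```

A suite containing only the X=2, Y=10 case reaches half of the branch outcomes. Adding any case where X is greater than Y completes the coverage.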
Condition Coverage
Condition coverage measures whether tests exercise each Boolean sub-expression in the code with both of its possible values, true and false. For example, consider a basic if statement whose decision depends on two Boolean conditions, X and Y.
Valid condition coverage for this code requires tests in which X and Y each take on both Boolean values, true and false. The tests required are:
TEST 2: X=FALSE, Y=TRUE
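Condition coverage complements branch coverage: it checks that every individual condition has been evaluated both ways, although on its own it does not guarantee that every branch or code path is executed.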
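The following Python sketch assumes, purely for illustration, that the decision is `x and y`. It checks how many condition values a test set exercises and, separately, which outcomes the whole decision produced. The helper names are hypothetical.

```python
def decision(x, y):
    # Assumed decision for illustration: both conditions must hold.
    return x and y


def condition_coverage(tests):
    """Percentage of (condition, value) pairs present in the test inputs.

    The inputs are inspected directly, because Python's short-circuit
    'and' would skip evaluating y whenever x is False.
    """
    seen = set()
    for x, y in tests:
        seen.add(("x", x))
        seen.add(("y", y))
    return 100.0 * len(seen) / 4  # 2 conditions x 2 Boolean values


def decision_outcomes(tests):
    """Set of outcomes the whole decision produced across the tests."""
    return {bool(decision(x, y)) for x, y in tests}


tests = [(True, False), (False, True)]
print("condition coverage:", condition_coverage(tests))
print("decision outcomes:", decision_outcomes(tests))
```

Under this assumed decision, the two tests give each condition both values, yet the decision as a whole is never true. This is why condition coverage is usually read together with branch coverage.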
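Modified Condition/Decision Coverage (MC/DC)
This type of coverage requires tests showing that each condition inside a decision statement independently affects the outcome of that decision. For example, for a decision built from the Boolean conditions X, Y and Z, the tests must include, for each condition, a pair of cases in which only that condition changes and the result of the decision changes with it. This needs fewer tests than exercising every combination of the Boolean values, which is the stricter multiple condition coverage. MC/DC is used when testing safety-critical applications, such as software used inside aircraft.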
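To make the three-condition example concrete, the sketch below enumerates every combination of X, Y and Z and then checks a test set against the MC/DC independence requirement, in which each condition must be shown to change the outcome on its own. The decision `x and (y or z)` and the helper `achieves_mcdc` are assumptions made for illustration, and the check implements the simple "unique cause" form of the criterion.

```python
from itertools import product


def decision(x, y, z):
    # Hypothetical decision with three conditions.
    return x and (y or z)


def achieves_mcdc(tests, n_conditions=3):
    """True if, for every condition, two tests differ only in that
    condition and produce different decision outcomes."""
    for i in range(n_conditions):
        shown = False
        for a in tests:
            for b in tests:
                differs_only_in_i = all(
                    (a[k] != b[k]) == (k == i) for k in range(n_conditions)
                )
                if differs_only_in_i and decision(*a) != decision(*b):
                    shown = True
        if not shown:
            return False
    return True


all_combinations = list(product([True, False], repeat=3))  # 2**3 = 8 tests
reduced = [
    (True, True, False),
    (False, True, False),   # differs from the first only in x
    (True, False, False),   # differs from the first only in y
    (True, False, True),    # differs from the third only in z
]

print(len(all_combinations), achieves_mcdc(all_combinations))
print(len(reduced), achieves_mcdc(reduced))
```

For this assumed decision, four well-chosen tests satisfy the independence requirement, compared with eight for the exhaustive set.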
Parameter Value Coverage
Parameter value coverage aims to cover the common and boundary values of each parameter for every program procedure/method that uses parameters. For example, a string can be null, empty, whitespace only, a typical valid value, an invalid value, or very long. Parameter value coverage ensures the tests execute the code with each of these kinds of string values, since testing every possible string is not feasible. Neglecting certain parameter values can lead to software defects.
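A minimal Python sketch of this approach for a string parameter is shown below. The function `normalize_name` and the chosen value classes are hypothetical, and a real suite would pick the classes that matter for its own domain.

```python
def normalize_name(name):
    """Hypothetical function under test: trim and title-case a name."""
    if name is None:
        return ""
    return name.strip().title()


# Representative string values chosen for illustration:
# label -> (input value, expected result)
cases = {
    "none": (None, ""),
    "empty": ("", ""),
    "whitespace": ("   ", ""),
    "typical": ("ada lovelace", "Ada Lovelace"),
    "padded": ("  grace hopper ", "Grace Hopper"),
    "very_long": ("a" * 1000, "A" + "a" * 999),
}

for label, (value, expected) in cases.items():
    actual = normalize_name(value)
    assert actual == expected, f"{label}: {actual!r} != {expected!r}"

print(f"{len(cases)} parameter value classes covered")
```

Keeping the cases in a table like this makes it easy to see which kinds of values are still missing.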
Cyclomatic Complexity
Cyclomatic complexity measures the total number of linearly independent paths in a program’s source code. For example, source code containing no conditional statements or decision points has a cyclomatic complexity of 1.
Cyclomatic complexity is useful in planning test cases to determine coverage for particular code modules. For example, testing teams can use the cyclomatic complexity value as an upper bound for the number of required test cases to achieve full branch coverage.
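As a rough sketch, cyclomatic complexity can be approximated as one plus the number of decision points. The Python snippet below counts decision points using the standard `ast` module. The choice of node types is a simplifying assumption, and dedicated tools count more constructs.

```python
import ast

# Node types treated as decision points in this simplified count.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra short-circuit decisions.
            complexity += len(node.values) - 1
    return complexity


straight_line = """
def greet(name):
    return "Hello, " + name
"""

branching = """
def classify(x, y):
    if x > y:
        return "greater"
    elif x == y:
        return "equal"
    return "smaller"
"""

print(cyclomatic_complexity(straight_line))
print(cyclomatic_complexity(branching))
```

The straight-line function has no decision points and scores 1. The `if`/`elif` chain adds two decision points, so `classify` scores 3, which suggests at most three test cases for full branch coverage.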
Code Coverage Tools
These tools work by instrumenting either the source code of your program or the byte code. Tools that instrument source code have the advantage of collecting detailed code metrics. The disadvantage is that you need to recompile the code after instrumentation.
As each line or branch is visited during automated tests, the code coverage tool records the hit and then reports the results, typically as a percentage alongside a visual view of covered and uncovered code.
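The idea of source instrumentation can be sketched by hand in Python. The `probe` function, the `hits` set, and the probe numbering are hypothetical. A real tool inserts equivalent bookkeeping automatically and at a finer granularity.

```python
hits = set()             # probe ids that were reached
ALL_PROBES = {1, 2, 3}   # one probe per instrumented block


def probe(probe_id):
    """Record that an instrumented location was executed."""
    hits.add(probe_id)


# 'absolute' as it might look after a tool has inserted probes.
def absolute(x):
    probe(1)
    if x < 0:
        probe(2)
        return -x
    probe(3)
    return x


def coverage_percent():
    """Percentage of probes reached so far."""
    return 100.0 * len(hits) / len(ALL_PROBES)


assert absolute(5) == 5     # reaches probes 1 and 3 only
print(f"{coverage_percent():.1f}%")
assert absolute(-5) == 5    # now probe 2 is reached as well
print(f"{coverage_percent():.1f}%")
```

After the first call, two of the three probes have been reached. The second call exercises the negative branch and brings the figure to 100 percent.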
Some examples of such tools are:
- McCabe – a testing tool that calculates a full set of metrics, including the MC/DC metric, making it useful for safety-critical applications.
- Visual Studio – a popular integrated development environment (IDE) from Microsoft. It has a wide variety of testing features, including calculating cyclomatic complexity.
- JaCoCo – Java Code Coverage Library – an open source project intended to provide a new standard technology for code coverage analysis in Java VM based environments, with integrations for tools like Ant, Maven, and Eclipse.
Criticism Of Code Coverage
Many software testing experts argue that code coverage is not a good metric for software testing teams, even though it is often used to measure team performance. That’s not to say coverage doesn’t have its uses—as Martin Fowler points out, it is a good way to identify untested code.
But aiming for arbitrary percentages does not necessarily contribute to test quality. Such coverage goals tend to encourage poorly designed tests, written with the sole intention of meeting the target instead of testing the software correctly.
A useful alternative metric is functional test coverage, which tracks whether tests execute important values or sequences of values corresponding to software features. This metric tells you which features satisfy the relevant conditions for acceptance by a user, customer, or other stakeholders.
Beyond Code Coverage
As discussed, code coverage is an important but limited metric. What would be truly useful is to measure actual test coverage, and go beyond unit tests to include integration tests, acceptance tests, and manual tests as well. Traditionally there has been no easy way to see a unified test coverage metric across all types of tests and all test systems in one place.
SeaLights is a continuous testing platform that makes this possible. It integrates with all your testing tools and allows you to measure holistic test coverage across all types of tests. SeaLights provides answers to crucial questions: How extensively is your product tested, and how much risk exists in terms of untested code changes that are shipping with your next release? Check out our free trial.