Improving Test Coverage

Testing is hard. Given a testing regime and test coverage regime, a fundamental issue is how to raise the level of test coverage. There are a number of approaches:

  • Write More Tests
  • Generate Tests Automatically
  • Remove Dead/Zombie Code
  • Remove Redundant/Cloned Code
  • Write/Execute More Sophisticated Tests

We discuss these in turn.

Write More (Useful) Tests

Total coverage is usually defined as the ratio covered_code/total_code. So given a current ratio, one can increase total coverage by increasing the amount of covered_code. (We'll visit the topic of decreasing total_code later.)

The usual way to increase covered code is to "code more tests" that exercise additional code. To minimize test coding effort, you want to do that in a way that does not re-test what has already been tested. First, testing should take a black-box approach: write tests that verify user requirements, or elements that support user requirements. If the requirements call for a user login, there should be a test that a login step occurs and that the login actually validates a user id and some identity check such as a password. With good black-box tests in place, one can turn to domain-boundary testing: write tests that exercise the boundary or edge cases in the problem domain supported by the application. If one is testing a program that assigns packages to trucks, one should try the boundary cases of zero packages, one package, average truck capacity, and values which are clearly impossible (such as millions of packages). One can also do some white-box testing, that is, tests based on inspecting the code itself; this is principally useful for manually discovering the boundary cases, as the application code presumably must check for them.
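As a minimal sketch of the truck-loading boundary cases, consider the following. The function trucks_needed, the MAX_PACKAGES limit, and the capacity of 40 are all hypothetical, invented purely for illustration; a real application would have its own interface and limits.

```python
import math

MAX_PACKAGES = 100_000  # hypothetical sanity limit, not taken from any real system


def trucks_needed(packages: int, capacity: int) -> int:
    """Return how many trucks are required to carry `packages` packages."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if packages < 0 or packages > MAX_PACKAGES:
        raise ValueError("package count out of range")
    return math.ceil(packages / capacity)


def run_boundary_tests() -> None:
    """Exercise the domain boundaries rather than arbitrary mid-range values."""
    capacity = 40  # assumed truck capacity
    assert trucks_needed(0, capacity) == 0             # zero packages
    assert trucks_needed(1, capacity) == 1             # one package
    assert trucks_needed(capacity, capacity) == 1      # exactly one full truck
    assert trucks_needed(capacity + 1, capacity) == 2  # just past a full truck
    for impossible in (-1, 5_000_000):                 # clearly impossible values
        try:
            trucks_needed(impossible, capacity)
        except ValueError:
            pass
        else:
            raise AssertionError(f"{impossible} should have been rejected")
```

Calling run_boundary_tests() runs every case; each assertion corresponds to one boundary in the problem domain, so a failure points directly at the edge case that is mishandled.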

One can also avoid writing tests that cover the same code twice. A good test coverage tool (such as SD's) can tell you what code a test actually covers, and can provide a way to determine the intersection of coverage sets from multiple tests. If two tests cover essentially the same code, one of them is probably unnecessary.
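The coverage-set comparison can be sketched with ordinary set operations. The test names and line numbers here are made up, and the function redundant_tests is a hypothetical helper, not part of any SD tool; it assumes coverage has already been exported as one set of covered lines per test.

```python
# Hypothetical coverage data: test name -> set of covered line numbers.
coverage = {
    "test_login_ok": {1, 2, 3, 4, 10},
    "test_login_bad_password": {1, 2, 3, 5, 10},
    "test_login_ok_again": {1, 2, 3, 4},
}


def redundant_tests(coverage: dict[str, set[int]]) -> list[str]:
    """Return tests whose covered lines are already covered by the tests that remain.

    Tests are examined smallest first, and a dropped test no longer counts
    toward the coverage of the others, so two identical tests are never
    both dropped.
    """
    kept = dict(coverage)
    dropped = []
    for name in sorted(coverage, key=lambda n: len(coverage[n])):
        others = set().union(
            *(lines for other, lines in kept.items() if other != name)
        )
        if kept[name] <= others:  # subset test: nothing unique is covered
            dropped.append(name)
            del kept[name]
    return dropped
```

With the sample data, redundant_tests(coverage) reports only test_login_ok_again. Identical code coverage does not prove two tests check the same behavior (they may use different data), so the result is a list of candidates to inspect rather than tests to delete blindly.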

Generate Tests Automatically

Writing tests is hard. Often the code is more or less correct before testing starts seriously; if one could generate tests that verify that the code does today what it did yesterday, one could get a large number of (regression) tests quickly.

It is possible to automate this. If one has a tool that can read the source code, identify the inputs, outputs, and control flow paths within and across functions, one can generate in the abstract possible input values, and know how those values are filtered by the control flow paths to produce outputs. Each unique control flow path thus determines a test. This approach requires a tool that knows the abstract types of the inputs and outputs, and can reason symbolically about the predicates that control each control flow path instance.

An approach that avoids the symbolic analysis to a considerable degree is "puppetization". In this approach the code is instrumented for the purposes of test generation and test validation. In collection mode, the instrumentation collects the values of function parameters, outputs, and results of predicates; in puppet mode, it assigns parameters and controls so as to force the execution of a particular control flow path. Running the code in collection mode produces test cases with specific input and output values and an associated control flow path. From each of these one generates a test that uses the puppet mode to force the assignment of values to parameters and the control flow order; such generated tests then compare the result to the collected outputs.
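A heavily simplified sketch of the collect-then-replay idea follows. It records only the inputs and outputs of function calls and replays them as regression checks; it does not record predicate results or force control flow paths as the puppet mode described above would. The names collect, replay, recorded_cases, and shipping_cost are all hypothetical.

```python
import functools

# (function name, args, kwargs, result) tuples gathered in collection mode
recorded_cases = []


def collect(func):
    """Collection mode: record every call to `func` as a candidate regression test."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        recorded_cases.append((func.__name__, args, kwargs, result))
        return result
    return wrapper


@collect
def shipping_cost(weight_kg, express=False):  # hypothetical function under test
    base = 5.0 if weight_kg <= 1 else 5.0 + 2.0 * (weight_kg - 1)
    return base * 2 if express else base


def replay(cases, functions):
    """Replay mode: rerun each recorded call and return those whose result changed."""
    failures = []
    for name, args, kwargs, expected in list(cases):  # copy: replay may record too
        actual = functions[name](*args, **kwargs)
        if actual != expected:
            failures.append((name, args, kwargs, expected, actual))
    return failures
```

After exercising the program normally (for example shipping_cost(0.5) and shipping_cost(3, express=True)), calling replay(recorded_cases, {"shipping_cost": shipping_cost.__wrapped__}) returns an empty list as long as the behavior is unchanged. Passing the unwrapped function avoids recording new cases during replay.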

Our DMS Software Reengineering Toolkit has the capability to analyze code, reason about paths, determine symbolic ranges of values, and decide algebraically if a conditional is always true or always false, as well as to insert arbitrary instrumentation into code. It could be configured for such tasks.

Remove Dead Code

Total coverage is usually defined as the ratio covered_code/total_code. So given a current ratio, one can increase total coverage by decreasing total_code. This is possible because programs typically have long, tortuous histories in which feature code was added, deleted or disabled, and debugging code was likewise added and deleted. The usual result is a lot of dead code, or zombie code (code enabled only by a condition that cannot occur in a production system). If one can find this code and simply remove it, the total code coverage ratio goes up without any need to write additional tests.

A good test coverage tool, such as SD's, indicates what gets executed. If in the course of testing, one finds that some block of code is not exercised, that block can be inspected. If the functional testing is relatively complete, unexercised code will likely be dead feature support. After determination that the code really has no use, it can be removed. This work is considerably easier than writing a good test.

Static analysis can also identify blocks of code (and even variables) that are unreferenced, unreachable, or reachable only under conditions which are impossible or are controlled by known-false configuration signals from outside the program. Such code can be removed. Our DMS Software Reengineering Toolkit builds symbol tables that know if a routine or value is ever referenced other than in a declaration, and contains control flow analysis machinery for determining paths. For some languages, such as Java, Semantic Designs even provides off-the-shelf tools that will automatically remove dead code. We can build similar tools for other languages.
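To give a flavor of the "never referenced other than in a declaration" check, here is a crude sketch using Python's standard ast module on Python source. It is not how DMS works: it compares names only, with no real symbol table, scoping, or control flow analysis, so it can miss dead code or misjudge names that are reused. The function unreferenced_functions and the SOURCE sample are hypothetical.

```python
import ast

SOURCE = '''
def used():
    return 1

def never_called():
    return 2

print(used())
'''


def unreferenced_functions(source: str) -> set[str]:
    """Return names of functions that are defined but never mentioned elsewhere."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    defined = {
        n.name for n in nodes
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    # Any plain name or attribute access counts as a reference (an approximation).
    referenced = {n.id for n in nodes if isinstance(n, ast.Name)}
    referenced |= {n.attr for n in nodes if isinstance(n, ast.Attribute)}
    return defined - referenced
```

For the sample source, unreferenced_functions(SOURCE) yields the single name never_called. A candidate found this way still needs inspection before removal, since it may be called reflectively or from another file.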

Remove Redundant/Cloned Code

Removing cloned code can improve test coverage ratios in the same way as removing dead code.

Imagine we have an application with base code A, containing code blocks C and C' which are clones. The total size of this system is size(A). If we find C' and remove it by abstracting C to handle both cases, we get two improvements in test coverage: one from size reduction, and one from "free" additional testing. The size reduction comes about because the decloned software is smaller: its new size is size(A)-size(C')+ε, where ε is the additional code required to abstract C and C' together. Some 10-20% of an application consists of cloned code, and one can shrink that by a factor of roughly 2, so decloning removes an average of about 7% of the total code (actually more, if a clone occurs more than twice). On average this can improve the test coverage ratio to covered_code/((1-7%)*total_code). If the ratio is currently 70%, removing clones will raise it to about 75% without the need to write any tests.
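The arithmetic can be checked with a few lines of code. This sketch assumes, as the formula in the text does, that the amount of covered code stays the same while the total shrinks; the function names are hypothetical.

```python
def shrink_fraction(clone_share: float, copies: int = 2) -> float:
    """Fraction of total code removed when each clone set of `copies` copies
    collapses into one (ignoring the small abstraction overhead epsilon)."""
    return clone_share * (1.0 - 1.0 / copies)


def coverage_after_shrink(covered: float, total: float, shrink: float) -> float:
    """New coverage ratio: covered_code / ((1 - shrink) * total_code).

    Assumes covered_code itself is unchanged by the removal.
    """
    return covered / ((1.0 - shrink) * total)


# A 15% clone share (midpoint of 10-20%) halved gives 7.5%, roughly the 7% in the text.
midpoint_shrink = shrink_fraction(0.15)

# 70% coverage with 7% of the code removed: 70 / 93, a little over 75%.
new_ratio = coverage_after_shrink(covered=70, total=100, shrink=0.07)
```

Because the denominator shrinks while the numerator is held fixed, the gain grows with the amount of cloned code and with the number of copies per clone set.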

Second, if clones of C exist, then to get good coverage one has to write tests that exercise both C and C' (and any additional clones). Once the clones are removed, tests that exercise C effectively exercise C' as well. So this helps one avoid writing additional tests.

Semantic Designs offers tools for finding clones (CloneDR) for many languages. Once found, the clones need to be removed but that's a far simpler task once they are known. If nothing else, simply knowing where clones exist can help in coding the additional tests, if one chooses not to declone the code.

CloneDR is a particularly effective tool for finding clones: it will find them in spite of format changes, comment insertions/deletions, change of variable names and in many cases when statements are inserted or deleted. This means it finds clones that other clone detectors typically do not find. This maximizes the amount of code removable from a system, and thus maximizes the improvement in test coverage.

Write/Execute More Sophisticated Tests

One may have a high level of line or branch coverage, but that isn't a guarantee that everything is well tested. Such coverage only shows that individual blocks have been executed, but indicates nothing about combinations of blocks or data values used.

MC/DC: The aircraft industry is famous for requiring modified condition/decision coverage (MC/DC), which verifies that each of the boolean sub-conditions of a decision has been exercised in a way that demonstrates the sub-condition actually controls the decision outcome (i.e., the sub-condition is causal under some circumstances).
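To make "the sub-condition actually controls the decision outcome" concrete, the sketch here searches for pairs of test inputs that differ in exactly one sub-condition and produce different decision outcomes (the unique-cause form of MC/DC). The decision a and (b or c) and the helper independence_pairs are hypothetical examples, not taken from any standard or tool.

```python
from itertools import product


def decision(a: bool, b: bool, c: bool) -> bool:  # hypothetical decision under test
    return a and (b or c)


def independence_pairs(func, n_conditions: int, index: int):
    """Return input pairs that differ only in condition `index` and flip the outcome.

    Any one such pair, if both inputs are executed by the test suite,
    demonstrates that condition `index` independently affects the decision.
    """
    pairs = []
    for values in product([False, True], repeat=n_conditions):
        if values[index]:
            continue  # start from the False side so each pair is listed once
        flipped = values[:index] + (True,) + values[index + 1:]
        if func(*values) != func(*flipped):
            pairs.append((values, flipped))
    return pairs
```

For this decision, condition b has exactly one independence pair: (True, False, False) versus (True, True, False). A test suite that achieves full branch coverage could easily omit one of those two inputs, which is why MC/DC is a stronger criterion.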

Path Coverage: The number of paths through a computer program is enormous due to combinations of conditions. Testing just one path does not prove the program works. Ideally, one would be able to test all paths with all data combinations, but that is unrealistic, too. What one can do is to ensure that all non-looping paths have been covered, and that loops (iterative or recursive) have been covered for some reasonable iteration count. This requires instrumentation somewhat similar to what SD's test coverage tools do, but also requires access to the control flow graph so that the instrumentation tool can decide where to place probes to cover the paths of interest. The DMS Software Reengineering Toolkit has access to control flow analysis machinery that could be used to implement this.
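The growth in path count is easy to see on a tiny, loop-free control flow graph. The graph here is a hypothetical one for two consecutive if/else statements, and enumerate_paths is an illustrative helper that only works on acyclic graphs.

```python
# Hypothetical acyclic control flow graph: node -> list of successor nodes.
cfg = {
    "entry": ["if1_then", "if1_else"],
    "if1_then": ["join1"],
    "if1_else": ["join1"],
    "join1": ["if2_then", "if2_else"],
    "if2_then": ["exit"],
    "if2_else": ["exit"],
    "exit": [],
}


def enumerate_paths(cfg, node, exit_node):
    """Return every path from `node` to `exit_node` in an acyclic graph."""
    if node == exit_node:
        return [[node]]
    return [
        [node] + rest
        for successor in cfg[node]
        for rest in enumerate_paths(cfg, successor, exit_node)
    ]


all_paths = enumerate_paths(cfg, "entry", "exit")
```

Two tests suffice to execute every block in this graph, but all_paths contains four distinct paths; each additional independent if/else doubles the count, which is why full path coverage quickly becomes impractical.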

Data Access Coverage: The discussion so far has largely focused on code coverage. Ideally, tests would cover the space of data possibilities, too. An analogy to branch coverage ("are code blocks executed?") is data access coverage ("are all data accesses covered?"), which answers the question, "is this data element appropriately exercised in all the code elements?". To implement this idea, one needs to parse code, determine scopes of identifiers, and add probes which are identifier-in-scope specific to track accesses. DMS has all the necessary machinery to do this.

Data Range Coverage: This is probably closer in analogy to MC/DC, which verifies that all the condition sub-expressions are validated. It would be ideal if one could be sure that code was exercised with all the possible data values for each variable. To do this, a symbolic range analysis for variables is required; the code instrumenter needs to break executions into blocks controlled by abstract ranges of variables and capture execution under such contexts. DMS has the ability to extract symbolic range information from code.
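A toy version of a range probe might look like the following. In the approach described above the ranges would come from symbolic range analysis; in this sketch the boundaries are simply supplied by hand, and the class RangeCoverage is hypothetical.

```python
import bisect


class RangeCoverage:
    """Count how often a probed variable falls into each abstract range.

    Boundaries [b0, b1, ...] define the ranges
    (-inf, b0), [b0, b1), ..., [b_last, +inf).
    """

    def __init__(self, boundaries):
        self.boundaries = sorted(boundaries)
        self.hits = [0] * (len(self.boundaries) + 1)

    def probe(self, value):
        """Record the range `value` falls in, then pass the value through unchanged."""
        self.hits[bisect.bisect_right(self.boundaries, value)] += 1
        return value

    def uncovered(self):
        """Return the indices of ranges that no execution has exercised."""
        return [i for i, count in enumerate(self.hits) if count == 0]
```

With boundaries [0, 100], probing the values 5 and 50 leaves ranges 0 (negative values) and 2 (values of 100 or more) uncovered, pointing at the data ranges the tests have not yet exercised.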

Unusual Requirements?

If you have need for some of the more sophisticated methods for testing outlined on this page, SD can configure a custom test coverage tool for you! These tools are based on DMS, and inherit DMS's language agility and scalability.
