Test adequacy criteria
The goal of white box testing is to ensure that the internal components of a program are working properly. A common focus is on structural elements such as statements and branches. The tester develops test cases that exercise these structural elements to determine whether defects exist in the program structure. The term exercise is used in this context to indicate that the target structural elements are executed when the test cases are run. By exercising all of the selected structural elements the tester hopes to improve the chances of detecting defects. Testers need a framework for deciding which structural elements to select as the focus of testing, for choosing the appropriate test data, and for deciding when the testing effort is adequate and the process can be terminated with confidence that the software is working properly. Such a framework exists in the form of test adequacy criteria. Formally, a test data adequacy criterion is a stopping rule [1,2]. Rules of this type can be used to determine whether or not sufficient testing has been carried out.
The criteria can be viewed as representing minimal standards for testing a program. The application scope of adequacy criteria also includes:
• helping testers to select properties of a program to focus on during test;
• helping testers to select a test data set for a program based on the selected properties;
• supporting testers with the development of quantitative objectives for testing;
• indicating to testers whether or not testing can be stopped for that program.
A program is said to be adequately tested with respect to a given criterion if all of the target structural elements have been exercised according to the selected criterion. Using the selected adequacy criterion, a tester can terminate testing when he/she has exercised the target structures, and have some confidence that the software will function in a manner acceptable to the user.
If a test data adequacy criterion focuses on the structural properties of a program it is said to be a program-based adequacy criterion. Program-based adequacy criteria are commonly applied in white box testing. They use either logic and control structures, data flow, program text, or faults as the focal point of an adequacy evaluation. Other types of test data adequacy criteria focus on program specifications. These are called specification-based test data adequacy criteria. Finally, some test data adequacy criteria ignore both program structure and specification in the selection and evaluation of test data. An example is the random selection criterion.
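The random selection criterion mentioned above can be sketched in a few lines. This is only an illustrative sketch, not a method from the text: the unit under test (abs_diff), the input domain, and the oracle are all assumptions introduced here for the example.

```python
import random

def abs_diff(a, b):
    """Hypothetical unit under test: absolute difference of two integers."""
    if a > b:
        return a - b
    return b - a

# The random selection criterion ignores both program structure and
# specification: test inputs are simply drawn at random from the input
# domain.  A fixed seed makes the run repeatable.
random.seed(42)
test_set = [(random.randint(-100, 100), random.randint(-100, 100))
            for _ in range(10)]

for a, b in test_set:
    # Trivial oracle for this toy unit; in practice an oracle must be
    # supplied separately, since random selection provides none.
    assert abs_diff(a, b) == abs(a - b)
```

Note that nothing in this process refers to the statements or branches of abs_diff, which is precisely why random selection is neither program-based nor specification-based.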
Adequacy criteria are usually expressed as statements that depict the property, or feature of interest, and the conditions under which testing can be stopped (the criterion is satisfied). For example, an adequacy criterion that focuses on statement/branch properties is expressed as the following:
A test data set is statement, or branch, adequate if a test set T for program P causes all the statements, or branches, respectively, to be executed.
In addition to statement/branch adequacy criteria as shown above, other types of program-based test data adequacy criteria are in use; for example, those based on (i) exercising program paths from entry to exit, and (ii) execution of specific path segments derived from data flow combinations such as definitions and uses of variables (see Section 5.5). As we will see in later sections of this chapter, a hierarchy of test data adequacy criteria exists; some criteria presumably have better defect detecting abilities than others.
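The distinction between statement and branch adequacy can be made concrete with a small sketch. The unit below (classify) is a hypothetical example introduced here, not one from the text; it shows that a statement-adequate test set need not be branch adequate.

```python
def classify(x):
    """Hypothetical unit under test."""
    result = "non-negative"
    if x < 0:                 # one branch: true when x < 0, false otherwise
        result = "negative"
    return result

# T1 executes every statement (including the if-body), so it is
# statement adequate -- but it never drives the condition false,
# so the false branch goes unexercised.
T1 = [-1]

# T2 adds an input that takes the false branch as well, making it
# branch adequate (and therefore also statement adequate).
T2 = [-1, 7]

assert classify(-1) == "negative"
assert classify(7) == "non-negative"
```

This is one instance of the hierarchy mentioned above: any branch-adequate test set is also statement adequate, but not the reverse, which is why branch adequacy is presumed to have better defect-detecting ability.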
The concept of test data adequacy criteria, and the requirement that certain features or properties of the code are to be exercised by test cases, leads to an approach called coverage analysis, which in practice is used to set testing goals and to develop and evaluate test data. In the context of coverage analysis, testers often refer to test adequacy criteria as coverage criteria. For example, if a tester sets a goal for a unit specifying that the tests should be statement adequate, this goal is often expressed as a requirement for complete, or 100%, statement coverage. It follows from this requirement that the test cases developed must ensure that all the statements in the unit are executed at least once. When a coverage-related testing goal is expressed as a percent, it is often called the degree of coverage. The planned degree of coverage is specified in the test plan and then measured when the tests are actually executed by a coverage tool. The planned degree of coverage is usually specified as 100% if the tester wants to completely satisfy the commonly applied test adequacy, or coverage, criteria. Under some circumstances, the planned degree of coverage may be less than 100%, possibly due to the following:
• The nature of the unit
Some statements/branches may not be reachable.
The unit may be simple, and not mission, or safety, critical, and so complete coverage is thought to be unnecessary.
• The lack of resources
The time set aside for testing is not adequate to achieve 100% coverage.
There are not enough trained testers to achieve complete coverage for all of the units.
There is a lack of tools to support complete coverage.
• Other project-related issues such as timing, scheduling, and marketing constraints
The following scenario is used to illustrate the application of coverage analysis. Suppose that a tester specifies branches as a target property for a series of tests. A reasonable testing goal would be satisfaction of the branch adequacy criterion. This could be specified in the test plan as a requirement for 100% branch coverage for a software unit under test. In this case the tester must develop a set of test data that ensures that all of the branches (true/false conditions) in the unit will be executed at least once by the test cases. When the planned test cases are executed under the control of a coverage tool, the actual degree of coverage is measured.
If there are, for example, four branches in the software unit, and only two are executed by the planned set of test cases, then the degree of branch coverage is 50%. All four of the branches must be executed by a test set in order to achieve the planned testing goal. When a coverage goal is not met, as in this example, the tester develops additional test cases and re-executes the code. This cycle continues until the desired level of coverage is achieved. The greater the degree of coverage, the more adequate the test set. When the tester achieves 100% coverage according to the selected criterion, then the test data has satisfied that criterion; it is said to be adequate for that criterion. An implication of this process is that higher degrees of coverage will lead to greater numbers of detected defects.
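The arithmetic behind the scenario above is simply the ratio of exercised target elements to total target elements, expressed as a percent. A minimal sketch (the function name is an assumption, not the interface of any real coverage tool):

```python
def degree_of_coverage(exercised, total):
    """Degree of coverage: percent of target structural elements
    (statements, branches, etc.) exercised by the test set."""
    return 100.0 * exercised / total

# The scenario above: four branches in the unit, of which the planned
# test cases exercised only two.
assert degree_of_coverage(2, 4) == 50.0

# After additional test cases drive the remaining two branches, the
# branch adequacy criterion is satisfied.
assert degree_of_coverage(4, 4) == 100.0
```

In practice a coverage tool computes this ratio automatically by instrumenting the unit and recording which elements each test execution reaches.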
It should be mentioned that the concept of coverage is not only associated with white box testing. Coverage can also be applied to testing with usage profiles (see Chapter 12). In this case the testers want to ensure that all usage patterns have been covered by the tests. Testers also use coverage concepts to support black box testing. For example, a testing goal might be to exercise, or cover, all functional requirements, all equivalence classes, or all system features. In contrast to black box approaches, white box-based coverage goals have stronger theoretical and practical support.