Archive for June, 2011
Levels of Testing
The other day my friend and colleague Alan Page was asking about the levels of testing. When I teach fundamental testing concepts I often use the ‘levels of testing’ as a model to help students conceptualize different types of testing activities and potentially different stages of testing processes during the software development lifecycle (SDLC). In his book “Software Testing Techniques,” B. Beizer writes that there are 3 different types of testing activities. He groups them as unit/component, integration, and system ‘levels of testing.’ Beizer explains the different levels as:
“A unit is the smallest testable piece of software, by which I mean it can be compiled or assembled, linked, loaded, and put under the control of a test harness or driver.”
“A component is an integrated aggregate of one or more units. A unit is a component, a component with subroutines it calls is a component, etc. By this (recursive) definition, a component can be anything from a unit to an entire system.”
“Unit testing is the testing we do to show that the unit does not satisfy its functional specification and/or that its implemented structure does not match the intended design structure.”
It’s interesting to note Beizer’s explanation of the purpose of unit testing: to show that the unit does not meet its specification. Perhaps this is the inspiration behind TDD concepts rediscovered 10 years later. It also runs counter to the typical ideology that unit tests are intended to “prove it works.”
“Integration is a process by which components are aggregated to create larger components.”
“Integration testing is testing done to show that even though the components were individually satisfactory, as demonstrated by successful passage of component tests, the combination of components are incorrect or inconsistent.” “Integration testing is specifically aimed at exposing the problems that arise from the combination of components.”
“A system is a big component.”
“System testing is aimed at revealing bugs that cannot be attributed to components as such, to the inconsistencies between components, or to the planned interactions of components and other objects.” “System testing concerns issues and behaviors that can only be exposed by testing the entire integrated system or a major part of it.”
I often use the following visual model to illustrate the concept of ‘levels of testing.’ This model is intended to show how each level builds upon the previous level, and also shows the increasing scope of each level of testing.
The problem with this model is that if we don’t understand the internal structure of software, such as methods (or functions), classes, APIs or interfaces, forms or other UI elements, etc., the only level that is easy to recognize is the system level (the entire integrated system, or product). The levels below it are difficult to tell apart.
One way to explain this model is that unit level testing tests individual methods or functions in a class. Component level testing tests public methods that call one or more other methods in a class. Integration tests target individual APIs or combinations of APIs, treating the APIs as black boxes, but usually without traversing through a user interface. And system testing tests both functional and behavioral aspects of all of the product’s components together in a single build. Obviously, the scope of testing at the system level is very large and includes functional testing of computational logic, non-functional testing such as performance, security, etc., and behavioral testing such as usability, look and feel, etc.
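As a rough sketch of how the first three levels might look in code, consider the small example below. The `Cart` class, the `checkout` function, and the discount rule are all hypothetical and exist only for illustration; prices are integers (cents) to keep the arithmetic exact. System level testing is not shown because it would exercise the whole product, typically through its user interface.

```python
import unittest


class Cart:
    """Hypothetical class, invented only to illustrate the levels."""

    def __init__(self):
        self._prices = []

    def add(self, price_cents):
        self._prices.append(price_cents)

    def _subtotal(self):
        # A small individual method: a unit level target.
        return sum(self._prices)

    def _discount(self, subtotal):
        # Another unit: 10% off when the subtotal is 10000 cents or more.
        return subtotal // 10 if subtotal >= 10000 else 0

    def total(self):
        # A public method that calls other methods: a component level target.
        subtotal = self._subtotal()
        return subtotal - self._discount(subtotal)


def checkout(prices_cents):
    """Hypothetical API entry point, treated as a black box by its callers."""
    cart = Cart()
    for price in prices_cents:
        cart.add(price)
    return {"total": cart.total(), "items": len(prices_cents)}


class UnitLevel(unittest.TestCase):
    def test_discount_boundary(self):
        # Exercises one method in isolation, right at its boundary.
        cart = Cart()
        self.assertEqual(cart._discount(9999), 0)
        self.assertEqual(cart._discount(10000), 1000)


class ComponentLevel(unittest.TestCase):
    def test_total_applies_discount(self):
        # Exercises a public method and the methods it calls.
        cart = Cart()
        cart.add(6000)
        cart.add(4000)
        self.assertEqual(cart.total(), 9000)


class IntegrationLevel(unittest.TestCase):
    def test_checkout_black_box(self):
        # Exercises the API only through its inputs and outputs, with no UI.
        self.assertEqual(checkout([6000, 4000]), {"total": 9000, "items": 2})
```

Saved to a file, these tests can be run with `python -m unittest`. Where exactly a team draws the line between unit and component tests is a judgment call; the point of the sketch is only that the scope widens at each level.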
Another problem with this model is that it may appear that by focusing testing efforts at the system level (e.g. testing each build through a user interface) we would have greater coverage and implicitly cover the ‘levels’ below the system. Unfortunately this is not always the case because:
- the user interface can mask or hide some functional bugs in public APIs that are called by other developers
- the system is large and complex and we may miss underlying functional bugs
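The first point can be illustrated with a minimal sketch. Both functions below are hypothetical and written only for this example: the API contains a bug, and the UI handler happens to filter out the one input that would trigger it.

```python
def reserve_stock(available, requested):
    """Hypothetical public API.

    Bug: a negative request is not rejected, so it silently
    increases the available stock.
    """
    if requested > available:
        raise ValueError("not enough stock")
    return available - requested


def submit_order_form(available, quantity_text):
    """Hypothetical UI handler.

    The form accepts digits only, so a negative number never
    reaches the API through this path.
    """
    if not quantity_text.isdigit():
        return "Please enter a whole number."
    try:
        remaining = reserve_stock(available, int(quantity_text))
    except ValueError:
        return "Not enough stock."
    return f"Reserved. {remaining} left."


# Through the UI, the bad input is filtered out and the bug stays hidden.
print(submit_order_form(10, "-5"))  # Please enter a whole number.

# Another developer calling the API directly hits the bug.
print(reserve_stock(10, -5))  # 15
```

Testing only through the form would never reveal the defect, while a single API level test with a negative value exposes it immediately.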
As Beizer indicated, each ‘level’ of testing has different objectives and can help identify certain types of bugs more efficiently as compared to the other ‘levels of testing.’ In our experience at Microsoft, we have learned that hiring massive numbers of people to test at the system level with little or no unit/component or integration testing does not necessarily result in higher “quality,” and may cost more in maintenance and support due to undetected functional issues.
Another model that I also like to show comes from “Agile Testing: A Practical Guide for Testers and Agile Teams” by Lisa Crispin and Janet Gregory. In their chapter on automation they illustrate the “test automation pyramid” with 3 layers of automated tests.
This model also shows how automated tests benefit from the underlying automation, and each layer is targeted towards helping the tester (and developer) identify different types of issues during the SDLC. I also like this example because it shows where automated tests are most effective or provide the greatest value to developers and testers.
Ideally there should be heavy emphasis on unit/component level testing, additional investment in API level testing, and as Crispin and Gregory state, “The top tier represents what should be the smallest automation effort, because the tests generally provide the lowest ROI.” While ROI means different things to different people, I would generally agree that GUI automated tests tend to be less reliable and require much more maintenance as compared to other levels of automation.
There are also some illustrations of testing levels that show a sequential progression from unit testing to component testing to integration testing and finally to system testing. In my opinion these don’t provide much value to anyone other than process wonks who are only capable of linear thought. There are also other models of ‘levels of testing’ that folks have devised, usually to separate unique classes of system testing. Models are abstract concepts that can be used to help explain complex systems, and sometimes a little more detail in the model helps people understand the system.
But, I guess that is both the benefit and the drawback of models. Models can help explain complex concepts, but they can also be misused when single-minded individuals attempt to create rigid linear processes from an abstract model, or assume how someone actually implements a model is representative of the abstract concepts that someone attempted to present in a model.