Code Coverage: Finding missing tests
[NOTE: This was written last week but due to a glitch did not get automatically posted before I left on a boat trip where I disconnected from the world. How refreshing…but more about that later.]
Well, we got a bye in the quarter final play-offs, and we won our semi-final game against the Sockeyes (2 – 0) last Tuesday. Unfortunately, while celebrating with my teammates I was literally knocked to the floor by a kidney stone in my bladder. After a trip to the hospital I am now taking a battery of pain-killers until I pass the stone. Unfortunately, this put me on the injured roster and knocked me out of the final game, and I am out of play until I get an OK from the doctor. Fortunately, I am not prone to kidney stones and this shall pass (literally). I have only had one kidney stone in the past and that was about 20 years ago. For those of you who have not experienced a kidney stone, trust me on this…be very, very glad and I hope you never experience this malady.
Last week I attempted to illustrate how we might achieve high levels of code coverage (structural control flow) but potentially overlook critical tests, especially from a ‘black-box’ testing approach. The bottom line message…high code coverage does not necessarily equal good test coverage. In reality you are unlikely to get 100% measured code coverage of a reasonably complex application under test. Unfortunately this often raises the question, “What is the right amount of code coverage for [my] application?” To which I have heard several leads/managers reply, “Our goal is 80% code coverage.” Really? C’mon…that’s just plain ROMA data. Setting arbitrary goals for code coverage is about as pointless as tits on a boar hog. The real answer is that we simply don’t know what measure of code coverage is the ideal level for any product.
Also, those who have read this article will know that, regardless of the testing approach, at some point the effectiveness of our tests at hitting untested areas of code diminishes. While more time and effort may increase data flow coverage and expose issues, it is unlikely to increase control flow (code) coverage. Remember, just because you exercise a line of code doesn’t mean you found all the bugs, but you have 0% probability of finding any bugs in any untested code.
Fortunately, the majority of customers will traverse the same code paths covered by many of our tests. Also, if your team/organization does robust unit testing then there is a good probability that unit tests at least provided some minimal level of code coverage. (NOTE: While I highly recommend that unit tests and coverage results be transparent to the test team, I do not recommend using unit tests as part of the battery of tests designed by the test team and executed against the whole build to measure code coverage.) So, there are a few questions we have to ask ourselves.
“Does the untested code present significant risk to our customers?”
“Do we need to reduce exposure to risk in the untested areas of code?”
“What is the most efficient way to effectively evaluate the untested code?”
As I wrote last week, code coverage is not about the number. Code coverage is about analyzing the results and potentially designing additional functional tests, or at least being able to explain why areas of the code are untested. If we determine that it is important for our business to better understand the untested code, or to improve overall confidence and reduce potential risk, then we should use a tool to measure code coverage. But, again, it is not about simply measuring code coverage and reporting some magical metric.
Code coverage analysis is the most efficient method to help testers evaluate untested code. Code coverage analysis basically involves the tester reviewing untested code reported by the code coverage tool, determining why some code was not exercised, and possibly designing additional tests to exercise the previously untested code. (Remember I also wrote last week the future of professional testing is about analyzing information and designing tests…so, here we go!)
Missing tests
For several years I’ve used the triangle simulation to help set a ‘test effectiveness’ baseline for new testers who had never been formally trained in different test techniques, patterns, or approaches. After a few years of analyzing the results we found that there was about a 70 to 75% probability of tests exercising the true branch of the first conditional expression in a compound predicate statement in a key method in the program. There was about a 20 to 25% chance of tests exercising the true branch of the second conditional expression, and there was less than a 10% probability of tests exercising the true branch of the third conditional expression in the predicate statement. When I found these results I could hardly believe it, so I changed the third conditional expression to inject a bug and sure enough the results held true; in any class of 20 people on average only 1 or 2 people found the bug in the software.
From a black box approach let’s say our tests used the following values for sides A, B, and C respectively:
- 1, 2, 3 – an error message indicating the values would not produce a triangle
- 2, 1, 3 – an error message indicating the values would not produce a triangle
- 4, 5, 6 – scalene triangle
- 2, 1, 2 – isosceles triangle
- 5, 5, 5 – equilateral triangle
In this case our code coverage tool would report our coverage is less than 100%. As we drill down we see that the IsValidTriangle() method illustrated below is not completely covered. So, (assuming the argument values passed to the parameters in this method are all validated to be greater than 0) we analyze the code below in our coverage tool and realize that we need a test to evaluate the third conditional expression to true (e.g. 1, 3, 2 for sides A, B, and C respectively). Note that 2, 1, 3 satisfies the first conditional expression (2 + 1 <= 3) just as 1, 2, 3 does, so with the values as listed the second conditional expression never evaluates to true either; a set such as 3, 1, 2 would cover it.
    internal bool IsValidTriangle(int sideA, int sideB, int sideC)
    {
        bool result = true;
        if ((sideA + sideB <= sideC) || (sideB + sideC <= sideA) || (sideA + sideC <= sideB))
        {
            result = false;
        }

        return result;
    }
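To make the gap easier to see, here is a minimal sketch that splits the compound predicate into its three conditional expressions and reports which one (if any) is the first to evaluate to true for each black-box test. The class name TriangleCoverageSketch and the helper FirstTrueCondition are hypothetical names made up for this illustration; they are not part of the triangle program or of any coverage tool, and a real coverage tool reports this information in its own way.

```csharp
using System;

internal static class TriangleCoverageSketch
{
    // Hypothetical helper: mirrors the predicate in IsValidTriangle(), but
    // returns which conditional expression is the first to evaluate to true
    // (1, 2, or 3), or 0 when none is true (the sides form a triangle).
    // The order matches the short-circuit evaluation of the || chain.
    internal static int FirstTrueCondition(int sideA, int sideB, int sideC)
    {
        if (sideA + sideB <= sideC) return 1;
        if (sideB + sideC <= sideA) return 2;
        if (sideA + sideC <= sideB) return 3;
        return 0;
    }

    private static void Main()
    {
        // The five black-box tests listed in the post (sides A, B, C).
        int[][] tests =
        {
            new[] { 1, 2, 3 },
            new[] { 2, 1, 3 },
            new[] { 4, 5, 6 },
            new[] { 2, 1, 2 },
            new[] { 5, 5, 5 },
        };

        // hit[i] records whether conditional expression i was ever true.
        var hit = new bool[4];
        foreach (int[] t in tests)
        {
            int condition = FirstTrueCondition(t[0], t[1], t[2]);
            hit[condition] = true;
            Console.WriteLine($"{t[0]}, {t[1]}, {t[2]} -> first true expression: {condition}");
        }

        for (int i = 1; i <= 3; i++)
        {
            if (!hit[i])
            {
                Console.WriteLine($"Conditional expression {i} never evaluated to true.");
            }
        }
    }
}
```

Run against the five tests above, this sketch reports that only the first conditional expression was ever true. Adding 1, 3, 2 makes the third expression true, and a set such as 3, 1, 2 does the same for the second.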
Hi BJ,
Nice post! Just thought of bouncing some related experience off of you.
I’ve been evaluating the effectiveness of code coverage directed “system test” coverage improvement in one of our projects. I am evaluating this by having the system testers run their tests with the coverage profiler in the background (albeit having faced reluctance from the testers at first — “code coverage, and me??”). So far, the code coverage results from a 100% run of system tests for a selected feature have unearthed quite a few sub-features that have not been considered. Of the ones that have not been covered, about 50% can be considered “risky” to be shipped without being tested at least once. Found some possible “rich dead code” as well. Although the test team has yet to analyze the misses and determine how critical they are, the test manager is convinced with the approach and has lined up another feature for analysis. Which is what I wanted!
I also got the impression that developers find it interesting to see the coverage report at this large, comprehensive scale (as opposed to unit test based code coverage.) It was also easy to get feedback from the developer on how to get the as-yet-untouched code covered through new tests.
For starters, I have begun with only measuring method coverage across the different relevant assemblies (it’s managed.) But this seems to suffice to improve feature-level test coverage. Next will be to use branch coverage — if the test team has the steam/time for it, that is.
An enabler has been the reduction of the overhead of the coverage profiler. Normally, as code coverage profilers deal with a reasonably small amount of code (in unit tests), the performance and resource overhead of these profilers goes unnoticed. But when I first began evaluating the profiler’s usage with system tests, the system became largely unusable. Obviously, I couldn’t go to the test manager with this and say “I need a full system test run for feature X with the profiler. Oh, by the way, it takes just 5x more time to execute the tests with the profiler on. *compensatory grin*”
So, I got into discussion with the coverage tool guys, and over continuous dialogue managed to bring the overhead down to acceptable levels for a system tester to be able to run his/her tests under the profiler. Now I could dial the managers’ numbers.
So, having now been convinced that this approach does help, I look forward to selling it around in my org that “Code coverage analysis of system tests helps identify untested code even as you prepare to ship the product in the next weeks. You as project/test manager decide whether it is risky to ship it without being tested at least once in the system context. Get the code tested if it is risky. And get the developers to remove their rich dead code.”
Is there anything that might be missing in the context I described? It would be my pleasure to have your response.
chai
6 Sep 10 at 7:27 AM