
Test Effectiveness

Originally Published Friday, October 13, 2006

Boris Beizer stated that black box testing was approximately 35 to 65% effective. I had also read that Gerald Weinberg conducted studies at IBM with similar results. I recently spoke at the SQS conference in London, and in the opening presentation Bob Bartlett stated that SQS studies indicated that formal test design was almost twice as effective in defect detection per test case as expert (exploratory) testing, and of course put into perspective the infamous “death by checklist” syndrome.

About 4 years ago I began a 3 year study at Microsoft to verify assertions on testing effectiveness from a black box approach. I used Weinberg’s famous Triangle paradigm for the assessment. Given a brief functional requirement, participants in the case study were asked to define tests to validate a program written in C# against the stated requirements. The basic requirements are outlined in Glenford Myers’s book The Art of Software Testing as “A program reads three (3) integer values. The three values are interpreted as representing the lengths of the sides of a triangle. The program displays a message that states whether the triangle is scalene, isosceles, or equilateral.”

Based on the implementation in C# (pseudo code below) and assuming that all inputs are valid integer values we determined the minimum number of tests to validate a program against this functional requirement is 11 tests as outlined below.

if (a + b <= c) or (b + c <= a) or (a + c <= b)
then invalid triangle
else if (a equals b) and (b equals c)
then equilateral triangle
else if (a not equal b) and (b not equal c) and (a not equal c)
then scalene triangle
else isosceles triangle

The minimum tests for conditional control flow and data flow (again assuming valid integer inputs)

  • 6 tests to validate the invalid triangle path
    a + b < c
    a + b = c
    etc.
  • 1 test for the equilateral path
  • 1 test for scalene
  • 3 tests for isosceles (which actually verify the false outcomes in the sub-expressions of the scalene predicate statement)
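
The pseudo code and the 11-test minimum above can be sketched in Python. This is only an illustration: the study's program was written in C#, and the function name `classify_triangle`, the result strings, and the concrete side lengths are assumptions chosen for this sketch, not the study's actual code or data.

```python
def classify_triangle(a: int, b: int, c: int) -> str:
    """Classify three side lengths, mirroring the pseudo code above."""
    if a + b <= c or b + c <= a or a + c <= b:
        return "invalid"
    if a == b and b == c:
        return "equilateral"
    if a != b and b != c and a != c:
        return "scalene"
    return "isosceles"


# One candidate set of 11 tests; the side lengths are illustrative values.
MINIMUM_TESTS = [
    ((1, 2, 4), "invalid"),      # a + b < c
    ((1, 2, 3), "invalid"),      # a + b = c
    ((4, 1, 2), "invalid"),      # b + c < a
    ((3, 1, 2), "invalid"),      # b + c = a
    ((1, 4, 2), "invalid"),      # a + c < b
    ((1, 3, 2), "invalid"),      # a + c = b
    ((3, 3, 3), "equilateral"),
    ((3, 4, 5), "scalene"),
    ((3, 3, 4), "isosceles"),    # a == b, so "a not equal b" is false
    ((4, 3, 3), "isosceles"),    # b == c, so "b not equal c" is false
    ((3, 4, 3), "isosceles"),    # a == c, so "a not equal c" is false
]

for sides, expected in MINIMUM_TESTS:
    actual = classify_triangle(*sides)
    assert actual == expected, (sides, actual, expected)
print(f"{len(MINIMUM_TESTS)} tests passed")
```

Each isosceles case makes exactly one sub-expression of the scalene predicate false, which is why three of them are needed rather than one.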

I collected data for 3 years from more than 500 participants, ranging from less than 6 months to more than 5 years of testing experience, but none having formal training in testing techniques or methodologies. Interestingly enough, the data changed very little from the first few groups. The empirical results of this case study demonstrate that the average effectiveness of tests in the most critical area of the program was only 36%. This literally means that of the minimum 11 tests for control and data flow coverage of this section of code, the average tester defined only 4 tests (1 test for invalid, 1 for equilateral, 1 for scalene, and 1 for isosceles). During this time period Microsoft was also making a transition to hire testers with greater technical competence and coding skills. Perhaps not surprising to most, the testers with a coding background increased the test effectiveness ratio by 50%.
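
The 36% figure is simply the ratio of tests defined to the minimum required. A minimal sketch of that arithmetic follows; the dictionary names are hypothetical, and the counts are the ones quoted in the post.

```python
# Minimum tests per path, as listed in the post.
minimum_tests = {"invalid": 6, "equilateral": 1, "scalene": 1, "isosceles": 3}
# Tests the average participant defined, per the post: one per path.
average_tests = {"invalid": 1, "equilateral": 1, "scalene": 1, "isosceles": 1}

total_minimum = sum(minimum_tests.values())
total_defined = sum(average_tests.values())
effectiveness = total_defined / total_minimum

print(f"{total_defined}/{total_minimum} = {effectiveness:.0%}")
for path, needed in minimum_tests.items():
    # Per-path ratio shows where the missing tests are concentrated.
    print(f"{path}: {average_tests[path]}/{needed}")
```

Broken down by path, the missing tests sit almost entirely in the invalid-triangle and isosceles paths, where one test was defined against a minimum of six and three respectively.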

This is just a small snapshot of the overall case study, but the overall conclusions were that untrained testers using only an exploratory black box approach to testing are less effective, and that non-technical testers are 50% more likely to perform redundant or ineffective tests than testers with greater technical competence (not necessarily coding skills, but a greater understanding of the entire system under test).

Some managers at Microsoft scoffed at these results. One said that if he asked one of his non-technical testers to test the design of a new coffee cup, that person would probably do better than someone with a computer science background. OK…he probably has a point. But I would argue that Microsoft and many other software companies are in the business of producing technological solutions for customers, and not in the business of making mugs (unless perhaps Microsoft’s ceramic team is in building 7 and that is a new LOB I don’t know about). The bottom line is that formal training in established, time-proven formal functional and structural techniques can increase the effectiveness of testers and reduce potential risk in a software project.

10 Comments

  1. testingmentor wrote:

    It would be interesting to see the case study as a whole as there is the danger to misinterpret the above information.

    Does the low rate of 36% test effectiveness cause concern? Perhaps putting testing into the hands of testers who are not trained properly is a great risk factor, and a concern to the testing community? I would personally be concerned if I was working with testers who could not identify basic tests.

    And what is the definition of trained and untrained testers? I’ve been testing for years, very keen and self educated, am not a coder to any extent, but believe I have a good understanding of software. Am I a trained or untrained tester?

    You say that Microsoft provides technology solutions and are not in the business of making coffee mugs, which to me does not ring right. Making coffee mugs to me is the equivalent of Microsoft (or any company) designing software. I think it’s dangerous to measure tester’s effectiveness on the basis of number of tests identified. I’m sure most of us testers have experience of finding issues that were not covered in tests.

    After all, if something doesn’t look good or work as expected it won’t succeed – it doesn’t matter whether it’s functionally correct.

    It’s all about balance…

    Sunday, October 15, 2006 6:41 PM by rosiesherry

    Wednesday, November 11, 2009 at 11:13 AM | Permalink
  2. testingmentor wrote:

    Does the low rate of test effectiveness cause a concern? Well, let’s say that instead of simply calculating a type of triangle the calculations were for the trajectory of a rocket, or an automated drug dispenser in a hospital then I would say it is cause for grave concern.

    Is the low rate a great risk factor and concern for the testing community? The answers to both questions are emphatically yes! One of the objectives of a tester is to provide information that will allow management to make calculated assessments of risk. If test effectiveness is only 36% on the most critical aspect of this simple program, then 64% of the code is susceptible to 100% risk.

    Regarding trained or untrained I cannot say because I do not know you or what you have done to ‘educate’ yourself. The raw ability to find bugs in software or subjectively disagree with a design does not qualify a person as a tester. My mother finds bugs all the time. I love my mother, but I wouldn’t hire her as a software tester.

    Professional testing requires more than making sure it looks good and simply banging away indiscriminately, or making wild guesses, or the infamous “hey, let’s try this” approach. Professional testers can look “inside the box” to assess the effectiveness of their tests in reducing overall risk. Professional testers can be more efficient with targeted test design and less redundant.

    The simple fact is that many commercial software companies are requiring more technically competent, knowledgeable, well-trained, professional testers in lieu of the self-proclaimed ‘testers’ whose only contribution to the party is the ability to ‘test’ via trial and error from an end-user perspective or black box only approach.

    - Bj -

    Sunday, October 15, 2006 8:29 PM by I.M.Testy

    Wednesday, November 11, 2009 at 11:13 AM | Permalink
  3. testingmentor wrote:

    BJ,

    As seen from your few previous posts, you seem to be biased strongly in favor of some methodologies or approaches and at the same time constantly targeting certain others like ET – often trying to find reasons or cases to prove it as bad, ineffective and so on….

    If you were to “glorify” formal test design techniques – you can do it without showing other techniques in “poor” light. As you would agree – “good” testing involves a bit of everything, from formal to informal, from theoretical approaches to practical approaches, from Mathematics/Computer science to Cognitive thinking. Your heavy (more than the usual) emphasis on “formal test design techniques” – indicates that you are a thinker or proponent of the factory/Analytical school of testing. Why is it that there is no talk in your posts of the tester’s thinking and analytical abilities and focus (which methodologies like ET rely on)? Why not take a balanced approach? Let the tester choose what is best in a given context? Formal techniques are not silver bullets and neither are others – then why this fight?

    The other word (or jargon or keyword) that you most often use (or overuse) is “professional tester” – who is this person? (I see that in one of your previous blogs you attempted to define it, but it was not convincing for me.) The more you use that word (sometimes out of context) – there is a danger that the word will lose its significance.

    What is unprofessional about those other testers who are not professional? What are the characteristics of a non professional tester? What about a “tester”?

    How about you changing to “Skilled” tester or simply a tester? In my opinion, using the term “professional” tester in a larger context is “offensive” to the rest of the testers, who may or may not be “professional” by whatever definition you have. You just cannot put the whole world of testers in two buckets, “Professional” and “un professional”…

    Shrini

    Wednesday, November 11, 2009 at 11:14 AM | Permalink
  4. testingmentor wrote:

    Shrini,

    It is not that I favor some methodologies or approaches over others, and I don’t completely discount the value of ‘exploratory’ testing in our profession. Techniques, methods, approaches are like ‘tools’ in a ‘toolbox’. Each ‘tool’ has a unique purpose it is best designed for, and that tool serves that specific purpose more effectively and more efficiently as compared to other ‘tools’ in the ‘toolbox’.

    Of course, the real key is knowledge of how and when to use the tools properly. And, if there are tools in the toolbox that someone doesn’t know how to use properly, then the overall ability of that person to perform certain tasks correctly or adequately are undoubtedly impaired.

    The use of formal techniques helps identify certain classes of defects and provides a solid foundation upon which further experimentation can continue. Unfortunately, studies indicate that few testers or developers receive formal training on many of the techniques or methodologies, or take the time to read books on the practice of testing, which can dramatically increase their knowledge of our discipline as well as their overall effectiveness and efficiency in test execution.

    I don’t have to look for reasons to prove the limitations of ‘exploratory’ testing as a primary approach to testing large, complex software projects; there is ample empirical evidence across the industry to support that assertion. There are individuals who disagree with some of the case studies; but there are always people who will ignore or deny the existence of the large pink elephant in the middle of the room. However, many companies are demanding more than simply an opinion of quality. I don’t get paid for my opinion (although I give it quite a bit). One of my primary objectives is to provide qualitative information to my managers; not a ‘best guess’ or ‘feel-good’ assessment. My managers trust my judgment; however, when the rubber hits the road I have to be able to qualify my position with facts.

    The so-called 4 schools of testing is something I have been meaning to blog about for awhile, but the short version is…

    I am not a big fan of the ‘so-called’ 4 schools of testing. I don’t like segregation in any form because it can lead to biased opinions, incorrect assumptions, and a general disregard for things that are “different.” But, most importantly segregation stifles innovative thoughts, creative collaboration, and the ability to expand a person’s knowledge and in-depth understanding of the ‘system’ as a whole.

    Reviewing the descriptions of the different ‘schools’ I don’t particularly align myself with any single school. Instead of affinity to one ‘school’ we should understand the values, techniques, and standards of all four ‘schools.’ The testing community needs to embrace the diverse values and mores of these various ‘schools’ of thought in order to extend the impact of testers, and mature testing into a professional discipline in the field of computer science.

    I do not group individuals into “professional” or “un-professional” buckets as you suggest. I think anyone who is serious about this “profession” is a professional, and they surely realize they must constantly grow and hone their skills in order to succeed in our business and remain valuable assets to their company. When I use the term ‘professional’ I don’t use it as jargon or some secret keyword, and I certainly have not used the term to disparage others who are seriously pursuing software testing as a profession. (I tend to choose my words very carefully, and almost always use the denotative meaning of a word or clarify the context in which its meaning could be misconstrued.) I, and many other readers, have chosen software testing as a career or profession, and I strive to become “a person who is expert at his or her work;” “having or showing great skill;” and “a skilled practitioner.” I expect nothing less from other people in the industry who have chosen software testing as a career.

    - Bj -

    Thursday, October 19, 2006 10:27 PM by I.M.Testy

    Wednesday, November 11, 2009 at 11:14 AM | Permalink
  5. testingmentor wrote:

    I don’t have to look for reasons to prove the limitations of ‘exploratory’ testing as a primary approach to testing large, complex software projects; there is ample empirical evidence across the industry to support that assertion.

    It will be great if you can point me to those empirical studies. It is possible that ET may not be very effective in all contexts – A good ET is always accompanied by a good scripted testing that may use all available tools and techniques. It is important to understand that ET supplements any or all mainstream testing techniques – in some cases it is a viable mainstream technique.

    [Bj] The information is out there if you do a bit of research.

    there are always people who will ignore or deny the existence of the large pink elephant in the middle of the room.

    Such kind of behavior is quite possible – just as you seem to ignore the existence of 4 large pink elephants – 4 schools of testing

    [Bj] If you read closely you will understand that I am not ignoring the 4 pink elephants. I have recognized that some folks wish to label themselves to feel important or for improved self-esteem, but instead of ignoring the ridiculous notion, I am suggesting that segregation of testing values and ideals has no place in professional testing. Speaking out against something is not ignoring it…at least not where I come from.

    However, many companies are demanding more than simply an opinion of quality.

    What else do they need? Fancy numbers, graphs and endless and mindless documentation? If they are demanding more than opinion – opinion might not be covering the things that managers are looking at

    [Bj] They are looking for qualitative information, and facts.

    One of my primary objectives is to provide qualitative information to my managers; not a ‘best guess’ or ‘feel-good’ assessment.

    ET – when done well gives qualitative information not just “best guess” or “feel-good” assessment. “Best guess” or “feel-good” assessment was NEVER an objective of ET and never will be.

    I do not group individuals into “professional” or “un-professional” buckets as you suggest.

    Then why use or over use or always “prefix” tester with “professional”?

    [Bj] If you don’t know by now, I fear you will never understand, so I shall no longer attempt to explain it to you.

    I think anyone who is serious about this “profession” is a professional, and they surely realize they must constantly grow and hone their skills in order to succeed in our business and remain valuable assets to their company. When I use the term ‘professional’ I don’t use it as jargon or some secret keyword, and I certainly have not used the term to disparage others who are seriously pursuing software testing as a profession. (I tend to choose my words very carefully, and almost always use the denotative meaning of a word or clarify the context in which its meaning could be misconstrued.) I, and many other readers, have chosen software testing as a career or profession, and I strive to become “a person who is expert at his or her work;” “having or showing great skill;” and “a skilled practitioner.” I expect nothing less from other people in the industry who have chosen software testing as a career.

    According to Merriam Webster dictionary – A professional is one who engages in a *profession* to make his livelihood. If you add the word “Tester” to this, then a professional tester is someone who is engaged in a “testing” profession and makes his livelihood. I am sure that is not Michael Bolton’s interpretation of a “professional tester”. This *dictionary* meaning does not say anything about thinking abilities and intelligence of the tester – it just mentions *engaging* in a profession and *making* a livelihood. You might want to switch over to the word “Skilled Tester”.

    You might want to do a control F in your blog and count the instances of the word “Professional” – you would be surprised. If you use some word so frequently – I am afraid it will lose its significance.

    [Bj] You’re limiting the definition of the word professional. Merriam Webster has additional denotations for the word that provide greater context. I am sure my use of the word professional to describe myself and my colleagues in a career we have chosen, and in which we strive to excel, is not as profuse as the letters ET in your comments. It is actually sad that you disdain the use of the word professional. Perhaps it is because you label yourself as something else (such as “exploratory tester”) and feel threatened by the word professional for some reason. Get over it!

    Tuesday, October 31, 2006 9:56 AM by Shrini

    Wednesday, November 11, 2009 at 11:14 AM | Permalink
  6. testingmentor wrote:

    I think the most interesting thing about this test is that, by definition, every equilateral triangle *is also* an isosceles triangle.

    Tuesday, March 06, 2007 2:08 PM by RCarbol

    Wednesday, November 11, 2009 at 11:15 AM | Permalink
  7. testingmentor wrote:

    Good point, but not all isosceles triangles are equilateral triangles! So, equilateral triangles are a special type of isosceles triangle.

    Since the requirements state that we specifically want to identify equilateral triangles (separately from other isosceles triangles, as the ones that are also equiangular), we cannot treat equilateral and isosceles triangles as equivalent in the code or in our testing.

    - Bj -

    Tuesday, March 06, 2007 6:08 PM by I.M.Testy

    Wednesday, November 11, 2009 at 11:15 AM | Permalink
  8. testingmentor wrote:

    Of course. But I think it’s a mistake for the specifications to imply that ‘equilateral’ and ‘isosceles’ are mutually-exclusive.

    In the general case, I think it’s an interesting topic for testing: Is it even *possible* for the specifications to be wrong, in the context of software testing?

    Thursday, March 08, 2007 11:37 AM by RCarbol

    Wednesday, November 11, 2009 at 11:15 AM | Permalink
  9. testingmentor wrote:

    Yes, it is absolutely possible for the specification to be wrong; not just in the context of software testing, but also in the design and development of the software.

    In this particular example of the ‘triangle’ problem (which is well over 20 years old) the requirements state we specifically want to identify 3 different types of triangles ( equilateral, scalene, and isosceles).

    You may ‘think’ the requirements are wrong (everyone is entitled to their opinion), but it is what it is.

    We could display a message when users input 3 equal values that reads “These values equate to an equilateral triangle which is a special type of isosceles triangle.” Perhaps that would satisfy those who really want to over-analyze this simple parable.

    Personally I think you are putting way too much energy into trying to draw me into a philosophical debate on the differences between an equilateral triangle and an isosceles triangle.

    Thursday, March 08, 2007 1:15 PM by I.M.Testy

    Wednesday, November 11, 2009 at 11:15 AM | Permalink
  10. testingmentor wrote:

    Heh; sorry about that. I don’t mean to belabour this issue.

    Thursday, March 08, 2007 5:27 PM by RCarbol

    Wednesday, November 11, 2009 at 11:16 AM | Permalink