I.M. Testy

Treatises on the practice of software testing

Archive for the ‘General Testing Topics’ Category

API Testing–How can it help?

with 5 comments

After a rather wet and soggy weekend, I woke up this morning to a beautiful sunny day in Seattle. Despite it being a bit cool, I do enjoy the sunshine so much more than the dreary gray days of a Seattle winter. Most of the leaves have fallen from the trees which makes good mulch for the gardens, but just adds more work to my stack. The good news is that there is snow in the mountains and the ski resorts in the area have opened early this year, so I hope to get in some good ski days.

In the previous post I attempted to explain the subtle differences between unit testing and API testing. It should also be noted that testing at the API layer is different from testing through the GUI. API testing is primarily focused on the functionality of the business logic of the software, and not necessarily the behavior or the “look and feel” from the end user customer perspective. In fact, the first tier customer of the API tester is the developer who designs and develops the API. The second tier customers are the developers who utilize the APIs in building the user interface or applications that sit on top of the underlying infrastructure of a program. And finally, API testers must also consider the various end-to-end scenarios of the end user customers at the integration level of testing (without a GUI).

This post will discuss why API testing is an important activity in the complete software development lifecycle (SDLC). Teams that have multiple developers and a continuous integration (CI) build process can greatly benefit from API testing. Key benefits of API testing include:

  • Reduced testing costs
  • Improved productivity
  • Higher functional (business logic) quality   

Reduced Testing Costs

It shouldn’t be a surprise to anyone that finding most types of functional bugs early in the SDLC is more cost effective. The primary goal of unit testing, and component/integration levels of testing (API testing) is to flush out as many functional issues in the business logic layer of a software program as early as possible in the SDLC. Driving functional quality upstream not only reduces production costs, but can also reduce testing costs.

API testing can reduce the overall cost of test automation.  Automated API tests are based on the API’s interface. So, once the API interface is defined testers can begin to design and develop automated tests. Having a battery of automated tests ready as the functional APIs come on-line pushes testing upstream in parallel with development rather than later in the SDLC. This also enables earlier tester engagement and closer collaboration between testers and developers.

Also, since API interfaces are generally very stable, automated API tests are less impacted by changes than GUI based automated tests. Many testers are familiar with the constant upkeep and maintenance typically associated with GUI based automated tests. The constant massaging of GUI automation is often a huge cost in a test automation effort and a contributing factor to why so many automation projects fail. Automated API tests in general require a lot less maintenance unless there is a fundamental change in the underlying infrastructure or design of the program.

Another significant way that API testing can reduce testing costs is by refocusing testing. Many test strategies rely  heavily on finding functional bugs typically using exploratory type testing through the GUI. But, most software produced today is developed in “layers” (see Testing in Layers). A more robust test strategy should focus the bulk of functional testing at the API layer that contains the “business logic” of the program. Of course some functional issues will still be found while testing through the GUI, but the focus of testing at the GUI layer should be on  behavioral testing. A test strategy that provides a multi-tiered approach is more effective than the typical approach of throwing a bunch of bodies to bang on the GUI in an attempt to beat out the bugs. A multi-tiered test strategy may even reduce the total testing time by reducing the need to spend long cycles trying to uncover a lot of functional bugs through the GUI.

Improved Productivity

There are different ways to evaluate productivity, but certainly one way is to ensure production keeps moving forward. Continuous integration is a keystone of Agile development projects, and at Microsoft this means daily builds. If the build breaks, production grinds to a virtual halt and forward momentum is blocked until the issue is fixed. A build break negatively impacts the productivity of the entire team. A suite of low level integration tests can help identify potential build breaks especially involving dependent modules before new fixes or features are merged into a higher level branch.

API testing can also improve the productivity of testing. For example, structural testing is a white box test approach intended to test the structure or flow of a program. If increasing code coverage is an important goal, then the most efficient way to improve structural coverage is to identify untested code paths and design and develop API level tests that tactically target the untested code.

Perhaps the most significant improvement to productivity is gained through teamwork. Building and releasing great software products requires a team effort: a team of people working closely together. Gone are the days of the adversarial relationship between developers and testers. Changes in technologies, changes in customer demands, and changes in how we build software require close collaboration between developers and testers, with testers actively engaged throughout the SDLC and not just at the beginning (picking apart a spec), or at the end (banging out bugs via the GUI pretending to mimic a ‘customer’). Focusing the whole team on delivering high quality can greatly add to the team’s overall productivity.

Higher Functional Quality

One of the advantages, and also disadvantages, of testing at the API layer is that you can test the API in ways that are different from how the GUI interacts with the API. For example, the Morse Code Trainer has an interface for the methods that parse the dots and dashes and play a system beep of 1 unit duration for each dot in the stream, and a system beep of 3 units duration for each dash in the stream. The duration of a unit is based on the WordsPerMinute property value.

        interface ISoundGenerator
        {
            void PlayMorseCode(string morseCodeString);

            int WordsPerMinute { get; set; }
        }
 

Testing this property at the API level we could “set” a negative integer value to make sure nothing really bad happens. But, a well-designed GUI would never accept an integer value less than 1 (which is painfully slow) nor above 150 (which is ridiculously high). A better design might be to use a drop-down list of values ranging from 5 (the minimum requirement for a basic license) to 20 words per minute (required for the highest level amateur radio operator license). Of course, it may be possible to find functional anomalies while API testing that could not be found via testing through the GUI. But, the important thing an API tester must consider is how a bug found at the component or integration levels of testing adversely affects a scenario, or the customer.
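To make the negative-value check concrete, here is a minimal sketch of probing the property directly, with no GUI in the way. The ISoundGenerator interface is copied from the post; SketchSoundGenerator, the 1 to 150 range check, and the use of ArgumentOutOfRangeException are assumptions made purely for illustration, because the real implementation is not shown.

```csharp
using System;

// Interface as shown in the post.
interface ISoundGenerator
{
    void PlayMorseCode(string morseCodeString);
    int WordsPerMinute { get; set; }
}

// Hypothetical stand-in; the real SoundGenerator class is not shown in the post.
class SketchSoundGenerator : ISoundGenerator
{
    private int wordsPerMinute = 5;

    public int WordsPerMinute
    {
        get { return this.wordsPerMinute; }
        set
        {
            // Assumed contract: reject anything outside 1..150.
            if (value < 1 || value > 150)
            {
                throw new ArgumentOutOfRangeException("value");
            }

            this.wordsPerMinute = value;
        }
    }

    public void PlayMorseCode(string morseCodeString)
    {
        // Sound output is omitted in this sketch.
    }
}

static class WordsPerMinuteProbe
{
    // Returns true when the setter accepted the value, false when it threw.
    public static bool Accepts(ISoundGenerator generator, int value)
    {
        try
        {
            generator.WordsPerMinute = value;
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            return false;
        }
    }

    static void Main()
    {
        ISoundGenerator generator = new SketchSoundGenerator();

        // Values a well-designed GUI would never send, plus the assumed boundaries.
        foreach (int value in new[] { int.MinValue, -1, 0, 1, 150, 151, int.MaxValue })
        {
            Console.WriteLine("{0}: {1}", value, Accepts(generator, value) ? "accepted" : "rejected");
        }
    }
}
```

Whether a rejected value should throw, clamp, or be ignored is a design decision for the API owner; the point of the probe is only that the API tester can ask the question before any GUI exists.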

API testers work alongside the developers. An API tester may also provide input into the initial API design, engage in code reviews before check-ins, and of course write automated tests to test the API (component level) and APIs in end-to-end scenarios (integration level). Having testers engage with developers early and throughout the SDLC helps ensure teamwork and instills the idea that quality is a collaborative effort.

Written by Bj Rollison

December 1st, 2011 at 5:48 pm

API Testing–Functional Testing Below the User Interface

with 4 comments

After a long hiatus from writing I am finally carving out some time to put thoughts to words again. A lot has been going on both professionally and personally. On the personal side I will simply say this: never take for granted the time someone you care for has on this earth, and make a habit of spending quality time with that person regularly. On the professional side, things have been crazy busy in a very good way. I have settled in at work (still have lots to learn as always), and squeezed out a day to drive to Portland, OR to speak at PNSQC on random test data generation, and also gave an online presentation discussing API testing best practices for the STP Online Summit: Achieving Business Value With Test Automation. Based on questions from that session I thought I would follow up with a few posts discussing API testing. Let’s start with describing API testing and how it differs from other “levels of software testing.”

Application Programming Interface (API)

The Microsoft Press Computer Dictionary defines API as “A set of routines used by an application program to direct the performance of procedures by the computer’s operating system.” So, referring to the abstract levels of testing, an API can be a unit, but is more likely a component because it is usually “an integrated aggregate of one or more units.”

An API provides value to both the developer and to the customer. For example, an API:

  • provides developers with common reusable functions so they don’t have to rewrite common routines from scratch
  • provides a level of abstraction between the application and lower level ‘privileged’ functions
  • ensures any program that uses a given API will have identical behavior/functionality (for example, many Windows programs use a common file dialog such as the IFileSaveDialog API to allow customers to save files in a consistent manner)

Essentially, an API contains the core functionality of a program, or the business logic as some people refer to it. Customers don’t interact with APIs directly. Customers interact with software via the Graphical User Interface (GUI), which in turn interacts with an abstraction layer (e.g. the controller in the MVC design pattern) that interacts with APIs exposed via interfaces.

Testing an API as a Black Box

Some people assume that API testing is a ‘white-box’ testing activity in which the tester has access to the product source code. But in reality, API testing is black-box testing in the truest sense. API testers make no assumptions about how the functionality is implemented, and are not limited by constraints or distracting behaviors of a graphical user interface.

As an example, let’s use a program I developed called Morse Code Trainer. As a boy I was really into electronics (HeathKit projects were routinely on my Christmas list), and in order to pursue my amateur radio license I had to learn Morse code, or CW for short. Although Morse code is no longer required to get a ham radio operator’s license, I think it is not hard to learn (memorize) about 55 sequences of dits and dahs, and in my opinion learning additional languages is good for the brain.

A core bit of functionality in this program is to convert a string of characters (a sentence) to the dits (represented as a period character “.”) and dahs (represented as a dash character “-”). The API to do this bit of magic is:

        string AlphaNumericCharacterStringToMorseCodeString(string input)

and is exposed to the developer who will code the UI and controller via the IMorseCodeEncoder interface.

        interface IMorseCodeEncoder
        {
            string AlphaNumericCharacterStringToMorseCodeString(string input);

           string MorseCodeStringToAlphaNumericCharacterString(string input);
        }

Notice that we don’t see any of the underlying code of how this method actually does its magic.

Let’s assume the developer didn’t do any unit testing and simply threw the code over the proverbial wall for testers to beat on. Since the tester (me) knows the developer (me) didn’t do any unit testing of any of the private methods the API under test relies on, the API tester (me) writes a simple test just to see if this code “works” like the developer (me) assured the tester (me) that it would. The most basic API test looks very similar to the unit test illustrated below, and in fact this is a unit test the developer should write and execute before chucking a program at testers to bang on. A proper API test would call the method under test from the dynamic link library (DLL), include initialization and clean-up, utilize the proper test design, have a robust oracle, and of course have no hard-coded strings.

        [TestMethod()]
        [DeploymentItem("Morse Code Trainer.exe")]
        public void GetMorseCodeStreamTest()
        {
            try
            {
                MorseCodeEncoder_Accessor target = new MorseCodeEncoder_Accessor();
                string input = "A QUICK TEST";
                string expected = ".-  --.- ..- .. -.-. -.-  - . ... -";
                string actual;
                actual = target.GetMorseCodeStream(input);
                Assert.AreEqual(expected, actual);
            }
            catch (Exception e)
            {
                Assert.Fail(e.ToString());
            }
        }

 

Interestingly enough, had the developer (me) run this unit test, the developer would have discovered an unhandled exception. The unit test failed because this API called a method to get a Dictionary in another class and the Dictionary was created from 2 string arrays (an array of alpha-numeric characters, and an array of Morse code sequences). The specific error was a duplicate key in the Dictionary; in other words, a duplicate entry in the alphaCharacterArray string array was throwing a System.ArgumentException for duplicate keys. But, because the methods in the MorseCodeLibrary class weren’t unit tested, the API to encode a string of alpha-numeric characters to Morse code characters failed its basic unit test.

        public Dictionary<string, string> GetAlphaCharacterToMorseCodeDictionary()
        {
            Dictionary<string, string> AlphaToMorseCodeDictionary = new Dictionary<string, string>();
            for (int i = 0; i < this.alphaCharacterArray.Length; i++)
            {
                AlphaToMorseCodeDictionary.Add(this.alphaCharacterArray[i], this.morseCodeArray[i]);
            }

            return AlphaToMorseCodeDictionary;
        }

 

But, it actually gets worse. Another API in another class to decode a string of Morse code to alpha-numeric characters would have failed as well because it used the same faulty string array of data to create a Dictionary calling the public method

        public Dictionary<string, string> GetMorseCodeToAlphaCharacterDictionary().

This is actually a good example of 2 very different bugs with the same root cause. This is also a good example of how it is more efficient to find functional bugs at the unit, component, or integration levels of testing (API testing) as compared to finding this problem via functional testing through the user interface.
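The root cause can be reproduced in isolation. The sketch below builds a dictionary from two parallel arrays the same way the method in the post does; the array contents and the names DuplicateKeySketch, Build, and ThrowsOnBuild are made up for illustration. The one fact it relies on is that Dictionary.Add throws a System.ArgumentException when the key is already present.

```csharp
using System;
using System.Collections.Generic;

static class DuplicateKeySketch
{
    // Same construction as in the post: pair two parallel arrays using Add.
    public static Dictionary<string, string> Build(string[] keys, string[] values)
    {
        var map = new Dictionary<string, string>();
        for (int i = 0; i < keys.Length; i++)
        {
            // Add throws ArgumentException if keys[i] is already in the dictionary.
            map.Add(keys[i], values[i]);
        }

        return map;
    }

    // Returns true when building the dictionary throws ArgumentException.
    public static bool ThrowsOnBuild(string[] keys, string[] values)
    {
        try
        {
            Build(keys, values);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    static void Main()
    {
        // Illustrative data only; the entry for "A" is duplicated on purpose.
        string[] alpha = { "A", "B", "A" };
        string[] morse = { ".-", "-...", ".-" };

        Console.WriteLine(ThrowsOnBuild(alpha, morse)); // encode direction: True
        Console.WriteLine(ThrowsOnBuild(morse, alpha)); // decode direction: True
    }
}
```

A unit test this small against either dictionary method would have caught the faulty data before both the encoder and the decoder inherited it.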

Unit vs. API Testing

So, you’re probably asking yourself “if the above example is really an example of a unit test the developers should do before throwing their code at testers, then how does unit testing differ from API testing?” When testing a single API call the most significant difference is in the thoroughness of test coverage. Most unit tests are rather simple things. Unit tests are not very complex; unit tests are not comprehensive in test coverage (although a good suite of unit tests should achieve good structural coverage); and unit tests often rely on simplistic oracles.

API tests by contrast are usually more comprehensive than unit tests. API tests usually include both positive tests (does it do what it’s supposed to do) as well as negative tests (how well does it handle error conditions). While API tests should strive for a high level of code coverage (structural testing), a more important goal is test coverage. For example, API tests of this same method might include a series of data-driven tests that:

  • test every known alpha-numeric character defined in Morse code (the population of the variable is small enough to test every element; if the population of a given variable is large then testers should define equivalence partitions and test an adequate number of samples from the population for confidence)
  • test character casing
  • test pangrams, and special signals (e.g. end of message, attention, received, etc)
  • test boundary conditions (e.g. maximum string length, although 2 billion+ characters seems excessive for a Morse code transmission)
  • test strings with invalid characters or characters that are not defined in Morse code
  • test strings with non-ASCII letters that have Morse code encodings (Ä, Á, Å, Ch, É, Ñ, Ö, Ü)
  • test performance to provide baseline measures of individual methods
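A few of the ideas in this list can be sketched as a small table of inputs and expected results driving one check. SketchEncoder below is a hypothetical stand-in with a deliberately partial code table, not the Morse Code Trainer implementation. It assumes case-insensitive input, a single space between letters and a double space between words (matching the expected string in the unit test earlier in this post), and an ArgumentException for characters it does not know.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical encoder used only to make the test table runnable.
static class SketchEncoder
{
    // Partial table: just the characters used by the cases below.
    private static readonly Dictionary<char, string> Codes = new Dictionary<char, string>
    {
        { 'A', ".-" }, { 'C', "-.-." }, { 'E', "." }, { 'I', ".." }, { 'K', "-.-" },
        { 'Q', "--.-" }, { 'S', "..." }, { 'T', "-" }, { 'U', "..-" }, { ' ', "" }
    };

    public static string Encode(string input)
    {
        if (input == null)
        {
            throw new ArgumentNullException("input");
        }

        var parts = new List<string>();
        foreach (char c in input.ToUpperInvariant())
        {
            string code;
            if (!Codes.TryGetValue(c, out code))
            {
                throw new ArgumentException("No Morse code defined for '" + c + "'.", "input");
            }

            parts.Add(code);
        }

        // A space character maps to "", so joining yields a double space between words.
        return string.Join(" ", parts);
    }
}

static class EncoderDataDrivenSketch
{
    // Each row is { input, expected }. A null expected value means the encoder
    // is expected to reject the input with an ArgumentException.
    private static readonly string[][] Cases =
    {
        new[] { "A QUICK TEST", ".-  --.- ..- .. -.-. -.-  - . ... -" }, // positive case
        new[] { "a quick test", ".-  --.- ..- .. -.-. -.-  - . ... -" }, // character casing
        new[] { "E", "." },                                              // shortest code
        new[] { "", "" },                                                // lower boundary
        new[] { "#", null },                                             // undefined character
    };

    // Runs one row and reports whether the observed outcome matches the expectation.
    public static bool Passes(string input, string expected)
    {
        try
        {
            return SketchEncoder.Encode(input) == expected;
        }
        catch (ArgumentException)
        {
            return expected == null;
        }
    }

    public static int CountFailures()
    {
        int failures = 0;
        foreach (string[] row in Cases)
        {
            if (!Passes(row[0], row[1]))
            {
                failures++;
            }
        }

        return failures;
    }

    static void Main()
    {
        Console.WriteLine("Failures: " + CountFailures());
    }
}
```

In a real suite the rows would come from a data source rather than a literal array, and the table would cover the full character set; the shape of the test stays the same as rows are added.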

More complex APIs, such as the MessageBox.Show method, which has several parameters with variable argument values, might benefit from additional testing techniques such as combinatorial testing.

Testing API End–To–End Scenarios

Testing a single API is usually considered unit or component level testing in the abstract levels of testing. Some people consider unit and component level testing to be “owned” by the developer. I certainly agree that unit tests must be owned by the developer, and that developers can do a much better job of component level testing. But, I also think this is a key area where API testers can collaborate more closely with developers to increase the effectiveness of the tests, the data used in the test, and even the test design (e.g. data-driven unit testing).

But, I will suggest that the integration level of testing or “testing done to show that even though the components were individually satisfactory, as demonstrated by successful passage of component tests, the combination of components are incorrect or inconsistent” is the domain of the API tester. Software applications are complex beasts that often rely on sequences of API calls interacting with databases, cloud services, or other background workers. So, although API testing rarely involves testing through the GUI, API testers must also understand how the various APIs will be used to effect various customer scenarios. The only difference is that API testers emulate these scenarios without navigating a graphical user interface.

For example, one scenario is to convert a string of text into dits and dahs and use the system’s beep to “play” the Morse code sequence over the computer’s speaker. So, this program contains a class to convert the sequences of dits and dahs into sound; the SoundGenerator class. The interface for the sound functions includes a getter and setter for the WordsPerMinute property, and the PlayMorseCode API.

        interface ISoundGenerator
        {
            void PlayMorseCode(string morseCodeString);

            int WordsPerMinute { get; set; }
        }

 

So, although we don’t yet know exactly how the developer and GUI designer will implement the GUI for this program, we can still create a test that inputs a string of alphanumeric characters, encodes the alphanumeric string into a Morse code encoded string, and then passes the string of Morse code dits and dahs as an argument to the PlayMorseCode method. This is a rather simple example of an end-to-end scenario. In more complex applications the API functions/methods would likely be compiled in one or more dynamic link libraries (DLLs), and rely on mocks, fake servers, and possibly other emulators. Of course, the oracles for this type of API testing are also more complex and generally involve checking multiple outcomes or states.
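Here is a rough sketch of that scenario, assuming a test double in place of the real sound code. The ISoundGenerator interface is from the post; RecordingSoundGenerator, the two-letter Encode helper, and the timing rule of 1200 / WordsPerMinute milliseconds per unit are assumptions for illustration. The 1 unit per dot and 3 units per dash follow the description of the Morse Code Trainer.

```csharp
using System;
using System.Collections.Generic;

// Interface as shown in the post.
interface ISoundGenerator
{
    void PlayMorseCode(string morseCodeString);
    int WordsPerMinute { get; set; }
}

// Hypothetical test double: records beep lengths instead of making sound.
class RecordingSoundGenerator : ISoundGenerator
{
    public readonly List<int> BeepMilliseconds = new List<int>();

    public int WordsPerMinute { get; set; }

    public void PlayMorseCode(string morseCodeString)
    {
        // Assumed timing rule: one unit lasts 1200 / WordsPerMinute milliseconds.
        int unit = 1200 / this.WordsPerMinute;
        foreach (char symbol in morseCodeString)
        {
            if (symbol == '.')
            {
                this.BeepMilliseconds.Add(unit);     // dot: 1 unit
            }
            else if (symbol == '-')
            {
                this.BeepMilliseconds.Add(3 * unit); // dash: 3 units
            }

            // Spaces are gaps between letters and words, so no beep is recorded.
        }
    }
}

static class EndToEndSketch
{
    // Minimal stand-in for the encoder; it only knows the letters S and O.
    public static string Encode(string text)
    {
        var codes = new Dictionary<char, string> { { 'S', "..." }, { 'O', "---" } };
        var parts = new List<string>();
        foreach (char c in text)
        {
            parts.Add(codes[c]);
        }

        return string.Join(" ", parts);
    }

    // The scenario: text in, beeps out, with no GUI involved.
    public static List<int> Play(string text, int wordsPerMinute)
    {
        var generator = new RecordingSoundGenerator { WordsPerMinute = wordsPerMinute };
        generator.PlayMorseCode(Encode(text));
        return generator.BeepMilliseconds;
    }

    static void Main()
    {
        List<int> beeps = Play("SOS", 20);
        Console.WriteLine(string.Join(",", beeps)); // 60,60,60,180,180,180,60,60,60
    }
}
```

The oracle here checks two things at once: the number of beeps and the length of each one. A test against the real SoundGenerator would need a different way to observe what was played, since this sketch sidesteps the speaker entirely.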

API testing focuses on an application’s functional capabilities, whereas testing through a GUI should focus primarily on behavior, usefulness and general ‘likeability.’

Written by Bj Rollison

November 21st, 2011 at 9:47 pm

The SDET vs STE Debate Redux: It’s only a title!

with 3 comments

Every few months the STE vs. SDET debate reemerges like the crazy outcast relative that comes to visit unexpectedly and sits around complaining about imaginary ailments, and reminiscing about how things were in the good ol’ days. We certainly don’t want to be rude to our relatives, so we tolerate their rants while watching the clock and giving subtle hints about the late hour. But, with the ridiculous ‘debate’ between STE and SDET I can be rude; drop it! It’s a baseless discussion without merit. It’s only a title!

In this previous post I explained the business reasons why Microsoft changed the title from STE to SDET. But, for some reason people commonly mistake the title for the role or job function. In the good ol’ days our internal job description for STE at level 59 included ‘must be able to debug other’s code,’ and ‘design automated tests.’ Almost all STEs hired prior to 1995 had coding questions as part of their interview and were expected to grow their ‘technical skills’ throughout their career. That was the traditional role of the STE.

As I explained in this previous post we established the title of SDET to ensure that testers at a given level in one organization in the company had comparable skills to another tester in a different organization. As part of the title change, the company decided that we needed to reestablish the base skill set of our testers to include ‘technical competence.’ Unfortunately when the career profiles were introduced some managers misinterpreted ‘technical competence’ as raw coding skills and the naive ideology of 100% automation. These same managers now complain their SDETs don’t excel at ‘bug finding’ and customer advocacy.

On my current team, the program managers are big customer advocates. They run their own set of ‘scenarios’ against new builds at least weekly. My feature area is testing private APIs on our platform. Our primary customers are the developers who consume those APIs, but we also must understand how bugs we find via our automated tests might manifest themselves and impact our customers. So, our team spends quite a bit of time also self-hosting, doing exploratory testing, and we even started a new approach that takes customer scenarios to the n-th degree that we call "day in the life" testing to help us better understand how customers might use our product throughout their busy days. Our product has 93% customer satisfaction.

So, if it’s true that the SDETs on some teams aren’t finding bugs and lack customer focus (and I suspect it is for some teams) then they hired the wrong people onto their test team. If SDETs don’t balance their technical competence with customer empathy then we have a problem; and I will say it is likely a management problem.

The testing profession is diverse and requires people to perform different roles or job functions during the development process and over the course of their career. Microsoft didn’t eradicate the STE “role”; we simply changed the title of the people we hire in our testing “roles” and reestablished the traditional expectations of people in that role.

Differentiating between STE and SDET in our industry seems nonsensical to me, and I also think this false differentiation ultimately limits our potential to positively impact our customer’s experience and advance the profession. Testers today face many challenges, and hiring great testers (regardless of the job title) is about finding people who not only have a passion and drive to help improve our customer’s experience and satisfaction, but can also solve tough technical challenges to advance the craft and help improve the company’s business.

Written by Bj Rollison

July 22nd, 2011 at 7:55 am

Winning with team work

with one comment

Last Sunday evening our summer league Monarchs hockey team had a game against the Ice Dogs. In our previous game we tied against this team so I knew this would not be an easy game. To compound things we had a short bench (10 players and a goalie); enough for 2 forward lines and 2 defense lines. It was a hard game and our team really gelled and we played one of our best games this summer season. Just like the saying ‘when the going gets tough the tough get going.’ When you have a great team of people they don’t sit around and cry like a bunch of pantywaists, play the victim card, point fingers, or incessantly complain. A good team buckles down in hard times in spite of the difficulties that might lie ahead and works together to get things done. Individual heroes need not apply. Teams don’t worship heroes; they value every person on the team.

The weekend hockey game was a good break for me. You see, we have been in ship mode for our Mango release on the Windows Phone. The hockey game was a good outlet for some pent up frustration. Every D-man had at least 2 shots on goal, and I blocked a couple of shots; one off the mask and one off my inner thigh of course where there are no pads, and yes it left a pretty good bruise. But, as they say, “pain is temporary; a win is forever.”

Seemingly against the odds, we ended up winning the hockey game 5 to 1.

Ship mode often gets a little crazy. Second guessing takes on a whole new meaning. “Did you test this?” “What about that?” “I have a situation when I do such and such, and the sun came out (remember we’re in Seattle) something bad happened. Have you seen this before?” Some people run around looking for fires, others are trying to start them.

As I get older I have learned not to react to fires as I did in my younger days. I have learned that sometimes fires aren’t really fires at all; it’s just a spark that someone is recklessly trying to fan into a flame. Sometimes there are fires that burn themselves out, but you just have to manage them in a controlled burn. And then there are the fires that have to be dealt with. Dealing with fires late in the product cycle is not fun for anyone on the team. But, a team of people is responsible for doing just that and it is seldom easy; and it happens in the ship room.

Our ship room looks at a lot of data every day throughout the product cycle to help us manage our release schedule and stay focused. In ship mode, data is scrutinized even more closely and every bug goes under the microscope. Managers must now work together to make some hard decisions about whether to take a fix. There is often intense discussion, but you will never hear anyone play the consultant card saying “it depends.” These guys in ship room have been in the game a long time, they know the risks and they know the business. Of course they know “it depends.” They don’t want a bunch of hand waving and bloviating, they need facts to make hard decisions. If you say an issue is going to adversely affect customers you had better be able to explain how customers are impacted, how many customers are impacted, how customers get into that predicament, and whether there is a potential work-around.

In the end, a team of senior managers must make hard decisions about which issues to punt and which to fix based on the information that is presented. This is never easy at any time during the product cycle, but in ship mode each issue is carefully investigated down to root cause, the fix is understood, and the impact of the fix and testing considerations are well defined before the final decision is made. Of course, many products are schedule driven, but at the forefront of every decision in our ship room is customer impact. Perhaps that is why customer satisfaction for Windows Phone 7 is at 93%, and why I am glad to be on a team that works hard to do the right thing for our customers.

Written by Bj Rollison

July 12th, 2011 at 8:51 pm

Levels of Testing

with 4 comments

The other day my friend and colleague Alan Page was asking about the levels of testing. When I teach fundamental testing concepts I often use the ‘levels of testing’ as a model to help students conceptualize different types of testing activities and potentially different stages of testing processes during the software development lifecycle (SDLC). In his book “Software Testing Techniques,” B. Beizer writes there are 3 different types of testing activities. He groups them as unit/component, integration, and system ‘levels of testing.’ The different levels are explained by Beizer as:

“A unit is the smallest testable piece of software, by which I mean it can be compiled or assembled, linked, loaded, and put under the control of a test harness or driver.”

“A component is an integrated aggregate of one or more units. A unit is a component, a component with subroutines it calls is a component, etc. By this (recursive) definition, a component can be anything from a unit to an entire system.”

“Unit testing is the testing we do to show that the unit does not satisfy its functional specification and/or that its implemented structure does not match the intended design structure.”

It’s interesting to note Beizer’s explanation of the purpose of unit testing. Perhaps this is the inspiration behind TDD concepts rediscovered 10 years later. It also is counter to the typical ideology that unit tests are intended to “prove it works.”

“Integration is a process by which components are aggregated to create larger components.”

“Integration testing is testing done to show that even though the components were individually satisfactory, as demonstrated by successful passage of component tests, the combination of components are incorrect or inconsistent.” “Integration testing is specifically aimed at exposing the problems that arise from the combination of components.”

“A system is a big component.”

“System testing is aimed at revealing bugs that cannot be attributed to components as such, to the inconsistencies between components, or to the planned interactions of components and other objects.” “System testing concerns issues and behaviors that can only be exposed by testing the entire integrated system or a major part of it.”

I often use the following visual model to illustrate the concept of ‘levels of testing.’ This model is intended to show how each level builds upon the previous level, and also shows the increasing scope of each level of testing.

[Image: Levels of testing model]

The problem with this model is that if we don’t understand the internal structure of software such as methods (or functions), classes, APIs or interfaces, forms or other UI elements, etc., it is difficult to differentiate between the levels, other than to separate the system level (the entire integrated system, or product) from the rest.

One way to explain this model is that the unit testing level tests individual methods or functions in a class. Component level testing tests public methods that call one or more methods in a class. Integration tests are targeted at testing individual APIs or combinations of APIs, treating the APIs as black boxes but usually without traversing through a user interface. And system testing tests both functional aspects and behavioral aspects of the entire product’s components together in a single build. Obviously, the scope of testing at the system level of testing is very large and includes functional testing of computational logic, non-functional testing such as performance, security, etc, and behavioral testing such as usability, look and feel, etc.

Another problem with this model is that it may appear that by focusing testing efforts at the system level (e.g. testing each build through a user interface) we would have greater coverage and implicitly cover the ‘levels’ below the system. Unfortunately this is not always the case because

  • the user interface can mask or hide some functional bugs in public APIs that are called by other developers
  • the system is large and complex and we may miss underlying functional bugs

As Beizer indicated, each ‘level’ of testing has different objectives and can help identify certain types of bugs more efficiently as compared to the other ‘levels of testing.’ In our experience at Microsoft, we have learned that hiring massive numbers of people to test at the system level with little or no unit/component or integration testing does not necessarily result in higher “quality,” and may cost more in maintenance and support due to undetected functional issues.

Another model that I also like to show comes from Agile Testing: A Practical Guide for Testers and Agile Teams by Lisa Crispin and Janet Gregory. In their chapter on automation they illustrate the “test automation pyramid” with 3 layers of automated tests.

This model also shows how automated tests benefit from the underlying automation, and each layer is targeted towards helping the tester (and developer) identify different types of issues during the SDLC. I also like this example because it shows where automated tests are most effective or provide the greatest value to developers and testers.

Ideally there should be heavy emphasis on unit/component level testing, additional investment in API level testing, and as Crispin and Gregory state, “The top tier represents what should be the smallest automation effort, because the tests generally provide the lowest ROI.” While ROI means different things to different people, I would generally agree that GUI automated tests tend to be less reliable and require much more maintenance as compared to other levels of automation.

There are also some illustrations of testing levels that show a sequential progression from unit testing to component testing to integration testing and finally to system testing. In my opinion these don’t provide much value to anyone other than process wonks who are only capable of linear thought. There are also other models of ‘levels of testing’ that folks have devised, usually to separate unique classes of system testing. Models are abstract concepts that can be used to help explain complex systems, and sometimes a little more detail in the model helps people understand the system.

But, I guess that is both the benefit and the drawback of models. Models can help explain complex concepts, but they can also be misused when single-minded individuals attempt to create rigid linear processes from an abstract model, or assume how someone actually implements a model is representative of the abstract concepts that someone attempted to present in a model.

Written by Bj Rollison

June 25th, 2011 at 11:42 am

The Second 30 Days: Getting a handle on the project

with one comment

Well, it has been 60 days on my new team. It has been a challenge going from an IC back to a management role. Time management is a huge challenge. I also am still ramping up on our feature area (I now know more about social networks than I ever thought I would). By day I find my time split between people management (team meetings, 1:1’s, etc.) and project management (triage meetings, status meetings, external partner sync-ups, etc.). At night I read through the API specs and brush up on my C++. While I come up to speed I am fortunate to lead a great team of highly competent SDETs. Someone once asked me whether, if I were ever to leave the classroom and go back to manage a team in a product group, I would hire people as smart as I was. I replied, “Well, I am not really that smart, so I would want a team of people who are much smarter than me.” And, that’s exactly the team I have!

So, what does my team do? Essentially we are responsible for integrating the social networking functionality onto the Windows Phone devices. The Windows Phone is already the “real Facebook Phone” according to Wired magazine, and also helps you manage feeds from your LinkedIn contacts. The next release will also include Twitter integrated into the People’s hub (peeps) as well so you won’t need to install and switch between multiple apps to read or send updates from your contacts on different social networks. (The photo illustrates key aspects of the Peeps hub for those of you that don’t have a Windows Phone yet).

How we test in our corner of Microsoft

Last year I wrote about the importance of functional testing at the API layer. So it is interesting that 1 year later I am leading a team that does just that. Our feature area includes the functional capabilities of the APIs that integrate social network features into the Windows Phone. 100% of our automated tests are designed to test functional scenarios at the API layer. Like many other teams at Microsoft we have daily builds of our feature branch. In addition to code reviews, our dev team runs their unit tests and a series of critical tests authored by our test team before checking in fixes and updates. Then an automated test suite is fired off in a lab on each nightly build to further validate that the code changes did not destabilize key functional areas. Next, lower priority automated tests are run in the lab, as well as stress, performance and other test suites. This is continuous integration continuously.

While the team members are highly competent coders writing tests in C++ and some C#, they are more than developers in testing roles. Although a sister team is responsible for testing the customer experience, we also spend time exploring our functional area via the user interface using an emulator or a device. This helps us develop customer empathy, but more importantly when our automated tests expose an issue we are better able to explain how that anomaly might impact our customer. We also spend a lot of time eating our own dog food and self hosting on test devices and some of us even flash our personal phones.

So, while I feel I am starting to come up to speed and beginning to contribute to the team I know I still have a lot more to learn. But, most importantly…I am having a blast!

Written by Bj Rollison

May 3rd, 2011 at 10:48 pm

You need testers because…

with 9 comments

…developers basically suck at testing and relying on developers to test your product will cost you money! Hey you pointy haired managers who think you can save money by cutting testing costs…are you listening?

Don’t get me wrong, I don’t think developers suck in their jobs; in fact, I know many really great developers (but, I also know and have seen the work of some hacks who call themselves developers). Great developers are usually pretty good at writing unit tests that test discrete functions. Some developers are even good at writing higher component or API level tests. Unit testing and API testing are valuable approaches to help identify certain types of problems in a new build after refactoring or bug fixing.

I don’t distrust developers; I think similar to testers they are concerned about the quality of their code and of the product they are working on. But, testers have a different mindset as compared to developers, and we have different perspectives. Some testers may write, or partner with developers for component level or API tests, but most testers generally focus on integration and system levels of testing. Using various approaches in our craft we often expose functionality issues as well as behavioral or usability issues that might adversely affect your customers.

For example, the other day I came across a website to make a travel reservation. When I got to the payment page I realized that I couldn’t complete it because the list of months was missing 3 months.

“Hmm…this is odd,” I thought to myself. I refreshed the page thinking there might have been a glitch in getting the list of months. Ultimately, I had to call the travel company to make reservations.

At first they didn’t believe me, then they confessed they just commissioned this new website. So, I told them that after they fire the developer and hire some testers for the next go round they should also figure out why “test” is listed as a State.

 

Or, among other things, why this dropdown list contains seemingly unrelated options, why some options are all upper case, and why one city name (scottsdale) is not capitalized. And of course our all time novice faux pas…“abc_test.”
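Problems like these are cheap to catch with even a crude data check. Here is a minimal sketch in Python; `dropdown_problems` is a hypothetical helper, the option values are made up, and the rules are deliberately naive (a substring match on “test” would also flag a legitimate entry such as “Contest”).

```python
import calendar


def dropdown_problems(options, expected=()):
    """Return a list of human-readable problems found in a dropdown's options."""
    # Anything we expected to see but did not find.
    problems = [f"missing: {item}" for item in expected if item not in options]
    for option in options:
        if "test" in option.lower():
            # Naive rule: leftover test entries usually contain the word "test".
            problems.append(f"test residue: {option}")
        elif option[:1].islower():
            problems.append(f"not capitalized: {option}")
    return problems


# English month names in the default locale; index 0 is an empty string.
MONTHS = [calendar.month_name[i] for i in range(1, 13)]

# Made-up sample data: a month list with the last three months missing.
print(dropdown_problems(MONTHS[:9], expected=MONTHS))

# Made-up sample data for a city list.
print(dropdown_problems(["Phoenix", "scottsdale", "abc_test"]))
```

A check like this does not replace a tester looking at the page, but it would keep the most embarrassing entries from ever reaching a customer.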

These are just a few of the examples of the lack of perceived quality of this site. And the company really expects customers to enter personal data into this site; especially credit card information?

Now, of course testing doesn’t guarantee the absence of bugs, but I am pretty confident that most testers would have easily found these before staining this company’s reputation.

So, the next time your company thinks it can cut testing costs by having developers do the testing…point them here and say…I sure hope the developers you hire do a better job than this.

Written by Bj Rollison

March 13th, 2011 at 2:34 pm

State Transition Testing: Thinking in Models

with 7 comments

Well, it has been quite some time since I last posted to this blog, and I apologize to my regular readers for the inconsistency. Since the December timeframe I have been a bit busy behind the scenes searching for a new direction at Microsoft, and last week I was finally able to officially announce that I will soon be a Principal Test Lead in the Windows Phone group on the Communications team. My feature area will be Social Networking. I am so excited about the change, and getting back into the “real-world.” I have already started making the transition (more mentally than physically) and I am recalling the feelings I had when I first started at Microsoft. It’s incredible!

For the past 5 weeks I have also been teaching a course on Software Testing at the UW on Monday and Wednesday evenings. Tonight was my last class with a final that culminated in the attendees writing “scripted tests” from vague requirements, running those tests followed by some exploratory testing, and finally reviewing structural coverage effectiveness and defect detection effectiveness of the testing approaches. All in all, it is a fun class.

One lesson in the class focused on state-transition testing. The focus of the lesson is to identify the important “states” the machine is in at a given time, and identify various ways a customer might navigate from one state to another (traversals). In my opinion, one of the easiest ways to present the concept of modeling states is with a setup or installer of a program with a minimum number of configuration variables. I have found that minimizing the configuration variables helps people focus on “state” rather than be distracted by the different configuration settings (combinatorial testing).

After a brief explanation and a demo, I cut them loose on a problem to solve. In this case I used a shareware application called 3rd Plan It. The objective was to identify all possible paths a customer could navigate while setup is running and the important states the machine might be in at any given moment in time during the installation process. The value of a state diagram or map is that it can help us identify paths that we might miss via exploratory testing. Also, by identifying important characteristics of each state we can better assess whether the machine is in the expected state (oracle) as we navigate the various paths towards the objective of our test (in this situation it is the installation of a program).

The state map below illustrates a model of what we might consider the “important states,” and the various ways to traverse or navigate between states. Coming up with a state transition diagram often requires us to explore the feature. But, it also seems to force us to really understand the feature we are exploring at a much deeper level, and consider paths that we might otherwise have missed. A state transition diagram also helps us think about what “state” the machine should be in at any given moment during the setup process (it is also important to understand the “state” within the appropriate context).

[Figure: machine state transition map for the setup program]
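A state model like this can also be written down as data and walked mechanically. The sketch below is in Python and rests on assumptions: the state and action names are hypothetical placeholders for a generic installer, not the actual screens of the setup program used in class, and `paths` only enumerates loop-free traversals.

```python
# Hypothetical installer model: each state maps an action to the next state.
TRANSITIONS = {
    "Welcome": {"next": "License", "cancel": "Cancelled"},
    "License": {"accept": "Destination", "back": "Welcome", "cancel": "Cancelled"},
    "Destination": {"install": "Installing", "back": "License", "cancel": "Cancelled"},
    "Installing": {"done": "Finished", "cancel": "Cancelled"},
    "Finished": {},
    "Cancelled": {},
}


def paths(model, start, goal, visited=None):
    """Yield every loop-free sequence of (action, state) steps from start to goal."""
    visited = (visited or ()) + (start,)
    if start == goal:
        yield []
        return
    for action, nxt in model[start].items():
        if nxt in visited:
            continue  # skip revisits so that "back" loops do not recurse forever
        for rest in paths(model, nxt, goal, visited):
            yield [(action, nxt)] + rest


def transitions(model):
    """Return every (state, action, next_state) edge, including 'back' edges."""
    return [(state, action, nxt)
            for state, actions in model.items()
            for action, nxt in actions.items()]


for path in paths(TRANSITIONS, "Welcome", "Cancelled"):
    print("Welcome -> " + " -> ".join(state for _, state in path))
```

Because the path enumeration skips revisited states, the “back” edges never show up in it. The `transitions` list is there so those edges still get a test each; covering every edge at least once is a reasonable minimum before exploring longer traversals by hand.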

This is not to suggest that we need to create a state transition diagram for every feature. That would be silly. But, even if we don’t create a state diagram on paper we certainly create one in our mind. I suspect that most of the time we are purposeful in our testing approach and use heuristic patterns as we design tests (that is of course unless someone is high on PCP or some other hallucinogen, is completely reckless in their test approach, and simply bangs away indiscriminately in hopes of finding bugs).

Of course, some people found 1 or 2 issues using an exploratory approach without first creating a state diagram. But, after viewing the state diagram those folks were also able to realize paths they had not considered or traversed and were able to find more issues. The single biggest epiphany that most people have after participating in this exercise is how much more they understand the feature they are modeling compared to what they thought they knew.

Then of course, when the setup is littered with problems, the uninstall functionality which is part of the setup engine has its own set of problems as well.


Often, after teaching concepts and demoing examples of situations where various testing techniques might apply I am asked how diligently we should apply these patterns of testing. Of course, functional techniques are only one way to think through a problem. Test techniques are useful to help us systematically analyze a problem, and come up with some model of what we are testing. Models are sometimes useful to help us think of the problem differently, or identify things we might otherwise miss. State transition testing is a functional technique that is useful for helping us visualize various ways a customer might use a feature, and in turn come up with various tests and potentially find more issues.

Written by Bj Rollison

February 21st, 2011 at 1:27 pm

A Source of “Real-World” Test Data for Globalization Testing

with 4 comments

I am generally not a big fan of static test data. I do know that in the proper context static test data can provide some value. Of course we should be aware of the common problems with files of static test data or (even worse) hard coded test data in a test case. Some problems with static test data include:

  • Stagnation – static test data may add some initial value, but over time simply reusing the same test data over and over in a test diminishes the value of that test. For example, retesting the same name strings in a first name input textbox is not providing any new information if those ‘static’ names worked in the previous build and the underlying functionality has not changed.
  • Contextual blindness – sometimes we have files of static test data that was identified as “problematic” in one situation (context), so we reuse the “problematic” test data regardless of the context. In 1995 I wrote a white paper on “problematic” double-byte encoded characters (DBCS) explaining why each code point was problematic in a given context. For example, a Japanese character with a 0x5C trail byte might be problematic in a filename on an ANSI based system that parsed characters by bytes instead of wide bytes. This is not true on Windows systems where the default encoding is the Unicode transformation format of UTF-16. However, some people continue to use obsolete DBCS problem characters, perhaps because they don’t fully understand the underlying contextual differences between ANSI based encodings and Unicode.
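The 0x5C case is easy to demonstrate. The Python sketch below assumes Shift-JIS as the ANSI encoding and uses the katakana character “ソ” (U+30BD) as the sample, since its Shift-JIS trail byte is 0x5C, the same value as an ASCII backslash. The file name and both helper functions are made up for illustration.

```python
# Assumption: "ソ" (U+30BD) as the sample character; its Shift-JIS
# encoding is the byte pair 0x83 0x5C, and 0x5C is also ASCII backslash.
CHAR = "\u30bd"
FILENAME = "data" + CHAR + ".txt"


def split_on_backslash_bytes(raw):
    """Naive byte-oriented parser: every 0x5C byte is a path separator."""
    return raw.split(b"\x5c")


def utf16_code_units(text):
    """Return the 16-bit code units of text as integers."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


sjis_name = FILENAME.encode("shift_jis")

# The byte-oriented parser sees a separator inside the file name.
print(split_on_backslash_bytes(sjis_name))

# Parsed as 16-bit units, no unit equals the backslash (0x005C).
print(0x005C in utf16_code_units(FILENAME))
```

The byte-oriented parser splits the file name in two, while the same name examined as UTF-16 code units contains no backslash at all. That is the contextual difference: the character is only “problematic” when something parses the encoded bytes one at a time.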

Perhaps on the opposite end of the test data spectrum is random test data. Many of you that read this blog or have heard me speak know that I am a big proponent of parameterized random test data generation. Parameterization allows us to better model our test data. I know that even parameterized random data can be crafted to be representative of real data, but it is not “real” data.
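For readers who have not seen the idea, here is a minimal sketch of what parameterized random string generation might look like in Python. The function name, its parameters, and the chosen code point ranges are all assumptions for this example, not the tool I actually use.

```python
import random


def random_string(min_len, max_len, code_point_ranges, seed=None):
    """Return (seed, text); text is drawn from the given inclusive code point ranges."""
    if seed is None:
        seed = random.randrange(2**32)
    rng = random.Random(seed)  # private generator so the seed fully determines the output
    length = rng.randint(min_len, max_len)
    chars = []
    for _ in range(length):
        low, high = rng.choice(code_point_ranges)
        chars.append(chr(rng.randint(low, high)))
    return seed, "".join(chars)


# Example parameters: ASCII letters plus the Hiragana block.
RANGES = [(0x41, 0x5A), (0x61, 0x7A), (0x3041, 0x3096)]

seed, text = random_string(1, 20, RANGES)
print(f"seed={seed} text={text!r}")
```

The parameters (length bounds and code point ranges) are what let us model the data for a given input, and returning the seed means any string that exposes a bug can be regenerated exactly by passing the same seed back in.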

But, there may be a happy medium between static test data and random test data. And, best of all it is abundantly available. One of the best sources for (especially non-English) test data comes from sources that most of us already use on a daily basis. The test data source I speak of is social networks.

I have met many wonderful people from around the world both in person and virtually, and stayed in contact with many of them. Last year while keynoting at the first software testing conference in Vietnam (VistaCon 2010) I was privileged to meet my dear friend Thuyen, who helped organize the conference. Since the conference we have stayed in contact via email and Facebook. When she posts on Facebook it is usually in Vietnamese. Since I don’t (yet) read Vietnamese I use Bing Translator to help me figure out the comment.

Last week she had an entry on her Facebook wall that began “Tối nay vô tình nghe trên TV 1 bài hát mà giai điệu…” So, I copied the entry and opened Bing Translator to translate the entry.


Many of you will quickly notice the strange anomaly in the translation. I initially thought that this service might be incrementing this numeric value for some reason, but when I changed the number value to 2 the number 2 displayed in the translated string. I tried various other numbers and quickly discovered that 6 incremented to the number 7, 8 decremented to 7, and 9 decremented to the number 3. I didn’t see a clear pattern here so I thought this might be an issue resulting from parsing a particular sequence of characters.

So, I modified different parts of the string (removed words) to narrow down the problem. I found the string “tình nghe trên TV 1 bài hát mà giai điệu” contained the problematic sequence. Removing any ‘word’ from this string displayed the translated string with a number of 1, with the exception of 1 word. Removing the word “nghe” from the above string resulted in the translation illustrated below.
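I did this narrowing by hand, but the same approach can be sketched in a few lines of Python. Both function names are hypothetical, and `still_fails` stands in for whatever check decides that the anomaly reproduced (for a translation service, that would mean submitting the shortened string and inspecting the result). The toy predicate in the usage lines is invented and has nothing to do with the real translator’s behavior.

```python
def single_word_removals(text):
    """Yield (removed_word, shortened_text) for each word in text."""
    words = text.split()
    for i, word in enumerate(words):
        yield word, " ".join(words[:i] + words[i + 1:])


def words_required_for_failure(text, still_fails):
    """Return the words whose removal makes the failure disappear."""
    return [word for word, shorter in single_word_removals(text)
            if not still_fails(shorter)]


# Toy stand-in for the real check: pretend the failure needs both of these words.
def toy_still_fails(candidate):
    words = candidate.split()
    return "alpha" in words and "gamma" in words


print(words_required_for_failure("alpha beta gamma delta", toy_still_fails))
```

Any word that does not appear in the returned list can be dropped without losing the failure, which is a quick way to shrink a long post down to the sequence that matters.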


By the way…the Google translation engine doesn’t fare much better. And, the results are different between www.google.com/ig and http://translate.google.com.

But, the purpose of this post is not to illustrate this particular bug, but to give you ideas of how we can use social network feeds in our testing. People around the world use social networks and you can find “real world” strings in various languages that you can use as test data in various contexts. Most of the time this ‘test data’ will not likely result in a bug; but sometimes it can reveal interesting issues. Best of all, strings taken from social networks are not some manufactured static or random test data. Using strings copied from social networks is about as “real world” as we can get…this is the “data” from our customers.

Written by Bj Rollison

January 23rd, 2011 at 10:14 am

Looking back…

with 2 comments

This has been a rather eventful year; both positive and negative. I have done quite a bit this past year professionally and personally, but certainly not as much as I had hoped to accomplish. As the year is winding down, my attention is focused on fulfilling my daughter’s Christmas dreams and wishes. Christmas is still a magical time for both her and me.

I don’t know if she still truly believes in Santa Claus, but at least she does a good job of pretending. Tonight we will perform our yearly ritual of putting milk and cookies on the fireplace hearth and toss some carrots out onto the back lawn for the reindeer. We won’t build a fire in that fireplace on Christmas eve because she doesn’t want Santa Claus to get burnt as he comes down the chimney. Of course, after she is tucked soundly in bed, I will eat the cookies, drink the milk, and throw the carrots into the greenbelt behind my house for the critters to munch on.

She is delighted by finding the present she asks for from Santa Claus under the tree. She only asks for one thing from Santa so “he” feels obliged to satisfy her wish. There are some smaller gifts as well (puzzles, books, American Girl doll accessories, etc.). But, it is not the presents that make this a special time for her and me. It is all the things we do leading up to Christmas day that make this a wonderful time of year for our family.

Well, I won’t blather on about Christmas, or what a special time it has become for me since my daughter was born. Instead, I would actually like to thank Michael Larsen for taking the time to write several detailed reviews of How We Test Software at Microsoft. As I read his reviews of chapter 5 and chapter 6 (the ones I wrote) he actually brought out several key points that I think were a little obscure such as, “Bj champions the use of Exploratory Testing (ET). ET is a great way to get a handle on an application, especially when a tester is new to it.” Michael also caused me to reflect on the following point in the book, “Bj argues that Exploratory Testing can be sufficient for small software projects, or software with limited distribution, or software with a limited shelf life. But that it doesn’t scale well for large-scale, complex or mission-critical applications.” In retrospect I wish I had expanded on that statement by saying that in our experiences at Microsoft ET does not scale well as a primary approach to testing, and the teams that tried to use ET as a primary approach have changed their strategy. However, ET is used throughout MS; it always has been and always will be a valued tool in any professional tester’s toolbox. Fortunately, Michael was able to read between the lines and wrote, “I get what he’s trying to say, in that Exploratory Testing techniques will not be the be all and the end all with testing (nor should it be; all tools have their right time and their right place).” Right on!

So, whether you have read the book or not, I highly recommend that you visit Michael’s blog to read the reviews for insightful thoughts on the book. I also recommend his other thoughtful posts on the topic of software testing.

Written by Bj Rollison

December 24th, 2010 at 11:53 am