Do I Really Need To Automate This Test?

For the past 2 weeks the students in my automation course at the University of Washington have been tasked with designing automated test cases through the GUI for a shareware program. In my opinion, GUI automation is the least effective approach for testing the functional or business logic of a program (assuming a well-designed architecture where the business logic code is separate from the form (GUI) and the event handler (GUI object behavior) code). However, if a tester doesn’t have access to the underlying APIs used in the application under test (AUT) and is given just a compiled application (‘GUI application’) to test, then functional testing through the GUI may be the only alternative.

GUI automation can be effective for some types of behavioral and non-functional tests such as performance and stress testing. It can also check for layout issues, such as control alignment and clipping or truncation of controls on a dialog, much more effectively than the human eye.

However, there are some behavioral tests that are more efficient to perform manually by ‘me’ the tester. For example, end-to-end user scenarios are designed to simulate a customer completing some task involving multiple features and system interactions. Sure, we could automate these types of tests, and I can even design my automated test to simulate emotions such as frustration by timing out if an event takes ‘too long’ or anger because of ‘too many’ pop-ups. (Of course, I’d have to specify ‘too long’ and ‘too many.’) But, in my opinion we shouldn’t automate things like end-to-end scenarios because automation is poor at emulating a real person. I write automated tests to provide value to ‘me’ the tester; to free up my time to test the things that are better tested by ‘me.’

There are other types of GUI test cases that I need to execute, but that shouldn’t be automated. One student wanted to automate a test that clicked the buttons on the toolbar but was having difficulty accessing the toolbar buttons on a native code application using C#. Now, in my opinion, spending time to automate a test case to ‘validate’ the toolbar buttons makes about as much sense as automating a test to validate the tab order of a dialog or to check for duplicate access key mnemonics. The question is not how we can automate ‘test cases’ for tab order, key mnemonics, or the toolbar buttons; the question is should we?

First, I explained to the student that the difficulty was due to the fact that toolbar buttons are not the same as common button controls (e.g. OK or Cancel buttons). Toolbar “buttons” are actually bitmap images that sit on a toolbar control and just look and act similar to small buttons. Next, I asked the student, “Since I know you are not testing the toolbar control itself, what is the purpose of this test; what exactly are you testing?” He replied, “To make sure it works.” Again I asked, “What exactly are you testing, what specifically are you making sure works?” Finally he replied, “To make sure the toolbar button triggers the appropriate event handler.” I thought to myself, “Great! They are starting to think about how this stuff works under the covers.” The questioning continued, “Are there other ways to trigger the same events?” The student replied, “Yes, there are menu items.” In fact, most toolbar buttons are essentially shortcuts so users don’t have to navigate dropdown menus. The example program below illustrates how toolbar buttons provide a visual cue to the user, but end up calling the same event handler as the menu item.

[Figure: menu items and toolbar buttons]

So I asked, “Since there is a menu item that calls the same apparent event as the toolbar button, do you think there are two separate event handlers for the same behavior; one for the menu item click event and another for the toolbar button click event, or do you think the menu item click event and the toolbar button click event call the same event handler?”

The answer here could depend on whether or not we are dealing with competent developers. For example, as we build out the event handlers for the UI elements I guess we could create 4 separate event handlers (2 pairs that each do the same thing) as illustrated below.

[Figure: 4 event handlers]

Competent developers would of course realize we only need 1 event handler for the ‘click’ events for the align right menu item and toolbar button, and 1 event handler for the align left menu item and toolbar button since there is no behavioral difference between clicking the menu item or clicking the toolbar button in this situation. So, our developer refactors the code to have 1 event handler for each specific behavior similar to:

[Figure: 2 event handlers]

and then updates the Click event statements for the UI elements in the form designer code so that both the menu items and the toolbar buttons call the appropriate shared event handler, as illustrated below for the toolbar buttons.

[Figure: update click events]
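The refactoring described above can be sketched in plain C#. This is only an illustration under stated assumptions: `ClickableElement`, `EditorForm`, and the handler names are hypothetical stand-ins and not the code of the example program. In a real Windows Forms application the menu item and toolbar button would be framework controls, and the subscriptions would live in the form designer code.

```csharp
using System;

// Hypothetical stand-in for a clickable UI element (menu item or toolbar button).
public class ClickableElement
{
    public event EventHandler Click;

    // Simulates the user clicking the element.
    public void PerformClick()
    {
        EventHandler handler = Click;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

// Hypothetical form that offers two ways to trigger each alignment behavior.
public class EditorForm
{
    public readonly ClickableElement AlignLeftMenuItem = new ClickableElement();
    public readonly ClickableElement AlignLeftToolbarButton = new ClickableElement();
    public readonly ClickableElement AlignRightMenuItem = new ClickableElement();
    public readonly ClickableElement AlignRightToolbarButton = new ClickableElement();

    // Records the last alignment applied so the effect is observable.
    public string Alignment = "None";

    public EditorForm()
    {
        // Both UI elements for a behavior subscribe to the same handler,
        // so there is one handler per behavior rather than one per element.
        AlignLeftMenuItem.Click += AlignLeft_Click;
        AlignLeftToolbarButton.Click += AlignLeft_Click;
        AlignRightMenuItem.Click += AlignRight_Click;
        AlignRightToolbarButton.Click += AlignRight_Click;
    }

    private void AlignLeft_Click(object sender, EventArgs e)
    {
        Alignment = "Left";
    }

    private void AlignRight_Click(object sender, EventArgs e)
    {
        Alignment = "Right";
    }
}
```

With this wiring, the only thing a toolbar-specific test can add beyond the menu item test is a check of the subscription itself, which is exactly the one-line statement a person verifies with a single click.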

But, I still wasn’t completely convinced of the purpose of his test case. So, I asked, “Are you testing the event handler, or are you testing to make sure the toolbar button “click” event calls the appropriate event handler?” To which he responded, “To make sure the toolbar button ‘click’ event calls the ‘correct’ event handler.”

“OK,” I said in a pondering sort of way, “Let me get this right. You are going to spend some amount of time to automate a test that will validate whether or not each toolbar button click event calls the appropriate event handler.” Then I proceeded to click each toolbar button on the application under test to trigger the expected behavior. Clicking the few buttons took only a few seconds. Then I looked at him and asked, “Are you sure you want to spend time automating a test to do what I just did in a few seconds? Are you sure you want to automate a test of behavior that has an extremely low probability of changing during the product development lifecycle? Are you sure you want to automate a test of something that will probably get a lot of ‘face time’ from testers, developers, beta testers, and others on the team? Are you sure you want to automate a test case that you will likely spend even more time massaging and maintaining over the product shelf-life? Or, do you think it might be a more efficient use of your time to take a few seconds and test this once per sprint cycle or milestone and let dog-fooding, beta testing, self-hosting, etc. help in ‘testing’ the behavior of those toolbar buttons?”

I suspect this is a case of “well, this is a test that I need to run at least once, so we should automate it if we can.” Certainly we need to test toolbar buttons to make sure they trigger the appropriate event handler; once, maybe once per milestone or sprint cycle. But, do I really need to automate this test? In a similar case, one tester at Microsoft said to me, “we have to constantly retest this in sustained engineering and if we don’t automate this test then we will have to hire testers to test it manually.”

Set aside the faulty logic of repeatedly retesting unchanged code, or code that is not impacted by other changes (and we have lots of tools to show us code churn and dependencies between modules that might be affected by churn), and set aside the foolish notion that automation will replace testers. I would still rather have a tester spend a few seconds each cycle testing whether a toolbar button event calls the appropriate event handler than have a tester spend hours, days, or weeks baby-sitting and massaging temperamental GUI test code.

This is not to say that all GUI automation is finicky. And this is not to say that we shouldn’t consider automating our test cases. But, we shouldn’t automate for the sake of trying to automate all our test cases, and we certainly shouldn’t automate mindlessly simple tests; especially automated tests that might require more of my time in the long run or that have little value (virtually zero probability of new information) to the overall testing effort when executed. (Just because a test is automated doesn’t mean it’s free!)

Before we develop an automated test we should really think about the test design from a “what am I REALLY testing here” perspective and then ask, “Does this really make sense to have a separate automated test case, or is this behavior or functionality being covered by other tests (manual and/or automated) sufficiently?”

Programmatically Detecting The Operating System Version (Part II)

Time is a commodity in short supply! It has been more than 2 weeks since my last post. I have not been sitting idle, but really haven’t had a lot of free time to write. In preparation for my upcoming trip to Zurich, Switzerland to give a workshop and keynote at Swiss Testing Day I was interviewed by Marco van der Spek for TESTNIEUWS.NL. The interview provides a bit more background about me and some of my perspectives on Microsoft and software testing.

Also sucking up some of my time over the past couple of weeks were ‘reorgs’ at work. Change at Microsoft is a constant. Most people get used to it after a while; others still freak out at even the slightest change. For me it mostly means shifting some priorities based on the new General Manager’s strategic vision, acclimating to a new manager and letting him know what I am working on, and generally making sure the day-to-day business on our team keeps moving forward during the transition.

Just like life at Microsoft and technology in general the Windows operating system continuously goes through changes. New versions, new service packs, Ultimate, Home, Server versions, etc. Sometimes it is hard to keep up with all the changes. And from a test automation perspective it is sometimes important to know which operating system version the test is running on. In some cases control flow in the automated test case may need to branch in order for the test to execute on different operating system versions. Branching in an automated test based on the operating system version eliminates the need to write separate test cases for variances in operating system versions.
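As a minimal sketch of that kind of branching, the helper below chooses a test path from the major version number. The class name, method name, and the two path labels are hypothetical and exist only for illustration; the version threshold assumes that NT 6.0 (Windows Vista and Windows Server 2008) is where the behavior under test diverges.

```csharp
using System;

public static class OsBranching
{
    // Hypothetical helper: returns a label for the code path the test should follow.
    public static string SelectDialogPath(Version osVersion)
    {
        if (osVersion == null)
        {
            throw new ArgumentNullException("osVersion");
        }

        // Assumption: major version 6 and later share one behavior,
        // while NT 5.x (Windows 2000, XP, Server 2003) shares another.
        return osVersion.Major >= 6 ? "VistaStyleDialog" : "LegacyDialog";
    }
}
```

A test would call `OsBranching.SelectDialogPath(Environment.OSVersion.Version)` once and branch on the result, rather than keeping two near-identical test cases. Taking the `Version` as a parameter also lets the branching logic itself be checked without access to each operating system.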

And if our test matrix includes multiple product editions (Home, Ultimate, Professional, etc.) and our automated test exposes a bug, then we certainly want to collect information about the operating system version the test was running on. This is especially important if the automated test fails on one machine, but passes on another. Sometimes the cause of the failure may be a slight difference in the machine configuration or the operating system version.
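One way to capture that information is to attach it to every failure message. The sketch below uses only the managed `System.Environment` members, so it does not report the edition; `FailureReport` and `Build` are hypothetical names, not part of any test framework.

```csharp
using System;
using System.Text;

public static class FailureReport
{
    // Hypothetical helper: builds a failure message that records where the test ran.
    public static string Build(string testName, string failureMessage)
    {
        StringBuilder report = new StringBuilder();
        report.AppendLine("Test: " + testName);
        report.AppendLine("Failure: " + failureMessage);
        report.AppendLine("Machine: " + Environment.MachineName);

        // VersionString combines platform, version, and service pack in one line.
        report.AppendLine("OS: " + Environment.OSVersion.VersionString);
        report.AppendLine("Service pack: " + Environment.OSVersion.ServicePack);
        return report.ToString();
    }
}
```

When the same test passes on one machine and fails on another, comparing the two reports is a quick first check before digging into machine configuration.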

Almost 2 years ago I described a way to get the operating system version information using the System.PlatformID enumeration, the System.Environment.OSVersion property, and the OperatingSystem class members in this blog post. But, I also mentioned some limitations, such as detecting the specific edition or product type of a Windows version. Another limitation is the difficulty in detecting whether the operating system version is Windows 7 or Windows Server 2008 R2.

I also mentioned that in order to identify a particular version and/or edition of Windows we need to invoke the Win32 GetVersionEx() function. If a test is dependent on a specific edition of Windows Vista or Windows Server 2008 then we can invoke the Win32 GetProductInfo() function. But, to use these Win32 APIs we need to use platform invocation services (P/Invoke) to call the native functions from our managed test code and marshal their data.

The code snippet below illustrates the required Win32 functions declared using the DllImport attribute in C#, constant values, wrapper methods to get common operating system information, and public properties to get the operating system version, any installed service pack information, and the operating system edition for Windows Vista, Windows Server 2008, and Windows 7.

// <copyright file="VersionInfo.cs" company="Testing Mentor">
// Copyright © 2010 All Rights Reserved. Test developers can simply copy and
// paste the code into their code, but may not reproduce or publish the code
// snippets on any web site, online service, or distribute as source on any
// media without express written permission. </copyright>

namespace TestingMentor.Snippet.OperatingSystemVersionInfo
{
  using System;
  using System.Runtime.InteropServices;

  public class WindowsVersionInfo
  {
    // Operating system name, e.g. "Windows 7" or "Windows Server 2008 R2".
    public string GetOSVersion
    {
      get { return this.GetOSVersionInfo(); }
    }

    // Installed service pack string; empty if no service pack is installed.
    public string GetServicePack
    {
      get { return this.GetServicePackInfo(); }
    }

    // Edition name; empty for operating systems earlier than Windows Vista.
    public string GetProductType
    {
      get { return this.GetProductTypeInfo(); }
    }

    private string GetOSVersionInfo()
    {
      string version = "Unsupported Version";

      // The structure size must be set before calling GetVersionEx.
      NativeMethods.OSVersionInfoEx osvi = new NativeMethods.OSVersionInfoEx();
      osvi.VersionInfoSize =
        Marshal.SizeOf(typeof(NativeMethods.OSVersionInfoEx));
      NativeMethods.GetVersionEx(ref osvi);

      if (OsviConstant.SupportedPlatform == osvi.PlatformId &&
        osvi.MajorVersion > 4)
      {
        if (osvi.MajorVersion == (int)OsviConstant.MajorVersion.NT5 &&
          osvi.MinorVersion == (int)OsviConstant.MinorVersion.Windows2000)
        {
          version = "Windows 2000";
        }

        if (osvi.MajorVersion == (int)OsviConstant.MajorVersion.NT5 &&
          osvi.MinorVersion == (int)OsviConstant.MinorVersion.WindowsXP)
        {
          version = "Windows XP";
        }

        // Version 5.2 is shared by XP Professional x64 and Server 2003.
        if (osvi.MajorVersion == (int)OsviConstant.MajorVersion.NT5 &&
          osvi.MinorVersion == (int)OsviConstant.MinorVersion.WindowsServer2003)
        {
          if (osvi.ProductType == (byte)OsviConstant.WorkStation)
          {
            version = "Windows XP Professional x64";
          }
          else
          {
            version = "Windows Server 2003";
            if (NativeMethods.GetSystemMetrics(OsviConstant.ServerR2) != 0)
            {
              version += " R2";
            }
          }
        }

        // Version 6.0 is shared by Vista and Server 2008.
        if (osvi.MajorVersion == (int)OsviConstant.MajorVersion.NT6 &&
          osvi.MinorVersion == (int)OsviConstant.MinorVersion.WindowsVista)
        {
          if (osvi.ProductType ==
            (byte)OsviConstant.WorkStation)
          {
            version = "Windows Vista";
          }
          else
          {
            version = "Windows Server 2008";
          }
        }

        // Version 6.1 is shared by Windows 7 and Server 2008 R2.
        if (osvi.MajorVersion == (int)OsviConstant.MajorVersion.NT6 &&
          osvi.MinorVersion == (int)OsviConstant.MinorVersion.Windows7)
        {
          if (osvi.ProductType == (byte)OsviConstant.WorkStation)
          {
            version = "Windows 7";
          }
          else
          {
            version = "Windows Server 2008 R2";
          }
        }
      }

      return version;
    }

    private string GetServicePackInfo()
    {
      NativeMethods.OSVersionInfoEx versionInfo = new NativeMethods.OSVersionInfoEx();
      versionInfo.VersionInfoSize = Marshal.SizeOf(typeof(NativeMethods.OSVersionInfoEx));
      NativeMethods.GetVersionEx(ref versionInfo);
      return versionInfo.CSDVersion;
    }

    private string GetProductTypeInfo()
    {
      string product = String.Empty;

      NativeMethods.OSVersionInfoEx osvi = new NativeMethods.OSVersionInfoEx();
      osvi.VersionInfoSize =
        Marshal.SizeOf(typeof(NativeMethods.OSVersionInfoEx));
      NativeMethods.GetVersionEx(ref osvi);

      // GetProductInfo is only available on Windows Vista and later.
      if (osvi.MajorVersion > 5)
      {
        uint productType = 0;

        NativeMethods.GetProductInfo(
          osvi.MajorVersion,
          osvi.MinorVersion,
          osvi.ServicePackMajor,
          osvi.ServicePackMinor,
          ref productType);

        switch (productType)
        {
          case (uint)OsviConstant.ProductInfo.Business:
            product = "Business Edition";
            break;
          case (uint)OsviConstant.ProductInfo.BusinessN:
            product = "Business N Edition";
            break;
          case (uint)OsviConstant.ProductInfo.ClusterServer:
            product = "HPC Edition";
            break;
          case (uint)OsviConstant.ProductInfo.DatacenterServer:
            product = "Server Datacenter (Full)";
            break;
          case (uint)OsviConstant.ProductInfo.DatacenterServerCore:
            product = "Server Datacenter (Core)";
            break;
          case (uint)OsviConstant.ProductInfo.DataCenterServerCoreV:
            product = "Server Datacenter without Hyper-V (Core)";
            break;
          case (uint)OsviConstant.ProductInfo.DataCenterServerV:
            product = "Server Datacenter without Hyper-V (Full)";
            break;
          case (uint)OsviConstant.ProductInfo.Enterprise:
            product = "Enterprise Edition";
            break;
          case (uint)OsviConstant.ProductInfo.EnterpriseE:
            product = "Enterprise E Edition";
            break;
          case (uint)OsviConstant.ProductInfo.EnterpriseN:
            product = "Enterprise N Edition";
            break;
          case (uint)OsviConstant.ProductInfo.EnterpriseServer:
            product = "Server Enterprise (Full)";
            break;
          case (uint)OsviConstant.ProductInfo.EnterpriseServerCore:
            product = "Server Enterprise (Core)";
            break;
          case (uint)OsviConstant.ProductInfo.EnterpriseServerCoreV:
            product = "Server Enterprise without Hyper-V (Core)";
            break;
          case (uint)OsviConstant.ProductInfo.EnterpriseServerIA64:
            product = "Server Enterprise for Itanium-based Systems";
            break;
          case (uint)OsviConstant.ProductInfo.EnterpriseServerV:
            product = "Server Enterprise without Hyper-V (Full)";
            break;
          case (uint)OsviConstant.ProductInfo.HomeBasic:
            product = "Home Basic Edition";
            break;
          case (uint)OsviConstant.ProductInfo.HomeBasicE:
            product = "Home Basic E Edition";
            break;
          case (uint)OsviConstant.ProductInfo.HomeBasicN:
            product = "Home Basic N Edition";
            break;
          case (uint)OsviConstant.ProductInfo.HomePremium:
            product = "Home Premium Edition";
            break;
          case (uint)OsviConstant.ProductInfo.HomePremiumE:
            product = "Home Premium E Edition";
            break;
          case (uint)OsviConstant.ProductInfo.HomePremiumN:
            product = "Home Premium N Edition";
            break;
          case (uint)OsviConstant.ProductInfo.HomeServer:
            product = "Home Server Edition";
            break;
          case (uint)OsviConstant.ProductInfo.HyperV:
            product = "Microsoft Hyper-V Server";
            break;
          case (uint)OsviConstant.ProductInfo.MediumBusinessServerManagement:
            product = "Windows Essential Business Server Management Server";
            break;
          case (uint)OsviConstant.ProductInfo.MediumBusinessServerMessaging:
            product = "Windows Essential Business Server Messaging Server";
            break;
          case (uint)OsviConstant.ProductInfo.MediumBusinessServerSecurity:
            product = "Windows Essential Business Server Security Server";
            break;
          case (uint)OsviConstant.ProductInfo.Professional:
            product = "Professional Edition";
            break;
          case (uint)OsviConstant.ProductInfo.ProfessionalE:
            product = "Professional E Edition";
            break;
          case (uint)OsviConstant.ProductInfo.ProfessionalN:
            product = "Professional N Edition";
            break;
          case (uint)OsviConstant.ProductInfo.ServerForSmallBusiness:
            product = "Windows Server 2008 for Windows Essential Server Solutions";
            break;
          case (uint)OsviConstant.ProductInfo.ServerForSmallBusinessV:
            product = "Windows Server 2008 without Hyper-V for Windows Essential Server Solutions";
            break;
          case (uint)OsviConstant.ProductInfo.ServerFoundation:
            product = "Server Foundation";
            break;
          case (uint)OsviConstant.ProductInfo.SmallBusinessServer:
            product = "Windows Small Business Server";
            break;
          case (uint)OsviConstant.ProductInfo.SmallBusinessServerPremium:
            product = "Windows Small Business Server Premium";
            break;
          case (uint)OsviConstant.ProductInfo.StandardServer:
            product = "Server Standard (Full)";
            break;
          case (uint)OsviConstant.ProductInfo.StandardServerCore:
            product = "Server Standard (Core)";
            break;
          case (uint)OsviConstant.ProductInfo.StandardServerCoreV:
            product = "Server Standard without Hyper-V (Core)";
            break;
          case (uint)OsviConstant.ProductInfo.StandardServerV:
            product = "Server Standard without Hyper-V (Full)";
            break;
          case (uint)OsviConstant.ProductInfo.Starter:
            product = "Starter Edition";
            break;
          case (uint)OsviConstant.ProductInfo.StarterE:
            product = "Starter E Edition";
            break;
          case (uint)OsviConstant.ProductInfo.StarterN:
            product = "Starter N Edition";
            break;
          case (uint)OsviConstant.ProductInfo.StorageEnterpriseServer:
            product = "Storage Server Enterprise";
            break;
          case (uint)OsviConstant.ProductInfo.StorageExpressServer:
            product = "Storage Server Express";
            break;
          case (uint)OsviConstant.ProductInfo.StorageStandardServer:
            product = "Storage Server Standard";
            break;
          case (uint)OsviConstant.ProductInfo.StorageWorkgroupServer:
            product = "Storage Server Workgroup";
            break;
          case (uint)OsviConstant.ProductInfo.Ultimate:
            product = "Ultimate Edition";
            break;
          case (uint)OsviConstant.ProductInfo.UltimateE:
            product = "Ultimate E Edition";
            break;
          case (uint)OsviConstant.ProductInfo.UltimateN:
            product = "Ultimate N Edition";
            break;
          case (uint)OsviConstant.ProductInfo.Undefined:
            product = "Unknown Product";
            break;
          case (uint)OsviConstant.ProductInfo.Unlicensed:
            product = "Unlicensed or Expired";
            break;
          case (uint)OsviConstant.ProductInfo.WebServer:
            product = "Web Server (Full)";
            break;
          case (uint)OsviConstant.ProductInfo.WebServerCore:
            product = "Web Server (Core)";
            break;
        }
      }

      return product;
    }
  }

// ****************************************************************************
// NEW CLASS - SHOULD BE PLACED IN SEPARATE FILE
// ****************************************************************************

  internal class OsviConstant
  {
    internal const int SupportedPlatform = 2;   // VER_PLATFORM_WIN32_NT
    internal const int ServerR2 = 89;           // SM_SERVERR2
    internal const int WorkStation = 0x00000001; // VER_NT_WORKSTATION

    private OsviConstant()
    {
    }

    internal enum MajorVersion
    {
      NT5 = 5,
      NT6 = 6
    }

    // Minor versions repeat across major versions, so always test both.
    internal enum MinorVersion
    {
      Windows2000 = 0,
      WindowsXP = 1,
      WindowsServer2003 = 2,
      WindowsVista = 0,
      Windows7 = 1
    }

    internal enum ProductInfo : uint
    {
      Business = 0x00000006,
      BusinessN = 0x00000010,
      ClusterServer = 0x00000012,
      DatacenterServer = 0x00000008,
      DatacenterServerCore = 0x0000000C,
      DataCenterServerCoreV = 0x00000027,
      DataCenterServerV = 0x00000025,
      Enterprise = 0x00000004,
      EnterpriseE = 0x00000046,
      EnterpriseN = 0x0000001B,
      EnterpriseServer = 0x0000000A,
      EnterpriseServerCore = 0x0000000E,
      EnterpriseServerCoreV = 0x00000029,
      EnterpriseServerIA64 = 0x0000000F,
      EnterpriseServerV = 0x00000026,
      HomeBasic = 0x00000002,
      HomeBasicE = 0x00000043,
      HomeBasicN = 0x00000005,
      HomePremium = 0x00000003,
      HomePremiumE = 0x00000044,
      HomePremiumN = 0x0000001A,
      HyperV = 0x0000002A,
      MediumBusinessServerManagement = 0x0000001E,
      MediumBusinessServerSecurity = 0x0000001F,
      MediumBusinessServerMessaging = 0x00000020,
      Professional = 0x00000030,
      ProfessionalE = 0x00000045,
      ProfessionalN = 0x00000031,
      ServerForSmallBusiness = 0x00000018,
      ServerForSmallBusinessV = 0x00000023,
      ServerFoundation = 0x00000021,
      SmallBusinessServer = 0x00000009,
      StandardServer = 0x00000007,
      StandardServerCore = 0x0000000D,
      StandardServerCoreV = 0x00000028,
      StandardServerV = 0x00000024,
      Starter = 0x0000000B,
      StarterE = 0x00000042,
      StarterN = 0x0000002F,
      StorageEnterpriseServer = 0x00000017,
      StorageExpressServer = 0x00000014,
      StorageStandardServer = 0x00000015,
      StorageWorkgroupServer = 0x00000016,
      Undefined = 0x00000000,
      Ultimate = 0x00000001,
      UltimateE = 0x00000047,
      UltimateN = 0x0000001C,
      WebServer = 0x00000011,
      WebServerCore = 0x0000001D,
      Unlicensed = 0xABCDABCD,
      HomeServer = 0x00000013,
      SmallBusinessServerPremium = 0x00000019,
    }
  }

// ****************************************************************************
// NEW CLASS - SHOULD BE PLACED IN SEPARATE FILE
// ****************************************************************************

  internal class NativeMethods
  {
    private NativeMethods()
    {
    }

    [DllImport("kernel32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool GetVersionEx(ref OSVersionInfoEx osvi);

    [DllImport("kernel32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool GetProductInfo(
      int osMajorVersion,
      int osMinorVersion,
      int spMajorVersion,
      int spMinorVersion,
      ref uint type);

    // GetSystemMetrics is exported by user32.dll, not kernel32.dll.
    [DllImport("user32.dll")]
    internal static extern int GetSystemMetrics(
      int index);

    // Managed layout of the Win32 OSVERSIONINFOEX structure.
    [StructLayout(LayoutKind.Sequential)]
    internal struct OSVersionInfoEx
    {
      public int VersionInfoSize;
      public int MajorVersion;
      public int MinorVersion;
      public int BuildNumber;
      public int PlatformId;
      [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 128)]
      public string CSDVersion;
      public Int16 ServicePackMajor;
      public Int16 ServicePackMinor;
      public Int16 SuiteMask;
      public byte ProductType;
      public byte Reserved;
    }
  }
}

Now, some of my readers have indicated that these code snippets are not very useful because they can’t copy them directly and put them to use. So, to help resolve that issue I have created a new section on my web site called the Code Snippet Library. This snippet is posted there along with a fully annotated (mostly FxCop and StyleCop compliant) file that is available to download, to copy into your automated test cases, or to compile into a dynamic link library (DLL).

This example doesn’t differentiate between 32-bit and 64-bit Windows operating systems, but that is not really difficult to add, and if I get enough requests I will certainly add that into the pot. If the operating system version is no longer supported by Microsoft, the GetOSVersion property will return "Unsupported Version". If no Service Packs are installed, the GetServicePack property will return an empty string. If you need to detect an unsupported operating system version use the example here.
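For the 32-bit versus 64-bit distinction, one possible sketch relies on managed properties rather than additional Win32 calls. It assumes .NET Framework 4 or later, where `Environment.Is64BitOperatingSystem` is available; the `Bitness` class and its method names are hypothetical and are not part of the snippet above.

```csharp
using System;

public static class Bitness
{
    // Reports the bitness of the operating system itself.
    // Assumption: running on .NET Framework 4 or later.
    public static string DescribeOperatingSystem()
    {
        return Environment.Is64BitOperatingSystem ? "64-bit" : "32-bit";
    }

    // Reports the bitness of the current process, which can be 32-bit
    // even when the operating system is 64-bit.
    public static string DescribeProcess()
    {
        return IntPtr.Size == 8 ? "64-bit" : "32-bit";
    }
}
```

Logging both values matters for test automation because a 32-bit test process on 64-bit Windows can see different registry and file system views than a 64-bit process.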

Scary Stories and GUI Automation

I remember going camping with my cousins as I was growing up. It was great fun despite sleeping inside a musty smelling canvas tent that retained heat so well it was more like a sauna. But, my father was adamant that the old canvas tent he bought at an Army Surplus store, which took 2 men and 4 boys to carry and assemble, was much better than those newfangled nylon tents. Nylon ripped too easily, he reasoned, but canvas will withstand anything short of a raging bear. I’m not too sure there were too many raging bears roaming the campgrounds of Maryland, Pennsylvania, and Virginia, but I was confident that even if there were I would have stood a better chance inside a canvas tent than in those paper-thin nylon tents. To this day I remember those camping trips when I smell old canvas, or perhaps it’s dried mold spores embedded in the canvas. Whichever, it takes me back to a time of fun and fond memories.

One of the best parts of the trips was sitting around the campfire at night listening to my uncle concocting some story intended to scare the wits out of us young boys. You know, the kind of stories about headless Confederate soldiers, or werewolves, vampires, or other such wicked creatures of the night. I think these campfire chats are remnants of man’s tribal roots where the elders tried to scare the hell out of the juvenile hominids to prevent them from wandering off at night. As we got older we realized that these stories were simply fictitious folktales; sort of like successful, value-add GUI automation projects.

Last week I had lunch with a colleague who wanted to talk to me about an automation project on his team that went horribly awry. As he started to tell me his story I thought, “Wait…I heard this tale before. I can tell this story because I’ve heard it so many times over and over again…just like those scary stories I heard around the campfire growing up.” The story goes something like this.

Our team bought a new tool, or built another framework, and taught everyone how to script “black box” test cases. They developed quite a number of automated test scripts, and of course everything was working very well. The scripts were running and managers were happy because the team had a lot of automated test scripts. But, the tests weren’t finding any bugs, so just out of curiosity the managers suggested a bug bash. And sure enough as the testers started exploring the project it didn’t take long for them to fill the database with bugs. The developers were shell shocked, and the managers couldn’t believe it! They couldn’t understand why the automated GUI tests weren’t finding any bugs. And so, in typical knee-jerk fashion, the managers immediately halted the GUI automation project and required every tester to embark on an exploratory testing adventure in search of bugs. Of course, the managers decided this approach was better than investing in more GUI automation, bringing an end to another GUI automation project.

Unfortunately, unlike the scary stories my father and uncles told around the campfires, stories of failed GUI automation are often true, and usually much scarier. Why are they so scary? Because I hear these sorts of stories repeated so often. It seems that we as a discipline rely on tribal knowledge, where each generation simply learns through trial and error and the folktales of our elders, and that we thrive more on hero worship of people who are often remarkably good at finding bugs by poking and prodding hour upon hour.

Now, if you haven’t caught on already, I am no big fan of GUI automation. Not because I don’t think it can be useful. In fact, I think GUI automation can provide tremendous value in some situations. But unfortunately many of the automated GUI test cases I see (especially in examples) are poorly designed, rudimentary script-lets. Many of these automated tests are nothing more than mindless automated sequences of events contrived because the testers have been told to automate, but are given no strategic vision (why) and little to no tactical direction (what and how).

With little or no direction or goals, or without an in-depth understanding of the system they are testing, the biggest problem with GUI automation is that many testers attempt to automate:

  • functional tests intended to expose computational errors in the business logic layer or in the underlying APIs
  • usability tests intended to imitate ‘me’ trying to emulate the scenarios or tasks I think the customer might do

GUI automation is probably the least effective approach for functional testing. This is not to say that GUI automation will not find functional issues in the lower logic layers. I suspect we will always find ‘functional’ bugs (e.g. boundary issues, unhandled exceptions, string parsing errors, calculation problems, etc.) while testing through the UI. But, as indicated in my previous post, well-designed software is usually built in layers and a good many of the ‘functional’ issues we find today can likely be more efficiently found through more robust unit and component (API) levels of testing.

Perhaps even more silly than trying to use GUI automation for low level functional testing is the notion of using GUI automation to emulate a ‘user’ by scripting out prescriptive sequences of actions (often with hard-coded data) that are then played over and over again. Test automation cannot and should not attempt to replicate ‘me.’ I’ve said before the purpose of automation is to provide value to me, to free up my time, to increase my efficiency, and to help me be more effective in my job; automation does not replace me. Let’s face it…we (humans) are much better at evaluating the ease of use of software and whether scenarios that represent target customer segments are intuitive for those customers.

For example, I once had a conversation explaining that GUI automation runs much faster than I can interact with software, and sometimes I make mistakes when typing in something that throws an unexpected message or takes me down a path. My colleague replied, “Well, we can slow down the automation.” Why? Why in the hell would I want to slow down my automated tests? C’mon…we all should know by now that the 100% automation (or automate all tests) mantra is a ridiculous dream and I’ve heard more plausible fantasies from people on acid trips.

So, where does GUI automation add value? In my opinion, GUI automation is probably most effective in testing UI control properties and the event handlers between the UI layer and the API layers. It is also effective in behavioral testing areas such as performance and stress. And GUI automation is also much more effective than the human eye in evaluating UI layout issues such as misaligned controls, or clipped or truncated controls on a window.

Similar to how we use different techniques to expose different categories of defects, and how we use different approaches to testing depending on the context or test objective, test automation is a useful tool in our toolbox. It is certainly not the only tool. We can do some remarkable things with automation, but we must learn where GUI automation adds value and where other approaches to testing (automated or manual) might be more effective.

API Testing: Testing in Layers

For the past few weeks my test automation class at the University of Washington has been focused on API (application programming interface) testing, or component level testing. Boris Beizer defines component level testing as “an integrated aggregate of one or more units” and that a “component can be anything from a unit to an entire system.”

This seems a bit confusing at first but then we realize that a single method (or function) may be a unit, or a component, or may (although unlikely) be the ‘system.’ A collection of methods wrapped in a library or DLL that interact to meet a functional requirement is a component. Rather than a developer having to call each method individually to achieve some repetitive functional outcome from the library, that functionality is usually exposed via a call to a single API.

In this situation the students are testing a public API in a single library (DLL) that calls several methods to produce a randomly generated string (outcome) based on parameterized property values. The interface is the single API call (and the property variables) in an automated test, so we might consider the DLL that contains this API as the ‘system’ under test based on Beizer’s definition of component. Also, since students don’t actually see the underlying code, API testing in this context is ‘black-box’ testing.

The debate of who is responsible for API or component testing is tangential to the practice. I promote API testing because even in today’s extreme programming and TDD development lifecycle models we testers are still finding way too many ‘functional’ bugs (as opposed to behavioral bugs) during the integration and system levels of testing.

Also, most (though not all) software is designed and developed in layers similar to this simplified illustration. In well-designed, more easily testable projects the business logic or logical functionality is (should be) contained in classes or libraries (DLLs), and the public methods or APIs in those libraries know nothing about the user interface, and vice versa. End user inputs at the GUI are marshaled to the business logic layer via event handlers and properties (get/set accessors in C#) in different classes.

So, it shouldn’t be a surprise to anyone that certain categories of functional issues are more easily exposed at the API level of testing as opposed to testing through the UI. In fact, sometimes the UI properties and event handler layers in one project may actually mask some bugs in the APIs which aren’t exposed until much later when someone else uses that API in a different application or feature. The value of API testing is that a lot of functional testing can be performed very early in the project cycle, and functional testing can progress while the UI layer is unstable or in flux.

I sometimes think we are stuck testing from the end-user’s perspective. It seems that we often approach testing by trying to expose both behavioral type bugs and functional type bugs by testing completely through the UI. But, if we think of testing in layers the same way many products are developed then I wonder if we could better focus our test designs to target specific categories of functional bugs earlier and concentrate on behavioral issues and end-2-end customer scenarios when we have a more stable UI?

Code Coverage: More Than Just a Number

When I was growing up I would sometimes go down into my grandfather’s basement. He had amassed a variety of tools during his lifetime and he was an excellent wood craftsman. I wasn’t allowed to touch any of the power tools, because his rule was, “if you don’t know how to use a tool properly then you shouldn’t play with it.”

Of course, I am a bit of a hard head (even back then) and one day I started playing with the wood lathe while my grandfather was upstairs. Everything seemed to be going pretty well until I pushed the chisel in too far too fast and the wood split and went flying. One piece shattered the overhead light and the other piece ricocheted off the back of my hand leaving a nice gash. I shut off the machine and ran upstairs. After my grandmother cleaned and wrapped my hand, my grandfather made me go back downstairs and clean up the mess and stood over me with a stern look of disapproval making sure I wiped up my blood trail. After that incident, I heeded my grandfather’s advice, at least in his basement shop.

Anyway, with the recent discussions of code coverage around the testing blogosphere I started thinking about what was really being discussed. The discussions (as is the case with most discussions about code coverage) were not actually about the application code coverage as a tool, but more about the code coverage metric. And more specifically the discussions were about how not to assume a high measure of code coverage implies something is well tested. Interestingly enough, 2 years ago I wrote a post illustrating how the metric can be gamed and how the code coverage measure tells us nothing about quality or test effectiveness, but also alluded to how it might be used more effectively.

I thought that how the metric is sometimes misused is mostly self-evident, but then I realized that almost every time testers start talking about code coverage the discussion tends to focus on the metric. This may seem a bit harsh, but if a person’s only contribution to a conversation about code coverage is about how the metric doesn’t relate to quality or testing effectiveness then that person should not be allowed to play with hammers, and employing more complex tools such as wheelbarrows is well beyond that person’s comprehension.

Only thinking of code coverage as a means to get some magic number is akin to thinking “how many nails can I pound with this hammer?” The metric itself is mostly irrelevant; and it is completely irrelevant if you don’t know how to interpret it in a way that helps you as a tester. Think about it this way; if we told our managers “our tests achieved 80% code coverage” some of our managers would be elated. (Of course IMHO, these types of managers are metric morons.) But, what do you think these same pointy headed number zombies would say if we told them “we ran our tests and we only missed testing 20% of the code.” I suspect they would start pacing back and forth in the room mumbling “We must run more tests, we must run more tests.”

When we stop thinking of code coverage as simply a measure, where our only use of the tool is to try to achieve some magical number, then perhaps we can start thinking about how to actually use code coverage as an effective tool to help us design tests (in under-tested or untested areas of the code), reduce potential risk, and possibly even drive quality upstream.

For example, one of my mentees is currently working on a project that uses just-in-time code coverage as a tool to evaluate how tests exercise changed code and downstream dependencies prior to checking code changes (e.g. bug fixes) back into the main tree. The initial pushback by some members of the team (including some pointy headed managers) was “code coverage doesn’t tell us about product quality” or “it’s too hard to achieve 80% code coverage” (although no such goal had been mentioned), and my personal favorite, “it’s too difficult to get everyone to measure coverage.” I reminded my mentee that the project is not about achieving some magic number, and in fact, it’s really not even about measuring at all. It’s about using the tool to discover information and to help us design additional functional tests at the API or component level that we might otherwise overlook to help prevent downstream regressions. In a nutshell, it’s about using code coverage as a defect prevention tool in this case.

Bottom line, code coverage is a tool! If you don’t know how to use it to improve your testing, well…

Boundary bug hunting; sometimes it’s almost too easy!

This past weekend I was working on a new test tool library for generating random email addresses; specifically the local address segment of an email address. I know, there are already a lot of email address generators available and this could be construed as reinventing the wheel. But I wanted to give my students in my test automation course at the University of Washington something to test at the API level. So why not have them test a test tool and learn a bit more about API level testing and how to use combinatorial analysis of the input property values to drive a data-driven automated test case? Also, having them test it means that I don’t have to!

Anyway, one of the tool’s properties is a character array of invalid characters for the specific email address system under test. Although the guidelines for email addresses are outlined in RFC 5322 and RFC 2821 many companies can place greater restrictions on the characters that are allowed for the local address component of an email address (the local address is the part before the ‘@’ character).

For example, Yahoo only allows a local address to be between 4 and 32 characters; the first character must be a letter, and the address may contain only letters, numbers, underscores, and a single period character. The Google mail local address is between 6 and 30 characters, and only allows letters, numbers, and (multiple) period characters. Hotmail and Live mail allow local address name lengths between 6 and 64 characters (64 is the maximum allowable size according to RFC 2821), and can only contain letters, numbers, periods, hyphens, and underscores.

Even from these few examples we can see a couple of things. First, although we are testing email addresses there is not a universal set of equivalent partitions that works in all contexts. We need to partition the test data into equivalent class subsets based on the specific domain we are testing. For example, the invalid class subset of characters for a Google local address includes the underscore character, but both Yahoo and Hotmail allow the underscore as a valid character in an email local address. (But, I will talk next week about the equivalent partitioning of this data…for now let’s get back to boundary testing!)
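The provider rules quoted above can be written down as data, which makes the point about context-specific partitions concrete. This is only a minimal sketch: the LocalAddressRule type and its member names are hypothetical, the limits are copied from the descriptions above and may no longer match what the providers enforce, and the sketch assumes Hotmail places no limit on the number of periods.

```csharp
using System;
using System.Linq;

// Hypothetical rule type; the limits are taken from the provider descriptions above.
public sealed class LocalAddressRule
{
    public string Provider { get; }
    public int MinLength { get; }
    public int MaxLength { get; }
    public string AllowedSymbols { get; }   // symbols allowed besides letters and digits
    public bool FirstMustBeLetter { get; }
    public int MaxPeriods { get; }

    public LocalAddressRule(string provider, int minLength, int maxLength,
                            string allowedSymbols, bool firstMustBeLetter, int maxPeriods)
    {
        Provider = provider;
        MinLength = minLength;
        MaxLength = maxLength;
        AllowedSymbols = allowedSymbols;
        FirstMustBeLetter = firstMustBeLetter;
        MaxPeriods = maxPeriods;
    }

    // Returns true when the candidate falls in the valid partition for this provider.
    public bool IsValid(string local)
    {
        if (string.IsNullOrEmpty(local)) return false;
        if (local.Length < MinLength || local.Length > MaxLength) return false;
        if (FirstMustBeLetter && !IsAsciiLetter(local[0])) return false;
        if (local.Count(c => c == '.') > MaxPeriods) return false;
        return local.All(c => IsAsciiLetter(c) || (c >= '0' && c <= '9')
                              || AllowedSymbols.IndexOf(c) >= 0);
    }

    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}

public static class EmailPartitions
{
    public static readonly LocalAddressRule Yahoo =
        new LocalAddressRule("Yahoo", 4, 32, "_.", true, 1);
    public static readonly LocalAddressRule Google =
        new LocalAddressRule("Google", 6, 30, ".", false, int.MaxValue);
    public static readonly LocalAddressRule Hotmail =
        new LocalAddressRule("Hotmail", 6, 64, ".-_", false, int.MaxValue);

    public static void Main()
    {
        // The same candidate lands in different partitions per provider.
        string candidate = "test_user1";
        foreach (LocalAddressRule rule in new[] { Yahoo, Google, Hotmail })
        {
            Console.WriteLine("{0}: {1}", rule.Provider,
                rule.IsValid(candidate) ? "valid" : "invalid");
        }
    }
}
```

With the underscore in the candidate, the Google rule rejects it while the Yahoo and Hotmail rules accept it, which is the underscore difference described above.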

Back to my story – as I was exploring each email provider’s requirements in order to determine how to partition the data I discovered an interesting problem with Yahoo. Remember, the maximum length of the local address for a Yahoo account is 32 characters.

And, the textbox control property on the web page is set to only allow a maximum input of 32 characters to prevent the user from inputting more than 32 characters. Copying a string longer than 32 characters into that textbox simply truncates the string after the 32nd character.

But, when I bump up against the maximum allowable length with some test strings the underlying program that generates suggested alternative local address names will actually produce a local address of 35 characters in length!


Now, if the software message tells me I can’t do something (like have a local address name of more than 32 characters) and then the software generates a local address name of 35 characters for me…well, I am the sort of fellow who will push that button!


And sure enough it looks like I can use it. But wait. Only one more button to push and…


What do you mean “Sorry, this appears to be an invalid Yahoo ID?” You generated an invalid local address for me! Why would Yahoo mail torment me so?

I am thinking that in the developer’s mind the user story went sort of like this:

User: “I would like this.”

System: “No you can’t have that, but you can have this.”

User: “OK”

System: “No, you can’t have that either.”

It’s funny this came up this week because I was talking with a group of senior SDETs about defect prevention versus defect detection and how 99.999% of boundary issues can be found at the unit level or API level of testing well before the UI is slapped onto the functional layer.

Testing the functional layer more thoroughly, or a code review, would most likely have revealed that this ‘magic’ number was inconsistent. Forcing the algorithm that generates suggested local addresses through boundary-condition tests would also have exposed this problem much sooner.
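To illustrate what such a boundary test at the API level might look like, here is a minimal sketch. Everything in it is hypothetical: I have no idea how Yahoo actually generates suggestions, so SuggestAlternative is a made-up stand-in that simply appends a three-digit suffix without re-checking the limit. The only point is that feeding it inputs at and near the 32-character boundary exposes the inconsistency right away.

```csharp
using System;

public static class SuggestionBoundaryCheck
{
    // The limit the UI enforces on user input, per the description above.
    public const int MaxLocalAddressLength = 32;

    // Hypothetical stand-in for a suggestion generator; NOT Yahoo's code.
    // It appends a three-digit suffix and never re-checks the length limit.
    public static string SuggestAlternative(string requested, int suffix)
    {
        return requested + suffix.ToString("D3");
    }

    // Oracle: a generated suggestion must obey the same limit as user input.
    public static bool IsWithinLimit(string localAddress)
    {
        return localAddress.Length <= MaxLocalAddressLength;
    }

    public static void Main()
    {
        // Inputs chosen so the suggestion lands just below, at, and above the limit.
        int[] inputLengths =
        {
            MaxLocalAddressLength - 4,
            MaxLocalAddressLength - 3,
            MaxLocalAddressLength
        };

        foreach (int length in inputLengths)
        {
            string requested = new string('a', length);
            string suggestion = SuggestAlternative(requested, 7);
            Console.WriteLine("input {0} chars -> suggestion {1} chars: {2}",
                length, suggestion.Length,
                IsWithinLimit(suggestion) ? "Pass" : "Fail");
        }
    }
}
```

The first two inputs pass and the 32-character input fails with a 35-character suggestion, without any UI involved.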

Now I don’t know Yahoo’s development and testing practices, and unfortunately it’s not uncommon to overlook bugs similar to this. But, I suspect that if developers rely on testers to find all their bugs, and testers primarily rely on testing through the user interface to find bugs, then we are always going to find boundary bugs post release (and that’s a good thing because it gives me something to blog about).

API Testing – Thinking Differently About the Problem

Last year the University of Washington Extension Program started running a new Software Test Automation using C# program that I designed and developed for experienced testers with little or no programming background. The program is very popular and has more than 60 people waiting for the next offering. Unfortunately, the pay is not that great so I have no intention of quitting my day job. It helps with the moorage costs for my sailboat, but the stipend I receive is not my motivation for teaching this course.

A few years ago I realized the industry would once again require software testers to have a richer understanding of the complete ‘systems’ they are testing, and also require testers to have a wider range of ‘testing’ skills beyond emulating user behavior in an attempt to expose as many bugs as possible before the software is released. I also realized there are many testers in the Seattle area who are good testers but simply lacked the coding skills necessary to design and develop automated test cases (that more and more companies are expecting from their testing staff).

So, this program is one way I can help testers in the community gain additional skills and share some ideas with my colleagues in the local community. Don’t tell the program coordinator from UW, but my real reward comes when a student tells me about how he/she was able to solve a test problem using something they learned in class. Frankly, I don’t think I am a really great teacher, but it is nice to think that in some small way I can sometimes help testers unleash their own potential to overcome challenges and succeed.

Anyway, the final project after the first 10 weeks of the course is to design automated tests of 3 simple API methods from a ‘black box’ perspective (i.e. the students design tests that call the API methods in a DLL). Each method requires one or more arguments to be passed to the method’s parameters when it is called in the automated test case, and each method returns one of three types (bool, int, or string) that has to be checked against the expected result based on the variables used in the test. The final project also introduces data-driven automation concepts. The focus of the project is to reinforce the programming concepts and skills the students learned over the previous 9 weeks and put that knowledge and skill to use in a reasonably realistic testing project.

I am a big fan of API testing. At Microsoft we do a lot of API testing, and I would venture to say that a significant portion of our test automation runs below the UI layer banging away at various APIs. If an API is broken…well it’s that whole “lipstick on a pig” thing; you might mask it for a while, but it is still a pig and eventually the lipstick wears off.

Prior to the project I try to set the stage by telling everyone that the key to data-driven testing is the test data crafted by the tester. If the test data is insufficient you potentially miss a critical error. If the data is wrong then you are likely to throw a false positive; an error or exception thrown by the test and not by the system under test (or API method in this case). If a C# method parameter takes an intrinsic data type of int (Int32) then trying to pass a string variable into the test case from a test data file to that parameter will throw an exception in the test code well before it makes the call to the API method being tested.

For example, the simplified sample test case below is testing a simple API static method ConvertValueToUnicodeChar(int value) that takes an integer value and converts it to a UTF-16 Unicode character. If the integer value is outside the UTF-16 range (0 through 65535) the method ConvertValueToUnicodeChar(int value) will throw an ArgumentOutOfRangeException.

   1: // <copyright file="simpletestcase.cs" company="TestingMentor">
   2: // Copyright © 2009 by Bj Rollison. All rights reserved.
   3: // </copyright>
   4:
   5: namespace TestingMentor.Sample
   6: {
   7:   using System;
   8:   using System.IO;
   9:   using TestingMentor.Simulation;
  10:
  11:   class TestCase
  12:   {
  13:     static void Main(string[] args)
  14:     {
  15:       int testCounter = 0;
  16:       // Read in an array of strings representing the test data.
  17:       // Of course this would likely come from a static test data file
  18:       // on a server or copied to a folder on the local machine
  19:       string[] testData = new string[]
  20:       { "90,Z",
  21:         "24798,惞",
  22:         "0,null",
  23:         "65536,Error",
  24:         "-1,Error",
  25:         "1.5,",
  26:         "xyz,xyz"
  27:       };
  28:
  29:       // Loop through each test data string
  30:       foreach (string test in testData)
  31:       {
  32:         testCounter++;
  33:         // This outer try/catch block catches invalid test data
  34:         // but allows additional tests in the testData array to run
  35:         try
  36:         {
  37:           // Parse each string into the test data and expected result
  38:           string[] testElement = test.Split(',');
  39:           string expectedResult = testElement[1];
  40:           string actualResult = String.Empty;
  41:
  42:           // Convert the string to a type int value
  43:           int value = int.Parse(testElement[0]);
  44:
  45:           // We need a way to handle int values 0 through 31 which are
  46:           // control characters, this is an example of how to deal with
  47:           // an int value of 0 which is a null character
  48:           if (expectedResult.Equals("null", StringComparison.OrdinalIgnoreCase))
  49:           {
  50:             expectedResult = '\0'.ToString();
  51:           }
  52:
  53:           // This nested try/catch block catches exceptions thrown by
  54:           // the method under test. If the method under test throws an
  55:           // exception we certainly want to test for that case!
  56:           try
  57:           {
  58:             // Call the API method under test
  59:             char result = Converter.ConvertValueToUnicodeChar(value);
  60:             actualResult = result.ToString();
  61:           }
  62:
  63:           catch (ArgumentOutOfRangeException)
  64:           {
  65:             actualResult = "Error";
  66:           }
  67:
  68:           catch (Exception)
  69:           {
  70:             // if this happens this is a failure because the documentation
  71:             // states that this method will only throw an
  72:             // ArgumentOutOfRangeException.
  73:             actualResult = "Non-specific or unexpected error thrown";
  74:           }
  75:
  76:           // Call a simple oracle and log results
  77:           if (String.Equals(actualResult, expectedResult))
  78:           {
  79:             // log pass
  80:             Console.WriteLine("{0} Pass", testCounter);
  81:           }
  82:           else
  83:           {
  84:             // log fail...of course log as much detail as possible
  85:             Console.WriteLine("{0} Fail", testCounter);
  86:           }
  87:         }
  88:
  89:         catch (FormatException)
  90:         {
  91:           // log the test data for this test as incorrect, test is skipped
  92:           Console.WriteLine("{0} Bad test data. Test skipped.", testCounter);
  93:         }
  94:       }
  95:     }
  96:   }
  97: }

Instead of reading in test data from a file I simply created a string array called testData to simulate a partial list of test data that might be contained in our csv formatted test data file. Notice that the test data on lines #25 and #26 are not valid integers. So, when these test data values are converted from strings to type int values in line #43 the int.Parse method will throw a FormatException, which is caught by the outer catch block on line #89; the data is marked as bad and the oracle is skipped. Of course, we want to test the integer values that represent the physical boundaries for a UTF-16 char in C# (which are 0 and 65535) and the values immediately above and below those values (e.g. –1, 0, 1, 65534, 65535, and 65536). Then of course, we need to determine how many samples from the population of possible input variables (integer values between 0 and 65535) we need to test to attain a reasonable degree of confidence that the API method would return the correct UTF-16 Unicode character for a given integer value. (Or, in this case, the population of test data is relatively small and we could simply run through all 65536 values because it would only take a minute or two.)
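The boundary values and the exhaustive pass over the valid range can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the Converter library is not shown in this post, so ConvertValueToUnicodeChar here is a hypothetical stand-in written only to make the sketch run on its own, and the BoundaryValues class and Around helper are names I made up for illustration.

```csharp
using System;

public static class BoundaryValues
{
    // Returns min-1, min, min+1, max-1, max, max+1 for a closed integer range.
    public static int[] Around(int min, int max)
    {
        return new[] { min - 1, min, min + 1, max - 1, max, max + 1 };
    }

    // Hypothetical stand-in for the method under test (the real one lives
    // in a library that is not shown here).
    public static char ConvertValueToUnicodeChar(int value)
    {
        if (value < char.MinValue || value > char.MaxValue)
        {
            throw new ArgumentOutOfRangeException(nameof(value));
        }

        return (char)value;
    }

    public static void Main()
    {
        // Boundary tests: -1, 0, 1, 65534, 65535, 65536.
        foreach (int value in Around(char.MinValue, char.MaxValue))
        {
            bool expectError = value < 0 || value > 65535;
            bool gotError = false;
            try
            {
                ConvertValueToUnicodeChar(value);
            }
            catch (ArgumentOutOfRangeException)
            {
                gotError = true;
            }

            Console.WriteLine("{0}: {1}", value, gotError == expectError ? "Pass" : "Fail");
        }

        // The valid population is small enough to test exhaustively.
        int failures = 0;
        for (int value = char.MinValue; value <= char.MaxValue; value++)
        {
            if (ConvertValueToUnicodeChar(value) != value)
            {
                failures++;
            }
        }

        Console.WriteLine("Exhaustive pass failures: {0}", failures);
    }
}
```

Against a real API the oracle would of course need to be independent of the code under test; here the stand-in and the oracle share the same logic, so the sketch only shows the shape of the test.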

Unfortunately, some of the test data files submitted in the final project contained invalid test data for the API method being called. In some test cases the parameter type required was a type int, but the test data read in from the file for that parameter was a real number such as 1.5, or a string such as “xyz” similar to the example above. I asked myself why would someone include these variables in a test that are being passed to a parameter of type int? The only thing I can think of is that when these testers designed their test data files, they were thinking about the problem as if they were testing the API method through a user interface. (And, in fact my suspicion was confirmed later when I asked them.)

The bottom line here is that we often times throw a lot of ‘tests’ or a lot of data at something in an attempt to trigger an unexpected error. Sometimes we are successful, and hopefully we document that information and share it with others so we can all learn. But, a lot of times it seems we can’t see the trees because of the forest and execute tests or include test data in our tests just for the sake of physical activity. I sometimes wonder whether or not it matters to think critically about the problem, analyze the situation, and design well-thought out tests, or is simply throwing stuff against the wall and seeing what sticks good enough testing?

Thinking About Critical Thinking And Test Design

Did you ever notice that when you ask someone to test something the first thing they do is to start ‘testing?’

I often see this in my classes and I ask the person, “what is the purpose of your test?” Typically the response is, “I’m testing this,” or “I’m trying to find a bug.”

Unfortunately this seems to indicate there is no or very little pre-thought that goes into the act of software testing. To some people, testing appears to be little more than simply pounding away at the keyboard and trying whatever flies into our subconscious mind as we interact with the software and declare a bug when we stumble upon unexpected behavior or see something we might disagree with.

This is why I found it especially interesting, both in my own research and in the case studies by Juha Itkonen, that for testers who were trained in formal software testing techniques or patterns there was no significant difference in terms of defect rates or coverage between pre-defined test cases and an exploratory testing approach. This is not to say that one approach to testing is preferred over the other. It is not an either-or proposition, as I explained in my post on the pesticide paradox, and there are certainly more than 2 approaches to software testing. Testing requires multiple approaches to most effectively aid us in collecting and presenting the appropriate information to the decision makers.

But, I am often puzzled that it seems we can easily think of negative or destructive tests once we have the product in hand, yet when we are designing a set of tests from the requirements the tests simply test the requirements and little else. I wonder why it is that we can think of ‘tests’ while executing other tests, but we can’t think of those same tests beforehand. Is there some limitation in our psyche that prevents us from analyzing a problem until we are actually faced with the problem (software in hand)?

I don’t think so, but I suspect there is a mental hurdle in that we sometimes feel more productive when we are interacting with software as opposed to sitting back and analyzing the problem more prior to executing well-designed test cases. (More tests doesn’t equal better testing!)

The bottom line is that if we are given a set of requirements and can only design tests that only test the requirements, then we are probably not thinking critically about how to design test cases.

Random Test Data – Credit Card Numbers

Things are winding down for the year. The Christmas lights are up on the house, my gardens are tilled and mulched for next spring, people are disappearing from the office like there is a plague, it hasn’t snowed in a while which means the mountains are mostly ice (I dislike skiing on ice), the next GSHL ice hockey league doesn’t start for a while and pickup games are few and far between (I suck at hockey but it is fun). So, what to do? Oh…I forgot Christmas shopping. I hate Christmas shopping! So, I have spent the past few idle nights refactoring the automation libraries for some of my test data generation tools after my daughter goes to bed.

One of the most popular random test data generators that I have developed so far has been a tool called CCMaker to generate random valid and invalid credit card numbers. (Sometimes I wonder why that is, but I don’t dwell on it for too long and I haven’t been interrogated by the FBI lately.) Testing forms that require a credit card has always been risky business because you certainly don’t want to use your own card. Often times developers will include a check on web forms or client apps to do a high level verification of a credit card number before sending all the data across the wire to be validated. This early or high level verification prevents flooding the pipe with bad data. So, one test we can do prior to testing the end-to-end scenario is to test to see if and how the developer is validating credit card numbers prior to submission.

As far as data goes, generating credit card numbers is fairly simple. There is a bank ID number (BIN), there is a number of digits between 12 and 19 depending on the card type, and there is a checksum. So, if we know the valid BINs for each issuing bank, the valid number of digits for each card type, and how to calculate the checksum, we can generate valid credit card numbers. (Of course this is a bit oversimplified because many credit and debit card companies are issued multiple BINs and use varying number lengths.)
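
As a rough illustration of the three pieces described above (BIN, length, checksum), here is a minimal Python sketch. It assumes the checksum is the Luhn (mod 10) algorithm, which payment cards commonly use but which this post does not name. The function names are hypothetical and this is not CCMaker's code.

```python
import random


def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for a digit string that lacks its check digit."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for index, char in enumerate(reversed(payload)):
        digit = int(char)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)


def is_luhn_valid(number: str) -> bool:
    """True if the last digit of number is the Luhn check digit of the rest."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


def make_card_number(bin_prefix: str, length: int, rng=None) -> str:
    """Build a number with the given BIN prefix and total length (assumed helper)."""
    rng = rng or random.Random()
    filler_length = length - len(bin_prefix) - 1  # reserve one digit for the checksum
    if filler_length < 0:
        raise ValueError("length is too short for this BIN")
    filler = "".join(rng.choice("0123456789") for _ in range(filler_length))
    payload = bin_prefix + filler
    return payload + luhn_check_digit(payload)


# Example: a random 15-digit number starting with 34.
print(make_card_number("34", 15))
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when a generated number exposes a bug and the test needs to be replayed.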

Testing for invalid credit card numbers should include using numbers that look close to being correct in some way but are slightly altered. For example, for the 3 defined equivalent partitions (BIN, length, checksum) there are seven possible invalid combinations (2³ – 1) we could test.

  1. Valid BIN, invalid length and valid checksum
  2. Valid BIN, valid length, and invalid checksum
  3. Valid BIN, invalid length, and invalid checksum
  4. Invalid BIN, valid length and valid checksum
  5. Invalid BIN, invalid length, and valid checksum
  6. Invalid BIN, valid length, and invalid checksum
  7. Invalid BIN, invalid length, and invalid checksum
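
The seven combinations listed above can also be enumerated mechanically. The following is a small Python sketch, with hypothetical names, that builds every valid/invalid assignment for the three partitions and drops the single all-valid case:

```python
from itertools import product

PARTITIONS = ("BIN", "length", "checksum")


def invalid_combinations():
    """Return every valid/invalid assignment except the all-valid one."""
    combos = []
    for flags in product((True, False), repeat=len(PARTITIONS)):
        if all(flags):
            continue  # skip the one combination where everything is valid
        combos.append(dict(zip(PARTITIONS, flags)))
    return combos


for number, combo in enumerate(invalid_combinations(), start=1):
    description = ", ".join(
        f"{'valid' if ok else 'invalid'} {name}" for name, ok in combo.items()
    )
    print(f"{number}. {description}")
```

The order in which the combinations are printed may differ from the list above; the set of combinations is the same.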

This doesn’t mean I run 7 tests and call it good because there are numerous invalid lengths and invalid BINs for the different card types. A common mistake when using an equivalent partition testing approach is to simply plug in values for each combination listed above and call it good. The problem is that there are several hundred BINs and 8 different valid lengths. For example, for just the Discover card there are 829 valid BIN numbers, and for the Maestro cards there are 56 combinations of BINs and card lengths ranging from 12 to 19 numbers in length. This doesn’t include the permutations of the other numbers that compose the entire card number.

The question every tester must ask him or herself every day when designing tests is: how many tests do I need to have any reasonable sense of confidence that risk is minimal and the perception of quality is high? Of course, there is no single right answer here and no magic formula, but since we can’t possibly execute every possible positive or negative test we should at least understand that ultimately testing is sampling.

For example, one strategy for positive testing might be to test every valid BIN for every valid card length for any given credit card. For American Express, I would want to test at least one number with a BIN of 34 and a card length of 15 that satisfies the checksum requirement, and at least one number with a BIN of 37 and a card length of 15 that also satisfies the checksum requirement. For a card type of Visa, I would need a minimum of 2 tests in which the BIN is 4 and the checksum requirement is satisfied: one with a card length of 13 digits and the other with a card length of 16 digits.

That probably sounds like quite a bit of testing, and tests which most likely would not produce an error unless the BIN is misidentified (e.g. instead of checking for a BIN of 5020 the BIN is incorrectly assigned as 5002), or a valid BIN is not recognized as valid because it is omitted from a list or enumeration of valid BINs for that credit card. Certainly testing of this magnitude would be expensive if done manually. But when automated using a random test data generator and a data-driven automation approach to set the random generator properties, comprehensive testing becomes a much more reasonable proposition and can significantly increase overall confidence.
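
The positive-testing strategy described above (every valid BIN crossed with every valid length) could be driven from a small data table. This is only a sketch: the table holds just the American Express and Visa values mentioned in this post, the checksum is assumed to be the Luhn (mod 10) algorithm, and all names are hypothetical rather than part of CCMaker.

```python
import random

# Values taken from the examples in this post; a real table would be much larger.
CARD_SPECS = {
    "American Express": {"bins": ["34", "37"], "lengths": [15]},
    "Visa": {"bins": ["4"], "lengths": [13, 16]},
}


def luhn_check_digit(payload):
    """Luhn (mod 10) check digit for a digit string without its check digit."""
    total = 0
    for index, char in enumerate(reversed(payload)):
        digit = int(char)
        if index % 2 == 0:  # rightmost payload digit is doubled first
            digit = digit * 2 - 9 if digit * 2 > 9 else digit * 2
        total += digit
    return str((10 - total % 10) % 10)


def generate_valid(bin_prefix, length, rng):
    """Random number with the given BIN, total length, and a correct check digit."""
    filler_length = length - len(bin_prefix) - 1
    filler = "".join(rng.choice("0123456789") for _ in range(filler_length))
    payload = bin_prefix + filler
    return payload + luhn_check_digit(payload)


def positive_cases(specs, rng):
    """Yield (card_type, bin, length, number) for every BIN/length pair."""
    for card_type, spec in specs.items():
        for bin_prefix in spec["bins"]:
            for length in spec["lengths"]:
                number = generate_valid(bin_prefix, length, rng)
                yield card_type, bin_prefix, length, number


# A fixed seed keeps the generated data repeatable between runs.
cases = list(positive_cases(CARD_SPECS, random.Random(2024)))
for case in cases:
    print(case)
```

Each generated tuple would then be fed to the form or validation routine under test; adding a card type means adding a row to the table, not writing a new test.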

This is where my CCMaker 3.1 test data generator can help by randomly generating both valid and invalid credit card numbers. The updated CCMaker test automation library has just been posted to my web site with documentation and examples. If you have any questions, or find any issues with the new library please let me know.

Evaluating Exploratory Testing

This month’s issue of Testing Experience published my article that summarizes the findings of several case studies of exploratory testing both inside and outside of Microsoft. Although some people consider me to be a harsh critic of exploratory testing, nothing could be further from the truth. When I started my career as a professional tester my approach to software testing was primarily exploratory in nature. I was focused on executing as many negative tests as I could possibly conceive of in search of the most heinous bugs I could find; and I was good at it. My criticism is not of exploratory testing as an approach; however, I do ‘question’ the claim that exploratory testing is “orders of magnitude more productive.” And, I am also critical of the argument that we don’t understand exploratory testing if we don’t conform to one notion of the concept (or buy into an ideological doctrine) because I don’t believe that there is only one ‘right’ way to perform or think about exploratory testing.

Of course, I know it is unpopular to question the claims of exploratory testing ‘experts,’ but I just happen to be one of those people who question things that are founded on anecdotal observations without any hard data to substantiate those claims. I certainly don’t have all the information, but I personally like to be able to back up my position with facts (known at the time) and several verifiable/repeatable data points so I can answer questions from a defensible position rather than trying to convince or cajole someone with my subjective opinion. (I know a lot of studies show that many Americans base their decisions on their emotional state at the time. But I learned a long time ago that you should never buy the boat you fall in love with because you will spend more time maintaining her than sailing her.) Also, it’s easier to persuade me that I might be wrong with solid, verifiable information and repeatable data than with emotional rhetoric or personal insults.

I think most people who promote exploratory testing are well intentioned and realize that exploratory testing adds value to any testing effort when used in conjunction with other testing approaches. I also think that many practitioners realize that we must not only hone our intellectual capabilities of critical thinking and logical reasoning, but also constantly build our knowledge and skills of the other approaches, methods, and techniques used in our professional trade.

At Microsoft, I can’t think of any testing group that does not use exploratory testing as part of its overall strategy. We have learned not to rely on exploratory testing as our primary approach because it simply doesn’t scale as project size and complexity increase, and it is easy for testers to focus too much on out-of-context issues in hopes of finding another bug. As one Principal Test Manager summarized, exploratory testing helps

  • flush out “low hanging fruit” (identify obvious issues very quickly)
  • provide welcomed context switching by getting folks to look at other areas of the product
  • seed new testing ideas or identify holes (which is great as long as we have a way to preserve those ideas and they are learnable by other testers)

But, of course, it was also noted that greater ‘system knowledge’ and an understanding of other various testing techniques and approaches enriched the overall effectiveness of the testers on the teams. My job as a teacher and mentor of software testing is to take really smart people who already know how to think critically about problems and provide them with the foundational knowledge of alternative techniques, methods, approaches, and the skills that are specific to the profession of software testing that will enable them to decide what approach to use depending on the context.

Like other testing approaches, exploratory testing has benefits and limitations: it is more effective at exposing certain categories of issues and less effective at exposing other types of problems. (See post on Pesticide Paradox.) And now we have researched case studies that begin to help us understand how to utilize exploratory testing as part of our overall testing strategy. Of course, further research could be done in this area, but it is very interesting that the independent studies used in the article reached similar findings and conclusions.

Anyway, I look forward to comments or feedback on the article.