Archive for the ‘Boundary Testing’ tag
I have been a little busy at work lately designing two new advanced software testing courses. One of the courses is on combinatorial testing. It focuses primarily on feature decomposition to identify input parameter interactions, modeling input variables, using the more advanced features of our PICT tool to customize the model file, generating a variety of subsets of combinatorial tests from a single model to increase test coverage, and designing oracles for data-driven automated combinatorial tests.
In this particular course I used the Page Setup dialog in Paint as a feature to model in one of the exercises. This turned out to be a good choice, because it has made me rethink how to model input variables for use in combinatorial testing.
I generally don’t advocate hard-coding specific values for input parameters that have a linear range of values. The reason should be reasonably obvious: if we have a range of values from 1 to 100, and I hard-code the values of 1, 10, 50, 75, and 100 (for a positive test), then I have absolutely zero probability of ever including the value of 42 in combination with other input parameters. To avoid hard-coding values I usually recommend creating equivalence partitions of appropriate input parameters (e.g. xsmall (1-10), small (11-25), medium (26-50), etc.). Modeling a range of input values using equivalence partitions allows me to randomly select a value from each set, increases my probability of testing with values that I might not otherwise include in a hard-coded set, and adds some degree of variability of inputs for improved test coverage of all possible input values.
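A minimal sketch of this approach in Python (the partition names and ranges are the ones from the example above; the "large" partition is my own addition to round out the 1-100 range):

```python
import random

# Equivalence partitions for an input range of 1-100, as in the example above.
# The "large" partition is assumed; the post only lists xsmall through medium.
PARTITIONS = {
    "xsmall": (1, 10),
    "small": (11, 25),
    "medium": (26, 50),
    "large": (51, 100),
}

def pick_value(partition):
    """Randomly select a representative value from the named partition."""
    low, high = PARTITIONS[partition]
    return random.randint(low, high)

# Each run draws a fresh value from every partition, so over many runs the
# tests will cover values (like 42) that a hard-coded set would never hit.
row = {name: pick_value(name) for name in PARTITIONS}
```

Feeding freshly drawn rows like this into the model on each test pass is what adds the input variability the paragraph above describes.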
However, sometimes we might want to include specific values in the model file we use to generate combinatorial tests. These specific values might include boundary conditions or other values based on historical failure indicators for that feature. In the past I suggested that we don’t necessarily have to specify boundary values in our combinatorial tests. The reason for this suggestion is based on the idea that:
- many boundary issues are single mode faults (meaning the error occurs when one parameter is set at or immediately above or below its boundary condition)
- testing for single mode errors is often easier and less costly as compared to combinatorial testing
- combinatorial testing might obfuscate the cause of a boundary bug
However, I am now convinced that
- some developers are so inept at unit testing that they completely overlook boundary conditions (if you are a developer and only write “happy path” unit tests, please read Pragmatic Unit Testing by Hunt and Thomas, and Clean Code by Robert Martin)
- we find boundary bugs so late in the test cycle that someone determines they are too obscure to fix
- we have “trained” customers to avoid boundaries (due to the number of issues and resultant failures that often occur around boundaries) so we don’t care about them anymore either
- we don’t understand the fault model and therefore don’t know how to adequately identify boundary conditions and test for them
But boundary issues are still fun to find, and they always make for good examples in training or conference demos.
Anyway, on to the bug. While ‘checking’ the ranges of the margins on Paint’s Page Setup dialog for the exercise in this course I came across an interesting anomaly. When the margins were set to values that were grossly outside the allowable margin and I pressed the OK button I got an appropriate error message. But, when I changed the Scaling variable state from Fit to: to Adjust to:, the Fit to: value changed to 0, although the textbox control was grayed out. I then realized the margin values are being used to auto-calculate the Fit to: output values.
Since the boundary value for letter size paper with a portrait orientation is 8.5 inches, I decided to set the left margin to 8.501 and the right margin to 0, and then change from Fit to: to Adjust to: to see what would happen. Interestingly enough, the Fit to: value changed to 4,294,965,329. OK…now I have just overflowed a variable (the developer only allows the user to input a maximum of 2 characters (99) in the Fit to: textboxes).
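I don't know what intermediate calculation Paint actually performs here, but the shape of that number is a classic signature: a small negative intermediate result stored in an unsigned 32-bit variable. A hypothetical sketch (the value -1967 is chosen only because it reproduces the number observed in the dialog, not because I know Paint computes it):

```python
def to_uint32(n):
    """Interpret an integer as an unsigned 32-bit value (two's complement wrap)."""
    return n & 0xFFFFFFFF

# A margin of 8.501 on an 8.5 inch page leaves a (slightly) negative printable
# area; if some scaled intermediate like -1967 lands in an unsigned 32-bit
# variable, it wraps to a huge positive number.
print(to_uint32(-1967))  # 4294965329, the value shown in the Fit to: box
```

The point of the sketch is only that any negative value, however tiny, wraps to something enormous near 2^32 when stored unsigned, which is exactly the flavor of number the grayed-out textbox displayed.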
Surely, I am thinking that a page size boundary is a standard value, and surely someone tested this. But, I decided to check the specific boundary value just to see what happens anyway. So, I set the left margin to 8.5 and the right margin to 0, change the Scaling from Fit to: to Adjust to: and…
There are many ways to expose this failure. Another fun way is to set the Scaling parameter to Adjust to: first. Next set the left margin to 8.5 and the right margin to 0 (assuming letter size paper with a portrait orientation), and click the OK button. Then, open the Page Setup dialog again and…game over!
Now, I really didn’t find this bug doing combinatorial testing. In fact, although combinatorial testing might ultimately reveal this problem (depending on the model of inputs provided to the tool that generates variable combinations), this bug was discovered during the data modeling process, while discovering where calculations were occurring on certain variables. Once I saw an output boundary anomaly caused by other input variables, I forced those input values to target the boundary conditions of the output variable I wanted to investigate further. So, while we should use failure indicators and experience to specify important values in our combinatorial tests (in conjunction with random values from the total population of possible values), I am still not thoroughly convinced that we should always include specific boundary values in our combinatorial test models, because I suspect that the process of modeling this feature for combinatorial testing would itself likely have exposed this issue.
But, in the end this is really just another example of a simple boundary bug that could have easily been found during unit testing.
If there is a bug at a boundary that doesn’t lead to an unhandled exception or security exploit should we care?
Perhaps an even more important question is why do we find so many boundary type bugs via exploratory testing when they can and should be caught earlier? Why don’t we find these types of bugs in our unit testing? Why don’t we find these types of bugs by more systematically testing the software? Maybe we do find them, and those who make the decisions to fix these types of bugs just don’t care if they are fixed because there is no severe negative impact to the user. Maybe someone just wants to give me fodder for my blog!
This week I wanted to compare the range of allowable font sizes for a simulation program I developed as an example for a magazine article that I am working on. I knew that Office applications allow a font size within the range of 1 – 1638. I thought that range might be too large for my purposes, and since I knew that Windows Notepad included a font dialog I decided to check the allowable range of font sizes in Notepad.
OK, I’ll play along. Maybe if I put in a size of 99999 and press the OK button on the dialog I will get an error message, or at least Notepad defaults to the last ‘valid’ selected font size. That might seem reasonable. But is that what happens? NO! Instead of doing something reasonable (e.g. error message, default font size) the font changes to a size of 1 (yes that is a font size 1 in the upper left corner in the image below).
I am sure that defaulting to a font size of 1 makes sense when the allowable size value overflows! Really…someone thought that was a good idea? Now I wanted to see what magical boundary value the developer decided was an acceptable font size. Since the combobox size property allowed 5 characters I immediately tried 65535. No, that also resulted in the overflow and displayed the text in a font size of 1. Next I tried 32767. Wait…32767 didn’t display the string in Notepad’s edit control at a font size of 1. Now, I am thinking the developer is using a data type of signed short for the font size variable. So, I enter 32768 expecting the value to overflow and display my string as a size 1 font again. But, no…that doesn’t happen.
Now, when I design boundary tests I generally rely on two heuristics for identifying boundary values for input or output parameters:
- Values at the extreme edges of a physical range of values
- Values at the edges of equivalence partitions of physical values
So, in these situations I ask myself, “What sort of demented developer debauchery have I now found myself in?” I can’t think of any other obvious edge values that might apply, so out of curiosity I quickly narrow down the magical value to 39321. I then ask myself, “OK…even if there were a display capable of rendering or a printer capable of printing a font of this size, what is so unique about 39321?” In hexadecimal it is 0x9999, and in binary it is 1001100110011001. OK…nothing obviously special here, but I am certain the implementation details are much more complex than a simple range of values, and at this point I really don’t care because this bug just doesn’t make sense.
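The base conversions are easy to check; this is just the arithmetic described above:

```python
# The magic font-size boundary found by narrowing down in Notepad.
n = 39321

print(hex(n))  # 0x9999
print(bin(n))  # 0b1001100110011001
```

Nothing special falls out of either representation, which is the frustrating part: the repeating 0x9 nibbles hint at some internal scaling constant, but that remains speculation.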
Maybe it’s not supposed to make sense! Maybe nobody really cares about these types of bugs!
(BTW…somebody please take the Thesaurus away from the developer…’Oblique?’ Are you serious…why not just be consistent and use the word ‘Italic?’)
This past weekend I was working on a new test tool library for generating random email addresses; specifically the local address segment of an email address. I know, there are already a lot of email address generators available, and this could be construed as reinventing the wheel. But I wanted to give my students in my test automation course at the University of Washington something to test at the API level. So why not have them test a test tool and learn a bit more about API level testing and how to use combinatorial analysis of the input property values to drive a data-driven automated test case? Also, having them test it means that I don’t have to!
Anyway, one of the tool’s properties is a character array of invalid characters for the specific email address system under test. Although the guidelines for email addresses are outlined in RFC 5322 and RFC 2821 many companies can place greater restrictions on the characters that are allowed for the local address component of an email address (the local address is the part before the ‘@’ character).
For example, Yahoo only allows a local address of between 4 and 32 characters; the first character must be a letter, and the address may contain only letters, numbers, underscores, and at most one period character. The Google mail local address is between 6 and 30 characters, and only allows letters, numbers, and (multiple) period characters. Hotmail and Live mail allow local address lengths between 6 and 64 characters (64 is the maximum allowable size according to RFC 5322), and the address can only contain letters, numbers, periods, hyphens, and underscores.
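As a sketch, those provider rules might be encoded as one regular expression per provider. These regexes are only my own reading of the rules as described above; the providers may well enforce additional restrictions or have changed the rules since:

```python
import re

# One regex per provider, encoding the rules described above (my reading of
# them at the time of writing; treat these as assumptions, not specifications).
RULES = {
    # Yahoo: 4-32 chars, starts with a letter, letters/digits/underscores,
    # at most one period.
    "yahoo": re.compile(r"^(?=.{4,32}$)[A-Za-z][A-Za-z0-9_]*(\.[A-Za-z0-9_]+)?$"),
    # Gmail: 6-30 chars, letters/digits with (multiple) periods allowed.
    "gmail": re.compile(r"^(?=.{6,30}$)[A-Za-z0-9]+(\.[A-Za-z0-9]+)*$"),
    # Hotmail/Live: 6-64 chars, letters/digits/periods/hyphens/underscores.
    "hotmail": re.compile(r"^(?=.{6,64}$)[A-Za-z0-9._-]+$"),
}

def is_valid_local(provider, local):
    """Check a local address against one provider's rules."""
    return bool(RULES[provider].match(local))
```

Even this rough encoding makes the point of the next paragraph concrete: the same string can be valid for one provider and invalid for another.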
Even from these few examples we can see a couple of things. First, although we are testing email addresses, there is not a universal set of equivalence partitions that works in all contexts. We need to partition the test data into equivalence class subsets based on the specific domain we are testing. For example, the invalid class subset of characters for a Google local address includes the underscore character, but both Yahoo and Hotmail allow the underscore as a valid character in an email local address. (But I will talk next week about the equivalence partitioning of this data…for now let’s get back to boundary testing!)
Back to my story – as I was exploring each email provider’s requirements in order to determine how to partition the data, I discovered an interesting problem with Yahoo. Remember, the maximum length of the local address for a Yahoo account is 32 characters.
And, the textbox control property on the web page is set to only allow a maximum input of 32 characters to prevent the user from inputting more than 32 characters. Copying a string longer than 32 characters into that textbox simply truncates the string after the 32nd character.
But, when I bump up against the maximum allowable length with some test strings the underlying program that generates suggested alternative local address names will actually produce a local address of 35 characters in length!
Now, if the software tells me I can’t do something (like have a local address name of more than 32 characters) and then generates a local address name of 35 characters for me…well, I am the sort of fellow who will push that button!
And sure enough it looks like I can use it. But wait. Only one more button to push and…
What do you mean “Sorry, this appears to be an invalid Yahoo ID?” You generated an invalid local address for me! Why would Yahoo mail torment me so?
I am thinking that in the developer’s mind the user story went sort of like this:
User: “I would like this.”
System: “No, you can’t have that, but you can have this.”
User: “OK, then I’ll take the one you suggested.”
System: “No, you can’t have that either.”
It’s funny this came up this week because I was talking with a group of senior SDETs about defect prevention versus defect detection and how 99.999% of boundary issues can be found at the unit level or API level of testing well before the UI is slapped onto the functional layer.
Testing the functional layer more thoroughly, or a code review, would most likely have revealed that this ‘magic’ number was inconsistent. Forcing the algorithm that generates suggested local addresses against its boundary conditions would also have exposed this problem much sooner.
Now, I don’t know Yahoo’s development and testing practices, and unfortunately it’s not uncommon to overlook bugs similar to this. But I suspect that if developers rely on testers to find all their bugs, and testers primarily rely on testing through the user interface to find bugs, then we are always going to find boundary bugs post release (and that’s a good thing because it gives me something to blog about).
Originally Published Tuesday, November 04, 2008
At a recent conference a speaker posed a problem in which a field accepted a string of characters with a maximum of 32,768 bytes, then asked the audience what values they would use for boundary testing. Immediately some of the attendees unleashed a flurry of silly wild ass guesses (SWAG) such as “32,000,” “64,000,” and, of course, what attempt at guessing would be complete without someone yelling out “how about a really large string!” One person asked whether it was bytes or characters? A reasonable question, but the speaker then began talking about double byte characters (DBCS). (Double byte is, in technological terms, a relatively antiquated character encoding technology since most modern operating systems process data as Unicode.)
So, while some folks in the audience continued to shout out various SWAGs, I was still pondering why anyone in their right mind would artificially constrain a user input to such a seemingly ridiculous magic number within the context of computer processing and programming languages. Programming languages allow specific ranges of numeric input. Most strongly typed languages such as the C family of languages have explicit built-in or intrinsic data types that include signed and unsigned ranges. For example, an unsigned short holds 2^16 values, 0 through 65,535, and a signed short also holds 2^16 values, but the range is -32,768 through +32,767. Since the speaker didn’t indicate what programming language was used in this magical field, the only logical conclusion a professional tester can rationally deduce is that 32,768 is a magic number, or in other words a “hard-coded” constant value embedded somewhere in the code.
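Python integers don't overflow, but we can model the two's complement wrap of a signed short explicitly to verify the boundary behavior of that 2^16 range:

```python
def to_int16(n):
    """Interpret an integer as a signed 16-bit value (two's complement)."""
    n &= 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n

print(to_int16(32767))  # 32767: the maximum signed short
print(to_int16(32768))  # -32768: maximum + 1 wraps to the minimum
print(to_int16(65535))  # -1: the maximum unsigned value read as signed
```

This is why 32,768 is such a suspicious constant: it is exactly one past the signed short maximum, the first value that wraps.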
Asking questions is important! But, asking a bunch of contextually-free questions or throwing out random guesses is usually not the most efficient or productive use of one’s time. Asking specific rational questions or making logical assertions based on knowledge and understanding is important, and is generally more productive; especially when testing the boundary conditions of input or output values in software. Boundary testing is a technique that focuses on linear input or output values that are fixed, or fixed-in-time and used for various computations or Boolean decisions (branching) within the software. Similar to most testing techniques boundary testing focuses on exposing one category of issues based on a very specific fault model, and is an extremely efficient systematic approach to effectively expose that particular category of issues. In particular boundary testing is useful in identifying problems with:
- improperly used relational operators
- incorrectly assigned constant values
- and computational errors that might cause an intrinsic data type to either overflow or wrap especially when casting or converting between data types (proper identification of the data type and knowledge of the minimum and maximum ranges is critical)
I previously wrote about approaches to help the tester identify potential boundary conditions, and how to design tests to adequately analyze those specific boundary values. As I previously stated, boundary testing involves the systematic analysis of a specific value. For example, a long file name on the Windows platform (both the base file name and the extension) should not exceed 255 characters. For file types that use a default 3-character extension, the most interesting boundary values are 1 character (minimum base file name length), 251 characters (maximum base file name length assuming a standard 3-character extension), and 255 characters (with or without an extension, to test what occurs with a base file name equal in length to the maximum complete file name). (Of course, if the default extension is 1 character, or 2 characters, or 4 characters, etc., then the maximum base file name without extension needs to be recalculated.) Now, let’s see why specific values are important and critical to accurately analyze boundaries.
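The arithmetic behind those boundary values can be sketched quickly; the 255-character limit is the one stated above, and the 3-character extension is the assumed default:

```python
MAX_NAME = 255  # maximum complete file name length (base + "." + extension)
EXT = "txt"     # assumed default 3-character extension

# Maximum base name length, leaving room for "." plus the extension.
max_base = MAX_NAME - len(EXT) - 1  # 251

# The interesting base file name lengths to test.
boundary_lengths = [1, 2, max_base - 1, max_base, max_base + 1, MAX_NAME]
print(boundary_lengths)  # [1, 2, 250, 251, 252, 255]
```

Recomputing `max_base` for a different default extension length is exactly the recalculation the parenthetical above calls for.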
On Windows XP I used Notepad to test file name boundaries with a default 3-character extension. Of course the minimum -1 value is an empty string, and minimum and minimum +1 are saving a file with a 1-character and 2-character file name respectively. Next I entered a base file name of 250 characters (maximum -1) and 251 characters (maximum allowed assuming a default 3-character extension), and these file names were saved to the system with the default extension. Then I entered a 252-character file name and I got the expected error message indicating the file name is too long. But what about my boundary of 255 characters maximum? (IMPORTANT – boundary values are not just at the edges of the extreme ranges of values; there can also be sub or supra boundary values within a range of values that occur at the edges of equivalence class ranges, or specific values in special or unique equivalence class subsets.) So, I wondered what would happen if I entered just a base file name of 255 characters (which is the maximum length of a file name assuming an extension is also part of that file name). Interestingly enough, on Windows XP the operating system saved a file with 255 characters, but it did not have any extension, which means that there was no application associated with the file. The same occurred with a 254-character base file name, and when I tried the maximum +1 of the overall complete file name range I was again presented with the same message I got with a 252-character base file name indicating the file name was too long.
Fortunately, the above issue was fixed in Windows Vista. But, as sometimes occurs in complex systems, one fix occasionally leads to a different (but related) issue in the same functional area, which is why regression testing is typically an effective testing strategy. So, when I ran my ‘regression tests’ on Windows Vista I quickly discovered the system would not save a file via Notepad with only a base file name of any number of characters greater than 252 characters. But, as I ran the specific boundary tests I realized something very important! When I entered a base file name of 252 characters I received the following error message.
And when I attempted the test with a base file name composed of any number of characters greater than 252 I received the following error message.
Now, those of you who are paying attention realize these 2 messages are different. Of course, in either case a file is not saved to the system, which is what I expect; however, there is a strange anomaly here. One might notice that the first message prepends the drive letter to the string of 252 characters, and the second message does not. But the important question doesn’t really have anything to do with the message text per se; in this case the professional tester should ask, “why is there an apparent conditional branch in the code that shunts control flow one way for a base file name of 252 characters and a different path for a base file name greater than 252 characters?”
Of course, if we just guessed, or tested ‘a really large string of characters,’ we might never have exposed this anomaly, which occurs only at the maximum +1 length of a base file name (assuming a default 3-character extension). Interestingly enough, if a highly skilled, technically savvy tester had designed white box tests for decision testing or path analysis, then I suspect he or she could have very easily found this anomaly with even greater efficiency and exposed it earlier in the cycle.
The point here is that boundary testing is simply not random guessing, wild speculation, or simple parlor tricks. The technique of boundary value analysis requires in-depth knowledge of what the system is doing behind the user interface, and careful analysis of system and data to accurately determine the specific boundary conditions and a rigorous analysis of linear values immediately above and below each identified specific boundary value. Testers must be able to properly identify the specific and interesting boundary values based on in-depth knowledge of the system, an understanding of what is happening beneath the user interface, and experience. Then we can perform a more systematic analysis of any identified boundary conditions and potentially increase our probability of identifying real anomalies caused by this specific fault model. Boundary value analysis is a prime example of where good enough is simply not good enough in our discipline…we must be technically spot on!
Originally Published Wednesday, February 27, 2008
This morning I installed Vista SP1 onto my laptop. I was pretty excited about this release of Vista SP1 because it includes some pretty significant performance enhancements. But, as I was preparing to teach an internal course I came across a new boundary issue. I thought, how fitting it is that as I prepare to teach another class on systematic testing techniques (including boundary value analysis) I find yet another classic boundary issue in Explorer’s list view (albeit the boundary condition is not readily apparent). Interestingly enough, the previous boundary issue I found appears to be fixed in Vista SP1; however, I had not previously run across this issue, which is also connected with how the listview repaints itself after an event.
To reproduce this issue, open a folder with several dozen files in Explorer and select Views -> List and resize the Explorer window so there are several columns of files.
Select a file in the list and press the up or down arrow key so a dotted line appears around the file name
Resize the Explorer window so that the highlighted area is about 1 pixel away from the right most edge of the Explorer window as in the example below
Press the down (or up depending on which file you have selected) arrow key, and notice how the file highlight jumps to the next file in the list as expected.
Next, resize the Explorer window so the dotted line touches the inner boundary of the window as in the example below
Now, press the down (or up depending on which file you have selected) and…
Notice, in the image above that just to the left of the frogger.def file the right most edge of the file highlight is visible, and also notice the file image has changed, and the scroll bar has jumped over to the next column.
Press the down (or up) arrow key again, and….
Seeing this behavior the first time made me think I was losing my mind! (OK…I probably am, but that is a completely different topic.) And yes, reducing the size of the window beyond the maximum boundary of the file highlight will also cause the same behavior, but the issue first occurs when the width of the file highlight >= the maximum x boundary of the Explorer listview window.
The listview in Windows Vista has some very cool features, and overall I like Vista. What I dislike are some of the simple anomalies that could be easily exposed via more detailed, systematic testing approaches and analysis of the system under test rather than simply assuming more bodies banging on keyboards for some arbitrary period of time equates to better testing.
So, some things to consider when boundary testing…
- Not all boundary conditions are easily identified from the GUI by numbers, but all boundary conditions have linear physical values that can be measured at some level
- Boundaries can change, similar to the way the boundaries changed around Kosovo: somebody moved the lines. The same thing is true in software; a boundary condition may change with human interaction, but a boundary condition is still a linear physical range between a minimum and a maximum value. (In this case the x coordinates of the Explorer window changed with human interaction, but once focus moved away from the Explorer window the size of x established a fixed boundary, at least until a user again resized the window or the window was closed.)
- Boundary conditions for one parameter are usually independent. For example, resizing the y-axis in the above scenario has no impact on this defect (unless of course the y-axis is large enough to accommodate all files in the listview without having to scroll horizontally). Boundary testing is based on the single fault assumption theory which states that a boundary issue is most likely to occur with independent parameters where the boundary variables for one parameter are analyzed while holding other parameters to nominal values. (Note: if we suspect that parameters are dependent or semi-coupled, then we should also perform combinatorial analysis testing.)
- There is a difference between boundary conditions (at least momentarily fixed, linear, measures) and threshold values. Threshold values can be altered by various influences. For example, in performance testing the point of degradation in performance can often be changed by several external influences such as increasing physical memory, cleaning and defragmenting the hard disk, modifying the software, etc.
- A detailed analysis of the system under test will reveal issues that other approaches to testing do not expose…that’s the pesticide paradox!
Originally Published Monday, October 08, 2007
I previously discussed various types of defects exposed via application of the boundary value analysis testing technique including a repaint problem, a casting problem, and a wrapping problem. While the minimum and maximum physical linear boundaries of a parameter are often easy to identify, it is surprisingly more difficult to identify boundary conditions within the minimum and maximum range especially if the tester does not adequately decompose the data. This week I will discuss another boundary defect that is often hidden below the user interface, but could be exposed using boundary value analysis testing at the unit level.
Loops are common structures in software, and (depending on the programming language) are susceptible to boundary defects. Boundary value analysis of a loop structure involves (at a minimum) bypassing the loop, iterating through the loop one time, iterating through the loop 2 times, iterating through the loop the maximum number of times and one minus the maximum number of times, and finally trying to exceed the maximum number of iterations through a loop structure by one time. (This is the min -1, min, min +1, max -1, max, and max +1 analysis of the boundary conditions.)
For example, the following method counts the number of characters in a string (actually, it counts the number of UTF-16 code units in the string). To boundary test this method we would need to bypass the loop by passing an argument of an empty string (minimum -1), then a string of one character "a" (minimum), and a string of 2 characters "ab" (minimum +1). Next, we would test the maximum range with a string of 2,147,483,646 characters (maximum -1), 2,147,483,647 characters (maximum), and 2,147,483,648 characters (maximum +1). The ToCharArray method will copy at most a number of UTF-16 characters from a string to a character array equal to the maximum of a signed 32-bit int type. So, passing this method a string of 1,073,741,824 Unicode surrogate pair characters means the actual number of UTF-16 code units will be 2,147,483,648, which will throw the out of range exception. (This happens to be a common error. Many developers assume one character == one byte or one UTF-16 code unit.)
However, it is sometimes difficult to identify the boundaries of looping structures unless the tester is familiar with the programming language and/or data types. In the above example, if the tester is not aware of Unicode encoding patterns (especially surrogate pair encoding) and simply tests the physical extreme boundary conditions using only ASCII characters the method will appear to return the correct number of characters in a string up to and including the maximum length of 2,147,483,647 characters. But, passing a string of 2,147,483,647 characters in which even one character in that string is a surrogate pair will cause the ToCharArray method to throw the out of range exception.
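The character versus code unit distinction is easy to demonstrate. This Python sketch counts Unicode characters (code points) and UTF-16 code units, the unit a .NET char array actually stores, for a string containing one surrogate-pair character:

```python
# U+1D11E (musical G clef) lies outside the Basic Multilingual Plane, so
# UTF-16 encodes it as a surrogate pair: two 16-bit code units, one character.
s = "clef: \U0001D11E"

chars = len(s)                                  # Unicode code points
utf16_units = len(s.encode("utf-16-le")) // 2   # 16-bit UTF-16 code units

print(chars, utf16_units)  # the counts differ because of the surrogate pair
```

A tester who only ever feeds ASCII into such a method will never see the two counts diverge, which is exactly why the boundary case above stays hidden.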
Occasionally it may be difficult to even identify looping structures, especially when designing tests from only a black box test design approach. For example, in Windows XP a known defect appeared to allow a device name (LPT1, CON, etc.) as the base file name if an extension was appended to the base file name component in the Filename edit control (I’ll talk more about this defect later). A Windows XP patch attempted to correct this defect; however, classic boundary analysis testing easily revealed the defect was only partially fixed, as illustrated in the steps below.
- Launch Notepad
- Select the File -> Save menu item.
- In the Filename edit control on the Save dialog enter "LPT1.txt" (without the quotes).
- Press the Save button
- Press the Yes button on the error dialog that states the file already exists and asks “Do you want to replace it?” as illustrated below.
If the patch/update is applied to the system Notepad will return an error message indicating it cannot create the file as illustrated below.
- Next, press the OK button on the error dialog
- Repeat steps 2 through 5 above.
Now, notice instead of an error message the Window title of Notepad has changed from Untitled – Notepad to LPT1 – Notepad. But, this is a reserved device name, so how can we save a file named LPT1.txt to the Windows file system? The answer is we cannot! Although the application title reads LPT1 – Notepad a file named LPT1.txt does not exist on the file system. This essentially constitutes a data loss defect because it appears to the user that they saved a file named LPT1.txt. (Yes, I am aware of the other bugs associated with reserved device names as well and shall write about them in the future.)
Now, some of you may ask how I knew to test for a looping structure with the Save dialog. Quite simply, I use a technique I refer to as the deja vu heuristic anytime I encounter an error dialog. (A heuristic is a commonsense rule, or set of rules, intended to increase the probability of solving some problem.) Anytime I encounter an error dialog I repeat exactly the same set of steps to make sure I get exactly the same error dialog. Error handling routines often employ loops and are often prone to defects, especially if some variable is initialized inside the loop structure. The deja vu heuristic is designed to analyze the minimum boundary of an error handling routine that employs a loop. The minimum -1 value is not executing the error path, the minimum boundary condition is executing the path that instantiates the error dialog, and the minimum +1 value is repeating the same steps to execute the same path (or not, in the case of a defect). In fact, anytime I execute the exact same steps of an error path and get a different result, the underlying architecture of the code is suspect and bound to contain one or more defects.
Looping structures are another common cause of boundary class defects, and this is clearly a case where visibility into the code and in-depth knowledge of data types is advantageous for the professional tester.
Originally Published Friday, September 14, 2007
I have never been really good at math. Sure, I understand basic formulas, but I rely on a calculator when I run out of fingers and toes. I am envious of people who can look at a hexadecimal or octal value and convert it to an integer value in their heads without a second thought. But I do know that when I multiply 1,073,741,824 * 1,073,741,824 the result should not equal 0! No, this isn't some perverse bug in Calculator (calc.exe), but this sort of magic math sometimes manifests itself in the outputs of various calculations. (And there is a lot of calculating going on in even simple software programs.)
Familiarity with data types, and specifically the physical linear boundaries of data types, can help expose defects or explain unexpected errors encountered while testing, especially when evaluating output variables. For example, the physical linear boundary of a type int is -2,147,483,648 through +2,147,483,647 because this data type is a signed 32-bit integer value (2^32 possible values). A value greater than +2,147,483,647 input into a control that only accepts a signed integer value will throw an overflow exception error. But what happens when we have 2 input parameters that accept type int and the result of the calculation is greater than the maximum bounded value of the data type? Well, the answer to that question depends on the data type used to store the output. For example, we should expect the result of the variable z below to be 4,294,967,294.
int x = 2147483647;
int y = 2147483647;
int z = x + y;
However, the result is actually -2! How did that happen? Since the maximum value of a signed int is 2,147,483,647, when y is added to x the number simply wraps within the physical bounds of the data type. If we added 1 to 2,147,483,647 the int value would equal -2,147,483,648. (In the above example, the number wraps past the minimum and keeps counting up to eventually equal -2.) Hopefully it is easy to understand why wrapping can be a bad thing! I am thinking some billionaire would be pretty upset if his/her bank balance was $2,147,483,647 and he/she made a deposit of, let's say, $10,000 on Monday, then wrote a check for $5 on Wednesday and the check bounced because of a negative balance of -$2,147,473,649!
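The snippet above is C-family pseudocode, but Java defines signed int overflow as exactly this two's-complement wraparound, so the behavior can be demonstrated directly (a minimal demonstration, not any real banking code):

```java
// Java defines int arithmetic as two's-complement wraparound, so adding two
// MAX_VALUEs silently wraps to -2 instead of throwing an exception.
public class Wraparound {
    public static void main(String[] args) {
        int x = 2147483647;            // Integer.MAX_VALUE
        int y = 2147483647;
        int z = x + y;                 // wraps within the 32-bit range
        System.out.println(z);         // prints -2

        long safe = (long) x + y;      // widen to 64 bits before adding
        System.out.println(safe);      // prints 4294967294, the expected result
    }
}
```

Widening one operand to a 64-bit long before the addition is the usual fix: the arithmetic is then performed in a range large enough to hold the true result.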
Historical failure indicators suggest traditional boundary type defects have a high rate of occurrence any time complex calculations act on linearly bounded input or output values. This doesn't imply that "most bugs occur at boundaries" (in fact that is quite a foolish notion). It simply means that empirical evidence has consistently shown traditional boundary defects are likely to occur at the edges of bounded linear variables, especially involving the output variables of complex calculations.
A real-life example of a wrapping-type bug can be found in Word 2007, although it is not quite as obvious as the above example. In the example below, the string is indented 5.9" to the left.
If we increase the size of the indentation to 6.0" the last character of the string wraps to the next line in the document, and this 'wrapping' is also displayed in the Preview window, as illustrated below in Figure 2.
If we increase the indentation using the numeric up/down control buttons to 6.1" or larger we get an error message indicating the indent size is too large even though the Preview window illustrates the ‘g’ character wrapping to the next line! (In fact, if we input a size of 6.001" or larger we get the error message.)
Hmm…OK…so it will wrap one character but not 2 or more? That's a little puzzling! (And yes, I know that I can append additional characters to the end of the string after wrapping in Figure 2, but that is not really relevant to the problem at hand.) Puzzling for sure, but at least the program stopped us at some point it calculated to be "too large," and didn't continue to wrap to a negative value as it did in Word 2003!
But one thing I really dislike about software is when it does something, and then gives me an error message that makes it seem like I did something wrong. For example, if we reset the indentation back to 6.0" as illustrated in Figure 1, then highlight the string, and then click the Bullets button in the toolbar, the string wraps to a vertical column of characters as illustrated in Figure 4. Next, we bring up the context menu, select the Paragraph menu item, and notice the software set the Left indentation to 6.15". So, although we couldn't manually input any number greater than 6.001", I am thinking if the software did it on its own, then it must be OK…right? But now we push the OK button on the Paragraph dialog and we get an error message that implies we did something wrong! How can that be…we didn't change the indentation value; the software set the incorrect value. (Also notice the Preview pane doesn't even look close to what is happening…I guess that's what one would call WYSIAWYG, or what you see is almost what you get.)
I understand that word wrapping and word breaking algorithms are really hard especially given the complexity of the interactions between font types, kerning, bullets and numbering styles, paper width, margins, and numerous other variables. This is just used as an example of a traditional output type boundary defect caused by incorrect calculations. This example is used here to illustrate how this category of defects can manifest various symptoms due to a single root cause, and to also provide insight to you, the professional tester, as to how to think about and expose these types of traditional boundary errors.
Originally Published Wednesday, September 05, 2007
The traditional concept of boundary testing was established as a systematic procedure to more effectively and more efficiently identify a particular category of defects. Historically, boundary value analysis has focused on bounded physical (countable) linear input and/or output variables of independent parameters, and is especially useful in programming languages that are not strongly typed (such as C). The value of using the boundary value analysis technique is early identification of:
- typographical errors of specific bounded values (especially artificial constraints on a data type)
- improper use of relational operators around a physical bounded variable
- errors in implicit or explicit casting between data types
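The second item above — a wrong relational operator around a bounded value — is the classic defect boundary value analysis targets. A minimal Java sketch (the 1-100 range is an invented example, not from any real application):

```java
// Sketch of the relational-operator defect BVA is designed to catch: the
// spec says valid inputs are 1..100 inclusive, but the buggy check uses
// < instead of <=, rejecting the maximum boundary value itself.
public class RangeCheck {
    public static boolean buggyIsValid(int n)   { return n >= 1 && n < 100; }  // off by one at max
    public static boolean correctIsValid(int n) { return n >= 1 && n <= 100; }

    public static void main(String[] args) {
        // min-1, min, min+1, max-1, max, max+1: the classic boundary probes
        for (int n : new int[] {0, 1, 2, 99, 100, 101}) {
            System.out.println(n + ": buggy=" + buggyIsValid(n)
                    + " correct=" + correctIsValid(n));
        }
    }
}
```

Note that every probe except the maximum itself passes both checks; only testing exactly at the boundary (100) distinguishes the buggy implementation from the correct one.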
The application of boundary value analysis is not easy, because identifying boundary conditions from the user interface is extremely difficult. Also, if the tester lacks an understanding of programming concepts such as data types, casting, and looping structures then identifying boundaries becomes a mysterious art rather than a professional practice.
Casting (or conversion) between data types occurs rather frequently in software. For example, a calculator may only accept integer values as inputs, but the output of a division operation may result in a decimal value. In this case the operands are converted or parsed from strings to integer values using int.Parse(operand) or Convert.ToInt32(operand) in C#. Now, in order for the division operation to display decimal values when the result is not a whole number, the operands must be cast to doubles (or floats), as in the example below.
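The same trap can be sketched in Java (the calculator operands here are hypothetical): parse the strings to int, then widen to double before dividing, or the fractional part is silently truncated:

```java
// Integer division silently truncates; casting an operand to double first
// makes the division produce the decimal result a calculator should display.
public class Divide {
    public static void main(String[] args) {
        int a = Integer.parseInt("7");   // like int.Parse(operand) in C#
        int b = Integer.parseInt("2");
        System.out.println(a / b);            // integer division: prints 3
        System.out.println((double) a / b);   // cast first: prints 3.5
    }
}
```

The cast must happen before the division; casting the already-truncated int result to double afterward would just print 3.0.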
OK…so that is the basics of casting types, now let’s examine where casting can get us into trouble by exceeding physical bounded parameters. And to find these common errors we don’t have to go too far. In fact, the Attributes feature of MSPaint program provides us with plenty of examples of boundary issues.
Notice the default unit for the Width and Height parameters is Pixels. Pixels are whole numbers. The Width and Height input parameters will accept a decimal value, but if the selected unit is Pixels the decimal will be rounded to the nearest integer value. However, selecting either the Inches or Cm radio button for Units will allow decimal inputs. This is a pretty good clue that there is some pretty serious casting of these input parameters going on below the GUI. Now, we should know that inches are larger than centimeters, and centimeters are larger than pixels. We should also know that the maximum allowable input in the Width or Height parameter is limited to 5 characters, or a maximum valid input boundary of 99999.
Now, there is nothing particularly unique or interesting about 99999, but we know that Pixels defaults to integer values, that Inches and Cm use decimal values, and that an inch is bigger than a pixel. And now, the magic math. If 29875 inches = 75882.50 centimeters, then 29876 inches should equal 75885.04 centimeters. (It doesn't matter at this point if we click the OK button on the dialog and get an error message that reads "Please enter no more than 5 characters." even though we didn't enter those 8 characters; the software did that all by itself. We might not like the fact that the software does what it tells the user he/she specifically can't do, but (although many will argue this point) the program flooding its own edit control with more than 5 characters and then displaying an error message is really superficial and generally uninteresting.) But it is a good thing we don't use the Attributes feature of MSPaint as a conversion calculator, because when we select Inches as the unit, enter 29876 in either the Width or Height parameter, and then select the Cm radio button, the value converts to 0!
What is this magical math? How can this happen when we aren't even close to the maximum allowable input of 99999 inches? The value of 75885.04 is not some magical, unique, or special value in computing. To understand what is happening we have to back up and convert from inches to pixels for a clearer picture. Select the Inches radio button in the Units group box and enter the value 29875 for either the Width or Height parameter. Now select the Pixels radio button. Notice the value is 2147474. Testers know the upper physical bound of a signed 32-bit integer is 2 billion 147 million and some change (2,147,483,647 to be exact), but this value isn't close. But what if the last 3 digits are truncated by the control? Even in that case, 29875 * 71.882 (pixels/inch) ≈ 2,147,474.7, and reading the digits above the one-thousands place still gives a value within the range of a signed 32-bit integer (-2,147,483,648 ~ +2,147,483,647). But when we enter 29876 inches and then select the Pixels radio button, the value jumps to 4294967, which are the most significant digits above the one-thousands place of the unsigned 32-bit integer range (0 ~ 4,294,967,296). This is because 29876 * 71.882 (pixels/inch) ≈ 2,147,546.6, and (again looking at the digits above the one-thousands place) this exceeds the upper physical bound of a signed 32-bit integer (2,147,483,647), resulting in an implicit cast to an unsigned 32-bit integer.
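One plausible mechanism for that jump can be sketched in Java. The assumption here is mine, not documented Paint behavior: suppose the converted size is held internally in thousandths of a pixel in a 32-bit integer, so ~71.882 px/in becomes the integer factor 71882. Then 29875 inches just fits the signed range while 29876 inches overflows it, and the same bit pattern read back as unsigned lands in the 4,294,967,xxx neighborhood:

```java
// Hypothetical reconstruction of the MSPaint conversion defect, assuming
// an internal unit of thousandths of a pixel: 29876 inches overflows a
// signed 32-bit int, and the identical bits viewed as unsigned become the
// mysterious large positive value.
public class PixelOverflow {
    public static void main(String[] args) {
        long fits     = 29875L * 71882;   // 2,147,474,750: inside signed range
        long tooBig   = 29876L * 71882;   // 2,147,546,632: past MAX_VALUE
        int  wrapped  = (int) tooBig;     // keeps the low 32 bits: negative
        long unsigned = Integer.toUnsignedLong(wrapped); // same bits, unsigned view

        System.out.println(fits);      // 2147474750
        System.out.println(wrapped);   // -2147420664
        System.out.println(unsigned);  // 2147546632
    }
}
```

Whatever the actual internal units are, the signature is the same: a value one input step past the signed 32-bit ceiling reappearing as a huge unsigned number.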
Now, simply setting the Units radio button to Inches, entering 99999 for either the Width or Height parameter, and then selecting the Cm radio button (or selecting the Pixels radio button and then going back to Inches or Cm) would reveal an error. However, a professional tester has enough working knowledge of the underlying system, including basic programming concepts, to know how and in which situations this type of defect can occur. He or she can also use that knowledge to analyze features, identify areas where this type of problem might occur, and then apply logical and rational tests that have a greater probability of exposing a defect (if one exists), or of providing other valuable information regarding the valid operation or correctness of that feature.
There are other bugs in this relatively simple feature regarding data types, and perhaps I shall expose those at a later time!
Originally Published Monday, August 20, 2007
If you like obscure bugs, then I think you’ll love this little gem!
USB flash drives are wonderful little gadgets, and I have several of them to store various files. Tonight I was using a flash drive to move some files around between machines when I noticed a slight problem. It seems when I deleted one of the files on the flash drive all the other files in file list disappeared! But, wait…I know I only had one file highlighted, so what happened to the other files? Fortunately, a quick press of the F5 function key refreshed the window and my remaining files reappeared.
This defect manifests itself in various ways, but here is the easy way to reproduce it:
- Insert your favorite flash drive and open an explorer window.
- Resize the explorer window’s horizontal axis to a point where the tree list pane just begins to collapse
(The size of the tree list pane is not especially critical to this problem, but this step makes it easier to reproduce the defect)
- Right click in the file list pane and select New -> Text Document from the context menu
- Rename the file with a long file name (string length must exceed the width of the file list pane)
- Press the F5 function key to refresh the window
(Notice the file name extends beyond the window rather than being truncated by the ellipsis)
- Right click in the file list pane and select New -> Text Document from the context menu
- Abracadabra! No files! (Also notice there is no scrollbar.)
- Press the F5 function key again…and abracadabra….the file(s) reappear!
As noted above there are multiple ways to reproduce this particular bug, but the root cause is the same regardless of whether the files preexist, you highlight multiple files, you highlight and delete a file, etc.
It does not require a flash drive to replicate this defect. But, I was using a flash drive when I encountered this problem and you can probably imagine my initial reaction (somewhere between surprise and horror) when the remaining files on the flash drive seemed to have disappeared…lost forever!
Now, this is not an especially nasty bug and it is rather obscure. But if you have a lot of files with long file names, and you don’t maximize explorer views then I bet your F5 function key might get a good workout simply because there appear to be so many ways to reproduce this problem.
First, Windows taught us the 3 finger salute. And now with Windows Vista (which I still think is way cool) we have a 1 finger (F5 function key) salute!
Have fun playing with this one!
Originally Published Wednesday, March 07, 2007
Over the past 3 days I have learned more about Mp3 file encoding and decoding than I have since the technology was introduced. I don't spend time downloading files from the Internet to burn CDs, and I don't own an iPod, an Mp3 player, or a digital video recorder. So, prior to this I hadn't really paid attention to this technology and was quite ignorant of the various tools available and their capabilities. But, I must say it is pretty fascinating from a technology standpoint even though I am not an audiophile or videophile.
I still disagree with Pradeep's assertion regarding boundary testing and the notion of no fixed boundaries, but respect Pradeep's expertise in the area of Mp3 technology. An Aussie gentleman by the name of Dean Harding pointed out my incorrect assumption regarding bitrate encodings and explained the LAME encoder does allow a freeformat option in "expert" mode to produce a fixed bitrate in one kilobit increments between 8 kb/s and 640 kb/s. (Thanks for serving up the pie, Dean.) However, of the 30+ common decoders I looked at, I discovered only 4 that supported freeformatted Mp3 files, even when the encoded bitrate is less than 320 kb/s. Only one decoder (WinAmp MAD) is capable of decoding files above 600 kb/s.
So (other than me having to eat a big helping of humble pie), where does that leave us in the specific debate about boundary testing, and Pradeep's question "As a tester have you ever seen a boundary?" To that I shall adamantly reply "yes," there are specific boundary conditions in software. Some are easy to find, some are not so easy. A tester's ability to correctly identify a boundary value is heavily influenced by his/her in-depth domain and 'system' knowledge. For example, using the knowledge of Mp3 encodings I have learned over the past 3 days, let's go back and review what tests I would design based on Pradeep's original description of the audio decoder that played an Mp3 file within the range of 24 kb/s to 196 kb/s.
Since 196 kb/s is not a standard Mp3 encoding supported by ISO standards let’s assume the Mp3 player used either a Cdex, LAME, I3dec, or WinAmp MAD decoder. Using this as a reference, and some recently acquired domain knowledge I would design a set of initial tests using the following sample test data (files encoded with the specified criteria).
- 23, 24, 25 kb/s – the specified minimum value and the minimum -1 and minimum +1 values, to analyze relational operators used to artificially constrain the encoding range to a low end of 24 kb/s.
- 195, 196, 197 kb/s – the specified maximum value and the maximum -1 and maximum +1 values, to analyze relational operators used to artificially constrain the encoding range to a high end of 196 kb/s.
- 16 kb/s – the next ISO standard encoding bitrate below the specified minimum; although the decoder does not support a file encoded at 23 kb/s (the min -1 value), I would still want to check the next lower standard value.
- 224 kb/s – the next ISO standard encoding bitrate above the specified maximum (same reason as explained above).
- 32, 40, 48, 56, 64, 80, 96, 112, 128, 144 (see the next item), 160, 176, 192 kb/s – the typical ISO standard Mp3 encodings within the specified range. We should assume all of these must work properly because there is a high probability of decoding files using these bitrates, and since there are not many of them, test each one.
- 143, 144, 145 kb/s – the 144 kb/s bitrate seems to be an interesting value that "sticks out" more than the others, so I may also want to analyze the values around it for any other anomalies.
- Generate several randomly encoded files in the following ranges (between 24 kb/s and 127 kb/s), (between 128 kb/s and 143 kb/s), and (between 145 kb/s and 196 kb/s) to gain confidence the decoder can decode non-standard encoded files within the specified range without having to test all 174 or so possibilities.
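The test set above can be assembled mechanically. This is a sketch in Java for the hypothetical 24-196 kb/s decoder; the ISO rate list and sub-ranges come from the text, and the random picks stand in for the non-standard encodings:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Assembles the boundary-focused test bitrates for a decoder specified to
// handle 24-196 kb/s: min/max probes, the nearest ISO standard rates just
// outside the range, every ISO standard rate inside it, probes around the
// interesting 144 kb/s value, and a few random non-standard rates.
public class BitrateTests {
    static List<Integer> buildTests(Random r) {
        List<Integer> tests = new ArrayList<>();
        Collections.addAll(tests, 23, 24, 25);              // around the specified minimum
        Collections.addAll(tests, 195, 196, 197);           // around the specified maximum
        Collections.addAll(tests, 16, 224);                 // nearest ISO rates outside the range
        Collections.addAll(tests, 32, 40, 48, 56, 64, 80,   // ISO standard rates inside the range
                96, 112, 128, 144, 160, 176, 192);
        Collections.addAll(tests, 143, 145);                // around the interesting 144 kb/s
        tests.add(24 + r.nextInt(104));                     // random non-standard rate in 24-127
        tests.add(128 + r.nextInt(16));                     // random non-standard rate in 128-143
        tests.add(145 + r.nextInt(52));                     // random non-standard rate in 145-196
        return tests;
    }

    public static void main(String[] args) {
        System.out.println(buildTests(new Random()));
    }
}
```

Keeping the random picks separate from the fixed boundary probes preserves the variability argument made earlier for equivalence partitions: each run exercises values a hard-coded list would never include.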
These are not the only tests I would execute; however, they would be the first set of tests I would design and execute to make sure the code at least does what it is supposed to do. Any failure in the above cases means our basic program functionality has some serious flaws. Once I established the program does what it is supposed to do (including handling expected errors gracefully), then I would begin exploring other possibilities including rigorous falsification/negative testing.
Pradeep indicated a file encoded at 96 and 128 kb/s crashed the system. These are not boundary conditions (unless the developer did something totally unreasonable), and unfortunately, since we can only assume files encoded above and below 96 and 128 kb/s played correctly, we will never really know the cause of this problem (unless Pradeep did some root cause analysis and will share those findings). However, a failure with 128 kb/s is really a red flag to me because this happens to be the most prevalent bitrate for encoding Mp3 files. As a tester I would really want to know why unit testing or build acceptance testing, etc., didn't at least hit the most probable encoding format (the happy path) before throwing crap code over the wall for Pradeep to test.
I hope the reader takes away a few lessons from all this (besides the obvious one of not going off half-cocked, especially if you lack expertise in the specific context (e.g. Mp3 encodings)). For example:
- In-depth knowledge of the domain space (including the data set, how the data is encoded, and how the data set can and cannot be manipulated in code both correctly and incorrectly), industry standards, and how the domain space interacts with the system are critical for greater test effectiveness
- The less we know about the domain, the data set, and the system interaction, the less effective is our application of specific techniques to identify specific classes of defects
- Boundary testing is simply one technique (a systematic procedure to solve one type of complex problem). The boundary value analysis technique is designed to identify a specific class of defects involving incorrectly specified constant values, incorrect use of data types or casting between data types, artificially constrained data types, and incorrect usage of relational operators. It is not effective for identifying other classes of defects.
- Boundary conditions do not exist only at the extreme physical ranges; there can also be multiple boundary conditions within the overall range. (A good example is the Unicode repertoire. The Unicode BMP spans from U+0000 to U+FFFF, but within this range there are several important boundaries one must take into consideration when using Unicode data, depending on the purpose of the test and the application under test (e.g. the private use area, the surrogate pair area, etc.).)
- Understanding how to decompose the data set into equivalence class partition subsets exposes boundary conditions we might not otherwise consider
- There is a great deal of detail in the code that can expose interesting information for a tester
- When talking technology, be as specific and precise as possible to avoid ambiguity
- And perhaps most importantly, one technique or one approach to testing is not sufficient. As testers we must gather and learn to use a great variety of skills and knowledge and approach the problem from multiple perspectives to be most effective in our roles.
OK…now time to get some really bitter coffee to wash down that humble pie!