Archive for the ‘Globalization Testing’ tag
A Source of “Real-World” Test Data for Globalization Testing
I am generally not a big fan of static test data. I do know that in the proper context static test data can provide some value. Of course we should be aware of the common problems with files of static test data or (even worse) hard coded test data in a test case. Some problems with static test data include:
- Stagnation – static test data may add some initial value, but over time simply reusing the same test data over and over in a test diminishes the value of that test. For example, retesting the same name strings in a first name input textbox is not providing any new information if those ‘static’ names worked in the previous build and the underlying functionality has not changed.
- Contextual blindness – sometimes we have files of static test data that were identified as “problematic” in one situation (context), so we reuse the “problematic” test data regardless of the context. In 1995 I wrote a white paper on “problematic” double-byte character set (DBCS) encoded characters explaining why each code point was problematic in a given context. For example, a Japanese character with a 0x5C trail byte might be problematic in a filename on an ANSI-based system that parsed characters by bytes instead of wide characters, because 0x5C is also the code for the backslash path separator. This is not true on Windows systems where the default encoding is UTF-16. However, some people continue to use obsolete DBCS problem characters, perhaps because they don’t fully understand the underlying contextual differences between ANSI-based encodings and Unicode.
Perhaps on the opposite end of the test data spectrum is random test data. Many of you who read this blog or have heard me speak know that I am a big proponent of parameterized random test data generation. Parameterization allows us to better model our test data. I know that even parameterized random data can be crafted to be representative of real data, but it is not “real” data.
But, there may be a happy medium between static test data and random test data. And, best of all, it is abundantly available. One of the best sources of (especially non-English) test data is something most of us already use on a daily basis: social networks.
I have met many wonderful people from around the world both in person and virtually, and stayed in contact with many of them. Last year while keynoting at the first software testing conference in Vietnam (VistaCon 2010) I was privileged to meet my dear friend Thuyen, who helped organize the conference. Since the conference we have stayed in contact via email and Facebook. When she posts on Facebook it is usually in Vietnamese. Since I don’t (yet) read Vietnamese I use Bing Translator to help me figure out the comment.
Last week she had an entry on her Facebook wall that began “Tối nay vô tình nghe trên TV 1 bài hát mà giai điệu…” So, I copied the entry and opened Bing Translator to translate the entry.
Many of you will quickly notice the strange anomaly in the translation. I initially thought that this service might be incrementing this numeric value for some reason, but when I changed the number value to 2 the number 2 displayed in the translated string. I tried various other numbers and quickly discovered that 6 incremented to the number 7, 8 decremented to 7, and 9 decremented to the number 3. I didn’t see a clear pattern here, so I thought this might be an issue resulting from parsing a particular sequence of characters.
So, I modified different parts of the string (removed words) to narrow down the problem. I found the string “tình nghe trên TV 1 bài hát mà giai điệu” contained the problematic sequence. Removing any ‘word’ from this string displayed the translated string with a number of 1, with the exception of 1 word. Removing the word “nghe” from the above string resulted in the translation illustrated below.
By the way…the Google translation engine doesn’t fare much better. And, the results are different between www.google.com/ig and http://translate.google.com.
But, the purpose of this post is not to illustrate this particular bug, but to give you ideas of how we can use social network feeds in our testing. People around the world use social networks and you can find “real world” strings in various languages that you can use as test data in various contexts. Most of the time this ‘test data’ will not likely result in a bug; but sometimes it can reveal interesting issues. Best of all, strings taken from social networks are not some manufactured static or random test data. Using strings copied from social networks is about as “real world” as we can get…this is the “data” from our customers.
Basic International Sufficiency Testing in Action
A person who speaks 3 languages is trilingual; a person who speaks 2 languages is bilingual; and a person who speaks one language is an American (or Brit…depending on which side of the pond you are on).
I work at a company with tremendous cultural diversity embodied in very smart people from around the globe. Yet, when people come to Redmond their cultural uniqueness seems to disappear. Maybe it is an engineering thing, maybe it is an assimilation thing, but whatever it is, it is not a good thing, especially considering that a growing number of our customers come from non-US markets and the way they interact with software is often quite different from the US-centric scenarios and personas I so often see used to design and develop our software and services.
Monday afternoon I taught a class on globalization testing basics geared towards SDETs who are not experts in globalization. These testers usually come to the training mostly because they’ve been tagged by their manager to help with globalization testing efforts on their team. But, what they learn is that with a little understanding of some basic concepts and with the aid of a few tools they can incorporate international sufficiency testing concepts into their test designs (both exploratory and automated) and drive quality upstream (e.g. find some bugs sooner).
One of the topics I discuss in the class is customizing the current user locale settings. I also discussed this in an earlier post this year. Customizing the national conventions settings for the current user locale includes things like custom date and time pictures, as well as number and currency formats. Of course, to really understand custom locale settings we should have a good technical understanding of the national language support (NLS) API functions and specifically the LCType parameter Locale Information Constants.
<tangent alert> When I and other people talk about encouraging testers to develop their technical skills or knowledge there are some people who simply assume that we mean “technical” equates to coding. This is really a shallow view of how they interpret ‘technical.’ When I refer to technical knowledge and skills I am primarily referring to an understanding of how the system (or parts of the system) work and how to use that knowledge and their skills to design more effective tests.</tangent alert>
In this case, our technical understanding of the variable arguments that can be passed to the lpLCData parameter of an NLS API function such as SetLocaleInfo() for the various LCType constants can help us more fully explore the functionality of features that use the operating system’s NLS settings. For example, the constant value for the decimal symbol for number formats is LOCALE_SDECIMAL. When this LCType is specified, the value we can pass to the lpLCData parameter is a string of up to 3 characters in length (“maximum number of characters allowed for this string is four, including a terminating null character.”)
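To make the 3-character limit concrete, here is a minimal sketch of a test helper that checks a candidate decimal symbol against that limit and builds the string a well-behaved application should display. The class and method names are my own for illustration (they are not part of any library mentioned here), and the helper uses the .NET NumberFormatInfo class as a stand-in oracle rather than reading the operating system setting.

```csharp
using System;
using System.Globalization;

public static class DecimalSymbolSketch
{
    // LOCALE_SDECIMAL allows four characters including the terminating
    // null, so the symbol itself can be 1 to 3 characters long.
    public const int MaxDecimalSymbolLength = 3;

    public static bool IsValidDecimalSymbol(string symbol)
    {
        return !string.IsNullOrEmpty(symbol)
            && symbol.Length <= MaxDecimalSymbolLength;
    }

    // Builds the expected display string for a value (two decimal places)
    // when the decimal symbol has been customized.
    public static string ExpectedDisplay(double value, string decimalSymbol)
    {
        if (!IsValidDecimalSymbol(decimalSymbol))
        {
            throw new ArgumentOutOfRangeException("decimalSymbol");
        }

        // Clone the invariant format so the shared instance is not modified.
        NumberFormatInfo format =
            (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        format.NumberDecimalSeparator = decimalSymbol;
        return value.ToString("F2", format);
    }
}
```

For example, DecimalSymbolSketch.ExpectedDisplay(1.5, "dot") returns "1dot50", which a test could compare against what the application under test actually shows.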
In class I often use a custom decimal symbol as an example of customizing national convention settings by manually changing the number format decimal symbol from a period character (.) to random Unicode characters. From previous examples I knew of a minor anomaly in the calculator (calc.exe) in which the decimal button changed to show “abc” (or some other random string of 1 to 3 characters), but the decimal symbol in the result window only displayed the first character (“a”).
But, in this week’s class I gave an example of changing the decimal symbol from a period character to the literal string “dot.” Then, Andreas Schiffler, one of the SDETs in the class used the example “dot” string in the calculator and quickly discovered an anomaly.
It seems that the calculator does not like the lower case letter ‘d’ as a decimal point. If you customize the decimal symbol for the number format to the Unicode lower case letter d, then launch the calculator (calc.exe) and press the decimal symbol key (or perform any calculation in which the result includes a decimal value) the calculator result window will show “Overflow.”
Andreas initially thought this might be caused by the fact that the letter ‘d’ is a formatting character. However, we quickly discovered the upper case letter ‘D’ is also a formatting character, but the upper case ‘D’ and other formatting characters do not cause a similar incorrect condition. (I went home and automated this test looking for clues using the GlobalTester library and pumped in over 10000 random Unicode characters, and none of them caused an overflow.)
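The random characters for a run like that can come from a very small generator. The sketch below is only an illustration of the idea and is not the GlobalTester implementation; the class and method names are hypothetical. It draws UTF-16 code units from the Basic Multilingual Plane and skips surrogates and control characters, so it can still return unassigned code points.

```csharp
using System;
using System.Text;

public static class RandomUnicodeSketch
{
    // Returns a string of 'length' random UTF-16 code units in the range
    // U+0020 to U+FFFD, skipping surrogates and control characters.
    public static string NextString(Random random, int length)
    {
        StringBuilder builder = new StringBuilder(length);
        while (builder.Length < length)
        {
            // The upper bound of Random.Next is exclusive.
            char candidate = (char)random.Next(0x0020, 0xFFFE);
            if (!char.IsSurrogate(candidate) && !char.IsControl(candidate))
            {
                builder.Append(candidate);
            }
        }

        return builder.ToString();
    }
}
```

Passing in a Random created from a logged seed, for example new Random(seed), makes it possible to reproduce the exact string that exposed a problem.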
Now this particular problem might seem out of context, or have no ‘real-world’ scenario because the commonly used characters for the decimal symbol are the period (.), the comma (,) and the space ( ) characters. And for the life of me I can’t think of any national convention in this world that uses the letter ‘d’ as a decimal symbol in their number formats. But, this application is using the NLS settings (at least to some level of implementation), and since we allow the user to customize these settings, then I shouldn’t expect that functionality to break. The bottom line is that something out of the ordinary is going on that probably needs investigating.
Sometimes simply changing the current user default locale settings might reveal basic internationalization issues. And sometimes customizing the national conventions and using randomized data for those conventions might reveal problems with how the developer implements national language support in the product. You don’t have to be an expert in globalization to discover these issues! You just have to know a little bit about the technicalities of NLS, and have a desire to potentially find some pretty cool, or at least weird anomalies earlier in your testing.
Globalization Testing: Basic International Sufficiency
I started my career at Microsoft in 1994 working on the Windows 95 International Test team. Globalization testing is a unique specialty in software testing just like performance, security, and other specific areas of testing. Globalization testing doesn’t necessarily require a tester to be bi-lingual, or be from a country other than the United States. A good globalization tester has an in-depth understanding of such things as character encoding types and issues associated with the different types, character mapping and conversion issues, data manipulation by the application, operating system, and network protocols.
Many people might also say that globalization testers also need to know that different locales (places) around the world use different formats for date and time (national conventions). For example, in the United States the default long date format is Thursday, June 03, 2010 but in Germany it is Donnerstag, 3. Juni 2010. A tester doesn’t have to ‘read’ German to see the abstract date format has changed from dddd, MMMM dd, yyyy to dddd, d. MMMM yyyy.
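A quick way to see these abstract patterns without reading the language is to ask the framework for them. The following is a small sketch using the .NET CultureInfo class; the class and method names are mine, and the exact pattern returned for a culture can vary with the operating system and .NET version, so I make no claim about specific output.

```csharp
using System;
using System.Globalization;

public static class LongDatePatternSketch
{
    // Returns the abstract long date pattern for a culture name such as "de-DE".
    public static string PatternFor(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).DateTimeFormat.LongDatePattern;
    }

    // Formats a date with the long date pattern of the named culture.
    public static string FormatFor(string cultureName, DateTime date)
    {
        return date.ToString("D", CultureInfo.GetCultureInfo(cultureName));
    }
}
```

Comparing PatternFor("en-US") with PatternFor("de-DE") shows the change in element order and separators, and FormatFor gives the concrete string for a fixed date such as new DateTime(2010, 6, 3).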
Testing for support of these different national conventions used around the world is referred to as basic international sufficiency testing. I suspect some people assume that basic international sufficiency testing of these national conventions is the domain of the globalization tester because the national conventions are set by default on the different localized versions of a software product, so that is when they are tested. But, this reasoning is absurd!
First, not all products are “localized” into all languages or “locales.” So, who tests the Canadian long date format of MMMM-dd-yy, or the Georgian (Georgia) long date format of yyyy ‘წლის’ dd MM, dddd? Also, Vista and later versions of Windows allow the user to ‘customize’ the date and time “format pictures” to use different separator symbols and orderings.
Secondly, way too many bugs such as hard-coded date formats are found way too late in the testing cycle (because localized versions tend to lag the US English language version). And of course, we all know that bugs found later in the lifecycle are more costly to correct.
So, we must ask if there is a way for basic international sufficiency testing to be ‘pushed upstream?’ And of course the answer is yes. The easiest way is to host a “globalization bug bash” early in the cycle. (A “bug bash” is a day where testers are given some basic training on attack patterns, fault models, etc., in a general focus area and then spend a day exploring different areas of the product trying to flush out bugs in a competition style format.) Another way is to assign each tester a different locale (preferably one that is not associated with a localized language version) and have them set their test and self-host environments to that locale during their testing.
This is easily accomplished on Windows test environments by having testers launch the Regional and Language control panel applet (the short cut is Start –> Run, then type “intl.cpl” without the quotes, and press the OK button).
This just tests for a basic level of international sufficiency, and any good tester would want to explore their project’s capability to support the more than 150 different locale national conventions at a deeper level. This is especially true if your product is going to be used by customers around the world (including Canada). But, of course, we don’t want to run the same tests on all 150+ locales supported by the operating system.
The national convention settings for a particular locale are identified by a locale identifier (LCID), and when we change our locale (Format on the latest Regional and Language control panel applet) through the user interface we are actually calling various National Language Support (NLS) APIs. A “world-wide” application should use the universal NLS APIs and data available via the operating system.
One way to test our application’s ability to correctly use the national convention data supplied by the operating system is to set customized conventions. For example, did you know the Windows 7 operating system allows a digit grouping symbol to be a string of up to 3 characters? Or the Negative sign symbol can be a string of up to 4 characters.
Although having testers change their default locale (Format) on their test environment and self-host machines is a good first step in basic international sufficiency testing, we also want to see if our application can process a negative value of “!NEG7” instead of just “–7,” and whether any textboxes correctly display the customized negative sign symbol (especially at the upper extreme boundary of the textbox size property).
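As a rough illustration of what “processing” a value like “!NEG7” means, the sketch below formats and parses an integer with a custom negative sign using the .NET NumberFormatInfo class. The class and method names are hypothetical, and this only models the behavior we would expect from an application that honors the setting; it does not change the operating system’s negative sign symbol.

```csharp
using System;
using System.Globalization;

public static class NegativeSignSketch
{
    // Returns a copy of the invariant number format with a custom negative sign.
    public static NumberFormatInfo WithNegativeSign(string negativeSign)
    {
        NumberFormatInfo format =
            (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        format.NegativeSign = negativeSign;
        return format;
    }

    // Formats an integer using the custom negative sign.
    public static string Format(int value, string negativeSign)
    {
        return value.ToString("D", WithNegativeSign(negativeSign));
    }

    // Parses text that may start with the custom negative sign.
    public static int Parse(string text, string negativeSign)
    {
        return int.Parse(text, NumberStyles.Integer, WithNegativeSign(negativeSign));
    }
}
```

With the 4-character sign “!NEG”, NegativeSignSketch.Format(-7, "!NEG") produces “!NEG7”, and parsing that string with the same sign should round-trip to -7. An application that hard-codes the hyphen will fail one or both directions.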
To customize the national convention settings we simply click the Advanced settings… button on the Formats property sheet of the Region and Language control panel applet which instantiates a new dialog with 4 property sheets for Numbers, Currency, Time, and Date.
Solution for Test Automation
That’s all well and fine for basic testing, or testing a “few” customized values, but if we wanted to test the permutations for each convention, or the combination of different conventions on numbers, currency, time, or date formats the number of tests is astronomical. Typically, testers writing an automated test would try to navigate the user interface of the Regional and Language control panel applet and the Customize Format property sheets in order to set custom conventions.
In the past I provided some code snippets for changing the convention settings on the Customize Format property sheets on versions of Windows pre-Vista. Earlier this year I also provided code snippets for customizing the date format picture and the time format picture.
That’s all well and good, but I recently released a new test automation library called GlobalTester for test developers to use in their automated test scripts. The GlobalTester library provides testers methods to set custom national conventions for the current user without having to navigate the user interface of the Region and Language options control panel applet. These national conventions include number formats, currency formats, date formats, time formats, and also current location.
The following example illustrates how we might design a test script to customize the date format for a test and reset the date format to its original setting (restoring the test environment to pre-test conditions). (Usage documentation for the GlobalTester library is on the Testing Mentor website.)
namespace CustomizeDateSettingsExampleScript
{
    using System;
    using System.Globalization;
    using TestingMentor.TestTool.GlobalTester;

    class MyTest
    {
        static void Main()
        {
            try
            {
                CustomDateFormat date = new CustomDateFormat();

                // Save the current pattern so the environment can be restored
                string defaultLongDateFormat =
                    CultureInfo.CurrentCulture.DateTimeFormat.LongDatePattern;
                string newLongDateFormat = "MMM - d (yyyy) gg";

                if (date.ChangeLongDateFormat(newLongDateFormat))
                {
                    // Launch AUT
                    // Execute test - (e.g. AUT implements long date string)
                    // Oracle - (e.g. compare long date format against customized pattern)

                    // Reset test platform to original configuration
                    date.ChangeLongDateFormat(defaultLongDateFormat);
                }
                else
                {
                    // Date format not changed; test not executed (e.g. invalid
                    // day, month, year, and era format pictures)
                }
            }
            catch (ArgumentOutOfRangeException)
            {
                // Test script failure - (e.g. long date format string argument out of range)
            }
            finally
            {
                // Log test results
            }
        }
    }
}
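One weakness of resetting the format inside the if block is that an exception thrown by the test itself would skip the reset. A small disposable wrapper is one way to guard against that. The class below is only a sketch under my own assumptions: the SettingScope name is hypothetical, it is not part of GlobalTester, and it takes delegates for reading and applying a setting so it does not depend on any particular library method.

```csharp
using System;

public sealed class SettingScope : IDisposable
{
    private readonly Action<string> apply;
    private readonly string originalValue;
    private bool restored;

    // Remembers the current value, then applies the temporary value.
    public SettingScope(Func<string> read, Action<string> apply, string temporaryValue)
    {
        if (read == null) throw new ArgumentNullException("read");
        if (apply == null) throw new ArgumentNullException("apply");

        this.apply = apply;
        this.originalValue = read();
        apply(temporaryValue);
    }

    // Restores the original value exactly once, even if Dispose is called again.
    public void Dispose()
    {
        if (!this.restored)
        {
            this.restored = true;
            this.apply(this.originalValue);
        }
    }
}
```

In the script above, the read delegate could return CultureInfo.CurrentCulture.DateTimeFormat.LongDatePattern and the apply delegate could call ChangeLongDateFormat(); wrapping the test body in a using statement then restores the original pattern whether the test passes, fails, or throws.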
Globalization Testing: Customizing Time Formats
Time is a commodity in short supply. I have been juggling a lot lately and there never seems to be enough time to do everything I need to do, and even less time to do the things I want to do. (Blogging falls under the want to do category.) I wish sometimes I could slow down the hands of time, but that is beyond my control. What is within my control is changing the time format displayed on the computer. And if I need to do that in an automated test to increase the robustness of my test to include globalization, then I can programmatically change the time format without having to manipulate the Region and Language settings control panel applet.
Time and date information is commonly pulled from the operating system by many developers for use in headers or footers on documents, default file names, printing, and other places time/date stamps are useful or important. To ensure our products are “world-ready” we should modify the formats to validate whether our product supports various national conventions used in different regions (locales) around the world. In the previous post I illustrated how to programmatically customize the date formats on a Windows environment for including some basic globalization tests in your test automation. This week let’s look at how we can programmatically change both the short time and long time formats.
We will again need the 2 Win32 API functions SetLocaleInfo() and PostMessage() that we marshaled over into the NativeMethods class. Since that code doesn’t change I won’t repeat it here; you can simply refer to the code snippet in the previous post. In this situation we need to set the lcType in SetLocaleInfo() to the LOCALE_STIMEFORMAT constant. Then we can pass a null-terminated string to the lcData variable in the SetLocaleInfo() function. MSDN explains “The maximum number of characters allowed for this string is 80, including a terminating null character. The string can consist of a combination of hour, minute, and second format pictures.”
Once again, to simplify that a bit I wrote some more wrapper methods to change the time format. Also, since we will be calling SetLocaleInfo() and PostMessage() a lot for customizing date, time, and other national conventions I created a wrapper method called UpdateLocaleInformation() to remove redundancy.
namespace TestingMentor.TestTool.GlobalTester
{
    using System;

    public enum TimeFormatType
    {
        LongTimeFormat = 0x00001003,
        ShortTimeFormat = 0x00000079
    }

    public class CustomTimeFormat
    {
        private int timeFormatType = (int)TimeFormatType.ShortTimeFormat;
        private string timeFormatPicture = string.Empty;

        public int SetTimeFormatType
        {
            set
            {
                if (value == (int)TimeFormatType.ShortTimeFormat ||
                    value == (int)TimeFormatType.LongTimeFormat)
                {
                    this.timeFormatType = value;
                }
                else
                {
                    throw new ArgumentOutOfRangeException("TimeFormatType invalid");
                }
            }
        }

        public string SetTimeFormatPicture
        {
            set { this.timeFormatPicture = value; }
        }

        public bool ChangeTimeFormat()
        {
            return UpdateLocaleInformation(
                this.timeFormatType,
                this.timeFormatPicture);
        }

        private bool UpdateLocaleInformation(int localeType, string localeData)
        {
            bool success = false;
            if (NativeMethods.SetLocaleInfo(
                NativeMethods.SystemDefaultLocale,
                localeType,
                localeData))
            {
                NativeMethods.PostMessage(
                    NativeMethods.BroadcastMessage,
                    NativeMethods.SettingChangeMessage,
                    IntPtr.Zero,
                    IntPtr.Zero);
                success = true;
            }

            return success;
        }
    }
}
Once again, we simply have to set the SetTimeFormatType property to either the Short time or Long time format, provide the format picture by setting the SetTimeFormatPicture property, and then call ChangeTimeFormat(). The sample below illustrates how to change the short time format with different time separators and a reverse order.
static void Main(string[] args)
{
    CustomTimeFormat time = new CustomTimeFormat();
    time.SetTimeFormatType = (int)TimeFormatType.ShortTimeFormat;
    time.SetTimeFormatPicture = "ss'mm,hh - tt";
    if (time.ChangeTimeFormat())
    {
        Console.WriteLine("Success");
    }
}
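Changing the format is only half of a test; we also need an oracle for what the application under test should display. The sketch below is one possible approach and rests on an assumption: .NET custom date and time format strings share the hh, HH, mm, ss, and tt specifiers with the Windows format pictures, but the two syntaxes are not identical, so this works only for simple pictures built from those specifiers and plain separators. The class name is hypothetical, and the invariant culture is used, so a customized AM/PM designator is not reflected.

```csharp
using System;
using System.Globalization;

public static class TimeOracleSketch
{
    // Builds the expected time string for a simple pattern made of
    // hh/HH, mm, ss, tt and literal separator characters.
    public static string Expected(DateTime time, string pattern)
    {
        return time.ToString(pattern, CultureInfo.InvariantCulture);
    }
}
```

For a fixed time of 14:05:09, TimeOracleSketch.Expected(new DateTime(2010, 6, 3, 14, 5, 9), "ss.mm.hh tt") returns "09.05.02 PM", which can be compared with the time string the application renders after the same picture is set.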
Now, we can also customize the AM/PM designator as well. To change the AM/PM designator we need to add a few more properties and another wrapper method. In this case, I’ve added the SetAmPmDesignator property, the SetAmPmString property, and the ChangeAmPmDesignator() method.
public enum AmPmDesignator
{
    AM = 0x00000028,
    PM = 0x00000029
}

// Members added to the CustomTimeFormat class shown earlier
// (UpdateLocaleInformation() and the other members are omitted here)
public class CustomTimeFormat
{
    // Backing fields for the designator type and its replacement string
    private int designatorForAmPm = (int)AmPmDesignator.AM;
    private string timeDesignator = string.Empty;

    public int SetAmPmDesignator
    {
        set
        {
            if (value == (int)AmPmDesignator.AM || value == (int)AmPmDesignator.PM)
            {
                this.designatorForAmPm = value;
            }
            else
            {
                throw new ArgumentOutOfRangeException("AmPmDesignator invalid.");
            }
        }
    }

    public string SetAmPmString
    {
        set { this.timeDesignator = value; }
    }

    public bool ChangeAmPmDesignator()
    {
        return UpdateLocaleInformation(
            this.designatorForAmPm,
            this.timeDesignator);
    }
}
The code snippet below illustrates how to change the AM designator from “AM” to “In the morning.”
static void Main(string[] args)
{
    CustomTimeFormat time = new CustomTimeFormat();
    // The property is an int, so the enum value must be cast
    time.SetAmPmDesignator = (int)AmPmDesignator.AM;
    time.SetAmPmString = "In the morning.";
    if (time.ChangeAmPmDesignator())
    {
        Console.WriteLine("Success");
    }
}
Modifying national conventions is one way to test for globalization support upstream and should be done early in the testing cycle rather than relying on a separate globalization testing cycle. Time and date are perhaps the most visible national conventions used in many different ways in our applications. We should test the common (equivalent) conventions used in various regions around the world, and customizing these settings helps ensure the developer is properly calling NLS APIs and not using custom functions.
Also, check out the beta release of the GlobalTester automation library that has this functionality and more, and let me know what you think.
Globalization Testing: Customizing the Date Format
The ability of our software products to function correctly in a global environment is becoming more and more important. Our software should support national conventions used by the various locales around the globe. For example, in some regions of the world the period character is used as the number group separator and the comma is used as the decimal symbol (radix). European calendars generally start on Monday rather than Sunday, which is customary in the United States. Era-based calendars are still in common use in Japan and Korea. Date formats and order, and time formats, also vary by region or locale. As testers we need to test our software to ensure our customers around the world can use the national conventions they are accustomed to, and not force them down a US-centric, one-size-fits-all format or standard.
There are several settings that we can modify and customize for more robust globalization testing such as number, currency, time and date formats. Modifying these settings can help us test that our application is globalized to use National Language Support (NLS) APIs provided by the system. Although a user would change these settings using the Regional Options user interface property sheets, if the purpose of our test is not to emulate user interaction, then modifying the custom regional settings for globalization testing programmatically is more efficient.
Last year I talked about how to programmatically make changes to the settings in the Region and Language control panel applet when doing globalization testing. Unfortunately, the code sample provided in the previous post was appropriate for versions of Windows XP and earlier. For versions of Windows Vista and later things have changed a bit. Also, the previous sample tried to be one-size-fits-all and relied on the test developer to set the appropriate lcType constants and lcData argument variables required by the Win32 function SetLocaleInfo().
This time, I decided to simplify things a bit and wrapped some methods to call the appropriate Win32 API functions and properties to set lcType and lcData values to make it easier to incorporate into automated tests. I also separated the various advanced custom formats for Region and Language options into separate classes. Of course, I have a beta version of an automation library (DLL) called GlobalTest.DLL on my website that testers can use in their automated test cases, but this week let’s look at the class for setting custom date formats.
Making these changes programmatically still requires the Win32 SetLocaleInfo() function. MSDN also states this function modifies the specified values for all applications, so to prevent potential issues in other applications running on the system we should also broadcast the WM_SETTINGCHANGE message. To broadcast the WM_SETTINGCHANGE message we will also need the Win32 PostMessage() function. Since we are using Platform Invocation (P/Invoke) to call these unmanaged functions we should put them in a separate class that I’ve called NativeMethods. I also included all necessary constant values required by these methods in the NativeMethods class, as illustrated below.
namespace TestingMentor.TestTool.GlobalTester
{
    using System;
    using System.Runtime.InteropServices;

    internal sealed class NativeMethods
    {
        internal const int SystemDefaultLocale = 0x00000800;
        internal const int SettingChangeMessage = 0x0000001A;

        // Window handles are pointer sized, so the broadcast handle
        // is declared as an IntPtr rather than an int
        internal static readonly IntPtr BroadcastMessage = new IntPtr(0xFFFF);

        private NativeMethods() { }

        [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
        [return: MarshalAs(UnmanagedType.Bool)]
        internal static extern bool SetLocaleInfo(
            int locale,
            int localeType,
            string localeData);

        [DllImport("user32.dll", SetLastError = true)]
        [return: MarshalAs(UnmanagedType.Bool)]
        internal static extern bool PostMessage(
            IntPtr handle,
            int message,
            IntPtr wParam,
            IntPtr lParam);
    }
}
The class for the custom wrapper method is TestingMentor.TestTool.GlobalTester.CustomDateFormat. There is a public enumeration for the short date and long date constants. One of these values must be assigned to the SetDateType property. The other property that must be set is SetDateFormatPicture. The big change in the SetLocaleInfo() function is that the lcData type is a null-terminated string that MSDN refers to as a format picture. Current versions of Windows allow users to customize the order of the month, day and year, the format for each, and even allow different separators between the date elements. The format picture enables the user to select various format types in different orders for either the short date or the long date. See MSDN’s Month, Day, Year and Era Format Pictures for the various supported format types.
namespace TestingMentor.TestTool.GlobalTester
{
    using System;

    public enum DateFormatType
    {
        ShortDate = 0x0000001F,
        LongDate = 0x00000020
    }

    public class CustomDateFormat
    {
        private string dateFormatPicture = string.Empty;
        private int dateType = (int)DateFormatType.ShortDate;

        public string SetDateFormatPicture
        {
            set { this.dateFormatPicture = value; }
        }

        public int SetDateType
        {
            set
            {
                if (value == (int)DateFormatType.ShortDate ||
                    value == (int)DateFormatType.LongDate)
                {
                    this.dateType = value;
                }
                else
                {
                    throw new ArgumentOutOfRangeException("Invalid DateType");
                }
            }
        }

        public bool ChangeDateFormat()
        {
            bool success = false;
            if (NativeMethods.SetLocaleInfo(
                NativeMethods.SystemDefaultLocale,
                this.dateType,
                this.dateFormatPicture))
            {
                // Notify running applications that the setting changed
                NativeMethods.PostMessage(
                    NativeMethods.BroadcastMessage,
                    NativeMethods.SettingChangeMessage,
                    IntPtr.Zero,
                    IntPtr.Zero);
                success = true;
            }

            return success;
        }
    }
}
Once the SetDateType and SetDateFormatPicture properties are assigned we simply have to call the ChangeDateFormat() method to change the settings and broadcast the message to the system. The code snippet below illustrates how a tester would change the default long date format in an automated test to determine globalization support in the application under test. Customizing the date format is useful if the application under test uses a date string in any way. For example, the application might include a function to insert a date string in an edit control, print the date as a header or footer in a document, or append a date string to a record.
using TestingMentor.TestTool.GlobalTester;
...
static void Main(string[] args)
{
    CustomDateFormat date = new CustomDateFormat();
    date.SetDateType = (int)DateFormatType.LongDate;
    date.SetDateFormatPicture = "[ dd % MM | yyyy ]";
    if (date.ChangeDateFormat())
    {
        Console.WriteLine("Long date was changed");
    }
}
Programmatically changing the date format is an easy way for testers to customize date formats in their automated tests without having to manipulate the controls on the Region and Language property sheet. Also note that, since the format picture is a string, the order of the supported date format types is now controlled by the arrangement in the string, and the separator characters can be different between the day and month and the month and year, as illustrated in the example above.
Modifying national conventions is one way to test for globalization support upstream and should be done early in the testing cycle rather than relying on a separate globalization testing cycle.
Next week I will discuss customizing the time format. Also, check out the beta release of the GlobalTester automation library that has this functionality and more and let me know what you think.
Test Automation: Look Below the UI for More Effective and Robust UI Automated Test Case Designs
Originally Published Tuesday, April 14, 2009
Last month I wrote about simplistic views of UI test automation in which some people want to pretend that recording for playback or scripting hard-coded actions and data to mimic some human’s interactions at the keyboard is an automated test. Balderdash! Automating a set of sequences or preconceived steps simply for the sake of automating or preparing an environment is perhaps what Kaner et al. mean when they refer to computer assisted testing; however, computer assisted testing is not the same as a well-designed automated test. (And yes, computers are very good tools for completely automating some types of tests quite effectively, including the oracle.) We see a lot of computer assisted testing in UI automation projects. I suspect this occurs because people are focused on trying to automate a test the same way they or an end-user would interact with the computer rather than designing the automated test to evaluate an important attribute or capability of the software in order to provide significant information to the project team and add value to the testing process.
Personally, I am not a big fan of UI automation because it is usually done poorly, and it is usually very fragile and needs constant massaging; more so than test automation that runs below the UI layer. Also, I see a lot of misuse of UI automation. For example, I recently came across a comment by one fellow that wrote, “UI Automation is not necessarily meant for testing the UI (though, we use it for that also).” What??? I do understand the need for UI automation in the testing process, and done well it can provide tremendous benefit and free up my time to actually design new tests and think more critically about what has and has not been tested. But, when I automate through the UI my test cases are primarily testing behavioral aspects of the software (end-user scenarios for example) and that UI elements call the appropriate event handlers. While UI automation can be used to test functional capabilities also, it is generally not the best approach for robust functional testing. This is especially true when the automated UI test is over-loaded with excess baggage (manipulating UI elements not directly associated with the purpose of a test). The more baggage a UI test carries, the greater the potential for maintenance nightmares.
For example, not too long ago a tester was performing international sufficiency testing of his component to ensure his feature supported multiple national conventions and custom formats supported by Windows National Language Support (NLS) APIs. He knew the steps to manipulate the national conventions and custom formats required the user to click the Start menu, select Control Panel, then click on the Regional and Language Options control panel applet, click the Customize button, select the appropriate property sheet for the national convention he wanted to customize, and finally click the OK button on the Customize dialog and the Regional Settings dialog, and verify the results. Lather, rinse, and repeat as necessary!
To make matters more complicated, the sequence of steps to change these settings is slightly different between Windows XP and Windows Vista, and we certainly don’t want to write 2 separate test cases, or branch the test code depending on the operating system in this case. Complexity cultivates complication, especially with UI automation! Fortunately, this fellow also knew that essentially all underlying functionality can be accessed via Windows APIs, and that is exactly the information he was looking for. In this situation I suggested he look at the SetLocaleInfo function, and within minutes he incorporated that function to efficiently resolve his problem, and his automated test was capable of testing his application on any currently supported version of the Windows operating system.
In C# automation, we can use Platform Invocation Services (P/Invoke) to call this Win32 API function from Kernel32.dll as illustrated below:
namespace TestingMentor.PInvokeSample
{
    using System;
    using System.Runtime.InteropServices;

    /// <summary>
    /// This class contains native Win32 API functions that are marshalled
    /// over for use in C#
    /// </summary>
    class NativeMethod
    {
        /// <summary>
        /// Sets an item of information in the user override portion of the
        /// current locale. This function does not set the system defaults.
        /// </summary>
        /// <param name="locale">the locale identifier of the locale with the
        /// code page used</param>
        /// <param name="localeType">Type of locale information to set.</param>
        /// <param name="localeData">A null-terminated string containing the
        /// locale information to set</param>
        /// <returns>Returns true if successful; otherwise false</returns>
        [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
        public static extern bool SetLocaleInfo(
            uint locale,
            uint localeType,
            string localeData);

        /// <summary>
        /// Sets an item of information in the user override portion of the
        /// current locale. This function does not set the system defaults.
        /// </summary>
        /// <param name="locale">the locale identifier of the locale with the
        /// code page used</param>
        /// <param name="localeType">Type of locale information to set.</param>
        /// <param name="localeData">An integer value representing the locale
        /// information to set</param>
        /// <returns>Returns true if successful; otherwise false</returns>
        // Caution: the native function declares this argument as a pointer
        // to a null-terminated string, so numeric settings may need to go
        // through the string overload as digits (for example "1").
        [DllImport("kernel32.dll", SetLastError = true)]
        public static extern bool SetLocaleInfo(
            uint locale,
            uint localeType,
            int localeData);
    }
}
The argument values that we can pass to these functions are enumerated in a separate class similar to the one below:
namespace TestingMentor.NlsInfo
{
    /// <summary>
    /// Constant values for SetLocaleInfo API
    /// </summary>
    class NlsConstant
    {
        public enum Locale : uint
        {
            Invariant = 0x007F,
            SystemDefault = 0x0800, // use system default for SetLocaleInfo
            UserDefault = 0x0400,
            Neutral = 0x0000,
            CustomDefault = 0x0C00, // Vista and later
            CustomUiDefault = 0x1400, // Vista and later
            CustomUnspecified = 0x1000 // Vista and later
        };

        public enum LocaleType : uint
        {
            // VALUE LCDATA TYPES
            CalendarType = 0x00001009, // type of calendar specifier
            CurrencyDigits = 0x00000019, // local monetary fractional digits
            PositiveCurrency = 0x0000001B, // position of positive currency symbol
            FractionalDigits = 0x00000011, // number of fractional digits
            NativeDigitSubstitution = 0x00001014, // native digit substitution
            FirstDayOfWeek = 0x0000100C, // first day of week specifier
            FirstWeekOfYear = 0x0000100D, // first week of year specifier
            LeadingZeros = 0x00000012, // leading zeros for decimal
            Measure = 0x0000000D, // 0 = metric, 1 = US
            NegativeCurrency = 0x0000001C, // negative currency mode
            NegativeNumber = 0x00001010, // negative number mode
            PaperSize = 0x0000100A, // paper size
            TimeFormatSpecifier = 0x00000023, // time format specifier

            // STRING LCDATA TYPES
            // Valid Unicode characters
            AM = 0x00000028, // AM designator
            PM = 0x00000029, // PM designator
            CurrencySymbol = 0x00000014, // local monetary symbol
            DecimalSeparator = 0x0000000E, // decimal separator
            DigitGrouping = 0x00000010, // digit grouping
            ListSeparator = 0x0000000C, // list item separator
            LongDate = 0x00000020, // long date format string
            MonetaryDecimalSeparator = 0x00000016, // monetary decimal separator
            MonetaryGrouping = 0x00000018, // monetary grouping
            MonetaryThousandSeparator = 0x00000017, // monetary thousand separator
            NativeDigits = 0x00000013, // native equivalents of ASCII 0-9
            NegativeSign = 0x00000051, // negative sign
            PositiveSign = 0x00000050, // positive sign
            ShortDate = 0x0000001F, // short date format string
            ThousandSeparator = 0x0000000F, // thousand separator
            TimeSeparator = 0x0000001E, // time separator
            TimeFormat = 0x00001003, // time format string
            YearMonthFormat = 0x00001006 // year month format string
        };

        public enum LocaleData : int
        {
            // LOCALE_ICALENDARTYPE VALUES
            Gregorian = 1, // Gregorian (localized)
            GregorianUS = 2, // Gregorian (always English)
            GregorianMEFrench = 9, // Middle East French
            GregorianArabic = 10,
            GregorianXlitEnglish = 11, // transliterated English
            GregorianXlitFrench = 12, // transliterated French
            Japan = 3,
            Taiwan = 4,
            Korea = 5,
            Hijri = 6,
            Thai = 7,
            Hebrew = 8,
            Umalqura = 23, // Um Al Qura (Arabic lunar) Vista or later

            // LOCALE_ICURRENCY
            PositiveCurrencyPrefixNoSeparation = 0,
            PositiveCurrencySuffixNoSeparation = 1,
            PositiveCurrencyPrefixSeparation = 2, // one character separation
            PositiveCurrencySuffixSeparation = 3, // one character separation

            // LOCALE_IDIGITSUBSTITUTION
            DigitSubstitutionContextBased = 0,
            DigitSubstitutionNone = 1, // use this setting for full Unicode support
            DigitSubstitutionNative = 2, // uses digits based on national conventions
                                         // according to LOCALE_SNATIVEDIGITS

            // LOCALE_IFIRSTDAYOFWEEK
            Monday = 0, // LOCALE_SDAYNAME1
            Tuesday = 1, // LOCALE_SDAYNAME2
            Wednesday = 2, // LOCALE_SDAYNAME3
            Thursday = 3, // LOCALE_SDAYNAME4
            Friday = 4, // LOCALE_SDAYNAME5
            Saturday = 5, // LOCALE_SDAYNAME6
            Sunday = 6, // LOCALE_SDAYNAME7

            // LOCALE_IFIRSTWEEKOFYEAR
            FirstDay = 0, // Week containing 1/1 even if single day
            FirstFullWeek = 1, // first full week following 1/1
            FirstWeek = 2, // first week with at least 4 days after 1/1

            // LOCALE_ILZERO
            NoLeadingZero = 0, // .975
            LeadingZero = 1, // 0.975

            // LOCALE_IMEASURE
            Metric = 0,
            US = 1,

            // LOCALE_INEGCURR
            ParenthesisSymbolNumber = 0, // ($1.1)
            NegativeSignSymbolNumber = 1, // -$1.1
            SymbolNegativeSignNumber = 2, // $-1.1
            SymbolNumberNegativeSign = 3, // $1.1-
            ParenthesisNumberSymbol = 4, // (1.1$)
            NegativeSignNumberSymbol = 5, // -1.1$
            NumberNegativeSignSymbol = 6, // 1.1-$
            NumberSymbolNegativeSign = 7, // 1.1$-
            NegativeSignNumberSpaceSymbol = 8, // -1.1 $
            NegativeSignSymbolSpaceNumber = 9, // -$ 1.1
            NumberSpaceSymbolNegativeSign = 10, // 1.1 $-
            SymbolSpaceNumberNegativeSign = 11, // $ 1.1-
            SymbolSpaceNegativeSignNumber = 12, // $ -1.1
            NumberNegativeSignSpaceSymbol = 13, // 1.1- $
            ParenthesisSymbolSpaceNumber = 14, // ($ 1.1)
            ParenthesisNumberSpaceSymbol = 15, // (1.1 $)

            // LOCALE_INEGNUMBER
            Parenthesis = 0, // (1)
            NegativeSignNumber = 1, // -1
            NegativeSignSpaceNumber = 2, // - 1
            NumberNegativeSign = 3, // 1-
            NumberSpaceNegativeSign = 4, // 1 -

            // LOCALE_IPAPERSIZE
            USLetter = 1,
            USLegal = 5,
            A3 = 8,
            A4 = 9,

            // LOCALE_ITIME
            FormatAM_PM = 0,
            Format24Hour = 1
        };
    }
}
You see, manipulating the Regional Options settings through the user interface had nothing to do with the purpose of his test; the purpose was to determine whether or not those changes in the NLS settings were propagated to the application under test, and whether the resultant output displayed correctly. The oracle to verify the output in this case was simply reading the string from the appropriate control in the application and comparing each character code point value with the expected character. For example, one test changed the date format from dd/MM/yyyy to yyyy-MM-dd. The automated oracle verified the year, month and day values in the correct format and order, and also checked whether the date separator characters at zero-based index 4 and 7 in the string were Unicode values U+002D in this example (or other randomly generated Unicode character value(s)). This automated test was able to test and verify 31 different customizable NLS settings with multiple variables per setting to satisfy basic international sufficiency of this tester’s feature in a fraction of the time it would require a human, and with greater precision. Of course, this assumes that as a tester you have an in-depth understanding of the “system” on which you are tasked to test, and are capable of designing effective tests from perspectives other than that of the end-user.
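An oracle of the kind described above can be quite small. The following is only a minimal sketch, assuming the expected picture is yyyy-MM-dd with a single separator character; the class and method names are hypothetical and are not taken from the tester’s actual automation.

```csharp
using System;
using System.Globalization;

public static class DateStringOracle
{
    // Returns true when 'actual' has the shape yyyy<sep>MM<sep>dd, the
    // characters at zero-based index 4 and 7 equal 'separator', and the
    // numeric fields match the expected date.
    public static bool MatchesYearMonthDay(string actual, DateTime expected, char separator)
    {
        if (actual == null || actual.Length != 10)
        {
            return false;
        }

        // Compare the separator code points, not just how they look.
        if (actual[4] != separator || actual[7] != separator)
        {
            return false;
        }

        string year = actual.Substring(0, 4);
        string month = actual.Substring(5, 2);
        string day = actual.Substring(8, 2);

        if (!IsAsciiDigits(year) || !IsAsciiDigits(month) || !IsAsciiDigits(day))
        {
            return false;
        }

        return int.Parse(year, CultureInfo.InvariantCulture) == expected.Year
            && int.Parse(month, CultureInfo.InvariantCulture) == expected.Month
            && int.Parse(day, CultureInfo.InvariantCulture) == expected.Day;
    }

    // Hypothetical helper: accepts only U+0030 through U+0039.
    private static bool IsAsciiDigits(string text)
    {
        foreach (char c in text)
        {
            if (c < '0' || c > '9')
            {
                return false;
            }
        }

        return true;
    }
}
```

A test would read the string from the control and call something like `DateStringOracle.MatchesYearMonthDay(text, expectedDate, '\u002D')`, swapping in whatever separator character was written to the format picture.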
I try to constantly emphasize that the emerging role of a software tester primarily focuses on analysis and design: analysis of the “system”, the tests, and the results of tests, and the design of effective tests with reasoned purpose and well defined goals. Professional testers provide value by enriching their organization’s intellectual knowledge repository and ultimately resolving hard problems. But, we can’t start to resolve the hard problem of effective UI test automation by perpetuating the medieval mentality that UI automation is merely mindlessly mimicking the clicks and keystrokes through the user interface because we don’t understand how the system works below the surface, or we can’t think intelligently about effective oracles capable of interpreting the results for some of our automated tests. The persistent prophets of pestilence will perpetually pule, but fortunately I see more and more professional software testers stepping up to meet increasingly complex technological challenges head on with increasing success. As I have said before, the only problems we can’t solve are those for which we have not yet devised a solution.
UTF What?
Originally Published Monday, January 14, 2008
Years ago life was pretty simple with regard to data input. Most computer programs were limited to ASCII characters and a set of character glyphs mapped into the code points between 0x80 and 0xFF (high or extended ASCII) depending on the language. The set of characters was limited to 256 code points (0x00 through 0xFF) primarily due to the CPU architecture. Multiple languages were made available via ANSI code pages. Modifying the glyphs in the upper 128 character code points between 0x80 and 0xFF worked pretty well except for East Asian language versions. So, someone came up with the brilliant idea of encoding a character glyph with 2 bytes instead of just one. This double byte encoding worked quite well except that many developers were unaware that a lead byte could be an 0xE5 character and a trail byte could be a reserved character such as 0x5C (backslash). So, an unknowledgeable developer who stepped incrementally through a string byte by byte would often encounter all sorts of defects in their code. Fortunately today, most of us no longer have to deal with ANSI based character streams on a daily basis. Today most operating system platforms, the Internet, and many of our applications implement Unicode for data input, manipulation, data interchange, and data storage.
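To make the trail-byte problem concrete, here is a small sketch that contrasts a naive byte-by-byte scan with a lead-byte-aware scan. The byte values and lead-byte ranges are those of Shift-JIS (code page 932), chosen only for illustration; the class and method names are hypothetical.

```csharp
using System;

public static class DbcsScan
{
    // Shift-JIS lead bytes fall in the ranges 0x81-0x9F and 0xE0-0xFC.
    private static bool IsLeadByte(byte b)
    {
        return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
    }

    // Naive scan: treats every 0x5C byte as a backslash.
    public static int CountBackslashesNaive(byte[] text)
    {
        int count = 0;
        foreach (byte b in text)
        {
            if (b == 0x5C)
            {
                count++;
            }
        }

        return count;
    }

    // Lead-byte-aware scan: skips the trail byte of each double-byte
    // character so a 0x5C trail byte is never mistaken for a backslash.
    public static int CountBackslashesDbcsAware(byte[] text)
    {
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (IsLeadByte(text[i]))
            {
                i++; // skip the trail byte, whatever its value
            }
            else if (text[i] == 0x5C)
            {
                count++;
            }
        }

        return count;
    }
}
```

For the Shift-JIS bytes `{ 0x43, 0x3A, 0x5C, 0x83, 0x5C }` (the path prefix C:\ followed by the katakana character encoded as 0x83 0x5C), the naive scan reports two backslashes while the lead-byte-aware scan reports one. That difference is exactly the kind of defect that showed up in filename parsing on ANSI based systems.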
Unicode was designed to solve a lot of the problems with data interchange between computers, especially between computer systems using different language version platforms. For example, using a Windows 95 operating system there was virtually no way to view a file containing double byte encoded Chinese ideographic characters using Notepad on an English version of Windows 95. But, on Windows XP or Vista not only can we view the correct character glyph, we can also enter Chinese characters by simply installing the appropriate keyboard drivers and fonts. No special language version or language pack necessary! So, if we created a Unicode document using Russian characters, those same character glyphs would appear no matter what language version of the operating system or application we used, as long as the OS and application were 100% Unicode compliant.
However, Unicode of course has its own unique problems. Unicode was originally based on the UCS-2 Universal Multiple Octet Coded Character Set defined by ISO/IEC 10646. Essentially, UCS-2 provided an encoding schema in which each character glyph is encoded with 16-bits (or 32-bits for UCS-4). A pure 16-bit or 32-bit encoding format didn’t really appeal to a lot of people due to various problems that would arise in string parsing. Most data around the world up to that point (with the exception of East Asian language files) were encoded with 8-bit characters. So, some really creative folks came up with ingenious ways to encode characters that more or less captured the essence of UCS (i.e., one code point == one character) using UCS transformation formats (UTF).
Another problem with UCS-2 and a pure 16-bit encoding was the limitation of 65,536 character code points. It wasn’t very long before most people realized this set of code points was not adequate for our data needs. But, instead of adopting a UCS-4 encoding schema the Unicode Consortium redefined a range of character code points in the private use area as surrogates. A surrogate pair combines two 16-bit code units to reference a single character code point in one of the planes beyond the first 65,536 code points.
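The arithmetic behind a surrogate pair is short enough to show directly. This is a sketch of the standard UTF-16 calculation; the class and method names are hypothetical, and in practice the .NET method `char.ConvertFromUtf32` does the same job.

```csharp
using System;

public static class SurrogateDemo
{
    // Splits a supplementary code point (U+10000 through U+10FFFF) into
    // its UTF-16 high and low surrogate code units.
    public static char[] ToSurrogatePair(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
        {
            throw new ArgumentOutOfRangeException("codePoint");
        }

        int offset = codePoint - 0x10000;
        char high = (char)(0xD800 + (offset >> 10));  // top 10 bits
        char low = (char)(0xDC00 + (offset & 0x3FF)); // bottom 10 bits
        return new[] { high, low };
    }
}
```

For U+20000, the first code point of plane 2, the offset is 0x10000, so the high surrogate is U+D840 and the low surrogate is U+DC00. A test that counts `string.Length` will see two code units for that one character, which is worth remembering when verifying maximum-length boundaries.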
A while back I designed a tool called Str2Val to help developers and testers troubleshoot problematic strings. For example, let’s assume the following string ṙュϑӈɅ䩲Ẩլ。ḩ»モNJĬջḰǝĦ涃ᾬよㇳლȝỄ caused an error in a text box control that accepted a string of Unicode characters. A professional tester would isolate the problematic character or combination of characters causing the error and reference the exact character code point(s) by encoding format in the defect report. I recently upgraded the Str2Val tool to show the same string by various encoding formats such as UTF-16 (big and little endian), UTF-8, UTF-7, and decimal. Not only is this a good tool for troubleshooting problematic strings, it is also a useful training tool to explain the differences in the various common UCS Transformation Formats or UTF encoding methods.
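For readers who want to see what such a conversion involves, the sketch below dumps a string in a given encoding using the standard .NET `System.Text.Encoding` classes. It is not the Str2Val source; the class and method names are hypothetical, and UTF-7 is left out.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class CodePointDump
{
    // Formats the bytes of 'text' in the given encoding as
    // space-separated hexadecimal values.
    public static string ToHexBytes(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    // Lists each UTF-16 code unit as U+XXXX followed by its decimal value.
    public static string ToCodeUnits(string text)
    {
        StringBuilder result = new StringBuilder();
        foreach (char c in text)
        {
            if (result.Length > 0)
            {
                result.Append(' ');
            }

            result.AppendFormat(CultureInfo.InvariantCulture, "U+{0:X4}({1})", (int)c, (int)c);
        }

        return result.ToString();
    }
}
```

Taking the second character of the sample string, U+30E5, `ToHexBytes` gives `E3 83 A5` with `Encoding.UTF8`, `E5 30` with `Encoding.Unicode` (UTF-16 little endian), and `30 E5` with `Encoding.BigEndianUnicode`, while `ToCodeUnits` gives `U+30E5(12517)`. Seeing the same character in each form makes it much easier to write a precise defect report.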
Why is this important as a tester? Well, if you think you represent your customers yet the only characters you use in your testing are the ones labeled on the keyboard that is currently staring you in the face then you are only dealing with a small fraction of the data used by customers around the world (assuming that your software is used outside the country where it is developed, and most English language versions of software are used around the world if they are available on the open market.) If you don’t know how the characters are encoded or which types of problems can arise from the various encoding methods then do you really know how to devise good tests, or are you just guessing? Do you know how to design robust tests with stochastic test data, or are you stuck with stale static data strings in flat files that you simply use over and over again? When a defect occurs in a string of characters (since string data is quite common in testing) can you troubleshoot the cause or isolate the code point, or do you simply just say "yea!…I found another bug!" and throw it back at the developer to figure out?
More on Generating Strings with Random Unicode Characters
Originally Published Sunday, December 24, 2006
Well, for those of you living outside the Pacific Northwest you are probably unaware of the recent wind storm with winds gusting to 60+ miles per hour that left more than 1 million people on the eastern side of the state without power. The damage was pretty extensive, and since I live in a fairly remote area I was without power for more than 7 days and without the Internet for almost 9 days. I do have a generator, but it hadn’t been used in almost 4 years. Sure, I started it every 6 months for about 15 minutes each time, but after the first full day of operation the generator started doing weird things. So, during the past week I have become pretty good at fixing generators (mine and my neighbors’), tracing electrical systems, troubleshooting furnace problems, splitting a lot of firewood, cutting up fallen trees, and repairing fences.
After the sun set (which is quite early) I had little else to do (other than making sure nobody stole my generator), so between stoking the fire I started developing a DLL for Unicode string generation in automated tests based on the GString utility. While reviewing the data tables I created for the GString utility against the Unicode Handbook I noticed some holes (OK…defects). Some of the boundaries for code ranges that are not assigned to any Unicode script group were incorrect. (That will teach me to use a web page with the listing put together by a web developer rather than using the Unicode Handbook.) But, I also found a problem that prevented unassigned code points from being generated even if the “Only use assigned code points” check box was unchecked. So, the (hopefully final) update to GString is complete, including the GString.DLL! Along with the massive overhaul of the Unicode data tables, the new GString package available from my personal website also includes a new DLL for anyone needing to generate strings of random Unicode characters in test automation. The GString zip file also includes detailed documentation on the utility and the DLL usage. Let me know if you have any questions about the tool or using random string generation in your testing.
Well, now back to (mostly) normal life.
More on Globalization Testing and Random Unicode String Generation
Originally Published Sunday, November 12, 2006
After a week in Boston presenting at the 3rd Software Testing and Performance Conference I am relaxing in Baltimore (where I grew up) visiting family and friends. For the second year in a row I presented a workshop on functional and structural testing techniques, and also presented a double-track session on GUI test automation using C#. One speaker cancelled at the last moment, so I volunteered to present the globalization testing basics talk I presented at STAR West a few weeks before. At both conferences I promised the attendees a tool to generate strings of random Unicode characters, and while relaxing along the waterfront of Baltimore’s inner harbor (the weather was quite beautiful this weekend) I managed to finish the tool (at least I am meeting the functional requirements I wanted to achieve).
So, without further ado, on my site Software Testing Mentor is a new section for tools and utilities where you will find the tool I have named "GString." GString will generate random strings of Unicode characters between the ranges of U+0020 and U+FFFF up to 65,535 characters in length either as a fixed length string or a random length string. The ranges of Unicode code points that are not assigned to a language script, and special areas such as Private use and surrogate areas are excluded from the generated strings. The resultant string can be copied to the clipboard and pasted into the edit control you are testing. (I am already thinking that a 2.0 version will populate the edit control that has focus automatically.)
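The core idea is easy to sketch in C#. The code below is not the GString source; it is a minimal illustration that assumes only the surrogate block (U+D800 to U+DFFF) and the private use area (U+E000 to U+F8FF) are excluded, and it does not model the unassigned ranges that GString filters with its data tables. The names and the seed parameter are my own assumptions for the example.

```csharp
using System;
using System.Text;

public static class RandomUnicode
{
    // True for code points this sketch refuses to emit: the surrogate
    // block and the private use area. Unassigned ranges are NOT handled.
    public static bool IsExcluded(int codePoint)
    {
        bool surrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
        bool privateUse = codePoint >= 0xE000 && codePoint <= 0xF8FF;
        return surrogate || privateUse;
    }

    // Builds a string of 'length' random characters in U+0020..U+FFFF.
    // The same seed reproduces the same string, which helps when a
    // failing test needs to be rerun with identical data.
    public static string Generate(int length, int seed)
    {
        if (length < 0)
        {
            throw new ArgumentOutOfRangeException("length");
        }

        Random random = new Random(seed);
        StringBuilder builder = new StringBuilder(length);
        while (builder.Length < length)
        {
            // The upper bound of Random.Next is exclusive.
            int codePoint = random.Next(0x0020, 0x10000);
            if (!IsExcluded(codePoint))
            {
                builder.Append((char)codePoint);
            }
        }

        return builder.ToString();
    }
}
```

A call such as `RandomUnicode.Generate(50, 1234)` returns a 50-character string; logging the seed alongside the test result makes any failure reproducible.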
GString is written in C# and requires the 2.0 .NET runtime available from Microsoft if you don’t already have it installed on your computer.
Well, back up to Boston for a few days before heading home. If you have any comments about the tool (or find any defects) please let me know.
StarWest 2006 Presentation – Testing for Global Customers
Originally Published Friday, November 03, 2006
I recently presented a talk on globalization testing, a topic I think is really interesting, yet I find that most people try to ignore the topic for various reasons. I think that when people start talking about globalization or internationalization testing, some testers and developers shut down mentally. I have heard some devs and testers say, "I don’t know how to read Japanese, so how can I test it?"
Well, after years of working in this area I discovered that a tester doesn’t have to know how to read, speak, or write another language in order to conduct testing which includes string data composed of characters from different language groups. The computer doesn’t know language; it simply knows a series of 0s and 1s. The glyphs which represent characters used in some written language are for human edification, but the computer really doesn’t know the difference between Greek, Russian, or Japanese.
I have been trying to convince testers for years to expand their input testing beyond the typical ASCII characters they see on the keyboard in front of them and include characters from various languages such as Japanese, Arabic, and Hindi. While most testers think this is really cool, I think we sometimes forget how to input Unicode characters from different language groups.
So, to solve that problem I have created a few job aids that provide step-by-step instructions to manually input characters and strings from different languages other than English. So, for those of you looking to expand your string testing capabilities please refer to the job aids I created on my new (work in progress) personal website Software Testing Mentor. The slides for the talk are also available in the Presentations section, but the simulations and examples used during the presentation are not yet posted. Also, I will add more job aids for Chinese, Korean, and Hebrew in the near future.
