Dealing with locale/language specific static test data

It has been some time since my last post. This seems to happen every so often lately, not because I don’t have anything to write about, but mostly because I have too many irons in the fire, so to speak. Juggling hot irons is never fun, and one is always going to drop. Also, about this time every year I go sailing in the San Juan Islands or the Gulf Islands of British Columbia. This year I went to the San Juan Islands and spent a few days incommunicado anchored in Shallow Bay on Sucia Island. Echo Bay is a great anchorage with sandy beaches (unusual for the PNW), and the famous China Caves to explore.

Another place I have been known to explore from time to time is the Stack Exchange Software Quality Assurance and Testing forum. There are many interesting questions and a great variety of responses that offer a wealth of information or provide different perspectives. Recently a question was posed about how to read in static test data for a specific locale or language. Many regular readers know that I am a strong proponent of pseudo-random test data generation in conjunction with automated testing to increase the variability of test data used in each test iteration and generally improve test coverage. But I also understand the value of static test data in providing a solid baseline, and in some cases enabling access to specific test data in different locales or languages.

For example, suppose I am testing a text editor application and I want to read in a text file in the appropriate language based on the locale settings of the operating system's current user. In this situation, I could save a text file containing strings or sentences for each target language or locale dialect. Each file would get a unique name made up of the three-letter ISO 639-2 language code (the complete list is at http://www.loc.gov/standards/iso639-2/php/code_list.php) prepended to a common filename that describes the contents, plus the appropriate extension. For example,

  • ENG[TestData].txt would be English
  • ZHO[TestData].txt would be Chinese
  • DEU[TestData].txt would be German

To get the appropriate text file auto-magically read in to the test at runtime, the only thing we need to do is get the current user's locale and read the ThreeLetterISOLanguageName property of the CultureInfo class in C#.

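    using System;
    using System.Globalization;
    using System.IO;

    string testDataFileName = "testdata.txt";

    // Culture of the current user, e.g. en-US or de-DE
    CultureInfo ci = CultureInfo.CurrentCulture;

    // Directory containing the static files (the desktop here;
    // substitute the server location where your files live)
    string path = Path.GetFullPath(
        Environment.GetFolderPath(Environment.SpecialFolder.Desktop));

    // Prefix the file name with the language code, e.g. "engtestdata.txt"
    string fileName = string.Concat(
        ci.ThreeLetterISOLanguageName, testDataFileName);

    // Read file contents
    using (StreamReader readFile =
        new StreamReader(Path.Combine(path, fileName)))
    {
        // parse test data and do test stuff
    }

Notice that we concatenate the 3-letter ISO language name and the filename (with its extension) in the string.Concat call, then combine the result with the path to the file location and read the file contents using StreamReader. The property returns the code in lowercase (for example "eng"), which still matches a file named with an uppercase prefix on the case-insensitive file systems Windows uses by default.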
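The same idea can be wrapped in a small helper that takes the culture as a parameter, so a test can ask for a specific language without changing the operating system settings. This is only a sketch: the class name LocalizedTestData and both method names are made up for illustration, and it assumes the files follow the naming scheme described above.

```csharp
using System.Globalization;
using System.IO;

public static class LocalizedTestData
{
    // Hypothetical helper: prefixes the base file name with the culture's
    // three-letter ISO 639-2 language code.
    public static string GetFileName(CultureInfo culture, string baseFileName)
    {
        return string.Concat(culture.ThreeLetterISOLanguageName, baseFileName);
    }

    // Hypothetical helper: reads the whole test data file for the given
    // culture from the given directory.
    public static string ReadAll(string directory, CultureInfo culture, string baseFileName)
    {
        string fullPath = Path.Combine(directory, GetFileName(culture, baseFileName));
        return File.ReadAllText(fullPath);
    }
}
```

A test could then call LocalizedTestData.ReadAll(path, CultureInfo.CurrentCulture, "testdata.txt") for the current user, or pass new CultureInfo("de-DE") to force the German file.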

But we might need more specialization depending on what we are testing, for example if we were testing a spell checker for US versus British (and Canadian) English, or testing both simplified Chinese and traditional Chinese. The ISO 639-2 specification does not distinguish between simplified Chinese and traditional Chinese, or between US English and British English. In this case we could “make up” a 3-letter designation such as GBR for Great Britain, or CHT for Chinese (traditional).

Or, perhaps a better solution would be to use the Locale Identifiers (LCID) used by Windows to identify specific locales (rather than languages). The solution is identical to the above except instead of calling the ThreeLetterISOLanguageName property we call the LCID property as illustrated below.

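    using System;
    using System.Globalization;
    using System.IO;

    string testDataFileName = "testdata.txt";

    // Culture of the current user, e.g. en-US or zh-TW
    CultureInfo ci = CultureInfo.CurrentCulture;

    // Directory containing the static files (the desktop here;
    // substitute the server location where your files live)
    string path = Path.GetFullPath(
        Environment.GetFolderPath(Environment.SpecialFolder.Desktop));

    // Prefix the file name with the decimal LCID, e.g. "1033testdata.txt"
    string fileName = string.Concat(
        ci.LCID.ToString(CultureInfo.InvariantCulture), testDataFileName);

    // Read file contents
    using (StreamReader readFile =
        new StreamReader(Path.Combine(path, fileName)))
    {
        // parse test data and do test stuff
    }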

Of course, now we would need to name our static files with the appropriate decimal LCID number, such as

  • 1028testdata.txt would be traditional Chinese used in Taiwan, and
  • 2052testdata.txt would be for simplified Chinese used in PRC

Personally, I prefer getting the LCID as it provides greater control and more specificity. But the downside of using LCIDs is that you may end up with multiple files that have the same contents. For example, although Singapore, Malaysia, and PRC all use simplified Chinese, there are 3 different LCIDs.
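One possible way to limit that duplication is to look for the LCID-specific file first and fall back to the language-level file when no locale-specific one exists. The sketch below is an assumption on my part rather than part of the approach described above; the class name TestDataResolver and its methods are hypothetical, and it assumes both naming schemes live in the same directory.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public static class TestDataResolver
{
    // Candidate file names, most specific first:
    // the LCID-prefixed name, then the language-prefixed name.
    public static IEnumerable<string> CandidateFileNames(CultureInfo culture, string baseFileName)
    {
        yield return culture.LCID.ToString(CultureInfo.InvariantCulture) + baseFileName;
        yield return culture.ThreeLetterISOLanguageName + baseFileName;
    }

    // Returns the full path of the first candidate that exists in the
    // directory, or null when none of the candidates is present.
    public static string Resolve(string directory, CultureInfo culture, string baseFileName)
    {
        foreach (string name in CandidateFileNames(culture, baseFileName))
        {
            string fullPath = Path.Combine(directory, name);
            if (File.Exists(fullPath))
            {
                return fullPath;
            }
        }
        return null;
    }
}
```

With this arrangement, only the locales that genuinely need different contents get their own LCID-named file, and every other locale for that language shares the language-level file.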

There are other properties that allow you to get the culture info for the current user in Windows, and the right property to use ultimately depends on your specific needs. But, CultureInfo class members can easily be used to manage localized static data files or even manage control flow through an automated test that has specific dependencies on a language or a locale setting.
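As a rough illustration of the control flow case, a test could check the culture's LCID before running steps that only make sense for certain locales. The names LocaleGuard and HasChineseTestData are hypothetical, and the set of LCIDs simply reuses the two values from the file naming example above.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class LocaleGuard
{
    // Hypothetical set of LCIDs for which Chinese test data files exist:
    // 1028 (traditional Chinese, Taiwan) and 2052 (simplified Chinese, PRC).
    private static readonly HashSet<int> ChineseLcids = new HashSet<int> { 1028, 2052 };

    // True when the given culture is one of the locales listed above.
    public static bool HasChineseTestData(CultureInfo culture)
    {
        return ChineseLcids.Contains(culture.LCID);
    }
}
```

A test might then branch on LocaleGuard.HasChineseTestData(CultureInfo.CurrentCulture) and skip or adjust the locale-dependent steps when it returns false.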
