Archive for the ‘Internationalization Testing’ Category
Localization Testing – Part II
Originally Published Friday, October 30, 2009
I should be of no surprise to anyone that localization testing generally focuses on changes in the user interface, although as mentioned in the previous post these are not the only changes necessary to adapt a product to a specific target market. But, the most common category of localization class bug are usability or behavioral type issues that do involve the user interface. Bugs in this category generally include un-localized or un-translated text, key mnemonics, clipped or truncated text labels and other user interface controls, incorrect tab order, and layout issues. Fortunately, the majority of problems in this category do not require a fix in the software’s functional or business layer. Also, the majority of problems in this category do not require any special linguistic skills in order to identify, and in some cases, an automated approach can be even more effective than the human eye (more on that later).
Perhaps the most commonly reported issue in this category is “un-localized” or un-translated textual string. Unfortunately, in many cases un-translated strings is also an over-reported problem that only serves to flood the the defect tracking database with unnecessary bugs. Translating textual strings is a demanding task, and made even more difficult when there are constant changes in the user interface or contextual meaning of messages early in the product life cycle. Over-reporting of un-translated text too early in the product cycle only serves to artificially inflate the bug count, and causes undue pressure and creates extra work for the localization team.
Identifying this type of bugs is actually pretty easy. Here’s a simple heuristic; if you are testing a non-English language version in a language you are not familiar with and you can clearly read the textual string in English it is probably not localized or translated into the target language. The illustration below provides a pretty good example of this general rule of thumb. A tester doesn’t have to read German to realize that the text in the label control under the first radio button is not German.
There are several causes of un-localized text strings to appear in dialogs and other areas of the user interface. For example:
- Worse case scenario is that the string is hard-coded into the source files
- Perhaps localizers did not have enough time to completely process all strings in a particular file
- Perhaps this is a new string in a file localizers thought was 100% localized
- Strings displayed in some dialogs come from files other than the file that generates the dialog, and the localization team has not process that file
- And, sometimes (usually not often), a string may simply be overlooked during the localization process
Testing for un-localized text is often a manually intensive process of enumerating menus, dialogs, and other user interface dialogs, message boxes and form, and form elements. But, if the textual strings are located in a separate resource file (as they should be), a quick scan of resource files might more quickly reveal un-translated textual strings. Of course, there is little context in the resource file, and I also hope the localization team is reviewing their own work as well prior to handing it over to test.
Also, here are a few suggestions that might help focus localization testing efforts early in the project milestone and reduce the number of ‘known’ or false-positive un-translated text bugs being reported:
- Ask the localization team to report the percentage of translation completion by file or module for each test build. Early in the development lifecycle only modules that are reported to be 100% complete which appear to have un-translated text should be reported as valid bugs. Of course, sometimes some strings are used in multiple modules, or may be coming from external resources. But, especially early in the development lifecycle reporting a gaggle of un-translated text bugs is simply “make work.” As the life cycle starts winding down…all strings are fair game for bug hunters!
- Testers should use tools such a Spy++ or Reflector to help identify the module or other resources, and the unique resource ID for the problematic string or resource. This is much better then than simply attaching an image of the offending dialog to a defect report. Identifying the module and the specific resource ID number allows the localization team to affect a quick fix instead of having to search for the dialog through repro steps and track down the problem.
- Also remember that not all textual strings are translated into a specific target language. Registered or trademarked product names are often not translated into different languages. In case of doubt, ask the localization team if a string that appears un-localized is a ‘true’ problem or not.
Unlocalized strings usually due to hard coded strings also tend to occur in menu items. This is especially true in the Windows Start menu or sub-menu items hard-coded in the INF or other installation/setup files. For example, the image on the right shows a common problem on European versions of Windows. Many European language versions localize the name of the Program Files folder, and the menu item in the start menu. But, often times when we install an English language version of software to Windows it creates a new "Programs" menu item (and even a new Program Files directory, rather than detecting the default folder to install to. In the example on the left, the string Accessories is a hard-coded folder name. But, there is another issue as well. This illustrates not only a problem with the non-translated string "Accessories," but also shows one full-width Katakana string for ‘Accessories’ and another half-width string.
In part 3 I will discuss another often problematic area in localization….key mnemonics.
Localization Testing: Part 1
Originally Published Tuesday, October 27, 2009
When I first joined Microsoft 15 years ago I was on the Windows 95 International team. Our team was responsible for reducing the delta between the release of the English version and the Japanese version to 90 days, and I am very proud to say that we achieved that goal and Windows 95 took Japan by storm. It was so amazing that even people without computers were lined up outside of sales outlets waiting to purchase a copy of Windows 95. The growth of personal computers in Japan shot through the roof over the next few years. Today the Chinese market is exploding, and eastern European nations are experiencing unprecedented growth as well. While the demand for the English language versions of our software still remains high, many of our customers are demanding software that is ‘localized’ to accommodate the customers national conventions, language, and even locally available hardware. Although much of the Internets content is in English, non-English sites on the web are growing, and even ICANN is considering allowing international domain names that contain non-ASCII characters this week in Seoul, Korea.
But, a lot has changed in how we develop software to support international markets. International versions of Windows 95 were developed on a forked code base. Basically, this means the source code contained #ifdefs to instruct the compiler to compile different parts of the source code depending on the language family. From a testing perspective this is a nightmare, because if the underlying code base of a localized version is fundamentally different than the base (US English) version then the testing problem is magnified because there is a lot of functionality that must be retested. Fortunately today, much software being produced is based on a single-worldwide binary model. (I briefly explained the single world wide binary concept at a talk in 1991, and Michael Kaplan talks about the advantages here.) In a nutshell, a single worldwide binary model is a development approach in which any functionality any user anywhere in the world might need is codified in the core source code so we don’t need to modify the core code once it is compiled to include some language/locale specific functionality. For example, it was impossible to input Japanese text into Notepad on an English version of Windows 95 using an Input Method Editor (IME); I needed the localized Japanese version. But, on the English version of Windows Xp, Vista, or Windows 7 all I have to do is install the appropriate keyboard drivers and font files and expose the IME functionality. In fact, these days I can map my keyboard to over 150 different layouts and install fonts for all defined Unicode characters on any language version of the Windows operating system.
The big advantage of the single worldwide binary development model is that it allows us to differentiate between globalization testing and localization testing. At Microsoft we define globalization as “the process of designing and implementing a product and/or content (including text and non-text elements) so that it can accommodate any locale market (locale).” And, we define localization as “the process of adapting a product and/or content to meet the language, cultural and political expectations and/or requirements of a specific target market.” This means we can better focus on the specific types of issues that each testing approach is most effective at identifying. For localization testing, this means we can focus on the specific things that change in the software during the “adaptation processes” to localize a product for each specific target market.
The most obvious adaptation process is the ‘localization’ or actually the translation of the user interface textual elements such as menu labels, text in label controls, and other string resources that are commonly exposed to the end user. However, the translation of string resources is not the only thing that occurs during the localization process. The localization processes that are required to adapt a software product to a specific local may also include other changes such as font files and drivers installed by default, registry keys set differently, drivers to support locale specific hardware devices, etc.
3 Categories of Localization Class Bugs
I am a big fan of developing a bug taxonomic hierarchy as part of my defect tracking database as a best practice because it better enables me to analyze bug data more efficiently. If I see a particular category of bug or a type of bug in a category that is being reported a lot, then perhaps we should find ways to prevent or at least minimize the problem from occurring later in the development lifecycle. After years of analyzing different bugs, I classified all localization class bugs into 3 categories; functionality, behavioral/usability, and linguistic quality.
Functionality type bugs exposed in localized software affect the functionality of the software and require a fix in the core source code. Fortunately, with a single worldwide binary development model where the core functionality is separated from the user interface the number of bugs in this category of localization class bugs is relatively small. Checking the appropriate registry keys are set and files are installed in a new build is reasonably straight forward and should be built into the build verification test (BVT) suite. Other types of things that should be checked include application and hardware compatibility. It is important to identify these types of problems early because they are quite costly to correct, and can have a pretty large ripple effect on the testing effort.
Behavioral and usability issues primarily impact the usefulness or aesthetic quality of the user interface elements. Many of the problems in this category do not require a fix in the core functional layer of the source code. The types of bugs in this category include layout issues, un-translated text, key mnemonic issues, and other problems that are often fixed in the user interface form design, form class, or form element properties. This category of problems often accounts for more than 90% of all localization class bugs. Fortunately, the majority of problems in this category do not require any special linguistic skills; a careful eye for detail is all that is required to expose even the most discrete bugs in this category.
The final category of localization class bug is linguistic quality. This category of bugs are primarily mis-translations. Obviously, the ability to read the language being tested is required to identify most problems in this category of errors. We found testers spent a lot of time looking for this type of bug, but later found the majority of linguistic quality type issues reported were resolved as won’t fix. There are many reasons for this, but here is my position on this type of testing….Stop wasting the time of your test team to validate the linguistic quality of the product. If this is a problem then hire a new localizer, hire a team of ‘editors’ to review the work of the localizer, or hire a professional linguistic specialist from the target locale as an auditor. Certainly, if testers see an obvious gaff in a translation then we must report it; however, testers are generally not linguistic experts (even in their native tongue), and I would never advocate hiring testers simply based on linguistic skills nor as a manager would I want to dedicate my valuable testing resources on validating linguistic quality…that’s usually not where their expertise lies, and it probably shouldn’t be.
What’s Next
Since behavioral /usability category issues are the most prevalent type of localization class bug this series of posts will focus on localization testing designed to evaluate user interface elements and resources. The next post will expose the often single most reported bug in this category.
UTF What?
Originally Published Monday, January 14, 2008
Years ago life was pretty simple with regard to data input. Most computer programs were limited to ASCII characters and a set of character glyphs mapped into the code points between 0×80 and 0xFF (high or extended ASCII) depending on the language. The set of characters was limited to 256 code points (0×00 through 0xFF) primarily due to the CPU architecture. Multiple languages were made available via ANSI code pages. Modifying the glyphs in the upper 127 character code points between 0×80 and 0xFF worked pretty well expect for East Asian language versions. So, someone came up with the brilliant idea of encoding a character glyph with 2 bytes instead of just one. This double byte encoding worked quite well except that many developers were unaware that a lead byte could be an 0xE5 character and a trail byte could be a reserved character such as 0x5C (backslash). So, an unknowledgeable developer who stepped incrementally though a string byte by byte would often encounter all sorts of defects in their code. Fortunately today, most of us no longer have to deal with ANSI based character streams on a daily basis. Today most operating system platforms, the Internet, and many of our applications implement Unicode for data input, manipulation, data interchange, and data storage.
Unicode was designed to solve a lot of the problems with data interchange between computers, especially between computer systems using different language version platforms. For example, using a Windows 95 operating system there was virtually no way to view a file containing double byte encoded Chinese ideographic characters using Notepad on an English version of Windows 95. But, on Windows Xp or Vista not only can we view the correct character glyph we can also enter Chinese characters by simply installing the appropriate keyboard drivers and fonts. No special language version or language pack necessary! So, if we created a Unicode document using Russian characters those same character glyphs would appear no matter what language version operating system or application I used as long as the OS and application were 100% Unicode compliant.
However, Unicode of course has its own unique problems. Unicode was originally based on the UCS-2 Universal Multiple Octet Coded Character Set defined by ISO/IEC 10646. Essentially, UCS-2 provided an encoding schema in which each character glyph is encoded with 16-bits (or 32-bits for UCS-4). A pure 16-bit or 32-bit encoding format didn’t really appeal to a lot of people due to various problems that would arise in string parsing. Most data around the world up to that point (with the exception of East Asian language files) were encoded with 8-bit characters. So, some really creative folks came up with ingenious ways to encode characters that more or less captured the essence of UCS (i.e., one code point == one character) using UCS transformation formats (UTF).
Another problem with UCS-2 and a pure 16-bit encoding was the limitation of 65,635 character code points. It wasn’t very long before most people realized this set of code points was not adequate for our data needs. But, instead of adopting a UCS-4 encoding schema the Unicode Consortium redefined a range of character code points in the private use area as surrogates. These surrogate pairs would reference 16-bit character code points in different UCS-4 planes.
A while back I designed a tool called Str2Val to help developers and testers troubleshoot problematic strings. For example, lets assume the following string ṙュϑӈɅ䩲Ẩլ。ḩ»モNJĬջḰǝĦ涃ᾬよㇳლȝỄ caused an error in a text box control that accepted a string of Unicode characters. A professional tester would isolate the problematic character or combination of characters causing the error and reference the exact character code point(s) by encoding format in the defect report. I recently upgraded the Str2Val tool to show the same string by various encoding formats such as UTF 16 (big and little endian), UTF-8, UTF-7, and decimal. Not only is this a good tool for trouble shooting problematic strings, it is also a useful training tool to explain the differences in the various common UCS Transformation Formats or UTF encoding methods.
Why is this important as a tester? Well, if you think you represent your customers yet the only characters you use in your testing are the ones labeled on the keyboard that is currently staring you in the face then you are only dealing with a small fraction of the data used by customers around the world (assuming that your software is used outside the country where it is developed, and most English language versions of software are used around the world if they are available on the open market.) If you don’t know how the characters are encoded or which types of problems can arise from the various encoding methods then do you really know how to devise good tests, or are you just guessing? Do you know how to design robust tests with stochastic test data, or are you stuck with stale static data strings in flat files that you simply use over and over again? When a defect occurs in a string of characters (since string data is quite common in testing) can you troubleshoot the cause or isolate the code point, or do you simply just say "yea!…I found another bug!" and throw it back at the developer to figure out?
More on Generating Strings with Random Unicode Characters
Originally Published Sunday, December 24, 2006
Well, for those of you living outside the Pacific Northwest you are probably unaware of the recent wind storm with winds gusting to 60+ miles per hour that left more than 1 million people on the eastern side of the state without power. The damage was pretty extensive, and since I live in a fairly remote area I was without power for more than 7 days and without the Internet for almost 9 days. I do have a generator, but it hadn’t been used in almost 4 years. Sure, I started it every 6 months for about 15 minutes each time, but after the first full day of operation the generator started doing wierd things. So, during the past week I have become pretty good at fixing generators (mine and my neighbors), tracing electrical systems, troubleshooting furnace problems, splitting a lot of firewood, cutting up fallen trees, and repairing fences.
After the sun set (which is quite early) I had little else to do (other than making sure nobody stole my generator), so between stoking the fire I started developing a DLL for Unicode string generation in automated tests based on the GString utility. While reviewing the data tables I created for the GString utility with the Unicode Handbook I noticed some holes (OK…defects). Some of the boundaries for code ranges that are not assigned to any Unicode script group were incorrect. (That will teach me to use a web page with the listing put together by a web developer rather than using the Unicode handbook.) But, I also found a problem that prevented unassigned code points from being generated even if the Only use assigned code points check box was unchecked.So, the (hopefully final) update to GString is complete, including the GString.DLL! So, along with the massive overhaul of the Unicode data tables, the new GString package available from my personal website also includes a new DLL for anyone needing to generate strings of random Unicode characters in test automation. The GString zip file also includes detailed documentation on the utility and the dll usage. Let me know if you have any questions about the tool or using random string generation in your testing.
Well, now back to (mostly) normal life.
More on Globalization Testing and Random Unicode String Generation
Originally Published Sunday, November 12, 2006 3
After a week in Boston presenting at the 3rd Software Testing and Performance Conference I am relaxing in Baltimore (where I grew up) visiting family and friends. For the second year in a row I presented a workshop on functional and structural testing techniques, and also presented a double-track session on GUI test automation using C#. One speaker cancelled at the last moment, so I volunteered to present the globalization testing basics talk I presented at STAR West a few weeks before. At both conferences I promised the attendees a tool to generate strings of random Unicode characters, and while relaxing along the waterfront of Baltimore’s inner harbor (the weather was quite beautiful this weekend) I managed to finish the tool (at least I am meeting the functional requirements I wanted to achieve).
So, without further ado, on my site Software Testing Mentor is a new section for tools and utilities where you will find the tool I have named "GString." GString will generate random strings of Unicode characters between the ranges of U+0020 and U+FFFF up to 65,535 characters in length either as a fixed length string or a random length string. The ranges of Unicode code points that are not assigned to a language script, and special areas such as Private use and surrogate areas are excluded from the generated strings. The resultant string can be copied to the clipboard and pasted into the edit control you are testing. (I am already thinking that a 2.0 version will populate the edit control that has focus automatically.)
GString is written in C# and requires the 2.0 .NET runtime available from Microsoft if you don’t already have it installed on your computer.
Well, back up to Boston for a few days before heading home. If you have any comments about the tool (or find any defects) please let me know.
StarWest 2006 Presentation – Testing for Global Customers
Originally Published Friday, November 03, 2006
I recently presented a talk on a globalization testing; a topic I think is really interesting, yet I find that most people try to ignore the topic for various reasons. I think that when people start talking about globalization or internationalization testing, some testers and developers shut down mentally. I have heard some devs and testers say, "I don’t know how to read Japanese, so how can I test it?"
Well, after years of working in this area I discovered that a tester doesn’t have to know how to read, speak, or write another language in order to conduct testing which includes string data composed of characters from different language groups. The computer doesn’t know language, it simply knows a series of 0′s and 1′s. The glyphs which represent characters used in some written language is for human edification, but the computer really doesn’t know the difference between Greek, Russian, or Japanese.
I have been trying to convince testers for years to expand their input testing beyond the typical ASCII characters they see on the keyboard in front of them and include characters from various languages such as Japanese and Arabic, and Hindi. While most testers think this is really cool, I think we sometimes forget how to input Unicode characters from different language groups.
So, to solve that problem I have created a few job aids that provide step-by-step instructions to manually input characters and strings from different languages other than English. So, for those of you looking to expand your string testing capabilities please refer to the job aids I created on my new (work in progress) personal website Software Testing Mentor. The slides for the talk are also available in the Presentations section, but the simulations and examples used during the presentation are not yet posted. Also, I will add more job aids for Chinese, Korean, and Hebrew in the near future.