Skip to content

Complex != Better

I know all about over-engineering. I previously wrote about a barn my father designed and built using telephone poles and oak planks and pallets for the stall walls. From a structural perspective this barn would have stood virtually anything mother nature could have thrown at it. And if someday people wanted to tear down that barn they would surely curse the builders because the job would be much harder then they expect.

I sometimes find code that is way over-engineered. In some cases it may be necessary for making the code more robust. But, over-engineered code may also result from ignorance of more efficient algorithms or design patterns. And, sometimes overly complex algorithms are a result of developers trying to craft something that obfuscates the simplicity of the solution and fool others into believing the complexity of the solution is somehow better than a more simple solution.

I sometimes see this in test code. I am a firm believer in robust, ‘bullet-proof’ test code because each time an automated test case throws a false positive or ‘breaks’ the team loses confidence in the automation project, and it takes our time to ‘massage’ the test back to health. But, there seems to be 2 extremes in test automation. Simplistic prescriptive scripted tests with a bunch of hard-coded ‘test-data,’ or overly complex test code that is virtually undecipherable by anyone other than the original developer that renders any downstream maintenance virtually impossible. I have seen many instances where complete libraries of automation was scrapped and re-developed simply because it was easier to re-write the code rather than trying to wade through the quagmire of complexity.

In a recent example, some of my students submitted their test automation projects with a library that contained a method to generate a random string. When I first looked at their method for generating a random string I was a bit perplexed. Besides the fact that the method only produced a grand total of 26 upper case characters, the code was much more complicated than necessary. Despite having showed them how to use the Babel random string generator they choose to make their own, so I asked them how they came up with this solution and they said “they searched on the Internet. ” Others in the class indicated they also used the same method in their projects. So, I when I got home I did a quick search and found the code sample.

   1: private static string RandomString(int size)

   2: {

   3:   StringBuilder builder = new StringBuilder();

   4:   Random random = new Random();

   5:   

   6:   char ch;

   7:   for (int i = 0; i < size; i++)

   8:   {

   9:     ch = Convert.ToChar(

  10:       Convert.ToInt32(Math.Floor(26 * random.NextDouble() + 65)));

  11:     builder.Append(ch);

  12:   }

  13:  

  14:   return builder.ToString();

  15: }

OK…for a moment let’s overlook the fact that if we make 2 consecutive calls to this method we will get the same identical random string of characters (which the author of this code also discovered), and let’s assume that the range of characters is limited to upper case A through Z, and let’s also ignore the fact that we are not seeding our Random generator for reproducibility of the randomly generated output from this method.

Let’s focus on the statement in lines 9 and 10. Why in the world would we generate a number between 0.0 and 1.0, multiply it by 26 and add 65, and then use Math.Floor to return the largest integer less then or equal to the resultant double, then convert that double to a type Int32, and then convert the Int32 to a type char? Now I ask you, can we make this any more complicated?

Now, I am not critiquing the style of the code or its inherent limitations, but this solution is an example of over-engineering. As I have said before, complexity cultivates chaos. At first I thought maybe this approach might give me a better distribution of characters, but when I tested this hypothesis it simply didn’t pan out. So, the way the random character is generated is simply too complex. An easier, more readable, and effective alternative might be to simply generate a random integer within the allowable range and cast that int to a type char as illustrated below in line 10. (Yes, I realize the limitations with this approach if trying to generate surrogate pair characters above U+FFFF.)

   1: private const int minCharacter = 65;

   2: private const int maxCharacter = 91;

   3: private static string SimpleRandomString(int size)

   4: {

   5:   StringBuilder sb = new StringBuilder();

   6:   System.Threading.Thread.Sleep(1);

   7:   Random r = new Random();

   8:   for (int i = 0; i < size; i++)

   9:   {

  10:     sb.Append((char)r.Next(minCharacter, maxCharacter + 1));

  11:   }

  12:  

  13:   return sb.ToString();

  14: }

BTW…the Sleep() in line 6 is a easy solution to preventing identical return values for consecutive calls. But, a more effective solution would be to pass in a seed value as a parameter to the method and use that to seed the new Random generator as illustrated below. Different seeds not only guarantee randomness in the resultant string, but also allow for repeatability if necessary as long as the seed value is preserved.

   1: private static string SimpleRandomString(int size, int seed)

   2: {

   3:   StringBuilder sb = new StringBuilder();

   4:   Random r = new Random(seed);

   5:   ...

   6: }

But, I digress. This post isn’t about random generation or style…it’s about complexity. Overly complex code tends to harbor errors that might go undetected (at least initially). Also, maintenance of complex code not only becomes more problematic, it often leads to costly re-writes down the road.

If we consider that greater complexity may increase the likelihood of error, then we don’t want our development partners to unnecessarily over-engineered algorithms. Also, overly complex solutions often require overly complex testing. This not only leads to increased testing costs, but it also increases the probability of an important test being missed or overlooked. Think testability!

Finally, with regard to test automation we should consider that our automated tests might be reused by other teams, and they certainly will be used during maintenance or sustained engineering efforts well after the product has released. And, in some cases, the people maintaining the product might not be the same people who shipped the product. Well-written automated tests are not just reviewable by someone other than the author of the code, but they should also be easily maintainable. Otherwise, we might end up paying double for that automated test case. Ouch!

One Comment

  1. Hi Bj,

    I guess you assume that “SimpleRandomString” is supposed to be wrapped up in another function that validates “size” argument first, and handles the following special cases:

    size = (2147483647 + 1)

    …or would you increase complexity of “SimpleRandomString” function?
    ——————————–
    I’m sorry for a few tests down here…

    I’ve just submitted my comment and noticed that entire lines containing “prohibited” characters were wiped out.

    So I’m not sure which way it will work and try a few.

    size = = 0
    “size equals zero”

    size <= -1
    “size is equal to or less than”

    size >= (2147483647 + 1)
    “size is equal to or greater than”

    Thanks,
    Albert

    [Bj's Reply] Hi Albert, I combined both your comments to avoid redundancy.

    In answer to your question, the answer is no. The SimpleRandomString method does not need increased complexity to handle your ’special case tests.’

    If you call the SimpleRandomString method and pass a 0 or any value less than 0 then what do you get in return?

    You get an empty string! Basically, you are getting exactly what you ask for.

    Now, if you try to pass it a value greater than 2147483647 then your code will likly not compile. And even if it did (say you weren’t checked mode), you can’t pass an argument greater than 2147483647 to a type int parameter in strongly typed languages such as C#. So, if you try to pass an argument greater than 2147483647 it will actually be your code that throws an exception rather than the SimpleRandomString method.

    These ’special tests’ are in fact all false positives where you think you are getting an error from my code, but in fact the test inputs are based on false assumptions.

    Thursday, April 1, 2010 at 5:47 AM | Permalink