A closer look at Enum

Claus Asbjørn Sørensen

My inner nerd has always enjoyed ripping things apart to figure out what goes on inside them. This week an enum (enums usually live quiet lives without making a big fuss about themselves) presented me with some odd behaviour and a great opportunity to take a closer look at what goes on underneath the surface of .NET.

It All Started With a Quiz

A couple of days ago a good friend of mine sent me a quiz: “Given the following code, what do you expect to see in your console?”
using System;

namespace EnumToString
{
    enum FunWithEnum
    {
        One = 1,
        Two,
        Three = 2,
        Four = 1
    }

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(FunWithEnum.One);
            Console.WriteLine(FunWithEnum.Two);
            Console.WriteLine(FunWithEnum.Three);
            Console.WriteLine(FunWithEnum.Four);
            Console.ReadLine();
        }
    }
}   

The answer to the question is:

One
Two
Two
One

And even though I didn’t get the answer right, it made perfect sense to me when I thought about it. There seem to be two parts to this behavior:

  1. Since no value for Two is provided, the value is calculated by incrementing the value for One which precedes it.
  2. Since you are allowed to choose the same value for different names in enums, it seems only reasonable that the first name provided is used when you .ToString() a value.

Except that’s not what is happening. Take a look at what happens if you change the enum slightly.

enum FunWithEnum
{
    One = 1,
    Two = 3,
    Three = 2,
    Four = 1
}

The new result:

Four
Two
Three
Four

By changing the value of Two, we have managed to change the name for the value 1 from One to Four. So much for the “returning the first name for a value” theory. This is probably as far from a real-world problem as you can get, but it is nevertheless quite interesting behavior.
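The first part of the theory does hold, though. Here is a minimal sketch that reuses the original quiz enum and casts each member to int, so the underlying values are visible without going through ToString() at all. The class name UnderlyingValues is only for illustration.

```csharp
using System;

enum FunWithEnum
{
    One = 1,
    Two,        // no initializer: previous member's value + 1, so 2
    Three = 2,
    Four = 1
}

public static class UnderlyingValues
{
    static void Main()
    {
        // Casting bypasses the name lookup and shows the raw values.
        Console.WriteLine((int)FunWithEnum.One);    // 1
        Console.WriteLine((int)FunWithEnum.Two);    // 2
        Console.WriteLine((int)FunWithEnum.Three);  // 2
        Console.WriteLine((int)FunWithEnum.Four);   // 1

        // Members that share a value compare as equal.
        Console.WriteLine(FunWithEnum.Two == FunWithEnum.Three);  // True
    }
}
```

So Two and Three are the same value as far as the runtime is concerned. Only the choice of name is up for grabs.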

So what is going on here?

In order to understand what is happening we need to take a closer look at the .ToString() implementation for enums. Thankfully this is not a problem since Microsoft has made the source code available to us, and because we have access to a myriad of great decompiler tools.

By simply navigating my way through the implementation using Visual Studio and the symbol files made available at http://referencesource.microsoft.com/ (and DotPeek for assistance here and there), I was able to get an overview of what goes on behind the scenes.

The next step was to yank out the method stack and put together my own “FakeNum” object to play with. This involved cutting and changing quite a bit of the original .NET implementation as some of the constructs used are internal to the .NET framework, while others are simply irrelevant to the case at hand.

What I ended up with was a stack essentially identical to the .NET Enum .ToString() implementation. It looks something like this.

EnumToString.exe!EnumToString.FakeNum.GetEnumData(out string[] enumNames, out System.Array enumValues)
EnumToString.exe!EnumToString.FakeNum.GetEnumNames()
EnumToString.exe!EnumToString.FakeNum.GetEnumName(object value)
EnumToString.exe!EnumToString.FakeNum.GetName(System.Type enumType, object value)
EnumToString.exe!EnumToString.FakeNum.InternalFormat(System.Type eT, object value)
EnumToString.exe!EnumToString.FakeNum.ToString(object value)
EnumToString.exe!EnumToString.FakeNumField.ToString()
EnumToString.exe!EnumToString.Program.Main(string[] args)

The reason our enum behaves the way it does can be found in the innermost call, GetEnumData, which appears at the top of the stack listing above.

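// Requires: using System; using System.Collections;
// _fields holds the FieldInfo entries for the enum's members.
public void GetEnumData(out string[] enumNames, out Array enumValues)
{
    var flds = _fields;

    object[] values = new object[flds.Length];
    string[] names = new string[flds.Length];

    // Copy each member's name and constant value into two parallel arrays.
    for (int i = 0; i < flds.Length; i++)
    {
        names[i] = flds[i].Name;
        values[i] = flds[i].GetRawConstantValue();
    }

    // Insertion sort by value; names are moved in step with their values.
    IComparer comparer = Comparer.Default;
    for (int i = 1; i < values.Length; i++)
    {
        int j = i;
        string tempStr = names[i];
        object val = values[i];
        bool exchanged = false;

        // Shift larger values (and their names) one slot to the right.
        while (comparer.Compare(values[j - 1], val) > 0)
        {
            names[j] = names[j - 1];
            values[j] = values[j - 1];
            j--;
            exchanged = true;
            if (j == 0)
                break;
        }

        // Drop the current element into the gap that opened up.
        if (exchanged)
        {
            names[j] = tempStr;
            values[j] = val;
        }
    }

    enumNames = names;
    enumValues = values;
}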

An insertion sort orders the internal enum representation, which is essentially two parallel arrays (names and values), by value. The arrays are sorted together and eventually returned to GetEnumName(), which runs a BinarySearch on the values array to find the index of the requested value. The same index is then used to look up the name corresponding to the value:

int index = BinarySearch(values, value);

if (index >= 0)
{
    string[] names = GetEnumNames();
    return names[index];
}

return null;

This means that where each name ends up after sorting depends on all the values in the enum, while the BinarySearch can still return the same index (the value at that index is the same, but the name is different). Example:

[ { Name = A, Value = 1 }, { Name = B, Value = 1 }, { Name = C, Value = 2 }, { Name = D, Value = 2 } ]

If we are looking for the name of the value 1, and BinarySearch returns Index = 1, the resulting name is B. But if we change the value for D to 0 we get the following array:

[ { Name = D, Value = 0 }, { Name = A, Value = 1 }, { Name = B, Value = 1 }, { Name = C, Value = 2 } ]

We still have the value 1 at index 1, but the corresponding name is now A.
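The A/B/C/D example can be replayed in a few lines. This is only a sketch under stated assumptions: DuplicateLookupDemo, NameOf and FindIndex are hypothetical names, the values are plain ints, and FindIndex is a hand-written midpoint binary search rather than the framework's own BinarySearch, so that the index it lands on is easy to follow.

```csharp
using System;

public static class DuplicateLookupDemo
{
    // Hand-written midpoint binary search (an assumption, not the framework's code).
    // Returns the index of *a* matching element, or -1 if there is none.
    static int FindIndex(int[] sortedValues, int target)
    {
        int lo = 0, hi = sortedValues.Length - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sortedValues[mid] == target) return mid;
            if (sortedValues[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }

    // Sorts copies of the two arrays together by value, then looks up the name.
    public static string NameOf(string[] names, int[] values, int target)
    {
        var n = (string[])names.Clone();
        var v = (int[])values.Clone();

        // Insertion sort by value; names move in step with their values.
        for (int i = 1; i < v.Length; i++)
        {
            int val = v[i];
            string name = n[i];
            int j = i;
            while (j > 0 && v[j - 1] > val)
            {
                v[j] = v[j - 1];
                n[j] = n[j - 1];
                j--;
            }
            v[j] = val;
            n[j] = name;
        }

        int index = FindIndex(v, target);
        return index >= 0 ? n[index] : null;
    }

    static void Main()
    {
        string[] names = { "A", "B", "C", "D" };
        Console.WriteLine(NameOf(names, new[] { 1, 1, 2, 2 }, 1)); // B
        Console.WriteLine(NameOf(names, new[] { 1, 1, 2, 0 }, 1)); // A
    }
}
```

In both calls the search stops at index 1. The only thing that changed is which name the sort left sitting there.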

Conclusion

Unfortunately I haven’t been able to reproduce the exact case I started out with. My “FakeNum” implementation shows the same kind of behavior, except “reversed”. I am guessing this is a simple issue related to sorting, or perhaps something “got lost in translation” when I started putting bits of the .NET implementation together to create the FakeNum.

I think the natural next step to investigate further is to enable source stepping in Visual Studio and start debugging what actually happens in .ToString() at runtime. But I will leave that exercise for another blog post.
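Short of source stepping, one quick probe might be to print the order in which Enum.GetNames reports the members of the quiz enum. This is only a sketch: EnumOrderProbe and Describe are hypothetical names, and it assumes (without having verified it) that ToString() draws on the same ordering as GetNames. The order of names that share a value depends on the runtime, so no expected output is given.

```csharp
using System;

enum FunWithEnum
{
    One = 1,
    Two,
    Three = 2,
    Four = 1
}

public static class EnumOrderProbe
{
    // Pairs each name with its numeric value, in the order Enum.GetNames reports them.
    public static string[] Describe(Type enumType)
    {
        string[] names = Enum.GetNames(enumType);
        var lines = new string[names.Length];
        for (int i = 0; i < names.Length; i++)
        {
            object value = Enum.Parse(enumType, names[i]);
            lines[i] = names[i] + " = " + Convert.ToInt64(value);
        }
        return lines;
    }

    static void Main()
    {
        foreach (string line in Describe(typeof(FunWithEnum)))
            Console.WriteLine(line);
    }
}
```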

For now I am satisfied with having illustrated how the internal implementation of an enum allows seemingly unrelated changes in values to alter the output of ToString().

I have made the code available at https://github.com/braincell/EnumToString if you want to try it out for yourself.
