C# and Boxing and Unboxing

To understand boxing, you first need to understand the two kinds of types in C#: value types and reference types.

Value Types

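A value type is any type that is a struct or an enum. Value types are cheap to create and copy because a variable holds the data itself: a local variable typically lives on the stack, and a field lives inline inside whatever contains it. Most of the built-in types in the C# language are structs, and so are value types. Here are the built-in simple value types, each with the .NET type it aliases.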

bool System.Boolean
byte System.Byte
sbyte System.SByte
char System.Char
decimal System.Decimal
double System.Double
float System.Single
int System.Int32
uint System.UInt32
long System.Int64
ulong System.UInt64
short System.Int16
ushort System.UInt16

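Value types still derive from System.Object (through System.ValueType), but a value type variable holds the value itself rather than a reference to an object on the heap.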

Reference Types

Most reference types are types you define yourself, though a few are built in; the two you will use most are listed below. Instances of reference types are stored on the heap. Heap allocation is fast, but not as fast as the stack, and the garbage collector must later reclaim the memory. Also, if you run out of physical memory, the swap or page file could be used and your object could potentially end up on disk, which is slow.

object System.Object
string System.String

Object and String are both reference types even though they are built-in types. That means they are stored on the heap. However, the stack is still used: for a local variable, the stack holds a reference to the data instead of the data itself.

Note: When a value type is boxed, it is boxed into the built-in Object type.

Any class created is a reference type.  So if you create a Person class, it is a reference type.
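The practical difference between the two kinds of types shows up when you assign one variable to another. The following is a minimal sketch; PointStruct and PointClass are hypothetical names made up for this illustration and are not part of any library.

```csharp
using System;

// Hypothetical types for illustration only.
public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class CopySemanticsDemo
{
    public static void Main()
    {
        // Value type: assignment copies the data itself.
        PointStruct s1 = new PointStruct { X = 1 };
        PointStruct s2 = s1;
        s2.X = 99;

        // Reference type: assignment copies only the reference,
        // so both variables point at the same object on the heap.
        PointClass c1 = new PointClass { X = 1 };
        PointClass c2 = c1;
        c2.X = 99;

        Console.WriteLine(s1.X); // 1  (s1 is unaffected by the change to s2)
        Console.WriteLine(c1.X); // 99 (c1 and c2 are the same object)
    }
}
```

The struct copy is independent of the original, while the two class variables share one object.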

What is Boxing in C#

Boxing is the concept of encapsulating a value type in the Object reference type.

This is the simplest example of boxing.

int i = 100;
Object o = i;

So if you look at the above, you have two variables. The first, i, is an int, which is a value type and is stored on the stack. The second, o, is a reference type: the stack holds a reference to a location on the heap where the boxed object is stored. The heap is slower than the stack, and there is a cost to copying the value from the stack to the heap.
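One consequence worth seeing in code is that boxing copies the value. The sketch below (the class name BoxingCopyDemo is just an illustrative choice) changes the original variable after boxing and shows that the boxed copy keeps the old value.

```csharp
using System;

public static class BoxingCopyDemo
{
    public static void Main()
    {
        int i = 100;
        object o = i;   // boxing: the value 100 is copied into a new object

        i = 200;        // changes only the original variable, not the box

        Console.WriteLine(i);           // 200
        Console.WriteLine(o);           // 100
        Console.WriteLine(o.GetType()); // System.Int32
    }
}
```

The box still knows it contains an int, which is why GetType() reports System.Int32 rather than System.Object.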

Sometimes boxing is not obvious at first. For example, your own struct might behave differently from the built-in types. Calling the ToString() method may or may not cause boxing. Usually it does not cause boxing in built-in types. Look at this example.

    public struct Rectangle
    {
        public double width;
        public double height;
    }

    // Inside a method:
    int i = 100;
    Rectangle rect = new Rectangle() { height = 100, width = 200 };
    // No boxing: System.Int32 overrides ToString()
    Console.WriteLine(i.ToString());
    // Boxing occurs: Rectangle does not override ToString()
    Console.WriteLine(rect.ToString());

The reason boxing occurs on your own struct is that Rectangle does not override the ToString() method.  The documentation states:

If thisType is a value type and thisType does not implement method then ptr is dereferenced, boxed, and passed as the ‘this’ pointer to the callvirt  method instruction. [1]

So you could have done this to prevent boxing.

    public struct Rectangle
    {
        public double width;
        public double height;
        public override string ToString() { return width.ToString() + "," + height.ToString(); }
    }

Any time a value type is encapsulated into an Object reference type and stored on the heap, boxing occurs.
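Boxing also happens in places where no Object variable is written out explicitly, such as passing a value type to a parameter of type object or converting it to an interface. The following sketch (the class name HiddenBoxingDemo is an illustrative choice) uses object.ReferenceEquals to make the separate boxes visible.

```csharp
using System;

public static class HiddenBoxingDemo
{
    public static void Main()
    {
        int i = 100;

        // ReferenceEquals takes two object parameters, so each argument
        // is boxed into its own object and the references differ.
        Console.WriteLine(object.ReferenceEquals(i, i)); // False

        // Converting a value type to an interface it implements also boxes it.
        IComparable comparable = i;
        Console.WriteLine(comparable.CompareTo(100)); // 0

        // Two separate boxes of the same value: different objects, equal values.
        object o1 = i;
        object o2 = i;
        Console.WriteLine(object.ReferenceEquals(o1, o2)); // False
        Console.WriteLine(o1.Equals(o2));                  // True
    }
}
```

Equals returns True because System.Int32 overrides it to compare values, while ReferenceEquals compares the two boxes themselves.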

What is Unboxing

Unboxing means taking a value type that is encapsulated in an Object reference type and copying the value back out into a value type variable.

int i = 100;
// Boxing
Object o = i;

// Unboxing
int j = (int)o;

So the above code shows a simple cast that pulls a value type out of the Object reference type.
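One detail the cast hides is that unboxing must name the exact type that was boxed. A minimal sketch (the class name UnboxingDemo is an illustrative choice) shows that a boxed int cannot be unboxed directly to long, even though an int converts to long implicitly everywhere else.

```csharp
using System;

public static class UnboxingDemo
{
    public static void Main()
    {
        int i = 100;
        object o = i;   // boxing

        int j = (int)o; // unboxing to the exact type works
        Console.WriteLine(j); // 100

        // Unbox to int first, then let the int widen to long.
        long viaInt = (int)o;
        Console.WriteLine(viaInt); // 100

        try
        {
            // The box holds an int, not a long, so this cast fails at run time.
            long k = (long)o;
            Console.WriteLine(k);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("A boxed int cannot be unboxed directly to long");
        }
    }
}
```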

Why is this important?

Speed.  When you box a value type, you are encapsulating it in an object. That object is allocated on the heap and the value is copied into it; unboxing copies the value back out. The heap is fast, but the stack is faster, so there is a cost to copying data back and forth between the two. And if you are short on memory, it could really bog you down if your application had to rely on the swap or page file, as these are stored on disk and disks are not fast.

Fastest    Fast    Slow
Stack      Heap    Disk (swap or page file)

Is Boxing bad?

No. It is a very important part of C#.

It is only a problem when it happens without being needed (usually unintentionally) and causes a performance hit, such as when looping through a large list. So if you ever loop through a large list of value types, it is a good idea to check whether every value in the list is being boxed and unboxed along the way.
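As a sketch of the list case, the example below compares the non-generic ArrayList, which stores every element as object, with the generic List<int>, which stores the ints directly. The class name ListBoxingDemo and the element count are illustrative choices, and no timings are claimed here; measure your own code before changing it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ListBoxingDemo
{
    public static void Main()
    {
        const int count = 1000;

        // ArrayList holds object references, so Add boxes every int
        // and the cast in the loop unboxes every element.
        ArrayList boxedList = new ArrayList();
        for (int n = 0; n < count; n++)
        {
            boxedList.Add(n);
        }
        long boxedSum = 0;
        foreach (object item in boxedList)
        {
            boxedSum += (int)item;
        }

        // List<int> stores the ints themselves: no boxing or unboxing.
        List<int> intList = new List<int>();
        for (int n = 0; n < count; n++)
        {
            intList.Add(n);
        }
        long intSum = 0;
        foreach (int item in intList)
        {
            intSum += item;
        }

        Console.WriteLine(boxedSum); // 499500
        Console.WriteLine(intSum);   // 499500
    }
}
```

Both loops produce the same result; the difference is that the ArrayList version allocates one heap object per element.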


Resources

http://www.dijksterhuis.org/exploring-boxing/#more-908
http://msdn.microsoft.com/en-us/library/yz2be5wk.aspx
http://msdn.microsoft.com/en-us/library/s1ax56ch.aspx
http://www.codeproject.com/KB/cs/boxing.aspx
http://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes.constrained.aspx
