Classes, Structs, and Objects—Boxing and Unboxing
| Visual C# Tutorials |
| C# Tutorials |
|
© 2006 Weldon W. Nash, III |
Boxing and Unboxing
Allow me to introduce boxing and unboxing. All types within the CLR fall into one of two categories: reference types (objects) or value types (values). You define objects using classes, and you define values using structs. A clear divide exists between these two. Objects live on the memory heap and are managed by the garbage collector. Values normally live in temporary storage spaces, such as on the stack. The one notable exception already mentioned is that a value type can live on the heap as long as it is contained as a field within an object. However, it is not autonomous, and the GC doesn’t control its lifetime directly. Consider the following code:
public class EntryPoint { static void Print( object obj ) { System.Console.WriteLine( "{0}", obj.ToString() ); } static void Main() { int x = 42; Print( x ); } }
It looks simple enough. In Main(), there is an int, which is a C# alias for System.Int32, and it is a value type. You could have just as well declared x as type System.Int32. The space allocated for x is on the local stack. You then pass it as a parameter to the Print() method. The Print() method takes an object reference and simply sends the results of calling ToString() on that object to the console. Let’s analyze this. Print() accepts an object reference, which is a reference to a heap-based object. Yet, you’re passing a value type to the method. What’s going on here? How is this possible?
The key is a concept called boxing. At the point where a value type is defined, the CLR creates a runtime-created wrapper class to contain the value type. Instances of the wrapper live on the heap and are commonly called boxing objects. This is the CLR’s way of bridging the gap between value types and reference types. In fact, if you use ILDASM to look at the IL code generated for the Main() method, you’ll see the following:
.method private hidebysig static void Main() cil managed { .entrypoint // Code size 15 (0xf) .maxstack 1 .locals init (int32 V_0) IL_0000: ldc.i4.s 42 IL_0002: stloc.0 IL_0003: ldloc.0 IL_0004: box [mscorlib]System.Int32 IL_0009: call void EntryPoint::Print(object) IL_000e: ret } // end of method EntryPoint::Main
Notice the IL instruction, box, which takes care of the boxing operation before the Print() method is called. This creates an object, which Figure 4-2 visualizes.
![]()
Figure 4-2. Result of boxing operation
Figure 4-2 depicts the action of copying the value type into the boxing object that lives on the heap. The boxing object behaves just like any other reference type in the CLR. Also, note that the boxing type implements the interfaces of the contained value type. The boxing type is a class type that is generated internally by the virtual execution system of the CLR at the point where the contained value type is defined. The CLR then uses this internal class type when it performs boxing operations as needed.
The most important thing to keep in mind with boxing is that the boxed value is a copy of the original. Therefore, any changes made to the value inside the box are not propagated back to the original value. For example, consider this slight modification to the previous code:
public class EntryPoint { static void PrintAndModify( object obj ) { System.Console.WriteLine( "{0}", obj.ToString() ); int x = (int) obj; x = 21; } static void Main() { int x = 42; PrintAndModify( x ); PrintAndModify( x ); } }
This output from the preceding code might surprise you:
42
42
The fact is, the original value, x, declared and initialized in Main(), is never changed. As you pass it to the PrintAndModify() method, it is boxed, since the PrintAndModify() method takes an object as its parameter. Even though PrintAndModify() takes a reference to an object that you can modify, the object it receives is a boxing object that contains a copy of the original value. The preceding code also introduces another operation called unboxing in the PrintAndModify() method. Since the value is boxed inside of an instance of an object on the heap, you can’t change the value because the only methods supported by that object are methods that System.Object implements. Technically, it also supports the same interfaces that System.Int32 supports. Therefore, you need a way to get the value out of the box. In C#, you can accomplish this syntactically with casting. Notice that you cast the object instance back into an int, and the compiler is smart enough to know that what you’re really doing is unboxing the value type and using the unbox IL instruction, as the following IL for the PrintAndModify() method shows:
.method private hidebysig static void PrintAndModify(object obj) cil managed { // Code size 28 (0x1c) .maxstack 2 .locals init (int32 V_0) IL_0000: ldstr "{0}" IL_0005: ldarg.0 IL_0006: callvirt instance string [mscorlib]System.Object::ToString() IL_000b: call void [mscorlib]System.Console::WriteLine(string, object) IL_0010: ldarg.0 IL_0011: unbox [mscorlib]System.Int32 IL_0016: ldind.i4 IL_0017: stloc.0 IL_0018: ldc.i4.s 21 IL_001a: stloc.0 IL_001b: ret } // end of method EntryPoint::PrintAndModify
Let me be very clear about what happens during unboxing in C#. The operation of unboxing a value is the exact opposite of boxing. The value in the box is copied into an instance of the value on the local stack. Again, any changes made to this unboxed copy are not propagated back to the value contained in the box. Now, you can see how boxing and unboxing can really become confusing. As shown, the code’s behavior is not obvious to the casual observer who is not familiar with the fact that boxing and unboxing are going on under the covers. What’s worse is that two copies of the int are created between the time the call to PrintAndModify() is initiated and the time that the int is manipulated in the method. The first copy is the one put into the box. The second copy is the one created when the boxed value is copied out of the box.
Technically, it’s possible to modify the value that is contained within the box. However, you must do this through an interface. The runtime-generated box that contains the value also implements the interfaces that the value type implements and forwards the calls to the contained value. So, you could do the following:
public interface IModifyMyValue { int X { get; set; } } public struct MyValue : IModifyMyValue { public int x; public int X { get { return x; } set { x = value; } } public override string ToString() { System.Text.StringBuilder output = new System.Text.StringBuilder(); output.AppendFormat( "{0}", x ); return output.ToString(); } } public class EntryPoint { static void Main() { // Create value MyValue myval = new MyValue(); myval.x = 123; // box it object obj = myval; System.Console.WriteLine( "{0}", obj.ToString() ); // modify the contents in the box. IModifyMyValue iface = (IModifyMyValue) obj; iface.X = 456; System.Console.WriteLine( "{0}", obj.ToString() ); // unbox it and see what it is. MyValue newval = (MyValue) obj; System.Console.WriteLine( "{0}", newval.ToString() ); } }
You can see that the output from the code is as follows:
123
456
456
As expected, you’re able to modify the value inside the box using the interface named IModifyMyValue. However, it’s not the most straightforward process. And keep in mind that before you can obtain an interface reference to a value type, it must be boxed. This makes sense if you think about the fact that references to interfaces are object reference types.
|

