C# Coding Solutions—Immutable Types Are Scalable Types


C# Coding Solutions

© 2006 Christian Gross

Immutable Types Are Scalable Types

One of the most commonly used types is string, and string is an immutable class, meaning that once an instance is created its value cannot change. To change the state of an immutable class, a new instance needs to be created. Developers rarely consider writing immutable classes because our logic says immutable classes are inefficient. Yet one of the most used types, string, is immutable.

Efficiency can be measured multiple ways, but on today’s hardware efficiency is best measured by how little the code gets in the way of the processor. These days we have processors that can knock your socks off. Yet we write code that accesses the disk too often, waits for other information, and worse, calls across a network connection. During all of these operations the processor is idly twiddling its thumbs waiting for a response. And then, when a response arrives, the program uses threads and locks to control access to an object.

A scalable class uses the keyword lock as little as possible. The keyword lock blocks multiple threads from accessing the same piece of code concurrently. The following example uses the lock keyword:

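using System;

class ExampleSynchronized {
    // Private lock object; locking on "this" would let outside code take the same lock.
    private readonly object _sync = new object();

    public void OnlySingleThread() {
        // Only one thread at a time can execute the body of the lock block.
        lock( _sync) {
            Console.WriteLine( "Hello world");
        }
    }
}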

In the example the method OnlySingleThread allows only one thread at a time to output the text to the console. Allowing only one thread access is a bottleneck, and will cause a program to slow down. With the introduction of multicore CPUs the bottleneck problem will become worse as people adapt to writing multithreaded applications. An immutable class is the best type to write for multithreaded and multiprocess scenarios.

In the section "Designing Consistent Classes" I designed an immutable class to illustrate a consistent class. Here is that source code again:

class Rectangle {
    private readonly long _length, _width;
 
    public Rectangle( long length, long width) {
        _length = length;
        _width = width;
    }
    public virtual long Length {
        get {
            return _length;
        }
    }
    public virtual long Width {
        get {
            return _width;
        }
    }
}
 
class Square : Rectangle {
    public Square( long width) : base( width, width) {
 
    }
}

The example class is immutable because its fields are declared readonly and the Length and Width properties of Rectangle have no set accessor. Under most circumstances the immutable definition of Rectangle is faster than a read-write version of Rectangle and Square because memory managers have become smarter, and because there is no lock to slow down the code.
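Because no set accessor exists, "changing" a rectangle means building a new one. The following is a minimal sketch of that pattern, assuming a hypothetical WithLength helper and a hypothetical ImmutableRectangle class that are not part of the original Rectangle:

```csharp
using System;

// Simplified stand-in for the article's Rectangle, with one hypothetical addition.
class ImmutableRectangle {
    private readonly long _length, _width;

    public ImmutableRectangle(long length, long width) {
        _length = length;
        _width = width;
    }
    public long Length { get { return _length; } }
    public long Width { get { return _width; } }

    // Hypothetical helper: returns a new instance instead of mutating this one.
    public ImmutableRectangle WithLength(long length) {
        return new ImmutableRectangle(length, _width);
    }
}

class WithLengthDemo {
    static void Main() {
        ImmutableRectangle original = new ImmutableRectangle(10, 20);
        ImmutableRectangle longer = original.WithLength(15);

        Console.WriteLine(original.Length); // 10, the original is untouched
        Console.WriteLine(longer.Length);   // 15
    }
}
```

Any code still holding a reference to the original instance keeps seeing the old dimensions, which is what makes the type safe to share.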

However, an immutable object requires more resources than other object types do. Our prevailing logic says that changing a variable requires less memory than copying and modifying a variable. The reality is quite a bit different, though, and depends on the context in which an immutable object is used.

One of the reasons why immutable types can be faster is that today's memory managers are optimized based on what was learned from memory management in years past. In C and C++, a buffer was allocated from the heap, which a memory manager managed. The memory manager split up the memory to mark used areas and unused areas. In the C and C++ memory model, whenever a piece of data is allocated, the memory manager searches the heap for an appropriate piece of memory to fit the need. This searching and slicing of the memory, however, costs many CPU cycles.

Memory management for immutable objects is more efficient because immutable object memory is of fixed dimensions and is either used or available. The memory manager does not have to split the memory, expand the memory, or do any of the expensive operations that traditional memory managers perform. Memory managers in runtimes like .NET, Java, and Apache Portable Runtime (APR) make assumptions about memory use. These assumptions use linear mathematics and greatly speed up allocations and freeing. Of course, these assumptions work today because memory is plentiful. The old memory-manager model was geared toward computers that considered RAM scarce.

The other reason why immutable objects are faster is that there is no need for synchronization. When multiple threads access a piece of data, synchronization is needed so that each thread manipulates stable data. If the data is always stable, then synchronization is not necessary. Therefore, immutable objects have a definite speed advantage over objects that have synchronization requirements.
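As a rough illustration of lock-free sharing, the sketch below lets several tasks read one immutable instance concurrently. The ImmutablePoint and SharedReadDemo names are hypothetical, and Parallel.For comes from the Task Parallel Library, which is newer than this article:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical immutable type: all state is fixed in the constructor.
class ImmutablePoint {
    private readonly long _x, _y;
    public ImmutablePoint(long x, long y) { _x = x; _y = y; }
    public long X { get { return _x; } }
    public long Y { get { return _y; } }
}

class SharedReadDemo {
    // Reads the shared instance from several tasks without any lock.
    public static long[] SumFromManyTasks(ImmutablePoint shared, int taskCount) {
        long[] results = new long[taskCount];
        Parallel.For(0, taskCount, i => {
            // Each task writes only to its own slot; the shared object is only read.
            results[i] = shared.X + shared.Y;
        });
        return results;
    }

    static void Main() {
        ImmutablePoint shared = new ImmutablePoint(3, 4);
        foreach (long sum in SumFromManyTasks(shared, 8)) {
            Console.WriteLine(sum); // every task sees 7
        }
    }
}
```

No task can ever observe a half-updated point, because nothing updates it after construction.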

Ideally, an immutable class should be a data class, which stores and manipulates data but typically does not modify it. The string class is an excellent example of a data class. It has data-member operations, but the operations do not modify the instance's own data. The operations copy the data and return a new string instance. This separation of data and manipulations makes it possible to never need synchronization routines.
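The behavior of string described above can be seen in a few lines. This is only a small demonstration; the class name StringImmutabilityDemo is made up for the example:

```csharp
using System;

class StringImmutabilityDemo {
    static void Main() {
        string original = "hello";

        // ToUpper does not touch the original; it returns a new string instance.
        string upper = original.ToUpper();

        Console.WriteLine(original);                         // hello
        Console.WriteLine(upper);                            // HELLO
        Console.WriteLine(ReferenceEquals(original, upper)); // False
    }
}
```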

There is a downside to immutable classes: they can be resource-intensive. For example, performing too many string manipulations will result in memory thrashing as buffers are constantly allocated and freed. Another problem is that if a class allocates 4MB with each instantiation, then that instantiation will become costly. In such a situation, the solution is to use pooled objects. Pooled objects are recycled immutable objects, a topic that is beyond the scope of this book.
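For strings specifically, one common way to limit this allocation churn is a mutable buffer such as System.Text.StringBuilder. The text above does not mention it, so treat the following as an illustrative aside rather than the author's recommendation; the ConcatenationDemo class and its method names are hypothetical:

```csharp
using System;
using System.Text;

class ConcatenationDemo {
    // Each += allocates a new string, because string is immutable.
    public static string JoinWithConcat(int count) {
        string result = "";
        for (int i = 0; i < count; i++) {
            result += i.ToString();
        }
        return result;
    }

    // StringBuilder appends into a growable buffer and creates the string once at the end.
    public static string JoinWithBuilder(int count) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++) {
            builder.Append(i);
        }
        return builder.ToString();
    }

    static void Main() {
        Console.WriteLine(JoinWithConcat(5));  // 01234
        Console.WriteLine(JoinWithBuilder(5)); // 01234
    }
}
```

Both methods produce the same text; the difference is only in how many intermediate string instances are created along the way.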

To illustrate that immutable types are faster in most contexts, let’s carry out a performance test. The following source code defines three types: a modifiable class, an immutable class, and an immutable structure:

class Regular {
    private int _value;
 
    public Regular( int initial) {
        _value = initial;
    }
    public Regular Increment() {
        _value ++;
        return this;
    }
}
 
class Immutable {
    private readonly int _value;
 
    public Immutable( int initial) {
        _value = initial;
    }
    public Immutable Increment() {
        return new Immutable( _value + 1);
    }
}
 
struct structImmutable {
    private readonly int _value;
    public structImmutable( int initial) {
        _value = initial;
    }
    public structImmutable Increment() {
        return new structImmutable( _value + 1);
    }
}

The class Regular is a modifiable type that changes its own data member. The class Immutable is an immutable class whose Increment method allocates a new class instance instead of changing the existing one. The structure structImmutable is like the class Immutable, except that it is a structure. A structure is a value type, so a local structImmutable variable is typically stored on the stack rather than allocated on the managed heap.
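The difference between the two Increment styles is easiest to see from the caller's side. The sketch below uses hypothetical MutableCounter and ImmutableCounter classes, which mirror Regular and Immutable but add a Value property (an assumption, not in the original) so the state can be inspected:

```csharp
using System;

// Hypothetical variant of Regular with a Value property added for inspection.
class MutableCounter {
    private int _value;
    public MutableCounter(int initial) { _value = initial; }
    public int Value { get { return _value; } }
    public MutableCounter Increment() {
        _value++;
        return this; // same instance, now changed
    }
}

// Hypothetical variant of Immutable with a Value property added for inspection.
class ImmutableCounter {
    private readonly int _value;
    public ImmutableCounter(int initial) { _value = initial; }
    public int Value { get { return _value; } }
    public ImmutableCounter Increment() {
        return new ImmutableCounter(_value + 1); // new instance, original untouched
    }
}

class IncrementDemo {
    static void Main() {
        MutableCounter m1 = new MutableCounter(1);
        MutableCounter m2 = m1.Increment();
        Console.WriteLine(m1.Value);                // 2
        Console.WriteLine(ReferenceEquals(m1, m2)); // True

        ImmutableCounter i1 = new ImmutableCounter(1);
        ImmutableCounter i2 = i1.Increment();
        Console.WriteLine(i1.Value);                // 1
        Console.WriteLine(i2.Value);                // 2
    }
}
```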

Let’s put Regular, Immutable, and structImmutable side by side for performance analysis. The code will be executed in the context of one thread, running flat out without any type of synchronization. Intuitively, we’d think Regular should be the fastest. Following are the performance numbers, where the percentages show how much time was spent allocating and executing the Increment method:

Regular:            0.05%
Immutable:          0.25%
structImmutable:    0.10%

Our intuition was correct: the immutable class, which constantly allocates new objects, requires five times more computing time than the read-write type. However, the struct value type is only two times slower than Regular. Right now the score is modifiable 1, and immutable 0.
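The text does not show the harness that produced these percentages. A rough way to repeat the comparison is a wall-clock loop like the sketch below; the TimeMs helper, the IncrementTiming class, and the iteration count are all assumptions, and Stopwatch timings are not the same measurement as the percentages above, so the numbers will differ by machine and runtime:

```csharp
using System;
using System.Diagnostics;

// Compact copies of the article's three types so this file compiles on its own.
class Regular {
    private int _value;
    public Regular(int initial) { _value = initial; }
    public Regular Increment() { _value++; return this; }
}
class Immutable {
    private readonly int _value;
    public Immutable(int initial) { _value = initial; }
    public Immutable Increment() { return new Immutable(_value + 1); }
}
struct structImmutable {
    private readonly int _value;
    public structImmutable(int initial) { _value = initial; }
    public structImmutable Increment() { return new structImmutable(_value + 1); }
}

class IncrementTiming {
    // Hypothetical helper: runs the action once and returns elapsed milliseconds.
    public static long TimeMs(Action action) {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    static void Main() {
        const int iterations = 10000000; // assumed iteration count

        long regularMs = TimeMs(() => {
            Regular r = new Regular(0);
            for (int i = 0; i < iterations; i++) { r = r.Increment(); }
        });
        long immutableMs = TimeMs(() => {
            Immutable im = new Immutable(0);
            for (int i = 0; i < iterations; i++) { im = im.Increment(); }
        });
        long structMs = TimeMs(() => {
            structImmutable s = new structImmutable(0);
            for (int i = 0; i < iterations; i++) { s = s.Increment(); }
        });

        Console.WriteLine("Regular:         {0} ms", regularMs);
        Console.WriteLine("Immutable:       {0} ms", immutableMs);
        Console.WriteLine("structImmutable: {0} ms", structMs);
    }
}
```

A single run like this is noisy, and the JIT compiler may optimize parts of the loops, so treat the output as a rough indication only.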

The next performance test involves the use of code locks and synchronization techniques. Rerunning the performance tests generates the following data:

Regular:            0.46%
Immutable:          0.52%
structImmutable:    0.45%

This time the numbers indicate that it does not matter which type is used; they are all executing at roughly the same performance levels. The score now seems to be modifiable 2, and immutable 0. However, remember that although this code uses locks, no threads are actually competing for them. The modifiable code is running flat out with no waiting time. And this is where immutable gets bonus marks. The moment a modifiable object has to wait, the modifiable type takes a massive performance hit. It could get worse; the code could deadlock. With immutable objects, there are no wait times and no deadlocks.
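To make the waiting concrete, here is a minimal sketch of a modifiable object shared by several tasks. The LockedCounter and ContentionDemo names are hypothetical, and this is not the test code behind the numbers above. Every call to Increment has to queue on the same lock, which is exactly the wait that an immutable type avoids:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical modifiable type that must guard its state with a lock.
class LockedCounter {
    private readonly object _sync = new object();
    private int _value;

    public void Increment() {
        lock (_sync) { _value++; } // threads wait here when they collide
    }
    public int Value {
        get { lock (_sync) { return _value; } }
    }
}

class ContentionDemo {
    // Several tasks increment one shared counter; the lock keeps the total correct.
    public static int CountWithLock(int taskCount, int perTask) {
        LockedCounter counter = new LockedCounter();
        Parallel.For(0, taskCount, t => {
            for (int i = 0; i < perTask; i++) {
                counter.Increment();
            }
        });
        return counter.Value;
    }

    static void Main() {
        Console.WriteLine(CountWithLock(4, 100000)); // 400000
    }
}
```

Without the lock the total could come out lower, because concurrent increments of the same field can overwrite each other.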

When creating immutable classes, remember the following:

  • Immutable classes are often data classes. A data class holds data only, and does not provide many operations that relate to application logic. The class might have a large number of methods that relate to manipulating the data, such as the String class.
  • An immutable class does not reference a read-write class. This means that an immutable class does not publicly expose a type that is read-write capable.
  • Immutable classes typically do not implement interfaces. But a very useful technique is the implementation of an immutable interface by a read-write class. An immutable interface has read-only operations and allows a class implementation to optimize when synchronization is necessary.
  • Immutable classes can be serialized and are typically transported across networks or AppDomains. Rarely will a type hold a remote reference to an immutable type.
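The technique of a read-write class implementing an immutable interface might look like the following sketch. The IReadOnlyRectangle, MutableRectangle, and ReadOnlyViewDemo names are hypothetical and chosen only for this illustration:

```csharp
using System;

// Hypothetical immutable interface: exposes getters only.
interface IReadOnlyRectangle {
    long Length { get; }
    long Width { get; }
}

// Read-write class that also implements the read-only interface.
class MutableRectangle : IReadOnlyRectangle {
    private long _length, _width;

    public MutableRectangle(long length, long width) {
        _length = length;
        _width = width;
    }
    public long Length { get { return _length; } set { _length = value; } }
    public long Width { get { return _width; } set { _width = value; } }
}

class ReadOnlyViewDemo {
    // Consumers that take the interface cannot call the setters.
    public static long Area(IReadOnlyRectangle rectangle) {
        return rectangle.Length * rectangle.Width;
    }

    static void Main() {
        MutableRectangle rectangle = new MutableRectangle(2, 3);
        Console.WriteLine(Area(rectangle)); // 6

        rectangle.Length = 4;               // the owner can still modify it
        Console.WriteLine(Area(rectangle)); // 12
    }
}
```

The interface only restricts what a consumer can do through that reference. The underlying object can still change, so the owner remains responsible for deciding when synchronization is needed.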

