Garbage Collector and Dispose

Today I managed a bad combination: an airport busy because of Christmas and an alarm set too late. The result is quite predictable... I missed my flight! At least I have my laptop, so I can get something done while I wait for the next one.

These days I've been interviewing candidates to hire two new developers, and it's shocking how many people have worked with .NET for more than a year without being clear on the concepts behind the Garbage Collector and Dispose. Statements like "I should force a garbage collection because memory management in .NET is very bad" are unfortunately very common. I'm not going to say it is perfect, but memory management in .NET is very good, and before making a statement like that we should understand how to use it properly and then apply some self-criticism to our own code.

Background

Automatic memory management basically frees the developer from allocating and releasing memory for the objects they create. To do this, .NET uses a contiguous region of memory called the managed heap, from which it takes memory every time an object is created. When there is not enough space to allocate an object, the garbage collector (GC) performs a collection to free some space. During a collection, the GC looks for objects that are no longer referenced by the application, for example because the variables holding them went out of scope or were set to null. It does so by building a graph from the roots that the JIT compiler and the runtime maintain. Any object missing from that graph is considered unreachable and can be removed from memory. The GC then compacts the heap and updates the pointers to the objects that were moved.

As you can imagine, compacting memory and updating pointers must be done in a thread-safe environment; otherwise a thread could try to access an invalid memory location. This means the GC must suspend the running managed threads before this work takes place, which has a big impact on the application. To improve GC performance, the managed heap is divided into three "generations" (0, 1 and 2); the lower the generation, the newer the objects in it. This division follows two basic principles:

  1. Compacting part of the heap is faster than compacting the whole heap.
  2. A new object is expected to have a shorter lifetime than an old one.

When an application requests space for a new object but generation 0 is full, a collection is performed to free space in generation 0, and every object that survives is promoted to generation 1. In the same way, objects that survive a collection of generation 1 are promoted to generation 2, where they stay until they are removed from memory.
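The promotion between generations can be observed with a small experiment. The following is only a sketch: the class and method names are made up for illustration, and it forces collections with GC.Collect purely to make the effect visible, which is not something to do in production code. The exact generation numbers printed depend on the runtime and its configuration.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one long-lived object right after allocation
    // and after each of two forced collections.
    public static int[] ObserveGenerations()
    {
        var survivor = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(survivor); // freshly allocated object
        GC.Collect();                         // demo only: force a collection
        seen[1] = GC.GetGeneration(survivor); // it survived one collection
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor); // it survived two collections

        GC.KeepAlive(survivor);               // keep the object reachable until here
        return seen;
    }

    public static void Main()
    {
        int[] seen = ObserveGenerations();
        Console.WriteLine($"after allocation:  generation {seen[0]}");
        Console.WriteLine($"after 1st collect: generation {seen[1]}");
        Console.WriteLine($"after 2nd collect: generation {seen[2]}");
        Console.WriteLine($"highest generation supported: {GC.MaxGeneration}");
    }
}
```

On a typical desktop runtime the object starts in generation 0 and ends up in generation 2, but treat that as an observation rather than a guarantee.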

Remember that the GC only takes care of managed objects. If you use unmanaged resources, you are responsible for cleaning them up yourself. Besides COM objects, we use unmanaged resources more often than you might think: a lot of .NET classes encapsulate unmanaged resources such as window handles, file and network streams, database connections, etc.

Let's take a look at how we can do the cleanup.

Dispose

You have probably seen the endless list of .NET classes that have a method called Dispose, such as Timer, Control (Form, Button, Label ...), Socket, Stream ...

The good news is that classes implementing the IDisposable interface give us a standard way to release their resources as soon as we are done with them; the "bad" news is that the GC does not call this method. This is a widespread misconception: almost every developer knows that the method "Close", which is generally more or less equivalent to Dispose, needs to be called, but thinks the GC will take care of calling Dispose, when in fact Dispose is meant to be called by the users of the class that implements it. Once we know this, we can go a step further and take a look at how to implement Dispose properly in our own classes.
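To make the point concrete, here is a minimal sketch with a hypothetical Tracker class that only counts calls to Dispose and has no finalizer. An instance that is simply abandoned gets collected without Dispose ever running; only an explicit call (here through a using block) increments the counter. GC.Collect is called only to make the demonstration deterministic.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical class for illustration: counts how often Dispose is called.
public sealed class Tracker : IDisposable
{
    public static int DisposeCalls;
    public void Dispose() => DisposeCalls++;
}

public static class DisposeIsNotAutomatic
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndForget()
    {
        var forgotten = new Tracker(); // never disposed, unreachable after return
    }

    // Dispose calls observed after abandoning an instance and collecting.
    public static int AbandonAndCollect()
    {
        Tracker.DisposeCalls = 0;
        CreateAndForget();
        GC.Collect();                  // demo only
        GC.WaitForPendingFinalizers();
        return Tracker.DisposeCalls;
    }

    // Dispose calls observed after using an instance in a using block.
    public static int UseProperly()
    {
        Tracker.DisposeCalls = 0;
        using (var tracker = new Tracker())
        {
            // work with the object
        }
        return Tracker.DisposeCalls;
    }

    public static void Main()
    {
        Console.WriteLine(AbandonAndCollect()); // prints 0
        Console.WriteLine(UseProperly());       // prints 1
    }
}
```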

You should always implement a Dispose method for types that hold external resources that need to be released. Below you can see one approach to implementing the IDisposable interface.

    using System;

    public class MyClass : IDisposable
    {
        // Tracks whether Dispose has already been called
        private bool disposed;

        public void Dispose()
        {
            // Call the overloaded Dispose method with true to indicate
            // that it was called from user code
            Dispose(true);
        }

        protected virtual void Dispose(bool disposing)
        {
            if (!this.disposed)
            {
                if (disposing)
                {
                    // Release managed resources
                }
                // Release unmanaged resources

                // Prevent the cleanup from running again
                disposed = true;
            }
        }
    }

After reviewing the code you are probably wondering why we use the parameter "disposing". Before answering that, let's talk about finalizers.

Finalizers

Finalizers are intended to release unmanaged resources, and the runtime calls them automatically at some point after the object becomes unreachable and before its memory is reclaimed. This sounds very good, but it has some implications we need to know before implementing a finalizer instead of the Dispose pattern. The main problem is that an object with a finalizer needs at least two collections before it is fully released from memory. In the first one, the GC finds that the object is unreachable and queues it for finalization; then a dedicated finalizer thread, separate from the GC, calls the finalizer and leaves the object ready for collection; finally, a later collection removes the object. Next you can see the syntax to implement a finalizer for the class above.

    ~MyClass()
    {
        Dispose(false);
    }

As you can see, inside the finalizer we call the Dispose method we created before, but with the parameter disposing set to false. This skips the cleanup of managed objects, since we cannot guarantee that they are still available at that point. The finalizer should not be considered a solution in itself but a kind of insurance: developers using the class must always call Dispose to avoid the extra cost of finalization.

There is only one more thing to add to the code. If the developer does their homework and calls Dispose, running the finalizer afterwards is unnecessary. By adding the line "GC.SuppressFinalize(this);" to the parameterless Dispose method, we tell the GC that the finalizer no longer needs to be called.

Let's see the full code.

    using System;

    public class MyClass : IDisposable
    {
        // Tracks whether Dispose has already been called
        private bool disposed;

        public void Dispose()
        {
            // Call the overloaded Dispose method with true to indicate
            // that it was called from user code
            Dispose(true);

            // The cleanup is done, so the finalizer is no longer needed
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool disposing)
        {
            if (!this.disposed)
            {
                if (disposing)
                {
                    // Release managed resources
                }
                // Release unmanaged resources

                // Prevent the cleanup from running again
                disposed = true;
            }
        }

        ~MyClass()
        {
            Dispose(false);
        }
    }
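The interplay between Dispose, the finalizer and GC.SuppressFinalize can be checked with a small experiment. This is a sketch under assumptions: the Resource class and its counters are hypothetical, GC.Collect is used only to provoke finalization for the demo, and the moment a finalizer runs is never guaranteed, so the second number printed may vary between runtimes and build configurations.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical class following the pattern above, with counters added
// to record which cleanup path ran.
public class Resource : IDisposable
{
    public static int DisposeRuns;
    public static int FinalizerRuns;
    private bool disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        if (disposing) Interlocked.Increment(ref DisposeRuns);   // called from user code
        else Interlocked.Increment(ref FinalizerRuns);           // called from the finalizer
        disposed = true;
    }

    ~Resource()
    {
        Dispose(false);
    }
}

public static class FinalizerDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void UseAndDispose()
    {
        using (var resource = new Resource()) { }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void UseAndForget()
    {
        var resource = new Resource(); // Dispose is never called
    }

    public static void CollectAndFinalize()
    {
        GC.Collect();                  // demo only
        GC.WaitForPendingFinalizers();
    }

    public static void Main()
    {
        UseAndDispose();
        CollectAndFinalize();
        Console.WriteLine($"disposed object:  finalizer runs = {Volatile.Read(ref Resource.FinalizerRuns)}");

        UseAndForget();
        CollectAndFinalize();
        Console.WriteLine($"forgotten object: finalizer runs = {Volatile.Read(ref Resource.FinalizerRuns)}");
    }
}
```

For the disposed object the finalizer count stays at zero, because SuppressFinalize removed it from the finalization queue. For the forgotten object the finalizer usually runs once the object has been collected, which is exactly the safety net described above.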

To use the above class correctly we need to call Dispose, since that is what cleans up the resources it uses. To be sure that Dispose is called even when an exception is thrown, we have two options: the "using" statement, or wrapping the code in a try / finally block. I'm a big fan of "using" because it calls Dispose for us as soon as execution leaves the block, and the resulting code is very clean and clear. Here is what it looks like.

    // wrong code
    MyClass myClass = new MyClass();
    //...
    // work with the object created
    //...
    myClass.Dispose();


    // right code
    using (MyClass myClass = new MyClass())
    {
        //...
        // work with the object created
        //...
    }

In the sample above I started with an excerpt marked as wrong code. Maybe I'm a bit radical calling it wrong, but if an exception is thrown while we work with myClass, Dispose will not be called and the cleanup will not be performed as desired. So please always use the using statement or a try / finally block.
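The difference between the three approaches shows up as soon as the work in the middle throws. The following sketch uses a hypothetical Probe class that only remembers whether Dispose was called, and a Work method that always fails to simulate the exception.

```csharp
using System;

// Hypothetical class for illustration: remembers whether Dispose was called.
public sealed class Probe : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() => Disposed = true;
}

public static class CleanupDemo
{
    // Simulates work that fails halfway through.
    private static void Work() => throw new InvalidOperationException("simulated failure");

    // Manual call: the exception skips the Dispose line.
    public static bool ManualDispose()
    {
        var probe = new Probe();
        try
        {
            Work();
            probe.Dispose();
        }
        catch (InvalidOperationException) { }
        return probe.Disposed;
    }

    // try / finally: Dispose runs even though Work throws.
    public static bool TryFinally()
    {
        var probe = new Probe();
        try
        {
            try { Work(); }
            finally { probe.Dispose(); }
        }
        catch (InvalidOperationException) { }
        return probe.Disposed;
    }

    // using: the compiler generates the try / finally for us.
    public static bool UsingStatement()
    {
        Probe captured = null;
        try
        {
            using (var probe = new Probe())
            {
                captured = probe;
                Work();
            }
        }
        catch (InvalidOperationException) { }
        return captured.Disposed;
    }

    public static void Main()
    {
        Console.WriteLine(ManualDispose());  // prints False
        Console.WriteLine(TryFinally());     // prints True
        Console.WriteLine(UsingStatement()); // prints True
    }
}
```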

Conclusions

After reading this post you can see that automatic memory management does a lot of work for us, but it is in our hands to help it do that work under the best conditions. There are a lot of common-sense things we can apply in our applications to improve memory management, such as:

  • Understand when allocation takes place and allocate as little as possible.
  • Reduce the complexity of the object graphs.
  • Don't call the GC directly, since we will hardly ever pick a better moment for a collection than the GC itself.
  • Always call Close or Dispose on classes that support it.
  • Release unused COM objects (take a look at the method Marshal.ReleaseComObject).
  • Implement a finalizer only when necessary.
  • ...

I hope that from now on you can write better .NET applications and that you no longer need to say that you should force a garbage collection.
