Davyd McColl

Posted on Oct 23, 2019

A tale of two classes

#dotnet #inheritence #csharp

"It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of light, it was the season of darkness, it was the spring of hope, it was the winter of despair"

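Have you ever seen a warning from the C# compiler that looks something like this:

warning CS0108: 'Dog.MakeNoise()' hides inherited member 'Animal.MakeNoise()'. Use the new keyword if hiding was intended.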

Perhaps you have, and you've ignored it along with the hundreds of other warnings in your project. I hope that, by the end of this article, you'll decide to do some housekeeping: those warnings are there because people a lot cleverer than me know that there are issues that can arise from ignoring them!

To dive into what this warning means (and why we should care), we have to take a step back and examine three keywords: virtual, override and new. But even before that, we have to look (generally) at how object references work (and this is true for pretty much all OO environments).

Object member resolution.

1. The simple case

Let's say you have a class Animal, and you create an object from it, animal. Let's imagine that that class has a single member, which we'll keep as a method, to keep this simple, for now:

using System;

public class Animal
{
  public string MakeNoise()
  {
    return "generic animal sound";
  }
}

public static class Program
{
  public static void Main(string[] args)
  {
    var animal = new Animal();
    var noise = animal.MakeNoise();
    Console.WriteLine(noise); // prints out "generic animal sound"
  }
}

So far, so good. No surprises here. What's happening under the hood is that, at compile-time, the class is compiled as a "template", with the MakeNoise method compiled into the result, and the location of that method is recorded alongside the "template" for Animal.
At run-time, the program asks for a new animal, so the template is used to allocate memory for the object, which holds the object's own data plus a reference back to its type. Because MakeNoise is an ordinary (non-virtual) method, the compiler already knows exactly which method animal.MakeNoise() refers to from the declared type of the variable, so the address of that method is on-hand without any lookup. That method was actually compiled with one hidden parameter: the object that is going to be this during the call. We can get an idea of how the runtime invokes it by doing the same with reflection:

var animal = new Animal();
var method = typeof(Animal).GetMethod("MakeNoise");
method.Invoke(animal, new object[0]);

Note that, even though MakeNoise takes no parameters, the reflection invocation still takes an argument array as its second argument. Here it is simply empty (null is also accepted for a parameterless method).

Side-notes:

when invoking a static method this way, there is no instance, so null is passed in place of this
this is analogous to the JavaScript .apply() method on function objects
most OO languages hide this from you. Python, on the other hand, doesn't -- instance methods must declare a first parameter which receives the object, most often called self
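
The idea that this is really a hidden first argument can also be sketched with an "open instance" delegate, where the object is passed explicitly as the first argument. This is only an illustration: the ThisAsParameterDemo class and BindOpenInstance helper are made-up names, and Animal is repeated so the sketch stands alone.

```csharp
using System;
using System.Reflection;

public class Animal
{
    public string MakeNoise()
    {
        return "generic animal sound";
    }
}

// Hypothetical demo class, not part of the original example.
public static class ThisAsParameterDemo
{
    // Builds a delegate whose only argument is the object that
    // becomes `this` inside MakeNoise.
    public static Func<Animal, string> BindOpenInstance()
    {
        MethodInfo method = typeof(Animal).GetMethod("MakeNoise");
        return (Func<Animal, string>)Delegate.CreateDelegate(
            typeof(Func<Animal, string>), method);
    }

    public static void Main()
    {
        Func<Animal, string> makeNoise = BindOpenInstance();
        // The instance is handed over like any other argument.
        Console.WriteLine(makeNoise(new Animal())); // generic animal sound
    }
}
```

The call makeNoise(new Animal()) reads much like a Python method being called with an explicit self.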

2. Hiding methods

In the example above, we can see we've set up a base Animal class. We'd perhaps like to make Dogs that "woof" and Cats that "meow", e.g.:

public class Dog: Animal
{
  public string MakeNoise()
  {
    return "woof";
  }
}

public class Cat: Animal
{
  public string MakeNoise()
  {
    return "meow";
  }
}

public static class Program
{
  public static void Main(string[] args)
  {
    var animal = new Animal();
    var dog = new Dog();
    var cat = new Cat();
    // will print out:
    // generic animal sound
    // woof
    // meow
    Console.WriteLine(animal.MakeNoise());
    Console.WriteLine(dog.MakeNoise());
    Console.WriteLine(cat.MakeNoise());
  }
}

All well and good. But we probably want to refactor: each animal simply has its unique sound printed out to the console. What if we did this:

public static class Program
{
  public static void Main(string[] args)
  {
    var animal = new Animal();
    var dog = new Dog();
    var cat = new Cat();
    PrintNoises(new[]
    {
      animal, dog, cat
    });
  }

  private static void PrintNoises(Animal[] animals)
  {
    foreach (var animal in animals)
    {
      Console.WriteLine(animal.MakeNoise());
    }
  }
}

Well, we'd find that instead of getting different sounds, we get the same message ("generic animal sound") three times!

Let's look at Dog to get an idea of what's going on here:

The compiled Dog type actually has two MakeNoise methods which we can find by reflection:

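public void Show()
{
  foreach (var method in typeof(Dog).GetMethods())
  {
    // skip the methods inherited from System.Object (ToString, Equals, ...)
    if (method.Name != "MakeNoise")
    {
      continue;
    }
    Console.WriteLine($"{method.DeclaringType.Name}.{method.Name}");
  }
}

This prints out two lines (reflection does not guarantee their order):

Dog.MakeNoise
Animal.MakeNoise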

So the method that's invoked on the dog object depends entirely on what type it's posing as at the point of calling:

(dog as Animal).MakeNoise(); // generic animal sound
(dog as Dog).MakeNoise(); // woof
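
Adding the new keyword, as the warning suggests, only makes the hiding explicit. A minimal sketch of that, assuming the same Animal and Dog shapes as above (the HidingDemo class and its two helper methods are hypothetical names added for illustration):

```csharp
using System;

public class Animal
{
    public string MakeNoise() => "generic animal sound";
}

public class Dog : Animal
{
    // `new` states the hiding explicitly; the behaviour is unchanged,
    // only the compiler warning goes away.
    public new string MakeNoise() => "woof";
}

// Hypothetical demo class, not part of the original example.
public static class HidingDemo
{
    // The parameter type decides which MakeNoise is bound at compile-time.
    public static string NoiseAsAnimal(Animal animal) => animal.MakeNoise();
    public static string NoiseAsDog(Dog dog) => dog.MakeNoise();

    public static void Main()
    {
        var dog = new Dog();
        Console.WriteLine(NoiseAsAnimal(dog)); // generic animal sound
        Console.WriteLine(NoiseAsDog(dog));    // woof
    }
}
```

The same object answers differently depending on the declared type of the reference it is called through.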

This is rather inconvenient, but there's an easy way to resolve this:

3. `virtual` and `override`

If we change our Animal class a little:

public class Animal
{
  public virtual string MakeNoise()
  {
    return "generic animal sound";
  }
}

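First we should see a different compiler warning (CS0114), which again says that Dog.MakeNoise() hides the inherited member, but now suggests adding the override keyword to override that implementation, or otherwise adding the new keyword.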

(and if we do nothing about it, the result is the same as if we added the 'new' keyword)
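
A minimal sketch of that "do nothing" case, assuming the virtual Animal above and a Dog that has not been updated yet. The MissingOverrideDemo class and NoiseVia helper are hypothetical names, and the pragma is there only so the sketch compiles without the warning:

```csharp
using System;

public class Animal
{
    public virtual string MakeNoise() => "generic animal sound";
}

// The warning is silenced here only so the sketch compiles cleanly.
#pragma warning disable CS0114
public class Dog : Animal
{
    // Neither `override` nor `new`: the compiler treats this as `new`.
    public string MakeNoise() => "woof";
}
#pragma warning restore CS0114

// Hypothetical demo class, not part of the original example.
public static class MissingOverrideDemo
{
    public static string NoiseVia(Animal animal) => animal.MakeNoise();

    public static void Main()
    {
        // Dog never overrode the virtual method, so the base version runs.
        Console.WriteLine(NoiseVia(new Dog())); // generic animal sound
    }
}
```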

Now we update our derivatives:

public class Dog: Animal
{
  public override string MakeNoise()
  {
    return "woof";
  }
}

public class Cat: Animal
{
  public override string MakeNoise()
  {
    return "meow";
  }
}

When we re-run the refactored program, we should see the desired result:

public static void Main(string[] args)
{
    var animal = new Animal();
    var dog = new Dog();
    foreach (var method in typeof(Dog).GetMethods())
    {
        // skip the methods inherited from System.Object
        if (method.Name != "MakeNoise") continue;
        // note that this now only prints out _one_ method:
        //  Dog.MakeNoise
        Console.WriteLine($"{method.DeclaringType.Name}.{method.Name}");
    }

    var cat = new Cat();
    // will print out:
    // generic animal sound
    // woof
    // meow
    PrintNoise(new[]
    {
        animal, dog, cat
    });
}

private static void PrintNoise(Animal[] animals)
{
    foreach (var animal in animals)
    {
        Console.WriteLine(animal.MakeNoise());
    }
}

What's happening here is that the class "templates" for Dog, Cat and Animal no longer have the memory address of one fixed MakeNoise implementation baked into each call. Instead, there's a bit of logic which boils down to: "at run-time, look at the actual type of the object (for example, the result of new Dog()) and call that type's override of MakeNoise". Now when that object is up-cast to the type Animal, the Dog.MakeNoise method is still invoked. This is likely to be the desired behavior in the vast majority of cases where you're deriving from classes and implementing methods with the same name.
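
One way to see that an override reuses the base method's slot, rather than adding a second method, is to ask reflection where the method was first declared. This is a sketch under the same Animal and Dog assumptions; VirtualDispatchDemo and SlotOwner are hypothetical names.

```csharp
using System;
using System.Reflection;

public class Animal
{
    public virtual string MakeNoise() => "generic animal sound";
}

public class Dog : Animal
{
    public override string MakeNoise() => "woof";
}

// Hypothetical demo class, not part of the original example.
public static class VirtualDispatchDemo
{
    // Returns the name of the type that first declared the MakeNoise
    // method that Dog overrides.
    public static string SlotOwner()
    {
        MethodInfo dogMethod = typeof(Dog).GetMethod("MakeNoise");
        return dogMethod.GetBaseDefinition().DeclaringType.Name;
    }

    public static void Main()
    {
        Animal animal = new Dog();             // up-cast to the base type
        Console.WriteLine(animal.MakeNoise()); // woof
        Console.WriteLine(SlotOwner());        // Animal
    }
}
```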

Remember also that properties are implemented with backing fields and methods, even when they are auto-props, e.g.:

public class AutoFoo
{
  public int Id { get; set; }
}

// is the same as:
public class ManualFoo
{
  public int Id
  {
    get => _id;            // getter method
    set => _id = value;    // setter method
  }
  private int _id;
}

So the same discussion about virtual/override and new applies to properties.
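
As a small sketch of that, here is one overridden property and one hidden property side by side. The Name and Kind properties and the PropertyDemo class are invented for this illustration and are not part of the earlier examples.

```csharp
using System;

public class Animal
{
    public virtual string Name => "animal"; // can be overridden
    public string Kind => "generic";        // can only be hidden
}

public class Dog : Animal
{
    public override string Name => "dog";
    public new string Kind => "canine";
}

// Hypothetical demo class, not part of the original example.
public static class PropertyDemo
{
    public static string NameVia(Animal animal) => animal.Name;
    public static string KindVia(Animal animal) => animal.Kind;

    public static void Main()
    {
        var dog = new Dog();
        Console.WriteLine(NameVia(dog)); // dog     (override follows the object)
        Console.WriteLine(KindVia(dog)); // generic (hiding follows the reference type)
        Console.WriteLine(dog.Kind);     // canine
    }
}
```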

The per-type table of method addresses that makes this run-time lookup possible is commonly called a virtual method table, and you may hear the term "vtable" used for it.

Conclusion:

We should pay attention to compiler warnings -- they can save us from unexpected runtime behaviors!
We should prefer to make members virtual when we intend to override behavior in derived classes
If we really can't make members virtual and override, then we need to keep in mind that the new keyword simply hides the ancestor member, and we have to be careful about the cast type of the object when that member is invoked

You may wonder why you'd ever use new on purpose! Sometimes you don't have a choice:

1. the new member has a different signature
   - property with a different type
   - method with different return type and same parameters
2. the class we're deriving from is in an assembly not under our control, so we can't make the base class member virtual

In the case of (1), this should be a "code smell" -- an indication that the code is doing something poorly, and should be refactored to be better. In the case of (2), we could also refactor to have a new facade class shielding the original, alien type and exposing the new property that we want. In both cases, choosing to use the new keyword or ignoring the compiler warning can lead to unexpected behaviors at runtime.
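
The facade idea can be sketched roughly as follows. Everything here is hypothetical: VendorAnimal stands in for a type from an assembly we can't change, and LoudAnimal is one possible wrapper around it.

```csharp
using System;

// Stand-in for a type from an assembly we cannot change (hypothetical).
public class VendorAnimal
{
    public string MakeNoise() => "generic animal sound";
}

// Facade: wraps the vendor type instead of deriving from it, so there is
// no inherited member to hide and no cast that can change the behaviour.
public class LoudAnimal
{
    private readonly VendorAnimal _inner;

    public LoudAnimal(VendorAnimal inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public string MakeNoise() => _inner.MakeNoise().ToUpperInvariant() + "!";
}

public static class FacadeDemo
{
    public static void Main()
    {
        var loud = new LoudAnimal(new VendorAnimal());
        Console.WriteLine(loud.MakeNoise()); // GENERIC ANIMAL SOUND!
    }
}
```

Because LoudAnimal is not an Animal, callers can never reach the original MakeNoise by accident through a base-type reference. The trade-off is that the facade can't be passed where the vendor type is expected.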
