I am old enough to remember COM

#com #legacy #dotnet #windows

This question by @ben brought me back to the late 1990s, when I first started coding professionally (yes, I am that old).

What are you "old enough to remember" in software development?

Ben Halpern ・ May 23 '19 ・ 1 min read

#discuss

I first started coding on my Sinclair ZX Spectrum thanks to a lucky accident (I won a drawing competition, but that is another story), but my first real job was as a very young actuary, fresh out of university, at a company that developed systems for pension companies.

Back then the de facto way of building applications on Windows was C++ and COM using Microsoft Visual C++ 6.0.

I had no idea what I was doing when I first started, and neither C++ nor COM is for the faint-hearted, so I was in for a steep learning curve. COM can really be a challenge...

Disclaimer: Everything below is written from memory and my memory is not exactly state of the art. Please bear with me.

COM allows you to communicate between components that can live in the same process or in another process. In some cases you can even use COM to communicate with components on other machines (this is Distributed COM, or DCOM).

COM is based on interfaces. If you want somebody else to call your code you have to define an interface for it (similar to interfaces in C#). In C++ you write your interfaces as abstract classes.
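As a rough illustration of that idea, here is a minimal sketch in portable C++. It is not real COM: `IHorse`, `Horse` and `Greet` are made-up names, and the `IUnknown` plumbing, GUIDs and `HRESULT` return codes that a real COM interface carries are left out.

```cpp
#include <cassert>
#include <string>

// A COM-style interface: an abstract class with nothing but pure virtual functions.
struct IHorse {
    virtual std::string Name() const = 0;
    virtual int Gallop(int meters) = 0;  // returns the total distance covered so far
protected:
    ~IHorse() = default;  // callers never delete through the interface
};

// The implementation stays hidden behind the interface.
class Horse final : public IHorse {
    int total_ = 0;
public:
    std::string Name() const override { return "Horse"; }
    int Gallop(int meters) override {
        assert(meters >= 0);
        total_ += meters;
        return total_;
    }
};

// Client code is written against the interface only.
std::string Greet(IHorse& horse) {
    return "Hello, " + horse.Name();
}
```

The caller of `Greet` never needs to see the `Horse` class, which is the property that lets the implementation live somewhere else entirely.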

But how, you may ask, do you create an object that may or may not live in another process?

That is a good question.

Obviously, you can't just new up an object by calling its constructor. Instead the good ol' Windows API has a special function for creating objects called CoCreateInstance.

This raises a new question: How does CoCreateInstance know where your component lives? This is one of the big WHY?s of COM: You need to register your COM components in the Windows Registry. That's right! You have to register your COM component in the Windows Registry with a unique ID (a GUID) and the location of the DLL or EXE hosting your component.
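The lookup can be pictured with a toy model in portable C++. Everything here is an assumption made for illustration: `ToyRegistry`, `CreateInstance`, `RegisterHorse` and the class ID are invented, and a plain map of factory functions stands in for the real registry, where the entries live under `HKEY_CLASSES_ROOT\CLSID` and point at a file on disk.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct IHorse {
    virtual ~IHorse() = default;  // simplification: real COM destroys via Release()
    virtual std::string Neigh() = 0;
};

struct Horse : IHorse {
    std::string Neigh() override { return "Neigh!"; }
};

using Factory = std::function<std::unique_ptr<IHorse>()>;

// Made-up class ID, for illustration only.
const std::string kHorseClsid = "{11111111-2222-3333-4444-555555555555}";

// Stand-in for the registry: class ID -> how to create the component.
// In real COM the entry holds the path to a DLL or EXE, not a function.
class ToyRegistry {
    std::map<std::string, Factory> classes_;
public:
    void Register(const std::string& clsid, Factory factory) {
        classes_[clsid] = std::move(factory);
    }
    // Toy counterpart of CoCreateInstance: look the ID up, then ask the factory.
    std::unique_ptr<IHorse> CreateInstance(const std::string& clsid) const {
        auto it = classes_.find(clsid);
        if (it == classes_.end()) return nullptr;  // real COM reports REGDB_E_CLASSNOTREG
        return it->second();
    }
};

// What an installer would do once per machine in this toy world.
void RegisterHorse(ToyRegistry& registry) {
    registry.Register(kHorseClsid, [] { return std::unique_ptr<IHorse>(new Horse()); });
}
```

A client that asks for an ID nobody registered gets nothing back, which is roughly what a botched deployment looks like from the caller's side.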

You can probably already imagine the kind of mess you can run into when you have to touch the registry every time you deploy a new version.

But how, you may ask, do you clean up objects that may or may not live in another process?

That is a great question.

Naturally, you can't just delete an object when you don't really know where it lives.

COM's solution to that is reference counting. Each object keeps an internal count of the number of references that exist to it. Of course, C++ has no built-in way of helping the developer keep track of references, so there are a bunch of rules on how to be a COM developer. The rules dictate when the developer has to manually increment or decrement the reference count. You do that through the IUnknown interface, which every COM component must implement.
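Here is a minimal sketch of those rules in portable C++, assuming a stripped-down stand-in for IUnknown. `IUnknownLike`, `Horse`, `g_liveHorses` and `CopyAndReleaseCorrectly` are invented for this example; the real interface also has `QueryInterface`, and real implementations typically update the count atomically.

```cpp
#include <cassert>

// Simplified stand-in for IUnknown: only the two reference-counting methods.
struct IUnknownLike {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;  // destruction happens only inside Release()
};

int g_liveHorses = 0;  // bookkeeping for the example: Horse objects currently alive

class Horse final : public IUnknownLike {
    unsigned long refs_ = 1;  // the creator receives the first reference
public:
    Horse() { ++g_liveHorses; }
    ~Horse() { --g_liveHorses; }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        assert(refs_ > 0);
        const unsigned long left = --refs_;
        if (left == 0) delete this;  // last reference gone: the object destroys itself
        return left;
    }
};

// Following the rules by hand: every copied pointer is paired with AddRef(),
// every pointer that goes away is paired with Release().
int CopyAndReleaseCorrectly() {
    Horse* p1 = new Horse();  // ref count 1
    Horse* p2 = p1;           // a second pointer to the same object...
    p2->AddRef();             // ...so the count must go to 2
    p1->Release();            // count 1, object still alive
    const int aliveAfterFirstRelease = g_liveHorses;
    p2->Release();            // count 0, object deletes itself
    return aliveAfterFirstRelease;
}
```

Skipping either the `AddRef` or one of the `Release` calls gives you a dangling pointer or a leak, respectively.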

If you forget to follow the rules, the memory leaks will start piling up. Here is some pseudo code on how to create a memory leak in no time:

IHorse *p1;
CoCreateInstance(..., &p1);
IHorse *p2;
CoCreateInstance(..., &p2);
p2 = p1;
// Oops! You forgot to Release() p2 before reassigning it
// and to AddRef() p1 after copying it.

Hunting down this kind of memory leak is a real pain. You practically have no idea where to begin. At the company I worked for, we created a special in-memory log for reference counting, which was a great help, and we also used smart pointers to handle releases automatically. But the pain though... it really makes you appreciate the garbage collection in modern runtimes.
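The smart pointers we used back then are not reproduced here. What follows is a hypothetical `RefPtr`, loosely in the spirit of ATL's `CComPtr`, written in portable C++ against a made-up `Horse` class so that it runs without any Windows headers. It shows how the reassignment from the pseudo code stops leaking once the pointer type does the bookkeeping.

```cpp
#include <cassert>
#include <utility>

int g_liveHorses = 0;  // bookkeeping for the example: Horse objects currently alive

// Minimal ref-counted object standing in for a COM component.
class Horse {
    unsigned long refs_ = 1;      // the creator receives the first reference
    ~Horse() { --g_liveHorses; }  // private: only Release() may destroy the object
public:
    Horse() { ++g_liveHorses; }
    unsigned long AddRef() { return ++refs_; }
    unsigned long Release() {
        assert(refs_ > 0);
        const unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
};

// Hypothetical smart pointer: AddRef on copy, Release on destruction.
template <typename T>
class RefPtr {
    T* p_ = nullptr;
public:
    RefPtr() = default;
    // Takes over a reference the caller already owns (no extra AddRef).
    static RefPtr Attach(T* raw) { RefPtr r; r.p_ = raw; return r; }
    RefPtr(const RefPtr& other) : p_(other.p_) { if (p_) p_->AddRef(); }
    // Copy-and-swap: the by-value parameter AddRefs the new object,
    // and its destructor releases the old one.
    RefPtr& operator=(RefPtr other) { std::swap(p_, other.p_); return *this; }
    ~RefPtr() { if (p_) p_->Release(); }
    T* get() const { return p_; }
};

// The same reassignment as in the pseudo code, minus the leak.
int LeakFreeReassignment() {
    RefPtr<Horse> p1 = RefPtr<Horse>::Attach(new Horse());
    RefPtr<Horse> p2 = RefPtr<Horse>::Attach(new Horse());
    p2 = p1;              // releases the second horse, AddRefs the first
    return g_liveHorses;  // only the first horse is still alive here
}
```

When `p1` and `p2` go out of scope, the remaining horse is released twice and destroys itself, so nothing is left behind.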

If you program in Swift or Objective-C, you may know reference counting as ARC (Automatic Reference Counting). ARC can still get you in trouble, but the compiler and runtime on Apple platforms know about reference counting, so it is a lot easier on the developer than COM.

But how, you may ask, do you handle concurrency, when calling objects that may or may not live in another process?

That is a really great question.

Naturally, with objects living in different processes you are going to have to worry about concurrency, and COM tries to help you with that by putting the objects in "apartments".

Objects that are not thread safe should live in a Single-Threaded Apartment (STA), and objects that are thread safe may live in the Multi-Threaded Apartment (MTA). You have to decide where your object should live when you register it in the registry.

When you call into another apartment, the call has to be serialized by a so-called marshaller. Unless you are insane, you want those marshallers to be generated automatically, which means that you can't write your interfaces directly in C++. You define them in the Interface Definition Language (IDL), and the MIDL compiler generates the C++ headers and the marshalling code from that.

As you may suspect by now, COM apartments may lead you to serious self-inflicted injuries. They should come with a warning. For example, consider the ADODB.Connection type used for connecting to databases.

ADODB.Connection is an STA component, meaning that only one thread can access the object at a time. Not all developers were aware of this back in the 2000s, so people would put the connection object in the session cache on classic ASP websites to save time and speed up responses. Of course, funneling all requests to a website through the same thread is usually not a good idea for your response times.
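To put rough numbers on that, here is a toy model in portable C++. It is a sketch under loud assumptions: `ToyApartment`, `ToyConnection` and `SimulateSharedConnection` are invented, time is simulated with a counter, and no real threads or ADO calls are involved. The only thing it models is that calls into an apartment-bound object are queued and run one at a time.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy model of a single-threaded apartment: callers never touch the object
// directly; they post calls to a queue that the owning thread drains in order.
class ToyApartment {
    std::queue<std::function<void()>> calls_;
public:
    void Post(std::function<void()> call) { calls_.push(std::move(call)); }
    // Stand-in for the owning thread's message loop.
    void Pump() {
        while (!calls_.empty()) {
            std::function<void()> call = std::move(calls_.front());
            calls_.pop();
            call();
        }
    }
};

// Hypothetical stand-in for the shared connection, with a simulated clock.
struct ToyConnection {
    int clock_ms = 0;
    int Query(int cost_ms) {
        assert(cost_ms >= 0);
        clock_ms += cost_ms;  // each query occupies the connection for cost_ms
        return clock_ms;      // simulated time at which this query finished
    }
};

// Returns the simulated completion time of each request when all requests
// share one apartment-bound connection.
std::vector<int> SimulateSharedConnection(int requests, int cost_ms) {
    ToyApartment sta;
    ToyConnection conn;
    std::vector<int> finished;
    for (int i = 0; i < requests; ++i) {
        sta.Post([&conn, &finished, cost_ms] {
            finished.push_back(conn.Query(cost_ms));
        });
    }
    sta.Pump();
    return finished;
}
```

With three simultaneous requests that each need 100 ms, the simulated completion times are 100, 200 and 300 ms: the last visitor waits for everyone ahead in the queue, where three independent connections would each have finished after 100 ms.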

Back then many developers, including Microsoft, said that VB6 was the answer to all your problems with COM, as VB6 would handle the reference counting for you. That is true to some extent, but if you did not know the internal workings of reference counting, you could very easily create memory leaks in VB6 and have no idea why!

Closing

One of the gurus of COM back then, Don Box, famously coined the term "COM is love" because of how COM was built on contracts. But I can say for sure that I did not love COM all that much and I do not miss it at all. It only makes me appreciate .NET that much more.

COM is still very much in use today: The Office applications rely heavily on COM. If you want to write a component with functions callable from within Excel cells or from VBA, you need a COM component. Luckily, you can write it in C# and fairly easily register it as a COM component. .NET will create a shim for you that handles the reference counting and all the other good stuff.