Phantz

Posted on May 16, 2021 • Edited on Jun 3, 2021

Implementing Interfaces in C

#c #tutorial #functional #polymorphism

Note: An extended and updated version of this demonstration can be found at typeclass-interface-pattern.

This article describes an extensible and practical design pattern you can use to achieve functional polymorphism in pure, standard C99 (or above). You can view an implementation and usage of this pattern in c-iterators, where I implement lazy, type-safe, Rust-like iterators.

Brief

If you've written a fair amount of C, chances are you've been struck by the desire to have some sort of polymorphism like the higher-level languages do.

Maybe you want to have generic containers (e.g. a linked list of generic elements), or maybe you want to have a function that takes in a polymorphic type, or anything in between. Chances are you're somewhat frustrated at the lack of native polymorphism support.

Thankfully, polymorphism in C is nothing new. There are many articles, repositories, and projects explaining and using a Virtual Method Table (aka vtable) based OOP polymorphism pattern. Linux, the CPython implementation, and many other large-scale C projects implement OOP through vtables, a pattern that is pretty common in the C community.

At the same time, if you've used this pattern, you may also know of its deficiencies:

It's not type safe, since type safety isn't a primary concern of this pattern.
It inherently depends on unsafe casts.
The semantics are, arguably, very ugly and hacky.
It presents a fair number of traps.
Some implementations of this pattern break standard conformance and depend on undefined or implementation-defined behavior. Usually, the rule they're breaking is strict aliasing, though this issue really isn't hard to avoid. CPython actually had this issue at one point and fixed it later. But that's not necessarily the only rule being broken. Many implementations simply don't use standard C, and instead rely on an extended dialect of C, such as GNU C.

Oh, there's actually another drawback, but this one's totally subjective and doesn't contribute anything to the actual conversation: what if you just don't like OOP and prefer functional polymorphism over OOP inheritance? Well, I'd fall under that category :)

Enter Type Classes

If you're familiar with Haskell, you're already familiar with type classes, a way to do ad-hoc polymorphism. Haskell is the main language that inspired me to come up with this design pattern. The end product isn't exactly like type classes, since type classes are inherently based on static dispatch (a feature impossible to implement in C in an extensible fashion), but you'll probably still find the similarities along the way.

If you're familiar with Rust, you're also already familiar with type classes! Just with a different name and a bit less power: traits. Once again, the end result of this pattern will not be exactly like traits. Rather, it's more similar to trait objects (dyn Trait), Rust's way of doing dynamic dispatch.

If you're not familiar with a functional language, but rather with OOP languages, no worries! OOP also has a concept very similar to typeclasses and traits: interfaces.

If you aren't familiar with any of the concepts mentioned above, that's totally fine! The concepts aren't actually that complicated from a base level view. Typeclasses, Traits, Interfaces are just ways of modeling polymorphism around actions, rather than objects. When I ask for a type that implements a typeclass (or a trait, or an interface) - I'm asking for a type that can do certain things that fall under the typeclass. That's all there is to it. Polymorphism based around abilities, not objects and object hierarchy.

Goals of the Typeclass Pattern

Type safety through monomorphization and simple abstractions
Extensible by using dynamic dispatch, allowing them to be used in library APIs
Usable as completely normal types, allowing them to be used alongside existing container libraries such as CTL (C Template Library)
Polymorphism constrained around actions/abilities/functions, not objects

A Small Taste

Before we go on to discussing and implementing the pattern itself, here's a tiny taste of what you can do with the pattern-

void print(Showable showable)
{
    char* s = showable.tc->show(showable.self);
    puts(s);
    free(s);
}

typedef enum
{
    holy,
    hand,
    grenade
} Antioch;

Antioch ant = holy;
Showable antsh = prep_antioch_show(&ant);

This is the Show typeclass. It describes the ability to turn a type into its string representation.

I've implemented Show for the Antioch type given above by defining prep_antioch_show. Now any Antioch value can be turned into a Showable, which can then be used with generic functions working on Showable types.

The Typeclass Pattern

We can finally talk about the design pattern itself. There are 3 core parts to this. I'll demonstrate these 3 parts by implementing the Show typeclass mentioned above.

The `typeclass` struct definition

This is the struct containing the function pointers related to the typeclass. For Show, we'll just be using the show function here. It takes in a value of the type for which Show is implemented (i.e. self) and returns a printable string.

typedef struct
{
    char* (*const show)(void* self);
} Show;

A simple struct containing the virtual function(s). When the wrapper function is first called (to convert a certain type to its typeclass instance), a typeclass struct of static storage duration is created with the function pointers for that specific type (a vtable of sorts). The pointer to this struct is then used in all typeclass instances. More on this will be discussed in the impl_ macro part.

The `typeclass_instance` struct definition

This is the concrete instance to be used as a type constraint. It should contain a pointer to the typeclass, and the self member containing the value to pass to the functions in the typeclass struct.

typedef struct
{
    void* self;
    Show const* tc;
} Showable;
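To make the relationship between these two structs concrete before any macro gets involved, here's a minimal sketch that wires up a Showable by hand. The names int_show, int_show_tc, showable_from_int and usage_example are hypothetical and only exist for this illustration, and the 32 byte buffer is an assumption that comfortably fits a decimal int.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char* (*const show)(void* self);
} Show;

typedef struct
{
    void* self;
    Show const* tc;
} Showable;

/* Hypothetical `show` implementation for `int*`; returns a malloc'ed string, or NULL if allocation fails */
static char* int_show(void* self)
{
    int const* x = self;
    char* s = malloc(32);
    if (s != NULL)
    {
        snprintf(s, 32, "%d", *x);
    }
    return s;
}

/* One table of function pointers for `int*`, shared by every instance */
static Show const int_show_tc = {.show = int_show};

/* Pair a value with the table that knows how to show it */
static Showable showable_from_int(int* x)
{
    return (Showable){.self = x, .tc = &int_show_tc};
}

/* Usage: two values, one shared table, dispatch through `tc` */
static void usage_example(void)
{
    int a = 42;
    int b = -7;
    Showable sa = showable_from_int(&a);
    Showable sb = showable_from_int(&b);
    assert(sa.tc == sb.tc);

    char* s = sa.tc->show(sa.self);
    assert(s != NULL && strcmp(s, "42") == 0);
    free(s);
}
```

Writing this by hand works, but nothing stops you from pairing a value with the wrong table. Closing that gap is the job of the third part of the pattern.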

The `impl_` macro used to implement the typeclass

This macro is the real heavy lifter when it comes to type safety.

It takes in some information about the type you're implementing a typeclass for, and the exact function implementations that will be used for that type, and defines a function which does the following-

Takes in an argument of the type the implementation is for
Type checks the function implementations given. This is done by storing the given function implementations into function pointers of an exact and expected type
Initializes the typeclass struct to store these function pointers, with static storage duration
Creates and returns the typeclass instance, which stores a pointer to the aforementioned typeclass struct, and the function argument into the self member.

Following these rules, this is what impl_show would look like-

#define impl_show(T, Name, show_f)                                                                                     \
    Showable Name(T x)                                                                                                 \
    {                                                                                                                  \
        char* (*const show_)(T e) = (show_f);                                                                          \
        (void)show_;                                                                                                   \
        static Show const tc = {.show = (char* (*const)(void*))(show_f) };                                             \
        return (Showable){.tc = &tc, .self = x};                                                                       \
    }

It takes the show implementation as its third argument. In the function definition, it stores that impl in a variable of type char* (*const show_)(T e), which is the exact type it should be. T is the specific type the implementation is for. It must be a pointer type, since it's stored into void* self.

The (void)show_; line is to suppress the unused variable warning emitted by compilers, since show_ isn't actually used. It's only there for typechecking purposes. These 2 typechecking lines will be completely eliminated by any decent compiler.

Then it simply defines a static typeclass and stores the function pointer inside. Then it creates and returns the Showable struct, containing the x argument, and a pointer to the typeclass struct.
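One caveat about the cast to char* (*const)(void*): ISO C leaves calling a function through a pointer to an incompatible function type undefined (C99 6.3.2.3p8), and char* (*)(T) is not compatible with char* (*)(void*) even when T is a pointer type. It tends to work on mainstream ABIs, where all object pointers share one representation, but if you'd rather not rely on that, a small wrapper function can do the conversion instead. The following is only a sketch of that variation, not the macro used in this article. impl_show_wrapped, the generated Name##_show_wrapper_ function, bool_show and usage_example are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char* (*const show)(void* self);
} Show;

typedef struct
{
    void* self;
    Show const* tc;
} Showable;

/* Variation on impl_show: the stored function really has type char* (*)(void*) */
#define impl_show_wrapped(T, Name, show_f)                                     \
    static char* Name##_show_wrapper_(void* self)                              \
    {                                                                          \
        /* Same type check as before: show_f must be exactly char* (*)(T) */   \
        char* (*const show_)(T e) = (show_f);                                  \
        /* void* converts implicitly to the object pointer type T */           \
        return show_(self);                                                    \
    }                                                                          \
    Showable Name(T x)                                                         \
    {                                                                          \
        static Show const tc = {.show = Name##_show_wrapper_};                 \
        return (Showable){.tc = &tc, .self = x};                               \
    }

/* Hypothetical implementation: show an int as a boolean word */
static char* bool_show(int* x)
{
    char const* text = *x ? "true" : "false";
    char* s = malloc(strlen(text) + 1);
    if (s != NULL)
    {
        strcpy(s, text);
    }
    return s;
}

impl_show_wrapped(int*, prep_bool_show, bool_show)

/* Usage: dispatch goes through the wrapper, then on to bool_show */
static void usage_example(void)
{
    int flag = 1;
    Showable s = prep_bool_show(&flag);
    char* text = s.tc->show(s.self);
    assert(text != NULL && strcmp(text, "true") == 0);
    free(text);
}
```

The trade-off is one extra (usually inlined) call per dispatch, in exchange for never calling through a converted function pointer.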

Implementing Typeclasses for your own types!

Once the typeclass and typeclass instance structs have been defined, all the user has to do is call the impl_ macro with their own type and the function implementation(s) required for the typeclass. The declaration of the function defined by said macro can then be included in a header.

Here's an example of implementing the previously defined Show typeclass for a very holy enum-

typedef enum
{
    holy,
    hand,
    grenade
} Antioch;

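static inline char* strdup_(char const* x)
{
    char* s = malloc((strlen(x) + 1) * sizeof(*s));
    /* malloc can fail; only copy into a valid buffer and let the caller see NULL otherwise */
    if (s != NULL)
    {
        strcpy(s, x);
    }
    return s;
}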

/* The `show` function implementation for `Antioch*` */
static char* antioch_show(Antioch* x)
{
    /*
    Note: The `show` function of a `Showable` is expected to return a malloc'ed value
    The users of a generic `Showable` are expected to `free` the returned pointer from the function `show`.
    */
    switch (*x)
    {
        case holy:
            return strdup_("holy");
        case hand:
            return strdup_("hand");
        case grenade:
            return strdup_("grenade");
        default:
            return strdup_("breakfast cereal");
    }
}

/*
Implement the `Show` typeclass for the type `Antioch*`

This will define a function to convert a value of type `Antioch*` into a `Showable`, the function will be named `prep_antioch_show`

The `show` implementation used will be the `antioch_show` function
*/
impl_show(Antioch*, prep_antioch_show, antioch_show)

The impl_show macro here simply expands to-

Showable prep_antioch_show(Antioch* x)
{
    char* (*const show_)(Antioch* e) = (antioch_show);
    (void)show_;
    static Show const tc = {.show = (char* (*const)(void*))(antioch_show) };
    return (Showable){.tc = &tc, .self = x};
}

Now, you can convert an Antioch into a Showable like so-

Antioch ant = holy;
Showable antsh = prep_antioch_show(&ant);

And this Showable will automatically dispatch to the antioch_show function whenever someone calls the show function inside it.

Great! Let's make a polymorphic function that works on Showables-

void print(Showable showable)
{
    char* s = showable.tc->show(showable.self);
    puts(s);
    free(s);
}

You can now easily print an Antioch with these abstractions-

Antioch ant = holy;
print(prep_antioch_show(&ant));

Where this really shines though, is when you have multiple types that implement Show - all of them can be used with print. Or any other function that works on a generic Showable!
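As a rough illustration of that, here's a sketch with two unrelated types going through the same print function. Point, point_show, int_show, prep_point_show, prep_int_show and print_all are hypothetical names made up for this example, and the 64 and 32 byte buffers are assumptions. impl_show is the macro from above, reflowed to fit, and print gains a NULL check in case allocation fails.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char* (*const show)(void* self);
} Show;

typedef struct
{
    void* self;
    Show const* tc;
} Showable;

#define impl_show(T, Name, show_f)                                             \
    Showable Name(T x)                                                         \
    {                                                                          \
        char* (*const show_)(T e) = (show_f);                                  \
        (void)show_;                                                           \
        static Show const tc = {.show = (char* (*const)(void*))(show_f)};      \
        return (Showable){.tc = &tc, .self = x};                               \
    }

/* Hypothetical type number one */
typedef struct
{
    int x;
    int y;
} Point;

static char* point_show(Point* p)
{
    char* s = malloc(64);
    if (s != NULL)
    {
        snprintf(s, 64, "(%d, %d)", p->x, p->y);
    }
    return s;
}

/* Hypothetical type number two: a plain int */
static char* int_show(int* x)
{
    char* s = malloc(32);
    if (s != NULL)
    {
        snprintf(s, 32, "%d", *x);
    }
    return s;
}

impl_show(Point*, prep_point_show, point_show)
impl_show(int*, prep_int_show, int_show)

/*
Mixing things up is diagnosed by the compiler, since `point_show` is not a `char* (*)(int*)`:
impl_show(int*, prep_bad_show, point_show)
*/

void print(Showable showable)
{
    char* s = showable.tc->show(showable.self);
    if (s != NULL)
    {
        puts(s);
        free(s);
    }
}

/* Usage: a heterogeneous array of Showables, printed with one function */
static void print_all(void)
{
    Point p = {1, 2};
    int n = 7;
    Showable items[] = {prep_point_show(&p), prep_int_show(&n)};
    for (size_t i = 0; i < sizeof items / sizeof items[0]; i++)
    {
        print(items[i]);
    }
}
```

Each impl_show call gets its own static Show table, so the two Showables carry different tc pointers even though print treats them identically.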

Combining Multiple Typeclasses

One of the core design goals of a typeclass is to be modular. A Show typeclass should only have actions directly related to "showing", and a Num typeclass should only have actions directly related to numerical operations. This is unlike objects, which may contain many different methods of arbitrary relevance to each other.

This means that, more often than not, you'll want a type that can do multiple different classes of actions, i.e. a type that implements multiple typeclasses.

You can model that pretty easily with this pattern-

/* Type constraint that requires both `Show` and `Enum` to be implemented */
typedef struct
{
    void* self;
    Show const* showtc;
    Enum const* enumtc;
} ShowableEnumerable;

#define impl_show_enum(T, Name, showimpl, enumimpl)                                                                    \
    ShowableEnumerable Name(T x)                                                                                       \
    {                                                                                                                  \
        Showable showable = showimpl(x);                                                                               \
        Enumerable enumerable = enumimpl(x);                                                                           \
        return (ShowableEnumerable){.showtc = showable.tc, .enumtc = enumerable.tc, .self = x};                        \
    }

Where Enum is also a typeclass defined like-

typedef struct
{
    size_t (*const from_enum)(void* self);
    void* (*const to_enum)(size_t x);
} Enum;

typedef struct
{
   void* self;
   Enum const* tc;
} Enumerable;

#define impl_enum(T, Name, from_enum_f, to_enum_f)                                                                     \
    Enumerable Name(T x)                                                                                               \
    {                                                                                                                  \
        size_t (*const from_enum_)(T e) = (from_enum_f);                                                               \
        T (*const to_enum_)(size_t x)   = (to_enum_f);                                                                 \
        (void)from_enum_;                                                                                              \
        (void)to_enum_;                                                                                                \
        static Enum const tc = {                                                                                       \
            .from_enum = (size_t (*const)(void*))(from_enum_f), .to_enum = (void* (*const)(size_t x))(to_enum_f)       \
        };                                                                                                             \
        return (Enumerable){.tc = &tc, .self = x};                                                                     \
    }

Essentially, you can have a struct that stores each of the typeclass pointers you want to combine, and the self member. The impl macro would also be very simple. It should simply define a function that puts the given value into ShowableEnumerable's self, as well as use the impl functions to get the typeclass instances of that type.

With this, if you implemented Show for Antioch* and defined the function as prep_antioch_show, and also implemented Enum with the function name prep_antioch_enum, you could call impl_show_enum using-

impl_show_enum(Antioch*, prep_antioch_show_enum, prep_antioch_show, prep_antioch_enum)

The defined function would have the signature-

ShowableEnumerable prep_antioch_show_enum(Antioch* x);

That's it!

You can now have functions that require their argument to implement multiple typeclasses-

void foo(ShowableEnumerable se)
{
    /* Use the enumerable abilities */
    size_t x = se.enumtc->from_enum(se.self);
    /* Use the showable abilities */
    char* s = se.showtc->show(se.self);
    /* `show` returns a malloc'ed string, so the caller must free it */
    free(s);
}
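To see the pieces fit together end to end, here's a sketch that implements both typeclasses for Antioch* and uses them in one function. The Enum implementation is my own assumption, since the article doesn't define one: antioch_from_enum, antioch_to_enum and show_next are hypothetical names, and having to_enum return a pointer into a static table (or NULL when out of range) is just one possible ownership convention. The three macros are the ones shown above, reflowed to fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char* (*const show)(void* self); } Show;
typedef struct { void* self; Show const* tc; } Showable;
typedef struct
{
    size_t (*const from_enum)(void* self);
    void* (*const to_enum)(size_t x);
} Enum;
typedef struct { void* self; Enum const* tc; } Enumerable;
typedef struct { void* self; Show const* showtc; Enum const* enumtc; } ShowableEnumerable;

#define impl_show(T, Name, show_f)                                             \
    Showable Name(T x)                                                         \
    {                                                                          \
        char* (*const show_)(T e) = (show_f);                                  \
        (void)show_;                                                           \
        static Show const tc = {.show = (char* (*const)(void*))(show_f)};      \
        return (Showable){.tc = &tc, .self = x};                               \
    }

#define impl_enum(T, Name, from_enum_f, to_enum_f)                             \
    Enumerable Name(T x)                                                       \
    {                                                                          \
        size_t (*const from_enum_)(T e) = (from_enum_f);                       \
        T (*const to_enum_)(size_t x)   = (to_enum_f);                         \
        (void)from_enum_;                                                      \
        (void)to_enum_;                                                        \
        static Enum const tc = {                                               \
            .from_enum = (size_t (*const)(void*))(from_enum_f),                \
            .to_enum   = (void* (*const)(size_t x))(to_enum_f)                 \
        };                                                                     \
        return (Enumerable){.tc = &tc, .self = x};                             \
    }

#define impl_show_enum(T, Name, showimpl, enumimpl)                            \
    ShowableEnumerable Name(T x)                                               \
    {                                                                          \
        Showable showable     = showimpl(x);                                   \
        Enumerable enumerable = enumimpl(x);                                   \
        return (ShowableEnumerable){                                           \
            .showtc = showable.tc, .enumtc = enumerable.tc, .self = x};        \
    }

typedef enum { holy, hand, grenade } Antioch;

static char* antioch_show(Antioch* x)
{
    static char const* const names[] = {"holy", "hand", "grenade"};
    size_t i = (size_t)*x;
    char const* name = i < 3 ? names[i] : "breakfast cereal";
    char* s = malloc(strlen(name) + 1);
    if (s != NULL)
    {
        strcpy(s, name);
    }
    return s;
}

/* Assumed Enum implementation: the position of the enumerator */
static size_t antioch_from_enum(Antioch* x) { return (size_t)*x; }

/* Assumed convention: pointer into a static table, NULL when out of range */
static Antioch* antioch_to_enum(size_t x)
{
    static Antioch values[] = {holy, hand, grenade};
    return x < sizeof values / sizeof values[0] ? &values[x] : NULL;
}

impl_show(Antioch*, prep_antioch_show, antioch_show)
impl_enum(Antioch*, prep_antioch_enum, antioch_from_enum, antioch_to_enum)
impl_show_enum(Antioch*, prep_antioch_show_enum, prep_antioch_show, prep_antioch_enum)

/* Needs both abilities: enumerate to the successor, then show it (NULL if there is none) */
static char* show_next(ShowableEnumerable se)
{
    size_t i   = se.enumtc->from_enum(se.self);
    void* next = se.enumtc->to_enum(i + 1);
    return next != NULL ? se.showtc->show(next) : NULL;
}
```

show_next never names Antioch, so it works for any type that comes with both a Show and an Enum implementation.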

Real World Usage

With this pattern, I've implemented lazy, functional iterators in pure C. They're essentially modeled after Rust's Iterator trait.

But that's not all! A lazy Iterator isn't complete without cool abstractions like take, drop, map, filter etc. You can implement all of these abstractions using the same Typeclass pattern.

It's very similar to how Rust does it. The take method, for example, simply returns a Take struct in Rust. This struct has its own Iterator implementation, which is what allows this whole abstraction to be completely lazy.

map is even cooler! Here's an example of mapping a function over an Iterable named it-

/* Map an increment function over the iterable */
Iterable(int) mappedit = map_over(it, incr, int, int);

Where incr is-

/* A function that increments and returns the given integer */
static int incr(int x) { return x + 1; }

Once again, fully type safe and completely lazy. This map operation isn't performed until you explicitly iterate over the iterable, which means you can chain take_from and map_over and there won't be multiple iterations, just one!

/* Map an increment function over the iterable */
Iterable(int) mappedit = map_over(it, incr, int, int);
/* Take, at most, the first 10 elements */
Iterable(int) mappedit10 = take_from(mappedit, 10, int);

The exact code to implement these, as well as thorough explanations of the semantics can be found in the c-iterators repo.
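For a rough feel of how a lazy adaptor like take falls out of the pattern, here's a stripped-down sketch. To be clear, this is not the c-iterators API: every name in it (Iterator, Iterable, ArrIter, Take, arr_iter, take_iter, sum_ints, usage_example) is hypothetical, and it trades the typed Iterable(int) for a plain void* element to stay short. I'm assuming a next function that returns a pointer to the next element, or NULL once the iterator is exhausted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Iterator typeclass: `next` yields the next element, or NULL when done */
typedef struct
{
    void* (*const next)(void* self);
} Iterator;

typedef struct
{
    void* self;
    Iterator const* tc;
} Iterable;

/* A source iterator that walks an int array */
typedef struct
{
    int* arr;
    size_t len;
    size_t i;
} ArrIter;

static void* arr_next(void* self)
{
    ArrIter* it = self;
    return it->i < it->len ? &it->arr[it->i++] : NULL;
}

static Iterator const arr_tc = {.next = arr_next};

static Iterable arr_iter(ArrIter* it) { return (Iterable){.self = it, .tc = &arr_tc}; }

/* The adaptor: wraps another Iterable and has its own Iterator implementation */
typedef struct
{
    Iterable src;
    size_t remaining;
} Take;

static void* take_next(void* self)
{
    Take* t = self;
    if (t->remaining == 0)
    {
        return NULL;
    }
    t->remaining--;
    /* The source is only advanced here, when someone actually asks for an element */
    return t->src.tc->next(t->src.self);
}

static Iterator const take_tc = {.next = take_next};

static Iterable take_iter(Take* t) { return (Iterable){.self = t, .tc = &take_tc}; }

/* A generic consumer: works on any Iterable whose elements are ints */
static int sum_ints(Iterable it)
{
    int total = 0;
    for (int* x = it.tc->next(it.self); x != NULL; x = it.tc->next(it.self))
    {
        total += *x;
    }
    return total;
}

/* Usage: building the Take does no work; consuming it pulls exactly 3 elements */
static void usage_example(void)
{
    int nums[] = {1, 2, 3, 4, 5};
    ArrIter src = {.arr = nums, .len = 5, .i = 0};
    Take first3 = {.src = arr_iter(&src), .remaining = 3};
    assert(src.i == 0);
    assert(sum_ints(take_iter(&first3)) == 6);
    assert(src.i == 3);
}
```

Since Take only stores the source Iterable and a counter, constructing it is free, and stacking another adaptor on top would still walk the underlying array once.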
