Note: An extended and updated version of this demonstration can be found at typeclass-interface-pattern.
This article describes an extensible and practical design pattern you can use to achieve functional polymorphism in pure, standard C99 (or above). You can view an implementation and usage of this pattern in c-iterators, where I implement lazy, type-safe, Rust-like iterators.
Brief
If you've written a fair amount of C, chances are you've been struck by the desire to have some sort of polymorphism, like higher-level languages do.
Maybe you want generic containers (e.g. a linked list of generic elements), or a function that takes in a polymorphic type, or anything in between. Either way, you're probably somewhat frustrated at the lack of native polymorphism support.
Thankfully, polymorphism in C is nothing new. There are many articles, repositories, and projects explaining and using a Virtual Method Table (aka vtable) based OOP polymorphism pattern. The Linux kernel, the CPython implementation, and many other large-scale C projects implement OOP through vtables, a pattern that is pretty common in the C community.
At the same time, if you've used this pattern, you may also know of its deficiencies-
- It's not type safe, since type safety isn't a primary concern of this pattern.
- It inherently depends on unsafe casts.
- The semantics are, arguably, very ugly and hacky.
- It presents a fair number of traps.
- Some implementations of this pattern break standard conformance and rely on undefined behavior or compiler extensions. Usually, the rule they're breaking is strict aliasing, though that issue really isn't hard to avoid. CPython actually had this issue at one point, and fixed it later. But that's not necessarily the only rule being broken. Many implementations simply don't target standard C, and instead use an extended dialect such as GNU C.
Oh, there's actually another drawback, but this one's totally subjective and doesn't contribute anything to the actual conversation: what if you just don't like OOP and prefer functional polymorphism over OOP inheritance? Well, I'd fall under that category :)
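For readers who haven't seen it, the vtable-based OOP pattern discussed above usually looks something like the following. This is only a minimal sketch: Shape, Rect, and every function name here are made up for illustration and aren't taken from any of the projects mentioned. Note how the implementation has to downcast the base pointer on trust.

```c
#include <assert.h>

/* Hypothetical "base class": every shape carries a pointer to its vtable */
typedef struct Shape Shape;

typedef struct
{
    double (*area)(Shape const* self);
} ShapeVtable;

struct Shape
{
    ShapeVtable const* vtable;
};

/* Hypothetical "derived class": embeds the base as its first member */
typedef struct
{
    Shape base;
    double w;
    double h;
} Rect;

static double rect_area(Shape const* self)
{
    /* The downcast the pattern relies on: nothing checks that `self` really is a `Rect` */
    Rect const* r = (Rect const*)self;
    return r->w * r->h;
}

static ShapeVtable const rect_vtable = {.area = rect_area};

Rect rect_new(double w, double h)
{
    assert(w >= 0 && h >= 0);
    return (Rect){.base = {.vtable = &rect_vtable}, .w = w, .h = h};
}

/* A "polymorphic" function that works on any struct embedding `Shape` */
double shape_area(Shape const* s)
{
    return s->vtable->area(s);
}
```

Passing any other Shape-embedding struct to rect_area would compile just fine, which is exactly the kind of trap listed above.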
Enter Type Classes
If you're familiar with Haskell, you're already familiar with Type Classes, a way to do ad-hoc polymorphism. Haskell is the main language that inspired me to come up with this design pattern. The end product really isn't exactly like type classes, since type classes are inherently based on static dispatch (a feature impossible to implement in C in an extensible fashion), but you'll probably still find the similarities along the way.
If you're familiar with Rust, you're also already familiar with type classes! Just with a different name and a bit less power: Traits. Once again, the end result of this pattern will not be exactly like traits. Rather, it is more similar to trait objects. You know, dynamic traits, Rust's way of doing dynamic dispatch.
If you're not familiar with a functional language, but rather with OOP languages, no worries! OOP also has a concept very similar to typeclasses and traits: Interfaces.
If you aren't familiar with any of the concepts mentioned above, that's totally fine! The concepts aren't actually that complicated from a base level view. Typeclasses, Traits, Interfaces are just ways of modeling polymorphism around actions, rather than objects. When I ask for a type that implements a typeclass (or a trait, or an interface) - I'm asking for a type that can do certain things that fall under the typeclass. That's all there is to it. Polymorphism based around abilities, not objects and object hierarchy.
Goals of the Typeclass Pattern
- Type safety through monomorphization and simple abstractions
- Extensibility through dynamic dispatch, allowing typeclasses to be used in library APIs
- Usable as completely normal types, allowing them to be used alongside existing container libraries such as CTL (C Template Library)
- Polymorphism constrained around actions/abilities/functions, not objects
A Small Taste
Before we go on to discussing and implementing the pattern itself, here's a tiny taste of what you can do with the pattern-
void print(Showable showable)
{
    char* s = showable.tc->show(showable.self);
    puts(s);
    free(s);
}

typedef enum
{
    holy,
    hand,
    grenade
} Antioch;

Antioch ant = holy;
Showable antsh = prep_antioch_show(&ant);
This is the Show typeclass. It describes the ability to turn a type into its string representation.
I've implemented Show for the Antioch type given above by defining prep_antioch_show. Now any Antioch value can be turned into a Showable, which can then be used with generic functions working on Showable types.
The Typeclass Pattern
We can finally talk about the design pattern itself. There are 3 core parts to this. I'll demonstrate these 3 parts by implementing the Show typeclass mentioned above.
The typeclass struct definition
This is the struct containing the function pointers related to the typeclass. For Show, we'll just be using the show function here. It takes in a value of the type for which Show is implemented (i.e. self) and returns a printable string.
typedef struct
{
    char* (*const show)(void* self);
} Show;
A simple struct containing the virtual function(s). The wrapper function (the one that converts a certain type to its typeclass instance) holds a typeclass struct of static storage duration, filled with the function pointers for that specific type (a vtable of sorts). Every typeclass instance returned by that wrapper stores a pointer to this same struct. More on this will be discussed in the impl_ macro part.
The typeclass_instance struct definition
This is the concrete instance to be used as a type constraint. It should contain a pointer to the typeclass, and the self member containing the value to pass to the functions in the typeclass struct.
typedef struct
{
    void* self;
    Show const* tc;
} Showable;
The impl_ macro used to implement the typeclass
This macro is the real heavy lifter when it comes to type safety.
It takes in some information about the type you're implementing a typeclass for, and the exact function implementations that will be used for that type, and defines a function which does the following-
- Takes in an argument of the type the implementation is for
- Type checks the function implementations given. This is done by storing the given function implementations into function pointers of an exact and expected type
- Initializes the typeclass struct to store these function pointers, with static storage duration
- Creates and returns the typeclass instance, which stores a pointer to the aforementioned typeclass struct, and the function argument in the self member.
Following these rules, this is what impl_show would look like-
#define impl_show(T, Name, show_f)                                          \
    Showable Name(T x)                                                      \
    {                                                                       \
        char* (*const show_)(T e) = (show_f);                               \
        (void)show_;                                                        \
        static Show const tc = {.show = (char* (*const)(void*))(show_f) };  \
        return (Showable){.tc = &tc, .self = x};                            \
    }
It takes the show implementation as its third argument. In the function definition, it stores that impl in a variable of type char* (*const show_)(T e), which is the exact type it should be. T is the specific type the implementation is for. It must be a pointer type, since the value is stored into void* self.
The (void)show_; line is there to suppress the unused variable warning emitted by compilers, since show_ isn't actually used. It's only there for typechecking purposes. These 2 typechecking lines will be completely eliminated by any decent optimizing compiler.
Then it simply defines a static typeclass and stores the function pointer inside. Finally, it creates and returns the Showable struct, containing the x argument and a pointer to the typeclass struct.
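To make the type check concrete, here is a small sketch that reuses the Show, Showable and impl_show definitions from above. The Point type and the point_show and bad_show functions are hypothetical names invented for this example. A matching implementation compiles cleanly, while the commented-out mismatched one should draw a diagnostic (a warning or an error, depending on the compiler and its flags).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char* (*const show)(void* self);
} Show;

typedef struct
{
    void* self;
    Show const* tc;
} Showable;

#define impl_show(T, Name, show_f)                                        \
    Showable Name(T x)                                                    \
    {                                                                     \
        char* (*const show_)(T e) = (show_f);                             \
        (void)show_;                                                      \
        static Show const tc = {.show = (char* (*const)(void*))(show_f)}; \
        return (Showable){.tc = &tc, .self = x};                          \
    }

/* Hypothetical example type */
typedef struct
{
    int x;
    int y;
} Point;

/* Matches `char* (*)(Point*)`, so the check inside `impl_show` passes */
static char* point_show(Point* p)
{
    char* s = malloc(64);
    assert(s != NULL);
    snprintf(s, 64, "(%d, %d)", p->x, p->y);
    return s;
}

impl_show(Point*, prep_point_show, point_show)

/*
Uncommenting the two lines below should make the compiler complain, because
`bad_show` takes an `int*` while the implementation is declared for `Point*`:

static char* bad_show(int* x) { (void)x; return NULL; }
impl_show(Point*, prep_bad_show, bad_show)
*/
```

Since tc has static storage duration, every Showable returned by prep_point_show points at the same typeclass struct.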
Implementing Typeclasses for your own types!
Once the typeclass and typeclass instance structs have been defined, all the user has to do is invoke the impl_ macro with their own type and the function implementation(s) required for the typeclass. The declaration of the function defined by said macro can then be included in a header.
Here's an example of implementing the previously defined Show typeclass for a very holy enum-
#include <stdlib.h>
#include <string.h>

typedef enum
{
    holy,
    hand,
    grenade
} Antioch;

/* Returns a malloc'ed copy of `x`, or NULL if the allocation fails */
static inline char* strdup_(char const* x)
{
    char* s = malloc((strlen(x) + 1) * sizeof(*s));
    if (s == NULL)
    {
        return NULL;
    }
    strcpy(s, x);
    return s;
}

/* The `show` function implementation for `Antioch*` */
static char* antioch_show(Antioch* x)
{
    /*
    Note: The `show` function of a `Showable` is expected to return a malloc'ed value
    The users of a generic `Showable` are expected to `free` the returned pointer from the function `show`.
    */
    switch (*x)
    {
        case holy:
            return strdup_("holy");
        case hand:
            return strdup_("hand");
        case grenade:
            return strdup_("grenade");
        default:
            return strdup_("breakfast cereal");
    }
}
/*
Implement the `Show` typeclass for the type `Antioch*`
This will define a function to convert a value of type `Antioch*` into a `Showable`, the function will be named `prep_antioch_show`
The `show` implementation used will be the `antioch_show` function
*/
impl_show(Antioch*, prep_antioch_show, antioch_show)
The impl_show macro here simply translates to-
Showable prep_antioch_show(Antioch* x)
{
    char* (*const show_)(Antioch* e) = (antioch_show);
    (void)show_;
    static Show const tc = {.show = (char* (*const)(void*))(antioch_show) };
    return (Showable){.tc = &tc, .self = x};
}
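One caveat about the expansion above: the stored pointer has type char* (*)(void*), while antioch_show itself has type char* (*)(Antioch*). Calling a function through a pointer of an incompatible function type is undefined behavior in standard C, even though it typically works on mainstream platforms. A possible way around it, sketched below under the assumption that one extra generated function per implementation is acceptable, is to have the macro emit a small wrapper that takes void* and forwards to the real implementation. The names impl_show_wrapped and the generated Name##_show_ are hypothetical and not part of the original pattern.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char* (*const show)(void* self);
} Show;

typedef struct
{
    void* self;
    Show const* tc;
} Showable;

/*
Hypothetical variant of `impl_show`. The generated wrapper has exactly the
type stored in `Show`, and the typed `show_` pointer inside it still performs
the compile time check on `show_f`.
*/
#define impl_show_wrapped(T, Name, show_f)             \
    static char* Name##_show_(void* self)              \
    {                                                  \
        char* (*const show_)(T e) = (show_f);          \
        return show_(self);                            \
    }                                                  \
    Showable Name(T x)                                 \
    {                                                  \
        static Show const tc = {.show = Name##_show_}; \
        return (Showable){.tc = &tc, .self = x};       \
    }

typedef enum
{
    holy,
    hand,
    grenade
} Antioch;

static char* antioch_show(Antioch* x)
{
    char const* name = *x == holy    ? "holy"
                     : *x == hand    ? "hand"
                     : *x == grenade ? "grenade"
                                     : "breakfast cereal";
    char* s = malloc(strlen(name) + 1);
    assert(s != NULL);
    strcpy(s, name);
    return s;
}

impl_show_wrapped(Antioch*, prep_antioch_show, antioch_show)
```

The call site is unchanged: prep_antioch_show(&ant) still yields a Showable, and show is still invoked as showable.tc->show(showable.self).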
Now, you can convert an Antioch into a Showable like so-
Antioch ant = holy;
Showable antsh = prep_antioch_show(&ant);
And this Showable will automatically dispatch to the antioch_show function whenever someone calls the show function inside it.
Great! Let's make a polymorphic function that works on Showables-
void print(Showable showable)
{
    char* s = showable.tc->show(showable.self);
    puts(s);
    free(s);
}
You can now easily print an Antioch with these abstractions-
Antioch ant = holy;
print(prep_antioch_show(&ant));
Where this really shines though, is when you have multiple types that implement Show. All of them can be used with print, or any other function that works on a generic Showable!
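As a rough illustration of that point, the sketch below implements Show for two unrelated types and pushes both through the same print function. The Temperature type, int_show, temperature_show and print_all are hypothetical names made up for this example. The macro used here is also a sketch-only variant that dispatches through a generated void* wrapper instead of a function pointer cast, so that every call goes through a pointer of matching type.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char* (*const show)(void* self);
} Show;

typedef struct
{
    void* self;
    Show const* tc;
} Showable;

/* Sketch-only variant of `impl_show` that generates a `void*` wrapper */
#define impl_show_wrapped(T, Name, show_f)             \
    static char* Name##_show_(void* self)              \
    {                                                  \
        char* (*const show_)(T e) = (show_f);          \
        return show_(self);                            \
    }                                                  \
    Showable Name(T x)                                 \
    {                                                  \
        static Show const tc = {.show = Name##_show_}; \
        return (Showable){.tc = &tc, .self = x};       \
    }

static char* int_show(int* x)
{
    char* s = malloc(32);
    assert(s != NULL);
    snprintf(s, 32, "%d", *x);
    return s;
}

/* Hypothetical second type */
typedef struct
{
    int degrees;
} Temperature;

static char* temperature_show(Temperature* t)
{
    char* s = malloc(32);
    assert(s != NULL);
    snprintf(s, 32, "%dC", t->degrees);
    return s;
}

impl_show_wrapped(int*, prep_int_show, int_show)
impl_show_wrapped(Temperature*, prep_temperature_show, temperature_show)

void print(Showable showable)
{
    char* s = showable.tc->show(showable.self);
    puts(s);
    free(s);
}

/* Prints every element of a heterogeneous array of `Showable`s */
void print_all(Showable const* items, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        print(items[i]);
    }
}
```

Because a Showable is a plain struct, values of different underlying types can sit side by side in an ordinary array and be handled by one loop.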
Combining Multiple Typeclasses
One of the core design goals of a typeclass is to be modular. A Show typeclass should only have actions directly related to "showing", and a Num typeclass should only have actions directly related to numerical operations. This is unlike objects, which may contain many different methods of arbitrary relevance to each other.
This means that, more often than not, you'll want a type that can do multiple different classes of actions. A type that implements multiple typeclasses.
You can model that pretty easily with this pattern-
/* Type constraint that requires both `Show` and `Enum` to be implemented */
typedef struct
{
    void* self;
    Show const* showtc;
    Enum const* enumtc;
} ShowableEnumerable;

#define impl_show_enum(T, Name, showimpl, enumimpl)                                              \
    ShowableEnumerable Name(T x)                                                                 \
    {                                                                                            \
        Showable showable = showimpl(x);                                                         \
        Enumerable enumerable = enumimpl(x);                                                     \
        return (ShowableEnumerable){.showtc = showable.tc, .enumtc = enumerable.tc, .self = x};  \
    }
Where Enum is also a typeclass, defined like-
typedef struct
{
    size_t (*const from_enum)(void* self);
    void* (*const to_enum)(size_t x);
} Enum;

typedef struct
{
    void* self;
    Enum const* tc;
} Enumerable;

#define impl_enum(T, Name, from_enum_f, to_enum_f)                 \
    Enumerable Name(T x)                                           \
    {                                                              \
        size_t (*const from_enum_)(T e) = (from_enum_f);           \
        T (*const to_enum_)(size_t x) = (to_enum_f);               \
        (void)from_enum_;                                          \
        (void)to_enum_;                                            \
        static Enum const tc = {                                   \
            .from_enum = (size_t (*const)(void*))(from_enum_f),    \
            .to_enum = (void* (*const)(size_t x))(to_enum_f)       \
        };                                                         \
        return (Enumerable){.tc = &tc, .self = x};                 \
    }
Essentially, you can have a struct that stores each of the typeclass pointers you want to combine, and the self member. The impl macro is also very simple: it defines a function that puts the given value into ShowableEnumerable's self, and uses the impl functions to get the typeclass instances of that type.
With this, if you implemented Show for Antioch* and defined the function as prep_antioch_show, and also implemented Enum with the function name prep_antioch_enum, you could invoke impl_show_enum using-
impl_show_enum(Antioch*, prep_antioch_show_enum, prep_antioch_show, prep_antioch_enum)
The defined function would have the signature-
ShowableEnumerable prep_antioch_show_enum(Antioch* x);
That's it!
You can now have functions that require their argument to implement multiple typeclasses-
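void foo(ShowableEnumerable se)
{
    /* Use the enumerable abilities */
    size_t x = se.enumtc->from_enum(se.self);
    printf("%zu\n", x);
    /* Use the showable abilities */
    char* s = se.showtc->show(se.self);
    puts(s);
    /* `show` returns a malloc'ed string, so free it once done */
    free(s);
}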
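The article never spells out an Enum implementation for Antioch, so the following is one possible sketch that ties everything together. The functions antioch_from_enum, antioch_to_enum and show_next are hypothetical, as is the choice of having to_enum return a pointer into a static table (NULL when out of range). The impl macros are sketch-only variants that dispatch through generated void* wrappers rather than function pointer casts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char* (*const show)(void* self); } Show;
typedef struct { void* self; Show const* tc; } Showable;

typedef struct
{
    size_t (*const from_enum)(void* self);
    void* (*const to_enum)(size_t x);
} Enum;
typedef struct { void* self; Enum const* tc; } Enumerable;

typedef struct
{
    void* self;
    Show const* showtc;
    Enum const* enumtc;
} ShowableEnumerable;

#define impl_show_wrapped(T, Name, show_f)                      \
    static char* Name##_show_(void* self)                       \
    {                                                           \
        char* (*const show_)(T e) = (show_f);                   \
        return show_(self);                                     \
    }                                                           \
    Showable Name(T x)                                          \
    {                                                           \
        static Show const tc = {.show = Name##_show_};          \
        return (Showable){.tc = &tc, .self = x};                \
    }

#define impl_enum_wrapped(T, Name, from_enum_f, to_enum_f)      \
    static size_t Name##_from_enum_(void* self)                 \
    {                                                           \
        size_t (*const from_enum_)(T e) = (from_enum_f);        \
        return from_enum_(self);                                \
    }                                                           \
    static void* Name##_to_enum_(size_t x)                      \
    {                                                           \
        T (*const to_enum_)(size_t i) = (to_enum_f);            \
        return to_enum_(x);                                     \
    }                                                           \
    Enumerable Name(T x)                                        \
    {                                                           \
        static Enum const tc = {.from_enum = Name##_from_enum_, \
                                .to_enum = Name##_to_enum_};    \
        return (Enumerable){.tc = &tc, .self = x};              \
    }

#define impl_show_enum(T, Name, showimpl, enumimpl)             \
    ShowableEnumerable Name(T x)                                \
    {                                                           \
        Showable showable = showimpl(x);                        \
        Enumerable enumerable = enumimpl(x);                    \
        return (ShowableEnumerable){.showtc = showable.tc,      \
                                    .enumtc = enumerable.tc,    \
                                    .self = x};                 \
    }

typedef enum { holy, hand, grenade } Antioch;

/* Assumes `*x` is one of the three declared enumerators */
static char* antioch_show(Antioch* x)
{
    static char const* const names[] = {"holy", "hand", "grenade"};
    char* s = malloc(16);
    assert(s != NULL);
    strcpy(s, names[*x]);
    return s;
}

static size_t antioch_from_enum(Antioch* x) { return (size_t)*x; }

/* Assumption: callers treat the returned pointer as read-only */
static Antioch* antioch_to_enum(size_t i)
{
    static Antioch values[] = {holy, hand, grenade};
    return i < sizeof values / sizeof values[0] ? &values[i] : NULL;
}

impl_show_wrapped(Antioch*, prep_antioch_show, antioch_show)
impl_enum_wrapped(Antioch*, prep_antioch_enum, antioch_from_enum, antioch_to_enum)
impl_show_enum(Antioch*, prep_antioch_show_enum, prep_antioch_show, prep_antioch_enum)

/* Shows the successor of a value: a malloc'ed string, or NULL if there is none */
char* show_next(ShowableEnumerable se)
{
    void* next = se.enumtc->to_enum(se.enumtc->from_enum(se.self) + 1);
    return next != NULL ? se.showtc->show(next) : NULL;
}
```

show_next only knows that its argument is both showable and enumerable, so it would work unchanged for any other type that implements both typeclasses.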
Real World Usage
With this pattern, I've implemented lazy, functional Iterators in pure C. They're essentially modeled after Rust's Iterator trait.
But that's not all! A lazy Iterator isn't complete without cool abstractions like take, drop, map, filter etc. You can implement all of these abstractions using the same Typeclass pattern.
It's very similar to how Rust does it. The take method, for example, simply returns a Take struct in Rust. This struct has its own Iterator implementation, which is what allows this whole abstraction to be completely lazy.
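To give a feel for how a Take struct with its own Iterator implementation stays lazy, here is a deliberately simplified, int-only sketch. It does not reproduce the c-iterators API, which is generic. IntIterator, IntIterable, Counter, Take and all the function names below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical int-only Iterator typeclass */
typedef struct
{
    /* Writes the next element to `*out` and returns true, or returns false when exhausted */
    bool (*const next)(void* self, int* out);
} IntIterator;

typedef struct
{
    void* self;
    IntIterator const* tc;
} IntIterable;

/* An endless source iterator that counts upwards */
typedef struct
{
    int cur;
} Counter;

static bool counter_next(void* self, int* out)
{
    Counter* c = self;
    *out = c->cur++;
    return true;
}

IntIterable counter_iter(Counter* c)
{
    static IntIterator const tc = {.next = counter_next};
    return (IntIterable){.tc = &tc, .self = c};
}

/* The state behind `take`: the source iterable and how many elements are left */
typedef struct
{
    IntIterable src;
    size_t remaining;
} Take;

static bool take_next(void* self, int* out)
{
    Take* t = self;
    if (t->remaining == 0)
    {
        return false;
    }
    t->remaining--;
    /* The source is only advanced here, one element per request */
    return t->src.tc->next(t->src.self, out);
}

/* Wrapping consumes nothing; elements are pulled only when `next` is called */
IntIterable take_iter(Take* t)
{
    static IntIterator const tc = {.next = take_next};
    return (IntIterable){.tc = &tc, .self = t};
}

/* A generic consumer that works on any `IntIterable` */
int sum(IntIterable it)
{
    int total = 0;
    int x;
    while (it.tc->next(it.self, &x))
    {
        total += x;
    }
    return total;
}
```

Wrapping the endless counter in a Take is what makes sum terminate: building the Take touches nothing, and the counter is advanced exactly as many times as elements are requested.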
map is even cooler! Here's an example of mapping a function over an Iterable, it-
/* Map an increment function over the iterable */
Iterable(int) mappedit = map_over(it, incr, int, int);
Where incr is-
/* A function that increments and returns the given integer */
static int incr(int x) { return x + 1; }
Once again, fully type safe and completely lazy. This map operation isn't performed until you explicitly iterate over the iterable, which means you can chain take_from and map_over and there won't be multiple iterations, just one!
/* Map an increment function over the iterable */
Iterable(int) mappedit = map_over(it, incr, int, int);
/* Take, at most, the first 10 elements */
Iterable(int) mappedit10 = take_from(mappedit, 10, int);
The exact code to implement these, as well as thorough explanations of the semantics can be found in the c-iterators repo.
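The single-pass claim can be checked with a small, simplified model. The sketch below is not the c-iterators implementation: it is an int-only approximation in which IntIterator, IntIterable, ArrIter, Map, Take and incr_first_n are hypothetical names, and a call counter on incr shows how many elements were actually mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical int-only Iterator typeclass */
typedef struct { bool (*const next)(void* self, int* out); } IntIterator;
typedef struct { void* self; IntIterator const* tc; } IntIterable;

/* Source: walks over an int array */
typedef struct { int const* arr; size_t len; size_t i; } ArrIter;

static bool arr_next(void* self, int* out)
{
    ArrIter* a = self;
    if (a->i >= a->len) return false;
    *out = a->arr[a->i++];
    return true;
}

IntIterable arr_iter(ArrIter* a)
{
    static IntIterator const tc = {.next = arr_next};
    return (IntIterable){.tc = &tc, .self = a};
}

/* Map: applies `f` to each element as it is pulled from the source */
typedef struct { IntIterable src; int (*f)(int); } Map;

static bool map_next(void* self, int* out)
{
    Map* m = self;
    int x;
    if (!m->src.tc->next(m->src.self, &x)) return false;
    *out = m->f(x);
    return true;
}

IntIterable map_iter(Map* m)
{
    static IntIterator const tc = {.next = map_next};
    return (IntIterable){.tc = &tc, .self = m};
}

/* Take: stops after `remaining` elements */
typedef struct { IntIterable src; size_t remaining; } Take;

static bool take_next(void* self, int* out)
{
    Take* t = self;
    if (t->remaining == 0) return false;
    t->remaining--;
    return t->src.tc->next(t->src.self, out);
}

IntIterable take_iter(Take* t)
{
    static IntIterator const tc = {.next = take_next};
    return (IntIterable){.tc = &tc, .self = t};
}

/* Counts how often `incr` actually runs */
static int incr_calls = 0;
static int incr(int x)
{
    incr_calls++;
    return x + 1;
}

/* Builds take(map(arr, incr), n), writes the results to `out`, returns how many were written */
size_t incr_first_n(int const* arr, size_t len, size_t n, int* out)
{
    ArrIter a = {.arr = arr, .len = len, .i = 0};
    Map m = {.src = arr_iter(&a), .f = incr};
    Take t = {.src = map_iter(&m), .remaining = n};
    IntIterable it = take_iter(&t);
    size_t count = 0;
    while (it.tc->next(it.self, &out[count])) count++;
    return count;
}
```

Each pull on the outermost iterable travels through take, then map, then the array, so incr runs once per element that is actually taken and never for the rest of the array.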