DEV Community

Andreas Krennmair

A brief look at Go's new generics

I have been following Go's development fairly closely ever since it was first publicly released in November 2009. I remember a time when you still had to terminate all statements with semicolons, the build system was rudimentary and Makefile-based, and the standard error type was not error but os.Error (which meant that virtually every package imported the os package).

At the time, I was mostly doing Unix systems programming with C and a bit of C++, on both Solaris and Linux. Go impressed me because it really felt like an evolution of C that kept most of C's simplicity while adding memory safety, garbage collection and a whole set of features enabling users to build complex concurrent algorithms. And even better: the language runtime implemented an M:N threading model that distributed potentially many light-weight goroutines among a few heavy-weight OS threads, effectively parallelizing the execution of your concurrent code.

Among the many early review articles of pre-1.0 Go, a common criticism was the lack of generics, or parameterized types. Having used Go for my personal projects since late 2009 and professionally since 2013, I kind of understood the criticism: it was the kind of feature you would naturally expect in a new language, as other modern languages of the time often featured generics in some shape or form. C++ went to the extreme with it, as its template system goes beyond just type parameterization and was even found to be Turing-complete. Practically though, in several years of actively using Go pretty much every work day, writing many tens of thousands of lines of code, I've only ever come across one situation where I thought that parameterized types would have been really handy and would have made it much easier to write less repetitive code.

So when Go 1.18 beta1 was released a few weeks ago, I decided to revisit that code to try and remove some of the code duplication for several different types by changing this over to use parameterized types.

I'll spare you the most basic introduction. The Go team has already written a very straightforward intro to this new language feature, so there's no point in repeating it here.

So let me describe a simplified situation of my existing code and the issue I ran into here. I'm starting with multiple implementations of encoding int32 and int64 values to bytes.

package main

import (
    "encoding/binary"
    "fmt"
)

type int32encoder struct{}

func (e *int32encoder) EncodeList(values []int32) (ret []byte) {
    for _, v := range values {
        ret = append(ret, e.Encode(v)...)
    }
    return ret
}

func (e *int32encoder) Encode(v int32) (ret []byte) {
    ret = make([]byte, 4)
    binary.BigEndian.PutUint32(ret, uint32(v))
    return ret
}

type int64encoder struct{}

func (e *int64encoder) EncodeList(values []int64) (ret []byte) {
    for _, v := range values {
        ret = append(ret, e.Encode(v)...)
    }
    return ret
}

func (e *int64encoder) Encode(v int64) (ret []byte) {
    ret = make([]byte, 8)
    binary.BigEndian.PutUint64(ret, uint64(v))
    return ret
}

func main() {
    var (
        enc32 int32encoder
        enc64 int64encoder
    )

    fmt.Printf("int32 list: %+v\n", enc32.EncodeList([]int32{23, 42, 9001}))
    fmt.Printf("int64 list: %+v\n", enc64.EncodeList([]int64{23, 42, 9001}))
}

While this is a very simplified example, you do see some code repetition: the Encode methods are similar (but not identical) between int32encoder and int64encoder, while the EncodeList methods are virtually the same, with the only difference being the use of int32 vs int64.

So how would we turn this into a more generic version that reduces this duplication to the minimum necessary? The most straightforward approach, which I implemented first, looked like this:

package main

import (
    "encoding/binary"
    "fmt"
)

type intEncoder[T int32 | int64] struct{}

func (e *intEncoder[T]) EncodeList(values []T) (ret []byte) {
    for _, v := range values {
        ret = append(ret, e.Encode(v)...)
    }
    return ret
}

func (e *intEncoder[T]) Encode(v T) (ret []byte) {
    switch interface{}(v).(type) {
    case int32:
        ret = make([]byte, 4)
        binary.BigEndian.PutUint32(ret, uint32(v))
    case int64:
        ret = make([]byte, 8)
        binary.BigEndian.PutUint64(ret, uint64(v))
    }
    return ret
}

func main() {
    var (
        enc32 intEncoder[int32]
        enc64 intEncoder[int64]
    )

    fmt.Printf("int32 list: %+v\n", enc32.EncodeList([]int32{23, 42, 9001}))
    fmt.Printf("int64 list: %+v\n", enc64.EncodeList([]int64{23, 42, 9001}))
}

This is better already: there's only one intEncoder type, parameterized to support both int32 and int64, with only one EncodeList method and one Encode method.

But what's that? Oh no, a type switch! Since int32 and int64 are encoded slightly differently, initially I chose this hack-ish way of coding the specialization for each supported type. It works, but it's not great, as it encodes the list of supported types both in the type constraint of the type parameter and the Encode method. That means that every time I want to add support for another type, I need to add it both in the type constraint and in the Encode method.

Not only is this tedious and error-prone, as it means that changes need to be made in multiple parts of the code, it also strictly locks in the set of supported types. If this were in a package used by other developers, it would not allow them to add encoding support for their own custom types. So that approach is a no-go.
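
To make the error-prone part concrete, here is a minimal sketch of what can go wrong. It assumes a hypothetical change where int16 is added to the type constraint but the type switch is left untouched; the compiler accepts this, and Encode quietly returns nil for the new type.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// intEncoder mirrors the type-switch version above, except that int16 has
// been added to the constraint only (a hypothetical change for illustration).
type intEncoder[T int16 | int32 | int64] struct{}

func (e *intEncoder[T]) Encode(v T) (ret []byte) {
	switch interface{}(v).(type) {
	case int32:
		ret = make([]byte, 4)
		binary.BigEndian.PutUint32(ret, uint32(v))
	case int64:
		ret = make([]byte, 8)
		binary.BigEndian.PutUint64(ret, uint64(v))
	}
	// There is no case for int16, so ret stays nil for that type.
	// Nothing at compile time points out the missing case.
	return ret
}

func main() {
	var (
		enc16 intEncoder[int16]
		enc32 intEncoder[int32]
	)
	fmt.Println(len(enc16.Encode(42))) // prints 0
	fmt.Println(len(enc32.Encode(42))) // prints 4
}
```

The constraint and the switch have to be kept in sync by hand, and the failure mode is silent data loss rather than a build error.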

I then looked around for another solution, but due to the lack of comprehensive documentation (generics have only been released in a beta version so far), it took me a little bit of thinking and playing around to find one that works.

Back from my C++ days, I remembered that you could specialize templates for concrete types to provide concrete implementations. I have to admit, though, that my C++ is pretty rusty these days. I couldn't find a specific way of doing that in Go, and the Go generics proposal even documents the lack of specialization as a limitation. Surely there must be some way of doing it, I thought. This new feature, into which many years of discussion, design and implementation went, ought to support more than just the most basic use cases, no?

After a bit more experimentation, I eventually found a solution that works, keeps the code extensible with your own types, and still ensures type safety.

Here is my first attempt at it:

package main

import (
    "encoding/binary"
    "fmt"
    "math"
)

type encodeImpl[T any] interface {
    encode(T) []byte
}

type encoder[T any] struct {
    impl encodeImpl[T]
}

func (e *encoder[T]) EncodeList(values []T) (ret []byte) {
    for _, v := range values {
        ret = append(ret, e.Encode(v)...)
    }
    return ret
}

func (e *encoder[T]) Encode(v T) (ret []byte) {
    return e.impl.encode(v)
}

type int32impl struct{}

func (i int32impl) encode(v int32) []byte {
    ret := make([]byte, 4)
    binary.BigEndian.PutUint32(ret, uint32(v))
    return ret
}

type int64impl struct{}

func (i int64impl) encode(v int64) []byte {
    ret := make([]byte, 8)
    binary.BigEndian.PutUint64(ret, uint64(v))
    return ret
}

type float64impl struct{}

func (i float64impl) encode(v float64) []byte {
    ret := make([]byte, 8)
    binary.BigEndian.PutUint64(ret, math.Float64bits(v))
    return ret
}

func main() {
    var (
        enc32      = encoder[int32]{int32impl{}}
        enc64      = encoder[int64]{int64impl{}}
        encf64     = encoder[float64]{float64impl{}}
    )

    fmt.Printf("int32 list: %+v\n", enc32.EncodeList([]int32{23, 42, 9001}))
    fmt.Printf("int64 list: %+v\n", enc64.EncodeList([]int64{23, 42, 9001}))
    fmt.Printf("float64 list: %+v\n", encf64.EncodeList([]float64{23.5, 42.007, 900.1}))
}

What I've essentially done is move the code that differs per type, the innermost encoding functionality, into separate concrete types that each implement a common interface. As you can see, I could even effortlessly add support for another concrete type, float64. There's one problem though: if you forget to provide the encoding implementation, the impl field holds a nil interface value, and calling Encode will panic.
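
A minimal sketch of that failure mode, using the same encodeImpl and encoder types as above. The tryEncode helper is hypothetical and only exists here to catch the panic so the program can report it.

```go
package main

import "fmt"

type encodeImpl[T any] interface {
	encode(T) []byte
}

type encoder[T any] struct {
	impl encodeImpl[T]
}

func (e *encoder[T]) Encode(v T) []byte {
	return e.impl.encode(v)
}

// tryEncode is a hypothetical helper (not part of the code above) that
// calls Encode and reports whether the call panicked.
func tryEncode[T any](e *encoder[T], v T) (ret []byte, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return e.Encode(v), false
}

func main() {
	// The zero value compiles fine, but impl is a nil interface value.
	var enc encoder[int32]

	_, panicked := tryEncode(&enc, 23)
	fmt.Println("panicked:", panicked) // prints "panicked: true"
}
```

The type system is perfectly happy with the zero value of encoder[int32], so the mistake only shows up at run time.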

So here's an improved version of it:

package main

import (
    "encoding/binary"
    "fmt"
    "math"
)

type encodeImpl[T any] interface {
    encode(T) []byte
}

type encoder[T any, I encodeImpl[T]] struct {
    impl I
}

func (e *encoder[T, I]) EncodeList(values []T) (ret []byte) {
    for _, v := range values {
        ret = append(ret, e.Encode(v)...)
    }
    return ret
}

func (e *encoder[T, I]) Encode(v T) (ret []byte) {
    return e.impl.encode(v)
}

type int32impl struct{}

func (i int32impl) encode(v int32) []byte {
    ret := make([]byte, 4)
    binary.BigEndian.PutUint32(ret, uint32(v))
    return ret
}

type int64impl struct{}

func (i int64impl) encode(v int64) []byte {
    ret := make([]byte, 8)
    binary.BigEndian.PutUint64(ret, uint64(v))
    return ret
}

type float64impl struct{}

func (i float64impl) encode(v float64) []byte {
    ret := make([]byte, 8)
    binary.BigEndian.PutUint64(ret, math.Float64bits(v))
    return ret
}

func main() {
    var (
        enc32  = encoder[int32, int32impl]{}
        enc64  = encoder[int64, int64impl]{}
        encf64 = encoder[float64, float64impl]{}
    )

    fmt.Printf("int32 list: %+v\n", enc32.EncodeList([]int32{23, 42, 9001}))
    fmt.Printf("int64 list: %+v\n", enc64.EncodeList([]int64{23, 42, 9001}))
    fmt.Printf("float64 list: %+v\n", encf64.EncodeList([]float64{23.5, 42.007, 900.1}))
}

The only change here is that the type of the type-specific implementation has been added to the list of type parameters, with the constraint that it must implement encodeImpl for the same type T that the encoder itself is parameterized with. Because impl is now a concrete struct value rather than an interface, its zero value is immediately usable and there is nothing left to forget. So even though this adds a bit of stuttering (in the code above, the type name appears multiple times when declaring a variable of the encoder type), it's still perfectly type-safe and provides the concrete encoder implementation for whatever type you want to use. You can even go further and have multiple encoder implementations per type, e.g. one for big endian and one for little endian.
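
As a rough sketch of that last idea, here are two implementations for the same value type, selected purely through the second type parameter. The names int32BE and int32LE are made up for this example; the encoder type is the one from above, trimmed to EncodeList.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

type encodeImpl[T any] interface {
	encode(T) []byte
}

type encoder[T any, I encodeImpl[T]] struct {
	impl I
}

func (e *encoder[T, I]) EncodeList(values []T) (ret []byte) {
	for _, v := range values {
		ret = append(ret, e.impl.encode(v)...)
	}
	return ret
}

// int32BE and int32LE are hypothetical names for two implementations
// that encode the same value type with different byte orders.
type int32BE struct{}

func (int32BE) encode(v int32) []byte {
	ret := make([]byte, 4)
	binary.BigEndian.PutUint32(ret, uint32(v))
	return ret
}

type int32LE struct{}

func (int32LE) encode(v int32) []byte {
	ret := make([]byte, 4)
	binary.LittleEndian.PutUint32(ret, uint32(v))
	return ret
}

func main() {
	var (
		be = encoder[int32, int32BE]{}
		le = encoder[int32, int32LE]{}
	)

	fmt.Printf("big endian:    %v\n", be.EncodeList([]int32{1})) // [0 0 0 1]
	fmt.Printf("little endian: %v\n", le.EncodeList([]int32{1})) // [1 0 0 0]

	// A mismatched pair such as encoder[int64, int32BE]{} does not compile,
	// because int32BE does not implement encodeImpl[int64].
}
```

The byte order becomes part of the encoder's type, so mixing up the two variants is caught at compile time rather than at run time.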

All in all, I'm quite happy with Go generics. They're fairly simple and easy to learn and understand, yet they strike the right balance by providing enough power to do more complex things. They're not nearly at the level of C++ templates, but more than good enough for the vast majority of use cases.

My only criticism is that the documentation is rather sparse. There's the brief tutorial from the Go team itself, the original proposal that eventually led to this implementation (which even differs slightly in syntax), and a few blog articles, but nothing that shows how to do more complex things with generics. But then, this was the motivation to write this article, and I'm certain that the Go community will soon produce more great documentation about all the details and pitfalls of the new generics feature. As with other language features previously, it will probably take a while until a set of best practices emerges and any surprising properties of the feature are discovered.

Work on Go started in 2007, and the language was first made public in 2009. In 2022, we'll see the first stable release of Go with support for type parameterization. Even though it took 15 years for that feature to reach consensus within the developer community and to be implemented, I'd say it was totally worth the wait.
