Recently, someone asked a rather innocuous question on a programming forum:
What is the difference between generics and type classes [in Scala]?
I spent some time thinking about that question - trying to get at the heart of where the confusion may lie. I concluded that it was a misunderstanding with regard to the various types of polymorphism we generally have available to us in Scala. For this question, in particular, it's the difference between parametric polymorphism (generics) and ad-hoc polymorphism (type classes).
What is Polymorphism?
The idea behind polymorphism is always the same: we are trying to increase code re-use by writing code in a more generic fashion. That is to say, in a higher level of abstraction.
There are three main forms of polymorphism: [1]
-
Subtype Polymorphism (Inheritance)
when a name denotes instances of many different classes related by some common superclass
-
Parametric Polymorphism (Generics)
when one or more types are not specified by name but by abstract symbols that can represent any type
-
Ad-Hoc Polymorphism (Type Classes)
defines a common interface for an arbitrary set of individually specified types
In order to really understand how these forms are different in practice, we need to get a sense for what types of problems each one solves and what their limitations are.
Luckily, this can be done with only a few examples.
Subtype Polymorphism (Inheritance)
subtyping allows a function to be written to take an object of a certain type
T
, but also work correctly, if passed an object that belongs to a typeS
that is a subtype ofT
(according to the Liskov substitution principle) [2]
In other words, subtyping allows a function to accept a value of a certain type T
, but also accept objects derived from type T
, ie: any type which extends T
.
This is the one we're most familiar with in OOP.
Example without subtype polymorphism:
class Cat {
val meow = "meow"
}
class Dog {
val bark = "woof!"
}
def dogSpeak(d: Dog) = println(d.bark)
def catSpeak(c: Cat) = println(c.meow)
Here, it's obvious we would like to define a singular behavior, that of the ability to speak
. But, without any form of polymorphism, we don't have a way of creating a singular, type safe implementation.
That is to say, we could do something like:
def speak(v: Any) = v match {
case v: Dog => v.bark
case c: Cat => c.meow
}
But, we lose static type checking because all type checking is done at runtime here. Not only that, but we also have to continually modify this method as we add more animals that can speak
to our program.
Clearly, there must be a better way!
Example with subtype polymorphism:
trait Animal {
def sound: String
}
class Cat extends Animal {
val sound: String = "meow"
}
class Dog extends Animal {
val sound: String = "woof!"
}
def speak(a: Animal) = println(a.sound)
While this is a very basic example, we can see that using inheritance via subtype polymorphism, we are able to write code at a higher level of abstraction by using the most generic type in the type hierarchy that satisfies our needs. In this case, we know that all Animal
s can make a sound
, so we can write a single speak
method which will now work for any Animal
.
Subtype polymorphism allows us to reduce boiler-plate by treating entire groups of types the same based on their position in a type hierarchy. Importantly, it allows us to do this despite having disparate implementations in the invoked methods between the types in the hierarchy. That is to say, both a Dog
and a Cat
can make a sound
because all Animal
s make sound
s, but a Cat
's sound
is not the same as a Dog
's sound
. The implementations are not uniform across types in the hierarchy.
Parametric Polymorphism (Generics)
Parametric polymorphism allows a function or a data type to be written generically, so that it can handle values uniformly without depending on their type [3]
Subtype polymorphism gets us pretty far, but what about when we want to implement something such as a custom Array
data type?
trait Array {
def set(index: Int, item: ???): Array
def get(index: Int): ???
}
What type can we use in place of ???
? Well, we could do what Java did pre-generics and use the top type. In Java, the closest thing to a top type is Object
, and in Scala we have a true top type of Any
. In other words, every type is a subtype of type Any
in Scala. So, we could do
trait Array {
def set(index: Int, item: Any): Array
def get(index: Int): Any
}
Note: In Java, we don't have a true top type because Java has the concept of primitives such as
int
,float
, etc. which are not objects and are, thus, not part of any type hierarchy.
But, of course, this means that we would have to do a bunch of runtime type checking and casting to actually make our custom Array
type useful, since everything we put into it will come out as type Any
, which is not a very useful type.
val a: String = "hello"
val b: String = "world"
val strings = new Array() // Pretend this is implemented and not just a trait
strings.set(0, a) // a goes in as a String
strings.set(1, b) // b goes in as a String
println(strings.get(0) + strings.get(1)) // strings.get(0) and strings.get(1) are both of type Any
// ERROR!
// type mismatch;
// found : Any
// required: String
If we want to maintain static type checking and avoid runtime casts, we're stuck creating a specialized version of our custom Array
type for each type we want to store in it.
trait StringArray {
def set(index: Int, item: String): StringArray
def get(index: Int): String
}
trait IntArray {
def set(index: Int, item: Int): IntArray
def get(index: Int): Int
}
This obviously gets out of hand quickly. Furthermore, even though we aren't showing the implementation of our set
or get
methods, it's probably obvious to you that we don't actually ever need to invoke any methods on our item
parameter or access any of its fields. Our Array
type doesn't actually care what type item
is, it's just storing a reference to it paired with an index
value to look it up later. In other words, our Array
can handle these values uniformly. We could take the implementations of set
and get
for String
and for Int
and just swap the types out and they would otherwise remain the same between StringArray
and IntArray
. This is the area where parametric polymorphism shines, as it allows us to templatize our code.
trait Array[T] {
def set(index: Int, item: T): Array
def get(index: Int): T
}
val a: String = "hello"
val b: String = "world"
val strings: Array[String] = new Array() // Again, pretend it's implemented
strings.set(0, a) // a goes in as a String
strings.set(1, b) // b goes in as a String
println(strings.get(0) + strings.get(1)) // both strings.get(0) and strings.get(1) are of type String!
// We can also parameterize functions
def first[T](items: Array[T]): T = items.get(1)
Parametric polymorphism allows us to reduce boiler-plate by writing code that can work uniformly over a range of types by using type parameters to templatize the code. An Array
type can work for any type T
because its implementation doesn't depend on any type-specific behavior - the Array
implementation is uniform across all types.
Ad-hoc Polymorphism (Type Classes)
the term ad hoc polymorphism [refers] to polymorphic functions that can be applied to arguments of different types, but that behave differently depending on the type of the argument to which they are applied [4]
What if we need to be able to create methods/functions whose parameters can be of different unrelated types (with regard to subtyping) and where the implementation is also type specific, ie: not uniform?
Let's say you want to write a function that, given a list of numbers, will return the arithmetic mean of the numbers:
Note: I am using examples lifted from an excellent blog post about the subject from Daniel Westheide Why? Because his example is awesome, and you should read his post to understand Type Classes better than my brief introduction here.
To start, let's just implement for Double
def mean(xs: List[Double]: Double = xs.reduce(_ + _) / xs.size
Nice! ...but what about lists of Int
? Surely we want to be able to take the average of other numeric types without having to convert each one to a double first.
def mean(xs: List[Int]: Double = xs.reduce(_ + _) / xs.size
Okay, cool. Thanks to Java's overloading capability, we now we have mean
implemented for both Double
and Int
, right?
Runtime Type Erasure makes Overloading difficult at times
def mean(xs: List[Double]): Double = xs.reduce(_ + _) / xs.size
def mean(xs: List[Int]): Double = xs.reduce(_ + _) / xs.size
// ERROOR:
// double definition:
// def mean(xs: List[Double]): Double at line 1 and
// def mean(xs: List[Int]): Double at line 2
// have same type after erasure: (xs: List)Double
// def mean(xs: List[Int]): Double = xs.reduce(_ + _) / xs.size
// ^
// Compilation Failed
Because of runtime type erasure, the JVM, at runtime, can't tell the difference between List[Int]
and List[Double]
so we can't use overloading to implement our mean
function - at least not easily. It's also a bit repetitious to implement the function over and over again for every type we care about.
Subtype polymorphism to the rescue?
In Scala, Int
and Double
do not share a type hierarchy other than both extending from AnyVal
. If only the numeric types extended some kind of Numeric
trait!
What if they did?
def mean(xs: List[Numeric]): Numeric = ???
Daniel's advice?
Thankfully, in this case there is no such common trait, so we arenβt tempted to walk this road at all. However, in other cases, that might very well be the case β and still be a bad idea. Not only do we drop previously available type information, we also close our API against future extensions to types whose sources we donβt control: We cannot make some new number type coming from a third party extend the [
Numeric
] trait.
He's right. If we key our mean
function on the shared super type, Numeric
, we widen the type of our input to its super type upon output meaning we lose type information. In other words, things go in as specific types Int
, Double
, etc. but come out as the more generic type Numeric
.
Furthermore, let's say we import a 3rd party library such as Joda Time and wish to use our mean
function for its Duration
type - which we can probably all agree fits into the Numeric
definition. We're out of luck! There's no way to make a 3rd party's numeric type extend our Numeric
trait!
What about the classic Adapter Pattern?
trait NumberLike[A] {
def get: A
def plus(y: NumberLike[A]): NumberLike[A]
def divide(y: Int): NumberLike[A]
}
case class NumberLikeDouble(x: Double) extends NumberLike[Double] {
def get: Double = x
def plus(y: NumberLike[Double]) = NumberLikeDouble(x + y.get)
def divide(y: Int) = NumberLikeDouble(x / y)
}
case class NumberLikeInt(x: Int) extends NumberLike[Int] {
def get: Int = x
def plus(y: NumberLike[Int]) = NumberLikeInt(x + y.get)
def divide(y: Int) = NumberLikeInt(x / y)
}
def mean[A](xs: List[NumberLike[A]]): NumberLike[A] = xs.reduce(_.plus(_)).divide(xs.size)
This is the classic OOP solution to the problem of ad-hoc polymorphism. On the upside, we regain full compile-time type safety and extensibility. The downside, however, is the performance hit of having to convert every Int
in a List[Int]
to a NumberLike[Int]
before the mean
function can use it. So, the larger the list of numbers we wish to pass to mean
, the more adapter instances we must create. Type classes solve this problem, as well as the other problems we've encountered thus far as we look for a solution for ad-hoc polymorphism.
So what is a Type Class, then?
A type class
C
defines some behaviour in the form of operations that must be supported by a typeT
for it to be a member of type classC
. Whether the typeT
is a member of the type classC
is not inherent in the type. Rather, any developer can declare that a type is a member of a type class simply by providing implementations of the operations the type must support. Now, onceT
is made a member of the type classC
, functions that have constrained one or more of their parameters to be members ofC
can be called with arguments of typeT
.Type classes allow ad-hoc and retroactive polymorphism. Code that relies on type classes is open to extension without the need to create adapter objects.
In other words, we're providing polymorphism over an interface without using inheritance, but we maintain the ability to have different implementations across types for the set of required of operations defined by our type class interface. Furthermore, we maintain the ability to easily extend the type class membership at-will, ie: in an ad-hoc fashion.
Note: In languages where type classes are built-in as a language feature, they save us from boiler plate code in similar ways to how the other two forms of polymorphism do. However, in Scala, we do not have type classes built into the language and, instead, must encode them using the powerful type system. We end up writing similar amounts of code to that of the adapter pattern, but it ends up being a much cleaner, more runtime efficient implementation. We also must make use of parametric polymorphism (generics) as well as Scala's implicit functionality. These two things are the key to being able to encode type classes in Scala, as seen below.
So, first off, to define our type class, we need a Trait to represent it
trait NumberLike[T] {
def plus(x: T, y: T): T
def divide(x: T, y: Int): T
}
This defines the interface that a type must implement in order to be a member of our type class. In essence, this is the type class.
We can then add some default members to the type class
object NumberLike {
implicit object NumberLikeDouble extends NumberLike[Double] {
def plus(x: Double, y: Double): Double = x + y
def divide(x: Double, y: Int): Double = x / y
}
implicit object NumberLikeInt extends NumberLike[Int] {
def plus(x: Int, y: Int): Int = x + y
def divide(x: Int, y: Int): Int = x / y
}
If NumberLike
is the type class, then NumberLikeDouble
and NumberLikeInt
are both type class members . Well, at least most directly. What they serve to do is allow Double
and Int
to become members of the type class by providing the type-specific implementations of the type class interface's API. That is to say, plus
is defined separately for both Double
and Int
via NumberLikeDouble
and NumberLikeInt
much like it would be if Int
and Double
both extended from a Numeric
super type and each implemented plus
directly. With type classes, we achieve this without the need for the subtype relationship!
It's important to note that these objects are defined as implicit
. This is the key to making type classes encodable in Scala, by allowing these objects to become implicitly available under the right circumstances. Understanding how implicits work in Scala is key to understanding how our encoding of type classes in Scala works, but that's a much larger topic than I am covering here.
Once we have our type class and its members defined, we can write our mean
method utilizing the ad-hoc polymorphism we just created:
def mean[T](xs: List[T])(implicit ev: NumberLike[T]): T = ev.divide(xs.reduce(ev.plus(_, _)), xs.size)
Now when we call mean
with a List[Int]
, T
will be bound to Int
and thus the compiler will attempt to locate an implicit NumberLike[Int]
to automatically pass into the second argument list. If an instance of NumberLike[T]
can't be found for the type T
we specify, then we know we have bound T
to a type which does not satisfy our type class interface and, thus, is not a member of the type class. If we go back to our subtype polymorphism example, this would be like passing in a value to speak
which does not extend Animal
.
In this case, we're passing in List[Int]
so T
is bound to Int
and we have defined an implicit NumberLike[Int]
object, NumberLikeInt
, so we know Int
satisfies the type class interface and we are able to implement our mean
functionality using the plus
and divide
methods concretely defined within NumberLikeInt
.
Extensibility
We can also easily demonstrate that we are able to arbitrarily add support for more types to mean
by simply adding more members to our NumberLike
type class.
object JodaImplicits {
implicit object NumberLikeDuration extends NumberLike[Duration] {
def plus(x: Duration, y: Duration): Duration = x.plus(y)
def divide(x: Duration, y: Int): Duration = Duration.millis(x.getMillis / y)
}
}
This is where type classes shine. If we have many functions that accept NumberLike
as a parameter, we automatically expand their utility every time we add a new member to the type class. And, amazingly, we can add any type to our type class even if we don't control the source code for the type!
Type classes allow us to take types which are related in some arbitrary way and provide a common interface to identify them by even if they are not related by subtyping. In our case, the arbitrary relationship we defined was that of being numeric, or, NumberLike
. We were able to ascertain that any NumberLike
value must be able to support addition and division, and so that became the interface defined in our NumberLike
type class via the plus
and divide
methods. I'm sure you can imagine other methods that would be appropriate for NumberLike
values, eg: multiply
.
Conclusion
I hope this helps some people who are having issues with the three forms of polymorphism in Scala - in particular, type classes.
As for the original question, it should be noted that we must make use of parametric polymorphism (generics) in order to encode Type Classes in Scala. That does not mean that they are the same thing.
Top comments (3)
Thanks for sharing! Great explanation.
Really great explanation, especially this: "we're providing polymorphism over an interface without using inheritance". Instantly clicked for me after I read that.
Thank you!
It cleared all my doubts...such a succint explanation.Thank You