Hello, my name is Dmitriy Karlovskiy and I want to tell you about a fundamental feature of type systems, which is often either not understood at all or is not understood correctly through the prism of the implementation of a particular language, which, due to evolutionary development, has many atavisms. Therefore, even if you think you know what "variance" is, try to look at the issue with fresh eyes. We will start from the very basics, so that even a beginner will understand everything. And let's continue without water, so that even the pros would be useful for structuring their knowledge. Code examples will be in a pseudo-language similar to TypeScript. Then the approaches of several real languages will be analyzed. And if you are developing your own language, then this article will help you not to step on someone else's rake.
Arguments and Parameters
Parameter is what we accept. Describing the type of a parameter, we set a restriction on the set of types that can be passed to us. A few examples:
// function parameter
function log( id: string | number ) {}
// constructor parameter
class logger {
constructor( readonly id: Natural ) {}
}
// template parameter
class Node< Id extends Number > {
id: ID
}
Argument is what we are passing. When passed, an argument is always of some particular type. However, with static analysis, the specific type may not be known, which is why the compiler again operates with type restrictions. A few examples:
log( 123 ) // concrete type
new Logger( promptStringOrNumber( 'Enter id' ) ) // concrete type is only known at runtime
new Node( !!'id' ) // clearly invalid type, compilation error
Subtypes
Types can form a hierarchy. Subtype is a special case of supertype. A subtype can be formed by narrowing the set of possible values of the supertype. For example, the Natural type is a subtype of Integer and Positive. And all three are simultaneously subtypes of Real. At the same time, the Positive and Integer types are overlapping, but neither of them is a judgment of the other.
Another way to form a subtype is to extend it by combining it with another type orthogonal to it. For example, there is a "color figure" having the "color" property, and there is a "square" having the "height" property. Combining these types we get a "colored square". And by adding a "circle" with its "radius" we can get a "colored cylinder".
Hierarchies
For further narration, we need a small hierarchy of animals and a similar hierarchy of cages.
abstract class Animal {}
abstract class Pet extends Animal {}
class Cat extends Pet {}
class Dog extends Pet {}
class Fox extends Animal {}
class AnimalCage { content: Animal }
class PetCage extends AnimalCage { content: Pet }
class CatCage extends PetCage { content: Cat }
class DogCage extends PetCage { content: Dog }
class FoxCage extends AnimalCage { content: Fox }
Everything below in the figure is a narrowing of the type above. A pet cage can only contain pets, not wild ones. A cage with a dog can only contain dogs.
Covariance
The simplest and most understandable is supertype constraint or covariance. In the following example, a function parameter is covariant to its specified type. That is, a function can accept both this type itself and any of its subtypes, but cannot accept supertypes or other types.
function touchPet( cage: PetCage ): void {
log( `touch ${cage.content}` )
}
touchPet( new AnimalCage ) // forbid
touchPet( new PetCage ) // allow
touchPet( new CatCage ) // allow
touchPet( new DogCage ) // allow
touchPet( new FoxCage ) // forbid
Since we do not change anything in the cage, we can easily pass the function to the cage with the cat, since it is nothing more than a special case of a cage with a pet.
Contravariance
It's a little harder to understand the subtype constraint or contravariance. In the following example, a function parameter is contravariant to the type specified for it. That is, a function can accept both this type itself and any of its supertypes, but cannot accept subtypes or other types.
function pushPet( cage: PetCage ): void {
const Pet = random() > .5 ? Cat: Dog
cage.content = new Pet
}
pushPet( new AnimalCage ) // allow
pushPet( new PetCage ) // allow
pushPet( new CatCage ) // forbid
pushPet( new DogCage ) // forbid
pushPet( new FoxCage ) // forbid
We can't pass in a cage with a cat, because the function can put a dog in there, which is not allowed. But a cage with any animal can be safely transferred, since both a cat and a dog can be placed there.
Invariance
Subtype and supertype constraint can be both. This case is called invariance. In the following example, a function parameter is invariant to the type specified for it. That is, a function can only take the specified type and no more.
function replacePet( cage: PetCage ): void {
touchPet(cage)
pushPet(cage)
}
replacePet( new AnimalCage ) // forbid
replacePet( new PetCage ) // allow
replacePet( new CatCage ) // forbid
replacePet( new DogCage ) // forbid
replacePet( new FoxCage ) // forbid
The replacePet
function inherits the restrictions from the functions that it uses internally: from touchPet
it took the restriction on the supertype, and from pushPet
- the restriction on the subtype. If we pass it a cage with any animal, then it will not be able to pass it to the touchPet function, which does not know how to work with foxes (a wild animal will just bite off its finger). And if we pass a cage with a cat, then it will not work to call pushPet
.
Bivariance
It is impossible not to mention the exotic absence of restrictions - bivariance. In the following example, a function can accept any type that is a supertype or subtype.
function ensurePet( cage: PetCage ): void {
if( cage.content instanceof Pet ) return
pushPet(cage)
}
ensurePet( new AnimalCage ) // allow
ensurePet( new PetCage ) // allow
ensurePet( new CatCage ) // allow
ensurePet( new DogCage ) // allow
ensurePet( new FoxCage ) // forbid
You can transfer a cage with an animal into it. Then it will check that there is a pet in the cage, otherwise it will put a random pet inside. And you can also transfer, for example, a cage with a cat, then it simply will not do anything.
Generics
Some people think that variance has something to do with generalizations. Often because variance is often explained using generic containers as an example. However, in the whole story, we still have not had a single generalization - entirely specific classes:
class AnimalCage { content: Animal }
class PetCage extends AnimalCage { content: Pet }
class CatCage extends PetCage { content: Cat }
class DogCage extends PetCage { content: Dog }
class FoxCage extends AnimalCage { content: Fox }
This was done to show that variance problems have nothing to do with generalizations. Generalizations are only needed to reduce copy-paste. For example, the code above can be rewritten with a simple generalization:
class Cage<Animal> { content: Animal }
And now you can create instances of any cages:
const animalCage = new Cage<Animal>()
const petCage = new Cage<Pet>()
const catCage = new Cage<Cat>()
const dogCage = new Cage<Dog>()
const foxCage = new Cage<Fox>()
Declaration of Restrictions
Please note that the signatures of all four previously given functions are exactly the same:
( cage: PetCage )=> void
That is, such a description of the received parameters of the function does not have completeness - it cannot be said from it that it can be passed to the function. Well, except that it is clearly visible that it is definitely not worth transferring a cage with a fox into it.
Therefore, in modern languages there are means for explicitly specifying what type restrictions a parameter has. For example, the modifiers in and out in C#:
interface ICageIn<in T> { T content { set; } } // contravariant generic parameter
interface ICageOut<out T> { T content { get; } } // covariant generic parameter
interface ICageInOut<T> { T content { get; set; } } // invariant generic parameter
Unfortunately, in C#, for each version of modifiers, you need to enter a separate interface. In addition, as far as I understand, bivariance in C# is generally inexpressible.
Output Parameters
Functions can not only accept but also return values. In general, the return value can be more than one. As an example, let's take a function that takes a cage with a pet and returns two pets.
function getPets( input: PetCage ): [ Pet , Pet ] {
return [ input.content , new Cat ]
}
Such a function is equivalent to a function that takes, in addition to one input parameter, also two output parameters.
function getPets( input: PetCage, &output1: PetCage, &output2: PetCage ): void {
*output1 = input.content
*output2 = new Cat
}
The outer code allocates extra memory on the stack so that the function puts whatever it wants to return into it. And upon completion, the calling code will already be able to use these containers for their own purposes.
From the equivalence of these two functions, it follows that the return values of the function, unlike the parameters, are always contravariant to the specified output type. For the function can write to them, but cannot read from them.
Object Methods
Object methods are functions that take an additional object pointer as an implicit parameter. That is, the following two functions are equivalent.
class PetCage {
pushPet(): void {
const Pet = random() > .5 ? Cat : Dog
this.content = new Pet
}
}
function pushPet( this: PetCage ): void {
const Pet = random() > .5 ? Cat : Dog
this.content = new Pet
}
However, it is important to note that a method, unlike a regular function, is also a member of a class, which is an extension of a type. This leads to the fact that there is an additional supertype constraint for functions that call this method:
function fillPetCage( cage: PetCage ): void {
cage.pushPet()
}
We cannot pass a supertype to it where the pushPet
method has not yet been defined. This is similar to the case of invariance in that there is a limit both from below and from above. However, the location of the pushPet
method definition may be higher in the hierarchy. And that's where the supertype constraint will be.
Barbara Liskov Substitution Principle (LSP)
Many people think that the supertype-subtype relationship is defined not by the methods of narrowing and extending the type mentioned earlier, but by the ability to substitute a subtype in any place where the supertype is used. Apparently, the reason for this misconception is precisely in the LSP. However, let's read the definition of this principle carefully, paying attention to what is primary and what is secondary:
Functions that use a base type should be able to use subtypes of the base type without knowing it and without violating the correctness of the program.
For immutable (including those that do not refer to mutable) objects, this principle is automatically followed, since there is no way to get a subtype constraint.
Things are more complicated with mutables, since the following two situations are mutually exclusive for the LSP principle:
- The class
A
has a subclassB
, where the fieldB::foo
is a subtype ofA::foo
. - Class
A
has a method that can change the fieldA::foo
.
Accordingly, there are only three ways:
- Forbid objects to narrow the types of their fields when inheriting. But then you can put an elephant in a cage for a cat.
- Be guided not by LSP, but by the variance of each parameter of each function separately. But then you have to think a lot and explain to the compiler where what restrictions on types are.
- To spit on everything and go to
monasteryfunctional programming, where all objects are immutable, which means that their accepting parameters are covariant to the declared type.
TypeScript
In typescript, the logic is simple: all function parameters are considered covariant (which is not true), and return values are contravariant (which is true). It was shown earlier that function parameters can have absolutely any variance, depending on what this function does with these parameters. So there are cases like this:
abstract class Animal { is!: 'cat' | 'dog' | 'fox' }
abstract class Pet extends Animal { is!: 'cat' | 'dog'}
class Cat extends Pet { is!: 'cat' }
class Dog extends Pet { is!: 'dog' }
class Fox extends Animal { is!: 'fox' }
class Cage<Animal> { content!: animal }
function pushPet( cage: Cage<Pet> ): void {
const Pet = Math.random() > .5 ? Cat : Dog
cage.content = new Pet
}
pushPet( new Cage<Animal>() ) // forbid to push Pet to Animal Cage ❌
pushPet( new Cage<Cat>() ) // allow to push Dog to Cat Cage ❌
To solve this problem, you have to help the compiler with a rather non-trivial code:
function pushPet<
PetCage extends Cage<Animal>
>(
cage: Cage<Pet> extends PetCage ? PetCage : never
): void {
const Pet = Math.random() > .5 ? Cat: Dog
cage.content = new Pet
}
pushPet( new Cage<Animal>() ) // allow ✅
pushPet( new Cage<Pet>() ) // allow ✅
pushPet( new Cage<Cat>() ) // forbid ✅
pushPet( new Cage<Dog>() ) // forbid ✅
pushPet( new Cage<Fox>() ) // forbid ✅
FlowJS
FlowJS has a more advanced type system. In particular, in the type description, you can specify its variance for generic parameters and for object fields. In our example with cages, it looks something like this:
class Animal {}
class Pet extends Animal {}
class Cat extends Pet {}
class Dog extends Pet {}
class Fox extends Animal {}
class Cage< Animal > { content: Animal }
function touchPet( cage: { +content: Pet } ): void {
console.log( `touch ${typeof cage.content}` )
}
function pushPet( cage: { -content: Pet } ): void {
const Pet = Number((0: any)) > .5 ? Cat : Dog
cage.content = new Pet
}
function replacePet( cage: { conten: Pet } ): void {
touchPet( cage )
pushPet( cage )
}
touchPet( new Cage<Animal> ) // forbid ✅
touchPet( new Cage<Pet> ) // allow ✅
touchPet( new Cage<Cat> ) // allow ✅
touchPet( new Cage<Dog> ) // allow ✅
touchPet( new Cage<Fox> ) // forbid ✅
pushPet( new Cage<Animal> ) // allow ✅
pushPet( new Cage<Pet> ) // allow ✅
pushPet( new Cage<Cat> ) // forbid ✅
pushPet( new Cage<Dog> ) // forbid ✅
pushPet( new Cage<Fox> ) // forbid ✅
replacePet( new Cage<Animal> ) // forbid ✅
replacePet( new Cage<Pet> ) // allow ✅
replacePet( new Cage<Cat> ) // forbid ✅
replacePet( new Cage<Dog> ) // forbid ✅
replacePet( new Cage<Fox>) // forbid ✅
The bivariance here is inexpressible. Unfortunately, I could not find a way to more conveniently set the variance without explicitly describing the types of all fields. For example, something like this:
function pushPet( cage: Contra< Cage<Pet> , 'content' > ): void {
const Pet = Number((0: any)) > .5 ? Cat : Dog
cage.content = new Pet
}
C Sharp
C# was originally designed without any understanding of variance. However, in and out parameter modifiers were subsequently added to it, which allowed the compiler to correctly check the types of arguments passed. Unfortunately, using these modifiers is again not very convenient.
using System;
abstract class Animal {}
abstract class Pet: Animal {}
class Cat: Pet {}
class Dog: Pet {}
class Fox: Animal {}
interface ICageIn<in T> { T content { set; } }
interface ICageOut<out T> { T content { get; } }
interface ICageInOut<T> { T content { get; set; } }
class Cage<T> : ICageIn<T>, ICageOut<T>, ICageInOut<T> { public T content { get; set; } }
public class Program {
static void touchPet( ICageOut<Pet> cage ) {
Console.WriteLine( cage.content );
}
static void pushPet( ICageIn<Pet> cage ) {
cage.content = new Dog();
}
static void replacePet( ICageInOut<Pet> cage ) {
touchPet( cage as ICageOut<Pet> );
pushPet( cage as ICageIn<Pet> );
}
public static void Main() {
var animalCage = new Cage<Animal>();
var petCage = new Cage<Pet>();
var catCage = new Cage<Cat>();
var dogCage = new Cage<Dog>();
var foxCage = new Cage<Fox>();
touchPet( animalCage ); // forbid ✅
touchPet( petCage ); // allow ✅
touchPet( catCage ); // allow ✅
touchPet( dogCage ); // allow ✅
touchPet( foxCage ); // forbid ✅
pushPet( animalCage ); // allow ✅
pushPet( petCage ); // allow ✅
pushPet( catCage ); // forbid ✅
pushPet( dogCage ); // forbid ✅
pushPet( foxCage ); // forbid ✅
replacePet( animalCage ); // forbid ✅
replacePet( petCage ); // allow ✅
replacePet( catCage ); // forbid ✅
replacePet( dogCage ); // forbid ✅
replacePet( foxCage ); // forbid ✅
}
}
Java
The ability to switch variance was added to Java rather late and only for generic parameters that themselves appeared relatively recently. If the parameter is not generalized, then trouble.
abstract class Animal {}
abstract class Pet extends Animal {}
class Cat extends Pet {}
class Dog extends Pet {}
class Fox extends Animal {}
class Cage<T> { public T content; }
public class Main {
static void touchPet( Cage<? extends Pet> cage ) {
System.out.println( cage.content );
}
static void pushPet( Cage<? super Pet> cage ) {
cage.content = new Dog();
}
static void replacePet( Cage<Pet> cage ) {
touchPet( cage );
pushPet( cage );
}
public static void main(String[] args) {
Cage<Animal> animalCage = new Cage<Animal>();
Cage<Pet> petCage = new Cage<Pet>();
Cage<Cat> catCage = new Cage<Cat>();
Cage<Dog> dogCage = new Cage<Dog>();
Cage<Fox> foxCage = new Cage<Fox>();
touchPet( animalCage ); // forbid ✅
touchPet( petCage ); // allow ✅
touchPet( catCage ); // allow ✅
touchPet( dogCage ); // allow ✅
touchPet( foxCage ); // forbid ✅
pushPet( animalCage ); // allow ✅
pushPet( petCage ); // allow ✅
pushPet( catCage ); // forbid ✅
pushPet( dogCage ); // forbid ✅
pushPet( foxCage ); // forbid ✅
replacePet( animalCage ); // forbid ✅
replacePet( petCage ); // allow ✅
replacePet( catCage ); // forbid ✅
replacePet( dogCage ); // forbid ✅
replacePet( foxCage ); // forbid ✅
}
}
C++
C++, thanks to its powerful template system, can express various variations, but, of course, there is a lot of code.
#include <iostream>
#include <typeinfo>
#include <type_traits>
class Animal {};
class Pet: public Animal {};
class Cat: public Pet {};
class Dog: public Pet {};
class Fox: public Animal {};
template< class T > class Cage { public: T *content; };
template< class T, class = std::enable_if_t< std::is_base_of< Pet, T >::value> >
void touchPet( const Cage<T> &cage ) {
std::cout << typeid(T).name();
}
template< class T, class = std::enable_if_t< std::is_base_of< T, Pet >::value > >
void pushPet( Cage<T> &cage ) {
cage.content = new Dog();
}
void replacePet( Cage<Pet> &cage ) {
touchPet( cage );
pushPet( cage );
}
int main(void) {
Cage<Animal> animalCage { new Fox() };
Cage<Pet> petCage { new Cat() };
Cage<Cat> catCage { new Cat() };
Cage<Dog> dogCage { new Dog() };
Cage<Fox> foxCage { new Fox() };
touchPet( animalCage ); // forbid ✅
touchPet( petCage ); // allow ✅
touchPet( catCage ); // allow ✅
touchPet( dogCage ); // allow ✅
touchPet( foxCage ); // forbid ✅
pushPet( animalCage ); // allow ✅
pushPet( petCage ); // allow ✅
pushPet( catCage ); // forbid ✅
pushPet( dogCage ); // forbid ✅
pushPet( foxCage ); // forbid ✅
replacePet( animalCage ); // forbid ✅
replacePet(petCage); // allow ✅
replacePet(catCage); // forbid ✅
replacePet(dogCage); // forbid ✅
replacePet( foxCage ); // forbid ✅
return 0;
}
D
D does not have any sane means of explicitly indicating variance, but it can itself infer types based on their use.
import std.stdio, std.random;
abstract class Animal {}
abstract class Pet: Animal { string name; }
class Cat: Pet {}
class Dog: Pet {}
class Fox: Animal {}
class Cage(T) {
T content;
}
void touchPet( PetCage )( PetCage cage ) {
writeln( cage.content.name );
}
void pushPet( PetCage )( PetCage cage ) {
cage.content = ( uniform(0,2) > 0 ) ? new Dog() : new Cat();
}
void replacePet( PetCage )( PetCage cage ) {
touchPet( cage );
pushPet( cage );
}
void main() {
Cage!Animal animalCage;
Cage!Pet petCage;
Cage!Cat catCage;
Cage!Dog dogCage;
Cage!Fox foxCage;
animalCage.touchPet(); // forbid ✅
petCage.touchPet(); // allow ✅
catCage.touchPet(); // allow ✅
dogCage.touchPet(); // allow ✅
foxCage.touchPet(); // forbid ✅
animalCage.pushPet(); // allow ✅
petCage.pushPet(); // allow ✅
catCage.pushPet(); // forbid ✅
dogCage.pushPet(); // forbid ✅
foxCage.pushPet(); // forbid ✅
animalCage.replacePet(); // forbid ✅
petCage.replacePet(); // allow ✅
catCage.replacePet(); // forbid ✅
dogCage.replacePet(); // forbid ✅
foxCage.replacePet(); // forbid ✅
}
Epilogue
That's all for now. I hope this material has helped you better understand type constraints and how they are implemented in different languages. Somewhere better, somewhere worse, somewhere not, but in general - so-so. Perhaps it is you who will develop a language in which all this will be implemented both conveniently and type-safely. It would be interesting for me to read your thoughts on this topic.
Top comments (0)