Pragmatic Maciej

Posted on Sep 24, 2019 • Edited on Jul 9, 2020

Algebraic Structures Explained - Part 1 - Base Definitions

#javascript #typescript #elm #functional

Introduction

The article is a opening of the series Algebraic Structures Explained. In the series I will try to cover useful programming abstractions which are grounded in math and specifically in the abstract algebra. The series is addressed to front-end devs, as I will show examples mostly in front-end based languages, but it doesn't mean that somebody with different background can't benefit from the content.

Expectations

I am not a mathematician, I am a developer, that is why this series is focuses more at the usage of those constructs rather than strict math theory. I will do my best in order to connect those two worlds, and present the topic in the most friendly way.

Jargon used in the series will not be strict or mathematical, also description of constructs can be less deep and accurate, I will explain them in my own way 😉, if you see a mistake in any article from the series, please don't hesitate to leave a comment.

Topics will be presented with code examples. I will use many programming languages, in most it will be TypeScript and Elm, but also be prepared for others, and the choose really depends from the topic and my mood 😀, yes really. But don't be scary if you don't know languages in which I write examples, code should be straight forward. Every snippet will be marked by language name. Code parts which are not marked will be pseudo code.

Code standards

I will provide code examples which will focus on the presented topic. I will sometimes skip some constructs like TS variable declarations(const,let,var), or module imports, sometimes I will also leave some code part as ... in order to show that it is less important for the given subject.

Base definitions - What is a set

First of all, I am not talking about set as any specific implementation like set data structure in JS, but set as more abstract mathematical concept - set.

Set is just a collection of objects which have some common characteristic or are just listed as set members S = {a,b,c}, Nat = {1,2,3...}. We can understand common characteristic as a function which for a given object as an input, will return if the object is a member of the set or not. isInSet: (a: Anything) -> Yes | No

Ok, so what this characteristic can be? It can be literally anything which can be verified and brought to simple yes/no answer. To keep it simple we can assume that set can be created from any kind of elements, set can be created even from nothing, and this set is empty set ∅, or in more software terms - void.

Exercise for the reader - void is an empty set, and what about null ?

Base definitions - What is a type

What does it have to programming at all? Quite a lot. As we can think that type is a set. Taking into consideration type theory, type is a type, not a set. But for simplicity we can assume those terms are equal. Check out this great explanation about types as sets - Elm - types as sets. So primary types in for example TypeScript, like number, bool, string, object are sets of some possible values with common characteristic, the same applies to other programming languages, even to dynamic typed ones, but there types are implicit. To give an illustration I will take the number type from TypeScript.

[TS] isNumber = (a): a is number => typeof a === 'number'

Function isNumber is the number set characteristic function. The type definition of this function is [TS](a:unknown): a is number => boolean. It takes as argument value from type unknown and returns an answer if the given value is a type of number, in other words if an element is a member of the number set. Take a look at a is number part, it is very explicit information that a is a member of a number type/set.

Note Unknown type in TypeScript is really a union of all types available in the codebase. So all types are subtypes of the unknown type.

Composed types are also sets

Not only primary types are sets. But every new type in the code also forms a set. Custom type can be a composition of other types, in other words it is created by some operations on sets. There are two common compositions of types - Product and Sum, which are widely know as Algebraic Data Types. Product creates new type by and operation, sum by or operation.

Product type

Products type is a type created from other types in such a way that in order to create instance of the product, all instances of sub-types are required. Product or Cartesian product is terms of the set theory is the result of joining two or more sets into one by putting them into tuple. Product of sets A, B and C is a 3-tuple - (A,B,C).

The simplest example of the product type is a Pair (2‑tuple). Pair is just two values (a:A, b:B), one from type A, one from type B (B can be equal A). To create such a pair, both values are required. Also the amount of possible values is a multiplication of sizes of these two sub-types - A and B. So size(Pair<A,B>) equals size(A) * size(B). Size of a set, number of set elements has more proper name - cardinality

[TS]
// tuple
type Triangle = [number, number, number] // Product Int*Int*Int
// record - labeled tuple
type User = { name: string, age: number } // Product String*Int

Surprisingly almost the same syntax for products exists in Elm.

[Elm]
-- tuple
type alias Triangle = (Int, Int, Int) -- Product Int*Int*Int
-- record - labeled tuple
type alias User = { name : String, age : Int } -- Product String*Int

In order to create member of the Triangle set, there needs to be provided an input in the form of 3 values from Int set. Amount of possibilities is equal Int³.

Note Every product type can be considered as isomorphic to the tuple type. Any record/map is just a tuple with additional label, and we can create such - (name, lastname) -> {name:name, lastname:lastname} and we can define reverse operation - {name:name, lastname:lastname} -> (name, lastname).

Sum type

Sum type is a type created from other types in such a way that in order to create an instance of the sum, only one instance from sub-types is required.

The simplest example of the sum type is Boolean type. If something is Boolean, it can be or true, or false, never both. Tipical understanding of this construct is just a set with listed possible elements, so this goes into our set definition where we define what is inside the set - BoolSet = {True, False}. But how then we can name this construct as a sum type, if there is only one type with two possible values?

It is very interesting question. Sum type holds here true only if we consider True and False as single value sets - singletons with one element inside, and this is fully correct reasoning.

[Elm]
type Boolean = True | False

Also the amount of possible values is a sum of cardinality in all sub-sets. As for Boolean True and False represent sets with one element each, then the Boolean type contains exactly two values.

The most important feature of the sum type is that value can belong only to one type. It is not possible to sum two types with any shared elements. Sum types force tag on every sub-type. In order to create a type which has representation of type String or type Int, we need to create tagged versions of those types.

[Elm]
--- type definition
type StrOrInt = Str String | In Int
--- usage
x = Str 12 
-- x has a type StrOrInt and is represented as tagged String type

Sum type implementation details

Consider above diagram. There are two explanation of Elm definition of the sum type. First is based on types/sets where the sum is a union of disjoint sets, where even element C is also considered as a set with one element. The second shows definitions in terms of value constructors. In other words on the right side of the declaration we have functions which provide us the value of type Sum. Even C is presented here as function without arguments, this kind of function has a name - const. And because it is const, the value here is evaluated eagerly, for other constructors, they remain as functions (are lazy) and need an argument to create the value of type Sum.

Note Sum type has many names, it is also disjoint union, discriminated union, tagged union or variant

Note Sum type in Category Theory is named - Coproduct.

Note Union type with possible intersection is possible in TypeScript - union - it is not a sum type as we cannot differentiate from which union type value comes from - string | number | string ≡(equal) string | number

Composing sums and products

Sums and products are in not any way different types then others. That is why further compositions are possible. We can create products including other products and sums, sums containing other products and sums.

[ELM]
type ResultStatus = Pending | Ok | Error
-- product created from sum type Boolean and sum type ResultStatus
type alias UserProduct = { active: Boolean, age: Int, status: ResultStatus }    

-- sum created from product types tagged by Regular and Visitor tags
type UserSum
  = Regular String Int
  | Visitor String Int

In above code snippet, the last two variants - Regular and Visitor can be confusing. These variants represent truly product types, as there needs to be provided value from type String and from type Int in order to fulfill the value constructor. So it is isomorphic to tagged tuple - Regular String Int ≈(isomorphic) Regular (String, Int). And as we know, tuple is a most basic representation of the product type

More detail about algebraic data types can be found in this very nice explanation - What is an algebraic data type? by Tikhon Jelvis.

Note Third common algebraic data type is Exponential type. As functions in most modern languages can be passed as values, values need to have types, and these types for functions are exactly exponential types. Having a function type A -> B, type can be presented as B^A, so the type represents value of type B on given A.

Note I leave you here great talk by Philip Wadler about Sums and Products. He describe it far more better then I have.

Base definition - what is an algebraic structure

Ok, now we know what is set, we know that we work with sets in programming and we call those sets - types. Types can be mixed and mingled with each other, these compositions are products or sums, or sums of products, or products of sums 😄.

We have a set, however to have algebraic structure we need one thing more. And this things is a operation on the set. In other words algebraic structure is a set + operation/s working with this set. Simple as that.

Remember that operation needs to be closed, it means that arguments and return type needs to be the same T -> T -> T. Set + operation which is not closed forms a mathematical structure, but not algebraic one.

For instance int set has operation of adding elements, then int set with adding operation - binary function int + int -> int, creates an algebraic structure, and in this example it is Semigroup. Also there exists neutral element for this operation and it is 0 as adding anything to it will not change the value a + 0 = a and 0 + a = a, taking these two properties - neutral element and adding operation we created another algebraic structure - Monoid.