sam

Posted on Feb 27, 2020

Category Theory You're Already Using

#categorytheory #haskell #javascript

Category Theory has a reputation for being abstract, thorny, complicated, difficult, high-level, and for nerds. While this characterization is not wholly without merit, the truth is closer to that of many other "advanced" CS topics: the pool gets very deep indeed, but even the shallow end has its charms. In the same way that you don't have to fully grok the depths of the V8 engine to write JavaScript, you don't have to understand, e.g. Kan extensions to use category theory.

Let's take a tour of some categories that, as the title of this post suggests, you're already using in your everyday programming. We'll start basic, and work our way towards the legendary burrito itself, the dreaded monad. I'll describe the categories in terms of Haskell typeclasses, which are analogous to interfaces in languages like Go or Typescript, and also give some non-Haskell examples, because, again: you are already using this stuff!

Semigroup

A semigroup, in the most basic terms, is something you can stick together. In slightly more complex terms:

class Semigroup a where
    (<>) :: a -> a -> a

This looks a bit scary, but all it means that if some type is a semigroup, there's a function called <> that combines two instances of that type together. It's also specified that <> is associative, i.e. b <> (c <> d) returns the same thing as (b <> c) <> d.

You can probably think of some types you use everyday that meet this criterion, but put a pin in it because the ones you're thinking of are probably actually instances of...

Monoid

A monoid is a semigroup plus an identity element. In Haskell terms:

class Semigroup a => Monoid a where
    mempty :: a

mempty is short for monoid empty (because naming things is hard) and is the "identity element" for the monoid in question, such that mempty <> b and b <> mempty are the same as just plain b.

I didn't come up with any examples of semigroups just now, because most things that look like semigroups are actually monoids, due to the presence of the identity element:

[1,2,3] + [4,5 6] == [1,2,3,4,5,6]
[1,2,3] + [] == [1,2,3]
[] + [1,2,3] == [1,2,3]

"abc" + "def" == "abcdef"
"abc" + "" == "abc"
"" + "abc" == "abc"

1 + (2 + 3) == (1 + 2) + 3
0 + 1 == 1
1 + 0 == 1

2 * (3 * 1) == (2 * 3) * 1
1 * 2 == 2
2 * 1 == 2

The last 2 examples highlight a crucial fact about monoids, and categories in general--one cannot simply say that some type "is a monoid" or not. Rather, "being a monoid" involves not just a type, but an identity element and the <> function. Thus, positive integers "are a monoid" with mempty as 0 and <> as +, but also "are a monoid" with mempty as 1 and <> as *. Obviously, x * 0 != x and x + 1 != x; you need all the pieces.

Functor

A functor is, basically, kind of, at first approximation, a container. In different terms, a functor is something whose "contents" can be "mapped over," i.e. changed, possibly to a different type. In Haskell terms:

class Functor f where
    fmap :: (a -> b) -> f a -> f b

This also seems scary, but isn't, really. f a is a functor "containing" stuff of type a, and (a -> b) is Haskellese for "a function from type a to type b," so all fmap does is apply a function to every element in a functor. A lot of other languages drop the "f" and just call it map:

[1,2,3].map(x => x + 1) == [2,3,4]

l = [1,2,3]

def f(x):
    return x + 1

list(map(f, l)) == [2,3,4]

# or

[f(v) for v in l] == [2,3,4]

(N.B: in Python 3.x, map produces an iterator, which we have to then turn into a list. The reasons for this are beyond the scope of this post, but are worth it, believe me.)

DIGRESSIONS INCOMING, FEEL FREE TO SKIP

Digression 1

The definition above is limited to just one a, but we can define functor instances for types that "contain" more than one type! In Haskell, lists can only contain one type, but tuples can contain multiple types (and, in fact, are basically defined as all the types they contain):

one :: (Int, String) = (1, "one")
two :: (Int, String, List Int) = (2, "two", [1, 2])

Similarly, (hash) maps are defined by the types of their keys and values:

strIntMap :: Map String Int = fromList [("one", 1), ("two", 2)]
intListMap :: Map Int [Int] = fromList [(1, [1,2]), (2, [2,3,4])]

(ignore the fromList thing, that's just how Haskell displays hashmaps.)

We define functor instances for types like this by treating the right-most type as our a, and the rest of it as our f:

instance Functor (b,) where
    fmap func (x, y) = (x, func y)
instance Functor (Map k) where
    fmap func m = <definition omitted because it's scary and irrelevant>

This seems like a lot but the upside is that you can use fmap on anything that's a functor instance! You don't have to worry about .map vs .forEach vs .dictMap vs .treeMap vs whatever.

Digression 2

The reason I was so hedgy about equating functors and containers earlier is that there are things that are functors but aren't containers, really, like...functions! Yes, functions are functors! As I mentioned before, Haskell represents functions as a -> b. We can treat this like we treated other multi-type functors, like hashmaps and tuples--(a ->) becomes our f, and b becomes our a:

instance Functor (a ->) where
    fmap f2 f1 = \x -> f2 (f1 x)

\x -> ... is how Haskell represents anonymous/lambda functions, so this takes two functions and returns another function! If this blows your mind, we can retreat briefly to the container analogy. Take a function like \x -> show x (or lambda x: str(x), or x => String(x)). We could implement this function, basically, as a hashmap:

addOneFunc :: Int -> String = \x -> show x
addOneMap :: Map Int String = fromList [(0, "0"), (1, "1"), (2, "2"), (3, "3")...]

Obviously, we would never do this, because it's incredibly inefficient and basically impossible in most programming languages. But we could, and then we'd see that we could use the same functor instance we had before, where Map Int is our f, and String, on the right, is our a. Ta-da!
END OF DIGRESSIONS, YOU CAN START READING AGAIN

Applicative

Full disclosure: applicative is important in Haskell, but, like semigroup, I can't really think of how to show its use in other languages. Briefly, it's kind of a "next-level" functor, where the function(s) are also in a "container":

instance Functor f => Applicative f where
    pure :: a -> f a
    <*> :: f (a -> b) -> f a -> f b

Cool, great. Get a glass of water, and pat yourself on the back for sticking with me this long, because now we're going to move on to the fear of FP neophytes everywhere, the slayer of analogies, the big beast:

Monad

We're In It Now, so let's just dive in and start with the typeclass, as usually presented:

instance Monad f where
   return :: a -> f a
   >>= :: f a -> (a -> f b) -> f b

If return looks like pure, well, that's because it actually is. applicative was discovered and popularized after monad made it into standard Haskell, so for a while pure and return were separate. This has since been rectified, such that current Haskell has

instance Applicative f => Monad f where
    return = pure
    >>= :: f a -> (a -> f b) -> f b

Monads get a lot of pixels dedicated to them. If you get nothing else from this post, I want it to be this: monads aren't scary. Take a closer look at >>=:

>>= :: f a -> (a -> f b) -> f b

For kicks, let's specialize it...

>>= :: [a] -> (a -> [b]) -> [b]

Does it possibly remind you of anything?

Yes: it's just flatMap. That's it, that's the whole typeclass.

I'm not even joking. Yes, it turns out that you can also use monads to do IO in a type-safe and "pure" way, but that's just a bonus. It's really just flatMap, but generalizable over a whole bunch of things besides lists, in the same way that functor is. Every time you use flatMap, every time you write a nested for loop, you're using a monad. Category theory is on your side, whether you know it or not!

Postscript and further reading

Like I said, there are a lot of "monad tutorials" out there. My recommendation is: skip 'em. You know what monads are now, go out and be smug about it with my blessings.

If you do want to read more, written by people who are smarter and more eloquent than I am, I recommend the following:

Category Theory For Programmers -- An absolutely fantastic intro to Category Theory. I've read the first 2/3rds or so a half-dozen times--I've never actually finished it because it goes way far, beyond monads and into adjunctions, kan extensions, and more. Great writing, great drawings, great diagrams. It's great, read it.

Haskell Programming From First Principles -- They call it "haskellbook" for a reason. It's big, it's rigorous, it's the best intro to Haskell book out there.

Typeclassopedia -- all the typeclasses in the "core" Haskell, with their laws and raisons d'etre. Great as a reference, satisfying and edifying just to browse through.

DEV Community