
Hadrian Hughes

Originally published at hadrianhughes.dev

Understanding applicatives in Haskell

My previous post was about the problem of parsing left-recursive grammars when using the parser combinator pattern in Haskell. In my code examples, I made extensive use of applicatives but didn't explain what they are or how they work. This was partially because their syntax is fairly intuitive when applied to parser construction, but also because I hadn't really grokked them myself---I was relying heavily on that intuitiveness without properly understanding what I was doing. So, I decided to take a deeper look at applicatives, what they are and how they work.

Rather than defining applicatives from the get-go, I'll present how they're used by parser libraries, as this is the way I first became properly aware of them.

Let's say we're building a programming language, and we want to be able to declare variables in the style of C---for example, we might declare an integer like int x;. To represent this in the AST, we might create a type like so:

data VarDecl = VarDecl { varType :: String, varName :: String }

Then to parse this kind of expression, we might create a parser like so:

varDeclP :: Parser VarDecl
varDeclP = VarDecl <$> string <*> string <* symbol ";"

As I mentioned previously, it is fairly intuitive to read this code as "we want to parse a string, then another string, then a semicolon and then use those things to construct a VarDecl." But how exactly does this work? It isn't immediately obvious how the two strings end up populating the varType and varName fields correctly. To understand it, let's break down this parser function piece by piece.

Data constructors are functions

The first important thing to know is that in Haskell, instantiating a data type is done using a data constructor, and these data constructors are just functions. In the snippet above where we define the VarDecl type, we write VarDecl on both sides of the = sign. The instance on the left is defining the name of the type, whereas the instance on the right is specifying the name of the constructor used for instantiating the type. If we wanted to, we could give the constructor a different name from the type, like VarDeclConstructor.

We then specify that VarDecl is a record type by placing the fields we expect between curly braces. It isn't obvious, but we're actually creating a function called VarDecl with the signature String -> String -> VarDecl---a function that takes two strings and returns a VarDecl. The function's arguments are the fields we specified between the curly braces, in the order we specified them.
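
You can confirm this in GHCi; querying the constructor's type shows exactly that function signature:

ghci> :t VarDecl
VarDecl :: String -> String -> VarDecl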

To explicitly construct a VarDecl for the expression int x; we would use the data constructor like so:

VarDecl "int" "x"
-- This creates a value like
-- VarDecl { varType = "int", varName = "x" }

Understanding data constructors gets us a step closer to understanding how the two strings in varDeclP are used to construct a VarDecl: we're simply passing arguments to a function in the order they appear in the constructor definition.

What is a functor?

The next term in the parser is the <$> operator. This is an alias of the fmap function, made into an infix operator. fmap is part of the Functor typeclass---it has the signature:

fmap :: Functor f => (a -> b) -> f a -> f b

It is used to "lift" a function into a functorial context, meaning a function that takes a and returns b becomes a function that takes f a (a in the context of a functor f) and returns f b (b in the context of a functor f).

I won't go into detail about functors here, but they can be thought of as types which can be mapped over. The clearest example of a functor is the list type. For a list, fmap behaves exactly the same as map: replacing f in fmap's signature with list types gives us:

fmap :: (a -> b) -> [a] -> [b]

fmap takes a function from a -> b and gives us back a function that applies that function to each element in a list and then returns the resulting list. Since <$> is an alias for fmap, the following expressions mean the same thing:

fmap f xs == f <$> xs
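
As a quick check in GHCi, both spellings give the same result when used on a list:

ghci> fmap (+1) [1, 2, 3]
[2,3,4]
ghci> (+1) <$> [1, 2, 3]
[2,3,4]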

Importantly, the behaviour of fmap changes based on the particular functorial context of its second argument. Let's add parentheses to our parser to make it clearer how this applies there.

varDeclP :: Parser VarDecl
varDeclP = ((VarDecl <$> string) <*> string) <* symbol ";"

The <$>, <*> and <* operators are all left-associative, so the parentheses don't change the parser's behaviour; they only make the grouping explicit.

The first sub-expression we have is VarDecl <$> string. We are lifting the data constructor VarDecl (which, again, is just a function) into whichever context string is in. Which context is that? The string parser has the signature string :: Parser String, so assuming Parser is a member of the Functor typeclass (it needs to be for the parser combinator pattern to work) we can use it as the second argument to fmap. Let's substitute Parser for f and String for a in the fmap type signature:

fmap :: (String -> b) -> Parser String -> Parser b

This works, but what is the type of b? Well, functions are always curried, and the VarDecl constructor has the signature VarDecl :: String -> String -> VarDecl, so passing a single string argument to it leaves us with a resulting function of type String -> VarDecl---this is b.

fmap :: (String -> (String -> VarDecl)) -> Parser String -> Parser (String -> VarDecl)
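
To see the partial application on its own, outside the Parser context, we can apply the constructor to a single argument in GHCi and inspect the type of the result:

ghci> :t VarDecl "int"
VarDecl "int" :: String -> VarDecl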

Because our VarDecl constructor is a binary function, mapping it over a single argument in the Parser context results in a partially applied function lifted into that context. What we want is to continue passing arguments to the function so we can finish building our VarDecl record, but we'll need another way to invoke it now that it's in the Parser context. Now we're ready to introduce applicatives.

What is an applicative?

An applicative (short for applicative functor) is a special kind of functor, one where the value inside the functor can itself be a function. For a type to qualify as a member of the Applicative class, it must implement two functions:

class Functor f => Applicative f where
    pure  :: a -> f a
    (<*>) :: f (a -> b) -> f a -> f b

The pure function takes a value of type a and lifts it into the context of a functor type f. This means that for a Functor f to also be a member of the Applicative class, it must be possible to construct an instance of f from a single value.
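
For familiar types like lists and Maybe, pure does exactly what you'd expect:

ghci> pure 5 :: [Int]
[5]
ghci> pure 5 :: Maybe Int
Just 5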

The <*> function, called the sequential application operator, takes an applicative value (a function a -> b in a functorial context) and a value of type f a (a in the same context). The value inside the f a argument is fed into the function a -> b inside the applicative, and the resulting b value is returned, still inside the f context. As with functors, the specifics of how this behaviour is implemented differ for each instance of Applicative.

Let's revisit the list functor for a concrete example. To test if lists are a member of the Applicative class, we'll need to see if we can implement the pure and <*> functions. We can implement pure by simply taking some value x and creating a list where x is the only element.

pure :: a -> [a]
pure x = [x]

For sequential application, we need to accept a list of functions a -> b along with a list of a values, apply each function in the first list to each value in the second, and then return the list of results.

(<*>) :: [a -> b] -> [a] -> [b]
(f:fs) <*> xs = (f <$> xs) <> (fs <*> xs)
[] <*> _ = []

This recursive implementation takes the list of functions and the list of values, and it maps the first function over the full list of values (using the fmap <$> operator discussed previously). Then it calls itself with the remainder of the list of functions and the full list of values. When finally called with an empty list of functions, it returns an empty list of values. This is not the official implementation; it's only an example to demonstrate that both functions required by the Applicative class can be implemented for lists, and hence that lists are an example of an Applicative type.
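
For example, applying a list of two functions to a list of three values produces all six results, in order:

ghci> [(+1), (*10)] <*> [1, 2, 3]
[2,3,4,10,20,30]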

Parser as an applicative

Returning to our parser function, we've evaluated the first sub-expression and found that it evaluates to the type Parser (String -> VarDecl). We know that Parser is a functor, so this type certainly looks like an applicative, and indeed it is. Following the parser combinator pattern, the Parser type might be defined like so:

data Parser a = Parser { parse :: String -> Maybe (a, String) }

The Parser has a parse method which attempts to construct a value of type a by consuming characters from an input string. If successful, the value is returned in a tuple along with the remainder of the input string. Otherwise, it returns Nothing. Typically Either would be used instead of Maybe to enable more informative errors to be returned, but I'm using Maybe here for simplicity. All parsing errors just result in Nothing.
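
Since we rely on fmap working for this type, here is a minimal sketch of what it might look like for Parser. This is just one plausible implementation for the Parser type defined above, not the definition from any particular library:

fmap :: (a -> b) -> Parser a -> Parser b
fmap f (Parser p) = Parser $ \input ->
    case p input of
        Nothing -> Nothing
        -- Apply f to the parsed value; pass the remaining input through unchanged
        Just (x, rest) -> Just (f x, rest)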

The pure and <*> functions for this type might be implemented like so:

pure :: a -> Parser a
pure x = Parser $ \input -> Just (x, input)

(<*>) :: Parser (a -> b) -> Parser a -> Parser b
Parser f <*> Parser g = Parser $ \input ->
    case f input of
        Nothing -> Nothing
        Just (f', input') ->
            case g input' of
                Nothing -> Nothing
                Just (val, rem) -> Just (f' val, rem)

The pure function is fairly self-explanatory---it simply takes a value x and uses it to construct a parser that consumes no input and always returns x along with the full input stream.

Sequential application is a little more complicated. This function takes two parsers as its arguments: the first is a parser that attempts to construct a function a -> b; the second is a parser that attempts to construct a value of type a. It runs the first parser on the input, and if it succeeds in constructing the function a -> b, it continues by applying the second parser to the remaining input. If the second parser also succeeds in constructing a value of type a, that value is passed to the function a -> b, resulting in a value of type b. This final value is returned, along with whatever input remains after both parsers have consumed their share. If either parser fails, the whole operation results in Nothing.

This part had me stumped for a while. I couldn't wrap my head around the idea of a function inside a parser. A type like Parser String or Parser Int is intuitive to understand---they're trying to parse strings or integers from their inputs. But how could an unevaluated function be parsed from the text? The key to understanding this is to think of Parser not as a container, but as a name for a particular kind of computation---namely a computation that has access to, and can consume, a stream of input text. The type passed to Parser is the type of value you're trying to construct within that context. That is, Parser (a -> b) is not a parser of functions (which doesn't seem to make much sense), it is a function a -> b resulting from a computation in the Parser context.

With that said, we can now better make sense of our parser function. Again, the sub-expression VarDecl <$> string results in a value of type Parser (String -> VarDecl). We have used the string function to consume a String from the input text, and fed this into the VarDecl constructor, partially applying it and resulting in another function String -> VarDecl. Because this is part of a computation in the Parser context (we lifted VarDecl into that context using fmap <$>), this resulting function is also in the Parser context.

Working our way from the inside out, the next sub-expression is (VarDecl <$> string) <*> string (including the expression we've already looked at). Here we're using the sequential application operator <*>. Its first argument is the result of the first expression (of type Parser (String -> VarDecl)) and its second argument is the string function which has a type of Parser String. Let's substitute these types into the type signature for the <*> function.

(<*>) :: Parser (String -> VarDecl) -> Parser String -> Parser VarDecl

Thinking about how the <*> is implemented for the Parser type, this expression will run the first parser to parse a string and use it to partially apply the VarDecl constructor as we've already described. Then it will run the second parser to parse another string, and if successful, it will pass that string into the partially applied function, finally arriving at a value of type VarDecl which it returns still wrapped in the Parser context. Voilà! We've constructed the VarDecl type by composing parsers together.

But hold on, the full function is:

varDeclP = ((VarDecl <$> string) <*> string) <* symbol ";"

There is another part to the expression, a symbol parser which looks for the semicolon that terminates our int x; expression. It is important for the semicolon to be there, but we don't need its value to construct a VarDecl record, so how is this parser incorporated into the function?

This final parser is being applied using an <* operator we haven't looked at yet. This operator is also a part of the Applicative class, but it is not part of the minimal definition. The type signature for this operator is:

(<*) :: Applicative f => f a -> f b -> f a

This is a function which takes two values both in an applicative context, and just seems to return the first one. This initially seems kind of pointless, but take a look at how it might be implemented.

Parser f <* Parser g = Parser $ \input ->
    case f input of
        Nothing -> Nothing
        Just (x, rem) ->
            case g rem of
                Nothing -> Nothing
                Just (_, rem') -> Just (x, rem')

Although the value produced by the second parser isn't used, we do nonetheless run both parsers, allowing them both to consume input and possibly fail. If the second parser does fail, the whole expression also fails. This is ideal for our case where we want to parse a semicolon but then throw the parsed value away. Let's plug the types from our parser into the type signature for <*.

(<*) :: Parser VarDecl -> Parser String -> Parser VarDecl

The parser on the left-hand side of the <* is executed and its resulting value is the value returned from the overall expression. The parser on the right-hand side does have to run successfully, but its resulting value has no effect on the output of the overall expression.
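
The same behaviour is easy to see with a simpler applicative like Maybe: the left value is kept, but only if the right-hand computation succeeds:

ghci> Just 1 <* Just 2
Just 1
ghci> Just 1 <* Nothing
Nothing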

The bigger picture

We've looked at how applicatives are used in the parser combinator pattern, but the Applicative class is intentionally abstract so that it can be used in lots of different contexts. In Haskell, I often find myself able to understand an abstraction well enough to use it in a certain context, but unable to fully grasp the essence of what it aims to express.

For example, I've been aware for quite a long time that an applicative is "a function inside a functor". That mental model works well enough when thinking about lists or the Maybe type, because both of those types can be thought of as containers for other values, which is intuitive. However, that understanding tripped me up when it came to learning about parsers; how can a parser be a container for a value, especially when that value is a function? Haskell is unusual in that you can often ascertain a lot about what a function does just by looking at its type signature. Of course, that isn't always all the information you need to use a function effectively (implementation does matter), but the essence of an abstraction is often best understood by looking at its type.

Take another look at the type of the sequential application operator:

(<*>) :: Applicative f => f (a -> b) -> f a -> f b

It bears a resemblance to the type of the function application operator $:

($) :: (a -> b) -> a -> b

They seem to be expressing the same thing, except the former is in a functorial context. The function application operator $ is a controversial one, because in a simple expression, it doesn't do anything useful. For example, the following expressions are equivalent:

f x == f $ x

The only purpose served by the operator is in longer, more complex expressions, where it can be used to disambiguate the precedence of terms. The reason it's otherwise redundant is that there is no question of how a function a -> b is applied to a value a---this is a plain function and a plain value, so Haskell already knows how to evaluate the application. That contrast is the essence of applicatives as an abstraction.

The sequential application operator <*> serves a very similar purpose to the function application operator $, but in the former case, Haskell doesn't inherently know how to apply the function to the argument, because both are wrapped in an applicative context. The <*> operator leverages the type's Applicative instance, which supplies the missing information about how to apply the function to the value.
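
A small example with Maybe makes the parallel concrete: the same function application, once with plain values and once with values in an applicative context:

ghci> (+) 1 2
3
ghci> (+) <$> Just 1 <*> Just 2
Just 3
ghci> (+) <$> Just 1 <*> Nothing
Nothing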

This is incredibly powerful; we're able to express the composition of potentially very complex logic in terms of simple functions, because the complexity is abstracted away in the type. In the language of category theory, we're able to move into another context (like a parser context) and express morphisms in that context using the syntax and semantics of functions.

After improving my understanding of applicatives, the question occurred to me: "then what is a monad?" That could warrant its own post, but in short, monads serve a similar purpose; the difference is essentially that monads can be chained together, with each step in the chain able to depend on the results of the previous steps---this isn't possible with applicatives. In cases where that sort of behaviour isn't needed, applicatives are a more natural fit.
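
As a small taste of that difference using Maybe, with >>= the next step can inspect the value produced by the previous one and decide what to do:

ghci> Just 3 >>= \n -> if n > 0 then Just (n * 2) else Nothing
Just 6
ghci> Just 0 >>= \n -> if n > 0 then Just (n * 2) else Nothing
Nothing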
