static  ·  no trackers  ·  built to be lightweight

What is a Monad, Again...

Introduction

There are a seemingly infinite supply of tutorials explaining what a monad is. Each of them have their own merits and applications - in this tutorial, I’m going to try and give a bit of background on the mathematics that inspires the Haskell concepts, and then explain their uses in Haskell - but more specifically, how we can define our parsers to be instances of functors, monads and applicatives.

This introduction is aimed at somebody with a mathematical background interesting in the theory behind these concepts. There are much better examples that approach this from a purely practical perspective.

Functors

The concept of a functor in Haskell isn’t too complicated to understand. Succinctly, a data type in Haskell is a functor if there is an explicitly defined way to penetrate the structure of the data type and apply a given function to the values contained within the type, without changing the structure.

The important function here is

  fmap :: (a -> b) -> f a -> f b

In the type signature, f is the functor data type. So f a is the functor containing values of type a . So what this says is that with a function from a to b , we can penetrate the structure of a functor to map the values of type a to values of type b using f .

At this point, some people might notice that there’s no way to guarantee the structure of the data type is unchanged after applying fmap , but worry not, there are laws that a functor must satisfy. They are

  fmap id == id
  fmap f . fmap g = fmap (f . g)

The first just says that if we map every value to itself, then it should be like we never touched the data structure. The second one says if we fmap two functions - one after the other - the result will be just like if we fmap both functions together. From this, you should have a little bit more trust that fmap does not directly changed anything about the data structure, only the function modifies the values.

Perhaps it’s more obvious if we use some concrete examples. The first functor that you’ve been using without knowing is a list, even more is that you’ve been using fmap on lists this whole time! It turns out that the fmap defined on lists is actually just map - the slightly prettier twin. If we look at the type of map , we get

  map :: (a -> b) -> [a] -> [b]

What’s the structure preservingness of this though ? Well, [a] is an ordered collection of objects of type a - we would want fmap f xs to maintain this order, that means that if x and y are elements of the list with the former coming before latter, then f x comes before f y in the list. Another way to describe the structure presevation is that if x is the element at index i of xs , then f x is the element at position i of fmap f xs - and the length of the list is unchanged.

A classic example is the Maybe data type. Just as a little refresher, the definition is

  Maybe a = Nothing | Just a 

Hopefully you have some ideas of what the functor instance would look like here to penetrate the structure

  fmap f Nothing  = Nothing 
  fmap f (Just x) = Just (f x)

The structure is either nothingness or a container with a value inside. If we have nothingness, we simply maintain this nothingness. But if we have a container, we penetrate this container and apply our function.

Hopefully this gives an intuition for what a functor is in Haskell . Hopefully now when I say structure penetrating and structure preserving, it makes a bit more sense. When you make a data type in Haskell, one of the first things you want to do is to define a functor instance for it - it’s a very common design pattern and provides incredibly elegant ways of doing things. The definition comes from category theory, I initially intended to explain what a functor was in category theory first before trying to apply it to the definition that we’ve gone through. Hoewever I decided that the most important takeaway to understand was that there is a notion of preserving structure - I’m not sure whether the category theory definition would give more intuition or confusion for most people.

Motivation Behind Monads

A computation can do two things: it returns a value, and it produces side effects. Side effects occur when data outside of the local scope of the function is modified, for example, consider this C code

  int numbersDoubled = 0;
  
  int doubleNumber(int x) {
    numbersDouble += 1;
    return x * 2;
  } 

The value returned is the doubled number, but this function has a side effect - it increments the value of the global variable numbersDoubled . That means that this function isn’t pure, so there doesn’t exist a straightforward translation of this function into Haskell. Another important part of Haskell is referential transparency - this is the idea that if I call a function with the same arguments, I should get the same result every time. It’s easy to thing of functions which aren’t referentially transparent in C

  interest = 0.01; 

  double compoundInterest(double base, int numYears) {
    return base * ((1 + interest) ** numYears)
  }

This function doesn’t have any side effects, but it isn’t referentially transparent. If I call compoundInterest(100, 5) , then the value returned by this function will change depending on what the value of the global variable interest

Unfortunately there is no easy translation of these functions into Haskell because they’re pure functions. However, functions like this are ubiqutous all throughout programming, so it would be nice if we could write similar functions in Haskell. There’s even bigger motivation to be able to do things like this, functions that take some input from the user’s keyboard and process it are clearly not referentially transparent, or if you print some text onto the screen, then that’s a side effect - so with our current model of Haskell, we can’t do any IO!

The solution in functional programming is to do the computation inside of some “context” and then return a new context along with the return value. This might not immediately make sense if you’ve never heard of this idea before, so let’s break it down using an example. We’re going to try and solve this problem by considering the side effects of some code that “logs” its own activity. Consider the following Java code

  String functionsCalled = "";

  int double(int x) {
    functionsCalled += "double, "
    return x * 2;
  }

  bool isEven(int x) {
    functionsCalled += "isEven, "
    return x % 2 == 0;
  }

The context of this function is the value of functionsCalled , so we need to have a type that represents this idea. What this example is doing is essentially just keeping a log - so let’s make the type

  Logger a = Logger String a 

This represents a computation that has logged a string and returned a value of type a . For example,

  triple :: Int -> Logger String Int 
  triple x = Logger "triple, " (3 * x)

  isEven :: Int -> Logger String Bool 
  isEven x = Logger "isEven, " (even x) 

Interestingly, we can see that these functions have no side effects. The side effects are contained within the return types. What these function type signatures are saying is that these functions take an int and return as a type representing a computation that has logged a string. A useful thing that we’re missing is the ability to chain together these computations, currently, each computation seems completely disjoint from any other computation because we can’t pass the Logger in as a parameter to each of the functions - i.e. if I call triple 3 and then isEven 9 , then the side effects should produce the string "triple, isEven, " , not just each one individually.

The first solution that comes to mind might just be to try and pass the Logger in as a parameter, but this slightly changes the semantics of the function - it would mean that our function comsumes a Logger to double a number, but this is not necessary, the function just needs a number. The problem lies in when we try to chain together sequential computations, so let’s try and create a function that chains together computations. Suppose I have

  f :: c -> Logger a
  g :: a -> Logger b
  x :: c 

then f x :: Logger a , but unfortunately, this is not the correct type for us to be able to use g . We need to somehow extract the value of type a from the logger, and then call the function again. We also need to combine the side effects, but this is not too complicated, it’s a simple string concatenation. If we want to compose two functions like this, we have to go through a lot of effort to unwrap them, so let’s save ourselves some time and abstract this functionality. If it helps to conceptualise, run through this example by calling our tripling function then the evenness function (but let’s keep the types as abstract as possible to increase reusability).

  seqComp :: Logger a -> (a -> Logger b) -> Logger b
  seqComp (Logger log y) g = Logger z (log ++ log')
    where 
      Logger z log' = g y 

I didn’t put much thought into the name of this function, because it’s going to reappear with it’s real name later. Hopefully it’s clear what’s going on. We just unwrap the input computation to get all of the component parts out and then recombine them in what makes semantic sense (and type checks). So now we can do

  > triple 3 `seqComp` isEven 
  Logger "triple, isEven, " False 

and we’ve combined two computations with side effects!

Monads

From before, we saw that we have a need to combine to functions that return values wrapped in the context of a Logger . Functions that wrap values inside of contexts are present all over code that needs to modify state - so the idea of sequential composition that we built using the logger example would really benefit us if we could generalise it to other, similar types. Finally, we can start to understand why monads appear everywhere - it’s because they provide a really nice abstraction for the sequential composition that we defined before, and now I want to try and convince you of that.

Firstly, what is the mathematical definition of a monad ? Well, let’s try not to delve into the category theory too much, but a monad is the tuple (T,η,μ)(T, \eta, \mu) with T:CCT : C \rightarrow C, η:1CT\eta : 1_C \rightarrow T and μ:T2T\mu : T^2 \rightarrow T. Here, CC is a category, 1C1_C is the identity functor on CC and TT is an endofunctor. Of course, there’s a lot more to the definition, but we leave it at this for now and compare it to the corresponding functions in Haskell.

The Haskell equivalent of (T,η,μ)(T, \eta, \mu) is (m, return, join) with

  return :: a -> m a 
  join :: m (m a) -> m a   

Also, another result from the category theory “equivalence” is that m must be a functor, so we also have

  fmap :: (a -> b) -> m a -> m b  

At this point you might be getting a bit confused as to what any of this has to do with what we talked about when we went through the example with Logger . Remember that the key motivation was the ability to compose the computations. Well, it turns out that this comes naturally from the definition of a monad in Haskell! Let’s build it!

Recall the type of seqComp from before, let’s generalise the type by replacing Logger by an arbitrary constructor which we will call m , and also let’s use its real name: bind - represented by the infix function >>= . Let’s begin by analysing the type of this generalised version of the function

  (>>=) :: m a -> (a -> m b) -> m b 

So now, let’s try and generate a definition of x >>= f that will type check. For reference, we have x :: m a and f :: a -> m b

  x >>= a = join (fmap f x) 

It’s quite concise, isn’t it! I recommend you trying to convince yourself that the types match up by considering the composition of all of the functions. But the important thing here is that the m a corresponds to a value of type a wrapped in a context m . Even more important is that if m is a monad, then we get this sequential composition of the contexts for free!

What we’ve done is that we’ve shown that if a type is a monad, then we have a way to chain together the computations. Hopefully now, the idea of a monad makes sense. This means that we can define a plethora of features for monad abstraction and then get them for free if we can show that our type is a monad - but how do you do that ? Well, if we are pedantic, we have to define return and join , but realistically we will use >>= much more than join . So, in reality, we normally provide defitions of return and >>= (see exercise 2 to convince yourself that this is okay). You can think of return as wrapping a value in the most minimal context possible, so if we go back to the logger example, we would want to define

  return :: a -> Logger a 
  return x = Logger "" x          

But that’s not enough to ensure that our types are monads, there’s also some laws that they have to satisfy:

  (return x) >>= f  == f x 
  m >>= return      == m 
  (m >>= f) >>= g   == m >>= (\x -> f x >>= g) 

The first is like a left identity rule - you can think of it as saying that if we wrap x , in an empty context, then bind that to f , it’s as if we just did f. The second is the right identity rule. It says that if we unwrap a context, then rewrap it, we get the original monad. The final one is like an assosociativity rule.

Another thing that we get for free from a monad is >> :: m a -> m b -> m b , essentially, it sequentially composes two computations, and then throws away the result from the first result. This is like sequencing in imperative languages and it’s so common in Haskell that it has alternative notation, the code m1 >> m2 >> m3 is equivalent to

  do 
    m1 
    m2 
    m3 

Hopefully now you’re beginning to see what a monad represents. Essentially, it is a way to model a computation with side effects inside of a pure paradigm. I’ll admit the best way to understand what they are is just to have a lot of practise using them - but if there’s anything to take away, it’s that they are a way to chain together computations sequentially and keep their side effects contained.

Exercises

  1. Consider the data type
      data BTree a = Leaf a | Node a (BTree a) (BTree a) 
    i.e. a non-empty binary tree. Define fmap on this data type.
  2. Define join in terms of >>=