Haskell’s monads are an enigma. They hold the key to shorter, more modular Haskell programs, but at first they’re hard to understand. For sure, many monadic tutorials exist, but I had to study them for weeks before it all clicked. I know others who struggled just as much.
I had such a hard time partly because I didn’t understand some fundamental concepts as well as I had thought. Unfortunately, it took me a while to realize this because I already had experience writing real Haskell code. It didn’t occur to me that I had to revisit basic concepts like types and functions. I also think that other tutorials don’t spend enough time breaking down the syntactic sugar that makes Haskell’s monads so terse.
This writeup focuses on the concepts that confused me. The intended audience is people who have studied Haskell, yet have struggled with monads. Although some fundemental concepts are explained, this is not a comprehensive introduction to Haskell.
Before we dive in, I want to acknowledge a few people and resources that I found most helpful during my studies. Paul Hudak’s writing is exceptionally lucid. His original A Gentle Introduction to Haskell is a tremendous resource. Cale Gibbard wrote my two favorite monadic introductions: Monads as Containers and Monads as Computation. In addition, Jeff Newbern’s All About Monads contains a well written Catalog of Standard Monads. Finally, the experts on the #haskell IRC channel are always helpful.
Back to Basics
Before we talk about the syntax or semantics of monads, and before we step through any examples, we go back to basics. When I started learning about monads, I overestimated my understanding of Haskell’s type system. This section highlights the concepts that I overlooked.
Every monad is nothing more than a paramaterized type that implements a few functions. Let’s review paramaterized types by exploring the details of (Maybe a).
> data Maybe a = Nothing
> | Just a
It’s straightforward to see that (Maybe a) can be used to represent values that either exist or are missing. It’s also intuitive that the variable a is a place holder for the type that may or may not be there (perhaps an Int). Fair enough, but there’s more.
First let’s break down the type’s name. Although it is often called the “maybe type” in conversation, its proper name is (Maybe a). The word Maybe is a type constructor and the character a is a type variable.
The type constructor Maybe is a function that accepts the given type variable a and returns the type (Maybe a). I found this notion confusing because you cannot call the Maybe function within regular code. For example, ghci prints out:
Prelude> Maybe 4
:1:0: Not in scope: data constructor `Maybe'
Prelude> :t Maybe
:1:0: Not in scope: data constructor `Maybe'
Prelude> :kind Maybe
Maybe :: * -> *
The last statement, using :kind, gives an indication that Maybe is a unary type constructor. We won’t stop to discuss notion of a kinds (partly because I don’t understand them well enough myself). We can safely skip that topic. The error messages are printed because the Maybe function is only valid within the context of type definitions, type class definitions, and type signatures. For example, you might write:
> catMaybes :: [Maybe a] -> [a]
Within this type signature, Maybe is a function being applied to the type variable a, to produce a type of (Maybe a). (Incidentally, the type (Maybe a) is passed in turn to the type constructor [].) This is code, but it is within the scope of type signatures. We’ll return to this distinction later when we replace particular type constructors like Maybe with a variable that represents any parametric type constructor.
The right hand side of the type definition lists one or more value constructors, separated by the pipe symbol. For (Maybe a), we have Nothing and Just a. Even though these value constructors are defined within the type definition, they are valid in regular Haskell code, as we can see with ghci:
Prelude> :t Nothing
Nothing :: Maybe a
Prelude> :t Just
Just :: a -> Maybe a
Prelude> Just 5
Just 5
The ghci output says that the function named Just takes some value x of type a and returns a value of type (Maybe a), which by definition equals Just x. We introduce the value variable x here to distinguish it from a, which is a type variable. Similarly, Nothing is a nullary function of type (Maybe a).
Note that unlike the type constructor Maybe, which produces a type named (Maybe a), the value constructors Nothing and Just produce values that have the type (Maybe a). One consequence of this distinction is the tendency for value constructors to be named after its type constructor. This is not a conflict because value constructors and type constructors are in different namespaces.
Finally, be sure to review the notion of Haskell type classes and instances. Type classes allow us to define a particular interface, and then instances allow us to declare that a type implements that interface. For the purposes of monads, we will be examining the type classes Functor, and Monad.
Functors
For whatever reason, I had it in my mind that I would put off learning about functors until I understood monads. I wasn’t sure what a functor was, but I also wasn’t sure what a monad was, and I wanted to take things one at a time. In retrospect, this was a foolish mistake because the concepts are related, and functors are much simpler than monads.
A functor is simply a fancy name for a type that supports the map function, which happens to be called fmap in the Functor class. In ghci you can see:
Prelude> :t map
map :: (a -> b) -> [a] -> [b]
Prelude> :t fmap
fmap :: (Functor f) => (a -> b) -> f a -> f b
As the types show, the map function only operates over lists, whereas fmap is generalized to operate over any parametric type. The idea is that we want to apply some function to every item within a collection. The Functor class is defined as follows:
> class Functor f where
> fmap :: (a -> b) -> f a -> f b
This type class definition is written in the language for types. The variable f is a placeholder for any parametric type constructor. We could instantiate the variable with any parametric type constructor, including Maybe. For example:
> instance Functor Maybe where
> fmap f Nothing = Nothing
> fmap f (Just x) = Just (f x)
A minor point — do not let the reuse of the variable f confuse you. In the Functor class definition, f is a placeholder for a type constructor. Within this instance defintion, f represents the function that we want to apply to whatever is within the Just x value.
Let’s see what happens in ghci:
Prelude> fmap (* 2) (Just 4)
Just 8
Prelude> fmap (* 2) Nothing
Nothing
Prelude> fmap odd (Just 4)
Just False
Prelude> fmap odd Nothing
Nothing
For the list type, we simply have:
> instance Functor [] where
> fmap = map
To be precise, there are a two semantic rules that every proper fmap implementation must abide by:
> fmap id = id
> fmap (f . g) = fmap f . fmap g
Together, these rules ensure that fmap doesn’t modify the shape or order of its input. Most sensible definitions of fmap abide by these rules, so don’t think about them too hard.
The important takeway from this section is the generalization of map to any parametric type using the variable f within the type signature of fmap. This allows us to make any parametric type (such as [a]) implement fmap, satisfying the common need of applying a function to every item in some collection. In the case of (Maybe a), we apply the function to zero or one values. For lists (the [a] type), we apply the function to zero or more values.
The Monad Type Class
Now we turn to the Monad class. Perhaps surprisingly, we’ll hold off discussing how monads can improve your Haskell code. Instead, we simply introduce the methods of the Monad class, just as we did for the Functor class. We also introduce Haskell’s special monadic syntactic sugar.
Here is the Monad class definition:
> infixl 1 >>, >>=
> class Monad m where
> return :: a -> m a
> (>>=) :: m a -> (a -> m b) -> m b
> (>>) :: m a -> m b -> m b
> fail :: String -> m a
> m >> k = m >>= \_ -> k
The type class’s two important functions are return and (>>=), which is conversationally called “bind”. The function (>>) is defined in terms of bind, and exists to simplify syntax. We’ll discuss fail later, but it’s typically used to handle exceptional results.
Don’t fall in the trap of trying to think about what each of these functions “typically do.” The meaning is different for each instance of the Monad class. We will examine particular implementations shortly.
First, let’s examine return. Despite its name, it has nothing to do with the standard flow control keyword in other languages. In Haskell it is not a keyword, and it is not a flow control operator. Instead, return is a generalized value constructor, similar to how fmap was a genearlized version of map. We can show this in ghci:
Prelude> :t Just
Just :: a -> Maybe a
Prelude> :t return
return :: (Monad m) => a -> m a
Both functions take a value of some type, call it a, and return a value of a type that is parameterized over a. The Just function always returns a value of type (Maybe a), whereas the return function can generate a value of any parameterized type that is an instance of the Monad class. It is an alias for one or more of the type’s value constructors. We’ll give use cases for return later, but for now understand that if you need to create a monad, you often use return instead of explicitly using the type’s value constructors.
Next, consider bind, or (>>=). It is very similar to the fmap function, as we can see in ghci:
Prelude> :t fmap
fmap :: (Functor f) => (a -> b) -> f a -> f b
Prelude> :t (>>=)
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
Prelude> :t (flip (>>=))
(flip (>>=)) :: (Monad m) => (a -> m b) -> m a -> m b
Both functions operate over a function and a parameterized type. The first two arguments are flipped, but that’s a small matter of syntax. Also, the functions have slightly different signatures, but the intent looks familiar.
Just like with Functor, there are also some rules that any well defined instance of the Monad class must define.
> return a >>= k = k a
> m >>= return = m
> xs >>= return . f = fmap f xs
> m >>= (\x -> k x >>= h) = (m >>= k) >>= h
As with the fmap rules, don’t think about these too hard. The first two rules say that return doesn’t tinker its argument. The third rule relates bind to fmap, as we hinted above. The third rule is a sort of associativity.
Haskell provides special syntactic sugar for the bind operator within its do block. Haskell performs the following two translations for us:
> do e1 ; e2 = e1 >> e2
> do p <- e1 ; e2 = e1 >>= \p -> e2
In the second case, if p does not pattern match with e2, then fail is called. These two translations allow us to write monadic code that implicitly calls the bind operator. One effect of this notation is that the resulting code resembles typical more familiar imperative code, as we’ll see in the next section when we look at examples.
Basic Example Monads
Let’s make these monadic functions concrete by looking at two basic monads: (Maybe a) and [a]. For some time, I was confused when familiar types like these were referred to as monads. It wasn’t clear to me that any parametric type which implements the Monad class is always a monad. The monadic functionality only shines we apply a monadic operator, such as bind, on a value with the monadic type.
Monadic Maybe a
Starting with (Maybe a), we define the two key monadic operators as follows:
> instance Monad Maybe where
> return = Just
> Nothing >>= f = Nothing
> (Just x) >>= f = f x
> fail _ = Nothing
The return function is simply an alias for the value constructor Just. The binding operator always returns Nothing if the first argument is Nothing. Otherwise, we apply the given function f to the value x paramaterized by Just in the first argument. Due to its type, the function f must return either Just x' or Nothing.
It’s helpful to think of this instance of the bind operator passing zero or more values from one function to the next. As the type signature of (>>=) dictates, each function must produce a value of type (Maybe a). Consider the following example.
Suppose we want to define a decrement operator over the natural numbers (all integers greater than or equal to zero). To seamlessly support inputs less than or equal to zero, we define our function as follows:
> decrementNat x | x <= 0 = Nothing
> | otherwise = Just (x-1)
Note that decrementNat has the desirable signature a -> Maybe a. Now we can use the monadic bind function to perform multiple decrements. If we load the above function from a module Foo in ghci, we can write:
*Foo> Nothing >>= decrementNat
Nothing
*Foo> Nothing >>= decrementNat >>= decrementNat
Nothing
*Foo> Just 2 >>= decrementNat
Just 1
*Foo> Just 2 >>= decrementNat >>= decrementNat
Just 0
*Foo> Just 2 >>= decrementNat >>= decrementNat >>= decrementNat
Nothing
We can also use the syntactic sugar of do, to write a function such as:
> decrementNat3 x = do x1 <- decrementNat x
> x2 <- decrementNat x1
> decrementNat x2
Then, once again, in ghci, we can write:
*Foo> decrementNat3 2
Nothing
*Foo> decrementNat3 5
Just 2
Notice that once the bind function (as defined for the (Maybe a) type) encounters a Nothing, it forever generates a Nothing. It short circuits. Otherwise, the incrementally smaller value is passed along the chain.
Monadic List
The list type is also an instance of the Monad type class, defined as follows:
> instance Monad [] where
> return x = [x]
> m >>= f = concatMap f m
> fail _ = []
Once again, return is an alias for the value constructor []. The bind operator is best conceptualized in two steps. First, a 2d list (a list of lists) is created because the function f is applied to every element in the list m. Remember that f creates a list. Then all of those lists are concatenated together.
Consider a simple expression such as:
Prelude> [1,2] >>= (\x -> return $ odd x)
[True,False]
Alternatively, you could write:
> myodd = do x <- [1,2]
> return $ odd x
What exactly is going on here? Remember that for lists, the bind operator is concatMap, and return is the list constructor. We might also write it as:
Prelude> concatMap (return . odd) [1,2]
[True,False]
So the short monadic bind expression applies the function \x -> return $ odd x to both 1 and 2 to create [True] and [False], which are then concatenated together.
We can expand this example to produce all pairs of a list l.
> allpairs l = do x <- l
> y <- l
> return (x, y)
Note that in this case there are two chained calls to concatMap.
To be continued. . .
These are the basic ideas behind monads. We have not covered the type class MonadPlus. We have also not explored many common instances of the Monad class, most notably (IO a) and (State a). Perhaps I’ll tackle those topics in a subsequent posts.