Binders Unbound
Stephanie Weirich, Tim Sheard and I recently submitted a paper to ICFP entitled Binders Unbound. (You can read a draft here.) It’s about our kick-ass, I mean, expressive and flexible library, unbound (note: GHC 7 required), for generically dealing with names and binding structure when writing programs (compilers, interpreters, refactorers, proof assistants…) that work with syntax. Let’s look at a small example of representing untyped lambda calculus terms. This post is working Literate Haskell, feel free to save it to a .lhs
file and play around with it yourself!
First, we need to enable lots of wonderful GHC extensions:
> {-# LANGUAGE MultiParamTypeClasses
> , TemplateHaskell
> , ScopedTypeVariables
> , FlexibleInstances
> , FlexibleContexts
> , UndecidableInstances
> #-}
Now to import the library and a few other things we’ll need:
> import Unbound.LocallyNameless
>
> import Control.Applicative
> import Control.Arrow ((+++))
> import Control.Monad
> import Control.Monad.Trans.Maybe
>
> import Text.Parsec hiding ((<|>), Empty)
> import qualified Text.Parsec.Token as P
> import Text.Parsec.Language (haskellDef)
We now declare a Term
data type to represent lambda calculus terms.
> data Term = Var (Name Term)
> | App Term Term
> | Lam (Bind (Name Term) Term)
> deriving Show
The App
constructor is straightforward, but the other two constructors are worth discussing in more detail. First, the Var
constructor holds a Name Term
. Name
is an abstract type for representing names, provided by Unbound. Name
s are indexed by the sorts of things to which they can refer (or more precisely, the sorts of things which can be substituted for them). Here, a variable is simply a name for some Term
, so we use the type Name Term
.
Lambdas are where names are bound, so we use the special Bind
combinator, also provided by the library. Something of type Bind p b
represents a pair consisting of a pattern p
and a body b
. The pattern may bind names which occur in b
. Here is where the power of generic programming comes into play: we may use (almost) any types at all as patterns and bodies, and Unbound will be able to handle it with very little extra guidance from us. In this particular case, a lambda simply binds a single name, so the pattern is just a Name Term
, and the body is just another Term
.
Now we use Template Haskell to automatically derive a generic representation for Term
:
> $(derive [''Term])
There are just a couple more things we need to do. First, we make Term
an instance of Alpha
, which provides most of the methods we will need for working with the variables and binders within Term
s.
> instance Alpha Term
What, no method definitions? Nope! In this case (and in most cases) the default implementations, implemented in terms of automatically-derived generic representations, work just fine.
We also need to provide a Subst Term Term
instance. In general, an instance for Subst b a
means that we can use the subst
function to substitute things of type b
for Name
s occurring in things of type a
. We override the isvar
method so the library knows which constructor(s) of our type represent variables which can be substituted for.
> instance Subst Term Term where
> isvar (Var v) = Just (SubstName v)
> isvar _ = Nothing
OK, now that we’ve got the necessary preliminaries set up, what can we do with this? Here’s a little lambda-calculus evaluator:
> done :: MonadPlus m => m a
> done = mzero
>
> step :: Term -> MaybeT FreshM Term
> step (Var _) = done
> step (Lam _) = done
> step (App (Lam b) t2) = do
> (x,t1) <- unbind b
> return $ subst x t2 t1
> step (App t1 t2) =
> App <$> step t1 <*> pure t2
> <|> App <$> pure t1 <*> step t2
>
> tc :: (Monad m, Functor m) => (a -> MaybeT m a) -> (a -> m a)
> tc f a = do
> ma' <- runMaybeT (f a)
> case ma' of
> Just a' -> tc f a'
> Nothing -> return a
>
> eval :: Term -> Term
> eval x = runFreshM (tc step x)
Note how we use unbind
to take bindings apart safely, using the the FreshM
monad (also provided by Unbound) for generating fresh names. We also get to use subst
for capture-avoiding substitution. All without ever having to touch a de Bruijn index!
OK, but does it work? First, a little Parsec parser:
> lam :: String -> Term -> Term
> lam x t = Lam $ bind (string2Name x) t
>
> var :: String -> Term
> var = Var . string2Name
>
> lexer = P.makeTokenParser haskellDef
> parens = P.parens lexer
> brackets = P.brackets lexer
> ident = P.identifier lexer
>
> parseTerm = parseAtom `chainl1` (pure App)
>
> parseAtom = parens parseTerm
> <|> var <$> ident
> <|> lam <$> (brackets ident) <*> parseTerm
>
> runTerm :: String -> Either ParseError Term
> runTerm = (id +++ eval) . parse parseTerm ""
Now let’s try some arithmetic:
*Main> runTerm "([m][n][s][z] m s (n s z))
([s] [z] s (s z))
([s] [z] s (s (s z)))
s z"
Right (App (Var s) (App (Var s) (App (Var s)
(App (Var s) (App (Var s) (Var z))))))
2 + 3 is still 5, and all is right with the world.
This blog post has only scratched the surface of what’s possible. There are several other combinators other than just Bind
for expressing binding structure: for example, nested bindings, recursive bindings and embedding terms within patterns are all supported. There are also other operations provided, such as free variable calculation, simultaneous unbinding, and name permutation. To learn more, see the package documentation, or read the paper!