In a previous
post
I discussed the first half of my solution to Factor-Full
Tree. In this post,
I will demonstrate how to decompose a tree into disjoint paths.
Technically, we should clarify that we are looking for directed
paths in a rooted tree, that is, paths that only proceed down the
tree. One could also ask about decomposing an unrooted tree into
disjoint undirected paths; I haven’t thought about how to do that in
general but intuitively I expect it is not too much more difficult.
For
this particular problem, we want to decompose a tree into
maximum-length paths (i.e. we start by taking the longest possible
path, then take the longest path from what remains, and so on); I will call
this the max-chain decomposition (I don’t know if there is a
standard term). However, there are other types of path
decomposition, such as heavy-light decomposition, so we will try to
keep the decomposition code somewhat generic.
This post is literate Haskell; you can find the source code on GitHub. We begin with some language pragmas and imports.
{-# LANGUAGE ImportQualifiedPost #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE TupleSections #-}
module TreeDecomposition where
import Control.Arrow ((>>>), (***))
import Data.Bifunctor (second)
import Data.ByteString.Lazy.Char8 (ByteString)
import Data.ByteString.Lazy.Char8 qualified as BS
import Data.List (sortBy)
import Data.List.NonEmpty (NonEmpty)
import Data.List.NonEmpty qualified as NE
import Data.Map (Map, (!), (!?))
import Data.Map qualified as M
import Data.Ord (Down(..), comparing)
import Data.Tree (Tree(..), foldTree)
import Data.Tuple (swap)
import ScannerBS
Remember, our goal is to split up a tree into a collection of linear paths; that is, in general, something like this:
What do we need in order to specify a decomposition of a tree into disjoint paths this way? Really, all we need is to choose at most one linked child for each node. In other words, at every node we can choose to continue the current path into a single child node (in which case all the other children will start their own new paths), or we could choose to terminate the current path (in which case every child will be the start of its own new path). We can represent such a choice with a function of type
type SubtreeSelector a = a -> [Tree a] -> Maybe (Tree a, [Tree a])
which takes as input the value at a node and the list of all the
subtrees, and possibly returns a selected subtree along with the list of remaining
subtrees.Of course, there is nothing in the
type that actually requires a SubtreeSelector
to return one of the
trees from its input paired with the rest, but nothing we will do
depends on this being true. In fact, I expect there may be some
interesting algorithms obtainable by running a “path decomposition”
with a “selector” function that actually makes up new trees instead of just
selecting one, similar to the chop
function.
Given such a subtree selection function, a generic path decomposition
function will then take a tree and turn it into a list of non-empty
paths:We could also imagine wanting information about the parent of each
path, and a mapping from tree nodes to some kind of path ID, but we
will keep things simple for now.
pathDecomposition :: SubtreeSelector a -> Tree a -> [NonEmpty a]
Implementing pathDecomposition
is a nice exercise; you might like to
try it yourself! You can find my implementation at the end of this
blog post.
Now, let’s use our generic path decomposition to implement a max-chain decomposition. At each node we want to select the tallest subtree; in order to do this efficiently, we can first annotate each tree node with its height, via a straightforward tree fold:
type Height = Int
labelHeight :: Tree a -> Tree (Height, a)
= foldTree node
labelHeight where
= case ts of
node a ts -> Node (0, a) []
[] -> Node (1 + maximum (map (fst . rootLabel) ts), a) ts _
Our subtree selection function can now select the subtree with the
largest Height
annotation. Instead of implementing this directly,
we might as well make a generic function for selecting the “best”
element from a list (we will reuse it later):
selectMaxBy :: (a -> a -> Ordering) -> [a] -> Maybe (a, [a])
= Nothing
selectMaxBy _ [] : as) = case selectMaxBy cmp as of
selectMaxBy cmp (a Nothing -> Just (a, [])
Just (b, bs) -> case cmp a b of
LT -> Just (b, a : bs)
-> Just (a, b : bs) _
We can now put the pieces together to implement max-chain
decomposition. We first label the tree by height, then do a path
decomposition that selects the tallest subtree at each node. We leave
the height annotations in the final output since they might be
useful—for example, we can tell how long each path is just by
looking at the Height
annotation on the first element. If we don’t
need them we can easily get rid of them later. We also sort by
descending Height
, since getting the longest chains first was kind
of the whole point.
maxChainDecomposition :: Tree a -> [NonEmpty (Height, a)]
=
maxChainDecomposition >>>
labelHeight const (selectMaxBy (comparing (fst . rootLabel)))) >>>
pathDecomposition (Down . fst . NE.head)) sortBy (comparing (
To flesh this out into a full solution to Factor-Full
Tree, after
computing the chain decomposition we need to assign prime factors to
the chains. From those, we can compute the value for each node if we
know which chain it is in and the value of its parent. To this end,
we will need one more function which computes a Map
recording the
parent of each node in a tree. Note that if we already know all the
edges in a given edge list are oriented the same way, we can build
this much more simply as e.g. map swap >>> M.fromList
; but when
(as in general) we don’t know which way the edges should be oriented
first, we might as well first build a Tree a
via DFS with
edgesToTree
and then construct the parentMap
like this afterwards.
parentMap :: Ord a => Tree a -> Map a a
= foldTree node >>> snd
parentMap where
node :: Ord a => a -> [(a, Map a a)] -> (a, Map a a)
= (a, M.fromList (map (,a) as) <> mconcat ms)
node a b where
= unzip b (as, ms)
Finally, we can solve Factor-Full tree. Note that some code from my
previous blog
post
is needed as well, and is included at the end of the post for
completeness. Once we compute the max chain decomposition and the
prime factor for each node, we use a lazy recursive
Map
to compute the value assigned to each node.
solve :: TC -> [Int]
TC{..} = M.elems assignment
solve where
-- Build the tree and compute its parent map
= edgesToTree Node edges 1
t = parentMap t
parent
-- Compute the max chain decomposition, and use it to assign a prime factor
-- to each non-root node
paths :: [[Node]]
= map (NE.toList . fmap snd) $ maxChainDecomposition t
paths
factor :: Map Node Int
= M.fromList . concat $ zipWith (\p -> map (,p)) primes paths
factor
-- Compute an assignment of each node to a value, using a lazy map
assignment :: Map Node Int
= M.fromList $ (1,1) : [(v, factor!v * assignment!(parent!v)) | v <- [2..n]] assignment
For an explanation of this code for primes
, see this old blog post.
primes :: [Int]
= 2 : sieve primes [3 ..]
primes where
: ps) xs =
sieve (p let (h, t) = span (< p * p) xs
in h ++ sieve ps (filter ((/= 0) . (`mod` p)) t)
We can easily use our generic path decomposition to compute a heavy-light decomposition as well:
type Size = Int
labelSize :: Tree a -> Tree (Size, a)
= foldTree $ \a ts -> Node (1 + sum (map (fst . rootLabel) ts), a) ts
labelSize
heavyLightDecomposition :: Tree a -> [NonEmpty (Size, a)]
=
heavyLightDecomposition >>>
labelSize const (selectMaxBy (comparing (fst . rootLabel)))) pathDecomposition (
I plan to write about this in a future post.
Here’s my implementation of pathDecomposition
; how did you do?
= go
pathDecomposition select where
= selectPath select >>> second (concatMap go) >>> uncurry (:)
go
selectPath :: SubtreeSelector a -> Tree a -> (NonEmpty a, [Tree a])
= go
selectPath select where
Node a ts) = case select a ts of
go (Nothing -> (NE.singleton a, ts)
Just (t, ts') -> ((a NE.<|) *** (ts' ++)) (go t)
We also include some input parsing and tree-building code from last time.
main :: IO ()
= BS.interact $ runScanner tc >>> solve >>> map (show >>> BS.pack) >>> BS.unwords
main
type Node = Int
data TC = TC { n :: Int, edges :: [(Node, Node)] }
deriving (Eq, Show)
tc :: Scanner TC
= do
tc <- int
n <- (n - 1) >< pair int int
edges return TC{..}
edgesToMap :: Ord a => [(a, a)] -> Map a [a]
= concatMap (\p -> [p, swap p]) >>> dirEdgesToMap
edgesToMap
dirEdgesToMap :: Ord a => [(a, a)] -> Map a [a]
= map (second (: [])) >>> M.fromListWith (++)
dirEdgesToMap
mapToTree :: Ord a => (a -> [b] -> b) -> Map a [a] -> a -> b
= dfs root root
mapToTree nd m root where
= nd root (maybe [] (map (dfs root) . filter (/= parent)) (m !? root))
dfs parent root
edgesToTree :: Ord a => (a -> [b] -> b) -> [(a, a)] -> a -> b
= mapToTree nd . edgesToMap edgesToTree nd
tl;dr: if you appreciate my past or ongoing contributions to the Haskell community, please consider helping me get to ICFP this year by donating via my ko-fi page!
Working at a small liberal arts institution has some tremendous benefits (close interaction with motivated students, freedom to pursue the projects I want rather than jump through a bunch of hoops to get tenure, fantastic colleagues), and I love my job. But there are also downsides; the biggest ones for me are the difficulty of securing enough travel funding, and, relatedly, the difficulty of cultivating and maintaining collaborations.
Last year I was very grateful for people’s generosity in helping me get to Seattle. I am planning to again attend ICFP in Milan this September; this time I will even bring some students along. I have once again secured some funding from my institution, but it will not be enough to cover all the expenses.
So, if you have been positively impacted by my contributions to the Haskell community (e.g. CIS 194, the Typeclassopedia, diagrams, split, MonadRandom, burrito metaphors…) and/or would like to support my ongoing work (competitive programming in Haskell, swarm, disco, ongoing package maintenance…), and are able to express that appreciation or support with a donation of any size to help me get to ICFP, I would really appreciate it!
Thank you, friends — I hope to see many people in Milan! Next up: I will soon publish another post about tree path decomposition!
Lately I’ve been thinking about representing eventually constant streams in Haskell. An eventually constant stream is an infinite stream which eventually, after some finite prefix, starts repeating the same value forever. For example,
\(6, 8, 2, 9, 3, 1, 1, 1, 1, \dots\)
There are many things we can do in a decidable way with eventually constant streams that we can’t do with infinite streams in general—for example, test them for equality.
This is a work in progress. I only have one specific use case in mind (infinite-precision two’s complement arithmetic, explained at the end of the post), so I would love to hear of other potential use cases, or any other feedback. Depending on the feedback I may eventually turn this into a package on Hackage.
This blog post is typeset from a literate Haskell file; if you want to play along you can download the source from GitHub.
River
typeSome preliminaries:
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}
module River where
import Data.Monoid (All (..), Any (..))
import Data.Semigroup (Max (..), Min (..))
import Prelude hiding (all, and, any, drop, foldMap, maximum, minimum, or, repeat, take, zipWith, (!!))
import Prelude qualified as P
Now let’s get to the main definition. A value of type River a
is
either a constant C a
, representing an infinite stream of copies of
a
, or a Cons
with an a
and a River a
.
data River a = C !a | Cons !a !(River a)
deriving Functor
I call this a River
since “all River
s flow to the C
”!
The strictness annotations on the a
values just seem like a good
idea in general. The strictness annotation on the River a
tail,
however, is more interesting: it’s there to rule out infinite streamsAlthough the strictness annotation on the River a
is semantically correct, I could imagine not wanting it there for performance reasons; I’d be happy to hear any feedback on this point.
constructed using only Cons
, such as flipflop = Cons 0 (Cons 1 flipflop)
. In
other words, the only way to make a non-bottom value of type Stream a
is
to have a finite sequence of Cons
finally terminated by C
.
We need to be a bit careful here, since there are multiple ways to
represent streams which are semantically supposed to be the same. For
example, Cons 1 (Cons 1 (C 1))
and C 1
both represent an infinite stream of
all 1
’s. In general, we have the law
C a === Cons a (C a)
,
and want to make sure that any functions we write respect this
It would be interesting to try implementing rivers as a higher inductive type, say, in Cubical Agda.
equivalence, i.e. do not distinguish between such values. This is
the reason I did not derive an Eq
instance; we will have to write
our own.
We can partially solve this problem with a bidirectional pattern synonym:
expand :: River a -> River a
C a) = Cons a (C a)
expand (= as
expand as
infixr 5 :::
pattern (:::) :: a -> River a -> River a
pattern (:::) a as <- (expand -> Cons a as)
where
::: as = Cons a as
a
{-# COMPLETE (:::) #-}
Matching with the pattern (a ::: as)
uses a view pattern
to potentially expand a C
one step into a Cons
, so that we can
pretend all River
values are always constructed with (:::)
.
In the other direction, (:::)
merely constructs a Cons
.
We mark (:::)
as COMPLETE
on its own since it is, in fact,
sufficient to handle every possible input of type River
. However,
in order to obtain terminating algorithms we will often include one or
more special cases for C
.
As an alternative, we could use a variant pattern synonym:
infixr 5 ::=
pattern (::=) :: Eq a => a -> River a -> River a
pattern (::=) a as <- (expand -> Cons a as)
where
::= C a | a' == a = C a
a' ::= as = Cons a as
a
{-# COMPLETE (::=) #-}
As compared to (:::)
, this has an extra Eq a
constraint: when we
construct a River
with (::=)
, it checks to see whether we are
consing an identical value onto an existing C a
, and if so, simply
returns the C a
unchanged. If we always use (::=)
instead of
directly using Cons
, it ensures that River
values are always
normalized—that is, for every eventually constant stream, we
always use the canonical representative where the element immediately
preciding the constant tail is not equal to it.
This, in turn, technically makes it impossible to write functions
which do not respect the equivalence C a === Cons a (C a)
, simply
because they will only ever be given canonical rivers as input.
However, as we will see when we discuss folds, it is still possible to
write “bad” functions, i.e. functions that are semantically
questionable as functions on eventually constant streams—it would
just mean we cannot directly observe them behaving badly.
The big downside of using this formulation is that the Eq
constraint
infects absolutely everything—we even end up with Eq
constraints
in places where we would not expect them (for example, on head :: River a -> a
), because the pattern synonym incurs an Eq
constraint
anywhere we use it, regardless of whether we are using it to construct
or destruct River
values. As you can see from the definition above,
we only do an equality check when using (::=)
to construct a
River
, not when using it to pattern-match, but there is no way to
give the pattern synonym different types in the two directions.Of course, we could make it a unidirectional pattern synonym and just make a differently named smart constructor, but that seems somewhat ugly, as we would have to remember which to use in which situation.
So, because this normalizing variant does not really go far enough in
removing our burden of proof, and has some big downsides in the form
of leaking Eq
constraints everywhere, I have chosen to stick with
the simpler (:::)
in this post. But I am still a bit unsure about this
choice; in fact, I went back and forth two times while writing.
We can at least provide a normalize
function, which we can use when
we want to ensure normalization:
normalize :: Eq a => River a -> River a
C a) = C a
normalize (::= as) = a ::= as normalize (a
With the preliminary definitions out of the way, we can now build up a
library of standard functions and instances for working with River a
values. To start, we can write an Eq
instance as follows:
instance Eq a => Eq (River a) where
C a == C b = a == b
::: as) == (b ::: bs) = a == b && as == bs (a
Notice that we only need two cases, not four: if we compare two values
whose finite prefixes are different lengths, the shorter one will
automatically expand (via matching on (:::)
) to the length of the
longer.
We already derived a Functor
instance; we can also define a “zippy”
Applicative
instance like so:
repeat :: a -> River a
repeat = C
instance Applicative River where
pure = repeat
C f <*> C x = C (f x)
::: fs) <*> (x ::: xs) = f x ::: (fs <*> xs)
(f
zipWith :: (a -> b -> c) -> River a -> River b -> River c
zipWith = liftA2
We can write safe head
, tail
, and index functions:
head :: River a -> a
head (a ::: _) = a
tail :: River a -> River a
tail (_ ::: as) = as
infixl 9 !!
(!!) :: River a -> Int -> a
C a !! _ = a
::: _) !! 0 = a
(a ::: as) !! n = as !! (n - 1) (_
We can also write take
and drop
variants. Note that take
returns a finite prefix of a River
, which is a list, not another
River
. The special case for drop _ (C a)
is not strictly
necessary, but makes it more efficient.
take :: Int -> River a -> [a]
take n _ | n <= 0 = []
take n (a ::: as) = a : take (n - 1) as
drop :: Int -> River a -> River a
drop n r | n <= 0 = r
drop _ (C a) = C a
drop n (_ ::: as) = drop (n - 1) as
There are many other such functions we could implement (e.g. span
,
dropWhile
, tails
…); if I eventually put this on Hackage I would
be sure to have a much more thorough selection of functions. Which
functions would you want to see?
River
How do we fold over a River a
? The Foldable
type class requires us
to define either foldMap
or foldr
; let’s think about foldMap
,
which would have type
foldMap :: Monoid m => (a -> m) -> River a -> m
However, this doesn’t really make sense. For example, suppose we have
a River Int
; if we had foldMap
with the above type, we could use
foldMap Sum
to turn our River Int
into a Sum Int
. But what is
the sum of an infinite stream of Int
? Unless the eventually
repeating part is C 0
, this is not well-defined. If we simply write
a function to add up all the Int
values in a River
, including
(once) the value contained in the final C
, this would be a good
example of a semantically “bad” function: it does not respect the law
C a === a ::: C a
. If we ensure River
values are always
normalized, we would not be able to directly observe anything amiss,
but the function still seems suspect.
Thinking about the law C a === a ::: C a
again is the key.
Supposing foldMap f (C a) = f a
(since it’s unclear what else it
could possibly do), applying foldMap
to both sides of the law we
obtain f a == f a <> f a
, that is, the combining operation must be
idempotent. This makes sense: with an idempotent operation,
continuing to apply the operation to the infinite constant tail will
not change the answer, so we can simply stop once we reach the C
.
We can create a subclass of Semigroup
to represent idempotent
semigroups, that is, semigroups for which a <> a = a
. There are
several idempotent semigroups in base
; we list a few below. Note
that since rivers are never empty, we can get away with just a
semigroup and not a monoid, since we do not need an identity value
onto which to map an empty structure.
class Semigroup m => Idempotent m
-- No methods, since Idempotent represents adding only a law,
-- namely, ∀ a. a <> a == a
-- Exercise for the reader: convince yourself that these are all
-- idempotent
instance Idempotent All
instance Idempotent Any
instance Idempotent Ordering
instance Ord a => Idempotent (Max a)
instance Ord a => Idempotent (Min a)
Now, although we cannot make a Foldable
instance, we can write our own
variant of foldMap
which requires an idempotent semigroup instead of
a monoid:
foldMap :: Idempotent m => (a -> m) -> River a -> m
foldMap f (C a) = f a
foldMap f (a ::: as) = f a <> foldMap f as
fold :: Idempotent m => River m -> m
= foldMap id fold
We can then instantiate it at some of the semigroups listed above to
get some useful folds. These are all guaranteed to terminate and
yield a sensible answer on any River
.
and :: River Bool -> Bool
and = getAll . foldMap All
or :: River Bool -> Bool
or = getAny . foldMap Any
all :: (a -> Bool) -> River a -> Bool
all f = and . fmap f
any :: (a -> Bool) -> River a -> Bool
any f = or . fmap f
maximum :: Ord a => River a -> a
maximum = getMax . foldMap Max
minimum :: Ord a => River a -> a
minimum = getMin . foldMap Min
lexicographic :: Ord a => River a -> River a -> Ordering
= fold $ zipWith compare xs ys lexicographic xs ys
We could make an instance Ord a => Ord (River a)
with compare = lexicographic
; however, in the next section I want to make a
different Ord
instance for a specific instantiation of River
.
Briefly, here’s the particular application I have in mind: infinite-precision two’s complement arithmetic, i.e. \(2\)-adic numbers. Chris Smith also wrote about \(2\)-adic numbers recently; however, unlike Chris, I am not interested in \(2\)-adic numbers in general, but only specifically those \(2\)-adic numbers which represent an embedded copy of \(\mathbb{Z}\). These are precisely the eventually constant ones: nonnegative integers are represented in binary as usual, with an infinite tail of \(0\) bits, and negative integers are represented with an infinite tail of \(1\) bits. For example, \(-1\) is represented as an infinite string of all \(1\)’s. The amazing thing about this representation (and the reason it is commonly used in hardware) is that the usual addition and multiplication algorithms continue to work without needing special cases to handle negative integers. If you’ve never seen how this works, you should definitely read about it.
data Bit = O | I deriving (Eq, Ord, Enum)
type Bits = River Bit
First, some functions to convert to and from integers. We only need
special cases for \(0\) and \(-1\), and beyond that it is just the usual
business with mod
and div
to peel off one bit at a time, or
multiplying by two and adding to build up one bit at a time. (I am a big fan of LambdaCase
.)
toBits :: Integer -> Bits
= \case
toBits 0 -> C O
-1 -> C I
-> toEnum (fromIntegral (n `mod` 2)) ::: toBits (n `div` 2)
n
fromBits :: Bits -> Integer
= \case
fromBits C O -> 0
C I -> -1
::: bs -> 2 * fromBits bs + fromIntegral (fromEnum b) b
For testing, we can also make a Show
instance. When it comes to
showing the infinite constant tail, I chose to repeat the bit 3 times
and then show an ellipsis; this is not really necessary but somehow
helps my brain more easily see whether it is an infinite tail of zeros
or ones.
instance Show Bits where
show = reverse . go
where
C b) = replicate 3 (showBit b) ++ "..."
go (::: bs) = showBit b : go bs
go (b
= ("01" P.!!) . fromEnum showBit
Let’s try it out:
ghci> toBits 26
...00011010
ghci> toBits (-30)
...11100010
ghci> fromBits (toBits (-30))
-30
ghci> quickCheck $ \x -> fromBits (toBits x) == x
+++ OK, passed 100 tests.
Let’s implement some arithmetic. First, incrementing. It is standard
except for a special case for C I
(without which, incrementing C I
would diverge). Notice that we use (::=)
instead of (:::)
, which
ensures our Bits
values remain normalized.
inc :: Bits -> Bits
= \case
inc C I -> C O
O ::= bs -> I ::= bs
I ::= bs -> O ::= inc bs
dec
is similar, just the opposite:
dec :: Bits -> Bits
= \case
dec C O -> C I
I ::= bs -> O ::= bs
O ::= bs -> I ::= dec bs
Then we can write inv
to invert all bits, and neg
as the
composition of inc
and inv
.
inv :: Bits -> Bits
= fmap $ \case { O -> I; I -> O }
inv
neg :: Bits -> Bits
= inc . inv neg
Trying it out:
λ> toBits 3
...00011
λ> neg it
...11101
λ> inc it
...1110
λ> inc it
...111
λ> inc it
...000
λ> inc it
...0001
λ> dec it
...000
λ> dec it
...111
Finally, addition, multiplication, and Ord
and Num
instances:
add :: Bits -> Bits -> Bits
= \cases
add C O) y -> y
(C O) -> x
x (C I) (C I) -> O ::= C I
(I ::= xs) (I ::= ys) -> O ::= inc (add xs ys)
(::= xs) (y ::= ys) -> (x .|. y) ::= add xs ys
(x where
I .|. _ = I
.|. y = y
_
mul :: Bits -> Bits -> Bits
= \cases
mul C O) _ -> C O
(C O) -> C O
_ (C I) y -> neg y
(C I) -> neg x
x (O ::= xs) ys -> O ::= mul xs ys
(I ::= xs) ys -> add ys (O ::= mul xs ys)
(
instance Ord Bits where
-- It's a bit mind-boggling that this works
compare (C x) (C y) = compare y x
compare (x ::= xs) (y ::= ys) = compare xs ys <> compare x y
instance Num Bits where
fromInteger = toBits
negate = neg
+) = add
(*) = mul
(abs = toBits . abs . fromBits
signum = toBits . signum . fromBits
λ> quickCheck $ withMaxSuccess 1000 $ \x y -> fromBits (mul (toBits x) (toBits y)) == x * y
+++ OK, passed 1000 tests.
λ> quickCheck $ \x y -> compare (toBits x) (toBits y) == compare x y
+++ OK, passed 100 tests.
Just for fun, let’s implement the Collatz map:
collatz :: Bits -> Bits
O ::= bs) = bs
collatz (@(I ::= _) = 3*bs + 1 collatz bs
λ> P.take 20 $ map fromBits (iterate collatz (toBits (-13)))
[-13,-38,-19,-56,-28,-14,-7,-20,-10,-5,-14,-7,-20,-10,-5,-14,-7,-20,-10,-5]
λ> P.take 20 $ map fromBits (iterate collatz (toBits 7))
[7,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1,4,2,1]
Is (:::)
or (::=)
the better default? It’s tempting to just say
“provide both and let the user decide”. I don’t disagree with that;
however, the question is which one we use to implement various basic
functions such as map
/fmap
. For example, if we use (:::)
, we
can make a Functor
instance, but values may not be normalized
after mapping.
Can we generalize from eventually constant to eventually periodic?
That is, instead of repeating the same value forever, we cycle
through a repeating period of some finite length. I think this
is possible, but it would make the implementation more
complex, and I don’t know the right way to generalize foldMap
. (We
could insist that it only works for commutative idempotent
semigroups, but in that case what’s the point of having a sequence
of values rather than just a set?)
Happy to hear any comments or suggestions!
In a previous post I challenged you to solve Factor-Full Tree. In this problem, we are given an unlabelled rooted tree, and asked to create a divisor labelling. That is, we must label the vertices with positive integers in such a way that \(v\) is an ancestor of \(u\) if and only if \(v\)’s label evenly divides \(u\)’s label.
For example, here is a tree with a divisor labelling:
Divisor labelling of a tree
The interesting point (though irrelevant to solving the problem) is
that this is a method for encoding a tree as a set of integers:
because \(v\) is an ancestor of \(u\) if and only if \(v\)’s label divides
\(u\)’s, all the information about the tree’s structure is fully
contained in the set of labels. For example, if we simply write
down the set \(\{1, 5, 6, 7, 12, 14, 21, 49, 63\}\), it is possible to
fully reconstruct the above tree from this set.Note that we
consider trees equivalent up to reordering of siblings, that is, each
node has a bag, not a list, of children.
This is not a
particularly efficient way to encode a tree, but it is certainly
interesting!
First, some basic setup.See here for the Scanner
abstraction, and
here
for the basics of how I organize solutions.
The first line of
input specifies the number of nodes \(N\), and after that there are
\(N-1\) lines, each specifying a single undirected edge.
import Control.Category ((>>>))
import Data.Bifunctor (second)
import Data.Map (Map, (!?))
import qualified Data.Map as M
import Data.Tuple (swap)
= C.interact $ runScanner tc >>> solve >>> format
main
data TC = TC { n :: Int, edges :: [Edge] }
deriving (Eq, Show)
tc :: Scanner TC
= do
tc <- int
n <- (n - 1) >< pair int int
edges return TC{..}
format :: [Integer] -> ByteString
= map showB >>> C.unwords format
We are guaranteed that the edges describe a tree; next we will actually build a tree data structure from the input.
There are many similar problems which specify a tree structure by giving a list of edges, so it’s worthwhile trying to write some generic code to transform such an input into an actual tree. In an imperative language we would do this by building a map from each node to its neighbors, then doing a DFS to orient the tree. Our Haskell code will be similar, except building the map and doing a DFS will both be one-liners!
First, a function to turn a list of undirected edges into a Map
associating each vertex to all its neighbors. It’s convenient to
decompose this into a function to turn a list of directed edges into
a Map
, and a function to duplicate and swap each pair. We won’t
need dirEdgesToMap
for this problem, but we can certainly imagine
wanting it elsewhere.
edgesToMap :: Ord a => [(a, a)] -> Map a [a]
= concatMap (\p -> [p, swap p]) >>> dirEdgesToMap
edgesToMap
dirEdgesToMap :: Ord a => [(a, a)] -> Map a [a]
= map (second (: [])) >>> M.fromListWith (++) dirEdgesToMap
Next, we can turn such a neighbor Map
into a tree. Rather than
returning a literal Tree
data structure, it’s convenient to
incorporate a tree fold: that is, given a function a -> [b] -> b
, a neighbor
map, and a root node, we fold over the whole tree and return the
resulting b
value. (Of course, if we want an actual Tree
we can use
mapToTree Node
.) We can also compose these into a single function edgesToTree
.
mapToTree :: Ord a => (a -> [b] -> b) -> Map a [a] -> a -> b
= dfs root root
mapToTree nd m root where
= nd root (maybe [] (map (dfs root) . filter (/= parent)) (m !? root))
dfs parent root
edgesToTree :: Ord a => (a -> [b] -> b) -> [(a, a)] -> a -> b
= mapToTree nd . edgesToMap edgesToTree nd
So how do we create a divisor labelling for a given tree? Clearly, we might as well choose the root to have label \(1\), and every time we descend from a parent to a child, we must multiply by some integer, which might as well be a prime. Of course, we need to multiply by a different prime for each sibling. We might at first imagine simply multiplying by 2 for each (arbitrarily chosen) leftmost child, 3 for each second child, 5 for each third child, and so on, but this does not work—the second child of the first child ends up with the same label as the first child of the second child, and so on.
Each node \(u\)’s label is some prime \(p\) times its parent’s label; call \(p\) the factor of node \(u\). It is OK for one child of \(u\) to also have factor \(p\), but the other children must get different factors. To be safe, we can give each additional child a new globally unique prime factor. This is not always necessary—in some cases it can be OK to reuse a factor if it does not lead to identically numbered nodes—but it is certainly sufficient. As an example, below is a divisor labelling of the example tree from before, via this scheme. Each edge is labelled with the factor of its child.
Divisor labelling of a tree with consecutive primes
Notice how we use \(2\) for the first child of the root, and \(3\) for the next child. \(3\)’s first child can also use a factor of \(3\), yielding a label of \(3^2 = 9\). \(3\)’s next child uses a new, globally unique prime \(5\), and its third child uses \(7\); the final child of \(1\) uses the next available prime, \(11\).
We can code this up via a simple stateful traversal of the tree. (For
primes
, see this
post.)
It’s a bit fiddly since we have to switch to the next prime between
consecutive children, but not after the last child.
primes :: [Integer]
= 2 : sieve primes [3 ..]
primes where
: ps) xs =
sieve (p let (h, t) = span (< p * p) xs
in h ++ sieve ps (filter ((/= 0) . (`mod` p)) t)
curPrime :: State [Integer] Integer
= gets head
curPrime
nextPrime :: State [Integer] ()
= modify tail
nextPrime
labelTree :: Tree a -> Tree (Integer, a)
= flip evalState primes . go 1
labelTree where
go :: Integer -> Tree a -> State [Integer] (Tree (Integer, a))
Node a ts) = Node (x, a) <$> labelChildren x ts
go x (
labelChildren :: Integer -> [Tree a] -> State [Integer] [Tree (Integer, a)]
= pure []
labelChildren _ [] : ts) = do
labelChildren x (t <- curPrime
p <- go (x * p) t
t' case ts of
-> pure [t']
[] -> do
_
nextPrime:) <$> labelChildren x ts (t'
There is a bit of additional glue code we need get the parsed tree
from the input, apply labelTree
, and then print out the node
labels in order. However, I’m not going to bother showing it,
because—this solution is not accepted! It fails with a WA (Wrong
Answer) verdict. What gives?
The key is one of the last sentences in the problem statement, which I haven’t mentioned so far: all the labels in our output must be at most \(10^{18}\). Why is this a problem? Multiplying by primes over and over again, it’s not hard to get rather large numbers. For example, consider the tree below:
Tree for which our naïve scheme generates labels that are too large
Under our scheme, the root gets label \(1\), and the children of the root get consecutive primes \(2, 3, 5, \dots, 29\). Then the nodes in the long chain hanging off the last sibling get labels \(29^2, 29^3, \dots, 29^{13}\), and \(29^{13}\) is too big—in fact, it is approximately \(10^{19}\). And this tree has only 23 nodes; in general the input can have up to 60.
Of course, \(29\) was a poor choice of factor for such a long chain—we should have instead labelled the long chain with powers of, say, 2. Notice that if we have a “tree” consisting of a single long chain of 60 nodes (and you can bet this is one of the secret test inputs!), we just barely get by labelling it with powers of two from \(2^0\) up to \(2^{59}\): in fact \(2^{59} < 10^{18} < 2^{60}\). So in general, we want to find a way to label long chains with small primes, and reserve larger primes for shorter chains.
One obvious approach is to simply sort the children at each node by decreasing height, before traversing the tree to assign prime factors. This handles the above example correctly, since the long chain would be sorted to the front and assigned the factor 2. However, this does not work in general! It can still fail to assign the smallest primes to the longest chains. As a simple example, consider this tree, in which the children of every node are already sorted by decreasing height from left to right:
Tree for which sorting by height first does not work
The straightforward traversal algorithm indeed assigns powers of 2 to the left spine of the tree, but it then assigns 3, 5, 7, and so on to all the tiny spurs hanging off it. So by the time we get to other long chain hanging off the root, it is assigned powers of \(43\), which are too big. In fact, we want to assign powers of 2 to the left spine, powers of 3 to the chain on the right, and then use the rest of the primes for all the short spurs. But this sort of “non-local” labelling means we can’t assign primes via a tree traversal.
To drive this point home, here’s another example tree. This one is small enough that it probably doesn’t matter too much how we label it, but it’s worth thinking about how to label the longest chains with the smallest primes. I’ve drawn it in a “left-leaning” style to further emphasize the different chains that are involved.
Tree with chains of various lengths
In fact, we want to assign the factor 2 to the long chain on the left; then the factor 3 to the second-longest chain, in the fourth column; then 5 to the length-6 chain in the second column; 7 to the length-3 chain all the way on the right; and finally 11 to the smallest chain, in column 3.
In general, then, we want a way to decompose an arbitrary tree into chains, where we repeatedly identify the longest chain, remove it from consideration, and then identify the longest chain from the remaining nodes, and so on. Once we have decomposed a tree into chains, it will be a relatively simple matter to sort the chains by length and assign consecutive prime factors.
This decomposition occasionally comes in handy (for example, see Floating Formation), and belongs to a larger family of important tree decomposition techniques such as heavy-light decomposition. Next time, I’ll demonstrate how to implement such tree decompositions in Haskell!
Recently, Dani Rybe wrote this really cool blog post (in turn based on this old post by Samuel Gélineau) about encoding truly unordered n-tuples in Haskell. This is something I thought about a long time ago in my work on combinatorial species, but I never came up with a way to represent them. Samuel and Dani’s solution is wonderful and clever and totally impractical, and I love it.
I won’t go into more detail than that; I’ll let you go read it if you’re interested. This blog post exists solely to respond to Dani’s statement towards the end of her post:
I’m not sure how to, for example, write a function that multiplies the inputs.
Challenge accepted!
primes :: [Int]
= 2 : sieve primes [3 ..]
primes where
: ps) xs =
sieve (p let (h, t) = span (< p * p) xs
in h ++ sieve ps (filter ((/= 0) . (`mod` p)) t)
mul :: [Int] -> Int
= unfuck mulU
mul where
mulU :: U n Int -> Int
= ufold 1 id (< 0) \(US neg nonNeg) ->
mulU * mulPos primes (abs <$> neg) * (-1) ^ ulen neg
mulNonNeg nonNeg
mulNonNeg :: U n Int -> Int
= ufold 1 id (== 0) \(US zero pos) ->
mulNonNeg if ulen zero > 0 then 0 else mulPos primes pos
mulPos :: [Int] -> U n Int -> Int
= ufold 1 id (== 1) \(US _ pos) -> mulGTOne ps pos
mulPos ps
mulGTOne :: [Int] -> U n Int -> Int
: ps) = ufold 1 id ((== 0) . (`mod` p)) \(US divP nondivP) ->
mulGTOne (p : ps) ((`div` p) <$> divP) * (p ^ ulen divP) * mulGTOne ps nondivP mulPos (p
Since every integer has a unique prime factorization, at each step we split the remaining numbers into those divisible by \(p\) and those not divisible by \(p\). For the ones that are, we divide out \(p\) from all of them, multiply by the appropriate power of \(p\), and recurse on what’s left; for those that are not, we move on to trying the next prime.
Dani also speculates about ubind :: U n (U m a) -> U (n :*: m) a
. I
believe in my heart this should be possible to implement, but after
playing with it a bit, I concluded it would require an astounding feat
of type-fu.
PS I’m working on getting comments set up here on my new blog… hopefully coming soon!
In a previous post I challenged you to solve Product Divisors. In this problem, we are given a sequence of positive integers \(a_1, \dots, a_n\), and we are asked to compute the total number of divisors of their product. For example, if we are given the numbers \(4, 2, 3\), then the answer should be \(8\), since \(4 \times 2 \times 3 = 24\) has the \(8\) distinct divisors \(1, 2, 3, 4, 6, 8, 12, 24\).
In general, if \(a\) has the prime factorization \(a = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}\) (where the \(p_i\) are all distinct primes), then the number of divisors of \(a\) is
\[(\alpha_1 + 1)(\alpha_2 + 1) \cdots (\alpha_k + 1),\]
since we can independently choose how many powers of each prime to include. There are \(\alpha_i + 1\) choices for \(p_i\) since we can choose anything from \(p_i^0\) up to \(p_i^{\alpha_i}\), inclusive.
So at a fundamental level, the solution is clear: factor each \(a_i\),
count up the number of copies of each prime in their product, then do
something like map (+1) >>> product
. We are also told the answer
should be given mod \(10^9 + 7\), so we can use aUsing Int
instead of Integer
here is OK as long as we are sure to be running
on a 64-bit system; multiplying two Int
values up to \(10^9 + 7\)
yields a result that still fits within a 64-bit signed Int
.
Otherwise (e.g. on Codeforces) we would have to use Integer
.
newtype
with a
custom Num
instance:
p :: Int
= 10^9 + 7
p
newtype M = M { unM :: Int } deriving (Eq, Ord)
instance Show M where show = show . unM
instance Num M where
fromInteger = M . (`mod` p) . fromInteger
M x + M y = M ((x + y) `mod` p)
M x - M y = M ((x - y) `mod` p)
M x * M y = M ((x * y) `mod` p)
Of course, I would not be writing about this problem if it were that
easy! If we try implementing the above solution idea in a
straightforward way—for example, if we take the simple factoring code from this blog
post
and then do something like map factor >>> M.unionsWith (+) >>> M.elems >>> map (+1) >>> product
, we get the dreaded Time Limit Exceeded.
Why doesn’t this work? I haven’t mentioned how many integers might be in the input: in fact, we might be given as many as one million (\(10^6\))! We need to be able to factor each number very quickly if we’re going to finish within the one second time limit. Factoring each number from scratch by trial division is simply too slow.
While more sophisticated methods are needed to factor a single number more quickly than trial division, there is a standard technique we can use to speed things up when we need to factor many numbers. We can use a sieve to precompute a lookup table, which we can then use to factor numbers very quickly.
In particular, we will compute a table \(\mathit{smallest}\) such that \(\mathit{smallest}[i]\) will store the smallest prime factor of \(i\). Given this table, to factor a positive integer \(i\), we simply look up \(\mathit{smallest}[i] = p\), add it to the prime factorization, then recurse on \(i/p\); the base case is when \(i = 1\).
How do we compute \(\mathit{smallest}\)? The basic idea is to create an
array of size \(n\), initializing it with \(\mathit{smallest}[k] = k\). For each \(k\) from \(2\) up to \(n\),We could optimize this even
further via the approach in this blog
post, which takes \(O(n)\)
rather than \(O(n \lg n)\) time, but it would complicate our Haskell
quite a bit and it’s not needed for solving this problem.
if
\(\mathit{smallest}[k]\) is still equal to \(k\), then \(k\) must be prime;
iterate through multiples of \(k\) (starting with \(k^2\), since any
smaller multiple of \(k\) is already divisible by a smaller prime) and
set each \(\mathit{smallest}[ki]\) to the minimum of \(k\) and whatever
value it had before.
This is one of those cases where for efficiency’s sake, we actually
want to use an honest-to-goodness mutable array. Immutable arrays are
not a good fit for sieving, and using something like a Map
would
introduce a lot of overhead that we would rather avoid. However, we
only need the table to be mutable while we are computing it; after
that, it should just be an immutable lookup table. This is a great fit
for an STUArray
:Note that as of this writing, the version of the
array
library installed in the Kattis environment does not have
modifyArray'
, so we actually have to do readArray
followed by
writeArray
.
= 1000000
maxN
smallest :: UArray Int Int
= runSTUArray $ do
smallest <- newListArray (2,maxN) [2 ..]
a 2 .. maxN] $ \k -> do
forM_ [<- readArray a k
k' == k') $ do
when (k *k, k*(k+1) .. maxN] $ \n ->
forM_ [kmin k)
modifyArray' a n (return a
Haskell, the world’s finest imperative programming language!
We can now write a new factor
function that works by repeatedly
looking up the smallest prime factor:
factor :: Int -> Map Int Int
= \case
factor 1 -> M.empty
-> M.insertWith (+) p 1 (factor (n `div` p))
n where
= smallest!n p
And now we can just do map factor >>> M.unionsWith (+) >>> M.elems >>> map (+1) >>> product
as before, but since our factor
is so much faster this time, it
should…
What’s that? Still TLE? Sigh.
Unfortunately, creating a bunch of Map
values and then doing
unionsWith
one million times still introduces way too much overhead.
For many problems working with Map
(which is impressively fast) is
good enough, but not in this case. Instead of returning a Map
from
each call to factor
and then later combining them, we can write a version of
factor
that directly increments counters for each prime in a
mutable array:
factor :: STUArray s Int Int -> Int -> ST s ()
= go n
factor counts n where
1 = return ()
go = do
go n let p = smallest!n
+1)
modifyArray' counts p (`div` p) go (n
Then we have the following top-level solution, which is finally fast enough:
main :: IO ()
= C.interact $ runScanner (numberOf int) >>> solve >>> showB
main
solve :: [Int] -> M
= counts >>> elems >>> map ((+1) >>> M) >>> product
solve
counts :: [Int] -> UArray Int Int
= runSTUArray $ do
counts ns <- newArray (2,maxN) 0
cs
forM_ ns (factor cs)return cs
This solution runs in just over 0.4s for me. Considering that this is
only about 4x slower than the fastest solution (0.09s, in C++), I’m
pretty happy with it! We did have to sacrifice a bit of elegance for
speed, especially with the factor
and counts
functions instead of
M.unionsWith
, but in the end it’s not too bad.
I thought we might be able to make this even faster by using a strict
fold over the counts
array instead of converting to a list with
elems
and then doing a map
and a product
, but (1) there is no
generic fold operation on UArray
, and (2) I trust that GHC is
already doing a pretty good job optimizing this via list fusion.
Next time I’ll write about my solution to the other challenge problem, Factor-Full Tree. Until then, give it a try!
This Saturday, June 15, we will have the third Swarm swarm, i.e. collaborative virtual hackathon. Details can be found here on the Swarm wiki.
As a reminder, Swarm is a 2D, open-world programming and resource gathering game, implemented in Haskell, with a strongly-typed, functional programming language and a unique upgrade system. Unlocking language features is tied to collecting resources, making it an interesting challenge to bootstrap your way into the use of the full language.
I haven’t written here in a while—partly due to being busy, but also partly due to getting sick of Wordpress and deciding it was finally time to rebuild my blog from scratch using Hakyll. I still haven’t quite worked out what I’m doing about comments (I looked into Isso but haven’t gotten it to work yet—if you have used it successfully, let me know!).
For today I have two hard competitive programming challenge problems for you. Both involve some number theory, and both are fairly challenging, but that’s about all they have in common!
Since there are no comments (for now), feel free to email me with your thoughts. I’ll post my solutions (with commentary) in a later post or two!
The Swarm development team is very proud to announce the latest release of the game. This should still be considered a development/preview release—you still can’t save your games—but it’s made some remarkable progress and there are lots of fun things to try.
As a reminder, Swarm is a 2D, open-world programming and resource gathering game with a strongly-typed, functional programming language and a unique upgrade system. Unlocking language features is tied to collecting resources, making it an interesting challenge to bootstrap your way into the use of the full language. It has also become a flexible and powerful platform for constructing programming challenges.
A few of the most significant new features are highlighted below; for full details, see the release notes. If you just want to try it out, see the installation instructions.
The default play mode is the open-world, resource-gathering scenario—but Swarm also supports “challenge scenarios”, where you have to complete one or more specific objectives with given resources on a custom map. There are currently 58 scenarios and counting—some are silly proofs of concept, but many are quite fun and challenging! I especially recommend checking out the Ranching
and Sokoban
scenarios, as well as A Frivolous Excursion
(pictured below). And creating new scenarios is a great way you can contribute to Swarm even if you don’t know Haskell, or aren’t comfortable hacking on the codebase.
Recently, a large amount of work has gone into expanding the possibilities for scenario design:
stride
, detect
, sniff
, chirp,
resonate
, watch
, surveil
, scout
, instant
, push
, density
, use
, halt
, and backup
.
In the past, entity and goal descriptions were simply plain text; recently, we switched to actually parsing Markdown. Partly, this is just to make things look nice, since we can highlight code snippets, entity names, etc.:
But it also means that we can now validate all code examples and entity names, and even test that the tutorial is pedagogically sound: any command used in a tutorial solution must be mentioned in a previous tutorial, or else our CI fails!
There are also a number of other small UI enhancements, such as improved type error messages, inventory search, and a collapsible REPL panel, among others.
We now keep track of a number of metrics related to challenge scenario solutions, such as total time, total game ticks, and code size. These metrics are tracked and saved across runs, so you can compete with yourself, and with others. For now, see these wiki pages:
In the future, perhaps there will eventually be some kind of social website with leaderboards and user-uploaded scenarios.
Last but not least, we now have an integrated single-stepping and debugging mode (enabled by the tweezers
device).
To install, check out the installation instructions: you can download a binary release (for now, Linux only, but MacOS binaries should be on the horizon), or install from Hackage. Give it a try and send us your feedback, either via a github issue or IRC!
We’re still hard at work on the game. Fun upcoming things include:
import
construct
Of course, there are also tons of small things that need fixing and polishing too! If you’re interested in getting involved, check out our contribution guide, come join us on IRC (#swarm
on Libera.Chat), or take a look at the list of issues marked “low-hanging fruit”.
Brought to you by the Swarm development team:
With contributions from:
…not to mention many others who gave valuable suggestions and feedback. Want to see your name listed here in the next release? See how you can contribute!
tl;dr: How to compile a functional language via combinators (and evaluate via the Haskell runtime) while keeping the entire process type-indexed, with a bibliography and lots of references for further reading
There is a long history, starting with Schönfinkel and Curry, of abstracting away variable names from lambda calculus terms by converting to combinators, aka bracket abstraction. This was popular in the 80’s as a compilation technique for functional languages (Turner, 1979; Augustsson, 1986; Jones, 1987; Diller, 1988), then apparently abandoned. More recently, however, it has been making a bit of a comeback. For example, see Naylor (2008), Gratzer (2015), Lynn (2017), and Mahler (2021). Bracket abstraction is intimately related to compiling to cartesian closed categories (Elliott, 2017; Mahler, 2021), and also enables cool tricks like doing evaluation via the Haskell runtime system (Naylor, 2008; Seo, 2016; Mahler, 2022).
However, it always bothered me that the conversion to combinators was invariably described in an untyped way. Partly to gain some assurance that we are doing things correctly, but mostly for fun, I wondered if it would be possible to do the whole pipeline in an explicitly type-indexed way. I eventually found a nice paper by Oleg Kiselyov (2018) which explains exactly how to do it (it even came with OCaml code that I was easily able to port to Haskell!).
In this blog post, I:
This blog post is rendered automatically from a literate Haskell file; you can find the complete working source code and blog post on GitHub. I’m always happy to receive comments, fixes, or suggestions for improvement.
So many yummy language extensions.
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ExplicitForAll #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ImportQualifiedPost #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UnicodeSyntax #-}
{-# LANGUAGE ViewPatterns #-}
module TypedCombinators where
import Control.Monad.Combinators.Expr
import Data.Functor.Const qualified as F
import Data.Void
import Data.Text ( Text )
import Data.Text qualified as T
import Data.Kind (Type)
import Data.Type.Equality ( type (:~:)(Refl), TestEquality(..) )
import Text.Megaparsec
import Text.Megaparsec.Char
import Text.Megaparsec.Char.Lexer qualified as L
import Witch (into)
import Prelude hiding (lookup)
Here’s an algebraic data type to represent raw terms of our DSL, something which might come directly out of a parser. The exact language we use here isn’t all that important; I’ve put in just enough features to make it nontrivial, but not much beyond that. We have integer literals, variables, lambdas, application, let
and if
expressions, addition, and comparison with >
. Of course, it would be easy to add more types, constants, and language features.
data Term where
Lit :: Int -> Term
Var :: Text -> Term
Lam :: Text -> Ty -> Term -> Term
App :: Term -> Term -> Term
Let :: Text -> Term -> Term -> Term
If :: Term -> Term -> Term -> Term
Add :: Term -> Term -> Term
Gt :: Term -> Term -> Term
deriving Show
A few things to note:
In order to keep things simple, notice that lambdas must be annotated with the type of the argument. There are other choices we could make, but this is the simplest for now. I’ll have more to say about other choices later.
I included if
not only because it gives us something to do with Booleans, but also because it is polymorphic, which adds an interesting twist to our typechecking.
I included >
, not only because it gives us a way to produce Boolean values, but also because it uses ad-hoc polymorphism, that is, we can compare at any type which is an instance of Ord
. This is an even more interesting twist.
Here are our types: integers, booleans, and functions.
data Ty where
TyInt :: Ty
TyBool :: Ty
TyFun :: Ty -> Ty -> Ty
deriving Show
Finally, here’s an example term that uses all the features of our language (I’ve included a simple parser in an appendix at the end of this post):
example :: Term
example = readTerm $ T.unlines
[ "let twice = \\f:Int -> Int. \\x:Int. f (f x) in"
, "let z = 1 in"
, "if 7 > twice (\\x:Int. x + 3) z then z else z + 1"
]
Since 7 is not, in fact, strictly greater than 1 + 3 + 3, this should evaluate to 2.
That was the end of our raw, untyped representations—from now on, everything is going to be type-indexed! First of all, we’ll declare an enumeration of constants, with each constant indexed by its corresponding host language type. These will include both any special language built-ins (like if
, +
, and >
) as well as a set of combinators which we’ll be using as a compilation target—more on these later.
data Const :: Type -> Type where
CInt :: Int -> Const Int
CIf :: Const (Bool -> α -> α -> α)
CAdd :: Const (Int -> Int -> Int)
CGt :: Ord α => Const (α -> α -> Bool)
I :: Const (α -> α)
K :: Const (α -> b -> α)
S :: Const ((α -> b -> c) -> (α -> b) -> α -> c)
B :: Const (( b -> c) -> (α -> b) -> α -> c)
C :: Const ((α -> b -> c) -> b -> α -> c)
deriving instance Show (Const α)
The polymorphism of if
(and the combinators I
, K
, etc., for that matter) poses no real problems. If we really wanted the type of CIf
to be indexed by the exact type of if
, it would be something like
CIf :: Const (∀ α. Bool -> α -> α -> α)
but this would require impredicative types which can be something of a minefield. However, what we actually get is
CIf :: ∀ α. Const (Bool -> α -> α -> α)
which is unproblematic and works just as well for our purposes.
The type of CGt
is more interesting: it includes an Ord α
constraint. That means that at the time we construct a CGt
value, we must have in scope an Ord
instance for whatever type α
is; conversely, when we pattern-match on CGt
, we will bring that instance into scope. We will see how to deal with this later.
For convenience, we make a type class HasConst
for type-indexed things that can contain embedded constants (we will end up with several instances of this class).
class HasConst t where
embed :: Const α -> t α
Also for convenience, here’s a type class for type-indexed things that support some kind of application operation. (Note that we don’t necessarily want to require t
to support a pure :: a -> t a
operation, or even be a Functor
, so using Applicative
would not be appropriate, even though $$
has the same type as <*>
.)
infixl 1 $$
class Applicable t where
($$) :: t (α -> β) -> t α -> t β
Note that, unlike the standard $
operator, $$
is left-associative, so, for example, f $$ x $$ y
should be read just like f x y
, that is, f $$ x $$ y = (f $$ x) $$ y
.
Finally, we’ll spend a bunch of time applying constants to things, or applying things to constants, so here are a few convenience operators for combining $$
and embed
:
infixl 1 .$$
(.$$) :: (HasConst t, Applicable t) => Const (α -> β) -> t α -> t β
c .$$ t = embed c $$ t
infixl 1 $$.
($$.) :: (HasConst t, Applicable t) => t (α -> β) -> Const α -> t β
t $$. c = t $$ embed c
infixl 1 .$$.
(.$$.) :: (HasConst t, Applicable t) => Const (α -> β) -> Const α -> t β
c1 .$$. c2 = embed c1 $$ embed c2
Now let’s build up our type-indexed core language. First, we’ll need a data type for type-indexed de Bruijn indices. A value of type Idx γ α
is a variable with type α
in the context γ
(represented as a type-level list of types). For example, Idx [Int,Bool,Int] Int
would represent a variable of type Int
(and hence must either be variable 0 or 2).
data Idx :: [Type] -> Type -> Type where
VZ :: Idx (α ': γ) α
VS :: Idx γ α -> Idx (β ': γ) α
deriving instance Show (Idx γ α)
Now we can build our type-indexed terms. Just like variables, terms are indexed by a typing context and a type; t : TTerm γ α
can be read as “t
is a term with type α
, possibly containing variables whose types are described by the context γ
”. Our core language has only variables, constants, lambdas, and application. Note we’re not just making a type-indexed version of our original term language; for simplicity, we’re going to simultaneously typecheck and elaborate down to this much simpler core language. (Of course, it would also be entirely possible to introduce another intermediate data type for type-indexed terms, and separate the typechecking and elaboration phases.)
data TTerm :: [Type] -> Type -> Type where
TVar :: Idx γ α -> TTerm γ α
TConst :: Const α -> TTerm γ α
TLam :: TTerm (α ': γ) β -> TTerm γ (α -> β)
TApp :: TTerm γ (α -> β) -> TTerm γ α -> TTerm γ β
deriving instance Show (TTerm γ α)
instance Applicable (TTerm γ) where
($$) = TApp
instance HasConst (TTerm γ) where
embed = TConst
Now for some type-indexed types!
data TTy :: Type -> Type where
TTyInt :: TTy Int
TTyBool :: TTy Bool
(:->:) :: TTy α -> TTy β -> TTy (α -> β)
deriving instance Show (TTy ty)
TTy
is a term-level representation of our DSL’s types, indexed by corresponding host language types. In other words, TTy
is a singleton: for a given type α
there is a single value of type TTy α
. Put another way, pattern-matching on a value of type TTy α
lets us learn what the type α
is. (See (Le, 2017) for a nice introduction to the idea of singleton types.)
We will need to be able to test two value-level type representations for equality and have that reflected at the level of type indices; the TestEquality
class from Data.Type.Equality
is perfect for this. The testEquality
function takes two type-indexed things and returns a type equality proof wrapped in Maybe
.
instance TestEquality TTy where
testEquality :: TTy α -> TTy β -> Maybe (α :~: β)
testEquality TTyInt TTyInt = Just Refl
testEquality TTyBool TTyBool = Just Refl
testEquality (α₁ :->: β₁) (α₂ :->: β₂) =
case (testEquality α₁ α₂, testEquality β₁ β₂) of
(Just Refl, Just Refl) -> Just Refl
_ -> Nothing
testEquality _ _ = Nothing
Recall that the CGt
constant requires an Ord
instance; the checkOrd
function pattern-matches on a TTy
and witnesses the fact that the corresponding host-language type has an Ord
instance (if, in fact, it does).
checkOrd :: TTy α -> (Ord α => r) -> Maybe r
checkOrd TTyInt r = Just r
checkOrd TTyBool r = Just r
checkOrd _ _ = Nothing
As a quick aside, for simplicity’s sake, I am going to use Maybe
throughout the rest of this post to indicate possible failure. In a real implementation, one would of course want to return more information about any error(s) that occur.
Sometimes we will need to wrap type-indexed things inside an existential wrapper to hide the type index. For example, when converting from a Ty
to a TTy
, or when running type inference, we can’t know in advance which type we’re going to get. So we create the Some
data type which wraps up a type-indexed thing along with a corresponding TTy
. Pattern-matching on the singleton TTy
will allow us to recover the type information later.
data Some :: (Type -> Type) -> Type where
Some :: TTy α -> t α -> Some t
mapSome :: (∀ α. s α -> t α) -> Some s -> Some t
mapSome f (Some α t) = Some α (f t)
The first instantiation we’ll create is an existentially wrapped type, where the TTy
itself is the only thing we care about, and the corresponding t
will just be the constant unit type functor. It would be annoying to keep writing F.Const ()
everywhere so we create some type and pattern synonyms for convenience.
type SomeTy = Some (F.Const ())
pattern SomeTy :: TTy α -> SomeTy
pattern SomeTy α = Some α (F.Const ())
{-# COMPLETE SomeTy #-}
The someType
function converts from a raw Ty
to a type-indexed TTy
, wrapped up in an existential wrapper.
someType :: Ty -> SomeTy
someType TyInt = SomeTy TTyInt
someType TyBool = SomeTy TTyBool
someType (TyFun a b) = case (someType a, someType b) of
(SomeTy α, SomeTy β) -> SomeTy (α :->: β)
Now that we have our type-indexed core language all set, it’s time to do type inference, that is, translate from untyped terms to type-indexed ones! First, let’s define type contexts, i.e. mappings from variables to their types. We store contexts simply as a (fancy, type-indexed) list of variable names paired with their types. This is inefficient—it takes linear time to do a lookup—but we don’t care, because this is an intermediate representation used only during typechecking. By the time we actually get around to running terms, variables won’t even exist any more.
data Ctx :: [Type] -> Type where
-- CNil represents an empty context.
CNil :: Ctx '[]
-- A cons stores a variable name and its type,
-- and then the rest of the context.
(:::) :: (Text, TTy α) -> Ctx γ -> Ctx (α ': γ)
Now we can define the lookup
function, which takes a variable name and a context and tries to return a corresponding de Bruijn index into the context. When looking up a variable name in the context, we can’t know in advance what index we will get and what type it will have, so we wrap the returned Idx
in Some
.
lookup :: Text -> Ctx γ -> Maybe (Some (Idx γ))
lookup _ CNil = Nothing
lookup x ((y, α) ::: ctx)
| x == y = Just (Some α VZ)
| otherwise = mapSome VS <$> lookup x ctx
Now we’re finally ready to define the infer
function! It takes a type context and a raw term, and tries to compute a corresponding type-indexed term. Note that there’s no particular guarantee that the term we return corresponds to the input term—we will just have to be careful—but at least the Haskell type system guarantees that we can’t return a type-incorrect term, which is especially important when we have some nontrivial elaboration to do. Of course, just as with variable lookups, when inferring the type of a term we can’t know in advance what type it will have, so we will need to return an existential wrapper around a type-indexed term.
infer :: Ctx γ -> Term -> Maybe (Some (TTerm γ))
infer ctx = \case
To infer the type of a literal integer value, just return TTyInt
with a literal integer constant.
Lit i -> return $ Some TTyInt (embed (CInt i))
To infer the type of a variable, look it up in the context and wrap the result in TVar
. Notice how we are allowed to pattern-match on the Some
returned from lookup
(revealing the existentially quantified type inside) since we immediately wrap it back up in another Some
when returning the TVar
.
Var x -> mapSome TVar <$> lookup x ctx
To infer the type of a lambda, we convert the argument type annotation to a type-indexed type, infer the type of the body under an extended context, and then return a lambda with an appropriate function type. (If lambdas weren’t required to have type annotations, then we would either have to move the lambda case to the check
function, or else use unification variables and solve type equality constraints. The former would be straightforward, but I don’t know how to do the latter in a type-indexed way—sounds like a fun problem for later.)
Lam x a t -> do
case someType a of
Some α _ -> do
Some β t' <- infer ((x,α) ::: ctx) t
return $ Some (α :->: β) (TLam t')
To infer the type of an application, we infer the type of the left-hand side, ensure it is a function type, and check
that the right-hand side has the correct type. We will see the check
function later.
App t1 t2 -> do
Some τ t1' <- infer ctx t1
case τ of
α :->: β -> do
t2' <- check ctx α t2
return $ Some β (TApp t1' t2')
_ -> Nothing
To infer the type of a let
-expression, we infer the type of the definition, infer the type of the body under an extended context, and then desugar it into an application of a lambda. That is, let x = t1 in t2
desugars to (\x.t2) t1
.
Let x t1 t2 -> do
Some α t1' <- infer ctx t1
Some β t2' <- infer ((x, α) ::: ctx) t2
return $ Some β (TApp (TLam t2') t1')
Note again that we can’t accidentally get mixed up here—for example, if we incorrectly desugar to (\x.t1) t2
we get a Haskell type error, like this:
• Couldn't match type ‘γ’ with ‘α : γ’
Expected: TTerm γ α1
Actual: TTerm (α : γ) α1
To infer an if
-expression, we can check that the test has type Bool
, infer the types of the two branches, and ensure that they are the same. If so, we return the CIf
constant applied to the three arguments. The reason this typechecks is that pattern-matching on the Refl
from the testEquality
call brings into scope the fact that the types of t2
and t3
are equal, so we can apply CIf
which requires them to be so.
If t1 t2 t3 -> do
t1' <- check ctx TTyBool t1
Some α t2' <- infer ctx t2
Some β t3' <- infer ctx t3
case testEquality α β of
Nothing -> Nothing
Just Refl -> return $ Some α (CIf .$$ t1' $$ t2' $$ t3')
Addition is simple; we just check that both arguments have type Int
.
Add t1 t2 -> do
t1' <- check ctx TTyInt t1
t2' <- check ctx TTyInt t2
return $ Some TTyInt (CAdd .$$ t1' $$ t2')
“Greater than” is a bit interesting because we allow it to be used at both Int
and Bool
. So, just as with if
, we must infer the types of the arguments and check that they match. But then we must also use the checkOrd
function to ensure that the argument types are an instance of Ord
. In particular, we wrap CGt
(which requires an Ord
constraint) in a call to checkOrd α
(which provides one).
Gt t1 t2 -> do
Some α t1' <- infer ctx t1
Some β t2' <- infer ctx t2
case testEquality α β of
Nothing -> Nothing
Just Refl -> (\c -> Some TTyBool (c .$$ t1' $$ t2')) <$> checkOrd α CGt
Finally, here’s the check
function: to check that an expression has an expected type, just infer its type and make sure it’s the one we expected. (With more interesting languages we might also have more cases here for terms which can be checked but not inferred.) Notice how this also allows us to return the type-indexed term without using an existential wrapper, since the expected type is an input.
check :: Ctx γ -> TTy α -> Term -> Maybe (TTerm γ α)
check ctx α t = do
Some β t' <- infer ctx t
case testEquality α β of
Nothing -> Nothing
Just Refl -> Just t'
Putting this all together so far, we can check that the example
term has type Int
and see what it elaborates to (I’ve included a simple pretty-printer for TTerm
in an appendix):
λ> putStrLn . pretty . fromJust . check CNil TTyInt $ example
(λ. (λ. if (gt 7 (x1 (λ. plus x0 3) x0)) x0 (plus x0 1)) 1) (λ. λ. x1 (x1 x0))
We can now easily write an interpreter. However, this is pretty inefficient (it has to carry around an environment and do linear-time variable lookups), and later we’re going to compile our terms directly to host language terms. So this interpreter is just a nice aside, for fun and testing.
With that said, given a closed term, we can interpret it directly to a value of its corresponding host language type. We need typed environments and a indexing function (note that for some reason GHC can’t see that the last case of the indexing function is impossible; if we tried implementing it in, say, Agda, we wouldn’t have to write that case).
data Env :: [Type] -> Type where
ENil :: Env '[]
ECons :: α -> Env γ -> Env (α ': γ)
(!) :: Env γ -> Idx γ α -> α
(ECons x _) ! VZ = x
(ECons _ e) ! (VS x) = e ! x
ENil ! _ = error "GHC can't tell this is impossible"
Now the interpreter is straightforward. Look how beautifully everything works out with the type indexing.
interpTTerm :: TTerm '[] α -> α
interpTTerm = go ENil
where
go :: Env γ -> TTerm γ α -> α
go e = \case
TVar x -> e ! x
TLam body -> \x -> go (ECons x e) body
TApp f x -> go e f (go e x)
TConst c -> interpConst c
interpConst :: Const α -> α
interpConst = \case
CInt i -> i
CIf -> \b t e -> if b then t else e
CAdd -> (+)
CGt -> (>)
K -> const
S -> (<*>)
I -> id
B -> (.)
C -> flip
λ> interpTTerm . fromJust . check CNil TTyInt $ example
2
Now, on with the main attraction! It’s well-known that certain sets of combinators are Turing-complete: for example, SKI is the most well-known complete set (or just SK if you’re trying to be minimal). There are well-known algorithms for compiling lambda calculus terms into combinators, known generally as bracket abstraction (for further reading about bracket abstraction in general, see Diller (2014); for some in-depth history along with illustrative Haskell code, see Ben Lynn’s page on Combinatory Logic (2022); for nice example implementations in Haskell, see blog posts by Gratzer (2015), Seo (2016), and Mahler (2021).)
So the idea is to compile our typed core language down to combinators. The resulting terms will have no lambdas or variables—only constants and application! The point is that by making environments implicit, with a few more tricks we can make use of the host language runtime’s ability to do beta reduction, which will be much more efficient than our interpreter.
The BTerm
type below will be the compilation target. Again for illustration and/or debugging we can easily write a direct interpreter for BTerm
—but this still isn’t the intended code path. There will still be one more step to convert BTerm
s directly into host language terms.
data BTerm :: Type -> Type where
BApp :: BTerm (α -> β) -> BTerm α -> BTerm β
BConst :: Const α -> BTerm α
deriving instance Show (BTerm ty)
instance Applicable BTerm where
($$) = BApp
instance HasConst BTerm where
embed = BConst
interpBTerm :: BTerm ty -> ty
interpBTerm (BApp f x) = interpBTerm f (interpBTerm x)
interpBTerm (BConst c) = interpConst c
We will use the usual SKI combinators as well as B
and C
, which are like special-case variants of S
:
S x y z = x z (y z)
B x y z = x (y z)
C x y z = x z (y )
S
handles the application of x
to y
in the case where they both need access to a shared parameter z
; B
and C
are similar, but B
is used when only y
, and not x
, needs access to z
, and C
is for when only x
needs access to z
. Using B
and C
will allow for more efficient encodings than would be possible with S
alone. If you want to compile a language with recursion you can also easily add the usual Y
combinator (“SICKBY
”), although the example language in this post has no recursion so we won’t use it.
Bracket abstraction is often presented in an untyped way, but I found this really cool paper by Oleg Kiselyov (2018) where he shows how to do bracket abstraction in a completely compositional, type-indexed way. I found the paper a bit hard to understand, but fortunately it came with working OCaml code! Translating it to Haskell was straightforward. Much later, after writing most of this blog post, I found a a nice explanation of Kiselyov’s paper by Lynn (2022) which helped me make more sense of the paper.
First, a data type for open terms, which represent an intermediate stage in the bracket abstraction algorithm, where some parts have been converted to closed combinator terms (the E
constructor embeds BTerm
values), and some parts still have not. This corresponds to Kiselyov’s eta-optimized version (section 4.1 of the paper). A simplified version that does not include V
is possible, but results in longer combinator expressions.
data OTerm :: [Type] -> Type -> Type where
-- E contains embedded closed (i.e. already abstracted) terms.
E :: BTerm α -> OTerm γ α
-- V represents a reference to the innermost/top environment
-- variable, i.e. Z
V :: OTerm (α ': γ) α
-- N represents internalizing the innermost bound variable as a
-- function argument. In other words, we can represent an open
-- term referring to a certain variable as a function which
-- takes that variable as an argument.
N :: OTerm γ (α -> β) -> OTerm (α ': γ) β
-- For efficiency, there is also a special variant of N for the
-- case where the term does not refer to the topmost variable at
-- all.
W :: OTerm γ β -> OTerm (α ': γ) β
instance HasConst (OTerm γ) where
embed = E . embed
Now for the bracket abstraction algorithm. First, a function to do type- and environment-preserving conversion from TTerm
to OTerm
. The conv
function handles the variable, lambda, and constant cases. The application case is handled by the Applicable
instance.
conv :: TTerm γ α -> OTerm γ α
conv = \case
TVar VZ -> V
TVar (VS x) -> W (conv (TVar x))
TLam t -> case conv t of
V -> E (embed I)
E d -> E (K .$$ d)
N e -> e
W e -> K .$$ e
TApp t1 t2 -> conv t1 $$ conv t2
TConst c -> embed c
The Applicable
instance for OTerm
has 15 cases—one for each combination of OTerm
constructors. Why not 16, you ask? Because the V $$ V
case is impossible (exercise for the reader: why?). The cool thing is that GHC can tell that case would be ill-typed, and agrees that this definition is total—that is, it does not give a non-exhaustive pattern match warning. This is a lot of code, but understanding each individual case is not too hard if you understand the meaning of the constructors E
, V
, N
, and W
. For example, if we have one term that ignores the innermost bound variable being applied to another term that also ignores the innermost bound variable (W e1 $$ W e2
), we can apply one term to the other and wrap the result in W
again (W (e1 $$ e2)
). Other cases use the combinators B
, C
, S
to route the input to the proper places in an application.
instance Applicable (OTerm γ) where
($$) :: OTerm γ (α -> β) -> OTerm γ α -> OTerm γ β
W e1 $$ W e2 = W (e1 $$ e2)
W e $$ E d = W (e $$ E d)
E d $$ W e = W (E d $$ e)
W e $$ V = N e
V $$ W e = N (E (C .$$. I) $$ e)
W e1 $$ N e2 = N (B .$$ e1 $$ e2)
N e1 $$ W e2 = N (C .$$ e1 $$ e2)
N e1 $$ N e2 = N (S .$$ e1 $$ e2)
N e $$ V = N (S .$$ e $$. I)
V $$ N e = N (E (S .$$. I) $$ e)
E d $$ N e = N (E (B .$$ d) $$ e)
E d $$ V = N (E d)
V $$ E d = N (E (C .$$. I $$ d))
N e $$ E d = N (E (C .$$. C $$ d) $$ e)
E d1 $$ E d2 = E (d1 $$ d2)
The final bracket abstraction algorithm consists of calling conv
on a closed TTerm
—this must result in a term of type OTerm ’[] α
, and the only constructor which could possibly produce such a type is E
, containing an embedded BTerm
. So we can just extract that BTerm
, and GHC can see that this is total.
bracket :: TTerm '[] α -> BTerm α
bracket t = case conv t of { E t' -> t' }
Let’s apply this to our example
term and see what we get:
λ> putStrLn . pretty . bracket . fromJust . check CNil TTyInt $ example
C C 1 (C C (C C 1 plus) (B S (C C I (B S (B (B if) (B (B (gt 7)) (C I (C C 3 plus)))))))) (S B I)
λ> interpBTerm . bracket . fromJust . check CNil TTyInt $ example
2
Neat! This is not too much longer than the original term, which is the point of using the optimized version. Interestingly, this example happens to not use K
at all, but a more complex term certainly would.
Kiselyov also presents an even better algorithm using \(n\)-ary combinators which uses guaranteed linear time and space. For simplicity, he presents it in an untyped way and claims in passing that it “can be backported to the typed case”, though I am not aware of anyone who has actually done this yet (perhaps I will, later). Lynn (2022) has a nice explanation of Kiselyov’s paper, including a section that explores several alternatives to Kiselyov’s linear-time algorithm.
So at this point we can take a Term
, typecheck it to produce a TTerm
, then use bracket abstraction to convert that to a BTerm
. We have an interpreter for BTerm
s, but we’re instead going to do one more compilation step, to turn BTerm
s directly into native Haskell values. This idea originates with Naylor (2008) and is well-explained in blog posts by Seo (2016) and Mahler (2022). This still feels a little like black magic to me, and I am actually unclear on whether it is really faster than calling interpBTerm
; some benchmarking would be needed. In any case I include it here for completeness.
Our target for this final compilation step is the following CTerm
type, which has only functions, represented by CFun
, and constants. Note, however, that CConst
is intended to be used only for non-function types, i.e. base types, although there’s no nice way (that I know of, at least) to use the Haskell type system to enforce this.
data CTerm α where
CFun :: (CTerm α -> CTerm β) -> CTerm (α -> β)
CConst :: α -> CTerm α -- CConst invariant: α is not a function type
instance Applicable CTerm where
CFun f $$ x = f x
CConst _ $$ _ = error "CConst should never contain a function!"
compile :: BTerm α -> CTerm α
compile (BApp b1 b2) = compile b1 $$ compile b2
compile (BConst c) = compileConst c
compileConst :: Const α -> CTerm α
compileConst = \case
(CInt i) -> CConst i
CIf -> CFun $ \(CConst b) -> CFun $ \t -> CFun $ \e -> if b then t else e
CAdd -> binary (+)
CGt -> binary (>)
K -> CFun $ \x -> CFun $ \_ -> x
S -> CFun $ \f -> CFun $ \g -> CFun $ \x -> f $$ x $$ (g $$ x)
I -> CFun id
B -> CFun $ \f -> CFun $ \g -> CFun $ \x -> f $$ (g $$ x)
C -> CFun $ \f -> CFun $ \x -> CFun $ \y -> f $$ y $$ x
binary :: (α -> b -> c) -> CTerm (α -> b -> c)
binary op = CFun $ \(CConst x) -> CFun $ \(CConst y) -> CConst (op x y)
Finally, we can “run” a CTerm α
to extract a value of type α
. Typically, if α
is some kind of base type like Int
, runCTerm
doesn’t actually do any work—all the work is done by the Haskell runtime itself. However, for completeness, I include a case for CFun
as well.
runCTerm :: CTerm α -> α
runCTerm (CConst a) = a
runCTerm (CFun f) = runCTerm . f . CConst
We can put this all together into our final pipeline:
evalInt :: Term -> Maybe Int
evalInt = fmap (runCTerm . compile . bracket) . check CNil TTyInt
λ> evalInt example
Just 2
There’s nothing interesting to see here—unless you’ve never written a parser or pretty-printer before, in which case perhaps it is very interesting! If you want to learn how to write parsers, see this very nice Megaparsec tutorial. And see here for some help writing a basic pretty-printer.
type Parser = Parsec Void Text
type ParserError = ParseErrorBundle Text Void
reservedWords :: [Text]
reservedWords = ["let", "in", "if", "then", "else", "Int", "Bool"]
sc :: Parser ()
sc = L.space space1 (L.skipLineComment "--") empty
lexeme :: Parser a -> Parser a
lexeme = L.lexeme sc
symbol :: Text -> Parser Text
symbol = L.symbol sc
reserved :: Text -> Parser ()
reserved w = (lexeme . try) $ string' w *> notFollowedBy alphaNumChar
identifier :: Parser Text
identifier = (lexeme . try) (p >>= nonReserved) <?> "variable name"
where
p = (:) <$> letterChar <*> many alphaNumChar
nonReserved (into @Text -> t)
| t `elem` reservedWords =
fail . into @String $
T.concat ["reserved word '", t, "' cannot be used as variable name"]
| otherwise = return t
integer :: Parser Int
integer = lexeme L.decimal
parens :: Parser a -> Parser a
parens = between (symbol "(") (symbol ")")
parseTermAtom :: Parser Term
parseTermAtom =
Lit <$> integer
<|> Var <$> identifier
<|> Lam <$> (symbol "\\" *> identifier)
<*> (symbol ":" *> parseType)
<*> (symbol "." *> parseTerm)
<|> Let <$> (reserved "let" *> identifier)
<*> (symbol "=" *> parseTerm)
<*> (reserved "in" *> parseTerm)
<|> If <$> (reserved "if" *> parseTerm)
<*> (reserved "then" *> parseTerm)
<*> (reserved "else" *> parseTerm)
<|> parens parseTerm
parseTerm :: Parser Term
parseTerm = makeExprParser parseTermAtom
[ [InfixL (App <$ symbol "")]
, [InfixL (Add <$ symbol "+")]
, [InfixL (Gt <$ symbol ">")]
]
parseTypeAtom :: Parser Ty
parseTypeAtom =
TyInt <$ reserved "Int"
<|> TyBool <$ reserved "Bool"
<|> parens parseType
parseType :: Parser Ty
parseType = makeExprParser parseTypeAtom
[ [InfixR (TyFun <$ symbol "->")] ]
readTerm :: Text -> Term
readTerm = either undefined id . runParser parseTerm ""
type Prec = Int
class Pretty p where
pretty :: p -> String
pretty = prettyPrec 0
prettyPrec :: Prec -> p -> String
prettyPrec _ = pretty
mparens :: Bool -> String -> String
mparens True = ("("++) . (++")")
mparens False = id
instance Pretty (Const α) where
prettyPrec _ = \case
CInt i -> show i
CIf -> "if"
CAdd -> "plus"
CGt -> "gt"
c -> show c
instance Pretty (Idx γ α) where
prettyPrec _ = ("x" ++) . show . toNat
where
toNat :: Idx γ α -> Int
toNat VZ = 0
toNat (VS i) = 1 + toNat i
instance Pretty (TTerm γ α) where
prettyPrec p = \case
TVar x -> pretty x
TConst c -> pretty c
TLam t -> mparens (p>0) $ "λ. " ++ prettyPrec 0 t
TApp t1 t2 -> mparens (p>1) $
prettyPrec 1 t1 ++ " " ++ prettyPrec 2 t2
instance Pretty (BTerm α) where
prettyPrec p = \case
BConst c -> pretty c
BApp t1 t2 -> mparens (p>0) $
prettyPrec 0 t1 ++ " " ++ prettyPrec 1 t2