Competitive programming in Haskell: BFS, part 3 (implementation via HashMap)
In a previous post, I showed how we can solve Modulo Solitaire (and hopefully other BFS problems as well) using a certain API for BFS, and we also explored some alternatives. I had a very interesting discussion with Andrey Mokhov in the comments about potential designs for an even more general API; more on that in a future post, perhaps!
For today, though, I want to finally show one way to implement this API efficiently. Spoiler alert: this implementation ultimately won’t be fast enough for us, but it will be a helpful stepping stone on our way to a yet faster implementation (which will of course get its own post in due time).
This post is literate Haskell; you can obtain the source from the darcs repo. We begin with a few LANGUAGE
pragmas and imports.
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TupleSections #-}
module BFS where
import Control.Arrow ((>>>))
import Data.Hashable (Hashable)
import Data.HashMap.Strict (HashMap, (!))
import qualified Data.HashMap.Strict as HM
import Data.List (foldl')
import Data.Sequence (Seq (..), ViewL (..), (|>))
import qualified Data.Sequence as Seq
Now a couple utility functions: (>$>)
is just flipped function application, and exhaust
iterates an (a -> Maybe a)
function as many times as possible, returning the last non-Nothing
value.
infixl 0 >$>
(>$>) :: a -> (a -> b) -> b
(>$>) = flip ($)
{-# INLINE (>$>) #-}
exhaust :: (a -> Maybe a) -> a -> a
exhaust f = go
where
go a = maybe a go (f a)
Here is the BFSResult
record that we ultimately want to produce; it should be familiar from previous posts.
data BFSResult v =
BFSR { getLevel :: v -> Maybe Int, getParent :: v -> Maybe v }
While running our BFS, we’ll keep track of three things: the level of each vertex that has been encountered; a mapping from each encountered vertex to its parent; and a queue of vertices that have been encountered but yet to be processed. We use a Seq
from Data.Sequence
to represent the queue, since it supports efficient (amortized constant-time) insertion and removal from either end of the sequence. There are certainly other potential ways to represent a queue in Haskell (and this probably deserves its own blog post) but Data.Sequence
seems to give good performance for minimal effort (and in any case, as we’ll see, it’s not the performance bottleneck here). We use a pair of HashMap
s to represent the level
and parent
maps.
data BFSState v =
BS { level :: HashMap v Int, parent :: HashMap v v, queue :: Seq v }
Given a list of starting vertices, we can create an initial state, with a queue containing the starting vertices and all of them set to level 0.
initBFSState :: (Eq v, Hashable v) => [v] -> BFSState v
initBFSState vs = BS (HM.fromList (map (,0) vs)) HM.empty (Seq.fromList vs)
Now, here is our imeplementation of BFS, using the API discussed previously: it takes a list of starting vertices, a function giving the neighbors of each vertex, and a function identifying “target vertices” (so we can stop early), and returns a BFSResult
record. We create an initial state, run bfsStep
as much as possible, and convert the end state into a result.
bfs :: forall v. (Eq v, Hashable v) => [v] -> (v -> [v]) -> (v -> Bool) -> BFSResult v
bfs vs next goal = toResult $ exhaust bfsStep (initBFSState vs)
where
Converting the final BFSState
into a BFSResult
is easy: just return functions that do a lookup
into the relevant map.
toResult BS{..} = BFSR (`HM.lookup` level) (`HM.lookup` parent)
To do a single step of BFS, try to remove the next vertex v
from the queue. If the queue is empty, or the next vertex is a goal vertex, return Nothing
to signal that we are done.
bfsStep st@BS{..} = case Seq.viewl queue of
EmptyL -> Nothing
v :< q'
| goal v -> Nothing
Otherwise, use the next
function to find the neighbors of v
, keep only those we haven’t encountered before (i.e. those which are not keys in the level
map), and use each one to update the BFS state (being sure to first set the queue to the new one with v
removed).
| otherwise ->
v >$> next >>> filter (not . (`HM.member` level)) >>>
foldl' (upd v) (st{queue=q'}) >>> Just
To update the BFS state based on a newly visited vertex, we record its parent, insert it into the level
map with a level one greater than its parent, and add it to the end of the queue.
upd p BS{..} v = BS
(HM.insert v l level)
(HM.insert v p parent)
(queue |> v)
where
l = level!p + 1
And that’s it! This is good enough to solve many BFS problems on Open Kattis, such as Breaking Bad, ARMPIT Computations, and Folding a Cube. (I will leave you the pleasure of solving these problems yourself; I am especially fond of my Haskell solution to Folding a Cube.)
Unfortunately, it is not fast enough to solve Modulo Solitaire, which I picked specifically because it seems to be one of the most computationally demanding BFS problems I’ve seen. My solution using this HashMap
-based implementation solves a bunch of initial test cases, but exceeds the 2 second time limit on one of the later test cases. Next time, I’ll show how to adapt this into an even faster implementation which is actually fast enough to solve Modulo Solitaire.