Competitive programming in Haskell: BFS, part 3 (implementation via HashMap)
Tagged BFS, graph, HashMap, Kattis, search, competitive programming, haskell
In a previous post, I showed how we can solve Modulo Solitaire (and hopefully other BFS problems as well) using a certain API for BFS, and we also explored some alternatives. I had a very interesting discussion with Andrey Mokhov in the comments about potential designs for an even more general API; more on that in a future post, perhaps!
For today, though, I want to finally show one way to implement this API efficiently. Spoiler alert: this implementation ultimately won’t be fast enough for us, but it will be a helpful stepping stone on our way to a yet faster implementation (which will of course get its own post in due time).
This post is literate Haskell; you can obtain the source from the darcs repo. We begin with a few LANGUAGE pragmas and imports.
{-# LANGUAGE RecordWildCards            #-}
{-# LANGUAGE ScopedTypeVariables        #-}
{-# LANGUAGE TupleSections              #-}
module BFS where
import           Control.Arrow               ((>>>))
import           Data.Hashable               (Hashable)
import           Data.HashMap.Strict         (HashMap, (!))
import qualified Data.HashMap.Strict         as HM
import           Data.List                   (foldl')
import           Data.Sequence               (Seq (..), ViewL (..), (|>))
import qualified Data.Sequence               as Seq
Now a couple utility functions: (>$>) is just flipped function application, and exhaust iterates an (a -> Maybe a) function as many times as possible, returning the last non-Nothing value.
infixl 0 >$>
(>$>) :: a -> (a -> b) -> b
(>$>) = flip ($)
{-# INLINE (>$>) #-}
exhaust :: (a -> Maybe a) -> a -> a
exhaust f = go
  where
    go a = maybe a go (f a)
Here is the BFSResult record that we ultimately want to produce; it should be familiar from previous posts.
data BFSResult v =
  BFSR { getLevel :: v -> Maybe Int, getParent :: v -> Maybe v }
While running our BFS, we’ll keep track of three things: the level of each vertex that has been encountered; a mapping from each encountered vertex to its parent; and a queue of vertices that have been encountered but yet to be processed. We use a Seq from Data.Sequence to represent the queue, since it supports efficient (amortized constant-time) insertion and removal from either end of the sequence. There are certainly other potential ways to represent a queue in Haskell (and this probably deserves its own blog post) but Data.Sequence seems to give good performance for minimal effort (and in any case, as we’ll see, it’s not the performance bottleneck here). We use a pair of HashMaps to represent the level and parent maps.
data BFSState v =
  BS { level :: HashMap v Int, parent :: HashMap v v, queue :: Seq v }Given a list of starting vertices, we can create an initial state, with a queue containing the starting vertices and all of them set to level 0.
initBFSState :: (Eq v, Hashable v) => [v] -> BFSState v
initBFSState vs = BS (HM.fromList (map (,0) vs)) HM.empty (Seq.fromList vs)
Now, here is our imeplementation of BFS, using the API discussed previously: it takes a list of starting vertices, a function giving the neighbors of each vertex, and a function identifying “target vertices” (so we can stop early), and returns a BFSResult record. We create an initial state, run bfsStep as much as possible, and convert the end state into a result.
bfs :: forall v. (Eq v, Hashable v) => [v] -> (v -> [v]) -> (v -> Bool) -> BFSResult v
bfs vs next goal = toResult $ exhaust bfsStep (initBFSState vs)
  where
Converting the final BFSState into a BFSResult is easy: just return functions that do a lookup into the relevant map.
    toResult BS{..} = BFSR (`HM.lookup` level) (`HM.lookup` parent)
To do a single step of BFS, try to remove the next vertex v from the queue. If the queue is empty, or the next vertex is a goal vertex, return Nothing to signal that we are done.
    bfsStep st@BS{..} = case Seq.viewl queue of
      EmptyL -> Nothing
      v :< q'
        | goal v    -> Nothing
Otherwise, use the next function to find the neighbors of v, keep only those we haven’t encountered before (i.e. those which are not keys in the level map), and use each one to update the BFS state (being sure to first set the queue to the new one with v removed).
        | otherwise ->
          v >$> next >>> filter (not . (`HM.member` level)) >>>
            foldl' (upd v) (st{queue=q'}) >>> Just
To update the BFS state based on a newly visited vertex, we record its parent, insert it into the level map with a level one greater than its parent, and add it to the end of the queue.
    upd p BS{..} v = BS
      (HM.insert v l level)
      (HM.insert v p parent)
      (queue |> v)
      where
        l = level!p + 1And that’s it! This is good enough to solve many BFS problems on Open Kattis, such as Breaking Bad, ARMPIT Computations, and Folding a Cube. (I will leave you the pleasure of solving these problems yourself; I am especially fond of my Haskell solution to Folding a Cube.)
Unfortunately, it is not fast enough to solve Modulo Solitaire, which I picked specifically because it seems to be one of the most computationally demanding BFS problems I’ve seen. My solution using this HashMap-based implementation solves a bunch of initial test cases, but exceeds the 2 second time limit on one of the later test cases. Next time, I’ll show how to adapt this into an even faster implementation which is actually fast enough to solve Modulo Solitaire.