Competitive programming in Haskell: BFS, part 2 (alternative APIs)
In my last post, I showed how we can solve Modulo Solitaire (and hopefully other BFS problems as well) using a certain API for BFS, which returns two functions: one, level :: v -> Maybe Int
, gives the level (i.e. length of a shortest path to) of each vertex, and parent :: v -> Maybe v
gives the parent of each vertex in the BFS forest. Before showing an implementation, I wanted to talk a bit more about this API and why I chose it.
In particular, Andrey Mokhov left a comment on my previous post with some alternative APIs:
bfsForest :: Ord a => [a] -> AdjacencyMap a -> Forest a
bfs :: Ord a => [a] -> AdjacencyMap a -> [[a]]
Of course, as Andrey notes, AdjacencyMap
is actually a reified graph data structure, which we don’t want here, but that’s not essential; presumably the AdjacencyMap
arguments in Andrey’s functions could easily be replaced by an implicit graph description instead. (Note that an API requiring an implicit representation is strictly more powerful, since if you have an explicit representation you can always just pass in a function which does lookups into your explicit representation.) However, Andrey raises a good point. Both these APIs return information which is not immediately available from my API.
-
bfsForest
returns an actual forest we can traverse, giving the children of each node. My API only returns aparent
function which gives the parent of each node. These contain equivalent information, however, and we can convert back and forth efficiently (where by “efficiently” in this context I mean “in \(O(n \lg n)\) time or better”) as long as we have a list of all vertices. To convert from aForest
to a parent function, just traverse the forest and remember all the parent-child pairs we see, building e.g. aMap
that can be used for lookup. To convert back, first iterate over the list of all vertices, find the parent of each, and build an inverse mapping from parents to sets of children. If we want to proceed to building an actualForest
data structure, we can unfold one via repeated lookups into our child mapping.However, I would argue that in typical applications, having the
parent
function is more useful than having aForest
. For example, theparent
function allows us to efficiently answer common, classic queries such as “Is vertexv
reachable from vertexs
?” and “What is a shortest path froms
tov
?” Answering these questions with aForest
would require traversing the entireForest
to look for the target vertexv
. -
bfs
returns a list of levels: that is, the first list is the starting vertices, the next list is all vertices one step away from any starting vertex, the next list is all vertices two steps away, and so on. Again, given a list of all vertices, we can recover a list of levels from thelevel
function: just traverse the list of all vertices, looking up the level of each and adding it to an appropriate mapping from levels to sets of vertices. Converting in the other direction is easy as well.A level list lets us efficiently answer a queries such as “how many vertices are exactly 5 steps away from
s
”?, whereas with thelevel
function we can efficiently answer queries such as “What is the length of a shortest path froms
tov
?” In practice, the latter form of query seems more common.
In the final version of this BFS API, I will probably include some functions to recover forests and level sets as described above. Some benchmarking will be needed to see whether it’s more efficient to recover them after the fact or to actually keep track of them along the way.