Competitive Programming in Haskell: tree path decomposition, part I
In a previous post I challenged you to solve Factor-Full Tree. In this problem, we are given an unlabelled rooted tree, and asked to create a divisor labelling. That is, we must label the vertices with positive integers in such a way that \(v\) is an ancestor of \(u\) if and only if \(v\)’s label evenly divides \(u\)’s label.
For example, here is a tree with a divisor labelling:
Divisor labelling of a tree
The interesting point (though irrelevant to solving the problem) is
that this is a method for encoding a tree as a set of integers:
because \(v\) is an ancestor of \(u\) if and only if \(v\)’s label divides
\(u\)’s, all the information about the tree’s structure is fully
contained in the set of labels. For example, if we simply write
down the set \(\{1, 5, 6, 7, 12, 14, 21, 49, 63\}\), it is possible to
fully reconstruct the above tree from this set.Note that we
consider trees equivalent up to reordering of siblings, that is, each
node has a bag, not a list, of children.
This is not a
particularly efficient way to encode a tree, but it is certainly
interesting!
Basic setup
First, some basic setup.See here for the Scanner
abstraction, and
here
for the basics of how I organize solutions.
The first line of
input specifies the number of nodes \(N\), and after that there are
\(N-1\) lines, each specifying a single undirected edge.
import Control.Category ((>>>))
import Data.Bifunctor (second)
import Data.Map (Map, (!?))
import qualified Data.Map as M
import Data.Tuple (swap)
= C.interact $ runScanner tc >>> solve >>> format
main
data TC = TC { n :: Int, edges :: [Edge] }
deriving (Eq, Show)
tc :: Scanner TC
= do
tc <- int
n <- (n - 1) >< pair int int
edges return TC{..}
format :: [Integer] -> ByteString
= map showB >>> C.unwords format
We are guaranteed that the edges describe a tree; next we will actually build a tree data structure from the input.
Building trees
There are many similar problems which specify a tree structure by giving a list of edges, so it’s worthwhile trying to write some generic code to transform such an input into an actual tree. In an imperative language we would do this by building a map from each node to its neighbors, then doing a DFS to orient the tree. Our Haskell code will be similar, except building the map and doing a DFS will both be one-liners!
First, a function to turn a list of undirected edges into a Map
associating each vertex to all its neighbors. It’s convenient to
decompose this into a function to turn a list of directed edges into
a Map
, and a function to duplicate and swap each pair. We won’t
need dirEdgesToMap
for this problem, but we can certainly imagine
wanting it elsewhere.
edgesToMap :: Ord a => [(a, a)] -> Map a [a]
= concatMap (\p -> [p, swap p]) >>> dirEdgesToMap
edgesToMap
dirEdgesToMap :: Ord a => [(a, a)] -> Map a [a]
= map (second (: [])) >>> M.fromListWith (++) dirEdgesToMap
Next, we can turn such a neighbor Map
into a tree. Rather than
returning a literal Tree
data structure, it’s convenient to
incorporate a tree fold: that is, given a function a -> [b] -> b
, a neighbor
map, and a root node, we fold over the whole tree and return the
resulting b
value. (Of course, if we want an actual Tree
we can use
mapToTree Node
.) We can also compose these into a single function edgesToTree
.
mapToTree :: Ord a => (a -> [b] -> b) -> Map a [a] -> a -> b
= dfs root root
mapToTree nd m root where
= nd root (maybe [] (map (dfs root) . filter (/= parent)) (m !? root))
dfs parent root
edgesToTree :: Ord a => (a -> [b] -> b) -> [(a, a)] -> a -> b
= mapToTree nd . edgesToMap edgesToTree nd
Inventing divisor labellings
So how do we create a divisor labelling for a given tree? Clearly, we might as well choose the root to have label \(1\), and every time we descend from a parent to a child, we must multiply by some integer, which might as well be a prime. Of course, we need to multiply by a different prime for each sibling. We might at first imagine simply multiplying by 2 for each (arbitrarily chosen) leftmost child, 3 for each second child, 5 for each third child, and so on, but this does not work—the second child of the first child ends up with the same label as the first child of the second child, and so on.
Each node \(u\)’s label is some prime \(p\) times its parent’s label; call \(p\) the factor of node \(u\). It is OK for one child of \(u\) to also have factor \(p\), but the other children must get different factors. To be safe, we can give each additional child a new globally unique prime factor. This is not always necessary—in some cases it can be OK to reuse a factor if it does not lead to identically numbered nodes—but it is certainly sufficient. As an example, below is a divisor labelling of the example tree from before, via this scheme. Each edge is labelled with the factor of its child.
Divisor labelling of a tree with consecutive primes
Notice how we use \(2\) for the first child of the root, and \(3\) for the next child. \(3\)’s first child can also use a factor of \(3\), yielding a label of \(3^2 = 9\). \(3\)’s next child uses a new, globally unique prime \(5\), and its third child uses \(7\); the final child of \(1\) uses the next available prime, \(11\).
We can code this up via a simple stateful traversal of the tree. (For
primes
, see this
post.)
It’s a bit fiddly since we have to switch to the next prime between
consecutive children, but not after the last child.
primes :: [Integer]
= 2 : sieve primes [3 ..]
primes where
: ps) xs =
sieve (p let (h, t) = span (< p * p) xs
in h ++ sieve ps (filter ((/= 0) . (`mod` p)) t)
curPrime :: State [Integer] Integer
= gets head
curPrime
nextPrime :: State [Integer] ()
= modify tail
nextPrime
labelTree :: Tree a -> Tree (Integer, a)
= flip evalState primes . go 1
labelTree where
go :: Integer -> Tree a -> State [Integer] (Tree (Integer, a))
Node a ts) = Node (x, a) <$> labelChildren x ts
go x (
labelChildren :: Integer -> [Tree a] -> State [Integer] [Tree (Integer, a)]
= pure []
labelChildren _ [] : ts) = do
labelChildren x (t <- curPrime
p <- go (x * p) t
t' case ts of
-> pure [t']
[] -> do
_
nextPrime:) <$> labelChildren x ts (t'
There is a bit of additional glue code we need get the parsed tree
from the input, apply labelTree
, and then print out the node
labels in order. However, I’m not going to bother showing it,
because—this solution is not accepted! It fails with a WA (Wrong
Answer) verdict. What gives?
Keeping things small
The key is one of the last sentences in the problem statement, which I haven’t mentioned so far: all the labels in our output must be at most \(10^{18}\). Why is this a problem? Multiplying by primes over and over again, it’s not hard to get rather large numbers. For example, consider the tree below:
Tree for which our naïve scheme generates labels that are too large
Under our scheme, the root gets label \(1\), and the children of the root get consecutive primes \(2, 3, 5, \dots, 29\). Then the nodes in the long chain hanging off the last sibling get labels \(29^2, 29^3, \dots, 29^{13}\), and \(29^{13}\) is too big—in fact, it is approximately \(10^{19}\). And this tree has only 23 nodes; in general the input can have up to 60.
Of course, \(29\) was a poor choice of factor for such a long chain—we should have instead labelled the long chain with powers of, say, 2. Notice that if we have a “tree” consisting of a single long chain of 60 nodes (and you can bet this is one of the secret test inputs!), we just barely get by labelling it with powers of two from \(2^0\) up to \(2^{59}\): in fact \(2^{59} < 10^{18} < 2^{60}\). So in general, we want to find a way to label long chains with small primes, and reserve larger primes for shorter chains.
Attempt 1: sorting by height
One obvious approach is to simply sort the children at each node by decreasing height, before traversing the tree to assign prime factors. This handles the above example correctly, since the long chain would be sorted to the front and assigned the factor 2. However, this does not work in general! It can still fail to assign the smallest primes to the longest chains. As a simple example, consider this tree, in which the children of every node are already sorted by decreasing height from left to right:
Tree for which sorting by height first does not work
The straightforward traversal algorithm indeed assigns powers of 2 to the left spine of the tree, but it then assigns 3, 5, 7, and so on to all the tiny spurs hanging off it. So by the time we get to other long chain hanging off the root, it is assigned powers of \(43\), which are too big. In fact, we want to assign powers of 2 to the left spine, powers of 3 to the chain on the right, and then use the rest of the primes for all the short spurs. But this sort of “non-local” labelling means we can’t assign primes via a tree traversal.
To drive this point home, here’s another example tree. This one is small enough that it probably doesn’t matter too much how we label it, but it’s worth thinking about how to label the longest chains with the smallest primes. I’ve drawn it in a “left-leaning” style to further emphasize the different chains that are involved.
Tree with chains of various lengths
In fact, we want to assign the factor 2 to the long chain on the left; then the factor 3 to the second-longest chain, in the fourth column; then 5 to the length-6 chain in the second column; 7 to the length-3 chain all the way on the right; and finally 11 to the smallest chain, in column 3.
In general, then, we want a way to decompose an arbitrary tree into chains, where we repeatedly identify the longest chain, remove it from consideration, and then identify the longest chain from the remaining nodes, and so on. Once we have decomposed a tree into chains, it will be a relatively simple matter to sort the chains by length and assign consecutive prime factors.
This decomposition occasionally comes in handy (for example, see Floating Formation), and belongs to a larger family of important tree decomposition techniques such as heavy-light decomposition. Next time, I’ll demonstrate how to implement such tree decompositions in Haskell!