Competitive programming in Haskell: data representation and optimization, with cake
In my previous post I challenged you to solve Checking Break, which presents us with a cake in the shape of a rectangular prism, with chocolate chips at various locations, and asks us to check whether a proposed division of the cake is valid. A division of the cake is valid if it is a partition (no pieces overlap and every part of the cake is in some piece) and every piece contains one chocolate chip.
No one posted a solution—I don’t know if that’s because people have lost interest, or because no one was able to solve it—but in any case, don’t read this post yet if you still want to try solving it! As a very small hint, part of the reason I chose this problem is that it is an interesting example of a case where just getting the correct asymptotic time complexity is not enough—we actually have to work a bit to optimize our code so it fits within the allotted time limit.
The algorithm
When solving this problem I first just spent some time thinking about the different things I would have to compute and what algorithms and data structures I could use to accomplish them.
-
The first thing that jumped out at me is that we are going to want some kind of abstractions for 3D coordinates, and for 3D rectangular prisms (i.e. boxes, i.e. pieces of cake). Probably we can just represent boxes as a pair of points at opposite corners of the box (in fact this is how boxes are given to us). As we plan out how the rest of the solution is going to work we will come up with a list of operations these will need to support.
As an aside, when working in Java I rarely make any classes beyond the single main class, because it’s too heavyweight. When working in Haskell, on the other hand, I often define little abstractions (i.e. data types and operations on them) because they are so lightweight, and being able to cleanly separate things into different layers of abstraction helps me write solutions that are more obviously correct.
-
We need to check that the coordinates of each given box are valid.
-
We will need to check that every piece of cake contains exactly one chocolate chip. At first this sounds difficult—given a chip, how do we find out which box(es) it is in? Or given a box, how can we find out which chips are in it? To do this efficiently seems like it will require some kind of advanced 3D space partitioning data structure, like an octree or a BSP tree. BUT this is a situation where reading carefully pays off: the problem statement actually says that “the \(i\)-th part must contain the \(i\)-th chocolate chip”. So all we have to do is zip the list of pieces together with the list of chips. We just need an operation to test whether a given point is contained in a given box.
-
We have to check that none of the boxes overlap. We can make a primitive to check whether two boxes intersect, but how do we make sure that none of the boxes intersect? Again, complicated space-partitioning data structures come to mind; but since there are at most \(10^3\) boxes, the number of pairs is on the order of \(10^6\). There can be multiple test cases, though, and the input specification says the sum of values for \(m\) (the number of pieces) over all test cases will be at most \(5 \times 10^4\). That means that in the worst case, we could get up to \(50\) test cases with \(10^3\) pieces of cake (and thus on the order of \(10^6\) pairs of pieces) per test case. Given \(10^8\) operations per second as a rough rule of thumb, it should be just barely manageable to do a brute-force check over every possible pair of boxes.
-
Finally, we have to check that the pieces account for every last bit of the cake. If we think about trying checking this directly, it is quite tricky. One could imagine making a 3D array representing every cubic unit of cake, and simply marking off the cubes covered by each piece, but this is completely out of the question since the cake could be up to \(10^6 \times 10^6 \times 10^6\) in size! Or we could again imagine some complicated space-partitioning structure to keep track of which parts have and have not been covered so far.
But there is a much simpler way: just add up the volume of all the pieces and make sure it is the same as the volume of the whole cake! Of course this relies on the fact that we are also checking to make sure none of the pieces overlap: the volumes being equal implies that the whole cake is covered if and only if none of the pieces overlap. In any case, we will need a way to compute the volume of a box.
Implementation and optimization
Let’s start with some preliminaries: LANGUAGE
pragmas, imports, main
, and the parser.
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE TupleSections #-}
import Control.Arrow
import Data.Bool
import qualified Data.ByteString.Lazy.Char8 as C
import Data.Monoid
import ScannerBS
main = C.interact $
runScanner (many tc) >>> init >>>
map (solve >>> bool "NO" "YES") >>>
C.unlines
data TC = TC { cake :: Box, chips :: [Pos], parts :: [Box] }
tc :: Scanner TC
tc = do
a <- int
case a of
-1 -> return undefined
_ -> do
xs <- three int
let [b,c,m] = xs
cake = Box (Pos 1 1 1) (Pos a b c)
TC cake <$> m `times` pos <*> m `times` box
The parser is worth remarking upon. The input consists of multiple test cases, with a single value of \(-1\) marking the end of the input. This is annoying: ideally, we would have a many
combinator that keeps running a Scanner
until it fails, but we don’t. To keep things simple and fast, our Scanner
abstraction does not support parse failures and alternatives! The many
combinator we made keeps running a given Scanner
until the end of input, not until it fails. The quick-and-dirty solution I adopted is to make the test case Scanner
return undefined
if it sees a -1
, and then simply ignore the final element of the list of test cases via init
. Not pretty but it gets the job done.
Representing positions and boxes
Next let’s consider building abstractions for 3D coordinates and boxes. It is very tempting to do something like this:
type Pos = [Integer]
type Box = [Pos]
-- Check whether one position is componentwise <= another
posLTE :: Pos -> Pos -> Pos
posLTE p1 p2 = and $ zipWith (<=) p1 p2
-- ... and so on
Using list combinators like zipWith
to work with Pos
and Box
values is quite convenient. And for some problems, using lists is totally fine. Having a small number of large lists—e.g. reading in a list of \(10^5\) integers and processing them somehow—is rarely a problem. But having a large number of small lists, as we would if we use lists to represent Pos
and Box
here, slows things down a lot (as I learned the hard way). I won’t go into the details of why—I am no expert on Haskell performance—but suffice to say that lists are a linked structure with a large memory overhead.
So let’s do something more direct. We’ll represent both Pos
and Box
as data types with strict fields (the strict fields make a big difference, especially in the case of Pos
), and make some trivial Scanner
s for them. The volume
function computes the volume of a box; given that the coordinates are coordinates of the cubes that make up the pieces, and are both inclusive, we have to add one to the difference between the coordinates. Note we assume that the first coordinate of a Box
should be elementwise less than or equal to the second; otherwise, the call to max 0
ensures we will get a volume of zero.
data Pos = Pos !Int !Int !Int
data Box = Box !Pos !Pos
pos :: Scanner Pos
pos = Pos <$> int <*> int <*> int
box :: Scanner Box
box = Box <$> pos <*> pos
volume :: Box -> Int
volume (Box (Pos x1 y1 z1) (Pos x2 y2 z2)) = (x2 -. x1) * (y2 -. y1) * (z2 -. z1)
where
x -. y = max 0 (x - y + 1)
Another very important note is that we are using Int
instead of Integer
. Using Integer
is lovely when we can get away with it, since it means not worrying about overflow at all; but in this case using Int
instead of Integer
yields a huge speedup (some quick and dirty tests show about a factor of 6 speedup on my local machine, and replacing Int
with Integer
, without changing anything else, makes my solution no longer accepted on Kattis). Of course, this comes with an obligation to think about potential overflow: the cake can be at most \(10^6\) units on each side, giving a maximum possible volume of \(10^{18}\). On a 64-bit machine, that just fits within an Int
(maxBound :: Int
is approximately \(9.2 \times 10^{18}\)). Since the Kattis test environment is definitely 64-bit, we are good to go. In fact, limits for competitive programming problems are often chosen so that required values will fit within 64-bit signed integers (C++ has no built-in facilities for arbitrary-size integers); I’m quite certain that’s why \(10^6\) was chosen as the maximum size of one dimension of the cake.
Pos and Box utilities
Next, some utilities for checking whether one Pos
is elementwise less than or equal to another, and for taking the elementwise max
and min
of two Pos
values. Checking whether a Box
contains a Pos
simply reduces to doing two calls to posLTE
(again assuming a valid Box
with the first corner componentwise no greater than the second).
posLTE (Pos x1 y1 z1) (Pos x2 y2 z2) = x1 <= x2 && y1 <= y2 && z1 <= z2
posMax (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (max x1 x2) (max y1 y2) (max z1 z2)
posMin (Pos x1 y1 z1) (Pos x2 y2 z2) = Pos (min x1 x2) (min y1 y2) (min z1 z2)
contains :: Box -> Pos -> Bool
contains (Box lo hi) p = posLTE lo p && posLTE p hi
To test whether a box is a valid box within a given cake, we test that its corners are in the correct order and fit within the low and high coordinates of the cake.
valid :: Box -> Box -> Bool
valid (Box lo hi) (Box c1 c2) = posLTE lo c1 && posLTE c1 c2 && posLTE c2 hi
How to test whether two given boxes intersect or not? There are probably many ways to do this, but the nicest way I could come up with is to first find the actual Box
which represents their intersection, and check whether it has a positive volume (relying on the fact that volume
returns 0 for degenerate boxes with out-of-order coordinates). In turn, to find the intersection of two boxes, we just take the coordinatewise max
of their lower corners, and the coordinatewise min
of their upper corners.
intersection :: Box -> Box -> Box
intersection (Box c11 c12) (Box c21 c22) = Box (posMax c11 c21) (posMin c12 c22)
disjoint :: Box -> Box -> Bool
disjoint b1 b2 = volume (intersection b1 b2) == 0
The solution
Finally, we can put the pieces together to write the solve
function. We simply check that all the given cake parts are valid; that every part contains its corresponding chocolate chip; that every pair of parts is disjoint; and that the sum of the volumes of all parts equals the volume of the entire cake.
solve :: TC -> Bool
solve (TC{..}) = and
[ all (valid cake) parts
, and $ zipWith contains parts chips
, all (uncurry disjoint) (pairs parts)
, sum (map volume parts) == volume cake
]
Computing all pairs
Actually, there’s still one missing piece: how to compute all possible pairs of parts. The simplest possible thing would be to use a list comprehension like
[(x,y) | x <- parts, y <- parts]
but this has problems: first, it includes a pairing of each part with itself, which will definitely have a nonzero intersection. We could exclude such pairs by adding x /= y
as a guard, but there is another problem: (p2,p1)
is included whenever (p1,p2)
is included, but this is redundant since disjoint
is commutative. In fact, we don’t really want all pairs; we want all unordered pairs, that is, all sets of size two. We can do that with the below utility function (which I have now added to Util.hs
):
pairs :: [a] -> [(a,a)]
pairs [] = []
pairs [_] = []
pairs (a:as) = map (a,) as ++ pairs as
This is accepted, and runs in about 0.91 seconds (the time limit is 2 seconds). However, I was curious whether we are paying anything here for all the list operations, so I wrote the following version, which takes a binary operation for combining list elements, and a Monoid
specifying how to combine the results, and directly returns the monoidal result of combining all the pairs, without ever constructing any intermediate lists or tuples at all. It’s sort of like taking the above pairs
function, following it by a call to foldMap
, and then manually fusing the two to get rid of the intermediate list.
withPairs :: Monoid r => (a -> a -> r) -> [a] -> r
withPairs _ [] = mempty
withPairs _ [_] = mempty
withPairs f (a:as) = go as
where
go [] = withPairs f as
go (a2:rest) = f a a2 <> go rest
To use this, we have to change the solve
function slightly: instead of
, all (uncurry disjoint) (pairs parts)
we now have
, getAll $ withPairs (\p1 p2 -> All $ disjoint p1 p2) parts
This version runs significantly faster on Kattis—0.72 seconds as opposed to 0.91 seconds. (In fact, it’s faster than the currently-fastest Java solution (0.75 seconds), though there is still a big gap to the fastest C++ solution (0.06 seconds).) I don’t completely understand why this version is faster—perhaps one of you will be able to enlighten us!
For next time
For next time, we’ll go back to computational geometry: I invite you to solve Cookie Cutters.