« Competitive programming in Haskell: introduction to dynamic programming » Dynamic programming in Haskell: automatic memoization

Dynamic programming in Haskell: lazy immutable arrays

Posted on June 2, 2023
Tagged challenge, dynamic programming, Kattis, competitive programming, haskell

This is part 1 of a promised multi-part series on dynamic programming in Haskell. As a reminder, we’re using Zapis as a sample problem. In this problem, we are given a sequence of opening and closing brackets (parens, square brackets, and curly braces) with question marks, and have to compute the number of different ways in which the question marks could be replaced by brackets to create valid, properly nested bracket sequences.

Last time, we developed a recurrence for this problem and saw some naive, directly recursive Haskell code for computing it. Although this naive version is technically correct, it is much too slow, so our goal is to implement it more efficiently.

Mutable arrays?

Someone coming from an imperative background might immediately reach for some kind of mutable array, e.g. STUArray. Every time we call the function, we check whether the corresponding array index has already been filled in. If so, we simply return the stored value; if not, we compute the value recursively, and then fill in the array before returning it.

This would work, but there is a better way!

Immutable arrays

While mutable arrays occasionally have their place, we can surprisingly often get away with immutable arrays, where we completely define the array up front and then only use it for fast lookups afterwards.

If the type of the array elements is suitable, and we can initialize the array elements all at once from a list using some kind of formula, map, scan, etc., we should use UArray since it is much faster than Array.
However, UArray is strict in the elements, and the elements must be of a type that can be stored unboxed. If we need a more complex element type, or we need to compute the array recursively (where some elements depend on other elements), we can use Array.

What about the vector library, you ask? Well, it’s a very nice library, and quite fast, but unfortunately it is not available on many judging platforms, so I tend to stick to array to be safe. However, if you’re doing something like Advent of Code or Project Euler where you get to run the code on your own machine, then you should definitely reach for vector.

Lazy, recursive, immutable arrays

In my previous post on topsort we already saw the basic idea: since Arrays are lazy in their elements, we can define them recursively; the Haskell runtime then takes care of computing the elements in a suitable order. Previously, we saw this applied to automatically compute a topological sort, but more generally, we can use it to fill out a table of values for any recurrence.

So, as a first attempt, let’s just replace our recursive c function from last time with an array. I’ll only show the solve function for now; the rest of the code remains the same. (Spoiler alert: this solution works, but it’s ugly. We’ll develop much better solutions later.)

solve :: String -> Integer
solve str = c!(0,n)
  where
    n = length str
    s = listArray (0,n-1) str

    c :: Array (Int, Int) Integer
    c = array ((0,0),(n,n)) $
      [ ((i,i), 1) | i <- [0..n] ]
      ++
      [ ((i,j),0) | i <- [0..n], j <- [0..n], even i /= even j ]
      ++
      [ ((i,j),v)
      | i <- [0..n], j <- [0..n], i /= j, even i == even j
      , let v = sum [ m (s!i) (s!k) * c!(i+1,k) * c!(k+1,j) | k <- [i+1, i+3 .. j-1]]
      ]

We use the array function to create an array, which takes first a pair of indices specifying the index range, and then a list of (index, value) pairs. (The listArray function can also be particularly useful, when we have a list of values which are already in index order, as in the definition of s.)

This solution is accepted, and it’s quite fast (0.04s for me). However, it’s really ugly, and although it’s conceptually close to our directly recursive function from before, the code is almost unrecognizably different. It’s ugly that we have to repeat conditions like i /= j and even i == even j, and binders like i <- [0..n]; the multiple list comprehensions and nested pairs like ((i,j),v) are kind of ugly, and the fact that this is implementing a recurrence is completely obscured.

However, I included this solution as a first step because for a long time, after I learned about using lazy immutable arrays to implement dynamic programming in Haskell, this was the kind of solution I wrote! Indeed, if you just think about the idea of creating a recursively defined array, this might be the kind of thing you come up with: we define an array c using the array function, then we have to list all its elements, and we get to refer to c along the way.

Mutual recursion to the rescue

Most of the ugliness comes from losing sight of the fact that there is a function mapping indices to values: we simply listed out all the function’s input/output pairs without getting to use any of Haskell’s very nice facilities for defining functions! So we can clean up the code considerably if we make a mutually recursive pair of an array and a function: the array values are defined using the function, and the function definition can look up values in the array.

solve :: String -> Integer
solve str = cA!(0,n)
  where
    n = length str
    s = listArray (0,n-1) str

    cA :: Array (Int, Int) Integer
    cA = array ((0,0),(n,n)) $
      [ ((i,j), c (i,j)) | i <- [0 .. n], j <- [0 .. n] ]

    c :: (Int, Int) -> Integer
    c (i,j)
      | i == j           = 1
      | even i /= even j = 0
      | otherwise        = sum
        [ m (s!i) (s!k) * cA ! (i+1,k) * cA ! (k+1,j)
        | k <- [i+1, i+3 .. j-1]
        ]

Much better! The c function looks much the same as our naive version from before, with the one difference that instead of calling itself recursively, it looks up values in the array cA. The array, in turn, is simply defined as a lookup table for the outputs of the function.

Generalized tabulation

One nice trick we can use to simplify the code a bit more is to use the range function to generate the list of all valid array indices, and then just map the c function over this. This also allows us to use the listArray function, since we know that the range will generate the indices in the right order.

cA :: Array (Int, Int) Integer
cA = listArray rng $ map c (range rng)
  where
    rng = ((0,0), (n,n))

In fact, we can abstract this into a useful little function to create a lookup table for a function:

tabulate :: Ix i => (i,i) -> (i -> a) -> Array i a
tabulate rng f = listArray rng (map f $ range rng)

(We can generalize this even more to make it work for UArray as well as Array, but I’ll stop here for now. And yes, I intentionally named this to echo the tabulate function from the adjunctions package; Array i is indeed a representable functor, though it’s not really possible to express without dependent types.)

The solution so far

Putting it all together, here’s our complete solution so far. It’s pretty good, and in fact it’s organized in a very similar way to Soumik Sarkar’s dynamic programming solution to Chemist’s Vows. (However, there’s an even better solution coming in my next post!)

import Control.Arrow
import Data.Array

main = interact $ lines >>> last >>> solve >>> format

format :: Integer -> String
format = show >>> reverse >>> take 5 >>> reverse

tabulate :: Ix i => (i,i) -> (i -> a) -> Array i a
tabulate rng f = listArray rng (map f $ range rng)

solve :: String -> Integer
solve str = cA!(0,n)
  where
    n = length str
    s = listArray (0,n-1) str

    cA :: Array (Int, Int) Integer
    cA = tabulate ((0,0),(n,n)) c

    c :: (Int, Int) -> Integer
    c (i,j)
      | i == j           = 1
      | even i /= even j = 0
      | otherwise        = sum
        [ m (s!i) (s!k) * cA ! (i+1,k) * cA ! (k+1,j)
        | k <- [i+1, i+3 .. j-1]
        ]

m '(' ')'                = 1
m '[' ']'                = 1
m '{' '}'                = 1
m '?' '?'                = 3
m b '?' | b `elem` "([{" = 1
m '?' b | b `elem` ")]}" = 1
m _ _                    = 0

Coming up next: automatic memoization!

So what’s not to like about this solution? Well, I still don’t like the fact that we have to define a mutually recursive array and function. Conceptually, I want to name them both c (or whatever) since they are really isomorphic representations of the exact same mathematical function. It’s annoying that I have to make up a name like cA or c’ or whatever for one of them. I also don’t like that we have to remember to do array lookups instead of recursive calls in the function—and if we forget, Haskell will not complain! It will just be really slow.

Next time, we’ll see how to use some clever ideas from Conal Elliot’s MemoTrie package (which themselves ultimately came from a paper by Ralf Hinze) to solve these remaining issues and end up with some really beautiful code!