« Competitive programming in Haskell challenge: Letter Optimization » Competitive programming in Haskell: tries

Competitive programming in Haskell: topsort via laziness

Posted on April 11, 2023
Tagged ,

In my previous post, I challenged you to solve Letter Optimiztion. In this problem, we have a directed acyclic graph where each vertex represents a person, and there is an edge p -> q when person p sends their finished envelopes to person q. Also:

The problem is to figure out which people are actually working at their maximum speed. Of course, the reason this is interesting is that the rate at which a person can work is partially determined by the rate at which envelopes are coming to them, which depends on the rates at which people before them in the pipeline are working, and so on.

The typical way to solve this would be to first topologically sort the people (e.g. using a DFS or Kahn’s Algorithm), then fill in the speed of each person in order of the topological sort. That way, when we calculate each person’s rate, we already know the rates of anyone that sends them input. This can also be thought of as a particularly simple form of dynamic programming.

Get rid of topological sort with this one neat trick

However, there is a nice trick we can use in Haskell to save ourselves a bunch of work: instead of doing an explicit topological sort, we can simply define a lazy, recursive array or map with the final values we want; laziness will take care of evaluating the array or map entries in the correct order. Essentially, we are co-opting the Haskell runtime into doing a topological sort for us!

Let’s see some code! First, some pragmas, imports, and boring utility functions. (For an explanation of the Scanner import, see this post and also this one.)

{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE TupleSections   #-}

import           Control.Arrow              (second, (>>>))
import           Data.Array
import qualified Data.ByteString.Lazy.Char8 as C
import           Data.Set                   (Set)
import qualified Data.Set                   as S
import           ScannerBS

infixl 0 >$>
(>$>) = flip ($)

showB :: Show a => a -> C.ByteString
showB = show >>> C.pack

Now for some data types to represent the input, and some code to parse it.

data Person = Person { maxSpeed :: Double, sends :: [(Int, Double)] }
  deriving (Eq, Show)
data TC = TC { n :: Int, people :: Array Int Person }
  deriving (Eq, Show)

tc :: Scanner TC
tc = do
  n <- int
  people <- listArray (1,n) <$> (n >< (Person <$> double <*> numberOf ((,) <$> int <*> ((/100) <$> double))))
  return TC{..}

As an aside, notice how I use a record wildcard to create the output TC value. I find this a quick, simple, and consistent way to structure my scanning code, without having to come up with multiple names for the same thing. I don’t know whether I would ever use it in production code; I’ll leave that to others for debate.

To solve the problem, we take an array production holding the computed production speeds for each person (we’ll see how to build it in a minute), and extract the people who are working at their max speed.

main = C.interact $ runScanner tc >>> solve >>> map showB >>> C.unwords

solve :: TC -> [Int]
solve TC{..} =
  production >$> assocs >>>
  filter (\(p,u) -> abs (u - maxSpeed (people!p)) < 0.0001) >>>
  map fst

How do we compute the array of production speeds? First, we build a map from each person to their set of inputs:

 where
  -- inputMap!p = set of people from whom p gets input, with percentage for each
  inputMap :: Array Int (Set (Int,Double))
  inputMap = accumArray (flip S.insert) S.empty (1,n) (concatMap getInputs (assocs people))

  getInputs :: (Int, Person) -> [(Int, (Int, Double))]
  getInputs (p, Person _ ss) = map (second (p,)) ss

Now we create a lazy, recursive Array that maps each person to their production speed. Notice how the definition of production refers to itself: this works because the Array type is lazy in the values stored in the array. The values are not computed until we actually demand the value stored for a particular index; the Haskell runtime then goes off to compute it, which may involve demanding the values at other indices, and so on.

  production :: Array Int Double
  production = array (1,n)
    [ (p,u)
    | p <- [1 .. n]
    , let m = maxSpeed (people!p)
          i = (inputMap!p) >$> S.toList >>> map (\(x,pct) -> pct * (production!x)) >>> sum
          u = if S.null (inputMap!p) then m else min i m
    ]

For each person p, m is their maximum speed, i is the sum of all the production coming from their inputs (depending on their inputs’ own production speeds), and the person’s production speed u is the minimum of their input and their maximum speed (or simply their maximum speed if they have no inputs).

For next time

For next time, I challenge you to solve Alien Math!