« Things I have learned about vector spaces » Typed type-level programming: status report

Species subtraction made simple

Posted on November 25, 2010
Tagged data, integers, natural, numbers, species, structures, subtraction, combinatorics, haskell

> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> module Virtual where
> 
> import Control.Applicative
> import Test.QuickCheck

Yesterday on #haskell, augur asked me to explain how subtraction works for combinatorial species. (For an introduction to species, see my paper from the 2010 Haskell Symposium.) For example, given the equation

\(M = 1 + X\)

defining the species \(M\) (which corresponds to the Haskell type constructor Maybe), our algebraic sensibilities tell us this can be rewritten as

\(X = M - 1\)

but can we make sense of this combinatorially? Addition has an obvious combinatorial interpretation (disjoint union), and so does multiplication (pairing), but what does subtraction mean? Once we allow subtraction, we have to assign meaning to expressions like Void - Bool, which apparently must be a species with negative two inhabitants. What on earth could that mean!? Combinatorics is all about counting things; we can have zero structures, or one structure, or seventy-three structures, but surely we can’t have a negative number of structures, can we?

Building the Integers

Let’s start with a slight detour to talk about natural numbers and integers. Here’s a standard Haskell definition of the naturals:

> data Nat = Z | S Nat
>   deriving (Show, Eq, Ord)

Note that the auto-derived Eq and Ord instances have exactly the right behavior (in the latter case due to the fact that we happened to list the Z constructor first).

*Virtual> S (S Z) == S (S Z)
True
*Virtual> S Z == S (S Z)
False
*Virtual> S (S Z) < S Z
False

We can do arithmetic on natural numbers, implemented via a Num instance:

> instance Num Nat where
>   Z   + n = n
>   S m + n = S (m + n) 
> 
>   Z   * n = Z
>   S m * n = n + m * n
> 
>   (-) = error "Subtraction! What's that?"
> 
>   fromInteger n | n < 0 = error "No can do."
>   fromInteger 0 = Z
>   fromInteger n = S (fromInteger (n-1))
> 
>   negate = error "Huh?"
>   abs = id

It seems to work:

*Virtual> 1 + 3 :: Nat
S (S (S (S Z)))

As much as I dislike Num, in this particular instance it is actually useful to illustrate an important point about Nat, namely, that subtraction (and, more fundamentally, negation) are not defined for natural numbers. (Of course, we could implement a partial subtraction operation, but we’d like to stick to total functions as much as possible.) We have no way to implement (-) or negate, and we even have problems with fromInteger if the input is negative.

But what if we were crazy enough to actually want negation and subtraction? How should we generalize the natural numbers? Our first instinct might be to define something like

> data I' = Pos Nat | Neg Nat

but this is annoying for a number of reasons. There are two different representations of zero (and having Neg Z represent -1, and so on, is too hideous to even mention (whoops)). Code to work with values of this type would also be ugly: lots of case analysis on Pos and Neg constructors with very similar code duplicated in each case. And we’re going to have to implement subtraction pretty much from scratch; there’s no good way to reuse the machinery we’ve already built for working with natural numbers.

However, there’s a much better way. No subtraction? No problem! We’ll just treat subtraction as a new syntactic construct.

> data I = Nat :- Nat
>   deriving Show

That is, a value of type I is a pair of Nats, thought of as their difference. Why call it I? Well, observe that any integer, positive or negative, can be represented in this way, as a difference of two natural numbers. For example, -2 can be represented by the pair 3 :- 5. So this is our representation of integers!

Of course, -2 can also be represented by 4 :- 6, or 0 :- 2, or a countably infinite number of other choices. We really want to deal with equivalence classes of these pairs, so the first task is to decide on the proper equivalence. When are the pairs p1 :- n1 and p2 :- n2 equivalent? Well… when the difference between p1 and n1 is the same as the difference between p2 and n2! This is circular, of course, since we are in the process of defining what “difference” means. But a moment’s thought reveals that we can reexpress this using only addition:

> instance Eq I where
>   p1 :- n1 == p2 :- n2  =  p1 + n2 == p2 + n1

In fact, our equivalence extends naturally to an ordering as well:

> instance Ord I where
>   compare (p1 :- n1) (p2 :- n2) = compare (p1 + n2) (p2 + n1)

How about arithmetic? To add two integers, we just add their positive and negative parts pointwise; multiplication works by distributing out the product and collecting up the positive and negative parts; negation is just a flip:

> instance Num I where
>   (p1 :- n1) + (p2 :- n2) = (p1 + p2) :- (n1 + n2)
>   (p1 :- n1) * (p2 :- n2) = (p1 * p2 + n1 * n2) :- (p1 * n2 + p2 * n1)
>   negate (p :- n) = n :- p
>   fromInteger n | n < 0     = Z :- fromInteger (-n)
>                 | otherwise = fromInteger n :- Z
>   abs i | i > 0     =  i
>         | otherwise = -i

Let’s see if it works:

*Virtual> (-4) :: I
Z :- S (S (S (S Z)))
*Virtual> 2 + 4 + (-3) :: I
S (S (S (S (S (S Z))))) :- S (S (S Z))
*Virtual> canonicalize it
S (S (S Z)) :- Z
*Virtual> 2 - 5 :: I
S (S Z) :- S (S (S (S (S Z))))
*Virtual> canonicalize it
Z :- S (S (S Z))

Seems to! Another important point is that we can inject the naturals into this new setting,

> n2i :: Nat -> I
> n2i = (:- 0)

and this injection is a semiring homomorphism: that is, adding (or multiplying) two Nats and applying n2i to the result is the same (up to equality on I) as first applying n2i to each and then adding (or multiplying) the resulting I values. But you don’t have to take my word for it:

> instance Arbitrary Nat where
>   arbitrary = (fromInteger . unNN) `fmap` arbitrary
>     where unNN (NonNegative x) = x
> 
> instance Arbitrary I where
>   arbitrary = (:-)  arbitrary  arbitrary
> 
> prop_n2i_hom_add i j = n2i (i+j) == n2i i + n2i j
> prop_n2i_hom_mul i j = n2i (i*j) == n2i i * n2i j

*Virtual> quickCheck prop_n2i_hom_add
+++ OK, passed 100 tests.
*Virtual> quickCheck prop_n2i_hom_mul
+++ OK, passed 100 tests.

It is also not hard to choose a canonical representative from each equivalence class of I values, namely, the pairs where at least one of the elements is zero. Here’s a function that returns the canonical representative corresponding to any value:

> canonicalize :: I -> I
> canonicalize i@(Z :- _)   = i
> canonicalize i@(_ :- Z)   = i
> canonicalize (S m :- S n) = canonicalize (m :- n)

We can use canonicalize to write a (partial) projection from integers back to naturals. To check whether an integer actually corresponds to a natural, just canonicalize it and check whether its negative component is zero:

> i2n :: I -> Maybe Nat
> i2n i = case canonicalize i of
>           (n :- Z) -> Just n
>           _        -> Nothing

Although i2n is partial, it is complete:

> prop_i2n_complete n = i2n (n2i n) == Just n

*Virtual> quickCheck prop_i2n_complete 
+++ OK, passed 100 tests.

The result of all this is that we can assign meaning to arbitrary expressions involving natural numbers, addition, multiplication, and subtraction, by first injecting all the naturals into our type for integers. What’s more, we have a method for deciding whether the final result is actually the image of a natural number.

*Virtual> 7 - 3 :: Nat
*** Exception: Subtraction! What's that?
*Virtual> i2n (7 - 3) :: Maybe Nat
Just (S (S (S (S Z))))

Virtual Species

So what was the point of talking about natural numbers and integers? We were supposed to be talking about combinatorial species, right?

Well, it turns out that we were talking about combinatorial species, since they include the natural numbers as a sub-semiring. The point is that the whole construction outlined above can be generalized to work over the entire semiring of species. In particular, we define virtual species to be equivalence classes of pairs of species \((P, N)\), where \(P\) is considered “positive” and \(N\) “negative”. \((P_1, N_1)\) and \((P_2, N_2)\) are equivalent if \(P_1 + N_2 \cong P_2 + N_1\) (where \(\cong\) denotes species isomorphism). Addition and multiplication for virtual species are defined in exactly the same way as we did for integers.

And what about canonicalize? Does something similar exist for virtual species? The answer, as it turns out, is yes, and this is a rather big deal! This blog post is too short to contain the details (they could probably make another post or three), but the key theorem is that every species can be decomposed uniquely into a sum of species of a special type known as molecular species. To canonicalize a virtual species we need only decompose both its positive and negative parts, and cancel any molecular species we find on both sides. Unlike integers, canonical virtual species do not necessarily have the zero species on one side or the other (that is, there exist “complex” virtual species which are not purely positive or negative). However, injection of species into the ring of virtual species does of course work by pairing with the zero species, and this injection is a semiring homomorphism as we would expect. Other species operations (such as differentiation and cartesian product) can also be extended to virtual species in a way compatible with this injection.

So to our original species expression \(M - 1\) we can assign the virtual species \((M,1)\). Of course, this is not canonical; unsurprisingly, the molecular decomposition of \(M\) is \(1 + X\), so canceling the \(1\) gives us the canonical (and strictly positive) species \((X,0)\), as expected.

Here, however, is where the similarity to naturals and integers ends. Virtual species give us a lot more additional structure over species than integers do over naturals. For example, integers don’t give us any sort of multiplicative inverses for naturals; for that we need another round of completion to obtain the rationals. Surprisingly, it turns out that virtual species already provide multiplicative inverses for “most” species! This allows us to assign meaning to expressions such as \(\frac{1}{1 - X}\). Perhaps this should be the subject of another blog post…