Parametricity for Bifunctor
I’ve begun work to add Bifunctor
, Bifoldable
, and Bitraversable
to the Typeclassopedia. While thinking and writing about Bifunctor
I ended up proving some “folklore” results for my own satisfaction, and decided someone might possibly find it useful if I formally record the proofs here.
Bifunctor and its laws
The Bifunctor
type class governs types of kind * -> * -> *
which are (covariantly) functorial in both type arguments; that is, we can map over the two type arguments independently:
class Bifunctor (p :: * -> * -> *) where
bimap :: (a -> b) -> (c -> d) -> p a c -> p b d
first :: (a -> b) -> p a c -> p b c
second :: (b -> c) -> p a b -> p a c
bimap
allows mapping over both type arguments simultaneously, for example, bimap toUpper (+1) (‘j’, 3) = (‘J’,4)
. first
and second
allow mapping over only one or the other. Note that bimap
can be implemented in terms of first
and second
, or first
and second
in terms of bimap
(we’ll see what these implementations are later, or you can work them out as an easy exercise if you haven’t seen them before).
Instances of Bifunctor
should satisfy some laws, which are straightforward analogues of the laws for Functor
instances:
bimap id id = id (IB)
bimap (f . g) (h . i) = bimap f h . bimap g i (CB)
first id = id (IF)
first (f . g) = first f . first g (CF)
second id = id (IS)
second (f . g) = second f . second g (CS)
So far so good. But there is an additional law relating bimap
, first
, and second
:
bimap f g = first f . second g (BFS)
When I first read this law, it gave me pause: unlike the other laws, there seems to be some arbitrary asymmetry here. What about the symmetric law BFS2?
bimap f g = second g . first f (BFS2)
Intuitively, the order in which we perform first f
and second g
should not make any difference, since they cannot interact with each other in any way; the a
values and b
values in an arbitrary p a b
structure have to be distinct. That is, intuitively, if (BFS) holds, then (BFS2) should hold as well, and I wondered whether it is possible to formally derive this fact.
Let us note first that if one defines a Bifunctor
instance by defining bimap
alone, then first
and second
receive default definitions:
first f = bimap f id
second g = bimap id g
In this case, assuming that bimap
satisfies its composition law (CB), we can see that
first f . second g
= bimap f id . bimap id g
= bimap (f . id) (id . g)
= bimap f g
= bimap (id . f) (g . id)
= bimap id g . bimap f id
= second g . first f
So when bimap
alone is defined and satisfies (CB), both (BFS) and (BFS2) hold automatically. Let’s now consider the situation where first
and second
are defined, and bimap
acquires its default definition of bimap f g = first f . second g
(which is just (BFS))—or we can assume that all three of first
, second
, and bimap
are defined, as long as law (BFS) still holds. In this case, can we derive (BFS2) as well?
Parametricity for Bifunctor
We can derive parametricity, aka a free theorem, for any function with the type of bimap
. Let p :: * -> * -> *
be some two-argument type. In order to even express parametricity for types involving p
, we have to assume that p
is functorial—that is, we assume the existence of a function
BIMAP :: (a -> b) -> (c -> d) -> p a c -> p b d
satisfying (IB). (As with Functor
, (IB) + parametricity is enough to imply that it must satisfy (CB) as well.) We write BIMAP
in all caps to indicate that it is some canonical, Platonic function, handed down from on high, inherent to the very essence of p
. (Something like that.) Now let us suppose that we mortals have written some function bimap
with the same type as BIMAP
, which also satisfies (IB). But we don’t make any assumptions about whether our bimap
is the same function as BIMAP
; we only assume that it has type (a -> b) -> (c -> d) -> p a c -> p b d
and satisfies (IB).
We can now take the polymorphic type of bimap
and turn the crank to derive a statement of parametricity. I’m assuming some familiarity with this process; for details, see Wadler’s paper. If you don’t care about the details of the derivation, you can just skip to the final statement of parametricity at the end of this section, and then proceed to see how it is applied in the next.
Parametricity for bimap
says that if we interpret its type as a relation (according to certain rules), then (bimap, bimap)
will be a member of the relation, that is, bimap
will be related to itself. Below I show the process of expanding out the definition of the relation corresponding to bimap
’s type. (I don’t want to take the time to typeset it really nicely using LaTeX or anything like that; a slightly cleaned-up text version will do for now.) I use @a
to denote explicit type application, but after the second step I leave type application implicit to reduce clutter.
(bimap,bimap) ∈ ∀ a b c d. (a -> b) -> (c -> d) -> p a c -> p b d
<=>
∀ Q :: a <-> a', R :: b <-> b', S :: c <-> c', T :: d <-> d'.
(bimap @a @b @c @d, bimap @a' @b' @c' @d')
∈ (Q -> R) -> (S -> T) -> p Q S -> p R T
<=>
∀ Q :: a <-> a', R :: b <-> b', S :: c <-> c', T :: d <-> d'.
∀ (g,g') ∈ (Q -> R).
∀ (h,h') ∈ (S -> T).
∀ (x,x') ∈ p Q S.
(bimap g h x, bimap g' h' x') ∈ p R T
<=>
∀ Q :: a <-> a', R :: b <-> b', S :: c <-> c', T :: d <-> d'.
∀ (g,g'). (∀ (y,y') ∈ Q. (g y, g' y') ∈ R) ->
∀ (h,h'). (∀ (z,z') ∈ S. (h z, h' z') ∈ T) ->
∀ (x,x') ∈ p Q S.
(bimap g h x, bimap g' h' x') ∈ p R T
At this point, we don’t need the extra generality afforded by assuming everything is a relation, so we specialize all the relations in sight to functions. For example, the relation Q :: a <-> a’
just becomes a function q :: a -> a’
; statements of relatedness, such as (y,y’) ∈ Q
, turn into equations like q y = y’
.
∀ q :: a -> a', r :: b -> b', s :: c -> c', t :: d -> d'.
∀ (g,g'). (∀ (y,y'). q y = y' -> r (g y) = g' y') ->
∀ (h,h'). (∀ (z,z'). s z = z' -> t (h z) = h' z') ->
∀ (x,x') ∈ p q s.
(bimap g h x, bimap g' h' x') ∈ p r t
We now have to interpret p q s
and p r t
as relations. Wadler’s original paper doesn’t really cover this; it only talks about the specific case of the list functor. However, reading between the lines (and consulting subsequent work on parametricity), we can see that the right way to do this is
(a,b) ∈ p f g <=> BIMAP f g a = b
That is, we interpret the type p
as the relation corresponding to its BIMAP
function. Hence, we now have
∀ q :: a -> a', r :: b -> b', s :: c -> c', t :: d -> d'.
∀ (g,g'). (∀ (y,y'). q y = y' -> r (g y) = g' y') ->
∀ (h,h'). (∀ (z,z'). s z = z' -> t (h z) = h' z') ->
∀ (x,x'). (BIMAP q s x = x') ->
BIMAP r t (bimap g h x) = bimap g' h' x'
Finally, we can simplify this by substituting equals for equals in a few places:
∀ q :: a -> a', r :: b -> b', s :: c -> c', t :: d -> d'.
∀ (g,g'). (∀ (y,y'). r (g y) = g' (q y)) ->
∀ (h,h'). (∀ (z,z'). t (h z) = h' (s z)) ->
∀ (x,x').
BIMAP r t (bimap g h x) = bimap g' h' (BIMAP q s x) (PAR)
This is the specific statement of parametricity for bimap
that we will use to derive the results below.
Uniquness of bimap
Let’s first prove that bimap
is unique—that is, anything with the type of bimap
must in fact be equal to BIMAP
. The name of the game is just to pick appropriate values for all those quantified variables in (PAR). In particular, let’s pick
-
g = h = q = s = id
, -
r = g’
, and -
t = h’
.
We have some conditions to check:
-
(∀ (y,y’). r (g y) = g’ (q y))
under our choices becomes(∀ (y,y’). r (id y) = r (id y))
which is obviously true. -
Likewise,
(∀ (z,z’). t (h z) = h’ (s z))
becomes(∀ (z,z’). t (id z) = t (id z))
which is also obviously true.
We then get to conclude that ∀ (x,x’). BIMAP r t (bimap g h x) = bimap g’ h’ (BIMAP q s x)
, which is equivalent to
∀ (x,x'). BIMAP r t (bimap g h x) = bimap g' h' (BIMAP q s x)
<=>
BIMAP r t . bimap g h = bimap g' h' . BIMAP q s
<=>
BIMAP r t . bimap id id = bimap r t . BIMAP id id
Since we have assumed that both BIMAP
and bimap
satisfy the id
law, this is equivalent to BIMAP r t = bimap r t
, so bimap
and BIMAP
coincide (r
and t
were arbitrary).
first
and second
commute
Finally, let’s prove that first
and second
commute. We will again use (PAR), but with different choices:
-
g = g’ = t = s = id
-
h = h’
-
q = r
Let’s check the conditions:
-
We have to check
(∀ (y,y’). r (g y) = g’ (q y))
, which becomes(∀ (y,y’). r (id y) = id (r y))
, which is true since both sides reduce tor y
. -
Likewise,
(∀ (z,z’). t (h z) = h’ (s z))
becomes(∀ (z,z’). id (h z) = h (id z))
which is true by similar reasoning.
Hence, we may conclude that BIMAP r t (bimap g h x) = bimap g’ h’ (BIMAP q s x)
, which is equivalent to
BIMAP r id . bimap id h = bimap id h . BIMAP r id
We can simplify this further using a few assumptions: first, we already showed that BIMAP = bimap
. Second, since we are assuming (BFS), that is, bimap f g = first f . second g
, setting f
or g
to id
shows that bimap f id = first f . second id = first f . id = first f
(by the id
law for second
), and likewise bimap id g = second g
. Hence the equation becomes
first r . second h = second h . first r
And voila!
One can also show that the id
laws plus parametricity together imply the composition laws—I will leave that as an exercise. (Hint: make appropriate choices for g’
, h’
, q
, and s
.)