
Monoidal sparks

Posted on October 1, 2018

While at ICFP in St. Louis this past week, I discovered an interesting construction on monoids that no one I talked to (including Kenny Foner and Edward Kmett) seemed to have heard of or thought about before. In this post I’ll present the abstract construction itself, and in another post I’ll explain the particular context in which I first came up with it. (Normally I would put the examples first, before explaining the idea in general; but I only know of one example so far, and I’m curious to see if anyone will come up with more. I don’t want to bias people’s thinking too much with my one example!)

The bigger context here is thinking about different ways of putting a monoid structure on a product type \(A \times B\), assuming \(A\) and \(B\) are themselves monoids. Previously I knew of two ways to do this. The obvious way is to just have the two sides operate in parallel, without interacting at all: \((a_1,b_1) \diamond (a_2,b_2) = (a_1 \diamond a_2, b_1 \diamond b_2)\) and so on. Alternatively, if \(A\) acts on \(B\), we can form the semidirect product.
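
For the Haskell-inclined, here is roughly what this looks like in code. The componentwise construction is exactly the Semigroup/Monoid instance for pairs that already ships with base, so the sketch below only covers the semidirect product; it assumes the action of \(A\) on \(B\) is given by a class method act, and the names Action and Semi are made up just for illustration.

    {-# LANGUAGE MultiParamTypeClasses #-}

    -- A monoid action of a on b.  For the semidirect product to be lawful
    -- we need: act mempty b = b, act x mempty = mempty,
    -- act (x <> y) b = act x (act y b), and
    -- act x (b1 <> b2) = act x b1 <> act x b2.
    class (Monoid a, Monoid b) => Action a b where
      act :: a -> b -> b

    newtype Semi a b = Semi (a, b)

    instance Action a b => Semigroup (Semi a b) where
      Semi (a1, b1) <> Semi (a2, b2) = Semi (a1 <> a2, b1 <> act a1 b2)

    instance Action a b => Monoid (Semi a b) where
      mempty = Semi (mempty, mempty)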

But here is a third way. Suppose \((A, \varepsilon_A, \diamond)\) is a monoid, and \((B, \varepsilon_B, \oplus)\) is a commutative monoid. To connect them, we also suppose there is another binary operation \(- \cdot - : A \to A \to B\), which I will pronounce “spark”. The way I like to think of it is that two values of type \(A\), in addition to combining to produce another \(A\) via the monoid operation, also produce a “spark” of type \(B\). That is, values of type \(B\) somehow capture information about the interaction between pairs of \(A\) values.

Sparking needs to interact sensibly with the monoid structures on \(A\) and \(B\): in particular, fixing either argument must result in a monoid homomorphism from \(A\) to \(B\). That is, for any choice of \(a \in A\), we have \(a \cdot \varepsilon_A = \varepsilon_B\) (i.e. \(\varepsilon_A\) is boring and can never produce a nontrivial spark with anything else), and \(a \cdot (a_1 \diamond a_2) = (a \cdot a_1) \oplus (a \cdot a_2)\) (i.e. sparking commutes with the monoid operations). Similar laws should hold if we fix the second argument of \(- \cdot -\) instead of the first.
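
In Haskell-ish terms, one way to package this up is as a (hypothetical) two-parameter type class; the name Spark is made up for illustration, and of course the laws have to live in comments, since the compiler cannot check them.

    {-# LANGUAGE MultiParamTypeClasses #-}

    -- A hypothetical class packaging up the spark operation.  The monoid
    -- operation on b (written <>) is required to be commutative.
    class (Monoid a, Monoid b) => Spark a b where
      spark :: a -> a -> b

    -- Laws: sparking with either argument fixed is a monoid homomorphism:
    --   spark a mempty     == mempty
    --   spark mempty a     == mempty
    --   spark a (a1 <> a2) == spark a a1 <> spark a a2
    --   spark (a1 <> a2) a == spark a1 a <> spark a2 a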

Given all of this, we can now define a monoid on \(A \times B\) as follows:

\((a_1, b_1) \otimes (a_2, b_2) = (a_1 \diamond a_2, b_1 \oplus b_2 \oplus (a_1 \cdot a_2))\)

That is, we combine the \(A\) values normally, and then we combine the \(B\) values together with the “spark” of type \(B\) produced by the two \(A\) values.
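
Continuing the Haskell sketch, and assuming the hypothetical Spark class from above is in scope, the construction can be expressed as a newtype wrapper around pairs (Sparked is again a made-up name):

    -- Pairs of a and b, combined with the spark-augmented operation.
    newtype Sparked a b = Sparked { getSparked :: (a, b) }

    instance Spark a b => Semigroup (Sparked a b) where
      Sparked (a1, b1) <> Sparked (a2, b2)
        = Sparked (a1 <> a2, b1 <> b2 <> spark a1 a2)

    instance Spark a b => Monoid (Sparked a b) where
      mempty = Sparked (mempty, mempty)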

Let’s see that this does indeed define a valid monoid. The identity element is \((\varepsilon_A, \varepsilon_B)\): unfolding the definition, \((\varepsilon_A, \varepsilon_B) \otimes (a, b) = (\varepsilon_A \diamond a, \varepsilon_B \oplus b \oplus (\varepsilon_A \cdot a)) = (a, b)\), since sparking with \(\varepsilon_A\) yields \(\varepsilon_B\); the right identity law is symmetric. For associativity, expanding either way of parenthesizing \((a_1,b_1) \otimes (a_2,b_2) \otimes (a_3,b_3)\) and using the homomorphism laws yields \((a_1 \diamond a_2 \diamond a_3,\; b_1 \oplus b_2 \oplus b_3 \oplus (a_1 \cdot a_2) \oplus (a_1 \cdot a_3) \oplus (a_2 \cdot a_3))\); commutativity of \(\oplus\) is what lets us rearrange the \(B\) components into the same order.

In addition, if \(A\) is a commutative monoid and the spark operation \(- \cdot -\) commutes, then the resulting monoid \((A \times B, \otimes)\) will be commutative as well.

The proof of associativity gives us a bit of insight into what is going on here. Notice that when reducing \((a_1,b_1) \otimes (a_2,b_2) \otimes (a_3,b_3)\), we end up with all possible sparks between pairs of \(a\)’s, i.e. \((a_1 \cdot a_2) \oplus (a_1 \cdot a_3) \oplus (a_2 \cdot a_3)\), and one can prove that this holds more generally. In particular, if we start with a list of \(A\) values:

\([a_1, a_2, \dots, a_n]\),

then inject them all into \(A \times B\) by pairing them with \(\varepsilon_B\):

\([(a_1, \varepsilon_B), (a_2, \varepsilon_B), \dots, (a_n, \varepsilon_B)]\),

and finally fold this list with \(\otimes\), the second element of the resulting pair is

\(\displaystyle \bigoplus_{i < j} (a_i \cdot a_j)\),

that is, the combination (via the monoid on \(B\)) of the sparks \(a_i \cdot a_j\) over all pairs with \(i < j\). Of course there are \(O(n^2)\) such pairs: the point is that whereas computing this via a straightforward left-to-right fold over the list may well take \(O(n^2)\) time, a balanced fold (i.e. splitting the list in half recursively and then combining from the leaves up) may be able to compute it in \(O(n \log n)\) time instead (at least, this is the case for the example I have in mind!).
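
Here is a sketch of that computation, again in terms of the hypothetical Spark class and Sparked wrapper from the earlier sketches. The balanced fold performs the same \(n - 1\) combining steps as a left-to-right fold, just arranged in a tree of logarithmic depth; whether that actually yields an \(O(n \log n)\) bound depends on how the cost of each combine scales with the size of its inputs.

    -- Combine the sparks of all pairs a_i, a_j with i < j, by injecting
    -- each a into the product monoid and folding in a balanced fashion.
    allSparks :: Spark a b => [a] -> b
    allSparks = snd . getSparked . balancedFold . map inject
      where
        inject a = Sparked (a, mempty)

    -- Fold a list by repeatedly combining adjacent elements, so the
    -- combining tree has depth O(log n).
    balancedFold :: Monoid m => [m] -> m
    balancedFold []  = mempty
    balancedFold [x] = x
    balancedFold xs  = balancedFold (pairUp xs)
      where
        pairUp (x : y : rest) = (x <> y) : pairUp rest
        pairUp rest           = rest

A plain mconcat over the injected list would compute the same answer; by associativity, only the shape of the combining tree differs.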

Please leave a comment if you can think of any instances of this pattern, or if you have seen it discussed elsewhere. In a future post I will write about the instance I had in mind when coming up with this generalized formulation.