Crystal Ball Connection Patterns via Species and Generating Functions
Tagged generating functions, involutions, multinomial, species, summation, telephone
A couple weeks ago, Denise posted Puzzle: Crystal Ball Connection Patterns on her blog, Let’s Play Math. I had fun playing with it and thought I would demonstrate how to apply some high-powered combinatorial techniques to it (probably not what Denise had in mind!).
The setup is that there are \(n\) (distinct) friends who can talk to each other on the phone. Only two people can talk at a time (no conference calls). The question is to determine how many different “configurations” there are. Not everyone has to talk, so a configuration consists of some subset of the friends arranged in (unordered) conversational pairs.
Warning: spoilers ahead! If you’d like to play around with this yourself (and it is indeed a nice, accessible combinatorics problem to play with), stop reading now. My goal in this post is to have fun applying some advanced tools to this (relatively) simple problem.
Telephone numbers
Let’s start by visualizing some configurations. In her post, Denise illustrated the complete set of configurations for \(n = 4\), which I will visualize like this:
Notice how I’ve arranged them: in the first row is the unique configuration where no one is talking (yes, that counts). In the second row are the six possible configurations with just a single conversation. The last row has the three possible configurations with two conversations.
One good approach at this point would be to derive some recurrences. This problem does indeed admit a nice recurrence, but I will let you ponder it. Instead, let’s see if we can just “brute-force” our way to a general formula, using our combinatorial wits. Later I will demonstrate a much more principled, mechanical way to derive a general formula.
Let’s start by coming up with a formula for \(T_{n,k}\), the number of configurations with \(n\) people and \(k\) conversations. The number of ways of choosing \(k\) pairs out of a total of \(n\) is the multinomial coefficient \(\displaystyle \binom{n}{2,2,\dots,2,n-2k} = \frac{n!}{(2!)^k(n-2k)!}\). However, that overcounts things: it actually distinguishes the first pair, second pair, and so on, but we don’t want to have any ordering on the pairs. So we have to divide by \(k!\), the number of distinct orderings of the pairs. Thus,
\(\displaystyle T_{n,k} = \frac{n!}{2^k (n-2k)! k!}.\)
Let’s do a few sanity checks. First, when \(k=0\), we have \(T_{n,0} = \frac{n!}{n!} = 1\). We can also try some other small numbers we’ve already enumerated by hand: for example, \(T_{4,1} = \frac{4!}{2 \cdot 2 \cdot 1} = 6\), and \(T_{4,2} = \frac{4!}{4 \cdot 1 \cdot 2} = 3\). So this seems to work.
For \(n\) people, there can be at most \(\lfloor n/2 \rfloor\) conversations. So, the total number of configurations is going to be
\(\displaystyle T_n = \sum_{k=0}^{\lfloor n/2 \rfloor} T_{n,k}\).
We can use this to compute \(T_n\) for the first few values of \(n\):
\(\begin{array}{rcl}T_0 &=& 1\\T_1 &=& 1 \\ T_2 &=& 1 + 1 = 2 \\ T_3 &=& 1 + 3!/2 = 4 \\ T_4 &=& 1 + 6 + 3 = 10 \\ T_5 &=& 1 + 5!/(2 \cdot 3!) + 5!/(4 \cdot 2) = 1 + 10 + 15 = 26 \\ T_6 &=& 1 + 6!/(2 \cdot 4!) + 6!/(4 \cdot 2 \cdot 2) + 6!/(8 \cdot 3!) = 1 + 15 + 45 + 15 = 76 \end{array}\)
At this point we could look up the sequence 1,1,2,4,10,26,76 on the OEIS and find out all sorts of fun things: e.g. that we are also counting self-inverse permutations, i.e. involutions, that these numbers are also called “restricted Stirling numbers of the second kind”, some recurrence relations, etc., as well as enough references to keep us busy reading for a whole year.
Species
We can describe configurations as elements of the combinatorial species \(C = E \circ (E_2 + X)\). That is, a configuration is an unordered set (\(E\)) of (\(\circ\)) things (\(E_2 + X\)), where each thing can either be an unordered pair (\(E_2\)) of people talking on the phone, or (\(+\)) a single person (\(X\)) who is not talking.
We can now use the Haskell species
library to automatically generate some counts and see whether they agree with our manual enumerations. First, some boilerplate setup:
ghci> :set -XNoImplicitPrelude
ghci> :m +NumericPrelude
ghci> :m +Math.Combinatorics.Species
Now we define the species of configurations:
ghci> let configurations = set `o` (set `ofSizeExactly` 2 + singleton)
We can ask the library to count the number of configurations for different \(n\):
ghci> take 10 (labelled configurations)
[1,1,2,4,10,26,76,232,764,2620]
Oh good, those numbers look familiar! Now, I wonder how many configurations there are for \(n = 100\)?
ghci> labelled configurations !! 100
24053347438333478953622433243028232812964119825419485684849162710512551427284402176
Yikes!
We can also use the library to generate exhaustive lists of configurations, and draw them using diagrams. For example, here are all \(76\) configurations for \(n=6\). (If you want to see the code used to generate this diagram, you can find it here.)
And just for fun, let’s draw all \(232\) configurations for \(n = 7\):
Whee!
A general formula, via generating functions
Finally, I want to show how to use the species definition given above and the theory of generating functions to (somewhat) mechanically derive a general formula for the number of configurations. (Hopefully it will end up being equivalent to the formula we came up with near the beginning of the post!) Of course, this is also what the species
library is doing, but only numerically—we will do things symbolically.
First, note that we are counting labelled configurations (the friends are all distinct), so we want to consider exponential generating functions (egfs). Recall that the egf for a species \(F\) is given by
\(\displaystyle F(x) = \sum_{n \geq 0} |F[n]| \frac{x^n}{n!}\),
that is, a (possibly infinite) formal power series where the coefficient of \(x^n/n!\) is the number of distinct labelled \(F\)-structures of size \(n\). In our case, we need
\(\displaystyle E(x) = \sum_{n \geq 0} 1 \cdot \frac{x^n}{n!} = e^x\),
since there is exactly one set structure of any size, and
\(\displaystyle E_2(x) = \frac{x^2}{2}\),
which is just the restriction of \(E(x)\) to only the \(x^2\) term. Of course, we also have \(X(x) = x\). Putting this together, we calculate
\(\begin{array}{rcl}\displaystyle (E \circ (E_2 + X))(x) &=& e^{(x^2/2 + x)} \\[1em] &=& \displaystyle \sum_{n \geq 0} \frac{(x^2/2 + x)^n}{n!} \\[1em] &=& \displaystyle \sum_{n \geq 0} \sum_{k = 0}^n \frac{1}{n!} \binom{n}{k} \left(\frac{x^2}{2}\right)^k x^{n-k} \\[1em] &=& \displaystyle \sum_{n \geq 0} \sum_{k=0}^n \frac{1}{n!} \binom{n}{k} \frac{x^{n+k}}{2^k} \end{array}\)
Ultimately, we want something of the form \(\displaystyle \sum_{m \geq 0} f_m \frac{x^m}{m!}\), so we’ll need to collect up like powers of \(x\). To do that, we can do a bit of reindexing. Right now, the double sum is adding up a bunch of terms that can be thought of as making a triangle:
Each ordered pair in the triangle corresponds to a single term being added. Each column corresponds to a particular value of \(n\), with \(n\) increasing to the right. Within each column, \(k\) goes from \(0\) up to \(n\).
The powers of \(x\) in our double sum are given by \(n+k\). If we draw in lines showing terms that have the same power of \(x\), it looks like this:
So let’s choose a new variable \(m\), defined by \(m = n + k\). We can see that we will have terms for every \(m \geq 0\). We will also keep the variable \(k\) for our other index, and substitute \(n = m - k\) to get rid of \(n\). In other words, instead of adding up the triangle by columns, we are going to add it up by diagonals.
Previously we had \(k \leq n\); substituting for \(n\) that now turns into \(k \leq m - k\). Adding \(k\) to both sides and dividing by \(2\) yields \(k \leq \lfloor m/2 \rfloor\) (we can round down since \(k\) is an integer). Looking at the diagram above, this makes sense: the height of each diagonal line is indeed half its index. Rewriting our indices of summation and substituting \(m - k\) for \(n\), we now have:
\(\begin{array}{rcl}\displaystyle \sum_{n \geq 0} \sum_{k=0}^n \frac{1}{n!} \binom{n}{k} \frac{x^{n+k}}{2^k} &=& \displaystyle \sum_{m \geq 0} \sum_{k=0}^{\lfloor m/2 \rfloor} \frac{1}{(m-k)!} \binom{m-k}{k} \frac{x^m}{2^k} \\[1em] &=& \displaystyle \sum_{m \geq 0} \sum_{k=0}^{\lfloor m/2 \rfloor} \frac{1}{k!(m-2k)!} \frac{x^m}{2^k} \\[1em] &=& \displaystyle \sum_{m \geq 0} \frac{x^m}{m!} \sum_{k=0}^{\lfloor m/2 \rfloor} \frac{m!}{k!(m-2k)!2^k} \end{array}\)
And hey, look at that! The coefficient of \(x^m/m!\) is exactly what we previously came up with for \(T_m\). Math works!