Skip to main content

Section 1.5 The Catalan Numbers

Let's put all of the techniques we have developed in this chapter to work to solve some deeper counting problems.

Subsection 1.5.1 A Few Counting Problems

The counting problems in Activities 109–114 are not supposed to be easy to solve, but you have a lot of combinatorial tools to try. Say as much as you can about as many as you can before moving on. At the very least, you should answer the counting problems for small values of \(n\) by writing out the set of outcomes. In Subsection 1.5.2, we will investigate the problems in more depth and see how to answer them fully.

Activity 109.

We have already seen how to count lattice paths in Activities 2 and 23. Now consider a variation.

How many paths of length \(2n\text{,}\) consisting of horizontal and vertical segments of unit length, are there from \((0, 0)\) to \((n, n)\) such that the path never goes above the line \(y = x\text{?}\) One such path to \((3, 3)\) is shown in Figure 1.5.1.

Figure 1.5.1. One of the acceptable lattice paths from \((0,0)\) to \((3,3)\text{.}\)
Activity 110.

\(2n\) people stand in line at a old-timey movie theater. Admission is 50 cents, (denoted by H), and the box office starts with no change. \(n\) of the people have H and \(n\) have $1 (denoted D). In how many ways can the \(2n\) people line up so that all can be admitted?

Hint

Here we enumerate the number of workable sequences of \(n\) H's and \(n\) D's such that at each point in the sequence the number of H's is not less than the number of D's. Try writing out all HD sequences and see which ones are acceptable and which are not.

Activity 111.

Insert the integers \(1, 2, \ldots, 2n\) into a 2 by \(n\) rectangle of boxes such that the entries are monotonic in rows and columns. We will call this an acceptable tableau insertion. How many ways can you do this?

For example, when \(n=3\text{,}\) there are five arrangements:

1 2 3
4 5 6
1 2 4
3 5 6
1 2 5
3 4 6
1 3 4
2 5 6
1 3 5
2 4 6
Activity 112.

If we multiply \(n+1\) numbers, say \(a_1a_2\cdots a_n a_{n+1}\text{,}\) we really should put in \(n\) pairs of parentheses since multiplication is a binary operation (and what if multiplication is not associative?). How many ways can we do this? For example, when \(n = 3\text{,}\) there are 5 ways:

\begin{equation*} (ab)(cd)\quad ((ab)c)d \quad a(b(cd)) \quad (a(bc))d \quad a((bc)d). \end{equation*}
Hint

Try working recursively. For a product of \(n+1\) symbols \(a_{1}a_{2}\ldots a_{n+1}\text{,}\) break it at the \(k\)th symbol:

\begin{equation*} (a_{1}a_{2}a_{3}\ldots a_{k})(a_{k + 1}\ldots a_{n+1}). \end{equation*}
Activity 113.

Consider a convex polygon with \((n+2)\) labeled vertices. In how many ways can non-intersecting diagonals be inserted so as to decompose the polygon into triangles? For \(n = 3\) the five figures are shown below.

Hint

You might have found a recursion for the sequence of answers in Exercise 74.

Activity 114.

Consider rooted binary trees. Rooted trees are trees in the graph theory sense, except that one vertex is designated as the root, which puts a natural ordering on the vertices. Vertices adjacent to the root are its children, and vertices adjacent to those (other than the root) are their children, and so on. We are looking at binary trees, so each vertex will have either two children (designated the left child and right child) or no children at all (i.e., the vertex is a leaf).

How many rooted binary trees have exactly \(n+1\) leaves? Note that because we designate children as left or right, the two trees below are counted as distinct.

Hint

There should be 5 rooted binary trees with 4 leaves (the \(n = 3\) case).

Subsection 1.5.2 The Catalan Numbers

The sequence of Catalan numbers, named after Eugene Catalan who along with Euler discovered many of the properties of these numbers, is the sequence \((C_n)_{n \ge 0}\) starting,

\begin{equation*} 1, 1, 2, 5, 14, 42, 132, \ldots\text{.} \end{equation*}

All of the counting problems above should be answered by Catalan numbers. Let's investigate this sequence and discover some of its properties.

First, we should make sure that at the very least, all of the counting problems above have the same answer. This will be hugely helpful, since then we could use any of them as a model to discuss the sequence.

How do you prove that two counting problems have the same answer? The bijection principle tells us that if there is a bijection between two sets, then the sets have the same size. So let's find some bijections.

Example 1.5.2.

Find a bijection between the set of lattice paths from \((0,0)\) to \((n,n)\) that do not rise above the line \(y = x\) (as in Activity 109) and the set of ways that the movie patrons in Activity 110 can line up.

Solution

Represent each lattice path as a string of R's and U's. Represent each way that the movie patrons can line up as a string of H's and D's. The bijection will send each R to an H and each U to a D.

We need to verify that this bijection is both well defined, and that its inverse is well defined (that the function is injective and surjective). The key to all of these is to consider what makes a string valid in each case.

First, notice that the lattice paths must have an equal number of R's and U's. Similarly, the string of movie patrons must have an equal number of H's and D's. Further, every initial substring of R's and U's must have at least as many R's as U's, so that the path does not go above the line \(y=x\text{.}\) Similarly, the every initial substring of H's and D's (half dollars and dollars) must have at least as many H's as D's, so that change can always be given. The H's and R's must dominate.

The the conditions on what makes a string valid match up, so every string in one set will match up with exactly one string in the other.

Representing the set of outcomes as strings of sequences is useful, because it suggests a bijection between different types of problems. The strings of paths and movie theater lines from Example 1.5.2 are called Dyck words. These are strings of an equal number of two symbols, say \(x\) and \(y\text{,}\) such that no initial segment of the string has more \(y\)'s than \(x\)'s.

We will now see that the number of Dyck words of length \(2n\) is \(C_n\text{.}\)

Activity 115.
(a)

Show that the number of acceptable tableau insertions from Activity 111 is always a Catalan number. That is, give a bijection between the set of Dyck words of length \(2n\) to the set of ways to insert the numbers 1 through \(2n\) into a \(2\times n\) tableau so that both rows and columns are increasing.

Hint

If you put the numbers into the tableau in order, for each number, you must decide to put it in the top row or the bottom row. Do you have any other choices? What would constitute a mistake?

(b)

Show the number of ways to parenthesize a product of \(n+1\) numbers is \(C_n\) (see Activity 112).

Hint

One way to parenthesize the product \(abcd\) is \(a((bc)d)\text{,}\) but this is really \((a((bc)d))\text{,}\) so that the outer set set of parentheses belong with the product of \(a\) and \(((bc)d)\text{.}\) Further, not that if we wrote this just as \((a((bcd\text{,}\) there would be only one way to insert the right parentheses that would make the product parse correctly.

Not all of the models for the Catalan numbers are easily represented by Dyck words (or at least not in an obvious way).

Exercise 116.

Demonstrate a bijection between the set of acceptable ways to parenthesize the product of \(n+1\) terms and the set of rooted binary trees with \(n+1\) leaves (see Activity 114).

Hint

Think about how students are taught to factor numbers.

We have now verified that all our sample problems are Catalan numbers, except for the triangulations of polygons problem. There is a clever way to associate each triangulation with a binary tree, as suggested by Figure 1.5.3. However, let's take this opportunity to illustrate another method for proving two problems have the same answers.

Figure 1.5.3. A triangulation of a 5-gon and associated rooted binary tree.
Activity 117.

Consider the recurrence relation

\begin{equation*} C_{n + 1} = \sum_{i = 0}^n C_iC_{n-i} = C_{0}C_{n} + C_{1}C_{n - 1} + \ldots + C_{n}C_{0} \end{equation*}

with \(C_0 = 1\text{.}\)

(a)

Calculate the first 6 values of the sequence from this recurrence relation, to verify that it appears to agree with the Catalan numbers.

(b)

Prove that the Catalan numbers satisfy the recurrence relation.

Hint

Look at rooted binary trees. What happens when you remove the root? You could also use the parenthesizing model by asking where “main product” occurs.

(c)

Prove that the number of triangulations of a convex polygon with \(n+2\) sides also satisfies this recurrence relation (if you haven't already done so in Exercise 74). Conclude that the triangulations problem is also solved by the Catalan numbers.

Hint

Fix an edge of the polygon. There will be exactly \(n-2\) triangles this edge could be part of. For each of those, can you count the number of triangles that can be made of what ?

Next, we would like to find a closed formula for \(C_n\text{.}\) We can do so using lattice paths.

Activity 118.

Recall that \(C_n\) gives the number of lattice paths from \((0,0)\) to \((n,n)\) that do not cross the line \(y = x\) (but they may touch this line). We will compare this to all the lattice paths from \((0,0)\) to \((n,n)\text{.}\)

(a)

Explain why the number of lattice paths from \((0,0)\) to \((n,n)\) that do cross the line \(y = x\) is the same as the number of lattice paths from \((0,0)\) to \((n,n)\) that touch or cross the line \(y = x + 1\text{.}\)

(b)

Find a bijection between lattice paths from \((0,0)\) to \((n,n)\) that touch (or cross) the line \(y=x+1\) and lattice paths from \((-1,1)\) to \((n,n)\text{.}\)

Hint

Given a path from \((0, 0)\) to \((n, n)\) which touches or crosses the line \(y = x + 1\text{,}\) how can you modify the part of the path from \((0, 0)\) to the first touch of \(y = x + 1\) so that the modified path starts instead at \((-1, 1)\text{?}\) The trick is to do this in a systematic way that will give you your bijection.

(c)

Find a formula for the number of lattice paths from \((0,0)\) to \((n,n)\) that do not cross the line \(y=x\text{.}\) That is, a formula for \(C_n\text{.}\)

Hint

A path either touches the line \(y = x + 1\) or it doesn't. This partitions the set of paths into two blocks.

So \(C_n = \binom{2n}{n} - \binom{2n}{n+1}\text{.}\) Another formula you might run into is \(C_n = \frac{1}{n+1}\binom{2n}{n}\text{.}\) Why is this also correct?

Exercise 119.
(a)

Prove that \(\binom{2n}{n} - \binom{2n}{n+1} = \frac{1}{n+1}\binom{2n}{n}\) using a method of your choice.

Hint

The algebraic formula for \(\binom{n}{k}\) would be appropriate here.

(b)

Challenge problem: explain the formula \(C_n = \frac{1}{n+1}\binom{2n}{n+1}\) using the quotient principle. This would give an alternate proof for the closed formula for the Catalan numbers.

Subsection 1.5.3 More Problems

Here are a few more activities to practice with the Catalan numbers.

Exercise 120.

Find \(C_6\) and \(C_7\) using both the recursive and closed formulas.

Exercise 121.

Show, by example, how the bijection between Dyck words and valid parenthesizing works. To do this, list the \(C_4 = 14\) valid Dyck words, then list the \(C_4 = 14\) ways to parenthesize \(abcde\text{,}\) in the corresponding order.

Exercise 122.

Here is a way to establish the closed formula for \(C_n\) using Dyck words. For simplicity, consider Dyck words using the symbols 0 and 1, insisting that no initial segment contains more 1's than 0's.

(a)

Of all \(2n\)-bit strings of weight \(n\text{,}\) some are Dyck words and some are not. Explain which are not. What do all of these have in common?

(b)

Suppose some initial segment of a bit string contains one more 1 than 0. If you changed every 1 to a 0 and every 0 to a 1 in this initial segment, how many 0's and how many 1's would the entire bit string have?

(c)

Describe a bijection between the set of \(2n\)-bit strings of weight \(n-1\) and the set of \(2n\)-bit strings of weight \(n\) that are NOT Dyck words.

(d)

Why does this prove that \(C_n = \binom{2n}{n} - \binom{2n}{n-1}\text{?}\)

Exercise 123.

Find a bijection between the ways of parenthesizing \(n+1\) terms and the ways of triangulating a convex polygon with \(n+2\) sides. Illustrate this by matching up the outcomes for \(n = 3\text{.}\)

Exercise 124.

Let \(C_{n} = \frac{1}{n + 1}\binom{2n}{n}\text{.}\) Verify the formula:

\begin{equation*} C_{n} = \binom{n}{1} C_{n - 1} - \binom{n - 1}{2} C_{n - 2} + \binom{n - 2}{3} C_{n - 3} - \cdots \end{equation*}

We have considered 6 or 7 different interpretations of Catalan numbers so far. There are many others. In fact, Richard P. Stanley includes 66 different interpretations of the Catalan numbers in the book Enumerative Combinatorics: Volume 2.

The remainder of this section contains some examples. See if you can explain why each of these really are interpretations of \(C_n\text{.}\)

Exercise 125.

A and B each receive \(n\) votes. How many ways are there that the \(2n\) votes can be tallied so that A never trails B.

Exercise 126.

Place \(2n\) points on the circumference of a circle and draw \(n\) non-intersecting chords. How many ways can you do this? Equivalently, if \(2n\) people stand in a circle, how many ways can everyone shake hands with one other person so that no one's arms cross? Assume these people have arbitrarily long arms.

Exercise 127.

\(2n\) kids, all of different heights, must line up for a class picture. They will do this in two equal rows, but for variety, they will take a picture from the front and from the right. How many ways can the kids line up so that both pictures don't have any taller kid blocking a shorter kid?

Exercise 128.

Consider the graph \(P_n\) (a path with \(n\) edges and \(n+1\) vertices). Call one of the endpoints of the path \(v\text{.}\) How many walks of length \(2n\) start and stop at \(v\text{?}\) A walk in a graph is a sequence of adjacent vertices (think of tracing along edges of the graph), that does allow for repeated vertices.

Exercise 129.

Draw a \(n\) by \(n\) right triangle using squares on graph paper (a sort of staircase shape, made up of \(\frac{n(n+1)}{2}\) squares). Shade the squares using exactly \(n\) colors so that each color makes a rectangle (but not all colors are rectangles of the same dimension). How many ways can this be done?

Exercise 130.

If you have \(2n\) toothpicks, you could arrange them into a “mountain range” shape by putting half of the toothpicks angled upward, and the other half angled downward. Assume the first toothpick starts at sea level, so no valleys dip below that height. In how many ways can this be done?

Alternatively, we can define a diagonal lattice path as one in which each segment travels from \((a,b)\) to \((a+1, b+1)\) or to \((a+1, b-1)\text{.}\) How many diagonal lattice paths from \((0,0)\) to \((2n,0)\) never dip below the \(x\)-axis? Such diagonal lattice paths are sometimes called Dyck paths or Catalan paths.

Exercise 131.

How many permutations of \([n]\) do not contain three consecutive numbers \(abc\) with \(a \lt b \lt c\text{?}\) For example, the permutations \(1324\) and \(1423\) are both acceptable, but \(2341\) and \(1342\) are not.